| | 177 | <p>Examples: |
| | 178 | <code>v_xor_b32 v1,v2,v3 dst_sel:byte_1 src0_sel:byte1 src1_sel:word1 |
| | 179 | v_xor_b32 v1,v2,v3 dst_sel:b1 src0_sel:b1 src1_sel:w1 |
| | 180 | v_xor_b32 v1,v2,v3 dst_sel:byte_1 src0_sel:byte1 src1_sel:word1 dst_unused:preserve |
| | 181 | v_xor_b32 v1,v2,v3 dst_sel:byte_1 src0_sel:byte1 src1_sel:word1 dst_unused:sext |
| | 182 | v_xor_b32 v1,sext(v2),v3 dst_sel:byte_1 src0_sel:byte1 src1_sel:word1</code></p> |
| | 307 | <h3>VOP_DPP</h3> |
| | 308 | <p>The VOP_DPP encoding is enabled by setting the VSRC0 field to 0xfa in the VOP1/VOP2/VOPC encoding. |
| | 309 | List of fields:</p> |
| | 310 | <table> |
| | 311 | <thead> |
| | 312 | <tr> |
| | 313 | <th>Bits</th> |
| | 314 | <th>Name</th> |
| | 315 | <th>Description</th> |
| | 316 | </tr> |
| | 317 | </thead> |
| | 318 | <tbody> |
| | 319 | <tr> |
| | 320 | <td>0-7</td> |
| | 321 | <td>SRC0</td> |
| | 322 | <td>First source vector operand</td> |
| | 323 | </tr> |
| | 324 | <tr> |
| | 325 | <td>8-16</td> |
| | 326 | <td>DPP_CTRL</td> |
| | 327 | <td>Data parallel primitive control</td> |
| | 328 | </tr> |
| | 329 | <tr> |
| | 330 | <td>19</td> |
| | 331 | <td>BOUND_CTRL</td> |
| | 332 | <td>Specifies behaviour when shared data is invalid</td> |
| | 333 | </tr> |
| | 334 | <tr> |
| | 335 | <td>20</td> |
| | 336 | <td>SRC0_NEG</td> |
| | 337 | <td>Negation modifier for SRC0</td> |
| | 338 | </tr> |
| | 339 | <tr> |
| | 340 | <td>21</td> |
| | 341 | <td>SRC0_ABS</td> |
| | 342 | <td>Absolute value for SRC0</td> |
| | 343 | </tr> |
| | 344 | <tr> |
| | 345 | <td>22</td> |
| | 346 | <td>SRC1_NEG</td> |
| | 347 | <td>Negation modifier for SRC1</td> |
| | 348 | </tr> |
| | 349 | <tr> |
| | 350 | <td>23</td> |
| | 351 | <td>SRC1_ABS</td> |
| | 352 | <td>Absolute value for SRC1</td> |
| | 353 | </tr> |
| | 354 | <tr> |
| | 355 | <td>24-27</td> |
| | 356 | <td>BANK_MASK</td> |
| | 357 | <td>Bank enable mask</td> |
| | 358 | </tr> |
| | 359 | <tr> |
| | 360 | <td>28-31</td> |
| | 361 | <td>ROW_MASK</td> |
| | 362 | <td>Row enable mask</td> |
| | 363 | </tr> |
| | 364 | </tbody> |
| | 365 | </table> |
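The bit ranges in the field table can be read off with shifts and masks. The following is a minimal sketch, assuming the fields occupy a single 32-bit word with bit 0 as the least significant bit; the function name `decode_dpp_word` and the sample word are illustrative and not part of any assembler API.

```python
def decode_dpp_word(word):
    """Split a 32-bit VOP_DPP word into the fields from the table above.

    Assumption: bit 0 is the least significant bit of the word.
    Bits 17-18 are not listed in the table and are ignored here.
    """
    return {
        "SRC0": word & 0xFF,                # bits 0-7
        "DPP_CTRL": (word >> 8) & 0x1FF,    # bits 8-16 (9 bits)
        "BOUND_CTRL": (word >> 19) & 1,     # bit 19
        "SRC0_NEG": (word >> 20) & 1,       # bit 20
        "SRC0_ABS": (word >> 21) & 1,       # bit 21
        "SRC1_NEG": (word >> 22) & 1,       # bit 22
        "SRC1_ABS": (word >> 23) & 1,       # bit 23
        "BANK_MASK": (word >> 24) & 0xF,    # bits 24-27
        "ROW_MASK": (word >> 28) & 0xF,     # bits 28-31
    }


# Illustrative word: SRC0=2, DPP_CTRL=0x1b, BOUND_CTRL set, both masks 0xf.
fields = decode_dpp_word(0xFF081B02)
```

Returning a plain dictionary keeps the sketch close to the table: each key matches one row name.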
| | 366 | <p>The data parallel operation is applied across the wavefront to the VSRC0 operand of the VOP instruction. |
| | 367 | A wavefront contains 4 rows of 16 threads each, and each row contains 4 banks of 4 threads each. |
| | 368 | The DPP_CTRL field chooses which operation is applied to VSRC0. |
| | 369 | List of data parallel operations:</p> |
| | 370 | <table> |
| | 371 | <thead> |
| | 372 | <tr> |
| | 373 | <th>Value</th> |
| | 374 | <th>Name</th> |
| | 375 | <th>Modifier</th> |
| | 376 | <th>Description</th> |
| | 377 | </tr> |
| | 378 | </thead> |
| | 379 | <tbody> |
| | 380 | <tr> |
| | 381 | <td>0x00-0xff</td> |
| | 382 | <td>DPP_QUAD_PERM{00:ff}</td> |
| | 383 | <td>quad_perm:[A,B,C,D]</td> |
| | 384 | <td>Full permute of 4 threads</td> |
| | 385 | </tr> |
| | 386 | <tr> |
| | 387 | <td>0x101-0x10f</td> |
| | 388 | <td>DPP_ROW_SL{1:15}</td> |
| | 389 | <td>row_shl:N</td> |
| | 390 | <td>Row shift left by N threads</td> |
| | 391 | </tr> |
| | 392 | <tr> |
| | 393 | <td>0x111-0x11f</td> |
| | 394 | <td>DPP_ROW_SR{1:15}</td> |
| | 395 | <td>row_shr:N</td> |
| | 396 | <td>Row shift right by N threads</td> |
| | 397 | </tr> |
| | 398 | <tr> |
| | 399 | <td>0x121-0x12f</td> |
| | 400 | <td>DPP_ROW_RR{1:15}</td> |
| | 401 | <td>row_ror:N</td> |
| | 402 | <td>Row rotate right by N threads</td> |
| | 403 | </tr> |
| | 404 | <tr> |
| | 405 | <td>0x130</td> |
| | 406 | <td>DPP_WF_SL1</td> |
| | 407 | <td>wave_shl:1</td> |
| | 408 | <td>Wave shift left by 1 thread</td> |
| | 409 | </tr> |
| | 410 | <tr> |
| | 411 | <td>0x134</td> |
| | 412 | <td>DPP_WF_RL1</td> |
| | 413 | <td>wave_rol:1</td> |
| | 414 | <td>Wave rotate left by 1 thread</td> |
| | 415 | </tr> |
| | 416 | <tr> |
| | 417 | <td>0x138</td> |
| | 418 | <td>DPP_WF_SR1</td> |
| | 419 | <td>wave_shr:1</td> |
| | 420 | <td>Wave shift right by 1 thread</td> |
| | 421 | </tr> |
| | 422 | <tr> |
| | 423 | <td>0x13c</td> |
| | 424 | <td>DPP_WF_RR1</td> |
| | 425 | <td>wave_ror:1</td> |
| | 426 | <td>Wave rotate right by 1 thread</td> |
| | 427 | </tr> |
| | 428 | <tr> |
| | 429 | <td>0x140</td> |
| | 430 | <td>DPP_ROW_MIRROR</td> |
| | 431 | <td>row_mirror</td> |
| | 432 | <td>Mirror threads within row</td> |
| | 433 | </tr> |
| | 434 | <tr> |
| | 435 | <td>0x141</td> |
| | 436 | <td>DPP_ROW_HALF_MIRROR</td> |
| | 437 | <td>row_half_mirror</td> |
| | 438 | <td>Mirror threads within half row</td> |
| | 439 | </tr> |
| | 440 | <tr> |
| | 441 | <td>0x142</td> |
| | 442 | <td>DPP_ROW_BCAST15</td> |
| | 443 | <td>row_bcast:15</td> |
| | 444 | <td>Broadcast thread 15 of each row to the next row</td> |
| | 445 | </tr> |
| | 446 | <tr> |
| | 447 | <td>0x143</td> |
| | 448 | <td>DPP_ROW_BCAST31</td> |
| | 449 | <td>row_bcast:31</td> |
| | 450 | <td>Broadcast thread 31 to rows 2 and 3</td> |
| | 451 | </tr> |
| | 452 | </tbody> |
| | 453 | </table> |
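The value ranges in the table above can be turned into a small lookup that maps a DPP_CTRL value to its modifier text. This is a sketch only: the function name `dpp_ctrl_modifier` is hypothetical, and the packing of the four `quad_perm` selectors (A in bits 0-1, B in bits 2-3, C in bits 4-5, D in bits 6-7) is an assumption that the table itself does not state.

```python
def dpp_ctrl_modifier(ctrl):
    """Return the modifier text for a DPP_CTRL value, or None if unlisted."""
    if 0x00 <= ctrl <= 0xFF:
        # Assumption: selector for thread i of the quad sits in bits 2*i..2*i+1.
        lanes = [(ctrl >> (2 * i)) & 3 for i in range(4)]
        return "quad_perm:[%d,%d,%d,%d]" % tuple(lanes)
    # Row shifts and rotates: the low 4 bits hold N, and N = 0 is not listed.
    for base, name in ((0x100, "row_shl"), (0x110, "row_shr"), (0x120, "row_ror")):
        if base + 1 <= ctrl <= base + 15:
            return "%s:%d" % (name, ctrl - base)
    fixed = {
        0x130: "wave_shl:1",
        0x134: "wave_rol:1",
        0x138: "wave_shr:1",
        0x13C: "wave_ror:1",
        0x140: "row_mirror",
        0x141: "row_half_mirror",
        0x142: "row_bcast:15",
        0x143: "row_bcast:31",  # broadcasts thread 31
    }
    return fixed.get(ctrl)
```

Values that fall in the gaps between the listed ranges, such as 0x100 or 0x110, yield `None` rather than a guessed modifier.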
| | 454 | <p>The BOUND_CTRL flag (modifier <code>bound_ctrl</code> or <code>bound_ctrl:0</code>) controls how invalid |
| | 455 | threads are filled (for example, the last threads after a left shift). A zero value (no modifier) |
| | 456 | fills invalid threads with the original VSRC0 value of the particular thread. A value of one (with the modifier) |
| | 457 | fills invalid threads with the VSRC0 value of thread 0.</p> |
| | 458 | <p>The BANK_MASK field (modifier <code>bank_mask:value</code>) chooses which banks are enabled during |
| | 459 | the data parallel operation in each enabled row. The Nth bit represents the Nth bank in each row. |
| | 460 | A disabled bank is filled with the original VSRC0 value of the particular thread.</p> |
| | 461 | <p>The ROW_MASK field (modifier <code>row_mask:value</code>) chooses which rows are enabled during |
| | 462 | the data parallel operation. The Nth bit represents the Nth row. |
| | 463 | A disabled row is filled with the original VSRC0 value of the particular thread.</p> |
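The combined effect of ROW_MASK and BANK_MASK can be modelled per thread. The sketch below assumes a 64-thread wavefront (4 rows of 16 threads), that bit N of each mask is counted from the least significant bit, and that threads are numbered consecutively within rows and banks. The names `lane_enabled` and `apply_masks` are illustrative only.

```python
def lane_enabled(lane, row_mask, bank_mask):
    """True if the thread's row and its bank within that row are both enabled."""
    row = lane // 16           # 4 rows of 16 threads
    bank = (lane % 16) // 4    # 4 banks of 4 threads in each row
    return bool((row_mask >> row) & 1) and bool((bank_mask >> bank) & 1)


def apply_masks(original, permuted, row_mask, bank_mask):
    """Take the permuted VSRC0 value for enabled threads and keep the
    original VSRC0 value for threads in a disabled row or bank."""
    return [
        permuted[i] if lane_enabled(i, row_mask, bank_mask) else original[i]
        for i in range(64)
    ]


# Enable only row 0 and bank 1: threads 4..7 receive the permuted value.
original = list(range(64))
permuted = [-1] * 64
result = apply_masks(original, permuted, row_mask=0b0001, bank_mask=0b0010)
```

With `row_mask:0xf bank_mask:0xf` every thread is enabled and the result equals the permuted input.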