| 177 | <p>Examples: |
| 178 | <code>v_xor_b32 v1,v2,v3 dst_sel:byte_1 src0_sel:byte1 src1_sel:word1 |
| 179 | v_xor_b32 v1,v2,v3 dst_sel:b1 src0_sel:b1 src1_sel:w1 |
| 180 | v_xor_b32 v1,v2,v3 dst_sel:byte_1 src0_sel:byte1 src1_sel:word1 dst_unused:preserve |
| 181 | v_xor_b32 v1,v2,v3 dst_sel:byte_1 src0_sel:byte1 src1_sel:word1 dst_unused:sext |
| 182 | v_xor_b32 v1,sext(v2),v3 dst_sel:byte_1 src0_sel:byte1 src1_sel:word1</code></p> |
| 307 | <h3>VOP_DPP</h3> |
| 308 | <p>The VOP_DPP encoding is enabled by setting the VSRC0 field of a VOP1/VOP2/VOPC encoding to 0xfa. |
| 309 | List of fields:</p> |
| 310 | <table> |
| 311 | <thead> |
| 312 | <tr> |
| 313 | <th>Bits</th> |
| 314 | <th>Name</th> |
| 315 | <th>Description</th> |
| 316 | </tr> |
| 317 | </thead> |
| 318 | <tbody> |
| 319 | <tr> |
| 320 | <td>0-7</td> |
| 321 | <td>SRC0</td> |
| 322 | <td>First source vector operand</td> |
| 323 | </tr> |
| 324 | <tr> |
| 325 | <td>8-16</td> |
| 326 | <td>DPP_CTRL</td> |
| 327 | <td>Data parallel primitive control</td> |
| 328 | </tr> |
| 329 | <tr> |
| 330 | <td>19</td> |
| 331 | <td>BOUND_CTRL</td> |
| 332 | <td>Specifies behaviour when shared data is invalid</td> |
| 333 | </tr> |
| 334 | <tr> |
| 335 | <td>20</td> |
| 336 | <td>SRC0_NEG</td> |
| 337 | <td>Negation modifier for SRC0</td> |
| 338 | </tr> |
| 339 | <tr> |
| 340 | <td>21</td> |
| 341 | <td>SRC0_ABS</td> |
| 342 | <td>Absolute value for SRC0</td> |
| 343 | </tr> |
| 344 | <tr> |
| 345 | <td>22</td> |
| 346 | <td>SRC1_NEG</td> |
| 347 | <td>Negation modifier for SRC1</td> |
| 348 | </tr> |
| 349 | <tr> |
| 350 | <td>23</td> |
| 351 | <td>SRC1_ABS</td> |
| 352 | <td>Absolute value for SRC1</td> |
| 353 | </tr> |
| 354 | <tr> |
| 355 | <td>24-27</td> |
| 356 | <td>BANK_MASK</td> |
| 357 | <td>Bank enable mask</td> |
| 358 | </tr> |
| 359 | <tr> |
| 360 | <td>28-31</td> |
| 361 | <td>ROW_MASK</td> |
| 362 | <td>Row enable mask</td> |
| 363 | </tr> |
| 364 | </tbody> |
| 365 | </table> |
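<p>The field layout above can be illustrated by packing the fields into one 32-bit word.
This is a minimal sketch, not part of any assembler: the function name <code>encode_dpp_word</code>,
its parameter names and the default masks of 0xf are assumptions made for illustration only.</p>

```python
def encode_dpp_word(src0, dpp_ctrl, bound_ctrl=False,
                    src0_neg=False, src0_abs=False,
                    src1_neg=False, src1_abs=False,
                    bank_mask=0xF, row_mask=0xF):
    """Pack VOP_DPP fields into a 32-bit word (hypothetical helper)."""
    if not 0 <= src0 <= 0xFF:
        raise ValueError("SRC0 occupies bits 0-7")
    if not 0 <= dpp_ctrl <= 0x1FF:
        raise ValueError("DPP_CTRL occupies bits 8-16")
    if not 0 <= bank_mask <= 0xF or not 0 <= row_mask <= 0xF:
        raise ValueError("BANK_MASK and ROW_MASK are 4 bits wide")
    word = src0                      # bits 0-7
    word |= dpp_ctrl << 8            # bits 8-16
    word |= int(bound_ctrl) << 19    # bit 19
    word |= int(src0_neg) << 20      # bit 20
    word |= int(src0_abs) << 21      # bit 21
    word |= int(src1_neg) << 22      # bit 22
    word |= int(src1_abs) << 23      # bit 23
    word |= bank_mask << 24          # bits 24-27
    word |= row_mask << 28           # bits 28-31
    return word


# row_shl:1 (DPP_CTRL = 0x101) on source register index 2, with bound_ctrl set
word = encode_dpp_word(src0=2, dpp_ctrl=0x101, bound_ctrl=True)
```

<p>Bits 17-18 are not listed in the table, so the sketch leaves them at zero.</p>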
| 366 | <p>The data parallel operation is applied across the wavefront to the VSRC0 operand of the VOP instruction. |
| 367 | A wavefront contains 4 rows of 16 threads each, and each row contains 4 banks of 4 threads each. |
| 368 | The DPP_CTRL field selects which operation is applied to VSRC0. |
| 369 | List of data parallel operations:</p> |
| 370 | <table> |
| 371 | <thead> |
| 372 | <tr> |
| 373 | <th>Value</th> |
| 374 | <th>Name</th> |
| 375 | <th>Modifier</th> |
| 376 | <th>Description</th> |
| 377 | </tr> |
| 378 | </thead> |
| 379 | <tbody> |
| 380 | <tr> |
| 381 | <td>0x00-0xff</td> |
| 382 | <td>DPP_QUAD_PERM{00:ff}</td> |
| 383 | <td>quad_perm:[A,B,C,D]</td> |
| 384 | <td>Full permute of 4 threads</td> |
| 385 | </tr> |
| 386 | <tr> |
| 387 | <td>0x101-0x10f</td> |
| 388 | <td>DPP_ROW_SL{1:15}</td> |
| 389 | <td>row_shl:N</td> |
| 390 | <td>Row shift left by N threads</td> |
| 391 | </tr> |
| 392 | <tr> |
| 393 | <td>0x111-0x11f</td> |
| 394 | <td>DPP_ROW_SR{1:15}</td> |
| 395 | <td>row_shr:N</td> |
| 396 | <td>Row shift right by N threads</td> |
| 397 | </tr> |
| 398 | <tr> |
| 399 | <td>0x121-0x12f</td> |
| 400 | <td>DPP_ROW_RR{1:15}</td> |
| 401 | <td>row_ror:N</td> |
| 402 | <td>Row rotate right by N threads</td> |
| 403 | </tr> |
| 404 | <tr> |
| 405 | <td>0x130</td> |
| 406 | <td>DPP_WF_SL1</td> |
| 407 | <td>wave_shl:1</td> |
| 408 | <td>Wave shift left by 1 thread</td> |
| 409 | </tr> |
| 410 | <tr> |
| 411 | <td>0x134</td> |
| 412 | <td>DPP_WF_RL1</td> |
| 413 | <td>wave_rol:1</td> |
| 414 | <td>Wave rotate left by 1 thread</td> |
| 415 | </tr> |
| 416 | <tr> |
| 417 | <td>0x138</td> |
| 418 | <td>DPP_WF_SR1</td> |
| 419 | <td>wave_shr:1</td> |
| 420 | <td>Wave shift right by 1 thread</td> |
| 421 | </tr> |
| 422 | <tr> |
| 423 | <td>0x13c</td> |
| 424 | <td>DPP_WF_RR1</td> |
| 425 | <td>wave_ror:1</td> |
| 426 | <td>Wave rotate right by 1 thread</td> |
| 427 | </tr> |
| 428 | <tr> |
| 429 | <td>0x140</td> |
| 430 | <td>DPP_ROW_MIRROR</td> |
| 431 | <td>row_mirror</td> |
| 432 | <td>Mirror threads within row</td> |
| 433 | </tr> |
| 434 | <tr> |
| 435 | <td>0x141</td> |
| 436 | <td>DPP_ROW_HALF_MIRROR</td> |
| 437 | <td>row_half_mirror</td> |
| 438 | <td>Mirror threads within half row</td> |
| 439 | </tr> |
| 440 | <tr> |
| 441 | <td>0x142</td> |
| 442 | <td>DPP_ROW_BCAST15</td> |
| 443 | <td>row_bcast:15</td> |
| 444 | <td>Broadcast thread 15 of each row to the next row</td> |
| 445 | </tr> |
| 446 | <tr> |
| 447 | <td>0x143</td> |
| 448 | <td>DPP_ROW_BCAST31</td> |
| 449 | <td>row_bcast:31</td> |
| 450 | <td>Broadcast thread 31 to rows 2 and 3</td> |
| 451 | </tr> |
| 452 | </tbody> |
| 453 | </table> |
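<p>The mapping from a DPP_CTRL value to its modifier can be sketched as a small lookup.
The helper name <code>dpp_ctrl_modifier</code> is hypothetical. The sketch also assumes that
<code>quad_perm:[A,B,C,D]</code> stores A in bits 0-1, B in bits 2-3, C in bits 4-5 and D in bits 6-7,
which the table above does not state explicitly.</p>

```python
def dpp_ctrl_modifier(value):
    """Return the assembler modifier for a DPP_CTRL value (hypothetical helper)."""
    if 0x00 <= value <= 0xFF:
        # Assumed layout: two bits per selected thread, A in the lowest bits.
        lanes = [(value >> (2 * i)) & 3 for i in range(4)]
        return "quad_perm:[{},{},{},{}]".format(*lanes)
    if 0x101 <= value <= 0x10F:
        return "row_shl:{}".format(value & 0xF)
    if 0x111 <= value <= 0x11F:
        return "row_shr:{}".format(value & 0xF)
    if 0x121 <= value <= 0x12F:
        return "row_ror:{}".format(value & 0xF)
    fixed = {
        0x130: "wave_shl:1",
        0x134: "wave_rol:1",
        0x138: "wave_shr:1",
        0x13C: "wave_ror:1",
        0x140: "row_mirror",
        0x141: "row_half_mirror",
        0x142: "row_bcast:15",
        0x143: "row_bcast:31",
    }
    if value in fixed:
        return fixed[value]
    raise ValueError("unlisted DPP_CTRL value: 0x{:x}".format(value))


identity = dpp_ctrl_modifier(0xE4)   # each thread of a quad selects itself
```

<p>Values that fall between the listed ranges (for example 0x100 or 0x110) are rejected rather than guessed.</p>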
| 454 | <p>The BOUND_CTRL flag (modifier <code>bound_ctrl</code> or <code>bound_ctrl:0</code>) controls how invalid |
| 455 | threads are filled (for example, the last threads of a row after a left shift). When the flag is 0 (no modifier), |
| 456 | an invalid thread keeps its original VSRC0 value. When the flag is 1 (modifier present), |
| 457 | an invalid thread receives 0 as its VSRC0 value.</p> |
| 458 | <p>The BANK_MASK field (modifier <code>bank_mask:value</code>) selects which banks are enabled during |
| 459 | the data parallel operation in each enabled row. The Nth bit represents the Nth bank in each row. |
| 460 | Threads in a disabled bank keep their original VSRC0 value.</p> |
| 461 | <p>The ROW_MASK field (modifier <code>row_mask:value</code>) selects which rows are enabled during |
| 462 | the data parallel operation. The Nth bit represents the Nth row. |
| 463 | Threads in a disabled row keep their original VSRC0 value.</p> |