| 1271 | <h4>V_MAD_I64_I32</h4> |
| 1272 | <p>Opcode (VOP3B): 375 (0x177) for GCN 1.1; 489 (0x1e9) for GCN 1.2<br /> |
| 1273 | Syntax: V_MAD_I64_I32 VDST(2), SDST(2), SRC0, SRC1, SRC2(2)<br /> |
| 1274 | Description: Multiply 32-bit signed integer value from SRC0 by 32-bit signed integer value |
| 1275 | from SRC1, add the 64-bit signed value from SRC2 to the product, store the final result in |
| 1276 | VDST, and store some bits in SDST (the SDST behavior is unknown).<br /> |
| 1277 | Operation:<br /> |
| 1278 | <code>INT64 PROD = (INT64)(INT32)SRC0*(INT32)SRC1 |
| 1279 | VDST = SRC2 + PROD |
| 1280 | SDST = 0 |
| 1281 | UINT64 mask = (1ULL<<LANEID) |
| 1282 | //SDST = (SDST&~mask) | ((?????) ? mask : 0)</code></p> |
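<p>The VDST part of the operation above can be sketched in Python as follows. This is a minimal model under stated assumptions, not a hardware reference: the helper names <code>to_signed</code> and <code>v_mad_i64_i32_vdst</code> are hypothetical, operands are passed as raw unsigned register bit patterns, and SDST is not modelled because its behavior is unknown.</p>

```python
def to_signed(value, bits):
    """Reinterpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def v_mad_i64_i32_vdst(src0, src1, src2):
    """Hypothetical one-lane model of VDST for V_MAD_I64_I32 (SDST not modelled)."""
    # Sign-extend both 32-bit operands before multiplying.
    prod = to_signed(src0, 32) * to_signed(src1, 32)
    # Add the 64-bit addend and wrap the result to a 64-bit register pattern.
    return (to_signed(src2, 64) + prod) & 0xFFFFFFFFFFFFFFFF
```

<p>For example, with SRC0 = 0xffffffff (-1), SRC1 = 2 and SRC2 = 5, this model yields 3.</p>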
| 1299 | <h4>V_MAD_U64_U32</h4> |
| 1300 | <p>Opcode (VOP3B): 374 (0x176) for GCN 1.1; 488 (0x1e8) for GCN 1.2<br /> |
| 1301 | Syntax: V_MAD_U64_U32 VDST(2), SDST(2), SRC0, SRC1, SRC2(2)<br /> |
| 1302 | Description: Multiply 32-bit unsigned integer value from SRC0 by 32-bit unsigned integer value |
| 1303 | from SRC1, add the 64-bit unsigned value from SRC2 to the product, store the final result in |
| 1304 | VDST, and store the carry bits in SDST.<br /> |
| 1305 | Operation:<br /> |
| 1306 | <code>UINT64 PROD = (UINT64)SRC0*SRC1 |
| 1307 | VDST = SRC2 + PROD |
| 1308 | SDST = 0 |
| 1309 | UINT64 mask = (1ULL<<LANEID) |
| 1310 | SDST = (SDST&~mask) | ((VDST < PROD) ? mask : 0)</code></p> |
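<p>The carry test in the operation above relies on unsigned wraparound: if the 64-bit sum wraps, it ends up smaller than the product. A minimal Python sketch of one lane, assuming a hypothetical helper <code>v_mad_u64_u32</code> that takes the lane index and the SDST value accumulated so far as parameters (both are illustration-only assumptions):</p>

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def v_mad_u64_u32(src0, src1, src2, lane_id=0, sdst=0):
    """Hypothetical one-lane model: returns (VDST, updated SDST)."""
    prod = (src0 & MASK32) * (src1 & MASK32)  # full 64-bit unsigned product
    vdst = (src2 + prod) & MASK64             # sum wrapped to 64 bits
    mask = 1 << lane_id
    carry = vdst < prod                       # a wrapped sum is smaller than the product
    sdst = (sdst & ~mask) | (mask if carry else 0)
    return vdst, sdst
```

<p>With SRC0 = SRC1 = 0xffffffff and SRC2 = 0xffffffffffffffff in lane 3, this model returns VDST = 0xfffffffe00000000 and sets bit 3 of SDST.</p>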
| 1464 | <h4>V_MQSAD_U32_U8</h4> |
| 1465 | <p>Opcode: 373 (0x175) for GCN 1.1; 487 (0x1e7) for GCN 1.2<br /> |
| 1466 | Syntax: V_MQSAD_U32_U8 VDST(4), SRC0(2), SRC1, SRC2(4)<br /> |
| 1467 | Description: Compute four masked sums of absolute differences with accumulation. |
| 1468 | The N'th operation (N = 0..3) takes its first argument from the four bytes of SRC0 at |
| 1469 | byte positions N through N+3, its second argument from SRC1, and its third argument |
| 1470 | (the accumulator) from the N'th 32-bit dword of SRC2.<br /> |
| 1471 | Operation:<br /> |
| 1472 | <code>UINT32 MSADU8(UINT32 S0, UINT32 S1, UINT32 S2) |
| 1473 | { |
| 1474 | UINT64 OUT = S2; |
| 1475 | for (UINT8 i = 0; i < 4; i++) |
| 1476 | if (((S1 >> (i*8)) & 0xff) != 0) // skip bytes masked by zero in S1 |
| 1477 | OUT += ABS(((S0 >> (i*8)) & 0xff) - ((S1 >> (i*8)) & 0xff)); |
| 1478 | return (UINT32)MIN(OUT,0xffffffff); // saturate to 32 bits |
| 1479 | } |
| 1480 | VDST = MSADU8((UINT32)SRC0, SRC1, SRC2) |
| 1481 | VDST |= MSADU8((UINT32)(SRC0>>8), SRC1, SRC2>>32)<<32 |
| 1482 | VDST |= MSADU8((UINT32)(SRC0>>16), SRC1, SRC2>>64)<<64 |
| 1483 | VDST |= MSADU8((UINT32)(SRC0>>24), SRC1, SRC2>>96)<<96</code></p> |
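<p>The sliding four-byte window described above can be illustrated with a short Python sketch. It assumes registers are modelled as plain integers (64-bit SRC0, 32-bit SRC1, 128-bit SRC2 and VDST); the function names <code>msad_u8</code> and <code>v_mqsad_u32_u8</code> are hypothetical.</p>

```python
def msad_u8(s0, s1, s2):
    """Masked sum of absolute byte differences, accumulated into s2 and saturated."""
    out = s2 & 0xFFFFFFFF
    for i in range(4):
        ref = (s1 >> (i * 8)) & 0xFF
        if ref != 0:  # a zero byte in S1 masks this byte position out
            out += abs(((s0 >> (i * 8)) & 0xFF) - ref)
    return min(out, 0xFFFFFFFF)


def v_mqsad_u32_u8(src0, src1, src2):
    """Hypothetical one-lane model: four MSADs over byte windows N..N+3 of src0."""
    vdst = 0
    for n in range(4):
        window = (src0 >> (8 * n)) & 0xFFFFFFFF  # bytes N..N+3 of SRC0
        acc = (src2 >> (32 * n)) & 0xFFFFFFFF    # N'th dword of SRC2
        vdst |= msad_u8(window, src1, acc) << (32 * n)
    return vdst
```

<p>With SRC0 = 0x0706050403020100, SRC1 = 0x03020100 (lowest byte masked) and SRC2 = 0, the four result dwords in this model are 0, 3, 6 and 9.</p>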
| 1442 | Syntax (GCN 1.0): V_QSAD_U8 VDST(2), SRC0(2), SRC1, SRC2(2)<br /> |
| 1443 | Syntax (GCN 1.1/1.2): V_QSAD_PK_U16_U8 VDST(2), SRC0(2), SRC1, SRC2(2)<br /> |
| 1486 | Syntax (GCN 1.0): V_MQSAD_U8 VDST(2), SRC0(2), SRC1, SRC2(2)<br /> |
| 1487 | Syntax (GCN 1.1/1.2): V_MQSAD_PK_U16_U8 VDST(2), SRC0(2), SRC1, SRC2(2)<br /> |