Changes between Version 21 and Version 22 of GcnInstrsVop3


Timestamp:
05/20/16 23:00:22 (8 years ago)
Author:
trac
Comment:

--

  • GcnInstrsVop3

    v21 v22  
    12691269INT32 V1 = (INT32)((SRC1&0x7fffff) | (SRC1&0x800000 ? 0xff800000 : 0))
    12701270VDST = V0 * V1 + SRC2</code></p>
     1271<h4>V_MAD_I64_I32</h4>
     1272<p>Opcode (VOP3B): 375 (0x177) for GCN 1.1; 489 (0x1e9) for GCN 1.2<br />
     1273Syntax: V_MAD_I64_I32 VDST(2), SDST(2), SRC0, SRC1, SRC2(2)<br />
     1274Description: Multiply the 32-bit signed integer value from SRC0 by the 32-bit signed value
     1275from SRC1, add the 64-bit unsigned value from SRC2 to the product, and store the final result in
     1276VDST. Some bits are also stored in SDST (behavior unknown).<br />
     1277Operation:<br />
     1278<code>INT64 PROD = (INT64)(INT32)SRC0*(INT32)SRC1
     1279VDST = SRC2 + PROD
     1280SDST = 0
     1281UINT64 mask = (1ULL&lt;&lt;LANEID)
     1282//SDST = (SDST&amp;~mask) | ((?????) ? mask : 0)</code></p>
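The VDST part of the operation above can be sketched in plain C for a single lane. This is an illustrative model only, not the hardware definition: `mad_i64_i32` is a hypothetical helper name, the SDST output is deliberately not modelled because its behavior is unknown, and the sketch assumes a two's-complement target where converting a `uint32_t` to `int32_t` reinterprets the bits.

```c
#include <stdint.h>

/* Hypothetical helper: models only the VDST result of V_MAD_I64_I32
 * for one lane. SDST is not modelled (its behavior is unknown). */
static uint64_t mad_i64_i32(uint32_t src0, uint32_t src1, uint64_t src2)
{
    /* Reinterpret both 32-bit operands as signed before widening, so
     * negative values are sign-extended (assumes two's complement). */
    int64_t prod = (int64_t)(int32_t)src0 * (int64_t)(int32_t)src1;

    /* Add in unsigned arithmetic so that wraparound is well defined in C. */
    return src2 + (uint64_t)prod;
}
```

For example, `mad_i64_i32(0xffffffffu, 5u, 10u)` treats SRC0 as -1 and yields 5.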
    12711283<h4>V_MAD_LEGACY_F32</h4>
    12721284<p>Opcode: 320 (0x140) for GCN 1.0/1.1; 448 (0x1c0) for GCN 1.2<br />
     
    12851297Operation:<br />
    12861298<code>VDST = (UINT32)(SRC0&amp;0xffffff) * (UINT32)(SRC1&amp;0xffffff) + SRC2</code></p>
     1299<h4>V_MAD_U64_U32</h4>
     1300<p>Opcode (VOP3B): 374 (0x176) for GCN 1.1; 488 (0x1e8) for GCN 1.2<br />
     1301Syntax: V_MAD_U64_U32 VDST(2), SDST(2), SRC0, SRC1, SRC2(2)<br />
     1302Description: Multiply the 32-bit unsigned integer value from SRC0 by the 32-bit unsigned value
     1303from SRC1, add the 64-bit unsigned value from SRC2 to the product, store the final result in
     1304VDST, and store the carry bit of each lane in SDST.<br />
     1305Operation:<br />
     1306<code>UINT64 PROD = (UINT64)SRC0*SRC1
     1307VDST = SRC2 + PROD
     1308SDST = 0
     1309UINT64 mask = (1ULL&lt;&lt;LANEID)
     1310SDST = (SDST&amp;~mask) | ((VDST &lt; PROD) ? mask : 0)</code></p>
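The carry test `VDST < PROD` in the operation above works because an unsigned 64-bit sum wraps around exactly when it overflows, so the wrapped sum is smaller than either addend. A minimal single-lane sketch in C follows; the helper names `mad_u64_u32` and `set_lane_bit` are hypothetical and used only for illustration, and the lane index is assumed to be below 64.

```c
#include <stdint.h>

/* Hypothetical helper: per-lane model of V_MAD_U64_U32.
 * Returns the 64-bit result and writes the carry-out (0 or 1) to *carry. */
static uint64_t mad_u64_u32(uint32_t src0, uint32_t src1, uint64_t src2,
                            int *carry)
{
    uint64_t prod = (uint64_t)src0 * src1; /* full 64-bit product */
    uint64_t sum = src2 + prod;            /* wraps modulo 2^64 on overflow */
    *carry = sum < prod;                   /* wrapped sum is below the addend */
    return sum;
}

/* Hypothetical helper: write one lane's carry into the SDST bit mask,
 * mirroring SDST = (SDST & ~mask) | (carry ? mask : 0). */
static uint64_t set_lane_bit(uint64_t sdst, unsigned lane, int carry)
{
    uint64_t mask = 1ULL << lane; /* lane is assumed to be in 0..63 */
    return (sdst & ~mask) | (carry ? mask : 0);
}
```

A caller modelling a whole wavefront would start with SDST = 0 and call `set_lane_bit` once per active lane.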
    12871311<h4>V_MAX_F64</h4>
    12881312<p>Opcode: 359 (0x167) for GCN 1.0/1.1; 643 (0x283) for GCN 1.2<br />
     
    14381462else
    14391463    VDST = MIN(SRC1, SRC0)</code></p>
     1464<h4>V_MQSAD_U32_U8</h4>
     1465<p>Opcode: 373 (0x175) for GCN 1.1; 487 (0x1e7) for GCN 1.2<br />
     1466Syntax: V_MQSAD_U32_U8 VDST(4), SRC0(2), SRC1, SRC2(4)<br />
     1467Description: Compute four masked sums of absolute differences with accumulation.
     1468Each operation takes its first argument from the four bytes of SRC0 starting at byte N and ending at byte N+3
     1469(where N is the index of the operation), its second argument from SRC1, and its third argument from the
     1470N-th 32-bit dword of SRC2.<br />
     1471Operation:<br />
     1472<code>UINT32 MSADU8(UINT32 S0, UINT32 S1, UINT32 S2)
     1473{
     1474    UINT64 OUT = S2;
     1475    for (UINT8 i = 0; i &lt; 4; i++)
     1476        if (((S1 &gt;&gt; (i*8)) &amp; 0xff) != 0)
     1477            OUT += ABS(((S0 &gt;&gt; (i*8)) &amp; 0xff) - ((S1 &gt;&gt; (i*8)) &amp; 0xff));
     1478    return (UINT32)MIN(OUT,0xffffffff);
     1479}
     1480VDST = MSADU8((UINT32)SRC0, SRC1, SRC2)
     1481VDST |= MSADU8((UINT32)(SRC0&gt;&gt;8), SRC1, SRC2&gt;&gt;32)&lt;&lt;32
     1482VDST |= MSADU8((UINT32)(SRC0&gt;&gt;16), SRC1, SRC2&gt;&gt;64)&lt;&lt;64
     1483VDST |= MSADU8((UINT32)(SRC0&gt;&gt;24), SRC1, SRC2&gt;&gt;96)&lt;&lt;96</code></p>
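The four sliding-window operations described above can be sketched in plain C as follows. This is a hedged illustration that follows the pseudocode on this page, not a verified hardware model: the names `msad_u8` and `mqsad_u32_u8` are hypothetical, the 128-bit SRC2 and VDST are represented as arrays of four 32-bit dwords, and the masking rule (a byte is skipped when the corresponding SRC1 byte is zero) is taken from the pseudocode as written.

```c
#include <stdint.h>

/* Hypothetical helper: masked sum of absolute differences over four
 * byte pairs, accumulated onto s2 and saturated to 32 bits. */
static uint32_t msad_u8(uint32_t s0, uint32_t s1, uint32_t s2)
{
    uint64_t out = s2;
    for (int i = 0; i < 4; i++) {
        uint32_t a = (s0 >> (i * 8)) & 0xff;
        uint32_t b = (s1 >> (i * 8)) & 0xff;
        if (b != 0)                          /* zero bytes of s1 are masked out */
            out += (a > b) ? a - b : b - a;  /* |a - b| without signed math */
    }
    return out > 0xffffffffu ? 0xffffffffu : (uint32_t)out;
}

/* Hypothetical helper: operation N reads bytes N..N+3 of the 64-bit src0
 * and accumulates onto the N-th dword of src2. */
static void mqsad_u32_u8(uint32_t vdst[4], uint64_t src0, uint32_t src1,
                         const uint32_t src2[4])
{
    for (int n = 0; n < 4; n++)
        vdst[n] = msad_u8((uint32_t)(src0 >> (n * 8)), src1, src2[n]);
}
```

With SRC0 bytes 1, 2, 3, 4, 5 (lowest first), SRC1 = 0x00010101 and a first accumulator of 10, operation 0 adds |1-1| + |2-1| + |3-1| = 3 and skips the fourth byte, giving 13.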
    14401484<h4>V_MQSAD_U8, V_MQSAD_PK_U16_U8</h4>
    14411485<p>Opcode: 371 (0x173) for GCN 1.0/1.1; 486 (0x1e6) for GCN 1.2<br />
    1442 Syntax (GCN 1.0): V_QSAD_U8 VDST(2), SRC0(2), SRC1, SRC2(2)<br />
    1443 Syntax (GCN 1.1/1.2): V_QSAD_PK_U16_U8 VDST(2), SRC0(2), SRC1, SRC2(2)<br />
     1486Syntax (GCN 1.0): V_MQSAD_U8 VDST(2), SRC0(2), SRC1, SRC2(2)<br />
     1487Syntax (GCN 1.1/1.2): V_MQSAD_PK_U16_U8 VDST(2), SRC0(2), SRC1, SRC2(2)<br />
    14441488Description: Compute four masked sums of absolute differences with accumulation.
    14451489Each operation takes its first argument from the four bytes starting at N and ending at N+3