Changes between Version 14 and Version 15 of GcnInstrsVop3


Timestamp:
12/11/15 16:00:18 (8 years ago)
Author:
trac
  • GcnInstrsVop3

    v14 v15  
    1299 1299 else
    1300 1300     VDST = MIN(SRC1, SRC0)</code></p>
     1301<h4>V_MQSAD_U8, V_MQSAD_PK_U16_U8</h4>
     1302<p>Opcode: 371 (0x173) for GCN 1.0/1.1; 486 (0x1e6) for GCN 1.2<br />
     1303Syntax (GCN 1.0): V_MQSAD_U8 VDST(2), SRC0(2), SRC1, SRC2(2)<br />
     1304Syntax (GCN 1.1/1.2): V_MQSAD_PK_U16_U8 VDST(2), SRC0(2), SRC1, SRC2(2)<br />
     1305Description: Compute four masked sums of absolute differences with accumulation; bytes
     1306that are zero in SRC1 are skipped. The N-th operation (N = 0 to 3) takes its first argument
     1307from bytes N to N+3 of SRC0, its second argument from SRC1, and its third argument from
     1308the N-th 16-bit word of SRC2.<br />
     1309Operation:<br />
     1310<code>UINT32 MSADU8(UINT32 S0, UINT32 S1, UINT32 S2)
     1311{
     1312    UINT32 OUT = S2;
     1313    for (UINT8 i = 0; i &lt; 4; i++)
     1314        if (((S1 &gt;&gt; (i*8)) &amp; 0xff) != 0)
     1315            OUT += ABS(((S0 &gt;&gt; (i*8)) &amp; 0xff) - ((S1 &gt;&gt; (i*8)) &amp; 0xff));
     1316    return OUT;
     1317}
     1318VDST = MSADU8((UINT32)SRC0, SRC1, SRC2 &amp; 0xffff)
     1319VDST |= ((UINT64)MSADU8((UINT32)(SRC0&gt;&gt;8), SRC1, (SRC2&gt;&gt;16) &amp; 0xffff))&lt;&lt;16
     1320VDST |= ((UINT64)MSADU8((UINT32)(SRC0&gt;&gt;16), SRC1, (SRC2&gt;&gt;32) &amp; 0xffff))&lt;&lt;32
     1321VDST |= ((UINT64)MSADU8((UINT32)(SRC0&gt;&gt;24), SRC1, (SRC2&gt;&gt;48) &amp; 0xffff))&lt;&lt;48</code></p>
     1322<h4>V_MSAD_U8</h4>
     1323<p>Opcode: 369 (0x171) for GCN 1.0/1.1; 484 (0x1e4) for GCN 1.2<br />
     1324Syntax: V_MSAD_U8 VDST, SRC0, SRC1, SRC2<br />
     1325Description: Calculate the sum of absolute differences between corresponding bytes of SRC0
     1326and SRC1, skipping bytes that are zero in SRC1; add SRC2 to the result and store it in VDST.<br />
     1327Operation:<br />
     1328<code>VDST = SRC2
     1329for (UINT8 i = 0; i &lt; 4; i++)
     1330    if (((SRC1 &gt;&gt; (i*8)) &amp; 0xff) != 0)
     1331        VDST += ABS(((SRC0 &gt;&gt; (i*8)) &amp; 0xff) - ((SRC1 &gt;&gt; (i*8)) &amp; 0xff))</code></p>
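
The V_MSAD_U8 operation can be sketched in plain C as follows. This is only an illustrative host-side model under the assumption that the listing describes the instruction completely; the function name `msad_u8` is hypothetical and not part of any assembler or SDK.

```c
#include <stdint.h>

/* Hypothetical host-side model of the V_MSAD_U8 listing.
 * Sums |src0 byte - src1 byte| over the four packed bytes,
 * skipping every byte position where src1 holds zero,
 * and starts from the accumulator value src2. */
static uint32_t msad_u8(uint32_t src0, uint32_t src1, uint32_t src2)
{
    uint32_t out = src2;
    for (unsigned i = 0; i < 4; i++) {
        uint32_t a = (src0 >> (i * 8)) & 0xff;
        uint32_t b = (src1 >> (i * 8)) & 0xff;
        if (b != 0)                       /* zero byte in src1 masks this position */
            out += (a > b) ? (a - b) : (b - a);
    }
    return out;
}
```

For instance, with `src0 = 0x10203040`, `src1 = 0x00253035` and `src2 = 5`, the top byte is masked out because it is zero in `src1`, and the remaining three byte differences (11, 0 and 5) are added to the accumulator.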
    1301 1332 <h4>V_MUL_F64</h4>
    1302 1333 <p>Opcode: 357 (0x165) for GCN 1.0/1.1; 641 (0x281) for GCN 1.2<br />
     
    1348 1379         VDST = ASFLOAT(SRC0) * ASFLOAT(SRC1)
    1349 1380 }</code></p>
     1381<h4>V_QSAD_U8, V_QSAD_PK_U16_U8</h4>
     1382<p>Opcode: 370 (0x172) for GCN 1.0/1.1; 485 (0x1e5) for GCN 1.2<br />
     1383Syntax (GCN 1.0): V_QSAD_U8 VDST(2), SRC0(2), SRC1, SRC2(2)<br />
     1384Syntax (GCN 1.1/1.2): V_QSAD_PK_U16_U8 VDST(2), SRC0(2), SRC1, SRC2(2)<br />
     1385Description: Compute four sums of absolute differences with accumulation. The N-th operation
     1386(N = 0 to 3) takes its first argument from bytes N to N+3 of SRC0, its second argument from
     1387SRC1, and its third argument from the N-th 16-bit word of SRC2.<br />
     1388Operation:<br />
     1389<code>UINT32 SADU8(UINT32 S0, UINT32 S1, UINT32 S2)
     1390{
     1391    UINT32 OUT = S2;
     1392    for (UINT8 i = 0; i &lt; 4; i++)
     1393        OUT += ABS(((S0 &gt;&gt; (i*8)) &amp; 0xff) - ((S1 &gt;&gt; (i*8)) &amp; 0xff));
     1394    return OUT;
     1395}
     1396VDST = SADU8((UINT32)SRC0, SRC1, SRC2 &amp; 0xffff)
     1397VDST |= ((UINT64)SADU8((UINT32)(SRC0&gt;&gt;8), SRC1, (SRC2&gt;&gt;16) &amp; 0xffff))&lt;&lt;16
     1398VDST |= ((UINT64)SADU8((UINT32)(SRC0&gt;&gt;16), SRC1, (SRC2&gt;&gt;32) &amp; 0xffff))&lt;&lt;32
     1399VDST |= ((UINT64)SADU8((UINT32)(SRC0&gt;&gt;24), SRC1, (SRC2&gt;&gt;48) &amp; 0xffff))&lt;&lt;48</code></p>
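
The sliding-window and packing behaviour of the quad SAD can be sketched in C as follows. This is a rough host-side model that mirrors the listing literally: like the listing, it does not truncate a lane that grows beyond 16 bits, so overflow behaviour is left unspecified here. The names `sad_u8` and `qsad_pk_u16_u8` are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: unmasked sum of absolute byte differences,
 * starting from the accumulator value acc. */
static uint32_t sad_u8(uint32_t s0, uint32_t s1, uint32_t acc)
{
    uint32_t out = acc;
    for (unsigned i = 0; i < 4; i++) {
        uint32_t a = (s0 >> (i * 8)) & 0xff;
        uint32_t b = (s1 >> (i * 8)) & 0xff;
        out += (a > b) ? (a - b) : (b - a);
    }
    return out;
}

/* Hypothetical model of the quad SAD: operation n compares bytes
 * n..n+3 of the 64-bit src0 against src1, adds the n-th 16-bit
 * word of src2, and places the result at bit offset n*16. */
static uint64_t qsad_pk_u16_u8(uint64_t src0, uint32_t src1, uint64_t src2)
{
    uint64_t vdst = 0;
    for (unsigned n = 0; n < 4; n++) {
        uint32_t window = (uint32_t)(src0 >> (n * 8));           /* bytes n..n+3 */
        uint32_t acc = (uint32_t)((src2 >> (n * 16)) & 0xffff);  /* n-th 16-bit word */
        vdst |= (uint64_t)sad_u8(window, src1, acc) << (n * 16);
    }
    return vdst;
}
```

With `src0` holding the bytes 0 through 7 in ascending order and `src1` holding the bytes 0 through 3, each successive window is offset by one, so the four sums are 0, 4, 8 and 12 before the accumulators from `src2` are added.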
    1350 1400 <h4>V_SAD_HI_U8</h4>
    1351 1401 <p>Opcode: 347 (0x15b) for GCN 1.0/1.1; 474 (0x1da) for GCN 1.2<br />