Changes between Version 14 and Version 15 of GcnInstrsVop3

Timestamp: 12/11/15 16:00:18 (8 years ago)
Author: trac
Comment: --

GcnInstrsVop3

-                      v14
+                      v15
 else
     VDST = MIN(SRC1, SRC0)</code></p>
+<h4>V_MQSAD_U8, V_MQSAD_PK_U16_U8</h4>
+<p>Opcode: 371 (0x173) for GCN 1.0/1.1; 486 (0x1e6) for GCN 1.2<br />
+Syntax (GCN 1.0): V_MQSAD_U8 VDST(2), SRC0(2), SRC1, SRC2(2)<br />
+Syntax (GCN 1.1/1.2): V_MQSAD_PK_U16_U8 VDST(2), SRC0(2), SRC1, SRC2(2)<br />
+Description: Compute four masked sums of absolute differences with accumulation.
+The N-th operation (N = 0..3) takes its first argument from the four bytes of SRC0
+starting at byte N and ending at byte N+3, its second argument from SRC1, and its third
+argument (the accumulator) from the N-th 16-bit word of SRC2. Bytes whose value in SRC1
+is zero are skipped.<br />
+Operation:<br />
+<code>UINT32 MSADU8(UINT32 S0, UINT32 S1, UINT32 S2)
+{
+    UINT32 OUT = S2;
+    for (UINT8 i = 0; i &lt; 4; i++)
+        if (((S1 &gt;&gt; (i*8)) &amp; 0xff) != 0)
+            OUT += ABS(((S0 &gt;&gt; (i*8)) &amp; 0xff) - ((S1 &gt;&gt; (i*8)) &amp; 0xff));
+    return OUT;
+}
+VDST = MSADU8((UINT32)SRC0, SRC1, SRC2 &amp; 0xffff)
+VDST |= ((UINT64)MSADU8((UINT32)(SRC0&gt;&gt;8), SRC1, (SRC2&gt;&gt;16) &amp; 0xffff))&lt;&lt;16
+VDST |= ((UINT64)MSADU8((UINT32)(SRC0&gt;&gt;16), SRC1, (SRC2&gt;&gt;32) &amp; 0xffff))&lt;&lt;32
+VDST |= ((UINT64)MSADU8((UINT32)(SRC0&gt;&gt;24), SRC1, (SRC2&gt;&gt;48) &amp; 0xffff))&lt;&lt;48</code></p>
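The packed operation above can be sketched in plain C as a software model. This is only an illustration under stated assumptions: the function names `masked_sad4` and `mqsad_pk_u16_u8` are hypothetical, SRC0 and SRC2 are modelled as 64-bit values and SRC1 as a 32-bit value, and the four results are OR-ed together exactly as in the pseudocode, so a lane that exceeds 16 bits is not clamped or truncated (the description does not say how that case is handled).

```c
#include <stdint.h>

/* Masked sum of absolute differences over four byte pairs.
 * A byte pair is skipped when the byte taken from s1 is zero.
 * Hypothetical helper name; mirrors MSADU8 from the pseudocode. */
uint32_t masked_sad4(uint32_t s0, uint32_t s1, uint32_t acc)
{
    for (unsigned i = 0; i < 4; i++) {
        uint32_t a = (s0 >> (i * 8)) & 0xff;
        uint32_t b = (s1 >> (i * 8)) & 0xff;
        if (b != 0)
            acc += (a > b) ? (a - b) : (b - a);
    }
    return acc;
}

/* Four masked SADs over sliding 4-byte windows of src0 (bytes N..N+3),
 * each accumulated onto the N-th 16-bit word of src2 and stored in the
 * N-th 16-bit lane of the result. Assumption: lanes do not overflow. */
uint64_t mqsad_pk_u16_u8(uint64_t src0, uint32_t src1, uint64_t src2)
{
    uint64_t vdst = 0;
    for (unsigned n = 0; n < 4; n++) {
        uint32_t window = (uint32_t)(src0 >> (n * 8));
        uint32_t acc = (uint32_t)(src2 >> (n * 16)) & 0xffff;
        vdst |= (uint64_t)masked_sad4(window, src1, acc) << (n * 16);
    }
    return vdst;
}
```

For example, with SRC0 bytes 1, 2, 3, 4, 5, 0, 0, 0 (lowest byte first), SRC1 = 0x01000101 and accumulators 10, 20, 30, 40, the third byte of every window is ignored because the third byte of SRC1 is zero.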
+<h4>V_MSAD_U8</h4>
+<p>Opcode: 369 (0x171) for GCN 1.0/1.1; 484 (0x1e4) for GCN 1.2<br />
+Syntax: V_MSAD_U8 VDST, SRC0, SRC1, SRC2<br />
+Description: Calculate the sum of absolute differences between the bytes of SRC0 and SRC1,
+skipping bytes whose value in SRC1 is zero; add SRC2 to the result and store it in VDST.<br />
+Operation:<br />
+<code>VDST = SRC2
+for (UINT8 i = 0; i &lt; 4; i++)
+    if (((SRC1 &gt;&gt; (i*8)) &amp; 0xff) != 0)
+        VDST += ABS(((SRC0 &gt;&gt; (i*8)) &amp; 0xff) - ((SRC1 &gt;&gt; (i*8)) &amp; 0xff))</code></p>
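As a rough illustration of the masking rule, the operation above can be modelled in C. The function name `msad_u8` is a hypothetical choice for this sketch, and all three operands are assumed to be plain 32-bit unsigned values.

```c
#include <stdint.h>

/* Software model of the V_MSAD_U8 pseudocode (hypothetical function name).
 * For each of the four bytes, the absolute difference is added only when
 * the byte taken from src1 is non-zero; src2 is the starting accumulator. */
uint32_t msad_u8(uint32_t src0, uint32_t src1, uint32_t src2)
{
    uint32_t vdst = src2;
    for (unsigned i = 0; i < 4; i++) {
        uint32_t a = (src0 >> (i * 8)) & 0xff;
        uint32_t b = (src1 >> (i * 8)) & 0xff;
        if (b != 0)
            vdst += (a > b) ? (a - b) : (b - a);
    }
    return vdst;
}
```

With SRC0 = 0x10203040, SRC1 = 0x00250050 and SRC2 = 100, only bytes 0 and 2 contribute (differences 0x10 and 0x05), so the model returns 121.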
 <h4>V_MUL_F64</h4>
 <p>Opcode: 357 (0x165) for GCN 1.0/1.1; 641 (0x281) for GCN 1.2<br />
 …
         VDST = ASFLOAT(SRC0) * ASFLOAT(SRC1)
 }</code></p>
+<h4>V_QSAD_U8, V_QSAD_PK_U16_U8</h4>
+<p>Opcode: 370 (0x172) for GCN 1.0/1.1; 485 (0x1e5) for GCN 1.2<br />
+Syntax (GCN 1.0): V_QSAD_U8 VDST(2), SRC0(2), SRC1, SRC2(2)<br />
+Syntax (GCN 1.1/1.2): V_QSAD_PK_U16_U8 VDST(2), SRC0(2), SRC1, SRC2(2)<br />
+Description: Compute four sums of absolute differences with accumulation. The N-th operation
+(N = 0..3) takes its first argument from the four bytes of SRC0 starting at byte N and ending
+at byte N+3, its second argument from SRC1, and its third argument (the accumulator) from the
+N-th 16-bit word of SRC2.<br />
+Operation:<br />
+<code>UINT32 SADU8(UINT32 S0, UINT32 S1, UINT32 S2)
+{
+    UINT32 OUT = S2;
+    for (UINT8 i = 0; i &lt; 4; i++)
+        OUT += ABS(((S0 &gt;&gt; (i*8)) &amp; 0xff) - ((S1 &gt;&gt; (i*8)) &amp; 0xff));
+    return OUT;
+}
+VDST = SADU8((UINT32)SRC0, SRC1, SRC2 &amp; 0xffff)
+VDST |= ((UINT64)SADU8((UINT32)(SRC0&gt;&gt;8), SRC1, (SRC2&gt;&gt;16) &amp; 0xffff))&lt;&lt;16
+VDST |= ((UINT64)SADU8((UINT32)(SRC0&gt;&gt;16), SRC1, (SRC2&gt;&gt;32) &amp; 0xffff))&lt;&lt;32
+VDST |= ((UINT64)SADU8((UINT32)(SRC0&gt;&gt;24), SRC1, (SRC2&gt;&gt;48) &amp; 0xffff))&lt;&lt;48</code></p>
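To make the sliding byte windows concrete, here is a minimal C model of the unmasked operation. The names `sad4_u8` and `qsad_pk_u16_u8` are hypothetical, SRC0 and SRC2 are assumed to be 64-bit values and SRC1 a 32-bit value, and, as in the pseudocode, each result is shifted into place without clamping, so the sketch assumes that no lane exceeds 16 bits.

```c
#include <stdint.h>

/* Unmasked sum of absolute differences over four byte pairs, added to acc.
 * Hypothetical helper name; mirrors SADU8 from the pseudocode. */
uint32_t sad4_u8(uint32_t s0, uint32_t s1, uint32_t acc)
{
    for (unsigned i = 0; i < 4; i++) {
        uint32_t a = (s0 >> (i * 8)) & 0xff;
        uint32_t b = (s1 >> (i * 8)) & 0xff;
        acc += (a > b) ? (a - b) : (b - a);
    }
    return acc;
}

/* Window n covers bytes n..n+3 of src0; consecutive windows overlap by
 * three bytes. Lane n of the result is the SAD of window n against src1
 * plus the n-th 16-bit word of src2. Assumption: lanes do not overflow. */
uint64_t qsad_pk_u16_u8(uint64_t src0, uint32_t src1, uint64_t src2)
{
    uint64_t vdst = 0;
    for (unsigned n = 0; n < 4; n++) {
        uint32_t window = (uint32_t)(src0 >> (n * 8));
        uint32_t acc = (uint32_t)(src2 >> (n * 16)) & 0xffff;
        vdst |= (uint64_t)sad4_u8(window, src1, acc) << (n * 16);
    }
    return vdst;
}
```

With SRC0 bytes 1, 2, 3, 4, 5, 0, 0, 0 (lowest byte first), SRC1 = 0x01010101 and SRC2 = 0, the four windows give sums 6, 10, 10 and 9, packed as 0x0009000A000A0006.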
 <h4>V_SAD_HI_U8</h4>
 <p>Opcode: 347 (0x15b) for GCN 1.0/1.1; 474 (0x1da) for GCN 1.2<br />