Changes between Version 3 and Version 4 of GcnInstrsVop2


Ignore:
Timestamp:
11/21/15 00:00:23 (8 years ago)
Author:
trac
Comment:

--

  • GcnInstrsVop2

    v3 v4  
    192192<p>Negation and absolute value can be combined: <code>-ABS(V0)</code>. Modifiers CLAMP and
    193193OMOD (MUL:2, MUL:4 and DIV:2) can be given in any order.</p>
     194<p>Limitations for operands:</p>
     195<ul>
     196<li>an instruction can read only one SGPR. Multiple occurrences of the same
     197SGPR are allowed</li>
     198<li>only one literal constant can be used, and only when no SGPR or M0 is used in the
     199source operands</li>
     200<li>only SRC0 can hold LDS_DIRECT</li>
     201</ul>
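The operand limitations above can be sketched as a small checker. This is only an illustration: the function name `vop2_operand_violations` and the `(kind, value)` operand encoding are hypothetical, and the caller is assumed to have already classified each source operand (in particular, to have separated literal constants from other constant operands).

```python
def vop2_operand_violations(srcs):
    """Return a list of rule violations for a list of source operands.

    Each operand is a hypothetical (kind, value) pair, where kind is one of
    "sgpr", "vgpr", "m0", "literal" or "lds_direct". srcs[0] is SRC0.
    """
    violations = []

    # Rule 1: only one SGPR may be read, but it may appear several times.
    sgprs = {value for kind, value in srcs if kind == "sgpr"}
    if len(sgprs) > 1:
        violations.append("more than one distinct SGPR")

    # Rule 2: at most one literal, and only when no SGPR or M0 is used.
    literals = [value for kind, value in srcs if kind == "literal"]
    uses_scalar = bool(sgprs) or any(kind == "m0" for kind, _ in srcs)
    if len(literals) > 1:
        violations.append("more than one literal constant")
    if literals and uses_scalar:
        violations.append("literal combined with SGPR or M0")

    # Rule 3: LDS_DIRECT is allowed only in SRC0.
    if any(kind == "lds_direct" for kind, _ in srcs[1:]):
        violations.append("LDS_DIRECT outside SRC0")

    return violations
```

For example, `vop2_operand_violations([("sgpr", 4), ("sgpr", 4)])` returns an empty list, because the same SGPR is used twice.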
    194202<p>VOP2 opcodes (0-63) map to VOP3 opcodes in the range 256-319.
    195203List of the instructions by opcode:</p>
     
    301309Description: If the bit for the current thread in VCC or SDST is set, store SRC1 to VDST;
    302310otherwise store SRC0 to VDST. The CLAMP and OMOD modifiers do not affect the result.<br />
    303 Operation:
     311Operation:<br />
    304312<code>VDST = SSRC2&amp;(1ULL&lt;&lt;THREADID) ? SRC1 : SRC0</code></p>
     313<h4>V_MAC_LEGACY_F32</h4>
     314<p>Opcode VOP2: 6 (0x6) for GCN 1.0/1.1<br />
     315Opcode VOP3a: 262 (0x106) for GCN 1.0/1.1<br />
     316Syntax: V_MAC_LEGACY_F32 VDST, SRC0, SRC1<br />
     317Description: Multiply FP value from SRC0 by FP value from SRC1 and add the result to VDST.
     318If either value is 0.0, VDST is left unchanged (IEEE rules for 0.0*x are not applied).<br />
     319Operation:<br />
     320<code>if ((FLOAT)SRC0!=0.0 &amp;&amp; (FLOAT)SRC1!=0.0)
     321    VDST = (FLOAT)SRC0 * (FLOAT)SRC1 + (FLOAT)VDST</code></p>
     322<h4>V_MUL_LEGACY_F32</h4>
     323<p>Opcode VOP2: 7 (0x7) for GCN 1.0/1.1; 4 (0x4) for GCN 1.2<br />
     324Opcode VOP3a: 263 (0x107) for GCN 1.0/1.1; 260 (0x104) for GCN 1.2<br />
     325Syntax: V_MUL_LEGACY_F32 VDST, SRC0, SRC1<br />
     326Description: Multiply FP value from SRC0 by FP value from SRC1 and store the result to VDST.
     327If either value is 0.0, always store 0.0 to VDST (IEEE rules for 0.0*x are not applied).<br />
     328Operation:<br />
     329<code>if ((FLOAT)SRC0!=0.0 &amp;&amp; (FLOAT)SRC1!=0.0)
     330    VDST = (FLOAT)SRC0 * (FLOAT)SRC1
     331else
     332    VDST = 0.0</code></p>
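The difference between the legacy and the IEEE multiply shows up when one operand is zero and the other is infinite. The following is a minimal sketch of the operation above, assuming Python floats as a stand-in for FP values (they are double precision, so rounding differs from the hardware's single precision); the function name is chosen only for illustration.

```python
import math

def v_mul_legacy_f32(src0, src1):
    """Legacy multiply: any 0.0 operand forces a 0.0 result."""
    if src0 != 0.0 and src1 != 0.0:
        return src0 * src1
    return 0.0

# IEEE multiplication gives NaN for inf * 0.0, the legacy rule gives 0.0.
ieee_result = float("inf") * 0.0
legacy_result = v_mul_legacy_f32(float("inf"), 0.0)
print(math.isnan(ieee_result), legacy_result)
```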
     333<h4>V_MUL_F32</h4>
     334<p>Opcode VOP2: 8 (0x8) for GCN 1.0/1.1; 5 (0x5) for GCN 1.2<br />
     335Opcode VOP3a: 264 (0x108) for GCN 1.0/1.1; 261 (0x105) for GCN 1.2<br />
     336Syntax: V_MUL_F32 VDST, SRC0, SRC1<br />
     337Description: Multiply FP value from SRC0 by FP value from SRC1 and store result to VDST.<br />
     338Operation:<br />
     339<code>VDST = (FLOAT)SRC0 * (FLOAT)SRC1</code></p>
     340<h4>V_MUL_HI_I32_I24</h4>
     341<p>Opcode VOP2: 10 (0xa) for GCN 1.0/1.1; 7 (0x7) for GCN 1.2<br />
     342Opcode VOP3a: 266 (0x10a) for GCN 1.0/1.1; 263 (0x107) for GCN 1.2<br />
     343Syntax: V_MUL_HI_I32_I24 VDST, SRC0, SRC1<br />
     344Description: Multiply 24-bit signed integer value from SRC0 by 24-bit signed value from SRC1
     345and store the upper 16 bits of the 48-bit result to VDST with sign extension.
     346No modifier affects the result.<br />
     347Operation:<br />
     348<code>INT32 V0 = (INT32)((SRC0&amp;0x7fffff) | (SRC0&amp;0x800000 ? 0xff800000 : 0))
     349INT32 V1 = (INT32)((SRC1&amp;0x7fffff) | (SRC1&amp;0x800000 ? 0xff800000 : 0))
     350VDST = ((INT64)V0 * V1)&gt;&gt;32</code></p>
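The sign extension from bit 23 and the arithmetic shift by 32 can be checked with a short model. This is a sketch of the operation above using Python's arbitrary-precision integers; the helper names `sext24` and `v_mul_hi_i32_i24` are illustrative, and the result is returned as the unsigned 32-bit pattern that would be stored in VDST.

```python
def sext24(x):
    """Interpret the low 24 bits of x as a signed 24-bit integer."""
    x &= 0xffffff
    return x - 0x1000000 if x & 0x800000 else x

def v_mul_hi_i32_i24(src0, src1):
    """Upper bits (63..32) of the signed 24x24-bit product, as a 32-bit pattern."""
    product = sext24(src0) * sext24(src1)
    # Python's >> on negative ints is an arithmetic shift, like (INT64)>>32.
    return (product >> 32) & 0xffffffff

# -8388608 * 8388607 is negative, so the upper bits are sign-extended.
print(hex(v_mul_hi_i32_i24(0x800000, 0x7fffff)))  # 0xffffc000
```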
     351<h4>V_MUL_HI_U32_U24</h4>
     352<p>Opcode VOP2: 12 (0xc) for GCN 1.0/1.1; 9 (0x9) for GCN 1.2<br />
     353Opcode VOP3a: 268 (0x10c) for GCN 1.0/1.1; 265 (0x109) for GCN 1.2<br />
     354Syntax: V_MUL_HI_U32_U24 VDST, SRC0, SRC1<br />
     355Description: Multiply 24-bit unsigned integer value from SRC0 by 24-bit unsigned value
     356from SRC1 and store the upper 16 bits of the 48-bit result to VDST.
     357No modifier affects the result.<br />
     358Operation:<br />
     359<code>VDST = ((UINT64)(SRC0&amp;0xffffff) * (UINT32)(SRC1&amp;0xffffff)) &gt;&gt; 32</code></p>
     360<h4>V_MUL_I32_I24</h4>
     361<p>Opcode VOP2: 9 (0x9) for GCN 1.0/1.1; 6 (0x6) for GCN 1.2<br />
     362Opcode VOP3a: 265 (0x109) for GCN 1.0/1.1; 262 (0x106) for GCN 1.2<br />
     363Syntax: V_MUL_I32_I24 VDST, SRC0, SRC1<br />
     364Description: Multiply 24-bit signed integer value from SRC0 by 24-bit signed value from SRC1
     365and store the lower 32 bits of the result to VDST. No modifier affects the result.<br />
     366Operation:<br />
     367<code>INT32 V0 = (INT32)((SRC0&amp;0x7fffff) | (SRC0&amp;0x800000 ? 0xff800000 : 0))
     368INT32 V1 = (INT32)((SRC1&amp;0x7fffff) | (SRC1&amp;0x800000 ? 0xff800000 : 0))
     369VDST = V0 * V1</code></p>
     370<h4>V_MUL_U32_U24</h4>
     371<p>Opcode VOP2: 11 (0xb) for GCN 1.0/1.1; 8 (0x8) for GCN 1.2<br />
     372Opcode VOP3a: 267 (0x10b) for GCN 1.0/1.1; 264 (0x108) for GCN 1.2<br />
     373Syntax: V_MUL_U32_U24 VDST, SRC0, SRC1<br />
     374Description: Multiply 24-bit unsigned integer value from SRC0 by 24-bit unsigned value
     375from SRC1 and store the lower 32 bits of the result to VDST. No modifier affects the result.<br />
     376Operation:<br />
     377<code>VDST = (UINT32)(SRC0&amp;0xffffff) * (UINT32)(SRC1&amp;0xffffff)</code></p>
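Because the product of two 24-bit values can need up to 48 bits, the 32-bit destination holds only part of it. The sketch below models the unsigned low multiply together with its high counterpart (V_MUL_HI_U32_U24) to show how the two results split one product; the function names are illustrative only.

```python
MASK24 = 0xffffff
MASK32 = 0xffffffff

def v_mul_u32_u24(src0, src1):
    """Lower 32 bits of the unsigned 24x24-bit product."""
    return ((src0 & MASK24) * (src1 & MASK24)) & MASK32

def v_mul_hi_u32_u24(src0, src1):
    """Bits 47..32 of the same product (the upper 16 bits)."""
    return ((src0 & MASK24) * (src1 & MASK24)) >> 32

# 0xffffff * 0xffffff = 0xfffffe000001
low = v_mul_u32_u24(0xffffff, 0xffffff)
high = v_mul_hi_u32_u24(0xffffff, 0xffffff)
print(hex(high), hex(low))  # 0xffff 0xfe000001
```

Bits 24 to 31 of each source are ignored, so `v_mul_u32_u24(0xff000002, 3)` gives 6.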
    305378<h4>V_READLANE_B32</h4>
    306379<p>Opcode VOP2: 1 (0x1) for GCN 1.0/1.1<br />
     
    326399Operation:<br />
    327400<code>VDST = (FLOAT)SRC0 - (FLOAT)SRC1</code></p>
     401<h4>V_SUBREV_F32</h4>
     402<p>Opcode VOP2: 5 (0x5) for GCN 1.0/1.1; 3 (0x3) for GCN 1.2<br />
     403Opcode VOP3a: 261 (0x105) for GCN 1.0/1.1; 259 (0x103) for GCN 1.2<br />
     404Syntax: V_SUBREV_F32 VDST, SRC0, SRC1<br />
     405Description: Subtract FP value of SRC0 from FP value of SRC1 and store the result to VDST.<br />
     406Operation:<br />
     407<code>VDST = (FLOAT)SRC1 - (FLOAT)SRC0</code></p>
    328408}}}