Changes between Version 25 and Version 26 of GcnInstrsVop2


Timestamp:
06/10/17 10:00:25 (7 years ago)
Author:
trac

  • GcnInstrsVop2

    v25 v26  
    194194NOTE: OMOD and CLAMP modifiers affect only instructions whose output is
    195195a floating point value.<br />
    196 NOTE: ABS and negation are applied to the source operand for any instruction.</p>
     196NOTE: ABS and negation are applied to the source operand for any instruction.<br />
     197NOTE: The OMOD modifier doesn't work for half precision (FP16) instructions (except V_MAC_F16).</p>
    197198<p>Negation and absolute value can be combined: <code>-ABS(V0)</code>. Modifiers CLAMP and
    198199OMOD (MUL:2, MUL:4 and DIV:2) can be given in any order.</p>
     
    534535<h3>Instruction set</h3>
    535536<p>Alphabetically sorted instruction list:</p>
     537<h4>V_ADD_F16</h4>
     538<p>Opcode VOP2: 31 (0x1f) for GCN 1.2<br />
     539Opcode VOP3A: 287 (0x11f) for GCN 1.2<br />
     540Syntax: V_ADD_F16 VDST, SRC0, SRC1<br />
     541Description: Add two FP16 values from SRC0 and SRC1 and store result to VDST.<br />
     542Operation:<br />
     543<code>VDST = ASHALF(SRC0) + ASHALF(SRC1)</code></p>
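The `ASHALF` notation in the operation above reinterprets a register's low 16 bits as an IEEE 754 half-precision value. A minimal Python sketch of that reading follows; the helper names `ashalf`, `to_half_bits` and `v_add_f16` are hypothetical, and the sketch ignores denormal handling, overflow behaviour and the ABS/NEG/CLAMP modifiers, none of which are specified in this entry.

```python
import struct

def ashalf(bits: int) -> float:
    """Reinterpret the low 16 bits of a register value as an IEEE 754 half."""
    return struct.unpack("<e", struct.pack("<H", bits & 0xFFFF))[0]

def to_half_bits(value: float) -> int:
    """Round a Python float to half precision and return its 16-bit pattern.

    Note: struct raises OverflowError for values too large for a half;
    real hardware behaviour on overflow is not modelled here.
    """
    return struct.unpack("<H", struct.pack("<e", value))[0]

def v_add_f16(src0: int, src1: int) -> int:
    """Sketch of VDST = ASHALF(SRC0) + ASHALF(SRC1) on raw bit patterns."""
    return to_half_bits(ashalf(src0) + ashalf(src1))

# 0x3C00 is 1.0 and 0x4000 is 2.0 in half precision; their sum 3.0 is 0x4200.
print(hex(v_add_f16(0x3C00, 0x4000)))
```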
    536544<h4>V_ADD_F32</h4>
    537545<p>Opcode VOP2: 3 (0x3) for GCN 1.0/1.1; 1 (0x1) for GCN 1.2<br />
     
    731739Operation:<br />
    732740<code>VDST = SRC1 &gt;&gt; (SRC0&amp;31)</code></p>
     741<h4>V_MAC_F16</h4>
     742<p>Opcode VOP2: 35 (0x23) for GCN 1.2<br />
     743Opcode VOP3A: 291 (0x123) for GCN 1.2<br />
     744Syntax: V_MAC_F16 VDST, SRC0, SRC1<br />
     745Description: Multiply FP16 value from SRC0 by FP16 value from SRC1 and add result to VDST.<br />
     746Operation:<br />
     747<code>VDST = ASHALF(SRC0) * ASHALF(SRC1) + ASHALF(VDST)</code></p>
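Unlike the other two-operand instructions, V_MAC_F16 reads VDST as an accumulator as well as writing it. The following Python sketch illustrates that read-modify-write behaviour on raw 16-bit patterns. The function names are hypothetical, and the sketch assumes the result is rounded to half precision once at the end; the entry above does not say whether the hardware rounds the product separately.

```python
import struct

def ashalf(bits: int) -> float:
    """Reinterpret the low 16 bits of a register value as an IEEE 754 half."""
    return struct.unpack("<e", struct.pack("<H", bits & 0xFFFF))[0]

def to_half_bits(value: float) -> int:
    """Round a Python float to half precision and return its 16-bit pattern."""
    return struct.unpack("<H", struct.pack("<e", value))[0]

def v_mac_f16(vdst: int, src0: int, src1: int) -> int:
    """Sketch of VDST = ASHALF(SRC0) * ASHALF(SRC1) + ASHALF(VDST).

    Assumption: a single rounding to half precision at the end.
    """
    return to_half_bits(ashalf(src0) * ashalf(src1) + ashalf(vdst))

# Accumulate 2.0 * 3.0 into a register that already holds 1.0.
acc = 0x3C00                          # 1.0
acc = v_mac_f16(acc, 0x4000, 0x4200)  # 2.0 * 3.0 + 1.0
print(hex(acc))                       # bit pattern of 7.0
```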
    733748<h4>V_MAC_F32</h4>
    734749<p>Opcode VOP2: 31 (0x1f) for GCN 1.0/1.1; 22 (0x16) for GCN 1.2<br />
     
    747762<code>if (ASFLOAT(SRC0)!=0.0 &amp;&amp; ASFLOAT(SRC1)!=0.0)
    748763    VDST = ASFLOAT(SRC0) * ASFLOAT(SRC1) + ASFLOAT(VDST)</code></p>
     764<h4>V_MADMK_F16</h4>
     765<p>Opcode VOP2: 36 (0x24) for GCN 1.2<br />
     766Opcode VOP3A: 292 (0x124) for GCN 1.2<br />
     767Syntax: V_MADMK_F16 VDST, SRC0, FLOAT16LIT, SRC1<br />
     768Description: Multiply FP16 value from SRC0 by the constant literal FLOAT16LIT, add
     769FP16 value from SRC1, and store result to VDST. The constant literal follows
     770the instruction word. Uses nearest-even rounding.<br />
     771Operation:
     772<code>VDST = ASHALF(SRC0) * ASHALF(FLOAT16LIT) + ASHALF(SRC1)</code></p>
    749773<h4>V_MADMK_F32</h4>
    750774<p>Opcode: VOP2: 32 (0x20) for GCN 1.0/1.1; 23 (0x17) for GCN 1.2<br />
     
    756780Operation:
    757781<code>VDST = ASFLOAT(SRC0) * ASFLOAT(FLOATLIT) + ASFLOAT(SRC1)</code></p>
     782<h4>V_MADAK_F16</h4>
     783<p>Opcode VOP2: 37 (0x25) for GCN 1.2<br />
     784Opcode VOP3A: 293 (0x125) for GCN 1.2<br />
     785Syntax: V_MADAK_F16 VDST, SRC0, SRC1, FLOAT16LIT<br />
     786Description: Multiply FP16 value from SRC0 by FP16 value from SRC1, add
     787the constant literal FLOAT16LIT, and store result to VDST. The constant literal follows
     788the instruction word.<br />
     789Operation:
     790<code>VDST = ASHALF(SRC0) * ASHALF(SRC1) + ASHALF(FLOAT16LIT)</code></p>
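V_MADMK_F16 and V_MADAK_F16 differ only in where the literal sits: MADMK uses it as the multiplier, MADAK as the addend. The Python sketch below contrasts the two operations on raw 16-bit patterns. The function names are hypothetical, a single final rounding to half precision is assumed, and the encoding of the literal after the instruction word is not modelled.

```python
import struct

def ashalf(bits: int) -> float:
    """Reinterpret the low 16 bits of a value as an IEEE 754 half."""
    return struct.unpack("<e", struct.pack("<H", bits & 0xFFFF))[0]

def to_half_bits(value: float) -> int:
    """Round a Python float to half precision and return its 16-bit pattern."""
    return struct.unpack("<H", struct.pack("<e", value))[0]

def v_madmk_f16(src0: int, float16lit: int, src1: int) -> int:
    """Sketch of VDST = ASHALF(SRC0) * ASHALF(FLOAT16LIT) + ASHALF(SRC1)."""
    return to_half_bits(ashalf(src0) * ashalf(float16lit) + ashalf(src1))

def v_madak_f16(src0: int, src1: int, float16lit: int) -> int:
    """Sketch of VDST = ASHALF(SRC0) * ASHALF(SRC1) + ASHALF(FLOAT16LIT)."""
    return to_half_bits(ashalf(src0) * ashalf(src1) + ashalf(float16lit))

two, three, half = 0x4000, 0x4200, 0x3800  # 2.0, 3.0 and 0.5 as half bits
print(ashalf(v_madmk_f16(two, half, three)))  # 2.0 * 0.5 + 3.0
print(ashalf(v_madak_f16(two, three, half)))  # 2.0 * 3.0 + 0.5
```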
    758791<h4>V_MADAK_F32</h4>
    759792<p>Opcode: VOP2: 33 (0x21) for GCN 1.0/1.1; 24 (0x18) for GCN 1.2<br />
     
    864897else
    865898    VDST = 0.0</code></p>
     899<h4>V_MUL_F16</h4>
     900<p>Opcode VOP2: 34 (0x22) for GCN 1.2<br />
     901Opcode VOP3A: 290 (0x122) for GCN 1.2<br />
     902Syntax: V_MUL_F16 VDST, SRC0, SRC1<br />
     903Description: Multiply FP16 value from SRC0 by FP16 value from SRC1
     904and store result to VDST.<br />
     905Operation:<br />
     906<code>VDST = ASHALF(SRC0) * ASHALF(SRC1)</code></p>
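In the FP16 entries of this section that label both encodings, the VOP3A opcode equals the VOP2 opcode plus 256 (0x100). The Python sketch below records the GCN 1.2 values listed in this section and checks that relationship. The table and helper names are hypothetical, and the offset is an observation drawn from these entries rather than a rule stated in the text.

```python
# Offset between VOP2 and VOP3A opcodes, inferred from the entries listed here.
VOP3A_OFFSET = 0x100

# GCN 1.2 VOP2 opcodes for the FP16 instructions with both encodings labelled.
F16_VOP2_OPCODES = {
    "V_ADD_F16": 31,
    "V_SUB_F16": 32,
    "V_SUBREV_F16": 33,
    "V_MUL_F16": 34,
    "V_MAC_F16": 35,
}

# VOP3A opcodes as listed in the same entries, kept separately for checking.
F16_VOP3A_OPCODES = {
    "V_ADD_F16": 287,
    "V_SUB_F16": 288,
    "V_SUBREV_F16": 289,
    "V_MUL_F16": 290,
    "V_MAC_F16": 291,
}

def vop3a_opcode(mnemonic: str) -> int:
    """Derive the VOP3A opcode from the VOP2 opcode using the inferred offset."""
    return F16_VOP2_OPCODES[mnemonic] + VOP3A_OFFSET

for name, listed in F16_VOP3A_OPCODES.items():
    derived = vop3a_opcode(name)
    print(f"{name}: VOP2 {F16_VOP2_OPCODES[name]:#x}, VOP3A {derived:#x}, "
          f"matches listed value: {derived == listed}")
```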
    866907<h4>V_MUL_F32</h4>
    867908<p>Opcode VOP2: 8 (0x8) for GCN 1.0/1.1; 5 (0x5) for GCN 1.2<br />
     
    925966Operation:<br />
    926967<code>SDST = VSRC0[SSRC1 &amp; 63]</code></p>
     968<h4>V_SUB_F16</h4>
     969<p>Opcode VOP2: 32 (0x20) for GCN 1.2<br />
     970Opcode VOP3A: 288 (0x120) for GCN 1.2<br />
     971Syntax: V_SUB_F16 VDST, SRC0, SRC1<br />
     972Description: Subtract FP16 value of SRC1 from FP16 value of SRC0 and store result to VDST.<br />
     973Operation:<br />
     974<code>VDST = ASHALF(SRC0) - ASHALF(SRC1)</code></p>
    927975<h4>V_SUB_F32</h4>
    928976<p>Opcode VOP2: 4 (0x4) for GCN 1.0/1.1; 2 (0x2) for GCN 1.2<br />
     
    9801028VDST = temp
    9811029SDST = (SDST&amp;~mask) | ((temp &gt;&gt; 32) ? mask : 0)</code></p>
     1030<h4>V_SUBREV_F16</h4>
     1031<p>Opcode VOP2: 33 (0x21) for GCN 1.2<br />
     1032Opcode VOP3A: 289 (0x121) for GCN 1.2<br />
     1033Syntax: V_SUBREV_F16 VDST, SRC0, SRC1<br />
     1034Description: Subtract FP16 value of SRC0 from FP16 value of SRC1 and store result to VDST.<br />
     1035Operation:<br />
     1036<code>VDST = ASHALF(SRC1) - ASHALF(SRC0)</code></p>
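V_SUBREV_F16 is V_SUB_F16 with its two source operands swapped. The Python sketch below makes that relationship explicit on raw 16-bit patterns. The function names are hypothetical, and operand modifiers and special-value handling are not modelled.

```python
import struct

def ashalf(bits: int) -> float:
    """Reinterpret the low 16 bits of a register value as an IEEE 754 half."""
    return struct.unpack("<e", struct.pack("<H", bits & 0xFFFF))[0]

def to_half_bits(value: float) -> int:
    """Round a Python float to half precision and return its 16-bit pattern."""
    return struct.unpack("<H", struct.pack("<e", value))[0]

def v_sub_f16(src0: int, src1: int) -> int:
    """Sketch of VDST = ASHALF(SRC0) - ASHALF(SRC1)."""
    return to_half_bits(ashalf(src0) - ashalf(src1))

def v_subrev_f16(src0: int, src1: int) -> int:
    """Sketch of VDST = ASHALF(SRC1) - ASHALF(SRC0): V_SUB_F16 with sources swapped."""
    return v_sub_f16(src1, src0)

three, one = 0x4200, 0x3C00              # 3.0 and 1.0 as half bits
print(ashalf(v_sub_f16(three, one)))     # 3.0 - 1.0
print(ashalf(v_subrev_f16(three, one)))  # 1.0 - 3.0
```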
    9821037<h4>V_SUBREV_F32</h4>
    9831038<p>Opcode VOP2: 5 (0x5) for GCN 1.0/1.1; 3 (0x3) for GCN 1.2<br />