Changes between Version 26 and Version 27 of GcnInstrsVop2


Timestamp:
06/10/17 12:00:24 (7 years ago)
Author:
trac
Comment:

--

  • GcnInstrsVop2

    v26 v27  
    565565UINT64 mask = (1ULL&lt;&lt;LANEID)
    566566SDST = (SDST&amp;~mask) | ((temp &gt;&gt; 32) ? mask : 0)</code></p>
     567<h4>V_ADD_U16</h4>
     568<p>Opcode VOP2: 38 (0x26) for GCN 1.2<br />
     569Opcode VOP3A: 294 (0x126) for GCN 1.2<br />
     570Syntax: V_ADD_U16 VDST, SRC0, SRC1<br />
     571Description: Add two 16-bit unsigned values from SRC0 and SRC1 and
     572store 16-bit unsigned result to VDST.<br />
     573Operation:<br />
     574<code>VDST = (SRC0 + SRC1) &amp; 0xffff</code></p>
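A minimal Python sketch of the V_ADD_U16 operation above, assuming the operands are given as plain non-negative integers. The function name `v_add_u16` is hypothetical and used only for illustration; it is not part of any assembler or emulator API.

```python
def v_add_u16(src0: int, src1: int) -> int:
    """Emulate VDST = (SRC0 + SRC1) & 0xffff (illustrative helper)."""
    # The mask discards the carry out of bit 15, so the sum wraps modulo 2**16.
    return (src0 + src1) & 0xFFFF


print(hex(v_add_u16(0x1234, 0x1111)))  # 0x2345
print(hex(v_add_u16(0xFFFF, 0x0002)))  # 0x1 (wrapped around)
```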
    567575<h4>V_ADDC_U32</h4>
    568576<p>Opcode VOP2: 40 (0x28) for GCN 1.0/1.1; 28 (0x1c) for GCN 1.2<br />
     
    595603Operation:<br />
    596604<code>VDST = (INT32)SRC0 &gt;&gt; (SRC1&amp;31)</code></p>
     605<h4>V_ASHRREV_B16</h4>
     606<p>Opcode VOP2: 44 (0x2c) for GCN 1.2<br />
     607Opcode VOP3A: 300 (0x12c) for GCN 1.2<br />
     608Syntax: V_ASHRREV_B16 VDST, SRC0, SRC1<br />
     609Description: Shift right signed 16-bit value from SRC1 by (SRC0&amp;15) bits and
     610store 16-bit signed result into VDST.<br />
     611Operation:<br />
     612<code>VDST = ((INT16)SRC1 &gt;&gt; (SRC0&amp;15)) &amp; 0xffff</code></p>
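The operation above can be sketched in Python as follows. The helpers `as_int16` and `v_ashrrev_b16` are hypothetical names; the sketch relies on Python's `>>` being an arithmetic (sign-preserving) shift for negative integers.

```python
def as_int16(x: int) -> int:
    """Reinterpret the low 16 bits of x as a signed 16-bit value."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def v_ashrrev_b16(src0: int, src1: int) -> int:
    """Emulate VDST = ((INT16)SRC1 >> (SRC0 & 15)) & 0xffff."""
    # Note the reversed operand roles: SRC0 is the shift count, SRC1 the value.
    return (as_int16(src1) >> (src0 & 15)) & 0xFFFF


print(hex(v_ashrrev_b16(4, 0x8000)))  # 0xf800 (sign bit replicated)
print(hex(v_ashrrev_b16(1, 0x7FFE)))  # 0x3fff
```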
    597613<h4>V_ASHRREV_I32</h4>
    598614<p>Opcode VOP2: 24 (0x18) for GCN 1.0/1.1; 17 (0x11) for GCN 1.2<br />
     
    703719UINT16 D1 = ASINT16(CVT_HALF_RTZ(ASFLOAT(SRC1)))
    704720VDST = D0 | (((UINT32)D1) &lt;&lt; 16)</code></p>
     721<h4>V_LDEXP_F16</h4>
     722<p>Opcode VOP2: 51 (0x33) for GCN 1.2<br />
     723Opcode VOP3A: 307 (0x133) for GCN 1.2<br />
     724Syntax: V_LDEXP_F16 VDST, SRC0, SRC1<br />
     725Description: Do ldexp operation on SRC0 and SRC1 (multiply SRC0 by 2**(SRC1)).
     726SRC1 is signed integer, SRC0 is half floating point value.<br />
     727Operation:<br />
     728<code>VDST = ASHALF(SRC0) * POW(2.0, (INT32)SRC1)</code></p>
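A rough Python sketch of the V_LDEXP_F16 operation above, working on raw 16-bit patterns. All function names are hypothetical. Several points are assumptions rather than documented hardware behaviour: the result is rounded to nearest-even (what Python's `struct` format `e` does), overflow produces a signed infinity, and denormals are kept.

```python
import math
import struct


def as_half(bits: int) -> float:
    """Reinterpret the low 16 bits as an IEEE half-precision value."""
    return struct.unpack("<e", struct.pack("<H", bits & 0xFFFF))[0]


def half_bits(value: float) -> int:
    """Convert a float to half-precision bits (assumed: overflow gives infinity)."""
    try:
        return struct.unpack("<H", struct.pack("<e", value))[0]
    except OverflowError:
        return 0xFC00 if value < 0 else 0x7C00


def as_int32(x: int) -> int:
    """Reinterpret the low 32 bits of x as a signed 32-bit value."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x


def v_ldexp_f16(src0: int, src1: int) -> int:
    """Emulate VDST = ASHALF(SRC0) * POW(2.0, (INT32)SRC1) on bit patterns."""
    # Clamping the exponent keeps math.ldexp from raising on huge values;
    # +/-64 already pushes any finite half value out of the half range.
    exponent = max(-64, min(64, as_int32(src1)))
    return half_bits(math.ldexp(as_half(src0), exponent))


print(hex(v_ldexp_f16(0x3C00, 3)))           # 1.0 * 2**3 = 8.0 -> 0x4800
print(hex(v_ldexp_f16(0x3C00, 0xFFFFFFFF)))  # 1.0 * 2**-1 = 0.5 -> 0x3800
```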
    705729<h4>V_LDEXP_F32</h4>
    706730<p>Opcode VOP2: 43 (0x2b) for GCN 1.0/1.1<br />
     
    718742Operation:<br />
    719743<code>VDST = SRC0 &lt;&lt; (SRC1&amp;31)</code></p>
     744<h4>V_LSHLREV_B16</h4>
     745<p>Opcode VOP2: 42 (0x2a) for GCN 1.2<br />
     746Opcode VOP3A: 298 (0x12a) for GCN 1.2<br />
     747Syntax: V_LSHLREV_B16 VDST, SRC0, SRC1<br />
     748Description: Shift left unsigned 16-bit value from SRC1 by (SRC0&amp;15) bits and
     749store 16-bit unsigned result into VDST.<br />
     750Operation:<br />
     751<code>VDST = (SRC1 &lt;&lt; (SRC0&amp;15)) &amp; 0xffff</code></p>
    720752<h4>V_LSHLREV_B32</h4>
    721753<p>Opcode VOP2: 26 (0x1a) for GCN 1.0/1.1; 18 (0x12) for GCN 1.2<br />
     
    732764Operation:<br />
    733765<code>VDST = SRC0 &gt;&gt; (SRC1&amp;31)</code></p>
     766<h4>V_LSHRREV_B16</h4>
     767<p>Opcode VOP2: 43 (0x2b) for GCN 1.2<br />
     768Opcode VOP3A: 299 (0x12b) for GCN 1.2<br />
     769Syntax: V_LSHRREV_B16 VDST, SRC0, SRC1<br />
     770Description: Shift right unsigned 16-bit value from SRC1 by (SRC0&amp;15) bits and
     771store 16-bit unsigned result into VDST.<br />
     772Operation:<br />
     773<code>VDST = (SRC1 &gt;&gt; (SRC0&amp;15)) &amp; 0xffff</code></p>
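The two 16-bit logical shifts, V_LSHLREV_B16 and V_LSHRREV_B16, can be sketched in Python as below. The function names are hypothetical. For the right shift, the sketch assumes that only the low 16 bits of SRC1 take part, as the description ("16-bit value from SRC1") suggests; the pseudocode itself does not mask SRC1 before shifting.

```python
def v_lshlrev_b16(src0: int, src1: int) -> int:
    """Emulate VDST = (SRC1 << (SRC0 & 15)) & 0xffff."""
    return (src1 << (src0 & 15)) & 0xFFFF


def v_lshrrev_b16(src0: int, src1: int) -> int:
    """Emulate VDST = (SRC1 >> (SRC0 & 15)) & 0xffff."""
    # Assumption: SRC1 is truncated to 16 bits first, so upper bits of a
    # 32-bit register never shift into the result.
    return ((src1 & 0xFFFF) >> (src0 & 15)) & 0xFFFF


print(hex(v_lshlrev_b16(4, 0x1234)))  # 0x2340
print(hex(v_lshrrev_b16(4, 0x1234)))  # 0x123
```

In both cases the shift count comes from SRC0 and is taken modulo 16, so a count of 16 behaves like a count of 0.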
    734774<h4>V_LSHRREV_B32</h4>
    735775<p>Opcode VOP2: 22 (0x16) for GCN 1.0/1.1; 16 (0x10) for GCN 1.2<br />
     
    743783Opcode VOP3A: 291 (0x123) for GCN 1.2<br />
    744784Syntax: V_MAC_F16 VDST, SRC0, SRC1<br />
    745 Description: Multiply FP16 value from SRC0 by FP16 value from SRC1 and add result to VDST.<br />
     785Description: Multiply FP16 value from SRC0 by FP16 value from SRC1 and
     786add result to VDST. It applies OMOD modifier to result.<br />
    746787Operation:<br />
    747788<code>VDST = ASHALF(SRC0) * ASHALF(SRC1) + ASHALF(VDST)</code></p>
     
    798839Operation:
    799840<code>VDST = ASFLOAT(SRC0) * ASFLOAT(SRC1) + ASFLOAT(FLOATLIT)</code></p>
     841<h4>V_MAX_F16</h4>
     842<p>Opcode VOP2: 45 (0x2d) for GCN 1.2<br />
     843Opcode VOP3A: 301 (0x12d) for GCN 1.2<br />
     844Syntax: V_MAX_F16 VDST, SRC0, SRC1<br />
     845Description: Choose the larger of the half floating point values from SRC0 and SRC1,
     846and store result to VDST.<br />
     847Operation:<br />
     848<code>VDST = MAX(ASHALF(SRC0), ASHALF(SRC1))</code></p>
    800849<h4>V_MAX_F32</h4>
    801850<p>Opcode VOP2: 16 (0x10) for GCN 1.0/1.1; 11 (0xb) for GCN 1.2<br />
     
    852901<code>UINT32 MASK = ((1ULL &lt;&lt; LANEID) - 1ULL) &amp; SRC0
    853902VDST = SRC1 + BITCOUNT(MASK)</code></p>
     903<h4>V_MIN_F16</h4>
     904<p>Opcode VOP2: 46 (0x2e) for GCN 1.2<br />
     905Opcode VOP3A: 302 (0x12e) for GCN 1.2<br />
     906Syntax: V_MIN_F16 VDST, SRC0, SRC1<br />
     907Description: Choose the smaller of the half floating point values from SRC0 and SRC1,
     908and store result to VDST.<br />
     909Operation:<br />
     910<code>VDST = MIN(ASHALF(SRC0), ASHALF(SRC1))</code></p>
    854911<h4>V_MIN_F32</h4>
    855912<p>Opcode VOP2: 15 (0xf) for GCN 1.0/1.1; 10 (0xa) for GCN 1.2<br />
     
    860917Operation:<br />
    861918<code>VDST = MIN(ASFLOAT(SRC0), ASFLOAT(SRC1))</code></p>
     919<h4>V_MIN_I16</h4>
     920<p>Opcode VOP2: 50 (0x32) for GCN 1.2<br />
     921Opcode VOP3A: 306 (0x132) for GCN 1.2<br />
     922Syntax: V_MIN_I16 VDST, SRC0, SRC1<br />
     923Description: Choose smallest signed 16-bit value from SRC0 and SRC1,
     924and store result to VDST.<br />
     925Operation:<br />
     926<code>VDST = MIN((INT16)SRC0, (INT16)SRC1)</code></p>
    862927<h4>V_MIN_I32</h4>
    863928<p>Opcode VOP2: 17 (0x11) for GCN 1.0/1.1; 12 (0xc) for GCN 1.2<br />
     
    879944else
    880945    VDST = NaN</code></p>
     946<h4>V_MIN_U16</h4>
     947<p>Opcode VOP2: 49 (0x31) for GCN 1.2<br />
     948Opcode VOP3A: 305 (0x131) for GCN 1.2<br />
     949Syntax: V_MIN_U16 VDST, SRC0, SRC1<br />
     950Description: Choose smallest unsigned 16-bit value from SRC0 and SRC1,
     951and store result to VDST.<br />
     952Operation:<br />
     953<code>VDST = MIN(SRC0&amp;0xffff, SRC1&amp;0xffff)</code></p>
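The difference between the signed and unsigned 16-bit minimum shows up when an operand has bit 15 set. The Python sketch below uses hypothetical function names and assumes the signed result is stored as a 16-bit pattern (upper bits zero), which the pseudocode does not state explicitly.

```python
def as_int16(x: int) -> int:
    """Reinterpret the low 16 bits of x as a signed 16-bit value."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def v_min_i16(src0: int, src1: int) -> int:
    """Emulate VDST = MIN((INT16)SRC0, (INT16)SRC1), returned as 16 bits."""
    # Assumption: the result is masked back to a 16-bit pattern.
    return min(as_int16(src0), as_int16(src1)) & 0xFFFF


def v_min_u16(src0: int, src1: int) -> int:
    """Emulate VDST = MIN(SRC0 & 0xffff, SRC1 & 0xffff)."""
    return min(src0 & 0xFFFF, src1 & 0xFFFF)


print(hex(v_min_i16(0xFFFF, 0x0001)))  # 0xffff (-1 is smaller than 1)
print(hex(v_min_u16(0xFFFF, 0x0001)))  # 0x1 (65535 is larger than 1)
```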
    881954<h4>V_MIN_U32</h4>
    882955<p>Opcode VOP2: 19 (0x13) for GCN 1.0/1.1; 14 (0xe) for GCN 1.2<br />
     
    9421015INT32 V1 = (INT32)((SRC1&amp;0x7fffff) | (SRC1&amp;0x800000 ? 0xff800000 : 0))
    9431016VDST = V0 * V1</code></p>
     1017<h4>V_MUL_LO_U16</h4>
     1018<p>Opcode VOP2: 41 (0x29) for GCN 1.2<br />
     1019Opcode VOP3A: 297 (0x129) for GCN 1.2<br />
     1020Syntax: V_MUL_LO_U16 VDST, SRC0, SRC1<br />
     1021Description: Multiply 16-bit unsigned value from SRC0 by 16-bit unsigned value from SRC1
     1022and store the low 16 bits of the product to VDST.<br />
     1023Operation:<br />
     1024<code>VDST = ((SRC0&amp;0xffff) * (SRC1&amp;0xffff)) &amp; 0xffff</code></p>
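A small Python sketch of the V_MUL_LO_U16 operation above; the function name is hypothetical. It shows that only the low half of the 32-bit product survives.

```python
def v_mul_lo_u16(src0: int, src1: int) -> int:
    """Emulate VDST = ((SRC0 & 0xffff) * (SRC1 & 0xffff)) & 0xffff."""
    # The full product of two 16-bit values needs up to 32 bits;
    # the final mask keeps only bits 0..15.
    return ((src0 & 0xFFFF) * (src1 & 0xFFFF)) & 0xFFFF


print(hex(v_mul_lo_u16(0x0003, 0x0007)))  # 0x15
print(hex(v_mul_lo_u16(0xFFFF, 0xFFFF)))  # 0x1 (full product is 0xfffe0001)
```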
    9441025<h4>V_MUL_U32_U24</h4>
    9451026<p>Opcode VOP2: 11 (0xb) for GCN 1.0/1.1; 8 (0x8) for GCN 1.2<br />
     
    9801061Operation:<br />
    9811062<code>VDST = ASFLOAT(SRC0) - ASFLOAT(SRC1)</code></p>
     1063<h4>V_SUB_U16</h4>
     1064<p>Opcode VOP2: 39 (0x27) for GCN 1.2<br />
     1065Opcode VOP3A: 295 (0x127) for GCN 1.2<br />
     1066Syntax: V_SUB_U16 VDST, SRC0, SRC1<br />
     1067Description: Subtract unsigned 16-bit value of SRC1 from SRC0 and store
     106816-bit unsigned result to VDST.<br />
     1069Operation:<br />
     1070<code>VDST = (SRC0 - SRC1) &amp; 0xffff</code></p>
    9821071<h4>V_SUB_I32, V_SUB_U32</h4>
    9831072<p>Opcode VOP2: 38 (0x26) for GCN 1.0/1.1; 26 (0x1a) for GCN 1.2<br />
     
    10581147UINT64 mask = (1ULL&lt;&lt;LANEID)
    10591148SDST = (SDST&amp;~mask) | ((temp&gt;&gt;32) ? mask : 0)</code></p>
     1149<h4>V_SUBREV_U16</h4>
     1150<p>Opcode VOP2: 40 (0x28) for GCN 1.2<br />
     1151Opcode VOP3A: 296 (0x128) for GCN 1.2<br />
     1152Syntax: V_SUBREV_U16 VDST, SRC0, SRC1<br />
     1153Description: Subtract unsigned 16-bit value of SRC0 from SRC1 and store
     115416-bit unsigned result to VDST.<br />
     1155Operation:<br />
     1156<code>VDST = (SRC1 - SRC0) &amp; 0xffff</code></p>
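V_SUB_U16 and V_SUBREV_U16 differ only in operand order. A minimal Python sketch with hypothetical function names:

```python
def v_sub_u16(src0: int, src1: int) -> int:
    """Emulate VDST = (SRC0 - SRC1) & 0xffff."""
    # Masking a negative Python int yields the two's complement wrap-around.
    return (src0 - src1) & 0xFFFF


def v_subrev_u16(src0: int, src1: int) -> int:
    """Emulate VDST = (SRC1 - SRC0) & 0xffff (operands reversed)."""
    return (src1 - src0) & 0xFFFF


print(hex(v_sub_u16(1, 2)))     # 0xffff (borrow wraps modulo 2**16)
print(hex(v_subrev_u16(1, 2)))  # 0x1
```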
    10601157<h4>V_XOR_B32</h4>
    10611158<p>Opcode VOP2: 29 (0x1d) for GCN 1.0/1.1; 21 (0x15) for GCN 1.2<br />