Changes between Version 4 and Version 5 of GcnInstrsVop3p


Timestamp:
11/27/17 20:00:30
Author:
trac
Comment:

--

  • GcnInstrsVop3p

    v4 v5  
    230230Syntax: V_MAD_MIX_F32 VDST, SRC0, SRC1, SRC2<br />
    231231Description: Multiply single FP value from SRC0 by single FP value SRC1 and add
    232 single FP value from SRC2, and store result to VDST. NEG_HI changes meaning
    233 to absolute-value modifier. The OP_SEL_HI controls left-shifting of source operands by
    234 16 bits (???).<br />
    235 <code>UINT32 SS0 = OP_SEL_HI&amp;1 ? SRC0&lt;&lt;16 : SRC0
    236 UINT32 SS1 = OP_SEL_HI&amp;2 ? SRC1&lt;&lt;16 : SRC1
    237 UINT32 SS2 = OP_SEL_HI&amp;4 ? SRC2&lt;&lt;16 : SRC2
    238 FLOAT S0 = NEG_HI&amp;1 ? ABS(ASFLOAT(SS0)) : ASFLOAT(SS0)
    239 FLOAT S1 = NEG_HI&amp;2 ? ABS(ASFLOAT(SS1)) : ASFLOAT(SS1)
    240 FLOAT S2 = NEG_HI&amp;4 ? ABS(ASFLOAT(SS2)) : ASFLOAT(SS2)
    241 VDST = S0 * S1 + S2</code></p>
     232single FP value from SRC2, and store result to VDST.
     233OP_SEL and OP_SEL_HI control the type and position of each source:</p>
     234<table>
     235<thead>
     236<tr>
     237<th>OP_SEL</th>
     238<th>OP_SEL_HI</th>
     239<th>Meaning</th>
     240</tr>
     241</thead>
     242<tbody>
     243<tr>
     244<td>0</td>
     245<td>0</td>
     246<td>FP32</td>
     247</tr>
     248<tr>
     249<td>1</td>
     250<td>0</td>
     251<td>FP32</td>
     252</tr>
     253<tr>
     254<td>0</td>
     255<td>1</td>
     256<td>FP16 in lower part</td>
     257</tr>
     258<tr>
     259<td>1</td>
     260<td>1</td>
     261<td>FP16 in higher part</td>
     262</tr>
     263</tbody>
     264</table>
     265<p>In this instruction NEG_HI acts as an absolute-value modifier: bit N takes the absolute value of source N.<br />
     266```
     267FLOAT getSource(UINT32 S, BYTE OP_SEL, BYTE OP_SEL_HI, SRCINDEX)
     268{
     269    BYTE mask = 1&lt;&lt;SRCINDEX
     270    if ((OP_SEL_HI&amp;mask) == 0)
     271        return ASFLOAT(S)
     272    if ((OP_SEL&amp;mask) == 0)
     273        return (FLOAT)ASHALF(S&amp;0xffff)
     274    else
     275        return (FLOAT)ASHALF(S&gt;&gt;16)
     276}</p>
     277<p>FLOAT SS0 = getSource(SRC0, OP_SEL, OP_SEL_HI, 0)
     278FLOAT SS1 = getSource(SRC1, OP_SEL, OP_SEL_HI, 1)
     279FLOAT SS2 = getSource(SRC2, OP_SEL, OP_SEL_HI, 2)
     280FLOAT S0 = NEG_HI&amp;1 ? ABS(SS0) : SS0
     281FLOAT S1 = NEG_HI&amp;2 ? ABS(SS1) : SS1
     282FLOAT S2 = NEG_HI&amp;4 ? ABS(SS2) : SS2
     283VDST = S0 * S1 + S2
     284```</p>
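<p>The source selection above can be sketched in C as follows. This is only an illustrative model, not hardware-exact behaviour: the helper names (<code>as_float</code>, <code>abs_float</code>, <code>half_to_float</code>, <code>get_source</code>, <code>v_mad_mix_f32</code>) are hypothetical, rounding and denormal modes are not modelled, and the multiply-add is written as a plain expression. Once the OP_SEL_HI bit is known to be set, only the OP_SEL bit is tested to pick the lower or higher half.</p>

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as an IEEE-754 single (ASFLOAT in the pseudocode). */
static float as_float(uint32_t bits) {
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Absolute value by clearing the sign bit (ABS in the pseudocode). */
static float abs_float(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return as_float(bits & 0x7fffffffu);
}

/* Convert an IEEE-754 binary16 bit pattern to float ((FLOAT)ASHALF(...)). */
static float half_to_float(uint16_t h) {
    uint32_t sign = ((uint32_t)h & 0x8000u) << 16;
    uint32_t exp  = ((uint32_t)h >> 10) & 0x1fu;
    uint32_t man  = (uint32_t)h & 0x3ffu;
    if (exp == 0) {                      /* zero or subnormal: man * 2^-24 */
        float v = (float)man * (1.0f / 16777216.0f);
        return sign ? -v : v;
    }
    if (exp == 0x1fu)                    /* infinity or NaN */
        return as_float(sign | 0x7f800000u | (man << 13));
    /* normal number: rebias exponent from 15 to 127 */
    return as_float(sign | ((exp + 112u) << 23) | (man << 13));
}

/* Select one source according to its OP_SEL / OP_SEL_HI bits. */
static float get_source(uint32_t s, unsigned op_sel, unsigned op_sel_hi,
                        unsigned src_index) {
    assert(src_index < 3);
    unsigned mask = 1u << src_index;
    if ((op_sel_hi & mask) == 0)
        return as_float(s);                              /* FP32 */
    if ((op_sel & mask) == 0)
        return half_to_float((uint16_t)(s & 0xffffu));   /* FP16, lower part */
    return half_to_float((uint16_t)(s >> 16));           /* FP16, higher part */
}

/* Model of V_MAD_MIX_F32: NEG_HI bit N takes the absolute value of source N. */
float v_mad_mix_f32(uint32_t src0, uint32_t src1, uint32_t src2,
                    unsigned op_sel, unsigned op_sel_hi, unsigned neg_hi) {
    float s0 = get_source(src0, op_sel, op_sel_hi, 0);
    float s1 = get_source(src1, op_sel, op_sel_hi, 1);
    float s2 = get_source(src2, op_sel, op_sel_hi, 2);
    if (neg_hi & 1u) s0 = abs_float(s0);
    if (neg_hi & 2u) s1 = abs_float(s1);
    if (neg_hi & 4u) s2 = abs_float(s2);
    return s0 * s1 + s2;
}
```

<p>For example, with SRC0 = 0x40003c00 (halves 2.0 and 1.0), SRC1 = 3.0 and SRC2 = -0.5 as FP32, setting bit 0 in both OP_SEL and OP_SEL_HI reads 2.0 from the higher half of SRC0.</p>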
    242285<h4>V_MAD_MIXLO_F16</h4>
    243286<p>Opcode: 33 (0x21)<br />
    244287Syntax: V_MAD_MIXLO_F16 VDST, SRC0, SRC1, SRC2<br />
    245 Description: Multiply half FP value from SRC0 by half FP value SRC1 and add
    246 half FP value from SRC2, and store result to lower 16-bit of VDST. NEG_HI changes meaning
    247 to absolute-value modifier.<br />
    248 <code>HALF S0 = NEG_HI&amp;1 ? ABS(ASHALF(SRC0)) : ASHALF(SRC0)
    249 HALF S1 = NEG_HI&amp;2 ? ABS(ASHALF(SRC1)) : ASHALF(SRC1)
    250 HALF S2 = NEG_HI&amp;4 ? ABS(ASHALF(SRC2)) : ASHALF(SRC2)
    251 VDST = (ASUINT16(S0 * S1 + S2)&amp;0xfff) | (VDST&amp;0xffff0000)</code></p>
     288Description: Multiply FP value from SRC0 by FP value from SRC1 and add
     289FP value from SRC2, and store the result as a half FP value in the lower 16 bits of VDST.
     290OP_SEL and OP_SEL_HI control the type and position of each source:</p>
     291<table>
     292<thead>
     293<tr>
     294<th>OP_SEL</th>
     295<th>OP_SEL_HI</th>
     296<th>Meaning</th>
     297</tr>
     298</thead>
     299<tbody>
     300<tr>
     301<td>0</td>
     302<td>0</td>
     303<td>FP32</td>
     304</tr>
     305<tr>
     306<td>1</td>
     307<td>0</td>
     308<td>FP32</td>
     309</tr>
     310<tr>
     311<td>0</td>
     312<td>1</td>
     313<td>FP16 in lower part</td>
     314</tr>
     315<tr>
     316<td>1</td>
     317<td>1</td>
     318<td>FP16 in higher part</td>
     319</tr>
     320</tbody>
     321</table>
     322<p>In this instruction NEG_HI acts as an absolute-value modifier: bit N takes the absolute value of source N.<br />
     323```
     324FLOAT getSource(UINT32 S, BYTE OP_SEL, BYTE OP_SEL_HI, SRCINDEX)
     325{
     326    BYTE mask = 1&lt;&lt;SRCINDEX
     327    if ((OP_SEL_HI&amp;mask) == 0)
     328        return ASFLOAT(S)
     329    if ((OP_SEL&amp;mask) == 0)
     330        return (FLOAT)ASHALF(S&amp;0xffff)
     331    else
     332        return (FLOAT)ASHALF(S&gt;&gt;16)
     333}</p>
     334<p>FLOAT SS0 = getSource(SRC0, OP_SEL, OP_SEL_HI, 0)
     335FLOAT SS1 = getSource(SRC1, OP_SEL, OP_SEL_HI, 1)
     336FLOAT SS2 = getSource(SRC2, OP_SEL, OP_SEL_HI, 2)
     337FLOAT S0 = NEG_HI&amp;1 ? ABS(SS0) : SS0
     338FLOAT S1 = NEG_HI&amp;2 ? ABS(SS1) : SS1
     339FLOAT S2 = NEG_HI&amp;4 ? ABS(SS2) : SS2
     340VDST = (ASUINT32((HALF)(S0 * S1 + S2))&amp;0xffff) | (VDST&amp;0xffff0000)
     341```</p>
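<p>The write-back step can be illustrated with a small C sketch. It assumes the intended mask is 0xffff, so that all 16 bits of the half result (including the sign and the top exponent bits) reach VDST. The helper name <code>write_lo16</code> is hypothetical, and the FP32-to-FP16 conversion is assumed to have happened before the call.</p>

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: store binary16 bits in the lower half of VDST
 * and keep the upper half untouched. */
uint32_t write_lo16(uint32_t vdst, uint32_t half_bits) {
    assert(half_bits <= 0xffffu);   /* binary16 bits, zero-extended */
    return (half_bits & 0xffffu) | (vdst & 0xffff0000u);
}
```

<p>With VDST = 0x12345678 and a result of -5.0 (binary16 pattern 0xc500), the new VDST is 0x1234c500. A 12-bit mask would instead leave 0x0500 in the lower half, losing the sign.</p>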
    252342<h4>V_MAD_MIXHI_F16</h4>
    253343<p>Opcode: 34 (0x22)<br />
    254344Syntax: V_MAD_MIXHI_F16 VDST, SRC0, SRC1, SRC2<br />
    255345Description: Multiply FP value from SRC0 by FP value from SRC1 and add
    256346FP value from SRC2, and store the result as a half FP value in the higher 16 bits of VDST.
    257 NEG_HI changes meaning to absolute-value modifier.<br />
    258 <code>HALF S0 = NEG_HI&amp;1 ? ABS(ASHALF(SRC0)) : ASHALF(SRC0)
    259 HALF S1 = NEG_HI&amp;2 ? ABS(ASHALF(SRC1)) : ASHALF(SRC1)
    260 HALF S2 = NEG_HI&amp;4 ? ABS(ASHALF(SRC2)) : ASHALF(SRC2)
    261 VDST = (ASUINT16(S0 * S1 + S2)&lt;&lt;16)) | (VDST&amp;0xffff)</code></p>
     347OP_SEL and OP_SEL_HI control the type and position of each source:</p>
     348<table>
     349<thead>
     350<tr>
     351<th>OP_SEL</th>
     352<th>OP_SEL_HI</th>
     353<th>Meaning</th>
     354</tr>
     355</thead>
     356<tbody>
     357<tr>
     358<td>0</td>
     359<td>0</td>
     360<td>FP32</td>
     361</tr>
     362<tr>
     363<td>1</td>
     364<td>0</td>
     365<td>FP32</td>
     366</tr>
     367<tr>
     368<td>0</td>
     369<td>1</td>
     370<td>FP16 in lower part</td>
     371</tr>
     372<tr>
     373<td>1</td>
     374<td>1</td>
     375<td>FP16 in higher part</td>
     376</tr>
     377</tbody>
     378</table>
     379<p>In this instruction NEG_HI acts as an absolute-value modifier: bit N takes the absolute value of source N.<br />
     380```
     381FLOAT getSource(UINT32 S, BYTE OP_SEL, BYTE OP_SEL_HI, SRCINDEX)
     382{
     383    BYTE mask = 1&lt;&lt;SRCINDEX
     384    if ((OP_SEL_HI&amp;mask) == 0)
     385        return ASFLOAT(S)
     386    if ((OP_SEL&amp;mask) == 0)
     387        return (FLOAT)ASHALF(S&amp;0xffff)
     388    else
     389        return (FLOAT)ASHALF(S&gt;&gt;16)
     390}</p>
     391<p>FLOAT SS0 = getSource(SRC0, OP_SEL, OP_SEL_HI, 0)
     392FLOAT SS1 = getSource(SRC1, OP_SEL, OP_SEL_HI, 1)
     393FLOAT SS2 = getSource(SRC2, OP_SEL, OP_SEL_HI, 2)
     394FLOAT S0 = NEG_HI&amp;1 ? ABS(SS0) : SS0
     395FLOAT S1 = NEG_HI&amp;2 ? ABS(SS1) : SS1
     396FLOAT S2 = NEG_HI&amp;4 ? ABS(SS2) : SS2
     397VDST = (ASUINT32((HALF)(S0 * S1 + S2))&lt;&lt;16) | (VDST&amp;0xffff)
     398```</p>
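<p>Because the result only replaces the higher 16 bits, a lower-half write followed by a higher-half write to the same register can pack two half results into one VDST. The C sketch below illustrates that merge. The helper names (<code>write_hi16</code>, <code>write_lo16</code>, <code>pack_two_halves</code>) are hypothetical and the inputs are assumed to be already-converted binary16 bit patterns.</p>

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: store binary16 bits in the higher half of VDST
 * and keep the lower half untouched. */
uint32_t write_hi16(uint32_t vdst, uint32_t half_bits) {
    assert(half_bits <= 0xffffu);   /* binary16 bits, zero-extended */
    return (half_bits << 16) | (vdst & 0xffffu);
}

/* Hypothetical counterpart for the lower half. */
uint32_t write_lo16(uint32_t vdst, uint32_t half_bits) {
    assert(half_bits <= 0xffffu);
    return (half_bits & 0xffffu) | (vdst & 0xffff0000u);
}

/* Pack two half results into one register: lower half first, then higher. */
uint32_t pack_two_halves(uint32_t lo_bits, uint32_t hi_bits) {
    uint32_t vdst = 0;
    vdst = write_lo16(vdst, lo_bits);
    return write_hi16(vdst, hi_bits);
}
```

<p>Packing 1.0 (0x3c00) into the lower half and 2.0 (0x4000) into the higher half yields 0x40003c00.</p>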
    262399<h4>V_PK_ADD_F16</h4>
    263400<p>Opcode: 15 (0xf)<br />