Changes between Initial Version and Version 1 of GcnInstrsVop1


Timestamp:
Nov 23, 2015, 6:00:17 PM
Author:
trac
Comment:

--

  • GcnInstrsVop1

    v1 v1  
     1[wiki:ClrxToc Back to Table of contents]
     2{{{
     3#!html
     4<h2>GCN ISA VOP1/VOP3 instructions</h2>
     5<p>VOP1 instructions can be encoded in either the VOP1 encoding or the VOP3A/VOP3B encoding.
     6List of fields for the VOP1 encoding:</p>
     7<table>
     8<thead>
     9<tr>
     10<th>Bits</th>
     11<th>Name</th>
     12<th>Description</th>
     13</tr>
     14</thead>
     15<tbody>
     16<tr>
     17<td>0-8</td>
     18<td>SRC0</td>
     19<td>First (scalar or vector) source operand</td>
     20</tr>
     21<tr>
     22<td>9-16</td>
     23<td>OPCODE</td>
     24<td>Operation code</td>
     25</tr>
     26<tr>
     27<td>17-24</td>
     28<td>VDST</td>
     29<td>Destination vector operand</td>
     30</tr>
     31<tr>
     32<td>25-31</td>
     33<td>ENCODING</td>
     34<td>Encoding type. Must be 0b0111111</td>
     35</tr>
     36</tbody>
     37</table>
     38<p>Syntax: INSTRUCTION VDST, SRC0</p>
     39<p>List of fields for VOP3A/VOP3B encoding (GCN 1.0/1.1):</p>
     40<table>
     41<thead>
     42<tr>
     43<th>Bits</th>
     44<th>Name</th>
     45<th>Description</th>
     46</tr>
     47</thead>
     48<tbody>
     49<tr>
     50<td>0-7</td>
     51<td>VDST</td>
     52<td>Vector destination operand</td>
     53</tr>
     54<tr>
     55<td>8-10</td>
     56<td>ABS</td>
     57<td>Absolute modifiers for source operands (VOP3A)</td>
     58</tr>
     59<tr>
     60<td>8-14</td>
     61<td>SDST</td>
     62<td>Scalar destination operand (VOP3B)</td>
     63</tr>
     64<tr>
     65<td>11</td>
     66<td>CLAMP</td>
     67<td>CLAMP modifier (VOP3A)</td>
     68</tr>
     69<tr>
     70<td>15</td>
     71<td>CLAMP</td>
     72<td>CLAMP modifier (VOP3B)</td>
     73</tr>
     74<tr>
     75<td>17-25</td>
     76<td>OPCODE</td>
     77<td>Operation code</td>
     78</tr>
     79<tr>
     80<td>26-31</td>
     81<td>ENCODING</td>
     82<td>Encoding type. Must be 0b110100</td>
     83</tr>
     84<tr>
     85<td>32-40</td>
     86<td>SRC0</td>
     87<td>First (scalar or vector) source operand</td>
     88</tr>
     89<tr>
     90<td>41-49</td>
     91<td>SRC1</td>
     92<td>Second (scalar or vector) source operand</td>
     93</tr>
     94<tr>
     95<td>50-58</td>
     96<td>SRC2</td>
     97<td>Third (scalar or vector) source operand</td>
     98</tr>
     99<tr>
     100<td>59-60</td>
     101<td>OMOD</td>
     102<td>OMOD modifier. Multiplication modifier</td>
     103</tr>
     104<tr>
     105<td>61-63</td>
     106<td>NEG</td>
     107<td>Negation modifier for source operands</td>
     108</tr>
     109</tbody>
     110</table>
     111<p>List of fields for VOP3A/VOP3B encoding (GCN 1.2):</p>
     112<table>
     113<thead>
     114<tr>
     115<th>Bits</th>
     116<th>Name</th>
     117<th>Description</th>
     118</tr>
     119</thead>
     120<tbody>
     121<tr>
     122<td>0-7</td>
     123<td>VDST</td>
     124<td>Destination vector operand</td>
     125</tr>
     126<tr>
     127<td>8-10</td>
     128<td>ABS</td>
     129<td>Absolute modifiers for source operands (VOP3A)</td>
     130</tr>
     131<tr>
     132<td>8-14</td>
     133<td>SDST</td>
     134<td>Scalar destination operand (VOP3B)</td>
     135</tr>
     136<tr>
     137<td>15</td>
     138<td>CLAMP</td>
     139<td>CLAMP modifier</td>
     140</tr>
     141<tr>
     142<td>16-25</td>
     143<td>OPCODE</td>
     144<td>Operation code</td>
     145</tr>
     146<tr>
     147<td>26-31</td>
     148<td>ENCODING</td>
     149<td>Encoding type. Must be 0b110100</td>
     150</tr>
     151<tr>
     152<td>32-40</td>
     153<td>SRC0</td>
     154<td>First (scalar or vector) source operand</td>
     155</tr>
     156<tr>
     157<td>41-49</td>
     158<td>SRC1</td>
     159<td>Second (scalar or vector) source operand</td>
     160</tr>
     161<tr>
     162<td>50-58</td>
     163<td>SRC2</td>
     164<td>Third (scalar or vector) source operand</td>
     165</tr>
     166<tr>
     167<td>59-60</td>
     168<td>OMOD</td>
     169<td>OMOD modifier. Multiplication modifier</td>
     170</tr>
     171<tr>
     172<td>61-63</td>
     173<td>NEG</td>
     174<td>Negation modifier for source operands</td>
     175</tr>
     176</tbody>
     177</table>
     178<p>Syntax: INSTRUCTION VDST, SRC0 [MODIFIERS]</p>
     179<p>Modifiers:</p>
     180<ul>
     181<li>CLAMP - clamps the destination floating-point value to the range 0.0-1.0</li>
     182<li>MUL:2, MUL:4, DIV:2 - OMOD modifiers. Multiply the destination floating-point
     183value by 2.0, 4.0 or 0.5 respectively</li>
     184<li>-SRC - negates the floating-point value from the source operand</li>
     185<li>ABS(SRC) - applies absolute value to the source operand</li>
     186</ul>
     187<p>Negation and absolute value can be combined: <code>-ABS(V0)</code>. The modifiers CLAMP and
     188OMOD (MUL:2, MUL:4 and DIV:2) can be given in any order.</p>
     189<p>Limitations for operands:</p>
     190<ul>
     191<li>only one SGPR can be read by an instruction. Multiple occurrences of the same
     192SGPR are allowed</li>
     193<li>only one literal constant can be used, and only when no SGPR or M0 is used in
     194the source operands</li>
     195<li>only SRC0 can hold LDS_DIRECT</li>
     196</ul>
     197<p>VOP1 opcodes (0-127) map to VOP3 opcodes in the range 384-511 for GCN 1.0/1.1
     198or 320-447 for GCN 1.2.</p>
     199<p>List of the instructions by opcode (GCN 1.0/1.1):</p>
     200<table>
     201<thead>
     202<tr>
     203<th>Opcode</th>
     204<th>Opcode (VOP3)</th>
     205<th>GCN 1.0</th>
     206<th>GCN 1.1</th>
     207<th>Mnemonic</th>
     208</tr>
     209</thead>
     210<tbody>
     211<tr>
     212<td>0 (0x0)</td>
     213<td>384 (0x180)</td>
     214<td>✓</td>
     215<td>✓</td>
     216<td>V_NOP</td>
     217</tr>
     218<tr>
     219<td>1 (0x1)</td>
     220<td>385 (0x181)</td>
     221<td>✓</td>
     222<td>✓</td>
     223<td>V_MOV_B32</td>
     224</tr>
     225<tr>
     226<td>2 (0x2)</td>
     227<td>386 (0x182)</td>
     228<td>✓</td>
     229<td>✓</td>
     230<td>V_READFIRSTLANE_B32</td>
     231</tr>
     232<tr>
     233<td>3 (0x3)</td>
     234<td>387 (0x183)</td>
     235<td>✓</td>
     236<td>✓</td>
     237<td>V_CVT_I32_F64</td>
     238</tr>
     239<tr>
     240<td>4 (0x4)</td>
     241<td>388 (0x184)</td>
     242<td>✓</td>
     243<td>✓</td>
     244<td>V_CVT_F64_I32</td>
     245</tr>
     246<tr>
     247<td>5 (0x5)</td>
     248<td>389 (0x185)</td>
     249<td>✓</td>
     250<td>✓</td>
     251<td>V_CVT_F32_I32</td>
     252</tr>
     253<tr>
     254<td>6 (0x6)</td>
     255<td>390 (0x186)</td>
     256<td>✓</td>
     257<td>✓</td>
     258<td>V_CVT_F32_U32</td>
     259</tr>
     260<tr>
     261<td>7 (0x7)</td>
     262<td>391 (0x187)</td>
     263<td>✓</td>
     264<td>✓</td>
     265<td>V_CVT_U32_F32</td>
     266</tr>
     267<tr>
     268<td>8 (0x8)</td>
     269<td>392 (0x188)</td>
     270<td>✓</td>
     271<td>✓</td>
     272<td>V_CVT_I32_F32</td>
     273</tr>
     274<tr>
     275<td>9 (0x9)</td>
     276<td>393 (0x189)</td>
     277<td>✓</td>
     278<td>✓</td>
     279<td>V_MOV_FED_B32</td>
     280</tr>
     281<tr>
     282<td>10 (0xa)</td>
     283<td>394 (0x18a)</td>
     284<td>✓</td>
     285<td>✓</td>
     286<td>V_CVT_F16_F32</td>
     287</tr>
     288<tr>
     289<td>11 (0xb)</td>
     290<td>395 (0x18b)</td>
     291<td>✓</td>
     292<td>✓</td>
     293<td>V_CVT_F32_F16</td>
     294</tr>
     295<tr>
     296<td>12 (0xc)</td>
     297<td>396 (0x18c)</td>
     298<td>✓</td>
     299<td>✓</td>
     300<td>V_CVT_RPI_I32_F32</td>
     301</tr>
     302<tr>
     303<td>13 (0xd)</td>
     304<td>397 (0x18d)</td>
     305<td>✓</td>
     306<td>✓</td>
     307<td>V_CVT_FLR_I32_F32</td>
     308</tr>
     309<tr>
     310<td>14 (0xe)</td>
     311<td>398 (0x18e)</td>
     312<td>✓</td>
     313<td>✓</td>
     314<td>V_CVT_OFF_F32_I4</td>
     315</tr>
     316<tr>
     317<td>15 (0xf)</td>
     318<td>399 (0x18f)</td>
     319<td>✓</td>
     320<td>✓</td>
     321<td>V_CVT_F32_F64</td>
     322</tr>
     323<tr>
     324<td>16 (0x10)</td>
     325<td>400 (0x190)</td>
     326<td>✓</td>
     327<td>✓</td>
     328<td>V_CVT_F64_F32</td>
     329</tr>
     330<tr>
     331<td>17 (0x11)</td>
     332<td>401 (0x191)</td>
     333<td>✓</td>
     334<td>✓</td>
     335<td>V_CVT_F32_UBYTE0</td>
     336</tr>
     337<tr>
     338<td>18 (0x12)</td>
     339<td>402 (0x192)</td>
     340<td>✓</td>
     341<td>✓</td>
     342<td>V_CVT_F32_UBYTE1</td>
     343</tr>
     344<tr>
     345<td>19 (0x13)</td>
     346<td>403 (0x193)</td>
     347<td>✓</td>
     348<td>✓</td>
     349<td>V_CVT_F32_UBYTE2</td>
     350</tr>
     351<tr>
     352<td>20 (0x14)</td>
     353<td>404 (0x194)</td>
     354<td>✓</td>
     355<td>✓</td>
     356<td>V_CVT_F32_UBYTE3</td>
     357</tr>
     358<tr>
     359<td>21 (0x15)</td>
     360<td>405 (0x195)</td>
     361<td>✓</td>
     362<td>✓</td>
     363<td>V_CVT_U32_F64</td>
     364</tr>
     365<tr>
     366<td>22 (0x16)</td>
     367<td>406 (0x196)</td>
     368<td>✓</td>
     369<td>✓</td>
     370<td>V_CVT_F64_U32</td>
     371</tr>
     372<tr>
     373<td>23 (0x17)</td>
     374<td>407 (0x197)</td>
     375<td>✓</td>
     376<td>✓</td>
     377<td>V_TRUNC_F64</td>
     378</tr>
     379<tr>
     380<td>24 (0x18)</td>
     381<td>408 (0x198)</td>
     382<td>✓</td>
     383<td>✓</td>
     384<td>V_CEIL_F64</td>
     385</tr>
     386<tr>
     387<td>25 (0x19)</td>
     388<td>409 (0x199)</td>
     389<td>✓</td>
     390<td>✓</td>
     391<td>V_RNDNE_F64</td>
     392</tr>
     393<tr>
     394<td>26 (0x1a)</td>
     395<td>410 (0x19a)</td>
     396<td>✓</td>
     397<td>✓</td>
     398<td>V_FLOOR_F64</td>
     399</tr>
     400<tr>
     401<td>32 (0x20)</td>
     402<td>416 (0x1a0)</td>
     403<td>✓</td>
     404<td>✓</td>
     405<td>V_FRACT_F32</td>
     406</tr>
     407<tr>
     408<td>33 (0x21)</td>
     409<td>417 (0x1a1)</td>
     410<td>✓</td>
     411<td>✓</td>
     412<td>V_TRUNC_F32</td>
     413</tr>
     414<tr>
     415<td>34 (0x22)</td>
     416<td>418 (0x1a2)</td>
     417<td>✓</td>
     418<td>✓</td>
     419<td>V_CEIL_F32</td>
     420</tr>
     421<tr>
     422<td>35 (0x23)</td>
     423<td>419 (0x1a3)</td>
     424<td>✓</td>
     425<td>✓</td>
     426<td>V_RNDNE_F32</td>
     427</tr>
     428<tr>
     429<td>36 (0x24)</td>
     430<td>420 (0x1a4)</td>
     431<td>✓</td>
     432<td>✓</td>
     433<td>V_FLOOR_F32</td>
     434</tr>
     435<tr>
     436<td>37 (0x25)</td>
     437<td>421 (0x1a5)</td>
     438<td>✓</td>
     439<td>✓</td>
     440<td>V_EXP_F32</td>
     441</tr>
     442<tr>
     443<td>38 (0x26)</td>
     444<td>422 (0x1a6)</td>
     445<td>✓</td>
     446<td>✓</td>
     447<td>V_LOG_CLAMP_F32</td>
     448</tr>
     449<tr>
     450<td>39 (0x27)</td>
     451<td>423 (0x1a7)</td>
     452<td>✓</td>
     453<td>✓</td>
     454<td>V_LOG_F32</td>
     455</tr>
     456<tr>
     457<td>40 (0x28)</td>
     458<td>424 (0x1a8)</td>
     459<td>✓</td>
     460<td>✓</td>
     461<td>V_RCP_CLAMP_F32</td>
     462</tr>
     463<tr>
     464<td>41 (0x29)</td>
     465<td>425 (0x1a9)</td>
     466<td>✓</td>
     467<td>✓</td>
     468<td>V_RCP_LEGACY_F32</td>
     469</tr>
     470<tr>
     471<td>42 (0x2a)</td>
     472<td>426 (0x1aa)</td>
     473<td>✓</td>
     474<td>✓</td>
     475<td>V_RCP_F32</td>
     476</tr>
     477<tr>
     478<td>43 (0x2b)</td>
     479<td>427 (0x1ab)</td>
     480<td>✓</td>
     481<td>✓</td>
     482<td>V_RCP_IFLAG_F32</td>
     483</tr>
     484<tr>
     485<td>44 (0x2c)</td>
     486<td>428 (0x1ac)</td>
     487<td>✓</td>
     488<td>✓</td>
     489<td>V_RSQ_CLAMP_F32</td>
     490</tr>
     491<tr>
     492<td>45 (0x2d)</td>
     493<td>429 (0x1ad)</td>
     494<td>✓</td>
     495<td>✓</td>
     496<td>V_RSQ_LEGACY_F32</td>
     497</tr>
     498<tr>
     499<td>46 (0x2e)</td>
     500<td>430 (0x1ae)</td>
     501<td>✓</td>
     502<td>✓</td>
     503<td>V_RSQ_F32</td>
     504</tr>
     505<tr>
     506<td>47 (0x2f)</td>
     507<td>431 (0x1af)</td>
     508<td>✓</td>
     509<td>✓</td>
     510<td>V_RCP_F64</td>
     511</tr>
     512<tr>
     513<td>48 (0x30)</td>
     514<td>432 (0x1b0)</td>
     515<td>✓</td>
     516<td>✓</td>
     517<td>V_RCP_CLAMP_F64</td>
     518</tr>
     519<tr>
     520<td>49 (0x31)</td>
     521<td>433 (0x1b1)</td>
     522<td>✓</td>
     523<td>✓</td>
     524<td>V_RSQ_F64</td>
     525</tr>
     526<tr>
     527<td>50 (0x32)</td>
     528<td>434 (0x1b2)</td>
     529<td>✓</td>
     530<td>✓</td>
     531<td>V_RSQ_CLAMP_F64</td>
     532</tr>
     533<tr>
     534<td>51 (0x33)</td>
     535<td>435 (0x1b3)</td>
     536<td>✓</td>
     537<td>✓</td>
     538<td>V_SQRT_F32</td>
     539</tr>
     540<tr>
     541<td>52 (0x34)</td>
     542<td>436 (0x1b4)</td>
     543<td>✓</td>
     544<td>✓</td>
     545<td>V_SQRT_F64</td>
     546</tr>
     547<tr>
     548<td>53 (0x35)</td>
     549<td>437 (0x1b5)</td>
     550<td>✓</td>
     551<td>✓</td>
     552<td>V_SIN_F32</td>
     553</tr>
     554<tr>
     555<td>54 (0x36)</td>
     556<td>438 (0x1b6)</td>
     557<td>✓</td>
     558<td>✓</td>
     559<td>V_COS_F32</td>
     560</tr>
     561<tr>
     562<td>55 (0x37)</td>
     563<td>439 (0x1b7)</td>
     564<td>✓</td>
     565<td>✓</td>
     566<td>V_NOT_B32</td>
     567</tr>
     568<tr>
     569<td>56 (0x38)</td>
     570<td>440 (0x1b8)</td>
     571<td>✓</td>
     572<td>✓</td>
     573<td>V_BFREV_B32</td>
     574</tr>
     575<tr>
     576<td>57 (0x39)</td>
     577<td>441 (0x1b9)</td>
     578<td>✓</td>
     579<td>✓</td>
     580<td>V_FFBH_U32</td>
     581</tr>
     582<tr>
     583<td>58 (0x3a)</td>
     584<td>442 (0x1ba)</td>
     585<td>✓</td>
     586<td>✓</td>
     587<td>V_FFBL_B32</td>
     588</tr>
     589<tr>
     590<td>59 (0x3b)</td>
     591<td>443 (0x1bb)</td>
     592<td>✓</td>
     593<td>✓</td>
     594<td>V_FFBH_I32</td>
     595</tr>
     596<tr>
     597<td>60 (0x3c)</td>
     598<td>444 (0x1bc)</td>
     599<td>✓</td>
     600<td>✓</td>
     601<td>V_FREXP_EXP_I32_F64</td>
     602</tr>
     603<tr>
     604<td>61 (0x3d)</td>
     605<td>445 (0x1bd)</td>
     606<td>✓</td>
     607<td>✓</td>
     608<td>V_FREXP_MANT_F64</td>
     609</tr>
     610<tr>
     611<td>62 (0x3e)</td>
     612<td>446 (0x1be)</td>
     613<td>✓</td>
     614<td>✓</td>
     615<td>V_FRACT_F64</td>
     616</tr>
     617<tr>
     618<td>63 (0x3f)</td>
     619<td>447 (0x1bf)</td>
     620<td>✓</td>
     621<td>✓</td>
     622<td>V_FREXP_EXP_I32_F32</td>
     623</tr>
     624<tr>
     625<td>64 (0x40)</td>
     626<td>448 (0x1c0)</td>
     627<td>✓</td>
     628<td>✓</td>
     629<td>V_FREXP_MANT_F32</td>
     630</tr>
     631<tr>
     632<td>65 (0x41)</td>
     633<td>449 (0x1c1)</td>
     634<td>✓</td>
     635<td>✓</td>
     636<td>V_CLREXCP</td>
     637</tr>
     638<tr>
     639<td>66 (0x42)</td>
     640<td>450 (0x1c2)</td>
     641<td>✓</td>
     642<td>✓</td>
     643<td>V_MOVRELD_B32</td>
     644</tr>
     645<tr>
     646<td>67 (0x43)</td>
     647<td>451 (0x1c3)</td>
     648<td>✓</td>
     649<td>✓</td>
     650<td>V_MOVRELS_B32</td>
     651</tr>
     652<tr>
     653<td>68 (0x44)</td>
     654<td>452 (0x1c4)</td>
     655<td>✓</td>
     656<td>✓</td>
     657<td>V_MOVRELSD_B32</td>
     658</tr>
     659<tr>
     660<td>69 (0x45)</td>
     661<td>453 (0x1c5)</td>
     662<td></td>
     663<td>✓</td>
     664<td>V_LOG_LEGACY_F32</td>
     665</tr>
     666<tr>
     667<td>70 (0x46)</td>
     668<td>454 (0x1c6)</td>
     669<td></td>
     670<td>✓</td>
     671<td>V_EXP_LEGACY_F32</td>
     672</tr>
     673</tbody>
     674</table>
     675}}}
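
The field tables and the opcode mapping above can be illustrated with a small packing sketch. This is not code from the CLRX project: the function names are hypothetical, the operand values are passed as raw field numbers, and only the VOP3A form with a single source operand is covered (SRC1 and SRC2 are left at zero). The bit positions, the encoding constants and the 384/320 opcode offsets are taken from the tables above.

```python
# Minimal sketch of the VOP1 and VOP3A layouts described on this page.
# All names here are illustrative assumptions, not part of any real assembler API.

VOP1_ENCODING = 0b0111111  # VOP1 bits 25-31
VOP3_ENCODING = 0b110100   # VOP3A/VOP3B bits 26-31


def _check(name, value, bits):
    """Reject values that do not fit in an unsigned field of the given width."""
    if not 0 <= value < (1 << bits):
        raise ValueError(f"{name}={value} does not fit in {bits} bits")
    return value


def encode_vop1(opcode, vdst, src0):
    """Pack a 32-bit VOP1 word: SRC0 0-8, OPCODE 9-16, VDST 17-24, ENCODING 25-31."""
    return (_check("src0", src0, 9)
            | (_check("opcode", opcode, 8) << 9)
            | (_check("vdst", vdst, 8) << 17)
            | (VOP1_ENCODING << 25))


def decode_vop1(word):
    """Split a 32-bit VOP1 word back into its fields."""
    if (word >> 25) != VOP1_ENCODING:
        raise ValueError("not a VOP1-encoded word")
    return {
        "src0": word & 0x1FF,
        "opcode": (word >> 9) & 0xFF,
        "vdst": (word >> 17) & 0xFF,
    }


def vop1_to_vop3_opcode(opcode, gcn12=False):
    """Map a VOP1 opcode (0-127) to 384-511 (GCN 1.0/1.1) or 320-447 (GCN 1.2)."""
    return _check("opcode", opcode, 7) + (320 if gcn12 else 384)


def encode_vop3a(vop1_opcode, vdst, src0, abs_mask=0, clamp=False,
                 omod=0, neg_mask=0, gcn12=False):
    """Pack a 64-bit VOP3A instruction for a VOP1 opcode with one source operand."""
    op = vop1_to_vop3_opcode(vop1_opcode, gcn12)
    low = _check("vdst", vdst, 8) | (_check("abs_mask", abs_mask, 3) << 8)
    if gcn12:
        # GCN 1.2: CLAMP at bit 15, OPCODE at bits 16-25
        low |= (int(clamp) << 15) | (op << 16)
    else:
        # GCN 1.0/1.1 (VOP3A): CLAMP at bit 11, OPCODE at bits 17-25
        low |= (int(clamp) << 11) | (op << 17)
    low |= VOP3_ENCODING << 26
    # Second dword: SRC0 32-40, OMOD 59-60, NEG 61-63 (SRC1/SRC2 unused here)
    high = (_check("src0", src0, 9)
            | (_check("omod", omod, 2) << 27)
            | (_check("neg_mask", neg_mask, 3) << 29))
    return low | (high << 32)
```

For example, `hex(encode_vop1(1, 0, 256))` packs opcode 1 (V_MOV_B32) with VDST field 0 and SRC0 field 256 and yields `0x7e000300`. Reading SRC0 value 256 as the register v0 is an assumption that comes from the operand code list, which this page does not reproduce.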