Changes between Version 5 and Version 6 of GcnInstrsSmem


Timestamp:
11/23/17 23:00:34 (5 years ago)
Author:
trac
  • GcnInstrsSmem

    v5 v6  
    2424</tr>
    2525<tr>
     26<td>14</td>
     27<td>SOE</td>
     28<td>Scalar offset enable (GCN 1.4)</td>
     29</tr>
     30<tr>
     31<td>15</td>
     32<td>NV</td>
     33<td>Non-volatile (GCN 1.4)</td>
     34</tr>
     35<tr>
    2636<td>16</td>
    2737<td>GLC</td>
     
    4757<td>OFFSET</td>
    4858<td>Unsigned 20-bit byte offset or SGPR number that holds byte offset</td>
     59</tr>
     60<tr>
     61<td>32-52</td>
     62<td>OFFSET</td>
     63<td>Unsigned 21-bit byte offset or SGPR number (byte offset) (GCN 1.4)</td>
     64</tr>
     65<tr>
     66<td>57-63</td>
     67<td>SOFFSET</td>
     68<td>SGPR offset (only if SOE=1)</td>
    4969</tr>
    5070</tbody>
     
    587816-bit size. For S_BUFFER_LOAD_DWORD* instructions, 4 SBASE SGPRs hold a
    5979buffer descriptor. In this case, SBASE must be a multiple of 2.
    60 S_STORE_* and S_BUFFER_STORE_* accept only M0 as the offset register.</p>
     80On GCN 1.2, S_STORE_* and S_BUFFER_STORE_* accept only M0 as the offset register.
     81On GCN 1.4, S_STORE_* and S_BUFFER_STORE_* also accept an SGPR as the offset register.</p>
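<p>As a rough illustration of the GCN 1.4 fields listed in the encoding table above (SOE, NV, GLC, OFFSET, SOFFSET), the following C sketch extracts them from an instruction word. It assumes the two instruction dwords have been combined into one 64-bit value with the first dword in the low half, so that bit numbers match the table. The type and function names are hypothetical and not part of any assembler API.</p>

```c
#include <stdint.h>

/* Hypothetical container for the GCN 1.4 SMEM fields named in the table. */
typedef struct {
    unsigned soe;      /* bit 14: scalar offset enable */
    unsigned nv;       /* bit 15: non-volatile */
    unsigned glc;      /* bit 16: GLC */
    uint32_t offset;   /* bits 32-52: 21-bit byte offset or SGPR number */
    unsigned soffset;  /* bits 57-63: SGPR offset, meaningful only if SOE=1 */
} SmemFields14;

/* Assumption: insn holds the first dword in bits 0-31 and the second
 * dword in bits 32-63. */
static SmemFields14 smem14_decode(uint64_t insn)
{
    SmemFields14 f;
    f.soe     = (unsigned)((insn >> 14) & 0x1);
    f.nv      = (unsigned)((insn >> 15) & 0x1);
    f.glc     = (unsigned)((insn >> 16) & 0x1);
    f.offset  = (uint32_t)((insn >> 32) & 0x1FFFFF); /* 21 bits */
    f.soffset = (unsigned)((insn >> 57) & 0x7F);     /* 7 bits */
    return f;
}
```

<p>A caller would normally check <code>soe</code> before using <code>soffset</code>, since the table marks SOFFSET as valid only if SOE=1.</p>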
    6182<p>The SMEM instructions can return result data out of order. Any SMEM operation
    6283(including S_MEMTIME) increments LGKM_CNT counter. The best way to wait for results
     
    7394<tr>
    7495<th>Opcode</th>
    75 <th>Mnemonic (GCN1.2)</th>
     96<th>GCN 1.2</th>
    7697</tr>
    7798</thead>
     
    79100<tr>
    80101<td>0 (0x0)</td>
    81 <td>S_LOAD_DWORD</td>
     102<td></td>
    82103</tr>
    83104<tr>
    84105<td>1 (0x1)</td>
    85 <td>S_LOAD_DWORDX2</td>
     106<td></td>
    86107</tr>
    87108<tr>
    88109<td>2 (0x2)</td>
    89 <td>S_LOAD_DWORDX4</td>
     110<td></td>
    90111</tr>
    91112<tr>
    92113<td>3 (0x3)</td>
    93 <td>S_LOAD_DWORDX8</td>
     114<td></td>
    94115</tr>
    95116<tr>
    96117<td>4 (0x4)</td>
    97 <td>S_LOAD_DWORDX16</td>
     118<td></td>
    98119</tr>
    99120<tr>
    100121<td>8 (0x8)</td>
    101 <td>S_BUFFER_LOAD_DWORD</td>
     122<td></td>
    102123</tr>
    103124<tr>
    104125<td>9 (0x9)</td>
    105 <td>S_BUFFER_LOAD_DWORDX2</td>
     126<td></td>
    106127</tr>
    107128<tr>
    108129<td>10 (0xa)</td>
    109 <td>S_BUFFER_LOAD_DWORDX4</td>
     130<td></td>
    110131</tr>
    111132<tr>
    112133<td>11 (0xb)</td>
    113 <td>S_BUFFER_LOAD_DWORDX8</td>
     134<td></td>
    114135</tr>
    115136<tr>
    116137<td>12 (0xc)</td>
    117 <td>S_BUFFER_LOAD_DWORDX16</td>
     138<td></td>
    118139</tr>
    119140<tr>
    120141<td>16 (0x10)</td>
    121 <td>S_STORE_DWORD</td>
     142<td></td>
    122143</tr>
    123144<tr>
    124145<td>17 (0x11)</td>
    125 <td>S_STORE_DWORDX2</td>
     146<td></td>
    126147</tr>
    127148<tr>
    128149<td>18 (0x12)</td>
    129 <td>S_STORE_DWORDX4</td>
     150<td></td>
    130151</tr>
    131152<tr>
    132153<td>24 (0x18)</td>
    133 <td>S_BUFFER_STORE_DWORD</td>
     154<td></td>
    134155</tr>
    135156<tr>
    136157<td>25 (0x19)</td>
    137 <td>S_BUFFER_STORE_DWORDX2</td>
     158<td></td>
    138159</tr>
    139160<tr>
    140161<td>26 (0x1a)</td>
    141 <td>S_BUFFER_STORE_DWORDX4</td>
     162<td></td>
    142163</tr>
    143164<tr>
    144165<td>32 (0x20)</td>
    145 <td>S_DCACHE_INV</td>
     166<td></td>
    146167</tr>
    147168<tr>
    148169<td>33 (0x21)</td>
    149 <td>S_DCACHE_WB</td>
     170<td></td>
    150171</tr>
    151172<tr>
    152173<td>34 (0x22)</td>
    153 <td>S_DCACHE_INV_VOL</td>
     174<td></td>
    154175</tr>
    155176<tr>
    156177<td>35 (0x23)</td>
    157 <td>S_DCACHE_WB_VOL</td>
     178<td></td>
    158179</tr>
    159180<tr>
    160181<td>36 (0x24)</td>
    161 <td>S_MEMTIME</td>
     182<td></td>
    162183</tr>
    163184<tr>
    164185<td>37 (0x25)</td>
    165 <td>S_MEMREALTIME</td>
     186<td></td>
    166187</tr>
    167188<tr>
    168189<td>38 (0x26)</td>
    169 <td>S_ATC_PROBE</td>
     190<td></td>
    170191</tr>
    171192<tr>
    172193<td>39 (0x27)</td>
    173 <td>S_ATC_PROBE_BUFFER</td>
     194<td>✓</td>
     195</tr>
     196<tr>
     197<td>40 (0x28)</td>
     198<td></td>
     199</tr>
     200<tr>
     201<td>41 (0x29)</td>
     202<td></td>
     203</tr>
     204<tr>
     205<td>128 (0x80)</td>
     206<td></td>
     207</tr>
     208<tr>
     209<td>129 (0x81)</td>
     210<td></td>
     211</tr>
     212<tr>
     213<td>130 (0x82)</td>
     214<td></td>
     215</tr>
     216<tr>
     217<td>131 (0x83)</td>
     218<td></td>
     219</tr>
     220<tr>
     221<td>132 (0x84)</td>
     222<td></td>
     223</tr>
     224<tr>
     225<td>133 (0x85)</td>
     226<td></td>
     227</tr>
     228<tr>
     229<td>134 (0x86)</td>
     230<td></td>
     231</tr>
     232<tr>
     233<td>135 (0x87)</td>
     234<td></td>
     235</tr>
     236<tr>
     237<td>136 (0x88)</td>
     238<td></td>
     239</tr>
     240<tr>
     241<td>137 (0x89)</td>
     242<td></td>
     243</tr>
     244<tr>
     245<td>138 (0x8a)</td>
     246<td></td>
     247</tr>
     248<tr>
     249<td>139 (0x8b)</td>
     250<td></td>
     251</tr>
     252<tr>
     253<td>140 (0x8c)</td>
     254<td></td>
     255</tr>
     256<tr>
     257<td>160 (0xa0)</td>
     258<td></td>
     259</tr>
     260<tr>
     261<td>161 (0xa1)</td>
     262<td></td>
     263</tr>
     264<tr>
     265<td>162 (0xa2)</td>
     266<td></td>
     267</tr>
     268<tr>
     269<td>163 (0xa3)</td>
     270<td></td>
     271</tr>
     272<tr>
     273<td>164 (0xa4)</td>
     274<td></td>
     275</tr>
     276<tr>
     277<td>165 (0xa5)</td>
     278<td></td>
     279</tr>
     280<tr>
     281<td>166 (0xa6)</td>
     282<td></td>
     283</tr>
     284<tr>
     285<td>167 (0xa7)</td>
     286<td></td>
     287</tr>
     288<tr>
     289<td>168 (0xa8)</td>
     290<td></td>
     291</tr>
     292<tr>
     293<td>169 (0xa9)</td>
     294<td></td>
     295</tr>
     296<tr>
     297<td>170 (0xaa)</td>
     298<td></td>
     299</tr>
     300<tr>
     301<td>171 (0xab)</td>
     302<td></td>
     303</tr>
     304<tr>
     305<td>172 (0xac)</td>
     306<td></td>
    174307</tr>
    175308</tbody>
     
    237370<code>for (BYTE i = 0; i &lt; 4; i++)
    238371    *(UINT32*)(SMEM + i*4 + (OFFSET &amp; ~3)) = SDATA[i]</code></p>
     372<h4>S_DCACHE_DISCARD</h4>
     373<p>Opcode: 40 (0x28) only for GCN 1.4<br />
     374Syntax: S_DCACHE_DISCARD SBASE(2), SOFFSET1<br />
     375Description: Discard one dirty scalar data cache line. A cache line is 64
     376bytes. The address is calculated as for S_STORE_DWORD and aligned to a 64-byte boundary.
     377LGKM_CNT is incremented by 1 for this opcode.</p>
     378<h4>S_DCACHE_DISCARD_X2</h4>
     379<p>Opcode: 41 (0x29) only for GCN 1.4<br />
     380Syntax: S_DCACHE_DISCARD_X2 SBASE(2), SOFFSET1<br />
     381Description: Discard two dirty scalar data cache lines. A cache line is 64
     382bytes. The address is calculated as for S_STORE_DWORD and aligned to a 64-byte boundary.
     383LGKM_CNT is incremented by 1 for this opcode.</p>
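<p>The 64-byte alignment mentioned for the discard instructions can be sketched in C as below. This is only one reading of the description: it assumes the alignment is applied to the sum of base address and offset, and the function name is hypothetical.</p>

```c
#include <stdint.h>

/* Scalar data cache line size stated in the description above. */
enum { SCALAR_CACHE_LINE = 64 };

/* Assumption: the discarded line is the one containing base + offset,
 * i.e. the sum is rounded down to a 64-byte boundary. */
static uint64_t dcache_discard_line(uint64_t base, uint64_t offset)
{
    return (base + offset) & ~(uint64_t)(SCALAR_CACHE_LINE - 1);
}
```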
    239384<h4>S_DCACHE_INV</h4>
    240385<p>Opcode: 32 (0x20)<br />
     
    295440<p>Opcode: 16 (0x10)<br />
    296441Syntax: S_STORE_DWORD SDATA, SBASE(2), OFFSET<br />
    297 Description: Store a single dword to memory. The offset must be M0 or an immediate.<br />
    298 SBASE is buffer descriptor.<br />
     442Description: Store a single dword to memory.
     443On GCN 1.2, the offset must be M0 or an immediate.<br />
    299444Operation:<br />
    300445<code>*(UINT32*)(SMEM + (OFFSET &amp; ~3)) = SDATA</code></p>
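<p>The operation above can be mimicked on a host byte array to show how the two low offset bits are dropped. This is a minimal sketch: <code>smem</code> is a host pointer standing in for the memory addressed through SBASE, and the function name is hypothetical.</p>

```c
#include <stdint.h>
#include <string.h>

/* Emulates *(UINT32*)(SMEM + (OFFSET & ~3)) = SDATA on host memory.
 * Assumption: smem points to a buffer large enough for the access. */
static void s_store_dword_emul(uint8_t *smem, uint32_t offset, uint32_t sdata)
{
    /* Clear the two low bits so the store is dword-aligned. */
    memcpy(smem + (offset & ~3u), &sdata, sizeof sdata);
}
```

<p>With an offset of 6, for example, the dword lands at byte 4 of the buffer.</p>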
     
    302447<p>Opcode: 17 (0x11)<br />
    303448Syntax: S_STORE_DWORDX2 SDATA(2), SBASE(2), OFFSET<br />
    304 Description: Store two dwords to memory. The offset must be M0 or an immediate.<br />
     449Description: Store two dwords to memory.
     450On GCN 1.2, the offset must be M0 or an immediate.<br />
    305451Operation:<br />
    306452<code>*(UINT64*)(SMEM + (OFFSET &amp; ~3)) = SDATA</code></p>
     
    308454<p>Opcode: 18 (0x12)<br />
    309455Syntax: S_STORE_DWORDX4 SDATA(4), SBASE(2), OFFSET<br />
    310 Description: Store four dwords to memory. The offset must be M0 or an immediate.<br />
     456Description: Store four dwords to memory.
     457On GCN 1.2, the offset must be M0 or an immediate.<br />
    311458Operation:<br />
    312459<code>for (BYTE i = 0; i &lt; 4; i++)