| 1 | [wiki:ClrxToc Back to Table of content] |
| 2 | {{{ |
| 3 | #!html |
| 4 | <h2>GCN ISA SOP1 instructions</h2> |
| 5 | <p>The basic encoding of a SOP1 instruction occupies 4 bytes (one dword). List of fields:</p> |
| 6 | <table> |
| 7 | <thead> |
| 8 | <tr> |
| 9 | <th>Bits</th> |
| 10 | <th>Name</th> |
| 11 | <th>Description</th> |
| 12 | </tr> |
| 13 | </thead> |
| 14 | <tbody> |
| 15 | <tr> |
| 16 | <td>0-7</td> |
| 17 | <td>SSRC0</td> |
| 18 | <td>Scalar source operand. Refer to operand encoding</td> |
| 19 | </tr> |
| 20 | <tr> |
| 21 | <td>8-15</td> |
| 22 | <td>OPCODE</td> |
| 23 | <td>Operation code</td> |
| 24 | </tr> |
| 25 | <tr> |
| 26 | <td>16-22</td> |
| 27 | <td>SDST</td> |
| 28 | <td>Destination scalar operand. Refer to operand encoding</td> |
| 29 | </tr> |
| 30 | <tr> |
| 31 | <td>23-31</td> |
| 32 | <td>ENCODING</td> |
| 33 | <td>Encoding type. Must be 0b101111101</td> |
| 34 | </tr> |
| 35 | </tbody> |
| 36 | </table> |
| 37 | <p>Syntax for almost all instructions: INSTRUCTION SDST, SSRC0</p> |
| 38 | <p>Example: s_mov_b32 s0, s1</p> |
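<p>The field layout above can be sketched as a small packer and unpacker. This is only an illustrative sketch: the function names are hypothetical, and it assumes that the scalar registers s0 and s1 are encoded as operand values 0 and 1 (see the operand encoding) and that no literal constant follows the instruction word.</p>

```python
# Field positions follow the SOP1 field table; names are hypothetical.
SOP1_ENCODING = 0b101111101  # bits 23-31


def encode_sop1(opcode, sdst, ssrc0):
    """Pack the SOP1 fields into one 32-bit instruction word."""
    if not 0 <= opcode <= 0xFF:
        raise ValueError("OPCODE must fit in 8 bits")
    if not 0 <= sdst <= 0x7F:
        raise ValueError("SDST must fit in 7 bits")
    if not 0 <= ssrc0 <= 0xFF:
        raise ValueError("SSRC0 must fit in 8 bits")
    return (SOP1_ENCODING << 23) | (sdst << 16) | (opcode << 8) | ssrc0


def decode_sop1(word):
    """Split a 32-bit word into (opcode, sdst, ssrc0); reject non-SOP1 words."""
    if ((word >> 23) & 0x1FF) != SOP1_ENCODING:
        raise ValueError("not a SOP1 instruction word")
    return (word >> 8) & 0xFF, (word >> 16) & 0x7F, word & 0xFF


# s_mov_b32 s0, s1 on GCN 1.0/1.1 (opcode 3), assuming s0 is 0 and s1 is 1
word = encode_sop1(opcode=3, sdst=0, ssrc0=1)
print(hex(word))  # 0xbe800301
```

<p>On GCN 1.2 the same instruction would use opcode 0 instead of 3; the other fields stay in the same positions.</p>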
| 39 | <p>List of the instructions by opcode:</p> |
| 40 | <table> |
| 41 | <thead> |
| 42 | <tr> |
| 43 | <th>Opcode</th> |
| 44 | <th>Mnemonic (GCN 1.0/1.1)</th> |
| 45 | <th>Mnemonic (GCN 1.2)</th> |
| 46 | </tr> |
| 47 | </thead> |
| 48 | <tbody> |
| 49 | <tr> |
| 50 | <td>0 (0x0)</td> |
| 51 | <td>--</td> |
| 52 | <td>S_MOV_B32</td> |
| 53 | </tr> |
| 54 | <tr> |
| 55 | <td>1 (0x1)</td> |
| 56 | <td>--</td> |
| 57 | <td>S_MOV_B64</td> |
| 58 | </tr> |
| 59 | <tr> |
| 60 | <td>2 (0x2)</td> |
| 61 | <td>--</td> |
| 62 | <td>S_CMOV_B32</td> |
| 63 | </tr> |
| 64 | <tr> |
| 65 | <td>3 (0x3)</td> |
| 66 | <td>S_MOV_B32</td> |
| 67 | <td>S_CMOV_B64</td> |
| 68 | </tr> |
| 69 | <tr> |
| 70 | <td>4 (0x4)</td> |
| 71 | <td>S_MOV_B64</td> |
| 72 | <td>S_NOT_B32</td> |
| 73 | </tr> |
| 74 | <tr> |
| 75 | <td>5 (0x5)</td> |
| 76 | <td>S_CMOV_B32</td> |
| 77 | <td>S_NOT_B64</td> |
| 78 | </tr> |
| 79 | <tr> |
| 80 | <td>6 (0x6)</td> |
| 81 | <td>S_CMOV_B64</td> |
| 82 | <td>S_WQM_B32</td> |
| 83 | </tr> |
| 84 | <tr> |
| 85 | <td>7 (0x7)</td> |
| 86 | <td>S_NOT_B32</td> |
| 87 | <td>S_WQM_B64</td> |
| 88 | </tr> |
| 89 | <tr> |
| 90 | <td>8 (0x8)</td> |
| 91 | <td>S_NOT_B64</td> |
| 92 | <td>S_BREV_B32</td> |
| 93 | </tr> |
| 94 | <tr> |
| 95 | <td>9 (0x9)</td> |
| 96 | <td>S_WQM_B32</td> |
| 97 | <td>S_BREV_B64</td> |
| 98 | </tr> |
| 99 | <tr> |
| 100 | <td>10 (0xa)</td> |
| 101 | <td>S_WQM_B64</td> |
| 102 | <td>S_BCNT0_I32_B32</td> |
| 103 | </tr> |
| 104 | <tr> |
| 105 | <td>11 (0xb)</td> |
| 106 | <td>S_BREV_B32</td> |
| 107 | <td>S_BCNT0_I32_B64</td> |
| 108 | </tr> |
| 109 | <tr> |
| 110 | <td>12 (0xc)</td> |
| 111 | <td>S_BREV_B64</td> |
| 112 | <td>S_BCNT1_I32_B32</td> |
| 113 | </tr> |
| 114 | <tr> |
| 115 | <td>13 (0xd)</td> |
| 116 | <td>S_BCNT0_I32_B32</td> |
| 117 | <td>S_BCNT1_I32_B64</td> |
| 118 | </tr> |
| 119 | <tr> |
| 120 | <td>14 (0xe)</td> |
| 121 | <td>S_BCNT0_I32_B64</td> |
| 122 | <td>S_FF0_I32_B32</td> |
| 123 | </tr> |
| 124 | <tr> |
| 125 | <td>15 (0xf)</td> |
| 126 | <td>S_BCNT1_I32_B32</td> |
| 127 | <td>S_FF0_I32_B64</td> |
| 128 | </tr> |
| 129 | <tr> |
| 130 | <td>16 (0x10)</td> |
| 131 | <td>S_BCNT1_I32_B64</td> |
| 132 | <td>S_FF1_I32_B32</td> |
| 133 | </tr> |
| 134 | <tr> |
| 135 | <td>17 (0x11)</td> |
| 136 | <td>S_FF0_I32_B32</td> |
| 137 | <td>S_FF1_I32_B64</td> |
| 138 | </tr> |
| 139 | <tr> |
| 140 | <td>18 (0x12)</td> |
| 141 | <td>S_FF0_I32_B64</td> |
| 142 | <td>S_FLBIT_I32_B32</td> |
| 143 | </tr> |
| 144 | <tr> |
| 145 | <td>19 (0x13)</td> |
| 146 | <td>S_FF1_I32_B32</td> |
| 147 | <td>S_FLBIT_I32_B64</td> |
| 148 | </tr> |
| 149 | <tr> |
| 150 | <td>20 (0x14)</td> |
| 151 | <td>S_FF1_I32_B64</td> |
| 152 | <td>S_FLBIT_I32</td> |
| 153 | </tr> |
| 154 | <tr> |
| 155 | <td>21 (0x15)</td> |
| 156 | <td>S_FLBIT_I32_B32</td> |
| 157 | <td>S_FLBIT_I32_I64</td> |
| 158 | </tr> |
| 159 | <tr> |
| 160 | <td>22 (0x16)</td> |
| 161 | <td>S_FLBIT_I32_B64</td> |
| 162 | <td>S_SEXT_I32_I8</td> |
| 163 | </tr> |
| 164 | <tr> |
| 165 | <td>23 (0x17)</td> |
| 166 | <td>S_FLBIT_I32</td> |
| 167 | <td>S_SEXT_I32_I16</td> |
| 168 | </tr> |
| 169 | <tr> |
| 170 | <td>24 (0x18)</td> |
| 171 | <td>S_FLBIT_I32_I64</td> |
| 172 | <td>S_BITSET0_B32</td> |
| 173 | </tr> |
| 174 | <tr> |
| 175 | <td>25 (0x19)</td> |
| 176 | <td>S_SEXT_I32_I8</td> |
| 177 | <td>S_BITSET0_B64</td> |
| 178 | </tr> |
| 179 | <tr> |
| 180 | <td>26 (0x1a)</td> |
| 181 | <td>S_SEXT_I32_I16</td> |
| 182 | <td>S_BITSET1_B32</td> |
| 183 | </tr> |
| 184 | <tr> |
| 185 | <td>27 (0x1b)</td> |
| 186 | <td>S_BITSET0_B32</td> |
| 187 | <td>S_BITSET1_B64</td> |
| 188 | </tr> |
| 189 | <tr> |
| 190 | <td>28 (0x1c)</td> |
| 191 | <td>S_BITSET0_B64</td> |
| 192 | <td>S_GETPC_B64</td> |
| 193 | </tr> |
| 194 | <tr> |
| 195 | <td>29 (0x1d)</td> |
| 196 | <td>S_BITSET1_B32</td> |
| 197 | <td>S_SETPC_B64</td> |
| 198 | </tr> |
| 199 | <tr> |
| 200 | <td>30 (0x1e)</td> |
| 201 | <td>S_BITSET1_B64</td> |
| 202 | <td>S_SWAPPC_B64</td> |
| 203 | </tr> |
| 204 | <tr> |
| 205 | <td>31 (0x1f)</td> |
| 206 | <td>S_GETPC_B64</td> |
| 207 | <td>S_RFE_B64</td> |
| 208 | </tr> |
| 209 | <tr> |
| 210 | <td>32 (0x20)</td> |
| 211 | <td>S_SETPC_B64</td> |
| 212 | <td>S_AND_SAVEEXEC_B64</td> |
| 213 | </tr> |
| 214 | <tr> |
| 215 | <td>33 (0x21)</td> |
| 216 | <td>S_SWAPPC_B64</td> |
| 217 | <td>S_OR_SAVEEXEC_B64</td> |
| 218 | </tr> |
| 219 | <tr> |
| 220 | <td>34 (0x22)</td> |
| 221 | <td>S_RFE_B64</td> |
| 222 | <td>S_XOR_SAVEEXEC_B64</td> |
| 223 | </tr> |
| 224 | <tr> |
| 225 | <td>35 (0x23)</td> |
| 226 | <td>--</td> |
| 227 | <td>S_ANDN2_SAVEEXEC_B64</td> |
| 228 | </tr> |
| 229 | <tr> |
| 230 | <td>36 (0x24)</td> |
| 231 | <td>S_AND_SAVEEXEC_B64</td> |
| 232 | <td>S_ORN2_SAVEEXEC_B64</td> |
| 233 | </tr> |
| 234 | <tr> |
| 235 | <td>37 (0x25)</td> |
| 236 | <td>S_OR_SAVEEXEC_B64</td> |
| 237 | <td>S_NAND_SAVEEXEC_B64</td> |
| 238 | </tr> |
| 239 | <tr> |
| 240 | <td>38 (0x26)</td> |
| 241 | <td>S_XOR_SAVEEXEC_B64</td> |
| 242 | <td>S_NOR_SAVEEXEC_B64</td> |
| 243 | </tr> |
| 244 | <tr> |
| 245 | <td>39 (0x27)</td> |
| 246 | <td>S_ANDN2_SAVEEXEC_B64</td> |
| 247 | <td>S_XNOR_SAVEEXEC_B64</td> |
| 248 | </tr> |
| 249 | <tr> |
| 250 | <td>40 (0x28)</td> |
| 251 | <td>S_ORN2_SAVEEXEC_B64</td> |
| 252 | <td>S_QUADMASK_B32</td> |
| 253 | </tr> |
| 254 | <tr> |
| 255 | <td>41 (0x29)</td> |
| 256 | <td>S_NAND_SAVEEXEC_B64</td> |
| 257 | <td>S_QUADMASK_B64</td> |
| 258 | </tr> |
| 259 | <tr> |
| 260 | <td>42 (0x2a)</td> |
| 261 | <td>S_NOR_SAVEEXEC_B64</td> |
| 262 | <td>S_MOVRELS_B32</td> |
| 263 | </tr> |
| 264 | <tr> |
| 265 | <td>43 (0x2b)</td> |
| 266 | <td>S_XNOR_SAVEEXEC_B64</td> |
| 267 | <td>S_MOVRELS_B64</td> |
| 268 | </tr> |
| 269 | <tr> |
| 270 | <td>44 (0x2c)</td> |
| 271 | <td>S_QUADMASK_B32</td> |
| 272 | <td>S_MOVRELD_B32</td> |
| 273 | </tr> |
| 274 | <tr> |
| 275 | <td>45 (0x2d)</td> |
| 276 | <td>S_QUADMASK_B64</td> |
| 277 | <td>S_MOVRELD_B64</td> |
| 278 | </tr> |
| 279 | <tr> |
| 280 | <td>46 (0x2e)</td> |
| 281 | <td>S_MOVRELS_B32</td> |
| 282 | <td>S_CBRANCH_JOIN</td> |
| 283 | </tr> |
| 284 | <tr> |
| 285 | <td>47 (0x2f)</td> |
| 286 | <td>S_MOVRELS_B64</td> |
| 287 | <td>S_MOV_REGRD_B32</td> |
| 288 | </tr> |
| 289 | <tr> |
| 290 | <td>48 (0x30)</td> |
| 291 | <td>S_MOVRELD_B32</td> |
| 292 | <td>S_ABS_I32</td> |
| 293 | </tr> |
| 294 | <tr> |
| 295 | <td>49 (0x31)</td> |
| 296 | <td>S_MOVRELD_B64</td> |
| 297 | <td>S_MOV_FED_B32</td> |
| 298 | </tr> |
| 299 | <tr> |
| 300 | <td>50 (0x32)</td> |
| 301 | <td>S_CBRANCH_JOIN</td> |
| 302 | <td>S_SET_GPR_IDX_IDX</td> |
| 303 | </tr> |
| 304 | <tr> |
| 305 | <td>51 (0x33)</td> |
| 306 | <td>S_MOV_REGRD_B32</td> |
| 307 | <td>--</td> |
| 308 | </tr> |
| 309 | <tr> |
| 310 | <td>52 (0x34)</td> |
| 311 | <td>S_ABS_I32</td> |
| 312 | <td>--</td> |
| 313 | </tr> |
| 314 | <tr> |
| 315 | <td>53 (0x35)</td> |
| 316 | <td>S_MOV_FED_B32</td> |
| 317 | <td>--</td> |
| 318 | </tr> |
| 319 | </tbody> |
| 320 | </table> |
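<p>Comparing the two mnemonic columns suggests a regular renumbering: GCN 1.0/1.1 opcodes 3 to 34 appear 3 lower on GCN 1.2, and opcodes 36 to 53 appear 4 lower. The sketch below encodes that observation. It is read off this table rather than taken from any stated rule, and the function name is hypothetical. S_SET_GPR_IDX_IDX (50 on GCN 1.2) has no GCN 1.0/1.1 counterpart and is therefore not covered.</p>

```python
def gcn10_to_gcn12_opcode(opcode):
    """Map a GCN 1.0/1.1 SOP1 opcode to its GCN 1.2 number, per the table."""
    if 3 <= opcode <= 34:       # S_MOV_B32 .. S_RFE_B64
        return opcode - 3
    if 36 <= opcode <= 53:      # S_AND_SAVEEXEC_B64 .. S_MOV_FED_B32
        return opcode - 4
    raise ValueError(f"opcode {opcode} is unassigned on GCN 1.0/1.1")


print(gcn10_to_gcn12_opcode(3))   # 0  (S_MOV_B32)
print(gcn10_to_gcn12_opcode(36))  # 32 (S_AND_SAVEEXEC_B64)
```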
| 321 | <h3>Instruction set</h3> |
| 322 | <p>Alphabetically sorted instruction list:</p> |
| 323 | <h3>S_BREV_B32</h3> |
| 324 | <p>Opcode: 11 (0xb) for GCN 1.0/1.1; 8 (0x8) for GCN 1.2<br /> |
| 325 | Syntax: S_BREV_B32 SDST, SSRC0<br /> |
| 326 | Description: Reverse the bits of SSRC0 and store the result in SDST. SCC is not changed.<br />Operation:<br /> |
| 327 | <code>SDST = REVBIT(SSRC0)</code></p> |
| 328 | <h3>S_BREV_B64</h3> |
| 329 | <p>Opcode: 12 (0xc) for GCN 1.0/1.1; 9 (0x9) for GCN 1.2<br /> |
| 330 | Syntax: S_BREV_B64 SDST(2), SSRC0(2)<br /> |
| 331 | Description: Reverse the bits of SSRC0 and store the result in SDST. SCC is not changed. |
| 332 | SDST and SSRC0 are 64-bit.<br />Operation:<br /> |
| 333 | <code>SDST = REVBIT(SSRC0)</code></p> |
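<p>REVBIT is not spelled out in the S_BREV_B32 and S_BREV_B64 entries. A minimal Python sketch of the intended bit reversal, assuming bit 0 of the source becomes the highest bit of the result, could look like this (the function name and the <code>bits</code> parameter are hypothetical):</p>

```python
def s_brev(ssrc0, bits=32):
    """Reverse the order of the low `bits` bits of ssrc0 (REVBIT)."""
    result = 0
    for i in range(bits):
        if (ssrc0 >> i) & 1:
            # Source bit i lands at position bits-1-i.
            result |= 1 << (bits - 1 - i)
    return result


print(hex(s_brev(0x00000001)))    # 0x80000000
print(hex(s_brev(0x12345678)))    # 0x1e6a2c48
print(hex(s_brev(0x1, bits=64)))  # 0x8000000000000000
```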
| 334 | <h3>S_CMOV_B32</h3> |
| 335 | <p>Opcode: 5 (0x5) for GCN 1.0/1.1; 2 (0x2) for GCN 1.2<br /> |
| 336 | Syntax: S_CMOV_B32 SDST, SSRC0<br /> |
| 337 | Description: If SCC is 1, store SSRC0 into SDST, otherwise do not change SDST. |
| 338 | SCC is not changed.<br /> |
| 339 | Operation:<br /> |
| 340 | <code>SDST = SCC ? SSRC0 : SDST</code></p> |
| 341 | <h3>S_CMOV_B64</h3> |
| 342 | <p>Opcode: 6 (0x6) for GCN 1.0/1.1; 3 (0x3) for GCN 1.2<br /> |
| 343 | Syntax: S_CMOV_B64 SDST(2), SSRC0(2)<br /> |
| 344 | Description: If SCC is 1, store SSRC0 into SDST, otherwise do not change SDST. |
| 345 | SCC is not changed. SDST and SSRC0 are 64-bit.<br /> |
| 346 | Operation:<br /> |
| 347 | <code>SDST = SCC ? SSRC0 : SDST</code></p> |
| 348 | <h3>S_MOV_B32</h3> |
| 349 | <p>Opcode: 3 (0x3) for GCN 1.0/1.1; 0 (0x0) for GCN 1.2<br /> |
| 350 | Syntax: S_MOV_B32 SDST, SSRC0<br /> |
| 351 | Description: Copy the value of SSRC0 into SDST. SCC is not changed.<br /> |
| 352 | Operation:<br /> |
| 353 | <code>SDST = SSRC0</code></p> |
| 354 | <h3>S_MOV_B64</h3> |
| 355 | <p>Opcode: 4 (0x4) for GCN 1.0/1.1; 1 (0x1) for GCN 1.2<br /> |
| 356 | Syntax: S_MOV_B64 SDST(2), SSRC0(2)<br /> |
| 357 | Description: Copy the value of SSRC0 into SDST. SCC is not changed. SDST and SSRC0 are 64-bit.<br /> |
| 358 | Operation:<br /> |
| 359 | <code>SDST = SSRC0</code></p> |
| 366 | <h3>S_NOT_B32</h3> |
| 367 | <p>Opcode: 7 (0x7) for GCN 1.0/1.1; 4 (0x4) for GCN 1.2<br /> |
| 368 | Syntax: S_NOT_B32 SDST, SSRC0<br /> |
| 369 | Description: Store the bitwise negation of SSRC0 into SDST. |
| 370 | If result is non-zero, store 1 to SCC, otherwise store 0 to SCC.<br /> |
| 371 | Operation:<br /> |
| 372 | <code>SDST = ~SSRC0 |
| 373 | SCC = SDST!=0</code></p> |
| 374 | <h3>S_NOT_B64</h3> |
| 375 | <p>Opcode: 8 (0x8) for GCN 1.0/1.1; 5 (0x5) for GCN 1.2<br /> |
| 376 | Syntax: S_NOT_B64 SDST(2), SSRC0(2)<br /> |
| 377 | Description: Store the bitwise negation of SSRC0 into SDST. |
| 378 | If result is non-zero, store 1 to SCC, otherwise store 0 to SCC. |
| 379 | SDST and SSRC0 are 64-bit.<br /> |
| 380 | Operation:<br /> |
| 381 | <code>SDST = ~SSRC0 |
| 382 | SCC = SDST!=0</code></p> |
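<p>The S_NOT_B32 and S_NOT_B64 operation can be modelled in Python as a rough sketch. Python integers are unbounded, so the result is masked to the register width; the function name and the <code>bits</code> parameter are hypothetical and only serve the illustration.</p>

```python
def s_not(ssrc0, bits=32):
    """Return (SDST, SCC) for a bitwise NOT limited to `bits` bits."""
    mask = (1 << bits) - 1
    sdst = ~ssrc0 & mask      # keep only the low `bits` bits
    scc = int(sdst != 0)      # SCC is 1 when the result is non-zero
    return sdst, scc


print(s_not(0xFFFFFFFF))          # (0, 0)
print(s_not(0x0F0F0F0F))          # (4042322160, 1), i.e. 0xF0F0F0F0
```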
| 383 | <h3>S_WQM_B32</h3> |
| 384 | <p>Opcode: 9 (0x9) for GCN 1.0/1.1; 6 (0x6) for GCN 1.2<br /> |
| 385 | Syntax: S_WQM_B32 SDST, SSRC0<br /> |
| 386 | Description: For every 4-bit group in SSRC0, if any bit of the group is set, set all |
| 387 | four bits of that group, otherwise clear all four; store the result into SDST. |
| 388 | If result is non-zero, store 1 to SCC, otherwise store 0 to SCC.<br /> |
| 389 | Operation:<br /> |
| 390 | <code>UINT32 temp = 0 |
| 391 | for (UINT8 i = 0; i &lt; 32; i+=4) |
| 392 | temp |= ((SSRC0&gt;&gt;i) &amp; 15)!=0 ? (15&lt;&lt;i) : 0 |
| 393 | SDST = temp |
| 394 | SCC = SDST!=0</code></p> |
| 395 | <h3>S_WQM_B64</h3> |
| 396 | <p>Opcode: 10 (0xa) for GCN 1.0/1.1; 7 (0x7) for GCN 1.2<br /> |
| 397 | Syntax: S_WQM_B64 SDST(2), SSRC0(2)<br /> |
| 398 | Description: For every 4-bit group in SSRC0, if any bit of the group is set, set all |
| 399 | four bits of that group, otherwise clear all four; store the result into SDST. |
| 400 | If result is non-zero, store 1 to SCC, otherwise store 0 to SCC. |
| 401 | SDST and SSRC0 are 64-bit.<br /> |
| 402 | Operation:<br /> |
| 403 | <code>UINT64 temp = 0 |
| 404 | for (UINT8 i = 0; i &lt; 64; i+=4) |
| 405 | temp |= ((SSRC0&gt;&gt;i) &amp; 15)!=0 ? (15ULL&lt;&lt;i) : 0 |
| 406 | SDST = temp |
| 407 | SCC = SDST!=0</code></p> |
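<p>A runnable Python sketch of the S_WQM_B32 and S_WQM_B64 operation follows. It mirrors the pseudocode above with the accumulator starting at zero; the function name and the <code>bits</code> parameter are hypothetical.</p>

```python
def s_wqm(ssrc0, bits=32):
    """Return (SDST, SCC): each 4-bit group becomes 0xF if any of its bits is set."""
    sdst = 0
    for i in range(0, bits, 4):
        if (ssrc0 >> i) & 0xF:
            sdst |= 0xF << i
    return sdst, int(sdst != 0)


sdst, scc = s_wqm(0x00010200)
print(hex(sdst), scc)  # 0xf0f00 1
```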
| 408 | }}} |