[wiki:ClrxToc Back to Table of content] {{{ #!html

GCN ISA VOP1/VOP3 instructions

VOP1 instructions can be encoded in the VOP1 encoding and the VOP3A/VOP3B encoding. List of fields for VOP1 encoding:

Bits Name Description
0-8 SRC0 First (scalar or vector) source operand
9-16 OPCODE Operation code
17-24 VDST Destination vector operand
25-31 ENCODING Encoding type. Must be 0b0111111

Syntax: INSTRUCTION VDST, SRC0
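The VOP1 field layout above can be packed into a 32-bit word with a few shifts. The following is a minimal sketch, not CLRX code: `encode_vop1` and `decode_vop1` are hypothetical helpers, and the operand value 257 in the usage note assumes the usual GCN convention that source codes 256-511 select VGPRs (so v1 is 256 + 1), which this section does not state.

```python
VOP1_ENCODING = 0b0111111  # value required in bits 25-31


def encode_vop1(opcode, vdst, src0):
    """Pack VOP1 fields into a 32-bit instruction word (hypothetical helper)."""
    if not 0 <= src0 < (1 << 9):
        raise ValueError("SRC0 must fit in 9 bits (bits 0-8)")
    if not 0 <= opcode < (1 << 8):
        raise ValueError("OPCODE must fit in 8 bits (bits 9-16)")
    if not 0 <= vdst < (1 << 8):
        raise ValueError("VDST must fit in 8 bits (bits 17-24)")
    return (VOP1_ENCODING << 25) | (vdst << 17) | (opcode << 9) | src0


def decode_vop1(word):
    """Split a 32-bit VOP1 word back into (opcode, vdst, src0)."""
    if (word >> 25) & 0x7F != VOP1_ENCODING:
        raise ValueError("not a VOP1 encoded instruction")
    return (word >> 9) & 0xFF, (word >> 17) & 0xFF, word & 0x1FF
```

Under the VGPR assumption above, `encode_vop1(1, 0, 257)` would correspond to V_MOV_B32 V0, V1.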

List of fields for VOP3A/VOP3B encoding (GCN 1.0/1.1):

Bits Name Description
0-7 VDST Vector destination operand
8-10 ABS Absolute modifiers for source operands (VOP3A)
8-14 SDST Scalar destination operand (VOP3B)
11 CLAMP CLAMP modifier (VOP3A)
15 CLAMP CLAMP modifier (VOP3B)
17-25 OPCODE Operation code
26-31 ENCODING Encoding type. Must be 0b110100
32-40 SRC0 First (scalar or vector) source operand
41-49 SRC1 Second (scalar or vector) source operand
50-58 SRC2 Third (scalar or vector) source operand
59-60 OMOD OMOD modifier. Multiplication modifier
61-63 NEG Negation modifier for source operands

List of fields for VOP3A/VOP3B encoding (GCN 1.2):

Bits Name Description
0-7 VDST Destination vector operand
8-10 ABS Absolute modifiers for source operands (VOP3A)
8-14 SDST Scalar destination operand (VOP3B)
15 CLAMP CLAMP modifier
16-25 OPCODE Operation code
26-31 ENCODING Encoding type. Must be 0b110100
32-40 SRC0 First (scalar or vector) source operand
41-49 SRC1 Second (scalar or vector) source operand
50-58 SRC2 Third (scalar or vector) source operand
59-60 OMOD OMOD modifier. Multiplication modifier
61-63 NEG Negation modifier for source operands

Syntax: INSTRUCTION VDST, SRC0 [MODIFIERS]
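The two VOP3A layouts differ only in the position of CLAMP and the width of OPCODE. The sketch below decodes a 64-bit VOP3A word according to the field tables above; `decode_vop3a` and its `gcn12` flag are hypothetical names chosen for illustration, and the word is assumed to be given as one integer with the first dword in the low 32 bits.

```python
VOP3_ENCODING = 0b110100  # value required in bits 26-31


def _bits(word, lo, hi):
    """Extract bits lo..hi (inclusive) from an integer."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)


def decode_vop3a(word, gcn12=False):
    """Decode VOP3A fields; gcn12 selects the GCN 1.2 layout."""
    fields = {
        "vdst": _bits(word, 0, 7),
        "abs": _bits(word, 8, 10),
        # CLAMP is bit 11 on GCN 1.0/1.1 (VOP3A) and bit 15 on GCN 1.2.
        "clamp": _bits(word, 15, 15) if gcn12 else _bits(word, 11, 11),
        # OPCODE is 9 bits on GCN 1.0/1.1 and 10 bits on GCN 1.2.
        "opcode": _bits(word, 16, 25) if gcn12 else _bits(word, 17, 25),
        "encoding": _bits(word, 26, 31),
        "src0": _bits(word, 32, 40),
        "src1": _bits(word, 41, 49),
        "src2": _bits(word, 50, 58),
        "omod": _bits(word, 59, 60),
        "neg": _bits(word, 61, 63),
    }
    if fields["encoding"] != VOP3_ENCODING:
        raise ValueError("not a VOP3 encoded instruction")
    return fields
```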

Modifiers:

Negation and absolute value can be combined: -ABS(V0). The CLAMP and OMOD modifiers (MUL:2, MUL:4 and DIV:2) can be given in any order.
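As an illustration of how -ABS(V0) combines the two source modifiers, the sketch below applies the absolute value first and the negation second. It assumes that bit i of the ABS and NEG fields refers to source operand i (bit 0 for SRC0), which is not spelled out in the field tables; `apply_source_modifiers` is a hypothetical helper working on Python floats rather than raw register bits.

```python
def apply_source_modifiers(value, index, abs_bits, neg_bits):
    """Apply ABS, then NEG, to one source operand value.

    index    -- source operand number (0 for SRC0, 1 for SRC1, 2 for SRC2)
    abs_bits -- 3-bit ABS field (assumed: bit i belongs to operand i)
    neg_bits -- 3-bit NEG field (assumed: bit i belongs to operand i)
    """
    if (abs_bits >> index) & 1:
        value = abs(value)
    if (neg_bits >> index) & 1:
        value = -value
    return value
```

With both bits set for SRC0, the result is never positive: both 3.0 and -3.0 become -3.0.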

Limitations for operands:

VOP1 opcodes (0-127) are reflected in VOP3 in range: 384-511 for GCN 1.0/1.1 or 320-447 for GCN 1.2.
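The mapping between the two opcode spaces is a constant offset, as the opcode tables below also show. A minimal sketch, with `vop1_to_vop3_opcode` being a hypothetical helper name:

```python
VOP3_BASE_GCN10_11 = 384  # VOP1 opcodes occupy 384-511 in VOP3 (GCN 1.0/1.1)
VOP3_BASE_GCN12 = 320     # VOP1 opcodes occupy 320-447 in VOP3 (GCN 1.2)


def vop1_to_vop3_opcode(vop1_opcode, gcn12=False):
    """Return the VOP3 opcode that corresponds to a VOP1 opcode."""
    if not 0 <= vop1_opcode <= 127:
        raise ValueError("VOP1 opcode must be in range 0-127")
    base = VOP3_BASE_GCN12 if gcn12 else VOP3_BASE_GCN10_11
    return base + vop1_opcode
```

Note that the VOP1 opcode numbers themselves also differ between GCN 1.0/1.1 and GCN 1.2 for many instructions, so the VOP1 opcode has to be taken from the table of the matching architecture.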

List of the instructions by opcode (GCN 1.0/1.1):

Opcode Opcode(VOP3) GCN 1.0 GCN 1.1 Mnemonic
0 (0x0) 384 (0x180) V_NOP
1 (0x1) 385 (0x181) V_MOV_B32
2 (0x2) 386 (0x182) V_READFIRSTLANE_B32
3 (0x3) 387 (0x183) V_CVT_I32_F64
4 (0x4) 388 (0x184) V_CVT_F64_I32
5 (0x5) 389 (0x185) V_CVT_F32_I32
6 (0x6) 390 (0x186) V_CVT_F32_U32
7 (0x7) 391 (0x187) V_CVT_U32_F32
8 (0x8) 392 (0x188) V_CVT_I32_F32
9 (0x9) 393 (0x189) V_MOV_FED_B32
10 (0xa) 394 (0x18a) V_CVT_F16_F32
11 (0xb) 395 (0x18b) V_CVT_F32_F16
12 (0xc) 396 (0x18c) V_CVT_RPI_I32_F32
13 (0xd) 397 (0x18d) V_CVT_FLR_I32_F32
14 (0xe) 398 (0x18e) V_CVT_OFF_F32_I4
15 (0xf) 399 (0x18f) V_CVT_F32_F64
16 (0x10) 400 (0x190) V_CVT_F64_F32
17 (0x11) 401 (0x191) V_CVT_F32_UBYTE0
18 (0x12) 402 (0x192) V_CVT_F32_UBYTE1
19 (0x13) 403 (0x193) V_CVT_F32_UBYTE2
20 (0x14) 404 (0x194) V_CVT_F32_UBYTE3
21 (0x15) 405 (0x195) V_CVT_U32_F64
22 (0x16) 406 (0x196) V_CVT_F64_U32
23 (0x17) 407 (0x197) V_TRUNC_F64
24 (0x18) 408 (0x198) V_CEIL_F64
25 (0x19) 409 (0x199) V_RNDNE_F64
26 (0x1a) 410 (0x19a) V_FLOOR_F64
32 (0x20) 416 (0x1a0) V_FRACT_F32
33 (0x21) 417 (0x1a1) V_TRUNC_F32
34 (0x22) 418 (0x1a2) V_CEIL_F32
35 (0x23) 419 (0x1a3) V_RNDNE_F32
36 (0x24) 420 (0x1a4) V_FLOOR_F32
37 (0x25) 421 (0x1a5) V_EXP_F32
38 (0x26) 422 (0x1a6) V_LOG_CLAMP_F32
39 (0x27) 423 (0x1a7) V_LOG_F32
40 (0x28) 424 (0x1a8) V_RCP_CLAMP_F32
41 (0x29) 425 (0x1a9) V_RCP_LEGACY_F32
42 (0x2a) 426 (0x1aa) V_RCP_F32
43 (0x2b) 427 (0x1ab) V_RCP_IFLAG_F32
44 (0x2c) 428 (0x1ac) V_RSQ_CLAMP_F32
45 (0x2d) 429 (0x1ad) V_RSQ_LEGACY_F32
46 (0x2e) 430 (0x1ae) V_RSQ_F32
47 (0x2f) 431 (0x1af) V_RCP_F64
48 (0x30) 432 (0x1b0) V_RCP_CLAMP_F64
49 (0x31) 433 (0x1b1) V_RSQ_F64
50 (0x32) 434 (0x1b2) V_RSQ_CLAMP_F64
51 (0x33) 435 (0x1b3) V_SQRT_F32
52 (0x34) 436 (0x1b4) V_SQRT_F64
53 (0x35) 437 (0x1b5) V_SIN_F32
54 (0x36) 438 (0x1b6) V_COS_F32
55 (0x37) 439 (0x1b7) V_NOT_B32
56 (0x38) 440 (0x1b8) V_BFREV_B32
57 (0x39) 441 (0x1b9) V_FFBH_U32
58 (0x3a) 442 (0x1ba) V_FFBL_B32
59 (0x3b) 443 (0x1bb) V_FFBH_I32
60 (0x3c) 444 (0x1bc) V_FREXP_EXP_I32_F64
61 (0x3d) 445 (0x1bd) V_FREXP_MANT_F64
62 (0x3e) 446 (0x1be) V_FRACT_F64
63 (0x3f) 447 (0x1bf) V_FREXP_EXP_I32_F32
64 (0x40) 448 (0x1c0) V_FREXP_MANT_F32
65 (0x41) 449 (0x1c1) V_CLREXCP
66 (0x42) 450 (0x1c2) V_MOVRELD_B32
67 (0x43) 451 (0x1c3) V_MOVRELS_B32
68 (0x44) 452 (0x1c4) V_MOVRELSD_B32
69 (0x45) 453 (0x1c5) V_LOG_LEGACY_F32
70 (0x46) 454 (0x1c6) V_EXP_LEGACY_F32

List of the instructions by opcode (GCN 1.2):

Opcode Opcode(VOP3) Mnemonic
0 (0x0) 320 (0x140) V_NOP
1 (0x1) 321 (0x141) V_MOV_B32
2 (0x2) 322 (0x142) V_READFIRSTLANE_B32
3 (0x3) 323 (0x143) V_CVT_I32_F64
4 (0x4) 324 (0x144) V_CVT_F64_I32
5 (0x5) 325 (0x145) V_CVT_F32_I32
6 (0x6) 326 (0x146) V_CVT_F32_U32
7 (0x7) 327 (0x147) V_CVT_U32_F32
8 (0x8) 328 (0x148) V_CVT_I32_F32
9 (0x9) 329 (0x149) V_MOV_FED_B32
10 (0xa) 330 (0x14a) V_CVT_F16_F32
11 (0xb) 331 (0x14b) V_CVT_F32_F16
12 (0xc) 332 (0x14c) V_CVT_RPI_I32_F32
13 (0xd) 333 (0x14d) V_CVT_FLR_I32_F32
14 (0xe) 334 (0x14e) V_CVT_OFF_F32_I4
15 (0xf) 335 (0x14f) V_CVT_F32_F64
16 (0x10) 336 (0x150) V_CVT_F64_F32
17 (0x11) 337 (0x151) V_CVT_F32_UBYTE0
18 (0x12) 338 (0x152) V_CVT_F32_UBYTE1
19 (0x13) 339 (0x153) V_CVT_F32_UBYTE2
20 (0x14) 340 (0x154) V_CVT_F32_UBYTE3
21 (0x15) 341 (0x155) V_CVT_U32_F64
22 (0x16) 342 (0x156) V_CVT_F64_U32
23 (0x17) 343 (0x157) V_TRUNC_F64
24 (0x18) 344 (0x158) V_CEIL_F64
25 (0x19) 345 (0x159) V_RNDNE_F64
26 (0x1a) 346 (0x15a) V_FLOOR_F64
27 (0x1b) 347 (0x15b) V_FRACT_F32
28 (0x1c) 348 (0x15c) V_TRUNC_F32
29 (0x1d) 349 (0x15d) V_CEIL_F32
30 (0x1e) 350 (0x15e) V_RNDNE_F32
31 (0x1f) 351 (0x15f) V_FLOOR_F32
32 (0x20) 352 (0x160) V_EXP_F32
33 (0x21) 353 (0x161) V_LOG_F32
34 (0x22) 354 (0x162) V_RCP_F32
35 (0x23) 355 (0x163) V_RCP_IFLAG_F32
36 (0x24) 356 (0x164) V_RSQ_F32
37 (0x25) 357 (0x165) V_RCP_F64
38 (0x26) 358 (0x166) V_RSQ_F64
39 (0x27) 359 (0x167) V_SQRT_F32
40 (0x28) 360 (0x168) V_SQRT_F64
41 (0x29) 361 (0x169) V_SIN_F32
42 (0x2a) 362 (0x16a) V_COS_F32
43 (0x2b) 363 (0x16b) V_NOT_B32
44 (0x2c) 364 (0x16c) V_BFREV_B32
45 (0x2d) 365 (0x16d) V_FFBH_U32
46 (0x2e) 366 (0x16e) V_FFBL_B32
47 (0x2f) 367 (0x16f) V_FFBH_I32
48 (0x30) 368 (0x170) V_FREXP_EXP_I32_F64
49 (0x31) 369 (0x171) V_FREXP_MANT_F64
50 (0x32) 370 (0x172) V_FRACT_F64
51 (0x33) 371 (0x173) V_FREXP_EXP_I32_F32
52 (0x34) 372 (0x174) V_FREXP_MANT_F32
53 (0x35) 373 (0x175) V_CLREXCP
54 (0x36) 374 (0x176) V_MOVRELD_B32
55 (0x37) 375 (0x177) V_MOVRELS_B32
56 (0x38) 376 (0x178) V_MOVRELSD_B32
57 (0x39) 377 (0x179) V_CVT_F16_U16
58 (0x3a) 378 (0x17a) V_CVT_F16_I16
59 (0x3b) 379 (0x17b) V_CVT_U16_F16
60 (0x3c) 380 (0x17c) V_CVT_I16_F16
61 (0x3d) 381 (0x17d) V_RCP_F16
62 (0x3e) 382 (0x17e) V_SQRT_F16
63 (0x3f) 383 (0x17f) V_RSQ_F16
64 (0x40) 384 (0x180) V_LOG_F16
65 (0x41) 385 (0x181) V_EXP_F16
66 (0x42) 386 (0x182) V_FREXP_MANT_F16
67 (0x43) 387 (0x183) V_FREXP_EXP_I16_F16
68 (0x44) 388 (0x184) V_FLOOR_F16
69 (0x45) 389 (0x185) V_CEIL_F16
70 (0x46) 390 (0x186) V_TRUNC_F16
71 (0x47) 391 (0x187) V_RNDNE_F16
72 (0x48) 392 (0x188) V_FRACT_F16
73 (0x49) 393 (0x189) V_SIN_F16
74 (0x4a) 394 (0x18a) V_COS_F16
75 (0x4b) 395 (0x18b) V_EXP_LEGACY_F32
76 (0x4c) 396 (0x18c) V_LOG_LEGACY_F32

Instruction set

Alphabetically sorted instruction list:

V_CVT_F16_F32

Opcode VOP1: 10 (0xa)
Opcode VOP3A: 394 (0x18a) for GCN 1.0/1.1; 330 (0x14a) for GCN 1.2
Syntax: V_CVT_F16_F32 VDST, SRC0
Description: Convert single FP value to half floating point value with rounding from MODE register (single FP rounding mode), and store result to VDST. If absolute value is too high, then store -/+infinity to VDST.
Operation:
VDST = RNDHALF(ASFLOAT(SRC0))
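The conversion can be approximated on the host with Python's `struct` module. This is only a sketch under stated assumptions: it always rounds to nearest-even, whereas the hardware takes the rounding mode from the MODE register, and it stores the half value in the low 16 bits of the result. `v_cvt_f16_f32` is a hypothetical emulation function operating on raw 32-bit register contents.

```python
import struct


def v_cvt_f16_f32(src0):
    """Emulate V_CVT_F16_F32 on raw bits (round-to-nearest-even only)."""
    value = struct.unpack('<f', struct.pack('<I', src0 & 0xFFFFFFFF))[0]
    try:
        # The 'e' format packs an IEEE 754 half precision value.
        return struct.unpack('<H', struct.pack('<e', value))[0]
    except OverflowError:
        # Magnitude too large for half precision: store -/+infinity.
        return 0xFC00 if value < 0 else 0x7C00
```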

V_CVT_F32_F16

Opcode VOP1: 11 (0xb)
Opcode VOP3A: 395 (0x18b) for GCN 1.0/1.1; 331 (0x14b) for GCN 1.2
Syntax: V_CVT_F32_F16 VDST, SRC0
Description: Convert half FP value to single FP value, and store result to VDST.
Operation:
VDST = (FLOAT)(ASHALF(SRC0))
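Widening a half value is exact, so a host-side sketch needs no rounding logic. The function below assumes the half value sits in the low 16 bits of SRC0, which this description does not state; `v_cvt_f32_f16` is a hypothetical emulation function operating on raw register bits.

```python
import struct


def v_cvt_f32_f16(src0):
    """Emulate V_CVT_F32_F16 on raw bits (half assumed in bits 0-15)."""
    half = struct.unpack('<e', struct.pack('<H', src0 & 0xFFFF))[0]
    return struct.unpack('<I', struct.pack('<f', half))[0]
```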

V_CVT_F32_I32

Opcode VOP1: 5 (0x5)
Opcode VOP3A: 389 (0x185) for GCN 1.0/1.1; 325 (0x145) for GCN 1.2
Syntax: V_CVT_F32_I32 VDST, SRC0
Description: Convert signed 32-bit integer to single FP value, and store it to VDST.
Operation:
VDST = (FLOAT)(INT32)SRC0
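The (INT32) cast means the register bits are reinterpreted as a two's complement value before conversion. A small host-side sketch, assuming round-to-nearest-even for values that do not fit in the 24-bit significand (the description above does not name the rounding mode); `v_cvt_f32_i32` is a hypothetical emulation function.

```python
import struct


def v_cvt_f32_i32(src0):
    """Emulate V_CVT_F32_I32 on raw bits; returns raw single FP bits."""
    src0 &= 0xFFFFFFFF
    # Reinterpret the 32-bit pattern as a signed integer.
    signed = src0 - (1 << 32) if src0 & 0x80000000 else src0
    # Packing as 'f' rounds the value to single precision.
    return struct.unpack('<I', struct.pack('<f', float(signed)))[0]
```

For example, the pattern 0xFFFFFFFF is treated as -1 and yields the bits of -1.0, while 0x7FFFFFFF is rounded up to 2147483648.0.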

V_CVT_F32_U32

Opcode VOP1: 6 (0x6)
Opcode VOP3A: 390 (0x186) for GCN 1.0/1.1; 326 (0x146) for GCN 1.2
Syntax: V_CVT_F32_U32 VDST, SRC0
Description: Convert unsigned 32-bit integer to single FP value, and store it to VDST.
Operation:
VDST = (FLOAT)SRC0

V_CVT_F64_I32

Opcode VOP1: 4 (0x4)
Opcode VOP3A: 388 (0x184) for GCN 1.0/1.1; 324 (0x144) for GCN 1.2
Syntax: V_CVT_F64_I32 VDST(2), SRC0
Description: Convert signed 32-bit integer to double FP value, and store it to VDST.
Operation:
VDST = (DOUBLE)(INT32)SRC0

V_CVT_I32_F32

Opcode VOP1: 8 (0x8)
Opcode VOP3A: 392 (0x188) for GCN 1.0/1.1; 328 (0x148) for GCN 1.2
Syntax: V_CVT_I32_F32 VDST, SRC0
Description: Convert 32-bit floating point value from SRC0 to signed 32-bit integer, and store result to VDST. Conversion uses rounding to zero. If value is higher/lower than maximal/minimal integer then store MAX_INT32/MIN_INT32 to VDST. If input value is NaN then store 0 to VDST.
Operation:
VDST = 0
if (SRC0!=NAN)
    VDST = (INT32)MAX(MIN(RNDTZINT(ASFLOAT(SRC0)), 2147483647.0), -2147483648.0)
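The saturating behaviour described above (NaN gives 0, out-of-range values clamp to the integer limits) can be sketched on the host as follows. `v_cvt_i32_f32` is a hypothetical emulation function; it takes raw single FP bits and returns a signed Python integer rather than a 32-bit register pattern.

```python
import math
import struct

MAX_INT32 = 2147483647
MIN_INT32 = -2147483648


def v_cvt_i32_f32(src0):
    """Emulate V_CVT_I32_F32: round toward zero with saturation."""
    value = struct.unpack('<f', struct.pack('<I', src0 & 0xFFFFFFFF))[0]
    if math.isnan(value):
        return 0
    # Clamp first so that infinities never reach the integer conversion.
    value = max(min(value, float(MAX_INT32)), float(MIN_INT32))
    # int() truncates toward zero, matching RNDTZINT.
    return int(value)
```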

V_CVT_I32_F64

Opcode VOP1: 3 (0x3)
Opcode VOP3A: 387 (0x183) for GCN 1.0/1.1; 323 (0x143) for GCN 1.2
Syntax: V_CVT_I32_F64 VDST, SRC0(2)
Description: Convert 64-bit floating point value from SRC0 to signed 32-bit integer, and store result to VDST. Conversion uses rounding to zero. If value is higher/lower than maximal/minimal integer then store MAX_INT32/MIN_INT32 to VDST. If input value is NaN then store 0 to VDST.
Operation:
VDST = 0
if (SRC0!=NAN)
    VDST = (INT32)MAX(MIN(RNDTZINT(ASDOUBLE(SRC0)), 2147483647.0), -2147483648.0)

V_CVT_U32_F32

Opcode VOP1: 7 (0x7)
Opcode VOP3A: 391 (0x187) for GCN 1.0/1.1; 327 (0x147) for GCN 1.2
Syntax: V_CVT_U32_F32 VDST, SRC0
Description: Convert 32-bit floating point value from SRC0 to unsigned 32-bit integer, and store result to VDST. Conversion uses rounding to zero. If value is higher than maximal integer then store MAX_UINT32 to VDST. If input value is NaN then store 0 to VDST.
Operation:
VDST = 0
if (SRC0!=NAN)
    VDST = (UINT32)MIN(RNDTZINT(ASFLOAT(SRC0)), 4294967295.0)

V_MOV_FED_B32

Opcode VOP1: 9 (0x9)
Opcode VOP3A: 393 (0x189) for GCN 1.0/1.1; 329 (0x149) for GCN 1.2
Syntax: V_MOV_FED_B32 VDST, SRC0
Description: Introduce an EDC double error upon write to the destination VGPR without causing an exception (???).

V_MOV_B32

Opcode VOP1: 1 (0x1)
Opcode VOP3A: 385 (0x181) for GCN 1.0/1.1; 321 (0x141) for GCN 1.2
Syntax: V_MOV_B32 VDST, SRC0
Description: Move SRC0 into VDST.
Operation:
VDST = SRC0

V_NOP

Opcode VOP1: 0 (0x0)
Opcode VOP3A: 384 (0x180) for GCN 1.0/1.1; 320 (0x140) for GCN 1.2
Syntax: V_NOP
Description: Do nothing.

V_READFIRSTLANE_B32

Opcode VOP1: 2 (0x2)
Opcode VOP3A: 386 (0x182) for GCN 1.0/1.1; 322 (0x142) for GCN 1.2
Syntax: V_READFIRSTLANE_B32 SDST, VSRC0
Description: Copy one VSRC0 lane value to SDST. The lane (thread id) is the first active lane id, or the first lane id if all lanes are inactive. Ignores EXEC mask.
Operation:
UINT8 firstlane = 0
for (UINT8 i = 0; i < 64; i++)
    if (((1ULL<<i) & EXEC) != 0)
    { firstlane = i; break; }
SDST = VSRC0[firstlane]
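The lane selection can be modelled on the host with a list standing in for the 64 lanes of VSRC0. This is a sketch only: `v_readfirstlane_b32` and its arguments are hypothetical names, and `exec_mask` is a plain 64-bit integer in which bit i marks lane i as active.

```python
def v_readfirstlane_b32(vsrc0_lanes, exec_mask):
    """Return the VSRC0 value of the first active lane (lane 0 if none)."""
    if len(vsrc0_lanes) != 64:
        raise ValueError("expected one value per lane (64 lanes)")
    firstlane = 0
    for i in range(64):
        if (exec_mask >> i) & 1:
            firstlane = i
            break
    return vsrc0_lanes[firstlane]
```

For example, with lane values 0..63 and an EXEC mask of 0b1000 the function returns the value of lane 3, and with an EXEC mask of 0 it falls back to lane 0.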

}}}