source: CLRX/CLRadeonExtender/trunk/doc/GcnInstrsVop1.md @ 3501

Last change on this file since 3501 was 3501, checked in by matszpk, 10 months ago

CLRadeonExtender: CLRXDocs: Small fixes in VOP1, VOP2, VOP3, VOPC. Add VOP3P encoding.

## GCN ISA VOP1/VOP3 instructions

VOP1 instructions can be encoded in the VOP1 encoding and the VOP3A/VOP3B encoding.
List of fields for VOP1 encoding:

Bits  | Name     | Description
------|----------|------------------------------
0-8   | SRC0     | First (scalar or vector) source operand
9-16  | OPCODE   | Operation code
17-24 | VDST     | Destination vector operand
25-31 | ENCODING | Encoding type. Must be 0b0111111

Syntax: INSTRUCTION VDST, SRC0
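
The VOP1 field layout can be illustrated with a small Python sketch that packs the three
fields and the fixed ENCODING bits into one 32-bit word. The helper name `encode_vop1` and
the constant `VGPR_BASE` are hypothetical; the mapping of SRC0 values 256-511 to v0-v255 is
an assumption that is not stated in this section.

```python
def encode_vop1(opcode: int, vdst: int, src0: int) -> int:
    """Pack VOP1 fields into a 32-bit instruction word (bit layout from the table)."""
    if not 0 <= src0 < (1 << 9):
        raise ValueError("SRC0 must fit in 9 bits (bits 0-8)")
    if not 0 <= opcode < (1 << 8):
        raise ValueError("OPCODE must fit in 8 bits (bits 9-16)")
    if not 0 <= vdst < (1 << 8):
        raise ValueError("VDST must fit in 8 bits (bits 17-24)")
    # ENCODING (bits 25-31) must be 0b0111111
    return (0b0111111 << 25) | (vdst << 17) | (opcode << 9) | src0


VGPR_BASE = 256  # assumption: SRC0 values 256..511 select v0..v255

# V_MOV_B32 v0, v1 (VOP1 opcode 1)
word = encode_vop1(opcode=1, vdst=0, src0=VGPR_BASE + 1)
print(hex(word))
```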

List of fields for VOP3A/VOP3B encoding (GCN 1.0/1.1):

Bits  | Name     | Description
------|----------|------------------------------
0-7   | VDST     | Vector destination operand
8-10  | ABS      | Absolute modifiers for source operands (VOP3A)
8-14  | SDST     | Scalar destination operand (VOP3B)
11    | CLAMP    | CLAMP modifier (VOP3A)
17-25 | OPCODE   | Operation code
26-31 | ENCODING | Encoding type. Must be 0b110100
32-40 | SRC0     | First (scalar or vector) source operand
41-49 | SRC1     | Second (scalar or vector) source operand
50-58 | SRC2     | Third (scalar or vector) source operand
59-60 | OMOD     | OMOD modifier. Multiplication modifier
61-63 | NEG      | Negation modifier for source operands

List of fields for VOP3A/VOP3B encoding (GCN 1.2/1.4):

Bits  | Name     | Description
------|----------|------------------------------
0-7   | VDST     | Destination vector operand
8-10  | ABS      | Absolute modifiers for source operands (VOP3A)
8-14  | SDST     | Scalar destination operand (VOP3B)
11-14 | OP_SEL   | Operand selection (VOP3A) (GCN 1.4)
15    | CLAMP    | CLAMP modifier
16-25 | OPCODE   | Operation code
26-31 | ENCODING | Encoding type. Must be 0b110100
32-40 | SRC0     | First (scalar or vector) source operand
41-49 | SRC1     | Second (scalar or vector) source operand
50-58 | SRC2     | Third (scalar or vector) source operand
59-60 | OMOD     | OMOD modifier. Multiplication modifier
61-63 | NEG      | Negation modifier for source operands

Syntax: INSTRUCTION VDST, SRC0 [MODIFIERS]

Modifiers:

* CLAMP - clamps destination floating point value to the range 0.0-1.0
* MUL:2, MUL:4, DIV:2 - OMOD modifiers. Multiply destination floating point value by
2.0, 4.0 or 0.5 respectively. Clamping is applied after the OMOD modifier.
* -SRC - negate floating point value from source operand. Applied after ABS modifier.
* ABS(SRC), |SRC| - apply absolute value to source operand
* OP_SEL:VALUE|[B0,...] - operand half selection (0 - lower 16 bits, 1 - higher 16 bits)

NOTE: The OMOD modifier doesn't work if output denormals are allowed
(bit 5 of the MODE register for single precision or bit 7 for double precision).  
NOTE: The OMOD and CLAMP modifiers affect only instructions whose output is a
floating point value.  
NOTE: ABS and negation are applied to the source operand for any instruction.  

Negation and absolute value can be combined: `-ABS(V0)`. Modifiers CLAMP and
OMOD (MUL:2, MUL:4 and DIV:2) can be given in any order.
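
The order in which the modifiers act (ABS before negation on the input, OMOD before CLAMP on
the output) can be sketched in Python as follows. The function names and the string keys
used for OMOD are hypothetical, and the sketch ignores the denormal restriction from the
notes above.

```python
def apply_input_modifiers(value: float, use_abs: bool = False, neg: bool = False) -> float:
    """ABS is applied first, negation second, so -ABS(x) is never positive."""
    if use_abs:
        value = abs(value)
    if neg:
        value = -value
    return value


def apply_output_modifiers(value: float, omod: str = "", clamp: bool = False) -> float:
    """OMOD multiplies the result first; CLAMP then limits it to 0.0-1.0."""
    factors = {"": 1.0, "mul:2": 2.0, "mul:4": 4.0, "div:2": 0.5}
    result = value * factors[omod]
    if clamp:
        result = min(max(result, 0.0), 1.0)
    return result


print(apply_input_modifiers(3.0, use_abs=True, neg=True))
print(apply_output_modifiers(0.75, omod="mul:2", clamp=True))
```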

Operand half selection (OP_SEL) takes a value whose number of bits depends on the number
of operands. The last bit controls the destination operand. A zero bit chooses the lower
16 bits of the dword, a one bit chooses the higher 16 bits. Example: op_sel:[0,1,1] - higher
16 bits in the second source and in the destination. List of bits of OP_SEL field:

Bit | Operand | Description
----|---------|----------------------
 11 | SRC0    | Choose part of SRC0 (first source operand)
 12 | SRC1    | Choose part of SRC1 (second source operand)
 13 | SRC2    | Choose part of SRC2 (third source operand)
 14 | VDST    | Choose part of VDST (destination)
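
One possible reading of the OP_SEL rules is sketched below in Python: source selectors fill
field bits from the lowest bit upward, and the last list entry always goes to the VDST bit.
The helper `op_sel_field` and the constant `OP_SEL_SHIFT` are hypothetical names, and the
exact behaviour of a real assembler is an assumption here.

```python
OP_SEL_SHIFT = 11  # OP_SEL occupies bits 11-14 of the VOP3A word (GCN 1.4)


def op_sel_field(selectors: list) -> int:
    """Map op_sel:[s0, ..., dst] to the 4-bit OP_SEL field value."""
    if not 2 <= len(selectors) <= 4:
        raise ValueError("expected 1-3 source selectors plus one destination selector")
    *sources, dst = selectors
    field = 0
    for index, bit in enumerate(sources):
        field |= (bit & 1) << index   # SRC0 -> bit 0, SRC1 -> bit 1, SRC2 -> bit 2
    field |= (dst & 1) << 3           # VDST is always the top bit of the field
    return field


# op_sel:[0,1,1]: higher half of SRC1 and of VDST
print(bin(op_sel_field([0, 1, 1])))
print(hex(op_sel_field([0, 1, 1]) << OP_SEL_SHIFT))
```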

Limitations for operands:

* only one SGPR can be read by an instruction. Multiple occurrences of the same
SGPR are allowed
* only one literal constant can be used, and only when an SGPR or M0 is not used in
source operands
* only SRC0 can hold LDS_DIRECT

Unaligned pairs of SGPRs are allowed in source operands.

VOP1 opcodes (0-127) are reflected in VOP3 in range: 384-511 for GCN 1.0/1.1 or
320-447 for GCN 1.2.
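
The opcode mapping is a fixed offset, which the following Python sketch makes explicit. The
names `VOP3_BASE` and `vop1_to_vop3_opcode` are hypothetical; the entry for GCN 1.4 assumes
it shares the GCN 1.2 base, as the combined GCN 1.2/1.4 opcode table suggests.

```python
VOP3_BASE = {"1.0": 384, "1.1": 384, "1.2": 320, "1.4": 320}


def vop1_to_vop3_opcode(vop1_opcode: int, gcn: str) -> int:
    """Return the VOP3 opcode that mirrors a VOP1 opcode on the given GCN version."""
    if not 0 <= vop1_opcode <= 127:
        raise ValueError("VOP1 opcodes are in the range 0-127")
    return VOP3_BASE[gcn] + vop1_opcode


# V_BFREV_B32: VOP1 opcode 56 on GCN 1.0/1.1, 44 on GCN 1.2
print(vop1_to_vop3_opcode(56, "1.0"), vop1_to_vop3_opcode(44, "1.2"))
```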

List of the instructions by opcode (GCN 1.0/1.1):

 Opcode     | Opcode(VOP3)|GCN 1.0|GCN 1.1| Mnemonic
------------|-------------|-------|-------|-----------------------------
 0 (0x0)    | 384 (0x180) |   ✓   |   ✓   | V_NOP
 1 (0x1)    | 385 (0x181) |   ✓   |   ✓   | V_MOV_B32
 2 (0x2)    | 386 (0x182) |   ✓   |   ✓   | V_READFIRSTLANE_B32
 3 (0x3)    | 387 (0x183) |   ✓   |   ✓   | V_CVT_I32_F64
 4 (0x4)    | 388 (0x184) |   ✓   |   ✓   | V_CVT_F64_I32
 5 (0x5)    | 389 (0x185) |   ✓   |   ✓   | V_CVT_F32_I32
 6 (0x6)    | 390 (0x186) |   ✓   |   ✓   | V_CVT_F32_U32
 7 (0x7)    | 391 (0x187) |   ✓   |   ✓   | V_CVT_U32_F32
 8 (0x8)    | 392 (0x188) |   ✓   |   ✓   | V_CVT_I32_F32
 9 (0x9)    | 393 (0x189) |   ✓   |   ✓   | V_MOV_FED_B32
 10 (0xa)   | 394 (0x18a) |   ✓   |   ✓   | V_CVT_F16_F32
 11 (0xb)   | 395 (0x18b) |   ✓   |   ✓   | V_CVT_F32_F16
 12 (0xc)   | 396 (0x18c) |   ✓   |   ✓   | V_CVT_RPI_I32_F32
 13 (0xd)   | 397 (0x18d) |   ✓   |   ✓   | V_CVT_FLR_I32_F32
 14 (0xe)   | 398 (0x18e) |   ✓   |   ✓   | V_CVT_OFF_F32_I4
 15 (0xf)   | 399 (0x18f) |   ✓   |   ✓   | V_CVT_F32_F64
 16 (0x10)  | 400 (0x190) |   ✓   |   ✓   | V_CVT_F64_F32
 17 (0x11)  | 401 (0x191) |   ✓   |   ✓   | V_CVT_F32_UBYTE0
 18 (0x12)  | 402 (0x192) |   ✓   |   ✓   | V_CVT_F32_UBYTE1
 19 (0x13)  | 403 (0x193) |   ✓   |   ✓   | V_CVT_F32_UBYTE2
 20 (0x14)  | 404 (0x194) |   ✓   |   ✓   | V_CVT_F32_UBYTE3
 21 (0x15)  | 405 (0x195) |   ✓   |   ✓   | V_CVT_U32_F64
 22 (0x16)  | 406 (0x196) |   ✓   |   ✓   | V_CVT_F64_U32
 23 (0x17)  | 407 (0x197) |       |   ✓   | V_TRUNC_F64
 24 (0x18)  | 408 (0x198) |       |   ✓   | V_CEIL_F64
 25 (0x19)  | 409 (0x199) |       |   ✓   | V_RNDNE_F64
 26 (0x1a)  | 410 (0x19a) |       |   ✓   | V_FLOOR_F64
 32 (0x20)  | 416 (0x1a0) |   ✓   |   ✓   | V_FRACT_F32
 33 (0x21)  | 417 (0x1a1) |   ✓   |   ✓   | V_TRUNC_F32
 34 (0x22)  | 418 (0x1a2) |   ✓   |   ✓   | V_CEIL_F32
 35 (0x23)  | 419 (0x1a3) |   ✓   |   ✓   | V_RNDNE_F32
 36 (0x24)  | 420 (0x1a4) |   ✓   |   ✓   | V_FLOOR_F32
 37 (0x25)  | 421 (0x1a5) |   ✓   |   ✓   | V_EXP_F32
 38 (0x26)  | 422 (0x1a6) |   ✓   |   ✓   | V_LOG_CLAMP_F32
 39 (0x27)  | 423 (0x1a7) |   ✓   |   ✓   | V_LOG_F32
 40 (0x28)  | 424 (0x1a8) |   ✓   |   ✓   | V_RCP_CLAMP_F32
 41 (0x29)  | 425 (0x1a9) |   ✓   |   ✓   | V_RCP_LEGACY_F32
 42 (0x2a)  | 426 (0x1aa) |   ✓   |   ✓   | V_RCP_F32
 43 (0x2b)  | 427 (0x1ab) |   ✓   |   ✓   | V_RCP_IFLAG_F32
 44 (0x2c)  | 428 (0x1ac) |   ✓   |   ✓   | V_RSQ_CLAMP_F32
 45 (0x2d)  | 429 (0x1ad) |   ✓   |   ✓   | V_RSQ_LEGACY_F32
 46 (0x2e)  | 430 (0x1ae) |   ✓   |   ✓   | V_RSQ_F32
 47 (0x2f)  | 431 (0x1af) |   ✓   |   ✓   | V_RCP_F64
 48 (0x30)  | 432 (0x1b0) |   ✓   |   ✓   | V_RCP_CLAMP_F64
 49 (0x31)  | 433 (0x1b1) |   ✓   |   ✓   | V_RSQ_F64
 50 (0x32)  | 434 (0x1b2) |   ✓   |   ✓   | V_RSQ_CLAMP_F64
 51 (0x33)  | 435 (0x1b3) |   ✓   |   ✓   | V_SQRT_F32
 52 (0x34)  | 436 (0x1b4) |   ✓   |   ✓   | V_SQRT_F64
 53 (0x35)  | 437 (0x1b5) |   ✓   |   ✓   | V_SIN_F32
 54 (0x36)  | 438 (0x1b6) |   ✓   |   ✓   | V_COS_F32
 55 (0x37)  | 439 (0x1b7) |   ✓   |   ✓   | V_NOT_B32
 56 (0x38)  | 440 (0x1b8) |   ✓   |   ✓   | V_BFREV_B32
 57 (0x39)  | 441 (0x1b9) |   ✓   |   ✓   | V_FFBH_U32
 58 (0x3a)  | 442 (0x1ba) |   ✓   |   ✓   | V_FFBL_B32
 59 (0x3b)  | 443 (0x1bb) |   ✓   |   ✓   | V_FFBH_I32
 60 (0x3c)  | 444 (0x1bc) |   ✓   |   ✓   | V_FREXP_EXP_I32_F64
 61 (0x3d)  | 445 (0x1bd) |   ✓   |   ✓   | V_FREXP_MANT_F64
 62 (0x3e)  | 446 (0x1be) |   ✓   |   ✓   | V_FRACT_F64
 63 (0x3f)  | 447 (0x1bf) |   ✓   |   ✓   | V_FREXP_EXP_I32_F32
 64 (0x40)  | 448 (0x1c0) |   ✓   |   ✓   | V_FREXP_MANT_F32
 65 (0x41)  | 449 (0x1c1) |   ✓   |   ✓   | V_CLREXCP
 66 (0x42)  | 450 (0x1c2) |   ✓   |   ✓   | V_MOVRELD_B32
 67 (0x43)  | 451 (0x1c3) |   ✓   |   ✓   | V_MOVRELS_B32
 68 (0x44)  | 452 (0x1c4) |   ✓   |   ✓   | V_MOVRELSD_B32
 69 (0x45)  | 453 (0x1c5) |       |   ✓   | V_LOG_LEGACY_F32
 70 (0x46)  | 454 (0x1c6) |       |   ✓   | V_EXP_LEGACY_F32

List of the instructions by opcode (GCN 1.2/1.4):

 Opcode     | Opcode(VOP3)| Mnemonic (GCN 1.2)  | Mnemonic (GCN 1.4)
------------|-------------|---------------------|------------------------
 0 (0x0)    | 320 (0x140) | V_NOP               | V_NOP
 1 (0x1)    | 321 (0x141) | V_MOV_B32           | V_MOV_B32
 2 (0x2)    | 322 (0x142) | V_READFIRSTLANE_B32 | V_READFIRSTLANE_B32
 3 (0x3)    | 323 (0x143) | V_CVT_I32_F64       | V_CVT_I32_F64
 4 (0x4)    | 324 (0x144) | V_CVT_F64_I32       | V_CVT_F64_I32
 5 (0x5)    | 325 (0x145) | V_CVT_F32_I32       | V_CVT_F32_I32
 6 (0x6)    | 326 (0x146) | V_CVT_F32_U32       | V_CVT_F32_U32
 7 (0x7)    | 327 (0x147) | V_CVT_U32_F32       | V_CVT_U32_F32
 8 (0x8)    | 328 (0x148) | V_CVT_I32_F32       | V_CVT_I32_F32
 9 (0x9)    | 329 (0x149) | V_MOV_FED_B32       | V_MOV_FED_B32
 10 (0xa)   | 330 (0x14a) | V_CVT_F16_F32       | V_CVT_F16_F32
 11 (0xb)   | 331 (0x14b) | V_CVT_F32_F16       | V_CVT_F32_F16
 12 (0xc)   | 332 (0x14c) | V_CVT_RPI_I32_F32   | V_CVT_RPI_I32_F32
 13 (0xd)   | 333 (0x14d) | V_CVT_FLR_I32_F32   | V_CVT_FLR_I32_F32
 14 (0xe)   | 334 (0x14e) | V_CVT_OFF_F32_I4    | V_CVT_OFF_F32_I4
 15 (0xf)   | 335 (0x14f) | V_CVT_F32_F64       | V_CVT_F32_F64
 16 (0x10)  | 336 (0x150) | V_CVT_F64_F32       | V_CVT_F64_F32
 17 (0x11)  | 337 (0x151) | V_CVT_F32_UBYTE0    | V_CVT_F32_UBYTE0
 18 (0x12)  | 338 (0x152) | V_CVT_F32_UBYTE1    | V_CVT_F32_UBYTE1
 19 (0x13)  | 339 (0x153) | V_CVT_F32_UBYTE2    | V_CVT_F32_UBYTE2
 20 (0x14)  | 340 (0x154) | V_CVT_F32_UBYTE3    | V_CVT_F32_UBYTE3
 21 (0x15)  | 341 (0x155) | V_CVT_U32_F64       | V_CVT_U32_F64
 22 (0x16)  | 342 (0x156) | V_CVT_F64_U32       | V_CVT_F64_U32
 23 (0x17)  | 343 (0x157) | V_TRUNC_F64         | V_TRUNC_F64
 24 (0x18)  | 344 (0x158) | V_CEIL_F64          | V_CEIL_F64
 25 (0x19)  | 345 (0x159) | V_RNDNE_F64         | V_RNDNE_F64
 26 (0x1a)  | 346 (0x15a) | V_FLOOR_F64         | V_FLOOR_F64
 27 (0x1b)  | 347 (0x15b) | V_FRACT_F32         | V_FRACT_F32
 28 (0x1c)  | 348 (0x15c) | V_TRUNC_F32         | V_TRUNC_F32
 29 (0x1d)  | 349 (0x15d) | V_CEIL_F32          | V_CEIL_F32
 30 (0x1e)  | 350 (0x15e) | V_RNDNE_F32         | V_RNDNE_F32
 31 (0x1f)  | 351 (0x15f) | V_FLOOR_F32         | V_FLOOR_F32
 32 (0x20)  | 352 (0x160) | V_EXP_F32           | V_EXP_F32
 33 (0x21)  | 353 (0x161) | V_LOG_F32           | V_LOG_F32
 34 (0x22)  | 354 (0x162) | V_RCP_F32           | V_RCP_F32
 35 (0x23)  | 355 (0x163) | V_RCP_IFLAG_F32     | V_RCP_IFLAG_F32
 36 (0x24)  | 356 (0x164) | V_RSQ_F32           | V_RSQ_F32
 37 (0x25)  | 357 (0x165) | V_RCP_F64           | V_RCP_F64
 38 (0x26)  | 358 (0x166) | V_RSQ_F64           | V_RSQ_F64
 39 (0x27)  | 359 (0x167) | V_SQRT_F32          | V_SQRT_F32
 40 (0x28)  | 360 (0x168) | V_SQRT_F64          | V_SQRT_F64
 41 (0x29)  | 361 (0x169) | V_SIN_F32           | V_SIN_F32
 42 (0x2a)  | 362 (0x16a) | V_COS_F32           | V_COS_F32
 43 (0x2b)  | 363 (0x16b) | V_NOT_B32           | V_NOT_B32
 44 (0x2c)  | 364 (0x16c) | V_BFREV_B32         | V_BFREV_B32
 45 (0x2d)  | 365 (0x16d) | V_FFBH_U32          | V_FFBH_U32
 46 (0x2e)  | 366 (0x16e) | V_FFBL_B32          | V_FFBL_B32
 47 (0x2f)  | 367 (0x16f) | V_FFBH_I32          | V_FFBH_I32
 48 (0x30)  | 368 (0x170) | V_FREXP_EXP_I32_F64 | V_FREXP_EXP_I32_F64
 49 (0x31)  | 369 (0x171) | V_FREXP_MANT_F64    | V_FREXP_MANT_F64
 50 (0x32)  | 370 (0x172) | V_FRACT_F64         | V_FRACT_F64
 51 (0x33)  | 371 (0x173) | V_FREXP_EXP_I32_F32 | V_FREXP_EXP_I32_F32
 52 (0x34)  | 372 (0x174) | V_FREXP_MANT_F32    | V_FREXP_MANT_F32
 53 (0x35)  | 373 (0x175) | V_CLREXCP           | V_CLREXCP
 54 (0x36)  | 374 (0x176) | V_MOVRELD_B32       | V_MOV_PRSV_B32
 55 (0x37)  | 375 (0x177) | V_MOVRELS_B32       | V_SCREEN_PARTITION_4SE_B32
 56 (0x38)  | 376 (0x178) | V_MOVRELSD_B32      | --
 57 (0x39)  | 377 (0x179) | V_CVT_F16_U16       | V_CVT_F16_U16
 58 (0x3a)  | 378 (0x17a) | V_CVT_F16_I16       | V_CVT_F16_I16
 59 (0x3b)  | 379 (0x17b) | V_CVT_U16_F16       | V_CVT_U16_F16
 60 (0x3c)  | 380 (0x17c) | V_CVT_I16_F16       | V_CVT_I16_F16
 61 (0x3d)  | 381 (0x17d) | V_RCP_F16           | V_RCP_F16
 62 (0x3e)  | 382 (0x17e) | V_SQRT_F16          | V_SQRT_F16
 63 (0x3f)  | 383 (0x17f) | V_RSQ_F16           | V_RSQ_F16
 64 (0x40)  | 384 (0x180) | V_LOG_F16           | V_LOG_F16
 65 (0x41)  | 385 (0x181) | V_EXP_F16           | V_EXP_F16
 66 (0x42)  | 386 (0x182) | V_FREXP_MANT_F16    | V_FREXP_MANT_F16
 67 (0x43)  | 387 (0x183) | V_FREXP_EXP_I16_F16 | V_FREXP_EXP_I16_F16
 68 (0x44)  | 388 (0x184) | V_FLOOR_F16         | V_FLOOR_F16
 69 (0x45)  | 389 (0x185) | V_CEIL_F16          | V_CEIL_F16
 70 (0x46)  | 390 (0x186) | V_TRUNC_F16         | V_TRUNC_F16
 71 (0x47)  | 391 (0x187) | V_RNDNE_F16         | V_RNDNE_F16
 72 (0x48)  | 392 (0x188) | V_FRACT_F16         | V_FRACT_F16
 73 (0x49)  | 393 (0x189) | V_SIN_F16           | V_SIN_F16
 74 (0x4a)  | 394 (0x18a) | V_COS_F16           | V_COS_F16
 75 (0x4b)  | 395 (0x18b) | V_EXP_LEGACY_F32    | V_EXP_LEGACY_F32
 76 (0x4c)  | 396 (0x18c) | V_LOG_LEGACY_F32    | V_LOG_LEGACY_F32
 77 (0x4d)  | 397 (0x18d) | --                  | V_CVT_NORM_I16_F16
 78 (0x4e)  | 398 (0x18e) | --                  | V_CVT_NORM_U16_F16
 79 (0x4f)  | 399 (0x18f) | --                  | V_SAT_PK_U8_I16
 80 (0x50)  | 400 (0x190) | --                  | V_WRITELANE_REGWR_B32
 81 (0x51)  | 401 (0x191) | --                  | V_SWAP_B32

### Instruction set

Alphabetically sorted instruction list:

#### V_BFREV_B32

Opcode VOP1: 56 (0x38) for GCN 1.0/1.1; 44 (0x2c) for GCN 1.2  
Opcode VOP3A: 440 (0x1b8) for GCN 1.0/1.1; 364 (0x16c) for GCN 1.2  
Syntax: V_BFREV_B32 VDST, SRC0  
Description: Reverse bits in SRC0 and store result to VDST.  
Operation:  
```
VDST = REVBIT(SRC0)
```
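
A minimal Python sketch of what REVBIT does on a 32-bit value; `bfrev_b32` is a
hypothetical helper name, not part of any assembler or library.

```python
def bfrev_b32(src0: int) -> int:
    """Reverse the order of the 32 bits of src0 (bit 0 becomes bit 31, and so on)."""
    src0 &= 0xFFFFFFFF
    # Format as exactly 32 binary digits, reverse the string, parse it back.
    return int(f"{src0:032b}"[::-1], 2)


print(hex(bfrev_b32(0x00000001)))
print(hex(bfrev_b32(0x0000000F)))
```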

#### V_CEIL_F16

Opcode VOP1: 69 (0x45) for GCN 1.2  
Opcode VOP3A: 389 (0x185) for GCN 1.2  
Syntax: V_CEIL_F16 VDST, SRC0  
Description: Round half floating point value from SRC0 toward positive infinity
(ceiling), and store result to VDST. Implemented by flooring.
If SRC0 is infinity or NaN then copy SRC0 to VDST.  
Operation:  
```
HALF F = FLOOR(ASHALF(SRC0))
if (ASHALF(SRC0) != F)
    F += 1.0
VDST = F
```

#### V_CEIL_F32

Opcode VOP1: 34 (0x22) for GCN 1.0/1.1; 29 (0x1d) for GCN 1.2  
Opcode VOP3A: 418 (0x1a2) for GCN 1.0/1.1; 349 (0x15d) for GCN 1.2  
Syntax: V_CEIL_F32 VDST, SRC0  
Description: Round floating point value from SRC0 toward positive infinity
(ceiling), and store result to VDST. Implemented by flooring.
If SRC0 is infinity or NaN then copy SRC0 to VDST.  
Operation:  
```
FLOAT F = FLOOR(ASFLOAT(SRC0))
if (ASFLOAT(SRC0) != F)
    F += 1.0
VDST = F
```

#### V_CEIL_F64

Opcode VOP1: 24 (0x18) for GCN 1.1/1.2  
Opcode VOP3A: 408 (0x198) for GCN 1.1; 344 (0x158) for GCN 1.2  
Syntax: V_CEIL_F64 VDST(2), SRC0(2)  
Description: Round double floating point value from SRC0 toward
positive infinity (ceiling), and store result to VDST. Implemented by flooring.
If SRC0 is infinity or NaN then copy SRC0 to VDST.  
Operation:  
```
DOUBLE F = FLOOR(ASDOUBLE(SRC0))
if (ASDOUBLE(SRC0) != F)
    F += 1.0
VDST = F
```
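
Ceiling built on top of flooring can be checked with a short Python sketch. The helper name
`ceil_by_floor` is hypothetical. The sketch adds 1.0 whenever the input is not already
integral, regardless of sign, which is what makes a negative fraction such as -1.5 round to
-1.0 rather than stay at its floor.

```python
import math


def ceil_by_floor(x: float) -> float:
    """Ceiling computed from floor; infinity and NaN are passed through unchanged."""
    if math.isnan(x) or math.isinf(x):
        return x
    f = float(math.floor(x))
    if x != f:          # not integral: the floor is one below the ceiling
        f += 1.0
    return f


for value in (1.5, -1.5, 2.0, -0.25):
    print(value, ceil_by_floor(value))
```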

#### V_CLREXCP

Opcode VOP1: 65 (0x41) for GCN 1.0/1.1; 53 (0x35) for GCN 1.2  
Opcode VOP3A: 449 (0x1c1) for GCN 1.0/1.1; 373 (0x175) for GCN 1.2  
Syntax: V_CLREXCP  
Description: Clear wave's exception state in SIMD.  

#### V_COS_F16

Opcode VOP1: 74 (0x4a) for GCN 1.2  
Opcode VOP3A: 394 (0x18a) for GCN 1.2  
Syntax: V_COS_F16 VDST, SRC0  
Description: Compute cosine of half FP value from SRC0.
Input value must be normalized to range -1.0 to 1.0 (-360 degree : 360 degree).
If SRC0 value is out of range then store 1.0 to VDST.
If SRC0 value is infinity, store -NAN to VDST.  
Operation:  
```
HALF SF = ASHALF(SRC0)
VDST = 1.0
if (SF >= -1.0 && SF <= 1.0)
    VDST = APPROX_COS(SF)
else if (ABS(SF)==INF_H)
    VDST = -NAN_H
else if (ISNAN(SF))
    VDST = SRC0
```

#### V_COS_F32

Opcode VOP1: 54 (0x36) for GCN 1.0/1.1; 42 (0x2a) for GCN 1.2  
Opcode VOP3A: 438 (0x1b6) for GCN 1.0/1.1; 362 (0x16a) for GCN 1.2  
Syntax: V_COS_F32 VDST, SRC0  
Description: Compute cosine of FP value from SRC0. Input value must be normalized to range
-1.0 to 1.0 (-360 degree : 360 degree). If SRC0 value is out of range then store 1.0 to VDST.
If SRC0 value is infinity, store -NAN to VDST.  
Operation:  
```
FLOAT SF = ASFLOAT(SRC0)
VDST = 1.0
if (SF >= -1.0 && SF <= 1.0)
    VDST = APPROX_COS(SF)
else if (ABS(SF)==INF)
    VDST = -NAN
else if (ISNAN(SF))
    VDST = SRC0
```
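
The case analysis for V_COS_F32 can be followed in Python. This is only a sketch: it
assumes that APPROX_COS takes its argument in full turns (1.0 = 360 degrees, as the degree
remark in the description suggests), so it is modelled here with `math.cos(2*pi*x)` at full
double precision. The name `v_cos_f32` is hypothetical.

```python
import math


def v_cos_f32(sf: float) -> float:
    """Model of the V_COS_F32 case analysis for an input given in turns."""
    if math.isnan(sf):
        return sf                      # NaN input is copied to the result
    if math.isinf(sf):
        return -math.nan               # infinity gives -NaN
    if -1.0 <= sf <= 1.0:
        return math.cos(2.0 * math.pi * sf)   # assumption: 1.0 == 360 degrees
    return 1.0                         # out-of-range input


print(v_cos_f32(0.5), v_cos_f32(2.0))
```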

#### V_CVT_F16_F32

Opcode VOP1: 10 (0xa)  
Opcode VOP3A: 394 (0x18a) for GCN 1.0/1.1; 330 (0x14a) for GCN 1.2  
Syntax: V_CVT_F16_F32 VDST, SRC0  
Description: Convert single FP value to half floating point value with rounding from
MODE register (single FP rounding mode for GCN 1.0, double FP rounding mode for GCN 1.2),
and store result to VDST. If absolute value is too high, then store -/+infinity to VDST.
In GCN 1.2, flushing of denormals is controlled by MODE. In GCN 1.0/1.1, denormals are enabled.  
Operation:  
```
VDST = CVTHALF(ASFLOAT(SRC0))
```

#### V_CVT_F16_I16

Opcode VOP1: 58 (0x3a) for GCN 1.2  
Opcode VOP3A: 378 (0x17a) for GCN 1.2  
Syntax: V_CVT_F16_I16 VDST, SRC0  
Description: Convert 16-bit signed value to half floating point value.  
Operation:  
```
VDST = (HALF)(INT16)SRC0
```

#### V_CVT_F16_U16

Opcode VOP1: 57 (0x39) for GCN 1.2  
Opcode VOP3A: 377 (0x179) for GCN 1.2  
Syntax: V_CVT_F16_U16 VDST, SRC0  
Description: Convert 16-bit unsigned value to half floating point value.  
Operation:  
```
VDST = (HALF)(SRC0&0xffff)
```

#### V_CVT_F32_F16

Opcode VOP1: 11 (0xb)  
Opcode VOP3A: 395 (0x18b) for GCN 1.0/1.1; 331 (0x14b) for GCN 1.2  
Syntax: V_CVT_F32_F16 VDST, SRC0  
Description: Convert half FP value to single FP value, and store result to VDST.
**By default, an immediate is in FP32 format!**
In GCN 1.2, flushing of denormals is controlled by MODE. In GCN 1.0/1.1, denormals are enabled.  
Operation:  
```
VDST = (FLOAT)(ASHALF(SRC0))
```
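
The difference between the FP16 and FP32 bit patterns of the same number, which is why the
format of an immediate matters for V_CVT_F32_F16, can be inspected with Python's `struct`
module. The helper names are hypothetical, and the sketch ignores denormal flushing.

```python
import struct


def half_bits_to_float(bits: int) -> float:
    """Interpret the low 16 bits as an IEEE half and widen it (ASHALF + cast)."""
    return struct.unpack("<e", struct.pack("<H", bits & 0xFFFF))[0]


def float_to_f32_bits(value: float) -> int:
    """Return the IEEE single-precision bit pattern of value."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


print(half_bits_to_float(0x3C00))        # half pattern of 1.0
print(hex(float_to_f32_bits(1.0)))       # single pattern of 1.0
```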

#### V_CVT_F32_F64

Opcode VOP1: 15 (0xf)  
Opcode VOP3A: 399 (0x18f) for GCN 1.0/1.1; 335 (0x14f) for GCN 1.2  
Syntax: V_CVT_F32_F64 VDST, SRC0(2)  
Description: Convert double FP value to single floating point value with rounding from
MODE register (single FP rounding mode), and store result to VDST.
If absolute value is too high, then store -/+infinity to VDST.  
Operation:  
```
VDST = (FLOAT)ASDOUBLE(SRC0)
```

#### V_CVT_F32_I32

Opcode VOP1: 5 (0x5)  
Opcode VOP3A: 389 (0x185) for GCN 1.0/1.1; 325 (0x145) for GCN 1.2  
Syntax: V_CVT_F32_I32 VDST, SRC0  
Description: Convert signed 32-bit integer to single FP value, and store it to VDST.  
Operation:  
```
VDST = (FLOAT)(INT32)SRC0
```

#### V_CVT_F32_U32

Opcode VOP1: 6 (0x6)  
Opcode VOP3A: 390 (0x186) for GCN 1.0/1.1; 326 (0x146) for GCN 1.2  
Syntax: V_CVT_F32_U32 VDST, SRC0  
Description: Convert unsigned 32-bit integer to single FP value, and store it to VDST.  
Operation:  
```
VDST = (FLOAT)SRC0
```

#### V_CVT_F32_UBYTE0

Opcode VOP1: 17 (0x11)  
Opcode VOP3A: 401 (0x191) for GCN 1.0/1.1; 337 (0x151) for GCN 1.2  
Syntax: V_CVT_F32_UBYTE0 VDST, SRC0  
Description: Convert the first unsigned 8-bit byte from SRC0 to single FP value,
and store it to VDST.  
Operation:  
```
VDST = (FLOAT)(SRC0 & 0xff)
```

#### V_CVT_F32_UBYTE1

Opcode VOP1: 18 (0x12)  
Opcode VOP3A: 402 (0x192) for GCN 1.0/1.1; 338 (0x152) for GCN 1.2  
Syntax: V_CVT_F32_UBYTE1 VDST, SRC0  
Description: Convert the second unsigned 8-bit byte from SRC0 to single FP value,
and store it to VDST.  
Operation:  
```
VDST = (FLOAT)((SRC0>>8) & 0xff)
```

#### V_CVT_F32_UBYTE2

Opcode VOP1: 19 (0x13)  
Opcode VOP3A: 403 (0x193) for GCN 1.0/1.1; 339 (0x153) for GCN 1.2  
Syntax: V_CVT_F32_UBYTE2 VDST, SRC0  
Description: Convert the third unsigned 8-bit byte from SRC0 to single FP value,
and store it to VDST.  
Operation:  
```
VDST = (FLOAT)((SRC0>>16) & 0xff)
```

#### V_CVT_F32_UBYTE3

Opcode VOP1: 20 (0x14)  
Opcode VOP3A: 404 (0x194) for GCN 1.0/1.1; 340 (0x154) for GCN 1.2  
Syntax: V_CVT_F32_UBYTE3 VDST, SRC0  
Description: Convert the fourth unsigned 8-bit byte from SRC0 to single FP value,
and store it to VDST.  
Operation:  
```
VDST = (FLOAT)(SRC0>>24)
```
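
The four V_CVT_F32_UBYTEn instructions differ only in which byte they select. A Python
sketch with a byte index parameter shows this; `cvt_f32_ubyte` is a hypothetical helper
name.

```python
def cvt_f32_ubyte(src0: int, index: int) -> float:
    """Convert byte `index` (0 = lowest) of a 32-bit value to a float."""
    if index not in (0, 1, 2, 3):
        raise ValueError("byte index must be 0..3")
    return float((src0 >> (8 * index)) & 0xFF)


src0 = 0x11223344
print([cvt_f32_ubyte(src0, i) for i in range(4)])
```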

#### V_CVT_F64_F32

Opcode VOP1: 16 (0x10)  
Opcode VOP3A: 400 (0x190) for GCN 1.0/1.1; 336 (0x150) for GCN 1.2  
Syntax: V_CVT_F64_F32 VDST(2), SRC0  
Description: Convert single FP value to double FP value, and store result to VDST.  
Operation:  
```
VDST = (DOUBLE)(ASFLOAT(SRC0))
```

#### V_CVT_F64_I32

Opcode VOP1: 4 (0x4)  
Opcode VOP3A: 388 (0x184) for GCN 1.0/1.1; 324 (0x144) for GCN 1.2  
Syntax: V_CVT_F64_I32 VDST(2), SRC0  
Description: Convert signed 32-bit integer to double FP value, and store it to VDST.  
Operation:  
```
VDST = (DOUBLE)(INT32)SRC0
```

#### V_CVT_F64_U32

Opcode VOP1: 22 (0x16)  
Opcode VOP3A: 406 (0x196) for GCN 1.0/1.1; 342 (0x156) for GCN 1.2  
Syntax: V_CVT_F64_U32 VDST(2), SRC0  
Description: Convert unsigned 32-bit integer to double FP value, and store it to VDST.  
Operation:  
```
VDST = (DOUBLE)SRC0
```

#### V_CVT_FLR_I32_F32

Opcode VOP1: 13 (0xd)  
Opcode VOP3A: 397 (0x18d) for GCN 1.0/1.1; 333 (0x14d) for GCN 1.2  
Syntax: V_CVT_FLR_I32_F32 VDST, SRC0  
Description: Convert 32-bit floating point value from SRC0 to signed 32-bit integer, and
store result to VDST. Conversion uses rounding to negative infinity (floor).
If value is higher/lower than maximal/minimal integer then store MAX_INT32/MIN_INT32 to VDST.
If input value is NaN/-NaN then store MAX_INT32/MIN_INT32 to VDST.  
Operation:  
```
FLOAT SF = ASFLOAT(SRC0)
if (!ISNAN(SF))
    VDST = (INT32)MAX(MIN(FLOOR(SF), 2147483647.0), -2147483648.0)
else
    VDST = (INT32)SRC0>=0 ? 2147483647 : -2147483648
```

#### V_CVT_I16_F16

Opcode VOP1: 60 (0x3c) for GCN 1.2  
Opcode VOP3A: 380 (0x17c) for GCN 1.2  
Syntax: V_CVT_I16_F16 VDST, SRC0  
Description: Convert 16-bit floating point value from SRC0 to signed 16-bit integer, and
store result to VDST. Conversion uses rounding to zero. If value is higher/lower than
maximal/minimal integer then store MAX_INT16/MIN_INT16 to VDST.
If input value is NaN then store 0 to VDST.  
Operation:  
```
VDST = 0
if (!ISNAN(ASHALF(SRC0)))
    VDST = (INT16)MAX(MIN(RNDTZINT(ASHALF(SRC0)), 32767.0), -32768.0)
```

#### V_CVT_I32_F32

Opcode VOP1: 8 (0x8)  
Opcode VOP3A: 392 (0x188) for GCN 1.0/1.1; 328 (0x148) for GCN 1.2  
Syntax: V_CVT_I32_F32 VDST, SRC0  
Description: Convert 32-bit floating point value from SRC0 to signed 32-bit integer, and
store result to VDST. Conversion uses rounding to zero. If value is higher/lower than
maximal/minimal integer then store MAX_INT32/MIN_INT32 to VDST.
If input value is NaN then store 0 to VDST.  
Operation:  
```
VDST = 0
if (!ISNAN(ASFLOAT(SRC0)))
    VDST = (INT32)MAX(MIN(RNDTZINT(ASFLOAT(SRC0)), 2147483647.0), -2147483648.0)
```
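
The saturating, round-to-zero conversion described for V_CVT_I32_F32 can be sketched in
Python as below. The name `cvt_i32_f32` is hypothetical; the sketch takes an
already-decoded float rather than raw register bits.

```python
import math

INT32_MAX = 2147483647
INT32_MIN = -2147483648


def cvt_i32_f32(sf: float) -> int:
    """Round toward zero, saturate to the INT32 range, map NaN to 0."""
    if math.isnan(sf):
        return 0
    if math.isinf(sf):                 # math.trunc cannot handle infinities
        return INT32_MAX if sf > 0 else INT32_MIN
    return max(min(math.trunc(sf), INT32_MAX), INT32_MIN)


print(cvt_i32_f32(-1.9), cvt_i32_f32(3.0e9), cvt_i32_f32(float("nan")))
```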

#### V_CVT_I32_F64

Opcode VOP1: 3 (0x3)  
Opcode VOP3A: 387 (0x183) for GCN 1.0/1.1; 323 (0x143) for GCN 1.2  
Syntax: V_CVT_I32_F64 VDST, SRC0(2)  
Description: Convert 64-bit floating point value from SRC0 to signed 32-bit integer, and
store result to VDST. Conversion uses rounding to zero. If value is higher/lower than
maximal/minimal integer then store MAX_INT32/MIN_INT32 to VDST.
If input value is NaN then store 0 to VDST.  
Operation:  
```
VDST = 0
if (!ISNAN(ASDOUBLE(SRC0)))
    VDST = (INT32)MAX(MIN(RNDTZINT(ASDOUBLE(SRC0)), 2147483647.0), -2147483648.0)
```

#### V_CVT_NORM_I16_F16

Opcode VOP1: 77 (0x4d) for GCN 1.4  
Opcode VOP3A: 397 (0x18d) for GCN 1.4  
Syntax: V_CVT_NORM_I16_F16 VDST, SRC0(2)  
Description: Convert 16-bit floating point value from SRC0 to signed normalized 16-bit value
by multiplying value by 32768.0 and converting it to a 16-bit signed integer, and
store result to VDST. Conversion depends on rounding mode.  
Operation:  
```
VDST = 0
if (!ISNAN(ASHALF(SRC0)))
    VDST = (INT16)MAX(MIN(RNDINT(ASHALF(SRC0)*32767.0), 32767.0), -32767.0)
```

#### V_CVT_NORM_U16_F16

Opcode VOP1: 78 (0x4e) for GCN 1.4  
Opcode VOP3A: 398 (0x18e) for GCN 1.4  
Syntax: V_CVT_NORM_U16_F16 VDST, SRC0(2)  
Description: Convert 16-bit floating point value from SRC0 to unsigned normalized
16-bit value by multiplying value by 65535.0 and converting it to a
16-bit unsigned integer, and store result to VDST. Probably rounds to +Infinity.  
Operation:  
```
VDST = 0
if (!ISNAN(ASHALF(SRC0)))
    VDST = (UINT16)MAX(MIN(RNDINT(ASHALF(SRC0)*65535.0), 65535.0), 0.0)
```

#### V_CVT_OFF_F32_I4

Opcode VOP1: 14 (0xe)  
Opcode VOP3A: 398 (0x18e) for GCN 1.0/1.1; 334 (0x14e) for GCN 1.2  
Syntax: V_CVT_OFF_F32_I4 VDST, SRC0  
Description: Convert 4-bit signed value from SRC0 to floating point value, normalize that
value to range -0.5:0.4375 and store result to VDST.  
Operation:  
```
VDST = (FLOAT)((SRC0 & 0xf) ^ 8) / 16.0 - 0.5
```
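
The XOR with 8 turns the 4-bit two's complement value into an offset in 0..15, which is why
the result covers -0.5 to 0.4375 in steps of 1/16. A Python sketch (the name
`cvt_off_f32_i4` is hypothetical) evaluates the formula at the range boundaries:

```python
def cvt_off_f32_i4(src0: int) -> float:
    """Evaluate ((SRC0 & 0xf) ^ 8) / 16.0 - 0.5 for the low 4 bits of src0."""
    return float((src0 & 0xF) ^ 8) / 16.0 - 0.5


# 0x8 encodes -8, 0xF encodes -1, 0x0 encodes 0, 0x7 encodes 7
for raw in (0x8, 0xF, 0x0, 0x7):
    print(hex(raw), cvt_off_f32_i4(raw))
```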

#### V_CVT_RPI_I32_F32

Opcode VOP1: 12 (0xc)  
Opcode VOP3A: 396 (0x18c) for GCN 1.0/1.1; 332 (0x14c) for GCN 1.2  
Syntax: V_CVT_RPI_I32_F32 VDST, SRC0  
Description: Convert 32-bit floating point value from SRC0 to signed 32-bit integer, and
store result to VDST. Conversion adds 0.5 to value and rounds toward negative infinity (floor).
If value is higher/lower than maximal/minimal integer then store MAX_INT32/MIN_INT32 to
VDST. If input value is NaN/-NaN then store MAX_INT32/MIN_INT32 to VDST.  
Operation:  
```
FLOAT SF = ASFLOAT(SRC0)
if (!ISNAN(SF))
    VDST = (INT32)MAX(MIN(FLOOR(SF + 0.5), 2147483647.0), -2147483648.0)
else
    VDST = (INT32)SRC0>=0 ? 2147483647 : -2147483648
```
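
A Python sketch of the V_CVT_RPI_I32_F32 behaviour, working on the raw 32-bit register
value so that the sign of a NaN can be read from bit 31. The helper names are hypothetical,
and the addition of 0.5 is done in double precision here, which is an approximation of the
hardware's single-precision arithmetic.

```python
import math
import struct

INT32_MAX = 2147483647
INT32_MIN = -2147483648


def f32_bits(value: float) -> int:
    """Single-precision bit pattern of value."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


def cvt_rpi_i32_f32(src0_bits: int) -> int:
    """floor(x + 0.5) with saturation; NaN maps to MAX/MIN depending on its sign bit."""
    src0_bits &= 0xFFFFFFFF
    sf = struct.unpack("<f", struct.pack("<I", src0_bits))[0]
    if math.isnan(sf):
        return INT32_MIN if src0_bits & 0x80000000 else INT32_MAX
    if math.isinf(sf):                 # math.floor cannot handle infinities
        return INT32_MAX if sf > 0 else INT32_MIN
    return max(min(math.floor(sf + 0.5), INT32_MAX), INT32_MIN)


print(cvt_rpi_i32_f32(f32_bits(2.5)), cvt_rpi_i32_f32(f32_bits(-2.5)))
```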

#### V_CVT_U16_F16

Opcode VOP1: 59 (0x3b) for GCN 1.2  
Opcode VOP3A: 379 (0x17b) for GCN 1.2  
Syntax: V_CVT_U16_F16 VDST, SRC0  
Description: Convert 16-bit half floating point value from SRC0 to unsigned 16-bit integer,
and store result to VDST. Conversion uses rounding to zero. If value is higher than
maximal integer then store MAX_UINT16 to VDST. If input value is NaN then store 0 to VDST.  
Operation:  
```
VDST = 0
if (!ISNAN(ASHALF(SRC0)))
    VDST = (UINT16)MIN(RNDTZINT(ASHALF(SRC0)), 65535.0)
```

#### V_CVT_U32_F32

Opcode VOP1: 7 (0x7)  
Opcode VOP3A: 391 (0x187) for GCN 1.0/1.1; 327 (0x147) for GCN 1.2  
Syntax: V_CVT_U32_F32 VDST, SRC0  
Description: Convert 32-bit floating point value from SRC0 to unsigned 32-bit integer, and
store result to VDST. Conversion uses rounding to zero. If value is higher than
maximal integer then store MAX_UINT32 to VDST.
If input value is NaN then store 0 to VDST.  
Operation:  
```
VDST = 0
if (!ISNAN(ASFLOAT(SRC0)))
    VDST = (UINT32)MIN(RNDTZINT(ASFLOAT(SRC0)), 4294967295.0)
```

#### V_CVT_U32_F64

Opcode VOP1: 21 (0x15)  
Opcode VOP3A: 405 (0x195) for GCN 1.0/1.1; 341 (0x155) for GCN 1.2  
Syntax: V_CVT_U32_F64 VDST, SRC0(2)  
Description: Convert 64-bit floating point value from SRC0 to unsigned 32-bit integer, and
store result to VDST. Conversion uses rounding to zero. If value is higher than
maximal integer then store MAX_UINT32 to VDST.
If input value is NaN then store 0 to VDST.  
Operation:  
```
VDST = 0
if (!ISNAN(ASDOUBLE(SRC0)))
    VDST = (UINT32)MIN(RNDTZINT(ASDOUBLE(SRC0)), 4294967295.0)
```

#### V_EXP_F16

Opcode VOP1: 65 (0x41) for GCN 1.2  
Opcode VOP3A: 385 (0x181) for GCN 1.2  
Syntax: V_EXP_F16 VDST, SRC0  
Description: Approximate power of two from half FP value SRC0 and store it to VDST.  
Operation:  
```
VDST = APPROX_POW2(ASHALF(SRC0))
```

#### V_EXP_F32

Opcode VOP1: 37 (0x25) for GCN 1.0/1.1; 32 (0x20) for GCN 1.2  
Opcode VOP3A: 421 (0x1a5) for GCN 1.0/1.1; 352 (0x160) for GCN 1.2  
Syntax: V_EXP_F32 VDST, SRC0  
Description: Approximate power of two from FP value SRC0 and store it to VDST. For values
smaller than -126.0 the instruction always returns 0, regardless of the float mode in the
MODE register.  
Operation:  
```
if (ASFLOAT(SRC0)>=-126.0)
    VDST = APPROX_POW2(ASFLOAT(SRC0))
else
    VDST = 0.0
```
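
The cut-off at -126.0 corresponds to the smallest normal single-precision exponent, so
results that would be denormal are returned as zero. A Python sketch that mirrors the
pseudocode (hypothetical name `v_exp_f32`; exact `2.0 ** x` stands in for APPROX_POW2, and
the overflow guard is an assumption added only to keep the sketch from raising):

```python
import math


def v_exp_f32(x: float) -> float:
    """2**x, with every input below -126.0 forced to 0.0."""
    if x >= 128.0:
        return math.inf     # assumption: results beyond the FP32 range become +infinity
    if x >= -126.0:
        return 2.0 ** x
    return 0.0


print(v_exp_f32(3.0), v_exp_f32(-126.0), v_exp_f32(-127.0))
```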

#### V_EXP_LEGACY_F32

Opcode VOP1: 70 (0x46) for GCN 1.1; 75 (0x4b) for GCN 1.2  
Opcode VOP3A: 454 (0x1c6) for GCN 1.1; 395 (0x18b) for GCN 1.2  
Syntax: V_EXP_LEGACY_F32 VDST, SRC0  
Description: Approximate power of two from FP value SRC0 and store it to VDST. For values
smaller than -126.0 the instruction always returns 0, regardless of the float mode in the
MODE register. In some cases this instruction returns a slightly less accurate result
than V_EXP_F32.  
Operation:  
```
if (ASFLOAT(SRC0)>=-126.0)
    VDST = APPROX_POW2(ASFLOAT(SRC0))
else
    VDST = 0.0
```

#### V_FFBH_U32

Opcode VOP1: 57 (0x39) for GCN 1.0/1.1; 45 (0x2d) for GCN 1.2  
Opcode VOP3A: 441 (0x1b9) for GCN 1.0/1.1; 365 (0x16d) for GCN 1.2  
Syntax: V_FFBH_U32 VDST, SRC0  
Description: Find the last (most significant) one bit in SRC0. If found, store the number
of bits skipped from the highest bit to VDST, otherwise set VDST to -1.  
Operation:  
```
VDST = -1
for (INT8 i = 31; i >= 0; i--)
    if (((1U<<i) & SRC0) != 0)
    { VDST = 31-i; break; }
```
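
The loop amounts to counting leading zeros, with -1 for a zero input. A compact Python
equivalent using `int.bit_length` (the name `ffbh_u32` is hypothetical):

```python
def ffbh_u32(src0: int) -> int:
    """Number of zero bits above the highest set bit, or -1 if no bit is set."""
    src0 &= 0xFFFFFFFF
    if src0 == 0:
        return -1
    # The highest set bit has index bit_length() - 1, so 31 - index == 32 - bit_length().
    return 32 - src0.bit_length()


print(ffbh_u32(0x00010000), ffbh_u32(0x80000000), ffbh_u32(0))
```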

#### V_FFBH_I32

Opcode VOP1: 59 (0x3b) for GCN 1.0/1.1; 47 (0x2f) for GCN 1.2  
Opcode VOP3A: 443 (0x1bb) for GCN 1.0/1.1; 367 (0x16f) for GCN 1.2  
Syntax: V_FFBH_I32 VDST, SRC0  
Description: Find the last (most significant) bit in SRC0 that is opposite to the sign bit.
If found, store the number of bits skipped from the highest bit to VDST, otherwise set
VDST to -1.  
Operation:  
```
VDST = -1
UINT32 bitval = (INT32)SRC0>=0 ? 1 : 0
for (INT8 i = 31; i >= 0; i--)
    if (((1U<<i) & SRC0) == (bitval<<i))
    { VDST = 31-i; break; }
```
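
For a negative input the search for the highest zero bit is the same as searching for the
highest one bit of the inverted value, which gives a short Python sketch (the name
`ffbh_i32` is hypothetical):

```python
def ffbh_i32(src0: int) -> int:
    """Position, counted from bit 31, of the highest bit that differs from the sign bit."""
    src0 &= 0xFFFFFFFF
    if src0 & 0x80000000:              # negative: look for the highest 0 bit
        src0 = ~src0 & 0xFFFFFFFF
    if src0 == 0:                      # all bits equal the sign bit (0 or -1)
        return -1
    return 32 - src0.bit_length()


print(ffbh_i32(0x40000000), ffbh_i32(0xFFFFFFFE), ffbh_i32(0xFFFFFFFF))
```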

#### V_FFBL_B32

Opcode VOP1: 58 (0x3a) for GCN 1.0/1.1; 46 (0x2e) for GCN 1.2  
Opcode VOP3A: 442 (0x1ba) for GCN 1.0/1.1; 366 (0x16e) for GCN 1.2  
Syntax: V_FFBL_B32 VDST, SRC0  
Description: Find the first (least significant) one bit in SRC0. If found, store the number
of that bit to VDST, otherwise set VDST to -1.  
Operation:  
```
VDST = -1
for (UINT8 i = 0; i < 32; i++)
    if (((1U<<i) & SRC0) != 0)
    { VDST = i; break; }
```
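
The same result can be obtained without a loop by isolating the lowest set bit with
`x & -x`. A Python sketch under that approach (the name `ffbl_b32` is hypothetical):

```python
def ffbl_b32(src0: int) -> int:
    """Index of the lowest set bit, or -1 if no bit is set."""
    src0 &= 0xFFFFFFFF
    if src0 == 0:
        return -1
    lowest = src0 & -src0              # keeps only the least significant set bit
    return lowest.bit_length() - 1


print(ffbl_b32(0b101000), ffbl_b32(0x80000000), ffbl_b32(0))
```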

#### V_FLOOR_F16

Opcode VOP1: 68 (0x44) for GCN 1.2  
Opcode VOP3A: 388 (0x184) for GCN 1.2  
Syntax: V_FLOOR_F16 VDST, SRC0  
Description: Round half floating point value SRC0 toward negative infinity
(flooring), and store result to VDST. If SRC0 is infinity or NaN then copy SRC0 to VDST.  
Operation:  
```
VDST = FLOOR(ASHALF(SRC0))
```

#### V_FLOOR_F32

Opcode VOP1: 36 (0x24) for GCN 1.0/1.1; 31 (0x1f) for GCN 1.2  
Opcode VOP3A: 420 (0x1a4) for GCN 1.0/1.1; 351 (0x15f) for GCN 1.2  
Syntax: V_FLOOR_F32 VDST, SRC0  
Description: Round floating point value SRC0 toward negative infinity
(flooring), and store result to VDST. If SRC0 is infinity or NaN then copy SRC0 to VDST.  
Operation:  
```
VDST = FLOOR(ASFLOAT(SRC0))
```

#### V_FLOOR_F64

Opcode VOP1: 26 (0x1a) for GCN 1.1/1.2  
Opcode VOP3A: 410 (0x19a) for GCN 1.1; 346 (0x15a) for GCN 1.2  
Syntax: V_FLOOR_F64 VDST(2), SRC0(2)  
Description: Round double floating point value SRC0 toward negative infinity
(flooring), and store result to VDST. If SRC0 is infinity or NaN then copy SRC0 to VDST.  
Operation:  
```
VDST = FLOOR(ASDOUBLE(SRC0))
```
822
823#### V_FRACT_F32
824
825Opcode VOP1: 32 (0x20) for GCN 1.0/1.1; 27 (0x1b) for GCN 1.2 
826Opcode VOP3A: 416 (0x1a0) for GCN 1.0/1.1; 347 (0x15b) for GCN 1.2 
827Syntax: V_FRACT_F32 VDST, SRC0 
Description: Get the fractional part of floating point value SRC0 and store it to VDST.
The fractional part is computed by subtracting floor(SRC0) from SRC0.
If SRC0 is infinity or NaN then NaN with proper sign is stored to VDST.  
Operation:  
```
FLOAT SF = ASFLOAT(SRC0)
if (!ISNAN(SF) && SF!=-INF && SF!=INF)
    VDST = SF - FLOOR(SF)
else
    VDST = NAN * SIGN(SF)
```
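
Because the fractional part is defined through floor, negative inputs give a positive result. The following Python sketch illustrates this; `v_fract_f32` is a hypothetical helper name, and it works in double precision rather than FP32, so it is not bit-exact.

```python
import math


def v_fract_f32(src0: float) -> float:
    """Return src0 - floor(src0); infinity and NaN give NaN with the input's sign."""
    if math.isnan(src0) or math.isinf(src0):
        return math.copysign(math.nan, src0)
    return src0 - math.floor(src0)


print(v_fract_f32(1.25))             # 0.25
print(v_fract_f32(-1.25))            # 0.75, since floor(-1.25) is -2
```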
839
840#### V_FRACT_F64
841
Opcode VOP1: 62 (0x3e) for GCN 1.0/1.1; 50 (0x32) for GCN 1.2  
Opcode VOP3A: 446 (0x1be) for GCN 1.0/1.1; 370 (0x172) for GCN 1.2  
Syntax: V_FRACT_F64 VDST(2), SRC0(2)  
Description: Get the fractional part of double floating point value SRC0 and store it to VDST.
The fractional part is computed by subtracting floor(SRC0) from SRC0.
If SRC0 is infinity or NaN then NaN with proper sign is stored to VDST.  
Operation:  
```
DOUBLE SD = ASDOUBLE(SRC0)
if (!ISNAN(SD) && SD!=-INF && SD!=INF)
    VDST = SD - FLOOR(SD)
else
    VDST = NAN * SIGN(SD)
```
856
857#### V_FREXP_EXP_I16_F16
858
859Opcode VOP1: 67 (0x43) for GCN 1.2 
860Opcode VOP3A: 387 (0x183) for GCN 1.2 
861Syntax: V_FREXP_EXP_I16_F16 VDST, SRC0 
862Description: Get exponent plus 1 from half FP value SRC0, and store that exponent to VDST
863as 16-bit signed integer. This instruction realizes frexp function.
864If SRC0 is infinity or NAN then store 0 to VDST. 
865Operation: 
866```
867HALF SF = ASHALF(SRC0)
868if (ABS(SF) != INF_H && !ISNAN(SF))
869    VDST = (INT16)FREXP_EXP(SF)
870else
871    VDST = 0
872```
873
874#### V_FREXP_EXP_I32_F32
875
876Opcode VOP1: 63 (0x3f) for GCN 1.0/1.1; 51 (0x33) for GCN 1.2 
877Opcode VOP3A: 447 (0x1bf) for GCN 1.0/1.1; 371 (0x173) for GCN 1.2 
878Syntax: V_FREXP_EXP_I32_F32 VDST, SRC0 
879Description: Get exponent plus 1 from single FP value SRC0, and store that exponent to VDST.
880This instruction realizes frexp function.
If SRC0 is infinity or NAN then store -1 (GCN 1.0) or 0 (later architectures) to VDST.  
882Operation: 
883```
884FLOAT SF = ASFLOAT(SRC0)
885if (ABS(SF) != INF && !ISNAN(SF))
886    VDST = FREXP_EXP(SF)
887else
888    VDST = -1 // GCN 1.0
889    VDST = 0 // later
890```
891
892#### V_FREXP_EXP_I32_F64
893
894Opcode VOP1: 60 (0x3c) for GCN 1.0/1.1; 48 (0x30) for GCN 1.2 
895Opcode VOP3A: 444 (0x1bc) for GCN 1.0/1.1; 368 (0x170) for GCN 1.2 
896Syntax: V_FREXP_EXP_I32_F64 VDST, SRC0(2) 
897Description: Get exponent plus 1 from double FP value SRC0, and store that exponent to VDST.
898This instruction realizes frexp function.
If SRC0 is infinity or NAN then store -1 (GCN 1.0) or 0 (later architectures) to VDST.  
900Operation: 
901```
902DOUBLE SD = ASDOUBLE(SRC0)
903if (ABS(SD) != INF && !ISNAN(SD))
904    VDST = FREXP_EXP(SD)
905else
906    VDST = -1 // GCN 1.0
907    VDST = 0 // later
908```
909
910#### V_FREXP_MANT_F16
911
912Opcode VOP1: 66 (0x42) for GCN 1.2 
913Opcode VOP3A: 386 (0x182) for GCN 1.2 
914Syntax: V_FREXP_MANT_F16 VDST, SRC0 
Description: Get mantissa from half FP value SRC0, and store it to VDST. Mantissa includes
sign of input.  
917Operation: 
918```
919HALF SF = ASHALF(SRC0)
920if (ABS(SF) == INF)
921    VDST = SF
922else if (!ISNAN(SF))
923    VDST = FREXP_MANT(SF) * SIGN(SF)
924else
925    VDST = NAN_H * SIGN(SF)
926```
927
928#### V_FREXP_MANT_F32
929
930Opcode VOP1: 64 (0x40) for GCN 1.0/1.1; 52 (0x34) for GCN 1.2 
931Opcode VOP3A: 448 (0x1c0) for GCN 1.0/1.1; 372 (0x174) for GCN 1.2 
932Syntax: V_FREXP_MANT_F32 VDST, SRC0 
Description: Get mantissa from single FP value SRC0, and store it to VDST. Mantissa includes
sign of input. For GCN 1.0, if SRC0 is infinity then store -NAN to VDST.  
935Operation: 
936```
937FLOAT SF = ASFLOAT(SRC0)
938if (ABS(SF) == INF)
939    VDST = -NAN // GCN 1.0
940    VDST = SF // later
941else if (!ISNAN(SF))
942    VDST = FREXP_MANT(SF) * SIGN(SF)
943else
944    VDST = NAN * SIGN(SF)
945```
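
The single-precision exponent and mantissa instructions together split a value the way C's `frexp` does. The sketch below assumes that `FREXP_EXP` and `FREXP_MANT` behave like Python's `math.frexp` (mantissa magnitude in [0.5, 1.0)); the helper names and the `gcn10` flag are hypothetical and exist only for this illustration.

```python
import math


def v_frexp_exp_i32_f32(src0: float, gcn10: bool = False) -> int:
    """Exponent part: -1 (GCN 1.0) or 0 (later) for infinity and NaN."""
    if math.isinf(src0) or math.isnan(src0):
        return -1 if gcn10 else 0
    return math.frexp(src0)[1]


def v_frexp_mant_f32(src0: float, gcn10: bool = False) -> float:
    """Mantissa part, carrying the sign of the input."""
    if math.isinf(src0):
        return math.copysign(math.nan, -1.0) if gcn10 else src0
    if math.isnan(src0):
        return math.copysign(math.nan, src0)
    return math.frexp(src0)[0]       # already signed, magnitude in [0.5, 1.0)


x = -12.0
mant, exp = v_frexp_mant_f32(x), v_frexp_exp_i32_f32(x)
print(mant, exp)                     # -0.75 4
print(math.ldexp(mant, exp) == x)    # True: mant * 2**exp rebuilds the input
```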
946
947#### V_FREXP_MANT_F64
948
949Opcode VOP1: 61 (0x3d) for GCN 1.0/1.1; 49 (0x31) for GCN 1.2 
950Opcode VOP3A: 445 (0x1bd) for GCN 1.0/1.1; 369 (0x171) for GCN 1.2 
951Syntax: V_FREXP_MANT_F64 VDST(2), SRC0(2) 
Description: Get mantissa from double FP value SRC0, and store it to VDST. Mantissa includes
sign of input. For GCN 1.0, if SRC0 is infinity then store -NAN to VDST.  
Operation:  
```
DOUBLE SD = ASDOUBLE(SRC0)
if (ABS(SD) == INF)
    VDST = -NAN // GCN 1.0
    VDST = SD // later
else if (!ISNAN(SD))
    VDST = FREXP_MANT(SD) * SIGN(SD)
else
    VDST = NAN * SIGN(SD)
```
965
966#### V_LOG_CLAMP_F32
967
968Opcode VOP1: 38 (0x26) for GCN 1.0/1.1 
969Opcode VOP3A: 422 (0x1a6) for GCN 1.0/1.1 
970Syntax: V_LOG_CLAMP_F32 VDST, SRC0 
Description: Approximate logarithm of base 2 from floating point value SRC0 with
clamping infinities to -MAX_FLOAT. Result is stored in VDST.
If SRC0 is negative then store -NaN to VDST. This instruction doesn't handle denormalized
values regardless of the FLOAT MODE register setup.  
Operation:  
```
FLOAT F = ASFLOAT(SRC0)
if (F==1.0)
    VDST = 0.0f
else if (F<0.0)
    VDST = -NaN
else
{
    VDST = APPROX_LOG2(F)
    if (ASFLOAT(VDST)==-INF)
        VDST = -MAX_FLOAT
}
```
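
A rough host-side model of the clamping behaviour, assuming `math.log2` as a stand-in for the hardware's `APPROX_LOG2` and ignoring denormal handling. `v_log_clamp_f32` and `MAX_FLOAT` are names chosen for this sketch.

```python
import math

MAX_FLOAT = 3.4028234663852886e38    # largest finite FP32 value


def v_log_clamp_f32(src0: float) -> float:
    """log2 with -INF clamped to -MAX_FLOAT; negative inputs give -NaN."""
    if src0 < 0.0:
        return math.copysign(math.nan, -1.0)
    if src0 == 0.0:
        result = -math.inf           # log2(0) is -INF
    else:
        result = math.log2(src0)
    if result == -math.inf:
        result = -MAX_FLOAT          # clamp instead of returning -INF
    return result


print(v_log_clamp_f32(8.0))          # 3.0
print(v_log_clamp_f32(0.0) == -MAX_FLOAT)
```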
989
990#### V_LOG_F16
991
992Opcode VOP1: 64 (0x40) for GCN 1.2 
993Opcode VOP3A: 384 (0x180) for GCN 1.2 
994Syntax: V_LOG_F16 VDST, SRC0 
995Description: Approximate logarithm of base 2 from half floating point value SRC0, and store
996result to VDST. If SRC0 is negative then store -NaN to VDST. 
997Operation: 
```
HALF F = ASHALF(SRC0)
if (F==1.0)
    VDST = 0.0h
else if (F<0.0)
    VDST = -NAN_H
else
    VDST = APPROX_LOG2(F)
```
1007
1008#### V_LOG_F32
1009
1010Opcode VOP1: 39 (0x27) for GCN 1.0/1.1; 33 (0x21) for GCN 1.2 
1011Opcode VOP3A: 423 (0x1a7) for GCN 1.0/1.1; 353 (0x161) for GCN 1.2 
1012Syntax: V_LOG_F32 VDST, SRC0 
Description: Approximate logarithm of base 2 from floating point value SRC0, and store
result to VDST. If SRC0 is negative then store -NaN to VDST.
This instruction doesn't handle denormalized values regardless of the FLOAT MODE register setup.  
Operation:  
```
FLOAT F = ASFLOAT(SRC0)
if (F==1.0)
    VDST = 0.0f
else if (F<0.0)
    VDST = -NaN
else
    VDST = APPROX_LOG2(F)
```
1026
1027#### V_LOG_LEGACY_F32
1028
1029Opcode VOP1: 69 (0x45) for GCN 1.1; 76 (0x4c) for GCN 1.2 
1030Opcode VOP3A: 453 (0x1c5) for GCN 1.1; 396 (0x18c) for GCN 1.2 
1031Syntax: V_LOG_LEGACY_F32 VDST, SRC0 
Description: Approximate logarithm of base 2 from floating point value SRC0, and store
result to VDST. If SRC0 is negative then store -NaN to VDST.
This instruction doesn't handle denormalized values regardless of the FLOAT MODE register setup.
This instruction returns slightly different results than V_LOG_F32.  
Operation:  
```
FLOAT F = ASFLOAT(SRC0)
if (F==1.0)
    VDST = 0.0f
else if (F<0.0)
    VDST = -NaN
else
    VDST = APPROX_LOG2(F)
```
1046
1047#### V_MOV_B32
1048
1049Opcode VOP1: 1 (0x1) 
1050Opcode VOP3A: 385 (0x181) for GCN 1.0/1.1; 321 (0x141) for GCN 1.2 
1051Syntax: V_MOV_B32 VDST, SRC0 
1052Description: Move SRC0 into VDST. 
1053Operation: 
1054```
1055VDST = SRC0
1056```
1057
1058#### V_MOV_FED_B32
1059
1060Opcode VOP1: 9 (0x9) 
1061Opcode VOP3A: 393 (0x189) for GCN 1.0/1.1; 329 (0x149) for GCN 1.2 
1062Syntax: V_MOV_FED_B32 VDST, SRC0 
Description: Introduce an EDC double error upon write to the destination VGPR without causing
an exception (???).
1065
1066#### V_MOVRELD_B32
1067
Opcode VOP1: 66 (0x42) for GCN 1.0/1.1; 54 (0x36) for GCN 1.2  
Opcode VOP3A: 450 (0x1c2) for GCN 1.0/1.1; 374 (0x176) for GCN 1.2  
1070Syntax: V_MOVRELD_B32 VDST, VSRC0 
1071Description: Move SRC0 to VGPR[VDST_NUMBER+M0]. 
1072Operation: 
1073```
1074VGPR[VDST_NUMBER+M0] = SRC0
1075```
1076
1077#### V_MOVRELS_B32
1078
Opcode VOP1: 67 (0x43) for GCN 1.0/1.1; 55 (0x37) for GCN 1.2  
Opcode VOP3A: 451 (0x1c3) for GCN 1.0/1.1; 375 (0x177) for GCN 1.2  
Syntax: V_MOVRELS_B32 VDST, VSRC0  
Description: Move VGPR[SRC0_NUMBER+M0] to VDST.  
1083Operation: 
1084```
1085VDST = VGPR[SRC0_NUMBER+M0]
1086```
1087
1088#### V_MOVRELSD_B32
1089
Opcode VOP1: 68 (0x44) for GCN 1.0/1.1; 56 (0x38) for GCN 1.2  
Opcode VOP3A: 452 (0x1c4) for GCN 1.0/1.1; 376 (0x178) for GCN 1.2  
Syntax: V_MOVRELSD_B32 VDST, VSRC0  
Description: Move VGPR[SRC0_NUMBER+M0] to VGPR[VDST_NUMBER+M0].  
1094Operation: 
1095```
1096VGPR[VDST_NUMBER+M0] = VGPR[SRC0_NUMBER+M0]
1097```
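
The three relative-move instructions differ only in which side of the move is offset by M0. A minimal sketch for a single lane, modelling that lane's VGPR file as a Python list; the function names and the list-based register file are assumptions made for illustration.

```python
def v_movreld_b32(vgpr: list, m0: int, vdst: int, src0_value: int) -> None:
    """Destination is relative: VGPR[vdst + M0] = SRC0."""
    vgpr[vdst + m0] = src0_value


def v_movrels_b32(vgpr: list, m0: int, vdst: int, vsrc0: int) -> None:
    """Source is relative: VDST = VGPR[vsrc0 + M0]."""
    vgpr[vdst] = vgpr[vsrc0 + m0]


def v_movrelsd_b32(vgpr: list, m0: int, vdst: int, vsrc0: int) -> None:
    """Both are relative: VGPR[vdst + M0] = VGPR[vsrc0 + M0]."""
    vgpr[vdst + m0] = vgpr[vsrc0 + m0]


regs = list(range(100, 108))         # v0..v7 hold 100..107
v_movrels_b32(regs, m0=2, vdst=0, vsrc0=1)
print(regs[0])                       # 103, the value read from v3
```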
1098
1099#### V_NOP
1100
1101Opcode VOP1: 0 (0x0) 
1102Opcode VOP3A: 384 (0x180) for GCN 1.0/1.1; 320 (0x140) for GCN 1.2 
1103Syntax: V_NOP 
1104Description: Do nothing.
1105
1106#### V_NOT_B32
1107
1108Opcode VOP1: 55 (0x37) for GCN 1.0/1.1; 43 (0x2b) for GCN 1.2 
1109Opcode VOP3A: 439 (0x1b7) for GCN 1.0/1.1; 363 (0x16b) for GCN 1.2 
1110Syntax: V_NOT_B32 VDST, SRC0 
1111Description: Do bitwise negation on 32-bit SRC0, and store result to VDST. 
1112Operation: 
1113```
1114VDST = ~SRC0
1115```
1116
1117#### V_RCP_CLAMP_F32
1118
1119Opcode VOP1: 40 (0x28) for GCN 1.0/1.1 
1120Opcode VOP3A: 424 (0x1a8) for GCN 1.0/1.1 
1121Syntax: V_RCP_CLAMP_F32 VDST, SRC0 
Description: Approximate reciprocal from floating point value SRC0 and store it to VDST.
Guaranteed error below 1ulp. Result is clamped to MAX_FLOAT including sign of a result.  
1124Operation: 
1125```
1126VDST = APPROX_RCP(ASFLOAT(SRC0))
1127if (ABS(ASFLOAT(VDST))==INF)
1128    VDST = SIGN(ASFLOAT(VDST)) * MAX_FLOAT
1129```
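
The clamp keeps the sign of the infinite result, so dividing by +0 and -0 gives opposite extremes. The sketch below assumes an exact double-precision `1.0 / x` in place of `APPROX_RCP`; `v_rcp_clamp_f32` and `MAX_FLOAT` are hypothetical names.

```python
import math

MAX_FLOAT = 3.4028234663852886e38    # largest finite FP32 value


def v_rcp_clamp_f32(src0: float) -> float:
    """Reciprocal with infinite results clamped to +/-MAX_FLOAT."""
    if src0 == 0.0:
        result = math.copysign(math.inf, src0)   # 1/+0 is +INF, 1/-0 is -INF
    else:
        result = 1.0 / src0
    if math.isinf(result):
        result = math.copysign(MAX_FLOAT, result)
    return result


print(v_rcp_clamp_f32(4.0))          # 0.25
print(v_rcp_clamp_f32(-0.0) == -MAX_FLOAT)
```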
1130
1131#### V_RCP_CLAMP_F64
1132
1133Opcode VOP1: 48 (0x30) for GCN 1.0/1.1 
1134Opcode VOP3A: 432 (0x1b0) for GCN 1.0/1.1 
1135Syntax: V_RCP_CLAMP_F64 VDST(2), SRC0(2) 
1136Description: Approximate reciprocal from double FP value SRC0 and store it to VDST.
1137Relative error of approximation is ~1e-8.
1138Result is clamped to MAX_DOUBLE value including sign of a result. 
1139Operation: 
1140```
1141VDST = APPROX_RCP(ASDOUBLE(SRC0))
1142if (ABS(ASDOUBLE(VDST))==INF)
1143    VDST = SIGN(ASDOUBLE(VDST)) * MAX_DOUBLE
1144```
1145
1146#### V_RCP_F16
1147
1148Opcode VOP1: 61 (0x3d) for GCN 1.2 
1149Opcode VOP3A: 381 (0x17d) for GCN 1.2 
1150Syntax: V_RCP_F16 VDST, SRC0 
Description: Approximate reciprocal from half floating point value SRC0 and
store it to VDST. Guaranteed error below 1ulp.  
1153Operation: 
1154```
1155VDST = APPROX_RCP(ASHALF(SRC0))
1156```
1157
1158#### V_RCP_F32
1159
1160Opcode VOP1: 42 (0x2a) for GCN 1.0/1.1; 34 (0x22) for GCN 1.2 
1161Opcode VOP3A: 426 (0x1aa) for GCN 1.0/1.1; 354 (0x162) for GCN 1.2 
1162Syntax: V_RCP_F32 VDST, SRC0 
Description: Approximate reciprocal from floating point value SRC0 and store it to VDST.
Guaranteed error below 1ulp.  
1165Operation: 
1166```
1167VDST = APPROX_RCP(ASFLOAT(SRC0))
1168```
1169
1170#### V_RCP_F64
1171
1172Opcode VOP1: 47 (0x2f) for GCN 1.0/1.1; 37 (0x25) for GCN 1.2 
1173Opcode VOP3A: 431 (0x1af) for GCN 1.0/1.1; 357 (0x165) for GCN 1.2 
1174Syntax: V_RCP_F64 VDST(2), SRC0(2) 
1175Description: Approximate reciprocal from double FP value SRC0 and store it to VDST.
1176Relative error of approximation is ~1e-8. 
1177Operation: 
1178```
1179VDST = APPROX_RCP(ASDOUBLE(SRC0))
1180```
1181
1182#### V_RCP_IFLAG_F32
1183
1184Opcode VOP1: 43 (0x2b) for GCN 1.0/1.1; 35 (0x23) for GCN 1.2 
1185Opcode VOP3A: 427 (0x1ab) for GCN 1.0/1.1; 355 (0x163) for GCN 1.2 
1186Syntax: V_RCP_IFLAG_F32 VDST, SRC0 
Description: Approximate reciprocal from floating point value SRC0 and store it to VDST.
Guaranteed error below 1ulp. This instruction signals integer division by zero, instead of
any floating point exception, when an error occurs.  
1190Operation: 
1191```
1192VDST = APPROX_RCP_IFLAG(ASFLOAT(SRC0))
1193```
1194
1195#### V_RCP_LEGACY_F32
1196
1197Opcode VOP1: 41 (0x29) for GCN 1.0/1.1 
1198Opcode VOP3A: 425 (0x1a9) for GCN 1.0/1.1 
1199Syntax: V_RCP_LEGACY_F32 VDST, SRC0 
Description: Approximate reciprocal from floating point value SRC0 and store it to VDST.
Guaranteed error below 1ulp. If SRC0 or VDST is zero or infinity then store 0 with proper
sign to VDST.  
1203Operation: 
1204```
1205FLOAT SF = ASFLOAT(SRC0)
1206if (ABS(SF)==0.0)
1207    VDST = SIGN(SF)*0.0
1208else
1209{
1210    VDST = APPROX_RCP(SF)
1211    if (ABS(ASFLOAT(VDST)) == INF)
1212        VDST = SIGN(SF)*0.0
1213}
1214```
1215
1216#### V_READFIRSTLANE_B32
1217
1218Opcode VOP1: 2 (0x2) 
1219Opcode VOP3A: 386 (0x182) for GCN 1.0/1.1; 322 (0x142) for GCN 1.2 
1220Syntax: V_READFIRSTLANE_B32 SDST, VSRC0 
Description: Copy one VSRC0 lane value to one SDST. Lane (thread id) is the first active lane id,
or the first lane id if all lanes are inactive. SSRC1 can be SGPR or M0. Ignores EXEC mask.  
Operation:  
```
UINT8 firstlane = 0
for (UINT8 i = 0; i < 64; i++)
    if (((1ULL<<i) & EXEC) != 0)
    { firstlane = i; break; }
SDST = VSRC0[firstlane]
```
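
The lane selection can be sketched in Python as follows, with the per-lane values of VSRC0 modelled as a 64-element list. Both the function name and that representation are assumptions made for this example.

```python
def v_readfirstlane_b32(vsrc0_lanes: list, exec_mask: int) -> int:
    """Return the VSRC0 value of the first active lane (lane 0 if EXEC is 0)."""
    exec_mask &= (1 << 64) - 1       # EXEC is a 64-bit mask
    first_lane = 0
    if exec_mask != 0:
        # Isolate the lowest set bit and convert it to a lane index.
        first_lane = (exec_mask & -exec_mask).bit_length() - 1
    return vsrc0_lanes[first_lane]


lanes = list(range(1000, 1064))      # lane i holds 1000 + i
print(v_readfirstlane_b32(lanes, 0b1000))   # 1003, lane 3 is the first active lane
```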

#### V_RNDNE_F16
1232
1233Opcode VOP1: 71 (0x47) for GCN 1.2 
1234Opcode VOP3A: 391 (0x187) for GCN 1.2 
1235Syntax: V_RNDNE_F16 VDST, SRC0 
1236Description: Round half floating point value SRC0 to nearest even integer,
1237and store result to VDST. If SRC0 is infinity or NaN then copy SRC0 to VDST. 
1238Operation: 
1239```
1240VDST = RNDNE(ASHALF(SRC0))
1241```
1242
1243#### V_RNDNE_F32
1244
1245Opcode VOP1: 35 (0x23) for GCN 1.0/1.1; 30 (0x1e) for GCN 1.2 
Opcode VOP3A: 419 (0x1a3) for GCN 1.0/1.1; 350 (0x15e) for GCN 1.2  
1247Syntax: V_RNDNE_F32 VDST, SRC0 
1248Description: Round floating point value SRC0 to nearest even integer, and store result to
1249VDST. If SRC0 is infinity or NaN then copy SRC0 to VDST. 
1250Operation: 
1251```
1252VDST = RNDNE(ASFLOAT(SRC0))
1253```
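
Rounding to nearest even only differs from ordinary rounding on exact ties. Python's built-in `round()` also rounds ties to even, so it can serve as a host-side reference; `v_rndne_f32` is a hypothetical helper name, and the sketch uses double precision.

```python
import math


def v_rndne_f32(src0: float) -> float:
    """Round to the nearest integer, ties to even; infinity and NaN pass through."""
    if math.isnan(src0) or math.isinf(src0):
        return src0
    # round() returns an int; copysign restores the sign of a zero result.
    return math.copysign(float(round(src0)), src0)


for x in (0.5, 1.5, 2.5, -2.5, 2.4):
    print(x, v_rndne_f32(x))         # ties 0.5 and 2.5 go to the even neighbours 0 and 2
```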
1254
1255#### V_RNDNE_F64
1256
1257Opcode VOP1: 25 (0x19) for GCN 1.1/1.2 
1258Opcode VOP3A: 409 (0x199) for GCN 1.1; 345 (0x159) for GCN 1.2 
1259Syntax: V_RNDNE_F64 VDST(2), SRC0(2) 
1260Description: Round double floating point value SRC0 to nearest even integer,
1261and store result to VDST. If SRC0 is infinity or NaN then copy SRC0 to VDST. 
1262Operation: 
1263```
1264VDST = RNDNE(ASDOUBLE(SRC0))
1265```
1266
1267#### V_RSQ_CLAMP_F32
1268
1269Opcode VOP1: 44 (0x2c) for GCN 1.0/1.1 
1270Opcode VOP3A: 428 (0x1ac) for GCN 1.0/1.1 
1271Syntax: V_RSQ_CLAMP_F32 VDST, SRC0 
1272Description: Approximate reciprocal square root from floating point value SRC0 with
1273clamping to MAX_FLOAT, and store result to VDST.
1274If SRC0 is negative value, store -NAN to VDST.
This instruction doesn't handle denormalized values regardless of the FLOAT MODE register setup.  
1276Operation: 
1277```
1278VDST = APPROX_RSQRT(ASFLOAT(SRC0))
1279if (ASFLOAT(VDST)==INF)
1280    VDST = MAX_FLOAT
1281```
1282
1283#### V_RSQ_CLAMP_F64
1284
1285Opcode VOP1: 50 (0x32) for GCN 1.0/1.1
1286Opcode VOP3A: 434 (0x1b2) for GCN 1.0/1.1
1287Syntax: V_RSQ_CLAMP_F64 VDST(2), SRC0(2) 
Description: Approximate reciprocal square root from double floating point value SRC0
with clamping to MAX_DOUBLE, and store it to VDST. If SRC0 is negative value,
store -NAN to VDST.  
1291Operation: 
1292```
1293VDST = APPROX_RSQRT(ASDOUBLE(SRC0))
1294if (ASDOUBLE(VDST)==INF)
1295    VDST = MAX_DOUBLE
1296```
1297
1298#### V_RSQ_F16
1299
1300Opcode VOP1: 63 (0x3f) for GCN 1.2 
1301Opcode VOP3A: 383 (0x17f) for GCN 1.2 
1302Syntax: V_RSQ_F16 VDST, SRC0 
1303Description: Approximate reciprocal square root from half floating point value SRC0 and
1304store it to VDST. If SRC0 is negative value, store -NAN to VDST. 
1305Operation: 
1306```
1307VDST = APPROX_RSQRT(ASHALF(SRC0))
1308```
1309
1310#### V_RSQ_F32
1311
1312Opcode VOP1: 46 (0x2e) for GCN 1.0/1.1; 36 (0x24) for GCN 1.2 
1313Opcode VOP3A: 430 (0x1ae) for GCN 1.0/1.1; 356 (0x164) for GCN 1.2 
1314Syntax: V_RSQ_F32 VDST, SRC0 
1315Description: Approximate reciprocal square root from floating point value SRC0 and
1316store it to VDST. If SRC0 is negative value, store -NAN to VDST.
This instruction doesn't handle denormalized values regardless of the FLOAT MODE register setup.  
1318Operation: 
1319```
1320VDST = APPROX_RSQRT(ASFLOAT(SRC0))
1321```
1322
1323#### V_RSQ_F64
1324
1325Opcode VOP1: 49 (0x31) for GCN 1.0/1.1; 38 (0x26) for GCN 1.2 
1326Opcode VOP3A: 433 (0x1b1) for GCN 1.0/1.1; 358 (0x166) for GCN 1.2 
1327Syntax: V_RSQ_F64 VDST(2), SRC0(2) 
1328Description: Approximate reciprocal square root from double floating point value SRC0 and
1329store it to VDST. If SRC0 is negative value, store -NAN to VDST. 
1330Operation: 
1331```
1332VDST = APPROX_RSQRT(ASDOUBLE(SRC0))
1333```
1334
1335#### V_RSQ_LEGACY_F32
1336
1337Opcode VOP1: 45 (0x2d) for GCN 1.0/1.1 
1338Opcode VOP3A: 429 (0x1ad) for GCN 1.0/1.1 
Syntax: V_RSQ_LEGACY_F32 VDST, SRC0  
1340Description: Approximate reciprocal square root from floating point value SRC0,
1341and store result to VDST. If SRC0 is negative value, store -NAN to VDST.
If result is infinity then store 0.0 to VDST.
This instruction doesn't handle denormalized values regardless of the FLOAT MODE register setup.  
1344Operation: 
1345```
1346VDST = APPROX_RSQRT(ASFLOAT(SRC0))
1347if (ASFLOAT(VDST)==INF)
1348    VDST = 0.0
1349```
1350
1351#### V_SAT_PK_U8_I16
1352
1353Opcode VOP1: 79 (0x4f) for GCN 1.4 
1354Opcode VOP3A: 399 (0x18f) for GCN 1.4 
1355Syntax: V_SAT_PK_U8_I16 VDST, SRC0 
Description: Saturate two packed signed 16-bit values in SRC0 to 8-bit unsigned values
and store them in the lower 16 bits of VDST.  
1358Operation: 
1359```
1360VDST = MAX(MIN((INT16)(SRC0&0xffff), 255), 0)
1361VDST |= MAX(MIN((INT16)(SRC0>>16), 255), 0) << 8
1362```
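
A short Python sketch of the packing, with explicit sign extension of each 16-bit half. The helper names are hypothetical and chosen only for this example.

```python
def _as_int16(value: int) -> int:
    """Interpret the low 16 bits of value as a signed 16-bit integer."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def _sat_u8(value: int) -> int:
    """Clamp a signed value to the unsigned 8-bit range 0..255."""
    return max(min(value, 255), 0)


def v_sat_pk_u8_i16(src0: int) -> int:
    """Saturate both 16-bit halves of SRC0 and pack them into the low 16 bits."""
    lo = _sat_u8(_as_int16(src0))
    hi = _sat_u8(_as_int16(src0 >> 16))
    return lo | (hi << 8)


print(hex(v_sat_pk_u8_i16(0x01000010)))   # 0xff10: 256 saturates to 255, 16 is kept
```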
1363
1364#### V_SCREEN_PARTITION_4SE_B32
1365
Opcode VOP1: 55 (0x37) for GCN 1.4  
Opcode VOP3A: 375 (0x177) for GCN 1.4  
Syntax: V_SCREEN_PARTITION_4SE_B32 VDST, SRC0  
Description: 4SE version of LUT instruction for screen partitioning/filtering (see more in
ISA manual). Take the lower 8 bits of SRC0, translate them through the table and store the
result to VDST.  
Operation:  
```
BYTE TABLE[256] = {
0x1, 0x3, 0x7, 0xf, 0x5, 0xf, 0xf, 0xf, 0x7, 0xf, 0xf, 0xf, 0xf, 0xf, 0xf, 0xf,
0xf, 0x2, 0x6, 0xe, 0xf, 0xa, 0xf, 0xf, 0xf, 0xb, 0xf, 0xf, 0xf, 0xf, 0xf, 0xf,
0xd, 0xf, 0x4, 0xc, 0xf, 0xf, 0x5, 0xf, 0xf, 0xf, 0xd, 0xf, 0xf, 0xf, 0xf, 0xf,
0x9, 0xb, 0xf, 0x8, 0xf, 0xf, 0xf, 0xa, 0xf, 0xf, 0xf, 0xe, 0xf, 0xf, 0xf, 0xf,
0xf, 0xf, 0xf, 0xf, 0x4, 0xc, 0xd, 0xf, 0x6, 0xf, 0xf, 0xf, 0xe, 0xf, 0xf, 0xf,
0xf, 0xf, 0xf, 0xf, 0xf, 0x8, 0x9, 0xb, 0xf, 0x9, 0x9, 0xf, 0xf, 0xd, 0xf, 0xf,
0xf, 0xf, 0xf, 0xf, 0x7, 0xf, 0x1, 0x3, 0xf, 0xf, 0x9, 0xf, 0xf, 0xf, 0xb, 0xf,
0xf, 0xf, 0xf, 0xf, 0x6, 0xe, 0xf, 0x2, 0x6, 0xf, 0xf, 0x6, 0xf, 0xf, 0xf, 0x7,
0xb, 0xf, 0xf, 0xf, 0xf, 0xf, 0xf, 0xf, 0x2, 0x3, 0xb, 0xf, 0xa, 0xf, 0xf, 0xf,
0xf, 0x7, 0xf, 0xf, 0xf, 0xf, 0xf, 0xf, 0xf, 0x1, 0x9, 0xd, 0xf, 0x5, 0xf, 0xf,
0xf, 0xf, 0xe, 0xf, 0xf, 0xf, 0xf, 0xf, 0xe, 0xf, 0x8, 0xc, 0xf, 0xf, 0xa, 0xf,
0xf, 0xf, 0xf, 0xd, 0xf, 0xf, 0xf, 0xf, 0x6, 0x7, 0xf, 0x4, 0xf, 0xf, 0xf, 0x5,
0x9, 0xf, 0xf, 0xf, 0xd, 0xf, 0xf, 0xf, 0xf, 0xf, 0xf, 0xf, 0x8, 0xc, 0xe, 0xf,
0xf, 0x6, 0x6, 0xf, 0xf, 0xe, 0xf, 0xf, 0xf, 0xf, 0xf, 0xf, 0xf, 0x4, 0x6, 0x7,
0xf, 0xf, 0x6, 0xf, 0xf, 0xf, 0x7, 0xf, 0xf, 0xf, 0xf, 0xf, 0xb, 0xf, 0x2, 0x3,
0x9, 0xf, 0xf, 0x9, 0xf, 0xf, 0xf, 0xb, 0xf, 0xf, 0xf, 0xf, 0x9, 0xd, 0xf, 0x1 }
VDST = TABLE[SRC0&0xff]
```
1391
1392#### V_SIN_F16
1393
1394Opcode VOP1: 73 (0x49) for GCN 1.2 
1395Opcode VOP3A: 393 (0x189) for GCN 1.2 
1396Syntax: V_SIN_F16 VDST, SRC0 
Description: Compute sine of half FP value from SRC0. Input value must be
normalized to range -1.0 to 1.0 (-360 degrees to 360 degrees).
If SRC0 value is out of range then store 0.0 to VDST.
If SRC0 value is infinity, store -NAN to VDST.  
1401Operation: 
1402```
1403HALF SF = ASHALF(SRC0)
1404VDST = 0.0
1405if (SF >= -1.0 && SF <= 1.0)
1406    VDST = APPROX_SIN(SF)
1407else if (ABS(SF)==INF_H)
1408    VDST = -NAN_H
1409else if (ISNAN(SF))
1410    VDST = SRC0
1411```
1412
1413#### V_SIN_F32
1414
1415Opcode VOP1: 53 (0x35) for GCN 1.0/1.1; 41 (0x29) for GCN 1.2 
1416Opcode VOP3A: 437 (0x1b5) for GCN 1.0/1.1; 361 (0x169) for GCN 1.2 
1417Syntax: V_SIN_F32 VDST, SRC0 
Description: Compute sine of FP value from SRC0. Input value must be normalized to range
-1.0 to 1.0 (-360 degrees to 360 degrees). If SRC0 value is out of range then store 0.0 to VDST.
If SRC0 value is infinity, store -NAN to VDST.  
1421Operation: 
1422```
1423FLOAT SF = ASFLOAT(SRC0)
1424VDST = 0.0
1425if (SF >= -1.0 && SF <= 1.0)
1426    VDST = APPROX_SIN(SF)
1427else if (ABS(SF)==INF)
1428    VDST = -NAN
1429else if (ISNAN(SF))
1430    VDST = SRC0
1431```
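
Since the input is normalized so that 1.0 corresponds to 360 degrees, a host-side model has to scale by 2*pi before calling an ordinary sine. The sketch below follows the description above literally; it assumes `math.sin` in place of `APPROX_SIN`, and `v_sin_f32` is a hypothetical helper name.

```python
import math


def v_sin_f32(src0: float) -> float:
    """Sine of a normalized angle (1.0 is a full turn); 0.0 outside [-1.0, 1.0]."""
    if math.isnan(src0):
        return src0                              # NaN is passed through
    if math.isinf(src0):
        return math.copysign(math.nan, -1.0)     # -NaN for infinite input
    if -1.0 <= src0 <= 1.0:
        return math.sin(src0 * 2.0 * math.pi)    # scale turns to radians
    return 0.0                                   # out of range


print(v_sin_f32(0.25))               # a quarter turn, i.e. 90 degrees
print(v_sin_f32(2.0))                # out of range
```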
1432
1433#### V_SQRT_F16
1434
1435Opcode VOP1: 62 (0x3e) for GCN 1.2 
1436Opcode VOP3A: 382 (0x17e) for GCN 1.2 
1437Syntax: V_SQRT_F16 VDST, SRC0 
1438Description: Compute square root of half floating point value SRC0, and
1439store result to VDST. If SRC0 is negative value then store -NaN to VDST. 
1440Operation: 
1441```
1442if (ASHALF(SRC0)>=0.0)
1443    VDST = APPROX_SQRT(ASHALF(SRC0))
1444else
1445    VDST = -NAN_H
1446```
1447
1448#### V_SQRT_F32
1449
1450Opcode VOP1: 51 (0x33) for GCN 1.0/1.1; 39 (0x27) for GCN 1.2 
1451Opcode VOP3A: 435 (0x1b3) for GCN 1.0/1.1; 359 (0x167) for GCN 1.2 
1452Syntax: V_SQRT_F32 VDST, SRC0 
1453Description: Compute square root of floating point value SRC0, and store result to VDST.
1454If SRC0 is negative value then store -NaN to VDST. 
1455Operation: 
1456```
1457if (ASFLOAT(SRC0)>=0.0)
1458    VDST = APPROX_SQRT(ASFLOAT(SRC0))
1459else
1460    VDST = -NAN
1461```
1462
1463#### V_SQRT_F64
1464
1465Opcode VOP1: 52 (0x34) for GCN 1.0/1.1; 40 (0x28) for GCN 1.2 
1466Opcode VOP3A: 436 (0x1b4) for GCN 1.0/1.1; 360 (0x168) for GCN 1.2 
1467Syntax: V_SQRT_F64 VDST(2), SRC0(2) 
1468Description: Compute square root of double floating point value SRC0, and store result
1469to VDST. Relative error of approximation is ~1e-8.
1470If SRC0 is negative value then store -NaN to VDST. 
1471Operation: 
1472```
1473if (ASDOUBLE(SRC0)>=0.0)
1474    VDST = APPROX_SQRT(ASDOUBLE(SRC0))
1475else
1476    VDST = -NAN
1477```
1478
1479#### V_SWAP_B32
1480
1481Opcode VOP1: 81 (0x51) for GCN 1.4 
1482Opcode VOP3A: 401 (0x191) for GCN 1.4 
1483Syntax: V_SWAP_B32 VDST, SRC0 
Description: Swap SRC0 and VDST.  
Operation:  
1485```
1486UINT32 TMP = VDST
1487VDST = SRC0
1488SRC0 = TMP
1489```
1490
1491#### V_TRUNC_F16
1492
1493Opcode VOP1: 70 (0x46) for GCN 1.2 
1494Opcode VOP3A: 390 (0x186) for GCN 1.2 
1495Syntax: V_TRUNC_F16 VDST, SRC0 
1496Description: Get integer value from half floating point value SRC0, and store (as half)
1497it to VDST. If SRC0 is infinity or NaN then copy SRC0 to VDST. 
1498Operation: 
1499```
1500VDST = RNDTZ(ASHALF(SRC0))
1501```
1502
1503#### V_TRUNC_F32
1504
1505Opcode VOP1: 33 (0x21) for GCN 1.0/1.1; 28 (0x1c) for GCN 1.2 
1506Opcode VOP3A: 417 (0x1a1) for GCN 1.0/1.1; 348 (0x15c) for GCN 1.2 
1507Syntax: V_TRUNC_F32 VDST, SRC0 
1508Description: Get integer value from floating point value SRC0, and store (as float)
1509it to VDST. If SRC0 is infinity or NaN then copy SRC0 to VDST. 
1510Operation: 
1511```
1512VDST = RNDTZ(ASFLOAT(SRC0))
1513```
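
Truncation (RNDTZ) and flooring agree for positive inputs and differ for negative non-integers. The following Python sketch compares the two; the helper names are hypothetical, and double precision is used instead of FP32.

```python
import math


def v_trunc_f32(src0: float) -> float:
    """Round toward zero; infinity and NaN pass through unchanged."""
    if math.isnan(src0) or math.isinf(src0):
        return src0
    return math.copysign(float(math.trunc(src0)), src0)


def v_floor_f32(src0: float) -> float:
    """Round toward negative infinity; infinity and NaN pass through unchanged."""
    if math.isnan(src0) or math.isinf(src0):
        return src0
    return float(math.floor(src0))


for x in (2.7, -2.7):
    print(x, v_trunc_f32(x), v_floor_f32(x))   # -2.7 truncates to -2.0 but floors to -3.0
```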
1514
1515#### V_TRUNC_F64
1516
1517Opcode VOP1: 23 (0x17) for GCN 1.1/1.2 
1518Opcode VOP3A: 407 (0x197) for GCN 1.1; 343 (0x157) for GCN 1.2 
1519Syntax: V_TRUNC_F64 VDST(2), SRC0(2) 
Description: Get integer value from double floating point value SRC0, and store (as double)
it to VDST. If SRC0 is infinity or NaN then copy SRC0 to VDST.  
1522Operation: 
1523```
1524VDST = RNDTZ(ASDOUBLE(SRC0))
1525```