source: CLRX/CLRadeonExtender/trunk/doc/GcnInstrsVop1.md @ 3150

Last change on this file since 3150 was 3150, checked in by matszpk, 2 years ago

CLRadeonExtender: CLRX: small typos. GalliumBinary/GalliumDisasm?: Add support new LLVM 3.9.0 (spilledGPR info) and new Mesa3D 17.0.

## GCN ISA VOP1/VOP3 instructions

VOP1 instructions can be encoded in the VOP1 encoding and the VOP3A/VOP3B encoding.
List of fields for VOP1 encoding:

Bits  | Name     | Description
------|----------|------------------------------
0-8   | SRC0     | First (scalar or vector) source operand
9-16  | OPCODE   | Operation code
17-24 | VDST     | Destination vector operand
25-31 | ENCODING | Encoding type. Must be 0b0111111

Syntax: INSTRUCTION VDST, SRC0
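
As an illustration of the VOP1 field layout, the following Python sketch packs a 32-bit instruction word. The helper name `encode_vop1` is hypothetical, and the example assumes the usual GCN operand numbering in which VGPR `vN` is encoded as `256 + N` in SRC0 and as `N` in VDST; that numbering is not stated in this section.

```python
VOP1_ENCODING = 0b0111111  # bits 25-31, taken from the VOP1 field table

def encode_vop1(opcode, vdst, src0):
    """Pack VOP1 fields into a 32-bit instruction word (illustrative helper)."""
    assert 0 <= src0 < (1 << 9)     # SRC0 occupies bits 0-8
    assert 0 <= opcode < (1 << 8)   # OPCODE occupies bits 9-16
    assert 0 <= vdst < (1 << 8)     # VDST occupies bits 17-24
    return (VOP1_ENCODING << 25) | (vdst << 17) | (opcode << 9) | src0

# V_MOV_B32 v0, v1: opcode 1, VDST = 0, SRC0 = 256 + 1 (assumed VGPR numbering)
word = encode_vop1(opcode=1, vdst=0, src0=256 + 1)
print(hex(word))  # 0x7e000301
```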

List of fields for VOP3A/VOP3B encoding (GCN 1.0/1.1):

Bits  | Name     | Description
------|----------|------------------------------
0-7   | VDST     | Vector destination operand
8-10  | ABS      | Absolute modifiers for source operands (VOP3A)
8-14  | SDST     | Scalar destination operand (VOP3B)
11    | CLAMP    | CLAMP modifier (VOP3A)
15    | CLAMP    | CLAMP modifier (VOP3B)
17-25 | OPCODE   | Operation code
26-31 | ENCODING | Encoding type. Must be 0b110100
32-40 | SRC0     | First (scalar or vector) source operand
41-49 | SRC1     | Second (scalar or vector) source operand
50-58 | SRC2     | Third (scalar or vector) source operand
59-60 | OMOD     | OMOD modifier. Multiplication modifier
61-63 | NEG      | Negation modifier for source operands

List of fields for VOP3A/VOP3B encoding (GCN 1.2):

Bits  | Name     | Description
------|----------|------------------------------
0-7   | VDST     | Destination vector operand
8-10  | ABS      | Absolute modifiers for source operands (VOP3A)
8-14  | SDST     | Scalar destination operand (VOP3B)
15    | CLAMP    | CLAMP modifier
16-25 | OPCODE   | Operation code
26-31 | ENCODING | Encoding type. Must be 0b110100
32-40 | SRC0     | First (scalar or vector) source operand
41-49 | SRC1     | Second (scalar or vector) source operand
50-58 | SRC2     | Third (scalar or vector) source operand
59-60 | OMOD     | OMOD modifier. Multiplication modifier
61-63 | NEG      | Negation modifier for source operands

Syntax: INSTRUCTION VDST, SRC0 [MODIFIERS]

Modifiers:

* CLAMP - clamps the destination floating point value to the range 0.0-1.0
* MUL:2, MUL:4, DIV:2 - OMOD modifiers. Multiply the destination floating point value by
2.0, 4.0 or 0.5 respectively. Clamping is applied after the OMOD modifier.
* -SRC - negate the floating point value from the source operand. Applied after the ABS modifier.
* ABS(SRC), |SRC| - apply absolute value to the source operand
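
The order in which the modifiers are applied (ABS before negation on sources, OMOD before CLAMP on the destination) can be sketched in Python as follows. The function names are hypothetical, and the sketch models values as Python floats rather than GCN register contents.

```python
def apply_source_modifiers(value, use_abs=False, negate=False):
    """Apply ABS first, then negation, as described for source operands."""
    if use_abs:
        value = abs(value)
    if negate:
        value = -value
    return value

def apply_dest_modifiers(value, omod=1.0, clamp=False):
    """Apply OMOD (2.0, 4.0 or 0.5) first, then CLAMP to the range 0.0-1.0."""
    value = value * omod
    if clamp:
        value = min(max(value, 0.0), 1.0)
    return value

# -ABS(V0) with V0 = -3.0 yields -3.0; MUL:4 with CLAMP saturates 0.5 to 1.0
print(apply_source_modifiers(-3.0, use_abs=True, negate=True))
print(apply_dest_modifiers(0.5, omod=4.0, clamp=True))
```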

NOTE: The OMOD modifier doesn't work if output denormals are allowed
(bit 5 of the MODE register for single precision or bit 7 for double precision).  
NOTE: The OMOD and CLAMP modifiers affect only instructions whose output is
a floating point value.  
NOTE: ABS and negation are applied to the source operand for any instruction.  
NOTE: The OMOD modifier doesn't work for half precision (FP16) instructions.

Negation and absolute value can be combined: `-ABS(V0)`. Modifiers CLAMP and
OMOD (MUL:2, MUL:4 and DIV:2) can be given in any order.

Limitations for operands:

* only one SGPR can be read by an instruction. Multiple occurrences of the same
SGPR are allowed
* only one literal constant can be used, and only when an SGPR or M0 is not used in
source operands
* only SRC0 can hold LDS_DIRECT

Unaligned pairs of SGPRs are allowed in source operands.

VOP1 opcodes (0-127) are reflected in VOP3 in range: 384-511 for GCN 1.0/1.1 or
320-447 for GCN 1.2.
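
The mapping between the two opcode ranges is a fixed offset, which the following Python sketch makes explicit. The function name `vop1_to_vop3_opcode` and the `gcn12` flag are hypothetical names chosen for illustration.

```python
def vop1_to_vop3_opcode(vop1_opcode, gcn12=False):
    """Map a VOP1 opcode (0-127) to its VOP3 opcode for the chosen GCN version."""
    if not 0 <= vop1_opcode <= 127:
        raise ValueError("VOP1 opcode must be in range 0-127")
    base = 320 if gcn12 else 384  # 320 for GCN 1.2, 384 for GCN 1.0/1.1
    return base + vop1_opcode

# V_MOV_B32 has VOP1 opcode 1
print(vop1_to_vop3_opcode(1))              # 385
print(vop1_to_vop3_opcode(1, gcn12=True))  # 321
```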
80
81List of the instructions by opcode (GCN 1.0/1.1):
82
83 Opcode     | Opcode(VOP3)|GCN 1.0|GCN 1.1| Mnemonic
84------------|-------------|-------|-------|-----------------------------
85 0 (0x0)    | 384 (0x180) |   ✓   |   ✓   | V_NOP
86 1 (0x1)    | 385 (0x181) |   ✓   |   ✓   | V_MOV_B32
87 2 (0x2)    | 386 (0x182) |   ✓   |   ✓   | V_READFIRSTLANE_B32
88 3 (0x3)    | 387 (0x183) |   ✓   |   ✓   | V_CVT_I32_F64
89 4 (0x4)    | 388 (0x184) |   ✓   |   ✓   | V_CVT_F64_I32
90 5 (0x5)    | 389 (0x185) |   ✓   |   ✓   | V_CVT_F32_I32
91 6 (0x6)    | 390 (0x186) |   ✓   |   ✓   | V_CVT_F32_U32
92 7 (0x7)    | 391 (0x187) |   ✓   |   ✓   | V_CVT_U32_F32
93 8 (0x8)    | 392 (0x188) |   ✓   |   ✓   | V_CVT_I32_F32
94 9 (0x9)    | 393 (0x189) |   ✓   |   ✓   | V_MOV_FED_B32
95 10 (0xa)   | 394 (0x18a) |   ✓   |   ✓   | V_CVT_F16_F32
96 11 (0xb)   | 395 (0x18b) |   ✓   |   ✓   | V_CVT_F32_F16
97 12 (0xc)   | 396 (0x18c) |   ✓   |   ✓   | V_CVT_RPI_I32_F32
98 13 (0xd)   | 397 (0x18d) |   ✓   |   ✓   | V_CVT_FLR_I32_F32
99 14 (0xe)   | 398 (0x18e) |   ✓   |   ✓   | V_CVT_OFF_F32_I4
100 15 (0xf)   | 399 (0x18f) |   ✓   |   ✓   | V_CVT_F32_F64
101 16 (0x10)  | 400 (0x190) |   ✓   |   ✓   | V_CVT_F64_F32
102 17 (0x11)  | 401 (0x191) |   ✓   |   ✓   | V_CVT_F32_UBYTE0
103 18 (0x12)  | 402 (0x192) |   ✓   |   ✓   | V_CVT_F32_UBYTE1
104 19 (0x13)  | 403 (0x193) |   ✓   |   ✓   | V_CVT_F32_UBYTE2
105 20 (0x14)  | 404 (0x194) |   ✓   |   ✓   | V_CVT_F32_UBYTE3
106 21 (0x15)  | 405 (0x195) |   ✓   |   ✓   | V_CVT_U32_F64
107 22 (0x16)  | 406 (0x196) |   ✓   |   ✓   | V_CVT_F64_U32
108 23 (0x17)  | 407 (0x197) |       |   ✓   | V_TRUNC_F64
109 24 (0x18)  | 408 (0x198) |       |   ✓   | V_CEIL_F64
110 25 (0x19)  | 409 (0x199) |       |   ✓   | V_RNDNE_F64
111 26 (0x1a)  | 410 (0x19a) |       |   ✓   | V_FLOOR_F64
112 32 (0x20)  | 416 (0x1a0) |   ✓   |   ✓   | V_FRACT_F32
113 33 (0x21)  | 417 (0x1a1) |   ✓   |   ✓   | V_TRUNC_F32
114 34 (0x22)  | 418 (0x1a2) |   ✓   |   ✓   | V_CEIL_F32
115 35 (0x23)  | 419 (0x1a3) |   ✓   |   ✓   | V_RNDNE_F32
116 36 (0x24)  | 420 (0x1a4) |   ✓   |   ✓   | V_FLOOR_F32
117 37 (0x25)  | 421 (0x1a5) |   ✓   |   ✓   | V_EXP_F32
118 38 (0x26)  | 422 (0x1a6) |   ✓   |   ✓   | V_LOG_CLAMP_F32
119 39 (0x27)  | 423 (0x1a7) |   ✓   |   ✓   | V_LOG_F32
120 40 (0x28)  | 424 (0x1a8) |   ✓   |   ✓   | V_RCP_CLAMP_F32
121 41 (0x29)  | 425 (0x1a9) |   ✓   |   ✓   | V_RCP_LEGACY_F32
122 42 (0x2a)  | 426 (0x1aa) |   ✓   |   ✓   | V_RCP_F32
123 43 (0x2b)  | 427 (0x1ab) |   ✓   |   ✓   | V_RCP_IFLAG_F32
124 44 (0x2c)  | 428 (0x1ac) |   ✓   |   ✓   | V_RSQ_CLAMP_F32
125 45 (0x2d)  | 429 (0x1ad) |   ✓   |   ✓   | V_RSQ_LEGACY_F32
126 46 (0x2e)  | 430 (0x1ae) |   ✓   |   ✓   | V_RSQ_F32
127 47 (0x2f)  | 431 (0x1af) |   ✓   |   ✓   | V_RCP_F64
128 48 (0x30)  | 432 (0x1b0) |   ✓   |   ✓   | V_RCP_CLAMP_F64
129 49 (0x31)  | 433 (0x1b1) |   ✓   |   ✓   | V_RSQ_F64
130 50 (0x32)  | 434 (0x1b2) |   ✓   |   ✓   | V_RSQ_CLAMP_F64
131 51 (0x33)  | 435 (0x1b3) |   ✓   |   ✓   | V_SQRT_F32
132 52 (0x34)  | 436 (0x1b4) |   ✓   |   ✓   | V_SQRT_F64
133 53 (0x35)  | 437 (0x1b5) |   ✓   |   ✓   | V_SIN_F32
134 54 (0x36)  | 438 (0x1b6) |   ✓   |   ✓   | V_COS_F32
135 55 (0x37)  | 439 (0x1b7) |   ✓   |   ✓   | V_NOT_B32
136 56 (0x38)  | 440 (0x1b8) |   ✓   |   ✓   | V_BFREV_B32
137 57 (0x39)  | 441 (0x1b9) |   ✓   |   ✓   | V_FFBH_U32
138 58 (0x3a)  | 442 (0x1ba) |   ✓   |   ✓   | V_FFBL_B32
139 59 (0x3b)  | 443 (0x1bb) |   ✓   |   ✓   | V_FFBH_I32
140 60 (0x3c)  | 444 (0x1bc) |   ✓   |   ✓   | V_FREXP_EXP_I32_F64
141 61 (0x3d)  | 445 (0x1bd) |   ✓   |   ✓   | V_FREXP_MANT_F64
142 62 (0x3e)  | 446 (0x1be) |   ✓   |   ✓   | V_FRACT_F64
143 63 (0x3f)  | 447 (0x1bf) |   ✓   |   ✓   | V_FREXP_EXP_I32_F32
144 64 (0x40)  | 448 (0x1c0) |   ✓   |   ✓   | V_FREXP_MANT_F32
145 65 (0x41)  | 449 (0x1c1) |   ✓   |   ✓   | V_CLREXCP
146 66 (0x42)  | 450 (0x1c2) |   ✓   |   ✓   | V_MOVRELD_B32
147 67 (0x43)  | 451 (0x1c3) |   ✓   |   ✓   | V_MOVRELS_B32
148 68 (0x44)  | 452 (0x1c4) |   ✓   |   ✓   | V_MOVRELSD_B32
149 69 (0x45)  | 453 (0x1c5) |       |   ✓   | V_LOG_LEGACY_F32
150 70 (0x46)  | 454 (0x1c6) |       |   ✓   | V_EXP_LEGACY_F32
151
152List of the instructions by opcode (GCN 1.2):
153
154 Opcode     | Opcode(VOP3)| Mnemonic
155------------|-------------|-----------------------------
156 0 (0x0)    | 320 (0x140) | V_NOP
157 1 (0x1)    | 321 (0x141) | V_MOV_B32
158 2 (0x2)    | 322 (0x142) | V_READFIRSTLANE_B32
159 3 (0x3)    | 323 (0x143) | V_CVT_I32_F64
160 4 (0x4)    | 324 (0x144) | V_CVT_F64_I32
161 5 (0x5)    | 325 (0x145) | V_CVT_F32_I32
162 6 (0x6)    | 326 (0x146) | V_CVT_F32_U32
163 7 (0x7)    | 327 (0x147) | V_CVT_U32_F32
164 8 (0x8)    | 328 (0x148) | V_CVT_I32_F32
165 9 (0x9)    | 329 (0x149) | V_MOV_FED_B32
166 10 (0xa)   | 330 (0x14a) | V_CVT_F16_F32
167 11 (0xb)   | 331 (0x14b) | V_CVT_F32_F16
168 12 (0xc)   | 332 (0x14c) | V_CVT_RPI_I32_F32
169 13 (0xd)   | 333 (0x14d) | V_CVT_FLR_I32_F32
170 14 (0xe)   | 334 (0x14e) | V_CVT_OFF_F32_I4
171 15 (0xf)   | 335 (0x14f) | V_CVT_F32_F64
172 16 (0x10)  | 336 (0x150) | V_CVT_F64_F32
173 17 (0x11)  | 337 (0x151) | V_CVT_F32_UBYTE0
174 18 (0x12)  | 338 (0x152) | V_CVT_F32_UBYTE1
175 19 (0x13)  | 339 (0x153) | V_CVT_F32_UBYTE2
176 20 (0x14)  | 340 (0x154) | V_CVT_F32_UBYTE3
177 21 (0x15)  | 341 (0x155) | V_CVT_U32_F64
178 22 (0x16)  | 342 (0x156) | V_CVT_F64_U32
179 23 (0x17)  | 343 (0x157) | V_TRUNC_F64
180 24 (0x18)  | 344 (0x158) | V_CEIL_F64
181 25 (0x19)  | 345 (0x159) | V_RNDNE_F64
182 26 (0x1a)  | 346 (0x15a) | V_FLOOR_F64
183 27 (0x1b)  | 347 (0x15b) | V_FRACT_F32
184 28 (0x1c)  | 348 (0x15c) | V_TRUNC_F32
185 29 (0x1d)  | 349 (0x15d) | V_CEIL_F32
186 30 (0x1e)  | 350 (0x15e) | V_RNDNE_F32
187 31 (0x1f)  | 351 (0x15f) | V_FLOOR_F32
188 32 (0x20)  | 352 (0x160) | V_EXP_F32
189 33 (0x21)  | 353 (0x161) | V_LOG_F32
190 34 (0x22)  | 354 (0x162) | V_RCP_F32
191 35 (0x23)  | 355 (0x163) | V_RCP_IFLAG_F32
192 36 (0x24)  | 356 (0x164) | V_RSQ_F32
193 37 (0x25)  | 357 (0x165) | V_RCP_F64
194 38 (0x26)  | 358 (0x166) | V_RSQ_F64
195 39 (0x27)  | 359 (0x167) | V_SQRT_F32
196 40 (0x28)  | 360 (0x168) | V_SQRT_F64
197 41 (0x29)  | 361 (0x169) | V_SIN_F32
198 42 (0x2a)  | 362 (0x16a) | V_COS_F32
199 43 (0x2b)  | 363 (0x16b) | V_NOT_B32
200 44 (0x2c)  | 364 (0x16c) | V_BFREV_B32
201 45 (0x2d)  | 365 (0x16d) | V_FFBH_U32
202 46 (0x2e)  | 366 (0x16e) | V_FFBL_B32
203 47 (0x2f)  | 367 (0x16f) | V_FFBH_I32
204 48 (0x30)  | 368 (0x170) | V_FREXP_EXP_I32_F64
205 49 (0x31)  | 369 (0x171) | V_FREXP_MANT_F64
206 50 (0x32)  | 370 (0x172) | V_FRACT_F64
207 51 (0x33)  | 371 (0x173) | V_FREXP_EXP_I32_F32
208 52 (0x34)  | 372 (0x174) | V_FREXP_MANT_F32
209 53 (0x35)  | 373 (0x175) | V_CLREXCP
210 54 (0x36)  | 374 (0x176) | V_MOVRELD_B32
211 55 (0x37)  | 375 (0x177) | V_MOVRELS_B32
212 56 (0x38)  | 376 (0x178) | V_MOVRELSD_B32
213 57 (0x39)  | 377 (0x179) | V_CVT_F16_U16
214 58 (0x3a)  | 378 (0x17a) | V_CVT_F16_I16
215 59 (0x3b)  | 379 (0x17b) | V_CVT_U16_F16
216 60 (0x3c)  | 380 (0x17c) | V_CVT_I16_F16
217 61 (0x3d)  | 381 (0x17d) | V_RCP_F16
218 62 (0x3e)  | 382 (0x17e) | V_SQRT_F16
219 63 (0x3f)  | 383 (0x17f) | V_RSQ_F16
220 64 (0x40)  | 384 (0x180) | V_LOG_F16
221 65 (0x41)  | 385 (0x181) | V_EXP_F16
222 66 (0x42)  | 386 (0x182) | V_FREXP_MANT_F16
223 67 (0x43)  | 387 (0x183) | V_FREXP_EXP_I16_F16
224 68 (0x44)  | 388 (0x184) | V_FLOOR_F16
225 69 (0x45)  | 389 (0x185) | V_CEIL_F16
226 70 (0x46)  | 390 (0x186) | V_TRUNC_F16
227 71 (0x47)  | 391 (0x187) | V_RNDNE_F16
228 72 (0x48)  | 392 (0x188) | V_FRACT_F16
229 73 (0x49)  | 393 (0x189) | V_SIN_F16
230 74 (0x4a)  | 394 (0x18a) | V_COS_F16
231 75 (0x4b)  | 395 (0x18b) | V_EXP_LEGACY_F32
232 76 (0x4c)  | 396 (0x18c) | V_LOG_LEGACY_F32
233
234### Instruction set
235
236Alphabetically sorted instruction list:
237
#### V_BFREV_B32

Opcode VOP1: 56 (0x38) for GCN 1.0/1.1; 44 (0x2c) for GCN 1.2  
Opcode VOP3A: 440 (0x1b8) for GCN 1.0/1.1; 364 (0x16c) for GCN 1.2  
Syntax: V_BFREV_B32 VDST, SRC0  
Description: Reverse bits in SRC0 and store result to VDST.  
Operation:  
```
VDST = REVBIT(SRC0)
```
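
A minimal Python sketch of what `REVBIT` means for a 32-bit value is shown below. The function name `v_bfrev_b32` is only illustrative; it models the bit reversal on a plain integer.

```python
def v_bfrev_b32(src0):
    """Reverse the order of the 32 bits of src0 (bit 0 becomes bit 31, etc.)."""
    src0 &= 0xFFFFFFFF
    result = 0
    for i in range(32):
        if src0 & (1 << i):
            result |= 1 << (31 - i)
    return result

print(hex(v_bfrev_b32(0x00000001)))  # 0x80000000
print(hex(v_bfrev_b32(0x000000F0)))  # 0xf000000
```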
248
#### V_CEIL_F32

Opcode VOP1: 34 (0x22) for GCN 1.0/1.1; 29 (0x1d) for GCN 1.2  
Opcode VOP3A: 418 (0x1a2) for GCN 1.0/1.1; 349 (0x15d) for GCN 1.2  
Syntax: V_CEIL_F32 VDST, SRC0  
Description: Round floating point value from SRC0 toward positive infinity
(ceiling), and store result to VDST. Implemented by flooring.
If SRC0 is infinity or NaN then copy SRC0 to VDST.  
Operation:  
```
FLOAT F = FLOOR(ASFLOAT(SRC0))
if (ASFLOAT(SRC0) != F)
    F += 1.0
VDST = F
```

#### V_CEIL_F64

Opcode VOP1: 24 (0x18) for GCN 1.1/1.2  
Opcode VOP3A: 408 (0x198) for GCN 1.1; 344 (0x158) for GCN 1.2  
Syntax: V_CEIL_F64 VDST(2), SRC0(2)  
Description: Round double floating point value from SRC0 toward
positive infinity (ceiling), and store result to VDST. Implemented by flooring.
If SRC0 is infinity or NaN then copy SRC0 to VDST.  
Operation:  
```
DOUBLE F = FLOOR(ASDOUBLE(SRC0))
if (ASDOUBLE(SRC0) != F)
    F += 1.0
VDST = F
```
280
281#### V_CLREXCP
282
283Opcode VOP1: 65 (0x41) for GCN 1.0/1.1; 53 (0x35) for GCN 1.2 
284Opcode VOP3A: 449 (0x1c1) for GCN 1.0/1.1; 373 (0x175) for GCN 1.2 
285Syntax: V_CLREXCP 
286Description: Clear wave's exception state in SIMD. 
287
#### V_COS_F32

Opcode VOP1: 54 (0x36) for GCN 1.0/1.1; 42 (0x2a) for GCN 1.2  
Opcode VOP3A: 438 (0x1b6) for GCN 1.0/1.1; 362 (0x16a) for GCN 1.2  
Syntax: V_COS_F32 VDST, SRC0  
Description: Compute cosine of FP value from SRC0. Input value must be normalized to range
-1.0 to 1.0 (-360 degrees to 360 degrees). If SRC0 value is out of range then store 1.0 to VDST.
If SRC0 value is infinity, store -NAN to VDST.  
Operation:  
```
FLOAT SF = ASFLOAT(SRC0)
VDST = 1.0
if (SF >= -1.0 && SF <= 1.0)
    VDST = APPROX_COS(SF)
else if (ABS(SF)==INF)
    VDST = -NAN
else if (ISNAN(SF))
    VDST = SRC0
```
307
308#### V_CVT_F16_F32
309
310Opcode VOP1: 10 (0xa) 
311Opcode VOP3A: 394 (0x18a) for GCN 1.0/1.1; 330 (0x14a) for GCN 1.2 
312Syntax: V_CVT_F16_F32 VDST, SRC0 
313Description: Convert single FP value to half floating point value with rounding from
314MODE register (single FP rounding mode), and store result to VDST.
315If absolute value is too high, then store -/+infinity to VDST. 
316Operation: 
317```
318VDST = CVTHALF(ASFLOAT(SRC0))
319```
320
321#### V_CVT_F32_F16
322
323Opcode VOP1: 11 (0xb) 
324Opcode VOP3A: 395 (0x18b) for GCN 1.0/1.1; 331 (0x14b) for GCN 1.2 
325Syntax: V_CVT_F32_F16 VDST, SRC0 
326Description: Convert half FP value to single FP value, and store result to VDST.
327**By default, immediate is in FP32 format!**
328Operation: 
329```
330VDST = (FLOAT)(ASHALF(SRC0))
331```
332
#### V_CVT_F32_F64

Opcode VOP1: 15 (0xf)  
Opcode VOP3A: 399 (0x18f) for GCN 1.0/1.1; 335 (0x14f) for GCN 1.2  
Syntax: V_CVT_F32_F64 VDST, SRC0(2)  
Description: Convert double FP value to single floating point value with rounding from
MODE register (single FP rounding mode), and store result to VDST.
If absolute value is too high, then store -/+infinity to VDST.  
Operation:  
```
VDST = (FLOAT)ASDOUBLE(SRC0)
```
345
346#### V_CVT_F32_I32
347
348Opcode VOP1: 5 (0x5) 
349Opcode VOP3A: 389 (0x185) for GCN 1.0/1.1; 325 (0x145) for GCN 1.2 
350Syntax: V_CVT_F32_I32 VDST, SRC0 
351Description: Convert signed 32-bit integer to single FP value, and store it to VDST. 
352Operation: 
353```
354VDST = (FLOAT)(INT32)SRC0
355```
356
357#### V_CVT_F32_U32
358
359Opcode VOP1: 6 (0x6) 
360Opcode VOP3A: 390 (0x186) for GCN 1.0/1.1; 326 (0x146) for GCN 1.2 
361Syntax: V_CVT_F32_U32 VDST, SRC0 
362Description: Convert unsigned 32-bit integer to single FP value, and store it to VDST. 
363Operation: 
364```
365VDST = (FLOAT)SRC0
366```
367
368#### V_CVT_F32_UBYTE0
369
370Opcode VOP1: 17 (0x11) 
371Opcode VOP3A: 401 (0x191) for GCN 1.0/1.1; 337 (0x151) for GCN 1.2 
372Syntax: V_CVT_F32_UBYTE0 VDST, SRC0 
373Description: Convert the first unsigned 8-bit byte from SRC0 to single FP value,
374and store it to VDST. 
375Operation: 
376```
377VDST = (FLOAT)(SRC0 & 0xff)
378```
379
380#### V_CVT_F32_UBYTE1
381
382Opcode VOP1: 18 (0x12) 
383Opcode VOP3A: 402 (0x192) for GCN 1.0/1.1; 338 (0x152) for GCN 1.2 
384Syntax: V_CVT_F32_UBYTE1 VDST, SRC0 
385Description: Convert the second unsigned 8-bit byte from SRC0 to single FP value,
386and store it to VDST. 
387Operation: 
388```
389VDST = (FLOAT)((SRC0>>8) & 0xff)
390```
391
392#### V_CVT_F32_UBYTE2
393
394Opcode VOP1: 19 (0x13) 
395Opcode VOP3A: 403 (0x193) for GCN 1.0/1.1; 339 (0x153) for GCN 1.2 
396Syntax: V_CVT_F32_UBYTE2 VDST, SRC0 
397Description: Convert the third unsigned 8-bit byte from SRC0 to single FP value,
398and store it to VDST. 
399Operation: 
400```
401VDST = (FLOAT)((SRC0>>16) & 0xff)
402```
403
404#### V_CVT_F32_UBYTE3
405
406Opcode VOP1: 20 (0x14) 
407Opcode VOP3A: 404 (0x194) for GCN 1.0/1.1; 340 (0x154) for GCN 1.2 
408Syntax: V_CVT_F32_UBYTE3 VDST, SRC0 
409Description: Convert the fourth unsigned 8-bit byte from SRC0 to single FP value,
410and store it to VDST. 
411Operation: 
412```
413VDST = (FLOAT)(SRC0>>24)
414```
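
The four V_CVT_F32_UBYTEn instructions differ only in which byte of SRC0 they select. The following Python sketch expresses them with one hypothetical helper, `v_cvt_f32_ubyte`, whose `index` argument stands for the digit in the mnemonic.

```python
def v_cvt_f32_ubyte(src0, index):
    """Convert byte `index` (0-3, least significant first) of src0 to a float."""
    if index not in (0, 1, 2, 3):
        raise ValueError("byte index must be 0, 1, 2 or 3")
    return float((src0 >> (8 * index)) & 0xFF)

src0 = 0x11223344
# Bytes from least to most significant: 0x44, 0x33, 0x22, 0x11
print([v_cvt_f32_ubyte(src0, i) for i in range(4)])  # [68.0, 51.0, 34.0, 17.0]
```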
415
416#### V_CVT_F64_F32
417
418Opcode VOP1: 16 (0x10) 
419Opcode VOP3A: 400 (0x190) for GCN 1.0/1.1; 336 (0x150) for GCN 1.2 
420Syntax: V_CVT_F64_F32 VDST(2), SRC0 
421Description: Convert single FP value to double FP value, and store result to VDST. 
422Operation: 
423```
424VDST = (DOUBLE)(ASFLOAT(SRC0))
425```
426
427#### V_CVT_F64_I32
428
429Opcode VOP1: 4 (0x4) 
430Opcode VOP3A: 388 (0x184) for GCN 1.0/1.1; 324 (0x144) for GCN 1.2 
431Syntax: V_CVT_F64_I32 VDST(2), SRC0 
432Description: Convert signed 32-bit integer to double FP value, and store it to VDST. 
433Operation: 
434```
435VDST = (DOUBLE)(INT32)SRC0
436```
437
438#### V_CVT_F64_U32
439
440Opcode VOP1: 22 (0x16) 
441Opcode VOP3A: 406 (0x196) for GCN 1.0/1.1; 342 (0x156) for GCN 1.2 
442Syntax: V_CVT_F64_U32 VDST(2), SRC0 
443Description: Convert unsigned 32-bit integer to double FP value, and store it to VDST. 
444Operation: 
445```
446VDST = (DOUBLE)SRC0
447```
448
#### V_CVT_FLR_I32_F32

Opcode VOP1: 13 (0xd)  
Opcode VOP3A: 397 (0x18d) for GCN 1.0/1.1; 333 (0x14d) for GCN 1.2  
Syntax: V_CVT_FLR_I32_F32 VDST, SRC0  
Description: Convert 32-bit floating point value from SRC0 to signed 32-bit integer, and
store result to VDST. Conversion uses rounding to negative infinity (floor).
If value is higher/lower than maximal/minimal integer then store MAX_INT32/MIN_INT32 to VDST.
If input value is NaN/-NaN then store MAX_INT32/MIN_INT32 to VDST.  
Operation:  
```
FLOAT SF = ASFLOAT(SRC0)
if (!ISNAN(SF))
    VDST = (INT32)MAX(MIN(FLOOR(SF), 2147483647.0), -2147483648.0)
else
    VDST = (INT32)SRC0>=0 ? 2147483647 : -2147483648
```
466
467#### V_CVT_I32_F32
468
469Opcode VOP1: 8 (0x8) 
470Opcode VOP3A: 392 (0x188) for GCN 1.0/1.1; 328 (0x148) for GCN 1.2 
471Syntax: V_CVT_I32_F32 VDST, SRC0 
472Description: Convert 32-bit floating point value from SRC0 to signed 32-bit integer, and
473store result to VDST. Conversion uses rounding to zero. If value is higher/lower than
474maximal/minimal integer then store MAX_INT32/MIN_INT32 to VDST.
475If input value is NaN then store 0 to VDST. 
476Operation: 
477```
478VDST = 0
479if (!ISNAN(ASFLOAT(SRC0)))
480    VDST = (INT32)MAX(MIN(RNDTZINT(ASFLOAT(SRC0)), 2147483647.0), -2147483648.0)
481```
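
The saturating behaviour of V_CVT_I32_F32 can be sketched in Python as follows. This is a model under stated assumptions: Python floats are double precision rather than single precision, the function name is illustrative, and the clamp is performed before truncation (which gives the same result as truncating first, but also handles infinities without raising an error).

```python
import math

INT32_MAX = 2147483647
INT32_MIN = -2147483648

def v_cvt_i32_f32(value):
    """Convert a float to a signed 32-bit integer: NaN gives 0, overflow saturates."""
    if math.isnan(value):
        return 0
    clamped = min(max(value, float(INT32_MIN)), float(INT32_MAX))
    return int(clamped)  # int() truncates toward zero

print(v_cvt_i32_f32(-1.9))           # -1
print(v_cvt_i32_f32(3e9))            # 2147483647
print(v_cvt_i32_f32(float("nan")))   # 0
```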
482
483#### V_CVT_I32_F64
484
485Opcode VOP1: 3 (0x3) 
486Opcode VOP3A: 387 (0x183) for GCN 1.0/1.1; 323 (0x143) for GCN 1.2 
487Syntax: V_CVT_I32_F64 VDST, SRC0(2) 
488Description: Convert 64-bit floating point value from SRC0 to signed 32-bit integer, and
489store result to VDST. Conversion uses rounding to zero. If value is higher/lower than
490maximal/minimal integer then store MAX_INT32/MIN_INT32 to VDST.
491If input value is NaN then store 0 to VDST. 
492Operation: 
493```
494VDST = 0
495if (!ISNAN(ASDOUBLE(SRC0)))
496    VDST = (INT32)MAX(MIN(RNDTZINT(ASDOUBLE(SRC0)), 2147483647.0), -2147483648.0)
497```
498
499#### V_CVT_OFF_F32_I4
500
501Opcode VOP1: 14 (0xe) 
502Opcode VOP3A: 398 (0x18e) for GCN 1.0/1.1; 334 (0x14e) for GCN 1.2 
503Syntax: V_CVT_OFF_F32_I4 VDST, SRC0 
504Description: Convert 4-bit signed value from SRC0 to floating point value, normalize that
505value to range -0.5:0.4375 and store result to VDST. 
506Operation: 
507```
508VDST = (FLOAT)((SRC0 & 0xf) ^ 8) / 16.0 - 0.5
509```
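
To see how the formula covers the range -0.5:0.4375, the following Python sketch evaluates it for a few 4-bit inputs. The function name is illustrative; the expression is the one from the Operation block.

```python
def v_cvt_off_f32_i4(src0):
    """Map the low 4 bits of src0 (a signed 4-bit value) to a float in -0.5..0.4375."""
    return float((src0 & 0xF) ^ 8) / 16.0 - 0.5

# 4-bit patterns 0..7 are non-negative, 8..15 encode -8..-1
for pattern in (0, 7, 8, 15):
    print(pattern, v_cvt_off_f32_i4(pattern))
```

The XOR with 8 flips the sign bit of the 4-bit value, which turns the two's complement range -8..7 into the unsigned range 0..15 before scaling.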
510
#### V_CVT_RPI_I32_F32

Opcode VOP1: 12 (0xc)  
Opcode VOP3A: 396 (0x18c) for GCN 1.0/1.1; 332 (0x14c) for GCN 1.2  
Syntax: V_CVT_RPI_I32_F32 VDST, SRC0  
Description: Convert 32-bit floating point value from SRC0 to signed 32-bit integer, and
store result to VDST. Conversion adds 0.5 to value and rounds toward negative infinity (floor).
If value is higher/lower than maximal/minimal integer then store MAX_INT32/MIN_INT32 to
VDST. If input value is NaN/-NaN then store MAX_INT32/MIN_INT32 to VDST.  
Operation:  
```
FLOAT SF = ASFLOAT(SRC0)
if (!ISNAN(SF))
    VDST = (INT32)MAX(MIN(FLOOR(SF + 0.5), 2147483647.0), -2147483648.0)
else
    VDST = (INT32)SRC0>=0 ? 2147483647 : -2147483648
```
528
529#### V_CVT_U32_F32
530
531Opcode VOP1: 7 (0x7) 
532Opcode VOP3A: 391 (0x187) for GCN 1.0/1.1; 327 (0x147) for GCN 1.2 
533Syntax: V_CVT_U32_F32 VDST, SRC0 
534Description: Convert 32-bit floating point value from SRC0 to unsigned 32-bit integer, and
535store result to VDST. Conversion uses rounding to zero. If value is higher than
536maximal integer then store MAX_UINT32 to VDST.
537If input value is NaN then store 0 to VDST. 
538Operation: 
539```
540VDST = 0
541if (!ISNAN(ASFLOAT(SRC0)))
542    VDST = (UINT32)MIN(RNDTZINT(ASFLOAT(SRC0)), 4294967295.0)
543```
544
545#### V_CVT_U32_F64
546
547Opcode VOP1: 21 (0x15) 
548Opcode VOP3A: 405 (0x195) for GCN 1.0/1.1; 341 (0x155) for GCN 1.2 
549Syntax: V_CVT_U32_F64 VDST, SRC0(2) 
550Description: Convert 64-bit floating point value from SRC0 to unsigned 32-bit integer, and
551store result to VDST. Conversion uses rounding to zero. If value is higher than
552maximal integer then store MAX_UINT32 to VDST.
553If input value is NaN then store 0 to VDST. 
554Operation: 
555```
556VDST = 0
557if (!ISNAN(ASDOUBLE(SRC0)))
558    VDST = (UINT32)MIN(RNDTZINT(ASDOUBLE(SRC0)), 4294967295.0)
559```
560
#### V_EXP_F32

Opcode VOP1: 37 (0x25) for GCN 1.0/1.1; 32 (0x20) for GCN 1.2  
Opcode VOP3A: 421 (0x1a5) for GCN 1.0/1.1; 352 (0x160) for GCN 1.2  
Syntax: V_EXP_F32 VDST, SRC0  
Description: Approximate power of two from FP value SRC0 and store it to VDST. For values
smaller than -126.0 the instruction always returns 0, regardless of the float mode in the MODE register.  
Operation:  
```
if (ASFLOAT(SRC0)>=-126.0)
    VDST = APPROX_POW2(ASFLOAT(SRC0))
else
    VDST = 0.0
```

#### V_EXP_LEGACY_F32

Opcode VOP1: 70 (0x46) for GCN 1.1; 75 (0x4b) for GCN 1.2  
Opcode VOP3A: 454 (0x1c6) for GCN 1.1; 395 (0x18b) for GCN 1.2  
Syntax: V_EXP_LEGACY_F32 VDST, SRC0  
Description: Approximate power of two from FP value SRC0 and store it to VDST. For values
smaller than -126.0 the instruction always returns 0, regardless of the float mode in the MODE register.
In some cases this instruction returns a slightly less accurate result than V_EXP_F32.  
Operation:  
```
if (ASFLOAT(SRC0)>=-126.0)
    VDST = APPROX_POW2(ASFLOAT(SRC0))
else
    VDST = 0.0
```
591
#### V_FFBH_U32

Opcode VOP1: 57 (0x39) for GCN 1.0/1.1; 45 (0x2d) for GCN 1.2  
Opcode VOP3A: 441 (0x1b9) for GCN 1.0/1.1; 365 (0x16d) for GCN 1.2  
Syntax: V_FFBH_U32 VDST, SRC0  
Description: Find the last (most significant) one bit in SRC0. If found, store the number of
skipped bits to VDST, otherwise set VDST to -1.  
Operation:  
```
VDST = -1
for (INT8 i = 31; i >= 0; i--)
    if (((1U<<i) & SRC0) != 0)
    { VDST = 31-i; break; }
```

#### V_FFBH_I32

Opcode VOP1: 59 (0x3b) for GCN 1.0/1.1; 47 (0x2f) for GCN 1.2  
Opcode VOP3A: 443 (0x1bb) for GCN 1.0/1.1; 367 (0x16f) for GCN 1.2  
Syntax: V_FFBH_I32 VDST, SRC0  
Description: Find the last (most significant) bit opposite to the sign bit in SRC0. If found,
store the number of skipped bits to VDST, otherwise set VDST to -1.  
Operation:  
```
VDST = -1
UINT32 bitval = (INT32)SRC0>=0 ? 1 : 0
for (INT8 i = 31; i >= 0; i--)
    if (((1U<<i) & SRC0) == (bitval<<i))
    { VDST = 31-i; break; }
```

#### V_FFBL_B32

Opcode VOP1: 58 (0x3a) for GCN 1.0/1.1; 46 (0x2e) for GCN 1.2  
Opcode VOP3A: 442 (0x1ba) for GCN 1.0/1.1; 366 (0x16e) for GCN 1.2  
Syntax: V_FFBL_B32 VDST, SRC0  
Description: Find the first (least significant) one bit in SRC0. If found, store the bit number
to VDST, otherwise set VDST to -1.  
Operation:  
```
VDST = -1
for (UINT8 i = 0; i < 32; i++)
    if (((1U<<i) & SRC0) != 0)
    { VDST = i; break; }
```
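
The three find-bit instructions can be compared side by side with a small Python model. The function names are illustrative, the inputs are treated as unsigned 32-bit patterns, and the "not found" result is returned as Python `-1` (in a 32-bit register that would be 0xFFFFFFFF).

```python
def v_ffbh_u32(src0):
    """Count bits skipped from bit 31 down to the highest set bit, or -1."""
    for i in range(31, -1, -1):
        if src0 & (1 << i):
            return 31 - i
    return -1

def v_ffbh_i32(src0):
    """Count bits skipped from bit 31 down to the highest bit that differs from the sign."""
    sign = (src0 >> 31) & 1
    for i in range(31, -1, -1):
        if ((src0 >> i) & 1) != sign:
            return 31 - i
    return -1

def v_ffbl_b32(src0):
    """Return the index of the lowest set bit, or -1."""
    for i in range(32):
        if src0 & (1 << i):
            return i
    return -1

print(v_ffbh_u32(1), v_ffbh_i32(0xFFFFFFFE), v_ffbl_b32(8))  # 31 31 3
```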
637
#### V_FLOOR_F32

Opcode VOP1: 36 (0x24) for GCN 1.0/1.1; 31 (0x1f) for GCN 1.2  
Opcode VOP3A: 420 (0x1a4) for GCN 1.0/1.1; 351 (0x15f) for GCN 1.2  
Syntax: V_FLOOR_F32 VDST, SRC0  
Description: Round floating point value SRC0 toward negative infinity
(flooring), and store result to VDST. If SRC0 is infinity or NaN then copy SRC0 to VDST.  
Operation:  
```
VDST = FLOOR(ASFLOAT(SRC0))
```

#### V_FLOOR_F64

Opcode VOP1: 26 (0x1a) for GCN 1.1/1.2  
Opcode VOP3A: 410 (0x19a) for GCN 1.1; 346 (0x15a) for GCN 1.2  
Syntax: V_FLOOR_F64 VDST(2), SRC0(2)  
Description: Round double floating point value SRC0 toward negative infinity
(flooring), and store result to VDST. If SRC0 is infinity or NaN then copy SRC0 to VDST.  
Operation:  
```
VDST = FLOOR(ASDOUBLE(SRC0))
```
661
#### V_FRACT_F32

Opcode VOP1: 32 (0x20) for GCN 1.0/1.1; 27 (0x1b) for GCN 1.2  
Opcode VOP3A: 416 (0x1a0) for GCN 1.0/1.1; 347 (0x15b) for GCN 1.2  
Syntax: V_FRACT_F32 VDST, SRC0  
Description: Get the fractional part of floating point value SRC0 and store it to VDST.
The fractional part is computed by subtracting floor(SRC0) from SRC0.
If SRC0 is infinity or NaN then NaN with proper sign is stored to VDST.  
Operation:  
```
FLOAT SF = ASFLOAT(SRC0)
if (!ISNAN(SF) && SF!=-INF && SF!=INF)
    VDST = SF - FLOOR(SF)
else
    VDST = NAN * SIGN(SF)
```
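
Because the fractional part is defined through floor, negative inputs give a non-negative result. The following Python sketch (illustrative function name, double-precision Python floats instead of single precision) shows this.

```python
import math

def v_fract_f32(value):
    """Return value - floor(value); infinities and NaN give NaN with the input's sign."""
    if math.isnan(value) or math.isinf(value):
        return math.copysign(math.nan, value)
    return value - math.floor(value)

print(v_fract_f32(1.75))    # 0.75
print(v_fract_f32(-0.25))   # 0.75
```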
678
#### V_FRACT_F64

Opcode VOP1: 62 (0x3e) for GCN 1.0/1.1; 50 (0x32) for GCN 1.2  
Opcode VOP3A: 446 (0x1be) for GCN 1.0/1.1; 370 (0x172) for GCN 1.2  
Syntax: V_FRACT_F64 VDST(2), SRC0(2)  
Description: Get the fractional part of double floating point value SRC0 and store it to VDST.
The fractional part is computed by subtracting floor(SRC0) from SRC0.
If SRC0 is infinity or NaN then NaN with proper sign is stored to VDST.  
Operation:  
```
DOUBLE SD = ASDOUBLE(SRC0)
if (!ISNAN(SD) && SD!=-INF && SD!=INF)
    VDST = SD - FLOOR(SD)
else
    VDST = NAN * SIGN(SD)
```
695
696#### V_FREXP_EXP_I32_F32
697
698Opcode VOP1: 63 (0x3f) for GCN 1.0/1.1; 51 (0x33) for GCN 1.2 
699Opcode VOP3A: 447 (0x1bf) for GCN 1.0/1.1; 371 (0x173) for GCN 1.2 
700Syntax: V_FREXP_EXP_I32_F32 VDST, SRC0 
701Description: Get exponent plus 1 from single FP value SRC0, and store that exponent to VDST.
702This instruction realizes frexp function.
703If SRC0 is infinity or NAN then store -1 to VDST. 
704Operation: 
705```
706FLOAT SF = ASFLOAT(SRC0)
707if (ABS(SF) != INF && !ISNAN(SF))
708    VDST = FREXP_EXP(SF)
709else
710    VDST = -1
711```
712
713#### V_FREXP_EXP_I32_F64
714
715Opcode VOP1: 60 (0x3c) for GCN 1.0/1.1; 48 (0x30) for GCN 1.2 
716Opcode VOP3A: 444 (0x1bc) for GCN 1.0/1.1; 368 (0x170) for GCN 1.2 
717Syntax: V_FREXP_EXP_I32_F64 VDST, SRC0(2) 
718Description: Get exponent plus 1 from double FP value SRC0, and store that exponent to VDST.
719This instruction realizes frexp function.
720If SRC0 is infinity or NAN then store -1 to VDST. 
721Operation: 
722```
723DOUBLE SD = ASDOUBLE(SRC0)
724if (ABS(SD) != INF && !ISNAN(SD))
725    VDST = FREXP_EXP(SD)
726else
727    VDST = -1
728```
729
#### V_FREXP_MANT_F32

Opcode VOP1: 64 (0x40) for GCN 1.0/1.1; 52 (0x34) for GCN 1.2  
Opcode VOP3A: 448 (0x1c0) for GCN 1.0/1.1; 372 (0x174) for GCN 1.2  
Syntax: V_FREXP_MANT_F32 VDST, SRC0  
Description: Get mantissa from single FP value SRC0, and store it to VDST. Mantissa includes
sign of input. If SRC0 is infinity then store -NAN to VDST.  
Operation:  
```
FLOAT SF = ASFLOAT(SRC0)
if (ABS(SF) == INF)
    VDST = -NAN
else if (!ISNAN(SF))
    VDST = FREXP_MANT(SF) * SIGN(SF)
else
    VDST = NAN * SIGN(SF)
```
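
As a rough model of V_FREXP_EXP_I32_F32 and V_FREXP_MANT_F32, the following Python sketch uses `math.frexp`, which splits a value into a mantissa in the range [0.5, 1.0) and an exponent. It assumes that FREXP_EXP and FREXP_MANT correspond to the two results of the C `frexp` function (the text only says the instruction realizes frexp), and it uses double-precision Python floats. The function names are illustrative.

```python
import math

def v_frexp_exp_i32_f32(value):
    """Return the frexp exponent, or -1 for infinity and NaN."""
    if math.isinf(value) or math.isnan(value):
        return -1
    return math.frexp(value)[1]

def v_frexp_mant_f32(value):
    """Return the signed frexp mantissa; infinity gives -NaN, NaN is passed through."""
    if math.isinf(value):
        return -math.nan
    if math.isnan(value):
        return value
    return math.frexp(value)[0]

# 8.0 = 0.5 * 2**4 and -3.0 = -0.75 * 2**2
print(v_frexp_mant_f32(8.0), v_frexp_exp_i32_f32(8.0))    # 0.5 4
print(v_frexp_mant_f32(-3.0), v_frexp_exp_i32_f32(-3.0))  # -0.75 2
```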
747
#### V_FREXP_MANT_F64

Opcode VOP1: 61 (0x3d) for GCN 1.0/1.1; 49 (0x31) for GCN 1.2  
Opcode VOP3A: 445 (0x1bd) for GCN 1.0/1.1; 369 (0x171) for GCN 1.2  
Syntax: V_FREXP_MANT_F64 VDST(2), SRC0(2)  
Description: Get mantissa from double FP value SRC0, and store it to VDST. Mantissa includes
sign of input. If SRC0 is infinity then store -NAN to VDST.  
Operation:  
```
DOUBLE SD = ASDOUBLE(SRC0)
if (ABS(SD) == INF)
    VDST = -NAN
else if (!ISNAN(SD))
    VDST = FREXP_MANT(SD) * SIGN(SD)
else
    VDST = NAN * SIGN(SD)
```
765
#### V_LOG_CLAMP_F32

Opcode VOP1: 38 (0x26) for GCN 1.0/1.1  
Opcode VOP3A: 422 (0x1a6) for GCN 1.0/1.1  
Syntax: V_LOG_CLAMP_F32 VDST, SRC0  
Description: Approximate logarithm of base 2 from floating point value SRC0 with
clamping infinities to -MAX_FLOAT. Result is stored in VDST.
If SRC0 is negative then store -NaN to VDST. This instruction doesn't handle denormalized
values regardless of the FLOAT MODE register setup.  
Operation:  
```
FLOAT F = ASFLOAT(SRC0)
if (F==1.0)
    VDST = 0.0f
else if (F<0.0)
    VDST = -NaN
else
{
    VDST = APPROX_LOG2(F)
    if (ASFLOAT(VDST)==-INF)
        VDST = -MAX_FLOAT
}
```

#### V_LOG_F32

Opcode VOP1: 39 (0x27) for GCN 1.0/1.1; 33 (0x21) for GCN 1.2  
Opcode VOP3A: 423 (0x1a7) for GCN 1.0/1.1; 353 (0x161) for GCN 1.2  
Syntax: V_LOG_F32 VDST, SRC0  
Description: Approximate logarithm of base 2 from floating point value SRC0, and store
result to VDST. If SRC0 is negative then store -NaN to VDST.
This instruction doesn't handle denormalized values regardless of the FLOAT MODE register setup.  
Operation:  
```
FLOAT F = ASFLOAT(SRC0)
if (F==1.0)
    VDST = 0.0f
else if (F<0.0)
    VDST = -NaN
else
    VDST = APPROX_LOG2(F)
```

#### V_LOG_LEGACY_F32

Opcode VOP1: 69 (0x45) for GCN 1.1; 76 (0x4c) for GCN 1.2  
Opcode VOP3A: 453 (0x1c5) for GCN 1.1; 396 (0x18c) for GCN 1.2  
Syntax: V_LOG_LEGACY_F32 VDST, SRC0  
Description: Approximate logarithm of base 2 from floating point value SRC0, and store
result to VDST. If SRC0 is negative then store -NaN to VDST.
This instruction doesn't handle denormalized values regardless of the FLOAT MODE register setup.
This instruction returns slightly different results than V_LOG_F32.  
Operation:  
```
FLOAT F = ASFLOAT(SRC0)
if (F==1.0)
    VDST = 0.0f
else if (F<0.0)
    VDST = -NaN
else
    VDST = APPROX_LOG2(F)
```
828
829#### V_MOV_B32
830
831Opcode VOP1: 1 (0x1) 
832Opcode VOP3A: 385 (0x181) for GCN 1.0/1.1; 321 (0x141) for GCN 1.2 
833Syntax: V_MOV_B32 VDST, SRC0 
834Description: Move SRC0 into VDST. 
835Operation: 
836```
837VDST = SRC0
838```
839
840#### V_MOV_FED_B32
841
842Opcode VOP1: 9 (0x9) 
843Opcode VOP3A: 393 (0x189) for GCN 1.0/1.1; 329 (0x149) for GCN 1.2 
844Syntax: V_MOV_FED_B32 VDST, SRC0 
845Description: Introduce edc double error upon write to dest vgpr without causing an exception
846(???).
847
848#### V_MOVRELD_B32
849
850Opcode VOP1: 66 (0x42) for GCN 1.0/1.1; 54 (0x34) for GCN 1.2 
851Opcode VOP3A: 450 (0x1c2) for GCN 1.0/1.1; 374 (0x174) for GCN 1.2 
852Syntax: V_MOVRELD_B32 VDST, VSRC0 
853Description: Move SRC0 to VGPR[VDST_NUMBER+M0]. 
854Operation: 
855```
856VGPR[VDST_NUMBER+M0] = SRC0
857```

#### V_MOVRELS_B32

Opcode VOP1: 67 (0x43) for GCN 1.0/1.1; 55 (0x37) for GCN 1.2  
Opcode VOP3A: 451 (0x1c3) for GCN 1.0/1.1; 375 (0x177) for GCN 1.2  
Syntax: V_MOVRELS_B32 VDST, VSRC0  
Description: Move VGPR[SRC0_NUMBER+M0] to VDST.  
Operation:  
```
VDST = VGPR[SRC0_NUMBER+M0]
```

#### V_MOVRELSD_B32

Opcode VOP1: 68 (0x44) for GCN 1.0/1.1; 56 (0x38) for GCN 1.2  
Opcode VOP3A: 452 (0x1c4) for GCN 1.0/1.1; 376 (0x178) for GCN 1.2  
Syntax: V_MOVRELSD_B32 VDST, VSRC0  
Description: Move VGPR[SRC0_NUMBER+M0] to VGPR[VDST_NUMBER+M0].  
Operation:  
```
VGPR[VDST_NUMBER+M0] = VGPR[SRC0_NUMBER+M0]
```
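
The three relative-move instructions differ only in which side of the move is offset by M0. The C sketch below models them for a single lane. The register-file size `NUM_VGPRS`, the function names, and the absence of bounds checking are all simplifying assumptions for illustration. This is not a description of hardware behaviour for out-of-range indices.

```c
#include <stdint.h>

#define NUM_VGPRS 256  /* assumption: size of the modelled register file */

/* V_MOVRELD_B32: only the destination index is offset by M0. */
static void model_movreld(uint32_t vgpr[NUM_VGPRS], uint32_t vdst_number,
                          uint32_t src0_value, uint32_t m0)
{
    vgpr[vdst_number + m0] = src0_value;
}

/* V_MOVRELS_B32: only the source index is offset by M0. */
static void model_movrels(uint32_t vgpr[NUM_VGPRS], uint32_t vdst_number,
                          uint32_t src0_number, uint32_t m0)
{
    vgpr[vdst_number] = vgpr[src0_number + m0];
}

/* V_MOVRELSD_B32: both indices are offset by M0. */
static void model_movrelsd(uint32_t vgpr[NUM_VGPRS], uint32_t vdst_number,
                           uint32_t src0_number, uint32_t m0)
{
    vgpr[vdst_number + m0] = vgpr[src0_number + m0];
}
```

For example, with M0 = 2 and a source register number of 3, `model_movrels` reads the modelled register 5.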

#### V_NOP

Opcode VOP1: 0 (0x0)  
Opcode VOP3A: 384 (0x180) for GCN 1.0/1.1; 320 (0x140) for GCN 1.2  
Syntax: V_NOP  
Description: Do nothing.
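
Because V_NOP has VOP1 opcode 0, its VOP3A opcodes (384 and 320) can be read as the offsets between the VOP1 and VOP3A opcode numbering. The opcodes listed for neighbouring instructions such as V_MOV_B32 and V_NOT_B32 agree with this reading. The sketch below expresses that observation in C. The function and constant names are hypothetical, and the rule is inferred from the opcode values in this document rather than quoted from an ISA manual.

```c
#include <stdbool.h>

/* Offsets read from the V_NOP entry (VOP1 opcode 0). */
enum {
    VOP3A_BASE_GCN10_11 = 384,  /* GCN 1.0/1.1 */
    VOP3A_BASE_GCN12    = 320   /* GCN 1.2 */
};

/* Hypothetical helper: derive the VOP3A opcode from a VOP1 opcode. */
static unsigned vop1_to_vop3a(unsigned vop1_opcode, bool is_gcn12)
{
    return vop1_opcode + (is_gcn12 ? VOP3A_BASE_GCN12 : VOP3A_BASE_GCN10_11);
}
```

For V_NOT_B32 this gives 55 + 384 = 439 on GCN 1.0/1.1 and 43 + 320 = 363 on GCN 1.2, matching the listed values.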

#### V_NOT_B32

Opcode VOP1: 55 (0x37) for GCN 1.0/1.1; 43 (0x2b) for GCN 1.2  
Opcode VOP3A: 439 (0x1b7) for GCN 1.0/1.1; 363 (0x16b) for GCN 1.2  
Syntax: V_NOT_B32 VDST, SRC0  
Description: Do bitwise negation on 32-bit SRC0, and store result to VDST.  
Operation:  
```
VDST = ~SRC0
```

#### V_RCP_CLAMP_F32

Opcode VOP1: 40 (0x28) for GCN 1.0/1.1  
Opcode VOP3A: 424 (0x1a8) for GCN 1.0/1.1  
Syntax: V_RCP_CLAMP_F32 VDST, SRC0  
Description: Approximate the reciprocal of the floating point value SRC0 and store it to VDST.
Guaranteed error is below 1 ULP. The result is clamped to MAX_FLOAT, preserving the sign of the result.  
Operation:  
```
VDST = APPROX_RCP(ASFLOAT(SRC0))
if (ABS(ASFLOAT(VDST))==INF)
    VDST = SIGN(ASFLOAT(VDST)) * MAX_FLOAT
```
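
A host-side model can make the clamping step concrete. In the C sketch below, the exact division `1.0f / src0` stands in for the hardware `APPROX_RCP` and `FLT_MAX` stands in for `MAX_FLOAT`. Both substitutions are assumptions, as is the function name, and the model relies on IEEE 754 behaviour for division by zero on the host.

```c
#include <float.h>

/* Hypothetical model of V_RCP_CLAMP_F32 for a single value. */
static float model_v_rcp_clamp_f32(float src0)
{
    float r = 1.0f / src0;     /* stand-in for APPROX_RCP (assumption) */
    if (r > FLT_MAX)           /* result is +INF */
        return FLT_MAX;
    if (r < -FLT_MAX)          /* result is -INF */
        return -FLT_MAX;
    return r;                  /* finite results and NaN pass through */
}
```

With this model, an input of `0.0f` yields `FLT_MAX`, and an input of `-0.0f` yields `-FLT_MAX`.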

#### V_RCP_CLAMP_F64

Opcode VOP1: 48 (0x30) for GCN 1.0/1.1  
Opcode VOP3A: 432 (0x1b0) for GCN 1.0/1.1  
Syntax: V_RCP_CLAMP_F64 VDST(2), SRC0(2)  
Description: Approximate the reciprocal of the double FP value SRC0 and store it to VDST.
Relative error of approximation is ~1e-8.
The result is clamped to MAX_DOUBLE, preserving the sign of the result.  
Operation:  
```
VDST = APPROX_RCP(ASDOUBLE(SRC0))
if (ABS(ASDOUBLE(VDST))==INF)
    VDST = SIGN(ASDOUBLE(VDST)) * MAX_DOUBLE
```

#### V_RCP_F32

Opcode VOP1: 42 (0x2a) for GCN 1.0/1.1; 34 (0x22) for GCN 1.2  
Opcode VOP3A: 426 (0x1aa) for GCN 1.0/1.1; 354 (0x162) for GCN 1.2  
Syntax: V_RCP_F32 VDST, SRC0  
Description: Approximate the reciprocal of the floating point value SRC0 and store it to VDST.
Guaranteed error is below 1 ULP.  
Operation:  
```
VDST = APPROX_RCP(ASFLOAT(SRC0))
```

#### V_RCP_F64

Opcode VOP1: 47 (0x2f) for GCN 1.0/1.1; 37 (0x25) for GCN 1.2  
Opcode VOP3A: 431 (0x1af) for GCN 1.0/1.1; 357 (0x165) for GCN 1.2  
Syntax: V_RCP_F64 VDST(2), SRC0(2)  
Description: Approximate the reciprocal of the double FP value SRC0 and store it to VDST.
Relative error of approximation is ~1e-8.  
Operation:  
```
VDST = APPROX_RCP(ASDOUBLE(SRC0))
```

#### V_RCP_IFLAG_F32

Opcode VOP1: 43 (0x2b) for GCN 1.0/1.1; 35 (0x23) for GCN 1.2  
Opcode VOP3A: 427 (0x1ab) for GCN 1.0/1.1; 355 (0x163) for GCN 1.2  
Syntax: V_RCP_IFLAG_F32 VDST, SRC0  
Description: Approximate the reciprocal of the floating point value SRC0 and store it to VDST.
Guaranteed error is below 1 ULP. When an error occurs, this instruction signals integer
division by zero instead of any floating point exception.  
Operation:  
```
VDST = APPROX_RCP_IFLAG(ASFLOAT(SRC0))
```

#### V_RCP_LEGACY_F32

Opcode VOP1: 41 (0x29) for GCN 1.0/1.1  
Opcode VOP3A: 425 (0x1a9) for GCN 1.0/1.1  
Syntax: V_RCP_LEGACY_F32 VDST, SRC0  
Description: Approximate the reciprocal of the floating point value SRC0 and store it to VDST.
Guaranteed error is below 1 ULP. If SRC0 or VDST is zero or infinity then store 0 with the
proper sign to VDST.  
Operation:  
```
FLOAT SF = ASFLOAT(SRC0)
if (ABS(SF)==0.0)
    VDST = SIGN(SF)*0.0
else
{
    VDST = APPROX_RCP(SF)
    if (ABS(ASFLOAT(VDST)) == INF)
        VDST = SIGN(SF)*0.0
}
```
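
The legacy variant differs from V_RCP_F32 mainly in returning a signed zero where an infinity would otherwise appear. The C sketch below follows the pseudocode above, using the exact division `1.0f / sf` as a stand-in for `APPROX_RCP` (an assumption). The helper names are hypothetical, and the sign is taken from the raw bit pattern so that `-0.0` is handled correctly.

```c
#include <float.h>
#include <stdint.h>
#include <string.h>

/* Return the raw IEEE 754 bit pattern of a float. */
static uint32_t float_bits(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    return bits;
}

/* Return +0.0 or -0.0, carrying the sign bit of x (SIGN(SF)*0.0). */
static float zero_with_sign_of(float x)
{
    uint32_t bits = float_bits(x) & 0x80000000u;
    float z;
    memcpy(&z, &bits, sizeof z);
    return z;
}

/* Hypothetical model of V_RCP_LEGACY_F32 for a single value. */
static float model_v_rcp_legacy_f32(float sf)
{
    if (sf == 0.0f)                       /* true for both +0.0 and -0.0 */
        return zero_with_sign_of(sf);
    float r = 1.0f / sf;                  /* stand-in for APPROX_RCP */
    if (r > FLT_MAX || r < -FLT_MAX)      /* result overflowed to +-INF */
        return zero_with_sign_of(sf);
    return r;
}
```

In this model an infinite input needs no special case, because the division itself already produces a zero with the matching sign.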

#### V_READFIRSTLANE_B32

Opcode VOP1: 2 (0x2)  
Opcode VOP3A: 386 (0x182) for GCN 1.0/1.1; 322 (0x142) for GCN 1.2  
Syntax: V_READFIRSTLANE_B32 SDST, VSRC0  
Description: Copy one VSRC0 lane value to SDST. The lane (thread ID) is the first active
lane ID, or the first lane ID if all lanes are inactive. SSRC1 can be SGPR or M0. Ignores EXEC mask.  
Operation:  
```
UINT8 firstlane = 0
for (UINT8 i = 0; i < 64; i++)
    if (((1ULL<<i) & EXEC) != 0)
    { firstlane = i; break; }
SDST = VSRC0[firstlane]
```
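
The lane selection above can be modelled on the host as follows. This is a minimal C sketch in which the 64 lane values of VSRC0 are represented as a plain array and EXEC as a 64-bit mask. The function names are hypothetical.

```c
#include <stdint.h>

/* Index of the lowest set bit in exec, or 0 when no lane is active. */
static unsigned first_active_lane(uint64_t exec)
{
    for (unsigned i = 0; i < 64; i++)
        if ((exec >> i) & 1u)
            return i;
    return 0;  /* all lanes inactive: fall back to lane 0 */
}

/* Hypothetical model of V_READFIRSTLANE_B32: vsrc0 holds one value per lane. */
static uint32_t model_v_readfirstlane_b32(const uint32_t vsrc0[64], uint64_t exec)
{
    return vsrc0[first_active_lane(exec)];
}
```

For example, with `exec = 0x8` only lane 3 is active, so the model returns `vsrc0[3]`.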

#### V_RNDNE_F32

Opcode VOP1: 35 (0x23) for GCN 1.0/1.1; 30 (0x1e) for GCN 1.2  
Opcode VOP3A: 419 (0x1a3) for GCN 1.0/1.1; 350 (0x15e) for GCN 1.2  
Syntax: V_RNDNE_F32 VDST, SRC0  
Description: Round floating point value SRC0 to the nearest integer, with ties rounded to
even, and store the result to VDST. If SRC0 is infinity or NaN then copy SRC0 to VDST.  
Operation:  
```
VDST = RNDNE(ASFLOAT(SRC0))
```

#### V_RNDNE_F64

Opcode VOP1: 25 (0x19) for GCN 1.1/1.2  
Opcode VOP3A: 409 (0x199) for GCN 1.1; 345 (0x159) for GCN 1.2  
Syntax: V_RNDNE_F64 VDST(2), SRC0(2)  
Description: Round double floating point value SRC0 to the nearest integer, with ties rounded
to even, and store the result to VDST. If SRC0 is infinity or NaN then copy SRC0 to VDST.  
Operation:  
```
VDST = RNDNE(ASDOUBLE(SRC0))
```

#### V_RSQ_CLAMP_F32

Opcode VOP1: 44 (0x2c) for GCN 1.0/1.1  
Opcode VOP3A: 428 (0x1ac) for GCN 1.0/1.1  
Syntax: V_RSQ_CLAMP_F32 VDST, SRC0  
Description: Approximate the reciprocal square root of the floating point value SRC0 with
clamping to MAX_FLOAT, and store the result to VDST.
If SRC0 is a negative value, store -NAN to VDST.
This instruction doesn't handle denormalized values, regardless of the FLOAT MODE register setup.  
Operation:  
```
VDST = APPROX_RSQRT(ASFLOAT(SRC0))
if (ASFLOAT(VDST)==INF)
    VDST = MAX_FLOAT
```

#### V_RSQ_CLAMP_F64

Opcode VOP1: 50 (0x32) for GCN 1.0/1.1  
Opcode VOP3A: 434 (0x1b2) for GCN 1.0/1.1  
Syntax: V_RSQ_CLAMP_F64 VDST(2), SRC0(2)  
Description: Approximate the reciprocal square root of the double floating point value SRC0
with clamping to MAX_DOUBLE, and store it to VDST. If SRC0 is a negative value,
store -NAN to VDST.  
Operation:  
```
VDST = APPROX_RSQRT(ASDOUBLE(SRC0))
if (ASDOUBLE(VDST)==INF)
    VDST = MAX_DOUBLE
```

#### V_RSQ_F32

Opcode VOP1: 46 (0x2e) for GCN 1.0/1.1; 36 (0x24) for GCN 1.2  
Opcode VOP3A: 430 (0x1ae) for GCN 1.0/1.1; 356 (0x164) for GCN 1.2  
Syntax: V_RSQ_F32 VDST, SRC0  
Description: Approximate the reciprocal square root of the floating point value SRC0 and
store it to VDST. If SRC0 is a negative value, store -NAN to VDST.
This instruction doesn't handle denormalized values, regardless of the FLOAT MODE register setup.  
Operation:  
```
VDST = APPROX_RSQRT(ASFLOAT(SRC0))
```

#### V_RSQ_F64

Opcode VOP1: 49 (0x31) for GCN 1.0/1.1; 38 (0x26) for GCN 1.2  
Opcode VOP3A: 433 (0x1b1) for GCN 1.0/1.1; 358 (0x166) for GCN 1.2  
Syntax: V_RSQ_F64 VDST(2), SRC0(2)  
Description: Approximate the reciprocal square root of the double floating point value SRC0
and store it to VDST. If SRC0 is a negative value, store -NAN to VDST.  
Operation:  
```
VDST = APPROX_RSQRT(ASDOUBLE(SRC0))
```

#### V_RSQ_LEGACY_F32

Opcode VOP1: 45 (0x2d) for GCN 1.0/1.1  
Opcode VOP3A: 429 (0x1ad) for GCN 1.0/1.1  
Syntax: V_RSQ_LEGACY_F32 VDST, SRC0  
Description: Approximate the reciprocal square root of the floating point value SRC0,
and store the result to VDST. If SRC0 is a negative value, store -NAN to VDST.
If the result is infinity then store 0.0 to VDST.
This instruction doesn't handle denormalized values, regardless of the FLOAT MODE register setup.  
Operation:  
```
VDST = APPROX_RSQRT(ASFLOAT(SRC0))
if (ASFLOAT(VDST)==INF)
    VDST = 0.0
```

#### V_SIN_F32

Opcode VOP1: 53 (0x35) for GCN 1.0/1.1; 41 (0x29) for GCN 1.2  
Opcode VOP3A: 437 (0x1b5) for GCN 1.0/1.1; 361 (0x169) for GCN 1.2  
Syntax: V_SIN_F32 VDST, SRC0  
Description: Compute the sine of the FP value from SRC0. The input value must be normalized
to the range -1.0 : 1.0 (-360 degrees : 360 degrees). If the SRC0 value is out of range then
store 0.0 to VDST. If the SRC0 value is infinity, store -NAN to VDST.  
Operation:  
```
FLOAT SF = ASFLOAT(SRC0)
VDST = 0.0
if (SF >= -1.0 && SF <= 1.0)
    VDST = APPROX_SIN(SF)
else if (ABS(SF)==INF)
    VDST = -NAN
else if (ISNAN(SF))
    VDST = SRC0
```
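
The range statement above maps an input of 1.0 to 360 degrees, which suggests that the input is expressed in full turns rather than radians. Under that reading (an interpretation of the description, not a verified hardware fact), a caller would scale an angle before passing it to the instruction. The C sketch below shows the scaling and the range check from the pseudocode. Both function names are hypothetical.

```c
#include <stdbool.h>

/* Convert an angle in degrees to the normalized input (1.0 == 360 degrees). */
static float degrees_to_v_sin_input(float degrees)
{
    return degrees / 360.0f;
}

/* Range check from the pseudocode: inputs outside -1.0..1.0 produce 0.0. */
static bool v_sin_input_in_range(float sf)
{
    return sf >= -1.0f && sf <= 1.0f;
}
```

For instance, 90 degrees becomes 0.25 and passes the range check, while 720 degrees becomes 2.0 and would fall into the out-of-range case.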

#### V_SQRT_F32

Opcode VOP1: 51 (0x33) for GCN 1.0/1.1; 39 (0x27) for GCN 1.2  
Opcode VOP3A: 435 (0x1b3) for GCN 1.0/1.1; 359 (0x167) for GCN 1.2  
Syntax: V_SQRT_F32 VDST, SRC0  
Description: Compute the square root of the floating point value SRC0, and store the result
to VDST. If SRC0 is a negative value then store -NaN to VDST.  
Operation:  
```
if (ASFLOAT(SRC0)>=0.0)
    VDST = APPROX_SQRT(ASFLOAT(SRC0))
else
    VDST = -NAN
```

#### V_SQRT_F64

Opcode VOP1: 52 (0x34) for GCN 1.0/1.1; 40 (0x28) for GCN 1.2  
Opcode VOP3A: 436 (0x1b4) for GCN 1.0/1.1; 360 (0x168) for GCN 1.2  
Syntax: V_SQRT_F64 VDST(2), SRC0(2)  
Description: Compute the square root of the double floating point value SRC0, and store the
result to VDST. Relative error of approximation is ~1e-8.
If SRC0 is a negative value then store -NaN to VDST.  
Operation:  
```
if (ASDOUBLE(SRC0)>=0.0)
    VDST = APPROX_SQRT(ASDOUBLE(SRC0))
else
    VDST = -NAN
```

#### V_TRUNC_F32

Opcode VOP1: 33 (0x21) for GCN 1.0/1.1; 28 (0x1c) for GCN 1.2  
Opcode VOP3A: 417 (0x1a1) for GCN 1.0/1.1; 348 (0x15c) for GCN 1.2  
Syntax: V_TRUNC_F32 VDST, SRC0  
Description: Truncate the floating point value SRC0 to its integer part (rounding toward zero),
and store it (as float) to VDST. If SRC0 is infinity or NaN then copy SRC0 to VDST.  
Operation:  
```
VDST = RNDTZ(ASFLOAT(SRC0))
```

#### V_TRUNC_F64

Opcode VOP1: 23 (0x17) for GCN 1.1/1.2  
Opcode VOP3A: 407 (0x197) for GCN 1.1; 343 (0x157) for GCN 1.2  
Syntax: V_TRUNC_F64 VDST(2), SRC0(2)  
Description: Truncate the double floating point value SRC0 to its integer part (rounding toward
zero), and store it (as double) to VDST. If SRC0 is infinity or NaN then copy SRC0 to VDST.  
Operation:  
```
VDST = RNDTZ(ASDOUBLE(SRC0))
```