source: CLRX/CLRadeonExtender/trunk/doc/GcnInstrsVop1.md @ 2468

Last change on this file since 2468 was 2468, checked in by matszpk, 3 years ago

CLRadeonExtender: ClrxDoc?: update doc.

## GCN ISA VOP1/VOP3 instructions

VOP1 instructions can be encoded in the VOP1 encoding and the VOP3A/VOP3B encoding.
List of fields for VOP1 encoding:

Bits  | Name     | Description
------|----------|------------------------------
0-8   | SRC0     | First (scalar or vector) source operand
9-16  | OPCODE   | Operation code
17-24 | VDST     | Destination vector operand
25-31 | ENCODING | Encoding type. Must be 0b0111111

Syntax: INSTRUCTION VDST, SRC0

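As a rough illustration of the VOP1 field layout above, the following Python sketch packs the four fields into one 32-bit word. The helper name `encode_vop1` is hypothetical, and the example assumes the usual GCN operand numbering in which VGPR `vN` is encoded in SRC0 as 256+N; that numbering is not stated in the table itself.

```python
VOP1_ENCODING = 0b0111111  # bits 25-31, taken from the field table

def encode_vop1(opcode: int, vdst: int, src0: int) -> int:
    """Pack VOP1 fields into a 32-bit instruction word (hypothetical helper)."""
    if not 0 <= src0 < (1 << 9):
        raise ValueError("SRC0 is a 9-bit field (bits 0-8)")
    if not 0 <= opcode < (1 << 8):
        raise ValueError("OPCODE is an 8-bit field (bits 9-16)")
    if not 0 <= vdst < (1 << 8):
        raise ValueError("VDST is an 8-bit field (bits 17-24)")
    return (VOP1_ENCODING << 25) | (vdst << 17) | (opcode << 9) | src0

# V_MOV_B32 (opcode 1) with VDST = v0 and SRC0 = v1.
# Assumption: VGPR vN is encoded in SRC0 as 256 + N.
word = encode_vop1(opcode=1, vdst=0, src0=256 + 1)
print(hex(word))  # prints 0x7e000301
```

The range checks only mirror the field widths from the table; they do not validate whether a given opcode exists on a particular GCN generation.
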
List of fields for VOP3A/VOP3B encoding (GCN 1.0/1.1):

Bits  | Name     | Description
------|----------|------------------------------
0-7   | VDST     | Vector destination operand
8-10  | ABS      | Absolute modifiers for source operands (VOP3A)
8-14  | SDST     | Scalar destination operand (VOP3B)
11    | CLAMP    | CLAMP modifier (VOP3A)
15    | CLAMP    | CLAMP modifier (VOP3B)
17-25 | OPCODE   | Operation code
26-31 | ENCODING | Encoding type. Must be 0b110100
32-40 | SRC0     | First (scalar or vector) source operand
41-49 | SRC1     | Second (scalar or vector) source operand
50-58 | SRC2     | Third (scalar or vector) source operand
59-60 | OMOD     | OMOD modifier. Multiplication modifier
61-63 | NEG      | Negation modifier for source operands

List of fields for VOP3A/VOP3B encoding (GCN 1.2):

Bits  | Name     | Description
------|----------|------------------------------
0-7   | VDST     | Destination vector operand
8-10  | ABS      | Absolute modifiers for source operands (VOP3A)
8-14  | SDST     | Scalar destination operand (VOP3B)
15    | CLAMP    | CLAMP modifier
16-25 | OPCODE   | Operation code
26-31 | ENCODING | Encoding type. Must be 0b110100
32-40 | SRC0     | First (scalar or vector) source operand
41-49 | SRC1     | Second (scalar or vector) source operand
50-58 | SRC2     | Third (scalar or vector) source operand
59-60 | OMOD     | OMOD modifier. Multiplication modifier
61-63 | NEG      | Negation modifier for source operands

Syntax: INSTRUCTION VDST, SRC0 [MODIFIERS]

Modifiers:

* CLAMP - clamps destination floating point value to the range 0.0-1.0
* MUL:2, MUL:4, DIV:2 - OMOD modifiers. Multiply destination floating point value by
2.0, 4.0 or 0.5 respectively. Clamping is applied after the OMOD modifier.
* -SRC - negate floating point value from source operand. Applied after the ABS modifier.
* ABS(SRC), |SRC| - apply absolute value to source operand

NOTE: The OMOD modifier doesn't work if output denormals are allowed
(bit 5 of the MODE register for single precision or bit 7 for double precision).  
NOTE: The OMOD and CLAMP modifiers affect only instructions whose output is a
floating point value.  
NOTE: ABS and negation are applied to the source operand for any instruction.

Negation and absolute value can be combined: `-ABS(V0)`. Modifiers CLAMP and
OMOD (MUL:2, MUL:4 and DIV:2) can be given in any order.

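The ordering rules above (ABS before negation on inputs, OMOD before CLAMP on outputs) can be sketched as a small Python model. The function names are hypothetical, Python floats stand in for the hardware formats, and NaN and denormal handling are deliberately left out.

```python
def apply_input_modifiers(value: float, use_abs: bool = False,
                          negate: bool = False) -> float:
    """Model of source modifiers: ABS is applied first, negation second."""
    if use_abs:
        value = abs(value)
    if negate:
        value = -value
    return value

def apply_output_modifiers(value: float, omod: float = 1.0,
                           clamp: bool = False) -> float:
    """Model of destination modifiers: OMOD is applied first, CLAMP second."""
    if omod not in (1.0, 2.0, 4.0, 0.5):
        raise ValueError("OMOD must be 1.0 (none), 2.0, 4.0 or 0.5")
    result = value * omod
    if clamp:
        result = min(max(result, 0.0), 1.0)
    return result

# -ABS(V0) with V0 = 3.0 gives -3.0, because ABS runs before negation.
print(apply_input_modifiers(3.0, use_abs=True, negate=True))
# 0.3 with MUL:4 and CLAMP gives 1.0, because 1.2 is clamped afterwards.
print(apply_output_modifiers(0.3, omod=4.0, clamp=True))
```

Swapping the two output steps would give a different result for the second call (0.3 clamped, then multiplied by 4), which is why the order matters.
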
Limitations for operands:

* only one SGPR can be read by an instruction. Multiple occurrences of the same
SGPR are allowed
* only one literal constant can be used, and only when an SGPR or M0 is not used in
source operands
* only SRC0 can hold LDS_DIRECT

Unaligned pairs of SGPRs are allowed in source operands.

VOP1 opcodes (0-127) are reflected in VOP3 in range: 384-511 for GCN 1.0/1.1 or
320-447 for GCN 1.2.

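The opcode ranges above amount to adding a fixed base to the VOP1 opcode. A minimal Python sketch, with a hypothetical helper name and hypothetical architecture keys:

```python
# Base of the VOP1 range inside the VOP3 opcode space, per the ranges above.
VOP3_BASE = {"GCN1.0": 384, "GCN1.1": 384, "GCN1.2": 320}

def vop1_to_vop3_opcode(vop1_opcode: int, arch: str) -> int:
    """Map a VOP1 opcode (0-127) to its VOP3 opcode for the given architecture."""
    if not 0 <= vop1_opcode <= 127:
        raise ValueError("VOP1 opcode must be in range 0-127")
    return VOP3_BASE[arch] + vop1_opcode

# V_BFREV_B32: VOP1 opcode 56 on GCN 1.0/1.1 and 44 on GCN 1.2.
print(vop1_to_vop3_opcode(56, "GCN1.1"))  # prints 440
print(vop1_to_vop3_opcode(44, "GCN1.2"))  # prints 364
```

This is a quick way to cross-check the VOP3 column of the opcode tables against the VOP1 column.
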
List of the instructions by opcode (GCN 1.0/1.1):

 Opcode     | Opcode(VOP3)|GCN 1.0|GCN 1.1| Mnemonic
------------|-------------|-------|-------|-----------------------------
 0 (0x0)    | 384 (0x180) |   ✓   |   ✓   | V_NOP
 1 (0x1)    | 385 (0x181) |   ✓   |   ✓   | V_MOV_B32
 2 (0x2)    | 386 (0x182) |   ✓   |   ✓   | V_READFIRSTLANE_B32
 3 (0x3)    | 387 (0x183) |   ✓   |   ✓   | V_CVT_I32_F64
 4 (0x4)    | 388 (0x184) |   ✓   |   ✓   | V_CVT_F64_I32
 5 (0x5)    | 389 (0x185) |   ✓   |   ✓   | V_CVT_F32_I32
 6 (0x6)    | 390 (0x186) |   ✓   |   ✓   | V_CVT_F32_U32
 7 (0x7)    | 391 (0x187) |   ✓   |   ✓   | V_CVT_U32_F32
 8 (0x8)    | 392 (0x188) |   ✓   |   ✓   | V_CVT_I32_F32
 9 (0x9)    | 393 (0x189) |   ✓   |   ✓   | V_MOV_FED_B32
 10 (0xa)   | 394 (0x18a) |   ✓   |   ✓   | V_CVT_F16_F32
 11 (0xb)   | 395 (0x18b) |   ✓   |   ✓   | V_CVT_F32_F16
 12 (0xc)   | 396 (0x18c) |   ✓   |   ✓   | V_CVT_RPI_I32_F32
 13 (0xd)   | 397 (0x18d) |   ✓   |   ✓   | V_CVT_FLR_I32_F32
 14 (0xe)   | 398 (0x18e) |   ✓   |   ✓   | V_CVT_OFF_F32_I4
 15 (0xf)   | 399 (0x18f) |   ✓   |   ✓   | V_CVT_F32_F64
 16 (0x10)  | 400 (0x190) |   ✓   |   ✓   | V_CVT_F64_F32
 17 (0x11)  | 401 (0x191) |   ✓   |   ✓   | V_CVT_F32_UBYTE0
 18 (0x12)  | 402 (0x192) |   ✓   |   ✓   | V_CVT_F32_UBYTE1
 19 (0x13)  | 403 (0x193) |   ✓   |   ✓   | V_CVT_F32_UBYTE2
 20 (0x14)  | 404 (0x194) |   ✓   |   ✓   | V_CVT_F32_UBYTE3
 21 (0x15)  | 405 (0x195) |   ✓   |   ✓   | V_CVT_U32_F64
 22 (0x16)  | 406 (0x196) |   ✓   |   ✓   | V_CVT_F64_U32
 23 (0x17)  | 407 (0x197) |       |   ✓   | V_TRUNC_F64
 24 (0x18)  | 408 (0x198) |       |   ✓   | V_CEIL_F64
 25 (0x19)  | 409 (0x199) |       |   ✓   | V_RNDNE_F64
 26 (0x1a)  | 410 (0x19a) |       |   ✓   | V_FLOOR_F64
 32 (0x20)  | 416 (0x1a0) |   ✓   |   ✓   | V_FRACT_F32
 33 (0x21)  | 417 (0x1a1) |   ✓   |   ✓   | V_TRUNC_F32
 34 (0x22)  | 418 (0x1a2) |   ✓   |   ✓   | V_CEIL_F32
 35 (0x23)  | 419 (0x1a3) |   ✓   |   ✓   | V_RNDNE_F32
 36 (0x24)  | 420 (0x1a4) |   ✓   |   ✓   | V_FLOOR_F32
 37 (0x25)  | 421 (0x1a5) |   ✓   |   ✓   | V_EXP_F32
 38 (0x26)  | 422 (0x1a6) |   ✓   |   ✓   | V_LOG_CLAMP_F32
 39 (0x27)  | 423 (0x1a7) |   ✓   |   ✓   | V_LOG_F32
 40 (0x28)  | 424 (0x1a8) |   ✓   |   ✓   | V_RCP_CLAMP_F32
 41 (0x29)  | 425 (0x1a9) |   ✓   |   ✓   | V_RCP_LEGACY_F32
 42 (0x2a)  | 426 (0x1aa) |   ✓   |   ✓   | V_RCP_F32
 43 (0x2b)  | 427 (0x1ab) |   ✓   |   ✓   | V_RCP_IFLAG_F32
 44 (0x2c)  | 428 (0x1ac) |   ✓   |   ✓   | V_RSQ_CLAMP_F32
 45 (0x2d)  | 429 (0x1ad) |   ✓   |   ✓   | V_RSQ_LEGACY_F32
 46 (0x2e)  | 430 (0x1ae) |   ✓   |   ✓   | V_RSQ_F32
 47 (0x2f)  | 431 (0x1af) |   ✓   |   ✓   | V_RCP_F64
 48 (0x30)  | 432 (0x1b0) |   ✓   |   ✓   | V_RCP_CLAMP_F64
 49 (0x31)  | 433 (0x1b1) |   ✓   |   ✓   | V_RSQ_F64
 50 (0x32)  | 434 (0x1b2) |   ✓   |   ✓   | V_RSQ_CLAMP_F64
 51 (0x33)  | 435 (0x1b3) |   ✓   |   ✓   | V_SQRT_F32
 52 (0x34)  | 436 (0x1b4) |   ✓   |   ✓   | V_SQRT_F64
 53 (0x35)  | 437 (0x1b5) |   ✓   |   ✓   | V_SIN_F32
 54 (0x36)  | 438 (0x1b6) |   ✓   |   ✓   | V_COS_F32
 55 (0x37)  | 439 (0x1b7) |   ✓   |   ✓   | V_NOT_B32
 56 (0x38)  | 440 (0x1b8) |   ✓   |   ✓   | V_BFREV_B32
 57 (0x39)  | 441 (0x1b9) |   ✓   |   ✓   | V_FFBH_U32
 58 (0x3a)  | 442 (0x1ba) |   ✓   |   ✓   | V_FFBL_B32
 59 (0x3b)  | 443 (0x1bb) |   ✓   |   ✓   | V_FFBH_I32
 60 (0x3c)  | 444 (0x1bc) |   ✓   |   ✓   | V_FREXP_EXP_I32_F64
 61 (0x3d)  | 445 (0x1bd) |   ✓   |   ✓   | V_FREXP_MANT_F64
 62 (0x3e)  | 446 (0x1be) |   ✓   |   ✓   | V_FRACT_F64
 63 (0x3f)  | 447 (0x1bf) |   ✓   |   ✓   | V_FREXP_EXP_I32_F32
 64 (0x40)  | 448 (0x1c0) |   ✓   |   ✓   | V_FREXP_MANT_F32
 65 (0x41)  | 449 (0x1c1) |   ✓   |   ✓   | V_CLREXCP
 66 (0x42)  | 450 (0x1c2) |   ✓   |   ✓   | V_MOVRELD_B32
 67 (0x43)  | 451 (0x1c3) |   ✓   |   ✓   | V_MOVRELS_B32
 68 (0x44)  | 452 (0x1c4) |   ✓   |   ✓   | V_MOVRELSD_B32
 69 (0x45)  | 453 (0x1c5) |       |   ✓   | V_LOG_LEGACY_F32
 70 (0x46)  | 454 (0x1c6) |       |   ✓   | V_EXP_LEGACY_F32

List of the instructions by opcode (GCN 1.2):

 Opcode     | Opcode(VOP3)| Mnemonic
------------|-------------|-----------------------------
 0 (0x0)    | 320 (0x140) | V_NOP
 1 (0x1)    | 321 (0x141) | V_MOV_B32
 2 (0x2)    | 322 (0x142) | V_READFIRSTLANE_B32
 3 (0x3)    | 323 (0x143) | V_CVT_I32_F64
 4 (0x4)    | 324 (0x144) | V_CVT_F64_I32
 5 (0x5)    | 325 (0x145) | V_CVT_F32_I32
 6 (0x6)    | 326 (0x146) | V_CVT_F32_U32
 7 (0x7)    | 327 (0x147) | V_CVT_U32_F32
 8 (0x8)    | 328 (0x148) | V_CVT_I32_F32
 9 (0x9)    | 329 (0x149) | V_MOV_FED_B32
 10 (0xa)   | 330 (0x14a) | V_CVT_F16_F32
 11 (0xb)   | 331 (0x14b) | V_CVT_F32_F16
 12 (0xc)   | 332 (0x14c) | V_CVT_RPI_I32_F32
 13 (0xd)   | 333 (0x14d) | V_CVT_FLR_I32_F32
 14 (0xe)   | 334 (0x14e) | V_CVT_OFF_F32_I4
 15 (0xf)   | 335 (0x14f) | V_CVT_F32_F64
 16 (0x10)  | 336 (0x150) | V_CVT_F64_F32
 17 (0x11)  | 337 (0x151) | V_CVT_F32_UBYTE0
 18 (0x12)  | 338 (0x152) | V_CVT_F32_UBYTE1
 19 (0x13)  | 339 (0x153) | V_CVT_F32_UBYTE2
 20 (0x14)  | 340 (0x154) | V_CVT_F32_UBYTE3
 21 (0x15)  | 341 (0x155) | V_CVT_U32_F64
 22 (0x16)  | 342 (0x156) | V_CVT_F64_U32
 23 (0x17)  | 343 (0x157) | V_TRUNC_F64
 24 (0x18)  | 344 (0x158) | V_CEIL_F64
 25 (0x19)  | 345 (0x159) | V_RNDNE_F64
 26 (0x1a)  | 346 (0x15a) | V_FLOOR_F64
 27 (0x1b)  | 347 (0x15b) | V_FRACT_F32
 28 (0x1c)  | 348 (0x15c) | V_TRUNC_F32
 29 (0x1d)  | 349 (0x15d) | V_CEIL_F32
 30 (0x1e)  | 350 (0x15e) | V_RNDNE_F32
 31 (0x1f)  | 351 (0x15f) | V_FLOOR_F32
 32 (0x20)  | 352 (0x160) | V_EXP_F32
 33 (0x21)  | 353 (0x161) | V_LOG_F32
 34 (0x22)  | 354 (0x162) | V_RCP_F32
 35 (0x23)  | 355 (0x163) | V_RCP_IFLAG_F32
 36 (0x24)  | 356 (0x164) | V_RSQ_F32
 37 (0x25)  | 357 (0x165) | V_RCP_F64
 38 (0x26)  | 358 (0x166) | V_RSQ_F64
 39 (0x27)  | 359 (0x167) | V_SQRT_F32
 40 (0x28)  | 360 (0x168) | V_SQRT_F64
 41 (0x29)  | 361 (0x169) | V_SIN_F32
 42 (0x2a)  | 362 (0x16a) | V_COS_F32
 43 (0x2b)  | 363 (0x16b) | V_NOT_B32
 44 (0x2c)  | 364 (0x16c) | V_BFREV_B32
 45 (0x2d)  | 365 (0x16d) | V_FFBH_U32
 46 (0x2e)  | 366 (0x16e) | V_FFBL_B32
 47 (0x2f)  | 367 (0x16f) | V_FFBH_I32
 48 (0x30)  | 368 (0x170) | V_FREXP_EXP_I32_F64
 49 (0x31)  | 369 (0x171) | V_FREXP_MANT_F64
 50 (0x32)  | 370 (0x172) | V_FRACT_F64
 51 (0x33)  | 371 (0x173) | V_FREXP_EXP_I32_F32
 52 (0x34)  | 372 (0x174) | V_FREXP_MANT_F32
 53 (0x35)  | 373 (0x175) | V_CLREXCP
 54 (0x36)  | 374 (0x176) | V_MOVRELD_B32
 55 (0x37)  | 375 (0x177) | V_MOVRELS_B32
 56 (0x38)  | 376 (0x178) | V_MOVRELSD_B32
 57 (0x39)  | 377 (0x179) | V_CVT_F16_U16
 58 (0x3a)  | 378 (0x17a) | V_CVT_F16_I16
 59 (0x3b)  | 379 (0x17b) | V_CVT_U16_F16
 60 (0x3c)  | 380 (0x17c) | V_CVT_I16_F16
 61 (0x3d)  | 381 (0x17d) | V_RCP_F16
 62 (0x3e)  | 382 (0x17e) | V_SQRT_F16
 63 (0x3f)  | 383 (0x17f) | V_RSQ_F16
 64 (0x40)  | 384 (0x180) | V_LOG_F16
 65 (0x41)  | 385 (0x181) | V_EXP_F16
 66 (0x42)  | 386 (0x182) | V_FREXP_MANT_F16
 67 (0x43)  | 387 (0x183) | V_FREXP_EXP_I16_F16
 68 (0x44)  | 388 (0x184) | V_FLOOR_F16
 69 (0x45)  | 389 (0x185) | V_CEIL_F16
 70 (0x46)  | 390 (0x186) | V_TRUNC_F16
 71 (0x47)  | 391 (0x187) | V_RNDNE_F16
 72 (0x48)  | 392 (0x188) | V_FRACT_F16
 73 (0x49)  | 393 (0x189) | V_SIN_F16
 74 (0x4a)  | 394 (0x18a) | V_COS_F16
 75 (0x4b)  | 395 (0x18b) | V_EXP_LEGACY_F32
 76 (0x4c)  | 396 (0x18c) | V_LOG_LEGACY_F32

### Instruction set

Alphabetically sorted instruction list:

#### V_BFREV_B32

Opcode VOP1: 56 (0x38) for GCN 1.0/1.1; 44 (0x2c) for GCN 1.2  
Opcode VOP3A: 440 (0x1b8) for GCN 1.0/1.1; 364 (0x16c) for GCN 1.2  
Syntax: V_BFREV_B32 VDST, SRC0  
Description: Reverse bits in SRC0 and store result to VDST.  
Operation:  
```
VDST = REVBIT(SRC0)
```

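For readers unfamiliar with `REVBIT`, here is a small Python sketch of a 32-bit bit reversal. The function name `bfrev_b32` is hypothetical and only models the operation on an integer.

```python
def bfrev_b32(src0: int) -> int:
    """Reverse the order of the 32 bits of src0 (bit 0 becomes bit 31, and so on)."""
    src0 &= 0xFFFFFFFF
    result = 0
    for i in range(32):
        if src0 & (1 << i):
            result |= 1 << (31 - i)
    return result

print(hex(bfrev_b32(0x00000001)))  # prints 0x80000000
print(hex(bfrev_b32(0x0000FFFF)))  # prints 0xffff0000
```

Applying the function twice returns the original value, which is a convenient sanity check.
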
#### V_CEIL_F32

Opcode VOP1: 34 (0x22) for GCN 1.0/1.1; 29 (0x1d) for GCN 1.2  
Opcode VOP3A: 418 (0x1a2) for GCN 1.0/1.1; 349 (0x15d) for GCN 1.2  
Syntax: V_CEIL_F32 VDST, SRC0  
Description: Truncate floating point value from SRC0 with rounding to positive infinity
(ceiling), and store result to VDST. Implemented by flooring.
If SRC0 is infinity or NaN then copy SRC0 to VDST.  
Operation:  
```
FLOAT F = FLOOR(ASFLOAT(SRC0))
if (ASFLOAT(SRC0) != F)
    F += 1.0
VDST = F
```

#### V_CEIL_F64

Opcode VOP1: 24 (0x18) for GCN 1.1/1.2  
Opcode VOP3A: 408 (0x198) for GCN 1.1; 344 (0x158) for GCN 1.2  
Syntax: V_CEIL_F64 VDST(2), SRC0(2)  
Description: Truncate double floating point value from SRC0 with rounding to
positive infinity (ceiling), and store result to VDST. Implemented by flooring.
If SRC0 is infinity or NaN then copy SRC0 to VDST.  
Operation:  
```
DOUBLE F = FLOOR(ASDOUBLE(SRC0))
if (ASDOUBLE(SRC0) != F)
    F += 1.0
VDST = F
```

#### V_CLREXCP

Opcode VOP1: 65 (0x41) for GCN 1.0/1.1; 53 (0x35) for GCN 1.2  
Opcode VOP3A: 449 (0x1c1) for GCN 1.0/1.1; 373 (0x175) for GCN 1.2  
Syntax: V_CLREXCP  
Description: Clear wave's exception state in SIMD.  

#### V_COS_F32

Opcode VOP1: 54 (0x36) for GCN 1.0/1.1; 42 (0x2a) for GCN 1.2  
Opcode VOP3A: 438 (0x1b6) for GCN 1.0/1.1; 362 (0x16a) for GCN 1.2  
Syntax: V_COS_F32 VDST, SRC0  
Description: Compute cosine of FP value from SRC0. Input value must be normalized to range
-1.0 to 1.0 (-360 degrees to 360 degrees). If SRC0 value is out of range then store 1.0 to VDST.
If SRC0 value is infinity, store -NAN to VDST.  
Operation:  
```
FLOAT SF = ASFLOAT(SRC0)
VDST = 1.0
if (SF >= -1.0 && SF <= 1.0)
    VDST = APPROX_COS(SF)
else if (ABS(SF)==INF)
    VDST = -NAN
else if (ISNAN(SF))
    VDST = SRC0
```

#### V_CVT_F16_F32

Opcode VOP1: 10 (0xa)  
Opcode VOP3A: 394 (0x18a) for GCN 1.0/1.1; 330 (0x14a) for GCN 1.2  
Syntax: V_CVT_F16_F32 VDST, SRC0  
Description: Convert single FP value to half floating point value with rounding from
MODE register (single FP rounding mode), and store result to VDST.
If absolute value is too high, then store -/+infinity to VDST.  
Operation:  
```
VDST = CVTHALF(ASFLOAT(SRC0))
```

#### V_CVT_F32_F16

Opcode VOP1: 11 (0xb)  
Opcode VOP3A: 395 (0x18b) for GCN 1.0/1.1; 331 (0x14b) for GCN 1.2  
Syntax: V_CVT_F32_F16 VDST, SRC0  
Description: Convert half FP value to single FP value, and store result to VDST.
**By default, immediate is in FP32 format!**  
Operation:  
```
VDST = (FLOAT)(ASHALF(SRC0))
```

#### V_CVT_F32_F64

Opcode VOP1: 15 (0xf)  
Opcode VOP3A: 399 (0x18f) for GCN 1.0/1.1; 335 (0x14f) for GCN 1.2  
Syntax: V_CVT_F32_F64 VDST, SRC0(2)  
Description: Convert double FP value to single floating point value with rounding from
MODE register (single FP rounding mode), and store result to VDST.
If absolute value is too high, then store -/+infinity to VDST.  
Operation:  
```
VDST = (FLOAT)(ASDOUBLE(SRC0))
```

#### V_CVT_F32_I32

Opcode VOP1: 5 (0x5)  
Opcode VOP3A: 389 (0x185) for GCN 1.0/1.1; 325 (0x145) for GCN 1.2  
Syntax: V_CVT_F32_I32 VDST, SRC0  
Description: Convert signed 32-bit integer to single FP value, and store it to VDST.  
Operation:  
```
VDST = (FLOAT)(INT32)SRC0
```

#### V_CVT_F32_U32

Opcode VOP1: 6 (0x6)  
Opcode VOP3A: 390 (0x186) for GCN 1.0/1.1; 326 (0x146) for GCN 1.2  
Syntax: V_CVT_F32_U32 VDST, SRC0  
Description: Convert unsigned 32-bit integer to single FP value, and store it to VDST.  
Operation:  
```
VDST = (FLOAT)SRC0
```

#### V_CVT_F32_UBYTE0

Opcode VOP1: 17 (0x11)  
Opcode VOP3A: 401 (0x191) for GCN 1.0/1.1; 337 (0x151) for GCN 1.2  
Syntax: V_CVT_F32_UBYTE0 VDST, SRC0  
Description: Convert the first unsigned 8-bit byte from SRC0 to single FP value,
and store it to VDST.  
Operation:  
```
VDST = (FLOAT)(SRC0 & 0xff)
```

#### V_CVT_F32_UBYTE1

Opcode VOP1: 18 (0x12)  
Opcode VOP3A: 402 (0x192) for GCN 1.0/1.1; 338 (0x152) for GCN 1.2  
Syntax: V_CVT_F32_UBYTE1 VDST, SRC0  
Description: Convert the second unsigned 8-bit byte from SRC0 to single FP value,
and store it to VDST.  
Operation:  
```
VDST = (FLOAT)((SRC0>>8) & 0xff)
```

#### V_CVT_F32_UBYTE2

Opcode VOP1: 19 (0x13)  
Opcode VOP3A: 403 (0x193) for GCN 1.0/1.1; 339 (0x153) for GCN 1.2  
Syntax: V_CVT_F32_UBYTE2 VDST, SRC0  
Description: Convert the third unsigned 8-bit byte from SRC0 to single FP value,
and store it to VDST.  
Operation:  
```
VDST = (FLOAT)((SRC0>>16) & 0xff)
```

#### V_CVT_F32_UBYTE3

Opcode VOP1: 20 (0x14)  
Opcode VOP3A: 404 (0x194) for GCN 1.0/1.1; 340 (0x154) for GCN 1.2  
Syntax: V_CVT_F32_UBYTE3 VDST, SRC0  
Description: Convert the fourth unsigned 8-bit byte from SRC0 to single FP value,
and store it to VDST.  
Operation:  
```
VDST = (FLOAT)(SRC0>>24)
```

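The four V_CVT_F32_UBYTE* operations differ only in which byte they select. A compact Python sketch, using a hypothetical helper `cvt_f32_ubyte` whose `index` argument plays the role of the digit in the mnemonic:

```python
def cvt_f32_ubyte(src0: int, index: int) -> float:
    """Model of V_CVT_F32_UBYTE<index>: pick byte `index` of src0 and convert to float."""
    if index not in (0, 1, 2, 3):
        raise ValueError("byte index must be 0, 1, 2 or 3")
    src0 &= 0xFFFFFFFF
    return float((src0 >> (8 * index)) & 0xFF)

packed = 0x44332211  # bytes 0x11, 0x22, 0x33, 0x44 from lowest to highest
print([cvt_f32_ubyte(packed, i) for i in range(4)])  # prints [17.0, 34.0, 51.0, 68.0]
```

This pattern is typical for unpacking four 8-bit channels stored in one dword.
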
#### V_CVT_F64_F32

Opcode VOP1: 16 (0x10)  
Opcode VOP3A: 400 (0x190) for GCN 1.0/1.1; 336 (0x150) for GCN 1.2  
Syntax: V_CVT_F64_F32 VDST(2), SRC0  
Description: Convert single FP value to double FP value, and store result to VDST.  
Operation:  
```
VDST = (DOUBLE)(ASFLOAT(SRC0))
```

#### V_CVT_F64_I32

Opcode VOP1: 4 (0x4)  
Opcode VOP3A: 388 (0x184) for GCN 1.0/1.1; 324 (0x144) for GCN 1.2  
Syntax: V_CVT_F64_I32 VDST(2), SRC0  
Description: Convert signed 32-bit integer to double FP value, and store it to VDST.  
Operation:  
```
VDST = (DOUBLE)(INT32)SRC0
```

#### V_CVT_F64_U32

Opcode VOP1: 22 (0x16)  
Opcode VOP3A: 406 (0x196) for GCN 1.0/1.1; 342 (0x156) for GCN 1.2  
Syntax: V_CVT_F64_U32 VDST(2), SRC0  
Description: Convert unsigned 32-bit integer to double FP value, and store it to VDST.  
Operation:  
```
VDST = (DOUBLE)SRC0
```

#### V_CVT_FLR_I32_F32

Opcode VOP1: 13 (0xd)  
Opcode VOP3A: 397 (0x18d) for GCN 1.0/1.1; 333 (0x14d) for GCN 1.2  
Syntax: V_CVT_FLR_I32_F32 VDST, SRC0  
Description: Convert 32-bit floating point value from SRC0 to signed 32-bit integer, and
store result to VDST. Conversion uses rounding to negative infinity (floor).
If value is higher/lower than maximal/minimal integer then store MAX_INT32/MIN_INT32 to VDST.
If input value is NaN/-NaN then store MAX_INT32/MIN_INT32 to VDST.  
Operation:  
```
FLOAT SF = ASFLOAT(SRC0)
if (!ISNAN(SF))
    VDST = (INT32)MAX(MIN(FLOOR(SF), 2147483647.0), -2147483648.0)
else
    VDST = (INT32)SRC0>=0 ? 2147483647 : -2147483648
```

#### V_CVT_I32_F32

Opcode VOP1: 8 (0x8)  
Opcode VOP3A: 392 (0x188) for GCN 1.0/1.1; 328 (0x148) for GCN 1.2  
Syntax: V_CVT_I32_F32 VDST, SRC0  
Description: Convert 32-bit floating point value from SRC0 to signed 32-bit integer, and
store result to VDST. Conversion uses rounding to zero. If value is higher/lower than
maximal/minimal integer then store MAX_INT32/MIN_INT32 to VDST.
If input value is NaN then store 0 to VDST.  
Operation:  
```
VDST = 0
if (!ISNAN(ASFLOAT(SRC0)))
    VDST = (INT32)MAX(MIN(RNDTZINT(ASFLOAT(SRC0)), 2147483647.0), -2147483648.0)
```

#### V_CVT_I32_F64

Opcode VOP1: 3 (0x3)  
Opcode VOP3A: 387 (0x183) for GCN 1.0/1.1; 323 (0x143) for GCN 1.2  
Syntax: V_CVT_I32_F64 VDST, SRC0(2)  
Description: Convert 64-bit floating point value from SRC0 to signed 32-bit integer, and
store result to VDST. Conversion uses rounding to zero. If value is higher/lower than
maximal/minimal integer then store MAX_INT32/MIN_INT32 to VDST.
If input value is NaN then store 0 to VDST.  
Operation:  
```
VDST = 0
if (!ISNAN(ASDOUBLE(SRC0)))
    VDST = (INT32)MAX(MIN(RNDTZINT(ASDOUBLE(SRC0)), 2147483647.0), -2147483648.0)
```

#### V_CVT_OFF_F32_I4

Opcode VOP1: 14 (0xe)  
Opcode VOP3A: 398 (0x18e) for GCN 1.0/1.1; 334 (0x14e) for GCN 1.2  
Syntax: V_CVT_OFF_F32_I4 VDST, SRC0  
Description: Convert 4-bit signed value from SRC0 to floating point value, normalize that
value to range -0.5:0.4375 and store result to VDST.  
Operation:  
```
VDST = (FLOAT)((SRC0 & 0xf) ^ 8) / 16.0 - 0.5
```

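The V_CVT_OFF_F32_I4 formula is easier to follow with concrete values. The Python sketch below transcribes the operation directly; the function name is hypothetical. XOR with 8 turns the two's complement 4-bit value into an offset in the range 0-15, so the result equals the signed value divided by 16.

```python
def cvt_off_f32_i4(src0: int) -> float:
    """Transcription of the V_CVT_OFF_F32_I4 operation for an integer src0."""
    return float((src0 & 0xF) ^ 8) / 16.0 - 0.5

# Low nibble interpreted as a signed 4-bit value: 0x8 is -8, 0xF is -1, 0x7 is +7.
for nibble in (0x8, 0xF, 0x0, 0x7):
    print(hex(nibble), cvt_off_f32_i4(nibble))
```

The extremes of the loop match the documented range: 0x8 gives -0.5 and 0x7 gives 0.4375.
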
#### V_CVT_RPI_I32_F32

Opcode VOP1: 12 (0xc)  
Opcode VOP3A: 396 (0x18c) for GCN 1.0/1.1; 332 (0x14c) for GCN 1.2  
Syntax: V_CVT_RPI_I32_F32 VDST, SRC0  
Description: Convert 32-bit floating point value from SRC0 to signed 32-bit integer, and
store result to VDST. Conversion adds 0.5 to value and rounds to negative infinity (floor).
If value is higher/lower than maximal/minimal integer then store MAX_INT32/MIN_INT32 to
VDST. If input value is NaN/-NaN then store MAX_INT32/MIN_INT32 to VDST.  
Operation:  
```
FLOAT SF = ASFLOAT(SRC0)
if (!ISNAN(SF))
    VDST = (INT32)MAX(MIN(FLOOR(SF + 0.5), 2147483647.0), -2147483648.0)
else
    VDST = (INT32)SRC0>=0 ? 2147483647 : -2147483648
```

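To compare V_CVT_FLR_I32_F32 and V_CVT_RPI_I32_F32 side by side, here is a Python sketch of both operations. All names are hypothetical, and Python floats (doubles) stand in for 32-bit floats, so results near the rounding boundaries of single precision may differ from hardware. The clamp is done before `math.floor` only because Python's `math.floor` raises an exception on infinities; the result is the same as clamping afterwards.

```python
import math

INT32_MAX = 2147483647
INT32_MIN = -2147483648

def _floor_saturated(value: float) -> int:
    """Floor a non-NaN value and saturate it to the signed 32-bit range."""
    clamped = max(min(value, float(INT32_MAX)), float(INT32_MIN))
    return math.floor(clamped)

def _nan_result(sf: float) -> int:
    """NaN maps to MAX_INT32, -NaN (sign bit set) maps to MIN_INT32."""
    return INT32_MAX if math.copysign(1.0, sf) > 0 else INT32_MIN

def cvt_flr_i32_f32(sf: float) -> int:
    """Model of V_CVT_FLR_I32_F32: round towards negative infinity."""
    return _nan_result(sf) if math.isnan(sf) else _floor_saturated(sf)

def cvt_rpi_i32_f32(sf: float) -> int:
    """Model of V_CVT_RPI_I32_F32: add 0.5, then round towards negative infinity."""
    return _nan_result(sf) if math.isnan(sf) else _floor_saturated(sf + 0.5)

for value in (2.5, -2.5, -2.6, 1e20):
    print(value, cvt_flr_i32_f32(value), cvt_rpi_i32_f32(value))
```

Note that RPI rounds halfway cases upwards (2.5 becomes 3, -2.5 becomes -2), unlike round-to-nearest-even.
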
#### V_CVT_U32_F32

Opcode VOP1: 7 (0x7)  
Opcode VOP3A: 391 (0x187) for GCN 1.0/1.1; 327 (0x147) for GCN 1.2  
Syntax: V_CVT_U32_F32 VDST, SRC0  
Description: Convert 32-bit floating point value from SRC0 to unsigned 32-bit integer, and
store result to VDST. Conversion uses rounding to zero. If value is higher than
maximal integer then store MAX_UINT32 to VDST.
If input value is NaN then store 0 to VDST.  
Operation:  
```
VDST = 0
if (!ISNAN(ASFLOAT(SRC0)))
    VDST = (UINT32)MIN(RNDTZINT(ASFLOAT(SRC0)), 4294967295.0)
```

#### V_CVT_U32_F64

Opcode VOP1: 21 (0x15)  
Opcode VOP3A: 405 (0x195) for GCN 1.0/1.1; 341 (0x155) for GCN 1.2  
Syntax: V_CVT_U32_F64 VDST, SRC0(2)  
Description: Convert 64-bit floating point value from SRC0 to unsigned 32-bit integer, and
store result to VDST. Conversion uses rounding to zero. If value is higher than
maximal integer then store MAX_UINT32 to VDST.
If input value is NaN then store 0 to VDST.  
Operation:  
```
VDST = 0
if (!ISNAN(ASDOUBLE(SRC0)))
    VDST = (UINT32)MIN(RNDTZINT(ASDOUBLE(SRC0)), 4294967295.0)
```

#### V_EXP_F32

Opcode VOP1: 37 (0x25) for GCN 1.0/1.1; 32 (0x20) for GCN 1.2  
Opcode VOP3A: 421 (0x1a5) for GCN 1.0/1.1; 352 (0x160) for GCN 1.2  
Syntax: V_EXP_F32 VDST, SRC0  
Description: Approximate power of two from FP value SRC0 and store it to VDST. For values
smaller than -126.0 the instruction always returns 0, regardless of the float mode in the
MODE register.  
Operation:  
```
if (ASFLOAT(SRC0)>=-126.0)
    VDST = APPROX_POW2(ASFLOAT(SRC0))
else
    VDST = 0.0
```

#### V_EXP_LEGACY_F32

Opcode VOP1: 70 (0x46) for GCN 1.1; 75 (0x4b) for GCN 1.2  
Opcode VOP3A: 454 (0x1c6) for GCN 1.1; 395 (0x18b) for GCN 1.2  
Syntax: V_EXP_LEGACY_F32 VDST, SRC0  
Description: Approximate power of two from FP value SRC0 and store it to VDST. For values
smaller than -126.0 the instruction always returns 0, regardless of the float mode in the
MODE register.
In some cases this instruction returns a slightly less accurate result than V_EXP_F32.  
Operation:  
```
if (ASFLOAT(SRC0)>=-126.0)
    VDST = APPROX_POW2(ASFLOAT(SRC0))
else
    VDST = 0.0
```

#### V_FFBH_U32

Opcode VOP1: 57 (0x39) for GCN 1.0/1.1; 45 (0x2d) for GCN 1.2  
Opcode VOP3A: 441 (0x1b9) for GCN 1.0/1.1; 365 (0x16d) for GCN 1.2  
Syntax: V_FFBH_U32 VDST, SRC0  
Description: Find the last (most significant) one bit in SRC0. If found, store the number
of skipped bits, counted from the highest bit, to VDST, otherwise set VDST to -1.  
Operation:  
```
VDST = -1
for (INT8 i = 31; i >= 0; i--)
    if (((1U<<i) & SRC0) != 0)
    { VDST = 31-i; break; }
```

#### V_FFBH_I32

Opcode VOP1: 59 (0x3b) for GCN 1.0/1.1; 47 (0x2f) for GCN 1.2  
Opcode VOP3A: 443 (0x1bb) for GCN 1.0/1.1; 367 (0x16f) for GCN 1.2  
Syntax: V_FFBH_I32 VDST, SRC0  
Description: Find the last (most significant) bit in SRC0 that is opposite to the sign bit.
If found, store the number of skipped bits, counted from the highest bit, to VDST,
otherwise set VDST to -1.  
Operation:  
```
VDST = -1
UINT32 bitval = (INT32)SRC0>=0 ? 1 : 0
for (INT8 i = 31; i >= 0; i--)
    if (((1U<<i) & SRC0) == (bitval<<i))
    { VDST = 31-i; break; }
```

#### V_FFBL_B32

Opcode VOP1: 58 (0x3a) for GCN 1.0/1.1; 46 (0x2e) for GCN 1.2  
Opcode VOP3A: 442 (0x1ba) for GCN 1.0/1.1; 366 (0x16e) for GCN 1.2  
Syntax: V_FFBL_B32 VDST, SRC0  
Description: Find the first (least significant) one bit in SRC0. If found, store its bit
number to VDST, otherwise set VDST to -1.  
Operation:  
```
VDST = -1
for (UINT8 i = 0; i < 32; i++)
    if (((1U<<i) & SRC0) != 0)
    { VDST = i; break; }
```

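The three bit-search operations above can be compared with a short Python sketch. The function names are hypothetical, and the functions return a plain Python `-1` where the hardware would hold the value in a 32-bit register.

```python
def ffbh_u32(src0: int) -> int:
    """Count bits skipped from bit 31 down to the highest set bit, or -1 if none."""
    src0 &= 0xFFFFFFFF
    for i in range(31, -1, -1):
        if src0 & (1 << i):
            return 31 - i
    return -1

def ffbh_i32(src0: int) -> int:
    """Count bits skipped from bit 31 down to the first bit that differs from the sign."""
    src0 &= 0xFFFFFFFF
    sign = src0 >> 31
    for i in range(31, -1, -1):
        if ((src0 >> i) & 1) != sign:
            return 31 - i
    return -1

def ffbl_b32(src0: int) -> int:
    """Return the index of the lowest set bit, or -1 if none."""
    src0 &= 0xFFFFFFFF
    for i in range(32):
        if src0 & (1 << i):
            return i
    return -1

print(ffbh_u32(0x00010000), ffbh_i32(0xFFFF0000), ffbl_b32(0x00010000))  # prints 15 16 16
```

For V_FFBH_I32, both 0 and 0xFFFFFFFF have no bit opposite to the sign, so both give -1.
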
#### V_FLOOR_F32

Opcode VOP1: 36 (0x24) for GCN 1.0/1.1; 31 (0x1f) for GCN 1.2  
Opcode VOP3A: 420 (0x1a4) for GCN 1.0/1.1; 351 (0x15f) for GCN 1.2  
Syntax: V_FLOOR_F32 VDST, SRC0  
Description: Truncate floating point value SRC0 with rounding to negative infinity
(flooring), and store result to VDST. If SRC0 is infinity or NaN then copy SRC0 to VDST.  
Operation:  
```
VDST = FLOOR(ASFLOAT(SRC0))
```

#### V_FLOOR_F64

Opcode VOP1: 26 (0x1a) for GCN 1.1/1.2  
Opcode VOP3A: 410 (0x19a) for GCN 1.1; 346 (0x15a) for GCN 1.2  
Syntax: V_FLOOR_F64 VDST(2), SRC0(2)  
Description: Truncate double floating point value SRC0 with rounding to negative infinity
(flooring), and store result to VDST. If SRC0 is infinity or NaN then copy SRC0 to VDST.  
Operation:  
```
VDST = FLOOR(ASDOUBLE(SRC0))
```

#### V_FRACT_F32

Opcode VOP1: 32 (0x20) for GCN 1.0/1.1; 27 (0x1b) for GCN 1.2  
Opcode VOP3A: 416 (0x1a0) for GCN 1.0/1.1; 347 (0x15b) for GCN 1.2  
Syntax: V_FRACT_F32 VDST, SRC0  
Description: Get fractional part of floating point value SRC0 and store it to VDST.
The fractional part is computed by subtracting floor(SRC0) from SRC0.
If SRC0 is infinity or NaN then NaN with proper sign is stored to VDST.  
Operation:  
```
FLOAT SF = ASFLOAT(SRC0)
if (!ISNAN(SF) && SF!=-INF && SF!=INF)
    VDST = SF - FLOOR(SF)
else
    VDST = NAN * SIGN(SF)
```

#### V_FRACT_F64

Opcode VOP1: 62 (0x3e) for GCN 1.0/1.1; 50 (0x32) for GCN 1.2  
Opcode VOP3A: 446 (0x1be) for GCN 1.0/1.1; 370 (0x172) for GCN 1.2  
Syntax: V_FRACT_F64 VDST(2), SRC0(2)  
Description: Get fractional part of double floating point value SRC0 and store it to VDST.
The fractional part is computed by subtracting floor(SRC0) from SRC0.
If SRC0 is infinity or NaN then NaN with proper sign is stored to VDST.  
Operation:  
```
DOUBLE SD = ASDOUBLE(SRC0)
if (!ISNAN(SD) && SD!=-INF && SD!=INF)
    VDST = SD - FLOOR(SD)
else
    VDST = NAN * SIGN(SD)
```

#### V_FREXP_EXP_I32_F32

Opcode VOP1: 63 (0x3f) for GCN 1.0/1.1; 51 (0x33) for GCN 1.2  
Opcode VOP3A: 447 (0x1bf) for GCN 1.0/1.1; 371 (0x173) for GCN 1.2  
Syntax: V_FREXP_EXP_I32_F32 VDST, SRC0  
Description: Get exponent plus 1 from single FP value SRC0, and store that exponent to VDST.
This instruction implements the exponent part of the frexp function.
If SRC0 is infinity or NAN then store -1 to VDST.  
Operation:  
```
FLOAT SF = ASFLOAT(SRC0)
if (ABS(SF) != INF && !ISNAN(SF))
    VDST = FREXP_EXP(SF)
else
    VDST = -1
```

#### V_FREXP_EXP_I32_F64

Opcode VOP1: 60 (0x3c) for GCN 1.0/1.1; 48 (0x30) for GCN 1.2  
Opcode VOP3A: 444 (0x1bc) for GCN 1.0/1.1; 368 (0x170) for GCN 1.2  
Syntax: V_FREXP_EXP_I32_F64 VDST, SRC0(2)  
Description: Get exponent plus 1 from double FP value SRC0, and store that exponent to VDST.
This instruction implements the exponent part of the frexp function.
If SRC0 is infinity or NAN then store -1 to VDST.  
Operation:  
```
DOUBLE SD = ASDOUBLE(SRC0)
if (ABS(SD) != INF && !ISNAN(SD))
    VDST = FREXP_EXP(SD)
else
    VDST = -1
```

#### V_FREXP_MANT_F32

Opcode VOP1: 64 (0x40) for GCN 1.0/1.1; 52 (0x34) for GCN 1.2  
Opcode VOP3A: 448 (0x1c0) for GCN 1.0/1.1; 372 (0x174) for GCN 1.2  
Syntax: V_FREXP_MANT_F32 VDST, SRC0  
Description: Get mantissa from single FP value SRC0, and store it to VDST. Mantissa includes
sign of input. If SRC0 is infinity then store -NAN to VDST.  
Operation:  
```
FLOAT SF = ASFLOAT(SRC0)
if (ABS(SF) == INF)
    VDST = -NAN
else if (!ISNAN(SF))
    VDST = FREXP_MANT(SF) * SIGN(SF)
else
    VDST = NAN * SIGN(SF)
```

#### V_FREXP_MANT_F64

Opcode VOP1: 61 (0x3d) for GCN 1.0/1.1; 49 (0x31) for GCN 1.2  
Opcode VOP3A: 445 (0x1bd) for GCN 1.0/1.1; 369 (0x171) for GCN 1.2  
Syntax: V_FREXP_MANT_F64 VDST(2), SRC0(2)  
Description: Get mantissa from double FP value SRC0, and store it to VDST. Mantissa includes
sign of input. If SRC0 is infinity then store -NAN to VDST.  
Operation:  
```
DOUBLE SD = ASDOUBLE(SRC0)
if (ABS(SD) == INF)
    VDST = -NAN
else if (!ISNAN(SD))
    VDST = FREXP_MANT(SD) * SIGN(SD)
else
    VDST = NAN * SIGN(SD)
```

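Since the FREXP instructions are described in terms of the frexp function, a Python sketch built on `math.frexp` may help. The function names are hypothetical. `math.frexp` returns a signed mantissa in the range [0.5, 1.0) together with the matching exponent, which corresponds to the "exponent plus 1" wording above; the handling of zero and denormal inputs on real hardware is not covered here.

```python
import math

def frexp_exp_i32(value: float) -> int:
    """Model of V_FREXP_EXP_I32_*: exponent from frexp, or -1 for infinity/NaN."""
    if math.isinf(value) or math.isnan(value):
        return -1
    return math.frexp(value)[1]

def frexp_mant(value: float) -> float:
    """Model of V_FREXP_MANT_*: signed mantissa from frexp, NaN for infinity/NaN."""
    if math.isinf(value):
        return -math.nan  # the document specifies -NAN for infinite input
    if math.isnan(value):
        return value
    return math.frexp(value)[0]

for value in (8.0, -3.0, 0.75):
    mant, exp = frexp_mant(value), frexp_exp_i32(value)
    print(value, mant, exp, mant * 2.0 ** exp)
```

For every finite, nonzero input the last column reproduces the input, because `mant * 2**exp == value`.
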
#### V_LOG_CLAMP_F32

Opcode VOP1: 38 (0x26) for GCN 1.0/1.1  
Opcode VOP3A: 422 (0x1a6) for GCN 1.0/1.1  
Syntax: V_LOG_CLAMP_F32 VDST, SRC0  
Description: Approximate logarithm of base 2 from floating point value SRC0 with
clamping infinities to -MAX_FLOAT. Result is stored in VDST.
If SRC0 is negative then store -NaN to VDST. This instruction doesn't handle denormalized
values regardless of the FLOAT MODE register setup.  
Operation:  
```
FLOAT F = ASFLOAT(SRC0)
if (F==1.0)
    VDST = 0.0f
else if (F<0.0)
    VDST = -NaN
else
{
    VDST = APPROX_LOG2(F)
    if (ASFLOAT(VDST)==-INF)
        VDST = -MAX_FLOAT
}
```

#### V_LOG_F32

Opcode VOP1: 39 (0x27) for GCN 1.0/1.1; 33 (0x21) for GCN 1.2  
Opcode VOP3A: 423 (0x1a7) for GCN 1.0/1.1; 353 (0x161) for GCN 1.2  
Syntax: V_LOG_F32 VDST, SRC0  
Description: Approximate logarithm of base 2 from floating point value SRC0, and store
result to VDST. If SRC0 is negative then store -NaN to VDST.
This instruction doesn't handle denormalized values regardless of the FLOAT MODE register setup.  
Operation:  
```
FLOAT F = ASFLOAT(SRC0)
if (F==1.0)
    VDST = 0.0f
else if (F<0.0)
    VDST = -NaN
else
    VDST = APPROX_LOG2(F)
```

#### V_LOG_LEGACY_F32

Opcode VOP1: 69 (0x45) for GCN 1.1; 76 (0x4c) for GCN 1.2  
Opcode VOP3A: 453 (0x1c5) for GCN 1.1; 396 (0x18c) for GCN 1.2  
Syntax: V_LOG_LEGACY_F32 VDST, SRC0  
Description: Approximate logarithm of base 2 from floating point value SRC0, and store
result to VDST. If SRC0 is negative then store -NaN to VDST.
This instruction doesn't handle denormalized values regardless of the FLOAT MODE register setup.
This instruction returns slightly different results than V_LOG_F32.  
Operation:  
```
FLOAT F = ASFLOAT(SRC0)
if (F==1.0)
    VDST = 0.0f
else if (F<0.0)
    VDST = -NaN
else
    VDST = APPROX_LOG2(F)
```

#### V_MOV_B32

Opcode VOP1: 1 (0x1)  
Opcode VOP3A: 385 (0x181) for GCN 1.0/1.1; 321 (0x141) for GCN 1.2  
Syntax: V_MOV_B32 VDST, SRC0  
Description: Move SRC0 into VDST.  
Operation:  
```
VDST = SRC0
```

#### V_MOV_FED_B32

Opcode VOP1: 9 (0x9)  
Opcode VOP3A: 393 (0x189) for GCN 1.0/1.1; 329 (0x149) for GCN 1.2  
Syntax: V_MOV_FED_B32 VDST, SRC0  
Description: Introduce EDC double error upon write to destination VGPR without causing
an exception (???).

847#### V_MOVRELD_B32
848
849Opcode VOP1: 66 (0x42) for GCN 1.0/1.1; 54 (0x34) for GCN 1.2 
850Opcode VOP3A: 450 (0x1c2) for GCN 1.0/1.1; 374 (0x174) for GCN 1.2 
851Syntax: V_MOVRELD_B32 VDST, VSRC0 
852Description: Move SRC0 to VGPR[VDST_NUMBER+M0]. 
853Operation: 
854```
855VGPR[VDST_NUMBER+M0] = SRC0
856```

#### V_MOVRELS_B32

Opcode VOP1: 67 (0x43) for GCN 1.0/1.1; 55 (0x37) for GCN 1.2  
Opcode VOP3A: 451 (0x1c3) for GCN 1.0/1.1; 375 (0x177) for GCN 1.2  
Syntax: V_MOVRELS_B32 VDST, VSRC0  
Description: Move VGPR[SRC0_NUMBER+M0] to VDST.  
Operation:  
```
VDST = VGPR[SRC0_NUMBER+M0]
```

#### V_MOVRELSD_B32

Opcode VOP1: 68 (0x44) for GCN 1.0/1.1; 56 (0x38) for GCN 1.2  
Opcode VOP3A: 452 (0x1c4) for GCN 1.0/1.1; 376 (0x178) for GCN 1.2  
Syntax: V_MOVRELSD_B32 VDST, VSRC0  
Description: Move VGPR[SRC0_NUMBER+M0] to VGPR[VDST_NUMBER+M0].  
Operation:  
```
VGPR[VDST_NUMBER+M0] = VGPR[SRC0_NUMBER+M0]
```
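
The three relative moves differ only in which side of the assignment is offset by M0. A minimal Python sketch of that difference follows; it models a single lane's registers as a plain list, treats M0 as an ordinary integer, and does no bounds checking. The function names mirror the instruction names but are illustrative only.

```python
def v_movreld_b32(vgprs, m0, vdst, src0):
    """VGPR[VDST_NUMBER+M0] = SRC0 (destination index is relative)."""
    vgprs[vdst + m0] = vgprs[src0]


def v_movrels_b32(vgprs, m0, vdst, src0):
    """VDST = VGPR[SRC0_NUMBER+M0] (source index is relative)."""
    vgprs[vdst] = vgprs[src0 + m0]


def v_movrelsd_b32(vgprs, m0, vdst, src0):
    """VGPR[VDST_NUMBER+M0] = VGPR[SRC0_NUMBER+M0] (both are relative)."""
    vgprs[vdst + m0] = vgprs[src0 + m0]


# One lane's view of registers v0..v7.
vgprs = [10, 11, 12, 13, 14, 15, 16, 17]
m0 = 2
v_movrels_b32(vgprs, m0, vdst=0, src0=1)   # v0 = v[1+2]
v_movreld_b32(vgprs, m0, vdst=4, src0=0)   # v[4+2] = v0
v_movrelsd_b32(vgprs, m0, vdst=5, src0=2)  # v[5+2] = v[2+2]
print(vgprs)  # [13, 11, 12, 13, 14, 15, 13, 14]
```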

#### V_NOP

Opcode VOP1: 0 (0x0)  
Opcode VOP3A: 384 (0x180) for GCN 1.0/1.1; 320 (0x140) for GCN 1.2  
Syntax: V_NOP  
Description: Do nothing.
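
Because V_NOP has VOP1 opcode 0, its VOP3A opcodes (384 and 320) can be read as base offsets. The entries in this list appear to follow the rule "VOP3A opcode = VOP1 opcode + base", which is an observation drawn from the listed numbers rather than a stated rule. The Python sketch below checks it against a few opcode pairs copied from this document; the table and function names are illustrative.

```python
# VOP3A opcode of V_NOP (VOP1 opcode 0) for each architecture.
VOP3A_BASE = {"GCN1.0/1.1": 384, "GCN1.2": 320}


def vop3a_opcode(vop1_opcode, arch):
    """Derive the VOP3A opcode from a VOP1 opcode (observed pattern only)."""
    return VOP3A_BASE[arch] + vop1_opcode


# (name, architecture, VOP1 opcode, listed VOP3A opcode)
LISTED = [
    ("V_MOV_B32", "GCN1.0/1.1", 1, 385),
    ("V_MOV_B32", "GCN1.2", 1, 321),
    ("V_NOT_B32", "GCN1.0/1.1", 55, 439),
    ("V_NOT_B32", "GCN1.2", 43, 363),
    ("V_RCP_F32", "GCN1.0/1.1", 42, 426),
    ("V_RCP_F32", "GCN1.2", 34, 354),
]

mismatches = [e for e in LISTED if vop3a_opcode(e[2], e[1]) != e[3]]
print(mismatches)  # []
```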

#### V_NOT_B32

Opcode VOP1: 55 (0x37) for GCN 1.0/1.1; 43 (0x2b) for GCN 1.2  
Opcode VOP3A: 439 (0x1b7) for GCN 1.0/1.1; 363 (0x16b) for GCN 1.2  
Syntax: V_NOT_B32 VDST, SRC0  
Description: Do bitwise negation on 32-bit SRC0, and store result to VDST.  
Operation:  
```
VDST = ~SRC0
```

#### V_RCP_CLAMP_F32

Opcode VOP1: 40 (0x28) for GCN 1.0/1.1  
Opcode VOP3A: 424 (0x1a8) for GCN 1.0/1.1  
Syntax: V_RCP_CLAMP_F32 VDST, SRC0  
Description: Approximate reciprocal from floating point value SRC0 and store it to VDST.
Guaranteed error is below 1 ULP. The result is clamped to MAX_FLOAT, keeping the sign of
the result.  
Operation:  
```
VDST = APPROX_RCP(ASFLOAT(SRC0))
if (ABS(ASFLOAT(VDST))==INF)
    VDST = SIGN(ASFLOAT(VDST)) * MAX_FLOAT
```

#### V_RCP_CLAMP_F64

Opcode VOP1: 48 (0x30) for GCN 1.0/1.1  
Opcode VOP3A: 432 (0x1b0) for GCN 1.0/1.1  
Syntax: V_RCP_CLAMP_F64 VDST(2), SRC0(2)  
Description: Approximate reciprocal from double FP value SRC0 and store it to VDST.
Relative error of approximation is ~1e-8.
The result is clamped to MAX_DOUBLE, keeping the sign of the result.  
Operation:  
```
VDST = APPROX_RCP(ASDOUBLE(SRC0))
if (ABS(ASDOUBLE(VDST))==INF)
    VDST = SIGN(ASDOUBLE(VDST)) * MAX_DOUBLE
```
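
The clamping step can be sketched in Python, whose `float` is an IEEE double. This is only an illustration of the operation above: it uses exact division as a stand-in for APPROX_RCP (so it does not reproduce the hardware's approximation error), and it assumes that the reciprocal of a signed zero is an infinity of the same sign. The single precision variant would clamp to MAX_FLOAT in the same way.

```python
import math
import sys

MAX_DOUBLE = sys.float_info.max  # largest finite double


def rcp(x):
    """Exact stand-in for APPROX_RCP; 1/(+-0) is taken to be +-infinity."""
    if x == 0.0:
        return math.copysign(math.inf, x)
    return 1.0 / x


def v_rcp_clamp_f64(src0):
    """Reciprocal with infinities replaced by +-MAX_DOUBLE."""
    result = rcp(src0)
    if math.isinf(result):
        result = math.copysign(MAX_DOUBLE, result)
    return result


print(v_rcp_clamp_f64(4.0))   # 0.25
print(v_rcp_clamp_f64(-0.0) == -MAX_DOUBLE)  # True
```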

#### V_RCP_F32

Opcode VOP1: 42 (0x2a) for GCN 1.0/1.1; 34 (0x22) for GCN 1.2  
Opcode VOP3A: 426 (0x1aa) for GCN 1.0/1.1; 354 (0x162) for GCN 1.2  
Syntax: V_RCP_F32 VDST, SRC0  
Description: Approximate reciprocal from floating point value SRC0 and store it to VDST.
Guaranteed error is below 1 ULP.  
Operation:  
```
VDST = APPROX_RCP(ASFLOAT(SRC0))
```

#### V_RCP_F64

Opcode VOP1: 47 (0x2f) for GCN 1.0/1.1; 37 (0x25) for GCN 1.2  
Opcode VOP3A: 431 (0x1af) for GCN 1.0/1.1; 357 (0x165) for GCN 1.2  
Syntax: V_RCP_F64 VDST(2), SRC0(2)  
Description: Approximate reciprocal from double FP value SRC0 and store it to VDST.
Relative error of approximation is ~1e-8.  
Operation:  
```
VDST = APPROX_RCP(ASDOUBLE(SRC0))
```

#### V_RCP_IFLAG_F32

Opcode VOP1: 43 (0x2b) for GCN 1.0/1.1; 35 (0x23) for GCN 1.2  
Opcode VOP3A: 427 (0x1ab) for GCN 1.0/1.1; 355 (0x163) for GCN 1.2  
Syntax: V_RCP_IFLAG_F32 VDST, SRC0  
Description: Approximate reciprocal from floating point value SRC0 and store it to VDST.
Guaranteed error is below 1 ULP. When an error occurs, this instruction signals integer
division by zero instead of any floating point exception.  
Operation:  
```
VDST = APPROX_RCP_IFLAG(ASFLOAT(SRC0))
```

#### V_RCP_LEGACY_F32

Opcode VOP1: 41 (0x29) for GCN 1.0/1.1  
Opcode VOP3A: 425 (0x1a9) for GCN 1.0/1.1  
Syntax: V_RCP_LEGACY_F32 VDST, SRC0  
Description: Approximate reciprocal from floating point value SRC0 and store it to VDST.
Guaranteed error is below 1 ULP. If SRC0 or the result is zero or infinity, then store 0
with the proper sign to VDST.  
Operation:  
```
FLOAT SF = ASFLOAT(SRC0)
if (ABS(SF)==0.0)
    VDST = SIGN(SF)*0.0
else
{
    VDST = APPROX_RCP(SF)
    if (ABS(ASFLOAT(VDST)) == INF)
        VDST = SIGN(SF)*0.0
}
```
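
The legacy behaviour replaces every result that would be infinite with a signed zero. A small Python sketch of the operation above follows, with exact double precision division standing in for the single precision APPROX_RCP; because of that substitution only the zero and infinity handling is meaningful here, not the numeric accuracy.

```python
import math


def v_rcp_legacy_f32(src0):
    """Reciprocal where zero input or infinite result yields a signed zero."""
    if src0 == 0.0:
        return math.copysign(0.0, src0)
    result = 1.0 / src0  # stand-in for APPROX_RCP
    if math.isinf(result):
        result = math.copysign(0.0, src0)
    return result


print(v_rcp_legacy_f32(2.0))   # 0.5
print(v_rcp_legacy_f32(-0.0))  # -0.0
```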

#### V_READFIRSTLANE_B32

Opcode VOP1: 2 (0x2)  
Opcode VOP3A: 386 (0x182) for GCN 1.0/1.1; 322 (0x142) for GCN 1.2  
Syntax: V_READFIRSTLANE_B32 SDST, VSRC0  
Description: Copy one VSRC0 lane value to SDST. The lane (thread id) is the first active
lane, or lane 0 if all lanes are inactive. Ignores EXEC mask.  
Operation:  
```
UINT8 firstlane = 0
for (UINT8 i = 0; i < 64; i++)
    if (((1ULL<<i) & EXEC) != 0)
    { firstlane = i; break; }
SDST = VSRC0[firstlane]
```
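
The loop above finds the lowest set bit of EXEC. A Python sketch of the same selection is shown below; it models VSRC0 as a list of 64 per-lane values and EXEC as an integer bit mask, and the function names are illustrative.

```python
def first_active_lane(exec_mask):
    """Index of the lowest set bit of the 64-bit EXEC mask, or 0 if none."""
    exec_mask &= (1 << 64) - 1
    if exec_mask == 0:
        return 0
    # exec_mask & -exec_mask isolates the lowest set bit.
    return (exec_mask & -exec_mask).bit_length() - 1


def v_readfirstlane_b32(vsrc0_lanes, exec_mask):
    """Return the VSRC0 value held by the first active lane."""
    return vsrc0_lanes[first_active_lane(exec_mask)]


lanes = [100 + i for i in range(64)]
print(v_readfirstlane_b32(lanes, 0b101000))  # 103
print(v_readfirstlane_b32(lanes, 0))         # 100
```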

#### V_RNDNE_F32

Opcode VOP1: 35 (0x23) for GCN 1.0/1.1; 30 (0x1e) for GCN 1.2  
Opcode VOP3A: 419 (0x1a3) for GCN 1.0/1.1; 350 (0x15e) for GCN 1.2  
Syntax: V_RNDNE_F32 VDST, SRC0  
Description: Round floating point value SRC0 to nearest even integer, and store result to
VDST. If SRC0 is infinity or NaN then copy SRC0 to VDST.  
Operation:  
```
VDST = RNDNE(ASFLOAT(SRC0))
```

#### V_RNDNE_F64

Opcode VOP1: 25 (0x19) for GCN 1.1/1.2  
Opcode VOP3A: 409 (0x199) for GCN 1.1; 345 (0x159) for GCN 1.2  
Syntax: V_RNDNE_F64 VDST(2), SRC0(2)  
Description: Round double floating point value SRC0 to nearest even integer,
and store result to VDST. If SRC0 is infinity or NaN then copy SRC0 to VDST.  
Operation:  
```
VDST = RNDNE(ASDOUBLE(SRC0))
```
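
Round to nearest even means that exact halves go to the even neighbour rather than always up. Python's built-in `round()` uses the same tie rule for floats, so RNDNE can be sketched as below. The helper name `rndne` is illustrative, and keeping the sign of a zero result is an assumption that the text above does not state.

```python
import math


def rndne(x):
    """Round to nearest integer, ties to even; pass infinity and NaN through."""
    if math.isinf(x) or math.isnan(x):
        return x
    # copysign keeps the sign for results such as rndne(-0.4) == -0.0
    return math.copysign(float(round(x)), x)


for value in (0.5, 1.5, 2.5, -2.5, 2.6):
    print(value, rndne(value))
```

The ties 0.5 and 2.5 round down to 0.0 and 2.0, while 1.5 rounds up to 2.0.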

#### V_RSQ_CLAMP_F32

Opcode VOP1: 44 (0x2c) for GCN 1.0/1.1  
Opcode VOP3A: 428 (0x1ac) for GCN 1.0/1.1  
Syntax: V_RSQ_CLAMP_F32 VDST, SRC0  
Description: Approximate reciprocal square root from floating point value SRC0 with
clamping to MAX_FLOAT, and store result to VDST.
If SRC0 is negative, store -NAN to VDST.
This instruction doesn't handle denormalized values regardless of the FLOAT MODE
register setup.  
Operation:  
```
VDST = APPROX_RSQRT(ASFLOAT(SRC0))
if (ASFLOAT(VDST)==INF)
    VDST = MAX_FLOAT
```

#### V_RSQ_CLAMP_F64

Opcode VOP1: 50 (0x32) for GCN 1.0/1.1  
Opcode VOP3A: 434 (0x1b2) for GCN 1.0/1.1  
Syntax: V_RSQ_CLAMP_F64 VDST(2), SRC0(2)  
Description: Approximate reciprocal square root from double floating point value SRC0
with clamping to MAX_DOUBLE, and store it to VDST. If SRC0 is negative,
store -NAN to VDST.  
Operation:  
```
VDST = APPROX_RSQRT(ASDOUBLE(SRC0))
if (ASDOUBLE(VDST)==INF)
    VDST = MAX_DOUBLE
```

#### V_RSQ_F32

Opcode VOP1: 46 (0x2e) for GCN 1.0/1.1; 36 (0x24) for GCN 1.2  
Opcode VOP3A: 430 (0x1ae) for GCN 1.0/1.1; 356 (0x164) for GCN 1.2  
Syntax: V_RSQ_F32 VDST, SRC0  
Description: Approximate reciprocal square root from floating point value SRC0 and
store it to VDST. If SRC0 is negative, store -NAN to VDST.
This instruction doesn't handle denormalized values regardless of the FLOAT MODE
register setup.  
Operation:  
```
VDST = APPROX_RSQRT(ASFLOAT(SRC0))
```

#### V_RSQ_F64

Opcode VOP1: 49 (0x31) for GCN 1.0/1.1; 38 (0x26) for GCN 1.2  
Opcode VOP3A: 433 (0x1b1) for GCN 1.0/1.1; 358 (0x166) for GCN 1.2  
Syntax: V_RSQ_F64 VDST(2), SRC0(2)  
Description: Approximate reciprocal square root from double floating point value SRC0 and
store it to VDST. If SRC0 is negative, store -NAN to VDST.  
Operation:  
```
VDST = APPROX_RSQRT(ASDOUBLE(SRC0))
```

#### V_RSQ_LEGACY_F32

Opcode VOP1: 45 (0x2d) for GCN 1.0/1.1  
Opcode VOP3A: 429 (0x1ad) for GCN 1.0/1.1  
Syntax: V_RSQ_LEGACY_F32 VDST, SRC0  
Description: Approximate reciprocal square root from floating point value SRC0,
and store result to VDST. If SRC0 is negative, store -NAN to VDST.
If the result is infinity then store 0.0 to VDST.
This instruction doesn't handle denormalized values regardless of the FLOAT MODE
register setup.  
Operation:  
```
VDST = APPROX_RSQRT(ASFLOAT(SRC0))
if (ASFLOAT(VDST)==INF)
    VDST = 0.0
```
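
The plain, clamped and legacy reciprocal square root variants differ only in what happens to an infinite result. The Python sketch below puts them side by side. It uses exact double precision arithmetic as a stand-in for APPROX_RSQRT, and its treatment of a zero input (a signed infinity, following the usual IEEE 754 convention) is an assumption, since the descriptions above only cover negative inputs.

```python
import math

MAX_FLOAT = (2.0 - 2.0 ** -23) * 2.0 ** 127  # largest finite single precision value


def rsqrt(x):
    """Stand-in for APPROX_RSQRT: -NaN for negative input, as described."""
    if math.isnan(x):
        return x
    if x < 0.0:
        return -math.nan
    if x == 0.0:
        return math.copysign(math.inf, x)  # assumption, not from the text
    return 1.0 / math.sqrt(x)


def v_rsq_f32(src0):
    return rsqrt(src0)


def v_rsq_clamp_f32(src0):
    result = rsqrt(src0)
    return MAX_FLOAT if result == math.inf else result


def v_rsq_legacy_f32(src0):
    result = rsqrt(src0)
    return 0.0 if result == math.inf else result


print(v_rsq_f32(4.0), v_rsq_clamp_f32(0.0), v_rsq_legacy_f32(0.0))
```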

#### V_SIN_F32

Opcode VOP1: 53 (0x35) for GCN 1.0/1.1; 41 (0x29) for GCN 1.2  
Opcode VOP3A: 437 (0x1b5) for GCN 1.0/1.1; 361 (0x169) for GCN 1.2  
Syntax: V_SIN_F32 VDST, SRC0  
Description: Compute sine of FP value from SRC0. Input value must be normalized to range
-1.0 to 1.0 (-360 degrees to 360 degrees). If SRC0 value is out of range then store 0.0
to VDST. If SRC0 value is infinity, store -NAN to VDST.  
Operation:  
```
FLOAT SF = ASFLOAT(SRC0)
VDST = 0.0
if (SF >= -1.0 && SF <= 1.0)
    VDST = APPROX_SIN(SF)
else if (ABS(SF)==INF)
    VDST = -NAN
else if (ISNAN(SF))
    VDST = SRC0
```
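
Since the input is normalized so that 1.0 corresponds to 360 degrees, a caller working in radians would first divide the angle by 2π. The Python sketch below follows the operation above under that reading; the scaling by 2π inside `v_sin_f32` is an assumption derived from the "360 degree" remark, and `math.sin` stands in for APPROX_SIN.

```python
import math


def v_sin_f32(src0):
    """Sine of a normalized angle; out-of-range input gives 0.0."""
    if math.isnan(src0):
        return src0
    if math.isinf(src0):
        return -math.nan
    if -1.0 <= src0 <= 1.0:
        # Assumption: 1.0 is one full turn, so scale by 2*pi for math.sin.
        return math.sin(src0 * 2.0 * math.pi)
    return 0.0


angle_radians = math.pi / 2.0
normalized = angle_radians / (2.0 * math.pi)  # 0.25 of a full turn
print(v_sin_f32(normalized))
print(v_sin_f32(2.0))  # out of range, so 0.0
```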

#### V_SQRT_F32

Opcode VOP1: 51 (0x33) for GCN 1.0/1.1; 39 (0x27) for GCN 1.2  
Opcode VOP3A: 435 (0x1b3) for GCN 1.0/1.1; 359 (0x167) for GCN 1.2  
Syntax: V_SQRT_F32 VDST, SRC0  
Description: Compute square root of floating point value SRC0, and store result to VDST.
If SRC0 is negative then store -NaN to VDST.  
Operation:  
```
if (ASFLOAT(SRC0)>=0.0)
    VDST = APPROX_SQRT(ASFLOAT(SRC0))
else
    VDST = -NAN
```

#### V_SQRT_F64

Opcode VOP1: 52 (0x34) for GCN 1.0/1.1; 40 (0x28) for GCN 1.2  
Opcode VOP3A: 436 (0x1b4) for GCN 1.0/1.1; 360 (0x168) for GCN 1.2  
Syntax: V_SQRT_F64 VDST(2), SRC0(2)  
Description: Compute square root of double floating point value SRC0, and store result
to VDST. Relative error of approximation is ~1e-8.
If SRC0 is negative then store -NaN to VDST.  
Operation:  
```
if (ASDOUBLE(SRC0)>=0.0)
    VDST = APPROX_SQRT(ASDOUBLE(SRC0))
else
    VDST = -NAN
```

#### V_TRUNC_F32

Opcode VOP1: 33 (0x21) for GCN 1.0/1.1; 28 (0x1c) for GCN 1.2  
Opcode VOP3A: 417 (0x1a1) for GCN 1.0/1.1; 348 (0x15c) for GCN 1.2  
Syntax: V_TRUNC_F32 VDST, SRC0  
Description: Truncate floating point value SRC0 toward zero to an integer value, and
store it (as float) to VDST. If SRC0 is infinity or NaN then copy SRC0 to VDST.  
Operation:  
```
VDST = RNDTZ(ASFLOAT(SRC0))
```

#### V_TRUNC_F64

Opcode VOP1: 23 (0x17) for GCN 1.1/1.2  
Opcode VOP3A: 407 (0x197) for GCN 1.1; 343 (0x157) for GCN 1.2  
Syntax: V_TRUNC_F64 VDST(2), SRC0(2)  
Description: Truncate double floating point value SRC0 toward zero to an integer value,
and store it (as double) to VDST. If SRC0 is infinity or NaN then copy SRC0 to VDST.  
Operation:  
```
VDST = RNDTZ(ASDOUBLE(SRC0))
```
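
RNDTZ (round toward zero) drops the fractional part without changing the sign. A short Python sketch under that reading is given below; the helper name `rndtz` is illustrative, and keeping the sign of a zero result is an assumption.

```python
import math


def rndtz(x):
    """Round toward zero; infinity and NaN are passed through unchanged."""
    if math.isinf(x) or math.isnan(x):
        return x
    # copysign keeps the sign for results such as rndtz(-0.5) == -0.0
    return math.copysign(float(math.trunc(x)), x)


for value in (2.7, -2.7, -0.5):
    print(value, rndtz(value))
```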