source: CLRX/CLRadeonExtender/trunk/doc/GcnInstrsSmem.md

Last change on this file was 3572, checked in by matszpk, 22 months ago

CLRadeonExtender: Revert last changes in CLRXDocs.

## GCN ISA SMEM instructions (GCN 1.2/1.4)

The encoding of the SMEM instructions needs 8 bytes (2 dwords). List of fields:

Bits  | Name     | Description
------|----------|------------------------------
0-5   | SBASE    | Number of aligned SGPR pair.
6-12  | SDATA    | Scalar destination/data operand
14    | SOE      | Scalar offset enable (GCN 1.4)
15    | NV       | Non-volatile (GCN 1.4)
16    | GLC      | Operation globally coherent
17    | IMM      | IMM indicator
18-25 | OPCODE   | Operation code
26-31 | ENCODING | Encoding type. Must be 0b110000
32-51 | OFFSET   | Unsigned 20-bit byte offset or SGPR number that holds byte offset
32-52 | OFFSET   | Signed 21-bit byte offset or SGPR number (byte offset) (GCN 1.4)
57-63 | SOFFSET  | SGPR offset (only if SOE=1)
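
The field layout above can be illustrated with a small packing sketch. This is only an illustration for the GCN 1.2 layout (no SOE, NV or SOFFSET); the function names `encode_smem_gcn12` and `decode_smem_gcn12` are hypothetical and are not part of CLRX. The `sbase` argument is assumed to be the raw SBASE field value (the index of the aligned SGPR pair), not an SGPR number.

```python
ENCODING_SMEM = 0b110000  # value of bits 26-31 required by the field table


def encode_smem_gcn12(opcode, sdata, sbase, offset, imm=True, glc=False):
    """Pack the GCN 1.2 SMEM fields into (dword0, dword1).

    sbase is the raw SBASE field value (index of the aligned SGPR pair).
    """
    if not (0 <= sbase < 1 << 6 and 0 <= sdata < 1 << 7):
        raise ValueError("SBASE or SDATA out of range")
    if not (0 <= opcode < 1 << 8 and 0 <= offset < 1 << 20):
        raise ValueError("OPCODE or OFFSET out of range")
    dword0 = (sbase                    # bits 0-5
              | (sdata << 6)           # bits 6-12
              | (int(glc) << 16)       # bit 16
              | (int(imm) << 17)       # bit 17
              | (opcode << 18)         # bits 18-25
              | (ENCODING_SMEM << 26)) # bits 26-31
    dword1 = offset                    # bits 32-51 of the instruction
    return dword0, dword1


def decode_smem_gcn12(dword0, dword1):
    """Unpack (dword0, dword1) back into a dict of named fields."""
    return {
        "sbase": dword0 & 0x3F,
        "sdata": (dword0 >> 6) & 0x7F,
        "glc": bool((dword0 >> 16) & 1),
        "imm": bool((dword0 >> 17) & 1),
        "opcode": (dword0 >> 18) & 0xFF,
        "encoding": (dword0 >> 26) & 0x3F,
        "offset": dword1 & 0xFFFFF,
    }
```

For example, `encode_smem_gcn12(opcode=0, sdata=4, sbase=1, offset=0x10)` packs an S_LOAD_DWORD with an immediate offset; under the assumptions above, SBASE field 1 stands for the SGPR pair starting at SGPR 2.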

The value of IMM determines the meaning of the OFFSET field (GCN 1.2):

* IMM=1 - OFFSET holds a byte offset to SBASE.
* IMM=0 - OFFSET holds the number of the SGPR that holds the byte offset to SBASE.

The values of IMM and SOE determine the encoding of OFFSET and the SGPR offset (GCN 1.4):

 IMM | SOE | Address                            | Syntax
-----|-----|------------------------------------|--------------------
  0  |  0  | SGPR[base] + SGPR[OFFSET]          |
  0  |  1  | SGPR[base] + SGPR[SOFFSET]         |
  1  |  0  | SGPR[base] + OFFSET                |
  1  |  1  | SGPR[base] + OFFSET + SGPR[SOFFSET]|
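
The IMM/SOE table for GCN 1.4 can be read as a four-way selection of the byte offset that is added to the base. The sketch below is a minimal model of that selection only; the function name `smem_offset_gcn14` and the `sgprs` mapping (SGPR number to value) are assumptions made for illustration.

```python
def smem_offset_gcn14(imm, soe, offset, soffset, sgprs):
    """Return the byte offset added to SGPR[base], following the IMM/SOE table.

    sgprs is a hypothetical mapping from SGPR number to its current value.
    """
    if not imm and not soe:
        return sgprs[offset]            # SGPR[base] + SGPR[OFFSET]
    if not imm and soe:
        return sgprs[soffset]           # SGPR[base] + SGPR[SOFFSET]
    if imm and not soe:
        return offset                   # SGPR[base] + OFFSET
    return offset + sgprs[soffset]      # SGPR[base] + OFFSET + SGPR[SOFFSET]
```

With `sgprs = {5: 0x100, 7: 0x40}`, the four rows select `0x100`, `0x40`, the immediate itself, and the immediate plus `0x40`, respectively.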

For S_LOAD_DWORD\* instructions, 2 SBASE SGPRs hold a 64-bit base address.
For S_BUFFER_LOAD_DWORD\* instructions, 4 SBASE SGPRs hold a
buffer descriptor. In this case, SBASE must be a multiple of 2.
S_STORE_\* and S_BUFFER_STORE_\* accept only M0 as the offset register on GCN 1.2.
On GCN 1.4, S_STORE_\* and S_BUFFER_STORE_\* also accept an SGPR as the offset register.

The SMEM instructions can return result data out of order. Any SMEM operation
(including S_MEMTIME) increments the LGKM_CNT counter. The best way to wait for results
is `S_WAITCNT LGKMCNT(0)`.

* LGKM_CNT is incremented by one for every fetch of a single Dword
* LGKM_CNT is incremented by two for every fetch of two or more Dwords
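
The two counting rules can be written down as a tiny bookkeeping sketch. It models only the increments listed above (not the hardware counter width or when the counter decrements); `lgkm_cnt_increment` and `pending_lgkm` are hypothetical helper names.

```python
def lgkm_cnt_increment(num_dwords):
    """Increment applied to LGKM_CNT by one SMEM fetch of num_dwords dwords."""
    if num_dwords < 1:
        raise ValueError("a fetch reads at least one dword")
    return 1 if num_dwords == 1 else 2


def pending_lgkm(fetch_sizes):
    """Total increment for a sequence of fetches that have not yet returned."""
    return sum(lgkm_cnt_increment(n) for n in fetch_sizes)
```

Because results may return out of order, this total does not say which fetch has completed, which is why waiting for `LGKMCNT(0)` is the recommendation given above.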

Instruction syntax: INSTRUCTION SDATA, SBASE(2,4), OFFSET|SGPR [MODIFIERS]

Modifiers can be supplied in any order. Modifiers list: GLC, NV (GCN 1.4),
OFFSET:OFFSET (GCN 1.4).

NOTE: At least one instruction (vector or scalar) is required between setting
the third dword of the buffer resource and an S_BUFFER_\* instruction, due to delay.

List of the instructions by opcode:

 Opcode     |GCN 1.2|GCN 1.4| Mnemonic (GCN1.2/1.4)
------------|-------|-------|------------------------------
 0 (0x0)    |   ✓   |   ✓   | S_LOAD_DWORD
 1 (0x1)    |   ✓   |   ✓   | S_LOAD_DWORDX2
 2 (0x2)    |   ✓   |   ✓   | S_LOAD_DWORDX4
 3 (0x3)    |   ✓   |   ✓   | S_LOAD_DWORDX8
 4 (0x4)    |   ✓   |   ✓   | S_LOAD_DWORDX16
 5 (0x5)    |       |   ✓   | S_SCRATCH_LOAD_DWORD
 6 (0x6)    |       |   ✓   | S_SCRATCH_LOAD_DWORDX2
 7 (0x7)    |       |   ✓   | S_SCRATCH_LOAD_DWORDX4
 8 (0x8)    |   ✓   |   ✓   | S_BUFFER_LOAD_DWORD
 9 (0x9)    |   ✓   |   ✓   | S_BUFFER_LOAD_DWORDX2
 10 (0xa)   |   ✓   |   ✓   | S_BUFFER_LOAD_DWORDX4
 11 (0xb)   |   ✓   |   ✓   | S_BUFFER_LOAD_DWORDX8
 12 (0xc)   |   ✓   |   ✓   | S_BUFFER_LOAD_DWORDX16
 16 (0x10)  |   ✓   |   ✓   | S_STORE_DWORD
 17 (0x11)  |   ✓   |   ✓   | S_STORE_DWORDX2
 18 (0x12)  |   ✓   |   ✓   | S_STORE_DWORDX4
 21 (0x15)  |       |   ✓   | S_SCRATCH_STORE_DWORD
 22 (0x16)  |       |   ✓   | S_SCRATCH_STORE_DWORDX2
 23 (0x17)  |       |   ✓   | S_SCRATCH_STORE_DWORDX4
 24 (0x18)  |   ✓   |   ✓   | S_BUFFER_STORE_DWORD
 25 (0x19)  |   ✓   |   ✓   | S_BUFFER_STORE_DWORDX2
 26 (0x1a)  |   ✓   |   ✓   | S_BUFFER_STORE_DWORDX4
 32 (0x20)  |   ✓   |   ✓   | S_DCACHE_INV
 33 (0x21)  |   ✓   |   ✓   | S_DCACHE_WB
 34 (0x22)  |   ✓   |   ✓   | S_DCACHE_INV_VOL
 35 (0x23)  |   ✓   |   ✓   | S_DCACHE_WB_VOL
 36 (0x24)  |   ✓   |   ✓   | S_MEMTIME
 37 (0x25)  |   ✓   |   ✓   | S_MEMREALTIME
 38 (0x26)  |   ✓   |   ✓   | S_ATC_PROBE
 39 (0x27)  |   ✓   |   ✓   | S_ATC_PROBE_BUFFER
 40 (0x28)  |       |   ✓   | S_DCACHE_DISCARD
 41 (0x29)  |       |   ✓   | S_DCACHE_DISCARD_X2
 64 (0x40)  |       |   ✓   | S_BUFFER_ATOMIC_SWAP
 65 (0x41)  |       |   ✓   | S_BUFFER_ATOMIC_CMPSWAP
 66 (0x42)  |       |   ✓   | S_BUFFER_ATOMIC_ADD
 67 (0x43)  |       |   ✓   | S_BUFFER_ATOMIC_SUB
 68 (0x44)  |       |   ✓   | S_BUFFER_ATOMIC_SMIN
 69 (0x45)  |       |   ✓   | S_BUFFER_ATOMIC_UMIN
 70 (0x46)  |       |   ✓   | S_BUFFER_ATOMIC_SMAX
 71 (0x47)  |       |   ✓   | S_BUFFER_ATOMIC_UMAX
 72 (0x48)  |       |   ✓   | S_BUFFER_ATOMIC_AND
 73 (0x49)  |       |   ✓   | S_BUFFER_ATOMIC_OR
 74 (0x4a)  |       |   ✓   | S_BUFFER_ATOMIC_XOR
 75 (0x4b)  |       |   ✓   | S_BUFFER_ATOMIC_INC
 76 (0x4c)  |       |   ✓   | S_BUFFER_ATOMIC_DEC
 96 (0x60)  |       |   ✓   | S_BUFFER_ATOMIC_SWAP_X2
 97 (0x61)  |       |   ✓   | S_BUFFER_ATOMIC_CMPSWAP_X2
 98 (0x62)  |       |   ✓   | S_BUFFER_ATOMIC_ADD_X2
 99 (0x63)  |       |   ✓   | S_BUFFER_ATOMIC_SUB_X2
 100 (0x64) |       |   ✓   | S_BUFFER_ATOMIC_SMIN_X2
 101 (0x65) |       |   ✓   | S_BUFFER_ATOMIC_UMIN_X2
 102 (0x66) |       |   ✓   | S_BUFFER_ATOMIC_SMAX_X2
 103 (0x67) |       |   ✓   | S_BUFFER_ATOMIC_UMAX_X2
 104 (0x68) |       |   ✓   | S_BUFFER_ATOMIC_AND_X2
 105 (0x69) |       |   ✓   | S_BUFFER_ATOMIC_OR_X2
 106 (0x6a) |       |   ✓   | S_BUFFER_ATOMIC_XOR_X2
 107 (0x6b) |       |   ✓   | S_BUFFER_ATOMIC_INC_X2
 108 (0x6c) |       |   ✓   | S_BUFFER_ATOMIC_DEC_X2
 128 (0x80) |       |   ✓   | S_ATOMIC_SWAP
 129 (0x81) |       |   ✓   | S_ATOMIC_CMPSWAP
 130 (0x82) |       |   ✓   | S_ATOMIC_ADD
 131 (0x83) |       |   ✓   | S_ATOMIC_SUB
 132 (0x84) |       |   ✓   | S_ATOMIC_SMIN
 133 (0x85) |       |   ✓   | S_ATOMIC_UMIN
 134 (0x86) |       |   ✓   | S_ATOMIC_SMAX
 135 (0x87) |       |   ✓   | S_ATOMIC_UMAX
 136 (0x88) |       |   ✓   | S_ATOMIC_AND
 137 (0x89) |       |   ✓   | S_ATOMIC_OR
 138 (0x8a) |       |   ✓   | S_ATOMIC_XOR
 139 (0x8b) |       |   ✓   | S_ATOMIC_INC
 140 (0x8c) |       |   ✓   | S_ATOMIC_DEC
 160 (0xa0) |       |   ✓   | S_ATOMIC_SWAP_X2
 161 (0xa1) |       |   ✓   | S_ATOMIC_CMPSWAP_X2
 162 (0xa2) |       |   ✓   | S_ATOMIC_ADD_X2
 163 (0xa3) |       |   ✓   | S_ATOMIC_SUB_X2
 164 (0xa4) |       |   ✓   | S_ATOMIC_SMIN_X2
 165 (0xa5) |       |   ✓   | S_ATOMIC_UMIN_X2
 166 (0xa6) |       |   ✓   | S_ATOMIC_SMAX_X2
 167 (0xa7) |       |   ✓   | S_ATOMIC_UMAX_X2
 168 (0xa8) |       |   ✓   | S_ATOMIC_AND_X2
 169 (0xa9) |       |   ✓   | S_ATOMIC_OR_X2
 170 (0xaa) |       |   ✓   | S_ATOMIC_XOR_X2
 171 (0xab) |       |   ✓   | S_ATOMIC_INC_X2
 172 (0xac) |       |   ✓   | S_ATOMIC_DEC_X2
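
The atomic opcodes in the table follow a regular pattern: the buffer forms start at 0x40, the non-buffer forms at 0x80, the `_X2` variants add 0x20, and the operation selects the low bits. The sketch below only restates that pattern as observed in the table; `ATOMIC_OPS` and `atomic_opcode` are hypothetical names, not CLRX identifiers.

```python
# Operation order as it appears in the opcode table (SWAP is +0, DEC is +12).
ATOMIC_OPS = ["SWAP", "CMPSWAP", "ADD", "SUB", "SMIN", "UMIN", "SMAX",
              "UMAX", "AND", "OR", "XOR", "INC", "DEC"]


def atomic_opcode(op, buffer=False, x2=False):
    """Compute the SMEM opcode of S_[BUFFER_]ATOMIC_<op>[_X2] (GCN 1.4)."""
    base = 0x40 if buffer else 0x80      # S_BUFFER_ATOMIC_* vs S_ATOMIC_*
    width = 0x20 if x2 else 0x00         # 64-bit (_X2) variants
    return base + width + ATOMIC_OPS.index(op)
```

For instance, `atomic_opcode("SMAX", buffer=True)` gives `0x46`, which matches the S_BUFFER_ATOMIC_SMAX row.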

### Instruction set

Alphabetically sorted instruction list:

#### S_ATOMIC_ADD

Opcode: 130 (0x82) only for GCN 1.4  
Syntax: S_ATOMIC_ADD SDATA, SBASE(2), OFFSET  
Description: Add SDATA to value from memory address, and store result to memory address.
If GLC flag is set then return previous value from memory address to SDATA,
otherwise keep SDATA value. Operation is atomic.  
Operation:  
```
UINT32* VM = (UINT32*)(SMEM + (OFFSET & ~3))
UINT32 P = *VM; *VM = *VM + SDATA; SDATA = (GLC) ? P : SDATA // atomic
```
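
The S_ATOMIC_ADD pseudocode can be mirrored by a small, non-atomic software model. This is a sketch under stated assumptions: memory is a hypothetical `dict` from byte address to 32-bit value, `base` stands for the address held in SBASE, and the function returns the new SDATA value instead of writing a register.

```python
MASK32 = 0xFFFFFFFF


def s_atomic_add(mem, base, offset, sdata, glc=False):
    """Model of S_ATOMIC_ADD; returns the value SDATA holds afterwards."""
    addr = base + (offset & ~3)              # offset is forced to dword alignment
    prev = mem[addr]
    mem[addr] = (prev + sdata) & MASK32      # 32-bit wrap-around addition
    return prev if glc else sdata            # GLC returns the previous value
```

A real implementation performs the read-modify-write atomically; the model only shows the data flow and the 32-bit wrap-around.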

#### S_ATOMIC_ADD_X2

Opcode: 162 (0xa2) only for GCN 1.4  
Syntax: S_ATOMIC_ADD_X2 SDATA(2), SBASE(2), OFFSET  
Description: Add 64-bit SDATA to 64-bit value from memory address,
and store result to memory address.
If GLC flag is set then return previous value from memory address to SDATA,
otherwise keep SDATA value. Operation is atomic.  
Operation:  
```
UINT64* VM = (UINT64*)(SMEM + (OFFSET & ~3))
UINT64 P = *VM; *VM = *VM + SDATA; SDATA = (GLC) ? P : SDATA // atomic
```
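
For the `_X2` forms, SDATA(2) names two consecutive SGPRs that together carry one 64-bit value. The sketch below shows one way to model that for S_ATOMIC_ADD_X2. It assumes the lower-numbered SGPR holds the low dword; `sgprs` (a list of register values), `mem` (a dict from byte address to 64-bit value) and all function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF
MASK64 = (1 << 64) - 1


def read_sgpr_pair(sgprs, first):
    """Combine sgprs[first] (low dword, assumed) and sgprs[first+1] (high dword)."""
    return (sgprs[first] & MASK32) | ((sgprs[first + 1] & MASK32) << 32)


def write_sgpr_pair(sgprs, first, value):
    """Split a 64-bit value back into two consecutive SGPRs."""
    sgprs[first] = value & MASK32
    sgprs[first + 1] = (value >> 32) & MASK32


def s_atomic_add_x2(mem, base, offset, sgprs, sdata, glc=False):
    """Model of S_ATOMIC_ADD_X2; sdata is the number of the first SDATA SGPR."""
    addr = base + (offset & ~3)
    prev = mem[addr]
    mem[addr] = (prev + read_sgpr_pair(sgprs, sdata)) & MASK64
    if glc:                                  # previous value returned to SDATA
        write_sgpr_pair(sgprs, sdata, prev)
```

The same pair handling would apply to the other `_X2` atomics; only the combining operation changes.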

#### S_ATOMIC_AND

Opcode: 136 (0x88) only for GCN 1.4  
Syntax: S_ATOMIC_AND SDATA, SBASE(2), OFFSET  
Description: Do bitwise AND on SDATA and value from memory address,
and store result to memory address.
If GLC flag is set then return previous value from memory address to SDATA,
otherwise keep SDATA value. Operation is atomic.  
Operation:  
```
UINT32* VM = (UINT32*)(SMEM + (OFFSET & ~3))
UINT32 P = *VM; *VM = *VM & SDATA; SDATA = (GLC) ? P : SDATA // atomic
```

#### S_ATOMIC_AND_X2

Opcode: 168 (0xa8) only for GCN 1.4  
Syntax: S_ATOMIC_AND_X2 SDATA(2), SBASE(2), OFFSET  
Description: Do bitwise AND on 64-bit SDATA and 64-bit value from memory address,
and store result to memory address.
If GLC flag is set then return previous value from memory address to SDATA,
otherwise keep SDATA value. Operation is atomic.  
Operation:  
```
UINT64* VM = (UINT64*)(SMEM + (OFFSET & ~3))
UINT64 P = *VM; *VM = *VM & SDATA; SDATA = (GLC) ? P : SDATA // atomic
```

#### S_ATOMIC_CMPSWAP

Opcode: 129 (0x81) only for GCN 1.4  
Syntax: S_ATOMIC_CMPSWAP SDATA(2), SBASE(2), OFFSET  
Description: Store lower SDATA dword into memory address if previous value
from memory address is equal to SDATA>>32, otherwise keep old value from memory address.
If GLC flag is set then return previous value from memory address to SDATA,
otherwise keep SDATA value. Operation is atomic.  
Operation:  
```
UINT32* VM = (UINT32*)(SMEM + (OFFSET & ~3))
UINT32 P = *VM; *VM = (*VM == (SDATA>>32)) ? SDATA&0xffffffff : *VM // atomic
SDATA = (GLC) ? P : SDATA // atomic
```
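
The compare-and-swap rule packs two operands into the 64-bit SDATA: the upper dword is the value to compare against and the lower dword is the value to store. A minimal, non-atomic model follows; `mem` (a dict from byte address to 32-bit value), `base` and the function name are assumptions made for illustration.

```python
MASK32 = 0xFFFFFFFF


def s_atomic_cmpswap(mem, base, offset, sdata, glc=False):
    """Model of S_ATOMIC_CMPSWAP; sdata is the 64-bit SDATA(2) value."""
    addr = base + (offset & ~3)
    prev = mem[addr]
    compare = (sdata >> 32) & MASK32     # upper dword: expected value
    new = sdata & MASK32                 # lower dword: value to store
    if prev == compare:
        mem[addr] = new
    return prev if glc else sdata        # GLC returns the previous value
```

With GLC set, a caller can tell whether the swap happened by checking that the returned previous value equals the expected one.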

#### S_ATOMIC_CMPSWAP_X2

Opcode: 161 (0xa1) only for GCN 1.4  
Syntax: S_ATOMIC_CMPSWAP_X2 SDATA(4), SBASE(2), OFFSET  
Description: Store lower SDATA quadword into memory address if previous value
from memory address is equal to the last SDATA quadword,
otherwise keep old value from memory address.
If GLC flag is set then return previous value from memory address to SDATA,
otherwise keep SDATA value. Operation is atomic.  
Operation:  
```
UINT64* VM = (UINT64*)(SMEM + (OFFSET & ~3))
UINT64 P = *VM; *VM = (*VM == SDATA[2:3]) ? SDATA[0:1] : *VM // atomic
SDATA = (GLC) ? P : SDATA // atomic
```

#### S_ATOMIC_DEC

Opcode: 140 (0x8c) only for GCN 1.4  
Syntax: S_ATOMIC_DEC SDATA, SBASE(2), OFFSET  
Description: Compare value from memory address and if less than or equal to SDATA
and this value is not zero, then decrement value from memory address,
otherwise store SDATA to memory address.
If GLC flag is set then return previous value from memory address to SDATA,
otherwise keep SDATA value. Operation is atomic.  
Operation:  
```
UINT32* VM = (UINT32*)(SMEM + (OFFSET & ~3))
UINT32 P = *VM; *VM = (*VM <= SDATA && *VM!=0) ? *VM-1 : SDATA; // atomic
SDATA = (GLC) ? P : SDATA // atomic
```
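
The decrement rule wraps back to SDATA when the memory value is zero or larger than SDATA, which makes it behave like a bounded down-counter. The sketch below models that rule without atomicity; `mem` (a dict from byte address to 32-bit value), `base` and the function name are hypothetical.

```python
def s_atomic_dec(mem, base, offset, sdata, glc=False):
    """Model of S_ATOMIC_DEC; returns the value SDATA holds afterwards."""
    addr = base + (offset & ~3)
    prev = mem[addr]
    if prev <= sdata and prev != 0:
        mem[addr] = prev - 1             # ordinary decrement inside the range
    else:
        mem[addr] = sdata                # reload the counter with SDATA
    return prev if glc else sdata
```

Repeated calls with the same SDATA therefore cycle through SDATA, SDATA-1, ..., 1, 0 and then start again at SDATA.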

#### S_ATOMIC_DEC_X2

Opcode: 172 (0xac) only for GCN 1.4  
Syntax: S_ATOMIC_DEC_X2 SDATA(2), SBASE(2), OFFSET  
Description: Compare 64-bit value from memory address and if less than or equal to
64-bit SDATA and this value is not zero, then decrement value from memory address,
otherwise store SDATA to memory address.
If GLC flag is set then return previous value from memory address to SDATA,
otherwise keep SDATA value. Operation is atomic.  
Operation:  
```
UINT64* VM = (UINT64*)(SMEM + (OFFSET & ~3))
UINT64 P = *VM; *VM = (*VM <= SDATA && *VM!=0) ? *VM-1 : SDATA; // atomic
SDATA = (GLC) ? P : SDATA // atomic
```

#### S_ATOMIC_INC

Opcode: 139 (0x8b) only for GCN 1.4  
Syntax: S_ATOMIC_INC SDATA, SBASE(2), OFFSET  
Description: Compare value from memory address and if less than SDATA,
then increment value from memory address, otherwise store zero to memory address.
If GLC flag is set then return previous value from memory address to SDATA,
otherwise keep SDATA value. Operation is atomic.  
Operation:  
```
UINT32* VM = (UINT32*)(SMEM + (OFFSET & ~3))
UINT32 P = *VM; *VM = (*VM < SDATA) ? *VM+1 : 0; SDATA = (GLC) ? P : SDATA // atomic
```
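
The increment rule counts up to SDATA and then wraps to zero. A minimal, non-atomic model is sketched below; `mem` (a dict from byte address to 32-bit value), `base` and the function name are assumptions for illustration.

```python
def s_atomic_inc(mem, base, offset, sdata, glc=False):
    """Model of S_ATOMIC_INC; returns the value SDATA holds afterwards."""
    addr = base + (offset & ~3)
    prev = mem[addr]
    mem[addr] = prev + 1 if prev < sdata else 0   # wrap to zero at the limit
    return prev if glc else sdata
```

With SDATA as the limit, the memory value cycles through 0, 1, ..., SDATA, which is a convenient way to index a ring of SDATA+1 slots.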

#### S_ATOMIC_INC_X2

Opcode: 171 (0xab) only for GCN 1.4  
Syntax: S_ATOMIC_INC_X2 SDATA(2), SBASE(2), OFFSET  
Description: Compare 64-bit value from memory address and if less than 64-bit SDATA,
then increment value from memory address, otherwise store zero to memory address.
If GLC flag is set then return previous value from memory address to SDATA,
otherwise keep SDATA value. Operation is atomic.  
Operation:  
```
UINT64* VM = (UINT64*)(SMEM + (OFFSET & ~3))
UINT64 P = *VM; *VM = (*VM < SDATA) ? *VM+1 : 0; SDATA = (GLC) ? P : SDATA // atomic
```

#### S_ATOMIC_OR

Opcode: 137 (0x89) only for GCN 1.4  
Syntax: S_ATOMIC_OR SDATA, SBASE(2), OFFSET  
Description: Do bitwise OR on SDATA and value from memory address,
and store result to memory address.
If GLC flag is set then return previous value from memory address to SDATA,
otherwise keep SDATA value. Operation is atomic.  
Operation:  
```
UINT32* VM = (UINT32*)(SMEM + (OFFSET & ~3))
UINT32 P = *VM; *VM = *VM | SDATA; SDATA = (GLC) ? P : SDATA // atomic
```

#### S_ATOMIC_OR_X2

Opcode: 169 (0xa9) only for GCN 1.4  
Syntax: S_ATOMIC_OR_X2 SDATA(2), SBASE(2), OFFSET  
Description: Do bitwise OR on 64-bit SDATA and 64-bit value from memory address,
and store result to memory address.
If GLC flag is set then return previous value from memory address to SDATA,
otherwise keep SDATA value. Operation is atomic.  
Operation:  
```
UINT64* VM = (UINT64*)(SMEM + (OFFSET & ~3))
UINT64 P = *VM; *VM = *VM | SDATA; SDATA = (GLC) ? P : SDATA // atomic
```

#### S_ATOMIC_SMAX

Opcode: 134 (0x86) only for GCN 1.4  
Syntax: S_ATOMIC_SMAX SDATA, SBASE(2), OFFSET  
Description: Choose largest signed 32-bit value from SDATA and from memory address,
and store result to this memory address.
If GLC flag is set then return previous value from memory address to SDATA,
otherwise keep SDATA value. Operation is atomic.  
Operation:  
```
INT32* VM = (INT32*)(SMEM + (OFFSET & ~3))
INT32 P = *VM; *VM = MAX(*VM, (INT32)SDATA); SDATA = (GLC) ? P : SDATA // atomic
```
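
The signed and unsigned min/max atomics differ only in how the same 32 bits are interpreted. The sketch below models S_ATOMIC_SMAX next to an unsigned counterpart to make that difference visible; `mem` (a dict from byte address to raw 32-bit value), `base` and all function names are hypothetical, and the model is not atomic.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(value):
    """Reinterpret a raw 32-bit pattern as a two's-complement integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def s_atomic_smax(mem, base, offset, sdata, glc=False):
    """Model of S_ATOMIC_SMAX: signed comparison, raw 32-bit storage."""
    addr = base + (offset & ~3)
    prev = mem[addr]
    mem[addr] = max(to_signed32(prev), to_signed32(sdata)) & MASK32
    return prev if glc else sdata


def s_atomic_umax(mem, base, offset, sdata, glc=False):
    """Model of S_ATOMIC_UMAX: same data flow, unsigned comparison."""
    addr = base + (offset & ~3)
    prev = mem[addr]
    mem[addr] = max(prev & MASK32, sdata & MASK32)
    return prev if glc else sdata
```

With memory holding `0xFFFFFFFF` and SDATA equal to 1, the signed form stores 1 (because the memory value reads as -1), while the unsigned form keeps `0xFFFFFFFF`.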

#### S_ATOMIC_SMAX_X2

Opcode: 166 (0xa6) only for GCN 1.4  
Syntax: S_ATOMIC_SMAX_X2 SDATA(2), SBASE(2), OFFSET  
Description: Choose largest signed 64-bit value from SDATA and from memory address,
and store result to this memory address.
If GLC flag is set then return previous value from memory address to SDATA,
otherwise keep SDATA value. Operation is atomic.  
Operation:  
```
INT64* VM = (INT64*)(SMEM + (OFFSET & ~3))
INT64 P = *VM; *VM = MAX(*VM, (INT64)SDATA); SDATA = (GLC) ? P : SDATA // atomic
```

#### S_ATOMIC_SMIN

Opcode: 132 (0x84) only for GCN 1.4  
Syntax: S_ATOMIC_SMIN SDATA, SBASE(2), OFFSET  
Description: Choose smallest signed 32-bit value from SDATA and from memory address,
and store result to this memory address.
If GLC flag is set then return previous value from memory address to SDATA,
otherwise keep SDATA value. Operation is atomic.  
Operation:  
```
INT32* VM = (INT32*)(SMEM + (OFFSET & ~3))
INT32 P = *VM; *VM = MIN(*VM, (INT32)SDATA); SDATA = (GLC) ? P : SDATA // atomic
```

#### S_ATOMIC_SMIN_X2

Opcode: 164 (0xa4) only for GCN 1.4  
Syntax: S_ATOMIC_SMIN_X2 SDATA(2), SBASE(2), OFFSET  
Description: Choose smallest signed 64-bit value from SDATA and from memory address,
and store result to this memory address.
If GLC flag is set then return previous value from memory address to SDATA,
otherwise keep SDATA value. Operation is atomic.  
Operation:  
```
INT64* VM = (INT64*)(SMEM + (OFFSET & ~3))
INT64 P = *VM; *VM = MIN(*VM, (INT64)SDATA); SDATA = (GLC) ? P : SDATA // atomic
```

#### S_ATOMIC_SUB

Opcode: 131 (0x83) only for GCN 1.4  
Syntax: S_ATOMIC_SUB SDATA, SBASE(2), OFFSET  
Description: Subtract SDATA from value from memory address,
and store result to memory address.
If GLC flag is set then return previous value from memory address to SDATA,
otherwise keep SDATA value. Operation is atomic.  
Operation:  
```
UINT32* VM = (UINT32*)(SMEM + (OFFSET & ~3))
UINT32 P = *VM; *VM = *VM - SDATA; SDATA = (GLC) ? P : SDATA // atomic
```

#### S_ATOMIC_SUB_X2

Opcode: 163 (0xa3) only for GCN 1.4  
Syntax: S_ATOMIC_SUB_X2 SDATA(2), SBASE(2), OFFSET  
Description: Subtract 64-bit SDATA from 64-bit value from memory address,
and store result to memory address.
If GLC flag is set then return previous value from memory address to SDATA,
otherwise keep SDATA value. Operation is atomic.  
Operation:  
```
UINT64* VM = (UINT64*)(SMEM + (OFFSET & ~3))
UINT64 P = *VM; *VM = *VM - SDATA; SDATA = (GLC) ? P : SDATA // atomic
```

#### S_ATOMIC_SWAP

Opcode: 128 (0x80) only for GCN 1.4  
Syntax: S_ATOMIC_SWAP SDATA, SBASE(2), OFFSET  
Description: Store SDATA into memory address.
If GLC flag is set then return previous value from memory address to SDATA,
otherwise keep SDATA value. Operation is atomic.  
Operation:  
```
UINT32* VM = (UINT32*)(SMEM + (OFFSET & ~3))
UINT32 P = *VM; *VM = SDATA; SDATA = (GLC) ? P : SDATA // atomic
```

#### S_ATOMIC_SWAP_X2

Opcode: 160 (0xa0) only for GCN 1.4  
Syntax: S_ATOMIC_SWAP_X2 SDATA(2), SBASE(2), OFFSET  
Description: Store 64-bit SDATA into memory address.
If GLC flag is set then return previous value from memory address to SDATA,
otherwise keep SDATA value. Operation is atomic.  
Operation:  
```
UINT64* VM = (UINT64*)(SMEM + (OFFSET & ~3))
UINT64 P = *VM; *VM = SDATA; SDATA = (GLC) ? P : SDATA // atomic
```

#### S_ATOMIC_UMAX

Opcode: 135 (0x87) only for GCN 1.4  
Syntax: S_ATOMIC_UMAX SDATA, SBASE(2), OFFSET  
Description: Choose largest unsigned 32-bit value from SDATA and from memory address,
and store result to this memory address.
If GLC flag is set then return previous value from memory address to SDATA,
otherwise keep SDATA value. Operation is atomic.  
Operation:  
```
UINT32* VM = (UINT32*)(SMEM + (OFFSET & ~3))
UINT32 P = *VM; *VM = MAX(*VM, SDATA); SDATA = (GLC) ? P : SDATA // atomic
```

#### S_ATOMIC_UMAX_X2

Opcode: 167 (0xa7) only for GCN 1.4  
Syntax: S_ATOMIC_UMAX_X2 SDATA(2), SBASE(2), OFFSET  
Description: Choose largest unsigned 64-bit value from SDATA and from memory address,
and store result to this memory address.
If GLC flag is set then return previous value from memory address to SDATA,
otherwise keep SDATA value. Operation is atomic.  
Operation:  
```
UINT64* VM = (UINT64*)(SMEM + (OFFSET & ~3))
UINT64 P = *VM; *VM = MAX(*VM, SDATA); SDATA = (GLC) ? P : SDATA // atomic
```

#### S_ATOMIC_UMIN

Opcode: 133 (0x85) only for GCN 1.4  
Syntax: S_ATOMIC_UMIN SDATA, SBASE(2), OFFSET  
Description: Choose smallest unsigned 32-bit value from SDATA and from memory address,
and store result to this memory address.
If GLC flag is set then return previous value from memory address to SDATA,
otherwise keep SDATA value. Operation is atomic.  
Operation:  
```
UINT32* VM = (UINT32*)(SMEM + (OFFSET & ~3))
UINT32 P = *VM; *VM = MIN(*VM, SDATA); SDATA = (GLC) ? P : SDATA // atomic
```

#### S_ATOMIC_UMIN_X2

Opcode: 165 (0xa5) only for GCN 1.4  
Syntax: S_ATOMIC_UMIN_X2 SDATA(2), SBASE(2), OFFSET  
Description: Choose smallest unsigned 64-bit value from SDATA and from memory address,
and store result to this memory address.
If GLC flag is set then return previous value from memory address to SDATA,
otherwise keep SDATA value. Operation is atomic.  
Operation:  
```
UINT64* VM = (UINT64*)(SMEM + (OFFSET & ~3))
UINT64 P = *VM; *VM = MIN(*VM, SDATA); SDATA = (GLC) ? P : SDATA // atomic
```

#### S_ATOMIC_XOR

Opcode: 138 (0x8a) only for GCN 1.4  
Syntax: S_ATOMIC_XOR SDATA, SBASE(2), OFFSET  
Description: Do bitwise XOR on SDATA and value from memory address,
and store result to memory address.
If GLC flag is set then return previous value from memory address to SDATA,
otherwise keep SDATA value. Operation is atomic.  
Operation:  
```
UINT32* VM = (UINT32*)(SMEM + (OFFSET & ~3))
UINT32 P = *VM; *VM = *VM ^ SDATA; SDATA = (GLC) ? P : SDATA // atomic
```

#### S_ATOMIC_XOR_X2

Opcode: 170 (0xaa) only for GCN 1.4  
Syntax: S_ATOMIC_XOR_X2 SDATA(2), SBASE(2), OFFSET  
Description: Do bitwise XOR on 64-bit SDATA and 64-bit value from memory address,
and store result to memory address.
If GLC flag is set then return previous value from memory address to SDATA,
otherwise keep SDATA value. Operation is atomic.  
Operation:  
```
UINT64* VM = (UINT64*)(SMEM + (OFFSET & ~3))
UINT64 P = *VM; *VM = *VM ^ SDATA; SDATA = (GLC) ? P : SDATA // atomic
```

#### S_BUFFER_ATOMIC_ADD

Opcode: 66 (0x42) only for GCN 1.4  
Syntax: S_BUFFER_ATOMIC_ADD SDATA, SBASE(4), OFFSET  
Description: Add SDATA to value from memory address, and store result to memory address.
If GLC flag is set then return previous value from memory address to SDATA,
otherwise keep SDATA value. Operation is atomic. SBASE is buffer descriptor.  
Operation:  
```
UINT32* VM = (UINT32*)(SMEM + (OFFSET & ~3))
UINT32 P = *VM; *VM = *VM + SDATA; SDATA = (GLC) ? P : SDATA // atomic
```

#### S_BUFFER_ATOMIC_ADD_X2

Opcode: 98 (0x62) only for GCN 1.4  
Syntax: S_BUFFER_ATOMIC_ADD_X2 SDATA(2), SBASE(4), OFFSET  
Description: Add 64-bit SDATA to 64-bit value from memory address,
and store result to memory address.
If GLC flag is set then return previous value from memory address to SDATA,
otherwise keep SDATA value. Operation is atomic. SBASE is buffer descriptor.  
Operation:  
```
UINT64* VM = (UINT64*)(SMEM + (OFFSET & ~3))
UINT64 P = *VM; *VM = *VM + SDATA; SDATA = (GLC) ? P : SDATA // atomic
```

#### S_BUFFER_ATOMIC_AND

Opcode: 72 (0x48) only for GCN 1.4  
Syntax: S_BUFFER_ATOMIC_AND SDATA, SBASE(4), OFFSET  
Description: Do bitwise AND on SDATA and value from memory address,
and store result to memory address.
If GLC flag is set then return previous value from memory address to SDATA,
otherwise keep SDATA value. Operation is atomic. SBASE is buffer descriptor.  
Operation:  
```
UINT32* VM = (UINT32*)(SMEM + (OFFSET & ~3))
UINT32 P = *VM; *VM = *VM & SDATA; SDATA = (GLC) ? P : SDATA // atomic
```

#### S_BUFFER_ATOMIC_AND_X2

Opcode: 104 (0x68) only for GCN 1.4  
Syntax: S_BUFFER_ATOMIC_AND_X2 SDATA(2), SBASE(4), OFFSET  
Description: Do bitwise AND on 64-bit SDATA and 64-bit value from memory address,
and store result to memory address.
If GLC flag is set then return previous value from memory address to SDATA,
otherwise keep SDATA value. Operation is atomic. SBASE is buffer descriptor.  
Operation:  
```
UINT64* VM = (UINT64*)(SMEM + (OFFSET & ~3))
UINT64 P = *VM; *VM = *VM & SDATA; SDATA = (GLC) ? P : SDATA // atomic
```

#### S_BUFFER_ATOMIC_CMPSWAP

Opcode: 65 (0x41) only for GCN 1.4  
Syntax: S_BUFFER_ATOMIC_CMPSWAP SDATA(2), SBASE(4), OFFSET  
Description: Store lower SDATA dword into memory address if previous value
from memory address is equal to SDATA>>32, otherwise keep old value from memory address.
If GLC flag is set then return previous value from memory address to SDATA,
otherwise keep SDATA value. Operation is atomic. SBASE is buffer descriptor.  
Operation:  
```
UINT32* VM = (UINT32*)(SMEM + (OFFSET & ~3))
UINT32 P = *VM; *VM = (*VM == (SDATA>>32)) ? SDATA&0xffffffff : *VM // atomic
SDATA = (GLC) ? P : SDATA // atomic
```

#### S_BUFFER_ATOMIC_CMPSWAP_X2

Opcode: 97 (0x61) only for GCN 1.4  
Syntax: S_BUFFER_ATOMIC_CMPSWAP_X2 SDATA(4), SBASE(4), OFFSET  
Description: Store lower SDATA quadword into memory address if previous value
from memory address is equal to the last SDATA quadword,
otherwise keep old value from memory address.
If GLC flag is set then return previous value from memory address to SDATA,
otherwise keep SDATA value. Operation is atomic. SBASE is buffer descriptor.  
Operation:  
```
UINT64* VM = (UINT64*)(SMEM + (OFFSET & ~3))
UINT64 P = *VM; *VM = (*VM == SDATA[2:3]) ? SDATA[0:1] : *VM // atomic
SDATA = (GLC) ? P : SDATA // atomic
```

#### S_BUFFER_ATOMIC_DEC

Opcode: 76 (0x4c) only for GCN 1.4  
Syntax: S_BUFFER_ATOMIC_DEC SDATA, SBASE(4), OFFSET  
Description: Compare value from memory address and if less than or equal to SDATA
and this value is not zero, then decrement value from memory address,
otherwise store SDATA to memory address.
If GLC flag is set then return previous value from memory address to SDATA,
otherwise keep SDATA value. Operation is atomic. SBASE is buffer descriptor.  
Operation:  
```
UINT32* VM = (UINT32*)(SMEM + (OFFSET & ~3))
UINT32 P = *VM; *VM = (*VM <= SDATA && *VM!=0) ? *VM-1 : SDATA; // atomic
SDATA = (GLC) ? P : SDATA // atomic
```

#### S_BUFFER_ATOMIC_DEC_X2

Opcode: 108 (0x6c) only for GCN 1.4  
Syntax: S_BUFFER_ATOMIC_DEC_X2 SDATA(2), SBASE(4), OFFSET  
Description: Compare 64-bit value from memory address and if less than or equal to
64-bit SDATA and this value is not zero, then decrement value from memory address,
otherwise store SDATA to memory address.
If GLC flag is set then return previous value from memory address to SDATA,
otherwise keep SDATA value. Operation is atomic. SBASE is buffer descriptor.  
Operation:  
```
UINT64* VM = (UINT64*)(SMEM + (OFFSET & ~3))
UINT64 P = *VM; *VM = (*VM <= SDATA && *VM!=0) ? *VM-1 : SDATA; // atomic
SDATA = (GLC) ? P : SDATA // atomic
```

#### S_BUFFER_ATOMIC_INC

Opcode: 75 (0x4b) only for GCN 1.4  
Syntax: S_BUFFER_ATOMIC_INC SDATA, SBASE(4), OFFSET  
Description: Compare value from memory address and if less than SDATA,
then increment value from memory address, otherwise store zero to memory address.
If GLC flag is set then return previous value from memory address to SDATA,
otherwise keep SDATA value. Operation is atomic. SBASE is buffer descriptor.  
Operation:  
```
UINT32* VM = (UINT32*)(SMEM + (OFFSET & ~3))
UINT32 P = *VM; *VM = (*VM < SDATA) ? *VM+1 : 0; SDATA = (GLC) ? P : SDATA // atomic
```

#### S_BUFFER_ATOMIC_INC_X2

Opcode: 107 (0x6b) only for GCN 1.4  
Syntax: S_BUFFER_ATOMIC_INC_X2 SDATA(2), SBASE(4), OFFSET  
Description: Compare 64-bit value from memory address and if less than 64-bit SDATA,
then increment value from memory address, otherwise store zero to memory address.
If GLC flag is set then return previous value from memory address to SDATA,
otherwise keep SDATA value. Operation is atomic. SBASE is buffer descriptor.  
Operation:  
```
UINT64* VM = (UINT64*)(SMEM + (OFFSET & ~3))
UINT64 P = *VM; *VM = (*VM < SDATA) ? *VM+1 : 0; SDATA = (GLC) ? P : SDATA // atomic
```

#### S_BUFFER_ATOMIC_OR

Opcode: 73 (0x49) only for GCN 1.4  
Syntax: S_BUFFER_ATOMIC_OR SDATA, SBASE(4), OFFSET  
Description: Do bitwise OR on SDATA and value from memory address,
and store result to memory address.
If GLC flag is set then return previous value from memory address to SDATA,
otherwise keep SDATA value. Operation is atomic. SBASE is buffer descriptor.  
Operation:  
```
UINT32* VM = (UINT32*)(SMEM + (OFFSET & ~3))
UINT32 P = *VM; *VM = *VM | SDATA; SDATA = (GLC) ? P : SDATA // atomic
```

#### S_BUFFER_ATOMIC_OR_X2

Opcode: 105 (0x69) only for GCN 1.4  
Syntax: S_BUFFER_ATOMIC_OR_X2 SDATA(2), SBASE(4), OFFSET  
Description: Do bitwise OR on 64-bit SDATA and 64-bit value from memory address,
and store result to memory address.
If GLC flag is set then return previous value from memory address to SDATA,
otherwise keep SDATA value. Operation is atomic. SBASE is buffer descriptor.  
Operation:  
```
UINT64* VM = (UINT64*)(SMEM + (OFFSET & ~3))
UINT64 P = *VM; *VM = *VM | SDATA; SDATA = (GLC) ? P : SDATA // atomic
```

#### S_BUFFER_ATOMIC_SMAX

Opcode: 70 (0x46) only for GCN 1.4  
Syntax: S_BUFFER_ATOMIC_SMAX SDATA, SBASE(4), OFFSET  
Description: Choose largest signed 32-bit value from SDATA and from memory address,
and store result to this memory address.
If GLC flag is set then return previous value from memory address to SDATA,
otherwise keep SDATA value. Operation is atomic. SBASE is buffer descriptor.  
Operation:  
```
INT32* VM = (INT32*)(SMEM + (OFFSET & ~3))
INT32 P = *VM; *VM = MAX(*VM, (INT32)SDATA); SDATA = (GLC) ? P : SDATA // atomic
```

#### S_BUFFER_ATOMIC_SMAX_X2

Opcode: 102 (0x66) only for GCN 1.4  
Syntax: S_BUFFER_ATOMIC_SMAX_X2 SDATA(2), SBASE(4), OFFSET  
Description: Choose largest signed 64-bit value from SDATA and from memory address,
and store result to this memory address.
If GLC flag is set then return previous value from memory address to SDATA,
otherwise keep SDATA value. Operation is atomic. SBASE is buffer descriptor.  
Operation:  
```
INT64* VM = (INT64*)(SMEM + (OFFSET & ~3))
INT64 P = *VM; *VM = MAX(*VM, (INT64)SDATA); SDATA = (GLC) ? P : SDATA // atomic
```

#### S_BUFFER_ATOMIC_SMIN

Opcode: 68 (0x44) only for GCN 1.4  
Syntax: S_BUFFER_ATOMIC_SMIN SDATA, SBASE(4), OFFSET  
Description: Choose smallest signed 32-bit value from SDATA and from memory address,
and store result to this memory address.
If GLC flag is set then return previous value from memory address to SDATA,
otherwise keep SDATA value. Operation is atomic. SBASE is buffer descriptor.  
Operation:  
```
INT32* VM = (INT32*)(SMEM + (OFFSET & ~3))
INT32 P = *VM; *VM = MIN(*VM, (INT32)SDATA); SDATA = (GLC) ? P : SDATA // atomic
```

#### S_BUFFER_ATOMIC_SMIN_X2

Opcode: 100 (0x64) only for GCN 1.4  
Syntax: S_BUFFER_ATOMIC_SMIN_X2 SDATA(2), SBASE(4), OFFSET  
Description: Choose smallest signed 64-bit value from SDATA and from memory address,
and store result to this memory address.
If GLC flag is set then return previous value from memory address to SDATA,
otherwise keep SDATA value. Operation is atomic. SBASE is buffer descriptor.  
Operation:  
```
INT64* VM = (INT64*)(SMEM + (OFFSET & ~3))
INT64 P = *VM; *VM = MIN(*VM, (INT64)SDATA); SDATA = (GLC) ? P : SDATA // atomic
```

#### S_BUFFER_ATOMIC_SUB

Opcode: 67 (0x43) only for GCN 1.4  
Syntax: S_BUFFER_ATOMIC_SUB SDATA, SBASE(4), OFFSET  
Description: Subtract SDATA from value from memory address,
and store result to memory address.
If GLC flag is set then return previous value from memory address to SDATA,
otherwise keep SDATA value. Operation is atomic. SBASE is buffer descriptor.  
Operation:  
```
UINT32* VM = (UINT32*)(SMEM + (OFFSET & ~3))
UINT32 P = *VM; *VM = *VM - SDATA; SDATA = (GLC) ? P : SDATA // atomic
```

#### S_BUFFER_ATOMIC_SUB_X2

Opcode: 99 (0x63) only for GCN 1.4  
Syntax: S_BUFFER_ATOMIC_SUB_X2 SDATA(2), SBASE(4), OFFSET  
Description: Subtract 64-bit SDATA from 64-bit value from memory address,
and store result to memory address.
If GLC flag is set then return previous value from memory address to SDATA,
otherwise keep SDATA value. Operation is atomic. SBASE is buffer descriptor.  
Operation:  
```
UINT64* VM = (UINT64*)(SMEM + (OFFSET & ~3))
UINT64 P = *VM; *VM = *VM - SDATA; SDATA = (GLC) ? P : SDATA // atomic
```

#### S_BUFFER_ATOMIC_SWAP

Opcode: 64 (0x40) only for GCN 1.4  
Syntax: S_BUFFER_ATOMIC_SWAP SDATA, SBASE(4), OFFSET  
Description: Store SDATA to the memory address.
If the GLC flag is set, return the previous value from the memory address to SDATA;
otherwise keep the SDATA value. The operation is atomic. SBASE is a buffer descriptor.  
Operation:  
```
UINT32* VM = (UINT32*)(SMEM + (OFFSET & ~3))
UINT32 P = *VM; *VM = SDATA; SDATA = (GLC) ? P : SDATA // atomic
```

#### S_BUFFER_ATOMIC_SWAP_X2

Opcode: 96 (0x60) only for GCN 1.4  
Syntax: S_BUFFER_ATOMIC_SWAP_X2 SDATA(2), SBASE(4), OFFSET  
Description: Store 64-bit SDATA to the memory address.
If the GLC flag is set, return the previous value from the memory address to SDATA;
otherwise keep the SDATA value. The operation is atomic. SBASE is a buffer descriptor.  
Operation:  
```
UINT64* VM = (UINT64*)(SMEM + (OFFSET & ~3))
UINT64 P = *VM; *VM = SDATA; SDATA = (GLC) ? P : SDATA // atomic
```

#### S_BUFFER_ATOMIC_UMAX

Opcode: 71 (0x47) only for GCN 1.4  
Syntax: S_BUFFER_ATOMIC_UMAX SDATA, SBASE(4), OFFSET  
Description: Choose the larger unsigned 32-bit value of SDATA and the value at the memory address,
and store the result to this memory address.
If the GLC flag is set, return the previous value from the memory address to SDATA;
otherwise keep the SDATA value. The operation is atomic. SBASE is a buffer descriptor.  
Operation:  
```
UINT32* VM = (UINT32*)(SMEM + (OFFSET & ~3))
UINT32 P = *VM; *VM = MAX(*VM, SDATA); SDATA = (GLC) ? P : SDATA // atomic
```

#### S_BUFFER_ATOMIC_UMAX_X2

Opcode: 103 (0x67) only for GCN 1.4  
Syntax: S_BUFFER_ATOMIC_UMAX_X2 SDATA(2), SBASE(4), OFFSET  
Description: Choose the larger unsigned 64-bit value of SDATA and the value at the memory address,
and store the result to this memory address.
If the GLC flag is set, return the previous value from the memory address to SDATA;
otherwise keep the SDATA value. The operation is atomic. SBASE is a buffer descriptor.  
Operation:  
```
UINT64* VM = (UINT64*)(SMEM + (OFFSET & ~3))
UINT64 P = *VM; *VM = MAX(*VM, SDATA); SDATA = (GLC) ? P : SDATA // atomic
```

#### S_BUFFER_ATOMIC_UMIN

Opcode: 69 (0x45) only for GCN 1.4  
Syntax: S_BUFFER_ATOMIC_UMIN SDATA, SBASE(4), OFFSET  
Description: Choose the smaller unsigned 32-bit value of SDATA and the value at the memory address,
and store the result to this memory address.
If the GLC flag is set, return the previous value from the memory address to SDATA;
otherwise keep the SDATA value. The operation is atomic. SBASE is a buffer descriptor.  
Operation:  
```
UINT32* VM = (UINT32*)(SMEM + (OFFSET & ~3))
UINT32 P = *VM; *VM = MIN(*VM, SDATA); SDATA = (GLC) ? P : SDATA // atomic
```
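
The signed (SMIN/SMAX) and unsigned (UMIN/UMAX) variants differ only in how the same bit patterns are ordered. The C sketch below illustrates this for the minimum; both helper names are hypothetical, and the sign-bit flip is just one well-defined way to compare bit patterns as signed values on the host.

```c
#include <stdint.h>

/* Minimum of two 32-bit patterns under unsigned ordering (UMIN). */
static uint32_t umin_bits(uint32_t a, uint32_t b)
{
    return (a < b) ? a : b;
}

/* Minimum of two 32-bit patterns under signed ordering (SMIN).
 * Flipping the sign bit maps signed order onto unsigned order. */
static uint32_t smin_bits(uint32_t a, uint32_t b)
{
    return ((a ^ 0x80000000u) < (b ^ 0x80000000u)) ? a : b;
}
```

For the patterns 0xFFFFFFFF and 1, the unsigned minimum is 1, while the signed minimum is 0xFFFFFFFF (that is, -1).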

#### S_BUFFER_ATOMIC_UMIN_X2

Opcode: 101 (0x65) only for GCN 1.4  
Syntax: S_BUFFER_ATOMIC_UMIN_X2 SDATA(2), SBASE(4), OFFSET  
Description: Choose the smaller unsigned 64-bit value of SDATA and the value at the memory address,
and store the result to this memory address.
If the GLC flag is set, return the previous value from the memory address to SDATA;
otherwise keep the SDATA value. The operation is atomic. SBASE is a buffer descriptor.  
Operation:  
```
UINT64* VM = (UINT64*)(SMEM + (OFFSET & ~3))
UINT64 P = *VM; *VM = MIN(*VM, SDATA); SDATA = (GLC) ? P : SDATA // atomic
```

#### S_BUFFER_ATOMIC_XOR

Opcode: 74 (0x4a) only for GCN 1.4  
Syntax: S_BUFFER_ATOMIC_XOR SDATA, SBASE(4), OFFSET  
Description: Do a bitwise XOR of SDATA and the value at the memory address,
and store the result to the memory address.
If the GLC flag is set, return the previous value from the memory address to SDATA;
otherwise keep the SDATA value. The operation is atomic. SBASE is a buffer descriptor.  
Operation:  
```
UINT32* VM = (UINT32*)(SMEM + (OFFSET & ~3))
UINT32 P = *VM; *VM = *VM ^ SDATA; SDATA = (GLC) ? P : SDATA // atomic
```

#### S_BUFFER_ATOMIC_XOR_X2

Opcode: 106 (0x6a) only for GCN 1.4  
Syntax: S_BUFFER_ATOMIC_XOR_X2 SDATA(2), SBASE(4), OFFSET  
Description: Do a bitwise XOR of 64-bit SDATA and the 64-bit value at the memory address,
and store the result to the memory address.
If the GLC flag is set, return the previous value from the memory address to SDATA;
otherwise keep the SDATA value. The operation is atomic. SBASE is a buffer descriptor.  
Operation:  
```
UINT64* VM = (UINT64*)(SMEM + (OFFSET & ~3))
UINT64 P = *VM; *VM = *VM ^ SDATA; SDATA = (GLC) ? P : SDATA // atomic
```

#### S_BUFFER_LOAD_DWORD

Opcode: 8 (0x8)  
Syntax: S_BUFFER_LOAD_DWORD SDATA, SBASE(4), OFFSET  
Description: Load a single dword from read-only memory through the constant cache (kcache).
SBASE is a buffer descriptor.  
Operation:  
```
SDATA = *(UINT32*)(SMEM + (OFFSET & ~3))
```
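
The `OFFSET & ~3` term in the pseudocode drops the two low bits of the byte offset, so the access always starts on a dword boundary relative to SMEM. A minimal C sketch of that calculation follows; `smem_dword_address` is a hypothetical helper, and `smem_base` stands for the SMEM base already resolved from SBASE (for buffer instructions, from the buffer descriptor).

```c
#include <stdint.h>

/* Hypothetical helper: byte address accessed for a given base and OFFSET. */
static uint64_t smem_dword_address(uint64_t smem_base, uint32_t offset)
{
    return smem_base + (offset & ~UINT32_C(3));   /* clear the two low bits */
}
```

Offsets 0x10 through 0x13 therefore all address the same dword.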

#### S_BUFFER_LOAD_DWORDX16

Opcode: 12 (0xc)  
Syntax: S_BUFFER_LOAD_DWORDX16 SDATA(16), SBASE(4), OFFSET  
Description: Load 16 dwords from read-only memory through the constant cache (kcache).
SBASE is a buffer descriptor.  
Operation:  
```
for (BYTE i = 0; i < 16; i++)
    SDATA[i] = *(UINT32*)(SMEM + i*4 + (OFFSET & ~3))
```
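
The multi-dword loads fill consecutive SDATA registers from consecutive dwords. The loop above can be sketched on the host as follows; `smem_load_dwords` is a hypothetical helper that reads from a plain byte array standing in for memory, and it ignores buffer-descriptor range checking.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model: copy `count` dwords from a byte-addressed memory image
 * into consecutive SDATA registers, starting at the dword-aligned offset. */
static void smem_load_dwords(uint32_t *sdata, size_t count,
                             const uint8_t *smem, uint32_t offset)
{
    const uint8_t *src = smem + (offset & ~UINT32_C(3));
    for (size_t i = 0; i < count; i++)
        memcpy(&sdata[i], src + i * 4, sizeof(uint32_t));  /* SDATA[i] = dword i */
}
```

Calling it with `count` set to 2, 4, 8 or 16 corresponds to the X2, X4, X8 and X16 variants.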

#### S_BUFFER_LOAD_DWORDX2

Opcode: 9 (0x9)  
Syntax: S_BUFFER_LOAD_DWORDX2 SDATA(2), SBASE(4), OFFSET  
Description: Load two dwords from read-only memory through the constant cache (kcache).
SBASE is a buffer descriptor.  
Operation:  
```
SDATA = *(UINT64*)(SMEM + (OFFSET & ~3))
```

#### S_BUFFER_LOAD_DWORDX4

Opcode: 10 (0xa)  
Syntax: S_BUFFER_LOAD_DWORDX4 SDATA(4), SBASE(4), OFFSET  
Description: Load four dwords from read-only memory through the constant cache (kcache).
SBASE is a buffer descriptor.  
Operation:  
```
for (BYTE i = 0; i < 4; i++)
    SDATA[i] = *(UINT32*)(SMEM + i*4 + (OFFSET & ~3))
```

#### S_BUFFER_LOAD_DWORDX8

Opcode: 11 (0xb)  
Syntax: S_BUFFER_LOAD_DWORDX8 SDATA(8), SBASE(4), OFFSET  
Description: Load eight dwords from read-only memory through the constant cache (kcache).
SBASE is a buffer descriptor.  
Operation:  
```
for (BYTE i = 0; i < 8; i++)
    SDATA[i] = *(UINT32*)(SMEM + i*4 + (OFFSET & ~3))
```

#### S_BUFFER_STORE_DWORD

Opcode: 24 (0x18)  
Syntax: S_BUFFER_STORE_DWORD SDATA, SBASE(4), OFFSET  
Description: Store a single dword to memory.
The offset can only be M0 or an immediate (GCN 1.2 only).
SBASE is a buffer descriptor.  
Operation:  
```
*(UINT32*)(SMEM + (OFFSET & ~3)) = SDATA
```

#### S_BUFFER_STORE_DWORDX2

Opcode: 25 (0x19)  
Syntax: S_BUFFER_STORE_DWORDX2 SDATA(2), SBASE(4), OFFSET  
Description: Store two dwords to memory.
The offset can only be M0 or an immediate (GCN 1.2 only).
SBASE is a buffer descriptor.  
Operation:  
```
*(UINT64*)(SMEM + (OFFSET & ~3)) = SDATA
```

#### S_BUFFER_STORE_DWORDX4

Opcode: 26 (0x1a)  
Syntax: S_BUFFER_STORE_DWORDX4 SDATA(4), SBASE(4), OFFSET  
Description: Store four dwords to memory.
The offset can only be M0 or an immediate (GCN 1.2 only).
SBASE is a buffer descriptor.  
Operation:  
```
for (BYTE i = 0; i < 4; i++)
    *(UINT32*)(SMEM + i*4 + (OFFSET & ~3)) = SDATA[i]
```

#### S_DCACHE_DISCARD

Opcode: 40 (0x28) only for GCN 1.4  
Syntax: S_DCACHE_DISCARD SBASE(2), SOFFSET1  
Description: Discard one dirty scalar data cache line. A cache line is 64
bytes. The address is calculated as for S_STORE_DWORD, with alignment to a 64-byte boundary.
The LGKM count is incremented by 1 for this opcode.
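
A minimal C sketch of the address calculation described above, assuming that "alignment to a 64-byte boundary" means rounding the S_STORE_DWORD address down to the start of its cache line. The helper name `dcache_discard_line` and the constant are hypothetical.

```c
#include <stdint.h>

#define SCALAR_CACHE_LINE_BYTES 64u   /* cache line size stated above */

/* Hypothetical helper: start address of the cache line that would be discarded.
 * The address is first formed as for S_STORE_DWORD, then aligned down
 * (assumption) to the 64-byte line boundary. */
static uint64_t dcache_discard_line(uint64_t smem_base, uint32_t offset)
{
    uint64_t addr = smem_base + (offset & ~UINT32_C(3));
    return addr & ~(uint64_t)(SCALAR_CACHE_LINE_BYTES - 1);
}
```

Under this assumption, offsets 0x40 through 0x7f from a line-aligned base all select the same line.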

#### S_DCACHE_DISCARD_X2

Opcode: 41 (0x29) only for GCN 1.4  
Syntax: S_DCACHE_DISCARD_X2 SBASE(2), SOFFSET1  
Description: Discard two dirty scalar data cache lines. A cache line is 64
bytes. The address is calculated as for S_STORE_DWORD, with alignment to a 64-byte boundary.
The LGKM count is incremented by 1 for this opcode.

#### S_DCACHE_INV

Opcode: 32 (0x20)  
Syntax: S_DCACHE_INV  
Description: Invalidate the entire L1 K cache.

#### S_DCACHE_INV_VOL

Opcode: 34 (0x22)  
Syntax: S_DCACHE_INV_VOL  
Description: Invalidate all volatile lines in the L1 K cache.

#### S_LOAD_DWORD

Opcode: 0 (0x0)  
Syntax: S_LOAD_DWORD SDATA, SBASE(2), OFFSET  
Description: Load a single dword from read-only memory through the constant cache (kcache).  
Operation:  
```
SDATA = *(UINT32*)(SMEM + (OFFSET & ~3))
```
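
To connect the syntax to the field table at the top of this document, the C sketch below packs an SMEM instruction with an immediate offset into its two dwords using the GCN 1.2 layout (SBASE in bits 0-5, SDATA in bits 6-12, GLC in bit 16, IMM in bit 17, OPCODE in bits 18-25, ENCODING 0b110000 in bits 26-31, OFFSET in bits 32-51). The function `smem_encode_imm` is hypothetical and is not the CLRX assembler's code; it assumes SBASE is stored as the SGPR number divided by two.

```c
#include <stdint.h>

/* Hypothetical encoder for the GCN 1.2 SMEM layout with an immediate offset.
 * sbase_sgpr - first SGPR of the aligned base pair (must be even)
 * sdata_sgpr - first destination SGPR
 * opcode     - SMEM opcode, e.g. 0 for S_LOAD_DWORD
 * offset     - unsigned 20-bit byte offset
 * glc        - nonzero to set the GLC bit */
static void smem_encode_imm(uint32_t out[2], unsigned sbase_sgpr,
                            unsigned sdata_sgpr, unsigned opcode,
                            uint32_t offset, int glc)
{
    out[0] = ((sbase_sgpr >> 1) & 0x3fu)    /* bits 0-5: SBASE pair number */
           | ((sdata_sgpr & 0x7fu) << 6)    /* bits 6-12: SDATA */
           | ((glc ? 1u : 0u) << 16)        /* bit 16: GLC */
           | (1u << 17)                     /* bit 17: IMM=1, OFFSET is immediate */
           | ((opcode & 0xffu) << 18)       /* bits 18-25: OPCODE */
           | (0x30u << 26);                 /* bits 26-31: ENCODING 0b110000 */
    out[1] = offset & 0xfffffu;             /* bits 32-51: OFFSET */
}
```

Under these assumptions, `S_LOAD_DWORD s4, s[2:3], 0x10` packs to the dwords 0xC0020101 and 0x00000010.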

#### S_LOAD_DWORDX16

Opcode: 4 (0x4)  
Syntax: S_LOAD_DWORDX16 SDATA(16), SBASE(2), OFFSET  
Description: Load 16 dwords from read-only memory through the constant cache (kcache).  
Operation:  
```
for (BYTE i = 0; i < 16; i++)
    SDATA[i] = *(UINT32*)(SMEM + i*4 + (OFFSET & ~3))
```

#### S_LOAD_DWORDX2

Opcode: 1 (0x1)  
Syntax: S_LOAD_DWORDX2 SDATA(2), SBASE(2), OFFSET  
Description: Load two dwords from read-only memory through the constant cache (kcache).  
Operation:  
```
SDATA = *(UINT64*)(SMEM + (OFFSET & ~3))
```

#### S_LOAD_DWORDX4

Opcode: 2 (0x2)  
Syntax: S_LOAD_DWORDX4 SDATA(4), SBASE(2), OFFSET  
Description: Load four dwords from read-only memory through the constant cache (kcache).  
Operation:  
```
for (BYTE i = 0; i < 4; i++)
    SDATA[i] = *(UINT32*)(SMEM + i*4 + (OFFSET & ~3))
```

#### S_LOAD_DWORDX8

Opcode: 3 (0x3)  
Syntax: S_LOAD_DWORDX8 SDATA(8), SBASE(2), OFFSET  
Description: Load eight dwords from read-only memory through the constant cache (kcache).  
Operation:  
```
for (BYTE i = 0; i < 8; i++)
    SDATA[i] = *(UINT32*)(SMEM + i*4 + (OFFSET & ~3))
```

#### S_MEMREALTIME

Opcode: 37 (0x25)  
Syntax: S_MEMREALTIME SDATA(2)  
Description: Store the value of the 64-bit RTC counter to SDATA.
Before reading the result, S_WAITCNT LGKMCNT(0) is required.  
Operation:  
```
SDATA = CLOCKCNT
```

#### S_MEMTIME

Opcode: 36 (0x24)  
Syntax: S_MEMTIME SDATA(2)  
Description: Store the value of the 64-bit clock counter to SDATA.
This "time" is a free-running clock counter based on the shader core clock.
Before reading the result, S_WAITCNT LGKMCNT(0) is required.  
Operation:  
```
SDATA = CLOCKCNT
```
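
A typical use of the counter is to take two readings and subtract them. The C sketch below shows how two such readings could be processed on the host after they are written out; both helper names are hypothetical, and it assumes that the first SGPR of SDATA(2) holds the low dword of the counter.

```c
#include <stdint.h>

/* Assumption: the first SGPR of SDATA(2) holds the low dword. */
static uint64_t sgpr_pair_to_u64(uint32_t lo, uint32_t hi)
{
    return ((uint64_t)hi << 32) | lo;
}

/* Ticks between two readings of the free-running counter.
 * Unsigned subtraction stays correct across a single wrap of the counter. */
static uint64_t clock_ticks_elapsed(uint64_t start, uint64_t end)
{
    return end - start;
}
```

The result is in shader core clock ticks for S_MEMTIME; converting it to seconds needs the clock frequency, which this document does not specify.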

#### S_SCRATCH_LOAD_DWORD

Opcode: 5 (0x5) only for GCN 1.4  
Syntax: S_SCRATCH_LOAD_DWORD SDATA, SBASE(2), SGPROFFSET OFFSET:OFFSET  
Description: Load a single dword from read-only memory through the constant cache (kcache).  
Operation:  
```
SDATA = *(UINT32*)(SMEM + (OFFSET & ~3) + (SGPROFFSET & ~3)*64)
```
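
The scratch instructions add a second, scaled term to the address: the SGPR offset is dword-aligned and multiplied by 64. A minimal C sketch of the formula above, with the hypothetical helper name `smem_scratch_address`:

```c
#include <stdint.h>

/* Hypothetical helper: byte address used by the S_SCRATCH_* instructions,
 * following SMEM + (OFFSET & ~3) + (SGPROFFSET & ~3)*64 from the pseudocode. */
static uint64_t smem_scratch_address(uint64_t smem_base, uint32_t offset,
                                     uint32_t sgpr_offset)
{
    return smem_base
         + (offset & ~UINT32_C(3))
         + (uint64_t)(sgpr_offset & ~UINT32_C(3)) * 64u;
}
```

For example, with OFFSET = 0x12 and SGPROFFSET = 5, the immediate part contributes 0x10 and the SGPR part contributes 4*64 = 0x100 bytes.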

#### S_SCRATCH_LOAD_DWORDX2

Opcode: 6 (0x6) only for GCN 1.4  
Syntax: S_SCRATCH_LOAD_DWORDX2 SDATA(2), SBASE(2), SGPROFFSET OFFSET:OFFSET  
Description: Load two dwords from read-only memory through the constant cache (kcache).  
Operation:  
```
SDATA = *(UINT64*)(SMEM + (OFFSET & ~3) + (SGPROFFSET & ~3)*64)
```

#### S_SCRATCH_LOAD_DWORDX4

Opcode: 7 (0x7) only for GCN 1.4  
Syntax: S_SCRATCH_LOAD_DWORDX4 SDATA(4), SBASE(2), SGPROFFSET OFFSET:OFFSET  
Description: Load four dwords from read-only memory through the constant cache (kcache).  
Operation:  
```
for (BYTE i = 0; i < 4; i++)
    SDATA[i] = *(UINT32*)(SMEM + i*4 + (OFFSET & ~3) + (SGPROFFSET & ~3)*64)
```

#### S_SCRATCH_STORE_DWORD

Opcode: 21 (0x15) only for GCN 1.4  
Syntax: S_SCRATCH_STORE_DWORD SDATA, SBASE(2), SGPROFFSET OFFSET:OFFSET  
Description: Store a single dword to memory.  
Operation:  
```
*(UINT32*)(SMEM + (OFFSET & ~3) + (SGPROFFSET & ~3)*64) = SDATA
```

#### S_SCRATCH_STORE_DWORDX2

Opcode: 22 (0x16) only for GCN 1.4  
Syntax: S_SCRATCH_STORE_DWORDX2 SDATA(2), SBASE(2), SGPROFFSET OFFSET:OFFSET  
Description: Store two dwords to memory.  
Operation:  
```
*(UINT64*)(SMEM + (OFFSET & ~3) + (SGPROFFSET & ~3)*64) = SDATA
```

#### S_SCRATCH_STORE_DWORDX4

Opcode: 23 (0x17) only for GCN 1.4  
Syntax: S_SCRATCH_STORE_DWORDX4 SDATA(4), SBASE(2), SGPROFFSET OFFSET:OFFSET  
Description: Store four dwords to memory.  
Operation:  
```
for (BYTE i = 0; i < 4; i++)
    *(UINT32*)(SMEM + i*4 + (OFFSET & ~3) + (SGPROFFSET & ~3)*64) = SDATA[i]
```

#### S_STORE_DWORD

Opcode: 16 (0x10)  
Syntax: S_STORE_DWORD SDATA, SBASE(2), OFFSET  
Description: Store a single dword to memory.
The offset can only be M0 or an immediate (GCN 1.2 only).  
Operation:  
```
*(UINT32*)(SMEM + (OFFSET & ~3)) = SDATA
```
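
The offset restriction for the store instructions depends on the architecture: GCN 1.2 accepts only M0 or an immediate, while GCN 1.4 also accepts an SGPR as the offset register (see the notes at the top of this document). An assembler-side check could be sketched as below; the enums and the function `smem_store_offset_allowed` are hypothetical and do not mirror CLRX internals.

```c
#include <stdbool.h>

enum gcn_arch { GCN_1_2, GCN_1_4 };
enum smem_offset_kind { OFFSET_IMMEDIATE, OFFSET_M0, OFFSET_SGPR };

/* Hypothetical check: may an S_STORE_* / S_BUFFER_STORE_* instruction
 * use this kind of offset operand on the given architecture? */
static bool smem_store_offset_allowed(enum gcn_arch arch,
                                      enum smem_offset_kind kind)
{
    if (kind == OFFSET_IMMEDIATE || kind == OFFSET_M0)
        return true;           /* accepted on both GCN 1.2 and GCN 1.4 */
    return arch == GCN_1_4;    /* an SGPR offset is accepted only on GCN 1.4 */
}
```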

#### S_STORE_DWORDX2

Opcode: 17 (0x11)  
Syntax: S_STORE_DWORDX2 SDATA(2), SBASE(2), OFFSET  
Description: Store two dwords to memory.
The offset can only be M0 or an immediate (GCN 1.2 only).  
Operation:  
```
*(UINT64*)(SMEM + (OFFSET & ~3)) = SDATA
```

#### S_STORE_DWORDX4

Opcode: 18 (0x12)  
Syntax: S_STORE_DWORDX4 SDATA(4), SBASE(2), OFFSET  
Description: Store four dwords to memory.
The offset can only be M0 or an immediate (GCN 1.2 only).  
Operation:  
```
for (BYTE i = 0; i < 4; i++)
    *(UINT32*)(SMEM + i*4 + (OFFSET & ~3)) = SDATA[i]
```