source: CLRX/CLRadeonExtender/trunk/doc/AmdAbi.md @ 2331

Last change on this file since 2331 was 2331, checked in by matszpk, 3 years ago

CLRadeonExtender: Remove obsolete info about scalar/vector initial registers in ABI docs. Add info about initial vector registers to GcnState.

## AMD Catalyst OpenCL ABI description

This chapter describes how a kernel gets its arguments and how it accesses constant data.

In this chapter, sizes are given in dwords. A dword is a 4-byte value.

### User data classes

User data is stored in the first scalar registers. The data class indicates what data is stored.
The following data classes exist:

* IMM_RESOURCE - data for read-only image descriptors. ApiSlot determines the uavid.
* IMM_UAV - data for a UAV (global/constant buffer descriptor or
write-only image descriptor). Holds 4 or 8 registers. ApiSlot determines the uavid.
* IMM_SAMPLER - data for a sampler (4 registers). ApiSlot determines the sampler entry index.
* IMM_CONST_BUFFER - const buffer (4 registers). See below.
ApiSlot determines the const buffer id.
* PTR_RESOURCE_TABLE - pointer to the resource table (2 registers).
Each entry holds 8 dwords. Entries are counted from zero.
The table can be accessed by using SMRD (s_load_dwordxx) instructions.
The resource table holds read-only image descriptors (8 dwords).
* PTR_SAMPLER_TABLE - pointer to the sampler table (2 registers).
The sampler table holds sampler descriptors (4 dwords).
* PTR_UAV_TABLE - pointer to the UAV table (2 registers).
Each entry holds 8 dwords. Entries are counted from zero.
The table can be accessed by using SMRD (s_load_dwordxx) instructions.
The UAV table holds UAVs for global buffers, constant buffers (since the 1384 driver)
and write-only images (8-dword descriptors).
* PTR_CONST_BUFFER_TABLE - pointer to the const buffer table (2 registers).
Each entry has 4 dwords. For drivers older than 1348.05, the global constant buffer
(third entry) and argument constant buffer descriptors are stored in this table.
* PTR_INTERNAL_GLOBAL_TABLE - pointer to the internal global table (2 registers).
Each entry has 4 dwords.
* IMM_SCRATCH_BUFFER - doesn't work (???)
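As a rough illustration of the entry sizes listed above, the following Python sketch computes the byte offset of a table entry, which is the kind of offset an SMRD load would need. The helper name `entry_byte_offset` is hypothetical, and the sketch assumes that entries are packed back to back with no padding.

```python
DWORD_BYTES = 4  # a dword is a 4-byte value

# Entry sizes in dwords, taken from the data class list above.
TABLE_ENTRY_DWORDS = {
    "PTR_RESOURCE_TABLE": 8,
    "PTR_SAMPLER_TABLE": 4,
    "PTR_UAV_TABLE": 8,
    "PTR_CONST_BUFFER_TABLE": 4,
    "PTR_INTERNAL_GLOBAL_TABLE": 4,
}


def entry_byte_offset(data_class, index):
    """Return the byte offset of entry `index` (counted from zero).

    Assumption: entries are tightly packed, so the offset is simply
    index * entry size.
    """
    if index < 0:
        raise ValueError("entry index must be non-negative")
    return index * TABLE_ENTRY_DWORDS[data_class] * DWORD_BYTES
```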

### About resource passing

Resource descriptors for all global pointers are stored in the UAV table, beginning at
id UAVID+1. By default UAVID=11 (or UAVID=9 for drivers older than 1384.xx).
By default, the 10th entry is reserved for the global data constant buffer.
The 9th entry is reserved for the printf buffer.
The first eight entries are write-only image descriptors, if defined.

Read-only image descriptors are stored in the resource table.
Constant buffer descriptors (0 and 1) are stored in the const buffer table.
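The UAVID rule above can be sketched in Python as follows. Both function names are hypothetical. Representing the driver version by its integer major number is an assumption, and so is the consecutive assignment of ids to successive global pointers; the text only states that the ids begin at UAVID+1.

```python
def default_uavid(driver_major):
    """Default UAVID: 9 for drivers older than 1384.xx, otherwise 11."""
    return 9 if driver_major < 1384 else 11


def global_pointer_uav_id(driver_major, pointer_index):
    """UAV table id for the n-th global pointer (counted from zero).

    Assumption: ids are handed out consecutively starting at UAVID+1.
    """
    if pointer_index < 0:
        raise ValueError("pointer index must be non-negative")
    return default_uavid(driver_major) + 1 + pointer_index
```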

### Argument passing and kernel setup

The first const buffer (id=0) holds:

* 0-2 dwords - global size for each dimension
* 3 dword - number of dimensions
* 4-6 dwords - local size for each dimension
* 8-10 dwords - number of groups for each dimension
* 24-26 dwords - global offset for each dimension
* 27 dword - get_global_offset(0)\*(workDim>=1?get_global_offset(1):1)\*
            (workDim==2?get_global_offset(2):1)
* 32 dword (32-bit binary) - global constant buffer offset
* 32-33 dwords (64-bit binary) - global constant buffer offset
* 36-38 dwords (32-bit binary) - global offset for each dimension
* 37-39 dwords (64-bit binary) - global offset for each dimension
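To make the layout above concrete, here is a minimal Python sketch that decodes a raw copy of the first const buffer. The function name `parse_const_buffer0` is hypothetical. The sketch assumes little-endian dwords and, for 64-bit binaries, that the low dword of the global constant buffer offset comes first; neither is stated in this chapter.

```python
import struct


def parse_const_buffer0(data, is_64bit=False):
    """Decode selected fields of const buffer 0 from raw bytes.

    Assumptions: dwords are little-endian, and the 64-bit global
    constant buffer offset stores its low dword at index 32.
    """
    count = len(data) // 4
    if count < 34:
        raise ValueError("need at least 34 dwords")
    dwords = struct.unpack("<%dI" % count, data[:count * 4])
    info = {
        "global_size": dwords[0:3],
        "work_dim": dwords[3],
        "local_size": dwords[4:7],
        "num_groups": dwords[8:11],
        "global_offset": dwords[24:27],
    }
    if is_64bit:
        info["global_const_offset"] = dwords[32] | (dwords[33] << 32)
    else:
        info["global_const_offset"] = dwords[32]
    return info
```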

The second const buffer (id=1) holds arguments aligned to 4 dwords.

Global pointers hold a vector offset (64-bit for 64-bit binary) to memory.
Local pointers hold their offset in bytes (1 dword).
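One way to read the 4-dword alignment is that every argument starts on the next 4-dword boundary of the second const buffer. The Python sketch below computes argument offsets under that assumption; the helper `argument_offsets` is hypothetical and is not part of any driver or CLRX API.

```python
ALIGN_DWORDS = 4  # arguments are aligned to 4 dwords


def argument_offsets(arg_sizes_dwords):
    """Return the starting dword offset of each argument.

    Assumption: each argument begins at the next multiple of 4 dwords
    after the end of the previous one.
    """
    offsets = []
    pos = 0
    for size in arg_sizes_dwords:
        offsets.append(pos)
        # Round the end of this argument up to a 4-dword boundary.
        pos = (pos + size + ALIGN_DWORDS - 1) // ALIGN_DWORDS * ALIGN_DWORDS
    return offsets
```

For example, under this reading a 1-dword scalar, a 2-dword pointer, an 8-dword image and another 1-dword scalar would start at dwords 0, 4, 8 and 16.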

### Image arguments

An image argument needs 8 dwords:

* 0 dword - width
* 1 dword - height
* 2 dword - depth
* 3 dword - OpenCL image format data type
* 7 dword - OpenCL image component order
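The image argument layout above can be read with a small Python sketch like the one below. The function and field names are hypothetical. Dwords 4 to 6 are not described in this chapter, so the sketch leaves them uninterpreted.

```python
def parse_image_argument(dwords):
    """Split an 8-dword image argument into the documented fields.

    Dwords 4-6 are not documented here and are ignored.
    """
    if len(dwords) != 8:
        raise ValueError("an image argument needs exactly 8 dwords")
    return {
        "width": dwords[0],
        "height": dwords[1],
        "depth": dwords[2],
        "data_type": dwords[3],        # OpenCL image format data type
        "component_order": dwords[7],  # OpenCL image component order
    }
```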

### Sampler arguments

A sampler argument holds the sampler value:

* 0 bit - 1 for normalized coords, 0 otherwise
* 1-3 bits - addressing mode:
    0 - none, 1 - repeat, 2 - clamp_to_edge, 3 - clamp, 4 - mirrored_repeat
* 4-5 bits - filtering: 0 - none, 1 - nearest, 2 - linear
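The bit fields above can be unpacked as in this short Python sketch. The name `decode_sampler` and the "unknown" fallback for values outside the listed ranges are illustrative choices, not part of the ABI.

```python
ADDRESSING_MODES = {
    0: "none",
    1: "repeat",
    2: "clamp_to_edge",
    3: "clamp",
    4: "mirrored_repeat",
}
FILTERING_MODES = {0: "none", 1: "nearest", 2: "linear"}


def decode_sampler(value):
    """Decode a sampler value into its three documented fields."""
    return {
        "normalized_coords": bool(value & 0x1),                              # bit 0
        "addressing_mode": ADDRESSING_MODES.get((value >> 1) & 0x7, "unknown"),  # bits 1-3
        "filtering": FILTERING_MODES.get((value >> 4) & 0x3, "unknown"),     # bits 4-5
    }
```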

### Scratch buffer access

The second entry in the internal global table holds the scratch buffer descriptor.
Refer to [GCN Machine State](GcnState) to learn about the initial vector and scalar registers.