Changes between Version 1 and Version 2 of AmdAbi
Timestamp: 11/06/15 22:25:44
AmdAbi
--- AmdAbi (v1)
+++ AmdAbi (v2)
@@ -4,15 +4,41 @@
 <h2>AMD Catalyst OpenCL ABI description</h2>
 <p>This chapter describes how a kernel gets its arguments and how it accesses constant data.</p>
+<p>In this chapter, sizes are given in dwords. A dword is a 4-byte value.</p>
 <h3>User data classes</h3>
 <p>User data is stored in the first scalar registers. The data class indicates what data is stored.
 The data classes are as follows:</p>
 <ul>
-<li>IMM_UAV - data to uav. Holds 4 registers. ApiSlot determines uavid.</li>
+<li>IMM_RESOURCE - data for read_only image descriptors. ApiSlot determines uavid.</li>
+<li>IMM_UAV - data for UAV (global/constant buffer descriptor or
+write only image descriptor). Holds 4 or 8 registers. ApiSlot determines uavid.</li>
+<li>IMM_SAMPLER - data for sampler (4 registers). ApiSlot determines sampler entry index.</li>
 <li>IMM_CONST_BUFFER - const buffer (4 registers). See below.
 ApiSlot determines const buffer id.</li>
+<li>PTR_RESOURCE_TABLE - pointer to resource table (2 registers).
+Each entry holds 8 dwords. Entries are counted from zero.
+The table can be accessed by using SMRD (s_load_dwordxx) instructions.
+The resource table holds read-only image descriptors (8 dwords).</li>
+<li>PTR_SAMPLER_TABLE - pointer to sampler table (2 registers).
+The sampler table holds sampler descriptors (4 dwords).</li>
 <li>PTR_UAV_TABLE - pointer to uav table (2 registers).
 Each entry holds 8 dwords. Entries are counted from zero.
-Table can be accessed by using SMRD (s_load_dwordxx) instructions.</li>
+The table can be accessed by using SMRD (s_load_dwordxx) instructions.
+The UAV table holds UAVs for global buffers, constant buffers (since 1384 driver)
+and write only images (8 dwords descriptors).</li>
+<li>PTR_CONST_BUFFER_TABLE - pointer to const buffer table (2 registers).
+Each entry has 4 dwords. For drivers older than 1348.05, the global constant buffer
+(third entry) and argument constant buffer descriptors are stored in this table.</li>
+<li>PTR_INTERNAL_GLOBAL_TABLE - pointer to internal global table (2 registers).
+Each entry has 4 dwords.</li>
+<li>IMM_SCRATCH_BUFFER - doesn't work (???)</li>
 </ul>
+<h3>About resource passing</h3>
+<p>All resource descriptors for global pointers are stored in the UAV table, beginning at
+id UAVID+1. By default UAVID=11 (or UAVID=9 for drivers older than 1384.xx).
+By default, the 10th entry is reserved for the global data constant buffer.
+The 9th entry is reserved for the printf buffer.
+The first eight entries hold write only image descriptors, if defined.</p>
+<p>Read only image descriptors are stored in the resource table.
+Constant buffer descriptors (0 and 1) are stored in the const buffer table.</p>
 <h3>Argument passing and kernel setup</h3>
 <p>First const buffer (id=0) holds:</p>
@@ -25,9 +51,12 @@
 <li>27 dword - get_global_offset(0)*(workDim>=1?get_global_offset(1):1)*
 (workDim==2?get_global_offset(2):1)</li>
+<li>32 dword (32-bit binary) - global constant buffer offset</li>
+<li>32-33 dword (64-bit binary) - global constant buffer offset</li>
 <li>36-38 dwords (32-bit binary) - global offset for each dimension</li>
 <li>37-39 dwords (64-bit binary) - global offset for each dimension</li>
 </ul>
-<p>Second const buffer (id=1) holds:</p>
-<p>arguments aligned to 4 dwords.</p>
+<p>Second const buffer (id=1) holds arguments aligned to 4 dwords.</p>
+<p>Global pointers hold a vector offset (64-bit for 64-bit binary) to memory.
+Local pointers hold their offset in bytes (1 dword).</p>
 <h3>Other data and resources</h3>
 <p>Scalar registers after userdata hold (n - userdatanum):</p>
@@ -36,3 +65,24 @@
 </ul>
 <p>The first three vector registers hold the local ids for each dimension.</p>
+<h3>Image arguments</h3>
+<p>Image arguments need 8 dwords.</p>
+<ul>
+<li>0 dword - width</li>
+<li>1 dword - height</li>
+<li>2 dword - depth</li>
+<li>3 dword - OpenCL image format data type</li>
+<li>7 dword - OpenCL image component order</li>
+</ul>
+<h3>Sampler arguments</h3>
+<p>Sampler argument holds sampler value:</p>
+<ul>
+<li>0 bit - 1 for normalized coords, 0 otherwise</li>
+<li>1-3 bits - addressing mode:
+0 - none, 1 - repeat, 2 - clamp_to_edge, 3 - clamp, 4 - mirrored_repeat</li>
+<li>4-5 bits - filtering: 0 - none, 1 - nearest, 2 - linear</li>
+</ul>
+<h3>Scratch buffer access</h3>
+<p>The second entry in the internal global table holds the scratch buffer descriptor.
+The s[n+3] register holds the wavefront offset into the scratch buffer,
+where n is userdatanum.</p>
 }}}
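The sampler value layout listed under "Sampler arguments" can be illustrated with a small decoder. This is only a sketch: the function name `decode_sampler` and the dictionary keys are hypothetical and not part of any driver or tool API. The bit positions and mode numbers are taken from the list in that section, and values not named there are reported as unknown.

```python
# Sketch only: decode_sampler and its result keys are hypothetical names.
# Bit layout assumed from the "Sampler arguments" list:
#   bit 0     - normalized coords flag
#   bits 1-3  - addressing mode
#   bits 4-5  - filtering

ADDRESSING_MODES = {
    0: "none",
    1: "repeat",
    2: "clamp_to_edge",
    3: "clamp",
    4: "mirrored_repeat",
}

FILTER_MODES = {
    0: "none",
    1: "nearest",
    2: "linear",
}


def decode_sampler(value):
    """Split a sampler argument value into its three documented fields."""
    normalized = bool(value & 0x1)      # bit 0
    addressing = (value >> 1) & 0x7     # bits 1-3
    filtering = (value >> 4) & 0x3      # bits 4-5
    return {
        "normalized_coords": normalized,
        "addressing_mode": ADDRESSING_MODES.get(addressing, "unknown(%d)" % addressing),
        "filtering": FILTER_MODES.get(filtering, "unknown(%d)" % filtering),
    }


if __name__ == "__main__":
    # 37 = 0b100101: normalized coords, clamp_to_edge, linear filtering
    print(decode_sampler(37))
```

For example, a value of 37 (binary 100101) has bit 0 set, addressing mode 2 and filtering 2, so it decodes to normalized coordinates with clamp_to_edge addressing and linear filtering.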