Changes between Version 1 and Version 2 of AmdAbi


Timestamp:
11/06/15 22:25:44 (8 years ago)
Author:
trac

  • AmdAbi

    v1 v2  
    44<h2>AMD Catalyst OpenCL ABI description</h2>
    55<p>This chapter describes how a kernel receives its arguments and how it accesses constant data.</p>
     6<p>In this chapter, sizes are given in dwords. A dword is a 4-byte value.</p>
    67<h3>User data classes</h3>
    78<p>User data is stored in the first scalar registers. The data class indicates what data is stored.
    89The following data classes exist:</p>
    910<ul>
    10 <li>IMM_UAV - data to uav. Holds 4 registers. ApiSlot determines uavid.</li>
     11<li>IMM_RESOURCE - data for read_only image descriptors. ApiSlot determines uavid.</li>
     12<li>IMM_UAV - data for UAV (global/constant buffer descriptor or
     13write only image descriptor). Holds 4 or 8 registers. ApiSlot determines uavid.</li>
     14<li>IMM_SAMPLER - data for sampler (4 registers). ApiSlot determines sampler entry index.</li>
    1115<li>IMM_CONST_BUFFER - const buffer (4 registers). See below.
    1216ApiSlot determines const buffer id.</li>
     17<li>PTR_RESOURCE_TABLE - pointer to resource table (2 registers).
     18Each entry holds 8 dwords. Entries are counted from zero.
     19The table can be accessed by using SMRD (s_load_dwordxx) instructions.
     20The resource table holds read-only image descriptors (8 dwords).</li>
     21<li>PTR_SAMPLER_TABLE - pointer to sampler table (2 registers).
     22The sampler table holds sampler descriptors (4 dwords).</li>
    1323<li>PTR_UAV_TABLE - pointer to uav table (2 registers).
    1424Each entry holds 8 dwords. Count from zero.
    15 Table can be accessed by using SMRD (s_load_dwordxx) instructions.</li>
     25The table can be accessed by using SMRD (s_load_dwordxx) instructions.
     26The UAV table holds UAVs for global buffers, constant buffers (since 1384 driver)
     27and write only images (8-dword descriptors).</li>
     28<li>PTR_CONST_BUFFER_TABLE - pointer to const buffer table (2 registers).
     29Each entry has 4 dwords. For drivers older than 1348.05, the global constant buffer
     30(third entry) and argument constant buffer descriptors are stored in this table.</li>
     31<li>PTR_INTERNAL_GLOBAL_TABLE - pointer to internal global table (2 registers).
     32Each entry has 4 dwords.</li>
     33<li>IMM_SCRATCH_BUFFER - doesn't work (???)</li>
    1634</ul>
     35<h3>About resource passing</h3>
     36<p>Resource descriptors for all global pointers are stored in the UAV table, beginning at
     37id UAVID+1. By default UAVID=11 (or UAVID=9 for drivers older than 1384.xx).
     38By default, the 10th entry is reserved for the global data constant buffer.
     39The 9th entry is reserved for the printf buffer.
     40The first eight entries are write only image descriptors, if defined.</p>
     41<p>Read only image descriptors are stored in the resource table.
     42Constant buffer descriptors (0 and 1) are stored in the const buffer table.</p>
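<p>As a rough illustration of the layout above, the following Python sketch computes where the descriptor of a global pointer would sit in the UAV table. The function name <code>uav_entry_byte_offset</code> is hypothetical, and the idea that the k-th global pointer uses entry UAVID+1+k is an assumption rather than something the driver documents; the 8-dword entry size and the default UAVID values come from the text.</p>

```python
DWORD_BYTES = 4
UAV_ENTRY_DWORDS = 8  # each UAV table entry holds 8 dwords


def uav_entry_byte_offset(pointer_index, uavid=11):
    """Byte offset of the descriptor for the pointer_index-th global pointer.

    Assumption: global pointer descriptors occupy consecutive entries
    starting at UAVID+1. Pass uavid=9 for drivers older than 1384.xx.
    """
    if pointer_index < 0:
        raise ValueError("pointer_index must be non-negative")
    entry = uavid + 1 + pointer_index
    return entry * UAV_ENTRY_DWORDS * DWORD_BYTES


print(uav_entry_byte_offset(0))           # 384: first global pointer, default UAVID
print(uav_entry_byte_offset(0, uavid=9))  # 320: same pointer, older driver
```

<p>Such an offset is the kind of value that would be given to an SMRD load on the UAV table pointer.</p>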
    1743<h3>Argument passing and kernel setup</h3>
    1844<p>First const buffer (id=0) holds:</p>
     
    2551<li>27 dword - get_global_offset(0)*(workDim&gt;=1?get_global_offset(1):1)*
    2652            (workDim==2?get_global_offset(2):1)</li>
     53<li>32 dword (32-bit binary) - global constant buffer offset</li>
     54<li>32-33 dword (64-bit binary) - global constant buffer offset</li>
    2755<li>36-38 dwords (32-bit binary) - global offset for each dimension</li>
    2856<li>37-39 dwords (64-bit binary) - global offset for each dimension</li>
    2957</ul>
    30 <p>Second const buffer (id=1) holds:</p>
    31 <p>arguments aligned to 4 dwords.</p>
     58<p>Second const buffer (id=1) holds arguments aligned to 4 dwords.</p>
     59<p>Global pointers hold a vector offset (64-bit for 64-bit binary) to memory.
     60Local pointers hold their offset in bytes (1 dword).</p>
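<p>One possible reading of the 4-dword alignment rule is that every argument starts on a 4-dword boundary in the second const buffer. The sketch below computes argument offsets under that assumption; the helper name <code>argument_offsets</code> is hypothetical, and the example sizes are made up for illustration.</p>

```python
ALIGN_DWORDS = 4  # arguments are aligned to 4 dwords


def argument_offsets(sizes_in_dwords):
    """Return (offsets, total), both in dwords.

    Assumption: each argument begins on a 4-dword boundary, so its
    slot is its size rounded up to a multiple of 4 dwords.
    """
    offsets = []
    cursor = 0
    for size in sizes_in_dwords:
        offsets.append(cursor)
        # Round the size up to the next multiple of ALIGN_DWORDS.
        slots = -(-size // ALIGN_DWORDS)
        cursor += slots * ALIGN_DWORDS
    return offsets, cursor


# Example: a 1-dword value, a 2-dword 64-bit global pointer, a 5-dword struct.
print(argument_offsets([1, 2, 5]))  # ([0, 4, 8], 16)
```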
    3261<h3>Other data and resources</h3>
    3362<p>Scalar registers after the user data hold (n is userdatanum):</p>
     
    3665</ul>
    3766<p>The first three vector registers hold the local ids for each dimension.</p>
     67<h3>Image arguments</h3>
     68<p>An image argument needs 8 dwords.</p>
     69<ul>
     70<li>0 dword - width</li>
     71<li>1 dword - height</li>
     72<li>2 dword - depth</li>
     73<li>3 dword - OpenCL image format data type</li>
     74<li>7 dword - OpenCL image component order</li>
     75</ul>
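<p>The image argument layout listed above can be sketched as a small parser. The function name <code>parse_image_argument</code> and the dictionary keys are hypothetical; little-endian dword order is assumed, and dwords 4-6 are ignored here because the list does not describe them.</p>

```python
import struct

IMAGE_ARG_DWORDS = 8  # an image argument occupies 8 dwords


def parse_image_argument(raw):
    """Split a 32-byte image argument into the documented fields.

    Assumption: dwords are stored little-endian.
    """
    if len(raw) != IMAGE_ARG_DWORDS * 4:
        raise ValueError("image argument must be exactly 8 dwords (32 bytes)")
    dwords = struct.unpack("<8I", raw)
    return {
        "width": dwords[0],
        "height": dwords[1],
        "depth": dwords[2],
        "data_type": dwords[3],        # OpenCL image format data type
        "component_order": dwords[7],  # OpenCL image component order
    }
```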
     76<h3>Sampler arguments</h3>
     77<p>Sampler argument holds sampler value:</p>
     78<ul>
     79<li>0 bit - 1 for normalized coords, 0 otherwise</li>
     80<li>1-3 bits - addressing mode:
     81    0 - none, 1 - repeat, 2 - clamp_to_edge, 3 - clamp, 4 - mirrored_repeat</li>
     82<li>4-5 bits - filtering: 0 - none, 1 - nearest, 2 - linear</li>
     83</ul>
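<p>The bit fields above can be unpacked with a few shifts and masks. This is a minimal sketch; the name <code>decode_sampler</code> and the returned tuple shape are illustrative choices, while the bit positions and mode names follow the list above.</p>

```python
ADDRESSING_MODES = {
    0: "none",
    1: "repeat",
    2: "clamp_to_edge",
    3: "clamp",
    4: "mirrored_repeat",
}
FILTER_MODES = {0: "none", 1: "nearest", 2: "linear"}


def decode_sampler(value):
    """Return (normalized_coords, addressing_mode, filtering) for a sampler value."""
    normalized = bool(value & 0x1)     # bit 0
    addressing = (value >> 1) & 0x7    # bits 1-3
    filtering = (value >> 4) & 0x3     # bits 4-5
    return (
        normalized,
        ADDRESSING_MODES.get(addressing, "unknown"),
        FILTER_MODES.get(filtering, "unknown"),
    )


print(decode_sampler(0x25))  # (True, 'clamp_to_edge', 'linear')
```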
     84<h3>Scratch buffer access</h3>
     85<p>The second entry in the internal global table holds the scratch buffer descriptor.
     86The s[n+3] register holds the wavefront offset into the scratch buffer,
     87where n is userdatanum.</p>
    3888}}}