Changes between Version 1 and Version 2 of ClrxAsmAmd


Timestamp:
10/27/15 20:41:43 (8 years ago)
Author:
trac
Comment:

--

  • ClrxAsmAmd

    v1 v2  
    11{{{
    2 #!Markdown
    3 ## CLRadeonExtender Assembler AMD Catalyst handling
    4 
    5 The AMD Catalyst driver provides its own OpenCL implementation that can generate
     2#!html
     3<h2>CLRadeonExtender Assembler AMD Catalyst handling</h2>
     4<p>The AMD Catalyst driver provides its own OpenCL implementation that can generate
    65its own binaries of OpenCL programs. The CLRX assembler supports only the OpenCL 1.2
    7 binary format.
    8 
    9 ## Binary format
    10 
    11 The AMD OpenCL binaries contain constant global data, device and compilation
    12 information, and embedded kernel binaries. Kernel binaries are inside the `.text` section.
     6binary format.</p>
     7<h2>Binary format</h2>
     8<p>The AMD OpenCL binaries contain constant global data, device and compilation
     9information, and embedded kernel binaries. Kernel binaries are inside the <code>.text</code> section.
    1310Program code is separate for each kernel; no machine code is shared between kernels.
    1411Each kernel binary has a metadata string, ATI CAL notes and program code.
     
    1714ATI CAL notes are small data fragments that describe features of the kernel.
    1815The most important ATI CAL note is PROGINFO, which holds data needed for runtime execution,
    19 like register usage, UAV usage, floating point setup.
    20 
    21 ## Layout of the source code
    22 
    23 The CLRX assembler allows kernel setup to be configured in one of two ways:
    24 a human-readable form (`.config`) or a form for quick recompilation (ATI CAL notes and the metadata string).
    25 
    26 ## List of the specific pseudo-operations
    27 
    28 ### .arg
    29 
    30 Syntax for scalar: .arg ARGNAME[[, "ARGTYPENAME"], ARGTYPE[, unused]] 
    31 Syntax for structure: .arg ARGNAME[[, "ARGTYPENAME"], ARGTYPE[, STRUCTSIZE[, unused]]] 
    32 Syntax for image: .arg ARGNAME[[, "ARGTYPENAME"], ARGTYPE[, [ACCESS] [, RESID[, unused]]]] 
    33 Syntax for counter32: .arg ARGNAME[[, "ARGTYPENAME"], ARGTYPE[, RESID[, unused]]] 
     16like register usage, UAV usage, floating point setup.</p>
     17<h2>Layout of the source code</h2>
     18<p>The CLRX assembler allows kernel setup to be configured in one of two ways:
     19a human-readable form (<code>.config</code>) or a form for quick recompilation (ATI CAL notes and the metadata string).</p>
     20<h2>List of the specific pseudo-operations</h2>
     21<h3>.arg</h3>
     22<p>Syntax for scalar: .arg ARGNAME[[, "ARGTYPENAME"], ARGTYPE[, unused]]<br />
     23Syntax for structure: .arg ARGNAME[[, "ARGTYPENAME"], ARGTYPE[, STRUCTSIZE[, unused]]]<br />
     24Syntax for image: .arg ARGNAME[[, "ARGTYPENAME"], ARGTYPE[, [ACCESS] [, RESID[, unused]]]]<br />
     25Syntax for counter32: .arg ARGNAME[[, "ARGTYPENAME"], ARGTYPE[, RESID[, unused]]]<br />
    3426Syntax for global pointer: .arg ARGNAME[[, "ARGTYPENAME"],
    35 ARGTYPE[[, STRUCTSIZE], PTRSPACE[, [ACCESS] [, RESID[, unused]]]]] 
     27ARGTYPE[[, STRUCTSIZE], PTRSPACE[, [ACCESS] [, RESID[, unused]]]]]<br />
    3628Syntax for local pointer: .arg ARGNAME[[, "ARGTYPENAME"],
    37 ARGTYPE[[, STRUCTSIZE], PTRSPACE[, [ACCESS] [, unused]]]] 
     29ARGTYPE[[, STRUCTSIZE], PTRSPACE[, [ACCESS] [, unused]]]]<br />
    3830Syntax for constant pointer: .arg ARGNAME[[, "ARGTYPENAME"],
    39 ARGTYPE[[, STRUCTSIZE], PTRSPACE[, [ACCESS] [, [CONSTSIZE] [, RESID[, unused]]]]]]
    40 
    41 Adds kernel argument definition. Must be inside kernel configuration. First argument is
     31ARGTYPE[[, STRUCTSIZE], PTRSPACE[, [ACCESS] [, [CONSTSIZE] [, RESID[, unused]]]]]]</p>
     32<p>Adds kernel argument definition. Must be inside kernel configuration. First argument is
    4233argument name from OpenCL kernel definition. Next optional argument is argument type name
    43 from OpenCL kernel definition. Next argument is argument type:
    44 
    45 * char, uchar, short, ushort, int, uint, ulong, long, float, double - simple scalar types
    46 * charX, ucharX, shortX, ushortX, intX, uintX, ulongX, longX, floatX, doubleX - vector types
    47 (X indicates number of elements: 2, 3, 4, 8 or 16)
    48 * counter32 - 32-bit counter type
    49 * structure - structure
    50 * image, image1d, image1d_array, image1d_buffer, image2d, image2d_array, image3d -
    51 image types
    52 * sampler - sampler
    53 * type* - pointer to data
    54 
    55 The remaining arguments depend on the type of the kernel argument. STRUCTSIZE determines the size of
    56 the structure. ACCESS for an image can be one of: `read_only` or `write_only`.
     34from OpenCL kernel definition. Next argument is argument type:</p>
     35<ul>
     36<li>char, uchar, short, ushort, int, uint, ulong, long, float, double - simple scalar types</li>
     37<li>charX, ucharX, shortX, ushortX, intX, uintX, ulongX, longX, floatX, doubleX - vector types
     38(X indicates number of elements: 2, 3, 4, 8 or 16)</li>
     39<li>counter32 - 32-bit counter type</li>
     40<li>structure - structure</li>
     41<li>image, image1d, image1d_array, image1d_buffer, image2d, image2d_array, image3d -
     42image types</li>
     43<li>sampler - sampler</li>
     44<li>type* - pointer to data</li>
     45</ul>
     46<p>The remaining arguments depend on the type of the kernel argument. STRUCTSIZE determines the size of
     47the structure. ACCESS for an image can be one of: <code>read_only</code> or <code>write_only</code>.
    5748PTRSPACE determines space where pointer points to.
    58 It can be one of: `local`, `constant` or `global`.
    59 ACCESS for pointers can be: `const`, `restrict` and `volatile`.
     49It can be one of: <code>local</code>, <code>constant</code> or <code>global</code>.
     50ACCESS for pointers can be: <code>const</code>, <code>restrict</code> and <code>volatile</code>.
    6051CONSTSIZE determines maximum size in bytes for constant buffer.
    61 RESID determines resource id.
    62 
    63 * for global or constant pointers is UAVID, range is in 8-1023.
    64 * for constant pointers (driver older than 1348.X), range is in 1-159.
    65 * for read only images range is in 0-127.
    66 * For write only images or counters range is in 0-7.
    67 
    68 The last argument `unused` indicates that argument will not be used by kernel.
    69 
    70 Sample usage:
    71 
    72 ```
    73 .arg v1,"double_t",double
     52RESID determines resource id.</p>
     53<ul>
     54<li>for global or constant pointers is UAVID, range is in 8-1023.</li>
     55<li>for constant pointers (driver older than 1348.X), range is in 1-159.</li>
     56<li>for read only images range is in 0-127.</li>
     57<li>For write only images or counters range is in 0-7.</li>
     58</ul>
     59<p>The last argument <code>unused</code> indicates that argument will not be used by kernel.</p>
     60<p>Sample usage:</p>
     61<p><code>.arg v1,"double_t",double
    7462.arg v2,double2
    7563.arg v3,double3
     
    7866.arg v41,ulong16  *,global
    7967.arg v42,ulong16  *,global, restrict
    80 .arg v57,structure*,82,global
    81 ```
    82 
    83 ### .boolconsts
    84 
    85 This pseudo-operation must be inside kernel.
    86 Open ATI_BOOL32CONSTS CAL note. Next occurrence in this same kernel, add new CAL note.
    87 
    88 ### .calnote
    89 
    90 Syntax: .calnote CALNOTEID
    91 
    92 This pseudo-operation must be inside kernel. Open ATI CAL note.
    93 
    94 ### .cbid
    95 
    96 Syntax: .cbid
    97 Syntax: .cbid VALUE
    98 
    99 If this pseudo-operation inside ATI_CONSTANT_BUFFERS CAL note then
     68.arg v57,structure*,82,global</code></p>
     69<h3>.boolconsts</h3>
     70<p>This pseudo-operation must be inside kernel.
     71Open ATI_BOOL32CONSTS CAL note. Next occurrence in this same kernel, add new CAL note.</p>
     72<h3>.calnote</h3>
     73<p>Syntax: .calnote CALNOTEID</p>
     74<p>This pseudo-operation must be inside kernel. Open ATI CAL note.</p>
     75<h3>.cbid</h3>
     76<p>Syntax: .cbid
     77Syntax: .cbid VALUE</p>
     78<p>If this pseudo-operation inside ATI_CONSTANT_BUFFERS CAL note then
    10079it adds entry into ATI_CONSTANT_BUFFERS CAL note.
    101 If this pseudo-operation in kernel configuration then set constant buffer id.
    102 
    103 ### .cbmask
    104 
    105 Syntax: .cbmask INDEX, SIZE
    106 
    107 This pseudo-operation must be in ATI_CONSTANT_BUFFERS CAL note.
    108 Add entry into ATI_CONSTANT_BUFFERS CAL note.
    109 
    110 ### .compile_options
    111 
    112 Syntax: .compile_options "STRING"
    113 
    114 Set compile options for this binary.
    115 
    116 ### .condout
    117 
    118 Syntax: .condout [VALUE] 
    119 Syntax: .condout VALUE
    120 
    121 If this pseudo-operation inside kernel then it open ATI_CONDOUT CAL note.
     80If this pseudo-operation in kernel configuration then set constant buffer id.</p>
     81<h3>.cbmask</h3>
     82<p>Syntax: .cbmask INDEX, SIZE</p>
     83<p>This pseudo-operation must be in ATI_CONSTANT_BUFFERS CAL note.
     84Add entry into ATI_CONSTANT_BUFFERS CAL note.</p>
     85<h3>.compile_options</h3>
     86<p>Syntax: .compile_options "STRING"</p>
     87<p>Set compile options for this binary.</p>
     88<h3>.condout</h3>
     89<p>Syntax: .condout [VALUE]<br />
     90Syntax: .condout VALUE</p>
     91<p>If this pseudo-operation inside kernel then it open ATI_CONDOUT CAL note.
    12292Next occurrence in this same kernel, add new CAL note.
    12393Optional argument add 4-byte value to content of this CAL note.
    124 If this pseudo-operation in kernel configuration then set CONDOUT value.
    125 
    126 ### .config
    127 
    128 Open kernel configuration. Must be inside kernel. Kernel configuration can not be
     94If this pseudo-operation in kernel configuration then set CONDOUT value.</p>
     95<h3>.config</h3>
     96<p>Open kernel configuration. Must be inside kernel. Kernel configuration can not be
    12997defined if any CALNote, metadata or header was defined.
    130 Following pseudo-ops can be inside kernel config:
    131 
    132 * .arg
    133 * .cbid
    134 * .condout
    135 * .cws
    136 * .dims
    137 * .earlyexit
    138 * .hwlocal
    138 * .hwregion
    140 * .ieeemode
    141 * .pgmrsrc2
    142 * .printfid
    143 * .privateid
    144 * .sampler
    145 * .scratchbuffer
    146 * .sgprsnum
    147 * .tgsize
    148 * .uavid
    149 * .uavprivate
    150 * .useconstdata
    151 * .useprintf
    152 * .userdata
    153 * .vgprsnum
    154 
    155 ### .constantbuffers
    156 
    157 This pseudo-operation must be inside kernel.
     98Following pseudo-ops can be inside kernel config:</p>
     99<ul>
     100<li>.arg</li>
     101<li>.cbid</li>
     102<li>.condout</li>
     103<li>.cws</li>
     104<li>.dims</li>
     105<li>.earlyexit</li>
     106<li>.hwlocal</li>
     107<li>.hwregion</li>
     108<li>.ieeemode</li>
     109<li>.pgmrsrc2</li>
     110<li>.printfid</li>
     111<li>.privateid</li>
     112<li>.sampler</li>
     113<li>.scratchbuffer</li>
     114<li>.sgprsnum</li>
     115<li>.tgsize</li>
     116<li>.uavid</li>
     117<li>.uavprivate</li>
     118<li>.useconstdata</li>
     119<li>.useprintf</li>
     120<li>.userdata</li>
     121<li>.vgprsnum</li>
     122</ul>
     123<h3>.constantbuffers</h3>
     124<p>This pseudo-operation must be inside kernel.
    158125Open ATI_CONSTANT_BUFFERS CAL note. Next occurrence in this same kernel,
    159 add new CAL note.
    160 
    161 ### .cws
    162 
    163 Syntax: .cws SIZEHINT[, SIZEHINT[, SIZEHINT]]
    164 
    165 This pseudo-operation must be inside kernel configuration.
    166 Set reqd_work_group_size hint for this kernel.
    167 
    168 ### .dims
    169 
    170 Syntax: .dims DIMENSIONS
    171 
    172 This pseudo-operation must be inside kernel configuration. Defines what dimensions
    173 (from list: x, y, z) will be used to determine space of the kernel execution.
    174 
    175 ### .driver_info
    176 
    177 Syntax: .driver_info "INFO"
    178 
    179 Set driver info for this binary.
    180 
    181 ### .driver_version
    182 
    183 Syntax: .driver_version VERSION
    184 
    185 Set driver version for this binary. Version in form: MajorVersion*100+MinorVersion.
    186 This pseudo-op replaces driver info.
    187 
    188 ### .earlyexit
    189 
    190 Syntax: .earlyexit [VALUE] 
    191 Syntax: .earlyexit VALUE
    192 
    193 If this pseudo-operation inside kernel then it open ATI_EARLY_EXIT CAL note.
     126add new CAL note.</p>
     127<h3>.cws</h3>
     128<p>Syntax: .cws SIZEHINT[, SIZEHINT[, SIZEHINT]]</p>
     129<p>This pseudo-operation must be inside kernel configuration.
     130Set reqd_work_group_size hint for this kernel.</p>
     131<h3>.dims</h3>
     132<p>Syntax: .dims DIMENSIONS</p>
     133<p>This pseudo-operation must be inside kernel configuration. Defines what dimensions
     134(from list: x, y, z) will be used to determine space of the kernel execution.</p>
     135<h3>.driver_info</h3>
     136<p>Syntax: .driver_info "INFO"</p>
     137<p>Set driver info for this binary.</p>
     138<h3>.driver_version</h3>
     139<p>Syntax: .driver_version VERSION</p>
     140<p>Set driver version for this binary. Version in form: MajorVersion*100+MinorVersion.
     141This pseudo-op replaces driver info.</p>
     142<h3>.earlyexit</h3>
     143<p>Syntax: .earlyexit [VALUE]<br />
     144Syntax: .earlyexit VALUE</p>
     145<p>If this pseudo-operation inside kernel then it open ATI_EARLY_EXIT CAL note.
    194146Next occurrence in this same kernel, add new CAL note.
    195147Optional argument add 4-byte value to content of this CAL note.
    196 If this pseudo-operation in kernel configuration then set EARLY_EXIT value.
    197 
    198 ### .entry
    199 
    200 Syntax: .entry UAVID, F1, F2, TYPE 
    201 Syntax: .entry VALUE1, VALUE2
    202 
    203 This pseudo-operation must be in ATI_UAV or ATI_PROGINFO CAL note.
     148If this pseudo-operation in kernel configuration then set EARLY_EXIT value.</p>
     149<h3>.entry</h3>
     150<p>Syntax: .entry UAVID, F1, F2, TYPE<br />
     151Syntax: .entry VALUE1, VALUE2</p>
     152<p>This pseudo-operation must be in ATI_UAV or ATI_PROGINFO CAL note.
    204153Add entry into CAL note. For ATI_UAV, pseudo-operation accepts 4 32-bit values.
    205 For ATI_PROGINFO, accepts 2 32-bit values.
    206 
    207 ### .floatconsts
    208 
    209 This pseudo-operation must be inside kernel.
    210 Open ATI_FLOAT32CONSTS CAL note. Next occurrence in this same kernel, add new CAL note.
    211 
    212 ### .floatmode
    213 
    214 Syntax: .floatmode VALUE
    215 
    216 This pseudo-operation must be inside kernel configuration.
    217 Set floatmode. The value must be a byte value.
    218 
    219 ### .globalbuffers
    220 
    221 This pseudo-operation must be inside kernel.
    222 Open ATI_GLOBAL_BUFFERS CAL note. Next occurrence in this same kernel, add new CAL note.
    223 
    224 ### .globaldata
    225 
    226 Go to constant global data section.
    227 
    228 ### .header
    229 
    230 Go to main header of the binary.
    231 
    232 ### .hwlocal
    233 
    234 Syntax: .hwlocal SIZE
    235 
    236 This pseudo-operation must be inside kernel configuration. Set HWLOCAL value, the initial
    237 local data size.
    238 
    239 ### .hwregion
    240 
    241 Syntax: .hwregion VALUE
    242 
    243 This pseudo-operation must be inside kernel configuration. Set HWREGION value.
    244 
    245 ### .ieeemode
    246 
    247 Syntax: .ieeemode
    248 
    249 This pseudo-op must be inside kernel configuration. Set ieee-mode.
    250 
    251 
    252 ### .inputs
    253 
    254 This pseudo-operation must be inside kernel.
    255 Open ATI_INPUTS CAL note. Next occurrence in this same kernel, add new CAL note.
    256 
    257 ### .inputsamplers
    258 
    259 This pseudo-operation must be inside kernel.
    260 Open ATI_INPUT_SAMPLERS CAL note. Next occurrence in this same kernel, add new CAL note.
    261 
    262 ### .intconsts
    263 
    264 This pseudo-operation must be inside kernel.
    265 Open ATI_INT32CONSTS CAL note. Next occurrence in this same kernel, add new CAL note.
    266 
    267 ### .metadata
    268 
    269 This pseudo-operation must be inside kernel.
    270 Go to metadata content.
    271 
    272 ### .outputs
    273 
    274 This pseudo-operation must be inside kernel.
    275 Open ATI_OUTPUTS CAL note. Next occurrence in this same kernel, add new CAL note.
    276 
    277 ### .persistentbuffers
    278 
    279 This pseudo-operation must be inside kernel.
     154For ATI_PROGINFO, accepts 2 32-bit values.</p>
     155<h3>.floatconsts</h3>
     156<p>This pseudo-operation must be inside kernel.
     157Open ATI_FLOAT32CONSTS CAL note. Next occurrence in this same kernel, add new CAL note.</p>
     158<h3>.floatmode</h3>
     159<p>Syntax: .floatmode VALUE</p>
     160<p>This pseudo-operation must be inside kernel configuration.
     161Set floatmode. The value must be a byte value.</p>
     162<h3>.globalbuffers</h3>
     163<p>This pseudo-operation must be inside kernel.
     164Open ATI_GLOBAL_BUFFERS CAL note. Next occurrence in this same kernel, add new CAL note.</p>
     165<h3>.globaldata</h3>
     166<p>Go to constant global data section.</p>
     167<h3>.header</h3>
     168<p>Go to main header of the binary.</p>
     169<h3>.hwlocal</h3>
     170<p>Syntax: .hwlocal SIZE</p>
     171<p>This pseudo-operation must be inside kernel configuration. Set HWLOCAL value, the initial
     172local data size.</p>
     173<h3>.hwregion</h3>
     174<p>Syntax: .hwregion VALUE</p>
     175<p>This pseudo-operation must be inside kernel configuration. Set HWREGION value.</p>
     176<h3>.ieeemode</h3>
     177<p>Syntax: .ieeemode</p>
     178<p>This pseudo-op must be inside kernel configuration. Set ieee-mode.</p>
     179<h3>.inputs</h3>
     180<p>This pseudo-operation must be inside kernel.
     181Open ATI_INPUTS CAL note. Next occurrence in this same kernel, add new CAL note.</p>
     182<h3>.inputsamplers</h3>
     183<p>This pseudo-operation must be inside kernel.
     184Open ATI_INPUT_SAMPLERS CAL note. Next occurrence in this same kernel, add new CAL note.</p>
     185<h3>.intconsts</h3>
     186<p>This pseudo-operation must be inside kernel.
     187Open ATI_INT32CONSTS CAL note. Next occurrence in this same kernel, add new CAL note.</p>
     188<h3>.metadata</h3>
     189<p>This pseudo-operation must be inside kernel.
     190Go to metadata content.</p>
     191<h3>.outputs</h3>
     192<p>This pseudo-operation must be inside kernel.
     193Open ATI_OUTPUTS CAL note. Next occurrence in this same kernel, add new CAL note.</p>
     194<h3>.persistentbuffers</h3>
     195<p>This pseudo-operation must be inside kernel.
    280196Open ATI_PERSISTENT_BUFFERS CAL note. Next occurrence in this same kernel,
    281 add new CAL note.
    282 
    283 ### .pgmrsrc2
    284 
    285 Syntax: .pgmrsrc2 VALUE
    286 
    287 This pseudo-operation must be inside kernel configuration. Set PGMRSRC2 value (except bits
    288 that can be set by using other pseudo-operations).
    289 
    290 ### .printfid
    291 
    292 Syntax: .printfid RESID
    293 
    294 This pseudo-operation must be inside kernel configuration. Set printfid.
    295 
    296 ### .privateid
    297 
    298 Syntax: .privateid RESID
    299 
    300 This pseudo-operation must be inside kernel configuration. Set privateid.
    301 
    302 ### .proginfo
    303 
    304 This pseudo-operation must be inside kernel.
    305 Open ATI_PROGINFO CAL note. Next occurrence in this same kernel, add new CAL note.
    306 
    307 ### .sampler
    308 
    309 Syntax: .sampler INPUT, SAMPLER 
    310 Syntax: .sampler RESID,....
    311 
    312 If this pseudo-operation is in ATI_SAMPLER CAL note, then it adds sampler entry.
     197add new CAL note.</p>
     198<h3>.pgmrsrc2</h3>
     199<p>Syntax: .pgmrsrc2 VALUE</p>
     200<p>This pseudo-operation must be inside kernel configuration. Set PGMRSRC2 value (except bits
     201that can be set by using other pseudo-operations).</p>
     202<h3>.printfid</h3>
     203<p>Syntax: .printfid RESID</p>
     204<p>This pseudo-operation must be inside kernel configuration. Set printfid.</p>
     205<h3>.privateid</h3>
     206<p>Syntax: .privateid RESID</p>
     207<p>This pseudo-operation must be inside kernel configuration. Set privateid.</p>
     208<h3>.proginfo</h3>
     209<p>This pseudo-operation must be inside kernel.
     210Open ATI_PROGINFO CAL note. Next occurrence in this same kernel, add new CAL note.</p>
     211<h3>.sampler</h3>
     212<p>Syntax: .sampler INPUT, SAMPLER<br />
     213Syntax: .sampler RESID,....</p>
     214<p>If this pseudo-operation is in ATI_SAMPLER CAL note, then it adds sampler entry.
    313215If this  pseudo-operation is in kernel configuration, then it adds samplers with specified
    314 resource ids.
    315 
    316 ### .scratchbuffer
    317 
    318 Syntax: .scratchbuffer SIZE
    319 
    320 This pseudo-operation must be inside kernel configuration.
    321 Set scratchbuffer size.
    322 
    323 ### .scratchbuffers
    324 
    325 This pseudo-operation must be inside kernel.
    326 Open ATI_SCRATCH_BUFFERS CAL note. Next occurrence in this same kernel, add new CAL note.
    327 
    328 ### .segment
    329 
    330 Syntax: .segment OFFSET, SIZE
    331 
    332 This pseudo-operation must be in ATI_BOOL32CONSTS, ATI_INT32CONSTS or
    333 ATI_FLOAT32CONSTS CAL note. Add entry into CAL note.
    334 
    335 ### .sgprsnum
    336 
    337 Syntax: .sgprsnum REGNUM
    338 
    339 This pseudo-op must be inside kernel configuration. Set number of scalar
    340 registers which can be used during kernel execution.
    341 
    342 ### .subconstantbuffers
    343 
    344 This pseudo-operation must be inside kernel.
     216resource ids.</p>
     217<h3>.scratchbuffer</h3>
     218<p>Syntax: .scratchbuffer SIZE</p>
     219<p>This pseudo-operation must be inside kernel configuration.
     220Set scratchbuffer size.</p>
     221<h3>.scratchbuffers</h3>
     222<p>This pseudo-operation must be inside kernel.
     223Open ATI_SCRATCH_BUFFERS CAL note. Next occurrence in this same kernel, add new CAL note.</p>
     224<h3>.segment</h3>
     225<p>Syntax: .segment OFFSET, SIZE</p>
     226<p>This pseudo-operation must be in ATI_BOOL32CONSTS, ATI_INT32CONSTS or
     227ATI_FLOAT32CONSTS CAL note. Add entry into CAL note.</p>
     228<h3>.sgprsnum</h3>
     229<p>Syntax: .sgprsnum REGNUM</p>
     230<p>This pseudo-op must be inside kernel configuration. Set number of scalar
     231registers which can be used during kernel execution.</p>
     232<h3>.subconstantbuffers</h3>
     233<p>This pseudo-operation must be inside kernel.
    345234Open ATI_SUB_CONSTANT_BUFFERS CAL note. Next occurrence in this same kernel,
    346 add new CAL note.
    347 
    348 ### .tgsize
    349 
    350 This pseudo-op must be inside kernel configuration.
    351 Enable usage of the TG_SIZE_EN.
    352 
    353 ### .uav
    354 
    355 This pseudo-operation must be inside kernel.
     235add new CAL note.</p>
     236<h3>.tgsize</h3>
     237<p>This pseudo-op must be inside kernel configuration.
     238Enable usage of the TG_SIZE_EN.</p>
     239<h3>.uav</h3>
     240<p>This pseudo-operation must be inside kernel.
    356241Open ATI_UAV CAL note. Next occurrence in this same kernel,
    357 add new CAL note.
    358 
    359 ### .uavid
    360 
    361 Syntax: .uavid UAVID
    362 
    363 This pseudo-op must be inside kernel configuration. Set UAVId value.
    364 
    365 ### .uavmailboxsize
    366 
    367 Syntax: .uavmailboxsize [VALUE]
    368 
    369 This pseudo-operation must be inside kernel.
     242add new CAL note.</p>
     243<h3>.uavid</h3>
     244<p>Syntax: .uavid UAVID</p>
     245<p>This pseudo-op must be inside kernel configuration. Set UAVId value.</p>
     246<h3>.uavmailboxsize</h3>
     247<p>Syntax: .uavmailboxsize [VALUE]</p>
     248<p>This pseudo-operation must be inside kernel.
    370249Open ATI_UAV_MAILBOX_SIZE CAL note. Next occurrence in this same kernel,
    371 add new CAL note. If first argument is given, then 32-bit value will be added to content.
    372 
    373 ### .uavopmask
    374 
    375 Syntax: .uavopmask [VALUE]
    376 
    377 This pseudo-operation must be inside kernel.
     250add new CAL note. If first argument is given, then 32-bit value will be added to content.</p>
     251<h3>.uavopmask</h3>
     252<p>Syntax: .uavopmask [VALUE]</p>
     253<p>This pseudo-operation must be inside kernel.
    378254Open ATI_UAV_OP_MASK CAL note. Next occurrence in this same kernel,
    379 add new CAL note. If first argument is given, then 32-bit value will be added to content.
    380 
    381 ### .uavprivate
    382 
    383 Syntax: .uavprivate VALUE
    384 
    385 This pseudo-op must be inside kernel configuration. Set uav private value.
    386 
    387 ### .useconstdata
    388 
    389 Enable use of the const data.
    390 
    391 ### .useprintf
    392 
    393 Enable use of the printf mechanism.
    394 
    395 ### .userdata
    396 
    397 Syntax: .userdata DATACLASS, APISLOT, REGSTART, REGSIZE
    398 
    399 This pseudo-op must be inside kernel configuration. Add USERDATA entry. First argument is
    400 data class. It can be one of the following:
    401 
    402 * IMM_RESOURCE
    403 * IMM_SAMPLER
    404 * IMM_CONST_BUFFER
    405 * IMM_VERTEX_BUFFER
    406 * IMM_UAV
    407 * IMM_ALU_FLOAT_CONST
    408 * IMM_ALU_BOOL32_CONST
    409 * IMM_GDS_COUNTER_RANGE
    410 * IMM_GDS_MEMORY_RANGE
    411 * IMM_GWS_BASE
    412 * IMM_WORK_ITEM_RANGE
    413 * IMM_WORK_GROUP_RANGE
    414 * IMM_DISPATCH_ID
    415 * IMM_SCRATCH_BUFFER
    416 * IMM_HEAP_BUFFER
    417 * IMM_KERNEL_ARG
    418 * SUB_PTR_FETCH_SHADER
    419 * PTR_RESOURCE_TABLE
    420 * PTR_INTERNAL_RESOURCE_TABLE
    421 * PTR_SAMPLER_TABLE
    422 * PTR_CONST_BUFFER_TABLE
    423 * PTR_VERTEX_BUFFER_TABLE
    424 * PTR_SO_BUFFER_TABLE
    425 * PTR_UAV_TABLE
    426 * PTR_INTERNAL_GLOBAL_TABLE
    427 * PTR_EXTENDED_USER_DATA
    428 * PTR_INDIRECT_RESOURCE
    429 * PTR_INDIRECT_INTERNAL_RESOURCE
    430 * PTR_INDIRECT_UAV
    431 * IMM_CONTEXT_BASE
    432 * IMM_LDS_ESGS_SIZE
    433 * IMM_GLOBAL_OFFSET
    434 * IMM_GENERIC_USER_DAT
    435 
    436 Second argument is apiSlot.
     255add new CAL note. If first argument is given, then 32-bit value will be added to content.</p>
     256<h3>.uavprivate</h3>
     257<p>Syntax: .uavprivate VALUE</p>
     258<p>This pseudo-op must be inside kernel configuration. Set uav private value.</p>
     259<h3>.useconstdata</h3>
     260<p>Enable use of the const data.</p>
     261<h3>.useprintf</h3>
     262<p>Enable use of the printf mechanism.</p>
     263<h3>.userdata</h3>
     264<p>Syntax: .userdata DATACLASS, APISLOT, REGSTART, REGSIZE</p>
     265<p>This pseudo-op must be inside kernel configuration. Add USERDATA entry. First argument is
     266data class. It can be one of the following:</p>
     267<ul>
     268<li>IMM_RESOURCE</li>
     269<li>IMM_SAMPLER</li>
     270<li>IMM_CONST_BUFFER</li>
     271<li>IMM_VERTEX_BUFFER</li>
     272<li>IMM_UAV</li>
     273<li>IMM_ALU_FLOAT_CONST</li>
     274<li>IMM_ALU_BOOL32_CONST</li>
     275<li>IMM_GDS_COUNTER_RANGE</li>
     276<li>IMM_GDS_MEMORY_RANGE</li>
     277<li>IMM_GWS_BASE</li>
     278<li>IMM_WORK_ITEM_RANGE</li>
     279<li>IMM_WORK_GROUP_RANGE</li>
     280<li>IMM_DISPATCH_ID</li>
     281<li>IMM_SCRATCH_BUFFER</li>
     282<li>IMM_HEAP_BUFFER</li>
     283<li>IMM_KERNEL_ARG</li>
     284<li>SUB_PTR_FETCH_SHADER</li>
     285<li>PTR_RESOURCE_TABLE</li>
     286<li>PTR_INTERNAL_RESOURCE_TABLE</li>
     287<li>PTR_SAMPLER_TABLE</li>
     288<li>PTR_CONST_BUFFER_TABLE</li>
     289<li>PTR_VERTEX_BUFFER_TABLE</li>
     290<li>PTR_SO_BUFFER_TABLE</li>
     291<li>PTR_UAV_TABLE</li>
     292<li>PTR_INTERNAL_GLOBAL_TABLE</li>
     293<li>PTR_EXTENDED_USER_DATA</li>
     294<li>PTR_INDIRECT_RESOURCE</li>
     295<li>PTR_INDIRECT_INTERNAL_RESOURCE</li>
     296<li>PTR_INDIRECT_UAV</li>
     297<li>IMM_CONTEXT_BASE</li>
     298<li>IMM_LDS_ESGS_SIZE</li>
     299<li>IMM_GLOBAL_OFFSET</li>
     300<li>IMM_GENERIC_USER_DAT</li>
     301</ul>
     302<p>Second argument is apiSlot.
    437303Third argument determines the first scalar register which will hold userdata.
    438 Fourth argument determines how many scalar registers are needed to hold userdata.
    439 
    440 ### .vgprsnum
    441 
    442 Syntax: .vgprsnum REGNUM
    443 
    444 This pseudo-op must be inside kernel configuration. Set number of vector
    445 registers which can be used during kernel execution.
    446 
    447 ## Sample code
    448 
    449 This is a sample of the kernel setup:
    450 
    451 ```
    452 /* Disassembling 'DCT_15_5.1' */
     304Fourth argument determines how many scalar registers are needed to hold userdata.</p>
     305<h3>.vgprsnum</h3>
     306<p>Syntax: .vgprsnum REGNUM</p>
     307<p>This pseudo-op must be inside kernel configuration. Set number of vector
     308registers which can be used during kernel execution.</p>
     309<h2>Sample code</h2>
     310<p>This is a sample of the kernel setup:</p>
     311<p><code>/* Disassembling 'DCT_15_5.1' */
    453312.amd
    454313.gpu Pitcairn
     
    582441/*befc03ff 00008000*/ s_mov_b32       m0, 0x8000
    583442...
    584 /*bf810000         */ s_endpgm
    585 ```
    586 
    587 with kernel configuration:
    588 
    589 ```
    590 .amd
     443/*bf810000         */ s_endpgm</code></p>
     444<p>with kernel configuration:</p>
     445<p><code>.amd
    591446.gpu Pitcairn
    592447.32bit
     
    607462/*befc03ff 00008000*/ s_mov_b32       m0, 0x8000
    608463...
    609 /*bf810000         */ s_endpgm
    610 ```
    611 }}}
     464/*bf810000         */ s_endpgm</code></p>}}}
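
The page above gives two numeric rules that are easy to get wrong by hand: the `.driver_version` value has the form MajorVersion*100+MinorVersion, and RESID for `.arg` must fall in a range that depends on the argument kind. The Python sketch below restates both rules. It is not part of CLRX; the function names, the dictionary keys and the 0-99 bound on the minor version are assumptions made for illustration only.

```python
# Hypothetical helpers for preparing values used in CLRX assembler sources.
# None of these names exist in CLRadeonExtender itself.

def encode_driver_version(major: int, minor: int) -> int:
    """Return MajorVersion*100+MinorVersion, the form `.driver_version` takes."""
    # Assumption: the minor version fits in two decimal digits.
    if major < 0 or not 0 <= minor <= 99:
        raise ValueError("major must be >= 0 and minor must be in 0..99")
    return major * 100 + minor


# Inclusive RESID ranges, copied from the `.arg` description on this page.
# The key names are illustrative.
RESID_RANGES = {
    "global_pointer": (8, 1023),
    "constant_pointer": (8, 1023),
    "constant_pointer_old_driver": (1, 159),  # driver older than 1348.X
    "read_only_image": (0, 127),
    "write_only_image": (0, 7),
    "counter32": (0, 7),
}


def resid_in_range(kind: str, resid: int) -> bool:
    """Check a resource id against the documented range for its argument kind."""
    low, high = RESID_RANGES[kind]
    return low <= resid <= high
```

For example, `encode_driver_version(1348, 5)` gives 134805, and `resid_in_range("write_only_image", 8)` is false because write-only images accept only 0-7.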