Changes between Version 21 and Version 22 of ClrxAsmAmdCl2


Timestamp:
09/06/17 20:00:31 (7 years ago)
Author:
trac
Comment:

--

  • ClrxAsmAmdCl2

    v21 v22  
    6565Syntax for constant pointer: .arg ARGNAME[, "ARGTYPENAME"],
    6666ARGTYPE[[, STRUCTSIZE], PTRSPACE[, [ACCESS] [, [CONSTSIZE] [, unused]]]]</p>
    67 <p>Adds kernel argument definition. Must be inside kernel configuration. First argument is
     67<p>Adds kernel argument definition. Must be inside any kernel configuration. First argument is
    6868argument name from OpenCL kernel definition. Next optional argument is argument type name
    6969from OpenCL kernel definition. Next argument is argument type:</p>
     
    108108<p>Syntax: .bssdata [align=ALIGNMENT]</p>
    109109<p>Go to global data bss section. Optional argument sets alignment of section.</p>
     110<h3>.call_convention</h3>
     111<p>Syntax: .call_convention CALL_CONV</p>
     112<p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>).
     113Set call convention for kernel.</p>
     114<h3>.codeversion</h3>
     115<p>Syntax: .codeversion MAJOR, MINOR</p>
     116<p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>).
     117Set AMD code version.</p>
     118<h3>.compile_options</h3>
     119<p>Syntax: .compile_options "STRING"</p>
     120<p>Set compile options for this binary.</p>
    110121<h3>.config</h3>
    111122<p>Open kernel configuration. Must be inside kernel. Kernel configuration can not be
     
    137148<li>.vgprsnum</li>
    138149</ul>
     150<h3>.control_directive</h3>
     151<p>Open the control directive section. This section must be exactly 128 bytes long. The content of this
     152section is stored in the <code>control_directive</code> field of the kernel configuration.
     153Must be defined inside a kernel.</p>
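<p>A minimal sketch of a control directive section. It assumes the generic <code>.fill</code> pseudo-op is used to emit the required 128 bytes; the kernel name is hypothetical and zero-filling is only one possible content:</p>

```asm
.kernel sample_kernel           # hypothetical kernel name
    .control_directive          # open the control directive section
        .fill 128, 1, 0x00      # 128 one-byte zeros, matching the required 128-byte size
```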
    139154<h3>.cws</h3>
    140155<p>Syntax: .cws SIZEHINT[, SIZEHINT[, SIZEHINT]]</p>
    141 <p>This pseudo-operation must be inside kernel configuration.
     156<p>This pseudo-operation must be inside any kernel configuration.
    142157Set reqd_work_group_size hint for this kernel.</p>
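<p>A short sketch of a work-group size hint, assuming a one-dimensional kernel; the kernel name and the sizes are illustrative only:</p>

```asm
.kernel sample_kernel       # hypothetical kernel name
    .config
        .dims x             # only the x dimension is used
        .cws 64, 1, 1       # reqd_work_group_size hint: 64 x 1 x 1
```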
     158<h3>.debug_private_segment_buffer_sgpr</h3>
     159<p>Syntax: .debug_private_segment_buffer_sgpr SGPRREG</p>
     160<p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). Set
     161<code>debug_private_segment_buffer_sgpr</code> field in kernel configuration.</p>
     162<h3>.debug_wavefront_private_segment_offset_sgpr</h3>
     163<p>Syntax: .debug_wavefront_private_segment_offset_sgpr SGPRREG</p>
     164<p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). Set
     165<code>debug_wavefront_private_segment_offset_sgpr</code> field in kernel configuration.</p>
    143166<h3>.debugmode</h3>
    144 <p>This pseudo-operation must be inside kernel configuration.
     167<p>This pseudo-operation must be inside any kernel configuration.
    145168Enable usage of the DEBUG_MODE.</p>
    146169<h3>.dims</h3>
    147170<p>Syntax: .dims DIMENSIONS</p>
    148 <p>This pseudo-operation must be inside kernel configuration. Defines what dimensions
     171<p>This pseudo-operation must be inside any kernel configuration. Defines what dimensions
    149172(from list: x, y, z) will be used to determine space of the kernel execution.</p>
    150173<h3>.driver_version</h3>
     
    153176This pseudo-op replaces driver info.</p>
    154177<h3>.dx10clamp</h3>
    155 <p>This pseudo-operation must be inside kernel configuration.
     178<p>This pseudo-operation must be inside any kernel configuration.
    156179Enable usage of the DX10_CLAMP.</p>
    157180<h3>.exceptions</h3>
    158181<p>Syntax: .exceptions EXCPMASK</p>
    159 <p>This pseudo-operation must be inside kernel configuration.
     182<p>This pseudo-operation must be inside any kernel configuration.
    160183Set exception mask in PGMRSRC2 register value. Value should be 7-bit.</p>
     184<h3>.gds_segment_size</h3>
     185<p>Syntax: .gds_segment_size SIZE</p>
     186<p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). Set
     187<code>gds_segment_size</code> field in kernel configuration.</p>
    161188<h3>.gdssize</h3>
    162189<p>Syntax: .gdssize SIZE</p>
    163 <p>This pseudo-operation must be inside kernel configuration. Set the GDS
     190<p>This pseudo-operation must be inside any kernel configuration. Set the GDS
    164191(global data share) size.</p>
    165192<h3>.get_driver_version</h3>
     
    168195<h3>.globaldata</h3>
    169196<p>Go to constant global data section.</p>
     197<h3>.group_segment_align</h3>
     198<p>Syntax: .group_segment_align ALIGN</p>
     199<p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). Set
     200<code>group_segment_align</code> field in kernel configuration.</p>
     201<h3>.hsaconfig</h3>
     202<p>Open kernel HSA configuration. Must be inside kernel. Kernel configuration can not be
     203defined if any isametadata, metadata or stub was defined. Do not mix with <code>.config</code>.</p>
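<p>A minimal sketch of an HSA configuration that uses only pseudo-ops described in this list. The kernel name and all values are illustrative assumptions, not recommended settings:</p>

```asm
.kernel sample_kernel               # hypothetical kernel name
    .hsaconfig                      # open HSA configuration (no .config in this kernel)
        .dims x                     # use only the x dimension
        .codeversion 1, 1           # MAJOR, MINOR (illustrative values)
        .use_kernarg_segment_ptr    # enables enable_sgpr_kernarg_segment_ptr
        .use_ptr64                  # enables is_ptr64
        .kernarg_segment_align 16   # must be a power of two
        .wavefront_size 64          # must be a power of two
```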
    170204<h3>.ieeemode</h3>
    171 <p>This pseudo-op must be inside kernel configuration. Set ieee-mode.</p>
     205<p>This pseudo-op must be inside any kernel configuration. Set ieee-mode.</p>
    172206<h3>.inner</h3>
    173207<p>Go to the inner binary. By default, the assembler is in the main binary.</p>
     
    175209<p>This pseudo-operation must be inside kernel. Go to ISA metadata content
    176210(only older driver binaries).</p>
     211<h3>.kernarg_segment_align</h3>
     212<p>Syntax: .kernarg_segment_align ALIGN</p>
     213<p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). Set
     214<code>kernarg_segment_alignment</code> field in kernel configuration. Value must be a power of two.</p>
     215<h3>.kernarg_segment_size</h3>
     216<p>Syntax: .kernarg_segment_size SIZE</p>
     217<p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). Set
     218<code>kernarg_segment_byte_size</code> field in kernel configuration.</p>
     219<h3>.kernel_code_entry_offset</h3>
     220<p>Syntax: .kernel_code_entry_offset OFFSET</p>
     221<p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). Set
     222<code>kernel_code_entry_byte_offset</code> field in kernel configuration. This field
     223stores the offset between the configuration and the kernel code. The default is 256.</p>
     224<h3>.kernel_code_prefetch_offset</h3>
     225<p>Syntax: .kernel_code_prefetch_offset OFFSET</p>
     226<p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). Set
     227<code>kernel_code_prefetch_byte_offset</code> field in kernel configuration.</p>
     228<h3>.kernel_code_prefetch_size</h3>
     229<p>Syntax: .kernel_code_prefetch_size SIZE</p>
     230<p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). Set
     231<code>kernel_code_prefetch_byte_size</code> field in kernel configuration.</p>
    177232<h3>.localsize</h3>
    178233<p>Syntax: .localsize SIZE</p>
    179 <p>This pseudo-operation must be inside kernel configuration. Set the initial
     234<p>This pseudo-operation must be inside any kernel configuration. Set the initial
    180235local data size.</p>
     236<h3>.machine</h3>
     237<p>Syntax: .machine KIND, MAJOR, MINOR, STEPPING</p>
     238<p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). Set
     239machine version fields in kernel configuration.</p>
     240<h3>.max_scratch_backing_memory</h3>
     241<p>Syntax: .max_scratch_backing_memory SIZE</p>
     242<p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). Set
     243<code>max_scratch_backing_memory_byte_size</code> field in kernel configuration.</p>
    181244<h3>.metadata</h3>
    182245<p>This pseudo-operation must be inside kernel. Go to metadata content.</p>
     
    187250<h3>.pgmrsrc2</h3>
    188251<p>Syntax: .pgmrsrc2 VALUE</p>
    189 <p>This pseudo-operation must be inside kernel configuration. Set PGMRSRC2 value.
     252<p>This pseudo-operation must be inside any kernel configuration. Set PGMRSRC2 value.
    190253If dimensions are set, the bits that control dimension setup will be ignored.
    191254SCRATCH_EN bit will be ignored.</p>
     
    193256<p>Syntax: .priority PRIORITY</p>
    194257<p>This pseudo-operation must be inside kernel. Defines priority (0-3).</p>
     258<h3>.private_elem_size</h3>
     259<p>Syntax: .private_elem_size ELEMSIZE</p>
     260<p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>).
     261Set <code>private_element_size</code> field in kernel configuration.
     262Must be a power of two between 2 and 16.</p>
     263<h3>.private_segment_align</h3>
     264<p>Syntax: .private_segment_align ALIGN</p>
     265<p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). Set
     266<code>private_segment_alignment</code> field in kernel configuration. Value must be a power of two.</p>
    195267<h3>.privmode</h3>
    196268<p>This pseudo-operation must be inside kernel.
    197269Enable usage of the PRIV (privileged mode).</p>
     270<h3>.reserved_sgprs</h3>
     271<p>Syntax: .reserved_sgprs FIRSTREG, LASTREG</p>
     272<p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). Set
     273<code>reserved_sgpr_first</code> and <code>reserved_sgpr_count</code> fields in kernel configuration.
     274<code>reserved_sgpr_count</code> is filled with the number of registers (LASTREG-FIRSTREG+1).</p>
     275<h3>.reserved_vgprs</h3>
     276<p>Syntax: .reserved_vgprs FIRSTREG, LASTREG</p>
     277<p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). Set
     278<code>reserved_vgpr_first</code> and <code>reserved_vgpr_count</code> fields in kernel configuration.
     279<code>reserved_vgpr_count</code> is filled with the number of registers (LASTREG-FIRSTREG+1).</p>
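<p>A small sketch of how the register ranges map to the count fields, assuming FIRSTREG and LASTREG are given as plain register indices; the kernel name and ranges are hypothetical:</p>

```asm
.kernel sample_kernel           # hypothetical kernel name
    .hsaconfig
        .reserved_sgprs 4, 7    # reserved_sgpr_first = 4, reserved_sgpr_count = 7-4+1 = 4
        .reserved_vgprs 0, 1    # reserved_vgpr_first = 0, reserved_vgpr_count = 1-0+1 = 2
```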
     280<h3>.runtime_loader_kernel_symbol</h3>
     281<p>Syntax: .runtime_loader_kernel_symbol ADDRESS</p>
     282<p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). Set
     283<code>runtime_loader_kernel_symbol</code> field in kernel configuration.</p>
    198284<h3>.rwdata</h3>
    199285<p>Go to read-write global data section.</p>
     
    210296<h3>.scratchbuffer</h3>
    211297<p>Syntax: .scratchbuffer SIZE</p>
    212 <p>This pseudo-operation must be inside kernel configuration.
     298<p>This pseudo-operation must be inside any kernel configuration.
    213299Set scratchbuffer size.</p>
    214300<h3>.setup</h3>
    215301<p>Go to kernel setup content section.</p>
    216302<h3>.setupargs</h3>
    217 <p>This pseudo-op must be inside kernel configuration. Add first kernel setup arguments.
     303<p>This pseudo-op must be inside any kernel configuration. Add first kernel setup arguments.
    218304This pseudo-op must be before any other arguments.</p>
    219305<h3>.sgprsnum</h3>
    220306<p>Syntax: .sgprsnum REGNUM</p>
    221 <p>This pseudo-op must be inside kernel configuration. Set number of scalar
     307<p>This pseudo-op must be inside any kernel configuration. Set number of scalar
    222308registers which can be used during kernel execution.</p>
    223309<h3>.stub</h3>
    224310<p>Go to kernel stub content section. Only allowed for older driver version binaries.</p>
    225311<h3>.tgsize</h3>
    226 <p>This pseudo-op must be inside kernel configuration.
     312<p>This pseudo-op must be inside any kernel configuration.
    227313Enable usage of the TG_SIZE_EN.</p>
     314<h3>.use_debug_enabled</h3>
     315<p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). Enable
     316<code>is_debug_enabled</code> field in kernel configuration.</p>
     317<h3>.use_dispatch_id</h3>
     318<p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). Enable
     319<code>enable_sgpr_dispatch_id</code> field in kernel configuration.</p>
     320<h3>.use_dispatch_ptr</h3>
     321<p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). Enable
     322<code>enable_sgpr_dispatch_ptr</code> field in kernel configuration.</p>
     323<h3>.use_dynamic_call_stack</h3>
     324<p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). Enable
     325<code>is_dynamic_call_stack</code> field in kernel configuration.</p>
     326<h3>.use_flat_scratch_init</h3>
     327<p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). Enable
     328<code>enable_sgpr_flat_scratch_init</code> field in kernel configuration.</p>
     329<h3>.use_grid_workgroup_count</h3>
     330<p>Syntax: .use_grid_workgroup_count DIMENSIONS</p>
     331<p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). Enable
     332<code>enable_sgpr_grid_workgroup_count_X</code>, <code>enable_sgpr_grid_workgroup_count_Y</code>
     333and <code>enable_sgpr_grid_workgroup_count_Z</code> fields in kernel configuration
     334for the given dimensions, respectively.</p>
     335<h3>.use_kernarg_segment_ptr</h3>
     336<p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). Enable
     337<code>enable_sgpr_kernarg_segment_ptr</code> field in kernel configuration.</p>
     338<h3>.use_ordered_append_gds</h3>
     339<p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). Enable
     340<code>enable_ordered_append_gds</code> field in kernel configuration.</p>
     341<h3>.use_private_segment_buffer</h3>
     342<p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). Enable
     343<code>enable_sgpr_private_segment_buffer</code> field in kernel configuration.</p>
     344<h3>.use_private_segment_size</h3>
     345<p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). Enable
     346<code>enable_sgpr_private_segment_size</code> field in kernel configuration.</p>
     347<h3>.use_ptr64</h3>
     348<p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>).
     349Enable <code>is_ptr64</code> field in kernel configuration.</p>
     350<h3>.use_queue_ptr</h3>
     351<p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). Enable
     352<code>enable_sgpr_queue_ptr</code> field in kernel configuration.</p>
     353<h3>.use_xnack_enabled</h3>
     354<p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). Enable
     355<code>is_xnack_enabled</code> field in kernel configuration.</p>
    228356<h3>.useargs</h3>
    229 <p>This pseudo-op must be inside kernel configuration. Indicate that kernel uses arguments.</p>
     357<p>This pseudo-op must be inside a non-HSA kernel configuration.
     358Indicate that kernel uses arguments.</p>
    230359<h3>.useenqueue</h3>
    231 <p>This pseudo-op must be inside kernel configuration. Indicate that kernel uses
    232 enqueue mechanism.</p>
     360<p>This pseudo-op must be inside a non-HSA kernel configuration.
     361Indicate that kernel uses enqueue mechanism.</p>
    233362<h3>.usegeneric</h3>
    234 <p>This pseudo-op must be inside kernel configuration. Indicate that kernel uses
    235 generic pointers mechanism (FLAT instructions).</p>
     363<p>This pseudo-op must be inside a non-HSA kernel configuration.
     364Indicate that kernel uses generic pointers mechanism (FLAT instructions).</p>
    236365<h3>.usesetup</h3>
    237 <p>This pseudo-op must be inside kernel configuration. Indicate that kernel uses
    238 setup data (global sizes, local sizes, work groups num).</p>
     366<p>This pseudo-op must be inside a non-HSA kernel configuration.
     367Indicate that kernel uses setup data (global sizes, local sizes, work groups num).</p>
     368<h3>.userdatanum</h3>
     369<p>Syntax: .userdatanum NUMBER</p>
     370<p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). Set number of
     371registers for USERDATA.</p>
    239372<h3>.vgprsnum</h3>
    240373<p>Syntax: .vgprsnum REGNUM</p>
    241 <p>This pseudo-op must be inside kernel configuration. Set number of vector
     374<p>This pseudo-op must be inside any kernel configuration. Set number of vector
    242375registers which can be used during kernel execution.</p>
     376<h3>.wavefront_sgpr_count</h3>
     377<p>Syntax: .wavefront_sgpr_count REGNUM</p>
     378<p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). Set
     379<code>wavefront_sgpr_count</code> field in kernel configuration.</p>
     380<h3>.wavefront_size</h3>
     381<p>Syntax: .wavefront_size POWEROFTWO</p>
     382<p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>).
     383Set <code>wavefront_size</code> field in kernel configuration. Value must be a power of two.</p>
     384<h3>.workgroup_fbarrier_count</h3>
     385<p>Syntax: .workgroup_fbarrier_count COUNT</p>
     386<p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). Set
     387<code>workgroup_fbarrier_count</code> field in kernel configuration.</p>
     388<h3>.workgroup_group_segment_size</h3>
     389<p>Syntax: .workgroup_group_segment_size SIZE</p>
     390<p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). Set
     391<code>workgroup_group_segment_byte_size</code> in kernel configuration.</p>
     392<h3>.workitem_private_segment_size</h3>
     393<p>Syntax: .workitem_private_segment_size SIZE</p>
     394<p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). Set
     395<code>workitem_private_segment_byte_size</code> field in kernel configuration.</p>
     396<h3>.workitem_vgpr_count</h3>
     397<p>Syntax: .workitem_vgpr_count REGNUM</p>
     398<p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). Set
     399<code>workitem_vgpr_count</code> field in kernel configuration.</p>
    243400<h2>Sample code</h2>
    244401<p>This is a sample of the kernel setup:</p>