Changes between Version 19 and Version 20 of ClrxAsmGallium


Timestamp:
Sep 2, 2017, 10:00:25 AM (10 months ago)
Author:
trac
Comment:

--

  • ClrxAsmGallium

    v19 v20  
    2727 The VCC register is included by default.</p>
    2828<h2>List of the specific pseudo-operations</h2>
     29<h3>.arch_minor</h3>
     30<p>Syntax: .arch_minor ARCH_MINOR</p>
     31<p>Set architecture minor number. Used only if LLVM version is 4.0.0 or later.</p>
     32<h3>.arch_stepping</h3>
     33<p>Syntax: .arch_stepping ARCH_STEPPING</p>
     34<p>Set architecture stepping number. Used only if LLVM version is 4.0.0 or later.</p>
    2935<h3>.arg</h3>
    3036<p>Syntax: .arg ARGTYPE, SIZE[, TARGETSIZE[, ALIGNMENT[, NUMEXT[, SEMANTIC]]]]</p>
     
    6874<h3>.args</h3>
    6975<p>Open kernel argument configuration. Must be inside kernel.</p>
     76<h3>.call_convention</h3>
     77<p>Syntax: .call_convention CALL_CONV</p>
     78<p>This pseudo-op must be inside kernel configuration (<code>.config</code>) and can be used only if
     79LLVM version is 4.0.0 or later. Set call convention for kernel.</p>
     80<h3>.codeversion</h3>
     81<p>Syntax: .codeversion MAJOR, MINOR</p>
     82<p>This pseudo-op must be inside kernel configuration (<code>.config</code>) and can be used only if
     83LLVM version is 4.0.0 or later. Set AMD code version.</p>
    7084<h3>.config</h3>
    7185<p>Open kernel configuration. Must be inside kernel. Kernel configuration can not be
     
    97111    .dims xyz
    98112    .tgsize</code></p>
     113<h3>.control_directive</h3>
     114<p>Open control directive section. This section must be 128 bytes. The content of this
     115section will be stored in control_directive field in kernel configuration.
     116Must be defined inside kernel. Can be used only if LLVM version is 4.0.0 or later.</p>
     117<h3>.debug_private_segment_buffer_sgpr</h3>
     118<p>Syntax: .debug_private_segment_buffer_sgpr SGPRREG</p>
     119<p>This pseudo-op must be inside kernel configuration (<code>.config</code>) and can be used only if
     120LLVM version is 4.0.0 or later. Set <code>debug_private_segment_buffer_sgpr</code> field in
     121kernel configuration.</p>
     122<h3>.debug_wavefront_private_segment_offset_sgpr</h3>
     123<p>Syntax: .debug_wavefront_private_segment_offset_sgpr SGPRREG</p>
     124<p>This pseudo-op must be inside kernel configuration (<code>.config</code>) and can be used only if
     125LLVM version is 4.0.0 or later. Set <code>debug_wavefront_private_segment_offset_sgpr</code> field in
     126kernel configuration.</p>
    99127<h3>.debugmode</h3>
    100128<p>This pseudo-op must be inside kernel configuration (<code>.config</code>).
     
    125153<p>This pseudo-op must be inside kernel configuration (<code>.config</code>). Defines float-mode.
    126154Set floatmode (FP_ROUND and FP_DENORM fields of the MODE register). Default value is 0xc0.</p>
     155<h3>.gds_segment_size</h3>
     156<p>This pseudo-op must be inside kernel configuration (<code>.config</code>) and can be used only if
     157LLVM version is 4.0.0 or later. Set <code>gds_segment_size</code> field in kernel configuration.</p>
    127158<h3>.get_driver_version</h3>
    128159<p>Syntax: .get_driver_version SYMBOL</p>
     
    133164<h3>.globaldata</h3>
    134165<p>Go to constant global data section (<code>.rodata</code>).</p>
     166<h3>.group_segment_align</h3>
     167<p>This pseudo-op must be inside kernel configuration (<code>.config</code>) and can be used only if
     168LLVM version is 4.0.0 or later. Set <code>group_segment_align</code> field in kernel configuration.</p>
     169<h3>.hsa_debugmode</h3>
     170<p>This pseudo-op must be inside kernel configuration (<code>.config</code>) and can be used only if
     171LLVM version is 4.0.0 or later. Enable usage of the DEBUG_MODE in kernel HSA configuration.</p>
     172<h3>.hsa_dims</h3>
     173<p>Syntax: .hsa_dims DIMENSIONS</p>
     174<p>This pseudo-op must be inside kernel configuration (<code>.config</code>) and can be used only if
     175LLVM version is 4.0.0 or later. Defines what dimensions (from list: x, y, z) will be used
     176to determine space of the kernel execution in kernel HSA configuration.</p>
     177<h3>.hsa_dx10clamp</h3>
     178<p>This pseudo-op must be inside kernel configuration (<code>.config</code>) and can be used only if
     179LLVM version is 4.0.0 or later. Enable usage of the DX10_CLAMP in kernel HSA configuration.</p>
     180<h3>.hsa_exceptions</h3>
     181<p>Syntax: .hsa_exceptions EXCPMASK</p>
     182<p>This pseudo-op must be inside kernel configuration (<code>.config</code>) and can be used only if
     183LLVM version is 4.0.0 or later. Set exception mask in PGMRSRC2 register value in
     184kernel HSA configuration. Value should be 7-bit.</p>
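<p>The 7-bit constraint on EXCPMASK can be checked before the value is written into a source file. The following is a minimal Python sketch, not part of the assembler; the function name <code>is_valid_exception_mask</code> is hypothetical, and it assumes that "7-bit" means an unsigned value from 0 to 0x7f.</p>

```python
def is_valid_exception_mask(mask: int) -> bool:
    """Return True if mask fits in 7 bits (assumed unsigned range 0..0x7f)."""
    # Hypothetical helper: the assembler may report out-of-range values differently.
    return 0 <= mask <= 0x7F


if __name__ == "__main__":
    for candidate in (0x00, 0x7F, 0x80):
        print(hex(candidate), is_valid_exception_mask(candidate))
```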
     185<h3>.hsa_floatmode</h3>
     186<p>Syntax: .hsa_floatmode BYTE-VALUE</p>
     187<p>This pseudo-op must be inside kernel configuration (<code>.config</code>) and can be used only if
     188LLVM version is 4.0.0 or later. Defines float-mode in kernel HSA configuration.
     189Set floatmode (FP_ROUND and FP_DENORM fields of the MODE register). Default value is 0xc0.</p>
     190<h3>.hsa_ieeemode</h3>
     191<p>Syntax: .hsa_ieeemode</p>
     192<p>This pseudo-op must be inside kernel configuration (<code>.config</code>) and can be used only if
     193LLVM version is 4.0.0 or later. Set ieee-mode in kernel HSA configuration.</p>
     194<h3>.hsa_localsize</h3>
     195<p>Syntax: .hsa_localsize SIZE</p>
     196<p>This pseudo-op must be inside kernel configuration (<code>.config</code>) and can be used only if
     197LLVM version is 4.0.0 or later. Defines initial local memory size used by kernel in
     198kernel HSA configuration.</p>
     199<h3>.hsa_pgmrsrc1</h3>
     200<p>Syntax: .hsa_pgmrsrc1 VALUE</p>
     201<p>This pseudo-op must be inside kernel configuration (<code>.config</code>) and can be used only if
     202LLVM version is 4.0.0 or later. Defines value of the PGMRSRC1 in kernel HSA configuration.</p>
     203<h3>.hsa_pgmrsrc2</h3>
     204<p>Syntax: .hsa_pgmrsrc2 VALUE</p>
     205<p>This pseudo-op must be inside kernel configuration (<code>.config</code>) and can be used only if
     206LLVM version is 4.0.0 or later. Defines value of the PGMRSRC2 in kernel HSA configuration.
     207If dimensions are set, then bits that control dimension setup will be ignored.
     208SCRATCH_EN bit will be ignored.</p>
     209<h3>.hsa_priority</h3>
     210<p>Syntax: .hsa_priority PRIORITY</p>
     211<p>This pseudo-op must be inside kernel configuration (<code>.config</code>) and can be used only if
     212LLVM version is 4.0.0 or later. Defines priority (0-3) in kernel HSA configuration.</p>
     213<h3>.hsa_privmode</h3>
     214<p>This pseudo-op must be inside kernel configuration (<code>.config</code>) and can be used only if
     215LLVM version is 4.0.0 or later. Enable usage of the PRIV (privileged mode) in
     216kernel HSA configuration.</p>
     217<h3>.hsa_sgprsnum</h3>
     218<p>Syntax: .hsa_sgprsnum REGNUM</p>
     219<p>This pseudo-op must be inside kernel configuration (<code>.config</code>) and can be used only if
     220LLVM version is 4.0.0 or later. Set number of scalar registers which can be used during
     221kernel execution in kernel HSA configuration.</p>
     222<h3>.hsa_tgsize</h3>
     223<p>This pseudo-op must be inside kernel configuration (<code>.config</code>) and can be used only if
     224LLVM version is 4.0.0 or later. Enable usage of the TG_SIZE_EN in kernel HSA configuration.</p>
     225<h3>.hsa_userdatanum</h3>
     226<p>Syntax: .hsa_userdatanum NUMBER</p>
     227<p>This pseudo-op must be inside kernel configuration (<code>.config</code>) and can be used only if
     228LLVM version is 4.0.0 or later. Set number of registers for USERDATA in
     229kernel HSA configuration.</p>
     230<h3>.hsa_vgprsnum</h3>
     231<p>Syntax: .hsa_vgprsnum REGNUM</p>
     232<p>This pseudo-op must be inside kernel configuration (<code>.config</code>) and can be used only if
     233LLVM version is 4.0.0 or later. Set number of vector registers which can be used during
     234kernel execution in kernel HSA configuration.</p>
    135235<h3>.ieeemode</h3>
    136236<p>Syntax: .ieeemode</p>
     
    154254<h3>.kcodeend</h3>
    155255<p>Close <code>.kcode</code> clause. Refer to <code>.kcode</code>.</p>
     256<h3>.kernarg_segment_align</h3>
     257<p>Syntax: .kernarg_segment_align ALIGN</p>
     258<p>This pseudo-op must be inside kernel configuration (<code>.config</code>) and can be used only if
     259LLVM version is 4.0.0 or later. Set <code>kernarg_segment_alignment</code> field in
     260kernel configuration. Value must be a power of two.</p>
     261<h3>.kernarg_segment_size</h3>
     262<p>Syntax: .kernarg_segment_size SIZE</p>
     263<p>This pseudo-op must be inside kernel configuration (<code>.config</code>) and can be used only if
     264LLVM version is 4.0.0 or later. Set <code>kernarg_segment_byte_size</code> field in
     265kernel configuration.</p>
     266<h3>.kernel_code_entry_offset</h3>
     267<p>Syntax: .kernel_code_entry_offset OFFSET</p>
     268<p>This pseudo-op must be inside kernel configuration (<code>.config</code>) and can be used only if
     269LLVM version is 4.0.0 or later. Set <code>kernel_code_entry_byte_offset</code> field in
     270kernel configuration. This field stores the offset between the configuration and the kernel code.
     271Default value is 256.</p>
     272<h3>.kernel_code_prefetch_offset</h3>
     273<p>Syntax: .kernel_code_prefetch_offset OFFSET</p>
     274<p>This pseudo-op must be inside kernel configuration (<code>.config</code>) and can be used only if
     275LLVM version is 4.0.0 or later. Set <code>kernel_code_prefetch_byte_offset</code> field in kernel
     276configuration.</p>
     277<h3>.kernel_code_prefetch_size</h3>
     278<p>Syntax: .kernel_code_prefetch_size SIZE</p>
     279<p>This pseudo-op must be inside kernel configuration (<code>.config</code>) and can be used only if
     280LLVM version is 4.0.0 or later. Set <code>kernel_code_prefetch_byte_size</code> field in kernel configuration.</p>
    156281<h3>.llvm_version</h3>
    157282<p>Syntax: .llvm_version VERSION</p>
     
    162287<p>This pseudo-op must be inside kernel configuration (<code>.config</code>). Defines initial
    163288local memory size used by kernel.</p>
     289<h3>.machine</h3>
     290<p>Syntax: .machine KIND, MAJOR, MINOR, STEPPING</p>
     291<p>This pseudo-op must be inside kernel configuration (<code>.config</code>) and can be used only if
     292LLVM version is 4.0.0 or later. Set machine version fields in kernel configuration.</p>
     293<h3>.max_scratch_backing_memory</h3>
     294<p>Syntax: .max_scratch_backing_memory SIZE</p>
     295<p>This pseudo-op must be inside kernel configuration (<code>.config</code>) and can be used only if
     296LLVM version is 4.0.0 or later. Set <code>max_scratch_backing_memory_byte_size</code> field
     297in kernel configuration.</p>
    164298<h3>.pgmrsrc1</h3>
    165299<p>Syntax: .pgmrsrc1 VALUE</p>
     
    174308<p>Syntax: .priority PRIORITY</p>
    175309<p>This pseudo-op must be inside kernel configuration (<code>.config</code>). Defines priority (0-3).</p>
     310<h3>.private_elem_size</h3>
     311<p>Syntax: .private_elem_size ELEMSIZE</p>
     312<p>This pseudo-op must be inside kernel configuration (<code>.config</code>) and can be used only if
     313LLVM version is 4.0.0 or later. Set <code>private_element_size</code> field in kernel configuration.
     314Must be a power of two between 2 and 16.</p>
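<p>The constraint on ELEMSIZE (a power of two between 2 and 16) can be expressed as a short check. This is an illustrative Python sketch only; the helper name <code>is_valid_private_elem_size</code> is hypothetical and the bounds are assumed to be inclusive.</p>

```python
def is_valid_private_elem_size(value: int) -> bool:
    """Return True if value is a power of two in the range 2..16 (bounds assumed inclusive)."""
    # A positive integer is a power of two when exactly one bit is set.
    is_power_of_two = value > 0 and (value & (value - 1)) == 0
    return is_power_of_two and 2 <= value <= 16


if __name__ == "__main__":
    for candidate in (1, 2, 4, 6, 8, 16, 32):
        print(candidate, is_valid_private_elem_size(candidate))
```

<p>The same power-of-two test applies to the other pseudo-ops in this list whose value must be a power of two, without the range restriction.</p>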
     315<h3>.private_segment_align</h3>
     316<p>Syntax: .private_segment_align ALIGN</p>
     317<p>This pseudo-op must be inside kernel configuration (<code>.config</code>) and can be used only if
     318LLVM version is 4.0.0 or later. Set <code>private_segment_alignment</code> field in kernel
     319configuration. Value must be a power of two.</p>
    176320<h3>.privmode</h3>
    177321<p>This pseudo-op must be inside kernel configuration (<code>.config</code>).
     
    181325ProgInfo must contain 3 entries. ProgInfo cannot be defined if kernel config
    182326was defined (by using <code>.config</code>).</p>
     327<h3>.reserved_sgprs</h3>
     328<p>Syntax: .reserved_sgprs FIRSTREG, LASTREG</p>
     329<p>This pseudo-op must be inside kernel configuration (<code>.config</code>) and can be used only if
     330LLVM version is 4.0.0 or later. Set <code>reserved_sgpr_first</code> and <code>reserved_sgpr_count</code>
     331fields in kernel configuration. <code>reserved_sgpr_count</code> is set to the number of registers
     332(LASTREG-FIRSTREG+1).</p>
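<p>The mapping from FIRSTREG and LASTREG to the two configuration fields can be sketched as below. This is a Python illustration of the arithmetic only, not the assembler's code; the function name <code>reserved_register_fields</code> is hypothetical, and rejecting LASTREG smaller than FIRSTREG is an assumption.</p>

```python
from typing import Tuple


def reserved_register_fields(first_reg: int, last_reg: int) -> Tuple[int, int]:
    """Return (first, count) for an inclusive register range FIRSTREG..LASTREG."""
    # Assumption: a negative index or a reversed range is treated as an error.
    if first_reg < 0 or last_reg < first_reg:
        raise ValueError("expected 0 <= FIRSTREG <= LASTREG")
    # The count follows the formula LASTREG-FIRSTREG+1 given in the text.
    return first_reg, last_reg - first_reg + 1


if __name__ == "__main__":
    first, count = reserved_register_fields(4, 7)
    print(first, count)
```

<p>The range is inclusive on both ends, so a single reserved register (FIRSTREG equal to LASTREG) gives a count of 1.</p>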
     333<h3>.reserved_vgprs</h3>
     334<p>Syntax: .reserved_vgprs FIRSTREG, LASTREG</p>
     335<p>This pseudo-op must be inside kernel configuration (<code>.config</code>) and can be used only if
     336LLVM version is 4.0.0 or later. Set <code>reserved_vgpr_first</code> and <code>reserved_vgpr_count</code>
     337fields in kernel configuration. <code>reserved_vgpr_count</code> is set to the number of registers
     338(LASTREG-FIRSTREG+1).</p>
     339<h3>.runtime_loader_kernel_symbol</h3>
     340<p>Syntax: .runtime_loader_kernel_symbol ADDRESS</p>
     341<p>This pseudo-op must be inside kernel configuration (<code>.config</code>) and can be used only if
     342LLVM version is 4.0.0 or later. Set <code>runtime_loader_kernel_symbol</code> field in kernel
     343configuration.</p>
    183344<h3>.scratchbuffer</h3>
    184345<p>Syntax: .scratchbuffer SIZE</p>
     
    199360<p>This pseudo-op must be inside kernel configuration (<code>.config</code>).
    200361Enable usage of the TG_SIZE_EN. Should be set.</p>
     362<h3>.use_debug_enabled</h3>
     363<p>This pseudo-op must be inside kernel configuration (<code>.config</code>) and can be used only if
     364LLVM version is 4.0.0 or later. Enable <code>is_debug_enabled</code> field in kernel configuration.</p>
     365<h3>.use_dispatch_id</h3>
     366<p>This pseudo-op must be inside kernel configuration (<code>.config</code>) and can be used only if
     367LLVM version is 4.0.0 or later. Enable <code>enable_sgpr_dispatch_id</code> field in kernel
     368configuration.</p>
     369<h3>.use_dispatch_ptr</h3>
     370<p>This pseudo-op must be inside kernel configuration (<code>.config</code>) and can be used only if
     371LLVM version is 4.0.0 or later. Enable <code>enable_sgpr_dispatch_ptr</code> field in kernel
     372configuration.</p>
     373<h3>.use_dynamic_call_stack</h3>
     374<p>This pseudo-op must be inside kernel configuration (<code>.config</code>) and can be used only if
     375LLVM version is 4.0.0 or later. Enable <code>is_dynamic_call_stack</code> field in
     376kernel configuration.</p>
     377<h3>.use_flat_scratch_init</h3>
     378<p>This pseudo-op must be inside kernel configuration (<code>.config</code>) and can be used only if
     379LLVM version is 4.0.0 or later. Enable <code>enable_sgpr_flat_scratch_init</code> field in
     380kernel configuration.</p>
     381<h3>.use_grid_workgroup_count</h3>
     382<p>Syntax: .use_grid_workgroup_count DIMENSIONS</p>
     383<p>This pseudo-op must be inside kernel configuration (<code>.config</code>) and can be used only if
     384LLVM version is 4.0.0 or later. Enable <code>enable_sgpr_grid_workgroup_count_X</code>,
     385<code>enable_sgpr_grid_workgroup_count_Y</code> and <code>enable_sgpr_grid_workgroup_count_Z</code> fields
     386in kernel configuration, according to the given dimensions.</p>
     387<h3>.use_kernarg_segment_ptr</h3>
     388<p>This pseudo-op must be inside kernel configuration (<code>.config</code>) and can be used only if
     389LLVM version is 4.0.0 or later. Enable <code>enable_sgpr_kernarg_segment_ptr</code> field in
     390kernel configuration.</p>
     391<h3>.use_ordered_append_gds</h3>
     392<p>This pseudo-op must be inside kernel configuration (<code>.config</code>) and can be used only if
     393LLVM version is 4.0.0 or later. Enable <code>enable_ordered_append_gds</code> field in
     394kernel configuration.</p>
     395<h3>.use_private_segment_buffer</h3>
     396<p>This pseudo-op must be inside kernel configuration (<code>.config</code>) and can be used only if
     397LLVM version is 4.0.0 or later. Enable <code>enable_sgpr_private_segment_buffer</code> field in
     398kernel configuration.</p>
     399<h3>.use_private_segment_size</h3>
     400<p>This pseudo-op must be inside kernel configuration (<code>.config</code>) and can be used only if
     401LLVM version is 4.0.0 or later. Enable <code>enable_sgpr_private_segment_size</code> field in
     402kernel configuration.</p>
     403<h3>.use_ptr64</h3>
     404<p>This pseudo-op must be inside kernel configuration (<code>.config</code>) and can be used only if
     405LLVM version is 4.0.0 or later. Enable <code>is_ptr64</code> field in kernel configuration.</p>
     406<h3>.use_queue_ptr</h3>
     407<p>This pseudo-op must be inside kernel configuration (<code>.config</code>) and can be used only if
     408LLVM version is 4.0.0 or later. Enable <code>enable_sgpr_queue_ptr</code> field in
     409kernel configuration.</p>
     410<h3>.use_xnack_enabled</h3>
     411<p>This pseudo-op must be inside kernel configuration (<code>.config</code>) and can be used only if
     412LLVM version is 4.0.0 or later. Enable <code>is_xnack_enabled</code> field in kernel configuration.</p>
    201413<h3>.userdatanum</h3>
    202414<p>Syntax: .userdatanum NUMBER</p>
     
    207419<p>This pseudo-op must be inside kernel configuration (<code>.config</code>). Set number of vector
    208420registers which can be used during kernel execution.</p>
     421<h3>.wavefront_sgpr_count</h3>
     422<p>Syntax: .wavefront_sgpr_count REGNUM</p>
     423<p>This pseudo-op must be inside kernel configuration (<code>.config</code>) and can be used only if
     424LLVM version is 4.0.0 or later. Set <code>wavefront_sgpr_count</code> field in kernel configuration.</p>
     425<h3>.wavefront_size</h3>
     426<p>Syntax: .wavefront_size POWEROFTWO</p>
     427<p>This pseudo-op must be inside kernel configuration (<code>.config</code>) and can be used only if
     428LLVM version is 4.0.0 or later. Set <code>wavefront_size</code> field in kernel configuration.
     429Value must be a power of two.</p>
     430<h3>.workgroup_fbarrier_count</h3>
     431<p>Syntax: .workgroup_fbarrier_count COUNT</p>
     432<p>This pseudo-op must be inside kernel configuration (<code>.config</code>) and can be used only if
     433LLVM version is 4.0.0 or later. Set <code>workgroup_fbarrier_count</code> field in
     434kernel configuration.</p>
     435<h3>.workgroup_group_segment_size</h3>
     436<p>Syntax: .workgroup_group_segment_size SIZE</p>
     437<p>This pseudo-op must be inside kernel configuration (<code>.config</code>) and can be used only if
     438LLVM version is 4.0.0 or later. Set <code>workgroup_group_segment_byte_size</code> field in
     439kernel configuration.</p>
     440<h3>.workitem_private_segment_size</h3>
     441<p>Syntax: .workitem_private_segment_size SIZE</p>
     442<p>This pseudo-op must be inside kernel configuration (<code>.config</code>) and can be used only if
     443LLVM version is 4.0.0 or later. Set <code>workitem_private_segment_byte_size</code> field in
     444kernel configuration.</p>
     445<h3>.workitem_vgpr_count</h3>
     446<p>Syntax: .workitem_vgpr_count REGNUM</p>
     447<p>This pseudo-op must be inside kernel configuration (<code>.config</code>) and can be used only if
     448LLVM version is 4.0.0 or later. Set <code>workitem_vgpr_count</code> field in kernel configuration.</p>
    209449<h2>Sample code</h2>
    210450<p>This is a sample kernel setup:</p>