Changes between Version 21 and Version 22 of ClrxAsmAmdCl2
- Timestamp: 09/06/17 20:00:31 (7 years ago)
ClrxAsmAmdCl2
v21 v22 65 65 Syntax for constant pointer: .arg ARGNAME[, "ARGTYPENAME"], 66 66 ARGTYPE[[, STRUCTSIZE], PTRSPACE[, [ACCESS] [, [CONSTSIZE] [, unused]]]</p> 67 <p>Adds kernel argument definition. Must be inside kernel configuration. First argument is67 <p>Adds kernel argument definition. Must be inside any kernel configuration. First argument is 68 68 argument name from OpenCL kernel definition. Next optional argument is argument type name 69 69 from OpenCL kernel definition. Next argument is argument type:</p> … … 108 108 <p>Syntax: .bssdata [align=ALIGNMENT]</p> 109 109 <p>Go to global data bss section. Optional argument sets alignment of section.</p> 110 <h3>.call_convention</h3> 111 <p>Syntax: .call_convention CALL_CONV</p> 112 <p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). 113 Set call convention for kernel.</p> 114 <h3>.codeversion</h3> 115 <p>Syntax: .codeversion MAJOR, MINOR</p> 116 <p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). 117 Set AMD code version.</p> 118 <h3>.compile_options</h3> 119 <p>Syntax: .compile_options "STRING"</p> 120 <p>Set compile options for this binary.</p> 110 121 <h3>.config</h3> 111 122 <p>Open kernel configuration. Must be inside kernel. Kernel configuration can not be … … 137 148 <li>.vgprsnum</li> 138 149 </ul> 150 <h3>.control_directive</h3> 151 <p>Open control directive section. This section must be 128 bytes. The content of this 152 section will be stored in control_directive field in kernel configuration. 153 Must be defined inside kernel.</p> 139 154 <h3>.cws</h3> 140 155 <p>Syntax: .cws SIZEHINT[, SIZEHINT[, SIZEHINT]]</p> 141 <p>This pseudo-operation must be inside kernel configuration.156 <p>This pseudo-operation must be inside any kernel configuration. 
142 157 Set reqd_work_group_size hint for this kernel.</p> 158 <h3>.debug_private_segment_buffer_sgpr</h3> 159 <p>Syntax: .debug_private_segment_buffer_sgpr SGPRREG</p> 160 <p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). Set 161 <code>debug_private_segment_buffer_sgpr</code> field in kernel configuration.</p> 162 <h3>.debug_wavefront_private_segment_offset_sgpr</h3> 163 <p>Syntax: .debug_wavefront_private_segment_offset_sgpr SGPRREG</p> 164 <p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). Set 165 <code>debug_wavefront_private_segment_offset_sgpr</code> field in kernel configuration.</p> 143 166 <h3>.debugmode</h3> 144 <p>This pseudo-operation must be inside kernel configuration.167 <p>This pseudo-operation must be inside any kernel configuration. 145 168 Enable usage of the DEBUG_MODE.</p> 146 169 <h3>.dims</h3> 147 170 <p>Syntax: .dims DIMENSIONS</p> 148 <p>This pseudo-operation must be inside kernel configuration. Defines what dimensions171 <p>This pseudo-operation must be inside any kernel configuration. Defines what dimensions 149 172 (from list: x, y, z) will be used to determine space of the kernel execution.</p> 150 173 <h3>.driver_version</h3> … … 153 176 This pseudo-op replaces driver info.</p> 154 177 <h3>.dx10clamp</h3> 155 <p>This pseudo-operation must be inside kernel configuration.178 <p>This pseudo-operation must be inside any kernel configuration. 156 179 Enable usage of the DX10_CLAMP.</p> 157 180 <h3>.exceptions</h3> 158 181 <p>Syntax: .exceptions EXCPMASK</p> 159 <p>This pseudo-operation must be inside kernel configuration.182 <p>This pseudo-operation must be inside any kernel configuration. 160 183 Set exception mask in PGMRSRC2 register value. Value should be 7-bit.</p> 184 <h3>.gds_segment_size</h3> 185 <p>Syntax: .gds_segment_size SIZE</p> 186 <p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). 
Set 187 <code>gds_segment_size</code> field in kernel configuration.</p> 161 188 <h3>.gdssize</h3> 162 189 <p>Syntax: .gdssize SIZE</p> 163 <p>This pseudo-operation must be inside kernel configuration. Set the GDS190 <p>This pseudo-operation must be inside any kernel configuration. Set the GDS 164 191 (global data share) size.</p> 165 192 <h3>.get_driver_version</h3> … … 168 195 <h3>.globaldata</h3> 169 196 <p>Go to constant global data section.</p> 197 <h3>.group_segment_align</h3> 198 <p>Syntax: .group_segment_align ALIGN</p> 199 <p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). Set 200 <code>group_segment_align</code> field in kernel configuration.</p> 201 <h3>.hsaconfig</h3> 202 <p>Open kernel HSA configuration. Must be inside kernel. Kernel configuration can not be 203 defined if any isametadata, metadata or stub was defined. Do not mix with <code>.config</code>.</p> 170 204 <h3>.ieeemode</h3> 171 <p>This pseudo-op must be inside kernel configuration. Set ieee-mode.</p>205 <p>This pseudo-op must be inside any kernel configuration. Set ieee-mode.</p> 172 206 <h3>.inner</h3> 173 207 <p>Go to inner binary place. By default assembler is in main binary.</p> … … 175 209 <p>This pseudo-operation must be inside kernel. Go to ISA metadata content 176 210 (only older driver binaries).</p> 211 <h3>.kernarg_segment_align</h3> 212 <p>Syntax: .kernarg_segment_align ALIGN</p> 213 <p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). Set 214 <code>kernarg_segment_alignment</code> field in kernel configuration. Value must be a power of two.</p> 215 <h3>.kernarg_segment_size</h3> 216 <p>Syntax: .kernarg_segment_size SIZE</p> 217 <p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). 
Set 218 <code>kernarg_segment_byte_size</code> field in kernel configuration.</p> 219 <h3>.kernel_code_entry_offset</h3> 220 <p>Syntax: .kernel_code_entry_offset OFFSET</p> 221 <p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). Set 222 <code>kernel_code_entry_byte_offset</code> field in kernel configuration. This field 223 stores the offset between the configuration and the kernel code. The default is 256.</p> 224 <h3>.kernel_code_prefetch_offset</h3> 225 <p>Syntax: .kernel_code_prefetch_offset OFFSET</p> 226 <p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). Set 227 <code>kernel_code_prefetch_byte_offset</code> field in kernel configuration.</p> 228 <h3>.kernel_code_prefetch_size</h3> 229 <p>Syntax: .kernel_code_prefetch_size OFFSET</p> 230 <p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). Set 231 <code>kernel_code_prefetch_byte_size</code> field in kernel configuration.</p> 177 232 <h3>.localsize</h3> 178 233 <p>Syntax: .localsize SIZE</p> 179 <p>This pseudo-operation must be inside kernel configuration. Set the initial234 <p>This pseudo-operation must be inside any kernel configuration. Set the initial 180 235 local data size.</p> 236 <h3>.machine</h3> 237 <p>Syntax: .machine KIND, MAJOR, MINOR, STEPPING</p> 238 <p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). Set 239 machine version fields in kernel configuration.</p> 240 <h3>.max_scratch_backing_memory</h3> 241 <p>Syntax: .max_scratch_backing_memory SIZE</p> 242 <p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). Set 243 <code>max_scratch_backing_memory_byte_size</code> field in kernel configuration.</p> 181 244 <h3>.metadata</h3> 182 245 <p>This pseudo-operation must be inside kernel. Go to metadata content.</p> … … 187 250 <h3>.pgmrsrc2</h3> 188 251 <p>Syntax: .pgmrsrc2 VALUE</p> 189 <p>This pseudo-operation must be inside kernel configuration. 
Set PGMRSRC2 value.252 <p>This pseudo-operation must be inside any kernel configuration. Set PGMRSRC2 value. 190 253 If dimensions are set, the bits that control dimension setup will be ignored. 191 254 SCRATCH_EN bit will be ignored.</p> … … 193 256 <p>Syntax: .priority PRIORITY</p> 194 257 <p>This pseudo-operation must be inside kernel. Defines priority (0-3).</p> 258 <h3>.private_elem_size</h3> 259 <p>Syntax: .private_elem_size ELEMSIZE</p> 260 <p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). 261 Set <code>private_element_size</code> field in kernel configuration. 262 Must be a power of two between 2 and 16.</p> 263 <h3>.private_segment_align</h3> 264 <p>Syntax: .private_segment_align ALIGN</p> 265 <p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). Set 266 <code>private_segment_alignment</code> field in kernel configuration. Value must be a power of two.</p> 195 267 <h3>.privmode</h3> 196 268 <p>This pseudo-operation must be inside kernel. 197 269 Enable usage of the PRIV (privileged mode).</p> 270 <h3>.reserved_sgprs</h3> 271 <p>Syntax: .reserved_sgprs FIRSTREG, LASTREG</p> 272 <p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). Set 273 <code>reserved_sgpr_first</code> and <code>reserved_sgpr_count</code> fields in kernel configuration. 274 <code>reserved_sgpr_count</code> filled by number of registers (LASTREG-FIRSTREG+1).</p> 275 <h3>.reserved_vgprs</h3> 276 <p>Syntax: .reserved_vgprs FIRSTREG, LASTREG</p> 277 <p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). Set 278 <code>reserved_vgpr_first</code> and <code>reserved_vgpr_count</code> fields in kernel configuration. 
279 <code>reserved_vgpr_count</code> filled by number of registers (LASTREG-FIRSTREG+1).</p> 280 <h3>.runtime_loader_kernel_symbol</h3> 281 <p>Syntax: .runtime_loader_kernel_symbol ADDRESS</p> 282 <p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). Set 283 <code>runtime_loader_kernel_symbol</code> field in kernel configuration.</p> 198 284 <h3>.rwdata</h3> 199 285 <p>Go to read-write global data section.</p> … … 210 296 <h3>.scratchbuffer</h3> 211 297 <p>Syntax: .scratchbuffer SIZE</p> 212 <p>This pseudo-operation must be inside kernel configuration.298 <p>This pseudo-operation must be inside any kernel configuration. 213 299 Set scratchbuffer size.</p> 214 300 <h3>.setup</h3> 215 301 <p>Go to kernel setup content section.</p> 216 302 <h3>.setupargs</h3> 217 <p>This pseudo-op must be inside kernel configuration. Add first kernel setup arguments.303 <p>This pseudo-op must be inside any kernel configuration. Add first kernel setup arguments. 218 304 This pseudo-op must be before any other arguments.</p> 219 305 <h3>.sgprsnum</h3> 220 306 <p>Syntax: .sgprsnum REGNUM</p> 221 <p>This pseudo-op must be inside kernel configuration. Set number of scalar307 <p>This pseudo-op must be inside any kernel configuration. Set number of scalar 222 308 registers which can be used during kernel execution.</p> 223 309 <h3>.stub</h3> 224 310 <p>Go to kernel stub content section. Only allowed for older driver version binaries.</p> 225 311 <h3>.tgsize</h3> 226 <p>This pseudo-op must be inside kernel configuration.312 <p>This pseudo-op must be inside any kernel configuration. 227 313 Enable usage of the TG_SIZE_EN.</p> 314 <h3>.use_debug_enabled</h3> 315 <p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). Enable 316 <code>is_debug_enabled</code> field in kernel configuration.</p> 317 <h3>.use_dispatch_id</h3> 318 <p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). 
Enable 319 <code>enable_sgpr_dispatch_id</code> field in kernel configuration.</p> 320 <h3>.use_dispatch_ptr</h3> 321 <p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). Enable 322 <code>enable_sgpr_dispatch_ptr</code> field in kernel configuration.</p> 323 <h3>.use_dynamic_call_stack</h3> 324 <p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). Enable 325 <code>is_dynamic_call_stack</code> field in kernel configuration.</p> 326 <h3>.use_flat_scratch_init</h3> 327 <p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). Enable 328 <code>enable_sgpr_flat_scratch_init</code> field in kernel configuration.</p> 329 <h3>.use_grid_workgroup_count</h3> 330 <p>Syntax: .use_grid_workgroup_count DIMENSIONS</p> 331 <p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). Enable 332 <code>enable_sgpr_grid_workgroup_count_X</code>, <code>enable_sgpr_grid_workgroup_count_Y</code> 333 and <code>enable_sgpr_grid_workgroup_count_Z</code> fields in kernel configuration, 334 respectively by given dimensions.</p> 335 <h3>.use_kernarg_segment_ptr</h3> 336 <p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). Enable 337 <code>enable_sgpr_kernarg_segment_ptr</code> field in kernel configuration.</p> 338 <h3>.use_ordered_append_gds</h3> 339 <p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). Enable 340 <code>enable_ordered_append_gds</code> field in kernel configuration.</p> 341 <h3>.use_private_segment_buffer</h3> 342 <p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). Enable 343 <code>enable_sgpr_private_segment_buffer</code> field in kernel configuration.</p> 344 <h3>.use_private_segment_size</h3> 345 <p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). 
Enable 346 <code>enable_sgpr_private_segment_size</code> field in kernel configuration.</p> 347 <h3>.use_ptr64</h3> 348 <p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). 349 Enable <code>is_ptr64</code> field in kernel configuration.</p> 350 <h3>.use_queue_ptr</h3> 351 <p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). Enable 352 <code>enable_sgpr_queue_ptr</code> field in kernel configuration.</p> 353 <h3>.use_xnack_enabled</h3> 354 <p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). Enable 355 <code>is_xnack_enabled</code> field in kernel configuration.</p> 228 356 <h3>.useargs</h3> 229 <p>This pseudo-op must be inside kernel configuration. Indicate that kernel uses arguments.</p> 357 <p>This pseudo-op must be inside any kernel (non-HSA) configuration. 358 Indicate that kernel uses arguments.</p> 230 359 <h3>.useenqueue</h3> 231 <p>This pseudo-op must be inside kernel configuration. Indicate that kernel uses232 enqueue mechanism.</p>360 <p>This pseudo-op must be inside any kernel (non-HSA) configuration. 361 Indicate that kernel uses enqueue mechanism.</p> 233 362 <h3>.usegeneric</h3> 234 <p>This pseudo-op must be inside kernel configuration. Indicate that kernel uses235 generic pointers mechanism (FLAT instructions).</p>363 <p>This pseudo-op must be inside any kernel (non-HSA) configuration. 364 Indicate that kernel uses generic pointers mechanism (FLAT instructions).</p> 236 365 <h3>.usesetup</h3> 237 <p>This pseudo-op must be inside kernel configuration. Indicate that kernel uses 238 setup data (global sizes, local sizes, work groups num).</p> 366 <p>This pseudo-op must be inside any kernel (non-HSA) configuration. 367 Indicate that kernel uses setup data (global sizes, local sizes, work groups num).</p> 368 <h3>.userdatanum</h3> 369 <p>Syntax: .userdatanum NUMBER</p> 370 <p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). 
Set number of 371 registers for USERDATA.</p> 239 372 <h3>.vgprsnum</h3> 240 373 <p>Syntax: .vgprsnum REGNUM</p> 241 <p>This pseudo-op must be inside kernel configuration. Set number of vector374 <p>This pseudo-op must be inside any kernel configuration. Set number of vector 242 375 registers which can be used during kernel execution.</p> 376 <h3>.wavefront_sgpr_count</h3> 377 <p>Syntax: .wavefront_sgpr_count REGNUM</p> 378 <p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). Set 379 <code>wavefront_sgpr_count</code> field in kernel configuration.</p> 380 <h3>.wavefront_size</h3> 381 <p>Syntax: .wavefront_size POWEROFTWO</p> 382 <p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). 383 Set <code>wavefront_size</code> field in kernel configuration. Value must be a power of two.</p> 384 <h3>.workgroup_fbarrier_count</h3> 385 <p>Syntax: .workgroup_fbarrier_count COUNT</p> 386 <p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). Set 387 <code>workgroup_fbarrier_count</code> field in kernel configuration.</p> 388 <h3>.workgroup_group_segment_size</h3> 389 <p>Syntax: .workgroup_group_segment_size SIZE</p> 390 <p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). Set 391 <code>workgroup_group_segment_byte_size</code> in kernel configuration.</p> 392 <h3>.workitem_private_segment_size</h3> 393 <p>Syntax: .workitem_private_segment_size SIZE</p> 394 <p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). Set 395 <code>workitem_private_segment_byte_size</code> field in kernel configuration.</p> 396 <h3>.workitem_vgpr_count</h3> 397 <p>Syntax: .workitem_vgpr_count REGNUM</p> 398 <p>This pseudo-op must be inside kernel HSA configuration (<code>.hsaconfig</code>). Set 399 <code>workitem_vgpr_count</code> field in kernel configuration.</p> 243 400 <h2>Sample code</h2> 244 401 <p>This is sample example of the kernel setup:</p>