Changes between Version 17 and Version 18 of ClrxAsmRocm
Timestamp: 02/07/18 21:00:40 (6 years ago)
ClrxAsmRocm
v17 v18
 17  17  <ul>
 18  18  <li>kernel configuration</li>
 19      <li>kernel code and data (in <code>.text</code> section `)</li>
     19  <li>kernel code and data (in <code>.text</code> section)</li>
 20  20  </ul>
 21  21  <p>Order of these parts doesn't matter.</p>
 22  22  <p>Kernel function should to be aligned to 256 byte boundary.</p>
     23  <p>Additional kernel informations and binary informations are in metadata ELF note.
     24  It holds informations about <code>printf</code> calls, kernel configuration and its arguments.</p>
 23  25  <h2>Register usage setup</h2>
 24  26  <p>The CLRX assembler automatically sets number of used VGPRs and number of used SGPRs.
  …   …
 44  46  <h3>.config</h3>
 45  47  <p>Open kernel configuration. Must be inside kernel.</p>
     48  <p>The kernel metadata info config pseudo-ops:</p>
     49  <ul>
     50  <li>.arg - add kernel argument</li>
     51  <li>.md_language - kernel language</li>
     52  <li>.cws, .reqd_work_group_size - reqd_work_group_size</li>
     53  <li>.work_group_size_hint - work_group_size_hint</li>
     54  <li>.fixed_work_group_size - fixed work group size</li>
     55  <li>.max_flat_work_group_size - max flat work group size</li>
     56  <li>.vectypehint - vector type hint</li>
     57  <li>.runtime_handle - runtime handle symbol name</li>
     58  <li>.md_kernarg_segment_align - kernel argument segment alignment</li>
     59  <li>.md_kernarg_segment_size - kernel argument segment size</li>
     60  <li>.md_group_segment_fixed_size - group segment fixed size</li>
     61  <li>.md_private_segment_fixed_size - private segment fixed size</li>
     62  <li>.md_symname - kernel symbol name</li>
     63  <li>.md_sgprsnum - number of SGPRs</li>
     64  <li>.md_vgprsnum - number of VGPRs</li>
     65  <li>.spilledsgprs - number of spilled SGPRs</li>
     66  <li>.spilledvgprs - number of spilled VGPRs</li>
     67  <li>.md_wavefront_size - wavefront size</li>
     68  </ul>
 46  69  <h3>.control_directive</h3>
 47  70  <p>Open control directive section. This section must be 128 bytes. The content of this
 48  71  section will be stored in control_directive field in kernel configuration.
 49  72  Must be defined inside kernel.</p>
     73  <h3>.cws, .reqd_work_group_size</h3>
     74  <p>Syntax: .cws SIZEHINT[, SIZEHINT[, SIZEHINT]]
     75  Syntax: .reqd_work_group_size SIZEHINT[, SIZEHINT[, SIZEHINT]]</p>
     76  <p>This pseudo-operation must be inside any kernel configuration.
     77  Set reqd_work_group_size hint for this kernel in metadata info.</p>
 50  78  <h3>.debug_private_segment_buffer_sgpr</h3>
 51  79  <p>Syntax: .debug_private_segment_buffer_sgpr SGPRREG</p>
  …   …
 61  89  <h3>.dims</h3>
 62  90  <p>Syntax: .dims DIMENSIONS</p>
 63      <p>This pseudo-op must be inside kernel configuration (<code>.config</code>). Define swhat dimensions
     91  <p>This pseudo-op must be inside kernel configuration (<code>.config</code>). Define what dimensions
 64  92  (from list: x, y, z) will be used to determine space of the kernel execution.</p>
 65  93  <h3>.dx10clamp</h3>
  …   …
 73 101  <p>This pseudo-op must be inside kernel configuration (<code>.config</code>).
 74 102  Set exception mask in PGMRSRC2 register value. Value should be 7-bit.</p>
    103  <h3>.fixed_work_group_size</h3>
    104  <p>Syntax: .fixed_work_group_size SIZEHINT[, SIZEHINT[, SIZEHINT]]</p>
    105  <p>This pseudo-operation must be inside any kernel configuration.
    106  Set fixed_work_group_size for this kernel in metadata info.</p>
 75 107  <h3>.fkernel</h3>
 76 108  <p>Mark given kernel as function in ROCm. Must be inside kernel.</p>
 77 109  <h3>.floatmode</h3>
 78 110  <p>Syntax: .floatmode BYTE-VALUE</p>
 79      <p>This pseudo-op must be inside kernel configuration (<code>.config</code>). Define sfloat-mode.
    111  <p>This pseudo-op must be inside kernel configuration (<code>.config</code>). Define float-mode.
 80 112  Set floatmode (FP_ROUND and FP_DENORM fields of the MODE register). Default value is 0xc0.</p>
 81 113  <h3>.gds_segment_size</h3>
  …   …
138 170  <h3>.localsize</h3>
139 171  <p>Syntax: .localsize SIZE</p>
140      <p>This pseudo-op must be inside kernel configuration (<code>.config</code>). Define sinitial
    172  <p>This pseudo-op must be inside kernel configuration (<code>.config</code>). Define initial
141 173  local memory size used by kernel.</p>
142 174  <h3>.machine</h3>
  …   …
144 176  <p>This pseudo-op must be inside kernel configuration (<code>.config</code>). Set
145 177  machine version fields in kernel configuration.</p>
    178  <h3>.max_flat_work_group_size</h3>
    179  <p>Syntax: .max_flat_work_group_size SIZE</p>
    180  <p>This pseudo-op must be inside kernel configuration (<code>.config</code>).
    181  Set max flat work group size in metadata info.</p>
146 182  <h3>.max_scratch_backing_memory</h3>
147 183  <p>Syntax: .max_scratch_backing_memory SIZE</p>
148 184  <p>This pseudo-op must be inside kernel configuration (<code>.config</code>). Set
149 185  <code>max_scratch_backing_memory_byte_size</code> field in kernel configuration.</p>
    186  <h3>.md_group_segment_fixed_size</h3>
    187  <p>Syntax: .md_group_segment_fixed_size SIZE</p>
    188  <p>This pseudo-op must be inside kernel configuration (<code>.config</code>).
    189  Set group segment fixed size in metadata info.</p>
    190  <h3>.md_kernarg_segment_align</h3>
    191  <p>Syntax: .md_kernarg_segment_align ALIGNMENT</p>
    192  <p>This pseudo-op must be inside kernel configuration (<code>.config</code>).
    193  Set kernel argument segment alignment in metadata info.</p>
    194  <h3>.md_kernarg_segment_size</h3>
    195  <p>Syntax: .md_kernarg_segment_size SIZE</p>
    196  <p>This pseudo-op must be inside kernel configuration (<code>.config</code>).
    197  Set kernel argument segment size in metadata info.</p>
    198  <h3>.md_private_segment_fixed_size</h3>
    199  <p>Syntax: .md_private_segment_fixed_size SIZE</p>
    200  <p>This pseudo-op must be inside kernel configuration (<code>.config</code>).
    201  Set private segment fixed size in metadata info.</p>
    202  <h3>.md_symname</h3>
    203  <p>Syntax: .md_symname "SYMBOLNAME"</p>
    204  <p>This pseudo-op must be inside kernel configuration (<code>.config</code>).
    205  Set kernel symbol name in metadata info. It should be in format "NAME@kd".</p>
    206  <h3>.md_language</h3>
    207  <p>Syntax .md_language "LANGUAGE"[, MAJOR, MINOR]</p>
    208  <p>This pseudo-op must be inside kernel configuration (<code>.config</code>).
    209  Set kernel language and its version in metadata info. The language name is as string.</p>
    210  <h3>.md_sgprsnum</h3>
    211  <p>Syntax: .md_sgprsnum REGNUM</p>
    212  <p>This pseudo-op must be inside kernel configuration (<code>.config</code>).
    213  Define number of scalar registers for kernel in metadata info.</p>
    214  <h3>.md_version</h3>
    215  <p>Syntax: .md_version MAJOR, MINOR</p>
    216  <p>This pseudo-ops defines metadata format version.</p>
    217  <h3>.md_wavefront_size</h3>
    218  <p>Syntax: .md_wavefront_size SIZE</p>
    219  <p>This pseudo-op must be inside kernel configuration (<code>.config</code>).
    220  Define wavefront size in metadata info. If not specified then value get from HSA config.</p>
    221  <h3>.md_vgprsnum</h3>
    222  <p>Syntax: .md_vgprsnum REGNUM</p>
    223  <p>This pseudo-op must be inside kernel configuration (<code>.config</code>).
    224  Define number of vector registers for kernel in metadata info.</p>
    225  <h3>.metadata</h3>
    226  <p>Go to metadata (metadata ELF note) section.</p>
150 227  <h3>.newbinfmt</h3>
151 228  <p>This pseudo-ops set new binary format.</p>
  …   …
153 230  <p>Syntax: .pgmrsrc1 VALUE</p>
154 231  <p>This pseudo-op must be inside kernel configuration (<code>.config</code>).
155      Define svalue of the PGMRSRC1.</p>
    232  Define value of the PGMRSRC1.</p>
156 233  <h3>.pgmrsrc2</h3>
157 234  <p>Syntax: .pgmrsrc2 VALUE</p>
158 235  <p>This pseudo-op must be inside kernel configuration (<code>.config</code>).
159      Define svalue of the PGMRSRC2. If dimensions is set then bits that controls dimension setup
    236  Define value of the PGMRSRC2. If dimensions is set then bits that controls dimension setup
160 237  will be ignored. SCRATCH_EN bit will be ignored.</p>
    238  <h3>.printf</h3>
    239  <p>Syntax: .printf [ID][,ARGSIZE,....],"FORMAT"</p>
    240  <p>This pseudo-op must be inside kernel configuration (<code>.config</code>).
    241  Adds new printf info entry to metadata info. The first argument is ID (must be unique)
    242  and is optional. Next arguments are argument size for printf call. The last argument
    243  is format string.</p>
161 244  <h3>.priority</h3>
162 245  <p>Syntax: .priority PRIORITY</p>
163      <p>This pseudo-op must be inside kernel configuration (<code>.config</code>). Define spriority (0-3).</p>
    246  <p>This pseudo-op must be inside kernel configuration (<code>.config</code>). Define priority (0-3).</p>
164 247  <h3>.private_elem_size</h3>
165 248  <p>Syntax: .private_elem_size ELEMSIZE</p>
  …   …
183 266  <code>reserved_vgpr_first</code> and <code>reserved_vgpr_count</code> fields in kernel configuration.
184 267  <code>reserved_vgpr_count</code> filled by number of registers (LASTREG-FIRSTREG+1).</p>
    268  <h3>.runtime_handle</h3>
    269  <p>Syntax: .runtime_handle "SYMBOLNAME"</p>
    270  <p>This pseudo-op must be inside kernel configuration (<code>.config</code>).
    271  Set runtime handle in metadata info</p>
185 272  <h3>.runtime_loader_kernel_symbol</h3>
186 273  <p>Syntax: .runtime_loader_kernel_symbol ADDRESS</p>
  …   …
189 276  <h3>.scratchbuffer</h3>
190 277  <p>Syntax: .scratchbuffer SIZE</p>
191      <p>This pseudo-op must be inside kernel configuration (<code>.config</code>). Define sscratchbuffer size.</p>
    278  <p>This pseudo-op must be inside kernel configuration (<code>.config</code>). Define scratchbuffer size.</p>
192 279  <h3>.sgprsnum</h3>
193 280  <p>Syntax: .sgprsnum REGNUM</p>
  …   …
195 282  registers which can be used during kernel execution.
196 283  It counts SGPR registers including VCC, FLAT_SCRATCH and XNACK_MASK.</p>
    284  <h3>.spilledsgprs</h3>
    285  <p>Syntax: .spilledsgprs REGNUM</p>
    286  <p>This pseudo-op must be inside kernel configuration (<code>.config</code>). Set number of scalar
    287  registers to spill in scratch buffer (in metadata info).</p>
    288  <h3>.spilledvgprs</h3>
    289  <p>Syntax: .spilledvgprs REGNUM</p>
    290  <p>This pseudo-op must be inside kernel configuration (<code>.config</code>). Set number of vector
    291  registers to spill in scratch buffer (in metadata info).</p>
197 292  <h3>.target</h3>
198 293  <p>Syntax: .target "TARGET"</p>
  …   …
251 346  <p>This pseudo-op must be inside kernel configuration (<code>.config</code>). Set number of
252 347  registers for USERDATA.</p>
    348  <h3>.vectypehint</h3>
    349  <p>Syntax: .vectypehint "OPENCLTYPE"</p>
    350  <p>This pseudo-op must be inside kernel configuration (<code>.config</code>).
    351  Set vectypehint for kernel in metadata info. The argument is OpenCL type.</p>
253 352  <h3>.vgprsnum</h3>
254 353  <p>Syntax: .vgprsnum REGNUM</p>
  …   …
279 378  <p>This pseudo-op must be inside kernel configuration (<code>.config</code>). Set
280 379  <code>workitem_vgpr_count</code> field in kernel configuration.</p>
    380  <h3>.work_group_size_hint</h3>
    381  <p>Syntax: .work_group_size_hint SIZEHINT[, SIZEHINT[, SIZEHINT]]</p>
    382  <p>This pseudo-operation must be inside any kernel configuration.
    383  Set work_group_size_hint for this kernel in metadata info.</p>
281 384  <h2>Sample code</h2>
282 385  <p>This is sample example of the kernel setup:</p>
  …   …
351 454  /*32060200 */ v_add_u32 v3, vcc, s0, v1
352 455  ...</code></p>
    456  <p>The sample with metadata info:</p>
    457  <p><code>.rocm
    458  .gpu Fiji
    459  .arch_minor 0
    460  .arch_stepping 4
    461  .eflags 2
    462  .newbinfmt
    463  .tripple "amdgcn-amd-amdhsa-amdgizcl"
    464  .md_version 1, 0
    465  .kernel vectorAdd
    466  .config
    467  .dims x
    468  .codeversion 1, 1
    469  .use_private_segment_buffer
    470  .use_dispatch_ptr
    471  .use_kernarg_segment_ptr
    472  .private_elem_size 4
    473  .use_ptr64
    474  .kernarg_segment_align 16
    475  .group_segment_align 16
    476  .private_segment_align 16
    477  .control_directive
    478  .fill 128, 1, 0x00
    479  .config
    480  .md_language "OpenCL", 1, 2
    481  .arg n, "uint", 4, , value, u32
    482  .arg a, "float*", 8, , globalbuf, f32, global, default const volatile
    483  .arg b, "float*", 8, , globalbuf, f32, global, default const
    484  .arg c, "float*", 8, , globalbuf, f32, global, default
    485  .arg , "", 8, , gox, i64
    486  .arg , "", 8, , goy, i64
    487  .arg , "", 8, , goz, i64
    488  .arg , "", 8, , printfbuf, i8
    489  .text
    490  vectorAdd:
    491  .skip 256 # skip ROCm kernel configuration (required)
    492  ...</code></p>
353 493  }}}
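The metadata pseudo-ops added in v18 are each documented with a syntax line but only a few appear in the page's own sample. The fragment below is a minimal, untested sketch of how several of them might be combined inside one `.config` block. It reuses the header and the first `.arg` line of the v18 sample. The kernel name `demoKernel` is hypothetical, and every numeric value and the printf format string are illustrative assumptions, not taken from the page.

```
# Sketch only: header lines copied from the v18 sample on this page.
.rocm
.gpu Fiji
.arch_minor 0
.arch_stepping 4
.eflags 2
.newbinfmt
.tripple "amdgcn-amd-amdhsa-amdgizcl"
.md_version 1, 0
.kernel demoKernel                  # hypothetical kernel name
    .config
        .dims x
        .codeversion 1, 1
        .use_kernarg_segment_ptr
        # metadata info pseudo-ops introduced in v18
        .md_symname "demoKernel@kd"     # page says the format should be "NAME@kd"
        .md_language "OpenCL", 1, 2
        .cws 64, 1, 1                   # reqd_work_group_size (assumed value)
        .max_flat_work_group_size 64    # assumed value
        .md_sgprsnum 16                 # assumed register counts
        .md_vgprsnum 8
        .spilledsgprs 0
        .spilledvgprs 0
        .md_wavefront_size 64           # assumed value
        .printf 1, 4, "n=%u"            # ID 1, one 4-byte argument, format string
        .arg n, "uint", 4, , value, u32
.text
demoKernel:
        .skip 256       # skip ROCm kernel configuration (required)
        s_endpgm
```

According to the v18 text, `.md_wavefront_size` may be left out, in which case the value is taken from the HSA config. The page does not state defaults for the other metadata fields, so they are written out explicitly here.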