Changes between Version 41 and Version 42 of ClrxAsmAmdCl2


Timestamp:
06/29/18 19:00:45 (6 years ago)
Author:
trac
Comment:

--

  • ClrxAsmAmdCl2

    v41 v42  
    227227<p>Open kernel HSA configuration. Must be inside kernel. Kernel configuration cannot be
    228228defined if any isametadata, metadata or stub was defined. Do not mix with <code>.config</code>.</p>
     229<h3>.hsalayout</h3>
     230<p>This pseudo-op enables HSA layout mode (a source code layout similar to the Gallium binary format
     231layout or the ROCm layout), in which the code of all kernels is in a single main code section,
     232kernels are aligned, and kernel setup is skipped in the code section.</p>
    229233<h3>.ieeemode</h3>
    230234<p>This pseudo-op must be inside any kernel configuration. Set ieee-mode.</p>
     
    480484....
    481485/*bf810000         */ s_endpgm</code></p>
     486<p>This is a sample of two kernels with configuration in HSA layout mode:</p>
     487<p><code>.amdcl2
     488.64bit
     489.gpu Bonaire
     490.driver_version 191205
     491.hsalayout
     492.compile_options "-I ./ -cl-std=CL2.0"
     493.acl_version "AMD-COMP-LIB-v0.8 (0.0.SC_BUILD_NUMBER)"
     494.kernel DCT
     495    .config
     496        .dims xy
     497        .useargs
     498        .usesetup
     499        .setupargs
     500        .arg output,float*
     501        .arg input,float*
     502        .arg dct8x8,float*
     503        .arg dct8x8_trans,float*
     504        .arg inter,float*,local
     505        .arg width,uint
     506        .arg blockWidth,uint
     507        .arg inverse,uint
     508        .......
     509.kernel DCT2
     510    .config
     511        .dims xy
     512        .useargs
     513        .usesetup
     514        .setupargs
     515        .arg output,float*
     516        .arg input,float*
     517        .arg dct8x8,float*
     518        .arg dct8x8_trans,float*
     519        .arg inter,float*,local
     520        .arg width,uint
     521        .arg blockWidth,uint
     522        .arg inverse,uint
     523        .......
     524.text
     525DCT:
     526.skip 256   # skip over kernel setup
     527/*c0000501         */ s_load_dword    s0, s[4:5], 0x1
     528....
     529/*bf810000         */ s_endpgm
     530.p2align
     531DCT2:
     532.skip 256   # skip over kernel setup
     533/*c0000501         */ s_load_dword    s0, s[4:5], 0x1
     534....
     535/*bf810000         */ s_endpgm</code></p>
    482536}}}