Changes between Version 21 and Version 22 of GcnInstrsDs


Timestamp:
Jun 17, 2017, 10:00:26 PM (2 years ago)
Author:
trac
Comment:

--

  • GcnInstrsDs

    v21 v22  
    11551155<h3>Instruction set</h3>
    11561156<p>Alphabetically sorted instruction list:</p>
     1157<h4>DS_ADD_F32</h4>
     1158<p>Opcode: 21 (0x15) for GCN 1.2<br />
     1159Syntax: DS_ADD_F32 ADDR, VDATA0 [OFFSET:OFFSET]<br />
     1160Description: Add the single float value in VDATA0 to the single float value in LDS/GDS at address
     1161(ADDR+OFFSET) &amp; ~3, and store the result back to LDS/GDS at this address as a single float value.
     1162Operation is atomic.<br />
     1163Operation:<br />
     1164<code>FLOAT* V = (FLOAT*)(DS + ((ADDR+OFFSET)&amp;~3))
     1165*V = *V + ASFLOAT(VDATA0)  // atomic operation</code></p>
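The DS_ADD_F32 operation above can be sketched in Python for a single lane. This is only an illustrative model: the function name `ds_add_f32`, the `bytearray` standing in for LDS/GDS memory, and the little-endian layout are assumptions, and atomicity is not modelled.

```python
import struct

def ds_add_f32(ds: bytearray, addr: int, vdata0: int, offset: int = 0) -> None:
    """Hypothetical single-lane model of DS_ADD_F32 (not atomic)."""
    a = (addr + offset) & ~3  # align the byte address down to a dword
    (old,) = struct.unpack_from("<f", ds, a)  # current float stored in memory
    # ASFLOAT: reinterpret the 32-bit register bits as a single float
    (operand,) = struct.unpack("<f", struct.pack("<I", vdata0 & 0xFFFFFFFF))
    # pack_into rounds the sum back to single precision
    struct.pack_into("<f", ds, a, old + operand)
```

For example, with 1.5 stored at byte 4, calling `ds_add_f32(ds, 5, bits_of_2_25, offset=2)` resolves to address 4 and leaves 3.75 there, where `bits_of_2_25` is the 32-bit pattern of 2.25.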
    11571166<h4>DS_ADD_RTN_U32</h4>
    11581167<p>Opcode: 32 (0x20)<br />
     
    12851294VDST = *V   // scalar operation
    12861295*V += BITCOUNT(EXEC)  // scalar operation</code></p>
     1296<h4>DS_BPERMUTE_B32</h4>
     1297<p>Opcode: 63 (0x3f) for GCN 1.2<br />
     1298Syntax: DS_BPERMUTE_B32 DST, ADDR, SRC [OFFSET:OFFSET]<br />
     1299Description: Backward permutation for wave. Put value of SRC from
     1300lane id calculated from <code>ADDR[(LANEID + (OFFSET&gt;&gt;2)) &amp; 63]</code>
     1301to DST register in LANEID. The ADDR holds the lane id multiplied by 4 (size of dword).
     1302Realizes pop semantics: “read data from lane i”.<br />
     1303Operation:<br />
     1304<code>UINT32 tmp[64]
     1305for (BYTE i = 0; i &lt; 64; i++)
     1306{
     1307    UINT32 laneid = ADDR[(i + (OFFSET&gt;&gt;2)) &amp; 63]
     1308    tmp[i] = ((EXEC &amp; (1ULL&lt;&lt;laneid)) != 0) ? SRC[laneid] : 0
     1309}
     1310for (BYTE i = 0; i &lt; 64; i++)
     1311    if ((EXEC &amp; (1ULL&lt;&lt;i)) != 0)
     1312        DST[i] = tmp[i]</code></p>
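A small Python model may help to see the "read data from lane i" behaviour. It is a sketch under a stated assumption, not a transcription of the pseudocode above: it follows the prose reading that each lane's ADDR value is a byte address (lane id multiplied by 4), so the source lane is taken as `((ADDR[i] + OFFSET) >> 2) & 63`. The function name and the list-based registers are hypothetical.

```python
def ds_bpermute_b32(addr, src, exec_mask, dst, offset=0):
    """Hypothetical 64-lane model of a backward permute; returns the new DST."""
    out = list(dst)
    for i in range(64):
        if not (exec_mask >> i) & 1:
            continue  # inactive destination lanes keep their old DST value
        # Assumption: ADDR[i] is a byte address, i.e. lane id * 4
        lane = ((addr[i] + offset) >> 2) & 63
        # Reading from an inactive source lane yields 0
        out[i] = src[lane] if (exec_mask >> lane) & 1 else 0
    return out
```

With `addr[i] = ((i + 1) & 63) * 4` and all lanes active, every lane reads its right-hand neighbour, so lane 0 receives `src[1]` and lane 63 wraps around to `src[0]`.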
    12871313<h4>DS_CONSUME</h4>
    12881314<p>Opcode: 61 (0x3d) for GCN 1.0/1.1; 189 (0xbd) GCN 1.2<br />
     
    19822008UINT64* V = (UINT64*)(DS + A)
    19832009*V = *V | *(UINT64*)(DS + B) // atomic operation</code></p>
     2010<h4>DS_PERMUTE_B32</h4>
     2011<p>Opcode: 62 (0x3e) for GCN 1.2<br />
     2012Syntax: DS_PERMUTE_B32 DST, ADDR, SRC [OFFSET:OFFSET]<br />
     2013Description: Forward permutation for wave. Put value of SRC from LANEID to DST register in
     2014lane id calculated from <code>ADDR[(LANEID + (OFFSET&gt;&gt;2)) &amp; 63]</code>.
     2015The ADDR holds the lane id multiplied by 4 (size of dword). Realizes push semantics:
     2016"put my lane data in lane i".<br />
     2017Operation:<br />
     2018<code>UINT32 tmp[64]
     2019for (BYTE i = 0; i &lt; 64; i++)
     2020    tmp[ADDR[(i + (OFFSET&gt;&gt;2)) &amp; 63]] = ((EXEC &amp; (1ULL&lt;&lt;i)) != 0) ? SRC[i] : 0
     2021for (BYTE i = 0; i &lt; 64; i++)
     2022    if ((EXEC &amp; (1ULL&lt;&lt;i)) != 0)
     2023        DST[i] = tmp[i]</code></p>
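The "put my lane data in lane i" behaviour can be modelled the same way in Python. Again this is a sketch under assumptions rather than a definitive implementation: ADDR values are taken as byte addresses (lane id multiplied by 4, as the description says), the temporary buffer is assumed to start zeroed, and the function name and list-based registers are hypothetical. As in the pseudocode above, an inactive source lane writes 0 to its target slot.

```python
def ds_permute_b32(addr, src, exec_mask, dst, offset=0):
    """Hypothetical 64-lane model of a forward permute; returns the new DST."""
    tmp = [0] * 64  # assumption: slots that no lane writes read as 0
    for i in range(64):
        # Assumption: ADDR[i] is a byte address, i.e. lane id * 4
        target = ((addr[i] + offset) >> 2) & 63
        # If several lanes push to one target, the highest lane id wins here
        tmp[target] = src[i] if (exec_mask >> i) & 1 else 0
    # Only active lanes receive the permuted value
    return [tmp[i] if (exec_mask >> i) & 1 else dst[i] for i in range(64)]
```

With `addr[i] = ((i + 1) & 63) * 4` and all lanes active, every lane pushes its value to its right-hand neighbour, so lane 1 receives `src[0]` and lane 0 receives `src[63]`.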
    19842024<h4>DS_READ_B128</h4>
    19852025<p>Opcode: 255 (0xff) for GCN 1.1/1.2<br />