Changeset 3179 in CLRX


Timestamp:
Jun 17, 2017, 9:08:48 PM (2 years ago)
Author:
matszpk
Message:

CLRadeonExtender: CLRXDocs: Describe DS_ADD_F32, DS_PERMUTE_B32 and DS_BPERMUTE_B32 (GCN 1.2).

File:
1 edited

  • CLRadeonExtender/trunk/doc/GcnInstrsDs.md

    r2411 r3179  
    194194Alphabetically sorted instruction list:
    195195
     196#### DS_ADD_F32
     197
     198Opcode: 21 (0x15) for GCN 1.2 
     199Syntax: DS_ADD_F32 ADDR, VDATA0 [OFFSET:OFFSET] 
     200Description: Add the single float value in LDS/GDS at address (ADDR+OFFSET) & ~3 to
     201VDATA0, and store the result back to LDS/GDS at the same address as a single float value.
     202The operation is atomic. 
     203Operation: 
     204```
     205FLOAT* V = (FLOAT*)(DS + ((ADDR+OFFSET)&~3))
     206*V = *V + ASFLOAT(VDATA0)  // atomic operation
     207```
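
A minimal host-side sketch of the DS_ADD_F32 data path for a single lane, assuming LDS is modelled as a little-endian byte buffer. The function name `ds_add_f32` and the `lds` buffer are hypothetical, and atomicity, denormal modes and overflow handling are not modelled:

```python
import struct

def ds_add_f32(lds: bytearray, addr: int, vdata0: int, offset: int = 0) -> None:
    """Emulate one lane of DS_ADD_F32 on a byte buffer standing in for LDS."""
    a = (addr + offset) & ~3                      # align the byte address down to a dword
    (old,) = struct.unpack_from("<f", lds, a)     # *V
    # ASFLOAT(VDATA0): reinterpret the 32-bit register pattern as a float
    (val,) = struct.unpack("<f", struct.pack("<I", vdata0 & 0xFFFFFFFF))
    # Packing with "<f" rounds the Python double sum back to single precision
    struct.pack_into("<f", lds, a, old + val)

lds = bytearray(16)
struct.pack_into("<f", lds, 4, 1.5)
ds_add_f32(lds, 6, 0x40200000, 1)   # 0x40200000 is 2.5f; (6+1) & ~3 == 4
result = struct.unpack_from("<f", lds, 4)[0]
```

The unaligned address 6 with OFFSET 1 lands on the dword at byte 4, which shows the effect of the `& ~3` mask.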
     208
    196209#### DS_ADD_RTN_U32
    197210
     
    374387VDST = *V   // scalar operation
    375388*V += BITCOUNT(EXEC)  // scalar operation
     389```
     390
     391#### DS_BPERMUTE_B32
     392
     393Opcode: 63 (0x3f) for GCN 1.2 
     394Syntax: DS_BPERMUTE_B32 DST, ADDR, SRC [OFFSET:OFFSET] 
     395Description: Backward permutation for wave. Put value of SRC from the
     396lane id calculated from `ADDR[(LANEID + (OFFSET>>2)) & 63]`
     397into the DST register of LANEID. ADDR holds the lane id multiplied by 4 (size of dword).
     398Implements pop semantics: "read data from lane i".
     399Operation: 
     400```
     401UINT32 tmp[64]
     402for (BYTE i = 0; i < 64; i++)
     403{
     404    UINT32 laneid = ADDR[(i + (OFFSET>>2)) & 63]
     405    tmp[i] = ((EXEC & (1ULL<<laneid)) != 0) ? SRC[laneid] : 0
     406}
     407for (BYTE i = 0; i < 64; i++)
     408    if ((EXEC & (1ULL<<i)) != 0)
     409        DST[i] = tmp[i]
    376410```
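
A small sketch of the pop semantics on a 64-lane wave. It follows one reading of the description: each lane's ADDR value is a byte address, OFFSET is added to it, and the sum is divided by 4 and wrapped to 64 lanes. That reading is an assumption and differs from the literal indexing in the pseudocode above. The function name `ds_bpermute_b32` and the use of `None` for unwritten lanes are hypothetical:

```python
WAVE = 64  # lanes per wave

def ds_bpermute_b32(addr, src, exec_mask, offset=0):
    """Return DST for all lanes; None marks lanes left unwritten (EXEC bit clear)."""
    tmp = [0] * WAVE
    for i in range(WAVE):
        # Assumed address decoding: byte address -> source lane id
        lane = ((addr[i] + offset) >> 2) & (WAVE - 1)
        if (exec_mask >> lane) & 1:       # inactive source lanes supply 0
            tmp[i] = src[lane]
    return [tmp[i] if (exec_mask >> i) & 1 else None for i in range(WAVE)]

# Every lane reads from lane i+1 (wrapping around at the end of the wave).
full = (1 << WAVE) - 1
src = [100 + i for i in range(WAVE)]
addr = [4 * ((i + 1) % WAVE) for i in range(WAVE)]
dst = ds_bpermute_b32(addr, src, full)
```

Here `dst[0]` holds the value of lane 1 and `dst[63]` wraps around to lane 0. Several lanes may read the same source lane, so a backward permutation can also broadcast.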
    377411
     
    13411375UINT64* V = (UINT64*)(DS + A)
    13421376*V = *V | *(UINT64*)(DS + B) // atomic operation
     1377```
     1378
     1379#### DS_PERMUTE_B32
     1380
     1381Opcode: 62 (0x3e) for GCN 1.2 
     1382Syntax: DS_PERMUTE_B32 DST, ADDR, SRC [OFFSET:OFFSET] 
     1383Description: Forward permutation for wave. Put value of SRC from LANEID into the DST register of
     1384the lane id calculated from `ADDR[(LANEID + (OFFSET>>2)) & 63]`.
     1385ADDR holds the lane id multiplied by 4 (size of dword). Implements push semantics:
     1386"put my lane data in lane i".
     1387Operation: 
     1388```
     1389UINT32 tmp[64]
     1390for (BYTE i = 0; i < 64; i++)
     1391    tmp[ADDR[(i + (OFFSET>>2)) & 63]] = ((EXEC & (1ULL<<i)) != 0) ? SRC[i] : 0
     1392for (BYTE i = 0; i < 64; i++)
     1393    if ((EXEC & (1ULL<<i)) != 0)
     1394        DST[i] = tmp[i]
    13431395```
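
A matching sketch of the push semantics, under the same assumed address decoding (byte address plus OFFSET, divided by 4, wrapped to 64 lanes). The function name `ds_permute_b32` is hypothetical. Unlike the pseudocode above, inactive lanes simply do not write, and when two lanes target the same destination the later one in loop order wins. Both choices are assumptions of this sketch, not statements about the hardware:

```python
WAVE = 64  # lanes per wave

def ds_permute_b32(addr, src, exec_mask, offset=0):
    """Return DST for all lanes; None marks lanes left unwritten (EXEC bit clear)."""
    tmp = [0] * WAVE
    for i in range(WAVE):
        if (exec_mask >> i) & 1:          # only active lanes push their value
            # Assumed address decoding: byte address -> destination lane id
            lane = ((addr[i] + offset) >> 2) & (WAVE - 1)
            tmp[lane] = src[i]
    return [tmp[i] if (exec_mask >> i) & 1 else None for i in range(WAVE)]

# Every lane pushes its value to lane i+1 (wrapping around at the end of the wave).
full = (1 << WAVE) - 1
src = [100 + i for i in range(WAVE)]
addr = [4 * ((i + 1) % WAVE) for i in range(WAVE)]
dst = ds_permute_b32(addr, src, full)
```

With the same ADDR contents as a backward permutation, the data moves in the opposite direction: lane 1 receives the value of lane 0, and lane 0 receives the value of lane 63.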
    13441396