Changeset 3178 in CLRX


Timestamp:
Jun 17, 2017, 4:03:32 PM
Author:
matszpk
Message:

CLRadeonExtender: CLRXDocs: Describe V_PERM_B32 instruction (GCN 1.2).

File:
1 edited

  • CLRadeonExtender/trunk/doc/GcnInstrsVop3.md

    r3177 r3178  
    887887```
    888888
     889#### V_MAD_I16
     890
     891Opcode: 492 (0x1ec) for GCN 1.2 
     892Syntax: V_MAD_I16 VDST, SRC0, SRC1, SRC2 
     893Description: Multiply 16-bit signed value from SRC0 by 16-bit signed value from
     894SRC1 and add 16-bit signed value from SRC2, and store 16-bit signed result to VDST. 
     895Operation: 
     896```
     897VDST = (INT16)((INT16)SRC0*(INT16)SRC1 + (INT16)SRC2)
     898```
     899
    889900#### V_MAD_I32_I24
    890901
     
    928939if (ASFLOAT(SRC0)!=0.0 && ASFLOAT(SRC1)!=0.0)
    929940    VDST = ASFLOAT(SRC0) * ASFLOAT(SRC1) + ASFLOAT(SRC2)
     941```
     942
     943#### V_MAD_U16
     944
     945Opcode: 491 (0x1eb) for GCN 1.2 
     946Syntax: V_MAD_U16 VDST, SRC0, SRC1, SRC2 
     947Description: Multiply 16-bit unsigned value from SRC0 by 16-bit unsigned value from
     948SRC1 and add 16-bit unsigned value from SRC2, and store 16-bit unsigned result to VDST. 
     949Operation: 
     950```
     951VDST = ((UINT16)SRC0*(UINT16)SRC1 + (UINT16)SRC2) & 0xffff
    930952```
    931953
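The two 16-bit multiply-add operations above (V_MAD_I16 and V_MAD_U16) can be checked with a small C sketch. This is an illustration under assumptions, not CLRX code: the function names `v_mad_u16` and `v_mad_i16` are hypothetical, and the wrap-around behaviour is taken from the `(INT16)` cast and the `& 0xffff` mask in the pseudocode.

```c
#include <stdint.h>

/* Hypothetical helper: unsigned 16-bit multiply-add, result truncated to 16 bits. */
uint16_t v_mad_u16(uint16_t src0, uint16_t src1, uint16_t src2)
{
    /* Widen to unsigned 32 bits so the product cannot overflow a signed int. */
    uint32_t wide = (uint32_t)src0 * (uint32_t)src1 + (uint32_t)src2;
    return (uint16_t)(wide & 0xffff);
}

/* Hypothetical helper: signed 16-bit multiply-add, result truncated to 16 bits. */
int16_t v_mad_i16(int16_t src0, int16_t src1, int16_t src2)
{
    /* Sign-extend, then compute modulo 2^32 in unsigned arithmetic. */
    uint32_t wide = (uint32_t)(int32_t)src0 * (uint32_t)(int32_t)src1
                  + (uint32_t)(int32_t)src2;
    uint16_t low = (uint16_t)(wide & 0xffff);
    /* Reinterpret the low 16 bits as two's complement without relying on
       implementation-defined narrowing conversions. */
    return (int16_t)(low >= 0x8000 ? (int32_t)low - 0x10000 : (int32_t)low);
}
```

For example, `v_mad_u16(0xffff, 0xffff, 1)` keeps only the low 16 bits of the sum and returns 2, while `v_mad_i16(0x7fff, 2, 0)` wraps to -2.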
     
    12981320```
    12991321
     1322#### V_PERM_B32
     1323
     1324Opcode: 493 (0x1ed) for GCN 1.2 
     1325Syntax: V_PERM_B32 VDST, SRC0, SRC1, SRC2 
     1326Description: Permute bytes. Each byte of SRC2 selects the value of the corresponding
     1327byte of the result dword. Values 0-7 select the byte with that index from the quadword
     1328(64-bit value) built from SRC0 (higher bits) and SRC1 (lower bits). Values 8-11
     1329select 0xff*BIT, where BIT is the highest bit of byte 2*N+1 (N = value-8) of that quadword.
     1330Value 12 selects zero. Values equal to or greater than 13 select 0xff. 
     1331Operation: 
     1332```
     1333VDST = 0
     1334UINT64 qword = (((UINT64)SRC0)<<32) | SRC1
     1335for (int i = 0; i < 4; i++)
     1336{
     1337    BYTE choice = (SRC2 >> (8*i)) & 0xff
     1338    BYTE result
     1339    if (choice >= 13)
     1340        result = 0xff
     1341    else if (choice == 12)
     1342        result = 0
     1343    else if (choice >= 8)
     1344        result = 0xff * ((qword >> ((choice-8)*16 + 15)) & 1)
     1345    else
     1346        result = (qword >> (choice*8)) & 0xff
     1347    VDST |= (result << (i*8))
     1348}
     1349```
     1350
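The V_PERM_B32 selection rules can be exercised with a short C sketch. It is a minimal illustration, not CLRX code: the function name `v_perm_b32` is hypothetical, and for selector values 8-11 it isolates the single bit before expanding it to 0x00 or 0xff, which is how the description reads.

```c
#include <stdint.h>

/* Hypothetical emulation of V_PERM_B32: each byte of src2 selects one result byte. */
uint32_t v_perm_b32(uint32_t src0, uint32_t src1, uint32_t src2)
{
    /* SRC0 supplies the upper 32 bits, SRC1 the lower 32 bits. */
    const uint64_t qword = ((uint64_t)src0 << 32) | src1;
    uint32_t vdst = 0;

    for (int i = 0; i < 4; i++) {
        const uint8_t choice = (uint8_t)(src2 >> (8 * i));
        uint8_t result;

        if (choice >= 13)
            result = 0xff;
        else if (choice == 12)
            result = 0x00;
        else if (choice >= 8)
            /* Replicate the top bit of 16-bit word (choice-8),
               i.e. the highest bit of byte 2*(choice-8)+1. */
            result = ((qword >> ((choice - 8) * 16 + 15)) & 1) ? 0xff : 0x00;
        else
            /* Plain byte select from the 64-bit value. */
            result = (uint8_t)(qword >> (choice * 8));

        vdst |= (uint32_t)result << (i * 8);
    }
    return vdst;
}
```

With SRC0 = 0x11223344 and SRC1 = 0x55667788, the selector 0x03020100 returns SRC1 unchanged, 0x07060504 returns SRC0, and 0x00010203 reverses the bytes of SRC1 to give 0x88776655.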
    13001351#### V_QSAD_U8, V_QSAD_PK_U16_U8
    13011352