Changeset 3177 in CLRX


Timestamp:
Jun 17, 2017, 2:08:37 PM (2 years ago)
Author:
matszpk
Message:

CLRadeonExtender: CLRXDocs: Add descriptions of new instructions VOP3 (GCN 1.2).

File:
1 edited

  • CLRadeonExtender/trunk/doc/GcnInstrsVop3.md

    r3165 r3177  
    532532```
    533533
     534#### V_DIV_FIXUP_F16
     535
     536Opcode: 495 (0x1ef) for GCN 1.2 
     537Syntax: V_DIV_FIXUP_F16 VDST, SRC0, SRC1, SRC2 
     538Description: Handle all exceptions required for half floating point division.
     539SRC0 is the quotient, SRC1 is the denominator, SRC2 is the numerator. The corrected result is stored in VDST. 
     540Operation: 
     541```
     542HALF SF0 = ASHALF(SRC0)
     543HALF SF1 = ASHALF(SRC1)
     544HALF SF2 = ASHALF(SRC2)
     545if (ISNAN(SF1) && !ISNAN(SF2))
     546    VDST = QUIETNAN(SF1)
     547else if (ISNAN(SF2))
     548    VDST = QUIETNAN(SF2)
     549else if (SF1 == 0.0 && SF2 == 0.0)
     550    VDST = NAN_H
     551else if (ABS(SF1)==INF && ABS(SF2)==INF)
     552    VDST = -NAN_H
     553else if (SF1 == 0.0)
     554    VDST = INF_H*SIGN(SF1)*SIGN(SF2)
     555else if (ABS(SF1) == INF)
     556    VDST = SIGN(SF1)*SIGN(SF2) >=0 ? 0.0 : -0.0
     557else if (ISNAN(SF0))
     558    VDST = SIGN(SF1)*SIGN(SF2)*INF_H
     559else
     560    VDST = SF0
     561```
     562
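The case ordering of V_DIV_FIXUP_F16 can be checked with a small Python model that works on raw 16-bit patterns. This is a minimal sketch, not the hardware definition. It assumes that `NAN_H` is the quiet NaN pattern `0x7E00`, that `QUIETNAN` sets the top mantissa bit, and that `SIGN` depends only on the sign bit (so it is also defined for zeros). The helper names `as_half` and `div_fixup_f16` are hypothetical.

```python
import math
import struct

INF_H = 0x7C00     # +infinity in half precision
NAN_H = 0x7E00     # assumed encoding of NAN_H (quiet NaN)
QUIET_BIT = 0x0200 # top mantissa bit, assumed to be what QUIETNAN sets
SIGN_BIT = 0x8000

def as_half(bits):
    """Reinterpret a 16-bit pattern as a half value (ASHALF)."""
    return struct.unpack("<e", struct.pack("<H", bits & 0xFFFF))[0]

def div_fixup_f16(src0, src1, src2):
    """Return the 16-bit result pattern, following the case order above."""
    sf0, sf1, sf2 = as_half(src0), as_half(src1), as_half(src2)
    # Sign of SIGN(SF1)*SIGN(SF2), taken from the sign bits (assumption).
    sign = (src1 ^ src2) & SIGN_BIT
    if math.isnan(sf1) and not math.isnan(sf2):
        return (src1 | QUIET_BIT) & 0xFFFF
    if math.isnan(sf2):
        return (src2 | QUIET_BIT) & 0xFFFF
    if sf1 == 0.0 and sf2 == 0.0:
        return NAN_H                 # 0/0
    if math.isinf(sf1) and math.isinf(sf2):
        return NAN_H | SIGN_BIT      # INF/INF gives -NAN_H
    if sf1 == 0.0:
        return INF_H | sign          # x/0 gives signed infinity
    if math.isinf(sf1):
        return sign                  # x/INF gives signed zero
    if math.isnan(sf0):
        return INF_H | sign          # quotient overflowed to NaN
    return src0 & 0xFFFF             # quotient is already correct

# 1.0 / 0.0: denominator is zero, numerator is not.
print(hex(div_fixup_f16(0x3C00, 0x0000, 0x3C00)))
```

Working on bit patterns rather than Python floats keeps the NaN payload and the sign of zero visible, which a plain `float` comparison would hide.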
    534563#### V_DIV_FIXUP_F32
    535564
     
    707736```
    708737
     738#### V_FMA_F16
     739
     740Opcode: 494 (0x1ee) for GCN 1.2 
     741Syntax: V_FMA_F16 VDST, SRC0, SRC1, SRC2 
     742Description: Fused multiply-add on half floating point values from
     743SRC0, SRC1 and SRC2. The result is stored in VDST. 
     744Operation: 
     745```
     746// SRC0*SRC1+SRC2
     747VDST = FMA(ASHALF(SRC0), ASHALF(SRC1), ASHALF(SRC2))
     748```
     749
    709750#### V_FMA_F32
    710751
     
    821862if (ASFLOAT(SRC0)!=0.0 && ASFLOAT(SRC1)!=0.0)
    822863    VDST = ASFLOAT(SRC0) * ASFLOAT(SRC1) + ASFLOAT(VDST)
     864```
     865
     866#### V_MAD_F16
     867
     868Opcode: 490 (0x1ea) for GCN 1.2 
     869Syntax: V_MAD_F16 VDST, SRC0, SRC1, SRC2 
     870Description: Multiply the half FP value from SRC0 by the half FP value from
     871SRC1, add SRC2, and store the result in VDST.
     872It applies the OMOD modifier to the result and flushes denormals. 
     873Operation: 
     874```
     875VDST = ASHALF(SRC0) * ASHALF(SRC1) + ASHALF(SRC2)
    823876```
    824877
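The practical difference between V_FMA_F16 and an unfused multiply-add shows up when the product needs more precision than a half value can hold. The Python sketch below illustrates this under stated assumptions: the unfused variant is assumed to round the product to half precision before the addition (the description above does not say how V_MAD_F16 rounds internally), and neither OMOD nor denormal flushing is modelled. The sum is computed in double precision and then rounded to half, which matches a single rounding for the operands shown but may not in every corner case. The function names are hypothetical.

```python
import struct

def as_half(bits):
    """Reinterpret a 16-bit pattern as a half value (ASHALF)."""
    return struct.unpack("<e", struct.pack("<H", bits & 0xFFFF))[0]

def to_half_bits(value):
    """Round a Python float to half precision and return its bit pattern."""
    try:
        return struct.unpack("<H", struct.pack("<e", value))[0]
    except OverflowError:
        # struct raises instead of rounding to infinity on overflow.
        return 0xFC00 if value < 0 else 0x7C00

def fma_f16(src0, src1, src2):
    """Fused form: one rounding after SRC0*SRC1+SRC2."""
    return to_half_bits(as_half(src0) * as_half(src1) + as_half(src2))

def mad_f16_unfused(src0, src1, src2):
    """Unfused form (assumption): the product is rounded to half first."""
    product = as_half(to_half_bits(as_half(src0) * as_half(src1)))
    return to_half_bits(product + as_half(src2))

# (1 + 2^-10)^2 - (1 + 2^-9): the exact result is 2^-20.
a, b, c = 0x3C01, 0x3C01, 0xBC02
print(hex(fma_f16(a, b, c)))          # keeps the 2^-20 term
print(hex(mad_f16_unfused(a, b, c)))  # loses it when the product is rounded
```

With these operands the fused form returns the half denormal `0x0010` (2^-20), while the unfused form returns `0x0000` because the low bits of the product are discarded before the addition.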