Changeset 3165 in CLRX


Timestamp: Jun 16, 2017, 6:41:29 PM
Author: matszpk
Message:

CLRadeonExtender: CLRXDocs: Add info about applying MODE to half precision operations.
Update info about denormal flushing for V_MAD*/V_MAC* instructions.

Location: CLRadeonExtender/trunk/doc
Files: 4 edited

--- CLRadeonExtender/trunk/doc/GcnInstrsVop1.md (r3150)
+++ CLRadeonExtender/trunk/doc/GcnInstrsVop1.md (r3165)
@@ -61,5 +61,4 @@
 floating point value. 
 NOTE: ABS and negation is applied to source operand for any instruction. 
-NOTE: OMOD modifier doesn't work for half precision (FP16) instructions.
 
 Negation and absolute value can be combined: `-ABS(V0)`. Modifiers CLAMP and
@@ -313,8 +312,20 @@
 Description: Convert single FP value to half floating point value with rounding from
 MODE register (single FP rounding mode), and store result to VDST.
-If absolute value is too high, then store -/+infinity to VDST. 
+If absolute value is too high, then store -/+infinity to VDST.
+In GCN 1.2 flushing denormals controlled by MODE. In GCN 1.0/1.1, denormals are enabled. 
 Operation: 
 ```
 VDST = CVTHALF(ASFLOAT(SRC0))
+```
+
+#### V_CVT_F16_U16
+
+Opcode: VOP1: 57 (0x39) for GCN 1.2 
+Opcode VOP3A: 377 (0x179) for GCN 1.2 
+Syntax: V_CVT_F16_U16 VDST, SRC0 
+Description: Convert 16-bit unsigned valut to half floating point value. 
+Operation: 
+```
+VDST = (HALF)SRC0
 ```
 
@@ -325,5 +336,6 @@
 Syntax: V_CVT_F32_F16 VDST, SRC0 
 Description: Convert half FP value to single FP value, and store result to VDST.
-**By default, immediate is in FP32 format!**. 
+**By default, immediate is in FP32 format!**.
+In GCN 1.2 flushing denormals controlled by MODE. In GCN 1.0/1.1, denormals are enabled. 
 Operation: 
 ```
--- CLRadeonExtender/trunk/doc/GcnInstrsVop2.md (r3150)
+++ CLRadeonExtender/trunk/doc/GcnInstrsVop2.md (r3165)
@@ -62,5 +62,4 @@
 floating point value. 
 NOTE: ABS and negation is applied to source operand for any instruction. 
-NOTE: OMOD modifier doesn't work for half precision (FP16) instructions (except V_MAC_F16).
 
 Negation and absolute value can be combined: `-ABS(V0)`. Modifiers CLAMP and
@@ -493,5 +492,5 @@
 Syntax: V_MAC_F16 VDST, SRC0, SRC1 
 Description: Multiply FP16 value from SRC0 by FP16 value from SRC1 and
-add result to VDST. It applies OMOD modifier to result
+add result to VDST. It applies OMOD modifier to result and it flush denormals
 Operation: 
 ```
@@ -504,5 +503,6 @@
 Opcode VOP3A: 287 (0x11f) for GCN 1.0/1.1; 278 (0x116) for GCN 1.2 
 Syntax: V_MAC_F32 VDST, SRC0, SRC1 
-Description: Multiply FP value from SRC0 by FP value from SRC1 and add result to VDST. 
+Description: Multiply FP value from SRC0 by FP value from SRC1 and add result to VDST.
+It applies OMOD modifier to result and it flush denormals. 
 Operation: 
 ```
@@ -516,9 +516,36 @@
 Syntax: V_MAC_LEGACY_F32 VDST, SRC0, SRC1 
 Description: Multiply FP value from SRC0 by FP value from SRC1 and add result to VDST.
-If one of value is 0.0 then always do not change VDST (do not apply IEEE rules for 0.0*x). 
+If one of value is 0.0 then always do not change VDST (do not apply IEEE rules for 0.0*x).
+It applies OMOD modifier to result and it flush denormals. 
 Operation: 
 ```
 if (ASFLOAT(SRC0)!=0.0 && ASFLOAT(SRC1)!=0.0)
     VDST = ASFLOAT(SRC0) * ASFLOAT(SRC1) + ASFLOAT(VDST)
+```
+
+#### V_MADAK_F16
+
+Opcode: 37 (0x25) for GCN 1.2 
+Opcode: 293 (0x125) for GCN 1.2 
+Syntax: V_MADAK_F16 VDST, SRC0, SRC1, FLOAT16LIT 
+Description: Multiply FP16 value from SRC0 with FP16 value from SRC1 and add
+the constant literal FLOATLIT16; and store result to VDST. Constant literal follows
+after instruction word. It flush denormals. 
+Operation:
+```
+VDST = ASHALF(SRC0) * ASHALF(SRC1) + ASHALF(FLOAT16LIT)
+```
+
+#### V_MADAK_F32
+
+Opcode: VOP2: 33 (0x21) for GCN 1.0/1.1; 24 (0x18) for GCN 1.2 
+Opcode: VOP3A: 289 (0x121) for GCN 1.0/1.1; 280 (0x118) for GCN 1.2 
+Syntax: V_MADAK_F32 VDST, SRC0, SRC1, FLOATLIT 
+Description: Multiply FP value from SRC0 with FP value from SRC1 and add
+the constant literal FLOATLIT; and store result to VDST. Constant literal follows
+after instruction word. It flush denormals. 
+Operation:
+```
+VDST = ASFLOAT(SRC0) * ASFLOAT(SRC1) + ASFLOAT(FLOATLIT)
 ```
 
@@ -530,5 +557,5 @@
 Description: Multiply FP16 value from SRC0 with the constant literal FLOAT16LIT and add
 FP16 value from SRC1; and store result to VDST. Constant literal follows
-after instruction word. Use nearest-even rouding
+after instruction word. It flush denormals
 Operation:
 ```
@@ -543,34 +570,8 @@
 Description: Multiply FP value from SRC0 with the constant literal FLOATLIT and add
 FP value from SRC1; and store result to VDST. Constant literal follows
-after instruction word.
+after instruction word. It flush denormals.
 Operation:
 ```
 VDST = ASFLOAT(SRC0) * ASFLOAT(FLOATLIT) + ASFLOAT(SRC1)
-```
-
-#### V_MADAK_F16
-
-Opcode: 37 (0x25) for GCN 1.2 
-Opcode: 293 (0x125) for GCN 1.2 
-Syntax: V_MADAK_F16 VDST, SRC0, SRC1, FLOAT16LIT 
-Description: Multiply FP16 value from SRC0 with FP16 value from SRC1 and add
-the constant literal FLOATLIT16; and store result to VDST. Constant literal follows
-after instruction word. 
-Operation:
-```
-VDST = ASHALF(SRC0) * ASHALF(SRC1) + ASHALF(FLOAT16LIT)
-```
-
-#### V_MADAK_F32
-
-Opcode: VOP2: 33 (0x21) for GCN 1.0/1.1; 24 (0x18) for GCN 1.2 
-Opcode: VOP3A: 289 (0x121) for GCN 1.0/1.1; 280 (0x118) for GCN 1.2 
-Syntax: V_MADAK_F32 VDST, SRC0, SRC1, FLOATLIT 
-Description: Multiply FP value from SRC0 with FP value from SRC1 and add
-the constant literal FLOATLIT; and store result to VDST. Constant literal follows
-after instruction word. 
-Operation:
-```
-VDST = ASFLOAT(SRC0) * ASFLOAT(SRC1) + ASFLOAT(FLOATLIT)
 ```
 
--- CLRadeonExtender/trunk/doc/GcnInstrsVop3.md (r2450)
+++ CLRadeonExtender/trunk/doc/GcnInstrsVop3.md (r3165)
@@ -828,5 +828,5 @@
 Syntax: V_MAD_F32 VDST, SRC0, SRC1, SRC2 
 Description: Multiply FP value from SRC0 by FP value from SRC1 and add SRC2, and store
-result to VDST.
+result to VDST. It applies OMOD modifier to result and it flush denormals.
 Operation: 
 ```
@@ -869,5 +869,6 @@
 Description: Multiply FP value from SRC0 by FP value from SRC1 and add result to SRC2, and
 store result to VDST. If one of value is 0.0 then always store SRC2 to VDST
-(do not apply IEEE rules for 0.0*x). 
+(do not apply IEEE rules for 0.0*x). It applies OMOD modifier to result and it flush
+denormals. 
 Operation: 
 ```
--- CLRadeonExtender/trunk/doc/GcnState.md (r3137)
+++ CLRadeonExtender/trunk/doc/GcnState.md (r3165)
@@ -107,5 +107,6 @@
 
 The single floating point rounding mode is controlled by 0-1 bits in MODE register.
-A rounding mode for double precision is controlled by 2-3 bits. List of possible values:
+A rounding mode for double precision and half precision is controlled by 2-3 bits.
+List of possible values:
 
  Value | Description
@@ -117,6 +118,6 @@
 
 The denormal mode for single precision controlled by 4-5 bits in MODE register. The 6-7
-bits of MODE register controls denormal mode for double precision ops.
-List of possible values:
+bits of MODE register controls denormal mode for double precision and half precision
+operations. List of possible values:
 
  Value | Description
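
The GcnState.md hunks above describe four 2-bit floating point control fields in the low byte of the MODE register: bits 0-1 (single precision rounding), bits 2-3 (double and half precision rounding), bits 4-5 (single precision denormal mode) and bits 6-7 (double and half precision denormal mode). As a minimal sketch of that layout, the Python helper below splits a raw MODE value into those fields. The function name `decode_mode_fp_fields` and the dictionary keys are hypothetical names chosen for illustration and are not part of CLRX; the meaning of each 2-bit value is left to the tables in GcnState.md, which this changeset view truncates.

```python
def decode_mode_fp_fields(mode):
    """Split the low 8 bits of a raw MODE register value into 2-bit FP fields.

    Bit positions follow the GcnState.md text in this changeset; the field
    names are illustrative assumptions, not CLRX identifiers.
    """
    return {
        # bits 0-1: rounding mode for single precision
        "round_single": mode & 0x3,
        # bits 2-3: rounding mode for double and half precision
        "round_double_half": (mode >> 2) & 0x3,
        # bits 4-5: denormal mode for single precision
        "denorm_single": (mode >> 4) & 0x3,
        # bits 6-7: denormal mode for double and half precision
        "denorm_double_half": (mode >> 6) & 0x3,
    }


if __name__ == "__main__":
    # Example raw value with only bits 6-7 set.
    print(decode_mode_fp_fields(0xC0))
```

Since half precision shares the double precision fields, a kernel that changes bits 2-3 or 6-7 for FP64 work also changes the behaviour of its FP16 operations.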