Changes between Version 25 and Version 26 of GcnInstrsVop2
- Timestamp:
- 06/10/17 10:00:25 (6 years ago)
GcnInstrsVop2
v25 v26
194 194 NOTE: OMOD and CLAMP modifiers affect only instructions whose output is a
195 195 floating point value.<br />
-196 NOTE: ABS and negation are applied to the source operand for any instruction.</p>
+196 NOTE: ABS and negation are applied to the source operand for any instruction.<br />
+197 OMOD: OMOD modifier doesn't work for half precision (FP16) instructions (except V_MAC_F16).</p>
197 198 <p>Negation and absolute value can be combined: <code>-ABS(V0)</code>. Modifiers CLAMP and
198 199 OMOD (MUL:2, MUL:4 and DIV:2) can be given in any order.</p>
…
534 535 <h3>Instruction set</h3>
535 536 <p>Alphabetically sorted instruction list:</p>
+537 <h4>V_ADD_F16</h4>
+538 <p>Opcode VOP2: 31 (0x1f) for GCN 1.2<br />
+539 Opcode VOP3A: 287 (0x11f) for GCN 1.2<br />
+540 Syntax: V_ADD_F16 VDST, SRC0, SRC1<br />
+541 Description: Add two FP16 values from SRC0 and SRC1 and store result to VDST.<br />
+542 Operation:<br />
+543 <code>VDST = ASHALF(SRC0) + ASHALF(SRC1)</code></p>
536 544 <h4>V_ADD_F32</h4>
537 545 <p>Opcode VOP2: 3 (0x3) for GCN 1.0/1.1; 1 (0x1) for GCN 1.2<br />
…
731 739 Operation:<br />
732 740 <code>VDST = SRC1 >> (SRC0&31)</code></p>
+741 <h4>V_MAC_F16</h4>
+742 <p>Opcode VOP2: 35 (0x23) for GCN 1.2<br />
+743 Opcode VOP3A: 291 (0x123) for GCN 1.2<br />
+744 Syntax: V_MAC_F16 VDST, SRC0, SRC1<br />
+745 Description: Multiply FP16 value from SRC0 by FP16 value from SRC1 and add result to VDST.<br />
+746 Operation:<br />
+747 <code>VDST = ASHALF(SRC0) * ASHALF(SRC1) + ASHALF(VDST)</code></p>
733 748 <h4>V_MAC_F32</h4>
734 749 <p>Opcode VOP2: 31 (0x1f) for GCN 1.0/1.1; 22 (0x16) for GCN 1.2<br />
…
747 762 <code>if (ASFLOAT(SRC0)!=0.0 && ASFLOAT(SRC1)!=0.0)
748 763 VDST = ASFLOAT(SRC0) * ASFLOAT(SRC1) + ASFLOAT(VDST)</code></p>
+764 <h4>V_MADMK_F16</h4>
+765 <p>Opcode VOP2: 36 (0x24) for GCN 1.2<br />
+766 Opcode VOP3A: 292 (0x124) for GCN 1.2<br />
+767 Syntax: V_MADMK_F16 VDST, SRC0, FLOAT16LIT, SRC1<br />
+768 Description: Multiply FP16 value from SRC0 with the constant literal FLOAT16LIT and add
+769 FP16 value from SRC1; and store result to VDST. Constant literal follows
+770 after instruction word. Use nearest-even rounding.<br />
+771 Operation:
+772 <code>VDST = ASHALF(SRC0) * ASHALF(FLOAT16LIT) + ASHALF(SRC1)</code></p>
749 773 <h4>V_MADMK_F32</h4>
750 774 <p>Opcode: VOP2: 32 (0x20) for GCN 1.0/1.1; 23 (0x17) for GCN 1.2<br />
…
756 780 Operation:
757 781 <code>VDST = ASFLOAT(SRC0) * ASFLOAT(FLOATLIT) + ASFLOAT(SRC1)</code></p>
+782 <h4>V_MADAK_F16</h4>
+783 <p>Opcode VOP2: 37 (0x25) for GCN 1.2<br />
+784 Opcode VOP3A: 293 (0x125) for GCN 1.2<br />
+785 Syntax: V_MADAK_F16 VDST, SRC0, SRC1, FLOAT16LIT<br />
+786 Description: Multiply FP16 value from SRC0 with FP16 value from SRC1 and add
+787 the constant literal FLOAT16LIT; and store result to VDST. Constant literal follows
+788 after instruction word.<br />
+789 Operation:
+790 <code>VDST = ASHALF(SRC0) * ASHALF(SRC1) + ASHALF(FLOAT16LIT)</code></p>
758 791 <h4>V_MADAK_F32</h4>
759 792 <p>Opcode: VOP2: 33 (0x21) for GCN 1.0/1.1; 24 (0x18) for GCN 1.2<br />
…
864 897 else
865 898 VDST = 0.0</code></p>
+899 <h4>V_MUL_F16</h4>
+900 <p>Opcode VOP2: 34 (0x22) for GCN 1.2<br />
+901 Opcode VOP3A: 290 (0x122) for GCN 1.2<br />
+902 Syntax: V_MUL_F16 VDST, SRC0, SRC1<br />
+903 Description: Multiply FP16 value from SRC0 by FP16 value from SRC1
+904 and store result to VDST.<br />
+905 Operation:<br />
+906 <code>VDST = ASHALF(SRC0) * ASHALF(SRC1)</code></p>
866 907 <h4>V_MUL_F32</h4>
867 908 <p>Opcode VOP2: 8 (0x8) for GCN 1.0/1.1; 5 (0x5) for GCN 1.2<br />
…
925 966 Operation:<br />
926 967 <code>SDST = VSRC0[SSRC1 & 63]</code></p>
+968 <h4>V_SUB_F16</h4>
+969 <p>Opcode VOP2: 32 (0x20) for GCN 1.2<br />
+970 Opcode VOP3A: 288 (0x120) for GCN 1.2<br />
+971 Syntax: V_SUB_F16 VDST, SRC0, SRC1<br />
+972 Description: Subtract FP16 value of SRC1 from FP16 value of SRC0 and store result to VDST.<br />
+973 Operation:<br />
+974 <code>VDST = ASHALF(SRC0) - ASHALF(SRC1)</code></p>
927 975 <h4>V_SUB_F32</h4>
928 976 <p>Opcode VOP2: 4 (0x4) for GCN 1.0/1.1; 2 (0x2) for GCN 1.2<br />
…
980 1028 VDST = temp
981 1029 SDST = (SDST&~mask) | ((temp >> 32) ? mask : 0)</code></p>
+1030 <h4>V_SUBREV_F16</h4>
+1031 <p>Opcode VOP2: 33 (0x21) for GCN 1.2<br />
+1032 Opcode VOP3A: 289 (0x121) for GCN 1.2<br />
+1033 Syntax: V_SUBREV_F16 VDST, SRC0, SRC1<br />
+1034 Description: Subtract FP16 value of SRC0 from FP16 value of SRC1 and store result to VDST.<br />
+1035 Operation:<br />
+1036 <code>VDST = ASHALF(SRC1) - ASHALF(SRC0)</code></p>
982 1037 <h4>V_SUBREV_F32</h4>
983 1038 <p>Opcode VOP2: 5 (0x5) for GCN 1.0/1.1; 3 (0x3) for GCN 1.2<br />
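
The Operation lines of the FP16 entries added in this revision can be tried out with a small Python sketch. It is only an illustration under stated assumptions, not a model of the hardware: `ASHALF` is taken to mean "reinterpret the low 16 bits as an IEEE 754 half-precision value", each result is computed in double precision and rounded once to FP16 with round-to-nearest-even, overflow is assumed to produce infinity, and denormal modes and the CLAMP/OMOD modifiers are ignored. The helper names (`as_half`, `half_bits`, `v_add_f16`, and so on) are hypothetical.

```python
import struct

def as_half(bits):
    """Reinterpret the low 16 bits of an integer as an FP16 value (assumed ASHALF)."""
    return struct.unpack('<e', struct.pack('<H', bits & 0xFFFF))[0]

def half_bits(value):
    """Round a Python float to FP16 (nearest-even) and return its 16-bit pattern."""
    try:
        return struct.unpack('<H', struct.pack('<e', value))[0]
    except OverflowError:
        # struct refuses values beyond the FP16 range; assume overflow gives infinity.
        return 0xFC00 if value < 0 else 0x7C00

def v_add_f16(src0, src1):
    return half_bits(as_half(src0) + as_half(src1))

def v_sub_f16(src0, src1):
    return half_bits(as_half(src0) - as_half(src1))

def v_subrev_f16(src0, src1):
    # Operands are reversed: SRC1 - SRC0.
    return half_bits(as_half(src1) - as_half(src0))

def v_mul_f16(src0, src1):
    return half_bits(as_half(src0) * as_half(src1))

def v_mac_f16(vdst, src0, src1):
    # The old VDST value is the addend; the new VDST value is returned.
    return half_bits(as_half(src0) * as_half(src1) + as_half(vdst))

def v_madmk_f16(src0, float16lit, src1):
    # The literal is the multiplier.
    return half_bits(as_half(src0) * as_half(float16lit) + as_half(src1))

def v_madak_f16(src0, src1, float16lit):
    # The literal is the addend.
    return half_bits(as_half(src0) * as_half(src1) + as_half(float16lit))

if __name__ == "__main__":
    # 0x3C00 = 1.0, 0x4000 = 2.0, 0x4200 = 3.0 in FP16
    print(hex(v_add_f16(0x3C00, 0x4000)))     # 1.0 + 2.0 = 3.0 -> 0x4200
    print(hex(v_subrev_f16(0x4000, 0x3C00)))  # 1.0 - 2.0 = -1.0 -> 0xbc00
```

Registers are passed and returned as raw 16-bit patterns so that the reinterpretation step stays visible. Whether the hardware rounds the intermediate product of the multiply-add instructions is not stated in the entries above, so the single final rounding here is a guess.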