Changes between Version 26 and Version 27 of GcnInstrsVop2
- Timestamp: 06/10/17 12:00:24 (7 years ago)
GcnInstrsVop2
v26   v27
565   565   UINT64 mask = (1ULL<<LANEID)
566   566   SDST = (SDST&~mask) | ((temp >> 32) ? mask : 0)</code></p>
      567   <h4>V_ADD_U16</h4>
      568   <p>Opcode VOP2: 38 (0x26) for GCN 1.2<br />
      569   Opcode VOP3A: 294 (0x126) for GCN 1.2<br />
      570   Syntax: V_ADD_U16 VDST, SRC0, SRC1<br />
      571   Description: Add two 16-bit unsigned values from SRC0 and SRC1 and
      572   store 16-bit unsigned result to VDST.<br />
      573   Operation:<br />
      574   <code>VDST = (SRC0 + SRC1) & 0xffff</code></p>
567   575   <h4>V_ADDC_U32</h4>
568   576   <p>Opcode VOP2: 40 (0x28) for GCN 1.0/1.1; 28 (0x1c) for GCN 1.2<br />
…     …
595   603   Operation:<br />
596   604   <code>VDST = (INT32)SRC0 >> (SRC1&31)</code></p>
      605   <h4>V_ASHRREV_B16</h4>
      606   <p>Opcode VOP2: 44 (0x2c) for GCN 1.2<br />
      607   Opcode VOP3A: 300 (0x12c) for GCN 1.2<br />
      608   Syntax: V_ASHRREV_B16 VDST, SRC0, SRC1<br />
      609   Description: Shift right signed 16-bit value from SRC1 by (SRC0&15) bits and
      610   store 16-bit signed result into VDST.<br />
      611   Operation:<br />
      612   <code>VDST = ((INT16)SRC1 >> (SRC0&15)) & 0xffff</code></p>
597   613   <h4>V_ASHRREV_I32</h4>
598   614   <p>Opcode VOP2: 24 (0x18) for GCN 1.0/1.1; 17 (0x11) for GCN 1.2<br />
…     …
703   719   UINT16 D1 = ASINT16(CVT_HALF_RTZ(ASFLOAT(SRC1)))
704   720   VDST = D0 | (((UINT32)D1) << 16)</code></p>
      721   <h4>V_LDEXP_F16</h4>
      722   <p>Opcode VOP2: 51 (0x33) for GCN 1.2<br />
      723   Opcode VOP3A: 307 (0x133) for GCN 1.2<br />
      724   Syntax: V_LDEXP_F16 VDST, SRC0, SRC1<br />
      725   Description: Do ldexp operation on SRC0 and SRC1 (multiply SRC0 by 2**(SRC1)).
      726   SRC1 is signed integer, SRC0 is half floating point value.<br />
      727   Operation:<br />
      728   <code>VDST = ASHALF(SRC0) * POW(2.0, (INT32)SRC1)</code></p>
705   729   <h4>V_LDEXP_F32</h4>
706   730   <p>Opcode VOP2: 43 (0x2b) for GCN 1.0/1.1<br />
…     …
718   742   Operation:<br />
719   743   <code>VDST = SRC0 << (SRC1&31)</code></p>
      744   <h4>V_LSHLREV_B16</h4>
      745   <p>Opcode VOP2: 42 (0x2a) for GCN 1.2<br />
      746   Opcode VOP3A: 298 (0x12a) for GCN 1.2<br />
      747   Syntax: V_LSHLREV_B16 VDST, SRC0, SRC1<br />
      748   Description: Shift left unsigned 16-bit value from SRC1 by (SRC0&15) bits and
      749   store 16-bit unsigned result into VDST.<br />
      750   Operation:<br />
      751   <code>VDST = (SRC1 << (SRC0&15)) & 0xffff</code></p>
720   752   <h4>V_LSHLREV_B32</h4>
721   753   <p>Opcode VOP2: 26 (0x1a) for GCN 1.0/1.1; 18 (0x12) for GCN 1.2<br />
…     …
732   764   Operation:<br />
733   765   <code>VDST = SRC0 >> (SRC1&31)</code></p>
      766   <h4>V_LSHRREV_B16</h4>
      767   <p>Opcode VOP2: 43 (0x2b) for GCN 1.2<br />
      768   Opcode VOP3A: 299 (0x12b) for GCN 1.2<br />
      769   Syntax: V_LSHRREV_B16 VDST, SRC0, SRC1<br />
      770   Description: Shift right unsigned 16-bit value from SRC1 by (SRC0&15) bits and
      771   store 16-bit unsigned result into VDST.<br />
      772   Operation:<br />
      773   <code>VDST = (SRC1 >> (SRC0&15)) & 0xffff</code></p>
734   774   <h4>V_LSHRREV_B32</h4>
735   775   <p>Opcode VOP2: 22 (0x16) for GCN 1.0/1.1; 16 (0x10) for GCN 1.2<br />
…     …
743   783   Opcode VOP3A: 291 (0x123) for GCN 1.2<br />
744   784   Syntax: V_MAC_F16 VDST, SRC0, SRC1<br />
745         Description: Multiply FP16 value from SRC0 by FP16 value from SRC1 and add result to VDST.<br />
      785   Description: Multiply FP16 value from SRC0 by FP16 value from SRC1 and
      786   add result to VDST. It applies OMOD modifier to result.<br />
746   787   Operation:<br />
747   788   <code>VDST = ASHALF(SRC0) * ASHALF(SRC1) + ASHALF(VDST)</code></p>
…     …
798   839   Operation:
799   840   <code>VDST = ASFLOAT(SRC0) * ASFLOAT(SRC1) + ASFLOAT(FLOATLIT)</code></p>
      841   <h4>V_MAX_F16</h4>
      842   <p>Opcode VOP2: 45 (0x2d) for GCN 1.2<br />
      843   Opcode VOP3A: 301 (0x12d) for GCN 1.2<br />
      844   Syntax: V_MAX_F16 VDST, SRC0, SRC1<br />
      845   Description: Choose largest half floating point value from SRC0 and SRC1,
      846   and store result to VDST.<br />
      847   Operation:<br />
      848   <code>VDST = MAX(ASHALF(SRC0), ASHALF(SRC1))</code></p>
800   849   <h4>V_MAX_F32</h4>
801   850   <p>Opcode VOP2: 16 (0x10) for GCN 1.0/1.1; 11 (0xb) for GCN 1.2<br />
…     …
852   901   <code>UINT32 MASK = ((1ULL << LANEID) - 1ULL) & SRC0
853   902   VDST = SRC1 + BITCOUNT(MASK)</code></p>
      903   <h4>V_MIN_F16</h4>
      904   <p>Opcode VOP2: 46 (0x2e) for GCN 1.2<br />
      905   Opcode VOP3A: 302 (0x12e) for GCN 1.2<br />
      906   Syntax: V_MIN_F16 VDST, SRC0, SRC1<br />
      907   Description: Choose smallest half floating point value from SRC0 and SRC1,
      908   and store result to VDST.<br />
      909   Operation:<br />
      910   <code>VDST = MIN(ASHALF(SRC0), ASHALF(SRC1))</code></p>
854   911   <h4>V_MIN_F32</h4>
855   912   <p>Opcode VOP2: 15 (0xf) for GCN 1.0/1.1; 10 (0xa) for GCN 1.2<br />
…     …
860   917   Operation:<br />
861   918   <code>VDST = MIN(ASFLOAT(SRC0), ASFLOAT(SRC1))</code></p>
      919   <h4>V_MIN_I16</h4>
      920   <p>Opcode VOP2: 50 (0x32) for GCN 1.2<br />
      921   Opcode VOP3A: 306 (0x132) for GCN 1.2<br />
      922   Syntax: V_MIN_I16 VDST, SRC0, SRC1<br />
      923   Description: Choose smallest signed 16-bit value from SRC0 and SRC1,
      924   and store result to VDST.<br />
      925   Operation:<br />
      926   <code>VDST = MIN((INT16)SRC0, (INT16)SRC1)</code></p>
862   927   <h4>V_MIN_I32</h4>
863   928   <p>Opcode VOP2: 17 (0x11) for GCN 1.0/1.1; 12 (0xc) for GCN 1.2<br />
…     …
879   944   else
880   945   VDST = NaN</code></p>
      946   <h4>V_MIN_U16</h4>
      947   <p>Opcode VOP2: 49 (0x31) for GCN 1.2<br />
      948   Opcode VOP3A: 305 (0x131) for GCN 1.2<br />
      949   Syntax: V_MIN_U16 VDST, SRC0, SRC1<br />
      950   Description: Choose smallest unsigned 16-bit value from SRC0 and SRC1,
      951   and store result to VDST.<br />
      952   Operation:<br />
      953   <code>VDST = MIN(SRC0&0xffff, SRC1&0xffff)</code></p>
881   954   <h4>V_MIN_U32</h4>
882   955   <p>Opcode VOP2: 19 (0x13) for GCN 1.0/1.1; 14 (0xe) for GCN 1.2<br />
…     …
942   1015  INT32 V1 = (INT32)((SRC1&0x7fffff) | (SRC1&0x800000 ? 0xff800000 : 0))
943   1016  VDST = V0 * V1</code></p>
      1017  <h4>V_MUL_LO_U16</h4>
      1018  <p>Opcode VOP2: 41 (0x29) for GCN 1.2<br />
      1019  Opcode VOP3A: 297 (0x129) for GCN 1.2<br />
      1020  Syntax: V_MUL_LO_U16 VDST, SRC0, SRC1<br />
      1021  Description: Multiply 16-bit unsigned value from SRC0 by 16-bit unsigned value from SRC1
      1022  and store 16-bit result to VDST.<br />
      1023  Operation:<br />
      1024  <code>VDST = ((SRC0&0xffff) * (SRC1&0xffff)) & 0xffff</code></p>
944   1025  <h4>V_MUL_U32_U24</h4>
945   1026  <p>Opcode VOP2: 11 (0xb) for GCN 1.0/1.1; 8 (0x8) for GCN 1.2<br />
…     …
980   1061  Operation:<br />
981   1062  <code>VDST = ASFLOAT(SRC0) - ASFLOAT(SRC1)</code></p>
      1063  <h4>V_SUB_U16</h4>
      1064  <p>Opcode VOP2: 39 (0x27) for GCN 1.2<br />
      1065  Opcode VOP3A: 295 (0x127) for GCN 1.2<br />
      1066  Syntax: V_SUB_U16 VDST, SRC0, SRC1<br />
      1067  Description: Subtract unsigned 16-bit value of SRC1 from SRC0 and store
      1068  16-bit unsigned result to VDST.<br />
      1069  Operation:<br />
      1070  <code>VDST = (SRC0 - SRC1) & 0xffff</code></p>
982   1071  <h4>V_SUB_I32, V_SUB_U32</h4>
983   1072  <p>Opcode VOP2: 38 (0x26) for GCN 1.0/1.1; 26 (0x1a) for GCN 1.2<br />
…     …
1058  1147  UINT64 mask = (1ULL<<LANEID)
1059  1148  SDST = (SDST&~mask) | ((temp>>32) ? mask : 0)</code></p>
      1149  <h4>V_SUBREV_U16</h4>
      1150  <p>Opcode VOP2: 40 (0x28) for GCN 1.2<br />
      1151  Opcode VOP3A: 296 (0x128) for GCN 1.2<br />
      1152  Syntax: V_SUBREV_U16 VDST, SRC0, SRC1<br />
      1153  Description: Subtract unsigned 16-bit value of SRC0 from SRC1 and store
      1154  16-bit unsigned result to VDST.<br />
      1155  Operation:<br />
      1156  <code>VDST = (SRC1 - SRC0) & 0xffff</code></p>
1060  1157  <h4>V_XOR_B32</h4>
1061  1158  <p>Opcode VOP2: 29 (0x1d) for GCN 1.0/1.1; 21 (0x15) for GCN 1.2<br />
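
Note: the 16-bit integer operations added in this revision can be checked against a small software model. The following is a minimal sketch in Python, not part of the GcnInstrsVop2 page and not an authoritative emulator. The function names (v_add_u16 and so on) and the helper as_int16 are hypothetical, chosen to mirror the instruction names. Two assumptions are made beyond the listed pseudocode: V_LSHRREV_B16 masks SRC1 to 16 bits before shifting (following the description's "16-bit value from SRC1"), and V_MIN_I16 masks its result to 16 bits like the other operations.

```python
MASK16 = 0xFFFF


def as_int16(x):
    """Reinterpret the low 16 bits of x as a signed value (the (INT16) cast)."""
    x &= MASK16
    return x - 0x10000 if x & 0x8000 else x


def v_add_u16(src0, src1):
    # VDST = (SRC0 + SRC1) & 0xffff
    return (src0 + src1) & MASK16


def v_sub_u16(src0, src1):
    # VDST = (SRC0 - SRC1) & 0xffff
    return (src0 - src1) & MASK16


def v_subrev_u16(src0, src1):
    # VDST = (SRC1 - SRC0) & 0xffff (operands reversed)
    return (src1 - src0) & MASK16


def v_mul_lo_u16(src0, src1):
    # VDST = ((SRC0&0xffff) * (SRC1&0xffff)) & 0xffff
    return ((src0 & MASK16) * (src1 & MASK16)) & MASK16


def v_lshlrev_b16(src0, src1):
    # VDST = (SRC1 << (SRC0&15)) & 0xffff
    return (src1 << (src0 & 15)) & MASK16


def v_lshrrev_b16(src0, src1):
    # Assumption: SRC1 is masked to 16 bits first, per the description.
    return ((src1 & MASK16) >> (src0 & 15)) & MASK16


def v_ashrrev_b16(src0, src1):
    # VDST = ((INT16)SRC1 >> (SRC0&15)) & 0xffff
    # Python's >> on a negative int is an arithmetic shift.
    return (as_int16(src1) >> (src0 & 15)) & MASK16


def v_min_u16(src0, src1):
    # VDST = MIN(SRC0&0xffff, SRC1&0xffff)
    return min(src0 & MASK16, src1 & MASK16)


def v_min_i16(src0, src1):
    # VDST = MIN((INT16)SRC0, (INT16)SRC1); result masked (assumption).
    return min(as_int16(src0), as_int16(src1)) & MASK16
```

For example, v_ashrrev_b16(4, 0x8000) keeps the sign bit and yields 0xF800, while v_lshrrev_b16(4, 0x8000) shifts in zeros and yields 0x0800. The "REV" variants take the shift amount (or the subtrahend) from SRC0 rather than SRC1.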