Changes between Version 33 and Version 34 of GcnInstrsVop3
- Timestamp: 11/25/17 23:00:28
GcnInstrsVop3
 v33  v34
 774  774 </tr>
 775  775 <tr>
      776 <td>497 (0x1f1)</td>
      777 <td></td>
      778 <td>✓</td>
      779 <td>V_MAD_U32_U16</td>
      780 </tr>
      781 <tr>
      782 <td>498 (0x1f2)</td>
      783 <td></td>
      784 <td>✓</td>
      785 <td>V_MAD_I32_I16</td>
      786 </tr>
      787 <tr>
      788 <td>499 (0x1f3)</td>
      789 <td></td>
      790 <td>✓</td>
      791 <td>V_XAD_U32</td>
      792 </tr>
      793 <tr>
      794 <td>500 (0x1f4)</td>
      795 <td></td>
      796 <td>✓</td>
      797 <td>V_MIN3_F16</td>
      798 </tr>
      799 <tr>
      800 <td>501 (0x1f5)</td>
      801 <td></td>
      802 <td>✓</td>
      803 <td>V_MIN3_I16</td>
      804 </tr>
      805 <tr>
      806 <td>502 (0x1f6)</td>
      807 <td></td>
      808 <td>✓</td>
      809 <td>V_MIN3_U16</td>
      810 </tr>
      811 <tr>
      812 <td>503 (0x1f7)</td>
      813 <td></td>
      814 <td>✓</td>
      815 <td>V_MAX3_F16</td>
      816 </tr>
      817 <tr>
      818 <td>504 (0x1f8)</td>
      819 <td></td>
      820 <td>✓</td>
      821 <td>V_MAX3_I16</td>
      822 </tr>
      823 <tr>
      824 <td>505 (0x1f9)</td>
      825 <td></td>
      826 <td>✓</td>
      827 <td>V_MAX3_U16</td>
      828 </tr>
      829 <tr>
      830 <td>506 (0x1fa)</td>
      831 <td></td>
      832 <td>✓</td>
      833 <td>V_MED3_F16</td>
      834 </tr>
      835 <tr>
      836 <td>507 (0x1fb)</td>
      837 <td></td>
      838 <td>✓</td>
      839 <td>V_MED3_I16</td>
      840 </tr>
      841 <tr>
      842 <td>508 (0x1fc)</td>
      843 <td></td>
      844 <td>✓</td>
      845 <td>V_MED3_U16</td>
      846 </tr>
      847 <tr>
      848 <td>509 (0x1fd)</td>
      849 <td></td>
      850 <td>✓</td>
      851 <td>V_LSHL_ADD_U32</td>
      852 </tr>
      853 <tr>
      854 <td>510 (0x1fe)</td>
      855 <td></td>
      856 <td>✓</td>
      857 <td>V_ADD_LSHL_U32</td>
      858 </tr>
      859 <tr>
      860 <td>511 (0x1ff)</td>
      861 <td></td>
      862 <td>✓</td>
      863 <td>V_ADD3_U32</td>
      864 </tr>
      865 <tr>
 776  866 <td>624 (0x270)</td>
 777  867 <td>✓</td>
   …    …
1049 1139     VDST = 0x80000000
1050 1140 }</code></p>
     1141 <h4>V_ADD3_U32</h4>
     1142 <p>Opcode: 511 (0x1ff) for GCN 1.4<br />
     1143 Syntax: V_ADD3_U32 VDST, SRC0, SRC1, SRC2<br />
     1144 Description: Add SRC0, SRC1 and SRC2, and store the result to VDST.<br />
     1145 Operation:<br />
     1146 <code>VDST = SRC0 + SRC1 + SRC2</code></p>
     1147 <h4>V_ADD_LSHL_U32</h4>
     1148 <p>Opcode: 510 (0x1fe) for GCN 1.4<br />
     1149 Syntax: V_ADD_LSHL_U32 VDST, SRC0, SRC1, SRC2<br />
     1150 Description: Add SRC0 and SRC1, shift the sum left by (SRC2&31) bits, and store the result to VDST.<br />
     1151 Operation:<br />
     1152 <code>VDST = (SRC0 + SRC1) << (SRC2&31)</code></p>
1051 1153 <h4>V_ALIGNBIT_B32</h4>
1052 1154 <p>Opcode: 334 (0x14e) for GCN 1.0/1.1; 462 (0x1ce) for GCN 1.2<br />
   …    …
1532 1634     VDST = (VDST & ~(255U<<(i*8))) | (((S0+S1+S2) >> 1) << (i*8))
1533 1635 }</code></p>
     1636 <h4>V_LSHL_ADD_U32</h4>
     1637 <p>Opcode: 509 (0x1fd) for GCN 1.4<br />
     1638 Syntax: V_LSHL_ADD_U32 VDST, SRC0, SRC1, SRC2<br />
     1639 Description: Shift SRC0 left by (SRC1&31) bits, add SRC2, and store the result to VDST.<br />
     1640 Operation:<br />
     1641 <code>VDST = (SRC0 << (SRC1&31)) + SRC2</code></p>
1534 1642 <h4>V_LSHL_B64</h4>
1535 1643 <p>Opcode: 353 (0x161) for GCN 1.0/1.1<br />
   …    …
1586 1694 Operation:<br />
1587 1695 <code>VDST = (INT16)((INT16)SRC0*(INT16)SRC1 + (INT16)SRC2)</code></p>
     1696 <h4>V_MAD_I32_I16</h4>
     1697 <p>Opcode: 498 (0x1f2) for GCN 1.4<br />
     1698 Syntax: V_MAD_I32_I16 VDST, SRC0, SRC1, SRC2<br />
     1699 Description: Multiply the 16-bit signed value from SRC0 by the 16-bit signed value from
     1700 SRC1, add the 32-bit value from SRC2, and store the 32-bit result to VDST.<br />
     1701 Operation:<br />
     1702 <code>VDST = (UINT32)(SEXT32((INT16)SRC0)*(INT16)SRC1) + SRC2</code></p>
1588 1703 <h4>V_MAD_I32_I24</h4>
1589 1704 <p>Opcode: 322 (0x142) for GCN 1.0/1.1; 450 (0x1c2) for GCN 1.2<br />
   …    …
1624 1739 Operation:<br />
1625 1740 <code>VDST = ((UINT16)SRC0*(UINT16)SRC1 + (UINT16)SRC2) & 0xffff</code></p>
     1741 <h4>V_MAD_U32_U16</h4>
     1742 <p>Opcode: 497 (0x1f1) for GCN 1.4<br />
     1743 Syntax: V_MAD_U32_U16 VDST, SRC0, SRC1, SRC2<br />
     1744 Description: Multiply the 16-bit unsigned value from SRC0 by the 16-bit unsigned value from
     1745 SRC1, add the 32-bit unsigned value from SRC2, and store the 32-bit result to VDST.<br />
     1746 Operation:<br />
     1747 <code>VDST = (UINT32)((SRC0&0xffff)*(SRC1&0xffff)) + SRC2</code></p>
1626 1748 <h4>V_MAD_U32_U24</h4>
1627 1749 <p>Opcode: 323 (0x143) for GCN 1.0/1.1; 451 (0x1c3) for GCN 1.2<br />
   …    …
1649 1771 Operation:<br />
1650 1772 <code>VDST = MAX(ASDOUBLE(SRC0), ASDOUBLE(SRC1))</code></p>
     1773 <h4>V_MAX3_F16</h4>
     1774 <p>Opcode: 503 (0x1f7) for GCN 1.4<br />
     1775 Syntax: V_MAX3_F16 VDST, SRC0, SRC1, SRC2<br />
     1776 Description: Choose the largest of the half FP values SRC0, SRC1, SRC2,
     1777 and store it to VDST.<br />
     1778 Operation:<br />
     1779 <code>HALF SF0 = ASHALF(SRC0)
     1780 HALF SF1 = ASHALF(SRC1)
     1781 HALF SF2 = ASHALF(SRC2)
     1782 if (ISNAN(SF0))
     1783     VDST = MAX(SF1, SF2)
     1784 else if (ISNAN(SF1))
     1785     VDST = MAX(SF0, SF2)
     1786 else if (ISNAN(SF2))
     1787     VDST = MAX(SF0, SF1)
     1788 else if (SF2 > SF0 && SF2 > SF1)
     1789     VDST = SF2
     1790 else
     1791     VDST = MAX(SF1, SF0)</code></p>
1651 1792 <h4>V_MAX3_F32</h4>
1652 1793 <p>Opcode: 340 (0x154) for GCN 1.0/1.1; 467 (0x1d3) for GCN 1.2<br />
   …    …
1667 1808 else
1668 1809     VDST = MAX(SF1, SF0)</code></p>
     1810 <h4>V_MAX3_I16</h4>
     1811 <p>Opcode: 504 (0x1f8) for GCN 1.4<br />
     1812 Syntax: V_MAX3_I16 VDST, SRC0, SRC1, SRC2<br />
     1813 Description: Choose the largest of the signed 16-bit integer values SRC0, SRC1, SRC2,
     1814 and store it to VDST.<br />
     1815 Operation:<br />
     1816 <code>if ((INT16)SRC2 > (INT16)SRC0 && (INT16)SRC2 > (INT16)SRC1)
     1817     VDST = (UINT16)SRC2
     1818 else
     1819     VDST = (UINT16)MAX((INT16)SRC1, (INT16)SRC0)</code></p>
1669 1820 <h4>V_MAX3_I32</h4>
1670 1821 <p>Opcode: 341 (0x155) for GCN 1.0/1.1; 468 (0x1d4) for GCN 1.2<br />
   …    …
1677 1828 else
1678 1829     VDST = MAX((INT32)SRC1, (INT32)SRC0)</code></p>
     1830 <h4>V_MAX3_U16</h4>
     1831 <p>Opcode: 505 (0x1f9) for GCN 1.4<br />
     1832 Syntax: V_MAX3_U16 VDST, SRC0, SRC1, SRC2<br />
     1833 Description: Choose the largest of the unsigned 16-bit integer values SRC0, SRC1, SRC2,
     1834 and store it to VDST.<br />
     1835 Operation:<br />
     1836 <code>if ((UINT16)SRC2 > (UINT16)SRC0 && (UINT16)SRC2 > (UINT16)SRC1)
     1837     VDST = (UINT16)SRC2
     1838 else
     1839     VDST = MAX((UINT16)SRC1, (UINT16)SRC0)</code></p>
1679 1840 <h4>V_MAX3_U32</h4>
1680 1841 <p>Opcode: 342 (0x156) for GCN 1.0/1.1; 469 (0x1d5) for GCN 1.2<br />
   …    …
1705 1866 <code>UINT32 MASK = ((1ULL << LANEID) - 1ULL) & SRC0
1706 1867 VDST = SRC1 + BITCOUNT(MASK)</code></p>
     1868 <h4>V_MED3_F16</h4>
     1869 <p>Opcode: 506 (0x1fa) for GCN 1.4<br />
     1870 Syntax: V_MED3_F16 VDST, SRC0, SRC1, SRC2<br />
     1871 Description: Choose the median of the half FP values SRC0, SRC1, SRC2,
     1872 and store it to VDST.<br />
     1873 Operation:<br />
     1874 <code>HALF SF0 = ASHALF(SRC0)
     1875 HALF SF1 = ASHALF(SRC1)
     1876 HALF SF2 = ASHALF(SRC2)
     1877 if (ISNAN(SF0))
     1878     VDST = MIN(SF1, SF2)
     1879 else if (ISNAN(SF1))
     1880     VDST = MIN(SF0, SF2)
     1881 else if (ISNAN(SF2))
     1882     VDST = MIN(SF0, SF1)
     1883 else if ((SF2 > SF1 && SF2 < SF0) || (SF2 < SF1 && SF2 > SF0))
     1884     VDST = SF2
     1885 else if ((SF1 > SF2 && SF1 < SF0) || (SF1 < SF2 && SF1 > SF0))
     1886     VDST = SF1
     1887 else
     1888     VDST = SF0</code></p>
1707 1889 <h4>V_MED3_F32</h4>
1708 1890 <p>Opcode: 343 (0x157) for GCN 1.0/1.1; 470 (0x1d6) for GCN 1.2<br />
   …    …
1725 1907 else
1726 1908     VDST = SF0</code></p>
     1909 <h4>V_MED3_I16</h4>
     1910 <p>Opcode: 507 (0x1fb) for GCN 1.4<br />
     1911 Syntax: V_MED3_I16 VDST, SRC0, SRC1, SRC2<br />
     1912 Description: Choose the median of the signed 16-bit integer values SRC0, SRC1, SRC2,
     1913 and store it to VDST.<br />
     1914 Operation:<br />
     1915 <code>INT16 S0 = (INT16)SRC0
     1916 INT16 S1 = (INT16)SRC1
     1917 INT16 S2 = (INT16)SRC2
     1918 if ((S2 > S1 && S2 < S0) || (S2 < S1 && S2 > S0))
     1919     VDST = (UINT16)S2
     1920 else if ((S1 > S2 && S1 < S0) || (S1 < S2 && S1 > S0))
     1921     VDST = (UINT16)S1
     1922 else
     1923     VDST = (UINT16)S0</code></p>
1727 1924 <h4>V_MED3_I32</h4>
1728 1925 <p>Opcode: 344 (0x158) for GCN 1.0/1.1; 471 (0x1d7) for GCN 1.2<br />
   …    …
1740 1937 else
1741 1938     VDST = S0</code></p>
     1939 <h4>V_MED3_U16</h4>
     1940 <p>Opcode: 508 (0x1fc) for GCN 1.4<br />
     1941 Syntax: V_MED3_U16 VDST, SRC0, SRC1, SRC2<br />
     1942 Description: Choose the median of the unsigned 16-bit integer values SRC0, SRC1, SRC2,
     1943 and store it to VDST.<br />
     1944 Operation:<br />
     1945 <code>UINT16 S0 = (UINT16)SRC0
     1946 UINT16 S1 = (UINT16)SRC1
     1947 UINT16 S2 = (UINT16)SRC2
     1948 if ((S2 > S1 && S2 < S0) || (S2 < S1 && S2 > S0))
     1949     VDST = S2
     1950 else if ((S1 > S2 && S1 < S0) || (S1 < S2 && S1 > S0))
     1951     VDST = S1
     1952 else
     1953     VDST = S0</code></p>
1742 1954 <h4>V_MED3_U32</h4>
1743 1955 <p>Opcode: 345 (0x159) for GCN 1.0/1.1; 472 (0x1d8) for GCN 1.2<br />
   …    …
1758 1970 Operation:<br />
1759 1971 <code>VDST = MIN(ASDOUBLE(SRC0), ASDOUBLE(SRC1))</code></p>
     1972 <h4>V_MIN3_F16</h4>
     1973 <p>Opcode: 500 (0x1f4) for GCN 1.4<br />
     1974 Syntax: V_MIN3_F16 VDST, SRC0, SRC1, SRC2<br />
     1975 Description: Choose the smallest of the half FP values SRC0, SRC1, SRC2,
     1976 and store it to VDST.<br />
     1977 Operation:<br />
     1978 <code>HALF SF0 = ASHALF(SRC0)
     1979 HALF SF1 = ASHALF(SRC1)
     1980 HALF SF2 = ASHALF(SRC2)
     1981 if (ISNAN(SF0))
     1982     VDST = MIN(SF1, SF2)
     1983 else if (ISNAN(SF1))
     1984     VDST = MIN(SF0, SF2)
     1985 else if (ISNAN(SF2))
     1986     VDST = MIN(SF0, SF1)
     1987 else if (SF2 < SF0 && SF2 < SF1)
     1988     VDST = SF2
     1989 else
     1990     VDST = MIN(SF1, SF0)</code></p>
1760 1991 <h4>V_MIN3_F32</h4>
1761 1992 <p>Opcode: 337 (0x151) for GCN 1.0/1.1; 464 (0x1d0) for GCN 1.2<br />
   …    …
1776 2007 else
1777 2008     VDST = MIN(SF1, SF0)</code></p>
     2009 <h4>V_MIN3_I16</h4>
     2010 <p>Opcode: 501 (0x1f5) for GCN 1.4<br />
     2011 Syntax: V_MIN3_I16 VDST, SRC0, SRC1, SRC2<br />
     2012 Description: Choose the smallest of the signed 16-bit integer values SRC0, SRC1, SRC2,
     2013 and store it to VDST.<br />
     2014 Operation:<br />
     2015 <code>if ((INT16)SRC2 < (INT16)SRC0 && (INT16)SRC2 < (INT16)SRC1)
     2016     VDST = (UINT16)SRC2
     2017 else
     2018     VDST = (UINT16)MIN((INT16)SRC1, (INT16)SRC0)</code></p>
1778 2019 <h4>V_MIN3_I32</h4>
1779 2020 <p>Opcode: 338 (0x152) for GCN 1.0/1.1; 465 (0x1d1) for GCN 1.2<br />
   …    …
1786 2027 else
1787 2028     VDST = MIN((INT32)SRC1, (INT32)SRC0)</code></p>
     2029 <h4>V_MIN3_U16</h4>
     2030 <p>Opcode: 502 (0x1f6) for GCN 1.4<br />
     2031 Syntax: V_MIN3_U16 VDST, SRC0, SRC1, SRC2<br />
     2032 Description: Choose the smallest of the unsigned 16-bit integer values SRC0, SRC1, SRC2,
     2033 and store it to VDST.<br />
     2034 Operation:<br />
     2035 <code>if ((UINT16)SRC2 < (UINT16)SRC0 && (UINT16)SRC2 < (UINT16)SRC1)
     2036     VDST = (UINT16)SRC2
     2037 else
     2038     VDST = MIN((UINT16)SRC1, (UINT16)SRC0)</code></p>
1788 2039 <h4>V_MIN3_U32</h4>
1789 2040 <p>Opcode: 339 (0x153) for GCN 1.0/1.1; 466 (0x1d2) for GCN 1.2<br />
   …    …
2048 2299 Operation:<br />
2049 2300 <code>VDST[SSRC1 & 63] = SSRC0</code></p>
     2301 <h4>V_XAD_U32</h4>
     2302 <p>Opcode: 499 (0x1f3) for GCN 1.4<br />
     2303 Syntax: V_XAD_U32 VDST, SRC0, SRC1, SRC2<br />
     2304 Description: XOR SRC0 with SRC1, add SRC2, and store the result to VDST.
     2305 The instruction was added to speed up SHA256 computation.<br />
     2306 Operation:<br />
     2307 <code>VDST = (SRC0 ^ SRC1) + SRC2</code></p>
2050 2308 }}}
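The Operation lines for the integer instructions added in this revision translate almost directly into C, which can help when checking an operand order or a shift mask by hand. The following is a minimal sketch under stated assumptions, not code from any assembler or emulator: the function names and the `sext16` helper are hypothetical, each source operand is modelled as a plain `uint32_t`, and 16-bit results are returned as `uint16_t` because the listing does not say what happens to the upper half of VDST. The floating-point variants are left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: sign-extend the low 16 bits of v to 32 bits
 * without relying on implementation-defined narrowing conversions. */
static int32_t sext16(uint32_t v)
{
    v &= 0xffffu;
    return (v & 0x8000u) ? (int32_t)v - 0x10000 : (int32_t)v;
}

/* V_ADD3_U32: VDST = SRC0 + SRC1 + SRC2 (wraps modulo 2^32). */
static uint32_t v_add3_u32(uint32_t s0, uint32_t s1, uint32_t s2)
{
    return s0 + s1 + s2;
}

/* V_ADD_LSHL_U32: VDST = (SRC0 + SRC1) << (SRC2 & 31). */
static uint32_t v_add_lshl_u32(uint32_t s0, uint32_t s1, uint32_t s2)
{
    return (s0 + s1) << (s2 & 31u);
}

/* V_LSHL_ADD_U32: VDST = (SRC0 << (SRC1 & 31)) + SRC2. */
static uint32_t v_lshl_add_u32(uint32_t s0, uint32_t s1, uint32_t s2)
{
    return (s0 << (s1 & 31u)) + s2;
}

/* V_XAD_U32: VDST = (SRC0 ^ SRC1) + SRC2. */
static uint32_t v_xad_u32(uint32_t s0, uint32_t s1, uint32_t s2)
{
    return (s0 ^ s1) + s2;
}

/* V_MAD_U32_U16: low 16 bits of SRC0 and SRC1 multiplied as unsigned,
 * then the full 32-bit SRC2 is added. */
static uint32_t v_mad_u32_u16(uint32_t s0, uint32_t s1, uint32_t s2)
{
    return (s0 & 0xffffu) * (s1 & 0xffffu) + s2;
}

/* V_MAD_I32_I16: low 16 bits of SRC0 and SRC1 multiplied as signed;
 * the product (at most 2^30 in magnitude) fits in int32_t. */
static uint32_t v_mad_i32_i16(uint32_t s0, uint32_t s1, uint32_t s2)
{
    return (uint32_t)(sext16(s0) * sext16(s1)) + s2;
}

/* V_MIN3_I16 / V_MAX3_I16: same branch structure as the listing. */
static uint16_t v_min3_i16(uint32_t s0, uint32_t s1, uint32_t s2)
{
    int32_t a = sext16(s0), b = sext16(s1), c = sext16(s2);
    return (uint16_t)((c < a && c < b) ? c : (b < a ? b : a));
}

static uint16_t v_max3_i16(uint32_t s0, uint32_t s1, uint32_t s2)
{
    int32_t a = sext16(s0), b = sext16(s1), c = sext16(s2);
    return (uint16_t)((c > a && c > b) ? c : (b > a ? b : a));
}

/* V_MED3_U16: literal transcription of the listed branches. With tied
 * inputs every strict comparison fails and S0 is returned. */
static uint16_t v_med3_u16(uint32_t s0, uint32_t s1, uint32_t s2)
{
    uint16_t a = (uint16_t)s0, b = (uint16_t)s1, c = (uint16_t)s2;
    if ((c > b && c < a) || (c < b && c > a))
        return c;
    if ((b > c && b < a) || (b < c && b > a))
        return b;
    return a;
}

/* Small sanity checks on values that are easy to verify by hand. */
void gcn14_self_check(void)
{
    assert(v_add3_u32(1u, 2u, 3u) == 6u);
    assert(v_add_lshl_u32(1u, 2u, 4u) == 48u);
    assert(v_lshl_add_u32(3u, 4u, 5u) == 53u);
    assert(v_xad_u32(0xf0u, 0x0fu, 1u) == 0x100u);
    assert(v_mad_u32_u16(3u, 4u, 10u) == 22u);
    assert(v_mad_i32_i16(0xffffu, 2u, 10u) == 8u);
    assert(v_min3_i16(0xffffu, 1u, 2u) == 0xffffu);
    assert(v_max3_i16(0xffffu, 1u, 2u) == 2u);
    assert(v_med3_u16(7u, 3u, 5u) == 5u);
}
```

Calling `gcn14_self_check()` from a test program exercises each helper once. The shift helpers mask the count with 31 exactly as the listing does, so a count of 36 behaves like a count of 4.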