Changes between Version 34 and Version 35 of GcnInstrsVop3
- Timestamp: 11/26/17 10:00:26
GcnInstrsVop3
v34 v35 473 473 <tr> 474 474 <th>Opcode</th> 475 <th>GCN 1.2</th> 476 <th>GCN 1.4</th> 477 <th>Mnemonic</th> 475 <th>Mnemonic (GCN 1.2)</th> 476 <th>Mnemonic (GCN 1.4)</th> 478 477 </tr> 479 478 </thead> … … 481 480 <tr> 482 481 <td>448 (0x1c0)</td> 483 <td>✓</td>484 <td>✓</td>485 482 <td>V_MAD_LEGACY_F32</td> 483 <td>V_MAD_LEGACY_F32</td> 486 484 </tr> 487 485 <tr> 488 486 <td>449 (0x1c1)</td> 489 <td>✓</td>490 <td>✓</td>491 487 <td>V_MAD_F32</td> 488 <td>V_MAD_F32</td> 492 489 </tr> 493 490 <tr> 494 491 <td>450 (0x1c2)</td> 495 <td>✓</td>496 <td>✓</td>497 492 <td>V_MAD_I32_I24</td> 493 <td>V_MAD_I32_I24</td> 498 494 </tr> 499 495 <tr> 500 496 <td>451 (0x1c3)</td> 501 <td>✓</td>502 <td>✓</td>503 497 <td>V_MAD_U32_U24</td> 498 <td>V_MAD_U32_U24</td> 504 499 </tr> 505 500 <tr> 506 501 <td>452 (0x1c4)</td> 507 <td>✓</td>508 <td>✓</td>509 502 <td>V_CUBEID_F32</td> 503 <td>V_CUBEID_F32</td> 510 504 </tr> 511 505 <tr> 512 506 <td>453 (0x1c5)</td> 513 <td>✓</td>514 <td>✓</td>515 507 <td>V_CUBESC_F32</td> 508 <td>V_CUBESC_F32</td> 516 509 </tr> 517 510 <tr> 518 511 <td>454 (0x1c6)</td> 519 <td>✓</td>520 <td>✓</td>521 512 <td>V_CUBETC_F32</td> 513 <td>V_CUBETC_F32</td> 522 514 </tr> 523 515 <tr> 524 516 <td>455 (0x1c7)</td> 525 <td>✓</td>526 <td>✓</td>527 517 <td>V_CUBEMA_F32</td> 518 <td>V_CUBEMA_F32</td> 528 519 </tr> 529 520 <tr> 530 521 <td>456 (0x1c8)</td> 531 <td>✓</td>532 <td>✓</td>533 522 <td>V_BFE_U32</td> 523 <td>V_BFE_U32</td> 534 524 </tr> 535 525 <tr> 536 526 <td>457 (0x1c9)</td> 537 <td>✓</td>538 <td>✓</td>539 527 <td>V_BFE_I32</td> 528 <td>V_BFE_I32</td> 540 529 </tr> 541 530 <tr> 542 531 <td>458 (0x1ca)</td> 543 <td>✓</td>544 <td>✓</td>545 532 <td>V_BFI_B32</td> 533 <td>V_BFI_B32</td> 546 534 </tr> 547 535 <tr> 548 536 <td>459 (0x1cb)</td> 549 <td>✓</td>550 <td>✓</td>551 537 <td>V_FMA_F32</td> 538 <td>V_FMA_F32</td> 552 539 </tr> 553 540 <tr> 554 541 <td>460 (0x1cc)</td> 555 <td>✓</td>556 <td>✓</td>557 542 <td>V_FMA_F64</td> 543 <td>V_FMA_F64</td> 558 544 
</tr> 559 545 <tr> 560 546 <td>461 (0x1cd)</td> 561 <td>✓</td>562 <td>✓</td>563 547 <td>V_LERP_U8</td> 548 <td>V_LERP_U8</td> 564 549 </tr> 565 550 <tr> 566 551 <td>462 (0x1ce)</td> 567 <td>✓</td>568 <td>✓</td>569 552 <td>V_ALIGNBIT_B32</td> 553 <td>V_ALIGNBIT_B32</td> 570 554 </tr> 571 555 <tr> 572 556 <td>463 (0x1cf)</td> 573 <td>✓</td>574 <td>✓</td>575 557 <td>V_ALIGNBYTE_B32</td> 558 <td>V_ALIGNBYTE_B32</td> 576 559 </tr> 577 560 <tr> 578 561 <td>464 (0x1d0)</td> 579 <td>✓</td>580 <td>✓</td>581 562 <td>V_MIN3_F32</td> 563 <td>V_MIN3_F32</td> 582 564 </tr> 583 565 <tr> 584 566 <td>465 (0x1d1)</td> 585 <td>✓</td>586 <td>✓</td>587 567 <td>V_MIN3_I32</td> 568 <td>V_MIN3_I32</td> 588 569 </tr> 589 570 <tr> 590 571 <td>466 (0x1d2)</td> 591 <td>✓</td>592 <td>✓</td>593 572 <td>V_MIN3_U32</td> 573 <td>V_MIN3_U32</td> 594 574 </tr> 595 575 <tr> 596 576 <td>467 (0x1d3)</td> 597 <td>✓</td>598 <td>✓</td>599 577 <td>V_MAX3_F32</td> 578 <td>V_MAX3_F32</td> 600 579 </tr> 601 580 <tr> 602 581 <td>468 (0x1d4)</td> 603 <td>✓</td>604 <td>✓</td>605 582 <td>V_MAX3_I32</td> 583 <td>V_MAX3_I32</td> 606 584 </tr> 607 585 <tr> 608 586 <td>469 (0x1d5)</td> 609 <td>✓</td>610 <td>✓</td>611 587 <td>V_MAX3_U32</td> 588 <td>V_MAX3_U32</td> 612 589 </tr> 613 590 <tr> 614 591 <td>470 (0x1d6)</td> 615 <td>✓</td>616 <td>✓</td>617 592 <td>V_MED3_F32</td> 593 <td>V_MED3_F32</td> 618 594 </tr> 619 595 <tr> 620 596 <td>471 (0x1d7)</td> 621 <td>✓</td>622 <td>✓</td>623 597 <td>V_MED3_I32</td> 598 <td>V_MED3_I32</td> 624 599 </tr> 625 600 <tr> 626 601 <td>472 (0x1d8)</td> 627 <td>✓</td>628 <td>✓</td>629 602 <td>V_MED3_U32</td> 603 <td>V_MED3_U32</td> 630 604 </tr> 631 605 <tr> 632 606 <td>473 (0x1d9)</td> 633 <td>✓</td>634 <td>✓</td>635 607 <td>V_SAD_U8</td> 608 <td>V_SAD_U8</td> 636 609 </tr> 637 610 <tr> 638 611 <td>474 (0x1da)</td> 639 <td>✓</td>640 <td>✓</td>641 612 <td>V_SAD_HI_U8</td> 613 <td>V_SAD_HI_U8</td> 642 614 </tr> 643 615 <tr> 644 616 <td>475 (0x1db)</td> 645 <td>✓</td>646 <td>✓</td>647 
617 <td>V_SAD_U16</td> 618 <td>V_SAD_U16</td> 648 619 </tr> 649 620 <tr> 650 621 <td>476 (0x1dc)</td> 651 <td>✓</td>652 <td>✓</td>653 622 <td>V_SAD_U32</td> 623 <td>V_SAD_U32</td> 654 624 </tr> 655 625 <tr> 656 626 <td>477 (0x1dd)</td> 657 <td>✓</td>658 <td>✓</td>659 627 <td>V_CVT_PK_U8_F32</td> 628 <td>V_CVT_PK_U8_F32</td> 660 629 </tr> 661 630 <tr> 662 631 <td>478 (0x1de)</td> 663 <td>✓</td>664 <td>✓</td>665 632 <td>V_DIV_FIXUP_F32</td> 633 <td>V_DIV_FIXUP_F32</td> 666 634 </tr> 667 635 <tr> 668 636 <td>479 (0x1df)</td> 669 <td>✓</td>670 <td>✓</td>671 637 <td>V_DIV_FIXUP_F64</td> 638 <td>V_DIV_FIXUP_F64</td> 672 639 </tr> 673 640 <tr> 674 641 <td>480 (0x1e0)</td> 675 <td>✓</td>676 <td>✓</td>677 642 <td>V_DIV_SCALE_F32 (VOP3B)</td> 643 <td>V_DIV_SCALE_F32 (VOP3B)</td> 678 644 </tr> 679 645 <tr> 680 646 <td>481 (0x1e1)</td> 681 <td>✓</td>682 <td>✓</td>683 647 <td>V_DIV_SCALE_F64 (VOP3B)</td> 648 <td>V_DIV_SCALE_F64 (VOP3B)</td> 684 649 </tr> 685 650 <tr> 686 651 <td>482 (0x1e2)</td> 687 <td>✓</td>688 <td>✓</td>689 652 <td>V_DIV_FMAS_F32</td> 653 <td>V_DIV_FMAS_F32</td> 690 654 </tr> 691 655 <tr> 692 656 <td>483 (0x1e3)</td> 693 <td>✓</td>694 <td>✓</td>695 657 <td>V_DIV_FMAS_F64</td> 658 <td>V_DIV_FMAS_F64</td> 696 659 </tr> 697 660 <tr> 698 661 <td>484 (0x1e4)</td> 699 <td>✓</td>700 <td>✓</td>701 662 <td>V_MSAD_U8</td> 663 <td>V_MSAD_U8</td> 702 664 </tr> 703 665 <tr> 704 666 <td>485 (0x1e5)</td> 705 <td>✓</td>706 <td>✓</td>707 667 <td>V_QSAD_PK_U16_U8</td> 668 <td>V_QSAD_PK_U16_U8</td> 708 669 </tr> 709 670 <tr> 710 671 <td>486 (0x1e6)</td> 711 <td>✓</td>712 <td>✓</td>713 672 <td>V_MQSAD_PK_U16_U8</td> 673 <td>V_MQSAD_PK_U16_U8</td> 714 674 </tr> 715 675 <tr> 716 676 <td>487 (0x1e7)</td> 717 <td>✓</td>718 <td>✓</td>719 677 <td>V_MQSAD_U32_U8</td> 678 <td>V_MQSAD_U32_U8</td> 720 679 </tr> 721 680 <tr> 722 681 <td>488 (0x1e8)</td> 723 <td>✓</td>724 <td>✓</td>725 682 <td>V_MAD_U64_U32 (VOP3B)</td> 683 <td>V_MAD_U64_U32 (VOP3B)</td> 726 684 </tr> 727 685 <tr> 728 686 
<td>489 (0x1e9)</td> 729 <td>✓</td>730 <td>✓</td>731 687 <td>V_MAD_I64_I32 (VOP3B)</td> 688 <td>V_MAD_I64_I32 (VOP3B)</td> 732 689 </tr> 733 690 <tr> 734 691 <td>490 (0x1ea)</td> 735 <td>✓</td>736 <td>✓</td>737 692 <td>V_MAD_F16</td> 693 <td>V_MAD_LEGACY_F16</td> 738 694 </tr> 739 695 <tr> 740 696 <td>491 (0x1eb)</td> 741 <td>✓</td>742 <td>✓</td>743 697 <td>V_MAD_U16</td> 698 <td>V_MAD_LEGACY_U16</td> 744 699 </tr> 745 700 <tr> 746 701 <td>492 (0x1ec)</td> 747 <td>✓</td>748 <td>✓</td>749 702 <td>V_MAD_I16</td> 703 <td>V_MAD_LEGACY_I16</td> 750 704 </tr> 751 705 <tr> 752 706 <td>493 (0x1ed)</td> 753 <td>✓</td>754 <td>✓</td>755 707 <td>V_PERM_B32</td> 708 <td>V_PERM_B32</td> 756 709 </tr> 757 710 <tr> 758 711 <td>494 (0x1ee)</td> 759 <td>✓</td>760 <td>✓</td>761 712 <td>V_FMA_F16</td> 713 <td>V_FMA_LEGACY_F16</td> 762 714 </tr> 763 715 <tr> 764 716 <td>495 (0x1ef)</td> 765 <td>✓</td>766 <td>✓</td>767 717 <td>V_DIV_FIXUP_F16</td> 718 <td>V_DIV_FIXUP_LEGACY_F16</td> 768 719 </tr> 769 720 <tr> 770 721 <td>496 (0x1f0)</td> 771 <td>✓</td>772 <td>✓</td>773 722 <td>V_CVT_PKACCUM_U8_F32</td> 723 <td>V_CVT_PKACCUM_U8_F32</td> 774 724 </tr> 775 725 <tr> 776 726 <td>497 (0x1f1)</td> 777 <td></td> 778 <td>✓</td> 727 <td>--</td> 779 728 <td>V_MAD_U32_U16</td> 780 729 </tr> 781 730 <tr> 782 731 <td>498 (0x1f2)</td> 783 <td></td> 784 <td>✓</td> 732 <td>--</td> 785 733 <td>V_MAD_I32_I16</td> 786 734 </tr> 787 735 <tr> 788 736 <td>499 (0x1f3)</td> 789 <td></td> 790 <td>✓</td> 737 <td>--</td> 791 738 <td>V_XAD_U32</td> 792 739 </tr> 793 740 <tr> 794 741 <td>500 (0x1f4)</td> 795 <td></td> 796 <td>✓</td> 742 <td>--</td> 797 743 <td>V_MIN3_F16</td> 798 744 </tr> 799 745 <tr> 800 746 <td>501 (0x1f5)</td> 801 <td></td> 802 <td>✓</td> 747 <td>--</td> 803 748 <td>V_MIN3_I16</td> 804 749 </tr> 805 750 <tr> 806 751 <td>502 (0x1f6)</td> 807 <td></td> 808 <td>✓</td> 752 <td>--</td> 809 753 <td>V_MIN3_U16</td> 810 754 </tr> 811 755 <tr> 812 756 <td>503 (0x1f7)</td> 813 <td></td> 814 <td>✓</td> 757 
<td>--</td> 815 758 <td>V_MAX3_F16</td> 816 759 </tr> 817 760 <tr> 818 761 <td>504 (0x1f8)</td> 819 <td></td> 820 <td>✓</td> 762 <td>--</td> 821 763 <td>V_MAX3_I16</td> 822 764 </tr> 823 765 <tr> 824 766 <td>505 (0x1f9)</td> 825 <td></td> 826 <td>✓</td> 767 <td>--</td> 827 768 <td>V_MAX3_U16</td> 828 769 </tr> 829 770 <tr> 830 771 <td>506 (0x1fa)</td> 831 <td></td> 832 <td>✓</td> 772 <td>--</td> 833 773 <td>V_MED3_F16</td> 834 774 </tr> 835 775 <tr> 836 776 <td>507 (0x1fb)</td> 837 <td></td> 838 <td>✓</td> 777 <td>--</td> 839 778 <td>V_MED3_I16</td> 840 779 </tr> 841 780 <tr> 842 781 <td>508 (0x1fc)</td> 843 <td></td> 844 <td>✓</td> 782 <td>--</td> 845 783 <td>V_MED3_U16</td> 846 784 </tr> 847 785 <tr> 848 786 <td>509 (0x1fd)</td> 849 <td></td> 850 <td>✓</td> 787 <td>--</td> 851 788 <td>V_LSHL_ADD_U32</td> 852 789 </tr> 853 790 <tr> 854 791 <td>510 (0x1fe)</td> 855 <td></td> 856 <td>✓</td> 792 <td>--</td> 857 793 <td>V_ADD_LSHL_U32</td> 858 794 </tr> 859 795 <tr> 860 796 <td>511 (0x1ff)</td> 861 <td></td> 862 <td>✓</td> 797 <td>--</td> 863 798 <td>V_ADD3_U32</td> 864 799 </tr> 865 800 <tr> 801 <td>512 (0x200)</td> 802 <td>--</td> 803 <td>V_LSHL_OR_B32</td> 804 </tr> 805 <tr> 806 <td>513 (0x201)</td> 807 <td>--</td> 808 <td>V_AND_OR_B32</td> 809 </tr> 810 <tr> 811 <td>514 (0x202)</td> 812 <td>--</td> 813 <td>V_OR3_B32</td> 814 </tr> 815 <tr> 816 <td>515 (0x203)</td> 817 <td>--</td> 818 <td>V_MAD_F16</td> 819 </tr> 820 <tr> 821 <td>516 (0x204)</td> 822 <td>--</td> 823 <td>V_MAD_U16</td> 824 </tr> 825 <tr> 826 <td>517 (0x205)</td> 827 <td>--</td> 828 <td>V_MAD_I16</td> 829 </tr> 830 <tr> 831 <td>518 (0x206)</td> 832 <td>--</td> 833 <td>V_FMA_F16</td> 834 </tr> 835 <tr> 836 <td>519 (0x207)</td> 837 <td>--</td> 838 <td>V_DIV_FIXUP_F16</td> 839 </tr> 840 <tr> 866 841 <td>624 (0x270)</td> 867 <td>✓</td>868 <td>✓</td>869 842 <td>V_INTERP_P1_F32 (VINTRP)</td> 843 <td>V_INTERP_P1_F32 (VINTRP)</td> 870 844 </tr> 871 845 <tr> 872 846 <td>625 (0x271)</td> 873 <td>✓</td>874 
<td>✓</td>875 847 <td>V_INTERP_P2_F32 (VINTRP)</td> 848 <td>V_INTERP_P2_F32 (VINTRP)</td> 876 849 </tr> 877 850 <tr> 878 851 <td>626 (0x272)</td> 879 <td>✓</td>880 <td>✓</td>881 852 <td>V_INTERP_MOV_F32 (VINTRP)</td> 853 <td>V_INTERP_MOV_F32 (VINTRP)</td> 882 854 </tr> 883 855 <tr> 884 856 <td>627 (0x273)</td> 885 <td>✓</td>886 <td>✓</td>887 857 <td>V_INTERP_P1LL_F16 (VINTRP)</td> 858 <td>V_INTERP_P1LL_F16 (VINTRP)</td> 888 859 </tr> 889 860 <tr> 890 861 <td>628 (0x274)</td> 891 <td>✓</td>892 <td>✓</td>893 862 <td>V_INTERP_P1LV_F16 (VINTRP)</td> 863 <td>V_INTERP_P1LV_F16 (VINTRP)</td> 894 864 </tr> 895 865 <tr> 896 866 <td>629 (0x275)</td> 897 <td>✓</td>898 <td>✓</td>899 867 <td>V_INTERP_P2_F16 (VINTRP)</td> 868 <td>V_INTERP_P2_F16 (VINTRP)</td> 900 869 </tr> 901 870 <tr> 902 871 <td>640 (0x280)</td> 903 <td>✓</td>904 <td>✓</td>905 872 <td>V_ADD_F64</td> 873 <td>V_ADD_F64</td> 906 874 </tr> 907 875 <tr> 908 876 <td>641 (0x281)</td> 909 <td>✓</td>910 <td>✓</td>911 877 <td>V_MUL_F64</td> 878 <td>V_MUL_F64</td> 912 879 </tr> 913 880 <tr> 914 881 <td>642 (0x282)</td> 915 <td>✓</td>916 <td>✓</td>917 882 <td>V_MIN_F64</td> 883 <td>V_MIN_F64</td> 918 884 </tr> 919 885 <tr> 920 886 <td>643 (0x283)</td> 921 <td>✓</td>922 <td>✓</td>923 887 <td>V_MAX_F64</td> 888 <td>V_MAX_F64</td> 924 889 </tr> 925 890 <tr> 926 891 <td>644 (0x284)</td> 927 <td>✓</td>928 <td>✓</td>929 892 <td>V_LDEXP_F64</td> 893 <td>V_LDEXP_F64</td> 930 894 </tr> 931 895 <tr> 932 896 <td>645 (0x285)</td> 933 <td>✓</td>934 <td>✓</td>935 897 <td>V_MUL_LO_U32</td> 898 <td>V_MUL_LO_U32</td> 936 899 </tr> 937 900 <tr> 938 901 <td>646 (0x286)</td> 939 <td>✓</td>940 <td>✓</td>941 902 <td>V_MUL_HI_U32</td> 903 <td>V_MUL_HI_U32</td> 942 904 </tr> 943 905 <tr> 944 906 <td>647 (0x287)</td> 945 <td>✓</td>946 <td>✓</td>947 907 <td>V_MUL_HI_I32</td> 908 <td>V_MUL_HI_I32</td> 948 909 </tr> 949 910 <tr> 950 911 <td>648 (0x288)</td> 951 <td>✓</td>952 <td>✓</td>953 912 <td>V_LDEXP_F32</td> 913 <td>V_LDEXP_F32</td> 954 914 
</tr> 955 915 <tr> 956 916 <td>649 (0x289)</td> 957 <td>✓</td>958 <td>✓</td>959 917 <td>V_READLANE_B32</td> 918 <td>V_READLANE_B32</td> 960 919 </tr> 961 920 <tr> 962 921 <td>650 (0x28a)</td> 963 <td>✓</td>964 <td>✓</td>965 922 <td>V_WRITELANE_B32</td> 923 <td>V_WRITELANE_B32</td> 966 924 </tr> 967 925 <tr> 968 926 <td>651 (0x28b)</td> 969 <td>✓</td>970 <td>✓</td>971 927 <td>V_BCNT_U32_B32</td> 928 <td>V_BCNT_U32_B32</td> 972 929 </tr> 973 930 <tr> 974 931 <td>652 (0x28c)</td> 975 <td>✓</td>976 <td>✓</td>977 932 <td>V_MBCNT_LO_U32_B32</td> 933 <td>V_MBCNT_LO_U32_B32</td> 978 934 </tr> 979 935 <tr> 980 936 <td>653 (0x28d)</td> 981 <td>✓</td>982 <td>✓</td>983 937 <td>V_MBCNT_HI_U32_B32</td> 938 <td>V_MBCNT_HI_U32_B32</td> 984 939 </tr> 985 940 <tr> 986 941 <td>654 (0x28e)</td> 987 <td>✓</td>988 <td>✓</td>989 942 <td>V_MAC_LEGACY_F32</td> 943 <td>V_MAC_LEGACY_F32</td> 990 944 </tr> 991 945 <tr> 992 946 <td>655 (0x28f)</td> 993 <td>✓</td>994 <td>✓</td>995 947 <td>V_LSHLREV_B64</td> 948 <td>V_LSHLREV_B64</td> 996 949 </tr> 997 950 <tr> 998 951 <td>656 (0x290)</td> 999 <td>✓</td>1000 <td>✓</td>1001 952 <td>V_LSHRREV_B64</td> 953 <td>V_LSHRREV_B64</td> 1002 954 </tr> 1003 955 <tr> 1004 956 <td>657 (0x291)</td> 1005 <td>✓</td>1006 <td>✓</td>1007 957 <td>V_ASHRREV_I64</td> 958 <td>V_ASHRREV_I64</td> 1008 959 </tr> 1009 960 <tr> 1010 961 <td>658 (0x292)</td> 1011 <td>✓</td>1012 <td>✓</td>1013 962 <td>V_TRIG_PREOP_F64</td> 963 <td>V_TRIG_PREOP_F64</td> 1014 964 </tr> 1015 965 <tr> 1016 966 <td>659 (0x293)</td> 1017 <td>✓</td>1018 <td>✓</td>1019 967 <td>V_BFM_B32</td> 968 <td>V_BFM_B32</td> 1020 969 </tr> 1021 970 <tr> 1022 971 <td>660 (0x294)</td> 1023 <td>✓</td>1024 <td>✓</td>1025 972 <td>V_CVT_PKNORM_I16_F32</td> 973 <td>V_CVT_PKNORM_I16_F32</td> 1026 974 </tr> 1027 975 <tr> 1028 976 <td>661 (0x295)</td> 1029 <td>✓</td>1030 <td>✓</td>1031 977 <td>V_CVT_PKNORM_U16_F32</td> 978 <td>V_CVT_PKNORM_U16_F32</td> 1032 979 </tr> 1033 980 <tr> 1034 981 <td>662 (0x296)</td> 1035 
<td>✓</td>1036 <td>✓</td>1037 982 <td>V_CVT_PKRTZ_F16_F32</td> 983 <td>V_CVT_PKRTZ_F16_F32</td> 1038 984 </tr> 1039 985 <tr> 1040 986 <td>663 (0x297)</td> 1041 <td>✓</td>1042 <td>✓</td>1043 987 <td>V_CVT_PK_U16_U32</td> 988 <td>V_CVT_PK_U16_U32</td> 1044 989 </tr> 1045 990 <tr> 1046 991 <td>664 (0x298)</td> 1047 <td>✓</td>1048 <td>✓</td>1049 992 <td>V_CVT_PK_I16_I32</td> 993 <td>V_CVT_PK_I16_I32</td> 1050 994 </tr> 1051 995 <tr> 1052 996 <td>665 (0x299)</td> 1053 <td></td>1054 <td>✓</td>1055 997 <td>V_CVT_PKNORM_I16_F16</td> 998 <td>V_CVT_PKNORM_I16_F16</td> 1056 999 </tr> 1057 1000 <tr> 1058 1001 <td>666 (0x29a)</td> 1059 <td></td>1060 <td>✓</td>1061 1002 <td>V_CVT_PKNORM_U16_F16</td> 1003 <td>V_CVT_PKNORM_U16_F16</td> 1062 1004 </tr> 1063 1005 <tr> 1064 1006 <td>667 (0x29b)</td> 1065 <td></td>1066 <td>✓</td>1067 1007 <td>V_READLANE_REGRD_B32</td> 1008 <td>V_READLANE_REGRD_B32</td> 1068 1009 </tr> 1069 1010 <tr> 1070 1011 <td>668 (0x29c)</td> 1071 <td></td> 1072 <td>✓</td> 1012 <td>--</td> 1073 1013 <td>V_ADD_I32</td> 1074 1014 </tr> 1075 1015 <tr> 1076 1016 <td>669 (0x29d)</td> 1077 <td></td> 1078 <td>✓</td> 1017 <td>--</td> 1079 1018 <td>V_SUB_I32</td> 1080 1019 </tr> 1081 1020 <tr> 1082 1021 <td>670 (0x29e)</td> 1083 <td></td> 1084 <td>✓</td> 1022 <td>--</td> 1085 1023 <td>V_ADD_I16</td> 1086 1024 </tr> 1087 1025 <tr> 1088 1026 <td>671 (0x29f)</td> 1089 <td></td> 1090 <td>✓</td> 1027 <td>--</td> 1091 1028 <td>V_SUB_I16</td> 1092 1029 </tr> 1093 1030 <tr> 1094 1031 <td>672 (0x2a0)</td> 1095 <td></td> 1096 <td>✓</td> 1032 <td>--</td> 1097 1033 <td>V_PACK_B32_F16</td> 1098 1034 </tr> … … 1102 1038 <p>Alphabetically sorted instruction list:</p> 1103 1039 <h4>V_ADD_F64</h4> 1104 <p>Opcode: 356 (0x164) for GCN 1.0/1.1; 640 (0x280) for GCN 1.2 <br />1040 <p>Opcode: 356 (0x164) for GCN 1.0/1.1; 640 (0x280) for GCN 1.2/1.4<br /> 1105 1041 Syntax: V_ADD_F64 VDST(2), SRC0(2), SRC1(2)<br /> 1106 1042 Description: Add two double FP value from SRC0 and SRC1 and store result 
to VDST.<br /> … … 1152 1088 <code>VDST = (SRC0 + SRC1) << (SRC2&31)</code></p> 1153 1089 <h4>V_ALIGNBIT_B32</h4> 1154 <p>Opcode: 334 (0x14e) for GCN 1.0/1.1; 462 (0x1ce) for GCN 1.2 <br />1090 <p>Opcode: 334 (0x14e) for GCN 1.0/1.1; 462 (0x1ce) for GCN 1.2/1.4<br /> 1155 1091 Syntax: V_ALIGNBIT_B32 VDST, SRC0, SRC1, SRC2<br /> 1156 1092 Description: Align bit. Shift right bits in 64-bit stored in SRC1 (low part) and … … 1159 1095 <code>VDST = ((((UINT64)SRC0)<<32) | SRC1) >> (SRC2&31)</code></p> 1160 1096 <h4>V_ALIGNBYTE_B32</h4> 1161 <p>Opcode: 335 (0x14f) for GCN 1.0/1.1; 463 (0x1cf) for GCN 1.2 <br />1097 <p>Opcode: 335 (0x14f) for GCN 1.0/1.1; 463 (0x1cf) for GCN 1.2/1.4<br /> 1162 1098 Syntax: V_ALIGNBYTE_B32 VDST, SRC0, SRC1, SRC2<br /> 1163 1099 Description: Align byte. Shift right bits in 64-bit stored in SRC1 (low part) and … … 1165 1101 Operation:<br /> 1166 1102 <code>VDST = ((((UINT64)SRC0)<<32) | SRC1) >> ((SRC2&3)*8)</code></p> 1103 <h4>V_AND_OR_B32</h4> 1104 <p>Opcode: 513 (0x201) for GCN 1.4<br /> 1105 Syntax: V_AND_OR_B32 VDST, SRC0, SRC1, SRC2<br /> 1106 Description: Compute bitwise AND of SRC0 and SRC1, then bitwise OR of the result with SRC2, 1107 and store the result to VDST.<br /> 1108 Operation:<br /> 1109 <code>VDST = (SRC0 & SRC1) | SRC2</code></p> 1167 1110 <h4>V_ASHR_I64</h4> 1168 1111 <p>Opcode: 355 (0x163) for GCN 1.0/1.1<br /> … … 1172 1115 <code>VDST = (INT64)SRC0 >> (SRC1&63)</code></p> 1173 1116 <h4>V_ASHRREV_I64</h4> 1174 <p>Opcode: 657 (0x291) for GCN 1.2 <br />1117 <p>Opcode: 657 (0x291) for GCN 1.2/1.4<br /> 1175 1118 Syntax: V_ASHRREV_I64 VDST(2), SRC0, SRC1(2)<br /> 1176 1119 Description: Arithmetic shift right SRC1 by (SRC0&63) bits and store result into VDST.<br /> … … 1178 1121 <code>VDST = (INT64)SRC1 >> (SRC0&63)</code></p> 1179 1122 <h4>V_BCNT_U32_B32</h4> 1180 <p>Opcode: 651 (0x28b) for GCN 1.2 <br />1123 <p>Opcode: 651 (0x28b) for GCN 1.2/1.4<br /> 1181 1124 Syntax: V_BCNT_U32_B32 VDST, SRC0, SRC1<br /> 1182 1125 Description: 
Count bits in SRC0, adds SRC1, and store result to VDST.<br /> … … 1184 1127 <code>VDST = SRC1 + BITCOUNT(SRC0)</code></p> 1185 1128 <h4>V_BFE_I32</h4> 1186 <p>Opcode: 329 (0x149) for GCN 1.0/1.1; 457 (0x1c9) for GCN 1.2 <br />1129 <p>Opcode: 329 (0x149) for GCN 1.0/1.1; 457 (0x1c9) for GCN 1.2/1.4<br /> 1187 1130 Syntax: V_BFE_I32 VDST, SRC0, SRC1, SRC2<br /> 1188 1131 Description: Extracts bits in SRC0 from range (SRC1&31) with length (SRC2&31) … … 1198 1141 VDST = (INT32)SRC0 >> shift</code></p> 1199 1142 <h4>V_BFE_U32</h4> 1200 <p>Opcode: 328 (0x148) for GCN 1.0/1.1; 456 (0x1c8) for GCN 1.2 <br />1143 <p>Opcode: 328 (0x148) for GCN 1.0/1.1; 456 (0x1c8) for GCN 1.2/1.4<br /> 1201 1144 Syntax: V_BFE_U32 VDST, SRC0, SRC1, SRC2<br /> 1202 1145 Description: Extracts bits in SRC0 from range SRC1&31 with length SRC2&31, and … … 1212 1155 VDST = SRC0 >> shift</code></p> 1213 1156 <h4>V_BFI_B32</h4> 1214 <p>Opcode: 330 (0x14a) for GCN 1.0/1.1; 458 (0x1ca) for GCN 1.2 <br />1157 <p>Opcode: 330 (0x14a) for GCN 1.0/1.1; 458 (0x1ca) for GCN 1.2/1.4<br /> 1215 1158 Syntax: V_BFI_B32 VDST, SRC0, SRC1, SRC2<br /> 1216 1159 Description: Replace bits in SRC2 by bits from SRC1 marked by bits in SRC0, and store result … … 1219 1162 <code>VDST = (SRC0 & SRC1) | (~SRC0 & SRC2)</code></p> 1220 1163 <h4>V_BFM_B32</h4> 1221 <p>Opcode: 659 (0x293) for GCN 1.2 <br />1164 <p>Opcode: 659 (0x293) for GCN 1.2/1.4<br /> 1222 1165 Syntax: V_BFM_B32 VDST, SRC0, SRC1<br /> 1223 1166 Description: Make 32-bit bitmask from (SRC1 & 31) bit that have length (SRC0 & 31) and … … 1226 1169 <code>VDST = ((1U << (SRC0&31))-1) << (SRC1&31)</code></p> 1227 1170 <h4>V_CUBEID_F32</h4> 1228 <p>Opcode: 324 (0x144) for GCN 1.0/1.1; 452 (0x1c4) for GCN 1.2 <br />1171 <p>Opcode: 324 (0x144) for GCN 1.0/1.1; 452 (0x1c4) for GCN 1.2/1.4<br /> 1229 1172 Syntax: V_CUBEID_F32 VDST, SRC0, SRC1, SRC2<br /> 1230 1173 Description: Cubemap face identification. 
Determine face by comparing three single FP … … 1246 1189 VDST = OUT</code></p> 1247 1190 <h4>V_CUBEMA_F32</h4> 1248 <p>Opcode: 327 (0x147) for GCN 1.0/1.1; 455 (0x1c7) for GCN 1.2 <br />1191 <p>Opcode: 327 (0x147) for GCN 1.0/1.1; 455 (0x1c7) for GCN 1.2/1.4<br /> 1249 1192 Syntax: V_CUBEMA_F32 VDST, SRC0, SRC1, SRC2<br /> 1250 1193 Description: Cubemap Major Axis. Choose highest absolute value from all three FP values … … 1262 1205 VDST = OUT</code></p> 1263 1206 <h4>V_CUBESC_F32</h4> 1264 <p>Opcode: 325 (0x145) for GCN 1.0/1.1; 453 (0x1c5) for GCN 1.2 <br />1207 <p>Opcode: 325 (0x145) for GCN 1.0/1.1; 453 (0x1c5) for GCN 1.2/1.4<br /> 1265 1208 Syntax: V_CUBESC_F32 VDST, SRC0, SRC1, SRC2<br /> 1266 1209 Description: Cubemap S coordination. Algorithm below.<br /> … … 1277 1220 VDST = OUT</code></p> 1278 1221 <h4>V_CUBETC_F32</h4> 1279 <p>Opcode: 326 (0x146) for GCN 1.0/1.1; 454 (0x1c6) for GCN 1.2 <br />1222 <p>Opcode: 326 (0x146) for GCN 1.0/1.1; 454 (0x1c6) for GCN 1.2/1.4<br /> 1280 1223 Syntax: V_CUBETC_F32 VDST, SRC0, SRC1, SRC2<br /> 1281 1224 Description: Cubemap T coordination. 
Algorithm below.<br /> … … 1292 1235 VDST = OUT</code></p> 1293 1236 <h4>V_CVT_PK_I16_I32</h4> 1294 <p>Opcode: 664 (0x298) for GCN 1.2 <br />1237 <p>Opcode: 664 (0x298) for GCN 1.2/1.4<br /> 1295 1238 Syntax: V_CVT_PK_I16_I32 VDST, SRC0, SRC1<br /> 1296 1239 Description: Convert signed value from SRC0 and SRC1 to signed 16-bit values with … … 1301 1244 VDST = D0 | (((UINT32)D1) << 16)</code></p> 1302 1245 <h4>V_CVT_PK_U16_U32</h4> 1303 <p>Opcode: 663 (0x297) for GCN 1.2 <br />1246 <p>Opcode: 663 (0x297) for GCN 1.2/1.4<br /> 1304 1247 Syntax: V_CVT_PK_U16_U32 VDST, SRC0, SRC1<br /> 1305 1248 Description: Convert unsigned value from SRC0 and SRC1 to unsigned 16-bit values with … … 1310 1253 VDST = D0 | (((UINT32)D1) << 16)</code></p> 1311 1254 <h4>V_CVT_PK_U8_F32</h4> 1312 <p>Opcode: 350 (0x15e) for GCN 1.0/1.1; 477 (0x1dd) for GCN 1.2 <br />1255 <p>Opcode: 350 (0x15e) for GCN 1.0/1.1; 477 (0x1dd) for GCN 1.2/1.4<br /> 1313 1256 Syntax: V_CVT_PK_U8_F32 VDST, SRC0, SRC1, SRC2<br /> 1314 1257 Description: Convert floating point value from SRC0 to unsigned byte value with … … 1324 1267 VDST = (SRC2&~mask) | (((UINT32)VAL8) << shift)</code></p> 1325 1268 <h4>V_CVT_PKACCUM_U8_F32</h4> 1326 <p>Opcode: 496 (0x1f0) for GCN 1.2 <br />1269 <p>Opcode: 496 (0x1f0) for GCN 1.2/1.4<br /> 1327 1270 Syntax: V_CVT_PKACCUM_U8_F32 VDST, SRC0, SRC1<br /> 1328 1271 Description: Convert floating point value from SRC0 to unsigned byte value with … … 1352 1295 VDST = roundNorm(ASHALF(SRC0)) | ((UINT32)roundNorm(ASHALF(SRC1)) << 16)</code></p> 1353 1296 <h4>V_CVT_PKNORM_I16_F32</h4> 1354 <p>Opcode: 660 (0x294) for GCN 1.2 <br />1297 <p>Opcode: 660 (0x294) for GCN 1.2/1.4<br /> 1355 1298 Syntax: V_CVT_PKNORM_I16_F32 VDST, SRC0, SRC1<br /> 1356 1299 Description: Convert normalized FP value from SRC0 and SRC1 to signed 16-bit integers with … … 1382 1325 VDST = roundNorm(ASHALF(SRC0)) | ((UINT32)roundNorm(ASHALF(SRC1)) << 16)</code></p> 1383 1326 <h4>V_CVT_PKNORM_U16_F32</h4> 1384 <p>Opcode: 
661 (0x295) for GCN 1.2 <br />1327 <p>Opcode: 661 (0x295) for GCN 1.2/1.4<br /> 1385 1328 Syntax: V_CVT_PKNORM_U16_F32 VDST, SRC0, SRC1<br /> 1386 1329 Description: Convert normalized FP value from SRC0 and SRC1 to unsigned 16-bit integers with … … 1397 1340 VDST = roundNorm(ASFLOAT(SRC0)) | ((UINT32)roundNorm(ASFLOAT(SRC1)) << 16)</code></p> 1398 1341 <h4>V_CVT_PKRTZ_F16_F32</h4> 1399 <p>Opcode: 662 (0x296) for GCN 1.2 <br />1342 <p>Opcode: 662 (0x296) for GCN 1.2/1.4<br /> 1400 1343 Syntax: V_CVT_PKRTZ_F16_F32 VDST, SRC0, SRC1<br /> 1401 1344 Description: Convert normalized FP value from SRC0 and SRC1 to half floating point values with … … 1407 1350 VDST = D0 | (((UINT32)D1) << 16)</code></p> 1408 1351 <h4>V_DIV_FIXUP_F16</h4> 1409 <p>Opcode: 495 (0x1ef) for GCN 1.2 <br />1352 <p>Opcode: 495 (0x1ef) for GCN 1.2; 519 (0x207) for GCN 1.4<br /> 1410 1353 Syntax: V_DIV_FIXUP_F16 VDST, SRC0, SRC1, SRC2<br /> 1411 1354 Description: Handle all exceptions required for half floating point division. … … 1432 1375 VDST = SF0</code></p> 1433 1376 <h4>V_DIV_FIXUP_F32</h4> 1434 <p>Opcode: 351 (0x15f) for GCN 1.0/1.1; 478 (0x1de) for GCN 1.2 <br />1377 <p>Opcode: 351 (0x15f) for GCN 1.0/1.1; 478 (0x1de) for GCN 1.2/1.4<br /> 1435 1378 Syntax: V_DIV_FIXUP_F32 VDST, SRC0, SRC1, SRC2<br /> 1436 1379 Description: Handle all exceptions required for single floating point division. … … 1457 1400 VDST = SF0</code></p> 1458 1401 <h4>V_DIV_FIXUP_F64</h4> 1459 <p>Opcode: 352 (0x160) for GCN 1.0/1.1; 479 (0x1df) for GCN 1.2 <br />1402 <p>Opcode: 352 (0x160) for GCN 1.0/1.1; 479 (0x1df) for GCN 1.2/1.4<br /> 1460 1403 Syntax: V_DIV_FIXUP_F64 VDST(2), SRC0(2), SRC1(2), SRC2(2)<br /> 1461 1404 Description: Handle all exceptions required for double floating point division. 
… … 1481 1424 else 1482 1425 VDST = SF0</code></p> 1426 <h4>V_DIV_FIXUP_LEGACY_F16</h4> 1427 <p>Opcode: 495 (0x1ef) for GCN 1.4<br /> 1428 Syntax: V_DIV_FIXUP_LEGACY_F16 VDST, SRC0, SRC1, SRC2<br /> 1429 Description: Handle all exceptions required for half floating point division. 1430 SRC0 is quotient, SRC1 is denominator, SRC2 is numerator. The corrected result is stored to VDST.<br /> 1431 Operation:<br /> 1432 <code>HALF SF0 = ASHALF(SRC0) 1433 HALF SF1 = ASHALF(SRC1) 1434 HALF SF2 = ASHALF(SRC2) 1435 if (ISNAN(SF1) && !ISNAN(SF2)) 1436 VDST = QUIETNAN(SF1) 1437 else if (ISNAN(SF2)) 1438 VDST = QUIETNAN(SF2) 1439 else if (SF1 == 0.0 && SF2 == 0.0) 1440 VDST = NAN_H 1441 else if (ABS(SF1)==INF && ABS(SF2)==INF) 1442 VDST = -NAN_H 1443 else if (SF1 == 0.0) 1444 VDST = INF_H*SIGN(SF1)*SIGN(SF2) 1445 else if (ABS(SF1) == INF) 1446 VDST = SIGN(SF1)*SIGN(SF2) >=0 ? 0.0 : -0.0 1447 else if (ISNAN(SF0)) 1448 VDST = SIGN(SF1)*SIGN(SF2)*INF_H 1449 else 1450 VDST = SF0</code></p> 1483 1451 <h4>V_DIV_FMAS_F32</h4> 1484 <p>Opcode: 367 (0x16f) for GCN 1.0/1.1; 482 (0x1e2) for GCN 1.2 <br />1452 <p>Opcode: 367 (0x16f) for GCN 1.0/1.1; 482 (0x1e2) for GCN 1.2/1.4<br /> 1485 1453 Syntax: V_DIV_FMAS_F32 VDST, SRC0, SRC1, SRC2<br /> 1486 1454 Description: Special case divide FMA with scale and flags. … … 1497 1465 VDST = ASFLOAT(VDST)*POW(-2.0,64)</code></p> 1498 1466 <h4>V_DIV_FMAS_F64</h4> 1499 <p>Opcode: 368 (0x170) for GCN 1.0/1.1; 483 (0x1e3) for GCN 1.2 <br />1467 <p>Opcode: 368 (0x170) for GCN 1.0/1.1; 483 (0x1e3) for GCN 1.2/1.4<br /> 1500 1468 Syntax: V_DIV_FMAS_F64 VDST(2), SRC0(2), SRC1(2), SRC2(2)<br /> 1501 1469 Description: Special case divide FMA with scale and flags. 
… … 1513 1481 VDST = ASDOUBLE(VDST)*POW(-2.0,128)</code></p> 1514 1482 <h4>V_DIV_SCALE_F32</h4> 1515 <p>Opcode (VOP3B): 365 (0x16d) for GCN 1.0/1.1; 480 (0x1e0) for GCN 1.2 <br />1483 <p>Opcode (VOP3B): 365 (0x16d) for GCN 1.0/1.1; 480 (0x1e0) for GCN 1.2/1.4<br /> 1516 1484 Syntax: V_DIV_SCALE_F32 VDST, SDST(2), SRC0, SRC1, SRC2<br /> 1517 1485 Description: Special case divide preop and flags. SRC0 is quotient, SRC1 is denominator, … … 1548 1516 }</code></p> 1549 1517 <h4>V_DIV_SCALE_F64</h4> 1550 <p>Opcode (VOP3B): 366 (0x16e) for GCN 1.0/1.1; 481 (0x1e1) for GCN 1.2 <br />1518 <p>Opcode (VOP3B): 366 (0x16e) for GCN 1.0/1.1; 481 (0x1e1) for GCN 1.2/1.4<br /> 1551 1519 Syntax: V_DIV_SCALE_F64 VDST(2), SDST(2), SRC0(2), SRC1(2), SRC2(2)<br /> 1552 1520 Description: Special case divide preop and flags. SRC0 is quotient, SRC1 is denominator, … … 1583 1551 }</code></p> 1584 1552 <h4>V_FMA_F16</h4> 1585 <p>Opcode: 494 (0x1ee) for GCN 1.2 <br />1553 <p>Opcode: 494 (0x1ee) for GCN 1.2; 518 (0x206) for GCN 1.4<br /> 1586 1554 Syntax: V_FMA_F16 VDST, SRC0, SRC1, SRC2<br /> 1587 1555 Description: Fused multiply addition on half floating point values from … … 1591 1559 VDST = FMA(ASHALF(SRC0), ASHALF(SRC1), ASHALF(SRC2))</code></p> 1592 1560 <h4>V_FMA_F32</h4> 1593 <p>Opcode: 331 (0x14b) for GCN 1.0/1.1; 459 (0x1cb) for GCN 1.2 <br />1561 <p>Opcode: 331 (0x14b) for GCN 1.0/1.1; 459 (0x1cb) for GCN 1.2/1.4<br /> 1594 1562 Syntax: V_FMA_F32 VDST, SRC0, SRC1, SRC2<br /> 1595 1563 Description: Fused multiply addition on single floating point values from … … 1599 1567 VDST = FMA(ASFLOAT(SRC0), ASFLOAT(SRC1), ASFLOAT(SRC2))</code></p> 1600 1568 <h4>V_FMA_F64</h4> 1601 <p>Opcode: 332 (0x14c) for GCN 1.0/1.1; 460 (0x1cc) for GCN 1.2 <br />1569 <p>Opcode: 332 (0x14c) for GCN 1.0/1.1; 460 (0x1cc) for GCN 1.2/1.4<br /> 1602 1570 Syntax: V_FMA_F64 VDST(2), SRC0(2), SRC1(2), SRC2(2)<br /> 1603 1571 Description: Fused multiply addition on double floating point values from … … 1606 1574 
<code>// SRC0*SRC1+SRC2 1607 1575 VDST = FMA(ASDOUBLE(SRC0), ASDOUBLE(SRC1), ASDOUBLE(SRC2))</code></p> 1576 <h4>V_FMA_LEGACY_F16</h4> 1577 <p>Opcode: 494 (0x1ee) for GCN 1.4<br /> 1578 Syntax: V_FMA_LEGACY_F16 VDST, SRC0, SRC1, SRC2<br /> 1579 Description: Fused multiply addition on half floating point values from 1580 SRC0, SRC1 and SRC2. Result stored in VDST.<br /> 1581 Operation:<br /> 1582 <code>// SRC0*SRC1+SRC2 1583 VDST = FMA(ASHALF(SRC0), ASHALF(SRC1), ASHALF(SRC2))</code></p> 1608 1584 <h4>V_LDEXP_F32</h4> 1609 <p>Opcode: 648 (0x288) for GCN 1.2 <br />1585 <p>Opcode: 648 (0x288) for GCN 1.2/1.4<br /> 1610 1586 Syntax: V_LDEXP_F32 VDST, SRC0, SRC1<br /> 1611 1587 Description: Do ldexp operation on SRC0 and SRC1 (multiply SRC0 by 2**(SRC1)). … … 1614 1590 <code>VDST = ASFLOAT(SRC0) * POW(2.0, (INT32)SRC1)</code></p> 1615 1591 <h4>V_LDEXP_F64</h4> 1616 <p>Opcode: 360 (0x168) for GCN 1.0/1.1; 644 (0x284) for GCN 1.2 <br />1592 <p>Opcode: 360 (0x168) for GCN 1.0/1.1; 644 (0x284) for GCN 1.2/1.4<br /> 1617 1593 Syntax: V_LDEXP_F64 VDST(2), SRC0(2), SRC1<br /> 1618 1594 Description: Do ldexp operation on SRC0 and SRC1 (multiply SRC0 by 2**(SRC1)). 
… … 1621 1597 <code>VDST = ASDOUBLE(SRC0) * POW(2.0, (INT32)SRC1)</code></p> 1622 1598 <h4>V_LERP_U8</h4> 1623 <p>Opcode: 333 (0x14d) for GCN 1.0/1.1; 461 (0x1cd) for GCN 1.2 <br />1599 <p>Opcode: 333 (0x14d) for GCN 1.0/1.1; 461 (0x1cd) for GCN 1.2/1.4<br /> 1624 1600 Syntax: V_LERP_U8 VDST, SRC0, SRC1, SRC2<br /> 1625 1601 Description: For each byte of dword, calculate average from SRC0 byte and SRC1 byte with … … 1646 1622 Operation:<br /> 1647 1623 <code>VDST = SRC0 << (SRC1&63)</code></p> 1624 <h4>V_LSHL_OR_B32</h4> 1625 <p>Opcode: 512 (0x200) for GCN 1.4<br /> 1626 Syntax: V_LSHL_OR_B32 VDST, SRC0, SRC1, SRC2<br /> 1627 Description: Shift left SRC0 by (SRC1&31) bits and make bitwise OR with SRC2 1628 and store result to VDST.<br /> 1629 Operation:<br /> 1630 <code>VDST = (SRC0 << (SRC1&31)) | SRC2</code></p> 1648 1631 <h4>V_LSHLREV_B64</h4> 1649 <p>Opcode: 655 (0x28f) for GCN 1.2 <br />1632 <p>Opcode: 655 (0x28f) for GCN 1.2/1.4<br /> 1650 1633 Syntax: V_LSHLREV_B64 VDST(2), SRC0, SRC1(2)<br /> 1651 1634 Description: Shift left SRC1 by (SRC0&63) bits and store result into VDST.<br /> … … 1659 1642 <code>VDST = SRC0 >> (SRC1&63)</code></p> 1660 1643 <h4>V_LSHRREV_B64</h4> 1661 <p>Opcode: 656 (0x290) for GCN 1.2 <br />1644 <p>Opcode: 656 (0x290) for GCN 1.2/1.4<br /> 1662 1645 Syntax: V_LSHRREV_B64 VDST(2), SRC0, SRC1(2)<br /> 1663 1646 Description: Shift right SRC1 by (SRC0&63) bits and store result into VDST.<br /> … … 1665 1648 <code>VDST = SRC1 >> (SRC0&63)</code></p> 1666 1649 <h4>V_MAC_LEGACY_F32</h4> 1667 <p>Opcode: 654 (0x28e) for GCN 1.2 <br />1650 <p>Opcode: 654 (0x28e) for GCN 1.2/1.4<br /> 1668 1651 Syntax: V_MAC_LEGACY_F32 VDST, SRC0, SRC1<br /> 1669 1652 Description: Multiply FP value from SRC0 by FP value from SRC1 and add result to VDST. 
… … 1673 1656 VDST = ASFLOAT(SRC0) * ASFLOAT(SRC1) + ASFLOAT(VDST)</code></p> 1674 1657 <h4>V_MAD_F16</h4> 1675 <p>Opcode: 490 (0x1ea) for GCN 1.2 <br />1658 <p>Opcode: 490 (0x1ea) for GCN 1.2; 515 (0x203) for GCN 1.4<br /> 1676 1659 Syntax: V_MAD_F16 VDST, SRC0, SRC1, SRC2<br /> 1677 1660 Description: Multiply half FP value from SRC0 by half FP value from … … 1681 1664 <code>VDST = ASHALF(SRC0) * ASHALF(SRC1) + ASHALF(SRC2)</code></p> 1682 1665 <h4>V_MAD_F32</h4> 1683 <p>Opcode: 321 (0x141) for GCN 1.0/1.1; 449 (0x1c1) for GCN 1.2 <br />1666 <p>Opcode: 321 (0x141) for GCN 1.0/1.1; 449 (0x1c1) for GCN 1.2/1.4<br /> 1684 1667 Syntax: V_MAD_F32 VDST, SRC0, SRC1, SRC2<br /> 1685 1668 Description: Multiply FP value from SRC0 by FP value from SRC1 and add SRC2, and store … … 1688 1671 <code>VDST = ASFLOAT(SRC0) * ASFLOAT(SRC1) + ASFLOAT(SRC2)</code></p> 1689 1672 <h4>V_MAD_I16</h4> 1690 <p>Opcode: 492 (0x1ec) for GCN 1.2 <br />1673 <p>Opcode: 492 (0x1ec) for GCN 1.2; 517 (0x205) for GCN 1.4<br /> 1691 1674 Syntax: V_MAD_I16 VDST, SRC0, SRC1, SRC2<br /> 1692 1675 Description: Multiply 16-bit signed value from SRC0 by 16-bit signed value from 1693 SRC1 and add 16-bit signed value from SRC2, and store 16-bit signed result to VDST.<br /> 1694 Operation:<br /> 1695 <code>VDST = (INT16)((INT16)SRC0*(INT16)SRC1 + (INT16)SRC2)</code></p> 1676 SRC1 and add 16-bit signed value from SRC2, and store 16-bit signed result to VDST. 1677 If CLAMP modifier supplied, then result is saturated to 16-bit signed value.<br /> 1678 Operation:<br /> 1679 <code>UINT32 temp = (SEXT32((INT16)SRC0)*(INT16)SRC1 + (INT16)SRC2) 1680 VDST = CLAMP ? 
MIN(MAX(temp, -32768), 32767) : temp&0xffff</code></p> 1696 1681 <h4>V_MAD_I32_I16</h4> 1697 1682 <p>Opcode: 498 (0x1f2) for GCN 1.4<br /> … … 1702 1687 <code>VDST = (UINT32)(SEXT32((INT16)SRC0)*(INT16)SRC1) + SRC2</code></p> 1703 1688 <h4>V_MAD_I32_I24</h4> 1704 <p>Opcode: 322 (0x142) for GCN 1.0/1.1; 450 (0x1c2) for GCN 1.2 <br />1689 <p>Opcode: 322 (0x142) for GCN 1.0/1.1; 450 (0x1c2) for GCN 1.2/1.4<br /> 1705 1690 Syntax: V_MAD_I32_I24 VDST, SRC0, SRC1, SRC2<br /> 1706 1691 Description: Multiply 24-bit signed integer value from SRC0 by 24-bit signed value from … … 1711 1696 VDST = V0 * V1 + SRC2</code></p> 1712 1697 <h4>V_MAD_I64_I32</h4> 1713 <p>Opcode (VOP3B): 375 (0x177) for GCN 1.1; 489 (0x1e9) for GCN 1.2 <br />1698 <p>Opcode (VOP3B): 375 (0x177) for GCN 1.1; 489 (0x1e9) for GCN 1.2/1.4<br /> 1714 1699 Syntax: V_MAD_I64_I32 VDST(2), SDST(2), SRC0, SRC1, SRC2(2)<br /> 1715 1700 Description: Multiply 32-bit signed integer value from SRC0 by 32-bit signed value … … 1722 1707 UINT64 mask = (1ULL<<LANEID) 1723 1708 //SDST = (SDST&~mask) | ((?????) ? mask : 0)</code></p> 1709 <h4>V_MAD_LEGACY_F16</h4> 1710 <p>Opcode: 490 (0x1ea) for GCN 1.4<br /> 1711 Syntax: V_MAD_LEGACY_F16 VDST, SRC0, SRC1, SRC2<br /> 1712 Description: Multiply half FP value from SRC0 by half FP value from 1713 SRC1 and add SRC2, and store result to VDST. 
1714 It applies the OMOD modifier to the result and flushes denormals.<br /> 1715 Operation:<br /> 1716 <code>VDST = ASHALF(SRC0) * ASHALF(SRC1) + ASHALF(SRC2)</code></p> 1724 1717 <h4>V_MAD_LEGACY_F32</h4> 1725 <p>Opcode: 320 (0x140) for GCN 1.0/1.1; 448 (0x1c0) for GCN 1.2 <br />1718 <p>Opcode: 320 (0x140) for GCN 1.0/1.1; 448 (0x1c0) for GCN 1.2/1.4<br /> 1726 1719 Syntax: V_MAD_LEGACY_F32 VDST, SRC0, SRC1, SRC2<br /> 1727 1720 Description: Multiply FP value from SRC0 by FP value from SRC1 and add result to SRC2, and … … 1732 1725 <code>if (ASFLOAT(SRC0)!=0.0 && ASFLOAT(SRC1)!=0.0) 1733 1726 VDST = ASFLOAT(SRC0) * ASFLOAT(SRC1) + ASFLOAT(SRC2)</code></p> 1727 <h4>V_MAD_LEGACY_I16</h4> 1728 <p>Opcode: 492 (0x1ec) for GCN 1.4<br /> 1729 Syntax: V_MAD_LEGACY_I16 VDST, SRC0, SRC1, SRC2<br /> 1730 Description: Multiply 16-bit signed value from SRC0 by 16-bit signed value from 1731 SRC1 and add 16-bit signed value from SRC2, and store 16-bit signed result to VDST. 1732 If the CLAMP modifier is supplied, the result is saturated to a 16-bit signed value.<br /> 1733 Operation:<br /> 1734 <code>UINT32 temp = (SEXT32((INT16)SRC0)*(INT16)SRC1 + (INT16)SRC2) 1735 VDST = CLAMP ? MIN(MAX(temp, -32768), 32767) : temp&0xffff</code></p> 1736 <h4>V_MAD_LEGACY_U16</h4> 1737 <p>Opcode: 491 (0x1eb) for GCN 1.4<br /> 1738 Syntax: V_MAD_LEGACY_U16 VDST, SRC0, SRC1, SRC2<br /> 1739 Description: Multiply 16-bit unsigned value from SRC0 by 16-bit unsigned value from 1740 SRC1 and add 16-bit unsigned value from SRC2, and store 16-bit unsigned result to VDST. 1741 If the CLAMP modifier is supplied, the result is saturated to a 16-bit unsigned value.<br /> 1742 Operation:<br /> 1743 <code>UINT32 temp = ((UINT16)SRC0*(UINT16)SRC1 + (UINT16)SRC2) 1744 VDST = CLAMP ? 
MIN(temp, 0xffff) : (temp&0xffff)</code></p> 1734 1745 <h4>V_MAD_U16</h4> 1735 <p>Opcode: 491 (0x1eb) for GCN 1.2 <br />1746 <p>Opcode: 491 (0x1eb) for GCN 1.2; 516 (0x204) for GCN 1.4<br /> 1736 1747 Syntax: V_MAD_U16 VDST, SRC0, SRC1, SRC2<br /> 1737 1748 Description: Multiply 16-bit unsigned value from SRC0 by 16-bit unsigned value from 1738 SRC1 and add 16-bit unsigned value from SRC2, and store 16-bit unsigned result to VDST.<br /> 1739 Operation:<br /> 1740 <code>VDST = ((UINT16)SRC0*(UINT16)SRC1 + (UINT16)SRC2) & 0xffff</code></p> 1749 SRC1 and add 16-bit unsigned value from SRC2, and store 16-bit unsigned result to VDST. 1750 If the CLAMP modifier is supplied, the result is saturated to a 16-bit unsigned value.<br /> 1751 Operation:<br /> 1752 <code>UINT32 temp = ((UINT16)SRC0*(UINT16)SRC1 + (UINT16)SRC2) 1753 VDST = CLAMP ? MIN(temp, 0xffff) : (temp&0xffff)</code></p> 1741 1754 <h4>V_MAD_U32_U16</h4> 1742 1755 <p>Opcode: 497 (0x1f1) for GCN 1.4<br /> … … 1747 1760 <code>VDST = (UINT32)((SRC0&0xffff)*(SRC1&0xffff)) + SRC2</code></p> 1748 1761 <h4>V_MAD_U32_U24</h4> 1749 <p>Opcode: 323 (0x143) for GCN 1.0/1.1; 451 (0x1c3) for GCN 1.2 <br />1762 <p>Opcode: 323 (0x143) for GCN 1.0/1.1; 451 (0x1c3) for GCN 1.2/1.4<br /> 1750 1763 Syntax: V_MAD_U32_U24 VDST, SRC0, SRC1, SRC2<br /> 1751 1764 Description: Multiply 24-bit unsigned integer value from SRC0 by 24-bit unsigned value … … 1754 1767 <code>VDST = (UINT32)(SRC0&0xffffff) * (UINT32)(SRC1&0xffffff) + SRC2</code></p> 1755 1768 <h4>V_MAD_U64_U32</h4> 1756 <p>Opcode (VOP3B): 374 (0x176) for GCN 1.1; 488 (0x1e8) for GCN 1.2 <br />1769 <p>Opcode (VOP3B): 374 (0x176) for GCN 1.1; 488 (0x1e8) for GCN 1.2/1.4<br /> 1757 1770 Syntax: V_MAD_U64_U32 VDST(2), SDST(2), SRC0, SRC1, SRC2(2)<br /> 1758 1771 Description: Multiply 32-bit unsigned integer value from SRC0 by 32-bit unsigned value … … 1766 1779 SDST = (SDST&~mask) | ((VDST < PROD) ? 
mask : 0)</code></p> 1767 1780 <h4>V_MAX_F64</h4> 1768 <p>Opcode: 359 (0x167) for GCN 1.0/1.1; 643 (0x283) for GCN 1.2 <br />1781 <p>Opcode: 359 (0x167) for GCN 1.0/1.1; 643 (0x283) for GCN 1.2/1.4<br /> 1769 1782 Syntax: V_MAX_F64 VDST(2), SRC0(2), SRC1(2)<br /> 1770 1783 Description: Choose largest double FP value from SRC0 and SRC1, and store result to VDST.<br /> … … 1791 1804 VDST = MAX(SF1, SF0)</code></p> 1792 1805 <h4>V_MAX3_F32</h4> 1793 <p>Opcode: 340 (0x154) for GCN 1.0/1.1; 467 (0x1d3) for GCN 1.2 <br />1806 <p>Opcode: 340 (0x154) for GCN 1.0/1.1; 467 (0x1d3) for GCN 1.2/1.4<br /> 1794 1807 Syntax: V_MAX3_F32 VDST, SRC0, SRC1, SRC2<br /> 1795 1808 Description: Choose largest value from FP values SRC0, SRC1, SRC2, and store it to VDST.<br /> … … 1819 1832 VDST = (UINT16)MAX((INT16)SRC1, (INT16)SRC0)</code></p> 1820 1833 <h4>V_MAX3_I32</h4> 1821 <p>Opcode: 341 (0x155) for GCN 1.0/1.1; 468 (0x1d4) for GCN 1.2 <br />1834 <p>Opcode: 341 (0x155) for GCN 1.0/1.1; 468 (0x1d4) for GCN 1.2/1.4<br /> 1822 1835 Syntax: V_MAX3_I32 VDST, SRC0, SRC1, SRC2<br /> 1823 1836 Description: Choose largest value from signed integer values SRC0, SRC1, SRC2, … … 1839 1852 VDST = MAX((UINT16)SRC1, (UINT16)SRC0)</code></p> 1840 1853 <h4>V_MAX3_U32</h4> 1841 <p>Opcode: 342 (0x156) for GCN 1.0/1.1; 469 (0x1d5) for GCN 1.2 <br />1854 <p>Opcode: 342 (0x156) for GCN 1.0/1.1; 469 (0x1d5) for GCN 1.2/1.4<br /> 1842 1855 Syntax: V_MAX3_U32 VDST, SRC0, SRC1, SRC2<br /> 1843 1856 Description: Choose largest value from unsigned integer values SRC0, SRC1, SRC2, … … 1849 1862 VDST = MAX(SRC1, SRC0)</code></p> 1850 1863 <h4>V_MBCNT_HI_U32_B32</h4> 1851 <p>Opcode: 653 (0x28d) for GCN 1.2 <br />1864 <p>Opcode: 653 (0x28d) for GCN 1.2/1.4<br /> 1852 1865 Syntax: V_MBCNT_HI_U32_B32 VDST, SRC0, SRC1<br /> 1853 1866 Description: Make mask for all lanes ending at current lane, … … 1858 1871 VDST = SRC1 + BITCOUNT(MASK)</code></p> 1859 1872 <h4>V_MBCNT_LO_U32_B32</h4> 1860 <p>Opcode: 652 (0x28c) for 
GCN 1.2 <br />1873 <p>Opcode: 652 (0x28c) for GCN 1.2/1.4<br /> 1861 1874 Syntax: V_MBCNT_LO_U32_B32 VDST, SRC0, SRC1<br /> 1862 1875 Description: Make mask for all lanes ending at current lane, … … 1888 1901 VDST = SF0</code></p> 1889 1902 <h4>V_MED3_F32</h4> 1890 <p>Opcode: 343 (0x157) for GCN 1.0/1.1; 470 (0x1d6) for GCN 1.2 <br />1903 <p>Opcode: 343 (0x157) for GCN 1.0/1.1; 470 (0x1d6) for GCN 1.2/1.4<br /> 1891 1904 Syntax: V_MED3_F32 VDST, SRC0, SRC1, SRC2<br /> 1892 1905 Description: Choose medium value from FP values SRC0, SRC1, SRC2, and store it to VDST.<br /> … … 1923 1936 VDST = (UINT16)S0</code></p> 1924 1937 <h4>V_MED3_I32</h4> 1925 <p>Opcode: 344 (0x158) for GCN 1.0/1.1; 471 (0x1d7) for GCN 1.2 <br />1938 <p>Opcode: 344 (0x158) for GCN 1.0/1.1; 471 (0x1d7) for GCN 1.2/1.4<br /> 1926 1939 Syntax: V_MED3_I32 VDST, SRC0, SRC1, SRC2<br /> 1927 1940 Description: Choose medium value from signed integer values SRC0, SRC1, SRC2, … … 1953 1966 VDST = S0</code></p> 1954 1967 <h4>V_MED3_U32</h4> 1955 <p>Opcode: 345 (0x159) for GCN 1.0/1.1; 472 (0x1d8) for GCN 1.2 <br />1968 <p>Opcode: 345 (0x159) for GCN 1.0/1.1; 472 (0x1d8) for GCN 1.2/1.4<br /> 1956 1969 Syntax: V_MED3_U32 VDST, SRC0, SRC1, SRC2<br /> 1957 1970 Description: Choose medium value from unsigned integer values SRC0, SRC1, SRC2, … … 1965 1978 VDST = SRC0</code></p> 1966 1979 <h4>V_MIN_F64</h4> 1967 <p>Opcode: 358 (0x166) for GCN 1.0/1.1; 642 (0x282) for GCN 1.2 <br />1980 <p>Opcode: 358 (0x166) for GCN 1.0/1.1; 642 (0x282) for GCN 1.2/1.4<br /> 1968 1981 Syntax: V_MIN_F64 VDST(2), SRC0(2), SRC1(2)<br /> 1969 1982 Description: Choose smallest double FP value from SRC0 and SRC1, and store result to VDST.<br /> … … 1990 2003 VDST = MIN(SF1, SF0)</code></p> 1991 2004 <h4>V_MIN3_F32</h4> 1992 <p>Opcode: 337 (0x151) for GCN 1.0/1.1; 464 (0x1d0) for GCN 1.2 <br />2005 <p>Opcode: 337 (0x151) for GCN 1.0/1.1; 464 (0x1d0) for GCN 1.2/1.4<br /> 1993 2006 Syntax: V_MIN3_F32 VDST, SRC0, SRC1, SRC2<br /> 1994 
2007 Description: Choose smallest value from FP values SRC0, SRC1, SRC2, and store it to VDST.<br /> … … 2018 2031 VDST = (UINT16)MIN((INT16)SRC1, (INT16)SRC0)</code></p> 2019 2032 <h4>V_MIN3_I32</h4> 2020 <p>Opcode: 338 (0x152) for GCN 1.0/1.1; 465 (0x1d1) for GCN 1.2 <br />2033 <p>Opcode: 338 (0x152) for GCN 1.0/1.1; 465 (0x1d1) for GCN 1.2/1.4<br /> 2021 2034 Syntax: V_MIN3_I32 VDST, SRC0, SRC1, SRC2<br /> 2022 2035 Description: Choose smallest value from signed integer values SRC0, SRC1, SRC2, … … 2038 2051 VDST = MIN((UINT16)SRC1, (UINT16)SRC0)</code></p> 2039 2052 <h4>V_MIN3_U32</h4> 2040 <p>Opcode: 339 (0x153) for GCN 1.0/1.1; 466 (0x1d2) for GCN 1.2 <br />2053 <p>Opcode: 339 (0x153) for GCN 1.0/1.1; 466 (0x1d2) for GCN 1.2/1.4<br /> 2041 2054 Syntax: V_MIN3_U32 VDST, SRC0, SRC1, SRC2<br /> 2042 2055 Description: Choose smallest value from unsigned integer values SRC0, SRC1, SRC2, … … 2048 2061 VDST = MIN(SRC1, SRC0)</code></p> 2049 2062 <h4>V_MQSAD_U32_U8</h4> 2050 <p>Opcode: 373 (0x175) for GCN 1.1; 487 (0x1e7) for GCN 1.2 <br />2063 <p>Opcode: 373 (0x175) for GCN 1.1; 487 (0x1e7) for GCN 1.2/1.4<br /> 2051 2064 Syntax: V_MQSAD_U32_U8 VDST(4), SRC0(2), SRC1, SRC2(4)<br /> 2052 2065 Description: Compute four masked sums of absolute differences with accumulation. 
… … 2068 2081 VDST |= (MSADU8((UINT32)(SRC0>>24), SRC1, SRC2>>96)<<96</code></p> 2069 2082 <h4>V_MQSAD_U8, V_MQSAD_PK_U16_U8</h4> 2070 <p>Opcode: 371 (0x173) for GCN 1.0/1.1; 486 (0x1e6) for GCN 1.2 <br />2083 <p>Opcode: 371 (0x173) for GCN 1.0/1.1; 486 (0x1e6) for GCN 1.2/1.4<br /> 2071 2084 Syntax (GCN 1.0): V_MQSAD_U8 VDST(2), SRC0(2), SRC1, SRC2(2)<br /> 2072 2085 Syntax (GCN 1.1/1.2): V_MQSAD_PK_U16_U8 VDST(2), SRC0(2), SRC1, SRC2(2)<br /> … … 2089 2102 VDST |= (MSADU8((UINT32)(SRC0>>24), SRC1, (SRC2>>48) & 0xffff)<<48</code></p> 2090 2103 <h4>V_MSAD_U8</h4> 2091 <p>Opcode: 369 (0x171) for GCN 1.0/1.1; 484 (0x1e4) for GCN 1.2 <br />2104 <p>Opcode: 369 (0x171) for GCN 1.0/1.1; 484 (0x1e4) for GCN 1.2/1.4<br /> 2092 2105 Syntax: V_MSAD_U8 VDST, SRC0, SRC1, SRC2<br /> 2093 2106 Description: Calculate sum of absolute differences in SRC0 and SRC1 for bytes that have … … 2099 2112 VDST += ABS(((SRC0 >> (i*8)) & 0xff) - ((SRC1 >> (i*8)) & 0xff))</code></p> 2100 2113 <h4>V_MUL_F64</h4> 2101 <p>Opcode: 357 (0x165) for GCN 1.0/1.1; 641 (0x281) for GCN 1.2 <br />2114 <p>Opcode: 357 (0x165) for GCN 1.0/1.1; 641 (0x281) for GCN 1.2/1.4<br /> 2102 2115 Syntax: V_MUL_F64 VDST(2), SRC0(2), SRC1(2)<br /> 2103 2116 Description: Multiply two double FP values from SRC0 and SRC1 and store result to VDST.<br /> … … 2105 2118 <code>VDST = ASDOUBLE(SRC0) * ASDOUBLE(SRC1)</code></p> 2106 2119 <h4>V_MUL_HI_I32</h4> 2107 <p>Opcode: 364 (0x16c) for GCN 1.0/1.1; 647 (0x287) for GCN 1.2 <br />2120 <p>Opcode: 364 (0x16c) for GCN 1.0/1.1; 647 (0x287) for GCN 1.2/1.4<br /> 2108 2121 Syntax: V_MUL_HI_I32 VDST, SRC0, SRC1<br /> 2109 2122 Description: Multiply 32-bit signed value SRC0 and SRC1, and store higher part of … … 2112 2125 <code>VDST = ((INT64)SRC0 * (INT32)SRC1) >> 32</code></p> 2113 2126 <h4>V_MUL_HI_U32</h4> 2114 <p>Opcode: 362 (0x16a) for GCN 1.0/1.1; 646 (0x286) for GCN 1.2 <br />2127 <p>Opcode: 362 (0x16a) for GCN 1.0/1.1; 646 (0x286) for GCN 1.2/1.4<br /> 2115 2128 Syntax: 
V_MUL_HI_U32 VDST, SRC0, SRC1<br /> 2116 2129 Description: Multiply 32-bit unsigned value SRC0 and SRC1, and store higher part of … … 2126 2139 <code>VDST = (INT32)SRC0 * (INT32)SRC1</code></p> 2127 2140 <h4>V_MUL_LO_U32</h4> 2128 <p>Opcode: 361 (0x169) for GCN 1.0/1.1; 645 (0x285) for GCN 1.2 <br />2141 <p>Opcode: 361 (0x169) for GCN 1.0/1.1; 645 (0x285) for GCN 1.2/1.4<br /> 2129 2142 Syntax: V_MUL_LO_U32 VDST, SRC0, SRC1<br /> 2130 2143 Description: Multiply 32-bit unsigned value SRC0 and SRC1, and store lower part of … … 2147 2160 VDST = ASFLOAT(SRC0) * ASFLOAT(SRC1) 2148 2161 }</code></p> 2162 <h4>V_OR3_B32</h4> 2163 <p>Opcode: 514 (0x202) for GCN 1.4<br /> 2164 Syntax: V_OR3_B32 VDST, SRC0, SRC1, SRC2<br /> 2165 Description: Make bitwise OR with SRC0, SRC1 and SRC2 and store result to VDST.<br /> 2166 Operation:<br /> 2167 <code>VDST = SRC0 | SRC1 | SRC2</code></p> 2149 2168 <h4>V_PACK_B32_F16</h4> 2150 2169 <p>Opcode: 672 (0x2a0) for GCN 1.4<br /> … … 2155 2174 <code>VDST = (SRC0&0xffff) | (SRC1<<16)</code></p> 2156 2175 <h4>V_PERM_B32</h4> 2157 <p>Opcode: 493 (0x1ed) for GCN 1.2 <br />2176 <p>Opcode: 493 (0x1ed) for GCN 1.2/1.4<br /> 2158 2177 Syntax: V_PERM_B32 VDST, SRC0, SRC1, SRC2<br /> 2159 2178 Description: Permute bytes. Choose for every byte in dword, specified value. 
Bytes in … … 2180 2199 }</code></p> 2181 2200 <h4>V_QSAD_U8, V_QSAD_PK_U16_U8</h4> 2182 <p>Opcode: 370 (0x172) for GCN 1.0/1.1; 485 (0x1e5) for GCN 1.2 <br />2201 <p>Opcode: 370 (0x172) for GCN 1.0/1.1; 485 (0x1e5) for GCN 1.2/1.4<br /> 2183 2202 Syntax (GCN 1.0): V_QSAD_U8 VDST(2), SRC0(2), SRC1, SRC2(2)<br /> 2184 2203 Syntax (GCN 1.1/1.2): V_QSAD_PK_U16_U8 VDST(2), SRC0(2), SRC1, SRC2(2)<br /> … … 2199 2218 VDST |= (SADU8((UINT32)(SRC0>>24), SRC1, (SRC2>>48) & 0xffff)<<48</code></p> 2200 2219 <h4>V_READLANE_B32</h4> 2201 <p>Opcode: 649 (0x289) for GCN 1.2 <br />2220 <p>Opcode: 649 (0x289) for GCN 1.2/1.4<br /> 2202 2221 Syntax: V_READLANE_B32 SDST, VSRC0, SSRC1<br /> 2203 2222 Description: Copy one VSRC0 lane value to one SDST. Lane (thread id) chosen from SSRC1&63. … … 2206 2225 <code>SDST = VSRC0[SSRC1 & 63]</code></p> 2207 2226 <h4>V_SAD_HI_U8</h4> 2208 <p>Opcode: 347 (0x15b) for GCN 1.0/1.1; 474 (0x1da) for GCN 1.2 <br />2227 <p>Opcode: 347 (0x15b) for GCN 1.0/1.1; 474 (0x1da) for GCN 1.2/1.4<br /> 2209 2228 Syntax: V_SAD_HI_U8 VDST, SRC0, SRC1, SRC2<br /> 2210 2229 Description: Calculate sum of absolute differences for all four bytes in SRC0 and SRC1, … … 2215 2234 VDST += (ABS(((SRC0 >> (i*8)) & 0xff) - ((SRC1 >> (i*8)) & 0xff)))<<16</code></p> 2216 2235 <h4>V_SAD_U16</h4> 2217 <p>Opcode: 348 (0x15c) for GCN 1.0/1.1; 475 (0x1db) for GCN 1.2 <br />2236 <p>Opcode: 348 (0x15c) for GCN 1.0/1.1; 475 (0x1db) for GCN 1.2/1.4<br /> 2218 2237 Syntax: V_SAD_U16 VDST, SRC0, SRC1, SRC2<br /> 2219 2238 Description: Calculate sum of absolute differences for two 16-bit words in SRC0 and SRC1, … … 2224 2243 VDST += ABS((SRC0 >> 16) - (SRC1 >> 16))</code></p> 2225 2244 <h4>V_SAD_U32</h4> 2226 <p>Opcode: 349 (0x15d) for GCN 1.0/1.1; 476 (0x1dc) for GCN 1.2 <br />2245 <p>Opcode: 349 (0x15d) for GCN 1.0/1.1; 476 (0x1dc) for GCN 1.2/1.4<br /> 2227 2246 Syntax: V_SAD_U32 VDST, SRC0, SRC1, SRC2<br /> 2228 2247 Description: Calculate sum of absolute difference for SRC0 and SRC1, add 
add … … 2231 2250 <code>VDST = SRC2 + ABS(SRC0 - SRC1)</code></p> 2232 2251 <h4>V_SAD_U8</h4> 2233 <p>Opcode: 346 (0x15a) for GCN 1.0/1.1; 473 (0x1d9) for GCN 1.2 <br />2252 <p>Opcode: 346 (0x15a) for GCN 1.0/1.1; 473 (0x1d9) for GCN 1.2/1.4<br /> 2234 2253 Syntax: V_SAD_U8 VDST, SRC0, SRC1, SRC2<br /> 2235 2254 Description: Calculate sum of absolute differences for all four bytes in SRC0 and SRC1, add … … 2273 2292 }</code></p> 2274 2293 <h4>V_TRIG_PREOP_F64</h4> 2275 <p>Opcode: 372 (0x174) for GCN 1.0/1.1; 658 (0x292) for GCN 1.2 <br />2294 <p>Opcode: 372 (0x174) for GCN 1.0/1.1; 658 (0x292) for GCN 1.2/1.4<br /> 2276 2295 Syntax: V_TRIG_PREOP_F64 VDST(2), SRC0(2), SRC1<br /> 2277 2296 Description: D.d = Look Up 2/PI (S0.d) with segment select S1.u[4:0]. … … 2293 2312 VDST = (DOUBLE)(TWOPERPI[BIT:BIT+52]) * POW(2.0, -BIT-53)</code></p> 2294 2313 <h4>V_WRITELANE_B32</h4> 2295 <p>Opcode: 650 (0x28a) for GCN 1.2 <br />2314 <p>Opcode: 650 (0x28a) for GCN 1.2/1.4<br /> 2296 2315 Syntax: V_WRITELANE_B32 VDST, VSRC0, SSRC1<br /> 2297 2316 Description: Copy SGPR to one lane of VDST. Lane choosen (thread id) from SSRC1&63.