| 1296 | <h4>DS_BPERMUTE_B32</h4> |
| 1297 | <p>Opcode: 63 (0x3f) for GCN 1.2<br /> |
| 1298 | Syntax: DS_BPERMUTE_B32 DST, ADDR, SRC [OFFSET:OFFSET]<br /> |
| 1299 | Description: Backward permutation for wave. Put the value of SRC from the lane |
| 1300 | whose id is read from <code>ADDR[(LANEID + (OFFSET>>2)) & 63]</code> |
| 1301 | into the DST register of LANEID. ADDR holds the lane id multiplied by 4 (size of dword). |
| 1302 | Implements pop semantics: “read data from lane i”.<br /> |
| 1303 | Operation:<br /> |
| 1304 | <code>UINT32 tmp[64] |
| 1305 | for (BYTE i = 0; i < 64; i++) |
| 1306 | { |
| 1307 | UINT32 laneid = (ADDR[(i + (OFFSET>>2)) & 63] >> 2) & 63 |
| 1308 | tmp[i] = ((EXEC & (1ULL<<laneid)) != 0) ? SRC[laneid] : 0 |
| 1309 | } |
| 1310 | for (BYTE i = 0; i < 64; i++) |
| 1311 | if ((EXEC & (1ULL<<i)) != 0) |
| 1312 | DST[i] = tmp[i]</code></p> |
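<p>The pull behaviour described for DS_BPERMUTE_B32 can be checked with a small software model. The following is a minimal sketch, not the hardware definition: it assumes OFFSET is 0, takes ADDR as a byte address (lane id multiplied by 4, as the description states), and represents each vector register as a Python list of 64 values. The names <code>ds_bpermute_b32</code> and <code>WAVE_SIZE</code> are hypothetical and exist only for this illustration.</p>

```python
WAVE_SIZE = 64  # lanes per wave (assumed, matching the 64-iteration loops above)

def ds_bpermute_b32(addr, src, exec_mask, dst):
    """Hypothetical model of DS_BPERMUTE_B32 with OFFSET = 0; returns the new DST."""
    tmp = [0] * WAVE_SIZE
    for i in range(WAVE_SIZE):
        # ADDR[i] is a byte address (lane id * 4): divide by 4 and wrap to 64 lanes.
        lane = (addr[i] >> 2) & (WAVE_SIZE - 1)
        # Reading from an inactive source lane yields 0.
        tmp[i] = src[lane] if (exec_mask >> lane) & 1 else 0
    # Only active lanes write their DST; inactive lanes keep the old value.
    return [tmp[i] if (exec_mask >> i) & 1 else dst[i] for i in range(WAVE_SIZE)]

# Reverse the wave: lane i reads from lane 63 - i.
full = (1 << WAVE_SIZE) - 1
src = [100 + i for i in range(WAVE_SIZE)]
addr = [(WAVE_SIZE - 1 - i) * 4 for i in range(WAVE_SIZE)]
reversed_wave = ds_bpermute_b32(addr, src, full, [0] * WAVE_SIZE)
print(reversed_wave[:3])  # [163, 162, 161]
```

<p>Because every lane chooses its own source, several lanes may read the same source lane, so a broadcast needs no special handling in this model.</p>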
| 2010 | <h4>DS_PERMUTE_B32</h4> |
| 2011 | <p>Opcode: 62 (0x3e) for GCN 1.2<br /> |
| 2012 | Syntax: DS_PERMUTE_B32 DST, ADDR, SRC [OFFSET:OFFSET]<br /> |
| 2013 | Description: Forward permutation for wave. Put the value of SRC from LANEID into the DST register of |
| 2014 | the lane whose id is read from <code>ADDR[(LANEID + (OFFSET>>2)) & 63]</code>. |
| 2015 | ADDR holds the lane id multiplied by 4 (size of dword). Implements push semantics: |
| 2016 | "put my lane data in lane i".<br /> |
| 2017 | Operation:<br /> |
| 2018 | <code>UINT32 tmp[64] |
| 2019 | for (BYTE i = 0; i < 64; i++) |
| 2020 | tmp[(ADDR[(i + (OFFSET>>2)) & 63] >> 2) & 63] = ((EXEC & (1ULL<<i)) != 0) ? SRC[i] : 0 |
| 2021 | for (BYTE i = 0; i < 64; i++) |
| 2022 | if ((EXEC & (1ULL<<i)) != 0) |
| 2023 | DST[i] = tmp[i]</code></p> |
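<p>The push behaviour described for DS_PERMUTE_B32 can be modelled the same way in software. This is a minimal sketch under stated assumptions rather than the hardware definition: OFFSET is taken to be 0, ADDR is a byte address (lane id multiplied by 4, as the description states), and each vector register is a Python list of 64 values. The names <code>ds_permute_b32</code> and <code>WAVE_SIZE</code> are hypothetical. Following the Operation pseudocode above, an inactive lane pushes 0 to its target.</p>

```python
WAVE_SIZE = 64  # lanes per wave (assumed, matching the 64-iteration loops above)

def ds_permute_b32(addr, src, exec_mask, dst):
    """Hypothetical model of DS_PERMUTE_B32 with OFFSET = 0; returns the new DST."""
    tmp = [0] * WAVE_SIZE
    for i in range(WAVE_SIZE):
        # ADDR[i] is a byte address (lane id * 4): divide by 4 and wrap to 64 lanes.
        target = (addr[i] >> 2) & (WAVE_SIZE - 1)
        # As in the Operation pseudocode, an inactive lane pushes 0.
        tmp[target] = src[i] if (exec_mask >> i) & 1 else 0
    # Only active lanes write their DST; inactive lanes keep the old value.
    return [tmp[i] if (exec_mask >> i) & 1 else dst[i] for i in range(WAVE_SIZE)]

# Rotate the wave by one: lane i pushes its value to lane (i + 1) mod 64.
full = (1 << WAVE_SIZE) - 1
src = [100 + i for i in range(WAVE_SIZE)]
addr = [((i + 1) % WAVE_SIZE) * 4 for i in range(WAVE_SIZE)]
rotated = ds_permute_b32(addr, src, full, [0] * WAVE_SIZE)
print(rotated[:3])  # [163, 100, 101]
```

<p>When several lanes push to the same target, this sequential model keeps the write from the highest-numbered lane. That ordering is a property of the sketch only; the text above does not state what the hardware guarantees in that case.</p>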