| 2 | |
| 3 | CLRadeonExtender 0.1.7: |
| 4 | |
| 5 | * update AmdCL2ABI chapter |
| 6 | * fixed kernel arguments sizes in GalliumCompute binary format |
| 7 | * add new GPU devices gfx902-gfx905 |
| 8 | * update device tables for Amd Crimson drivers |
| 9 | * small fixes in DynLibrary interface |
| 10 | * add relocations to GalliumCompute binary format (for scratch buffer symbols) |
| 11 | * make getXXXDisasmInputFromBinaryXX a public interface |
| 12 | * speed up evaluation of simple expressions without symbols |
| 13 | * add '.for' and '.while' pseudo-ops ('for' and 'while' loops) |
| 14 | * fixed some grammar/typos in CLRX documentation |
| 15 | * add GPU device names from ROCm-OpenCL |
| 16 | * handle new ROCm binary format with YAML metadata (assembler and disassembler) |
| 17 | * add a few pseudo-ops to ROCm handling |
| 18 | * add new pseudo-ops to set parameters in ROCm YAML metadata |
| 19 | * fixes in GalliumCompute binary generator (for conformance with standards) |
| 20 | * add '.reqd_work_group_size' pseudo-op (equivalent of '.cws') |
| 21 | * add support for work_group_size_hint and vec_type_hint in Amd OpenCL 2.0 binary format |
| 22 | * some small bug fixes in ROCm disassembler |
| 23 | * updates in README.md and INSTALL files |
| 24 | * small sanitizations in DisasmAmd, DisasmAmdCL2 (argument type checking) |
| 25 | * change behaviour of '.cws' (.reqd_work_group_size) while setting default values |
| 26 | * add calculation of section differences in expressions (for ROCm handling) |
| 27 | * fixed invalid reads (potential segfault) after undefining symbol |
| 28 | * fixed old stupid bug: resolve symbol value by using the new value (or, if the symbol is |
| 29 | undefined, do not resolve it) instead of the old unresolved symbol value later, when the |
| 30 | expression has been evaluated |
| 31 | * add GOT table handling in ROCm binary format |
| 32 | * add new option '--newROCmBinFormat' |
| 33 | * add untested support for ROCm in CLHelper and VectorAdd sample |
| 34 | * add support for multiple OpenCL platforms in CLHelper and samples |
| 35 | * allow setting call_convention to 0xffffffff in AMDHSA config |
| 36 | * handle special cases with relatives while evaluating binary/logical operators |
| 37 | * small fixes in CLRX documentation and Unix manuals |
| 38 | * developing unfinished AsmRegAlloc |
| 39 | * add a missing access qualifier to images 'read_write' for AMD OpenCL 2.0 |
| 40 | |