| 2 | |
| 3 | CLRadeonExtender 0.1.8: |
| 4 | |
| 5 | * add chapter about binary formats to CLRX documentation |
| 6 | * add some information about compiling under FreeBSD |
| 7 | * add '.nosectdiffs' to disable the new section difference behaviour when the new ROCm format is chosen |
| 8 | * small optimization in the AsmScope destructor. |
| 9 | * add extra info to the documentation about setting the number of SGPR registers |
| 10 | * fixed OpenCL detection for AMDGPU-PRO |
| 11 | * add '.enum' pseudo-op to simplify defining enumerations |
| 12 | * add CLRX_VERSION_NUMBER and CLRX_POLICY_UNIFIED_SGPR_COUNT |
| 13 | * add policy to unify SGPR counting for all binary formats (disabled by default) |
| 14 | * fix some mistakes about building in the documentation |
| 15 | * add preliminary support for CPU architectures (untested): SPARC, IA64 and MIPS |
| 16 | * add new '.dims' syntax to distinguish vector group ids and scalar local ids |
| 17 | * improve CLZ32/64 for MSVC |
| 18 | * introduce CTZ32/64 |
| 19 | * while disassembling, determine the minimal AMD driver version for the GPU device type |
| 20 | (better code detection while disassembling) |
| 21 | * fixed some typos in documentation |
| 22 | * update list of GPU devices in documentation |
| 23 | * fix stupid and old bug in ImageMix sample |
| 24 | * change the GPU device name for VEGA11 to GFX902 |
| 25 | * fixed segfault when attempting to disassemble old Gallium binaries using the new Gallium binary format |
| 26 | * sort the kernels by offset when disassembling |
| 27 | * better input data checking while disassembling code |
| 28 | * add HSALayout mode for AMDCL2 format (code layout similar to the ROCm and Gallium formats) |
| 29 | * introduce kernel code parts ('.kcode' and '.kcodeend') to AMDCL2 |
| 30 | * check sanity of LDS use in the AMD VEGA architecture (can be used only in SCRATCH and GLOBAL) |
| 31 | * in source code add new types: GPUArchMask, AsmKernelId and AsmSectionId |
| 32 | * allow constant literals in symreg ranges |
| 33 | * fixed symreg ranges checking |
| 34 | * fixed handling of symbol names similar to register names (like exec_masc) |
| 35 | * add new GPU devices to list (gfx904, gfx905, gfx906 and gfx907) |
| 36 | * add AMD VEGA 20 instruction set |
| 37 | * add a lot of code to handle register allocation (it still doesn't work and is not finished) |
| 38 | * add a DTree structure to save memory when storing register allocation structures |
| 39 | * fixed possible segfault while preparing to write when ASMKERN_INNER is present |