Change Log
CLRadeonExtender 0.1.9:
- add AMD Navi support for assembler and disassembler
- add shorter addressing of FLAT/GLOBAL/SCRATCH
- add literal immediate for SMRD addressing for GCN1.1
- add Amd3 OpenCL binary format for AMD Navi for AMD OpenCL implementation
- include specific extension in device name for ROCm-OpenCL platform
CLRadeonExtender 0.1.8:
- add chapter about binary formats to CLRX documentation
- add some information about compilation under FreeBSD
- add '.nosectdiffs' to disable new section difference behaviour if the new ROCm format is chosen
- small optimization in the AsmScope destructor
- add extra info about setting the number of SGPR registers in documentation
- fixed OpenCL detection for AMDGPU-PRO
- add '.enum' pseudo-op to simplify defining enumerations
- add CLRX_VERSION_NUMBER and CLRX_POLICY_UNIFIED_SGPR_COUNT
- add policy to unify SGPR counting for all binary formats (by default disabled)
- in documentation fix some mistakes about building
- add preliminary support for CPU architectures (untested): SPARC, IA64 and MIPS
- add new '.dims' syntax to distinguish vector group ids and scalar local ids
- improve CLZ32/64 for MSVC
- introduce CTZ32/64
- while disassembling determine minimal AMD driver version for GPU device type (better code detection while disassembling)
- fixed some typos in documentation
- update list of GPU devices in documentation
- fix stupid and old bug in ImageMix sample
- change a GPU device name for VEGA11 to GFX902
- fixed segfault when attempting to disassemble old Gallium binaries using new Gallium binary format
- sort the kernels by offset order while disassembling
- better input data checking while disassembling code
- add HSALayout mode for AMDCL2 format (similar code layout like in ROCm and Gallium formats)
- introduce kernel code parts ('.kcode' and '.kcodeend') to AMDCL2
- check sanity of LDS use in AMD VEGA architecture (can be used only in SCRATCH and GLOBAL)
- in source code add new types: GPUArchMask, AsmKernelId and AsmSectionId
- allow constant literals in sym regranges
- fixed symreg ranges checking
- fixed handling of some symbol names similar to register names (like exec_masc)
- add new GPU devices to list (gfx904, gfx905, gfx906 and gfx907)
- add AMD VEGA 20 instruction set
- add much stuff to handle register allocation (it still doesn't work and was not finished)
- add a DTree structure to save memory in storing register allocation structures
- fixed possible segfault while preparing to write when ASMKERN_INNER is present
CLRadeonExtender 0.1.7:
- update AmdCL2ABI chapter
- fixed kernel arguments sizes in GalliumCompute binary format
- add new GPU devices gfx902-gfx905
- update device tables for Amd Crimson drivers
- small fixes in DynLibrary interface
- add relocations to GalliumCompute binary format (for scratch buffer symbols)
- make getXXXDisasmInputFromBinaryXX as public interface
- speeding up evaluation of simple expressions without symbols
- add '.for' and '.while' pseudo-ops ('for' and 'while' loops)
- fixed some grammar/typos in CLRX documentation
- add GPU device names from ROCm-OpenCL
- handle new ROCm binary format with YAML metadata (assembler and disassembler)
- add few pseudo-ops to ROCm handling
- add new pseudo-ops to set parameters in ROCm YAML metadata
- fixes in GalliumCompute binary generator (for conformance with standards)
- add '.reqd_work_group_size' pseudo-op (equivalent of '.cws')
- add support for work_group_size_hint and vec_type_hint in Amd OpenCL 2.0 binary format
- some small bug fixes in ROCm disassembler
- updates in README.md and INSTALL files
- small sanitizations in DisasmAmd, DisasmAmdCL2 (argument type checking)
- change behaviour of '.cws' (.reqd_work_group_size) while setting default values
- add calculation of section differences in expressions (for ROCm handling)
- fixed invalid reads (potential segfault) after undefining symbol
- fixed old stupid bug: resolve symbol value by using the new value (or, if it is undefined, do not resolve the symbol) instead of the old unresolved symbol value later, when the expression has been evaluated
- add GOT table handling in ROCm binary format
- add new option '--newROCmBinFormat'
- add untested support for ROCm in CLHelper and VectorAdd sample
- add support for multiple OpenCL platforms in CLHelper and samples
- allow the call_convention to be 0xffffffff in AMDHSA config
- handle special cases with relatives while evaluating binary/logical operators
- small fixes in CLRX documentation and Unix manuals
- developing unfinished AsmRegAlloc
- add a missing access qualifier to images 'read_write' for AMD OpenCL 2.0
CLRadeonExtender 0.1.6:
- add support for Mesa3D 17.3.0 (GPU detection)
- fixed segfaults while disassembling new Gallium binaries with AMD HSA
- add ability to supply defined symbols during using the CLHelper
- fixed CLRXDocs mistakes in GcnSmrdInstrs, GcnSmemInstrs, GcnVopXInstrs chapters
- add GCN1.4 (VEGA) instruction descriptions to CLRXDocs
- add support for GCN 1.4 (VEGA) to samples
- fixed encoding/decoding of SMEM instructions with SGPR offset (GCN 1.4)
- add missing GCN 1.4 instructions
- fixed encoding/decoding of OP_SEL (GCN 1.4)
- fixed encoding/decoding of DS_READ_ADDTID_B32 (GCN 1.4)
- fixed encoding/decoding of TBUFFER_x_D16/BUFFER_x_D16 instructions for GCN 1.4
- fixed encoding of CLAMP in VOP3/VOPC instructions (GCN 1.4)
- allow to use OMOD, NEG, ABS, CLAMP modifiers in VOP3/VINTRP instructions
- add new VOP3/VINTRP instruction descriptions to CLRXDocs
- update GCN timings chapter in CLRXDocs
CLRadeonExtender 0.1.5r1:
- add detection of OpenGL to CMakeLists.txt
- add more comments in the source code
- fixed hanging when ROCm code has hundreds or more kernels
- parameter in modifier can have any value
- add 'get_version' pseudo-operation
- add oldModParam mode (old modifier parameter's policy)
- fixes for ROCm disassembler module
- fixes for Gallium binary reader (accept new binaries with many kernels)
- added support for Mesa3D 17.2.x
- added Mesa3D/Gallium device names for AMD Polaris
- add new exceptions to code (to distinguish type of exception)
- fixed position in disassembler code in comments (mainly for Gallium/ROCm)
- add CLRXCLHelper library to facilitate running assembler code on OpenCL
- move some GPU architecture versions tables to GPUId
- add new testcase GPUId
CLRadeonExtender 0.1.5:
- ignore case in access qualifier names (Amd and AmdCL2)
- improve handling a '\()' and '\@'
- add SDWA and DPP words to set instruction encoding
- fixing few CLRXDocs typos
- fixes for AMD RX VEGA (GFX900)
- disassembler prints an instruction's position in comments
- update GcnTimings
- update VectorAdd and ReverseBits for LLVM 4.0 and Mesa3D 17.0.0
- updates in ImageMix (correct workSize calculation for kernel)
- small fixes in disassembler
- disassembler can correctly disassemble GalliumCompute for LLVM 4.0
- add '--llvmVersion' to clrxdisasm
- dump AMD HSA configuration for GalliumCompute and AmdCL2 (like in ROCm format)
- disassembler adds '@' to hwreg and sendmsg to make dump compatible with clrxasm
- add '--HSAConfig' to dump AmdCL2 kernel configuration as AMD HSA config
- add AMD HSA configuration pseudo-ops to GalliumCompute and AmdCL2 binary formats
- update device list for Gallium and ROCm binary formats for recognizing device
- fixed support for LLVM>=3.9 and Mesa3D>=17.0.0 in GalliumCompute
- add pseudo-op '.default_hsa_features' to AmdCL2, Gallium and ROCm formats
- update headers in code
- make error handling more compact in assembler's code
- fixed '.machine', '.codeversion' handling (do not print obsolete warnings)
- add pkg-config files to installation
- remove obsolete warnings in CMakeLists.txt
- added GFX901 support (RX VEGA with HBCC ?)
- add Config.h and amdbin/Elf.h headers to Doxygen documentation
- change lowest device for GCN 1.2 to Iceland in GPUId.
- add support for Windows development environments: CygWin and MinGW
- make detection of 64-bit more portable in CMakeLists.txt (use the compiler to do it)
- check whether std::call_once is available when std threads are not fully supported
- use only C++ compiler to check features (Int128Detect.cpp)
CLRadeonExtender 0.1.4r1:
- fixed code operation in SMRD and SMEM instructions
- fixed parsing of symbol register ranges beginning with 'exec', 'vcc', 'tma', ...
- check end of line when parsing symbol and regvar register ranges
CLRadeonExtender 0.1.4:
- add AMD RX VEGA support (GCN 1.4/VEGA)
- add symbol scopes
- add support for 32-bit AMD OpenCL 2.0 binaries
- update GPU device ids to latest drivers
- add Ellesmere and Baffin support for AMD OpenCL 1.2 binaries
- add support for LLVM 3.9, LLVM 4.0 and Mesa3D 17.0
- add new options to clrxasm (--llvmVersion)
- add GCN 1.2 instruction set documentation
- add new SMEM instruction (s_buffer_atomics)
- add GDS segment size to AMD OpenCL 2.0 binaries
- add code of samples for GCN 1.2
- add option to use old AMD OpenCL 1.2 binary format into samples
- add editor's syntax (NotePad++, Kate, Gedit, VIM)
- minor fixes in GCN assembler
- add modifier's parametrization
- add options to control case-sensitivity in macro names
- fixed handling AMDOCL names for 32-bit Windows environment
- add installation rules for AMDGPU-PRO drivers (OpenSUSE and Ubuntu)
- add new pseudo-ops '.get_64bit', '.get_arch', '.get_format', '.get_gpu'
- add autodetection for LLVM and Mesa3D version
- find correct AMDOCL, MesaOCL and llvm-config at runtime
CLRadeonExtender 0.1.3:
- ROCm binary format support
- fixed '.format' pseudo-op
- fixed resolving variables in some specific cases
- fixed handling AmdCL2 format for device types later than GCN 1.1
- small fixes in documentation
- fixed disassembling s_waitcnt
- fixed handling floating point literals in assembler and compatibility mode (bugFP)
- ARMv8 (AArch64) architecture support
- Android support
CLRadeonExtender 0.1.2:
- AMD OpenCL 2.0 support
- 64-bit Gallium binary format support
- support for new closed Linux and Windows drivers
- new samples
- documentation for OpenCL 2.0 support (includes ABI)
- documentation for GCN ISA FLAT encoding
- lit() specifier to distinguish literal and inline constant
- alternate macro syntax
- correct counting registers for automatic configuration
- fixed handling of conditionals and macro pseudo-ops
- disassembler can dump configuration in user-friendly form
CLRadeonExtender 0.1.1:
- support for Windows
- register ranges, and symbols of register ranges
- GCN ISA documentation
- fixed AMD Catalyst and Gallium compute binary generator
- fixed clrxasm
CLRadeonExtender 0.1:
- first published version
Last modified on 12/24/19 16:26:32