wiki:ClrxChangeLog

Version 13 (modified by matszpk, 5 years ago) (diff)

--

Change Log

CLRadeonExtender 0.1.9:

  • add AMD Navi support for assembler and disassembler
  • add shorter addressing of FLAT/GLOBAL/SCRATCH
  • add literal immediate for SMRD addressing for GCN1.1
  • add Amd3 OpenCL binary format for AMD Navi for AMD OpenCL implementation
  • include specific extension in device name for ROCm-OpenCL platform

CLRadeonExtender 0.1.8:

  • add chapter about binary formats to CLRX documentation
  • add some informations about compilation under FreeBSD
  • add '.nosectdiffs' to disable new section difference behaviour if new ROCm format choosen
  • small optimization in the AsmScope? destructor.
  • add extra info about setting up number of the SGPRs register in documentation
  • fixed OpenCL detection for AMDGPU-PRO
  • add '.enum' pseudo-op to simplify defining enumerations
  • add CLRX_VERSION_NUMBER and CLRX_POLICY_UNIFIED_SGPR_COUNT
  • add policy to unify SGPR counting for all binary formats (by default disabled)
  • in documentation fix some some mistakes about building
  • add preliminary support for CPU architectures (untested): SPARC, IA64 and MIPS
  • add new '.dims' syntax for distinguish vector group ids and scalar local ids
  • improve CLZ32/64 for MSVC
  • introduce CTZ32/64
  • while disassemblying determine minimal AMD driver version for GPU device type (better code detection while disassemblying)
  • fixed some types in documentation
  • update list of GPU devices in documentation
  • fix stupid and old bug in ImageMix? sample
  • change a GPU device name for VEGA11 to GFX902
  • fixed segfault when attempt to disassemble old Gallium binaries using new Gallium binary format
  • sort the kernels by an offset order by disassemblying
  • better input data checking while disassemblying code
  • add HSALayout mode for AMDCL2 format (similar code layout like in ROCm and Gallium formats)
  • introduce kernel code parts ('.kcode' and '.kcodeend') to AMDCL2
  • check sanity of use LDS in AMD VEGA architecture (can be used only in SCRATCH and GLOBAL)
  • in source code add new types: GPUArchMask, AsmKernelId? and AsmSectionId? type.
  • allow constant literals in sym regranges
  • fixed symreg ranges checking
  • fixed handling some the symbol names similar to some register names (like exec_masc)
  • add new GPU devices to list (gfx904, gfx905, gfx906 and gfx907)
  • add AMD VEGA 20 instruction set
  • add much stuff to handle register allocation (still it doesn't work and it wasnot finished)
  • add a DTree structure to save memory in storing register allocation structures
  • fixed possible segfault while preparing to write when ASMKERN_INNER is present

CLRadeonExtender 0.1.7:

  • update AmdCL2ABI chapter
  • fixed kernel arguments sizes in GalliumCompute? binary format
  • add new GPU devices gfx902-gfx905
  • update device tables for Amd Crimson drivers
  • small fixes in DynLibrary? interface
  • add relocations to GalliumCompute? binary format (for scratch buffer symbols)
  • make getXXXDisasmInputFromBinaryXX as public interface
  • speeding up evaluation of simple expressions without symbols
  • add '.for' and '.while' pseudo-ops ('for' and 'while' loops)
  • fixed some grammar/typos in CLRX documentation
  • add GPU device names from ROCm-OpenCL
  • handle new ROCm binary format with YAML metadatas (assembler and disassembler)
  • add few pseudo-ops to ROCm handling
  • add new pseudo-ops to set parameters in ROCm YAML metadata
  • fixes in GalliumCompute? binary generator (for conformant with standards)
  • add '.reqd_work_group_size' pseudo-op (equivalent of '.cws')
  • add support for work_group_size_hint and vec_type hint in Amd OpenCL 2.0 binary format
  • some small bug fixes in ROCm disassembler
  • updates in README.md and INSTALL files
  • small sanitizations in DisasmAmd?, DisasmAmdCL2 (argument type checking)
  • change behaviour of '.cws' (.reqd_work_group_size) while setting default values
  • add calculation of section differences in an expressions (for ROCm handling)
  • fixed invalid reads (potential segfault) after undefining symbol
  • fixed old stupid bug: resolve symbol value by using new value (or just if undefined then do not resolve symbol) instead old unresolved symbol value later when expression has been evaluated
  • Add GOT table handling in ROCm binary format
  • add new option '--newROCmBinFormat'
  • add untested support for ROCm in CLHelper and VectorAdd? sample
  • add support for multiple OpenCL platforms in CLHelper and samples
  • allow te call_convetion to 0xffffffff in AMDHSA config
  • handle special cases with relatives while evaluating binary/logical operators
  • small fixes in CLRX documentation and Unix manuals
  • developing unfinished AsmRegAlloc?
  • add a missing access qualifier to images 'read_write' for AMD OpenCL 2.0

CLRadeonExtender 0.1.6:

  • add support for Mesa3D 17.3.0 (GPU detection)
  • fixed segfaults during disassemblying new Gallium binaries with AMD HSA
  • add ability to supply defined symbols during using the CLHelper
  • fixed CLRXDocs mistakes in GcnSrmdInstrs?, GcmSmemInstrs?, GcnVopXInstrs chapters.
  • add GCN1.4 (VEGA) instruction's descriptons to CLRXDocs
  • add support for GCN 1.4 (VEGA) to samples
  • fixed encoding/decoding of SMEM instructions with SGPR offset (GCN 1.4)
  • add a missing GCN 1.4 instructions
  • fixed encoding/decoding of OP_SEL (GCN 1.4)
  • fixed encoding/decoding of DS_READ_ADDTID_B32 (GCN 1.4)
  • fixed encoding/decoding of TBUFFER_x_D16/BUFFER_x_D16 instructions for GCN 1.4
  • fixed encoding CLAMP VOP3/VOPC instructions (GCN 1.4)
  • allow to use OMOD, NEG, ABS, CLAMP modifiers in VOP3/VINTRP instructions
  • add new VOP3/VINTRP instruction's descriptions to CLRXDocs
  • update GCN timings chapter in CLRXDocs

CLRadeonExtender 0.1.5r1:

  • add detection of OpenGL to CMakeLists.txt
  • add more comments in the source code
  • fixed hanging when ROCm code have hundreds or more kernels
  • parameter in modifier can have any value
  • add 'get_version' pseudo-operation
  • add oldModParam mode (old modifier parameter's policy)
  • fixes for ROCm disassembler module
  • fixes for Gallium binary reader (accept new binaries with many kernels)
  • added support for Mesa3D 17.2.x
  • added Mesa3D/Gallium device names for AMD Polaris
  • add new exceptions to code (to distinguish type of exception)
  • fixed position in disassembler code in comments (mainly for Gallium/ROCm)
  • add CLRXCLHelper library to facilitate running assembler code on the OpenCL
  • move some GPU architecture versions tables to GPUId
  • add new testcase GPUId

CLRadeonExtender 0.1.5:

  • ignore case in an access qualifier name's (Amd and AmdCL2)
  • improve handling a '\()' and '\@'
  • add SDWA and DPP words to set instruction encoding
  • fixing few CLRXDocs typos
  • fixes for AMD RX VEGA (GFX900)
  • disassembler prints an instruction's position in comments
  • update GcnTimings
  • update VectorAdd? and ReverseBits? for LLVM 4.0 and Mesa3D 17.0.0
  • updates in ImageMix? (correct workSize calculating for kernel)
  • small fixes in disassembler
  • disassembler can correctly disassemble GalliumCompute? for LLVM 4.0
  • add '--llvmVersion' to clrxdisasm
  • dump AMD HSA configuration for GalliumCompute? and AmdCL2 (like in ROCm format)
  • disassembler add '@' to hwreg and sendmsg to make dump compatible with clrxasm
  • add '--HSAConfig' to dump AmdCL2 kernel configuration as AMD HSA config
  • add AMD HSA configuration pseudo-ops to GalliumCompute? and AmdCL2 binary formats
  • update device list for Gallium and ROCm binary formats for recognizing device
  • fixed support for LLVM>=3.9 and Mesa3D>=17.0.0 in GalliumCompute?
  • add pseudo-op '.default_hsa_features' to AmdCL2, Gallium and ROCm formats
  • update headers in code
  • make error handling more compact in assembler's code
  • fixed '.machine', '.codeversion' handling (do not print obsolete warnings)
  • add pkg-config files to installation
  • remove obsolete warnings in CMakeLists.txt
  • added GFX901 support (RX VEGA with HBCC ?)
  • add Config.h and amdbin/Elf.h headers to Doxygen documentation
  • change lowest device for GCN 1.2 to Iceland in GPUId.
  • add support for Windows developments environments: CygWin? and MinGW
  • make detecting of 64-bits more portable in CMakeLists.txt (use compiler to do)
  • checking whether std::call_once is available for non full supported std threads
  • use only C++ compiler to check features (Int128Detect.cpp)

CLRadeonExtender 0.1.4r1:

  • fixed code operation in SMRD and SMEM instructions
  • fixed parsing symbol register ranges begins from 'exec', 'vcc', 'tma', ...
  • checking end of line at parsing symbol and regvar register ranges

CLRadeonExtender 0.1.4:

  • add AMD RX VEGA support (GCN 1.4/VEGA)
  • add symbol scopes
  • add support for 32-bit AMD OpenCL 2.0 binaries
  • update GPU device ids to latest drivers
  • add Ellesmere and Baffin support for AMD OpenCL 1.2 binaries
  • add support for LLVM 3.9, LLVM 4.0 and Mesa3D 17.0
  • add new options to clrxasm (--llvmVersion)
  • add GCN 1.2 instruction set documentation
  • add new SMEM instruction (s_buffer_atomics)
  • add GDS segment size to AMD OpenCL 2.0 binaries
  • add code of samples for GCN 1.2
  • add option to use old AMD OpenCL 1.2 binary format into samples
  • add editor's syntax (NotePad?++, Kate, Gedit, VIM)
  • minor fixes in GCN assembler
  • add modifier's parametrization
  • add options to control case-sensitiviness in macro names
  • fixed handling AMDOCL names for 32-bit Windows environment
  • add installation rules for AMDGPU-PRO drivers (OpenSUSE and Ubuntu)
  • add new pseudo-ops '.get_64bit', '.get_arch', '.get_format', '.get_gpu'
  • add autodetection for LLVM and Mesa3D version
  • find correct AMDOCL, MesaOCL and llvm-config at runtime

CLRadeonExtender 0.1.3:

  • ROCm binary format support
  • fixed '.format' pseudo-op
  • fixed resolving variables in some specific cases
  • fixed handling AmdCL2 format for device type later than GCN.1.1
  • small fixes in documentation
  • fixed disassemblying s_waitcnt
  • fixed handling floating point literals in assembler and compatibility mode (bugFP)
  • ARMv8 (AArch64) architecture support
  • Android support

CLRadeonExtender 0.1.2:

  • AMD OpenCL 2.0 support
  • 64-bit Gallium binary format support
  • support for new closed Linux and Windows drivers
  • new samples
  • documentation for OpenCL 2.0 support (includes ABI)
  • documentation for GCN ISA FLAT encoding
  • lit() specifier to distinguish literal and inline constant
  • alternate macro syntax
  • correct counting registers for automatic configuration
  • fixed handling of conditionals and macro pseudo-ops
  • disassembler can dump configuration in user-friendly form

CLRadeonExtender 0.1.1:

  • support for Windows
  • register ranges, and symbol's of register ranges
  • GCN ISA documentation
  • fixed AMD Catalyst and Gallium compute binary generator
  • fixed clrxasm

CLRadeonExtender 0.1:

  • first published version