Changeset 3568 in CLRX


Timestamp:
Dec 29, 2017, 7:17:27 AM (12 months ago)
Author:
matszpk
Message:

CLRadeonExtender: CLRXDocs: Small updates and fixes (typo/grammar).

Location:
CLRadeonExtender/trunk/doc
Files:
5 edited

  • CLRadeonExtender/trunk/doc/AmdAbi.md

    r2331 r3568  
    @@ -7,5 +7,5 @@
     ### User data classes
     
    -User data is stored in first scalar registers. Data class indicates what data are stored.
    +User data is stored in first scalar registers. Data class indicates what a data are stored.
     Following data classes:
     
    @@ -63,5 +63,5 @@
     Second const buffer (id=1) holds arguments aligned to 4 dwords.
     
    -Global pointers holds vector offset (64-bit for 64-bit binary) to memory.
    +Global pointers holds vector offset (64-bit for 64-bit binary) to the memory.
     Local pointers holds its offset in bytes (1 dword).
     
  • CLRadeonExtender/trunk/doc/AmdCl2Abi.md

    r3552 r3568  
    @@ -3,5 +3,5 @@
     This chapter describes how kernel gets its argument, how access to constant data. Because
     Kernel setup is AMD HSA configuration, hence we recommend to refer to ROCm-ABI documentation
    -to get information about kernel setup and kernel arguments passing. Now assembler have
    +to get information about kernel setup and kernel arguments passing. Now an assembler have
     all the AMD HSA configuration's pseudo-ops to do it.
     
    @@ -19,5 +19,5 @@
     * usegeneric - enable generic pointers support
     
    -Number of user data registers depends on set of an enabled features. Following rules will
    +The number of user data registers depends on set of an enabled features. Following rules will
     be applied:
     
    @@ -55,9 +55,9 @@
     * void* aqlwrap_pointer - 32-bit or 64-bit
     
    -Further arguments in that buffer are an user arguments defined for kernel. Any pointer,
    +Further arguments in that buffer are an user arguments defined for a kernel. Any pointer,
     command queue, image, sampler, structure tooks 8 bytes (64-bit pointer) or
     4 bytes (32-bit pointer) in 32-bit AMD OpenCL 2.0.
     3 component vector tooks number of bytes  of 4 element vector.
    -Smaller types likes (char, short) tooks 1-3 bytes. Alignment depends on same type
    +Smaller types likes (char, short) tooks 1-3 bytes. An alignment depends on same type
     or type of element (for vectors).
     
  • CLRadeonExtender/trunk/doc/GalliumAbi.md

    r3263 r3568  
    @@ -14,5 +14,5 @@
     * 6-8 - local size for each dimension
     
    -Argument griddim holds number of dimensions. Argument gridoffset holds 3 values of the
    +An argument griddim holds number of dimensions. Argument gridoffset holds 3 values of the
     global offset.
     
    @@ -24,6 +24,6 @@
     ### Other data and resources
     
    -Section '.rodata' ('.globaldata') hold constant data for kernels.
    -Constant data is placed after code of kernels. Use PC pointer to get this data.
    +The section '.rodata' ('.globaldata') hold constant data for kernels.
    +The constant data is placed after code of kernels. Use PC pointer to get this data.
     
     ## Gallium ABI description AMDHSA
    @@ -38,5 +38,5 @@
     * 1-3 - global offsets for each dimensions
     
    -Local sizes and other kernel setup is in memory which address is stored in s[4:5].
    +Local sizes and other kernel setup is in the memory which address is stored in s[4:5].
     List of data (number is dword offset after kernel argument):
     
  • CLRadeonExtender/trunk/doc/GcnInstrsVop1.md

    r3501 r3568  
    @@ -269,5 +269,5 @@
     Opcode VOP3A: 389 (0x185) for GCN 1.2
     Syntax: V_CEIL_F16 VDST, SRC0
    -Description: Truncate half floating point valu from SRC0 with rounding to positive infinity
    +Description: Truncate half floating point value from SRC0 with rounding to positive infinity
     (ceilling), and store result to VDST. Implemented by flooring.
     If SRC0 is infinity or NaN then copy SRC0 to VDST.
    @@ -285,5 +285,5 @@
     Opcode VOP3A: 418 (0x1a2) for GCN 1.0/1.1; 349 (0x15d) for GCN 1.2
     Syntax: V_CEIL_F32 VDST, SRC0
    -Description: Truncate floating point valu from SRC0 with rounding to positive infinity
    +Description: Truncate floating point value from SRC0 with rounding to positive infinity
     (ceilling), and store result to VDST. Implemented by flooring.
     If SRC0 is infinity or NaN then copy SRC0 to VDST.
    @@ -301,5 +301,5 @@
     Opcode VOP3A: 408 (0x198) for GCN 1.1; 344 (0x158) for GCN 1.2
     Syntax: V_CEIL_F64 VDST(2), SRC0(2)
    -Description: Truncate double floating point valu from SRC0 with rounding to
    +Description: Truncate double floating point value from SRC0 with rounding to
     positive infinity (ceilling), and store result to VDST. Implemented by flooring.
     If SRC0 is infinity or NaN then copy SRC0 to VDST.
    @@ -969,5 +969,5 @@
     Opcode VOP3A: 422 (0x1a6) for GCN 1.0/1.1
     Syntax: V_LOG_CLAMP_F32 VDST, SRC0
    -Description: Approximate logarithm of base 2 from floating point value SRC0 with
    +Description: Approximate logarithm of the base 2 from floating point value SRC0 with
     clamping infinities to -MAX_FLOAT. Result is stored in VDST.
     If SRC0 is negative then store -NaN to VDST. This instruction doesn't handle denormalized
    @@ -993,6 +993,6 @@
     Opcode VOP3A: 384 (0x180) for GCN 1.2
     Syntax: V_LOG_F16 VDST, SRC0
    -Description: Approximate logarithm of base 2 from half floating point value SRC0, and store
    -result to VDST. If SRC0 is negative then store -NaN to VDST.
    +Description: Approximate logarithm of the base 2 from half floating point value SRC0,
    +and store result to VDST. If SRC0 is negative then store -NaN to VDST.
     Operation:
     ```
    @@ -1011,5 +1011,5 @@
     Opcode VOP3A: 423 (0x1a7) for GCN 1.0/1.1; 353 (0x161) for GCN 1.2
     Syntax: V_LOG_F32 VDST, SRC0
    -Description: Approximate logarithm of base 2 from floating point value SRC0, and store
    +Description: Approximate logarithm of base the 2 from floating point value SRC0, and store
     result to VDST. If SRC0 is negative then store -NaN to VDST.
     This instruction doesn't handle denormalized values regardless FLOAT MODE register setup.
    @@ -1030,5 +1030,5 @@
     Opcode VOP3A: 453 (0x1c5) for GCN 1.1; 396 (0x18c) for GCN 1.2
     Syntax: V_LOG_LEGACY_F32 VDST, SRC0
    -Description: Approximate logarithm of base 2 from floating point value SRC0, and store
    +Description: Approximate logarithm of the base 2 from floating point value SRC0, and store
     result to VDST. If SRC0 is negative then store -NaN to VDST.
     This instruction doesn't handle denormalized values regardless FLOAT MODE register setup.
  • CLRadeonExtender/trunk/doc/GcnOperands.md

    r3469 r3568  
    @@ -2,9 +2,9 @@
     
     The GCN1.0/1.1 delivers maximum 104 registers (with VCC). Basic list of destination
    -scalar operands have 128 entries. Source operands codes is in range 0-255.
    +scalar operands have 128 entries. The source operands codes is in range 0-255.
     
     **Important**: Two SGPR's must be aligned to 2. Four or more SGPR's must be aligned to 4.
    -This rule do not apply to vector instruction where is more complex rule:
    -SGPR's can be unaligned only if SGPR register range do not cross line (4 SGPR registers).
    +This rule do not apply to the vector instruction where is more complex rule:
    +SGPR's can be unaligned only if SGPR register range do not cross a line (4 SGPR registers).
     
     Following list describes all operand codes values:
    @@ -70,5 +70,6 @@
     ### Operand syntax
     
    -Single operands can be given by their name: `s0`, `v54`. CLRX assemblers accepts syntax with
    +THe Single operands can be given by their name: `s0`, `v54`.
    +CLRX assembler accepts the syntax with
     brackets: `s[0]`, `s[z]`, `v[66]`. In many instructions operands are
     64-bit, 96-bit or even 128-bit. These operands consists several registers that can be
    @@ -76,7 +77,7 @@
     last register's number.
     
    -Names of the registers are case-insensitive.
    +The names of the registers are case-insensitive.
     
    -Constant values are automatically resolved if expression have already value.
    +The constant values are automatically resolved if an expression have already value.
     The 1/(2*PI), 1.0, -2.0 and other floating point constant values will be
     resolved if that accurate floating point value will be given.