source: CLRX/CLRadeonExtender/trunk/doc/GcnOperands.md @ 3094

Last change on this file since 3094 was 3094, checked in by matszpk, 2 years ago

CLRadeonExtender: GCNAsm: Add parametrization to bound_ctrl.

### Operand encoding

GCN1.0/1.1 provides at most 104 scalar registers (including VCC). The basic list of
destination scalar operands has 128 entries. Source operand codes are in the range 0-255.

**Important**: A pair of SGPRs must be aligned to 2. A range of four or more SGPRs must be
aligned to 4.
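
The alignment rule above can be sketched as a small check. This is only an illustration: the function name `sgpr_range_is_aligned` is hypothetical and not part of CLRX, and since the rule says nothing about ranges of three registers, the sketch assumes they are unconstrained.

```python
def sgpr_range_is_aligned(first: int, count: int) -> bool:
    """Return True if `count` SGPRs starting at s[first] satisfy the alignment rule."""
    if first < 0 or count < 1:
        raise ValueError("first must be >= 0 and count must be >= 1")
    if count >= 4:
        # Four or more SGPRs must start at a multiple of 4.
        return first % 4 == 0
    if count == 2:
        # A pair of SGPRs must start at an even register number.
        return first % 2 == 0
    # Single registers are unconstrained; a count of 3 is an assumption (not stated above).
    return True
```

With this check, `s[8:11]` (first 8, count 4) passes, while `s[2:5]` (first 2, count 4) does not.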

The following table lists all operand code values:

Code     | Name              | Description
---------|-------------------|------------------------
0-103    | S0 - S103         | SGPR's (GCN1.0/1.1)
0-101    | S0 - S101         | SGPR's (GCN1.2)
104-105  | FLAT_SCRATCH      | FLAT_SCRATCH register (GCN1.1)
104      | FLAT_SCRATCH_LO   | Low half of FLAT_SCRATCH register (GCN1.1)
105      | FLAT_SCRATCH_HI   | High half of FLAT_SCRATCH register (GCN1.1)
102-103  | FLAT_SCRATCH      | FLAT_SCRATCH register (GCN1.2)
102      | FLAT_SCRATCH_LO   | Low half of FLAT_SCRATCH register (GCN1.2)
103      | FLAT_SCRATCH_HI   | High half of FLAT_SCRATCH register (GCN1.2)
104-105  | XNACK_MASK        | XNACK_MASK register
104      | XNACK_MASK_LO     | Low half of XNACK_MASK register
105      | XNACK_MASK_HI     | High half of XNACK_MASK register
106-107  | VCC               | VCC (vector carry register), the last two SGPR's
106      | VCC_LO            | Low half of VCC
107      | VCC_HI            | High half of VCC
108-109  | TBA               | Trap handler base address
108      | TBA_LO            | Low half of TBA register
109      | TBA_HI            | High half of TBA register
110-111  | TMA               | Pointer to data in memory used by trap handler
110      | TMA_LO            | Low half of TMA register
111      | TMA_HI            | High half of TMA register
112-123  | TTMP0 - TTMP11    | Trap handler temporary registers
124      | M0                | M0. Memory register
125      | -                 | reserved
126-127  | EXEC              | EXEC register
126      | EXEC_LO           | Low half of EXEC register
127      | EXEC_HI           | High half of EXEC register
128      | 0                 | 0
129-192  | 1-64              | 1 to 64 constant value
193-208  | -1 - -16          | -1 to -16 constant value
209-239  | -                 | reserved
240      | 0.5               | 0.5 floating point value
241      | -0.5              | -0.5 floating point value
242      | 1.0               | 1.0 floating point value
243      | -1.0              | -1.0 floating point value
244      | 2.0               | 2.0 floating point value
245      | -2.0              | -2.0 floating point value
246      | 4.0               | 4.0 floating point value
247      | -4.0              | -4.0 floating point value
248      | 1/(2*PI)          | 1/(2*PI)
249      | --                | SDWA dword (GCN1.2)
250      | --                | DPP dword (GCN1.2)
251      | VCCZ              | VCCZ register
252      | EXECZ             | EXECZ register
253      | SCC               | SCC register
254      | LDS_DIRECT        | LDS direct access
254      | LDS               | LDS direct access
254      | SRC_LDS_DIRECT    | LDS direct access
255      | 255               | Literal constant (follows instruction dword)
256-511  | V0-V255           | VGPR's (only VOP3 encoding operands)
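
To illustrate how the constant rows of the table map codes to values, here is a minimal Python sketch. It only follows the rows listed above; the function name `inline_constant_value` is hypothetical and is not taken from the CLRX sources.

```python
import math

# Floating point constants, codes 240-248, as listed in the table above.
_FLOAT_CONSTANTS = {
    240: 0.5, 241: -0.5,
    242: 1.0, 243: -1.0,
    244: 2.0, 245: -2.0,
    246: 4.0, 247: -4.0,
    248: 1.0 / (2.0 * math.pi),
}

def inline_constant_value(code: int):
    """Return the constant encoded by an operand code, or None if the code is not a constant."""
    if 128 <= code <= 192:
        # 128 encodes 0, 129-192 encode 1 to 64.
        return code - 128
    if 193 <= code <= 208:
        # 193-208 encode -1 to -16.
        return -(code - 192)
    return _FLOAT_CONSTANTS.get(code)
```

Codes outside these rows (registers, reserved codes, the literal marker 255) yield `None` in this sketch.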

### Operand syntax

Single operands can be given by their name: `s0`, `v54`. The CLRX assembler also accepts
syntax with brackets: `s[0]`, `s[z]`, `v[66]`. In many instructions operands are
64-bit, 96-bit or even 128-bit. These operands consist of several registers that can be
expressed by ranges: `v[3:4]`, `s[8:11]`, `s[16:23]`, where the second value is
the number of the last register.

Names of the registers are case-insensitive.
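
The register syntax described above can be sketched as a small parser. This is an illustration only and not the CLRX implementation: the name `parse_register` is hypothetical, only `s` and `v` registers with plain decimal numbers are handled, and symbolic indices such as `s[z]` are deliberately left out.

```python
import re

# Matches s0, v54, s[0], v[3:4]; register names are case-insensitive.
_REGISTER_RE = re.compile(r"([sv])(?:(\d+)|\[(\d+)(?::(\d+))?\])", re.IGNORECASE)

def parse_register(text: str):
    """Parse a register operand into (kind, first, count)."""
    match = _REGISTER_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a register operand: {text!r}")
    kind = match.group(1).lower()
    if match.group(2) is not None:
        # Plain form: s0, v54.
        first = last = int(match.group(2))
    else:
        # Bracket form: s[0] or a range s[8:11]; the second value is the last register.
        first = int(match.group(3))
        last = int(match.group(4)) if match.group(4) is not None else first
    if last < first:
        raise ValueError(f"range end before start: {text!r}")
    return kind, first, last - first + 1
```

For example, `parse_register("s[8:11]")` yields `("s", 8, 4)`, a 128-bit operand made of four SGPRs.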

Constant values are resolved automatically if the expression already has a value.
The 1/(2*PI), 1.0, -2.0 and other floating point constants are resolved only if
exactly that floating point value is given.

In the instruction syntax, operands are listed by the name of the encoding field. Optionally,
the number of registers is given in parentheses. A range of allowed register counts is
written in the form 'START:LAST'. Examples:

Syntax: S_SUB_I32 SDST, SSRC0, SSRC1  
Syntax: S_AND_B64 SDST(2), SSRC0(2), SSRC1(2)  
Syntax: S_AND_B64 SDST(2), SSRC0(2), SSRC1(2:4)  

### Constants and literals

There are two ways to supply an immediate value to a GCN instruction: the first is a builtin
constant (integer or floating point) and the second is a 32-bit literal. Some encodings
also accept immediates of other sizes (16-bit or 12-bit).

Literals are treated differently in scalar instructions and in vector instructions.
In scalar instructions, if the operand is 64-bit, the literal is the exact 64-bit value
(sign or zero extended). By contrast, in vector instructions with a 64-bit operand, the
literal is the higher 32 bits of the value (the lower 32 bits are zero). Unfortunately, the
CLRX assembler always encodes and decodes a literal as a 32-bit value (except floating
point values). The builtin constants always have their exact value, for both 32-bit and
64-bit operands. For example, the instructions `v_frexp_exp_i32_f64 v3, lit(45)` and
`v_frexp_exp_i32_f64 v3, 45` generate different results, because the literal and the
constant have different meanings.

**NOTE:** The same number given as a literal and as a constant yields different values for
a 64-bit operand in vector instructions. To force a literal, use the `lit()` function.
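
The difference between scalar and vector handling of a 32-bit literal for a 64-bit operand can be sketched in Python. All function names here are hypothetical. Because the text above says only "sign or zero extended" without stating which one applies, the extension mode is left as a parameter.

```python
import struct

def vector_literal_bits64(literal32: int) -> int:
    """Vector instruction, 64-bit operand: the literal is the higher 32 bits, lower bits are zero."""
    return (literal32 & 0xFFFFFFFF) << 32

def scalar_literal_bits64(literal32: int, signed: bool = True) -> int:
    """Scalar instruction, 64-bit operand: the literal is sign or zero extended to 64 bits."""
    literal32 &= 0xFFFFFFFF
    if signed and literal32 & 0x80000000:
        return literal32 | 0xFFFFFFFF00000000
    return literal32

def bits_to_double(bits: int) -> float:
    """Reinterpret a 64-bit pattern as an IEEE 754 double."""
    return struct.unpack("<d", struct.pack("<Q", bits))[0]
```

Under these assumptions, the literal 45 in a vector instruction becomes the bit pattern `45 << 32`, which is not the value 45, while a literal of `0x3FF00000` becomes the double 1.0.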

**OLD_VERSIONS**: This version of CLRadeonExtender adds the '--buggyFPLit' option to support
sources written for older versions (up to 0.1.2). Versions up to 0.1.2 handle floating
point literals and constants incorrectly due to wrong assumptions. This and later versions
fix that behaviour.

Old and buggy behaviour:

* support only half and single floating point literals (and constants)
* shorten literals to constants only for single floating point literals

New behaviour:

* support half, single and double (only higher 32 bits) floating point literals
(and constants)
* shorten literals to constants for half, single and double literals (the type depends
on the operand type)

### Hardware registers

These registers can be read or written by the S_GETREG_\* and S_SETREG_\* instructions.

List of hardware registers:

* GPR_ALLOC, HWREG_GPR_ALLOC -
* HW_ID, HWREG_HW_ID -
* IB_DBG0, HWREG_DBG0 -
* IB_STS, HWREG_IB_STS -
* INST_DW0, HWREG_INST_DW0 -
* INST_DW1, HWREG_INST_DW1 -
* LDS_ALLOC, HWREG_LDS_ALLOC -
* MODE, HWREG_MODE -
* PC_HI, HWREG_PC_HI -
* PC_LO, HWREG_PC_LO -
* STATUS, HWREG_STATUS -
* TRAPSTS, HWREG_TRAPSTS -

### LDS direct access

LDS direct access allows a VOP instruction to read LDS memory directly by supplying the
LDS, LDS_DIRECT or SRC_LDS_DIRECT keyword as the first source operand. The data from
LDS is then used in place of that operand.

The M0 register must hold the offset in bytes (in bits 0-15) and the format of the data
(in bits 16-18). Table of formats:

 Value | Format
-------|----------------
0      | Unsigned byte
1      | Unsigned 16-bit word
2      | Unsigned 32-bit word
3      | unused (same as 2)
4      | Signed byte
5      | Signed 16-bit word

An LDS direct access does not seem to require `S_WAITCNT LGKMCNT(0)` (this is unverified).
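
As an illustration of the M0 layout described above, the following Python sketch packs the byte offset and the format value into one 32-bit word. The function name `lds_direct_m0` is hypothetical. The sketch assumes only the bit positions given in this section (offset in bits 0-15, format in bits 16-18).

```python
def lds_direct_m0(offset_bytes: int, data_format: int) -> int:
    """Build the M0 value for LDS direct access from a byte offset and a format value."""
    if not 0 <= offset_bytes <= 0xFFFF:
        raise ValueError("offset must fit in bits 0-15")
    if not 0 <= data_format <= 7:
        raise ValueError("format must fit in bits 16-18")
    # The offset occupies bits 0-15, the format occupies bits 16-18.
    return (data_format << 16) | offset_bytes
```

For example, reading an unsigned 32-bit word (format 2) at byte offset 0x40 gives `lds_direct_m0(0x40, 2) == 0x20040`.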

### Parametrizable modifiers

Many instruction modifiers can take a parameter with the value 0 or 1, which makes it
easy to parametrize them. The value 1 enables the modifier and 0 disables it:
`tfe:0` disables the TFE modifier, `tfe:1` enables it. The value of the parameter is an
expression. The `omod` modifier with a parameter (expression) replaces the `mul` and `div`
modifiers. The `format` in the MTBUF encoding is also parametrizable if the data and/or
number format expression is preceded by the `@` character (example: `format[@1,@4]`).
A special case is `bound_ctrl`. To parametrize bound_ctrl you must use the syntax
`bound_ctrl:0:expr` or `bound_ctrl:1:expr`.

The HW registers and the send message parameters (message and GSOP) are parametrizable if
they are preceded by `@` (example: `hwreg(@5, 8, 16)`).