Changes between Version 5 and Version 6 of GcnTimings


Timestamp:
05/26/16 11:00:38 (8 years ago)
Author:
trac
Comment:

--

  • GcnTimings

    v5 v6  
    40 40 </ul>
    41 41 <h3>SOP2 Instruction timings</h3>
    42 <table>
    43 <thead>
    44 <tr>
    45 <th>Instruction</th>
    46 <th>Delay</th>
    47 <th>Throughput</th>
    48 </tr>
    49 </thead>
    50 <tbody>
    51 <tr>
    52 <td>S_ABSDIFF_I32</td>
    53 <td>4</td>
    54 <td>1</td>
    55 </tr>
    56 <tr>
    57 <td>S_ADDC_U32</td>
    58 <td>4</td>
    59 <td>1</td>
    60 </tr>
    61 <tr>
    62 <td>S_ADD_I32</td>
    63 <td>4</td>
    64 <td>1</td>
    65 </tr>
    66 <tr>
    67 <td>S_ADD_U32</td>
    68 <td>4</td>
    69 <td>1</td>
    70 </tr>
    71 <tr>
    72 <td>S_ANDN2_B32</td>
    73 <td>4</td>
    74 <td>1</td>
    75 </tr>
    76 <tr>
    77 <td>S_ANDN2_B64</td>
    78 <td>4</td>
    79 <td>1</td>
    80 </tr>
    81 <tr>
    82 <td>S_AND_B32</td>
    83 <td>4</td>
    84 <td>1</td>
    85 </tr>
    86 <tr>
    87 <td>S_AND_B64</td>
    88 <td>4</td>
    89 <td>1</td>
    90 </tr>
    91 <tr>
    92 <td>S_ASHR_I32</td>
    93 <td>4</td>
    94 <td>1</td>
    95 </tr>
    96 <tr>
    97 <td>S_ASHR_I64</td>
    98 <td>4</td>
    99 <td>1</td>
    100 </tr>
    101 <tr>
    102 <td>S_BFE_I32</td>
    103 <td>4</td>
    104 <td>1</td>
    105 </tr>
    106 <tr>
    107 <td>S_BFE_I64</td>
    108 <td>4</td>
    109 <td>1</td>
    110 </tr>
    111 <tr>
    112 <td>S_BFE_U32</td>
    113 <td>4</td>
    114 <td>1</td>
    115 </tr>
    116 <tr>
    117 <td>S_BFE_U64</td>
    118 <td>4</td>
    119 <td>1</td>
    120 </tr>
    121 <tr>
    122 <td>S_BFM_B32</td>
    123 <td>4</td>
    124 <td>1</td>
    125 </tr>
    126 <tr>
    127 <td>S_BFM_B64</td>
    128 <td>4</td>
    129 <td>1</td>
    130 </tr>
    131 <tr>
    132 <td>S_CBRANCH_G_FORK</td>
    133 <td></td>
    134 <td></td>
    135 </tr>
    136 <tr>
    137 <td>S_CSELECT_B32</td>
    138 <td>4</td>
    139 <td>1</td>
    140 </tr>
    141 <tr>
    142 <td>S_CSELECT_B64</td>
    143 <td>4</td>
    144 <td>1</td>
    145 </tr>
    146 <tr>
    147 <td>S_LSHL_B32</td>
    148 <td>4</td>
    149 <td>1</td>
    150 </tr>
    151 <tr>
    152 <td>S_LSHL_B64</td>
    153 <td>4</td>
    154 <td>1</td>
    155 </tr>
    156 <tr>
    157 <td>S_LSHR_B32</td>
    158 <td>4</td>
    159 <td>1</td>
    160 </tr>
    161 <tr>
    162 <td>S_LSHR_B64</td>
    163 <td>4</td>
    164 <td>1</td>
    165 </tr>
    166 <tr>
    167 <td>S_MAX_I32</td>
    168 <td>4</td>
    169 <td>1</td>
    170 </tr>
    171 <tr>
    172 <td>S_MAX_U32</td>
    173 <td>4</td>
    174 <td>1</td>
    175 </tr>
    176 <tr>
    177 <td>S_MIN_I32</td>
    178 <td>4</td>
    179 <td>1</td>
    180 </tr>
    181 <tr>
    182 <td>S_MIN_U32</td>
    183 <td>4</td>
    184 <td>1</td>
    185 </tr>
    186 <tr>
    187 <td>S_MUL_I32</td>
    188 <td>4</td>
    189 <td>1</td>
    190 </tr>
    191 <tr>
    192 <td>S_NAND_B32</td>
    193 <td>4</td>
    194 <td>1</td>
    195 </tr>
    196 <tr>
    197 <td>S_NAND_B64</td>
    198 <td>4</td>
    199 <td>1</td>
    200 </tr>
    201 <tr>
    202 <td>S_NOR_B32</td>
    203 <td>4</td>
    204 <td>1</td>
    205 </tr>
    206 <tr>
    207 <td>S_NOR_B64</td>
    208 <td>4</td>
    209 <td>1</td>
    210 </tr>
    211 <tr>
    212 <td>S_ORN2_B32</td>
    213 <td>4</td>
    214 <td>1</td>
    215 </tr>
    216 <tr>
    217 <td>S_ORN2_B64</td>
    218 <td>4</td>
    219 <td>1</td>
    220 </tr>
    221 <tr>
    222 <td>S_OR_B32</td>
    223 <td>4</td>
    224 <td>1</td>
    225 </tr>
    226 <tr>
    227 <td>S_OR_B64</td>
    228 <td>4</td>
    229 <td>1</td>
    230 </tr>
    231 <tr>
    232 <td>S_SUBB_U32</td>
    233 <td>4</td>
    234 <td>1</td>
    235 </tr>
    236 <tr>
    237 <td>S_SUB_I32</td>
    238 <td>4</td>
    239 <td>1</td>
    240 </tr>
    241 <tr>
    242 <td>S_SUB_U32</td>
    243 <td>4</td>
    244 <td>1</td>
    245 </tr>
    246 <tr>
    247 <td>S_XNOR_B32</td>
    248 <td>4</td>
    249 <td>1</td>
    250 </tr>
    251 <tr>
    252 <td>S_XNOR_B64</td>
    253 <td>4</td>
    254 <td>1</td>
    255 </tr>
    256 <tr>
    257 <td>S_XOR_B32</td>
    258 <td>4</td>
    259 <td>1</td>
    260 </tr>
    261 <tr>
    262 <td>S_XOR_B64</td>
    263 <td>4</td>
    264 <td>1</td>
    265 </tr>
    266 </tbody>
    267 </table>
     42 <p>All SOP2 instructions (S_CBRANCH_G_FORK not checked) take 4 cycles and have a throughput of
     43 1 cycle.</p>
    268 44 <h3>SOPK Instruction timings</h3>
    269 <table>
    270 <thead>
    271 <tr>
    272 <th>Instruction</th>
    273 <th>Delay</th>
    274 <th>Throughput</th>
    275 </tr>
    276 </thead>
    277 <tbody>
    278 <tr>
    279 <td>S_ADDK_I32</td>
    280 <td>4</td>
    281 <td>1</td>
    282 </tr>
    283 <tr>
    284 <td>S_CBRANCH_I_FORK</td>
    285 <td></td>
    286 <td></td>
    287 </tr>
    288 <tr>
    289 <td>S_CMOVK_I32</td>
    290 <td>4</td>
    291 <td>1</td>
    292 </tr>
    293 <tr>
    294 <td>S_CMPK_EQ_I32</td>
    295 <td>4</td>
    296 <td>1</td>
    297 </tr>
    298 <tr>
    299 <td>S_CMPK_EQ_U32</td>
    300 <td>4</td>
    301 <td>1</td>
    302 </tr>
    303 <tr>
    304 <td>S_CMPK_GE_I32</td>
    305 <td>4</td>
    306 <td>1</td>
    307 </tr>
    308 <tr>
    309 <td>S_CMPK_GE_U32</td>
    310 <td>4</td>
    311 <td>1</td>
    312 </tr>
    313 <tr>
    314 <td>S_CMPK_GT_I32</td>
    315 <td>4</td>
    316 <td>1</td>
    317 </tr>
    318 <tr>
    319 <td>S_CMPK_GT_U32</td>
    320 <td>4</td>
    321 <td>1</td>
    322 </tr>
    323 <tr>
    324 <td>S_CMPK_LE_I32</td>
    325 <td>4</td>
    326 <td>1</td>
    327 </tr>
    328 <tr>
    329 <td>S_CMPK_LE_U32</td>
    330 <td>4</td>
    331 <td>1</td>
    332 </tr>
    333 <tr>
    334 <td>S_CMPK_LG_I32</td>
    335 <td>4</td>
    336 <td>1</td>
    337 </tr>
    338 <tr>
    339 <td>S_CMPK_LG_U32</td>
    340 <td>4</td>
    341 <td>1</td>
    342 </tr>
    343 <tr>
    344 <td>S_CMPK_LT_I32</td>
    345 <td>4</td>
    346 <td>1</td>
    347 </tr>
    348 <tr>
    349 <td>S_CMPK_LT_U32</td>
    350 <td>4</td>
    351 <td>1</td>
    352 </tr>
    353 <tr>
    354 <td>S_GETREG_B32</td>
    355 <td></td>
    356 <td></td>
    357 </tr>
    358 <tr>
    359 <td>S_GETREG_REGRD_B32</td>
    360 <td></td>
    361 <td></td>
    362 </tr>
    363 <tr>
    364 <td>S_MOVK_I32</td>
    365 <td>4</td>
    366 <td>1</td>
    367 </tr>
    368 <tr>
    369 <td>S_MULK_I32</td>
    370 <td>4</td>
    371 <td>1</td>
    372 </tr>
    373 <tr>
    374 <td>S_SETREG_B32</td>
    375 <td></td>
    376 <td></td>
    377 </tr>
    378 <tr>
    379 <td>S_SETREG_IMM32_B32</td>
    380 <td></td>
    381 <td></td>
    382 </tr>
    383 </tbody>
    384 </table>
     45 <p>All SOPK instructions (S_CBRANCH_I_FORK, S_GETREG_B32, S_GETREG_REGRD_B32, S_SETREG_B32,
     46 S_SETREG_IMM32_B32 not checked) take 4 cycles and have a throughput of 1 cycle.</p>
    385 47 <h3>SOP1 Instruction timings</h3>
    386 <p>The S_*_SAVEEXEC_B64 instructions takes 8 cycles. Other ALU instructions (expects
     48 <p>The S_*_SAVEEXEC_B64 instructions take 8 cycles. Other ALU instructions (except
    387 49 S_MOV_REGRD_B32, S_CBRANCH_JOIN, S_RFE_B64) take 4 cycles.</p>
    388 50 <h3>SOPC Instruction timings</h3>