Changes between Version 12 and Version 13 of GcnTimings


Timestamp:
05/28/16 20:00:31
Author:
trac

  • GcnTimings

    v12 v13  
    173173S_MOV_REGRD_B32, S_CBRANCH_JOIN, S_RFE_B64) take 4 cycles.</p>
    174174<h3>SOPC Instruction timings</h3>
    175 <p>All comparison and bit checking instructions take 4 cycles.</p>
     175<p>All comparison and bit checking instructions takes 4 cycles.</p>
    176176<h3>SOPP Instruction timings</h3>
    177177<p>Jumps costs 4 (no jump) or 20 cycles (???) if jump will performed.</p>
     
    179179<p>All VOP2 instructions takes 4 cycles.</p>
    180180<h3>VOP1 Instruction timings</h3>
    181 <p>Timings of VOP1 instructions is in this table:</p>
     181<p>Timings of VOP1 instructions are in this table:</p>
    182182<table>
    183183<thead>
     
    393393<p>All 32-bit comparison instructions takes 4 cycles. All 64-bit comparison instructions takes
    394394DPFACTOR*4 cycles.</p>
     395<h3>VOP3 Instruction timings</h3>
     396<p>Timings of VOP3 instructions are in this table:</p>
     397<table>
     398<thead>
     399<tr>
     400<th>Instruction</th>
     401<th>Cycles</th>
     402<th>Instruction</th>
     403<th>Cycles</th>
     404</tr>
     405</thead>
     406<tbody>
     407<tr>
     408<td>V_ADD_F64</td>
     409<td>DPFACTOR*4</td>
     410<td>V_MAD_U64_U32</td>
     411<td>16</td>
     412</tr>
     413<tr>
     414<td>V_ALIGNBIT_B32</td>
     415<td>4</td>
     416<td>V_MAX3_F32</td>
     417<td>4</td>
     418</tr>
     419<tr>
     420<td>V_ALIGNBYTE_B32</td>
     421<td>4</td>
     422<td>V_MAX3_I32</td>
     423<td>4</td>
     424</tr>
     425<tr>
     426<td>V_ASHR_I64</td>
     427<td>DPFACTOR*4</td>
     428<td>V_MAX3_U32</td>
     429<td>4</td>
     430</tr>
     431<tr>
     432<td>V_BFE_I32</td>
     433<td>4</td>
     434<td>V_MAX_F64</td>
     435<td>DPFACTOR*4</td>
     436</tr>
     437<tr>
     438<td>V_BFE_U32</td>
     439<td>4</td>
     440<td>V_MED3_F32</td>
     441<td>4</td>
     442</tr>
     443<tr>
     444<td>V_BFI_B32</td>
     445<td>4</td>
     446<td>V_MED3_I32</td>
     447<td>4</td>
     448</tr>
     449<tr>
     450<td>V_CUBEID_F32</td>
     451<td>4</td>
     452<td>V_MED3_U32</td>
     453<td>4</td>
     454</tr>
     455<tr>
     456<td>V_CUBEMA_F32</td>
     457<td>4</td>
     458<td>V_MIN3_F32</td>
     459<td>4</td>
     460</tr>
     461<tr>
     462<td>V_CUBESC_F32</td>
     463<td>4</td>
     464<td>V_MIN3_I32</td>
     465<td>4</td>
     466</tr>
     467<tr>
     468<td>V_CUBETC_F32</td>
     469<td>4</td>
     470<td>V_MIN3_U32</td>
     471<td>4</td>
     472</tr>
     473<tr>
     474<td>V_CVT_PK_U8_F32</td>
     475<td>4</td>
     476<td>V_MIN_F64</td>
     477<td>DPFACTOR*4</td>
     478</tr>
     479<tr>
     480<td>V_DIV_FIXUP_F32</td>
     481<td>16</td>
     482<td>V_MQSAD_PK_U16_U8</td>
     483<td>16</td>
     484</tr>
     485<tr>
     486<td>V_DIV_FIXUP_F64</td>
     487<td>DPFACTOR*4</td>
     488<td>V_MQSAD_U32_U8</td>
     489<td>16</td>
     490</tr>
     491<tr>
     492<td>V_DIV_FMAS_F32</td>
     493<td>16</td>
     494<td>V_MQSAD_U8</td>
     495<td>16</td>
     496</tr>
     497<tr>
     498<td>V_DIV_FMAS_F64</td>
     499<td>DPFACTOR*8</td>
     500<td>V_MSAD_U8</td>
     501<td>4</td>
     502</tr>
     503<tr>
     504<td>V_DIV_SCALE_F32</td>
     505<td>16</td>
     506<td>V_MULLIT_F32</td>
     507<td>4</td>
     508</tr>
     509<tr>
     510<td>V_DIV_SCALE_F64</td>
     511<td>DPFACTOR*4</td>
     512<td>V_MUL_F64</td>
     513<td>DPFACTOR*8</td>
     514</tr>
     515<tr>
     516<td>V_FMA_F32</td>
     517<td>16</td>
     518<td>V_MUL_HI_I32</td>
     519<td>16</td>
     520</tr>
     521<tr>
     522<td>V_FMA_F64</td>
     523<td>DPFACTOR*8</td>
     524<td>V_MUL_HI_U32</td>
     525<td>16</td>
     526</tr>
     527<tr>
     528<td>V_LDEXP_F64</td>
     529<td>DPFACTOR*4</td>
     530<td>V_MUL_LO_I32</td>
     531<td>16</td>
     532</tr>
     533<tr>
     534<td>V_LERP_U8</td>
     535<td>4</td>
     536<td>V_MUL_LO_U32</td>
     537<td>16</td>
     538</tr>
     539<tr>
     540<td>V_LSHL_B64</td>
     541<td>DPFACTOR*4</td>
     542<td>V_QSAD_PK_U16_U8</td>
     543<td>16</td>
     544</tr>
     545<tr>
     546<td>V_LSHR_B64</td>
     547<td>DPFACTOR*4</td>
     548<td>V_QSAD_U8</td>
     549<td>16</td>
     550</tr>
     551<tr>
     552<td>V_MAD_F32</td>
     553<td>4</td>
     554<td>V_SAD_HI_U8</td>
     555<td>4</td>
     556</tr>
     557<tr>
     558<td>V_MAD_I32_I24</td>
     559<td>4</td>
     560<td>V_SAD_U16</td>
     561<td>4</td>
     562</tr>
     563<tr>
     564<td>V_MAD_I64_I32</td>
     565<td>16</td>
     566<td>V_SAD_U32</td>
     567<td>4</td>
     568</tr>
     569<tr>
     570<td>V_MAD_LEGACY_F32</td>
     571<td>4</td>
     572<td>V_SAD_U8</td>
     573<td>4</td>
     574</tr>
     575<tr>
     576<td>V_MAD_U32_U24</td>
     577<td>4</td>
     578<td>V_TRIG_PREOP_F64</td>
     579<td>DPFACTOR*8</td>
     580</tr>
     581</tbody>
     582</table>
    395583}}}
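
The VOP3 table added in this revision mixes fixed cycle counts with counts expressed as multiples of DPFACTOR. The sketch below shows one way such entries might be evaluated for a given device. It transcribes only a few rows from the table; the names `VOP3_TIMINGS`, `vop3_cycles` and `estimate_cycles` are hypothetical, DPFACTOR is assumed to be a device-dependent multiplier supplied by the caller, and summing per-instruction costs is a simplification that ignores any overlap between instructions.

```python
# A few rows transcribed from the VOP3 timing table.
# Each entry is (base_cycles, scales_with_dpfactor).
VOP3_TIMINGS = {
    "V_ADD_F64": (4, True),       # DPFACTOR*4
    "V_MUL_F64": (8, True),       # DPFACTOR*8
    "V_FMA_F64": (8, True),       # DPFACTOR*8
    "V_FMA_F32": (16, False),
    "V_MAD_F32": (4, False),
    "V_MAD_U64_U32": (16, False),
    "V_BFE_U32": (4, False),
}


def vop3_cycles(instr, dpfactor):
    """Return the table's cycle count for one instruction.

    dpfactor is an assumed, caller-supplied value for the device's
    DPFACTOR multiplier; it is only applied to entries the table
    expresses as DPFACTOR*N.
    """
    base, scaled = VOP3_TIMINGS[instr]
    return base * dpfactor if scaled else base


def estimate_cycles(instrs, dpfactor):
    """Naive estimate: sum the table costs, assuming no overlap."""
    return sum(vop3_cycles(name, dpfactor) for name in instrs)
```

For example, `estimate_cycles(["V_MAD_F32", "V_FMA_F64"], 2)` adds the fixed 4 cycles of V_MAD_F32 to DPFACTOR*8 for V_FMA_F64 evaluated with an assumed DPFACTOR of 2.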