Changes between Version 3 and Version 4 of GcnTimings


Timestamp:
01/26/16 20:00:14 (8 years ago)
Author:
trac
Comment:

--

  • GcnTimings

    v3 v4  
    2323<li>if a 2-dword instruction taking 16 or more cycles and another 2-dword instruction occupy 4 dwords, then
    2424there is no penalty for the second 2-dword instruction.</li>
     25<li>the best jump targets are the first 5 dwords of a 32-byte block. A jump to any later dword costs
     261-3 penalties, depending on the dword number (N-4, where N is the dword number). This rule
     27does not apply to backward jumps (???)</li>
     28<li>any conditional jump instruction should be placed in the first half of a 32-byte block; otherwise
     291-4 penalties are added if the jump is not taken, depending on the dword number
     30(N-3, where N is the dword number).</li>
    2531</ul>
    2632<h3>Instruction scheduling</h3>
    27 <p>Between any vector operation that operates on VCC and any scalar ALU instruction is
    28 16-cycle delay.</p>
     33<ul>
     34<li>between any integer V_ADD*, V_SUB*, V_READFIRSTLANE_B32, V_READLANE_B32 operation
     35and any scalar ALU instruction there is a 16-cycle delay.</li>
     36<li>any conditional jump that checks VCCZ or EXECZ directly after an instruction that changes
     37VCC or EXEC adds a single penalty (4 cycles)</li>
     38<li>any conditional jump that checks SCC directly after an instruction that changes SCC,
     39EXEC or VCC adds a single penalty (4 cycles)</li>
     40</ul>
    2941<h3>SOP2 Instruction timings</h3>
    3042<table>
     
    376388<h3>SOPC Instruction timings</h3>
    377389<p>All comparison and bit checking instructions take 4 cycles.</p>
     390<h3>SOPP Instruction timings</h3>
     391<p>A jump costs 4 cycles if it is not taken, or 20 cycles (???) if it is taken.</p>
    378392}}}
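
The two alignment rules added in v4 (N-4 for jump targets, N-3 for conditional jumps that are not taken) can be sketched as a small calculation. This is only an illustration: the function names are hypothetical, and it assumes that N is the zero-based index of the dword inside its 32-byte block (0 to 7), which is the reading that reproduces the stated 1-3 and 1-4 ranges. The 4-cycle size of one penalty is taken from the scheduling notes.

```python
# Rough sketch of the v4 alignment rules; not measured data.
# Assumption: N is the zero-based dword index within a 32-byte block (0..7).

BLOCK_BYTES = 32
DWORD_BYTES = 4
PENALTY_CYCLES = 4  # "single penalty (4 cycles)" from the scheduling notes


def dword_index(byte_addr: int) -> int:
    """Zero-based position of the dword at byte_addr inside its 32-byte block."""
    return (byte_addr % BLOCK_BYTES) // DWORD_BYTES


def jump_target_penalties(target_addr: int) -> int:
    """Penalties for jumping to target_addr: N-4, none for the first 5 dwords.

    The notes mark this rule as uncertain for backward jumps.
    """
    return max(0, dword_index(target_addr) - 4)


def untaken_cond_jump_penalties(instr_addr: int) -> int:
    """Penalties for a conditional jump at instr_addr that is not taken: N-3,
    none when the instruction sits in the first half of the block."""
    return max(0, dword_index(instr_addr) - 3)


if __name__ == "__main__":
    for addr in range(0, BLOCK_BYTES, DWORD_BYTES):
        target = jump_target_penalties(addr)
        untaken = untaken_cond_jump_penalties(addr)
        print(f"dword {dword_index(addr)}: "
              f"target {target} ({target * PENALTY_CYCLES} cycles), "
              f"untaken cond. jump {untaken} ({untaken * PENALTY_CYCLES} cycles)")
```

Under these assumptions, a label placed at byte offset 20 of a block (dword 5) would cost one penalty as a jump target, so aligning hot jump targets to the start of a 32-byte block avoids the penalty entirely.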