Midterm Test

Name: ___________________________  Student Number: _______________________

Time limit: 1 hour 50 min

Notes:

a) Closed book.
b) No calculators.
c) Answer all questions in the space provided.

Q1 - Assume the following C code:-

for(i=0; i<=1000; i++) {
}

A- Convert the C code to MIPS Assembly code.

```
Lw R5, 0(CL5); s
Lw R1, 0(R2); A
Lw R4, 0(R3); B
Mult R4, R5, R4; S*B[i]
Add R4, R4, R1; A
Sew R4, 0(R2); A
Addi R2, R2, 14
Addi R3, R3, 14
Sub R7, R6, R2;
Bnz R7, Loop
```

(total=20 marks)
**B**-Find the memory bandwidth of the above MIPS code assuming A[i], B[i] and S are 32 bit integers and are stored in memory. Assume op-code=1 byte, memory address is 4 bytes, immediate and displacement size= 2 bytes, and there are 64 registers. (5 marks)

<table>
<thead>
<tr>
<th>Instruction</th>
<th>Memory Access</th>
<th>Memory Access</th>
<th>Memory Access</th>
</tr>
</thead>
<tbody>
<tr>
<td>LW R5, a(R6)</td>
<td>8+6+16+6=45B</td>
<td>4B</td>
<td></td>
</tr>
<tr>
<td>Loop: LW R1, a(CR2)</td>
<td>= 45B</td>
<td>4B</td>
<td></td>
</tr>
<tr>
<td>LW R4, a(CR3)</td>
<td>= 45B</td>
<td>4B</td>
<td></td>
</tr>
<tr>
<td>MUL R4, R5, R4</td>
<td>8+6+6+6=33B</td>
<td>0</td>
<td></td>
</tr>
<tr>
<td>ADD R4, R4, R7</td>
<td>m = 33B</td>
<td>0</td>
<td></td>
</tr>
<tr>
<td>SW R4, a(CR2)</td>
<td>8+6+16=45B</td>
<td>4B</td>
<td></td>
</tr>
<tr>
<td>ADD R2, R2, R4</td>
<td>8+6+6+16=45B</td>
<td>0</td>
<td></td>
</tr>
<tr>
<td>ADD R3, R3, R4</td>
<td>8+6+6+16=45B</td>
<td>0</td>
<td></td>
</tr>
<tr>
<td>SUB R7, R6, R2</td>
<td>8+6+6+6=33B</td>
<td>0</td>
<td></td>
</tr>
<tr>
<td>Бр</td>
<td>52.15 = 5.88 B/cycle</td>
<td>12.8</td>
<td></td>
</tr>
</tbody>
</table>

**C**-Calculate the memory bandwidth if a fixed instruction length is used. If the instruction is shorter than 4 bytes it uses a 4 byte. If the instruction is longer than 4 bytes, it uses two instructions of 4 bytes each and cost two processor accesses. (5 marks)

<table>
<thead>
<tr>
<th>Instruction</th>
<th>Memory Access</th>
<th>Memory Access</th>
</tr>
</thead>
<tbody>
<tr>
<td>LW R5, a(R6)</td>
<td>8B</td>
<td>4B</td>
</tr>
<tr>
<td>Loop: LW R1, a(CR2)</td>
<td>8B</td>
<td>4B</td>
</tr>
<tr>
<td>LW R4, a(CR3)</td>
<td>8B</td>
<td>4B</td>
</tr>
<tr>
<td>MUL R4, R5, R4</td>
<td>4B</td>
<td>0</td>
</tr>
<tr>
<td>ADD R4, R4, R7</td>
<td>8B</td>
<td>4B</td>
</tr>
<tr>
<td>SW R4, a(CR2)</td>
<td>8B</td>
<td>0</td>
</tr>
<tr>
<td>ADD R2, R2, R4</td>
<td>8B</td>
<td>0</td>
</tr>
<tr>
<td>ADD R3, R3, R4</td>
<td>8B</td>
<td>0</td>
</tr>
<tr>
<td>SUB R7, R6, R2</td>
<td>8B</td>
<td>8</td>
</tr>
<tr>
<td>Бр</td>
<td>48B</td>
<td>12B</td>
</tr>
</tbody>
</table>

Bω = 68B = 4.85B
D-Find if it is better for the performance to use the fixed instruction length assuming that the fixed instruction length uses 20% faster clock. (5 marks)

\[
T = 1 + 1000 \times 9 = 9000 \text{ cycles}
\]

\[
T_{\text{Fixed}} = \frac{(1 + 1000 \times 14)}{1.2} = 12000 \text{ cycles}
\]

NO, it is not.

Q2- (total = 10 marks)

A- Explain the Role of Compiler in optimizing the performance of applications. (3 marks)

- it assigns variables that are most frequently used to registers to reduce memory access time
- schedule code to hide hazards
- evaluate expressions in any order
- pick fast instructions
- hide branch latency
B- Explain why the compiler does not assign a data element referenced by a pointer to a register and give an example. (2 marks)

- aliasing problem: it could have different value
  
  \[ p = da \]
  \[ R_1 = a = 5 \]
  \[ \* p = 7 \]
  \[ R_2 = a \text{ different} \]

C- Why compiler does not use callee saving and explain by giving example. (3 marks)

Callee could call another routine that changes one of the variables that is not saved by callee

Example: program calls Routine 1, that does not use \( x \)
Routine 1 does not save \( x \), if Routine 1 calls Routine 2 and Routine 2 changes \( x \), then \( x \) in main program will not be same when they return.

D- Explain the advantages and the disadvantages of adding an extra addressing mode to ISA. (2 marks)

- Advantages: better for compiler, shorter code
- Disadvantages: slower, more complicated

Q3- Pipelining (total=30 marks)
A- List all types of hazards in MIPS 5 stage Integer pipeline, and give example for the code that would cause each hazard. (6 marks)

1. Structural hazard: Fetch in one clock

2. Data hazard: Fetch of correct instruction is not ready in time

3. Control hazard: Fetch of next instruction is not known in time because of BTA not calculated or known
B- Explain why pipelining complicates handling exceptions by giving examples.
(4 marks)

1. Could have two exceptions in one clock
   $F_1 \leftarrow D_1 \rightarrow E_1 \rightarrow M_1 \rightarrow W_1$
   $F_2$ (Red circle)
   Fetch Inst 2 could cause page fault
   Decode Inst 1 could have illegal Inst.

2. Out of order exception
   $F_1 \leftarrow D_1 \rightarrow M_1 \rightarrow W_1$
   $F_2$ (Red circle) $D_2 \rightarrow E_2$
   Fetch Inst 2 could cause page fault
   Earlier than Inst 1 could cause memory page fault later
   They must handle in order of their occurrence. First, the Inst 1 is
   non-pipelined.
C. In the MIPS pipeline with multi-cycle floating point operations, give examples of the following:
   i. WAW Hazard (3 marks)

   \[ \text{ADD } fl, f2, f3, \]
   \[ \text{LW } f1, f4, f5 - \]

   ii. A structural hazard that occurs in MIPS floating multi-cycle pipeline (3 marks)

   \[ \text{ADD } f1, f2, f3 \]
   \[ \text{ORR } r2, r1, r3 \]
   \[ \text{SUB } r4, r5, r6 \]
   \[ \text{ADDI } r5, 0(\text{R1}) \]

   or FP divide

D. Show the timing of the following code sequence in MIPS FP pipeline assume forwarding, scheduling and branch uses delayed branch and all memory references are in cache (FP Mult takes 7 cycles). (14 MARKS)

\[ \text{LOOP: } \text{LD } f0, 0(\text{R1}) \]
\[ \text{LD } f4, 0(\text{R2}) \]
\[ \text{Mul } f4, f0, f4 \]
\[ \text{SD } f4, 0(\text{R1}) \]
\[ \text{ADDI } r1, r1, 8 \]
\[ \text{ADDI } r2, r2, 8 \]
\[ \text{SUBI } r5, r6, r1 \]
\[ \text{BNZ } r5, \text{LOOP} \]
<table>
<thead>
<tr>
<th>Instruction</th>
<th>1</th>
<th>2</th>
<th>3</th>
<th>4</th>
<th>5</th>
<th>6</th>
<th>7</th>
<th>8</th>
<th>9</th>
<th>10</th>
<th>11</th>
<th>12</th>
<th>13</th>
<th>14</th>
<th>15</th>
<th>16</th>
<th>17</th>
<th>18</th>
<th>19</th>
<th>20</th>
<th>21</th>
<th>22</th>
</tr>
</thead>
<tbody>
<tr>
<td>LD F0, o(CP1)</td>
<td>F0</td>
<td>D0</td>
<td>E1</td>
<td>M1</td>
<td>W6</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>LD F1, o(CP2)</td>
<td>F1</td>
<td>D1</td>
<td>E2</td>
<td>M2</td>
<td>W2</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>ADD F1, R1, F0</td>
<td>F2</td>
<td>D2</td>
<td>E3</td>
<td>M3</td>
<td>W3</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>MULT F4, F0, F1</td>
<td>F4</td>
<td>D4</td>
<td>E4</td>
<td>M4</td>
<td>W4</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>ADD F2, R2, F0</td>
<td>F5</td>
<td>D5</td>
<td>E5</td>
<td>M5</td>
<td>W5</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>SUB F5, R5, F0</td>
<td>F6</td>
<td>D6</td>
<td>E6</td>
<td>M6</td>
<td>W6</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>BNZ R5, D0</td>
<td>F7</td>
<td>D7</td>
<td>E7</td>
<td>M7</td>
<td>W7</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>SD F5, o(R1)</td>
<td>11 cycles</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

Name: ___________________________  Section: ________