A superscalar processor can execute two instructions per cycle, provided there is no resource conflict and no data dependency between them. The processor has two pipelines, each with four processing stages (fetch, decode, execute, and store). Each pipeline has its own fetch, decode, and store unit. Four functional units (multiplier, adder, logic unit, and load unit) are available in the execute stage and are shared dynamically by the two pipelines. The two store units can also be used dynamically by either pipeline, depending on availability in a particular cycle. Assume the adder has two stages and the multiplier has three stages.
Consider the following program to be executed:
I1: Load R1, A /R1 ← Memory (A)/
I2: Add R2, R1 /R2 ← R2 + R1/
I3: Add R3, R4 /R3 ← R3 + R4/
I4: Mul R4, R5 /R4 ← R4 * R5/
I5: Comp R6 /Flag ← R6/
I6: Mul R6, R7 /R6 ← R6 * R7/
a. What dependencies exist in the program?
b. Rename the registers from the above code to prevent dependency problems.
c. Show the pipeline activity for this program using an in-order issue, in-order completion policy.
d. Repeat for in-order issue with out-of-order completion.
e. Repeat for out-of-order issue with out-of-order completion.
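As a starting point for part (a), the dependency check can be sketched mechanically. The Python below is only an illustrative aid, not a prescribed solution method: the names `PROGRAM` and `find_dependencies` are hypothetical, and the read and write sets are transcribed from the register-transfer annotations exactly as printed in the listing (in particular, I5 is assumed to read R6 and write only Flag; a different reading of Comp would change the result). It compares every earlier/later instruction pair, which is sufficient here because no register is written more than once.

```python
from itertools import combinations

# (label, registers written, registers read), transcribed from the
# annotations in the listing. These sets are an assumption of this sketch.
PROGRAM = [
    ("I1", {"R1"}, set()),            # R1 <- Memory(A); A is a memory operand
    ("I2", {"R2"}, {"R2", "R1"}),     # R2 <- R2 + R1
    ("I3", {"R3"}, {"R3", "R4"}),     # R3 <- R3 + R4
    ("I4", {"R4"}, {"R4", "R5"}),     # R4 <- R4 * R5
    ("I5", {"Flag"}, {"R6"}),         # Flag <- R6 (as annotated)
    ("I6", {"R6"}, {"R6", "R7"}),     # R6 <- R6 * R7
]


def find_dependencies(program):
    """Return (kind, earlier, later, register) tuples for each pair.

    Pairwise check only: it does not account for an intervening write
    to the same register, which does not occur in this program.
    """
    deps = []
    # combinations() preserves program order, so `a` always precedes `b`.
    for (a, writes_a, reads_a), (b, writes_b, reads_b) in combinations(program, 2):
        for reg in sorted(writes_a & reads_b):
            deps.append(("RAW", a, b, reg))   # true (flow) dependency
        for reg in sorted(reads_a & writes_b):
            deps.append(("WAR", a, b, reg))   # antidependency
        for reg in sorted(writes_a & writes_b):
            deps.append(("WAW", a, b, reg))   # output dependency
    return deps


if __name__ == "__main__":
    for kind, earlier, later, reg in find_dependencies(PROGRAM):
        print(f"{kind}: {earlier} -> {later} on {reg}")
```

Under these assumptions the sketch reports one flow dependency (I1 to I2 on R1) and two antidependencies (I3 to I4 on R4, I5 to I6 on R6). It covers data dependencies only; resource conflicts, such as I2 and I3 both needing the adder or I4 and I6 both needing the multiplier, have to be considered separately when drawing the pipeline diagrams.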