1. Given an architecture which has the following pipeline stages:
   • F D I X0 X1 W
   • Register fetch happens in the I stage of the pipeline and branch resolution happens in X1.
   • Assume a CPI of 1 and that a branch is the first instruction executing after a jump.
   a) How many instructions are flushed when a branch mispredict is taken? __________
   b) Give an explanation by drawing the pipeline:
Added by Samantha S.
Step 1
To answer the question, we will analyze the pipeline stages and the impact of a branch mispredict.
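The analysis can be checked with a small model of the pipeline. The sketch below is not the posted solution; it assumes one instruction is fetched per cycle with no stalls, that the branch outcome becomes usable only at the end of the X1 cycle, and that every younger instruction then in flight is squashed. It does not model the preceding jump. The names `flushed_on_mispredict` and `pipeline_diagram` are made up for illustration.

```python
STAGES = ["F", "D", "I", "X0", "X1", "W"]
RESOLVE_STAGE = "X1"  # branch resolution stage, from the problem statement


def flushed_on_mispredict(stages, resolve_stage):
    """Count the younger instructions in flight while the branch sits in
    resolve_stage. Assumption: one fetch per cycle, no stalls, so every
    earlier stage holds exactly one wrong-path instruction."""
    return stages.index(resolve_stage)


def pipeline_diagram(stages, resolve_stage):
    """Return one text row per instruction; '--' marks a squashed slot."""
    resolve_cycle = stages.index(resolve_stage)  # branch is fetched in cycle 0
    flushed = flushed_on_mispredict(stages, resolve_stage)
    labels = ["branch"]
    labels += [f"wrong+{i}" for i in range(1, flushed + 1)]
    labels += ["target"]  # first correct-path instruction after the redirect
    n_cycles = resolve_cycle + 1 + len(stages)
    rows = []
    for start, label in enumerate(labels):  # instruction k is fetched in cycle k
        wrong_path = label.startswith("wrong")
        cells = []
        for cycle in range(n_cycles):
            s = cycle - start  # stage index this instruction would occupy
            if s < 0 or s >= len(stages):
                cells.append("")
            elif wrong_path and cycle > resolve_cycle:
                cells.append("--")  # squashed after the branch resolves
            else:
                cells.append(stages[s])
        rows.append(f"{label:<8}" + " ".join(f"{c:<2}" for c in cells).rstrip())
    return rows


if __name__ == "__main__":
    print("flushed:", flushed_on_mispredict(STAGES, RESOLVE_STAGE))
    for row in pipeline_diagram(STAGES, RESOLVE_STAGE):
        print(row)
```

Under these assumptions the model reports 4 flushed instructions: the ones occupying F, D, I and X0 while the branch is in X1. If the redirect were assumed to reach the fetch stage within the X1 cycle itself, the count would be one lower.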
Recommended Videos
Consider the following sequence of instructions:

    ADD $R_1, R_2$      $R_1 \leftarrow R_1 + R_2$
    BEZ Target          Branch if Zero
    MUL $R_3, R_4$      $R_3 \leftarrow R_3 * R_4$
    MOVE $R_1, 10$      $R_1 \leftarrow 10$
    Target:

Assume that this program is executed on a 6-stage pipelined processor and each stage requires 1 clock cycle. Suppose that "branch not taken" prediction is used but the prediction is not fulfilled. The penalty will then be (branch outcome is known at the $5^{\text{th}}$ stage):
(A) 1 clock cycle
(B) 2 clock cycles
(C) 3 clock cycles
(D) 4 clock cycles
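The penalty here can be worked out by comparing fetch cycles. This is a sketch under one common convention, not an official answer key: the outcome is assumed to be usable only at the end of the stage that produces it, so the correct-path fetch starts in the following cycle. The function name `misprediction_penalty` is hypothetical.

```python
def misprediction_penalty(resolve_stage: int) -> int:
    """Cycles lost when the branch outcome is known in stage number
    resolve_stage (1-based). Assumes the redirect takes effect in the
    cycle after that stage completes."""
    branch_fetch_cycle = 1
    resolve_cycle = branch_fetch_cycle + resolve_stage - 1
    redirected_fetch_cycle = resolve_cycle + 1   # first correct-path fetch
    ideal_fetch_cycle = branch_fetch_cycle + 1   # fetch with a perfect prediction
    return redirected_fetch_cycle - ideal_fetch_cycle


if __name__ == "__main__":
    print(misprediction_penalty(5))
```

With the outcome known at the 5th stage, this convention gives a 4-cycle penalty, which corresponds to option (D).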
Computer Organization and Architecture
Instruction Pipelining
In designing a 9-stage instruction-pipelined architecture with forwarding capability, you are asked to consider the effect on CPI from all possible hazards, assuming that the ideal CPI is 1.

Structural hazards: None

Data hazards (without compiler's rescheduling):
- An instruction of type IA immediately followed by another instruction of type IB, and the number of clock cycle stalls is 2.
- An instruction of type IA immediately followed by another instruction of type IC, and the number of clock cycle stalls is 1.

Listed below is the percentage of occurrence of each combination that will lead to a stall:
- IA - IB: 10%
- IA - x - IB: 5%
- IA - IC: 10%

Control hazards (from branch instructions, disregarding those from jump or subroutine call instructions):
- The target address is calculated (PC + offset) in the 3rd stage.
- The condition is checked (to determine the next PC) in the 6th stage.
- An average of 15% of all instructions are branch instructions, among which 70% are 'taken'.

(a) What is the increase to CPI from the data hazards?
(b) Determine the CPI considering all the hazards if the pipeline is "frozen" (stalled) until the next PC is known for sure when executing a branch instruction.
(c) Repeat (b) if an 'assume-taken' (CPU assumes all branches are taken) approach is used for branch.
(d) Repeat (b) if an 'assume-not-taken' (CPU assumes all branches are not-taken) approach is used for branch.
(e) In (b), if without using the pipeline-freezing hardware, a rescheduling compiler is used to find instructions to insert into the 'branch-delay-slots', how many such slots does a compiler have to try to fill after each branch instruction in order to remove all the branch stalls (like the technique we use to fill the 'load-delay-slots' for data hazards), assuming that a nop will be placed into a slot that cannot be filled?
(f) Repeat (e) if an 'assume-taken' approach is used for branch.
(g) Repeat (e) if an 'assume-not-taken' approach is used for branch.
(h) In (e), suppose that 80% of all the 'data-hazard-delay-slots' and 60% of all the 'branch-delay-slots' can be filled by the compiler. What is the new CPI?
(i) If you are forced to use the 'assume-taken' approach and you are unable to move the condition-checking operation in the pipeline, determine all the stages that you can move the EA calculation operation to so that this approach outperforms the 'assume-not-taken' approach.

Show a systematic approach in solving this problem.
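Parts (a) through (d) reduce to weighted sums of stall cycles. The sketch below is one possible reading, not a confirmed solution. It assumes that a result produced in stage k costs k - 1 stall cycles, that the IA - x - IB case stalls for 1 cycle (the 2-cycle stall shortened by the one intervening instruction), and that a wrong guess costs the full wait for the condition check. All function names are hypothetical.

```python
# (frequency, stall cycles) per data-hazard pattern.
# The 1-cycle stall for IA - x - IB is an assumption, not stated in the problem.
DATA_HAZARDS = [(0.10, 2), (0.05, 1), (0.10, 1)]

BRANCH_FREQ = 0.15
TAKEN_FRAC = 0.70
TARGET_STAGE = 3      # PC + offset computed here
CONDITION_STAGE = 6   # branch condition checked here


def data_hazard_cpi():
    """Part (a): extra CPI from data hazards."""
    return sum(freq * stalls for freq, stalls in DATA_HAZARDS)


def branch_cpi(policy):
    """Extra CPI from branches, assuming stage k implies k - 1 stall cycles."""
    wait_condition = CONDITION_STAGE - 1
    wait_target = TARGET_STAGE - 1
    if policy == "freeze":
        per_branch = wait_condition
    elif policy == "assume-taken":
        # Correct guess still waits for the target; wrong guess waits for the condition.
        per_branch = TAKEN_FRAC * wait_target + (1 - TAKEN_FRAC) * wait_condition
    elif policy == "assume-not-taken":
        # Correct guess costs nothing; wrong guess waits for the condition.
        per_branch = TAKEN_FRAC * wait_condition
    else:
        raise ValueError(f"unknown policy: {policy}")
    return BRANCH_FREQ * per_branch


def total_cpi(policy, ideal_cpi=1.0):
    return ideal_cpi + data_hazard_cpi() + branch_cpi(policy)


if __name__ == "__main__":
    print("data hazards:", round(data_hazard_cpi(), 3))
    for policy in ("freeze", "assume-taken", "assume-not-taken"):
        print(policy, round(total_cpi(policy), 3))
```

Under these assumptions the data hazards add 0.35 to the CPI, and the total CPI comes out to 2.1 for freezing, 1.785 for assume-taken, and 1.875 for assume-not-taken. A different stall-counting convention would shift each figure.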
Sri K.
Assume that individual stages of the datapath have the following latencies:

IF: 250ps, ID: 350ps, EX: 150ps, MEM: 300ps, WB: 200ps

a. List the required stages for each of the following types of instructions: load, store, r-type, branch.
b. What is the execution time of each type of instruction assuming only the required stages execute for each instruction?
c. Assuming the same instruction mix listed in Problem 7, what is the average execution time across all instructions?
d. Assuming pipelining is used, what would be the necessary clock cycle time?
e. Assuming pipelining is used, what would be the execution time for a single load instruction to execute?
f. Use the average instruction execution time calculated in Part c of this problem to determine the overall speed-up gained by pipelining. Assume the processor continuously runs with a full pipeline and hazards are completely avoided.
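Parts a, b, d and e can be sketched directly from the latencies. The stage lists below assume the classic five-stage MIPS-style datapath (loads use every stage, stores skip WB, r-type skips MEM, branches finish in EX); the problem does not state this, so treat it as an assumption. Parts c and f need the instruction mix from Problem 7, which is not given here, so the mix is left as a parameter.

```python
LATENCY_PS = {"IF": 250, "ID": 350, "EX": 150, "MEM": 300, "WB": 200}

# Assumed stage usage for a classic five-stage datapath (part a).
REQUIRED_STAGES = {
    "load": ["IF", "ID", "EX", "MEM", "WB"],
    "store": ["IF", "ID", "EX", "MEM"],
    "r-type": ["IF", "ID", "EX", "WB"],
    "branch": ["IF", "ID", "EX"],
}


def execution_time_ps(kind):
    """Part b: sum of the latencies of the stages the instruction uses."""
    return sum(LATENCY_PS[s] for s in REQUIRED_STAGES[kind])


def pipelined_cycle_ps():
    """Part d: the clock must accommodate the slowest stage."""
    return max(LATENCY_PS.values())


def pipelined_latency_ps():
    """Part e: every instruction passes through all stages at the clock period."""
    return len(LATENCY_PS) * pipelined_cycle_ps()


def average_time_ps(mix):
    """Part c: mix maps instruction kind to its fraction (values not given here)."""
    return sum(frac * execution_time_ps(kind) for kind, frac in mix.items())


if __name__ == "__main__":
    for kind in REQUIRED_STAGES:
        print(kind, execution_time_ps(kind), "ps")
    print("cycle:", pipelined_cycle_ps(), "ps")
    print("load latency, pipelined:", pipelined_latency_ps(), "ps")
```

With these stage lists the unpipelined times are 1250 ps (load), 1050 ps (store), 950 ps (r-type) and 750 ps (branch); the pipelined clock is 350 ps, so a single load takes 1750 ps from fetch to write-back.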
Akash M.