
Question 1. Pipeline Stages

Consider a five-stage (fetch, decode, execute, memory, write back) pipelined processor.

| Pipeline Stage       | Latency of each stage (ps) |
|----------------------|----------------------------|
| Fetch (F0, F1)       | 240                        |
| Decode (D)           | 320                        |
| Execute (X0, X1, X2) | 280                        |
| Memory (M)           | 400                        |
| Write Back (W)       | 200                        |

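| Processor          | Cycle Time | Max Clock Frequency | Latency of Instruction | Throughput |
|--------------------|------------|---------------------|------------------------|------------|
| a) Baseline        |            |                     |                        |            |
| b) Faster memory   |            |                     |                        |            |
| c) Proposed scheme |            |                     |                        |            |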

Question 1.A Baseline

Fill out the first row in the table above for the given pipelined processor. Cycle time is the clock period at which this machine can run. Max clock frequency is the maximum frequency clock that can be run given the cycle time (lower frequencies are possible). Latency of instruction is the time it takes to get the first output after it starts. Throughput is the rate at which output is produced.
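The four definitions above can be turned into a small calculation. The following is a minimal sketch, not a model answer: it assumes that each sub-stage listed in the table (F0, F1, X0, X1, X2) has the full latency shown for its row, that the clock period is set by the slowest stage, and that the pipeline completes one instruction per cycle in steady state with no stalls. The function name `pipeline_metrics` and the `BASELINE` list are illustrative names, not part of the problem.

```python
def pipeline_metrics(stage_latencies_ps):
    """Return basic timing figures for an ideal pipeline (no stalls)."""
    cycle_ps = max(stage_latencies_ps)               # slowest stage sets the clock
    freq_ghz = 1000.0 / cycle_ps                     # 1 / period, converted ps -> GHz
    latency_ps = cycle_ps * len(stage_latencies_ps)  # one cycle per stage
    throughput_per_ns = 1000.0 / cycle_ps            # one instruction per cycle
    return {
        "cycle_ps": cycle_ps,
        "freq_ghz": freq_ghz,
        "latency_ps": latency_ps,
        "throughput_per_ns": throughput_per_ns,
    }


# ASSUMPTION: eight stages, each sub-stage taking the latency listed for its row.
BASELINE = [240, 240, 320, 280, 280, 280, 400, 200]  # F0 F1 D X0 X1 X2 M W

metrics = pipeline_metrics(BASELINE)
print(metrics)
```

Under these assumptions the Memory stage is the bottleneck. If the table is instead read as five stages with the listed latencies, pass a five-element list and the same function applies.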

Question 1.B Faster Memory

Suppose there is an optimization that reduces the Memory stage latency by 100 ps. However, this optimization increases the Write Back stage latency by 100 ps. How much would this improve the overall performance of the 8-stage pipeline? Fill out the second row in the table above for this new pipeline.
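One way to reason about this change is to recompute the slowest stage before and after the optimization and compare the clock periods. The sketch below is illustrative only and assumes, as above, that every sub-stage in the table has the latency listed for its row; the names `baseline`, `faster`, and `cycle_time_ps` are hypothetical.

```python
def cycle_time_ps(stages):
    """The clock period is bounded by the slowest pipeline stage."""
    return max(stages.values())


# ASSUMPTION: eight stages, each sub-stage taking the latency listed for its row.
baseline = {"F0": 240, "F1": 240, "D": 320, "X0": 280,
            "X1": 280, "X2": 280, "M": 400, "W": 200}

# Memory gets 100 ps faster, Write Back gets 100 ps slower.
faster = dict(baseline, M=baseline["M"] - 100, W=baseline["W"] + 100)

# With one instruction per cycle, speedup is the ratio of clock periods.
speedup = cycle_time_ps(baseline) / cycle_time_ps(faster)
bottleneck = max(faster, key=faster.get)

print(cycle_time_ps(faster), bottleneck, speedup)
```

Note that the gain is smaller than the 100 ps removed from Memory, because another stage becomes the new bottleneck once Memory is no longer the slowest.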

Question 1.C Proposed Scheme

If you had the option to rearrange the pipeline, how would you arrange it to achieve the maximum clock frequency?

Question 2. Datapath Bypassing

In this problem we will be investigating the implications of using a two-cycle pipelined integer ALU. Our new pipelined processor will have the following six stages:
• F - instruction fetch
• D - decode and read registers
• X0 - first half of the ALU operation
• X1 - second half of the ALU operation
• M - data memory read/write
• W - write registers

The figure on the next page shows the datapath of the new six-stage pipeline. Notice that this datapath currently does not allow any data bypassing. Also notice that conditional branches are resolved by the end of the X1 stage. Assume that this instruction set does not include a branch delay slot.

add r3, r1, r2
xor r7, r3, r4
and r5, r2, r3
xor r8, r8, r8
add r8, r7, r3
sw r5, 400(r3)

Question 2.A Implementing Bypassing

Modify the figure on the next page to implement the different types of bypassing.

Question 2.B Stalling

For the above assembly code, draw a pipeline diagram using stalls to resolve any data dependencies that you find.
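Before drawing the diagram, it can help to list the read-after-write (RAW) dependencies mechanically. The sketch below is a rough aid rather than a full hazard detector: it assumes the first register of every instruction except `sw` is the destination, that `sw` only reads its registers, and it reports each source register against its most recent earlier writer. Instructions are numbered from 1 in program order; `PROGRAM`, `parse`, and `raw_dependencies` are illustrative names.

```python
import re

PROGRAM = [
    "add r3, r1, r2",
    "xor r7, r3, r4",
    "and r5, r2, r3",
    "xor r8, r8, r8",
    "add r8, r7, r3",
    "sw r5, 400(r3)",
]


def parse(instr):
    """Split an instruction into (destination or None, list of sources)."""
    op, rest = instr.split(None, 1)
    regs = re.findall(r"r\d+", rest)
    if op == "sw":                # a store reads both registers, writes none
        return None, regs
    return regs[0], regs[1:]


def raw_dependencies(program):
    """Return (consumer, producer, register) triples, 1-based indices."""
    last_writer = {}
    deps = []
    for i, instr in enumerate(program, start=1):
        dest, srcs = parse(instr)
        for reg in dict.fromkeys(srcs):   # drop duplicate sources, keep order
            if reg in last_writer:
                deps.append((i, last_writer[reg], reg))
        if dest is not None:
            last_writer[dest] = i
    return deps


for consumer, producer, reg in raw_dependencies(PROGRAM):
    print(f"instruction {consumer} reads {reg} written by instruction {producer}")
```

Whether a given dependency actually causes a stall depends on the distance between producer and consumer and on when the register file is written and read, which the sketch does not model.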

Question 2.C Bypassing

For the above assembly code, draw a pipeline diagram using whichever bypass you think is valid for that case. Indicate which type of bypass you have used for each case.

[Figure: datapath of the six-stage pipeline, without bypassing]

Question 3. Pipelining and Hazards

Consider the MIPS assembly code given below. We want to run this code on a 5-stage pipelined processor, with some modifications. The processor is a typical 5-stage pipeline (F-D-X-M-W), with the following exceptions:

• The multiplier block used to execute the mul instruction is pipelined into three stages as shown:

[Figure: three-stage pipelined multiplier]

This means that a multiply instruction runs through the pipeline as follows: F-D-X0-X1-X2-M-W, and up to three multiply instructions may be in flight at a time. All other instruction types are blocked from the execute stage while any of the multiply stages are being used.

• The divider block used to execute the div instruction is iterative and takes five cycles as shown:

[Figure: iterative five-cycle divider]

This means that a divide instruction runs through the pipeline as follows: F-D-X0-X0-X0-X0-X0-M-W. All other instructions are blocked from the execute stage while a division is being done.

xor r0, r1, r1
addiu r1, r0, 16
j L1
loop: lw r3, 0(r2)
mul r4, r3, r3
div r3, r4, r3
mul r3, r3, r1
mul r3, r3, r0
addiu r0, r0, 2
sw r3, 0(r2)
addiu r2, r2, 4
L1: bne r0, r1, -9

Question 3.A Structural Hazards

Draw a pipeline diagram showing the execution of the MIPS code through the first iteration of the loop, without bypassing. Assume data hazards and structural hazards are resolved using only stalling. Assume the processor predicts branches as not taken until they are resolved. What is the CPI of the entire program?
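CPI for the whole program is total cycles divided by the number of instructions executed, so the dynamic instruction count is needed alongside the pipeline diagram. The sketch below only traces the loop counter to obtain that count. It assumes r0 behaves as an ordinary writable register, as the code implies (in real MIPS register 0 is hardwired to zero), and that the `bne` offset of -9 targets the `loop` label. The total cycle count is not computed here; it has to come from your own diagram and is passed in as a hypothetical parameter.

```python
def dynamic_instruction_count():
    """Trace r0 and r1 to count how many instructions actually execute."""
    r0 = 0          # xor r0, r1, r1 -> any value XOR itself is 0
    r1 = r0 + 16    # addiu r1, r0, 16
    count = 3       # xor, addiu, j
    iterations = 0
    while True:
        count += 1              # bne r0, r1, -9 executes
        if r0 == r1:            # branch not taken: program falls through
            break
        count += 8              # lw, mul, div, mul, mul, addiu, sw, addiu
        r0 += 2                 # addiu r0, r0, 2
        iterations += 1
    return iterations, count


def cpi(total_cycles, instruction_count):
    """Cycles per instruction; total_cycles comes from the pipeline diagram."""
    return total_cycles / instruction_count


iterations, executed = dynamic_instruction_count()
print(iterations, executed)
```

The count covers only instructions that complete; instructions fetched down a mispredicted path and then squashed add cycles but are not counted as executed instructions.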

Question 3.B Data Hazards

Draw a pipeline diagram similar to Part A, but now assume the processor has data bypassing. What is the CPI of the entire program?

Question 3.C Control Hazards

Will the assembly code shown above lead to control hazards? If yes, where do they occur and how can they be resolved?

Question 4. Out-of-Order Execution

This question is based on the same pipeline and code as question 3.

Consider the MIPS assembly code given below. We want to run this code on a 5-stage pipelined processor, with some modifications. The processor is a typical 5-stage pipeline (F-D-X-M-W), with the following exceptions:

• The multiplier block used to execute the mul instruction is pipelined into three stages as shown:

[Figure: three-stage pipelined multiplier]

This means that a multiply instruction runs through the pipeline as follows: F-D-X0-X1-X2-M-W, and up to three multiply instructions may be in flight at a time. All other instruction types are blocked from the execute stage while any of the multiply stages are being used.

• The divider block used to execute the div instruction is iterative and takes five cycles as shown:

[Figure: iterative five-cycle divider]

This means that a divide instruction runs through the pipeline as follows: F-D-X0-X0-X0-X0-X0-M-W. All other instructions are blocked from the execute stage while a division is being done.

• This processor supports out-of-order execution.

xor r0, r1, r1
addiu r1, r0, 16
j L1
loop: lw r3, 0(r2)
mul r4, r3, r3
div r3, r4, r3
mul r3, r3, r1
mul r3, r3, r0
addiu r0, r0, 2
sw r3, 0(r2)
addiu r2, r2, 4
L1: bne r0, r1, -9

Question 4.A Out-of-Order with no Bypassing

Draw a pipeline diagram showing the out-of-order execution of the MIPS code through the first iteration of the loop, without bypassing. Assume data hazards and structural hazards are resolved using only stalling. Assume the processor predicts branches as not taken until they are resolved.

Question 4.B Out-of-Order with Bypassing

Draw a pipeline diagram similar to Part A, showing the out-of-order execution, but now assume the processor has data bypassing.

Question 4.C Program Latency

Determine the number of cycles it takes to execute all iterations of the loop for both the scenario in Part A and the scenario in Part B. Justify your answer.
