Assignment

1. Loop Unrolling

Consider the following loop:

loop:
    l.d     f4,0(r1)      l1
    l.d     f6,0(r2)      l2
    mul.d   f4,f4,f0      m1
    mul.d   f6,f6,f2      m2
    add.d   f4,f4,f6      a1
    s.d     f4,0(r1)      s1
    daddui  r1,r1,#-8     sub1
    daddui  r2,r2,#-8     sub2
    bnez    r1,loop       br
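As a reading aid, the loop can be paraphrased in Python. This is only a sketch of what the nine instructions compute, not part of the assignment: the function name `run_loop` and the dictionary `mem` (byte address to double) are hypothetical, and the sketch assumes r1 starts at a positive multiple of 8 so that the `bnez` eventually falls through.

```python
def run_loop(mem, r1, r2, f0, f2):
    """Mimic the loop: mem[r1] = mem[r1]*f0 + mem[r2]*f2, stepping down by 8 bytes.

    `mem` is a hypothetical dict mapping byte addresses to doubles.
    """
    while True:
        f4 = mem[r1]      # l1:   l.d    f4,0(r1)
        f6 = mem[r2]      # l2:   l.d    f6,0(r2)
        f4 = f4 * f0      # m1:   mul.d  f4,f4,f0
        f6 = f6 * f2      # m2:   mul.d  f6,f6,f2
        f4 = f4 + f6      # a1:   add.d  f4,f4,f6
        mem[r1] = f4      # s1:   s.d    f4,0(r1)
        r1 -= 8           # sub1: daddui r1,r1,#-8
        r2 -= 8           # sub2: daddui r2,r2,#-8
        if r1 == 0:       # br:   bnez r1,loop falls through when r1 is zero
            break
    return mem
```

Note that s1 writes through the old value of r1, because the store precedes sub1 in program order.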

Note: By default, FP arithmetic instructions have 4 x-boxes.

a) Using the names 'l1' to 's1' for the first six instructions in the loop body, draw the flow-dependence graph for these instructions. Label each arrow with the dependence gap between the producer and the consumer.
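The edges of such a graph can be found mechanically by tracking the most recent writer of each FP register. The sketch below does only that; it does not compute the dependence gaps, which depend on the course's pipeline model. The tuple encoding `BODY` and the function `flow_edges` are hypothetical names, and integer registers are left out because none of the six instructions writes r1 or r2.

```python
# (short name, destination FP register or None, source FP registers)
BODY = [
    ("l1", "f4", []),
    ("l2", "f6", []),
    ("m1", "f4", ["f4", "f0"]),
    ("m2", "f6", ["f6", "f2"]),
    ("a1", "f4", ["f4", "f6"]),
    ("s1", None, ["f4"]),
]

def flow_edges(instrs):
    """Return (producer, consumer, register) triples for read-after-write pairs."""
    last_writer = {}   # register -> name of the latest instruction that wrote it
    edges = []
    for name, dest, srcs in instrs:
        for reg in srcs:
            if reg in last_writer:          # value comes from an earlier instruction
                edges.append((last_writer[reg], name, reg))
        if dest is not None:
            last_writer[dest] = name        # later readers now depend on this one
    return edges

for producer, consumer, reg in flow_edges(BODY):
    print(f"{producer} -> {consumer} via {reg}")
```

Registers f0 and f2 produce no edges here because nothing in the loop body writes them.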

In what follows, focus on three flow-dependence types: i) FP arith to FP arith, ii) FP arith to FP store, and iii) FP load to FP arith. Denote the number of m-boxes in a memory reference by '#m', and the number of x-boxes in an FP arithmetic instruction by '#x'.

b) For each of the three designated flow-dependence types, give the number of stalls in an adjacent producer-consumer pair as a function of '#m' and '#x'.

c) Suppose #m = 1 and #x = 4. How many stalls occur in one iteration of the loop if it is executed exactly as written?

d) Unroll the loop twice. If one reschedules the unrolled loop optimally, how many stalls are left? (Keep the branch as the last instruction. Show the rescheduled code using the _short_ names).

e) Increase the 'mul-add dependence gap' to 5 cycles, leaving everything else unchanged. Unroll the loop three times. If one reschedules the unrolled loop optimally, how many stalls are left? (Keep the branch as the last instruction. Show the rescheduled code using the _short_ names).

2. Dynamic Instruction Scheduling I

Imagine that reservation stations only track whether floating-point operands are valid, and that integer operands appear by magic whenever needed.

a) Dispatch instructions 'l1' to 's1' to reservation stations rs1(l1) to rs6(s1). Show the contents of each reservation station. Indicate both valid ("value") and pending ("ear") operands in each station. For the loads, you may make all operand entries 'blank'. That is, mark all load dependences as resolved. For the store, just invent a regular single-operand reservation station. The value of 'test' is either "may issue" or "may not issue". The value of 'free' is either "free" or "not free".
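One way to picture the bookkeeping is a small Python model in which each operand is either a valid value or an "ear" listening for a pending result. This is a sketch under assumptions, not the course's reference design: the `Station` class, the `dispatch` function, and the tuple encoding are hypothetical, the operand strings borrow the `result[...]` and `val[...]` syntax used in part b), and load operands are left blank as the problem permits.

```python
from dataclasses import dataclass, field

@dataclass
class Station:
    name: str                                      # e.g. "rs3(m1)"
    operands: list = field(default_factory=list)   # "val[...]" or "result[...]"
    free: str = "not free"

    @property
    def test(self):
        # An instruction may issue only when no operand is still an "ear".
        ready = all(op.startswith("val[") for op in self.operands)
        return "may issue" if ready else "may not issue"

# (short name, destination FP register or None, source FP registers)
BODY = [
    ("l1", "f4", []),
    ("l2", "f6", []),
    ("m1", "f4", ["f4", "f0"]),
    ("m2", "f6", ["f6", "f2"]),
    ("a1", "f4", ["f4", "f6"]),
    ("s1", None, ["f4"]),
]

def dispatch(instrs):
    producer = {}    # FP register -> tag of the station that will write it
    stations = []
    for i, (name, dest, srcs) in enumerate(instrs, start=1):
        tag = f"rs{i}({name})"
        ops = [f"result[{producer[r]}]" if r in producer else f"val[{r}]"
               for r in srcs]
        stations.append(Station(tag, ops))
        if dest is not None:
            producer[dest] = tag    # later readers of dest listen to this station
    return stations

for st in dispatch(BODY):
    print(st.name, st.operands, st.test, st.free)
```

Because each later reader records the tag of the most recent writer rather than the register name, reuse of f4 by l1, m1, and a1 does not confuse the consumers.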

b) After both loads have completed, but no further action has occurred, show the contents of each reservation station. As before, you may mark some reservation stations as 'free'. Syntax: result[rs9(a9)]; val[f12].

c) At this point, dispatch the two loads of the second iteration, viz., 'l3' and 'l4', into reservation stations 'rs1' and 'rs2'. Moreover, let instruction 'm1' of the first iteration complete. Show the contents of each reservation station.

3. Dynamic Instruction Scheduling II

a) Instruction 'op' has been dispatched to reservation station alpha. What statement must be proved to show that all its flow dependences are respected?

b) Instructions 'op1' and 'op2' are an antidependent pair. The earlier 'op1' is dispatched to reservation station alpha. The later 'op2' is dispatched to reservation station beta. What statement must be proved to show that the antidependence is respected?

c) Instructions 'op1' and 'op2' are an output-dependent pair. The earlier 'op1' is dispatched to reservation station alpha. The later 'op2' is dispatched to reservation station beta. What statement must be proved to show that the output dependence is respected?