Q1. The data file airfares.txt on the book web site gives the one-way airfare (in US dollars) and distance (in miles) from city A to 17 other cities in the US. Interest centers on modeling airfare as a function of distance. The first model fit to the data was

Fare = β0 + β1Distance + e           (3.7)

(a) Based on the output for model (3.7) a business analyst concluded the following:

The regression coefficient of the predictor variable, Distance is highly statistically significant and the model explains 99.4% of the variability in the Y-variable, Fare. Thus model (1) is a highly effective model for both understanding the effects of Distance on Fare and for predicting future values of Fare given the value of the predictor variable, Distance.

Provide a detailed critique of this conclusion.

(b) Does the ordinary straight line regression model (3.7) seem to fit the data well? If not, carefully describe how the model can be improved.
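
Part (a) turns on the fact that a large R² and a significant slope do not by themselves show that a straight line is the right shape. A minimal sketch of that check follows, using made-up numbers rather than the airfares.txt data. The values of `x` and `y` and the helper names `fit_line` and `r_squared` are assumptions for illustration. The idea is to fit the line and then look at the sign pattern of the residuals.

```python
def fit_line(x, y):
    """Least-squares estimates (b0, b1) for y = b0 + b1*x + e."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return ybar - b1 * xbar, b1


def r_squared(y, fitted):
    """Proportion of the variability in y explained by the fitted values."""
    ybar = sum(y) / len(y)
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum((yi - ybar) ** 2 for yi in y)
    return 1 - rss / sst


# Hypothetical data with mild curvature (NOT the airfare data).
x = list(range(1, 11))
y = [50 + 10 * xi + 0.5 * xi ** 2 for xi in x]

b0, b1 = fit_line(x, y)
fitted = [b0 + b1 * xi for xi in x]
resid = [yi - fi for yi, fi in zip(y, fitted)]
r2 = r_squared(y, fitted)

print(f"R^2 = {r2:.4f}")
print("residual signs:", "".join("+" if r > 0 else "-" for r in resid))
```

On these made-up values the line explains more than 99% of the variability, yet the residuals are positive at both ends and negative in the middle, which is the pattern a missing curvature term would leave.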

Q3. The price of advertising (and hence revenue from advertising) is different from one consumer magazine to another. Publishers of consumer magazines argue that magazines that reach more readers create more value for the advertiser. Thus, circulation is an important factor that affects revenue from advertising. In this exercise, we are going to investigate the effect of circulation on gross advertising revenue. The data are for the top 70 US magazines ranked in terms of total gross advertising revenue in 2006. In particular, we will develop regression models to predict gross advertising revenue per advertising page in 2006 (in thousands of dollars) from circulation (in millions). The data were obtained from http://adage.com and are given in the file AdRevenue.csv, which is available on the book web site. Prepare your answers to parts A, B and C in the form of a report.

Part A

(a) Develop a simple linear regression model based on least squares that predicts advertising revenue per page from circulation (i.e., feel free to transform either the predictor or the response variable or both variables). Ensure that you provide justification for your choice of model.

(b) Find a 95% prediction interval for the advertising revenue per page for magazines with the following circulations:

(i) 0.5 million

(ii) 20 million

(c) Describe any weaknesses in your model.
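
For part (b), one common route when both variables are logged is to build the interval on the log scale and exponentiate its endpoints. A minimal sketch under that assumption follows, again with made-up numbers instead of AdRevenue.csv. `prediction_interval` is a hypothetical helper, and the t critical value is passed in by hand (2.776 is the 97.5th percentile of t with 4 degrees of freedom, matching the six illustrative points).

```python
import math


def prediction_interval(x, y, x_new, t_crit):
    """Prediction interval for a new response at x_new under y = b0 + b1*x + e.

    t_crit is the t critical value with len(x) - 2 degrees of freedom.
    Returns (lower, point prediction, upper).
    """
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    b0 = ybar - b1 * xbar
    rss = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(rss / (n - 2))  # residual standard error
    pred = b0 + b1 * x_new
    half = t_crit * s * math.sqrt(1 + 1 / n + (x_new - xbar) ** 2 / sxx)
    return pred - half, pred, pred + half


# Hypothetical circulation (millions) and revenue per page (thousands of dollars).
circ = [0.5, 1.0, 2.0, 4.0, 8.0, 16.0]
rev = [60.0, 85.0, 130.0, 170.0, 260.0, 340.0]

log_circ = [math.log(c) for c in circ]
log_rev = [math.log(r) for r in rev]

intervals = {}
for c_new in (0.5, 20.0):
    lo, mid, hi = prediction_interval(log_circ, log_rev, math.log(c_new), t_crit=2.776)
    # Exponentiating the endpoints gives an interval on the original scale.
    intervals[c_new] = (math.exp(lo), math.exp(mid), math.exp(hi))
    print(c_new, [round(v, 1) for v in intervals[c_new]])
```

The back-transformed interval is always positive and is not symmetric about the point prediction. A circulation of 20 million lies outside the range of these illustrative points, so its interval is relatively wider and rests on extrapolation.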

Part B

(a) Develop a polynomial regression model based on least squares that directly predicts the effect on advertising revenue per page of an increase in circulation of 1 million people (i.e., do not transform either the predictor or the response variable). Ensure that you provide detailed justification for your choice of model. [Hint: Consider polynomial models of order up to 3.]

(b) Find a 95% prediction interval for the advertising page cost for magazines with the following circulations:

(i) 0.5 million

(ii) 20 million

(c) Describe any weaknesses in your model.
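
The hint in Part B (a) suggests comparing polynomial fits of order 1, 2 and 3. A minimal sketch of that comparison, solving the normal equations directly, is given here. The data are made up, and `poly_fit` and `rss` are hypothetical helper names. In practice a statistics package would also report the t-tests or partial F-tests needed to justify the chosen order.

```python
def poly_fit(x, y, order):
    """Least-squares coefficients [b0, b1, ..., b_order] via the normal equations."""
    p = order + 1
    # Build X'X and X'y for the design matrix with columns 1, x, x^2, ...
    a = [[sum(xi ** (i + j) for xi in x) for j in range(p)] for i in range(p)]
    b = [sum(yi * xi ** i for xi, yi in zip(x, y)) for i in range(p)]
    # Gaussian elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, p):
            f = a[r][col] / a[col][col]
            for c in range(col, p):
                a[r][c] -= f * a[col][c]
            b[r] -= f * b[col]
    # Back substitution.
    coef = [0.0] * p
    for i in range(p - 1, -1, -1):
        tail = sum(a[i][j] * coef[j] for j in range(i + 1, p))
        coef[i] = (b[i] - tail) / a[i][i]
    return coef


def rss(x, y, coef):
    """Residual sum of squares for the fitted polynomial."""
    return sum(
        (yi - sum(c * xi ** k for k, c in enumerate(coef))) ** 2
        for xi, yi in zip(x, y)
    )


# Hypothetical data generated from a quadratic, with a small fixed perturbation.
x = [0.5, 1.0, 2.0, 3.0, 5.0, 8.0, 12.0, 20.0]
noise = [1.0, -1.0, 0.5, -0.5, 1.0, -1.0, 0.5, -0.5]
y = [40 + 30 * xi - 0.6 * xi ** 2 + e for xi, e in zip(x, noise)]

rss_by_order = {k: rss(x, y, poly_fit(x, y, k)) for k in (1, 2, 3)}
for k, value in rss_by_order.items():
    print(f"order {k}: RSS = {value:.2f}")
```

A large drop in RSS from order 1 to order 2 followed by a negligible drop at order 3 would point toward the quadratic. RSS can never increase when a term is added, so the drop alone is not a justification for the higher order.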

Part C

(a) Compare the model in Part A with that in Part B. Decide which provides a better model. Give reasons to justify your choice.

(b) Compare the prediction intervals in Part A with those in Part B. In each case, decide which interval you would recommend. Give reasons to justify each choice.

Q4. Tryfos (1998, p. 57) considers a real example involving the management at a Canadian port on the Great Lakes who wish to estimate the relationship between the volume of a ship's cargo and the time required to load and unload this cargo. It is envisaged that this relationship will be used for planning purposes as well as for making comparisons with the productivity of other ports. Records of the tonnage loaded and unloaded as well as the time spent in port by 31 liquid-carrying vessels that used the port over the most recent summer are available. The data are available on the book website in the file glakes.txt. The first model fit to the data was

Time = β0 + β1Tonnage + e           (3.8)

On the following pages is some output from fitting model (3.8) as well as some plots of Tonnage and Time (Figures 3.42 and 3.43).

(a) Does the straight line regression model (3.8) seem to fit the data well? If not, list any weaknesses apparent in model (3.8).

(b) Suppose that model (3.8) was used to calculate a prediction interval for Time when Tonnage = 10,000. Would the interval be too short, too long or about right (i.e., valid)? Give a reason to support your answer.
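
For reference in part (b), the usual 100(1 − α)% prediction interval for a new response at x* under the simple linear regression model takes the form below. The notation is standard rather than taken from the output for model (3.8): S is the residual standard error, SXX is the sum of squared deviations of Tonnage from its mean, and n = 31 here.

```latex
\hat{y}^{*} \pm t_{\alpha/2,\,n-2}\; S \sqrt{1 + \frac{1}{n} + \frac{(x^{*} - \bar{x})^{2}}{SXX}},
\qquad S^{2} = \frac{RSS}{n-2},
\qquad SXX = \sum_{i=1}^{n} (x_i - \bar{x})^{2}
```

Because S pools the residual variability across all Tonnage values, the interval is valid only if the error variance is constant. Whether that holds at Tonnage = 10,000 has to be judged from the plots.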

Q5. An analyst for the auto industry has asked for your help in modeling data on the prices of new cars. Interest centers on modeling suggested retail price as a function of the cost to the dealer for 234 new cars. The data set, which is available on the book website in the file cars04.csv, is a subset of the data from http://www.amstat.org/publications/jse/datasets/04cars.txt

(Accessed March 12, 2007)

The first model fit to the data was

Suggested Retail Price = β0 + β1 Dealer Cost + e           (3.10)

(a) Based on the output for model (3.10) the analyst concluded the following:

Since the model explains just more than 99.8% of the variability in Suggested Retail Price and the coefficient of Dealer Cost has a t-value greater than 412, model (1) is a highly effective model for producing prediction intervals for Suggested Retail Price.

Provide a detailed critique of this conclusion.

(b) Carefully describe all the shortcomings evident in model (3.10). For each shortcoming, describe the steps needed to overcome the shortcoming.

The second model fitted to the data was

log(Suggested Retail Price) = β0 + β1log(Dealer Cost) + e                (3.11)

Output from model (3.11) and plots (Figure 3.47) appear on the following pages.

(c) Is model (3.11) an improvement over model (3.10) in terms of predicting Suggested Retail Price? If so, please describe all the ways in which it is an improvement.

(d) Interpret the estimated coefficient of log(Dealer Cost) in model (3.11).

(e) List any weaknesses apparent in model (3.11).
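
For part (d), the slope in a model with both variables logged is usually read as an elasticity: roughly the percentage change in the response associated with a 1% change in the predictor. A minimal sketch of that arithmetic is shown here. `pct_change_in_response` is a hypothetical helper and the slope values are illustrative, not the estimates from model (3.11).

```python
def pct_change_in_response(beta1, pct_change_x):
    """Percentage change in Y implied by log(Y) = b0 + beta1*log(x)
    when x changes by pct_change_x percent (error term held fixed)."""
    ratio = (1 + pct_change_x / 100) ** beta1
    return (ratio - 1) * 100


# Illustrative slopes only; substitute the estimate from the fitted model.
for beta1 in (0.5, 1.0, 1.5):
    print(beta1, round(pct_change_in_response(beta1, 1.0), 3))
```

For small changes in the predictor the result is close to beta1 times the percentage change, which is why the slope itself is often quoted directly as the elasticity.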

Q8. Chu (1996) discusses the development of a regression model to predict the price of diamond rings from the size of their diamond stones (in terms of their weight in carats). Data on both variables were obtained from a full page advertisement placed in the Straits Times newspaper by a Singapore-based retailer of diamond jewelry. Only rings made with 20 carat gold and mounted with a single diamond stone were included in the data set. There were 48 such rings of varying designs. (Information on the designs was available but not used in the modeling.)

The weights of the diamond stones ranged from 0.12 to 0.35 carats (a one carat diamond stone weighs 0.2 gram) and were priced between $223 and $1086. The data are available on the course web site in the file diamonds.txt.

Part 1

(a) Develop a simple linear regression model based on least squares that directly predicts Price from Size (that is, do not transform either the predictor or the response variable). Ensure that you provide justification for your choice of model.

(b) Describe any weaknesses in your model.

Part 2

(a) Develop a simple linear regression model that predicts Price from Size (i.e., feel free to transform either the predictor or the response variable or both variables). Ensure that you provide detailed justification for your choice of model.

(b) Describe any weaknesses in your model.

Part 3

Compare the model in Part 1 with that in Part 2. Decide which provides a better model. Give reasons to justify your choice.
