
Harvey Weinstein has employed you as a consultant to brief him on film performance in the Australian cinema industry over the years 1997-2007. You have obtained data on film revenues and associated variables from the Motion Picture Distributors Association of Australia (MPDAA). The file 'Film data.xls' contains data on the top 100 films at the Australian box office for each of the years 1997-2007. However, because budget data are missing for some films, the sample is reduced to the 992 films with complete information. The following table describes the variables in the 'Film data.xls' file:

[Table of variable descriptions (image: 2221_Confidence interval for expected revenue.png)]

a) To get a sense of the data, provide two well-labelled scatter plots: 1) revenue vs. budget, and 2) revenue vs. screens. Also provide a correlation matrix of these three variables.
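As a rough illustration of the correlation matrix asked for in part a), here is a minimal sketch that assumes the three columns have already been read into plain Python lists. The numbers are made-up placeholders, not MPDAA data, and the helper names `pearson` and `correlation_matrix` are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def correlation_matrix(columns):
    """Return {row_name: {col_name: r}} for a dict of named columns."""
    names = list(columns)
    return {r: {c: pearson(columns[r], columns[c]) for c in names} for r in names}

# Placeholder values only (hypothetical, NOT taken from 'Film data.xls').
data = {
    "revenue": [1.2e6, 3.5e6, 0.8e6, 9.1e6, 4.4e6],
    "budget":  [2.0e7, 6.0e7, 1.0e7, 1.5e8, 8.0e7],
    "screens": [120, 250, 90, 400, 300],
}
corr = correlation_matrix(data)
for name, row in corr.items():
    print(name, [round(row[c], 3) for c in data])
```

The matrix is symmetric with ones on the diagonal, so only the three off-diagonal pairs carry information.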

b) An industry expert tells you that budget and screens are the most important determinants of a film's success at the box office (advertising is important too, the expert says, but you don't have data on that). Estimate a regression with 'revenue' as the dependent variable and 'budget' and 'screens' as the independent variables. Interpret the regression coefficients.
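The regression in part b) can be sketched without any statistics package by solving the normal equations (X'X)b = X'y. The toy data below are hypothetical and are generated exactly from revenue = 1 + 2*budget + 3*screens, so the known coefficients should be recovered; the helper names `solve` and `ols` are illustrative only.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def ols(X, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(XtX, Xty)

# Hypothetical toy data, NOT the film data set.
budget = [1.0, 2.0, 3.0, 4.0, 5.0]
screens = [2.0, 1.0, 4.0, 3.0, 6.0]
revenue = [1 + 2 * b + 3 * s for b, s in zip(budget, screens)]
X = [[1.0, b, s] for b, s in zip(budget, screens)]  # leading 1.0 is the constant
b0, b1, b2 = ols(X, revenue)
print(round(b0, 6), round(b1, 6), round(b2, 6))
```

In a levels model like this, each slope is read as the expected change in revenue for a one-unit increase in that variable, holding the other variable fixed.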

c) With your regression results of part b), provide residual plots vs. each independent variable. Also, provide a histogram of the (standardised) residuals. Comment on the nature of the residuals based upon these three plots.

d) You show your results and plots to your econometrics professor at university and he tells you that you may have an issue with non-normal errors caused by outliers. He suggests you transform the revenue, screens, and budget data into natural logarithms and re-estimate the equation, which you do. Provide results of the new regression and interpret the regression coefficients [Hint: d lnY/d lnX = (dY/dX)*(X/Y) = (dY/Y)/(dX/X)].
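The hint in part d) says that a slope in a log-log regression is an elasticity. A small numerical check, assuming a hypothetical constant-elasticity relationship Y = 3 * X**0.5 (not the film data): the slope of ln Y on ln X should be 0.5, and a 1% rise in X should raise Y by roughly 0.5%.

```python
from math import log

def slope(x, y):
    """Simple-regression slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

# Hypothetical constant-elasticity data: Y = 3 * X**0.5
X = [10.0, 20.0, 40.0, 80.0, 160.0]
Y = [3 * x ** 0.5 for x in X]

# Slope of ln Y on ln X, i.e. the elasticity of Y with respect to X
elasticity = slope([log(x) for x in X], [log(y) for y in Y])

# Direct check: percentage change in Y when X rises by 1%
pct_change = (3 * (X[0] * 1.01) ** 0.5 / Y[0] - 1) * 100
print(round(elasticity, 6), round(pct_change, 4))
```

The two numbers agree only approximately because the elasticity reading is exact for infinitesimal changes and approximate for a finite 1% change.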

e) With your new regression results of part d), provide residual plots vs. each independent variable. Also, provide a histogram of the (standardised) residuals. Comment on the nature of the new residuals based upon these three plots.

f) You discuss the results with the industry expert again. They tell you that apart from screens and budget, films with A-list stars and films which are sequels generally earn more at the box office. Therefore you decide to include these dummy variables in your updated model from part d). Are these new variables individually significant in the updated model? Do their coefficients' signs support the expert's intuition?

g) You are about to prepare your final report when an old friend calls you on the phone. You tell your friend about the project because they go to film school and you suspect they might find it interesting. Your friend wonders whether genre and rating might be important as well. You have data on these variables but need to construct some more dummy variables. Treating 'Other' and 'G' as the base categories for 'genre' and 'rating', respectively, estimate the new model and carry out a partial F-test, at the 5% level of significance, of whether the extra variables are worth adding.
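The partial F statistic in part g) compares the residual sum of squares of the smaller model from part f) with that of the larger model. A minimal sketch of the arithmetic follows; the sums of squares, the number of added dummies `q`, and the sample size are hypothetical values chosen to make the calculation easy to follow, not results from the film data.

```python
def partial_f(sse_restricted, sse_full, q, n, k_full):
    """Partial F statistic for q extra regressors.

    sse_restricted: residual sum of squares of the smaller model
    sse_full:       residual sum of squares of the larger model
    q:              number of regressors added to get the larger model
    n:              number of observations
    k_full:         regressors in the larger model, excluding the constant,
                    so the denominator degrees of freedom are n - k_full - 1
    """
    numerator = (sse_restricted - sse_full) / q
    denominator = sse_full / (n - k_full - 1)
    return numerator / denominator

# Hypothetical numbers purely to illustrate the calculation.
F = partial_f(sse_restricted=520.0, sse_full=500.0, q=4, n=105, k_full=4)
print(F)
```

The statistic is then compared with the 5% critical value of the F distribution with q and n - k_full - 1 degrees of freedom; the extra variables are judged worthwhile only if F exceeds it.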

h) You decide to present the model you estimated in f) to Harvey. You ring him on the phone to request a meeting and let him know that you've completed the work. He makes one more request of you: to provide a point estimate of revenue for a film he is planning to release in Australian cinemas soon. The film has a budget of $200,000,000, will be released on a maximum of 250 screens, features an A-list actor, and is also a sequel. Calculate the prediction of how much revenue this film might earn given this information (Hint: be mindful of the (natural) log transformations you have made).
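The hint in part h) can be made concrete as follows: the log-log model predicts ln(revenue), so the fitted value has to be exponentiated to get back to dollars. The coefficients below are hypothetical placeholders, not estimates from the film data, and the dictionary keys are illustrative names.

```python
from math import exp, log

def predict_revenue(coef, budget, screens, a_list, sequel):
    """Naive point prediction from a log-log model with two dummies."""
    ln_rev = (coef["const"]
              + coef["ln_budget"] * log(budget)
              + coef["ln_screens"] * log(screens)
              + coef["a_list"] * a_list
              + coef["sequel"] * sequel)
    return exp(ln_rev)  # undo the natural-log transformation

# Hypothetical coefficients for illustration only (NOT estimated values).
coef = {"const": 2.0, "ln_budget": 0.3, "ln_screens": 1.0,
        "a_list": 0.2, "sequel": 0.4}
revenue_hat = predict_revenue(coef, budget=200_000_000, screens=250,
                              a_list=1, sequel=1)
print(round(revenue_hat, 2))
```

Note that budget and screens enter in logs while the two dummies enter as 0 or 1. This sketch simply exponentiates the fitted log value and applies no further retransformation adjustment.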

i) Before the final meeting with Harvey, you remember from your ECMT1010 course that it would be better to provide an estimate of the average value of revenue (for the given values of the independent variables) by presenting a confidence interval for the conditional mean (recall equation 13.12 in Black 2E/3E). Because you've been paying close attention in class, you realise the confidence intervals computed by KaddStat are not the ones you are after and the equation you learnt in ECMT1010 is only suitable for simple regression, and not multiple regression.

You pay another visit to your econometrics professor, who is rushing out the door on the way to the faculty Christmas party. He knows you've been learning some matrix algebra, so he scribbles down an equation in matrix form for the confidence interval for the conditional mean:

[Equation (image: 861_Confidence interval for expected revenue1.png)]

where ŷ_0 is the point prediction of revenue (which you found in part h); t_{α/2, n-k-1} is the t-value for a two-tailed test with level of significance α and n-k-1 degrees of freedom; σ̂ is the estimated standard error of the estimate; x_0 is the (5x1) vector of values for the independent variables (i.e. the four given by Harvey, plus the constant term); and X is the (992x5) matrix of all observations for the independent variables (including the constant).
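The professor's equation survives here only as an image reference. Given the symbols defined above, it is presumably the standard textbook interval for the conditional mean in multiple regression, which can be written as follows (this is an assumption based on those definitions, not a transcription of the image):

```latex
\hat{y}_0 \;\pm\; t_{\alpha/2,\,n-k-1}\;\hat{\sigma}\,
\sqrt{x_0^{\top}\,(X^{\top}X)^{-1}\,x_0}
```

With X of dimension 992x5 and x_0 of dimension 5x1, the product under the square root is a scalar.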

Using the matrix multiplication (MMULT) and matrix inverse (MINVERSE) functions in Excel, provide the following:

[Image: 785_Confidence interval for expected revenue2.png]

iii) What are the upper and lower bounds of a 95% confidence interval for expected revenue in Australian dollars?
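The matrix steps can be sketched outside Excel as well. The code below assumes the interval has the standard form ŷ_0 ± t * σ̂ * sqrt(x_0'(X'X)^-1 x_0), and it solves (X'X)z = x_0 rather than forming the inverse explicitly, which gives the same quadratic form as MINVERSE followed by MMULT. The toy design matrix (a constant and one regressor), `y0_hat`, `sigma_hat` and the helper names are all hypothetical.

```python
from math import sqrt

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def mean_ci(X, x0, y0_hat, sigma_hat, t_crit):
    """Return (lower, upper, h) where h = x0'(X'X)^-1 x0."""
    k = len(x0)
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    z = solve(XtX, x0)                     # z = (X'X)^-1 x0
    h = sum(a * b for a, b in zip(x0, z))  # scalar quadratic form
    half_width = t_crit * sigma_hat * sqrt(h)
    return y0_hat - half_width, y0_hat + half_width, h

# Hypothetical toy example: constant plus one regressor, five observations.
X = [[1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0], [1.0, 5.0]]
x0 = [1.0, 3.0]  # evaluate at the sample mean of the regressor
# 3.182 is the two-tailed 5% t-value for n - k - 1 = 3 degrees of freedom.
lo, hi, h = mean_ci(X, x0, y0_hat=10.0, sigma_hat=2.0, t_crit=3.182)
print(round(h, 6), round(lo, 4), round(hi, 4))
```

In this toy case h reduces to 1/n = 0.2 because x_0 sits at the sample mean, which matches the simple-regression formula. If the dependent variable is in natural logs, as in the model from part f), the bounds come out in log units and would need to be exponentiated to express them in dollars.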

Attachment:- Film-data.xls

  • Category:- Statistics and Probability
  • Reference No.:- M9824126
