Statistics General Linear Model Midterm Exam

Q1. There is a series of papers by de Souza et al. on generalized linear models in astronomy. These articles can be found either through the links provided below or on Blackboard, where three PDF files have been uploaded. In this question, we focus on the second paper. You may read the other papers if you are interested.

  • R.S. de Souza, E. Cameron, M. Killedar, J. Hilbe, R. Vilalta, U. Maio, V. Biffi, B. Ciardi, J.D. Riggs (2015). The overlooked potential of generalized linear models in astronomy, I: binomial regression, Astronomy and Computing, 12, 21-32.
  • J. Elliott, R.S. de Souza, A. Krone-Martins, E. Cameron, E.E.O. Ishida, J. Hilbe (2015). The overlooked potential of generalized linear models in astronomy, II: gamma regression and photometric redshifts. Astronomy and Computing, 10, 61-72.
  • R. S. de Souza, J. M. Hilbe, B. Buelens, J. D. Riggs, E. Cameron, E. E. O. Ishida, A. L. Chies-Santos, M. Killedar (2015). The overlooked potential of generalized linear models in astronomy, III: Bayesian negative binomial regression and globular cluster populations. Monthly Notices of the Royal Astronomical Society, 453, 1928-1940.

(a) Read the second paper, by Elliott et al., on gamma regression and photometric redshifts. Write a brief summary of Section 2 (overview of regression methods), pages 62-64.

(b) Appendix A (pages 68-69) provides instructions for performing the photometric redshift estimation with the R package. Run this R code line by line, and explain the purpose and output of each command. Elliott et al. also provide Python code in Appendix B. If you prefer Python, you may run and explain the Python code instead. Note that you only need to choose one of R or Python.

Q2. Consider the data from All Time World Rankings. We use the men's and women's 100 meter dash records.

First, summarize these records in a table with columns Time Record (seconds), Age (years), and Gender (Female or Male).

For the time record in each age and gender group, use the fastest time without wind assistance. Under Rule 260.14(c) of the IAAF Competition Rules 2016-2017, if the tail wind exceeds 2 meters per second, the result cannot be registered as a record at any level. So use the fastest time among the performances with wind speed less than or equal to +2 m/s.

For age, use the lower bound of each age group. For instance, the age for age group M35-39 is 35, and the age for age group W90-94 is 90.
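The two selection rules above (lower bound of the age group, fastest time with wind at most +2 m/s) can be sketched in Python as follows. This is only an illustration under assumptions: the function names `age_lower_bound` and `fastest_legal` are hypothetical, the label format is assumed to be a gender letter followed by an age range such as `M35-39`, and the sample performances are made-up placeholders, not actual records.

```python
import re

WIND_LIMIT = 2.0  # m/s; tail winds above this are treated as wind assisted


def age_lower_bound(group):
    """Split an age-group label such as 'M35-39' into (gender, lower age)."""
    match = re.fullmatch(r"([MW])(\d+)-(\d+)", group.strip())
    if match is None:
        raise ValueError(f"unrecognized age group label: {group!r}")
    return match.group(1), int(match.group(2))


def fastest_legal(performances):
    """Return the fastest time among (time_s, wind_mps) pairs with legal wind.

    Returns None when no performance satisfies the wind limit.
    """
    legal = [time_s for time_s, wind_mps in performances if wind_mps <= WIND_LIMIT]
    return min(legal) if legal else None


# Hypothetical performances for one group: (time in seconds, wind in m/s).
sample = [(10.90, 2.4), (10.95, 1.8), (11.02, 0.3)]
print(age_lower_bound("M35-39"), fastest_legal(sample))
```

In this placeholder sample the 10.90 run is excluded because its wind reading exceeds the limit, so the next-fastest legal time is selected.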

(a) Summarize the record of each age and gender group and form an R-readable table. For example, the first several rows of the table may be

Gender  Age  Time
M       35   9.97
M       40   10.29
...     ...  ...
W       35   10.74
W       40   10.99
...     ...  ...

(b) Consider time as the response variable ($y$) and age as the explanatory variable ($x$). Female students should use the women's records; male students should use the men's records. Fit the models

$y = \beta_{10} + \beta_{11} x$

and

$y = \beta_{20} + \beta_{21} x + \beta_{22} x^2$.

Include your R code and report your estimates. Does the extra quadratic term appear necessary?

(c) Denote the estimate of the intercept in the model $y = \beta_{10} + \beta_{11} x$ from part (b) as $b_{0F}$ for the women's record model and as $b_{0M}$ for the men's record model.

Include gender as an additional explanatory variable ($v$), where $v = 1$ corresponds to the women's records and $v = 0$ corresponds to the men's records. Consider the model

$y = \beta_{30} + \beta_{31} x + \beta_{32} v$.

Include your R code and report your estimates. How does gender appear to affect the records?

(d) For female students, compare $\hat{\beta}_{30} + \hat{\beta}_{32}$ and $b_{0F}$. For male students, compare $\hat{\beta}_{30}$ and $b_{0M}$. Explain the difference.

(e) Female students should use the women's records; male students should use the men's records. Fit a Gamma generalized linear model to the data. Interpret your findings and compare them with part (b). Include your R code, and write down the link function you choose and the equation of your fitted model.

(f) Show that the density of the inverse Gaussian distribution belongs to the exponential family, and write the distribution in the canonical form of a generalized linear model. Then repeat part (e) using an inverse Gaussian generalized linear model.
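As a reference point for part (f), here is a sketch of how the inverse Gaussian density can be rearranged into exponential-family form. It assumes the common mean/shape parameterization $(\mu, \lambda)$; the symbols $\theta$, $\phi$, $b(\cdot)$ and $c(\cdot)$ follow the usual GLM notation and are assumptions, not notation fixed by the exam.

```latex
% Inverse Gaussian density, y > 0, mean \mu > 0, shape \lambda > 0
f(y;\mu,\lambda) = \sqrt{\frac{\lambda}{2\pi y^{3}}}\,
  \exp\!\left\{-\frac{\lambda (y-\mu)^{2}}{2\mu^{2} y}\right\}

% Expand the square: (y-\mu)^2 / (\mu^2 y) = y/\mu^2 - 2/\mu + 1/y
\log f(y;\mu,\lambda)
  = \lambda\left(-\frac{y}{2\mu^{2}} + \frac{1}{\mu}\right)
    - \frac{\lambda}{2y}
    + \frac{1}{2}\log\frac{\lambda}{2\pi y^{3}}

% Match to  \{ y\theta - b(\theta) \}/\phi + c(y,\phi)
\theta = -\frac{1}{2\mu^{2}}, \qquad
b(\theta) = -\sqrt{-2\theta} = -\frac{1}{\mu}, \qquad
\phi = \frac{1}{\lambda}, \qquad
c(y,\phi) = -\frac{1}{2\phi y} - \frac{1}{2}\log\!\left(2\pi\phi y^{3}\right)

% Moments from b(\theta)
E(Y) = b'(\theta) = (-2\theta)^{-1/2} = \mu, \qquad
\operatorname{Var}(Y) = \phi\, b''(\theta) = \phi\,(-2\theta)^{-3/2} = \frac{\mu^{3}}{\lambda}
```

Under this sketch the canonical link sets the linear predictor equal to $\theta = -1/(2\mu^{2})$, which is conventionally rescaled to $1/\mu^{2}$.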

Q3. Two items A and B are weighed on a balance, first separately and then together, to yield observations $y_1$, $y_2$, and $y_3$. Suppose the true weights of A and B are $\alpha_A$ and $\alpha_B$. Then we have

$y_1 = \alpha_A + \varepsilon_1$

$y_2 = \alpha_B + \varepsilon_2$

$y_3 = \alpha_A + \alpha_B + \varepsilon_3$

(a) If $\varepsilon_i \sim N(0, \sigma_\varepsilon^2)$, $i = 1, 2, 3$, find reasonable estimates of $\alpha_A$ and $\alpha_B$. Show your work.

(b) If $\varepsilon_i \sim N(0, \sigma_\varepsilon^2)$ for $i = 1, 2$, and $\varepsilon_3 \sim N(0, k^2 \sigma_\varepsilon^2)$, where the constant $k > 1$, find reasonable estimates of $\alpha_A$ and $\alpha_B$. Show your work.

(c) Let $y_1 = 41$, $y_2 = 53$, $y_3 = 97$, and $k = 1.2$. Choose a suitable function in R, and find the estimates of $\alpha_A$ and $\alpha_B$ in (a) and (b). Include your R code, and highlight the key R function you use. Compare the estimates of $\alpha_A$ and $\alpha_B$ in (a) and (b) and explain the differences.
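One way to check such estimates numerically is to solve the weighted least squares normal equations directly. The sketch below is in Python rather than the R the exam asks for, and it assumes that "reasonable estimates" means least squares with each observation weighted by the inverse of its error variance (weights 1, 1, and $1/k^2$). The function name `balance_estimates` is hypothetical.

```python
def balance_estimates(y1, y2, y3, k=1.0):
    """Weighted least squares for y1 = aA + e1, y2 = aB + e2, y3 = aA + aB + e3.

    Assumes Var(e1) = Var(e2) = s^2 and Var(e3) = k^2 * s^2, so the third
    observation gets weight w = 1 / k^2. With k = 1 this is ordinary
    least squares.
    """
    w = 1.0 / k ** 2
    # Normal equations (X'WX) a = X'Wy for the design rows (1,0), (0,1), (1,1).
    a11, a12, a22 = 1.0 + w, w, 1.0 + w
    b1 = y1 + w * y3
    b2 = y2 + w * y3
    det = a11 * a22 - a12 * a12
    alpha_a = (a22 * b1 - a12 * b2) / det
    alpha_b = (a11 * b2 - a12 * b1) / det
    return alpha_a, alpha_b


# Equal variances, as in part (a)
print(balance_estimates(41, 53, 97))
# Noisier combined weighing, as in part (b), with k = 1.2
print(balance_estimates(41, 53, 97, k=1.2))
```

Under these assumptions the solution simplifies to $\hat{\alpha}_A = y_1 + d/(k^2 + 2)$ and $\hat{\alpha}_B = y_2 + d/(k^2 + 2)$ with $d = y_3 - y_1 - y_2$. For the data above this gives 42 and 54 when $k = 1$, and roughly 41.87 and 53.87 when $k = 1.2$, since the less precise combined weighing pulls the estimates less. In R, the `weights` argument of `lm` would be one way to fit the same model.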
