Q1) Many regions along the coast in North and South Carolina and Georgia have experienced rapid growth over the last 10 years, and the growth is expected to continue over the next 10 years. As a result, many of the large grocery store chains are building new stores in the region. The Kelley's Super Grocery Stores chain is no exception. The director of planning for Kelley's wishes to examine adding more stores in the region. He thinks two main factors indicate the amount families spend on groceries: the first is their income and the other is the number of people in the family. The director gathered the following sample.
Food and Income are reported in thousands of dollars per year, and the variable Size refers to the number of people in the household.
a) Create a correlation matrix. Do you see any problem with multicollinearity?
b) Determine the regression equation and describe it. How much does an additional family member add to the amount spent on food?
c) What is the value of R-squared? Can we conclude that this value is greater than 0?
d) Would you consider deleting either of the independent variables?
e) Plot the residuals in a histogram. Is there any problem with the normality assumption?
f) Plot the fitted values against the residuals. Does this plot indicate any problems with homoscedasticity?
| Family | Food | Income | Size |
|---|---|---|---|
| 1 | 5.04 | 73.98 | 4 |
| 2 | 4.08 | 54.9 | 2 |
| 3 | 5.76 | 94.14 | 4 |
| 4 | 3.48 | 52.02 | 1 |
| 5 | 4.2 | 65.7 | 2 |
| 6 | 4.8 | 53.64 | 4 |
| 7 | 4.32 | 79.74 | 3 |
| 8 | 5.04 | 68.58 | 4 |
| 9 | 6.12 | 165.6 | 5 |
| 10 | 3.24 | 64.8 | 1 |
| 11 | 4.8 | 138.42 | 3 |
| 12 | 3.24 | 125.82 | 1 |
| 13 | 6.6 | 77.58 | 7 |
| 14 | 4.92 | 171.36 | 2 |
| 15 | 6.6 | 82.08 | 9 |
| 16 | 5.4 | 141.3 | 3 |
| 17 | 6 | 36.9 | 5 |
| 18 | 5.4 | 56.88 | 4 |
| 19 | 3.36 | 71.82 | 1 |
| 20 | 4.68 | 69.48 | 3 |
| 21 | 4.32 | 54.36 | 2 |
| 22 | 5.52 | 87.66 | 5 |
| 23 | 4.56 | 38.16 | 3 |
| 24 | 5.4 | 43.74 | 7 |
| 25 | 4.8 | 48.42 | 5 |
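
For anyone who wants to check their work on parts a) to d), here is a minimal sketch of the calculations in plain Python, assuming the sample has been typed in exactly as tabulated. The variable and helper names (`data`, `dev_sum`, `corr`, `b_income`, `b_size` and so on) are my own, not from the textbook, and the two-predictor least-squares fit is done by solving the 2x2 normal equations on centered data rather than with a statistics package.

```python
from math import sqrt

# (food, income, size) for families 1..25; food and income in $1000s per year.
data = [
    (5.04, 73.98, 4), (4.08, 54.90, 2), (5.76, 94.14, 4), (3.48, 52.02, 1),
    (4.20, 65.70, 2), (4.80, 53.64, 4), (4.32, 79.74, 3), (5.04, 68.58, 4),
    (6.12, 165.60, 5), (3.24, 64.80, 1), (4.80, 138.42, 3), (3.24, 125.82, 1),
    (6.60, 77.58, 7), (4.92, 171.36, 2), (6.60, 82.08, 9), (5.40, 141.30, 3),
    (6.00, 36.90, 5), (5.40, 56.88, 4), (3.36, 71.82, 1), (4.68, 69.48, 3),
    (4.32, 54.36, 2), (5.52, 87.66, 5), (4.56, 38.16, 3), (5.40, 43.74, 7),
    (4.80, 48.42, 5),
]
food = [row[0] for row in data]
income = [row[1] for row in data]
size = [row[2] for row in data]
n, k = len(data), 2  # sample size and number of independent variables


def mean(xs):
    return sum(xs) / len(xs)


def dev_sum(xs, ys):
    """Sum of cross-products of deviations from the means."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys))


def corr(xs, ys):
    """Pearson correlation coefficient."""
    return dev_sum(xs, ys) / sqrt(dev_sum(xs, xs) * dev_sum(ys, ys))


# (a) Correlation matrix in the order food, income, size.
cols = [food, income, size]
corr_matrix = [[corr(a, b) for b in cols] for a in cols]

# (b) Least squares: solve the 2x2 normal equations by Cramer's rule.
s11, s22, s12 = dev_sum(income, income), dev_sum(size, size), dev_sum(income, size)
s1y, s2y = dev_sum(income, food), dev_sum(size, food)
det = s11 * s22 - s12 ** 2
b_income = (s1y * s22 - s12 * s2y) / det
b_size = (s11 * s2y - s12 * s1y) / det
b0 = mean(food) - b_income * mean(income) - b_size * mean(size)

fitted = [b0 + b_income * x1 + b_size * x2 for x1, x2 in zip(income, size)]
residuals = [y - f for y, f in zip(food, fitted)]

# (c) R-squared and the global F statistic with (k, n-k-1) degrees of freedom.
sst = dev_sum(food, food)
sse = sum(e * e for e in residuals)
r_squared = 1 - sse / sst
f_stat = (r_squared / k) / ((1 - r_squared) / (n - k - 1))

# (d) t statistics for the individual slopes, n-k-1 degrees of freedom.
mse = sse / (n - k - 1)
t_income = b_income / sqrt(mse * s22 / det)
t_size = b_size / sqrt(mse * s11 / det)

for name, row in zip(["food", "income", "size"], corr_matrix):
    print(name, [round(r, 3) for r in row])
print(f"food = {b0:.3f} + {b_income:.5f}*income + {b_size:.3f}*size")
print(f"R^2 = {r_squared:.3f}, F = {f_stat:.2f}")
print(f"t(income) = {t_income:.2f}, t(size) = {t_size:.2f}")
```

The printed F and t values still have to be compared with the tabled critical values at whatever significance level your course uses. For parts e) and f), the `residuals` and `fitted` lists can be passed to any plotting tool to draw the histogram and the scatter plot.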