This assignment focuses on the interpretation of the Pearson correlation and a simple linear regression.
You will develop hypotheses, calculate statistics, and interpret output and summary tables.
PART ONE: PEARSON CORRELATION
Suppose that we are interested in the strength of the association between dietary sodium intake (the independent variable, X) and mean diastolic blood pressure (the dependent variable, Y). We conduct a small pilot study, measuring sodium intake and blood pressure in 15 persons. We calculate a mean dietary sodium intake of 14.30 g/day and a mean diastolic pressure of 106.53 mmHg. The data are displayed in the table below:
| Subject | X (g/day) | Y (mmHg) | X - X-bar | (X - X-bar)^2 | Y - Y-bar | (Y - Y-bar)^2 | (X - X-bar)(Y - Y-bar) |
|---|---|---|---|---|---|---|---|
| 101 | 10.45 | 77 | -3.85 | 14.84 | -29.53 | 872.22 | 113.78 |
| 102 | 7.41 | 73 | -6.89 | 47.51 | -33.53 | 1,124.48 | 231.13 |
| 103 | 15.26 | 101 | 0.96 | 0.92 | -5.53 | 30.62 | -5.3 |
| 104 | 5.2 | 97 | -9.1 | 82.86 | -9.53 | 90.88 | 86.78 |
| 105 | 5.4 | 62 | -8.9 | 79.26 | -44.53 | 1,983.22 | 396.47 |
| 106 | 5.5 | 114 | -8.8 | 77.49 | 7.47 | 55.75 | -65.73 |
| 107 | 22.11 | 138 | 7.81 | 60.95 | 31.47 | 990.15 | 245.67 |
| 108 | 22 | 144 | 7.7 | 59.25 | 37.47 | 1,403.75 | 288.39 |
| 109 | 9 | 72 | -5.3 | 28.12 | -34.53 | 1,192.55 | 183.12 |
| 110 | 10.84 | 104 | -3.46 | 11.99 | -2.53 | 6.42 | 8.77 |
| 111 | 28 | 119 | 13.7 | 187.62 | 12.47 | 155.42 | 170.76 |
| 112 | 28.9 | 159 | 14.6 | 213.08 | 52.47 | 2,752.75 | 765.87 |
| 113 | 21.27 | 130 | 6.97 | 48.54 | 23.47 | 550.68 | 163.5 |
| 114 | 9.4 | 95 | -4.9 | 24.04 | -11.53 | 133.02 | 56.54 |
| 115 | 13.8 | 113 | -0.5 | 0.25 | 6.47 | 41.82 | -3.25 |
| Sum | | | | 936.71 | | 11,383.73 | 2,636.52 |
From Dawson and Trapp, we know that the formula for the Pearson product moment correlation coefficient is:
$$r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2 \, \sum (Y - \bar{Y})^2}}$$
1) Using the information in the table provided, calculate the correlation coefficient (r) and show your work.
2) Interpret the size of r in two ways. First, give your interpretation of the index value in terms of both direction (positive or negative) and strength (weak, moderate, or strong). Second, give your interpretation of the squared value of r.
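As a way to check a hand calculation for questions 1 and 2, the following Python sketch computes r twice: once from the raw X and Y values in the pilot-study table, and once from the three column totals. The variable and function names (`sodium`, `dbp`, `pearson_r`) are illustrative choices, not part of the assignment, and the sketch is not a substitute for showing your work.

```python
import math

# Sodium intake (g/day) and diastolic blood pressure (mmHg) for subjects 101-115,
# copied from the pilot-study table.
sodium = [10.45, 7.41, 15.26, 5.2, 5.4, 5.5, 22.11, 22, 9,
          10.84, 28, 28.9, 21.27, 9.4, 13.8]
dbp = [77, 73, 101, 97, 62, 114, 138, 144, 72,
       104, 119, 159, 130, 95, 113]


def pearson_r(xs, ys):
    """Pearson product moment correlation from paired observations."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of cross-products and sums of squared deviations
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# r from the raw data
r_raw = pearson_r(sodium, dbp)

# r from the column totals in the last row of the table
r_from_sums = 2636.52 / math.sqrt(936.71 * 11383.73)

print(f"r from raw data:      {r_raw:.4f}")
print(f"r from column totals: {r_from_sums:.4f}")
print(f"r squared:            {r_raw ** 2:.4f}")
```

The two values of r should agree apart from rounding in the table. The squared value is the proportion of variance in diastolic pressure shared with sodium intake.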
We enroll an additional 15 subjects (total sample size = 30) and measure their dietary sodium intake (NA) and mean diastolic blood pressure (DBP). We analyze the data using SAS's CORR procedure and observe the results shown in Table 1 (see page 6). Based on those findings, answer the following questions.
3) Identify the dependent variable and the independent or explanatory variable in this study. Also, state the Null and Alternative hypotheses.
4) Report the correlation coefficient and P-value that should be used to test the Null hypothesis. Based on that information, what conclusion can you make about the Null hypothesis? (i.e. Reject or Fail to Reject the Null). In your response, include your interpretation of the Pearson correlation coefficient.
PART TWO: SIMPLE LINEAR REGRESSION
Suppose that we are interested in the relationship between gestational age (in days) and birth weight (in grams). We collect data on gestational age (GES_AGE) and birth weight (B_WEIGHT) for 51 babies and test the null hypothesis that knowing gestational age does not help estimate birth weight. We analyze the data using SAS's REG procedure and observe the results shown in Table 2 (see page 7). Based on those findings, answer the following questions.
1) Identify the dependent variable and the independent or explanatory variable(s) in this study. Also, state the Null and Alternative hypotheses.
2) Report the test statistic and P-value that should be used to test the Omnibus (or Overall) Null hypothesis. What is your conclusion about the Omnibus Null hypothesis?
3) Report and interpret the parameter estimate for gestational age from the SAS output.
4) Using the regression equation (Y = a + bX), calculate the predicted birth weight for an infant born at the gestational age of 280 days.
5) What is your interpretation of the R-square value?
6) Interpret these findings (3-4 sentences max). Your answer should (1) restate the findings, (2) note the generalizability of these results, and (3) state what these findings mean from a clinical perspective.
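The prediction in question 4 is a direct substitution into Y = a + bX. The sketch below shows only the arithmetic. The function name `predict_birth_weight` is hypothetical, and the intercept and slope used in the example are placeholders chosen for illustration; they are NOT the parameter estimates in Table 2 and must be replaced with the values from the SAS output.

```python
def predict_birth_weight(intercept, slope, gestational_age_days):
    """Return the fitted value Y = a + bX for one gestational age (days)."""
    return intercept + slope * gestational_age_days


# Placeholder coefficients for illustration only (NOT taken from Table 2):
# substitute the intercept and GES_AGE estimates reported by PROC REG.
a_example = -3000.0   # hypothetical intercept, in grams
b_example = 22.0      # hypothetical slope, in grams per day

predicted = predict_birth_weight(a_example, b_example, 280)
print(f"Predicted birth weight at 280 days: {predicted:.0f} g")
```

The slope is read as the estimated change in birth weight, in grams, for each additional day of gestation.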
Table 1: Output for Part One
Table 2: Output for Part Two