problem 1: A statistician is investigating the 'home ground' effect and is studying 20 football games, of which 14 were won by the home team and 6 by the visitors. Thus each game is a Bernoulli trial with a home win counted as a 'success'. The R idiom is:
games <- c(14,6)
because 14 games were wins for the home side and 6 were won by the visitors.
a) If the games are independent, what distribution specifies the number of home wins?
b) What is a sensible null hypothesis to use in this context?
c) Using chisq.test(games), or otherwise, give a p-value for the observation.
d) Interpret your p-value in such a way that a football commentator (who is not a statistician) would understand.
e) NB: harder. Bearing in mind the precise definition of a p-value [that is, the probability, if the null is true, of obtaining our observation or one more extreme], and the fact that in this context 'more extreme' means 'more home wins' [that is, 14 or more home wins], use dbinom() to calculate a p-value for our observation.
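As a hint for parts c) and e), here is a minimal sketch of how the two calculations might be set up. It assumes the null hypothesis is a home-win probability of 0.5, which is only one possible answer to part b). The names n, p_upper, p_check and chi are illustrative and do not come from the problem.

```r
games <- c(14, 6)   # home wins, visitor wins (from the problem statement)
n <- sum(games)     # 20 games in total

# Upper tail P(X >= 14), assuming a null home-win probability of 0.5
p_upper <- sum(dbinom(games[1]:n, size = n, prob = 0.5))

# The same tail from the cumulative distribution function, as a cross-check
p_check <- pbinom(games[1] - 1, size = n, prob = 0.5, lower.tail = FALSE)

# Chi-square goodness-of-fit test; its default null is equal probabilities
chi <- chisq.test(games)

print(c(binomial_tail = p_upper, chisq = chi$p.value))
```

The two p-values need not agree: the binomial tail is exact and one-sided, whereas the chi-square test is a large-sample approximation that does not distinguish the direction of the departure.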
problem 2: You will recall the function pnorm() from lectures. Using this, or otherwise, find the probability of a standard Gaussian random variable exceeding 1.3.
Using table(), or otherwise, verify your result numerically. You may find a sketch helpful here.
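One possible approach, sketched under assumptions: take the upper tail from pnorm(), then compare it with the proportion of simulated standard Gaussian draws that exceed 1.3. The seed, the sample size of one million, and the names p_exact, z, counts and p_sim are arbitrary illustrative choices.

```r
# Upper-tail probability P(Z > 1.3) for a standard Gaussian
p_exact <- pnorm(1.3, lower.tail = FALSE)

set.seed(1)                 # arbitrary seed, only for reproducibility
z <- rnorm(1e6)             # one million standard Gaussian draws
counts <- table(z > 1.3)    # tabulates FALSE / TRUE

# Proportion of draws exceeding 1.3
p_sim <- as.numeric(counts["TRUE"]) / sum(counts)

print(c(exact = p_exact, simulated = p_sim))
```

The simulated proportion should approach the exact value as the number of draws grows, but it will not match exactly.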
problem 3: A scientist measures the temperature of melting platinum using a new type of thermometer. The temperature of the metal is known to be exactly 1768.3 centigrade. The measurements are as follows:
c(1770.1, 1771.1, 1769.19, 1769.5, 1767.32, 1768.67, 1768.22)
a) What is the mean of these observations?
b) In the context of hypothesis testing, and bearing in mind that it is not known whether the new thermometer overpredicts or underpredicts temperature, is a one-sided or two-sided test indicated? Justify your answer.
c) Conduct a Student t-test for the hypothesis that the mean of the temperatures is 1768.3 (you may use the mu argument of t.test() here). State whether the observations are significantly different from the known value.
d) Interpret the result of the statistical test in a way that the scientist, who is not a statistician, could understand.
e) Harder. Under what circumstances would a one-sided test be appropriate?
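The calls involved in parts a), c) and e) might look like the sketch below. The variable names temps, known, res_two and res_one are illustrative, and alternative = "greater" is shown only to demonstrate the argument, not as a recommendation for these data.

```r
temps <- c(1770.1, 1771.1, 1769.19, 1769.5, 1767.32, 1768.67, 1768.22)
known <- 1768.3   # known melting temperature from the problem statement

m <- mean(temps)  # sample mean of the seven measurements

# Two-sided one-sample t-test (two-sided is the default alternative)
res_two <- t.test(temps, mu = known)

# One-sided version, for comparison only
res_one <- t.test(temps, mu = known, alternative = "greater")

print(c(mean = m, two_sided = res_two$p.value, one_sided = res_one$p.value))
```

Because the sample mean lies above the known value, the "greater" p-value is half the two-sided one. This is why the choice between a one-sided and a two-sided test should be made before looking at the data.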
problem 4: A medical researcher has 100 bone cancer patients in a study. Each patient's condition is one of six types, type "A" to type "F". The 100 patients split as follows:
x <- c(11, 25, 10, 21, 16, 17)
Thus there are 11 patients with condition "A", 25 patients with "B", and so on.
Using chisq.test(), or otherwise, test the null hypothesis that the six conditions are equally likely; report your p-value.
Is there evidence that the conditions are not equally probable? Interpret your finding in such a way that a busy oncologist, who is not a statistician, could understand.
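A minimal sketch of the test follows. It relies on chisq.test() using equal probabilities as its default null when given a single vector of counts, and it rebuilds the statistic by hand as a check. The names res, expected, stat and p_manual are illustrative.

```r
x <- c(11, 25, 10, 21, 16, 17)   # patient counts for conditions "A" to "F"

# Goodness-of-fit test against equal probabilities (the default null)
res <- chisq.test(x)

# The same statistic built by hand: sum of (observed - expected)^2 / expected
expected <- rep(sum(x) / length(x), length(x))
stat <- sum((x - expected)^2 / expected)

# Upper tail of the chi-square distribution with 6 - 1 = 5 degrees of freedom
p_manual <- pchisq(stat, df = length(x) - 1, lower.tail = FALSE)

print(c(statistic = stat, p_value = p_manual))
```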