ASSIGNMENT

In this assignment we apply some of the tools to analyze data from the donor database of the Blood Transfusion Service Center in Hsin-Chu City, Taiwan. About every three months the center sends its blood transfusion service bus to a university in Hsin-Chu City to collect donated blood. The assignment uses data collected on a random sample of 748 donors. The data was obtained from the UCI Machine Learning Repository.

The file "transfusion.csv" contains the data. The file can be found here. The file contains 5 variables:
- recency = The number of months since the last donation. (numeric)
- frequency = The total number of donations. (numeric)
- monetary = Total blood donated (in c.c.). (numeric)
- time = The number of months since the first donation. (numeric)
- march2007 = An indicator. Indicates those that donated blood in March, 2007. (factor)
In the assignment we consider the last four variables.

Comparing Two Samples
Consider "frequency" as a response and "march2007" as an explanatory variable. Plot the relation between the two variables, test the equality of the expectation in the two sub-samples and the equality of the variance. Repeat the same analysis for the case where the response "frequency" is replaced by the log-transformed response: "log(frequency)". In Tasks 1-3 you are asked to describe the results of the analysis.
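The two-sample workflow described above can be sketched in R as follows. This is a minimal sketch on simulated data, so that it runs without the file: the data frame `donors` and all of its values are hypothetical stand-ins for "transfusion.csv", and the use of `t.test` and `var.test` is an assumption about which tests are intended.

```r
# Simulated stand-in for "transfusion.csv" (hypothetical values, not the real data).
set.seed(1)
n <- 200
donors <- data.frame(
  frequency = rpois(n, lambda = 5) + 1,               # at least 1, so log() is finite
  march2007 = factor(rbinom(n, size = 1, prob = 0.25))
)

# With a factor as the explanatory variable, plot() on a formula draws box plots.
plot(frequency ~ march2007, data = donors)
plot(log(frequency) ~ march2007, data = donors)

# Two-sample t-test (Welch version by default) for equality of expectations.
t_raw <- t.test(frequency ~ march2007, data = donors)
t_log <- t.test(log(frequency) ~ march2007, data = donors)

# F-test for equality of variances in the two sub-samples.
v_raw <- var.test(frequency ~ march2007, data = donors)
v_log <- var.test(log(frequency) ~ march2007, data = donors)

# A null hypothesis is rejected at the 5% level when its p-value is below 0.05.
p_values <- c(t_raw = t_raw$p.value, t_log = t_log$p.value,
              v_raw = v_raw$p.value, v_log = v_log$p.value)
p_values < 0.05
```

With the real file, the simulated data frame would be replaced by something like `donors <- read.csv("transfusion.csv")`, with "march2007" converted to a factor if it is read as numeric.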

Linear Regression
In Tasks 4-7 you are asked to conduct an analysis similar to the analysis of Tasks 1-3. The difference is that the numerical variable "time" is used as the explanatory variable. The model of linear regression assumes that the expectation of the response is a linear function of the explanatory variable. Another assumption of the model is that the variance of the response is constant for each value of the explanatory variable. Frequently, however, one may observe an increase in the variance for larger values of the explanatory variable. Replacing the response by the log-transformed response is a commonly used method to overcome this difficulty. The analysis that involves the log of the response can be carried out via the replacement of the response "frequency" in the formula by the transformed response "log(frequency)".
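A minimal sketch of the regression analysis, again on simulated data: the data frame `donors`, its values, and the way the response is generated are hypothetical and chosen only so that the spread of "frequency" grows with "time". The calls `lm`, `abline` and `confint` are standard R functions.

```r
# Simulated stand-in for the real data (hypothetical values).
set.seed(2)
n <- 200
time <- runif(n, min = 2, max = 98)
# Response whose variability increases with "time"; ceiling() keeps it >= 1.
frequency <- ceiling(exp(0.02 * time + rnorm(n, sd = 0.5)))
donors <- data.frame(time = time, frequency = frequency)

# Fit the regression for the raw and for the log-transformed response.
fit_raw <- lm(frequency ~ time, data = donors)
fit_log <- lm(log(frequency) ~ time, data = donors)

# Scatter plot with the fitted regression line added.
plot(frequency ~ time, data = donors)
abline(fit_raw)

# Estimate, standard error, t-value and p-value for the slope of "time".
summary(fit_raw)$coefficients["time", ]
summary(fit_log)$coefficients["time", ]

# 95% confidence interval for the slope in the log-response model.
ci <- confint(fit_log, "time", level = 0.95)
ci
```

The only change between the two fits is the left-hand side of the formula, which is what makes the log transformation convenient to try.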

The Relation Between Two Variables
The final Task 8 involves the investigation of the relation between the response "frequency" and the variable "monetary".
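One way to judge whether the points of a scatter plot lie exactly on the regression line is to look at the size of the residuals. The sketch below uses simulated data in which "monetary" is a fixed multiple of "frequency"; that fixed volume per donation (250 c.c.) is a hypothetical assumption made for illustration, and whether the real file behaves this way has to be checked on the data.

```r
# Simulated stand-in (hypothetical): a fixed volume is given at every donation.
set.seed(3)
frequency <- rpois(150, lambda = 5) + 1
monetary <- 250 * frequency
donors <- data.frame(frequency, monetary)

# Scatter plot with the fitted regression line added.
fit <- lm(frequency ~ monetary, data = donors)
plot(frequency ~ monetary, data = donors)
abline(fit)

# If every point lies on the line, the residuals vanish up to rounding error
# and the correlation is 1 (or -1 for a decreasing line).
max_resid <- max(abs(residuals(fit)))
max_resid
cor(donors$frequency, donors$monetary)
```

Residuals that are clearly larger than rounding error would instead indicate a trend around the line, or no linear trend at all.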

Tasks

Comparing Two Samples:

1. Apply the function "plot" to the formula that relates the response "frequency" to the explanatory variable "march2007" in order to produce the two box-plots of the response. Redo the plotting with "frequency" replaced by "log(frequency)". The distribution of the variable "log(frequency)" is:

__ More symmetric, __ Less symmetric compared to the distribution of the variable "frequency".

Mark the most appropriate option and attach the R code that produces the two plots:

2. Mark the null hypotheses that you reject with a significance level of 5% and those that you do not reject:

(Reject/Don't Reject) H0: The expectation of "frequency" is the same in the two subsets,

(Reject/Don't Reject) H0: The expectation of "log(frequency)" is the same in the two subsets.

Explain your answer:

3. Mark the null hypotheses that you reject with a significance level of 5% and those that you do not reject:

(Reject/Don't Reject) H0: The variance of "frequency" is the same in the two subsets,

(Reject/Don't Reject) H0: The variance of "log(frequency)" is the same in the two subsets.

Explain your answer:

Linear Regression:

4. Apply the function "plot" to the formula that relates the response "frequency" to the explanatory variable "time" in order to produce the scatter plot. Add the regression line to the plot. The variability of the variable "frequency", for larger values of the explanatory variable, is:

__ Smaller, __ Larger, __ Constant.

Mark the most appropriate option and attach the R code that produces the two plots:

5. Mark the null hypotheses that you reject with a significance level of 5% and those that you do not reject:

(Reject/Don't Reject) H0: The slope of "time" in the regression line of the response "frequency" is equal to zero,

(Reject/Don't Reject) H0: The slope of "time" in the regression line of the response "log(frequency)" is equal to zero.

Explain your answer:

6. The 95%-confidence interval of the slope of "time" in the regression line of the response "log(frequency)" is:
Lower end = ____, Upper end = ____.

Attach the R code that produces the confidence interval:

7. The regression line between "time" as an explanatory variable and "log(frequency)" as a response is:
__ Increasing, __ Decreasing, __ Constant.

Mark the most appropriate option and explain your answer:

The Relation Between Two Variables:

8. Apply the function "plot" to the formula that relates the response "frequency" to the explanatory variable "monetary" in order to produce the scatter plot. Add the regression line to the plot. The points in the scatter plot are:

__ All on the same line, __ Show a linear trend but are not on the same line, __ Don't show a linear trend.

Mark the most appropriate option and attach the R code that produces the plot:

Attachment:- Data.rar
