
Question 1:

The production manager of American Tool and Castings Company is conducting a study of the relationship between the number of alloy caps milled on a lathe and the distance from specification of the outside cap diameter. The lathe uses a sharp steel cutting tool in a milling process to cut and shape raw alloy bars into caps. The lathe tool turns at high speed while cutting into the alloy, in essence cutting the alloy down to size and shaping it to resemble a round cap. A similar lathe tool cuts into the inside of the cap. The caps are later fitted with interior gaskets and permanently sealed onto airtight canisters. After the steel cutting tool is used repeatedly, it begins to wear and cuts a larger outside cap diameter than desired. If the outside cap diameter is too large, the cap can't be properly affixed and sealed to the canister. The production manager would like to build a model to estimate/predict how many caps a tool can mill before it wears down so much that it mills caps that are too large in diameter and unusable. Each cap costs approximately $400 to mill, so defective caps are expensive. The main variable of interest (y) is the "distance from specification" of the outside cap diameter.

To conduct the study, 62 lathe tools were randomly sampled. Each lathe operator keeps a record of the number of caps milled by a particular tool. Each cap milled is measured to see how close to specification the outside diameter is. According to specification, each cap should be 6 inches in diameter. For example, the measure in record one is 0.36, meaning the cap was 0.36 inches larger than specification. When each tool was sampled, the number of caps milled by the tool was recorded, as well as the distance from specification of the diameter of the last cap milled by that particular tool. The data for each cutting tool sampled, including the distance from specification of the outside cap diameter of the last cap milled, are in the spreadsheet labeled American.

1. Scatter plot

Construct a scatter plot revealing the relationship between the number of caps milled by a tool and the distance from specification of the outside cap diameter. Make sure the x-variable is on the x-axis and the y-variable is on the y-axis. Move the chart so that it starts in cell E3. Do not resize the chart beyond the red shaded region.

2. Correlation

Using a built-in Excel function in cell F22, calculate the correlation (r) between the number of caps milled by a tool and the distance from specification of the outside cap diameter.

In cell F23, indicate the strength of the linear relationship as very strong, relatively strong, very weak, relatively weak, or no relationship.

In cell F24, indicate if the relationship is positive or negative.
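As a cross-check outside Excel, the correlation asked for above can be sketched in Python. The function below computes the Pearson coefficient, the same quantity Excel's CORREL function returns. The data values are hypothetical placeholders, not the contents of the American worksheet, and the names `pearson_r`, `caps_milled`, and `dist_from_spec` are illustrative only.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical illustration data, NOT the values in the American worksheet.
caps_milled = [5, 10, 15, 20, 25, 30]
dist_from_spec = [0.05, 0.11, 0.14, 0.22, 0.24, 0.31]

r = pearson_r(caps_milled, dist_from_spec)
direction = "positive" if r > 0 else "negative"
print(f"r = {r:.4f} ({direction})")
```

The sign of r answers the positive/negative question directly. The cut-offs for "very strong" versus "relatively strong" are not stated in the assignment, so they are left out of the sketch.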

3. Anchoring the output in cell P3, generate the regression output.  Make sure you select an appropriate "Residual Plot," and place the residual plot in the designated area near cell E32.

4. Output

In cells J23 and J24, enter the value of the intercept and slope (respectively) by referencing the appropriate cells in the regression output.

In cell K24, enter the value of the t test statistic for testing the slope significance by referencing the appropriate cell from the regression output.

In cell L24, enter the p-value regarding the slope significance by referencing the appropriate cell from the regression output.

In cell M24, indicate with the word "Yes" or "No" if the slope coefficient is significant. Assume α = 0.01.
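The intercept, slope, and t statistic that the regression output reports can be reproduced by hand. The following is a minimal sketch of the simple least-squares formulas, assuming hypothetical data (not the worksheet values) and a hypothetical helper name `simple_ols`.

```python
import math

def simple_ols(x, y):
    """Return (intercept, slope, se_slope, t_stat) for y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual standard error uses n - 2 degrees of freedom.
    sse = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    s_e = math.sqrt(sse / (n - 2))
    se_slope = s_e / math.sqrt(sxx)
    t_stat = slope / se_slope
    return intercept, slope, se_slope, t_stat

# Hypothetical illustration data, NOT the values in the American worksheet.
caps_milled = [5, 10, 15, 20, 25, 30]
dist_from_spec = [0.05, 0.11, 0.14, 0.22, 0.24, 0.31]

intercept, slope, se_slope, t_stat = simple_ols(caps_milled, dist_from_spec)
print(f"intercept = {intercept:.5f}, slope = {slope:.5f}, t = {t_stat:.2f}")
```

The sketch stops at the t statistic. The two-sided p-value additionally needs the t distribution with n - 2 degrees of freedom, and the slope is judged significant when that p-value falls below the stated α.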

5. In cell F29, provide the predictive power (a.k.a. the coefficient of determination) of the model by referencing the appropriate cell from the regression output.

6. In cell J29, write the prediction equation relating NM to DS using the intercept and slope values.  This is a text input that starts with a number, so you must start the input with a space to trick Excel into interpreting the input as text.  For example, if a = 4 and b = 10, enter 4 + 10(NM), placing a space before the value 4.
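Items 5 and 6 can be illustrated together. The sketch below computes the coefficient of determination as 1 - SSE/SST and builds the prediction-equation text with the leading space described above. The data and the function names `r_squared` and `prediction_equation` are hypothetical, chosen only for illustration.

```python
def r_squared(x, y):
    """Coefficient of determination for the simple regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    sst = sum((b - mean_y) ** 2 for b in y)
    return 1 - sse / sst

def prediction_equation(intercept, slope):
    """Text form ' a + b(NM)', with the leading space the assignment asks for."""
    sign = "+" if slope >= 0 else "-"
    return f" {intercept:g} {sign} {abs(slope):g}(NM)"

# Hypothetical illustration data, NOT the values in the American worksheet.
caps_milled = [5, 10, 15, 20, 25, 30]
dist_from_spec = [0.05, 0.11, 0.14, 0.22, 0.24, 0.31]

r2 = r_squared(caps_milled, dist_from_spec)
print(f"R^2 = {r2:.4f}")
print(repr(prediction_equation(4, 10)))
```

In simple regression with one x variable, this R² equals the square of the correlation r, which gives a quick consistency check between cells F22 and F29.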

7. Residual Plot

Cell E32 should contain the residual plot.  Keep the plot within the red shaded area.

In cell F48, comment on the assumption of linearity as interpreted using this residual plot.

In cell F49, comment on the assumption of constant variance as interpreted using this residual plot.
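A residual plot graphs each residual (observed y minus fitted y) against x. A curved pattern would cast doubt on linearity, and a spread that widens or narrows across x would cast doubt on constant variance. The sketch below computes only the residuals that such a plot displays, using hypothetical data and hypothetical function names.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def residuals(x, y, intercept, slope):
    """Observed y minus fitted y for every observation."""
    return [b - (intercept + slope * a) for a, b in zip(x, y)]

# Hypothetical illustration data, NOT the values in the American worksheet.
caps_milled = [5, 10, 15, 20, 25, 30]
dist_from_spec = [0.05, 0.11, 0.14, 0.22, 0.24, 0.31]

a, b = fit_line(caps_milled, dist_from_spec)
res = residuals(caps_milled, dist_from_spec, a, b)
for x_i, e_i in zip(caps_milled, res):
    print(f"NM = {x_i:2d}  residual = {e_i:+.4f}")
```

Least-squares residuals always sum to zero, so the plot is read for pattern and spread rather than for its average level.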

8. Prediction and Residual

In cell F53, predict the distance from specification of a cap milled by a tool when the cap is the 20th cap to be milled.

In cells F54 and F55, calculate the lower and upper values for the range of definition for this data set.
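The point prediction and the range check can be sketched as follows. This reads "range of definition" as the smallest and largest observed NM values, which is an assumption about the assignment's wording. The coefficients and data are hypothetical, not taken from the regression output.

```python
def predict(intercept, slope, x0):
    """Point prediction from the fitted line at x0."""
    return intercept + slope * x0

def x_range(x):
    """Smallest and largest observed x values."""
    return min(x), max(x)

# Hypothetical values, NOT the worksheet data or the fitted coefficients.
caps_milled = [5, 10, 15, 20, 25, 30]
intercept, slope = 0.0013, 0.0101

x0 = 20  # the 20th cap milled
lower, upper = x_range(caps_milled)
prediction = predict(intercept, slope, x0)
within_range = lower <= x0 <= upper

print(f"predicted DS at NM = {x0}: {prediction:.4f}")
print(f"observed NM range: {lower} to {upper}; within range: {within_range}")
```

A prediction at an x value outside the observed range is an extrapolation and deserves less trust than one inside it.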

9. Prediction Interval

Using the table in cells J52:K53 as the Prediction Data Set and StatTools, calculate the lower limit and upper limit for a 95% prediction interval for the DS of a cap that is the 20th cap milled. Anchor your StatTools Regression output in cell A1 of the Regression Worksheet. Place the values in cells J58 and K58 by referencing the appropriate cells in the StatTools output. Note that this will shift the columns of your worksheet.
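StatTools reports the interval limits directly. For reference, the standard textbook formula for a prediction interval for a single new observation at x_0 in simple regression is given below. The symbols a, b, s_e, and x-bar are notation introduced here, not names from the worksheet.

```latex
\hat{y}_0 = a + b\,x_0, \qquad
\hat{y}_0 \;\pm\; t_{\alpha/2,\;n-2}\; s_e
\sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2}},
\qquad
s_e = \sqrt{\frac{\mathrm{SSE}}{n-2}}
```

Here n = 62 tools and x_0 = 20, so the interval uses 60 degrees of freedom, and for 95% confidence the critical value is approximately 2.000.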

Question 2:

A mental health agency measured the self-esteem score for randomly selected individuals with disabilities who were involved in some work activity within the past year. The spreadsheet named Self Esteem provides the data, including each individual's self-esteem measure (y), years of education (YrsEdu), age, months worked in the last year (MonWork), marital status dummy variables (MS2, MS3, MS4) indicating if the individual is single, married, separated, or divorced, and a support level (SL) dummy variable indicating if the level of job support (counseling, etc.) was provided directly (1) or indirectly (0). Regarding marital status, if single, all MS indicators are 0, while MS2 = 1 indicates married, MS3 = 1 indicates separated, and MS4 = 1 indicates divorced.

a. In cell N4, use Excel's "Correlation" Data Analysis tool to construct a correlation matrix for all the variables. Note that the categories in columns I and J should not be included since the data are already represented as dummy variables in columns E through H.

b. Considering the correlation between self-esteem and each x variable, identify the three variables that, based on correlation with y alone, should be considered the best candidates for inclusion in the model. Shade the appropriate cells containing the correlation values in yellow. Ignore any multicollinearity concerns for this part.

c. Considering the correlation between each pair of x variables, identify the variables that would possibly cause multicollinearity problems if included in the model. Shade the appropriate cells containing the correlation values in green.

d. Based on your conclusions in parts b and c, shade in red the names of any variables that should not be included in the initial model because of possible multicollinearity problems.

e. With cell N19 as the upper left-hand corner of the output, fit the full regression model. (Do not include a residual plot.)

f. Considering the regression output from part e, shade (in yellow) the name of any x variable that appears significant and should remain in the model. Also shade the t stat and p-value. Consider the p-value small if it is less than 0.05.

g. Partial Regression Model: With cell N51 as the upper left-hand corner of the output, fit the model including only the x variable(s) that were found to be significant in part f. (Do not include a residual plot.)
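The correlation screening in this question can be sketched in Python as a cross-check on the Excel matrix. The example builds a correlation matrix, ranks the x variables by absolute correlation with y, and flags x pairs whose absolute correlation exceeds a cut-off. The data are hypothetical, only three of the x variables are shown, and the 0.7 threshold is an assumption, since the assignment does not state one.

```python
import math
from itertools import combinations

def pearson_r(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_matrix(columns):
    """Nested dict: matrix[a][b] is the correlation between columns a and b."""
    names = list(columns)
    return {a: {b: pearson_r(columns[a], columns[b]) for b in names} for a in names}

def rank_by_correlation_with_y(columns, y_name):
    """x variable names sorted by |r| with y, strongest first."""
    xs = [n for n in columns if n != y_name]
    return sorted(xs, key=lambda n: abs(pearson_r(columns[n], columns[y_name])),
                  reverse=True)

def flag_collinear_pairs(columns, y_name, threshold=0.7):
    """Pairs of x variables with |r| >= threshold (threshold is an assumption)."""
    xs = [n for n in columns if n != y_name]
    flagged = []
    for a, b in combinations(xs, 2):
        r = pearson_r(columns[a], columns[b])
        if abs(r) >= threshold:
            flagged.append((a, b, r))
    return flagged

# Hypothetical illustration data, NOT the values in the Self Esteem worksheet.
data = {
    "SelfEsteem": [20, 25, 22, 30, 28, 35],
    "YrsEdu": [10, 12, 11, 14, 13, 16],
    "Age": [25, 40, 31, 52, 36, 45],
    "MonWork": [3, 12, 5, 6, 11, 8],
}

matrix = correlation_matrix(data)
print(rank_by_correlation_with_y(data, "SelfEsteem"))
for a, b, r in flag_collinear_pairs(data, "SelfEsteem"):
    print(f"possible multicollinearity: {a} vs {b}, r = {r:.3f}")
```

When two x variables are flagged as a pair, a common choice is to keep the one more strongly correlated with y, though the assignment leaves that judgment to the analyst.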

Question 3:

A bank must prepare for a discrimination suit filed on behalf of female employees who claim that females are paid less than male employees. The bank manager sampled employee files to see if he could build a useful model for predicting salary as a function of gender and other characteristics. For each employee, the data include salary (y, in thousands of dollars), years of experience (YrsExp), years of prior experience (YrsPrior), and Gender. The data are in the spreadsheet named Bank.

1. Since Gender is a categorical variable, construct the appropriate dummy variable in column E to indicate gender as female = 1 and male = 0.  You must use an "IF" statement in the appropriate cell(s) to indicate the correct dummy value based on gender.

2. With cell H7 as the upper left-hand corner of the output, fit the full model. (Do not include a residual plot.)

3. Based on the regression output from part 2, shade (in yellow) the name of any x variable that appears significant and should remain in the model. Also shade the t stat and p-value.
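The dummy coding in part 1 follows the same logic as an Excel IF statement: return 1 when the Gender entry is female, otherwise 0. A minimal Python sketch of that logic is shown below. The label text "Female"/"Male" is an assumption, since the actual entries in the Bank worksheet are not shown here, and `female_dummy` is a hypothetical helper name.

```python
def female_dummy(gender):
    """Return 1 for female and 0 otherwise, like an IF on the Gender cell.

    Assumes the Gender column spells out "Female"/"Male"; adjust the
    comparison if the worksheet uses other labels.
    """
    return 1 if gender.strip().lower() == "female" else 0

# Hypothetical illustration values, NOT the Bank worksheet data.
genders = ["Female", "Male", "Male", "Female"]
dummies = [female_dummy(g) for g in genders]
print(dummies)
```

With female coded as 1, the Gender coefficient in the fitted model estimates the salary difference for females relative to males, holding the experience variables fixed.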

Attachment:- Assignment.rar


  • Category:- Applied Statistics
  • Reference No.:- M92186033
