Analysis of Correlates of School Performance

You have been hired as a consultant to the California Department of Education to analyze elementary school outcome data in order to understand the predictors of school-level variation in academic performance. A select group of demographic and outcome variables is available. The R data file called "calschooldist.csv" contains these data (note: these are real data). This file contains school outcomes for 400 elementary schools randomly selected across the state of California. The data set contains the following variables:

Your task is to build a reliable regression model that identifies the correlates of school performance. Your dependent variable is acadperf, a measure of schoolwide standardized test scores (higher is better).

Policy background
By looking at academic performance at the school level (rather than the individual level), we can obtain important insights into how context matters. Sadly, this analysis highlights how much student performance varies with characteristics that are outside students' control, such as the poverty of the community they were born in. It also raises some interesting questions related to educational practice. The ideal way to conduct this analysis would be a two-level analysis that separates student-level from school-level factors. But I think you'll find the analysis of school-level factors interesting.

Please follow the instructions below carefully:

Memo:

Using a hierarchical modeling strategy (as described in class and in Field), build a regression model that best predicts school performance. Write a 3-4 page memo that describes your findings. Discuss what other data you would like to collect in order to strengthen your findings. Describe the main substantive results from your final regression model only. Leave the more technical details about model assumptions and the model-building process to the technical appendix (described below). Remember: learning which variables are statistically insignificant can be just as important as learning which ones are statistically significant.

Model building documentation appendix:

Important: the following is a step-by-step description of the model-building process. Regression can get unwieldy if you don't have a plan, so I suggest that you closely follow the steps outlined below.

Then, in a couple of pages, prepare a concise technical appendix in which you answer the questions below. The technical appendix documents how you went through the 10-step process. See below for more detail, by step. Note: brief answers are okay!

Step 1: Without looking at the data, record expectations: what factors are likely to explain school performance (make a "wish list" of independent variables)?

Step 2: Reconcile "wish list" with available data. Take note of variables that you can't measure because they aren't available (to gauge omitted variable bias). List those variables here.

Step 3: Create a list of the variables in your wish list that are available in the data (or have close proxies). These are your candidate independent variables.

Step 4: Perform basic checks of the candidate variables. Do you have any missing-value or out-of-range data problems? If so, what did you do to resolve them, if anything?
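
The checks in Step 4 can be sketched as follows. This is only an illustration in Python (the lab code for this assignment is in R); the column names `meals` and `ell`, the valid ranges, and the inline rows are made-up assumptions and not the contents of calschooldist.csv.

```python
import csv
import io

# Hypothetical toy rows standing in for calschooldist.csv; real column names may differ.
TOY_CSV = """acadperf,meals,ell
690,67,9
,92,21
540,150,35
810,10,-3
"""

# Assumed plausible ranges (percentages must lie in 0-100); adjust to the codebook.
VALID_RANGES = {"acadperf": (0, 1000), "meals": (0, 100), "ell": (0, 100)}


def check_columns(rows, valid_ranges):
    """Count missing and out-of-range values per column."""
    report = {col: {"missing": 0, "out_of_range": 0} for col in valid_ranges}
    for row in rows:
        for col, (low, high) in valid_ranges.items():
            raw = (row.get(col) or "").strip()
            if raw == "":
                report[col]["missing"] += 1
                continue
            value = float(raw)
            if not low <= value <= high:
                report[col]["out_of_range"] += 1
    return report


rows = list(csv.DictReader(io.StringIO(TOY_CSV)))
report = check_columns(rows, VALID_RANGES)
for col, counts in report.items():
    print(col, counts)
```

Counting problems before deciding how to handle them keeps the appendix answer short: report the counts, then state whether the affected schools were dropped, corrected, or left in.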

Step 5: What did your check of the correlation matrix find? Did you add any variables to the end of your list based on it? Does it look like you need to worry about multicollinearity?
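
One way to approach Step 5 is to compute pairwise Pearson correlations and flag predictor pairs that are strongly related. The sketch below uses made-up values, hypothetical column names (`meals`, `ell`), and an assumed 0.8 cut-off that is only a rule of thumb.

```python
import math

# Made-up toy values; `meals` and `ell` are hypothetical column names.
data = {
    "acadperf": [690, 560, 540, 810, 720, 600],
    "meals":    [40, 85, 90, 10, 30, 70],
    "ell":      [12, 30, 35, 5, 8, 22],
}


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


names = list(data)
corr = {(a, b): pearson(data[a], data[b]) for a in names for b in names}

# Print the matrix row by row.
for a in names:
    print(a.ljust(9), " ".join(f"{corr[(a, b)]:6.2f}" for b in names))

# Flag predictor pairs above the assumed rule-of-thumb threshold.
THRESHOLD = 0.8
predictors = [name for name in names if name != "acadperf"]
flagged = [
    (a, b)
    for i, a in enumerate(predictors)
    for b in predictors[i + 1:]
    if abs(corr[(a, b)]) > THRESHOLD
]
print("Highly correlated predictor pairs:", flagged)
```

A flagged pair does not by itself mean a variable must be dropped; it is a prompt to watch the coefficients and standard errors when both variables are in the model.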

Step 6: Write down the order of entry based on your best guess given your knowledge of the field (protection against specification error). If you added any variables based on the correlation analysis, add them to the end of your list. They should be given lowest priority, since a priori expectations did not suggest their importance.

Step 7: Add your first independent variable. Show your bivariate model. Did it accord with your expectations?
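
A bivariate model of the kind Step 7 asks for can be sketched as below. The numbers are made up, `meals` (percent of students receiving free meals) is a hypothetical column name, and the closed-form formulas are the standard ones for a one-predictor least-squares fit.

```python
import math

# Made-up toy values; `meals` (percent free meals) is a hypothetical column name.
meals = [40, 85, 90, 10, 30, 70, 55, 20]
acadperf = [690, 560, 540, 810, 720, 600, 650, 760]


def simple_ols(x, y):
    """Fit y = b0 + b1 * x by least squares; return estimates and fit statistics."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    residuals = [b - (b0 + b1 * a) for a, b in zip(x, y)]
    sse = sum(e ** 2 for e in residuals)
    sst = sum((b - mean_y) ** 2 for b in y)
    se_b1 = math.sqrt(sse / (n - 2) / sxx)  # standard error of the slope
    return {"b0": b0, "b1": b1, "r2": 1 - sse / sst, "t_b1": b1 / se_b1}


fit = simple_ols(meals, acadperf)
print({key: round(value, 3) for key, value in fit.items()})
```

The sign of `b1` is the first thing to compare against the expectations recorded in Step 1.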

Step 8: Check for regression violations for this bivariate model. Did you find any major violations?

Step 9: Sequentially build up the model, adding variables in the order you specified (don't check regression assumptions at each stage).

Add variables one by one. As you add variables:

- Drop variables that are insignificant unless there is a strong theoretical reason to keep them.

- If an insignificant new variable makes an existing variable insignificant, just drop the new one.

- If the new variable is significant but adding it makes an old variable insignificant, keep both. Theory led you to think the old one was important, so keep it.

- Keep track of variables which are not significant. This is important to document.

Briefly document what you kept and what you dropped.

You do NOT need to check assumptions for each variable you add. Only do this for the bivariate model and your final model. The one exception is multicollinearity: it can be useful to check for it as you add variables.
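
The entry rules in Step 9 can be sketched as a small loop. This is an illustration under stated assumptions, not the course's R workflow: the data are constructed so that the outcome is easy to follow, the names `meals`, `enroll`, and `ell` are hypothetical, and |t| >= 2 is used as a rough stand-in for significance at the 5% level instead of an exact p-value. The "strong theoretical reason" exception is a judgment call and is not coded.

```python
import math


def ols(columns, y):
    """OLS with an intercept via the normal equations; returns (coefs, t_stats)."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    xtx = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)] for i in range(k)]
    # Invert X'X by Gauss-Jordan elimination on the augmented matrix [X'X | I].
    aug = [row + [1.0 if i == j else 0.0 for j in range(k)] for i, row in enumerate(xtx)]
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(aug[r][c]))
        aug[c], aug[pivot] = aug[pivot], aug[c]
        p = aug[c][c]
        aug[c] = [v / p for v in aug[c]]
        for r in range(k):
            if r != c:
                f = aug[r][c]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[c])]
    inv = [row[k:] for row in aug]
    xty = [sum(X[r][i] * y[r] for r in range(n)) for i in range(k)]
    beta = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    resid = [y[r] - sum(X[r][j] * beta[j] for j in range(k)) for r in range(n)]
    mse = sum(e * e for e in resid) / (n - k)
    t_stats = [beta[i] / math.sqrt(mse * inv[i][i]) for i in range(k)]
    return beta, t_stats


def build_sequentially(candidates, order, y, t_cut=2.0):
    """Add variables in the given order; keep a new one only if its |t| >= t_cut.

    Variables already kept stay in the model even if a later addition
    weakens them, mirroring the "keep both" rule.
    """
    kept, dropped = [], []
    for name in order:
        trial = kept + [name]
        _, t = ols([candidates[v] for v in trial], y)
        if abs(t[-1]) >= t_cut:
            kept = trial
        else:
            dropped.append(name)
    return kept, dropped


# Constructed toy data (not real): acadperf depends on meals and ell, not enroll.
candidates = {
    "meals":  [10, 20, 30, 40, 50, 60, 70, 80],
    "enroll": [450, 350, 450, 350, 350, 450, 350, 450],
    "ell":    [30, 30, 10, 10, 10, 10, 30, 30],
}
acadperf = [767, 727, 713, 693, 663, 623, 577, 557]

kept, dropped = build_sequentially(candidates, ["meals", "enroll", "ell"], acadperf)
print("kept:", kept)
print("dropped:", dropped)
```

The `dropped` list is the record of insignificant variables that the appendix asks you to document.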

Step 10: Recheck model assumptions for your final model. The final model is the one you should write about.

Discuss your final model, reviewing the coefficient table in detail along with the other key statistics (Bs, R-squared, t statistics, F statistics, standardized Bs, etc.). Also, briefly discuss whether the final model satisfied regression assumptions overall. If not, what are some options for improving the model fit?

Review the distance measures and influence statistics that Field discusses (Cook's distance, etc.) for the final model. What do they suggest?
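
To make the influence statistics concrete, here is a sketch of Cook's distance for a one-predictor model, where leverage has a simple closed form. The data are made up, `meals` is a hypothetical column name, and the last school is deliberately unusual. A final model with several predictors would need the full hat matrix, which statistical software computes for you; a common rule of thumb treats values above 1 as worth a closer look.

```python
# Made-up toy data with one deliberately unusual school (last row).
meals = [10, 20, 30, 40, 50, 60, 70, 95]
acadperf = [800, 770, 745, 710, 690, 655, 630, 780]


def cooks_distance(x, y):
    """Cook's distance for each case in the simple regression of y on x."""
    n, p = len(x), 2  # p = number of estimated coefficients (intercept + slope)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    b1 = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sxx
    b0 = mean_y - b1 * mean_x
    resid = [b - (b0 + b1 * a) for a, b in zip(x, y)]
    mse = sum(e ** 2 for e in resid) / (n - p)
    # Leverage of each case in a one-predictor model.
    leverage = [1 / n + (a - mean_x) ** 2 / sxx for a in x]
    return [e ** 2 / (p * mse) * h / (1 - h) ** 2 for e, h in zip(resid, leverage)]


distances = cooks_distance(meals, acadperf)
for i, d in enumerate(distances):
    flag = "  <- inspect" if d > 1 else ""
    print(f"school {i}: D = {d:.2f}{flag}")
```

A flagged school is a reason to look at the case, not an automatic reason to delete it.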

Notes:

- The free meals variable is included as both a continuous and a categorical variable. I suggest starting with the continuous one and using the categorical one only if you want to explore the relationship between income and performance in more detail. If you do that, remember that categorical variables need to be dummy coded, and do not use both the continuous and categorical versions of the free meals variable in the same model. Also beware of the dummy variable trap: you can't enter a dummy variable for every level of a variable; you need to drop at least one.
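
Dummy coding with a dropped reference level can be sketched as follows. The category codes below are made up; the actual levels of mealcat in the data file are an assumption here, and `dummy_code` is a hypothetical helper written for this illustration.

```python
# Made-up category codes; the actual levels of mealcat are an assumption here.
mealcat = [1, 2, 3, 2, 1, 3]


def dummy_code(values, reference):
    """Return one 0/1 column per level except the reference level."""
    levels = sorted(set(values))
    if reference not in levels:
        raise ValueError(f"reference level {reference!r} not found")
    return {
        f"mealcat_{level}": [1 if v == level else 0 for v in values]
        for level in levels
        if level != reference
    }


dummies = dummy_code(mealcat, reference=1)
print(dummies)
# {'mealcat_2': [0, 1, 0, 1, 0, 0], 'mealcat_3': [0, 0, 1, 0, 0, 1]}
```

Leaving out the reference level avoids the dummy variable trap: each remaining coefficient is then read as the difference from schools in the omitted category.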

Step 11: Advanced options (Please try at least one of these)

1. Explore the use of logarithms of the dependent variable. Does this improve your model?

2. For some of your predictors, create dummy variables for schools that score "high" on the variable (that is, those in the top quartile). See the code for how to do this (using the high ELL variable as an example). Do there appear to be threshold effects? In other words, do these dummy variables perform better than continuous versions of the same domains? (When you add these variables, remove the continuous versions.)

3. Create an interaction variable (by multiplying two dummy variables). Test for interaction effects. If you do this, make sure that the main effects are also included in the model. Alternatively, it could be interesting to run your final model separately by subgroups of a key variable (such as mealcat).

4. Use the visualization tools included in the lab (under "visualization extensions").
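
Options 2 and 3 can be sketched together: build top-quartile dummies, then multiply two of them to form an interaction term. This is not the lab's R code. The values are made up, `ell` and `meals` are hypothetical column names, "top quartile" is taken here to mean at or above the 75th percentile, and the `inclusive` method is chosen on the assumption that it matches the default quantile definition in R.

```python
import statistics

# Made-up toy values; `ell` and `meals` are hypothetical column names.
ell = [5, 8, 12, 22, 30, 35, 40, 50]
meals = [10, 30, 40, 70, 85, 90, 20, 95]


def top_quartile_dummy(values):
    """1 if the value is at or above the 75th percentile, else 0."""
    q3 = statistics.quantiles(values, n=4, method="inclusive")[2]
    return [1 if v >= q3 else 0 for v in values]


high_ell = top_quartile_dummy(ell)
high_meals = top_quartile_dummy(meals)

# Interaction term: 1 only when a school is high on both variables.
high_both = [a * b for a, b in zip(high_ell, high_meals)]

print("high_ell:  ", high_ell)
print("high_meals:", high_meals)
print("high_both: ", high_both)
```

When testing the interaction, `high_ell` and `high_meals` both stay in the model alongside `high_both`, so the interaction coefficient is the extra effect of being high on both.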
