Analysis of Correlates of School Performance

You have been hired as a consultant to the California Department of Education to analyze elementary school outcome data in order to understand the predictors of school-level variation in academic performance. A select group of demographic and outcome variables is available in the R data file "calschooldist.csv" (note: these are real data). The file contains school outcomes for 400 elementary schools randomly selected across the state of California. The data set contains the following variables:

Your task is to build a reliable regression model that identifies the correlates of school performance. Your dependent variable is acadperf, a measure of schoolwide standardized test scores (higher is better).

Policy background
By looking at academic performance at the school level (rather than the individual level), we can obtain important insights into how context matters. Sadly, this analysis highlights how much student performance varies with characteristics that are outside a student's control, such as the poverty of the community they were born in. It also raises some interesting questions related to educational practice. The ideal way to conduct this analysis would be a two-level analysis, so that we could separate student-level from school-level factors. But I think you'll find the analysis of school-level factors interesting.

Please follow the instructions below carefully:

Memo:

Using a hierarchical modeling strategy (as described in class and in Field), build a regression model that best predicts school performance. Write a 3-4 page memo that describes your findings. Discuss what other data you would like to collect in order to strengthen your findings. Describe the main substantive results from your final regression model only. Leave the more technical details about model assumptions and the model building process to the technical appendix (described below). Remember: learning which variables are statistically insignificant can be just as important as learning which ones are statistically significant.

Model building documentation appendix:

Important: the following is a step-by-step description of the model building process. Regression can get unwieldy if you don't have a plan, so I would suggest that you closely follow the steps I outline below.

Then, in a couple of pages, prepare a concise technical appendix where you answer the questions below. In the technical appendix, you document how you went through the 10-step process. See below for more detail, by step. Note: brief answers are okay!

Step 1: Without looking at the data, record expectations: what factors are likely to explain school performance (make a "wish list" of independent variables)?

Step 2: Reconcile "wish list" with available data. Take note of variables that you can't measure because they aren't available (to gauge omitted variable bias). List those variables here.

Step 3: Create a list of the variables in your wish list that are available in the data (or have close proxies). These are your candidate independent variables.

Step 4: Perform basic checks of the candidate variables. Do you have any missing-value or out-of-range data problems? If so, what did you do to resolve them, if anything?
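The Step 4 checks can be sketched as follows. This is only an illustration in plain Python (the course lab itself uses R): the rows are invented toy values, not the real "calschooldist.csv" data, the column name `meals` is a hypothetical stand-in for the continuous free meals variable, and the valid ranges are assumptions that should be replaced with the ones in the codebook.

```python
# Toy rows standing in for calschooldist.csv; all values are invented.
rows = [
    {"acadperf": 693, "meals": 67.0},
    {"acadperf": 570, "meals": None},    # missing value
    {"acadperf": 546, "meals": 100.0},
    {"acadperf": 571, "meals": -21.0},   # out of range: a percentage cannot be negative
]

# Assumed valid ranges (hypothetical; take the real ones from the codebook).
valid_range = {"acadperf": (0, 1000), "meals": (0.0, 100.0)}


def check_column(rows, name, low, high):
    """Count missing and out-of-range values for one column."""
    values = [r[name] for r in rows]
    missing = sum(v is None for v in values)
    out_of_range = sum(v is not None and not (low <= v <= high) for v in values)
    return {"missing": missing, "out_of_range": out_of_range}


report = {
    name: check_column(rows, name, low, high)
    for name, (low, high) in valid_range.items()
}
print(report)
```

Whatever you decide to do with flagged rows (drop, recode, or leave), record the decision in the appendix.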

Step 5: What did your check of the correlation matrix find? Did you add any variables to the end of your list based on it? Does it look like you need to worry about multicollinearity?
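As a rough sketch of the Step 5 check, the snippet below builds a Pearson correlation matrix and flags pairs of predictors that are strongly correlated with each other. The data are invented, `meals` and `ell` are hypothetical column names, and the 0.8 cutoff is only a commonly used rule of thumb, not a requirement of the assignment.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Invented toy data; "meals" and "ell" are hypothetical predictor names.
data = {
    "acadperf": [690, 570, 545, 620, 800, 730],
    "meals": [60, 90, 95, 80, 20, 40],
    "ell": [10, 40, 45, 30, 5, 8],
}
names = list(data)

# Full correlation matrix as a nested dict: corr[a][b].
corr = {a: {b: pearson(data[a], data[b]) for b in names} for a in names}

# Flag predictor pairs (excluding the dependent variable) with |r| > 0.8.
flagged = [
    (a, b)
    for i, a in enumerate(names)
    for b in names[i + 1:]
    if "acadperf" not in (a, b) and abs(corr[a][b]) > 0.8
]
print(flagged)
```

A flagged pair is a prompt to look more closely (for example with VIF or tolerance as variables are added), not an automatic reason to drop a variable.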

Step 6: Write down the order of entry based on your best guess, given your knowledge of the field (this protects against specification error). If you added any variables based on the correlation analysis, add them to the end of your list. They should be given lowest priority, since a priori expectations did not suggest their importance.

Step 7: Add your first independent variable. Show your bivariate model. Did it accord with your expectations?
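For reference, the bivariate model in Step 7 is an ordinary least squares fit of acadperf on a single predictor. The sketch below computes the slope, intercept, and R-squared by hand on invented data; `meals` is a hypothetical name for the first predictor entered, and the lab software will report the same quantities along with standard errors and t statistics.

```python
def fit_bivariate(x, y):
    """Least squares fit of y = intercept + slope * x; returns (slope, intercept, r_squared)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return slope, intercept, 1 - ss_res / ss_tot


# Invented toy data; "meals" is a hypothetical predictor name.
meals = [60, 90, 95, 80, 20, 40]
acadperf = [690, 570, 545, 620, 800, 730]

slope, intercept, r_squared = fit_bivariate(meals, acadperf)
print(f"slope={slope:.2f} intercept={intercept:.1f} R^2={r_squared:.3f}")
```

The sign of the slope is the first thing to compare against the expectations recorded in Step 1.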

Step 8: Check for regression violations for this bivariate model. Did you find any major violations?

Step 9: Sequentially build up the model, adding variables in the order you specified (don't check regression assumptions at each stage).

Add variables one by one. As you add variables:

- Drop variables that are insignificant unless there is a strong theoretical reason to keep them.

- If an insignificant new variable makes an existing variable insignificant, just drop the new one.

- If the new variable is significant but adding it makes an old variable insignificant, keep both. Theory led you to think the old variable was important, so keep it.

- Keep track of variables which are not significant. This is important to document.

Briefly document what you kept and what you dropped.

You do NOT need to check assumptions for each variable you add. Only do this for the bivariate model and your final model. The one exception is multicollinearity: it can be useful to check for it as you add variables.

Step 10: Recheck model assumptions for your final model. The final model is the one you should write about.

Discuss your final model, and review the coefficient table and the other key statistics in detail (Bs, R-squared, t statistics, F statistics, standardized Bs, etc.). Also briefly discuss whether the final model satisfied the regression assumptions overall. If not, what are some options for improving the model fit?

Review the distance measures and influence statistics that Field discusses (Cook's distance, etc.) for the final model. What do they suggest?
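To make the influence check concrete, here is a minimal sketch of Cook's distance for a one-predictor model, where the leverage has a simple closed form. The data are invented (the last school is deliberately unusual), `meals` is a hypothetical predictor name, and the cutoff of 1 is a common rule of thumb rather than a hard rule. For a model with several predictors, use the values reported by your statistics software instead.

```python
def cooks_distance(x, y):
    """Cook's distance for each observation in a one-predictor least squares fit."""
    n = len(x)
    p = 2  # number of estimated parameters: intercept and slope
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    resid = [b - (intercept + slope * a) for a, b in zip(x, y)]
    mse = sum(e * e for e in resid) / (n - p)
    # Leverage of each point in simple regression.
    leverage = [1 / n + (a - mx) ** 2 / sxx for a in x]
    return [
        (e * e) / (p * mse) * h / (1 - h) ** 2
        for e, h in zip(resid, leverage)
    ]


# Invented toy data; the last school has an extreme predictor value.
meals = [10, 20, 30, 40, 50, 95]
acadperf = [800, 760, 730, 690, 650, 700]

distances = cooks_distance(meals, acadperf)
# Indices of observations above the rule-of-thumb cutoff of 1.
influential = [i for i, d in enumerate(distances) if d > 1]
print(influential)
```

A flagged case is worth inspecting and discussing; it is not by itself a reason to delete the school from the data.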

Notes:

- The free meals variable is included both as a continuous and as a categorical variable. I would suggest starting with the continuous one and using the categorical one only if you want to explore the relationship between income and performance in more detail. If you do that, remember that categorical variables need to be dummy coded, and do not use the continuous and categorical versions of the free meals variable in the same model. Also beware of the dummy variable trap: you can't enter a dummy variable for every level of a variable, so you need to drop at least one level.
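The dummy coding described in this note can be sketched as below. The number of mealcat levels (three) and their codes are assumptions made for illustration, and `dummy_code` is a hypothetical helper, not a function from the lab code. One level is left out as the reference category, which is what avoids the dummy variable trap.

```python
def dummy_code(values, prefix, reference):
    """Return 0/1 indicator columns for every level except the reference level."""
    levels = sorted(set(values))
    return {
        f"{prefix}_{level}": [1 if v == level else 0 for v in values]
        for level in levels
        if level != reference
    }


# Invented toy values; assumes mealcat has three levels coded 1, 2, 3.
mealcat = [1, 2, 3, 2, 1, 3]

# Level 1 is the reference category, so only two dummies are created.
dummies = dummy_code(mealcat, prefix="mealcat", reference=1)
print(dummies)
```

Each dummy coefficient is then read as the difference in expected acadperf between that level and the reference level, holding the other predictors constant.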

Step 11: Advanced options (Please try at least one of these)

1. Explore the use of logarithms of the dependent variable. Do these improve your model?

2. For some of your predictors, create dummy variables for the schools that score "high" on the variable (that is, those in the top quartile). See the code for how to do this (it uses the high ELL variable as an example). Do there appear to be threshold effects? In other words, do these dummy variables perform better than continuous versions of the same domains? (When you add these variables, remove the continuous versions.)

3. Create an interaction variable (by multiplying two dummy variables). Test for interaction effects. If you do this, make sure that the main effects are also included in the model. Alternatively, it could be interesting to run your final model separately by subgroups of a key variable (such as mealcat).

4. Use the visualization tools included in the lab (under "visualization extensions").
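Options 2 and 3 can be sketched together: build top-quartile dummies for two predictors, then multiply them to get an interaction term. This is not the lab's own code. The data are invented, `ell` and `meals` are hypothetical column names, and the exact cut point depends on the quantile definition used (Python's `statistics.quantiles` defaults to the "exclusive" method, which can differ slightly from the default in other software).

```python
from statistics import quantiles


def top_quartile_dummy(values):
    """Return 1 for values strictly above the 75th percentile, else 0."""
    cutoff = quantiles(values, n=4)[2]  # third cut point = 75th percentile
    return [1 if v > cutoff else 0 for v in values]


# Invented toy data; "ell" and "meals" are hypothetical predictor names.
ell = [20, 5, 40, 15, 35, 10, 30, 25]
meals = [50, 30, 95, 90, 85, 20, 60, 40]

high_ell = top_quartile_dummy(ell)
high_meals = top_quartile_dummy(meals)

# Interaction term: 1 only for schools that are high on both variables.
high_ell_x_high_meals = [a * b for a, b in zip(high_ell, high_meals)]
print(high_ell, high_meals, high_ell_x_high_meals)
```

When testing the interaction, both main effects (`high_ell` and `high_meals`) stay in the model alongside the product term.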
