Data Exploration and Descriptive Statistics.

For the final, you will pick four variables to work with. At least one of them has to be an interval-ratio variable; please consult me if you are having trouble finding one. Your other variables can be nominal, ordinal, or interval-ratio.

Part. Data Exploration

a) Run a histogram for each interval-ratio variable. Copy and paste the histograms into your Word file. Briefly describe each distribution, noting its overall shape and any outliers.
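
The assignment itself is done in Stata; the Python sketch below only illustrates what a histogram counts, so that gaps and outliers are easier to recognize. The `bin_counts` helper and the `ages` values are hypothetical, invented for illustration.

```python
from collections import Counter

def bin_counts(values, width):
    """Count observations in bins [k*width, (k+1)*width)."""
    counts = Counter((v // width) * width for v in values)
    return dict(sorted(counts.items()))

# Hypothetical ages, invented for illustration only.
ages = [21, 23, 25, 28, 31, 33, 34, 36, 38, 41, 44, 89]

# Print a rough text histogram; bins with no observations are not listed,
# so a jump in the bin labels signals a gap in the data.
for start, n in bin_counts(ages, 10).items():
    print(f"{start:>3}-{start + 9:<3} {'#' * n}")
```

In this made-up sample the single value in the 80-89 bin, separated from the rest by empty bins, is the kind of outlier the description should mention.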

b) Build a scatter plot if you have two or more interval-ratio variables. What type of relationship, if any, can you observe between the variables?

Now, turn to your categorical variables (if you have any).

c) Run a frequency distribution for each of your categorical variables; you can use either the tab or the fre command. Copy and paste the output into your Word file and briefly describe the distribution of the variable. Note which category has the most observations and which categories have very few.
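
As a rough illustration of what a one-way frequency table reports (counts and percentages per category), here is a minimal Python sketch. The `frequency_table` helper and the `marital` data are hypothetical and are not part of the assignment.

```python
from collections import Counter

def frequency_table(values):
    """Return (category, count, percent) rows, most frequent first."""
    counts = Counter(values)
    n = len(values)
    return [(cat, k, 100 * k / n) for cat, k in counts.most_common()]

# Hypothetical marital-status responses, invented for illustration only.
marital = ["married"] * 5 + ["single"] * 3 + ["divorced"] + ["widowed"]

for cat, k, pct in frequency_table(marital):
    print(f"{cat:<10} {k:>3} {pct:6.1f}%")
```

The first row is the modal category; rows with only one or two observations are the sparse categories worth flagging.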

d) If you have two or more categorical variables, run a cross-tab to examine the relationship between two of them using the tab command and either column or row percentages. Briefly describe the relationship between the variables. Copy and paste your output into your Word file.
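
The column percentages in a cross-tab can be sketched as follows. This is an illustration in Python rather than Stata output; the `crosstab_col_pct` helper and the `sex` and `vote` data are hypothetical.

```python
from collections import Counter

def crosstab_col_pct(rows, cols):
    """Return {(row_value, col_value): percent of that column}."""
    cell = Counter(zip(rows, cols))
    col_total = Counter(cols)
    return {(r, c): 100 * n / col_total[c] for (r, c), n in cell.items()}

# Hypothetical data, invented for illustration only.
sex = ["f", "f", "f", "f", "m", "m", "m", "m"]
vote = ["yes", "yes", "yes", "no", "yes", "no", "no", "no"]

table = crosstab_col_pct(vote, sex)
for (r, c), pct in sorted(table.items()):
    print(f"vote={r:<3} sex={c}: {pct:.0f}% of column")
```

Column percentages are compared across columns within a row: here 75% of one group answers "yes" against 25% of the other, which is the pattern the written description should report.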

Part. Descriptive Statistics

Now, calculate descriptive statistics for your variables using the sum command. You can do all four variables at once.

a) Make sure you describe the mean of each variable. If the mean is not a good measure of central tendency for a particular variable, please explain why. Do the same for the standard deviation and the range.
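
To see why the mean can mislead, the following Python sketch computes the same kinds of summaries on a skewed sample. The `describe` helper and the `income` values are hypothetical; the median is included only for comparison and is an addition to what the assignment asks for.

```python
import statistics

def describe(values):
    """Mean, median, sample standard deviation, minimum and maximum."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "sd": statistics.stdev(values),
        "min": min(values),
        "max": max(values),
    }

# Hypothetical incomes in thousands, with one extreme value.
income = [30, 32, 35, 38, 40, 45, 200]

for name, value in describe(income).items():
    print(f"{name:<6} {value:.2f}")
```

In this invented sample a single extreme value pulls the mean well above the median and inflates both the standard deviation and the range, which is the sort of explanation part a) is looking for.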

Of course, the type of correlation that you should calculate depends upon the type of variables you are working with. At a minimum you should calculate three correlations, but you must be very careful to use the right type of correlation coefficient for your data.

Just a few reminders:

1) To correlate interval-ratio variables, use Pearson's r. Make sure you display a scatter plot before you run your correlation.

2) To correlate ordinal variables, use Spearman's rho. Make sure you display a cross-tab before you run your correlation.

3) To correlate nominal variables, use lambda. Make sure you display a cross-tab before you run your correlation.

For each of your three correlations make sure you describe the size, statistical significance and, when applicable, the direction of the relationship.
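
The difference between Pearson's r and Spearman's rho can be sketched in a few lines of Python. This is an illustration of the arithmetic only: it does not compute p-values, the helper names are hypothetical, and the data are invented. Spearman's rho is computed here as Pearson's r on average ranks.

```python
import math

def pearson_r(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical data: y rises with x, but not along a straight line.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
print(round(pearson_r(x, y), 3), round(spearman_rho(x, y), 3))
```

Because the invented relationship is monotonic but curved, rho is 1 while r falls slightly short of 1; both are positive, so the direction is the same.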

Notice that the three correlations you report could be very different depending on the types of variables you are working with.
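
For nominal variables, lambda has no sign because categories have no order. A minimal Python sketch of the asymmetric Goodman-Kruskal lambda (the proportional reduction in error when predicting y from x) is given below; the `gk_lambda` helper and the data are hypothetical, and the sketch assumes y takes more than one value.

```python
from collections import Counter

def gk_lambda(x, y):
    """Asymmetric lambda: reduction in prediction errors for y given x."""
    n = len(y)
    modal_overall = max(Counter(y).values())
    # Largest cell count of y within each category of x.
    modal_by_x = {}
    for (xv, _), count in Counter(zip(x, y)).items():
        modal_by_x[xv] = max(modal_by_x.get(xv, 0), count)
    return (sum(modal_by_x.values()) - modal_overall) / (n - modal_overall)

# Hypothetical data, invented for illustration only.
sex = ["f", "f", "f", "f", "m", "m", "m", "m"]
vote = ["yes", "yes", "yes", "no", "yes", "no", "no", "no"]
print(gk_lambda(sex, vote))
```

With these invented values, guessing the overall modal answer makes 4 errors, while guessing the modal answer within each group makes 2, so lambda is 0.5.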

Regression and Multiple Regression

Now you will build a series of regression models. Before you begin keep the following in mind:

-Your outcome variable MUST be interval ratio.
-The interpretation of the regression coefficients depends upon the type of variable you are using and its coding.
-For categorical predictors you might want to do some recoding. If you recode any variables make sure that you SAVE your data and that you describe how you recoded in your homework.

Part 1.
Estimate a regression model with a single predictor. Interpret the regression coefficient and its p-value, the intercept, and the R².
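
The quantities to interpret in Part 1 can be sketched as follows. This Python illustration computes the intercept, slope and R² for one predictor by least squares; it does not compute the p-value, and the `simple_ols` helper and the `educ` and `wage` data are hypothetical.

```python
def simple_ols(x, y):
    """Return (intercept, slope, r_squared) for y regressed on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return intercept, slope, 1 - ss_res / ss_tot

# Hypothetical years of education and hourly wage, invented for illustration.
educ = [10, 12, 14, 16, 18]
wage = [12, 15, 19, 22, 27]

b0, b1, r2 = simple_ols(educ, wage)
print(f"intercept={b0:.2f} slope={b1:.2f} R2={r2:.3f}")
```

Read on these invented numbers, the slope says that each additional year of education is associated with a 1.85 higher predicted wage, and the intercept is the predicted wage at zero years of education, which lies far outside the observed range.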

Part 2.
Add another predictor variable to the model you estimated in Part 1. Describe any change in the coefficient of the original variable (and its associated p-value) and interpret the coefficient and p-value of the new variable. Note any change in the R².

Part 3.
Add a third predictor variable. Describe any changes in the coefficients and p-values for the variables you entered into the previous models. Interpret the coefficient and p-value for your new variable. Note any change in the R².
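
Why a coefficient can change when a predictor is added is easiest to see on constructed data. The Python sketch below solves the normal equations directly; the `solve` and `ols` helpers are hypothetical, and the data are built so that y = 1 + 2*x1 + 3*x2 exactly, with x1 and x2 correlated. Real data will not fit this cleanly.

```python
def solve(a, b):
    """Solve the linear system a z = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    z = [0.0] * n
    for r in reversed(range(n)):
        z[r] = (m[r][n] - sum(m[r][c] * z[c] for c in range(r + 1, n))) / m[r][r]
    return z

def ols(predictors, y):
    """Return [intercept, b1, b2, ...] from the normal equations X'X b = X'y."""
    cols = [[1.0] * len(y)] + [[float(v) for v in p] for p in predictors]
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(u * v for u, v in zip(ci, y)) for ci in cols]
    return solve(xtx, xty)

# Constructed data (hypothetical): y = 1 + 2*x1 + 3*x2 exactly.
x1 = [1, 2, 3, 4, 5]
x2 = [1, 1, 2, 2, 3]
y = [6, 8, 13, 15, 20]

print("x1 only:  ", [round(b, 3) for b in ols([x1], y)])
print("x1 and x2:", [round(b, 3) for b in ols([x1, x2], y)])
```

With x1 alone, its coefficient is 3.5 because it also absorbs part of the effect of the omitted x2; once x2 is added, the coefficient on x1 drops to 2. That is the kind of change Parts 2 and 3 ask you to describe.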

Effect size, prediction, and diagnostics

There are multiple ways to think about "effect size" in a multiple regression context. In this section:

1) Use the listcoef command after your regression models to obtain standardized coefficients. Briefly interpret the standardized coefficients using 2-3 sentences.
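
As a sketch of what a fully standardized coefficient means, the Python snippet below rescales an unstandardized slope by the ratio of standard deviations. The `standardized_coef` helper is hypothetical, and the slope of 1.85 is assumed to come from a fit of the invented `wage` on `educ` data shown here; it is not output from listcoef.

```python
import statistics

def standardized_coef(b, x, y):
    """Change in y, in SD units, for a one-SD increase in x."""
    return b * statistics.stdev(x) / statistics.stdev(y)

# Hypothetical data and an assumed unstandardized slope of 1.85.
educ = [10, 12, 14, 16, 18]
wage = [12, 15, 19, 22, 27]

beta = standardized_coef(1.85, educ, wage)
print(round(beta, 3))
```

Because standardized coefficients are all in standard-deviation units, they can be compared across predictors measured on different scales. With a single predictor, the value equals Pearson's r.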

2) Now, calculate some "effect sizes" as I show in the video and the notes. The way that you do this section will depend upon the types of variables that you have. For categorical variables it probably makes most sense to calculate an effect at the mode. For interval-ratio variables you might want to use the 25th and 75th percentiles. Do whatever makes sense for the type of data you are working with.

Take a few sentences to describe what you have calculated. Which variable appears to be the most important now?
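
One way to read the percentile-based effect size for an interval-ratio predictor is as the coefficient multiplied by the distance between the 25th and 75th percentiles. The Python sketch below assumes that reading; the method shown in the course video and notes may differ. The `iqr_effect` helper, the data, and the slope of 1.85 are hypothetical, and Python's default quartile rule may not match Stata's.

```python
import statistics

def iqr_effect(b, x):
    """Predicted change in the outcome when x moves from its 25th to 75th percentile."""
    q1, _, q3 = statistics.quantiles(x, n=4)
    return b * (q3 - q1)

# Hypothetical predictor values and an assumed slope of 1.85.
educ = [10, 12, 14, 16, 18]
print(round(iqr_effect(1.85, educ), 2))
```

For a 0/1 categorical predictor, the coefficient itself is already the predicted difference between the two groups, so it can be compared with these interquartile effects directly.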

Predicted values:

1) Create 2-4 "archetypes" or "representative cases" and calculate predicted values for those cases. How you do this will depend upon what types of variables you are working with. Please show your work and explain your archetypes.
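
Showing the work for an archetype amounts to plugging its values into the fitted equation. A minimal Python sketch follows; the coefficient values, the variable names `educ` and `female`, and the two archetypes are all hypothetical.

```python
def predict(coefs, case):
    """Intercept plus the sum of coefficient * value for each predictor."""
    return coefs["_cons"] + sum(coefs[name] * value for name, value in case.items())

# Hypothetical coefficients from an imagined final model.
coefs = {"_cons": -6.9, "educ": 1.85, "female": -2.0}

# Two hypothetical archetypes.
archetypes = {
    "woman, 12 years of education": {"educ": 12, "female": 1},
    "man, 16 years of education": {"educ": 16, "female": 0},
}

for label, case in archetypes.items():
    print(f"{label}: {predict(coefs, case):.1f}")
```

Each archetype should be a combination of values that actually occurs in the data; for example, the first case works out as -6.9 + 1.85*12 - 2.0*1 = 13.3.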

Residuals

1) Calculate the residuals from your final regression model. Plot those residuals on a histogram and copy and paste the histogram into your Word file. Do the residuals follow a normal distribution?
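
A residual is the observed outcome minus the fitted value. The Python sketch below computes residuals for a small invented dataset and adds a skewness figure as a rough numeric companion to the histogram; skewness is not requested by the assignment, and the helper names, data, and coefficients (-6.9 and 1.85) are hypothetical.

```python
def residuals(x, y, intercept, slope):
    """Observed minus fitted values for a one-predictor model."""
    return [b - (intercept + slope * a) for a, b in zip(x, y)]

def skewness(values):
    """Moment-based skewness; values near 0 suggest a symmetric distribution."""
    n = len(values)
    m = sum(values) / n
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical data and assumed least-squares coefficients.
educ = [10, 12, 14, 16, 18]
wage = [12, 15, 19, 22, 27]
resid = residuals(educ, wage, -6.9, 1.85)

print([round(e, 2) for e in resid], round(skewness(resid), 3))
```

Least-squares residuals from a model with an intercept always average zero, so the histogram is judged on its symmetry and tails rather than its center. Five observations are far too few to assess normality; the sample here only shows the calculation.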

Heteroscedasticity

1) Plot your residuals against your fitted values using a scatter plot (paste the scatter plot into your Word file). Do you see visual evidence of heteroscedasticity?

2) Test for heteroscedasticity using the Breusch-Pagan/Cook-Weisberg test. What does this test tell you? Make sure you paste the output of the test into your Word file.
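
The idea behind the test can be sketched as follows, assuming the classic score form: scale the squared residuals by their mean, regress them on the fitted values, and take half the explained sum of squares as a chi-square statistic with 1 degree of freedom. This Python version handles one predictor only, the helper names and data are hypothetical, and it is not guaranteed to reproduce Stata's output digit for digit.

```python
import math

def fit_line(x, y):
    """Least-squares intercept and slope for y on a single x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return my - slope * mx, slope

def breusch_pagan(x, y):
    """Return (chi-square statistic, p-value) for a one-predictor model."""
    n = len(x)
    b0, b1 = fit_line(x, y)
    fitted = [b0 + b1 * a for a in x]
    resid = [b - f for b, f in zip(y, fitted)]
    sigma2 = sum(e * e for e in resid) / n
    g = [e * e / sigma2 for e in resid]          # scaled squared residuals
    a0, a1 = fit_line(fitted, g)                 # auxiliary regression
    g_bar = sum(g) / n
    ess = sum((a0 + a1 * f - g_bar) ** 2 for f in fitted)
    stat = ess / 2
    p_value = math.erfc(math.sqrt(stat / 2))     # chi-square tail, 1 df
    return stat, p_value

# Hypothetical data: error size grows with x in y_het, stays constant in y_hom.
x = list(range(1, 11))
y_het = [xi + (0.2 * xi if xi % 2 == 0 else -0.2 * xi) for xi in x]
y_hom = [xi + (1 if xi % 2 == 0 else -1) for xi in x]

print("growing spread: ", breusch_pagan(x, y_het))
print("constant spread:", breusch_pagan(x, y_hom))
```

A small p-value rejects the null hypothesis of constant error variance. In the invented example the statistic is much larger when the spread grows with x than when it stays constant.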

Multicollinearity

1) Calculate VIFs for your model and paste the output into your Word file. Does your model have multicollinearity problems?
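
A variance inflation factor is 1 / (1 - R²), where R² comes from regressing one predictor on all the others. The Python sketch below covers only the two-predictor case, where that R² is the squared correlation between the two predictors; with three predictors a full auxiliary regression is needed. The `vif_two_predictors` helper and the data are hypothetical.

```python
def vif_two_predictors(x1, x2):
    """VIF shared by both predictors in a model with exactly two predictors."""
    n = len(x1)
    m1, m2 = sum(x1) / n, sum(x2) / n
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    r_squared = s12 ** 2 / (s11 * s22)
    return 1 / (1 - r_squared)

# Hypothetical, strongly correlated predictors.
x1 = [1, 2, 3, 4, 5]
x2 = [1, 1, 2, 2, 3]
print(round(vif_two_predictors(x1, x2), 2))
```

A common rule of thumb treats VIFs above about 10 as a warning sign, although the cutoff is a convention rather than a formal test. The invented predictors here land just under it.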

  • Category:- Statistics and Probability
  • Reference No.:- M91906722
