## Statistics

1. Using the dataset WAGE1.dta…

a) Regress wages on education and experience, but do so in a way that produces standardized betas. That is, you will have to standardize each variable by subtracting its mean and dividing by its standard deviation, i.e., x_standard = (x - x_bar) / x_std.
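The standardization step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `educ` and `wage` values are made-up toy numbers (not taken from WAGE1.dta), the helper names are hypothetical, and a single regressor is used so the slope can be computed by hand. It shows that regressing standardized y on standardized x gives the raw slope rescaled by sd(x)/sd(y).

```python
from statistics import mean, stdev

def standardize(values):
    """Return z-scores: (x - mean) / sample standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def ols_slope(x, y):
    """Slope from a one-regressor OLS fit of y on x (with intercept)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

# Hypothetical toy data, NOT taken from WAGE1.dta
educ = [8, 10, 12, 12, 14, 16, 18]
wage = [4.0, 5.5, 6.0, 7.5, 9.0, 12.0, 15.5]

raw_slope = ols_slope(educ, wage)
std_beta = ols_slope(standardize(educ), standardize(wage))

# The standardized beta equals the raw slope times sd(x) / sd(y)
rescaled = raw_slope * stdev(educ) / stdev(wage)
print(raw_slope, std_beta, rescaled)
```

A standardized beta is read as the change in the outcome, in standard deviations, for a one-standard-deviation change in the regressor.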

b) Interpret the coefficients on education and experience.

c) Now regress log wage (already defined as “lwage” in the dataset) on education, experience, nonwhite, female, and married. Then run the same regression but include a variable that is the interaction between female and education. Interpret the coefficient on the interaction term. Explain what happens to the standard error of the main education effect and why. Would you recommend keeping this interaction term in the equation? Why or why not?

d) Now drop the female X education interaction, but add another interaction, between female and married. What is the effect of being married for men? Now report the coefficient and standard error of the marriage effect for women. Is it statistically different from zero? (Hint: for the standard error of a linear combination of estimates, use “lincom”.)
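The marriage effect for women is the sum of two coefficients (married plus the female X married interaction), so its standard error depends on both variances and their covariance. The sketch below shows that arithmetic with made-up numbers; the function name and all inputs are hypothetical, and it illustrates the formula rather than reproducing Stata's “lincom” output.

```python
from math import sqrt

def lincom_sum(b1, b2, var1, var2, cov12):
    """Estimate, standard error, and t-ratio for the sum b1 + b2.

    Var(b1 + b2) = Var(b1) + Var(b2) + 2 * Cov(b1, b2)
    """
    estimate = b1 + b2
    se = sqrt(var1 + var2 + 2 * cov12)
    return estimate, se, estimate / se

# Hypothetical values: b1 = coefficient on married,
# b2 = coefficient on the female X married interaction
est, se, t = lincom_sum(b1=0.20, b2=-0.30,
                        var1=0.0016, var2=0.0036, cov12=-0.0014)
print(est, se, t)
```

Ignoring the covariance term would generally give the wrong standard error, which is why the two individual standard errors from the regression table are not enough on their own.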

e) Manually calculate the adjusted R-squared for the regression in (d). Note: take advantage of the Stata output. Type “help regress” and scroll to the bottom. There you will see that the model sum of squares (i.e., the explained sum of squares, SSE) can be referenced as e(mss).

However, you can only access these results right after the regression has been run and before performing any other calculations. So, run the regression in (d) again, though you can do so without printing the results to the screen using the syntax “quietly: regress y x”. Confirm that your adjusted R-squared is the same as the one produced by Stata.
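As a rough check of the arithmetic, the adjusted R-squared can be computed from the model and residual sums of squares, the sample size n, and the number of regressors k (excluding the constant). The sketch below uses made-up inputs and a hypothetical function name; in Stata the corresponding stored results after “regress” would be e(mss), e(rss), e(N), and e(df_m).

```python
def adjusted_r2(mss, rss, n, k):
    """Return (R-squared, adjusted R-squared).

    mss: model (explained) sum of squares
    rss: residual sum of squares
    n:   number of observations
    k:   number of regressors, excluding the constant
    """
    tss = mss + rss                      # total sum of squares
    r2 = mss / tss
    adj = 1 - (rss / (n - k - 1)) / (tss / (n - 1))
    return r2, adj

# Hypothetical inputs, not the WAGE1.dta results
r2, adj = adjusted_r2(mss=60.0, rss=40.0, n=101, k=5)
print(r2, adj)
```

The adjustment divides each sum of squares by its degrees of freedom, so adding a regressor raises the adjusted R-squared only if it lowers the residual sum of squares enough to offset the lost degree of freedom.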

f) Building on the regression in (d), add the occupation variables: profocc, clerocc, servocc. What happens to the estimated rate of return to education? Why does this happen? If our goal is to correctly identify the causal effect of education, would you recommend including or excluding these occupation variables? Explain why.

g) Manually calculate the adjusted R-squared for the model you just ran. Compare it with the result in (e). From an adjusted R-squared perspective, did the extra variables (profocc, clerocc, servocc) add much explanatory power to the model?

2. The dataset KIELMC.dta has information about houses sold in 1978 and 1981 in a Massachusetts town. In 1981 an incinerator was built.

a) Regress the log of the house’s sale price against the log of distance from the incinerator for houses sold in 1978 only. Run the same regression for houses sold in 1981 only. (Hint: use the clause “if year==1978” to select just that year for the regression.) Does the result for 1978 make sense, given that the incinerator wasn’t built yet? What could explain this result?

b) Add the following variables to the regression: log of square footage of lot (lland), log of square footage of house (larea), log of distance to interstate (linst), age of house, squared age of house, number of rooms, number of bathrooms. Run the regression again for 1978, and again for 1981. What are the estimated effects of distance from the incinerator in 1978 and 1981 now? Are they significant and do they make sense?

Explain why these results are so different from the ones in (a).

c) For the regression from part b), perform a Chow Test to determine whether the 1978 and the 1981 data have the same parameters. Given the results from b) what do you expect the outcome of this test to be? (Hint: quietly run a pooled regression, a 1978 regression, and a 1981 regression, saving the SSR each time. Remember that the sum of squared residuals (aka residual sum of squares) can be accessed after the regression by typing “scalar ssr = e(rss)”).
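The Chow statistic in the hint can be assembled from the three saved SSR values. The sketch below is a minimal illustration with made-up SSRs and a made-up sample size; the function name is hypothetical, and k = 8 is an assumption based on counting the slope regressors listed in part (b).

```python
def chow_f(ssr_pooled, ssr_1, ssr_2, n_total, k):
    """Chow F statistic for equal parameters across two groups.

    k is the number of slope regressors; the test imposes k + 1
    restrictions (all slopes plus the intercept).
    """
    ssr_unrestricted = ssr_1 + ssr_2
    q = k + 1                          # numerator degrees of freedom
    df = n_total - 2 * (k + 1)         # denominator degrees of freedom
    f = ((ssr_pooled - ssr_unrestricted) / q) / (ssr_unrestricted / df)
    return f, q, df

# Hypothetical SSRs and sample size, not the KIELMC.dta results
f, q, df = chow_f(ssr_pooled=50.0, ssr_1=20.0, ssr_2=22.0,
                  n_total=318, k=8)
print(f, q, df)
```

The resulting value would be compared with the critical value of an F distribution with (q, df) degrees of freedom.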

d) Again using the regression from b), for the data, at what age does a house reach its maximum or minimum value? Which is it, a max or a min, and how do you know? Does that make sense? Are there any houses older than this in the data?
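With age and age squared in the model, the fitted relationship is b_age * age + b_agesq * age^2, whose turning point is at age* = -b_age / (2 * b_agesq). The sign of b_agesq tells you whether it is a minimum (positive) or a maximum (negative). The sketch below uses made-up coefficients and a hypothetical function name.

```python
def quadratic_turning_point(b_age, b_agesq):
    """Turning point of b_age * age + b_agesq * age**2 and its type."""
    if b_agesq == 0:
        raise ValueError("no turning point when the squared term is zero")
    turning = -b_age / (2 * b_agesq)
    kind = "minimum" if b_agesq > 0 else "maximum"
    return turning, kind

# Hypothetical coefficients, not the KIELMC.dta estimates
age_star, kind = quadratic_turning_point(b_age=-0.008, b_agesq=0.00004)
print(age_star, kind)
```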

e) Rerun the 1981 equation but including the squared log of the distance to the interstate. What happens to the coefficient on distance to the incinerator? What does this tell you about the importance of functional form?

3. This exercise uses the dataset GPA1.dta.

a) Run a regression of colGPA against all of the following: hsGPA, ACT, PC, skipped, alcohol, greek, bgfriend.

b) Now run the “restricted” regression that you need to construct an F-test of the joint hypothesis that the three “social” variables (alcohol, greek, and bgfriend) do not matter. Construct that F-statistic. Are these social variables jointly significant at alpha=1%?

Explain how you know.
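Because the restricted and unrestricted regressions share the same dependent variable, the F-statistic can be written in terms of their R-squared values. The sketch below shows that form with made-up R-squared values and sample size; the function name is hypothetical, k = 7 counts the regressors in part (a), and q = 3 is the number of excluded social variables.

```python
def f_stat_from_r2(r2_unrestricted, r2_restricted, n, k, q):
    """F statistic for q exclusion restrictions, R-squared form.

    n: observations; k: regressors in the unrestricted model
    (excluding the constant); q: number of restrictions.
    """
    numerator = (r2_unrestricted - r2_restricted) / q
    denominator = (1 - r2_unrestricted) / (n - k - 1)
    return numerator / denominator

# Hypothetical inputs, not the GPA1.dta results
f = f_stat_from_r2(r2_unrestricted=0.30, r2_restricted=0.27,
                   n=141, k=7, q=3)
print(f)
```

The statistic would be compared with the 1% critical value of an F distribution with (q, n - k - 1) degrees of freedom.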

c) Confirm that you can get a similar result using Stata’s test command, which performs a Wald test, which is very similar to the F-test you just ran. The syntax is “test varlist”.

d) Now run a Linear Probability Model where bgfriend is the dependent variable and alcohol, greek, and colGPA are the independent variables. What is the interpretation of the coefficient on alcohol?

e) A critique of the LPM is that the predicted values might fall outside the 0-1 range (probabilities can only range from 0 to 1). Calculate the predicted values from (d). What percent of the predicted values are outside the 0-1 range? (Hint: an easy way to do this is to generate a 0/1 variable where 1 indicates that yhat was out of the 0-1 range. Then simply take the mean of the 0/1 variable.)
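The indicator-and-mean trick from the hint can be sketched as follows. The `yhat` values are made up for illustration and the function name is hypothetical; the point is only that the mean of a 0/1 flag equals the share of flagged observations.

```python
def share_out_of_range(predictions, low=0.0, high=1.0):
    """Share of predictions strictly below `low` or above `high`."""
    flags = [1 if (p < low or p > high) else 0 for p in predictions]
    return sum(flags) / len(flags)    # mean of the 0/1 indicator

# Hypothetical fitted values, not the GPA1.dta predictions
yhat = [-0.05, 0.10, 0.45, 0.80, 1.02, 0.33, 0.97, 1.20]
share = share_out_of_range(yhat)
print(f"{100 * share:.1f}% of predictions fall outside [0, 1]")
```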

f) We know that the Linear Probability Model suffers from heteroskedasticity. Test for heteroskedasticity using the special case of the White Test. Do you reject the null hypothesis of homoskedasticity at the 5% level? At the 10% level? (Hint: use the version of the F-statistic for testing the overall significance of a regression.)
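For reference, the special case of the White test is usually set up as an auxiliary regression of the squared residuals on the fitted values and their squares. The notation below is an assumption, not taken from the assignment: \hat{u} denotes the residuals and \hat{y} the fitted values from the LPM in (d), and R^2_{\hat{u}^2} is the R-squared of the auxiliary regression.

$$
\hat{u}^2 = \delta_0 + \delta_1 \hat{y} + \delta_2 \hat{y}^2 + v,
\qquad H_0 : \delta_1 = \delta_2 = 0
$$

$$
F = \frac{R^2_{\hat{u}^2} / 2}{\left(1 - R^2_{\hat{u}^2}\right) / (n - 3)}
\;\sim\; F_{2,\, n-3} \text{ under } H_0
$$

This is the overall-significance F-statistic mentioned in the hint, applied to a regression with two regressors.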

g) Re-run the LPM, but this time tell Stata to calculate robust standard errors, i.e., specify the vce(robust) option when running the regression. How do these standard errors compare to those in (d)? How about the coefficient estimates?
