
Be sure to read each question carefully and answer all parts. For all Stata questions, be sure to provide the log output (which should be edited and commented so that it is easier to grade). 

Q1. Find an article THAT YOU ARE INTERESTED IN (or a book chapter, etc.) from another class, the news, a research journal, a policy brief, or elsewhere that contains statistics in the broad sense: it could include anything we have learned in class, or something "statistical" outside the scope of 631.

a. Print it out/ copy it and attach it.

b. Do one of the following:

1. Talk about something in this article that this class has made understandable for you. Explain what you are referring to, what part of the class clarified it for you, and what you think it means.

2. Talk about something in this article that you do not understand. Explain what you are referring to and what you do not understand about it. Ask questions that you think might help clarify it.

Q3. Use a graphical analysis (which often looks more impressive than statistical analysis alone) to support your answer to the following question. You are concerned about staff wasting time over lunch. Government employees in your branch are allowed a 60-minute lunch break. For a week you monitor (without their knowledge) how long employees spend at lunch. You consider a lunch break longer than 65 minutes unacceptable and a waste of taxpayer resources. Do your employees have a problem with long lunch breaks? (Note: you may want to enter your data twice and compare the two entries to make sure you have no data entry errors. Also be sure to explain how your graphical analysis supports your answer.) What statistics could you run to support your graphical analysis? Run them.

Minutes Spent at Lunch

64           93           66           68

60           65           63           85

78           86           73           77

69           63           64           87

93           61           80           65

82           75           60           64

62           84           75           63

63           61           70           67

76           73           72           91

80           70           89           82
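The kind of summary that could back up a graph for Q3 can be sketched as follows. This is only an illustration in Python using the standard library (the assignment itself asks for Stata output); the bin width of 5 minutes and the choice of a one-sample t statistic against 65 minutes are assumptions, not part of the question.

```python
import math
import statistics
from collections import Counter

# Minutes spent at lunch, transcribed row by row from the table above.
minutes = [
    64, 93, 66, 68,
    60, 65, 63, 85,
    78, 86, 73, 77,
    69, 63, 64, 87,
    93, 61, 80, 65,
    82, 75, 60, 64,
    62, 84, 75, 63,
    63, 61, 70, 67,
    76, 73, 72, 91,
    80, 70, 89, 82,
]

LIMIT = 65  # from the question: longer than 65 minutes is unacceptable

n = len(minutes)
mean = statistics.mean(minutes)
sd = statistics.stdev(minutes)  # sample standard deviation (divides by n - 1)
over_limit = sum(m > LIMIT for m in minutes)
share_over = over_limit / n

# One-sample t statistic for H0: mean = 65 against H1: mean > 65 (assumed test).
t_stat = (mean - LIMIT) / (sd / math.sqrt(n))

# Crude text histogram with 5-minute bins (assumed bin width).
bins = Counter((m // 5) * 5 for m in minutes)
for start in sorted(bins):
    print(f"{start:3d}-{start + 4:<3d} {'*' * bins[start]}")

print(f"n = {n}, mean = {mean:.3f}, sd = {sd:.3f}")
print(f"breaks over {LIMIT} minutes: {over_limit} of {n} ({share_over:.0%})")
print(f"t statistic against {LIMIT} minutes: {t_stat:.3f}")
```

A histogram or box plot with a reference line at 65 minutes would carry the same information graphically; the numbers printed here are the statistics one might report alongside it.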

***Copy and paste Stata results for each part of question 4.

Q4. Download the GSS dataset from e-campus if you do not already have it. This is a shortened version of the full dataset, which was downloaded from the NORC website. Documentation can be found here: http://gss.norc.org/Get-Documentation. (Use Stata.)

a. Exploring

i. Explore the dataset with Stata using the tools you have learned.  As with all questions, show your output.  (Note that if your output is too lengthy, before printing your solutions you may clip out the middle part with a note that you have done so.)  

ii. For what years is this dataset available in the shortened version I uploaded?

b. Is this dataset longitudinal, panel, cross-section, repeated cross-section, case study, some combination (if so, of what), or some other form of dataset?

c. Cut the dataset so you only have data for the year 2012 left.

d. Look carefully through the variables and variable descriptions. You have been asked to compare various statistics by gender and age.

i. Pick an outcome that you are interested in from the list of variables available.  Make sure that it is not missing from the dataset.

ii. Cut the dataset down so that the only variables remaining are for year, sex, age, and your variable(s) of interest.

iii. Save this smaller set as a new dataset.

iv. Explore this new dataset. 

e. How is your outcome of choice coded?

i. Is it coded in a way that will make sense for analysis? 

ii. If it is coded in a way that will make sense for analysis, say N/A for this part.  If it is not:

1. Can it be recoded in a way that will make sense for analysis (hint: you may want to turn a categorical variable into a binary variable or a continuous variable)?

2. Do so if it can (otherwise you may want to choose another outcome, and move back to part d). 

3. Explain any assumptions you made when you changed this variable (or made a new variable from the old one).

f. Test to see if males and females act differently with respect to your variable. Be sure to include your hypothesis tests and whether or not your results are significant.  In your opinion, is the magnitude of the difference big?

g. Create a variable for older.  Explain how you define "older" vs. "younger," and any other assumptions that you make.  [Hint:  be EXTRA careful with missing values]

h. Are your results different for gender (from f) if you look only at older people?  Be sure to include your hypothesis tests and whether or not your results are significant.

i. Reopen the original dataset.

i. If you have not already, create a .do file that walks through steps d through h (so that it does not cut the original dataset by year).

ii. Cut the dataset so the year is 2000.  If your outcome variable in your .do file does not exist for the year you have picked, choose another outcome that does and modify your .do file accordingly.

iii. Run your .do file on the dataset for the year 2000.  [As always, show your output.]

iv. Are your results on the hypothesis tests from 4f and 4h different for the year 2000 compared to 2012?  If so, why might they be different?
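The logic of steps c through h can be sketched outside Stata as follows. This is a rough Python illustration on HYPOTHETICAL toy rows, not the GSS data; the variable name `outcome`, the extra column `region`, the age cutoff of 50, and the use of a Welch (unequal-variance) t statistic are all assumptions made for the example. The point of `older_flag` is the hint in part g: in Stata, numeric missing values compare as larger than any number, so a bare comparison such as `age >= 50` would classify respondents with missing age as older unless they are excluded explicitly.

```python
import math
import statistics

# HYPOTHETICAL toy rows standing in for the GSS extract; None marks a missing value.
rows = [
    {"year": 2012, "sex": "male",   "age": 34,   "outcome": 1, "region": 1},
    {"year": 2012, "sex": "male",   "age": 61,   "outcome": 0, "region": 2},
    {"year": 2012, "sex": "male",   "age": None, "outcome": 1, "region": 3},
    {"year": 2012, "sex": "male",   "age": 72,   "outcome": 1, "region": 1},
    {"year": 2012, "sex": "female", "age": 29,   "outcome": 0, "region": 2},
    {"year": 2012, "sex": "female", "age": 55,   "outcome": 0, "region": 3},
    {"year": 2012, "sex": "female", "age": 67,   "outcome": 1, "region": 1},
    {"year": 2012, "sex": "female", "age": 48,   "outcome": 0, "region": 2},
    {"year": 2000, "sex": "male",   "age": 40,   "outcome": 1, "region": 3},
    {"year": 2000, "sex": "female", "age": 58,   "outcome": 0, "region": 1},
]

AGE_CUTOFF = 50  # assumed definition of "older"
KEEP = ("year", "sex", "age", "outcome")

# Step c: keep a single survey year.
one_year = [r for r in rows if r["year"] == 2012]

# Step d.ii: keep only year, sex, age, and the outcome of interest.
small = [{k: r[k] for k in KEEP} for r in one_year]


def older_flag(age):
    """Return True/False for known ages and None when age is missing."""
    if age is None:
        return None  # do not silently classify missing ages as older
    return age >= AGE_CUTOFF


# Step g: create the "older" variable, preserving missingness.
for r in small:
    r["older"] = older_flag(r["age"])


def welch_t(a, b):
    """Two-sample t statistic allowing unequal variances."""
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / se


def outcomes(records, sex):
    return [r["outcome"] for r in records if r["sex"] == sex and r["outcome"] is not None]


# Step f: compare males and females on the outcome.
t_all = welch_t(outcomes(small, "male"), outcomes(small, "female"))
print(f"t statistic, males vs females: {t_all:.4f}")

# Step h would repeat the comparison on the older subsample only.
older_only = [r for r in small if r["older"] is True]
print(f"older respondents with known age: {len(older_only)}")
```

Wrapping the steps in functions mirrors the purpose of the .do file in part i: the same code can be rerun on a different year by changing one value.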

***Show Stata output for question 5.

Q5. Using the GSS

a. Pick two variables of your choice.

b. Make sure they are in an appropriate format for a scattergram.

c. Create a scattergram using Stata.

d. Repeat the scattergram using the jitter option. 

e. Repeat again with the sunflower option. 

f. How are these plots different?
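Why the three plots in Q5 look different can be illustrated with a small sketch. The Python below uses HYPOTHETICAL coded responses and is not Stata output; the jitter width and seed are arbitrary assumptions. With categorical survey codes many observations share the same coordinates, so a plain scatter shows only a few dots; jitter adds small random noise so overlapping points separate, while a sunflower-style plot instead conveys how many observations sit at each location.

```python
import random
from collections import Counter

# HYPOTHETICAL coded survey responses: many observations share the same (x, y).
points = [(1, 2)] * 5 + [(2, 2)] * 3 + [(3, 1)] * 4


def jitter(pts, width=0.3, seed=0):
    """Add uniform noise in [-width, width] to each coordinate."""
    rng = random.Random(seed)
    return [
        (x + rng.uniform(-width, width), y + rng.uniform(-width, width))
        for x, y in pts
    ]


jittered = jitter(points)

# A plain scatter can only show the distinct locations.
print("distinct locations, plain scatter:", len(set(points)))

# After jittering, overlapping observations are drawn at slightly different spots.
print("distinct locations, jittered:", len(set(jittered)))

# A density-style display summarizes the count at each location instead.
density = Counter(points)
for location, count in sorted(density.items()):
    print(location, "observations:", count)
```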

Q6. You have been given a large budget, have successfully bribed your local IRB, and have access to a local prison population. Just kidding!  You actually have access to hundreds of psychology undergraduate students.  You have been asked to evaluate whether or not listening to classical music while studying for Psych 101 benefits students' midterm test scores.  [Note that you must answer the question in the context of the problem; it is not sufficient to just copy from your class notes.]

a. What is the best way to answer this question? 

b. Using the procedure you have learned in class, formally guide your (not-bribed) IRB committee through the steps you would take to answer this question, including the pros and cons of each step if there are any.

c. What kind of statistical analysis would you use at the end? 
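If the design chosen for Q6 involves random assignment, the assignment step itself could be sketched as below. This is a minimal Python illustration under assumed details: 200 students, two equal-sized groups labelled `music` and `silence`, and a fixed seed so the assignment can be reproduced and audited. None of these specifics come from the question.

```python
import random


def assign_groups(student_ids, seed=None):
    """Randomly split students into two groups of (nearly) equal size."""
    rng = random.Random(seed)
    shuffled = list(student_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"music": shuffled[:half], "silence": shuffled[half:]}


# HYPOTHETICAL roster of 200 student ID numbers.
students = list(range(1, 201))
groups = assign_groups(students, seed=631)

print(len(groups["music"]), "students study with classical music")
print(len(groups["silence"]), "students study without it")
```

Because chance alone decides who lands in each group, differences in midterm scores between the groups are less likely to reflect pre-existing differences between students.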

Q7. You want to know the relationship between the number of US troops per citizen in an occupied foreign city and the civilian death rate in those cities.  Your statistical team comes back with the following information:  "Using data from all US occupied foreign cities in the past 10 years, we ran the following regression:  Y = B X + alpha, where Y is the civilian death rate (measured from 0 to 100) and X is the ratio of US troops to citizens times 100 (also measured from 0 to 100).  We found that B = 19 and the standard error on B is 5.1.  Alpha is .3.  The R^2 of this regression is .35."

a. What is the t-statistic for X?  Is B significant? (Use Stata if necessary.)

b. What does B mean in this case?

c. Your newly hired analyst points out, "Your R^2 is only .35.  Therefore we should ignore this regression because the fit is really low."  Is (s)he right?  What should you explain to him/her?

d. What are possible omitted variables?

e. After you have explained part c to your analyst, (s)he recommends that, based on this regression, you remove all US troops from all occupied cities.  Why does (s)he recommend this?  Should you take his/her advice?  Why or why not?

f. How might you better answer this question?  (Note:  the actual numbers on the Iraq war alone from the Brookings Institution show a small negative sign on B.)
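The arithmetic behind parts a and b of Q7 can be sketched as follows. The values of B, its standard error, and alpha come from the question; the 1.96 cutoff is an assumption (the large-sample two-sided 5% critical value), since the question does not give the number of cities.

```python
# Values reported by the statistical team in Q7.
b = 19.0      # slope on X (troops per citizen times 100)
se_b = 5.1    # standard error of the slope
alpha = 0.3   # intercept

# t statistic for H0: B = 0.
t_stat = b / se_b

# ASSUMPTION: large-sample two-sided 5% critical value; the sample size is not given.
CRITICAL_VALUE = 1.96
significant = abs(t_stat) > CRITICAL_VALUE


def predicted_death_rate(x):
    """Fitted civilian death rate for a given value of X."""
    return alpha + b * x


# Change in the fitted death rate when X rises by one unit.
effect_of_one_unit = predicted_death_rate(1) - predicted_death_rate(0)

print(f"t statistic: {t_stat:.3f}")
print(f"significant at the 5% level (normal approximation): {significant}")
print(f"fitted change in Y per one-unit change in X: {effect_of_one_unit:.1f}")
```

A significant slope describes an association in these data; it says nothing by itself about which way causation runs, which is the issue parts d through f ask about.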

Q8. You are working for the department of public health.  Your supervisor has recently had a bad experience involving NoDoz and is sure that caffeine is evil.  (S)he tells you to find all the literature you can on the evils of caffeine so that your office can make a public health announcement against the stuff.

a. Should you ignore the literature on the potential benefits of caffeine?  Why or why not?  (Give both moral and immoral answers.)

b. You find a medical science article on heart attacks and caffeine intake.  It runs the regression Y = BX + alpha on a group of 70,000 men aged 45 to 65 over a period of 5 years, where X is the number of cups of coffee a person drinks each day and Y is the number of heart attacks the man has had during those 5 years.

i. B = .0000002, and the reported t-statistic is 47.  What does this mean in words?  Is this relationship significant?  Is this an important relationship? 

ii. What are possible omitted variables?

c. You find another medical science article on the effect of caffeine on fetal growth.  It follows 10 mothers throughout their pregnancies and finds the following:  Y = BX + alpha, where X = the number of cups of coffee beyond two that the mother drinks each day and Y = birth weight in pounds.

i. B = -2, t = .9.  What does this mean in words?  Is the relationship significant?  Is it important?

ii. What are possible omitted variables?

iii. Does this regression justify looking for other articles on this topic?  Why or why not?  Would your answer be the same if n = 10,000?
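The contrast Q8 draws between statistical significance and practical importance can be made concrete with a short calculation. The coefficients, t-statistics, and sample sizes below come from the question; the 2.306 critical value is the standard two-sided 5% value for 8 degrees of freedom (n - 2 with 10 mothers), and treating the reported t = .9 as a magnitude is an assumption, since B is negative.

```python
# Part b: heart attacks and coffee, 70,000 men.
b_heart = 2e-7
t_heart = 47
n_men = 70_000

se_heart = b_heart / t_heart
# Fitted extra heart attacks over 5 years, across all 70,000 men,
# if every man drank one more cup per day.
extra_attacks = b_heart * n_men

# Part c: birth weight and coffee, 10 mothers.
b_birth = -2.0
t_birth = 0.9  # ASSUMPTION: reported as a magnitude; the signed value would be -0.9
n_mothers = 10

se_birth = abs(b_birth) / t_birth
T_CRIT_DF8 = 2.306  # two-sided 5% critical value, 8 degrees of freedom
ci_low = b_birth - T_CRIT_DF8 * se_birth
ci_high = b_birth + T_CRIT_DF8 * se_birth

print(f"heart study: t = {t_heart}, extra attacks across the sample = {extra_attacks:.3f}")
print(f"birth study: implied SE = {se_birth:.3f}")
print(f"birth study: 95% interval for B = ({ci_low:.2f}, {ci_high:.2f}) pounds per cup")
```

The first estimate is very precisely measured but tiny in magnitude; the second is large in magnitude but so imprecise that the interval includes both zero and effects far bigger than 2 pounds.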

***See attached file for question nine.

Q9. Use the Stata command ttesti to answer the MB&B t-test problem of your choice from chapters 11, 12, or 14 (make sure it is a problem that can be solved using ttesti, not ttest or sampsi).  State your hypotheses, provide the log file output, and make it clear which question you are addressing.  (Yes, you may use a problem that you have already solved by hand and/or that has a solution in the back of the book.)

Q10.  Give (real or hypothetical) examples of the following.  Do not use the examples that I gave you in your class notes or examples from Wikipedia:

a. A situation where someone is led astray by misuse of the Representativeness Heuristic.

b. A situation where someone is led astray by misuse of the Availability Heuristic.

c. A situation where Framing could change someone's answer to a survey question.

Attachment: Assignment.rar
