
Question 1 -

The following table lists some variables that might be of interest in your next data analysis. For each variable, complete the associated table indicating whether it is categorical (and if so, is it nominal or ordinal) or numerical (and if so, is it discrete or continuous).

| Item | Variable | Categorical: nominal | Categorical: ordinal | Numerical: discrete | Numerical: continuous |
|---|---|---|---|---|---|
| Example | Eye color | X | | | |
| 1a | Sex | | | | |
| 1b | Number of runs scored in a baseball game | | | | |
| 1c | Profession | | | | |
| 1d | Temperature, measured in Fahrenheit | | | | |
| 1e | Confidence in one's ability to do statistics as measured by "yes/no" to the statement: "I will do well" | | | | |
| 1f | Number of siblings | | | | |
| 1g | Distance an individual can run in five minutes | | | | |
| 1h | Ethnicity | | | | |
| 1i | Number of MDs who also have a PhD | | | | |
| 1j | Lack of coordination as measured by time it takes an individual to complete a certain puzzle | | | | |

Question 2 -

Here is a hypothetical situation. In 2015 a program aimed at reducing infant mortality was implemented in two regions, Pepi and Quepi. The following table (again, hypothetical) shows the numbers of births and infant deaths in each region in each of two years, 2014 and 2016.

| Year | Pepi: Births | Pepi: Infant Deaths | Quepi: Births | Quepi: Infant Deaths |
|---|---|---|---|---|
| 2014 | 100,000 | 300 | 1,000,000 | 5,000 |
| 2016 | 100,000 | 60 | 1,000,000 | 4,000 |

2a. In which region is there more convincing evidence that the reduction in mortality was caused by the program?

2b. If the program can be continued in one region ONLY, which would you choose? In developing your answer, you may assume that the reductions shown were in fact caused by the program.
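As a rough aid for 2a and 2b, the sketch below turns the counts in the table into rates per 1,000 births, a relative reduction, and an absolute drop in deaths. The names `rate_per_1000` and `summarize` are illustrative only, and the code does not decide which comparison matters more; that judgment is the point of the question.

```python
# Counts copied from the table above (hypothetical data): (births, infant deaths).
data = {
    "Pepi": {2014: (100_000, 300), 2016: (100_000, 60)},
    "Quepi": {2014: (1_000_000, 5_000), 2016: (1_000_000, 4_000)},
}


def rate_per_1000(births, deaths):
    """Infant deaths per 1,000 births."""
    return 1000 * deaths / births


def summarize(region):
    """Compare 2014 with 2016 for one region."""
    (b0, d0), (b1, d1) = data[region][2014], data[region][2016]
    r0, r1 = rate_per_1000(b0, d0), rate_per_1000(b1, d1)
    return {
        "rate_2014": r0,
        "rate_2016": r1,
        "relative_reduction": (r0 - r1) / r0,  # fraction of the 2014 rate
        "fewer_deaths": d0 - d1,               # absolute drop in deaths
    }


for region in data:
    print(region, summarize(region))
```

The relative reduction speaks to how large the change is compared with the starting rate, while the absolute drop speaks to how many deaths are involved.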

Question 3 -

The following are some data on some famous statisticians. Yes! Florence Nightingale, among her other talents, was a statistician!

| Statistician | Gender | Year of Birth | Year of Death |
|---|---|---|---|
| Sir Francis Galton | 2 | 1822 | 1911 |
| Karl Pearson | 2 | 1857 | 1936 |
| William Sealy Gosset | 2 | 1876 | 1937 |
| Ronald Aylmer Fisher | 2 | 1890 | 1962 |
| Harald Cramer | 2 | 1893 | 1985 |
| Prasanta Mahalanobis | 2 | 1893 | 1972 |
| Jerzy Neyman | 2 | 1894 | 1981 |
| Egon S. Pearson | 2 | 1895 | 1980 |
| Gertrude Cox | 1 | 1900 | 1978 |
| Samuel S Wilks | 2 | 1906 | 1964 |
| Florence Nightingale | 1 | 1909 | 1995 |
| David John Tukey | 2 | 1915 | 2000 |

3a. By any means you like (by hand is just fine), create a stem-and-leaf summary of the data on the variable YEAR OF BIRTH. Display it here. Then use this visual summary to answer questions #3b - #3e below.

3b. Are there any outliers (i.e., extreme values) in this distribution? Explain.

3c. How would you describe the shape of this distribution? Explain.

3d. What is/are the most frequently occurring score(s) in this distribution? How many times does it/do they occur?

3e. Can we use this stem-and-leaf to obtain the original set of values for this variable? Explain.
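One possible way to build the display for 3a is sketched below. It assumes the decade (year divided by 10) as the stem and the final digit as the leaf; other stem choices are equally acceptable, and `stem_and_leaf` is a hypothetical helper name, not part of the assignment.

```python
from collections import defaultdict

# Year of Birth column from the table above.
birth_years = [1822, 1857, 1876, 1890, 1893, 1893,
               1894, 1895, 1900, 1906, 1909, 1915]


def stem_and_leaf(values):
    """Map each stem (value // 10) to its sorted leaves (value % 10)."""
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // 10].append(v % 10)
    # Keep empty stems so that gaps in the data remain visible.
    return {s: stems.get(s, []) for s in range(min(stems), max(stems) + 1)}


for stem, leaves in stem_and_leaf(birth_years).items():
    print(f"{stem} | {''.join(str(leaf) for leaf in leaves)}")
```

Printing the empty stems is a deliberate choice: without them, an isolated early value would sit directly next to the main cluster and the gap would be hidden.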

Question 4 -

4a. When a distribution is skewed to the right

i) TRUE or FALSE: The median is greater than the mean.

ii) TRUE or FALSE: The distribution is uni-modal

iii) TRUE or FALSE: The majority of observations are less than the mean.

4b. The shape of a frequency distribution can be described using:

i) TRUE or FALSE: A box and whisker plot.

ii) TRUE or FALSE: A table of frequencies

iii) TRUE or FALSE: A histogram

4c. For the sample 3, 1, 7, 2 and 2:

i) TRUE or FALSE: The sample mean is 3

ii) TRUE or FALSE: The sample median is 7

iii) TRUE or FALSE: The range is 1

iv) TRUE or FALSE: The sample variance is 5.5
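The four statements in 4c can be checked numerically. The sketch below uses Python's standard `statistics` module and assumes "sample variance" means the version with divisor n - 1.

```python
import statistics

sample = [3, 1, 7, 2, 2]

mean = statistics.mean(sample)
median = statistics.median(sample)        # middle value of the sorted sample
value_range = max(sample) - min(sample)   # largest minus smallest
variance = statistics.variance(sample)    # sample variance, divisor n - 1

print("mean:", mean)
print("median:", median)
print("range:", value_range)
print("sample variance:", variance)
```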

Question 5 -

The following table shows the numbers of geriatric admissions, each week from May through September, to a certain facility in each of two years, 2012 and 2013.

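| Week | # Admissions 2012 | # Admissions 2013 |
|---|---|---|
| 1 | 24 | 20 |
| 2 | 22 | 17 |
| 3 | 21 | 21 |
| 4 | 22 | 17 |
| 5 | 24 | 22 |
| 6 | 15 | 23 |
| 7 | 23 | 20 |
| 8 | 21 | 16 |
| 9 | 18 | 24 |
| 10 | 21 | 21 |
| 11 | 17 | 20 |
| 12 | 11 | 25 |
| 13 | 6 | 22 |
| 14 | 10 | 26 |
| 15 | 13 | 12 |
| 16 | 19 | 33 |
| 17 | 13 | 19 |
| 18 | 17 | 21 |
| 19 | 10 | 28 |
| 20 | 16 | 19 |
| 21 | 24 | 13 |
| 22 | 15 | 29 |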

5a. By any means you like (by hand is just fine), summarize these data graphically. Display it here. Then use this visual summary to answer question #5b.

5b. Why do you think these two years were different? Note - There is no single correct answer here. I will accept any well-reasoned interpretation. I'm looking for you to think about what you see!
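For 5a, any graphical summary will do. As one minimal sketch that needs no plotting library, the code below draws a text bar for each week and year; `text_plot` is a hypothetical helper, and a time-series line plot or side-by-side box plots would serve just as well.

```python
# Weekly admissions, weeks 1 to 22, copied from the table above.
adm_2012 = [24, 22, 21, 22, 24, 15, 23, 21, 18, 21, 17,
            11, 6, 10, 13, 19, 13, 17, 10, 16, 24, 15]
adm_2013 = [20, 17, 21, 17, 22, 23, 20, 16, 24, 21, 20,
            25, 22, 26, 12, 33, 19, 21, 28, 19, 13, 29]


def text_plot(first, second):
    """Return two text bars per week: '#' for 2012 and '*' for 2013."""
    lines = []
    for week, (a, b) in enumerate(zip(first, second), start=1):
        lines.append(f"week {week:2d}  2012 |{'#' * a}")
        lines.append(f"         2013 |{'*' * b}")
    return lines


print("\n".join(text_plot(adm_2012, adm_2013)))
```

Reading down the bars week by week makes it easy to see where the two years track each other and where they diverge.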

Question 6 -

6a. You read that the median income of U.S. households in 2010 was $49,455. In 1-2 sentences at most, explain in plain language what "the median income" is.

6b. The Census Bureau website gives several choices for "average income" in its historical income data. In 2010, the median income of American households was $49,455. The mean household income was $67,530. The median income of families was $60,395, and the mean family income was $78,361. The Census Bureau says, "Households consist of all people who occupy a housing unit. The term 'family' refers to a group of two or more people related by birth, marriage, or adoption who reside together." In at most 5 sentences, explain carefully why mean incomes are higher than median incomes and why family incomes are higher than household incomes.

6c. A January 2012 magazine article reported that the average income for readers of the business magazine Forbes was $217,000. In your opinion, is the median wealth of these readers greater or less than $217,000? In at most 1-2 sentences, explain your reasoning.

6d. The distribution of individual incomes in the United States is strongly skewed to the right. In 2008, the mean and median incomes of the top 1% of Americans were $558,726 and $1,137,680. Which of these numbers is the mean and which is the median? In at most 1-2 sentences, explain your reasoning.

6e. By any means you like (by hand is fine) which of the following two data sets is more spread out? Show your work. In at most 1-2 sentences, explain your reasoning.

Data set "A": 4  0  1  4  3  6

Data set "B": 5  3  1  3  4  2
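One way to show the work for 6e is to compute a measure of spread for each data set. The sketch below assumes the range and the sample standard deviation (divisor n - 1) as the measures; the question leaves the choice open, and `spread` is an illustrative name.

```python
import statistics

data_a = [4, 0, 1, 4, 3, 6]
data_b = [5, 3, 1, 3, 4, 2]


def spread(values):
    """Two common measures of spread for a small sample."""
    return {
        "range": max(values) - min(values),
        "sample_sd": statistics.stdev(values),  # divisor n - 1
    }


for name, values in (("A", data_a), ("B", data_b)):
    print(name, "mean:", statistics.mean(values), spread(values))
```

Printing the mean alongside the spread shows whether the two sets are centered in the same place, so that any difference seen is in spread alone.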

Question 7 -

A box plot is the graph of a five number summary. The central box spans the quartiles. The line in the box marks the median. The size of the box is a measure of spread. The lines extending out from the box give an indication of extremes, if any. Side-by-side box plots are useful for comparing two distributions. As an example, consider the following table. It lists the average monthly temperature (Fahrenheit) of Springfield, Massachusetts and San Francisco, California.

| Month | Ave Temp (F) Springfield | Ave Temp (F) San Francisco |
|---|---|---|
| January | 32 | 49 |
| February | 36 | 52 |
| March | 45 | 53 |
| April | 56 | 55 |
| May | 65 | 58 |
| June | 73 | 61 |
| July | 78 | 62 |
| August | 77 | 63 |
| September | 70 | 64 |
| October | 58 | 61 |
| November | 45 | 55 |
| December | 36 | 49 |

7a. Obtain the five number summary for the average monthly temperatures, separately for each data set, Springfield versus San Francisco. Use these values to complete the following table.


| | Springfield | San Francisco |
|---|---|---|
| Minimum | | |
| Q1 | | |
| Q2 = median | | |
| Q3 | | |
| Maximum | | |

7b. By any means you like (by hand is fine), produce a side-by-side box and whisker plot of the two distributions of average monthly temperatures. You will use this visual to answer question #7c.

7c. i) Are the 2 cities similar in their typical (median) average temp?

ii) Are the 2 cities similar in terms of temperature spread? Explain

iii) Which city requires owning a larger wardrobe of clothes?
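A five number summary for 7a can be sketched as below. Quartile conventions differ between textbooks and software; this version assumes Q1 and Q3 are the medians of the lower and upper halves of the sorted data, so another convention may give slightly different quartiles. `five_number_summary` is a hypothetical helper name.

```python
import statistics

# Monthly averages, January to December, from the table above.
springfield = [32, 36, 45, 56, 65, 73, 78, 77, 70, 58, 45, 36]
san_francisco = [49, 52, 53, 55, 58, 61, 62, 63, 64, 61, 55, 49]


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using medians of the two halves."""
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]    # values below the median position
    upper = ordered[-half:]   # values above the median position
    return (
        ordered[0],
        statistics.median(lower),
        statistics.median(ordered),
        statistics.median(upper),
        ordered[-1],
    )


for city, temps in (("Springfield", springfield), ("San Francisco", san_francisco)):
    print(city, five_number_summary(temps))
```

The distance between Q1 and Q3 (the box length) is the quantity to compare when answering 7c ii).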

Question 8 -

This last exercise gives you practice working with the fundamentals of calculations of the sample mean, the sample variance and the sample standard deviation. It also gives you practice producing and interpreting a histogram.

The table below (following part 8c) gives data on X = blood glucose levels (mmol/L) obtained from a simple random sample of n = 40 first year medical students. The students are indexed using a subscript "i" that ranges from i = 1 to i = 40.

8a. First calculate the sample mean. To do this, obtain the sum of the individual blood glucose values and divide this by the sample size.

i) $\sum_{i=1}^{40} x_i =$

ii) $n =$

iii) Sample mean $= \bar{x} = \sum_{i=1}^{40} x_i / n =$ fill in / fill in $=$

8b. Next, calculate the individual squared values of individual blood glucose levels. In developing your answer complete the entries to the 3rd column of the table. All done? Now obtain the sum of the squared values of the individual blood glucose levels. Enter this total at the bottom.

8c. Next, calculate the individual squared values of the deviations of the individual blood glucose levels about the sample mean. In developing your answer complete the entries to the 4th and 5th columns of the table. All done? Now obtain the sum of the individual squared values of the deviations of the individual blood glucose values about the sample mean. Enter this total at the bottom of the 5th column.

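| $i$ | $x_i$ | $x_i^2$ | $(x_i - \bar{x})$ | $(x_i - \bar{x})^2$ |
|---|---|---|---|---|
| 1 | 4.7 | | | |
| 2 | 4.2 | | | |
| 3 | 3.9 | | | |
| 4 | 3.4 | | | |
| 5 | 3.6 | | | |
| 6 | 4.1 | | | |
| 7 | 4.8 | | | |
| 8 | 4.0 | | | |
| 9 | 3.8 | | | |
| 10 | 4.4 | | | |
| 11 | 3.3 | | | |
| 12 | 3.8 | | | |
| 13 | 2.2 | | | |
| 14 | 5.0 | | | |
| 15 | 3.3 | | | |
| 16 | 4.1 | | | |
| 17 | 4.7 | | | |
| 18 | 3.7 | | | |
| 19 | 3.6 | | | |
| 20 | 3.8 | | | |
| 21 | 4.1 | | | |
| 22 | 3.6 | | | |
| 23 | 4.6 | | | |
| 24 | 4.4 | | | |
| 25 | 3.6 | | | |
| 26 | 2.9 | | | |
| 27 | 3.4 | | | |
| 28 | 4.9 | | | |
| 29 | 4.0 | | | |
| 30 | 3.7 | | | |
| 31 | 4.5 | | | |
| 32 | 4.9 | | | |
| 33 | 4.4 | | | |
| 34 | 4.7 | | | |
| 35 | 3.3 | | | |
| 36 | 4.3 | | | |
| 37 | 5.1 | | | |
| 38 | 3.4 | | | |
| 39 | 4.0 | | | |
| 40 | 6.0 | | | |
| Total of column | | | | |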

8d. Calculate the sample variance using the appropriate column totals in TWO ways. Show your work. Tip - You should get the same answer both ways. This illustrates a shortcut for calculations done by hand, and it should clear up any confusion from having seen more than one formula for the sample variance.

i) $s^2 = \sum_{i=1}^{40} (x_i - \bar{x})^2 / (n-1)$

ii) $s^2 = \left[ \sum_{i=1}^{40} x_i^2 - n\bar{x}^2 \right] / (n-1)$

8e. Finally, calculate the sample standard deviation.
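The hand calculations in 8a through 8e can be cross-checked with a short script. The sketch below computes the column totals, then the sample variance by both the definitional formula and the shortcut formula so the two can be compared; variable names such as `var_definition` and `var_shortcut` are illustrative only.

```python
import math

# Blood glucose values (mmol/L), i = 1..40, copied from the table above.
glucose = [
    4.7, 4.2, 3.9, 3.4, 3.6, 4.1, 4.8, 4.0, 3.8, 4.4,
    3.3, 3.8, 2.2, 5.0, 3.3, 4.1, 4.7, 3.7, 3.6, 3.8,
    4.1, 3.6, 4.6, 4.4, 3.6, 2.9, 3.4, 4.9, 4.0, 3.7,
    4.5, 4.9, 4.4, 4.7, 3.3, 4.3, 5.1, 3.4, 4.0, 6.0,
]

n = len(glucose)
total = sum(glucose)                                 # total of the x_i column
mean = total / n                                     # sample mean

sum_sq = sum(x ** 2 for x in glucose)                # total of the x_i^2 column
sum_sq_dev = sum((x - mean) ** 2 for x in glucose)   # total of the last column

var_definition = sum_sq_dev / (n - 1)                # formula i)
var_shortcut = (sum_sq - n * mean ** 2) / (n - 1)    # formula ii)
std_dev = math.sqrt(var_definition)

print("n =", n, " sum =", round(total, 1), " mean =", round(mean, 3))
print("variance (definition):", round(var_definition, 4))
print("variance (shortcut):  ", round(var_shortcut, 4))
print("standard deviation:   ", round(std_dev, 4))
```

With floating-point arithmetic the two variance values may differ in the last few decimal places, which is rounding error rather than a disagreement between the formulas.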

8f. By any means you like (by hand is fine), produce a histogram of these data.

8g. Calculate the mean ±1 standard deviation and the mean ±2 standard deviations. Indicate these points on your histogram.

8h. What term best describes the shape of the distribution of blood glucose in this sample: symmetrical, skewed to the right, or skewed to the left?
