## Statistics

Part A -

Question 1 - True or False: In data collection, the most common technique to ensure proper representation of the population is to use a random sample.

True

False

Question 2 - Most analysts focus on the cost of HECS fees as the way to measure the cost of a university education. But incidentals, such as textbook costs, are rarely considered. A researcher at the University of Adelaide wishes to estimate the textbook costs of first-year students at the University. To do so, she monitored the textbook cost of 250 first-year students and found that their average textbook cost was \$300 per semester. Identify the population of interest to the researcher.

All university students.

The 250 students that were monitored.

All first-year University of Adelaide students.

Question 3 - True or False: The answer to the question "What is your favourite colour?" is an example of a continuous variable.

True

False

Question 4 - The classification of student year of study (first year, second year, third year, honours) is an example of

a discrete variable.

an ordinal variable.

a nominal variable.

a continuous variable.

Question 5 - True or False: Histograms are used for continuous data while bar charts are suitable for discrete data.

True

False

Question 6 - An insurance company evaluates many numerical variables about a person before deciding on an appropriate rate for automobile insurance. A representative from a local insurance agency selected a random sample of insured drivers and recorded the number of claims each made in the last 3 years. The results are shown below.

| Number of claims | Frequency |
|---|---|
| 1 | 29 |
| 2 | 33 |
| 3 | 27 |
| 4 | 20 |
| 5 | 16 |

How many drivers are represented in the sample?

125

100

75

50
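Because each driver appears in exactly one row of the frequency table, the sample size is the sum of the frequency column. A minimal Python sketch of that arithmetic, with the table's values typed in by hand (the variable names are illustrative only):

```python
# Frequency table from Question 6: number of claims -> number of drivers.
claim_frequencies = {1: 29, 2: 33, 3: 27, 4: 20, 5: 16}

# Each driver is counted once, so the sample size is the sum of the frequencies.
total_drivers = sum(claim_frequencies.values())
print(total_drivers)
```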

Question 7 - Research has shown that the more time students spend reviewing their course site, the better their chances of successfully completing their course.

The data in the Excel file webstats 1 (under Supplementary Material on the Canvas site) refer to the duration (in minutes) that a sample of 50 students spent on the website of a Statistics course that they were enrolled in last September.

Using Excel, construct a histogram to represent the above data. Use a class width of 50 and start the first class at 0.

Note: You are not required to upload the histogram but to answer the following questions based on the histogram.

The shape of the distribution is?

The modality of the distribution is?

Are there any outliers?
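The binning rule in this question (class width 50, first class starting at 0) can be sketched outside Excel as follows. The durations below are made up for illustration and are not the contents of the webstats file. The sketch assumes each class includes its lower limit and excludes its upper limit; Excel's histogram tools may treat boundary values differently, so a value sitting exactly on a class limit can land in a different class there.

```python
from collections import Counter

# Hypothetical durations in minutes; NOT the contents of the webstats file.
durations = [12, 48, 50, 75, 99, 101, 149, 150, 230, 310]

CLASS_WIDTH = 50  # class width from the question
START = 0         # lower limit of the first class

# Assign each value to the class [lower, lower + width) and count per class.
counts = Counter(
    START + ((d - START) // CLASS_WIDTH) * CLASS_WIDTH for d in durations
)

# Print a text histogram, including empty classes.
for lower in range(START, max(durations) + 1, CLASS_WIDTH):
    print(f"{lower:>3} to under {lower + CLASS_WIDTH:<3}: {'*' * counts[lower]}")
```

Reading the printed bars from top to bottom gives a rough view of shape (where the tail is), modality (how many peaks) and any classes that are isolated from the rest.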

Question 8 - True or False: If the distribution of a data set was perfectly symmetrical, the distance from Q1 to the median would always equal the distance from Q3 to the median in a box plot.

True

False

Question 9 - True or False: The five-number summary consists of the smallest observation, the first quartile, the median, the third quartile, and the largest observation.

True

False
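As an illustration of the five values named in this question, here is a short sketch using Python's standard `statistics` module on made-up data. The quartile convention is an assumption: `method="inclusive"` interpolates between order statistics, and other conventions (including some spreadsheet functions) can give slightly different quartiles for the same data.

```python
import statistics

# Hypothetical data; not taken from any of the assignment files.
data = [2, 4, 4, 5, 7, 8, 9, 11, 15]

# quantiles(..., n=4) returns the three cut points Q1, median and Q3.
q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")

# Smallest observation, Q1, median, Q3, largest observation.
five_number_summary = (min(data), q1, median, q3, max(data))
print(five_number_summary)
```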

Question 10 - True or False: A set of data is perfectly symmetrical when the mean is identical to the median.

True

False
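A perfectly symmetrical data set always has its mean equal to its median; whether the reverse holds can be checked with a small experiment. The sketch below uses made-up values chosen so that the mean and median coincide, then tests whether the deviations from the centre mirror each other.

```python
import statistics

# Hypothetical data chosen so that the mean and the median coincide.
data = [1, 2, 3, 3, 3, 6]

mean = statistics.mean(data)      # 18 / 6
median = statistics.median(data)  # average of the two middle values

# In a symmetrical data set the deviations from the centre mirror around 0.
deviations = sorted(x - mean for x in data)
mirrored = sorted(-d for d in deviations)

print(mean == median)          # do the two measures of centre agree?
print(deviations == mirrored)  # are the deviations their own mirror image?
```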

Question 11 - True or False: If the mean is equal to the median, the distribution is symmetrical or has zero skewness.

True

False

Question 12 - Research has shown that the more time students spend reviewing their course site, the better their chances of successfully completing their course.

The data in the Excel file webstats 2 (under Supplementary Material on the Canvas site) refer to the duration (in minutes) that a sample of 50 students spent on the website of a Statistics course that they were enrolled in last September.

Using Excel, generate the Descriptive Statistics for the above data.

Note: You are not required to upload the output but to answer the following questions based on the output.

The mean is?

The median is?

The range is?

The IQR is?

The shape of the distribution is?

The most appropriate measure to describe Centre is?

The most appropriate measure to describe Spread is?
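The measures asked for here can also be computed outside Excel. The sketch below uses Python's standard `statistics` module on made-up durations (not the webstats data), and the "inclusive" quartile method is an assumption; Excel's output may differ slightly depending on which quartile function is used.

```python
import statistics

# Hypothetical durations in minutes; NOT the contents of the webstats file.
durations = [5, 12, 18, 20, 25, 31, 40, 58, 95, 210]

mean = statistics.mean(durations)
median = statistics.median(durations)
data_range = max(durations) - min(durations)
std_dev = statistics.stdev(durations)  # sample standard deviation

q1, _, q3 = statistics.quantiles(durations, n=4, method="inclusive")
iqr = q3 - q1

print(f"mean={mean:.1f} median={median:.1f} range={data_range} "
      f"IQR={iqr:.2f} sd={std_dev:.1f}")

# Rough rule of thumb: a mean well above the median suggests right skew, in
# which case the median and IQR are usually preferred to the mean and sd.
if mean > median:
    print("mean > median: consistent with right skew")
```

The design point is the last step: the choice of the most appropriate measures of centre and spread follows from the shape, not the other way round.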

Question 13 - Despite continuous advances in communication technology, many organisations still prefer staff to meet their clients face to face. These meetings are often interstate and sometimes arranged at the last minute, requiring almost immediate transport and accommodation bookings. A common method of booking hotel rooms is to compare the market prices using one of the many specialised websites.

The file Hotel Room Prices.xlsx (under Supplementary Material on the Canvas site) contains a sample of 256 hotel room prices (in dollars) listed on a "well known" website for the night of Thursday 16th February 2017.

Construct side-by-side boxplots for "All cities hotel rooms prices" and for each city's hotel room prices.

Note: You are not required to upload the side-by-side boxplots but to answer the following questions.

How many outliers can be identified for each of the following cities?

All Cities

Brisbane

Melbourne

Sydney
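Boxplots conventionally flag a point as an outlier when it lies more than 1.5 × IQR below Q1 or above Q3. The sketch below applies that rule per group. The prices are made up for illustration and are not the contents of Hotel Room Prices.xlsx; the helper name `count_outliers` and the "inclusive" quartile method are assumptions, and Excel's boxplot may use a different quartile convention.

```python
import statistics

def count_outliers(values):
    """Count values outside the 1.5 x IQR fences used by a standard boxplot."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return sum(1 for v in values if v < lower_fence or v > upper_fence)

# Hypothetical prices in dollars; NOT the contents of Hotel Room Prices.xlsx.
prices = {
    "Brisbane": [120, 135, 140, 150, 155, 160, 400],
    "Melbourne": [130, 145, 150, 165, 170, 180, 190],
    "Sydney": [90, 200, 210, 220, 230, 240, 600],
}
# Pool every city's prices to form the "All Cities" group.
prices["All Cities"] = [p for city in list(prices.values()) for p in city]

for group, values in prices.items():
    print(group, count_outliers(values))
```

Because the fences are recomputed for each group, a price can be an outlier within its own city without being one in the pooled data, so the "All Cities" count need not equal the sum of the city counts.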

Part B -

Question 1 - A study is under way in the Otway National Park to determine the mature height of Mountain Ash gum trees. Specifically, the study is attempting to determine what factors aid a tree in reaching heights greater than 60 metres. It is estimated that the park contains 25,000 mature Mountain Ash gum trees. The study involves collecting heights from 250 randomly selected mature Mountain Ash gum trees and analysing the results. Identify the population from which the study was sampled.

All Mountain Ash gum trees, of any age, in the park.

The 25,000 mature Mountain Ash gum trees in the park.

The 250 randomly selected mature Mountain Ash gum trees.

All the mature Mountain Ash gum trees taller than 60 metres.

Question 2 - A study is under way in the Otway National Park to determine the mature height of Mountain Ash gum trees. Specifically, the study is attempting to determine what factors aid a tree in reaching heights greater than 60 metres. It is estimated that the park contains 25,000 mature Mountain Ash gum trees. The study involves collecting heights from 250 randomly selected mature Mountain Ash gum trees and analysing the results. Identify the sample in the study.

The 250 randomly selected mature Mountain Ash gum trees.

The 25,000 mature Mountain Ash gum trees in the park.

All Mountain Ash gum trees, of any age, in the park.

All the mature Mountain Ash gum trees taller than 60 metres.

Question 3 - The data in the Excel file travel work (under Supplementary Material on the Canvas site) refer to the distance (in kilometres) that a sample of 50 people drive to work each day.

Using Excel, generate the Descriptive Statistics for the above data.

Note: You are not required to upload the output but to answer the following questions based on the output.

The mean is?

The median is?

The range is?

The IQR is?

The shape of the distribution is?

The most appropriate measure to describe Spread is?

Question 4 - The data in the Excel file travel work (under Supplementary Material on the Canvas site) refer to the number of kilometres that a sample of 50 people drive to work each day.

Using Excel, construct a histogram to represent the above data. Use a class width of 10 and start the first class at 0.

Note: You are not required to upload the histogram but to answer the following questions based on the histogram.

The shape of the distribution is?

The modality of the distribution is?

Are there any outliers?

Question 5 - True or False: Faculty rank (professor to associate lecturer) is an example of a discrete variable.

True

False

Question 6 - True or False: Whether a university student is a full-fee student or a HECS student is an example of a nominal variable.

True

False

Question 7 - The Crowne Plaza in Canberra's CBD will be open for three meal sittings (breakfast, lunch and dinner) on this year's Christmas Day. Because of the extra expense of public holiday rates, the manager does not want to over-staff the restaurant by rostering people on for longer than necessary; however, the manager also does not want to under-staff the restaurant, as this may cost patronage at future events. To assist the manager in preparing the Christmas roster, data were collected randomly over the past 3 months. Consider the data and how to allocate reasonable meal times given the budget considerations.

The file Xmas.xlsx (under Supplementary Material on the Canvas site) contains a sample of 1200 dining times (in minutes), 400 for each meal, collected randomly over the past 3 months. Meal sessions are coded Breakfast = 1; Lunch = 2; Dinner = 3.

Construct side-by-side boxplots for "all meals" and for each of the meal sessions.

Note: You are not required to upload the side-by-side boxplots but to answer the following questions.

How many potential outliers can be identified for each of the following sessions?

All Meals

Breakfast

Lunch

Dinner


