
Assignment: Written and Practical Report

The key frameworks and concepts covered in modules 1-5 are particularly relevant for this assignment. The assignment relates to course learning objectives 1, 2 and 4 and the associated MBA program learning goals and skills (Global Content, Problem Solving, Critical Thinking, and Written Communication) at level 3:

1. Demonstrate applied knowledge of people, markets, finances, technology and management in a global context of business intelligence practice (data warehouse design, data mining process, data visualisation and performance management) and resulting organisational change and how these apply to implementation of business intelligence in organisation systems and business processes.

2. Identify and solve complex organisational problems creatively and practically through the use of business intelligence, and critically reflect on how evidence-based decision making and sustainable business performance management can effectively address real-world problems.

4. Demonstrate the ability to communicate effectively in a clear and concise manner in written report style for senior management with correct and appropriate acknowledgment of main ideas presented and discussed.

The assignment consists of three main tasks and a number of sub tasks.

Task 1: Consists of the following sub tasks

The sinking of the Titanic is a famous event. You may find it useful to research the facts surrounding the sinking to inform your understanding of the problem and your interpretation of your data analysis of the factors determining the survival of passengers on the Titanic. Use the data mining tool RapidMiner to conduct an exploratory analysis of the titanic_train.csv data set, which is provided on the course study desk Assignment 2 folder link, and then build a simple predictive model of survival on the Titanic using a Decision Tree.

a) Identify the five key variables that contribute most to determining the survival rate of passengers on the ill-fated Titanic on its maiden voyage. You should also refer to the data dictionary provided with the titanic3_train.csv file, which describes each of the variables and their range of values. (Hint: an exploratory analysis should be based on summary statistics, histograms, crosstab tables and scatterplots of individual variables, and on the relationship between individual variables and the target variable survived. Which variables are correlated with the target variable survived, and which with other variables?) You might also need to reformat some of the variables to facilitate the next stage of analysis of the titanic3_train.csv and titanic3_score.csv data sets using a Decision Tree. (Hint: you will need to convert the survival variable to a nominal variable with the values Yes = 1, No = 0 in titanic_train.csv.) See Data Mining for the Masses, Chapters 3 and 4, for guidance on Exploratory Data Analysis using RapidMiner.

Discuss each of your five top predictor variables and the general results of your exploratory data analysis in RapidMiner, as well as how you dealt with missing data and unusual data, informed by relevant supporting literature on the survival rate of passengers on the Titanic. Your discussion should include appropriate statistical analysis results, such as graphs and results tables from the exploratory data analysis in RapidMiner, with some supporting references on predictive model building and interpretation using Decision Trees in data mining (about 600 words).
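The crosstab step in the hint can also be sketched outside RapidMiner. The following is a minimal Python illustration, not the RapidMiner workflow the task requires. The sample rows are hypothetical (they are not taken from titanic_train.csv), and the helper names `to_nominal`, `survival_crosstab` and `survival_rate` are assumptions introduced for illustration. It recodes survived as the nominal labels Yes/No and tabulates survival against one candidate predictor at a time.

```python
from collections import defaultdict

# Hypothetical sample rows for illustration only (NOT from titanic_train.csv).
# Column names follow the data dictionary: pclass, sex, survived.
SAMPLE = [
    {"pclass": 1, "sex": "female", "survived": 1},
    {"pclass": 1, "sex": "male", "survived": 0},
    {"pclass": 2, "sex": "female", "survived": 1},
    {"pclass": 2, "sex": "male", "survived": 0},
    {"pclass": 3, "sex": "female", "survived": 1},
    {"pclass": 3, "sex": "female", "survived": 0},
    {"pclass": 3, "sex": "male", "survived": 0},
    {"pclass": 3, "sex": "male", "survived": 1},
]


def to_nominal(flag):
    """Map the numeric survived flag to the nominal labels Yes (1) / No (0)."""
    return "Yes" if int(flag) == 1 else "No"


def survival_crosstab(rows, variable):
    """Count Yes/No outcomes for each distinct value of `variable`."""
    table = defaultdict(lambda: {"Yes": 0, "No": 0})
    for row in rows:
        table[row[variable]][to_nominal(row["survived"])] += 1
    return dict(table)


def survival_rate(counts):
    """Share of passengers in one crosstab cell group who survived."""
    total = counts["Yes"] + counts["No"]
    return counts["Yes"] / total if total else 0.0


# Print one small crosstab per candidate predictor.
for variable in ("sex", "pclass"):
    for value, counts in sorted(survival_crosstab(SAMPLE, variable).items()):
        print(f"{variable}={value}: {counts} rate={survival_rate(counts):.2f}")
```

Comparing the survival rates across the values of each variable is one simple way to shortlist candidate predictors before building the tree.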

The following table lists the data dictionary for the data set titanic_train.csv. (Note: titanic_score.csv is the same as titanic_train.csv but does not contain any values for the target variable survived, which is referred to as a label variable in RapidMiner.)

Variable     Description
pclass       Passenger Class (1 = 1st class; 2 = 2nd class; 3 = 3rd class)
survived     Survived (0 = No; 1 = Yes)
name         Name
Sex          Sex
Age          Age
sibsp        Number of Siblings/Spouses Aboard
parch        Number of Parents/Children Aboard
ticket       Ticket Number
fare         Passenger Fare
cabin        Cabin
embarked     Port of Embarkation (C = Cherbourg; Q = Queenstown; S = Southampton)
boat         Lifeboat
body         Body Identification Number
home.dest    Home/Destination

SPECIAL NOTES: pclass is a proxy for socio-economic status (SES): 1st ~ Upper; 2nd ~ Middle; 3rd ~ Lower.

Age is in years; fractional if age is less than one (1). If the age is estimated, it is in the form xx.5.

Fare is in Pre-1970 British Pounds (£). Conversion Factors: 1£ = 20s = 240d and 1s = 12d.

With respect to the family relation variables (i.e. sibsp and parch), some relations were ignored. The following are the definitions used for sibsp and parch.

Sibling: Brother, Sister, Stepbrother, or Stepsister of Passenger Aboard Titanic
Spouse: Husband or Wife of Passenger Aboard Titanic (Mistresses and Fiancées Ignored)
Parent: Mother or Father of Passenger Aboard Titanic
Child: Son, Daughter, Stepson, or Stepdaughter of Passenger Aboard Titanic

Other family relatives excluded from this study include cousins, nephews/nieces, aunts/uncles, and in-laws. Some children travelled only with a nanny, therefore parch=0 for them. Some passengers also travelled with very close friends or neighbours from their village; however, the definitions do not support such relations.
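The age and fare conventions in the special notes can be made concrete with a short sketch. This is a minimal Python illustration under stated assumptions: the function names `is_estimated_age` and `fare_to_lsd` are hypothetical, ages below one are treated as genuine fractional infant ages rather than estimates, and the conversion uses the standard pre-decimal rates of 20 shillings to the pound and 12 pence to the shilling.

```python
def is_estimated_age(age):
    """Return True when an age looks estimated, i.e. has the form xx.5.

    Assumption: ages below one are fractional infant ages, so a value
    such as 0.5 is not flagged as an estimate.
    """
    return age >= 1 and age % 1 == 0.5


def fare_to_lsd(fare):
    """Convert a fare in decimal pounds to (pounds, shillings, pence).

    Uses 1 pound = 20 shillings = 240 pence and 1 shilling = 12 pence,
    rounding to the nearest whole penny.
    """
    total_pence = round(fare * 240)
    pounds, remainder = divmod(total_pence, 240)
    shillings, pence = divmod(remainder, 12)
    return pounds, shillings, pence


print(is_estimated_age(28.5))  # estimated age of the form xx.5
print(fare_to_lsd(7.25))       # pounds, shillings, pence
```

Flagging estimated ages in this way can help when deciding how to treat unusual or uncertain values during the exploratory analysis.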

STORY BEHIND THE DATA: This dataset is based on the Titanic Passenger List edited by Michael A. Findlay, originally published in Eaton & Haas (1994) Titanic: Triumph and Tragedy, Patrick Stephens Ltd, and expanded with the help of the internet community.

b) Build a model for predicting the survival of passengers on the Titanic using a Decision Tree in RapidMiner (see Chapter 10 of the Data Mining for the Masses textbook for guidance on Decision Trees in RapidMiner) using the two data sets, titanic3_train.csv and titanic3_score.csv. Then present and discuss the results of your Decision Tree analysis, including a diagram showing your final Decision Tree. Comment on the relative predictive strength of this model and on what you believe are the most significant variables that determined whether a passenger on the Titanic survived or not. Include some supporting references on using Decision Trees in data mining (about 400 words).
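Decision tree learners typically choose each split using an impurity criterion such as information gain, which is one way to reason about which variables are "most significant". The following is a minimal Python sketch of that calculation, not a description of how RapidMiner builds its tree. The sample rows are hypothetical (not taken from the assignment data sets), and the function names `entropy` and `information_gain` are introduced for illustration.

```python
import math
from collections import Counter

# Hypothetical sample rows for illustration only (NOT from the assignment data).
SAMPLE = [
    {"pclass": 1, "sex": "female", "survived": "Yes"},
    {"pclass": 1, "sex": "male", "survived": "No"},
    {"pclass": 2, "sex": "female", "survived": "Yes"},
    {"pclass": 2, "sex": "male", "survived": "No"},
    {"pclass": 3, "sex": "female", "survived": "Yes"},
    {"pclass": 3, "sex": "female", "survived": "No"},
    {"pclass": 3, "sex": "male", "survived": "No"},
    {"pclass": 3, "sex": "male", "survived": "Yes"},
]


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of class labels."""
    total = len(labels)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(labels).values()
    )


def information_gain(rows, attribute, target="survived"):
    """Reduction in entropy of `target` after splitting on `attribute`."""
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    # Weighted average entropy of the child nodes after the split.
    remainder = sum(
        len(group) / len(rows) * entropy(group) for group in groups.values()
    )
    return entropy([row[target] for row in rows]) - remainder


# Rank the candidate attributes by information gain on the toy sample.
for attribute in ("sex", "pclass"):
    print(attribute, round(information_gain(SAMPLE, attribute), 4))
```

On this toy sample, splitting on sex reduces entropy while splitting on pclass does not; real data will give different values, so the ranking here says nothing about the actual Titanic results.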

Task 2: Consists of the following two sub tasks

Big data is a hot topic and is generating enormous interest in industry and academia; however, there is no agreement on the definition of this term, and the application of big data analytics in practice is currently more hype than reality.

Your task is twofold:

a) Research and critically review the current literature available on the Internet and in academic journals and conferences, and provide a comprehensive definition and description of the term 'Big Data' that is underpinned and supported by the reference literature (approx. 500 words).

b) Research and critically review the current literature available on the Internet and in academic journals and conferences, and provide a comprehensive discussion describing one specific application of Big Data analytics in an industry sector. Emphasise how, in this specific application, Big Data analytics is providing business value to organisations in this industry sector (approx. 1000 words).

Your discussion and analysis here should be underpinned by an appropriate level of in text referencing using Harvard Referencing Style.

Task 3: Consists of the following sub tasks

Using the Excel file SalesSuperstore.xlsx, provided on the course study desk Assignment Folder link, and Tableau Desktop 8.3, produce the following four reports with appropriate accompanying graphs, each based on a Tableau workbook sheet view. Briefly comment on each report in about 125 words in terms of what trends and patterns are apparent in each report.

The SalesSuperstore.xlsx file contains the following dimensions and information:

1. Customer Name, Customer Segment

2. Location: Region, State, City, Zipcode

3. Product Category, Sub Category, Product Name, Product Container, Unit Price

4. Order Information

5. Shipping Information

6. Sales Information

7. Profit

a) Create a report and accompanying graph using Tableau that shows a trend analysis for sales by Product Category over the years 2009 to 2012 and comment on key trends and patterns apparent in this report (125 words approx)

b) Create a report and accompanying graph using Tableau that shows for each Product Category Average Profit and Total Sales for each month over the years 2009 to 2012 and comment on key trends and patterns apparent in this report (125 words approx)

c) Create a geographical map presentation using Tableau that shows graphically the relative size of Product Sales by City within each State for the year 2012 and comment on key trends and patterns in this report (125 words approx)

d) Create a report and accompanying graph using Tableau that shows Unit Prices, Sales and Profit for the technology-based Product Sub Categories for each month over the years 2009 to 2012 and comment on key trends and patterns in this report (125 words approx)
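
The aggregations behind reports (a) and (b) can be sketched outside Tableau to check what each view should contain. This is a minimal Python illustration only; the deliverable itself must be built in Tableau. The rows, field names (`order_year`, `category`, `sales`, `profit`) and values are hypothetical and are not taken from SalesSuperstore.xlsx.

```python
from collections import defaultdict

# Hypothetical order rows for illustration only (NOT from SalesSuperstore.xlsx).
ORDERS = [
    {"order_year": 2009, "category": "Technology", "sales": 1200.0, "profit": 300.0},
    {"order_year": 2009, "category": "Technology", "sales": 800.0, "profit": 100.0},
    {"order_year": 2009, "category": "Furniture", "sales": 500.0, "profit": -50.0},
    {"order_year": 2010, "category": "Technology", "sales": 1500.0, "profit": 400.0},
    {"order_year": 2010, "category": "Furniture", "sales": 700.0, "profit": 150.0},
]


def total_sales_by_year_and_category(rows):
    """Sum sales for each (year, category) pair, as in a sales trend view."""
    totals = defaultdict(float)
    for row in rows:
        totals[(row["order_year"], row["category"])] += row["sales"]
    return dict(totals)


def average_profit_by_category(rows):
    """Average profit per order line for each product category."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        sums[row["category"]] += row["profit"]
        counts[row["category"]] += 1
    return {category: sums[category] / counts[category] for category in sums}


for key, total in sorted(total_sales_by_year_and_category(ORDERS).items()):
    print(key, total)
print(average_profit_by_category(ORDERS))
```

Note the difference between the two measures: sales are summed, while profit is averaged, which mirrors the Total Sales and Average Profit wording in report (b).
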
Your assignment 2 report must be structured as follows, which is similar to the report structure detailed in Summers & Smith 2010:
Cover page for assignment 2 report

1. Title Page

2. Table of Contents

3. Body of report - main sections and subsections for the assignment 2 tasks and sub tasks:

3.1 Task 1 (a main heading, with appropriate sub headings for each sub task)

3.2 Task 2 ...

3.3 Task 3 ...

4. List of References

5. List of Appendices

You need to submit two files when you submit Assignment 2:

1. Your Assignment 2 Report for Tasks 1, 2 and 3 in Word document format with the extension .docx

2. Your Assignment 2 Task 3 as a Tableau packaged workbook with the extension .twbx

Use the following file naming convention:

1. Student_no_Student_name_CIS8008_Ass2.docx

2. Student_no_Student_name_CIS8008_Ass2.twbx

Online Assignment submission

All assignments must be submitted electronically via the course study desk Assignment 2 submission link. Documents submitted via this link are subject to automated checking for plagiarism and collusion by Turnitin.

Note carefully University policy on plagiarism, collusion and cheating. If any of these occur they will be found and dealt with.

Harvard referencing resources

Install a reference tool (for example, EndNote) that integrates with your word processor. These tools are a great help for referencing and citing sources in your assignments. For more information on how to get EndNote, visit the following webpage: http://www.usq.edu.au/library/referencing/endnote-bibliographic-software.

Study the referencing techniques in the Communication Skills Handbook (Summers & Smith 2010). The USQ Librarian has compiled the following resources on how to reference correctly using the Harvard referencing system; make use of these resources if you are unsure how to reference correctly.

Library Harvard Referencing Guide: http://www.usq.edu.au/library/referencing/harvardagps-referencing-guide
