
Business Intelligence & Big Data Analytics

Assignment Specification and Deliverables

You are given a dataset from a survey of the distribution of household income. The survey collects a mix of continuous and discrete values on the source and amount of income, labour force information, and general demographic characteristics. The full dataset is provided in two Excel files, income_data and income_test. Each file contains one table: the income_data table holds the training data (32561 rows) and the income_test table holds data that can be used for testing (16281 rows).

You are asked to complete the following task:

Predict the income of individuals in the income_test table. The prediction task is to determine whether a person makes over 50K a year.
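The target here is binary, so any set of predictions can be compared against a trivial baseline that always predicts the more common class. The following is a minimal Python sketch of that comparison, independent of Modeler. The label strings `>50K` and `<=50K` and the short lists are illustrative assumptions, not values taken from the income tables.

```python
from collections import Counter


def accuracy(actual, predicted):
    """Fraction of records whose predicted label matches the actual label."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("no records to score")
    hits = sum(a == p for a, p in zip(actual, predicted))
    return hits / len(actual)


def majority_baseline(train_labels, n):
    """Predict the most frequent training label for each of n records."""
    most_common_label, _ = Counter(train_labels).most_common(1)[0]
    return [most_common_label] * n


# Hypothetical labels, for illustration only (assumed label strings).
train = ["<=50K", "<=50K", ">50K", "<=50K"]
actual = ["<=50K", ">50K", "<=50K", "<=50K", ">50K"]
model_pred = ["<=50K", ">50K", "<=50K", "<=50K", "<=50K"]

baseline_pred = majority_baseline(train, len(actual))
print("baseline accuracy:", accuracy(actual, baseline_pred))
print("model accuracy:", accuracy(actual, model_pred))
```

A model is only worth reporting if it beats this baseline on the test table, which is a useful sanity check before comparing streams.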

Complete the task in Modeler, using the techniques discussed in the lectures.

The deliverables that you should produce are:

• A report describing the approach that you have followed, the study scenarios or streams that you have attempted, the pre-processing of the data and the best predictions that you have achieved.

• The predictions of the income achieved (stored as a table), the streams that you have created (stored as .str files) and the models that you have generated (stored as .gen files). All deliverables should be included in a zip file and submitted through Blackboard.

Below is an indication of the parts that your report should include, together with an indication of the overall weighting attached to each part.

1. A Cover page gives the title of your report, your name, student number and degree programme.

2. An Introduction Section of at most two pages introduces the problem, the approach followed and the analysis, and outlines the contents of each section of the report.

3. The Main Section of at most ten pages develops the approach that you have followed, the study scenarios or streams that you have attempted, the pre-processing of the data and the best predictions that you have achieved.

4. A Conclusion Section of at most two pages summarises the main points of the report and draws your overall conclusions. Assume that a bank had requested this survey in order to offer low-interest loans to households with income over 50K. What would be your overall recommendations to the bank? Justify your answer based on your data mining experiments and results (overall weighting 20%).

5. Use of sources, presentation, language and, if needed, referencing.
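For the bank scenario in the Conclusion Section, overall accuracy may be less informative than how reliable the "over 50K" predictions are. One possible way to quantify this is precision and recall for the over-50K class, sketched below in Python. The label strings and the example lists are hypothetical and only illustrate the calculation.

```python
def positive_class_metrics(actual, predicted, positive=">50K"):
    """Precision and recall for the class named by `positive` (assumed label)."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)
    fp = sum(a != positive and p == positive for a, p in pairs)
    fn = sum(a == positive and p != positive for a, p in pairs)
    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"tp": tp, "fp": fp, "fn": fn,
            "precision": precision, "recall": recall}


# Hypothetical labels, for illustration only.
actual = [">50K", ">50K", "<=50K", "<=50K", "<=50K", ">50K"]
predicted = [">50K", "<=50K", ">50K", "<=50K", "<=50K", ">50K"]

metrics = positive_class_metrics(actual, predicted)
print(metrics)
```

Precision answers "of the households the model flags as over 50K, how many really are", which is the figure a lender would care about when offering loans. Recall shows how many eligible households the model misses.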

Dataset Description: the attributes in the dataset are described below.

| Attribute | Type | Description |
| --- | --- | --- |
| age | continuous | The age of the household income earner |
| workclass | Private, Self-emp-not-inc, Self-emp-inc, Federal-gov, Local-gov, State-gov, Without-pay, Never-worked | Employment status of the household income earner |
| education | Bachelors, Some-college, 11th, HS-grad, Prof-school, Assoc-acdm, Assoc-voc, 9th, 7th-8th, 12th, Masters, 1st-4th, 10th, Doctorate, 5th-6th, Preschool | Information about the highest level of school completed or degree received from the household income earner |
| education-num | continuous | Number of years in education of the household income earner |
| marital-status | Married-civ-spouse, Divorced, Never-married, Separated, Widowed, Married-spouse-absent, Married-AF-spouse | Marital status of the household income earner |
| occupation | Tech-support, Craft-repair, Other-service, Sales, Exec-managerial, Prof-specialty, Handlers-cleaners, Machine-op-inspct, Adm-clerical, Farming-fishing, Transport-moving, Priv-house-serv, Protective-serv, Armed-Forces | The occupation of the household income earner |
| relationship | Wife, Own-child, Husband, Not-in-family, Other-relative, Unmarried | The relationship of the interviewee with the income earner |
| race | White, Asian-Pac-Islander, Amer-Indian-Eskimo, Other, Black | The race of the household income earner |
| Sex | Female, Male | The sex of the household income earner |
| capital-gain | continuous | The increase of the income from the previous year when the last survey was carried out |
| capital-loss | continuous | The decrease of the income from the previous year when the last survey was carried out |
| hours-per-week | continuous | The number of hours worked on average each week by the household income earner |
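Because the attributes mix continuous and discrete types, pre-processing usually has to treat the two kinds differently. Modeler handles categorical fields through its own type settings. Outside Modeler, one common approach is to pass continuous values through and one-hot encode discrete ones. The Python sketch below illustrates this for a subset of the attributes listed above. The example record, the choice of subset, and the rule that an unknown category becomes all zeros are assumptions made for illustration.

```python
# Continuous attributes, as named in the dataset description.
CONTINUOUS = ["age", "education-num", "capital-gain",
              "capital-loss", "hours-per-week"]

# Two of the discrete attributes, with the values listed in the description.
CATEGORIES = {
    "Sex": ["Female", "Male"],
    "race": ["White", "Asian-Pac-Islander", "Amer-Indian-Eskimo",
             "Other", "Black"],
}


def encode(record):
    """Turn one record (a dict) into a flat list of numeric features."""
    features = [float(record[name]) for name in CONTINUOUS]
    for name, values in CATEGORIES.items():
        # One indicator per category; a missing or unknown value
        # yields all zeros (an assumption of this sketch).
        features.extend(1.0 if record.get(name) == v else 0.0 for v in values)
    return features


# Hypothetical record, not taken from the income tables.
record = {"age": 45, "education-num": 10, "capital-gain": 0,
          "capital-loss": 0, "hours-per-week": 38,
          "Sex": "Female", "race": "Black"}

encoded = encode(record)
print(encoded)
```

The same pattern extends to workclass, education, marital-status, occupation and relationship by adding their value lists to `CATEGORIES`.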

Business Management, Management Studies

  • Category:- Business Management
  • Reference No.:- M91779914
