The Groceries Dataset

Imagine 10,000 receipts sitting on your table. Each receipt represents a transaction and lists the items that went into a customer's basket. That is exactly what the Groceries dataset contains: a collection of receipts, one per line, with the items purchased. Each line is called a transaction, and each column in a line holds one item.
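
The transaction layout described above can be sketched in a few lines. The assignment itself asks for R; the Python below is only a language-neutral illustration. The sample receipts, the comma separator, and the function names `parse_transactions` and `item_counts` are assumptions made for this sketch and are not taken from the real file.

```python
from collections import Counter

# Hypothetical receipts: one line per transaction, items separated by commas.
# These lines are made up for illustration; they are not from the real dataset.
SAMPLE_LINES = [
    "whole milk,bread,butter",
    "bread,soda",
    "whole milk,bread",
    "soda,whole milk,butter,bread",
]


def parse_transactions(lines):
    """Turn each receipt line into a set of item names."""
    transactions = []
    for line in lines:
        items = {item.strip() for item in line.split(",") if item.strip()}
        if items:  # skip empty receipts
            transactions.append(items)
    return transactions


def item_counts(transactions):
    """Count in how many transactions each item appears."""
    counts = Counter()
    for basket in transactions:
        counts.update(basket)
    return counts


transactions = parse_transactions(SAMPLE_LINES)
counts = item_counts(transactions)
print(len(transactions), "transactions")
print(counts.most_common(2))
```

Storing each basket as a set drops duplicate items on a receipt, which matches the usual view of a transaction as "which items were bought" rather than "how many".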

Task 1: Data Pre-processing

Read the data into R. There are many ways to read CSV tables in R; for more details, refer to the documentation on data import/export in R.

For the clustering experiments, the column of class labels needs to be removed. Refer to lecture Module 10 to see how to do so.

Check whether any other pre-processing would benefit the analysis, for example replacing missing values, normalizing attribute ranges, or converting numerical or string values to nominal values.
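
Two of the pre-processing steps named above, replacing missing values and normalizing an attribute range, can be illustrated as follows. This is a minimal Python sketch rather than R code; mean imputation and min-max scaling are only one possible choice for each step, and the function names and the sample column are hypothetical.

```python
def fill_missing(values, missing=None):
    """Replace missing entries with the mean of the observed values."""
    observed = [v for v in values if v is not missing]
    if not observed:
        raise ValueError("column has no observed values")
    mean = sum(observed) / len(observed)
    return [mean if v is missing else v for v in values]


def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant column: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical numeric attribute with one missing entry.
column = [2.0, None, 6.0, 10.0]
filled = fill_missing(column)        # None is replaced by the mean of 2, 6, 10
scaled = min_max_normalize(filled)   # smallest value maps to 0, largest to 1
print(filled)
print(scaled)
```

Whether either step actually helps depends on the method: distance-based clustering such as k-means is sensitive to attribute ranges, while tree-based classifiers usually are not.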

Task 2: Data Mining

- Association rule mining experiments: Use R to explore association rules on the Groceries dataset. Try out different algorithms. Visualize the results you find. Report any interesting association rules discovered in the experiments and explain why they are interesting.
- Classification experiments: Use R to construct classifiers on the mushroom or Ionosphere dataset. Randomly split the dataset into training and test sets (80% vs. 20%). Select at least one classifier from each of two of the following categories: tree-based models, Bayes classifiers, and rule-based classifiers. Compare the results of the chosen classifiers.
- Clustering experiments: Use R to explore clusters on the mushroom or Ionosphere dataset. Select and compare two clustering algorithms from R (e.g. k-means vs. density-based). Use R to visually explore the resulting clusters.
- For all of the above experiments, try different parameter settings to fine-tune the outcome. In principle, select methods that work well on the given dataset.
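
When explaining why a rule is interesting, the standard measures are support, confidence, and lift. The sketch below computes them directly from their definitions; it is a Python illustration rather than R code, and it does not mine rules the way Apriori or similar algorithms do. The baskets and the function names `support` and `rule_metrics` are hypothetical.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    hits = sum(1 for basket in transactions if itemset <= basket)
    return hits / len(transactions)


def rule_metrics(transactions, lhs, rhs):
    """Support, confidence, and lift of the rule lhs -> rhs."""
    lhs, rhs = frozenset(lhs), frozenset(rhs)
    s_both = support(transactions, lhs | rhs)
    s_lhs = support(transactions, lhs)
    s_rhs = support(transactions, rhs)
    confidence = s_both / s_lhs if s_lhs else 0.0
    lift = confidence / s_rhs if s_rhs else 0.0
    return {"support": s_both, "confidence": confidence, "lift": lift}


# Hypothetical baskets, made up for illustration.
baskets = [
    {"whole milk", "bread", "butter"},
    {"bread", "soda"},
    {"whole milk", "bread"},
    {"soda", "whole milk", "butter", "bread"},
    {"soda"},
]

metrics = rule_metrics(baskets, {"butter"}, {"whole milk"})
print(metrics)
```

A lift above 1 means the right-hand side appears more often in baskets containing the left-hand side than in baskets overall, which is one common argument for calling a rule interesting. Minimum support and minimum confidence are the parameters usually tuned in these experiments.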

Task 3: Prepare a report

Your report should contain the following:
- Theoretical discussion: At most two pages discussing the data pre-processing steps, the motivation for selecting a particular method, and how the parameters were chosen.
- Results: Include results and screenshots of the above experiments.
- Discussion and error analysis: Try to interpret the results of your model. Discuss intuitions or hypotheses that can be drawn from visual inspection of the resulting classes or clusters. State any assumptions, and discuss issues that might have affected the model's performance.
- References: If you use information from sources other than the R manual and official website, you should cite them.
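
For the classification results and error analysis, the 80%/20% random split and a basic confusion-matrix summary can be sketched as below. This is a Python illustration rather than R code; the function names, the fixed seed, and the example labels and predictions are assumptions made for the sketch, not output of any real classifier.

```python
import random
from collections import Counter


def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of rows and hold out test_fraction of them as test data."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_test = round(len(shuffled) * test_fraction)
    cut = len(shuffled) - n_test
    return shuffled[:cut], shuffled[cut:]


def confusion_counts(actual, predicted):
    """Count how often each (actual, predicted) label pair occurs."""
    return Counter(zip(actual, predicted))


def accuracy(actual, predicted):
    """Fraction of test rows whose predicted label matches the actual label."""
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / len(actual)


rows = list(range(100))  # stand-in for 100 data rows
train, test = train_test_split(rows)
print(len(train), len(test))

# Hypothetical labels and predictions, for illustration only.
actual = ["edible", "edible", "poisonous", "poisonous", "edible"]
predicted = ["edible", "poisonous", "poisonous", "poisonous", "edible"]
print(confusion_counts(actual, predicted))
print(accuracy(actual, predicted))
```

Reporting the confusion counts alongside accuracy shows which kind of error each classifier makes, which is often more informative for the discussion than a single accuracy figure.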
