Ask DBMS Expert


Home >> DBMS

The groceries Dataset

Imagine 10000 receipts sitting on your table. Each receipt represents a transaction with items that were purchased. The receipt is a representation of stuff that went into a customer's basket. That is exactly what the Groceries Data Set contains: a collection of receipts with each line representing 1 receipt and the items purchased. Each line is called a transaction and each column in a row represents an item.

Task 1: Data Pre-processing

Read the data in R. There are many ways to read in csv tables in R. For more details, please refer to data import/export in R

For the clustering experiments, the column for class labels need to be removed. Refer to lecture Module 10 to see how to do so.

Verify if any other pre-processing is beneficial for the analysis. For example, replacing missing values, attribute range normalization, converting numerical or string to nominal values etc.

Task 2: Data Mining

- Association Rule Mining experiments: Using R to explorer "association rules" on the groceries dataset.Try out different algorithms. Visualize the result you found. Report any interesting association rules discovered in the experiments and explain why they are interesting.
- Classification experiments: Using to construct classifiers on the mushroom or Ionosphere dataset. Randomly split the data set in the training and test data set (80% v.s. 20%). Select at least one classifier from each of the following two categories of classifiers: Tree-based models, Bayes classifiers, and Rule-based classifiers. Compare the result of the chosen classifers.
- Clustering experiments: Using R explorer clusters on the mushroom or Ionospheredataset.Select and compare two clustering algorithms from R(e.g. k-means v.s. density-based). Use R to visually explore the resulting clusters.
- For all the above experimentations, try different parameter settings to fine tune the outcome. In principle select methods that work well on the given data set.
Task 3: Prepare a report
Your report should contain the following:
- Theoretical Discussion: Limited to two pages discussing about data preprocessing steps, the motivation for selecting a particular method, and how the parameters are chosen.
- Results: Include results and screenshots of the above experimentations.
- Discussion and error analysis: Try to interpret the results of your model. Discuss intuitions or hypothesis that can be obtained by visual inspections of the resulting classes or clusters. Mention about assumptions if any, discuss issues that might have affected the model's performance.
- References: If you are using information from other sources apart from R manual and official website, you should cite them.

Attachment:- Assignment.rar

DBMS, Programming

  • Category:- DBMS
  • Reference No.:- M93108172
  • Price:- $50

Priced at Now at $50, Verified Solution

Have any Question? 


Related Questions in DBMS

In sql developercreate a table userpermissions provide

IN SQL DEVELOPER Create a table UserPermissions (provide create and insert statements code) Document UserName Policy SYSTEM Menu JDOW W2 USAM Permissions SYSTEM W2 JDOW Form 1040 USAM Policy JDOW W2 SYSTEM Write a PL/SQL ...

Assignmentqueries functions and triggersdatabase

Assignment Queries, Functions and Triggers Database Systems Aims The aims of this assignment are to: formulate SQL queries; populate an RDBMS with a real dataset, and analyse the data; design test data for testing SQL qu ...

Question we can sort a given set of n numbers by first

Question : We can sort a given set of n numbers by first building a binary search tree containing these numbers (using TREE-INSERT repeatedly to insert the numbers one by one) and then printing the numbers by an inorder ...

Answer the following question explain the difference

Answer the following Question : Explain the difference between a database management system (DBMS) and a database. Are Microsoft Access, SQL Server, and Oracle examples of databases or database management systems (DBMS)?

Question 1 a- consider that you are asked to design an

Question: 1. (a)- Consider that you are asked to design an entity relationship diagram based on the below scenario: A university consists of a number of departments (id, d_name) and each department offers some courses. A ...

Database design and development assignment -assessment task

Database Design and Development Assignment - Assessment task - 1. Normalization a) Map the ERD, from the sample solution, into a set of relations in at least Third Normal Form (3NF). You must ensure that your relations m ...

This is a starting out with visual basic programming

This is a starting out with visual basic programming problem, meaning its a basic programming nothing advanced. Question: Add a linq statement to the combo box text-changed event to dispaly the city field of the selected ...

Question create the physical data model for the logical

Question: Create the physical data model for the logical data model that you submitted in IP3. This should include all of the data definition language SQL. Your submission should include all DDL needed to: Create the tab ...

Systems analysis project scenic routes operates a bus

Systems analysis project Scenic Routes operates a bus company that specializes in travelling on secondary roads, rather than Interstate highways. Their slogan is: "It Takes a Little Longer, But It's Scenic." The firm nee ...

Assignmenta restaurant is designing a database to keep

Assignment A restaurant is designing a database to keep track of customer services. A customer is defined as a customer ID, name, address and a telephone number. Customers are served by employees. Each employee is define ...

  • 4,153,160 Questions Asked
  • 13,132 Experts
  • 2,558,936 Questions Answered

Ask Experts for help!!

Looking for Assignment Help?

Start excelling in your Courses, Get help with Assignment

Write us your full requirement for evaluation and you will receive response within 20 minutes turnaround time.

Ask Now Help with Problems, Get a Best Answer

Why might a bank avoid the use of interest rate swaps even

Why might a bank avoid the use of interest rate swaps, even when the institution is exposed to significant interest rate

Describe the difference between zero coupon bonds and

Describe the difference between zero coupon bonds and coupon bonds. Under what conditions will a coupon bond sell at a p

Compute the present value of an annuity of 880 per year

Compute the present value of an annuity of $ 880 per year for 16 years, given a discount rate of 6 percent per annum. As

Compute the present value of an 1150 payment made in ten

Compute the present value of an $1,150 payment made in ten years when the discount rate is 12 percent. (Do not round int

Compute the present value of an annuity of 699 per year

Compute the present value of an annuity of $ 699 per year for 19 years, given a discount rate of 6 percent per annum. As