Ask Question, Ask an Expert

+61-413 786 465

info@mywordsolution.com

Ask Computer Engineering Expert

1. In this question, we are going to build a neural network (NN) classifier to predict red  wine quality (represented by an integer ranging from 0 to 10, higher means better) using a set of chemical properties. These properties are presented as attributes below: fixed acidity, volatile acidity, citric acid, residual sugar, chlorides, free sulfur dioxide, total sulfur dioxide, density, pH, sulphates, alcohol

The last attribute quality is the class label. Download the program "DataSplit2.exe" and execute it. Enter your student ID and specify the locations of the dataset file and the destination folder. The dataset will be split for you by clicking the "OK" button. Note that your training and testing datasets are unique to others. Make sure you enter the student ID correctly.

a. Show your training and testing datasets. Only the first 20 rows of each dataset are required in your answer.
b. Load both datasets into the MATLAB worksp Download the program "DataSplit2.exe" and execute it. Enter your student ID and specify the locations of the dataset file and the destination folder. The dataset will be split for you by clicking the "OK" button. Note that your training and testing datasets are unique to others. Make sure you enter the student ID correctly.

a. Show your training and testing datasets. Only the first 20 rows of each dataset are required in your answer.

b. Load both datasets into the MATLAB workspace. It is recommended to separate the class label (i.e. the attribute quality) from other attributes such that all the class labels of a dataset are stored in a matrix. As a result, there are four matrices after the import process, two for the attribute values from the two datasets, and the other two for the class labels from these datasets.

The class labels require encoding before they can be used for training and evaluating the NN classifier. Since there are 11 distinct class values (0 - 10), each class label is encoded into a column vector of 11 × 1. For a class value k, the k + 1 th row of the column vector is set to 1, while the others are zero. For example, if the class label is 4, then it is encoded into a column vector:

Therefore, if the dataset has N samples, then the class labels are encoded into an 11 × N matrix.

Show only the encoded class labels for the last 20 samples of both the training and testing datasets, i.e. two 11× 20 matrices.

c. The NN classifier is created using the following parameters:
Number of hidden layers: 1
Number of neurons: 10
Use default settings for other parameters. Train the classifier using the training dataset. Show the training performance by pasting the performance curve in your

answer.

Hint: Check carefully the dimension arrangement of the NN classifier, i.e. whether it considers a row or a column as a tuple.

d. Use the NN classifier to predict the qualities of the samples in the testing dataset. Show only the predicted class labels for the first 20 rows of the testing dataset.

e. Obtain and show the confusion matrix. What is the accuracy of the classifier?
Please submit your MATLAB source codes (in MATLAB script file) with the assignment answer. No marks will be given to your answer unless the relevant source codes are submitted.

2. We are going to mine some association rules from the supermarket transactions using WEKA.
a. Download the program "TransactionDataGenerator.exe" and execute it. Enter your student ID and specify the location of destination folder. The dataset will be generated for you by clicking the "OK" button. A transaction file will then be generated in CSV format. Each line row represents a single transaction, the first item is the transaction ID and the others are the goods bought by the customer.

Show only the first 20 lines (transactions) of your transaction file.

b. The transaction file generated must be converted to an attribute format (see appendix) that can be imported to WEKA. For example, a transaction file consists of five transactions as follows:
T001, jam
T002, bread, jam
T003, bread, butter
T004, jam
T005, bread The converted format is shown below:
t_id bread butter jam
T001 t
T002 t t
T003 t t
T004 t
T005 t
The converted transactions can be saved in CSV format. The content of the above converted format in CSV is like this: tid,bread,butter,jam
T001,,,t
T002,t,,t
T003,t,t,
T004,,,t
T005,t,,
Write a conversion program for this task. The list of all items is available at the Appendix. Show only the last 20 converted transactions.

Hints:

i. Since the transactions consist of different number of items, it is recommended to read the whole transaction as a string, i.e. all the N
transactions are put in an N × 1 cell array. You may find functions such as
textscan or importdata useful.

ii. Following (i), it is then necessary to separate the transactions Id and every item in a single transaction. The delimiter is a comma (","). You may find the regular expression function regexp useful.

iii. A transaction schema (i.e. all possible transaction items in the header line of the above converted format) is needed. You transactions might not cover all the items, but this does not affect the final results.

iv. The transaction schema should be implemented as an array in your source codes. Also, the item order in the array should be identical to the item order in the header line. This helps determining which column to put a ‘t' label for a transaction. You may find the function ismember useful.

c. Mine the association rules from the transactions using WEKA. Specify which algorithm you select and the related parameters such as minimum support and confidence. List the best 10 rules discovered with highest possible support and
confidence.

For Q2(b), please submit your source codes (using MATLAB or other languages) with the assignment answer.

Computer Engineering, Engineering

  • Category:- Computer Engineering
  • Reference No.:- M91606220

Have any Question?


Related Questions in Computer Engineering

Question suppose a computer using direct mapped cache has

Question : Suppose a computer using direct mapped cache has 2 20 words of main memory and a cache of 32 blocks, where each cache block contains 16 words. a. How many blocks of main memory are there? b. What is the format ...

Question when a syscall is called which register must have

Question : When a syscall is called which register must have the syscall number? Which syscall is a must for every program? Why?

In some states allow requires drivers to turn on their

In some states allow requires drivers to turn on their headlights when driving in the rain. A highway patrol officer believes that lesson one-quarter of all the drivers follow this rule. As a test, he randomly samples 20 ...

A banks assets equal its liabilities under a both

A bank's assets equal its liabilities under a. both 100-percent-reserve banking and fractional-reserve banking. b. 100-percent-reserve banking but not under fractional-reserve banking. c. fractional-reserve banking but n ...

Given an undirected graph with both positive and negative

Given an undirected graph with both positive and negative edge weights, design an algorithm to find a maximum spanning forest with the largest total edge weights.

Fiona told her friend that she is very fortunate as the

Fiona told her friend that she is very fortunate as the slow-down in the economy has not decreased sales in her grocery store by much compared to sales of new cars in his car dealership. Explain what Fiona meant using th ...

A sample of 1000 us households is taken and the average

A sample of 1,000 U.S. households is taken and the average amount of newspaper garbage or recycling is found to be 27.8 pounds with a standard deviation of 2 pounds. Estimate, with 90% confidence, the mean amount of news ...

Combustion analysis of a hydrocarbon produces 3301 g co2

Combustion analysis of a hydrocarbon produces 33.01 g CO2 and 27.03 g H 2O Calculate the empirical formula of the hydrocarbon. Express your answer as a chemical formula.

In thenbspworkspaceproject-lognbspdirectory create file

In the ~/workspace/project-log directory, create file named  changelog.txt  with the following content and format: Changelog Version: 1.0 Redirect the output of the ls command to a file named  file-list.txt  in the ~/wor ...

A bird is flying around the world at a rate of 256ftmin how

A bird is flying around the world at a rate of 25.6ft/min. How many weeks will it take to compete it's journey if the circumference of the earth is25,000 miles using factor label method.

  • 4,153,160 Questions Asked
  • 13,132 Experts
  • 2,558,936 Questions Answered

Ask Experts for help!!

Looking for Assignment Help?

Start excelling in your Courses, Get help with Assignment

Write us your full requirement for evaluation and you will receive response within 20 minutes turnaround time.

Ask Now Help with Problems, Get a Best Answer

Why might a bank avoid the use of interest rate swaps even

Why might a bank avoid the use of interest rate swaps, even when the institution is exposed to significant interest rate

Describe the difference between zero coupon bonds and

Describe the difference between zero coupon bonds and coupon bonds. Under what conditions will a coupon bond sell at a p

Compute the present value of an annuity of 880 per year

Compute the present value of an annuity of $ 880 per year for 16 years, given a discount rate of 6 percent per annum. As

Compute the present value of an 1150 payment made in ten

Compute the present value of an $1,150 payment made in ten years when the discount rate is 12 percent. (Do not round int

Compute the present value of an annuity of 699 per year

Compute the present value of an annuity of $ 699 per year for 19 years, given a discount rate of 6 percent per annum. As