The point: this coursework is designed to give you experience with, and hence more understanding of:

- Overfitting: finding a classifier that does very well on your training data doesn't mean it will do well on unseen (test) data.
- The relationship between overfitting and complexity of the classifier - the more degrees of freedom in your classifier, the more chances it has to overfit the training data.
- The relationship between overfitting and the size of the training set.
- Bespoke machine learning: you don't have to use one of the standard types of classifier. The 'client' may specifically want a certain type of classifier (here, a ruleset that works in a certain way), and you can develop algorithms that try to find the best possible such classifier.

Students who wish to complete the tasks below in another language, such as R, Matlab or Python, are welcome to do so, provided they have prior knowledge of that language.

The task specification below assumes that the majority of the class uses Weka. Please adapt the instructions accordingly if you use a different language.

1. Convert the above files into ARFF format and load them into Weka.
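If the data arrives as CSV, the conversion in step 1 can also be scripted outside Weka. The following is a minimal sketch, not a prescribed method: it assumes a CSV with a header row, simple attribute names without spaces, and no missing values. The function name `csv_to_arff` and the `nominal_columns` parameter are hypothetical. Forcing the class column to be nominal matters when class labels are numbers, because classifiers such as J48 expect a nominal class.

```python
import csv
import io


def csv_to_arff(csv_text, relation="coursework_data", nominal_columns=()):
    """Convert CSV text (header row first) into ARFF text.

    A column becomes a numeric attribute if every value parses as a number,
    unless its name is listed in nominal_columns. All other columns become
    nominal attributes that enumerate their distinct values.
    """
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    header, data = rows[0], rows[1:]

    def is_number(value):
        try:
            float(value)
            return True
        except ValueError:
            return False

    lines = [f"@relation {relation}", ""]
    for col, name in enumerate(header):
        values = [row[col] for row in data]
        if name not in nominal_columns and all(is_number(v) for v in values):
            lines.append(f"@attribute {name} numeric")
        else:
            labels = ",".join(sorted(set(values)))
            lines.append(f"@attribute {name} {{{labels}}}")
    lines += ["", "@data"]
    lines += [",".join(row) for row in data]
    return "\n".join(lines) + "\n"
```

For example, `csv_to_arff(open("train.csv").read(), nominal_columns=("label",))` would treat a hypothetical `label` column as the nominal class even if its values are digits.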

Dealing with big data sets: in CW2, you were given several options for dealing with large data sets in Weka (increasing the heap size for the Weka GUI, using the Weka command line with an increased heap, wrapping the Weka command line in scripts that automate the experiments, or simply reducing the size of the data set using Weka's randomization and attribute selection methods). You will have to choose one of these options for this coursework, too.

2. Create folders on your computer to store classifiers, screenshots and results of all your experiments, as explained below.

Your coursework will consist of two parts: in Part 1 you will work with Decision trees, and in Part 2 with Linear Classifiers and Neural Networks.

For each of the two parts, you will do the following:

3. Using the provided data sets, and Weka's facility for 10-fold cross validation, run the classifier, and note its accuracy for varying learning parameters provided by Weka. (Below you will find more instructions on those.) Record all your findings and explain them. Make sure you understand and can explain logically the meaning of the confusion matrix, as well as the information contained in the "Detailed Accuracy" field: TP Rate, FP Rate, Precision, Recall, F Measure, ROC Area.

4. Use Visualization tools to analyze and understand the results: Weka has comprehensive tools for visualization of, and manipulation with, Decision trees and Neural Networks.

5. Repeat steps 3 and 4, this time using the testing data set instead of Weka's cross validation.

6. Make new training and testing sets, by moving 3000 of the instances in the testing set into the training set. Then, repeat steps 3 and 4.

7. Make new training and testing sets again, this time enlarging the training set with 6000 instances from the testing set, and again repeat steps 3 and 4.

8. Analyse your results from the point of view of the problem of classifier over-fitting.
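For those working outside Weka, the re-splitting in steps 6 and 7 can be sketched as below. This is an illustration under assumptions: the spec does not say which instances to move, so choosing them at random with a fixed seed is an assumption made here, and the function name `move_instances` is hypothetical. Instances are treated as opaque list items (for example, rows read from a file).

```python
import random


def move_instances(train, test, n, seed=0):
    """Return (new_train, new_test) after moving n test instances into train.

    The n instances are picked at random (an assumption, see above); the
    seed makes the split reproducible across experiments.
    """
    if n > len(test):
        raise ValueError("cannot move more instances than the test set holds")
    rng = random.Random(seed)
    moved = set(rng.sample(range(len(test)), n))
    # Append the chosen test instances to the training set, in file order.
    new_train = list(train) + [test[i] for i in sorted(moved)]
    # Everything that was not chosen stays in the test set.
    new_test = [row for i, row in enumerate(test) if i not in moved]
    return new_train, new_test
```

Calling it with `n=3000` and then `n=6000` on the original sets gives the two enlarged training sets. Comparing training accuracy with test accuracy on each split is one way to see how the gap between them changes with training set size.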

Part 1. Decision tree learning.

In this part, you are asked to explore the following three decision tree algorithms implemented in Weka:
1. J48 Algorithm
2. User Classifier (This option allows you to construct decision trees semi-manually)
3. One other Decision tree algorithm.
You should compare their relative performance on the given data set. For this:
- Experiment with various decision tree parameters: binary splits or multiple branching, pruning, the confidence threshold for pruning, and the minimum number of instances permitted per leaf.
- Compare their relative performance using the confusion matrices as well as other metrics (TP Rate, FP Rate, Precision, Recall, F Measure, ROC Area). Note that different algorithms can perform differently on different metrics. Does this happen in your experiments? Discuss.
- When working with User Classifier, you will learn to work with both Data and Tree Visualizers in Weka. Please reduce the number of attributes as in CW2 to prototype more efficiently in Visualizers.
- Record all the above results by going through the steps 3-8.
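To make the link between a confusion matrix and the per-class metrics concrete, here is a small sketch using the standard definitions. It assumes the matrix is laid out as in Weka's output, with rows as actual classes and columns as predicted classes. The function name `per_class_metrics` is hypothetical. ROC Area is left out because it needs the classifier's scores, not just the counts.

```python
def per_class_metrics(matrix, k):
    """Compute metrics for class k from a square confusion matrix.

    matrix[i][j] is the number of instances whose actual class is i and
    whose predicted class is j (rows = actual, columns = predicted).
    """
    n = len(matrix)
    tp = matrix[k][k]
    fn = sum(matrix[k]) - tp                          # actual k, predicted other
    fp = sum(matrix[i][k] for i in range(n)) - tp     # actual other, predicted k
    tn = sum(map(sum, matrix)) - tp - fn - fp

    def ratio(a, b):
        # Avoid division by zero when a class is never seen or predicted.
        return a / b if b else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "tp_rate": recall,  # TP Rate and Recall are the same quantity
        "fp_rate": ratio(fp, fp + tn),
        "precision": precision,
        "recall": recall,
        "f_measure": ratio(2 * precision * recall, precision + recall),
    }
```

For the two-class matrix `[[40, 10], [5, 45]]`, class 0 has a TP Rate of 0.8 and an FP Rate of 0.1. Two classifiers with the same overall accuracy can differ on these per-class figures.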

Part 2. Neural Networks.

In this part, you will work with the MultilayerPerceptron algorithm in Weka.

- Run MultilayerPerceptron. Experiment with various Neural Network parameters: add or remove nodes, layers and connections, vary the learning rate, epochs and momentum, and validation threshold.
- You will need to work with Weka's Neural Network Visualiser in order to perform some of the above tasks. You are allowed to use smaller data sets when working with the Visualiser.
- Experiment with relative performance of Neural Networks and changing parameters. Base your comparative study on the output of confusion matrices as well as other metrics (TP Rate, FP Rate, Precision, Recall, F Measure, ROC Area).
- Record all the above results by going through the steps 3-8.
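To build intuition for how the learning rate, momentum and number of epochs interact, the sketch below applies the generic textbook gradient-descent-with-momentum update to a one-dimensional toy loss. It is an illustration only and is not claimed to reproduce the internals of Weka's MultilayerPerceptron. The names `train_with_momentum` and `toy_grad` are hypothetical.

```python
def train_with_momentum(grad, w0, learning_rate, momentum, epochs):
    """Minimise a 1-D function by gradient descent with momentum.

    Each epoch applies:
        step = momentum * previous_step - learning_rate * grad(w)
        w    = w + step
    Returns the list of weights visited, starting with w0.
    """
    w, step = w0, 0.0
    history = [w]
    for _ in range(epochs):
        step = momentum * step - learning_rate * grad(w)
        w += step
        history.append(w)
    return history


def toy_grad(w):
    """Gradient of the toy loss (w - 3)^2, whose minimum is at w = 3."""
    return 2.0 * (w - 3.0)


# Same learning rate and epochs, with and without momentum.
slow = train_with_momentum(toy_grad, 0.0, learning_rate=0.01, momentum=0.0, epochs=50)
fast = train_with_momentum(toy_grad, 0.0, learning_rate=0.01, momentum=0.9, epochs=50)
```

With these particular settings the run with momentum ends closer to the minimum after 50 epochs than the run without. Other combinations can overshoot or oscillate, which is the kind of behaviour worth looking for when varying the same parameters in Weka.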

9. Deep Learning and Deep Neural Networks have gained popularity recently. Do some research (using the web and the recommended textbook) to find out more about Deep Learning. Use algorithms and tools available in Weka or online. Write a one-page essay comparing Neural Networks and Deep Neural Networks.
