
Problem 1: Download the letter recognition data from: http://archive.ics.uci.edu/ml/datasets/Letter+Recognition

The objective is to identify each of a large number of black-and-white rectangular pixel displays as one of the 26 capital letters in the English alphabet. The character images were based on 20 different fonts, and each letter within these 20 fonts was randomly distorted to produce a file of 20,000 unique stimuli. Each stimulus was converted into 16 primitive numerical attributes (statistical moments and edge counts), which were then scaled to fit into a range of integer values from 0 through 15. The attribute information is listed below; more information on the data and how it was used for data mining research can be found in the paper:

P. W. Frey and D. J. Slate. "Letter Recognition Using Holland-style Adaptive Classifiers". Machine Learning, Vol. 6, No. 2, March 1991.

Attribute Information:

1. lettr: capital letter (26 values from A to Z)
2. x-box: horizontal position of box (integer)
3. y-box: vertical position of box (integer)
4. width: width of box (integer)
5. high: height of box (integer)
6. onpix: total # on pixels (integer)
7. x-bar: mean x of on pixels in box (integer)
8. y-bar: mean y of on pixels in box (integer)
9. x2bar: mean x variance (integer)
10. y2bar: mean y variance (integer)
11. xybar: mean x y correlation (integer)
12. x2ybr: mean of x * x * y (integer)
13. xy2br: mean of x * y * y (integer)
14. x-ege: mean edge count left to right (integer)
15. xegvy: correlation of x-ege with y (integer)
16. y-ege: mean edge count bottom to top (integer)
17. yegvx: correlation of y-ege with x (integer)
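As a starting point, the attribute list above can be turned into a small loader. The sketch below assumes the data file stores one record per line as comma-separated values, with the letter first and the 16 integer attributes after it in the listed order. The function name `parse_records` and the two sample rows are illustrative only and are not taken from the real file.

```python
import csv
import io

# Column names in the order given by the attribute list.
COLUMNS = ["lettr", "x-box", "y-box", "width", "high", "onpix", "x-bar",
           "y-bar", "x2bar", "y2bar", "xybar", "x2ybr", "xy2br",
           "x-ege", "xegvy", "y-ege", "yegvx"]

def parse_records(text):
    """Parse comma-separated records into (label, features) pairs.

    Assumes the letter comes first, followed by 16 integers in 0..15.
    """
    records = []
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines
        if len(row) != len(COLUMNS):
            raise ValueError(f"expected {len(COLUMNS)} fields, got {len(row)}")
        label = row[0].strip()
        features = [int(value) for value in row[1:]]
        if not all(0 <= value <= 15 for value in features):
            raise ValueError(f"attribute outside the 0..15 range: {row}")
        records.append((label, features))
    return records

# Two made-up rows in the assumed layout (hypothetical values).
sample = ("A,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,0\n"
          "B,0,0,1,1,2,2,3,3,4,4,5,5,6,6,7,7\n")
records = parse_records(sample)
print(records[0])
```

To load the downloaded file, read it into a string first and pass that string to `parse_records`.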

Create a classification model for letter recognition using decision trees as a classification method with a holdout partitioning technique for splitting the data into training versus testing.
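A holdout partition can be sketched as a single shuffled split. The 70/30 ratio, the fixed seed, and the name `holdout_split` are assumptions made for illustration; the problem statement does not prescribe them. The placeholder records stand in for the real data.

```python
import random

def holdout_split(records, test_fraction=0.3, seed=42):
    """Shuffle a copy of the records and split it into (train, test)."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    shuffled = list(records)               # leave the caller's list untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed makes the split repeatable
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Placeholder records in (label, features) form, standing in for the real data.
data = [(chr(ord("A") + i % 26), [i % 16] * 16) for i in range(100)]
train, test = holdout_split(data, test_fraction=0.3)
print(len(train), len(test))
```

Fixing the seed keeps the training and testing sets identical across tree configurations, so accuracy differences reflect the configuration rather than the partition.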

a. Changing the values for the depth, the number of cases per parent, and the number of cases per leaf produces different tree configurations with different training and testing accuracies. Choose at least five different configurations and report the training and testing accuracy for each. Which configuration would you choose as the best model? Explain your answer.

b. For the best tree configuration, report the misclassification (confusion) matrix and interpret it. In your opinion, is accuracy a good way to interpret the performance of the model? If not, suggest other measures.

c. What are the three most important attributes for recognizing the letters?
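To make the three settings in part (a) concrete, here is a minimal pure-Python sketch of a Gini-based decision tree. It is not the output of any particular tool. The parameter names `max_depth`, `min_parent`, and `min_leaf` are assumed stand-ins for depth, cases per parent, and cases per leaf, and the toy rows are invented. In practice a library implementation would be the usual choice for 20,000 records.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels, min_leaf):
    """Return the (feature, threshold) with the lowest weighted Gini, or None."""
    best, best_score = None, gini(labels)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows})[:-1]:
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue  # enforce the "cases per leaf" limit
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if score < best_score:
                best, best_score = (f, t), score
    return best

def build_tree(rows, labels, max_depth, min_parent, min_leaf, depth=0):
    """Grow a tree; a leaf is a label, an internal node is (f, t, left, right)."""
    majority = Counter(labels).most_common(1)[0][0]
    if depth >= max_depth or len(rows) < min_parent or len(set(labels)) == 1:
        return majority
    split = best_split(rows, labels, min_leaf)
    if split is None:
        return majority
    f, t = split
    lo = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    hi = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    grow = lambda part: build_tree([r for r, _ in part], [y for _, y in part],
                                   max_depth, min_parent, min_leaf, depth + 1)
    return (f, t, grow(lo), grow(hi))

def predict(tree, row):
    """Follow the splits until a leaf label is reached."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree

def accuracy(tree, rows, labels):
    return sum(predict(tree, r) == y for r, y in zip(rows, labels)) / len(rows)

# Invented toy data: the class depends on the first attribute only.
train_rows = [[1, 3], [4, 9], [5, 1], [7, 2], [9, 14], [10, 6],
              [12, 0], [14, 8], [15, 5]]
train_labels = list("AAABBBCCC")
test_rows = [[2, 7], [8, 8], [13, 1]]
test_labels = list("ABC")

# Each configuration is (max_depth, min_parent, min_leaf).
results = {}
for config in [(1, 2, 1), (2, 2, 1), (2, 100, 1)]:
    tree = build_tree(train_rows, train_labels, *config)
    results[config] = (accuracy(tree, train_rows, train_labels),
                       accuracy(tree, test_rows, test_labels))
    print(config, results[config])
```

Comparing the training and testing columns across configurations is what shows underfitting (both low) or overfitting (training much higher than testing), which is the basis for choosing a best model in part (a).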

Problem 2: On the same data from Problem 1, apply a K-nearest neighbor classifier to classify the data. Report the following:

1. If you are doing any data transformation, explain the transformation and why it is needed.

2. Report the misclassification matrix and the appropriate performance metrics for different values of K (K=1, 3, 5, and 7).

3. Interpret the results and compare them with those obtained using the decision trees.
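The K-nearest neighbor steps above can be sketched in plain Python as follows. Because the 16 attributes are already scaled to the same 0 through 15 range, this sketch applies no rescaling. Squared Euclidean distance, the tie-breaking rule, and all function names are assumptions, and the toy points are invented for illustration.

```python
from collections import Counter

def knn_predict(train_rows, train_labels, row, k):
    """Majority vote among the k nearest training rows (squared Euclidean)."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(r, row)), i)
        for i, r in enumerate(train_rows)
    )
    nearest = [train_labels[i] for _, i in dists[:k]]
    votes = Counter(nearest)
    top = max(votes.values())
    # On a tied vote, return the tied label that appears closest to the query.
    for label in nearest:
        if votes[label] == top:
            return label

def confusion_matrix(actual, predicted, classes):
    """Rows are actual classes, columns are predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix

def per_class_metrics(matrix, classes):
    """Return {class: (precision, recall)} computed from the matrix."""
    metrics = {}
    for i, c in enumerate(classes):
        tp = matrix[i][i]
        actual_n = sum(matrix[i])                  # row total
        predicted_n = sum(r[i] for r in matrix)    # column total
        precision = tp / predicted_n if predicted_n else 0.0
        recall = tp / actual_n if actual_n else 0.0
        metrics[c] = (precision, recall)
    return metrics

# Invented toy data: two clusters plus one "B" point placed near the "A" cluster.
train_rows = [[1, 1], [2, 1], [1, 2], [14, 14], [13, 14], [14, 13], [3, 3]]
train_labels = ["A", "A", "A", "B", "B", "B", "B"]
test_rows = [[2, 2], [13, 13], [3, 2]]
test_labels = ["A", "B", "A"]
classes = ["A", "B"]

matrices = {}
for k in (1, 3, 5, 7):
    predicted = [knn_predict(train_rows, train_labels, r, k) for r in test_rows]
    matrices[k] = confusion_matrix(test_labels, predicted, classes)
    print(k, matrices[k], per_class_metrics(matrices[k], classes))
```

Per-class precision and recall read off the matrix show which letters are confused with each other, which a single accuracy figure hides. The same two helper functions can be applied to the decision tree predictions for the comparison in item 3.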


  • Category:- Computer Engineering
  • Reference No.:- M91787938
