Question 1

Get the data set "food.txt" from GauchoSpace and read it into R. Alternatively, you can load this data set from the cluster.datasets package with the following code:

library(cluster.datasets)
data(nutrients.meat.fish.fowl.1959)
The data set contains the quantity of Energy, Protein, Fat, Calcium and Iron in 27 different aliments.

The task here is to find meaningful clusters in the data. To this end, perform the following:
1. Find clusters using a K-means algorithm. Try out different values of K and determine your best solution. The number of clusters you choose should be based on appropriate measures of fit, for example SSE as defined in the book IDM, and on the interpretability of the results. For each value of K that you try out, provide:

a. the centroids
b. the size of each cluster and a list of the aliments and their cluster membership
c. the ratio between-SS/total-SS
d. an interpretation (use your imagination) of each cluster formed, e.g. what are the summarizing characteristics of the aliments in group 1?
e. to answer part d above, you might find it useful to draw a parallel coordinate plot of the centroids
2. Apply hierarchical clustering using min, max and average distances (respectively single, complete and average methods in R).
a. For each method produce a dendrogram with the labels of the aliments
b. What are the differences, if any, between the three distance measures?
c. Can you identify clusters similar to those obtained by K-means clustering?
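
The steps in parts 1 and 2 could be organized along the following lines. This is only a sketch under stated assumptions: the helper names summarise_kmeans and compare_linkages are hypothetical, the seed and nstart values are arbitrary, and whether to standardize the variables first is a choice the assignment does not prescribe.

```r
# Sketch only: 'summarise_kmeans' and 'compare_linkages' are hypothetical
# helper names, not part of the assignment or of any package.

# Run K-means for one value of K and collect the quantities asked for in 1a-1c.
summarise_kmeans <- function(x, k, seed = 1) {
  set.seed(seed)                          # K-means depends on random starts
  fit <- kmeans(x, centers = k, nstart = 25)
  list(
    centroids  = fit$centers,             # 1a: one row per cluster
    sizes      = fit$size,                # 1b: number of aliments per cluster
    membership = fit$cluster,             # 1b: cluster label per row of x
    sse        = fit$tot.withinss,        # total within-cluster sum of squares
    ratio      = fit$betweenss / fit$totss  # 1c: between-SS / total-SS
  )
}

# Fit the three linkages from part 2 on Euclidean distances between rows.
compare_linkages <- function(x, methods = c("single", "complete", "average")) {
  d <- dist(x)
  fits <- lapply(methods, function(m) hclust(d, method = m))
  names(fits) <- methods
  fits
}

# Possible usage, assuming 'x' is a numeric matrix of the five nutrient
# columns with the aliment names as row names (an assumption about how the
# data were read in):
#   results <- lapply(2:6, function(k) summarise_kmeans(x, k))
#   matplot(t(results[[2]]$centroids), type = "l")  # crude parallel coordinates
#   fits <- compare_linkages(x)
#   plot(fits$complete)                             # dendrogram with labels
```

Because the five variables are measured in different units, running the helpers on scale(x) rather than x is one option worth comparing; the two can give different clusters.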

Additional exercises for PStat 231
Question 2
Perform PCA on the food.txt data and use a biplot to visualize the first two PCs and the variables. Based on the biplot, one can still identify groups (clusters) of aliments with similar characteristics.

a. Is the grouping obtained by PCA similar or different from that obtained by the clustering algorithms above? Explain with some detail.
b. Which technique do you find most useful in describing the data set? Why?
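
A minimal sketch of the PCA step, using prcomp and biplot from base R. The helper name pca_biplot is hypothetical, and standardizing the variables (scale. = TRUE) is an assumption made here because the nutrients are in different units; the assignment does not say which to use.

```r
# Sketch only: 'pca_biplot' is a hypothetical helper name.
# 'x' is assumed to be a numeric matrix or data frame of the nutrient columns,
# with the aliment names as row names so that they label the points.
pca_biplot <- function(x, draw = TRUE) {
  pca <- prcomp(x, center = TRUE, scale. = TRUE)  # standardizing is a choice
  if (draw) {
    biplot(pca, choices = 1:2)   # scores and variable loadings on PC1 and PC2
  }
  invisible(pca)
}

# Proportion of variance explained by each component:
#   pca <- pca_biplot(x)
#   pca$sdev^2 / sum(pca$sdev^2)
```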
Question 3
Suppose that we have four observations, for which we compute a dissimilarity matrix, given by

        1     2     3     4
1       0     0.3   0.4   0.7
2       0.3   0     0.5   0.8
3       0.4   0.5   0     0.45
4       0.7   0.8   0.45  0
For instance, the dissimilarity between the first and second observations is 0.3, and the dissimilarity between the second and fourth observations is 0.8.
a. On the basis of this dissimilarity matrix, sketch the dendrogram that results from hierarchically clustering these four observations using complete linkage. Be sure to indicate on the plot the height at which each fusion occurs, as well as the observations corresponding to each leaf in the dendrogram.

b. Suppose that we cut the dendrogram obtained in (a) such that two clusters result. Which observations are in each cluster?
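
A hand-drawn sketch for (a) and (b) can be checked in R with hclust. The block below enters the dissimilarities from the matrix above, placing 0 on the diagonal (the dissimilarity of an observation with itself, which is an assumption about the entries not printed in the question), and applies complete linkage.

```r
# Dissimilarity matrix from Question 3; the zero diagonal is assumed.
m <- matrix(c(0,   0.3, 0.4,  0.7,
              0.3, 0,   0.5,  0.8,
              0.4, 0.5, 0,    0.45,
              0.7, 0.8, 0.45, 0),
            nrow = 4, byrow = TRUE,
            dimnames = list(as.character(1:4), as.character(1:4)))

# Complete linkage: the distance between two clusters is the largest
# pairwise dissimilarity between their members.
hc <- hclust(as.dist(m), method = "complete")

hc$height           # fusion heights: 0.30 0.45 0.80
cutree(hc, k = 2)   # observations 1, 2 -> cluster 1; observations 3, 4 -> cluster 2

# plot(hc)  # uncomment to draw the dendrogram with fusion heights
```

Observations 1 and 2 fuse first at 0.3, then 3 and 4 at 0.45; the last fusion is at 0.8 because complete linkage takes the largest dissimilarity between the two pairs, which is the 0.8 between observations 2 and 4.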
