(6 pts) P5-2: Provide the definition of the proximity matrix for hierarchical clustering algorithms. Consider the following data points and show their proximity matrix (you may leave the entries as squared distances, without calculating the square roots).

Figure 3: Hierarchical Clustering Data Points

(6 pts) P5-3: The clustering data points are shown in Figure 3. Provide the hierarchical clustering results with MAX similarity, numbering the steps in order, and show the dendrogram of that result.
Added by Patricia R.
Step 1
A proximity matrix is a square matrix that holds the distance (or similarity) between every pair of data points in a dataset. Here we use the squared Euclidean distance (the Euclidean distance without the final square root) as the measure of proximity. The element at …
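The definition can be illustrated with a minimal sketch. The coordinates below are hypothetical placeholders, since the points of Figure 3 are not reproduced in this text, and the function name `squared_euclidean_matrix` is an assumption made for illustration only.

```python
def squared_euclidean_matrix(points):
    """Return the n x n matrix of squared Euclidean distances between points."""
    n = len(points)
    return [
        [sum((a - b) ** 2 for a, b in zip(points[i], points[j])) for j in range(n)]
        for i in range(n)
    ]

# Hypothetical coordinates, NOT the data points of Figure 3.
points = [(0, 0), (1, 0), (3, 4)]
matrix = squared_euclidean_matrix(points)

for row in matrix:
    print(row)
```

The result is symmetric with zeros on the diagonal, because the distance from a point to itself is zero and the distance from i to j equals the distance from j to i.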
Recommended Videos
Problem 3 (20 points) Use the following similarity matrix to perform single and complete link hierarchical clustering. Show your results by drawing a dendrogram with R. The dendrogram should clearly show the order in which the points are merged. Please show your R code.

| | p1 | p2 | p3 | p4 | p5 |
|---|---|---|---|---|---|
| p1 | 1.00 | 0.10 | 0.41 | 0.55 | 0.35 |
| p2 | 0.10 | 1.00 | 0.64 | 0.47 | 0.98 |
| p3 | 0.41 | 0.64 | 1.00 | 0.44 | 0.85 |
| p4 | 0.55 | 0.47 | 0.44 | 1.00 | 0.76 |
| p5 | 0.35 | 0.98 | 0.85 | 0.76 | 1.00 |
Sri K.
Question 8: (15pt) One of the hierarchical cluster algorithms is the agglomerative (bottom-up) procedure. The procedure starts with n singleton clusters and forms a hierarchy by merging the most similar clusters until all the data points are merged into one single cluster. Let the distance between two data points be the Euclidean distance d(x,y) = √((x1 - y1)^2 + ... + (xd - yd)^2). Let the distance between two clusters A and B be min d(x,y) for x in A, y in B, that is, the minimum distance between the points from the two clusters. There are 5 observations: a, b, c, d, and e. Their Euclidean distances are given in the following matrix:

| | a | b | c | d | e |
|---|---|---|---|---|---|
| a | 0 | 4 | 3 | 6 | 11 |
| b | 4 | 0 | 5 | 7 | 10 |
| c | 3 | 5 | 0 | 9 | 2 |
| d | 6 | 7 | 9 | 0 | 13 |
| e | 11 | 10 | 2 | 13 | 0 |

For example, based on the matrix above, the distance between a and b is 4. Please derive the four steps in the agglomerative clustering procedure to construct the hierarchical clustering for this dataset. For each step, you need to specify which two clusters are merged and why you choose these two to merge.
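The four merge steps asked for in Question 8 can be checked with a short sketch of the minimum-distance (single link) procedure described there. The function name `single_link_merges` and the tuple representation of clusters are assumptions made for illustration; the distance values are copied from the question.

```python
labels = ["a", "b", "c", "d", "e"]
dist = [
    [0, 4, 3, 6, 11],
    [4, 0, 5, 7, 10],
    [3, 5, 0, 9, 2],
    [6, 7, 9, 0, 13],
    [11, 10, 2, 13, 0],
]

def single_link_merges(labels, dist):
    """Repeatedly merge the two clusters with the smallest minimum pairwise distance."""
    index = {name: i for i, name in enumerate(labels)}
    clusters = [(name,) for name in labels]  # start with singleton clusters
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Cluster distance = minimum distance over all cross-cluster pairs.
                d = min(dist[index[x]][index[y]]
                        for x in clusters[i] for y in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges

merges = single_link_merges(labels, dist)
for step, (left, right, d) in enumerate(merges, start=1):
    print(f"Step {step}: merge {left} and {right} at distance {d}")
```

On this matrix the sketch merges c and e at distance 2, then a with {c, e} at 3, then b at 4, and finally d at 6. Each step picks the pair of clusters whose closest members are nearest to each other.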
Adi S.
Problem 2 (10 points). Use the similarity matrix in the following table to perform single and complete link hierarchical clustering. Show your results by drawing a dendrogram. The dendrogram should show the order in which the points are merged. (Coding is not mandatory here.)

| | p1 | p2 | p3 | p4 | p5 |
|---|---|---|---|---|---|
| p1 | 1.00 | 0.10 | 0.41 | 0.55 | 0.35 |
| p2 | 0.10 | 1.00 | 0.64 | 0.47 | 0.98 |
| p3 | 0.41 | 0.64 | 1.00 | 0.44 | 0.85 |
| p4 | 0.55 | 0.47 | 0.44 | 1.00 | 0.76 |
| p5 | 0.35 | 0.98 | 0.85 | 0.76 | 1.00 |
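Although coding is optional for this problem, the merge order for the dendrogram can be checked with a small sketch. It works directly on similarities, so at every step the most similar pair of clusters is merged: single link scores two clusters by their most similar pair of members (`max`), and complete link by their least similar pair (`min`). The function name `link_merges` is an assumption made for illustration; the matrix values come from the problem.

```python
labels = ["p1", "p2", "p3", "p4", "p5"]
sim = [
    [1.00, 0.10, 0.41, 0.55, 0.35],
    [0.10, 1.00, 0.64, 0.47, 0.98],
    [0.41, 0.64, 1.00, 0.44, 0.85],
    [0.55, 0.47, 0.44, 1.00, 0.76],
    [0.35, 0.98, 0.85, 0.76, 1.00],
]

def link_merges(labels, sim, link):
    """Agglomerative clustering on similarities.

    link=max gives single link, link=min gives complete link.
    """
    index = {name: i for i, name in enumerate(labels)}
    clusters = [(name,) for name in labels]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Cluster similarity aggregated over all cross-cluster pairs.
                s = link(sim[index[x]][index[y]]
                         for x in clusters[i] for y in clusters[j])
                if best is None or s > best[0]:
                    best = (s, i, j)
        s, i, j = best
        merges.append((clusters[i], clusters[j], s))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges

single = link_merges(labels, sim, max)
complete = link_merges(labels, sim, min)
for name, result in (("single", single), ("complete", complete)):
    for step, (left, right, s) in enumerate(result, start=1):
        print(f"{name} link, step {step}: merge {left} and {right} at similarity {s}")
```

Both variants begin by merging p2 and p5 (similarity 0.98) and then add p3. They differ afterwards: single link attaches p4 and then p1 to the growing cluster, while complete link first pairs p1 with p4 and joins the two clusters last. The merge similarities give the heights at which to draw each join in the dendrogram.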
Aarti K.