Question 6. How does the silhouette score measure the quality of a clustering?
A) It measures how tight the clusters are and how well-separated they are from each other
B) It assesses the ratio of the number of clusters to the number of data points
C) It computes the sum of squared distances between data points and their assigned cluster centroids
D) It evaluates the ratio of between-cluster variance to within-cluster variance
Added by Alejandra A.
Step 1
Let's think step by step.
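To make the options concrete, here is a minimal sketch of the standard per-point silhouette definition, s = (b - a) / max(a, b), where a is the mean distance from a point to the other members of its own cluster and b is the mean distance to the members of the nearest other cluster. The function names, the choice of Euclidean distance, the convention of scoring singleton clusters as 0, and the four toy points are all assumptions made for illustration; they are not part of the original question.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def silhouette_samples(points, labels):
    """Per-point silhouette values s = (b - a) / max(a, b)."""
    # Group point indices by cluster label.
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("the silhouette needs at least two clusters")

    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            # Assumed convention: a singleton cluster scores 0.
            scores.append(0.0)
            continue
        # a: mean distance to the other points in the same cluster (tightness).
        a = sum(euclidean(points[i], points[j]) for j in own) / len(own)
        # b: mean distance to the nearest other cluster (separation).
        b = min(
            sum(euclidean(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores.append(0.0 if denom == 0 else (b - a) / denom)
    return scores


def silhouette_score(points, labels):
    """Mean silhouette value over all points."""
    values = silhouette_samples(points, labels)
    return sum(values) / len(values)


# Hypothetical toy data: two tight, well-separated pairs of points.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
good_score = silhouette_score(points, [0, 0, 1, 1])  # pairs kept together
bad_score = silhouette_score(points, [0, 1, 0, 1])   # pairs split apart
```

Under this definition a value close to 1 means a point is much nearer to its own cluster than to any other, and a negative value suggests it sits closer to a different cluster. On the toy data, the labeling that keeps each pair together scores higher than the one that splits the pairs.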
Key Concepts
Recommended Videos
K-Means Clustering
Emily H.
Please answer the multiple-choice questions below:

1. When may cosine distance be a good choice to measure differences between observations?
   - When dealing with observations that have both quantitative and categorical variables.
   - When similar patterns across the variables of two observations are more relevant to the application than similarity in the magnitude of the values.
   - When dealing with observations consisting of binary variables.
   - When dealing with observations consisting of ordinal variables.

2. If one wants to keep the largest difference between two observations in a cluster as small as possible, which linkage method may be most appropriate?
   - Ward's linkage.
   - Single linkage.
   - Group average linkage.
   - Complete linkage.

3. Why is it recommended to implement the k-means clustering algorithm with multiple starts?
   - The Euclidean distance measure commonly used by k-means clustering is inappropriate when there are non-globular clusters.
   - Multiple starts are necessary because k-means clustering tends to result in globular clusters.
   - The location of the initial k randomly selected centroids can have an impact on the final clusters.
   - The algorithm requires several iterations of assigning observations to centroids and recomputing cluster centroids to obtain the final clusters.

4. Which of the following is a true statement about the comparison of Euclidean distance versus Manhattan distance?
   - Manhattan distance scales better to higher dimensions.
   - Euclidean distance is more applicable to binary variables.
   - Manhattan distance is distorted less by outlier observations.
   - Euclidean distance is less expensive to compute.

5. What is a recommended way to determine the number of clusters in a k-means approach?
   - All of these.
   - Silhouette score.
   - Cluster interpretability.
   - Cluster stability.
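The distinction in question 1 between similarity of pattern and similarity of magnitude can be illustrated with a short sketch. The vectors and helper names below are hypothetical and chosen only for illustration: two observations with the same proportions but very different scales, and a third at a similar scale with the pattern reversed.

```python
import math


def cosine_distance(u, v):
    """1 minus the cosine of the angle between u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)


def euclidean_distance(u, v):
    """Straight-line distance between u and v."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Hypothetical observations, for illustration only.
small = [1.0, 2.0, 3.0]        # baseline pattern
large = [10.0, 20.0, 30.0]     # same pattern, ten times the magnitude
reversed_ = [3.0, 2.0, 1.0]    # similar magnitude, opposite pattern

cos_same_pattern = cosine_distance(small, large)
cos_other_pattern = cosine_distance(small, reversed_)
euc_same_pattern = euclidean_distance(small, large)
euc_other_pattern = euclidean_distance(small, reversed_)
```

With these inputs, cosine distance treats `small` and `large` as essentially identical because they point in the same direction, while Euclidean distance ranks `small` as far closer to `reversed_` because it responds to magnitude.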
Aishwarya K.
1. The elbow plot depicts:
   A. One clear number of clusters that should be used in a k-means cluster analysis.
   B. The total sum of within sums of squared deviations for all clusters divided by the total number of clusters.

2. Which of the following is NOT true about Ward's method for merging clusters?
   A. Ward's method uses both the cluster centroid and individual differences in the observations in the computations.
   B. Ward's method is essentially the same as the group average linkage method.
Akash M.