1. (5 points) What's the main difference between supervised and unsupervised learning? Give one benefit and one drawback for supervised and unsupervised learning, respectively.
2. (5 points) Will different initializations for k-means lead to different results?
3. (5 points) Give a short proof (can be in words but using correct logic) of why the k-means algorithm will converge in a finite number of iterations.
4. (5 points) What is the main difference between the k-means and generalized k-means algorithms? Explain how the choice of the similarity/dissimilarity/distance will impact the result.
5. (10 points) Consider the following simple graph: [figure: graph on nodes 1, 2, 3, 4, 5; edges not recoverable]. Write down the graph Laplacian matrix and find the eigenvectors associated with the zero eigenvalue. Explain how you find the number of disconnected clusters in the graph and identify these disconnected clusters using these eigenvectors.

Transcript

00:01 Hello students. The main difference between supervised and unsupervised learning is the presence of labeled data.

00:06 In supervised learning, the algorithm is trained on a labeled data set, where each data point has an associated target, or label.

00:15 So the goal is to learn a mapping from the input to the output based on the provided labels.

00:20 A benefit of a supervised algorithm is that it can make accurate predictions on new, unseen data.

00:25 A drawback is that it requires a large amount of labeled data for training, which can be expensive and time-consuming to obtain.

00:34 So training takes time, but it can make accurate predictions.

00:41 So here the training data we get is labeled data.

00:47 An unsupervised algorithm is given unlabeled data.

00:51 So here the data set is not labeled.

00:55 The task here is to find patterns or structure within the data.

01:01 It aims to discover hidden patterns, group similar data points, or reduce the dimensionality of the data.

01:07 A benefit of unsupervised learning is its ability to reveal underlying structure in the data without the need for labeled examples.

01:15 A drawback is that the results may be less interpretable, since there are no predefined labels.
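To make the contrast concrete, here is a minimal sketch, assuming a small synthetic data set of two blobs (the data, the nearest-class-mean classifier, and all variable names are made up for illustration and are not part of the original answer). The supervised half uses the labels `y` to fit one mean per class; the unsupervised half ignores `y` and only groups the points.

```python
import numpy as np

rng = np.random.default_rng(0)
# Two well-separated 2-D blobs (synthetic data, made up for illustration).
X = np.vstack([rng.normal(0.0, 0.5, size=(50, 2)),
               rng.normal(5.0, 0.5, size=(50, 2))])
y = np.array([0] * 50 + [1] * 50)  # labels, used only in the supervised case

# Supervised: use the labels to learn one mean per class, then predict.
class_means = np.array([X[y == c].mean(axis=0) for c in (0, 1)])

def predict(points):
    """Assign each point to the class whose mean is nearest."""
    d = np.linalg.norm(points[:, None, :] - class_means[None, :, :], axis=2)
    return d.argmin(axis=1)

new_points = np.array([[0.2, -0.1], [4.8, 5.3]])
predicted = predict(new_points)

# Unsupervised: ignore y and group the points with a few k-means steps.
# Starting centroids are the first and last rows, an arbitrary choice.
centroids = X[[0, -1]].copy()
for _ in range(10):
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    groups = d.argmin(axis=1)
    centroids = np.array([X[groups == k].mean(axis=0) for k in range(2)])
```

The supervised model returns a named class for each new point, while the unsupervised run only returns group indices whose meaning has to be interpreted afterwards.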

01:24 After this, the next question is about initialization.

01:28 Yes, different initializations for the k-means algorithm can lead to different results.

01:35 The final cluster assignments and the centroids can vary depending on the initial positions of the centroids.

01:42 So as we know very well, in the k-means algorithm the final cluster assignments and the centroids can vary depending on the initial positions of the centroids.

01:53 So here we take the centroids and build the clusters around them.

01:58 K-means is sensitive to the initial locations of the centroids because it can converge to a local minimum of the objective function.

02:07 So to mitigate this issue, it's common practice to run k-means multiple times with different initializations and choose the best result based on some criterion, such as minimizing the sum of squared distances.

02:19 So here we compare the different clusterings by their distances.

02:24 We take the criterion so that the distance is minimized.

02:29 So here we can say yes, different initializations for the k-means algorithm can lead to different results.
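The multiple-restart practice described above can be sketched as follows. This is a minimal illustration, assuming a plain Lloyd-style k-means written in NumPy; the function name `kmeans`, the synthetic data, and the choice of 10 restarts are assumptions for the example, not part of the original answer.

```python
import numpy as np

def kmeans(X, k, rng, n_iter=100):
    """Plain Lloyd-style k-means; returns (centroids, labels, sse)."""
    # Random initialization: pick k distinct data points as starting centroids.
    centroids = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        new_centroids = centroids.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                new_centroids[j] = members.mean(axis=0)
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    # Final assignment and sum of squared distances for the returned centroids.
    d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    labels = d.argmin(axis=1)
    sse = float(d[np.arange(len(X)), labels].sum())
    return centroids, labels, sse

# Synthetic data: four blobs (made up for illustration).
data_rng = np.random.default_rng(0)
X = np.vstack([data_rng.normal(c, 0.3, size=(30, 2))
               for c in [(0, 0), (4, 0), (0, 4), (4, 4)]])

# Run k-means from 10 different random initializations and keep the best run.
runs = [kmeans(X, 4, np.random.default_rng(seed)) for seed in range(10)]
sses = [r[2] for r in runs]
best_centroids, best_labels, best_sse = runs[int(np.argmin(sses))]
```

Printing `sses` shows whether the runs ended at different objective values; the run with the smallest value is the one kept.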

02:35 Now, the k-means algorithm will converge in a finite number of iterations because it monotonically decreases the objective function, which is the sum of squared distances between the data points and their assigned centroids.
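For reference, the objective mentioned here can be written out as below. The symbols are assumptions introduced for this note: $x_i$ are the data points, $c_i$ is the cluster index assigned to $x_i$, and $\mu_j$ is the centroid of cluster $j$.

```latex
J(c, \mu) = \sum_{i=1}^{n} \lVert x_i - \mu_{c_i} \rVert^2
% Assignment step: c_i^{t+1} = \arg\min_j \lVert x_i - \mu_j^{t} \rVert^2
% Update step:     \mu_j^{t+1} = \text{mean of the points } x_i \text{ with } c_i^{t+1} = j
% Neither step can increase J:
J(c^{t+1}, \mu^{t+1}) \le J(c^{t+1}, \mu^{t}) \le J(c^{t}, \mu^{t})
```

The right-hand inequality holds because each point moves to its nearest centroid, and the left-hand one because the mean minimizes the sum of squared distances to a fixed set of points.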

02:50 So here is a short proof.

02:51 The algorithm starts with an initial assignment of the data points to the clusters.

02:56 First we do the initialization by assigning them.

03:02 In each iteration it reassigns each point to its nearest centroid and updates the cluster centroids to minimize the sum of squared distances.

03:07 So here we have to minimize this squared distance.

03:13 This is the distance of each point from its centroid.

03:17 Each step decreases the value of the objective function or keeps it the same.

03:20 So in this way we reduce the objective function.

03:24 Since there is a finite number of data points and a finite number of possible cluster assignments, there are only a finite number of ways to assign the data points to the clusters...
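The claim that the objective never goes up can be checked numerically. The sketch below is an illustration under assumptions (random synthetic data, `k = 3`, 20 iterations, and the helper name `sse` are all made up here); it records the objective after every assignment step and every update step.

```python
import numpy as np

def sse(X, centroids, labels):
    """Sum of squared distances from each point to its assigned centroid."""
    return float(((X - centroids[labels]) ** 2).sum())

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))  # synthetic data, made up for illustration
k = 3
centroids = X[rng.choice(len(X), size=k, replace=False)].copy()
labels = np.zeros(len(X), dtype=int)  # arbitrary initial assignment
history = [sse(X, centroids, labels)]

for _ in range(20):
    # Assignment step: move every point to its nearest centroid.
    d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    labels = d.argmin(axis=1)
    history.append(sse(X, centroids, labels))
    # Update step: move every non-empty cluster's centroid to its mean.
    for j in range(k):
        members = X[labels == j]
        if len(members) > 0:
            centroids[j] = members.mean(axis=0)
    history.append(sse(X, centroids, labels))

# Every recorded value should be no larger than the one before it
# (a tiny tolerance absorbs floating-point rounding).
non_increasing = all(b <= a + 1e-9 for a, b in zip(history, history[1:]))
```

Because the sequence in `history` is non-increasing and there are only finitely many possible assignments, the values cannot keep strictly decreasing forever, which is the core of the finiteness argument.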
