Contents

- 1 What are the limitations of the K-means algorithm, and what are the alternatives, if any?
- 2 How is K-means different from the K-means++ algorithm?
- 3 Why do we use the K-means algorithm?
- 4 What is the main disadvantage of K-means clustering?
- 5 Is K-means a classifier?
- 6 Is K-means supervised or unsupervised?
- 7 What is the k-means algorithm for vector quantization?
- 8 What is the difference between clustering and quantization?

## What are the limitations of the K-means algorithm, and what are the alternatives, if any?

The most important limitations of simple k-means are: the user has to specify k (the number of clusters) in advance; k-means can only handle numerical data; and k-means assumes that clusters are spherical and that each cluster has roughly equal numbers of observations. Alternatives discussed below include k-means++ (which improves the initialization) and k-medoids (which works with arbitrary dissimilarity measures and uses actual data points as centers).

## How is K-means different from the K-means++ algorithm?

Both K-means and K-means++ are clustering methods that fall under unsupervised learning. The main difference between the two algorithms lies in the selection of the initial centroids around which the clustering takes place: K-means++ removes the main drawback of K-means, namely its dependence on the random initialization of the centroids.
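As a sketch of the difference, scikit-learn exposes both initializations through the `init` parameter (this assumes scikit-learn and NumPy are installed; the data set here is synthetic):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic data: four well-separated blobs.
X, _ = make_blobs(n_samples=300, centers=4, random_state=42)

# Plain k-means: initial centroids drawn uniformly at random from the data.
km_random = KMeans(n_clusters=4, init="random", n_init=1, random_state=0).fit(X)

# k-means++: each new centroid is sampled with probability proportional to its
# squared distance from the nearest centroid already chosen, which spreads the
# initial centroids out and reduces the dependence on initialization.
km_pp = KMeans(n_clusters=4, init="k-means++", n_init=1, random_state=0).fit(X)

print(km_random.inertia_, km_pp.inertia_)  # within-cluster sum of squares
```

With `n_init=1` the comparison isolates the effect of the initialization; in practice scikit-learn runs several initializations and keeps the best one.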

**Why KMeans is not recommended?**

k-means assumes that the variance of the distribution of each attribute (variable) is spherical, that all variables have the same variance, and that the prior probability of all k clusters is the same, i.e. that each cluster has roughly equal numbers of observations. If any one of these three assumptions is violated, k-means can fail.

**Is it possible to apply the K means algorithm which is based on squared Euclidean distances on categorical data?**

The k-means algorithm is not applicable to categorical data, as categorical variables are discrete and do not have any natural origin or ordering, so computing Euclidean distance in such a space is not meaningful.
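A tiny sketch of why: if categories are label-encoded as integers (an arbitrary choice made here for illustration), Euclidean distance invents an ordering that does not exist in the data.

```python
# Label-encode colors arbitrarily: red=0, green=1, blue=2.
red, green, blue = 0, 1, 2

# Euclidean distance now claims red is "closer" to green than to blue,
# purely as an artifact of the encoding, not of any real similarity.
d_red_green = abs(red - green)  # 1
d_red_blue = abs(red - blue)    # 2
print(d_red_green < d_red_blue)  # True, for no meaningful reason
```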

### Why do we use the K-means algorithm?

The K-means clustering algorithm is used to find groups which have not been explicitly labeled in the data. This can be used to confirm business assumptions about what types of groups exist or to identify unknown groups in complex data sets.

### What is the main disadvantage of K-means clustering?

It requires the number of clusters (k) to be specified in advance. It cannot handle noisy data and outliers well. It is not suitable for identifying clusters with non-convex shapes.

**Which is faster k-means or K Medoids?**

K-means attempts to minimize the total squared error, while k-medoids minimizes the sum of dissimilarities between points labeled as belonging to a cluster and a point designated as the center of that cluster. In contrast to the k-means algorithm, k-medoids chooses actual data points as centers (medoids or exemplars). K-means is generally faster, because its centroid update is a simple mean, whereas each k-medoids update searches over the cluster's members; k-medoids, in turn, is more robust to noise and outliers.

**What is the K Medoids method?**

k-medoids is a classical partitioning technique of clustering that splits a data set of n objects into k clusters, where the number k of clusters is assumed to be known a priori (which implies that the programmer must specify k before executing a k-medoids algorithm).
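A minimal sketch of the alternating k-medoids scheme in NumPy (a simplified version, not the full PAM algorithm; the function name and the toy data are illustrative):

```python
import numpy as np

def k_medoids(X, k, n_iter=100, seed=0):
    """Simplified alternating k-medoids: assign points, then re-pick each medoid."""
    rng = np.random.default_rng(seed)
    n = len(X)
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)  # pairwise
    medoids = rng.choice(n, size=k, replace=False)  # medoids are data points
    for _ in range(n_iter):
        labels = dist[:, medoids].argmin(axis=1)    # assign to nearest medoid
        new_medoids = medoids.copy()
        for j in range(k):
            members = np.where(labels == j)[0]
            if len(members) == 0:
                continue  # keep the old medoid for an empty cluster
            # New medoid: the member minimizing total dissimilarity to its cluster.
            within = dist[np.ix_(members, members)].sum(axis=1)
            new_medoids[j] = members[within.argmin()]
        if np.array_equal(new_medoids, medoids):
            break  # converged: medoids stopped changing
        medoids = new_medoids
    labels = dist[:, medoids].argmin(axis=1)
    return medoids, labels

# Two well-separated toy blobs.
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0, 0.5, (20, 2)), rng.normal(5, 0.5, (20, 2))])
medoids, labels = k_medoids(X, k=2)
```

Note that the returned centers are indices of actual data points, unlike k-means centroids, which are means and generally lie between points.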

#### Is K-means a classifier?

K-means is an unsupervised classification (clustering) algorithm that groups objects into k groups based on their characteristics. It is not a classifier in the supervised sense, since it uses no predefined class labels.

#### Is K-means supervised or unsupervised?

K-means is a clustering algorithm that tries to partition a set of points into K sets (clusters) such that the points in each cluster tend to be near each other. It is unsupervised because the points have no external classification.

**Is K-means a supervised learning algorithm?**

K-Means clustering is an unsupervised learning algorithm. There is no labeled data for this clustering, unlike in supervised learning. K-Means performs the division of objects into clusters that share similarities and are dissimilar to the objects belonging to another cluster.

**What is K-means algorithm with example?**

The K-means clustering algorithm computes the centroids and iterates until it finds the optimal centroids. In this algorithm, the data points are assigned to clusters in such a manner that the sum of the squared distances between the data points and their centroids is minimized.
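These two steps, assignment and centroid update, can be sketched in a few lines of NumPy; this is an illustrative Lloyd iteration, not an optimized implementation:

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Minimal Lloyd iteration: assign points, then move centroids to the means."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]  # random init
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=-1)
        labels = d.argmin(axis=1)
        # Update step: each centroid moves to the mean of its assigned points
        # (an empty cluster keeps its old centroid).
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break  # converged: centroids stopped moving
        centroids = new_centroids
    labels = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=-1).argmin(axis=1)
    return centroids, labels

# Toy example: two well-separated blobs.
rng = np.random.default_rng(7)
X = np.vstack([rng.normal(0, 0.5, (25, 2)), rng.normal(5, 0.5, (25, 2))])
centroids, labels = kmeans(X, k=2)
```

Each full pass never increases the sum of squared distances, which is why the iteration converges.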

## What is the k-means algorithm for vector quantization?

K-means, also called Lloyd's algorithm, consists of starting from an arbitrary set of knots (or codebook) and iteratively replacing each one by the L^p-median (or simply by the mean, for quadratic quantization) of the probability distribution conditioned on falling in the Voronoi cell of that knot.
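A sketch of quadratic vector quantization using SciPy's classic `scipy.cluster.vq` module (this assumes SciPy and NumPy are installed; the "signal" here is synthetic):

```python
import numpy as np
from scipy.cluster.vq import kmeans, vq

rng = np.random.default_rng(0)
signal = rng.normal(size=(1000, 2))  # vectors to quantize

# kmeans() builds the codebook (the set of knots) and reports the distortion.
codebook, distortion = kmeans(signal, 8)

# vq() maps each vector to the index of its nearest codeword, i.e. the
# Voronoi cell it falls in; indexing the codebook gives the reconstruction.
codes, dists = vq(signal, codebook)
quantized = codebook[codes]
```

Each vector is thus replaced by the mean of its Voronoi cell, which is exactly the quadratic quantization described above.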

## What is the difference between clustering and quantization?

Quantization comes in two kinds: scalar and vector. The k-means algorithm is applied for vector quantization, and it is likewise applied for vector clustering. So is vector quantization the same as vector clustering? Essentially, yes: they are the same operation viewed from different fields.

**How are Voronoi regions used in vector quantization?**

In vector quantization, the input space is partitioned into a set of convex areas, also referred to as Voronoi regions or cells. K-means, the name mostly used in computer science, actually originated as vector quantization in signal processing and pulse code modulation back in the fifties.

**What is the k value for color quantization?**

Identify the number of clusters you need: the value of K. The K value for color quantization is the number of colors with which we want to represent the image. For example, for a 4×4 image, if we want to represent the same image with only 5 colors, then our K value will be 5.
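As a sketch of the whole color-quantization step with scikit-learn (this assumes scikit-learn and NumPy are installed; a random array stands in for a real 4×4 image here):

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(4, 4, 3), dtype=np.uint8)  # toy 4x4 RGB image

K = 5  # we want to represent the image with only 5 colors
pixels = image.reshape(-1, 3).astype(float)  # one row per pixel

km = KMeans(n_clusters=K, n_init=10, random_state=0).fit(pixels)
palette = km.cluster_centers_.astype(np.uint8)        # the 5 representative colors
quantized = palette[km.labels_].reshape(image.shape)  # each pixel -> nearest color
```

The cluster centers form the palette, and every pixel is replaced by the palette color of its cluster, so the result contains at most K distinct colors.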