K-Median

A clustering algorithm closely related to k-means. The practical difference between the two is as follows:

In k-means, centroids are determined by minimizing the sum of the squares of the distance between a centroid candidate and each of its examples.

In k-median, centroids are determined by minimizing the sum of the distance between a centroid candidate and each of its examples.
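The effect of squaring can be illustrated in one dimension, where Euclidean and Manhattan distance coincide. The sketch below is only an illustration: the sample values and the grid of candidate centers are hypothetical. It searches the grid for the center that minimizes each objective and compares the winners with the mean and the median.

```python
from statistics import mean, median

# Hypothetical 1-D cluster with one extreme value.
values = [1, 2, 3, 4, 100]

def sum_of_squares(center):
    """k-means style objective: sum of squared distances."""
    return sum((v - center) ** 2 for v in values)

def sum_of_distances(center):
    """k-median style objective: sum of (unsquared) distances."""
    return sum(abs(v - center) for v in values)

# Brute-force search over candidate centers 0.0, 0.1, ..., 100.0.
candidates = [x / 10 for x in range(0, 1001)]
best_for_squares = min(candidates, key=sum_of_squares)
best_for_distances = min(candidates, key=sum_of_distances)

print(best_for_squares, mean(values))      # minimizer of squares vs. the mean
print(best_for_distances, median(values))  # minimizer of distances vs. the median
```

In this sample the squared objective is minimized at the mean, which the outlier drags upward, while the unsquared objective is minimized at the median.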

Note that the definitions of distance are also different:

k-means relies on the Euclidean distance from the centroid to an example. (In two dimensions, the Euclidean distance means using the Pythagorean theorem to calculate the hypotenuse.) For example, the k-means distance between (2,2) and (5,-2) would be:

$$\text{Euclidean distance} = \sqrt{(2-5)^2 + (2-(-2))^2} = \sqrt{9 + 16} = 5$$

k-median relies on the Manhattan distance from the centroid to an example. This distance is the sum of the absolute deltas in each dimension. For example, the k-median distance between (2,2) and (5,-2) would be:

$$\text{Manhattan distance} = |2-5| + |2-(-2)| = 3 + 4 = 7$$
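Both distances can be checked with a few lines of Python. This is a minimal sketch; the function names `euclidean_distance` and `manhattan_distance` are chosen here for illustration and do not come from any particular library.

```python
import math

def euclidean_distance(p, q):
    """Square root of the sum of squared per-dimension differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def manhattan_distance(p, q):
    """Sum of the absolute per-dimension differences."""
    return sum(abs(a - b) for a, b in zip(p, q))

print(euclidean_distance((2, 2), (5, -2)))  # 5.0
print(manhattan_distance((2, 2), (5, -2)))  # 7
```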

Real-world uses

Created for this library

1. A logistics team uses k-median as a robust alternative to k-means when delivery locations include extreme outliers that distort means.
2. A retail analytics team uses k-median on customer features when a small group of high-spenders would otherwise pull cluster centers unhelpfully.
3. A risk analytics team uses k-median to cluster portfolios because the median is less sensitive to single-account anomalies than the mean.
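The outlier robustness these examples rely on can be seen in a small experiment. The following is a rough sketch of one common way to run k-median, not a reference implementation: it alternates between assigning each point to the nearest center by Manhattan distance and moving each center to the coordinate-wise median of its cluster. The function names, the sample points, and the starting centers are all hypothetical.

```python
from statistics import mean, median

def manhattan(p, q):
    """Sum of the absolute per-dimension differences."""
    return sum(abs(a - b) for a, b in zip(p, q))

def k_median(points, initial_centers, max_iterations=10):
    """Alternate assignment and median-update steps until centers stop moving."""
    centers = list(initial_centers)
    clusters = []
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest center (Manhattan distance).
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: manhattan(p, centers[i]))
            clusters[nearest].append(p)
        # Update step: coordinate-wise median; an empty cluster keeps its old center.
        new_centers = [
            tuple(median(coords) for coords in zip(*cluster)) if cluster else center
            for cluster, center in zip(clusters, centers)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers, clusters

# Hypothetical data: two small groups plus one extreme point at (100, 100).
points = [(1, 1), (2, 2), (3, 1), (2, 3), (100, 100), (10, 10), (11, 12), (12, 11)]
centers, clusters = k_median(points, initial_centers=[(1, 1), (10, 10)])

# For comparison, the mean of the cluster that absorbed the outlier.
mean_center = tuple(mean(coords) for coords in zip(*clusters[1]))

print(centers)      # median-based centers
print(mean_center)  # mean of the same cluster, pulled toward the outlier
```

On this sample the median-based center of the second cluster stays at (11.5, 11.5), close to the bulk of its points, whereas the mean of the same cluster lands at (33.25, 33.25).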
