Sum of squared error

Internal quality metrics usually measure the compactness of the clusters using some similarity measure. It usually measures the intra-cluster homogeneity, the inter-cluster separability or a combination of these two. It does not useany external information beside the data itself.

SSE is the simplest and mostwidely used criterion measure for clustering. It is calculated as:

where C_k is the set of instances in cluster k; μ_k is the vector mean of cluster k. The components of μ_k are calculated as:

where N_k=|C_k| is the number of instances belonging to cluster k.

Clustering methods that minimize the SSE criterion are often called minimum variance partitions, since by simple algebraic manipulation the SSE criterion may be written as:

The SSE criterion function is suitable for cases in which the clusters form compact clouds that are well separated from one another.

Additional minimum criteria to SSE may be produced by replacing the value of S_k with expressions such as: