Distance measurements for nominal attributes

Many partitioning methods use distance measures to determine the similarity or dissimilarity between any pair of objects; the measures for nominal attributes described here are one example. It is common to denote the distance between two instances x_i and x_j as d(x_i, x_j). A valid distance measure must be symmetric and must attain its minimum value (usually zero) for identical vectors. The distance measure is called a metric distance measure if it also satisfies the following properties:

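1. Triangle inequality: d(x_i, x_k) ≤ d(x_i, x_j) + d(x_j, x_k) for all instances x_i, x_j, x_k.
2. Identity: d(x_i, x_j) = 0 implies x_i = x_j for all instances x_i, x_j.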

When attributes are nominal, two main approaches can be used:

1. Simple matching:

   d(x_i, x_j) = (p - m) / p

   where p is the total number of attributes and m is the number of matches, that is, attributes on which x_i and x_j take the same state.

2. Creating a binary attribute for each state of each nominal attribute and computing the dissimilarity between the resulting binary vectors.
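
The two approaches can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function names, the example attributes (colour, size, shape) and their states are all hypothetical, and the text does not say which dissimilarity to use on the binary vectors, so the sketch assumes a plain count of differing positions.

```python
from typing import Hashable, List, Sequence


def simple_matching_distance(x: Sequence[Hashable], y: Sequence[Hashable]) -> float:
    """Approach 1: d = (p - m) / p, with p attributes and m matches."""
    if len(x) != len(y) or not x:
        raise ValueError("instances need the same, non-zero number of attributes")
    p = len(x)
    m = sum(a == b for a, b in zip(x, y))  # attributes with identical states
    return (p - m) / p


def to_binary(x: Sequence[Hashable], states: Sequence[Sequence[Hashable]]) -> List[int]:
    """Approach 2, first step: one binary attribute per state of each nominal attribute."""
    return [int(value == s) for value, attr_states in zip(x, states) for s in attr_states]


def binary_mismatches(u: Sequence[int], v: Sequence[int]) -> int:
    """Assumed dissimilarity on binary vectors: number of positions that differ."""
    return sum(a != b for a, b in zip(u, v))


# Hypothetical data: three nominal attributes and their possible states.
states = [["red", "green", "blue"], ["small", "large"], ["round", "square"]]
x_i = ["red", "small", "round"]
x_j = ["red", "large", "square"]

d = simple_matching_distance(x_i, x_j)   # p = 3, m = 1
b_i = to_binary(x_i, states)
b_j = to_binary(x_j, states)
mismatches = binary_mismatches(b_i, b_j)

print(d, b_i, b_j, mismatches)
```

With this encoding, each attribute on which the two instances disagree produces exactly two differing binary positions, so the mismatch count equals 2(p - m). Other binary dissimilarities could be substituted in `binary_mismatches` without changing the encoding step.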