Distance measurements for nominal attributes
Many partitioning methods use distance measures to determine the similarity or dissimilarity between any pair of objects. It is common to denote the distance between two instances x_i and x_j as d(x_i, x_j). A valid distance measure must be symmetric and must attain its minimum value (usually zero) for identical vectors. The distance measure is called a metric distance measure if it also satisfies the following properties:
1. Triangle inequality: d(x_i, x_k) <= d(x_i, x_j) + d(x_j, x_k) for all x_i, x_j, x_k.
2. d(x_i, x_j) = 0 implies x_i = x_j.
When attributes are nominal, two main approaches can be used:
1. Simple matching:
   d(x_i, x_j) = (p - m) / p
   where p is the total number of attributes and m is the number of matches.
2. Creating a binary attribute for each state of each nominal attribute and computing the dissimilarity between the resulting binary vectors.
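
The two approaches can be illustrated with a minimal Python sketch. The function names, the example attributes, and the list of attribute domains are hypothetical and chosen only for illustration. The text does not say which dissimilarity to apply to the binary attributes, so the Jaccard dissimilarity is used here as one common assumption.

```python
def simple_matching_distance(x, y):
    """Simple matching dissimilarity (p - m) / p for two nominal instances."""
    if len(x) != len(y):
        raise ValueError("instances must have the same number of attributes")
    p = len(x)  # total number of attributes
    if p == 0:
        raise ValueError("instances must have at least one attribute")
    m = sum(1 for a, b in zip(x, y) if a == b)  # number of matching attributes
    return (p - m) / p


def to_binary(instance, domains):
    """Create one binary attribute per state of each nominal attribute."""
    bits = []
    for value, states in zip(instance, domains):
        bits.extend(1 if value == state else 0 for state in states)
    return bits


def jaccard_distance(u, v):
    """Jaccard dissimilarity between two binary vectors (an assumed choice)."""
    both = sum(1 for a, b in zip(u, v) if a == 1 and b == 1)
    either = sum(1 for a, b in zip(u, v) if a == 1 or b == 1)
    return 0.0 if either == 0 else 1 - both / either


# Hypothetical data: three nominal attributes (colour, size, shape).
domains = [["red", "green", "blue"], ["small", "large"], ["round", "square"]]
x = ("red", "small", "round")
y = ("red", "large", "round")

d_match = simple_matching_distance(x, y)  # p = 3, m = 2
d_binary = jaccard_distance(to_binary(x, domains), to_binary(y, domains))
```

In this example the two instances agree on two of three attributes, so simple matching gives (3 - 2) / 3. The binary encoding yields seven binary attributes, and the value obtained from them depends on the binary dissimilarity chosen, so the two approaches need not give the same number.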