Application 2: Clustering/Diversity
The Tanimoto coefficient (or similarity) between tautomeric forms of the same compound can be extremely low: 0.1957 for the real example above.
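Normalizing tautomers to a single form (or seeding clusters with tautomers) keeps different forms of the same compound from being treated as unrelated structures, which improves the quality of clustering and diversity analysis, especially for synthetic data sets.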
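To make the similarity figure concrete, here is a minimal sketch of the Tanimoto coefficient, assuming each fingerprint is represented as the set of its "on" bit positions. The function name `tanimoto` and the two fingerprints are hypothetical and chosen only for illustration; they are not the fingerprints behind the 0.1957 value quoted above.

```python
from typing import AbstractSet


def tanimoto(a: AbstractSet[int], b: AbstractSet[int]) -> float:
    """Tanimoto coefficient: shared on-bits divided by total distinct on-bits."""
    union = len(a | b)
    if union == 0:
        # Assumed convention: two empty fingerprints count as identical.
        return 1.0
    return len(a & b) / union


# Hypothetical on-bit sets for two tautomeric forms of one compound.
# They share only two bits, so the coefficient is low even though
# both describe the same substance.
fp_form_a = {1, 4, 7, 9, 12}
fp_form_b = {1, 4, 8, 10, 13, 15}

print(f"{tanimoto(fp_form_a, fp_form_b):.4f}")
```

If both forms were first normalized to the same structure, their fingerprints would be identical and the coefficient would be 1.0, so they would fall into the same cluster.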