Improvements to Daylight Clustering

Jack Delany, John Bradshaw

DAYLIGHT Chemical Information Systems, Inc., Mission Viejo, CA, USA

Introduction

Daylight v4.8 implements the Jarvis-Patrick clustering algorithm with some refinements and variations. The clustering package consists of seven command-line programs, each of which performs one step of the clustering process: fingerprint, nearneighbors and mergeneighbors, jarpat and jpscan, and showclusters and listclusters. Documentation on the use of the programs is available here.

We've been working in three areas: improving the behavior of Jarvis-Patrick, adding new clustering and selection techniques, and adding new measures that can be used across the clustering techniques.
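For orientation, the basic Jarvis-Patrick method named above can be sketched in a few lines of Python. This is a minimal illustration of the textbook algorithm, not the Daylight implementation: the function names (tanimoto, nearest_neighbors, jarvis_patrick), the representation of fingerprints as sets of on-bits, and the parameter names k and k_min are all assumptions made for the example. Two items are joined when each appears in the other's nearest-neighbor list and the two lists share at least k_min members. Conventions differ on whether an item counts as its own neighbor; this sketch excludes it. Note that equal similarities are broken here by index order, which is an arbitrary choice.

```python
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints given as sets of on-bits."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0


def nearest_neighbors(fps, k):
    """Return, for each item, the indices of its k most similar other items."""
    lists = []
    for i, fp in enumerate(fps):
        others = [j for j in range(len(fps)) if j != i]
        # Descending similarity; ties are broken by index (arbitrary choice).
        others.sort(key=lambda j: (-tanimoto(fp, fps[j]), j))
        lists.append(others[:k])
    return lists


def jarvis_patrick(fps, k, k_min):
    """Cluster items with the basic Jarvis-Patrick criterion.

    Items i and j are joined when each is in the other's k-nearest-neighbor
    list and the two lists have at least k_min members in common.
    """
    nn_sets = [set(lst) for lst in nearest_neighbors(fps, k)]
    parent = list(range(len(fps)))  # union-find forest

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in combinations(range(len(fps)), 2):
        mutual = j in nn_sets[i] and i in nn_sets[j]
        shared = len(nn_sets[i] & nn_sets[j])
        if mutual and shared >= k_min:
            parent[find(j)] = find(i)

    clusters = {}
    for i in range(len(fps)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Hypothetical toy data: two groups of three closely related fingerprints.
example_fps = [
    {1, 2, 3, 4}, {1, 2, 3, 5}, {1, 2, 4, 5},
    {10, 11, 12, 13}, {10, 11, 12, 14}, {10, 11, 13, 14},
]
loose = jarvis_patrick(example_fps, k=2, k_min=1)
strict = jarvis_patrick(example_fps, k=2, k_min=2)
```

With the toy data, a low k_min merges each group of three into one cluster, while a k_min equal to the list length leaves every item a singleton, which shows how sensitive the result is to the two parameters.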

Ties Handling in Jarvis-Patrick

Sphere Exclusion Clustering

User-defined Similarity Measures

Kmodes Clustering

Acknowledgements/References