VisualiSAR: A Web-based SAR Tool

David J. Wild and C. John Blankley


Parke-Davis Pharmaceutical Research
Division of Warner-Lambert Company
2800 Plymouth Road, Ann Arbor, MI 48105, USA

This is an HTML version of a talk given at
Daylight MUG 99, Santa Fe, NM
February 23-26, 1999

Overview of presentation

The origins of VisualiSAR

VisualiSAR techniques

Technique 1: Ward's clustering

Technique 2: Modal fingerprints and Stigmata coloring

The diagram shows a list of fingerprints on the left. The generalized modal holds, for each bit position, the number of fingerprints in the set that have that bit set. A specific modal may then be calculated from it: for instance, the diagram shows the modal at the 75% level, which has a bit set if at least 75% of the compounds have that bit set. Once a specific modal has been calculated, various modal measures can be calculated for each fingerprint, such as MSIM (Tanimoto similarity to the modal), MODP (fraction of the modal's bits that are also set in the fingerprint) and PMOD (fraction of the fingerprint's bits that are also set in the modal). The Stigmata coloring scheme colors each atom by the fraction of the paths passing through it that are represented by bits set in the modal; that is, the atoms are colored by 'commonality' to the set.
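The modal calculation and the three measures described above can be sketched in a few lines. This is only an illustration, not the toolkit's C code: it assumes each fingerprint is represented as a Python set of the indices of its set bits, and the function names are made up for the example.

```python
from collections import Counter

def generalized_modal(fingerprints):
    """Count, for every bit, how many fingerprints have that bit set."""
    counts = Counter()
    for fp in fingerprints:
        counts.update(fp)
    return counts

def specific_modal(fingerprints, level=0.75):
    """Bits that are set in at least `level` (a fraction) of the fingerprints."""
    counts = generalized_modal(fingerprints)
    threshold = level * len(fingerprints)
    return {bit for bit, count in counts.items() if count >= threshold}

def modal_measures(fp, modal):
    """MSIM, MODP and PMOD for one fingerprint against a specific modal."""
    common = len(fp & modal)
    union = len(fp | modal)
    return {
        "MSIM": common / union if union else 0.0,      # Tanimoto to the modal
        "MODP": common / len(modal) if modal else 0.0,  # share of modal bits in fp
        "PMOD": common / len(fp) if fp else 0.0,        # share of fp bits in modal
    }

# Four toy fingerprints (sets of set-bit indices, an assumption of this sketch).
fps = [{1, 2, 3, 4}, {1, 2, 3}, {1, 2, 5}, {1, 6}]
modal = specific_modal(fps, 0.75)   # bits 1 and 2 occur in at least 3 of 4
for fp in fps:
    print(sorted(fp), modal_measures(fp, modal))
```

In this toy set only bits 1 and 2 reach the 75% level, so the first fingerprint has MSIM 0.5, MODP 1.0 and PMOD 0.5.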

The following images show how sorting a cluster of compounds by MSIM and then coloring it can help to reveal the features common to the cluster. The first image shows the cluster without any sorting or coloring. The second shows the same cluster sorted by MSIM (using the 75% modal) and colored to show the common features.
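The per-atom score behind the Stigmata coloring might be computed roughly as follows. The sketch makes two simplifying assumptions that are not taken from the talk: each path is supplied as a tuple of atom indices together with a single fingerprint bit, and the mapping from score to display color is left out. The function name is hypothetical.

```python
def stigmata_scores(paths, modal, n_atoms):
    """Per-atom fraction of paths through the atom whose bit is in the modal.

    paths   -- iterable of (atom_indices, bit) pairs (assumed input format)
    modal   -- set of bit indices that are set in the specific modal
    n_atoms -- number of atoms in the molecule
    """
    through = [0] * n_atoms    # paths passing through each atom
    in_modal = [0] * n_atoms   # of those, paths whose bit is set in the modal
    for atoms, bit in paths:
        for atom in set(atoms):
            through[atom] += 1
            if bit in modal:
                in_modal[atom] += 1
    return [m / t if t else 0.0 for m, t in zip(in_modal, through)]

# A three-atom chain 0-1-2 with every path of length 1 to 3 (toy data).
paths = [
    ((0,), 10), ((1,), 11), ((2,), 12),
    ((0, 1), 20), ((1, 2), 21),
    ((0, 1, 2), 30),
]
modal = {10, 11, 20}
scores = stigmata_scores(paths, modal, 3)
print(scores)
```

Atoms with a score near 1 sit in substructures shared by most of the set, while atoms with a score near 0 are unusual for the set.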

With large clusters, viewing a representative sample of the cluster can be useful, as can viewing the top- and bottom-ranked compounds from the set.
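One plausible way to pick such views from a cluster ranked by MSIM is sketched below. How VisualiSAR actually chooses its sample is not stated here, so the evenly spaced selection is an assumption, as are the function names and the set-of-bits fingerprint representation.

```python
def tanimoto(a, b):
    """Tanimoto similarity between two sets of set-bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def rank_by_msim(compounds, modal):
    """Names of compounds sorted from most to least similar to the modal."""
    return sorted(compounds,
                  key=lambda name: tanimoto(compounds[name], modal),
                  reverse=True)

def representative_sample(ranked, k=9):
    """Up to k items spread evenly through the ranking, first and last included."""
    n = len(ranked)
    if n <= k:
        return list(ranked)
    if k == 1:
        return [ranked[0]]
    return [ranked[i * (n - 1) // (k - 1)] for i in range(k)]

def top_and_bottom(ranked, k=3):
    """The k highest-ranked and k lowest-ranked items."""
    return ranked[:k], ranked[-k:]

compounds = {"a": {1, 2}, "b": {1, 2, 3, 4}, "c": {1, 5, 6}}
ranked = rank_by_msim(compounds, modal={1, 2})
print(ranked)
print(representative_sample(list(range(20)), k=9))
print(top_and_bottom(list(range(20)), k=3))
```

Sampling at even intervals through the ranking gives a view that spans the whole range of similarity to the modal rather than only the most typical compounds.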

Technique 3: VisualiSAR Web interface

The fingerprint manipulation toolkit lies behind much of the functionality of VisualiSAR. It is a set of C programs that work on structures in a common format (based on TDT) for clustering, general processing, fingerprint analysis and display. The programs read from standard input and write to standard output, so they can easily be piped together in Unix and used from Perl scripts. Some of the programs utilize the Daylight toolkit, although the general format is independent of fingerprint type.
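The stream-in, stream-out design can be illustrated with a small filter. This is not one of the toolkit's programs and does not use its TDT-based format: the one-record-per-line layout (a name, a tab, then space-separated bit indices) and the function names are invented purely for the example.

```python
import io

def parse_record(line):
    """Split 'name<TAB>bit bit bit' into a name and a set of bit indices."""
    name, _, bits = line.rstrip("\n").partition("\t")
    return name, {int(b) for b in bits.split()}

def format_record(name, bits):
    """Inverse of parse_record, with the bits written in ascending order."""
    return name + "\t" + " ".join(str(b) for b in sorted(bits))

def filter_min_bits(instream, outstream, min_bits):
    """Copy records with at least min_bits set bits; return how many were kept."""
    kept = 0
    for line in instream:
        if not line.strip():
            continue  # skip blank lines
        name, bits = parse_record(line)
        if len(bits) >= min_bits:
            outstream.write(format_record(name, bits) + "\n")
            kept += 1
    return kept

# Chain two stages through in-memory streams, as a shell pipe would.
source = io.StringIO("a\t1 2 3\nb\t4\n\nc\t5 6\n")
stage1 = io.StringIO()
filter_min_bits(source, stage1, 2)
stage1.seek(0)
stage2 = io.StringIO()
filter_min_bits(stage1, stage2, 3)
print(stage2.getvalue(), end="")
```

Because every stage takes and returns the same record format, the same function could be called with `sys.stdin` and `sys.stdout` in a script and combined with other such scripts using an ordinary Unix pipe.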

VisualiSAR is one example application built on top of the fingerprint manipulation toolkit. It is written in Perl, and users interact with it through a CGI / HTML interface. VisualiSAR uses standard HTML, with no Java or JavaScript. Our philosophy was to apply good information design and interface design when developing the interface, and then to implement it using the simplest technology possible.

On the opening screen, compounds can be supplied as a SMILES or SD file, or pasted into the paste box. VisualiSAR then shows an initial view of the data set, with a sample of nine compounds (representing different levels of similarity to the modal). The toolbar on the left gives the user access to the functionality of VisualiSAR. After clustering, the user can scroll through the clusters, or jump directly to a particular cluster using the links on the left. Here, each cluster has also been colored by commonality to the modal of the cluster.

VisualiSAR features

VisualiSAR scope

An example SAR strategy with VisualiSAR

A good example is a cluster of penicillin-like compounds with different bioavailabilities (shown beside the name at the top), sorted and colored by similarity to the modal. The effect of small changes in structure on bioavailability can be seen: for example, the addition of a methyl (highlighted in green) in ampicillin more than doubles its bioavailability over penicillin. Note also that penicillin is sorted to the top, as it is the most 'representative' compound of the set (it has the highest value of MSIM), and that the compounds become more unusual reading right and down.

Another approach is to cluster only the actives, and then to search the inactives for compounds similar to those in an active cluster (by sorting and coloring the inactives by similarity to the modal of the active cluster). Features that may be responsible for activity and inactivity can thus be highlighted. An active cluster and its similar inactives can be compared using the split-screen capability of VisualiSAR (active cluster at the top).
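The ranking step of this strategy could look something like the sketch below. It assumes fingerprints are sets of set-bit indices and uses the 75% modal; the similarity cutoff and all function names are hypothetical additions for the example, since the talk only describes sorting the inactives by similarity to the active cluster's modal.

```python
from collections import Counter

def modal(fingerprints, level=0.75):
    """Bits set in at least `level` (a fraction) of the fingerprints."""
    counts = Counter(bit for fp in fingerprints for bit in fp)
    threshold = level * len(fingerprints)
    return {bit for bit, count in counts.items() if count >= threshold}

def tanimoto(a, b):
    """Tanimoto similarity between two sets of set-bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def similar_inactives(active_cluster, inactives, level=0.75, cutoff=0.5):
    """(name, MSIM) pairs for inactives at or above cutoff, most similar first.

    MSIM is measured against the modal of the active cluster; `cutoff` is an
    illustrative parameter, not something taken from VisualiSAR.
    """
    active_modal = modal(list(active_cluster.values()), level)
    scored = [(name, tanimoto(fp, active_modal)) for name, fp in inactives.items()]
    scored = [pair for pair in scored if pair[1] >= cutoff]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

actives = {
    "act1": {1, 2, 3},
    "act2": {1, 2, 3, 4},
    "act3": {1, 2, 3, 5},
    "act4": {1, 2, 6},
}
inactives = {"in1": {1, 2, 3, 7}, "in2": {8, 9}, "in3": {1, 2}}
hits = similar_inactives(actives, inactives)
print(hits)
```

Here the active modal is bits 1, 2 and 3, so `in1` and `in3` are returned as close inactive analogues while `in2`, which shares nothing with the modal, is dropped.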

Summary

Future developments

Spinoff fingerprint & clustering research (in progress)

Algorithm for improved alignment of depictions

To align depictions A and B: This is shown in before and after depictions

Acknowledgements

If you are viewing this at MUG99, you may try out VisualiSAR