Source | # of structures | 2-D parent matches in glax | Near neighbours in glax | Near neighbours within database |
Specs | 1897 | |||
IBS-NP | 3000 | |||
MS-9402 | 960 | |||
MS-9502 | 953 (7 compounds with no structures) | |||
MS-9504 | 960 | |||
MS-9505 | 835 | |||
Chembridge | 2500 | |||
Glax96_2 ComGenex | 4969 | |||
On_Stock LaboTest | 10608 | |||
Orion | 5000 | |||
Biotechnology Corp Of America | 7700 (7690 readable) | |||
glx9604 Beletskaya Moscow | 8084 | |||
The parent matches pie plot shows the proportion of compounds that are present in the glax file. For the comparison, the largest piece of each structure is extracted, and all charges that can be reduced to zero by changing the hydrogen count are removed. The matches are purely 2-D; there is no attempt to match isomers. The figure is therefore an upper bound on the overlap.
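As a rough illustration of the parent comparison, the sketch below picks the largest piece of a dot-separated SMILES string and computes the fraction of source parents found in the registry. It is not the procedure actually used here: `largest_piece` and `parent_match_fraction` are hypothetical names, string length is only a crude stand-in for fragment size, and the charge removal step is not implemented. The parent strings are assumed to be already neutralised, canonical and free of stereo information.

```python
def largest_piece(smiles: str) -> str:
    """Return the longest dot-separated component of a SMILES string.

    Assumption: string length approximates fragment size. A real
    toolkit would compare heavy-atom counts instead.
    """
    return max(smiles.split("."), key=len)


def parent_match_fraction(source_parents, registry_parents) -> float:
    """Fraction of source parents that also occur in the registry.

    Assumption: both inputs hold canonical, neutralised, 2-D parent
    strings, so exact string equality means a 2-D match.
    """
    registry = set(registry_parents)
    source = list(source_parents)
    if not source:
        return 0.0
    matched = sum(1 for parent in source if parent in registry)
    return matched / len(source)


if __name__ == "__main__":
    # Sodium acetate: the acetate piece is kept, the counter-ion dropped.
    print(largest_piece("CC(=O)[O-].[Na+]"))
    print(parent_match_fraction(["CCO", "CCN"], ["CCO", "CCC"]))
```

Because isomers collapse onto the same 2-D parent, a fraction computed this way can only overstate the true overlap.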
The two neighbour pie charts show the proportion of compounds that have a given number of near neighbours, based on the 1024-bit DAYLIGHT fingerprint and the Tanimoto similarity measure. In the first plot, the neighbours in the glax registry file are counted for each compound from the commercial source. This should give a feel for how far the database fills gaps in, and/or extends, the diversity of the registry file. In the second plot, the neighbours within the supplied database itself are counted. This indicates whether we are being offered a lot of related compounds.
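The neighbour counting can be sketched in a few lines, under stated assumptions. Fingerprints are represented here as sets of on-bit positions rather than real DAYLIGHT fingerprints, the function names are hypothetical, and the similarity cut-off that defines a "near" neighbour is not given on this page, so the default `threshold=0.85` is purely an illustrative value.

```python
from collections import Counter


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto similarity: shared on-bits over total distinct on-bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def neighbour_counts(queries, targets, threshold=0.85, skip_self=False):
    """Count, for each query, the targets at or above the threshold.

    Set skip_self=True when queries and targets are the same list
    (the within-database case) so a compound is not its own neighbour.
    The default threshold is an assumption, not a value from the source.
    """
    counts = []
    for i, query in enumerate(queries):
        n = 0
        for j, target in enumerate(targets):
            if skip_self and i == j:
                continue
            if tanimoto(query, target) >= threshold:
                n += 1
        counts.append(n)
    return counts


def pie_segments(counts, cap=10):
    """Group neighbour counts into pie segments; key `cap` holds the
    top segment (every compound with `cap` or more neighbours)."""
    return Counter(min(n, cap) for n in counts)


if __name__ == "__main__":
    a = frozenset({1, 2, 3, 4})
    b = frozenset({1, 2, 3, 4, 5})
    c = frozenset({10, 11})
    within = neighbour_counts([a, b, c], [a, b, c], threshold=0.75, skip_self=True)
    print(pie_segments(within))
```

Calling `neighbour_counts(source, registry)` corresponds to the first plot, and `neighbour_counts(source, source, skip_self=True)` to the second. The all-pairs loop is quadratic, so it is only practical for small sets.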
To ease use as this table gets bigger, or when the plots are borrowed :-), the 10-and-over segment is labelled >10 in the between-database plots and 10+ in the within-database plots. The within-database plots are also in heavy type.