% smi2tdt -t '$CAS' smicas.smi > smicas.tdt
% fingerprint -id 7DEC smicas.tdt > fingers.tdt
% nearneighbors -FID 7DEC fingers.tdt > neighbors.tdt
% jpscan neighbors.tdt > /dev/printer
% jarpat -JP_NEED 8 -JP_NEAR 14 -NNID 7DEC neighbors.tdt > clusters.tdt
% showclusters -h -q -v clusters.tdt | more
% showclusters -h -q -x clusters.tdt > /dev/printer
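
The jarpat step above takes two parameters, -JP_NEED and -JP_NEAR. As a rough illustration of what they control, here is a minimal Python sketch of the commonly described Jarvis-Patrick rule: two structures are joined when each appears in the other's list of NEAR nearest neighbors and the two lists share at least NEED members. This is not Daylight's implementation. The function name `jarvis_patrick`, the dict-based input, and the convention that a structure is not listed among its own neighbors are all assumptions made for the example.

```python
def jarvis_patrick(neighbors, near, need):
    """Cluster items from near-neighbor lists (hypothetical helper).

    neighbors: dict mapping each item to its neighbor list, nearest first.
    near:      how many neighbors of each item to consider.
    need:      how many of those neighbors two items must share to be joined.
    Assumption: an item does not appear in its own neighbor list.
    """
    lists = {item: set(nbrs[:near]) for item, nbrs in neighbors.items()}
    parent = {item: item for item in neighbors}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a in lists:
        for b in lists[a]:
            # Join a and b only if they are mutual near neighbors
            # and their lists have at least `need` members in common.
            if b in lists and a in lists[b]:
                if len(lists[a] & lists[b]) >= need:
                    parent[find(a)] = find(b)

    clusters = {}
    for item in neighbors:
        clusters.setdefault(find(item), []).append(item)
    # Largest clusters first; singletons come out as one-element lists.
    return sorted(clusters.values(), key=len, reverse=True)


# Toy data, not taken from the example run.
toy = {
    "a": ["b", "c", "d"], "b": ["a", "c", "d"],
    "c": ["a", "b", "d"], "d": ["a", "b", "c"],
    "e": ["f", "a", "b"], "f": ["e", "a", "b"],
}
print(jarvis_patrick(toy, near=3, need=2))
```

Raising `need` toward `near` makes the joining rule stricter, which is the trend the jpscan tables show: fewer structures clustered and more singletons along each row.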
Program ......... jpscan
Version ......... Daylight Software Release 4.51
Function ........ scan Jarvis-Patrick clustering parameters
Input ........... NN (nearest neighbors) data

NN data set ..... na
   created by ... nearneighbors
   version ...... 4.51
   from ......... FP
   with params .. 16

Trees read in ... 1999
Clustered by .... standard Jarvis-Patrick method

NUMBER OF STRUCTURES CLUSTERED

       ------- NEED ---------------------------------------------------------
            2      3      4      5      6      7      8      9     10     11
NEAR   ------ ------ ------ ------ ------ ------ ------ ------ ------ ------
   2:     980      -      -      -      -      -      -      -      -      -
   3:    1338    779      -      -      -      -      -      -      -      -
   4:    1506   1186    663      -      -      -      -      -      -      -
   5:    1633   1424   1072    579      -      -      -      -      -      -
   6:    1705   1563   1332    965    525      -      -      -      -      -
   7:    1749   1648   1459   1209    877    467      -      -      -      -
   8:    1788   1707   1571   1381   1120    796    394      -      -      -
   9:    1823   1761   1657   1503   1288   1057    741    368      -      -
  10:    1850   1797   1715   1582   1401   1208    972    679    344      -
  11:    1866   1823   1748   1639   1502   1326   1134    898    596    319

PERCENTAGE OF STRUCTURES CLUSTERED

       ------- NEED ---------------------------------------------------------
            2      3      4      5      6      7      8      9     10     11
NEAR   ------ ------ ------ ------ ------ ------ ------ ------ ------ ------
   2:   49.02      -      -      -      -      -      -      -      -      -
   3:   66.93  38.96      -      -      -      -      -      -      -      -
   4:   75.33  59.32  33.16      -      -      -      -      -      -      -
   5:   81.69  71.23  53.62  28.96      -      -      -      -      -      -
   6:   85.29  78.18  66.63  48.27  26.26      -      -      -      -      -
   7:   87.49  82.44  72.98  60.48  43.87  23.36      -      -      -      -
   8:   89.44  85.39  78.58  69.08  56.02  39.81  19.70      -      -      -
   9:   91.19  88.09  82.89  75.18  64.43  52.87  37.06  18.40      -      -
  10:   92.54  89.89  85.79  79.13  70.08  60.43  48.62  33.96  17.20      -
  11:   93.34  91.19  87.44  81.99  75.13  66.33  56.72  44.92  29.81  15.95

NUMBER OF CLUSTERS

       ------- NEED ---------------------------------------------------------
            2      3      4      5      6      7      8      9     10     11
NEAR   ------ ------ ------ ------ ------ ------ ------ ------ ------ ------
   2:     490      -      -      -      -      -      -      -      -      -
   3:     466    333      -      -      -      -      -      -      -      -
   4:     365    378    264      -      -      -      -      -      -      -
   5:     280    333    317    220      -      -      -      -      -      -
   6:     209    278    318    273    193      -      -      -      -      -
   7:     154    200    267    271    241    172      -      -      -      -
   8:     121    142    213    259    246    221    145      -      -      -
   9:      94    118    164    223    240    240    202    130      -      -
  10:      71     92    133    188    226    234    226    184    136      -
  11:      61     78    107    147    194    219    232    209    171    127

AVERAGE CLUSTER SIZE

       ------- NEED ---------------------------------------------------------
            2      3      4      5      6      7      8      9     10     11
NEAR   ------ ------ ------ ------ ------ ------ ------ ------ ------ ------
   2:     2.0      -      -      -      -      -      -      -      -      -
   3:     2.8    2.3      -      -      -      -      -      -      -      -
   4:     4.1    3.1    2.5      -      -      -      -      -      -      -
   5:     5.8    4.2    3.3    2.6      -      -      -      -      -      -
   6:     8.1    5.6    4.1    3.5    2.7      -      -      -      -      -
   7:    11.3    8.2    5.4    4.4    3.6    2.7      -      -      -      -
   8:    14.7   12.0    7.3    5.3    4.5    3.6    2.7      -      -      -
   9:    19.3   14.9   10.1    6.7    5.3    4.4    3.6    2.8      -      -
  10:    26.0   19.5   12.8    8.4    6.1    5.1    4.3    3.6    2.5      -
  11:    30.5   23.3   16.3   11.1    7.7    6.0    4.8    4.2    3.4    2.5

SIZE OF LARGEST CLUSTER

       ------- NEED ---------------------------------------------------------
            2      3      4      5      6      7      8      9     10     11
NEAR   ------ ------ ------ ------ ------ ------ ------ ------ ------ ------
   2:       3      -      -      -      -      -      -      -      -      -
   3:       9      4      -      -      -      -      -      -      -      -
   4:      31     11      5      -      -      -      -      -      -      -
   5:      79     15     10      6      -      -      -      -      -      -
   6:     371    106     17      9      7      -      -      -      -      -
   7:     946    204     27     16     11      8      -      -      -      -
   8:    1114    829    133     34     16     12      9      -      -      -
   9:    1264   1009    460     58     23     18     13     10      -      -
  10:    1337   1177    873    123     35     21     18     15     11      -
  11:    1459   1239   1018    306     60     33     23     18     15     11

NUMBER OF SINGLETONS

       ------- NEED ---------------------------------------------------------
            2      3      4      5      6      7      8      9     10     11
NEAR   ------ ------ ------ ------ ------ ------ ------ ------ ------ ------
   2:    1019      -      -      -      -      -      -      -      -      -
   3:     661   1220      -      -      -      -      -      -      -      -
   4:     493    813   1336      -      -      -      -      -      -      -
   5:     366    575    927   1420      -      -      -      -      -      -
   6:     294    436    667   1034   1474      -      -      -      -      -
   7:     250    351    540    790   1122   1532      -      -      -      -
   8:     211    292    428    618    879   1203   1605      -      -      -
   9:     176    238    342    496    711    942   1258   1631      -      -
  10:     149    202    284    417    598    791   1027   1320   1655      -
  11:     133    176    251    360    497    673    865   1101   1403   1680
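
The jpscan tables above are related to one another: for the 1999 trees read in, the percentage, average cluster size, and singleton count in each cell appear to follow from the number of structures clustered and the number of clusters. The Python sketch below checks that relation for a few cells copied from the tables. The helper name `derived` is hypothetical, and the use of truncation rather than rounding is an inference from the printed values (for example 1850/1999 prints as 92.54, not 92.55), not documented jpscan behavior.

```python
TOTAL = 1999  # "Trees read in" from the jpscan header


def derived(clustered, clusters, total=TOTAL):
    """Return (percent clustered, average cluster size, singletons).

    Assumption: jpscan truncates to two and one decimal places
    respectively; this is inferred from the tables, not documented.
    """
    percent = (10000 * clustered // total) / 100   # truncated to 0.01
    average = (10 * clustered // clusters) / 10    # truncated to 0.1
    singletons = total - clustered
    return percent, average, singletons


# (NEAR, NEED) -> (structures clustered, number of clusters),
# copied from the tables above.
cells = {
    (2, 2): (980, 490),
    (10, 2): (1850, 71),
    (10, 8): (972, 226),
    (11, 11): (319, 127),
}

for (near, need), (clustered, clusters) in cells.items():
    print(near, need, derived(clustered, clusters))
```

For NEAR 10 and NEED 8 this gives 48.62 percent, an average size of 4.3, and 1027 singletons, matching the corresponding table cells.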
% showclusters -h -q -v clusters.tdt | more
HEADER AND SUMMARY:

   program ................... showclusters
   function .................. analysis and display of structure clusters
   version ................... DCIS Release 4.61 (c) 1995

   output requested .......... Summary Frequencies Sorted lists
   singletons to be listed ... no
   datatype(s) to show ....... all SMILES
   display long data items ... normal

   input file ................ jp810.tdt
   tree allocation, initial .. 10000
   tree allocation, final .... 10000
   total datatrees read ...... 2002
   trees with SMILES ......... 1999
   cluster id required ....... none
   trees with CL data ........ 1999
   trees with FP data ........ 1999
   trees with other data ..... 0 (0 items read)
   trees used ................ 1999

   clusters + singletons ..... 1253
   number of singletons ...... 1027
   number of clusters ........ 226
   average cluster size ...... 4.3
   largest cluster ........... 17

   Generation of CLUSTERS:
      ID ........... na
      Program ...... jarpat
      Version ...... 4.61
      Source ....... NN (near neighbors)
      Parameters ... 8,10,0

   Generation of NEAR NEIGHBORS:
      ID ........... na
      Program ...... nearneighbors
      Version ...... 4.61
      Source ....... FP (fingerprints)
      Parameters ... 16

   Generation of FINGERPRINTS:
      ID ........... na
      Program ...... fingerprint
      Version ...... 4.61
      Source ....... med98.tdt
      Parameters ... 2048,64,0.30,0/7

FREQUENCIES OF CLUSTER SIZES:

         size | frequency        size | frequency        size | frequency
    ----------+----------   ----------+----------   ----------+----------
            1 | 1027                7 | 10                 13 | 1
            2 | 107                 8 | 1                  14 | 4
            3 | 35                  9 | 9                  15 | 5
            4 | 15                 10 | 8                  16 | .
            5 | 15                 11 | 2                  17 | 1
            6 | 10                 12 | 3                   . | .

CLUSTERS LISTED BY SIZE, SMILES BY VAR(TANIMOTO):

CLUSTER 0 (64) size 17
   0.0   0.0219 CC1(C)SC2C(NC(=O)Cc3ccccc3)C(=O)N2C1C(=O)O
   0.1   0.0383 CC1(C)SC2C(NC(=O)C(C(=O)O)c3ccccc3)C(=O)N2C1C(=O)O
   0.2   0.0477 CC1(C)SC2C(NC(=O)C(N=[N+]=[N-])c3ccccc3)C(=O)N2C1C(=O)O
   0.3   0.0486 CC(=O)OCOC(=O)C1N2C(SC1(C)C)C(NC(=O)Cc3ccccc3)C2=O
   0.4   0.0497 CC(C)(C)C(=O)OCOC(=O)C1N2C(SC1(C)C)C(NC(=O)C(N)c3ccccc3)C2=O
   0.5   0.0507 CC1(C)SC2C(NC(=O)C3(N)CCCCC3)C(=O)N2C1C(=O)O
   0.6   0.0566 CC1(C)SC2C(NC(=O)C(C(=O)Oc3ccccc3)c4ccccc4)C(=O)N2C1C(=O)O
   0.7   0.0589 COC(C(=O)NC1C2SC(C)(C)C(N2C1=O)C(=O)O)c3ccc(Cl)c(Cl)c3
   0.8   0.0657 CC1(C)SC2C(NC(=O)C(C(=O)O)c3ccsc3)C(=O)N2C1C(=O)O
   0.9   0.0677 CC1(C)SC2C(NC(=O)COc3ccccc3)C(=O)N2C1C(=O)O
   0.10  0.0720 CCC(Oc1ccccc1)C(=O)NC2C3SC(C)(C)C(N3C2=O)C(=O)O
   0.11  0.0740 CC1(C)NC(C(=O)N1C2C3SC(C)(C)C(N3C2=O)C(=O)O)c4ccccc4
   0.12  0.0791 COc1cccc(OC)c1C(=O)NC2C3SC(C)(C)C(N3C2=O)C(=O)O
   0.13  0.0920 COC1(NC(=O)C(C(=O)O)c2ccsc2)C3SC(C)(C)C(N3C1=O)C(=O)O
   0.14  0.0954 CC1(C)SC2C(NC(=O)C(N)c3ccccc3)C(=O)N2C1C(=O)OC4OC(=O)c5ccccc45
   0.15  0.0977 CC1(C)SC2C(NC(=O)C(N)C3=CCC=CC3)C(=O)N2C1C(=O)O
   0.16  0.0986 CC1(C)SC2C(NC(=O)C(NC(=O)N3CCN(C3=O)S(=O)(=O)C)c4ccccc4)C(=O)N2C1C(=O)O

CLUSTER 1 (5) size 15
   1.0   0.0020 Cn1c(=O)n(C)c2[nH]c(=O)[nH]c2c1=O
   1.1   0.0021 Cn1c(=O)[nH]c2[nH]c(=O)[nH]c2c1=O
   1.2   0.0022 Cn1cnc2[nH]c(=O)n(C)c(=O)c12
   1.3   0.0023 Cn1cnc2n(C)c(=O)[nH]c(=O)c12
   1.4   0.0023 Cn1c(=O)[nH]c2[nH]c(=O)n(C)c(=O)c12
   1.5   0.0024 Cn1c(=O)[nH]c2n(C)c(=O)[nH]c(=O)c12
   1.6   0.0024 Cn1cnc2c(=O)n(C)c(=O)[nH]c12
   1.7   0.0025 Cn1c(=O)[nH]c2nc[nH]c2c1=O
   1.8   0.0026 Cn1c(=O)[nH]c(=O)c2[nH]cnc12
   1.9   0.0027 Cn1c(=O)[nH]c(=O)c2[nH]c(=O)[nH]c12
   1.10  0.0028 Cn1cnc2n(C)c(=O)n(C)c(=O)c12
   1.11  0.0030 Cn1c(=O)[nH]c2n(C)c(=O)n(C)c(=O)c12
   1.12  0.0035 Cn1cnc2c(=O)[nH]c(=O)[nH]c12
   1.13  0.0051 Cn1c(=O)[nH]c2[nH]c(=O)[nH]c(=O)c12
   1.14  0.0105 Cn1cnc2c(=O)[nH]cnc12

CLUSTER 2 (75) size 15
   2.0   0.0181 CN1C(=O)CN=C(c2ccccc2)c3cc(Cl)ccc13
   2.1   0.0256 CN1C(=O)CN=C(c2ccccc2F)c3cc(Cl)ccc13
   2.2   0.0259 Clc1ccc2NC(=O)CN=C(c3ccccc3)c2c1
   2.3   0.0259 CN1C(=O)C(O)N=C(c2ccccc2)c3cc(Cl)ccc13
   2.4   0.0282 Clc1ccc2N(CC#C)C(=O)CN=C(c3ccccc3)c2c1
   2.5   0.0325 Clc1ccc2NC(=O)CN=C(c3ccccc3Cl)c2c1
   2.6   0.0332 Clc1ccc2N(CC3CC3)C(=O)CN=C(c4ccccc4)c2c1
   2.7   0.0336 CN1C(=O)C(O)N=C(c2ccccc2Cl)c3cc(Cl)ccc13
   2.8   0.0342 FC(F)(F)CN1C(=O)CN=C(c2ccccc2)c3cc(Cl)ccc13
   2.9   0.0367 CCN(CC)CCN1C(=O)CN=C(c2ccccc2F)c3cc(Cl)ccc13
   2.10  0.0374 Clc1ccc2NC(=O)CN(=O)=C(c3ccccc3)c2c1
   2.11  0.0391 OCCN1C(=O)C(O)N=C(c2ccccc2F)c3cc(Cl)ccc13
   2.12  0.0525 CN1CCN=C(c2ccccc2)c3cc(Cl)ccc13
   2.13  0.0575 CN(C)C(=O)OC1N=C(c2ccccc2)c3cc(Cl)ccc3N(C)C1=O
   2.14  0.0766 CN1C(=O)CN=C(c2ccccc2F)c3cc(ccc13)N(=O)=O

CLUSTER 3 (286) size 15
   3.0   0.0245 CC1CC2C3CC(F)C4=CC(=O)C=CC4(C)C3(F)C(O)CC2(C)C1(O)C(=O)CO
   3.1   0.0300 CC1CC2C3CC(F)C4=CC(=O)C=CC4(C)C3(F)C(O)CC2(C)C1(O)C(=O)COC(=O)C(C)(C)C
   3.2   0.0308 CC12CC(O)C3(F)C(CCC4=CC(=O)C=CC43C)C2CC(O)C1(O)C(=O)CO
   3.3   0.0339 CC1CC2C3CC(F)(F)C4=CC(=O)C=CC4(C)C3(F)C(O)CC2(C)C1(O)C(=O)COC(=O)C
   3.4   0.0361 CC1CC2C3CCC4=CC(=O)C=CC4(C)C3(F)C(O)CC2(C)C1(C)C(=O)CO
   3.5   0.0361 CC(OC(=O)C)C(=O)C1(O)CCC2C3CCC4=CC(=O)C=CC4(C)C3(F)C(O)CC21C
   3.6   0.0361 CC1CC2C3CCC4=CC(=O)C=CC4(C)C3(F)C(O)CC2(C)C1(O)C(=O)COC(=O)C
   3.7   0.0403 CC12CC(O)C3C(CC(F)C4=CC(=O)C=CC34C)C2CCC1(O)C(=O)CO
   3.8   0.0425 CCCCC(=O)OC1(CCC2C3CC(F)C4=CC(=O)C=CC4(C)C3C(O)CC21C)C(=O)CO
   3.9   0.0450 CC1CC2C3CC(F)C4=CC(=O)C=CC4(C)C3C(O)CC2(C)C1C(=O)CO
   3.10  0.0518 CCC(=O)OC1(C(C)CC2C3CCC4=CC(=O)C=CC4(C)C3(F)C(O)CC21C)C(=O)CCl
   3.11  0.0612 CC(=O)OCC(=O)C1(CCC2C3CC(F)C4=CC(=O)C(=CC4(C)C3(F)C(O)CC21C)Br)OC(=O)C
   3.12  0.0642 CCC(=O)OC1(C(C)CC2C3CC(F)C4=CC(=O)C=CC4(C)C3(F)C(O)CC21C)C(=O)SC
   3.13  0.0697 CC1CC2C3CC(F)C4=CC(=O)C=CC4(C)C3(Cl)C(O)CC2(C)C1C(=O)COC(=O)C(C)(C)C
   3.14  0.1074 CCSC1(CCC2C3CCC4=CC(=O)C=CC4(C)C3(F)C(O)CC21C)SC

... 228 singletons suppressed ...