
Efficiently finding similar proteins by sequence similarity requires specialized algorithms that work at the sequence (string) level. There are, however, a number of interesting parallels between the methods used for small-molecule substructure searching and those used for sequence similarity searching:
| | Substructure searching | Sequence similarity |
|---|---|---|
| Topology | 2D graphs | 1D strings |
| Screening | Fingerprints | Local identities |
| Similarity measure | Tanimoto, etc. | Scoring matrices |
| Matching | Graph matching | Dynamic programming |
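To make the "Similarity measure" and "Matching" rows concrete, the following is a minimal sketch rather than a production implementation. It assumes fingerprints are represented as Python sets of on-bit indices, and it uses fixed match, mismatch, and gap scores as a stand-in for a real substitution matrix. The function names `tanimoto` and `local_alignment_score` are hypothetical, chosen only for illustration.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    if not fp_a and not fp_b:
        return 1.0  # convention assumed here for two empty fingerprints
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


def local_alignment_score(seq_a, seq_b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score by dynamic programming (Smith-Waterman style).

    The match/mismatch values are toy scores standing in for a scoring
    matrix; the gap penalty is linear. Only the score is returned, not
    the alignment itself.
    """
    prev = [0] * (len(seq_b) + 1)  # previous row of the DP table
    best = 0
    for a in seq_a:
        curr = [0]  # first column is always 0 for local alignment
        for j, b in enumerate(seq_b, start=1):
            diag = prev[j - 1] + (match if a == b else mismatch)
            # A cell never drops below 0, so an alignment can restart anywhere.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


if __name__ == "__main__":
    print(tanimoto({1, 2, 3}, {2, 3, 4}))
    print(local_alignment_score("GGACGTGG", "TTACGTTT"))
```

Keeping only two rows of the table holds memory at O(len(seq_b)) when just the score is needed; recovering the aligned residues would require the full table and a traceback step.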