Daylight v4.9
Release Date: 1 February 2008

Name

dt_fp_generatefp - generate a fingerprint from an object

Generic Prototype

dt_fp_generatefp(dt_Handle, dt_Integer, dt_Integer, dt_Integer) => dt_Handle

C Prototype

#include "dt_finger.h"

dt_Handle dt_fp_generatefp(dt_Handle object, dt_Integer minstep, dt_Integer maxstep, dt_Integer size)

FORTRAN Prototype

include 'dt_f_finger.inc'

integer*4 dt_f_fp_generatefp(object, minstep, maxstep, size)

integer*4 object
integer*4 minstep
integer*4 maxstep
integer*4 size

Description

Allocates a new, empty fingerprint of size 'size', fills its fields with the fingerprint generated from the object 'object', and sets the fingerprint object's 'original bits', 'original size', 'size' and 'bits set' properties. Legal values of 'size' are 32 to 1073741824 (2^30). If 'size' is not a power of two, it is rounded up to the next power of two.
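
The rounding rule for 'size' can be illustrated with a small stand-alone sketch. The helper fp_effective_size() and its constants are hypothetical names used only for illustration and are not part of the toolkit. Returning 0 for an out-of-range request is this sketch's own convention, not documented behavior.

```c
/* Hypothetical helper, not part of the Daylight toolkit.
 * Documented legal range for 'size': 32 to 1073741824 (2^30). */
#define FP_MIN_SIZE 32L
#define FP_MAX_SIZE 1073741824L

/* Returns the size that a request of 'size' would be rounded up to,
 * or 0 (this sketch's own convention) if 'size' is out of range. */
long fp_effective_size(long size)
{
    long rounded = FP_MIN_SIZE;

    if (size < FP_MIN_SIZE || size > FP_MAX_SIZE)
        return 0;

    /* Double from 32 until the requested size is reached or passed.
     * Since size <= 2^30, 'rounded' never exceeds 2^30. */
    while (rounded < size)
        rounded *= 2;

    return rounded;
}
```

Under this sketch a request of 1000 is rounded up to 1024, while a request of 2048 is already a power of two and is left unchanged.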

The object 'object' can be any object for which dt_stream(object, TYP_ATOM) and dt_stream(object, TYP_BOND) will return a stream of atoms and bonds, respectively. Typically, 'object' is a molecule, but paths, pathsets, cycles, atoms, bonds, and reactions can be used to generate fingerprints if desired. Note that unconnected bonds do not result in bits being set; bonds only count if both attached atoms are also present in the object being fingerprinted.

Fingerprint generation occurs in two phases. First, unique bits are calculated and set for every linear path of atoms and bonds in the object. Second, bits are generated and added for unique cycles and branching (atoms with 3 or more connections).

The parameters 'minstep' and 'maxstep' control the first phase of fingerprint generation (linear paths). 'minstep' sets the minimum path length to be included in the fingerprint; 'maxstep' sets the maximum path length to be included. The maximum value for either parameter is 31. If 'minstep' is greater than 'maxstep', the first phase of fingerprint generation is skipped altogether.
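
The rules for 'minstep' and 'maxstep' can be summarized in a short stand-alone sketch. The helper names are hypothetical and not part of the toolkit. Only the documented upper limit of 31 is checked, because no lower limit is stated here.

```c
/* Hypothetical helpers, not part of the Daylight toolkit. */
#define FP_MAX_STEP 31   /* documented maximum for either parameter */

/* Returns 1 if neither parameter exceeds the documented maximum. */
int fp_steps_in_range(int minstep, int maxstep)
{
    return minstep <= FP_MAX_STEP && maxstep <= FP_MAX_STEP;
}

/* Returns 1 if the linear-path phase runs, 0 if it is skipped
 * because 'minstep' is greater than 'maxstep'. */
int fp_linear_phase_runs(int minstep, int maxstep)
{
    return minstep <= maxstep;
}

/* Number of distinct path lengths covered by the linear-path phase. */
int fp_path_lengths_covered(int minstep, int maxstep)
{
    if (!fp_linear_phase_runs(minstep, maxstep))
        return 0;
    return maxstep - minstep + 1;
}
```

For example, the pair (0, 7) covers eight path lengths, while the pair (5, 2) skips the linear-path phase, leaving only the bits from the second phase (cycles and branching).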

Note: in versions prior to 4.42, the function dt_fp_fingerprint(3) was used for fingerprint generation. Internally, that function used the values 0 and 7 for 'minstep' and 'maxstep', respectively, so dt_fp_fingerprint(3) included all linear paths of 0 to 7 bonds (1 to 8 atoms).

Note: using values of 'size' over 2097152 (2^21) yields fingerprints without a uniform distribution of bits. The current fingerprint algorithm was written before larger fingerprints became practical, and hence will 'scatter' bits only across the first 2^21 bits of the fingerprint. These fingerprints are perfectly legal and fold correctly down to smaller sizes, but the distribution of bits will appear odd (to those of you sick enough to actually look at them).
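
To see why such fingerprints still fold cleanly, consider the following stand-alone sketch. It assumes that folding halves a fingerprint by OR-ing its upper half onto its lower half. That is a common folding scheme, but it is an assumption here, since this page does not describe the internals of dt_fp_foldfp(3). The names fold_once() and count_bits() are hypothetical, and the fingerprint is modeled as a plain byte array.

```c
#include <stddef.h>

/* Hypothetical model of folding, not the toolkit's implementation.
 * Folds a bit array of 'nbits' bits (a power of two, at least 16)
 * in half, in place, by OR-ing the upper half onto the lower half.
 * Returns the new size in bits. */
size_t fold_once(unsigned char *bits, size_t nbits)
{
    size_t half_bytes = nbits / 16;   /* bytes in one half */
    size_t i;

    for (i = 0; i < half_bytes; i++)
        bits[i] |= bits[half_bytes + i];

    return nbits / 2;
}

/* Counts the set bits among the first 'nbits' bits. */
size_t count_bits(const unsigned char *bits, size_t nbits)
{
    size_t count = 0;
    size_t i;

    for (i = 0; i < nbits; i++)
        if (bits[i / 8] & (1u << (i % 8)))
            count++;

    return count;
}
```

Under this assumption, if every set bit lies in the lower portion of an oversized fingerprint, the upper half contributes nothing to the OR, so folding leaves both the bit pattern and the bit count unchanged.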

Return Value

Returns a new fingerprint object. Returns NULL_OB if 'object' is an inappropriate object type (see above).

Related Topics

dt_fp_allocfp(3) dt_fp_bitcount(3) dt_fp_bitvalue(3) dt_fp_euclid(3) dt_fp_fingertest(3) dt_fp_foldfp(3) dt_fp_nbits(3) dt_fp_obitcount(3) dt_fp_obits(3) dt_fp_range(3) dt_fp_setbitvalue(3) dt_fp_setobitcount(3) dt_fp_setobits(3) dt_fp_setrange(3) dt_fp_tanimoto(3) dt_fp_tversky(3)