Daylight Web Services Manual
Release Date 08/01/11

Copyright Notice
This document and the programs described herein are
Copyright © 2007-2011, Daylight Chemical Information Systems, Inc., Laguna
Niguel, CA. Daylight explicitly grants permission to reproduce this document
under the condition that it is reproduced in its entirety including this
notice, and without alteration. All other rights are reserved.
Table of Contents
1. Introduction
2. Prerequisites
3. Web Services
3.1 canonicalizeSmiles
3.2 convertStructure
3.3 deriveScaffold
3.4 deriveSDClusters
3.5 getProperties
3.6 getDepiction
3.7 getTransform
3.8 getTautomers
3.9 desaltSmiles
3.10 normalizeSmiles
3.11 getClogP
3.12 generateRTable
3.13 executeProgram

1. Introduction
Web services use standard, open protocols to provide access to a wide range of programs over a network. Passing parameter data along with a request for a particular service triggers an action and sends back a response. Thus the web services model provides a valuable mechanism for delivering complex chemistry-oriented functionality such as format conversion or property calculation within an organization.

Daylight offers a series of application components such as canonicalization and depiction as Java web services for use on their own or for inclusion in user-designed or workflow applications. The "Web Services" access Daylight application libraries and toolkits written in C through the Java Native Interface (JNI) framework. SOAP is used as the messaging format for all currently available Daylight Web Services.

2. Prerequisites
No particular programming skills are required for use of the Web Services. However, installation and set-up require a general knowledge of UNIX and Daylight software.

2.1 Daylight Software Requirements
Web services are included with the standard Daylight distribution (versions 4.93 or later). The standard distribution is available for download from Daylight's web site (http://www.daylight.com). In order to use any of the web services, an appropriate Daylight license for each particular web service is required.

Note: The server code will only run on supported Solaris and Linux platforms.

2.2 Third-Party Software Requirements
The following third-party software packages are required for the server:
Tomcat 5.5 - http://tomcat.apache.org/
Axis 1.4 (testing only) - http://tomcat.apache.org/

2.3 Installation
See the Daylight Installation Manual for specific instructions on setting up the web services.

3. Web Services
The following sections describe the currently available Web Services. The response objects for all services contain the content of the response, processing or error messages, or both. When options are provided to web services in the form of a list of alternating strings of names and values, the general rule is that a repeated name will have its last value used. All names must have associated values.

Each web service publishes a Web Services Description Language (WSDL) file, which represents the definitive specification of all the inputs and outputs (including exceptions) for each service. A copy of the WSDL is located in $DY_ROOT/webservices. All of the Web Services will optionally report errors generated during an action as part of the returned message when an ERRORLEVEL input parameter is supplied. Standard error levels are as follows:
1 = warnings, notes and errors returned
2 = warnings and errors returned
3 = all errors returned
4 = serious errors only

3.1 canonicalizeSmiles
This Web Service parses a list of molecules or reactions and generates the corresponding canonical SMILES.
ISO option string ERRORLEVEL Output List:
[(SMILES string, error messages)] Option:
[TRUE|FALSE] default is FALSE

3.2 convertStructure
This Web Service interconverts data and structures between MDL chemical table-based file formats [molfile (MOL), SDfile (SDF), RGfile, RXNfile (RDF) and RDfile (RDF)] and Daylight SMILES-based formats [SMILES (SMI), isomeric SMILES (ISM), SMARTS (SMA), SMIRKS (SMRK), and Thor Data Tree (TDT)]. Detailed descriptions of conversion formats and options are available in the Daylight Conversion Manual. MDL format to SMILES conversions are based upon default ptable values unless specific ptable changes are provided.
Input format string
Output format string
List of options as name-value pairs
Optional ptable changes as a list of [atom number, atom symbol, atom mass, list of valence-charge pairs]
ERRORLEVEL

Output SOAP Message:
[(output string, error messages)]
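
The option list accepted by this and other services is a flat list of alternating names and values. As a rough client-side illustration of the general rule stated in Section 3 (a repeated name takes its last value, and every name needs a value), the sketch below collapses such a list into a dictionary. The helper name `resolve_options` is hypothetical and is not part of the Daylight distribution; the server's own handling may differ in detail.

```python
def resolve_options(pairs):
    """Collapse a flat [name, value, name, value, ...] list into a dict.

    Hypothetical helper: mirrors the documented rule that a repeated
    name keeps its last value and that every name must have a value.
    """
    if len(pairs) % 2 != 0:
        raise ValueError("every option name must have an associated value")
    resolved = {}
    for name, value in zip(pairs[0::2], pairs[1::2]):
        resolved[name] = value  # a later duplicate overwrites the earlier one
    return resolved


# SMI_IS_ISM appears twice; the last value ("TRUE") is the one kept.
options = resolve_options(
    ["SMI_IS_ISM", "FALSE", "USE_3D", "TRUE", "SMI_IS_ISM", "TRUE"]
)
```

Checking the list before sending a request can catch an unpaired name early, rather than relying on the error messages returned by the service.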
TDT --> SDF or RDF
SDF --> SMI, ISM, SMA, TDT or TDTSMA
RDF --> SMI, ISM, SMA, SMRK, TDT, TDTSMA or TDTSMRK

Note: MOL is a valid input value that is interchangeable with SDF regardless of the actual input format. In addition, MOL is a valid output format value if the input value is SMI or TDT. However, the output will always be written in SDF format even if there is no associated data. In addition, either MOL or SDF can be used with RGfile input. Lastly, the rxnfile format is not recognized as a separate format; RDF is used for both rxnfiles and RDfiles.

Valid Conversion/Option Combinations:
[TRUE|FALSE] default is true
ADD_3D - Adds 3D coordinates to Daylight output [TRUE|FALSE] default is true
CHI_EXPLICIT_H - Determines whether chiral atoms must have explicit hydrogens [TRUE|FALSE] default is false
DAYLIGHT_LIKE - Sets all three DAYLIGHT options [TRUE|FALSE] default is true
DAYLIGHT_HCOUNT - Determines whether both the explicit H and H-count fields are used [TRUE|FALSE] default is true
DAYLIGHT_STEREO - Determines whether only specified stereochemistry is used [TRUE|FALSE] default is true
DAYLIGHT_CHI_H - Determines whether chiral atoms in the input file must have explicit hydrogens [TRUE|FALSE] default is true
DB_EXPLICIT_H - Determines whether double bonds in the input file must have explicit hydrogens [TRUE|FALSE] default is false
DB_RINGS_CISTRANS - Determines whether stereochemistry for ring double bonds is indicated [TRUE|FALSE] default is false
FIX_RADICAL-RINGS - Determines whether radical rings are converted to aromatic [TRUE|FALSE] default is true
ID_FIELD - Specifies the data field identifier to be used as the unique ID [NAME] default is first line of header block for molecules and $RIREG for reactions
IMPLICIT_CHIRALITY - Specifies how chirality is determined in order to detect implicit chiral centers [TRUE|FALSE] default is false
M__ISO_ARE_DEFECTS - Indicates whether values in the M ISO line are mass defects or actual masses [TRUE|FALSE] default is false
NAME_DATATAG - Designates the data tag to be used as the unique ID [NAME] default is LINE1 if available, otherwise $NAM
PREFIX - Parses the designated prefix from data field identifiers [NAME] default is to use the full $DTYPE name
SMI_COMMENT - Determines whether the SMILES is placed in the comment line of the connection table [TRUE|FALSE] default is false
SMI_IS_ISM - Replaces SMILES with isomeric SMILES in the output [TRUE|FALSE] default is false
SMI_WITH_TUPLES - Determines whether tuple information is associated with SMILES or isomeric SMILES [TRUE|FALSE] default is true
SPLIT-FIELDS - Splits data that is spread across multiple lines in an input into separate entries [TRUE|FALSE] default is false
USE_3D - Designates whether 3D coordinates are included in the MDL output [TRUE|FALSE] default is false

3.3 deriveScaffold
This Web Service generates a single scaffold that captures all common substructure elements, including ring topology, from a list of SMILES. Please note that this process can be time intensive. Therefore, the server timeout may need to be adjusted to accommodate large processes.
List of options as name-value pairs ERRORLEVEL Output SOAP Message:
Error messages Option:
counts used (DX_CSS_SIMPLE_TOPOLOGY) and uses only atoms with TOORDER properties set, e.g. as a result of using transforms, (DX_CSS_USE_TORDER_ONLY)
If multiple TOPO_OPTION-value pairs are supplied, the individual values will be combined and all will be used.
MIN_FRAGMENT - Sets the minimum fingerprint path size [INTEGER] default is 0, range is 0 to 19
Increasing the minimum fingerprint path size eliminates scaffolds that are smaller than the set size.

3.4 deriveSDClusters
This Web Service partitions small to moderate-sized sets of input SMILES into clusters with significant scaffolds. Like most clustering algorithms, scaffold-directed clustering can be time intensive. The limit on the web service permits clustering of up to 10,000 structures, which can take tens of minutes for drug-like molecules. The server timeout may need to be adjusted depending on the use. For more information see the Clustering Manual.
List of options as name-value pairs ERRORLEVEL Output SOAP Message:
[(SMARTS scaffold, list of member SMILES, properties (cluster id, minimum coverage, number of members), error messages)]
[INTEGER] default is 0, range is 0 to 19
Increasing the minimum fingerprint path size eliminates scaffolds that are smaller than the set size.
MAX_FP_PATH_SIZE - Sets the maximum fingerprint path size [INTEGER] default is 19, range is 0 to 19
MIN_COVERAGE - Sets the minimum scaffold coverage [NUMBER] default is 0.3, range is 0.0 to 1.0
TOPO_OPTION - Sets topology to full or simple, i.e., no ring bond counts used [DX_CSS_DEFAULT|DX_CSS_SIMPLE_TOPOLOGY] default is DX_CSS_DEFAULT

3.5 getProperties
This Web Service calculates values for a specified list of different physical properties for one or more input SMILES using Daylight algorithms. See the Daylight Properties Manual for additional information.
List of properties
SINGLE_PART option string
RXNDIFF option string
Optional SMARTS string for MATCH_COUNT
ERRORLEVEL

Output SOAP Message:
[(list of computed property values, error messages)] Options:
[TRUE|FALSE] default is FALSE
Computed property values can be a comma-separated string if the input SMILES has multiple parts and the flag is set to FALSE.
RXNDIFF - Returns the difference between the property values of the product and the reactant [TRUE|FALSE] default is FALSE
If RXNDIFF is TRUE, then SINGLE_PART cannot be FALSE.

Properties:
ATOM-COUNT - Count of heavy atoms in a molecule
AVERAGE_MOL-WEIGHT - Molecular weight based on average atomic weights for naturally occurring elements
DEPICTION - Planar coordinates for explicit atoms
FINGERPRINT - Fingerprint using default parameters
FLEXIBILITY - Ratio of rotatable bonds to the total count of bonds
FRAGMENT_COUNT - Number of fragments formed by removal of the isolated carbons from the structure
HACCEPTOR_COUNT - Number of hydrogen-bond acceptor sites
HDONOR_COUNT - Number of hydrogen-bond donor sites
MATCH_COUNT - Number of unique matches using a user-defined SMARTS
MOLAR_VOLUME - Average molar volume based on Schroedinger's method
MOL_FORM - Molecular formula in Hill order
PARACHOR - Molar surface tension in dynes per centimeter using McGowan's method
PART_COUNT - Number of components
POLAR_SURFACE_AREA - Topological polar surface area according to the method of Ertl, Rohde, and Selzer
RIGIDITY - Tanimoto similarity value between a molecule and a version of itself with rotatable bonds removed
RING_COUNT - Number of rings in the smallest set of smallest rings
ROTBOND_COUNT - Number of rotatable bonds using a defined SMARTS pattern
STEREOCENTER_COUNT - Number of stereocenters using a particular set of defined SMARTS patterns

3.6 getDepiction
This Web Service parses a list of name-value strings (alternating names and values), one pair of which must be either "SMILES" and a valid SMILES string or "TDT" and a valid TDT string, and returns a structural diagram in GIF format.
ERRORLEVEL Output SOAP Message:
Error messages Options:
[COB, COW, COP, BOW, BOP, WOB, or WOP] default is COB
FROMTO - Specifies output horizontal alignment by aligning depiction to -1 and -2 wildcard atoms ([*-1] and [*-2]) [TRUE|FALSE] default is false (overridden by the ORIENT option)
HEIGHT - Specifies output height [PIXELS] default is 300
HIDE_CHI_H - Specifies whether chiral hydrogens are hidden in output [TRUE|FALSE] default is true
HIGHLIGHT - Specifies a SMARTS query string to be used to highlight the matching portion of the input SMILES or TDT structure [SMARTS]
HLEN_PCT - Specifies scaling length for bonds to hydrogen in output [NUMBER] default is 1.00, range is 0.67 to 1.0
HYDROGENS - Specifies that aliphatic hydrogens and carbons are to be shown in output [TRUE|FALSE] default is false
NONEXHAUSTIVE - Specifies whether exhaustive or nonexhaustive SMARTS matching is used for highlighting the depiction [TRUE|FALSE] default is false
NUMCOLORS - Specifies the number of output atom colors for input TDT with ALAB specified [NUMBER] CPK color scheme is used by default
OLD_STYLE - Specifies pre-v4.83 bond style rendition to be used in output [TRUE|FALSE] default is false
ORIENT - Specifies automatic orientation of 2D layout in output to the longest axis [TRUE|FALSE] default is false (overrides the FROMTO option)
OUTPUT - Specifies whether the output is in gif or png format
REACTION - Specifies that the input SMILES is a reaction with atom-mapping [TRUE|FALSE] default is false
SCALE - Specifies the output number of pixels per angstrom [NUMBER] default is 100; overrides WIDTH and HEIGHT options
SCHEMATIC - Specifies that the output be a skeleton frame with no hydrogen atoms or aromatic bonds [TRUE|FALSE] default is false
SMILES - Indicates that the input is a SMILES [valid-SMILES-string]
SMIRKS - Indicates that the input is a general reaction that may contain SMARTS expressions for atoms [TRUE|FALSE] default is false
TDT - Indicates that the input is a TDT [valid-TDT-string]
WIDTH - Specifies output width [PIXELS] default is 400
XSMILES - Specifies that output be in XSMILES or Kekule format [TRUE|FALSE] default is false

3.7 getTransform
This Web Service applies a specified transform to one or more input SMILES. See the Daylight Theory Manual for additional information on SMIRKS reaction transforms.
Single SMIRKS transform
ISO option string
List of options as name-value pairs
ERRORLEVEL

Output SOAP Message:
Options:
[TRUE|FALSE] default is FALSE
EXHAUSTIVE_SEARCH - TRUE returns all molecules, FALSE returns a single molecule [TRUE|FALSE] default is FALSE
FULL_RXN - Determines whether a full reaction is returned or only the reaction product [TRUE|FALSE] default is TRUE
DIRECTION - Determines the direction in which the transform is performed [DX_FORWARD|DX_REVERSE] default is DX_FORWARD

3.8 getTautomers
This Web Service calculates tautomers for input SMILES. See the Daylight Properties Manual for additional information on tautomers.
ISO option string
List of options as name-value pairs
FIXED_SUBSTRUCTURE option list
ERRORLEVEL

Output SOAP Message:
Options:
[TRUE|FALSE] default is FALSE
NO_ENOL - Restricts hydrogen donors and acceptors to only heteroatoms, i.e., suppresses keto-enol type tautomerism. [TRUE|FALSE] The default is FALSE.
KEKULE - Controls whether Kekule structures are generated using dt_xsmiles() (TRUE) or canonical SMILES using dt_cansmiles() (FALSE). [TRUE|FALSE] The default is FALSE.
UNIQUE - Determines whether the output is generated by using the relative electronegativities of the atom types (O>S>Se>Te>N>C) as graph invariants to preferentially assign double bond and hydrogen positions in the tautomer. Although this canonical tautomer often corresponds to the lowest energy form, this is not guaranteed, as extended electronic factors are not considered. [TRUE|FALSE] The default is FALSE.
ITERATION_LIMIT - Maximum number of donor or acceptor positions for iteration. If a structure has more donors or acceptors than the specified limit, then no tautomer enumeration is performed. The default limit allows the program to generate tautomers for every input structure until all possible tautomers have been generated. A reasonable limit for avoiding long-running, pathological cases is 10. [INTEGER] The default is 0.
FIXED_SUBSTRUCTURE - This option is useful for excluding specific functional groups from the calculation. If an input molecule matches one of the SMARTS in the supplied comma-separated list, it is marked as non-tautomerizable. [LIST OF SMARTS] The default is no matches.

3.9 desaltSmiles
This Web Service parses a list of SMILES and removes salts based upon a salt table or an actual list of salts that is provided as part of the input message. A copy of the default salt table (salts.dat) is located in $DY_ROOT/data. The format is one salt with a class number per line, e.g., [Na+] 0.

In order to utilize a user-provided table instead of the default, the environment variable DY_SALT_DATA must be set to the location of the new table. If the user-provided table has more than one class listed, then the class number to be used can be specified in the input message. Lastly, if a list of salts is provided with the input message, then this list is used in place of either the default or user-provided table.
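
The table format and the order of precedence described above can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the parser assumes the salt and its class number are separated by whitespace (as in the `[Na+] 0` example) and skips blank lines, and it does not reproduce how the Daylight server actually reads salts.dat.

```python
import os


def parse_salt_table(text):
    """Parse lines such as '[Na+] 0' into (salt SMILES, class number) pairs.

    Assumption: whitespace separates the salt from its class number.
    """
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        smiles, class_number = line.rsplit(None, 1)
        entries.append((smiles, int(class_number)))
    return entries


def choose_salt_source(supplied_salts, env=os.environ):
    """Return which salt source applies, following the documented precedence:
    a list in the input message, then DY_SALT_DATA, then the default table."""
    if supplied_salts:
        return ("message list", supplied_salts)
    if env.get("DY_SALT_DATA"):
        return ("user table", env["DY_SALT_DATA"])
    return ("default table", "$DY_ROOT/data/salts.dat")


table = parse_salt_table("[Na+] 0\n\n[Cl-] 0\n")
```

Passing `env` explicitly makes the precedence easy to exercise without touching the real environment.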
Comma-separated list of salts
ISO option string
Class number
ERRORLEVEL

Output SOAP Message:
Options:
[TRUE|FALSE] default is FALSE

3.10 normalizeSmiles
This Web Service parses a list of SMILES and normalizes the structures based upon a transform table or an actual list of SMIRKS transforms that is provided as part of the input message. A copy of the default transform table (transforms.dat) is located in $DY_ROOT/data. The format is one SMIRKS, reaction direction, and class number per line.

In order to utilize a user-provided table instead of the default, the environment variable DY_TRANSFORM_DATA must be set to the location of the new table. If the user-provided table has more than one class listed, then the class number to be used can be specified in the input message. Lastly, if a list of transforms is provided with the input message, then this list is used in place of either the default or user-provided table. In this case, the direction (forward/reverse) to be used for the transformation is specified in the input message.
Comma-separated list of SMIRKS transforms
ISO option string
Direction of SMIRKS reaction
Class number
ERRORLEVEL

Output SOAP Message:
Options:
[TRUE|FALSE] default is FALSE

3.11 getClogP
This Web Service parses a list of SMILES and calculates both the logarithm of the computed octanol-water partition coefficient (clogp) and the molar refractivity (cmr). See the Daylight ClogP Manual and the Daylight CMR Manual for more information.
ERRORLEVEL Output SOAP Message:
3.12 generateRTable
This Web Service parses a single, scaffold-based cluster of molecules, such as those generated by deriveSDClusters, and determines the RGroups for that scaffold.

Please note that in order to generate an rtable, the input SMARTS scaffold cannot have more than four fragments. In fact, the most useful R-tables are those where the number of scaffold fragments is kept to one or two. Therefore, if you are using deriveSDClusters/deriveScaffold to generate the input information, you may need to set the MIN_FP_PATH_SIZE/MIN_FRAGMENT option to a larger value in order to get a better scaffold. Also be aware that if an input scaffold is highly symmetric, the program will automatically switch to non-exhaustive matches.
ERRORLEVEL
The list of cluster properties is optional and may be the same as that generated by deriveSDClusters.

Output SOAP Message:
[RTable row (molecule ID and Rgroups consisting of an array of SMARTS strings)]
error messages

3.13 executeProgram
This Web Service enables the use of PipeTalk for two-way communication with an external process such as that described for ClogP. See the Program Object Toolkit section of the Daylight Programmer's Guide for additional information.

Note: In order for executeProgram to function, the environment variable DY_WSPATH must be defined (see the Daylight Installation Manual) and the program being called must be installed in a path below DY_WSPATH.
Path to program relative to DY_WSPATH (ascending is not permitted)
List of program arguments

Output SOAP Message:
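
For executeProgram, the requirement that the program path be relative to DY_WSPATH and never ascend above it can be checked on the client before a request is sent. The sketch below is one possible way to express that rule; the function name and the example directory `/opt/daylight/ws` are hypothetical, the check is purely lexical (it does not resolve symbolic links), and it is not a description of how the Daylight server enforces the restriction.

```python
import posixpath


def resolve_program_path(ws_root, relative_path):
    """Join relative_path onto ws_root and reject anything not below ws_root.

    Hypothetical helper; ws_root stands in for the value of DY_WSPATH.
    """
    if posixpath.isabs(relative_path):
        raise ValueError("program path must be relative to DY_WSPATH")
    root = posixpath.normpath(ws_root)
    # normpath collapses any '..' components so they cannot hide an ascent
    candidate = posixpath.normpath(posixpath.join(root, relative_path))
    prefix = root if root.endswith("/") else root + "/"
    if not candidate.startswith(prefix):
        raise ValueError("program path must stay below DY_WSPATH")
    return candidate


program = resolve_program_path("/opt/daylight/ws", "bin/myprog")
```

A path such as `../bin/myprog` or `bin/../../myprog` raises `ValueError` here because its normalized form falls outside the root directory.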