Designing a database generally amounts to specifying the datatypes. A datatype may be an identifier or non-identifier data; it may have one or several datafields; each field may be indirect or not; and each field may be numeric, text, ASCII-encoded binary, or chemical (a SMILES).
This exercise consists of designing a datatype and incorporating it into our test database. It does not cover every issue of database design, but it is a typical and illustrative task.
SOL<CO;Methanol;1.234>
We see three fields: the SMILES for the solvent, the name of the solvent, and the solubility. Your task is to compose a datatype TDT which defines this datatype in a reasonable way. The SMILES should be recognized as such, the name should be normalized for reliable searching, and the solubility should be recognized as a real number.
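To make the three-field layout concrete, here is a minimal Python sketch that splits the sample dataitem into its fields. It is not part of the Daylight software; the function name parse_sol is hypothetical, and the normalization shown (collapse whitespace, uppercase) is only an assumed stand-in for whatever normalization the datatype definition actually requests. It also ignores TDT quoting rules, so it would mis-split a field that itself contains a semicolon.

```python
import re

# One TDT data line has the shape TAG<field;field;...>
TDT_LINE = re.compile(r"^(\$?\w+)<(.*)>$")


def parse_sol(line):
    """Split a SOL dataitem into (smiles, normalized_name, solubility)."""
    match = TDT_LINE.match(line.strip())
    if match is None or match.group(1) != "SOL":
        raise ValueError(f"not a SOL dataitem: {line!r}")
    fields = match.group(2).split(";")  # naive split, no quoting support
    if len(fields) != 3:
        raise ValueError(f"expected 3 fields, got {len(fields)}")
    smiles, name, solubility = fields
    # Assumed normalization: collapse runs of whitespace and uppercase.
    normalized = " ".join(name.split()).upper()
    # float() raises ValueError if the third field is not a real number.
    return smiles, normalized, float(solubility)


print(parse_sol("SOL<CO;Methanol;1.234>"))
# ('CO', 'METHANOL', 1.234)
```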
Look at examples in test_datatypes.tdt.
Write a file containing your one datatype, sol_dtype.tdt.
Here's one possibility...
$D<SOL>
_V<"Solvent; Solvent/Ref">
_B<"Solv;Solv/ref">
_N<"USMILESANY;INDIRECT $I">
_P<"*;*">
_S<Solvent in SMILES notation>
_M<Common>
_O<Test datatype>
|
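Before loading a definition like this one, it can help to check that the tags holding per-field entries all describe the same number of fields. The following is a rough Python sketch under assumptions: the helper names parse_datatype and field_counts are hypothetical, the choice of _V, _B, _N and _P as the per-field tags is read off the semicolon-separated values in the listing rather than from the THOR documentation, and quoting is handled only by stripping the outer double quotes.

```python
def parse_datatype(text):
    """Return {tag: value} for one datatype-definition TDT."""
    items = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line == "|":  # skip blanks and the TDT terminator
            continue
        tag, _, rest = line.partition("<")
        value = rest.rsplit(">", 1)[0]
        items[tag] = value.strip('"')  # drop surrounding quotes, if any
    return items


def field_counts(items, tags=("_V", "_B", "_N", "_P")):
    """Count semicolon-separated entries for each per-field tag present."""
    return {tag: len(items[tag].split(";")) for tag in tags if tag in items}


definition = '''$D<SOL>
_V<"Solvent; Solvent/Ref">
_B<"Solv;Solv/ref">
_N<"USMILESANY;INDIRECT $I">
_P<"*;*">
_S<Solvent in SMILES notation>
_M<Common>
_O<Test datatype>
|'''

dt = parse_datatype(definition)
print(dt["$D"])          # SOL
print(field_counts(dt))  # {'_V': 2, '_B': 2, '_N': 2, '_P': 2}
```

The counts give a quick number to compare against the fields in the dataitems you intend to load.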
thorload \
-MERGE FALSE -OVERWRITE TRUE \
test_datatypes < sol_dtype.tdt
sols.tdt contains TDTs of solvent data rooted by their associated
SMILES. By loading in merge mode, these data will be added
to the appropriate datatrees.
% thorload test < sols.tdt
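For readers who have not seen a rooted TDT, the sketch below shows how one record of the kind described for sols.tdt might be written out. It makes several assumptions: the root identifier tag is taken to be $SMI, the helper make_tdt is hypothetical, the root SMILES is an arbitrary placeholder rather than real data, and no TDT quoting of special characters is attempted. The SOL dataitem is the sample one from the exercise.

```python
def make_tdt(root_smiles, sol_items):
    """Build one TDT: a SMILES root, its SOL dataitems, then the terminator."""
    lines = [f"$SMI<{root_smiles}>"]  # assumed root identifier tag
    for solvent_smiles, name, solubility in sol_items:
        lines.append(f"SOL<{solvent_smiles};{name};{solubility}>")
    lines.append("|")  # end of this TDT
    return "\n".join(lines) + "\n"


# Placeholder root SMILES; only the SOL fields come from the exercise text.
record = make_tdt("c1ccccc1", [("CO", "Methanol", 1.234)])
print(record, end="")
# $SMI<c1ccccc1>
# SOL<CO;Methanol;1.234>
# |
```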
The loaded data can then be examined with xvthor and xvmerlin.