A simple, generic, and highly extensible computational strategy has been developed on the premise that a compound's acute toxicity can be gauged from the toxicities of structurally similar compounds. Using a reference set of over 13,500 compounds with reported oral rat LD50 endpoint data, a generic utility that assigns a compound the average of the endpoint dose data of structurally similar reference-set compounds is shown to correlate well with reported values. In a leave-one-out simulation requiring that at least one structurally similar member of a "voting consortium" be present within the reference set, the strategy demonstrates a predictive correlation (q^2) of 0.82, with 57.3% of the compounds being predicted. A similar leave-one-out simulation on a set of 1781 drugs demonstrated a q^2 of 0.74, with 51.8% of the compounds being predicted.
Simulations to optimize similarity cut-off definitions, consortium member size, and reference set size indicate that further improvements in both the quality and quantity of predictions can be obtained by increasing the reference set size. Similar application of the strategy to subchronic and chronic study data should be possible, provided reasonably sized reference sets are available.
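The core of the strategy described above can be sketched in a few lines. The sketch below is illustrative only, not the authors' implementation: it assumes compounds are represented as structural-feature sets compared by Tanimoto similarity, and the function names, the similarity cut-off of 0.8, and the consortium size of 5 are hypothetical placeholders for the parameters the simulations optimize. A query compound is assigned the mean endpoint value of its most similar reference compounds (the "voting consortium"), and no prediction is made when no reference compound meets the similarity cut-off.

```python
from statistics import mean

def tanimoto(a: set, b: set) -> float:
    """Tanimoto (Jaccard) similarity between two structural-feature sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def predict_ld50(query_fp, reference, cutoff=0.8, consortium_size=5):
    """Predict an endpoint value (e.g. oral rat LD50) as the mean over the
    'voting consortium': the most similar reference compounds that meet the
    similarity cut-off. Returns None when no reference compound qualifies,
    i.e. the compound falls outside the strategy's domain of applicability."""
    # Rank all reference compounds by similarity to the query.
    scored = sorted(
        ((tanimoto(query_fp, fp), ld50) for fp, ld50 in reference),
        reverse=True,
    )
    # Keep at most consortium_size members, all above the cut-off.
    consortium = [ld50 for sim, ld50 in scored[:consortium_size] if sim >= cutoff]
    return mean(consortium) if consortium else None

# Hypothetical toy reference set: (feature set, LD50) pairs.
reference = [
    ({1, 2, 3, 4}, 100.0),
    ({1, 2, 3, 5}, 200.0),
    ({9, 10}, 50.0),
]
```

In a leave-one-out simulation, each reference compound in turn is removed from `reference` and predicted from the remainder; the fraction of non-`None` predictions gives the coverage figures reported above, and the returned means are correlated with the held-out values to compute q^2.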
It is argued that if industry participants contribute to universal, standardized toxicity reference sets, improved in silico toxicity assessment could become commonplace in early preclinical research, and compound failure rates due to toxic outcomes would decrease without compromising any participant's specific competitive advantage within the pharmaceutical industry.