Darko Butina, Joelle Gola
ArQule Ltd.
This paper describes the development of an aqueous solubility model based on solubility data from the Syracuse database, using calculated octanol/water partition coefficients and fifty-one 2D descriptors. Initially, two different statistical packages, SIMCA [8] and Cubist [11], were used. Cubist, which builds collections of rules in which each rule has an associated multiple linear regression (MLR) model, gave better overall results, with smaller average absolute errors (AAEs), than the partial least squares (PLS) method within SIMCA.
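To make the idea of "rules with an associated MLR model" and the AAE metric concrete, the following is a minimal Python sketch, not the Cubist implementation. The rule conditions, intercepts, and coefficients are hypothetical values chosen for illustration, the descriptor vector is assumed to hold a calculated logP followed by one other 2D descriptor, and averaging the predictions of all matching rules is an assumption about how overlapping rules are combined.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class Rule:
    """One rule: a condition on the descriptors plus a linear model."""
    condition: Callable[[Sequence[float]], bool]
    intercept: float
    coefficients: Sequence[float]

    def predict(self, x: Sequence[float]) -> float:
        # Linear model: intercept + sum of coefficient * descriptor value.
        return self.intercept + sum(c * v for c, v in zip(self.coefficients, x))


def predict(rules: Sequence[Rule], x: Sequence[float]) -> float:
    """Average the linear predictions of every rule whose condition holds.

    Averaging over matching rules is an assumption of this sketch.
    """
    matches = [r.predict(x) for r in rules if r.condition(x)]
    if not matches:
        raise ValueError("no rule covers this input")
    return sum(matches) / len(matches)


def average_absolute_error(predicted: Sequence[float],
                           observed: Sequence[float]) -> float:
    """AAE: mean of |predicted - observed| over all compounds."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(predicted)


# Hypothetical rules over x = (calculated logP, some other 2D descriptor).
RULES = [
    Rule(lambda x: x[0] <= 2.0, intercept=0.5, coefficients=(-1.0, 0.0)),
    Rule(lambda x: x[0] > 2.0, intercept=1.0, coefficients=(-1.25, 0.0)),
]

if __name__ == "__main__":
    compounds = [(1.0, 0.0), (4.0, 0.0)]   # made-up descriptor vectors
    observed = [-1.0, -3.0]                # made-up log solubilities
    predicted = [predict(RULES, x) for x in compounds]
    print(predicted, average_absolute_error(predicted, observed))
```

Each rule partitions descriptor space and applies its own linear fit there, so the overall model is piecewise linear; this is the property that can let a rule-based approach fit data that a single global linear model handles poorly.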
Presentation slides:
 Daylight Chemical Information Systems, Inc.