Euromug '96
  
 
Synopsis
  The project goal is to implement a medium-to-high performance chemical
  information server and use it to deploy large chemical databases
  on the Internet.
Status
  Design and initial negotiations with data vendors are underway.
  Human resources and computer infrastructure could be in place mid-1997.
  Databases and pre-production software could be in place by
  the end of 1997.
People
  
    - Dave Weininger (lead and system design)
    - Daylight employee to be announced later (interfaces and integration)
    - Jeremy Yang and Norah Shemultuskis (database design)
    - Jack Delany (reaction databases)
    - Yosef Taitz (business and legal)
  
Description
  The Mjollnir project was spawned by an analysis of the sociological
  aspects of scientific information exchange by Howard Winant about
  16 years ago (not only pre-Daylight, but pre-MedChem!).
  One of the conclusions of this analysis was that
  of the three sociologically-evolved methods of scientific information
  exchange (reviewed journal, library, forum), only the forum is suitable
  for use on a global network in a fully distributed manner.
  In an idealistic forum, participants must identify themselves,
  may freely (or at least equally) have access to all information,
  and may make contributions without prior review or censorship
  (including comments about other people's contributions).
  Most scientific forums take the form of conferences, but the forum has
  no requirement for immediacy: Usenet, for example, is also a forum.
  The Thor database system was designed with the forum concept in mind,
  although it is not (yet) used this way.
  Universal structure-based indexing, constant-time retrieval, and
  strict ID-data distinction make Thor particularly suitable for use
  in a chemical information forum.
  An early version of Mjollnir, which implements (only) universal access
  to low-performance data retrieval via e-mail, has been operational for
  several years.  It is primarily used by academics with poor access to
  information.  Although this system makes a lot of data available
  (the medchem, tsca, spresipreps, and wdi databases), its very low
  performance makes it immune to wholesale database dumping and limits
  its use to delivering data to desperately poor students and
  marketing databases to better-funded researchers (e.g., database demos).
  The current Mjollnir project is aimed at raising the system's performance
  to a more usable level and introducing some forum features.
  Key design features include:
  
  - Very high performance database server capable of handling
      tens to hundreds of Merlin search requests per minute,
      i.e., the combined educational database needs in the US.
  - Integrated interfaces which reduce per-user bandwidth to allow
      data delivery at typical Internet speeds.
  - Multiple large databases totalling over 7 million
      structures and reactions are on the table.
  - Free access.
  - Users' identities and search requests are available as data:
      - Users find out who else is interested in similar data (the carrot).
      - Not suitable for proprietary queries (the stick).
  - Allows users to examine databases before subscribing to them or
      buying them for high-performance, in-house use.
  - Restrictions against database dumping will be automatically enforced.
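
  The text does not say how restrictions against dumping would be
  enforced.  As a minimal sketch only, assuming every request carries the
  identity of the user (as the design requires) and the server knows how
  many records each response contains, one possibility is a sliding-window
  quota on records returned per user.  The names DumpGuard, max_records,
  and window_seconds are hypothetical and are not part of any Daylight
  software.

```python
import time
from collections import defaultdict, deque


class DumpGuard:
    """Hypothetical per-user sliding-window quota on records returned.

    Illustration only: the source does not describe the real mechanism.
    """

    def __init__(self, max_records, window_seconds):
        self.max_records = max_records        # records allowed per window
        self.window_seconds = window_seconds  # length of the window
        # user_id -> deque of (timestamp, n_records) for recent responses
        self._history = defaultdict(deque)

    def _used(self, user_id, now):
        """Drop entries older than the window and total what remains."""
        entries = self._history[user_id]
        while entries and now - entries[0][0] >= self.window_seconds:
            entries.popleft()
        return sum(n for _, n in entries)

    def allow(self, user_id, n_records, now=None):
        """Return True and record the delivery if it fits within the quota."""
        now = time.time() if now is None else now
        if self._used(user_id, now) + n_records > self.max_records:
            return False
        self._history[user_id].append((now, n_records))
        return True
```

  A server loop would call allow(user_id, len(hits)) before sending search
  results and refuse or truncate the response when it returns False.
  Because the quota is keyed on the user's identity, it relies on the same
  identification requirement as the forum features.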
  
  Given the realities of how most chemical information is collected
  (expensively) and disseminated (expensively and not that universally),
  the Mjollnir approach seems to be to everyone's benefit:
  
  - Academic chemists and chemistry students
    -- Academic chemists (and libraries) are being pinched by severe
       budget cuts in many colleges and universities.  Research monies
       which used to take up the slack are not so plentiful anymore.
       Mjollnir should provide such chemists with free access to data.
   
  - Industrial chemists
    -- Mjollnir will provide a mechanism for industrial chemists with
       Internet access to explore many sources of chemical information
       quickly and easily.
       For all the advances in modern chemical informatics, this is
       something which hasn't reached most bench chemists.
       It is unlikely that industrial chemists will be able to use the
       public Mjollnir server for their day-to-day work since it is not
       possible to ask proprietary questions.
       If the service is something they really need, the assumption is
       that they have the resources to obtain it (e.g., obtain the
       database for in-house use).
   
  - IS/IT specialists
    -- This system provides try-before-you-buy functionality for both
       software and databases.
       Current data acquisition decisions are often made for historical
       reasons or based on "blurbs".
       Mjollnir should allow better-informed decision-making at
       low cost in time and money.
   
  - Database vendors
    -- Publishing an up-to-date chemical information database is
       intrinsically expensive and any workable system must ensure that
       database vendors get a good return for their efforts.
       The keys to selling data are quality (data that people actually need)
       and volume (multiplies the effect of the effort).
       The Internet provides a very effective mechanism to reach potential
       customers and to allow them to become familiar with the product.
       In this context, "free academic use" is an advantage: most chemical
       databases are way too expensive for academics anyway (no real loss)
       and having students be trained to use a database as one of their
       tools is a great advantage in the long term.
       The assumption that those that can buy it are the same people as
       those with proprietary questions is arguable, but eminently testable.
       One nice assurance for the vendors is that the data is maintained
       in a central place with provisions against dumping -- if it seems
       that the system is being abused, they can "pull the plug".
   
  - Online services
    -- Once it proves itself, the Mjollnir server will be made available
       as a normal Daylight product.
       Our intention is that Mjollnir should provide a cost-effective
       delivery system for existing and would-be online services,
       whether public/free, commercial (charge by subscription or usage), 
       or private (in-house, secure).
       There don't appear to be existing products of this type available,
       especially for small services.
   
  - Daylight
    -- Mjollnir represents both a commercial product and a step in
       Daylight's mission to bring chemical informatics to all chemists.
       Since Daylight sells the databases, the underlying database servers
       and Mjollnir itself, it is expected to serve as a marketing tool.
       We hope that companies will say, "Gee, if you can deliver eight
       million structures across the Internet you can surely handle our
       few hundred thousand structures on our local network".
       The only way to build a reliable and high-performance system is
       to actually maintain one in active use: Mjollnir will give us the
       opportunity to do so in a big way.  It also serves as part of our
       continuing efforts to support chemical education, which is now in
       crisis.
  
  Hardware: We expect to start with a Sun 4000 Enterprise which Sun has
  loaned us for server development.
  Enterprise machines are great for servers such as ours:
  the architecture scales up to 30x500 MHz CPUs, 30 GB RAM, and 6 TB disk.
  They claim that the new asynchronous memory manager can keep up with all
  those CPU cycles (the 500 MHz CPUs aren't shipping yet, we'll see...).
  As a bonus, they have very robust (fault-tolerant) features.

  Physical environment: We are building an office with provisions for
  uninterruptible power and a T1 line which will serve as a stable
  environment for the service.
  It should be ready by mid-1997.

  Human resources: Two additional research office staff will come on board
  in early 1997:
  one primarily for research support,
  the other a network/Java-interface guru.
  Both will spend some of their time on this project.

  As always, we are looking for input and feedback from our users.
  If you have ideas about the role, design, or implementation of
  Mjollnir, let us know.  Soon!
     Daylight Chemical Information Systems, Inc.
    
    
    info@daylight.com