Chemical information in patents
Linda Clark
Derwent Information Ltd.
Patents - what and why?
Legal protection for invention in return for disclosure of new technology
Granted only for new, useful and "non-obvious" inventions
Patent documents include prior art, disclosure, examples and claims
Protect investment in research
Information source for patentability, "state of the art", competitive intelligence
Patent Volumes
From 40 patent-issuing authorities:
~20,000 patent applications per week
~10,000 new inventions per week
~25% contain some chemistry
Multiple languages
Different legal frameworks
Chemistry in Patents
Average chemical content per week in pharmaceutical patents
In the claims            1,619
Exemplified compounds    2,981
Disclosed compounds      2,467
Generics                    51
Synthetic methods          617
TOTAL                    7,735
(from a 3-week study in 1995)
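As a quick check on the weekly figures above, the following Python sketch recomputes the total and each category's share. The dictionary keys simply mirror the row labels; nothing here goes beyond the numbers quoted from the 1995 study.

```python
# Weekly counts quoted above (3-week study, 1995).
weekly_counts = {
    "In the claims": 1619,
    "Exemplified compounds": 2981,
    "Disclosed compounds": 2467,
    "Generics": 51,
    "Synthetic methods": 617,
}

total = sum(weekly_counts.values())

# Share of each category as a percentage of the weekly total.
shares = {name: 100 * count / total for name, count in weekly_counts.items()}

for name, pct in shares.items():
    print(f"{name:<24}{weekly_counts[name]:>6}  {pct:5.1f}%")
print(f"{'TOTAL':<24}{total:>6}")
```

Generics are by far the smallest category by count, but each generic can stand for many specific compounds, so the count alone understates their share of the searching problem.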
Chemistry in Patents
Widest possible claim (excluding prior art)
Generic concepts
"Markush" structures
Legal vs. chemical definitions
Typical patent claim
Patent claims often hierarchical
use of antibiotics
use of antibiotics of Formula 1 (Markush)
use of specific antibiotics
often a Markush structure plus one or more specific structures
allows scope for restriction of claims if necessary
Markush structures
Real Examples
indefinite - e.g. aryl, heterocycle containing N and optionally containing O or S
non structural - e.g. electron withdrawing group
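To make the idea of a Markush structure concrete, here is a minimal Python sketch that treats one as a fixed scaffold with lists of alternatives at each variable position and enumerates the specific compounds it covers. The scaffold, the R-group lists and the SMILES-like strings are hypothetical illustrations, not taken from any patent. The sketch only works when every alternative is a definite structure; indefinite terms such as "aryl" or non-structural terms such as "electron withdrawing group" cannot be enumerated this way, which is why such claims are hard to search.

```python
from itertools import product

# Hypothetical toy Markush structure (illustrative only).
SCAFFOLD = "c1ccc({R1})cc1C(=O){R2}"
R_GROUPS = {
    "R1": ["F", "Cl", "Br"],   # e.g. "R1 is halogen"
    "R2": ["O", "N", "OC"],    # e.g. "R2 is hydroxy, amino or methoxy"
}

def enumerate_markush(scaffold, r_groups):
    """Yield every specific structure covered by the generic one."""
    names = sorted(r_groups)
    for combo in product(*(r_groups[name] for name in names)):
        yield scaffold.format(**dict(zip(names, combo)))

def count_members(r_groups):
    """Number of specific structures, without enumerating them."""
    total = 1
    for options in r_groups.values():
        total *= len(options)
    return total

specifics = list(enumerate_markush(SCAFFOLD, R_GROUPS))
print(count_members(R_GROUPS), "specific structures")
for smiles in specifics:
    print(smiles)
```

The count grows multiplicatively with each variable position, so even a modest real claim can cover far more structures than could ever be listed individually.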
Searching Markush structures
Fragmentation codes
e.g. Derwent's CPI code, IDC GREMAS code
Drawbacks
no relationship between fragments is recorded
imprecise (many false drops)
not simple to apply or to search
Benefits
large backfile (to the 1960s in DWPI)
"Fuzzy" searching
can handle any structure (specific or generic)
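The drawbacks and benefits listed above can be seen in a small Python sketch of fragmentation-code searching. Each document is indexed as an unordered set of codes and a query matches when all of its codes are present. The code names and the three documents are invented for illustration; they are not real Derwent CPI or GREMAS codes.

```python
# Hypothetical fragment codes (NOT real CPI or GREMAS codes).
# doc-A: chlorine on the benzene ring, hydroxy on the alkyl chain.
# doc-B: hydroxy on the benzene ring, chlorine on the alkyl chain.
# Both receive the same codes because connectivity is not recorded.
INDEX = {
    "doc-A": {"RING_BENZENE", "HALOGEN_CL", "HYDROXY", "CHAIN_ALKYL"},
    "doc-B": {"RING_BENZENE", "HALOGEN_CL", "HYDROXY", "CHAIN_ALKYL"},
    "doc-C": {"RING_PYRIDINE", "HALOGEN_CL"},
}

def fragment_search(index, query_codes):
    """Return documents whose code set contains every query code."""
    return sorted(doc for doc, codes in index.items() if query_codes <= codes)

# The searcher wants a chlorobenzene carrying a hydroxyalkyl chain (doc-A),
# but can only ask for the presence of the individual fragments.
hits = fragment_search(INDEX, {"RING_BENZENE", "HALOGEN_CL", "HYDROXY"})
print(hits)
```

Here doc-B is a false drop: it contains all the requested fragments, but not in the arrangement the searcher wanted. The same looseness is what lets one set of codes describe a specific or a generic structure equally well.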
The Introduction of Graphics
Aims
greater precision
more user-friendly
developed in the mid-1980s (although the earliest predecessors date from 1958)
Sheffield research project
GENSAL, Markush DARC, MARPAT
three-stage search: fragment screen, reduced graph match, atom-by-atom search
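The staged search can be sketched in Python as a funnel in which each stage is cheaper and less precise than the next. This is a toy illustration under stated assumptions, not the algorithm of GENSAL, Markush DARC or MARPAT: molecules are specific (not generic), hydrogens are omitted, the reduced graph is a hand-written count of "ring" and "chain" nodes, and the atom-by-atom stage is a brute-force match that is only practical for tiny graphs. All function names are hypothetical.

```python
from collections import Counter
from itertools import permutations

def mol(atoms, bonds, reduced):
    """Toy molecule: element list, bond pairs, reduced-graph node counts."""
    return {
        "atoms": atoms,
        "bonds": {frozenset(b) for b in bonds},
        "reduced": Counter(reduced),
    }

def fragment_screen(query, target):
    """Stage 1: target must hold at least the query's element counts."""
    have = Counter(target["atoms"])
    return all(have[el] >= n for el, n in Counter(query["atoms"]).items())

def reduced_graph_match(query, target):
    """Stage 2: target must hold at least the query's ring/chain nodes."""
    return all(target["reduced"][k] >= n for k, n in query["reduced"].items())

def atom_by_atom(query, target):
    """Stage 3: brute-force substructure match on atoms and bonds."""
    q_atoms, t_atoms = query["atoms"], target["atoms"]
    for mapping in permutations(range(len(t_atoms)), len(q_atoms)):
        if any(q_atoms[i] != t_atoms[j] for i, j in enumerate(mapping)):
            continue
        if all(frozenset(mapping[a] for a in bond) in target["bonds"]
               for bond in query["bonds"]):
            return True
    return False

def three_stage_search(query, database):
    """Return the survivors of each stage, in database order."""
    stage1 = [n for n, m in database.items() if fragment_screen(query, m)]
    stage2 = [n for n in stage1 if reduced_graph_match(query, database[n])]
    stage3 = [n for n in stage2 if atom_by_atom(query, database[n])]
    return stage1, stage2, stage3

# Query: a three-membered carbon ring.
QUERY = mol(["C"] * 3, [(0, 1), (1, 2), (0, 2)], ["ring"])
DATABASE = {
    "ethane": mol(["C"] * 2, [(0, 1)], ["chain"]),
    "propane": mol(["C"] * 3, [(0, 1), (1, 2)], ["chain"]),
    "cyclobutane": mol(["C"] * 4, [(0, 1), (1, 2), (2, 3), (0, 3)], ["ring"]),
    "methylcyclopropane": mol(
        ["C"] * 4, [(0, 1), (1, 2), (0, 2), (2, 3)], ["ring", "chain"]),
}

stage1, stage2, stage3 = three_stage_search(QUERY, DATABASE)
print(stage1, stage2, stage3, sep="\n")
```

Each stage removes candidates that cannot match, so the expensive atom-by-atom comparison runs only on the few structures that survive the two screens.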
Commercial Markush systems
Markush DARC and MARPAT launched in 1987/88
further enhancements were required, e.g. to solve problems of segmentation and of translation between generic and specific substituents
Access through online hosts
Markush DARC
available on Questel
Databases:
DWPIM - Derwent Information
MPharm - INPI
Data from 1987 to date
Approx 1,400 structures per week
Cross-file results to the WPI file to retrieve documents
MARPAT
available on STN
MARPAT - from Chemical Abstracts
Data from 1988 to date
Approx 800 structures per week
Database contains structures plus displayable document information
A complete search requires searching both MARPAT and the Registry File
Current Graphical systems
Benefits
powerful search systems
easy to use?
Drawbacks
query specification
system limits often encountered
cost
A step forward, but not a complete solution?
Recent developments...
Technology improvements
Speed and capacity
The Internet
Web browsers, search engines (e.g. AltaVista), ranked results
Combinatorial chemistry
storing/searching libraries
analysis of libraries (pre and post synthesis)
Markush techniques applied to Combinatorial chemistry
The future?
Apply benefits of new technology
web interface
benefits of graphical input
"fuzzy" searching
ranked answers
analysis, data mining in addition to patentability searches
application to all types of chemistry (specific, Markush, sequences…)
use by non-specialists
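One way to picture "fuzzy" searching with ranked answers is similarity ranking over structural features. The Python sketch below scores each document by the Tanimoto coefficient between its feature set and the query's, then sorts the hits. The Tanimoto coefficient is a common choice swapped in here for illustration; the slides do not name a particular measure, and the feature names and documents are hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto coefficient of two feature sets: |A & B| / |A | B|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def ranked_search(query, index, threshold=0.0):
    """Return (score, document) pairs at or above threshold, best first."""
    scored = [(tanimoto(query, feats), doc) for doc, feats in index.items()]
    hits = [(score, doc) for score, doc in scored if score >= threshold]
    return sorted(hits, key=lambda hit: (-hit[0], hit[1]))

# Hypothetical feature sets, one per document.
INDEX = {
    "patent-1": {"benzene", "Cl", "OH", "amide"},
    "patent-2": {"benzene", "Cl"},
    "patent-3": {"pyridine", "Cl", "ester"},
    "patent-4": {"benzene", "Cl", "OH"},
}

query = {"benzene", "Cl", "OH"}
ranking = ranked_search(query, INDEX)
for score, doc in ranking:
    print(f"{doc}  {score:.2f}")
```

Unlike an exact match, every document receives a score, so a non-specialist can read down the list from the closest answers and stop where relevance tails off.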
Useful questions
Current questions
Will using or patenting this compound infringe somebody else’s patent?
New questions
Has anything like this compound been patented?
Is any component of this combinatorial library patented?
How much does my combinatorial library (CL) overlap with "patent space"?
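The overlap question can be phrased as a set intersection once both the library and a patent's generic claim are expressed as sets of specific structures. The Python sketch below does this for a toy case. The scaffold, the substituent lists and the helper names are hypothetical, and comparing strings only works here because both sets come from the same template; a real system would need a canonical structure representation and could not enumerate indefinite generic terms at all.

```python
from itertools import product

def enumerate_members(template, substituents):
    """Expand a scaffold template into the set of specific structures."""
    names = sorted(substituents)
    return {
        template.format(**dict(zip(names, combo)))
        for combo in product(*(substituents[name] for name in names))
    }

def overlap_fraction(library, patent_space):
    """Fraction of library members that fall inside the patent space."""
    if not library:
        return 0.0
    return len(library & patent_space) / len(library)

TEMPLATE = "c1ccc({R1})cc1C(=O){R2}"  # hypothetical shared scaffold

# Hypothetical patent claim: R1 = F or Cl, R2 = hydroxy or amino.
PATENT_SPACE = enumerate_members(TEMPLATE, {"R1": ["F", "Cl"], "R2": ["O", "N"]})

# Hypothetical combinatorial library built on the same scaffold.
LIBRARY = enumerate_members(
    TEMPLATE, {"R1": ["Cl", "Br", "I"], "R2": ["N", "OC"]})

covered = LIBRARY & PATENT_SPACE
print(sorted(covered))
print(f"{overlap_fraction(LIBRARY, PATENT_SPACE):.1%} of the library is covered")
```

The same two sets answer both new questions: a non-empty intersection means some component of the library is patented, and the fraction measures how much of the library lies in that patent's space.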
How?
Can the Daylight tools and approach contribute to these goals?
Can the Daylight user community help?