Toxicity Prediction and Risk Assessment in Europe – and elsewhere Johann Gasteiger Computer-Chemie-Centrum University of Erlangen-Nuremberg and Molecular Networks GmbH Henkestraße 91 91052 Erlangen, Germany www.molecular-networks.com
Outline Legislation in Europe REACH Cosmetics Directive
Projects funded by the European Union Modeling of toxicity Prediction of metabolism Risk assessment workflow
Risk Assessment of Chemicals
REACH – Registration, Evaluation, Authorization and restriction of CHemicals
Chemicals used in quantities of more than 1 ton/year may be manufactured in or imported into the European Union only if they are registered
Registration requires a dossier with extensive data and may require a safety report
Law since June 1, 2007
Chemicals have to be accepted by Dec 1, 2013
Applies to about 35,000 chemicals
REACH Dossier
For compounds used in more than 10 t/a, a Chemical Safety Report is needed:
Harmful effects on human health
Harmful effects on the environment
Determination of Persistence, Bioaccumulation and Toxicity (PBT)
Evaluation of exposure
Testing is time-consuming, expensive and might need many animals
Use chemoinformatics methods for ranking of chemicals
European Cosmetics Directive
Puts an end to animal testing in cosmetics products
Seventh amendment on Feb 27, 2003
By March 11, 2009: prohibits ingredients of cosmetics products that have been the subject of animal testing (except for repeated-dose toxicity, reproductive toxicity and toxicokinetics)
By March 11, 2013: prohibits ingredients that have been tested for any toxicity with animals
Use chemoinformatics methods for toxicity prediction
Funding of Chemoinformatics Methods for Toxicity Prediction by European Union
CAESAR, DEMETRA, SCARLET, CHEMPREDICT, CASCADE, OPENMOLGRID, ORCHESTRA, VEGA, OSIRIS, OpenTox, CADASTER, HEROIC, NOMIRACLE, ...
Funding of Computational Toxicity and Risk Assessment Over the last ten years many projects have been funded by the European Union (EU) (CAESAR, OSIRIS, etc.) Selected projects funded by the EU OpenTox eTOX (EU-IMI)
OECD QSAR Toolbox Projects funded by EU and Colipa Projects funded by CEFIC (the Federation of European Chemical Industries) LRI (Long-range Research Initiative)
OpenTox Project
Funded by EU 7th Framework Programme (2008-2011)
Consortium of 6 academic groups and 5 SMEs from Switzerland, Germany, Italy, Bulgaria, Russia and India
Objectives:
Develop interoperable predictive toxicology framework with multiple web services
May be used as an enabling platform for the creation of toxicology applications
www.opentox.org
B. Hardy et al. J.Cheminform. 2010, 2, 7
OpenTox Project Largely academic project Future maintenance not secured (ended Dec 2011)
OECD QSAR Toolbox
Funded by ECHA (European Chemicals Agency, Helsinki, Finland)
Developed by Laboratory of Mathematical Chemistry, Bourgas, Bulgaria (Prof. Ovanes Mekenyan)
Objectives
To be used by governments, chemical industry and other stakeholders
Filling gaps in (eco)toxicity data needed for assessing the hazard of chemicals
OECD QSAR Toolbox
Features
Information and tools from various sources
Logical workflow
Crucial: grouping of chemicals into chemical categories
Identification of relevant structural characteristics and potential mechanisms or mode of action of a chemical
Identification of other chemicals that have the same structural characteristics
Use of experimental data to fill the data gap(s)
www.oecd.org/document/54/0,3746,en_2649_34379_42923638_1_1_1,00.html
OECD QSAR Toolbox
Two-phase project
Phase 1 (feasibility study) ended April 2008
Phase 2 since Oct 2008
Version 2.3 can be downloaded free of charge
Version 3.0 will be available Oct 2012
Comments
Methods are largely kept confidential
Downloading of data unclear (i.e., not possible)
etox Google: etox
etox www.etox.com.tr
E: ERTEX Oto Dekorasyon (ERTEX Auto Decoration) T: Turkish O: Automotive X: representing the unknown infinite
The eTOX Project
An IMI-JU project (Innovative Medicines Initiative – Joint Undertaking)
Jointly funded by EU (European Union) and EFPIA (European Federation of Pharmaceutical Industries and Associations)
Project kick-off: January 2010
Duration: 5 years
Total budget: 13.0 M€
In kind contribution from EFPIA companies: 7.0 M€
IMI-JU funding: 4.7 M€
eTOX Project
Full Title: Integrating bioinformatics and chemoinformatics approaches for the development of expert systems allowing the in silico prediction of toxicities
Objectives
Develop a drug safety database from pharmaceutical industry legacy toxicology reports and public toxicology data
Develop multi-scale and multi-level modelling techniques
Develop an integrated system providing access to the data and models
Scientific Approach of the eTOX Project
EFPIA Partners
Novartis Pharma, Bayer Schering Pharma, AstraZeneca, Boehringer Ingelheim, Esteve, GlaxoSmithKline, Janssen Pharmaceutica, Lundbeck, Pfizer, Hoffmann-La Roche, UCB Pharma, Sanofi-Aventis, Servier
Contributions of EFPIA Partners
1. Provision of high quality data (mostly GLP) from systemic toxicity studies (different species), DMPK studies, and safety pharmacology studies (including ligand binding)
2. Data extraction from legacy reports (in-house or with external help)
3. Profound toxicological expertise
4. Expertise in the field of QSAR, pharmacophore modelling and ontologies
5. Contacts to regulatory authorities (including interfacing regulatory databases)
Academic Partners and SMEs
Academic partners:
Fundació IMIM (E)
Centro Nacional de Investigaciones Oncológicas (E)
European Bioinformatics Institute (EMBL) (UK)
Liverpool John Moores University (UK)
Technical University of Denmark (DK)
Universität Wien (A)
Vrije Universiteit Amsterdam (VUA) (NL)
SMEs:
Inte:Ligand GmbH (A)
Lhasa Ltd (UK)
Molecular Networks GmbH (D)
Chemotargets SL (E)
Lead Molecular Design SL (E)
Contributions of Academics and SMEs
1. Expertise in bioinformatics and chemoinformatics, including software development in these fields
2. Database development and hosting
3. Development of QSAR models and expert systems
4. DMPK modelling (CYP, transporters)
5. Validation methodology for predictive systems
COSMOS Project
One of 7 projects of the SEURAT cluster (Safety Evaluation Ultimately Replacing Animal Testing)
Funded by EU and COLIPA (The European Cosmetics Association): 3.3 + 3.3 M€
Participants: 3 academic, 5 institutes, 5 SMEs, 2 companies
Started Jan 2011
COSMOS Project
Title: Integrated in silico Models for the Prediction of Human Repeated Dose Toxicity of COSmetics to Optimize Safety
Objectives
Developing methods for determining the safety of cosmetic ingredients for humans, without the use of animals, using computational models
www.cosmostox.eu
Handbook of Chemoinformatics From Data to Knowledge J. Gasteiger (Editor)
65 authors 73 contributions 4 volumes 1900 pages Wiley-VCH, Weinheim (August 2003)
Chemoinformatics - A Textbook J. Gasteiger, T. Engel (Editors) 650 pages Wiley-VCH, Weinheim (September 2003)
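Quantitative Structure Activity/Property Relationships
[Diagram: molecular structure → (representation) → structure descriptors → (model building) → property; the direct route from molecular structure to property is marked as interrupted (//)]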
Structure Representation
[Figure: one molecule shown at successive levels of representation: constitution, 3D model, molecular surface]
J. Gasteiger, Of Molecules and Humans, J. Med. Chem., 2006, 49, 6429-6434
Fingerprints …0100100110001101010011…
Molecules and Humans
J. Gasteiger, Of Molecules and Humans, J. Med. Chem., 2006, 49, 6429-6434
Structure Representation - Geometry
Constitution → 3D model (CORINA): 250,000 structures, 99.8% conversion rate, 0.02 s/molecule
3D model → Molecular surface (SURFACE): Connolly surface, van der Waals surface
Structure Representation Physicochemistry Charge distribution J. Gasteiger, M. Marsili, Tetrahedron 36, 3219 (1980)
Inductive effect J. Gasteiger, M. G. Hutchings, Tetrah. Lett. 24, 2541 (1983)
Resonance effect J. Gasteiger, H. Saller, Angew. Chem. Int. Ed. Engl. 24, 687 (1985)
Polarizability effect J. Gasteiger, M. G. Hutchings, J. Chem. Soc. Perkin 2, 559 (1984)
Hierarchy of Structure Representations: ADRIANA.Code
Global molecular properties: # H acceptors & donors, molecular weight, TPSA, dipole moment, polarizability, logP, logS
Constitution (topological, 2D): 2D autocorrelation; atom properties: q, χ, α
3D model: 3D autocorrelation, radial distribution functions; atom properties: q, χ, α
Molecular surface: autocorrelation of surface properties MEP, HBP, HPP
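The 2D autocorrelation descriptor named on this slide can be illustrated with a minimal sketch. It assumes the common Moreau-Broto form, A(d) = sum of p_i * p_j over all atom pairs separated by d bonds. The slides do not give the exact ADRIANA.Code definition, so the function names, the pair-counting convention and the example property values are all illustrative assumptions.

```python
from collections import deque


def topological_distances(adjacency):
    """All-pairs bond counts, found by breadth-first search from each atom."""
    n = len(adjacency)
    dist = [[None] * n for _ in range(n)]
    for start in range(n):
        dist[start][start] = 0
        queue = deque([start])
        while queue:
            atom = queue.popleft()
            for neighbor in adjacency[atom]:
                if dist[start][neighbor] is None:
                    dist[start][neighbor] = dist[start][atom] + 1
                    queue.append(neighbor)
    return dist


def autocorrelation_2d(adjacency, atom_property, max_distance):
    """A(d) = sum of p_i * p_j over atom pairs (i <= j) that are d bonds apart.

    A(0) is the sum of squared atom properties. Assumed convention: each
    unordered pair is counted once.
    """
    dist = topological_distances(adjacency)
    n = len(adjacency)
    vector = [0.0] * (max_distance + 1)
    for i in range(n):
        for j in range(i, n):
            d = dist[i][j]
            if d is not None and d <= max_distance:
                vector[d] += atom_property[i] * atom_property[j]
    return vector


# Hypothetical example: heavy-atom skeleton of ethanol, C-C-O.
ethanol_bonds = {0: [1], 1: [0, 2], 2: [1]}
ethanol_charges = [-0.04, 0.04, -0.40]  # placeholder values, not computed charges
print(autocorrelation_2d(ethanol_bonds, ethanol_charges, max_distance=2))
```

The result is a fixed-length vector regardless of molecule size, which is what makes such descriptors usable as input to the data analysis methods listed next.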
Methods for Data Analysis Inductive learning methods Machine learning Data mining Statistics Pattern recognition Chemometrics Neural networks Support vector machine
MOlecular Structure Encoding System C++ based Chemoinformatics toolkit high performance available for many platforms (Windows, Linux, Unix)
Python interface provides easy access to the full functionality of MOSES ideally suited for the development of client / server solutions
under active development since 2001 Computer-Chemie-Centrum, Universität Erlangen-Nürnberg Molecular Networks GmbH
300,000 lines of code well documented and tested - CONFIDENTIAL -
Modeling toxicity of chemicals
Modeling of Toxicity Different data analysis methods S.Spycher, M.Nendza, J.Gasteiger, QSAR Comb. Sci., 2004, 23, 779-791
Representation of chemical structures S.Spycher, E.Pellegrini, J.Gasteiger, J.Chem.Inf.Model., 2005, 45, 200-208
Considering toxicological mechanism S.Spycher, B.Escher, J.Gasteiger, Chem.Res.Toxicol., 2005,18, 1857-1867
Baseline Toxicity
[Plot: toxicity, log(1/LC50), versus lipophilicity, log P]
LC50 (fish species Pimephales promelas) of a series of aliphatic compounds versus lipophilicity (log P)
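The baseline relationship between toxicity and lipophilicity can be sketched as an ordinary least-squares line, log(1/LC50) = a * log P + b. This is a minimal sketch only: the data points below are invented placeholders, not the values from the plot, and `fit_line` and `predict_baseline` are hypothetical helper names.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_baseline(log_p, slope, intercept):
    """Baseline toxicity, log(1/LC50), predicted from lipophilicity."""
    return slope * log_p + intercept


# Placeholder data (NOT from the slide): log P and log(1/LC50) pairs.
log_p_values = [0.0, 1.0, 2.0, 3.0, 4.0]
toxicity_values = [-1.4, -0.6, 0.3, 1.1, 1.9]

slope, intercept = fit_line(log_p_values, toxicity_values)
print(f"log(1/LC50) = {slope:.2f} * log P + {intercept:.2f}")
print(predict_baseline(2.5, slope, intercept))
```

A compound whose measured toxicity lies well above the fitted line is a candidate for a more specific mode of action than baseline toxicity.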
The Larger Picture
[Plot: log(1/LC50) versus log P for compound classes with different modes of action: baseline toxicants (nonpolar), baseline toxicants (polar), uncouplers, inhibitors of photosynthesis, inhibitors of AChE, SH-alkylating agents, reactives, estrogenic compounds]
Prediction of Toxicity
Global QSAR models are of limited predictive power because of different toxic modes of action (MOA)
First classify compounds according to toxic MOA
Then develop a local QSAR model for this MOA
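The two-step strategy (classify the MOA first, then apply a local QSAR model for that MOA) can be sketched as follows. Everything specific here is an assumption made for illustration: the pKa threshold rule stands in for a trained classifier, and the slope/intercept pairs are placeholder coefficients, not published models.

```python
from typing import Callable, Dict, Tuple

Descriptors = Dict[str, float]

# Placeholder local models: MOA -> (slope, intercept) of a log P based QSAR.
LOCAL_MODELS: Dict[str, Tuple[float, float]] = {
    "polar narcotic": (0.6, 0.5),
    "uncoupler": (0.4, 2.0),
}


def toy_classifier(descriptors: Descriptors) -> str:
    """Placeholder rule standing in for a trained MOA classifier."""
    return "uncoupler" if descriptors["pka"] < 6.5 else "polar narcotic"


def predict_toxicity(
    descriptors: Descriptors,
    classify_moa: Callable[[Descriptors], str],
    local_models: Dict[str, Tuple[float, float]],
) -> Tuple[str, float]:
    """Step 1: assign the MOA. Step 2: apply the local model for that MOA."""
    moa = classify_moa(descriptors)
    slope, intercept = local_models[moa]
    return moa, slope * descriptors["log_p"] + intercept


# Hypothetical compound described by two descriptors.
print(predict_toxicity({"log_p": 2.0, "pka": 9.0}, toy_classifier, LOCAL_MODELS))
```

Keeping the classifier and the local models as separate arguments means either step can be replaced without touching the other.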
Why Prediction of Toxic Mode of Action (MOA)?
Most QSARs in toxicology focus on a certain class of compounds
However, two chlorinated phenols [structures not reproduced], one a polar narcotic and the other an uncoupler of oxidative phosphorylation, require different QSAR equations.
Dataset: MOA of Phenols
1. polar narcotics (156 cpds)
2. uncouplers of oxidative phosphorylation (19 cpds)
3. precursors to soft electrophiles (24 cpds)
4. soft electrophiles (22 cpds)
Total: 221 cpds
A.O.Aptula, T.I.Netzeva, I.V.Valkova, M.T.D.Cronin, T.W.D.Schultz, R.Kühne, G.Schüürmann, Quant. Struct.-Act. Relat. 2002, 21, 12-22.
S.Spycher, E.Pellegrini, J.Gasteiger, J. Chem. Inf. Model., 2005, 45, 200-208
Counterpropagation Network Models for Classification of MOA
Estimate of predictive power with 5-fold cross-validation:

Descriptor set                       Dimensions    Predictive power
RDF(α, q)                            2x32          77.4%
RDF(χσ)                              32            85.5%
RDF(χLP, χσ)                         2x32          85.1%
NHdonor, RDF(χLP, χσ)                1 + 2x32      88.7%
RDF(χLP, χσ), HBP surface AC         2x8 + 12      95.9%

S.Spycher, E.Pellegrini, J.Gasteiger, J. Chem. Inf. Model., 2005, 45, 200-208
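The 5-fold cross-validation used to estimate predictive power can be sketched as below. This is not the counterpropagation network from the cited work: a simple nearest-centroid classifier is swapped in as a stand-in so that the example stays short and self-contained, and all function names and data are hypothetical.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and deal them into k disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[fold::k] for fold in range(k)]


def nearest_centroid_fit(rows, labels):
    """Stand-in model: the mean descriptor vector of each class."""
    sums, counts = {}, {}
    for row, label in zip(rows, labels):
        acc = sums.setdefault(label, [0.0] * len(row))
        for i, value in enumerate(row):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def nearest_centroid_predict(centroids, row):
    """Assign the class whose centroid is closest (squared Euclidean distance)."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(row, centroid))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))


def cross_validated_accuracy(rows, labels, k=5, seed=0):
    """Each fold is held out once; the model is refit on the other k-1 folds."""
    correct = 0
    for held_out in k_fold_indices(len(rows), k, seed):
        held = set(held_out)
        train = [i for i in range(len(rows)) if i not in held]
        centroids = nearest_centroid_fit(
            [rows[i] for i in train], [labels[i] for i in train]
        )
        correct += sum(
            nearest_centroid_predict(centroids, rows[i]) == labels[i]
            for i in held_out
        )
    return correct / len(rows)


# Hypothetical, well-separated one-dimensional data for two classes.
rows = [[0.01 * i] for i in range(10)] + [[10.0 + 0.01 * i] for i in range(10)]
labels = ["polar narcotic"] * 10 + ["uncoupler"] * 10
print(cross_validated_accuracy(rows, labels, k=5))
```

Because every compound is predicted exactly once by a model that never saw it, the resulting fraction is an estimate of predictive power rather than of fit quality.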
Classification in 5-fold Crossvalidation
[Structures not reproduced: two chlorinated phenols, one a polar narcotic and the other an uncoupler of oxidative phosphorylation]
Correct classification!
Metabolism of Xenobiotics Drugs, agrochemicals, food additives
Oxidations by Cytochrome P450
Aromatic hydroxylation
Aliphatic hydroxylation
Epoxidation
N, O, S-dealkylation, oxidative deamination
N,S-oxidation
Development of MOSES.Metabolism Modeling different Selectivities
Selectivity between different cytochrome P450 isozymes in particular 3A4, 2C9, 2C19, 2D6, 1A2
Selectivity between different reaction types chemoselectivity
Selectivity between different reaction sites regioselectivity
Data Set of 3A4, 2D6, and 2C9 Substrates
Training set: 146 drugs, substrates of 3A4, 2D6 or 2C9*; major isoform specified
Examples: Bufuralol, Tramadol, Felodipine [structures not reproduced]
*Manga, N. et al. SAR and QSAR in Env. Res. 2005, 16, 43-61.
Support Vector Machine (SVM) Model
Training set: 146 drugs
Descriptors: 242 descriptors by ADRIANA.Code
Automatic variable selection: 12 components
2D-ACidentity(5), 2D-ACq(3), 2D-ACq(6), 2D-ACc(5), 2D-ACqs(1), 2D-ACqs(2), 2D-ACcs(6), 3D-ACidentity([5.8-5.9[Å), nacid_groups, naliphatic_amino, nbasic_n, r3
Predictability:
Training: 90.4%
5-fold CV: 87.8%
Validation of the Support Vector Machine Model
External validation set: 233 substrates from the Metabolite database
Predictability: 82.8%
Remember: some drugs are metabolized by several isoforms
L. Terfloth, B. Bienfait, J. Gasteiger, J. Chem. Inf. Model. 2007, 47, 1688-1710
isoCYP Webservice
Prediction of major metabolizing CYP450 isoform (2D6, 3A4, 2C9)
http://www.molecular-networks.com/online_services
L. Terfloth, B. Bienfait, J. Gasteiger, J. Chem. Inf. Model. 2007, 47, 1688-1710
A Data-Driven Approach to Metabolism Prediction
Extract reaction types from a metabolic reaction database (Metabolite by MDL/Symyx/Accelrys)
For each reaction type, compute the ratio of the number of observed reactions to the number of conceivable reactions
Use this ratio for assigning a likelihood to a reaction type
L.Ridder, M.Wagener, ChemMedChem, 2008, 3, 821-832
Derivation of a Rule Base for Metabolite Prediction
Define reaction rules, e.g. for an acetylation: R-NH2 → R-NH-C(=O)-CH3
Calculate reaction probabilities based on a reaction database (Metabolite, MDL-Symyx)
Conceivable metabolites: 1223
Observed metabolites: 122
Probability: 122/1223 = 0.10
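The probability assigned to a reaction rule is simply the ratio of observed to conceivable metabolites, which can be sketched in a few lines. The acetylation counts are the ones given on the slide. The second rule and its counts are invented purely to show how rules could be ranked, and the function names are hypothetical.

```python
def reaction_probability(observed, conceivable):
    """Fraction of conceivable metabolites that were actually observed."""
    if conceivable == 0:
        return 0.0  # a rule that never matched carries no evidence
    return observed / conceivable


def rank_rules(rule_counts):
    """Order rule names by decreasing observed/conceivable ratio."""
    return sorted(
        rule_counts,
        key=lambda name: reaction_probability(*rule_counts[name]),
        reverse=True,
    )


rule_counts = {
    "acetylation": (122, 1223),       # counts from the slide
    "hypothetical_rule": (50, 100),   # invented counts, for illustration only
}

for name in rank_rules(rule_counts):
    observed, conceivable = rule_counts[name]
    print(name, round(reaction_probability(observed, conceivable), 2))
```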
MOSES.Metabolism Reaction Rules
117 reaction rules
Reaction types covered:
Aromatic hydroxylation
Aliphatic hydroxylation
N- and O-dealkylation
Hydrolysis (esters, amides)
Conjugation reactions (glucuronidation, sulphation, glycination, acetylation)
Oxidation reactions (alcohols, aldehydes, etc.)
Empirical score for the likelihood of a reaction based on literature data
Observed and Predicted Metabolites of Lumiracoxib
[Scheme: Lumiracoxib and its three observed metabolites with their predicted ranks: precursor of 5-carboxy derivative (Rank 1), glucuronidation (Rank 2), 4'-hydroxy derivative (Rank 4)]
The 3 observed metabolites are high in the ranking position (1, 2, 4)
In silico Toxicity and Metabolism Prediction in the Risk Assessment Workflow Using the Chemoinformatics Platform
Workflow of Risk Assessment
[Workflow diagram. Stages: Chemical Speciation, Collection of Data, Categorization, Prediction, PBT Assessment (Persistence, Bioaccumulation, Toxicity). Bullet groups on the slide: query, representation; reactivity, degradation, metabolism; phys-chem properties, toxicity, biological assays; get data, read-across, QSAR prediction; biodegradation, eco-toxicity, human health]
Slide courtesy Dr. Chihae Yang
The CERES System Chemical Evaluation and Risk Estimation System Developed at FDA/CFSAN (Food and Drug Administration – Center for Food Safety and Applied Nutrition) in cooperation with Molecular Networks Proof of Concept: July 2009
Release 1: Sept 2011
Report generation
• Analog search • TTC analysis • Toxicity prediction • Metabolism prediction
Analog searching
Similarity criteria – MDL fingerprints
Specify cut off
Run search
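An analog search with a similarity cut-off can be sketched as follows. The slides only state that MDL fingerprints are used as the similarity criterion. The Tanimoto coefficient is assumed here because it is a common choice for bit-string fingerprints, and the fingerprints, compound names and cut-off are invented placeholders rather than output of any real system.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit positions."""
    union = len(bits_a | bits_b)
    return len(bits_a & bits_b) / union if union else 0.0


def find_analogs(query_bits, database, cutoff):
    """Return (name, similarity) pairs at or above the cut-off, most similar first."""
    scored = [(name, tanimoto(query_bits, bits)) for name, bits in database.items()]
    hits = [(name, score) for name, score in scored if score >= cutoff]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical fingerprints: each set lists the positions of the bits that are on.
database = {
    "compound_a": {1, 2, 3},
    "compound_b": {2, 3, 4},
    "compound_c": {7},
}
query = {1, 2, 3}

for name, score in find_analogs(query, database, cutoff=0.5):
    print(name, score)
```

Raising the cut-off returns fewer but closer analogs, lowering it widens the set of candidates for read-across.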
List with analogs
Analogs with data
Toxicity prediction
Metabolism prediction
List with metabolites
Toxicity prediction for query compound and all metabolites
Summary
We can learn from data about the relationships between chemical structure and toxicity
Information in reaction databases can help us model metabolism
Risk assessment of chemicals can profit from chemoinformatics methods
Risk assessment has to combine toxicity prediction with metabolism modeling
Important Topics and Future Developments
Analyze Adverse Outcome Pathways (AOP)
Locate the Molecular Initiating Event (MIE)
Define the applicability domain
Combine chemoinformatics and bioinformatics methods
Study in vitro – in vivo relationships
Combine chemical structure descriptors with in vitro assay data
Get more and better data
Conclusions
Strong interest in chemoinformatics approaches to toxicity prediction and risk assessment
Public – industry partnerships in tackling these complicated problems
Huge opportunity for introducing these methods into science
Increasing our understanding of structure-toxicity relationships and of metabolism
Be humble! Let us not promise too much!
Acknowledgements
eTOX project funded by EU-IMI (Innovative Medicines Initiative)
COSMOS project funded by EU and COLIPA (The European Cosmetics Association)
FDA – CFSAN (Center for Food Safety and Applied Nutrition): development of the CERES system
Collaboration: Dr. Chihae Yang, Altamira LLC, USA