Pharma Grid
Pharma Grids – Key to Pharmaceutical Innovation?
René Ziegler, PhD
Head of Global IT Management
Novartis Pharma AG
CH-4002 Basel, Switzerland
What is a Pharma Grid?
• A Pharma Grid is a shared in silico resource to generate and preserve knowledge in the areas of discovery, development, manufacturing, marketing and sales of new drug therapies.
• Pharma Grids can be part of, or closely integrated with, eScience and Health Grids.
• For competitive and intellectual property protection reasons, Pharma Grids are predominantly private, enterprise IntraGrids with strict access and authentication controls.
R.Ziegler, Pharma Grid, January 2004
Health and eScience Grid Approaches Cover a Wide Scope of Activities (Academia, Health Care)
• First PharmaGrid meeting held in 2003
• Contributions mainly from academia
• Follow-up in 2004 (July 7 to 9) with focus on the industry
• For information see www.pharmagrid.com
Topics Covered in 2003: UK eScience Programmes, eDiamond, BIRN, Virtual Laboratories, Medical Image Computing, HealthGrids, KnowledgeGrids, myGrid
The Pharmaceutical Industry Faces Many Challenges
• Large pharmaceutical companies have an innovation gap despite steeply growing investments in R&D
• The pharmaceutical industry will continue to consolidate
• Size will favour marketing power rather than innovation
• Near-term pipeline-rich Biotech will gain power vs. big Pharma
• The US market will continue to dominate
• European markets and European R&D investments are losing ground
• Political pressure to reduce prices will increase and succeed
• Patent attrition and cost containment favour Generics
• Parallel trade is a growing concern
• The pressure exerted by NGOs and informed consumers (patients) will increase
Are Pharma Grids Part of the Solution?
The Pharma Industry Continues to Invest Massively in New Science and Technologies
[Figure: timeline from 1900 to 2010 of science and technologies adopted by the industry: Natural Products, Medicinal Chemistry, Biochemistry, Rec. DNA Technology, Combinatorial Chemistry, HT Screening, HT Sequencing, HT Expression Analysis, Functional Genomics, HT Proteomics, eClinical Research, HT Structure Analysis, In silico Screening, Systems Biology, Clinical trials simul., Nanotechnology. Source: PhRMA annual membership survey, 2003]
New Technologies Should Enable Parallel Processes and Faster Time-to-Market at Lower Cost
[Figure: two drug discovery and development timelines, x-axis in years. Source: Nature Biotechnology]
Traditional approach (time axis 1 to 18 years):
• Target discovery: 1 to 2 years
• Target validation: 1 to 3 years
• Lead discovery: 0.5 to 1 year
• Lead optimization: 2 to 4 years
• Transition to development: 1 to 2 years
• Development: 4 to 6 years
Evolving approach (time axis 1 to 8 years):
• Target discovery: 0.5 to 1 year
• Lead discovery: 0.5 to 1 year
• Lead optimization: 1 to 2 years
• Transition to development: 1 year
• Development: 3 years
• Target validation (no duration shown)
Key figures (Source: Tufts Center for the Study of Drug Development):
• Cost of developing a drug: 802 M$
• Average development time: 12 years
• Cost savings if development time gets cut by one-third: 167 M$
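The two timelines can be compared by summing the phase ranges. The sketch below is only an illustration: it assumes the phases run strictly one after another and that target validation in the evolving approach runs in parallel and adds no time, which the slide suggests but does not state. The names `TRADITIONAL`, `EVOLVING` and `total_range` are hypothetical.

```python
# Phase durations in years as (minimum, maximum), read from the two timelines.
TRADITIONAL = {
    "Target discovery": (1.0, 2.0),
    "Target validation": (1.0, 3.0),
    "Lead discovery": (0.5, 1.0),
    "Lead optimization": (2.0, 4.0),
    "Transition to development": (1.0, 2.0),
    "Development": (4.0, 6.0),
}

# Assumption: target validation overlaps the other phases and is left out here.
EVOLVING = {
    "Target discovery": (0.5, 1.0),
    "Lead discovery": (0.5, 1.0),
    "Lead optimization": (1.0, 2.0),
    "Transition to development": (1.0, 1.0),
    "Development": (3.0, 3.0),
}


def total_range(phases):
    """Return (shortest, longest) total duration for strictly sequential phases."""
    shortest = sum(low for low, _ in phases.values())
    longest = sum(high for _, high in phases.values())
    return shortest, longest


if __name__ == "__main__":
    print("Traditional:", total_range(TRADITIONAL))
    print("Evolving:   ", total_range(EVOLVING))
```

Under these assumptions the sums match the extent of the two time axes in the figure (up to 18 years and up to 8 years).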
Despite Massive Investments in New Science and Technologies the Innovation Gap is Widening
[Figure: "Innovation Gap" bar chart. Number of New Molecular Entities (NMEs) approved by calendar year; x-axis: Year (1998 to 2003); y-axis: No of NMEs (0 to 40). Legible bar labels: 35, 30, 28, 24, 18]
Is There Some Hope for 2004?
[Figure: the "Innovation Gap" bar chart of New Molecular Entities approved by calendar year, 1998 to 2003, repeated]
The Evolving in silico Science Platform
Evolution of the Pharmaceutical Industry:
[Figure: progression over time from In Vivo to In Vitro to In Silico]
The Contribution of IT to Drug Discovery is Increasing
[Figure: IT contributions mapped onto the R&D pipeline]
• Objects: Genes, Proteins, Compounds
• IT methods: expression analysis, function prediction, 3D structure prediction, docking, de novo design, QSAR
• Pipeline stages: Biology (Target ID, Target Validation), Chemistry (Screening, Optimisation), Development (Preclinical, Clinical)
The Vision
Enable and transform the Drug Discovery process through:
- Comprehensive and reliable data and information
- Seamless information integration for easy navigation
- Turning data into knowledge using in silico science
- Simulating biomolecular processes using in silico science
Drug Discovery Is Increasingly Dependent on High Performance Computing
• Genomics, proteomics, genetics, chemistry, toxicology, pharmacology, pharmacokinetics and pharmacogenomics produce large and massively growing amounts of data and information
• The discovery of novel medicines relies on the proper management and analysis of these large amounts of data and information
• Computer-based modelling and simulation of biochemical and chemical processes is playing an ever increasing role in pharmaceutical R&D, allowing scientists to conduct better designed and better targeted physical experiments
• High Performance Computing is becoming a key component in Pharma R&D
Novartis Views Pharma Grids in Three Dimensions
• Computing Grid: a resource that provides extremely large CPU power to perform computing-intensive tasks in a transparent way by means of an automated job submission and distribution facility
• Data Grid: a resource that provides transparent and secure access and storage to large amounts of data in an automated, self-organised mode
• Knowledge Grid: a resource that connects data and information in a transparent mode according to user-defined rules based on business or scientific semantics and ontologies
Novartis Has Implemented a Globally Coherent Pharma Grid Strategy
[Figure: layered architecture diagram, top to bottom]
• Users and applications: groups FGA, DDC, CT, IK@ND, TAs
• Job submission protocol
• HPC technologies integration/sharing layer
• Existing HPC infrastructure: HPC cluster, SGI HPC servers, SUN HPC server, TimeLogic HPC server, COMPAQ HPC cluster
• Shared resources: large shared Linux clusters, shared multiprocessor servers, shared PC grid
• External collaborations
Novartis Has Implemented a Globally Coherent Pharma Grid Strategy
[Figure: the same layered architecture diagram as on the preceding slide, with three annotations]
• Integration/sharing layer: provide reliable, transparent, uniform access to the HPC resources according to business needs and systems availability
• Shared PC grid: enable a large, globally deployable and scalable PC Grid based on a standardised platform
• External collaborations: external partnership for special needs
Harnessing the Idle Processing Power of the Standardised PC Platform for in silico Research
PC Grid Computing Using United Devices' MetaProcessor
• Most desktop CPUs are idle most of the time
• The aggregate power of these idle resources is substantial
• Plugging into these systems would generate a supercomputer without new investment in hardware
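The idea of splitting one large job across many otherwise idle machines can be illustrated with a small sketch. It does not use or imitate the United Devices MetaProcessor interface: a local thread pool stands in for the idle PCs, and `split_into_work_units`, `process_unit`, `run_on_grid` and the unit size are hypothetical names and values chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_work_units(items, unit_size):
    """Cut a large job into independent work units of at most unit_size items."""
    return [items[i:i + unit_size] for i in range(0, len(items), unit_size)]


def process_unit(unit):
    """Placeholder for the real computation carried out on one machine."""
    return [x * x for x in unit]


def run_on_grid(items, unit_size=4, workers=3):
    """Distribute work units to workers and reassemble the results in order."""
    units = split_into_work_units(items, unit_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in submission order, regardless of which
        # worker finishes first.
        partial_results = list(pool.map(process_unit, units))
    return [value for part in partial_results for value in part]


results = run_on_grid(list(range(10)))
```

The approach only pays off when work units are independent of each other, which is why embarrassingly parallel tasks are the natural first candidates for a PC grid.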
Influencing Bio-molecular Processes
[Figure: schematic of a Target interacting with a Ligand and with a Drug; state label: ACTIVE]
Target = enzyme, receptor, nucleic acid, …
Ligand = substrate, hormone, other messenger, ...
PC Grid Success Story: Protein Kinase CK2 Inhibition
• Protein Kinase CK2 has roles in cell growth, proliferation and survival.
• Protein Kinase CK2 has a possible role in cancer, and its overexpression has been associated with lymphoma.
• To elucidate the different functions and roles of CK2 and confirm it as a drug target for oncology, one needs a potent and selective inhibitor.
• The problem was addressed by in silico screening (docking).
Virtual Screening by in silico Docking
[Figure: screening funnel]
• Input: > 1,000,000 compounds
• Docking process and selection of possible hits
• Output: < 100 compounds
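The funnel from more than a million compounds to fewer than a hundred amounts to scoring every compound and keeping only the best-ranked ones. The sketch below assumes that lower scores mean better predicted binding (a common docking convention, not something the slides state) and replaces the real docking run with a pseudo-random score; `dock_score` and `select_hits` are hypothetical names.

```python
import heapq
import random


def dock_score(compound_id, rng):
    """Stand-in for a real docking run: returns a pseudo-random score."""
    return rng.uniform(-12.0, 0.0)


def select_hits(compound_ids, max_hits, seed=0):
    """Score every compound and return the max_hits best (lowest) scores."""
    rng = random.Random(seed)
    scored = ((dock_score(cid, rng), cid) for cid in compound_ids)
    # nsmallest consumes the generator lazily, so the full scored library
    # never has to be held in memory at once.
    return heapq.nsmallest(max_hits, scored)


hits = select_hits(range(100_000), max_hits=50)
```

Each entry in `hits` is a `(score, compound_id)` pair, sorted from best to worst score.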
Virtual Docking Accelerates the Docking Process at Negligible Additional Cost
• Task: DOCK, ~320,000 molecules
  - Virtual docking of compounds from the Novartis Library into the 3D structure of a protein (target)
• Elapsed time (hours): 55,794 | 547
• Elapsed time (days): 2,325 | 23
• Elapsed time (years): 6.4
• Devices in Grid: 561 | 1200
• A further elapsed-time value of 6 is shown for the last column; its unit is not legible
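The elapsed times can be cross-checked with simple arithmetic. The sketch assumes, as the column order suggests but the slide does not state outright, that 55,794 hours is the run time on a single PC and 547 hours the run time on the 561-device grid; all names are hypothetical.

```python
SINGLE_PC_HOURS = 55_794   # assumption: elapsed time on one PC, no grid
GRID_HOURS = 547           # assumption: elapsed time on the 561-device grid
GRID_DEVICES = 561


def hours_to_days(hours):
    return hours / 24


def hours_to_years(hours):
    return hours / (24 * 365)


# How many times faster the grid run was than the single-PC run.
speedup = SINGLE_PC_HOURS / GRID_HOURS

# Fraction of the ideal (one-per-device) speedup that was achieved.
efficiency = speedup / GRID_DEVICES

if __name__ == "__main__":
    print(f"single PC: {hours_to_days(SINGLE_PC_HOURS):.0f} days, "
          f"{hours_to_years(SINGLE_PC_HOURS):.1f} years")
    print(f"grid:      {hours_to_days(GRID_HOURS):.0f} days")
    print(f"speedup:   {speedup:.0f}x, efficiency {efficiency:.0%}")
```

A speedup well below the device count would be expected on a grid that only uses spare desktop cycles, since each device contributes a fraction of its time.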
Application of a Computing Grid in Marketing & Sales Data Analysis
• Marketing & Sales in the US have a data warehouse of > 7 TB
• Monthly reports of internal and external sales data generate ca. 0.5 TB of new data to be analysed
• Monthly data analysis requires a minimum of 4 days on current hardware (IBM AIX)
• Estimate using the PC Grid (2,700 PCs): analysis time down to 2 days (experiments on-going)
• Earlier availability of data should enhance sales force targeting and effectiveness
Computing Grid: Summary
• Performance:
  - 2,700 standard PCs in the Grid
  - Equivalent to 5 TeraFLOPS (1 TeraFLOPS = 10^12 floating point operations per second)
  - Acceleration of the in silico docking process versus 1 standard 2002 PC: ~4000x
• Cost:
  - No need for investments in new or larger data centres and machines
  - Immediate savings > 2 M$ (vs. 0.4 M$ investment)
  - Optimal use of existing investment (hardware, image, support)
• Scalability:
  - Expansion to 27,000 PCs in 2004
  - Potential to expand even further (up to 50,000 to 60,000 PCs)
  - Grid power grows in parallel with the growing performance of each PC as the fleet gets renewed
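The performance and scalability figures can be related with a rough calculation. The sketch assumes that aggregate throughput scales linearly with the number of PCs and that per-PC performance stays at the 2,700-PC average; neither assumption comes from the slides, and all names are hypothetical.

```python
GRID_PCS = 2_700
GRID_TFLOPS = 5.0
SAVINGS_MUSD = 2.0      # lower bound quoted on the slide
INVESTMENT_MUSD = 0.4

# Average contribution of one PC, in GigaFLOPS.
per_pc_gflops = GRID_TFLOPS * 1000 / GRID_PCS


def projected_tflops(n_pcs):
    """Aggregate TeraFLOPS under the linear-scaling assumption."""
    return n_pcs * per_pc_gflops / 1000


# Savings expressed as a multiple of the investment.
savings_ratio = SAVINGS_MUSD / INVESTMENT_MUSD

if __name__ == "__main__":
    print(f"per PC: {per_pc_gflops:.2f} GFLOPS")
    print(f"27,000 PCs: {projected_tflops(27_000):.0f} TFLOPS")
    print(f"savings / investment: at least {savings_ratio:.0f}x")
```

Because the fleet is renewed over time, real per-PC performance would drift upwards, so a linear projection of this kind is a conservative floor rather than a forecast.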
Computing Grid: Summary (continued)
• Rapid uptake:
  - Chemists now routinely use the PC Grid for their lead discovery work in most research programmes
The Knowledge Space Concept
[Figure: layered diagram, top to bottom]
• Access & Navigation: advanced Web technologies
• Knowledge & Document Base
• Knowledge Production: modelling, simulation, data mining, competitor intelligence, information analysis
• Information Integration: integration of diverse data and information sources
• Data Management: data and information capture, laboratory automation, archiving
• Internal & external data and information sources, Web
Supporting elements shown alongside the layers:
• Content management: information quality control and curation
• Knowledge engineering and mapping: know what we know
• Develop a culture fostering information and knowledge sharing
• Advanced infrastructure and High Performance Computing
The Knowledge Space
The Knowledge Space consists of:
• The collection of all types of data and information within the scope of interest defined by a particular business.
  - Thus, the Knowledge Space is composed of databases, information sources, document/knowledge bases, etc. with relevance to Pharma.
  - There is no conceptual difference between internal and external data/information. This is only a matter of tagging it as either.
• The Meta Data and the Knowledge Map, which describe the collection in terms of content and location of content.
Knowledge Space Portal - Vision
The "Knowledge Space Portal" will, via a single customizable interface:
• Federate heterogeneous data resources and provide precise organization of the content
• Provide quick and intuitive access to information
• Provide data extraction, analysis and exploration tools
• Allow data integration, data exchange and interoperability of applications
• Provide mechanisms for data capture and annotation
• Provide knowledge sharing and collaborative tools
Knowledge Space Portal - Scope
• Provide key elements for efficiently accessing internal and external information relevant to daily decisions in the drug discovery and development processes:
  - Data integration across heterogeneous data sources and applications (internal and external)
  - Consistent user interface for data retrieval, exploration and analysis across all data types
  - Contextual (ultralink), tree-based (static or dynamic taxonomies) and semantic (knowledge map) navigation
  - Data exploration and analysis methods
  - Personalized views
  - Collaborative, annotation and information sharing tools
  - Alerting
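One way to picture data integration across heterogeneous sources behind a consistent interface is an adapter per source that exposes the same search method, with each record tagged as internal or external. This is a generic illustration and not a description of the Novartis Knowledge Space Portal; every class, function and record below is hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Hit:
    source: str   # name of the source that returned the record
    origin: str   # "internal" or "external": only a tag, not a separate model
    title: str


class Source(Protocol):
    def search(self, term: str) -> list[Hit]: ...


class InMemorySource:
    """Toy adapter; a real one would wrap a database or document base."""

    def __init__(self, name: str, origin: str, titles: list[str]):
        self.name = name
        self.origin = origin
        self.titles = titles

    def search(self, term: str) -> list[Hit]:
        needle = term.lower()
        return [Hit(self.name, self.origin, t)
                for t in self.titles if needle in t.lower()]


def federated_search(sources: list[Source], term: str) -> list[Hit]:
    """Query every source through the same interface and merge the hits."""
    hits: list[Hit] = []
    for source in sources:
        hits.extend(source.search(term))
    return sorted(hits, key=lambda h: (h.title.lower(), h.source))


sources = [
    InMemorySource("lab-notebooks", "internal",
                   ["CK2 kinase assay results", "Solubility screen"]),
    InMemorySource("literature", "external",
                   ["Kinase inhibitors in oncology", "Grid computing review"]),
]
found = federated_search(sources, "kinase")
```

Keeping the internal/external distinction as a tag on each hit mirrors the point that the two kinds of information differ only in how they are labelled.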
The Knowledge Grid Enables Text Retrieval and Text Mining
[Figure: Knowledge Grid architecture. Components shown: Document Sources; Preprod (offline); Ulix (online); WIN Infrastr. (WebLogic Server); PC Grid Job Submitter; KSP (online). Connections are labelled Text Retrieval or Text Mining.]
KSP (online):
• User Request Entry-point
• Document Pass-through
• Rware internal communication
• Lexical Extraction Pass-Through
Pharma Grids Have Potential in Both the Private and the Public Sector
• Drug discovery and development:
  - Systems biology to understand and model disease on a molecular level in silico
  - Networks of academic centres and community-based physicians to work on clinical trials
  - Networks, shared resources for academic, Biotech start-ups and Pharma companies
• Logistics, drug quality/safety:
  - Smart tags, integrated with ERP:
    - Badge, package tracking
    - Quality monitoring
    - Patient compliance
    - Parallel trade
Pharma Grids Have Potential in Both the Private and the Public Sector
• Regulatory:
  - Data Grids for long-term archiving (throughout product life cycle, i.e. up to 50 years)
• Health Care, health economics:
  - Disease management (epidemiology, prognosis, outcome, cost)
  - eMedical records
  - Pharmacogenomics
  - Analysis and direction of health care consumption and prescription practices
Pharma Grids Have Potential in Both the Private and the Public Sector
• Disease knowledge resource:
  - Third world diseases
  - Rare diseases (orphan diseases)
  - Lead discovery for third world and/or rare diseases
• Educational resource:
  - Training of health care professionals (eLearning), in particular in remote locations
  - On-line assistance to health care professionals, in particular in remote locations
For Broader Deployment of Private and Public Pharma Grids Several Issues Need to Be Addressed
• Regulatory issues:
  - Can GxP validated platforms be part of a Grid?
  - Data ownership (privacy)
• Security:
  - Authentication, authorisation
  - Trust, provenance
  - Data security
  - Data integrity
• Technical issues:
  - Required degree of standardisation (open source)
  - Grid segmentation based on task requirements
  - Task scheduling
For Broader Deployment of Private and Public Pharma Grids Several Issues Need to Be Addressed
• Affordability:
  - Software licensing models and cost
  - Drive towards open source software (longevity, adaptability)
  - Network requirements and cost
• Software availability and suitability:
  - Porting to parallelised platform
  - New algorithms for parallel processes
  - Processes requiring large shared memory (e.g. quantum chemistry)
• Accounting/pricing model:
  - Flat per seat/user charge for all Grid services
  - Metered charges for all services
  - Separation between flat charges for basic access and usage and metered charges for dedicated resources
  - No charge at all
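The accounting options listed above can be compared side by side with a few lines of code. The functions and all rates below are hypothetical and serve only to show how the flat, metered and combined models differ in what they bill for.

```python
def flat_charge(seats, fee_per_seat):
    """Flat per seat/user charge covering all Grid services."""
    return seats * fee_per_seat


def metered_charge(cpu_hours, rate_per_cpu_hour):
    """Charge proportional to the resources actually consumed."""
    return cpu_hours * rate_per_cpu_hour


def hybrid_charge(seats, fee_per_seat, dedicated_cpu_hours, rate_per_cpu_hour):
    """Flat charge for basic access plus metered charge for dedicated resources."""
    return (flat_charge(seats, fee_per_seat)
            + metered_charge(dedicated_cpu_hours, rate_per_cpu_hour))


if __name__ == "__main__":
    # Illustrative numbers only: 100 users, 2,000 CPU hours in total,
    # of which 500 ran on dedicated resources.
    print("flat:   ", flat_charge(100, 50))
    print("metered:", metered_charge(2_000, 2))
    print("hybrid: ", hybrid_charge(100, 50, 500, 2))
```

The fourth option, no charge at all, needs no calculation but shifts the question to how the shared infrastructure is funded centrally.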
For Broader Deployment of Private and Public Pharma Grids Several Issues Need to Be Addressed
• User concerns and working practices:
  - Reassure users that Grid processes have no negative impact on either workstation performance or the integrity of their data
  - Move culture from "my private workstation" to "my workstation as an integrated part of a larger resource"
  - 7x24 CPU availability
Summary and Conclusions
• The Pharma Grid has successfully passed the proof-of-concept stage
• In silico docking of small molecules has become a routine application for the Pharma Grid
• A Pharma Grid computing roadmap has been developed:
  - Knowledge Grid
  - Data Grid
  - Expansion to Marketing & Sales, Clinical Development
  - Up-scaling to 27,000 PCs and other platforms
  - External Grid resources
• The Pharma Grid paradigm needs new policies and a focus on security
• The Pharma Grid is a catalyst of paradigm shifts and thus a true source of innovation
Acknowledgements
• Grid and Knowledge Management Strategy:
  - Manuel Peitsch
• Implementation of the PC Grid:
  - Jürgen Basse Welker
  - Pascal Afflard
• Implementation of the Knowledge Grid:
  - Thérèse Vachon
• Marketing data analysis:
  - Ken Corsini
• Slides:
  - Sylvie Burger