Written for drug developers rather than computer scientists, this monograph adopts a systematic approach to mining scientific data sources, covering all key steps in rational drug discovery, from compound screening to lead compound selection and personalized medicine.
The book is clearly divided into four sections. The first discusses the different data sources available, both commercial and non-commercial, while the second looks at the role and value of data mining in drug discovery. The third compares the most common applications and strategies for polypharmacology, where data mining can substantially enhance the research effort. The final section is devoted to systems biology approaches for compound testing.
Throughout the book, industrial and academic drug discovery strategies are addressed, with contributors coming from both areas, enabling readers to make an informed decision on when to use data mining tools, and which ones, for their own drug discovery project.
Contents
Preface
A Personal Foreword
PART ONE: Data Sources
PROTEIN STRUCTURAL DATABASES IN DRUG DISCOVERY
The Protein Data Bank: The Unique Public Archive of Protein Structures
PDB-Related Databases for Exploring Ligand-Protein Recognition
The sc-PDB, A Collection of Pharmacologically Relevant Protein-Ligand Complexes
Conclusions
PUBLIC DOMAIN DATABASES FOR MEDICINAL CHEMISTRY
Introduction
Databases of Small Molecule Binding and Bioactivity
Trends in Medicinal Chemistry Data
Directions
Summary
CHEMICAL ONTOLOGIES FOR STANDARDIZATION, KNOWLEDGE DISCOVERY, AND DATA MINING
Introduction
Background
Chemical Ontologies
Standardization
Knowledge Discovery
Data Mining
Conclusions
BUILDING A CORPORATE CHEMICAL DATABASE TOWARD SYSTEMS BIOLOGY
Introduction
Setting the Scene
Dealing with Chemical Structures
Increased Accuracy of the Registration of Data
Implementation of the Platform
Linking Chemical Information to Analytical Data
Linking Chemicals to Bioactivity Data
Conclusions
PART TWO: Analysis and Enrichment
DATA MINING OF PLANT METABOLIC PATHWAYS
Introduction
Pathway Representation
Pathway Management Platforms
Obtaining Pathway Information
Constructing Organism-Specific Pathway Databases
Conclusions
THE ROLE OF DATA MINING IN THE IDENTIFICATION OF BIOACTIVE COMPOUNDS VIA HIGH-THROUGHPUT SCREENING
Introduction to the HTS Process: The Role of Data Mining
Relevant Data Architectures for the Analysis of HTS Data
Analysis of HTS Data
Identification of New Compounds via Compound Set Enrichment and Docking
Conclusions
THE VALUE OF INTERACTIVE VISUAL ANALYTICS IN DRUG DISCOVERY: AN OVERVIEW
Creating Informative Visualizations
Lead Discovery and Optimization
Genomics
USING CHEMOINFORMATICS TOOLS FROM R
Introduction
System Call
Shared Library Call
Wrapping
Java Archives
Conclusions
PART THREE: Applications to Polypharmacology
CONTENT DEVELOPMENT STRATEGIES FOR THE SUCCESSFUL IMPLEMENTATION OF DATA MINING TECHNOLOGIES
Introduction
Knowledge Challenges in Drug Discovery
Case Studies
Knowledge-Based Data Mining Technologies
Future Trends and Outlook
APPLICATIONS OF RULE-BASED METHODS TO DATA MINING OF POLYPHARMACOLOGY DATA SETS
Introduction
Materials and Methods
Results
Discussion
Conclusion
DATA MINING USING LIGAND PROFILING AND TARGET FISHING
Introduction
In Silico Ligand Profiling Methods
Summary and Conclusions
PART FOUR: System Biology Approaches
DATA MINING OF LARGE-SCALE MOLECULAR AND ORGANISMAL TRAITS USING AN INTEGRATIVE AND MODULAR ANALYSIS APPROACH
Rapid Technological Advances Revolutionize Quantitative Measurements in Biology and Medicine
Genome-Wide Association Studies Reveal Quantitative Trait Loci
Integration of Molecular and Organismal Phenotypes Is Required for Understanding Causative Links
Reduction of Complexity of High-Dimensional Phenotypes in Terms of Modules
Biclustering Algorithms
Ping-Pong Algorithm
Module Commonalities Provide Functional Insights
Module Visualization
Application of Modular Analysis Tools for Data Mining of Mammalian Data Sets
Outlook
SYSTEMS BIOLOGY APPROACHES FOR COMPOUND TESTING
Introduction
Step 1: Design Experiment for Data Production
Step 2: Compute Systems Response Profiles
Step 3: Identify Perturbed Biological Networks
Step 4: Compute Network Perturbation Amplitudes
Step 5: Compute the Biological Impact Factor
Conclusions
About the Authors
Currently VP of Business Development at Prestwick Chemical SAS, Rémy Hoffmann studied pharmacy at the University Louis Pasteur in Strasbourg, France, and gained his doctorate in medicinal chemistry. After 17 years spent at what is now Accelrys, where he worked on pharmacophore perception methods, he joined Thomson Reuters as a regional sales manager. Here he learnt the importance of curated scientific data, and the need to develop methods for mining this data so as to extract accurate information to support the decision-making process, and thus arrive at the knowledge stage. In his current role, Dr. Hoffmann oversees the deployment of Prestwick Chemical's products and services to the drug discovery community, both in the pharma and biotech industries and within the academic scientific community.
Arnaud Gohier studied organic chemistry at the Universities of Le Mans and Nantes (France). He received his PhD in Molecular Modeling from the University of Joseph Fourier in Grenoble (France). In 1999, he joined the French pharmaceutical company Servier. Dr. Gohier's main areas of interest are drug design and chemoinformatics.
Pavel Pospisil has been Manager of Computational Chemistry at Philip Morris International, R&D in Neuchatel, Switzerland, since 2008. He holds a BSc in biochemistry from the University of Joseph Fourier in Grenoble, France, and an MSc in biochemical engineering from the Institute of Chemical Technology in Prague, Czech Republic, and received his PhD in natural sciences from the Swiss Federal Institute of Technology (ETH), Zurich. He carried out his postdoctoral studies at ETH Zurich and with the pharmaceutical company Arpida, now Evolva. In 2004, Dr. Pospisil became a postdoctoral fellow and research associate at Harvard Medical School, Boston, USA, where he focused on data mining for cancer targets and the discovery of low molecular radiolabeled cancer imaging analogs. In 2008, he took up a position as consultant at Hoffmann-La Roche, Basel, Switzerland. His current interests are the automatic processing of molecules and computational toxicology.