Pattern Recognition in Computational Molecular Biology

Mourad Elloumi & Costas Iliopoulos

Mourad Elloumi & Costas Iliopoulos
Pattern Recognition in Computational Molecular Biology [EPUB ebook]
Techniques and Approaches

Support

Couverture du Mourad Elloumi & Costas Iliopoulos: Pattern Recognition in Computational Molecular Biology (ePUB)

A comprehensive overview of high-performance pattern recognition techniques and approaches to Computational Molecular Biology

This book surveys the developments of techniques and approaches on pattern recognition related to Computational Molecular Biology. Providing a broad coverage of the field, the authors cover fundamental and technical information on these techniques and approaches, as well as discussing their related problems. The text consists of twenty nine chapters, organized into seven parts: Pattern Recognition in Sequences, Pattern Recognition in Secondary Structures, Pattern Recognition in Tertiary Structures, Pattern Recognition in Quaternary Structures, Pattern Recognition in Microarrays, Pattern Recognition in Phylogenetic Trees, and Pattern Recognition in Biological Networks.

Surveys the development of techniques and approaches on pattern recognition in biomolecular data

Discusses pattern recognition in primary, secondary, tertiary and quaternary structures, as well as microarrays, phylogenetic trees and biological networks

Includes case studies and examples to further illustrate the concepts discussed in the book

Pattern Recognition in Computational Molecular Biology: Techniques and Approaches is a reference for practitioners and professional researches in Computer Science, Life Science, and Mathematics. This book also serves as a supplementary reading for graduate students and young researches interested in Computational Molecular Biology.

€123.99

Table des matières

LIST OF CONTRIBUTORS xxi

PREFACE xxvii

I PATTERN RECOGNITION IN SEQUENCES 1

1 COMBINATORIAL HAPLOTYPING PROBLEMS 3
Giuseppe Lancia

1.1 Introduction / 3

1.2 Single Individual Haplotyping / 5

1.2.1 The Minimum Error Correction Model / 8

1.2.2 Probabilistic Approaches and Alternative Models / 10

1.3 Population Haplotyping / 12

1.3.1 Clark’s Rule / 14

1.3.2 Pure Parsimony / 15

1.3.3 Perfect Phylogeny / 19

1.3.4 Disease Association / 21

1.3.5 Other Models / 22

References / 23

2 ALGORITHMIC PERSPECTIVES OF THE STRING BARCODING PROBLEMS 28
Sima Behpour and Bhaskar Das Gupta

2.1 Introduction / 28

2.2 Summary of Algorithmic Complexity Results for Barcoding Problems / 32

2.2.1 Average Length of Optimal Barcodes / 33

2.3 Entropy-Based Information Content Technique for Designing

Approximation Algorithms for String Barcoding Problems / 34

2.4 Techniques for Proving Inapproximability Results for String Barcoding Problems / 36

2.4.1 Reductions from Set Covering Problem / 36

2.4.2 Reduction from Graph-Coloring Problem / 38

2.5 Heuristic Algorithms for String Barcoding Problems / 39

2.5.1 Entropy-Based Method with a Different Measure for Information Content / 39

2.5.2 Balanced Partitioning Approach / 40

2.6 Conclusion / 40

Acknowledgments / 41

References / 41

3 ALIGNMENT-FREE MEASURES FOR WHOLE-GENOME COMPARISON 43
Matteo Comin and Davide Verzotto

3.1 Introduction / 43

3.2 Whole-Genome Sequence Analysis / 44

3.2.1 Background on Whole-Genome Comparison / 44

3.2.2 Alignment-Free Methods / 45

3.2.3 Average Common Subword / 46

3.2.4 Kullback–Leibler Information Divergence / 47

3.3 Underlying Approach / 47

3.3.1 Irredundant Common Subwords / 48

3.3.2 Underlying Subwords / 49

3.3.3 Efficient Computation of Underlying Subwords / 50

3.3.4 Extension to Inversions and Complements / 53

3.3.5 A Distance-Like Measure Based on Underlying Subwords / 53

3.4 Experimental Results / 54

3.4.1 Genome Data sets and Reference Taxonomies / 54

3.4.2 Whole-Genome Phylogeny Reconstruction / 56

3.5 Conclusion / 61

Author’s Contributions / 62

Acknowledgments / 62

References / 62

4 A MAXIMUM LIKELIHOOD FRAMEWORK FOR MULTIPLE SEQUENCE LOCAL ALIGNMENT 65
Chengpeng Bi

4.1 Introduction / 65

4.2 Multiple Sequence Local Alignment / 67

4.2.1 Overall Objective Function / 67

4.2.2 Maximum Likelihood Model / 68

4.3 Motif Finding Algorithms / 70

4.3.1 DEM Motif Algorithm / 70

4.3.2 WEM Motif Finding Algorithm / 70

4.3.3 Metropolis Motif Finding Algorithm / 72

4.3.4 Gibbs Motif Finding Algorithm / 73

4.3.5 Pseudo-Gibbs Motif Finding Algorithm / 74

4.4 Time Complexity / 75

4.5 Case Studies / 75

4.5.1 Performance Evaluation / 76

4.5.2 CRP Binding Sites / 76

4.5.3 Multiple Motifs in Helix–Turn–Helix Protein Structure / 78

4.6 Conclusion / 80

References / 81

5 GLOBAL SEQUENCE ALIGNMENT WITH A BOUNDED NUMBER OF GAPS 83
Carl Barton, Tomáš Flouri, Costas S. Iliopoulos, and Solon P. Pissis

5.1 Introduction / 83

5.2 Definitions and Notation / 85

5.3 Problem Definition / 87

5.4 Algorithms / 88

5.5 Conclusion / 94

References / 95

II PATTERN RECOGNITION IN SECONDARY STRUCTURES 97

6 A SHORT REVIEW ON PROTEIN SECONDARY STRUCTURE PREDICTION METHODS 99
Renxiang Yan, Jiangning Song, Weiwen Cai, and Ziding Zhang

6.1 Introduction / 99

6.2 Representative Protein Secondary Structure Prediction Methods / 102

6.2.1 Chou–Fasman / 103

6.2.2 GOR / 104

6.2.3 PHD / 104

6.2.4 PSIPRED / 104

6.2.5 SPINE-X / 105

6.2.6 PSSpred / 105

6.2.7 Meta Methods / 105

6.3 Evaluation of Protein Secondary Structure Prediction Methods / 106

6.3.1 Measures / 106

6.3.2 Benchmark / 106

6.3.3 Performances / 107

6.4 Conclusion / 110

Acknowledgments / 110

References / 111

7 A GENERIC APPROACH TO BIOLOGICAL SEQUENCE SEGMENTATION PROBLEMS: APPLICATION TO PROTEIN SECONDARY STRUCTURE PREDICTION 114
Yann Guermeur and Fabien Lauer

7.1 Introduction / 114

7.2 Biological Sequence Segmentation / 115

7.3 MSVMpred / 117

7.3.1 Base Classifiers / 117

7.3.2 Ensemble Methods / 118

7.3.3 Convex Combination / 119

7.4 Postprocessing with A Generative Model / 119

7.5 Dedication to Protein Secondary Structure Prediction / 120

7.5.1 Biological Problem / 121

7.5.2 MSVMpred2 / 121

7.5.3 Hidden Semi-Markov Model / 122

7.5.4 Experimental Results / 122

7.6 Conclusions and Ongoing Research / 125

Acknowledgments / 126

References / 126

8 STRUCTURAL MOTIF IDENTIFICATION AND RETRIEVAL: A GEOMETRICAL APPROACH 129
Virginio Cantoni, Marco Ferretti, Mirto Musci, and Nahumi Nugrahaningsih

8.1 Introduction / 129

8.2 A Few Basic Concepts / 130

8.2.1 Hierarchy of Protein Structures / 130

8.2.2 Secondary Structure Elements / 131

8.2.3 Structural Motifs / 132

8.2.4 Available Sources for Protein Data / 134

8.3 State of the Art / 135

8.3.1 Protein Structure Motif Search / 135

8.3.2 Promotif / 136

8.3.3 Secondary-Structure Matching / 137

8.3.4 Multiple Structural Alignment by Secondary Structures / 138

8.4 A Novel Geometrical Approach to Motif Retrieval / 138

8.4.1 Secondary Structures Cooccurrences / 138

8.4.2 Cross Motif Search / 143

8.4.3 Complete Cross Motif Search / 146

8.5 Implementation Notes / 149

8.5.1 Optimizations / 149

8.5.2 Parallel Approaches / 150

8.6 Conclusions and Future Work / 151

Acknowledgment / 152

References / 152

9 GENOME-WIDE SEARCH FOR PSEUDOKNOTTED NONCODING RNAs: A COMPARATIVE STUDY 155
Meghana Vasavada, Kevin Byron, Yang Song, and Jason T.L. Wang

9.1 Introduction / 155

9.2 Background / 156

9.2.1 Noncoding RNAs and Their Secondary Structures / 156

9.2.2 Pseudoknotted nc RNA Search Tools / 157

9.3 Methodology / 157

9.4 Results and Interpretation / 161

9.5 Conclusion / 162

References / 163

III PATTERN RECOGNITION IN TERTIARY STRUCTURES 165

10 MOTIF DISCOVERY IN PROTEIN 3D-STRUCTURES USING GRAPH MINING TECHNIQUES 167
Wajdi Dhifli and Engelbert Mephu Nguifo

10.1 Introduction / 167

10.2 From Protein 3D-Structures to Protein Graphs / 169

10.2.1 Parsing Protein 3D-Structures into Graphs / 169

10.3 Graph Mining / 172

10.4 Subgraph Mining / 173

10.5 Frequent Subgraph Discovery / 173

10.5.1 Problem Definition / 174

10.5.2 Candidates Generation / 176

10.5.3 Frequent Subgraph Discovery Approaches / 177

10.5.4 Variants of Frequent Subgraph Mining: Closed and Maximal Subgraphs / 178

10.6 Feature Selection / 179

10.6.1 Relevance of a Feature / 179

10.7 Feature Selection for Subgraphs / 180

10.7.1 Problem Statement / 180

10.7.2 Mining Top-k Subgraphs / 180

10.7.3 Clustering-Based Subgraph Selection / 181

10.7.4 Sampling-Based Approaches / 181

10.7.5 Approximate Subgraph Mining / 181

10.7.6 Discriminative Subgraph Selection / 182

10.7.7 Other Significant Subgraph Selection Approaches / 182

10.8 Discussion / 183

10.9 Conclusion / 185

Acknowledgments / 185

References / 186

11 FUZZY AND UNCERTAIN LEARNING TECHNIQUES FOR THE ANALYSIS AND PREDICTION OF PROTEIN TERTIARY STRUCTURES 190
Chinua Umoja, Xiaxia Yu, and Robert Harrison

11.1 Introduction / 190

11.2 Genetic Algorithms / 192

11.2.1 GA Model Selection in Protein Structure Prediction / 196

11.2.2 Common Methodology / 198

11.3 Supervised Machine Learning Algorithm / 201

11.3.1 Artificial Neural Networks / 201

11.3.2 ANNs in Protein Structure Prediction / 202

11.3.3 Support Vector Machines / 203

11.4 Fuzzy Application / 204

11.4.1 Fuzzy Logic / 204

11.4.2 Fuzzy SVMs / 204

11.4.3 Adaptive-Network-Based Fuzzy Inference Systems / 205

11.4.4 Fuzzy Decision Trees / 206

11.5 Conclusion / 207

References / 208

12 PROTEIN INTER-DOMAIN LINKER PREDICTION 212
Maad Shatnawi, Paul D. Yoo, and Sami Muhaidat

12.1 Introduction / 212

12.2 Protein Structure Overview / 213

12.3 Technical Challenges and Open Issues / 214

12.4 Prediction Assessment / 215

12.5 Current Approaches / 216

12.5.1 Dom Cut / 216

12.5.2 Scooby-Domain / 217

12.5.3 FIEFDom / 218

12.5.4 Chatterjee et al. (2009) / 219

12.5.5 Drop / 219

12.6 Domain Boundary Prediction Using Enhanced General Regression Network / 220

12.6.1 Multi-Domain Benchmark Data Set / 220

12.6.2 Compact Domain Profile / 221

12.6.3 The Enhanced Semi-Parametric Model / 222

12.6.4 Training, Testing, and Validation / 225

12.6.5 Experimental Results / 226

12.7 Inter-Domain Linkers Prediction Using Compositional Index and Simulated Annealing / 227

12.7.1 Compositional Index / 228

12.7.2 Detecting the Optimal Set of Threshold Values Using Simulated Annealing / 229

12.7.3 Experimental Results / 230

12.8 Conclusion / 232

References / 233

13 PREDICTION OF PROLINE CIS–TRANS ISOMERIZATION 236
Paul D. Yoo, Maad Shatnawi, Sami Muhaidat, Kamal Taha, and Albert Y. Zomaya

13.1 Introduction / 236

13.2 Methods / 238

13.2.1 Evolutionary Data Set Construction / 238

13.2.2 Protein Secondary Structure Information / 239

13.2.3 Method I: Intelligent Voting / 239

13.2.4 Method II: Randomized Meta-Learning / 241

13.2.5 Model Validation and Testing / 242

13.2.6 Parameter Tuning / 242

13.3 Model Evaluation and Analysis / 243

13.4 Conclusion / 245

References / 245

IV PATTERN RECOGNITION IN QUATERNARY STRUCTURES 249

14 PREDICTION OF PROTEIN QUATERNARY STRUCTURES 251
Akbar Vaseghi, Maryam Faridounnia, Soheila Shokrollahzade, Samad Jahandideh, and Kuo-Chen Chou

14.1 Introduction / 251

14.2 Protein Structure Prediction / 255

14.2.1 Secondary Structure Prediction / 255

14.2.2 Modeling of Tertiary Structure / 256

14.3 Template-Based Predictions / 257

14.3.1 Homology Modeling / 257

14.3.2 Threading Methods / 257

14.3.3 Ab initio Modeling / 257

14.4 Critical Assessment of Protein Structure Prediction / 258

14.5 Quaternary Structure Prediction / 258

14.6 Conclusion / 261

Acknowledgments / 261

References / 261

15 COMPARISON OF PROTEIN QUATERNARY STRUCTURES BY GRAPH APPROACHES 266
Sheng-Lung Peng and Yu-Wei Tsay

15.1 Introduction / 266

15.2 Similarity in the Graph Model / 268

15.2.1 Graph Model for Proteins / 270

15.3 Measuring Structural Similarity VIA MCES / 272

15.3.1 Problem Formulation / 273

15.3.2 Constructing P-Graphs / 274

15.3.3 Constructing Line Graphs / 276

15.3.4 Constructing Modular Graphs / 276

15.3.5 Maximum Clique Detection / 277

15.3.6 Experimental Results / 277

15.4 Protein Comparison VIA Graph Spectra / 279

15.4.1 Graph Spectra / 279

15.4.2 Matrix Selection / 281

15.4.3 Graph Cospectrality and Similarity / 283

15.4.4 Cospectral Comparison / 283

15.4.5 Experimental Results / 284

15.5 Conclusion / 287

References / 287

16 STRUCTURAL DOMAINS IN PREDICTION OF BIOLOGICAL PROTEIN–PROTEIN INTERACTIONS 291
Mina Maleki, Michael Hall, and Luis Rueda

16.1 Introduction / 291

16.2 Structural Domains / 293

16.3 The Prediction Framework / 293

16.4 Feature Extraction and Prediction Properties / 294

16.4.1 Physicochemical Properties / 296

16.4.2 Domain-Based Properties / 298

16.5 Feature Selection / 299

16.5.1 Filter Methods / 299

16.5.2 Wrapper Methods / 301

16.6 Classification / 301

16.6.1 Linear Dimensionality Reduction / 301

16.6.2 Support Vector Machines / 303

16.6.3 k-Nearest Neighbor / 303

16.6.4 Naive Bayes / 304

16.7 Evaluation and Analysis / 304

16.8 Results and Discussion / 304

16.8.1 Analysis of the Prediction Properties / 304

16.8.2 Analysis of Structural DDIs / 307

16.9 Conclusion / 309

References / 310

V PATTERN RECOGNITION IN MICROARRAYS 315

17 CONTENT-BASED RETRIEVAL OF MICROARRAY EXPERIMENTS 317
Hasan O?gul

17.1 Introduction / 317

17.2 Information Retrieval: Terminology and Background / 318

17.3 Content-Based Retrieval / 320

17.4 Microarray Data and Databases / 322

17.5 Methods for Retrieving Microarray Experiments / 324

17.6 Similarity Metrics / 327

17.7 Evaluating Retrieval Performance / 329

17.8 Software Tools / 330

17.9 Conclusion and Future Directions / 331

Acknowledgment / 332

References / 332

18 EXTRACTION OF DIFFERENTIALLY EXPRESSED GENES IN MICROARRAY DATA 335
Tiratha Raj Singh, Brigitte Vannier, and Ahmed Moussa

18.1 Introduction / 335

18.2 From Microarray Image to Signal / 336

18.2.1 Signal from Oligo DNA Array Image / 336

18.2.2 Signal from Two-Color c DNA Array / 337

18.3 Microarray Signal Analysis / 337

18.3.1 Absolute Analysis and Replicates in Microarrays / 338

18.3.2 Microarray Normalization / 339

18.4 Algorithms for De Gene Selection / 339

18.4.1 Within–Between DE Gene (WB-DEG) Selection Algorithm / 340

18.4.2 Comparison of the WB-DEGs with Two Classical DE Gene Selection Methods on Latin Square Data / 341

18.5 Gene Ontology Enrichment and Gene Set Enrichment Analysis / 343

18.6 Conclusion / 345

References / 345

19 CLUSTERING AND CLASSIFICATION TECHNIQUES FOR GENE EXPRESSION PROFILE PATTERN ANALYSIS 347
Emanuel Weitschek, Giulia Fiscon, Valentina Fustaino, Giovanni Felici, and Paola Bertolazzi

19.1 Introduction / 347

19.2 Transcriptome Analysis / 348

19.3 Microarrays / 349

19.3.1 Applications / 349

19.3.2 Microarray Technology / 350

19.3.3 Microarray Workflow / 350

19.4 RNA-Seq / 351

19.5 Benefits and Drawbacks of RNA-Seq and Microarray Technologies / 353

19.6 Gene Expression Profile Analysis / 356

19.6.1 Data Definition / 356

19.6.2 Data Analysis / 357

19.6.3 Normalization and Background Correction / 357

19.6.4 Genes Clustering / 359

19.6.5 Experiment Classification / 361

19.6.6 Software Tools for Gene Expression Profile Analysis / 362

19.7 Real Case Studies / 364

19.8 Conclusions / 367

References / 368

20 MINING INFORMATIVE PATTERNS IN MICROARRAY DATA 371
Li Teng

20.1 Introduction / 371

20.2 Patterns with Similarity / 373

20.2.1 Similarity Measurement / 374

20.2.2 Clustering / 376

20.2.3 Biclustering / 379

20.2.4 Types of Biclusters / 380

20.2.5 Measurement of the Homogeneity / 383

20.2.6 Biclustering Algorithms with Different Searching Schemes / 387

20.3 Conclusion / 391

References / 391

21 ARROW PLOT AND CORRESPONDENCE ANALYSIS MAPS FOR VISUALIZING THE EFFECTS OF BACKGROUND CORRECTION AND NORMALIZATION METHODS ON MICROARRAY DATA 394
Carina Silva, Adelaide Freitas, Sara Roque, and Lisete Sousa

21.1 Overview / 394

21.1.1 Background Correction Methods / 395

21.1.2 Normalization Methods / 396

21.1.3 Literature Review / 397

21.2 Arrow Plot / 399

21.2.1 DE Genes Versus Special Genes / 399

21.2.2 Definition and Properties of the ROC Curve / 400

21.2.3 AUC and Degenerate ROC Curves / 401

21.2.4 Overlapping Coefficient / 402

21.2.5 Arrow Plot Construction / 403

21.3 Significance Analysis of Microarrays / 404

21.4 Correspondence Analysis / 405

21.4.1 Basic Principles / 405

21.4.2 Interpretation of CA Maps / 406

21.5 Impact of the Preprocessing Methods / 407

21.5.1 Class Prediction Context / 408

21.5.2 Class Comparison Context / 408

21.6 Conclusions / 412

Acknowledgments / 413

References / 413

VI PATTERN RECOGNITION IN PHYLOGENETIC TREES 417

22 PATTERN RECOGNITION IN PHYLOGENETICS: TREES AND NETWORKS 419
David A. Morrison

22.1 Introduction / 419

22.2 Networks and Trees / 420

22.3 Patterns and Their Processes / 424

22.4 The Types of Patterns / 427

22.5 Fingerprints / 431

22.6 Constructing Networks / 433

22.7 Multi-Labeled Trees / 435

22.8 Conclusion / 436

References / 437

23 DIVERSE CONSIDERATIONS FOR SUCCESSFUL PHYLOGENETIC TREE RECONSTRUCTION: IMPACTS FROM MODEL MISSPECIFICATION, RECOMBINATION, HOMOPLASY, AND PATTERN RECOGNITION 439
Diego Mallo, Agustín Sánchez-Cobos, and Miguel Arenas

23.1 Introduction / 440

23.2 Overview on Methods and Frameworks for Phylogenetic Tree Reconstruction / 440

23.2.1 Inferring Gene Trees / 441

23.2.2 Inferring Species Trees / 442

23.3 Influence of Substitution Model Misspecification on Phylogenetic Tree Reconstruction / 445

23.4 Influence of Recombination on Phylogenetic Tree Reconstruction / 446

23.5 Influence of Diverse Evolutionary Processes on Species Tree Reconstruction / 447

23.6 Influence of Homoplasy on Phylogenetic Tree Reconstruction: The Goals of Pattern Recognition / 449

23.7 Concluding Remarks / 449

Acknowledgments / 450

References / 450

24 AUTOMATED PLAUSIBILITY ANALYSIS OF LARGE PHYLOGENIES 457
David Dao, Tomáš Flouri, and Alexandros Stamatakis

24.1 Introduction / 457

24.2 Preliminaries / 459

24.3 A Naïve Approach / 462

24.4 Toward a Faster Method / 463

24.5 Improved Algorithm / 467

24.5.1 Preprocessing / 467

24.5.2 Computing Lowest Common Ancestors / 468

24.5.3 Constructing the Induced Tree / 468

24.5.4 Final Remarks / 471

24.6 Implementation / 473

24.6.1 Preprocessing / 473

24.6.2 Reconstruction / 473

24.6.3 Extracting Bipartitions / 474

24.7 Evaluation / 474

24.7.1 Test Data Sets / 474

24.7.2 Experimental Results / 475

24.8 Conclusion / 479

Acknowledgment / 481

References / 481

25 A NEW FAST METHOD FOR DETECTING AND VALIDATING HORIZONTAL GENE TRANSFER EVENTS USING PHYLOGENETIC TREES AND AGGREGATION FUNCTIONS 483
Dunarel Badescu, Nadia Tahiri, and Vladimir Makarenkov

25.1 Introduction / 483

25.2 Methods / 485

25.2.1 Clustering Using Variability Functions / 485

25.2.2 Other Variants of Clustering Functions Implemented in the Algorithm / 487

25.2.3 Description of the New Algorithm / 488

25.2.4 Time Complexity / 491

25.3 Experimental Study / 491

25.3.1 Implementation / 491

25.3.2 Synthetic Data / 491

25.3.3 Real Prokaryotic (Genomic) Data / 495

25.4 Results and Discussion / 501

25.4.1 Analysis of Synthetic Data / 501

25.4.2 Analysis of Prokaryotic Data / 502

25.5 Conclusion / 502

References / 503

VII PATTERN RECOGNITION IN BIOLOGICAL NETWORKS 505

26 COMPUTATIONAL METHODS FOR MODELING BIOLOGICAL INTERACTION NETWORKS 507
Christos Makris and Evangelos Theodoridis

26.1 Introduction / 507

26.2 Measures/Metrics / 508

26.3 Models of Biological Networks / 511

26.4 Reconstructing and Partitioning Biological Networks / 511

26.5 PPI Networks / 513

26.6 Mining PPI Networks—Interaction Prediction / 517

26.7 Conclusions / 519

References / 519

27 BIOLOGICAL NETWORK INFERENCE AT MULTIPLE SCALES: FROM GENE REGULATION TO SPECIES INTERACTIONS 525
Andrej Aderhold, V Anne Smith, and Dirk Husmeier

27.1 Introduction / 525

27.2 Molecular Systems / 528

27.3 Ecological Systems / 528

27.4 Models and Evaluation / 529

27.4.1 Notations / 529

27.4.2 Sparse Regression and the LASSO / 530

27.4.3 Bayesian Regression / 530

27.4.4 Evaluation Metric / 531

27.5 Learning Gene Regulation Networks / 532

27.5.1 Nonhomogeneous Bayesian Regression / 533

27.5.2 Gradient Estimation / 534

27.5.3 Simulated Bio-PEPA Data / 534

27.5.4 Real m RNA Expression Profile Data / 535

27.5.5 Method Evaluation and Learned Networks / 536

27.6 Learning Species Interaction Networks / 540

27.6.1 Regression Model of Species interactions / 540

27.6.2 Multiple Global Change-Points / 541

27.6.3 Mondrian Process Change-Points / 542

27.6.4 Synthetic Data / 544

27.6.5 Simulated Population Dynamics / 544

27.6.6 Real World Plant Data / 546

27.6.7 Method Evaluation and Learned Networks / 546

27.7 Conclusion / 550

References / 550

28 DISCOVERING CAUSAL PATTERNS WITH STRUCTURAL EQUATION MODELING: APPLICATION TO TOLL-LIKE RECEPTOR SIGNALING PATHWAY IN CHRONIC LYMPHOCYTIC LEUKEMIA 555
Athina Tsanousa, Stavroula Ntoufa, Nikos Papakonstantinou, Kostas Stamatopoulos, and Lefteris Angelis

28.1 Introduction / 555

28.2 Toll-Like Receptors / 557

28.2.1 Basics / 557

28.2.2 Structure and Signaling of TLRs / 558

28.2.3 TLR Signaling in Chronic Lymphocytic Leukemia / 559

28.3 Structural Equation Modeling / 560

28.3.1 Methodology of SEM Modeling / 560

28.3.2 Assumptions / 561

28.3.3 Estimation Methods / 562

28.3.4 Missing Data / 562

28.3.5 Goodness-of-Fit Indices / 563

28.3.6 Other Indications of a Misspecified Model / 565

28.4 Application / 566

28.5 Conclusion / 580

References / 581

29 ANNOTATING PROTEINS WITH INCOMPLETE LABEL INFORMATION 585
Guoxian Yu, Huzefa Rangwala, and Carlotta Domeniconi

29.1 Introduction / 585

29.2 Related Work / 587

29.3 Problem Formulation / 589

29.3.1 The Algorithm / 591

29.4 Experimental Setup / 592

29.4.1 Data sets / 592

29.4.2 Comparative Methods / 593

29.4.3 Experimental Protocol / 594

29.4.4 Evaluation Criteria / 594

29.5 Experimental Analysis / 596

29.5.1 Replenishing Missing Functions / 596

29.5.2 Predicting Unlabeled Proteins / 600

29.5.3 Component Analysis / 604

29.5.4 Run Time Analysis / 604

29.6 Conclusions / 605

Acknowledgments / 606

References / 606

INDEX 609

A propos de l’auteur

Mourad Elloumi, Ph D, is Professor in Computer Science at the University of Tunis-El Manar, Tunisia. Dr. Elloumi is the author/co-author of more than 50 publications in international journals and conference proceedings related to Algorithmics, Computational Molecular Biology, and Knowledge Discovery and Data Mining.
Costas S. Iliopoulos, Ph D, is Professor of Algorithm Design at King’s College London, UK. Dr. Iliopoulos co-authored over 300 peer-reviewed articles in pattern matching and combinatorics of strings. He serves on the editorial board of the Journal of Discrete Algorithms, Computer Mathematics & Combinatorial Computing, and System Biology & Biomedical Technologies.
Jason T. L. Wang, Ph D, is Professor of Computer Science at the New Jersey Institute of Technology, USA. Dr. Wang has published extensively on Data Mining and Computational Molecular Biology, and has been a member of program committees for over 200 conferences and workshops in these and related areas.
Albert Y. Zomaya, Ph D, is the Chair Professor of High Performance Computing & Networking in the School of Information Technologies, University of Sydney, Australia. Dr. Zomaya published more than 500 scientific papers and articles and is author, co-author or editor of more than 20 books. Dr. Zomaya is Fellow of AAAS, IEEE, and IET.

Achetez cet ebook et obtenez-en 1 de plus GRATUITEMENT !

Langue Anglais ● Format EPUB ● ISBN 9781119078869 ● Taille du fichier 37.6 MB ● Maison d’édition John Wiley & Sons ● Pays US ● Publié 2015 ● Édition 1 ● Téléchargeable 24 mois ● Devise EUR ● ID 4820704 ● Protection contre la copie sans