This book is about inductive databases and constraint-based data mining, emerging research topics lying at the intersection of data mining and database research. The aim of the book is to provide an overview of the state of the art in this novel and exciting research area. Of special interest are the recent methods for constraint-based mining of global models for prediction and clustering, the unification of pattern mining approaches through constraint programming, the clarification of the relationship between mining local patterns and global models, and the proposed integrative frameworks and approaches for inductive databases. On the application side, applications to practically relevant problems from bioinformatics are presented.

Inductive databases (IDBs) represent a database view on data mining and knowledge discovery. IDBs contain not only data, but also generalizations (patterns and models) valid in the data. In an IDB, ordinary queries can be used to access and manipulate data, while inductive queries can be used to generate (mine), manipulate, and apply patterns and models. In the IDB framework, patterns and models become "first-class citizens" and KDD becomes an extended querying process in which both the data and the patterns/models that hold in the data are queried.
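To make the distinction between ordinary and inductive queries concrete, here is a minimal Python sketch. It is an illustration only and is not taken from the book: the toy data and the names `transactions`, `ordinary_query`, and `inductive_query` are hypothetical, and the brute-force enumeration stands in for whatever mining algorithm a real IDB would use. The inductive query mines itemsets under a minimum-support constraint, and its result is then filtered like any other data.

```python
from itertools import combinations

# Hypothetical toy data: each transaction is a set of items.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
]


def ordinary_query(data, item):
    """Ordinary query: return the transactions that contain `item`."""
    return [t for t in data if item in t]


def inductive_query(data, min_support, max_size=2):
    """Inductive query: mine itemsets satisfying a minimum-support constraint.

    Brute-force enumeration of all itemsets up to `max_size` items;
    returns a dict mapping each qualifying itemset (a sorted tuple) to its support.
    """
    items = sorted(set().union(*data))
    patterns = {}
    for size in range(1, max_size + 1):
        for candidate in combinations(items, size):
            support = sum(1 for t in data if set(candidate) <= t)
            if support >= min_support:
                patterns[candidate] = support
    return patterns


# Patterns are stored like data and can be queried again.
patterns = inductive_query(transactions, min_support=2)
with_milk = {p: s for p, s in patterns.items() if "milk" in p}
```

Here `with_milk` is a query over the mined patterns rather than over the raw transactions, which is the sense in which patterns are treated like data in this sketch.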
Table of contents
Inductive Databases and Constraint-based Data Mining: Introduction and Overview.- Representing Entities in the OntoDM Data Mining Ontology.- A Practical Comparative Study Of Data Mining Query Languages.- A Theory of Inductive Query Answering.- Constraint-based Mining: Selected Techniques.- Generalizing Itemset Mining in a Constraint Programming Setting.- From Local Patterns to Classification Models.- Constrained Predictive Clustering.- Finding Segmentations of Sequences.- Mining Constrained Cross-Graph Cliques in Dynamic Networks.- Probabilistic Inductive Querying Using ProbLog.- Inductive Databases: Integration Approaches.- Inductive Querying with Virtual Mining Views.- SINDBAD and SiQL: Overview, Applications and Future Developments.- Patterns on Queries.- Experiment Databases.- Applications.- Predicting Gene Function using Predictive Clustering Trees.- Analyzing Gene Expression Data with Predictive Clustering Trees.- Using a Solver Over the String Pattern Domain to Analyze Gene Promoter Sequences.- Inductive Queries for a Drug Designing Robot Scientist.