The increasing availability of data in our current,
information-overloaded society has led to the need for valid tools for its
modelling and analysis. Data mining and applied statistical methods
are the appropriate tools to extract knowledge from such data. This
book provides an accessible introduction to data mining methods in
a consistent and application-oriented statistical framework, using
case studies drawn from real industry projects and highlighting the
use of data mining methods in a variety of business applications.
* Introduces data mining methods and applications.
* Covers classical and Bayesian multivariate statistical
methodology as well as machine learning and computational data
mining methods.
* Includes many recent developments such as association and
sequence rules, graphical Markov models, lifetime value modelling,
credit risk, operational risk and web mining.
* Features detailed case studies based on applied projects within
industry.
* Incorporates discussion of data mining software, with case
studies analysed using R.
* Is accessible to anyone with a basic knowledge of statistics or
data analysis.
* Includes an extensive bibliography and pointers to further
reading within the text.
Applied Data Mining for Business and Industry, 2nd
edition is aimed at advanced undergraduate and graduate
students of data mining, applied statistics, database management,
computer science and economics. The case studies will provide
guidance to professionals working in industry on projects involving
large volumes of data, such as customer relationship management,
web design, risk management, marketing, economics and finance.
Table of Contents
1 Introduction.
Part I Methodology.
2 Organisation of the data.
2.1 Statistical units and statistical variables.
2.2 Data matrices and their transformations.
2.3 Complex data structures.
2.4 Summary.
3 Summary statistics.
3.1 Univariate exploratory analysis.
3.2 Bivariate exploratory analysis of quantitative data.
3.3 Multivariate exploratory analysis of quantitative data.
3.4 Multivariate exploratory analysis of qualitative data.
3.5 Reduction of dimensionality.
3.6 Further reading.
4 Model specification.
4.1 Measures of distance.
4.2 Cluster analysis.
4.3 Linear regression.
4.4 Logistic regression.
4.5 Tree models.
4.6 Neural networks.
4.7 Nearest-neighbour models.
4.8 Local models.
4.9 Uncertainty measures and inference.
4.10 Non-parametric modelling.
4.11 The normal linear model.
4.12 Generalised linear models.
4.13 Log-linear models.
4.14 Graphical models.
4.15 Survival analysis models.
4.16 Further reading.
5 Model evaluation.
5.1 Criteria based on statistical tests.
5.2 Criteria based on scoring functions.
5.3 Bayesian criteria.
5.4 Computational criteria.
5.5 Criteria based on loss functions.
5.6 Further reading.
Part II Business case studies.
6 Describing website visitors.
6.1 Objectives of the analysis.
6.2 Description of the data.
6.3 Exploratory analysis.
6.4 Model building.
6.5 Model comparison.
6.6 Summary report.
7 Market basket analysis.
7.1 Objectives of the analysis.
7.2 Description of the data.
7.3 Exploratory data analysis.
7.4 Model building.
7.5 Model comparison.
7.6 Summary report.
8 Describing customer satisfaction.
8.1 Objectives of the analysis.
8.2 Description of the data.
8.3 Exploratory data analysis.
8.4 Model building.
8.5 Summary.
9 Predicting credit risk of small businesses.
9.1 Objectives of the analysis.
9.2 Description of the data.
9.3 Exploratory data analysis.
9.4 Model building.
9.5 Model comparison.
9.6 Summary report.
10 Predicting e-learning student performance.
10.1 Objectives of the analysis.
10.2 Description of the data.
10.3 Exploratory data analysis.
10.4 Model specification.
10.5 Model comparison.
10.6 Summary report.
11 Predicting customer lifetime value.
11.1 Objectives of the analysis.
11.2 Description of the data.
11.3 Exploratory data analysis.
11.4 Model specification.
11.5 Model comparison.
11.6 Summary report.
12 Operational risk management.
12.1 Context and objectives of the analysis.
12.2 Exploratory data analysis.
12.3 Model building.
12.4 Model comparison.
12.5 Summary conclusions.
References.
Index.
About the Authors
Paolo Giudici – Department of Economics and
Quantitative Methods, University of Pavia. A lecturer in data
mining, business statistics, data analysis and risk management,
Professor Giudici is also the director of the data mining
laboratory. He is the author of around 80 publications, the
coordinator of 2 national research grants on data mining, and the local
coordinator of a European integrated project on the topic. He was
the sole author of the first edition of this book, which has been
translated into both Italian and Chinese. He is also one of the
Editors of Wiley’s Series in Computational Statistics.
Silvia Figini – Ms Figini has worked for 2 years for the
Competence centre for data mining analysis and business
intelligence at SAS Milan. She is currently completing a PhD in
statistics, and already has a collection of publications to her
name.