This volume presents the latest advances in statistics and data science, including theoretical, methodological and computational developments and practical applications related to classification and clustering, data gathering, exploratory and multivariate data analysis, statistical modeling, and knowledge discovery and seeking. It includes contributions on analyzing and interpreting large, complex and aggregated datasets, and highlights numerous applications in economics, finance, computer science, political science and education. It gathers a selection of peer-reviewed contributions presented at the 16th Conference of the International Federation of Classification Societies (IFCS 2019), which was organized by the Greek Society of Data Analysis and held in Thessaloniki, Greece, on August 26-29, 2019.
Contents
Chapter 1 – PerioClust: a simple hierarchical agglomerative clustering approach including constraints (Lise Bellanger, Arthur Coulon and Philippe Husi)
Chapter 2 – What was really the case? Party competition in Europe at the occasion of the 2019 European Parliament Elections (Theodore Chadjipadelis and Eftichia Teperoglou)
Chapter 3 – A Fast Electric Vehicle Planner Using Clustering (Jaël Champagne Gareau, Éric Beaudry and Vladimir Makarenkov)
Chapter 4 – A generalized coefficient of determination for mixtures of regressions (Roberto Di Mari, Salvatore Ingrassia and Antonio Punzo)
Chapter 5 – Distance measurement when fuzzy numbers are used. Survey of selected problems and procedures (Józef Dziechciarz and Marta Dziechciarz-Duda)
Chapter 6 – Performance measures in discrete supervised classification (Ana Sousa Ferreira and Anabela Marques)
Chapter 7 – Using EVT to assess risk on energy market (Alicja Ganczarek-Gamrot, Dominik Krezołek and Grazyna Trzpiot)
Chapter 8 – Measuring and testing mutual dependence for functional data (Tomasz Górecki, Mirosław Krzysko and Waldemar Wołynski)
Chapter 9 – Single imputation via chunk-wise PCA (Alfonso Iodice D’Enza, Francesco Palumbo and Angelos Markos)
Chapter 10 – Clustering Mixed-Type Data: a benchmark study on KAMILA and K-prototypes (Jarrett Jimeno, Madhumita Roy and Cristina Tortora)
Chapter 11 – Exploring social attitudes towards the green infrastructure plan of the Drama city in Greece (Vassiliki Kazana, Angelos Kazaklis, Dimitrios Raptis, Efthimia Chrisanthidou, Stella Kazakli and Nefeli Zagourgini)
Chapter 12 – Spatial perception for structured and unstructured data in topological data analysis (Yoshitake Kitanishi, Fumio Ishioka, Masaya Iizuka and Koji Kurihara)
Chapter 13 – Text, Content and Data Analysis of Journal Articles: the field of International Relations (Nikos Koutsoupias and Kyriakos Mikelis)
Chapter 14 – Quantile measures of extreme risk on metals market (Dominik Krezołek and Grazyna Trzpiot)
Chapter 15 – Evaluation of Text Clustering Methods and their Dataspace Embeddings: an Exploration (Alain Lelu and Martine Cadot)
Chapter 16 – Specification of Basis Spacing for Process Convolution Gaussian Process Models (Waley Liang and Herbert Lee)
Chapter 17 – Estimation of Classification Rules from Partially Classified Data (Geoffrey McLachlan and Daniel Ahfock)
Chapter 18 – Correspondence Analysis and Kriging: Projection of quantitative information on the factorial maps (George Menexes and Thomas Koutsos)
Chapter 19 – Intertemporal exploratory analysis of e-commerce from Greek households from official statistics data (Stratos Moschidis and Athanasios Thanopoulos)
Chapter 20 – Benchmarking in cluster analysis: a study on Spectral Clustering, DBSCAN, and K-means (Nivedha Murugesan, Irene Cho and Cristina Tortora)
Chapter 21 – Detection of Topics and Time Series Variation in Consumer Web Communication Data (Atsuho Nakayama)
Chapter 22 – Classification through graphical models: Evidences from the EU-SILC data (Federica Nicolussi, Agnese Maria Di Brisco and Manuela Cazzaro)
Chapter 23 – A simulation study for the identification of missing data mechanisms using visualisation (Johané Nienkemper-Swanepoel, Niël le Roux and Sugnet Gardner-Lubbe)
Chapter 24 – Triplet Clustering of One-Mode Two-Way Proximities (Akinori Okada and Satoru Yokoyama)
Chapter 25 – First-time Voters in Greece: Views and Attitudes of Youth on Europe and Democracy (Georgia Panagiotidou and Theodore Chadjipadelis)
Chapter 26 – Comparison of hierarchical clustering methods for binary data from SSR and ISSR molecular markers (Emmanouil Pratsinakis, Lefkothea Karapetsi, Symela Ntoanidou, Angelos Markos, Panagiotis Madesis, Ilias Eleftherohorinos and George Menexes)
Chapter 27 – One-way repeated measures ANOVA for functional data (Łukasz Smaga)
Chapter 28 – Flexible Clustering (Sokołowski Andrzej and Małgorzata Markowska)
Chapter 29 – Classification of entrepreneurial regimes: a symbolic polygonal clustering approach (Andrej Srakar and Marilena Vecco)
Chapter 30 – Multidimensional factor and cluster analysis vs embedding-based learning for personalised supermarket offer recommendations (George Stalidis, Theodosios Siomos, Pantelis Kaplanoglou, Alkiviadis Katsalis, Iphigenia Karaveli, Marina Delianidi and Konstantinos Diamantaras)
Chapter 31 – Motivation for Participating in the Sharing Economy: The Case of Hungary (Roland Szilágyi and Levente Lengyel)
Chapter 32 – Benchmarking Minimax Linkage in Hierarchical Clustering (Xiao Hui Tai and Kayla Frisoli)
Chapter 33 – Clustering Binary Data by Application of Combinatorial Optimization Heuristics (Javier Trejos-Zelaya, Luis Eduardo Amaya-Briceño, Alejandra Jiménez-Romero, Alex Murillo-Fernández, Eduardo Piza-Volio and Mario Villalobos-Arias)
Chapter 34 – Classifying Users through Keystroke Dynamics (Ioannis Tsimperidis, Georgios Peikos and Avi Arampatzis)
Chapter 35 – Technological innovation and the critical raw material stock (Beatrix Varga and Kitti Fodor)
Chapter 36 – Redundancy analysis for binary data based on logistic responses (Jose L. Vicente-Villardon and Laura Vicente-Gonzalez)
Chapter 37 – Predictive power of school motivation clusters in secondary education (Matthijs J. Warrens and W. Miro Ebert)
About the authors
Theodore Chadjipadelis is a Full Professor of Applied Statistics at the School of Political Sciences at the Aristotle University of Thessaloniki, Greece, and was Head of the Department from 2006 to 2009 and from 2013 to 2016. His main research interests are in the fields of applied statistics, statistics education, electoral and political behaviour, urban and regional planning, and e-governance. He coordinates the Greek section of the C.C.S. (Comparative Candidates Survey) and the C.S.E.S. (Comparative Study of Electoral Systems) programmes. He has published more than 100 papers in international journals, encyclopedias, conference proceedings and edited books.
Berthold Lausen is a Full Professor of Data Science and Head of the Department of Mathematical Sciences at the University of Essex, Colchester, UK, former president of the International Federation of Classification Societies (IFCS) from 2018 to 2019, former president of the Data Science Society (GfKl) from 2013 to 2019 and founding vice president of the European Association for Data Science (EuADS) from 2015 to 2018. Since 2014 he has led the introduction and development of data science education at the University of Essex. His research interests are in the fields of artificial intelligence, biostatistics, classification, clinical research, data science and machine learning. He has published more than 100 papers in international journals, conference proceedings and edited books.
Angelos Markos is an Associate Professor of Data Analysis in the Social Sciences at the School of Education at the Democritus University of Thrace, Greece. He has been a Board Member of the Greek Society of Data Analysis since 2009. His research interests are in the fields of multivariate data analysis, dimension reduction and clustering, particularly correspondence analysis and related methods. He has published more than 50 papers in international journals, encyclopedias, conference proceedings and edited books.
Tae Rim Lee is an Honorary Professor of the Department of Data Science & Statistics, and former Dean of the College of Natural Science at the KNOU in Seoul. She is a biostatistician whose main research interests include tree-based classification models with CART, FACT, NN, kernel discrimination, deep learning for HCC patients, and survival trees for OSCC. She was the president of KOSHIS (2009-2011) and KCS (2008-2016), council member of IFCS (2008-present), treasurer (2002-2004), vice president of KSS (2014-2015), and vice president of IASE under ISI (2011-2013). She was elected as an Executive Board Director of IBS (2015-2021) and Organizing Committee Member of the International Prize in Statistical Foundation (2018-present).
Angela Montanari is a Full Professor of Statistics and Head of the Department of Statistical Sciences at the University of Bologna, Italy, president of the International Federation of Classification Societies (IFCS) from 2020 to 2021, and former president of the Classification Group of the Italian Statistical Society (CLADAG) from 2007 to 2009. Her research interests are in the fields of supervised and unsupervised classification, dimension reduction, data science and machine learning. She has published more than 100 papers in international journals, conference proceedings and edited books.
Rebecca Nugent is the Stephen E. and Joyce Fienberg Professor of Statistics & Data Science, and the Associate Department Head and Co-Director of Undergraduate Studies for the Statistics & Data Science Department at Carnegie Mellon University in Pittsburgh. She is the President-Elect of the International Federation of Classification Societies (2020-2021) and was President of the Classification Society (of North America) from 2012 to 2014. She has won several national and university teaching awards, including the American Statistical Association Waller Award for Innovation in Statistics Education, and serves as one of the co-editors of the Springer Texts in Statistics. She recently served on the National Academy of Sciences study on Envisioning the Data Science Discipline: The Undergraduate Perspective and is the co-chair of the 2020 NAS study Improving Defense Acquisition Workforce Capability in Data Use. She has worked extensively in clustering and classification methodology with an emphasis on high-dimensional, big data problems and record linkage applications. Her current research focus is the development and deployment of low-barrier data analysis platforms that allow for adaptive instruction and the study of data science as a science.