This two-volume set, LNAI 9077 + 9078, constitutes the refereed proceedings of the 19th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining, PAKDD 2015, held in Ho Chi Minh City, Vietnam, in May 2015. The proceedings contain 117 papers carefully reviewed and selected from 405 submissions. They have been organized in topical sections named: social networks and social media; classification; machine learning; applications; novel methods and algorithms; opinion mining and sentiment analysis; clustering; outlier and anomaly detection; mining uncertain and imprecise data; mining temporal and spatial data; feature extraction and selection; mining heterogeneous, high-dimensional, and sequential data; entity resolution and topic-modeling; itemset and high-performance data mining; and recommendations.
Contents
Opinion Mining and Sentiment Analysis.- Emotion Cause Detection for Chinese Micro-Blogs Based on ECOCC Model.- Parallel Recursive Deep Model for Sentiment Analysis.- Sentiment Analysis in Transcribed Utterances.- Rating Entities and Aspects Using a Hierarchical Model.- Sentiment Analysis on Microblogging by Integrating Text and Image Features.- TSum4act: A Framework for Retrieving and Summarizing Actionable Tweets during a Disaster for Reaction.-
Clustering.- Evolving Chinese Restaurant Processes for Modeling Evolutionary Traces in Temporal Data.- Small-Variance Asymptotics for Bayesian Nonparametric Models with Constraints.- Spectral Clustering for Large-Scale Social Networks via a Pre-Coarsening Sampling Based Nyström Method.- pcStream: A Stream Clustering Algorithm for Dynamically Detecting and Managing Temporal Contexts.- Clustering Over Data Streams Based on Growing Neural Gas.- Computing and Mining ClustCube Cubes Efficiently.-
Outlier and Anomaly Detection.- Contextual Anomaly Detection Using Log-Linear Tensor Factorization.- A Semi-Supervised Framework for Social Spammer Detection.- Fast One-Class Support Vector Machine for Novelty Detection.- ND-SYNC: Detecting Synchronized Fraud Activities.- An Embedding Scheme for Detecting Anomalous Block Structured Graphs.- A Core-Attach Based Method for Identifying Protein Complexes in Dynamic PPI Networks.-
Mining Uncertain and Imprecise Data.- Mining Uncertain Sequential Patterns in Iterative MapReduce.- Quality Control for Crowdsourced POI Collection.- Towards Efficient Sequential Pattern Mining in Temporal Uncertain Databases.- Preference-Based Top-k Representative Skyline Queries on Uncertain Databases.- Cluster Sequence Mining: Causal Inference with Time and Space Proximity under Uncertainty.- Achieving Accuracy Guarantee for Answering Batch Queries with Differential Privacy.-
Mining Temporal and Spatial Data.- Automated Classification of Passing in Football.- Stabilizing Sparse Cox Model Using Statistic and Semantic Structures in Electronic Medical Records.- Predicting Next Locations with Object Clustering and Trajectory Clustering.- A Plane Moving Average Algorithm for Short-Term Traffic Flow Prediction.- Recommending Profitable Taxi Travel Routes Based on Big Taxi Trajectories Data.- Semi Supervised Adaptive Framework for Classifying Evolving Data Stream.-
Feature Extraction and Selection.- Cost-Sensitive Feature Selection on Heterogeneous Data.- A Feature Extraction Method for Multivariate Time Series Classification Using Temporal Patterns.- Scalable Outlying-Inlying Aspects Discovery via Feature Ranking.- A DC Programming Approach for Sparse Optimal Scoring.- Graph Based Relational Features for Collective Classification.- A New Feature Sampling Method in Random Forests for Predicting High-Dimensional Data.-
Mining Heterogeneous, High Dimensional, and Sequential Data.- Seamlessly Integrating Effective Links with Attributes for Networked Data Classification.- Clustering on Multi-source Incomplete Data via Tensor Modeling and Factorization.- Locally Optimized Hashing for Nearest Neighbor Search.- Do-Rank: DCG Optimization for Learning-to-Rank in Tag-Based Item Recommendation Systems.- Efficient Discovery of Recurrent Routine Behaviours in Smart Meter Time Series by Growing Subsequences.- Convolutional Nonlinear Neighbourhood Components Analysis for Time Series Classification.-
Entity Resolution and Topic Modelling.- Clustering-Based Scalable Indexing for Multi-party Privacy-Preserving Record Linkage.- Efficient Interactive Training Selection for Large-Scale Entity Resolution.- Unsupervised Blocking Key Selection for Real-Time Entity Resolution.- Incorporating Probabilistic Knowledge into Topic Models.- Learning Focused Hierarchical Topic Models with Semi-Supervision in Microblogs.- Predicting Future Links Between Disjoint Research Areas Using Heterogeneous Bibliographic Information Network.-
Itemset and High Performance Data Mining.- CPT+: Decreasing the Time/Space Complexity of the Compact Prediction Tree.- Mining Association Rules in Graphs Based on Frequent Cohesive Itemsets.- Mining High Utility Itemsets in Big Data.- Decomposition Based SAT Encodings for Itemset Mining Problems.- A Comparative Study on Parallel LDA Algorithms in MapReduce Framework.- Distributed Newton Methods for Regularized Logistic Regression.-
Recommendation.- Coupled Matrix Factorization Within Non-IID Context.- Complementary Usage of Tips and Reviews for Location Recommendation in Yelp.- Coupling Multiple Views of Relations for Recommendation.- Pairwise One Class Recommendation Algorithm.- RIT: Enhancing Recommendation with Inferred Trust.