This book presents an ample, richly illustrated account of the results and experience gained from a project dealing with the analysis of data on behavior patterns on the Web. The focus is on Web advertising, and the ultimate aim is to assess the share of artificial, automated activity (ad fraud), as opposed to genuine human activity.
After a comprehensive introductory part, the book gives a full-fledged report on a wide range of analytic and design efforts, aimed at: representing Web behavior patterns; forming and selecting telling variables; structuring the populations of behavior patterns, including through clustering; classifying these patterns; and devising the most effective and efficient techniques for separating artificial from genuine traffic.
A series of important and useful conclusions is drawn, concerning both the nature of the observed phenomenon, and hence the characteristics of the respective datasets, and the appropriateness of the methodological approaches tried out and devised. Some of these observations and conclusions, relating both to the data and to the methods employed, provide new insight and are sometimes surprising.
The book also provides a rich bibliography on the main problem addressed and on the various methodologies tried out.
Table of contents
The problem and its key characteristics.- The pragmatics of the data acquisition and assessment.- The proper representation: patterns, variables and their analysis.- Clustering analysis.- Building the classifiers.- The hybrid cluster-and-classify approach.- A summary view of the problem and its solution.