Nikolaus Augsten & Michael Bohlen
Similarity Joins in Relational Database Systems [PDF ebook]

State-of-the-art database systems manage and process a variety of complex objects, including strings and trees. For such objects equality comparisons are often not meaningful and must be replaced by similarity comparisons. This book describes the concepts and techniques to incorporate similarity into database systems.

We start out by discussing the properties of strings and trees, and identify the edit distance as the de facto standard for comparing complex objects. Since the edit distance is computationally expensive, token-based distances have been introduced to speed up edit distance computations. The basic idea is to decompose complex objects into sets of tokens that can be compared efficiently. Token-based distances are used to compute an approximation of the edit distance and prune expensive edit distance calculations.

A key observation when computing similarity joins is that many of the object pairs for which the similarity is computed are very different from each other. Filters exploit this property to improve the performance of similarity joins. A filter preprocesses the input data sets and produces a set of candidate pairs. The distance function is evaluated on the candidate pairs only. We describe the essential query processing techniques for filters based on lower and upper bounds. For token equality joins we describe prefix, size, positional and partitioning filters, which can be used to avoid the computation of small intersections that are not needed since the similarity would be too low.
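
The filter-then-verify idea in the description can be illustrated with a minimal Python sketch. It is not taken from the book. It assumes strings as the objects, q-grams (overlapping substrings of length q) as the tokens, and two well-known lower-bound filters: a length filter and a q-gram count filter. All function names (`edit_distance`, `qgrams`, `may_match`, `similarity_join`) and the nested-loop join are illustrative choices only.

```python
from collections import Counter
from itertools import product


def edit_distance(s: str, t: str) -> int:
    """Levenshtein distance via the standard two-row dynamic program."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        curr = [i]
        for j, ct in enumerate(t, start=1):
            curr.append(min(
                prev[j] + 1,               # delete cs
                curr[j - 1] + 1,           # insert ct
                prev[j - 1] + (cs != ct),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def qgrams(s: str, q: int) -> Counter:
    """Bag (multiset) of the overlapping substrings of length q."""
    return Counter(s[i:i + q] for i in range(len(s) - q + 1))


def may_match(s: str, t: str, tau: int, q: int = 2) -> bool:
    """Return False only if edit_distance(s, t) > tau is guaranteed."""
    # Length filter: each edit changes the length by at most one.
    if abs(len(s) - len(t)) > tau:
        return False
    # Count filter: each edit destroys at most q q-grams, so strings
    # within distance tau share at least this many q-grams.
    shared = sum((qgrams(s, q) & qgrams(t, q)).values())
    required = max(len(s), len(t)) - q + 1 - q * tau
    return shared >= required


def similarity_join(left, right, tau, q=2):
    """Nested-loop join: filter cheaply, verify only the candidates."""
    result, verified = [], 0
    for s, t in product(left, right):
        if not may_match(s, t, tau, q):
            continue  # pruned without computing the edit distance
        verified += 1
        if edit_distance(s, t) <= tau:
            result.append((s, t))
    return result, verified


pairs, verified = similarity_join(
    ["database", "similarity", "token"],
    ["databases", "similarly", "tree"],
    tau=2,
)
print(pairs, verified)
```

In this sketch the filters never discard a true match; they only reduce how often the quadratic edit distance routine runs. A real system would replace the nested loop with an index over the tokens, which is where prefix, size and positional filters come in.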

€32.09

Buy this ebook and get one more free!

Language: English ● Format: PDF ● Pages: 124 ● ISBN 9781627050296 ● Publisher: Morgan & Claypool Publishers ● Published: 2013 ● Downloadable: 3 times ● Currency: EUR ● ID 5479233 ● Copy protection: Adobe DRM

Requires a DRM-capable ebook reader

More ebooks from the same author / editor

Nikolaus Augsten Michael Bohlen

75,745 ebooks in this category

Start Vector.com: Rank Hack Method

Kandarpa Kumar Sarma & Manash Pratim Sarma: Intelligent Applications for Heterogeneous System Modeling and Design

Cyrille Debacq: Vos premiers pas avec SAP ABAP

Dive into ABAP with this book, which guides you step by step from a description of the components of the SAP ABAP language through to the programming stage. You will look in detail at the various elements …

Barry L. (Williams Technology Audit Services, Alton, Illinois, USA) Williams: Information Security Policy Development for Compliance

Although compliance standards can be helpful guides to writing comprehensive security policies, many of the standards state the same requirements in slightly different ways. Information Security Poli …

Haripriya Dhall & Bijay Kumar Sahoo: Microsoft Power Platform A Deep Dive

With "Microsoft’s Power Platform A Deep Dive," you can learn more about how Microsoft’s Power Platform creates and fosters opportunities for users to enhance their technical skills and boo …

Abhay Bhargav: PCI Compliance

Although organizations that store, process, or transmit cardholder information are required to comply with payment card industry standards, most find it extremely challenging to comply with and meet …

Roberto (New Jersey Institute of Technology, Newark, New Jersey, USA) Rojas-Cessa: Interconnections for Computer Communications and Packet Networks

This book introduces different interconnection networks applied to different systems. Interconnection networks are used to connect processing units in a multi-processor system, routers in communi …

El-P. & El-I. Taylor: Understanding the Basics About Bitcoin & Other Cryptocurrencies, the Beginner’s 101 Guide – An Introductory Explanation for Beginners Part 1 the First Most Comprehensive Book to Understanding Cryptocurrency With Step-By-Step Instruct

Understanding the Basics About Bitcoin & Other Cryptocurrencies is the Beginner’s Go-To-Guide for Understanding Cryptocurrency and is intended to be used as a manual/handbook. It first lays the nece …

Ade Asefeso MCIPS MBA: Online Marketing for Start-ups and Offline Business

Nick Vandome: Android Phones for seniors in easy steps, 4th edition

Android phones continue to be the most popular smartphones around the world, combining power, efficiency and the comprehensive Android operating system and interface. However, with the constantly exp …