While Web 2.0 was about data, Web 3.0 is about knowledge and information. Scripting Intelligence: Web 3.0 Information Gathering and Processing offers Ruby scripts for intelligent information management in a Web 3.0 environment. Topics include extracting information from text, using Semantic Web technologies, gathering information (relational database metadata, web scraping, Wikipedia, Freebase), combining information from multiple sources, and strategies for publishing processed information. This book will be a valuable tool for anyone who needs to gather, process, and publish web or database information across the modern web environment.
- Text processing recipes, including part-of-speech tagging and automatic summarization
- Gathering, visualizing, and publishing information from the Semantic Web
- Information gathering from traditional sources such as relational databases and web sites
Contents
- Text Processing
- Parsing Common Document Types
- Cleaning, Segmenting, and Spell-Checking Text
- Natural Language Processing
- The Semantic Web
- Using RDF and RDFS Data Formats
- Delving Into RDF Data Stores
- Performing SPARQL Queries and Understanding Reasoning
- Implementing SPARQL Endpoint Web Portals
- Information Gathering and Storage
- Working with Relational Databases
- Supporting Indexing and Search
- Using Web Scraping to Create Semantic Relations
- Taking Advantage of Linked Data
- Implementing Strategies for Large-Scale Data Storage
- Information Publishing
- Creating Web Mashups
- Performing Large-Scale Data Processing
- Building Information Web Portals
About the author
Mark Watson is the author of 14 books on artificial intelligence, Java, C++, UML, and Linux. He is a consultant who uses Ruby, Java, and Common Lisp. He maintains a web site at markwatson.com.