Take a dive into data lakes
‘Data lakes’ is the latest buzzword in the world of data storage, management, and analysis. Data Lakes For Dummies decodes and demystifies the concept and helps you get a straightforward answer to the question: ‘What exactly is a data lake and do I need one for my business?’ Written for an audience of technology decision makers tasked with keeping up with the latest and greatest data options, this book provides the perfect introductory survey of these novel and growing features of the information landscape. It explains how data lakes can help your business, what they can (and can’t) achieve, and what you need to do to create the lake that best suits your particular needs.
With a minimum of jargon, prolific tech author and business intelligence consultant Alan Simon explains how data lakes differ from other data storage paradigms. Once you’ve got the background picture, he maps out ways you can add a data lake to your business systems; migrate existing information and switch on the fresh data supply; clean up the product; and open channels to the best intelligence software for interpreting what you’ve stored.
* Understand and build data lake architecture
* Store, clean, and synchronize new and existing data
* Compare the best data lake vendors
* Structure raw data and produce usable analytics
Whatever your business, data lakes are going to form ever more prominent parts of the information universe every business should have access to. Dive into this book to start exploring the deep competitive advantage they make possible, and make sure your business isn’t left standing on the shore.
Table of Contents
Introduction
Part 1: Getting Started with Data Lakes
Chapter 1: Jumping into the Data Lake
Chapter 2: Planning Your Day (and the Next Decade) at the Data Lake
Chapter 3: Break Out the Life Vests: Tackling Data Lake Challenges
Part 2: Building the Docks, Avoiding the Rocks
Chapter 4: Imprinting Your Data Lake on a Reference Architecture
Chapter 5: Anybody Hungry? Ingesting and Storing Raw Data in Your Bronze Zone
Chapter 6: Your Data Lake’s Water Treatment Plant: The Silver Zone
Chapter 7: Bottling Your Data Lake Water in the Gold Zone
Chapter 8: Playing in the Sandbox
Chapter 9: Fishing in the Data Lake
Chapter 10: Rowing End-to-End across the Data Lake
Part 3: Evaporating the Data Lake into the Cloud
Chapter 11: A Cloudy Day at the Data Lake
Chapter 12: Building Data Lakes in Amazon Web Services
Chapter 13: Building Data Lakes in Microsoft Azure
Part 4: Cleaning Up the Polluted Data Lake
Chapter 14: Figuring Out If You Have a Data Swamp Instead of a Data Lake
Chapter 15: Defining Your Data Lake Remediation Strategy
Chapter 16: Refilling Your Data Lake
Part 5: Making Trips to the Data Lake a Tradition
Chapter 17: Checking Your GPS: The Data Lake Road Map
Chapter 18: Booking Future Trips to the Data Lake
Part 6: The Part of Tens
Chapter 19: Top Ten Reasons to Invest in Building a Data Lake
Chapter 20: Ten Places to Get Help for Your Data Lake
Chapter 21: Ten Differences between a Data Warehouse and a Data Lake
Index
About the Author
Alan Simon is the managing principal of Thinking Helmet, Inc., the author of 32 books on business technology, and a consultant who has worked with enterprise and government organizations. His professional focus is business intelligence, analytics, and data warehousing. He also teaches university courses in his specialty areas.