The Data Catalog: Sherlock Holmes Data Sleuthing for Analytics, by Bonnie O’Neil and Lowell Fryman
Apply this definitive guide to data catalogs and select the feature set needed to empower your data citizens in their quest for faster time to insight.
About data catalogs
Data catalog features overview
Data catalog benefits
Permissions and roles
Data prep tools
Example: Boomi Unifi
Data analyst dashboard
Creating a prep job
Working with jobs
Comparing data sets
Ingesting and crawling
Certifying data sources
Data catalog role in data governance
Data governance access
Privacy and risk
About the data lake
Data lake data catalog
Example: Waterline Data from Hitachi Vantara
Avoid siloed data catalogs
Enterprise data catalogs
Reference data support
Data quality support
Policies and rules
Portal products: CKAN
Cloud providers: Microsoft (Azure)
Data virtualization and integration tools: Denodo
Business intelligence & data visualization tools: Tableau
Data and process modeling: erwin
Self-service data prep: Paxata
API service catalog: Ignite Platform
ML to the rescue
ML and the data catalog
ML and the human
Intelligent semantic search
Data catalog features
The power of deduction
Trends and innovations
Where would we like them to go?
The data catalog may be the most important breakthrough in data management in the last decade, ranking alongside the advent of the data warehouse. The latter enabled business consumers to conduct their own analyses to obtain insights themselves. The data catalog is the next wave of this, empowering business users even further to drastically reduce time to insight, despite the rising tide of data flooding the enterprise.
Use this book as a broad overview of the most popular machine learning (ML) data catalog products, and perform due diligence using its extensive features list. Consider graphical user interface (GUI) design issues such as layout and navigation, as well as scalability: how well the catalog will handle your current and anticipated data and metadata needs.
O’Neil & Fryman…present a typology which ranges from products that focus on data lineage, curation and search, data governance, data preparation, and of course, the core capability of finding and understanding the data. The authors emphasize that machine learning is being adopted in many of these products, enabling a more elegant data democratization solution in the face of the burgeoning mountain of data that is engulfing organizations.
Derek Strauss, Chairman/CEO, Gavroshe, and Former CDO, TD Ameritrade
This book is organized into three sections.
Bonnie O’Neil is a Principal Computer Scientist at The MITRE Corporation and is a well-known expert on all phases of data architecture including data catalogs, data quality, business metadata, and governance. She has assisted both Fortune 500 companies and government agencies in data management projects for over 30 years. She is a regular speaker and workshop/tutorial leader at many conferences, and the author of four books.
Lowell Fryman is an independent consultant specializing in implementing data governance programs and data catalogs. He has been a speaker, practitioner, and industry leader in data governance, analytics, and data quality, with hands-on implementation experience across most industries.