Technics Publications

The Data Catalog

The Data Catalog: Sherlock Holmes Data Sleuthing for Analytics, by Bonnie O’Neil and Lowell Fryman

Apply this definitive guide to data catalogs and select the feature set needed to empower your data citizens in their quest for faster time to insight.


Chapter 1: Introducing Data Catalogs

About data catalogs
Data catalog features overview
Data catalog benefits
Key points


Chapter 2: A Data Worker’s Dream

Data scientist
Data administrator/curator
Key points


Chapter 3: The “Back Story”

About metadata
Example: Alation
Business glossary
Data lineage
Permissions and roles
Key points


Chapter 4: “Data Prep”

Data prep tools
Example: Boomi Unifi
Data analyst dashboard
Creating a prep job
Working with jobs
Comparing data sets
Ingesting and crawling
Certifying data sources
Key points


Chapter 5: Data Catalog as a Data Governance Platform

Data self-service
Data curation
Data governance
Data catalog role in data governance
Example: Collibra
Data quality
Data certification
Data governance access
Privacy and risk
Key points


Chapter 6: Fishing in the Data Lake

About the data lake
Data lake data catalog
Example: Waterline Data from Hitachi Vantara
Fishing features
Avoid siloed data catalogs
Key points


Chapter 7: One-Stop Shopping

Enterprise data catalogs
Example: IBM
Example: Informatica
Reference data support
Business glossary
Term relationships
Data quality support
Policies and rules
Key points


Chapter 8: Data Catalog “Add-ons”

Portal products: CKAN
Cloud providers: Microsoft (Azure)
Data virtualization and integration tools: Denodo
Business intelligence & data visualization tools: Tableau
Data and process modeling: erwin
Self-service data prep: Paxata
API service catalog: Ignite Platform
MDM: Reltio
Key points


Chapter 9: Data Lineage

Lineage benefits
Lineage capture
Lineage challenge
Lineage categories
Drill down
Inferred lineage
Key points


Chapter 10: Machine Learning in the Data Catalog

ML to the rescue
ML and the data catalog
Knowledge graph
ML and the human
Intelligent semantic search
Inferred joins
Inferred lineage
Key points


Chapter 11: Data Catalog Features

Feature categories
Data catalog features
Key points


Chapter 12: Conclusion

The power of deduction
Trends and innovations
Where would we like them to go?
Key points

The data catalog may be the most important breakthrough in data management in the last decade, ranking alongside the advent of the data warehouse. The latter enabled business consumers to conduct their own analyses to obtain insights themselves. The data catalog is the next wave of this, empowering business users even further to drastically reduce time to insight, despite the rising tide of data flooding the enterprise.

Use this book for a broad overview of the most popular machine learning (ML) data catalog products, and perform due diligence using the extensive features list. Consider graphical user interface (GUI) design issues such as layout and navigation, as well as scalability in terms of how the catalog will handle your current and anticipated data and metadata needs.

O’Neil & Fryman…present a typology which ranges from products that focus on data lineage, curation and search, data governance, data preparation, and of course, the core capability of finding and understanding the data. The authors emphasize that machine learning is being adopted in many of these products, enabling a more elegant data democratization solution in the face of the burgeoning mountain of data that is engulfing organizations.
Derek Strauss, Chairman/CEO, Gavroshe, and Former CDO, TD Ameritrade

This book is organized into three sections:

  • Chapters 1 and 2 reveal the rationale for a data catalog and share how data scientists, data administrators, and curators fare with and without a data catalog.
  • Chapters 3-10 present the many different types of data catalogs.
  • Chapters 11 and 12 provide an extensive features list, current trends, and visions for the future.

About Bonnie and Lowell

Bonnie O’Neil is a Principal Computer Scientist at The MITRE Corporation and is a well-known expert on all phases of data architecture including data catalogs, data quality, business metadata, and governance. She has assisted both Fortune 500 companies and government agencies in data management projects for over 30 years. She is a regular speaker and workshop/tutorial leader at many conferences, and the author of four books.

Lowell Fryman is an independent consultant specializing in implementing data governance programs and data catalogs. He has been a speaker, practitioner, and industry leader in data governance, analytics, and data quality, with hands-on experience in implementations across most industries.


Faculty may request complimentary digital desk copies
