Technics Publications

The Data Catalog

$19.95
$49.95

The Data Catalog: Sherlock Holmes Data Sleuthing for Analytics, by Bonnie O’Neil and Lowell Fryman

Apply this definitive guide to data catalogs and select the feature set needed to empower your data citizens in their quest for faster time to insight.

Topics

Chapter 1: Introducing Data Catalogs


About data catalogs
Data catalog features overview
Data catalog benefits
Key points


 

Chapter 2: A Data Worker’s Dream


Data scientist
Data administrator/curator
Key points


 

Chapter 3: The “Back Story”


About metadata
Example: Alation
Business glossary
Data lineage
Permissions and roles
APIs
Key points


 

Chapter 4: “Data Prep”


Data prep tools
Example: Boomi Unifi
Data analyst dashboard
Creating a prep job
Working with jobs
Comparing data sets
Ingesting and crawling
Certifying data sources
Key points


 

Chapter 5: Data Catalog as a Data Governance Platform


Data self-service
Data curation
Data governance
Data catalog role in data governance
Example: Collibra
Data quality
Data certification
Data governance access
Policies
Privacy and risk
Key points


 

Chapter 6: Fishing in the Data Lake


About the data lake
Data lake data catalog
Example: Waterline Data from Hitachi Vantara
Fishing features
Avoid siloed data catalogs
Key points


 

Chapter 7: One-Stop Shopping


Enterprise data catalogs
Example: IBM
Example: Informatica
Reference data support
Business glossary
Term relationships
Data quality support
Policies and rules
Workflow
Key points


 

Chapter 8: Data Catalog “Add-ons”


Portal products: CKAN
Cloud providers: Microsoft (Azure)
Data virtualization and integration tools: Denodo
Business intelligence & data visualization tools: Tableau
Data and process modeling: erwin
Self-service data prep: Paxata
API service catalog: Ignite Platform
MDM: Reltio
Key points


 

Chapter 9: Data Lineage


Lineage benefits
Lineage capture
Lineage challenge
Lineage categories
Drill down
Inferred lineage
Key points


 

Chapter 10: Machine Learning in the Data Catalog


ML to the rescue
ML and the data catalog
Knowledge graph
Similarity
ML and the human
Intelligent semantic search
Domains
Classifications
Inferred joins
Inferred lineage
Key points


 

Chapter 11: Data Catalog Features


Feature categories
Data catalog features
Scoring
Key points


 

Chapter 12: Conclusion


The power of deduction
Trends and innovations
Where would we like them to go?
Key points

The data catalog may be the most important breakthrough in data management in the last decade, ranking alongside the advent of the data warehouse. The data warehouse enabled business consumers to conduct their own analyses and obtain insights themselves. The data catalog is the next wave, empowering business users even further to drastically reduce time to insight, despite the rising tide of data flooding the enterprise.

Use this book for a broad overview of the most popular machine learning (ML) data catalog products, and perform due diligence using its extensive features list. Consider graphical user interface (GUI) design issues such as layout and navigation, as well as scalability: how well the catalog will handle your current and anticipated data and metadata needs.

O’Neil & Fryman…present a typology which ranges from products that focus on data lineage, curation and search, data governance, data preparation, and of course, the core capability of finding and understanding the data. The authors emphasize that machine learning is being adopted in many of these products, enabling a more elegant data democratization solution in the face of the burgeoning mountain of data that is engulfing organizations.
Derek Strauss, Chairman/CEO, Gavroshe, and Former CDO, TD Ameritrade

This book is organized into three sections:

  • Chapters 1 and 2 reveal the rationale for a data catalog and share how data scientists, data administrators, and curators fare with and without a data catalog.
  • Chapters 3-10 present the many different types of data catalogs.
  • Chapters 11 and 12 provide an extensive features list, current trends, and visions for the future.

About Bonnie and Lowell

Bonnie O’Neil is a Principal Computer Scientist at The MITRE Corporation and a well-known expert on all phases of data architecture, including data catalogs, data quality, business metadata, and governance. She has assisted both Fortune 500 companies and government agencies in data management projects for over 30 years. She is a regular speaker and workshop/tutorial leader at many conferences, and the author of four books.

Lowell Fryman is an independent consultant specializing in implementing data governance programs and data catalogs. He has been a speaker, practitioner, and industry leader in data governance, analytics, and data quality, with hands-on experience in implementations across most industries.
