Data Mesh Design

Original price: $54.95. Current price: $49.95.

Data Mesh Design, by Bruno Freitag

Design and implement a data lakehouse using technology-driven simplifications and generalizations. 

Topics

Introduction
Data warehouse / lake / data lakehouse
Star and snowflake models
Full denormalization
Data load
The enterprise challenge
Agile data-informed decision making


Chapter 1: Enterprise Data Lakehouse Overview
Modular data marts
Data mesh
Demand management


Chapter 2: Design Principles
Preamble
FAIR
Modular
System agnostic
Finest grain
Generalized and explicit
Conformed natural business keys


Chapter 3: Denormalization and Cartesian Products
Denormalization
Attribute categories
Cartesian products
Erroneous data
JSON data and data models


Chapter 4: Design Blueprint Overview
Conventions
Overview


Chapter 5: Facts
Introduction to facts
Granularity and derived values
Business key proxy
Multi-layer facts
Partner_reference in facts
Facts affiliations


Chapter 6: Dimensions, Synonyms, and Hierarchies
Overview
_Main
_dim
_affiliation
_ladder
Additional information
Consolidation
Hard consolidation
Soft consolidation
Disambiguation and qualifying


Chapter 7: ETL Simplifications
Data virtualization versus data lakehouse
ETL principles
Ingestion
Transform
Load
Audit fields
Basic load
Historicization and versioning
Loading atomic lists
Sample code
Simple mini-mart
Type 2, historicization
Transforming multi-layer facts
Atomic lists
Extending mini-marts


Chapter 8: Using the Mini-Marts
Self-service
Dashboarding
Common dimensions and synonyms
Using hierarchies
Affiliation and proximity
Multi-layer facts
Snapshot tables
Machine Learning and Data Science
Overview
Features
Data quality and cleansing
Integration of additional data
Transition aspects
Usage Simplification
Generic simplifications
Standard KPIs
Project-specific views


Chapter 9: Data Catalog and Lineage
Data catalog
Common dimensions
Lineage


Chapter 10: Transition Challenges
Data as a Product
Knowledge and technology
Data engineering / delivery teams
Ownership
Things that should have been done better

The approach you will learn lets you consolidate even incoherent data from multiple source systems across complex enterprise environments. The precise business question does not need to be known in advance and can even change over time. The approach lends itself well to federated, cooperating data mesh nodes. The individual components, called mini-marts, are like the “data part” of a data quantum and are interoperable. We describe data model blueprints to generalize dimensions with synonyms and facts at different granularities. The book includes code examples using complex hierarchies as they exist in heterogeneous real-world go-to-market organizations.


About Bruno

Bruno holds degrees in engineering and computer science from the Bern University of Applied Sciences (BFH) and Polytechnic University of New York. He has had a long and diverse career in the information technology industry, mostly in the chemical and agriculture sectors as well as in banking and insurance, always in heterogeneous, complex environments. In the last few years, Bruno has focused on data and process integration in its many shapes and forms: from classical process design, to canonical messages for API and message-based integrations, to integrating data at large scale for data science and machine learning. Most recently, Bruno was the “spiritus rector” behind a large-scale data lakehouse, integrating and simplifying data from more than a hundred source systems into a coherent Data Mesh. This experience led to the writing of this book.
