Building the Unstructured Data Warehouse

The Unified Star Schema: An Agile and Resilient Approach to Data Warehouse and Analytics Design

Master the most agile and resilient design for building analytics applications: the Unified Star Schema (USS) approach. The USS has many benefits over traditional dimensional modeling. Witness the power of the USS as a single star schema that serves as a foundation for all present and future business requirements of your organization.

Data warehouse legend Bill Inmon and business intelligence innovator, Francesco Puppini, explain step-by-step why the Unified Star Schema is the recommended approach for business intelligence designs today, and show through many examples how to build and use this new solution.

This book contains two parts. Part I, Architecture, explains the benefits of data marts and data warehouses, covering how organizations progressed to their current state of analytics, and to the challenges that result from current business intelligence architectures. Chapter 1 covers the drivers behind and the characteristics of the data warehouse and data mart. Chapter 2 introduces dimensional modeling concepts, including fact tables, dimensions, star joins, and snowflakes. Chapter 3 recalls the evolution of the data mart. Chapter 4 explains Extract, Transform, and Load (ETL), and the value ETL brings to reporting. Chapter 5 explores the Integrated Data Mart Approach, and Chapter 6 explains how to monitor this environment. Chapter 7 describes the different types of metadata within the data warehouse environment. Chapter 8 progresses through the evolution to our current modern data warehouse environment.
Part II, the Unified Star Schema, covers the Unified Star Schema (USS) approach and how it solves the challenges introduced in Part I. There are eight chapters within Part II:

  • Chapter 9, Introduction to the Unified Star Schema: Learn about its architecture and use cases, as well as how the USS approach differs from the traditional approach.
  • Chapter 10, Loss of Data: Learn about the loss of data and the USS Bridge. Understand that the USS approach does not create any join, and for this reason, it has no loss of data.
  • Chapter 11, The Fan Trap: Get introduced to the Oriented Data Model convention, and learn the dangers of a fan trap through an example. Differentiate join and association, and realize that an “in-memory association” is the preferred solution to the fan trap.
  • Chapter 12, The Chasm Trap: Become familiar with the Cartesian product, and then follow along with an example based on LinkedIn, which illustrates that a chasm trap produces unwanted duplicates. See that the USS Bridge is based on a union, which does not create any duplicates.
  • Chapter 13, Multi-Fact Queries: Distinguish between multiple facts “with direct connection” versus multiple facts “with no direct connection”. Explore how BI tools are capable of building aggregated virtual rows.
  • Chapter 14, Loops: Learn more about loops and five traditional techniques to solve them. Follow along with an implementation, which will illustrate the solution based on the USS approach.
  • Chapter 15, Non-Conformed Granularities: Learn about non-conformed granularities, and learn that the Unified Star Schema introduces a solution called “re-normalization”.
  • Chapter 16, Northwind Case Study. Witness how easy it is to detect the pitfalls of Northwind using the ODM convention. Follow along with an implementation of the USS approach on the Northwind database with various BI tools.

Request review copy (Instructors only)


Part I: Architecture
Chapter 1: Data Marts and the Dimensional Model

Chapter 2: Dimensional Modeling Concepts

Chapter 3: Data Mart Evolution

Chapter 4: Transformations

Chapter 5: The Integrated Data Mart Approach

Chapter 6: Monitoring the Data Mart Environment
Actively used data vs. inactive data
Removing dormant data
Discarding older versions of the data mart

Chapter 7: Metadata and Documentation in the Data Mart Environment
Simple table and element metadata
Source metadata
Load date metadata
Combined metadata
Usage metadata

Chapter 8: The Evolution toward Integrated Data Marts
Data out of harmony
Bad corporate decisions
Enter the data warehouse: The single version of the truth

Part II: The Unified Star Schema
Chapter 9: Introduction to the Unified Star Schema
The architecture
The USS approach
Hunter and prey
The central table
The dangers of denormalization

Chapter 10: Loss of Data
Example based on sales and products
Postponing the join
The heart of the USS: The Bridge
The USS naming convention
How the USS solves the loss of data
Implementation with Tibco Spotfire

Chapter 11: The Fan Trap
The Oriented Data Model convention
Definition of the fan trap
Example based on sales and shipments
Visualizing the one-to-many relationship
How the USS solves the fan trap
Implementation with Microsoft Power BI
Is your BI tool capable of association?
Splitting the measures
Moving all the measures to the Bridge
The JSON fan trap

Chapter 12: The Chasm Trap
The Cartesian product
Definition of the chasm trap
Example based on LinkedIn
The method for the chasm trap row count
The chasm trap with measures
How the USS solves the chasm trap
Implementation with Tableau
Chasm Trap with multiple tables
The JSON chasm trap

Chapter 13: Multi-Fact Queries
Multi-fact queries with direct connection (one-to-many)
Multi-fact queries with no direct connection (many-to-many)
Example based on sales and purchases
The union
The spring effect of aggregation in BI tools
How the USS solves multi-fact queries
Implementation with Tibco Spotfire

Chapter 14: Loops
Example based on CRM
Solving loops using traditional techniques
Solving loops with the union
How the USS solves the loops
Implementation with SAP Business Objects

Chapter 15: Non-Conformed Granularities
Example based on sales and targets
Understanding the challenges
The Re-normalization
How the USS solves non-conformed granularities
Implementation with QlikView
Working with aggregates and details

Chapter 16: Northwind Case Study
Oriented Data Model for Northwind
Detecting the challenges
Understanding the effect of the traps
The safe zone
From ad-hoc to self-service BI
Example of a challenging business requirement
How the USS implements the Northwind database
Implementations with various BI tools

About Bill and Francesco

Bill Inmon, the “father of data warehouse”, has written 60 books published in nine languages. Bill’s latest adventure is the building of technology known as textual disambiguation – technology that reads raw text in a narrative format and allows the text to be placed in a conventional database so that it can be analyzed by standard analytical technology, thereby creating unique business value for Big Data/unstructured data.

Francesco Puppini is an Italian freelance consultant in business intelligence and data warehousing, and he is the inventor of the Unified Star Schema. He has worked on over 30 different projects across ten European countries, for clients from several industries. He is currently working as a business intelligence specialist and innovator, after 20 years spent on Business Objects, Qlik, Tableau, SQL, Teradata, and data modeling.

Business Intelligence & Data Warehouse Books