Bill Inmon’s Data Warehouse Collection

The Textual Warehouse PDF Instant Download: $14.95 (original price $24.95)
The Textual Warehouse Print Version: $24.95
Data Lake Architecture PDF Instant Download: $9.95 (original price $24.95)
Data Lake Architecture Print Version: $24.95
Building the Unstructured Data Warehouse PDF Instant Download: $24.95 (original price $44.95)
Building the Unstructured Data Warehouse Print Version (with free PDF Instant Download!): $44.95 (original price $89.90)

Bill Inmon, the “father of the data warehouse,” has written 60 books published in nine languages. Computerworld named Bill one of the ten most influential people in the history of the computer profession.

The Textual Warehouse

Build a textual warehouse that helps your organization understand and analyze documents through text analytics (both sentiment and non-sentiment analysis) and make better business decisions.

Learn the important role of documents and text within your organization, the difference between identifying and qualifying text, and when you need document preprocessing. Appreciate the power of taxonomies and the necessity of textual ETL. Know how the textual warehouse architecture differs from the conventional data warehouse architecture and when to apply contextualization and textual disambiguation.

About Ranjeet

Ranjeet Srivastava is a data management professional and an enterprise architect with more than 20 years in enterprise product research, development, and design of data-intensive mission-critical applications.

Data Lake Architecture: Designing the Data Lake and Avoiding the Garbage Dump

Organizations invest incredible amounts of time and money obtaining and then storing big data in data stores called data lakes. But how many of these organizations can actually get the data back out in a useable form? Very few can turn the data lake into an information gold mine. Most wind up with garbage dumps.

Data Lake Architecture will explain how to build a useful data lake, where data scientists and data analysts can solve business challenges and identify new business opportunities. Learn how to structure data lakes as well as analog, application, and text-based data ponds to provide maximum business value. Understand the role of the raw data pond and when to use an archival data pond. Leverage the four key ingredients for data lake success: metadata, integration mapping, context, and metaprocess.

Bill Inmon opened our eyes to the architecture and benefits of a data warehouse, and now he takes us to the next level of data lake architecture.

Building the Unstructured Data Warehouse

Learn essential techniques from data warehouse legend Bill Inmon on how to build the reporting environment your business needs now!

Answers for many valuable business questions hide in text. How well can your existing reporting environment extract the necessary text from email, spreadsheets, and documents, and put it in a useful format for analytics and reporting? Transforming the traditional data warehouse into an efficient unstructured data warehouse requires additional skills from the analyst, architect, designer, and developer. This book will prepare you to successfully implement an unstructured data warehouse and, through clear explanations, examples, and case studies, you will learn new techniques and tips to successfully obtain and analyze text.

Master these ten objectives:

Build an unstructured data warehouse using the 11-step approach
Integrate text and describe it in terms of homogeneity, relevance, medium, volume, and structure
Overcome challenges including blather, the Tower of Babel, and lack of natural relationships
Avoid the Data Junkyard and combat the Spider’s Web
Reuse techniques perfected in the traditional data warehouse and Data Warehouse 2.0, including iterative development
Apply essential techniques for textual Extract, Transform, and Load (ETL) such as phrase recognition, stop word filtering, and synonym replacement
Design the Document Inventory system and link unstructured text to structured data
Leverage indexes for efficient text analysis and taxonomies for useful external categorization
Manage large volumes of data using advanced techniques such as backward pointers
Evaluate technology choices suitable for unstructured data processing, such as data warehouse appliances
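
As a rough illustration of two of the textual ETL techniques named in the list above, the sketch below applies stop word filtering and synonym replacement to a single sentence. The stop word set, the synonym map, and the function name `textual_etl` are hypothetical and chosen only for illustration; they are not taken from the book, and phrase recognition is not covered here.

```python
import re

# Hypothetical stop words and synonym map, for illustration only.
STOP_WORDS = {"the", "a", "an", "of", "to", "and", "is", "was", "in"}
SYNONYMS = {"auto": "car", "automobile": "car", "vehicle": "car"}


def textual_etl(text):
    """Lowercase and tokenize text, drop stop words, and map synonyms to one term."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    kept = [t for t in tokens if t not in STOP_WORDS]
    return [SYNONYMS.get(t, t) for t in kept]


result = textual_etl("The automobile was parked in the garage.")
print(result)  # ['car', 'parked', 'garage']
```

Mapping every synonym to one preferred term means a later query for "car" also finds text that originally said "automobile", which is one plausible reason to apply the replacement before indexing.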

The following outline briefly describes each chapter’s content:

Chapter 1 defines unstructured data and explains why text is the main focus of this book.
Chapter 2 addresses the challenges one faces when managing unstructured data.
Chapter 3 discusses the DW 2.0 architecture, which leads into the role of the unstructured data warehouse. The unstructured data warehouse is defined and benefits are given. There are several features of the conventional data warehouse that can be leveraged for the unstructured data warehouse, including ETL processing, textual integration, and iterative development.
Chapter 4 focuses on the heart of the unstructured data warehouse: Textual Extract, Transform, and Load (ETL).
Chapter 5 describes the 11 steps required to develop the unstructured data warehouse.
Chapter 6 describes how to inventory documents for maximum analysis value, as well as link the unstructured text to structured data for even greater value.
Chapter 7 goes through each of the different types of indexes necessary to make text analysis efficient. Indexes range from simple indexes, which are fast to create and are good if the analyst really knows what needs to be analyzed before the indexing process begins, to complex combined indexes, which can be made up of any and all of the other kinds of indexes.
Chapter 8 explains taxonomies and how they can be used within the unstructured data warehouse.
Chapter 9 explains ways of coping with large amounts of unstructured data. Techniques such as keeping the unstructured data at its source and using backward pointers are discussed. The chapter explains why iterative development is so important.
Chapter 10 focuses on challenges and some technology choices that are suitable for unstructured data processing. In addition, the data warehouse appliance is discussed.
Chapters 11, 12, and 13 put all of the previously discussed techniques and approaches in context through three case studies.
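
The simple index mentioned for Chapter 7 and the backward pointers mentioned for Chapter 9 can be pictured with a small sketch. It assumes that a backward pointer is a reference (document identifier plus word position) back to text that stays at its source; that reading, the sample documents, and the function name `build_simple_index` are assumptions made for illustration, not the book's definitions.

```python
from collections import defaultdict

# Hypothetical documents that stay at their source; only pointers are indexed.
DOCUMENTS = {
    "email-001": "contract renewal due in march",
    "memo-007": "renewal terms for the supplier contract",
}


def build_simple_index(documents):
    """Map each word to a list of (document id, word position) pointers."""
    index = defaultdict(list)
    for doc_id, text in documents.items():
        for position, word in enumerate(text.split()):
            index[word].append((doc_id, position))
    return dict(index)


index = build_simple_index(DOCUMENTS)
print(index["contract"])  # [('email-001', 0), ('memo-007', 5)]
```

Because the index holds only pointers, the warehouse side stays small while the full text can still be fetched from the source when an analyst needs it.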

About Krish

Krish Krishnan is a recognized thought leader in Data Warehouse Performance and Architecture. Krish writes about and teaches Social Intelligence across the world and is a frequent speaker at industry conferences. He provides consulting advice to CxOs on DW Strategy and is an Independent Analyst covering the Data Warehouse and Business Intelligence Industry.
