The Data Path Less Traveled: Step up Creativity using Heuristics in Data Science, Artificial Intelligence, and Beyond, by Dr. Zacharias Voulgaris
Become proficient in using heuristics within the data science pipeline to produce higher quality results in less time.
1.1 Problem-solving
1.2 Creativity in problem-solving
1.3 AI and creativity
1.4 Down-to-earth creativity
1.5 Summary
2.1 Heuristics overview
2.2 Heuristics as metrics
2.3 Heuristics as algorithms
2.4 Important considerations
2.5 Summary
3.1 Metaheuristics overview
3.2 When to use metaheuristics
3.3 Problems lending themselves to metaheuristics
3.4 Important considerations
3.5 Summary
4.1 Why heuristics are essential
4.2 How heuristics manifest in practice
4.3 When to use a specialized metric
4.4 When to use a specialized method
4.5 Summary
5.1 EDA heuristics overview
5.2 Basic heuristics in EDA
5.2.1 The range based correlation heuristic
5.2.2 Binary correlation heuristics
5.2.3 Your own heuristics
5.3 How you can leverage these heuristics in EDA effectively
5.4 Important considerations
5.5 Summary
6.1 The whys of advanced heuristics in EDA
6.2 Specific advanced heuristics in EDA
6.2.1 Index of discernibility
6.2.2 Density analysis
6.2.3 Other advanced heuristics
6.3 How to leverage these heuristics in EDA effectively
6.4 Important considerations
6.5 Summary
7.1 Overview of model-related heuristics
7.2 Specific model-related heuristics
7.2.1 F-scores heuristic
7.2.2 Area Under Curve heuristic
7.2.3 Range based correlation heuristic
7.2.4 Confidence index heuristic
7.2.5 Other model heuristics
7.3 How to leverage these heuristics effectively
7.4 Important considerations
7.5 Summary
8.1 Overview of additional heuristics
8.2 The Entropy and Ectropy heuristics
8.2.1 Entropy
8.2.2 Ectropy
8.2.3 Whether to use entropy or ectropy in a data-related problem
8.3 Distance-related heuristics
8.3.1 Distance heuristics
8.3.2 Similarity heuristics
8.3.3 Relationship to the confidence index
8.4 Important considerations
8.5 Summary
9.1 Optimization overview
9.2 Optimization use cases
9.3 Key components of an optimization algorithm
9.4 Optimization’s role in AI and ML
9.5 Important considerations
9.6 Summary
10.1 Heuristics in optimization in general
10.2 Specific optimization algorithms using heuristics
10.2.1 Swarm-based algorithms
10.2.2 Genetic algorithms
10.2.3 Simulated annealing and variants
10.2.4 Other
10.3 Particle swarm optimization and heuristics
10.3.1 Overview
10.3.2 Pseudocode of PSO algorithm
10.3.3 Heuristics used
10.4 Important considerations
10.5 Summary
11.1 Complex optimizers overview
11.2 The genetic algorithms family of optimizers
11.2.1 Key concepts of GAs
11.2.2 The vanilla flavor GA and its limitations
11.2.3 Elitism variant
11.2.4 Scaling hack
11.2.5 Constraints tweak
11.2.6 Other variants
11.3 Heuristics involved in genetic algorithms
11.4 Important considerations
11.5 Summary
12.1 Optimization ensembles overview
12.2 Structure of an optimization ensemble
12.3 Role of heuristics in optimization ensembles
12.4 Important considerations
12.5 Summary
13.1 Overview of heuristic objectives and functionality
13.2 Defining the objective(s) of a heuristic
13.3 Working out the functionality of a heuristic
13.4 Optimizing the heuristic’s objectives and functionality
13.5 Important considerations
13.6 Summary
14.1 Overview of parameters, outputs, and usability of metric heuristics
14.2 Defining a metric heuristic’s parameters and outputs
14.3 Figuring out a metric heuristic’s usability and scope
14.4 Optimizing a metric heuristic’s usability
14.5 Important considerations
14.6 Summary
15.1 Overview of parameters, outputs, and usability of method heuristics
15.2 Defining a method heuristic’s parameters and outputs
15.3 Figuring out a method heuristic’s usability and scope
15.4 Optimizing a method heuristic’s usability
15.5 Important considerations
15.6 Summary
16.1 Process overview for developing a new heuristic
16.2 Defining the objectives and functionality of the new heuristic
16.2.1 Overview
16.2.2 A heuristic to measure diversity in a variable
16.2.3 A heuristic to measure the peculiarity of dataset points
16.2.4 The value question
16.2.5 Your part
16.3 Defining the parameters, outputs, and usability of the new heuristics
16.3.1 Parameters, outputs, and usability of the diversity heuristic
16.3.2 Parameters, outputs, and usability of the index of peculiarity heuristic
16.3.3 Scope matters for the two heuristics
16.4 Important considerations
16.5 Summary
17.1 Overview of heuristic limitations in general
17.2 Limitations in generalization capability
17.3 Limitations in accuracy
17.4 Why these limitations exist and trade-offs
17.5 Important considerations
17.6 Summary
18.1 Overview of heuristics’ potential in general
18.2 Heuristics’ potential for EDA
18.3 Heuristics’ potential for optimization
18.4 Heuristics’ potential for auxiliary processes
18.5 Heuristics’ potential for model-building
18.6 Summary
19.1 Value of transparency in data science and AI
19.2 How heuristics can help with transparency
19.3 Building a more transparent framework for data science
19.4 Important considerations
19.5 Summary
20.1 Heuristics and their value
20.2 Is there an end to creativity when it comes to heuristics?
20.3 Heuristics as a way to develop your own creativity
20.4 Important considerations
20.5 Where do we go from here in our heuristics journey?
Data professionals have used heuristics for many years in optimization-related applications, and heuristics have also been a vibrant area of research in other data-related fields, from machine learning to image processing. Heuristics also play a role in niche applications such as cybersecurity. In addition, the advent of AI and other data-driven methodologies has brought heuristics to the forefront of data-related work.
In this book, we explore heuristics from a practical perspective. Through simple examples and real-life situations, we illustrate how heuristics can help you solve challenging problems. You will apply Jaccard Similarity and a variant, F1 score, Entropy, Ectropy, Area Under Curve, Particle Swarm Optimization, and Genetic Algorithms (along with GA variants). Beyond presenting the known and lesser-known heuristics available today, we also examine how you can create your own through a simple and functional framework. Code notebooks enable you to practice all of the techniques and explore a few of your own.
There is no doubt that the data-driven paradigm is here to stay. There are many ways to stand out in it as a data professional, with AI-related know-how being at the top of the list. However, the creative tools (heuristics) that make such technologies feasible and scalable can be equally impactful. Few people take this route, though, because it’s off the beaten path. Are you up for the challenge?
Dr. Zacharias Voulgaris was born in Athens, Greece. He studied Production Engineering and Management at the Technical University of Crete, shifted to Computer Science through a Master's in Information Systems & Technology, and then to Data Science through a PhD in Machine Learning. He has worked at Georgia Tech as a Research Fellow, at an e-marketing startup in Cyprus as an SEO manager, and as a Data Scientist at both Elavon (GA) and G2 Web Services (WA). He was also a Program Manager at Microsoft on a data analytics pipeline for Bing. Zacharias has authored several books on Data Science, mentors aspiring data scientists, and maintains a Data Science and AI blog. Currently, he works as a consultant at GLG.