Machine Learning and Data Science, 2nd Edition

Original price: $49.95. Current price: $44.95.

Machine Learning and Data Science, 2nd Edition: An Introduction to Statistical Learning Methods with R, by Daniel D. Gutierrez

Build real-world machine learning solutions from scratch using R—no advanced math or prior coding experience required.

Topics

Chapter 1: Data Science Overview

Types of Machine Learning

Use Case Examples of Data Science

Porto Seguro’s Safe Driver Prediction

Netflix

Algorithmic Trading Challenge

Heritage Health Prize

Marketing

Sales

Supply Chain

Risk Management

Customer Support

Human Resources

Google Flu Trends

Process of Data Science

Mathematics Behind Machine Learning

Becoming a Data Scientist

R Project for Statistical Computing

RStudio

Using R Packages

Data Sets

Summary


Chapter 2: Coding for Data Science Using R

Creating Variables of Atomic Classes with the Assignment Operator

Creating Vector Objects by Default

Creating Integer Sequences

Using the c() Combine Values Function

Using the vector() Constructor Function

Coercion – Implicit Transformation of Class

Casting – Explicit Transformation of Class

Using Matrices

Using Lists

Constructing Lists and Sublists

Using Factor Variables

Using Data Frames

Using Name Attributes

Using Multidimensional Arrays

Missing Values

Subsetting a Vector

Subsetting a Matrix

Subsetting and Slicing a List

Using the subset() Function

Common Operations on a Data Frame

Removing NA Values

Examining Inf and NaN Values

The Empty String

Vectorized Operations

IF Control Structure

FOR Control Structure

WHILE Control Structure

REPEAT Control Structure

User Defined Functions

Loop Functions

SWITCH Control Structure

Date and Time Handling

Random Sampling

Summary


Chapter 3: Data Access

Managing Your Working Directory

Types of Data Files

Sources of Data

Base R Data Sets

Downloading Data Sets From the Web

Reading CSV Files

Reading Excel Files

Using Connection Objects

Reading JSON Files

SQL Databases

R SQL

SQL Equivalents in R

API Data Access

Writing Data

Summary


Chapter 4: Data Transformation

Feature Engineering

Data Pipeline

Revising Variable Names

Creating New Variables

Discretizing Numeric Values

Date Handling

Creating Binary Categorical Variables

Merging Data Sets

Ordering Data Sets

Reshaping Data Sets

Data Manipulation Using Dplyr

Handling Missing Data

Feature Scaling

Summary


Chapter 5: Exploratory Data Analysis

Probability Distributions

Performing Counts

Contingency Tables

Chi-squared Statistical Test

Summary Statistics

Statistical Functions

Variance

Covariance and Correlation

Calculating a Cumulative Sum

Detecting Outliers

Summary


Chapter 6: Data Visualization

Histograms

Boxplots

Barplots

Density Plots

Scatterplots

QQ-Plots

Big Data Techniques

Line Graphs

Missing Value Plots

Expository Plots

Introduction to ggplot2

Summary


Chapter 7: Regression

Simple Linear Regression

Multiple Linear Regression

Polynomial Regression

Summary


Chapter 8: Classification

A Simple Example

Logistic Regression

Classification Trees

Naïve Bayes

K-Nearest Neighbors

Support Vector Machines

Neural Networks

Ensembles

Random Forests

Gradient Boosting Machines

XGBoost

Summary


Chapter 9: Evaluating Model Performance

Overfitting

Bias and Variance

Confounders

Data Leakage

Measuring Regression Performance

Measuring Classification Performance

ROC Curves

Cross Validation

Other Machine Learning Diagnostics

Summary


Chapter 10: Unsupervised Learning

Clustering

Simulating Clusters

Hierarchical Clustering

K-Means Clustering

Extensive K-Means Example

Principal Component Analysis

Summary

This second edition of Machine Learning and Data Science offers an accessible, hands-on introduction to the core principles of machine learning, statistical modeling, and practical data science—without overwhelming readers with complex formulas or technical jargon. Perfect for beginners, analysts, and business professionals transitioning into data science, this book provides a complete project-based roadmap from data wrangling to model deployment using the powerful R programming language. Whether you’re analyzing marketing trends, predicting customer behavior, or detecting fraud, this book equips you with the foundation needed to solve real problems using machine learning.

Author and data scientist Daniel D. Gutierrez draws on his experience teaching at UCLA and years of industry practice to guide you through essential topics, including regression, classification, clustering, feature engineering, and model evaluation. You’ll explore supervised and unsupervised learning techniques, apply visualization strategies, and build intuitive workflows that mirror the data science process used by professionals across finance, healthcare, marketing, and more. Unlike overly theoretical texts, this guide emphasizes application—what to do, why to do it, and how to do it in R.

Inside, you’ll find step-by-step tutorials, use case examples from Kaggle competitions, and easy-to-follow code snippets that let you apply machine learning concepts immediately. Learn how to access and clean real-world data sets, implement algorithms like decision trees, random forests, logistic regression, and k-means clustering, and avoid common pitfalls such as data leakage and overfitting. Move from exploratory data analysis to powerful predictive modeling.

Whether you’re a student, aspiring data scientist, or working analyst seeking to expand your skills, this is your essential, beginner-friendly guide to statistical learning and machine learning with R.

About Daniel

Daniel D. Gutierrez is an independent data science consultant and an AI industry analyst and influencer. He holds a BS degree in Mathematics and Computer Science from UCLA. His background in “data science” began long before the term came into vogue. His main channel is the Radical Data Science blog, where he keeps his finger on the pulse of this fast-paced industry. Daniel teaches the “Introduction to Data Science” class for UCLA Extension, where he trains the next generation of data scientists. He has written four “data” books, including the recent 2nd edition of his popular title “Machine Learning and Data Science: An Introduction to Statistical Learning Methods with R.”

Faculty may request complimentary digital desk copies