SQL for Data Scientists
-15%
portes grátis
SQL for Data Scientists
A Beginner's Guide for Building Datasets for Analysis
Teate, Renee M. P.
John Wiley & Sons Inc
11/2021
288
Mole
Inglês
9781119669364
15 a 20 dias
482
Descrição não disponível.
Introduction xix
Chapter 1 Data Sources 1
Data Sources 1
Tools for Connecting to Data Sources and Editing SQL 2
Relational Databases 3
Dimensional Data Warehouses 7
Asking Questions About the Data Source 9
Introduction to the Farmer's Market Database 11
A Note on Machine Learning Dataset Terminology 12
Exercises 13
Chapter 2 The SELECT Statement 15
The SELECT Statement 15
The Fundamental Syntax Structure of a SELECT Query 16
Selecting Columns and Limiting the Number of Rows Returned 16
The ORDER BY Clause: Sorting Results 18
Introduction to Simple Inline Calculations 20
More Inline Calculation Examples: Rounding 22
More Inline Calculation Examples: Concatenating Strings 24
Evaluating Query Output 26
SELECT Statement Summary 29
Exercises Using the Included Database 30
Chapter 3 The WHERE Clause 31
The WHERE Clause 31
Filtering SELECT Statement Results 32
Filtering on Multiple Conditions 34
Multi-Column Conditional Filtering 40
More Ways to Filter 41
BETWEEN 41
IN 42
LIKE 43
IS NULL 44
A Warning About Null Comparisons 44
Filtering Using Subqueries 46
Exercises Using the Included Database 47
Chapter 4 CASE Statements 49
CASE Statement Syntax 50
Creating Binary Flags Using CASE 52
Grouping or Binning Continuous Values Using CASE 53
Categorical Encoding Using CASE 56
CASE Statement Summary 59
Exercises Using the Included Database 60
Chapter 5 SQL JOINs 61
Database Relationships and SQL JOINs 61
A Common Pitfall when Filtering Joined Data 71
JOINs with More than Two Tables 74
Exercises Using the Included Database 76
Chapter 6 Aggregating Results for Analysis 79
GROUP BY Syntax 79
Displaying Group Summaries 80
Performing Calculations Inside Aggregate Functions 84
MIN and MAX 88
COUNT and COUNT DISTINCT 90
Average 91
Filtering with HAVING 93
CASE Statements Inside Aggregate Functions 94
Exercises Using the Included Database 96
Chapter 7 Window Functions and Subqueries 97
ROW NUMBER 98
RANK and DENSE RANK 101
NTILE 102
Aggregate Window Functions 103
LAG and LEAD 108
Exercises Using the Included Database 111
Chapter 8 Date and Time Functions 113
Setting datetime Field Values 114
EXTRACT and DATE_PART 115
DATE_ADD and DATE_SUB 116
DATEDIFF 118
TIMESTAMPDIFF 119
Date Functions in Aggregate Summaries and Window Functions 119
Exercises 126
Chapter 9 Exploratory Data Analysis with SQL 127
Demonstrating Exploratory Data Analysis with SQL 128
Exploring the Products Table 128
Exploring Possible Column Values 131
Exploring Changes Over Time 134
Exploring Multiple Tables Simultaneously 135
Exploring Inventory vs. Sales 138
Exercises 142
Chapter 10 Building SQL Datasets for Analytical Reporting 143
Thinking Through Analytical Dataset Requirements 144
Using Custom Analytical Datasets in SQL:
CTEs and Views 149
Taking SQL Reporting Further 153
Exercises 157
Chapter 11 More Advanced Query Structures 159
UNIONs 159
Self-Join to Determine To-Date Maximum 163
Counting New vs. Returning Customers by Week 167
Summary 171
Exercises 171
Chapter 12 Creating Machine Learning Datasets Using SQL 173
Datasets for Time Series Models 174
Datasets for Binary Classification 176
Creating the Dataset 178
Expanding the Feature Set 181
Feature Engineering 185
Taking Things to the Next Level 189
Exercises 189
Chapter 13 Analytical Dataset Development Examples 191
What Factors Correlate with Fresh Produce Sales? 191
How Do Sales Vary by Customer Zip Code,
Market Distance, and Demographic Data? 211
How Does Product Price Distribution Affect
Market Sales? 217
Chapter 14 Storing and Modifying Data 229
Storing SQL Datasets as Tables and Views 229
Adding a Timestamp Column 232
Inserting Rows and Updating Values in Database Tables 233
Using SQL Inside Scripts 236
In Closing 237
Exercises 238
Appendix Answers to Exercises 239
Index 255
Chapter 1 Data Sources 1
Data Sources 1
Tools for Connecting to Data Sources and Editing SQL 2
Relational Databases 3
Dimensional Data Warehouses 7
Asking Questions About the Data Source 9
Introduction to the Farmer's Market Database 11
A Note on Machine Learning Dataset Terminology 12
Exercises 13
Chapter 2 The SELECT Statement 15
The SELECT Statement 15
The Fundamental Syntax Structure of a SELECT Query 16
Selecting Columns and Limiting the Number of Rows Returned 16
The ORDER BY Clause: Sorting Results 18
Introduction to Simple Inline Calculations 20
More Inline Calculation Examples: Rounding 22
More Inline Calculation Examples: Concatenating Strings 24
Evaluating Query Output 26
SELECT Statement Summary 29
Exercises Using the Included Database 30
Chapter 3 The WHERE Clause 31
The WHERE Clause 31
Filtering SELECT Statement Results 32
Filtering on Multiple Conditions 34
Multi-Column Conditional Filtering 40
More Ways to Filter 41
BETWEEN 41
IN 42
LIKE 43
IS NULL 44
A Warning About Null Comparisons 44
Filtering Using Subqueries 46
Exercises Using the Included Database 47
Chapter 4 CASE Statements 49
CASE Statement Syntax 50
Creating Binary Flags Using CASE 52
Grouping or Binning Continuous Values Using CASE 53
Categorical Encoding Using CASE 56
CASE Statement Summary 59
Exercises Using the Included Database 60
Chapter 5 SQL JOINs 61
Database Relationships and SQL JOINs 61
A Common Pitfall when Filtering Joined Data 71
JOINs with More than Two Tables 74
Exercises Using the Included Database 76
Chapter 6 Aggregating Results for Analysis 79
GROUP BY Syntax 79
Displaying Group Summaries 80
Performing Calculations Inside Aggregate Functions 84
MIN and MAX 88
COUNT and COUNT DISTINCT 90
Average 91
Filtering with HAVING 93
CASE Statements Inside Aggregate Functions 94
Exercises Using the Included Database 96
Chapter 7 Window Functions and Subqueries 97
ROW NUMBER 98
RANK and DENSE RANK 101
NTILE 102
Aggregate Window Functions 103
LAG and LEAD 108
Exercises Using the Included Database 111
Chapter 8 Date and Time Functions 113
Setting datetime Field Values 114
EXTRACT and DATE_PART 115
DATE_ADD and DATE_SUB 116
DATEDIFF 118
TIMESTAMPDIFF 119
Date Functions in Aggregate Summaries and Window Functions 119
Exercises 126
Chapter 9 Exploratory Data Analysis with SQL 127
Demonstrating Exploratory Data Analysis with SQL 128
Exploring the Products Table 128
Exploring Possible Column Values 131
Exploring Changes Over Time 134
Exploring Multiple Tables Simultaneously 135
Exploring Inventory vs. Sales 138
Exercises 142
Chapter 10 Building SQL Datasets for Analytical Reporting 143
Thinking Through Analytical Dataset Requirements 144
Using Custom Analytical Datasets in SQL:
CTEs and Views 149
Taking SQL Reporting Further 153
Exercises 157
Chapter 11 More Advanced Query Structures 159
UNIONs 159
Self-Join to Determine To-Date Maximum 163
Counting New vs. Returning Customers by Week 167
Summary 171
Exercises 171
Chapter 12 Creating Machine Learning Datasets Using SQL 173
Datasets for Time Series Models 174
Datasets for Binary Classification 176
Creating the Dataset 178
Expanding the Feature Set 181
Feature Engineering 185
Taking Things to the Next Level 189
Exercises 189
Chapter 13 Analytical Dataset Development Examples 191
What Factors Correlate with Fresh Produce Sales? 191
How Do Sales Vary by Customer Zip Code,
Market Distance, and Demographic Data? 211
How Does Product Price Distribution Affect
Market Sales? 217
Chapter 14 Storing and Modifying Data 229
Storing SQL Datasets as Tables and Views 229
Adding a Timestamp Column 232
Inserting Rows and Updating Values in Database Tables 233
Using SQL Inside Scripts 236
In Closing 237
Exercises 238
Appendix Answers to Exercises 239
Index 255
Este título pertence ao(s) assunto(s) indicados(s). Para ver outros títulos clique no assunto desejado.
SQL guide; SQL data scientists; design analytical datasets; spreadsheet data source; build datasheets; construct datasets; data analyst book; data scientist guide; data scientist career; SQL skills; relational database structure; design query; data science; data science skills; top data science skills; data science careers; data science careers query; design datasets for machine learning; how to write queries; become a data scientist; how to become a data scientist; books for data scientists; learn data science; master data science; how to I become a data scientist
Introduction xix
Chapter 1 Data Sources 1
Data Sources 1
Tools for Connecting to Data Sources and Editing SQL 2
Relational Databases 3
Dimensional Data Warehouses 7
Asking Questions About the Data Source 9
Introduction to the Farmer's Market Database 11
A Note on Machine Learning Dataset Terminology 12
Exercises 13
Chapter 2 The SELECT Statement 15
The SELECT Statement 15
The Fundamental Syntax Structure of a SELECT Query 16
Selecting Columns and Limiting the Number of Rows Returned 16
The ORDER BY Clause: Sorting Results 18
Introduction to Simple Inline Calculations 20
More Inline Calculation Examples: Rounding 22
More Inline Calculation Examples: Concatenating Strings 24
Evaluating Query Output 26
SELECT Statement Summary 29
Exercises Using the Included Database 30
Chapter 3 The WHERE Clause 31
The WHERE Clause 31
Filtering SELECT Statement Results 32
Filtering on Multiple Conditions 34
Multi-Column Conditional Filtering 40
More Ways to Filter 41
BETWEEN 41
IN 42
LIKE 43
IS NULL 44
A Warning About Null Comparisons 44
Filtering Using Subqueries 46
Exercises Using the Included Database 47
Chapter 4 CASE Statements 49
CASE Statement Syntax 50
Creating Binary Flags Using CASE 52
Grouping or Binning Continuous Values Using CASE 53
Categorical Encoding Using CASE 56
CASE Statement Summary 59
Exercises Using the Included Database 60
Chapter 5 SQL JOINs 61
Database Relationships and SQL JOINs 61
A Common Pitfall when Filtering Joined Data 71
JOINs with More than Two Tables 74
Exercises Using the Included Database 76
Chapter 6 Aggregating Results for Analysis 79
GROUP BY Syntax 79
Displaying Group Summaries 80
Performing Calculations Inside Aggregate Functions 84
MIN and MAX 88
COUNT and COUNT DISTINCT 90
Average 91
Filtering with HAVING 93
CASE Statements Inside Aggregate Functions 94
Exercises Using the Included Database 96
Chapter 7 Window Functions and Subqueries 97
ROW NUMBER 98
RANK and DENSE RANK 101
NTILE 102
Aggregate Window Functions 103
LAG and LEAD 108
Exercises Using the Included Database 111
Chapter 8 Date and Time Functions 113
Setting datetime Field Values 114
EXTRACT and DATE_PART 115
DATE_ADD and DATE_SUB 116
DATEDIFF 118
TIMESTAMPDIFF 119
Date Functions in Aggregate Summaries and Window Functions 119
Exercises 126
Chapter 9 Exploratory Data Analysis with SQL 127
Demonstrating Exploratory Data Analysis with SQL 128
Exploring the Products Table 128
Exploring Possible Column Values 131
Exploring Changes Over Time 134
Exploring Multiple Tables Simultaneously 135
Exploring Inventory vs. Sales 138
Exercises 142
Chapter 10 Building SQL Datasets for Analytical Reporting 143
Thinking Through Analytical Dataset Requirements 144
Using Custom Analytical Datasets in SQL:
CTEs and Views 149
Taking SQL Reporting Further 153
Exercises 157
Chapter 11 More Advanced Query Structures 159
UNIONs 159
Self-Join to Determine To-Date Maximum 163
Counting New vs. Returning Customers by Week 167
Summary 171
Exercises 171
Chapter 12 Creating Machine Learning Datasets Using SQL 173
Datasets for Time Series Models 174
Datasets for Binary Classification 176
Creating the Dataset 178
Expanding the Feature Set 181
Feature Engineering 185
Taking Things to the Next Level 189
Exercises 189
Chapter 13 Analytical Dataset Development Examples 191
What Factors Correlate with Fresh Produce Sales? 191
How Do Sales Vary by Customer Zip Code,
Market Distance, and Demographic Data? 211
How Does Product Price Distribution Affect
Market Sales? 217
Chapter 14 Storing and Modifying Data 229
Storing SQL Datasets as Tables and Views 229
Adding a Timestamp Column 232
Inserting Rows and Updating Values in Database Tables 233
Using SQL Inside Scripts 236
In Closing 237
Exercises 238
Appendix Answers to Exercises 239
Index 255
Chapter 1 Data Sources 1
Data Sources 1
Tools for Connecting to Data Sources and Editing SQL 2
Relational Databases 3
Dimensional Data Warehouses 7
Asking Questions About the Data Source 9
Introduction to the Farmer's Market Database 11
A Note on Machine Learning Dataset Terminology 12
Exercises 13
Chapter 2 The SELECT Statement 15
The SELECT Statement 15
The Fundamental Syntax Structure of a SELECT Query 16
Selecting Columns and Limiting the Number of Rows Returned 16
The ORDER BY Clause: Sorting Results 18
Introduction to Simple Inline Calculations 20
More Inline Calculation Examples: Rounding 22
More Inline Calculation Examples: Concatenating Strings 24
Evaluating Query Output 26
SELECT Statement Summary 29
Exercises Using the Included Database 30
Chapter 3 The WHERE Clause 31
The WHERE Clause 31
Filtering SELECT Statement Results 32
Filtering on Multiple Conditions 34
Multi-Column Conditional Filtering 40
More Ways to Filter 41
BETWEEN 41
IN 42
LIKE 43
IS NULL 44
A Warning About Null Comparisons 44
Filtering Using Subqueries 46
Exercises Using the Included Database 47
Chapter 4 CASE Statements 49
CASE Statement Syntax 50
Creating Binary Flags Using CASE 52
Grouping or Binning Continuous Values Using CASE 53
Categorical Encoding Using CASE 56
CASE Statement Summary 59
Exercises Using the Included Database 60
Chapter 5 SQL JOINs 61
Database Relationships and SQL JOINs 61
A Common Pitfall when Filtering Joined Data 71
JOINs with More than Two Tables 74
Exercises Using the Included Database 76
Chapter 6 Aggregating Results for Analysis 79
GROUP BY Syntax 79
Displaying Group Summaries 80
Performing Calculations Inside Aggregate Functions 84
MIN and MAX 88
COUNT and COUNT DISTINCT 90
Average 91
Filtering with HAVING 93
CASE Statements Inside Aggregate Functions 94
Exercises Using the Included Database 96
Chapter 7 Window Functions and Subqueries 97
ROW NUMBER 98
RANK and DENSE RANK 101
NTILE 102
Aggregate Window Functions 103
LAG and LEAD 108
Exercises Using the Included Database 111
Chapter 8 Date and Time Functions 113
Setting datetime Field Values 114
EXTRACT and DATE_PART 115
DATE_ADD and DATE_SUB 116
DATEDIFF 118
TIMESTAMPDIFF 119
Date Functions in Aggregate Summaries and Window Functions 119
Exercises 126
Chapter 9 Exploratory Data Analysis with SQL 127
Demonstrating Exploratory Data Analysis with SQL 128
Exploring the Products Table 128
Exploring Possible Column Values 131
Exploring Changes Over Time 134
Exploring Multiple Tables Simultaneously 135
Exploring Inventory vs. Sales 138
Exercises 142
Chapter 10 Building SQL Datasets for Analytical Reporting 143
Thinking Through Analytical Dataset Requirements 144
Using Custom Analytical Datasets in SQL:
CTEs and Views 149
Taking SQL Reporting Further 153
Exercises 157
Chapter 11 More Advanced Query Structures 159
UNIONs 159
Self-Join to Determine To-Date Maximum 163
Counting New vs. Returning Customers by Week 167
Summary 171
Exercises 171
Chapter 12 Creating Machine Learning Datasets Using SQL 173
Datasets for Time Series Models 174
Datasets for Binary Classification 176
Creating the Dataset 178
Expanding the Feature Set 181
Feature Engineering 185
Taking Things to the Next Level 189
Exercises 189
Chapter 13 Analytical Dataset Development Examples 191
What Factors Correlate with Fresh Produce Sales? 191
How Do Sales Vary by Customer Zip Code,
Market Distance, and Demographic Data? 211
How Does Product Price Distribution Affect
Market Sales? 217
Chapter 14 Storing and Modifying Data 229
Storing SQL Datasets as Tables and Views 229
Adding a Timestamp Column 232
Inserting Rows and Updating Values in Database Tables 233
Using SQL Inside Scripts 236
In Closing 237
Exercises 238
Appendix Answers to Exercises 239
Index 255
Este título pertence ao(s) assunto(s) indicados(s). Para ver outros títulos clique no assunto desejado.
SQL guide; SQL data scientists; design analytical datasets; spreadsheet data source; build datasheets; construct datasets; data analyst book; data scientist guide; data scientist career; SQL skills; relational database structure; design query; data science; data science skills; top data science skills; data science careers; data science careers query; design datasets for machine learning; how to write queries; become a data scientist; how to become a data scientist; books for data scientists; learn data science; master data science; how to I become a data scientist