Public Policy Analytics: Code & Context for Data Science in Government
Updated: February, 2021
Welcome to the online version of Public Policy Analytics: Code & Context for Data Science in Government, a book set to be published by CRC Press as part of its Data Science Series. The data for this book can be found here.
The goal of this book is to make data science accessible to social scientists and city planners, in particular. I hope to convince readers that someone with strong domain expertise plus intermediate data skills can have a greater impact in government than the sharpest computer scientist who has never studied economics, sociology, public health, political science, criminology, etc.
Public Policy Analytics was written to pass along the knowledge I have personally gained from so many gifted educators over the last 20 years. They are too many to name individually, but their impression on me has been so lasting and so monumental that, somewhere along the line, I decided to become an educator myself. This book is a reflection of all that these individuals have given to me.
I am incredibly grateful to my colleague Sydney Goldstein, without whom this book would not have been possible. Sydney was instrumental in helping me edit and compile the text. Additionally, she and I co-authored an initial version of Chapter 7 as a white paper. Dr. Tony Smith, a most cherished mentor and friend, edited nearly every machine learning chapter in this book. Dr. Maria Cuellar (Ch. 5), Michael Fichman (Intro), Matt Harris (review of functions), Dr. George Kikuchi (Ch. 5), and Dr. Jordan Purdy (Chs. 6 & 7) each generously provided their time and expertise in review. I thank them wholeheartedly. All errors are mine alone. Finally, this book is dedicated to my wife, Diana, and my sons, Emil and Malcolm, who always keep me focused on love and positivity.
I hope both non-technical policymakers and budding public-sector data scientists find this book useful, and I thank you for taking a look.
West Philadelphia, PA.
Table of Contents
|Chapter 1: Indicators for Transit Oriented Development||Following the Introduction, Chapter 1 introduces indicators as an important tool for simplifying and communicating complex processes to non-technical decision makers. Introducing the
|Chapter 2: Expanding the Urban Growth Boundary||Chapter 2 explores the discontinuous nature of boundaries to understand how an Urban Growth Area in Lancaster County, PA affects suburban sprawl.||link|
|Chapters 3 & 4: Intro to Geospatial Machine Learning||Chapters 3 and 4 provide a first look at geospatial predictive modeling, forecasting home prices in Boston, MA. Chapter 3 introduces linear regression, goodness of fit metrics, and cross-validation, with the goal of assessing model accuracy and generalizability. Chapter 4 builds on the initial analysis to account for the ‘spatial process’ or pattern of home prices.||link|
|Chapter 5: Geospatial Risk Modeling - Predictive Policing||Chapter 5 tackles the controversial topic of Predictive Policing, forecasting burglary risk in Chicago. The argument is made that converting Broken Windows theory into Broken Windows policing can bake bias directly into a predictive model and lead to a discriminatory resource allocation tool. The concept of generalizability remains key.||link|
|Chapter 6: People-Based ML Models||Chapter 6 introduces the use of machine learning in estimating risk/opportunity for individuals. The resulting intelligence is then used to develop a cost/benefit analysis for Bounce to Work!, a pogo-transit start-up. The goal is to predict the probability that a client will ‘churn’, that is, not re-up their membership. This is valuable for public-sector data scientists working with individuals and families.||link|
|Chapter 7: People-Based ML Models: Algorithmic Fairness||Chapter 7 evaluates people-based algorithms for ‘disparate impact’ - the idea that even if an algorithm is not designed to discriminate on its face, it may still have a discriminatory effect. This chapter returns to a criminal justice use case, estimating the social costs and benefits.||link|
|Chapter 8: Predicting Rideshare Demand||Chapter 8 builds a space/time predictive model of rideshare demand in Chicago. New R functionality is introduced along with functions unique to time series data.||link|