7 Must-Read Books for the Budding Data Scientist

Getting started in data science can be overwhelming. There are so many new tools, applications and ways to explore data that even experts in the field don’t have it all figured out. But for budding data scientists, understanding this complex field may be just a few pages away. These highly acclaimed books cover the basics of big data and beyond, from predictive analysis and real-world applications to potential threats, and together they offer a broad introduction to the field. Consider this an essential reading list for the aspiring data scientist.

Big Data

Written by Viktor Mayer-Schönberger and Kenneth Cukier

Eamon Dolan/Houghton Mifflin Harcourt (March 5, 2013)

This book provides a detailed introduction to the emerging science of big data and examines some of the most pressing issues raised by its current and future applications. Exploring big data in business, health, politics and more, it shows how big data is transforming the way we process the information around us. Big Data also covers the threats of data science, including the erosion of personal privacy. Overall, the book offers a strong introduction to the big data revolution and is an excellent resource for budding data scientists exploring the field.

Read the New York Times review

Automate This

Written by Christopher Steiner

Portfolio Hardcover (August 30, 2012)

Data science is much more than predictions; automation and algorithms play a major part as well. In Automate This, Christopher Steiner explains how algorithms are increasingly being used to tackle high-level tasks that were once performed only by humans with advanced training, including medical diagnosis and foreign policy analysis. Automate This illustrates how algorithms have far exceeded the expectations of their creators, and how the “bot revolution” is reaching into every aspect of our lives.

Read the Wall Street Journal review

The Signal and the Noise

Written by Nate Silver

Penguin Press (September 27, 2012)

Big data is aptly named: every data scientist knows that the world is teeming with data, so much that it would be impossible to comprehend without specialized tools and meticulous analysis. Without accurate methods, the sheer abundance of data can lead predictions astray, especially given the limits of human cognition. Read The Signal and the Noise to find out how forecasters overcome biases and unpredictability to make accurate, meaningful predictions in a vast sea of noisy data.

Read the New York Times review

Big Data at Work

Written by Thomas H. Davenport

Harvard Business Review Press (February 25, 2014)

One of big data’s most fascinating applications is in the world of business. Big data can inform decision making, improve customer relationships and streamline organizations. Thomas Davenport’s Big Data at Work explains the opportunities, impact and critical factors for successfully using big data in business. This book is an excellent guide for businesses interested in harnessing the power of big data, illustrating how leading organizations are using data science to improve the ways they do business.

Read the Forbes review

Predictive Analytics

Written by Eric Siegel

Wiley (February 19, 2013)

Perfect for new data scientists, Predictive Analytics offers concrete, easy-to-understand insights into the complex world of data analysis. Read this book to find out how institutions are increasingly predicting human behavior: whether you’re going to click, buy, lie, or die, as the title suggests. Predictive Analytics also shares the “why” and the “how” of behavior prediction, highlighting the many ways in which predictive analysis can improve healthcare, fight crime and boost sales through the careful analysis of big data.

Read the Forbes review

Privacy in the Age of Big Data

Written by Theresa M. Payton and Ted Claypoole

Rowman & Littlefield Publishers (January 16, 2014)

Big data can predict what you’ll buy at the grocery store, the spread of disease and even when you’ll die. The power to tell the future with seemingly incomprehensible quantities of data is truly astounding, but is it all just a little too personal? In a world where practically every move can increasingly be predicted, it’s difficult to maintain a sense of privacy. Privacy in the Age of Big Data asks readers to consider the ramifications of data collection and surveillance and offers solutions to those who prefer to remain private. Check out our two-part guest blog post by co-author Ted Claypoole.

Read the UC Berkeley review

Doing Data Science

Written by Cathy O’Neil and Rachel Schutt

O’Reilly Media (November 3, 2013)

Doing Data Science is an ideal read for budding data scientists who are just getting started in the field. Based on Columbia University’s Introduction to Data Science class, this book will teach you to see through the popular hype around “big data,” and it will give you the knowledge and insights you need to hit the ground running in this fast-growing field. Its chapters present lectures by leading data scientists at Google, Microsoft and eBay, who share case studies and code for analysis, algorithms, modeling, visualization and more.

Read the Scientific Computing review