I remember when I was first learning data science. There were so many resources and so much to learn that it was easy to get lost. I explored many avenues that, while interesting, were in retrospect not the most efficient way to get started. If you are just starting your journey and want the 3 best books to help you focus your studies, this is the article for you.
Python For Data Analysis
I start with the classic pandas book written by the creator of pandas himself: Python for Data Analysis. I’ll be the first to admit that this is not a perfect book. It reads almost like a cookbook, but I have found it to be the best way to get started with Python for data analysis. It will teach you how to get set up with Python as well as how to load, wrangle, clean, and visualize data.
When starting out, I think it is a much better strategy to begin with the data processing and analytics pieces as it helps you learn to really understand your data and emphasizes the importance of all the steps that need to happen before machine learning. Also, this tends to be the best way to get comfortable using Python for data science and sets you up well for the next book.
Hands-On Machine Learning
Now that you are feeling comfortable with Python and manipulating data, it’s time to start modeling! Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow is by far the best book to get started with machine learning. This book will take you from linear regression all the way to GANs and deploying deep learning at scale. That is an insane amount of material to cover, and the author does so beautifully.
This book will definitely take some time to get through, but I found it to be very beginner-friendly, so with effort, I think almost anyone could get through it. And if you do, you will emerge on the other side with an amazing foundation of machine learning knowledge and practical experience.
Introduction to Statistical Learning
Lastly, I think everyone should read An Introduction to Statistical Learning. After the first two books, it does a great job of adding a statistical viewpoint to your knowledge. It covers some of the same algorithms as Hands-On Machine Learning, but with a more statistical bent. It also goes much deeper into the world of regression models and provides R code for practical application.
The book was written to be an “accessible overview of the field of statistical learning” and definitely gets the job done. To do so, though, it focuses more on intuitive explanations than on math, so other books would be necessary to dive even deeper.
A Strong Foundation
I picked these three books as a starting point because, once you have finished them, I believe you will have a solid foundation to explore almost any area of data science in more depth. As these books are targeted at people entering the field, the main area you might find lacking is mathematical rigor. If you are like me, though, starting with the practical and intuitive methods helps build the motivation to dig deeper. I hope these books help you as much as they did me!