
The Single Best Introductory Statistics Book for Data Science

This article was originally published on Towards Data Science on July 16th, 2020.

Everybody and their mother wants to learn data science. The field is quite interesting, I have to admit, but it comes with a lot of prerequisites. The most important one is statistics, both descriptive and inferential, along with probability theory.

Statistics is covered pretty well by tech universities, so what’s the point of this article? There are 3 main points, my friend. Read this article if:

  1. It’s been some time since your last exposure to statistics
  2. You didn’t find it intuitive and well-explained during your studies
  3. You were intoxicated by the field

If you fall into any of these 3 categories, boy do I have a worthy resource for you. I read a good portion of this book after my bachelor’s degree, and the rest right after enrolling in a data science master’s program.

I’m glad I did, because it would have been a much bumpier road otherwise.

Okay, let’s see which book I am talking about in the next section. Please keep in mind that down below you’ll find an affiliate link to the book. That doesn’t change anything for you, as the price is identical, but I’ll get a small commission if you decide to make a purchase.

Head First Statistics

This might come as a surprise to you. It’s not a book you can usually see in university courses — mostly because it’s full of visualizations and plain simple explanations. University books, on the other hand, are full of formulas, proofs, and plain old boring text.

When you first see the physical copy of this book you might get a bit anxious, as it has over 700 pages. That’s not short by any standard, but it certainly doesn’t feel like a 700-page book. Let me elaborate.

Pick up any statistics and probability book from a university, and there is a high chance it runs to around 500 pages, if not more. While that is significantly less than Head First Statistics, the university book most likely isn’t full of visual examples.

If someone took all the visualizations out of Head First Statistics, the page count would drop by half, if not more.

Why is that important?

Good question. For the majority of us, reading through pages and pages of plain old boring text isn’t the most fun thing in the world. That’s especially the case if you work full time, and want to study statistics after your day job. It’s just not going to happen — you’ll fall asleep by the third paragraph.

You need the right amount of text, followed by some nice visuals, followed by practical examples.

Answer me this question honestly: Would you read this article if I had written it as a single paragraph? No, because that would look like a chore, and an unnecessary one. Also, we have a short attention span. If you can’t finish a paragraph within 30 seconds, you most likely won’t continue reading. In simpler terms: if an article isn’t formatted in a visually pleasing way, you’ll go and find one that is.

The same goes for books. Every time I see 30-line paragraphs I get the urge to drop the book immediately, no matter how good it might be. You see, 10 different books on the same subject typically cover the same topics, but what makes a book a bestseller is how approachable it is to the reader (and marketing, of course).

Okay, we now know what to search for in a book and why we never finish some books — no matter how valuable the information might be. Let’s now explore what Head First Statistics has to offer.

My short review

As I’ve said, I finished the book roughly 1.5 years ago, and it was a great primer for more advanced topics. If I were to read a book on statistics with Python now, one that doesn’t cover the theory in depth, I wouldn’t be confused, thanks to the solid background knowledge.

And that’s really who this book is for: either complete beginners, or those who’ve taken statistics courses before but found the teaching style awful. If you’ve completed several university-level courses on statistics and probability, you might want to pass on this one.

The book covers the following topics:

  • Basic data visualization
  • Measures of central tendency and spread
  • Probability, permutations, combinations, and distributions
  • Statistical sampling
  • Confidence intervals
  • Hypothesis testing
  • Regression analysis

And it covers them really, really well. The book doesn’t cover anything that isn’t taught in university courses, but unless you’re at one of the world’s best universities, I honestly think you are far better off with this book.
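For a taste of what the first few topics on that list look like in practice, here is a minimal Python sketch. It is not taken from the book: the sample data and variable names are made up for illustration, and the confidence interval uses a normal approximation, which is only rough for a sample this small (a t-based interval would be the more careful choice).

```python
import math
import statistics

# Hypothetical sample: minutes spent studying per day over two weeks.
study_minutes = [30, 45, 20, 60, 35, 50, 40, 25, 55, 45, 30, 40, 35, 50]

# Measures of central tendency.
mean = statistics.mean(study_minutes)
median = statistics.median(study_minutes)

# Measure of spread: sample standard deviation (divides by n - 1).
stdev = statistics.stdev(study_minutes)

# Approximate 95% confidence interval for the mean.
# Assumption: normal approximation; z is the 97.5th percentile
# of the standard normal distribution.
z = statistics.NormalDist().inv_cdf(0.975)
margin = z * stdev / math.sqrt(len(study_minutes))
ci = (mean - margin, mean + margin)

print(f"mean={mean}, median={median}, stdev={stdev:.2f}")
print(f"approx. 95% CI for the mean: ({ci[0]:.2f}, {ci[1]:.2f})")
```

Only the standard library is used here, so the snippet runs on any recent Python 3 install without extra packages.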

Before you go

Statistics is a mandatory tool for every data scientist, no arguing there. At the same time, your formal education in statistics might have sucked, or might never have existed. That doesn’t mean you can’t learn the topic. It will just require more self-study.

And that’s where books like Head First Statistics come in handy. It won’t take you too long to finish, around 1 to 2 months, depending on your previous knowledge and the amount of time you can spare. Once you finish it, you’ll find it much easier to understand more advanced topics in data science and machine learning.

Thanks for reading.


Dario Radečić
Data scientist, blogger, and enthusiast. Passionate about deep learning, computer vision, and data-driven decision making.
