Drawing a parallel between statistical models and books on statistics, I lean on the instrumental words of statistical legend George Box: "All models are wrong. Some models are useful."
I’ve read close to one hundred books on statistics and probability. Many are textbooks or written in a similar style. And in my humble opinion, an overwhelming majority would fail a usability test by task one.
I am not a conspiracy theorist, but I once conducted a thought experiment about how statisticians could create an esoteric society. The mantra of this society would be something to the tune of, “we write to further our understanding, and so the remainder amplify in perplexity.”
Taking one statistics book I own, I scan to page 5. And I find this:
$$A = \frac{1}{n}\sum_{i=1}^{n} x_i$$
Here is the problem I see with this formula:
First, unless they were in a Greek organization, few people know what a sigma is, and those who do probably don’t know what it represents.
And why are there lowercase letters above and below the backwards-looking 3? What is A? n? x? What does the subscript i do? And what is the significance of the 1 in either place?
Simply, we do not read this way, nor do we process information this way.
What’s worse? The above formula is the cryptic explanation of how to calculate an average. Maybe someday someone will write something that explains these concepts at an eighth-grade reading level, with no complicated notation like what we see above, and that, as a bonus, would actually be entertaining.
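For what it is worth, the calculation hiding behind that notation fits in a few lines of Python. This is only a minimal sketch to show how little is going on; the function name average and the sample numbers are made up for illustration and do not come from the book.

```python
def average(values):
    """Add up every value, then divide by how many values there are."""
    total = 0.0
    for x in values:            # each x is one of the values being averaged
        total += x              # the sigma just means "keep adding"
    return total / len(values)  # n is simply the count of values


# Hypothetical sample: four quiz scores.
scores = [2, 4, 6, 8]
print(average(scores))  # 20 divided by 4
```

Add everything up, divide by the count, and that is the whole formula.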
Until that day comes, here are the four best statistics books I've ever read:
Naked Statistics: Stripping the Dread from the Data
The second book in a series by Charles Wheelan, a former writer for The Economist, is an entertaining look at how statistics is a powerful tool once we eliminate “the dread from the data.”
A New York Times Bestseller, Wheelan’s masterpiece is by far the most entertaining book on statistics on my bookshelf.
It reinforced a lot of my statistical knowledge. The kind that is easily forgotten. It also taught me a great deal about what I was missing from my statistical tool belt.
Wheelan takes a humble and conversational approach to statistics, with demonstrations of how applied statistics enables solutions to economic and societal problems.
This is the first book I’ve found to successfully explain the Monty Hall problem. Additionally, Wheelan offers fantastic examples of the awesome power of statistics, like showing how a statistical mean can be devastating when improperly interpreted, and demonstrating how statistical methods predicted the likelihood that a female shopper is pregnant.
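If the Monty Hall problem still feels like a trick, a quick simulation can help. The sketch below is my own illustration, not Wheelan’s explanation, and it assumes the standard setup: three doors, one car, and a host who always opens a door that hides no car and is not the contestant’s pick. The names play_once and win_rate are hypothetical.

```python
import random


def play_once(switch, rng):
    """Play one round; return True if the contestant ends up with the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)    # where the car is hidden
    pick = rng.choice(doors)   # the contestant's first choice
    # The host opens a door that is neither the pick nor the car.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Move to the one door that is still closed and not yet picked.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car


def win_rate(switch, trials=100_000, seed=0):
    """Fraction of rounds won over many simulated games."""
    rng = random.Random(seed)
    wins = sum(play_once(switch, rng) for _ in range(trials))
    return wins / trials


print("stay:  ", win_rate(switch=False))
print("switch:", win_rate(switch=True))
```

Over many rounds, staying wins about one time in three and switching wins about two times in three.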
Published in 2013, Wheelan's work positioned itself well at the forefront of an ever-growing data-driven economy and society. As a fan of all his work, I hope that his work continues to shift the perspectives of those fearful of quantitative topics.
This is the book I wanted to write when I first strategized my contribution to the statistics community. Wheelan nailed it. And I am beyond proud and grateful to call Dr. Wheelan a mentor in my own endeavor.
How to Lie with Statistics
In my years of reading, there are few books I would consider timeless. This is one of them.
OK, so not everything in it is timeless. A business owner with an annual salary of $45,000, whose workers’ wages hover in the single-digit thousands, is not exactly consistent with today’s numbers. The examples in the book might not be relatable, but they are most certainly relevant.
Darrell Huff originally published this seminal work in 1954 and renewed the copyright in 1982.
My copy is from 1993, and on the front cover, it reads “Over half a million copies sold – an honest-to-goodness bestseller.” I would agree. According to the author, it has sold over one million copies to date.
The book is not actually about how to lie with statistics. As Huff suggests:
“The crooks already know these tricks; honest men must learn them in self-defense.”
While he uses this expression as part of a metaphor, throughout my career, I have seen more individuals who do not realize they are lying than those who are doing it intentionally. Throughout the book, Huff shows examples of skewed statistics to teach a lesson on proper usage and interpretation. An inverted instructional design, if you will.
It touches on the topics of sampling, central tendency, and perhaps most importantly, data visualization.
Given that in our space, most data are presented in the form of graphs and charts, the lessons imparted by Huff are relevant to this day. Perhaps more relevant today than at the point of its original printing.
How Not to Be Wrong: The Power of Mathematical Thinking
Admittedly, this is a book on mathematics, but Jordan Ellenberg does a solid for his mother and father, both statisticians, by allocating a fair amount of book space to probability and statistics.
Creatively and wittily, Ellenberg tackles the p-value with a heading titled "Doctor, it hurts when I p."
This is a thinking book. The kind that Bill Gates values, naming it to his “10 Favorite Books” list.
This New York Times Bestseller sells from page one that mathematics is at the heart of all we do.
Ellenberg was able to offer concrete examples of when playing the lottery is actually logical.
The reason? The expected value equation. He is also the first author I’ve seen who has been able to explain how correlation is not transitive.
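Both ideas are easy to poke at with a few lines of Python. The sketch below is mine, not Ellenberg’s: the lottery prizes, odds, and ticket price are invented purely for illustration, and the three tiny data series are constructed by hand so that x and z are unrelated while y is built from both.

```python
from math import sqrt


def expected_value(outcomes, price):
    """Average net winnings per ticket.

    outcomes: list of (probability, payout) pairs for a single ticket.
    """
    return sum(p * payout for p, payout in outcomes) - price


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in xs))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in ys))
    return cov / (spread_x * spread_y)


# Hypothetical lottery: a $2 ticket with two possible prizes.
ticket = [(1 / 1000, 1500), (1 / 100, 60)]
ev = expected_value(ticket, price=2)
print("expected net winnings per ticket:", round(ev, 2))

# Hand-built series: y depends on both x and z, but x and z are unrelated.
x = [1, 1, -1, -1]
z = [1, -1, 1, -1]
y = [a + b for a, b in zip(x, z)]

print("corr(x, y):", round(pearson(x, y), 3))
print("corr(y, z):", round(pearson(y, z), 3))
print("corr(x, z):", round(pearson(x, z), 3))
```

With these made-up numbers the ticket is worth slightly more than it costs on average, which is the only situation where buying one is a rational bet. And although x moves with y and y moves with z, x and z have no correlation at all.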
And the storytelling throughout the history of mathematics, probability and statistics is nothing short of entertaining and memorable.
The Signal and the Noise: Why So Many Predictions Fail—but Some Don't
Yet another New York Times Bestseller, this book is filled with intelligent commentary about how predictions succeed and fail.
It wouldn’t be a discussion about influential works if I didn’t mention Daniel Kahneman and Amos Tversky somewhere. In his book, Thinking, Fast and Slow, Kahneman chronicles the 40-year journey that he and Tversky took to answer the question: Do we think statistically?
I will give you the CliffsNotes. No. We do not think statistically. Even professional statisticians, forecasters, and economists are prone to statistical error in judgment.
But if there were a continuum from “do not think statistically whatsoever” to “I only think statistically,” Nate Silver would likely be one of the top statistical minds on the right end of the spectrum.
Ellenberg spent some time commending Silver on his mindset, notably his tendency to forecast probabilistically, avoiding the matter-of-fact approach that most media outlets want to hear.
As an expert, you are often expected to give your opinion of a discrete outcome. Not Silver. Nate’s opinion is formed by probabilities. Likelihoods. The kind of talk that says Candidate A is more likely to win, at this point, than Candidate B. No absolutes. Just probabilities. And speaking at that specific time. And it is among the most admirable qualities of a statistician.
It apparently wasn’t enough for Nate to share stories from his own experiences and knowledge. This guy did his homework. Interviews, academic literature, patent filings, government reports, you name it.
He has 55 pages of citations, in what looks like size two font. Ever written a book with over 1,000 citations? I counted 1,014, with an average of 68 per chapter (71 if you throw out the Conclusion).
Silver’s tour de force offers multiple lessons about the challenges of forecasting. It is an ideal platform to explain why he tends to avoid absolutes.
He chronicles the experiences of baseball forecasters, political pundits, meteorologists, economists, seismologists, and poker players.
Among the gems in this book, there are two that remain embedded in my brain to this day. The first is Silver’s brilliantly executed articulation of Bayes’ theorem. The second is his position on the future of multi-agent modeling. It’s coming, folks.
The overall take here is that, although many of the concepts in this book are statistical and probabilistic, the examples used are fascinating and current. Coupling these two makes this a pretty easy read, even for those with little to no statistical background.
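For anyone who has not met Bayes’ theorem, the idea is to start with a prior belief and revise it each time new evidence arrives. The sketch below is a generic textbook version written for this post, not a reproduction of Silver’s presentation; the function name posterior and all of the probabilities are hypothetical.

```python
def posterior(prior, p_evidence_if_true, p_evidence_if_false):
    """Updated probability that a hypothesis is true after seeing evidence.

    prior: belief in the hypothesis before the evidence.
    p_evidence_if_true: chance of seeing this evidence if it is true.
    p_evidence_if_false: chance of seeing this evidence if it is false.
    """
    numerator = prior * p_evidence_if_true
    denominator = numerator + (1 - prior) * p_evidence_if_false
    return numerator / denominator


# Hypothetical numbers: a 1% prior, and evidence that shows up 90% of the
# time when the hypothesis is true but 10% of the time when it is false.
belief = 0.01
for round_number in (1, 2):
    belief = posterior(belief, 0.9, 0.1)
    print("after evidence", round_number, "->", round(belief, 3))
```

One piece of fairly strong evidence only lifts a 1% belief to roughly 8%, and a second piece brings it to about 45%. No absolutes, just probabilities that move as the evidence comes in.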
Have you read any of the books mentioned? Or are there any books you think should be on this list? Tell us below!