Book Review : Everybody Lies
Author : Seth Stephens-Davidowitz
My Rating : 5 out of 5 stars
The complete title of the book is “Everybody Lies: Big Data, New Data, and What the Internet Can Tell Us About Who We Really Are”.
As I have done for some other book reviews, I must start this review with proper disclosures. I spent a significant portion of my career at Yahoo, where “Big Data” or “Data Science”, or whatever you want to call it, is a big part of the strategy. I have worked on projects in this field myself, and the results of my own code have astonished me with the sheer amount of insight that sometimes simple techniques can extract. I am not new to this field, and I am very familiar with the potency of such data mining exercises.
So a lot of the insights presented in this book were not surprising to me at all. That includes the techniques explained in the book for finding correlations using statistics, as well as the kinky nature of much of what people search for. Almost a decade ago, like many software engineers, I downloaded Apache Lucene to familiarize myself with search engine technology. The download included a sample of actual search terms used by internet users, and it felt as if almost all the terms were pornographic in nature, and sometimes quite disturbing. So yes, I kind of knew a lot of the findings of this book already.
If you are not from a similar work background, be prepared to be surprised, enlightened, entertained, disturbed and a bit depressed. Sometimes, all at the same time.
In this fascinating book, the author gives you a grand tour of the insights that can be inferred these days about specific individuals, as well as about society in general. The main source of this information is, of course, the internet. As the book explains, the interactions people have with various websites, especially Google and PornHub, are much richer for data mining than traditional mechanisms such as surveys. The main reason is anonymity - people think that they are not revealing their identity. So they search for things and visit sites that they would not admit to in polite company. They leave anonymous comments that can be analyzed using natural language processing techniques. The quantity of such interactions rises and falls with events that happen in the real world. Such trends give far deeper insight into what people are actually feeling, which non-anonymous methods of interaction would simply not reveal.
All this is possible because people behave differently in public and in private - especially when they are alone and think they are not being monitored. Hence the title of the book - Everybody Lies.
We all know that Facebook posts are nothing but advertisements (often false) for what users want their friends to perceive about their lives. Reality is very different. But how different? By how much? And what about politically incorrect issues? For example, how racist are people really? What social policies do they really support? And so on.
The author gives a very nice introduction to the techniques used by companies and individual researchers who have access to such internet interactions, the insights we can derive from this research, and how valuable those insights can be.
I would be doing a terrible disservice to this book if I gave the impression that everything is dark and gloomy, or that this book is all about interactions on Google, Facebook and PornHub. The author himself takes pains to clarify that this is about “New Data”, not just the size of the data. Also included are many examples of non-internet data that can be just as insightful.
This book is not a sales pitch for “Big Data”, and that’s great. You will also learn the limitations of these new techniques, and how misinterpretations can happen. It’s all written in a very accessible way. No knowledge of economics or computer science is required. In fact, on a few occasions, I felt that the author was trying to simplify things too much. Just a quick note: it’s clear that the author’s own stance on many political issues leans to the left. I am fine with it, but some readers may not appreciate it.
It’s a wonderful book, very relevant to what’s happening today. It will remind you of the great book “Freakonomics”. If you liked that one, you will like this one too. The findings are sometimes what common sense would lead you to expect, but more often than not, they will be new insights. I highly recommend it.