
Everybody Lies
Big Data, New Data and What the Internet Can Tell Us About Who We Really Are
Categories
Business, Nonfiction, Psychology, Science, Economics, Politics, Technology, Audiobook, Sociology, Social Science
Content Type
Book
Binding
Hardcover
Year
2017
Publisher
Dey Street Books
Language
English
ASIN
0062390856
ISBN
0062390856
ISBN13
9780062390851
Everybody Lies Plot Summary
Introduction
In an age dominated by digital technology, our online behaviors have created vast troves of data that reveal truths we often hide from others and even ourselves. This digital universe of clicks, searches, and posts provides unprecedented insights into human nature, allowing us to see past what people say to understand what they actually think, feel, and do. The traditional methods of understanding human behavior (surveys, interviews, and self-reports) have long been limited by our tendency to present ourselves in the best possible light. Now, the unfiltered truth of our digital footprints offers a revolutionary lens into our authentic selves.
This exploration of big data illuminates the gap between our public personas and private realities. By analyzing patterns in Google searches, social media behavior, online shopping, and countless other digital interactions, we can observe human nature with remarkable clarity. The insights gained challenge conventional wisdom about politics, sexuality, prejudice, and personal aspirations. What emerges is not merely a collection of surprising facts but a fundamentally new understanding of who we are as individuals and societies. Through rigorous analysis and compelling examples, we'll discover how the honest signals in our digital lives reveal truths that traditional research methods could never uncover.
Chapter 1: Our Hidden Selves: How Digital Footprints Reveal More Than Surveys
For decades, social scientists have relied primarily on surveys to understand human behavior and attitudes. People are asked directly about their beliefs, habits, and experiences, and researchers draw conclusions from these self-reports. This approach has yielded valuable insights, but it suffers from a fundamental flaw: people lie. They lie to surveys, they lie to researchers, they lie to their friends and family, and they often lie to themselves. These lies aren't always malicious; they frequently stem from our desire to present ourselves in a socially acceptable light or to align with how we wish to see ourselves.
The digital revolution has given us unprecedented access to data that bypasses these filters of social desirability and self-deception. When people search on Google, post anonymously on forums, or click through websites when they believe no one is watching, they reveal truths they might never admit in a survey or interview. This "digital truth serum" provides a window into authentic human behavior that was previously inaccessible to researchers. For example, surveys suggest that only about 25% of men and 8% of women admit to watching pornography, yet pornography-related searches vastly outnumber searches for weather information or other common topics.
The honesty of digital data extends beyond embarrassing or taboo topics. People search for advice about problems they're facing, questions they're too afraid to ask others, and information they need in moments of crisis or curiosity. These searches create a remarkably detailed picture of human concerns, desires, and behaviors across different regions, demographics, and time periods. Unlike survey responses, which are carefully constructed and filtered, digital traces are created in moments of genuine need or interest, making them extraordinarily valuable for understanding true human behavior.
What makes this digital data particularly powerful is its scale and granularity. Traditional surveys might include thousands of respondents, but digital platforms collect data from millions or billions of interactions daily. This massive scale allows researchers to identify patterns that would be invisible in smaller datasets and to analyze behavior with unprecedented precision across geography, time, and countless other variables. The combination of honesty and scale makes digital data a revolutionary tool for understanding human nature.
A key advantage of digital data is that it captures behavior rather than intentions. Survey respondents might sincerely believe they will exercise more, save money, or vote in an upcoming election, but their subsequent actions often tell a different story. Digital data tracks what people actually do (the websites they visit, the products they purchase, the content they engage with), providing a more accurate picture of human behavior than self-reported intentions or retrospective accounts.
Chapter 2: The Four Powers of Big Data: New Sources, Honesty, Detail, and Experimentation
Big Data's transformative potential stems from four distinct powers that collectively revolutionize our understanding of human behavior. First, it offers entirely new sources of information that were previously inaccessible or nonexistent. Before the digital age, researchers could only analyze the limited data people consciously provided or that could be directly observed. Now, every click, search, purchase, and movement leaves digital traces that can be aggregated and analyzed. From smartphone location data to social media interactions to browsing histories, these new data sources capture aspects of life that traditional research methods couldn't reach.
The second power of Big Data is its remarkable honesty. When people believe they're anonymous or that their data is private, they behave with surprising authenticity. Google searches reveal questions and interests people would never admit to in public. Dating apps show actual romantic preferences that differ from stated ideals. Financial transactions document real spending priorities rather than aspirational budgets. This digital truth serum bypasses the social desirability bias that plagues surveys and interviews, allowing researchers to see past what people say to what they actually do, think, and want.
The third power lies in Big Data's extraordinary detail and granularity. Traditional datasets might capture broad demographic categories or occasional survey responses, but digital data can track behavior continuously across countless variables. Researchers can analyze patterns by location (down to specific neighborhoods), time (to the minute or second), individual characteristics, and innumerable other factors. This granularity enables "zooming in" on specific subgroups or contexts that would be impossible with conventional research methods. For example, researchers can study how behavior differs not just between states but between zip codes, or how patterns change not just seasonally but hourly.
The fourth power is Big Data's capacity for large-scale experimentation. Digital platforms routinely conduct thousands of A/B tests to optimize user experiences, creating vast natural experiments that reveal causal relationships. Companies like Google, Facebook, and Amazon run continuous tests to determine which designs, features, or algorithms produce desired outcomes. This experimental approach, combined with massive sample sizes, allows researchers to move beyond correlation to establish causation with unprecedented precision and confidence. The digital world has effectively become a giant laboratory where hypotheses can be tested quickly and at scale.
These four powers work synergistically to create a fundamentally new approach to understanding human behavior. Traditional research methods relied on small samples, self-reported data, limited variables, and artificial laboratory settings. Big Data offers massive samples of honest behavior, captured with extraordinary detail in natural settings, with robust experimental designs to establish causation. This combination allows researchers to ask and answer questions that were previously beyond reach, revealing patterns and insights that transform our understanding of human nature.
Chapter 3: Escaping Our Intuition: How Data Challenges What We Think We Know
Human intuition, shaped by personal experience and cultural narratives, often leads us astray when trying to understand complex social phenomena. We tend to generalize from limited observations, overestimate the prevalence of dramatic events, and confuse correlation with causation. Big data analysis repeatedly challenges these intuitive assumptions, revealing counterintuitive truths about human behavior that overturn conventional wisdom. For instance, while conventional wisdom suggests that anxiety is highest in overeducated urban areas, Google searches for anxiety-related terms are actually more common in rural areas with lower education levels and incomes.
Our intuitions about societal trends are similarly flawed. Many believe that the internet has increased political polarization by creating echo chambers where people only encounter views they already agree with. However, data analysis shows that internet users are more likely to encounter opposing political views online than offline. When researchers compared the political diversity of news sources people consume online versus their real-world social networks, they found that the internet actually exposes people to more ideological diversity than their neighborhoods, workplaces, or family gatherings. This finding directly contradicts the widely held belief that the internet is driving political segregation.
The gap between intuition and reality extends to our understanding of social influences on behavior. Conventional wisdom suggests that violent media inspires violent behavior, based on anecdotal evidence of crimes mimicking scenes from movies or video games. Yet when researchers analyzed comprehensive crime data alongside movie attendance records, they discovered something surprising: on weekends when violent movies were popular, violent crime rates actually decreased significantly. The explanation? Potential offenders were in theaters watching movies rather than drinking at bars or engaging in other high-risk activities that more commonly lead to violence.
Even deeply held beliefs about personal development and success can be contradicted by data analysis. Many assume that attending elite educational institutions provides substantial advantages beyond what less prestigious schools offer. However, when researchers compared earnings of students who were accepted to highly selective universities but chose to attend less selective ones, they found virtually no difference in long-term outcomes. For students with similar qualifications, attending an Ivy League school versus a good state university had minimal impact on future earnings or career success, suggesting that individual characteristics matter far more than institutional prestige.
Perhaps most powerfully, big data challenges our intuitions about our own motivations and behaviors. We construct narratives about why we make certain choices or hold particular beliefs, but our digital footprints often tell a different story. Search data reveals patterns of interest and concern that contradict our self-reported priorities. Location data shows discrepancies between where we say we spend our time and where we actually go. Purchasing data exposes gaps between our stated values and our actual consumption. These contradictions highlight how poorly we understand even our own behaviors and preferences, much less those of others or society as a whole.
By systematically challenging our intuitions with comprehensive data analysis, big data helps us escape the limitations of personal experience and conventional wisdom. It reveals a world that is often more complex, counterintuitive, and surprising than the one we think we inhabit, forcing us to question assumptions and reconsider established beliefs about human nature and social dynamics.
Chapter 4: Zooming In: Finding Patterns in Local, Temporal, and Personal Data
The extraordinary volume of digital data allows researchers to "zoom in" on specific places, times, and individuals with unprecedented precision. This capability transforms our understanding of geographic variations in behavior and attitudes. For example, when economists analyzed tax records from every American, they discovered that economic mobility (the chance for children from poor families to reach the upper middle class) varies dramatically across the United States. In some regions, like San Jose, California, poor children have more than twice the chance of becoming wealthy adults compared to children from similarly poor families in Charlotte, North Carolina. This geographic variation exists even within single cities, with neighborhoods just miles apart showing vastly different outcomes.
This zooming capability extends to temporal patterns as well, revealing how behavior changes across minutes, hours, days, and seasons. Researchers can track how Google searches shift throughout the day: searches for suicide peak at 12:36 AM and reach their lowest point around 9 AM; searches for porn follow distinct daily patterns; searches for weather information spike before 5:30 AM as people prepare for their day. These temporal patterns offer insights into rhythms of human life that were previously invisible. By analyzing how behaviors fluctuate across time, researchers can identify when people are most vulnerable, most productive, or most receptive to different types of information or interventions.
The ability to zoom in also applies to individual-level data, enabling what researchers call "doppelgänger searches." By analyzing vast datasets, algorithms can identify people who closely match a specific individual across numerous variables: not just basic demographics but detailed behavioral patterns, preferences, and circumstances. This approach has transformed fields from medicine to marketing. Netflix, Amazon, and other tech companies use it to predict with remarkable accuracy what products or content individuals will enjoy based on the preferences of similar users. In healthcare, researchers are developing systems to find patients with nearly identical medical profiles to better predict disease progression and treatment outcomes for specific individuals.
This granular approach reveals how factors like childhood experiences shape adult behaviors and preferences. When researchers analyzed Facebook data on baseball team preferences, they discovered that men are most likely to become lifelong fans of teams that were successful when they were around eight years old. Similar patterns emerge for political affiliations, with political views often crystallizing based on which president was popular when someone was eighteen. These findings demonstrate how zooming in on specific life stages can reveal critical windows when preferences and identities form.
The power of zooming in is perhaps most evident in the study of rare or hidden phenomena that would be invisible in broader analyses. Traditional research methods struggle to capture uncommon experiences or stigmatized behaviors that affect only small percentages of the population. Big data allows researchers to identify these patterns by analyzing massive datasets where even rare occurrences appear in meaningful numbers. This capability has proven particularly valuable for understanding stigmatized health conditions, unusual consumer preferences, and behaviors that people are reluctant to report in surveys or interviews.
By zooming in across geography, time, and individual characteristics, big data reveals the extraordinary diversity and complexity of human behavior. Patterns that appear uniform at a national or global level often dissolve into intricate variations when examined more closely. This granularity challenges simplistic narratives and one-size-fits-all approaches to social issues, highlighting the need for nuanced, contextual understanding of human behavior and experiences.
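The "doppelgänger search" idea from this chapter can be illustrated with a small nearest-neighbor sketch. This is only one plausible way to do it, not the method any company named above is known to use: the profiles, the feature choices, and the function names `standardize` and `find_doppelgangers` are all hypothetical, and real systems work with far more variables and people.

```python
import math

def standardize(rows):
    """Rescale each column to mean 0 and standard deviation 1."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 when a column is constant, to avoid dividing by zero.
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]

def find_doppelgangers(profiles, target_name, k=2):
    """Return the k names whose standardized profiles are closest to the target."""
    names = list(profiles)
    scaled = dict(zip(names, standardize([profiles[n] for n in names])))
    target = scaled[target_name]
    others = [n for n in names if n != target_name]
    # Euclidean distance in standardized feature space.
    others.sort(key=lambda n: math.dist(scaled[n], target))
    return others[:k]

# Hypothetical profiles: (age, hours of TV per week, books read per year).
profiles = {
    "ana": (34, 10, 12),
    "ben": (35, 11, 10),
    "cy": (62, 30, 2),
    "dee": (32, 8, 15),
    "eli": (19, 25, 1),
}

print(find_doppelgangers(profiles, "ana"))
```

Standardizing first keeps a wide-ranging feature such as age from dominating the distance. With only five people the matches are crude; the chapter's point is that very large datasets make it likely that close matches exist at all.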
Chapter 5: The Digital Laboratory: How All the World Became an Experiment
The internet has transformed from a mere information repository into the world's largest laboratory for human behavior. Through A/B testing, in which users are randomly assigned to different versions of websites, apps, or content, digital platforms continuously run experiments that reveal causal relationships between design choices and human responses. What began as a tool for optimizing website performance has evolved into a methodology that generates profound insights about decision-making, attention, persuasion, and countless other aspects of human psychology. Companies now run thousands of these experiments daily, vastly outpacing the experimental output of traditional academic research.
The scale of these digital experiments is unprecedented in human history. Facebook conducts over a thousand A/B tests every day, meaning a small team of engineers at this single company starts more randomized controlled experiments in a typical day than the entire pharmaceutical industry initiates in a year. These experiments involve millions of participants and can detect effects so subtle they would be invisible in traditional laboratory studies. When Google tested 41 slightly different shades of blue for its advertising links, it could measure differences in user response to color changes that are barely perceptible to the naked eye but that translated into millions of dollars in revenue.
Beyond corporate applications, this experimental approach has revolutionized our understanding of social and political dynamics. During the 2008 and 2012 presidential campaigns, Barack Obama's team conducted extensive A/B tests on email messages, donation pages, and website designs. These experiments revealed that seemingly minor changes (using a different photo, adjusting button text from "Sign Up" to "Learn More," or altering email subject lines) could dramatically impact engagement and donations. One winning combination increased sign-ups by 40%, generating an estimated $60 million in additional campaign funding. These findings challenge conventional wisdom about political communication and demonstrate the power of evidence-based approaches to persuasion.
The digital laboratory extends beyond intentional experiments to include "natural experiments" created by circumstance. Researchers now analyze situations where random or arbitrary factors create effectively randomized conditions in real-world settings. For example, economists studied how the arbitrary timing of tax rebates affected spending, how random roommate assignments influenced political beliefs, and how arbitrary school admission cutoffs impacted long-term educational outcomes. These natural experiments provide causal evidence about human behavior in contexts where deliberate experimentation would be impractical or unethical.
This experimental revolution has profound implications for how we understand causality in human affairs. Traditional social science relied heavily on observational studies that could establish correlation but struggled to prove causation. The digital laboratory allows researchers to test causal hypotheses at scale and in natural settings. For instance, studies of how violent movies affect crime rates moved beyond correlation to establish that popular violent films actually reduce crime by keeping potential offenders occupied during high-risk evening hours, a counterintuitive finding that only became clear through careful analysis of a natural experiment.
The transformation of the world into a laboratory raises important ethical questions about consent, privacy, and manipulation. Most digital experiments occur without explicit participant knowledge or consent, raising concerns about autonomy and transparency. The power to influence behavior through optimized design creates potential for both beneficial interventions and harmful manipulation. As experimental techniques become more sophisticated and widespread, society faces critical questions about appropriate limits and governance for this powerful new approach to understanding and shaping human behavior.
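To make the A/B testing described in this chapter concrete, here is a minimal sketch of how the result of a single test might be evaluated with a standard two-proportion z-test. The visitor and sign-up counts are invented for illustration (chosen so the lift comes out at 40%, echoing the figure quoted above); they are not data from any campaign or company, and the function name is hypothetical.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (relative lift of B over A, z statistic, two-sided p-value)."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled conversion rate under the null hypothesis of no difference.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return (p_b - p_a) / p_a, z, p_value

# Hypothetical counts: version A is the original page, version B the variant.
lift, z, p = two_proportion_z_test(conv_a=800, n_a=10_000,
                                   conv_b=1_120, n_b=10_000)
print(f"relative lift: {lift:.0%}, z = {z:.2f}, p = {p:.2g}")
```

Because visitors are assigned to A or B at random, a small p-value here supports a causal reading: the page change, rather than some difference between the two groups, is the likely reason for the extra sign-ups.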
Chapter 6: Limitations and Ethics: Big Data's Blind Spots and Responsibilities
Despite its remarkable capabilities, big data analysis faces significant limitations that can lead to misleading conclusions if not carefully addressed. The "curse of dimensionality" represents a fundamental challenge: with extremely large datasets containing countless variables, researchers can always find patterns that appear meaningful but actually result from random chance. This problem intensifies as datasets grow larger and more complex. When analyzing billions of data points across thousands of variables, even completely random data will produce apparent correlations and patterns. Researchers must apply rigorous statistical methods and out-of-sample testing to distinguish genuine insights from statistical artifacts.
Another crucial limitation is what we might call the "measurement trap": the tendency to focus exclusively on what is easily measurable while ignoring equally important factors that resist quantification. Digital data captures certain behaviors with extraordinary precision: clicks, purchases, location changes, and text inputs. However, it often misses critical aspects of human experience like emotions, motivations, meanings, and values. Facebook can track exactly how many times users click "like," but struggles to measure whether those interactions actually make users feel connected or satisfied. Google can count searches for depression symptoms but cannot directly measure psychological well-being. This measurement bias can distort priorities and conclusions.
Big data also raises profound ethical concerns about privacy, consent, and potential harm. Most digital data is collected incidentally as people navigate online spaces, often without full awareness of how extensively they're being tracked or how their data might be used. The aggregation of previously separate data streams creates increasingly comprehensive profiles of individuals, potentially enabling surveillance and manipulation that users never anticipated or consented to. Even anonymized data can often be re-identified when combined with other information, undermining privacy protections and exposing vulnerable individuals to potential harm.
The power of predictive algorithms raises additional ethical questions about discrimination and autonomy. When algorithms predict who will default on loans, commit crimes, succeed in jobs, or develop health problems, these predictions can perpetuate existing biases and inequalities. If lending algorithms disproportionately deny loans to minority applicants based on correlations in historical data, they may entrench discriminatory patterns while appearing objective and data-driven. Similarly, when algorithms personalize content and experiences based on past behavior, they may limit exposure to new ideas and opportunities, potentially reducing autonomy and reinforcing existing preferences and beliefs.
Corporate concentration of data power creates further ethical challenges. A small number of technology companies now control unprecedented amounts of behavioral data, giving them extraordinary influence over both markets and social dynamics. These companies can use this data to maximize engagement, even when doing so exploits psychological vulnerabilities or promotes harmful content. The profit incentives driving data collection and analysis may not align with individual or societal well-being, creating tensions between commercial interests and ethical responsibilities.
Addressing these limitations and ethical concerns requires both technical solutions and social governance. Technically, researchers must develop more sophisticated methods for avoiding false patterns, integrating qualitative insights with quantitative data, and protecting privacy through techniques like differential privacy and federated learning. Socially, we need robust governance frameworks that ensure transparency, accountability, and ethical use of big data. This includes clear consent mechanisms, restrictions on certain applications, auditing requirements for algorithms, and public oversight of data practices that affect fundamental rights and opportunities.
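The "curse of dimensionality" discussed at the start of this chapter, and the out-of-sample testing recommended against it, can be demonstrated with a short simulation. Everything below is synthetic noise generated for illustration; the sample sizes, the seed, and the helper names `pearson` and `strongest_predictor` are assumptions of this sketch, not anything taken from the book.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

def strongest_predictor(target, candidates):
    """Index and absolute correlation of the candidate best matching target."""
    scores = [abs(pearson(c, target)) for c in candidates]
    best = max(range(len(candidates)), key=scores.__getitem__)
    return best, scores[best]

rng = random.Random(0)
n_obs, n_vars = 40, 1000

# Pure noise: by construction no candidate has any real link to the target.
target = [rng.gauss(0, 1) for _ in range(n_obs)]
candidates = [[rng.gauss(0, 1) for _ in range(n_obs)] for _ in range(n_vars)]

# "Discover" the best predictor on the first half of the observations...
half = n_obs // 2
best, in_sample = strongest_predictor(target[:half],
                                      [c[:half] for c in candidates])
# ...then check whether the relationship holds on the held-out second half.
out_of_sample = abs(pearson(candidates[best][half:], target[half:]))

print(f"best of {n_vars} noise variables, in-sample |r| = {in_sample:.2f}")
print(f"same variable, out-of-sample |r| = {out_of_sample:.2f}")
```

Searching a thousand random variables almost always turns up one that looks strongly related to the target in the data used for the search. On fresh data the same variable usually shows a much weaker correlation, which is why a pattern should be confirmed on observations that played no part in finding it.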
Chapter 7: Making Sense of Truth: Integrating Data with Human Wisdom
The big data revolution has given us unprecedented ability to see patterns in human behavior, but data alone cannot provide meaning or wisdom. The most powerful applications of big data combine computational analysis with human judgment, contextual understanding, and ethical reasoning. This integration transforms raw data into meaningful insights that can guide better decisions and deeper understanding. For example, when Google search data revealed unexpected patterns in flu symptoms across regions, medical experts were essential for interpreting these patterns, identifying potential biases in the data, and translating statistical correlations into actionable public health recommendations.
Context is crucial for understanding what digital behavior actually means. The same search query can reflect radically different intentions depending on who performs it and under what circumstances. A teenager searching for information about depression might be researching a school assignment, helping a friend, or experiencing symptoms themselves. A search for a politician's name might indicate support, opposition, or simple curiosity. Human expertise in relevant domains (psychology, political science, medicine, or other fields) provides the contextual knowledge needed to interpret data patterns accurately and avoid simplistic or misleading conclusions.
The most effective approaches combine multiple data sources with different strengths and limitations. Digital data excels at scale, precision, and capturing behavior that people won't report, but it often lacks depth and context. Traditional research methods like surveys, interviews, and ethnographic observation provide richness and meaning but suffer from social desirability bias and limited scope. By triangulating across these different approaches, researchers can develop more comprehensive and nuanced understandings of human behavior. Facebook now complements its massive behavioral datasets with targeted surveys to understand the meaning and impact of user interactions, recognizing that clicks alone cannot reveal how people experience their platform.
Ethical judgment remains irreplaceable in deciding how to apply big data insights. Data can tell us what is happening and even predict what might happen next, but it cannot tell us what should happen. Questions about privacy boundaries, fair use of predictive algorithms, appropriate constraints on manipulation, and equitable distribution of data's benefits require ethical reasoning that extends beyond statistical analysis. Human values and democratic processes must guide decisions about when and how to use big data's revelatory power, especially when it involves vulnerable populations or consequential decisions about individuals' opportunities and rights.
The future of big data lies not in replacing human judgment but in augmenting it with unprecedented empirical insights. The most promising applications combine computational power with human creativity, empathy, and wisdom. Doctors use algorithm-generated predictions alongside clinical judgment to make better diagnoses. Policy makers combine geographic analyses of social outcomes with on-the-ground knowledge to design more effective interventions. Educators use learning analytics alongside personal relationships with students to provide better support. In each case, data provides evidence that informs human decisions rather than replacing human judgment entirely.
Summary
The digital revolution has fundamentally transformed our ability to understand human behavior by revealing truths that traditional research methods could never access. By analyzing the vast digital footprints we leave as we navigate online spaces, researchers can now see past the filters of social desirability and self-presentation to observe authentic patterns of thought, desire, and action. This digital truth serum exposes the gap between what people say and what they actually do, think, and want, challenging conventional wisdom and illuminating aspects of human nature that previously remained hidden from view. The power of this approach lies not merely in the volume of data but in its unique qualities: its honesty, granularity, diversity, and experimental potential. Unlike surveys or interviews, digital data captures behavior in natural contexts without the distorting effects of social pressure or self-deception. It reveals patterns across geography, time, and individual characteristics with unprecedented precision, allowing researchers to zoom in on specific populations, moments, and behaviors that would be invisible in broader analyses. By combining these insights with human judgment, contextual understanding, and ethical reasoning, we can develop a richer, more nuanced understanding of ourselves and our societies. As we navigate the opportunities and challenges of this new empirical landscape, we must balance the pursuit of truth with respect for privacy, fairness, and human autonomy—using data's revelatory power not just to predict behavior but to improve lives and deepen our understanding of what it means to be human.
Best Quote
“The next Freud will be a data scientist. The next Marx will be a data scientist. The next Salk might very well be a data scientist.” ― Seth Stephens-Davidowitz, Everybody Lies: Big Data, New Data, and What the Internet Can Tell Us About Who We Really Are
Review Summary
Strengths: The book's last section, which explains the limitations of big data, is noted as the most grounded part. It provides a more balanced perspective on the use of big data compared to the earlier sections.
Weaknesses: The book attempts to emulate "Freakonomics" but falls short, with the first two parts filled with random examples that seem mostly pointless. Many assumptions are made from data without a solid scientific basis. The inclusion of unprofessional jokes, which require footnotes to clarify their intent, detracts from the content. The repetition of examples, particularly one concerning women's concerns about vaginal odor, is seen as unnecessary. The book's structure could have benefited from integrating the insights of the last section into the earlier parts to provide a more balanced view.
Overall Sentiment: The sentiment expressed in the review is largely negative, with criticism directed at the book's methodology, style, and content choices.
Key Takeaway: The book struggles to effectively emulate "Freakonomics," with its strengths overshadowed by poor scientific grounding, unnecessary repetition, and unprofessional humor.
Everybody Lies
By Seth Stephens-Davidowitz