
Super Crunchers

Why Thinking-by-Numbers Is the New Way to Be Smart

3.7 (5,963 ratings)
24-minute read | Text | 9 key ideas
In a world where intuition once reigned supreme, numbers now hold the key to the future. Welcome to the era of Super Crunchers, where algorithms and data analytics redefine decision-making. Economist Ian Ayres takes you on a journey through this transformative landscape, revealing how massive databases wield unprecedented power over our daily lives. From predicting your ideal partner to outsmarting wine connoisseurs, these digital wizards are reshaping everything from healthcare to boardrooms. But with great power comes great responsibility. Ayres uncovers the triumphs and pitfalls of this data-driven age, posing provocative questions about control and influence. Who really benefits when equations challenge expertise? Arm yourself with the knowledge of the Super Crunchers before making your next big decision—because in this brave new world, numbers rule all.

Categories

Business, Nonfiction, Psychology, Finance, Science, Economics, Technology, Audiobook, Management, Mathematics

Content Type

Book

Binding

Hardcover

Year

2007

Publisher

Bantam

Language

English

ASIN

0553805401

ISBN

0553805401

ISBN13

9780553805406

File Download

PDF | EPUB

Super Crunchers Summary

Introduction

We live in an age where numbers speak louder than expertise. Traditional experts who rely on intuition and experience are increasingly losing ground to statistical analysis and data-driven decision making. This fundamental shift represents one of the most significant transformations in how decisions are made across virtually every domain—from medicine and education to business and government. What makes this revolution particularly powerful is not just the accuracy of statistical predictions, but the speed and scale at which they can now be implemented. The rise of what we might call "Super Crunching" combines massive datasets, sophisticated regression techniques, and randomized trials to make predictions and establish causality with unprecedented precision. This approach doesn't merely supplement traditional decision-making—it often replaces it entirely. While some may view this trend with skepticism or even alarm, understanding the mechanics behind these statistical methods and their applications provides insight into a new way of thinking that has already transformed industries and will continue to reshape our world. The coming chapters will explore how these techniques work, why they're superior in many contexts, and the profound implications they have for expertise, privacy, and human judgment in the 21st century.

Chapter 1: The Rise of Statistical Analysis in Modern Decision-Making

Statistical decision-making has silently infiltrated nearly every aspect of our lives. When you call a customer service line, algorithms have already predicted why you're calling before a human representative even picks up the phone. When you apply for a loan, statistical models—not loan officers—determine your creditworthiness. Even the movies and music recommended to you result from complex algorithms analyzing vast datasets of consumer preferences. At the heart of this revolution is regression analysis, a statistical technique that examines relationships between variables to make predictions. Unlike human experts who might rely on gut feelings or personal experience, regression models use historical data to identify patterns and correlations. The wine industry provides a telling example. Princeton economist Orley Ashenfelter developed a simple formula based on growing season temperatures and rainfall amounts to predict the quality of Bordeaux wines years before they could be tasted. Traditional wine critics were outraged, arguing that only experienced palates could properly evaluate wine. Yet Ashenfelter's predictions proved remarkably accurate, often outperforming the experts. The same pattern has emerged in baseball, where data analysts like Bill James challenged the conventional wisdom of scouts who claimed they could spot talent through observation. Teams that embraced statistical analysis, like the Oakland Athletics (immortalized in Michael Lewis's "Moneyball"), achieved remarkable success despite limited budgets. The numbers showed that traditional metrics like batting average were less predictive of scoring runs than overlooked statistics such as on-base percentage. What makes this approach revolutionary is not just its accuracy but its scalability. Once a statistical model is developed, it can be applied millions of times at virtually no additional cost. Dating websites like eHarmony don't rely on matchmakers' intuition but on statistical models that predict compatibility based on answers to hundreds of questions. Companies like Capital One conduct thousands of experiments yearly, testing everything from interest rates to marketing language, measuring responses down to the decimal point. This transformation isn't limited to private industry. Government agencies increasingly rely on statistical analysis to determine which policies work and which don't. In medicine, evidence-based approaches are challenging the authority of experienced physicians. Even the Supreme Court's decisions have been predicted with surprising accuracy using simple statistical models. The rise of statistical analysis doesn't just change how decisions are made—it changes who makes them. Power shifts from those with experience to those who can analyze data. While expertise and intuition still have value, they increasingly serve as inputs to statistical models rather than final arbiters of decisions.
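To make the wine example concrete, here is a minimal sketch of the kind of weather-based regression Ashenfelter ran: fit past vintages' prices against growing-season weather, then score an untasted vintage from its weather alone. The data and coefficients below are invented for illustration; they are not his published Bordeaux equation.

```python
import numpy as np

# Illustrative (made-up) data: one row per past vintage.
# Columns: growing-season temperature (°C), harvest rainfall (mm), winter rainfall (mm)
X = np.array([
    [17.1,  60, 600],
    [16.4, 120, 690],
    [17.8,  38, 502],
    [16.9,  80, 420],
    [17.5,  45, 580],
    [16.2, 140, 710],
])
# Log of relative auction price for each vintage (made-up numbers)
y = np.array([-0.10, -0.95, 0.40, -0.35, 0.15, -1.20])

# Add an intercept column and fit ordinary least squares.
A = np.column_stack([np.ones(len(X)), X])
coefs, *_ = np.linalg.lstsq(A, y, rcond=None)

intercept, b_temp, b_harvest_rain, b_winter_rain = coefs
print(f"quality ≈ {intercept:.2f} + {b_temp:.3f}*temp "
      f"+ {b_harvest_rain:.4f}*harvest_rain + {b_winter_rain:.4f}*winter_rain")

# Score an untasted vintage from its weather alone.
new_vintage = np.array([1, 17.3, 55, 640])
print("predicted log price:", float(new_vintage @ coefs))
```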

Chapter 2: Regression Analysis: Finding Patterns in Data Chaos

Regression analysis stands as the workhorse of statistical prediction, transforming seemingly chaotic data into actionable insights. At its core, regression identifies relationships between variables, allowing us to predict outcomes based on historical patterns. Unlike the human mind, which tends to overweight dramatic events and underweight common occurrences, regression assigns precise numerical weights to each factor, creating a formula that can be applied to new situations. Consider how Harrah's casino uses regression analysis to identify each customer's "pain point"—the amount of gambling losses a customer can sustain before the experience becomes unpleasant enough to drive them away. By analyzing factors like age, income level, and past gambling behavior, Harrah's creates a personalized profile for each customer. When a valuable patron approaches their pain point, a "luck ambassador" intervenes with complimentary services to transform a potentially negative experience into a positive one. This precise targeting helps maintain customer loyalty while maximizing profits. The insurance industry demonstrates another powerful application of regression analysis. Traditionally, insurance categories were broad—young male drivers paid high premiums because, as a group, they had more accidents. Today, companies use regression analysis to examine hundreds of variables, from credit scores to purchasing habits, creating highly specific risk profiles. Progressive Insurance can now offer competitive rates to traditionally "high-risk" customers who demonstrate low-risk behaviors in other aspects of their lives. The result is a more efficient market that benefits both careful consumers and insurance companies. What makes regression analysis particularly powerful is its ability to quantify uncertainty. Unlike human experts who rarely acknowledge the limits of their knowledge, regression models explicitly state their confidence levels. When Farecast.com predicts whether airline prices will rise or fall, it also tells you the probability that its prediction is correct. This transparency about uncertainty represents a fundamental shift from traditional expertise, which often presents judgments with unwarranted certainty. Regression techniques have also evolved beyond simple linear relationships. Neural networks, which mimic the human brain's learning process, can identify complex patterns in data that traditional regression might miss. A company called Epagogix uses neural networks to analyze movie scripts and predict box office performance with surprising accuracy, identifying elements that contribute to financial success that even experienced producers might overlook. The democratization of regression tools means that virtually any organization can now harness this power. Small businesses can predict customer churn, non-profits can identify effective fundraising strategies, and government agencies can target services to those most likely to benefit. While building effective models requires skill, the barriers to entry have fallen dramatically, accelerating the spread of data-driven decision making across society.
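The chapter's point about quantified uncertainty can be shown with a short sketch: a regression that reports an interval around its prediction rather than a bare number. The data are synthetic, and the variable names (customer tenure, spend, a churn-risk score) are assumptions for illustration, not from the book.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)

# Synthetic data: customer tenure (months) and monthly spend ($) vs. a churn-risk score.
n = 200
tenure = rng.uniform(1, 60, n)
spend = rng.uniform(10, 200, n)
risk = 0.8 - 0.01 * tenure - 0.002 * spend + rng.normal(0, 0.05, n)

X = sm.add_constant(np.column_stack([tenure, spend]))
model = sm.OLS(risk, X).fit()

# Predict for a new customer and report a 95% interval, not just a point estimate.
x_new = sm.add_constant(np.array([[12.0, 80.0]]), has_constant="add")
pred = model.get_prediction(x_new).summary_frame(alpha=0.05)
print(pred[["mean", "obs_ci_lower", "obs_ci_upper"]])
```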

Chapter 3: Randomized Testing: Creating Data through Experimentation

While regression analysis extracts insights from existing data, randomized testing generates new data through controlled experiments. This approach, borrowed from medical research, has become a cornerstone of modern decision-making. The process is deceptively simple: randomly divide subjects into groups, expose each group to different conditions, and measure the differences in outcomes. This method cuts through correlation-causation confusion by creating statistically identical groups where the only difference is the treatment being tested. Credit card giant Capital One pioneered this approach in the business world, running over 28,000 experiments in a single year. Rather than rely on marketing executives' intuition about what appeals to consumers, Capital One tests everything—from interest rates to envelope designs to the exact wording of offers. For instance, in one early experiment, Capital One randomly divided 600,000 potential customers into six groups, each receiving different promotional interest rates. The results showed that offering 4.9% for six months was significantly more profitable than offering 7.9% for twelve months—contrary to what many marketing experts had assumed. The internet has dramatically expanded the possibilities for randomized testing. Companies like Offermatica help businesses conduct real-time experiments on their websites. When visitors arrive at a site, some see one version of a page while others see alternative designs. The software automatically tracks which version generates more sales, sign-ups, or other desired actions. Monster.com used this approach to test 128 different variations of their employer home page, discovering that small changes like altering button text from "Search Resumes" to "Search and Buy Resumes" significantly increased revenue. The winning design generated 8.31% more revenue per visitor—translating into tens of millions of additional dollars annually. Even more surprising was Jo-Ann Fabrics' experience with randomized testing. Among their test promotions was an offer they considered unlikely to succeed: "Buy two sewing machines and save 10%." After all, how many people need two sewing machines? Yet this promotion outperformed all others because customers recruited friends to share the discount, effectively turning customers into sales agents. Without randomized testing, this counterintuitive insight would never have been discovered. The power of randomized testing extends beyond marketing. Continental Airlines randomly assigned passengers who had experienced canceled flights or delays to receive either nothing, an apology letter, or an apology plus compensation. The results were striking—passengers who received any form of apology spent 8% more on Continental tickets in the following year, generating millions in additional revenue. The experiment transformed a customer service decision from a matter of opinion to a data-driven strategy with measurable results. Governments are also embracing randomized testing to evaluate policy effectiveness. Programs like job search assistance for unemployed workers, conditional cash transfers for poverty reduction, and educational interventions have all been subjected to randomized trials. This approach has the advantage of providing clear, credible evidence about what works, often challenging long-held assumptions and helping direct limited resources to their most effective use.
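A minimal sketch of the randomized-test logic described here, assuming a hypothetical mailing split between two offers: assign prospects at random, observe responses, and check whether the difference in response rates is larger than chance alone would explain. The response rates are invented for illustration.

```python
import numpy as np
from statsmodels.stats.proportion import proportions_ztest

rng = np.random.default_rng(42)

# Hypothetical mailing list, randomly split into two statistically identical groups.
n = 100_000
assignment = rng.integers(0, 2, n)                 # 0 = offer A, 1 = offer B

# Simulated responses; in a real test these would be observed sign-ups.
# The underlying rates are assumptions made up for this sketch.
true_rate = np.where(assignment == 1, 0.024, 0.021)
responded = rng.random(n) < true_rate

counts = np.array([responded[assignment == g].sum() for g in (0, 1)])
nobs = np.array([(assignment == g).sum() for g in (0, 1)])

z, p = proportions_ztest(counts, nobs)
print("response rates:", counts / nobs)
print(f"z = {z:.2f}, p = {p:.4f}")                 # small p -> difference unlikely to be chance
```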

Chapter 4: Super Crunchers vs. Traditional Experts: The Power Shift

The confrontation between statistical prediction and expert judgment represents a fundamental power struggle across professions. In field after field, statistical algorithms—even relatively simple ones—consistently outperform human experts in predictive accuracy. This pattern, first documented by psychologist Paul Meehl in 1954, has been confirmed in hundreds of subsequent studies covering diverse domains from medical diagnosis to stock picking to parole decisions. The superiority of statistical prediction stems from several human cognitive limitations. Experts tend to overweight dramatic events and recent experiences while underweighting mundane but statistically significant factors. They struggle to maintain consistency, applying different standards to similar situations based on fatigue, mood, or other irrelevant factors. Perhaps most critically, experts demonstrate persistent overconfidence, believing they know more than they actually do and failing to update their beliefs when presented with contradictory evidence. Consider a study by economist Justin Wolfers examining college basketball betting. Analysis of over 44,000 games revealed that when point spreads exceeded twelve points, favored teams covered the spread only 47% of the time rather than the expected 50%. This statistical anomaly suggested that players on heavily favored teams occasionally engaged in "point shaving"—deliberately keeping games closer than they should be. No basketball expert had detected this pattern, but statistical analysis made it visible. The medical profession provides another striking example of this power shift. Traditional diagnostics relied heavily on physicians' experience and pattern recognition abilities. However, diagnostic software like Isabel, which searches a database of over 11,000 diseases and their associated symptoms, now regularly identifies conditions that experienced doctors miss. In one case, a senior oncologist nearly administered chemotherapy to a child diagnosed with leukemia until Isabel flagged a rare variant that required different treatment—potentially saving the child's life. This transition hasn't come without resistance. Wine critics dismissed Orley Ashenfelter's statistical predictions as "silliness." Baseball scouts ridiculed the statistical approach as ignoring the intangibles of athletic performance. Hollywood executives resist algorithms that predict box office success, claiming that filmmaking is an art that cannot be reduced to numbers. These reactions reflect not just philosophical disagreements but the threat to professional status and identity that statistical prediction represents. Yet the evidence increasingly suggests that the optimal approach combines statistical prediction with selective human judgment. The most effective arrangement isn't giving statistics to experts for consideration—in such cases, experts typically improve but still underperform the statistical models alone. Rather, the best results come from using human expertise to identify relevant variables and unusual circumstances, while allowing the statistical models to determine the appropriate weights and make the final predictions. This new division of labor fundamentally reshapes professional roles. Radiologists increasingly focus on identifying abnormalities in images rather than diagnosing their significance. Loan officers become data entry specialists rather than decision makers. Coaches implement statistically validated strategies rather than relying on intuition. 
While this transition can be painful for established professionals, it ultimately leads to better outcomes for organizations and society.
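A back-of-the-envelope version of the point-shaving check is easy to reproduce. The 47% cover rate comes from the text; the size of the heavily favored subgroup below is an assumed number, used only to show how far such a rate sits from the 50% expected by chance.

```python
import math

# Assumed subgroup size: suppose 6,000 of the analyzed games had spreads above twelve points.
n = 6_000
covered = int(0.47 * n)                    # 47% cover rate, as reported in the text

p0 = 0.5                                   # expected cover rate with no point shaving
p_hat = covered / n
se = math.sqrt(p0 * (1 - p0) / n)          # standard error under the null hypothesis
z = (p_hat - p0) / se

print(f"observed cover rate: {p_hat:.3f}")
print(f"z-statistic: {z:.1f}")             # |z| well beyond 2 -> very unlikely to be chance
```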

Chapter 5: Technological Drivers: Why Super Crunching Is Happening Now

The techniques underlying data-driven decision making aren't new—regression analysis dates back to the 19th century, and randomized experiments have been the gold standard in medicine since the 1940s. What has changed is our technological capacity to capture, store, analyze, and implement statistical insights at unprecedented scale and speed. Several converging technological developments explain why Super Crunching has recently transformed from academic theory to practical reality. First, digitalization has created an explosion of available data. Nearly every transaction, from grocery purchases to medical visits to online browsing, now generates digital records. Companies like Acxiom maintain consumer information on virtually every American household, managing over 20 billion customer records spanning 850 terabytes of raw data. Meanwhile, public records that once gathered dust in government filing cabinets are increasingly accessible online. The convergence of these public and private data sources creates unprecedented opportunities for analysis. Second, storage costs have plummeted following Kryder's Law—the storage capacity of hard drives doubles roughly every two years. In the early days of computing, storage limitations meant deleting old data to make room for new information. Today, a terabyte of storage costs less than $400, making it economically feasible to maintain massive historical datasets. Yahoo! records over twelve terabytes of data daily—equivalent to more than half the information contained in the Library of Congress—and this amount no longer requires acres of servers or billions of dollars. Third, database technology has evolved to allow seamless integration of previously isolated information. Traditional databases were siloed, making it difficult to connect related information across systems. Modern database architecture, combined with standardized formats and sophisticated matching algorithms, now enables organizations to merge diverse datasets. "Data mashups" combine information from multiple sources—like housing prices, neighborhood demographics, and school performance—to reveal insights that would be invisible when examining each dataset in isolation. Fourth, computational power has increased exponentially, following Moore's Law. Regression analyses that once required expensive mainframe time can now be performed on ordinary laptops. More importantly, algorithmic innovations like neural networks can identify subtle patterns in data that traditional regression techniques might miss. These advanced methods, once restricted to academic research, are now widely available through commercial software packages. Finally, implementation technologies have dramatically reduced the gap between insight and action. In earlier eras, statistical findings might sit in research journals for years before affecting practice. Today, companies like Offermatica can analyze customer responses in real-time and automatically adjust web page designs to optimize performance. Capital One can instantaneously route customer service calls based on statistical predictions of customer needs and value. These technological developments have created a virtuous cycle. As organizations implement data-driven decisions, they generate more data, enabling even more sophisticated analysis. The costs of experimentation have fallen so dramatically that companies can test dozens or hundreds of variations simultaneously, learning and adapting at unprecedented speed. 
Meanwhile, the competitive advantages gained by early adopters force others to follow suit or risk obsolescence. The result is a fundamental restructuring of decision-making across industries and institutions. Traditional expertise increasingly serves as input to statistical models rather than as the final arbiter of decisions. While this transition creates winners and losers, the aggregate impact has been to make organizations more responsive, efficient, and effective.

Chapter 6: The Dark Side: Privacy Concerns and Statistical Manipulation

While Super Crunching offers unprecedented insights and efficiencies, it also creates significant risks to privacy, fairness, and autonomy. As data collection becomes ubiquitous and algorithms increasingly determine critical aspects of our lives, we must confront the potential dark side of the statistical revolution. Privacy concerns stand at the forefront of these challenges. Data aggregators like ChoicePoint and Acxiom compile comprehensive profiles on virtually every American, combining public records with consumer transaction data. These profiles include not just basic demographic information but detailed records of purchases, web browsing, travel patterns, and social connections. While individual pieces of this information might seem innocuous, their combination can reveal intimate details about health conditions, financial status, political beliefs, and personal relationships. When a laptop containing personal information on 196,000 HP employees was stolen from a Fidelity Investments employee's home, it highlighted the vulnerability of these massive databases. More troubling is the erosion of what might be called "probabilistic privacy." Statistical analysis can now predict behaviors and characteristics that individuals might wish to keep private. Credit card companies can predict with unsettling accuracy which customers are likely to divorce based on spending patterns. Insurance companies use statistical models to identify customers likely to develop serious health conditions before any symptoms appear. These predictions, while imperfect, increasingly shape how organizations interact with individuals—often without their knowledge or consent. Statistical discrimination represents another concerning application of Super Crunching. While algorithms don't harbor prejudice, they can perpetuate existing biases when trained on historical data reflecting discriminatory patterns. A study of auto lending revealed that African-American borrowers paid almost $700 in loan markups compared to $300 for white borrowers with identical credit scores. The algorithms didn't explicitly consider race, but they incorporated variables that served as proxies for race, producing discriminatory outcomes through seemingly neutral criteria. The opacity of many algorithms compounds these problems. When a loan application is denied or insurance premiums increase, the statistical models behind these decisions often function as "black boxes," making it difficult for individuals to understand or challenge adverse outcomes. This lack of transparency creates accountability gaps, allowing organizations to implement discriminatory practices under the guise of objective analysis. Perhaps most fundamentally, Super Crunching shifts power from individuals to organizations that control data and algorithms. Retailers like Harrah's calculate each customer's "pain point"—the maximum amount they can lose while still enjoying gambling enough to return. Airlines determine precisely which disrupted passengers deserve compensation based on their statistical value as customers. These targeted practices maximize profits by treating different customers differently, often without their knowledge or consent. Even well-intentioned statistical applications can produce problematic outcomes. The Virginia Sexually Violent Predator Act uses a statistical risk assessment tool to identify prisoners who might be subject to civil commitment after completing their sentences. 
While this approach aims to protect public safety, it raises profound questions about liberty and due process when freedom depends on statistical predictions rather than actual behavior. Addressing these concerns requires technical, legal, and ethical responses. Differential privacy techniques can allow statistical analysis while protecting individual data. Transparency requirements can make algorithms more accountable. Data rights legislation can give individuals greater control over information about them. Most importantly, we need ongoing societal dialogue about the appropriate limits of statistical prediction and intervention.
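As one example of the technical responses mentioned above, here is a minimal sketch of the Laplace mechanism used in differential privacy: a counting query is answered with noise calibrated to its sensitivity, so no single person's presence in the data can be inferred with confidence. The query and numbers are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(7)

def noisy_count(true_count: int, epsilon: float) -> float:
    """Release a count with Laplace noise calibrated to sensitivity 1.

    Adding or removing one person changes a count by at most 1, so noise with
    scale 1/epsilon gives epsilon-differential privacy for this single query.
    """
    return true_count + rng.laplace(loc=0.0, scale=1.0 / epsilon)

# Hypothetical query: how many customers in a database have a given condition?
true_answer = 1_234
for eps in (0.1, 1.0):                     # smaller epsilon -> more noise, stronger privacy
    print(f"epsilon={eps}: noisy answer ≈ {noisy_count(true_answer, eps):.1f}")
```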

Chapter 7: Finding Balance: Integrating Intuition with Data Analysis

The confrontation between Super Crunching and traditional expertise need not end in the complete victory of one approach over the other. The most promising path forward integrates statistical rigor with human judgment, leveraging the complementary strengths of each approach. This integration represents not just a practical compromise but a more sophisticated understanding of how different forms of knowledge contribute to effective decision-making. Statistical methods excel at identifying patterns across large datasets, maintaining consistency, and precisely weighing multiple factors. However, they struggle with novel situations, rare events, and contextual understanding. Human experts, conversely, excel at recognizing when standard approaches might fail, generating creative hypotheses, and incorporating ethical considerations into decisions. The challenge is structuring this collaboration to maximize the strengths of each approach while minimizing their weaknesses. Consider the case of medical diagnosis. Isabel, a diagnostic support system, can search through 11,000 diseases and their associated symptoms in seconds, identifying possibilities that even experienced physicians might overlook. However, Isabel doesn't replace physicians—it augments them. The doctor still evaluates the patient, enters relevant findings, and exercises clinical judgment about which diagnostic possibilities merit further investigation. This partnership leverages both statistical comprehensiveness and clinical judgment. The same complementary relationship applies in other domains. Financial analysts use statistical models to identify promising investments but apply judgment when evaluating unique circumstances like management changes or regulatory shifts. Teachers implement statistically validated curricula like Direct Instruction but adapt their approach based on individual student responses. Movie studios use algorithms to identify promising scripts but rely on human judgment to evaluate artistic quality and cultural resonance. Effective integration requires clarity about appropriate roles. Statistical models should generally drive routine predictions and decisions, with human oversight focused on unusual cases where statistical assumptions might fail. Paul Meehl's famous "broken leg" example illustrates this principle—a statistical model might predict that Professor Jones will go to the movies on Friday night based on past behavior, but this prediction should be overridden if we learn Jones has just broken his leg. The key is limiting such overrides to situations with clear justification rather than general discomfort with statistical recommendations. Organizations implementing this balanced approach must address cultural and status concerns. Traditional experts often resist statistical methods not just because they fear job loss but because these methods challenge professional identity and autonomy. Framing statistical tools as enhancing rather than replacing expertise, involving professionals in model development, and maintaining meaningful discretion in appropriate areas can reduce resistance and improve implementation. Education represents another critical component of successful integration. Future professionals need training in both domain knowledge and statistical literacy. They must understand not just how to use statistical tools but their limitations and appropriate applications. 
This dual competency will define the most valuable professionals in the coming decades—those who can toggle comfortably between statistical analysis and domain expertise, using each to inform and enhance the other. The statistician John Tukey famously said, "The greatest value of a picture is when it forces us to notice what we never expected to see." In this spirit, the greatest value of Super Crunching may be forcing us to notice patterns and relationships that challenge our preconceptions and expand our understanding. By embracing both statistical rigor and human insight, we can make decisions that are both more accurate and more humane than either approach alone could achieve.
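The "broken leg" discipline can be expressed as a simple decision rule: the model's prediction stands by default, and a human override is accepted only when it comes with a documented justification. The sketch below is a hypothetical illustration of that process, not a procedure from the book.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Decision:
    prediction: float                      # the statistical model's output
    final: float                           # what was actually acted on
    override_reason: Optional[str] = None

def decide(model_score: float, override: Optional[float] = None,
           reason: Optional[str] = None) -> Decision:
    """Default to the model; allow an override only with a documented reason.

    Mirrors the 'broken leg' rule: the model drives routine cases, and a human
    steps in only when a specific, stated fact invalidates the model's inputs.
    """
    if override is not None:
        if not reason:
            raise ValueError("Overrides require an explicit, documented justification.")
        return Decision(prediction=model_score, final=override, override_reason=reason)
    return Decision(prediction=model_score, final=model_score)

# Routine case: the model's prediction stands.
print(decide(0.82))
# Exceptional case: a known fact the model could not see.
print(decide(0.82, override=0.05, reason="Subject broke a leg this morning"))
```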

Summary

The statistical revolution described throughout these chapters represents more than a technical innovation—it fundamentally transforms how decisions are made across society. By replacing subjective judgment with data-driven analysis, Super Crunching has demonstrated the ability to make more accurate predictions, discover non-obvious relationships, and rigorously evaluate causal claims. This approach hasn't merely improved existing decision processes; it has completely rewritten the rules of the game in fields ranging from medicine and education to business and government. What emerges most clearly is that the future belongs not to those who reject statistical thinking in favor of pure intuition, nor to those who blindly follow algorithms without question. Rather, it belongs to those who can integrate both approaches—using statistical tools to identify patterns and relationships while applying human judgment to generate hypotheses, recognize unusual circumstances, and incorporate ethical considerations. This integration requires both technical literacy and domain expertise, creating new opportunities for those who can comfortably navigate both worlds. As statistical tools become increasingly accessible and powerful, the most valuable skill may be knowing when and how to use them appropriately—understanding their capabilities and limitations within specific contexts. In this way, Super Crunching doesn't eliminate the need for human judgment but transforms it, creating a more nuanced and effective approach to understanding our complex world.

Best Quote

“what caused a law review article to be cited more or less. Fred and I collected citation information on all the articles published for fifteen years in the top three law reviews. Our central statistical formula had more than fifty variables. Like Epagogix, Fred and I found that seemingly incongruous things mattered a lot. Articles with shorter titles and fewer footnotes were cited significantly more, whereas articles that included an equation or an appendix were cited a lot less. Longer articles were cited more, but the regression formula predicted that citations per page peak for articles that were a whopping fifty-three pages long.” ― Ian Ayres, Super Crunchers: Why Thinking-by-Numbers Is the New Way to Be Smart
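The "peak at fifty-three pages" finding is what a regression with a squared length term implies: citations per page rise with length, then fall, peaking at the vertex of the parabola. The coefficients below are invented so the peak lands at 53 pages; they are not the authors' actual estimates.

```python
import numpy as np

# Invented coefficients for citations-per-page as a quadratic in article length,
# chosen only so that the curve peaks at 53 pages, as the quote describes.
b1, b2 = 1.06, -0.01                       # linear and squared terms

def citations_per_page(pages: np.ndarray) -> np.ndarray:
    return b1 * pages + b2 * pages ** 2

peak = -b1 / (2 * b2)                      # vertex of a downward-opening parabola
print(f"peak at {peak:.0f} pages")         # -> 53

lengths = np.array([20, 40, 53, 70, 90])
print(dict(zip(lengths.tolist(), np.round(citations_per_page(lengths), 1))))
```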

Review Summary

Strengths: Ayres' engaging writing style effectively simplifies complex statistical concepts for a wide audience. Compelling case studies vividly demonstrate data analysis applications across sectors like healthcare and business, making the content relatable. The book's practical examples and Ayres' skill in demystifying statistics are particularly noteworthy. Additionally, the exploration of ethical implications, such as privacy concerns, adds depth to the discussion.

Weaknesses: Occasionally, the book's emphasis on data might overshadow the value of human intuition. Some readers note that complex issues are sometimes oversimplified. A more balanced perspective acknowledging the continued importance of human judgment in certain scenarios could enhance the narrative.

Overall Sentiment: The general reception is largely positive, with many finding it insightful and thought-provoking. Readers appreciate its relevance in highlighting the significance of data in contemporary decision-making.

Key Takeaway: The book underscores the transformative role of data analysis in modern decision-making, suggesting that statistical tools can often surpass traditional human judgment, while also highlighting the necessity of considering ethical dimensions.

About Author


Ian Ayres

Ian Ayres is the William K. Townsend Professor at Yale Law School and the Yale School of Management, and is editor of the Journal of Law, Economics and Organization. In addition to his best-selling Super Crunchers, Ayres has written for the New York Times, the Wall Street Journal, Financial Times, International Herald Tribune, and The New Republic. He lives in New Haven, Connecticut.

Barry Nalebuff is Professor of Economics and Management at the Yale School of Management. His books include The Art of Strategy (an update of the best-selling Thinking Strategically) and Co-opetition. He is the author of fifty scholarly articles and has been an associate editor of five academic journals. He lives in New Haven, Connecticut.

