The Bestseller Code

Anatomy of the Blockbuster Novel

3.8 (1,237 ratings)
In a literary landscape where success often seems like a capricious gust of wind, "The Bestseller Code" unveils a groundbreaking narrative: a computer algorithm that deciphers the enigmatic DNA of bestselling books. Jodie Archer and Matthew Jockers have cracked the code, revealing that the climb to the bestseller list isn't just a stroke of luck. By dissecting the essence of 20,000 novels, they illuminate why certain themes, styles, and characters resonate across genres and captivate readers worldwide. With the precision of a literary detective, this work exposes the magnetic pull of dark heroines and unravels the mystique behind phenomena like "Fifty Shades of Grey." But the true marvel lies in its quest for "the one"—the archetype of bestselling mastery, as revealed by a meticulous analysis of countless data points. The outcome is as surprising as it is intriguing, offering a fresh lens on the art of fiction and our deep-seated desire to be enthralled by stories.

Categories

Business, Nonfiction, Psychology, Writing, Literature, Linguistics, Adult, Research, Books About Books, Crafts

Content Type

Book

Binding

Hardcover

Year

2016

Publisher

St. Martin's Press

Language

English

ISBN13

9781250088277

The Bestseller Code Plot Summary

Introduction

What makes a bestselling novel? Is it merely a matter of luck, timing, or aggressive marketing, or are there underlying patterns that determine which books will fly off the shelves? For decades, the publishing industry has operated on instinct and intuition, treating bestseller status as something akin to lightning strikes—rare, unpredictable, and impossible to engineer. This conventional wisdom is challenged through rigorous computational analysis of thousands of novels, revealing that bestsellers do indeed share specific, identifiable patterns that can be detected algorithmically. Through text mining and machine learning techniques, patterns emerge across seemingly disparate works, from literary fiction to thrillers to romance novels. The research demonstrates that bestsellers share specific characteristics in their themes, plots, characters, and style—what might be called the DNA of commercial success. These findings not only illuminate what readers truly want, but also offer insights into how human creativity and algorithmic analysis can coexist. Rather than diminishing the art of writing, understanding these patterns provides a new lens through which to appreciate the craft of successful storytelling and perhaps transform how we think about literary creation and consumption.

Chapter 1: The Bestseller Formula: Identifying Patterns Beyond Random Success

The publishing industry has long operated on the assumption that bestsellers are essentially flukes—happy anomalies that cannot be predicted or engineered. Publishing executives often describe the process of selecting manuscripts as educated gambling, with success rates hovering below one percent. This perspective has led to a conservative approach where publishers primarily invest in established authors rather than newcomers, perpetuating the careers of aging bestseller veterans while struggling to identify the next generation of literary stars. This prevalent belief in the randomness of bestselling success is challenged by computational evidence. By analyzing thousands of novels published over the past thirty years, algorithms can identify bestsellers with 80-90% accuracy based solely on the text itself—no author name, marketing budget, or cover design required. The algorithm examines thousands of textual features, from thematic elements to stylistic choices, identifying patterns that consistently appear in commercially successful fiction. The fact that a computer can predict with such accuracy which books will succeed suggests that bestsellers are far from random occurrences. The significance of this finding is profound. Contrary to industry wisdom, the data reveals that what makes a book successful is primarily what's inside it—the words on the page arranged in particular ways. While established authors like John Grisham and Danielle Steel consistently produce bestsellers, the computer analysis indicates this is not merely due to their name recognition but because they have mastered—consciously or intuitively—the fundamentals of what readers respond to. Their manuscripts contain predictable patterns that signal commercial appeal. Consider the case of debut novels that become immediate blockbusters, like The Girl with the Dragon Tattoo or Fifty Shades of Grey. 
These books weren't promoted by famous authors or launched with enormous marketing budgets, yet they captured millions of readers. The algorithm reveals these weren't flukes but texts that contained the specific DNA of bestselling fiction. Similarly, when experienced editors reject manuscripts that later become global phenomena, they're not failing to recognize quality but missing these specific patterns that indicate mass appeal. Understanding these patterns doesn't diminish the art of writing or reduce creativity to formulas. Rather, it illuminates what readers collectively respond to and why certain books resonate across diverse audiences. This knowledge could democratize publishing by helping identify promising new voices regardless of their connections or backgrounds, potentially transforming an industry that has traditionally relied on subjective judgment and established networks.

Chapter 2: Theme Analysis: The Surprising Truth About Popular Topics in Fiction

When examining what bestselling novels are about, conventional wisdom suggests sex, violence, and scandalous content should dominate. Yet computational analysis reveals surprising results: the themes most strongly associated with bestselling aren't what most would expect. Sex, for instance, appears with significantly lower frequency in bestsellers than in non-bestsellers—directly contradicting the common assumption that "sex sells." Even blockbusters like Fifty Shades of Grey, often dismissed as merely "mommy porn," succeed primarily due to other thematic elements rather than their explicit content. The research identifies human connection and relationships as the most statistically significant theme in bestselling fiction, appearing consistently across genres from thrillers to literary fiction. This theme of human closeness—characters forming bonds, experiencing intimacy, and navigating relationships—appears with markedly higher frequency in bestsellers compared to books that never reach the lists. Secondary themes that strongly indicate bestselling potential include modern technology, workplace dynamics, and domestic settings. These themes transcend genre categories, appearing in mysteries, romances, and literary novels alike. Bestselling authors demonstrate a distinctive approach to thematic construction. They typically devote approximately 30% of their novel to just two or three primary themes, creating a focused narrative core. Non-bestselling writers, by contrast, tend to pack in more themes, requiring twice as many topics to reach the same percentage of the book. This difference suggests that commercially successful novels maintain thematic focus rather than attempting to cover too many ideas. The ideal bestseller features a dominant theme (taking about one-third of the novel) with several secondary themes adding variation and complexity. Significantly, bestselling novels tend to ground themselves in contemporary reality rather than fantasy. 
While fantasy elements certainly appear in some bestsellers, the data shows that books featuring real-world settings, ordinary homes, modern technologies, and recognizable workplaces sell better on average than those set in invented worlds or distant times. Equally important is the presence of thematic tension—the most commercially successful novels juxtapose themes that naturally create conflict. John Grisham's pairing of legal systems with family life or Danielle Steel's combination of domestic settings with medical emergencies creates built-in narrative tension. The data also reveals a preference for the mainstream over the niche. While distinctive topics may differentiate a novel in the marketplace, bestsellers avoid extremes that might alienate broad readership. They favor topics with universal appeal—work, family, technology, crime—treated in ways that feel relevant to contemporary readers. This explains why Grisham and Steel, despite writing in different genres, both top the algorithm's list for thematic mastery: they have perfected the balance between recognizable reality and compelling drama, between focus and variety.
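The thematic-focus claim above (a few themes covering roughly 30% of a bestseller, with less successful books needing about twice as many) can be made concrete with a small sketch. The function name `topics_needed` and the topic shares below are hypothetical and chosen only for illustration. They are not data from the book, and this is not the authors' topic model.

```python
def topics_needed(shares, target=30):
    """Return how many of the largest topics it takes to cover `target` percent."""
    covered = 0
    # Walk through the topic shares from largest to smallest.
    for count, share in enumerate(sorted(shares.values(), reverse=True), start=1):
        covered += share
        if covered >= target:
            return count
    return None  # the listed topics never reach the target


# Hypothetical topic shares in whole percent (illustrative, not data from the book).
focused = {"human closeness": 18, "workplace": 8, "technology": 5, "crime": 3}
diffuse = {"travel": 7, "war": 6, "religion": 6, "politics": 5,
           "nature": 5, "food": 4, "music": 3}

print(topics_needed(focused))  # 3
print(topics_needed(diffuse))  # 6
```

In this toy example the focused book reaches the 30% mark with three topics, while the diffuse one needs six.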

Chapter 3: Emotional Arcs: The Seven Plot Shapes That Drive Bestselling Novels

Bestselling novels don't just tell stories—they create emotional experiences that follow specific patterns. Through sentiment analysis of thousands of novels, the research identifies seven fundamental plot shapes that characterize all bestsellers. These emotional arcs aren't about specific events but about the emotional journey readers experience as they progress through the narrative. The computer tracks this by analyzing the density of positive and negative emotional language throughout each book, creating visualizations that represent the emotional roller coaster of the reading experience. The most commercially successful novels feature emotional arcs that maintain regular, rhythmic beats—creating a consistent pattern of emotional highs and lows that keeps readers engaged. The most striking example of this pattern appears in the two highest-selling adult novels of the past thirty years: Fifty Shades of Grey and The Da Vinci Code. Despite their vastly different content and style, both novels feature nearly identical emotional rhythms, with peaks and valleys occurring at remarkably similar intervals. This suggests that the addictive quality of these blockbusters stems not just from their content but from their precisely calibrated emotional pacing. Among the seven plot shapes identified, no single one guarantees commercial success. The shapes include the "rags to riches" plot (gradual improvement from negative to positive circumstances), its inverse (fall from grace), the "man in hole" story (descent followed by recovery), and others featuring various combinations of emotional ups and downs. What matters more than the specific shape is how the author manages the emotional transitions. Books with steep emotional drops and climbs tend to be described as "page-turners," while those with more gradual slopes create different reading experiences. The three-act structure, long recognized by storytelling experts, appears consistently in bestsellers. 
The data shows critical turning points typically occurring around the 30% and 65% marks in the narrative, with a midpoint shift at 50%. These structural features create the sense of forward momentum and emotional satisfaction that keeps readers engaged. Within this structure, bestsellers tend to feature emotionally consequential scenes at the highest and lowest points—moments where characters experience major transformations, revelations, or decisions. Recent blockbuster "girl" novels like Gone Girl, The Girl with the Dragon Tattoo, and The Girl on the Train share striking similarities in their emotional arcs despite surface differences. All three feature sharp emotional descents in the first half followed by complicated resolutions that end in darker emotional territory than where they began. This pattern represents a relatively new emotional template that has proven extraordinarily successful with contemporary readers. By mapping these emotional journeys, the algorithm provides insight not just into individual bestsellers but into evolving reader preferences and expectations.
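As a rough illustration of tracking "the density of positive and negative emotional language" across a book, the sketch below splits a text into equal segments and scores each one with a tiny word list. The lexicons `POSITIVE` and `NEGATIVE`, the function `emotional_arc`, and the segment count are all assumptions made for this example. The researchers' actual sentiment method is not described in this summary.

```python
import re

# Tiny hypothetical lexicons; a real analysis would use a much larger resource.
POSITIVE = {"love", "joy", "happy", "hope", "smile"}
NEGATIVE = {"fear", "death", "cry", "pain", "dark"}


def emotional_arc(text, segments=4):
    """Score each segment as (positive - negative) words divided by segment length."""
    words = re.findall(r"[a-z']+", text.lower())
    size = max(1, len(words) // segments)
    arc = []
    for i in range(segments):
        start = i * size
        # The last segment absorbs any leftover words.
        end = len(words) if i == segments - 1 else start + size
        chunk = words[start:end]
        if not chunk:
            arc.append(0.0)
            continue
        score = sum((w in POSITIVE) - (w in NEGATIVE) for w in chunk)
        arc.append(score / len(chunk))
    return arc


sample = "hope smile fear pain love joy dark cry"
print(emotional_arc(sample))  # [1.0, -1.0, 1.0, -1.0]
```

The alternating scores in the sample mimic the regular highs and lows described above. Plotting such a list for a full novel would give a crude version of the arcs discussed in this chapter.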

Chapter 4: Style Mechanics: The Linguistic DNA of Commercial Success

While theme and plot are essential elements of bestselling fiction, the research reveals that an author's style (the nuts and bolts of how sentences are constructed) plays an equally crucial role in commercial success. Analyzing style at the most granular level, the algorithm examines thousands of features including word choice, sentence length, punctuation, and parts of speech. Remarkably, using only the 491 most common words and punctuation marks, the model can differentiate between bestsellers and non-bestsellers with 70% accuracy. The stylistic DNA of bestsellers reveals a clear preference for accessible, conversational language. Contractions such as "don't," "can't," "I'm," and "you're" appear significantly more frequently in bestsellers than in non-bestsellers, creating a more natural, immediate voice than their formal alternatives. Similarly, the word "okay" appears three times more often in bestsellers than in books that don't make the lists. This suggests successful authors instinctively create narrator voices that feel authentic and contemporary to readers rather than formal or distant. Punctuation shows equally telling patterns. Bestsellers contain more question marks but fewer exclamation points than non-bestselling novels. The ellipsis (…) appears more frequently in bestsellers, often used to indicate unfinished thoughts that engage readers by inviting them to fill in the blanks. Periods are more common in bestselling prose, while semicolons and colons are significantly less so. These patterns indicate that bestselling style favors clearer, more straightforward sentence structures over complex or ornate ones. In terms of grammar, bestsellers feature fewer adjectives and adverbs than non-bestsellers. This aligns with common writing advice to avoid overmodification and let verbs and nouns carry the narrative weight. The algorithm confirms that bestselling prose isn't decorated with excessive description: it is leaner, cleaner, and more direct. 
The verb "do" appears twice as frequently in bestsellers, while the adverb "very" appears only half as often, suggesting that successful writers favor action over qualification. Perhaps the most surprising finding involves gender patterns in bestselling style. When ranking novels solely on stylistic features most typical of bestsellers, women writers—particularly those with backgrounds in journalism, advertising, or media—dominated the top spots. This suggests that training in accessible, public-facing communication may provide advantages in crafting commercially successful fiction. However, further analysis revealed that what appeared initially as a gender difference was more accurately understood as a difference in professional background and training. The most commercially successful style appears to be one that balances accessibility with sophistication—neither too elevated nor too simplistic, but pitched precisely at the sweet spot where the broadest readership feels both engaged and respected.
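The stylistic markers named in this chapter (contractions, "okay," "very," question marks, exclamation points, semicolons) are simple enough to count directly. The sketch below computes their rate per 1,000 tokens. The function `style_profile` and its tokenizing regex are assumptions for illustration, not the authors' feature set. It only recognizes straight apostrophes and counts possessives such as "author's" as contractions, which a real analysis would need to separate.

```python
import re
from collections import Counter


def style_profile(text):
    """Return the rate per 1,000 tokens of a few style markers."""
    # Tokens are words (optionally with one straight apostrophe) or selected punctuation.
    tokens = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?|[?!;:.]", text)
    counts = Counter()
    for token in tokens:
        low = token.lower()
        if "'" in low:
            counts["contraction"] += 1  # also catches possessives (known limitation)
        if low in {"okay", "very", "?", "!", ";"}:
            counts[low] += 1
    total = len(tokens) or 1  # avoid dividing by zero on empty input
    names = ["contraction", "okay", "very", "?", "!", ";"]
    return {name: 1000 * counts[name] / total for name in names}


sample = "Okay, I don't know. Can't you see? It is very dark; very cold!"
for name, rate in style_profile(sample).items():
    print(f"{name}: {rate:.1f} per 1,000 tokens")
```

Comparing such profiles between two sets of novels would be one simple way to check claims like "more question marks, fewer exclamation points" on a corpus of one's own.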

Chapter 5: Character Agency: How Bestselling Protagonists Engage Readers

Character emerges as a crucial factor in determining bestseller status, with the algorithm revealing specific patterns in how successful novels portray their protagonists. The key finding is that bestselling characters consistently demonstrate agency—they act decisively, express clear desires, and drive the narrative forward through their choices. The verbs associated with characters prove so telling that analyzing character actions alone allows the algorithm to predict bestseller status with 72% accuracy. The most significant difference between bestselling and non-bestselling characters lies in their relationship to desire. Characters in bestsellers "need" and "want" twice as often as those in less successful books. They actively pursue goals rather than merely reacting to circumstances. Bestselling protagonists also demonstrate greater self-awareness and confidence—they frequently "tell," "know," "think," and "ask." By contrast, characters in less successful novels more often "halt," "hesitate," "interrupt," and "seem"—verbs that suggest uncertainty and passivity. Bestselling characters possess distinctive qualities that set them apart from ordinary people. They typically excel in some specific area—whether intellectual capability, moral strength, or a particular skill. Lisbeth Salander's unparalleled hacking abilities in The Girl with the Dragon Tattoo, Robert Langdon's expertise in symbology in The Da Vinci Code, or Werner's electronics prowess in All the Light We Cannot See represent this pattern. These special abilities make characters more compelling without necessarily making them unrealistic or superheroic. While bestselling characters of both genders demonstrate agency, the data reveals some gender differences in typical actions. Male characters in bestsellers do more kissing, flying, driving, and killing, while female characters do more talking, reading, and imagining. 
However, these differences appear less significant than the overarching pattern of agency that unites successful protagonists across gender lines. Both male and female bestselling characters make decisions, take action, and pursue desires with determination. The analysis of recent "girl" bestsellers (The Girl with the Dragon Tattoo, Gone Girl, The Girl on the Train) reveals why these particular female protagonists have captured reader attention. Despite being called "girls," these characters are complex adult women who subvert traditional female character types. They bring danger and darkness into domestic spaces, challenging conventional expectations of femininity while maintaining strong agency. The algorithm demonstrates that in all three novels, the female characters' verbs and actions contribute significantly more to the books' bestselling potential than the male characters' actions do. Character development in bestsellers follows recognizable patterns across genres. Protagonists typically begin with clear desires, face significant obstacles, and transform through their experiences. The algorithm shows that the most successful novels balance character consistency with meaningful change—heroes and heroines remain recognizably themselves while evolving in response to plot challenges. This combination of stable identity and dynamic development creates characters readers can simultaneously identify with and root for throughout the narrative journey.
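The verb contrast described above ("need," "want," "tell," "know," "think," "ask" versus "halt," "hesitate," "interrupt," "seem") can be approximated with a crude count of the verbs that directly follow "he" or "she." The function names and the suffix-stripping shortcut are assumptions made for this sketch. Irregular forms such as "knew" or "told" are missed, and a real analysis would rely on a parser to attach verbs to characters.

```python
import re

# Verb lists taken from the summary above.
AGENCY = {"need", "want", "tell", "know", "think", "ask"}
PASSIVE = {"halt", "hesitate", "interrupt", "seem"}


def base_form(word, known):
    """Return the known verb matching `word` after stripping up to two letters."""
    for candidate in (word, word[:-1], word[:-2]):
        if candidate in known:
            return candidate
    return None


def agency_counts(text):
    """Count agency and passive verbs that directly follow 'he' or 'she'."""
    verbs = re.findall(r"\b(?:he|she)\s+([a-z]+)", text.lower())
    agency = sum(1 for v in verbs if base_form(v, AGENCY))
    passive = sum(1 for v in verbs if base_form(v, PASSIVE))
    return agency, passive


sample = ("She wanted answers. He needs a plan. She hesitated. "
          "He seemed lost. She asks again.")
print(agency_counts(sample))  # (3, 2)
```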

Chapter 6: The Perfect Book: What Makes a 100% Bestseller Score

After analyzing thousands of novels across multiple dimensions—theme, plot, style, and character—the algorithm identified one book that scored a perfect 100% on the bestseller prediction scale. This perfect specimen of bestseller DNA turned out to be The Circle by Dave Eggers. What makes this novel the paradigmatic example of bestseller construction, combining all the elements that statistically predict commercial success? The Circle achieves the ideal thematic makeup identified by the algorithm. Its three main themes constitute approximately 30% of the novel: Modern Technology (21%), Jobs and the Workplace (4%), and Human Closeness (3%). This combination creates inherent tension—the novel explores how technology and work environments impact human relationships. This thematic tension drives the narrative forward while addressing concerns relevant to contemporary readers, precisely matching the pattern found in other highly successful novels. The emotional arc of The Circle follows the same shape as Fifty Shades of Grey—one of the seven fundamental plot shapes identified in bestsellers. It begins with an emotional high point as the protagonist, Mae, starts her dream job at a technology company she describes as "heaven." The plot then descends into increasingly troubling territory as the implications of the company's surveillance technologies become apparent. Like other top-selling novels, The Circle features a three-act structure with regular emotional beats that create a sense of momentum. The plotline ends in a darker place than where it began, matching the pattern seen in other recent blockbusters. Stylistically, The Circle achieves the perfect balance identified in bestselling prose. Its style sits precisely at the midpoint between what the algorithm identified as "masculine" and "feminine" writing patterns—52% "feminine" and 48% "masculine." 
This balanced approach combines accessibility with sophistication, creating prose that appeals to the broadest possible readership. The novel uses contractions, punctuation, and sentence structures in exactly the proportions found in other highly successful books. The protagonist, Mae Holland, exemplifies the characteristics of bestselling heroines. She demonstrates strong agency throughout the narrative, with her most frequently used verbs being "need" and "want"—the two verbs most strongly associated with bestselling characters. As the novel progresses, her character arc follows a path similar to those seen in other female-led bestsellers, particularly the recent "girl" novels. She becomes both victim and perpetrator, both problem and solution, navigating complex moral territory that defies simple categorization. What makes The Circle's perfect score particularly fascinating is the novel's content—it's a cautionary tale about technology and algorithms taking over human life. This irony wasn't lost on the researchers who discovered this result. The novel that most perfectly embodies the algorithmic patterns of bestselling appears to warn against the very kind of algorithmic control it exemplifies. This paradox highlights the complex relationship between data analysis and creative expression, suggesting that understanding the patterns of successful storytelling doesn't diminish the power of stories themselves.

Chapter 7: Implications: How Algorithms Transform Publishing Without Replacing Creativity

The discovery that bestsellers follow identifiable patterns raises profound questions about the future of publishing and the nature of literary creation. Rather than threatening artistic expression, algorithmic analysis of successful fiction offers new perspectives on why certain stories resonate with mass audiences. This knowledge has implications for writers, publishers, and readers alike, potentially democratizing an industry long dominated by subjective judgments and established networks. For publishers and agents, the ability to identify manuscripts with high commercial potential could transform business models that currently operate with success rates below one percent. By recognizing the patterns associated with bestselling, industry professionals could make more informed acquisition decisions, potentially discovering talented writers who might otherwise be overlooked. This doesn't mean algorithms should replace human judgment, but rather that computational tools can supplement traditional evaluation methods, providing additional data points in an inherently risky business. For writers, understanding the DNA of bestselling novels offers valuable insights without prescribing formulas. The research reveals that commercial success isn't random or solely dependent on marketing budgets—it stems from specific choices in theme, plot construction, style, and character development. Writers can use this knowledge to strengthen their craft while maintaining their unique voice and vision. The data doesn't dictate what stories to tell but illuminates how those stories might be shaped to maximize reader engagement. For readers, algorithmic analysis provides a new lens through which to understand their own preferences and responses. The research reveals what elements collectively drive page-turning engagement across diverse audiences. This doesn't reduce reading to mechanistic processes but rather highlights the shared human experiences that connect readers to stories. 
The surprising consistency in what resonates across different demographics suggests that despite our apparent differences, we respond to similar narrative patterns. The field of computational literary analysis raises ethical questions about the relationship between algorithms and art. Critics might worry that data-driven approaches could lead to formulaic writing designed to maximize commercial appeal at the expense of originality. However, the research suggests the opposite conclusion: the most commercially successful books often subvert expectations in creative ways while maintaining fundamental storytelling patterns. Understanding these patterns doesn't limit creativity but provides a foundation from which innovation can emerge. Perhaps most significantly, this research challenges traditional hierarchies in literary evaluation. By focusing exclusively on textual features rather than author reputation or cultural capital, algorithmic analysis creates a more level playing field where books are judged solely by their content. This approach has the potential to recognize excellence across genres and styles, whether in literary fiction or mass-market thrillers. In this sense, computational methods might actually expand our appreciation of diverse storytelling rather than narrowing literary possibilities.

Summary

The fundamental insight revealed through this computational analysis is that bestselling novels are far from random occurrences—they contain recognizable patterns that can be identified with remarkable accuracy. These patterns exist across multiple dimensions: thematic focus on human relationships balanced with contemporary concerns; emotionally resonant plots with regular rhythmic beats; accessible yet sophisticated prose style; and protagonists who demonstrate clear agency and desires. What emerges is not a reductive formula but a deeper understanding of the craft elements that collectively engage millions of readers. This research transforms our understanding of commercial fiction while respecting the artistry involved in creating successful novels. Rather than diminishing the achievements of bestselling authors, it highlights their mastery—whether intuitive or deliberate—of fundamental storytelling techniques that resonate across diverse audiences. For the publishing industry, these findings offer potential pathways to discover new voices and connect more readers with books they'll love. For writers and readers alike, understanding the DNA of bestselling fiction provides a new appreciation for the subtle mechanics that drive our collective literary experience—the invisible architecture that makes us keep turning pages long into the night.

Best Quote

“Back in the spring of 2010, Stieg Larsson’s agent was having a good day.” ― Jodie Archer, The Bestseller Code: Anatomy of the Blockbuster Novel

Review Summary

Strengths: The book's innovative approach, combining literary analysis with data science, captures interest. Its ability to make complex data accessible to a general audience is notable. Insights into narrative structure and themes that resonate with readers are particularly engaging.

Weaknesses: Some skepticism exists regarding the reduction of literary success to a formula. The book could have explored the limitations and ethical considerations of using data in literature more thoroughly.

Overall Sentiment: Reception is mixed, with appreciation for the originality and thought-provoking nature of the analysis, yet some debate about the implications of an analytical approach to creativity.

Key Takeaway: "The Bestseller Code" offers a compelling exploration of how data can uncover the secrets of bestselling novels, though it raises questions about the balance between art and data-driven analysis in literature.

About Author

Jodie Archer

Jodie Archer spent her childhood hiding in the changing rooms of the clothing stores her mother managed: she would pile sweaters all over the floor to look as though she were putting together an outfit, and would sit for the 8 hours the shops were open and read books in any genre she was allowed. At this age, she decided the only sensible profession in the world was to be a writer, but sensible people told her that it was not sensible at all to pursue such an unlikely dream.

In pursuit of plan B, she was, for a while, a (not very good) young actress. She would take the parts of the characters and even writers she was most interested in: Alice in Wonderland, Matilda, Charlotte Bronte (which is when she most spectacularly forgot her lines), and once even Winnie the Pooh (which was fatal for her street cred at school). When she was given parts that involved singing on stage, she really proved that she was not meant to be involved in this business at all.

A compulsive writer of letters, stories, scripts and even a newsletter for the neighborhood cats, she never thought again of becoming a writer for real until her school displayed a blurb she wrote for a book she had never, in fact, written, and she was besieged by requests for the text. Still, she was told it would be more sensible to become a lawyer than an author, and that she must study English (if she insisted) and then convert to law.

Before she went to Cambridge to study English, Jodie took a year out and wrote for several local newspapers and magazines, and then worked as a runner and researcher in TV. Throughout her following years at St John's in Cambridge (where the law was forgotten after her first lecture), she wrote reviews and features for various news outlets, and edited the May Anthologies and other well-known university magazines. Once she graduated, she was offered training schemes at Macmillan and Penguin, and spent two years learning the publishing industry before becoming an acquisitions editor at Penguin. While this was exceptionally useful training, it showed her just how hard it is to make it as an author, and she thought it best to play it safe.

Jodie left the UK for a scholarship at Stanford University in California, where she spent time teaching nonfiction and memoir writing alongside her research in contemporary fiction and bestsellers. While at Stanford, she enjoyed the blue skies and palm trees of California and wrote most of her first (unpublished) book. She got her first interest from agents the day her mother suddenly died. Just after that, her marriage broke down, and she abandoned writing for a couple of years.

After she got her PhD, Jodie was recruited to Apple, where she became the lead in research on books. After she was approached by her agent to write a book based on her doctoral research with Matt Jockers, she wrote a pitch. She left the corporate world to write The Bestseller Code after its acquisition became deal of the week in New York.

Finally, at 36, Jodie is a full-time writer. After ten years in the US, she lives in Yorkshire, UK with her Havanese puppy, Mollie.
