Data-driven development: Should creators drink the data Kool-Aid?

One of the compelling anecdotes that sustains the new trend of a ‘data-driven approach’ in online content development is the story of how Netflix’s Ted Sarandos acquired House of Cards. The Los Angeles Times provides a good summary of this transaction:

“Sarandos studied Netflix data to determine how many subscribers watched political dramas such as Aaron Sorkin’s ‘The West Wing’ or the BBC’s original 1990 ‘House of Cards.’ He identified the die-hard fans of the directors’ films and the series’ proposed star, Kevin Spacey.

On the strength of that projected audience, Beau Willimon’s script and Fincher’s track record, Sarandos walked into the director’s West Hollywood office with a groundbreaking proposition: Netflix would commit to not one but two full seasons at a cost of $100 million.” (source)

There is something almost alchemical about how Sarandos is said to have mingled data points to confirm the series’ potential as a winning concoction. The Sarandos formula has inspired both wonder at the potential of data-driven content development and concern that this approach is turning viewers into puppets. The data-driven method undeniably has important implications for how online content is disseminated and for its audience, but the narrative also raises the question of whether, and how, this growing movement affects the creation of content in the first place.

While the Sarandos example is squarely focused on video on demand (VOD) services like Netflix, the promise of data-driven content development also extends to items such as web series and other forms of online content. In broad terms, creators of online content fall into one of three camps in response to data-driven content creation.

The first camp rejects the method outright. Its members are worried that data-driven approaches will dilute the craft of creating content: those who commission work will only acquire content that is supported by their data and creators will be more concerned with whether data supports an editing or casting decision instead of how to best tell a story. In this camp, audience and user experience data will only spoil the ingredients that really matter when creating content.

The second camp shares a similar scepticism towards data and their ability to contribute to good storytelling. However, this camp is happy to leave data collection and analysis to those who commission and deliver content. Creators like Jenji Kohan (Orange is the New Black) and Beau Willimon (House of Cards), whose series are both distributed through Netflix, have been quoted expressing what might best be characterized as ambivalence towards the data-driven approach. In fact, Willimon states:

“Netflix closely guards [viewership] data for a whole host of reasons, and I’m glad that they do, I wouldn’t want access to that data. […] That sort of data leads to pandering, which is the antithesis of creativity.” (source)

These two camps raise an interesting and age-old debate: to what extent is the creative process enabled or hampered by audience data and similar information?

That leads us to the third camp. For a number of online content creators, certain kinds of audience data are already readily available. For example, people who use YouTube as a distribution platform can access Google Analytics and other similar tools to track how viewers engage with their videos (cf. previous article titled “Pros v. Amateurs”). Similarly, creators of web apps and standalone online content can build audience tracking into their designs. Companies ranging from Google to Adobe continue to perfect new content marketing and data analysis tools (cf. link to a recent list of tool types). This has led creators in fields without such ready access to start asking for data about their own content. These creators’ desire for data is nothing new. Some, including independent filmmakers, have always been frustrated by the secrecy among distributors who jealously guard their ‘numbers.’ What is changing is how those numbers fit into content delivery business models and, more importantly, the fact that content creators are no longer beholden to a specific distribution platform.

Consequently, in this camp, data have become too important for creators to leave to someone else to put together. That has led to calls for greater transparency with respect to the data generated through platforms like Netflix and other VOD services. In an address to the TIFF Doc Conference, Liesl Copland captured the call for transparency:

“Hollywood is woefully behind but, with a little ingenuity and passion from all of you, we can catch up and eventually surpass the present reality. I am very optimistic and deeply believe in the possibility of a clear, transparent reporting model that gives us visibility into everything. One that makes us all smarter and better at making movies, frees up the artist, and creates more engaged moviegoers who are eager to support fresh voices.” (Source)

But what makes it unlikely that this type of transparency will ever be realized is that the very secrecy that stands between creators and their ability to see into ‘everything data’ is what gives this data value. Transparency would reveal the alchemy that is essential for data formulas such as Sarandos’. He picks and chooses the variables he needs to create a suitably valid and reliable model for predicting the viability of new programming. But, as Tim Wu recently observed, Sarandos’ own intuition may have more to do with successfully interpreting the Netflix algorithms than we are led to believe. These data recipes are not necessarily based on variables that would be relevant to other projects. In this sense, ‘visibility into everything’ requires that one first define which ‘everything’ is relevant, and that will be very different for Netflix than for an independent documentary filmmaker. The power to define which variables are worth tracking is something that would be difficult to relinquish. After all, it is to the alchemist’s advantage to keep his ingredients secret. The only way that creators will gain the power to concoct transparent data-driven development methods that benefit their work is by generating their own data and sharing them with others.

Frédérik Lesage
Frédérik Lesage (Ph.D., London School of Economics and Political Science) is an assistant professor in the School of Communication at Simon Fraser University. His research interests include digital media, creative practice, and mediation theory.