In December 2015, the Democratic Party’s data infrastructure became the subject of fierce controversy. When it was publicly revealed that staffers of Bernie Sanders’s presidential bid had breached rival Hillary Clinton’s voter data stored in NGP VAN’s VoteBuilder, this infrastructure, normally hidden from view, suddenly became a contested matter of concern. As the Democratic Party temporarily closed the Sanders campaign’s access to VoteBuilder, the candidate’s supporters, competitors to NGP VAN, and journalists publicly debated why the party had such control over voter data and, ultimately, the dangers that such centralization might pose for the party and the democratic process. I have spent the past decade studying platforms such as VoteBuilder. While the incident with Sanders raised a number of important issues, the Democratic Party’s data infrastructure, developed in the wake of the 2004 presidential election cycle, is a key reason for its well-documented technological advantages over its Republican rival that persisted at least through the 2014 midterm elections.

Before I go any further, it is worth providing some background from news reports on the Sanders data breach, with the caveat that I have no direct knowledge of and have not conducted original research on the incident. From journalistic reports, the basic facts behind the Sanders campaign’s breach of NGP VAN’s firewalls between campaigns seem clear enough. By exploiting a vulnerability in the NGP VAN system, staffers on the Sanders campaign pulled multiple lists of voters from the Clinton campaign’s voter data. According to news articles, there were 24 lists in all, including data on things such as strong Clinton supporters in Iowa and New Hampshire. The DNC responded by temporarily suspending the Sanders campaign’s access to VoteBuilder, in effect preventing staffers from using their voter data less than two months out from the Iowa caucuses.

As Larry Lessig has pointed out, there are legal questions here relating to the actual contract that users of VoteBuilder and the DNC’s data sign that, without direct knowledge of the matter at hand, I am not going to opine on. However, the consensus among practitioners is that Sanders’s staffers knew what they were doing and that it was a clear ethical violation of the use of the NGP VAN system (indeed, the campaign’s national data director was summarily fired and the candidate apologized). In the end, the campaign filed a federal lawsuit against the DNC, and the entire matter was resolved in 24 hours, with the campaign again gaining access to its voter data after the DNC stated that the campaign had fulfilled its requests for more information.

What I want to focus on here is the call after the Sanders incident, among some practitioners and firms with a stake in the outcome, to open up the DNC’s voter data so candidates and their campaigns, not the party, own what they use and produce. Part of this response stems from fears over monopoly power. The Democratic Party has built the most powerful political database in the world, and serves as the ‘obligatory passage point’ through which all of its political campaigns must go to gain access to it. Presidential campaigns essentially rent the DNC’s voter file, which is in effect a collection of 51 different state voter files (including Washington D.C.) that states share with the national party and are all hosted in the same database and interface system, VoteBuilder. The Democratic Party contracts with NGP VAN to make the state parties’ voter files accessible to and actionable for campaigns at all levels of office through VoteBuilder. As campaigns use VoteBuilder, they enrich the voter file through all of their canvassing and voter contacts. The voter contact and identification data of campaigns are firewalled from each other during elections. However, after elections, this voter contact and identification work conducted by campaign volunteers and paid staffers goes back to the DNC and the state parties, and is ultimately made available to subsequent Democratic candidates. In this system, campaigns do not own these voter files that they helped to enrich.

The basic architecture of the Democratic Party’s voter data was put in place by Chairman Howard Dean, with VAN gaining the contract to put together VoteBuilder in 2007. The national party provides access to VoteBuilder for its presidential candidates, while state parties determine access for state-level campaigns. As I have previously detailed, this architecture grew out of the party’s experiences during the 2004 cycle, when John Kerry’s multiple state voter files, housed in many vendor systems responsible for providing access to them, crashed in important states. The data was unstandardized across states, vendors had uneven track records, there was little in the way of field tools for accessing voter files, much historical data on voter contact was simply lost, and candidates even had trouble accessing state voter files. During the 2004 cycle, the Democratic Party was significantly behind the Republican Party and its voter database and interface system VoterVault.

Dean and his staff turned the Democratic Party’s data architecture from a mess of often vendor-controlled state voter files into the nationalized system it is today. Dean’s team created a set of agreements between the national and state parties, where the former works to ensure that state parties are fair arbiters of their voter files and make them available to all Democrats in an open primary, and the latter can hold the national party to account for what it does with the data. Former party staffers argue that because the national party is an outsized player in terms of funding, the national organization has a significant role in making sure state parties are not unfair and can mediate disputes when they arise. That said, disputes between candidates and state parties do occur. For example, the policies of state parties with respect to challengers to incumbents in primaries vary, with states such as Missouri only providing access to VAN for incumbent office holders as the “winner of the last primary.” This policy was recently called into question by an African American state senator who was active in the protests over the police shooting of Michael Brown after she sought access to VAN to contest a congressional incumbent. The rationale of the state party is that there needs to be an incentive for incumbents to continue to use the party’s voter data, and ultimately share their voter identification work back with other candidates, or they will view their electoral work across cycles simply as being used against them. What this in effect means is that, in some states, challengers to incumbents cannot access the party’s extensive voter file and the competitive advantages it might entail.

While there is a romantic notion that candidates should be free of parties and just make their own direct appeals to voters, especially among organizations that stand to profit from the decentralization and fragmentation of the party’s data, the Democratic Party’s voter file is an acknowledgment of the fact that American political life is, for worse but mostly for better, organized through parties. The DNC’s voter file architecture is ultimately a distinctly partisan resource, designed to strengthen its candidates’ ability to contest elections by providing them with far more data than they could ever muster on their own in discrete electoral runs, or even through data swaps with allies. As such, it is a tool of the party, not a ‘democratic’ technology in the sense of facilitating the efforts of independent candidates across the ideological spectrum to contest elections or even challenges to Democratic incumbents.

While this might not accord with our deep-seated longings for democratic technologies that will afford things such as more open and participatory elections (although this cycle is certainly testing that normative democratic wish), it does fit with the structure of American democracy, which is organized through parties. Political theorist Nancy Rosenblum has argued for the “moral distinctiveness of partisanship.” Parties, in Rosenblum’s eyes, are the institutions that define representative democracy. As Rosenblum argues, parties not only organize elections, they define political issues (and the political center), organize intra-party deliberation, are responsible for mobilizing the electorate, and are pluralist in having to negotiate intra-party compromise. Rosenblum, for instance, notes the irony that the celebrated “civil society organization” is the foremost example of political extremism, given that it often pursues single issues at the expense of multi-issue coalitions and tends to be the most uncompromising (single-issue candidates would fall into a similar category).

The story of parties in America is long and complicated, but suffice it to say they are central to democratic processes. In an era when there are stark differences between the two parties, voters have clear choices and responsible parties attempt to pursue power through the ballot box. Parties do so, in no small part, through their ability to serve as databases for their candidates. In the process, they often do not fulfill other democratic longings or normative aspirations that the public has for electoral politics. However, parties fulfill their normative democratic role in representative democracy by working to defeat the other side in elections and create governing majorities according to the most advantageous means that they collectively decide upon. Rosenblum argues that it is ultimately a good thing that parties have to balance multiple interests within them, lest we empower extremist single-issue candidates or ideological movements that brook no compromise (indeed, the Koch Brothers’ network invested in its own data firm, http://www.i-360.com/, precisely to help bend the Republican Party to its ideological will). And, ultimately, a party may decide that protecting incumbents, in exchange for their contributing back to a common data pool that candidates at all levels of office may subsequently benefit from, should come at the expense of individual candidates attempting to access data. We should respect the decisions of parties to judge what is in their collective best interest according to their normative role of, according to Rosenblum, mobilizing citizens, pursuing political power through majorities, and making government work once in office.

One final consideration. There are alternatives to the Democratic Party’s data – although most would argue that they are inferior to the party’s historical data provided through VoteBuilder. There is nothing precluding a candidate who does not want to opt into this system from relying on third-party data firms such as Aristotle, Catalist, and NationBuilder, and taking advantage of any of the commercial customer relationship management platforms available to manage it. And there is nothing preventing such candidates from purchasing data, generating their own contacts, and then building their own independent operations. But such an operation will never be as powerful as the DNC’s communal and partisan resource, and these candidates will find that even if they gain office, they will diminish the resources available for their partisan and ideological allies to do so as well – in the end, this hurts their ability to enact legislative change.

In essence, the Democratic Party has created a powerful and robust tool that facilitates its efforts to secure political power. And, I believe, in keeping with the normative role of parties in democratic societies, the party should have the ability to control access to it according to its own policies designed to further its governance interests. As a matter of course, these policies and remedies should be transparent and ultimately contestable (which appeared to be part of the issue in the Sanders and Missouri cases), but in the end I believe it is a good thing that, as a multi-issue coalition of heterogeneous actors, the Democratic Party sets its own policies and procedures for its use as a database.

The buy-in across the Democratic Party network to the party’s voter file and interface system provided by NGP VAN has resulted in a powerful and robust piece of infrastructure that the party’s ecosystem convenes around and that is ultimately enrolled in the pursuit of electoral power. The DNC’s voter file is an asset for the party because of the trust and culture of collaboration and sharing that has grown up around it. Because there is broad buy-in to VoteBuilder, the party’s data is standardized, accrues across election cycles, and moves up and down ballot. Democrats in races across all levels of office and election cycles share the data they collect about who their voters are and what they care about. The fact that campaigns have to provide the data they generate through canvasses and contacts back to the party (excluding campaign proprietary data on donors) means that data is a shared asset across the party as a whole. At the same time, there is one system for accessing the voter file, so volunteers without much in the way of technical skills or training in states such as Iowa can use it easily from cycle to cycle. Voter modeling flows downward from comparatively well-resourced presidential campaigns to state and even local races. And the universal buy-in to the party’s data has meant that the firms and organizations of the expansive, hybrid Democratic network ecosystem all work from the same basic data infrastructure. These organizations adopt and use the party’s voter file and tools for accessing it – which facilitates the complementarity of campaign services.

The danger, as a number of practitioners have suggested, is that with monopoly comes the possibility for suboptimal technologies and stasis. However, the fact that the Democratic Party has for nearly a decade been well ahead of its rival, which has a more fragmented data and analytics ecosystem, suggests that this fear is overblown. Another concern is that the party might behave arbitrarily, such as in the case of the Sanders breach when it appeared the party’s own policies were unclear, but there are a host of normative pressures exerted by the DNC and other party actors, as well as regulative agreements and policies, against this (indeed, the Sanders situation was ultimately resolved quickly). As Josh Hendler, a veteran of the Clark and Kerry 2004 campaigns who served at the Democratic Party as director of political data and analytics from 2005 to 2007 and director of technology from 2009 to 2011, argued, “the state parties decided that while they would be giving up some power, they trusted the DNC to be stewards of the data.” In that trust, the Democratic Party as a whole benefits from access to the data generated by its candidates, especially its presidential candidates with their vast mobilization of resources and volunteers to generate voter contacts.

Daniel Kreiss is Assistant Professor in the School of Media and Journalism at the University of North Carolina at Chapel Hill. Kreiss’s research explores the impact of technological change on the public sphere and political practice. Kreiss is the author of Taking Our Country Back: The Crafting of Networked Politics from Howard Dean to Barack Obama (Oxford University Press, 2012) and Prototype Politics: Technology-Intensive Campaigning and the Data of Democracy (Oxford University Press, 2016).


Header image credit: Keith Bacongco