Big Data and the Epistemological Renaissance

Technological advancements have had a profound influence on social science research. The rise of the internet, mobile hardware and app economies has generated a breadth, depth and type of data previously unimaginable, while computational capabilities allow granular analyses that reveal patterns across massive data sets. From these new types of data and forms of analysis has emerged both a crisis and a renaissance of methodological thought.

Early excitement around big data celebrated a world that would be entirely changed and entirely knowable. Big data would “revolutionize” the way we “live, work, and think,” claimed Viktor Mayer-Schönberger and Kenneth Cukier in their 2013 monograph, which so aptly captured the cultural zeitgeist energized around this new way of knowing. At the same time, social scientists and humanities scholars expressed concern that big data would displace their rich array of methodological traditions, undermining diverse scholarly practices and forms of knowledge production. However, with the hype around big data beginning to settle, polemical visions of omnipotence on the one hand, and bleak austerity on the other, seem unlikely to come to fruition.

While big data itself enables researchers to ask new kinds of questions, I argue that big data’s most significant effect has been to bring social thinkers back to the methodological (and philosophical) drawing board. For decades, researchers have relied on the same suite of epistemological tools—survey, ethnography, interview, and census. Advances in these well-worn methods have undoubtedly increased the sophistication of knowledge production. For example, statistical analyses are more precise and complex, while ethnographers regularly integrate critical race and feminist theories into their research process. In turn, computer-based tools are now part of the quantitative and qualitative research repertoire, streamlining intricate numerical relationships and troves of field notes alike. But these innovations in qualitative and quantitative research are all, more or less, linear progressions. Big data is a move in a new direction. Big data isn’t just about answering particular questions better, but about asking questions we didn’t even know we had. This capacity to pose and answer new kinds of questions has given pause to the myriad stakeholders interested in understanding the world and the people who live together in it—scholars, investors, politicians, scientists. In this pause, we find a renewed focus on epistemology.

Grappling with the capabilities of big data entails looking back at how we have known and looking forward to how we might know. It pushes us to revisit what we have done and to imagine what we can now do. Susan Halford and Mike Savage’s notion of “symphonic social science” resides neatly in this intellectual space of revisiting and reimagining that big data creates.

Recently published in the journal Sociology, Halford and Savage’s piece entitled “Speaking Sociologically with Big Data: Symphonic Social Science and the Future for Big Data Research” begins by looking back at the most influential works from the contemporary era. To learn how to best manage big data, the authors contend, we must first look at how scholars have leveraged data most effectively in the past. That is, they look back in order to look forward. Halford and Savage identify three canonical contemporary works: Robert Putnam’s Bowling Alone, Thomas Piketty’s Capital in the Twenty-First Century and Richard Wilkinson and Kate Pickett’s The Spirit Level: Why Equality is Better for Everyone. Though the three works come from different disciplines and address distinct social phenomena, Halford and Savage demonstrate that they share a similar analytic approach. Namely, Putnam, Piketty, and Wilkinson and Pickett each generate an argument by compiling multiple diverse data sources, exploring those data sources with relatively simple statistics, and making an argument about the ways that the data converge on a larger, underlying point (i.e., substantial shifts in, and unequal distributions of, social and economic capital).

Halford and Savage describe this process as symphonic: each data source is its own riff, and all sources return to a single refrain. In its repetition, the refrain makes a powerful and compelling case, beyond what any one data source could demonstrate on its own. In the authors’ own words:

Drawing these data together into a powerful overall argument, each book relies on the deployment of repeated ‘refrains’, just as classical music symphonies introduce and return to recurring themes, with subtle modifications, so that the symphony as a whole is more than its specific themes. This is the repertoire that symphonic social science deploys. Whereas conventional social science focuses on formal models, often trying to predict the outcomes of specific ‘dependent variables’, symphonic social science draws on a more aesthetic repertoire. Rather than the ‘parsimony’ championed in mainstream social science, what matters here is ‘prolixity’, with the clever and subtle repetition of examples of the same kind of relationship (or as Putnam (2000: 26) describes it ‘imperfect inferences from all the data we can find’) punctuated by telling counter-factuals (Halford and Savage 2017).

The symphonic data assemblage and its analysis, Halford and Savage contend, is derived from theory, exhibits clear visual representation, and can/should act as a guide for dealing with big data.

The symphonic approach instructs big data analysts to select their data points, data sets, and computational approaches through theoretical understandings of the processes they wish to unearth. This means taking a critical approach to big data, maintaining an awareness that big data are often collected for financial and/or security purposes, and may therefore be inadequate or ill-equipped to answer sociological questions. It means combining data in thoughtful ways, and knowing when data are irrelevant. It means visualizations that both represent data and integrate into argument formation, revealing patterns to the researchers who in turn reveal patterns to consuming publics. In these ways, big data can be a rigorous complement to existing methods. Large-scale computations can enrich, rather than displace, ways of knowing about the world while social theory remains central to analysis and argumentation.

Symphonic social science is both a considered approach to big data and an artefact of big data’s effects upon epistemology. Big data has disrupted knowledge production, focusing scholarly attention on how we have known and how we might know. In this vein, the symphonic approach can fruitfully apply not only to big data but also to new forms of established methodologies. We can imagine, for instance, a multiple case study approach to ethnography, in which each case, though rich in its own empirically grounded way, combines into an ethnographic assemblage that rings through unexpected refrains. We can imagine mixed methods designs, in which big data, survey data, and interviews each act as their own verse, which together create a powerful harmony of argument. The symphonic approach is indeed versatile and elegant. It is an important way forward, derived from looking back, inspired by big data.

Jenny Davis is on Twitter @Jenny_L_Davis

