Wednesday, January 18, 2017

The Partial Truths of Big Data


Last July I was using R to do some social network analysis of Instagram tags.  After downloading numerous packages and submitting app developer applications, I still couldn’t get it to work, only to discover that Instagram had changed its policy the month before.  Like many social media platforms, Instagram had restricted access to data through its API (Application Programming Interface).  For some, this could be welcome news; after all, untrammeled access for third-party developers weakens privacy and exposes more and more of our lives to commodification.

But this isn’t the whole story.  Just because I (a researcher at a mid-tier state university) was having trouble gaining access doesn’t mean that large corporations were having trouble, or the National Security Agency, or Instagram itself.  Rather, what we’ve seen with the rise of Big Data as a research object is the progressive commodification of social media.  The social network analysis that began as a recondite branch of anthropology, sociology and mathematics has become an indispensable tool in business development.  Social media data are money, and the tightening of restrictions represents another digital divide, this one between the corporations and governments that can gain access to the “firehose” of complete data and the rest of us, who work with a fraction of that under whatever restrictions are placed upon data access through APIs.  In this latest chapter of the digital divide, some people (and entities) get Big Data, and some of us get “partial” data.

This has prompted some scholars to question the involvement of academics in Big Data analysis in the first place: “How much of a difference does it make for academics to gain access to Big Data, after all, when the logics of commercial enclosure of social media data may [have] already begun to run deep?” (Chan 2015: 1080).  It certainly doesn’t look good for cultural anthropologists: our “n” in a research study rarely exceeds one hundred.  Compare that to the 2016 update of a 2011 study from Facebook that examines social distance and weak ties among its 1.5 billion users, concluding that the average geodesic distance between any two of them is about 3.57 “degrees of separation” (Bhagat et al. 2016).  It would be hard for anthropologists to compare their work to this.  And yet, as Tricia Wang (2013) has reminded ethnographers, we have little choice but to work with the Big Data science around us: “Otherwise our work will be all too easily shoved into another department, minimized as a small line item in a budget, and relegated to a small data corner” (Wang 2013).  One strategy here is to point out the obvious.  “Big Data” (however construed) does not interpret itself; it needs context, theory, narrative: in other words, the work of anthropology.  In their often-cited 2012 paper, danah boyd and Kate Crawford urge researchers to critically engage the emergent hegemony of Big Data by pointing to the limits of the data these social media platforms aggregate.  “Do numbers speak for themselves?  We believe the answer is ‘no’” (boyd and Crawford 2012: 666).

But this means more than stressing the importance of history and political economy to the quanta of data we emit.  We need to ask more subversive questions.  What kinds of numbers are generated in the space of social media?  What, for example, does Facebook know about me?  On the one hand, it undoubtedly knows a great deal.  Not only am I updating Facebook with personal information (photos from family trips, political opinions), but I’m also “liking” groups, causes, music, etc. on Facebook and, furthermore, Facebook harvests cookies from my non-Facebook internet perambulations in order to “better serve me” advertising targeted to my demographic and political leanings.

But none of this, I would suggest, is really “anthropological” data.  Instead, it’s consumer data: information about what I buy and what I might be tempted to buy.  It’s tempting to leap from this to insights into culture, society and social action, but that’s not really what Facebook is collecting.  The numbers are numbers about consumers: users who click on links, who link to each other, who can be profiled in order to sell more.  When we do other things on Facebook, such as “like” a group or respond to efforts to organize for a cause, we do so through a consumption frame.  Not surprisingly, this has led to several critiques of slacktivism: it looks like consumption without a credit card number.  In any case, Facebook data is not, as Boellstorff put it, “raw data.”  Instead, it has already been thoroughly “cooked”: it is data framed as emanating from an individual consumer (Boellstorff 2013).

As far as Facebook is concerned, though, this is all that’s important.  Facebook thinks it knows the whole truth, and, from the perspective of an enormous, monopolistic corporation, it knows all it needs (or cares) to know about my identity, habits and social relations.  And yet, it does not.  The emergent, the collective, the alternative, the subaltern, becoming-animal, the multitude—Facebook will never start the revolution, because Facebook can only know our social lives through the reified perspective of commodification.  Of course, activists have utilized Facebook (and other social media) for their work, but they do this in spite of the platforms themselves, media frames that will gamely struggle to track shopping and supply advertising to even the most ardent revolutionary’s account.  Big Data, then, is always “partial” data.

In other words, Facebook (and other social media) disclose “partial truths.”  I borrow this term from Clifford’s often-cited (and often excoriated) introductory essay to “Writing Culture,” a collection of essays that is widely credited with ushering in anthropology’s “postmodern” age.  There, Clifford (1986: 10) focuses attention on the ways ethnographic accounts “construct” culture and, in particular, the ways the genre conventions of ethnography both enable and delimit anthropological truth:
"'Cultures' do not hold still for their portraits.  Attempts to make them do so always involve simplification and exclusion, selection of a temporal focus, the construction of a particular self-other relationship, and the imposition of a power relationship."
In focusing on the constructedness of the ethnographic encounter, Clifford led a generation of anthropologists to experiment with the ethnographic form and to reflect on their dyadic field encounters.  But by directing our attention to the dyadic encounter, he deflected it from other contexts, among them political economy, social activism, postcolonial struggle and the work of the different communities in which anthropologists site their work.  As many critics have since concluded, anthropology is only in the last (and reified) instance the ethnographic representation of a dyadic encounter.

There is, nevertheless, truth in Clifford, but it is a truth that serves to conceal other truths.  As Taussig writes of magic in general, “The real skill of the practitioner lies not in skilled concealment but in the skilled revelation of skilled concealment” (Taussig 2003: 273).  A momentary glimpse into one secret serves to conceal another; for anthropology, the truth of ethnography served to conceal the onslaught of neo-liberalism.  This is where we can re-define Clifford’s titular perspective: not just a “part,” and not just biased, but a truth that obscures other truths.

With Big Data, the magic is the same.  There are truths to Big Data, but the focus upon them obscures other insights that may lead us to critical alternatives.  The same theories and methods that graph connected action and aggregate millions of data points also serve to deflect the eye from local process, or from action that unfolds over a longer timeline, or from non-episodic phenomena that continue without defining “events.”

In his 2011 book, Rob Nixon introduced the concept of “slow violence,” “a violence that occurs gradually and out of sight, a violence of delayed destruction that is dispersed across time and space, an attritional violence that is typically not viewed as violence at all” (Nixon 2011: 2).  Ordinary violence, along with other temporally discrete phenomena, is particularly amenable to social media.  How many examples of police violence, for example, have been rendered visible through their felicitous recording on smartphones, the resulting videos uploaded to Facebook?  But slow violence proceeds without such recordable events: it is incremental tragedy impacting health, education and psychology.  Nixon concentrates his analysis on the slow violence of environmental degradation, and, particularly, on the ways that marginalized communities suffer through policies that enable corporations and governments to concentrate pollution in communities that cannot defend against it.  But slow violence can take many other forms, including processes of structural violence, de-industrialization, de-funding, under-development, infrastructure decay and pathologization.  None of these may spark social media storms, but these “slow” processes have the same calamitous consequences in urban and rural neighborhoods alike.

This is where the data of anthropology and the “Big Data” available through social network analysis seem to diverge the most, but the onus is upon us to identify the lacunae and, when possible, use our methodological understandings to move in these interstices.  This can mean using Big Data in ways contrary to the aims of the social media platforms that aggregated it in the first place (e.g., researching food deserts through Instagram; Beck 2016).  It is, however, not an easy task to take images that reflect the commodification of daily life and the drive towards the “quantified self” and appropriate them to advance social justice.  And it is here that the ethnography that seemed so beside the point suddenly becomes vital.

References

Beck, Julie (2016).  “The Instagrams of Food Deserts.”  The Atlantic [accessed on November 1, 2016 at www.theatlantic.com].

Bhagat, Smriti, Moira Burke, Carlos Diuk, Ismail Filiz and Sergey Edunov (2016).  “Three and a half degrees of separation.”  Facebook Research [retrieved from research.fb.com on November 10, 2016].

Boellstorff, Tom (2013).  “Making big data, in theory.”  First Monday 18(10).  [Retrieved at firstmonday.org on January 6, 2017].

boyd, danah and Kate Crawford (2012).  “Critical Questions for Big Data.”  Information, Communication & Society 15(5): 662-679.

Chan, Anita (2015).  “Big data interfaces and the problem of inclusion.”  Media, Culture & Society: 1080-1086.

Clifford, James (1986).  “Partial Truths.”  In Writing Culture, ed. by James Clifford and George Marcus.  Berkeley: University of California Press.

Nixon, Rob (2011).  Slow Violence and the Environmentalism of the Poor.  Cambridge: Harvard University Press.

Taussig, Michael (2003).  “Viscerality, Faith, and Skepticism.”  In Magic and Modernity, ed. by Birgit Meyer and Peter Pels, pp. 272-306.  Stanford: Stanford University Press.

Wang, Tricia (2013).  “Big Data Needs Thick Data.”  Ethnography Matters [retrieved from ethnographymatters.net on September 3, 2013].