
Anthropology and the Twitter Challenge

For many of us in anthropology, the advent of “big data” represents a threat.  Why, after all, spend months developing rapport and interviewing 100 people when you can run sentiment analyses on 40 million tweets in a matter of hours?  Still, I agree with Tricia Wang, who urges us to engage big data and complement that work with our own “thick data.”  In “thick data,” the depth of our insights into meaning and interpretation, “the native’s point of view,” could act as a corrective to billions of data points that may “speak for themselves,” as Chris Anderson claimed, but not, perhaps, for people.

Ironically, this move to “thick data” was enabled by the gradual choking off of access to social media APIs.  Facebook, Instagram, Twitter: one by one, social media platforms began limiting third-party access to their data, under the cover of protecting users from infringements on their privacy.  Well, not all third-party access.  Corporations and select researchers still manage to maintain access to the “firehose” of social media user data, while the rest of us make do with whatever limited datasets we can get.  For some platforms (e.g., Facebook), access has ceased altogether.  You can still collect much of this proprietary data through scraping, but that’s not an ethical research practice for anthropology.  So I’ve worked towards my “thick data,” using the limited data I can download from platforms like Twitter to broaden the “deep” data I’ve been gathering through more traditional, ethnographic methods.

This approach has proven useful for community-based ethnographic work, and I've applied it to studies of neighborhoods in Baltimore, in Seoul, and elsewhere, resulting in articles and a co-authored monograph (“Networked Anthropology”) explaining the advantages of this mixed-methods approach to community-based, participatory research.  I’ve also worked on multiple grants with the National Park Service using the same approach.  There, the park itself is the focus of social media investigation, with the ultimate goal of identifying community stakeholders and their connections to the park.

However, in early 2021, with the introduction of a new version of its API, Twitter began allowing academics to apply for an academic research track with access to 10 million tweets per month.  While this is not full access, it certainly moves my possibilities closer to the realm of big data, and it raises all sorts of new problems and possibilities.  While my work has utilized some basic metrics (centrality measures, word frequencies, descriptive statistics), the scale of data I now have access to requires a different set of empirical tests and, perhaps, a different class of questions.  Ultimately, I wonder whether it is even possible to ask similar kinds of questions of these data.  Can they tell me, for example, about the meaning of place?  About the ways people interpret their worlds?  The challenge for me is to bridge “thick” and “big” data.
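To make the “basic metrics” concrete: here is a minimal, illustrative sketch of two of them, word frequencies and degree centrality, computed over a tiny set of invented tweets.  This is not Twitter’s API or my actual research pipeline; the tweets, usernames, and mention-network construction are all hypothetical, stand-ins for the JSON a real API query would return.

```python
from collections import Counter
import re

# Invented sample data; a real study would pull thousands of tweets
# as JSON from the Twitter API.
tweets = [
    {"user": "ana", "text": "Loving the cherry blossoms at @parkservice today"},
    {"user": "ben", "text": "@ana the park felt like home, a real sense of place"},
    {"user": "cam", "text": "History and place come together at @parkservice"},
]

# Word frequencies: a crude tokenizer; real work would strip stopwords
# and handle hashtags, URLs, and emoji.
words = Counter(
    w for t in tweets for w in re.findall(r"[a-z']+", t["text"].lower())
)

# Mention network: an edge from each author to each account they @-mention.
edges = {
    (t["user"], m)
    for t in tweets
    for m in re.findall(r"@(\w+)", t["text"])
}

# Degree centrality: a node's connections as a share of all other nodes.
nodes = {u for edge in edges for u in edge}
degree = Counter(u for edge in edges for u in edge)
centrality = {n: degree[n] / (len(nodes) - 1) for n in nodes}
```

In this toy network, the park’s account ends up most central because two of three authors mention it, which is the kind of signal that, at scale, points toward stakeholders and their connections.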

But the bigger challenge (and opportunity) here is to anthropology itself.  While anthropologists are no strangers to quantitative methods, we still generally do not work with large datasets; such data sit uneasily with the “small societies” approach that characterized the discipline in the early twentieth century.  So what will anthropology become in this environment?
