Data for Religious Studies and Digital Humanities

For the digital humanities and the study of religion, data is never neutral. The same can be said of other fields, but the point holds with particular force in disciplines concerned with identity formation and knowledge production.

Data tends to encompass observable, quantifiable information, and people often imbue the category of data with objectivity and certainty, a “you can’t argue with numbers” attitude. For some small-scale observations, data can appear inarguable (e.g., the number of students enrolled in a class), but information becomes data only because someone collected it to answer a question. Data serves a purpose, and people determine what those purposes are, which means that data cannot be separated from the subjective processes by which it is selected and used.

Religious studies scholars find value in analyzing “traditional” data (census and socioeconomic figures, rates of some variable over time, numerical insights taken from text mining, poll responses, etc.), but almost anything can be considered data. A more important question to ask is: what can data not communicate? Because data’s significance comes from its usefulness for particular purposes, and those purposes can be infinite, figuring out what data is not does more to clarify what constitutes data for the study of religion and the digital humanities.

Context

Alone, data cannot communicate context. Context includes the situations under which the data was collected and produced, the methods used, the researchers’ interests and assumptions, and the data’s purposes. Too often, data’s value comes from the belief that it is unambiguous and does not require explanation. On this view, if conclusions drawn from data seem unclear, you just need more data. But, as Catherine D’Ignazio and Lauren Klein write, “Refusing to acknowledge context is a power play to avoid power…a way to assert authoritativeness and mastery without being required to address the complexity of what the data actually represent.”1 Without the “why” and “how” that context provides, data loses significance and credibility.

In the Longitudinal Religious Congregations files, the data by itself seems to suggest that some denominations disappeared over a ten-year period, and some appeared out of thin air. Context provides the understanding that some denominations renamed themselves, split from others, or merged with others. For this example, the researchers provided a thorough document explaining such context, data limitations, and methodology,2 but knowledge about renamings, splits, and mergers does not come from the numbers themselves.
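The point about renamings, splits, and mergers can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the denomination names, the counts, the `crosswalk` mapping, and the `harmonize` helper are invented placeholders, not values or methods from the Longitudinal Religious Congregations files. It only shows that a comparison of the numbers alone reports disappearances and appearances that vanish once outside context is supplied.

```python
# All names and counts are invented placeholders, NOT values from the
# Longitudinal Religious Congregations files.
wave_1 = {"Denomination A": 120, "Denomination B": 45, "Denomination C": 30}
wave_2 = {"Denomination A": 118, "Denomination D": 80}

# Naive comparison using only the numbers: which labels vanish or emerge?
disappeared = sorted(set(wave_1) - set(wave_2))
appeared = sorted(set(wave_2) - set(wave_1))

# Context that the numbers cannot supply (hypothetical): B and C merged
# to form D between the two waves.
crosswalk = {
    "Denomination B": "Denomination D",
    "Denomination C": "Denomination D",
}


def harmonize(counts, crosswalk):
    """Relabel each group with its successor name and sum the counts."""
    harmonized = {}
    for name, count in counts.items():
        successor = crosswalk.get(name, name)
        harmonized[successor] = harmonized.get(successor, 0) + count
    return harmonized


wave_1_harmonized = harmonize(wave_1, crosswalk)
still_missing = sorted(set(wave_1_harmonized) - set(wave_2))

print(disappeared)        # ['Denomination B', 'Denomination C']
print(appeared)           # ['Denomination D']
print(wave_1_harmonized)  # {'Denomination A': 120, 'Denomination D': 75}
print(still_missing)      # []
```

The crosswalk is itself an interpretive artifact: someone has to decide which groups count as successors of which, and that decision lives in documentation, not in the counts.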

One focus for religion scholars is to defamiliarize the familiar and familiarize the unfamiliar, which means we study anomalies and difference within broader contexts. Using data to generalize conceals or erases exceptions. When we analyze aggregated data, the small, fascinating exceptions that are integral to our studies are swept under the rug. On “data cleaning,” the process by which data are structured and normalized so that computers can process them, Katie Rawson and Trevor Muñoz lend this insight: “When humanities scholars recoil at data-driven research, they are often responding to the reductiveness inherent in this form of scholarship…data cleaning…is understood as a step that inscribes a normative order by wiping away what is different.”3 So, depending on the steps taken to organize data, we lose context crucial to understanding and framing both general and exceptional trends.

Meaning

Additionally, data cannot communicate meaning on its own. Meaning is interpretive and relies on context, but I also want to include the import of lived experiences. Some researchers find data useful because of its perceived objectivity and lack of emotion, which again stem from its reputation as neutral. Yet a preoccupation with neutrality can lead to wrong and offensive conclusions, as was the case with Robert Fogel and Stanley Engerman’s Time on the Cross: The Economics of American Negro Slavery. Their study estimated that enslaved black people enjoyed better material conditions than black people in the twentieth century. This horribly insensitive finding rested on statistics that, “enticing in their seeming neutrality, failed to address or unpack black life…[and] failed to remove emotion from the discussion.” Jessica Marie Johnson warns that, in this case, the lack of context attached a meaning to the data that “further obscure[d] the social and political realities of black diasporic life under slavery.”4 Objectivity and lack of emotion are not goals to pursue in the humanities if studies neglect lived human experience, as data often does. And if the purpose of the data is to arouse emotion and point out the flaws in unjust systems, then data’s meaning should reflect that.

Objectivity

Finally, data cannot capture objectivity. If we accept that data derives value and meaning from how people use the numbers and for what purpose, then data requires interpretation by people. danah boyd and Kate Crawford are helpful here: “claims to objectivity are necessarily made by subjects and are based on subjective observations and choices.”5 Rather than providing hard facts, data signifies what researchers find notable and worth studying.

What counts as data reflects normative assumptions and personal biases that are not neutral. A person or a system had to decide that certain information was worth capturing for some specific purpose, and organizing and communicating the data requires subjective design choices. When data is presented, scholars must provide context so that viewers understand the intended meaning. What is left out when capturing data is just as important as what is included, and data alone cannot communicate either. For religious studies and the digital humanities, data cannot speak for itself.

Notes

  1. Catherine D’Ignazio and Lauren Klein, “The Numbers Don’t Speak for Themselves,” in Data Feminism (N.p.: MIT Press, 2020), sec. “Raw Data, Cooked Data, Cooking,” par. 7, https://data-feminism.mitpress.mit.edu/pub/czq9dfs5.
  2. Rachel Bacon, Roger Finke, and Dale Jones, “Merging the Religious Congregations and Membership Studies: A Data File for Documenting American Religious Change,” Review of Religious Research 60, no. 3 (2018): 403–422, https://doi.org/10.1007/s13644-018-0339-4.
  3. Katie Rawson and Trevor Muñoz, “Against Cleaning,” Curating Menus, July 7, 2016, sec. “Humanities Data and Suspicions of Reduction,” par. 1–2, http://www.curatingmenus.org/articles/against-cleaning/.
  4. Jessica Marie Johnson, “Markup Bodies: Black [Life] Studies and Slavery [Death] Studies at the Digital Crossroads,” Social Text 36, no. 4 (2018): 57–59, sec. “Black Data and the Slavery Debates,” par. 5, https://doi.org/10.1215/01642472-7145658.
  5. danah boyd and Kate Crawford, “Critical Questions for Big Data,” Information, Communication & Society 15, no. 5 (2012): 662–679, sec. “2. Claims to objectivity and accuracy are misleading,” par. 3, https://doi.org/10.1080/1369118X.2012.678878.
