The readings for this week highlight three limitations of computational algorithms:
- Logical Contradiction
- Private Ownership of Data
- Mimetic Fidelity
In “What is Computable?,” MacCormick uses deductive reasoning to prove that certain types of computer programs are logically contradictory, in particular a program that determines whether any other program will crash. Except for a brief mention of phenomenology and spirituality at the end, he focuses almost exclusively on the logical limits of algorithms, insisting that anything not logically contradictory is at least theoretically computable.
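MacCormick's argument is a version of the classic halting-problem proof by contradiction, and it can be sketched in a few lines of Python. The `will_crash` function below is a hypothetical stub, since the whole point of the proof is that no real implementation can exist: whatever answer it gives about `paradox`, the answer is wrong.

```python
def will_crash(func):
    """Hypothetical crash detector: returns True if calling func() would crash.
    This is a stub; the proof shows no correct implementation is possible."""
    return True  # pick either answer; the contradiction follows both ways


def paradox():
    """Does the opposite of whatever will_crash predicts about it."""
    if will_crash(paradox):
        return None  # predicted to crash, so finish cleanly
    raise RuntimeError("deliberate crash")  # predicted fine, so crash


# Run paradox() and compare the actual outcome with the prediction.
prediction = will_crash(paradox)
try:
    paradox()
    actually_crashed = False
except RuntimeError:
    actually_crashed = True

# The detector is always wrong about paradox, whichever answer the stub returns.
assert actually_crashed != prediction
```

Swapping the stub's return value to `False` makes `paradox` crash instead, so the prediction is wrong either way: this is the self-referential trap that makes a universal crash detector logically impossible, not merely hard to build.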
However, Kugler’s article, “What Happens When Big Data Blunders?” (interestingly, both articles are phrased as questions), uncovers two other issues through case studies: Google’s attempt to predict flu trends and the WHO/CDC attempts to predict Ebola trends. These two big data projects failed not from logical contradictions but from commercial bias and mimetic infidelity, respectively. In the first case, Google Flu Trends was based on a commercial search algorithm that changes with fluctuating business plans. This presents difficulties beyond the comparatively clear-cut deductive reasoning of MacCormick, raising the question of whether a commercial venture driven by profit and competition can provide reliable algorithms for scientific research. While perhaps practically difficult, there is no theoretical reason such issues cannot be resolved by, say, moving this research outside the commercial sphere.
In contrast, the WHO/CDC case study uncovers a much more difficult (and perhaps unanswerable) question: what are the limits of computational simulation? The WHO/CDC studies used simulations that extrapolated from initial conditions to approximate Ebola deaths, failing to keep up with “the ever-changing situation on the ground,” what we might call “reality.” This opens up philosophical questions going back to Plato regarding the relation between representation and reality. Further, in the age of computer simulation, what conditions are necessary to render reality representable in a computational environment? Does reality itself have to function according to the principles of computation?
In “The Science of Culture?,” Lev Manovich evaluates the differences between Digital Humanities and Social Computing, claiming that the latter performs “humanities work” but on a much larger scale. Without opening the “What are the Humanities?” can of worms, I’d like to push on this claim to bring to the fore questions that remain unanswered in this week’s readings: what are we searching for in these data sets, who benefits from algorithmic research on culture, and what limitations (other than corpus size) hinder the cultural study of algorithms?
Manovich writes, “However, looking at many examples of computer science papers, it is clear that they are actually doing Humanities or Communication Studies (in relation to contemporary media) – but at a much larger scale.” His examples include “Quantifying Visual Preferences Around the World” and “What We Instagram: A First Analysis of Instagram Photo Content and User Types.” The former analyzes the impact of sociodemographic factors (gender, age, education level, and geographical location) on aesthetic preferences in website design. The latter categorizes types of photos and users on Instagram. But what makes this humanities work on a “much larger scale”?
For the first, Manovich points out, “Obviously, the study of aesthetics and design traditionally was part of the humanities,” the operative word being “was.” When was the last time humanists attempted to either identify universal aesthetic values or attribute disparate aesthetic preferences to essentialist categories of human difference (“Females, for example, liked colorful websites more than males”)? A hundred years ago? The second article, on the other hand, bears some similarity to Russian formalist and later structuralist attempts to identify traits common to, for instance, all myths or narratives. But those projects focused on structural similarities rather than merely categorizing common subjects of representation. Further, structuralist questions (e.g., how is meaning produced and reproduced?) are largely absent here. What questions are answered, and for whom?
Rather than answering questions in the humanities, these studies appear most relevant to corporate domains of knowledge production, more at home in the boardroom than the seminar room. It’s easy to see how evaluating differences in aesthetic preferences in website design or identifying the primary subjects of Instagram photos might benefit the tech industry. But how might they benefit the (digital) humanities?
One answer might be that posing the question identifies the potential pitfall of digital humanities research becoming merely another appendage of Silicon Valley’s ownership of information-cum-capital. Robertson and Travaglia illustrate the potential for big data to reproduce the same horrific ends as the last information boom, in the nineteenth century. Beyond that, what might be different this time around? How might newer technologies, as well as corporate rather than state ownership of information, lead the current big data revolution to different ends?