Aviral and Ali’s Presentation and Abstract

https://docs.google.com/presentation/d/140g2GMQpfbT1EXmuHiFMmiL3Hz3QW_8Wh8NKE-tslKw/edit?usp=sharing

Triggerhappy Commentary: The Reactionary Nature of Social Media and the Transformation of Societal Response

While fake news, trolls, and millennial culture are the most frequently blamed side effects of social media, the reactionary nature of the medium has gone overlooked as the main culprit for a larger societal problem. The immediate and instantaneous qualities of social media, for better and for worse, allow information (regardless of accuracy) to travel around the world like at no other time in history. But the reaction to that information also travels at the speed of your bandwidth, with real-time responses fueling social movements, collective judgment, and increasing polarization in all aspects of life. What is missing from the equation is simple: thoughtfulness. Data needs to be processed before it can be useful, and current events must undergo the same processing in our collective conscience. Instantaneous reactions spell doom for actual thought pieces; instead, social media news feeds are flooded with half-baked ideas written by anyone and everyone with access to the internet. The oversaturation of commentary has created a culture of fatigue.

For this project we will explore how the reactionary nature of social media is causing a shift in human behavior, or in fact creating new behaviors among the growing number of participants in online discourse. We will examine a specific event on social media through its instant reaction, the impact of that reaction on those involved in the conversation, and the sentiment several weeks later. For example, the recent Executive Order banning immigration from seven Muslim-majority nations has inspired a large and polarized reaction from various sectors of society. While the sentiment around the order itself may not change much from the immediate aftermath to the present, false information (whether intentional or not) about the details of the order and the number of people affected at airports quickly disseminated across social media and amplified the intensity of people's reactions. The consequences of social media in this situation are thereby more difficult to understand. Did the spread of this information cause protestors and those sympathetic to give money or time to the cause? Did it mask Islamophobic currents within the far right from moderates unable to see the ban as targeting Muslims? What is the responsibility of the platform (Twitter, Facebook, etc.) in managing these unintended consequences? We will mine the data around several hashtags during different time periods, allowing the chain of reaction to guide our broader research question about social media's impact on behavior. Through a visualization of the social media data we collect, we hope to show the class an alternative and more nuanced approach to the study of social media while avoiding the traps of becoming a millennial-bashing Luddite or a tech industry sycophant whose bubble never burst.
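To give a concrete sense of the comparison we have in mind, a minimal sketch follows. It assumes the hashtag tweets have already been exported (for example, from R-Shief) to a CSV with 'created_at' and 'text' columns; the file name, column names, date cutoff, and timestamp format are all placeholder assumptions, and the sentiment scoring uses NLTK's off-the-shelf VADER tool rather than any model of our own.

```python
# A minimal sketch, not the project's actual pipeline: compare average
# sentiment for one hashtag between the immediate reaction window and a
# later window. File name, column names, and timestamp format are assumed.
import csv
from datetime import datetime

from nltk.sentiment.vader import SentimentIntensityAnalyzer  # requires nltk.download('vader_lexicon')

sia = SentimentIntensityAnalyzer()
FMT = '%Y-%m-%d %H:%M:%S'        # assumed timestamp format
CUTOFF = datetime(2017, 2, 15)   # boundary between "instant reaction" and "weeks later"

def mean_compound(rows):
    """Average VADER compound score (-1 most negative, +1 most positive)."""
    scores = [sia.polarity_scores(r['text'])['compound'] for r in rows]
    return sum(scores) / len(scores) if scores else 0.0

with open('tweets.csv', newline='', encoding='utf-8') as f:
    tweets = list(csv.DictReader(f))

early = [t for t in tweets if datetime.strptime(t['created_at'], FMT) < CUTOFF]
late = [t for t in tweets if datetime.strptime(t['created_at'], FMT) >= CUTOFF]

print('immediate-reaction sentiment:', mean_compound(early))
print('weeks-later sentiment:', mean_compound(late))
```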

Notes for INT200: Algorithms and Culture on 2/16/2017

Presentation by Ryan Leach

Article “Examining the Impact of Ranking on Consumer Behavior and Search Engine Revenue” by Anindya Ghose, Panagiotis Ipeirotis and Beibei Li

-Critical approach to the article's account of the impact of ranking on consumer behavior and search engine revenue
-Looking at hotel search websites
-Study illustrates how the internet can be structured to encourage certain user pathways— particularly consumer pathways
-Parallel with the structure of cities— psychogeography, which maps the relations between late capitalism and the structure of urban areas
-Dérive— “high theory parkour”— a digital dérive? Applying this idea to internet structures: the pathways a user can take through hyperlinks
-Alternative models:
-Neutral Vessel model
-Actor-Network model
-Technodeterminist model— Kittler— the category of human is produced by technologies
-Technologies aren’t all-powerful
-In the article humans aren’t really mentioned; the subject is reduced to a series of clicks and purchases
-Fragmentation of human subjectivity
-How might the algorithms themselves play a role in their application to increase consumption and profit?

Rachel, Lin, and Ryan Abstract

Project Proposal

Given this class’s goal of approaching and exploring the spaces where algorithms and the production, reception, and repurposing of cultural texts meet, we would like to explore the intersection between the various valences of legibility and oppositional reading strategies in the consumption of popular media texts, and the limits and possibilities of making these vital human uses of media legible to computational processes. Given that one of the largest obstacles to “reading” the affect or emotional valences of social media texts is understanding inflections of irony on the part of the author, we intend to explore this apparent limitation as it functions in relation to another type of reading – the cluster of different viewing practices and strategies which might generally be termed “ironic.” What tools might we be able to create to make legible to algorithmic processing occurrences of ironic enjoyment or engagement with a given cultural text, in opposition to what might be termed “innocent,” “straightforward,” or even “uncritical” enjoyment? In her exploration of the divide between such viewer positions in relation to the film Disco Dancer, Neepa Majumdar conceives of this divide as “queer” vs. “straight” reading positions. These more general terms can be broken down into practices of camp, irony, and oppositionality, all of which raise critical questions of politics, class, taste, and cultural capital which are deeply ingrained in the act of media reception.

In exploring these multiple layers of language and legibility (aesthetic, political, technological), we hope to generate interesting questions about both human and computational “reading” by thinking critically about the limits of algorithmic potential to make legible human strategies of cultural use (and even “making do,” to cite de Certeau) in which pleasure, desire, and affect are so deeply entwined. We will be using a data-set of tweets collected through R-Shief on a specific cultural text that invites both ironic and straight viewing practices and positions and that is recent or popular enough to generate a sufficiently large data-set (for example, shows like Ancient Aliens, Vanderpump Rules, or even Alex Jones’ Infowars, or franchises like The Fast and the Furious). By adapting pre-existing open-source tools built to examine text and determine an emotional or affective reading of the author, we hope to propose strategies that attempt to make the author’s underlying reading position or posture legible. By thinking critically along these boundaries, we hope to expand our understanding of both the computational limits and possibilities of legibility, as well as the function of such cultural reading strategies and practices as they intersect with the specificities of social media platforms.
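One possible, admittedly crude, starting point for surfacing candidates for ironic readings is sketched below: score each tweet with an off-the-shelf sentiment tool and flag those whose surface sentiment is strongly positive but which also carry common ironic markers. The marker list and threshold are illustrative assumptions, not a claim about how irony actually works; they only show where an adapted tool would plug in.

```python
# A crude heuristic sketch, not a solution to irony detection: surface
# sentiment from VADER plus a hand-picked list of ironic markers.
from nltk.sentiment.vader import SentimentIntensityAnalyzer  # requires nltk.download('vader_lexicon')

sia = SentimentIntensityAnalyzer()
IRONY_MARKERS = ['#not', '/s', 'yeah right', 'so much quality']  # hypothetical markers

def candidate_ironic(tweet_text):
    """True if the tweet reads positive on the surface but contains an ironic marker."""
    compound = sia.polarity_scores(tweet_text)['compound']
    lowered = tweet_text.lower()
    return compound > 0.5 and any(marker in lowered for marker in IRONY_MARKERS)

examples = [
    "Ancient Aliens is the best show on television #not",
    "Genuinely love this franchise, see you at the midnight showing",
]
for text in examples:
    print(candidate_ironic(text), '-', text)
```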

Project Twitterbot Abstract: Leavell/Mackey

When Twitter filed for its IPO several years ago, much information about the company became public as well. Most relevant to this discussion was the surprising revelation that much of Twitter is run by automated accounts, aka “twitterbots”. In 2014, Twitter admitted that as many as 23 million, or 8.5%, of its users were fake. For our project, we are curious about the prevalence and influence of automated accounts, those exhibiting behavior outside of human-like patterns of content generation, in political events and narratives with a large presence on Twitter.

We propose using R-Shief to sample Twitter activity during a popular political event, such as the Women’s March last month in Washington DC and around the country, or possibly an upcoming event, such as Donald Trump’s Supreme Court nominee process. With the help of pre-existing machine learning-based classification software, BotOrNot, we can make reasonably confident guesses as to whether a tweet sent by a specific Twitter handle comes from an automated account, one of the so-called “twitter bots”. While the identification of twitterbots can be done almost as efficiently through sampling, we are also interested in more nuanced aspects of their behavior, creating a need to ‘fish’ for a large number of twitterbots to perform further analysis on. With these automated techniques for twitterbot gathering, we hope to characterize twitterbot activity during political events and present our findings visually, using tools provided by the R-Shief software and others.
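Since we have not yet settled on exactly how BotOrNot will be wired into our collection pipeline, the sketch below stands in for that classification step with a simple behavioral heuristic of our own: accounts that post at an inhumanly regular rate with very little textual variety get flagged for further analysis. The thresholds and the expected tweet fields are assumptions, and this is emphatically not BotOrNot's actual model.

```python
# A stand-in heuristic for automated-account detection, NOT the BotOrNot
# classifier: very regular posting intervals plus repetitive text reads
# as bot-like. Field names ('user', 'created_at', 'text') are assumed.
from collections import defaultdict
from datetime import datetime

FMT = '%Y-%m-%d %H:%M:%S'  # assumed timestamp format

def group_by_user(tweets):
    grouped = defaultdict(list)
    for t in tweets:
        grouped[t['user']].append(t)
    return grouped

def bot_suspicion(tweets_by_user):
    """Return {user: score}; higher scores look more automated."""
    scores = {}
    for user, tweets in tweets_by_user.items():
        if len(tweets) < 10:
            continue  # not enough behavior to judge
        times = sorted(datetime.strptime(t['created_at'], FMT) for t in tweets)
        gaps = [(b - a).total_seconds() for a, b in zip(times, times[1:])]
        mean_gap = sum(gaps) / len(gaps)
        variance = sum((g - mean_gap) ** 2 for g in gaps) / len(gaps)
        unique_ratio = len({t['text'] for t in tweets}) / len(tweets)
        scores[user] = 1.0 / (1.0 + variance) + (1.0 - unique_ratio)
    return scores
```

Accounts with the highest scores would then be the ones we ‘fish’ out for closer analysis, or hand to BotOrNot for a second opinion.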

While the exact list of characterizations is still being developed, we have targeted a few questions which are both theoretically interesting and realistically observable based on the methods of data collection and analysis we have at our disposal.

Demographics: Do communities at different positions on the political spectrum have significantly different proportions of automated accounts? While it has been shown that the number of automated followers differs for certain political candidates, observing the ecosystems of automated accounts that exist for different political camps would yield insight into the social media activity of these ideological groups.

Thresholds: Is there a threshold of trending activity on a hashtag that triggers twitterbot activity? If so, what is that threshold? R-Shief conveniently segregates collected tweets by the time interval in which they were collected, leading us to wonder whether there exists an identifiable threshold of authentic activity at which twitterbots begin to participate heavily. Such a threshold would depend on both the structure of the Twitter social network and specific twitterbot attributes. Additionally, can different thresholds be found for different events?
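Assuming each R-Shief collection interval can be summarized as a total tweet count plus a count of tweets flagged as automated (by BotOrNot or the heuristic above), locating such a threshold might look like the sketch below; the volume floor and jump size are placeholder assumptions.

```python
# A sketch of threshold detection over time-bucketed counts. Each bucket is
# assumed to look like {'interval': '2017-01-21T14:00', 'total': 1200, 'bot': 300};
# min_volume and jump are placeholder parameters.
def find_bot_threshold(buckets, min_volume=500, jump=0.10):
    """Return (interval, total, bot_share) for the first interval where volume
    exceeds min_volume and the bot share rises by more than `jump`."""
    prev_share = None
    for b in sorted(buckets, key=lambda x: x['interval']):
        share = b['bot'] / b['total'] if b['total'] else 0.0
        if (prev_share is not None
                and b['total'] >= min_volume
                and share - prev_share > jump):
            return b['interval'], b['total'], share
        prev_share = share
    return None
```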

Impact: Is there any discernible goal of twitterbots in the social media conversation around the given topic (i.e., dissemination of fake news, targeting of specific people on Twitter, promoting analyses or politics alternative to those of the majority of hashtag users, etc.)? Although these accounts are widely assumed to be harmful to the user experience, little is known about the intentions of those who develop and deploy them. While some accounts simply bolster the activity of those they follow (magnifying the perception of a public figure or idea), others hijack popular hashtags or intentionally spread misinformation.

Project Abstract: Huynh / Moore

Final Project Slides: https://docs.google.com/presentation/d/1gN0y5BoAiCQjCHKCdz-uIiKDFmoRM2ytTTVuuPeeqlQ/edit?usp=sharing

Final Project Code: https://github.com/animekraxe/fake-news-analyzer

Given the contemporary ubiquity of charged terms such as ‘fake news’ and ‘post-truth politics,’ this project seeks to work toward a more useful quantitative metric for measuring the veracity of allegedly factual news stories spread through social media platforms. We acknowledge the inherent politicization of inaccuracies posing as legitimate reporting in a landscape wherein value judgments too often inform factual exposure rather than vice versa, but rather than finding an excuse to dismiss ideological components from our metric, we see an opportunity to fold them into our process of identification and observation. Further, we speculate that in making ‘fake news’ more tangible there must exist additional commonalities, both across the social media users who proliferate these stories and within the content of the stories themselves, that could reinforce the accuracy of an identification metric.

Therefore, this project asks which factors of purportedly factual news stories, and of the users who share them, might in fact serve as reliable indicators of ‘fake news.’ Through consideration of available data from social media sites like Twitter and of previous research and scholarship on online credibility detection, we hope to propose both the most and least effective indicators of inaccurate news stories. Additionally, observations regarding the spread of the stories through user activity will inform a metric for measuring any given user’s propensity for fake-news proliferation. From this research we hope to produce an algorithm that might begin to evaluate news stories and social media users for our stated concerns in real time.
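As an early sketch of the kind of evaluator we have in mind (separate from the code linked above), the snippet below trains a simple text classifier and uses its probability output as a rough suspicion score; the inline headlines and labels are placeholders, not real data.

```python
# An illustrative sketch of a story-level classifier; the training
# examples and labels are placeholders, not a real data-set.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

headlines = [
    "Senate committee releases full transcript of hearing",
    "SHOCKING: celebrity endorsement PROVES the election was rigged!!!",
    "City council approves new transit budget after public comment",
    "You won't BELIEVE what this politician is hiding from you",
]
labels = [0, 1, 0, 1]  # 0 = credible, 1 = suspect (placeholder labels)

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(headlines, labels)

# predict_proba gives a rough "suspicion score" for an unseen headline
print(model.predict_proba(["Officials confirm report on airport delays"])[0][1])
```

A fuller version would fold in user-level features (account age, sharing patterns) alongside the text itself.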

Week 3: Consumer Data and the Social Sciences

The article “What Happens When Big Data Blunders” by Logan Kugler concentrates on the failure of Google researchers using search trends to predict flu outbreaks, identifying these failures as the result of both the researchers’ inability to isolate what should be meaningful indicators of illness (searches about flu symptoms and remedies) from other trendy searches and the difficulty of reconciling the dynamic nature of Google’s search algorithms with assumptions about the search habits of the susceptible population. This article characterizes a common problem within social science research: statistical methods that once struggled to collect enough data are applicable now that digital resources faithfully aggregate copious amounts of information, but these methods often require stable sampling techniques that don’t align with the goals of the application or the consumer’s behavior. In a few words, messy data is as bad as no data. As Kugler notes, Google’s profit-driven business goals don’t align with those of social science researchers, and the data being collected is often skewed by the desire of the application (like Google search) to improve customer experience rather than provide consistency. Finding clever ways to work with data that has been compromised in this way, allowing social scientists to piggy-back their experimental data collection on modern applications, would give businesses ways to profit from selling consumer data and give social scientists ways to utilize the computational resources that have already revolutionized so many other fields.

Week 3 – Undecidable Problems and Big Data Conclusions

In “What is Computable,” MacCormick gives a proof for why a program that can decide whether any other program can crash cannot exist. He relates this to the halting problem and explains that although it is not as important in practice as one might think, it raises important philosophical questions about what computers and people are capable of.
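The shape of that proof is the classic self-reference trick, which can be sketched in a few lines; the code below is our own illustration rather than MacCormick's, and the `crashes` oracle it assumes is exactly the thing the theorem says cannot exist.

```python
# Illustration of the self-referential contradiction behind the proof.
# 'crashes' is a hypothetical perfect oracle; it cannot actually be written.
def crashes(program_source, program_input):
    """Pretend oracle: True iff running program_source on program_input crashes."""
    raise NotImplementedError("no such oracle can exist; that is the theorem")

TROUBLEMAKER_SOURCE = '''
def troublemaker(source):
    if crashes(source, source):      # ask the oracle about ourselves...
        return "fine"                # ...and terminate cleanly, contradicting it
    else:
        raise RuntimeError("crash")  # ...or crash, contradicting it again
'''
# Feeding troublemaker its own source leaves the oracle no consistent answer,
# so the assumed crash-detecting program cannot exist.
```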

In “What Happens When Big Data Blunders,” Logan Kugler explains the reasons that David Lazer and Ryan Kennedy discovered for the failure of Google Flu Trends to predict the 2013 flu outbreak. Furthermore, he discusses reasons for the over-prediction of the spread of Ebola. In both cases, the problem stemmed from making predictions based only on big data while leaving out changing dynamics: in the Google case, the algorithm did not account for changes in the Google search algorithm itself, and in the Ebola case, the CDC and WHO did not account for the initial efforts of people working to contain the disease.

It is an interesting and challenging idea to combine the themes of these two articles. One exercise that comes to mind is coming up with our own theoretical questions about what is possible with big data and whether these questions can be answered. Some questions might be:

  1. Is it possible to determine whether a big data algorithm is sound by some definition of sound? If not, can we derive bounds on acceptable error?
  2. Is it possible to prove that a particular problem cannot be decided by any big data algorithm?

The first question is quite challenging. The goal of a “big data” algorithm is typically to make some prediction given some large quantity of data. To approach this problem, we might imagine solving one of these problems without the aid of a computer. Imagine you were able to think fast enough or live long enough to process all of the data. What are some issues that might arise? Is the data relevant? Is there enough non-overlapping information in the data to arrive at an answer? We would need to answer these questions. The answer to our question involves the relationship between the question, the data itself, and the operations we can perform over the data.

For the second question, we must first decide what it means for a problem to be decidable. Clearly, if we give no data and the problem requires data, it will be possible to prove this. On the other hand, if we supply all of the data about everything, will it then be possible to solve it? This is somewhat philosophical, in fact. If we knew everything, could we predict the future?

Short thoughts week 3

There’s a dual comfort and dismay in the fact of incomputability. Philosophically, it’s humbling to admit that there are simply things that we cannot, and may not ever be able to, solve. Bringing in Stephen Hawking and his 10-billion-year time frame gives perspective on our humanity and our abilities. I recently bought a telescope, and having spent a few weeks looking up at the night sky between the rain lately, I have to admit it’s given my research some helpful perspective.

This contrasts somewhat with the same piece’s discussion of Turing’s “On Computable Numbers” paper and the Church-Turing thesis. Whether human brain capacity can be equivalent to a computer and deep neural network computing is, I think, more than a computability question. I listened to an interesting debate recently between Jaron Lanier and a singularity advocate, whose name I now forget. The idea that the human mind could and will eventually be replicated by a computer seems to me like a bad ending to what had been an otherwise enjoyable sci-fi novel. I don’t know that that need be our end point, or that it is even possible. Jacques Ellul wrote about Technique, and the ever-growing and ever more integral obsession with results, efficiency, and function, and I think there is a healthy space for critique in this area: what’s possible, what’s not, what is lost, and what is outside of our horizon and paradigm.

Then there’s the example of Google and the flu. Here we see that big data can make mistakes and produce badly wrong predictions. But when it comes to other areas, like self-driving cars, or areas where artificial general intelligence or artificial superintelligence may step in as part of the algorithmic fabric of the imminent future, big data failures are inevitable and more deadly. All new technologies fail. A failure at a certain level, however, might be difficult to come back from. There is an interesting discussion of this in Our Final Invention.