When Twitter went public several years ago, its IPO filings made a great deal of information about the company public as well. Most relevant to this discussion was the surprising revelation that much of Twitter's activity is generated by automated accounts, or "twitterbots". In 2014, Twitter admitted that as many as 23 million of its users, or 8.5%, were fake. For our project, we are curious about the prevalence and influence of automated accounts, i.e., those whose content generation falls outside human-like patterns, in political events and narratives with a large presence on Twitter.
We propose using R-Shief to sample Twitter activity during a popular political event, such as last month's Women's March in Washington, DC and around the country, or possibly an upcoming event, such as Donald Trump's Supreme Court nomination process. With the help of pre-existing machine learning-based classification software, BotOrNot, we can make reasonably confident guesses as to whether the Twitter handle behind a given tweet is an automated account, one of the so-called "twitterbots". While twitterbots can be identified nearly as efficiently through sampling, we are also interested in more nuanced aspects of their behavior, which creates a need to 'fish' for a large number of twitterbots on which to perform further analysis. With these automated techniques for gathering twitterbots, we hope to characterize twitterbot activity during political events and present our findings visually, using tools provided by the R-Shief software and others.
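As a minimal sketch of the classification step, assume each sampled handle has already received a BotOrNot-style score in [0, 1] (in practice the score would come from the BotOrNot service); flagging likely twitterbots then reduces to applying a cutoff. The 0.7 cutoff and the handles below are illustrative, not calibrated values.

```python
# Sketch: flagging likely twitterbots from BotOrNot-style scores.
# Assumption: scores are values in [0, 1] per handle, as returned by a
# BotOrNot-style classifier; 0.7 is an illustrative cutoff, not a tuned one.

BOT_SCORE_CUTOFF = 0.7  # illustrative threshold; would need tuning

def flag_bots(scores):
    """Return handles whose bot score meets or exceeds the cutoff.

    scores: dict mapping Twitter handle -> bot score in [0, 1].
    """
    return sorted(h for h, s in scores.items() if s >= BOT_SCORE_CUTOFF)

# Example with made-up handles and scores:
sample = {"@newsfeed_42": 0.91, "@jane_doe": 0.12, "@retweet_mill": 0.78}
print(flag_bots(sample))  # ['@newsfeed_42', '@retweet_mill']
```

Thresholding the score, rather than trusting a hard bot/human label, lets us trade precision against recall when 'fishing' for large numbers of twitterbots.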
While the exact list of characterizations is still being developed, we have targeted a few questions which are both theoretically interesting and realistically observable based on the methods of data collection and analysis we have at our disposal.
Demographics: Do communities at different positions on the political spectrum have significantly different proportions of automated accounts? While it has been shown that the number of automated followers differs among political candidates, observing the ecosystems of automated accounts that exist within different political camps would yield insight into the social media activity of these ideological groups.
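The demographic comparison above could be checked with a standard two-proportion z-test on flagged-account counts per community; the counts below are made up purely to show the shape of the calculation.

```python
# Sketch: comparing bot proportions across two political communities.
# Hypothetical inputs: per-community counts of accounts flagged as
# automated vs. total sampled accounts. This is a standard two-proportion
# z-test, shown only to illustrate how the comparison could be made.
from math import sqrt

def two_proportion_z(bots_a, total_a, bots_b, total_b):
    """z statistic for H0: both communities share one bot proportion."""
    p_a, p_b = bots_a / total_a, bots_b / total_b
    pooled = (bots_a + bots_b) / (total_a + total_b)
    se = sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    return (p_a - p_b) / se

# Made-up counts: 120 bots of 1000 sampled vs. 80 of 1000.
print(round(two_proportion_z(120, 1000, 80, 1000), 2))  # 2.98
```

A |z| around 2 or more would suggest the difference in bot proportions is unlikely to be sampling noise, though real samples from Twitter would need care about independence and sampling bias.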
Thresholds: Is there a threshold of trending activity on a hashtag that triggers twitterbot activity, and if so, what is it? R-Shief conveniently bins collected tweets by the time interval in which they were collected, leading us to ask whether there is an identifiable level of authentic activity at which twitterbots begin to participate heavily. Any such threshold would depend on both the structure of the Twitter social network and the attributes of the specific twitterbots. Additionally, do different events exhibit different thresholds?
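One way to operationalize this question, assuming R-Shief-style time bins with separate counts of authentic and flagged-account tweets, is to report the authentic volume at the first bin where the bot share of traffic exceeds some level. The bin data and the 20% share below are hypothetical.

```python
# Sketch: estimating a trending-activity threshold for bot participation.
# Hypothetical inputs: time bins, each with a count of authentic tweets
# and of tweets from flagged accounts.

def activity_threshold(bins, bot_share=0.2):
    """Return the authentic tweet volume at the first bin whose bot share
    exceeds `bot_share`, or None if bots never reach that share.

    bins: list of (authentic_count, bot_count) per time interval.
    """
    for authentic, bots in bins:
        total = authentic + bots
        if total and bots / total > bot_share:
            return authentic
    return None

# Made-up hourly bins: bots ramp up once authentic volume passes ~500.
hourly = [(120, 5), (340, 30), (560, 190), (900, 410)]
print(activity_threshold(hourly))  # 560
```

Running the same estimate over several events would let us compare thresholds across events, per the last question above.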
Impact: Is there any discernible goal behind twitterbot participation in the social media conversation around a given topic (e.g., dissemination of fake news, targeting of specific people on Twitter, or promotion of analyses or politics that diverge from those of the majority of a hashtag's users)? Although these accounts are widely assumed to be harmful to the user experience, little is known about the intentions of those who develop and deploy them. While some accounts simply bolster the activity of those they follow (magnifying the perception of a public figure or idea), others hijack popular hashtags or intentionally spread misinformation.