class: center, middle, inverse, title-slide

# Conservatives and Twitter
## Automated Accounts in Partisan User Networks on Twitter
### @kearneymw
### @mkearney
---
class: center, middle

Presentation created using **{xaringan}** (the **robot** theme)

Slides available at [mkearney.github.io/utchapter/](https://mkearney.github.io/dsa_execweek_talk/)

---
background-size: 150px auto
background-position: 490px 185px, 567px 320px, 644px 185px, 721px 320px, 644px 455px, 567px 50px
background-image: url(../../img/chr-logo.png), url(../../img/hexagon-logo.png), url(../../img/textfeatures-logo.png), url(../../img/tfse-logo.png), url(../../img/botrnot-logo.png), url(../../img/rtweet-logo.svg)

# About Me

**Data Science**

- Twitter APIs: [**{rtweet}**](https://cran.r-project.org/package=rtweet)
- Text analysis: [**{textfeatures}**](https://cran.r-project.org/package=textfeatures) [**{chr}**](https://github.com/mkearney/chr/)
- Data wrangling/viz: [**{hexagon}**](https://github.com/mkearney/hexagon/) [**{tfse}**](https://github.com/mkearney/tfse/)
- Machine learning: [**{botrnot}**](https://github.com/mkearney/botrnot/)

**Shiny apps**

- Web app interface for [botrnot](https://mikewk.shinyapps.io/botornot/)
- [Interactive friendship tool](https://mikewk.shinyapps.io/friendship/)

---
# Overview

+ Concerning trend of automated behaviors on social media
  - How can we accurately and dynamically classify bots on social media?
+ New method and **demonstration**
  - A new, fast, and flexible approach to training models to detect social media bots
  - Tested on partisan Twitter networks identified during the 2016 election
+ **Future** directions
  - Things to consider moving forward

---
# 2016 election

Concerns about **automated accounts, or bots, on social media** manipulating public opinion reached a fever pitch during the 2016 general election.

The most alarming concerns related to behaviors of **Kremlin-linked bot accounts**...
+ pushing **fake news** stories
+ amplifying **social divisions**
+ misrepresenting **public opinion**

---
# Hypothesis

### Conservative user networks will follow more bot accounts than will liberal and moderate user networks.

---
class: inverse, center, middle

# The data

---
# Source accounts

3,000 users were sampled from 12 different source accounts, representing 3 different partisan groups

+ Democrats
+ Republicans
+ Moderates (tuned out/entertainment)

---
# Partisan groups

**Democrats**

+ HuffPostPol
+ paulkrugman
+ maddow
+ Salone

**Republicans**

+ seanhannity
+ SarahPalinUSA
+ DRUDGE_REPORT
+ FoxNewsPolitics

**Moderates**

+ AmericanIdol
+ SInow
+ survivorcbs
+ AMC_TV

---
class: inverse, center, middle

# Detecting Twitter bots

---
# Current approach

Creators of [**botometer**](https://botometer.iuni.iu.edu/#!/) maintain a list of open-source academic studies that identified bots

Otherwise, current approaches have major **limitations**:

+ Academic research moves slowly (especially as Twitter continues banning bots)
+ Changes in the definition(s) and paper trail of "bots"
+ Relatively small number of publicly classified bot accounts
+ Labor-intensive methods (human coding)

---
# New approach

Leverage the small number of academic classifications using an easy-to-automate method that takes advantage of naturally occurring human coding

1. Select a handful of previously **validated** bots
1. Look up the public **Twitter lists** that include those bots
1. Identify the **names of Twitter lists** that self-identify as bot lists
   - Perform validity checks on other accounts in each list

---
class: inverse, center, middle

# botrnot

---
# Initial tests

+ The **default model** was 93.53% accurate when classifying bots and 95.32% accurate when classifying non-bots
+ The **fast model** was 91.78% accurate when classifying bots and 92.61% accurate when classifying non-bots

Overall...
+ The **default model** was correct 93.8% of the time
+ The **fast model** was correct 91.9% of the time

---
# Applications

+ Packaged as an R library, [**{botrnot}**](https://github.com/mkearney/botrnot)---coming soon to CRAN!
+ Exported as a web app at [mikewk.shinyapps.io/botornot](https://mikewk.shinyapps.io/botornot)

---
class: inverse, center, middle

# Results

---
# Hypothesis

### Conservative user networks will follow more bot accounts than liberal or moderate networks.

---
# Analysis

**\#DataViz**

+ Source accounts
+ Partisan groups

**Statistical test**

+ Generalized linear model

---
background-image: url(../../img/source-accounts.png)
background-position: 50% 50%
background-size: auto 90%

---
background-image: url(../../img/source-partisan.png)
background-position: 50% 50%
background-size: auto 90%

---
<p align="center"><strong>Table 1</strong></p> <table> <caption style="text-align:left; font-style:italic;">Quasibinomial models predicting Twitter bot probabilities</caption> <thead> <tr> <th style="text-align:left;"> Predictor </th> <th style="text-align:left;"> M1 </th> <th style="text-align:left;"> M2 </th> <th style="text-align:left;"> M3 </th> <th style="text-align:left;"> M4 </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> (Intercept) </td> <td style="text-align:left;"> 0.93*** </td> <td style="text-align:left;"> 0.87*** </td> <td style="text-align:left;"> 0.87*** </td> <td style="text-align:left;"> 0.77*** </td> </tr> <tr> <td style="text-align:left;"> </td> <td style="text-align:left;"> (0.02) </td> <td style="text-align:left;"> (0.02) </td> <td style="text-align:left;"> (0.03) </td> <td style="text-align:left;"> (0.03) </td> </tr> <tr> <td style="text-align:left;"> Account age </td> <td style="text-align:left;"> -0.31*** </td> <td style="text-align:left;"> -0.30*** </td> <td style="text-align:left;"> -0.30*** </td> <td style="text-align:left;"> -0.29*** </td> </tr> <tr> <td
style="text-align:left;"> (0.00) </td> <td style="text-align:left;"> (0.00) </td> <td style="text-align:left;"> (0.00) </td> <td style="text-align:left;"> (0.00) </td> </tr> <tr> <td style="text-align:left;"> Statuses </td> <td style="text-align:left;"> -0.06*** </td> <td style="text-align:left;"> -0.08*** </td> <td style="text-align:left;"> -0.10*** </td> <td style="text-align:left;"> -0.19*** </td> </tr> <tr> <td style="text-align:left;"> </td> <td style="text-align:left;"> (0.01) </td> <td style="text-align:left;"> (0.01) </td> <td style="text-align:left;"> (0.02) </td> <td style="text-align:left;"> (0.02) </td> </tr> <tr> <td style="text-align:left;"> Favorites </td> <td style="text-align:left;"> -0.47*** </td> <td style="text-align:left;"> -0.46*** </td> <td style="text-align:left;"> -0.46*** </td> <td style="text-align:left;"> -0.47*** </td> </tr> <tr> <td style="text-align:left;"> </td> <td style="text-align:left;"> (0.01) </td> <td style="text-align:left;"> (0.01) </td> <td style="text-align:left;"> (0.01) </td> <td style="text-align:left;"> (0.01) </td> </tr> <tr> <td style="text-align:left;"> Followers </td> <td style="text-align:left;"> 0.18*** </td> <td style="text-align:left;"> 0.25*** </td> <td style="text-align:left;"> 0.25*** </td> <td style="text-align:left;"> 0.22*** </td> </tr> <tr> <td style="text-align:left;"> </td> <td style="text-align:left;"> (0.01) </td> <td style="text-align:left;"> (0.01) </td> <td style="text-align:left;"> (0.01) </td> <td style="text-align:left;"> (0.01) </td> </tr> <tr> <td style="text-align:left;"> Friends </td> <td style="text-align:left;"> 0.33*** </td> <td style="text-align:left;"> 0.32*** </td> <td style="text-align:left;"> 0.32*** </td> <td style="text-align:left;"> 0.39*** </td> </tr> <tr> <td style="text-align:left;"> </td> <td style="text-align:left;"> (0.01) </td> <td style="text-align:left;"> (0.01) </td> <td style="text-align:left;"> (0.01) </td> <td style="text-align:left;"> (0.01) </td> </tr> <tr> <td 
style="text-align:left;" bgcolor="#D0FF00"><strong>Partisan - Moderate</strong></td> <td style="text-align:left;" bgcolor="#D0FF00"> . </td> <td style="text-align:left;" bgcolor="#D0FF00"><strong>-0.13***</strong></td> <td style="text-align:left;" bgcolor="#D0FF00"><strong>-0.13***</strong></td> <td style="text-align:left;" bgcolor="#D0FF00"><strong>-0.11***</strong></td> </tr> <tr> <td style="text-align:left;"> </td> <td style="text-align:left;"> </td> <td style="text-align:left;" bgcolor="#D0FF00"><strong>(0.02)</strong></td> <td style="text-align:left;" bgcolor="#D0FF00"><strong>(0.02)</strong></td> <td style="text-align:left;" bgcolor="#D0FF00"><strong>(0.02)</strong></td> </tr> <tr> <td style="text-align:left;" bgcolor="#D0FF00"><strong>Partisan - Republican</strong></td> <td style="text-align:left;" bgcolor="#D0FF00"> . </td> <td style="text-align:left;" bgcolor="#D0FF00"><strong>0.29***</strong></td> <td style="text-align:left;" bgcolor="#D0FF00"><strong>0.29***</strong></td> <td style="text-align:left;" bgcolor="#D0FF00"><strong>0.26***</strong></td> </tr> <tr> <td style="text-align:left;"> </td> <td style="text-align:left;"> </td> <td style="text-align:left;" bgcolor="#D0FF00"><strong>(0.02)</strong></td> <td style="text-align:left;" bgcolor="#D0FF00"><strong>(0.02)</strong></td> <td style="text-align:left;" bgcolor="#D0FF00"><strong>(0.02)</strong></td> </tr> <tr> <td style="text-align:left;"> Account age * Statuses </td> <td style="text-align:left;"> . </td> <td style="text-align:left;"> . </td> <td style="text-align:left;"> 0.00 </td> <td style="text-align:left;"> 0.02*** </td> </tr> <tr> <td style="text-align:left;"> </td> <td style="text-align:left;"> </td> <td style="text-align:left;"> </td> <td style="text-align:left;"> (0.00) </td> <td style="text-align:left;"> (0.00) </td> </tr> <tr> <td style="text-align:left;"> Followers * Friends </td> <td style="text-align:left;"> . </td> <td style="text-align:left;"> . </td> <td style="text-align:left;"> . 
</td> <td style="text-align:left;"> -0.22*** </td> </tr> <tr> <td style="text-align:left;"> </td> <td style="text-align:left;"> </td> <td style="text-align:left;"> </td> <td style="text-align:left;"> </td> <td style="text-align:left;"> (0.01) </td> </tr> <tr> <td style="text-align:left;"> N </td> <td style="text-align:left;"> 53284 </td> <td style="text-align:left;"> 53284 </td> <td style="text-align:left;"> 53284 </td> <td style="text-align:left;"> 53284 </td> </tr> <tr> <td style="text-align:left;"> Deviance </td> <td style="text-align:left;"> 25629.88 </td> <td style="text-align:left;"> 25382.36 </td> <td style="text-align:left;"> 25381.35 </td> <td style="text-align:left;"> 24975.48 </td> </tr> <tr> <td style="text-align:left;"> χ2 </td> <td style="text-align:left;"> 6917.32*** </td> <td style="text-align:left;"> 7164.84*** </td> <td style="text-align:left;"> 7165.85*** </td> <td style="text-align:left;"> 7571.72*** </td> </tr> </tbody> </table> <p style="font-style:italic; font-size:15px;"> * p < .05; ** p < .01; *** p < .001 </p>

---
# Findings

+ Empirical evidence of conservatives being more likely to follow "fake" Twitter accounts
+ Decentralization of conservative networks [compared to liberal and entertainment ones] plays a role
+ Results not overwhelming; these kinds of patterns can be meaningful, but they are fleeting, shifting over time and across technologies/platforms

---
# Future directions

Between news stories about Kremlin-linked bots on social media and reactions to false positives from **botrnot**, I'm starting to think...

The real challenge is not **identifying** social media bots but instead **defining** what it means to be a *bot* on social media

Not in an existential or even philosophical sense...but in a, like, "know it when I see it" sense

---
class: inverse, center, middle

# The end