Conservatives and Twitter

class: center, middle, inverse, title-slide

# Conservatives and Twitter
## Automated Accounts in Partisan User Networks on Twitter
### <table style="border-style:none; padding: 0 0 50px 0; background-color:transparent" class="table">
<tr style="background-color:transparent">
<th style="padding:0 30px">
<a href="https://twitter.com/kearneymw"> @kearneymw </a>
</th>
<th style="padding:0 30px">
<a href="https://github.com/mkearney"> @mkearney </a>
</th>
</tr>
</table>

---

class: center, middle

Presentation created using **{xaringan}** (the **robot** theme)

Slides available at [mkearney.github.io/utchapter/](https://mkearney.github.io/dsa_execweek_talk/)

---

background-size: 150px auto
background-position: 490px 185px, 567px 320px, 644px 185px, 721px 320px, 644px 455px, 567px 50px
background-image: url(../../img/chr-logo.png), url(../../img/hexagon-logo.png), url(../../img/textfeatures-logo.png), url(../../img/tfse-logo.png), url(../../img/botrnot-logo.png), url(../../img/rtweet-logo.svg)

# About Me

**Data Science**

- Twitter APIs: [**{rtweet}**](https://cran.r-project.org/package=rtweet)
- Text analysis: [**{textfeatures}**](https://cran.r-project.org/package=textfeatures) [**{chr}**](https://github.com/mkearney/chr/)
- Data wrangling/viz: [**{hexagon}**](https://github.com/mkearney/hexagon/) [**{tfse}**](https://github.com/mkearney/tfse/)
- Machine learning: [**{botrnot}**](https://github.com/mkearney/botrnot/)

**Shiny apps**

- Web app interface for [botrnot](https://mikewk.shinyapps.io/botornot/)
- [Interactive friendship tool](https://mikewk.shinyapps.io/friendship/)

---

# Overview
+ Concerning trend of automated behaviors on social media
   - How can we accurately and dynamically classify bots on social media?

+ New method and **demonstration**
   - A new, fast, and flexible approach to training models to detect social media bots
   - Tested on partisan Twitter networks identified during the 2016 election

+ **Future** directions
   - Things to consider moving forward

---

# 2016 election

Concerns about **automated accounts, or bots, on social media** manipulating public opinion reached a fever pitch during the 2016 general election.
The most alarming concerns related to the behavior of **Kremlin-linked bot accounts** ...
+ pushing **fake news** stories

+ amplifying **social divisions**

+ misrepresenting **public opinion**

---

# Hypothesis

### Conservative user networks will follow more bot accounts than will liberal and moderate user networks.

---
class: inverse, center, middle

# The data

---

# Source accounts

3,000 users were sampled from 12 different source accounts, representing 3 different partisan groups

+ Democrats
+ Republicans
+ Moderates (tuned out/entertainment)

---

# Partisan groups

**Democrats**

+ HuffPostPol
+ paulkrugman
+ maddow
+ Salon

**Republicans**

+ seanhannity
+ SarahPalinUSA
+ DRUDGE_REPORT
+ FoxNewsPolitics

**Moderates**

+ AmericanIdol
+ SInow
+ survivorcbs
+ AMC_TV

---
class: inverse, center, middle

# Detecting Twitter bots

---

# Current approach

Creators of [**botometer**](https://botometer.iuni.iu.edu/#!/) maintain a list of open-source academic studies that identified bots

Otherwise, current approaches have major **limitations**:
+ Academic research moves slowly (especially as Twitter continues banning bots)
+ Changes in the definition(s) and paper trail of "bots"
+ Relatively small number of publicly classified bot accounts
+ Labor intensive methods (human coding)

---

# New approach

Leverage a small number of academic classifications with an easy-to-automate method that takes advantage of naturally occurring human coding

1. Select a handful of previously **validated** bots
1. Look up the public **Twitter lists** that include those bots
1. Identify the **names of Twitter lists** that self-identify as bot lists
   - Perform validity checks on other accounts in each list
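
The three steps can be sketched in a few lines. This is a minimal illustration, not the botrnot implementation: the list data, the account names, and the `looks_like_bot_list` keyword rule are hypothetical stand-ins for what would come from the Twitter API.

```python
import re

# Hypothetical data: public Twitter lists (name -> member screen names).
# In practice these memberships would be fetched from the Twitter API.
TWITTER_LISTS = {
    "Twitter bots": {"bot_a", "bot_b", "bot_c"},
    "my favorite bots": {"bot_a", "bot_d"},
    "news sources": {"bot_b", "reporter_x"},
    "friends": {"person_y", "person_z"},
}

# Step 1: a handful of previously validated bots (hypothetical names).
VALIDATED_BOTS = {"bot_a", "bot_b"}


def looks_like_bot_list(name):
    """Assumed rule: the list name contains the word 'bot' or 'bots'."""
    return re.search(r"\bbots?\b", name, flags=re.IGNORECASE) is not None


def candidate_bots(lists, seeds):
    """Return accounts that share a self-described bot list with a seed bot."""
    candidates = set()
    for name, members in lists.items():
        # Step 2: keep only lists that include at least one validated bot.
        if not members & seeds:
            continue
        # Step 3: keep only lists whose name self-identifies as a bot list.
        if looks_like_bot_list(name):
            candidates |= members - seeds
    return candidates


print(sorted(candidate_bots(TWITTER_LISTS, VALIDATED_BOTS)))
```

Accounts returned this way are only candidates; they would still need the validity checks noted above before being used as training labels.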

---
class: inverse, center, middle

# botrnot

---

# Initial tests

+ The **default model** was 93.53% accurate when classifying bots and 95.32% accurate when classifying non-bots

+ The **fast model** was 91.78% accurate when classifying bots and 92.61% accurate when classifying non-bots

Overall...

+ The **default model** was correct 93.8% of the time

+ The **fast model** was correct 91.9% of the time
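
The per-class and overall figures are related: overall accuracy is the per-class accuracies weighted by how many bots and non-bots are in the test set. A minimal sketch, using hypothetical counts rather than the botrnot test data:

```python
def accuracy_summary(true_bots, missed_bots, true_humans, missed_humans):
    """Compute per-class and overall accuracy from confusion-matrix counts.

    true_bots:     bots correctly classified as bots
    missed_bots:   bots misclassified as non-bots
    true_humans:   non-bots correctly classified as non-bots
    missed_humans: non-bots misclassified as bots
    """
    n_bots = true_bots + missed_bots
    n_humans = true_humans + missed_humans
    return {
        "bot_accuracy": true_bots / n_bots,
        "non_bot_accuracy": true_humans / n_humans,
        "overall_accuracy": (true_bots + true_humans) / (n_bots + n_humans),
    }


# Hypothetical counts chosen only for illustration.
summary = accuracy_summary(true_bots=90, missed_bots=10,
                           true_humans=190, missed_humans=10)
for name, value in summary.items():
    print(f"{name}: {value:.1%}")
```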

---

# Applications

+ Packaged as an R package, [**{botrnot}**](https://github.com/mkearney/botrnot)---coming soon to CRAN!

+ Exported as a web app at [mikewk.shinyapps.io/botornot](https://mikewk.shinyapps.io/botornot)

---
class: inverse, center, middle

# Results

---

# Hypothesis

### Conservative user networks will follow more bot accounts than liberal or moderate networks.

---

# Analysis

**\#DataViz**

+ Source accounts
+ Partisan groups

**Statistical test**

+ Generalized linear model

---
background-image: url(../../img/source-accounts.png)
background-position: 50% 50%
background-size: auto 90%

---
background-image: url(../../img/source-partisan.png)
background-position: 50% 50%
background-size: auto 90%

---

Table 1

<table>
<caption style="text-align:left; font-style:italic;">Quasibinomial models predicting Twitter bot probabilities</caption>
 <thead>
 <tr>
 <th style="text-align:left;"> Predictor </th>
 <th style="text-align:left;"> M1 </th>
 <th style="text-align:left;"> M2 </th>
 <th style="text-align:left;"> M3 </th>
 <th style="text-align:left;"> M4 </th>
 </tr>
 </thead>
<tbody>
 <tr>
 <td style="text-align:left;"> (Intercept) </td>
 <td style="text-align:left;"> 0.93*** </td>
 <td style="text-align:left;"> 0.87*** </td>
 <td style="text-align:left;"> 0.87*** </td>
 <td style="text-align:left;"> 0.77*** </td>
 </tr>
 <tr>
 <td style="text-align:left;"> </td>
 <td style="text-align:left;"> (0.02) </td>
 <td style="text-align:left;"> (0.02) </td>
 <td style="text-align:left;"> (0.03) </td>
 <td style="text-align:left;"> (0.03) </td>
 </tr>
 <tr>
 <td style="text-align:left;"> Account age </td>
 <td style="text-align:left;"> -0.31*** </td>
 <td style="text-align:left;"> -0.30*** </td>
 <td style="text-align:left;"> -0.30*** </td>
 <td style="text-align:left;"> -0.29*** </td>
 </tr>
 <tr>
 <td style="text-align:left;"> </td>
 <td style="text-align:left;"> (0.00) </td>
 <td style="text-align:left;"> (0.00) </td>
 <td style="text-align:left;"> (0.00) </td>
 <td style="text-align:left;"> (0.00) </td>
 </tr>
 <tr>
 <td style="text-align:left;"> Statuses </td>
 <td style="text-align:left;"> -0.06*** </td>
 <td style="text-align:left;"> -0.08*** </td>
 <td style="text-align:left;"> -0.10*** </td>
 <td style="text-align:left;"> -0.19*** </td>
 </tr>
 <tr>
 <td style="text-align:left;"> </td>
 <td style="text-align:left;"> (0.01) </td>
 <td style="text-align:left;"> (0.01) </td>
 <td style="text-align:left;"> (0.02) </td>
 <td style="text-align:left;"> (0.02) </td>
 </tr>
 <tr>
 <td style="text-align:left;"> Favorites </td>
 <td style="text-align:left;"> -0.47*** </td>
 <td style="text-align:left;"> -0.46*** </td>
 <td style="text-align:left;"> -0.46*** </td>
 <td style="text-align:left;"> -0.47*** </td>
 </tr>
 <tr>
 <td style="text-align:left;"> </td>
 <td style="text-align:left;"> (0.01) </td>
 <td style="text-align:left;"> (0.01) </td>
 <td style="text-align:left;"> (0.01) </td>
 <td style="text-align:left;"> (0.01) </td>
 </tr>
 <tr>
 <td style="text-align:left;"> Followers </td>
 <td style="text-align:left;"> 0.18*** </td>
 <td style="text-align:left;"> 0.25*** </td>
 <td style="text-align:left;"> 0.25*** </td>
 <td style="text-align:left;"> 0.22*** </td>
 </tr>
 <tr>
 <td style="text-align:left;"> </td>
 <td style="text-align:left;"> (0.01) </td>
 <td style="text-align:left;"> (0.01) </td>
 <td style="text-align:left;"> (0.01) </td>
 <td style="text-align:left;"> (0.01) </td>
 </tr>
 <tr>
 <td style="text-align:left;"> Friends </td>
 <td style="text-align:left;"> 0.33*** </td>
 <td style="text-align:left;"> 0.32*** </td>
 <td style="text-align:left;"> 0.32*** </td>
 <td style="text-align:left;"> 0.39*** </td>
 </tr>
 <tr>
 <td style="text-align:left;"> </td>
 <td style="text-align:left;"> (0.01) </td>
 <td style="text-align:left;"> (0.01) </td>
 <td style="text-align:left;"> (0.01) </td>
 <td style="text-align:left;"> (0.01) </td>
 </tr>
 <tr>
 <td style="text-align:left;" bgcolor="#D0FF00">Partisan - Moderate</td>
 <td style="text-align:left;" bgcolor="#D0FF00"> . </td>
 <td style="text-align:left;" bgcolor="#D0FF00">-0.13***</td>
 <td style="text-align:left;" bgcolor="#D0FF00">-0.13***</td>
 <td style="text-align:left;" bgcolor="#D0FF00">-0.11***</td>
 </tr>
 <tr>
 <td style="text-align:left;"> </td>
 <td style="text-align:left;"> </td>
 <td style="text-align:left;" bgcolor="#D0FF00">(0.02)</td>
 <td style="text-align:left;" bgcolor="#D0FF00">(0.02)</td>
 <td style="text-align:left;" bgcolor="#D0FF00">(0.02)</td>
 </tr>
 <tr>
 <td style="text-align:left;" bgcolor="#D0FF00">Partisan - Republican</td>
 <td style="text-align:left;" bgcolor="#D0FF00"> . </td>
 <td style="text-align:left;" bgcolor="#D0FF00">0.29***</td>
 <td style="text-align:left;" bgcolor="#D0FF00">0.29***</td>
 <td style="text-align:left;" bgcolor="#D0FF00">0.26***</td>
 </tr>
 <tr>
 <td style="text-align:left;"> </td>
 <td style="text-align:left;"> </td>
 <td style="text-align:left;" bgcolor="#D0FF00">(0.02)</td>
 <td style="text-align:left;" bgcolor="#D0FF00">(0.02)</td>
 <td style="text-align:left;" bgcolor="#D0FF00">(0.02)</td>
 </tr>
 <tr>
 <td style="text-align:left;"> Account age * Statuses </td>
 <td style="text-align:left;"> . </td>
 <td style="text-align:left;"> . </td>
 <td style="text-align:left;"> 0.00 </td>
 <td style="text-align:left;"> 0.02*** </td>
 </tr>
 <tr>
 <td style="text-align:left;"> </td>
 <td style="text-align:left;"> </td>
 <td style="text-align:left;"> </td>
 <td style="text-align:left;"> (0.00) </td>
 <td style="text-align:left;"> (0.00) </td>
 </tr>
 <tr>
 <td style="text-align:left;"> Followers * Friends </td>
 <td style="text-align:left;"> . </td>
 <td style="text-align:left;"> . </td>
 <td style="text-align:left;"> . </td>
 <td style="text-align:left;"> -0.22*** </td>
 </tr>
 <tr>
 <td style="text-align:left;"> </td>
 <td style="text-align:left;"> </td>
 <td style="text-align:left;"> </td>
 <td style="text-align:left;"> </td>
 <td style="text-align:left;"> (0.01) </td>
 </tr>
 <tr>
 <td style="text-align:left;"> N </td>
 <td style="text-align:left;"> 53284 </td>
 <td style="text-align:left;"> 53284 </td>
 <td style="text-align:left;"> 53284 </td>
 <td style="text-align:left;"> 53284 </td>
 </tr>
 <tr>
 <td style="text-align:left;"> Deviance </td>
 <td style="text-align:left;"> 25629.88 </td>
 <td style="text-align:left;"> 25382.36 </td>
 <td style="text-align:left;"> 25381.35 </td>
 <td style="text-align:left;"> 24975.48 </td>
 </tr>
 <tr>
 <td style="text-align:left;"> χ<sup>2</sup> </td>
 <td style="text-align:left;"> 6917.32*** </td>
 <td style="text-align:left;"> 7164.84*** </td>
 <td style="text-align:left;"> 7165.85*** </td>
 <td style="text-align:left;"> 7571.72*** </td>
 </tr>
</tbody>
</table>

\* p < .05; \*\* p < .01; \*\*\* p < .001
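
One way to read the highlighted rows, assuming the models use the logit link (the default for the quasibinomial family in R's `glm()`) and that Democrats are the omitted reference group: exponentiating a coefficient gives a multiplicative change in the odds. The helper below is a sketch for illustration; only the coefficients come from Table 1.

```python
import math

# Partisan coefficients from Table 1 (M2 and M4 columns).
# Assumption: logit link, with Democrats as the reference category.
COEFFICIENTS = {
    "M2": {"Moderate": -0.13, "Republican": 0.29},
    "M4": {"Moderate": -0.11, "Republican": 0.26},
}


def odds_ratio(coefficient):
    """Convert a logit-scale coefficient to an odds ratio."""
    return math.exp(coefficient)


for model, terms in COEFFICIENTS.items():
    for group, beta in terms.items():
        print(f"{model} {group}: odds ratio = {odds_ratio(beta):.2f}")
```

Under those assumptions, the Republican coefficient in M4 corresponds to odds roughly 1.3 times those of the reference group, holding the other predictors constant.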

---

# Findings

+ Empirical evidence of conservatives being more likely to follow "fake" Twitter accounts

+ Decentralization of conservative networks [compared to liberal and entertainment ones] plays a role

+ Results are not overwhelming; patterns like these can be meaningful, but they shift over time and across technologies and platforms

---

# Future directions

Between news stories about Kremlin-linked bots on social media and reactions to false positives from **botrnot**, I'm starting to think...

The real challenge is not **identifying** social media bots but instead **defining** what it means to be a *bot* on social media

Not in an existential or even philosophical sense...but, in a, like, "know it when I see it" sense

---
class: inverse, center, middle

# The end