The University of Sheffield
Department of Computer Science

James Mcilveen Undergraduate Dissertation 2014/15

Identifying Social Bots on Twitter and Analysing Their Behavior

Supervised by F. Ciravegna

Abstract

Online social networks (OSNs) have become increasingly commonplace since the early 2000s, with many different kinds of OSNs being created since then. Twitter is a microblogging platform that has grown in popularity since its inception in 2006.

It is unlike most other OSNs in two main ways. Firstly, it is a very open platform, with the majority of profiles publicly shared so that anyone can see content that has been posted. Secondly, it is not unusual for a user to interact with accounts belonging to strangers.

Since Twitter’s creation, developers have been able to utilise Twitter’s API (and, more controversially, web automation), which is designed to be very accessible and to allow content to be programmatically retrieved from the OSN. This has led to both cyborg accounts (partially computer controlled, partially human controlled) and completely autonomous accounts (bots) becoming part of the Twitter ecosystem alongside regular users.

Twitter permits this because bots can be genuinely useful: they can update the platform in real time with content from many sources of information, such as RSS news feeds or stock prices.

However, some bots are made to act like humans so that they can stealthily gain advantageous positions within Twitter’s social graph. They do this in order to perform certain tasks, including advertising, spreading grassroots propaganda (astroturfing), or simply boosting follower counts. These bots are known as social bots, and along with spam bots (which do not imitate humans) they can degrade the quality of a user’s experience on Twitter.

Social bots tend to go through several phases throughout their life cycle. They usually start with an initial network building stage, in which they build up social links with other bots within a bot network to gain credibility as a genuine Twitter user. They then typically move into an infiltration stage, in which they use their credible position within the social graph to build links with genuine Twitter users while simultaneously cutting links with their bot network. Lastly, they enter their attack phase, in which they try to manipulate genuine Twitter users in some way.

Social bot campaigns can be particularly effective because humans often find it difficult to distinguish between real Twitter users and autonomous accounts. Social bots often rely on social engineering tactics to engage people, with the goal of influencing those users’ actions or opinions on a particular topic.

This study will look at identifying accounts that are suspected of being social bots, and attempt to recognise any underlying behavioural patterns they exhibit. The aim is to be able to correctly identify these characteristics in close to real time by examining account features.