Wednesday, 29 March 2017

Who uses Twitter?

Luke Sloan is a Senior Lecturer in Quantitative Methods and Deputy Director of the Social Data Science Lab at the School of Social Sciences, Cardiff University, UK. Luke has worked on a range of projects investigating the use of Twitter data for understanding social phenomena covering topics such as election prediction, tracking (mis)information propagation during food scares and ‘crime-sensing’. His research focuses on the development of demographic proxies for Twitter data to further understand who uses the platform and increase the utility of such data for the social sciences. He sits as an expert member on the Social Media Analytics Review and Information Group (SMARIG) which brings together academics and government agencies. @drlukesloan

It’s a simple question, but one that is tricky to answer. We all think we know the types of people who use Twitter – the urban elite, celebrities, professionals, young people… but providing an empirical account is challenging and without knowing who tweets we can’t even start a conversation about representativeness and bias. To understand how the social world manifests in the virtual we need to know who is present or underrepresented.

Much work has been done on using Twitter metadata to estimate proxy demographics for UK users such as gender (Sloan et al. 2013) and age, occupation and social class (Sloan et al. 2015), but these methods rely on people self-reporting a first name, an age or date of birth and an occupation to classify. The question has always been whether certain groups, such as older people and those from certain occupations, are less likely to construct their virtual identity with reference to these characteristics.

Clearly it’s quite a leap forward to be able to use British Social Attitudes 2015, a random probability sample survey of over 4,000 respondents with weights calculated to account for non-response bias, to help us understand the Twitter population. The data allow us to compare Twitter usage by demographic groups benchmarked against the 2011 Census whilst evaluating previous attempts at demographic proxies.
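To make the role of the survey weights a little more concrete, here is a minimal sketch of how a weighted share of Twitter users per demographic group might be computed. It is only an illustration: the respondents are made up, and the field names (`sex`, `uses_twitter`, `weight`) and the helper `weighted_share` are hypothetical, not the actual BSA variable names or the method used in the paper.

```python
from collections import defaultdict


def weighted_share(rows, group_key, flag_key, weight_key):
    """Return {group: weighted proportion of respondents where flag_key is true}."""
    totals = defaultdict(float)   # sum of weights per group
    flagged = defaultdict(float)  # sum of weights per group among flagged respondents
    for row in rows:
        w = row[weight_key]
        totals[row[group_key]] += w
        if row[flag_key]:
            flagged[row[group_key]] += w
    return {g: flagged[g] / totals[g] for g in totals if totals[g] > 0}


# Toy, made-up respondents (NOT real survey data); the weights stand in for
# the non-response weights that a survey would supply.
respondents = [
    {"sex": "male", "uses_twitter": True, "weight": 1.5},
    {"sex": "male", "uses_twitter": False, "weight": 0.5},
    {"sex": "female", "uses_twitter": True, "weight": 1.0},
    {"sex": "female", "uses_twitter": False, "weight": 3.0},
]

shares = weighted_share(respondents, "sex", "uses_twitter", "weight")
print(shares)
```

Each respondent counts in proportion to their weight rather than once, so groups that were less likely to respond are not under-counted when the shares are compared against a benchmark such as the Census.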

So, how accurate is the picture of the demographic characteristics developed through proxies?

As it turns out, we find some interesting discrepancies. According to the BSA data there are more men on Twitter than expected and, although most users are younger, there are more older users on the platform than we previously thought. We also find that there are strong class effects regarding Twitter use, largely in line with previous proxy estimates but substantially out of line for certain groups. The full paper is open access; the reference is given at the end of this post.

How does this aid our understanding of how the social world manifests online? To take an example, a recent study by Draper et al. found that, during the horsemeat food scare of 2013, Twitter was dominated by jokes and humour. The overall discourse suggested that this wasn’t perceived as a serious incident and that the issue wasn’t really a public concern, but we now know that Twitter is dominated by the higher NS-SEC groups: people with high incomes who are the least likely to come into contact with the adulterated budget products. Twitter thought it was funny because Twitter is dominated by people who were largely unaffected by the scare. This is an important lesson in how representation impacts upon what the data is telling us.

Of course, it’s no surprise that Twitter is dominated by the professional and managerial groups, but at least now we have some strong evidence to underwrite our expectations.

Read the full paper: Sloan, L. (2016) Who Tweets in the United Kingdom? Profiling the Twitter Population Using the British Social Attitudes Survey 2015, Social Media + Society 3:1, DOI:

Thursday, 23 February 2017

Programming as Social Science - new methods network

Phillip Brooker is a Research Associate at the University of Bath working in social media analytics, with a particular interest in the exploration of research methodologies to support the emerging field. His background is in sociology, drawing especially on ethnomethodology and conversation analysis, science and technology studies, computer-supported cooperative work and human-computer interaction. Phillip has previously contributed to the development of Chorus, a Twitter data collection and visualisation suite. He currently works on CuRAtOR (Challenging online feaR And OtheRing), an interdisciplinary project focusing on how "cultures of fear" are propagated through online "othering".

Digital data and computational methods are increasingly becoming consolidated as essential elements of social science research and teaching. However, the algorithmic processes through which digital data are extracted, processed and visualised are often ‘black boxed’ and obscured from researchers who use those tools, which hinders our understanding of how they might be handled methodologically. Hence, there is an already-high and ever-increasing need for social scientists to engage with computational tools as a “critical technical practice” (Agre, 1997). In other words, since we are now pretty much completely reliant on software as part of our everyday research and teaching practices, it is all the more important that we are able to unpick and interrogate how these software packages operate, in order to better account for our data and research practices!

To this end, Jonathan Gray and I (both at the University of Bath) have set up a mailing list/network called “Programming as Social Science (PaSS)”, for researchers interested in software programming both as an object of study and as a tool that we can learn and use within social science research. Here, we’re capitalising on lots of good work that has already been done in fields such as Science and Technology Studies, New Media Studies, Social Media Analytics, Software Studies, Ethnomethodology, Human-Computer Interaction, Computer-Supported Cooperative Work, and so on. All of these fields (and many more we haven’t listed!) have contributions to make to understanding how we might critically leverage programming skills as part of social science teaching and research. So the PaSS mailing list/network has been established to act as a (low-traffic) hub for discussing these kinds of ideas, as well as sharing resources, updates, announcements and initiatives around programming in the context of social research.

If you’d like to join in, you can sign up via the following link: Please feel free to invite anyone and share widely; the computer geek in me is very much looking forward to chatting about programming as part of my work!