Wednesday, 29 March 2017

Who uses Twitter?

Luke Sloan is a Senior Lecturer in Quantitative Methods and Deputy Director of the Social Data Science Lab at the School of Social Sciences, Cardiff University, UK. Luke has worked on a range of projects investigating the use of Twitter data for understanding social phenomena covering topics such as election prediction, tracking (mis)information propagation during food scares and ‘crime-sensing’. His research focuses on the development of demographic proxies for Twitter data to further understand who uses the platform and increase the utility of such data for the social sciences. He sits as an expert member on the Social Media Analytics Review and Information Group (SMARIG) which brings together academics and government agencies. @drlukesloan

Who uses Twitter?

It’s a simple question, but one that is tricky to answer. We all think we know the types of people who use Twitter – the urban elite, celebrities, professionals, young people… but providing an empirical account is challenging and without knowing who tweets we can’t even start a conversation about representativeness and bias. To understand how the social world manifests in the virtual we need to know who is present or underrepresented.

Much work has been done on using Twitter metadata to estimate proxy demographics for UK users such as gender (Sloan et al. 2013) and age, occupation and social class (Sloan et al. 2015), but these methods rely on people self-reporting a first name, an age or date of birth and an occupation to classify. The question has always been whether certain groups, such as older people and those from certain occupations, are less likely to construct their virtual identity with reference to these characteristics.
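
To make the idea of a self-reported proxy concrete, here is a minimal sketch of pulling an age out of a profile description. It is an illustration only and not the procedure used in the papers cited above: the regular expressions, the 13 to 99 plausibility range and the function name `extract_age` are all assumptions made for this example.

```python
import re
from typing import Optional

# Hypothetical patterns, chosen for illustration only.
AGE_PATTERNS = [
    # e.g. "34 years old", "34 yrs old", "34 y/o", "34yo"
    re.compile(r"\b(\d{1,2})\s*(?:years?\s*old|yrs?\s*old|y/?o)\b", re.IGNORECASE),
    # e.g. "aged 67", "age 19", "age: 19"
    re.compile(r"\bage[d:]?\s*(\d{1,2})\b", re.IGNORECASE),
]


def extract_age(bio: str) -> Optional[int]:
    """Return a self-reported age found in a profile description, or None."""
    for pattern in AGE_PATTERNS:
        match = pattern.search(bio)
        if match:
            age = int(match.group(1))
            # Assumed plausibility range; discard anything outside it.
            if 13 <= age <= 99:
                return age
    return None


print(extract_age("Mum of two, 34 y/o, Cardiff"))
print(extract_age("Coffee lover and cyclist"))
```

A profile that never mentions an age returns `None`, which is exactly the limitation described above: the proxy can only classify users who choose to report the characteristic.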

Clearly it’s quite a leap forward to be able to use British Social Attitudes 2015, a random probability sample survey of over 4,000 respondents with weights calculated to account for non-response bias, to help us understand the Twitter population. The data allow us to compare Twitter usage by demographic groups benchmarked against the 2011 Census whilst evaluating previous attempts at demographic proxies.
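
As a rough sketch of what survey weights do in practice, the function below computes a weighted share of Twitter users from respondent-level answers. The respondents, weights and the name `weighted_share` are invented for illustration; they are not BSA data, and this is not the weighting scheme used by the survey.

```python
from typing import Sequence


def weighted_share(uses_twitter: Sequence[bool], weights: Sequence[float]) -> float:
    """Share of the weighted sample who use Twitter."""
    if len(uses_twitter) != len(weights):
        raise ValueError("answers and weights must be the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    # Each respondent counts in proportion to their weight.
    users = sum(w for uses, w in zip(uses_twitter, weights) if uses)
    return users / total


# Made-up respondents: two users, two non-users.
answers = [True, False, True, False]
# Made-up weights: non-users are up-weighted relative to the first user.
weights = [0.5, 1.5, 1.0, 1.0]

print(weighted_share(answers, weights))
print(sum(answers) / len(answers))  # unweighted share, for comparison
```

In this toy example half of the respondents use Twitter, but the weighted share is lower because the users carry less weight than the non-users.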

So, how accurate is the picture of the demographic characteristics developed through proxies?

As it turns out, we find some interesting discrepancies. According to the BSA data there are more men on Twitter than expected, and although most users are younger there are more older users on the platform than we previously thought. We also find strong class effects regarding Twitter use, largely in line with previous proxy estimates but substantially out of line for certain groups. The full paper is open access (see the reference at the end of this post).

How does this aid our understanding of how the social world manifests online? To take an example, a recent study by Draper et al. found that, during the horsemeat food scare of 2013, Twitter was dominated by jokes and humour. The overall discourse suggested that this wasn’t perceived as a serious incident and that the issue wasn’t really a public concern, but we now know that Twitter is dominated by the higher NS-SEC groups – people with high incomes who are the least likely to come into contact with the budget adulterated products. Twitter thought it was funny because Twitter is dominated by people who were largely unaffected by the scare. This is an important lesson in how representation affects what the data is telling us.

Of course, it’s no surprise that Twitter is dominated by the professional and managerial groups, but at least now we have some strong evidence to underwrite our expectations.

Read the full paper: Sloan, L. (2016) Who Tweets in the United Kingdom? Profiling the Twitter Population Using the British Social Attitudes Survey 2015, Social Media + Society 3:1, DOI:

Thursday, 23 February 2017

Programming as Social Science - new methods network

Phillip Brooker is a Research Associate at the University of Bath working in social media analytics, with a particular interest in the exploration of research methodologies to support the emerging field. His background is in sociology, drawing especially on ethnomethodology and conversation analysis, science and technology studies, computer-supported cooperative work and human-computer interaction. Phillip has previously contributed to the development of Chorus, a Twitter data collection and visualisation suite. He currently works on CuRAtOR (Challenging online feaR And OtheRing), an interdisciplinary project focusing on how "cultures of fear" are propagated through online "othering".

Digital data and computational methods are increasingly becoming consolidated as essential elements of social science research and teaching. However, the algorithmic processes through which digital data are extracted, processed and visualised are often ‘black boxed’ and obscured from researchers who use those tools, which hinders our understanding of how they might be handled methodologically. Hence, there is an already-high and ever-increasing need for social scientists to engage with computational tools as a “critical technical practice” (Agre, 1997). In other words, since we are now pretty much completely reliant on software as part of our everyday research and teaching practices, it is all the more important that we are able to unpick and interrogate how these software packages operate, in order to better account for our data and research practices!

To this end, Jonathan Gray and I (both at the University of Bath) have set up a mailing list/network called “Programming as Social Science (PaSS)”, for researchers interested in software programming both as an object of study and as a tool that we can learn and use within social science research. Here, we’re capitalising on lots of good work that has already been done in fields such as Science and Technology Studies, New Media Studies, Social Media Analytics, Software Studies, Ethnomethodology, Human-Computer Interaction, Computer-Supported Cooperative Work, and so on. All of these fields (and many more we haven’t listed!) have contributions to make in regard to understanding how we might critically leverage programming skills as part of social science teaching and research. So the PaSS mailing list/network has been established to act as a (low-traffic) hub for discussing these kinds of ideas, as well as sharing resources, updates, announcements and initiatives around programming in the context of social research.

If you’d like to join in, you can sign up via the following link: Please feel free to invite anyone and share widely; the computer geek in me is very much looking forward to chatting about programming as part of my work!

Thursday, 16 February 2017

Visualising Facebook

Daniel Miller is Professor of Anthropology at University College London. Recent books include Social Media in an English Village (UCL Press 2016); Miller et al., How the World Changed Social Media (UCL Press 2016); with J. Sinanan, Webcam (Polity 2014); edited with H. Horst, Digital Anthropology (Bloomsbury 2012); with M. Madianou, Migration and New Media (Routledge 2012); Consumption and its Consequences (Polity 2012); with S. Woodward, Blue Jeans (California 2012); and Tales from Facebook (Polity 2011). He recently completed a volume about media in the social lives of patients with a terminal diagnosis, forthcoming as The Comfort of People (Polity 2017). @DannyAnth

This March will see the publication of a new book called Visualising Facebook, which I have written with Jolynna Sinanan. It will be available as a free download from UCL Press. One of the key arguments from the larger Why We Post project, of which this book is one of eleven volumes, is that human communication has fundamentally changed. Where previously it consisted almost entirely of either oral or textual forms, today, thanks to social media, it is equally visual. Think literally of Snapchat. So, it is a pity that when you look at the journals and most of the books about social media, they often contain either no, or precious few, actual visual illustrations from social media itself. One of the joys of digital publication is that it is possible to reproduce hundreds of images. So, our book is stuffed to the gills with photographs and memes taken directly from Facebook, which is, after all, our evidence.

For example, as academics, we might suggest that the way women respond to becoming new mothers in Trinidad is entirely different from what you would find in England. In the book, we can reproduce examples from hundreds of cases, where it is apparent that when an English woman becomes a mother she, in effect, replaces herself on Facebook with images of her new infant. Indeed, these often become her own profile picture for quite some time. By contrast, one can see postings by new mothers in Trinidad, where they are clearly trying to show that they still look young and sexy or glamorous, precisely because they do not want people to feel that these attributes have been lost merely because they are now new mothers.

In writing this book we examined over 20,000 images. These provide the evidence for many generalisations, such as that Trinidadians seem to care a good deal about what they are wearing when they post images of themselves on Facebook while, by and large, English people do not. But this becomes much clearer when you can see the actual images themselves. Or we might suggest that English people are given to self-deprecating humour, while Trinidadians are not. Or that in England gender may create a highly repetitive association between males and generic beer, as against women with generic wine. In every case, you can now see exactly what we mean. We also have a long discussion about the importance of memes, why we call them ‘the moral police of the Internet’, and how memes help to establish what people regard as good and bad values. This makes much more sense when you are examining typical memes with that question in your head.

To conclude, given the sheer proportion of social media posting that now consists of visual images, it would seem a real pity to look this gift horse in the mouth. Firstly, it has now become really quite simple to look at tens of thousands of such images in order to come to scholarly conclusions. But equally, it is now much easier to also include hundreds of such images in your publications to help readers have a much better sense of what exactly those conclusions mean and whether they agree with them.

Friday, 3 February 2017

Mine your Data – Why understanding online health communities matters

Originally posted on the NatCen blogsite on 10/11/16 
Aude Bicquelet is Research Director in the Health team. Prior to joining NatCen in 2016, she held a fellowship at the LSE (Department of Methodology) where she taught courses on Research Design, Mixed-Methods and Text Mining approaches. 
Aude specialises in the analysis of ‘Big Qualitative Data’ on health related issues and has worked with professional and regulatory health bodies such as the National Institute for Health and Care Excellence (NICE) and the Royal College of Physicians.  Methodological and substantive outputs of her research have been published in academic journals; she has also published a book on ‘Textual Analysis’ with Sage.
In addition to her interest in Health policies she is interested in Social and Political attitudes and has researched widely in the areas of political participation and élites’ attitudes towards the EU.

A staggering 73% of adults in the UK turn to the internet when experiencing health problems. Whether it is to check symptoms, find out about available treatments or share experiences about living with a particular condition, the internet has become the first port of call with many turning to the web before they even consider going to see a doctor. While many of these conversations take place on health-related websites such as Patient or Netdoctor, people suffering from health conditions also share their experiences on social media – and health practitioners should take note.  
Earlier this week I presented findings from a recent study looking into how people use social media to discuss health issues at the ESRC Festival of Social Science. In this study, funded by the NCRM, we used text mining techniques to analyse comments about chronic pain posted under YouTube videos.  
We found that chronic pain sufferers use YouTube to describe their experiences and vent their frustration. We analysed over 700 YouTube comments, and found they can be sorted into one of five categories:
  • Sharing Experiences: commenters thank each other for sharing their experiences in the videos posted on the website, emphasising tolerance and empathy for chronic pain sufferers.
  • Expressing Frustration: chronic pain sufferers expressed their frustration in their own words. These illustrate how YouTube and other social media offer new avenues for communicating pain outside clinical contexts.
  • Coping with Pain: chronic pain sufferers used social media to share their daily practices to cope with chronic pain.
  • Alternative Therapy: commenters spoke openly about their use of alternative medicines, illegal drugs or alcohol to manage their pain. The often conflicting relationship with clinicians – who were perceived as over- or under-medicating – was also common in this category.
  • Risks and Concerns: they also discuss the risks associated with different types of medication (in particular, addiction and overdose), along with increased risks of depression associated with some treatments against pain.
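
To give a flavour of how comments might be sorted into the five categories above, here is a deliberately simple keyword-matching sketch. It is a stand-in for illustration only and not the text mining technique used in the study; the keyword lists and the function name `categorise` are assumptions.

```python
import re
from collections import Counter

# Hypothetical keyword lists, one per category named in the list above.
CATEGORY_KEYWORDS = {
    "Sharing Experiences": {"thank", "thanks", "sharing"},
    "Expressing Frustration": {"frustrated", "frustrating", "nobody"},
    "Coping with Pain": {"cope", "coping", "routine", "stretching"},
    "Alternative Therapy": {"acupuncture", "cannabis", "alcohol", "herbal"},
    "Risks and Concerns": {"addiction", "overdose", "depression", "risk"},
}


def categorise(comment: str) -> str:
    """Assign a comment to the category whose keywords it mentions most."""
    words = Counter(re.findall(r"[a-z]+", comment.lower()))
    scores = {
        category: sum(words[keyword] for keyword in keywords)
        for category, keywords in CATEGORY_KEYWORDS.items()
    }
    # Ties go to whichever category is listed first in CATEGORY_KEYWORDS.
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "Uncategorised"


print(categorise("Thank you for sharing this video"))
print(categorise("I worry about addiction and overdose with these pills"))
```

A real analysis would need far more than keyword counts, but the sketch shows the basic shape of the task: each comment is scored against every category and assigned to the best match, or left uncategorised.
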
The insights gained from social media research provide important substantive information for health practitioners. People communicate online in a way they don’t during interviews with researchers or during doctors’ appointments. Online forums and social media are rife with information that’s difficult to obtain through traditional research techniques where social desirability, fear of judgement or stigma, and wanting to be seen as ‘functioning well’ may influence what people are willing to say. From a purely practical perspective, they also provide freely available, naturally occurring data with access (at times) to hard-to-reach groups.
Of course, there is a great deal of uncertainty around how to harness the opportunities of analysing the wealth of health information posted online in a representative, robust and ethical way.
Despite their usefulness and efficiency, analyses of Internet comments on health forums do raise a host of concerns. One is representativeness: the views of a cohort with the access, technical skills and inclination to post comments on websites are over-represented while the views of others are excluded (the so-called ‘digital divide’). Another is consent: online commentators may not expect to be research subjects.
Nevertheless, the explosion of Big Data and the popularity of online communities might precipitate the need to integrate social media analysis and health research in the near future. For instance, it has been shown that patients who visit their doctors with inappropriate or misinterpreted information from the internet do little to enhance doctor–patient communication (see Ziebland 2004). But doctor–patient communication could be improved if health professionals themselves were better informed about the common fears, and sometimes the common ‘myths’, disseminated in online health communities.

Watch Aude’s presentation from NatCen’s event ‘What Social Media Can Tell us about Society’, live from Twitter’s London HQ. This event was part of the ESRC Festival of Social Science.

Monday, 16 January 2017

How Social Media Can Be a Researcher’s Miracle or Downfall

Cassie Phillips is a freelance technology writer who also dabbles in social media. She’s a firm believer that everyone can find a use for social media whether to make friends or conduct a research experiment. Like technology, she finds social media is just another tool to add to one’s arsenal. @securethoughtsc

At first, social media and research might seem at odds with one another, but they actually complement each other. Research involves the production, use and consumption of knowledge. Before social media, scientists and researchers disseminated information via conferences, journals, peer reviews and publication. What brings all of these activities together is collaboration. This is where the true benefits of social media become apparent.

Finding Information
You probably already have a system in place to find journals and articles that will suit your research. This can include using information portals, attending meetings and even focusing on certain peer-reviewed journals. While still useful, this takes time and can also lead to information overload. Social media can help you find more relevant information and sift through the noise. Following researchers within your discipline can help you find articles and journals that may be particularly valuable to you.

On the flip side, you still need to verify the sources you find online. Anyone can publish an article or post on the internet, so it's more important than ever to check sources and make sure what you're reading is legitimate.

Knowledge Creation
Most researchers view data generation as the main aspect of the job. For the most part, this means finding other literature that supports your research. However, the other important aspect is ensuring you publish and disseminate the information at the right time. So where does social media fit, especially when there are risks in communicating your research while it’s still going on? After all, it can reduce your chances of getting published while also providing ammunition against you should you make a mistake. And with social media, there’s also the possibility your account might get hacked, especially if your internet connection isn’t secure, though luckily there are ways to protect yourself.

So what are the benefits? Consider this example. Marianne Hatzopoulou, a civil engineering professor, wanted to research the impact of air pollution on cyclists. She turned to Twitter and sent out a couple of tweets encouraging people to fill out a survey. A popular cycling blog found the survey and then wrote an article. This then got picked up by a local newspaper, which then led to coverage on a radio show and a major network.

Spreading the Word

Perhaps the most attractive quality of social media is its ability to disseminate information. Above all, social media is about engagement and communication, making it ideal for researchers to reach a wider audience.

Of course, depending on the type of article you produce, you’ll reach a very particular audience. More scholarly and academic articles will likely attract other researchers in your discipline. However, this means you’ll likely alienate the layperson as they won’t want to wade through pages of information.

Because social media is such an effective tool for attracting attention, it can also backfire. Always read a post more than once before hitting the send or publish button to ensure it doesn’t come off as offensive or in poor taste. One poorly worded tweet can reach thousands of people and ultimately lower your credibility not only among your followers but in the scientific community as well.

While there are obvious benefits to using social media, it still doesn’t replace face-to-face interactions. If used improperly, it may end up hurting your reputation and research more than it helps. At the end of the day, it all depends on how you approach it and how you engage with your community.

Monday, 9 January 2017

“The Big Data rich and the Big Data poor”: the new digital divide raises questions about future academic research

Data is being created faster than ever before. However, as Katie Metzler explains, limited access to this big data is creating a digital divide between large companies and the broader scholarly community. To compound this problem, there is also a big data analysis skills gap that further hinders the progress of social science. Without access to these datasets or the expertise to analyse them, research is confronted with a replication crisis and is vulnerable to commercial motivations.
“Data is the new oil.” Clive Humby, mathematician and architect of Tesco’s Clubcard, is credited with saying this first in 2006, and it’s been repeated numerous times in the last decade. The comparison between data and oil refers to its value being extracted through refinement; or in the case of data, through analysis. Unlike oil, data is being created at a faster pace than it can be consumed, or analysed. We’re awash with data. You may have heard it said that “90% of all the data in the world has been generated over the last two years.” Or, as Hal R. Varian, Chief Economist at Google, puts it another way: “A billion hours ago, modern homo sapiens emerged. A billion minutes ago, Christianity began. A billion seconds ago, the IBM PC was released. A billion Google searches ago … was this morning.”
The capacity to collect and analyse massive datasets has already transformed fields such as biology, astronomy, and physics, and for many, the ‘big data revolution’ promises to ask, and help us answer, fundamental questions about individuals and collectives. But who gets access to all this data we’re producing through our increasingly networked and digital lives, and for what purpose?
Image credit: Divided by David Wan. This work is licensed under a CC BY 2.0 license.
In 2012, danah boyd and Kate Crawford offered a provocation that the limited access to big data was creating a new digital divide between “the Big Data rich and the Big Data poor.” It’s only companies, and the social scientists working within these companies, that have access to really large social and transactional datasets. The broader scholarly community usually does not because companies refuse to release it or because purchasing it costs too much.
Recently, I conducted a survey of more than 9,000 social scientists to learn more about researchers who are engaged in research using big data and the challenges they face, as well as the barriers to entry for those looking to do this kind of research in the future. 32 per cent of respondents who are currently engaged in big data research reported that getting access to commercial or proprietary data was a “big problem” for them:
Figure 1: Challenges facing big data researchers (n = 2273)
But it isn’t only the question of who can access data that leads to divides. As boyd and Crawford point out, and our survey supports, there is also a skills gap holding social science back: the level of quantitative and programming skills required for big data research makes it a challenge for educators to introduce it into traditional social science degree courses, as there is little time or expertise amongst teaching faculty:
Figure 2: Challenges facing educators teaching big data (n = 1212)
Why does it matter?
So who cares if academic social scientists can’t do big data, either because they can’t access the data and/or don’t have the skills they need to engage with it? Why not just have companies like Twitter and Facebook analysing social media data? Some have even gone so far as to argue that academics should not engage in research that can be done better by industry.
There are a couple of reasons why this is problematic. Firstly, replication is the engine of science, and irreproducible research slows progress. If only researchers within companies can access and analyse big social datasets, “those without access can neither reproduce nor evaluate the methodological claims of those who have privileged access”.
And secondly, and arguably most importantly, the motivations of industry researchers and social scientists may differ in ways that may really matter. Big data research conducted by companies is usually in service of a single overarching goal: to sell you more stuff. Social scientists with the right skills and access to the right data may use their research to contribute to the body of knowledge, with the aim of better understanding and improving social outcomes.
The questions boyd and Crawford pose at the start of their paper summarize this perfectly. They ask:
“Will large-scale search data help us create better tools, services, and public goods? Or will it usher in a new wave of privacy incursions and invasive marketing? Will data analytics help us understand online communities and political movements? Or will it be used to track protesters and suppress speech? Will it transform how we study human communication and culture, or narrow the palette of research options and alter what ‘research’ means?”
As yet, the answers to these important questions are unclear.
Read more in the recent SAGE Publishing white paper revealing full results of the survey, “Who is Doing Computational Social Science? Trends in Big Data Research.”
About the author
Katie Metzler is Head of Methods Innovation at SAGE Publishing. Katie is responsible for content strategy and innovation for SAGE’s award winning online platform for researchers, SAGE Research Methods, which includes SAGE Research Methods Cases, SAGE Research Methods Datasets and SAGE Research Methods Video. In addition to heading up the London commissioning team for the SAGE Research Methods platform, she is part of a new team at SAGE whose mission is to improve social science by equipping every researcher with the skills and tools they need to work effectively with big data and new technology. At SAGE, we believe big data and new technology are fundamentally changing how we make sense of the world and that social science needs to play a critical role where this impacts on society.