Survey of young people
Over 1,000 young people responded to Tech City UK’s online survey on tech careers in August 2017. Sampling was structured to ensure that the results were as representative as possible of people aged 15-21 across UK regions.
Young people were asked a series of questions about their career plans, preferences and the rationale for their choices. Respondents were able to select more than one response in the survey, hence not all totals sum to 100%.
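The effect of multiple selections on totals can be illustrated with a minimal sketch. The function name `option_shares` and the toy answers below are hypothetical and are not survey data; the sketch only shows why shares calculated per respondent can add up to more than 100%.

```python
from collections import Counter


def option_shares(responses):
    """Return the percentage of respondents who selected each option.

    `responses` holds one entry per respondent; each entry is the set of
    options that respondent selected (multiple selections allowed).
    """
    n = len(responses)
    counts = Counter(option for selected in responses for option in set(selected))
    return {option: 100 * count / n for option, count in counts.items()}


# Hypothetical answers from four respondents (illustrative only).
toy_responses = [
    {"salary", "interest"},
    {"interest"},
    {"salary", "location"},
    {"interest", "location"},
]

shares = option_shares(toy_responses)
total = sum(shares.values())
print(shares)
print(total)  # exceeds 100 because respondents could pick several options
```

Each share uses the number of respondents as its denominator, not the number of selections, which is why the column total is not constrained to 100%.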
Characteristics of young people in the survey sample
The sampling framework was structured to enable a close match to country-level population estimates, so that a close-to-representative sample of young people was gathered for each UK constituent country. Though the two are not entirely commensurable, as a guide we compare the sample with ONS population estimates for people aged 16 – 21 in 2017.
As can be seen from the two graphs below, the sample is closely aligned to the country-level populations of young people across the UK. However, the survey over-samples England, where the proportion of 15 – 21 year old people in the sample is 87%, compared to a population projection of 84%. For Scotland, Wales and Northern Ireland, the sample is 1 percentage point below the national projection based on population estimates.
Location of young people surveyed in Tech City UK survey of Young People (2017) [chart; axis: Proportion of respondents (%)]
National population projections of people aged 16 – 21 (2017) by country (Source: ONS Population Projections/Estimates, Nomis) [chart; axis: Proportion of national population of young people (%)]
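The comparison between sample and population shares is a simple difference in percentage points. A minimal sketch follows, using only the England figures quoted in the text above; the function name `share_gaps` is hypothetical, and the individual shares for Scotland, Wales and Northern Ireland are not stated in the text, so they are left out here.

```python
def share_gaps(sample, population):
    """Return sample share minus population share, in percentage points."""
    return {country: sample[country] - population[country] for country in sample}


# Shares in percent, as quoted in the text for England only.
sample_shares = {"England": 87}
population_shares = {"England": 84}

gaps = share_gaps(sample_shares, population_shares)
print(gaps)  # a positive gap means the country is over-sampled
```

A positive value indicates over-sampling relative to the population projection, and a negative value indicates under-sampling.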
Data from Reddit
Reddit is a US-based social news aggregation, content rating, and discussion website. Members post content such as questions and replies, links (both to external sites and to other Reddit feeds) and images, which are then voted up or down by other members.
Posts are organised by subject into user-created boards called ‘subreddits’, which cover a variety of topics including news, science, jobs, movies, video games, music, books, fitness, food, and image-sharing. Submissions with more up-votes appear towards the top of their subreddit and, if they receive enough votes, ultimately on the website’s home page. Despite strict rules prohibiting harassment, Reddit’s administrators spend considerable resources on moderating the site.
As of 2017, Reddit had 542 million monthly visitors (234 million unique users), ranking it as the 4th most visited website in the US and 8th in the world. According to 2015 data, Reddit experienced 82.54 billion page views, 73.15 million submissions, 725.85 million comments, and 6.89 billion upvotes from its users. Use of Reddit has only increased since then: according to a blog post from Reddit in late December 2017, the total number of comments had reached over 900 million, with over 12 billion upvotes, making Reddit one of the definitive sources of information directly from people around the world.
In this research, we used a small subset of this global conversation curator’s data – focusing on tech career discussions. We scraped just under 80,000 Reddit responses to around 7,000 questions on careers to listen to how young people were talking about tech careers in the UK, and across the world.
Given that Reddit works by allowing users to pose questions under themed threads, or subreddits, we identified a number of subreddits associated with tech careers, and careers more broadly. Reddit’s mission is to ‘help people discover places where they can be their true selves, and empower their community to flourish’. We use their data accordingly, surfacing hidden conversations that users are having about careers, to listen to their thoughts on important issues, like the skills they think they need for tech, and their perceptions of tech.
We use the Thomson Reuters Business Classification (TRBC) to classify career areas into sectors for our analysis of Reddit data. Critically, the TRBC is a global industry classification system, allowing us to capture sectoral activity in a way that is not biased by nationally specific classifications. This is appropriate for Reddit given its international reach and user base.
Responses to questions on subreddits range from very brief answers, such as single-word responses, to very lengthy passages of text, some of which are up to 2,000 words in length. As such, we have access to extremely rich text data which is directly reflective of young people’s experiences and perceptions of tech careers. However, long text data of this kind is notoriously difficult to analyse and distil. This is why we partnered with the Department of Computer Science at the University of Sheffield.
As data partner for the report, the University of Sheffield used tools from its General Architecture for Text Engineering (GATE). These include data collection, semantic analysis, information aggregation, search and visualisation tools, which allow analysts to dig deep into the data and to perform complex queries over large volumes of data. The infrastructure enables users to collect and structure data from Reddit, analyse the posts and make the analysis results available for searching (using an indexing system called Mimir).
Characteristics of Reddit use
Most Reddit users post only 1 or 2 responses to questions. There is a long tail of users who post more than 10, and some prolific users have posted up to 235 responses to questions.
As with the number of responses, there is a strong skew in the distribution of questions posed by users. Most users post just 1 question (around 5,500 users), and very few users post more than 3 questions. In terms of what this means for Reddit use, it suggests that users have a specific requirement of the platform and the Reddit community: they tap into Reddit to seek responses to a single burning question, which in the data that we have investigated concerns careers or employment-related themes.
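A skewed distribution of this kind can be summarised by counting how many users fall at each level of activity. The sketch below is illustrative only: the function name `activity_distribution` and the toy author list are hypothetical, and it is not the pipeline used for the report.

```python
from collections import Counter


def activity_distribution(authors):
    """Map 'posts per user' to 'number of users with that many posts'.

    `authors` holds one author name per post (question or response).
    """
    posts_per_user = Counter(authors)
    return Counter(posts_per_user.values())


# Hypothetical author names, one entry per post (not Reddit data).
toy_authors = ["u1", "u2", "u3", "u4", "u4", "u5", "u5", "u5", "u5", "u5"]

distribution = activity_distribution(toy_authors)
n_users = sum(distribution.values())
single_post_share = distribution[1] / n_users

for n_posts in sorted(distribution):
    print(n_posts, distribution[n_posts])
print(single_post_share)
```

In the toy data most users appear once while one user accounts for half of all posts, which is the same shape of long tail described above, only at a much smaller scale.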