Methodology

The Tech Nation series is continually evolving. In 2018, we have combined official statistics with a range of web and open data sources to analyse the true scale and capabilities of the digital tech sector in the UK. This range of data is explored below.

An accurate definition of the digital tech sector itself is key to ensuring that our analysis best represents the UK’s economic realities.

Rigorous measurement and mapping are also crucial, to highlight new and innovative industries and bring them to the attention of stakeholders including investors, collaborators, customers, educators and policymakers.

However, the process of mapping is innately contentious. We have received helpful feedback about how Tech Nation reports have examined the digital tech sector over the last four years.

One of the key challenges we have faced is the difficulty of measuring such a dynamic sector.

The Standard Industrial Classification (SIC) codes underpinning official economic statistics (and used to derive the employment, productivity and value-added estimates in this report) make it possible to track the growth of industries consistently. Since they are updated only infrequently, however, they have distinct limitations when it comes to capturing the evolution of fast-changing industries. There is a SIC code for hunting, but not one for animation, for example.

Big data sources and methods aside, we have also made substantial use of official data sources to extrapolate economic statistics about the digital tech economy and industries. This is based on the most rigorous selection of digital SIC and SOC codes available, as developed in Nesta’s previous dynamic mapping research for techUK.

We have combined this business and industry data with other data sources that capture the capabilities, resources and infrastructures of tech clusters across the UK. An online survey of 3,428 digital tech ecosystem stakeholders has also been used, as have qualitative case studies featuring 70 digital tech founders and CEOs, educators and tech support organisations.

Economic statistics

We accessed data on key measures of economic performance from four official ONS datasets:

  • The Annual Population Survey (APS): A household survey with information about respondents’ occupation (Standard Occupational Classification – SOC) and industry of employment (Standard Industrial Classification – SIC).

Given Tech Nation’s focus on the digitisation of the wider economy, we wanted to capture the diversity of workers in digital jobs (SOC), including those digital experts working in non-digital industries. We also wanted to measure freelancers and self-employed workers, important components of the digital workforce that are missing from industry-focused business surveys.

  • The Business Structure Database (BSD): An administrative dataset including SIC, location, employment and turnover data for all UK businesses registered for PAYE/ VAT. Researchers at Tech Nation accessed the BSD and ABS micro-data required for the project in early 2018.
  • The Annual Business Survey (ABS): A business survey with 2007 SIC, location and detailed financial data, allowing the estimation of approximate Gross Value Added (GVA) figures at various geographic granularities.
  • The Annual Survey of Hours and Earnings (ASHE): ASHE is the most comprehensive source of information on the structure and distribution of earnings in the UK. It provides information about the level, distribution, make-up of earnings and paid hours worked, for employees in all industries and occupations.

The ASHE tables contain estimates of employee earnings, broken down by sex and full-time/part-time status. Breakdowns are also available by region, occupation, industry, age group and public/private sector. We use ASHE to look at industry earnings, and digital tech jobs.

Where possible, we have used these ONS datasets to produce estimates of employment, turnover, and digital tech GVA at both national and local levels. One barrier to doing this with digital tech GVA is that ABS data is not available at Travel to Work Area (TTWA) level.

To overcome this, GVA per employee was calculated using the Annual Business Survey at Local Authority District level. In England, these figures were then converted to Lower Layer Super Output Areas, and then re-aggregated to Travel to Work Areas. The median GVA per employee figure was taken for each TTWA, and then multiplied by the number of tech employees in each TTWA.

Outside of England, a less accurate proxy measure was taken. The median GVA per employee figure was taken for each of Northern Ireland, Wales and Scotland. This was then multiplied by the number of employees in each TTWA.

As a consequence, it is not possible to produce digital tech GVA per worker (i.e. labour productivity) measures at the local level. The ratio of digital GVA to local digital employment would simply return regional digital GVA per worker. Instead, we use turnover per worker from the BSD as a proxy for local labour productivity in the digital tech industry.
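The TTWA-level estimate described above can be sketched in a few lines of Python. This is a minimal illustration only: the function name and all figures are hypothetical and are not drawn from the ONS data. It also shows why dividing the estimate by local employment simply hands back the median that was fed in, which is the reason a local GVA-per-worker measure would carry no new information.

```python
from statistics import median


def estimate_ttwa_gva(gva_per_employee_values, tech_employees):
    """Estimate digital tech GVA for one TTWA.

    gva_per_employee_values: GVA-per-employee figures for the small areas
        re-aggregated into the TTWA (hypothetical inputs).
    tech_employees: number of digital tech employees in the TTWA.
    """
    return median(gva_per_employee_values) * tech_employees


# Hypothetical figures, for illustration only.
small_area_values = [48_000, 52_000, 61_000]
tech_employees = 1_200

estimate = estimate_ttwa_gva(small_area_values, tech_employees)

# Dividing back by employment only recovers the median used as input.
implied_gva_per_worker = estimate / tech_employees
```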

We tested these datasets with a list of digital SOC and SIC codes adopted from techUK/Nesta’s Dynamic Mapping of the Information Economy. This report follows a rigorous methodology to identify digital occupations in the SOC codes, and then calculate a digital intensity measure at the SIC level to identify digital tech SICs. Digital intensity corresponds to the share of digital workers working in an industry.
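As a rough illustration of the digital intensity measure, the sketch below computes the share of an industry's workers who are in digital occupations. The function name, the `"other"` placeholder key and the worker counts are assumptions made for the example; the threshold at which an industry counts as a digital tech SIC is not stated here, so none is applied.

```python
def digital_intensity(workers_by_soc, digital_socs):
    """Share of an industry's workers who are in digital occupations.

    workers_by_soc: mapping of SOC code -> number of workers in the industry.
    digital_socs: set of SOC codes treated as digital.
    """
    total = sum(workers_by_soc.values())
    if total == 0:
        return 0.0
    digital = sum(n for soc, n in workers_by_soc.items() if soc in digital_socs)
    return digital / total


# Two of the digital SOC codes from the report's list.
digital_socs = {"2136", "2137"}

# Hypothetical industry: counts are made up; "other" stands in for
# all non-digital occupations.
industry = {"2136": 400, "2137": 100, "other": 500}

intensity = digital_intensity(industry, digital_socs)
```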

Digital tech SOC codes

  • 1136 IT and telecommunications directors
  • 2133 IT specialist managers
  • 2134 IT project and programme managers
  • 2135 IT business analysts, architects & system designers
  • 2136 Programmers & software development professionals
  • 2137 Web design & development professionals
  • 2139 IT & telecommunications professionals not elsewhere classified
  • 3131 IT operations technicians
  • 3132 IT user support technicians
  • 5242 Telecommunications engineers
  • 5245 IT engineers

Digital tech SIC codes

  • 26.20 Manufacture of computers and peripheral equipment
  • 58.21 Publishing of computer games
  • 58.29 Other software publishing
  • 61.10 Wired telecommunications activities
  • 61.20 Wireless telecommunications activities
  • 61.30 Satellite telecommunications activities
  • 61.90 Other telecommunications activities
  • 62.01 Computer programming activities
  • 62.02 Computer consultancy activities
  • 62.03 Computer facilities management activities
  • 62.09 Other IT & computer service activities
  • 63.11 Data processing, hosting & related activities
  • 63.12 Web portals
  • 95.11 Repair of computers & peripheral equipment

How we measure jobs in the 2018 Tech Nation Report

To measure the total number of tech and tech-enabled jobs across the economy, we used data from the Office for National Statistics (ONS) Annual Population Survey (APS). This is a survey-based sample of the UK population – on individual people rather than businesses. To get UK-wide data on people working in tech jobs from the survey, we have to make sure that the sample of people reflects the broader UK population – so we have to use multipliers from the ONS.
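The multipliers mentioned above act as survey weights: each respondent stands in for a number of people in the wider population. A minimal sketch, assuming hypothetical record fields (`soc`, `weight`) and made-up weights rather than the real APS variables:

```python
# Digital tech SOC codes, as listed in this methodology.
DIGITAL_SOCS = {
    "1136", "2133", "2134", "2135", "2136", "2137",
    "2139", "3131", "3132", "5242", "5245",
}


def weighted_job_count(respondents, digital_socs=DIGITAL_SOCS):
    """Sum the survey weights of respondents in digital occupations."""
    return sum(r["weight"] for r in respondents if r["soc"] in digital_socs)


# Hypothetical sample: field names and weights are illustrative only.
sample = [
    {"soc": "2136", "weight": 250.0},  # software developer
    {"soc": "other", "weight": 300.0},  # placeholder non-digital occupation
    {"soc": "3132", "weight": 410.0},  # IT user support technician
]

estimated_digital_jobs = weighted_job_count(sample)
```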

But this kind of analysis does not measure the number of direct jobs created by digital tech companies. To understand the impact and benefits of digital tech we need to have reliable data not only on the number of tech jobs across the economy but also performance and productivity indicators for the sector itself.

To do this, we use official data from the ONS Business Structure Database (BSD), which we also use to look at the performance of tech companies. This methodology allows us to have refined data that can be relied upon as the most accurate count of direct jobs created by the digital tech companies across the country.

The numbers are quite different in some cases. This is because one analysis looks exclusively at people working for digital tech companies, while the other looks at people working in tech jobs across the economy.

This year, we have expanded the way we look at jobs for a few reasons. Here is a run-through of our thinking, and of how the data we present is more robust and representative than ever:

This year we are presenting two different sets of stats on employment. This means that the economy-wide numbers should not be compared to the sector-wide ones. But we have used this year’s method to look back over time.  If you want to compare employment in your local area, all the data you need is in the 2018 report.

The new 2018 analysis is based on a comprehensive look at all UK businesses that are PAYE or VAT registered. This means that using BSD data provides us not only with the number of direct jobs created by tech companies but also with an understanding of the performance of these companies. Viewed together, the two sets of data will help us understand all people working in digital tech.

The data on digital tech companies also contains financial information, as well as employment. This means that we can have reliable data on productivity. To get a true picture of jobs in digital tech, we need to look at performance, as well as quantity of jobs – this cannot be obtained from the APS alone.

The diagram below explains the way we measure UK jobs data in the report.

Jobs in the UK

Digital tech jobs – includes all people working in digital tech occupations, irrespective of the industry. For example, a software developer working in a retail company.

Source: ONS Annual Population Survey, Wave 4 2016, Waves 1-3 2017

Digital tech jobs in digital tech – includes only people working in digital tech occupations in the digital tech industries. For example, a software developer working in a web development firm.

Source: ONS Annual Population Survey, Wave 4 2016, Waves 1-3 2017

Jobs in digital tech – includes all people working in digital tech industries, including non-digital jobs. For example, an accountant working in a web development firm.

Source: ONS Business Structure Database 2017

Web, survey and open data

Global Startup Ecosystems, Startup Genome

Startup Genome focuses its efforts on collecting measurable and verifiable information from startups and investors, local ecosystem partners, and third-party sources.

The core data that they use are:

  • Startup Genome proprietary data
    • Interview of Experts: ~ 275 interviews across 25 countries with founders, investors, and industry experts in 2015 and 2016;
    • 2016 Startup Ecosystem Survey with more than 10,000 participants with the majority concentrated in 56 ecosystems;
  • CrunchBase: global dataset on funding, exits, and locations of startups and investors;
  • Orb Intelligence: global dataset on funding, exits, and locations of startups and investors;
  • Dealroom: global dataset on funding, office location, funding flow;
  • Local partners (accelerators, incubators, startup hubs, investors):
    • list of startups;
    • list of local funding events

The ranking that Startup Genome creates is a weighted average of the following factor scores:

  • Performance: 30%
  • Funding: 25%
  • Market Reach: 20%
  • Startup Experience: 15%
  • Talent: 10%

Startup Genome calculates an ecosystem index value for each factor, based on the sub-factor and metrics detailed below. The ecosystems scores were multiplied by the above weights to establish the overall rank of each ecosystem.
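The weighted average described above can be sketched as follows. The weights are those listed in this section; the ecosystem names and factor scores are hypothetical, and the way Startup Genome derives each factor's index value is not reproduced here.

```python
# Factor weights as listed in this section.
WEIGHTS = {
    "Performance": 0.30,
    "Funding": 0.25,
    "Market Reach": 0.20,
    "Startup Experience": 0.15,
    "Talent": 0.10,
}


def overall_score(factor_scores):
    """Weighted average of an ecosystem's factor scores."""
    return sum(WEIGHTS[factor] * factor_scores[factor] for factor in WEIGHTS)


def rank_ecosystems(scores_by_ecosystem):
    """Return ecosystem names ordered from highest to lowest overall score."""
    return sorted(
        scores_by_ecosystem,
        key=lambda name: overall_score(scores_by_ecosystem[name]),
        reverse=True,
    )


# Hypothetical ecosystems and index values, for illustration only.
scores = {
    "Ecosystem A": {factor: 8.0 for factor in WEIGHTS},
    "Ecosystem B": {
        "Performance": 10.0,
        "Funding": 5.0,
        "Market Reach": 5.0,
        "Startup Experience": 5.0,
        "Talent": 5.0,
    },
}

ranking = rank_ecosystems(scores)
```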

Informal industry meetups, meetup.com 

We accessed data about tech meetup groups and tech meetup members/attendees from Meetup.com’s open API, focusing on public groups.

It was important to collect data on all UK tech meetup groups, both to represent the tech structure accurately and to classify groups under their Travel to Work Areas (TTWAs). For this reason, we needed to identify groups as ‘Tech’ groups and to locate them in specific regions.

To achieve the former, we found that Meetup’s API uses a straightforward categorisation system to identify the different group classifications. Tech groups fell under the category name ‘tech’ and category id ‘34’, which enabled specificity in querying. For the latter, we queried the locations of the newly categorised tech groups. This involved querying groups within a 200 mile radius of Edinburgh to capture the North, within a 200 mile radius of Birmingham City Centre (B1 1AA) to capture the South East & West, and finally within a 70 mile radius of Plymouth to capture the deep South West.
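To illustrate the three search circles, the sketch below checks whether a point falls inside any of them using the haversine great-circle distance. It does not call the Meetup API. The centre coordinates are approximate city-centre values assumed for the example, and the function names are hypothetical. Because the circles overlap, groups returned by more than one query would also need de-duplicating by their identifier.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8

# (name, latitude, longitude, radius in miles).
# Coordinates are approximate city centres, assumed for illustration.
SEARCH_CIRCLES = [
    ("Edinburgh", 55.9533, -3.1883, 200),
    ("Birmingham", 52.4862, -1.8904, 200),
    ("Plymouth", 50.3755, -4.1427, 70),
]


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in miles."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))


def covering_circles(lat, lon):
    """Names of the search circles that contain the given point."""
    return [
        name
        for name, c_lat, c_lon, radius in SEARCH_CIRCLES
        if haversine_miles(lat, lon, c_lat, c_lon) <= radius
    ]
```

For example, `covering_circles(51.5074, -0.1278)` (central London, approximately) falls inside the Birmingham circle, while a point in Paris falls inside none of the three.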

This method has allowed us to obtain a more accurately distributed dataset on 284 unique UK locations, which fall within the set TTWAs and play host to tech meetup groups.

We have used an unsupervised clustering analysis of group topics to classify them into areas of activity (or tech specialisation), and their location to group them into TTWAs. We have also used location data about meetup users to generate counts of unique members in each area who are interested in, or involved in, tech topics.

Formal industry events, Eventbrite 

We used Eventbrite’s open API to access data about formal tech events (defined as those classified by the event organiser as ‘Science and Tech’).

Our dataset contains a description of each event, its location (however, this is often not specific, e.g. UK) and its date. This data was not used extensively in the report due to limited location data. The use of Eventbrite data enables an understanding of the formal underpinnings of the sector and the measurement of recent changes.

Online development collaborations, GitHub 

We used GitHub’s open API to access data about recently active, UK-based developers. Our dataset contains information about their location, the repositories (also known as repos or projects) they are involved with and key metrics about those repos, including the number of collaborators and their programming languages.

We have used those metrics to quantify levels of online collaboration by looking at the repos of top UK users, and to identify developers’ ‘dominant’ programming languages (based on the distribution of lines of code they have contributed to repos in each language). We then combined the local and national distribution of developers specialising in each language to generate local indices of programming language specialisation.
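One plausible reading of these two steps is sketched below: a developer's dominant language is the one with the most contributed lines of code, and a local specialisation index is the local share of developers with that dominant language divided by the national share. The exact formula used for the report is not stated, so the ratio form is an assumption, as are the function names and sample data.

```python
def dominant_language(lines_by_language):
    """Language in which a developer has contributed the most lines of code."""
    return max(lines_by_language, key=lines_by_language.get)


def specialisation_index(local_dominant, national_dominant, language):
    """Local share of developers specialising in a language, relative to
    the national share. Values above 1 suggest local specialisation.

    Both arguments are lists with one dominant language per developer.
    """
    local_share = local_dominant.count(language) / len(local_dominant)
    national_share = national_dominant.count(language) / len(national_dominant)
    return local_share / national_share


# Hypothetical data, for illustration only.
developer = {"Python": 1200, "JavaScript": 300}
local = ["Python", "Python", "Go", "Java"]
national = ["Python", "Python"] + ["JavaScript"] * 5 + ["Java"] * 3

index = specialisation_index(local, national, "Python")
```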

Digital job vacancies and skills, Adzuna 

The Adzuna API was used to collect data on open job ads in the UK. We collected 50,000 ads by iterating through 1,000 pages with 50 ads per query, resulting in a dataset with information on role descriptions, locations and maximum/minimum salary. Because the Adzuna API had no classification system to allow the identification of digital tech roles and non-digital skills, we used Python if/else statements to categorise ads based on the information within their descriptions. As a result, we were able to classify the ads into three different skill identifiers (digital skills, digitally-enabled skills and non-digital skills) and obtain the location distribution of these skills by NUTS-1 region, in addition to their average salary distribution and difference.
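A minimal sketch of this kind of rule-based classification is shown below. The keyword lists, field names (`description`, `salary_min`, `salary_max`) and sample ads are hypothetical, since the actual rules used for the report are not listed, and no Adzuna API call is made. Whole-word matching is used so that, for instance, "excel" does not match "excellent".

```python
import re
from collections import defaultdict

# Hypothetical keyword lists; the report's actual rules are not published here.
DIGITAL_KEYWORDS = ("software developer", "programmer", "devops")
DIGITALLY_ENABLED_KEYWORDS = ("excel", "crm", "social media")


def mentions_any(text, keywords):
    """True if any keyword appears in the text as a whole word or phrase."""
    return any(re.search(r"\b" + re.escape(k) + r"\b", text) for k in keywords)


def classify_ad(description):
    """Assign one of three skill identifiers to a job ad description."""
    text = description.lower()
    if mentions_any(text, DIGITAL_KEYWORDS):
        return "digital"
    elif mentions_any(text, DIGITALLY_ENABLED_KEYWORDS):
        return "digitally-enabled"
    else:
        return "non-digital"


def average_salary_by_class(ads):
    """Mean of the min/max salary midpoint for each skill identifier."""
    midpoints = defaultdict(list)
    for ad in ads:
        midpoint = (ad["salary_min"] + ad["salary_max"]) / 2
        midpoints[classify_ad(ad["description"])].append(midpoint)
    return {label: sum(v) / len(v) for label, v in midpoints.items()}


# Hypothetical ads, for illustration only.
ads = [
    {"description": "Senior Software Developer (Python)", "salary_min": 50_000, "salary_max": 70_000},
    {"description": "Marketing assistant with CRM and Excel experience", "salary_min": 22_000, "salary_max": 26_000},
    {"description": "Delivery driver with excellent local knowledge", "salary_min": 18_000, "salary_max": 22_000},
]

averages = average_salary_by_class(ads)
```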

Investment and High growth digital tech firms, Pitchbook and Beauhurst

Pitchbook data is used to look at international investment in digital tech companies. Beauhurst data is used to identify high growth digital tech firms in the UK. The methodology used intends to closely replicate the SIC code based definition used for official statistics. As such we query Beauhurst and Pitchbook's data using the following criteria:

Beauhurst

  • Sectors
    • Retail
      • E-commerce
    • Technology/IP-based businesses
      • Software
        • Desktop software
        • Embedded software
        • Internet platform
        • Middleware
        • Mobile apps
        • Server software
        • Software-as-a-service (SaaS)
        • Other software
      • Business and Professional Services
        • Analytics, insight, tools
        • IT consultancy services
        • IT support service

OR

  • Buzzwords
    • 3D Printing
    • Adaptive learning
    • AdTech
    • Artificial Intelligence
    • Augmented reality
    • Big data
    • Blockchain
    • Chatbots
    • Challenger Banks
    • Crypto-currencies
    • Digital security
    • EdTech
    • eHealth
    • FinTech
    • InsurTech
    • Internet of Things
    • PropTech
    • Robotics
    • VoIP
    • Virtual Reality
    • Wearables

Pitchbook

  • Industry
    • Information Technology
      • Communications and Networking
      • Computer Hardware
        • Storage (IT)
      • IT Services
      • Software
      • Other Information Technology

OR

  • Verticals
    • AdTech
    • AgTech
    • AI & Machine Learning
    • AudioTech
    • Autonomous Cars
    • Big Data
    • CleanTech
    • Cybersecurity
    • E-Commerce
    • EdTech
    • Ephemeral Content
    • FinTech
    • HealthTech
    • InsurTech
    • IoT
    • Marketing Tech
    • Mobile
    • Robotics and Drones
    • SaaS
    • VR
    • Wearables & Quantified Self

Community perceptions survey data, Tech Nation and Streetbees

The Tech Nation 2018 Survey was conducted between 15 January 2018 and 2 February 2018. The survey received 3,438 complete responses. We use postcode data to identify the TTWA of respondents.

When referring to UK-wide perceptions, we weight the sample based on official employment data from the ONS for the digital tech sector in each Travel to Work Area. This enables us to indicate absolute UK perception based on the relative size of the ecosystem in each cluster. This is necessarily a proxy, as we cannot accurately quantify the size of the tech ecosystem in each cluster, and it should be interpreted as an indication of perception for the nation.
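The weighting idea can be illustrated with a short sketch in which each TTWA's share of respondents agreeing with a statement is weighted by that TTWA's share of digital tech employment. The exact weighting scheme used for the survey is not specified, so this is an assumed form; the TTWA names, responses and employment figures are invented.

```python
def weighted_national_share(responses_by_ttwa, employment_by_ttwa):
    """Employment-weighted share of respondents agreeing with a statement.

    responses_by_ttwa: mapping of TTWA -> list of booleans (True = agrees).
    employment_by_ttwa: mapping of TTWA -> digital tech employment.
    """
    total_employment = sum(employment_by_ttwa[t] for t in responses_by_ttwa)
    national = 0.0
    for ttwa, answers in responses_by_ttwa.items():
        local_share = sum(answers) / len(answers)
        national += local_share * employment_by_ttwa[ttwa] / total_employment
    return national


# Hypothetical data, for illustration only.
responses = {
    "TTWA A": [True, True, False, False],  # 50% agree
    "TTWA B": [True],                      # 100% agree
}
employment = {"TTWA A": 3_000, "TTWA B": 1_000}

weighted = weighted_national_share(responses, employment)
unweighted = 3 / 5  # simple share across all five respondents
```

In this example the weighted figure differs from the unweighted one because the larger cluster contributes in proportion to its employment rather than its number of respondents.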

Case studies, Tech founders and CEOs, community partners, and Tech City UK 

We undertook 70 in-depth interviews with digital businesses and organisations supporting digital businesses, such as networks, membership organisations, incubators and inward investment agencies.

Geography

When defining geographical areas in the report, our starting points are the Office for National Statistics’ 2011 Travel to Work Areas (TTWAs).

These units of geography are economically meaningful and are based on statistical analysis rather than administrative boundaries.

TTWAs are currently defined as follows: at least 75 percent of the area's resident workforce work in the area, and at least 75 percent of the people who work in the area also live in the area. In other words, they are the geographical boundaries within which most people both work and live.

To qualify as a TTWA, an area must also have an economically active population of at least 3,500. However, areas with a working population in excess of 25,000 are granted self-containment rates as low as 66.7 percent.

As a result, some areas are much larger than others – much of London and its surrounding area form one TTWA, for example.
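The qualification rules described above can be expressed as a simple check. This sketch is a simplification under stated assumptions: it treats the "economically active" and "working" population as the same figure, and it collapses the "as low as 66.7 percent" allowance into a single step at 25,000, whereas the wording suggests the real rule may be graduated. The function and parameter names are hypothetical.

```python
def qualifies_as_ttwa(working_population, resident_self_containment,
                      workplace_self_containment):
    """Simplified TTWA check based on the criteria described in the text.

    resident_self_containment: share of the area's resident workforce
        who work in the area.
    workplace_self_containment: share of people working in the area
        who also live in the area.
    """
    if working_population < 3_500:
        return False
    # Larger areas are allowed a lower self-containment rate (assumed
    # here to apply as a single step rather than a sliding scale).
    threshold = 0.667 if working_population > 25_000 else 0.75
    return (resident_self_containment >= threshold
            and workplace_self_containment >= threshold)
```

For instance, a hypothetical area of 30,000 workers with self-containment rates of 70 and 68 percent would qualify, while an area of 10,000 workers with the same rates would not.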

Overall, there are 228 TTWAs in the UK (as calculated using Census 2011 data): 149 in England, 45 in Scotland, 18 in Wales, 10 in Northern Ireland and six cross-border TTWAs.

Where we are not using TTWA geographies in the report, we explain why, and the alternative lens we are looking through.

Glossary

Digital tech cluster – a critical mass of digital technology businesses within a geographic location, which interact formally (e.g. by trading or forming partnerships) and informally (e.g. networking, socialising).

Digital job advertisements – online job advertisements for digital occupations (see Methodology). (Source: Adzuna)

Digital salary – average annual salary for digital roles, distinct from the average annual salary across all roles. (Source: Adzuna/ ASHE)

Productivity – nationally, this is a calculation of GVA per worker within a given industry. Locally, it is based on turnover per worker. (Source: ONS Annual Business Survey/ ONS Business Structure Database)                         

Employment/ jobs in digital tech – all digital and non-digital jobs within the digital tech industries. (Source: ONS Business Structure Database)

Digital tech industries – businesses operating in 4-digit Standard Industrial Classification (SIC) codes in Nesta/techUK (2015). (Source: ONS Business Structure Database)

Digital tech jobs – jobs classified as ‘information economy occupations’ by Nesta/techUK (2015). (Source: ONS Annual Population Survey)

GVA (Gross Value Added) – GVA measures the contribution of each economic unit by estimating the value of an output (goods or services) less the value of inputs used in that output’s production process. It is employed in the estimation of GDP. (Source: ABS/BSD)

Location Quotient – a location quotient measures an area’s degree of specialisation in any given activity (e.g. a tech sector) compared to that of a larger geographical area (e.g. the UK). It is calculated by comparing the importance of the sector within that area, (e.g. percentage of business turnover), with its importance nationally. A score above 1 indicates relative specialisation in the sector in that location. A score below 1 indicates relative lack of specialisation in the sector in that location.

Meetup – Meetup.com is an online social networking portal that facilitates offline group meetings in locations across the world. A meetup group refers to the offline meetings that take place.

Non-digital tech industries – those 4-digit SIC industries not classed as digital tech industries in the Nesta/techUK (2015) definition.    

Scale-up – a company that is fast growing and has normally received a number of funding rounds to support its growth.

Standard Industrial Classification (SIC) codes – a set of internationally agreed codes used to classify businesses into industries.

Standard Occupational Classification (SOC) – a set of codes used to classify work into occupational categories, by skill level and skill content.

Startup – a company with a minimum viable product, working towards establishing a repeatable and scalable business model.

TTWA (Travel to Work Area) – a geographical unit created to approximate labour market areas, TTWAs are self-contained areas within which most people both live and work – calculated using Census data (see Geography). (Source: ONS)  

Turnover – the amount of money taken by a business in a particular period.

Digital infrastructure – infrastructure networks for data and voice communications. Often adopted to denote all technologies that enable internet access and broadband.

Tech Nation 2018 is now open!

Great news. All data featured in the 2018 Tech Nation Report is available online for non-commercial use by third parties.  The data can be accessed through the data.world platform. If you use the data, please let us know. We would love to showcase your work.

If you'd like to know why we are doing this, check out our blog.
