Behind the Scenes of AI: What are Data Annotators Doing and Whom Do They Work For?
Sanja Tumbas, Assistant Professor, IESE Business School, Barcelona, Spain
Brana Savic, PhD Candidate, Universitat Politècnica de Catalunya, Barcelona, Spain
Extended Abstract
The rapid development and implementation of artificial intelligence (AI) profoundly affects tasks and occupational boundaries (Faraj et al., 2018). Recent attempts to depict the future of work cluster at two extreme ends of a continuum: either optimistic scenarios of a workless future with more leisure time (Avent, 2016; Bregman, 2017; Dunlop, 2016; Frase, 2016; Graeber, 2015; Mason, 2015; Srnicek & Williams, 2015) or dire predictions of unemployment on an unprecedented scale (Armstrong, 2014; Ford, 2015; Kaplan, 2015; Leonhard, 2016).
However, AI relies not only on highly skilled, well-paid engineers and scientists but also on the less visible contribution of a low-qualified, low-paid workforce (Fleming, 2019). Given the wide range of abilities now assigned to machines, the relevant skills span "cognitive functions such as perceiving, reasoning, learning, interacting with the environment, problem solving, decision-making, and even demonstrating creativity" (Rai et al., 2019). Hence, it is becoming extremely challenging to disentangle how and why a given machine learning algorithm makes a decision (Burrell, 2016; von Krogh, 2018).
Organizations face multiple AI-related tasks, ranging from activities concerned with task inputs (data: sound, text, images, and numbers) to task processes (algorithms) and task outputs (solutions and decisions) (von Krogh, 2018). To conduct an in-depth analysis of the types of tasks involved, we explore the initial process concerned with data input. To allow for variety in the analysis, our study draws on the notion of human–AI hybrids (Rai et al., 2019). Rather than assuming a substitution relation between AI and humans, we remain open to the view that humans and AI augment one another (Rai et al., 2019).
Our primary goal is to gain new insights into the different journeys that input data follows in the process of feeding complex learning algorithms. The job profile most often associated with data input activity is the "data annotator." To span a range of scenarios, we examine three vignettes: (1) data annotators recruited through digital labour platforms (e.g., Upwork), where they act as independent contractors paid by piece rate, with varying levels of activity and engagement; (2) data annotators employed full-time by technology companies that specialize in developing datasets for machine learning and artificial intelligence (e.g., Appen); and (3) data annotators as permanent employees of technology companies (such as Facebook) or firms in other sectors (e.g., car manufacturers developing autonomous vehicles).
In our analysis, we code for tasks and competencies to better understand the qualifications required for data annotator positions, using job announcement descriptions collected from two online platforms, LinkedIn and Glassdoor. Following a systematic process of data collection and documentation, we analyse the data using qualitative content analysis techniques to generate a profile of the position, which can serve as a basis for further analysis. We derived over 500 task, knowledge, skill, and ability statements from the job announcements, then coded and grouped the competencies described in each announcement.
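The coding step described above can be illustrated with a minimal sketch. The codebook entries, category labels, and example statements below are hypothetical, invented purely to show how extracted task/KSA statements might be tallied into competency categories; they are not the study's actual coding scheme.

```python
from collections import Counter

# Hypothetical codebook: maps example phrases found in job announcements
# to competency categories (all phrases and labels are illustrative).
CODEBOOK = {
    "label images": "technical",
    "annotate audio": "technical",
    "communicate with engineers": "interpersonal",
    "identify process bottlenecks": "problem-solving",
    "prioritize tasks": "management",
}

def code_statements(statements):
    """Tally each extracted task/KSA statement into competency categories."""
    counts = Counter()
    for statement in statements:
        text = statement.lower()
        for phrase, category in CODEBOOK.items():
            if phrase in text:
                counts[category] += 1
    return counts

statements = [
    "Label images according to project guidelines",
    "Communicate with engineers about tooling issues",
    "Identify process bottlenecks in the annotation pipeline",
]
print(code_statements(statements))
```

In practice such coding is done by human analysts rather than keyword matching; the sketch only conveys the grouping-and-counting logic behind the competency profile.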
Our initial results suggest that both technical and non-technical skills are highly valued for the data annotator position. Job requirements place a strong emphasis on problem-solving, management, communication, and interpersonal skills. Although each individual data annotation task may be quick and unchallenging, our results provide compelling evidence that, by working together with a wide variety of profiles, data annotators deliver valuable outputs that contribute significantly to improved processing tools and new features, greater efficiency, and higher data quality.
Data annotators are involved in a variety of processes in AI production. These range from small, fragmented, remotely performed annotation tasks, such as answering simple questions, identifying objects in a photograph, tagging and labelling images, categorizing audio files, transcribing, correcting, or copying short texts, and sorting items in a list, to data discovery and collection, metrics evaluation, revision of the annotation toolset, identification of bottlenecks, and improvement of tools and processes. These responsibilities appear in the task descriptions for both identified positions, data annotator and data annotation specialist, although each has its own focus: the former on more repetitive and fragmented annotation tasks, the latter on improving data quality and the annotation process.
Our analysis also indicates that collaboration is a highly important requirement for data annotation positions. Data annotators need to interact and work closely with different profiles and various departments in the organization. They cooperate with engineering and organizational development teams, providing feedback on tooling and processes; with data science and expert teams, supporting their research and assisting in scientific data manipulation, analysis, and visualisation; and with project managers and project leads, ensuring that annotations meet project requirements and that project goals are achieved.
The production of AI is a labour-intensive process, and data annotation tasks are becoming increasingly complex and skill-demanding. Our analysis indicates that the need for data annotation is structural, bound to persist alongside the future development of the sector.
References:
Armstrong, S. (2014). Smarter Than Us: The Rise of Machine Intelligence. Berkeley, CA: Machine Intelligence Research Institute.
Avent, R. (2016). The Wealth of Humans: Work, Power, and Status in the Twenty-First Century. New York: St. Martin's Press.
Bregman, R. (2017). Utopia for Realists: And How We Can Get There. London: Bloomsbury.
Burrell, J. (2016). How the machine 'thinks': Understanding opacity in machine learning algorithms. Big Data & Society, 3(1).
Dunlop, T. (2016). Why the Future Is Workless. Sydney, Australia: NewSouth Publishing.
Faraj, S., Pachidi, S., & Sayegh, K. (2018). Working and organizing in the age of the learning algorithm. Information and Organization, 28(1), 62–70.
Fleming, P. (2019). Robots and organization studies: Why robots might not want to steal your job. Organization Studies, 40(1), 23–38.
Ford, M. (2015). The Rise of the Robots: Technology and the Threat of Mass Unemployment. London: Oneworld Publications.
Frase, P. (2016). Four Futures: Life After Capitalism. London: Verso.
Graeber, D. (2015). The Utopia of Rules: On Technology, Stupidity and the Secret Joys of Bureaucracy. Brooklyn, NY: Melville House.
Kaplan, J. (2015). Humans Need Not Apply: A Guide to Wealth and Work in the Age of Artificial Intelligence. New Haven, CT: Yale University Press.
Leonhard, G. (2016). Technology vs. Humanity: The Coming Clash Between Man and Machine. New York: Fast Future Publishing.
Mason, P. (2015). Postcapitalism: A Guide to Our Future. London: Allen Lane.
Rai, A., Constantinides, P., & Sarker, S. (2019). Next-generation digital platforms: Toward human–AI hybrids. MIS Quarterly, 43(1), iii–ix.
Srnicek, N., & Williams, A. (2015). Inventing the Future: Postcapitalism and a World Without Work. London: Verso.
von Krogh, G. (2018). Artificial intelligence in organizations: New opportunities for phenomenon-based theorizing. Academy of Management Discoveries, 4(4), 404–409.