Information System Development in the Age of AI: An ethnography of developing machine learning for recruitment

Elmira van den Broek, VU University

Anastasia Sergeeva, VU University

Marleen Huysman, VU University

Introduction

Artificial intelligence (AI), and more specifically machine learning (ML), increasingly influences high-stakes decisions that were once made by experts, ranging from medical diagnosis to assessing creditworthiness. ML applications differ from conventionally developed systems in that they derive insights from patterns in data rather than from domain experts (Huysman and De Wit 2004; Vogelsang and Borg 2019). This shift in the development paradigm creates a need to understand how the development of ML is managed in practice. We address this need by studying an AI vendor that develops a recruiting tool powered by ML and neuroscience to enable objective and data-driven hiring decisions.

Method

We conducted a 19-month field study of an AI vendor in Europe, “NeuroYou” (pseudonym), that develops an AI recruiting application to assist HR managers in making hiring decisions. The application was supported by machine learning (ML), which refers to a broad set of techniques that can adapt and learn from data about past decisions to generate predictions (Agrawal et al. 2018). The application offered measurements of more than 150 human traits and skills by drawing on gamified neuroscience assessments and automated video analysis. NeuroYou employed a team of around 30 people, consisting of a management team, a software team, an R&D team, a sales team, and a support team. Our data sources include field notes from non-participant observation of work practices, 53 semi-structured interviews (20 AI developers, 30 users of the tool, 3 HR managers), and more than 500 company documents (e.g. PowerPoints, client proposals).

Findings

We organize our findings around four phases that emerged from our data analysis: convincing, eliciting, modeling, and validating. In the first phase, developers aimed to convince clients of the superiority of AI for making hiring decisions.
However, clients often had an insufficient understanding or unrealistic expectations of the tool. Moreover, clients questioned the validity of the tool, as they found it difficult to relate the assessments to actual job performance. Finally, developers also experienced resistance from domain experts, who feared being replaced or objected to the invasive nature of the assessment. The developers coped with these challenges by demonstrating their neuroscience background to gain authority, simplifying difficult language, and emphasizing human control.

Once the convincing phase was successfully completed, developers moved on to the elicitation phase. In contrast to conventional approaches, in which developers were dependent on the ability and motivation of domain experts to transfer their expertise, developers now required access to a set of input-output pairs, or training data. However, training data of sufficient quantity and quality proved difficult to obtain in practice, as developers depended on clients for data supply. Developers managed these challenges by, for example, assisting clients in promoting the assessments, augmenting small training data sets with data from other clients, and designing the assessments to minimize socially desirable and “gamed” answers.

In the third phase, modeling, developers sought to inductively generate an algorithm from the training data rather than manually coding rules. However, developers sometimes found it difficult to work with inductive models in practice, for several reasons. First, the inductive nature of ML made it hard for both developers and clients to understand and explain the algorithm and its predictions. Developers also had to deal with tensions arising from a gap between predictions from the ML model and actual client outcomes.
Finally, developers could be confronted with client requests that undermined their scientific and data-driven methodology, clashed with legal and regulatory requirements, or were deemed unethical. Developers aimed to deal with these issues by defending and emphasizing their scientific knowledge base, “managing the message” by presenting analysis results to clients in the best possible way, and staying true to their company values when facing dilemmas.

In the final, ongoing validating phase, developers tried to obtain a continuous flow of data from the client in order to validate and verify the algorithm over time. However, once developers started working with clients, they found that clients were often unable or unwilling to feed data back into the system, or provided feedback data containing “bias”. Developers aimed to solve these issues by educating clients on the importance of feedback loops, analyzing data for bias, and excluding biased data from their ML models.

Conclusion

In sum, our ethnography of ML in development shows how the ML-based nature of the tool gives rise to new tensions and challenges for developers compared to conventional approaches. In particular, developers have to convince clients of the superiority of AI for hiring decisions, and clients in turn have to continuously supply valuable data to the developers. This ultimately results in clients “buying into” a different understanding of and approach to recruitment.

References

Agrawal, A., Gans, J., and Goldfarb, A. 2018. Prediction Machines: The Simple Economics of Artificial Intelligence. Harvard Business Press.

Huysman, M., and De Wit, D. 2004. “Practices of Managing Knowledge Sharing: Towards a Second Wave of Knowledge Management,” Knowledge and Process Management (11:2), pp. 81-92.

Vogelsang, A., and Borg, M. 2019. “Requirements Engineering for Machine Learning: Perspectives from Data Scientists,” arXiv:1908.04674.