
Requirements Engineering (2024) 29:567–600 
https://doi.org/10.1007/s00766-024-00432-3

ORIGINAL ARTICLE

How mature is requirements engineering for AI‑based systems? 
A systematic mapping study on practices, challenges, and future 
research directions

Umm‑e‑ Habiba1,2  · Markus Haug1 · Justus Bogner3 · Stefan Wagner4

Received: 15 February 2023 / Accepted: 1 October 2024 / Published online: 23 October 2024 
© The Author(s) 2024

Abstract
Artificial intelligence (AI) permeates all fields of life, which has resulted in new challenges in requirements engineering for 
artificial intelligence (RE4AI), e.g., the difficulty in specifying and validating requirements for AI or considering new 
quality requirements due to emerging ethical implications. It is currently unclear if existing RE methods are sufficient or if 
new ones are needed to address these challenges. Therefore, our goal is to provide a comprehensive overview of RE4AI to 
researchers and practitioners. What has been achieved so far, i.e., what practices are available, and what research gaps and 
challenges still need to be addressed? To achieve this, we conducted a systematic mapping study combining query string 
search and extensive snowballing. The extracted data was aggregated, and results were synthesized using thematic analysis. 
Our selection process led to the inclusion of 126 primary studies. Existing RE4AI research focuses mainly on requirements 
analysis and elicitation, with most practices applied in these areas. Furthermore, we identified requirements specification, 
explainability, and the gap between machine learning engineers and end-users as the most prevalent challenges, along with 
a few others. Additionally, we proposed seven potential research directions to address these challenges. Practitioners can use 
our results to identify and select suitable RE methods for working on their AI-based systems, while researchers can build on 
the identified gaps and research directions to push the field forward.

Keywords RE · AI · Systematic mapping study · RE4AI · SMS

1 Introduction

In the past decade, the need for automation and intelligence 
has led to huge advancements in machine learning (ML) and 
artificial intelligence (AI) [1]. Despite substantial growth in 
AI-based systems, we continue to see significant challenges 
and project failures [2]. Weiner [3] stated that according to 
recent data, 87% of AI projects do not make it into produc-
tion, meaning that most projects are never deployed. One 
primary reason is that AI has disrupted traditional software 
development practices, which are typically deductive, where 
requirements are explicitly defined and translated into code. 
In contrast, AI-based systems, i.e., systems that incorporate 
AI components [4], are developed inductively, as they learn 
and adapt from training data. This shift in approach makes 
it challenging to anticipate and understand the behavior of 
AI-based systems. Due to the critical role of AI-based sys-
tems, software engineering (SE) and the AI community must 
collaborate to develop new strategies to address these issues.

 * Umm-e- Habiba 
 umm-e-habiba@iste.uni-stuttgart.de

 Markus Haug 
 markus.haug@iste.uni-stuttgart.de

 Justus Bogner 
 j.bogner@vu.nl

 Stefan Wagner 
 stefan.wagner@tum.de

1 Institute of Software Engineering, University of Stuttgart, 
70569 Stuttgart, Germany

2 Department of Software Engineering, University of Kotli 
Azad Jammu and Kashmir, Kotli, Azad Kashmir, Pakistan

3 Department of Computer Science, Vrije Universiteit 
Amsterdam, Amsterdam, The Netherlands

4 TUM School of Communication, Information 
and Technology, Technical University of Munich, Heilbronn, 
Germany




Requirements engineering (RE), the systematic handling 
of software requirements [5], has been impacted by the com-
plexities of AI-based systems [6]. Existing RE approaches 
are challenging to apply to AI-based systems because of 
their probabilistic nature and the need for constant adapta-
tion. To address these challenges, RE needs to evolve to be 
compatible with AI-based systems [7]. The roles and respon-
sibilities related to RE have changed, with data scientists 
now being responsible for specifying high-level require-
ments in ML systems, which can lead to systems that prior-
itize data quality over stakeholder requirements [8]. Hence, 
software engineers and data scientists must work together to 
address issues arising from the combination of AI and SE 
[9]. Kondermann [10] claimed that despite its long history, 
RE has yet to be used extensively for AI, especially com-
puter vision, and called for more research to integrate data 
selection approaches. Further, he emphasized that AI system 
requirements are complex to elicit, specify, and manage.

Recently, substantial growth in studies addressing 
Requirements Engineering for Artificial Intelligence 
(RE4AI) can be seen. With this proliferation, there is a need 
to understand what has been achieved so far. This is impor-
tant for practitioners to identify suitable methods for their 
day-to-day work as well as for researchers to tackle impor-
tant challenges. Therefore, we conducted a systematic map-
ping study to explore the potential of RE for contributing 
to AI-based systems. We chose a systematic mapping study 
(SMS) over a systematic literature review because our objec-
tive was to capture the landscape of research in our area of 
study, not just to answer specific questions but to understand 
the diversity and scope of research that has been conducted. 
An SMS allows us to achieve this by categorizing research 
works based on various dimensions such as methodology, 
themes, outcomes, and geographical focus. This study aims 
to explore the current RE4AI research landscape regarding 
RE practices for AI-based systems, topics covered regarding 
SWEBOK [11] areas, maturity of the research area, chal-
lenges, and future directions.

The main contributions of this SMS are summarised as 
follows:

• We provide a research overview regarding RE4AI, i.e., 
which topics have been explored according to SWEBOK 
[11] and the type of research conducted according to the 
classification of Wieringa et al. [12].

• We identify which existing RE practices have been 
applied to AI-based systems and also present new RE 
practices proposed specifically for AI-based systems.

• We identify the current trends and challenges in applying 
the existing RE approaches for AI-based systems.

• Finally, we extracted potential future research directions 
from selected primary studies.

Article organization: The remainder of the article is organized
as follows. Section 2 compares our work with similar existing
studies. Section 3 describes the research design we followed.
Section 4 presents the extraction, synthesis, and results from
the selected primary studies. Section 5 identifies challenges
and future directions, Sect. 6 discusses our results through a
combined analysis of RQ4 and RQ5, and Sect. 7 presents threats
to validity and how we mitigated them. Finally, Sect. 8
concludes the paper.

2  Related work

With the recent growth of AI-based systems, the number 
of empirical studies on RE for AI is increasing, with more 
and more different aspects being investigated. There exist 
secondary studies that highlighted research in the field of 
SE for AI-based systems [4, 13–16]. However, these studies 
are not exclusively focused on the RE process but consider 
it only as a part of SE. In relation to RE, we identified three 
secondary studies on RE for AI/ML [17–19] that are closely 
related to our work. Ahmad et al. [17] conducted a system-
atic literature review and identified 27 primary studies. They 
identified the notations and modeling languages utilized in 
the development of AI systems, focusing specifically on the 
application domains of these AI systems. Villamizar et al. 
[18] conducted a mapping study incorporating findings 
from 35 distinct studies. They identified the contribution 
of RE aspects to the development of ML-based systems, 
focusing on RE topics. Further, they emphasized the quality 
characteristics that are considered during RE for ML-based 
systems. Yoshioka et al. [19] also conducted a systematic 
literature review. Their focus was mainly on current tech-
niques and practices of RE for machine learning systems 
(MLS). They analyzed 32 studies and mapped a research 
landscape of RE techniques and practices while identifying 
research gaps.

Our research significantly updates and expands upon pre-
vious reviews by analyzing 126 primary studies, broaden-
ing the scope to capture developments up to and including 
July 2023. We contribute by organizing current RE practices 
within SWEBOK Knowledge Areas (KAs) and spotlighting 
newly introduced practices.

Our work synthesizes challenges and future research
directions from these studies, providing an in-depth overview
of the evolving trends in the field. In contrast to earlier
reviews, we align our categorization with SWEBOK and employ
Wieringa’s [12] classification and Wohlin’s [20] evaluation
strategy to assess research maturity. This methodology sets
our study apart from others, such as the systematic literature
reviews by Ahmad et al. [17] and Yoshioka et al. [19], which
cover studies up to August 2021, and the study by Villamizar
et al. [18], which concludes in December 2020.

Our inclusion of literature up to and including July 2023 
and our broader examination of AI-based systems, encom-
passing the entirety of AI, allows for a comprehensive under-
standing of RE challenges and practices relevant to AI. By 
including an additional two years and 100 more papers, our 
study reflects the significant shift in focus towards safe and 
responsible AI, as well as the evolving legal requirements 
of AI, particularly as AI impacts many more aspects of life 
following the rise of large language models (LLMs).

We critically assess and identify emerging RE practices 
tailored for AI, addressing the distinctive needs of AI sys-
tems compared to traditional approaches. This thorough 
analysis leads to a detailed compilation of challenges and 
directions for future research, significantly advancing the 
discussion on RE for AI.

We compare the related work to our study in Table 1.

3  Research design

We conducted this systematic mapping study by following 
the guidelines of Petersen et al. [21]. In the following, we 
outline our research questions (RQs) and their rationales, 
how we obtained articles, and how we addressed them 
systematically.

3.1  Research questions and rationales

To define the scope of our study, we formulated RQs that 
highlight the categorization of the literature in a way that 
can be interesting for scholars and practitioners, while also 
providing insights into how RE approaches have been used 
for AI-based system development.

RQ1: Where and when have RE practices for AI-
based systems been published?

We aim to explore the recent trends in this research area 
in terms of the publication year of articles, and how active 
this area has been. For this purpose, we analyze the study 
distribution by year of publication and the most preferred 
venues for RE4AI.

RQ2: How are the different RE topics distributed 
within the literature on AI-based systems?

We intend to use the SWEBOK [11] classification scheme 
to assign RE categories to our primary studies. The most 
recently published version of SWEBOK V3.0 has 8 knowl-
edge areas. Using the RE knowledge area and its categories, 
we analyze frequently studied RE topics, as well as topics 
that may require further attention.

Table 1  Comparison of related studies with our Systematic Mapping Study

| Study | Type of study & study period | Focus of the study | Contribution to RE for AI/ML |
|---|---|---|---|
| Ahmad et al. | Systematic Literature Review; August 2020; identified 27 studies | Explores approaches for documenting requirements in AI systems, identifying tools, techniques, challenges, and limitations. | Extracted RE notations and modeling languages |
| Villamizar et al. | Mapping Study; December 2020; identified 35 studies | Characterizes the RE publication landscape for ML systems, detailing contributions like analyses, approaches, and quality models. Categorizes based on RE activities and empirical evaluation strategies. | Identified RE aspects contributing to ML-based systems |
| Yoshioka et al. | Systematic Literature Review; August 2021; identified 32 studies | Assesses techniques and practices of RE for ML, classifying literature by year, venue, ML algorithms, and stakeholders. Highlights research gaps and directions. | Analyzed current RE techniques and practices for ML systems |
| Our Study | Systematic Mapping Study; up to and including July 2023; identified 126 studies | Presents literature demographics, classifies RE contributions by SWEBOK KAs, and assesses research maturity. Also identifies and analyzes both existing and new RE practices for AI systems, distinctly. Lastly, proposes a taxonomy for challenges and future research derived from primary studies. | Expanded the scope of reviews, identified new RE practices & synthesized challenges and directions for AI |

RQ3: What is the maturity of the research in this area?


To evaluate the maturity of this area, we analyze the pri-
mary studies according to the criteria given below:

• We use Wieringa et al.’s [12] classification scheme for 
RE publications.

• We scrutinize if RE techniques discussed in the paper 
have been empirically evaluated [20].

• If yes, we analyze if the evaluation took place in an 
industrial environment.
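
The three criteria above can be read as a small decision procedure applied to each primary study. The following is a minimal sketch under stated assumptions: the `StudyRecord` type, its field names, and the three evidence labels are hypothetical and are not taken from the extraction sheets used in this study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StudyRecord:
    """Hypothetical extraction record for RQ3 (field names are assumptions)."""
    study_id: str
    research_type: str           # publication type following Wieringa et al. [12]
    empirically_evaluated: bool  # was the RE technique evaluated at all? [20]
    industrial_setting: bool     # did the evaluation take place in industry?


def evidence_level(record: StudyRecord) -> str:
    """Map the second and third criteria to an illustrative label."""
    if not record.empirically_evaluated:
        return "not evaluated"
    if record.industrial_setting:
        return "evaluated in industry"
    return "evaluated outside industry"


# Made-up example record; "solution proposal" is one of Wieringa's types.
example = StudyRecord("S1", "solution proposal", True, False)
print(evidence_level(example))
```

Keeping the research type separate from the two evaluation flags mirrors the order of the criteria: the third criterion is only meaningful when the second one holds.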

RQ4: What RE practices have been proposed for or 
applied to AI-based systems?

This question aims to identify new RE practices that have 
been proposed in the context of AI-based systems and how 
existing RE practices have been used for AI-based systems. 
We intend to provide a holistic overview of the degree to 
which RE has contributed to AI. We answer the following 
sub-questions:

• RQ4.1: Which conventional RE practices have been 
applied to AI-based systems?

• RQ4.2: Which new RE practices have been proposed
for AI-based systems?

We classify practices into tools, techniques, processes, and
models. To make a clear distinction, we define these terms
below.

• Technique: A technique is a specific method applied during a
part of the procedure, focusing on actions that are observable
and measurable by practitioners [22, 23]. For example, a
common technique in requirements gathering is the use of
interviews, where the interviewer follows a structured
approach to elicit information from stakeholders. Another
example is prototyping, a technique used to quickly create a
working model of a system’s features to gather feedback.

• Process: A process is a set of activities to reach a certain
goal; it specifies the concrete activity details and the
sequence in which they are performed. A technique, in
contrast, is a specific way to perform one of those steps.

• Model: A model is an abstraction of a system that describes
it from a certain perspective, for example a UML (Unified
Modeling Language) use case diagram. In the context of RE
for AI, models can also include frameworks or theoretical
constructs.

• Tool: Any software that has been used to support the RE
process.
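
To illustrate how the four practice categories and the split between RQ4.1 and RQ4.2 could be recorded, the following is a minimal sketch. The `PracticeKind` and `Practice` types and the grouping function are assumptions made for illustration, and the last list entry is a made-up placeholder rather than a practice reported in the primary studies.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class PracticeKind(Enum):
    TOOL = "tool"
    TECHNIQUE = "technique"
    PROCESS = "process"
    MODEL = "model"


@dataclass(frozen=True)
class Practice:
    name: str
    kind: PracticeKind
    is_new: bool  # True: proposed for AI (RQ4.2); False: conventional (RQ4.1)


def group_by_question(practices):
    """Group practice names by sub-question and practice kind."""
    groups = defaultdict(list)
    for practice in practices:
        question = "RQ4.2" if practice.is_new else "RQ4.1"
        groups[(question, practice.kind)].append(practice.name)
    return dict(groups)


# Examples named in the definitions above, plus one hypothetical placeholder.
practices = [
    Practice("interviews", PracticeKind.TECHNIQUE, is_new=False),
    Practice("prototyping", PracticeKind.TECHNIQUE, is_new=False),
    Practice("UML use case diagram", PracticeKind.MODEL, is_new=False),
    Practice("placeholder AI-specific process", PracticeKind.PROCESS, is_new=True),
]
groups = group_by_question(practices)
```

Using an enumeration for the kind keeps the four categories closed, so a practice cannot be recorded under a category that has no definition.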

RQ5: What makes requirements engineering for AI-based
systems challenging, and what are future directions to
address these challenges?

To answer this question, we analyze AI-based systems 
and their key characteristics in the targeted studies. This 
analysis helps us to identify differences between AI-based 
systems and conventional systems, as well as RE challenges 
for AI-based systems. Furthermore, we analyze studies that 
propose future research directions. To address this RQ, we 
split it into two sub-research questions:

• RQ5.1: What are the current challenges in the develop-
ment of AI-based systems?

• RQ5.2: What are future research directions?

3.2  Research protocol

To execute an impartial, objective investigation, a research 
protocol is required. It manages the flow of research and 
maximizes the study’s valuable findings. We created a 
research protocol that describes the parts of the study and is 
depicted in Fig. 1. The main steps of the research protocol 
are as follows. 

Fig. 1  Search Process



1. Search query formulation: We formulated our search 
query using the first two elements of the PICO criteria
[21]. The first element is the Population ("P"),
which indicates RE publications. The second element
is the Intervention ("I"), which specifies AI, where ML
and deep learning (DL) are a part of AI. We excluded
the Comparison ("C") and Outcome ("O") criteria from our
search to broaden its scope, allowing us to capture a 
wider range of studies, which is especially useful for 
exploratory research or obtaining a comprehensive field 
overview as in an SMS. We constructed our query itera-
tively and restricted our search to article titles to achieve 
the best results. Initially, we evaluated four digital
libraries (ACM Digital Library, IEEE Xplore, ScienceDirect,
and Springer) for our search. Our findings indicated
that IEEE Xplore and ACM Digital Library were most 
effective in handling our search query. Concerns may 
arise regarding the exclusivity of our selection poten-
tially overlooking relevant studies, particularly from 
Springer. To address this, we incorporated a snowball-
ing technique, systematically reviewing references from 
our initial findings to ensure no significant work was 
overlooked. Alongside these selected digital libraries, 
we expanded our search to include meta-search engines, 
specifically Google Scholar and Web of Science, focus-
ing on article titles to refine our results. This comprehen-
sive approach, combining direct searches with snowball-
ing, was designed to ensure thorough coverage of the 
literature. The finalized search string is given below:

 ("requirement" OR "requirements")

 AND

 ("AI" OR "artificial intelligence" OR "ML" 
OR "machine learning" OR "DL" OR "deep 
learning")

  This query resulted in 123 articles from the ACM Digital
Library and 260 from IEEE Xplore, while the meta-search
engines Web of Science and Google Scholar returned 452
and 955 articles, respectively. In total, we obtained 1790
papers.

2. Removal of duplicates: In this step, we removed duplicated
articles, as we ran our query on two digital libraries
and two meta-search engines. After the removal of
duplicates, we were left with 795 papers.

3. Inclusion/exclusion criteria: Our query yielded literature
that matched the search keywords in the title. To narrow
the literature to the scope of our study, we developed
inclusion criteria (IC) and exclusion criteria (EC). 

1. EC1: Not relevant to the scope of the study (i.e.,
studies that do not focus on RE for AI)
2. EC2: Published before 2010 or after July 2023
3. EC3: Not written in English
4. EC4: Secondary studies
5. EC5: Not peer-reviewed / not a scientific paper
6. EC6: Not accessible
7. IC1: The primary focus of the paper is requirements
engineering
8. IC2: The paper targets AI-based systems

   During our study, we used Rayyan [24] to remove
duplicates and to apply the inclusion and exclusion
criteria to the 795 unique papers. These criteria were
used to select the articles that fit our study scope.
EC3 and EC6 were designed to exclude studies that are
not written in English or are not accessible through
any source, while EC2 strictly limits the publication
years considered during the mapping study. To exclude
secondary studies and grey literature, we applied EC4
and EC5. IC1, IC2, and EC1 required an in-depth reading
of the articles to judge whether they fit the scope of
the study. The first three authors independently assessed
the 795 studies using these criteria. Discrepancies in
their selections were resolved through discussion and
consensus voting. This process resulted in the selection
of 93 articles.

4. Data extraction: We extracted data from each primary
study to answer the research questions described in
Sect. 3.1. We defined extraction sheets to record the
necessary information for each publication. A predefined
extraction sheet reduces the opportunity for researcher
bias, because the researchers extract only data that
answers the research questions. Before extraction started,
we conducted a pilot extraction to develop a shared
understanding and avoid confusion regarding the extraction
process. This pilot ensured that each researcher clearly
understood the research questions and the respective
extraction sheets. For this purpose, we selected three
initial studies, and each researcher independently
extracted data into their own sheet. Afterward, we
discussed the extracted data and further improved the
extraction sheets. We outlined the individual data cells
according to each research question. Since each RQ has
multiple fields, we maintained a separate spreadsheet for
each RQ. RQ1 primarily focuses on metadata, including the
year of publication, the publishing venue, and the involved
research community (we classified papers based on author
affiliations as industry, academic, or collaborative). To
identify which RE topics have been covered frequently
within the literature (RQ2), we classified the primary
studies using the SWEBOK [11] subcategories for RE.
Moreover, to judge the maturity of this research area
(RQ3), we classified the literature according to the RE
publication types proposed by Wieringa et al. [12] as well
as the empirical evaluation method [20]. We characterized
RE practices in four dimensions: tool, technique, model,
and process. RQ4 is therefore designed to capture these
details and to differentiate between new and already
existing practices. Finally, we extracted challenges
highlighted by different authors and synthesized the
literature to outline possible research directions (RQ5).
We then split up the extractions and assigned two
researchers to each paper, and a weekly synchronization
meeting was held to discuss the extractions, as shown in
Fig. 2. A separate consensus spreadsheet was maintained in
which all finalized entries were recorded. Further, to
analyse inter-rater reliability, the agreement level was
measured using Cohen’s kappa coefficient, a statistical
measure of inter-rater reliability. For the
inclusion/exclusion criteria, the researchers used the
rayyan.ai tool and independently performed the
inclusion/exclusion of papers; no conflicts were found
during this process. For the thematic analysis, the
researchers evaluated 5 papers with 4 questions each,
making a total of 20 evaluations.

• Total items evaluated (N): 5 papers x 4 questions = 
20

• Agreement on all 4 questions for 3 papers: 3 x 4 = 
12 agreements

• Agreement on 3 questions for 1 paper: 3 agreements
• Agreement on 2 questions for 1 paper: 2 agreements
• Total agreements (A): 12 + 3 + 2 = 17
• Total disagreements (D): 20 - 17 = 3

   Cohen’s kappa ($\kappa$) is calculated as follows:

   $$\kappa = \frac{P_o - P_e}{1 - P_e}$$

   where $P_o$ is the observed agreement and $P_e$ is the
expected agreement by chance.

1. Observed agreement $P_o$: number of agreements / total
items evaluated $= 17/20 = 0.85$

2. Expected agreement $P_e$: assuming equal probability for
agreement and disagreement, $P_e$ = (probability of both
agreeing) + (probability of both disagreeing). Again,
assuming equal distribution (0.5 for agreement and 0.5 for
disagreement): $P_e = (0.5 \times 0.5) + (0.5 \times 0.5) = 0.25 + 0.25 = 0.50$

3. Cohen’s kappa ($\kappa$): $\kappa = \frac{0.85 - 0.50}{1 - 0.50} = \frac{0.35}{0.50} = 0.70$

   This kappa value indicates a substantial level of
agreement between the researchers, supporting the
reliability of the conclusions derived from their analyses.

5. Snowballing: Following the first iteration of extrac-
tions, we applied forward and backward snowballing 
according to the guidelines by Wohlin [25]. Snowball-
ing, which involves using the references of identified 
papers to find additional relevant studies, can be espe-
cially effective in fields where consistent terminology 
is lacking. This approach helped us identify important 
studies that might be missed due to inconsistent key-
word use in database searches. By using snowballing, 
we ensured a more comprehensive review by capturing 
relevant research that might not be easy to find through 
traditional database searches alone. To ensure overall 
coverage, snowballing iterations were performed until 
no further studies were included. The first round on the 
start set of 93 articles yielded 15 additional papers. After 
extracting these 15 papers, the second round of snow-
balling was carried out, which resulted in 15 more arti-
cles. We then performed a third iteration of snowballing, 
which yielded 3 more articles. Lastly, snowballing on 
these 3 articles did not result in additional papers. After 
this process, we ended with a final set of 126 primary 
studies.

6. Data synthesis: We began data analysis and synthesis
once the extractions had been completed. To categorize the
retrieved data, we used both quantitative and qualitative
analysis. Some extraction discrepancies and errors were
detected during this process and were resolved. The first
author performed the synthesis and frequently presented
the results to the other authors, leading to iterative
refinements. To address RQ1, we conducted a frequency
analysis of the bibliographical data. For RQ2 and RQ3, we
undertook quantitative analyses: the analysis for RQ2
categorized the literature based on SWEBOK KAs, whereas
RQ3 classified the literature according to Wieringa’s [12]
framework. Additionally, we identified and analyzed the
evaluation methods used for RE practices. To answer RQ4,
we employed a combination of methods, including
qualitative thematic analysis as recommended by Cruzes
et al. [26]. For both sub-questions of RQ5, we conducted a
qualitative analysis using the thematic synthesis approach
recommended by Cruzes et al. [26]: we extracted free text
from the papers, labeled it, and then identified the most
recurrent themes and assigned them to the extracted text.
In the following section, we present our data extraction
results and their mapping to the respective research
questions.

Fig. 2  Data extraction and synthesis process
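
The mechanical parts of the protocol above (title search, duplicate removal, iterative snowballing, and the agreement check) can be sketched as small functions. This is a minimal illustration under stated assumptions, not the tooling used in the study: whole-word, case-insensitive title matching is an assumption about how the digital libraries interpret the query, the title normalization is a stand-in for the duplicate detection done in Rayyan, the `neighbours` mapping is a hypothetical citation graph, and `simplified_kappa` follows the simplifying assumption of a fixed chance agreement of 0.5 used in the text, whereas the standard definition derives the expected agreement from each rater's label frequencies.

```python
import re
from collections.abc import Callable

# Terms taken from the search string in step 1.
REQ_TERMS = ("requirement", "requirements")
AI_TERMS = ("AI", "artificial intelligence", "ML", "machine learning",
            "DL", "deep learning")


def _contains(title: str, term: str) -> bool:
    """Case-insensitive whole-word (or whole-phrase) match."""
    return re.search(rf"\b{re.escape(term)}\b", title, flags=re.IGNORECASE) is not None


def matches_query(title: str) -> bool:
    """(requirement OR requirements) AND (AI OR ... OR deep learning)."""
    return (any(_contains(title, t) for t in REQ_TERMS)
            and any(_contains(title, t) for t in AI_TERMS))


def deduplicate(titles: list[str]) -> list[str]:
    """Keep the first occurrence of each title, ignoring case and punctuation."""
    seen: set[str] = set()
    unique: list[str] = []
    for title in titles:
        key = re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()
        if key not in seen:
            seen.add(key)
            unique.append(title)
    return unique


def snowball(start: set[str],
             neighbours: dict[str, set[str]],
             relevant: Callable[[str], bool]) -> set[str]:
    """Repeat snowballing rounds until a round adds no new relevant study."""
    included = set(start)
    frontier = set(start)
    while frontier:
        candidates = set().union(*(neighbours.get(p, set()) for p in frontier))
        frontier = {c for c in candidates - included if relevant(c)}
        included |= frontier
    return included


def simplified_kappa(agreements: int, total: int, p_expected: float = 0.5) -> float:
    """Kappa with a fixed expected agreement, as assumed in step 4."""
    p_observed = agreements / total
    return (p_observed - p_expected) / (1 - p_expected)
```

In the study, the start set of 93 articles grew by 15, 15, and 3 papers over three snowballing rounds and a fourth round added nothing, which corresponds to the loop in `snowball` terminating with 126 included studies.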

4  Results

After data extraction, we moved on to the data analysis
phase. In this section, we summarise the results of our
mapping study, presenting the results for each RQ in turn,
starting with RQ1.

4.1  Bibliometrics (RQ1)

This RQ covers the publication trends over the years. Based 
on the earliest published primary study, we see this field 
emerged in 2017. Although we started our search from the 
year 2010, we found the first relevant paper in 2017. As 
expected, there is a growing trend of studies in the field of 
RE for AI after that, as shown in Fig. 3.

Initially, the exploration of this area was predominantly 
undertaken by the academic community. However, with
the recent advancements in AI, there has been a noticeable
increase in industrial engagement up to 2021. This trend is 
evidenced by the rise in the number of publications from 
the industry during that period. Although there was a slight 
decrease in industrial publications in 2022 and 2023, the 
overall percentage of industrial papers has remained rela-
tively steady since 2019. This indicates a sustained interest 
and collaboration between academia and industry in RE4AI, 
as depicted in Fig. 4. Notably, IBM USA has emerged as a 
significant contributor with the highest publication count, 
while Fujitsu Laboratories Ltd., located in Kawasaki, Japan, 
has also made notable contributions to this field.

The academic sector has consistently shown substantial
interest and research effort in this field, with a gradual
increase in publications over the years. In 2017 and 2018, 
the academic community published two and three papers, 
respectively. This momentum continued into 2019 with eight 
academic articles, three collaborative efforts, and two indus-
try contributions.

In 2020, the output increased to 12 academic publica-
tions, six collaborative articles, and two industry articles. 
This growth continued in 2021, with 17 academic articles, 
nine collaborative projects, and four industry publications. 
The upward trend continued in 2022, reaching 27 aca-
demic articles, six collaborative efforts, and two industry 
contributions.

By the end of July 2023, early data indicates 19 academic 
papers, three collaborations, and one industry publication. 
This sustained increase in academic involvement highlights 
the ongoing growth and interest in this research domain.

The increasing number of publications, especially from 
academia, shows growing interest and involvement in this 
research area. The steady yearly growth and numerous col-
laborations indicate an active and expanding research com-
munity. This ongoing momentum is evident even in the par-
tial data for 2023, highlighting the field’s importance and the 
key role of the academic community in driving innovations 
forward.
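
The yearly figures reported above can be tallied to check that they add up to the 126 primary studies. The following is a small sketch of that arithmetic; the zero collaborative and industry counts for 2017 and 2018 are an assumption, since the text only mentions academic papers for those years.

```python
# (academic, collaborative, industry) publications per year as reported above.
# The zeros for 2017 and 2018 are an assumption: the text only mentions
# academic papers for those years.
COUNTS = {
    2017: (2, 0, 0),
    2018: (3, 0, 0),
    2019: (8, 3, 2),
    2020: (12, 6, 2),
    2021: (17, 9, 4),
    2022: (27, 6, 2),
    2023: (19, 3, 1),  # partial year, up to and including July
}

# Total publications per year and per sector.
per_year = {year: sum(counts) for year, counts in COUNTS.items()}
academic, collaborative, industry = (sum(col) for col in zip(*COUNTS.values()))
total = sum(per_year.values())

for year, n in per_year.items():
    print(year, n)
print("academic", academic, "collaborative", collaborative, "industry", industry)
print("total", total)
```

Under this assumption the yearly totals sum to 126, which matches the size of the final set of primary studies.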

Fig. 3  Yearly Publications 
Distribution



Considering the publishing venue, 70 out of 126
(55.6%) papers were published at conferences,
whereas 28 (22.2%) appeared at workshops and 28 (22.2%)
in journals. Around 2021, workshops became more
popular as an RE4AI venue, but conferences still account 
for a higher proportion. We can also observe that 23 out 
of 28 papers in the journals were contributed by the aca-
demic community. In contrast, the industry’s preferred 
venues are conferences. The most recurring conferences 
in this domain are the International Requirements Engi-
neering Conference (RE) with 7 papers. It is followed by 
the International Working Conference on Requirements 
Engineering: Foundation for Software Quality (REFSQ), 
and the Conference on Human Factors in Computing Sys-
tems (CHI), with five papers each. Whereas in the work-
shop category, the most preferred venue is the Interna-
tional Requirements Engineering Conference Workshops 
(REW), followed by the Joint Proceedings of REFSQ-
Workshops with nine and four publications, respectively. 
Another main workshop is the Workshop on AI Engineer-
ing – Software Engineering for AI (WAIN), which has two 
publications. Moreover, only 28 journal publications were 
found, out of which three were published in Requirements 
Engineering Journal and three in IEEE Computer.
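
As a quick consistency check, venue shares of this kind can be recomputed from the reported counts. The following minimal Python sketch is illustrative only: the dictionary keys, the function name `venue_shares`, and the rounding to one decimal place are assumptions made here and are not part of the study's tooling.

```python
# Venue counts as reported in the text; together they cover the 126 primary studies.
venue_counts = {"conference": 70, "workshop": 28, "journal": 28}


def venue_shares(counts):
    """Return each venue's share of all papers in percent (one decimal place)."""
    total = sum(counts.values())
    return {venue: round(100 * n / total, 1) for venue, n in counts.items()}


shares = venue_shares(venue_counts)
for venue, pct in shares.items():
    print(f"{venue}: {venue_counts[venue]} of {sum(venue_counts.values())} papers ({pct}%)")
```

Dividing by the sum of the counts rather than a hard-coded total keeps the shares consistent if the set of primary studies is updated.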

In conclusion, we can see that the field of RE4AI has
grown over the years, with a majority of papers published
at conferences, while workshops and journals account for
equal numbers. Furthermore, the industry has shown
significant interest and involvement in research within this
field. Although less than half of the papers each year have
industrial involvement, we believe that industrial interest
in AI remains consistent, primarily due to the significant
adoption of Large Language Models (LLMs) within the
industry. This trend is not only catalyzing a shift from
conventional software paradigms towards AI-based
applications but is also likely to amplify the demand for
specialized RE approaches tailored to AI within the
industrial sector. Consequently, traditional RE
methodologies must be adapted to meet the unique demands
and complexities of AI-based systems.

4.2  Distribution of RE topics for AI‑based systems (RQ2)

Systematic Mapping Studies are typically employed to 
present a classification scheme for research topics within a 
specific field of interest. Analyzing the distribution of pub-
lications across these topics can provide insights into the 
breadth and depth of research, indicating the field’s scope 
and its level of maturity.

To answer this research question, we used the well-established
classification scheme of SWEBOK [11] for RE topics,
as shown in Fig. 5. It allows us to observe where RE4AI
research has been focused and which topics may still require
attention. A paper can be classified under more than one
topic, depending on which RE topics it addresses. In addition,
we analyzed the topics and their sub-categories to determine
which sub-topics have been addressed and which remain
unattended. This can help researchers identify further
gaps in the current research landscape. Figure 6 visualizes
our general findings for this RQ.
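
Because a paper may be assigned to several topics, the per-topic counts need not sum to the number of primary studies. A minimal sketch of such a multi-label tally is given below. The paper identifiers and topic assignments are hypothetical placeholders, not data extracted in this study, and the function name is likewise an assumption.

```python
from collections import Counter

# Hypothetical classification: paper IDs and topic assignments are invented.
paper_topics = {
    "P1": {"analysis", "elicitation"},
    "P2": {"analysis", "specification", "validation"},
    "P3": {"elicitation"},
    "P4": {"analysis"},
}


def papers_per_topic(classification):
    """Count, for every topic, how many papers were assigned to it."""
    counts = Counter()
    for topics in classification.values():
        counts.update(topics)  # each paper contributes once per assigned topic
    return counts


topic_counts = papers_per_topic(paper_topics)
total_assignments = sum(topic_counts.values())
print(topic_counts.most_common())
print(f"{total_assignments} assignments across {len(paper_topics)} papers")
```

In this toy data set, four papers yield seven topic assignments, which is why category totals in a mapping study can exceed the number of included papers.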

Within our analysis, we identified that 104 studies con-
centrated on requirements analysis, marking it as the pre-
dominant category in our classification. This category is 
notable for the introduction of 33 RE practices, detailed in 
Sect. 4.4.2. The research work within this realm has primarily
explored the integration of conceptual modeling [27], the
classification of novel requirement types [28], and the
assimilation of human-centric requirements into ML systems
[29, 30].

Fig. 4  Research Community



Moreover, our review reveals that 87 studies were dedi-
cated to requirements elicitation, representing the most 
substantial segment where established practices have been 
applied, as discussed in Sect. 4.4.1. This highlights the 
field’s ongoing efforts to refine and utilize traditional RE 
methodologies within the context of evolving technological 
frameworks. Furthermore, interviews [31–33], questionnaires
[34], and scenarios [35–37] have been used as practices in
this area.

Further, we found that 77 papers discussed topics related 
to requirements specification, with 30 of these studies 
introducing new practices for specifying requirements. It is 
observed that the recent literature leans towards proposing
practices specifically tailored to meet ML-related requirements,
emphasizing stakeholder needs [38, 39]. Notably,
only three studies adhered to existing practices for requirements
specification.

Shifting the focus to the Requirements Validation Knowledge
Area, 53 studies were identified that delve into requirements
validation, eight of which introduced novel practices
for conducting requirements validation, aiming
to enhance the validation processes in line with AI-based
systems. Figure 6 shows that 22 studies focused on practical
considerations, whereas 5 studies proposed new practices for
practical considerations. These studies highlighted requirements
attributes regarding explanatory capabilities, ethical

Fig. 5  Breakdown of Topics for the Software Requirements KA [11]

Fig. 6  Number of papers in each category



guidelines, and quality characteristics. Sixteen studies covered
RE processes and focused on tailored RE processes for ML-based
systems. These processes aim to incorporate ML-specific
needs and additional types of requirements. Few
studies highlighted the different perspectives in the business
context during the RE process. We can observe that 13 studies
proposed new practices for the RE process, whereas 9 applied
existing RE processes to AI-based systems, primarily focusing
on Goal-Oriented Requirements Engineering (GORE)
[40, 41]. The focus on requirements tools has been relatively
limited, with only 4 studies in this domain proposing
innovative tools to support the RE process for AI-based
systems (Sect. 4.4.2). These tools are particularly aimed at
streamlining tasks related to the elicitation and specification
of requirements. In the last category, software requirements
fundamentals, only 4 studies specifically addressed
the topic of requirement definitions.

The trends we observed from these statistics suggest a 
field in transition, grappling with the unique challenges 
posed by AI and machine learning systems. The use of 
existing practices in elicitation and modeling points to a 
reliance on traditional RE strengths. However, the need 
for new practices, especially in analysis, specification, and 
the requirements process, suggests that AI-based systems 
introduce complexities and challenges that transcend the 
capabilities of traditional RE methods. These new practices 
likely address AI-specific concerns such as ethical consid-
erations, data quality and sourcing, model transparency, and 
the dynamic nature of learning algorithms.

4.3  RE4AI research maturity (RQ3)

To address this question, we classified the papers according 
to the taxonomy by Wieringa et al. [12].

We aim to highlight the research methods that researchers
have used so far in RE4AI and how the proposed practices
have been evaluated.

Figure 7 shows that 74 studies fall into the proposal 
of solution category, i.e., papers proposing a solution and 
establishing its relevance. Either the proposed solutions 
should be novel, or an existing solution should be adapted 
and applied to a new domain. A new conceptual framework 
has been proposed in 21 studies, and we classify them as 
philosophical papers, as some of them do not provide a 
direct solution but all of them offer a new way of under-
standing and categorizing requirements.

Papers that investigate the properties of a proposed solution
that has not yet been implemented in RE practice are
classified as validation research. This category contains 25
papers, which validate a solution proposed in the same paper
or elsewhere. Papers that apply RE techniques in practice
or investigate the usage of RE practices are classified as
evaluation research. The novelty of the practice is not essential
in this case. Instead, the knowledge claim of the paper
should be novel. Eighteen of our primary studies describe the
authors’ position, primarily to provoke discussions about
RE4AI topics. These papers are categorized as opinion
papers. Lastly, 11 studies reported personal experiences
and were labeled as personal experience papers. A paper could be
classified under more than one of these categories.

Our analysis (see Fig.  8) reveals a notable trend in 
research practices. Specifically, we observe that proposal 
of solution papers predominantly incorporate validation 
research, with 20 studies validating solutions and 22 engag-
ing in evaluation research. Additionally, 5 papers combine 
proposals of solutions with opinion and philosophical dis-
course, while 3 include personal experiences. In validation 
research, a common pattern emerges where 20 studies both 

Fig. 7  Distribution of papers according to Wieringa’s classification



propose and validate solutions, highlighting a preference for
self-validation. Philosophical and opinion papers often
intertwine, sharing a focus on conceptual frameworks. Evaluation
methods vary, with case studies (43) being predominant,
followed by surveys (15) and minimal use of mixed methods
(1). This reflects a broader inclination towards practical
validation in proposal of solution papers, while opinion and
experience papers typically lack such research.
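
Combinations of research types such as those summarized in Fig. 8 can be tallied by counting, for each paper, every pair of categories assigned to it. The sketch below illustrates this under stated assumptions: the paper identifiers and their labels are invented for illustration, and the function name `type_cooccurrence` is hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical labels per paper; these are not the classifications from this study.
paper_types = {
    "P1": {"proposal of solution", "validation research"},
    "P2": {"proposal of solution", "evaluation research"},
    "P3": {"opinion", "philosophical"},
    "P4": {"proposal of solution", "validation research"},
    "P5": {"personal experience"},
}


def type_cooccurrence(classification):
    """Count how often each unordered pair of research types labels the same paper."""
    pairs = Counter()
    for types in classification.values():
        # Sorting makes (a, b) and (b, a) map to the same dictionary key.
        for pair in combinations(sorted(types), 2):
            pairs[pair] += 1
    return pairs


pair_counts = type_cooccurrence(paper_types)
for (first, second), n in pair_counts.most_common():
    print(f"{first} + {second}: {n}")
```

Papers with a single label, such as the hypothetical P5, contribute no pairs and therefore do not appear in the tally.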

Further, to assess the maturity of the research, we investigated
which type of research is conducted in each SWEBOK
KA. Figure 9 shows how the SWEBOK [11] RE topics are
addressed using Wieringa’s classification [12]. It should be
noted that one paper can be in more than one publication
type, so the total adds up to more than 126. This analysis
highlights that significant focus has been placed on
requirements analysis, elicitation, specification, and validation.
However, foundational aspects of requirements, such
as their fundamental principles, processes, and practical
considerations, have been overlooked. Additionally, there
is a notable shortage of tools to support the development
of AI-based systems. The data reveals that while the initial
activities of RE receive considerable attention, there is still a
deficiency in managing the overall RE process for AI-based
systems effectively.

4.4  RE practices for AI‑based systems (RQ4)

This question aims to highlight the use of existing RE prac-
tices and the direction in which new RE practices specific 

Fig. 8  Frequently Combined Research Methods

Fig. 9  Maturity of the research area by analyzing the prevalent research methods utilized in each KA



to AI have emerged. We extracted the practices according to 
the classification scheme provided in Sect. 3.1.

The bar chart in Fig. 10 indicates literature proposing
new processes, techniques, models, and tools. Though more
research is focused on techniques, the novelty can be seen
more in model, process, and tool-related research. It is
evident that existing techniques are more frequently used.
One major takeaway is that existing RE techniques could
be adapted for AI-based systems; however, standard RE
processes and tools are not adapted for the development of
AI-based systems. In the subsequent sections, we will elaborate
on how existing practices have been used and what new
practices have been proposed.

4.4.1  Usage of conventional RE practices for AI

We analyzed the suitability of current RE practices for AI 
by determining what RE practices have been used for such 
systems. The resulting model is shown in Table 2. We group 
the practices according to the RE topics in SWEBOK [11]. 
Since we found requirements modeling, which is not part of 
SWEBOK, to be an important topic among our papers, we 
included it as a distinct group. Further, we found existing RE
practices being used in the requirements elicitation, requirements
process, requirements validation, requirements analysis, and
requirements specification KAs. Each paper may
use multiple RE practices and could fall into multiple software
requirements KAs.

Requirements elicitation Out of 126 papers examined, 
40 employed various practices for eliciting requirements. 
Among these practices, interviews emerged as the pre-
dominant method, with 16 of the 40 papers utilizing them 
for gathering requirements. Notably, a significant number 
of researchers favored semi-structured interviews as a tool 
to initiate conversations around generative AI [33] and to 
foster a collaborative design process [29, 42]. Similarly,
researchers [31, 32, 36, 42] also conducted semi-structured
interviews for requirements elicitation. Additionally, some
studies [47–49, 52] on requirements elicitation
prioritize stakeholders’ perspectives and needs.

Other methods identified for requirements elicitation
across the reviewed literature encompass surveys, scenario-based
elicitation techniques [36, 37, 53–56], questionnaires
[34, 45], think-aloud protocols [57], focus groups [32, 49,
59, 60], and controlled experiments, showcasing a diverse
range of strategies for gathering and understanding project
requirements.

Requirements modelling Twenty-six among 126 papers 
mentioned an existing practice for requirements modeling. 
The most frequently used practice for this topic was concep-
tual modeling, which 8 different articles have addressed. In 
[66], conceptual modeling is used for requirements elicita-
tion, design, and development of ML solutions. Similarly, 
[27] incorporated conceptual modeling into a data sci-
ence project and applied it to a healthcare application. The 
authors of [65] use conceptual modeling to illustrate the 
business view, analytics design view, and data preparation 
view. These perspectives are used to relate the corporate 
strategy to analytics algorithms and data preparation opera-
tions. Authors in [64] argued conceptual modeling could 
support the application of ML within an organization while 
improving usability and optimizing the performance of ML 
algorithms. Additionally, in [30] the authors demonstrated 
that conceptual modeling can be used to map human mental 
models to model AI-based systems. Other frequently used 
modeling techniques are scenario-based design and the 
Unified Modeling Language (UML). Husen et al. [77] use
UML for analyzing ML safety requirements top-down from
higher-level business requirements, whereas [40] provides
a comparison using UML diagrams and aims to propose effective
design practices for planning problems, with a focus on
the early requirements phase.

Fig. 10  Existing vs new practices



Furthermore, within the scope of the requirements pro-
cess, Goal-Oriented Requirements Engineering (GORE) 
and Softgoal Interdependency Graphs (SIGs) have been 
employed in 8 and 1 studies, respectively, as detailed in 
Table 2. It is also worth noting that there has been less
focus on utilizing existing practices for requirements validation,
specification, and analysis, with only 3 instances
identified in each of these areas. It can also be observed that
the researchers paid attention to using existing ISO stand-
ards, including ISO 26262 [92], ISO 25012 [90], and ISO 
25000 [91].

4.4.2  New practices employed in RE4AI

For new practices, Table 3 provides a brief description of 
each practice and the type of practice, i.e., model, process, 
technique, or tool. We can observe that numerous studies
have proposed new models, while a significant number also
introduced new processes. Further, we will elaborate on how
each KA has been addressed by researchers.

Requirements analysis This category deals with the 
comprehensive process of validating and managing stake-
holders’ needs and constraints to ensure a clear understand-
ing and agreement on what the system or project must 
achieve. It includes activities such as conflict detection, pri-
oritization, and scope definition. The main themes found in 
requirements analysis were explainability and human-centric 
requirements analysis. In total, 33 papers proposed different practices
in this area, with 5 focused on requirements analysis for
explainability needs. Sheh and Monteath [28] categorized
explainability requirements while considering the source,
depth, and scope of the explanation. In addition, [76] introduces
an explainability framework that automatically recommends
methods to improve system design’s explainability

Table 2  Existing Practices Used

| RE topic | Practices used | Type | Paper ID |
| Requirements elicitation (40) | Interviews | Technique | [29, 31–33, 36, 42–52] |
| | Scenario-based requirements elicitation | Technique | [35–37, 53–56] |
| | Card sorting | Technique | [57, 58] |
| | Controlled experiment | Technique | [57] |
| | Focused groups | Technique | [32, 49, 59, 60] |
| | SigniFYIng message | Model | [53] |
| | Survey | Technique | [46, 48, 59, 61–63] |
| | Think aloud | Technique | [57] |
| | User Stories | Technique | [49, 58] |
| | Questionnaire | Technique | [34, 45] |
| Requirements modelling (26) | Conceptual modelling | Technique | [27, 30, 54, 64–68] |
| | AMDiRE model (Artefact Model for Domain-independent RE) | Model | [69] |
| | Data flow diagrams | Technique | [70] |
| | ISO 26262 | Model | [71] |
| | Persona | Technique | [36, 72] |
| | Actor-based requirements modelling using istar framework | Technique | [73–75] |
| | Goal Modeling | Technique | [76, 77] |
| | STRIDE | Technique | [70] |
| | Model Driven Engineering | Model | [78, 79] |
| | UML | Technique | [40, 77, 80, 81] |
| Requirements Process (9) | GORE | Technique | [40, 41, 82–87] |
| | Softgoal Interdependency Graphs (SIGs) | Technique | [88] |
| Requirements validation (3) | Document analysis | Technique | [36] |
| | Fuzzy Kano | Technique | [89] |
| | ISO 25012 model | Model | [90] |
| Requirements Analysis (3) | ISO 25000 series, known as SQuaRE | Model | [91] |
| | ISO 26262 and ISO/FDIS 21448 | Process | [92] |
| | Use case boundary condition | Technique | [49] |
| Requirements Specification (3) | Alloy formal specification | Technique | [80] |
| | EARS | Technique | [54] |
| | Operational Design Domain | Technique | [93] |



Table 3  New practices proposed

| Contribution in SWEBOK KA | Paper ID | Type | Short description |
| Requirements analysis (33) | [82] | Model | RE4AI taxonomy for building AI-based complex systems |
| | [94] | Technique | Multi-aspectual analysis of AI Ethics frameworks |
| | [28] | Model | Novel categorisation for explanations |
| | [29] | Model | Investigating human-centric design requirements in engineering design |
| | [27] | Process | A framework for incorporating conceptual Modeling into data science projects |
| | [84] | Process | Propose an RE framework needs to address both sides of the cognitive cycle for all relevant actors |
| | [95] | Process | Presents a five-step systematic method in the development of an explainable AI (XAI) system |
| | [88] | Model | Conceptual analysis which unifies the different notions of explainability and the corresponding explainability demands |
| | [38] | Model | Framework with a more granular and composable vocabulary to characterize the stakeholders of interpretable ML |
| | [96] | Model | Framework for XAI researchers and designers to identify pathways along which human cognitive patterns drive needs for building XAI and how XAI can mitigate common cognitive biases |
| | [97] | Process | Risk Modeling approach tailored to Collaborative AI systems |
| | [30] | Process | Propose a framework for progressing from human mental Models to machine learning Models and implementation via the use of conceptual Models |
| | [39] | Model | Task analysis of human-guided machine learning |
| | [73] | Model | A framework and an actor-based requirements model to analyze the roles, requirements, and responsibilities in AI ecosystems, enhancing stakeholder identification and revealing potential goal tensions |
| | [98] | Model | Presents a list of AI requirements for effective human-AI interaction |
| | [45] | Model | Development of a requirements model that captures diverse and sometimes conflicting needs of patients and AI researchers/developers for the pAItient project, facilitating communication of stakeholder requirements within the consortium |
| | [99] | Model | Introduces a dynamic framework that integrates project context and law modeling to identify protected attributes for AI model fairness, maps legal requirements to dataset attributes, aids in selecting suitable fairness definitions, and visually represents AI model outputs for fair and accurate decision interpretation |
| | [100] | Model | Identify challenges related to NFRs and develop solutions to manage NFRs for ML systems |
| | [101] | Technique | Method for evaluating risks in AI/ML-based software systems design, focusing on dependency-driven assessments |
| | [63] | Model | Summarizes the refined Lifelong Learning System (LLS) requirements based on feedback from students and instructors |
| | [85] | Model | Introduces a goal-centralized meta-model for integrating FRs and NFRs through goal-oriented analysis of ML systems |
| | [102] | Process | Identifies AI’s ethical problems by linking responsible AI requirements from ethics guidelines with AI system interactions, including building an AI ethics model to embody guidelines as requirements |
| | [103] | Technique | Presents a two-tiered approach for data requirements modeling in ML systems within the scope of requirements analysis. The foundational tier maps out the learning context, employing a feature-oriented domain analysis to detail system components, their environment, and interrelations. The subsequent tier advances to define property-based specifications |
| | [104] | Process | Introduces a provenance-based, trust-aware RE framework for self-adaptive systems, enabling engineers to derive trust-aware goal models from user requirements |
| | [105] | Model | Outlines challenges in specifying training data and runtime monitoring for ML models faced by practitioners, identifies interconnections between these challenges, and offers recommendations to address the root causes |
| | [89] | Process | Proposes a systematic approach for evaluating stakeholder requirements |
| | [77] | Process | Outlines a top-down approach for analyzing machine learning safety requirements |



| | [106] | Technique | Leveraging established RE techniques for software systems to elicit and analyze ethical requirements |
| | [87] | Model | This paper proposes a new theoretical framework for an extended requirements problem, incorporating both stakeholder goals and objectives specific to the AI-based system |
| | [59] | Model | Expands current explanation categories to suit the financial domain by pinpointing specific explainability needs of users |
| | [68] | Model | Introduces a framework for transitioning from human mental models to machine learning models through conceptual models |
| | [75] | Model | Introduce an extension of i*, addressing the disconnect between machine learning and conceptual modeling to establish a foundational methodology for machine learning requirements engineering |
| | [76] | Model | Presents an explainability framework that automates the recommendation of explainability methods for system design, enhancing system explainability and efficiency for developers, and mitigating the tension between explainability and usability |
| Requirements process (13) | [83] | Process | Evidence-driven RE to deal with the additional type of uncertainty of the implementation |
| | [107] | Process | Tailored RE methodology for ML systems incorporating New types of requirements as well |
| | [80] | Process | Model-driven engineering method based on traditional RE |
| | [108] | Process | RE to build effective ML systems with provable compliance assurances |
| | [65] | Process | Modelling framework for requirements analysis and design from a business view, analytics design view, and data preparation view |
| | [69] | Model | An artefact-based RE approach for the development of datacentric systems |
| | [109] | Process | Developed an improved Agile data mining framework to fulfill the government business objectives and needs and a systematic way for identifying business problems |
| | [110] | Process | Methodology for developing and assessing legal, privacy, social, and ethical requirements |
| | [111] | Process | Requirements Engineering for AI |
| | [112] | Process | Outlines challenges in XAI and proposes a framework with research directions for using RE practices to address these challenges |
| | [113] | Process | Introduces the Data-Driven Engineering process, a systematic approach for ML application in industry, featuring hierarchical RE and semi-automated data set generation, integrating with other processes in a V-Model, aiming at harmonizing development levels and automating dataset compilation |
| | [56] | Process | Safety requirements can be systematically and traceably generated and refined across the different life-cycle phases of the MLM |
| | [41] | Process | Introduces a GORE-based methodology for autonomy requirements engineering in Unmanned Aircraft Systems |
| Requirements specification (30) | [91] | Model | Specification of safety requirements based on uncertainty |
| | [114] | Process | Approach for specifying and testing requirements |
| | [115] | Technique | Efficiently tackle the issue and derive component-level requirements |
| | [38] | Model | Framework with a more granular and composable vocabulary to formulate the stakeholders’ needs |
| | [71] | Process | Requirements for specification languages and identify types of specification that are well-suited to ML-based components |
| | [116] | Process | Proposed an approach to improve the process of requirements specification in which an MLC is developed and operates by explicitly specifying domain-related concepts |
| | [39] | Model | Human-guided machine learning as a hybrid approach where a user interacts with an AutoML system and tasks it to explore different problem settings that reflect the user’s knowledge about the data available |
| | [117] | Model | Synthesizes 75 unique requirements from a case study |
| | [43] | Model | Identified five requirements for AI documentation that emphasize the need for engineers to combine technical details with understandable integration into business processes |



| | [98] | Technique | Introduces Intelligence-Centered Design (ICD), a modification of the Human-Centered Design approach, to integrate AI considerations from the start, providing guidance for novice designers in creating AI-based interactive systems with a focus on AI-human interaction and user experience |
| | [44] | Technique | Specifying NFR for the whole system |
| | [118] | Model | Emphasizes that requirements specification for AI systems must include not only evaluation measures (M1,..., Mn) but also criteria for acceptable values, their relative importance for trade-offs, vetting processes, and essential data like training data for proper AI functionality |
| | [119] | Model | Introduced a comprehensive framework for AI trustworthiness, emphasizing a human-centric approach with criteria like human agency, security, privacy, and fairness |
| | [47] | Model | Holistic view of the specific requirements for AI-enabled medical devices |
| | [120] | Technique | Define sensor accuracy requirements for medical devices using ML algorithms for stability scoring |
| | [121] | Model | Paper proposes a specification framework for ML requirements |
| | [49] | Technique | Proposed Quality levels for requirements specifications |
| | [50] | Model | Propose a set of requirements for an AI-based team member |
| | [122] | Model | List of generic audit requirements, which are technically relevant to assure the trustworthiness, security, safety, robustness, and explainability |
| | [123] | Model | List of shared Requirements |
| | [62] | Model | 67 usage view activities/scenarios, 141 top-level requirements, and 179 detailed sub-requirements |
| | [72] | Model | Constructs user personas for AI medical interviews, from which it derives specific usability, reliability, and acceptability requirements, employing ISO/IEC 25010:2011 standards to address previously overlooked aspects |
| | [124] | Technique | Specifying and formulating user requirements for explainable AI |
| | [125] | Process | Proposes a perspective-based method for specifying ML-enabled systems |
| | [78] | Model | Introduces SEMKIS-DSL, a textual domain-specific language designed to assist software engineers in specifying the requirements |
| | [126] | Technique | Provides a list of promising techniques for requirement specification, validation, and verification |
| | [51] | Model | Introduces a model and template for defining explainability requirements in AI systems |
| | [79] | Model | Definition of 78 high-level requirements refined into 30 generic ones by AIDOaRt partners for advancing cyber-physical systems development using AI, DevOps, and Model-driven engineering |
| | [127] | Technique | Introduces Ethical User Stories (EUS) as a method to integrate AI ethics into standard SE practices, enabling the formulation of both FRs and NFRs |
| | [128] | Model | Proposes a transparency playbook for technologists, aimed at developing AI systems that adhere to legal and regulatory standards and satisfy user requirements |
| Requirements elicitation (16) | [129] | Model | A method to identify requirements |
| | [37] | Model | Scenario-based design for XAI: "Explainability scenarios focus on what people might need to understand about AI systems" |
| | [34] | Model | Present an extended XAI question bank by combining algorithm-informed questions and user questions |
| | [53] | Model | Scenario based explainability |
| | [130] | Model | Ontology to support user requirements for explanations in the domain of healthcare |
| | [131] | Process | Methodology for addressing explainability requirements in ML services for IoT Cloud systems through stakeholder involvement, end-to-end engineering processes, and multiple aspects of explainability |
| | [132] | Model | Provide a catalog to elicit requirements and a conceptual Model to present them visually |
| | [58] | Process | Guide for Artificial Intelligence Ethical Requirements Elicitation |
| | [48] | Technique | Technique for a socio-technical requirements elicitation in the design of AI-based systems by adapting the HTO-analysis |



and efficiency for developers, thereby reducing the conflict
between explainability and usability. Köhl et al. [88] provided
a conceptual analysis that unifies the different concepts
of explainability and the corresponding explainability
demands. Suresh et al. [38] provided a framework for identifying
stakeholders for interpretability and using the human
cognitive process to derive requirements for explainability
[96]. Hall et al. [95] outlined a systematic method to build
an explainable artificial intelligence (XAI) system, which
focuses on understanding specific explanation requirements
and assessing existing explanation capabilities. Four of the 33
papers focused on human-centric requirements analysis,
including human-centric design requirements [29], or
proposing RE frameworks that map the human mental model

Table 3  (continued)

| Contribution in SWEBOK KA | Paper ID | Type | Short description |
| Requirements elicitation (16, continued) | [133] | Process | Introduces a multi-layered framework with a verifiable template for eliciting data requirements and uses Dempster-Shafer theory to assess training data quality through expert judgments |
| | [134] | Process | A requirements process that amalgamates insights from domain experts, forums, and formal documentation to identify and articulate requirements and design definitions as time-series attributes, enhancing the development of deep learning-based anomaly detectors |
| | [135] | Model | Explore the requirements elicitation and documentation techniques used in the industry and identify challenges |
| | [60] | Process | This work is to offer a guide for eliciting ethical requirements in artificial intelligence projects (RE4AI Ethical Guide) |
| | [136] | Technique | Developed and evaluated a Playful Probe protocol through a participatory design workshop, demonstrating how to elicit ideas for integrating maintenance planning practices with machine learning |
| | [40] | Model | Present a new RE Model that allows software engineers and data scientists to discover these values hand in hand as part of the software requirement process |
| | [52] | Model | Reveals key collaboration challenges and patterns in developing and deploying production ML systems, focusing on requirements, data, and integration |
| Requirements validation (8) | [90] | Technique | Metamorphic testing for data quality requirements validation of DL systems |
| | [115] | Technique | Approaches to efficiently address the problem and derive component-level requirements and tests |
| | [137] | Model | Develop nine specific areas where confidence is required in training data |
| | [114] | Process | Approach for specifying and testing requirements |
| | [138] | Model | Comprehensive approach to enhancing MVC safety, including safety-related image transformations, reliability requirement classes, methods for creating machine-verifiable requirements |
| | [139] | Technique | A technique that evaluates datasets by bridging the gap between the specification of hard-to-specify domain concepts |
| | [140] | Process | Presented includes a Data Quality Workflow, Lists of Data Quality Challenges and Attributes, and Solution Candidates, serving as tools for assessing and maintaining data quality, validated through a focus group |
| | [141] | Model | Examining the system’s output in scenarios that both align with and deviate from user expectations |
| Practical consideration (5) | [142] | Model | Categorisation of explanatory capabilities and requirements |
| | [129] | Model | Extending the quality characteristics of ISO 25010 |
| | [143] | Model | General methodological approach for quality Modelling of ML systems |
| | [144] | Technique | Method for applying ethical requirements in Agile portfolio management |
| | [86] | Model | Interaction between RE and Software Architecture in the context of ML |
| Software requirements tool (4) | [145] | Tool | Multi-layered approach allowing users to formulate their requests for explanations |
| | [122] | Tool | Implement Tool for exemplary audit requirements to demonstrate the applicability of a selected mobility application |
| | [144] | Tool | Tutorial aims to teach SE stakeholders how to apply the Ethical Requirements Stack for implementing AI’s ethical requirements across business levels, enhancing AI ethics research |
| | [67] | Tool | Tool support for Modelling Requirements |



to ML models [97, 100, 105] focused on ethical and legal 
requirement analysis. Other papers proposed requirement 
analysis for risk modeling and frameworks for requirements 
analysis.

Requirements process This section of RE KA illustrates 
how the requirements process aligns with the overall SE 
process. In the requirements process category, we find a 
paper proposing the process model of evidence-driven RE 
to capture the requirements specific to ML-based systems, 
i.e., uncertainty [83]. Ries et al. [80] tailored traditional RE 
to improve dataset requirements engineering. At the same 
time, Vogelsang and Borg [107] highlighted the need to 
integrate ML specifics in the RE process and new types of 
quality requirements such as explainability, freedom from 
discrimination, or specific legal requirements. Further, [112] 
outlines challenges in Explainable AI (XAI) and proposes 
a framework for using RE practices to address these chal-
lenges. Similarly, [69] proposed an artifact-based approach 
for the development of data-centric systems while [113] 
introduced a data-driven engineering process featuring hier-
archical RE. [108] provided an overview of how research in 
the RE discipline can support building effective ML sys-
tems. Other authors proposed a modeling framework for 
analytics algorithms and data preparation activities [56, 65] 
or an agile data mining framework [109] in the context of 
business objectives. Another notable work [110] 
proposed a methodology for developing and assessing legal, 
privacy, social, and ethical requirements.

Requirements specification New specification practices 
have been proposed for AI-based systems to capture domain-
specific and component-level requirements. Czarnecki and 
Salay [146] proposed an approach to specify safety require-
ments based on uncertainty, whereas Rahimi [116] proposed 
an approach to specify requirements for ML components 
explicitly specifying domain-related concepts. Furthermore, 
[114] and [115] focused on specifying requirements well 
suited to ML components and testing these requirements 
[71]. Others focused on specifying requirements using user 
knowledge [39] and providing the framework with a more 
granular and composable vocabulary to formulate user needs 
[38]. Requirements documentation and evaluation have been 
highlighted by [123], which provides a list of shared 
requirements. In addition, [62] details usage view activities/scenarios, 
top-level requirements, and detailed sub-requirements. In 
this context, [128] proposes a transparency playbook for 
developing AI systems that meet legal, regulatory, and user 
requirements. Furthermore, several studies have made con-
tributions to defining specific requirements. For instance, 
[117] extracts unique requirements from a case study, while 
[43] focuses on the necessities for AI documentation that 
bridge technical aspects with business processes. Berry and 
Daniel [118] highlight the importance of detailed evaluation 
measures and criteria in the specification process. Grüning 
[47] and Bartlett [120] offer insights into the requirements 
for AI-enabled medical devices, with the former providing 
a comprehensive overview and the latter specifying sensor 
accuracy for stability scoring. Elshan et al. [50] delve into 
what is needed for an AI-based team member, and Noda [72] 
uses user personas from AI medical interviews to specify 
usability and reliability needs.

Requirements elicitation Elicitation considers the origin 
of requirements and how they can be gathered. Among the 16 
papers in this category, we identified six proposing different 
models for requirements elicitation, of which four focused on the 
elicitation of explainability requirements [34, 37, 53, 130, 132]. 
The authors of [129] proposed a method to identify require-
ments to ensure quality characteristics. Moreover, [58] and 
[60] provided a guide on how to elicit ethical requirements 
for AI-based systems. Further important requirement elicita-
tion challenges related to data requirements are highlighted 
by [133] and [52].

Requirements validation While validating requirements 
is considered a crucial part of RE, we identified 8 studies 
proposing new practices in this area. Challa et al. [90] 
used a metamorphic testing approach to validate data quality 
requirements. Similarly, Banks and Ashmore [137] estab-
lished that training data provides the functional requirements 
for AI-based systems. Using traditional assurance concepts, 
they developed nine areas where confidence is required in 
training data. Barzamini et al. [139] and Pradhan et al. [140] 
presented frameworks to evaluate data quality. In addition, 
[61] suggested a model for examining the system’s out-
put in scenarios that both align with and deviate from user 
expectations.
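
To make the idea of metamorphic testing for data quality requirements more concrete, the following is a minimal Python sketch. It is not the approach of [90]; the quality metric (`missing_ratio`), the 25% threshold, and the three metamorphic relations are hypothetical choices made purely for illustration. The sketch checks that the verdict of a data quality requirement behaves consistently under transformations of the dataset for which the expected effect is known in advance.

```python
import random


def missing_ratio(rows):
    """Fraction of cells that are None (hypothetical data quality metric)."""
    cells = [cell for row in rows for cell in row]
    return sum(cell is None for cell in cells) / len(cells)


def meets_requirement(rows, max_missing=0.25):
    """Hypothetical requirement: at most 25% of all cells may be missing."""
    return missing_ratio(rows) <= max_missing


def check_metamorphic_relations(rows, seed=0):
    """Return the names of the metamorphic relations that are violated."""
    rng = random.Random(seed)
    baseline = meets_requirement(rows)
    violations = []

    # MR1: reordering the records must not change the verdict.
    shuffled = rows[:]
    rng.shuffle(shuffled)
    if meets_requirement(shuffled) != baseline:
        violations.append("MR1-permutation")

    # MR2: duplicating every record must not change the verdict.
    if meets_requirement(rows + rows) != baseline:
        violations.append("MR2-duplication")

    # MR3: blanking one more cell must never turn a failing
    # dataset into a passing one.
    degraded = [list(row) for row in rows]
    degraded[0][0] = None
    if meets_requirement(degraded) and not baseline:
        violations.append("MR3-degradation")

    return violations


# Illustrative dataset: 2 of 10 cells are missing.
dataset = [[1.0, 2.0], [None, 3.0], [4.0, 5.0], [6.0, None], [7.0, 8.0]]
print(check_metamorphic_relations(dataset))
```

An empty list means that no relation was violated. The appeal of this style of validation is that it needs no ground-truth label for the dataset itself, only relations between the outcomes of related inputs.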

Practical considerations The requirements process spans 
the whole software life cycle. This KA aims to maintain 
stability in requirements to ensure they accurately reflect 
the software to be built or that has been built. To support 
that, Sheh [142] presents traceability between the explana-
tions and the capabilities of underlying AI techniques to help 
users and developers. Authors in [129] proposed a meth-
odology to derive quality characteristics and measurement 
methods for MLS. Furthermore, a general methodological 
approach for quality modeling of ML has been proposed by 
[143]. Further, [144] proposed a method to deal with ethical 
requirements, and [86] addressed the interaction between 
RE and Software Architecture in the context of machine 
learning.

Software requirements tool In total, four tools have been 
developed to address distinct needs. One such tool [145] 
offers a multi-layered approach that enables users to articulate 
their requests for explanations, facilitating a deeper 
understanding of AI systems. Another tool [122] implements 
exemplary audit requirements, showcasing 
its utility with a mobility application to ensure compliance 
and functionality. Additionally, a tool [144] has been created 



with the objective of educating SE stakeholders on utiliz-
ing the Ethical Requirements Stack, aiming to integrate 
AI’s ethical requirements comprehensively and contribute 
to the advancement of AI ethics research. Lastly, there is 
tool support [67] dedicated to modeling requirements, which 
assists in the precise definition and management of system 
requirements, underscoring the importance of clear and 
structured requirement specifications in successful system 
development.

5  Open challenges and future research 
directions (RQ5)

This section highlights the prevailing challenges in the 
RE4AI literature and presents future research directions 
outlined in the 126 primary studies. We used the thematic 
synthesis approach recommended by Cruzes et al. [26] to 
answer RQ5 on challenges and future directions.

5.1  RE challenges for AI‑based systems

In this section, we underline the challenges in RE4AI. We 
identified 27 challenges classified into 9 categories as seen 
in Fig. 11. In the following subsections, we discuss them 
one by one.

5.1.1  Requirements specification

Requirements specification accounts for the most challenges, 
which we categorized into five types:

Hard to specify requirements concretely The necessity 
for requirements engineers to adopt new methods to deal 
with data biases and the challenge of developing require-
ments when the data is not yet available highlight the dif-
ficulty in specifying requirements concretely for AI systems 
[90, 147]. The challenge of ensuring that legal regulations 
and ethical considerations are adequately considered requires 
a shift in perspective towards a data and analytics viewpoint 
[107]. Further obstacles are the complexity of specifying 
non-functional requirements (NFRs) on overall ML system 
performance and the difficulty of rigorously specifying requirements due to a lack of 
domain knowledge [107, 114]. The challenge of specifying 
explainability requirements and functionality that depends 
on input data underscores the difficulty of concretely speci-
fying requirements in AI-based systems [71, 88]. The dif-
ficulty of specifying unambiguous requirements, such as 
for a pedestrian detector component, further illustrates this 
challenge [116].

Incomplete and incorrect knowledge Less tangible 
characteristics are hard to express meaningfully, 
leading to overlooked and misconstrued requirements 
[142]. Incomplete, incorrect, and inconsistent knowledge, 
encompassing missing or insufficient entities, mislabeled 
entities, and differing labels for the same entity or merged 
entities, raises issues of knowledge integrity [27].

Emergent functionality hard to specify in advance The 
entanglement of requirements where even minor changes 
can dramatically alter other requirements illustrates the chal-
lenge of specifying emergent functionality in advance [82].

New type of quality requirements The explicit specifi-
cation of explainability as a quality requirement presents a 
new challenge due to the lack of a systematic and overarch-
ing approach [88]. Standard requirements specification tech-
niques become less applicable in AI-based systems where 
requirements are informed through training data, indicating 
a shift towards new types of quality requirements [137, 148].

Lack of suitable guidelines for AI documentation 
Königstorfer [43] and Treacy [110] underscore the issue of 
insufficient guidance on documenting AI, noting that many 
guidelines do not effectively connect principles with action-
able requirements.

5.1.2  Explainability challenges

Many studies have identified explainability as a noteworthy 
challenge. We arranged these challenges into three major 
categories:

Explainability as a new requirement Ishikawa et al. 
[83] and Kuwajima et al. [91] highlighted explainability as 
an emerging requirement, aligning with the European Com-
mission’s ethical guidelines for trustworthy AI, which advo-
cate for fairness and explainability. This category under-
scores the recognition of explainability as a crucial aspect 
of ethical AI development.

No consistent definition for explainability A signifi-
cant challenge in the domain of explainability is the absence 
of a unified definition, making it difficult to pinpoint what 
’explainability’ precisely entails [124]. This ambiguity is 
emphasized by studies like those of Köhl et al. [88] and 
Suresh et al. [38], who note that different stakeholders 
have varying interpretations of explainability. Further-
more, Jansen et al. [53] and Kim et al. [149] discuss the 
gap between stakeholders’ expectations of AI explanations 
and their understanding of AI system actions, illustrating 
the complexity of achieving a common understanding of 
explainability across diverse groups.

Lack of stakeholder-centric approaches for explain-
ability The necessity for stakeholder-centric approaches in 
explainability is underscored by the challenges in ensuring 
AI-based systems are transparent enough to foster trust and 
accountability [141]. Suresh et al. [38] and Wang et al. [96] 
address the difficulties in creating AI-based systems that 
can effectively communicate their reasoning to users, par-
ticularly in critical situations. The literature suggests that 
existing model interpretability methods often fail to consider 



the end-user, typically being most comprehensible to those 
who develop them, such as ML researchers or developers 
[76]. This point is further elaborated by Dhanorkar [31], 
who argues for the need to extend beyond current explain-
ability techniques to accommodate the diverse explanations 
required by different stakeholders in an AI system. Henin 
and Metayer [145] highlight the challenge of developing 
explanation methods that cater to various explainees with 
distinct interests, advocating for personalized approaches to 
explainability.

Collectively, these challenges indicate a growing aware-
ness of the importance of explainability in AI, the need for 
a clearer definition and understanding of what explainabil-
ity means to different stakeholders, and the importance of 
developing approaches that prioritize the perspectives and 
needs of those stakeholders.

5.1.3  New requirements engineering practices

The literature identifies critical areas where new Require-
ments Engineering (RE) practices are essential to address 
the unique challenges posed by AI-based systems. These 
areas are categorized into four key segments.

Integrating AI components in system The integration 
of AI components into systems presents novel challenges 
for RE, necessitating new validation techniques beyond tra-
ditional inspection and static reading, especially where data 
quality is paramount [90]. [150] highlights the need for a 
revised RE process pipeline to effectively address and 
evaluate the requirements for these AI components, underscoring 
the importance of safety, reliability, and effectiveness in AI 
systems [108].

Data as a new source of requirements Data quality and 
its role as a source of requirements for AI-based systems 
emerge as significant concerns. The traditional principles 
and techniques of RE are found inadequate in addressing 
the unique requirements of ML-based systems, prompting a 
reevaluation of existing RE practices [83].

Integrating new concepts into existing practices The 
challenge extends to integrating new concepts into estab-
lished RE practices. Existing RE frameworks must evolve 
to accommodate the distinct needs of AI-based systems, 
requiring a comprehensive approach that includes strategic 
planning, technology selection, system validation, and main-
tenance processes [69].

Lack of suitable RE concepts and methodologies for 
ML-based systems There is a conspicuous gap in RE con-
cepts and methodologies tailored to ML-based systems. 
This deficiency points to a broader issue within the field, 
where RE practices fail to align with the legal and regulatory 
demands specific to ML systems. Ensuring compliance with 
relevant laws and regulations remains a primary concern for 
requirements engineers in this domain [107].

Fig. 11  Challenges Identified in Literature



5.1.4  Human‑centric requirements evaluation

Lack of human-centric approaches The challenges across 
papers [46, 47, 112, 121, 132] collectively highlight a sig-
nificant shortfall in human-centric approaches within AI 
system development. Habiba et al. [112] outline issues such 
as the lack of a mediator role for effective communication 
among stakeholders, the absence of a unified explainability 
definition, and the shortfall in stakeholder-focused develop-
ment methodologies, alongside a missing common language 
for all involved in ML projects. These issues underscore a 
widespread neglect of human-centered perspectives in AI’s 
technical evolution.

Ahmad et al. [132] point out the increasing reliance on 
AI in software solutions that unfortunately often overlook 
essential human-centered considerations in favor of techni-
cal priorities, indicating a misalignment between technologi-
cal progress and human values. Similarly, Yu and Yong [46] 
expose a specific lack of engagement with the needs and 
perspectives of Korean stakeholders in AI for Health, reveal-
ing both a geographic and cultural oversight in stakeholder 
engagement. Grüning et al. [47] discuss how companies fre-
quently miss integrating user requirements in the innovation 
of business models and the creation of new AI products, 
especially in healthcare, leaving uncertain how AI might 
shape future business models in this vital sector. Lastly, 
Wang [121] criticizes the dominant focus on technical strate-
gies like model extraction for interpretability, which neglects 
user expectations, highlighting a critical gap in aligning AI 
system development with actual user needs.

Hard to evaluate requirements Habibullah et al. [44] 
underscore the importance of NFRs in maintaining ML sys-
tem quality, noting differences in definitions and measure-
ments of NFRs between traditional systems and ML sys-
tems, such as adaptability and maintainability. The difficulty 
in measuring NFRs like fairness and explainability due to 
their qualitative nature is compounded in safety-critical situ-
ations where both human and machine judgment are crucial. 
Additionally, challenges in NFR measurement are identi-
fied, including gaps in knowledge or practices, absence of 
measurement baselines, complex ecosystems, data quality 
issues, testing costs, bias in results, and domain 
dependencies. Moreover, Bartlett [120] points out the complexity 
of defining sensor accuracy requirements to ensure reliable 
algorithm outputs, indicating a lack of straightforward or 
well-defined processes. Similarly, Dey et al. [133] observe 
that while there is an emphasis on specifying ML-specific 
performance requirements, there is insufficient guidance on 
systematically engineering data requirements that involve 
diverse stakeholders.

5.1.5  Gap between ML engineers and end‑users

This section focuses on the challenges arising from the gap 
between ML engineers and end-users. We categorized this 
gap into three distinct groups.

Lack of a collaborative approach to requirements and 
design Regarding the lack of collaboration, Vogelsang 
and Borg [107] underlined that it is challenging for data 
scientists to explain performance measures and their rel-
evance to the client in an effective and understandable way. 
Furthermore, to ensure that customers understand the per-
formance measures, data scientists should also have skills in 
communication and customer education. Likewise, Shergad-
wala and El-Nasr [29] underscored the need to understand 
the shared mental model of design teams during human-AI 
collaboration. Liao et al. [34] identified the need for explainability 
to make AI algorithms understandable to people. In contrast, 
Nalchigar and Yu [65] questioned the huge conceptual 
distance between business strategies, decision processes, and 
organizational performance. Lastly, Brennen [32] stressed that it 
is essential to define a common terminology when 
discussing XAI to enable meaningful, productive conversations that 
can move the field forward. This could include establishing a 
shared vocabulary and clearly defined concepts and provid-
ing guidance on how to classify and rank models based on 
their explainability.

Lack of communication Second, regarding the lack of 
communication, a key challenge for software engineers 
developing ML systems is to determine how to capture customer 
requirements effectively and design user interfaces that 
clearly convey data to the user [36]. Similarly, overblown 
expectations arising from a lack of communication are 
another significant challenge [37]. Bridging user needs 
and technical capabilities to develop explainability systems 
that are flexible, responsive, and resilient to changing 
conditions is also a challenge [34]. Qadadeh and Abdallah [109] 
stated that understanding the language and terminology used 
by data scientists and business users is a challenge in the 
context of data mining. They added that improving organi-
zational communication between data miners and business 
analysts and finding a way to bridge the gap between theo-
retical research results in data mining and realistic project 
goals is also a challenge.

Knowledge gap The third challenge is to build a shared 
understanding among stakeholders of the potential of ML 
technology. Most importantly, it involves addressing the 
challenges in data collection and processing techniques, 
as well as implementing appropriate algorithms and mod-
els to ensure the overall effectiveness and reliability of the 
AI-based systems [27, 66]. Another challenge related to 



requirement elicitation for data analytics systems is to deter-
mine how to translate the business objectives into tangible 
and measurable analytics requirements. Additionally, there 
is a gap between non-technical stakeholders, who often have 
difficulty expressing their needs, and technical stakeholders, 
who need to understand and implement the requirements 
[65].

5.1.6  Uncertainty

Uncertainty in AI-based systems presents significant chal-
lenges to the Requirements Engineering (RE) process, 
categorized into three distinct areas: uncertain environ-
ments, changing requirements, and the uncertain nature of 
outcomes.

Uncertain environment The uncertain environment 
encompasses challenges stemming from reliance on large 
volumes of data, where the accuracy and reliability of this 
data cannot always be assured [148]. This situation is fur-
ther complicated by unpredictable external conditions that 
might affect the system’s performance and decision-making 
capabilities [108], making it difficult to guarantee system 
behavior under varying conditions.

Changing requirements Changing requirements pose a 
persistent challenge, reflecting the dynamic nature of busi-
ness and operational goals. As business requirements evolve, 
the technical side struggles to keep pace, especially in under-
standing and processing data effectively [66]. This fluidity 
can lead to discrepancies between expected and actual sys-
tem capabilities, necessitating ongoing adjustments to the 
RE process.

Uncertain nature of the outcome The outcome’s uncer-
tain nature is particularly pronounced in AI-based systems, 
where the behavior on unseen data can significantly differ 
from expected results. This unpredictability complicates the 
RE process, as it undermines the ability to predict the sys-
tem’s performance accurately and, by extension, its devel-
opment timeline, cost-effectiveness, and overall feasibility 
[83, 114, 143]. The inherent unpredictability of AI models 
demands a flexible and adaptive approach to requirements 
engineering, capable of accommodating unforeseen changes 
and outcomes.

5.1.7  Requirements analysis

We identified three primary areas of concern in Require-
ments Analysis where each category reflects specific issues 
encountered in the development of AI-based systems, as 
delineated by primary studies.

Hard to classify The classification of requirements for 
AI-based systems into FR and NFR presents a significant 
challenge due to the inherent complexity of these systems. 
They leverage intricate algorithms and vast datasets, 
complicating predictions of system behavior in various scenarios 
or environments [82]. The interaction of AI systems with 
external elements further amplifies this complexity, render-
ing traditional requirement analysis methods less effective. 
Moreover, the predictive nature of AI learning models 
complicates defining system behavior in advance, 
underscoring the classification challenge [129].

Requirements management throughout the life cycle 
Effective requirements management across the lifecycle 
of an AI-based system is crucial yet challenging. [148] 
emphasizes the importance of understanding the impacts 
of ML algorithms not only during the design phase but also 
post-deployment, advocating for a broader consideration of 
non-functional requirements (NFRs) beyond the integration 
of ML solutions. The non-deterministic behavior at runt-
ime, influenced by the learning algorithms, complicates the 
classification and management of requirements for AI/ML-
intensive systems [69]. This dynamic behavior necessitates a 
flexible and adaptive approach to requirements management 
throughout the system’s lifecycle.

No clear perception "Clear perception" in the context 
of requirements analysis for ML-based systems refers to the 
precise and accurate understanding of how these systems 
perceive and interpret data from their environment. The lack 
of a clear perception during requirements analysis for ML-
based systems poses a significant risk, potentially leading 
to the violation of other system requirements, such as data 
dependencies. This is particularly concerning for safety 
requirements in ML-intensive systems, where unclear or 
inaccurate perceptions can undermine the achievement of 
top-level safety goals [115]. The challenge lies in adequately 
capturing and specifying these requirements in a manner that 
accounts for the nuanced and often unpredictable nature of 
ML-based perception.

5.1.8  Contextual requirements

The concept of contextual requirements highlights the neces-
sity of incorporating the specific environment or context in 
which an AI system operates into its design process. How-
ever, this presents two main challenges: accurately captur-
ing and defining these contextual requirements, and inte-
grating them into the design and development process. The 
variability and dynamic nature of real-world environments 
make it difficult to ensure that the AI system will perform 
optimally across different contexts. Traditional requirements 
engineering practices often fail to address these complexi-
ties, underscoring the need for new methods to effectively 
handle contextual requirements:

NFR attributes change in the ML context In the ML 
context, NFR attributes undergo significant transforma-
tions, necessitating a nuanced approach to their elicitation, 



specification, and validation. The complex nature of AI sys-
tems, coupled with the specific demands of their application 
domains, often results in a shift in the prioritization and 
characterization of NFRs. These shifts can be attributed to 
various factors, including emerging stakeholder expecta-
tions, evolving legal and ethical standards, and the technical 
requirements of integrating AI components. The challenge 
is compounded by a frequent lack of domain-specific knowl-
edge, implicit stakeholder needs, and ill-defined problem 
scopes, making the accurate definition and management of 
these attributes particularly challenging [27, 147].

Context changes over time Contexts within which 
AI systems operate are not static; they evolve over time, 
affecting the relevance and accuracy of the initially defined 
requirements. This dynamic nature of contexts can lead to 
significant alterations in requirement attributes, necessitating 
ongoing adjustments to both functional and non-functional 
requirements to maintain system efficacy and compliance. 
The ability to anticipate and adapt to these changes is crucial 
for the long-term success of AI systems, highlighting the 
need for flexible and responsive requirement engineering 
processes [148].

5.1.9  Legal and ethical challenges

Challenges in data dependence and interpretability One 
significant challenge highlighted by Gabriel et al. [48] is 
the reliance on extensive datasets and the domain exper-
tise required to develop models while adhering to regula-
tory and ethical standards. There is a particular emphasis on 
the necessity to encapsulate implicit knowledge, especially 
from employees, and to ensure the AI system’s operations 
are interpretable to them. The lack of practical experience 
with AI applications in many companies further complicates 
these challenges. Additionally, Silva et al. [135] highlighted 
that, for ML systems, the inherent opacity poses a significant 
barrier to explainability, compounded by issues like ensuring 
non-discrimination, navigating legal restrictions on data 
usage, and the complex task of specifying data requirements.

Fairness, regulation, and ethical accountability chal-
lenges Treacy [110] and Barclay [73] highlighted that cur-
rent approaches lack mechanisms to extract protected attrib-
utes from legal requirements and to assist in the definition 
and interpretation of fairness in AI models, indicating a 
gap in developing fair AI systems. Further, Grüning [47] 
stated that companies aiming to offer AI solutions must 
navigate complex product requirements and regulatory 
landscapes, presenting significant operational challenges. 
Similarly, Cerqueira [58] emphasized that developers often 
lack adequate training in AI ethics, both in academic settings 
and within development projects. Furthermore, the absence 
of legal consequences for failing to implement ethical 
guidelines, often because these guidelines are non-binding, 
results in a lack of motivation or accountability among 
developers regarding AI ethics.

5.2  Future research directions

We propose future research directions for RE4AI, 
as summarized in Table 4. This proposal is 
founded upon a selective extraction of insights from primary 
studies.

RD1: How to incorporate human knowledge in building 
AI systems? New sophisticated and AI-enabled safety 
systems, such as automatic emergency braking (AEB), 
have dramatically transformed the relationship between 
human drivers and their cars. They free up mental 
resources, enhance driving quality, and impact other traffic 
participants and their conduct. While AI-powered driving 
assistance has evolved considerably recently, humans have 
remained the same over the previous millennia. So, while 
building such features, we must consider several crucial fac-
tors (limitations and capabilities) from a human perspective. 
The fact that people may override or deactivate AEB 
capability, for example, has become a key constraint on its 
potential to make traffic safer. In this regard, considering to what 
extent human aspects must be included when examining 
the desired quality and needed functionality of the system 
and its components is a fruitful research opportunity [147]. 
Also, how knowledge about human factors can be effectively 
incorporated into AI-intensive system development meth-
odologies would be a promising research opportunity [29].

RD2: How can requirements modeling be used for 
understanding AI-based systems? Requirements modeling 
enables the connection between domain problem under-
standing and technology solution, describing and justifying 
the step-by-step progression from problem to solution. Like 
conventional software, ML applications may benefit from 
well-known RE methodologies such as goal- and agent-
oriented RE, ensuring that the final systems meet the goals 
and desires of end-users and other stakeholders [84]. Fur-
thermore, conceptual modeling can be seen as worthwhile to 
improve business understanding and enhance system transparency
[64]. In addition, [115] and [147] highlighted efficient
top-down requirement formulation and the derivation of contextual
requirements from use cases as new research avenues.

RD3: How can existing RE practices be adapted for 
AI-based systems? One of the major causes of poor ML 
system quality is the lack of requirements specification 
[152]. The main reasons for this are the change in development
paradigm and the emergence of new types of requirements. [82] and [66]
outlined that research should be conducted to investigate 
how existing RE practices, e.g., GORE, data-driven and 
model-based design (MDM), can be adapted for AI-based 
systems. The same applies to explainability: Kohl et al. [88]
emphasized that further research is needed to investigate 
how RE techniques can be applied to design explainable 
systems.

RD4: How to identify the need for new RE practices
specific to AI-based systems? Several studies have shown
that RE for ML systems is different because of the different 
ways these systems are developed; therefore, RE practices 
for these systems should also evolve. In this context, [107]
outlined the issues that future research should address: Is RE 
for ML distinct? If yes, what distinguishes it? If not, what are 
the reasons and consequences? Further research is needed
to determine how RE for ML can be integrated with RE for
traditional software systems [148].

RD5: How to address non-functional requirements? 
A rigorous RE approach is required to assure quality. NFRs 
are requirements placed on system quality and are articulated
over many quality characteristics [153]. Further, the
authors stated that our knowledge of NFRs from traditional
systems is no longer applicable to AI-based systems
due to non-deterministic behavior and additive performance
requirements. One of the most critical aspects is data
and its representation in ML systems since there needs to be 
an adequate mechanism to identify and manage the needed 
quality and amount of data [147]. Future research should
focus on which quality requirements are specific to ML
systems and how these requirements can be specified [129,
148], particularly data quality requirements [147], safety
requirements [107], and compliance requirements [108].

RD6: How to validate ML requirements? Some studies
have examined requirements validation for AI, but this
work is still in its infancy. Possible future research directions
include identifying appropriate performance metrics or key
performance indicators (KPIs) for trained ML models in a
particular context. Defining and monitoring the performance 
of ML systems ensures the system stays within its intended 
behavior [147]. In the context of requirements validation, 
[145] pointed towards validating explanations as potential
future work. Some frameworks are provided to specify qual-
ity objectives as constraints or as criteria. Nonetheless, they 
need an evaluation of the relevance of such objectives, e.g., 
assessing the understanding of stakeholders, which can be a 
promising future research direction.
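
As an illustration only, the idea of defining a KPI for a trained ML model and monitoring it over time can be sketched in Python as follows. The KpiRequirement class, the choice of accuracy as the KPI, the threshold value, and the fixed-size windowing scheme are assumptions made for this sketch; they are not taken from the primary studies and are not a recommended validation method.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class KpiRequirement:
    """Hypothetical performance requirement for a trained ML model."""
    name: str
    minimum: float  # lowest acceptable KPI value (assumed threshold)


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the reference labels."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def violated_windows(
    req: KpiRequirement,
    y_true: Sequence[int],
    y_pred: Sequence[int],
    window: int,
) -> List[int]:
    """Return the start index of every full, non-overlapping window
    whose accuracy falls below the required minimum."""
    starts = []
    for start in range(0, len(y_true) - window + 1, window):
        score = accuracy(y_true[start:start + window],
                         y_pred[start:start + window])
        if score < req.minimum:
            starts.append(start)
    return starts


# Example: two windows of four observations each (made-up data).
requirement = KpiRequirement(name="accuracy", minimum=0.75)
labels = [1, 0, 1, 1, 0, 0, 1, 1]
predictions = [1, 0, 1, 0, 1, 1, 1, 1]
print(violated_windows(requirement, labels, predictions, window=4))
```

In this made-up example, the first window meets the threshold exactly, while the second falls below it and is reported. Which KPI and threshold are appropriate in a particular context remains the open question raised above.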

RD7: How to address new types of requirements? This
future research avenue should consider the following
questions: How to specify new types of requirements [81, 154]?
How to incorporate these requirements in the current
development scenario [80]? How
do we formulate the specifications for alteration