Authors: Wyrich, Marvin
Title: Evidence for the design of code comprehension experiments
Issue Date: 2023
Document Type: Dissertation
Pages: 314
Abstract: Context: Valid studies establish confidence in scientific findings. However, carefully assessing a study design requires specific domain knowledge in addition to general expertise in research methodologies. For example, in an experiment, the effect of a manipulated condition on an observation can be influenced by many other conditions. We refer to these as confounding variables. Knowing the possible confounding variables in the thematic context is essential for assessing a study design. If certain confounding variables are not identified and consequently not controlled, this can pose a threat to the validity of the study results.
Problem: So far, the validity of a study has been assessed only intuitively. The potential bias of study findings due to confounding variables is thus speculative rather than evidence-based. This leads to uncertainty in the design of studies, as well as disagreement in peer review. Two barriers currently impede evidence-based evaluation of study designs. First, many of the suspected confounding variables have not yet been adequately researched to demonstrate their true effects. Second, there is no pragmatic method for synthesizing the existing evidence from primary studies in a way that is easily accessible to researchers.
Scope: We investigate the problem in the context of experimental research methods with human study participants and in the thematic context of code comprehension research.
Contributions: We first systematically analyze the design choices in code comprehension experiments over the past 40 years and the threats to the validity of these studies. This forms the basis for a subsequent discussion of the wide variety of design options in the absence of evidence on their consequences and comparability. We then conduct experiments that provide evidence on the influence of intelligence, personality, and cognitive biases on code comprehension. Whereas the influence of these variables was previously only a matter of speculation, we now have some initial data points on their actual influence. Finally, we show how combining different primary studies into evidence profiles facilitates evidence-based discussion of experimental designs. For the three most commonly discussed threats to validity in code comprehension experiments, we create evidence profiles and discuss their implications.
Conclusion: Evidence both for and against frequently discussed threats to validity can be found. Such conflicting evidence is explained by the need to consider individual confounding variables in the context of a specific study design, rather than treating them as a universal rule, as is often done. Evidence profiles highlight such a spectrum of evidence and serve as an entry point for researchers to engage in an evidence-based discussion of their study design. However, as with all types of systematic secondary studies, the success of evidence profiles depends on a sufficient number of studies being published on the same research question. This is a particular challenge in a research field where the novelty of a manuscript's research findings is one of the evaluation criteria at any major conference. Nevertheless, we are optimistic about the future, as even evidence profiles that merely indicate that evidence on a particular controversial issue is scarce will make a contribution: they will expose opinionated assessments of study designs as such and motivate additional studies to provide more evidence.
Appears in Collections: 05 Fakultät Informatik, Elektrotechnik und Informationstechnik

Files in This Item:
File: thesis.pdf
Size: 2.03 MB
Format: Adobe PDF

Items in OPUS are protected by copyright, with all rights reserved, unless otherwise indicated.