for Research

Our research and projects are designed for a wide range of applications across industries.

Our Focus

The Data Science and its Applications group at DFKI conducts research in the following funded areas.

If you would like to contribute to or collaborate on one of these projects, whether as a research assistant, doctoral researcher or postdoctoral researcher, we would like to hear from you!

AI4Nof1

Reducing patient burden and research cost: the AI4Nof1 project leverages cutting-edge developments in (causal, neurosymbolic) reinforcement learning, psychometrics and digital epidemiology to build adaptive personalised treatment regimes for chronic conditions. It simultaneously identifies phenotypes and causal pathways while minimising the time and number of measurements patients need to find a treatment that is right for them.

curATime

curATime is a collaborative, multi-centre effort applying AI methods to high-dimensional biomedical data to promote an understanding of biological processes associated with cardiovascular illness. This exciting project involves highly granular data collected as part of the Gutenberg Health Study — a prospective cohort study of a representative sample (N = 15000) of the population of Mainz — including genotyping, DNA methylation, transcriptomics, proteomics and extensive time-varying clinical information.

Eventful

The project explores methods for selecting and extracting features from whole-genome data for survival problems (individual health), the effect of different levels of data aggregation on spatiotemporal models (societal health), and the challenges of building a transparent, explainable and useful data science pipeline for such complex temporal data. Particular attention is paid to automating the data cleaning process and to effectively combining scientific and machine learning knowledge on problems where data are messy, sparse or only available in aggregate form.

Physics-informed ML

The aim of this project is to develop a new form of AI: physics-informed deep anomaly detection. This involves utilising state-of-the-art neural anomaly detection algorithms, which have recently drastically reduced error rates in common benchmarks. Neural network architectures are being developed that specifically incorporate information from the data domain to further increase data efficiency. In addition, physics-informed generative AI methods are being developed to expand and improve the existing training data. The application of the new physics-informed anomaly detection methods is being evaluated specifically in additive manufacturing, particularly with regard to efficient process execution.

Structuring unstructured information

Managing a large building requires knowing exactly what equipment is inside it, but this vital data is usually trapped in thousands of pages of messy blueprints, tables, and text documents. Traditionally, compiling this information into a usable register takes weeks of tedious, manual paperwork and costly site inspections. This project changes that by using AI to automatically read, extract, and organize this hidden data from existing documents. By turning a chaotic mountain of paperwork into a structured digital catalog, the project aims to save massive amounts of time, cut maintenance costs, and connect seamlessly with modern digital building software to optimize energy efficiency.

Trustworthy AI in medicine

The TrustifAI project aims to contribute a set of concrete solutions that improve the trustworthiness of AI applications in health and wellbeing at various stages of the development lifecycle. A quality platform for the development of trustworthy AI applications will enable users to create efficient and effective data science analytics pipelines through a human-in-the-loop approach, with the goal of increasing trustworthiness.

Our Network

The DSA proudly collaborates with a wide network of scientists and professionals from around the world. We aspire to turn our shared values into innovation and education standards in the domain of data science. A few select collaborators are listed below.

Bernd Bischl

Fraunhofer & München

Louis Aslett

Durham & Alan Turing Institute

Spiros Denaxas

University College London

Lester Mackey

Stanford & Microsoft

Hong Ge

Cambridge

Yee Whye Teh

Oxford & DeepMind

Andrew Duncan

Imperial College

Adolfo de Unánue

Escuela de Gobierno y Transformación Pública

Rayid Ghani

Carnegie Mellon University

Seth Flaxman

Oxford

Chris Holmes

Oxford & Alan Turing Institute

Our Latest Publications

Trends in Pediatric Hospital Admissions Caused or Contributed by SARS-CoV-2 Infection in England

To investigate the changing characteristics of SARS-CoV-2-related pediatric hospital admissions over time. STUDY DESIGN: This was a national, observational cohort study from July 1, 2020, to August …

Harrison Wilde, Christopher Tomlinson, Bilal A. Mateen, David Selby, Hari Krishnan Kanthimathinathan, Spiros Denaxas, Seth Flaxman, Sebastian Vollmer, Christina Pagel, Katherine Brown 2025

Actionable Trustworthy AI with a Knowledge-based Debugger

The rapidly evolving regulatory landscape in AI presents significant challenges to establishing and maintaining trust. AI practitioners face a substantial burden in understanding and operationalizing …

Priyabanta Sandulu, Andrea Šipka, Sergey Redyuk, Sebastian J. Vollmer 2025

The Power of Stories: Narrative Priming in Multi-Agent Networked Public Goods Games

Research suggests that large-scale human cooperation is driven by shared narratives that encode common beliefs and values. This study explores whether such narratives can similarly nudge LLM agents …

Gerrit Großmann, Larisa Ivanova, Sai Leela Poduru, Mohaddeseh Tabrizian, Islam Mesabah, David Antony Selby, Sebastian Vollmer 2025

Describing variability of intensively collected longitudinal ordinal data with latent spline models

Population health studies increasingly collect longitudinal, patient-reported symptom data via mobile devices, offering unique insights into experiences outside clinical settings, such as pain, …

Mark Lunt, David Antony Selby, William Dixon 2025

X-Hacking: The Threat of Misguided AutoML

Explainable AI (XAI) and interpretable machine learning methods help to build trust in model predictions and derived insights, yet also present a perverse incentive for analysts to manipulate XAI …

Rahul Sharma, Sumantrak Mukherjee, Andrea Sipka, Eyke Hüllermeier, Sebastian Vollmer, Sergey Redyuk, David Antony Selby 2025

Collaborate with Us

Cooperation is at the heart of our operations, and we are always looking for new academics and professionals with whom we can shape the future of data science. If you have ideas on how we could collaborate or just want to discuss our work, please reach out!
