Welcome to the DX2025 Benchmarks homepage

Information

This page contains information on the three benchmarks introduced for the 36th International Conference on Principles of Diagnosis and Resilient Systems 2025 in Nashville, TN, USA.

The DXC’25 organization team (in alphabetical order): Johan de Kleer (co-chair), Jan Deeken, Kai Dresia, Erik Frisk, Daniel Jung (chair LiU-ICE), Mattias Krysander, Eldin Kurudzija (chair LUMEN), Ingo Pill (co-chair), Michal Syfert, Anna Sztyber-Betley (chair SLIDe), Tobias Traudt, Günther Waxenegger-Wilfling

More details regarding each benchmark can be found below.

DX 2025 Competition

The DX 2025 competition comprises three benchmarks. The objective for each benchmark is to develop a diagnosis system. You may participate in one benchmark or in several.

If you are interested in or intend to participate in any of the benchmarks, please send a notification email to the corresponding contact person for that benchmark.

The deadline for submitting solutions is April 1st. Your diagnosis system should be submitted to the corresponding contact person.

Benchmarks (More info will be added)

As extra motivation for your participation, we are in contact with AIJ associate editor Meir Kalech, who is responsible for special issues dedicated to AI competitions. Depending on the submissions, the DXC competition might become part of such a special issue, in which case contestants would be invited to submit extended versions of their DX competition papers.

Description of the diagnosis system interface

The diagnosis system must be implemented in Python. A Docker environment is provided for each benchmark in which the diagnosis system is to be developed. The system must conform to a specific input/output interface based on a DiagnosisSystem class.

At each time instant, a new sample is provided to the diagnosis system, which in turn must produce a diagnosis output. All benchmarks use the same implementation environment to simplify participation in multiple competitions. See each benchmark for a detailed description of its input/output interface.
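To illustrate the sample-in, diagnosis-out loop described above, here is a minimal sketch of what such a class could look like. Only the class name DiagnosisSystem is given by the organizers; the method names, the sample/output types, and the threshold logic below are assumptions for illustration — consult each benchmark's Docker environment for the actual interface.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Diagnosis:
    # Hypothetical output container: suspected faulty signal (None if nominal)
    # and a confidence score in [0, 1].
    fault: Optional[str] = None
    confidence: float = 0.0


class DiagnosisSystem:
    """Sketch of the per-sample diagnosis interface (assumed method names)."""

    def __init__(self) -> None:
        self.history: list[dict] = []  # samples seen so far

    def input(self, sample: dict) -> Diagnosis:
        # One sample arrives per time instant; a diagnosis is returned each call.
        self.history.append(sample)
        # Placeholder logic: flag a fault if any signal leaves a fixed band.
        for name, value in sample.items():
            if abs(value) > 10.0:
                return Diagnosis(fault=name, confidence=0.9)
        return Diagnosis(fault=None, confidence=0.1)
```

A driver script would then construct one DiagnosisSystem instance and feed it samples one at a time, collecting the returned diagnoses.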

Benchmarks

LiU-ICE

The LiU-ICE benchmark covers some challenging problems in fault diagnosis of technical systems. The diagnosis system needs to identify the faulty component as quickly and accurately as possible while avoiding misclassifications and false rejections of the true diagnosis. The objective of the competition is to address these challenges by designing a diagnosis system for the air path of an internal combustion engine. It is a challenging system because of its dynamic non-linear behavior and wide operating range. A state-of-the-art structural model of the system is provided together with training data from different fault scenarios. The set of available actuator and sensor signals corresponds to the standard signals available in a commercial vehicle.

More information and downloadable resources can be found here:

Contact: Daniel Jung

SLIDe

The SLIDe (Steam Line Intrusion Detection) benchmark is devoted to the analysis of diagnostic algorithms for the detection and isolation of process faults and the detection of cyberattacks in a simulated fragment of the steam line of a fluidized bed boiler, including the third and fourth stages of superheaters. It features challenging scenarios involving sensor, actuator, and technological component faults, as well as cyberattacks. To reflect the industrial nature of the benchmark, participants will only have a qualitative description of the process with a list of measurements and a few archival datasets representing different operating conditions, but only for fault-free and attack-free states.

More information and downloadable resources can be found here:

Contact: Anna Sztyber-Betley

LUMEN

LUMEN (Liquid Upper stage demonstrator Engine) is a modular pump-fed liquid oxygen (LOX) and liquid methane (LNG) rocket engine developed by the Institute of Space Propulsion of the German Aerospace Center (DLR). This benchmark focuses on the fuel turbopump subsystem of the rocket engine and addresses key challenges encountered in safety-critical systems, such as the lack of experimental data from faulty operation. The goal of this benchmark is to use information from a simulation model with uncertain parameters, together with limited experimental data from nominal operation, to enable the diagnosis system to perform effectively under realistic operating conditions.

More information and downloadable resources can be found here:

Contact: Eldin Kurudzija