Call 3/25

LIST OF SUCCESSFUL PROJECTS IN CALL 3/25:

Fine-tuning and benchmarking open-source speech-to-text models for Slovak

Martin Tamajka

Recent advances in open-source speech-to-text (ASR) models such as Whisper or Parakeet have demonstrated that large-scale pretraining combined with fine-tuning enables robust transcription across many languages and domains. However, languages with fewer available resources, such as Slovak, remain underrepresented in both pretraining corpora and in downstream benchmarks. Fine-tuning these open-source models for Slovak is therefore a promising direction.
Despite these advantages, fine-tuning ASR systems for Slovak presents multiple challenges. The most pressing issue is the scarcity of large, high-quality transcribed speech corpora covering diverse domains, speakers, and dialects. Additional difficulties include the computational demands of training or adapting large pretrained models, the risk of overfitting when data is limited, and the lack of standardized benchmarks for evaluating Slovak ASR performance. These constraints make it essential to select fine-tuning strategies carefully, such as parameter-efficient methods or data augmentation, and to establish reproducible evaluation procedures. Addressing these challenges will be key to building competitive Slovak ASR systems and to advancing research in speech technology for Slovak.
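As an illustration of what a reproducible evaluation procedure might measure, the sketch below computes the word error rate (WER), a metric commonly used for ASR benchmarks. The abstract does not name a metric, so the choice of WER, the function name `word_error_rate`, and the whitespace tokenization are assumptions made for this example only.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: words are separated by whitespace and compared exactly;
    a real benchmark would also fix rules for casing and punctuation.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the processed reference prefix and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word ("deň" -> "den") and one deleted word ("sa").
    print(word_error_rate("dobrý deň ako sa máte", "dobrý den ako máte"))
```

Because WER depends heavily on text normalization, a shared benchmark would need to fix the normalization rules together with the test set.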

Lattice distortion and dislocation slip in rock salt transition metal binary carbides using first-principles DFT calculations

Tamás Csanádi

High-entropy (HE) carbides with the rock salt structure could broaden the application and improve the performance of ultra-high temperature ceramics, the only materials suitable for temperatures exceeding 2000 °C in oxidising atmospheres, as encountered in hypersonic vehicles and spacecraft. These ceramics are I) a limited group of materials (about 15 compounds) and II) macroscopically brittle at room temperature. Despite the great potential of HE carbides to outperform conventional monocarbides, the first-principles density functional theory (DFT) computational screening of their synthesisability is very challenging, and the prediction of their mechanical properties (i.e. strength and plasticity) is still unexplored due to the large compositional space and the complex nature of the problem. The present project addresses this challenge by performing accurate first-principles DFT simulations on stoichiometric binary carbides with the rock salt structure, such as TiZrC2 and TaHfC2, from which HE carbides could be built up in the future using other analytical or machine learning models (outside the scope of the present project). These binary carbides could serve as the smallest building blocks, providing the essential features of HE carbides, namely: 1) the lattice distortion caused by the random arrangement of different metal cations and 2) the tunability of dislocation slip via the different electronic environments of the component metals.

Therapy of Aphasia Using Social Robotics and Artificial Intelligence

Tomáš Černáček

Aphasia significantly impairs communication, creating a need for specialized therapeutic tools. This project leverages social robotics and artificial intelligence to develop models for Automatic Speech Recognition (ASR), Text-to-Speech Synthesis (TTS), and Large Language Models (LLMs) tailored to aphasic communication. The ASR model targets accurate recognition of pathological speech, including atypical articulation and long pauses. The TTS system generates natural, highly intelligible speech for clinical contexts, while the LLM supports therapeutic dialogues with simplified, domain-specific language. Using high-performance computing, we analyze baseline models, preprocess pathological speech datasets, and perform large-scale training and evaluation. The resulting AI components aim to enhance aphasia therapy by enabling reliable recognition, clear speech synthesis, and supportive, context-aware interactions. This work provides a foundation for AI-driven rehabilitation tools and opens pathways for domain-specific applications, such as everyday scenarios like shopping or banking, improving independence and communication for individuals with aphasia.

Evaluation and benchmarking of LLM capabilities in text generation and classification tasks 2

Dominik Macko

Multilingual machine text generation has progressed significantly in recent years thanks to a new generation of large language models (LLMs). These models have had a profound impact on NLP, which is no longer limited to English, although low-resource languages (such as Slovak) remain challenging to work with. Our objective is to research LLMs’ capabilities (incl. in-context learning and synthetic data augmentation) to tackle multilingual text generation and classification. Namely, we will focus on the following tasks: 1) personalized disinformation generation, 2) machine-generated text detection, 3) textual/multimodal content assessment, 4) an LLM benchmark for the Slovak language, and 5) effective fine-tuning of LLMs for the Slovak language. To perform the planned tasks, we will need to run inference on and fine-tune different kinds of language models, covering various sizes and architectures, and explore quantization and efficient fine-tuning techniques for different tasks. The expected results will improve our understanding of the advantages and limitations of LLMs and provide several improved methods. This is a continuation of the previous project on the same topic.

Unravelling excited-state reaction paths of selected molecular photoswitches by nonadiabatic dynamics methods

Martin Ošťadnický

Light-driven isomerization is a process with broad applicability in many research areas, from the design of photo-active materials to photochemically driven control of large biomolecules. In the past decade, new ideas have emerged about the pharmacological use of molecular photoswitches, one of the potential candidates being iminothioindoxyl (ITI). Despite its favorable photophysical properties upon excitation (visible light absorption, well-separated isomer absorption maxima, appropriate thermal half-lives), a key drawback remains: the excited-state potential energy surface has been explored only to a limited extent. Using static quantum chemistry approaches and non-adiabatic molecular dynamics simulations, we aim to elucidate the excited-state pathways of selected ITI derivatives and other aryliminoindigoid switches, thereby developing a strategy for optimizing the quantum yield of the light-driven isomerization and further advancing their practical applicability in pharmacology.

Designing Novel SUPERatomic CLUSTERS for Next-Generation Materials

Katarína Šulková

In this project, we will employ density functional theory (DFT) to systematically investigate the interaction between CO2 and doped aluminum clusters, focusing on how doping influences their electronic structure, local surface charges, ionization potentials and, in turn, the catalytic activation of the CO2 molecule. Standard DFT calculations inherently assume a single-reference character of the studied systems, which can limit their accuracy for strongly correlated systems. We have recently developed a method for partitioning the exact electron correlation energy into dynamic and non-dynamic components, based on the fixed-node diffusion Monte Carlo (FNDMC) method with nodes derived from a restricted Hartree-Fock Slater determinant. The proposed method allows for an unambiguous separation of electron correlation energies and will be applied to analyze the nature of electron correlation in small metal clusters.

openCARP-Based In-Silico Simulation of Ventricular Electrophysiology and Torso Potentials Using Patient-Specific Models

Lukáš Zelieska

This project builds on our previous experience with patient-specific computational modeling of ventricular electrophysiology and extends it toward the investigation of torso potentials during premature ventricular contractions (PVCs). PVCs are a frequent arrhythmia that can disrupt cardiac rhythm and are often treated with catheter-based radiofrequency ablation (RFA). The success of such interventions strongly depends on precise identification of the ectopic origin and an accurate understanding of activation pathways. To address these challenges, we employ the openCARP simulation environment and a reaction–diffusion bidomain framework to perform high-resolution in-silico experiments on patient-derived geometries.
Our workflow integrates electrocardiographic recordings with CT-based anatomical reconstructions, enabling the generation of detailed three-dimensional heart–torso models. By simulating PVCs from multiple ventricular regions and comparing simulated torso potentials with clinical measurements, we aim to improve localization accuracy and provide mechanistic insights into arrhythmia propagation. The project exploits high-performance computing (HPC) resources not only for large-scale bidomain simulations but also for computationally intensive mesh generation, ensuring efficient execution of tasks that are otherwise infeasible on standard hardware.
The expected outcome is a robust modeling framework that supports improved non-invasive diagnostics and personalized therapy planning. By advancing our ability to link cardiac electrical sources with body surface potentials, the project will contribute to more reliable localization of PVC origins and assist clinicians in optimizing therapeutic interventions.

Isobutanol to butenes transformations catalysed by external surfaces of ferrierite

Tomáš Bučko

Building upon our previous systematic work, we will use DFT and its machine-learning (ML) surrogate models to study the mechanisms and kinetics of important elementary transformations of isobutanol into butenes catalyzed by the external surfaces of zeolite ferrierite. To this end, representative slab models will be built, potentially active sites will be identified, and rate constants of rate-determining elementary steps will be computed by means of ML-accelerated ab initio molecular dynamics (AIMD) simulations. We will also perform AIMD simulations of power/IR spectra of selected systems to enable comparison with time-resolved IR spectroscopic data from our experimental partners.
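To illustrate how a free energy of activation translates into a rate constant, the sketch below evaluates the Eyring equation of transition state theory, k = (k_B T / h) exp(−ΔG‡ / RT). This is a generic textbook relation and not necessarily the procedure used in the project; the function name `eyring_rate_constant`, the units, and the transmission coefficient of 1 are assumptions of this example.

```python
import math

K_B = 1.380649e-23   # Boltzmann constant, J/K
H = 6.62607015e-34   # Planck constant, J*s
R = 8.314462618      # molar gas constant, J/(mol*K)


def eyring_rate_constant(dg_activation_kj_mol: float, temperature_k: float) -> float:
    """First-order rate constant in 1/s from a free energy of activation.

    Assumptions: the barrier is given in kJ/mol, the temperature in K,
    and the transmission coefficient is taken as 1.
    """
    if temperature_k <= 0:
        raise ValueError("temperature must be positive")
    prefactor = K_B * temperature_k / H
    return prefactor * math.exp(-dg_activation_kj_mol * 1000.0 / (R * temperature_k))


if __name__ == "__main__":
    # Hypothetical barrier of 120 kJ/mol at two temperatures.
    for temperature in (500.0, 600.0):
        print(temperature, eyring_rate_constant(120.0, temperature))
```

The exponential dependence means that an error of a few kJ/mol in the barrier changes the rate constant noticeably, which is one reason free-energy sampling at finite temperature matters for kinetics.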

Influence of temperature on electronic bands and photo-conversion efficiency of perovskite optical absorbers – DFT/MD atomistic modelling

Kamil Tokár

Perovskite (PV) structures used as optical absorbers in photovoltaic cells could reach a theoretical photoelectric efficiency comparable to that of advanced silicon technologies, while the technology for preparing PV-based solar cells should be considerably more advantageous. However, the efficiency of converting light to photocurrent can be affected by various types of defects in the structure of PV films as well as by the temperature dependence of optical absorption.
The aim of the project is to continue previously conducted first-principles research on the temperature-dependent electronic band structure and band gaps of the purely inorganic analogues CsPbI3 and CsPbCl3 and on their optical absorption transitions. We will use DFT and DFT molecular dynamics to simulate thermalized crystal lattices, in conjunction with post-DFT methods (hybrids, TD-DFT, GW, G0W0) to study the electronic band structure and electronic transitions in PVs.

AI-driven exploration of novel properties of 2D materials

Adam Hložný

We aim to study three representative 2D systems.
WSe₂ at magic angles: Twisted transition metal dichalcogenide (TMDC) systems like WSe₂ show strongly correlated behaviour at small twist angles, including the emergence of flat bands, superconductivity, the quantum anomalous Hall effect, and more. We will use machine-learning interatomic potentials (MLIPs) to capture lattice relaxations and density functional theory to compute the electronic structure of smaller cells at high strains, both to estimate the scaling and to calibrate the DFTB+ calculations. We will then employ DFTB+ to treat small angles and very large moiré systems.
NbSe₂ monolayer superconductivity: We perform quasiparticle interference (QPI) simulations to produce interference (scattering) spectra simulating the interaction of an STM tip with the material. We model the system via an effective Bogoliubov–de Gennes Hamiltonian whose parameters depend on the representation of the underlying crystalline symmetry point group. We aim to analyze the spectrum using deep learning methods to "reverse the simulation" and infer the parameters of the Hamiltonian. This serves as an interpretation tool for the STM experiments via the effective Hamiltonian.
Phosphorene with defects (vacancies): Phosphorene is a material where vacancies are very mobile and have complex dynamics. We carry out constrained molecular dynamics simulations to train an MLIP. We follow up with fixed-node quantum Monte Carlo (fnQMC) sampling and iterative retraining of the potential to "upscale" it to the fnQMC level. We can then estimate mechanical properties of the material at the fnQMC level of theory.

DFT Transition State Investigation of Materials for HER in Electrolysers and Redox Flow Batteries

Natália Podrojková

The proposed project focuses on computational investigation of the hydrogen evolution reaction (HER) in water electrolysers (WE) and redox flow batteries (RFB), where HER critically influences efficiency and stability. Building on previous work that optimized catalyst surfaces and adsorption energies (p901-24-3), this continuation project will resolve reaction pathways and transition states of HER on transition metal phosphides (MoP, CoP, FeP, NiP) and metal-doped carbon materials (Bi, In, Sn, Pb). Using density functional theory (DFT) in Quantum ESPRESSO, the study will combine the computational hydrogen electrode (CHE) framework with climbing-image nudged elastic band (CI-NEB) and vibrational analyses to provide potential- and pH-dependent free-energy diagrams, validated transition states, and activation barriers for elementary HER steps. The results will enable the construction of mechanistic profiles and reaction coordinate diagrams across material classes, identifying promising catalysts for WE and HER-resistant electrodes for RFB. These theoretical insights will directly support experimental efforts in the parallel HERAQUAS project, enhance mechanistic understanding beyond adsorption descriptors, and contribute to rational catalyst design in sustainable hydrogen technologies.
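The potential and pH dependence in the computational hydrogen electrode picture can be sketched for a single proton-electron transfer step such as hydrogen adsorption (* + H⁺ + e⁻ → H*). The snippet below applies the standard CHE shift ΔG(U, pH) = ΔG(0) + eU + k_B T ln(10) · pH with U on the SHE scale. The function name `adsorption_free_energy` and the numerical inputs are hypothetical and are not results of the project.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant, eV/K


def adsorption_free_energy(dg_zero_ev: float, u_she_v: float, ph: float,
                           temperature_k: float = 298.15) -> float:
    """Free energy (eV) of * + H+ + e- -> H* within the CHE framework.

    dg_zero_ev: free energy of the step at U = 0 V and pH = 0, i.e. with
                (H+ + e-) referenced to 1/2 H2; assumed to come from DFT
                plus vibrational corrections.
    u_she_v:    electrode potential versus SHE in volts; e*U in eV is
                numerically equal to U in volts.
    """
    ph_shift = K_B_EV * temperature_k * math.log(10.0) * ph
    return dg_zero_ev + u_she_v + ph_shift


if __name__ == "__main__":
    # Hypothetical DFT value of 0.20 eV, evaluated at a few potentials at pH 0.
    for potential in (0.0, -0.1, -0.2):
        print(potential, adsorption_free_energy(0.20, potential, ph=0.0))
```

In this picture a more negative potential lowers the free energy of every reductive step by the same amount, so the diagram shifts rigidly with U, while kinetic barriers from CI-NEB have to be treated separately.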

CFD simulations of tunnel fire and wildland fire using FDS

Lukáš Valášek

The project focuses on numerical simulations of fires in both enclosed and open environments using high-performance computing resources and the Fire Dynamics Simulator (FDS).
In the field of tunnel safety, the research will concentrate on refining the estimation of critical airflow velocity under different geometries, tunnel lengths, and fire intensities, as well as on analysing the impact of emergency ventilation and meteorological factors on airflow and smoke stratification. In the field of wildland fires, the project will explore the use of extended FDS modules for modelling vegetation fuels, topography, and wind profiles under various conditions. The combination of tunnel and outdoor cases will provide a comprehensive framework for testing the efficiency and scalability of parallel computations in an HPC environment. The simulations will be carried out as parametric studies and scaling tests on the Devana HPC system using MPI parallelization.

Layered inorganic structures as advanced materials for applications in green technologies

Eva Scholtzová

The main aim of this project is to study the properties of advanced materials based on the inorganic layered structures (ILS) like aluminosilicates (AS) or graphene (G) for their possible application in remediation processes, focusing on the most representative groups of selected organic pollutants (e.g., drugs, technological agents and perfluoroalkyl acids (PFAS)). From an ecological perspective, understanding the keying mechanism of contaminants on the surface of the ILS is crucial. Structural stability, interaction energies, and the mechanism of keying in the formation of complexes among the selected adsorbed pollutants will be studied for both the economically more affordable clay minerals and the more expensive materials based on graphene, using computational methods (e.g., the DFT-D3 method and ab initio molecular dynamics).

Fair-aware Cooperative Routing

Shima Rahmani

The project aims to develop cooperative traffic routing strategies that balance network efficiency and fairness among users. Using multi-agent reinforcement learning (MARL), vehicles are modeled as autonomous agents that learn to coordinate their routes to minimize overall congestion while preventing systematic disadvantages for any subset of drivers. The methodology combines SUMO traffic simulations, PettingZoo ParallelEnv, and MARLlib frameworks to train and evaluate agents on both small test networks and realistic city-scale networks. The project seeks to provide a scalable, data-driven framework for fair and efficient traffic management.
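One way to quantify fairness among drivers is Jain's fairness index over per-vehicle travel times, shown below as a minimal sketch. The abstract does not state which fairness measure the project uses, so this metric, the function name `jain_fairness_index`, and the sample travel times are assumptions for illustration only.

```python
from typing import Iterable


def jain_fairness_index(values: Iterable[float]) -> float:
    """Jain's index: (sum x)^2 / (n * sum x^2), between 1/n and 1.

    A value of 1 means every vehicle experiences the same travel time;
    lower values indicate that the burden is spread unevenly.
    """
    items = list(values)
    if not items:
        raise ValueError("at least one value is required")
    if any(v < 0 for v in items):
        raise ValueError("values must be non-negative")
    sum_of_squares = sum(v * v for v in items)
    if sum_of_squares == 0:
        return 1.0  # all values are zero, so they are trivially equal
    total = sum(items)
    return (total * total) / (len(items) * sum_of_squares)


if __name__ == "__main__":
    # Hypothetical travel times in minutes for two routing policies.
    print(jain_fairness_index([12.0, 12.5, 11.8, 12.2]))
    print(jain_fairness_index([6.0, 6.5, 7.0, 29.0]))
```

A term of this kind could be reported alongside mean travel time, so that a policy lowering average congestion at the expense of a few drivers becomes visible in the evaluation.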