is Professor for AI-based Material Science at TU Munich.
His chair develops electronic structure and machine learning methods and applies them to pertinent problems in materials science, surface science, physics, chemistry and the nanosciences.
Active learning (AL) has shown promise as a particularly data-efficient machine learning approach. Yet, its performance depends on the application, and it is not clear when AL practitioners can expect computational savings. Here, we carry out a systematic AL performance assessment for three diverse molecular datasets and two common scientific tasks: compiling compact, informative datasets and targeted molecular searches. We implemented AL with Gaussian processes (GP) and used the many-body tensor as the molecular representation. For the first task, we tested different data acquisition strategies, batch sizes, and GP noise settings. AL was insensitive to the acquisition batch size, and we observed the best AL performance for the acquisition strategy that combines uncertainty reduction with clustering to promote diversity. However, for optimal GP noise settings, AL did not outperform the randomized selection of data points. Conversely, for targeted searches, AL outperformed random sampling and achieved data savings of up to 64%. Our analysis provides insight into this task-specific performance difference in terms of target distributions and data collection strategies. We established that the performance of AL depends on the relative distribution of the target molecules in comparison to the total dataset distribution, with the largest computational savings achieved when their overlap is minimal.
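To illustrate the kind of loop described above, the following is a minimal sketch of pool-based active learning with a Gaussian process, assuming a plain uncertainty-sampling acquisition (pick the pool point with the largest predictive variance). It is not the authors' implementation: the one-dimensional toy features stand in for the many-body tensor, `toy_oracle` stands in for the expensive property calculation, and all function names and parameter values (`length_scale`, `noise`, `n_init`, `n_queries`) are hypothetical choices made for this example. The clustering step that the abstract combines with uncertainty reduction is omitted.

```python
import numpy as np


def rbf_kernel(A, B, length_scale=1.0):
    """Squared-exponential kernel between the rows of A and the rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * d2 / length_scale**2)


def gp_posterior(X_train, y_train, X_query, noise=1e-3, length_scale=1.0):
    """Posterior mean and variance of a zero-mean GP with unit prior variance."""
    K = rbf_kernel(X_train, X_train, length_scale) + noise * np.eye(len(X_train))
    K_s = rbf_kernel(X_train, X_query, length_scale)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    v = np.linalg.solve(L, K_s)
    mean = K_s.T @ alpha
    var = 1.0 - (v**2).sum(axis=0)
    return mean, np.maximum(var, 0.0)


def active_learning(X_pool, oracle, n_init=3, n_queries=10,
                    noise=1e-3, length_scale=1.0, seed=0):
    """Grow a labeled set by repeatedly querying the most uncertain pool point."""
    rng = np.random.default_rng(seed)
    labeled = [int(i) for i in rng.choice(len(X_pool), size=n_init, replace=False)]
    y = {i: oracle(X_pool[i]) for i in labeled}
    for _ in range(n_queries):
        X_tr = X_pool[labeled]
        y_tr = np.array([y[i] for i in labeled])
        _, var = gp_posterior(X_tr, y_tr, X_pool, noise=noise,
                              length_scale=length_scale)
        var[labeled] = -np.inf  # never re-select an already labeled point
        nxt = int(np.argmax(var))
        labeled.append(nxt)
        y[nxt] = oracle(X_pool[nxt])
    return labeled


def toy_oracle(x):
    """Hypothetical stand-in for an expensive property calculation."""
    return float(np.sin(3.0 * x[0]))


# Toy pool: 50 one-dimensional "molecules" (assumption for illustration only).
X_pool = np.linspace(0.0, 5.0, 50).reshape(-1, 1)
selected = active_learning(X_pool, toy_oracle)
print("queried pool indices:", selected)
```

In this sketch the `noise` argument plays the role of the GP noise setting mentioned in the abstract; a batch variant would take the top few variances per iteration instead of a single `argmax`.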
Investigating magnetic energy landscapes and searching for the ground states of magnetic materials with ab initio methods such as density functional theory (DFT) are challenging tasks. Complex interactions, such as superexchange and spin-orbit coupling, make these calculations computationally expensive and often lead to non-trivial energy landscapes. Consequently, a comprehensive and systematic investigation of large magnetic configuration spaces is often impractical. We approach this problem with Bayesian optimization, an active machine learning scheme that has proven efficient at modeling unknown functions and finding global minima. Using this approach, we can obtain the magnetic contribution to the energy as a function of one or more spin canting angles with a relatively small number of DFT calculations. To assess the capabilities and efficiency of the approach, we investigate the noncollinear magnetic energy landscapes of selected materials containing 3d, 5d and 5f magnetic ions: Ba3MnNb2O9, LaMn2Si2, β-MnO2, Sr2IrO4, UO2 and Ba2NaOsO6. Comparing our results with previous ab initio studies that followed more conventional approaches, we observe significant improvements in efficiency.
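The idea of locating an energy minimum over a canting angle with few evaluations can be sketched as follows. This is only an illustrative toy, not the workflow used in the study: `toy_energy` is a made-up analytic function standing in for a DFT calculation, the surrogate is a simple Gaussian process with a squared-exponential kernel, and the acquisition rule (a lower confidence bound with weight `kappa`) is one common choice that is assumed here. All names and parameter values are hypothetical.

```python
import numpy as np


def toy_energy(theta):
    """Made-up energy vs. canting angle (radians); stands in for a DFT run."""
    return -np.cos(theta - 1.0) + 0.3 * np.cos(2.0 * theta)


def gp_predict(x_train, y_train, x_query, length_scale=0.6, noise=1e-4):
    """GP posterior mean and standard deviation on a 1-D input."""
    def k(a, b):
        return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / length_scale**2)

    K = k(x_train, x_train) + noise * np.eye(len(x_train))
    K_s = k(x_train, x_query)
    y_mean = y_train.mean()  # constant mean taken from the data
    mean = y_mean + K_s.T @ np.linalg.solve(K, y_train - y_mean)
    var = 1.0 - np.einsum("ij,ij->j", K_s, np.linalg.solve(K, K_s))
    return mean, np.sqrt(np.maximum(var, 0.0))


def bayes_opt(energy, grid, n_init=3, n_iter=12, kappa=2.0, seed=0):
    """Minimize `energy` on `grid` by repeatedly minimizing mean - kappa * std."""
    rng = np.random.default_rng(seed)
    x = rng.choice(grid, size=n_init, replace=False)
    y = energy(x)
    for _ in range(n_iter):
        mean, std = gp_predict(x, y, grid)
        x_next = grid[np.argmin(mean - kappa * std)]
        x = np.append(x, x_next)
        y = np.append(y, energy(x_next))
    best = int(np.argmin(y))
    return float(x[best]), float(y[best]), len(x)


# Candidate canting angles between 0 and pi (assumed range for illustration).
grid = np.linspace(0.0, np.pi, 181)
theta_best, e_best, n_evals = bayes_opt(toy_energy, grid)
print(f"lowest energy found: {e_best:.4f} at theta = {theta_best:.3f} rad "
      f"after {n_evals} evaluations")
```

Larger `kappa` values favor exploring poorly sampled angles, while smaller values concentrate evaluations near the current best estimate; for several canting angles the same loop would run over a multi-dimensional grid or a continuous optimizer.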
Transforming CO2 into methanol represents a crucial step towards closing the carbon cycle, with thermoreduction technology nearing industrial application. However, obtaining high methanol yields and ensuring the stability of heterocatalysts remain significant challenges. Herein, we present a sophisticated computational framework to accelerate the discovery of novel thermal heterogeneous catalysts using machine-learned force fields. We propose a new catalytic descriptor, termed the adsorption energy distribution, that aggregates the binding energies for different catalyst facets, binding sites, and adsorbates. The descriptor is versatile and can easily be adjusted to a specific reaction through careful choice of the key-step reactants and reaction intermediates. By applying unsupervised machine learning and statistical analysis to a dataset comprising nearly 160 metallic alloys, we offer a powerful tool for catalyst discovery. Finally, we propose new promising candidate materials such as ZnRh and ZnPt3, which, to our knowledge, have not yet been tested, and discuss their possible advantage in terms of stability.
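One way to picture a descriptor that aggregates binding energies over facets, sites, and adsorbates is as a set of normalized histograms, one per adsorbate. The sketch below follows that reading and should be taken as an assumption, not as the definition used in the study. The energy values are invented placeholders (not computed or measured data), the labels `alloy_A`, `alloy_B`, `ads_1`, and `ads_2` are hypothetical, and the comparison metric (a histogram approximation of the 1-D Wasserstein distance) is a choice made for this example only.

```python
import numpy as np


def aed(energies_by_adsorbate, bins):
    """Stack one normalized binding-energy histogram per adsorbate.

    Each list pools the energies over all facets and binding sites.
    """
    rows = []
    for adsorbate in sorted(energies_by_adsorbate):
        counts, _ = np.histogram(energies_by_adsorbate[adsorbate], bins=bins)
        rows.append(counts / counts.sum())
    return np.vstack(rows)


def aed_distance(desc_a, desc_b, bin_width):
    """Sum over adsorbates of an approximate 1-D Wasserstein distance.

    Uses the area between the two cumulative histograms on shared bins.
    """
    cdf_gap = np.cumsum(desc_a - desc_b, axis=1)
    return float(np.abs(cdf_gap).sum() * bin_width)


# Placeholder binding energies in eV: invented numbers for illustration only.
alloy_A = {"ads_1": [-0.60, -0.50, -0.40, -0.45], "ads_2": [-1.20, -1.00, -0.90]}
alloy_B = {"ads_1": [-0.20, -0.10, -0.30, -0.25], "ads_2": [-1.10, -1.00, -0.80]}

bins = np.linspace(-1.5, 0.0, 16)  # shared energy grid, 0.1 eV wide bins
bin_width = float(bins[1] - bins[0])

desc_A = aed(alloy_A, bins)
desc_B = aed(alloy_B, bins)
print("descriptor shape:", desc_A.shape)
print("distance A-B:", aed_distance(desc_A, desc_B, bin_width))
```

A matrix of such pairwise distances could then feed an unsupervised method such as hierarchical clustering to group alloys with similar adsorption behavior.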