
Research Group Massimo Fornasier


Link to website at TUM

Massimo Fornasier

Prof. Dr.

Principal Investigator

Applied Numerical Analysis

Massimo Fornasier

holds the Chair of Applied Numerical Analysis at TU Munich.

His research embraces a broad spectrum of problems in mathematical modeling, analysis, and numerical analysis. He is particularly interested in the concept of compression as it appears in different forms in data analysis, image and signal processing, and in the adaptive numerical solution of partial differential equations and high-dimensional optimization problems.

Team members @MCML

PostDocs

Link to website

Hanna Veselovska

Dr.

Applied Numerical Analysis

PhD Students

Link to website

Jona Klemenc

Applied Numerical Analysis

Link to website

Alessandro Scagliotti

Applied Numerical Analysis

Link to website

Lukang Sun

Applied Numerical Analysis

Publications @MCML

2025


[41]
A. Scagliotti and S. Farinelli.
Normalizing flows as approximations of optimal transport maps via linear-control neural ODEs.
Nonlinear Analysis 257.113811 (Aug. 2025). DOI
Abstract

In this paper, we consider the problem of recovering the W2-optimal transport map T between absolutely continuous measures as the flow of a linear-control neural ODE, where the control depends only on the time variable and takes values in a finite-dimensional space. We first show that, under suitable assumptions on the measures and on the controlled vector fields governing the neural ODE, the optimal transport map is contained in the closure of the flows generated by the system. Then, we tackle the problem under the assumption that only discrete approximations of the original measures are available: we formulate approximated optimal control problems, and we show that their solutions give flows that approximate the original optimal transport map. In the framework of generative models, the approximating flow constructed here can be seen as a ‘Normalizing Flow’, which usually refers to the task of providing invertible transport maps between probability measures by means of deep neural networks. We propose an iterative numerical scheme based on the Pontryagin Maximum Principle for the resolution of the optimal control problem, resulting in a method for the practical computation of the approximated optimal transport map, and we test it on a two-dimensional example.
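A linear-control neural ODE of the kind described above can be sketched with a plain forward-Euler discretization. The vector fields and controls below are hypothetical illustrations, not the ones constructed in the paper:

```python
import numpy as np

def linear_control_flow(x0, fields, controls, dt=0.01):
    """Forward-Euler flow of a linear-control ODE
        x'(t) = sum_k u_k(t) * F_k(x(t)),
    where the controls u_k depend only on time and take values
    in a finite-dimensional space (one tuple per time step)."""
    x = np.array(x0, dtype=float)
    for u in controls:  # u[k] = value of u_k on this time step
        dx = sum(uk * Fk(x) for uk, Fk in zip(u, fields))
        x = x + dt * dx
    return x

# Hypothetical example: two simple controlled vector fields in R^2.
fields = [lambda x: np.array([1.0, 0.0]),   # constant shift in x1
          lambda x: np.array([0.0, x[0]])]  # shear driven by x1
controls = [(1.0, 0.5)] * 100               # piecewise-constant controls
print(linear_control_flow([0.0, 0.0], fields, controls, dt=0.01))  # ≈ (1.0, 0.2475)
```

In the paper's setting, the controls would come from solving the approximated optimal control problem (e.g. via the PMP-based iterative scheme); here they are fixed by hand only to show the flow map.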

MCML Authors
Link to website

Alessandro Scagliotti

Applied Numerical Analysis


[40]
F. Krahmer and A. Veselovska.
The mathematics of dots and pixels: On the theoretical foundations of image halftoning.
GAMM Mitteilungen 48.1 (Mar. 2025). DOI
Abstract

The evolution of image halftoning, from its analog roots to contemporary digital methodologies, encapsulates a fascinating journey marked by technological advancements and creative innovations. Yet the theoretical understanding of halftoning is much more recent. In this article, we explore various approaches towards shedding light on the design of halftoning approaches and why they work. We discuss both halftoning in a continuous domain and on a pixel grid. We start by reviewing the mathematical foundation of the so-called electrostatic halftoning method, which departed from the heuristic of considering the black dots of the halftoned image as charged particles attracted by the grey values of the image in combination with mutual repulsion. Such an attraction-repulsion model can be mathematically represented via an energy functional in a reproducing kernel Hilbert space allowing for a rigorous analysis of the resulting optimization problem as well as a convergence analysis in a suitable topology. A second class of methods that we discuss in detail is the class of error diffusion schemes, arguably among the most popular halftoning techniques due to their ability to work directly on a pixel grid and their ease of application. The main idea of these schemes is to choose the locations of the black pixels via a recurrence relation designed to agree with the image in terms of the local averages. We discuss some recent mathematical understanding of these methods that is based on a connection to Σ∆ quantizers, a popular class of algorithms for analog-to-digital conversion.
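The error diffusion schemes discussed above are easy to sketch; Floyd-Steinberg is the classic instance, with a fixed kernel that pushes the quantization error onto not-yet-visited neighbours so that local averages of the halftone track the grey values:

```python
import numpy as np

def floyd_steinberg(img):
    """Floyd-Steinberg error diffusion: threshold each pixel to black/white
    and distribute the quantization error to forward neighbours with the
    classic 7/16, 3/16, 5/16, 1/16 weights."""
    f = np.array(img, dtype=float)
    h, w = f.shape
    out = np.zeros_like(f)
    for y in range(h):
        for x in range(w):
            old = f[y, x]
            new = 1.0 if old >= 0.5 else 0.0
            out[y, x] = new
            err = old - new
            if x + 1 < w:
                f[y, x + 1] += err * 7 / 16
            if y + 1 < h and x > 0:
                f[y + 1, x - 1] += err * 3 / 16
            if y + 1 < h:
                f[y + 1, x] += err * 5 / 16
            if y + 1 < h and x + 1 < w:
                f[y + 1, x + 1] += err * 1 / 16
    return out

grey = np.full((8, 8), 0.25)  # flat 25% grey test patch
dots = floyd_steinberg(grey)
print(dots.mean())            # local average of dots tracks the grey value
```

The recurrence relation here is exactly the "agree with local averages" mechanism the abstract describes; the Σ∆-quantizer connection studied in the paper concerns this same error-propagation structure.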

MCML Authors
Link to Profile Felix Krahmer

Felix Krahmer

Prof. Dr.

Optimization & Data Analysis

Link to website

Hanna Veselovska

Dr.

Applied Numerical Analysis


[39]
E. Pozzoli and A. Scagliotti.
Approximation of diffeomorphisms for quantum state transfers.
Preprint (Mar. 2025). arXiv
Abstract

In this paper, we seek to combine two emerging standpoints in control theory. On the one hand, recent advances in infinite-dimensional geometric control have unlocked a method for controlling (with arbitrary precision and in arbitrarily small times) state transfers for bilinear Schrödinger PDEs posed on a Riemannian manifold M. In particular, these arguments rely on controllability results in the group of diffeomorphisms of M. On the other hand, using tools of Γ-convergence, it has been proved that the retrieval of a diffeomorphism of M can be phrased as an ensemble optimal control problem. More precisely, this is done by employing a control-affine system for simultaneously steering a finite swarm of points towards their respective targets. Here we blend these two theoretical approaches and numerically find control laws driving state transitions (such as eigenstate transfers) in a bilinear Schrödinger PDE posed on the torus. Such systems have experimental relevance and are currently used to model rotational dynamics of molecules and cold atoms trapped in periodic optical lattices.

MCML Authors
Link to website

Alessandro Scagliotti

Applied Numerical Analysis


[38]
A. Scagliotti, F. Scagliotti, L. Locati and F. Sottotetti.
Ensemble optimal control for managing drug resistance in cancer therapies.
Preprint (Mar. 2025). arXiv
Abstract

In this paper, we explore the application of ensemble optimal control to derive enhanced strategies for pharmacological cancer treatment. In particular, we focus on moving beyond the classical clinical approach of giving the patient the maximal tolerated drug dose (MTD), which does not properly exploit the fight among sensitive and resistant cells for the available resources. Here, we employ a Lotka-Volterra model to describe the two competing subpopulations, and we enclose this system within the ensemble control framework. In the first part, we establish general results suitable for application to various solid cancers. Then, we carry out numerical simulations in the setting of prostate cancer treated with androgen deprivation therapy, yielding a computed policy that is reminiscent of the medical ‘active surveillance’ paradigm. Finally, inspired by the numerical evidence, we propose a variant of the celebrated adaptive therapy (AT), which we call ‘Off-On’ AT.

MCML Authors
Link to website

Alessandro Scagliotti

Applied Numerical Analysis


[37]
M. Fornasier and L. Sun.
A PDE Framework of Consensus-Based Optimization for Objectives with Multiple Global Minimizers.
Communications in Partial Differential Equations 50.4 (Feb. 2025). DOI
Abstract

Introduced in 2017, Consensus-Based Optimization (CBO) has rapidly emerged as a significant breakthrough in global optimization. This straightforward yet powerful multi-particle, zero-order optimization method draws inspiration from Simulated Annealing and Particle Swarm Optimization. Using a quantitative mean-field approximation, CBO dynamics can be described by a nonlinear Fokker-Planck equation with degenerate diffusion, which does not follow a gradient flow structure. In this paper, we demonstrate that solutions to the CBO equation remain positive and maintain full support. Building on this foundation, we establish the unconditional global convergence of CBO methods to global minimizers. Our results are derived through an analysis of solution regularity and the proof of existence for smooth, classical solutions to a broader class of drift-diffusion equations, despite the challenges posed by degenerate diffusion.
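The particle dynamics underlying the CBO equation can be sketched in a few lines: particles drift towards a Gibbs-weighted consensus point and diffuse with a noise strength proportional to their distance from it. This is the isotropic-noise variant with hypothetical parameter values, not the paper's analytical setting:

```python
import numpy as np

def cbo_step(X, f, lam=1.0, sigma=0.7, alpha=30.0, dt=0.05, rng=None):
    """One Euler-Maruyama step of consensus-based optimization (CBO),
    isotropic-noise variant. X has shape (n_particles, dim)."""
    if rng is None:
        rng = np.random.default_rng(0)
    fX = f(X)
    w = np.exp(-alpha * (fX - fX.min()))        # stabilized Gibbs weights
    v = (w[:, None] * X).sum(axis=0) / w.sum()  # consensus point
    drift = -lam * (X - v) * dt
    noise = (sigma * np.sqrt(dt)
             * np.linalg.norm(X - v, axis=1, keepdims=True)
             * rng.standard_normal(X.shape))
    return X + drift + noise

# Hypothetical usage: minimize a shifted quadratic in R^2, minimizer (1, 1).
f = lambda X: ((X - 1.0) ** 2).sum(axis=1)
rng = np.random.default_rng(1)
X = rng.standard_normal((50, 2)) * 3.0
for _ in range(200):
    X = cbo_step(X, f, rng=rng)
print(X.mean(axis=0))  # consensus forms near the minimizer
```

Because the noise scales with the distance to the consensus point, the diffusion degenerates as the particles collapse, which is exactly the degenerate-diffusion structure of the Fokker-Planck equation analyzed in the paper.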

MCML Authors
Link to Profile Massimo Fornasier

Massimo Fornasier

Prof. Dr.

Applied Numerical Analysis

Link to website

Lukang Sun

Applied Numerical Analysis


[36]
M. Fornasier and L. Sun.
Regularity and positivity of solutions of the Consensus-Based Optimization equation: unconditional global convergence.
Preprint (Feb. 2025). arXiv
Abstract

Introduced in 2017, Consensus-Based Optimization (CBO) has rapidly emerged as a significant breakthrough in global optimization. This straightforward yet powerful multi-particle, zero-order optimization method draws inspiration from Simulated Annealing and Particle Swarm Optimization. Using a quantitative mean-field approximation, CBO dynamics can be described by a nonlinear Fokker-Planck equation with degenerate diffusion, which does not follow a gradient flow structure. In this paper, we demonstrate that solutions to the CBO equation remain positive and maintain full support. Building on this foundation, we establish the unconditional global convergence of CBO methods to global minimizers. Our results are derived through an analysis of solution regularity and the proof of existence for smooth, classical solutions to a broader class of drift-diffusion equations, despite the challenges posed by degenerate diffusion.

MCML Authors
Link to Profile Massimo Fornasier

Massimo Fornasier

Prof. Dr.

Applied Numerical Analysis

Link to website

Lukang Sun

Applied Numerical Analysis


[35]
M. Fornasier, J. Klemenc and A. Scagliotti.
Trade-off Invariance Principle for minimizers of regularized functionals.
Math4AiMl 2025 - 3rd Workshop of UMI Group Mathematics for Artificial Intelligence and Machine Learning. Bari, Italy, Jan 29-31, 2025. To be published. Preprint available. arXiv
Abstract

In this paper, we consider functionals of the form Hα(u) = F(u) + αG(u) with α ∈ [0,+∞), where u varies in a set U ≠ ∅ (without further structure). We first show that, excluding at most countably many values of α, we have that inf_{H⋆α} G = sup_{H⋆α} G, where H⋆α := argmin_U Hα, which is assumed to be non-empty. We further prove a stronger result that concerns the invariance of the limiting value of the functional G along minimizing sequences for Hα. This fact in turn implies an unexpected consequence for functionals regularized with uniformly convex norms: excluding again at most countably many values of α, it turns out that for a minimizing sequence, convergence to a minimizer in the weak or strong sense is equivalent.

MCML Authors
Link to Profile Massimo Fornasier

Massimo Fornasier

Prof. Dr.

Applied Numerical Analysis

Link to website

Jona Klemenc

Applied Numerical Analysis

Link to website

Alessandro Scagliotti

Applied Numerical Analysis


[34]
A. Scagliotti.
Minimax Problems for Ensembles of Control-Affine Systems.
SIAM Journal on Control and Optimization 63.1 (Jan. 2025). DOI
Abstract

In this paper, we consider ensembles of control-affine systems in ℝd, and we study simultaneous optimal control problems related to worst-case minimization. After proving that such problems admit solutions, denoting with (ΘN)N a sequence of compact sets that parametrize the ensembles of systems, we first show that the corresponding minimax optimal control problems are Γ-convergent whenever (ΘN)N has a limit with respect to the Hausdorff distance. Besides its independent interest, the previous result plays a crucial role for establishing the Pontryagin Maximum Principle (PMP) when the ensemble is parametrized by a set Θ consisting of infinitely many points. Namely, we first approximate Θ by finite and increasing-in-size sets (ΘN)N for which the PMP is known, and then we derive the PMP for the Γ-limiting problem. The same strategy can be pursued in applications, where we can reduce infinite ensembles to finite ones to compute the minimizers numerically. As a numerical example, we consider the Schrödinger equation for a qubit with uncertain resonance frequency.

MCML Authors
Link to website

Alessandro Scagliotti

Applied Numerical Analysis


2024


[33]
A. Bonfanti, G. Bruno and C. Cipriani.
The Challenges of the Nonlinear Regime for Physics-Informed Neural Networks.
NeurIPS 2024 - 38th Conference on Neural Information Processing Systems. Vancouver, Canada, Dec 10-15, 2024. URL
Abstract

The Neural Tangent Kernel (NTK) viewpoint is widely employed to analyze the training dynamics of overparameterized Physics-Informed Neural Networks (PINNs). However, unlike the case of linear Partial Differential Equations (PDEs), we show how the NTK perspective falls short in the nonlinear scenario. Specifically, we establish that the NTK yields a random matrix at initialization that is not constant during training, contrary to conventional belief. Another significant difference from the linear regime is that, even in the idealistic infinite-width limit, the Hessian does not vanish and hence it cannot be disregarded during training. This motivates the adoption of second-order optimization methods. We explore the convergence guarantees of such methods in both linear and nonlinear cases, addressing challenges such as spectral bias and slow convergence. Every theoretical result is supported by numerical examples with both linear and nonlinear PDEs, and we highlight the benefits of second-order methods in benchmark test cases.

MCML Authors

[32]
C. Geldhauser and C. Kuehn.
Travelling waves for discrete stochastic bistable equations.
Partial Differential Equations and Applications 5.35 (Nov. 2024). DOI
Abstract

Many physical, chemical and biological systems have an inherent discrete spatial structure that strongly influences their dynamical behaviour. Similar remarks apply to internal or external noise. In this paper we study the combined effect of spatial discretization and stochastic perturbations on travelling waves in the Nagumo equation, which is a prototypical model for bistable reaction-diffusion partial differential equations (PDEs). We prove that under suitable parameter conditions, various discrete-stochastic variants of the Nagumo equation have solutions, which stay close on long time scales to the classical monotone Nagumo front with high probability if the noise covariance and spatial discretization are sufficiently small.

MCML Authors
Carina Geldhauser

Carina Geldhauser

Dr.

* Former Member


[31]
K. Jin, J. Latz, C. Liu and A. Scagliotti.
Losing momentum in continuous-time stochastic optimisation.
Preprint (Nov. 2024). arXiv
Abstract

The training of modern machine learning models often consists in solving high-dimensional non-convex optimisation problems that are subject to large-scale data. In this context, momentum-based stochastic optimisation algorithms have become particularly widespread. The stochasticity arises from data subsampling which reduces computational cost. Both momentum and stochasticity help the algorithm to converge globally. In this work, we propose and analyse a continuous-time model for stochastic gradient descent with momentum. This model is a piecewise-deterministic Markov process that represents the optimiser by an underdamped dynamical system and the data subsampling through a stochastic switching. We investigate long-time limits, the subsampling-to-no-subsampling limit, and the momentum-to-no-momentum limit. We are particularly interested in the case of reducing the momentum over time. Under convexity assumptions, we show convergence of our dynamical system to the global minimiser when reducing momentum over time and letting the subsampling rate go to infinity. We then propose a stable, symplectic discretisation scheme to construct an algorithm from our continuous-time dynamical system. In experiments, we study our scheme in convex and non-convex test problems. Additionally, we train a convolutional neural network in an image classification problem. Our algorithm attains competitive results compared to stochastic gradient descent with momentum.
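A minimal discrete-time sketch of the dynamics studied above: heavy-ball SGD with subsampled gradients and a momentum parameter reduced over time. The decay schedule, step size, and toy objective are illustrative assumptions, not the paper's symplectic discretisation scheme:

```python
import numpy as np

def momentum_sgd(grad_batch, x0, n_batches, steps, lr=0.02, seed=0):
    """Heavy-ball SGD where the momentum parameter decays to zero,
    mimicking underdamped dynamics with vanishing momentum.
    grad_batch(x, i) returns the gradient on (randomly chosen) batch i."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    p = np.zeros_like(x)                 # momentum variable
    for k in range(steps):
        beta = 0.9 / (1 + 0.01 * k)     # momentum reduced over time
        i = rng.integers(n_batches)     # data subsampling (stochastic switch)
        p = beta * p - lr * grad_batch(x, i)
        x = x + p
    return x

# Hypothetical example: least squares split into two "batches" with
# gradients (x - 2) and (x - 4); the full objective is minimized at x = 3.
targets = np.array([2.0, 4.0])
grad = lambda x, i: x - targets[i]
print(momentum_sgd(grad, x0=[0.0], n_batches=2, steps=2000))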

MCML Authors
Link to website

Alessandro Scagliotti

Applied Numerical Analysis


[30]
M. Rauscher, A. Scagliotti and F. Pagginelli Patricio.
Shortest-path recovery from signature with an optimal control approach.
Mathematics of Control, Signals, and Systems (Oct. 2024). DOI
Abstract

In this paper, we consider the signature-to-path reconstruction problem from the control-theoretic perspective. Namely, we design an optimal control problem whose solution leads to the minimal-length path that generates a given signature. In order to do that, we minimize a cost functional consisting of two competing terms, i.e., a weighted final-time cost combined with the squared norm of the controls. Moreover, we can show that, by taking the limit to infinity of the parameter that tunes the final-time cost, the problem Γ-converges to the problem of finding a sub-Riemannian geodesic connecting two signatures. Finally, we provide an alternative reformulation of the latter problem, which is particularly suitable for numerical implementation.

MCML Authors
Link to website

Alessandro Scagliotti

Applied Numerical Analysis


[29]
O. Åström, C. Geldhauser, M. Grillitsch, O. Hall and A. Sopasakis.
Enhancing Carbon Emission Reduction Strategies using OCO and ICOS data.
Preprint (Oct. 2024). arXiv
Abstract

We propose a methodology to enhance local CO2 monitoring by integrating satellite data from the Orbiting Carbon Observatories (OCO-2 and OCO-3) with ground level observations from the Integrated Carbon Observation System (ICOS) and weather data from the ECMWF Reanalysis v5 (ERA5). Unlike traditional methods that downsample national data, our approach uses multimodal data fusion for high-resolution CO2 estimations. We employ weighted K-nearest neighbor (KNN) interpolation with machine learning models to predict ground level CO2 from satellite measurements, achieving a Root Mean Squared Error of 3.92 ppm. Our results show the effectiveness of integrating diverse data sources in capturing local emission patterns, highlighting the value of high-resolution atmospheric transport models. The developed model improves the granularity of CO2 monitoring, providing precise insights for targeted carbon mitigation strategies, and represents a novel application of neural networks and KNN in environmental monitoring, adaptable to various regions and temporal scales.
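The weighted KNN interpolation step can be sketched as follows. The station coordinates and CO2 values are made-up toy data, and the actual pipeline (OCO, ICOS, and ERA5 fusion with machine learning models) is of course much richer:

```python
import numpy as np

def weighted_knn(train_X, train_y, query, k=3):
    """Inverse-distance weighted k-nearest-neighbour regression: the kind
    of interpolation used to estimate ground-level values from sparse
    point measurements."""
    d = np.linalg.norm(train_X - query, axis=1)
    idx = np.argsort(d)[:k]              # the k nearest training points
    w = 1.0 / (d[idx] + 1e-12)           # inverse-distance weights
    return (w * train_y[idx]).sum() / w.sum()

# Hypothetical toy data: CO2 readings (ppm) at four station coordinates.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y = np.array([410.0, 412.0, 414.0, 416.0])
print(weighted_knn(X, y, np.array([0.1, 0.1]), k=3))  # dominated by nearest station
```

Nearby stations dominate the estimate, so a query close to one station essentially reproduces its reading, smoothed by the other neighbours.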

MCML Authors
Carina Geldhauser

Carina Geldhauser

Dr.

* Former Member


[28]
M. Fornasier, P. Heid and G. Sodini.
Approximation Theory, Computing, and Deep Learning on the Wasserstein Space.
Preprint (Oct. 2024). arXiv
Abstract

The challenge of approximating functions in infinite-dimensional spaces from finite samples is widely regarded as formidable. We delve into the challenging problem of the numerical approximation of Sobolev-smooth functions defined on probability spaces. Our particular focus centers on the Wasserstein distance function, which serves as a relevant example. In contrast to the existing body of literature focused on approximating efficiently pointwise evaluations, we chart a new course to define functional approximants by adopting three machine learning-based approaches: 1. Solving a finite number of optimal transport problems and computing the corresponding Wasserstein potentials. 2. Employing empirical risk minimization with Tikhonov regularization in Wasserstein Sobolev spaces. 3. Addressing the problem through the saddle point formulation that characterizes the weak form of the Tikhonov functional’s Euler-Lagrange equation. We furnish explicit and quantitative bounds on generalization errors for each of these solutions. We leverage the theory of metric Sobolev spaces and we combine it with techniques of optimal transport, variational calculus, and large deviation bounds. In our numerical implementation, we harness appropriately designed neural networks to serve as basis functions. These networks undergo training using diverse methodologies. This approach allows us to obtain approximating functions that can be rapidly evaluated after training. At equal accuracy, our constructive solutions significantly enhance the evaluation speed, surpassing state-of-the-art methods by several orders of magnitude. This allows evaluations over large datasets, including training, several times faster than with traditional optimal transport algorithms. Our analytically designed deep learning architecture slightly outperforms the test error of state-of-the-art CNN architectures on datasets of images.

MCML Authors
Link to Profile Massimo Fornasier

Massimo Fornasier

Prof. Dr.

Applied Numerical Analysis


[27]
S. Karnik, A. Veselovska, M. Iwen and F. Krahmer.
Implicit Regularization for Tubal Tensor Factorizations via Gradient Descent.
Preprint (Oct. 2024). arXiv
Abstract

We provide a rigorous analysis of implicit regularization in an overparametrized tensor factorization problem beyond the lazy training regime. For matrix factorization problems, this phenomenon has been studied in a number of works. A particular challenge has been to design universal initialization strategies which provably lead to implicit regularization in gradient-descent methods. At the same time, it has been argued by Cohen et al. (2016) that more general classes of neural networks can be captured by considering tensor factorizations. However, in the tensor case, implicit regularization has only been rigorously established for gradient flow or in the lazy training regime. In this paper, we prove the first tensor result of its kind for gradient descent rather than gradient flow. We focus on the tubal tensor product and the associated notion of low tubal rank, encouraged by the relevance of this model for image data. We establish that gradient descent in an overparametrized tensor factorization model with a small random initialization exhibits an implicit bias towards solutions of low tubal rank. Our theoretical findings are illustrated in an extensive set of numerical simulations showcasing the dynamics predicted by our theory as well as the crucial role of using a small random initialization.

MCML Authors
Link to website

Hanna Veselovska

Dr.

Applied Numerical Analysis

Link to Profile Felix Krahmer

Felix Krahmer

Prof. Dr.

Optimization & Data Analysis


[26]
K. Riedl.
Mathematical Foundations of Interacting Multi-Particle Systems for Optimization.
Dissertation 2024. URL
Abstract

This dissertation lays mathematical foundations for the numerical analysis of interacting multi-particle systems in the setting of optimization. While such systems are of paramount importance in and beyond applied mathematics, their rigorous analysis largely remained elusive. Given the necessity for capable, reliable, and robust algorithms with informative and solid convergence guarantees, we provide an analytical framework that builds upon insights obtained by taking a mean-field perspective.

MCML Authors
Konstantin Riedl

Konstantin Riedl

Dr.

* Former Member


[25]
C. Geldhauser and K. Malyshev.
Semi-automatic annotation of Greek majuscule manuscripts: Steps towards integrated transcription and annotation.
FedCSIS 2024 - 19th Conference on Computer Science and Intelligence Systems. Belgrade, Serbia, Sep 08-11, 2024. DOI
Abstract

We present a prototype for the integration of HTR transcription and semi-automated markup of textual features in the eScriptorium GUI.

MCML Authors
Carina Geldhauser

Carina Geldhauser

Dr.

* Former Member


[24]
C. Geldhauser, M. Herrmann and D. Janßen.
Traveling Phase Interfaces in Viscous Forward–Backward Diffusion Equations.
Journal of Dynamics and Differential Equations (Aug. 2024). DOI
Abstract

The viscous regularization of an ill-posed diffusion equation with bistable nonlinearity predicts a hysteretic behavior of dynamical phase transitions but a complete mathematical understanding of the intricate multiscale evolution is still missing. We shed light on the fine structure of propagating phase boundaries by carefully examining traveling wave solutions in a special case. Assuming a trilinear constitutive relation we characterize all waves that possess a monotone profile and connect the two phases by a single interface of positive width. We further study the two sharp-interface regimes related to either vanishing viscosity or the bilinear limit.

MCML Authors
Carina Geldhauser

Carina Geldhauser

Dr.

* Former Member


[23]
M. Fornasier, T. Klock and K. Riedl.
Consensus-Based Optimization Methods Converge Globally.
SIAM Journal on Optimization 34.3 (Jul. 2024). DOI
Abstract

In this paper we study consensus-based optimization (CBO), which is a multiagent metaheuristic derivative-free optimization method that can globally minimize nonconvex nonsmooth functions and is amenable to theoretical analysis. Based on an experimentally supported intuition that, on average, CBO performs a gradient descent of the squared Euclidean distance to the global minimizer, we devise a novel technique for proving the convergence to the global minimizer in mean-field law for a rich class of objective functions. The result unveils internal mechanisms of CBO that are responsible for the success of the method. In particular, we prove that CBO performs a convexification of a large class of optimization problems as the number of optimizing agents goes to infinity. Furthermore, we improve prior analyses by requiring mild assumptions about the initialization of the method and by covering objectives that are merely locally Lipschitz continuous. As a core component of this analysis, we establish a quantitative nonasymptotic Laplace principle, which may be of independent interest. From the result of CBO convergence in mean-field law, it becomes apparent that the hardness of any global optimization problem is necessarily encoded in the rate of the mean-field approximation, for which we provide a novel probabilistic quantitative estimate. The combination of these results allows us to obtain probabilistic global convergence guarantees of the numerical CBO method.

MCML Authors
Link to Profile Massimo Fornasier

Massimo Fornasier

Prof. Dr.

Applied Numerical Analysis

Konstantin Riedl

Konstantin Riedl

Dr.

* Former Member


[22]
J. Beddrich, E. Chenchene, M. Fornasier, H. Huang and B. Wohlmuth.
Constrained Consensus-Based Optimization and Numerical Heuristics for the Few Particle Regime.
Preprint (Jul. 2024). arXiv
Abstract

Consensus-based optimization (CBO) is a versatile multi-particle optimization method for performing nonconvex and nonsmooth global optimizations in high dimensions. Proofs of global convergence in probability have been achieved for a broad class of objective functions in unconstrained optimizations. In this work we adapt the algorithm for solving constrained optimizations on compact and unbounded domains with boundary by leveraging emerging reflective boundary conditions. In particular, we close a relevant gap in the literature by providing a global convergence proof for the many-particle regime together with convergence rates. On the one hand, for the sake of minimizing running cost, it is desirable to keep the number of particles small. On the other hand, reducing the number of particles implies a diminished capability of exploration of the algorithm. Hence numerical heuristics are needed to ensure convergence of CBO in the few-particle regime. In this work, we also significantly improve the convergence and complexity of CBO by utilizing an adaptive region control mechanism and by choosing geometry-specific random noise. In particular, by combining a hierarchical noise structure with a multigrid finite element method, we are able to compute global minimizers for a constrained p-Allen-Cahn problem with obstacles, a very challenging variational problem.

MCML Authors
Link to Profile Massimo Fornasier

Massimo Fornasier

Prof. Dr.

Applied Numerical Analysis


[21]
C. Cipriani, A. Scagliotti and T. Wöhrer.
A Minimax Optimal Control Approach for Robust Neural ODEs.
ECC 2024 - European Control Conference. Stockholm, Sweden, Jun 25-28, 2024. DOI
Abstract

In this paper, we address the adversarial training of neural ODEs from a robust control perspective. This is an alternative to the classical training via empirical risk minimization, and it is widely used to enforce reliable outcomes for input perturbations. Neural ODEs allow the interpretation of deep neural networks as discretizations of control systems, unlocking powerful tools from control theory for the development and the understanding of machine learning. In this specific case, we formulate the adversarial training with perturbed data as a minimax optimal control problem, for which we derive first order optimality conditions in the form of Pontryagin’s Maximum Principle. We provide a novel interpretation of robust training leading to an alternative weighted technique, which we test on a low-dimensional classification task.

MCML Authors
Link to website

Alessandro Scagliotti

Applied Numerical Analysis


[20]
A. Scagliotti and P. Colli Franzone.
A subgradient method with constant step-size for l1-composite optimization.
Bollettino dell’Unione Matematica Italiana 17 (Jun. 2024). DOI
Abstract

Subgradient methods are the natural extension to the non-smooth case of the classical gradient descent for regular convex optimization problems. However, in general, they are characterized by slow convergence rates, and they require decreasing step-sizes to converge. In this paper we propose a subgradient method with constant step-size for composite convex objectives with l1-regularization. If the smooth term is strongly convex, we can establish a linear convergence result for the function values. This fact relies on an accurate choice of the element of the subdifferential used for the update, and on proper actions adopted when non-differentiability regions are crossed. Then, we propose an accelerated version of the algorithm, based on conservative inertial dynamics and on an adaptive restart strategy, that is guaranteed to achieve a linear convergence rate in the strongly convex case. Finally, we test the performances of our algorithms on some strongly and non-strongly convex examples.
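The key ingredient described above, a careful choice of the subgradient where the l1 term is non-differentiable, can be sketched as follows. The minimal-norm selection used here is a simple stand-in for the paper's update rule, and the quadratic test objective is a hypothetical example:

```python
import numpy as np

def min_norm_subgrad(x, g, lam):
    """Minimal-norm element of the subdifferential of f + lam*||.||_1 at x,
    given g = grad f(x): where x_j = 0, the l1 part is chosen to cancel
    as much of g_j as possible, so coordinates that should stay at zero
    receive a zero update."""
    s = np.where(x != 0.0, np.sign(x), -np.clip(g / lam, -1.0, 1.0))
    return g + lam * s

def solve(grad_f, lam, x0, step=0.1, iters=500):
    """Constant step-size subgradient method for f(x) + lam*||x||_1."""
    x = np.array(x0, dtype=float)
    for _ in range(iters):
        x = x - step * min_norm_subgrad(x, grad_f(x), lam)
    return x

# Hypothetical example: f(x) = 0.5*||x - b||^2 (strongly convex), whose
# l1-regularized minimizer is the soft-thresholding of b at level lam.
b = np.array([2.0, 0.3, -1.5])
x = solve(lambda x: x - b, lam=0.5, x0=np.zeros(3))
print(x)  # ≈ soft-threshold(b, 0.5) = [1.5, 0.0, -1.0]
```

With this subgradient selection, the coordinate with |b_j| below the threshold stays exactly at zero throughout, which is why a constant step-size can suffice here; a naive choice of subgradient at zero would instead make the iterates oscillate.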

MCML Authors
Link to website

Alessandro Scagliotti

Applied Numerical Analysis


[19]
M. Brunner, M. Innerberger, A. Miraçi, D. Praetorius, J. Streitberger and P. Heid.
Adaptive FEM with quasi-optimal overall cost for nonsymmetric linear elliptic PDEs.
IMA Journal of Numerical Analysis 44.3 (May 2024). DOI
Abstract

We consider a general nonsymmetric second-order linear elliptic partial differential equation in the framework of the Lax–Milgram lemma. We formulate and analyze an adaptive finite element algorithm with arbitrary polynomial degree that steers the adaptive mesh refinement and the inexact iterative solution of the arising linear systems. More precisely, the iterative solver employs, as an outer loop, the so-called Zarantonello iteration to symmetrize the system and, as an inner loop, a uniformly contractive algebraic solver, for example, an optimally preconditioned conjugate gradient method or an optimal geometric multigrid algorithm. We prove that the proposed inexact adaptive iteratively symmetrized finite element method leads to full linear convergence and, for sufficiently small adaptivity parameters, to optimal convergence rates with respect to the overall computational cost, i.e., the total computational time. Numerical experiments underline the theory.

MCML Authors

[18]
R. Bailo, A. Barbaro, S. N. Gomes, K. Riedl, T. Roith, C. Totzeck and U. Vaes.
CBX: Python and Julia packages for consensus-based interacting particle methods.
Preprint (Mar. 2024). arXiv
Abstract

We introduce CBXPy and ConsensusBasedX.jl, Python and Julia implementations of consensus-based interacting particle systems (CBX), which generalise consensus-based optimization methods (CBO) for global, derivative-free optimisation. The raison d’être of our libraries is twofold: on the one hand, to offer high-performance implementations of CBX methods that the community can use directly, while on the other, providing a general interface that can accommodate and be extended to further variations of the CBX family. Python and Julia were selected as the leading high-level languages in terms of usage and performance, as well as for their popularity among the scientific computing community. Both libraries have been developed with a common ethos, ensuring a similar API and core functionality, while leveraging the strengths of each language and writing idiomatic code.

MCML Authors
Konstantin Riedl

Dr.

* Former Member


[17]
C. Cipriani, M. Fornasier and A. Scagliotti.
From NeurODEs to AutoencODEs: a mean-field control framework for width-varying Neural Networks.
European Journal of Applied Mathematics (Feb. 2024). DOI
Abstract

The connection between Residual Neural Networks (ResNets) and continuous-time control systems (known as NeurODEs) has led to a mathematical analysis of neural networks, which has provided interesting results of both theoretical and practical significance. However, by construction, NeurODEs have been limited to describing constant-width layers, making them unsuitable for modelling deep learning architectures with layers of variable width. In this paper, we propose a continuous-time Autoencoder, which we call AutoencODE, based on a modification of the controlled field that drives the dynamics. This adaptation enables the extension of the mean-field control framework originally devised for conventional NeurODEs. In this setting, we tackle the case of low Tikhonov regularisation, resulting in potentially non-convex cost landscapes. While the global results obtained for high Tikhonov regularisation may not hold globally, we show that many of them can be recovered in regions where the loss function is locally convex. Inspired by our theoretical findings, we develop a training method tailored to this specific type of Autoencoders with residual connections, and we validate our approach through numerical experiments conducted on various examples.

MCML Authors
Link to Profile Massimo Fornasier

Massimo Fornasier

Prof. Dr.

Applied Numerical Analysis

Link to website

Alessandro Scagliotti

Applied Numerical Analysis


[16]
C. Geldhauser and H. Diebel-Fischer.
Is diverse and inclusive AI trapped in the gap between reality and algorithmizability?
NLDL 2024 - Northern Lights Deep Learning Conference. Tromsø, Norway, Jan 09-11, 2024. URL
Abstract

We investigate the preconditions of an operationalization of ethics on the example of algorithmization, i.e., the mathematical implementation, of the concepts of fairness and diversity in AI. From a non-technical point of view in ethics, this implementation entails two major drawbacks, (1) as it narrows down big concepts to a single model that is deemed manageable, and (2) as it hides unsolved problems of humanity in a system that could be mistaken as the 'solution' to these problems. We encourage extra caution when dealing with such issues and vote for human oversight.

MCML Authors
Carina Geldhauser

Dr.

* Former Member


2023


[15]
F. Filbir, M. Tasche and A. Veselovska.
Regularized Shannon sampling formulas related to the special affine Fourier transform.
Preprint (Nov. 2023). arXiv
Abstract

In this paper, we present new regularized Shannon sampling formulas related to the special affine Fourier transform (SAFT). These sampling formulas use localized sampling with special compactly supported window functions, namely B-spline, sinh-type, and continuous Kaiser-Bessel window functions. In contrast to the Shannon sampling series for SAFT, the regularized Shannon sampling formulas for SAFT possess an exponential decay of the approximation error and are numerically robust in the presence of noise, if a certain oversampling condition is fulfilled. Several numerical experiments illustrate the theoretical results.

MCML Authors
Link to website

Hanna Veselovska

Dr.

Applied Numerical Analysis


[14]
K. Riedl.
Leveraging Memory Effects and Gradient Information in Consensus-Based Optimisation: On Global Convergence in Mean-Field Law.
European Journal of Applied Mathematics (Oct. 2023). DOI
Abstract

In this paper, we study consensus-based optimisation (CBO), a versatile, flexible and customisable optimisation method suitable for performing nonconvex and nonsmooth global optimisations in high dimensions. CBO is a multi-particle metaheuristic, which is effective in various applications and at the same time amenable to theoretical analysis thanks to its minimalistic design. The underlying dynamics, however, is flexible enough to incorporate different mechanisms widely used in evolutionary computation and machine learning, as we show by analysing a variant of CBO which makes use of memory effects and gradient information. We rigorously prove that this dynamics converges to a global minimiser of the objective function in mean-field law for a vast class of functions under minimal assumptions on the initialisation of the method. The proof in particular reveals how to leverage further, in some applications advantageous, forces in the dynamics without losing provable global convergence. To demonstrate the benefit of the herein investigated memory effects and gradient information in certain applications, we present numerical evidence for the superiority of this CBO variant in applications such as machine learning and compressed sensing, which en passant widen the scope of applications of CBO.

MCML Authors
Konstantin Riedl

Dr.

* Former Member


[13]
M. Fornasier, P. Richtárik, K. Riedl and L. Sun.
Consensus-Based Optimization with Truncated Noise.
Preprint (Oct. 2023). arXiv
Abstract

Consensus-based optimization (CBO) is a versatile multi-particle metaheuristic optimization method suitable for performing nonconvex and nonsmooth global optimizations in high dimensions. It has proven effective in various applications while at the same time being amenable to a theoretical convergence analysis. In this paper, we explore a variant of CBO, which incorporates truncated noise in order to enhance the well-behavedness of the statistics of the law of the dynamics. By introducing this additional truncation in the noise term of the CBO dynamics, we achieve that, in contrast to the original version, higher moments of the law of the particle system can be effectively bounded. As a result, our proposed variant exhibits enhanced convergence performance, allowing in particular for wider flexibility in choosing the noise parameter of the method as we confirm experimentally. By analyzing the time-evolution of the Wasserstein-2 distance between the empirical measure of the interacting particle system and the global minimizer of the objective function, we rigorously prove convergence in expectation of the proposed CBO variant requiring only minimal assumptions on the objective function and on the initialization. Numerical evidence demonstrates the benefit of truncating the noise in CBO.
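The truncation enters the CBO dynamics only through the diffusion coefficient. The sketch below, a hedged illustration with hypothetical parameter values (in particular the truncation level M), shows one discretized CBO step in which the noise amplitude |x − consensus| is capped at M.

```python
import numpy as np

def cbo_step_truncated(X, f, alpha=50.0, lam=1.0, sigma=0.5, dt=0.05, M=1.0, rng=None):
    """One CBO step whose noise amplitude is truncated at level M.

    Compared with plain CBO, the diffusion coefficient |x - consensus| is
    replaced by min(|x - consensus|, M), which keeps higher moments of the
    particle law bounded. All parameter values are illustrative choices.
    """
    rng = rng or np.random.default_rng()
    fX = f(X)
    w = np.exp(-alpha * (fX - fX.min()))          # stabilized Gibbs weights
    consensus = (w[:, None] * X).sum(0) / w.sum()
    drift = consensus - X
    amp = np.minimum(np.linalg.norm(drift, axis=1, keepdims=True), M)  # truncation
    X_new = X + lam * dt * drift + sigma * np.sqrt(dt) * amp * rng.standard_normal(X.shape)
    return X_new, consensus

# Run the truncated dynamics on a quadratic with minimizer at the origin.
f = lambda X: (X ** 2).sum(axis=1)
rng = np.random.default_rng(2)
X = rng.uniform(-4, 4, size=(80, 2))
for _ in range(500):
    X, consensus = cbo_step_truncated(X, f, rng=rng)
```

Far from consensus the noise is bounded by M, while near consensus the dynamics coincides with plain CBO.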

MCML Authors
Link to Profile Massimo Fornasier

Massimo Fornasier

Prof. Dr.

Applied Numerical Analysis

Konstantin Riedl

Dr.

* Former Member

Link to website

Lukang Sun

Applied Numerical Analysis


[12]
P. Heid.
A damped Kačanov scheme for the numerical solution of a relaxed p(x)-Poisson equation.
Partial Differential Equations and Applications 4.40 (Aug. 2023). DOI
Abstract

The focus of the present work is the (theoretical) approximation of a solution of the p(x)-Poisson equation. To devise an iterative solver with guaranteed convergence, we will consider a relaxation of the original problem in terms of a truncation of the nonlinearity from below and from above by using a pair of positive cut-off parameters. We will then verify that, for any such pair, a damped Kačanov scheme generates a sequence converging to a solution of the relaxed equation. Subsequently, it will be shown that the solutions of the relaxed problems converge to the solution of the original problem in the discrete setting. Finally, the discrete solutions of the unrelaxed problem converge to the continuous solution. Our work is rounded off with some numerical experiments that underline the analytical findings.

MCML Authors

[11]
F. Krahmer, H. Lyu, R. Saab, A. Veselovska and R. Wang.
Quantization of Bandlimited Graph Signals.
SampTA 2023 - International Conference on Sampling Theory and Applications. Yale, CT, USA, Jul 10-14, 2023. DOI
Abstract

Graph models and graph-based signals are becoming increasingly important in machine learning, natural sciences, and modern signal processing. In this paper, we address the problem of quantizing bandlimited graph signals. We introduce two classes of noise-shaping algorithms for graph signals that differ in their sampling methodologies. We demonstrate that these algorithms can be efficiently used to construct quantized representatives of bandlimited graph-based signals with bounded amplitude. Moreover, for one of the algorithms, we provide theoretical guarantees on the relative error between the quantized representative and the true signal.

MCML Authors
Link to Profile Felix Krahmer

Felix Krahmer

Prof. Dr.

Optimization & Data Analysis

Link to website

Hanna Veselovska

Dr.

Applied Numerical Analysis


[10]
F. Krahmer and A. Veselovska.
Digital Halftoning via Mixed-Order Weighted Σ∆ Modulation.
SampTA 2023 - International Conference on Sampling Theory and Applications. Yale, CT, USA, Jul 10-14, 2023. DOI
Abstract

In this paper, we propose 1-bit weighted Σ∆ quantization schemes of mixed order as a technique for digital halftoning. These schemes combine weighted Σ∆ schemes of different orders for two-dimensional signals so one can profit both from the better stability properties of low order schemes and the better accuracy properties of higher order schemes. We demonstrate that the resulting mixed-order Σ∆ schemes in combination with a padding strategy yield improved representation quality in digital halftoning as measured in the Feature Similarity Index. These empirical results are complemented by mathematical error bounds for the model of two-dimensional bandlimited signals as motivated by a mathematical model of human visual perception.

MCML Authors
Link to Profile Felix Krahmer

Felix Krahmer

Prof. Dr.

Optimization & Data Analysis

Link to website

Hanna Veselovska

Dr.

Applied Numerical Analysis


[9]
F. Krahmer and A. Veselovska.
Enhanced Digital Halftoning via Weighted Sigma-Delta Modulation.
SIAM Journal on Imaging Sciences 16.3 (Jul. 2023). DOI
Abstract

In this paper, we study error diffusion techniques for digital halftoning from the perspective of 1-bit quantization. We introduce a method to generate schemes for two-dimensional signals as a weighted combination of their one-dimensional counterparts and show that various error diffusion schemes proposed in the literature can be represented in this framework via schemes of first order. Under the model of two-dimensional bandlimited signals, which is motivated by a mathematical model of human visual perception, we derive quantitative error bounds for such weighted schemes. We see these bounds as a step towards a mathematical understanding of the good empirical performance of error diffusion, even though they are formulated in the supremum norm, which is known to not fully capture the visual similarity of images. Motivated by the correspondence between existing error diffusion algorithms and first-order schemes, we study the performance of the analogous weighted combinations of second-order schemes and show that they exhibit a superior performance in terms of guaranteed error decay for two-dimensional bandlimited signals. In extensive numerical simulations for real-world images, we demonstrate that with some modifications to enhance stability this superior performance also translates to the problem of digital halftoning. More concretely, we find that certain second-order weighted schemes exhibit competitive performance for digital halftoning of real-world images in terms of the Feature Similarity Index (FSIM), a state-of-the-art measure for image quality assessment.
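The one-dimensional backbone of the error-diffusion schemes discussed above is first-order 1-bit Σ∆ quantization: a running state accumulates the quantization error and steers the next bit. The sketch below is a plain first-order 1-D scheme, not the paper's weighted two-dimensional construction.

```python
import numpy as np

def sigma_delta_first_order(x):
    """First-order 1-bit Sigma-Delta quantization of a signal in [0, 1].

    The state u accumulates the running error x_n - q_n; the quantizer
    thresholds the accumulated state, so the error made at each step is
    remembered and compensated later (the essence of error diffusion).
    """
    u = 0.0
    q = np.empty_like(x)
    for n, xn in enumerate(x):
        q[n] = 1.0 if u + xn >= 0.5 else 0.0   # 1-bit quantizer
        u = u + xn - q[n]                       # state update (error memory)
    return q

# A constant gray level 0.3 is rendered as a bit pattern whose mean matches it.
x = np.full(1000, 0.3)
q = sigma_delta_first_order(x)
```

Because the state u stays bounded, the average of the emitted bits tracks the average of the input, which is what makes the halftone perceptually close to the original gray level.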

MCML Authors
Link to Profile Felix Krahmer

Felix Krahmer

Prof. Dr.

Optimization & Data Analysis

Link to website

Hanna Veselovska

Dr.

Applied Numerical Analysis


[8]
K. Riedl, T. Klock, C. Geldhauser and M. Fornasier.
Gradient is All You Need?
Preprint (Jun. 2023). arXiv
Abstract

In this paper we provide a novel analytical perspective on the theoretical understanding of gradient-based learning algorithms by interpreting consensus-based optimization (CBO), a recently proposed multi-particle derivative-free optimization method, as a stochastic relaxation of gradient descent. Remarkably, we observe that through communication of the particles, CBO exhibits a stochastic gradient descent (SGD)-like behavior despite solely relying on evaluations of the objective function. The fundamental value of such a link between CBO and SGD lies in the fact that CBO is provably globally convergent to global minimizers for ample classes of nonsmooth and nonconvex objective functions, hence, on the one side, offering a novel explanation for the success of stochastic relaxations of gradient descent. On the other side, contrary to the conventional wisdom for which zero-order methods ought to be inefficient or not to possess generalization abilities, our results unveil an intrinsic gradient descent nature of such heuristics. This viewpoint furthermore complements previous insights into the working principles of CBO, which describe the dynamics in the mean-field limit through a nonlinear nonlocal partial differential equation that allows one to alleviate complexities of the nonconvex function landscape. Our proofs leverage a completely nonsmooth analysis, which combines a novel quantitative version of the Laplace principle (log-sum-exp trick) and the minimizing movement scheme (proximal iteration). In doing so, we furnish useful and precise insights that explain how stochastic perturbations of gradient descent overcome energy barriers and reach deep levels of nonconvex functions. Instructive numerical illustrations support the provided theoretical insights.

MCML Authors
Konstantin Riedl

Dr.

* Former Member

Carina Geldhauser

Dr.

* Former Member

Link to Profile Massimo Fornasier

Massimo Fornasier

Prof. Dr.

Applied Numerical Analysis


[7]
H. Huang, J. Qiu and K. Riedl.
On the global convergence of particle swarm optimization methods.
Applied Mathematics and Optimization 88.2 (May. 2023). DOI
Abstract

In this paper we provide a rigorous convergence analysis for the renowned particle swarm optimization method by using tools from stochastic calculus and the analysis of partial differential equations. Based on a continuous-time formulation of the particle dynamics as a system of stochastic differential equations, we establish convergence to a global minimizer of a possibly nonconvex and nonsmooth objective function in two steps. First, we prove consensus formation of an associated mean-field dynamics by analyzing the time-evolution of the variance of the particle distribution, which acts as Lyapunov function of the dynamics. We then show that this consensus is close to a global minimizer by employing the asymptotic Laplace principle and a tractability condition on the energy landscape of the objective function. These results allow for the usage of memory mechanisms, and hold for a rich class of objectives provided certain conditions of well-preparation of the hyperparameters and the initial datum. In a second step, at least for the case without memory effects, we provide a quantitative result about the mean-field approximation of particle swarm optimization, which specifies the convergence of the interacting particle system to the associated mean-field limit. Combining these two results allows for global convergence guarantees of the numerical particle swarm optimization method with provable polynomial complexity. To demonstrate the applicability of the method we propose an efficient and parallelizable implementation, which is tested in particular on a competitive and well-understood high-dimensional benchmark problem in machine learning.
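The discrete heuristic whose continuous-time limit the paper analyzes is the classical particle swarm method with personal and global bests. The sketch below is that textbook scheme (without the memory-free simplifications or the stochastic-calculus formulation of the paper); the inertia and acceleration coefficients are common illustrative choices.

```python
import numpy as np

def pso_minimize(f, x0, inertia=0.7, c1=1.5, c2=1.5, steps=300, seed=0):
    """Plain particle swarm optimization (personal + global best).

    Each particle is pulled towards its own best visited point P and the
    swarm-wide best g, with random per-coordinate weights; the velocity
    carries momentum via the inertia term.
    """
    rng = np.random.default_rng(seed)
    X = np.array(x0, dtype=float)
    V = np.zeros_like(X)
    P = X.copy()                 # personal bests
    pf = f(P)                    # personal best values
    g = P[pf.argmin()].copy()    # global best
    for _ in range(steps):
        r1, r2 = rng.random(X.shape), rng.random(X.shape)
        V = inertia * V + c1 * r1 * (P - X) + c2 * r2 * (g - X)
        X = X + V
        fX = f(X)
        better = fX < pf
        P[better], pf[better] = X[better], fX[better]
        g = P[pf.argmin()].copy()
    return g

# Demo: 3-D quadratic with minimizer at (2, 2, 2).
f = lambda X: ((X - 2.0) ** 2).sum(axis=1)
rng = np.random.default_rng(3)
gbest = pso_minimize(f, rng.uniform(-5, 5, size=(40, 3)))
```

The chosen coefficients satisfy the usual stability condition for the velocity recursion, so the swarm contracts around the global best.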

MCML Authors
Konstantin Riedl

Dr.

* Former Member


[6]
L. Sun.
Well-posedness and L1−Lp Smoothing Effect of the Porous Media Equation under Poincaré Inequality.
Preprint (Apr. 2023). arXiv
Abstract

We investigate the well-posedness and uniqueness of the Cauchy problem for a class of porous media equations defined on ℝ^d, and demonstrate the L^1–L^p smoothing effect. In particular, we establish that the logarithm of the ratio of the L^p norm to the L^1 norm decreases super-exponentially fast during the initial phase, subsequently decaying to zero exponentially fast in the latter phase. This implies that if the initial data is solely in L^1, then for t > 0, the solution will belong to L^p for any p ∈ [1, ∞). The results are obtained under the assumption of a Poincaré inequality.

MCML Authors
Link to website

Lukang Sun

Applied Numerical Analysis


[5]
P. Heid.
A short note on an adaptive damped Newton method for strongly monotone and Lipschitz continuous operator equations.
Archiv der Mathematik (Mar. 2023). URL
Abstract

We consider the damped Newton method for strongly monotone and Lipschitz continuous operator equations in a variational setting. We provide a very accessible justification why the undamped Newton method performs better than its damped counterparts in a vicinity of a solution. Moreover, in the given setting, an adaptive step-size strategy is presented, which guarantees global convergence and favours an undamped update whenever admissible.
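The interplay of damping and the undamped update can be sketched with a naive adaptive strategy: accept a trial step only if it reduces the residual, halve the damping otherwise, and grow it back towards one after success. This is an illustrative heuristic, not the paper's step-size rule; the operator and all tolerances below are hypothetical choices.

```python
import numpy as np

def damped_newton(F, J, x0, delta=0.1, tol=1e-12, max_iter=100):
    """Damped Newton with a naive adaptive step-size (illustrative only).

    A trial update is accepted only if it reduces the residual norm;
    otherwise the damping factor delta is halved. Near the solution the
    undamped step (delta = 1) is admissible and gives fast convergence.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        r = F(x)
        if np.linalg.norm(r) < tol:
            break
        step = np.linalg.solve(J(x), r)
        while delta > 1e-8:
            x_try = x - delta * step
            if np.linalg.norm(F(x_try)) < np.linalg.norm(r):
                x = x_try
                delta = min(1.0, 2 * delta)   # favour the undamped update
                break
            delta *= 0.5                       # damp further if no decrease
    return x

# Strongly monotone, Lipschitz continuous operator: F(x) = x + tanh(x) - b.
b = np.array([0.5, -1.0])
F = lambda x: x + np.tanh(x) - b
J = lambda x: np.eye(2) + np.diag(1.0 / np.cosh(x) ** 2)
x = damped_newton(F, J, np.zeros(2))
```

Once the iterate enters the vicinity of the solution, delta stays at 1 and the method behaves as undamped Newton.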

MCML Authors

[4]
A. Scagliotti.
Optimal control of ensembles of dynamical systems.
ESAIM - Control, Optimisation and Calculus of Variations 29.22 (Mar. 2023). DOI
Abstract

In this paper we consider the problem of the optimal control of an ensemble of affine-control systems. After proving the well-posedness of the minimization problem under examination, we establish a $\Gamma$-convergence result that allows us to substitute the original (and usually infinite) ensemble with a sequence of finite increasing-in-size sub-ensembles. The solutions of the optimal control problems involving these sub-ensembles provide approximations in the $L^2$-strong topology of the minimizers of the original problem. Using again a $\Gamma$-convergence argument, we manage to derive a Maximum Principle for ensemble optimal control problems with end-point cost. Moreover, in the case of finite sub-ensembles, we can address the minimization of the related cost through numerical schemes. In particular, we propose an algorithm that consists of a subspace projection of the gradient field induced on the space of admissible controls by the approximating cost functional. In addition, we consider an iterative method based on the Pontryagin Maximum Principle. Finally, we test the algorithms on an ensemble of linear systems in $\mathbb{R}^2$.

MCML Authors
Link to website

Alessandro Scagliotti

Applied Numerical Analysis


2022


[3]
M. Herold, A. Veselovska, J. Jehle and F. Krahmer.
Non-intrusive surrogate modelling using sparse random features with applications in crashworthiness analysis.
Preprint (Dec. 2022). arXiv
Abstract

Efficient surrogate modelling is a key requirement for uncertainty quantification in data-driven scenarios. In this work, a novel approach of using Sparse Random Features for surrogate modelling in combination with self-supervised dimensionality reduction is described. The method is compared to other methods on synthetic and real data obtained from crashworthiness analyses. The results show that the approach described here outperforms state-of-the-art surrogate modelling techniques, namely Polynomial Chaos Expansions and Neural Networks.

MCML Authors
Link to website

Hanna Veselovska

Dr.

Applied Numerical Analysis

Link to Profile Felix Krahmer

Felix Krahmer

Prof. Dr.

Optimization & Data Analysis


[2]
H. Huang, J. Qiu and K. Riedl.
Consensus-Based Optimization for Saddle Point Problems.
Preprint (Dec. 2022). arXiv
Abstract

In this paper, we propose consensus-based optimization for saddle point problems (CBO-SP), a novel multi-particle metaheuristic derivative-free optimization method capable of provably finding global Nash equilibria. Following the idea of swarm intelligence, the method employs a group of interacting particles, which perform a minimization over one variable and a maximization over the other. This paradigm permits a passage to the mean-field limit, which makes the method amenable to theoretical analysis and allows us to obtain rigorous convergence guarantees under reasonable assumptions about the initialization and the objective function, which most notably include nonconvex-nonconcave objectives.

MCML Authors
Konstantin Riedl

Dr.

* Former Member


[1]
A. Scagliotti and P. Colli Franzone.
Accelerated subgradient methods.
Preprint (Feb. 2022). arXiv
Abstract

Subgradient methods are the natural extension to the non-smooth case of the classical gradient descent for regular convex optimization problems. However, in general, they are characterized by slow convergence rates, and they require decreasing step-sizes to converge. In this paper we propose a subgradient method with constant step-size for composite convex objectives with ℓ1-regularization. If the smooth term is strongly convex, we can establish a linear convergence result for the function values. This fact relies on an accurate choice of the element of the subdifferential used for the update, and on proper actions adopted when non-differentiability regions are crossed. Then, we propose an accelerated version of the algorithm, based on conservative inertial dynamics and on an adaptive restart strategy, that is guaranteed to achieve a linear convergence rate in the strongly convex case. Finally, we test the performance of our algorithms on some strongly and non-strongly convex examples.
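The role of a careful subdifferential selection can be sketched on the composite problem g(x) + μ‖x‖₁ with g smooth and strongly convex. The sketch below is in the spirit of, but not identical to, the paper's selection rule: at zero coordinates it picks the minimal-norm element of the subdifferential. All names and parameter values are illustrative assumptions.

```python
import numpy as np

def subgradient_l1(grad_g, mu, x0, step=0.01, iters=2000):
    """Constant-step subgradient method for g(x) + mu * ||x||_1 (sketch).

    Where x_i != 0 the l1 subgradient is mu * sign(x_i). At x_i = 0 we pick
    the element of [-mu, mu] that cancels as much of grad_g as possible,
    i.e. the minimal-norm element of the full subdifferential.
    """
    x = np.array(x0, dtype=float)
    for _ in range(iters):
        gg = grad_g(x)
        sub = gg + mu * np.sign(x)                          # coordinates with x_i != 0
        zero = (x == 0)
        sub[zero] = gg[zero] - np.clip(gg[zero], -mu, mu)   # minimal-norm choice at 0
        x = x - step * sub
    return x

# Strongly convex smooth part g(x) = 0.5 * ||x - c||^2; the minimizer of
# g + mu * ||x||_1 is then the soft-thresholding of c.
c = np.array([2.0, -0.05, 1.0])
mu = 0.1
x = subgradient_l1(lambda x: x - c, mu, np.zeros(3))
```

With this selection the coordinate that should be exactly zero never leaves zero, and the remaining coordinates converge linearly with the constant step-size, consistent with the soft-thresholded solution [1.9, 0, 0.9].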

MCML Authors
Link to website

Alessandro Scagliotti

Applied Numerical Analysis