
Research Group Johannes Maly



Prof. Dr. Johannes Maly
Associate
Mathematical Data Science and Artificial Intelligence

Johannes Maly is Junior Professor at the Working Group Mathematical Data Science and Artificial Intelligence at LMU Munich.

Publications @MCML

2025


[7]
H.-H. Chou, J. Maly, C. M. Verdun, B. Freitas Paulo da Costa and H. Mirandola.
Get rid of your constraints and reparametrize: A study in NNLS and implicit bias.
AISTATS 2025 - 28th International Conference on Artificial Intelligence and Statistics. Mai Khao, Thailand, May 03-05, 2025. To be published. URL
Abstract

In recent years, there has been significant interest in understanding the implicit bias of gradient descent (GD) optimization and its connection to the generalization properties of overparametrized neural networks. Several works have observed that when training linear diagonal networks on the square loss for regression tasks (which corresponds to overparametrized linear regression), gradient descent converges to special solutions, e.g., non-negative ones. We connect this observation to Riemannian optimization and view overparametrized GD with identical initialization as a Riemannian GD. We use this fact to solve non-negative least squares (NNLS), an important problem behind many techniques, e.g., non-negative matrix factorization. We show that gradient flow on the reparametrized objective converges globally to NNLS solutions, and we provide convergence rates for its discretized counterpart as well. Unlike previous methods, we do not rely on the calculation of exponential maps or geodesics. We further show accelerated convergence using a second-order ODE, which lends itself to accelerated descent methods. Finally, we establish stability against negative perturbations and discuss generalization to other constrained optimization problems.
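
As a rough illustration of the reparametrization idea summarized above, the following minimal Python sketch solves a small synthetic NNLS instance by writing x = u * u (entrywise) and running plain gradient descent on u. The matrix A, vector b, initialization scale, step size, and iteration count are illustrative choices, not taken from the paper.

import numpy as np

rng = np.random.default_rng(0)
m, n = 50, 20
A = rng.standard_normal((m, n))
b = A @ np.abs(rng.standard_normal(n)) + 0.01 * rng.standard_normal(m)

# Reparametrize x = u * u (entrywise), so x >= 0 holds automatically, and run
# plain gradient descent on the unconstrained objective f(u) = 0.5*||A(u*u) - b||^2.
u = 0.1 * np.ones(n)              # identical initialization
eta = 1e-3                        # illustrative step size
for _ in range(20000):
    x = u * u
    grad_x = A.T @ (A @ x - b)    # gradient with respect to x
    u -= eta * 2.0 * u * grad_x   # chain rule: df/du = 2 u * grad_x

x_gd = u * u
print("residual:", np.linalg.norm(A @ x_gd - b))
print("min entry (non-negative by construction):", x_gd.min())

Since x = u * u is non-negative by construction, no projection or explicit constraint handling is needed; the implicit bias of the overparametrized dynamics does the work.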

MCML Authors: Johannes Maly


[6]
S. Dirksen, W. Li and J. Maly.
Subspace and DOA estimation under coarse quantization.
Preprint (Feb. 2025). arXiv
Abstract

We study direction-of-arrival (DOA) estimation from coarsely quantized data. We focus on a two-step approach which first estimates the signal subspace via covariance estimation and then extracts DOA angles by the ESPRIT algorithm. In particular, we analyze two stochastic quantization schemes which use dithering: a one-bit quantizer combined with rectangular dither and a multi-bit quantizer with triangular dither. For each quantizer, we derive rigorous high probability bounds for the distances between the true and estimated signal subspaces and DOA angles. Using our analysis, we identify scenarios in which subspace and DOA estimation via triangular dithering qualitatively outperforms rectangular dithering. We verify in numerical simulations that our estimates are optimal in their dependence on the smallest non-zero eigenvalue of the target matrix. The resulting subspace estimation guarantees are equally applicable in the analysis of other spectral estimation algorithms and related problems.
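
A minimal sketch of the two-step pipeline for real-valued data is given below: one-bit quantization with rectangular (uniform) dither, a symmetrized, λ²-scaled covariance estimate, and extraction of the dominant eigenvectors as the signal-subspace estimate. The ESPRIT step and the complex-valued array model of the paper are omitted, and the exact estimator form, constants, and problem sizes are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(1)
p, n, lam, k = 16, 5000, 3.0, 2   # array size, snapshots, dither level, subspace dim

# Synthetic zero-mean data with a k-dimensional dominant subspace (illustrative).
U_true, _ = np.linalg.qr(rng.standard_normal((p, k)))
X = U_true @ (3.0 * rng.standard_normal((k, n))) + 0.1 * rng.standard_normal((p, n))

# Step 1: one-bit quantization with two independent rectangular (uniform) dithers.
T1 = rng.uniform(-lam, lam, size=(p, n))
T2 = rng.uniform(-lam, lam, size=(p, n))
Q1, Q2 = np.sign(X + T1), np.sign(X + T2)

# Step 2: symmetrized dithered covariance estimate, then its top-k eigenvectors
# as the signal-subspace estimate (ESPRIT would operate on this subspace).
C_hat = lam**2 * (Q1 @ Q2.T + Q2 @ Q1.T) / (2 * n)
U_hat = np.linalg.eigh(C_hat)[1][:, -k:]

# Distance between the true and estimated subspaces via principal angles.
s = np.linalg.svd(U_true.T @ U_hat, compute_uv=False)
print("largest principal angle (rad):", float(np.arccos(np.clip(s.min(), 0.0, 1.0))))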

MCML Authors: Johannes Maly


2024


[5]
T. Yang, J. Maly, S. Dirksen and G. Caire.
Plug-In Channel Estimation With Dithered Quantized Signals in Spatially Non-Stationary Massive MIMO Systems.
IEEE Transactions on Communications 72.1 (Jan. 2024). DOI
Abstract

As the array dimension of massive MIMO systems increases to unprecedented levels, two problems occur. First, the spatial stationarity assumption along the antenna elements is no longer valid. Second, the large array size results in an unacceptably high power consumption if high-resolution analog-to-digital converters are used. To address these two challenges, we consider a Bussgang linear minimum mean square error (BLMMSE)-based channel estimator for large scale massive MIMO systems with one-bit quantizers and a spatially non-stationary channel. Whereas other works usually assume that the channel covariance is known at the base station, we consider a plug-in BLMMSE estimator that uses an estimate of the channel covariance and rigorously analyze the distortion produced by using an estimated, rather than the true, covariance. To cope with the spatial non-stationarity, we introduce dithering into the quantized signals and provide a theoretical error analysis. In addition, we propose an angular domain fitting procedure which is based on solving an instance of non-negative least squares. For the multi-user data transmission phase, we further propose a BLMMSE-based receiver to handle one-bit quantized data signals. Our numerical results show that the performance of the proposed BLMMSE channel estimator is very close to the oracle-aided scheme with ideal knowledge of the channel covariance matrix. The BLMMSE receiver outperforms the conventional maximum-ratio-combining and zero-forcing receivers in terms of the resulting ergodic sum rate.
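
The sketch below mimics only the plug-in flavor of such an estimator on a toy real-valued model: second-order statistics are estimated from training samples of dithered one-bit observations and plugged into a linear MMSE filter. It is not the Bussgang-based BLMMSE estimator of the paper; the signal model, dither level, and sample sizes are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(2)
p, n_train, n_test, lam = 8, 20000, 5, 2.0

# Toy model: observe a dithered one-bit quantization of h + noise and recover h
# with a plug-in linear MMSE filter whose second-order statistics are estimated
# from training samples (illustrative stand-in, not the paper's BLMMSE).
C_h = np.diag(np.linspace(2.0, 0.2, p))          # assumed channel covariance shape
L = np.linalg.cholesky(C_h)

def sample(n):
    H = L @ rng.standard_normal((p, n))
    Y = H + 0.3 * rng.standard_normal((p, n))
    R = lam * np.sign(Y + rng.uniform(-lam, lam, size=(p, n)))
    return H, R

H_tr, R_tr = sample(n_train)
C_hr = H_tr @ R_tr.T / n_train                   # cross-covariance estimate
C_r = R_tr @ R_tr.T / n_train                    # quantized-signal covariance estimate
W = C_hr @ np.linalg.inv(C_r)                    # plug-in LMMSE filter

H_te, R_te = sample(n_test)
H_hat = W @ R_te
print("relative error:", np.linalg.norm(H_hat - H_te) / np.linalg.norm(H_te))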

MCML Authors: Johannes Maly


[4]
S. Dirksen and J. Maly.
Tuning-free one-bit covariance estimation using data-driven dithering.
Preprint (Jan. 2024). arXiv
Abstract

We consider covariance estimation of any subgaussian distribution from finitely many i.i.d. samples that are quantized to one bit of information per entry. Recent work has shown that a reliable estimator can be constructed if uniformly distributed dithers on [−λ,λ] are used in the one-bit quantizer. This estimator enjoys near-minimax optimal, non-asymptotic error estimates in the operator and Frobenius norms if λ is chosen proportional to the largest variance of the distribution. However, this quantity is not known a priori, and in practice λ needs to be carefully tuned to achieve good performance. In this work we resolve this problem by introducing a tuning-free variant of this estimator, which replaces λ by a data-driven quantity. We prove that this estimator satisfies the same non-asymptotic error estimates, up to small (logarithmic) losses and a slightly worse probability estimate. We also show that by using refined data-driven dithers that vary per entry of each sample, one can construct an estimator satisfying the same estimation error bound as the sample covariance of the samples before quantization, again up to logarithmic losses. Our proofs rely on a new version of the Burkholder-Rosenthal inequalities for matrix martingales, which is expected to be of independent interest.
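
To see why the dither level matters, the toy sweep below applies a fixed-λ dithered one-bit covariance estimator for several values of λ and reports the resulting error. The data-driven, tuning-free choice of λ proposed in the paper is not reproduced here, and the Gaussian test distribution, dimensions, and estimator constants are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(3)
p, n = 10, 50000
A = rng.standard_normal((p, p))
Sigma = A @ A.T / p
X = np.linalg.cholesky(Sigma) @ rng.standard_normal((p, n))

def one_bit_cov(X, lam, rng):
    """Dithered one-bit covariance estimate with a fixed dither level lam."""
    p, n = X.shape
    T1 = rng.uniform(-lam, lam, size=(p, n))
    T2 = rng.uniform(-lam, lam, size=(p, n))
    Q1, Q2 = np.sign(X + T1), np.sign(X + T2)
    return lam**2 * (Q1 @ Q2.T + Q2 @ Q1.T) / (2 * n)

# The error depends strongly on lam, which is why a data-driven choice matters.
for lam in [0.5, 1.0, 2.0, 4.0, 8.0]:
    err = np.linalg.norm(one_bit_cov(X, lam, rng) - Sigma) / np.linalg.norm(Sigma)
    print(f"lam = {lam:4.1f}  relative Frobenius error = {err:.3f}")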

MCML Authors: Johannes Maly


2023


[3]
C. Kümmerle and J. Maly.
Recovering Simultaneously Structured Data via Non-Convex Iteratively Reweighted Least Squares.
NeurIPS 2023 - 37th Conference on Neural Information Processing Systems. New Orleans, LA, USA, Dec 10-16, 2023. URL
Abstract

We propose a new algorithm for the problem of recovering data that adheres to multiple, heterogeneous low-dimensional structures from linear observations. Focusing on data matrices that are simultaneously row-sparse and low-rank, we propose and analyze an iteratively reweighted least squares (IRLS) algorithm that is able to leverage both structures. In particular, it optimizes a combination of non-convex surrogates for row-sparsity and rank, a balancing of which is built into the algorithm. We prove locally quadratic convergence of the iterates to a simultaneously structured data matrix in a regime of minimal sample complexity (up to constants and a logarithmic factor), which is known to be impossible for a combination of convex surrogates. In experiments, we show that the IRLS method exhibits favorable empirical convergence, identifying simultaneously row-sparse and low-rank matrices from fewer measurements than state-of-the-art methods.
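
The following heavily simplified sketch conveys the IRLS idea for simultaneously row-sparse and low-rank matrices: at each iteration, row-sparsity and rank weights are recomputed from the current iterate and the induced regularized weighted least-squares problem is solved. The paper's actual algorithm (its non-convex surrogates, balancing, and constrained formulation) differs, and all surrogate choices, dimensions, and parameters here are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(4)
d1, d2, r, s, m = 20, 15, 2, 5, 120

# Ground truth: s nonzero rows, rank r; Gaussian linear measurements y = A vec(X).
rows = rng.choice(d1, size=s, replace=False)
X_true = np.zeros((d1, d2))
X_true[rows] = rng.standard_normal((s, r)) @ rng.standard_normal((r, d2))
A = rng.standard_normal((m, d1 * d2)) / np.sqrt(m)
y = A @ X_true.reshape(-1, order="F")

# Simplified IRLS: recompute row-sparsity and rank weights from the current
# iterate, then solve the induced regularized (weighted) least-squares problem.
X, gamma, eps = np.zeros((d1, d2)), 1e-2, 1.0
for _ in range(50):
    eps = max(0.8 * eps, 1e-6)                                   # smoothing schedule
    w_rows = 1.0 / np.sqrt(np.sum(X**2, axis=1) + eps**2)        # row-sparsity weights
    evals, evecs = np.linalg.eigh(X @ X.T + eps**2 * np.eye(d1))
    W_rank = evecs @ np.diag(evals**-0.5) @ evecs.T              # (X X^T + eps^2 I)^(-1/2)
    W = np.kron(np.eye(d2), np.diag(w_rows) + W_rank)            # quadratic weight on vec(X)
    x_vec = np.linalg.solve(A.T @ A + gamma * W, A.T @ y)
    X = x_vec.reshape(d1, d2, order="F")

print("relative recovery error:", np.linalg.norm(X - X_true) / np.linalg.norm(X_true))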

MCML Authors: Johannes Maly


[2]
H.-H. Chou, J. Maly and D. Stöger.
How to induce regularization in linear models: A guide to reparametrizing gradient flow.
Preprint (Aug. 2023). arXiv
Abstract

In this work, we analyze the relation between reparametrizations of gradient flow and the induced implicit bias in linear models, which encompass various basic regression tasks. In particular, we aim at understanding the influence of the model parameters - reparametrization, loss, and link function - on the convergence behavior of gradient flow. Our results provide conditions under which the implicit bias can be well-described and convergence of the flow is guaranteed. We furthermore show how to use these insights for designing reparametrization functions that lead to specific implicit biases which are closely connected to ℓp- or trigonometric regularizers.
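
A small experiment in the spirit of the abstract: for an underdetermined linear regression problem, gradient descent on the direct parametrization is compared with gradient descent on the reparametrization w = u*u - v*v from a small identical initialization, which is known to induce an l1-like implicit bias. The problem sizes, initialization scale, step sizes, and iteration counts below are illustrative choices.

import numpy as np

rng = np.random.default_rng(5)
m, n = 10, 40
A = rng.standard_normal((m, n))
w_star = np.zeros(n); w_star[:3] = [2.0, -1.5, 1.0]   # sparse ground truth
y = A @ w_star

def grad(w):
    return A.T @ (A @ w - y)

# (a) Gradient descent on w directly, from zero initialization:
#     converges to the minimum-l2-norm interpolant.
w = np.zeros(n)
for _ in range(50000):
    w -= 1e-3 * grad(w)

# (b) Reparametrization w = u*u - v*v with a small identical initialization:
#     the induced implicit bias favors solutions with small l1 norm.
alpha = 1e-3
u = alpha * np.ones(n)
v = alpha * np.ones(n)
for _ in range(200000):
    g = grad(u * u - v * v)
    u, v = u - 1e-3 * 2 * u * g, v + 1e-3 * 2 * v * g

print("l1 norm, direct GD:        ", np.abs(w).sum())
print("l1 norm, reparametrized GD:", np.abs(u * u - v * v).sum())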

MCML Authors: Johannes Maly


[1]
J. Maly and R. Saab.
A simple approach for quantizing neural networks.
Preprint (Apr. 2023). arXiv
Abstract

In this short note, we propose a new method for quantizing the weights of a fully trained neural network. A simple deterministic pre-processing step allows us to quantize network layers via memoryless scalar quantization while preserving the network performance on given training data. On one hand, the computational complexity of this pre-processing slightly exceeds that of state-of-the-art algorithms in the literature. On the other hand, our approach does not require any hyper-parameter tuning and, in contrast to previous methods, allows a plain analysis. We provide rigorous theoretical guarantees in the case of quantizing single network layers and show that the relative error decays with the number of parameters in the network if the training data behaves well, e.g., if it is sampled from suitable random distributions. The developed method also readily allows the quantization of deep networks by consecutive application to single layers.
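
The snippet below illustrates only the memoryless scalar quantization component on a stand-in weight matrix: each weight is independently rounded to a uniform grid and the effect on the layer output is measured on random data. The deterministic pre-processing step that the paper combines with this quantizer is not reproduced; the grid size, dimensions, and data are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(6)
n_in, n_out, n_samples, n_levels = 128, 64, 512, 16

W = rng.standard_normal((n_in, n_out)) / np.sqrt(n_in)   # stand-in for a trained layer
X = rng.standard_normal((n_samples, n_in))               # stand-in for training data

# Memoryless scalar quantization: each weight is rounded independently to the
# nearest point of a uniform grid covering roughly [-w_max, w_max].
w_max = np.abs(W).max()
delta = 2 * w_max / (n_levels - 1)
W_q = np.round(W / delta) * delta

rel_weight_err = np.linalg.norm(W_q - W) / np.linalg.norm(W)
rel_output_err = np.linalg.norm(X @ W_q - X @ W) / np.linalg.norm(X @ W)
print(f"relative weight error: {rel_weight_err:.3f}, relative output error: {rel_output_err:.3f}")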

MCML Authors: Johannes Maly