While recent developments undoubtedly demonstrate the power of deep learning, we still lack a fundamental understanding of why overparameterized models generalize so well in practice. A common explanation attributes this phenomenon to the implicit regularization induced by first-order optimization methods such as SGD. However, recent work has shown that even zeroth-order guess-and-check optimizers frequently land on well-generalizing minima. In this work, we give a mathematical formulation of this heuristic, known as the volume hypothesis. We then fully develop existing research ideas which, from a tropical geometric perspective, introduce a dual representation of fully connected feedforward ReLU networks. This abstraction offers what is, to the best of our knowledge, a novel perspective for studying the volume hypothesis. While deriving general results remains challenging, we analyze several low-dimensional examples, some inspired by Telgarsky’s sawtooth construction, which support the volume hypothesis. In particular, using the tropical geometric framework, we argue that exponentially complex minima of the loss landscape are unstable, leading learning algorithms to converge to solutions in which the network does not fully utilize its available expressivity. Our work provides a novel perspective on the generalization of deep ReLU networks, and we hope it inspires further theoretical and empirical research toward more general results.
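
For reference, a minimal sketch of the sawtooth construction mentioned above, stated in its standard form rather than in this paper's own notation (the symbol $\Delta$ and the specific weights are illustrative assumptions): the building block is the tent map on $[0,1]$, itself expressible as a two-neuron ReLU network,
\[
  \Delta(x) \;=\; 2\,\mathrm{ReLU}(x) \;-\; 4\,\mathrm{ReLU}\!\left(x - \tfrac{1}{2}\right)
  \;=\;
  \begin{cases}
    2x, & 0 \le x \le \tfrac{1}{2},\\[2pt]
    2(1-x), & \tfrac{1}{2} < x \le 1,
  \end{cases}
\]
and the $k$-fold composition
\[
  \Delta^{(k)} \;=\; \underbrace{\Delta \circ \cdots \circ \Delta}_{k \text{ times}}
\]
is a sawtooth with $2^{k-1}$ teeth ($2^k$ linear pieces), realizable by a ReLU network of depth $O(k)$ with $O(k)$ units, whereas approximating it with a shallow network requires exponentially many units. This exponential growth of linear regions with depth is what makes such examples natural test cases for the volume hypothesis.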