This book offers a comprehensive exploration of the theoretical foundations of deep learning, bridging the gap between its ever-increasing empirical success and rigorous mathematical understanding. With a focus on interdisciplinary approaches, it illuminates the interplay between mathematics, statistics, and computer science to address pressing challenges in deep learning research. While deep learning continues to revolutionize industries such as healthcare, gaming, and autonomous systems, its applications to mathematical problems such as inverse problems and partial differential equations mark a paradigm shift in scientific inquiry. Yet this rapid progress often outpaces our theoretical understanding, leaving critical questions about expressivity, optimization, robustness, and fairness unanswered.
This book aims to close this gap by presenting perspectives from mathematics, statistics, and applications. This interdisciplinary endeavor draws upon diverse mathematical frameworks, including algebraic geometry, differential geometry, stochastics, functional analysis, and topology, alongside fundamental tools from statistics and theoretical computer science. By synthesizing these fields, the book not only advances the theoretical foundations of deep learning but also sets the stage for novel applications across scientific domains. Targeted at researchers, advanced students, and professionals in mathematics, computer science, and related fields, this book serves as both an introduction to emerging theoretical insights and a roadmap for interdisciplinary collaboration. It will be of value to anyone seeking to understand or to contribute to the future of deep learning.
Table of Contents:
Part I. Expressivity, Optimization and Generalization
Chapter 1. Constraining the Outputs of ReLU Neural Networks
Chapter 2. Implicit Bias for (Complex) Diagonal Linear Networks
Chapter 3. A Dynamical Systems Perspective on the Analysis of Neural Networks
Chapter 4. High-Dimensional Feedback Control with Deep Learning
Chapter 5. Mean-Field Analysis for Interacting Particle Systems in Machine Learning Applications
Chapter 6. Reproducing Kernel Hilbert Spaces for Neural Networks
Chapter 7. On the Convergence of Variational Deep Learning to Entropy Sums
Chapter 8. Pontryagin Maximum Principles for Training Neural ODEs
Chapter 9. Theoretical Foundations of Representation Learning Using Unlabeled Data: Statistics and Optimization
Chapter 10. Deep Assignment Flows for Structured Prediction: Recent Results
Chapter 11. Multilevel Architectures and Algorithms in Deep Learning
Chapter 12. Connecting Parameter Magnitudes and Hessian Eigenspaces at Scale Using Sketched Methods
Part II. Safety, Robustness, Interpretability and Fairness
Chapter 13. A Unified Framework for Adversarial Generalization in Machine Learning
Chapter 14. Adversarially Robust Foundation Models
Chapter 15. Provable Robustness Certification of Graph Neural Networks
Chapter 16. Explainable Learning-Based Regularization of Inverse Problems
Chapter 17. Provable Mixed-Noise Learning with Flow-Matching
Chapter 18. A Tight Approximability Analysis of Boosting
Part III. Inverse Problems and Partial Differential Equations
Chapter 19. End-to-End Neural Networks for Inverse Problems: Performance and Robustness Guarantees
Chapter 20. A Neural Operator View on U-Nets for Inverse Imaging Problems
Chapter 21. Structure-Preserving Surrogate Models for the Boltzmann Equation
Chapter 22. Harmonic Artificial Intelligence Based on Linear Operators
Chapter 23. Curse-of-Dimensionality-Free Nonlinear Monte Carlo and Deep Neural Network Approximations for MDPs and Semilinear PDEs: Existence Results and Limitations
Chapter 24. Neural and Tensor Networks for High-Dimensional Parametric PDEs and Sampling
Chapter 25. Error Analysis for the Deep Kolmogorov Method
About the Authors:
Christopher Bülte is a doctoral student at LMU Munich under the supervision of Prof. Dr. Gitta Kutyniok. He is also the coordinator of the DFG Priority Programme SPP 2298 "Theoretical Foundations of Deep Learning". His research experience lies in uncertainty quantification and the statistics of neural networks, and he applies his methods in fields such as meteorology and quantum physics.
Martin Burger is a leading scientist and head of the Computational Imaging working group at DESY and a professor of mathematics at the University of Hamburg. He works on developing novel mathematical tools with the aim of improving the reconstruction of scientific images. His group is also concerned with the mathematical foundations of machine learning and algorithms in data science. Martin has a background in applied mathematics, specialising in inverse problems, mathematical modelling, and PDEs. He has received numerous awards and has been an invited speaker at all major international conferences in his field. Before joining DESY, he was a professor at the University of Münster and at FAU Erlangen-Nürnberg.
Matthias Hein is a Bosch-endowed professor of machine learning at the University of Tübingen. He is a member of the Excellence Cluster "Machine Learning: New Perspectives for Science" at the Tübingen AI Center. His main research interests include making machine learning systems robust, safe, and explainable, and providing theoretical foundations for machine learning, in particular deep learning. He serves regularly as an area chair for ICML, NeurIPS, and ICLR, and was an action editor for JMLR from 2013 to 2018. He is an ELLIS Fellow and has been awarded the German Pattern Recognition Award, an ERC Starting Grant, and several best paper awards (NeurIPS, COLT, and ALT).
Gitta Kutyniok holds the Bavarian AI Chair for the Mathematical Foundations of AI at LMU Munich and is affiliated with DLR and the University of Tromsø. She earned her diploma and Ph.D. in mathematics and computer science at the University of Paderborn and completed her habilitation at JLU Giessen. Before joining LMU, she was a professor at the University of Osnabrück and at TU Berlin (Einstein Chair), and held visiting positions at Princeton, Stanford, and Yale. Her research targets reliable and sustainable AI and scientific discovery, with applications in the life sciences and robotics. She is a SIAM Fellow (2019), an IEEE Fellow (2024), and a member of BBAW and EurASc; she has delivered invited and plenary lectures at ICM 2022 and ICIAM 2023, and serves as LMU Director of the Konrad Zuse School of Excellence in Reliable AI.
Sebastian Pokutta is the Vice President of the Zuse Institute Berlin (ZIB) and a professor of mathematics at TU Berlin, with a research focus on artificial intelligence and optimization. He holds leadership roles at the Cluster of Excellence MATH+ and the Research Campus MODAL. He earned his diploma and Ph.D. in mathematics from the University of Duisburg-Essen, followed by a visiting lectureship at MIT. Pokutta was the David M. McKenney Family Associate Professor and the Founding Associate Director of the Machine Learning @ GT Center at Georgia Tech, as well as a professor at the University of Erlangen-Nürnberg. His contributions have been recognized with several honors, including the Gödel Prize (2023), a STOC Test of Time Award (2022), and the NSF CAREER Award (2015), alongside various best paper and early career awards.
Ingo Steinwart is a professor in the Department of Mathematics at the University of Stuttgart. He is a founding faculty member of the International Max Planck Research School for Intelligent Systems in Stuttgart and Tübingen and the Founding Co-Director of the ELLIS Unit Stuttgart. Prior to his position at Stuttgart, he worked at Los Alamos National Laboratory and as an adjunct associate professor of computer science at the University of California, Santa Cruz. His research focuses on various mathematical aspects of machine learning.