Antoine Guédon
IMAGINE / LIGM
École des Ponts ParisTech (ENPC)
Date: September 25th, 2025
Time: 05:00 PM
Duration of the presentation: Approximately 45 minutes
Duration of the questions: Between 1 and 2.5 hours
Reception: A reception will be held around 9 PM after the defense, at La Petite Place, 202 rue du Faubourg Saint-Antoine, 75012 Paris.
Venue: Amphithéâtre Caquot, Coriolis building
Address:
École des Ponts ParisTech
6-8, Av Blaise Pascal - Cité Descartes
Champs-sur-Marne
77455 Marne-la-Vallée cedex 2
France
Thesis Advisors:
• Vincent Lepetit (ENPC)
• Pascal Monasse (ENPC)
Jury Members:
• Valérie Gouet (LaSTIG - IGN) - President
• Andrea Tagliasacchi (Simon Fraser University) - Reviewer
• Maks Ovsjanikov (École polytechnique) - Reviewer
• Claire Dune (Université de Toulon) - Examiner
The defense is public and open to all.
Online streaming: Link is coming soon.
Contact: antoine (dot) guedon (at) enpc.fr for any questions
This thesis addresses two fundamental challenges in computer vision: autonomous scene exploration and photorealistic 3D reconstruction.
While recent advances in neural rendering have revolutionized the field of 3D reconstruction, existing methods face significant limitations.
They typically require dense sets of carefully captured images, struggle with geometric ambiguities, and often produce representations that are difficult to edit or integrate into standard graphics pipelines. Additionally, the challenge of efficiently acquiring these input images remains largely unexplored, creating a barrier for non-expert users attempting to create high-quality 3D content.
We present five complementary contributions that progressively tackle these challenges. For acquiring optimal input images through autonomous scene exploration, we first introduce SCONE and MACARONS. For photorealistic 3D reconstruction, we then propose SuGaR, Gaussian Frosting, and MAtCha. More details are provided in the following sections.
Together, these contributions advance both autonomous exploration and 3D reconstruction by providing more practical, efficient, and accessible solutions for real-world applications. Our work bridges the gap between neural rendering and traditional computer graphics while making high-quality 3D reconstruction more accessible to non-expert users. The developed methods have potential applications across various domains, from virtual reality and digital content creation to robotics and cultural heritage preservation.
This thesis encompasses five research projects that advance the state of the art in 3D reconstruction and neural rendering.
The first part of this thesis focuses on autonomous exploration strategies for efficiently capturing optimal input images for 3D reconstruction.
NeurIPS 2022 (Spotlight)
A novel approach to solving the Next Best View problem for dense 3D reconstruction in unknown environments. Our method scales to large 3D scenes and handles completely free camera motion at inference.
We introduce SCONE, a novel mathematical framework for surface coverage-based exploration using depth sensors. Unlike previous approaches that rely on reinforcement learning, SCONE is trained with supervised learning, which enables efficient exploration of arbitrarily large scenes with unrestricted camera motion. The framework's purely geometric formulation allows for better generalization to unseen scenes, making it particularly valuable for real-world applications.
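To make the coverage objective concrete, below is a minimal geometric sketch of next-best-view selection by coverage gain. It is an illustrative stand-in, not SCONE itself: SCONE predicts visibility and coverage with neural networks, whereas this toy version scores candidate cameras with a crude range-and-cone visibility proxy, and every name in it is hypothetical.

```python
import numpy as np

def visible_mask(points, cam_pos, max_range=5.0, fov_cos=0.5):
    """Crude visibility proxy: points within range and inside a viewing
    cone, with the camera assumed to look at the scene origin."""
    view_dir = -cam_pos / np.linalg.norm(cam_pos)
    rel = points - cam_pos
    dist = np.linalg.norm(rel, axis=1)
    cos_angle = (rel @ view_dir) / np.maximum(dist, 1e-8)
    return (dist < max_range) & (cos_angle > fov_cos)

def next_best_view(points, covered, candidates):
    """Pick the candidate camera with the largest coverage gain, i.e. the
    one that sees the most surface points not covered yet."""
    gains = [np.count_nonzero(visible_mask(points, cam) & ~covered)
             for cam in candidates]
    best = int(np.argmax(gains))
    return candidates[best], gains[best]

# Toy scene: surface points on a unit sphere, candidate cameras on a ring.
rng = np.random.default_rng(0)
pts = rng.normal(size=(2000, 3))
pts /= np.linalg.norm(pts, axis=1, keepdims=True)
covered = visible_mask(pts, np.array([3.0, 0.0, 0.0]))     # first view
angles = np.linspace(0, 2 * np.pi, 12, endpoint=False)
cams = np.stack([3 * np.cos(angles), 3 * np.sin(angles), np.zeros(12)], axis=1)
best_cam, gain = next_best_view(pts, covered, cams)
print("next best view:", best_cam, "newly covered points:", gain)
```

Greedily repeating this selection and updating `covered` yields a simple exploration loop; the learned model in the thesis replaces the geometric proxy with predicted coverage gains.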
CVPR 2023
A method that simultaneously learns to explore new large environments and to reconstruct them in 3D from color images only, in a self-supervised fashion.
We present MACARONS, which eliminates the need for depth sensors and explicit 3D ground truth by leveraging self-supervised learning from RGB images alone. This advancement enables online learning in large unknown environments, making autonomous exploration more practical and cost-effective. MACARONS learns to generate 3D reconstructions directly from RGB images and uses these reconstructions to supervise its coverage predictions, creating a self-improving system suitable for deployment in real-world scenarios.
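The self-improving loop can be sketched as follows. This is only a data-flow illustration under loud assumptions: the networks are tiny placeholder MLPs, `capture_rgb` and `coverage_from_geometry` stand in for frame capture and coverage measurement on the self-built reconstruction, and the photometric self-supervision that trains the depth module in the real system is omitted.

```python
import torch
import torch.nn as nn

# Placeholder modules: "depth" maps image features to geometry features,
# and the coverage head predicts the gain of a candidate camera pose.
depth_net = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 32))
coverage_net = nn.Sequential(nn.Linear(32 + 3, 64), nn.ReLU(), nn.Linear(64, 1))
opt = torch.optim.Adam(
    list(depth_net.parameters()) + list(coverage_net.parameters()), lr=1e-3
)

def capture_rgb(pose):
    # Stand-in for grabbing an RGB frame at `pose` (random features here).
    return torch.randn(1, 64)

def coverage_from_geometry(geom):
    # Stand-in for measuring surface coverage on the reconstruction the
    # system built itself; this measurement is the supervision signal.
    return geom.mean(dim=1, keepdim=True).detach()

for step in range(100):
    pose = torch.randn(1, 3)                    # candidate camera pose
    rgb = capture_rgb(pose)                     # RGB only, no depth sensor
    geom = depth_net(rgb)                       # self-reconstructed geometry
    pred_gain = coverage_net(torch.cat([geom.detach(), pose], dim=1))
    target_gain = coverage_from_geometry(geom)  # self-supervision
    loss = nn.functional.mse_loss(pred_gain, target_gain)
    opt.zero_grad(); loss.backward(); opt.step()
```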
The second part of this thesis develops novel neural rendering approaches for high-quality 3D reconstruction from captured images.
CVPR 2024
A method that extracts accurate and editable meshes from 3D Gaussian Splatting representations within minutes on a single GPU. Enables easy editing using traditional 3D software.
We propose SuGaR, a method that combines explicit surface representation with Gaussian Splatting for fast, high-quality mesh reconstruction. SuGaR first introduces a regularization strategy for extracting explicit surface meshes from Gaussian Splatting representations. It then introduces a hybrid representation that binds Gaussians to the extracted surface meshes and refines the structure through differentiable rendering. This approach achieves photorealistic rendering within remarkably short optimization times (less than an hour) and enables unprecedented fine-grained, mesh-based editing capabilities, including surface sculpting, character animation, and scene compositing, all previously unavailable in radiance field models.
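The binding idea can be illustrated in a few lines of NumPy. This is a toy sketch under assumed conventions (three Gaussians per triangle at fixed barycentric coordinates, a hand-picked flat scale along the face normal), not the paper's exact parameterization; the actual refinement optimizes these quantities through a differentiable rasterizer while keeping the Gaussians attached to the mesh.

```python
import numpy as np

# A tiny two-triangle mesh.
vertices = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [1, 1, 1]], float)
faces = np.array([[0, 1, 2], [1, 3, 2]])

# Fixed barycentric coordinates: three Gaussians per triangle.
barys = np.array([[0.6, 0.2, 0.2], [0.2, 0.6, 0.2], [0.2, 0.2, 0.6]])

tri = vertices[faces]                             # (F, 3, 3) triangle corners
centers = np.einsum('gk,fkd->fgd', barys, tri)    # Gaussian means on faces
normals = np.cross(tri[:, 1] - tri[:, 0], tri[:, 2] - tri[:, 0])
normals /= np.linalg.norm(normals, axis=1, keepdims=True)

# Thin scale along the normal so each Gaussian hugs the surface; scales,
# colors, and opacities would be optimized during differentiable rendering.
scales = np.broadcast_to([0.1, 0.1, 0.01], centers.reshape(-1, 3).shape)
print(centers.reshape(-1, 3).shape, scales.shape)  # six bound Gaussians
```

Because every Gaussian stays expressed relative to a triangle, deforming the mesh in a 3D tool moves the Gaussians, and hence the rendering, along with it.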
ECCV 2024 (Oral)
A representation that captures complex volumetric effects and flat surfaces by reconstructing and covering meshes with a "Frosting" layer of 3D Gaussians. Enables real-time rendering and traditional animation workflows.
We extend SuGaR with Gaussian Frosting, which significantly improves material handling and rendering efficiency while maintaining SuGaR's interactive features. The method introduces a novel occlusion culling strategy and achieves rendering quality comparable to vanilla Gaussian Splatting. Notably, Frosting excels at reconstructing challenging materials like hair, fur, and grass, which are traditionally difficult to represent with pure surface-based methods but essential for building photorealistic scenes and virtual avatars. To democratize these capabilities, we developed a dedicated Blender add-on that enables artists and content creators to edit, sculpt, combine, and animate their reconstructions without programming expertise.
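Below is a rough sketch of the layer construction, assuming a unit sphere as the base mesh and random per-vertex thickness; in the actual method the thickness is estimated from how fuzzy the underlying material is, and the Gaussians are parameterized relative to the mesh so edits and animations carry the layer along.

```python
import numpy as np

rng = np.random.default_rng(1)
verts = rng.normal(size=(100, 3))
verts /= np.linalg.norm(verts, axis=1, keepdims=True)  # points on a sphere
normals = verts.copy()                                 # sphere: normal = position
thickness = rng.uniform(0.01, 0.2, size=(100, 1))      # per-vertex shell width

# Place a few Gaussians per vertex at random offsets inside the shell
# [-t/2, +t/2] along the normal.
k = 4
offsets = rng.uniform(-0.5, 0.5, size=(100, k, 1)) * thickness[:, None, :]
gaussian_centers = verts[:, None, :] + offsets * normals[:, None, :]
print(gaussian_centers.reshape(-1, 3).shape)           # (400, 3) layer Gaussians
```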
CVPR 2025 (Highlight)
A novel surface representation using Atlases of Charts rendered with 2D Gaussian surfels. Recovers sharp meshes from sparse unposed RGB images within minutes.
Finally, we introduce MAtCha, which tackles the challenge of sparse-view reconstruction. By leveraging learned priors, MAtCha enables high-quality mesh reconstruction from just 3 to 10 input images, a significant advance over previous methods that required hundreds of images and complex preprocessing steps. The method produces sharp, detailed meshes of both foreground and background elements while maintaining high geometric quality across the entire scene. This capability could make high-quality 3D reconstruction accessible to everyday users, helping democratize the technology for widespread adoption.
05:00 PM - Presentation (45 minutes)
05:45 PM - Questions from the Jury (between 1 and 2.5 hours)
08:00 PM - Results Announcement and short toast at the Lab
After Defense - Reception at La Petite Place, 202 rue du Faubourg Saint-Antoine, 75012 Paris

© 2025 Antoine Guédon. You are welcome to copy the code; please attribute the source with a link back to this page.