We propose MAtCha Gaussians, a novel surface representation for reconstructing
high-quality 3D meshes with photorealistic rendering from sparse-view images.
Our key idea is to model the underlying scene geometry as an Atlas of Charts which we render with 2D
Gaussian surfels.
We initialize the charts with a monocular depth estimation model and refine them using
differentiable Gaussian rendering and a lightweight neural chart deformation model.
Combined with a sparse-view SfM model such as
MASt3R-SfM,
MAtCha can recover sharp and accurate surface meshes
of both foreground and background objects in unbounded scenes within minutes, from only
a few unposed RGB images.
We present a novel appearance model that simultaneously realizes explicit high-quality 3D surface mesh recovery
and photorealistic novel view synthesis from sparse view samples.
Our key idea is to model the underlying scene geometry Mesh as an Atlas of Charts
which we render with 2D Gaussian surfels (MAtCha Gaussians).
MAtCha distills high-frequency scene surface details from an off-the-shelf monocular depth estimator
and refines them through 2D Gaussian surfel rendering.
The Gaussian surfels are attached to the charts on the fly, achieving both the photorealism of neural volumetric rendering
and the crisp geometry of a mesh model, two seemingly contradictory goals, in a single model.
At the core of MAtCha lies a novel neural deformation model and a structure loss that preserve the fine surface details
distilled from learned monocular depths while addressing their fundamental scale ambiguities.
Extensive experimental validation demonstrates MAtCha's state-of-the-art surface reconstruction quality
and photorealism on par with top contenders, but with a dramatic reduction in the number of input views and computation time.
We believe MAtCha will serve as a foundational tool for any visual application in vision, graphics,
and robotics that requires explicit geometry in addition to photorealism.
Given a few RGB images and their camera poses obtained using a sparse-view SfM method such as MASt3R-SfM, we first initialize charts using a pretrained monocular depth estimation model. Each chart is represented as a mesh equipped with a UV map, mapping a 2D plane to the 3D surface.
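As an illustration of what such a chart can look like, here is a minimal sketch that backprojects a single depth map into a grid mesh with a UV map. It assumes a pinhole camera, and the function name `depth_to_chart` and the intrinsics parameters `fx`, `fy`, `cx`, `cy` are hypothetical names chosen for this example; it is not the initialization code used by MAtCha.

```python
import numpy as np

def depth_to_chart(depth, fx, fy, cx, cy):
    """Backproject an (H, W) depth map into a chart: grid mesh plus UV map.

    Assumes a pinhole camera; all names here are illustrative only.
    """
    h, w = depth.shape
    # Pixel grid: u indexes columns, v indexes rows.
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    # Pinhole backprojection into camera coordinates.
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    vertices = np.stack([x, y, depth], axis=-1).reshape(-1, 3)
    # UV map: each vertex keeps its normalized position on the 2D image plane.
    uv = np.stack([u / (w - 1), v / (h - 1)], axis=-1).reshape(-1, 2)
    # Connect neighboring pixels: two triangles per pixel quad.
    idx = np.arange(h * w).reshape(h, w)
    tl, tr = idx[:-1, :-1].ravel(), idx[:-1, 1:].ravel()
    bl, br = idx[1:, :-1].ravel(), idx[1:, 1:].ravel()
    faces = np.concatenate(
        [np.stack([tl, bl, tr], axis=1), np.stack([tr, bl, br], axis=1)], axis=0
    )
    return vertices, uv, faces

# Toy usage: a flat 3x4 depth map at distance 2.
vertices, uv, faces = depth_to_chart(np.full((3, 4), 2.0), 1.0, 1.0, 1.5, 1.0)
```

Because the UV coordinates are simply the normalized pixel positions, the chart is a map from the 2D image plane to the 3D surface, and any later deformation of the vertices keeps that parameterization intact.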
We then optimize our charts and enforce their alignment with input SfM data using three key components:
Figure: meshes extracted with (a) multi-resolution TSDF fusion and (b) adaptive tetrahedralization, each from 10 training views.
Most existing methods relying on 3D Gaussians or 2D Gaussian Surfels
apply TSDF fusion on rendered depth maps to extract a mesh from the volumetric representation.
However, TSDF fusion is limited to bounded scenes and does not allow for extracting high-quality meshes
including both foreground and background objects of the scene.
Moreover, applying TSDF fusion on 2D Gaussian Surfels can over-smooth the geometry, erode fine details,
and produce artifacts, such as "disk-aliasing" patterns on the surface.
To address this, our implementation provides (a) a custom multi-resolution TSDF fusion that covers both foreground and background objects,
and we also propose (b) adapting the tetrahedralization from
Gaussian Opacity Fields (GOF)
to make it compatible with any Gaussian-based method capable of rendering perspective-accurate depth maps.
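For readers unfamiliar with TSDF fusion, the following is a minimal single-resolution sketch of the standard technique: depth maps are averaged into a truncated signed distance volume over a fixed voxel grid. It is not the multi-resolution variant described above; the function name `tsdf_fuse` and all of its parameters are assumptions made for this example. The fixed, axis-aligned grid also shows why the basic method is tied to bounded scenes.

```python
import numpy as np

def tsdf_fuse(depths, poses, K, grid_min, voxel_size, dims, trunc):
    """Fuse depth maps into a TSDF volume with a weighted running average.

    depths: list of (H, W) depth maps; poses: list of 4x4 world-to-camera
    matrices; K: 3x3 pinhole intrinsics. The grid is a bounded box.
    """
    # World coordinates of all voxel centers, shape (N, 3).
    ii, jj, kk = np.meshgrid(*[np.arange(d) for d in dims], indexing="ij")
    cells = np.stack([ii, jj, kk], axis=-1).reshape(-1, 3)
    centers = np.asarray(grid_min) + (cells + 0.5) * voxel_size
    tsdf = np.ones(len(centers))
    weight = np.zeros(len(centers))
    for depth, w2c in zip(depths, poses):
        h, w = depth.shape
        # Transform voxel centers into the camera frame and project them.
        cam = centers @ w2c[:3, :3].T + w2c[:3, 3]
        z = cam[:, 2]
        in_front = z > 1e-6
        z_safe = np.where(in_front, z, 1.0)
        pix = cam @ K.T
        u = np.round(pix[:, 0] / z_safe).astype(int)
        v = np.round(pix[:, 1] / z_safe).astype(int)
        valid = in_front & (u >= 0) & (u < w) & (v >= 0) & (v < h)
        # Look up the observed depth at each projected voxel.
        d = np.zeros(len(centers))
        d[valid] = depth[v[valid], u[valid]]
        # Signed distance along the viewing axis; skip voxels far behind the surface.
        sdf = d - z
        valid &= (d > 0) & (sdf > -trunc)
        new = np.clip(sdf / trunc, -1.0, 1.0)
        tsdf[valid] = (tsdf[valid] * weight[valid] + new[valid]) / (weight[valid] + 1.0)
        weight[valid] += 1.0
    return tsdf.reshape(dims), weight.reshape(dims)

# Toy usage: one camera at the origin looking at a plane at depth 1.
K = np.array([[10.0, 0.0, 8.0], [0.0, 10.0, 8.0], [0.0, 0.0, 1.0]])
tsdf, weight = tsdf_fuse(
    [np.ones((16, 16))], [np.eye(4)], K,
    grid_min=(-0.2, -0.2, 0.5), voxel_size=0.1, dims=(4, 4, 10), trunc=0.3,
)
```

The surface lies at the zero crossing of the volume, which a method such as marching cubes would then turn into a mesh. Anything outside the grid is never fused, and the voxel size caps the level of detail, which is one way to see the over-smoothing and bounded-scene limitations discussed above.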
@article{guedon2024matcha,
  title={MAtCha Gaussians: Atlas of Charts for High-Quality Geometry and Photorealism From Sparse Views},
  author={Gu{\'e}don, Antoine and Ichikawa, Tomoki and Yamashita, Kohei and Nishino, Ko},
  journal={arXiv},
  year={2024},
}
This work was in part supported by
JSPS 20H05951 and 21H04893,
and JST JPMJCR20G7 and JPMJAP2305.
This work was also in part supported by the ERC grant "explorer" (No. 101097259).
This work was granted access to the HPC resources of IDRIS under the allocation
2024-AD011013387R2 made by GENCI.