MAGiC-SLAM: Multi-Agent Gaussian Globally Consistent SLAM

University of Amsterdam

MAGiC-SLAM can reconstruct a renderable 3D scene from RGBD streams of multiple simultaneously operating agents.

Abstract

Simultaneous localization and mapping (SLAM) systems with novel view synthesis capabilities are widely used in computer vision, with applications in augmented reality, robotics, and autonomous driving. However, existing approaches are limited to single-agent operation. Recent work has addressed this problem using a distributed neural scene representation. Unfortunately, existing methods are slow, cannot accurately render real-world data, are restricted to two agents, and have limited tracking accuracy.

In contrast, we propose a rigidly deformable 3D Gaussian-based scene representation that dramatically speeds up the system. However, improving tracking accuracy and reconstructing a consistent map from multiple agents remains challenging due to trajectory drift and discrepancies across agents' observations. Therefore, we propose new tracking and map-merging mechanisms and integrate loop closure in the Gaussian-based SLAM pipeline. We evaluate MAGiC-SLAM on synthetic and real-world datasets and find it more accurate and faster than the state of the art.

Method Overview

MAGiC-SLAM Architecture. Agent Side: Each agent processes a separate RGBD stream, maintaining a local sub-map and estimating its trajectory. When an agent starts a new sub-map, it sends the previous sub-map and image features to the centralized server. Server Side: The server stores the image features and sub-maps from all agents and performs loop closure detection, loop constraint estimation, and pose graph optimization. It then updates the stored sub-maps and returns the optimized poses to the agents. Once the algorithm completes (denoted by green arrows), the server merges the accumulated sub-maps into a single unified map and refines it.
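The server-side update described above can be illustrated with a minimal sketch. It assumes that each sub-map stores its Gaussian centres in a local frame together with a single sub-map-to-world pose, so that pose graph optimization moves the whole sub-map rigidly. The names `RigidTransform`, `SubMap`, `apply_optimized_poses`, and `merge_submaps` are hypothetical and are not taken from the MAGiC-SLAM code base. A fuller version would also rotate the Gaussian covariances and refine the merged map, both of which are omitted here.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Tuple

Vec3 = Tuple[float, float, float]
Mat3 = Tuple[Vec3, Vec3, Vec3]
SubMapKey = Tuple[int, int]  # (agent id, sub-map index); hypothetical keying


@dataclass
class RigidTransform:
    rotation: Mat3     # 3x3 rotation matrix, row-major
    translation: Vec3  # translation applied after the rotation

    def apply(self, p: Vec3) -> Vec3:
        """Return R @ p + t for a single 3D point."""
        r, t = self.rotation, self.translation
        return tuple(
            sum(r[i][j] * p[j] for j in range(3)) + t[i] for i in range(3)
        )


@dataclass
class SubMap:
    key: SubMapKey
    means: List[Vec3]     # Gaussian centres in the sub-map's local frame
    pose: RigidTransform  # sub-map-to-world transform (assumed layout)


def rot_z(theta: float) -> Mat3:
    """Rotation by theta radians about the z axis."""
    c, s = math.cos(theta), math.sin(theta)
    return ((c, -s, 0.0), (s, c, 0.0), (0.0, 0.0, 1.0))


def apply_optimized_poses(
    submaps: List[SubMap], optimized: Dict[SubMapKey, RigidTransform]
) -> None:
    """Overwrite sub-map poses with optimized ones.

    The local Gaussian centres are left untouched, so each sub-map
    moves as a rigid body when its pose changes.
    """
    for sm in submaps:
        if sm.key in optimized:
            sm.pose = optimized[sm.key]


def merge_submaps(submaps: List[SubMap]) -> List[Vec3]:
    """Concatenate all sub-maps into one list of world-frame centres."""
    merged: List[Vec3] = []
    for sm in submaps:
        merged.extend(sm.pose.apply(m) for m in sm.means)
    return merged
```

In this sketch, a corrected pose for one sub-map changes where its Gaussians land in the merged map without touching the stored local centres, which is one simple way to keep sub-map updates cheap after loop closure.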

Rendering Results

Comparison with the recent multi-agent NVS-SLAM method

We compare our method with other recent pipelines through side-by-side renderings of the desk_1 scene from the TUM-RGBD dataset.

Novel View Synthesis

MAGiC-SLAM effectively merges maps from multiple agents to enable novel view synthesis. Thanks to Gaussian Splatting, these scenes can be rendered in real time.

BibTeX

@misc{yugay2024magicslammultiagentgaussianglobally,
      title={MAGiC-SLAM: Multi-Agent Gaussian Globally Consistent SLAM}, 
      author={Vladimir Yugay and Theo Gevers and Martin R. Oswald},
      year={2024},
      eprint={2411.16785},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2411.16785}, 
}