ECCV 2026
ETH Zürich
Dynamic 3D Gaussian splatting faces a fundamental tension between motion consistency and visual fidelity. Deformation-based approaches preserve temporal correspondence but suffer from motion over-factorization, which oversmooths high-frequency dynamics. In contrast, 4D-primitive methods capture fine visual details yet incur temporal over-parameterization, which breaks object identity and leads to severe storage overhead. To resolve this, we introduce Multi4D, a framework for high-fidelity dynamic Gaussian splatting based on multi-level competitive allocation. Instead of a monolithic representation, we distribute modeling capacity across three structured levels: static structure, persistent dynamic geometry, and transient appearance primitives. Through shared rasterization and residual-driven optimization, these levels dynamically compete to explain photometric error, enabling adaptive specialization without pre-assigned decomposition. This allocation preserves long-term motion consistency while capturing fine dynamic detail, achieving state-of-the-art rendering quality and real-time performance with significantly fewer dynamic primitives. Furthermore, because our representation explicitly tracks compact persistent Gaussians over time, semantic features can be embedded afterward, enabling Multi4D to achieve state-of-the-art 4D segmentation accuracy with an order-of-magnitude speedup.
Multi4D decomposes a dynamic scene into three functionally specialized Gaussian subsets that compete under a shared photometric objective: Static Gaussians anchor the time-invariant structure; Persistent Dynamic Gaussians model long-term, trackable motion through a geometry-only deformation field; and Transient Gaussians (4D primitives) absorb high-frequency appearance residuals. All subsets are rendered in a single differentiable pass. Shared transmittance couples their gradients and induces competition: once one subset explains a region, residual-driven densification in the others is suppressed. A bottom-up training strategy with velocity-aware periodic lifting and mask-aware utility-based pruning yields compact, specialized representations, and the persistent subset can be frozen for fast, accurate 4D semantic embedding.
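The coupling through shared transmittance can be illustrated with standard front-to-back alpha compositing along a single ray. The sketch below is not the Multi4D rasterizer; the function names, the subset ids (0 = static, 1 = persistent, 2 = transient), and the example opacities are assumptions made for illustration. Since the rendered color is a weighted sum of per-Gaussian colors, the gradient reaching each Gaussian's color equals its blending weight, so a subset hidden behind an already-opaque Gaussian receives little gradient.

```python
import numpy as np

def composite_weights(alphas):
    """Front-to-back blending weights w_i = alpha_i * prod_{j<i} (1 - alpha_j).

    `alphas` are per-Gaussian opacities along one ray, sorted near to far.
    """
    alphas = np.asarray(alphas, dtype=np.float64)
    # Transmittance reaching Gaussian i: light not absorbed by anything in front of it.
    transmittance = np.concatenate(([1.0], np.cumprod(1.0 - alphas)[:-1]))
    return alphas * transmittance

def per_subset_contribution(alphas, subset_ids, num_subsets=3):
    """Sum blending weights per subset (hypothetical ids: 0 static, 1 persistent, 2 transient)."""
    weights = composite_weights(alphas)
    return np.bincount(subset_ids, weights=weights, minlength=num_subsets)

# Illustrative ray: a nearly opaque persistent Gaussian in front,
# followed by a static and a transient Gaussian.
contrib = per_subset_contribution(alphas=[0.9, 0.5, 0.5], subset_ids=[1, 0, 2])
print(contrib)  # approximately [0.05, 0.9, 0.025] for (static, persistent, transient)
```

In this toy ray the persistent Gaussian takes 90% of the pixel, leaving the static and transient Gaussians with weights of 0.05 and 0.025. This is the sense in which one subset explaining a region leaves little residual signal for the others.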
Comparisons against state-of-the-art deformation-based and 4D-primitive baselines across three datasets. Each clip shows Ground Truth, a baseline, and Ours (left → right) with per-frame metrics rendered in-video; some clips play at reduced speed for clarity.
| Method | PSNR ↑ | DSSIM ↓ | LPIPS ↓ | FPS ↑ |
|---|---|---|---|---|
| DyNeRF | 31.80 | — | 0.1400 | 0.02 |
| HyperReel | 32.70 | 0.047 | 0.1090 | 4.0 |
| 4DGaussian | 30.86 | 0.071 | 0.1647 | 35 |
| Def-3DGS | 30.95 | 0.070 | 0.1553 | 76 |
| E-D3DGS | 32.89 | 0.049 | 0.1114 | 79 |
| 4DGS | 32.07 | 0.054 | 0.1189 | 55 |
| STG | 33.35 | 0.040 | 0.0846 | 86 |
| Multi4D (Ours) | 34.30 | 0.037 | 0.0704 | 161 |
| Method | PSNR ↑ | DSSIM ↓ | LPIPS ↓ | FPS ↑ |
|---|---|---|---|---|
| NeRFPlayer | 30.69 | 0.034 | 0.1110 | 0.05 |
| HyperReel | 31.10 | 0.036 | 0.0985 | 2.0 |
| HexPlane | 31.71 | — | 0.0750 | 0.56 |
| Def-3DGS | 30.98 | 0.033 | 0.0594 | 29 |
| 4DGaussian | 31.12 | 0.032 | 0.0588 | 53 |
| DeGauss | 31.52 | 0.029 | 0.0475 | 157 |
| E-D3DGS | 31.20 | 0.026 | 0.0369 | 70 |
| 4DGS | 31.57 | 0.029 | 0.0573 | 114 |
| STG | 32.04 | 0.026 | 0.0441 | 140 |
| Multi4D (Ours) | 32.30 | 0.026 | 0.0440 | 217 |
Monocular setting. Clips show four panels: GT · 4DGaussian · 4DGS · Ours.
| Method | PSNR ↑ | DSSIM ↓ | LPIPS ↓ |
|---|---|---|---|
| NeRF-DS | 23.24 | 0.081 | 0.2402 |
| HyperNeRF | 19.01 | 0.092 | 0.2615 |
| Def-3DGS | 23.43 | 0.086 | 0.2201 |
| 4DGaussian | 22.79 | 0.088 | 0.2115 |
| 4DGS | 21.51 | 0.108 | 0.3390 |
| STG | 22.54 | 0.089 | 0.3145 |
| Multi4D (Ours) | 23.69 | 0.077 | 0.1903 |
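For readers unfamiliar with the image-quality columns in the tables above, the following minimal sketch shows how PSNR is computed and how DSSIM relates to SSIM. The page does not state which DSSIM convention is used; the form `(1 - SSIM) / 2` below is a common choice and is an assumption here, as are the function names and the `[0, 1]` image range.

```python
import numpy as np

def psnr(pred, target, max_val=1.0):
    """Peak signal-to-noise ratio in dB for images scaled to [0, max_val]."""
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    mse = np.mean((pred - target) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_val ** 2 / mse))

def dssim_from_ssim(ssim_value):
    """Structural dissimilarity, assuming the (1 - SSIM) / 2 convention."""
    return (1.0 - ssim_value) / 2.0

# A uniform error of 0.1 on a [0, 1] image gives MSE = 0.01.
target = np.zeros((4, 4, 3))
pred = target + 0.1
print(psnr(pred, target))
```

Higher PSNR and lower DSSIM both indicate a rendering closer to the ground truth, which matches the arrows in the table headers.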
Because Multi4D explicitly tracks a compact set of persistent Gaussians, semantic features can be embedded afterward, yielding state-of-the-art 4D segmentation with an order-of-magnitude speedup. Segmentation results are compared against TRASE; the tracking comparison uses 2D Co-Tracker as a point-based reference.
| Method | mIoU ↑ | mAcc ↑ |
|---|---|---|
| OpenGaussian | 0.8178 | 0.9899 |
| SA4D | 0.8832 | 0.9931 |
| TRASE | 0.8932 | 0.9938 |
| Multi4D (Ours) | 0.9142 | 0.9952 |
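The segmentation metrics in the table above can be sketched as follows for binary object masks. The exact evaluation protocol is not given on this page, so the definitions below are assumptions: per-frame IoU and pixel accuracy, each averaged over frames. The function names are also illustrative.

```python
import numpy as np

def mask_iou(pred, gt):
    """Intersection over union of two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        return 1.0  # both masks empty: treat as a perfect match
    return float(np.logical_and(pred, gt).sum() / union)

def mask_accuracy(pred, gt):
    """Fraction of pixels whose predicted label matches the ground truth."""
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    return float(np.mean(pred == gt))

def mean_scores(pred_masks, gt_masks):
    """Average IoU and accuracy over paired frames (assumed protocol)."""
    ious = [mask_iou(p, g) for p, g in zip(pred_masks, gt_masks)]
    accs = [mask_accuracy(p, g) for p, g in zip(pred_masks, gt_masks)]
    return float(np.mean(ious)), float(np.mean(accs))

# Two toy 2x2 frames: one with a false-positive pixel, one predicted exactly.
gt = [np.array([[1, 0], [0, 0]]), np.array([[1, 1], [0, 0]])]
pred = [np.array([[1, 1], [0, 0]]), np.array([[1, 1], [0, 0]])]
miou, macc = mean_scores(pred, gt)
print(miou, macc)
```

Under these definitions, accuracy counts correctly labeled background pixels as well, so it tends to sit closer to 1 than IoU does when the object covers a small part of the frame.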
The BibTeX entry will be added once the paper is published.