LayerPano3D: Layered 3D Panorama for Hyper-Immersive Scene Generation

Shuai Yang1,4*, Jing Tan2,4*, Mengchen Zhang3,4, Tong Wu2,4✉, Yixuan Li2,4, Gordon Wetzstein5, Ziwei Liu6, Dahua Lin2,4

1 Shanghai Jiao Tong University, 2 The Chinese University of Hong Kong, 3 Zhejiang University,
4 Shanghai AI Laboratory, 5 Stanford University, 6 S-Lab, Nanyang Technological University

We generate hyper-immersive panoramic scenes from a single text description.

Abstract

3D immersive scene generation is a challenging yet critical task in computer vision and graphics. A desirable virtual 3D scene should 1) exhibit omnidirectional view consistency, and 2) allow for large-range exploration in complex scene hierarchies. Existing methods either rely on successive scene expansion via inpainting or employ a panorama representation to capture large-FOV scene environments. However, the generated scenes suffer from semantic drift during expansion and cannot handle occlusion among scene hierarchies. To tackle these challenges, we introduce LayerPano3D, a novel framework for full-view, explorable panoramic 3D scene generation from a single text prompt. Our key insight is to decompose a reference 2D panorama into multiple layers at different depth levels, where each layer reveals the space unseen from the reference views via a diffusion prior. LayerPano3D comprises multiple dedicated designs: 1) We introduce a new panorama dataset, Upright360, comprising 9k high-quality, upright panorama images, and finetune the advanced Flux model on Upright360 for high-quality, upright, and consistent panorama generation. 2) We pioneer the Layered 3D Panorama as the underlying representation to manage complex scene hierarchies, and lift it into 3D Gaussians to splat detailed 360-degree omnidirectional scenes with unconstrained viewing paths. Extensive experiments demonstrate that our framework generates state-of-the-art 3D panoramic scenes in both full-view consistency and immersive exploratory experience. We believe that LayerPano3D holds promise for advancing 3D panoramic scene creation with numerous applications.



Gallery


Free Rendering

Autumn park scene with people sitting on benches surrounded by colorful trees, storybook illustration style.


Method Overview

Explanation Overview

Overview of the pipeline. Our framework consists of three stages: reference panorama generation, multi-layer panorama construction, and panoramic 3D scene optimization. LayerPano3D provides a fully automatic generation pipeline that requires no manual effort to design scene-specific navigation paths for expansion or completion.
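To make the idea of layers at different depth levels concrete, the toy sketch below bins the pixels of a panorama depth map into layers using fixed depth thresholds, then marks which pixels would have to be filled in to expose a given layer. The thresholds and the function names `assign_layers` and `occluder_mask` are assumptions made for illustration only; this is not the layer construction procedure used by LayerPano3D.

```python
from bisect import bisect_right
from typing import List


def assign_layers(depth: List[List[float]], boundaries: List[float]) -> List[List[int]]:
    """Give every pixel a layer index, where 0 is the nearest layer.

    `boundaries` is an ascending list of depth thresholds (hypothetical
    values chosen by the caller). A pixel with depth d lands in the layer
    equal to the number of thresholds that are <= d.
    """
    return [[bisect_right(boundaries, d) for d in row] for row in depth]


def occluder_mask(layers: List[List[int]], k: int) -> List[List[bool]]:
    """Mark pixels that lie in front of layer k.

    In a layered scheme these are the regions whose content would be
    removed and completed (for example by an inpainting model) so that
    layer k shows what the nearer layers were hiding.
    """
    return [[layer < k for layer in row] for row in layers]


# Tiny 2x3 "panorama" depth map, in arbitrary units.
depth_map = [
    [1.0, 5.0, 20.0],
    [2.0, 8.0, 50.0],
]
layers = assign_layers(depth_map, boundaries=[3.0, 10.0])
mask_for_layer_1 = occluder_mask(layers, 1)
```

Here the first column falls in the nearest layer, so it is the only region flagged when exposing layer 1.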

LayerPano3D robustly renders consistent novel panorama images at locations other than the original camera position at the center of the scene.
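The geometry behind viewing a panorama from a new location can be sketched as follows: a pixel of an equirectangular panorama with known depth is lifted to a 3D point, and that point is then projected into a panorama centered at a different camera position. The axis convention (y up, z forward at the image center) and the helper names `lift` and `project` are assumptions for this sketch; the actual system renders 3D Gaussians rather than reprojecting individual pixels.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]


def lift(u: float, v: float, depth: float, width: int, height: int) -> Vec3:
    """Lift equirectangular pixel (u, v) with radial depth to a 3D point.

    Pixel centers sit at half-integer offsets; longitude spans [-pi, pi)
    left to right and latitude spans pi/2 (top) to -pi/2 (bottom).
    """
    lon = (u + 0.5) / width * 2.0 * math.pi - math.pi
    lat = math.pi / 2.0 - (v + 0.5) / height * math.pi
    x = depth * math.cos(lat) * math.sin(lon)
    y = depth * math.sin(lat)
    z = depth * math.cos(lat) * math.cos(lon)
    return (x, y, z)


def project(point: Vec3, camera: Vec3, width: int, height: int) -> Tuple[float, float, float]:
    """Project a 3D point into a panorama centered at `camera`.

    Returns (u, v, range), where range is the distance from the new
    camera position to the point.
    """
    dx, dy, dz = (point[0] - camera[0], point[1] - camera[1], point[2] - camera[2])
    rng = math.sqrt(dx * dx + dy * dy + dz * dz)
    if rng == 0.0:
        raise ValueError("point coincides with the camera position")
    lon = math.atan2(dx, dz)
    # Clamp to guard against rounding pushing the ratio outside [-1, 1].
    lat = math.asin(max(-1.0, min(1.0, dy / rng)))
    u = (lon + math.pi) / (2.0 * math.pi) * width - 0.5
    v = (math.pi / 2.0 - lat) / math.pi * height - 0.5
    return (u, v, rng)


W, H = 64, 32
p = lift(10, 20, 3.0, W, H)
same_view = project(p, (0.0, 0.0, 0.0), W, H)
# Move the camera one unit toward the point along its viewing ray.
moved_view = project(p, lift(10, 20, 1.0, W, H), W, H)
```

Projecting back into the original camera recovers the starting pixel and depth, while moving the camera along the viewing ray keeps the pixel fixed and shortens the range.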

Additional Results in Multi-Layer 3D Representation (Image Guided)

Citation


@article{yang2024layerpano3d,
  title={LayerPano3D: Layered 3D Panorama for Hyper-Immersive Scene Generation},
  author={Yang, Shuai and Tan, Jing and Zhang, Mengchen and Wu, Tong and Li, Yixuan and Wetzstein, Gordon and Liu, Ziwei and Lin, Dahua},
  journal={arXiv preprint arXiv:2408.13252},
  year={2024}
}