Name: | Matthias Nießner |
---|---|
Position: | Professor |
E-Mail: | niessner@tum.de |
Phone: | contact via email only |
Room No: | 02.07.042 |
3DGS-LM: Faster Gaussian-Splatting Optimization with Levenberg-Marquardt |
---|
Lukas Höllein, Aljaž Božič, Michael Zollhöfer, Matthias Nießner |
arXiv |
3DGS-LM accelerates Gaussian-Splatting optimization by replacing the ADAM optimizer with Levenberg-Marquardt. We propose a highly efficient GPU parallelization scheme for PCG and implement it in custom CUDA kernels that compute Jacobian-vector products. |
[video][bibtex][project page] |
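To make the optimizer swap concrete, here is a minimal NumPy sketch of one Levenberg-Marquardt step whose normal equations are solved matrix-free with Jacobi-preconditioned conjugate gradients. The paper implements the Jacobian products in fused CUDA kernels; `jvp`/`jtvp` below are toy stand-ins, and all constants are illustrative.

```python
# One LM step: solve (J^T J + lam I) dx = -J^T r with Jacobi-PCG,
# using only Jacobian-vector products (matrix-free, as in the paper's
# CUDA kernels; here plain NumPy stand-ins on a toy problem).
import numpy as np

def jvp(J, v):        # J @ v  (stand-in for a fused CUDA kernel)
    return J @ v

def jtvp(J, v):       # J^T @ v
    return J.T @ v

def lm_step_pcg(J, r, lam=1e-2, iters=50, tol=1e-8):
    n = J.shape[1]
    diag = (J * J).sum(axis=0) + lam          # Jacobi preconditioner
    b = -jtvp(J, r)
    x = np.zeros(n)
    res = b.copy()                            # residual of the linear system
    z = res / diag
    p = z.copy()
    rz = res @ z
    for _ in range(iters):
        Ap = jtvp(J, jvp(J, p)) + lam * p     # (J^T J + lam I) p, matrix-free
        alpha = rz / (p @ Ap)
        x += alpha * p
        res -= alpha * Ap
        if np.linalg.norm(res) < tol:
            break
        z = res / diag
        rz_new = res @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x

# Toy problem: starting at x = 0, residuals are r(0) = -y.
rng = np.random.default_rng(0)
J = rng.standard_normal((100, 10))
y = rng.standard_normal(100)
dx = lm_step_pcg(J, -y)
print("residual norm after step:", np.linalg.norm(J @ dx - y))
```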
SPARF: Large-Scale Learning of 3D Sparse Radiance Fields from Few Input Images |
---|
Abdullah Hamdi, Bernard Ghanem, Matthias Nießner |
arXiv |
SPARF is a large-scale sparse radiance field dataset consisting of ~1 million SRFs with multiple voxel resolutions (32, 128, and 512) and 17 million posed images at a resolution of 400×400. Furthermore, we propose SuRFNet, a pipeline to generate SRFs conditioned on input images, achieving SOTA on ShapeNet novel view synthesis from one or few input images. |
[video][bibtex][project page] |
HeadCraft: Modeling High-Detail Shape Variations for Animated 3DMMs |
---|
Artem Sevastopolsky, Philip Grassal, Simon Giebenhain, ShahRukh Athar, Luisa Verdoliva, Matthias Nießner |
3DV 2025 |
We learn to generate large displacements for parametric head models, such as long hair, with a high level of detail. The displacements can be added to an arbitrary head for animation and semantic editing. |
[code][bibtex][project page] |
Coherent 3D Scene Diffusion From a Single RGB Image |
---|
Manuel Dahnert, Angela Dai, Norman Müller, Matthias Nießner |
NeurIPS 2024 |
We propose a novel diffusion-based method for 3D scene reconstruction from a single RGB image, leveraging an image-conditioned 3D scene diffusion model to denoise object poses and geometries. By learning a generative scene prior that captures inter-object relationships, our approach ensures consistent and accurate reconstructions. |
[bibtex][project page] |
L3DG: Latent 3D Gaussian Diffusion |
---|
Barbara Rössle, Norman Müller, Lorenzo Porzi, Samuel Rota Bulò, Peter Kontschieder, Angela Dai, Matthias Nießner |
SIGGRAPH Asia 2024 |
L3DG proposes generative modeling of 3D Gaussians using a learned latent space. This substantially reduces the complexity of the costly diffusion generation process, allowing higher detail on object-level generation, and scalability to room-scale scenes. |
[video][bibtex][project page] |
GGHead: Fast and Generalizable 3D Gaussian Heads |
---|
Tobias Kirschstein, Simon Giebenhain, Jiapeng Tang, Markos Georgopoulos, Matthias Nießner |
SIGGRAPH Asia 2024 |
GGHead generates photo-realistic 3D heads and renders them at 1K resolution in real time. Thanks to the efficiency of 3D Gaussian Splatting, we no longer need the 2D super-resolution networks that hampered the view consistency of prior work. We adopt a 3D GAN formulation, which allows training GGHead solely from 2D image datasets. |
[video][bibtex][project page] |
NPGA: Neural Parametric Gaussian Avatars |
---|
Simon Giebenhain, Tobias Kirschstein, Martin Rünz, Lourdes Agapito, Matthias Nießner |
SIGGRAPH Asia 2024 |
NPGA is a method to create 3D avatars from multi-view video recordings that can be precisely animated using the expression space of the underlying neural parametric head model. For increased dynamic representational capacity, we leverage per-Gaussian latent features, which are used to condition our deformation MLP. |
[video][bibtex][project page] |
Mesh2NeRF: Direct Mesh Supervision for Neural Radiance Field Representation and Generation |
---|
Yujin Chen, Yinyu Nie, Benjamin Ummenhofer, Reiner Birkl, Michael Paulitsch, Matthias Müller, Matthias Nießner |
ECCV 2024 |
Mesh2NeRF is a method for extracting ground truth radiance fields directly from 3D textured meshes by incorporating mesh geometry, texture, and environment lighting information. Mesh2NeRF serves as direct 3D supervision for neural radiance fields, leveraging mesh data for improving novel view synthesis performance. Mesh2NeRF can function as supervision for generative models during training on mesh collections. |
[video][bibtex][project page] |
Zero-Shot Detection of AI-Generated Images |
---|
Davide Cozzolino, Giovanni Poggi, Matthias Nießner, Luisa Verdoliva |
ECCV 2024 (Oral) |
New generative architectures emerge daily, requiring frequent updates to supervised detectors for synthetic images. To address this challenge, we propose a Zero-shot Entropy-based Detector (ZED) that neither needs AI-generated training data nor relies on knowledge of generative architectures to artificially synthesize their artifacts. The idea is to measure how surprising the image under analysis is compared to a model of real images. To this end, ZED leverages the intrinsic model of real images learned by a lossless image coder. |
[bibtex][project page] |
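The entropy-based scoring can be sketched with a toy stand-in for the learned lossless coder: predict each pixel from its causal neighbors, model the residuals, and flag images whose coding cost deviates from calibration statistics of real images. Everything below (the predictor, the Laplacian residual model, the calibration numbers) is a hypothetical simplification of ZED's actual learned coder.

```python
# Toy sketch of zero-shot scoring: how "surprising" is an image under
# a model of real images? A crude causal predictor (average of the
# top/left neighbors) plus a Laplacian residual model stands in for
# the learned lossless coder, purely to make the idea concrete.
import numpy as np

def coding_cost_bits(img):
    """Approximate bits-per-pixel of img under a causal predictor."""
    x = img.astype(np.float64)
    pred = np.zeros_like(x)
    pred[1:, 1:] = 0.5 * (x[:-1, 1:] + x[1:, :-1])   # top/left average
    resid = (x - pred)[1:, 1:].ravel()
    b = np.mean(np.abs(resid)) + 1e-6                # Laplace scale (MLE)
    # Differential entropy of Laplace(b), in bits, as a cost proxy.
    return (1.0 + np.log(2.0 * b)) / np.log(2.0)

def surprise_score(img, real_mean_bits, real_std_bits):
    """High score = image codes unusually cheaply/expensively."""
    return abs(coding_cost_bits(img) - real_mean_bits) / real_std_bits

rng = np.random.default_rng(1)
img = rng.integers(0, 256, (64, 64))                 # smoke test input
print(coding_cost_bits(img))
print(surprise_score(img, 6.0, 0.5))                 # illustrative stats
```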
Fast Training of Diffusion Transformer with Extreme Masking for 3D Point Clouds Generation |
---|
Shentong Mo, Enze Xie, Yue Wu, Junsong Chen, Matthias Nießner, Zhenguo Li |
ECCV 2024 |
We propose FastDiT-3D, a novel masked diffusion transformer tailored for efficient 3D point cloud generation, which greatly reduces training costs. Our FastDiT-3D utilizes the encoder blocks with 3D global attention and Mixture-of-Experts (MoE) FFN to take masked voxelized point clouds as input. |
[bibtex][project page] |
LightIt: Illumination Modeling and Control for Diffusion Models |
---|
Peter Kocsis, Julien Philip, Kalyan Sunkavalli, Matthias Nießner, Yannick Hold-Geoffroy |
CVPR 2024 |
Recent generative methods lack lighting control, which is crucial to numerous artistic aspects. We propose to condition the generation on shading and normal maps. We model the lighting with single-bounce shading, which includes cast shadows. We first train a shading estimation module to generate a dataset of paired real-world images and shading maps. Then, we train a control network using the estimated shading and normals as input. |
[video][bibtex][project page] |
DiffusionAvatars: Deferred Diffusion for High-fidelity 3D Head Avatars |
---|
Tobias Kirschstein, Simon Giebenhain, Matthias Nießner |
CVPR 2024 |
DiffusionAvatars uses diffusion-based, deferred neural rendering to translate geometric cues from an underlying neural parametric head model (NPHM) into photo-realistic renderings. The underlying NPHM provides accurate control over facial expressions, while the deferred neural rendering leverages the 2D prior of Stable Diffusion to generate compelling images. |
[video][bibtex][project page] |
MonoNPHM: Dynamic Head Reconstruction from Monocular Videos |
---|
Simon Giebenhain, Tobias Kirschstein, Markos Georgopoulos, Martin Rünz, Lourdes Agapito, Matthias Nießner |
CVPR 2024 |
MonoNPHM is a neural parametric head model that disentangles geometry, appearance, and facial expression into three separate latent spaces. Using MonoNPHM as a prior, we tackle the task of dynamic 3D head reconstruction from monocular RGB videos, using inverse, SDF-based volumetric rendering. |
[video][bibtex][project page] |
Intrinsic Image Diffusion for Single-view Material Estimation |
---|
Peter Kocsis, Vincent Sitzmann, Matthias Nießner |
CVPR 2024 |
Appearance decomposition is an ambiguous task and collecting real data is challenging. We utilize a pre-trained diffusion model and formulate the problem probabilistically. We fine-tune a pre-trained diffusion model conditioned on a single input image to adapt its image prior to the prediction of albedo, roughness and metallic maps. With our sharp material predictions, we optimize for spatially-varying lighting to enable photo-realistic material editing and relighting. |
[video][bibtex][project page] |
ViewDiff: 3D-Consistent Image Generation with Text-to-Image Models |
---|
Lukas Höllein, Aljaž Božič, Norman Müller, David Novotny, Hung-Yu Tseng, Christian Richardt, Michael Zollhöfer, Matthias Nießner |
CVPR 2024 |
ViewDiff generates high-quality, multi-view consistent images of a real-world 3D object in authentic surroundings. We turn pretrained text-to-image models into 3D-consistent image generators by fine-tuning them with multi-view supervision. |
[video][bibtex][project page] |
FaceTalk: Audio-Driven Motion Diffusion for Neural Parametric Head Models |
---|
Shivangi Aneja, Justus Thies, Angela Dai, Matthias Nießner |
CVPR 2024 |
Given an input audio signal, FaceTalk proposes a diffusion-based approach to synthesize high-quality and temporally consistent 3D motion sequences of high-fidelity human heads as neural parametric head models. FaceTalk can generate detailed facial expressions including wrinkles and eye blinks alongside temporally synchronized mouth movements for diverse audio inputs including songs and foreign languages. |
[video][bibtex][project page] |
SceneTex: High-Quality Texture Synthesis for Indoor Scenes via Diffusion Priors |
---|
Dave Zhenyu Chen, Haoxuan Li, Hsin-Ying Lee, Sergey Tulyakov, Matthias Nießner |
CVPR 2024 |
We propose SceneTex, a novel method for effectively generating high-quality and style-consistent textures for indoor scenes using depth-to-image diffusion priors. At its core, SceneTex proposes a multiresolution texture field to implicitly encode the mesh appearance. To further ensure style consistency across views, we introduce a cross-attention decoder that predicts the RGB values by cross-attending to pre-sampled reference locations in each instance. |
[video][bibtex][project page] |
DPHMs: Diffusion Parametric Head Models for Depth-based Tracking |
---|
Jiapeng Tang, Angela Dai, Yinyu Nie, Lev Markhasin, Justus Thies, Matthias Nießner |
CVPR 2024 |
We introduce Diffusion Parametric Head Models (DPHMs), a generative model that enables robust volumetric head reconstruction and tracking from monocular depth sequences. Tracking and reconstructing heads from real-world single-view depth sequences is very challenging, as the fitting to partial and noisy observations is underconstrained. To tackle these challenges, we propose a latent diffusion-based prior to regularize volumetric head reconstruction and tracking. This prior-based regularizer effectively constrains the identity and expression codes to lie on the underlying latent manifold which represents plausible head shapes. |
[video][code][bibtex][project page] |
DiffuScene: Denoising Diffusion Models for Generative Indoor Scene Synthesis |
---|
Jiapeng Tang, Yinyu Nie, Lev Markhasin, Angela Dai, Justus Thies, Matthias Nießner |
CVPR 2024 |
We present DiffuScene, a scene configuration denoising diffusion model for indoor 3D scene synthesis. Our diffusion network synthesizes a collection of 3D indoor objects by denoising an unordered set of object attributes, where each object configuration is characterized as a concatenation of location, size, orientation, semantics, and geometry features; the most similar mesh geometry is then retrieved for each denoised configuration. |
[video][code][bibtex][project page] |
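A minimal PyTorch sketch of the central component, a permutation-equivariant denoiser over an unordered set of per-object attribute vectors, might look as follows; the attribute layout, dimensions, and the missing DDPM schedule are illustrative assumptions, not the paper's exact architecture.

```python
# Denoise an *unordered set* of per-object attribute vectors with a
# permutation-equivariant network: omitting positional encodings makes
# a Transformer encoder equivariant to object order.
import torch
import torch.nn as nn

ATTR_DIM = 3 + 3 + 1 + 32 + 64   # loc, size, yaw, class logits, shape code

class SetDenoiser(nn.Module):
    def __init__(self, d_model=256, nhead=4, layers=4):
        super().__init__()
        self.inp = nn.Linear(ATTR_DIM + 1, d_model)   # +1 for timestep
        enc = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc, layers)
        self.out = nn.Linear(d_model, ATTR_DIM)

    def forward(self, x_t, t):
        # x_t: (B, N, ATTR_DIM) noisy object set; t: (B,) diffusion step
        tt = t[:, None, None].expand(-1, x_t.shape[1], 1).float()
        h = self.encoder(self.inp(torch.cat([x_t, tt], dim=-1)))
        return self.out(h)                            # predicted noise

model = SetDenoiser()
x_t = torch.randn(2, 12, ATTR_DIM)                    # 12 objects per scene
eps = model(x_t, torch.tensor([10, 10]))
print(eps.shape)                                      # (2, 12, ATTR_DIM)
```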
Motion2VecSets: 4D Latent Vector Set Diffusion for Non-rigid Shape Reconstruction and Tracking |
---|
Wei Cao, Chang Luo, Biao Zhang, Matthias Nießner, Jiapeng Tang |
CVPR 2024 |
We introduce Motion2VecSets, a 4D diffusion model for dynamic surface reconstruction from point cloud sequences. The model explicitly learns the shape and motion distribution of non-rigid objects through an iterative denoising process over compressed latent representations. We parameterize 4D dynamics with latent vector sets instead of a global latent; this novel 4D representation allows us to learn local surface shape and deformation patterns, leading to more accurate non-linear motion capture and significantly better generalization to unseen motions and identities. |
[video][bibtex][project page] |
GaussianAvatars: Photorealistic Head Avatars with Rigged 3D Gaussians |
---|
Shenhan Qian, Tobias Kirschstein, Liam Schoneveld, Davide Davoli, Simon Giebenhain, Matthias Nießner |
CVPR 2024 |
We introduce GaussianAvatars, a new method to create photorealistic head avatars that are fully controllable in terms of expression, pose, and viewpoint. The core idea is a dynamic 3D representation based on 3D Gaussian splats that are rigged to a parametric morphable face model. This combination facilitates photorealistic rendering while allowing for precise animation control via the underlying parametric model. |
[video][bibtex][project page] |
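The rigging idea can be sketched in a few lines of NumPy: each Gaussian is parameterized in the local frame of a parent triangle of the morphable mesh, so animating the mesh carries the splat along. The frame construction and per-triangle scaling below follow a common parent-triangle parameterization and may differ from the paper's exact conventions.

```python
# A 3D Gaussian rigged to a mesh triangle: its mean lives in the
# triangle's local frame, so deforming the triangle moves the splat.
import numpy as np

def triangle_frame(v0, v1, v2):
    """Rotation, origin, and isotropic scale of a triangle's local frame."""
    e = v1 - v0
    n = np.cross(e, v2 - v0)
    t = e / np.linalg.norm(e)
    nn = n / np.linalg.norm(n)
    b = np.cross(nn, t)
    R = np.stack([t, b, nn], axis=1)           # columns = local axes
    origin = (v0 + v1 + v2) / 3.0
    scale = np.sqrt(np.linalg.norm(n) / 2.0)   # ~ sqrt(triangle area)
    return R, origin, scale

def splat_world_position(local_mu, tri):
    R, origin, s = triangle_frame(*tri)
    return origin + s * (R @ local_mu)

tri_rest = [np.array([0., 0, 0]), np.array([1., 0, 0]), np.array([0., 1, 0])]
tri_anim = [np.array([0., 0, 0]), np.array([1., 0, .2]), np.array([0., 1, .2])]
mu = np.array([0.1, 0.05, 0.02])               # Gaussian mean, local coords
print(splat_world_position(mu, tri_rest))
print(splat_world_position(mu, tri_anim))      # splat follows the deformation
```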
MeshGPT: Generating Triangle Meshes with Decoder-Only Transformers |
---|
Yawar Siddiqui, Antonio Alliegro, Alexey Artemov, Tatiana Tommasi, Daniele Sirigatti, Vladislav Rosov, Angela Dai, Matthias Nießner |
CVPR 2024 |
MeshGPT creates triangle meshes by autoregressively sampling from a transformer model that has been trained to produce tokens from a learned geometric vocabulary. Our method generates clean, coherent, and compact meshes, characterized by sharp edges and high fidelity. |
[video][bibtex][project page] |
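The decoder-only sampling loop is easy to sketch in PyTorch: the transformer emits tokens from the learned geometric vocabulary until a stop token, and token groups are then decoded back into triangles. The `UniformLM` stub and all token ids below are placeholders; the real vocabulary comes from MeshGPT's learned quantization.

```python
# Autoregressive mesh-token sampling: draw next-token ids from the
# model's logits until STOP, then decode token groups into triangles.
import torch

VOCAB, BOS, STOP, MAX_TOKENS = 4096, 0, 1, 900

@torch.no_grad()
def sample_mesh(model, temperature=0.9):
    tokens = torch.full((1, 1), BOS, dtype=torch.long)  # begin-of-sequence
    for _ in range(MAX_TOKENS):
        logits = model(tokens)[:, -1, :] / temperature  # next-token logits
        probs = torch.softmax(logits, dim=-1)
        nxt = torch.multinomial(probs, 1)
        if nxt.item() == STOP:
            break
        tokens = torch.cat([tokens, nxt], dim=1)
    return tokens[:, 1:]                                # drop BOS

# Stub autoregressive model: uniform logits over the vocabulary.
class UniformLM(torch.nn.Module):
    def forward(self, tok):
        return torch.zeros(tok.shape[0], tok.shape[1], VOCAB)

toks = sample_mesh(UniformLM())
print("sampled", toks.shape[1], "tokens")  # decode in groups per triangle
```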
TriPlaneNet: An Encoder for EG3D Inversion |
---|
Ananta Bhattarai, Matthias Nießner, Artem Sevastopolsky |
WACV 2024 |
EG3D is a powerful {z, camera}->image generative model, but inverting EG3D (finding a corresponding z for a given image) is not always trivial. We propose a fully-convolutional encoder for EG3D based on the observation that predicting both the z code and tri-planes is beneficial. TriPlaneNet also works on videos and runs close to real time. |
[video][code][bibtex][project page] |
GANeRF: Leveraging Discriminators to Optimize Neural Radiance Fields |
---|
Barbara Rössle, Norman Müller, Lorenzo Porzi, Samuel Rota Bulò, Peter Kontschieder, Matthias Nießner |
SIGGRAPH Asia 2023 |
GANeRF proposes an adversarial formulation whose gradients provide feedback for a 3D-consistent neural radiance field representation. This introduces additional constraints that enable more realistic novel view synthesis. |
[video][bibtex][project page] |
Text2Tex: Text-driven Texture Synthesis via Diffusion Models |
---|
Dave Zhenyu Chen, Yawar Siddiqui, Hsin-Ying Lee, Sergey Tulyakov, Matthias Nießner |
ICCV 2023 |
We present Text2Tex, a novel method for generating high-quality textures for 3D meshes from the given text prompts. Our method incorporates inpainting into a pre-trained depth-aware image diffusion model to progressively synthesize high resolution partial textures from multiple viewpoints. Furthermore, we propose an automatic view sequence generation scheme to determine the next best view for updating the partial texture. |
[video][bibtex][project page] |
ScanNet++: A High-Fidelity Dataset of 3D Indoor Scenes |
---|
Chandan Yeshwanth, Yueh-Cheng Liu, Matthias Nießner, Angela Dai |
ICCV 2023 |
We present ScanNet++, a large-scale dataset with 450+ 3D indoor scenes containing sub-millimeter resolution laser scans, registered 33-megapixel DSLR images, and commodity RGB-D streams from iPhone. The 3D reconstructions are annotated with long-tail and label-ambiguous semantics to benchmark semantic understanding methods, while the coupled DSLR and iPhone captures enable benchmarking of novel view synthesis methods in high-quality and commodity settings. |
[video][bibtex][project page] |
HyperDiffusion: Generating Implicit Neural Fields with Weight-Space Diffusion |
---|
Ziya Erkoç, Fangchang Ma, Qi Shan, Matthias Nießner, Angela Dai |
ICCV 2023 |
We propose HyperDiffusion, a novel approach for unconditional generative modeling of implicit neural fields. HyperDiffusion operates directly on MLP weights and generates new neural implicit fields encoded by synthesized MLP parameters. It enables diffusion modeling over an implicit, compact, and yet high-fidelity representation of complex signals across 3D shapes and 4D mesh animations within a single unified framework. |
[video][bibtex][project page] |
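The representation trick is simple to sketch: an implicit field is an MLP, and flattening the MLP's weights into one vector yields the sample a standard diffusion model trains on. The round trip below (PyTorch, with an illustrative field MLP) is the only machinery needed to move between the two views; the diffusion model itself is omitted.

```python
# Flatten an implicit-field MLP's weights into a single vector (a
# diffusion training sample) and write a vector back into an MLP.
import torch
import torch.nn as nn

def make_field_mlp():
    return nn.Sequential(nn.Linear(3, 64), nn.ReLU(),
                         nn.Linear(64, 64), nn.ReLU(),
                         nn.Linear(64, 1))             # xyz -> occupancy

def flatten(mlp):
    return torch.cat([p.detach().flatten() for p in mlp.parameters()])

def unflatten(vec, mlp):
    i = 0
    with torch.no_grad():
        for p in mlp.parameters():
            n = p.numel()
            p.copy_(vec[i:i + n].view_as(p))
            i += n
    return mlp

mlp = make_field_mlp()
w = flatten(mlp)                 # one training sample for the diffusion model
print("weight-space dimension:", w.numel())
field = unflatten(torch.randn_like(w), make_field_mlp())  # "generated" field
print(field(torch.zeros(1, 3)).shape)
```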
Text2Room: Extracting Textured 3D Meshes from 2D Text-to-Image Models |
---|
Lukas Höllein, Ang Cao, Andrew Owens, Justin Johnson, Matthias Nießner |
ICCV 2023 |
Text2Room generates textured 3D meshes from a given text prompt using 2D text-to-image models. The core idea of our approach is a tailored viewpoint selection such that the content of each image can be fused into a seamless, textured 3D mesh. More specifically, we propose a continuous alignment strategy that iteratively fuses scene frames with the existing geometry. |
[video][bibtex][project page] |
How to Boost Face Recognition with StyleGAN? |
---|
Artem Sevastopolsky, Yury Malkov, Nikita Durasov, Luisa Verdoliva, Matthias Nießner |
ICCV 2023 |
State-of-the-art face recognition systems require huge amounts of labeled training data, which is often compiled as a limited collection of celebrity images. We learn how to leverage pretraining of StyleGAN, and an encoder for it, on large-scale collections of random face images. The procedure can be applied to various backbones and is most helpful when data is limited. We release the collected datasets AfricanFaceSet-5M and AsianFaceSet-3M and a new fairness-oriented testing benchmark, RB-WebFace. |
[video][code][bibtex][project page] |
CAD-Estate: Large-scale CAD Model Annotation in RGB Videos |
---|
Kevis-Kokitsi Maninis, Stefan Popov, Matthias Nießner, Vittorio Ferrari |
ICCV 2023 |
We propose a method for annotating videos of complex multi-object scenes with a globally-consistent 3D representation of the objects. This semi-automatic method allows for large-scale crowd-sourcing and has allowed us to construct a large-scale dataset by annotating 21K real-estate videos from YouTube with 108K object instances of 12K unique CAD models. |
[bibtex][project page] |
UniT3D: A Unified Transformer for 3D Dense Captioning and Visual Grounding |
---|
Dave Zhenyu Chen, Ronghang Hu, Xinlei Chen, Matthias Nießner, Angel X. Chang |
ICCV 2023 |
We propose UniT3D, a simple yet effective fully unified transformer-based architecture for jointly solving 3D visual grounding and dense captioning. UniT3D enables learning a strong multimodal representation across the two tasks through a supervised joint pre-training scheme with bidirectional and seq-to-seq objectives. |
[bibtex][project page] |
End2End Multi-View Feature Matching with Differentiable Pose Optimization |
---|
Barbara Rössle, Matthias Nießner |
ICCV 2023 |
End2End Multi-View Feature Matching connects feature matching and pose optimization in an end-to-end trainable approach that enables matches and confidences to be informed by the pose estimation objective. We introduce GNN-based multi-view matching to predict matches and confidences tailored to a differentiable pose solver, which significantly improves pose estimation performance. |
[video][bibtex][project page] |
NeRSemble: Multi-view Radiance Field Reconstruction of Human Heads |
---|
Tobias Kirschstein, Shenhan Qian, Simon Giebenhain, Tim Walter, Matthias Nießner |
SIGGRAPH'23 |
We propose NeRSemble for high-quality novel view synthesis of human heads. We combine a deformation field modeling coarse motion with an ensemble of multi-resolution hash encodings to represent fine expression-dependent details. To train our model, we recorded a novel multi-view video dataset containing over 4700 sequences of human heads covering a variety of facial expressions. |
[video][bibtex][project page] |
3DShape2VecSet: A 3D Shape Representation for Neural Fields and Generative Diffusion Models |
---|
Biao Zhang, Jiapeng Tang, Matthias Nießner, Peter Wonka |
SIGGRAPH'23 |
We introduce 3DShape2VecSet, a novel shape representation for neural fields designed for generative diffusion models. Our new representation encodes neural fields on top of a set of vectors. |
[video][code][bibtex][project page] |
ClipFace: Text-guided Editing of Textured 3D Morphable Models |
---|
Shivangi Aneja, Justus Thies, Angela Dai, Matthias Nießner |
SIGGRAPH'23 |
ClipFace learns a self-supervised generative model for jointly synthesizing geometry and texture, leveraging 3D morphable face models, that can be guided by text prompts. For a given 3D mesh with fixed topology, we can generate arbitrary face textures as UV maps. The textured mesh can then be manipulated with text guidance to generate a diverse set of textures and geometric expressions in 3D by altering (a) only the UV texture maps for Texture Manipulation and (b) both UV maps and mesh geometry for Expression Manipulation. |
[video][bibtex][project page] |
HumanRF: High-Fidelity Neural Radiance Fields for Humans in Motion |
---|
Mustafa Işık, Martin Rünz, Markos Georgopoulos, Taras Khakhulin, Jonathan Starck, Lourdes Agapito, Matthias Nießner |
SIGGRAPH'23 |
We introduce ActorsHQ, a novel multi-view dataset that provides 12MP footage from 160 cameras for 16 sequences with high-fidelity per frame mesh reconstructions. We demonstrate challenges that emerge from using such high-resolution data and show that our newly-introduced HumanRF effectively leverages this data, making a significant step towards production-level quality novel view synthesis. |
[video][bibtex][project page] |
Mask3D: Pre-training 2D Vision Transformers by Learning Masked 3D Priors |
---|
Ji Hou, Xiaoliang Dai, Zijian He, Angela Dai, Matthias Nießner |
CVPR 2023 |
Mask3D is a novel approach that embeds 3D structural priors into 2D learned feature representations by leveraging existing large-scale RGB-D data in self-supervised pre-training. The experiments show that Mask3D outperforms state-of-the-art 3D pre-training on various image understanding tasks, with a significant improvement of +6.5% mIoU on ScanNet image semantic segmentation. |
[video][bibtex][project page] |
Learning Neural Parametric Head Models |
---|
Simon Giebenhain, Tobias Kirschstein, Markos Georgopoulos, Martin Rünz, Lourdes Agapito, Matthias Nießner |
CVPR 2023 |
We present Neural Parametric Head Models (NPHMs) for high fidelity representation of complete human heads. We utilize a hybrid representation for a person's canonical head geometry, which is deformed using a deformation field to model expressions. To train our model we captured a large dataset of high-end laser scans of 120 persons in 20 expressions each. |
[video][bibtex][project page] |
Panoptic Lifting for 3D Scene Understanding with Neural Fields |
---|
Yawar Siddiqui, Lorenzo Porzi, Samuel Rota Bulo, Norman Müller, Matthias Nießner, Angela Dai, Peter Kontschieder |
CVPR 2023 |
Given only RGB images of an in-the-wild scene as input, Panoptic Lifting optimizes a panoptic radiance field which can be queried for color, depth, semantics, and instances for any point in space. |
[bibtex][project page] |
DiffRF: Rendering-guided 3D Radiance Field Diffusion |
---|
Norman Müller, Yawar Siddiqui, Lorenzo Porzi, Samuel Rota Bulo, Peter Kontschieder, Matthias Nießner |
CVPR 2023 |
DiffRF is a denoising diffusion probabilistic model directly operating on 3D radiance fields and trained with an additional volumetric rendering loss. This enables learning strong radiance priors with high rendering quality and accurate geometry. This approach naturally enables tasks like 3D masked completion or image-to-volume synthesis. |
[video][bibtex][project page] |
High-Res Facial Appearance Capture from Polarized Smartphone Images |
---|
Dejan Azinović, Olivier Maury, Christophe Hery, Matthias Nießner, Justus Thies |
CVPR 2023 |
We propose a novel method for high-quality facial texture reconstruction from RGB images using a novel capturing routine based on a single smartphone which we equip with an inexpensive polarization foil. Specifically, we turn the flashlight into a polarized light source and add a polarization filter on top of the camera to separate the skin's diffuse and specular response. |
[video][bibtex][project page] |
ObjectMatch: Robust Registration using Canonical Object Correspondences |
---|
Can Gümeli, Angela Dai, Matthias Nießner |
CVPR 2023 |
ObjectMatch leverages indirect correspondences obtained via semantic object identification. For instance, when an object is seen from the front in one frame and from the back in another frame, ObjectMatch provides additional pose constraints through canonical object correspondences. We first propose a neural network to predict such correspondences, which we then combine in our energy formulation with state-of-the-art keypoint matching solved with a joint Gauss-Newton optimization. Our method significantly improves state-of-the-art feature matching in low-overlap frame pairs as well as in the registration of low frame-rate SLAM sequences. |
[video][bibtex][project page] |
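Both the keypoint matches and the canonical object correspondences reduce to residuals of the form r_i = R p_i + t - q_i, which is why they can be stacked into one Gauss-Newton system. A NumPy sketch of a single-pose version, with the standard small-angle linearization, is below; the paper's joint energy over many frames and objects is richer than this toy.

```python
# One Gauss-Newton step on an SE(3) pose from 3D-3D correspondences,
# minimizing sum ||R p_i + t - q_i||^2 via small-angle linearization.
import numpy as np

def gauss_newton_step(p, q, R, t):
    r = (p @ R.T + t) - q                       # residuals, (N, 3)
    J = np.zeros((len(p) * 3, 6))               # w.r.t. (omega, dt)
    for i, pi in enumerate(p @ R.T):
        J[3*i:3*i+3, :3] = -skew(pi)            # d r / d omega
        J[3*i:3*i+3, 3:] = np.eye(3)            # d r / d t
    dx = np.linalg.solve(J.T @ J, -J.T @ r.ravel())
    return expm_so3(dx[:3]) @ R, t + dx[3:]

def skew(v):
    return np.array([[0, -v[2], v[1]], [v[2], 0, -v[0]], [-v[1], v[0], 0]])

def expm_so3(w):                                # Rodrigues' formula
    th = np.linalg.norm(w)
    if th < 1e-12:
        return np.eye(3)
    K = skew(w / th)
    return np.eye(3) + np.sin(th) * K + (1 - np.cos(th)) * K @ K

rng = np.random.default_rng(0)
p = rng.standard_normal((20, 3))
q = p @ expm_so3(np.array([0, 0, 0.3])).T + np.array([0.1, 0, 0])
R, t = np.eye(3), np.zeros(3)
for _ in range(5):
    R, t = gauss_newton_step(p, q, R, t)
print(np.abs((p @ R.T + t) - q).max())          # ~0 after a few steps
```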
Learning 3D Scene Priors with 2D Supervision |
---|
Yinyu Nie, Angela Dai, Xiaoguang Han, Matthias Nießner |
CVPR 2023 |
We learn 3D scene priors with 2D supervision. We model a latent hypersphere surface to represent a manifold of 3D scenes, characterizing the semantic and geometric distribution of objects in 3D scenes. This supports many downstream applications, including scene synthesis, interpolation and single-view reconstruction. |
[video][bibtex][project page] |
COSMOS: Catching Out-of-Context Misinformation with Self-Supervised Learning |
---|
Shivangi Aneja, Chris Bregler, Matthias Nießner |
AAAI 2023 |
Despite the recent attention to DeepFakes, one of the most prevalent ways to mislead audiences on social media is the use of unaltered images in a new but false context. To address these challenges and support fact-checkers, we propose a new method that automatically detects out-of-context image and text pairs. Our key insight is to leverage grounding of image with text to distinguish out-of-context scenarios that cannot be disambiguated with language alone. Check out the paper for more details. |
[video][code][bibtex][project page] |
Neural Shape Deformation Priors |
---|
Jiapeng Tang, Lev Markhasin, Bi Wang, Justus Thies, Matthias Nießner |
NeurIPS 2022 |
We present Neural Shape Deformation Priors, a novel method for shape manipulation that predicts mesh deformations of non-rigid objects from user-provided handle movements. We learn the geometry-aware deformation behavior from a large-scale dataset containing a diverse set of non-rigid deformations. We introduce transformer-based deformation networks that represent a shape deformation as a composition of local surface deformations. |
[video][code][bibtex][project page] |
3DILG: Irregular Latent Grids for 3D Generative Modeling |
---|
Biao Zhang, Matthias Nießner, Peter Wonka |
NeurIPS 2022 |
We propose a method for representing shapes as irregular latent grids. This representation enables 3D generative modeling with autoregressive transformers. We show different applications that improve over the current state of the art. All probabilistic experiments confirm that we are able to generate detailed and high-quality shapes, yielding a new state of the art in generative 3D shape modeling. |
[code][bibtex][project page] |
The Unreasonable Effectiveness of Fully-Connected Layers for Low-Data Regimes |
---|
Peter Kocsis, Peter Sukenik, Guillem Braso, Matthias Nießner, Laura Leal-Taixé, Ismail Elezi |
NeurIPS 2022 |
We show that adding fully-connected layers is beneficial for the generalization of convolutional networks on tasks in the low-data regime. Furthermore, we present a novel online joint knowledge distillation method (OJKD), which allows us to utilize additional final fully-connected layers during training but drop them during inference without a noticeable loss in performance. In doing so, we keep the same number of weights at test time. |
[code][bibtex][project page] |
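A minimal PyTorch sketch of the train-time-only head: an auxiliary fully-connected head contributes supervision (here via a simple distillation term, an illustrative choice rather than the paper's exact OJKD loss) and is simply not called at inference, so the deployed weight count is unchanged.

```python
# Extra FC head used only in training; inference uses the plain head.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self, classes=10):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.head = nn.Linear(32, classes)             # kept at test time
        self.aux_head = nn.Sequential(                 # dropped at test time
            nn.Linear(32, 256), nn.ReLU(), nn.Linear(256, classes))

    def forward(self, x):
        f = self.backbone(x)
        if self.training:
            return self.head(f), self.aux_head(f)
        return self.head(f)

net = Net()
x, y = torch.randn(4, 3, 32, 32), torch.randint(0, 10, (4,))
logits, aux = net(x)
loss = F.cross_entropy(logits, y) + F.cross_entropy(aux, y) \
     + F.kl_div(F.log_softmax(logits, -1), F.softmax(aux.detach(), -1),
                reduction="batchmean")                 # distill aux -> head
net.eval()
print(net(x).shape)                                    # aux head unused
```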
Texturify: Generating Textures on 3D Shape Surfaces |
---|
Yawar Siddiqui, Justus Thies, Fangchang Ma, Qi Shan, Matthias Nießner, Angela Dai |
ECCV 2022 |
Texturify learns to generate geometry-aware textures for untextured collections of 3D objects. Our method trains from only a collection of images and a collection of untextured shapes, which are both often available, without requiring any explicit 3D color supervision or shape-image correspondence. Textures are created directly on the surface of a given 3D shape, enabling generation of high-quality, compelling textured 3D shapes. |
[bibtex][project page] |
4DContrast: Contrastive Learning with Dynamic Correspondences for 3D Scene Understanding |
---|
Yujin Chen, Matthias Nießner, Angela Dai |
ECCV 2022 |
We present a new approach to instill 4D dynamic object priors into learned 3D representations by unsupervised pre-training. We propose a new data augmentation scheme leveraging synthetic 3D shapes moving in static 3D environments, and employ contrastive learning under 3D-4D constraints that encode 4D invariances into the learned 3D representations. Experiments demonstrate that our unsupervised representation learning results in improvement in downstream 3D semantic segmentation, object detection, and instance segmentation tasks. |
[video][bibtex][project page] |
TAFIM: Targeted Adversarial Attacks against Facial Image Manipulations |
---|
Shivangi Aneja, Lev Markhasin, Matthias Nießner |
ECCV 2022 |
We introduce a novel data-driven approach that produces image-specific perturbations which are embedded in the original images to prevent face manipulation by causing the manipulation model to produce a predefined manipulation target. Compared to traditional adversarial attack baselines that optimize noise patterns for each image individually, our generalized model only needs a single forward pass, thus running orders of magnitude faster and allowing for easy integration in image processing stacks, even on resource-constrained devices like smartphones. Check out the paper for more details. |
[video][code][bibtex][project page] |
Pose2Room: Understanding 3D Scenes from Human Activities |
---|
Yinyu Nie, Angela Dai, Xiaoguang Han, Matthias Nießner |
ECCV 2022 |
From an observed pose trajectory of a person performing daily activities in an indoor scene, we learn to estimate likely object configurations of the scene underlying these interactions, as a set of object class labels and oriented 3D bounding boxes. By sampling from our probabilistic decoder, we synthesize multiple plausible object arrangements. |
[video][bibtex][project page] |
D3Net: A Speaker-Listener Architecture for Semi-supervised Dense Captioning and Visual Grounding in RGB-D Scans |
---|
Dave Zhenyu Chen, Qirui Wu, Matthias Nießner, Angel X. Chang |
ECCV 2022 |
We present D3Net, an end-to-end neural speaker-listener architecture that can detect, describe and discriminate. Our D3Net unifies dense captioning and visual grounding in 3D in a self-critical manner. This self-critical property of D3Net also introduces discriminability during object caption generation and enables semi-supervised training on ScanNet data with partially annotated descriptions. |
[video][bibtex][project page] |
3D Equivariant Graph Implicit Functions |
---|
Yunlu Chen, Basura Fernando, Hakan Bilen, Matthias Nießner, Efstratios Gavves |
ECCV 2022 |
We propose a novel family of graph-based 3D implicit representations. The multi-scale, non-Euclidean graph embeddings enable modeling of high-fidelity 3D geometric details and facilitate inherent robustness against geometric transformations. We further incorporate equivariant graph layers for guaranteed generalization to unseen similarity transformations, including rotation, translation, and scaling. |
[video][bibtex][project page] |
RC-MVSNet: Unsupervised Multi-View Stereo with Neural Rendering |
---|
Di Chang, Aljaž Božič, Tong Zhang, Qingsong Yan, Yingcong Chen, Sabine Süsstrunk, Matthias Nießner |
ECCV 2022 |
We introduce RC-MVSNet, a neural-rendering-based unsupervised multi-view stereo 3D reconstruction approach. First, we leverage NeRF-like rendering to generate consistent photometric supervision for non-Lambertian surfaces in the unsupervised MVS task. Second, we impose a depth rendering consistency loss to refine the initial depth map predicted by the naive photometric consistency loss. We also propose Gaussian-Uniform sampling to improve NeRF's ability to learn geometry features close to the object surface, which overcomes occlusion artifacts present in existing approaches. |
[code][bibtex][project page] |
Advances in Neural Rendering |
---|
Ayush Tewari, Justus Thies, Ben Mildenhall, Pratul P. Srinivasan, Edgar Tretschk, Yifan Wang, Christoph Lassner, Vincent Sitzmann, Ricardo Martin-Brualla, Stephen Lombardi, Tomas Simon, Christian Theobalt, Matthias Nießner, Jonathan T. Barron, Gordon Wetzstein, Michael Zollhöfer, Vladislav Golyanik |
Eurographics 2022 |
In this state-of-the-art report, we analyze these recent developments in Neural Rendering in detail and review essential related work. We explain, compare, and critically analyze the common underlying algorithmic concepts that enabled these recent advancements. |
[bibtex][project page] |
AutoRF: Learning 3D Object Radiance Fields from Single View Observations |
---|
Norman Müller, Andrea Simonelli, Lorenzo Porzi, Samuel Rota Bulò, Matthias Nießner, Peter Kontschieder |
CVPR 2022 |
From just a single view, we learn neural 3D object representations for free novel view synthesis. This setting is in stark contrast to the majority of existing works that leverage multiple views of the same object, employ explicit priors during training, or require pixel-perfect annotations. Our method decouples object geometry, appearance, and pose enabling generalization to unseen objects, even across different datasets of challenging real-world street scenes such as nuScenes, KITTI, and Mapillary Metropolis. |
[video][bibtex][project page] |
Neural Head Avatars from Monocular RGB Videos |
---|
Philip-William Grassal, Malte Prinzler, Titus Leistner, Carsten Rother, Matthias Nießner, Justus Thies |
CVPR 2022 |
We present Neural Head Avatars, a novel neural representation that explicitly models the surface geometry and appearance of an animatable human avatar that can be used for teleconferencing in AR/VR or other applications in the movie or games industry that rely on a digital human. |
[video][bibtex][project page] |
ROCA: Robust CAD Model Retrieval and Alignment from a Single Image |
---|
Can Gümeli, Angela Dai, Matthias Nießner |
CVPR 2022 |
We present ROCA, a novel end-to-end approach that retrieves and aligns 3D CAD models to a single RGB image using a shape database. Thus, our method enables 3D understanding of an observed scene using clean and compact CAD representations of objects. Core to our approach is our differentiable alignment optimization based on dense 2D-3D object correspondences and Procrustes alignment. |
[video][code][bibtex][project page] |
Neural RGB-D Surface Reconstruction |
---|
Dejan Azinović, Ricardo Martin-Brualla, Dan B Goldman, Matthias Nießner, Justus Thies |
CVPR 2022 |
In this work, we explore how to leverage the success of implicit novel view synthesis methods for surface reconstruction. We demonstrate how depth measurements can be incorporated into the radiance field formulation to produce more detailed and complete reconstruction results than using methods based on either color or depth data alone. |
[video][bibtex][project page] |
StyleMesh: Style Transfer for Indoor 3D Scene Reconstructions |
---|
Lukas Höllein, Justin Johnson, Matthias Nießner |
CVPR 2022 |
We apply style transfer on mesh reconstructions of indoor scenes. We optimize an explicit texture for the reconstructed mesh of a scene and stylize it jointly from all available input images. Our depth- and angle-aware optimization leverages surface normal and depth data of the underlying mesh to create a uniform and consistent stylization for the whole scene. |
[video][bibtex][project page] |
Dense Depth Priors for Neural Radiance Fields from Sparse Input Views |
---|
Barbara Rössle, Jonathan T. Barron, Ben Mildenhall, Pratul P. Srinivasan, Matthias Nießner |
CVPR 2022 |
We leverage dense depth priors for recovering neural radiance fields (NeRF) of complete rooms when only a handful of input images are available. First, we take advantage of the sparse depth that is freely available from the structure from motion preprocessing. Second, we use depth completion to convert these sparse points into dense depth maps and uncertainty estimates, which are used to guide NeRF optimization. |
[video][bibtex][project page] |
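A sketch of how the completed depth with uncertainty can enter the objective: alongside the color loss, the rendered ray termination depth is pulled toward the depth prior under a Gaussian negative log-likelihood weighted by the predicted uncertainty. The function below is an illustrative simplification of the paper's loss.

```python
# Depth-guided NeRF objective: color loss plus an uncertainty-weighted
# Gaussian NLL between rendered depth and the completed depth prior.
import torch

def depth_guided_loss(rgb_pred, rgb_gt, depth_pred, depth_prior, sigma,
                      lam=0.1):
    color = ((rgb_pred - rgb_gt) ** 2).mean()
    nll = (((depth_pred - depth_prior) ** 2) / (2 * sigma ** 2)
           + torch.log(sigma)).mean()                  # Gaussian NLL
    return color + lam * nll

rays = 1024
loss = depth_guided_loss(torch.rand(rays, 3), torch.rand(rays, 3),
                         torch.rand(rays), torch.rand(rays),
                         0.05 + torch.rand(rays))
print(loss.item())
```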
Panoptic 3D Scene Reconstruction From a Single RGB Image |
---|
Manuel Dahnert, Ji Hou, Matthias Nießner, Angela Dai |
NeurIPS 2021 |
Panoptic 3D Scene Reconstruction combines the tasks of 3D reconstruction, semantic segmentation and instance segmentation. From a single RGB image we predict 2D information and lift these into a sparse volumetric 3D grid, where we predict geometry, semantic labels and 3D instance labels. |
[video][code][bibtex][project page] |
TransformerFusion: Monocular RGB Scene Reconstruction using Transformers |
---|
Aljaž Božič, Pablo Palafox, Justus Thies, Angela Dai, Matthias Nießner |
NeurIPS 2021 |
We introduce TransformerFusion, a transformer-based 3D scene reconstruction approach. The input monocular RGB video frames are fused into a volumetric feature representation of the scene by a transformer network that learns to attend to the most relevant image observations, resulting in an accurate online surface reconstruction. |
[video][bibtex][project page] |
NPMs: Neural Parametric Models for 3D Deformable Shapes |
---|
Pablo Palafox, Aljaž Božič, Justus Thies, Matthias Nießner, Angela Dai |
ICCV 2021 |
We propose Neural Parametric Models (NPMs), a learned alternative to traditional, parametric 3D models. 4D dynamics are disentangled into latent-space representations of shape and pose, leveraging the flexibility of recent developments in learned implicit functions. Once learned, NPMs enable optimization over the learned spaces to fit to new observations. |
[video][code][bibtex][project page] |
4DComplete: Non-Rigid Motion Estimation Beyond the Observable Surface |
---|
Yang Li, Hikari Takehara, Takafumi Taketomi, Bo Zheng, Matthias Nießner |
ICCV 2021 |
We introduce 4DComplete, the first method that jointly recovers the shape and motion field from partial observations. We also provide a large-scale non-rigid 4D dataset for training and benchmarking, consisting of 1,972 animation sequences and 122,365 frames. |
[video][code][bibtex][project page] |
ID-Reveal: Identity-aware DeepFake Video Detection |
---|
Davide Cozzolino, Andreas Rössler, Justus Thies, Matthias Nießner, Luisa Verdoliva |
ICCV 2021 |
ID-Reveal is an identity-aware approach to DeepFake video detection. Based on reference videos of a person, we estimate a temporal embedding which is used as a distance metric to detect fake videos. |
[video][code][bibtex][project page] |
RetrievalFuse: Neural 3D Scene Reconstruction with a Database |
---|
Yawar Siddiqui, Justus Thies, Fangchang Ma, Qi Shan, Matthias Nießner, Angela Dai |
ICCV 2021 |
We introduce a new 3D reconstruction method that directly leverages scene geometry from the training database, facilitating transfer of coherent structures and local detail from train scene geometry. |
[video][code][bibtex][project page] |
Pri3D: Can 3D Priors Help 2D Representation Learning? |
---|
Ji Hou, Saining Xie, Benjamin Graham, Angela Dai, Matthias Nießner |
ICCV 2021 |
Pri3D leverages 3D priors for downstream 2D image understanding tasks: during pre-training, we incorporate view-invariant and geometric priors from color-geometry information given by RGB-D datasets, imbuing geometric priors into learned features. We show that these 3D-imbued learned features can effectively transfer to improved performance on 2D tasks such as semantic segmentation, object detection, and instance segmentation. |
[video][code][bibtex][project page] |
Dynamic Surface Function Networks for Clothed Human Bodies |
---|
Andrei Burov, Matthias Nießner, Justus Thies |
ICCV 2021 |
We present a novel method for temporal coherent reconstruction and tracking of clothed humans. Given a monocular RGB-D sequence, we learn a person-specific body model which is based on a dynamic surface function network. To this end, we explicitly model the surface of the person using a multi-layer perceptron (MLP) which is embedded into the canonical space of the SMPL body model. |
[video][bibtex][project page] |
Thallo — Scheduling for High-Performance Large-scale Non-linear Least-Squares Solvers |
---|
Michael Mara, Felix Heide, Michael Zollhöfer, Matthias Nießner, Pat Hanrahan |
ACM Transactions on Graphics 2021 (TOG) |
Thallo is a high-performance domain-specific language (DSL) that generates tailored high-performance GPU solvers and schedules from a concise, high-level energy description without the hassle of manually constructing and maintaining tedious and error-prone solvers. |
[bibtex][project page] |
Dynamic Neural Radiance Fields for Monocular 4D Facial Avatar Reconstruction |
---|
Guy Gafni, Justus Thies, Michael Zollhöfer, Matthias Nießner |
CVPR 2021 (Oral) |
Given a monocular portrait video sequence of a person, we reconstruct a dynamic neural radiance field representing a 4D facial avatar. The radiance field is conditioned on blendshape expressions, which allow us to then photorealistically synthesize novel head poses as well as novel facial expressions of the person. This can be used for self-reenactment, novel view synthesis and cross-subject reenactment. |
[video][code][bibtex][project page] |
SPSG: Self-Supervised Photometric Scene Generation from RGB-D Scans |
---|
Angela Dai, Yawar Siddiqui, Justus Thies, Julien Valentin, Matthias Nießner |
CVPR 2021 |
We present SPSG, a novel approach to generate high-quality, colored 3D models of scenes from RGB-D scan observations by learning to infer unobserved scene geometry and color in a self-supervised fashion. Rather than relying on 3D reconstruction losses to inform our 3D geometry and color reconstruction, we propose adversarial and perceptual losses operating on 2D renderings in order to achieve high-resolution, high-quality colored reconstructions of scenes. |
[video][code][bibtex][project page] |
Seeing Behind Objects for 3D Multi-Object Tracking in RGB-D Sequences |
---|
Norman Müller, Yu-Shiang Wong, Niloy J. Mitra, Angela Dai, Matthias Nießner |
CVPR 2021 |
We introduce a novel method to jointly predict complete geometry and dense correspondences of rigidly moving objects for 3D multi-object tracking on RGB-D sequences. By hallucinating unseen regions of objects, we can obtain additional correspondences between the same instance, thus providing robust tracking even under strong change of appearance. |
[video][bibtex][project page] |
RfD-Net: Point Scene Understanding by Semantic Instance Reconstruction |
---|
Yinyu Nie, Ji Hou, Xiaoguang Han, Matthias Nießner |
CVPR 2021 |
We introduce RfD-Net, which jointly detects and reconstructs dense object surfaces from raw point clouds. It leverages the sparsity of point cloud data and focuses on predicting shapes that are recognized with high objectness. This not only eases the difficulty of learning 2D manifold surfaces from sparse 3D space; the point clouds in each object proposal also convey shape details that support implicit function learning for reconstructing high-resolution surfaces. |
[video][code][bibtex][project page] |
Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts |
---|
Ji Hou, Benjamin Graham, Matthias Nießner, Saining Xie |
CVPR 2021 (Oral) |
Our study reveals that exhaustive labelling of 3D point clouds might be unnecessary: remarkably, on ScanNet, even using 0.1% of point labels, we still achieve 89% (instance segmentation) and 96% (semantic segmentation) of the baseline performance obtained with full annotations. |
[video][bibtex][project page] |
Scan2Cap: Context-aware Dense Captioning in RGB-D Scans |
---|
Dave Zhenyu Chen, Ali Gholami, Matthias Nießner, Angel X. Chang |
CVPR 2021 |
We introduce the new task of dense captioning in RGB-D scans with a model that can densely localize objects in a 3D scene and describe them using natural language in a single forward pass. |
[video][code][bibtex][project page] |
Neural Deformation Graphs for Globally-consistent Non-rigid Reconstruction |
---|
Aljaž Božič, Pablo Palafox, Michael Zollhöfer, Justus Thies, Angela Dai, Matthias Nießner |
CVPR 2021 (Oral) |
We introduce Neural Deformation Graphs for globally-consistent deformation tracking and 3D reconstruction of non-rigid objects. Specifically, we implicitly model a deformation graph via a deep neural network and impose per-frame viewpoint consistency as well as inter-frame graph and surface consistency constraints in a self-supervised fashion. |
[video][bibtex][project page] |
Explicitly Modeled Attention Maps for Image Classification |
---|
Andong Tan, Duc Tam Nguyen, Maximilian Dax, Matthias Nießner, Thomas Brox |
AAAI 2021 |
We propose a novel self-attention module that explicitly models attention-maps to mitigate the problem of high computational requirements of attention maps. Our evaluation shows that our method achieves an accuracy improvement of up to 2.2% over the ResNet-baselines in ImageNet ILSVRC and outperforms other self-attention methods such as AA-ResNet152 (Bello et al., 2019) in accuracy by 0.9% with 6.4% fewer parameters and 6.7% fewer GFLOPs. |
[bibtex][project page] |
RigidFusion: RGB-D Scene Reconstruction with Rigidly-moving Objects |
---|
Yu-Shiang Wong, Changjian Li, Matthias Nießner, Niloy J. Mitra |
Eurographics 2021 |
We present RigidFusion, a novel surface reconstruction approach that handles changing environments. This is achieved by an asynchronous moving-object detection method, combined with a modified volumetric fusion. |
[video][bibtex][project page] |
Neural Non-Rigid Tracking |
---|
Aljaž Božič, Pablo Palafox, Michael Zollhöfer, Angela Dai, Justus Thies, Matthias Nießner |
NeurIPS 2020 |
We introduce a novel, end-to-end learnable, differentiable non-rigid tracker that enables state-of-the-art non-rigid reconstruction. By enabling gradient back-propagation through a non-rigid as-rigid-as-possible optimization solver, we are able to learn correspondences in an end-to-end manner such that they are optimal for the task of non-rigid tracking. |
[video][bibtex][project page] |
Egocentric Videoconferencing |
---|
Mohamed Elgharib, Mohit Mendiratta, Justus Thies, Matthias Nießner, Hans-Peter Seidel, Ayush Tewari, Vladislav Golyanik, Christian Theobalt |
ACM Transactions on Graphics 2020 (TOG) |
We introduce a method for egocentric videoconferencing that enables hands-free video calls, for instance by people wearing smart glasses or other mixed-reality devices. |
[bibtex][project page] |
Modeling 3D Shapes by Reinforcement Learning |
---|
Cheng Lin, Tingxiang Fan, Wenping Wang, Matthias Nießner |
ECCV 2020 |
We explore how to enable machines to model 3D shapes like human modelers using deep reinforcement learning (RL). |
[video][code][bibtex][project page] |
SceneCAD: Predicting Object Alignments and Layouts in RGB-D Scans |
---|
Armen Avetisyan, Tatiana Khanova, Christopher Choy, Denver Dash, Angela Dai, Matthias Nießner |
ECCV 2020 |
We present a novel approach to reconstructing lightweight, CAD-based representations of scanned 3D environments from commodity RGB-D sensors. |
[code][bibtex][project page] |
CAD-Deform: Deformable Fitting of CAD Models to 3D Scans |
---|
Vladislav Ishimtsev, Alexey Bokhovkin, Alexey Artemov, Savva Ignatiev, Matthias Nießner, Denis Zorin, Evgeny Burnaev |
ECCV 2020 |
We propose CAD-Deform, a method which obtains more accurate CAD-to-scan fits by non-rigidly deforming retrieved CAD models. |
[code][bibtex][project page] |
ScanRefer: 3D Object Localization in RGB-D Scans using Natural Language |
---|
Dave Zhenyu Chen, Angel X. Chang, Matthias Nießner |
ECCV 2020 |
We propose ScanRefer, a method that learns a fused descriptor from 3D object proposals and encoded sentence embeddings, to address the newly introduced task of 3D object localization in RGB-D scans using natural language descriptions. Along with the method we release a large-scale dataset of 51,583 descriptions of 11,046 objects from 800 ScanNet scenes. |
[video][code][bibtex][project page] |
Neural Voice Puppetry: Audio-driven Facial Reenactment |
---|
Justus Thies, Mohamed Elgharib, Ayush Tewari, Christian Theobalt, Matthias Nießner |
ECCV 2020 |
Given an audio sequence of a source person or digital assistant, we generate a photo-realistic output video of a target person that is in sync with the audio of the source input. |
[video][code][bibtex][project page] |
State of the Art on Neural Rendering |
---|
Ayush Tewari, Ohad Fried, Justus Thies, Vincent Sitzmann, Stephen Lombardi, Kalyan Sunkavalli, Ricardo Martin-Brualla, Tomas Simon, Jason Saragih, Matthias Nießner, Rohit K Pandey, Sean Fanello, Gordon Wetzstein, Jun-Yan Zhu, Christian Theobalt, Maneesh Agrawala, Eli Shechtman, Dan B Goldman, Michael Zollhöfer |
EG 2020 |
Neural rendering is a new and rapidly emerging field that combines generative machine learning techniques with physical knowledge from computer graphics, e.g., by the integration of differentiable rendering into network training. This state-of-the-art report summarizes the recent trends and applications of neural rendering. |
[bibtex][project page] |
Learning to Optimize Non-Rigid Tracking |
---|
Yang Li, Aljaž Božič, Tianwei Zhang, Yanli Ji, Tatsuya Harada, Matthias Nießner |
CVPR 2020 (Oral) |
We learn the tracking of non-rigid objects by differentiating through the underlying non-rigid solver. Specifically, we propose ConditionNet which learns to generate a problem-specific preconditioner using a large number of training samples from the Gauss-Newton update equation. The learned preconditioner increases PCG’s convergence speed by a significant margin. |
[bibtex][project page] |
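The lever being learned is the PCG preconditioner, which can be sketched generically: `precondition` below is the hook where ConditionNet's problem-specific output would be applied, demonstrated here with no preconditioner versus a simple diagonal stand-in on a toy SPD system. Iteration counts are illustrative only.

```python
# Preconditioned conjugate gradients with a pluggable preconditioner
# hook; a learned preconditioner would replace the diagonal stand-in.
import numpy as np

def pcg(A, b, precondition, iters=200, tol=1e-10):
    x = np.zeros_like(b)
    r = b - A @ x
    z = precondition(r)
    p = z.copy()
    rz = r @ z
    for k in range(iters):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) < tol:
            return x, k + 1
        z = precondition(r)
        rz, rz_old = r @ z, rz
        p = z + (rz / rz_old) * p
    return x, iters

rng = np.random.default_rng(0)
B = rng.standard_normal((80, 80))
A = B @ B.T + 80 * np.diag(1.0 + 10 * rng.random(80))  # SPD, badly scaled
b = rng.standard_normal(80)
_, k_plain = pcg(A, b, lambda r: r)                    # no preconditioner
_, k_jacobi = pcg(A, b, lambda r: r / np.diag(A))      # diagonal stand-in
print(k_plain, "vs", k_jacobi, "iterations")
```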
3D-MPA: Multi Proposal Aggregation for 3D Semantic Instance Segmentation |
---|
Francis Engelmann, Martin Bokeloh, Alireza Fathi, Bastian Leibe, Matthias Nießner |
CVPR 2020 |
We present 3D-MPA, a method for instance segmentation on 3D point clouds. We show that grouping proposals improves over NMS and outperforms previous state-of-the-art methods on the tasks of 3D object detection and semantic instance segmentation on the ScanNetV2 benchmark and the S3DIS dataset. |
[video][bibtex][project page] |
DeepDeform: Learning Non-rigid RGB-D Reconstruction with Semi-supervised Data |
---|
Aljaž Božič, Michael Zollhöfer, Christian Theobalt, Matthias Nießner |
CVPR 2020 |
We present a large dataset of 400 scenes, over 390,000 RGB-D frames, and 5,533 densely aligned frame pairs, and introduce a data-driven non-rigid RGB-D reconstruction approach using learned heatmap correspondences, achieving state-of-the-art reconstruction results on a newly established quantitative benchmark. |
[video][code][bibtex][project page] |
Local Implicit Grid Representations for 3D Scenes |
---|
Chiyu 'Max' Jiang, Avneesh Sud, Ameesh Makadia, Jingwei Huang, Matthias Nießner, Thomas Funkhouser |
CVPR 2020 |
We learn implicit representations for large 3D environments, anchored in regular grids, which facilitates high-quality surface reconstruction from unstructured input point clouds. |
[video][code][bibtex][project page] |
Adversarial Texture Optimization from RGB-D Scans |
---|
Jingwei Huang, Justus Thies, Angela Dai, Abhijit Kundu, Chiyu 'Max' Jiang, Leonidas Guibas, Matthias Nießner, Thomas Funkhouser |
CVPR 2020 |
We present a novel approach for color texture generation using a conditional adversarial loss obtained from weakly-supervised views. Specifically, we propose an approach to produce photorealistic textures for approximate surfaces, even from misaligned images, by learning an objective function that is robust to these errors. |
[video][bibtex][project page] |
ViewAL: Active Learning with Viewpoint Entropy for Semantic Segmentation |
---|
Yawar Siddiqui, Julien Valentin, Matthias Nießner |
CVPR 2020 |
We propose ViewAL, a novel active learning strategy for semantic segmentation that exploits viewpoint consistency in multi-view datasets. |
[video][code][bibtex][project page] |
SG-NN: Sparse Generative Neural Networks for Self-Supervised Scene Completion of RGB-D Scans |
---|
Angela Dai, Christian Diller, Matthias Nießner |
CVPR 2020 |
We present a novel approach that converts partial and noisy RGB-D scans into high-quality 3D scene reconstructions by inferring unobserved scene geometry. Our approach is fully self-supervised and can hence be trained solely on real-world, incomplete scans. |
[video][code][bibtex][project page] |
RevealNet: Seeing Behind Objects in RGB-D Scans |
---|
Ji Hou, Angela Dai, Matthias Nießner |
CVPR 2020 |
This paper introduces the task of semantic instance completion: from an incomplete, RGB-D scan of a scene, we detect the individual object instances comprising the scene and jointly infer their complete object geometry. |
[bibtex][project page] |
Image-guided Neural Object Rendering |
---|
Justus Thies, Michael Zollhöfer, Christian Theobalt, Marc Stamminger, Matthias Nießner |
ICLR 2020 |
We propose a new learning-based novel view synthesis approach for scanned objects that is trained based on a set of multi-view images, where we directly train a deep neural network to synthesize a view-dependent image of an object. |
[video][bibtex][project page] |
RIO: 3D Object Instance Re-Localization in Changing Indoor Environments |
---|
Johanna Wald, Armen Avetisyan, Nassir Navab, Federico Tombari, Matthias Nießner |
ICCV 2019 (Oral) |
In this work, we explore the task of 3D object instance re-localization (RIO): given one or multiple objects in an RGB-D scan, we want to estimate their corresponding 6DoF poses in another 3D scan of the same environment taken at a later point in time. |
[video][bibtex][project page] |
FaceForensics++: Learning to Detect Manipulated Facial Images |
---|
Andreas Rössler, Davide Cozzolino, Luisa Verdoliva, Christian Riess, Justus Thies, Matthias Nießner |
ICCV 2019 |
In this paper, we examine the realism of state-of-the-art facial image manipulation methods, and how difficult it is to detect them, either automatically or by humans. In particular, we create a dataset focused on DeepFakes, Face2Face, FaceSwap, and Neural Textures as prominent representatives of facial manipulations. |
[video][code][bibtex][project page] |
End-to-End CAD Model Retrieval and 9DoF Alignment in 3D Scans |
---|
Armen Avetisyan, Angela Dai, Matthias Nießner |
ICCV 2019 |
We present a novel, end-to-end approach to align CAD models to a 3D scan of a scene, enabling transformation of a noisy, incomplete 3D scan into a compact CAD reconstruction with clean, complete object geometry. |
[bibtex][project page] |
Joint Embedding of 3D Scan and CAD Objects |
---|
Manuel Dahnert, Angela Dai, Leonidas Guibas, Matthias Nießner |
ICCV 2019 |
In this paper, we address the problem of cross-domain retrieval between partial, incomplete 3D scan objects and complete CAD models. To this end, we learn a joint embedding where semantically similar objects from both domains lie close together regardless of low-level differences, such as clutter or noise. To enable fine-grained evaluation of scan-CAD model retrieval we additionally present a new dataset of scan-CAD object similarity annotations. |
[video][bibtex][project page] |
DDSL: Deep Differentiable Simplex Layer for Learning Geometric Signals |
---|
Chiyu 'Max' Jiang, Dana Lansigan, Philip Marcus, Matthias Nießner |
ICCV 2019 |
We present the Deep Differentiable Simplex Layer (DDSL), a neural network layer for geometric deep learning. The DDSL is a differentiable layer, compatible with deep neural networks, for bridging simplex mesh-based geometry representations (point clouds, line meshes, triangular meshes, tetrahedral meshes) with raster images (e.g., 2D/3D grids). |
[bibtex][project page] |
Deferred Neural Rendering: Image Synthesis using Neural Textures |
---|
Justus Thies, Michael Zollhöfer, Matthias Nießner |
ACM Transactions on Graphics 2019 (TOG) |
We introduce Deferred Neural Rendering, a new paradigm for image synthesis that combines the traditional graphics pipeline with learnable components. Specifically, we propose Neural Textures, which are learned feature maps that are trained as part of the scene capture process. Similar to traditional textures, neural textures are stored as maps on top of 3D mesh proxies; however, the high-dimensional feature maps contain significantly more information, which can be interpreted by our new deferred neural rendering pipeline. Both neural textures and deferred neural renderer are trained end-to-end, enabling us to synthesize photo-realistic images even when the original 3D content was imperfect. |
[video][bibtex][project page] |
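A compact PyTorch sketch of the pipeline: a learnable feature map (the neural texture) is sampled at UV coordinates rasterized from the mesh proxy, and a small network decodes the features to RGB; both are plain parameters trained end-to-end. Sizes, the stand-in rasterizer, and the decoder are illustrative.

```python
# Neural texture sampled at rasterized UVs, decoded to RGB.
import torch
import torch.nn as nn
import torch.nn.functional as F

C, H, W = 16, 256, 256
neural_texture = nn.Parameter(torch.randn(1, C, H, W) * 0.01)
decoder = nn.Sequential(nn.Conv2d(C, 32, 1), nn.ReLU(),
                        nn.Conv2d(32, 3, 1))           # per-pixel MLP

def render(uv):
    # uv: (1, H_out, W_out, 2) in [-1, 1], from rasterizing the proxy mesh
    feats = F.grid_sample(neural_texture, uv, align_corners=True)
    return torch.sigmoid(decoder(feats))               # (1, 3, H_out, W_out)

uv = torch.rand(1, 128, 128, 2) * 2 - 1                # stand-in rasterizer
img = render(uv)
print(img.shape)
# Training: loss = (img - ground_truth).abs().mean(); backprop updates
# both `neural_texture` and `decoder` end-to-end.
```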
Multi-Robot Collaborative Dense Scene Reconstruction |
---|
Siyan Dong, Kai Xu, Qiang Zhou, Andrea Tagliasacchi, Shiqing Xin, Matthias Nießner, Baoquan Chen |
ACM Transactions on Graphics 2019 (TOG) |
We present an autonomous scanning approach which allows multiple robots to perform collaborative scanning for dense 3D reconstruction of unknown indoor scenes. Our method plans scanning paths for several robots, allowing them to efficiently coordinate with each other such that the collective scanning coverage and reconstruction quality are maximized while the overall scanning effort is minimized. |
[video][bibtex][project page] |
Face2Face: Real-time Face Capture and Reenactment of RGB Videos |
---|
Justus Thies, Michael Zollhöfer, Marc Stamminger, Christian Theobalt, Matthias Nießner |
CACM 2019 (Research Highlight) |
Research highlight of the Face2Face approach, featured on the cover of Communications of the ACM in January 2019. Face2Face is an approach for real-time facial reenactment of a monocular target video. The method had significant impact in the research community and far beyond: it won several awards, e.g., the SIGGRAPH ETech Best in Show Award; it was featured in countless media articles, e.g., NYT, WSJ, Spiegel; and it had a massive reach on social media with millions of views. The work arguably started bringing attention to manipulations of facial videos. |
[video][bibtex][project page] |
Inverse Path Tracing for Joint Material and Lighting Estimation |
---|
Dejan Azinović, Tzu-Mao Li, Anton Kaplanyan, Matthias Nießner |
CVPR 2019 (Oral) |
We introduce Inverse Path Tracing, a novel approach to jointly estimate the material properties of objects and light sources in indoor scenes by using an invertible light transport simulation. |
[video][bibtex][project page] |
3D-SIS: 3D Semantic Instance Segmentation of RGB-D Scans |
---|
Ji Hou, Angela Dai, Matthias Nießner |
CVPR 2019 (Oral) |
We introduce 3D-SIS, a novel neural network architecture for 3D semantic instance segmentation in commodity RGB-D scans. |
[video][code][bibtex][project page] |
Scan2CAD: Learning CAD Model Alignment in RGB-D Scans |
---|
Armen Avetisyan, Manuel Dahnert, Angela Dai, Angel X. Chang, Manolis Savva, Matthias Nießner |
CVPR 2019 (Oral) |
We present Scan2CAD, a novel data-driven method that learns to align clean 3D CAD models from a shape database to the noisy and incomplete geometry of a commodity RGB-D scan. |
[video][code][bibtex][project page] |
Scan2Mesh: From Unstructured Range Scans to 3D Meshes |
---|
Angela Dai, Matthias Nießner |
CVPR 2019 |
We introduce Scan2Mesh, a novel data-driven generative neural network architecture that creates 3D meshes as indexed face sets, conditioned on an input partial scan. |
[bibtex][project page] |
DeepVoxels: Learning Persistent 3D Feature Embeddings |
---|
Vincent Sitzmann, Justus Thies, Felix Heide, Matthias Nießner, Gordon Wetzstein, Michael Zollhöfer |
CVPR 2019 (Oral) |
In this work, we address the lack of 3D understanding of generative neural networks by introducing a persistent 3D feature embedding for view synthesis. To this end, we propose DeepVoxels, a learned representation that encodes the view-dependent appearance of a 3D object without having to explicitly model its geometry. |
[video][bibtex][project page] |
TextureNet: Consistent Local Parametrizations for Learning from High-Resolution Signals on Meshes |
---|
Jingwei Huang, Haotian Zhang, Li Yi, Thomas Funkhouser, Matthias Nießner, Leonidas Guibas |
CVPR 2019 (Oral) |
We introduce TextureNet, a neural network architecture designed to extract features from high-resolution signals associated with 3D surface meshes (e.g., color texture maps). The key idea is to utilize a 4-rotational symmetric (4-RoSy) field to define a domain for convolution on a surface. |
[bibtex][project page] |
Spherical CNNs on Unstructured Grids |
---|
Chiyu 'Max' Jiang, Jingwei Huang, Karthik Kashinath, Prabhat, Philip Marcus, Matthias Nießner |
ICLR 2019 |
We present an efficient convolution kernel for Convolutional Neural Networks (CNNs) on unstructured grids using parameterized differential operators, focusing on spherical signals such as panorama images or planetary signals. To this end, we replace conventional convolution kernels with linear combinations of differential operators that are weighted by learnable parameters, as sketched below. |
[code][bibtex][project page] |
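To make the idea concrete, here is a minimal sketch of such a layer. This is not the authors' code: it assumes precomputed per-mesh operator matrices `grad_east`, `grad_north`, and `laplacian` (hypothetical names, dense or sparse [V, V]), and mixes the identity, gradient, and Laplacian responses with learnable per-channel weights.

```python
import torch
import torch.nn as nn

class DiffOpConv(nn.Module):
    """Convolution as a learned linear combination of differential operators
    (illustrative sketch, assuming precomputed mesh operators)."""
    def __init__(self, in_ch, out_ch, grad_east, grad_north, laplacian):
        super().__init__()
        # One channel-mixing weight per operator: identity, two gradient
        # components, and the Laplacian.
        self.w_id = nn.Linear(in_ch, out_ch, bias=True)
        self.w_ge = nn.Linear(in_ch, out_ch, bias=False)
        self.w_gn = nn.Linear(in_ch, out_ch, bias=False)
        self.w_lap = nn.Linear(in_ch, out_ch, bias=False)
        self.grad_east = grad_east      # [V, V] operator matrices
        self.grad_north = grad_north
        self.laplacian = laplacian

    def forward(self, f):               # f: [num_vertices, in_ch]
        return (self.w_id(f)
                + self.w_ge(self.grad_east @ f)
                + self.w_gn(self.grad_north @ f)
                + self.w_lap(self.laplacian @ f))
```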
Convolutional Neural Networks on non-uniform geometrical signals using Euclidean spectral transformation |
---|
Chiyu 'Max' Jiang, Dequan Wang, Jingwei Huang, Philip Marcus, Matthias Nießner |
ICLR 2019 |
We develop mathematical formulations for Non-Uniform Fourier Transforms (NUFT) to directly, and optimally, sample non-uniform data signals of different topologies, defined on a simplex mesh, into the spectral domain with no spatial sampling error (see the schematic below). The spectral transform is performed in Euclidean space, which removes the translation ambiguity present in works on the graph spectrum. |
[bibtex][project page] |
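Schematically (our notation, not verbatim from the paper), the transform of a signal supported on a simplex mesh decomposes into per-simplex integrals:

$$\hat{f}(\mathbf{k}) = \int_{\mathbb{R}^d} f(\mathbf{x})\, e^{-2\pi i\, \mathbf{k}\cdot\mathbf{x}}\, d\mathbf{x} = \sum_{j} \int_{\Delta_j} \rho_j(\mathbf{x})\, e^{-2\pi i\, \mathbf{k}\cdot\mathbf{x}}\, d\mathbf{x},$$

where the $\Delta_j$ are the simplices (points, segments, triangles, or tetrahedra) and $\rho_j$ the density on each; each simplex integral admits a closed form in the vertex coordinates, so the spectrum is sampled exactly, and because the integral lives in Euclidean space rather than on a graph spectrum, translations act only as phase shifts.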
RegNet: Learning the Optimization of Direct Image-to-Image Pose Registration |
---|
Lei Han, Mengqi Ji, Lu Fang, Matthias Nießner |
arXiv 2018 |
In this paper, we demonstrate that inaccurate numerical Jacobians limit the convergence range of direct image-to-image pose registration, and that this range can be greatly improved using learned approaches. Based on this observation, we propose RegNet, a novel end-to-end network that learns the optimization of image-to-image pose registration. By jointly learning a feature representation for each pixel and partial derivatives that replace handcrafted ones (e.g., numerical differentiation) in the optimization step, the neural network facilitates end-to-end optimization. |
[bibtex][project page] |
ForensicTransfer: Weakly-supervised Domain Adaptation for Forgery Detection |
---|
Davide Cozzolino, Justus Thies, Andreas Rössler, Christian Riess, Matthias Nießner, Luisa Verdoliva |
arXiv 2018 |
ForensicTransfer tackles two challenges in multimedia forensics. First, we devise a learning-based forensic detector which adapts well to new domains, i.e., novel manipulation methods. Second, we handle scenarios in which only a handful of fake examples are available during training. |
[bibtex][project page] |
FaceForensics: A Large-scale Video Dataset for Forgery Detection in Human Faces |
---|
Andreas Rössler, Davide Cozzolino, Luisa Verdoliva, Christian Riess, Justus Thies, Matthias Nießner |
arXiv 2018 |
In this paper, we introduce FaceForensics, a large-scale video dataset consisting of 1004 videos with more than 500,000 frames, altered with Face2Face, which can be used for forgery detection and to train generative refinement methods. |
[video][bibtex][project page] |
Calipso: Physics-based Image and Video Editing through CAD Model Proxies |
---|
Nazim Haouchine, Frederick Roy, Hadrien Courtecuisse, Matthias Nießner, Stephane Cotin |
Visual Computer 2018 |
We present Calipso, an interactive method for editing images and videos in a physically-coherent manner. Our main idea is to realize physics-based manipulations by running a full physics simulation on proxy geometries given by non-rigidly aligned CAD models. |
[video][bibtex][project page] |
Plan3D: Viewpoint and Trajectory Optimization for Aerial Multi-View Stereo Reconstruction |
---|
Benjamin Hepp, Matthias Nießner, Otmar Hilliges |
ACM Transactions on Graphics 2018 (TOG) |
We introduce a new method that efficiently computes a set of viewpoints and trajectories for high-quality 3D reconstructions in outdoor environments. Our goal is to automatically explore an unknown area, and obtain a complete 3D scan of a region of interest (e.g., a large building). |
[bibtex][project page] |
Parsing Geometry Using Structure-Aware Shape Templates |
---|
Vignesh Ganapathi-Subramanian, Olga Diamanti, Soeren Pirk, Chengcheng Tang, Matthias Nießner, Leonidas Guibas |
3DV 2018 |
In this paper, we organize large shape collections into parameterized shape templates to capture the underlying structure of the objects. The templates allow us to transfer the structural information onto new objects and incomplete scans. |
[video][bibtex][project page] |
QuadriFlow: A Scalable and Robust Method for Quadrangulation |
---|
Jingwei Huang, Yichao Zhou, Matthias Nießner, Jonathan Richard Shewchuk, Leonidas Guibas |
SGP 2018 (Best Paper Award) |
QuadriFlow is a scalable algorithm for generating quadrilateral surface meshes based on the Instant Field-Aligned Meshes of Jakob et al. We modify the original algorithm so that it efficiently produces meshes with many fewer singularities. Singularities in quadrilateral meshes cause problems for many applications, including parameterization and rendering with Catmull-Clark subdivision surfaces. |
[code][bibtex][supplemental][project page] |
3DMV: Joint 3D-Multi-View Prediction for 3D Semantic Scene Segmentation |
---|
Angela Dai, Matthias Nießner |
ECCV 2018 |
We present 3DMV, a novel method for 3D semantic scene segmentation of RGB-D scans in indoor environments using a joint 3D-multi-view prediction network. In contrast to existing methods that either use geometry or RGB data as input for this task, we combine both data modalities in a joint, end-to-end network architecture. |
[code][bibtex][project page] |
PlaneMatch: Patch Coplanarity Prediction for Robust RGB-D Reconstruction |
---|
Yifei Shi, Kai Xu, Matthias Nießner, Szymon Rusinkiewicz, Thomas Funkhouser |
ECCV 2018 (Oral) |
We introduce a novel RGB-D patch descriptor designed for detecting coplanar surfaces in SLAM reconstruction. The core of our method is a deep convolutional neural net that takes in RGB, depth, and normal information of a planar patch in an image and outputs a descriptor that can be used to find coplanar patches from other images. |
[bibtex][project page] |
HeadOn: Real-time Reenactment of Human Portrait Videos |
---|
Justus Thies, Michael Zollhöfer, Marc Stamminger, Christian Theobalt, Matthias Nießner |
ACM Transactions on Graphics 2018 (TOG) |
We propose HeadOn, the first real-time source-to-target reenactment approach for complete human portrait videos that enables transfer of torso and head motion, face expression, and eye gaze. Given a short RGB-D video of the target actor, we automatically construct a personalized geometry proxy that embeds a parametric head, eye, and kinematic torso model. A novel real-time reenactment algorithm employs this proxy to photo-realistically map the captured motion from the source actor to the target actor. |
[video][bibtex][project page] |
FaceVR: Real-Time Facial Reenactment and Eye Gaze Control in Virtual Reality |
---|
Justus Thies, Michael Zollhöfer, Marc Stamminger, Christian Theobalt, Matthias Nießner |
ACM Transactions on Graphics 2018 (TOG) |
We propose FaceVR, a novel image-based method that enables video teleconferencing in VR based on self-reenactment, producing nearly photo-realistic output. The key component of FaceVR is a robust algorithm that performs real-time facial motion capture of an actor wearing a head-mounted display (HMD). |
[video][bibtex][project page] |
Deep Video Portraits |
---|
Hyeongwoo Kim, Pablo Garrido, Ayush Tewari, Weipeng Xu, Justus Thies, Matthias Nießner, Patrick Pérez, Christian Richardt, Michael Zollhöfer, Christian Theobalt |
ACM Transactions on Graphics 2018 (TOG) |
Our novel approach enables photo-realistic re-animation of portrait videos using only an input video. The core of our approach is a generative neural network with a novel space-time architecture. The network takes as input synthetic renderings of a parametric face model, based on which it predicts photo-realistic video frames for a given target actor. |
[video][bibtex][project page] |
ScanComplete: Large-Scale Scene Completion and Semantic Segmentation for 3D Scans |
---|
Angela Dai, Daniel Ritchie, Martin Bokeloh, Scott Reed, Jürgen Sturm, Matthias Nießner |
CVPR 2018 |
We introduce ScanComplete, a novel data-driven approach for taking an incomplete 3D scan of a scene as input and predicting a complete 3D model along with per-voxel semantic labels. The key contribution of our method is its ability to handle large scenes with varying spatial extent, managing the cubic growth in data size as scene size increases. |
[video][code][bibtex][project page] |
State of the Art on Monocular 3D Face Reconstruction, Tracking, and Applications |
---|
Michael Zollhöfer, Justus Thies, Derek Bradley, Pablo Garrido, Thabo Beeler, Patrick Pérez, Marc Stamminger, Matthias Nießner, Christian Theobalt |
Eurographics 2018 |
This state-of-the-art report summarizes recent trends in monocular facial performance capture and discusses its applications, which range from performance-based animation to real-time facial reenactment. We focus our discussion on methods where the central task is to recover and track a three-dimensional model of the human face using optimization-based reconstruction algorithms. |
[bibtex][project page] |
State of the Art on 3D Reconstruction with RGB-D Cameras |
---|
Michael Zollhöfer, Patrick Stotko, Andreas Görlitz, Christian Theobalt, Matthias Nießner, Reinhard Klein, Andreas Kolb |
Eurographics 2018 |
In this state-of-the-art report, we analyze these recent developments in RGB-D scene reconstruction in detail and review essential related work. We explain, compare, and critically analyze the common underlying algorithmic concepts that enabled these recent advancements. Furthermore, we show how algorithms are designed to best exploit the benefits of RGB-D data while suppressing their often non-trivial data distortions. |
[bibtex][project page] |
Matterport3D: Learning from RGB-D Data in Indoor Environments |
---|
Angel X. Chang, Angela Dai, Thomas Funkhouser, Maciej Halber, Matthias Nießner, Manolis Savva, Shuran Song, Andy Zeng, Yinda Zhang |
3DV 2017 |
In this paper, we introduce Matterport3D, a large-scale RGB-D dataset containing 10,800 panoramic views from 194,400 RGB-D images of 90 building-scale scenes. |
[bibtex][project page] |
Multiframe Scene Flow with Piecewise Rigid Motion |
---|
Vladislav Golyanik, Kihwan Kim, Robert Maier, Matthias Nießner, Didier Stricker, Jan Kautz |
3DV 2017 |
We introduce a novel multi-frame scene flow approach that jointly optimizes the consistency of the patch appearances and their local rigid motions from RGB-D image sequences. |
[bibtex][project page] |
3DLite: Towards Commodity 3D Scanning for Content Creation |
---|
Jingwei Huang, Angela Dai, Leonidas Guibas, Matthias Nießner |
ACM Transactions on Graphics 2017 (TOG) |
We present 3DLite, a novel approach to reconstruct 3D environments using consumer RGB-D sensors, taking a step towards directly utilizing captured 3D content in graphics applications such as video games, VR, or AR. Rather than reconstructing an accurate one-to-one representation of the real world, our method computes a lightweight, low-polygonal geometric abstraction of the scanned geometry. |
[video][bibtex][supplemental][project page] |
Autonomous Reconstruction of Unknown Indoor Scenes Guided by Time-varying Tensor Fields |
---|
Kai Xu, Lintao Zheng, Zihao Yan, Guohang Yan, Eugene Zhang, Matthias Nießner, Oliver Deussen, Daniel Cohen-Or, Hui Huang |
ACM Transactions on Graphics 2017 (TOG) |
We present a navigation-by-reconstruction approach in which the moving paths of the robot are planned to account for both global efficiency, for fast exploration, and local smoothness, to obtain high-quality scans. |
[video][code][bibtex][project page] |
Opt: A Domain Specific Language for Non-linear Least Squares Optimization in Graphics and Imaging |
---|
Zachary DeVito, Michael Mara, Michael Zollhöfer, Gilbert Bernstein, Jonathan Ragan-Kelley, Christian Theobalt, Pat Hanrahan, Matthew Fisher, Matthias Nießner |
ACM Transactions on Graphics 2017 (TOG) |
We propose a new language, Opt, in which a user simply writes energy functions over image- or graph-structured unknowns, and a compiler automatically generates state-of-the-art GPU optimization kernels. |
[bibtex][project page] |
BundleFusion: Real-time Globally Consistent 3D Reconstruction using On-the-fly Surface Re-integration |
---|
Angela Dai, Matthias Nießner, Michael Zollhöfer, Shahram Izadi, Christian Theobalt |
ACM Transactions on Graphics 2017 (TOG) |
We introduce a novel, real-time, end-to-end 3D reconstruction framework with a robust pose optimization strategy based on sparse feature matches and dense geometric and photometric alignment (see the schematic energy below). One main contribution is the ability to update the reconstructed model on-the-fly as new (global) pose optimization results become available. |
[video][bibtex][project page] |
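As a schematic summary (our notation, not verbatim from the paper), the global alignment optimizes the set of rigid camera poses $\mathcal{X}$ over a weighted combination of sparse and dense terms:

$$E_{\text{align}}(\mathcal{X}) = w_{\text{sparse}}\, E_{\text{sparse}}(\mathcal{X}) + w_{\text{photo}}\, E_{\text{photo}}(\mathcal{X}) + w_{\text{geo}}\, E_{\text{geo}}(\mathcal{X}),$$

where the sparse term scores feature correspondences, the photometric term color consistency, and the geometric term point-to-plane depth alignment; whenever re-optimization changes a frame's pose, its surface contribution is de-integrated and re-integrated into the volume on-the-fly.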
Intrinsic3D: High-Quality 3D Reconstruction by Joint Appearance and Geometry Optimization with Spatially-Varying Lighting |
---|
Robert Maier, Kihwan Kim, Daniel Cremers, Jan Kautz, Matthias Nießner |
ICCV 2017 |
We introduce a novel method to obtain high-quality 3D reconstructions from consumer RGB-D sensors. Our core idea is to simultaneously optimize for geometry encoded in a signed distance field (SDF), textures from automatically selected keyframes, and their camera poses along with material and scene lighting. |
[code][bibtex][supplemental][project page] |
A Lightweight Approach for On-the-Fly Reflectance Estimation |
---|
Kihwan Kim, Jinwei Gu, Stephen Tyree, Pavlo Molchanov, Matthias Nießner, Jan Kautz |
ICCV 2017 (Oral) |
We propose a lightweight, learning-based approach for surface reflectance estimation directly from 8-bit RGB images in real-time, which can be easily plugged into any 3D scanning-and-fusion system with a commodity RGB-D sensor. |
[bibtex][project page] |
ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes |
---|
Angela Dai, Angel X. Chang, Manolis Savva, Maciej Halber, Thomas Funkhouser, Matthias Nießner |
CVPR 2017 (Spotlight) |
We introduce ScanNet, an RGB-D video dataset containing 2.5M views in 1513 scenes annotated with 3D camera poses, surface reconstructions, and semantic segmentations. |
[video][bibtex][project page] |
Shape Completion using 3D-Encoder-Predictor CNNs and Shape Synthesis |
---|
Angela Dai, Charles Ruizhongtai Qi, Matthias Nießner |
CVPR 2017 (Spotlight) |
We introduce a data-driven approach to complete partial 3D shapes through a combination of volumetric deep neural networks and 3D shape synthesis. From a partially-scanned input shape, our method first infers a low-resolution but complete output shape. |
[bibtex][project page] |
3DMatch: Learning the Matching of Local 3D Geometry in Range Scans |
---|
Andy Zeng, Shuran Song, Matthias Nießner, Matthew Fisher, Jianxiong Xiao |
CVPR 2017 (Oral) |
In this paper, we introduce 3DMatch, a data-driven local feature learner that jointly learns a geometric feature representation and an associated metric function from a large collection of real-world scanning data. |
[video][bibtex][project page] |
Learning to Navigate the Energy Landscape |
---|
Julien Valentin, Angela Dai, Matthias Nießner, Pushmeet Kohli, Philip H. S. Torr, Shahram Izadi, Cem Keskin |
3DV 2016 |
In this paper, we present a novel, general, and efficient architecture for addressing computer vision problems that are approached from an 'Analysis by Synthesis' standpoint. |
[video][bibtex][project page] |
VolumeDeform: Real-time Volumetric Non-rigid Reconstruction |
---|
Matthias Innmann, Michael Zollhöfer, Matthias Nießner, Christian Theobalt, Marc Stamminger |
ECCV 2016 |
We present a novel approach for the reconstruction of dynamic geometric shapes using a single hand-held consumer-grade RGB-D sensor at real-time rates. Our method does not require a pre-defined shape template to start with and builds up the scene model from scratch during the scanning process. |
[video][bibtex][supplemental][project page] |
Efficient GPU Rendering of Subdivision Surfaces using Adaptive Quadtrees |
---|
Wade Brainerd, Tim Foley, Manuel Kraemer, Henry Moreton, Matthias Nießner |
ACM Transactions on Graphics 2016 (TOG) |
We present a novel method for real-time rendering of subdivision surfaces whose goal is to make subdivision faces as easy to render as triangles, points, or lines. Our approach uses the GPU tessellation hardware and processes each face of a base mesh independently and in a streaming fashion, thus allowing an entire model to be rendered in a single pass. |
[video][bibtex][project page] |
PiGraphs: Learning Interaction Snapshots from Observations |
---|
Manolis Savva, Angel X. Chang, Pat Hanrahan, Matthew Fisher, Matthias Nießner |
ACM Transactions on Graphics 2016 (TOG) |
We learn a probabilistic model connecting human poses and arrangements of objects from observations of interactions collected with commodity RGB-D sensors. This model is encoded as a set of Prototypical Interaction Graphs (PiGraphs): a human-centric representation capturing physical contact and attention linkages between geometry and the human body. |
[video][bibtex][project page] |
ProxImaL: Efficient Image Optimization using Proximal Algorithms |
---|
Felix Heide, Steven Diamond, Matthias Nießner, Jonathan Ragan-Kelley, Wolfgang Heidrich, Gordon Wetzstein |
ACM Transactions on Graphics 2016 (TOG) |
ProxImaL is a domain-specific language and compiler for image optimization problems that makes it easy to experiment with different problem formulations and algorithm choices. The compiler intelligently chooses the best way to translate a problem formulation and choice of optimization algorithm into an efficient solver implementation (a generic proximal-gradient sketch follows below). |
[bibtex][supplemental][project page] |
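To make the "proximal algorithms" concrete, here is a generic proximal-gradient (ISTA) sketch in NumPy; this is not ProxImaL's actual API, just an illustration of the kind of solver such a compiler generates, for a problem of the form min_x 0.5*||Ax - b||^2 + lam*||x||_1.

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of t*||.||_1 (elementwise soft-thresholding).
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(A, b, lam, steps=200):
    # Proximal gradient descent for  min_x 0.5*||A x - b||^2 + lam*||x||_1
    x = np.zeros(A.shape[1])
    tau = 1.0 / np.linalg.norm(A, 2) ** 2       # step size <= 1/L
    for _ in range(steps):
        grad = A.T @ (A @ x - b)                 # gradient of the smooth term
        x = soft_threshold(x - tau * grad, tau * lam)
    return x
```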
Face2Face: Real-time Face Capture and Reenactment of RGB Videos |
---|
Justus Thies, Michael Zollhöfer, Marc Stamminger, Christian Theobalt, Matthias Nießner |
CVPR 2016 (Oral) |
We present a novel approach for real-time facial reenactment of a monocular target video sequence (e.g., a YouTube video). The source sequence is also a monocular video stream, captured live with a commodity webcam. Our goal is to animate the facial expressions of the target video by a source actor and re-render the manipulated output video in a photo-realistic fashion. |
[video][bibtex][supplemental][project page] |
Volumetric and Multi-View CNNs for Object Classification on 3D Data |
---|
Charles Ruizhongtai Qi, Hao Su, Matthias Nießner, Angela Dai, Mengyuan Yan, Leonidas Guibas |
CVPR 2016 (Spotlight) |
In this paper, we improve both Volumetric CNNs and Multi-view CNNs by introducing new distinct network architectures. Overall, we are able to outperform current state-of-the-art methods for both Volumetric CNNs and Multi-view CNNs. |
[code][bibtex][supplemental][project page] |
Real-time Expression Transfer for Facial Reenactment |
---|
Justus Thies, Michael Zollhöfer, Matthias Nießner, Levi Valgaerts, Marc Stamminger, Christian Theobalt |
ACM Transactions on Graphics 2015 (TOG) |
We present a method for the real-time transfer of facial expressions from an actor in a source video to an actor in a target video, thus enabling the ad-hoc control of the facial expressions of the target actor. |
[video][bibtex][project page] |
Activity-centric Scene Synthesis for Functional 3D Scene Modeling |
---|
Matthew Fisher, Manolis Savva, Yangyan Li, Pat Hanrahan, Matthias Nießner |
ACM Transactions on Graphics 2015 (TOG) |
We present a novel method to generate 3D scenes that allow the same activities as real environments captured through noisy and incomplete 3D scans. |
[video][bibtex][supplemental][project page] |
SemanticPaint: Interactive 3D Labeling and Learning at your Fingertips |
---|
Julien Valentin, Vibhav Vineet, Ming-Ming Cheng, David Kim, Jamie Shotton, Pushmeet Kohli, Matthias Nießner, Antonio Criminisi, Shahram Izadi, Philip H. S. Torr |
ACM Transactions on Graphics 2015 (TOG) |
We present a new interactive and online approach to 3D scene understanding. Our system, SemanticPaint, allows users to scan their environment while simultaneously segmenting the scene, simply by reaching out and touching any desired object or surface. |
[video][bibtex][project page] |
Efficient Ray Tracing of Subdivision Surfaces using Tessellation Caching |
---|
Carsten Benthin, Sven Woop, Matthias Nießner, Kai Selgrad, Ingo Wald |
High Performance Graphics 2015 |
In this paper, we propose a method to efficiently ray trace subdivision surfaces using a lazy-build caching scheme while exploiting the capabilities of today's many-core architectures. Our approach is part of Intel's Embree. |
[video][code][bibtex][project page] |
Real-time Rendering Techniques with Hardware Tessellation |
---|
Matthias Nießner, Benjamin Keinert, Matthew Fisher, Marc Stamminger, Charles Loop, Henry Schäfer |
Computer Graphics Forum 2015 |
In this survey, we provide an overview of real-time rendering techniques with hardware tessellation by summarizing, discussing, and comparing state-of-the-art approaches. |
[bibtex][project page] |
Shading-based Refinement on Volumetric Signed Distance Functions |
---|
Michael Zollhöfer, Angela Dai, Matthias Innmann, Chenglei Wu, Marc Stamminger, Christian Theobalt, Matthias Nießner |
ACM Transactions on Graphics 2015 (TOG) |
We present a novel method to obtain fine-scale detail in 3D reconstructions generated with low-budget RGB-D cameras or other commodity scanning devices. |
[video][bibtex][project page] |
Exploiting Uncertainty in Regression Forests for Accurate Camera Relocalization |
---|
Julien Valentin, Matthias Nießner, Jamie Shotton, Andrew Fitzgibbon, Shahram Izadi, Philip H. S. Torr |
CVPR 2015 |
In this paper, we train a regression forest to predict mixtures of anisotropic 3D Gaussians and show how the predicted uncertainties can be taken into account for continuous pose optimization (see the schematic energy below). Experiments show that our method is able to relocalize up to 40 percent more frames than the state of the art. |
[video][bibtex][project page] |
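One plausible schematic of the continuous pose optimization (our notation, not verbatim from the paper): with world-coordinate predictions expressed as anisotropic Gaussian mixtures, the camera pose $\theta$ is refined by minimizing Mahalanobis distances to the best-matching mode per observation,

$$E(\theta) = \sum_{i} \min_{m} \big(\theta\, p_i - \mu_{i,m}\big)^{\top} \Sigma_{i,m}^{-1} \big(\theta\, p_i - \mu_{i,m}\big),$$

where $p_i$ is the 3D point observed at pixel $i$ in camera space and $(\mu_{i,m}, \Sigma_{i,m})$ are the forest's predicted modes; the anisotropic covariances let confident predictions pull harder on the pose than uncertain ones.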
Incremental Dense Semantic Stereo Fusion for Large-Scale Semantic Scene Reconstruction |
---|
Vibhav Vineet, Ondrej Miksik, Morten Lidegaard, Matthias Nießner, Stuart Golodetz, Victor A. Prisacariu, Olaf Kähler, David W. Murray, Shahram Izadi, Patrick Pérez, Philip H. S. Torr |
ICRA 2015 |
In this paper, we build on a recent hash-based technique for large-scale fusion and an efficient mean-field inference algorithm for densely-connected CRFs to present what to our knowledge is the first system that can perform dense, large-scale, outdoor semantic reconstruction of a scene in (near) real time. |
[video][bibtex][project page] |
The Semantic Paintbrush: Interactive 3D Mapping and Recognition in Large Outdoor Spaces |
---|
Ondrej Miksik, Vibhav Vineet, Morten Lidegaard, Ram Prasaath, Matthias Nießner, Stuart Golodetz, Stephen L. Hicks, Patrick Pérez, Shahram Izadi, Philip H. S. Torr |
CHI 2015 |
We present an augmented reality system for large-scale 3D reconstruction and recognition in outdoor scenes. Unlike prior work, which tries to reconstruct scenes using active depth cameras, we use a purely passive stereo setup, allowing for outdoor use and an extended sensing range. |
[bibtex][project page] |
Database-Assisted Object Retrieval for Real-Time 3D Reconstruction |
---|
Yangyan Li, Angela Dai, Leonidas Guibas, Matthias Nießner |
Eurographics 2015 |
We present a novel reconstruction approach based on retrieving objects from a 3D shape database while scanning an environment in real-time. With this approach, we are able to replace scanned RGB-D data with complete, hand-modeled objects from shape databases. |
[video][bibtex][project page] |
Dynamic Feature-Adaptive Subdivision |
---|
Henry Schäfer, Jens Raab, Mark Meyer, Marc Stamminger, Matthias Nießner |
I3D 2015 |
In this paper, we present dynamic feature-adaptive subdivision (DFAS), which improves upon feature-adaptive subdivision (FAS) by enabling an independent subdivision depth for every irregularity. |
[video][bibtex][project page] |
SceneGrok: Inferring Action Maps in 3D Environments |
---|
Manolis Savva, Angel X. Chang, Pat Hanrahan, Matthew Fisher, Matthias Nießner |
ACM Transactions on Graphics 2014 (TOG) |
In this paper, we present a method to establish a correlation between the geometry and the functionality of 3D environments. |
[video][bibtex][project page] |
Real-time Shading-based Refinement for Consumer Depth Cameras |
---|
Chenglei Wu, Michael Zollhöfer, Matthias Nießner, Marc Stamminger, Shahram Izadi, Christian Theobalt |
ACM Transactions on Graphics 2014 (TOG) |
We present the first real-time method for refinement of depth data using shape-from-shading in general uncontrolled scenes. |
[video][code][bibtex][project page] |
Real-time Non-rigid Reconstruction using an RGB-D Camera |
---|
Michael Zollhöfer, Matthias Nießner, Shahram Izadi, Christoph Rhemann, Christopher Zach, Matthew Fisher, Chenglei Wu, Andrew Fitzgibbon, Charles Loop, Christian Theobalt, Marc Stamminger |
ACM Transactions on Graphics 2014 (TOG) |
We present a combined hardware and software solution for markerless reconstruction of non-rigidly deforming physical objects with arbitrary shape in real-time. |
[video][bibtex][project page] |
Real-Time Deformation of Subdivision Surfaces from Object Collisions |
---|
Henry Schäfer, Benjamin Keinert, Matthias Nießner, Christoph Buchenau, Michael Guthe, Marc Stamminger |
High Performance Graphics 2014 |
We present a novel real-time approach for fine-scale surface deformations resulting from collisions. |
[video][bibtex][project page] |
Local Painting and Deformation of Meshes on the GPU |
---|
Henry Schäfer, Benjamin Keinert, Matthias Nießner, Marc Stamminger |
Computer Graphics Forum 2014 |
We present a novel method to adaptively apply modifications to scene data stored in GPU memory. Such modifications may include interactive painting and sculpting operations in an authoring tool, or deformations resulting from collisions between scene objects detected by a physics engine. |
[video][bibtex][project page] |
RetroDepth: 3D Silhouette Sensing for High-Precision Input On and Above Physical Surfaces |
---|
David Kim, Shahram Izadi, Jakub Dostal, Christoph Rhemann, Cem Keskin, Christopher Zach, Jamie Shotton, Tim Large, Steven Bathiche, Matthias Nießner, Alex Butler, Sean Fanello, Vivek Pradeep |
CHI 2014 |
We present RetroDepth, a new vision-based system for accurately sensing the 3D silhouettes of hands, styluses, and other objects, as they interact on and above physical surfaces. |
[video][bibtex][project page] |
State of the Art Report on Real-time Rendering with Hardware Tessellation |
---|
Henry Schäfer, Matthias Nießner, Benjamin Keinert, Marc Stamminger, Charles Loop |
Eurographics 2014 |
In this state-of-the-art report, we provide an overview of recent work and challenges in this area by summarizing, discussing, and comparing methods for the real-time rendering of smooth and highly detailed surfaces. |
[bibtex][project page] |
Combining Inertial Navigation and ICP for Real-time 3D Surface Reconstruction |
---|
Matthias Nießner, Angela Dai, Matthew Fisher |
Eurographics 2014 |
We present a novel method to improve the robustness of real-time 3D surface reconstruction by incorporating inertial sensor data when determining inter-frame alignment. With commodity inertial sensors, we can significantly reduce the number of iterative closest point (ICP) iterations required per frame (see the sketch below). |
[video][bibtex][project page] |
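A minimal sketch of the idea, with hypothetical names (`icp_refine`, `R_imu`) rather than the authors' implementation: the IMU's integrated relative rotation seeds ICP, so the alignment starts close to the optimum and converges in fewer iterations.

```python
import numpy as np

def align_frame(icp_refine, src_pts, dst_pts, R_imu):
    """Seed ICP with the inter-frame rotation from a commodity IMU.

    icp_refine : any point-to-plane ICP routine taking an initial 4x4 pose
    R_imu      : 3x3 relative rotation integrated from gyroscope readings
    """
    T_init = np.eye(4)
    T_init[:3, :3] = R_imu   # rotation prior from inertial sensing
    # Translation is left at zero: commodity IMUs drift too much to seed it.
    return icp_refine(src_pts, dst_pts, T_init)
```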
Real-time 3D Reconstruction at Scale using Voxel Hashing |
---|
Matthias Nießner, Michael Zollhöfer, Shahram Izadi, Marc Stamminger |
ACM Transactions on Graphics 2013 (TOG) |
Online 3D reconstruction is gaining newfound interest due to the availability of real-time consumer depth cameras. We contribute an online system for large- and fine-scale volumetric reconstruction based on a memory- and speed-efficient data structure (a sketch of the underlying spatial hash follows below). |
[video][code][bibtex][project page] |
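A minimal sketch of the spatial hashing at the core of the data structure: world space is divided into voxel blocks, only blocks near observed surfaces are allocated, and blocks are retrieved via a hash of their integer coordinates. The primes follow the hash of Teschner et al., on which the paper builds; the other defaults are illustrative.

```python
import math

P1, P2, P3 = 73856093, 19349669, 83492791   # large primes (Teschner et al.)

def block_hash(x, y, z, num_buckets):
    # Map integer voxel-block coordinates to a bucket in a flat hash table.
    return ((x * P1) ^ (y * P2) ^ (z * P3)) % num_buckets

def point_to_block(p, voxel_size=0.004, block_dim=8):
    # World-space point -> containing voxel-block coordinate
    # (illustrative defaults: 4 mm voxels, 8^3 voxels per block).
    return tuple(math.floor(c / (voxel_size * block_dim)) for c in p)
```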
Analytic Displacement Mapping using Hardware Tessellation |
---|
Matthias Nießner, Charles Loop |
ACM Transactions on Graphics 2013 (TOG) |
Displacement mapping is ideal for modern GPUs since it enables high-frequency geometric surface detail on models with low memory I/O. We provide a comprehensive solution to the problems that have limited its adoption by introducing a smooth analytic displacement function (see the schematic below). |
[bibtex][project page] |
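Schematically (our notation), the displaced surface is defined as

$$f(u,v) = s(u,v) + \hat{n}_s(u,v)\, D(u,v),$$

where $s$ is the smooth base surface (the Catmull-Clark limit surface), $\hat{n}_s$ its unit normal, and $D$ a smooth scalar displacement function; because both $s$ and $D$ are analytic, shading normals follow from their derivatives instead of costly numerical re-computation.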
Real-time Collision Detection for Dynamic Hardware Tessellated Objects |
---|
Matthias Nießner, Christian Siegl, Henry Schäfer, Charles Loop |
Eurographics 2013 |
We present a novel method for real-time collision detection of patch based, displacement mapped objects using hardware tessellation. Our method supports fully animated, dynamically tessellated objects and runs entirely on the GPU. |
[bibtex][project page] |
Rendering Subdivision Surfaces using Hardware Tessellation |
---|
Matthias Nießner |
PhD Thesis (Published by Dr. Hut) |
In this thesis we present techniques that facilitate the use of high-quality movie content in real-time applications that run on commodity desktop computers. We utilize modern graphics hardware and use hardware tessellation to generate surface geometry on-the-fly based on patches. |
[bibtex][project page] |
Real-time Simulation of Human Vision using Temporal Compositing with CUDA on the GPU |
---|
Matthias Nießner, Nadine Kuhnert, Kai Selgrad, Marc Stamminger, Georg Michelson |
Proceedings 25th Workshop on Parallel Systems and Algorithms 2013 |
We present a novel approach that simulates human vision, including visual defects such as glaucoma, by temporal compositing of human vision in real-time on the GPU. To this end, we determine eye focus points at every time step and adapt the lens accommodation of our virtual eye model accordingly. |
[bibtex][project page] |
Feature Adaptive GPU Rendering of Catmull-Clark Subdivision Surfaces |
---|
Matthias Nießner, Charles Loop, Mark Meyer, Tony DeRose |
ACM Transactions on Graphics 2012 (TOG) |
We present a novel method for high-performance GPU based rendering of Catmull-Clark subdivision surfaces. Unlike previous methods, our algorithm computes the true limit surface up to machine precision, and is capable of rendering surfaces that conform to the full RenderMan specification for Catmull-Clark surfaces. |
[video][code][bibtex][project page] |
Patch-based Occlusion Culling for Hardware Tessellation |
---|
Matthias Nießner, Charles Loop |
Computer Graphics International 2012 |
We present an occlusion culling algorithm that leverages the unique characteristics of patch primitives within the hardware tessellation pipeline. |
[bibtex][project page] |
Efficient Evaluation of Semi-Smooth Creases in Catmull-Clark Subdivision Surfaces |
---|
Matthias Nießner, Charles Loop, Günther Greiner |
Eurographics 2012 |
We present a novel method to evaluate semi-smooth creases in Catmull-Clark subdivision surfaces. Our algorithm supports both integer and fractional crease tags corresponding to the RenderMan (Pixar) specification. |
[bibtex][project page] |
Real-time Simulation and Visualization of Human Vision through Eyeglasses on the GPU |
---|
Matthias Nießner, Roman Sturm, Günther Greiner |
ACM SIGGRAPH VRCAI 2012 |
We present a novel approach that allows real-time simulation of human vision through eyeglasses. Our system supports glasses composed of a combination of spherical, toric, and in particular free-form surfaces. |
[bibtex][project page] |