<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>CAMCV</title><link>https://camcv.github.io/</link><atom:link href="https://camcv.github.io/index.xml" rel="self" type="application/rss+xml"/><description>CAMCV</description><generator>Hugo Blox Builder (https://hugoblox.com)</generator><language>en-us</language><lastBuildDate>Mon, 24 Oct 2022 00:00:00 +0000</lastBuildDate><image><url>https://camcv.github.io/media/icon_hu_642791a3c5816746.png</url><title>CAMCV</title><link>https://camcv.github.io/</link></image><item><title>Camera Calibration in Sports (Magera, ULiege and EVS)</title><link>https://camcv.github.io/event/26magera/</link><pubDate>Thu, 29 Jan 2026 15:00:00 +0000</pubDate><guid>https://camcv.github.io/event/26magera/</guid><description/></item><item><title>3D Computer Vision (Busam, TUM) and World Models (Parisot, Microsoft)</title><link>https://camcv.github.io/event/25busam_parisot/</link><pubDate>Thu, 11 Sep 2025 11:00:00 +0000</pubDate><guid>https://camcv.github.io/event/25busam_parisot/</guid><description/></item><item><title>Computer Vision</title><link>https://camcv.github.io/projects/cv/</link><pubDate>Wed, 01 Jan 2025 00:00:00 +0000</pubDate><guid>https://camcv.github.io/projects/cv/</guid><description>&lt;p>We are the Computer Vision Working Group of CV4DT. Click for more!&lt;/p>
&lt;p>We are the Computer Vision Working Group of CV4DT: (&lt;a href="https://camcv.github.io/author/dr-olaf-wysocki/">Olaf Wysocki&lt;/a>, &lt;a href="https://camcv.github.io/author/haibing-wu/">Haibing Wu&lt;/a>, &lt;a href="https://camcv.github.io/author/qilin-zhang/">Qilin Zhang&lt;/a>, &lt;a href="https://camcv.github.io/author/daniel-lehmberg/">Daniel Lehmberg&lt;/a>, &lt;a href="https://camcv.github.io/author/wanru-yang/">Wanru Yang&lt;/a>).&lt;/p>
&lt;h1 id="research-projects">Research Projects&lt;/h1>
&lt;p>Our projects are primarily &lt;strong>research-oriented&lt;/strong>, aiming for publication in top-tier computer vision venues such as &lt;strong>CVPR&lt;/strong>, &lt;strong>ECCV&lt;/strong>, &lt;strong>NeurIPS&lt;/strong>, and similar.&lt;br>
Below is an overview of our ongoing and upcoming research directions.&lt;/p>
&lt;hr>
&lt;h3 id="-structured-3d-object-reconstruction">🏠 Structured 3D Object Reconstruction&lt;/h3>
&lt;p>We aim to reconstruct &lt;strong>structured 3D models&lt;/strong> aligned with interpretable geometric and semantic representations.&lt;br>
This direction builds upon our prior work:&lt;/p>
&lt;ul>
&lt;li>&lt;a href="https://openaccess.thecvf.com/content/CVPR2025W/USM3D/papers/Tang_Texture2LoD3_Enabling_LoD3_Building_Reconstruction_With_Panoramic_Images_CVPRW_2025_paper.pdf" target="_blank" rel="noopener">&lt;em>Texture2LoD3: Enabling LoD3 Building Reconstruction With Panoramic Images&lt;/em> (CVPR25)&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://openaccess.thecvf.com/content/CVPR2023W/PCV/papers/Wysocki_Scan2LoD3_Reconstructing_Semantic_3D_Building_Models_at_LoD3_Using_Ray_CVPRW_2023_paper.pdf" target="_blank" rel="noopener">&lt;em>Scan2LoD3: Reconstructing semantic 3D building models at LoD3 using ray casting and Bayesian networks&lt;/em> (CVPR23)&lt;/a>&lt;/li>
&lt;/ul>
&lt;p>Several project proposals are currently under review to expand this line of research.&lt;/p>
&lt;hr>
&lt;h3 id="-revisiting-geometric-features-for-3d-scene-understanding">🧩 Revisiting Geometric Features for 3D Scene Understanding&lt;/h3>
&lt;p>We revisit &lt;strong>geometric descriptors&lt;/strong> for large-scale &lt;strong>3D semantic segmentation&lt;/strong>, &lt;strong>self-supervised learning (SSL)&lt;/strong>, &lt;strong>3D instance segmentation&lt;/strong>, &lt;strong>3D object pose estimation&lt;/strong>, and &lt;strong>3D shape completion&lt;/strong>, studying how handcrafted and learned geometric features can be combined to achieve better generalization across domains. Preliminary findings are available in &lt;a href="https://arxiv.org/pdf/2402.06506" target="_blank" rel="noopener">&lt;em>arXiv:2402.06506&lt;/em>&lt;/a>.&lt;/p>
&lt;p>Further papers expanding this line of research are in preparation.&lt;/p>
&lt;hr>
&lt;h3 id="-sim2real-3d-domain-gap">🏁 Sim2Real 3D Domain Gap&lt;/h3>
&lt;p>We still observe large domain gaps between simulated and real-world data, hampering the application of simulated data to real-world challenges and many downstream tasks. We believe in the power of &lt;em>diffusion models&lt;/em> to bridge this gap. Preliminary results, building a framework for running simulations within a unique real-world city twin, are available in &lt;a href="https://arxiv.org/abs/2505.17959" target="_blank" rel="noopener">&lt;em>arXiv:2505.17959&lt;/em>&lt;/a>.&lt;/p>
&lt;p>One paper is under review, while another draft is in preparation.&lt;/p>
&lt;hr>
&lt;h3 id="-6dof-estimation-using-structured-3d-models">🧭 6DoF Estimation Using Structured 3D Models&lt;/h3>
&lt;p>We explore &lt;strong>structured 3D model representations&lt;/strong> for &lt;strong>6-degree-of-freedom (6DoF) pose estimation&lt;/strong>, targeting improved robustness and interpretability compared to implicit or point-based methods.&lt;br>
This direction builds on the related work &lt;a href="https://proceedings.neurips.cc/paper_files/paper/2024/file/d78ece6613953f46501b958b7bb4582f-Paper-Conference.pdf" target="_blank" rel="noopener">&lt;em>LoD-Loc: Aerial Visual Localization using LoD 3D Map with Neural Wireframe Alignment&lt;/em> (NeurIPS24)&lt;/a>.&lt;/p>
&lt;p>A new iteration of this work is in preparation for upcoming major conference deadlines.&lt;/p>
&lt;hr>
&lt;h3 id="-geometry-prior-guided-3d-gaussian-splatting">🌌 Geometry-Prior-Guided 3D Gaussian Splatting&lt;/h3>
&lt;p>This project investigates the integration of &lt;strong>geometry-aware priors&lt;/strong> into &lt;strong>3D Gaussian Splatting&lt;/strong> to enhance reconstruction quality and geometric fidelity.&lt;br>
Preliminary findings are available in &lt;a href="https://arxiv.org/pdf/2508.07355" target="_blank" rel="noopener">&lt;em>arXiv:2508.07355&lt;/em>&lt;/a>, and ongoing work extends the framework beyond building-specific scenarios toward &lt;strong>general-purpose 3D environments&lt;/strong>.&lt;/p>
&lt;hr>
&lt;h3 id="-quantifying-uncertainty-of-x">📈 Quantifying Uncertainty of X&lt;/h3>
&lt;p>In this research direction, we explore quantification of uncertainty across various modalities and downstream tasks, from data acquisition through segmentation to inference. Our rationale is often (but not exclusively) grounded in Bayesian modeling of uncertainty. We have previously published on reconstruction uncertainty, e.g.,
&lt;a href="https://openaccess.thecvf.com/content/CVPR2023W/PCV/papers/Wysocki_Scan2LoD3_Reconstructing_Semantic_3D_Building_Models_at_LoD3_Using_Ray_CVPRW_2023_paper.pdf" target="_blank" rel="noopener">&lt;em>Scan2LoD3: Reconstructing semantic 3D building models at LoD3 using ray casting and Bayesian networks&lt;/em> (CVPR23)&lt;/a>.
Currently, we are involved in the funded project &lt;a href="https://www.asg.ed.tum.de/en/gds/forschung-research/projects/nerf2bim/" target="_blank" rel="noopener">NeRF2BIM&lt;/a>, together with Profs. Petzold, Holst, and Niessner, where we analyze laser scanning uncertainty and its influence on final 3D object reconstruction.&lt;/p>
&lt;hr>
&lt;h3 id="-dataset-development">🗂️ Dataset Development&lt;/h3>
&lt;p>We also curate and release datasets supporting our main research directions, including:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Facade Segmentation Dataset&lt;/strong> – for large-scale semantic façade parsing, building upon &lt;a href="https://openaccess.thecvf.com/content/WACV2025/html/Wysocki_ZAHA_Introducing_the_Level_of_Facade_Generalization_and_the_Large-Scale_WACV_2025_paper.html" target="_blank" rel="noopener">&lt;em>ZAHA&lt;/em> (WACV25)&lt;/a>, the world's largest facade dataset&lt;/li>
&lt;li>&lt;strong>Point Cloud Completion Dataset&lt;/strong> – for partial-to-complete reconstruction learning&lt;/li>
&lt;li>&lt;strong>3D Object Reconstruction Dataset&lt;/strong> – for structured geometry prediction and analysis&lt;/li>
&lt;/ul>
&lt;p>These datasets promote &lt;strong>reproducible, data-rich 3D research&lt;/strong> across geometry, perception, and robotics.&lt;/p>
&lt;p>We are always looking for fantastic people to join us in collaborating on these projects!&lt;/p></description></item><item><title>Robotics</title><link>https://camcv.github.io/projects/robotics/</link><pubDate>Wed, 01 Jan 2025 00:00:00 +0000</pubDate><guid>https://camcv.github.io/projects/robotics/</guid><description>&lt;p>We are the Robotics Working Group of CV4DT. Click for more!&lt;/p>
&lt;h1 id="robotics">Robotics&lt;/h1>
&lt;p>We are a computer vision &amp;amp; robotics working group (&lt;a href="https://camcv.github.io/author/dr-guangming-wang/">Guangming Wang&lt;/a>, &lt;a href="https://camcv.github.io/author/dr-yixiong-jing/">Yixiong Jing&lt;/a>, &lt;a href="https://camcv.github.io/author/qizhen-ying/">Qizhen Ying&lt;/a>), focusing on:&lt;/p>
&lt;ul>
&lt;li>Robotic manipulation and control&lt;/li>
&lt;li>3D vision for robotic perception&lt;/li>
&lt;li>Generative models for planning and world understanding&lt;/li>
&lt;/ul>
&lt;p>We are always looking for fantastic people to join us in collaborating on the following projects!&lt;/p>
&lt;hr>
&lt;h2 id="ongoing-research-projects">Ongoing Research Projects&lt;/h2>
&lt;h3 id="project-1-actionreasoning-robot-action-reasoning-in-3d-space-with-llm-for-robotic-brick-stacking">Project 1: &lt;strong>ActionReasoning: Robot Action Reasoning in 3D Space with LLM for Robotic Brick Stacking&lt;/strong>&lt;/h3>
&lt;img src="https://camcv.github.io/uploads/research_video_robotics/brick_stacking.gif" alt="brick demo" width="550">
&lt;p>Classical robotic systems typically rely on custom planners designed for constrained environments. While effective in restricted settings, these systems lack generalization capabilities, limiting the scalability of embodied AI and general‑purpose robots. To address this gap, we propose ActionReasoning, an LLM-driven framework that performs explicit action reasoning to produce physics-consistent, prior-guided decisions for robotic manipulation. The experiments demonstrate that the proposed multi-agent LLM framework enables stable brick placement without task-specific programming, highlighting its potential to generalize beyond narrowly defined tasks (paper submitted to a top robotics conference).&lt;/p>
&lt;h3 id="project-2-robotic-perception-physics-aware-3d-gaussian-modeling">Project 2: &lt;strong>Robotic Perception: Physics-Aware 3D Gaussian Modeling&lt;/strong>&lt;/h3>
&lt;p>Our goal is to develop a unified 3D Gaussian modeling framework that integrates geometric, semantic, and physical attributes, enabling robots to achieve dynamic and adaptive understanding of their environments, thereby acquiring human-like adaptability and generalization capabilities.&lt;/p>
&lt;h3 id="project-3-robotic-manipulation-generalizable-manipulation-of-different-types-of-objects">Project 3: &lt;strong>Robotic Manipulation: Generalizable Manipulation of Different Types of Objects&lt;/strong>&lt;/h3>
&lt;p>Our goal is to build a general reasoning framework for the manipulation of deformable objects, hinged objects, and rigid objects, progressing from universal representations of different types of objects, to general reasoning, and ultimately to general manipulation. This will enable robots to attain human-like perception and manipulation skills for different types of objects.&lt;/p>
&lt;hr>
&lt;h2 id="past-research-projects">Past Research Projects&lt;/h2>
&lt;h3 id="rl-gsbridge-3d-gaussian-splatting-based-real2sim2real-method-for-robotic-manipulation-learning">&lt;strong>RL-GSBridge: 3D Gaussian Splatting Based Real2Sim2Real Method for Robotic Manipulation Learning&lt;/strong>&lt;/h3>
&lt;img src="https://camcv.github.io/uploads/research_video_robotics/Sim2real.gif" alt="RL-GSBridge demo" width="350">
&lt;p>Sim-to-Real refers to the process of transferring policies learned in simulation to the real world, which is crucial for practical robotics applications. However, recent Sim2Real methods rely either on large amounts of augmented data or on large learning models, which is inefficient for specific tasks. To this end, we propose RL-GSBridge, a novel real-to-sim-to-real framework that incorporates 3D Gaussian Splatting into the conventional RL simulation pipeline, enabling zero-shot sim-to-real transfer for vision-based deep reinforcement learning.&lt;/p>
&lt;p>Through a series of sim-to-real experiments, including grasping and pick-and-place tasks, we demonstrate that RL-GSBridge maintains a satisfactory success rate in real-world task completion during sim-to-real transfer. Furthermore, a series of rendering metrics and visualization results indicate that our proposed mesh-based 3D GS reduces artifacts in unstructured objects, demonstrating more realistic rendering performance. The related work was published at the top robotics conference &lt;a href="https://ieeexplore.ieee.org/abstract/document/11128103" target="_blank" rel="noopener">ICRA&lt;/a>.&lt;/p>
&lt;h3 id="sni-slam-semantic-neural-implicit-slam">&lt;strong>SNI-SLAM: Semantic Neural Implicit SLAM&lt;/strong>&lt;/h3>
&lt;img src="https://camcv.github.io/uploads/research_video_robotics/SNI_SLAM.gif" alt="SLAM demo" width="550">
&lt;p>We propose SNI-SLAM, the first semantic SLAM system utilizing neural implicit representation that simultaneously performs accurate semantic mapping, high-quality surface reconstruction, and robust camera tracking. In this system, we introduce a hierarchical semantic representation to allow multi-level semantic comprehension for top-down structured semantic mapping of the scene. In addition, to fully utilize the correlation between multiple attributes of the environment, we integrate appearance, geometry, and semantic features through cross-attention for feature collaboration. Our SNI-SLAM method demonstrates superior performance over all recent NeRF-based SLAM methods in terms of mapping and tracking accuracy on multiple datasets, while also showing excellent capabilities in accurate semantic segmentation and real-time semantic mapping. The related work was published at the top computer vision conference &lt;a href="https://openaccess.thecvf.com/content/CVPR2024/papers/Zhu_SNI-SLAM_Semantic_Neural_Implicit_SLAM_CVPR_2024_paper.pdf" target="_blank" rel="noopener">CVPR&lt;/a>.&lt;/p>
&lt;h3 id="learning-of-long-horizon-sparse-reward-robotic-manipulator-tasks-with-base-controllers">&lt;strong>Learning of Long-Horizon Sparse-Reward Robotic Manipulator Tasks With Base Controllers&lt;/strong>&lt;/h3>
&lt;img src="https://camcv.github.io/uploads/research_video_robotics/20_arxiv_DDPGwB.gif" alt="RL robot arm demo" width="350">
&lt;p>Deep reinforcement learning (DRL) enables robots to perform some intelligent tasks end-to-end. However, there are still many challenges for long-horizon sparse-reward robotic manipulator tasks. We propose a method of learning long-horizon sparse-reward tasks by utilizing one or more existing traditional controllers, named base controllers. The experiments demonstrate that the learned policies steadily outperform the base controllers. Compared to previous work on learning from demonstrations, our method improves sample efficiency by orders of magnitude and improves performance. Overall, our method has the potential to leverage existing industrial robot manipulation systems to build more flexible and intelligent controllers. The related work was published in the top AI journal &lt;a href="https://ieeexplore.ieee.org/abstract/document/9882014" target="_blank" rel="noopener">IEEE T-NNLS&lt;/a>.&lt;/p></description></item><item><title>Contact</title><link>https://camcv.github.io/contact/</link><pubDate>Mon, 24 Oct 2022 00:00:00 +0000</pubDate><guid>https://camcv.github.io/contact/</guid><description/></item><item><title>People</title><link>https://camcv.github.io/people/</link><pubDate>Mon, 24 Oct 2022 00:00:00 +0000</pubDate><guid>https://camcv.github.io/people/</guid><description/></item><item><title>Vision</title><link>https://camcv.github.io/vision/</link><pubDate>Mon, 24 Oct 2022 00:00:00 +0000</pubDate><guid>https://camcv.github.io/vision/</guid><description>&lt;hr>
&lt;p>&lt;strong>The Cambridge Computer Vision Group:&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Serves as a unifying hub for researchers in Cambridge and beyond in computer vision and related fields, including computational photography, 3D vision, robotics perception, and visual AI&lt;/li>
&lt;li>Brings together academia and industry through regular meetings, seminars, and invited talks&lt;/li>
&lt;li>Provides a platform for sharing cutting-edge research and discussing emerging challenges in visual computing&lt;/li>
&lt;li>Highlights a broad range of computer vision applications, including construction and civil engineering, medical imaging and healthcare, autonomous driving and robotics, remote sensing, manufacturing, and creative industries&lt;/li>
&lt;li>Encourages collaboration across institutions, career stages, and application domains&lt;/li>
&lt;li>Aims to strengthen Cambridge’s position as a global centre of excellence in computer vision and accelerate real-world impact&lt;/li>
&lt;/ul>
&lt;hr></description></item><item><title>Jian Yang and Monica Hall Win the Best Paper Award at Wowchemy 2020</title><link>https://camcv.github.io/post/20-12-02-icml-best-paper/</link><pubDate>Wed, 02 Dec 2020 00:00:00 +0000</pubDate><guid>https://camcv.github.io/post/20-12-02-icml-best-paper/</guid><description>&lt;p>Congratulations to Jian Yang and Monica Hall for winning the Best Paper Award at the 2020 Conference on Wowchemy for their paper “Learning Wowchemy”.&lt;/p>
</description></item><item><title>Richard Hendricks Wins First Place in the Wowchemy Prize</title><link>https://camcv.github.io/post/20-12-01-wowchemy-prize/</link><pubDate>Tue, 01 Dec 2020 00:00:00 +0000</pubDate><guid>https://camcv.github.io/post/20-12-01-wowchemy-prize/</guid><description>&lt;p>Congratulations to Richard Hendricks for winning first place in the Wowchemy Prize.&lt;/p>
</description></item></channel></rss>