<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>CAMCV</title><link>https://camcv.github.io/</link><atom:link href="https://camcv.github.io/index.xml" rel="self" type="application/rss+xml"/><description>CAMCV</description><generator>Hugo Blox Builder (https://hugoblox.com)</generator><language>en-us</language><lastBuildDate>Mon, 24 Oct 2022 00:00:00 +0000</lastBuildDate><image><url>https://camcv.github.io/media/icon_hu_642791a3c5816746.png</url><title>CAMCV</title><link>https://camcv.github.io/</link></image><item><title>Camera Calibration in Sports (Magera, ULiege and EVS)</title><link>https://camcv.github.io/event/26magera/</link><pubDate>Thu, 29 Jan 2026 15:00:00 +0000</pubDate><guid>https://camcv.github.io/event/26magera/</guid><description/></item><item><title>3D Computer Vision (Busam, TUM) and World Models (Parisot, Microsoft)</title><link>https://camcv.github.io/event/25busam_parisot/</link><pubDate>Thu, 11 Sep 2025 11:00:00 +0000</pubDate><guid>https://camcv.github.io/event/25busam_parisot/</guid><description/></item><item><title>Computer Vision</title><link>https://camcv.github.io/projects/cv/</link><pubDate>Wed, 01 Jan 2025 00:00:00 +0000</pubDate><guid>https://camcv.github.io/projects/cv/</guid><description>&lt;p>We are the Computer Vision Working Group of CV4DT. Click for more!&lt;/p>
&lt;p>We are the Computer Vision Working Group of CV4DT: (&lt;a href="https://camcv.github.io/author/dr-olaf-wysocki/">Olaf Wysocki&lt;/a>, &lt;a href="https://camcv.github.io/author/haibing-wu/">Haibing Wu&lt;/a>, &lt;a href="https://camcv.github.io/author/qilin-zhang/">Qilin Zhang&lt;/a>, &lt;a href="https://camcv.github.io/author/daniel-lehmberg/">Daniel Lehmberg&lt;/a>, &lt;a href="https://camcv.github.io/author/wanru-yang/">Wanru Yang&lt;/a>).&lt;/p>
&lt;h1 id="research-projects">Research Projects&lt;/h1>
&lt;p>Our projects are primarily &lt;strong>research-oriented&lt;/strong>, aiming for publication in top-tier computer vision venues such as &lt;strong>CVPR&lt;/strong>, &lt;strong>ECCV&lt;/strong>, &lt;strong>NeurIPS&lt;/strong>, and similar.&lt;br>
Below is an overview of our ongoing and upcoming research directions.&lt;/p>
&lt;hr>
&lt;h3 id="-structured-3d-object-reconstruction">🏠 Structured 3D Object Reconstruction&lt;/h3>
&lt;p>We aim to reconstruct &lt;strong>structured 3D models&lt;/strong> aligned with interpretable geometric and semantic representations.&lt;br>
This direction builds upon our prior work:&lt;/p>
&lt;ul>
&lt;li>&lt;a href="https://openaccess.thecvf.com/content/CVPR2025W/USM3D/papers/Tang_Texture2LoD3_Enabling_LoD3_Building_Reconstruction_With_Panoramic_Images_CVPRW_2025_paper.pdf" target="_blank" rel="noopener">&lt;em>Texture2LoD3: Enabling LoD3 Building Reconstruction With Panoramic Images&lt;/em> (CVPR25)&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://openaccess.thecvf.com/content/CVPR2023W/PCV/papers/Wysocki_Scan2LoD3_Reconstructing_Semantic_3D_Building_Models_at_LoD3_Using_Ray_CVPRW_2023_paper.pdf" target="_blank" rel="noopener">&lt;em>Scan2LoD3: Reconstructing semantic 3D building models at LoD3 using ray casting and Bayesian networks&lt;/em> (CVPR23)&lt;/a>&lt;/li>
&lt;/ul>
&lt;p>Several project proposals are currently under review to expand this line of research.&lt;/p>
&lt;hr>
&lt;h3 id="-revisiting-geometric-features-for-3d-scene-understanding">🧩 Revisiting Geometric Features for 3D Scene Understanding&lt;/h3>
&lt;p>We revisit &lt;strong>geometric descriptors&lt;/strong> for large-scale &lt;strong>3D semantic segmentation&lt;/strong>, &lt;strong>self-supervised learning (SSL)&lt;/strong>, &lt;strong>3D instance segmentation&lt;/strong>, &lt;strong>3D object pose estimation&lt;/strong>, and &lt;strong>3D shape completion&lt;/strong>, studying how handcrafted and learned geometric features can be combined to achieve better generalization across domains. Preliminary findings are available in &lt;a href="https://arxiv.org/pdf/2402.06506" target="_blank" rel="noopener">&lt;em>arXiv:2402.06506&lt;/em>&lt;/a>.&lt;/p>
&lt;p>Further papers expanding this line of research are in preparation.&lt;/p>
&lt;hr>
&lt;h3 id="-sim2real-3d-domain-gap">🏁 Sim2Real 3D Domain Gap&lt;/h3>
&lt;p>We still observe large domain gaps between simulated and real-world data, hampering the application of simulated data to real-world challenges and many downstream tasks. We believe in the power of &lt;em>diffusion models&lt;/em> to bridge this gap. Preliminary results, building a framework for running simulations within a unique real-world city twin, are available in &lt;a href="https://arxiv.org/abs/2505.17959" target="_blank" rel="noopener">&lt;em>arXiv:2505.17959&lt;/em>&lt;/a>.&lt;/p>
&lt;p>One paper is under review, while another draft is in preparation.&lt;/p>
&lt;hr>
&lt;h3 id="-6dof-estimation-using-structured-3d-models">🧭 6DoF Estimation Using Structured 3D Models&lt;/h3>
&lt;p>We explore &lt;strong>structured 3D model representations&lt;/strong> for &lt;strong>6-degree-of-freedom (6DoF) pose estimation&lt;/strong>, targeting improved robustness and interpretability compared to implicit or point-based methods.&lt;br>
This direction builds on the related work &lt;a href="https://proceedings.neurips.cc/paper_files/paper/2024/file/d78ece6613953f46501b958b7bb4582f-Paper-Conference.pdf" target="_blank" rel="noopener">&lt;em>LoD-Loc: Aerial Visual Localization using LoD 3D Map with Neural Wireframe Alignment&lt;/em> (NeurIPS24)&lt;/a>.&lt;/p>
&lt;p>A new iteration of this work is in preparation for upcoming major conference deadlines.&lt;/p>
&lt;hr>
&lt;h3 id="-geometry-prior-guided-3d-gaussian-splatting">🌌 Geometry-Prior-Guided 3D Gaussian Splatting&lt;/h3>
&lt;p>This project investigates the integration of &lt;strong>geometry-aware priors&lt;/strong> into &lt;strong>3D Gaussian Splatting&lt;/strong> to enhance reconstruction quality and geometric fidelity.&lt;br>
Preliminary findings are available in &lt;a href="https://arxiv.org/pdf/2508.07355" target="_blank" rel="noopener">&lt;em>arXiv:2508.07355&lt;/em>&lt;/a>, and ongoing work extends the framework beyond building-specific scenarios toward &lt;strong>general-purpose 3D environments&lt;/strong>.&lt;/p>
&lt;hr>
&lt;h3 id="-quantifying-uncertainty-of-x">📈 Quantifying Uncertainty of X&lt;/h3>
&lt;p>In this research direction, we explore quantification of uncertainty across various modalities and downstream tasks, from data acquisition through segmentation to inference. Our rationale is often (but not exclusively) grounded in Bayesian modeling of uncertainty. We have previously published on reconstruction uncertainty, e.g.,
&lt;a href="https://openaccess.thecvf.com/content/CVPR2023W/PCV/papers/Wysocki_Scan2LoD3_Reconstructing_Semantic_3D_Building_Models_at_LoD3_Using_Ray_CVPRW_2023_paper.pdf" target="_blank" rel="noopener">&lt;em>Scan2LoD3: Reconstructing semantic 3D building models at LoD3 using ray casting and Bayesian networks&lt;/em> (CVPR23)&lt;/a>.
Currently, we are involved in the funded project &lt;a href="https://www.asg.ed.tum.de/en/gds/forschung-research/projects/nerf2bim/" target="_blank" rel="noopener">NeRF2BIM&lt;/a>, together with Profs. Petzold, Holst, and Niessner, where we analyze laser scanning uncertainty and its influence on final 3D object reconstruction.&lt;/p>
&lt;hr>
&lt;h3 id="-dataset-development">🗂️ Dataset Development&lt;/h3>
&lt;p>We also curate and release datasets supporting our main research directions, including:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Facade Segmentation Dataset&lt;/strong> – for large-scale semantic façade parsing, building upon &lt;a href="https://openaccess.thecvf.com/content/WACV2025/html/Wysocki_ZAHA_Introducing_the_Level_of_Facade_Generalization_and_the_Large-Scale_WACV_2025_paper.html" target="_blank" rel="noopener">&lt;em>ZAHA&lt;/em> (WACV25)&lt;/a>, the world's largest facade dataset&lt;/li>
&lt;li>&lt;strong>Point Cloud Completion Dataset&lt;/strong> – for partial-to-complete reconstruction learning&lt;/li>
&lt;li>&lt;strong>3D Object Reconstruction Dataset&lt;/strong> – for structured geometry prediction and analysis&lt;/li>
&lt;/ul>
&lt;p>These datasets promote &lt;strong>reproducible, data-rich 3D research&lt;/strong> across geometry, perception, and robotics.&lt;/p>
&lt;p>We are always looking for fantastic people to join us in collaborating on these projects!&lt;/p></description></item><item><title>Robotics</title><link>https://camcv.github.io/projects/robotics/</link><pubDate>Wed, 01 Jan 2025 00:00:00 +0000</pubDate><guid>https://camcv.github.io/projects/robotics/</guid><description>&lt;p>We are the Robotics Working Group of CV4DT. Click for more!&lt;/p>
&lt;h1 id="robotics">Robotics&lt;/h1>
&lt;p>We are a computer vision &amp;amp; robotics working group (&lt;a href="https://camcv.github.io/author/dr-guangming-wang/">Guangming Wang&lt;/a>, &lt;a href="https://camcv.github.io/author/dr-yixiong-jing/">Yixiong Jing&lt;/a>, &lt;a href="https://camcv.github.io/author/qizhen-ying/">Qizhen Ying&lt;/a>), focusing on:&lt;/p>
&lt;ul>
&lt;li>Robotic manipulation and control&lt;/li>
&lt;li>3D vision for robotic perception&lt;/li>
&lt;li>Generative models for planning and world understanding&lt;/li>
&lt;/ul>
&lt;p>We are always looking for fantastic people to join us in collaborating on the following projects!&lt;/p>
&lt;hr>
&lt;h2 id="ongoing-research-projects">Ongoing Research Projects&lt;/h2>
&lt;h3 id="project-1-actionreasoning-robot-action-reasoning-in-3d-space-with-llm-for-robotic-brick-stacking">Project 1: &lt;strong>ActionReasoning: Robot Action Reasoning in 3D Space with LLM for Robotic Brick Stacking&lt;/strong>&lt;/h3>
&lt;img src="https://camcv.github.io/uploads/research_video_robotics/brick_stacking.gif" alt="brick demo" width="550">
&lt;p>Classical robotic systems typically rely on custom planners designed for constrained environments. While effective in restricted settings, these systems lack generalization capabilities, limiting the scalability of embodied AI and general‑purpose robots. To address this gap, we propose ActionReasoning, an LLM-driven framework that performs explicit action reasoning to produce physics-consistent, prior-guided decisions for robotic manipulation. The experiments demonstrate that the proposed multi-agent LLM framework enables stable brick placement without task-specific programming, highlighting its potential to generalize beyond narrowly defined tasks (paper submitted to a top robotics conference).&lt;/p>
&lt;h3 id="project-2-robotic-perception-physics-aware-3d-gaussian-modeling">Project 2: &lt;strong>Robotic Perception: Physics-Aware 3D Gaussian Modeling&lt;/strong>&lt;/h3>
&lt;p>Our goal is to develop a unified 3D Gaussian modeling framework that integrates geometric, semantic, and physical attributes, enabling robots to achieve dynamic and adaptive understanding of their environments, thereby acquiring human-like adaptability and generalization capabilities.&lt;/p>
&lt;h3 id="project-3-robotic-manipulation-generalizable-manipulation-of-different-types-of-objects">Project 3: &lt;strong>Robotic Manipulation: Generalizable Manipulation of Different Types of Objects&lt;/strong>&lt;/h3>
&lt;p>Our goal is to build a general reasoning framework for the manipulation of deformable objects, hinged objects, and rigid objects, progressing from universal representations of different types of objects, to general reasoning, and ultimately to general manipulation. This will enable robots to attain human-like perception and manipulation skills for different types of objects.&lt;/p>
&lt;hr>
&lt;h2 id="past-research-projects">Past Research Projects&lt;/h2>
&lt;h3 id="rl-gsbridge-3d-gaussian-splatting-based-real2sim2real-method-for-robotic-manipulation-learning">&lt;strong>RL-GSBridge: 3D Gaussian Splatting Based Real2Sim2Real Method for Robotic Manipulation Learning&lt;/strong>&lt;/h3>
&lt;img src="https://camcv.github.io/uploads/research_video_robotics/Sim2real.gif" alt="RL-GSBridge demo" width="350">
&lt;p>Sim-to-Real refers to the process of transferring policies learned in simulation to the real world, which is crucial for practical robotics applications. However, recent Sim2Real methods rely either on large amounts of augmented data or on large learning models, which is inefficient for specific tasks. To this end, we propose RL-GSBridge, a novel real-to-sim-to-real framework that incorporates 3D Gaussian Splatting into the conventional RL simulation pipeline, enabling zero-shot sim-to-real transfer for vision-based deep reinforcement learning.&lt;/p>
&lt;p>Through a series of sim-to-real experiments, including grasping and pick-and-place tasks, we demonstrate that RL-GSBridge maintains a satisfactory success rate in real-world task completion during sim-to-real transfer. Furthermore, a series of rendering metrics and visualization results indicate that our proposed mesh-based 3D GS reduces artifacts in unstructured objects, demonstrating more realistic rendering performance. The related work was published at the top robotics conference &lt;a href="https://ieeexplore.ieee.org/abstract/document/11128103" target="_blank" rel="noopener">ICRA&lt;/a>.&lt;/p>
&lt;h3 id="sni-slam-semantic-neural-implicit-slam">&lt;strong>SNI-SLAM: Semantic Neural Implicit SLAM&lt;/strong>&lt;/h3>
&lt;img src="https://camcv.github.io/uploads/research_video_robotics/SNI_SLAM.gif" alt="SLAM demo" width="550">
&lt;p>We propose SNI-SLAM, the first semantic SLAM system utilizing neural implicit representation that simultaneously performs accurate semantic mapping, high-quality surface reconstruction, and robust camera tracking. In this system, we introduce a hierarchical semantic representation to allow multi-level semantic comprehension for top-down structured semantic mapping of the scene. In addition, to fully utilize the correlation between multiple attributes of the environment, we integrate appearance, geometry, and semantic features through cross-attention for feature collaboration. Our SNI-SLAM method demonstrates superior performance over all recent NeRF-based SLAM methods in terms of mapping and tracking accuracy on multiple datasets, while also showing excellent capabilities in accurate semantic segmentation and real-time semantic mapping. The related work was published at the top computer vision conference &lt;a href="https://openaccess.thecvf.com/content/CVPR2024/papers/Zhu_SNI-SLAM_Semantic_Neural_Implicit_SLAM_CVPR_2024_paper.pdf" target="_blank" rel="noopener">CVPR&lt;/a>.&lt;/p>
&lt;h3 id="learning-of-long-horizon-sparse-reward-robotic-manipulator-tasks-with-base-controllers">&lt;strong>Learning of Long-Horizon Sparse-Reward Robotic Manipulator Tasks With Base Controllers&lt;/strong>&lt;/h3>
&lt;img src="https://camcv.github.io/uploads/research_video_robotics/20_arxiv_DDPGwB.gif" alt="RL robot arm demo" width="350">
&lt;p>Deep reinforcement learning (DRL) enables robots to perform some intelligent tasks end-to-end. However, there are still many challenges for long-horizon sparse-reward robotic manipulator tasks. We propose a method of learning long-horizon sparse-reward tasks by utilizing one or more existing traditional controllers, named base controllers. The experiments demonstrate that the learned policies steadily outperform the base controllers. Compared to previous work on learning from demonstrations, our method improves sample efficiency by orders of magnitude and improves performance. Overall, our method has the potential to leverage existing industrial robot manipulation systems to build more flexible and intelligent controllers. The related work was published in the top AI journal &lt;a href="https://ieeexplore.ieee.org/abstract/document/9882014" target="_blank" rel="noopener">IEEE T-NNLS&lt;/a>.&lt;/p></description></item><item><title>Contact</title><link>https://camcv.github.io/contact/</link><pubDate>Mon, 24 Oct 2022 00:00:00 +0000</pubDate><guid>https://camcv.github.io/contact/</guid><description/></item><item><title>People</title><link>https://camcv.github.io/people/</link><pubDate>Mon, 24 Oct 2022 00:00:00 +0000</pubDate><guid>https://camcv.github.io/people/</guid><description/></item><item><title>Vision</title><link>https://camcv.github.io/vision/</link><pubDate>Mon, 24 Oct 2022 00:00:00 +0000</pubDate><guid>https://camcv.github.io/vision/</guid><description>&lt;hr>
&lt;p>&lt;strong>The Cambridge Computer Vision Group:&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Serves as a unifying hub for researchers in Cambridge and beyond in computer vision and related fields, including computational photography, 3D vision, robotics perception, and visual AI&lt;/li>
&lt;li>Brings together academia and industry through regular meetings, seminars, and invited talks&lt;/li>
&lt;li>Provides a platform for sharing cutting-edge research and discussing emerging challenges in visual computing&lt;/li>
&lt;li>Highlights a broad range of computer vision applications, including construction and civil engineering, medical imaging and healthcare, autonomous driving and robotics, remote sensing, manufacturing, and creative industries&lt;/li>
&lt;li>Encourages collaboration across institutions, career stages, and application domains&lt;/li>
&lt;li>Aims to strengthen Cambridge’s position as a global centre of excellence in computer vision and accelerate real-world impact&lt;/li>
&lt;/ul>
&lt;hr></description></item><item><title>Jian Yang and Monica Hall Win the Best Paper Award at Wowchemy 2020</title><link>https://camcv.github.io/post/20-12-02-icml-best-paper/</link><pubDate>Wed, 02 Dec 2020 00:00:00 +0000</pubDate><guid>https://camcv.github.io/post/20-12-02-icml-best-paper/</guid><description>&lt;p>Congratulations to Jian Yang and Monica Hall for winning the Best Paper Award at the 2020 Conference on Wowchemy for their paper “Learning Wowchemy”.&lt;/p>
</description></item><item><title>Richard Hendricks Wins First Place in the Wowchemy Prize</title><link>https://camcv.github.io/post/20-12-01-wowchemy-prize/</link><pubDate>Tue, 01 Dec 2020 00:00:00 +0000</pubDate><guid>https://camcv.github.io/post/20-12-01-wowchemy-prize/</guid><description>&lt;p>Congratulations to Richard Hendricks for winning first place in the Wowchemy Prize.&lt;/p>
</description></item></channel></rss>