🌟 Align diffusion processes with detailed human preferences to improve machine learning models for richer, more accurate outputs.
A novel multi-view feedforward network that enables direct and robust object pose estimation from a query image.
Training patient-specific 2D/3D registration models in 5 minutes
Python tools for rendering, viewing, and generating metric 3D depth videos, for recovering and exporting camera pose and 3D geometry to popular formats, and for projecting depth video into side-by-side (SBS) stereo 3D video.
Official PyTorch implementation of BLADE: Single-view Body Mesh Estimation through Accurate Depth Estimation (CVPR 2025). BLADE tackles close-range human mesh recovery where perspective distortion is strongest, and solves for camera pose and focal length in addition to SMPL(-X) parameters.
A Deep Learning Approach to Camera Pose Estimation
CoHAtNet: An Integrated Convolution-Transformer Architecture for End-to-End Camera Localization
GIM: Learning Generalizable Image Matcher From Internet Videos (ICLR 2024 Spotlight)
(ICCV 2025) Kaleidoscopic Background Attack: Disrupting Pose Estimation with Multi-Fold Radial Symmetry Textures
Investigation of different CNN backbones for PoseNet on the 7-Scenes dataset, with a comprehensive comparison of predictions and runtime
[ICLR 2025] Official repo of "GS-CPR: Efficient Camera Pose Refinement via 3D Gaussian Splatting"
[ICRA'20] Robust 360-8PA: Redesigning The Normalized 8-point Algorithm for 360-FoV Images
A preprocessing pipeline for monocular videos containing human motion in static scenes
🏛️ Reconstructing 3D landmarks from 2D images using feature detection, triangulation, and mesh generation. Built as a Computer Vision course project with interactive 3D model visualization via FlutterCube.
A curated list of awesome visual localization research works.
[CVPR 2024] Intraoperative 2D/3D registration via differentiable X-ray rendering
Multiview-Structure-From-Motion is an open-source implementation of a complete Structure-from-Motion (SfM) pipeline designed to reconstruct 3D scenes from multiple 2D images. Leveraging advanced computer vision techniques, this project aims to provide a modular and extensible framework.
[ICRA 2024] Official repo of the paper "HR-APR: APR-agnostic Framework with Uncertainty Estimation and Hierarchical Refinement for Camera Relocalisation"
[AAAI2023] TopicFM: Robust, Efficient, and Interpretable Topic-Assisted Feature Matching
3D reconstruction using SfM