Stars
Real-time face swap and one-click video deepfake with only a single image
Open-source evaluation toolkit of large multi-modality models (LMMs), support 220+ LMMs, 80+ benchmarks
[ECCV2024] API code for T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy
Implementation of "PaLM-E: An Embodied Multimodal Language Model"
Code release for "Omni3D: A Large Benchmark and Model for 3D Object Detection in the Wild"
[Unmaintained, see README] An ecosystem of Rust libraries for working with large language models
A sample app for the Retrieval-Augmented Generation pattern running in Azure, using Azure AI Search for retrieval and Azure OpenAI large language models to power ChatGPT-style and Q&A experiences.
The repository provides code for running inference with the SegmentAnything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
Reverse engineered ChatGPT API
High-efficiency floating-point neural network inference operators for mobile, server, and Web
This script converts the ONNX/OpenVINO IR model to Tensorflow's saved_model, tflite, h5, tfjs, tftrt(TensorRT), CoreML, EdgeTPU, ONNX and pb. PyTorch (NCHW) -> ONNX (NCHW) -> OpenVINO (NCHW) -> ope…
Create 🔥 videos with Stable Diffusion by exploring the latent space and morphing between text prompts
Implementation of paper - YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors
Convert Machine Learning Code Between Frameworks
A latent text-to-image diffusion model
A Code Release for Mip-NeRF 360, Ref-NeRF, and RawNeRF
ICRA 2019 "FastDepth: Fast Monocular Depth Estimation on Embedded Systems"
Tooling for the Common Objects In 3D dataset.
🔥🔥🔥🔥 (Earlier YOLOv7 not official one) YOLO with Transformers and Instance Segmentation, with TensorRT acceleration! 🔥🔥🔥
YOLOv6: a single-stage object detection framework dedicated to industrial applications.
Hybrid Neural Fusion for Full-frame Video Stabilization
Label Studio is a multi-type data labeling and annotation tool with standardized output format
Code for robust monocular depth estimation described in "Ranftl et al., Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-shot Cross-dataset Transfer, TPAMI 2022"
Software and pre-trained models for automatic photo quality enhancement using Deep Convolutional Networks
Solidity, the Smart Contract Programming Language