Stars
This is the official repo for the paper "Agent-Environment Alignment via Automated Interface Generation".
Repo for the paper "MUSEG: Reinforcing Video Temporal Understanding via Timestamp-Aware Multi-Segment Grounding".
StreamingBench: Assessing the Gap for MLLMs to Achieve Streaming Video Understanding
This is the official repo for the paper "Visual Abstract Thinking Empowers Multimodal Reasoning".
Official repository for "CoSpace: Benchmarking Continuous Space Perception Ability for Vision-Language Models" [CVPR 2025]
Official repo for EscapeCraft (a 3D environment for room escape) and the MM-Escape benchmark. Accepted at ICCV 2025.
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
✨✨Latest Advances on Multimodal Large Language Models
This is the repo for our work “Failures Pave the Way: Enhancing Large Language Models through Tuning-free Rule Accumulation” (EMNLP 2023).
Official code for our paper "Model Composition for Multimodal Large Language Models" (ACL 2024)
Repo for the paper "CODIS: Benchmarking Context-Dependent Visual Comprehension for Multimodal Large Language Models".
Self-Knowledge Guided Retrieval Augmentation for Large Language Models (EMNLP Findings 2023)
🦦 Otter, a multi-modal model based on OpenFlamingo (open-sourced version of DeepMind's Flamingo), trained on MIMIC-IT and showcasing improved instruction-following and in-context learning ability.
Code for the arXiv 2023 paper "Improving Language Model Negotiation with Self-Play and In-Context Learning from AI Feedback".
Chinese LLaMA & Alpaca large language models, with local CPU/GPU training and deployment (Chinese LLaMA & Alpaca LLMs)
A large number of free HTTP proxies, updated every 10 minutes to keep HTTP/S proxies fresh at all times.