🔥🔥🔥 A curated list of papers on LLM-based multimodal generation (image, video, 3D, and audio).
Updated Apr 4, 2025 - HTML
8000
[DEPRECATED prototype] Multimodal Access and Interactive Data Representation
This repository stores the code pipeline for processing, analyzing, and modeling the data associated with the manuscript 'Self-reported expressibility predicts communicative success: Open dataset, validation, and simulation'.
Comprehensive multimodal system for analyzing documents, with support for extracting and processing text, tables, and images.
AI-driven physiotherapy solution designed to assist stroke patients in their rehabilitation journey.