Showing 1–7 of 7 results for author: Man, F

Searching in archive cs.
  1. arXiv:2512.21887  [pdf, ps, other]

    cs.RO cs.AI

    Aerial World Model for Long-horizon Visual Generation and Navigation in 3D Space

    Authors: Weichen Zhang, Peizhi Tang, Xin Zeng, Fanhang Man, Shiquan Yu, Zichao Dai, Baining Zhao, Hongjin Chen, Yu Shang, Wei Wu, Chen Gao, Xinlei Chen, Xin Wang, Yong Li, Wenwu Zhu

    Abstract: Unmanned aerial vehicles (UAVs) have emerged as powerful embodied agents. One of the core abilities is autonomous navigation in large-scale three-dimensional environments. Existing navigation policies, however, are typically optimized for low-level objectives such as obstacle avoidance and trajectory smoothness, lacking the ability to incorporate high-level semantics into planning. To bridge this…

    Submitted 2 January, 2026; v1 submitted 26 December, 2025; originally announced December 2025.

  2. arXiv:2512.16325  [pdf, ps, other]

    cs.CV

    QUIDS: Quality-informed Incentive-driven Multi-agent Dispatching System for Mobile Crowdsensing

    Authors: Nan Zhou, Zuxin Li, Fanhang Man, Xuecheng Chen, Susu Xu, Fan Dang, Chaopeng Hong, Yunhao Liu, Xiao-Ping Zhang, Xinlei Chen

    Abstract: This paper addresses the challenge of achieving optimal Quality of Information (QoI) in non-dedicated vehicular mobile crowdsensing (NVMCS) systems. The key obstacles are the interrelated issues of sensing coverage, sensing reliability, and the dynamic participation of vehicles. To tackle these, we propose QUIDS, a QUality-informed Incentive-driven multi-agent Dispatching System, which ensures hig…

    Submitted 18 December, 2025; originally announced December 2025.

  3. arXiv:2507.08885  [pdf, ps, other]

    cs.RO cs.AI

    AirScape: An Aerial Generative World Model with Motion Controllability

    Authors: Baining Zhao, Rongze Tang, Mingyuan Jia, Ziyou Wang, Fanghang Man, Xin Zhang, Yu Shang, Weichen Zhang, Wei Wu, Chen Gao, Xinlei Chen, Yong Li

    Abstract: How to enable agents to predict the outcomes of their own motion intentions in three-dimensional space has been a fundamental problem in embodied intelligence. To explore general spatial imagination capability, we present AirScape, the first world model designed for six-degree-of-freedom aerial agents. AirScape predicts future observation sequences based on current visual inputs and motion intenti…

    Submitted 10 October, 2025; v1 submitted 10 July, 2025; originally announced July 2025.

  4. arXiv:2505.24342  [pdf, ps, other]

    cs.CV cs.CL

    VAEER: Visual Attention-Inspired Emotion Elicitation Reasoning

    Authors: Fanhang Man, Xiaoyue Chen, Huandong Wang, Baining Zhao, Han Li, Xinlei Chen

    Abstract: Images shared online strongly influence emotions and public well-being. Understanding the emotions an image elicits is therefore vital for fostering healthier and more sustainable digital communities, especially during public crises. We study Visual Emotion Elicitation (VEE), predicting the set of emotions that an image evokes in viewers. We introduce VAEER, an interpretable multi-label VEE framew…

    Submitted 18 December, 2025; v1 submitted 30 May, 2025; originally announced May 2025.

    Comments: Currently under review as a conference paper

  5. Context-Aware Sentiment Forecasting via LLM-based Multi-Perspective Role-Playing Agents

    Authors: Fanhang Man, Huandong Wang, Jianjie Fang, Zhaoyi Deng, Baining Zhao, Xinlei Chen, Yong Li

    Abstract: User sentiment on social media reveals the underlying social trends, crises, and needs. Researchers have analyzed users' past messages to trace the evolution of sentiments and reconstruct sentiment dynamics. However, predicting the imminent sentiment of an ongoing event is rarely studied. In this paper, we address the problem of sentiment forecasting on social media to predict the user's…

    Submitted 30 May, 2025; originally announced May 2025.

  6. arXiv:2504.12680  [pdf, other]

    cs.AI cs.CV

    Embodied-R: Collaborative Framework for Activating Embodied Spatial Reasoning in Foundation Models via Reinforcement Learning

    Authors: Baining Zhao, Ziyou Wang, Jianjie Fang, Chen Gao, Fanhang Man, Jinqiang Cui, Xin Wang, Xinlei Chen, Yong Li, Wenwu Zhu

    Abstract: Humans can perceive and reason about spatial relationships from sequential visual observations, such as egocentric video streams. However, how pretrained models acquire such abilities, especially high-level reasoning, remains unclear. This paper introduces Embodied-R, a collaborative framework combining large-scale Vision-Language Models (VLMs) for perception and small-scale Language Models (LMs)…

    Submitted 17 April, 2025; originally announced April 2025.

    Comments: 12 pages, 5 figures

  7. arXiv:2410.09604  [pdf, other]

    cs.AI cs.RO

    EmbodiedCity: A Benchmark Platform for Embodied Agent in Real-world City Environment

    Authors: Chen Gao, Baining Zhao, Weichen Zhang, Jinzhu Mao, Jun Zhang, Zhiheng Zheng, Fanhang Man, Jianjie Fang, Zile Zhou, Jinqiang Cui, Xinlei Chen, Yong Li

    Abstract: Embodied artificial intelligence emphasizes the role of an agent's body in generating human-like behaviors. The recent efforts on EmbodiedAI pay a lot of attention to building up machine learning models to possess perceiving, planning, and acting abilities, thereby enabling real-time interaction with the world. However, most works focus on bounded indoor environments, such as navigation in a room…

    Submitted 12 October, 2024; originally announced October 2024.

    Comments: All of the software, Python library, codes, datasets, tutorials, and real-time online service are available on this website: https://embodied-city.fiblab.net