-
Event-based Data Format Standard (EVT+)
Authors:
Jonah P. Sengupta,
Mohammad Imran Vakil,
Thanh M. Dang,
Ian Pardee,
Paul Coen,
Olivia Aul
Abstract:
Event-based Sensing (EBS) hardware is quickly proliferating while finding a foothold in many commercial, industrial, and defense applications. At present, there are a handful of technologically mature systems which produce data streams with diverse output formats. In the near future it is anticipated there will be vendors who offer new sensor hardware which could also yield unique data schemas that are not aligned to past efforts. Thus, due to the relatively nascent nature of the technology and its potential for widespread use in a variety of applications, it is an opportune time to define a standard for this class of sensors' output data. The intent of this document is to identify and provide a standard for the collected EBS streaming data. The main objective of the standard is to be sensor agnostic, incorporate some of the current sensor configurations and modalities, and account for the developing configurations and modalities. The intent is also to leave enough placeholders and space in the standard for future variations that may develop as EBS technology matures.
Submitted 19 November, 2025;
originally announced November 2025.
-
Text-Guided Multi-Instance Learning for Scoliosis Screening via Gait Video Analysis
Authors:
Haiqing Li,
Yuzhi Guo,
Feng Jiang,
Thao M. Dang,
Hehuan Ma,
Qifeng Zhou,
Jean Gao,
Junzhou Huang
Abstract:
Early-stage scoliosis is often difficult to detect, particularly in adolescents, where delayed diagnosis can lead to serious health issues. Traditional X-ray-based methods carry radiation risks and rely heavily on clinical expertise, limiting their use in large-scale screenings. To overcome these challenges, we propose a Text-Guided Multi-Instance Learning Network (TG-MILNet) for non-invasive scoliosis detection using gait videos. To handle temporal misalignment in gait sequences, we employ Dynamic Time Warping (DTW) clustering to segment videos into key gait phases. To focus on the most relevant diagnostic features, we introduce an Inter-Bag Temporal Attention (IBTA) mechanism that highlights critical gait phases. Recognizing the difficulty in identifying borderline cases, we design a Boundary-Aware Model (BAM) to improve sensitivity to subtle spinal deviations. Additionally, we incorporate textual guidance from domain experts and large language models (LLMs) to enhance feature representation and improve model interpretability. Experiments on the large-scale Scoliosis1K gait dataset show that TG-MILNet achieves state-of-the-art performance, particularly excelling in handling class imbalance and accurately detecting challenging borderline cases. The code is available at https://github.com/lhqqq/TG-MILNet
Submitted 1 July, 2025;
originally announced July 2025.
-
Human-Centered Development of Guide Dog Robots: Quiet and Stable Locomotion Control
Authors:
Shangqun Yu,
Hochul Hwang,
Trung M. Dang,
Joydeep Biswas,
Nicholas A. Giudice,
Sunghoon Ivan Lee,
Donghyun Kim
Abstract:
A quadruped robot is a promising system that can offer assistance comparable to that of dog guides due to its similar form factor. However, various challenges remain in making these robots a reliable option for blind and low-vision (BLV) individuals. Among these challenges, noise and jerky motion during walking are critical drawbacks of existing quadruped robots. While these issues have largely been overlooked in guide dog robot research, our interviews with guide dog handlers and trainers revealed that acoustic and physical disturbances can be particularly disruptive for BLV individuals, who rely heavily on environmental sounds for navigation. To address these issues, we developed a novel walking controller for slow stepping and smooth foot swing/contact while maintaining human walking speed, as well as robust and stable balance control. The controller integrates with a perception system to facilitate locomotion over non-flat terrains, such as stairs. Our controller was extensively tested on the Unitree Go1 robot and, when compared with other control methods, demonstrated a significant noise reduction, to half that of the default locomotion controller. In this study, we adopt a mixed-methods approach to evaluate its usability with BLV individuals. In our indoor walking experiments, participants compared our controller to the robot's default controller. Results demonstrated superior acceptance of our controller, highlighting its potential to improve the user experience of guide dog robots. Video demonstration (best viewed with audio) available at: https://youtu.be/8-pz_8Hqe6s.
Submitted 27 May, 2025; v1 submitted 16 May, 2025;
originally announced May 2025.
-
HOMIE: Histopathology Omni-modal Embedding for Pathology Composed Retrieval
Authors:
Qifeng Zhou,
Wenliang Zhong,
Thao M. Dang,
Hehuan Ma,
Saiyang Na,
Yuzhi Guo,
Junzhou Huang
Abstract:
The integration of Artificial Intelligence (AI) into pathology faces a fundamental challenge: black-box predictive models lack transparency, while generative approaches risk clinical hallucination. A case-based retrieval paradigm offers a more interpretable alternative for clinical adoption. However, current SOTA models are constrained by dual-encoder architectures that cannot process the composed modality of real-world clinical queries. We formally define the task of Pathology Composed Retrieval (PCR). Progress in this newly defined task is blocked by two critical challenges: (1) Multimodal Large Language Models (MLLMs) offer the necessary deep-fusion architecture but suffer from a critical Task Mismatch and Domain Mismatch; (2) no benchmark exists to evaluate such compositional queries. To solve these challenges, we propose HOMIE, a systematic framework that transforms a general MLLM into a specialized retrieval expert. HOMIE resolves the dual mismatch via a two-stage process: a retrieval-adaptation stage to solve the task mismatch, and a pathology-specific tuning stage, featuring a progressive knowledge curriculum, pathology-specific stain and native resolution processing, to solve the domain mismatch. We also introduce the PCR Benchmark, a benchmark designed to evaluate composed retrieval in pathology. Experiments show that HOMIE, trained only on public data, matches SOTA performance on traditional retrieval tasks and outperforms all baselines on the newly defined PCR task.
Submitted 21 December, 2025; v1 submitted 10 February, 2025;
originally announced February 2025.