Showing 1–50 of 122 results for author: Liang, F

  1. arXiv:2512.16303  [pdf, ps, other]

    cs.CV cs.AI

    PixelArena: A benchmark for Pixel-Precision Visual Intelligence

    Authors: Feng Liang, Sizhe Cheng, Chenqi Yi

    Abstract: Multi-modal large language models that have image output are emerging. Many image generation benchmarks focus on aesthetics instead of fine-grained generation capabilities. In PixelArena, we propose using semantic segmentation tasks to objectively examine their fine-grained generative intelligence with pixel precision. We find the latest Gemini 3 Pro Image has emergent image generation capabilitie…

    Submitted 18 December, 2025; originally announced December 2025.

    Comments: 7 pages, 11 figures, project page: https://pixelarena.reify.ing/project

  2. arXiv:2512.14181  [pdf, ps, other]

    quant-ph cs.AI cs.HC

    Towards Explainable Quantum AI: Informing the Encoder Selection of Quantum Neural Networks via Visualization

    Authors: Shaolun Ruan, Feng Liang, Rohan Ramakrishna, Chao Ren, Rudai Yan, Qiang Guan, Jiannan Li, Yong Wang

    Abstract: Quantum Neural Networks (QNNs) represent a promising fusion of quantum computing and neural network architectures, offering speed-ups and efficient processing of high-dimensional, entangled data. A crucial component of QNNs is the encoder, which maps classical input data into quantum states. However, choosing suitable encoders remains a significant challenge, largely due to the lack of systematic…

    Submitted 16 December, 2025; originally announced December 2025.

    Comments: 9 pages, 6 figures, accepted by TVCG 2026, not published yet

  3. arXiv:2512.07218  [pdf, ps, other]

    cs.CL cs.AI cs.LG

    NeSTR: A Neuro-Symbolic Abductive Framework for Temporal Reasoning in Large Language Models

    Authors: Feng Liang, Weixin Zeng, Runhao Zhao, Xiang Zhao

    Abstract: Large Language Models (LLMs) have demonstrated remarkable performance across a wide range of natural language processing tasks. However, temporal reasoning, particularly under complex temporal constraints, remains a major challenge. To this end, existing approaches have explored symbolic methods, which encode temporal structure explicitly, and reflective mechanisms, which revise reasoning errors t…

    Submitted 8 December, 2025; originally announced December 2025.

    Comments: Accepted by AAAI 2026

  4. arXiv:2510.27240  [pdf, ps, other]

    cs.LG

    FedSM: Robust Semantics-Guided Feature Mixup for Bias Reduction in Federated Learning with Long-Tail Data

    Authors: Jingrui Zhang, Yimeng Xu, Shujie Li, Feng Liang, Haihan Duan, Yanjie Dong, Victor C. M. Leung, Xiping Hu

    Abstract: Federated Learning (FL) enables collaborative model training across decentralized clients without sharing private data. However, FL suffers from biased global models due to non-IID and long-tail data distributions. We propose FedSM, a novel client-centric framework that mitigates this bias through semantics-guided feature mixup and lightweight classifier retraining. FedSM uses a pretraine…

    Submitted 31 October, 2025; originally announced October 2025.

  5. arXiv:2510.04980  [pdf, ps, other]

    cs.AI cs.CL

    LLM-Hanabi: Evaluating Multi-Agent Gameplays with Theory-of-Mind and Rationale Inference in Imperfect Information Collaboration Game

    Authors: Fangzhou Liang, Tianshi Zheng, Chunkit Chan, Yauwai Yim, Yangqiu Song

    Abstract: Effective multi-agent collaboration requires agents to infer the rationale behind others' actions, a capability rooted in Theory-of-Mind (ToM). While recent Large Language Models (LLMs) excel at logical inference, their ability to infer rationale in dynamic, collaborative settings remains under-explored. This study introduces LLM-Hanabi, a novel benchmark that uses the cooperative game Hanabi to e…

    Submitted 6 October, 2025; originally announced October 2025.

    Comments: EMNLP 2025 Wordplay

  6. arXiv:2510.02885  [pdf, ps, other]

    cs.RO

    Point Cloud-Based Control Barrier Functions for Model Predictive Control in Safety-Critical Navigation of Autonomous Mobile Robots

    Authors: Faduo Liang, Yunfeng Yang, Shi-Lu Dai

    Abstract: In this work, we propose a novel motion planning algorithm to facilitate safety-critical navigation for autonomous mobile robots. The proposed algorithm integrates a real-time dynamic obstacle tracking and mapping system that categorizes point clouds into dynamic and static components. For dynamic point clouds, the Kalman filter is employed to estimate and predict their motion states. Based on the…

    Submitted 3 October, 2025; originally announced October 2025.

    Comments: 8 pages, 8 figures, accepted to IROS2025

  7. arXiv:2509.25381  [pdf, ps, other]

    cs.LG

    Deep Survival Analysis for Competing Risk Modeling with Functional Covariates and Missing Data Imputation

    Authors: Penglei Gao, Yan Zou, Abhijit Duggal, Shuaiqi Huang, Faming Liang, Xiaofeng Wang

    Abstract: We introduce the Functional Competing Risk Net (FCRN), a unified deep-learning framework for discrete-time survival analysis under competing risks, which seamlessly integrates functional covariates and handles missing data within an end-to-end model. By combining a micro-network Basis Layer for functional data representation with a gradient-based imputation module, FCRN simultaneously learns to im…

    Submitted 29 September, 2025; originally announced September 2025.

  8. arXiv:2509.19589  [pdf, ps, other]

    cs.CV

    Synthesizing Artifact Dataset for Pixel-level Detection

    Authors: Dennis Menn, Feng Liang, Diana Marculescu

    Abstract: Artifact detectors have been shown to enhance the performance of image-generative models by serving as reward models during fine-tuning. These detectors enable the generative model to improve overall output fidelity and aesthetics. However, training the artifact detector requires expensive pixel-level human annotations that specify the artifact regions. The lack of annotated data limits the perfor…

    Submitted 23 September, 2025; originally announced September 2025.

    Comments: Under submission to WACV

  9. arXiv:2509.07933  [pdf, ps, other]

    cs.SE cs.AI

    Breaking Android with AI: A Deep Dive into LLM-Powered Exploitation

    Authors: Wanni Vidulige Ishan Perera, Xing Liu, Fan Liang, Junyi Zhang

    Abstract: The rapid evolution of Artificial Intelligence (AI) and Large Language Models (LLMs) has opened up new opportunities in the area of cybersecurity, especially in the exploitation automation landscape and penetration testing. This study explores Android penetration testing automation using LLM-based tools, especially PentestGPT, to identify and execute rooting techniques. Through a comparison of the…

    Submitted 9 September, 2025; originally announced September 2025.

  10. arXiv:2508.01217  [pdf, ps, other]

    stat.ML cs.LG

    Uncertainty Quantification for Large-Scale Deep Networks via Post-StoNet Modeling

    Authors: Yan Sun, Faming Liang

    Abstract: Deep learning has revolutionized modern data science. However, how to accurately quantify the uncertainty of predictions from large-scale deep neural networks (DNNs) remains an unresolved issue. To address this issue, we introduce a novel post-processing approach. This approach feeds the output from the last hidden layer of a pre-trained large-scale DNN model into a stochastic neural network (StoN…

    Submitted 2 August, 2025; originally announced August 2025.

  11. arXiv:2508.01115  [pdf, ps, other]

    cs.LG

    A hierarchy tree data structure for behavior-based user segment representation

    Authors: Yang Liu, Xuejiao Kang, Sathya Iyer, Idris Malik, Ruixuan Li, Juan Wang, Xinchen Lu, Xiangxue Zhao, Dayong Wang, Menghan Liu, Isaac Liu, Feng Liang, Yinzhe Yu

    Abstract: User attributes are essential in multiple stages of modern recommendation systems and are particularly important for mitigating the cold-start problem and improving the experience of new or infrequent users. We propose Behavior-based User Segmentation (BUS), a novel tree-based data structure that hierarchically segments the user universe with various users' categorical attributes based on the user…

    Submitted 1 August, 2025; originally announced August 2025.

    Comments: 18 pages, 7 figures

  12. arXiv:2507.23608  [pdf, ps, other]

    cs.CV cs.CR

    Medical Image De-Identification Benchmark Challenge

    Authors: Linmin Pei, Granger Sutton, Michael Rutherford, Ulrike Wagner, Tracy Nolan, Kirk Smith, Phillip Farmer, Peter Gu, Ambar Rana, Kailing Chen, Thomas Ferleman, Brian Park, Ye Wu, Jordan Kojouharov, Gargi Singh, Jon Lemon, Tyler Willis, Milos Vukadinovic, Grant Duffy, Bryan He, David Ouyang, Marco Pereanez, Daniel Samber, Derek A. Smith, Christopher Cannistraci, et al. (45 additional authors not shown)

    Abstract: The de-identification (deID) of protected health information (PHI) and personally identifiable information (PII) is a fundamental requirement for sharing medical images, particularly through public repositories, to ensure compliance with patient privacy laws. In addition, preservation of non-PHI metadata to inform and enable downstream development of imaging artificial intelligence (AI) is an impo…

    Submitted 31 July, 2025; originally announced July 2025.

    Comments: 19 pages

  13. arXiv:2507.09537  [pdf, ps, other]

    cs.RO

    Self-supervised Pretraining for Integrated Prediction and Planning of Automated Vehicles

    Authors: Yangang Ren, Guojian Zhan, Chen Lv, Jun Li, Fenghua Liang, Keqiang Li

    Abstract: Predicting the future of surrounding agents and accordingly planning a safe, goal-directed trajectory are crucial for automated vehicles. Current methods typically rely on imitation learning to optimize metrics against the ground truth, often overlooking how scene understanding could enable more holistic trajectories. In this paper, we propose Plan-MAE, a unified pretraining framework for predicti…

    Submitted 13 July, 2025; originally announced July 2025.

  14. arXiv:2507.04781  [pdf, ps, other]

    cs.LG

    FedPall: Prototype-based Adversarial and Collaborative Learning for Federated Learning with Feature Drift

    Authors: Yong Zhang, Feng Liang, Guanghu Yuan, Min Yang, Chengming Li, Xiping Hu

    Abstract: Federated learning (FL) enables collaborative training of a global model in the centralized server with data from multiple parties while preserving privacy. However, data heterogeneity can significantly degrade the performance of the global model when each party uses datasets from different sources to train a local model, thereby affecting personalized local models. Among various cases of data het…

    Submitted 7 July, 2025; originally announced July 2025.

    Comments: 10 pages, 6 figures, and 1 table

    Journal ref: International Conference on Computer Vision (ICCV), 2025

  15. arXiv:2507.01320  [pdf, ps, other]

    cs.MM

    Robust Multi-generation Learned Compression of Point Cloud Attribute

    Authors: Xiangzuo Liu, Zhikai Liu, PengPeng Yu, Ruishan Huang, Fan Liang

    Abstract: Existing learned point cloud attribute compression methods primarily focus on single-pass rate-distortion optimization, while overlooking the issue of cumulative distortion in multi-generation compression scenarios. This paper, for the first time, investigates the multi-generation issue in learned point cloud attribute compression. We identify two primary factors contributing to quality degradatio…

    Submitted 1 July, 2025; originally announced July 2025.

  16. arXiv:2506.15889  [pdf, ps, other]

    cs.CL

    Entropy-Driven Pre-Tokenization for Byte-Pair Encoding

    Authors: Yifan Hu, Frank Liang, Dachuan Zhao, Jonathan Geuter, Varshini Reddy, Craig W. Schmidt, Chris Tanner

    Abstract: Byte-Pair Encoding (BPE) has become a widely adopted subword tokenization method in modern language models due to its simplicity and strong empirical performance across downstream tasks. However, applying BPE to unsegmented languages such as Chinese presents significant challenges, as its frequency-driven merge operation is agnostic to linguistic boundaries. To address this, we propose two entropy…

    Submitted 18 June, 2025; originally announced June 2025.

  17. arXiv:2506.15685   

    cs.LG cs.AI

    Ignition Phase: Standard Training for Fast Adversarial Robustness

    Authors: Wang Yu-Hang, Liu Ying, Fang Liang, Wang Xuelin, Junkang Guo, Shiwei Li, Lei Gao, Jian Liu, Wenfei Yin

    Abstract: Adversarial Training (AT) is a cornerstone defense, but many variants overlook foundational feature representations by primarily focusing on stronger attack generation. We introduce Adversarial Evolution Training (AET), a simple yet powerful framework that strategically prepends an Empirical Risk Minimization (ERM) phase to conventional AT. We hypothesize this initial ERM phase cultivates a favora…

    Submitted 10 October, 2025; v1 submitted 25 May, 2025; originally announced June 2025.

    Comments: Due to errors in both the theoretical formulation and the experimental design, we hereby withdraw the manuscript

  18. arXiv:2506.13050  [pdf, ps, other]

    cs.GR cs.CV

    NeuVAS: Neural Implicit Surfaces for Variational Shape Modeling

    Authors: Pengfei Wang, Qiujie Dong, Fangtian Liang, Hao Pan, Lei Yang, Congyi Zhang, Guying Lin, Caiming Zhang, Yuanfeng Zhou, Changhe Tu, Shiqing Xin, Alla Sheffer, Xin Li, Wenping Wang

    Abstract: Neural implicit shape representation has drawn significant attention in recent years due to its smoothness, differentiability, and topological flexibility. However, directly modeling the shape of a neural implicit surface, especially as the zero-level set of a neural signed distance function (SDF), with sparse geometric control is still a challenging task. Sparse input shape control typically incl…

    Submitted 25 September, 2025; v1 submitted 15 June, 2025; originally announced June 2025.

  19. arXiv:2506.12700  [pdf, ps, other]

    cs.LG

    Large Scalable Cross-Domain Graph Neural Networks for Personalized Notification at LinkedIn

    Authors: Shihai He, Julie Choi, Tianqi Li, Zhiwei Ding, Peng Du, Priya Bannur, Franco Liang, Fedor Borisyuk, Padmini Jaikumar, Xiaobing Xue, Viral Gupta

    Abstract: Notification recommendation systems are critical to driving user engagement on professional platforms like LinkedIn. Designing such systems involves integrating heterogeneous signals across domains, capturing temporal dynamics, and optimizing for multiple, often competing, objectives. Graph Neural Networks (GNNs) provide a powerful framework for modeling complex interactions in such environments.…

    Submitted 14 June, 2025; originally announced June 2025.

    MSC Class: 68R10

  20. arXiv:2505.19136  [pdf, ps, other]

    stat.ML cs.LG

    Uncertainty Quantification for Physics-Informed Neural Networks with Extended Fiducial Inference

    Authors: Frank Shih, Zhenghao Jiang, Faming Liang

    Abstract: Uncertainty quantification (UQ) in scientific machine learning is increasingly critical as neural networks are widely adopted to tackle complex problems across diverse scientific disciplines. For physics-informed neural networks (PINNs), a prominent model in scientific machine learning, uncertainty is typically quantified using Bayesian or dropout methods. However, both approaches suffer from a fu…

    Submitted 16 October, 2025; v1 submitted 25 May, 2025; originally announced May 2025.

  21. arXiv:2505.18521  [pdf, ps, other]

    cs.CV

    Improved Immiscible Diffusion: Accelerate Diffusion Training by Reducing Its Miscibility

    Authors: Yiheng Li, Feng Liang, Dan Kondratyuk, Masayoshi Tomizuka, Kurt Keutzer, Chenfeng Xu

    Abstract: The substantial training cost of diffusion models hinders their deployment. Immiscible Diffusion recently showed that reducing diffusion trajectory mixing in the noise space via linear assignment accelerates training by simplifying denoising. To extend immiscible diffusion beyond the inefficient linear assignment under high batch sizes and high dimensions, we refine this concept to a broader misci…

    Submitted 24 May, 2025; originally announced May 2025.

  22. arXiv:2505.01995  [pdf, ps, other]

    stat.ML cs.LG math.ST stat.CO

    Extended Fiducial Inference for Individual Treatment Effects via Deep Neural Networks

    Authors: Sehwan Kim, Faming Liang

    Abstract: Individual treatment effect estimation has gained significant attention in recent data science literature. This work introduces the Double Neural Network (Double-NN) method to address this problem within the framework of extended fiducial inference (EFI). In the proposed method, deep neural networks are used to model the treatment and control effect functions, while an additional neural network is…

    Submitted 4 May, 2025; originally announced May 2025.

  23. arXiv:2504.19624  [pdf, other]

    cs.RO

    ARMOR: Adaptive Meshing with Reinforcement Optimization for Real-time 3D Monitoring in Unexposed Scenes

    Authors: Yizhe Zhang, Jianping Li, Xin Zhao, Fuxun Liang, Zhen Dong, Bisheng Yang

    Abstract: Unexposed environments, such as lava tubes, mines, and tunnels, are among the most complex yet strategically significant domains for scientific exploration and infrastructure development. Accurate and real-time 3D meshing of these environments is essential for applications including automated structural assessment, robotic-assisted inspection, and safety monitoring. Implicit neural Signed Distance…

    Submitted 28 April, 2025; originally announced April 2025.

  24. arXiv:2504.04658  [pdf, other]

    cs.CV stat.AP

    3DM-WeConvene: Learned Image Compression with 3D Multi-Level Wavelet-Domain Convolution and Entropy Model

    Authors: Haisheng Fu, Jie Liang, Feng Liang, Zhenman Fang, Guohe Zhang, Jingning Han

    Abstract: Learned image compression (LIC) has recently made significant progress, surpassing traditional methods. However, most LIC approaches operate mainly in the spatial domain and lack mechanisms for reducing frequency-domain correlations. To address this, we propose a novel framework that integrates low-complexity 3D multi-level Discrete Wavelet Transform (DWT) into convolutional layers and entropy cod…

    Submitted 6 April, 2025; originally announced April 2025.

    Comments: 13 pages

  25. arXiv:2502.07802  [pdf, other]

    cs.CV cs.GR cs.LG

    Movie Weaver: Tuning-Free Multi-Concept Video Personalization with Anchored Prompts

    Authors: Feng Liang, Haoyu Ma, Zecheng He, Tingbo Hou, Ji Hou, Kunpeng Li, Xiaoliang Dai, Felix Juefei-Xu, Samaneh Azadi, Animesh Sinha, Peizhao Zhang, Peter Vajda, Diana Marculescu

    Abstract: Video personalization, which generates customized videos using reference images, has gained significant attention. However, prior methods typically focus on single-concept personalization, limiting broader applications that require multi-concept integration. Attempts to extend these models to multiple concepts often lead to identity blending, which results in composite characters with fused attrib…

    Submitted 4 February, 2025; originally announced February 2025.

    Comments: Project page: https://jeff-liangf.github.io/projects/movieweaver/

  26. arXiv:2501.05809  [pdf, other]

    cs.LG

    AdaPRL: Adaptive Pairwise Regression Learning with Uncertainty Estimation for Universal Regression Tasks

    Authors: Fuhang Liang, Rucong Xu, Deng Lin

    Abstract: Current deep regression models usually learn in a point-wise way that treats each sample as an independent input, neglecting the relative ordering among different data. Consequently, the regression model could neglect the data's interrelationships, potentially resulting in suboptimal performance. Moreover, the existence of aleatoric uncertainty in the training data may drive the model to capture n…

    Submitted 9 February, 2025; v1 submitted 10 January, 2025; originally announced January 2025.

    Comments: 24 pages, 11 figures

  27. arXiv:2412.17109  [pdf, other]

    cs.CV cs.LG

    Similarity Trajectories: Linking Sampling Process to Artifacts in Diffusion-Generated Images

    Authors: Dennis Menn, Feng Liang, Hung-Yueh Chiang, Diana Marculescu

    Abstract: Artifact detection algorithms are crucial to correcting the output generated by diffusion models. However, because of the variety of artifact forms, existing methods require substantial annotated data for training. This requirement limits their scalability and efficiency, which restricts their wide application. This paper shows that the similarity of denoised images between consecutive time steps…

    Submitted 22 December, 2024; originally announced December 2024.

  28. arXiv:2412.12581  [pdf, other]

    cs.HC

    Understanding Emotional Body Expressions via Large Language Models

    Authors: Haifeng Lu, Jiuyi Chen, Feng Liang, Mingkui Tan, Runhao Zeng, Xiping Hu

    Abstract: Emotion recognition based on body movements is vital in human-computer interaction. However, existing emotion recognition methods predominantly focus on enhancing classification accuracy, often neglecting the provision of textual explanations to justify their classifications. In this paper, we propose an Emotion-Action Interpreter powered by Large Language Model (EAI-LLM), which not only recognize…

    Submitted 20 December, 2024; v1 submitted 17 December, 2024; originally announced December 2024.

    Comments: Accepted by AAAI 2025

  29. arXiv:2411.07815  [pdf, other]

    cs.RO cs.CV

    Reliable-loc: Robust sequential LiDAR global localization in large-scale street scenes based on verifiable cues

    Authors: Xianghong Zou, Jianping Li, Weitong Wu, Fuxun Liang, Bisheng Yang, Zhen Dong

    Abstract: Wearable laser scanning (WLS) systems have the advantages of flexibility and portability. They can be used for determining the user's path within a prior map, which is in high demand for applications in pedestrian navigation, collaborative mapping, augmented reality, and emergency rescue. However, existing LiDAR-based global localization methods suffer from insufficient robustness, especially in comple…

    Submitted 6 April, 2025; v1 submitted 9 November, 2024; originally announced November 2024.

  30. arXiv:2411.00969  [pdf, other]

    stat.ML cs.LG

    Magnitude Pruning of Large Pretrained Transformer Models with a Mixture Gaussian Prior

    Authors: Mingxuan Zhang, Yan Sun, Faming Liang

    Abstract: Large pretrained transformer models have revolutionized modern AI applications with their state-of-the-art performance in natural language processing (NLP). However, their substantial parameter count poses challenges for real-world deployment. To address this, researchers often reduce model size by pruning parameters based on their magnitude or sensitivity. Previous research has demonstrated the l…

    Submitted 1 November, 2024; originally announced November 2024.

  31. arXiv:2411.00273  [pdf, other]

    cs.LG stat.AP stat.ML

    Efficient Model Compression for Bayesian Neural Networks

    Authors: Diptarka Saha, Zihe Liu, Feng Liang

    Abstract: Model Compression has drawn much attention within the deep learning community recently. Compressing a dense neural network offers many advantages including lower computation cost, deployability to devices of limited storage and memories, and resistance to adversarial attacks. This may be achieved via weight pruning or fully discarding certain input features. Here we demonstrate a novel strategy to…

    Submitted 31 October, 2024; originally announced November 2024.

  32. arXiv:2410.11913  [pdf]

    cs.CV

    Development and Testing of a Wood Panels Bark Removal Equipment Based on Deep Learning

    Authors: Rijun Wang, Guanghao Zhang, Hongyang Chen, Xinye Yu, Yesheng Chen, Fulong Liang, Xiangwei Mou, Bo Wang

    Abstract: Attempting to apply deep learning methods to wood panels bark removal equipment to enhance the quality and efficiency of bark removal is a significant and challenging endeavor. This study develops and tests a deep learning-based wood panels bark removal equipment. In accordance with the practical requirements of sawmills, a wood panels bark removal equipment equipped with a vision inspection syste…

    Submitted 15 October, 2024; originally announced October 2024.

  33. arXiv:2409.11419  [pdf, other]

    cs.HC

    Vsens Reality: Blending the Virtual Sensors into XR

    Authors: Fengzhou Liang, Tian Min, Yuta Sugiura

    Abstract: In recent years, virtual sensing techniques have been extensively studied as a method of data collection in simulated virtual spaces for the development of human activity recognition (HAR) systems. To date, this technique has enabled the transformation between different modalities, significantly expanding datasets that are typically difficult to collect. However, there is limited research on how t…

    Submitted 10 September, 2024; originally announced September 2024.

  34. arXiv:2409.03597  [pdf, other]

    cs.SD cs.AI eess.AS

    Multimodal Laryngoscopic Video Analysis for Assisted Diagnosis of Vocal Fold Paralysis

    Authors: Yucong Zhang, Xin Zou, Jinshan Yang, Wenjun Chen, Juan Liu, Faya Liang, Ming Li

    Abstract: This paper presents the Multimodal Laryngoscopic Video Analyzing System (MLVAS), a novel system that leverages both audio and video data to automatically extract key video segments and metrics from raw laryngeal videostroboscopic videos for assisted clinical assessment. The system integrates video-based glottis detection with an audio keyword spotting method to analyze both video and audio data, i…

    Submitted 22 April, 2025; v1 submitted 5 September, 2024; originally announced September 2024.

    Comments: Submitted to CSL

  35. arXiv:2408.08769  [pdf, ps, other]

    cs.CL

    Lower Layers Matter: Alleviating Hallucination via Multi-Layer Fusion Contrastive Decoding with Truthfulness Refocused

    Authors: Dingwei Chen, Feiteng Fang, Shiwen Ni, Feng Liang, Xiping Hu, Ahmadreza Argha, Hamid Alinejad-Rokny, Min Yang, Chengming Li

    Abstract: Large Language Models (LLMs) have demonstrated exceptional performance across various natural language processing tasks. However, they occasionally generate inaccurate and counterfactual outputs, a phenomenon commonly referred to as "hallucinations". To tackle this issue, recent studies have explored contrastive decoding between the original model and an amateur model with induced hallucination,…

    Submitted 3 June, 2025; v1 submitted 16 August, 2024; originally announced August 2024.

  36. arXiv:2407.21622  [pdf, other]

    stat.ML cs.LG math.ST

    Extended Fiducial Inference: Toward an Automated Process of Statistical Inference

    Authors: Faming Liang, Sehwan Kim, Yan Sun

    Abstract: While fiducial inference was widely considered a big blunder by R.A. Fisher, the goal he initially set -- 'inferring the uncertainty of model parameters on the basis of observations' -- has been continually pursued by many statisticians. To this end, we develop a new statistical inference method called extended fiducial inference (EFI). The new method achieves the goal of fiducial inference by leve…

    Submitted 31 July, 2024; originally announced July 2024.

  37. arXiv:2406.08115  [pdf, other]

    cs.DC cs.AI

    Resource Allocation and Workload Scheduling for Large-Scale Distributed Deep Learning: A Survey

    Authors: Feng Liang, Zhen Zhang, Haifeng Lu, Chengming Li, Victor C. M. Leung, Yanyi Guo, Xiping Hu

    Abstract: With rapidly increasing distributed deep learning workloads in large-scale data centers, efficient distributed deep learning framework strategies for resource allocation and workload scheduling have become the key to high-performance deep learning. The large-scale environment with large volumes of datasets, models, and computational and communication resources raises various unique challenges for…

    Submitted 12 June, 2024; originally announced June 2024.

  38. arXiv:2405.15757  [pdf, other]

    cs.CV cs.MM

    Looking Backward: Streaming Video-to-Video Translation with Feature Banks

    Authors: Feng Liang, Akio Kodaira, Chenfeng Xu, Masayoshi Tomizuka, Kurt Keutzer, Diana Marculescu

    Abstract: This paper introduces StreamV2V, a diffusion model that achieves real-time streaming video-to-video (V2V) translation with user prompts. Unlike prior V2V methods using batches to process limited frames, we opt to process frames in a streaming fashion, to support unlimited frames. At the heart of StreamV2V lies a backward-looking principle that relates the present to the past. This is realized by m…

    Submitted 15 February, 2025; v1 submitted 24 May, 2024; originally announced May 2024.

    Comments: ICLR 2025. Project page: https://jeff-liangf.github.io/projects/streamv2v

  39. arXiv:2404.11467  [pdf, other]

    cs.SE cs.CR

    A Large-scale Fine-grained Analysis of Packages in Open-Source Software Ecosystems

    Authors: Xiaoyan Zhou, Feiran Liang, Zhaojie Xie, Yang Lan, Wenjia Niu, Jiqiang Liu, Haining Wang, Qiang Li

    Abstract: Package managers such as NPM, Maven, and PyPI play a pivotal role in open-source software (OSS) ecosystems, streamlining the distribution and management of various freely available packages. The fine-grained details within software packages can unveil potential risks within existing OSS ecosystems, offering valuable insights for detecting malicious packages. In this study, we undertake a large-sca…

    Submitted 17 April, 2024; originally announced April 2024.

  40. arXiv:2404.11051  [pdf]

    cs.CV

    WPS-Dataset: A benchmark for wood plate segmentation in bark removal processing

    Authors: Rijun Wang, Guanghao Zhang, Fulong Liang, Bo Wang, Xiangwei Mou, Yesheng Chen, Peng Sun, Canjin Wang

    Abstract: Using deep learning methods is a promising approach to improving bark removal efficiency and enhancing the quality of wood products. However, the lack of publicly available datasets for wood plate segmentation in bark removal processing poses challenges for researchers in this field. To address this issue, a benchmark for wood plate segmentation in bark removal processing named WPS-dataset is prop…

    Submitted 25 April, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

    Report number: b06d7e0b-306f-476a-a72d-59a8793ac232 | v.1.2

  41. arXiv:2404.06114  [pdf, other]

    cs.DC cs.AI

    Communication-Efficient Large-Scale Distributed Deep Learning: A Comprehensive Survey

    Authors: Feng Liang, Zhen Zhang, Haifeng Lu, Victor C. M. Leung, Yanyi Guo, Xiping Hu

    Abstract: With the rapid growth in the volume of data sets, models, and devices in the domain of deep learning, there is increasing attention on large-scale distributed deep learning. In contrast to traditional distributed deep learning, the large-scale scenario poses new challenges that include fault tolerance, scalability of algorithms and infrastructures, and heterogeneity in data sets, models, and resou…

    Submitted 9 April, 2024; originally announced April 2024.

  42. arXiv:2403.18994  [pdf, other]

    stat.ML cs.LG

    Causal-StoNet: Causal Inference for High-Dimensional Complex Data

    Authors: Yaxin Fang, Faming Liang

    Abstract: With the advancement of data science, the collection of increasingly complex datasets has become commonplace. In such datasets, the data dimension can be extremely high, and the underlying data generation process can be unknown and highly nonlinear. As a result, the task of making causal inference with high-dimensional complex data has become a fundamental problem in many disciplines, such as medi…

    Submitted 27 March, 2024; originally announced March 2024.

  43. arXiv:2403.13178  [pdf, other]

    stat.ML cs.AI cs.LG

    Fast Value Tracking for Deep Reinforcement Learning

    Authors: Frank Shih, Faming Liang

    Abstract: Reinforcement learning (RL) tackles sequential decision-making problems by creating agents that interact with their environment. However, existing algorithms often view these problems as static, focusing on point estimates for model parameters to maximize expected rewards, neglecting the stochastic dynamics of agent-environment interactions and the critical role of uncertainty quantification. Our…

    Submitted 19 March, 2024; originally announced March 2024.

  44. arXiv:2402.15602  [pdf, other]

    math.ST cs.IT cs.LG stat.ML

    Minimax Optimality of Score-based Diffusion Models: Beyond the Density Lower Bound Assumptions

    Authors: Kaihong Zhang, Caitlyn H. Yin, Feng Liang, Jingbo Liu

    Abstract: We study the asymptotic error of score-based diffusion model sampling in large-sample scenarios from a non-parametric statistics perspective. We show that a kernel-based score estimator achieves an optimal mean square error of $\widetilde{O}\left(n^{-1} t^{-\frac{d+2}{2}}(t^{\frac{d}{2}} \vee 1)\right)$ for the score function of $p_0*\mathcal{N}(0,t\boldsymbol{I}_d)$, where $n$ and $d$ represent t…

    Submitted 23 July, 2024; v1 submitted 23 February, 2024; originally announced February 2024.

    Journal ref: Proceedings of the 41st International Conference on Machine Learning, PMLR 235:60134-60178, 2024

  45. arXiv:2402.14399  [pdf, other]

    cs.IR cs.AI

    Ensure Timeliness and Accuracy: A Novel Sliding Window Data Stream Paradigm for Live Streaming Recommendation

    Authors: Fengqi Liang, Baigong Zheng, Liqin Zhao, Guorui Zhou, Qian Wang, Yanan Niu

    Abstract: Live streaming recommender system is specifically designed to recommend real-time live streaming of interest to users. Due to the dynamic changes of live content, improving the timeliness of the live streaming recommender system is a critical problem. Intuitively, the timeliness of the data determines the upper bound of the timeliness that models can learn. However, none of the previous works addr…

    Submitted 22 February, 2024; originally announced February 2024.

  46. arXiv:2401.10386  [pdf, other]

    cs.LG eess.SP physics.med-ph

    Noninvasive Acute Compartment Syndrome Diagnosis Using Random Forest Machine Learning

    Authors: Zaina Abu Hweij, Florence Liang, Sophie Zhang

    Abstract: Acute compartment syndrome (ACS) is an orthopedic emergency, caused by elevated pressure within a muscle compartment, that leads to permanent tissue damage and eventually death. Diagnosis of ACS relies heavily on patient-reported symptoms, a method that is clinically unreliable and often supplemented with invasive intracompartmental pressure measurements that can malfunction in motion settings. Th…

    Submitted 12 February, 2024; v1 submitted 18 January, 2024; originally announced January 2024.

  47. arXiv:2312.17681  [pdf, other]

    cs.CV cs.MM

    FlowVid: Taming Imperfect Optical Flows for Consistent Video-to-Video Synthesis

    Authors: Feng Liang, Bichen Wu, Jialiang Wang, Licheng Yu, Kunpeng Li, Yinan Zhao, Ishan Misra, Jia-Bin Huang, Peizhao Zhang, Peter Vajda, Diana Marculescu

    Abstract: Diffusion models have transformed image-to-image (I2I) synthesis and are now permeating into videos. However, the advancement of video-to-video (V2V) synthesis has been hampered by the challenge of maintaining temporal consistency across video frames. This paper proposes a consistent V2V synthesis framework by jointly leveraging spatial conditions and temporal optical flow clues within the sou…

    Submitted 29 December, 2023; originally announced December 2023.

    Comments: Project website: https://jeff-liangf.github.io/projects/flowvid/

  48. arXiv:2312.13834  [pdf, other]

    cs.CV

    Fairy: Fast Parallelized Instruction-Guided Video-to-Video Synthesis

    Authors: Bichen Wu, Ching-Yao Chuang, Xiaoyan Wang, Yichen Jia, Kapil Krishnakumar, Tong Xiao, Feng Liang, Licheng Yu, Peter Vajda

    Abstract: In this paper, we introduce Fairy, a minimalist yet robust adaptation of image-editing diffusion models, enhancing them for video editing applications. Our approach centers on the concept of anchor-based cross-frame attention, a mechanism that implicitly propagates diffusion features across frames, ensuring superior temporal coherence and high-fidelity synthesis. Fairy not only addresses limitatio…

    Submitted 19 December, 2023; originally announced December 2023.

    Comments: Project website: https://fairy-video2video.github.io

  49. RelJoin: Relative-cost-based Selection of Distributed Join Methods for Query Plan Optimization

    Authors: F. Liang, F. C. M. Lau, H. Cui, Y. Li, B. Lin, C. Li, X. Hu

    Abstract: Selecting appropriate distributed join methods for logical join operations in a query plan is crucial for the performance of data-intensive scalable computing (DISC). Different network communication patterns in the data exchange phase generate varying network communication workloads and significantly affect the distributed join performance. However, most cost-based query optimizers focus on the lo…

    Submitted 24 November, 2023; originally announced November 2023.

    Journal ref: Information Sciences 658 (2024) 120022

  50. arXiv:2311.11210  [pdf, other]

    cs.CV

    HiH: A Multi-modal Hierarchy in Hierarchy Network for Unconstrained Gait Recognition

    Authors: Lei Wang, Bo Liu, Yinchi Ma, Fangfang Liang, Nawei Guo

    Abstract: Gait recognition has achieved promising advances in controlled settings, yet it significantly struggles in unconstrained environments due to challenges such as view changes, occlusions, and varying walking speeds. Additionally, efforts to fuse multiple modalities often face limited improvements because of cross-modality incompatibility, particularly in outdoor scenarios. To address these issues, w…

    Submitted 1 May, 2024; v1 submitted 18 November, 2023; originally announced November 2023.