Showing 1–7 of 7 results for author: Gui, D

Searching in archive cs.
  1. arXiv:2505.09168  [pdf, other]

    cs.CV cs.AI

    DRRNet: Macro-Micro Feature Fusion and Dual Reverse Refinement for Camouflaged Object Detection

    Authors: Jianlin Sun, Xiaolin Fang, Juwei Guan, Dongdong Gui, Teqi Wang, Tongxin Zhu

    Abstract: The core challenge in Camouflaged Object Detection (COD) lies in the indistinguishable similarity between targets and backgrounds in terms of color, texture, and shape. This causes existing methods to either lose edge details (such as hair-like fine structures) due to over-reliance on global semantic information or be disturbed by similar backgrounds (such as vegetation patterns) when relying solely…

    Submitted 14 May, 2025; originally announced May 2025.
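
    A minimal PyTorch sketch of the general macro-micro fusion idea the abstract describes: a coarse "macro" semantic feature is upsampled and fused with a fine "micro" detail feature. The module and channel sizes below are illustrative assumptions, not the authors' DRRNet architecture.

        # Hedged sketch (not the authors' DRRNet code): fuse a low-resolution
        # "macro" semantic map with a high-resolution "micro" detail map.
        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        class MacroMicroFusion(nn.Module):
            def __init__(self, macro_ch: int, micro_ch: int, out_ch: int):
                super().__init__()
                self.fuse = nn.Sequential(
                    nn.Conv2d(macro_ch + micro_ch, out_ch, kernel_size=3, padding=1),
                    nn.BatchNorm2d(out_ch),
                    nn.ReLU(inplace=True),
                )

            def forward(self, macro_feat, micro_feat):
                # Upsample the coarse semantic feature to the detail resolution,
                # then fuse by concatenation + convolution.
                macro_up = F.interpolate(
                    macro_feat, size=micro_feat.shape[-2:],
                    mode="bilinear", align_corners=False)
                return self.fuse(torch.cat([macro_up, micro_feat], dim=1))

        # Example: a 1/8-resolution semantic map fused with a 1/2-resolution edge map.
        fusion = MacroMicroFusion(macro_ch=256, micro_ch=64, out_ch=64)
        macro = torch.randn(1, 256, 32, 32)
        micro = torch.randn(1, 64, 128, 128)
        print(fusion(macro, micro).shape)  # torch.Size([1, 64, 128, 128])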

  2. arXiv:2502.18699  [pdf, ps, other]

    cs.CL cs.LG stat.ME

    MPO: An Efficient Post-Processing Framework for Mixing Diverse Preference Alignment

    Authors: Tianze Wang, Dongnan Gui, Yifan Hu, Shuhang Lin, Linjun Zhang

    Abstract: Reinforcement Learning from Human Feedback (RLHF) has shown promise in aligning large language models (LLMs). Yet its reliance on a singular reward model often overlooks the diversity of human preferences. Recent approaches address this limitation by leveraging multi-dimensional feedback to fine-tune corresponding reward models and train LLMs using reinforcement learning. However, the process is c…

    Submitted 22 July, 2025; v1 submitted 25 February, 2025; originally announced February 2025.

    Comments: ICML 2025
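
    The abstract describes mixing multi-dimensional preference signals as a post-processing step rather than training one LLM per reward dimension with RL. The sketch below is only a loose illustration of that idea (a weighted best-of-n selection); the reward functions and weights are made-up placeholders, not the paper's MPO procedure.

        # Hedged sketch: post-processing that mixes several preference
        # dimensions with fixed weights to pick the best candidate response.
        from typing import Callable, Dict, List

        def mixed_best_of_n(
            candidates: List[str],
            reward_fns: Dict[str, Callable[[str], float]],
            weights: Dict[str, float],
        ) -> str:
            """Return the candidate with the highest weighted sum of rewards."""
            def score(text: str) -> float:
                return sum(weights[name] * fn(text) for name, fn in reward_fns.items())
            return max(candidates, key=score)

        # Toy usage with placeholder reward functions.
        rewards = {
            "helpfulness": lambda t: float(len(t.split())),  # longer == "more helpful"
            "conciseness": lambda t: -float(len(t)),         # shorter == more concise
        }
        weights = {"helpfulness": 0.7, "conciseness": 0.3}
        print(mixed_best_of_n(["short answer", "a somewhat longer answer"], rewards, weights))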

  3. arXiv:2406.04744  [pdf, other]

    cs.CL

    CRAG -- Comprehensive RAG Benchmark

    Authors: Xiao Yang, Kai Sun, Hao Xin, Yushi Sun, Nikita Bhalla, Xiangsen Chen, Sajal Choudhary, Rongze Daniel Gui, Ziran Will Jiang, Ziyu Jiang, Lingkun Kong, Brian Moran, Jiaqi Wang, Yifan Ethan Xu, An Yan, Chenyu Yang, Eting Yuan, Hanwen Zha, Nan Tang, Lei Chen, Nicolas Scheffer, Yue Liu, Nirav Shah, Rakesh Wanga, Anuj Kumar, et al. (2 additional authors not shown)

    Abstract: Retrieval-Augmented Generation (RAG) has recently emerged as a promising solution to alleviate the knowledge deficiency of Large Language Models (LLMs). Existing RAG datasets, however, do not adequately represent the diverse and dynamic nature of real-world Question Answering (QA) tasks. To bridge this gap, we introduce the Comprehensive RAG Benchmark (CRAG), a factual question answering benchmark…

    Submitted 1 November, 2024; v1 submitted 7 June, 2024; originally announced June 2024.

    Comments: NeurIPS 2024 Datasets and Benchmarks Track
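
    For context, a factual QA benchmark of this kind is typically consumed as a list of question-answer records scored against a system's predictions. The loop below is a hedged sketch of such an evaluation; the record fields and the rag_answer callable are assumptions for illustration, not CRAG's actual schema or tooling.

        # Hedged sketch: exact-match evaluation of a RAG system on QA records.
        from typing import Callable, Dict, List

        def exact_match(prediction: str, reference: str) -> bool:
            return prediction.strip().lower() == reference.strip().lower()

        def evaluate(examples: List[Dict[str, str]],
                     rag_answer: Callable[[str], str]) -> float:
            """Fraction of questions answered exactly right by the RAG system."""
            correct = sum(exact_match(rag_answer(ex["question"]), ex["answer"])
                          for ex in examples)
            return correct / max(len(examples), 1)

        # Toy usage with a stub system that always returns the same string.
        examples = [{"question": "Capital of France?", "answer": "Paris"}]
        print(evaluate(examples, lambda q: "Paris"))  # 1.0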

  4. arXiv:2310.18642  [pdf]

    cs.CV cs.AI

    One-shot Localization and Segmentation of Medical Images with Foundation Models

    Authors: Deepa Anand, Gurunath Reddy M, Vanika Singhal, Dattesh D. Shanbhag, Shriram KS, Uday Patil, Chitresh Bhushan, Kavitha Manickam, Dawei Gui, Rakesh Mullick, Avinash Gopal, Parminder Bhatia, Taha Kass-Hout

    Abstract: Vision Transformers (ViT) and Stable Diffusion (SD) models, with their ability to capture rich semantic features of an image, have recently been used for image correspondence tasks on natural images. In this paper, we examine the ability of a variety of pre-trained ViT (DINO, DINOv2, SAM, CLIP) and SD models, trained exclusively on natural images, to solve the correspondence problems o…

    Submitted 28 October, 2023; originally announced October 2023.

    Comments: Accepted at NeurIPS 2023 R0-FoMo Workshop
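
    The correspondence idea in the abstract — matching dense features from a frozen, natural-image-trained backbone between an annotated template image and a new target image — can be sketched as a nearest-neighbour search over patch features. The feature shapes below are arbitrary placeholders and the feature extractor is left abstract; this is not the paper's pipeline.

        # Hedged sketch: one-shot point localization by cosine-similarity matching
        # of dense ViT-style patch features.
        import torch
        import torch.nn.functional as F

        def match_point(template_feats: torch.Tensor,   # (C, H, W) dense features
                        target_feats: torch.Tensor,     # (C, H, W) dense features
                        query_yx: tuple) -> tuple:
            """Return the (y, x) location in the target whose feature is most
            cosine-similar to the template feature at `query_yx`."""
            c, h, w = target_feats.shape
            q = template_feats[:, query_yx[0], query_yx[1]]            # (C,)
            sims = F.cosine_similarity(
                q[:, None], target_feats.reshape(c, h * w), dim=0)     # (H*W,)
            idx = int(sims.argmax())
            return divmod(idx, w)

        # Toy usage with random "features"; in practice these would come from a
        # frozen backbone (e.g. DINO/DINOv2) run on the template and target images.
        template = torch.randn(384, 16, 16)
        target = torch.randn(384, 16, 16)
        print(match_point(template, target, (3, 5)))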

  5. arXiv:2305.19543  [pdf, other]

    cs.CV

    Improving Handwritten OCR with Training Samples Generated by Glyph Conditional Denoising Diffusion Probabilistic Model

    Authors: Haisong Ding, Bozhi Luan, Dongnan Gui, Kai Chen, Qiang Huo

    Abstract: Constructing a highly accurate handwritten OCR system requires large amounts of representative training data, which is both time-consuming and expensive to collect. To mitigate this issue, we propose a denoising diffusion probabilistic model (DDPM) to generate training samples. This model conditions on a printed glyph image and creates mappings between printed characters and handwritten images, thus…

    Submitted 31 May, 2023; originally announced May 2023.
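
    A common way to realize the glyph conditioning described here is to concatenate the printed glyph image with the noisy handwritten image along the channel axis and train the denoiser to predict the added noise. The sketch below illustrates that pattern under stated assumptions; the tiny stand-in denoiser and the noise schedule are placeholders, not the paper's model.

        # Hedged sketch: one training step of a glyph-conditioned DDPM, where the
        # printed glyph is concatenated channel-wise with the noisy image.
        import torch
        import torch.nn.functional as F

        def ddpm_training_step(denoiser, handwritten, printed_glyph, alphas_cumprod):
            """One noise-prediction step for a glyph-conditioned DDPM."""
            b = handwritten.size(0)
            t = torch.randint(0, len(alphas_cumprod), (b,), device=handwritten.device)
            a_bar = alphas_cumprod[t].view(b, 1, 1, 1)
            noise = torch.randn_like(handwritten)
            # Forward diffusion: x_t = sqrt(a_bar) * x_0 + sqrt(1 - a_bar) * eps
            x_t = a_bar.sqrt() * handwritten + (1 - a_bar).sqrt() * noise
            # Condition by channel-wise concatenation with the printed glyph.
            pred_noise = denoiser(torch.cat([x_t, printed_glyph], dim=1), t)
            return F.mse_loss(pred_noise, noise)

        # Minimal stand-in denoiser so the step runs end-to-end (a real model
        # would be a time-conditioned U-Net).
        class TinyDenoiser(torch.nn.Module):
            def __init__(self):
                super().__init__()
                self.net = torch.nn.Conv2d(2, 1, kernel_size=3, padding=1)
            def forward(self, x, t):
                return self.net(x)

        alphas_cumprod = torch.linspace(0.99, 0.01, steps=1000)
        loss = ddpm_training_step(TinyDenoiser(),
                                  torch.randn(4, 1, 32, 32),  # handwritten x_0
                                  torch.randn(4, 1, 32, 32),  # printed glyph condition
                                  alphas_cumprod)
        print(loss.item())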

  6. arXiv:2305.18259  [pdf, other]

    cs.CV

    GlyphControl: Glyph Conditional Control for Visual Text Generation

    Authors: Yukang Yang, Dongnan Gui, Yuhui Yuan, Weicong Liang, Haisong Ding, Han Hu, Kai Chen

    Abstract: Recently, there has been increasing interest in developing diffusion-based text-to-image generative models capable of generating coherent and well-formed visual text. In this paper, we propose a novel and efficient approach called GlyphControl to address this task. Unlike existing methods that rely on character-aware text encoders like ByT5 and require retraining of text-to-image models, our approach…

    Submitted 11 November, 2023; v1 submitted 29 May, 2023; originally announced May 2023.

    Comments: Accepted by NeurIPS 2023. The code has been released at https://github.com/AIGText/GlyphControl-release

  7. arXiv:2305.15660  [pdf, other]

    cs.CV

    Zero-shot Generation of Training Data with Denoising Diffusion Probabilistic Model for Handwritten Chinese Character Recognition

    Authors: Dongnan Gui, Kai Chen, Haisong Ding, Qiang Huo

    Abstract: There are more than 80,000 character categories in Chinese, while most of them are rarely used. To build a high-performance handwritten Chinese character recognition (HCCR) system supporting the full character set with a traditional approach, many training samples need to be collected for each character category, which is both time-consuming and expensive. In this paper, we propose a novel approach to…

    Submitted 24 May, 2023; originally announced May 2023.