Showing 1–28 of 28 results for author: Ju, R

Searching in archive cs.
  1. arXiv:2602.19086  [pdf, ps, other]

    cs.CV

    Restoration-Guided Kuzushiji Character Recognition Framework under Seal Interference

    Authors: Rui-Yang Ju, Kohei Yamashita, Hirotaka Kameko, Shinsuke Mori

    Abstract: Kuzushiji was one of the most popular writing styles in pre-modern Japan and was widely used in both personal letters and official documents. However, due to its highly cursive forms and extensive glyph variations, most modern Japanese readers cannot directly interpret Kuzushiji characters. Therefore, recent research has focused on developing automated Kuzushiji character recognition methods, whic…

    Submitted 22 February, 2026; originally announced February 2026.

  2. arXiv:2601.17088  [pdf, ps, other]

    cs.CV

    GlassesGB: Controllable 2D GAN-Based Eyewear Personalization for 3D Gaussian Blendshapes Head Avatars

    Authors: Rui-Yang Ju, Jen-Shiun Chiang

    Abstract: Virtual try-on systems allow users to interactively try different products within VR scenarios. However, most existing VTON methods operate only on predefined eyewear templates and lack support for fine-grained, user-driven customization. While GlassesGAN enables personalized 2D eyewear design, its capability remains limited to 2D image generation. Motivated by the success of 3D Gaussian Blendshap…

    Submitted 23 January, 2026; originally announced January 2026.

    Comments: IEEE VR 2026 Poster

  3. arXiv:2601.01708  [pdf, ps, other]

    cs.CL

    A Training-Free Large Reasoning Model-based Knowledge Tracing Framework for Unified Prediction and Prescription

    Authors: Unggi Lee, Joo Young Kim, Ran Ju, Minyoung Jung, Jeyeon Eo

    Abstract: Knowledge Tracing (KT) aims to estimate a learner's evolving mastery based on interaction histories. Recent studies have explored Large Language Models (LLMs) for KT via autoregressive nature, but such approaches typically require fine-tuning and exhibit unstable or near-random performance. Moreover, prior KT systems primarily focus on prediction and rely on multi-stage pipelines for feedback and…

    Submitted 4 January, 2026; originally announced January 2026.

  4. arXiv:2512.14114  [pdf, ps, other]

    cs.CV

    MFE-GAN: Efficient GAN-based Framework for Document Image Enhancement and Binarization with Multi-scale Feature Extraction

    Authors: Rui-Yang Ju, KokSheik Wong, Yanlin Jin, Jen-Shiun Chiang

    Abstract: Document image enhancement and binarization are commonly performed prior to document analysis and recognition tasks for improving the efficiency and accuracy of optical character recognition (OCR) systems. This is because directly recognizing text in degraded documents, particularly in color images, often results in unsatisfactory recognition performance. To address these issues, existing methods…

    Submitted 16 December, 2025; originally announced December 2025.

    Comments: Extended Journal Version of APSIPA ASC 2025

  5. arXiv:2511.17150  [pdf, ps, other]

    cs.CV

    DiffRefiner: Coarse to Fine Trajectory Planning via Diffusion Refinement with Semantic Interaction for End to End Autonomous Driving

    Authors: Liuhan Yin, Runkun Ju, Guodong Guo, Erkang Cheng

    Abstract: Unlike discriminative approaches in autonomous driving that predict a fixed set of candidate trajectories of the ego vehicle, generative methods, such as diffusion models, learn the underlying distribution of future motion, enabling more flexible trajectory prediction. However, since these methods typically rely on denoising human-crafted trajectory anchors or random noise, there remains significa…

    Submitted 21 November, 2025; originally announced November 2025.

    Comments: Accepted to AAAI 2026

  6. arXiv:2511.09117  [pdf, ps, other]

    cs.CV

    DKDS: A Benchmark Dataset of Degraded Kuzushiji Documents with Seals for Detection and Binarization

    Authors: Rui-Yang Ju, Kohei Yamashita, Hirotaka Kameko, Shinsuke Mori

    Abstract: Kuzushiji, a pre-modern Japanese cursive script, can currently be read and understood by only a few thousand trained experts in Japan. With the rapid development of deep learning, researchers have begun applying Optical Character Recognition (OCR) techniques to transcribe Kuzushiji into modern Japanese. Although existing OCR methods perform well on clean pre-modern Japanese documents written in Ku…

    Submitted 17 March, 2026; v1 submitted 12 November, 2025; originally announced November 2025.

  7. arXiv:2508.16739  [pdf, ps, other]

    cs.CV

    Two-Stage Framework for Efficient UAV-Based Wildfire Video Analysis with Adaptive Compression and Fire Source Detection

    Authors: Yanbing Bai, Rui-Yang Ju, Lemeng Zhao, Junjie Hu, Jianchao Bi, Erick Mas, Shunichi Koshimura

    Abstract: Unmanned Aerial Vehicles (UAVs) have become increasingly important in disaster emergency response by enabling real-time aerial video analysis. Due to the limited computational resources available on UAVs, large models cannot be run independently for real-time analysis. To overcome this challenge, we propose a lightweight and efficient two-stage framework for real-time wildfire monitoring and fire…

    Submitted 22 August, 2025; originally announced August 2025.

  8. arXiv:2506.19266  [pdf]

    q-bio.NC cs.CV eess.IV

    Convergent and divergent connectivity patterns of the arcuate fasciculus in macaques and humans

    Authors: Jiahao Huang, Ruifeng Li, Wenwen Yu, Anan Li, Xiangning Li, Mingchao Yan, Lei Xie, Qingrun Zeng, Xueyan Jia, Shuxin Wang, Ronghui Ju, Feng Chen, Qingming Luo, Hui Gong, Andrew Zalesky, Xiaoquan Yang, Yuanjing Feng, Zheng Wang

    Abstract: The organization and connectivity of the arcuate fasciculus (AF) in nonhuman primates remain contentious, especially concerning how its anatomy diverges from that of humans. Here, we combined cross-scale single-neuron tracing - using viral-based genetic labeling and fluorescence micro-optical sectioning tomography in macaques (n = 4; age 3 - 11 years) - with whole-brain tractography from 11.7T dif…

    Submitted 2 July, 2025; v1 submitted 23 June, 2025; originally announced June 2025.

    Comments: 34 pages, 6 figures

  9. arXiv:2505.10072  [pdf, ps, other]

    cs.CV

    ToonifyGB: StyleGAN-based Gaussian Blendshapes for 3D Stylized Head Avatars

    Authors: Rui-Yang Ju, Sheng-Yen Huang, Yi-Ping Hung

    Abstract: The introduction of 3D Gaussian blendshapes has enabled the real-time reconstruction of animatable head avatars from monocular video. Toonify, a StyleGAN-based method, has become widely used for facial image stylization. To extend Toonify for synthesizing diverse stylized 3D head avatars using Gaussian blendshapes, we propose an efficient two-stage framework, ToonifyGB. In Stage 1 (stylized video…

    Submitted 22 January, 2026; v1 submitted 15 May, 2025; originally announced May 2025.

    Comments: 2-Page Version Accepted as a Poster at IEEE VR 2026

  10. arXiv:2502.09913  [pdf, other]

    cs.AI cs.HC

    AutoS$^2$earch: Unlocking the Reasoning Potential of Large Models for Web-based Source Search

    Authors: Zhengqiu Zhu, Yatai Ji, Jiaheng Huang, Yong Zhao, Sihang Qiu, Rusheng Ju

    Abstract: Web-based management systems have been widely used in risk control and industrial safety. However, effectively integrating source search capabilities into these systems, to enable decision-makers to locate and address the hazard (e.g., gas leak detection) remains a challenge. While prior efforts have explored using web crowdsourcing and AI algorithms for source search decision support, these appro…

    Submitted 13 February, 2025; originally announced February 2025.

  11. Pediatric Wrist Fracture Detection Using Feature Context Excitation Modules in X-ray Images

    Authors: Rui-Yang Ju, Chun-Tse Chien, Enkaer Xieerke, Jen-Shiun Chiang

    Abstract: Children often suffer wrist trauma in daily life, while they usually need radiologists to analyze and interpret X-ray images before surgical treatment by surgeons. The development of deep learning has enabled neural networks to serve as computer-assisted diagnosis (CAD) tools to help doctors and experts in medical image diagnostics. Since YOLOv8 model has obtained the satisfactory success in objec…

    Submitted 7 November, 2024; v1 submitted 1 October, 2024; originally announced October 2024.

    Comments: arXiv admin note: text overlap with arXiv:2407.03163

    Journal ref: IET Image Process. 20 (2026) e70269

  12. arXiv:2409.18826  [pdf, other]

    cs.CV

    YOLOv8-ResCBAM: YOLOv8 Based on An Effective Attention Module for Pediatric Wrist Fracture Detection

    Authors: Rui-Yang Ju, Chun-Tse Chien, Jen-Shiun Chiang

    Abstract: Wrist trauma and even fractures occur frequently in daily life, particularly among children who account for a significant proportion of fracture cases. Before performing surgery, surgeons often request patients to undergo X-ray imaging first, and prepare for the surgery based on the analysis of the X-ray images. With the development of neural networks, You Only Look Once (YOLO) series models have…

    Submitted 27 September, 2024; originally announced September 2024.

    Comments: Accepted by ICONIP 2024. arXiv admin note: substantial text overlap with arXiv:2402.09329

  13. arXiv:2409.11692  [pdf, ps, other]

    cs.CV

    ORB-SfMLearner: ORB-Guided Self-supervised Visual Odometry with Selective Online Adaptation

    Authors: Yanlin Jin, Rui-Yang Ju, Haojun Liu, Yuzhong Zhong

    Abstract: Deep visual odometry, despite extensive research, still faces limitations in accuracy and generalizability that prevent its broader application. To address these challenges, we propose an Oriented FAST and Rotated BRIEF (ORB)-guided visual odometry with selective online adaptation named ORB-SfMLearner. We present a novel use of ORB features for learning-based ego-motion estimation, leading to more…

    Submitted 10 January, 2026; v1 submitted 18 September, 2024; originally announced September 2024.

    Comments: ICRA 2025; Project page: https://www.neiljin.site/projects/orbsfm/

  14. arXiv:2407.04231  [pdf, ps, other]

    cs.CV

    Efficient Generative Adversarial Networks for Color Document Image Enhancement and Binarization Using Multi-scale Feature Extraction

    Authors: Rui-Yang Ju, KokSheik Wong, Jen-Shiun Chiang

    Abstract: The outcome of text recognition for degraded color documents is often unsatisfactory due to interference from various contaminants. To extract information more efficiently for text recognition, document image enhancement and binarization are often employed as preliminary steps in document analysis. Training independent generative adversarial networks (GANs) for each color channel can generate imag…

    Submitted 30 November, 2025; v1 submitted 4 July, 2024; originally announced July 2024.

    Comments: Accepted to APSIPA ASC 2025

  15. arXiv:2407.03163  [pdf, other]

    cs.CV

    Global Context Modeling in YOLOv8 for Pediatric Wrist Fracture Detection

    Authors: Rui-Yang Ju, Chun-Tse Chien, Chia-Min Lin, Jen-Shiun Chiang

    Abstract: Children often suffer wrist injuries in daily life, while fracture injuring radiologists usually need to analyze and interpret X-ray images before surgical treatment by surgeons. The development of deep learning has enabled neural network models to work as computer-assisted diagnosis (CAD) tools to help doctors and experts in diagnosis. Since the YOLOv8 models have obtained the satisfactory succes…

    Submitted 3 July, 2024; originally announced July 2024.

  16. arXiv:2404.18245  [pdf, other]

    cs.CV

    FAD-SAR: A Novel Fishing Activity Detection System via Synthetic Aperture Radar Images Based on Deep Learning Method

    Authors: Yanbing Bai, Siao Li, Rui-Yang Ju, Zihao Yang, Jinze Yu, Jen-Shiun Chiang

    Abstract: Illegal, unreported, and unregulated (IUU) fishing activities seriously affect various aspects of human life. However, traditional methods for detecting and monitoring IUU fishing activities at sea have limitations. Although synthetic aperture radar (SAR) can complement existing vessel detection systems, extracting useful information from SAR images using traditional methods remains a challenge, e…

    Submitted 12 July, 2024; v1 submitted 28 April, 2024; originally announced April 2024.

  17. arXiv:2404.18235  [pdf, other]

    cs.CV

    Flood Data Analysis on SpaceNet 8 Using Apache Sedona

    Authors: Yanbing Bai, Zihao Yang, Jinze Yu, Rui-Yang Ju, Bin Yang, Erick Mas, Shunichi Koshimura

    Abstract: With the escalating frequency of floods posing persistent threats to human life and property, satellite remote sensing has emerged as an indispensable tool for monitoring flood hazards. SpaceNet8 offers a unique opportunity to leverage cutting-edge artificial intelligence technologies to assess these hazards. A significant contribution of this research is its application of Apache Sedona, an advan…

    Submitted 28 April, 2024; originally announced April 2024.

  18. arXiv:2403.11249  [pdf, other]

    eess.IV cs.CV

    YOLOv9 for Fracture Detection in Pediatric Wrist Trauma X-ray Images

    Authors: Chun-Tse Chien, Rui-Yang Ju, Kuang-Yi Chou, Jen-Shiun Chiang

    Abstract: The introduction of YOLOv9, the latest version of the You Only Look Once (YOLO) series, has led to its widespread adoption across various scenarios. This paper is the first to apply the YOLOv9 algorithm model to the fracture detection task as computer-assisted diagnosis (CAD) to help radiologists and surgeons to interpret X-ray images. Specifically, this paper trained the model on the GRAZPEDWRI-D…

    Submitted 27 May, 2024; v1 submitted 17 March, 2024; originally announced March 2024.

    Comments: Accepted by Electronics Letters

    Journal ref: Electron. Lett. 60 (2024) e13248

  19. YOLOv8-AM: YOLOv8 Based on Effective Attention Mechanisms for Pediatric Wrist Fracture Detection

    Authors: Chun-Tse Chien, Rui-Yang Ju, Kuang-Yi Chou, Enkaer Xieerke, Jen-Shiun Chiang

    Abstract: Wrist trauma and even fractures occur frequently in daily life, particularly among children who account for a significant proportion of fracture cases. Before performing surgery, surgeons often request patients to undergo X-ray imaging first and prepare for it based on the analysis of the radiologist. With the development of neural networks, You Only Look Once (YOLO) series models have been widely…

    Submitted 28 September, 2024; v1 submitted 14 February, 2024; originally announced February 2024.

    Journal ref: IEEE Access 13 (2025) 52461-52477

  20. arXiv:2305.17420  [pdf, other]

    cs.CV

    CCDWT-GAN: Generative Adversarial Networks Based on Color Channel Using Discrete Wavelet Transform for Document Image Binarization

    Authors: Rui-Yang Ju, Yu-Shian Lin, Jen-Shiun Chiang, Chih-Chia Chen, Wei-Han Chen, Chun-Tse Chien

    Abstract: To efficiently extract textual information from color degraded document images is a significant research area. The prolonged imperfect preservation of ancient documents has led to various types of degradation, such as page staining, paper yellowing, and ink bleeding. These types of degradation badly impact the image processing for features extraction. This paper introduces a novelty method employi…

    Submitted 24 August, 2023; v1 submitted 27 May, 2023; originally announced May 2023.

    Comments: accepted by PRICAI 2023

  21. Fracture Detection in Pediatric Wrist Trauma X-ray Images Using YOLOv8 Algorithm

    Authors: Rui-Yang Ju, Weiming Cai

    Abstract: Hospital emergency departments frequently receive lots of bone fracture cases, with pediatric wrist trauma fracture accounting for the majority of them. Before pediatric surgeons perform surgery, they need to ask patients how the fracture occurred and analyze the fracture situation by interpreting X-ray images. The interpretation of X-ray images often requires a combination of techniques from radi…

    Submitted 14 November, 2023; v1 submitted 11 April, 2023; originally announced April 2023.

    Comments: Scientific Reports

    Journal ref: Sci Rep 13 (2023) 20077

  22. Resolution Enhancement Processing on Low Quality Images Using Swin Transformer Based on Interval Dense Connection Strategy

    Authors: Rui-Yang Ju, Chih-Chia Chen, Jen-Shiun Chiang, Yu-Shian Lin, Wei-Han Chen, Chun-Tse Chien

    Abstract: The Transformer-based method has demonstrated remarkable performance for image super-resolution in comparison to the method based on the convolutional neural networks (CNNs). However, using the self-attention mechanism like SwinIR (Image Restoration Using Swin Transformer) to extract feature information from images needs a significant amount of computational resources, which limits its application…

    Submitted 13 May, 2023; v1 submitted 16 March, 2023; originally announced March 2023.

    Journal ref: Multimed Tools Appl 83 (2024) 14839-14855

  23. Three-stage binarization of color document images based on discrete wavelet transform and generative adversarial networks

    Authors: Rui-Yang Ju, Yu-Shian Lin, Yanlin Jin, Chih-Chia Chen, Chun-Tse Chien, Jen-Shiun Chiang

    Abstract: The efficient extraction of text information from the background in degraded color document images is an important challenge in the preservation of ancient manuscripts. The imperfect preservation of ancient manuscripts has led to different types of degradation over time, such as page yellowing, staining, and ink bleeding, seriously affecting the results of document image binarization. This work pr…

    Submitted 28 September, 2024; v1 submitted 29 November, 2022; originally announced November 2022.

    Comments: Accepted by Knowledge-Based Systems

    Journal ref: Knowl.-Based Syst. 304 (2024) 112542

  24. arXiv:2208.01424  [pdf, other]

    cs.CV

    Connection Reduction of DenseNet for Image Recognition

    Authors: Rui-Yang Ju, Jen-Shiun Chiang, Chih-Chia Chen, Yu-Shian Lin

    Abstract: Convolutional Neural Networks (CNN) increase depth by stacking convolutional layers, and deeper network models perform better in image recognition. Empirical research shows that simply stacking convolutional layers does not make the network train better, and skip connection (residual learning) can improve network model performance. For the image classification task, models with global densely conn…

    Submitted 14 November, 2022; v1 submitted 2 August, 2022; originally announced August 2022.

  25. Efficient Convolutional Neural Networks on Raspberry Pi for Image Classification

    Authors: Rui-Yang Ju, Ting-Yu Lin, Jia-Hao Jian, Jen-Shiun Chiang

    Abstract: With the good performance of deep learning algorithms in the field of computer vision (CV), the convolutional neural network (CNN) architecture has become a main backbone of the computer vision task. With the widespread use of mobile devices, neural network models based on platforms with low computing power are gradually being paid attention. However, due to the limitation of computing power, deep…

    Submitted 19 November, 2022; v1 submitted 2 April, 2022; originally announced April 2022.

    Journal ref: J Real-Time Image Proc 20 (2023) 21

  26. arXiv:2203.00960  [pdf]

    cs.CV

    Aggregated Pyramid Vision Transformer: Split-transform-merge Strategy for Image Recognition without Convolutions

    Authors: Rui-Yang Ju, Ting-Yu Lin, Jen-Shiun Chiang, Jia-Hao Jian, Yu-Shian Lin, Liu-Rui-Yi Huang

    Abstract: With the achievements of Transformer in the field of natural language processing, the encoder-decoder and the attention mechanism in Transformer have been applied to computer vision. Recently, in multiple tasks of computer vision (image classification, object detection, semantic segmentation, etc.), state-of-the-art convolutional neural networks have introduced some concepts of Transformer. This p…

    Submitted 2 March, 2022; originally announced March 2022.

    Comments: 6 pages, 5 figures

  27. ThreshNet: An Efficient DenseNet Using Threshold Mechanism to Reduce Connections

    Authors: Rui-Yang Ju, Ting-Yu Lin, Jia-Hao Jian, Jen-Shiun Chiang, Wei-Bin Yang

    Abstract: With the continuous development of neural networks for computer vision tasks, more and more network architectures have achieved outstanding success. As one of the most advanced neural network architectures, DenseNet shortcuts all feature maps to solve the model depth problem. Although this network architecture has excellent accuracy with low parameters, it requires an excessive inference time. To…

    Submitted 7 August, 2022; v1 submitted 9 January, 2022; originally announced January 2022.

    Comments: IEEE Access

    Journal ref: IEEE Access 10 (2022) 82834-82843

  28. arXiv:2108.12604  [pdf]

    cs.CV

    New Pruning Method Based on DenseNet Network for Image Classification

    Authors: Rui-Yang Ju, Ting-Yu Lin, Jen-Shiun Chiang

    Abstract: Deep neural networks have made significant progress in the field of computer vision. Recent studies have shown that depth, width and shortcut connections of neural network architectures play a crucial role in their performance. One of the most advanced neural network architectures, DenseNet, has achieved excellent convergence rates through dense connections. However, it still has obvious shortcomi…

    Submitted 27 December, 2021; v1 submitted 28 August, 2021; originally announced August 2021.

    Comments: 5 pages, 3 figures, TAAI 2021