Showing 1–17 of 17 results for author: Poudel, K

Searching in archive cs.
  1. arXiv:2604.10451  [pdf, ps, other]

    cs.CV

    Parameter Efficient Fine-tuning for Domain-specific Gastrointestinal Disease Recognition

    Authors: Sanjaya Poudel, Nikita Kunwor, Raj Simkhada, Mustafa Munir, Manish Dhakal, Khem Poudel

    Abstract: Despite recent advancements in the field of medical image analysis with the use of pretrained foundation models, the issue of distribution shifts between cross-source images largely remains adamant. To circumvent that issue, investigators generally train a separate model for each source. However, this method becomes expensive when we fully fine-tune pretrained large models for a single dataset, as…

    Submitted 12 April, 2026; originally announced April 2026.

    Comments: 6 pages, 3 figures, CVPR conference

  2. arXiv:2603.25886  [pdf, ps, other]

    cs.CV

    Automated Quality Assessment of Blind Sweep Obstetric Ultrasound for Improved Diagnosis

    Authors: Prasiddha Bhandari, Kanchan Poudel, Nishant Luitel, Bishram Acharya, Angelina Ghimire, Tyler Wellman, Kilian Koepsell, Pradeep Raj Regmi, Bishesh Khanal

    Abstract: Blind Sweep Obstetric Ultrasound (BSOU) enables scalable fetal imaging in low-resource settings by allowing minimally trained operators to acquire standardized sweep videos for automated Artificial Intelligence (AI) interpretation. However, the reliability of such AI systems depends critically on the quality of the acquired sweeps, and little is known about how deviations from the intended protocol…

    Submitted 26 March, 2026; originally announced March 2026.

  3. arXiv:2602.01069  [pdf, ps, other]

    cs.CV cs.LG

    PDE-Constrained Optimization for Neural Image Segmentation with Physics Priors

    Authors: Seema K. Poudel, Sunny K. Khadka

    Abstract: Segmentation of microscopy images constitutes an ill-posed inverse problem due to measurement noise, weak object boundaries, and limited labeled data. Although deep neural networks provide flexible nonparametric estimators, unconstrained empirical risk minimization often leads to unstable solutions and poor generalization. In this work, image segmentation is formulated as a PDE-constrained optimiz…

    Submitted 1 February, 2026; originally announced February 2026.

  4. arXiv:2601.08863  [pdf]

    cs.OH

    WheatAI v1.0: An AI-Powered High Throughput Wheat Phenotyping Platform

    Authors: Maitiniyazi Maimaitijiang, Hillson Ghimire, Subash Thapa, Mohammad Maruf Billah, Shaurya Sehgal, Mandeep Singh, Swas Kaushal, Kushal Poudel, Santosh Subedi, Ubaid Ur Rehman Janjua, Lise-Olga Makonga, Jyotirmoy Halder, Harsimardeep S. Gill, Mazhar Sher, Jagdeep Singh Sidhu, Sunish K. Sehgal

    Abstract: High-throughput, low-cost phenotyping remains a critical bottleneck in wheat breeding, genetics, and crop management. This is particularly evident in the measurement of complex yield components (i.e., spike and spikelet counts), disease and grain-quality traits related to Fusarium Head Blight (FHB) and Fusarium-Damaged Kernels (FDK), and microscale physiological traits such as density and size of…

    Submitted 9 January, 2026; originally announced January 2026.

  5. arXiv:2503.02904  [pdf, ps, other]

    eess.IV cs.CV cs.LG

    Surgical Vision World Model

    Authors: Saurabh Koju, Saurav Bastola, Prashant Shrestha, Sanskar Amgain, Yash Raj Shrestha, Rudra P. K. Poudel, Binod Bhattarai

    Abstract: Realistic and interactive surgical simulation has the potential to facilitate crucial applications, such as medical professional training and autonomous surgical agent training. In the natural visual domain, world models have enabled action-controlled data generation, demonstrating the potential to train autonomous agents in interactive simulated environments when large-scale real data acquisition…

    Submitted 26 September, 2025; v1 submitted 3 March, 2025; originally announced March 2025.

    Comments: This paper has been accepted at the Data Engineering in Medical Imaging Workshop, MICCAI 2025

    Journal ref: MICCAI Workshop on Data Engineering in Medical Imaging (2025) 1-10

  6. arXiv:2412.14100  [pdf, other]

    eess.IV cs.CV cs.LG

    Parameter-efficient Fine-tuning for improved Convolutional Baseline for Brain Tumor Segmentation in Sub-Saharan Africa Adult Glioma Dataset

    Authors: Bijay Adhikari, Pratibha Kulung, Jakesh Bohaju, Laxmi Kanta Poudel, Confidence Raymond, Dong Zhang, Udunna C Anazodo, Bishesh Khanal, Mahesh Shakya

    Abstract: Automating brain tumor segmentation using deep learning methods is an ongoing challenge in medical imaging. Multiple lingering issues exist, including domain shift and applications in low-resource settings, which bring a unique set of challenges, including scarcity of data. As a step towards solving these specific problems, we propose Convolutional adapter-inspired Parameter-efficient Fine-tuning (P…

    Submitted 18 December, 2024; originally announced December 2024.

    Comments: Accepted to "The International Brain Tumor Segmentation (BraTS) challenge organized at MICCAI 2024 conference"

  7. arXiv:2403.11936  [pdf, other]

    eess.IV cs.CV

    AI-Assisted Cervical Cancer Screening

    Authors: Kanchan Poudel, Lisasha Poudel, Prabin Raj Shakya, Atit Poudel, Archana Shrestha, Bishesh Khanal

    Abstract: Visual Inspection with Acetic Acid (VIA) remains the most feasible cervical cancer screening test in resource-constrained settings of low- and middle-income countries (LMICs), where it is often performed in screening camps or primary/community health centers by nurses instead of the preferred but unavailable expert gynecologist. To address the highly subjective nature of the test, various handheld devi…

    Submitted 4 September, 2024; v1 submitted 18 March, 2024; originally announced March 2024.

  8. arXiv:2312.09056  [pdf, other]

    cs.LG cs.AI cs.CV cs.RO stat.ML

    ReCoRe: Regularized Contrastive Representation Learning of World Model

    Authors: Rudra P. K. Poudel, Harit Pandya, Stephan Liwicki, Roberto Cipolla

    Abstract: While recent model-free Reinforcement Learning (RL) methods have demonstrated human-level effectiveness in gaming environments, their success in everyday tasks like visual navigation has been limited, particularly under significant appearance variations. This limitation arises from (i) poor sample efficiency and (ii) over-fitting to training scenarios. To address these challenges, we present a wor…

    Submitted 3 April, 2024; v1 submitted 14 December, 2023; originally announced December 2023.

    Comments: Accepted at CVPR 2024. arXiv admin note: text overlap with arXiv:2209.14932

  9. arXiv:2311.17593  [pdf, other]

    cs.LG cs.AI cs.CL cs.CV cs.RO

    LanGWM: Language Grounded World Model

    Authors: Rudra P. K. Poudel, Harit Pandya, Chao Zhang, Roberto Cipolla

    Abstract: Recent advances in deep reinforcement learning have showcased its potential in tackling complex tasks. However, experiments on visual control tasks have revealed that state-of-the-art reinforcement learning models struggle with out-of-distribution generalization. Conversely, expressing higher-level concepts and global contexts is relatively easy using language. Building upon recent success of th…

    Submitted 29 November, 2023; originally announced November 2023.

  10. arXiv:2309.12829  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    Synthetic Boost: Leveraging Synthetic Data for Enhanced Vision-Language Segmentation in Echocardiography

    Authors: Rabin Adhikari, Manish Dhakal, Safal Thapaliya, Kanchan Poudel, Prasiddha Bhandari, Bishesh Khanal

    Abstract: Accurate segmentation is essential for echocardiography-based assessment of cardiovascular diseases (CVDs). However, the variability among sonographers and the inherent challenges of ultrasound images hinder precise segmentation. By leveraging the joint representation of image and text modalities, Vision-Language Segmentation Models (VLSMs) can incorporate rich contextual information, potentially…

    Submitted 22 September, 2023; originally announced September 2023.

    Comments: Accepted at the 4th International Workshop of Advances in Simplifying Medical UltraSound (ASMUS)

  11. arXiv:2308.07706  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    Exploring Transfer Learning in Medical Image Segmentation using Vision-Language Models

    Authors: Kanchan Poudel, Manish Dhakal, Prasiddha Bhandari, Rabin Adhikari, Safal Thapaliya, Bishesh Khanal

    Abstract: Medical image segmentation allows quantifying target structure size and shape, aiding in disease diagnosis, prognosis, surgery planning, and comprehension. Building upon recent advancements in foundation Vision-Language Models (VLMs) from natural image-text pairs, several studies have proposed adapting them to Vision-Language Segmentation Models (VLSMs) that allow using language text as an addition…

    Submitted 20 June, 2024; v1 submitted 15 August, 2023; originally announced August 2023.

    Comments: Medical Imaging with Deep Learning (MIDL) 2024 (Oral)

  12. arXiv:2209.14932  [pdf, other]

    cs.LG cs.AI cs.CV cs.RO stat.ML

    Contrastive Unsupervised Learning of World Model with Invariant Causal Features

    Authors: Rudra P. K. Poudel, Harit Pandya, Roberto Cipolla

    Abstract: In this paper we present a world model, which learns causal features using the invariance principle. In particular, we use contrastive unsupervised learning to learn the invariant causal features, which enforces invariance across augmentations of irrelevant parts or styles of the observation. The world-model-based reinforcement learning methods independently optimize representation learning and th…

    Submitted 29 September, 2022; originally announced September 2022.

  13. arXiv:2201.12678  [pdf, ps, other]

    cs.LG cs.CV

    A Stochastic Bundle Method for Interpolating Networks

    Authors: Alasdair Paren, Leonard Berrada, Rudra P. K. Poudel, M. Pawan Kumar

    Abstract: We propose a novel method for training deep neural networks that are capable of interpolation, that is, driving the empirical loss to zero. At each iteration, our method constructs a stochastic approximation of the learning objective. The approximation, known as a bundle, is a pointwise maximum of linear functions. Our bundle contains a constant function that lower bounds the empirical loss. This…

    Submitted 29 January, 2022; originally announced January 2022.

  14. arXiv:2009.05429  [pdf, other]

    cs.RO cs.AI cs.CV

    Embodied Visual Navigation with Automatic Curriculum Learning in Real Environments

    Authors: Steven D. Morad, Roberto Mecca, Rudra P. K. Poudel, Stephan Liwicki, Roberto Cipolla

    Abstract: We present NavACL, a method of automatic curriculum learning tailored to the navigation task. NavACL is simple to train and efficiently selects relevant tasks using geometric features. In our experiments, deep reinforcement learning agents trained using NavACL significantly outperform state-of-the-art agents trained with uniform sampling -- the current standard. Furthermore, our agents can navigat…

    Submitted 6 January, 2021; v1 submitted 11 September, 2020; originally announced September 2020.

  15. arXiv:1902.04502  [pdf, other]

    cs.CV

    Fast-SCNN: Fast Semantic Segmentation Network

    Authors: Rudra P K Poudel, Stephan Liwicki, Roberto Cipolla

    Abstract: The encoder-decoder framework is state-of-the-art for offline semantic image segmentation. Since the rise in autonomous systems, real-time computation is increasingly desirable. In this paper, we introduce fast segmentation convolutional neural network (Fast-SCNN), an above real-time semantic segmentation model on high resolution image data (1024x2048px) suited to efficient computation on embedded…

    Submitted 12 February, 2019; originally announced February 2019.

  16. arXiv:1805.04554  [pdf, other]

    cs.CV

    ContextNet: Exploring Context and Detail for Semantic Segmentation in Real-time

    Authors: Rudra P K Poudel, Ujwal Bonde, Stephan Liwicki, Christopher Zach

    Abstract: Modern deep learning architectures produce highly accurate results on many challenging semantic segmentation datasets. State-of-the-art methods are, however, not directly transferable to real-time applications or embedded devices, since naive adaptation of such systems to reduce computational cost (speed, memory and energy) causes a significant drop in accuracy. We propose ContextNet, a new deep n…

    Submitted 5 November, 2018; v1 submitted 11 May, 2018; originally announced May 2018.

    Comments: Published as a conference paper at British Machine Vision Conference (BMVC), 2018

  17. arXiv:1608.03974  [pdf, other]

    stat.ML cs.CV cs.LG

    Recurrent Fully Convolutional Neural Networks for Multi-slice MRI Cardiac Segmentation

    Authors: Rudra P K Poudel, Pablo Lamata, Giovanni Montana

    Abstract: In cardiac magnetic resonance imaging, fully-automatic segmentation of the heart enables precise structural and functional measurements to be taken, e.g. from short-axis MR images of the left-ventricle. In this work we propose a recurrent fully-convolutional network (RFCN) that learns image representations from the full stack of 2D slices and has the ability to leverage inter-slice spatial depende…

    Submitted 13 August, 2016; originally announced August 2016.

    Comments: MICCAI Workshop RAMBO 2016