Showing 1–50 of 51 results for author: Khanal, B

Searching in archive cs.
  1. arXiv:2603.25886  [pdf, ps, other]

    cs.CV

    Automated Quality Assessment of Blind Sweep Obstetric Ultrasound for Improved Diagnosis

    Authors: Prasiddha Bhandari, Kanchan Poudel, Nishant Luitel, Bishram Acharya, Angelina Ghimire, Tyler Wellman, Kilian Koepsell, Pradeep Raj Regmi, Bishesh Khanal

    Abstract: Blind Sweep Obstetric Ultrasound (BSOU) enables scalable fetal imaging in low-resource settings by allowing minimally trained operators to acquire standardized sweep videos for automated Artificial Intelligence (AI) interpretation. However, the reliability of such AI systems depends critically on the quality of the acquired sweeps, and little is known about how deviations from the intended protocol…

    Submitted 26 March, 2026; originally announced March 2026.

  2. arXiv:2603.22291  [pdf, ps, other]

    cs.CL

    Evaluating Large Language Models' Responses to Sexual and Reproductive Health Queries in Nepali

    Authors: Medha Sharma, Supriya Khadka, Udit Chandra Aryal, Bishnu Hari Bhatta, Bijayan Bhattarai, Santosh Dahal, Kamal Gautam, Pushpa Joshi, Saugat Kafle, Shristi Khadka, Shushila Khadka, Binod Lamichhane, Shilpa Lamichhane, Anusha Parajuli, Sabina Pokharel, Suvekshya Sitaula, Neha Verma, Bishesh Khanal

    Abstract: As Large Language Models (LLMs) become integrated into daily life, they are increasingly used for personal queries, including Sexual and Reproductive Health (SRH), allowing users to chat anonymously without fear of judgment. However, current evaluation methods primarily focus on accuracy, often for objective queries in high-resource languages, and lack criteria to assess usability and safety, espe…

    Submitted 4 March, 2026; originally announced March 2026.

  3. arXiv:2601.06500  [pdf]

    cs.AI cs.CY

    The AI Pyramid A Conceptual Framework for Workforce Capability in the Age of AI

    Authors: Alok Khatri, Bishesh Khanal

    Abstract: Artificial intelligence (AI) represents a qualitative shift in technological change by extending cognitive labor itself rather than merely automating routine tasks. Recent evidence shows that generative AI disproportionately affects highly educated, white collar work, challenging existing assumptions about workforce vulnerability and rendering traditional approaches to digital or AI literacy insuf…

    Submitted 20 February, 2026; v1 submitted 10 January, 2026; originally announced January 2026.

    Comments: 14 pages

  4. arXiv:2512.08143  [pdf, ps, other]

    cs.LG

    PolyLingua: Margin-based Inter-class Transformer for Robust Cross-domain Language Detection

    Authors: Ali Lotfi Rezaabad, Bikram Khanal, Shashwat Chaurasia, Lu Zeng, Dezhi Hong, Hossein Bashashati, Thomas Butler, Megan Ganji

    Abstract: Language identification is a crucial first step in multilingual systems such as chatbots and virtual assistants, enabling linguistically and culturally accurate user experiences. Errors at this stage can cascade into downstream failures, setting a high bar for accuracy. Yet, existing language identification tools struggle with key cases -- such as music requests where the song title and user langu…

    Submitted 10 December, 2025; v1 submitted 8 December, 2025; originally announced December 2025.

  5. arXiv:2511.06169  [pdf, ps, other]

    cs.LG

    Local K-Similarity Constraint for Federated Learning with Label Noise

    Authors: Sanskar Amgain, Prashant Shrestha, Bidur Khanal, Alina Devkota, Yash Raj Shrestha, Seungryul Baek, Prashnna Gyawali, Binod Bhattarai

    Abstract: Federated learning on clients with noisy labels is a challenging problem, as such clients can infiltrate the global model, impacting the overall generalizability of the system. Existing methods proposed to handle noisy clients assume that a sufficient number of clients with clean labels are available, which can be leveraged to learn a robust global model while dampening the impact of noisy clients…

    Submitted 8 November, 2025; originally announced November 2025.

  6. arXiv:2509.15558  [pdf, ps, other]

    cs.CV cs.HC

    From Development to Deployment of AI-assisted Telehealth and Screening for Vision- and Hearing-threatening diseases in resource-constrained settings: Field Observations, Challenges and Way Forward

    Authors: Mahesh Shakya, Bijay Adhikari, Nirsara Shrestha, Bipin Koirala, Arun Adhikari, Prasanta Poudyal, Luna Mathema, Sarbagya Buddhacharya, Bijay Khatri, Bishesh Khanal

    Abstract: Vision- and hearing-threatening diseases cause preventable disability, especially in resource-constrained settings (RCS) with few specialists and limited screening setup. Large-scale AI-assisted screening and telehealth have the potential to expand early detection, but practical deployment is challenging in paper-based workflows, and limited documented field experience exists to build upon. We provide ins…

    Submitted 18 September, 2025; originally announced September 2025.

    Comments: Accepted to MIRASOL (Medical Image Computing in Resource Constrained Settings Workshop & KI) Workshop, 2025

  7. arXiv:2505.07001  [pdf, ps, other]

    cs.CV cs.LG

    Hallucination-Aware Multimodal Benchmark for Gastrointestinal Image Analysis with Large Vision-Language Models

    Authors: Bidur Khanal, Sandesh Pokhrel, Sanjay Bhandari, Ramesh Rana, Nikesh Shrestha, Ram Bahadur Gurung, Cristian Linte, Angus Watson, Yash Raj Shrestha, Binod Bhattarai

    Abstract: Vision-Language Models (VLMs) are becoming increasingly popular in the medical domain, bridging the gap between medical images and clinical language. Existing VLMs demonstrate an impressive ability to comprehend medical images and text queries to generate detailed, descriptive diagnostic medical reports. However, hallucination--the tendency to generate descriptions that are inconsistent with the v…

    Submitted 22 June, 2025; v1 submitted 11 May, 2025; originally announced May 2025.

    Comments: Accepted at MICCAI 2025

  8. arXiv:2503.13470  [pdf, ps, other]

    eess.SP cs.CV cs.LG

    Multimodal Latent Fusion of ECG Leads for Early Assessment of Pulmonary Hypertension

    Authors: Mohammod N. I. Suvon, Shuo Zhou, Prasun C. Tripathi, Wenrui Fan, Samer Alabed, Bishesh Khanal, Venet Osmani, Andrew J. Swift, Chen Chen, Haiping Lu

    Abstract: Recent advancements in early assessment of pulmonary hypertension (PH) primarily focus on applying machine learning methods to centralized diagnostic modalities, such as 12-lead electrocardiogram (12L-ECG). Despite their potential, these approaches fall short in decentralized clinical settings, e.g., point-of-care and general practice, where handheld 6-lead ECG (6L-ECG) can offer an alternative bu…

    Submitted 8 September, 2025; v1 submitted 3 March, 2025; originally announced March 2025.

  9. arXiv:2412.14100  [pdf, other]

    eess.IV cs.CV cs.LG

    Parameter-efficient Fine-tuning for improved Convolutional Baseline for Brain Tumor Segmentation in Sub-Saharan Africa Adult Glioma Dataset

    Authors: Bijay Adhikari, Pratibha Kulung, Jakesh Bohaju, Laxmi Kanta Poudel, Confidence Raymond, Dong Zhang, Udunna C Anazodo, Bishesh Khanal, Mahesh Shakya

    Abstract: Automating brain tumor segmentation using deep learning methods is an ongoing challenge in medical imaging. Multiple lingering issues exist, including domain shift and applications in low-resource settings, which bring a unique set of challenges including scarcity of data. As a step towards solving these specific problems, we propose Convolutional adapter-inspired Parameter-efficient Fine-tuning (P…

    Submitted 18 December, 2024; originally announced December 2024.

    Comments: Accepted to "The International Brain Tumor Segmentation (BraTS) challenge organized at MICCAI 2024 conference"

  10. Data-Dependent Generalization Bounds for Parameterized Quantum Models Under Noise

    Authors: Bikram Khanal, Pablo Rivas

    Abstract: Quantum machine learning offers a transformative approach to solving complex problems, but the inherent noise hinders its practical implementation in near-term quantum devices. This obstacle makes it difficult to understand the generalizability of quantum circuit models. Designing robust quantum machine learning models under noise requires a principled understanding of complexity and generalizatio…

    Submitted 3 February, 2025; v1 submitted 16 December, 2024; originally announced December 2024.

    Comments: The Journal of Supercomputing

  11. arXiv:2412.08163  [pdf, other]

    cs.CL

    NLPineers@ NLU of Devanagari Script Languages 2025: Hate Speech Detection using Ensembling of BERT-based models

    Authors: Anmol Guragain, Nadika Poudel, Rajesh Piryani, Bishesh Khanal

    Abstract: This paper explores hate speech detection in Devanagari-scripted languages, focusing on Hindi and Nepali, for Subtask B of the CHIPSAL@COLING 2025 Shared Task. Using a range of transformer-based models such as XLM-RoBERTa, MURIL, and IndicBERT, we examine their effectiveness in navigating the nuanced boundary between hate speech and free expression. Our best performing model, implemented as ensemb…

    Submitted 12 December, 2024; v1 submitted 11 December, 2024; originally announced December 2024.

  12. arXiv:2412.05996  [pdf]

    cs.CV

    Paddy Disease Detection and Classification Using Computer Vision Techniques: A Mobile Application to Detect Paddy Disease

    Authors: Bimarsha Khanal, Paras Poudel, Anish Chapagai, Bijan Regmi, Sitaram Pokhrel, Salik Ram Khanal

    Abstract: Plant diseases significantly impact our food supply, causing problems for farmers, economies reliant on agriculture, and global food security. Accurate and timely plant disease diagnosis is crucial for effective treatment and minimizing yield losses. Despite advancements in agricultural technology, a precise and early diagnosis remains a challenge, especially in underdeveloped regions where agricu…

    Submitted 8 December, 2024; originally announced December 2024.

    Comments: 21 pages, 12 figures and 2 tables

  13. arXiv:2411.09598  [pdf, other]

    eess.IV cs.CV

    Assessing the Performance of the DINOv2 Self-supervised Learning Vision Transformer Model for the Segmentation of the Left Atrium from MRI Images

    Authors: Bipasha Kundu, Bidur Khanal, Richard Simon, Cristian A. Linte

    Abstract: Accurate left atrium (LA) segmentation from pre-operative scans is crucial for diagnosing atrial fibrillation, treatment planning, and supporting surgical interventions. While deep learning models are key in medical image segmentation, they often require extensive manually annotated data. Foundation models trained on larger datasets have reduced this dependency, enhancing generalizability and robu…

    Submitted 14 November, 2024; originally announced November 2024.

    Comments: 6 pages, 3 figures, SPIE Medical Imaging, 2025

  14. arXiv:2410.08005  [pdf, other]

    cs.DC

    NLP-Guided Synthesis: Transitioning from Sequential Programs to Distributed Programs

    Authors: Arun Sanjel, Bikram Khanal, Greg Speegle, Pablo Rivas

    Abstract: As the need for large-scale data processing grows, distributed programming frameworks like PySpark have become increasingly popular. However, the task of converting traditional, sequential code to distributed code remains a significant hurdle, often requiring specialized knowledge and substantial time investment. While existing tools have made strides in automating this conversion, they often fall…

    Submitted 10 October, 2024; originally announced October 2024.

  15. arXiv:2410.05239  [pdf, other]

    cs.CV cs.CL

    TuneVLSeg: Prompt Tuning Benchmark for Vision-Language Segmentation Models

    Authors: Rabin Adhikari, Safal Thapaliya, Manish Dhakal, Bishesh Khanal

    Abstract: Vision-Language Models (VLMs) have shown impressive performance in vision tasks, but adapting them to new domains often requires expensive fine-tuning. Prompt tuning techniques, including textual, visual, and multimodal prompting, offer efficient alternatives by leveraging learnable prompts. However, their application to Vision-Language Segmentation Models (VLSMs) and evaluation under significant…

    Submitted 8 October, 2024; v1 submitted 7 October, 2024; originally announced October 2024.

    Comments: Accepted at ACCV 2024 (oral presentation)

  16. arXiv:2409.11233  [pdf, other]

    cs.CL

    Evaluating the Impact of Compression Techniques on Task-Specific Performance of Large Language Models

    Authors: Bishwash Khanal, Jeffery M. Capone

    Abstract: Large language models (LLMs) offer powerful capabilities but incur substantial computational costs, driving the need for efficient compression techniques. This study evaluates the impact of popular compression methods - Magnitude Pruning, SparseGPT, and Wanda - on the LLaMA-2-7B model, focusing on the trade-offs between model size reduction, downstream task performance, and the role of calibration…

    Submitted 17 September, 2024; originally announced September 2024.

  17. arXiv:2409.07632  [pdf, other]

    quant-ph cs.CC cs.LG

    Learning Robust Observable to Address Noise in Quantum Machine Learning

    Authors: Bikram Khanal, Pablo Rivas

    Abstract: Quantum Machine Learning (QML) has emerged as a promising field that combines the power of quantum computing with the principles of machine learning. One of the significant challenges in QML is dealing with noise in quantum systems, especially in the Noisy Intermediate-Scale Quantum (NISQ) era. Noise in quantum systems can introduce errors in quantum computations and degrade the performance of qua…

    Submitted 11 September, 2024; originally announced September 2024.

  18. arXiv:2409.07626  [pdf, other]

    quant-ph cs.CC cs.LG

    Generalization Error Bound for Quantum Machine Learning in NISQ Era -- A Survey

    Authors: Bikram Khanal, Pablo Rivas, Arun Sanjel, Korn Sooksatra, Ernesto Quevedo, Alejandro Rodriguez

    Abstract: Despite the mounting anticipation for the quantum revolution, the success of Quantum Machine Learning (QML) in the Noisy Intermediate-Scale Quantum (NISQ) era hinges on a largely unexplored factor: the generalization error bound, a cornerstone of robust and reliable machine learning models. Current QML research, while exploring novel algorithms and applications extensively, is predominantly situat…

    Submitted 3 February, 2025; v1 submitted 11 September, 2024; originally announced September 2024.

    Comments: Quantum Machine Intelligence

    Journal ref: Quantum Mach. Intell. 6, 90 (2024)

  19. arXiv:2408.06814  [pdf, other]

    cs.CV cs.CG

    Structure-preserving Planar Simplification for Indoor Environments

    Authors: Bishwash Khanal, Sanjay Rijal, Manish Awale, Vaghawan Ojha

    Abstract: This paper presents a novel approach for structure-preserving planar simplification of indoor scene point clouds for both simulated and real-world environments. Initially, the scene point cloud undergoes preprocessing steps, including noise reduction and Manhattan world alignment, to ensure robustness and coherence in subsequent analyses. We segment each captured scene into structured (walls-ceili…

    Submitted 21 August, 2024; v1 submitted 13 August, 2024; originally announced August 2024.

  20. arXiv:2407.05973  [pdf, other]

    cs.LG cs.CV

    Active Label Refinement for Robust Training of Imbalanced Medical Image Classification Tasks in the Presence of High Label Noise

    Authors: Bidur Khanal, Tianhong Dai, Binod Bhattarai, Cristian Linte

    Abstract: The robustness of supervised deep learning-based medical image classification is significantly undermined by label noise. Although several methods have been proposed to enhance classification performance in the presence of noisy labels, they face some challenges: 1) a struggle with class-imbalanced datasets, leading to the frequent overlooking of minority classes as noisy samples; 2) a singular fo…

    Submitted 24 October, 2024; v1 submitted 8 July, 2024; originally announced July 2024.

    Comments: Accepted at MICCAI 2024

  21. arXiv:2406.11877  [pdf]

    physics.ao-ph cs.LG

    Solar Power Prediction Using Satellite Data in Different Parts of Nepal

    Authors: Raj Krishna Nepal, Bibek Khanal, Vibek Ghimire, Kismat Neupane, Atul Pokharel, Kshitij Niraula, Baburam Tiwari, Nawaraj Bhattarai, Khem N. Poudyal, Nawaraj Karki, Mohan B Dangi, John Biden

    Abstract: Due to the unavailability of solar irradiance data for many potential sites of Nepal, the paper proposes predicting solar irradiance based on alternative meteorological parameters. The study focuses on five distinct regions in Nepal and utilizes a dataset spanning almost ten years, obtained from CERES SYN1deg and MERRA-2. Machine learning models such as Random Forest, XGBoost, K-Nearest Neighbors,…

    Submitted 8 June, 2024; originally announced June 2024.

    Comments: 20 pages, 12 figures, 5 tables

  22. arXiv:2405.06196  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    VLSM-Adapter: Finetuning Vision-Language Segmentation Efficiently with Lightweight Blocks

    Authors: Manish Dhakal, Rabin Adhikari, Safal Thapaliya, Bishesh Khanal

    Abstract: Foundation Vision-Language Models (VLMs) trained using large-scale open-domain images and text pairs have recently been adapted to develop Vision-Language Segmentation Models (VLSMs) that allow providing text prompts during inference to guide image segmentation. If robust and powerful VLSMs can be built for medical images, it could aid medical professionals in many clinical tasks where they must s…

    Submitted 27 June, 2024; v1 submitted 9 May, 2024; originally announced May 2024.

    Comments: Accepted at MICCAI 2024, the 27th International Conference on Medical Image Computing and Computer Assisted Intervention

  23. arXiv:2405.03789  [pdf, other]

    cs.LG cs.AI cs.CL cs.CR

    On Adversarial Examples for Text Classification by Perturbing Latent Representations

    Authors: Korn Sooksatra, Bikram Khanal, Pablo Rivas

    Abstract: Recently, with the advancement of deep learning, several applications in text classification have advanced significantly. However, this improvement comes with a cost because deep learning is vulnerable to adversarial examples. This weakness indicates that deep learning is not very robust. Fortunately, the input of a text classifier is discrete. Hence, it can prevent the classifier from state-of-th…

    Submitted 6 May, 2024; originally announced May 2024.

    Comments: 7 pages

    MSC Class: 68T01; 68T50 ACM Class: I.2.7

  24. arXiv:2404.07330  [pdf, other]

    quant-ph cs.LG

    A Modified Depolarization Approach for Efficient Quantum Machine Learning

    Authors: Bikram Khanal, Pablo Rivas

    Abstract: Quantum Computing in the Noisy Intermediate-Scale Quantum (NISQ) era has shown promising applications in machine learning, optimization, and cryptography. Despite the progress, challenges persist due to system noise, errors, and decoherence that complicate the simulation of quantum systems. The depolarization channel is a standard tool for simulating a quantum system's noise. However, modeling suc…

    Submitted 10 April, 2024; originally announced April 2024.

  25. arXiv:2403.11936  [pdf, other]

    eess.IV cs.CV

    AI-Assisted Cervical Cancer Screening

    Authors: Kanchan Poudel, Lisasha Poudel, Prabin Raj Shakya, Atit Poudel, Archana Shrestha, Bishesh Khanal

    Abstract: Visual Inspection with Acetic Acid (VIA) remains the most feasible cervical cancer screening test in resource-constrained settings of low- and middle-income countries (LMICs), where it is often performed at screening camps or primary/community health centers by nurses instead of the preferred but unavailable expert gynecologist. To address the highly subjective nature of the test, various handheld devi…

    Submitted 4 September, 2024; v1 submitted 18 March, 2024; originally announced March 2024.

  26. arXiv:2402.16734  [pdf, other]

    eess.IV cs.CV cs.LG

    Investigating the Robustness of Vision Transformers against Label Noise in Medical Image Classification

    Authors: Bidur Khanal, Prashant Shrestha, Sanskar Amgain, Bishesh Khanal, Binod Bhattarai, Cristian A. Linte

    Abstract: Label noise in medical image classification datasets significantly hampers the training of supervised deep learning methods, undermining their generalizability. The test performance of a model tends to decrease as the label noise rate increases. Over recent years, several methods have been proposed to mitigate the impact of label noise in medical image classification and enhance the robustness of…

    Submitted 26 February, 2024; originally announced February 2024.

  27. arXiv:2401.07990  [pdf, other]

    eess.IV cs.CV cs.LG

    How does self-supervised pretraining improve robustness against noisy labels across various medical image classification datasets?

    Authors: Bidur Khanal, Binod Bhattarai, Bishesh Khanal, Cristian Linte

    Abstract: Noisy labels can significantly impact medical image classification, particularly in deep learning, by corrupting learned features. Self-supervised pretraining, which doesn't rely on labeled data, can enhance robustness against noisy labels. However, this robustness varies based on factors like the number of classes, dataset complexity, and training size. In medical images, subtle inter-class diffe…

    Submitted 15 January, 2024; originally announced January 2024.

  28. arXiv:2312.06224  [pdf, other]

    cs.CV cs.CL

    Medical Vision Language Pretraining: A survey

    Authors: Prashant Shrestha, Sanskar Amgain, Bidur Khanal, Cristian A. Linte, Binod Bhattarai

    Abstract: Medical Vision Language Pretraining (VLP) has recently emerged as a promising solution to the scarcity of labeled data in the medical domain. By leveraging paired/unpaired vision and text datasets through self-supervised learning, models can be trained to acquire vast knowledge and learn robust feature representations. Such pretrained models have the potential to enhance multiple downstream medica…

    Submitted 11 December, 2023; originally announced December 2023.

  29. arXiv:2309.13587  [pdf, other]

    eess.IV cs.CV cs.LG

    Benchmarking Encoder-Decoder Architectures for Biplanar X-ray to 3D Shape Reconstruction

    Authors: Mahesh Shakya, Bishesh Khanal

    Abstract: Various deep learning models have been proposed for 3D bone shape reconstruction from two orthogonal (biplanar) X-ray images. However, it is unclear how these models compare against each other since they are evaluated on different anatomy, cohort and (often privately held) datasets. Moreover, the impact of the commonly optimized image-based segmentation metrics such as dice score on the estimation…

    Submitted 26 September, 2023; v1 submitted 24 September, 2023; originally announced September 2023.

    Comments: accepted to NeurIPS 2023

  30. arXiv:2309.12829  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    Synthetic Boost: Leveraging Synthetic Data for Enhanced Vision-Language Segmentation in Echocardiography

    Authors: Rabin Adhikari, Manish Dhakal, Safal Thapaliya, Kanchan Poudel, Prasiddha Bhandari, Bishesh Khanal

    Abstract: Accurate segmentation is essential for echocardiography-based assessment of cardiovascular diseases (CVDs). However, the variability among sonographers and the inherent challenges of ultrasound images hinder precise segmentation. By leveraging the joint representation of image and text modalities, Vision-Language Segmentation Models (VLSMs) can incorporate rich contextual information, potentially…

    Submitted 22 September, 2023; originally announced September 2023.

    Comments: Accepted at the 4th International Workshop of Advances in Simplifying Medical UltraSound (ASMUS)

  31. arXiv:2309.12325  [pdf]

    cs.CY cs.AI cs.CV cs.LG

    FUTURE-AI: International consensus guideline for trustworthy and deployable artificial intelligence in healthcare

    Authors: Karim Lekadir, Aasa Feragen, Abdul Joseph Fofanah, Alejandro F Frangi, Alena Buyx, Anais Emelie, Andrea Lara, Antonio R Porras, An-Wen Chan, Arcadi Navarro, Ben Glocker, Benard O Botwe, Bishesh Khanal, Brigit Beger, Carol C Wu, Celia Cintas, Curtis P Langlotz, Daniel Rueckert, Deogratias Mzurikwao, Dimitrios I Fotiadis, Doszhan Zhussupov, Enzo Ferrante, Erik Meijering, Eva Weicken, Fabio A González, et al. (95 additional authors not shown)

    Abstract: Despite major advances in artificial intelligence (AI) for medicine and healthcare, the deployment and adoption of AI technologies remain limited in real-world clinical practice. In recent years, concerns have been raised about the technical, clinical, ethical and legal risks associated with medical AI. To increase real world adoption, it is essential that medical AI tools are trusted and accepted…

    Submitted 8 July, 2024; v1 submitted 11 August, 2023; originally announced September 2023.

    ACM Class: I.2.0; I.4.0; I.5.0

  32. arXiv:2308.07706  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    Exploring Transfer Learning in Medical Image Segmentation using Vision-Language Models

    Authors: Kanchan Poudel, Manish Dhakal, Prasiddha Bhandari, Rabin Adhikari, Safal Thapaliya, Bishesh Khanal

    Abstract: Medical image segmentation allows quantifying target structure size and shape, aiding in disease diagnosis, prognosis, surgery planning, and comprehension. Building upon recent advancements in foundation Vision-Language Models (VLMs) from natural image-text pairs, several studies have proposed adapting them to Vision-Language Segmentation Models (VLSMs) that allow using language text as an addition…

    Submitted 20 June, 2024; v1 submitted 15 August, 2023; originally announced August 2023.

    Comments: Medical Imaging with Deep Learning (MIDL) 2024 (Oral)

  33. arXiv:2308.04551  [pdf, other]

    eess.IV cs.CV cs.LG

    Improving Medical Image Classification in Noisy Labels Using Only Self-supervised Pretraining

    Authors: Bidur Khanal, Binod Bhattarai, Bishesh Khanal, Cristian A. Linte

    Abstract: Noisy labels hurt deep learning-based supervised image classification performance as the models may overfit the noise and learn corrupted feature extractors. For natural image classification training with noisy labeled data, model initialization with contrastive self-supervised pretrained weights has been shown to reduce feature corruption and improve classification performance. However, no works have…

    Submitted 8 August, 2023; originally announced August 2023.

    Comments: Accepted at MICCAI 2023 DEMI Workshop

  34. arXiv:2306.12376  [pdf, other]

    eess.IV cs.CV

    M-VAAL: Multimodal Variational Adversarial Active Learning for Downstream Medical Image Analysis Tasks

    Authors: Bidur Khanal, Binod Bhattarai, Bishesh Khanal, Danail Stoyanov, Cristian A. Linte

    Abstract: Acquiring properly annotated data is expensive in the medical field as it requires experts, time-consuming protocols, and rigorous validation. Active learning attempts to minimize the need for large annotated samples by actively sampling the most informative examples for annotation. These examples contribute significantly to improving the performance of supervised machine learning models, and thus…

    Submitted 21 June, 2023; originally announced June 2023.

  35. arXiv:2304.05339  [pdf, other]

    eess.IV cs.CV cs.LG

    Deep-learning Assisted Detection and Quantification of (oo)cysts of Giardia and Cryptosporidium on Smartphone Microscopy Images

    Authors: Suprim Nakarmi, Sanam Pudasaini, Safal Thapaliya, Pratima Upretee, Retina Shrestha, Basant Giri, Bhanu Bhakta Neupane, Bishesh Khanal

    Abstract: The consumption of microbial-contaminated food and water is responsible for the deaths of millions of people annually. Smartphone-based microscopy systems are portable, low-cost, and more accessible alternatives for the detection of Giardia and Cryptosporidium than traditional brightfield microscopes. However, the images from smartphone microscopes are noisier and require manual cyst identificatio…

    Submitted 6 August, 2024; v1 submitted 11 April, 2023; originally announced April 2023.

    Comments: 21 pages (including supplementary information), 5 figures, 7 tables, Accepted for publication at the Journal of Machine Learning for Biomedical Imaging (MELBA) https://melba-journal.org/2024:014

    Journal ref: Machine.Learning.for.Biomedical.Imaging. 2 (2024)

  36. arXiv:2210.05425  [pdf]

    cs.CL cs.AI cs.CY cs.LG

    COVID-19-related Nepali Tweets Classification in a Low Resource Setting

    Authors: Rabin Adhikari, Safal Thapaliya, Nirajan Basnet, Samip Poudel, Aman Shakya, Bishesh Khanal

    Abstract: Billions of people across the globe have been using social media platforms in their local languages to voice their opinions about the various topics related to the COVID-19 pandemic. Several organizations, including the World Health Organization, have developed automated social media analysis tools that classify COVID-19-related tweets into various topics. However, these tools that help combat the…

    Submitted 11 October, 2022; originally announced October 2022.

    Comments: Accepted at the 7th Social Media Mining for Health (#SMM4H) Workshop, co-located at Coling 2022

  37. arXiv:2208.00400  [pdf, other]

    cs.CV

    FixMatchSeg: Fixing FixMatch for Semi-Supervised Semantic Segmentation

    Authors: Pratima Upretee, Bishesh Khanal

    Abstract: Supervised deep learning methods for semantic medical image segmentation have become increasingly popular in the past few years. However, in resource-constrained settings, getting a large number of annotated images is very difficult as it mostly requires experts, is expensive and time-consuming. Semi-supervised segmentation can be an attractive solution where a very few labeled images are used along w…

    Submitted 2 August, 2022; v1 submitted 31 July, 2022; originally announced August 2022.

    Comments: 2 figures, 4 tables, 9 pages + 2 pages references

  38. arXiv:2106.15475  [pdf, other]

    cs.CV

    How Does Heterogeneous Label Noise Impact Generalization in Neural Nets?

    Authors: Bidur Khanal, Christopher Kanan

    Abstract: Incorrectly labeled examples, or label noise, are common in real-world computer vision datasets. While the impact of label noise on learning in deep neural networks has been studied in prior work, these studies have exclusively focused on homogeneous label noise, i.e., the degree of label noise is the same across all categories. However, in the real world, label noise is often heterogeneous, with s…

    Submitted 26 September, 2021; v1 submitted 29 June, 2021; originally announced June 2021.

  39. arXiv:2105.05501  [pdf, other]

    cs.CV

    Label Geometry Aware Discriminator for Conditional Generative Networks

    Authors: Suman Sapkota, Bidur Khanal, Binod Bhattarai, Bishesh Khanal, Tae-Kyun Kim

    Abstract: Multi-domain image-to-image translation with conditional Generative Adversarial Networks (GANs) can generate highly photorealistic images with desired target classes, yet these synthetic images have not always been helpful to improve downstream supervised tasks such as image classification. Improving downstream tasks with synthetic examples requires generating images with high fidelity to the unk…

    Submitted 12 May, 2021; originally announced May 2021.

  40. arXiv:2005.09349  [pdf, other]

    cs.CV

    Uncertainty Estimation in Deep 2D Echocardiography Segmentation

    Authors: Lavsen Dahal, Aayush Kafle, Bishesh Khanal

    Abstract: 2D echocardiography is the most common imaging modality for cardiovascular diseases. The portability and relatively low-cost nature of Ultrasound (US) enable the US devices needed for performing echocardiography to be made widely available. However, acquiring and interpreting cardiac US images is operator dependent, limiting its use to only places where experts are present. Recently, Deep Learning…

    Submitted 19 May, 2020; originally announced May 2020.

  41. arXiv:1910.14202  [pdf, other]

    cs.CV

    Automatic Cobb Angle Detection using Vertebra Detector and Vertebra Corners Regression

    Authors: Bidur Khanal, Lavsen Dahal, Prashant Adhikari, Bishesh Khanal

    Abstract: Correct evaluation and treatment of scoliosis require accurate estimation of spinal curvature. The current gold standard is to manually estimate Cobb angles in spinal X-ray images, which is time-consuming and has high inter-rater variability. We propose an automatic method with a novel framework that first detects vertebrae as objects followed by a landmark detector that estimates the 4 landmark corner…

    Submitted 30 October, 2019; originally announced October 2019.

    Comments: Accepted to MICCAI 2019 CSI Workshop & Challenge: Computational Methods and Clinical Applications for Spine Imaging

  42. arXiv:1908.02582  [pdf, other]

    eess.IV cs.CV cs.LG

    Confident Head Circumference Measurement from Ultrasound with Real-time Feedback for Sonographers

    Authors: Samuel Budd, Matthew Sinclair, Bishesh Khanal, Jacqueline Matthew, David Lloyd, Alberto Gomez, Nicolas Toussaint, Emma Robinson, Bernhard Kainz

    Abstract: Manual estimation of fetal Head Circumference (HC) from Ultrasound (US) is a key biometric for monitoring the healthy development of fetuses. Unfortunately, such measurements are subject to large inter-observer variability, resulting in low early-detection rates of fetal abnormalities. To address this issue, we propose a novel probabilistic Deep Learning approach for real-time automated estimation…

    Submitted 7 August, 2019; originally announced August 2019.

    Comments: Accepted at MICCAI 2019; Demo video available on Twitter (@sambuddinc)

  43. Controlling Meshes via Curvature: Spin Transformations for Pose-Invariant Shape Processing

    Authors: Loic Le Folgoc, Daniel C. Castro, Jeremy Tan, Bishesh Khanal, Konstantinos Kamnitsas, Ian Walker, Amir Alansary, Ben Glocker

    Abstract: We investigate discrete spin transformations, a geometric framework to manipulate surface meshes by controlling mean curvature. Applications include surface fairing -- flowing a mesh onto say, a reference sphere -- and mesh extrusion -- e.g., rebuilding a complex shape from a reference sphere and curvature specification. Because they operate in curvature space, these operations can be conducted ve…

    Submitted 6 March, 2019; originally announced March 2019.

    Comments: Accepted for publication at the 26th international conference on Information Processing in Medical Imaging (IPMI 2019)

    Journal ref: IPMI 2019. LNCS, vol 11492, pp 221-234. Springer, Cham

  44. arXiv:1903.01905   

    cs.CV

    FastReg: Fast Non-Rigid Registration via Accelerated Optimisation on the Manifold of Diffeomorphisms

    Authors: Daniel Grzech, Loïc le Folgoc, Mattias P. Heinrich, Bishesh Khanal, Jakub Moll, Julia A. Schnabel, Ben Glocker, Bernhard Kainz

    Abstract: We present an implementation of a new approach to diffeomorphic non-rigid registration of medical images. The method is based on optical flow and warps images via gradient flow with the standard $L^2$ inner product. To compute the transformation, we rely on accelerated optimisation on the manifold of diffeomorphisms. We achieve regularity properties of Sobolev gradient flows, which are expensive t…

    Submitted 24 April, 2019; v1 submitted 5 March, 2019; originally announced March 2019.

    Comments: There is an ongoing dispute about the presentation of this paper. It will be withdrawn until the dispute is resolved

  45. arXiv:1808.00793  [pdf, other]

    cs.CV

    Weakly Supervised Localisation for Fetal Ultrasound Images

    Authors: Nicolas Toussaint, Bishesh Khanal, Matthew Sinclair, Alberto Gomez, Emily Skelton, Jacqueline Matthew, Julia A. Schnabel

    Abstract: This paper addresses the task of detecting and localising fetal anatomical regions in 2D ultrasound images, where only image-level labels are present at training, i.e. without any localisation or segmentation information. We examine the use of convolutional neural network architectures coupled with soft proposal layers. The resulting network simultaneously performs anatomical region detection (cla…

    Submitted 2 August, 2018; originally announced August 2018.

    Comments: 4th Workshop on Deep Learning for Medical Image Analysis, MICCAI 2018, Granada, Spain

  46. arXiv:1807.10583  [pdf, other]

    cs.CV cs.LG stat.ML

    EchoFusion: Tracking and Reconstruction of Objects in 4D Freehand Ultrasound Imaging without External Trackers

    Authors: Bishesh Khanal, Alberto Gomez, Nicolas Toussaint, Steven McDonagh, Veronika Zimmer, Emily Skelton, Jacqueline Matthew, Daniel Grzech, Robert Wright, Chandni Gupta, Benjamin Hou, Daniel Rueckert, Julia A. Schnabel, Bernhard Kainz

    Abstract: Ultrasound (US) is the most widely used fetal imaging technique. However, US images have limited capture range, and suffer from view dependent artefacts such as acoustic shadows. Compounding of overlapping 3D US acquisitions into a high-resolution volume can extend the field of view and remove image artefacts, which is useful for retrospective analysis including population based studies. However,…

    Submitted 19 July, 2018; originally announced July 2018.

    Comments: MICCAI Workshop on Perinatal, Preterm and Paediatric Image analysis (PIPPI), 2018

  47. Standard Plane Detection in 3D Fetal Ultrasound Using an Iterative Transformation Network

    Authors: Yuanwei Li, Bishesh Khanal, Benjamin Hou, Amir Alansary, Juan J. Cerrolaza, Matthew Sinclair, Jacqueline Matthew, Chandni Gupta, Caroline Knight, Bernhard Kainz, Daniel Rueckert

    Abstract: Standard scan plane detection in fetal brain ultrasound (US) forms a crucial step in the assessment of fetal development. In clinical settings, this is done by manually manoeuvring a 2D probe to the desired scan plane. With the advent of 3D US, the entire fetal brain volume containing these standard planes can be easily acquired. However, manual standard plane identification in 3D volume is labour…

    Submitted 6 October, 2018; v1 submitted 19 June, 2018; originally announced June 2018.

    Comments: 8 pages, 2 figures, accepted for MICCAI 2018; Added link to source code

    Journal ref: LNCS 11070 (2018) 392-400

  48. Fast Multiple Landmark Localisation Using a Patch-based Iterative Network

    Authors: Yuanwei Li, Amir Alansary, Juan J. Cerrolaza, Bishesh Khanal, Matthew Sinclair, Jacqueline Matthew, Chandni Gupta, Caroline Knight, Bernhard Kainz, Daniel Rueckert

    Abstract: We propose a new Patch-based Iterative Network (PIN) for fast and accurate landmark localisation in 3D medical volumes. PIN utilises a Convolutional Neural Network (CNN) to learn the spatial relationship between an image patch and anatomical landmark positions. During inference, patches are repeatedly passed to the CNN until the estimated landmark position converges to the true landmark location.…

    Submitted 6 October, 2018; v1 submitted 18 June, 2018; originally announced June 2018.

    Comments: 8 pages, 4 figures, Accepted for MICCAI 2018

    Journal ref: LNCS 11070 (2018) 563-571

  49. arXiv:1806.00411  [pdf, other]

    cs.CV

    Adapted and Oversegmenting Graphs: Application to Geometric Deep Learning

    Authors: Alberto Gomez, Veronika A. Zimmer, Bishesh Khanal, Nicolas Toussaint, Julia A. Schnabel

    Abstract: We propose a novel iterative method to adapt a graph to d-dimensional image data. The method drives the nodes of the graph towards image features. The adaptation process naturally lends itself to a measure of feature saliency which can then be used to retain meaningful nodes and edges in the graph. From the adapted graph, we also propose the computation of a dual graph, which inherits the salien…

    Submitted 5 September, 2019; v1 submitted 1 June, 2018; originally announced June 2018.

    Comments: Submitted to CVIU

  50. arXiv:1805.01026  [pdf, other]

    cs.CV

    Computing CNN Loss and Gradients for Pose Estimation with Riemannian Geometry

    Authors: Benjamin Hou, Nina Miolane, Bishesh Khanal, Matthew C. H. Lee, Amir Alansary, Steven McDonagh, Jo V. Hajnal, Daniel Rueckert, Ben Glocker, Bernhard Kainz

    Abstract: Pose estimation, i.e. predicting a 3D rigid transformation with respect to a fixed co-ordinate frame in SE(3), is an omnipresent problem in medical image analysis with applications such as image rigid registration, anatomical standard plane detection, tracking and device/camera pose estimation. Deep learning methods often parameterise a pose with a representation that separates rotation and tran…

    Submitted 17 July, 2018; v1 submitted 2 May, 2018; originally announced May 2018.