Showing 1–50 of 73 results for author: Lee, C H

  1. arXiv:2601.18564  [pdf]

    cs.LG cs.CV eess.SP

    An Unsupervised Tensor-Based Domain Alignment

    Authors: Chong Hyun Lee, Kibae Lee, Hyun Hee Yim

    Abstract: We propose a tensor-based domain alignment (DA) algorithm designed to align source and target tensors within an invariant subspace through the use of alignment matrices. These matrices, along with the subspace, are iteratively optimized under an oblique manifold constraint, which offers greater flexibility and adaptability than the traditional Stiefel manifold. Moreover, regulariza…

    Submitted 26 January, 2026; originally announced January 2026.

    Comments: 5 pages, 5 figures

  2. arXiv:2601.01294  [pdf, ps, other]

    cs.SD cs.AI eess.AS

    Diffusion Timbre Transfer Via Mutual Information Guided Inpainting

    Authors: Ching Ho Lee, Javier Nistal, Stefan Lattner, Marco Pasini, George Fazekas

    Abstract: We study timbre transfer as an inference-time editing problem for music audio. Starting from a strong pre-trained latent diffusion model, we introduce a lightweight procedure that requires no additional training: (i) a dimension-wise noise injection that targets latent channels most informative of instrument identity, and (ii) an early-step clamping mechanism that re-imposes the input's melodic an…

    Submitted 28 January, 2026; v1 submitted 3 January, 2026; originally announced January 2026.

    Comments: 5 pages, 2 figures, 3 tables

  3. arXiv:2512.12630  [pdf, ps, other]

    cs.HC cs.AI

    ORIBA: Exploring LLM-Driven Role-Play Chatbot as a Creativity Support Tool for Original Character Artists

    Authors: Yuqian Sun, Xingyu Li, Shunyu Yao, Noura Howell, Tristan Braud, Chang Hee Lee, Ali Asadipour

    Abstract: Recent advances in Generative AI (GAI) have led to new opportunities for creativity support. However, this technology has raised ethical concerns in the visual artists community. This paper explores how GAI can assist visual artists in developing original characters (OCs) while respecting their creative agency. We present ORIBA, an AI chatbot leveraging large language models (LLMs) to enable artis…

    Submitted 14 December, 2025; originally announced December 2025.

  4. arXiv:2509.19242  [pdf, ps, other]

    cs.DS cs.LG math.ST stat.ML

    Linear Regression under Missing or Corrupted Coordinates

    Authors: Ilias Diakonikolas, Jelena Diakonikolas, Daniel M. Kane, Jasper C. H. Lee, Thanasis Pittas

    Abstract: We study multivariate linear regression under Gaussian covariates in two settings, where data may be erased or corrupted by an adversary under a coordinate-wise budget. In the incomplete data setting, an adversary may inspect the dataset and delete entries in up to an $η$-fraction of samples per coordinate; a strong form of the Missing Not At Random model. In the corrupted data setting, the advers…

    Submitted 23 September, 2025; originally announced September 2025.

  5. arXiv:2508.03738  [pdf, ps, other]

    eess.IV cs.AI cs.CV

    Improve Retinal Artery/Vein Classification via Channel Coupling

    Authors: Shuang Zeng, Chee Hong Lee, Kaiwen Li, Boxu Xie, Ourui Fu, Hangzhou He, Lei Zhu, Yanye Lu, Fangxiao Cheng

    Abstract: Retinal vessel segmentation plays a vital role in analyzing fundus images for the diagnosis of systemic and ocular diseases. Building on this, classifying segmented vessels into arteries and veins (A/V) further enables the extraction of clinically relevant features such as vessel width, diameter and tortuosity, which are essential for detecting conditions like diabetic and hypertensive retinopathy…

    Submitted 31 July, 2025; originally announced August 2025.

  6. arXiv:2506.22608  [pdf, ps, other]

    cs.DS

    On Fine-Grained Distinct Element Estimation

    Authors: Ilias Diakonikolas, Daniel M. Kane, Jasper C. H. Lee, Thanasis Pittas, David P. Woodruff, Samson Zhou

    Abstract: We study the problem of distributed distinct element estimation, where $α$ servers each receive a subset of a universe $[n]$ and aim to compute a $(1+\varepsilon)$-approximation to the number of distinct elements using minimal communication. While prior work establishes a worst-case bound of $Θ\left(α\log n+\frac{α}{\varepsilon^2}\right)$ bits, these results rely on assumptions that may not hold in…

    Submitted 27 June, 2025; originally announced June 2025.

    Comments: ICML 2025

  7. arXiv:2506.08618  [pdf, ps, other]

    cs.LG cond-mat.mes-hall cond-mat.other cs.AI cs.CV

    HSG-12M: A Large-Scale Benchmark of Spatial Multigraphs from the Energy Spectra of Non-Hermitian Crystals

    Authors: Xianquan Yan, Hakan Akgün, Kenji Kawaguchi, N. Duane Loh, Ching Hua Lee

    Abstract: AI is transforming scientific research by revealing new ways to understand complex physical systems, but its impact remains constrained by the lack of large, high-quality domain-specific datasets. A rich, largely untapped resource lies in non-Hermitian quantum physics, where the energy spectra of crystals form intricate geometries on the complex plane -- termed as Hamiltonian spectral graphs. Desp…

    Submitted 4 March, 2026; v1 submitted 10 June, 2025; originally announced June 2025.

    Comments: 49 pages, 13 figures, 14 tables. Code & pipeline: [https://github.com/sarinstein-yan/Poly2Graph] Dataset: [https://github.com/sarinstein-yan/HSG-12M] Dataset released under CC BY 4.0. Benchmark scripts and data loaders included

    Journal ref: The Fourteenth International Conference on Learning Representations (ICLR 2026)

  8. arXiv:2506.03571  [pdf, ps, other]

    cs.CV cs.AI

    DiagNet: Detecting Objects using Diagonal Constraints on Adjacency Matrix of Graph Neural Network

    Authors: Chong Hyun Lee, Kibae Lee

    Abstract: We propose DiagNet, a new approach to object detection with which we can detect an object bounding box using diagonal constraints on the adjacency matrix of a graph convolutional network (GCN). We propose two diagonalization algorithms based on hard and soft constraints on the adjacency matrix and two loss functions using a diagonal constraint and a complementary constraint. DiagNet eliminates the need fo…

    Submitted 4 June, 2025; originally announced June 2025.

  9. arXiv:2506.02865  [pdf, ps, other]

    cs.AI

    Surfer-H Meets Holo1: Cost-Efficient Web Agent Powered by Open Weights

    Authors: Mathieu Andreux, Breno Baldas Skuk, Hamza Benchekroun, Emilien Biré, Antoine Bonnet, Riaz Bordie, Nathan Bout, Matthias Brunel, Pierre-Louis Cedoz, Antoine Chassang, Mickaël Chen, Alexandra D. Constantinou, Antoine d'Andigné, Hubert de La Jonquière, Aurélien Delfosse, Ludovic Denoyer, Alexis Deprez, Augustin Derupti, Michael Eickenberg, Mathïs Federico, Charles Kantor, Xavier Koegler, Yann Labbé, Matthew C. H. Lee, Erwan Le Jumeau de Kergaradec, et al. (19 additional authors not shown)

    Abstract: We present Surfer-H, a cost-efficient web agent that integrates Vision-Language Models (VLM) to perform user-defined tasks on the web. We pair it with Holo1, a new open-weight collection of VLMs specialized in web navigation and information extraction. Holo1 was trained on carefully curated data sources, including open-access web content, synthetic examples, and self-produced agentic data. Holo1 t…

    Submitted 11 June, 2025; v1 submitted 3 June, 2025; originally announced June 2025.

    Comments: Alphabetical order

  10. arXiv:2506.01832  [pdf, ps, other]

    cs.CC

    Pseudorandom bits for non-commutative programs

    Authors: Chin Ho Lee, Emanuele Viola

    Abstract: We obtain new explicit pseudorandom generators for several computational models involving groups. Our main results are as follows: 1. We consider read-once group-products over a finite group $G$, i.e., tests of the form $\prod_{i=1}^n g_i^{x_i}$ where $g_i\in G$, a special case of read-once permutation branching programs. We give generators with optimal seed length $c_G \log(n/\varepsilon)$ over…

    Submitted 4 June, 2025; v1 submitted 2 June, 2025; originally announced June 2025.

    Comments: Minor fixes

  11. arXiv:2505.03896  [pdf, other]

    cs.CV cs.AI

    Novel Extraction of Discriminative Fine-Grained Feature to Improve Retinal Vessel Segmentation

    Authors: Shuang Zeng, Chee Hong Lee, Micky C Nnamdi, Wenqi Shi, J Ben Tamo, Lei Zhu, Hangzhou He, Xinliang Zhang, Qian Chen, May D. Wang, Yanye Lu, Qiushi Ren

    Abstract: Retinal vessel segmentation is a vital early detection method for several severe ocular diseases. Despite significant progress in retinal vessel segmentation with the advancement of Neural Networks, there are still challenges to overcome. Specifically, retinal vessel segmentation aims to predict the class label for every pixel within a fundus image, with a primary focus on intra-image discriminati…

    Submitted 6 May, 2025; originally announced May 2025.

  12. arXiv:2504.15251  [pdf, other]

    cs.LG cs.DS math.ST stat.ML

    On Learning Parallel Pancakes with Mostly Uniform Weights

    Authors: Ilias Diakonikolas, Daniel M. Kane, Sushrut Karmalkar, Jasper C. H. Lee, Thanasis Pittas

    Abstract: We study the complexity of learning $k$-mixtures of Gaussians ($k$-GMMs) on $\mathbb{R}^d$. This task is known to have complexity $d^{Ω(k)}$ in full generality. To circumvent this exponential lower bound on the number of components, research has focused on learning families of GMMs satisfying additional structural properties. A natural assumption posits that the component weights are not exponenti…

    Submitted 21 April, 2025; originally announced April 2025.

  13. arXiv:2502.19765  [pdf, ps, other]

    cs.CL cs.LG

    EdiText: Controllable Coarse-to-Fine Text Editing with Diffusion Language Models

    Authors: Che Hyun Lee, Heeseung Kim, Jiheum Yeom, Sungroh Yoon

    Abstract: We propose EdiText, a controllable text editing method that modifies the reference text to desired attributes at various scales. We integrate an SDEdit-based editing technique that allows for broad adjustments in the degree of text editing. Additionally, we introduce a novel fine-level editing method based on self-conditioning, which allows subtle control of reference text. While being capable of…

    Submitted 2 June, 2025; v1 submitted 27 February, 2025; originally announced February 2025.

    Comments: ACL 2025

  14. arXiv:2502.19759  [pdf, other]

    cs.SD eess.AS

    Does Your Voice Assistant Remember? Analyzing Conversational Context Recall and Utilization in Voice Interaction Models

    Authors: Heeseung Kim, Che Hyun Lee, Sangkwon Park, Jiheum Yeom, Nohil Park, Sangwon Yu, Sungroh Yoon

    Abstract: Recent advancements in multi-turn voice interaction models have improved user-model communication. However, while closed-source models effectively retain and recall past utterances, whether open-source models share this ability remains unexplored. To fill this gap, we systematically evaluate how well open-source interaction models utilize past utterances using ContextDialog, a benchmark we propose…

    Submitted 23 May, 2025; v1 submitted 26 February, 2025; originally announced February 2025.

    Comments: ACL 2025 Findings, Project Page: https://contextdialog.github.io/

  15. Machine Learning Optimal Ordering in Global Routing Problems in Semiconductors

    Authors: Heejin Choi, Minji Lee, Chang Hyeong Lee, Jaeho Yang, Rak-Kyeong Seong

    Abstract: In this work, we propose a new method for ordering nets during the process of layer assignment in global routing problems. The global routing problems that we focus on in this work are based on routing problems that occur in the design of substrates in multilayered semiconductor packages. The proposed new method is based on machine learning techniques and we show that the proposed method supersede…

    Submitted 30 December, 2024; originally announced December 2024.

    Comments: 18 pages, 13 figures, 6 tables; published in Scientific Reports

    Report number: UNIST-MTH-24-RS-07

    Journal ref: Scientific Reports 14, 31077 (2024)

  16. arXiv:2411.10399  [pdf, other]

    cs.GT cs.CR cs.DC

    Game Theoretic Liquidity Provisioning in Concentrated Liquidity Market Makers

    Authors: Weizhao Tang, Rachid El-Azouzi, Cheng Han Lee, Ethan Chan, Giulia Fanti

    Abstract: Automated market makers (AMMs) are a class of decentralized exchanges that enable the automated trading of digital assets. They accept deposits of digital tokens from liquidity providers (LPs); tokens can be used by traders to execute trades, which generate fees for the investing LPs. The distinguishing feature of AMMs is that trade prices are determined algorithmically, unlike classical limit ord…

    Submitted 15 November, 2024; originally announced November 2024.

  17. arXiv:2409.15760  [pdf, other]

    cs.SD eess.AS

    NanoVoice: Efficient Speaker-Adaptive Text-to-Speech for Multiple Speakers

    Authors: Nohil Park, Heeseung Kim, Che Hyun Lee, Jooyoung Choi, Jiheum Yeom, Sungroh Yoon

    Abstract: We present NanoVoice, a personalized text-to-speech model that efficiently constructs voice adapters for multiple speakers simultaneously. NanoVoice introduces a batch-wise speaker adaptation technique capable of fine-tuning multiple references in parallel, significantly reducing training time. Beyond building separate adapters for each speaker, we also propose a parameter sharing technique that r…

    Submitted 20 December, 2024; v1 submitted 24 September, 2024; originally announced September 2024.

    Comments: IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025, Demo Page: https://nanovoice.github.io/

  18. arXiv:2409.15759  [pdf, other]

    cs.SD eess.AS

    VoiceGuider: Enhancing Out-of-Domain Performance in Parameter-Efficient Speaker-Adaptive Text-to-Speech via Autoguidance

    Authors: Jiheum Yeom, Heeseung Kim, Jooyoung Choi, Che Hyun Lee, Nohil Park, Sungroh Yoon

    Abstract: When applying parameter-efficient finetuning via LoRA onto speaker adaptive text-to-speech models, adaptation performance may decline compared to full-finetuned counterparts, especially for out-of-domain speakers. Here, we propose VoiceGuider, a parameter-efficient speaker adaptive text-to-speech system reinforced with autoguidance to enhance the speaker adaptation performance, reducing the gap ag…

    Submitted 20 December, 2024; v1 submitted 24 September, 2024; originally announced September 2024.

    Comments: IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025, Demo Page: https://voiceguider.github.io/

  19. arXiv:2409.06932  [pdf, other]

    cs.CC math.CO

    Boosting uniformity in quasirandom groups: fast and simple

    Authors: Harm Derksen, Chin Ho Lee, Emanuele Viola

    Abstract: We study the communication complexity of multiplying $k\times t$ elements from the group $H=\text{SL}(2,q)$ in the number-on-forehead model with $k$ parties. We prove a lower bound of $(t\log H)/c^{k}$. This is an exponential improvement over previous work, and matches the state-of-the-art in the area. Relatedly, we show that the convolution of $k^{c}$ independent copies of a 3-uniform distribut…

    Submitted 10 September, 2024; originally announced September 2024.

  20. arXiv:2408.14739  [pdf, other]

    cs.SD eess.AS

    VoiceTailor: Lightweight Plug-In Adapter for Diffusion-Based Personalized Text-to-Speech

    Authors: Heeseung Kim, Sang-gil Lee, Jiheum Yeom, Che Hyun Lee, Sungwon Kim, Sungroh Yoon

    Abstract: We propose VoiceTailor, a parameter-efficient speaker-adaptive text-to-speech (TTS) system, by equipping a pre-trained diffusion-based TTS model with a personalized adapter. VoiceTailor identifies pivotal modules that benefit from the adapter based on a weight change ratio analysis. We utilize Low-Rank Adaptation (LoRA) as a parameter-efficient adaptation method and incorporate the adapter into pi…

    Submitted 27 August, 2024; v1 submitted 26 August, 2024; originally announced August 2024.

    Comments: INTERSPEECH 2024

  21. arXiv:2408.09554  [pdf, ps, other]

    q-bio.QM cs.CV eess.IV

    Screen Them All: High-Throughput Pan-Cancer Genetic and Phenotypic Biomarker Screening from H&E Whole Slide Images

    Authors: Yi Kan Wang, Ludmila Tydlitatova, Jeremy D. Kunz, Gerard Oakley, Bonnie Kar Bo Chow, Ran A. Godrich, Matthew C. H. Lee, Hamed Aghdam, Alican Bozkurt, Michal Zelechowski, Chad Vanderbilt, Christopher Kanan, Juan A. Retamero, Peter Hamilton, Razik Yousfi, Thomas J. Fuchs, David S. Klimstra, Siqi Liu

    Abstract: Molecular assays are standard of care for detecting genomic alterations in cancer prognosis and therapy selection but are costly, tissue-destructive and time-consuming. Artificial intelligence (AI) applied to routine hematoxylin and eosin (H&E)-stained whole slide images (WSIs) offers a fast and economical alternative for screening molecular biomarkers. We introduce OmniScreen, a high-throughput A…

    Submitted 14 July, 2025; v1 submitted 18 August, 2024; originally announced August 2024.

  22. arXiv:2407.12110  [pdf, other]

    cs.CC

    Pseudorandomness, symmetry, smoothing: II

    Authors: Harm Derksen, Peter Ivanov, Chin Ho Lee, Emanuele Viola

    Abstract: We prove several new results on the Hamming weight of bounded uniform and small-bias distributions. We exhibit bounded-uniform distributions whose weight is anti-concentrated, matching existing concentration inequalities. This construction relies on a recent result in approximation theory due to Erdélyi (Acta Arithmetica 2016). In particular, we match the classical tail bounds, generalizing a res…

    Submitted 16 July, 2024; originally announced July 2024.

  23. arXiv:2407.11177  [pdf, ps, other]

    cs.DS

    Trace reconstruction from local statistical queries

    Authors: Xi Chen, Anindya De, Chin Ho Lee, Rocco A. Servedio

    Abstract: The goal of trace reconstruction is to reconstruct an unknown $n$-bit string $x$ given only independent random traces of $x$, where a random trace of $x$ is obtained by passing $x$ through a deletion channel. A Statistical Query (SQ) algorithm for trace reconstruction is an algorithm which can only access statistical information about the distribution of random traces of $x$ rather than individual…

    Submitted 15 July, 2024; originally announced July 2024.

    Comments: RANDOM 2024

  24. Segmenting Medical Images with Limited Data

    Authors: Zhaoshan Liu, Qiujie Lv, Chau Hung Lee, Lei Shen

    Abstract: While computer vision has proven valuable for medical image segmentation, its application faces challenges such as limited dataset sizes and the complexity of effectively leveraging unlabeled images. To address these challenges, we present a novel semi-supervised, consistency-based approach termed the data-efficient medical segmenter (DEMS). The DEMS features an encoder-decoder architecture and in…

    Submitted 12 July, 2024; originally announced July 2024.

    Comments: Neural Networks Accepted

  25. arXiv:2405.13143  [pdf, ps, other]

    cs.CC

    Pseudorandomness, symmetry, smoothing: I

    Authors: Harm Derksen, Peter Ivanov, Chin Ho Lee, Emanuele Viola

    Abstract: We prove several new results about bounded uniform and small-bias distributions. A main message is that small-bias, even perturbed with noise, does not fool several classes of tests better than bounded uniformity. We prove this for threshold tests, small-space algorithms, and small-depth circuits. In particular, we obtain small-bias distributions that 1) achieve an optimal lower bound on their…

    Submitted 21 May, 2024; originally announced May 2024.

    Comments: CCC 2024

  26. arXiv:2401.11371  [pdf, other]

    cs.RO eess.SY

    Modeling Considerations for Developing Deep Space Autonomous Spacecraft and Simulators

    Authors: Christopher Agia, Guillem Casadesus Vila, Saptarshi Bandyopadhyay, David S. Bayard, Kar-Ming Cheung, Charles H. Lee, Eric Wood, Ian Aenishanslin, Steven Ardito, Lorraine Fesq, Marco Pavone, Issa A. D. Nesnas

    Abstract: To extend the limited scope of autonomy used in prior missions for operation in distant and complex environments, there is a need to further develop and mature autonomy that jointly reasons over multiple subsystems, which we term system-level autonomy. System-level autonomy establishes situational awareness that resolves conflicting information across subsystems, which may necessitate the refineme…

    Submitted 20 January, 2024; originally announced January 2024.

    Comments: Project page: https://sites.google.com/stanford.edu/spacecraft-models. 20 pages, 8 figures. Accepted to the IEEE Conference on Aerospace (AeroConf) 2024

    ACM Class: I.2.8; I.2.9; I.6.1; I.6.3; I.6.4; I.6.6; J.2

  27. arXiv:2312.11769  [pdf, other]

    cs.LG cs.DS cs.IT math.ST stat.ML

    Clustering Mixtures of Bounded Covariance Distributions Under Optimal Separation

    Authors: Ilias Diakonikolas, Daniel M. Kane, Jasper C. H. Lee, Thanasis Pittas

    Abstract: We study the clustering problem for mixtures of bounded covariance distributions, under a fine-grained separation assumption. Specifically, given samples from a $k$-component mixture distribution $D = \sum_{i =1}^k w_i P_i$, where each $w_i \ge α$ for some known parameter $α$, and each $P_i$ has unknown covariance $Σ_i \preceq σ^2_i \cdot I_d$ for some unknown $σ_i$, the goal is to cluster the sam…

    Submitted 18 December, 2023; originally announced December 2023.

  28. arXiv:2311.12784  [pdf, ps, other]

    math.ST cs.IT cs.LG stat.ML

    Optimality in Mean Estimation: Beyond Worst-Case, Beyond Sub-Gaussian, and Beyond $1+α$ Moments

    Authors: Trung Dang, Jasper C. H. Lee, Maoyuan Song, Paul Valiant

    Abstract: There is growing interest in improving our algorithmic understanding of fundamental statistical problems such as mean estimation, driven by the goal of understanding the limits of what we can extract from valuable data. The state of the art results for mean estimation in $\mathbb{R}$ are 1) the optimal sub-Gaussian mean estimator by [LV22], with the tight sub-Gaussian constant for all distribution…

    Submitted 21 November, 2023; originally announced November 2023.

    Comments: 27 pages, to appear in NeurIPS 2023. Abstract shortened to fit arXiv limit

  29. arXiv:2311.08022  [pdf, ps, other]

    cs.AI cs.LG

    Two-Stage Predict+Optimize for Mixed Integer Linear Programs with Unknown Parameters in Constraints

    Authors: Xinyi Hu, Jasper C. H. Lee, Jimmy H. M. Lee

    Abstract: Consider the setting of constrained optimization, with some parameters unknown at solving time and requiring prediction from relevant features. Predict+Optimize is a recent framework for end-to-end training supervised learning models for such predictions, incorporating information about the optimization problem in the training process in order to yield better predictions in terms of the quality of…

    Submitted 14 November, 2023; originally announced November 2023.

  30. arXiv:2310.11870  [pdf, other]

    cs.CL cs.AI

    AI Nushu: An Exploration of Language Emergence in Sisterhood -Through the Lens of Computational Linguistics

    Authors: Yuqian Sun, Yuying Tang, Ze Gao, Zhijun Pan, Chuyan Xu, Yurou Chen, Kejiang Qian, Zhigang Wang, Tristan Braud, Chang Hee Lee, Ali Asadipour

    Abstract: This paper presents "AI Nushu," an emerging language system inspired by Nushu (women's scripts), the unique language created and used exclusively by ancient Chinese women who were thought to be illiterate under a patriarchal society. In this interactive installation, two artificial intelligence (AI) agents are trained in the Chinese dictionary and the Nushu corpus. By continually observing their e…

    Submitted 18 October, 2023; originally announced October 2023.

    Comments: Accepted for publication at SIGGRAPH Asia 2023

    MSC Class: 14J60 (Primary) 14F05; 14J26 (Secondary) ACM Class: F.2.2; I.2.7

  31. arXiv:2309.11478  [pdf, other]

    cs.AI

    Fictional Worlds, Real Connections: Developing Community Storytelling Social Chatbots through LLMs

    Authors: Yuqian Sun, Hanyi Wang, Pok Man Chan, Morteza Tabibi, Yan Zhang, Huan Lu, Yuheng Chen, Chang Hee Lee, Ali Asadipour

    Abstract: We address the integration of storytelling and Large Language Models (LLMs) to develop engaging and believable Social Chatbots (SCs) in community settings. Motivated by the potential of fictional characters to enhance social interactions, we introduce Storytelling Social Chatbots (SSCs) and the concept of story engineering to transform fictional game characters into "live" social entities within p…

    Submitted 20 September, 2023; originally announced September 2023.

  32. arXiv:2309.07778  [pdf, other]

    eess.IV cs.CV cs.LG q-bio.TO

    Virchow: A Million-Slide Digital Pathology Foundation Model

    Authors: Eugene Vorontsov, Alican Bozkurt, Adam Casson, George Shaikovski, Michal Zelechowski, Siqi Liu, Kristen Severson, Eric Zimmermann, James Hall, Neil Tenenholtz, Nicolo Fusi, Philippe Mathieu, Alexander van Eck, Donghun Lee, Julian Viret, Eric Robert, Yi Kan Wang, Jeremy D. Kunz, Matthew C. H. Lee, Jan Bernhard, Ran A. Godrich, Gerard Oakley, Ewan Millar, Matthew Hanna, Juan Retamero, et al. (6 additional authors not shown)

    Abstract: The use of artificial intelligence to enable precision medicine and decision support systems through the analysis of pathology images has the potential to revolutionize the diagnosis and treatment of cancer. Such applications will depend on models' abilities to capture the diverse patterns observed in pathology images. To address this challenge, we present Virchow, a foundation model for computati…

    Submitted 17 January, 2024; v1 submitted 14 September, 2023; originally announced September 2023.

  33. arXiv:2308.12915  [pdf, other]

    cs.HC cs.AI

    Language as Reality: A Co-Creative Storytelling Game Experience in 1001 Nights using Generative AI

    Authors: Yuqian Sun, Zhouyi Li, Ke Fang, Chang Hee Lee, Ali Asadipour

    Abstract: In this paper, we present "1001 Nights", an AI-native game that allows players to lead in-game reality through co-created storytelling with a character driven by a large language model. The concept is inspired by Wittgenstein's idea of the limits of one's world being determined by the bounds of their language. Using advanced AI tools like GPT-4 and Stable Diffusion, the second iteration of the game e…

    Submitted 18 September, 2023; v1 submitted 24 August, 2023; originally announced August 2023.

    Comments: The paper was accepted by The 19th AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment (AIIDE 23)

  34. arXiv:2306.16573  [pdf, other]

    math.ST cs.IT cs.LG math.PR stat.ML

    Finite-Sample Symmetric Mean Estimation with Fisher Information Rate

    Authors: Shivam Gupta, Jasper C. H. Lee, Eric Price

    Abstract: The mean of an unknown variance-$σ^2$ distribution $f$ can be estimated from $n$ samples with variance $\frac{σ^2}{n}$ and nearly corresponding subgaussian rate. When $f$ is known up to translation, this can be improved asymptotically to $\frac{1}{n\mathcal I}$, where $\mathcal I$ is the Fisher information of the distribution. Such an improvement is not possible for general unknown $f$, but [Stone…

    Submitted 28 June, 2023; originally announced June 2023.

    Comments: COLT 2023

  35. arXiv:2305.00966  [pdf, other]

    cs.DS cs.LG math.ST stat.ML

    A Spectral Algorithm for List-Decodable Covariance Estimation in Relative Frobenius Norm

    Authors: Ilias Diakonikolas, Daniel M. Kane, Jasper C. H. Lee, Ankit Pensia, Thanasis Pittas

    Abstract: We study the problem of list-decodable Gaussian covariance estimation. Given a multiset $T$ of $n$ points in $\mathbb R^d$ such that an unknown $α<1/2$ fraction of points in $T$ are i.i.d. samples from an unknown Gaussian $\mathcal{N}(μ, Σ)$, the goal is to output a list of $O(1/α)$ hypotheses at least one of which is close to $Σ$ in relative Frobenius norm. Our main result is a…

    Submitted 1 May, 2023; originally announced May 2023.

  36. arXiv:2303.07945  [pdf, other]

    cs.CV

    Edit-A-Video: Single Video Editing with Object-Aware Consistency

    Authors: Chaehun Shin, Heeseung Kim, Che Hyun Lee, Sang-gil Lee, Sungroh Yoon

    Abstract: Although text-to-video (TTV) models have recently achieved remarkable success, few approaches extend TTV to video editing. Motivated by TTV models that adapt diffusion-based text-to-image (TTI) models, we propose a video editing framework that requires only a pretrained TTI model and a single <text, video> pair, which we term Edit-A-Video. The fr…

    Submitted 17 November, 2023; v1 submitted 14 March, 2023; originally announced March 2023.

    Comments: ACML 2023 Best Paper Award

  37. arXiv:2303.06698  [pdf, ps, other]

    cs.LG cs.AI math.OC

    Branch & Learn with Post-hoc Correction for Predict+Optimize with Unknown Parameters in Constraints

    Authors: Xinyi Hu, Jasper C. H. Lee, Jimmy H. M. Lee

    Abstract: Combining machine learning and constrained optimization, Predict+Optimize tackles optimization problems containing parameters that are unknown at the time of solving. Prior works focus on cases with unknowns only in the objectives. A new framework was recently proposed to cater for unknowns also in constraints by introducing a loss function, called Post-hoc Regret, that takes into account the cost…

    Submitted 12 March, 2023; originally announced March 2023.

  38. arXiv:2302.02497  [pdf, other]

    math.ST cs.IT cs.LG math.PR stat.ML

    High-dimensional Location Estimation via Norm Concentration for Subgamma Vectors

    Authors: Shivam Gupta, Jasper C. H. Lee, Eric Price

    Abstract: In location estimation, we are given $n$ samples from a known distribution $f$ shifted by an unknown translation $λ$, and want to estimate $λ$ as precisely as possible. Asymptotically, the maximum likelihood estimate achieves the Cramér-Rao bound of error $\mathcal N(0, \frac{1}{n\mathcal I})$, where $\mathcal I$ is the Fisher information of $f$. However, the $n$ required for convergence depends o…

    Submitted 5 February, 2023; originally announced February 2023.

  39. arXiv:2211.16333  [pdf, ps, other]

    cs.DS cs.LG math.ST stat.ML

    Outlier-Robust Sparse Mean Estimation for Heavy-Tailed Distributions

    Authors: Ilias Diakonikolas, Daniel M. Kane, Jasper C. H. Lee, Ankit Pensia

    Abstract: We study the fundamental task of outlier-robust mean estimation for heavy-tailed distributions in the presence of sparsity. Specifically, given a small number of corrupted samples from a high-dimensional heavy-tailed distribution whose mean $μ$ is guaranteed to be sparse, the goal is to efficiently compute a hypothesis that accurately approximates $μ$ with high probability. Prior work had obtained…

    Submitted 29 November, 2022; originally announced November 2022.

    Comments: To appear in NeurIPS 2022

  40. arXiv:2211.03292  [pdf, ps, other]

    cs.DS

    Approximate Trace Reconstruction from a Single Trace

    Authors: Xi Chen, Anindya De, Chin Ho Lee, Rocco A. Servedio, Sandip Sinha

    Abstract: The well-known trace reconstruction problem is the problem of inferring an unknown source string $x \in \{0,1\}^n$ from independent "traces", i.e. copies of $x$ that have been corrupted by a $δ$-deletion channel which independently deletes each bit of $x$ with probability $δ$ and concatenates the surviving bits. The current paper considers the extreme data-limited regime in which only a single tra…

    Submitted 6 November, 2022; originally announced November 2022.

  41. arXiv:2209.03668  [pdf, other]

    cs.AI cs.LG math.OC

    Predict+Optimize for Packing and Covering LPs with Unknown Parameters in Constraints

    Authors: Xinyi Hu, Jasper C. H. Lee, Jimmy H. M. Lee

    Abstract: Predict+Optimize is a recently proposed framework which combines machine learning and constrained optimization, tackling optimization problems that contain parameters that are unknown at solving time. The goal is to predict the unknown parameters and use the estimates to solve for an estimated optimal solution to the optimization problem. However, all prior works have focused on the case where unk…

    Submitted 8 September, 2022; originally announced September 2022.

  42. Recent Progress in Transformer-based Medical Image Analysis

    Authors: Zhaoshan Liu, Qiujie Lv, Ziduo Yang, Yifan Li, Chau Hung Lee, Lei Shen

    Abstract: The transformer is primarily used in the field of natural language processing. Recently, it has been adopted and shows promise in the computer vision (CV) field. Medical image analysis (MIA), as a critical branch of CV, also greatly benefits from this state-of-the-art technique. In this review, we first recap the core component of the transformer, the attention mechanism, and the detailed structur…

    Submitted 25 July, 2023; v1 submitted 13 August, 2022; originally announced August 2022.

    Comments: Computers in Biology and Medicine Accepted

    MSC Class: I.2.m; I.4.9; I.5.4; J.0

  43. arXiv:2206.02348  [pdf, other]

    math.ST cs.DS cs.IT cs.LG stat.ML

    Finite-Sample Maximum Likelihood Estimation of Location

    Authors: Shivam Gupta, Jasper C. H. Lee, Eric Price, Paul Valiant

    Abstract: We consider 1-dimensional location estimation, where we estimate a parameter $λ$ from $n$ samples $λ+ η_i$, with each $η_i$ drawn i.i.d. from a known distribution $f$. For fixed $f$ the maximum-likelihood estimate (MLE) is well-known to be optimal in the limit as $n \to \infty$: it is asymptotically normal with variance matching the Cramér-Rao lower bound of $\frac{1}{n\mathcal{I}}$, where…

    Submitted 18 July, 2022; v1 submitted 6 June, 2022; originally announced June 2022.

    Comments: Corrected an inaccuracy in the description of the experimental setup. Also updated funding acknowledgements

  44. arXiv:2205.01672  [pdf, other]

    cs.LG cs.AI math.OC

    Branch & Learn for Recursively and Iteratively Solvable Problems in Predict+Optimize

    Authors: Xinyi Hu, Jasper C. H. Lee, Jimmy H. M. Lee, Allen Z. Zhong

    Abstract: This paper proposes Branch & Learn, a framework for Predict+Optimize to tackle optimization problems containing parameters that are unknown at the time of solving. Given an optimization problem solvable by a recursive algorithm satisfying simple conditions, we show how a corresponding learning algorithm can be constructed directly and methodically from the recursive algorithm. Our framework applie…

    Submitted 1 May, 2022; originally announced May 2022.

  45. arXiv:2203.14539  [pdf, other]

    cs.LG cs.CV

    Semi-supervised anomaly detection algorithm based on KL divergence (SAD-KL)

    Authors: Chong Hyun Lee, Kibae Lee

    Abstract: The unlabeled data are generally assumed to be normal data in detecting abnormal data via semi-supervised learning. This assumption, however, causes inevitable detection error when the distribution of unlabeled data is different from the distribution of the labeled normal dataset. To deal with the problem caused by the distribution gap between labeled and unlabeled data, we propose a semi-supervised anomaly detection a…

    Submitted 28 March, 2022; originally announced March 2022.

    Comments: 9 pages, 8 figures

  46. arXiv:2203.06184  [pdf, other]

    eess.IV cs.CV

    GSDA: Generative Adversarial Network-based Semi-Supervised Data Augmentation for Ultrasound Image Classification

    Authors: Zhaoshan Liu, Qiujie Lv, Chau Hung Lee, Lei Shen

    Abstract: Medical Ultrasound (US) is one of the most widely used imaging modalities in clinical practice, but its usage presents unique challenges such as variable imaging quality. Deep Learning (DL) models can serve as advanced medical US image analysis tools, but their performance is greatly limited by the scarcity of large datasets. To solve the common data shortage, we develop GSDA, a Generative Adversa…

    Submitted 5 October, 2023; v1 submitted 11 March, 2022; originally announced March 2022.

    Comments: Heliyon Accepted

    ACM Class: I.2.1; I.2.10; I.4.9; I.5.4; J.0

  47. arXiv:2107.11530  [pdf, ps, other]

    cs.DS cs.DM

    Near-Optimal Average-Case Approximate Trace Reconstruction from Few Traces

    Authors: Xi Chen, Anindya De, Chin Ho Lee, Rocco A. Servedio, Sandip Sinha

    Abstract: In the standard trace reconstruction problem, the goal is to \emph{exactly} reconstruct an unknown source string $\mathsf{x} \in \{0,1\}^n$ from independent "traces", which are copies of $\mathsf{x}$ that have been corrupted by a $δ$-deletion channel which independently deletes each bit of $\mathsf{x}$ with probability $δ$ and concatenates the surviving bits. We study the \emph{approximate} trace…

    Submitted 25 August, 2021; v1 submitted 24 July, 2021; originally announced July 2021.

    Comments: Updated few references

    MSC Class: 68Q25 (Primary) 68Q32; 68Q87; 68Q17; 68W32; 68W40 (Secondary) ACM Class: F.2.0; G.3

  48. arXiv:2107.10797  [pdf, other]

    cs.CC

    Fourier growth of structured $\mathbb{F}_2$-polynomials and applications

    Authors: Jarosław Błasiok, Peter Ivanov, Yaonan Jin, Chin Ho Lee, Rocco A. Servedio, Emanuele Viola

    Abstract: We analyze the Fourier growth, i.e. the $L_1$ Fourier weight at level $k$ (denoted $L_{1,k}$), of various well-studied classes of "structured" $\mathbb{F}_2$-polynomials. This study is motivated by applications in pseudorandomness, in particular recent results and conjectures due to [CHHL19,CHLT19,CGLSS20] which show that upper bounds on Fourier growth (even at level $k=2$) give unconditional pseu…

    Submitted 11 October, 2024; v1 submitted 22 July, 2021; originally announced July 2021.

    Comments: Corrected a mistake in Lemma 27 in the previous version of the paper

  49. arXiv:2012.02844  [pdf, ps, other]

    cs.DS

    Polynomial-time trace reconstruction in the low deletion rate regime

    Authors: Xi Chen, Anindya De, Chin Ho Lee, Rocco A. Servedio, Sandip Sinha

    Abstract: In the \emph{trace reconstruction problem}, an unknown source string $x \in \{0,1\}^n$ is transmitted through a probabilistic \emph{deletion channel} which independently deletes each bit with some fixed probability $δ$ and concatenates the surviving bits, resulting in a \emph{trace} of $x$. The problem is to reconstruct $x$ given access to independent traces. Trace reconstruction of arbitrary (w…

    Submitted 7 December, 2020; v1 submitted 4 December, 2020; originally announced December 2020.

    Comments: ITCS 2021. Updated with minor correction of extraneous file reference

    MSC Class: 68Q87 (Primary) 68Q25; 68W32; 68W40 (Secondary) ACM Class: F.2.0

  50. arXiv:2011.08384  [pdf, ps, other]

    math.ST cs.DS cs.IT cs.LG stat.ML

    Optimal Sub-Gaussian Mean Estimation in $\mathbb{R}$

    Authors: Jasper C. H. Lee, Paul Valiant

    Abstract: We revisit the problem of estimating the mean of a real-valued distribution, presenting a novel estimator with sub-Gaussian convergence: intuitively, "our estimator, on any distribution, is as accurate as the sample mean is for the Gaussian distribution of matching variance." Crucially, in contrast to prior works, our estimator does not require prior knowledge of the variance, and works across the…

    Submitted 16 November, 2020; originally announced November 2020.