-
An Unsupervised Tensor-Based Domain Alignment
Authors:
Chong Hyun Lee,
Kibae Lee,
Hyun Hee Yim
Abstract:
We propose a tensor-based domain alignment (DA) algorithm designed to align source and target tensors within an invariant subspace through the use of alignment matrices. These matrices, along with the subspace, are iteratively optimized under an oblique-manifold constraint, which offers greater flexibility and adaptability than the traditional Stiefel manifold. Moreover, regularization terms defined to preserve the variance of both source and target tensors ensure robust performance. Our framework is versatile, effectively generalizing existing tensor-based DA methods as special cases. Through extensive experiments, we demonstrate that our approach not only accelerates DA convergence but also significantly boosts classification accuracy. This positions our method as superior to current state-of-the-art techniques, making it a preferable choice for complex domain adaptation tasks.
Submitted 26 January, 2026;
originally announced January 2026.
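The abstract above does not give the update equations, but the oblique manifold it refers to is the standard set of matrices with unit-norm columns. Below is a minimal illustrative sketch, in Python with NumPy, of one retracted gradient step under that constraint on a toy least-squares objective; the objective, step size, and variable names are placeholders, not the paper's algorithm.

```python
import numpy as np

def oblique_retraction(X):
    """Map X onto the oblique manifold by rescaling every column to unit norm."""
    return X / np.linalg.norm(X, axis=0, keepdims=True)

def oblique_tangent_projection(X, egrad):
    """Project a Euclidean gradient onto the tangent space at X: remove, per
    column, the component parallel to that (unit-norm) column."""
    return egrad - X * np.sum(X * egrad, axis=0, keepdims=True)

# One illustrative step on a toy objective f(X) = 0.5 * ||A X - B||_F^2.
rng = np.random.default_rng(0)
A, B = rng.normal(size=(20, 8)), rng.normal(size=(20, 5))
X = oblique_retraction(rng.normal(size=(8, 5)))      # feasible starting point
grad = A.T @ (A @ X - B)                             # Euclidean gradient
X = oblique_retraction(X - 0.01 * oblique_tangent_projection(X, grad))
print(np.linalg.norm(X, axis=0))                     # all columns stay unit-norm
```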
-
Diffusion Timbre Transfer Via Mutual Information Guided Inpainting
Authors:
Ching Ho Lee,
Javier Nistal,
Stefan Lattner,
Marco Pasini,
George Fazekas
Abstract:
We study timbre transfer as an inference-time editing problem for music audio. Starting from a strong pre-trained latent diffusion model, we introduce a lightweight procedure that requires no additional training: (i) a dimension-wise noise injection that targets latent channels most informative of instrument identity, and (ii) an early-step clamping mechanism that re-imposes the input's melodic and rhythmic structure during reverse diffusion. The method operates directly on audio latents and is compatible with text/audio conditioning (e.g., CLAP). We discuss design choices, analyze trade-offs between timbral change and structural preservation, and show that simple inference-time controls can meaningfully steer pre-trained models for style-transfer use cases.
Submitted 28 January, 2026; v1 submitted 3 January, 2026;
originally announced January 2026.
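Since no implementation details beyond the two mechanisms are given above, the following Python sketch only illustrates the control flow they describe: noise injected on a chosen subset of latent channels, then an early-step clamp that copies the remaining (structure-carrying) channels back from the input during the first reverse-diffusion steps. The function denoise_step, the channel indices, and the step counts are hypothetical stand-ins, not the paper's model or its mutual-information channel selection.

```python
import torch

def timbre_transfer(latents, denoise_step, timbre_dims, n_steps=50, clamp_steps=10, noise_scale=1.0):
    """Inference-time editing sketch: add noise only on timbre-informative latent
    channels, then re-impose the input's structure during the earliest reverse steps.

    latents: (C, T) input audio latents; timbre_dims: indices of channels assumed
    to carry instrument identity; denoise_step(z, t) -> one reverse-diffusion update.
    """
    z = latents.clone()
    z[timbre_dims] += noise_scale * torch.randn_like(z[timbre_dims])  # dimension-wise noise injection
    timbre = set(timbre_dims)
    keep_dims = [d for d in range(latents.shape[0]) if d not in timbre]
    for t in reversed(range(n_steps)):
        z = denoise_step(z, t)
        if t >= n_steps - clamp_steps:          # early reverse steps only
            z[keep_dims] = latents[keep_dims]   # clamp melodic/rhythmic structure channels
    return z

# Toy usage with a stand-in denoiser (a real latent diffusion model would go here).
dummy_denoiser = lambda z, t: z + 0.01 * torch.randn_like(z)
out = timbre_transfer(torch.randn(64, 128), dummy_denoiser, timbre_dims=[0, 1, 2, 3])
print(out.shape)
```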
-
ORIBA: Exploring LLM-Driven Role-Play Chatbot as a Creativity Support Tool for Original Character Artists
Authors:
Yuqian Sun,
Xingyu Li,
Shunyu Yao,
Noura Howell,
Tristan Braud,
Chang Hee Lee,
Ali Asadipour
Abstract:
Recent advances in Generative AI (GAI) have led to new opportunities for creativity support. However, this technology has raised ethical concerns in the visual artist community. This paper explores how GAI can assist visual artists in developing original characters (OCs) while respecting their creative agency. We present ORIBA, an AI chatbot leveraging large language models (LLMs) to enable artists to role-play with their OCs, focusing on conceptualization (e.g., backstories) while leaving exposition (visual creation) to creators. Through a study with 14 artists, we found that ORIBA motivated artists' imaginative engagement, helping them develop multidimensional attributes and stronger bonds with their OCs that inspire their creative process. Our contributions include design insights for AI systems developed from artists' perspectives, demonstrating how LLMs can support cross-modal creativity while preserving creative agency in OC art. This paper highlights the potential of GAI as a neutral, non-visual support that strengthens existing creative practice without infringing on artistic exposition.
Submitted 14 December, 2025;
originally announced December 2025.
-
Linear Regression under Missing or Corrupted Coordinates
Authors:
Ilias Diakonikolas,
Jelena Diakonikolas,
Daniel M. Kane,
Jasper C. H. Lee,
Thanasis Pittas
Abstract:
We study multivariate linear regression under Gaussian covariates in two settings, where data may be erased or corrupted by an adversary under a coordinate-wise budget. In the incomplete data setting, an adversary may inspect the dataset and delete entries in up to an $η$-fraction of samples per coordinate, a strong form of the Missing Not At Random model. In the corrupted data setting, the adversary instead replaces values arbitrarily, and the corruption locations are unknown to the learner. Despite substantial work on missing data, linear regression under such adversarial missingness remains poorly understood, even information-theoretically. Unlike the clean setting, where estimation error vanishes with more samples, here the optimal error remains a positive function of the problem parameters. Our main contribution is to characterize this error up to constant factors across essentially the entire parameter range. Specifically, we establish novel information-theoretic lower bounds on the achievable error that match the error of (computationally efficient) algorithms. A key implication is that, perhaps surprisingly, the optimal error in the missing data setting matches that in the corruption setting, so knowing the corruption locations offers no general advantage.
Submitted 23 September, 2025;
originally announced September 2025.
-
Improve Retinal Artery/Vein Classification via Channel Coupling
Authors:
Shuang Zeng,
Chee Hong Lee,
Kaiwen Li,
Boxu Xie,
Ourui Fu,
Hangzhou He,
Lei Zhu,
Yanye Lu,
Fangxiao Cheng
Abstract:
Retinal vessel segmentation plays a vital role in analyzing fundus images for the diagnosis of systemic and ocular diseases. Building on this, classifying segmented vessels into arteries and veins (A/V) further enables the extraction of clinically relevant features such as vessel width, diameter and tortuosity, which are essential for detecting conditions like diabetic and hypertensive retinopathy. However, manual segmentation and classification are time-consuming, costly and inconsistent. With the advancement of Convolutional Neural Networks, several automated methods have been proposed to address this challenge, but issues remain. For example, existing methods treat artery, vein and overall vessel segmentation as three separate binary tasks, neglecting the intrinsic coupling relationships between these anatomical structures. Since artery and vein structures are subsets of the overall retinal vessel map and should naturally exhibit prediction consistency with it, we design a novel loss named Channel-Coupled Vessel Consistency Loss to enforce coherence and consistency between vessel, artery and vein predictions, avoiding biasing the network toward three simple binary segmentation tasks. Moreover, we also introduce a regularization term named intra-image pixel-level contrastive loss to extract more discriminative, fine-grained feature-level representations for accurate retinal A/V classification. SOTA results have been achieved across three public A/V classification datasets, including RITE, LES-AV and HRF. Our code will be available upon acceptance.
Submitted 31 July, 2025;
originally announced August 2025.
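The exact form of the Channel-Coupled Vessel Consistency Loss is not spelled out in the abstract, so the PyTorch sketch below only illustrates the coupling idea it states, namely that the artery and vein maps are subsets of the overall vessel map, by penalizing disagreement between the vessel probability and the pixel-wise union (approximated by a max) of the artery and vein probabilities. Treat the functional form as a hypothetical choice rather than the paper's loss.

```python
import torch
import torch.nn.functional as F

def channel_coupled_consistency(artery_logits, vein_logits, vessel_logits):
    """Hypothetical coupling term: artery and vein maps are subsets of the overall
    vessel map, so the vessel probability should agree with the pixel-wise union
    (approximated by the max) of artery and vein probabilities."""
    artery = torch.sigmoid(artery_logits)
    vein = torch.sigmoid(vein_logits)
    vessel = torch.sigmoid(vessel_logits)
    union = torch.maximum(artery, vein)
    return F.mse_loss(vessel, union)

# Toy usage on a batch of 2 single-channel 64x64 prediction maps.
a, v, s = (torch.randn(2, 1, 64, 64) for _ in range(3))
print(channel_coupled_consistency(a, v, s).item())
```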
-
On Fine-Grained Distinct Element Estimation
Authors:
Ilias Diakonikolas,
Daniel M. Kane,
Jasper C. H. Lee,
Thanasis Pittas,
David P. Woodruff,
Samson Zhou
Abstract:
We study the problem of distributed distinct element estimation, where $α$ servers each receive a subset of a universe $[n]$ and aim to compute a $(1+\varepsilon)$-approximation to the number of distinct elements using minimal communication. While prior work establishes a worst-case bound of $Θ\left(α\log n+\frac{α}{\varepsilon^2}\right)$ bits, these results rely on assumptions that may not hold in practice. We introduce a new parameterization based on the number $C = \frac{β}{\varepsilon^2}$ of pairwise collisions, i.e., instances where the same element appears on multiple servers, and design a protocol that uses only $\mathcal{O}\left(α\log n+\frac{\sqrt{β}}{\varepsilon^2} \log n\right)$ bits, breaking previous lower bounds when $C$ is small. We further improve our algorithm under assumptions on the number of distinct elements or collisions and provide matching lower bounds in all regimes, establishing $C$ as a tight complexity measure for the problem. Finally, we consider streaming algorithms for distinct element estimation parameterized by the number of items with frequency larger than $1$. Overall, our results offer insight into why statistical problems with known hardness results can be efficiently solved in practice.
Submitted 27 June, 2025;
originally announced June 2025.
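As a concrete reading of the collision parameter defined above (an instance where the same element appears on multiple servers, counted pairwise), the short Python sketch below counts, for each universe element held by $m$ servers, the $\binom{m}{2}$ colliding server pairs. This is an illustrative interpretation for intuition, not code from the paper.

```python
from collections import Counter
from math import comb

def pairwise_collisions(server_sets):
    """Count pairwise collisions: an element held by m servers contributes
    C(m, 2) colliding pairs. Each server's input is treated as a set."""
    counts = Counter()
    for s in server_sets:
        counts.update(set(s))
    return sum(comb(m, 2) for m in counts.values())

servers = [{1, 2, 3}, {2, 3, 4}, {3, 5}]
print(pairwise_collisions(servers))   # element 2 -> 1 pair, element 3 -> 3 pairs => 4
```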
-
HSG-12M: A Large-Scale Benchmark of Spatial Multigraphs from the Energy Spectra of Non-Hermitian Crystals
Authors:
Xianquan Yan,
Hakan Akgün,
Kenji Kawaguchi,
N. Duane Loh,
Ching Hua Lee
Abstract:
AI is transforming scientific research by revealing new ways to understand complex physical systems, but its impact remains constrained by the lack of large, high-quality domain-specific datasets. A rich, largely untapped resource lies in non-Hermitian quantum physics, where the energy spectra of crystals form intricate geometries on the complex plane -- termed as Hamiltonian spectral graphs. Despite their significance as fingerprints for electronic behavior, their systematic study has been intractable due to the reliance on manual extraction. To unlock this potential, we introduce Poly2Graph: a high-performance, open-source pipeline that automates the mapping of 1-D crystal Hamiltonians to spectral graphs. Using this tool, we present HSG-12M: a dataset containing 11.6 million static and 5.1 million dynamic Hamiltonian spectral graphs across 1401 characteristic-polynomial classes, distilled from 177 TB of spectral potential data. Crucially, HSG-12M is the first large-scale dataset of spatial multigraphs -- graphs embedded in a metric space where multiple geometrically distinct trajectories between two nodes are retained as separate edges. This simultaneously addresses a critical gap, as existing graph benchmarks overwhelmingly assume simple, non-spatial edges, discarding vital geometric information. Benchmarks with popular GNNs expose new challenges in learning spatial multi-edges at scale. Beyond its practical utility, we show that spectral graphs serve as universal topological fingerprints of polynomials, vectors, and matrices, forging a new algebra-to-graph link. HSG-12M lays the groundwork for data-driven scientific discovery in condensed matter physics, new opportunities in geometry-aware graph learning and beyond.
Submitted 4 March, 2026; v1 submitted 10 June, 2025;
originally announced June 2025.
-
DiagNet: Detecting Objects using Diagonal Constraints on Adjacency Matrix of Graph Neural Network
Authors:
Chong Hyun Lee,
Kibae Lee
Abstract:
We propose DiagNet, a new approach to object detection with which we can detect an object bounding box using diagonal constraints on the adjacency matrix of a graph convolutional network (GCN). We propose two diagonalization algorithms based on hard and soft constraints on the adjacency matrix, and two loss functions using a diagonal constraint and a complementary constraint. DiagNet eliminates the need to design the set of anchor boxes commonly used. To demonstrate the feasibility of our novel detector, we adopt the detection heads of YOLO models. Experiments show that DiagNet achieves 7.5% higher mAP50 on Pascal VOC than YOLOv1. DiagNet also shows 5.1% higher mAP on MS COCO than YOLOv3u, 3.7% higher mAP than YOLOv5u, and 2.9% higher mAP than YOLOv8.
Submitted 4 June, 2025;
originally announced June 2025.
-
Surfer-H Meets Holo1: Cost-Efficient Web Agent Powered by Open Weights
Authors:
Mathieu Andreux,
Breno Baldas Skuk,
Hamza Benchekroun,
Emilien Biré,
Antoine Bonnet,
Riaz Bordie,
Nathan Bout,
Matthias Brunel,
Pierre-Louis Cedoz,
Antoine Chassang,
Mickaël Chen,
Alexandra D. Constantinou,
Antoine d'Andigné,
Hubert de La Jonquière,
Aurélien Delfosse,
Ludovic Denoyer,
Alexis Deprez,
Augustin Derupti,
Michael Eickenberg,
Mathïs Federico,
Charles Kantor,
Xavier Koegler,
Yann Labbé,
Matthew C. H. Lee,
Erwan Le Jumeau de Kergaradec
, et al. (19 additional authors not shown)
Abstract:
We present Surfer-H, a cost-efficient web agent that integrates Vision-Language Models (VLM) to perform user-defined tasks on the web. We pair it with Holo1, a new open-weight collection of VLMs specialized in web navigation and information extraction. Holo1 was trained on carefully curated data sources, including open-access web content, synthetic examples, and self-produced agentic data. Holo1 tops generalist User Interface (UI) benchmarks as well as our new web UI localization benchmark, WebClick. When powered by Holo1, Surfer-H achieves a 92.2% state-of-the-art performance on WebVoyager, striking a Pareto-optimal balance between accuracy and cost-efficiency. To accelerate research advancement in agentic systems, we are open-sourcing both our WebClick evaluation dataset and the Holo1 model weights.
Submitted 11 June, 2025; v1 submitted 3 June, 2025;
originally announced June 2025.
-
Pseudorandom bits for non-commutative programs
Authors:
Chin Ho Lee,
Emanuele Viola
Abstract:
We obtain new explicit pseudorandom generators for several computational models involving groups. Our main results are as follows:
1. We consider read-once group-products over a finite group $G$, i.e., tests of the form $\prod_{i=1}^n g_i^{x_i}$ where $g_i\in G$, a special case of read-once permutation branching programs. We give generators with optimal seed length $c_G \log(n/\varepsilon)$ over any $p$-group. The proof uses the small-bias plus noise paradigm, but derandomizes the noise to avoid the recursion in previous work. Our generator works when the bits are read in any order. Previously for any non-commutative group the best seed length was $\ge\log n\log(1/\varepsilon)$, even for a fixed order.
2. We give a reduction that "lifts" suitable generators for group products over $G$ to a generator that fools width-$w$ block products, i.e., tests of the form $\prod g_i^{f_i}$ where the $f_i$ are arbitrary functions on disjoint blocks of $w$ bits. Block products generalize several previously studied classes. The reduction applies to groups that are mixing in a representation-theoretic sense that we identify.
3. Combining (2) with (1) and other works we obtain new generators for block products over the quaternions or over any commutative group, with nearly optimal seed length. In particular, we obtain generators for read-once polynomials modulo any fixed $m$ with nearly optimal seed length. Previously this was known only for $m=2$.
4. We give a new generator for products over "mixing groups." The construction departs from previous work and uses representation theory. For constant error, we obtain optimal seed length, improving on previous work (which applied to any group).
This paper identifies a challenge in the area that is reminiscent of a roadblock in circuit complexity -- handling composite moduli -- and points to several classes of groups to be attacked next.
Submitted 4 June, 2025; v1 submitted 2 June, 2025;
originally announced June 2025.
-
Novel Extraction of Discriminative Fine-Grained Feature to Improve Retinal Vessel Segmentation
Authors:
Shuang Zeng,
Chee Hong Lee,
Micky C Nnamdi,
Wenqi Shi,
J Ben Tamo,
Lei Zhu,
Hangzhou He,
Xinliang Zhang,
Qian Chen,
May D. Wang,
Yanye Lu,
Qiushi Ren
Abstract:
Retinal vessel segmentation is a vital early detection method for several severe ocular diseases. Despite significant progress in retinal vessel segmentation with the advancement of Neural Networks, there are still challenges to overcome. Specifically, retinal vessel segmentation aims to predict the class label for every pixel within a fundus image, with a primary focus on intra-image discrimination, making it vital for models to extract more discriminative features. Nevertheless, existing methods primarily focus on minimizing the difference between the output from the decoder and the label, but fail to fully exploit fine-grained feature-level representations from the encoder. To address these issues, we propose a novel Attention U-shaped Kolmogorov-Arnold Network named AttUKAN along with a novel Label-guided Pixel-wise Contrastive Loss for retinal vessel segmentation. Specifically, we implement Attention Gates into Kolmogorov-Arnold Networks to enhance model sensitivity by suppressing irrelevant feature activations and model interpretability by non-linear modeling of KAN blocks. Additionally, we also design a novel Label-guided Pixel-wise Contrastive Loss to supervise our proposed AttUKAN to extract more discriminative features by distinguishing between foreground vessel-pixel pairs and background pairs. Experiments are conducted on four public datasets, DRIVE, STARE, CHASE_DB1 and HRF, as well as our private dataset. AttUKAN achieves F1 scores of 82.50%, 81.14%, 81.34%, 80.21% and 80.09%, along with MIoU scores of 70.24%, 68.64%, 68.59%, 67.21% and 66.94% on these datasets, which are the highest among 11 networks for retinal vessel segmentation. Quantitative and qualitative results show that our AttUKAN achieves state-of-the-art performance and outperforms existing retinal vessel segmentation methods. Our code will be available at https://github.com/stevezs315/AttUKAN.
Submitted 6 May, 2025;
originally announced May 2025.
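The abstract describes a Label-guided Pixel-wise Contrastive Loss that contrasts foreground vessel-pixel pairs against background pairs, but its precise definition is not given here. The PyTorch sketch below is therefore a simplified supervised InfoNCE-style stand-in: sample pixel embeddings, treat same-label pixels as positives and different-label pixels as negatives. The function name, sampling scheme, and temperature are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def label_guided_pixel_contrastive(features, labels, n_samples=256, temperature=0.1):
    """Simplified supervised pixel-wise contrastive term: sample pixel embeddings,
    use same-label pixels (vessel vs. background) as positives and different-label
    pixels as negatives, and apply an InfoNCE-style loss.

    features: (B, D, H, W) encoder features; labels: (B, H, W) binary masks.
    """
    B, D, H, W = features.shape
    feats = features.permute(0, 2, 3, 1).reshape(-1, D)
    labs = labels.reshape(-1)
    idx = torch.randperm(feats.shape[0])[:n_samples]
    z = F.normalize(feats[idx], dim=1)
    y = labs[idx]
    sim = z @ z.t() / temperature                       # pairwise similarities
    pos = (y.unsqueeze(0) == y.unsqueeze(1)).float()
    pos.fill_diagonal_(0)                               # exclude self-pairs from positives
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    denom = pos.sum(1).clamp(min=1)
    return -(pos * log_prob).sum(1).div(denom).mean()

# Toy usage.
loss = label_guided_pixel_contrastive(torch.randn(2, 32, 64, 64),
                                       (torch.rand(2, 64, 64) > 0.5).long())
print(loss.item())
```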
-
On Learning Parallel Pancakes with Mostly Uniform Weights
Authors:
Ilias Diakonikolas,
Daniel M. Kane,
Sushrut Karmalkar,
Jasper C. H. Lee,
Thanasis Pittas
Abstract:
We study the complexity of learning $k$-mixtures of Gaussians ($k$-GMMs) on $\mathbb{R}^d$. This task is known to have complexity $d^{Ω(k)}$ in full generality. To circumvent this exponential lower bound on the number of components, research has focused on learning families of GMMs satisfying additional structural properties. A natural assumption posits that the component weights are not exponentially small and that the components have the same unknown covariance. Recent work gave a $d^{O(\log(1/w_{\min}))}$-time algorithm for this class of GMMs, where $w_{\min}$ is the minimum weight. Our first main result is a Statistical Query (SQ) lower bound showing that this quasi-polynomial upper bound is essentially best possible, even for the special case of uniform weights. Specifically, we show that it is SQ-hard to distinguish between such a mixture and the standard Gaussian. We further explore how the distribution of weights affects the complexity of this task. Our second main result is a quasi-polynomial upper bound for the aforementioned testing task when most of the weights are uniform while a small fraction of the weights are potentially arbitrary.
Submitted 21 April, 2025;
originally announced April 2025.
-
EdiText: Controllable Coarse-to-Fine Text Editing with Diffusion Language Models
Authors:
Che Hyun Lee,
Heeseung Kim,
Jiheum Yeom,
Sungroh Yoon
Abstract:
We propose EdiText, a controllable text editing method that modifies the reference text to desired attributes at various scales. We integrate an SDEdit-based editing technique that allows for broad adjustments in the degree of text editing. Additionally, we introduce a novel fine-level editing method based on self-conditioning, which allows subtle control of reference text. While being capable of editing on its own, this fine-grained method, integrated with the SDEdit approach, enables EdiText to make precise adjustments within the desired range. EdiText demonstrates its controllability to robustly adjust reference text at a broad range of levels across various tasks, including toxicity control and sentiment control.
Submitted 2 June, 2025; v1 submitted 27 February, 2025;
originally announced February 2025.
-
Does Your Voice Assistant Remember? Analyzing Conversational Context Recall and Utilization in Voice Interaction Models
Authors:
Heeseung Kim,
Che Hyun Lee,
Sangkwon Park,
Jiheum Yeom,
Nohil Park,
Sangwon Yu,
Sungroh Yoon
Abstract:
Recent advancements in multi-turn voice interaction models have improved user-model communication. However, while closed-source models effectively retain and recall past utterances, whether open-source models share this ability remains unexplored. To fill this gap, we systematically evaluate how well open-source interaction models utilize past utterances using ContextDialog, a benchmark we proposed for this purpose. Our findings show that speech-based models have more difficulty than text-based ones, especially when recalling information conveyed in speech, and even with retrieval-augmented generation, models still struggle with questions about past utterances. These insights highlight key limitations in open-source models and suggest ways to improve memory retention and retrieval robustness.
Submitted 23 May, 2025; v1 submitted 26 February, 2025;
originally announced February 2025.
-
Machine Learning Optimal Ordering in Global Routing Problems in Semiconductors
Authors:
Heejin Choi,
Minji Lee,
Chang Hyeong Lee,
Jaeho Yang,
Rak-Kyeong Seong
Abstract:
In this work, we propose a new method for ordering nets during the layer-assignment step of global routing problems. The global routing problems that we focus on in this work arise in the design of substrates in multilayered semiconductor packages. The proposed method is based on machine learning techniques, and we show that it outperforms conventional net ordering techniques based on heuristic score functions. We perform global routing experiments in multilayered semiconductor package environments to illustrate that the routing order produced by our technique outperforms previous heuristic-based methods. Our approach of using machine learning for global routing specifically targets the net ordering step, which we show can be significantly improved by deep learning.
Submitted 30 December, 2024;
originally announced December 2024.
-
Game Theoretic Liquidity Provisioning in Concentrated Liquidity Market Makers
Authors:
Weizhao Tang,
Rachid El-Azouzi,
Cheng Han Lee,
Ethan Chan,
Giulia Fanti
Abstract:
Automated market makers (AMMs) are a class of decentralized exchanges that enable the automated trading of digital assets. They accept deposits of digital tokens from liquidity providers (LPs); tokens can be used by traders to execute trades, which generate fees for the investing LPs. The distinguishing feature of AMMs is that trade prices are determined algorithmically, unlike classical limit order books. Concentrated liquidity market makers (CLMMs) are a major class of AMMs that offer liquidity providers flexibility to decide not only how much liquidity to provide, but in what ranges of prices they want the liquidity to be used. This flexibility can complicate strategic planning, since fee rewards are shared among LPs. We formulate and analyze a game theoretic model to study the incentives of LPs in CLMMs. Our main results show that while our original formulation admits multiple Nash equilibria and has complexity quadratic in the number of price ticks in the contract, it can be reduced to a game with a unique Nash equilibrium whose complexity is only linear. We further show that the Nash equilibrium of this simplified game follows a waterfilling strategy, in which low-budget LPs use up their full budget, but rich LPs do not. Finally, by fitting our game model to real-world CLMMs, we observe that in liquidity pools with risky assets, LPs adopt investment strategies far from the Nash equilibrium. Under price uncertainty, they generally invest in fewer and wider price ranges than our analysis suggests, with lower-frequency liquidity updates. We show that across several pools, by updating their strategy to more closely match the Nash equilibrium of our game, LPs can improve their median daily returns by $116, which corresponds to an increase of 0.009% in median daily return on investment.
Submitted 15 November, 2024;
originally announced November 2024.
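The waterfilling structure mentioned above (low-budget LPs invest everything, rich LPs stop at a common level) can be illustrated with the generic budget-capped waterfilling computation below. The aggregate target and the bisection solver are illustrative assumptions for building intuition, not the paper's equilibrium characterization.

```python
import numpy as np

def waterfill(budgets, total):
    """Illustrative water-filling with per-player budget caps: find a common level
    lam such that sum_i min(budget_i, lam) == total. Players with small budgets
    contribute everything; richer players are capped at the common level."""
    budgets = np.asarray(budgets, dtype=float)
    assert 0 <= total <= budgets.sum()
    lo, hi = 0.0, budgets.max()
    for _ in range(100):                      # bisection on the water level
        lam = (lo + hi) / 2
        if np.minimum(budgets, lam).sum() < total:
            lo = lam
        else:
            hi = lam
    return np.minimum(budgets, lam)

print(waterfill([1.0, 2.0, 10.0, 50.0], total=20.0))
# small-budget LPs invest their whole budget; the two largest stop at the water level
```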
-
NanoVoice: Efficient Speaker-Adaptive Text-to-Speech for Multiple Speakers
Authors:
Nohil Park,
Heeseung Kim,
Che Hyun Lee,
Jooyoung Choi,
Jiheum Yeom,
Sungroh Yoon
Abstract:
We present NanoVoice, a personalized text-to-speech model that efficiently constructs voice adapters for multiple speakers simultaneously. NanoVoice introduces a batch-wise speaker adaptation technique capable of fine-tuning multiple references in parallel, significantly reducing training time. Beyond building separate adapters for each speaker, we also propose a parameter sharing technique that reduces the number of parameters used for speaker adaptation. By incorporating a novel trainable scale matrix, NanoVoice mitigates potential performance degradation during parameter sharing. NanoVoice achieves performance comparable to the baselines, while training 4 times faster and using 45 percent fewer parameters for speaker adaptation with 40 reference voices. Extensive ablation studies and analysis further validate the efficiency of our model.
Submitted 20 December, 2024; v1 submitted 24 September, 2024;
originally announced September 2024.
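The abstract mentions batch-wise adaptation of multiple speakers, adapter parameter sharing, and a trainable scale matrix, without giving the exact parameterization. The PyTorch sketch below is one hypothetical way such a combination could look: a single low-rank adapter shared across speakers plus a small per-speaker scale vector, applied per example so several speakers can be fine-tuned in one batch. The module name and shapes are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class SharedAdapterLinear(nn.Module):
    """Hypothetical sketch: one low-rank adapter (A, B) shared across all speakers,
    combined with a per-speaker trainable scale vector so each speaker only owns
    rank-many extra parameters."""
    def __init__(self, base: nn.Linear, rank: int, n_speakers: int):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)                               # frozen pre-trained weight
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scale = nn.Parameter(torch.ones(n_speakers, rank))   # per-speaker scales

    def forward(self, x, speaker_ids):
        # x: (batch, in_features); speaker_ids: (batch,) indices into the scale table
        delta = (x @ self.A.t()) * self.scale[speaker_ids]         # (batch, rank)
        return self.base(x) + delta @ self.B.t()

layer = SharedAdapterLinear(nn.Linear(64, 64), rank=4, n_speakers=40)
out = layer(torch.randn(8, 64), torch.randint(0, 40, (8,)))
print(out.shape)
```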
-
VoiceGuider: Enhancing Out-of-Domain Performance in Parameter-Efficient Speaker-Adaptive Text-to-Speech via Autoguidance
Authors:
Jiheum Yeom,
Heeseung Kim,
Jooyoung Choi,
Che Hyun Lee,
Nohil Park,
Sungroh Yoon
Abstract:
When applying parameter-efficient finetuning via LoRA onto speaker adaptive text-to-speech models, adaptation performance may decline compared to full-finetuned counterparts, especially for out-of-domain speakers. Here, we propose VoiceGuider, a parameter-efficient speaker adaptive text-to-speech system reinforced with autoguidance to enhance the speaker adaptation performance, reducing the gap against full-finetuned models. We carefully explore various ways of strengthening autoguidance, ultimately finding the optimal strategy. VoiceGuider as a result shows robust adaptation performance especially on extreme out-of-domain speech data. We provide audible samples in our demo page.
Submitted 20 December, 2024; v1 submitted 24 September, 2024;
originally announced September 2024.
-
Boosting uniformity in quasirandom groups: fast and simple
Authors:
Harm Derksen,
Chin Ho Lee,
Emanuele Viola
Abstract:
We study the communication complexity of multiplying $k\times t$ elements from the group $H=\text{SL}(2,q)$ in the number-on-forehead model with $k$ parties. We prove a lower bound of $(t\log H)/c^{k}$. This is an exponential improvement over previous work, and matches the state-of-the-art in the area.
Relatedly, we show that the convolution of $k^{c}$ independent copies of a 3-uniform distribution over $H^{m}$ is close to a $k$-uniform distribution. This is again an exponential improvement over previous work which needed $c^{k}$ copies. The proofs are remarkably simple; the results extend to other quasirandom groups.
We also show that for any group $H$, any distribution over $H^{m}$ whose weight-$k$ Fourier coefficients are small is close to a $k$-uniform distribution. This generalizes previous work in the abelian setting, and the proof is simpler.
Submitted 10 September, 2024;
originally announced September 2024.
-
VoiceTailor: Lightweight Plug-In Adapter for Diffusion-Based Personalized Text-to-Speech
Authors:
Heeseung Kim,
Sang-gil Lee,
Jiheum Yeom,
Che Hyun Lee,
Sungwon Kim,
Sungroh Yoon
Abstract:
We propose VoiceTailor, a parameter-efficient speaker-adaptive text-to-speech (TTS) system, by equipping a pre-trained diffusion-based TTS model with a personalized adapter. VoiceTailor identifies pivotal modules that benefit from the adapter based on a weight change ratio analysis. We utilize Low-Rank Adaptation (LoRA) as a parameter-efficient adaptation method and incorporate the adapter into pivotal modules of the pre-trained diffusion decoder. To achieve powerful adaptation performance with few parameters, we explore various guidance techniques for speaker adaptation and investigate the best strategies to strengthen speaker information. VoiceTailor demonstrates comparable speaker adaptation performance to existing adaptive TTS models by fine-tuning only 0.25% of the total parameters. VoiceTailor shows strong robustness when adapting to a wide range of real-world speakers, as shown in the demo.
Submitted 27 August, 2024; v1 submitted 26 August, 2024;
originally announced August 2024.
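A minimal sketch of the "weight change ratio analysis" used above for module selection, under the assumption (not stated in the abstract) that the ratio is the relative Frobenius-norm change between a pre-trained and a fully fine-tuned checkpoint; modules with the largest ratios would then receive LoRA adapters.

```python
import torch

def weight_change_ratios(pretrained_state, finetuned_state):
    """Hypothetical module-selection heuristic: rank parameters by the relative
    weight change ||W_ft - W_pre|| / ||W_pre|| between a fine-tuned model and its
    pre-trained initialization."""
    ratios = {}
    for name, w_pre in pretrained_state.items():
        w_ft = finetuned_state[name]
        ratios[name] = (torch.norm(w_ft - w_pre) / (torch.norm(w_pre) + 1e-12)).item()
    return sorted(ratios.items(), key=lambda kv: kv[1], reverse=True)

# Toy usage with two random "checkpoints".
pre = {"attn.q": torch.randn(8, 8), "attn.v": torch.randn(8, 8), "ffn.w1": torch.randn(8, 8)}
ft = {k: v + 0.1 * torch.randn_like(v) for k, v in pre.items()}
for name, r in weight_change_ratios(pre, ft)[:2]:
    print(name, round(r, 4))
```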
-
Screen Them All: High-Throughput Pan-Cancer Genetic and Phenotypic Biomarker Screening from H&E Whole Slide Images
Authors:
Yi Kan Wang,
Ludmila Tydlitatova,
Jeremy D. Kunz,
Gerard Oakley,
Bonnie Kar Bo Chow,
Ran A. Godrich,
Matthew C. H. Lee,
Hamed Aghdam,
Alican Bozkurt,
Michal Zelechowski,
Chad Vanderbilt,
Christopher Kanan,
Juan A. Retamero,
Peter Hamilton,
Razik Yousfi,
Thomas J. Fuchs,
David S. Klimstra,
Siqi Liu
Abstract:
Molecular assays are standard of care for detecting genomic alterations in cancer prognosis and therapy selection but are costly, tissue-destructive and time-consuming. Artificial intelligence (AI) applied to routine hematoxylin and eosin (H&E)-stained whole slide images (WSIs) offers a fast and economical alternative for screening molecular biomarkers. We introduce OmniScreen, a high-throughput AI-based system leveraging Virchow2 embeddings extracted from 60,529 cancer patients with paired 489-gene MSK-IMPACT targeted biomarker panel and WSIs. Unlike conventional approaches that train separate models for each biomarker, OmniScreen employs a unified model to predict a broad range of clinically relevant biomarkers across cancers, including low-prevalence targets impractical to model individually. OmniScreen reliably identifies therapeutic targets and shared phenotypic features across common and rare tumors. We investigate the biomarker prediction probabilities and accuracies of OmniScreen in relation to tumor area, cohort size, histologic subtype alignment, and pathway-level morphological patterns. These findings underscore the potential of OmniScreen for routine clinical screening.
Submitted 14 July, 2025; v1 submitted 18 August, 2024;
originally announced August 2024.
-
Pseudorandomness, symmetry, smoothing: II
Authors:
Harm Derksen,
Peter Ivanov,
Chin Ho Lee,
Emanuele Viola
Abstract:
We prove several new results on the Hamming weight of bounded uniform and small-bias distributions.
We exhibit bounded-uniform distributions whose weight is anti-concentrated, matching existing concentration inequalities. This construction relies on a recent result in approximation theory due to Erdélyi (Acta Arithmetica 2016). In particular, we match the classical tail bounds, generalizing a result by Bun and Steinke (RANDOM 2015). Also, we improve on a construction by Benjamini, Gurel-Gurevich, and Peled (2012).
We give a generic transformation that converts any bounded uniform distribution to a small-bias distribution that almost preserves its weight distribution. Applying this transformation in conjunction with the above results and others, we construct small-bias distributions with various weight restrictions. In particular, we match the concentration that follows from that of bounded uniformity and the generic closeness of small-bias and bounded-uniform distributions, answering a question by Bun and Steinke (RANDOM 2015). Moreover, these distributions are supported on only a constant number of Hamming weights.
We further extend the anti-concentration constructions to small-bias distributions perturbed with noise, a class that has received much attention recently in derandomization. Our results imply (but are not implied by) a recent result of the authors (CCC 2024), and are based on different techniques. In particular, we prove that the standard Gaussian distribution is far from any mixture of Gaussians with bounded variance.
Submitted 16 July, 2024;
originally announced July 2024.
-
Trace reconstruction from local statistical queries
Authors:
Xi Chen,
Anindya De,
Chin Ho Lee,
Rocco A. Servedio
Abstract:
The goal of trace reconstruction is to reconstruct an unknown $n$-bit string $x$ given only independent random traces of $x$, where a random trace of $x$ is obtained by passing $x$ through a deletion channel. A Statistical Query (SQ) algorithm for trace reconstruction is an algorithm which can only access statistical information about the distribution of random traces of $x$ rather than individual traces themselves. Such an algorithm is said to be $\ell$-local if each of its statistical queries corresponds to an $\ell$-junta function over some block of $\ell$ consecutive bits in the trace. Since several -- but not all -- known algorithms for trace reconstruction fall under the local statistical query paradigm, it is interesting to understand the abilities and limitations of local SQ algorithms for trace reconstruction.
In this paper we establish nearly-matching upper and lower bounds on local Statistical Query algorithms for both worst-case and average-case trace reconstruction. For the worst-case problem, we show that there is an $\tilde{O}(n^{1/5})$-local SQ algorithm that makes all its queries with tolerance $τ\geq 2^{-\tilde{O}(n^{1/5})}$, and also that any $\tilde{O}(n^{1/5})$-local SQ algorithm must make some query with tolerance $τ\leq 2^{-\tilde{Ω}(n^{1/5})}$. For the average-case problem, we show that there is an $O(\log n)$-local SQ algorithm that makes all its queries with tolerance $τ\geq 1/\mathrm{poly}(n)$, and also that any $O(\log n)$-local SQ algorithm must make some query with tolerance $τ\leq 1/\mathrm{poly}(n)$.
Submitted 15 July, 2024;
originally announced July 2024.
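For readers unfamiliar with the setup, the Python sketch below generates random traces through a deletion channel and evaluates one $\ell$-local statistical query, i.e., an expectation of a function of $\ell$ consecutive trace bits estimated from samples. Parameters are toy values, and this is background illustration rather than the paper's algorithm.

```python
import random

def random_trace(x, delete_prob=0.5, rng=random):
    """Pass the bit string x through a deletion channel: each bit is independently
    deleted with probability delete_prob, and the survivors are concatenated."""
    return [b for b in x if rng.random() >= delete_prob]

def local_query(traces, block_start, pattern):
    """An ell-local statistical query: the fraction of traces whose bits in the
    block [block_start, block_start + len(pattern)) equal the given pattern."""
    ell = len(pattern)
    hits = sum(1 for t in traces if t[block_start:block_start + ell] == pattern)
    return hits / len(traces)

x = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1] * 10          # toy 100-bit "unknown" string
traces = [random_trace(x) for _ in range(5000)]
print(local_query(traces, block_start=0, pattern=[1, 0]))
```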
-
Segmenting Medical Images with Limited Data
Authors:
Zhaoshan Liu,
Qiujie Lv,
Chau Hung Lee,
Lei Shen
Abstract:
While computer vision has proven valuable for medical image segmentation, its application faces challenges such as limited dataset sizes and the complexity of effectively leveraging unlabeled images. To address these challenges, we present a novel semi-supervised, consistency-based approach termed the data-efficient medical segmenter (DEMS). The DEMS features an encoder-decoder architecture and incorporates the developed online automatic augmenter (OAA) and residual robustness enhancement (RRE) blocks. The OAA augments input data with various image transformations, thereby diversifying the dataset to improve the generalization ability. The RRE enriches feature diversity and introduces perturbations to create varied inputs for different decoders, thereby providing enhanced variability. Moreover, we introduce a sensitive loss to further enhance consistency across different decoders and stabilize the training process. Extensive experimental results on our own dataset and three public datasets affirm the effectiveness of DEMS. Under extreme data shortage scenarios, our DEMS achieves 16.85% and 10.37% improvement in dice score compared with the U-Net and the top-performing state-of-the-art method, respectively. Given its superior data efficiency, DEMS could present significant advancements in medical segmentation under small data regimes. The project homepage can be accessed at https://github.com/NUS-Tim/DEMS.
Submitted 12 July, 2024;
originally announced July 2024.
-
Pseudorandomness, symmetry, smoothing: I
Authors:
Harm Derksen,
Peter Ivanov,
Chin Ho Lee,
Emanuele Viola
Abstract:
We prove several new results about bounded uniform and small-bias distributions. A main message is that small-bias, even perturbed with noise, does not fool several classes of tests better than bounded uniformity. We prove this for threshold tests, small-space algorithms, and small-depth circuits. In particular, we obtain small-bias distributions that
1) achieve an optimal lower bound on their statistical distance to any bounded-uniform distribution. This closes a line of research initiated by Alon, Goldreich, and Mansour in 2003, and improves on a result by O'Donnell and Zhao.
2) have heavier tail mass than the uniform distribution. This answers a question posed by several researchers including Bun and Steinke.
3) rule out a popular paradigm for constructing pseudorandom generators, originating in a 1989 work by Ajtai and Wigderson. This again answers a question raised by several researchers. For branching programs, our result matches a bound by Forbes and Kelley.
Our small-bias distributions above are symmetric. We show that the xor of any two symmetric small-bias distributions fools any bounded function. Hence our examples cannot be extended to the xor of two small-bias distributions, another popular paradigm whose power remains unknown. We also generalize and simplify the proof of a result of Bazzi.
Submitted 21 May, 2024;
originally announced May 2024.
-
Modeling Considerations for Developing Deep Space Autonomous Spacecraft and Simulators
Authors:
Christopher Agia,
Guillem Casadesus Vila,
Saptarshi Bandyopadhyay,
David S. Bayard,
Kar-Ming Cheung,
Charles H. Lee,
Eric Wood,
Ian Aenishanslin,
Steven Ardito,
Lorraine Fesq,
Marco Pavone,
Issa A. D. Nesnas
Abstract:
To extend the limited scope of autonomy used in prior missions for operation in distant and complex environments, there is a need to further develop and mature autonomy that jointly reasons over multiple subsystems, which we term system-level autonomy. System-level autonomy establishes situational awareness that resolves conflicting information across subsystems, which may necessitate the refinement and interconnection of the underlying spacecraft and environment onboard models. However, with a limited understanding of the assumptions and tradeoffs of modeling to arbitrary extents, designing onboard models to support system-level capabilities presents a significant challenge.
In this paper, we provide a detailed analysis of the increasing levels of model fidelity for several key spacecraft subsystems, with the goal of informing future spacecraft functional- and system-level autonomy algorithms and the physics-based simulators on which they are validated. We do not argue for the adoption of a particular fidelity class of models but, instead, highlight the potential tradeoffs and opportunities associated with the use of models for onboard autonomy and in physics-based simulators at various fidelity levels. We ground our analysis in the context of deep space exploration of small bodies, an emerging frontier for autonomous spacecraft operation in space, where the choice of models employed onboard the spacecraft may determine mission success. We conduct our experiments in the Multi-Spacecraft Concept and Autonomy Tool (MuSCAT), a software suite for developing spacecraft autonomy algorithms.
Submitted 20 January, 2024;
originally announced January 2024.
-
Clustering Mixtures of Bounded Covariance Distributions Under Optimal Separation
Authors:
Ilias Diakonikolas,
Daniel M. Kane,
Jasper C. H. Lee,
Thanasis Pittas
Abstract:
We study the clustering problem for mixtures of bounded covariance distributions, under a fine-grained separation assumption. Specifically, given samples from a $k$-component mixture distribution $D = \sum_{i=1}^k w_i P_i$, where each $w_i \ge α$ for some known parameter $α$, and each $P_i$ has unknown covariance $Σ_i \preceq σ^2_i \cdot I_d$ for some unknown $σ_i$, the goal is to cluster the samples assuming a pairwise mean separation in the order of $(σ_i+σ_j)/\sqrt{α}$ between every pair of components $P_i$ and $P_j$. Our contributions are as follows:
For the special case of nearly uniform mixtures, we give the first poly-time algorithm for this clustering task. Prior work either required separation scaling with the maximum cluster standard deviation (i.e. $\max_i σ_i$) [DKK+22b] or required both additional structural assumptions and mean separation scaling as a large degree polynomial in $1/α$ [BKK22].
For general-weight mixtures, we point out that accurate clustering is information-theoretically impossible under our fine-grained mean separation assumptions. We introduce the notion of a clustering refinement -- a list of not-too-small subsets satisfying a similar separation, and which can be merged into a clustering approximating the ground truth -- and show that it is possible to efficiently compute an accurate clustering refinement of the samples. Furthermore, under a variant of the "no large sub-cluster" condition from prior work [BKK22], we show that our algorithm outputs an accurate clustering, not just a refinement, even for general-weight mixtures. As a corollary, we obtain efficient clustering algorithms for mixtures of well-conditioned high-dimensional log-concave distributions.
Moreover, our algorithm is robust to $Ω(α)$-fraction of adversarial outliers.
Submitted 18 December, 2023;
originally announced December 2023.
-
Optimality in Mean Estimation: Beyond Worst-Case, Beyond Sub-Gaussian, and Beyond $1+α$ Moments
Authors:
Trung Dang,
Jasper C. H. Lee,
Maoyuan Song,
Paul Valiant
Abstract:
There is growing interest in improving our algorithmic understanding of fundamental statistical problems such as mean estimation, driven by the goal of understanding the limits of what we can extract from valuable data. The state-of-the-art results for mean estimation in $\mathbb{R}$ are 1) the optimal sub-Gaussian mean estimator by [LV22], with the tight sub-Gaussian constant for all distributions with finite but unknown variance, and 2) the analysis of the median-of-means algorithm by [BCL13] and a lower bound by [DLLO16], characterizing the big-O optimal errors for distributions for which only a $1+α$ moment exists for $α\in (0,1)$. Both results, however, are optimal only in the worst case. We initiate the fine-grained study of the mean estimation problem: Can algorithms leverage useful features of the input distribution to beat the sub-Gaussian rate, without explicit knowledge of such features?
We resolve this question with an unexpectedly nuanced answer: "Yes in limited regimes, but in general no". For any distribution $p$ with a finite mean, we construct a distribution $q$ whose mean is well-separated from $p$'s, yet $p$ and $q$ are not distinguishable with high probability, and $q$ further preserves $p$'s moments up to constants. The main consequence is that no reasonable estimator can asymptotically achieve better than the sub-Gaussian error rate for any distribution, matching the worst-case result of [LV22]. More generally, we introduce a new definitional framework to analyze the fine-grained optimality of algorithms, which we call "neighborhood optimality", interpolating between the unattainably strong "instance optimality" and the trivially weak "admissibility" definitions. Applying the new framework, we show that median-of-means is neighborhood optimal, up to constant factors. It is open to find a neighborhood-optimal estimator without constant factor slackness.
Submitted 21 November, 2023;
originally announced November 2023.
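Since the analysis above centers on the classical median-of-means estimator, here is a standard NumPy implementation of it on heavy-tailed toy data; the group count and data distribution are illustrative choices, not ones taken from the paper.

```python
import numpy as np

def median_of_means(samples, n_groups=10):
    """Classical median-of-means estimator: split the samples into groups, average
    each group, and return the median of the group means."""
    samples = np.asarray(samples, dtype=float)
    groups = np.array_split(np.random.permutation(samples), n_groups)
    return float(np.median([g.mean() for g in groups]))

# Heavy-tailed toy data: Pareto samples have a finite mean but a heavy upper tail.
rng = np.random.default_rng(0)
data = rng.pareto(a=2.5, size=10_000) + 1.0      # true mean = a/(a-1) ≈ 1.667
print(median_of_means(data), data.mean())
```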
-
Two-Stage Predict+Optimize for Mixed Integer Linear Programs with Unknown Parameters in Constraints
Authors:
Xinyi Hu,
Jasper C. H. Lee,
Jimmy H. M. Lee
Abstract:
Consider the setting of constrained optimization, with some parameters unknown at solving time and requiring prediction from relevant features. Predict+Optimize is a recent framework for end-to-end training of supervised learning models for such predictions, incorporating information about the optimization problem in the training process in order to yield better predictions in terms of the quality of the predicted solution under the true parameters. Almost all prior works have focused on the special case where the unknowns appear only in the optimization objective and not the constraints. Hu et al. proposed the first adaptation of Predict+Optimize to handle unknowns appearing in constraints, but the framework has somewhat ad-hoc elements, and they provided a training algorithm only for covering and packing linear programs. In this work, we give a new simpler and more powerful framework called Two-Stage Predict+Optimize, which we believe should be the canonical framework for the Predict+Optimize setting. We also give a training algorithm usable for all mixed integer linear programs, vastly generalizing the applicability of the framework. Experimental results demonstrate the superior prediction performance of our training framework over all classical and state-of-the-art methods.
Submitted 14 November, 2023;
originally announced November 2023.
-
AI Nushu: An Exploration of Language Emergence in Sisterhood -Through the Lens of Computational Linguistics
Authors:
Yuqian Sun,
Yuying Tang,
Ze Gao,
Zhijun Pan,
Chuyan Xu,
Yurou Chen,
Kejiang Qian,
Zhigang Wang,
Tristan Braud,
Chang Hee Lee,
Ali Asadipour
Abstract:
This paper presents "AI Nushu," an emerging language system inspired by Nushu (women's script), the unique language created and used exclusively by ancient Chinese women who were thought to be illiterate under a patriarchal society. In this interactive installation, two artificial intelligence (AI) agents are trained on the Chinese dictionary and the Nushu corpus. By continually observing their environment and communicating, these agents collaborate towards creating a standard writing system to encode Chinese. The installation offers an artistic interpretation of the creation of a non-Western script from a computational linguistics perspective, integrating AI technology with Chinese cultural heritage and a feminist viewpoint.
Submitted 18 October, 2023;
originally announced October 2023.
-
Fictional Worlds, Real Connections: Developing Community Storytelling Social Chatbots through LLMs
Authors:
Yuqian Sun,
Hanyi Wang,
Pok Man Chan,
Morteza Tabibi,
Yan Zhang,
Huan Lu,
Yuheng Chen,
Chang Hee Lee,
Ali Asadipour
Abstract:
We address the integration of storytelling and Large Language Models (LLMs) to develop engaging and believable Social Chatbots (SCs) in community settings. Motivated by the potential of fictional characters to enhance social interactions, we introduce Storytelling Social Chatbots (SSCs) and the concept of story engineering to transform fictional game characters into "live" social entities within player communities. Our story engineering process includes three steps: (1) Character and story creation, defining the SC's personality and worldview, (2) Presenting Live Stories to the Community, allowing the SC to recount challenges and seek suggestions, and (3) Communication with community members, enabling interaction between the SC and users. We employed the LLM GPT-3 to drive our SSC prototypes, "David" and "Catherine," and evaluated their performance in an online gaming community, "DE (Alias)," on Discord. Our mixed-method analysis, based on questionnaires (N=15) and interviews (N=8) with community members, reveals that storytelling significantly enhances the engagement and believability of SCs in community settings.
Submitted 20 September, 2023;
originally announced September 2023.
-
Virchow: A Million-Slide Digital Pathology Foundation Model
Authors:
Eugene Vorontsov,
Alican Bozkurt,
Adam Casson,
George Shaikovski,
Michal Zelechowski,
Siqi Liu,
Kristen Severson,
Eric Zimmermann,
James Hall,
Neil Tenenholtz,
Nicolo Fusi,
Philippe Mathieu,
Alexander van Eck,
Donghun Lee,
Julian Viret,
Eric Robert,
Yi Kan Wang,
Jeremy D. Kunz,
Matthew C. H. Lee,
Jan Bernhard,
Ran A. Godrich,
Gerard Oakley,
Ewan Millar,
Matthew Hanna,
Juan Retamero
, et al. (6 additional authors not shown)
Abstract:
The use of artificial intelligence to enable precision medicine and decision support systems through the analysis of pathology images has the potential to revolutionize the diagnosis and treatment of cancer. Such applications will depend on models' abilities to capture the diverse patterns observed in pathology images. To address this challenge, we present Virchow, a foundation model for computational pathology. Using self-supervised learning empowered by the DINOv2 algorithm, Virchow is a vision transformer model with 632 million parameters trained on 1.5 million hematoxylin and eosin stained whole slide images from diverse tissue and specimen types, which is orders of magnitude more data than previous works. The Virchow model enables the development of a pan-cancer detection system with 0.949 overall specimen-level AUC across 17 different cancer types, while also achieving 0.937 AUC on 7 rare cancer types. The Virchow model sets the state of the art on internal and external image tile-level benchmarks and slide-level biomarker prediction tasks. The gains in performance highlight the importance of training on massive pathology image datasets, suggesting that scaling up the data and network architecture can improve accuracy for many high-impact computational pathology applications where limited amounts of training data are available.
Submitted 17 January, 2024; v1 submitted 14 September, 2023;
originally announced September 2023.
-
Language as Reality: A Co-Creative Storytelling Game Experience in 1001 Nights using Generative AI
Authors:
Yuqian Sun,
Zhouyi Li,
Ke Fang,
Chang Hee Lee,
Ali Asadipour
Abstract:
In this paper, we present "1001 Nights", an AI-native game that allows players to lead in-game reality through co-created storytelling with a character driven by a large language model. The concept is inspired by Wittgenstein's idea that the limits of one's world are determined by the bounds of their language. Using advanced AI tools like GPT-4 and Stable Diffusion, the second iteration of the game enables the protagonist, Shahrzad, to realize words and stories in her world. The player can steer the conversation with the AI King towards specific keywords, which then become battle equipment in the game. This blend of interactive narrative and text-to-image transformation challenges the conventional border between the game world and reality through a dual perspective. We focus on Shahrzad, who seeks to alter her fate compared to the original folklore, and on the player, who collaborates with AI to craft narratives and shape the game world. We explore the technical and design elements of implementing such a game, with the aim of enhancing the narrative game genre with AI-generated content and delving into AI-native gameplay possibilities.
Submitted 18 September, 2023; v1 submitted 24 August, 2023;
originally announced August 2023.
-
Finite-Sample Symmetric Mean Estimation with Fisher Information Rate
Authors:
Shivam Gupta,
Jasper C. H. Lee,
Eric Price
Abstract:
The mean of an unknown variance-$σ^2$ distribution $f$ can be estimated from $n$ samples with variance $\frac{σ^2}{n}$ and nearly corresponding subgaussian rate. When $f$ is known up to translation, this can be improved asymptotically to $\frac{1}{n\mathcal I}$, where $\mathcal I$ is the Fisher information of the distribution. Such an improvement is not possible for general unknown $f$, but [Stone, 1975] showed that this asymptotic convergence $\textit{is}$ possible if $f$ is $\textit{symmetric}$ about its mean. Stone's bound is asymptotic, however: the $n$ required for convergence depends in an unspecified way on the distribution $f$ and failure probability $δ$. In this paper we give finite-sample guarantees for symmetric mean estimation in terms of Fisher information. For every $f, n, δ$ with $n > \log \frac{1}δ$, we get convergence close to a subgaussian with variance $\frac{1}{n \mathcal I_r}$, where $\mathcal I_r$ is the $r$-$\textit{smoothed}$ Fisher information with smoothing radius $r$ that decays polynomially in $n$. Such a bound essentially matches the finite-sample guarantees in the known-$f$ setting.
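Here the $r$-smoothed Fisher information refers to the Fisher information of a smoothed version of $f$; under Gaussian smoothing, an assumption made here purely for concreteness, it can be written as
\[
  f_r \;=\; f \ast \mathcal{N}(0, r^2),
  \qquad
  \mathcal{I}_r \;=\; \int_{\mathbb{R}} \frac{\bigl(f_r'(x)\bigr)^2}{f_r(x)}\, dx ,
\]
so that $\mathcal{I}_r \to \mathcal{I}$ as the smoothing radius $r \to 0$ for sufficiently regular $f$.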
Submitted 28 June, 2023;
originally announced June 2023.
-
A Spectral Algorithm for List-Decodable Covariance Estimation in Relative Frobenius Norm
Authors:
Ilias Diakonikolas,
Daniel M. Kane,
Jasper C. H. Lee,
Ankit Pensia,
Thanasis Pittas
Abstract:
We study the problem of list-decodable Gaussian covariance estimation. Given a multiset $T$ of $n$ points in $\mathbb R^d$ such that an unknown $α<1/2$ fraction of points in $T$ are i.i.d. samples from an unknown Gaussian $\mathcal{N}(μ, Σ)$, the goal is to output a list of $O(1/α)$ hypotheses at least one of which is close to $Σ$ in relative Frobenius norm. Our main result is a $\mathrm{poly}(d,1/α)$ sample and time algorithm for this task that guarantees relative Frobenius norm error of $\mathrm{poly}(1/α)$. Importantly, our algorithm relies purely on spectral techniques. As a corollary, we obtain an efficient spectral algorithm for robust partial clustering of Gaussian mixture models (GMMs) -- a key ingredient in the recent work of [BDJ+22] on robustly learning arbitrary GMMs. Combined with the other components of [BDJ+22], our new method yields the first Sum-of-Squares-free algorithm for robustly learning GMMs. At the technical level, we develop a novel multi-filtering method for list-decodable covariance estimation that may be useful in other settings.
Submitted 1 May, 2023;
originally announced May 2023.
-
Edit-A-Video: Single Video Editing with Object-Aware Consistency
Authors:
Chaehun Shin,
Heeseung Kim,
Che Hyun Lee,
Sang-gil Lee,
Sungroh Yoon
Abstract:
Although text-to-video (TTV) models have recently achieved remarkable success, few approaches have extended TTV to video editing. Motivated by TTV approaches that adapt diffusion-based text-to-image (TTI) models, we propose a video editing framework, which we term Edit-A-Video, that requires only a pretrained TTI model and a single <text, video> pair. The framework consists of two stages: (1) inflating the 2D model into a 3D model by appending temporal modules and tuning on the source video, and (2) inverting the source video into noise and editing with the target text prompt and attention map injection. Each stage enables temporal modeling and preservation of the semantic attributes of the source video. One of the key challenges in video editing is background inconsistency, where regions not targeted by the edit suffer undesirable and inconsistent temporal alterations. To mitigate this issue, we also introduce a novel mask blending method, termed sparse-causal blending (SC Blending). We improve on previous mask blending methods to account for temporal consistency, so that the edited area exhibits smooth transitions while the unedited regions retain spatio-temporal consistency. We present extensive experimental results over various types of text and videos, and demonstrate the superiority of the proposed method compared to baselines in terms of background consistency, text alignment, and video editing quality.
Submitted 17 November, 2023; v1 submitted 14 March, 2023;
originally announced March 2023.
-
Branch & Learn with Post-hoc Correction for Predict+Optimize with Unknown Parameters in Constraints
Authors:
Xinyi Hu,
Jasper C. H. Lee,
Jimmy H. M. Lee
Abstract:
Combining machine learning and constrained optimization, Predict+Optimize tackles optimization problems containing parameters that are unknown at the time of solving. Prior works focus on cases with unknowns only in the objectives. A new framework was recently proposed to cater for unknowns also in constraints by introducing a loss function, called Post-hoc Regret, that takes into account the cost of correcting an unsatisfiable prediction. Since Post-hoc Regret is non-differentiable, the previous work computes only its approximation. While the notion of Post-hoc Regret is general, its specific implementation is applicable to only packing and covering linear programming problems. In this paper, we first show how to compute Post-hoc Regret exactly for any optimization problem solvable by a recursive algorithm satisfying simple conditions. Experimentation demonstrates substantial improvement in the quality of solutions as compared to the earlier approximation approach. Furthermore, we study experimentally the behavior of different combinations of correction and penalty functions used in the Post-hoc Regret on the same benchmarks. The results provide insights for defining the appropriate Post-hoc Regret in different application scenarios.
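As a rough schematic (the notation is illustrative and the paper's precise definition may differ in details), for a minimization problem with true parameters $θ$, predicted parameters $\hat{θ}$, estimated-optimal solution $x^*(\hat{θ})$, correction function $\mathrm{Corr}$, and penalty function $\mathrm{Pen}$, the Post-hoc Regret has the form
\[
\mathrm{PReg}(\hat{θ}, θ) \;=\; \mathrm{obj}\bigl(\mathrm{Corr}(x^*(\hat{θ}), θ),\, θ\bigr) \;-\; \mathrm{obj}\bigl(x^*(θ),\, θ\bigr) \;+\; \mathrm{Pen}\bigl(x^*(\hat{θ}),\, \mathrm{Corr}(x^*(\hat{θ}), θ)\bigr),
\]
i.e. the suboptimality of the corrected solution under the true parameters, plus the cost of the correction itself.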
Submitted 12 March, 2023;
originally announced March 2023.
-
High-dimensional Location Estimation via Norm Concentration for Subgamma Vectors
Authors:
Shivam Gupta,
Jasper C. H. Lee,
Eric Price
Abstract:
In location estimation, we are given $n$ samples from a known distribution $f$ shifted by an unknown translation $λ$, and want to estimate $λ$ as precisely as possible. Asymptotically, the maximum likelihood estimate achieves the Cramér-Rao bound of error $\mathcal N(0, \frac{1}{n\mathcal I})$, where $\mathcal I$ is the Fisher information of $f$. However, the $n$ required for convergence depends on $f$, and may be arbitrarily large. We build on the theory using \emph{smoothed} estimators to bound the error for finite $n$ in terms of $\mathcal I_r$, the Fisher information of the $r$-smoothed distribution. As $n \to \infty$, $r \to 0$ at an explicit rate and this converges to the Cramér-Rao bound. We (1) improve the prior work for 1-dimensional $f$ to converge for constant failure probability in addition to high probability, and (2) extend the theory to high-dimensional distributions. In the process, we prove a new bound on the norm of a high-dimensional random variable whose 1-dimensional projections are subgamma, which may be of independent interest.
Submitted 5 February, 2023;
originally announced February 2023.
-
Outlier-Robust Sparse Mean Estimation for Heavy-Tailed Distributions
Authors:
Ilias Diakonikolas,
Daniel M. Kane,
Jasper C. H. Lee,
Ankit Pensia
Abstract:
We study the fundamental task of outlier-robust mean estimation for heavy-tailed distributions in the presence of sparsity. Specifically, given a small number of corrupted samples from a high-dimensional heavy-tailed distribution whose mean $μ$ is guaranteed to be sparse, the goal is to efficiently compute a hypothesis that accurately approximates $μ$ with high probability. Prior work had obtained efficient algorithms for robust sparse mean estimation of light-tailed distributions. In this work, we give the first sample-efficient and polynomial-time robust sparse mean estimator for heavy-tailed distributions under mild moment assumptions. Our algorithm achieves the optimal asymptotic error using a number of samples scaling logarithmically with the ambient dimension. Importantly, the sample complexity of our method is optimal as a function of the failure probability $τ$, having an additive $\log(1/τ)$ dependence. Our algorithm leverages the stability-based approach from the algorithmic robust statistics literature, with crucial (and necessary) adaptations required in our setting. Our analysis may be of independent interest, involving the delicate design of a (non-spectral) decomposition for positive semi-definite matrices satisfying certain sparsity properties.
Submitted 29 November, 2022;
originally announced November 2022.
-
Approximate Trace Reconstruction from a Single Trace
Authors:
Xi Chen,
Anindya De,
Chin Ho Lee,
Rocco A. Servedio,
Sandip Sinha
Abstract:
The well-known trace reconstruction problem is the problem of inferring an unknown source string $x \in \{0,1\}^n$ from independent "traces", i.e. copies of $x$ that have been corrupted by a $δ$-deletion channel which independently deletes each bit of $x$ with probability $δ$ and concatenates the surviving bits. The current paper considers the extreme data-limited regime in which only a single trace is provided to the reconstruction algorithm. In this setting exact reconstruction is of course impossible, and the question is to what accuracy the source string $x$ can be approximately reconstructed.
We give a detailed study of this question, providing algorithms and lower bounds for the high, intermediate, and low deletion rate regimes in both the worst-case ($x$ is arbitrary) and average-case ($x$ is drawn uniformly from $\{0,1\}^n$) models. In several cases the lower bounds we establish are matched by computationally efficient algorithms that we provide.
We highlight our results for the high deletion rate regime: roughly speaking, they show that
- Having access to a single trace is already quite useful for worst-case trace reconstruction: given even a single trace only a few bits long, an efficient algorithm can reconstruct much more accurately than it could given no traces at all. But in contrast,
- in the average-case setting, having access to a single trace is provably not very useful: no algorithm, computationally efficient or otherwise, can achieve significantly higher accuracy given one trace that is $o(n)$ bits long than it could with no traces.
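For concreteness, the $δ$-deletion channel described above is easy to simulate; the sketch below illustrates only the channel model, not the reconstruction algorithms.

```python
import random

def deletion_channel(x: str, delta: float, rng: random.Random = random) -> str:
    """Pass a binary string through a delta-deletion channel.

    Each bit of x is deleted independently with probability delta; the
    surviving bits are concatenated, in order, to form the trace.
    """
    return "".join(b for b in x if rng.random() >= delta)

# Example: a single trace of a random 20-bit source string at deletion rate 0.5.
rng = random.Random(0)
source = "".join(rng.choice("01") for _ in range(20))
trace = deletion_channel(source, delta=0.5, rng=rng)
print(source, "->", trace)
```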
Submitted 6 November, 2022;
originally announced November 2022.
-
Predict+Optimize for Packing and Covering LPs with Unknown Parameters in Constraints
Authors:
Xinyi Hu,
Jasper C. H. Lee,
Jimmy H. M. Lee
Abstract:
Predict+Optimize is a recently proposed framework which combines machine learning and constrained optimization, tackling optimization problems that contain parameters that are unknown at solving time. The goal is to predict the unknown parameters and use the estimates to solve for an estimated optimal solution to the optimization problem. However, all prior works have focused on the case where unknown parameters appear only in the optimization objective and not the constraints, for the simple reason that if the constraints were not known exactly, the estimated optimal solution might not even be feasible under the true parameters. The contributions of this paper are two-fold. First, we propose a novel and practically relevant framework for the Predict+Optimize setting, but with unknown parameters in both the objective and the constraints. We introduce the notion of a correction function, and an additional penalty term in the loss function, modelling practical scenarios where an estimated optimal solution can be modified into a feasible solution after the true parameters are revealed, but at an additional cost. Second, we propose a corresponding algorithmic approach for our framework, which handles all packing and covering linear programs. Our approach is inspired by the prior work of Mandi and Guns, though with crucial modifications and re-derivations for our very different setting. Experimentation demonstrates the superior empirical performance of our method over classical approaches.
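To make the setting concrete, the toy sketch below solves a packing LP under predicted constraint capacities and then applies a simple correction once the true capacities are revealed. The data, the scaling-based correction, and the SciPy solver are illustrative choices, not the paper's algorithm.

```python
import numpy as np
from scipy.optimize import linprog

# A toy packing LP: maximize c^T x  subject to  A x <= b,  x >= 0.
# The capacities b are unknown at solving time and must be predicted.
c = np.array([3.0, 2.0])
A = np.array([[1.0, 1.0],
              [2.0, 1.0]])
b_true = np.array([4.0, 6.0])
b_pred = np.array([5.0, 7.0])      # an over-optimistic prediction, for illustration

def solve_packing(b):
    # linprog minimizes, so negate c to maximize.
    res = linprog(-c, A_ub=A, b_ub=b, bounds=[(0, None)] * len(c), method="highs")
    return res.x

x_est = solve_packing(b_pred)      # solution under the predicted capacities

# One simple correction function for packing LPs: uniformly scale the solution
# down until it satisfies the true constraints (a hypothetical choice; the
# framework allows other application-specific corrections and penalties).
scale = min(1.0, float(np.min(b_true / np.maximum(A @ x_est, 1e-12))))
x_corr = scale * x_est

print("estimated:", x_est, "corrected:", x_corr, "objective:", c @ x_corr)
```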
Submitted 8 September, 2022;
originally announced September 2022.
-
Recent Progress in Transformer-based Medical Image Analysis
Authors:
Zhaoshan Liu,
Qiujie Lv,
Ziduo Yang,
Yifan Li,
Chau Hung Lee,
Lei Shen
Abstract:
The transformer is primarily used in the field of natural language processing. Recently, it has been adopted and shows promise in the computer vision (CV) field. Medical image analysis (MIA), as a critical branch of CV, also greatly benefits from this state-of-the-art technique. In this review, we first recap the core component of the transformer, the attention mechanism, and the detailed structures of the transformer. After that, we review the recent progress of the transformer in the field of MIA. We organize the applications in a sequence of different tasks, including classification, segmentation, captioning, registration, detection, enhancement, localization, and synthesis. The mainstream classification and segmentation tasks are further divided into eleven medical image modalities. The large number of experiments surveyed in this review illustrates that transformer-based methods outperform existing methods across multiple evaluation metrics. Finally, we discuss the open challenges and future opportunities in this field. This task- and modality-oriented review, with up-to-date content, detailed information, and comprehensive comparisons, may greatly benefit the broad MIA community.
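As a reminder of the attention mechanism that the review recaps, a minimal single-head scaled dot-product attention sketch:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Single-head scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                   # (n_q, n_k) similarity scores
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # row-wise softmax
    return weights @ V                                # weighted sum of value vectors

# Toy example: 4 query tokens, 6 key/value tokens, embedding dimension 8.
rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(4, 8)), rng.normal(size=(6, 8)), rng.normal(size=(6, 8))
print(scaled_dot_product_attention(Q, K, V).shape)    # (4, 8)
```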
Submitted 25 July, 2023; v1 submitted 13 August, 2022;
originally announced August 2022.
-
Finite-Sample Maximum Likelihood Estimation of Location
Authors:
Shivam Gupta,
Jasper C. H. Lee,
Eric Price,
Paul Valiant
Abstract:
We consider 1-dimensional location estimation, where we estimate a parameter $λ$ from $n$ samples $λ+ η_i$, with each $η_i$ drawn i.i.d. from a known distribution $f$. For fixed $f$ the maximum-likelihood estimate (MLE) is well-known to be optimal in the limit as $n \to \infty$: it is asymptotically normal with variance matching the Cramér-Rao lower bound of $\frac{1}{n\mathcal{I}}$, where $\mathcal{I}$ is the Fisher information of $f$. However, this bound does not hold for finite $n$, or when $f$ varies with $n$. We show for arbitrary $f$ and $n$ that one can recover a similar theory based on the Fisher information of a smoothed version of $f$, where the smoothing radius decays with $n$.
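A minimal numerical sketch of the smoothed-likelihood idea: maximize the likelihood of the samples under a smoothed version of $f$. Gaussian smoothing, the quadrature below, and the decay rate of $r$ are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def smoothed_mle_location(samples, log_density, r):
    """Estimate a location parameter by maximizing the likelihood under the
    r-smoothed density f_r = f * N(0, r^2), approximated by quadrature."""
    samples = np.asarray(samples, dtype=float)

    def log_f_r(t):
        # Quadrature approximation of log (f * N(0, r^2))(t).
        z = np.linspace(-5 * r, 5 * r, 201)
        kernel = np.exp(-0.5 * (z / r) ** 2)
        kernel /= kernel.sum()
        return np.log(np.maximum(np.exp(log_density(t - z)) @ kernel, 1e-300))

    def neg_log_lik(lam):
        return -sum(log_f_r(x - lam) for x in samples)

    res = minimize_scalar(neg_log_lik, bounds=(samples.min(), samples.max()),
                          method="bounded")
    return res.x

# Example: Laplace noise, true location 2.0, smoothing radius decaying with n.
rng = np.random.default_rng(1)
n = 200
x = 2.0 + rng.laplace(scale=1.0, size=n)
r = n ** (-0.25)                       # an illustrative polynomially-decaying radius
log_laplace = lambda t: -np.abs(t) - np.log(2.0)
print(smoothed_mle_location(x, log_laplace, r))
```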
Submitted 18 July, 2022; v1 submitted 6 June, 2022;
originally announced June 2022.
-
Branch & Learn for Recursively and Iteratively Solvable Problems in Predict+Optimize
Authors:
Xinyi Hu,
Jasper C. H. Lee,
Jimmy H. M. Lee,
Allen Z. Zhong
Abstract:
This paper proposes Branch & Learn, a framework for Predict+Optimize to tackle optimization problems containing parameters that are unknown at the time of solving. Given an optimization problem solvable by a recursive algorithm satisfying simple conditions, we show how a corresponding learning algorithm can be constructed directly and methodically from the recursive algorithm. Our framework also applies to iterative algorithms by viewing them as a degenerate form of recursion. Extensive experimentation shows that our proposal outperforms classical and state-of-the-art approaches.
Submitted 1 May, 2022;
originally announced May 2022.
-
Semi-supervised anomaly detection algorithm based on KL divergence (SAD-KL)
Authors:
Chong Hyun Lee,
Kibae Lee
Abstract:
Unlabeled data are generally assumed to be normal when detecting abnormal data via semi-supervised learning. This assumption, however, causes inevitable detection errors when the distribution of the unlabeled data differs from that of the labeled normal data. To deal with the problem caused by this distribution gap, we propose a semi-supervised anomaly detection algorithm using KL divergence (SAD-KL). The proposed SAD-KL is composed of two steps: (1) estimating the KL divergence between the probability density functions (PDFs) of the local outlier factors (LOFs) of the labeled normal data and the unlabeled data, and (2) estimating the detection probability and threshold for detecting normal data among the unlabeled data using the KL divergence. We show that the PDFs of the LOFs follow a Burr distribution and use them for detection. Once the threshold is computed, SAD-KL runs iteratively until the labeling change rate falls below a predefined threshold. Experimental results show that SAD-KL achieves superior detection probability over existing algorithms while requiring less learning time.
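A rough sketch of the first step: computing LOF scores for both sets and estimating the KL divergence between their score distributions. scikit-learn's LOF and a histogram-based KL estimate stand in here for the Burr-distribution fits used in the paper.

```python
import numpy as np
from scipy.stats import entropy
from sklearn.neighbors import LocalOutlierFactor

def lof_scores(data, n_neighbors=20):
    """Local outlier factor of each point (larger means more outlying)."""
    lof = LocalOutlierFactor(n_neighbors=n_neighbors)
    lof.fit(data)
    return -lof.negative_outlier_factor_

def kl_between_lof_pdfs(labeled_normal, unlabeled, bins=50):
    """KL divergence between the LOF-score distributions of the two sets,
    estimated from shared-bin histograms (an illustrative simplification)."""
    s_lab = lof_scores(labeled_normal)
    s_unl = lof_scores(unlabeled)
    lo, hi = min(s_lab.min(), s_unl.min()), max(s_lab.max(), s_unl.max())
    p, edges = np.histogram(s_lab, bins=bins, range=(lo, hi), density=True)
    q, _ = np.histogram(s_unl, bins=edges, density=True)
    eps = 1e-12
    return float(entropy(p + eps, q + eps))

rng = np.random.default_rng(0)
normal = rng.normal(size=(500, 2))
unlabeled = np.vstack([rng.normal(size=(450, 2)), rng.normal(4.0, 1.0, size=(50, 2))])
print(kl_between_lof_pdfs(normal, unlabeled))
```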
Submitted 28 March, 2022;
originally announced March 2022.
-
GSDA: Generative Adversarial Network-based Semi-Supervised Data Augmentation for Ultrasound Image Classification
Authors:
Zhaoshan Liu,
Qiujie Lv,
Chau Hung Lee,
Lei Shen
Abstract:
Medical Ultrasound (US) is one of the most widely used imaging modalities in clinical practice, but its usage presents unique challenges such as variable imaging quality. Deep Learning (DL) models can serve as advanced medical US image analysis tools, but their performance is greatly limited by the scarcity of large datasets. To address this common data shortage, we develop GSDA, a Generative Adversarial Network (GAN)-based semi-supervised data augmentation method. GSDA consists of a GAN and a Convolutional Neural Network (CNN). The GAN synthesizes and pseudo-labels high-resolution, high-quality US images, and both real and synthesized images are then leveraged to train the CNN. To address the challenge of training both the GAN and the CNN with limited data, we employ transfer learning techniques during their training. We also introduce a novel evaluation standard that balances classification accuracy with computational time. We evaluate our method on the BUSI dataset, and GSDA outperforms existing state-of-the-art methods. With the high-resolution and high-quality images synthesized, GSDA achieves 97.9% accuracy using merely 780 images. Given these promising results, we believe that GSDA holds potential as an auxiliary tool for medical US analysis.
Submitted 5 October, 2023; v1 submitted 11 March, 2022;
originally announced March 2022.
-
Near-Optimal Average-Case Approximate Trace Reconstruction from Few Traces
Authors:
Xi Chen,
Anindya De,
Chin Ho Lee,
Rocco A. Servedio,
Sandip Sinha
Abstract:
In the standard trace reconstruction problem, the goal is to \emph{exactly} reconstruct an unknown source string $\mathsf{x} \in \{0,1\}^n$ from independent "traces", which are copies of $\mathsf{x}$ that have been corrupted by a $δ$-deletion channel which independently deletes each bit of $\mathsf{x}$ with probability $δ$ and concatenates the surviving bits. We study the \emph{approximate} trace reconstruction problem, in which the goal is only to obtain a high-accuracy approximation of $\mathsf{x}$ rather than an exact reconstruction.
We give an efficient algorithm, and a near-matching lower bound, for approximate reconstruction of a random source string $\mathsf{x} \in \{0,1\}^n$ from few traces. Our main algorithmic result is a polynomial-time algorithm with the following property: for any deletion rate $0 < δ< 1$ (which may depend on $n$), for almost every source string $\mathsf{x} \in \{0,1\}^n$, given any number $M \leq Θ(1/δ)$ of traces from $\mathrm{Del}_δ(\mathsf{x})$, the algorithm constructs a hypothesis string $\widehat{\mathsf{x}}$ that has edit distance at most $n \cdot (δM)^{Ω(M)}$ from $\mathsf{x}$. We also prove a near-matching information-theoretic lower bound showing that given $M \leq Θ(1/δ)$ traces from $\mathrm{Del}_δ(\mathsf{x})$ for a random $n$-bit string $\mathsf{x}$, the smallest possible expected edit distance that any algorithm can achieve, regardless of its running time, is $n \cdot (δM)^{O(M)}$.
Submitted 25 August, 2021; v1 submitted 24 July, 2021;
originally announced July 2021.
-
Fourier growth of structured $\mathbb{F}_2$-polynomials and applications
Authors:
Jarosław Błasiok,
Peter Ivanov,
Yaonan Jin,
Chin Ho Lee,
Rocco A. Servedio,
Emanuele Viola
Abstract:
We analyze the Fourier growth, i.e. the $L_1$ Fourier weight at level $k$ (denoted $L_{1,k}$), of various well-studied classes of "structured" $\mathbb{F}_2$-polynomials. This study is motivated by applications in pseudorandomness, in particular recent results and conjectures due to [CHHL19,CHLT19,CGLSS20] which show that upper bounds on Fourier growth (even at level $k=2$) give unconditional pseudorandom generators.
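For reference, the level-$k$ $L_1$ Fourier weight of a function $f:\{-1,1\}^n \to \mathbb{R}$ is the standard quantity
\[
  L_{1,k}(f) \;=\; \sum_{S \subseteq [n],\; |S| = k} \bigl|\widehat{f}(S)\bigr|,
\]
where $\widehat{f}(S)$ are the Fourier coefficients over the $\pm 1$ character basis; as is standard in this line of work, an $\mathbb{F}_2$-polynomial $p$ is identified with its $0/1$-valued indicator when computing $L_{1,k}(p)$.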
Our main structural results on Fourier growth are as follows:
- We show that any symmetric degree-$d$ $\mathbb{F}_2$-polynomial $p$ has $L_{1,k}(p) \le \Pr[p=1] \cdot O(d)^k$, and this is tight for any constant $k$. This quadratically strengthens an earlier bound that was implicit in [RSV13].
- We show that any read-$Δ$ degree-$d$ $\mathbb{F}_2$-polynomial $p$ has $L_{1,k}(p) \le \Pr[p=1] \cdot (k Δd)^{O(k)}$.
- We establish a composition theorem which gives $L_{1,k}$ bounds on disjoint compositions of functions that are closed under restrictions and admit $L_{1,k}$ bounds.
Finally, we apply the above structural results to obtain new unconditional pseudorandom generators and new correlation bounds for various classes of $\mathbb{F}_2$-polynomials.
Submitted 11 October, 2024; v1 submitted 22 July, 2021;
originally announced July 2021.
-
Polynomial-time trace reconstruction in the low deletion rate regime
Authors:
Xi Chen,
Anindya De,
Chin Ho Lee,
Rocco A. Servedio,
Sandip Sinha
Abstract:
In the \emph{trace reconstruction problem}, an unknown source string $x \in \{0,1\}^n$ is transmitted through a probabilistic \emph{deletion channel} which independently deletes each bit with some fixed probability $δ$ and concatenates the surviving bits, resulting in a \emph{trace} of $x$. The problem is to reconstruct $x$ given access to independent traces.
Trace reconstruction of arbitrary (worst-case) strings is a challenging problem, with the current state of the art for poly$(n)$-time algorithms being the 2004 algorithm of Batu et al. [BKKM04]. This algorithm can reconstruct an arbitrary source string $x \in \{0,1\}^n$ in poly$(n)$ time provided that the deletion rate $δ$ satisfies $δ\leq n^{-(1/2 + \varepsilon)}$ for some $\varepsilon > 0$.
In this work we improve on the result of [BKKM04] by giving a poly$(n)$-time algorithm for trace reconstruction for any deletion rate $δ\leq n^{-(1/3 + \varepsilon)}$. Our algorithm works by alternating an alignment-based procedure, which we show effectively reconstructs portions of the source string that are not "highly repetitive", with a novel procedure that efficiently determines the length of highly repetitive subwords of the source string.
Submitted 7 December, 2020; v1 submitted 4 December, 2020;
originally announced December 2020.
-
Optimal Sub-Gaussian Mean Estimation in $\mathbb{R}$
Authors:
Jasper C. H. Lee,
Paul Valiant
Abstract:
We revisit the problem of estimating the mean of a real-valued distribution, presenting a novel estimator with sub-Gaussian convergence: intuitively, "our estimator, on any distribution, is as accurate as the sample mean is for the Gaussian distribution of matching variance." Crucially, in contrast to prior works, our estimator does not require prior knowledge of the variance, and works across the entire gamut of distributions with bounded variance, including those without any higher moments. Parameterized by the sample size $n$, the failure probability $δ$, and the variance $σ^2$, our estimator is accurate to within $σ\cdot(1+o(1))\sqrt{\frac{2\log\frac{1}δ}{n}}$, tight up to the $1+o(1)$ factor. Our estimator construction and analysis give a framework generalizable to other problems, tightly analyzing a sum of dependent random variables by viewing the sum implicitly as a 2-parameter $ψ$-estimator, and constructing bounds using mathematical programming and duality techniques.
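To illustrate the sub-Gaussian benchmark $σ\sqrt{2\log(1/δ)/n}$ that the estimator matches, the snippet below checks empirically, for Gaussian data and the plain sample mean (not the paper's estimator), how often the error exceeds that level.

```python
import numpy as np

# For Gaussian data, the sample mean exceeds the benchmark error level
# sigma * sqrt(2 * log(1/delta) / n) with probability at most about delta.
# The paper's estimator matches this level, up to 1 + o(1), on *any*
# bounded-variance distribution; the sample mean alone does not, and is
# used here only to illustrate the benchmark quantity itself.
rng = np.random.default_rng(0)
n, delta, sigma, trials = 1_000, 0.05, 2.0, 5_000

threshold = sigma * np.sqrt(2 * np.log(1 / delta) / n)
errors = np.abs(rng.normal(0.0, sigma, size=(trials, n)).mean(axis=1))
print("benchmark threshold:", threshold)
print("empirical exceedance rate:", (errors > threshold).mean(), "vs delta =", delta)
```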
Submitted 16 November, 2020;
originally announced November 2020.