-
Complex network analysis of pore structures in monodisperse granular materials with varied grain shapes
Authors:
Jie Qi,
Wenbin Fei,
Guillermo A. Narsilio
Abstract:
Understanding how pore structure influences flow and transport behaviour in granular materials is essential for addressing a wide range of geotechnical, hydraulic, and environmental challenges. These processes are largely shaped by the microscopic arrangement of particles and the interconnections between pores within the material. However, detailed insights into granular assemblies with diversified grain shapes remain scarce. This study introduces a comprehensive framework incorporating Discrete Element Method (DEM) simulations, image processing, pore-network modelling, and complex network theory to investigate the links between particle morphology and hydraulic behaviour. Monodisperse assemblies of natural sand particles with varied shapes are constructed in DEM, and pore networks are extracted through image processing and pore-network modelling. Complex network analysis is then applied to calculate structural metrics that reveal intrinsic relationships between pore microstructures and hydraulic properties. Our results demonstrate that particle morphology significantly impacts pore network characteristics, including pore and throat sizes, closeness centrality, and pore structure anisotropy, providing valuable insights into how pore structure influences transport properties.
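As an illustration of the network-analysis step, the sketch below builds a toy pore network and computes closeness centrality with networkx; the pore and throat values are invented placeholders, since in the study the networks come from DEM simulations and image processing.

```python
import networkx as nx

# Minimal sketch: treat pores as nodes and throats as edges, then compute
# one of the structural metrics named in the abstract. The toy network
# below is illustrative only.
pores = {0: 0.9, 1: 0.5, 2: 0.7, 3: 0.4}        # pore id -> radius
throats = [(0, 1, 0.30), (1, 2, 0.25), (2, 3, 0.20), (0, 2, 0.15)]

G = nx.Graph()
for pid, r in pores.items():
    G.add_node(pid, radius=r)
for u, v, r in throats:
    G.add_edge(u, v, radius=r, length=1.0)

closeness = nx.closeness_centrality(G, distance="length")
print(closeness)  # higher values: pores reachable through short paths
```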
Submitted 27 November, 2025;
originally announced November 2025.
-
Entropy-Guided Reasoning Compression
Authors:
Hourun Zhu,
Yang Gao,
Wenlong Fei,
Jiawei Li,
Huashan Sun
Abstract:
Large reasoning models have demonstrated remarkable performance on complex reasoning tasks, yet the excessive length of their chain-of-thought outputs remains a major practical bottleneck due to high computation cost and poor deployability. Existing compression methods have achieved partial success but overlook a crucial phenomenon in the training process: the entropy conflict. During compression training, entropy decreases, leading to shorter reasoning but limited exploration, while accuracy-oriented objectives increase entropy, lengthening reasoning chains. This conflict can trap the model in a local dilemma. Our analysis further reveals its origin: many high-entropy tokens are logical connectors that receive larger gradients and are encouraged by the performance objective, while the compression objective simultaneously penalizes these potentially redundant connectors; this opposing pressure is a direct source of the entropy conflict. To address these issues, we adopt an entropy-guided training framework. As entropy descends, the model is guided toward efficient reasoning by encouraging concise thought steps; as entropy rises, exploration is reinforced under the compact reasoning mode to improve robustness. Experiments on six mathematical benchmarks show that our method compresses reasoning length to 20% of the original while maintaining or even surpassing baseline accuracy. Code and models will be released publicly.
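A minimal sketch of how an entropy signal can arbitrate between the compression and accuracy objectives; the thresholds, the blending rule, and the function names are illustrative assumptions, not the paper's released implementation.

```python
import torch
import torch.nn.functional as F

def policy_entropy(logits: torch.Tensor) -> torch.Tensor:
    """Mean per-token entropy of the policy distribution.
    logits: (batch, seq_len, vocab)."""
    logp = F.log_softmax(logits, dim=-1)
    return -(logp.exp() * logp).sum(-1).mean()

def guided_loss(logits, task_loss, length_penalty,
                ent_low=0.8, ent_high=1.6, beta=0.1):
    """Blend accuracy and compression objectives based on entropy.
    When entropy falls (compression mode), emphasise conciseness;
    when it rises (exploration mode), relax the length penalty."""
    ent = policy_entropy(logits)
    if ent < ent_low:        # entropy descending: encourage concise steps
        w_len = 1.0
    elif ent > ent_high:     # entropy rising: reinforce exploration
        w_len = 0.1
    else:                    # interpolate inside the band
        w_len = 1.0 - (ent.item() - ent_low) / (ent_high - ent_low) * 0.9
    return task_loss + beta * w_len * length_penalty
```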
Submitted 24 November, 2025; v1 submitted 18 November, 2025;
originally announced November 2025.
-
NeuralDB: Scaling Knowledge Editing in LLMs to 100,000 Facts with Neural KV Database
Authors:
Weizhi Fei,
Hao Shi,
Jing Xu,
Jingchen Peng,
Jiazheng Li,
Jingzhao Zhang,
Bo Bai,
Wei Han,
Zhenyuan Chen,
Xueyan Niu
Abstract:
Efficiently editing knowledge stored in large language models (LLMs) enables model updates without large-scale training. One possible solution is Locate-and-Edit (L&E), which allows simultaneous modification of a massive number of facts. However, such editing may compromise the general abilities of LLMs and even result in forgetting edited facts when scaling up to thousands of edits. In this paper, we model existing linear L&E methods as querying a Key-Value (KV) database. From this perspective, we then propose NeuralDB, an editing framework that explicitly represents the edited facts as a neural KV database equipped with a non-linear gated retrieval module. In particular, the gated module operates only when inference involves the edited facts, effectively preserving the general abilities of LLMs. Comprehensive experiments involving the editing of 10,000 facts were conducted on the ZsRE and CounterFact datasets, using GPT2-XL, GPT-J (6B) and Llama-3 (8B). The results demonstrate that NeuralDB not only excels in editing efficacy, generalization, specificity, fluency, and consistency, but also preserves overall performance across six representative text understanding and generation tasks. Further experiments indicate that NeuralDB maintains its effectiveness even when scaled to 100,000 facts (50x more than in prior work).
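The following sketch illustrates the neural KV database idea under stated assumptions: a cosine-similarity key lookup and a threshold gate stand in for the paper's non-linear gated retrieval module, whose exact architecture is not specified in the abstract.

```python
import torch
import torch.nn as nn

class NeuralKVDB(nn.Module):
    """Toy neural KV database with a gated retrieval module. Keys and
    values hold edited facts; the gate fires only when the hidden state
    matches an edited fact, leaving all other inputs untouched."""
    def __init__(self, n_facts: int, d: int, tau: float = 0.7):
        super().__init__()
        self.keys = nn.Parameter(torch.randn(n_facts, d))
        self.values = nn.Parameter(torch.randn(n_facts, d))
        self.tau = tau  # gate threshold (assumed hyperparameter)

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # h: (batch, d) hidden states at the edited layer
        sim = torch.cosine_similarity(
            h.unsqueeze(1), self.keys.unsqueeze(0), dim=-1)  # (batch, n)
        best, idx = sim.max(dim=-1)
        gate = (best > self.tau).float().unsqueeze(-1)  # non-linear gate
        retrieved = self.values[idx]                    # nearest fact value
        return h + gate * (retrieved - h)   # replace only when gate fires
```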
Submitted 23 July, 2025;
originally announced July 2025.
-
Self-Guided Process Reward Optimization with Redefined Step-wise Advantage for Process Reinforcement Learning
Authors:
Wu Fei,
Hao Kong,
Shuxian Liang,
Yang Lin,
Yibo Yang,
Jing Tang,
Lei Chen,
Xiansheng Hua
Abstract:
Process Reinforcement Learning (PRL) has demonstrated considerable potential in enhancing the reasoning capabilities of Large Language Models (LLMs). However, introducing additional process reward models incurs substantial computational overhead, and there is no unified theoretical framework for process-level advantage estimation. To bridge this gap, we propose Self-Guided Process Reward Optimization (SPRO), a novel framework that enables process-aware RL through two key innovations: (1) we first theoretically demonstrate that process rewards can be derived intrinsically from the policy model itself, and (2) we introduce well-defined cumulative process rewards and Masked Step Advantage (MSA), which facilitates rigorous step-wise action advantage estimation within shared-prompt sampling groups. Our experimental results demonstrate that SPRO outperforms vanilla GRPO with 3.4x higher training efficiency and a 17.5% test accuracy improvement. Furthermore, SPRO maintains a stable and elevated policy entropy throughout training while reducing the average response length by approximately 1/3, evidencing sufficient exploration and prevention of reward hacking. Notably, SPRO incurs no additional computational overhead compared to outcome-supervised RL methods such as GRPO, which benefits industrial implementation.
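A toy rendering of step-wise advantage estimation within a shared-prompt sampling group; the exact definition of MSA may differ, so treat the baseline construction here as an illustrative assumption.

```python
import numpy as np

def masked_step_advantage(step_rewards: np.ndarray, mask: np.ndarray):
    """Step-wise advantage within a shared-prompt sampling group.
    step_rewards: (G, T) per-step process rewards for G responses to
    the same prompt; mask: (G, T) 1 for real steps, 0 for padding.
    A step's baseline is the group mean of cumulative rewards at that
    step, taken over responses that still have a valid step there."""
    cum = np.cumsum(step_rewards * mask, axis=1)          # cumulative reward
    denom = np.maximum(mask.sum(axis=0, keepdims=True), 1)
    baseline = (cum * mask).sum(axis=0, keepdims=True) / denom
    return (cum - baseline) * mask                        # masked advantage

# Example: 3 sampled responses, up to 4 reasoning steps each
r = np.array([[0.2, 0.1, 0.4, 0.0],
              [0.1, 0.3, 0.0, 0.0],
              [0.3, 0.2, 0.1, 0.2]])
m = np.array([[1, 1, 1, 0],
              [1, 1, 0, 0],
              [1, 1, 1, 1]])
print(masked_step_advantage(r, m))
```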
Submitted 3 July, 2025; v1 submitted 2 July, 2025;
originally announced July 2025.
-
Research on Optimal Control Problem Based on Reinforcement Learning under Knightian Uncertainty
Authors:
Ziyu Li,
Chen Fei,
Weiyin Fei
Abstract:
Considering that the decision-making environment faced by reinforcement learning (RL) agents is full of Knightian uncertainty, this paper formulates the exploratory state dynamics equation under Knightian uncertainty to study the entropy-regularized relaxed stochastic control problem in such an environment. By employing stochastic analysis theory and the dynamic programming principle under nonlinear expectation, we derive the Hamilton-Jacobi-Bellman (HJB) equation and solve for the optimal policy that achieves a trade-off between exploration and exploitation. Subsequently, for the linear-quadratic (LQ) case, we examine the agent's optimal randomized feedback control under both state-dependent and state-independent reward scenarios, proving that the optimal randomized feedback control follows a Gaussian distribution in the LQ framework. Furthermore, we investigate how the degree of Knightian uncertainty affects the variance of the optimal feedback policy. Additionally, we establish the solvability equivalence between non-exploratory and exploratory LQ problems under Knightian uncertainty and analyze the associated exploration cost. Finally, we provide an LQ example and validate the theoretical findings through numerical simulations.
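For orientation, a sketch of the entropy-regularized exploratory HJB equation in the classical single-prior setting; under Knightian uncertainty the expectation becomes nonlinear (an extra infimum over priors appears), so the display below is only indicative of the structure.

```latex
% Entropy-regularized exploratory HJB (classical, single-prior sketch);
% the Knightian version adds an infimum over the set of priors.
\rho v(x) = \sup_{\pi \in \mathcal{P}(A)} \int_A \Big(
    r(x,a) + b(x,a)\, v'(x) + \tfrac{1}{2}\sigma^2(x,a)\, v''(x)
    - \lambda \ln \pi(a) \Big)\, \pi(a)\, \mathrm{d}a,
\qquad
\pi^*(a \mid x) \propto \exp\!\Big\{ \tfrac{1}{\lambda}\big(
    r(x,a) + b(x,a)\, v'(x) + \tfrac{1}{2}\sigma^2(x,a)\, v''(x)
    \big)\Big\}.
```

In the LQ case the exponent is quadratic in the action, which is why the optimal randomized feedback control is Gaussian, consistent with the abstract.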
Submitted 16 June, 2025;
originally announced June 2025.
-
Efficient and Scalable Neural Symbolic Search for Knowledge Graph Complex Query Answering
Authors:
Weizhi Fei,
Zihao Wang,
Hang Yin,
Shukai Zhao,
Wei Zhang,
Yangqiu Song
Abstract:
Complex Query Answering (CQA) aims to retrieve answer sets for complex logical formulas from incomplete knowledge graphs, which is a crucial yet challenging task in knowledge graph reasoning. While neuro-symbolic search methods that utilize neural link predictors achieve superior accuracy, they encounter significant complexity bottlenecks: (i) data complexity typically scales quadratically with the number of entities in the knowledge graph, and (ii) query complexity becomes NP-hard for cyclic queries. Consequently, these approaches struggle to scale effectively to larger knowledge graphs and more complex queries. To address these challenges, we propose an efficient and scalable symbolic search framework. First, we propose two constraint strategies to compute neural logical indices that reduce the domain of variables, thereby decreasing the data complexity of symbolic search. Additionally, we introduce an approximate algorithm based on local search to tackle the NP-hard query complexity of cyclic queries. Experiments on various CQA benchmarks demonstrate that our framework reduces the computational load of symbolic methods by 90% while maintaining nearly the same performance, thus alleviating both efficiency and scalability issues.
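A small sketch of the neural-logical-index idea: use link-predictor scores to shrink each variable's candidate set before symbolic search. The top-k constraint and the intersection rule are illustrative assumptions.

```python
import numpy as np

def neural_logical_index(scores: np.ndarray, k: int) -> np.ndarray:
    """Constrain a variable's domain to its top-k entities under a
    neural link predictor, shrinking the search space before symbolic
    search. scores: (n_entities,) plausibility scores for one atom."""
    return np.argpartition(-scores, k)[:k]

def prune_domains(atom_scores: dict, n_entities: int, k: int) -> dict:
    """Intersect top-k index sets across all atoms mentioning a variable."""
    domains = {}
    for var, score_list in atom_scores.items():
        dom = set(range(n_entities))
        for s in score_list:                      # one score vector per atom
            dom &= set(neural_logical_index(s, k).tolist())
        domains[var] = dom or set(range(n_entities))  # fall back if empty
    return domains

# Example: variable y constrained by two atoms over 1000 entities
rng = np.random.default_rng(0)
scores = {"y": [rng.random(1000), rng.random(1000)]}
print(len(prune_domains(scores, 1000, k=100)["y"]))  # far fewer than 1000
```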
Submitted 20 May, 2025; v1 submitted 12 May, 2025;
originally announced May 2025.
-
Top Ten Challenges Towards Agentic Neural Graph Databases
Authors:
Jiaxin Bai,
Zihao Wang,
Yukun Zhou,
Hang Yin,
Weizhi Fei,
Qi Hu,
Zheye Deng,
Jiayang Cheng,
Tianshi Zheng,
Hong Ting Tsang,
Yisen Gao,
Zhongwei Xie,
Yufei Li,
Lixin Fan,
Binhang Yuan,
Wei Wang,
Lei Chen,
Xiaofang Zhou,
Yangqiu Song
Abstract:
Graph databases (GDBs) like Neo4j and TigerGraph excel at handling interconnected data but lack advanced inference capabilities. Neural Graph Databases (NGDBs) address this by integrating Graph Neural Networks (GNNs) for predictive analysis and reasoning over incomplete or noisy data. However, NGDBs rely on predefined queries and lack autonomy and adaptability. This paper introduces Agentic Neural Graph Databases (Agentic NGDBs), which extend NGDBs with three core functionalities: autonomous query construction, neural query execution, and continuous learning. We identify ten key challenges in realizing Agentic NGDBs, including semantic unit representation, abductive reasoning, scalable query execution, and integration with foundation models such as large language models (LLMs). By addressing these challenges, Agentic NGDBs can enable intelligent, self-improving systems for modern data-driven applications, paving the way for adaptable and autonomous data management solutions.
Submitted 23 January, 2025;
originally announced January 2025.
-
Efficient Prompt Compression with Evaluator Heads for Long-Context Transformer Inference
Authors:
Weizhi Fei,
Xueyan Niu,
Guoqing Xie,
Yingqing Liu,
Bo Bai,
Wei Han
Abstract:
Although applications involving long-context inputs are crucial for the effective utilization of large language models (LLMs), they also result in increased computational costs and reduced performance. To address this challenge, we propose an efficient, training-free prompt compression method that retains key information within compressed prompts. We identify specific attention heads in transformer-based LLMs, which we designate as evaluator heads, that are capable of selecting the tokens in long inputs most significant for inference. Building on this discovery, we develop EHPC, an Evaluator Head-based Prompt Compression method, which enables LLMs to rapidly "skim through" input prompts by leveraging only the first few layers with evaluator heads during the pre-filling stage, subsequently passing only the important tokens to the model for inference. EHPC achieves state-of-the-art results across two mainstream benchmarks: prompt compression and long-context inference acceleration. Consequently, it effectively reduces the complexity and costs associated with commercial API calls. We further demonstrate that EHPC attains competitive results compared to key-value cache-based acceleration methods, thereby highlighting its potential to enhance the efficiency of LLMs for long-context tasks.
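A compact sketch of the evaluator-head selection step, assuming attention maps from the first few layers are already available; how EHPC actually identifies evaluator heads and aggregates their scores is not detailed in the abstract.

```python
import torch

@torch.no_grad()
def compress_prompt(input_ids, attn_maps, eval_heads, keep_ratio=0.3):
    """Score tokens with the attention they receive from designated
    evaluator heads in the first few layers, then keep the top fraction.
    attn_maps: list of (n_heads, seq, seq) attention tensors per layer;
    eval_heads: list of (layer, head) pairs assumed pre-identified."""
    seq_len = input_ids.shape[-1]
    score = torch.zeros(seq_len)
    for layer, head in eval_heads:
        # attention mass each token receives from the last query position
        score += attn_maps[layer][head][-1, :]
    k = max(1, int(seq_len * keep_ratio))
    keep = score.topk(k).indices.sort().values  # preserve original order
    return input_ids[..., keep]
```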
Submitted 5 February, 2025; v1 submitted 22 January, 2025;
originally announced January 2025.
-
MATES: Multi-view Aggregated Two-Sample Test
Authors:
Zexi Cai,
Wenbo Fei,
Doudou Zhou
Abstract:
The two-sample test is a fundamental problem in statistics with a wide range of applications. In the realm of high-dimensional data, nonparametric methods have gained prominence due to their flexibility and minimal distributional assumptions. However, many existing methods tend to be more effective when the two distributions differ primarily in their first and/or second moments. In many real-world scenarios, distributional differences may arise in higher-order moments, rendering traditional methods less powerful. To address this limitation, we propose a novel framework to aggregate information from multiple moments to build a test statistic. Each moment is regarded as one view of the data and contributes to the detection of some specific type of discrepancy, thus allowing the test statistic to capture more complex distributional differences. The novel multi-view aggregated two-sample test (MATES) leverages a graph-based approach, where the test statistic is constructed from the weighted similarity graphs of the pooled sample. Under mild conditions on the multi-view weighted similarity graphs, we establish theoretical properties of MATES, including a distribution-free limiting distribution under the null hypothesis, which enables straightforward type-I error control. Extensive simulation studies demonstrate that MATES effectively distinguishes subtle differences between distributions. We further validate the method on the S&P100 data, showcasing its power in detecting complex distributional variations.
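A simplified, assumption-laden rendition of the multi-view idea: build a similarity graph per moment "view" of the pooled sample and aggregate cross-sample edge counts. MATES itself uses weighted similarity graphs and a calibrated null distribution, both omitted here.

```python
import numpy as np
from scipy.spatial import cKDTree

def cross_edge_count(pooled: np.ndarray, labels: np.ndarray, k: int = 5):
    """Number of k-NN edges joining points from different samples;
    unusually few cross edges suggest the distributions differ."""
    tree = cKDTree(pooled)
    _, nbrs = tree.query(pooled, k=k + 1)        # first neighbour is self
    return sum(labels[i] != labels[j]
               for i, row in enumerate(nbrs) for j in row[1:])

def mates_like_statistic(x, y, moments=(1, 2, 3), k=5):
    """Aggregate graph statistics over moment 'views' of pooled data;
    uniform view weights are an illustrative simplification."""
    pooled = np.vstack([x, y])
    labels = np.r_[np.zeros(len(x)), np.ones(len(y))]
    stats = []
    for p in moments:
        view = pooled ** p                        # p-th moment view
        view = (view - view.mean(0)) / (view.std(0) + 1e-12)
        stats.append(cross_edge_count(view, labels, k))
    return -np.mean(stats)  # larger value -> stronger evidence of difference

rng = np.random.default_rng(1)
x = rng.normal(size=(100, 10))
y = rng.standard_t(df=3, size=(100, 10))  # differs mainly in higher moments
print(mates_like_statistic(x, y))
```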
Submitted 21 December, 2024;
originally announced December 2024.
-
Positivity-preserving truncated Euler and Milstein methods for financial SDEs with super-linear coefficients
Authors:
Shounian Deng,
Chen Fei,
Weiyin Fei,
Xuerong Mao
Abstract:
In this paper, we propose two variants of positivity-preserving schemes, namely the truncated Euler-Maruyama (EM) method and the truncated Milstein scheme, applied to stochastic differential equations (SDEs) with positive solutions and super-linear coefficients. Under some regularity and integrability assumptions we derive the optimal strong convergence rates of the two schemes. Moreover, we demonstrate the flexibility of our approaches by applying the truncated methods to approximate SDEs with super-linear coefficients (3/2 and Aït-Sahalia models) directly and also with sub-linear coefficients (CIR model) indirectly. Numerical experiments are provided to verify the effectiveness of the theoretical results.
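A sketch of a positivity-preserving truncated EM step for the 3/2 model; the truncation rule and its coupling to the step size are simplified assumptions relative to the paper's schemes.

```python
import numpy as np

def truncated_em_32(v0, kappa, theta, sigma, T, n, R=50.0, seed=0):
    """Positivity-preserving truncated EM for the 3/2 model
        dV = kappa * V * (theta - V) dt + sigma * V^{3/2} dW.
    Super-linear coefficients are evaluated at a truncated state
    clipped to [1/R, R]; in the paper the truncation bound is tied
    to the step size, which we fix here for simplicity."""
    rng = np.random.default_rng(seed)
    dt = T / n
    v = np.empty(n + 1)
    v[0] = v0
    for i in range(n):
        x = min(max(v[i], 1.0 / R), R)     # truncate before evaluating f, g
        drift = kappa * x * (theta - x)
        diff = sigma * x ** 1.5
        v[i + 1] = v[i] + drift * dt + diff * np.sqrt(dt) * rng.normal()
        v[i + 1] = max(v[i + 1], 1.0 / R)  # keep the iterate positive
    return v

path = truncated_em_32(v0=1.0, kappa=2.0, theta=1.5, sigma=0.5, T=1.0, n=1000)
print(path.min() > 0)  # True: the scheme keeps the path positive
```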
Submitted 7 October, 2024;
originally announced October 2024.
-
Retrieval Meets Reasoning: Dynamic In-Context Editing for Long-Text Understanding
Authors:
Weizhi Fei,
Xueyan Niu,
Guoqing Xie,
Yanhua Zhang,
Bo Bai,
Lei Deng,
Wei Han
Abstract:
Current Large Language Models (LLMs) face inherent limitations due to their pre-defined context lengths, which impede their capacity for multi-hop reasoning within extensive textual contexts. While existing techniques like Retrieval-Augmented Generation (RAG) have attempted to bridge this gap by sourcing external information, they fall short when direct answers are not readily available. We introduce a novel approach that re-imagines information retrieval through dynamic in-context editing, inspired by recent breakthroughs in knowledge editing. By treating lengthy contexts as malleable external knowledge, our method interactively gathers and integrates relevant information, thereby enabling LLMs to perform sophisticated reasoning steps. Experimental results demonstrate that our method effectively empowers context-limited LLMs, such as Llama2, to engage in multi-hop reasoning with improved performance, outperforming state-of-the-art context window extrapolation methods and even comparing favorably to more advanced commercial long-context models. Our interactive method not only enhances reasoning capabilities but also mitigates the associated training and computational costs, making it a pragmatic solution for enhancing LLMs' reasoning within expansive contexts.
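The gather-and-integrate loop might look like the following sketch, where `llm` and `retriever` are hypothetical stand-ins for a generation call and a similarity search over context chunks, not the paper's actual interfaces.

```python
def multi_hop_answer(question, context_chunks, llm, retriever, max_hops=4):
    """Sketch of retrieval interleaved with reasoning: treat the long
    context as editable external knowledge, pull in one relevant chunk
    per hop, and let the model refine an intermediate answer."""
    known = []
    query = question
    for _ in range(max_hops):
        chunk = retriever(query, context_chunks)        # gather
        known.append(chunk)                             # integrate
        prompt = "\n".join(known) + f"\nQuestion: {question}\nAnswer:"
        answer = llm(prompt)
        if "unknown" not in answer.lower():             # confident -> stop
            return answer
        query = llm(f"{prompt}\nWhat fact is still missing?")  # next hop
    return answer
```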
Submitted 18 June, 2024;
originally announced June 2024.
-
Extending Complex Logical Queries on Uncertain Knowledge Graphs
Authors:
Weizhi Fei,
Zihao Wang,
Hang Yin,
Yang Duan,
Yangqiu Song
Abstract:
The study of machine learning-based logical query answering enables reasoning with large-scale and incomplete knowledge graphs. This paper advances this area of research by addressing the uncertainty inherent in knowledge. While the uncertain nature of knowledge is widely recognized in the real world, it does not align seamlessly with the first-order logic that underpins existing studies. To bridge this gap, we explore the soft queries on uncertain knowledge, inspired by the framework of soft constraint programming. We propose a neural symbolic approach that incorporates both forward inference and backward calibration to answer soft queries on large-scale, incomplete, and uncertain knowledge graphs. Theoretical discussions demonstrate that our method avoids catastrophic cascading errors in the forward inference while maintaining the same complexity as state-of-the-art symbolic methods for complex logical queries. Empirical results validate the superior performance of our backward calibration compared to extended query embedding methods and neural symbolic approaches.
Submitted 20 May, 2025; v1 submitted 3 March, 2024;
originally announced March 2024.
-
Towards Accurate Post-training Quantization for Reparameterized Models
Authors:
Luoming Zhang,
Yefei He,
Wen Fei,
Zhenyu Lou,
Weijia Wu,
YangWei Ying,
Hong Zhou
Abstract:
Model reparameterization is a widely accepted technique for improving inference speed without compromising performance. However, current Post-training Quantization (PTQ) methods often lead to significant accuracy degradation when applied to reparameterized models. This is primarily caused by channel-specific and sample-specific outliers, which appear only at specific samples and channels and affect the selection of quantization parameters. To address this issue, we propose RepAPQ, a novel framework that preserves the accuracy of quantized reparameterized models. Different from previous frameworks that use Mean Squared Error (MSE) as a measurement, we utilize Mean Absolute Error (MAE) to mitigate the influence of outliers on quantization parameters. Our framework comprises two main components: Quantization Protecting Reparameterization and Across-block Calibration. For effective calibration, Quantization Protecting Reparameterization combines multiple branches into a single convolution with an affine layer. During training, the affine layer accelerates convergence and amplifies the output of the convolution to better accommodate samples with outliers. Additionally, Across-block Calibration leverages the measurement of stage output as supervision to address the gradient problem introduced by MAE and enhance the interlayer correlation with quantization parameters. Comprehensive experiments demonstrate the effectiveness of RepAPQ across various models and tasks. Our framework outperforms previous methods by approximately 1% for 8-bit PTQ and 2% for 6-bit PTQ, showcasing its superior performance. The code is available at https://github.com/ilur98/DLMC-QUANT.
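A toy illustration of the measurement choice: grid-searching a quantization scale under MAE so that rare outliers cannot dominate, which is the stated motivation; RepAPQ's actual calibration operates block-wise with an affine layer.

```python
import numpy as np

def best_scale_mae(w: np.ndarray, n_bits=8, grid=80):
    """Grid-search a symmetric quantization scale that minimises mean
    absolute error instead of MSE, so a few outliers matter less."""
    qmax = 2 ** (n_bits - 1) - 1
    best_s, best_err = None, np.inf
    for frac in np.linspace(0.3, 1.0, grid):      # shrink the max range
        s = np.abs(w).max() * frac / qmax
        q = np.clip(np.round(w / s), -qmax - 1, qmax)
        err = np.abs(w - q * s).mean()            # MAE, not MSE
        if err < best_err:
            best_s, best_err = s, err
    return best_s, best_err

w = np.random.default_rng(2).normal(size=4096)
w[:4] *= 40                                       # inject channel outliers
print(best_scale_mae(w))
```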
Submitted 25 February, 2024;
originally announced February 2024.
-
Edge-Enabled Real-time Railway Track Segmentation
Authors:
Chen Chenglin,
Wang Fei,
Yang Min,
Qin Yong,
Bai Yun
Abstract:
Accurate and rapid railway track segmentation can assist automatic train driving and is a key step in early warning of fixed or moving obstacles on the railway track. However, existing algorithms tailored for track segmentation often struggle to meet the requirements of real-time operation and efficiency on resource-constrained edge devices. Considering this challenge, we propose an edge-enabled real-time railway track segmentation algorithm, adapted to edge applications by optimizing the network structure and quantizing the model after training. Initially, Ghost convolution is introduced to reduce the complexity of the backbone, thereby extracting the key information of the region of interest at a lower cost. To further reduce model complexity and computation, a new lightweight detection head is proposed to achieve the best balance between accuracy and efficiency. Subsequently, we introduce quantization techniques to map the model's floating-point weights and activation values into lower bit-width fixed-point representations, reducing computational demands and memory footprint and ultimately accelerating inference. Finally, we draw inspiration from GPU parallel programming principles to expedite the pre-processing and post-processing stages of the algorithm through parallel processing. The approach is evaluated on the public and challenging RailSem19 dataset and tested on a Jetson Nano. Experimental results demonstrate that our enhanced algorithm achieves an accuracy level of 83.3% while achieving a real-time inference rate of 25 frames per second for a 480x480 input, thereby effectively meeting the requirements for real-time and high-efficiency operation.
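A standard GhostNet-style ghost convolution, shown as a sketch of the backbone simplification; the paper's exact block configuration is not given in the abstract.

```python
import torch
import torch.nn as nn

class GhostConv(nn.Module):
    """Ghost convolution: generate half the output channels with a
    regular convolution and the rest with a cheap depthwise operation,
    cutting FLOPs in the backbone."""
    def __init__(self, c_in, c_out, k=3):
        super().__init__()
        c_mid = c_out // 2
        self.primary = nn.Sequential(
            nn.Conv2d(c_in, c_mid, k, padding=k // 2, bias=False),
            nn.BatchNorm2d(c_mid), nn.ReLU(inplace=True))
        self.cheap = nn.Sequential(  # depthwise: one filter per channel
            nn.Conv2d(c_mid, c_mid, k, padding=k // 2,
                      groups=c_mid, bias=False),
            nn.BatchNorm2d(c_mid), nn.ReLU(inplace=True))

    def forward(self, x):
        y = self.primary(x)
        return torch.cat([y, self.cheap(y)], dim=1)

print(GhostConv(3, 32)(torch.randn(1, 3, 480, 480)).shape)  # (1, 32, 480, 480)
```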
Submitted 21 January, 2024;
originally announced January 2024.
-
Extending Context Window of Large Language Models via Semantic Compression
Authors:
Weizhi Fei,
Xueyan Niu,
Pingyi Zhou,
Lu Hou,
Bo Bai,
Lei Deng,
Wei Han
Abstract:
Transformer-based Large Language Models (LLMs) often impose limitations on the length of the text input to ensure the generation of fluent and relevant responses. This constraint restricts their applicability in scenarios involving long texts. We propose a novel semantic compression method that enables generalization to texts that are 6-8 times longer, without incurring significant computational costs or requiring fine-tuning. Our proposed framework draws inspiration from source coding in information theory and employs a pre-trained model to reduce the semantic redundancy of long inputs before passing them to the LLMs for downstream tasks. Experimental results demonstrate that our method effectively extends the context window of LLMs across a range of tasks including question answering, summarization, few-shot learning, and information retrieval. Furthermore, the proposed semantic compression method exhibits consistent fluency in text generation while reducing the associated computational overhead.
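A sketch of the pre-processing pipeline under stated assumptions: chunk the input, compress each chunk with a pre-trained model (here a trivial stand-in), and concatenate for the downstream LLM; the 6-8x extension target comes from the abstract.

```python
def semantic_compress(long_text, summarize, chunk_chars=2000, ratio=7):
    """Split a long input into chunks, let a pre-trained model strip
    semantic redundancy from each chunk, and concatenate the condensed
    pieces. `summarize` is a placeholder for any pre-trained compressor,
    not the paper's actual component."""
    chunks = [long_text[i:i + chunk_chars]
              for i in range(0, len(long_text), chunk_chars)]
    budget = max(1, chunk_chars // ratio)   # per-chunk output budget
    return " ".join(summarize(c, max_chars=budget) for c in chunks)

# Trivial stand-in compressor: keep each chunk's leading characters
compressed = semantic_compress("A long document. " * 500,
                               lambda c, max_chars: c[:max_chars])
print(len(compressed))
```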
Submitted 15 December, 2023;
originally announced December 2023.
-
Dual Grained Quantization: Efficient Fine-Grained Quantization for LLM
Authors:
Luoming Zhang,
Wen Fei,
Weijia Wu,
Yefei He,
Zhenyu Lou,
Hong Zhou
Abstract:
Large Language Models (LLMs) pose significant hardware challenges related to memory requirements and computational ability. There are two mainstream quantization schemes for LLMs: coarse-grained (e.g., channel-wise) quantization and fine-grained (e.g., group-wise) quantization. Fine-grained quantization has smaller quantization loss and consequently achieves superior performance. However, when applied to weight-activation quantization, it disrupts continuous integer matrix multiplication, leading to inefficient inference. In this paper, we introduce Dual Grained Quantization (DGQ), a novel A8W4 quantization scheme for LLMs that maintains superior performance while ensuring fast inference speed. DGQ dequantizes the fine-grained INT4 weights into a coarse-grained INT8 representation and performs matrix multiplication using INT8 kernels. Besides, we develop a two-phase grid search algorithm to simplify the determination of fine-grained and coarse-grained quantization scales. We also devise a percentile clipping schema for smoothing the activation outliers without the need for complex optimization techniques. Experimental results demonstrate that DGQ consistently outperforms prior methods across various LLM architectures and a wide range of tasks. Remarkably, with our implemented efficient CUTLASS kernel, we achieve 1.12x memory reduction and 3.24x speed gains compared with an A16W4 implementation. These advancements enable efficient deployment of A8W4 LLMs for real-world applications.
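A toy decomposition of the two-level dequantization: group-wise INT4 codes are folded into a channel-wise INT8 representation using small integer group scales, so the matrix multiply can stay in INT8. The scale factorization shown is an illustrative assumption, not DGQ's exact scheme.

```python
import numpy as np

def dual_grained_dequant(q4, s_fine, s_coarse):
    """q4: (out, in) INT4 codes; s_fine: (out, n_groups) group scales,
    assumed here to be small integer multipliers of the channel scale;
    s_coarse: (out, 1) per-channel FP scale kept for the epilogue."""
    out, cin = q4.shape
    n_groups = s_fine.shape[1]
    gsize = cin // n_groups
    q8 = np.empty_like(q4, dtype=np.int8)
    for g in range(n_groups):
        sl = slice(g * gsize, (g + 1) * gsize)
        # INT4 code times a small integer scale stays inside INT8 range
        q8[:, sl] = (q4[:, sl] * s_fine[:, g:g + 1]).astype(np.int8)
    return q8, s_coarse  # INT8 weights + per-channel FP scale

q4 = np.random.default_rng(3).integers(-8, 8, size=(4, 8))
q8, s = dual_grained_dequant(q4, np.full((4, 2), 3, dtype=np.int8),
                             np.full((4, 1), 0.01))
print(q8.dtype, q8.shape)
```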
Submitted 7 October, 2023;
originally announced October 2023.
-
$\text{EFO}_{k}$-CQA: Towards Knowledge Graph Complex Query Answering beyond Set Operation
Authors:
Hang Yin,
Zihao Wang,
Weizhi Fei,
Yangqiu Song
Abstract:
To answer complex queries on knowledge graphs, logical reasoning over incomplete knowledge is required due to the open-world assumption. Learning-based methods are essential because they are capable of generalizing over unobserved knowledge. Therefore, an appropriate dataset is fundamental to both obtaining and evaluating such methods under this paradigm. In this paper, we propose a comprehensive framework for data generation, model training, and method evaluation that covers the combinatorial space of Existential First-order Queries with multiple variables ($\text{EFO}_{k}$). The combinatorial query space in our framework significantly extends those defined by set operations in the existing literature. Additionally, we construct a dataset, $\text{EFO}_{k}$-CQA, with 741 query types for empirical evaluation, and our benchmark results provide new insights into how query hardness affects the results. Furthermore, we demonstrate that the existing dataset construction process is systematically biased in ways that hinder the appropriate development of query-answering methods, highlighting the importance of our work. Our code and data are provided at https://github.com/HKUST-KnowComp/EFOK-CQA.
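For concreteness, one member of the $\text{EFO}_{k}$ space with $k=2$ free variables, which already falls outside tree-form set-operation queries; the relations are invented for illustration.

```latex
% An EFO_k query with k = 2 free variables and one existential variable:
% "find pairs (x_1, x_2) of a director x_1 and an actor x_2 linked
% through some film z" (illustrative relations, not from the dataset).
\phi(x_1, x_2) \;=\; \exists z \;
  \mathrm{Directed}(x_1, z) \,\wedge\, \mathrm{StarredIn}(x_2, z)
```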
Submitted 15 July, 2023;
originally announced July 2023.
-
GEmo-CLAP: Gender-Attribute-Enhanced Contrastive Language-Audio Pretraining for Accurate Speech Emotion Recognition
Authors:
Yu Pan,
Yanni Hu,
Yuguang Yang,
Wen Fei,
Jixun Yao,
Heng Lu,
Lei Ma,
Jianjun Zhao
Abstract:
Contrastive cross-modality pretraining has recently exhibited impressive success in diverse fields, whereas there is limited research on its merits in speech emotion recognition (SER). In this paper, we propose GEmo-CLAP, a gender-attribute-enhanced contrastive language-audio pretraining (CLAP) method for SER. Specifically, we first construct an effective emotion CLAP (Emo-CLAP) for SER, using pre-trained text and audio encoders. Second, given the significance of gender information in SER, two novel models, multi-task learning based GEmo-CLAP (ML-GEmo-CLAP) and soft label based GEmo-CLAP (SL-GEmo-CLAP), are further proposed to incorporate the gender information of speech signals, forming more reasonable objectives. Experiments on IEMOCAP indicate that the two proposed GEmo-CLAP models consistently outperform Emo-CLAP with different pre-trained models. Remarkably, the proposed WavLM-based SL-GEmo-CLAP obtains the best WAR of 83.16%, outperforming state-of-the-art SER methods.
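A sketch of the soft-label variant's loss under stated assumptions: emotion agreement defines positives and gender agreement contributes a small soft bonus; the paper's exact target construction may differ.

```python
import torch
import torch.nn.functional as F

def soft_label_clap_loss(audio_emb, text_emb, emo, gender,
                         alpha=0.1, tau=0.07):
    """Contrastive loss with soft targets: pairs sharing the emotion
    label are positives, and gender agreement adds a soft bonus alpha
    (an illustrative choice of target construction)."""
    a = F.normalize(audio_emb, dim=-1)
    t = F.normalize(text_emb, dim=-1)
    logits = a @ t.T / tau
    target = (emo[:, None] == emo[None, :]).float()
    target = target + alpha * (gender[:, None] == gender[None, :]).float()
    target = target / target.sum(dim=1, keepdim=True)  # row-normalise
    return F.cross_entropy(logits, target)             # soft-target CE

B, d = 8, 256
loss = soft_label_clap_loss(torch.randn(B, d), torch.randn(B, d),
                            torch.randint(0, 4, (B,)),
                            torch.randint(0, 2, (B,)))
print(loss.item())
```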
Submitted 4 December, 2023; v1 submitted 13 June, 2023;
originally announced June 2023.
-
Wasserstein-Fisher-Rao Embedding: Logical Query Embeddings with Local Comparison and Global Transport
Authors:
Zihao Wang,
Weizhi Fei,
Hang Yin,
Yangqiu Song,
Ginny Y. Wong,
Simon See
Abstract:
Answering complex queries on knowledge graphs is important but particularly challenging because of the data incompleteness. Query embedding methods address this issue with learning-based models that simulate logical reasoning using set operators. Previous works focus on specific forms of embeddings, but scoring functions between embeddings are underexplored. In contrast to existing scoring functions motivated by local comparison or global transport, this work investigates the local and global trade-off with unbalanced optimal transport theory. Specifically, we embed sets as bounded measures in $\mathbb{R}$ endowed with a scoring function motivated by the Wasserstein-Fisher-Rao metric. Such a design also facilitates closed-form set operators in the embedding space. Moreover, we introduce a convolution-based algorithm for linear-time computation and a block-diagonal kernel to enforce the trade-off. Results show that the resulting Wasserstein-Fisher-Rao embedding (WFRE) can outperform existing query embedding methods on standard datasets, evaluation sets with combinatorially complex queries, and hierarchical knowledge graphs. An ablation study shows that finding a better local and global trade-off is essential for performance improvement.
Submitted 6 May, 2023;
originally announced May 2023.
-
Design of Core-Shell Structured Magnetic Microwires with Desirable Properties for Multifunctional Applications
Authors:
Sida Jiang,
Tatiana Eggers,
Ongard Thiabgoh,
Claire Albrecht,
Jingshun Liu,
Huan Wang,
Ze Li,
Dawei Xing,
Weidong Fei,
Wenbin Fang,
Jianfei Sun,
Manh-Huong Phan
Abstract:
Amorphous Co-rich microwires with excellent soft magnetic and mechanical properties produced by the melt-extraction technique are emerging as a multifunctional material for a variety of applications ranging from ultrasensitive magnetic field sensors to structural health self-monitoring composites. There is a pressing need to enhance these properties to make the microwires practical for integration into new technologies. Conventional heat treatments at temperatures below crystallization may improve the magnetic softness of an as-quenched amorphous wire, but usually deteriorate the good mechanical characteristics of the wire due to crystallization. To overcome this, we propose a new approach that utilizes the advantages of a multi-step Joule current annealing method to design novel (nanocrystal, amorphous)/amorphous core/shell structures directly from as-quenched amorphous microwires. Our results show that the density and size of nanocrystals in the core can be optimized by controlling the Joule current intensity, resulting in a large enhancement of soft magnetic and giant magneto-impedance properties, while the amorphous shell preserves the excellent mechanical strength of the microwire. This study also provides a new pathway for the design of novel core/shell structures directly from rapidly quenched amorphous magnetic materials that are currently exploited in high-frequency transformers, sensing and cooling devices.
Submitted 13 September, 2022;
originally announced September 2022.
-
The truncated EM method for stochastic differential delay equations with variable delay
Authors:
Shounian Deng,
Chen Fei,
Weiyin Fei,
Xuerong Mao
Abstract:
This paper mainly investigates the strong convergence and stability of the truncated Euler-Maruyama (EM) method for stochastic differential delay equations (SDDEs) with variable delay whose coefficients can grow super-linearly. By constructing appropriate truncated functions to control the super-linear growth of the original coefficients, we present a type of truncated EM method for such SDDEs with variable delay, in which the delayed argument is approximated by the value taken at the nearest grid point on its left. The strong convergence result (without order) of the method is established under local Lipschitz plus generalized Khasminskii-type conditions, and the optimal strong convergence order $1/2$ can be obtained if global monotonicity (with a U function) and polynomial growth conditions are added to the assumptions. Moreover, the partially truncated EM method is proved to preserve the mean-square and $H_\infty$ stabilities of the true solutions. Compared with the known results on the truncated EM method for SDDEs, a better order of strong convergence is obtained under more relaxed conditions on the coefficients, and more refined technical estimates are developed to overcome the challenges arising from the variable delay. Lastly, some numerical examples are utilized to confirm the effectiveness of the theoretical results.
Submitted 9 August, 2021;
originally announced August 2021.
-
Perfect optical coherence lattices
Authors:
Liang Chunhao,
Liu Xin,
Xu Zhiheng,
Wang Fei,
Ponomarenko Sergey A.,
Cai Yangjian,
Pujuan Ma
Abstract:
We advance and experimentally implement a protocol to generate perfect optical coherence lattices (OCLs) that are not modulated by an envelope field. Structuring the amplitude and phase of an input partially coherent beam in a Fourier plane of an imaging system lies at the heart of our protocol. In the proposed approach, the OCL node profile depends solely on the degree of coherence (DOC) of the input beam such that, in principle, any lattice structure can be attained via proper manipulations in the Fourier plane. Moreover, any genuine partially coherent source can serve as an input to our lattice-generating imaging system. Our results are anticipated to find applications in optical field engineering and multi-target probing, among others.
Submitted 12 June, 2021;
originally announced June 2021.
-
MimicNorm: Weight Mean and Last BN Layer Mimic the Dynamic of Batch Normalization
Authors:
Wen Fei,
Wenrui Dai,
Chenglin Li,
Junni Zou,
Hongkai Xiong
Abstract:
Substantial experiments have validated the success of Batch Normalization (BN) layers in benefiting convergence and generalization. However, BN requires extra memory and floating-point computation. Moreover, BN is inaccurate on micro-batches, as it depends on batch statistics. In this paper, we address these problems by simplifying BN regularization while keeping two fundamental impacts of BN layers, i.e., data decorrelation and adaptive learning rate. We propose a novel normalization method, named MimicNorm, to improve the convergence and efficiency of network training. MimicNorm consists of only two light operations: a modified weight mean operation (subtracting mean values from the weight parameter tensor) and one BN layer before the loss function (the last BN layer). We leverage neural tangent kernel (NTK) theory to prove that our weight mean operation whitens activations and transitions the network into the chaotic regime, like a BN layer, and consequently leads to enhanced convergence. The last BN layer provides autotuned learning rates and also improves accuracy. Experimental results show that MimicNorm achieves similar accuracy for various network structures, including ResNets and lightweight networks like ShuffleNet, with a reduction of about 20% in memory consumption. The code is publicly available at https://github.com/Kid-key/MimicNorm.
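A minimal sketch of the two operations, assuming a simple convolutional body: kernels are mean-centred on the fly and a single BN layer sits before the classifier; the released code may organise this differently.

```python
import torch
import torch.nn as nn

class WeightMeanConv(nn.Conv2d):
    """Convolution whose kernel is centred (mean subtracted) on the fly:
    the 'weight mean operation' used in place of per-layer BN."""
    def forward(self, x):
        w = self.weight - self.weight.mean(dim=[1, 2, 3], keepdim=True)
        return self._conv_forward(x, w, self.bias)

class MimicNormNet(nn.Module):
    """BN-free body built from centred convolutions, with a single BN
    layer before the classifier/loss (the 'last BN layer')."""
    def __init__(self, n_classes=10):
        super().__init__()
        self.body = nn.Sequential(
            WeightMeanConv(3, 32, 3, padding=1), nn.ReLU(inplace=True),
            WeightMeanConv(32, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.last_bn = nn.BatchNorm1d(64)   # provides the autotuned LR effect
        self.head = nn.Linear(64, n_classes)

    def forward(self, x):
        return self.head(self.last_bn(self.body(x)))

print(MimicNormNet()(torch.randn(4, 3, 32, 32)).shape)  # (4, 10)
```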
Submitted 27 September, 2023; v1 submitted 19 October, 2020;
originally announced October 2020.
-
Delay-dependent Asymptotic Stability of Highly Nonlinear Stochastic Differential Delay Equations Driven by $G$-Brownian Motion
Authors:
Chen Fei,
Weiyin Fei,
Xuerong Mao,
Litan Yan
Abstract:
Under classical probability, the stability criteria for stochastic differential delay equations (SDDEs) whose coefficients are either linear or nonlinear but bounded by linear functions have been investigated intensively. Moreover, the delay-dependent stability of highly nonlinear hybrid stochastic differential equations has recently been studied. In this paper, by using nonlinear expectation theory, we explore the delay-dependent stability of a class of highly nonlinear hybrid stochastic differential delay equations driven by $G$-Brownian motion ($G$-SDDEs). Firstly, we give preliminaries on sublinear expectation. Then, delay-dependent criteria for the stability and boundedness of solutions to $G$-SDDEs are provided. Finally, an illustrative example is analyzed by the $\varphi$-max-mean algorithm.
Submitted 27 April, 2020;
originally announced April 2020.
-
Consistency of least squares estimation to the parameter for stochastic differential equations under distribution uncertainty
Authors:
Chen Fei,
Weiyin Fei
Abstract:
Under distribution uncertainty, on the basis of discrete data we investigate the consistency of the least squares estimator (LSE) of the parameter for a stochastic differential equation (SDE) whose noise is characterized by $G$-Brownian motion. In order to obtain our main result on the consistency of the parameter estimation, we provide some lemmas from the theory of stochastic calculus under sublinear expectation. The result shows that under some regularity conditions, the least squares estimator is strongly consistent uniformly on the prior set. An illustrative example is discussed.
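Under classical (single-measure) noise the estimator reduces to a familiar closed form, sketched below; the linear-drift SDE is an illustrative choice, and the $G$-Brownian setting adds the uniform-consistency analysis over the prior set.

```python
import numpy as np

def lse_drift(x: np.ndarray, dt: float) -> float:
    """Least squares estimator of theta in dX = theta * X dt + dB from
    discrete observations: minimise sum (X_{i+1} - X_i - theta X_i dt)^2,
    giving theta_hat = sum X_i dX_i / (dt * sum X_i^2)."""
    dx = np.diff(x)
    return float(np.sum(x[:-1] * dx) / (dt * np.sum(x[:-1] ** 2)))

# Simulate a path under classical Brownian noise and recover theta
rng = np.random.default_rng(4)
theta, dt, n = -0.8, 1e-3, 100_000
x = np.empty(n + 1)
x[0] = 1.0
for i in range(n):
    x[i + 1] = x[i] + theta * x[i] * dt + np.sqrt(dt) * rng.normal()
print(lse_drift(x, dt))   # close to -0.8
```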
Submitted 29 April, 2019;
originally announced April 2019.
-
Generalized Ait-Sahalia-type interest rate model with Poisson jumps and convergence of the numerical approximation
Authors:
Shounian Deng,
Chen Fei,
Weiyin Fei,
Xuerong Mao
Abstract:
In this paper, we consider the generalized Ait-Sahalia interest rate model with Poisson jumps in finance. The analytical properties, including the positivity, boundedness and pathwise asymptotic estimations of the solution to this model, are investigated. Moreover, we prove that the Euler-Maruyama (EM) numerical solutions converge to the true solution in probability. Finally, under the assumption that the interest rate or the asset price is governed by this model, we apply the EM solutions to compute some financial quantities.
Submitted 31 August, 2018;
originally announced September 2018.
-
The truncated EM method for stochastic differential equations with Poisson jumps
Authors:
Shounian Deng,
Weiyin Fei,
Wei Liu,
Xuerong Mao
Abstract:
In this paper, we use the truncated EM method to study the finite-time strong convergence for SDEs with Poisson jumps under the Khasminskii-type condition. We establish the finite-time $\mathcal{L}^r$ ($r \ge 2$) convergence rate when the drift and diffusion coefficients satisfy a super-linear condition and the jump coefficient satisfies the linear growth condition. The result shows that the optimal $\mathcal{L}^r$-convergence rate is close to $1/(1+\gamma)$, where $\gamma$ is the super-linear growth constant. This is significantly different from the result on SDEs without jumps. When all three coefficients of the SDEs are allowed to grow super-linearly, the $\mathcal{L}^r$ ($0<r<2$) strong convergence results are also investigated and the optimal strong convergence rate is shown to be not greater than $1/4$. Moreover, we prove that the truncated EM method nicely preserves the mean-square exponential stability and asymptotic boundedness of the underlying SDEs with Poisson jumps. Several examples are given to illustrate our results.
Submitted 28 May, 2018;
originally announced May 2018.
-
On existence and uniqueness of solutions to uncertain backward stochastic differential equations
Authors:
Weiyin Fei
Abstract:
This paper is concerned with a class of uncertain backward stochastic differential equations (UBSDEs) driven by both an $m$-dimensional Brownian motion and a $d$-dimensional canonical process with uniformly Lipschitzian coefficients. Such equations can be useful in modelling hybrid systems, where the phenomena are simultaneously subject to two kinds of indeterminacy: randomness and uncertainty. The solutions of UBSDEs are uncertain stochastic processes. The existence and uniqueness of solutions to UBSDEs with Lipschitzian coefficients are thus proved.
Submitted 28 January, 2014;
originally announced January 2014.
-
Optimal control of uncertain stochastic systems with Markovian switching and its applications to portfolio decisions
Authors:
Weiyin Fei
Abstract:
This paper first describes a class of uncertain stochastic control systems with Markovian switching, and derives an Itô-Liu formula for Markov-modulated processes. We characterize an optimal control law that satisfies the generalized Hamilton-Jacobi-Bellman (HJB) equation with Markovian switching. Then, using the generalized HJB equation, we deduce the optimal consumption and portfolio policies under uncertain stochastic financial markets with Markovian switching. Finally, for constant relative risk-aversion (CRRA) felicity functions, we explicitly obtain the optimal consumption and portfolio policies. We also provide an economic analysis through numerical examples.
Submitted 11 January, 2014;
originally announced January 2014.
-
On exponential stability for stochastic differential equations disturbed by G-Brownian motion
Authors:
Weiyin Fei,
Chen Fei
Abstract:
We first introduce the calculus of Peng's G-Brownian motion on a sublinear expectation space $(\Omega, \mathcal{H}, \hat{\mathbb{E}})$. Then we investigate the exponential stability of paths for a class of stochastic differential equations disturbed by G-Brownian motion in the quasi-sure (q.s.) sense. The analysis relies on a G-Lyapunov function and some special inequalities. Various sufficient conditions are obtained to ensure the stability of strong solutions. In particular, our results generalize those for classical stochastic differential equations. Finally, an illustrative example is given.
Submitted 28 November, 2013;
originally announced November 2013.
-
Optimal stochastic control and optimal consumption and portfolio with G-Brownian motion
Authors:
Weiyin Fei,
Chen Fei
Abstract:
By the calculus of Peng's G-sublinear expectation and G-Brownian motion on a sublinear expectation space $(Ω, {\cal H}, \hat{\mathbb{E}})$, we first set up an optimality principle of stochastic control problem. Then we investigate an optimal consumption and portfolio decision with a volatility ambiguity by the derived verification theorem. Next the two-fund separation theorem is explicitly obtaine…
▽ More
By the calculus of Peng's G-sublinear expectation and G-Brownian motion on a sublinear expectation space $(Ω, {\cal H}, \hat{\mathbb{E}})$, we first set up an optimality principle of stochastic control problem. Then we investigate an optimal consumption and portfolio decision with a volatility ambiguity by the derived verification theorem. Next the two-fund separation theorem is explicitly obtained. And an illustrative example is provided.
△ Less
Submitted 1 September, 2013;
originally announced September 2013.