arXiv:2511.20718 [pdf, ps, other]

ST-PPO: Stabilized Off-Policy Proximal Policy Optimization for Multi-Turn Agents Training

Authors: Chenliang Li, Adel Elmahdy, Alex Boyd, Zhongruo Wang, Alfredo Garcia, Parminder Bhatia, Taha Kass-Hout, Cao Xiao, Mingyi Hong

Abstract: PPO has been widely adopted for training large language models (LLMs) at the token level in multi-turn dialogue and reasoning tasks. However, its performance is often unstable and prone to collapse. Through empirical analysis, we identify two main sources of instability in this setting: (1) token-level importance sampling, which is misaligned with the natural granularity of multi-turn environments that have distinct turn-level stages, and (2) inaccurate advantage estimates from off-policy samples, where the critic has not learned to evaluate certain state-action pairs, resulting in high-variance gradients and unstable updates. To address these challenges, we introduce two complementary stabilization techniques: (1) turn-level importance sampling, which aligns optimization with the natural structure of multi-turn reasoning, and (2) clipping-bias correction, which normalizes gradients by downweighting unreliable, highly off-policy samples. Depending on how these components are combined, we obtain three variants: Turn-PPO (turn-level sampling only), S-PPO (clipping-bias correction applied to token-level PPO), and ST-PPO (turn-level sampling combined with clipping-bias correction). In our experiments, we primarily study ST-PPO and S-PPO, which together demonstrate how the two stabilization mechanisms address complementary sources of instability.
Experiments on multi-turn search tasks across general QA, multi-hop QA, and medical multiple-choice QA benchmarks show that ST-PPO and S-PPO consistently prevent the performance collapses observed in large-model training, maintain lower clipping ratios throughout optimization, and achieve higher task performance than standard token-level PPO. These results demonstrate that combining turn-level importance sampling with clipping-bias correction provides a practical and scalable solution for stabilizing multi-turn LLM agent training.

Submitted 25 November, 2025; originally announced November 2025.

arXiv:2408.10536 [pdf, other]

Synergistic Approach for Simultaneous Optimization of Monolingual, Cross-lingual, and Multilingual Information Retrieval

Authors: Adel Elmahdy, Sheng-Chieh Lin, Amin Ahmad

Abstract: Information retrieval across different languages is an increasingly important challenge in natural language processing. Recent approaches based on multilingual pre-trained language models have achieved remarkable success, yet they often optimize for either monolingual, cross-lingual, or multilingual retrieval performance at the expense of others. This paper proposes a novel hybrid batch training strategy to simultaneously improve zero-shot retrieval performance across monolingual, cross-lingual, and multilingual settings while mitigating language bias. The approach fine-tunes multilingual language models using a mix of monolingual and cross-lingual question-answer pair batches sampled based on dataset size. Experiments on XQuAD-R, MLQA-R, and MIRACL benchmark datasets show that the proposed method consistently achieves comparable or superior results in zero-shot retrieval across various languages and retrieval tasks compared to monolingual-only or cross-lingual-only training. Hybrid batch training also substantially reduces language bias in multilingual retrieval compared to monolingual training. These results demonstrate the effectiveness of the proposed approach for learning language-agnostic representations that enable strong zero-shot retrieval performance across diverse languages.

Submitted 20 August, 2024; originally announced August 2024.

Comments: 15 pages, 2 figures, 13 tables

arXiv:2306.13789 [pdf, other]

Deconstructing Classifiers: Towards A Data Reconstruction Attack Against Text Classification Models

Authors: Adel Elmahdy, Ahmed Salem

Abstract: Natural language processing (NLP) models have become increasingly popular in real-world applications, such as text classification. However, they are vulnerable to privacy attacks, including data reconstruction attacks that aim to extract the data used to train the model. Most previous studies on data reconstruction attacks have focused on LLMs, while classification models were assumed to be more secure. In this work, we propose a new targeted data reconstruction attack called the Mix And Match attack, which takes advantage of the fact that most classification models are based on LLMs. The Mix And Match attack uses the base model of the target model to generate candidate tokens and then prunes them using the classification head. We extensively demonstrate the effectiveness of the attack using both random and organic canaries. This work highlights the importance of considering the privacy risks associated with data reconstruction attacks in classification models and offers insights into possible leakages.

Submitted 23 June, 2023; originally announced June 2023.

Comments: 17 pages, 6 figures, 4 tables

arXiv:2206.04591 [pdf, other]

Privacy Leakage in Text Classification: A Data Extraction Approach

Authors: Adel Elmahdy, Huseyin A. Inan, Robert Sim

Abstract: Recent work has demonstrated the successful extraction of training data from generative language models. However, it is not evident whether such extraction is feasible in text classification models since the training objective is to predict the class label as opposed to next-word prediction. This poses an interesting challenge and raises an important question regarding the privacy of training data in text classification settings. Therefore, we study the potential privacy leakage in the text classification domain by investigating the problem of unintended memorization of training data that is not pertinent to the learning task. We propose an algorithm to extract missing tokens of a partial text by exploiting the likelihood of the class label provided by the model. We test the effectiveness of our algorithm by inserting canaries into the training set and attempting to extract tokens in these canaries post-training. In our experiments, we demonstrate that successful extraction is possible to some extent. This can also be used as an auditing strategy to assess any potential unauthorized use of personal data without consent.

Submitted 9 June, 2022; originally announced June 2022.

Comments: 8 pages, 4 tables. Accepted at NAACL 2022 Workshop on Privacy in NLP (PrivateNLP)

arXiv:2201.01728 [pdf, other]

Matrix Completion with Hierarchical Graph Side Information

Authors: Adel Elmahdy, Junhyung Ahn, Changho Suh, Soheil Mohajer

Abstract: We consider a matrix completion problem that exploits social or item similarity graphs as side information. We develop a universal, parameter-free, and computationally efficient algorithm that starts with hierarchical graph clustering and then iteratively refines estimates both on graph clustering and matrix ratings. Under a hierarchical stochastic block model that well respects practically-relevant social graphs and a low-rank rating matrix model (to be detailed), we demonstrate that our algorithm achieves the information-theoretic limit on the number of observed matrix entries (i.e., optimal sample complexity) that is derived by maximum likelihood estimation together with a lower-bound impossibility result. One consequence of this result is that exploiting the hierarchical structure of social graphs yields a substantial gain in sample complexity relative to the one that simply identifies different groups without resorting to the relational structure across them. We conduct extensive experiments both on synthetic and real-world datasets to corroborate our theoretical results as well as to demonstrate significant performance improvements over other matrix completion algorithms that leverage graph side information.

Submitted 1 January, 2022; originally announced January 2022.

Comments: 53 pages, 3 figures, 1 table. Published in NeurIPS 2020. The first two authors contributed equally to this work. In this revision, achievability proof technique is updated and typos are corrected. arXiv admin note: substantial text overlap with arXiv:2109.05408

Journal ref: Advances in Neural Information Processing Systems 33 (NeurIPS 2020)

arXiv:2201.00313 [pdf, other]

Secure Determinant Codes for Distributed Storage Systems

Authors: Adel Elmahdy, Michelle Kleckler, Soheil Mohajer

Abstract: The information-theoretic secure exact-repair regenerating codes for distributed storage systems (DSSs) with parameters $(n,k=d,d,\ell)$ are studied in this paper. We consider distributed storage systems with $n$ nodes, in which the original data can be recovered from any subset of $k=d$ nodes, and the content of any node can be retrieved from those of any $d$ helper nodes. Moreover, we consider two secrecy constraints, namely, Type-I, where the message remains secure against an eavesdropper with access to the content of any subset of up to $\ell$ nodes, and Type-II, in which the message remains secure against an eavesdropper who can observe the incoming repair data from all possible nodes to a fixed but unknown subset of up to $\ell$ compromised nodes. Two classes of secure determinant codes are proposed for Type-I and Type-II secrecy constraints. Each proposed code can be designed for a range of per-node storage capacity and repair bandwidth for any system parameters. They lead to two achievable secrecy trade-offs, for Type-I and Type-II security.

Submitted 29 December, 2022; v1 submitted 2 January, 2022; originally announced January 2022.

Comments: 22 pages, 8 figures. The first two authors contributed equally to this work. Accepted for publication at IEEE Transactions on Information Theory

arXiv:2109.05408 [pdf, other]

On the Fundamental Limits of Matrix Completion: Leveraging Hierarchical Similarity Graphs

Authors: Junhyung Ahn, Adel Elmahdy, Soheil Mohajer, Changho Suh

Abstract: We study the matrix completion problem that leverages hierarchical similarity graphs as side information in the context of recommender systems. Under a hierarchical stochastic block model that well respects practically-relevant social graphs and a low-rank rating matrix model, we characterize the exact information-theoretic limit on the number of observed matrix entries (i.e., optimal sample complexity) by proving sharp upper and lower bounds on the sample complexity. In the achievability proof, we demonstrate that the probability of error of the maximum likelihood estimator vanishes for a sufficiently large number of users and items, if all sufficient conditions are satisfied. On the other hand, the converse (impossibility) proof is based on the genie-aided maximum likelihood estimator. Under each necessary condition, we present examples of a genie-aided estimator to prove that the probability of error does not vanish for a sufficiently large number of users and items. One important consequence of this result is that exploiting the hierarchical structure of social graphs yields a substantial gain in sample complexity relative to the one that simply identifies different groups without resorting to the relational structure across them. More specifically, we analyze the optimal sample complexity and identify different regimes whose characteristics rely on quality metrics of side information of the hierarchical similarity graph.
Finally, we present simulation results to corroborate our theoretical findings and show that the characterized information-theoretic limit can be asymptotically achieved.

Submitted 11 September, 2021; originally announced September 2021.

Comments: The first two authors contributed equally to this work. A preliminary version of this work was presented at the 2020 Advances in Neural Information Processing Systems Conference (NeurIPS 2020)

arXiv:1807.04255 [pdf, other]

doi 10.1109/TIT.2020.2964547

On the Fundamental Limits of Coded Data Shuffling for Distributed Machine Learning

Authors: Adel Elmahdy, Soheil Mohajer

Abstract: We consider the data shuffling problem in a distributed learning system, in which a master node is connected to a set of worker nodes, via a shared link, in order to communicate a set of files to the worker nodes. The master node has access to a database of files. In every shuffling iteration, each worker node processes a new subset of files, and has excess storage to partially cache the remaining files, assuming the cached files are uncoded. The caches of the worker nodes are updated every iteration, and they should be designed to satisfy any possible unknown permutation of the files in subsequent iterations. For this problem, we characterize the exact load-memory trade-off for worst-case shuffling by deriving the minimum communication load for a given storage capacity per worker node. As a byproduct, the exact load-memory trade-off for any shuffling is characterized when the number of files is equal to the number of worker nodes. We propose a novel deterministic coded shuffling scheme, which improves the state of the art, by exploiting the cache memories to create coded functions that can be decoded by several worker nodes. Then, we prove the optimality of our proposed scheme by deriving a matching lower bound and showing that the placement phase of the proposed coded shuffling scheme is optimal over all shuffles.

Submitted 20 June, 2020; v1 submitted 11 July, 2018; originally announced July 2018.

Comments: This work has been published in IEEE Transactions on Information Theory. A preliminary version of this work was presented at IEEE International Symposium on Information Theory (ISIT), Jun. 2018

Journal ref: IEEE Transactions on Information Theory, vol. 66, no. 5, pp. 3098-3131, May 2020

arXiv:1608.00209 [pdf, other]

Asymmetric Degrees of Freedom of the Full-Duplex MIMO 3-Way Channel with Unicast and Broadcast Messages

Authors: Adel M. Elmahdy, Amr El-Keyi, Yahya Mohasseb, Tamer ElBatt, Mohammed Nafie, Karim G. Seddik, Tamer Khattab

Abstract: In this paper, we characterize the asymmetric total degrees of freedom (DoF) of a multiple-input multiple-output (MIMO) 3-way channel. Each node has a separate-antenna full-duplex MIMO transceiver with a different number of antennas, where each antenna can be configured for either signal transmission or reception. We study this system under two message configurations; the first configuration is when each node has two unicast messages to be delivered to the two other nodes, while the second configuration is when each node has two unicast messages as well as one broadcast message to be delivered to the two other nodes. For each configuration, we first derive upper bounds on the total DoF of the system. Cut-set bounds in conjunction with genie-aided bounds are derived to characterize the achievable total DoF. Afterwards, we analytically derive the optimal number of transmit and receive antennas at each node to maximize the total DoF of the system, subject to the total number of antennas at each node. Finally, the achievable schemes for each configuration are constructed. The proposed schemes are mainly based on zero-forcing and null-space transmit beamforming.

Submitted 31 July, 2016; originally announced August 2016.

arXiv:1603.09530 [pdf, other]

doi 10.1109/WCNC.2016.7564798

On Optimizing Cooperative Cognitive User Performance under Primary QoS Constraints

Authors: Adel M. Elmahdy, Amr El-Keyi, Tamer ElBatt, Karim G. Seddik

Abstract: We study the problem of optimizing the performance of cognitive radio users with opportunistic real-time applications subject to primary users' quality-of-service (QoS) constraints. Two constrained optimization problems are formulated; the first problem is maximizing the secondary user throughput while the second problem is minimizing the secondary user average delay, subject to a common constraint on the primary user average delay. In spite of the complexity of the optimization problems, due to their non-convexity, we transform the first problem into a set of linear programs and the second problem into a set of quasiconvex optimization problems. We prove that both problems are equivalent with identical feasible sets and optimal solutions. We show, through numerical results, that the proposed cooperation policy represents the best compromise between enhancing the secondary users' QoS and satisfying the primary users' QoS requirements.

Submitted 31 March, 2016; originally announced March 2016.

Comments: 7 pages, IEEE WCNC 2016

arXiv:1410.2419 [pdf, other]

doi 10.1109/PIMRC.2014.7136302

On the Stable Throughput of Cooperative Cognitive Radio Networks with Finite Relaying Buffer

Authors: Adel M. Elmahdy, Amr El-Keyi, Tamer ElBatt, Karim G. Seddik

Abstract: In this paper, we study the problem of cooperative communications in cognitive radio systems where the secondary user has limited relaying room for the overheard primary packets. More specifically, we characterize the stable throughput region of a cognitive radio network with a finite relaying buffer at the secondary user. Towards this objective, we formulate a constrained optimization problem for maximizing the secondary user throughput while guaranteeing the stability of the primary user queue. We consider a general cooperation policy where the packet admission and queue selection probabilities, at the secondary user, are both dependent on the state (length) of the finite relaying buffer. Despite the sheer complexity of the optimization problem, attributed to its non-convexity, we transform it to a linear program. Our numerical results reveal a number of valuable insights, e.g., it is always mutually beneficial to cooperate in delivering the primary packets in terms of expanding the stable throughput region. In addition, the stable throughput region of the system, compared to the case of infinite relaying queue capacity, marginally shrinks for limited relaying queue capacity.

Submitted 9 October, 2014; originally announced October 2014.

Comments: 5 pages, IEEE PIMRC 2014

arXiv:1311.1240 [pdf, other]

doi 10.1109/PIMRC.2013.6666427

Generalized Instantly Decodable Network Coding for Relay-Assisted Networks

Authors: Adel M. Elmahdy, Sameh Sorour, Karim G. Seddik

Abstract: In this paper, we investigate the problem of minimizing the frame completion delay for Instantly Decodable Network Coding (IDNC) in relay-assisted wireless multicast networks. We first propose a packet recovery algorithm in the single relay topology which employs generalized IDNC instead of strict IDNC previously proposed in the literature for the same relay-assisted topology. This use of generalized IDNC is supported by showing that it is a super-set of the strict IDNC scheme, and thus can generate coding combinations that are at least as efficient as strict IDNC in reducing the average completion delay. We then extend our study to the multiple relay topology and propose a joint generalized IDNC and relay selection algorithm. This proposed algorithm benefits from the reception diversity of the multiple relays to further reduce the average completion delay in the network. Simulation results show that our proposed solutions achieve much better performance compared to previous solutions in the literature.

Submitted 9 October, 2014; v1 submitted 5 November, 2013; originally announced November 2013.

Comments: 5 pages, IEEE PIMRC 2013

Showing 1–12 of 12 results for author: Elmahdy, A