-
Interpreting and Steering State-Space Models via Activation Subspace Bottlenecks
Authors:
Vamshi Sunku Mohan,
Kaustubh Gupta,
Aneesha Das,
Chandan Singh
Abstract:
State-space models (SSMs) have emerged as an efficient strategy for building powerful language models, avoiding the quadratic complexity of computing attention in transformers. Despite their promise, the interpretability and steerability of modern SSMs remain relatively underexplored. We take a major step in this direction by identifying activation subspace bottlenecks in the Mamba family of SSM models using tools from mechanistic interpretability. We then introduce a test-time steering intervention that simply multiplies the activations of the identified bottlenecks by a scalar. Across 5 SSMs and 6 diverse benchmarks, this intervention improves performance by an average of 8.27%, without requiring any task-specific tuning. Finally, we validate that the identified bottlenecks are indeed hindering performance by modifying them to yield an architecture we call Stable-Mamba, which achieves long-context performance gains when retrained from scratch.
Submitted 26 February, 2026;
originally announced February 2026.
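The abstract's intervention, multiplying the activations of an identified bottleneck subspace by a scalar, can be sketched as follows. This is a minimal illustration assuming the bottleneck is given as an orthonormal basis `U`; the names, shapes, and the way the subspace is obtained are illustrative, not the paper's exact method.

```python
import numpy as np

# Hypothetical sketch: scale the component of an activation vector lying
# in an identified bottleneck subspace (spanned by the columns of U) by a
# scalar alpha, leaving the orthogonal complement untouched.
def steer(h, U, alpha):
    """h: (d,) activation; U: (d, k) orthonormal subspace basis."""
    proj = U @ (U.T @ h)             # component inside the bottleneck subspace
    return h + (alpha - 1.0) * proj  # alpha = 1 recovers the original activation

rng = np.random.default_rng(0)
d, k = 8, 2
Q, _ = np.linalg.qr(rng.normal(size=(d, d)))
U = Q[:, :k]                         # orthonormal basis for the subspace
h = rng.normal(size=d)

h_steered = steer(h, U, alpha=2.0)
# The in-subspace component is doubled; the rest is unchanged.
assert np.allclose(U.T @ h_steered, 2.0 * (U.T @ h))
```

Because the update only rescales the projection, setting `alpha=1` is a no-op, which makes the intervention easy to sweep at test time.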
-
Learning Tennis Strategy Through Curriculum-Based Dueling Double Deep Q-Networks
Authors:
Vishnu Mohan
Abstract:
Tennis strategy optimization is a challenging sequential decision-making problem involving hierarchical scoring, stochastic outcomes, long-horizon credit assignment, physical fatigue, and adaptation to opponent skill. I present a reinforcement learning framework that integrates a custom tennis simulation environment with a Dueling Double Deep Q-Network (DDQN) trained using curriculum learning. The environment models complete tennis scoring at the level of points, games, and sets; rally-level tactical decisions across ten discrete action categories; symmetric fatigue dynamics; and a continuous opponent skill parameter. The dueling architecture decomposes action-value estimation into state-value and advantage components, while double Q-learning reduces overestimation bias and improves training stability in this long-horizon stochastic domain. Curriculum learning progressively increases opponent difficulty from 0.40 to 0.50, enabling robust skill acquisition without the training collapse observed under fixed opponents. Across extensive evaluations, the trained agent achieves win rates between 98 and 100 percent against balanced opponents and maintains strong performance against more challenging opponents. Serve efficiency ranges from 63.0 to 67.5 percent, and return efficiency ranges from 52.8 to 57.1 percent. Ablation studies demonstrate that both the dueling architecture and curriculum learning are necessary for stable convergence, while a standard DQN baseline fails to learn effective policies. Despite strong performance, tactical analysis reveals a pronounced defensive bias, with the learned policy prioritizing error avoidance and prolonged rallies over aggressive point construction. These results highlight a limitation of win-rate driven optimization in simplified sports simulations and emphasize the importance of reward design for realistic sports reinforcement learning.
Submitted 19 December, 2025;
originally announced December 2025.
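The dueling decomposition the abstract describes is a standard construction, sketched below. The value and advantage numbers are illustrative, not from the paper; subtracting the mean advantage is the usual identifiability trick.

```python
import numpy as np

# Minimal sketch of the dueling head used in a Dueling DQN:
#   Q(s, a) = V(s) + A(s, a) - mean_a' A(s, a')
# Subtracting the mean advantage pins down the V/A split, since otherwise
# a constant could be shifted freely between the two streams.
def dueling_q(value, advantages):
    """value: scalar V(s); advantages: (n_actions,) A(s, ·)."""
    advantages = np.asarray(advantages, dtype=float)
    return value + advantages - advantages.mean()

q = dueling_q(value=1.5, advantages=[0.2, -0.1, 0.5])
# By construction, the mean of Q over actions equals V(s).
assert np.isclose(q.mean(), 1.5)
```

In the full agent, `value` and `advantages` would come from two network heads sharing a common feature trunk; double Q-learning then uses the online network to pick the argmax action and the target network to evaluate it.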
-
A Fair OR-ML Framework for Resource Substitution in Large-Scale Networks
Authors:
Ved Mohan,
El Mehdi Er Raqabi,
Pascal Van Hentenryck
Abstract:
Ensuring that the right resource is available at the right location and time remains a major challenge for organizations operating large-scale logistics networks. The challenge comes from uneven demand patterns and the resulting asymmetric flow of resources across the arcs, which create persistent imbalances at the network nodes. Resource substitution among multiple, potentially composite and interchangeable, resource types is a cost-effective way to mitigate these imbalances. This leads to the resource substitution problem, which aims at determining the minimum number of resource substitutions from an initial assignment to minimize the overall network imbalance. In decentralized settings, achieving globally coordinated solutions becomes even more difficult. When substitution entails costs, effective prescriptions must also incorporate fairness and account for the individual preferences of schedulers. This paper presents a generic framework that combines operations research (OR) and machine learning (ML) to enable fair resource substitution in large networks. The OR component models and solves the resource substitution problem under a fairness lens. The ML component leverages historical data to learn schedulers' preferences, guide intelligent exploration of the decision space, and enhance computational efficiency by dynamically selecting the top-$\kappa$ resources for each arc in the network. The framework produces a portfolio of high-quality solutions from which schedulers can select satisfactory trade-offs. The proposed framework is applied to the network of one of the largest package delivery companies in the world, which serves as the primary motivation for this research. Computational results demonstrate substantial improvements over state-of-the-art methods, including an 80% reduction in model size and a 90% decrease in execution time while preserving optimality.
Submitted 22 November, 2025;
originally announced November 2025.
-
MetaEmbed: Scaling Multimodal Retrieval at Test-Time with Flexible Late Interaction
Authors:
Zilin Xiao,
Qi Ma,
Mengting Gu,
Chun-cheng Jason Chen,
Xintao Chen,
Vicente Ordonez,
Vijai Mohan
Abstract:
Universal multimodal embedding models have achieved great success in capturing semantic relevance between queries and candidates. However, current methods either condense queries and candidates into a single vector, potentially limiting the expressiveness for fine-grained information, or produce too many vectors that are prohibitive for multi-vector retrieval. In this work, we introduce MetaEmbed, a new framework for multimodal retrieval that rethinks how multimodal embeddings are constructed and interacted with at scale. During training, a fixed number of learnable Meta Tokens are appended to the input sequence. At test-time, their last-layer contextualized representations serve as compact yet expressive multi-vector embeddings. Through the proposed Matryoshka Multi-Vector Retrieval training, MetaEmbed learns to organize information by granularity across multiple vectors. As a result, we enable test-time scaling in multimodal retrieval where users can balance retrieval quality against efficiency demands by selecting the number of tokens used for indexing and retrieval interactions. Extensive evaluations on the Massive Multimodal Embedding Benchmark (MMEB) and the Visual Document Retrieval Benchmark (ViDoRe) confirm that MetaEmbed achieves state-of-the-art retrieval performance while scaling robustly to models with 32B parameters. Code is available at https://github.com/facebookresearch/MetaEmbed.
Submitted 6 April, 2026; v1 submitted 22 September, 2025;
originally announced September 2025.
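The test-time scaling idea, trading retrieval quality against cost by choosing how many vectors to use, can be illustrated with a generic late-interaction (ColBERT-style MaxSim) scorer. This is a sketch under stated assumptions: the truncation-by-prefix behavior mirrors the Matryoshka idea in the abstract, but the shapes, names, and exact scoring rule are illustrative, not the MetaEmbed formulation.

```python
import numpy as np

# Generic late-interaction scoring over multi-vector embeddings:
# each query vector is matched to its best candidate vector (MaxSim),
# and the matches are summed. Test-time scaling keeps only the first
# k vectors per side, trading expressiveness for index/compute cost.
def maxsim_score(query_vecs, cand_vecs, k=None):
    """query_vecs: (nq, d), cand_vecs: (nc, d); use only the first k vectors."""
    if k is not None:
        query_vecs, cand_vecs = query_vecs[:k], cand_vecs[:k]
    sims = query_vecs @ cand_vecs.T   # pairwise vector-level similarities
    return sims.max(axis=1).sum()     # best match per query vector, summed

rng = np.random.default_rng(1)
q = rng.normal(size=(16, 32))         # 16 query-side vectors, dim 32
c = rng.normal(size=(16, 32))         # 16 candidate-side vectors
coarse = maxsim_score(q, c, k=1)      # cheapest: single-vector retrieval
fine = maxsim_score(q, c, k=16)       # most expressive: all vectors
assert np.isfinite(coarse) and np.isfinite(fine)
```

A practical deployment would score a large candidate set with small `k` and re-rank the survivors with larger `k`, which is exactly the quality/efficiency dial the abstract describes.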
-
Language Self-Play For Data-Free Training
Authors:
Jakub Grudzien Kuba,
Mengting Gu,
Qi Ma,
Yuandong Tian,
Vijai Mohan,
Jason Chen
Abstract:
Large language models (LLMs) have advanced rapidly in recent years, driven by scale, abundant high-quality training data, and reinforcement learning. Yet this progress faces a fundamental bottleneck: the need for ever more data from which models can continue to learn. In this work, we propose a reinforcement learning approach that removes this dependency by enabling models to improve without additional data. Our method leverages a game-theoretic framework of self-play, where a model's capabilities are cast as performance in a competitive game and stronger policies emerge by having the model play against itself, a process we call Language Self-Play (LSP). Experiments with Llama-3.2-3B-Instruct on instruction-following, mathematics, and coding benchmarks show that pretrained models can be effectively improved with self-play alone.
Submitted 18 December, 2025; v1 submitted 9 September, 2025;
originally announced September 2025.
-
REFRAG: Rethinking RAG based Decoding
Authors:
Xiaoqiang Lin,
Aritra Ghosh,
Bryan Kian Hsiang Low,
Anshumali Shrivastava,
Vijai Mohan
Abstract:
Large Language Models (LLMs) have demonstrated remarkable capabilities in leveraging extensive external knowledge to enhance responses in multi-turn and agentic applications, such as retrieval-augmented generation (RAG). However, processing long-context inputs introduces significant system latency and demands substantial memory for the key-value cache, resulting in reduced throughput and a fundamental trade-off between knowledge enrichment and system efficiency. While minimizing latency for long-context inputs is a primary objective for LLMs, we contend that RAG requires specialized consideration. In RAG, much of the LLM context consists of concatenated passages from retrieval, with only a small subset directly relevant to the query. These passages often exhibit low semantic similarity due to diversity or deduplication during re-ranking, leading to block-diagonal attention patterns that differ from those in standard LLM generation tasks. Based on this observation, we argue that most computations over the RAG context during decoding are unnecessary and can be eliminated with minimal impact on performance. To this end, we propose REFRAG, an efficient decoding framework that compresses, senses, and expands to improve latency in RAG applications. By exploiting the sparsity structure, we demonstrate a 30.85$\times$ time-to-first-token acceleration (a 3.75$\times$ improvement over previous work) without loss in perplexity. In addition, our optimization framework for large context enables REFRAG to extend the context size of LLMs by 16$\times$. We provide rigorous validation of REFRAG across diverse long-context tasks, including RAG, multi-turn conversations, and long document summarization, spanning a wide range of datasets. Experimental results confirm that REFRAG delivers substantial speedup with no loss in accuracy compared to LLaMA models and other state-of-the-art baselines across various context sizes.
Submitted 12 October, 2025; v1 submitted 31 August, 2025;
originally announced September 2025.
-
De Sitter Complexity Grows Linearly in the Static Patch
Authors:
Vyshnav Mohan,
Watse Sybesma
Abstract:
The observable universe has undergone periods of expansion that are well approximated by de Sitter (dS) space. Still lacking is a quantum mechanical description of dS, both globally and when restricted to the static patch. We develop a novel prescription for computing holographic complexity in the dS static patch to determine its microscopic features. Specifically, we propose that the natural candidate for dS complexity is the volume of extremal timelike surfaces restricted to the static patch, anchored to the cosmological horizon or an observer worldline. Our anchoring prescription provides a clear definition of a reference state, overcoming a common ambiguity in prior definitions of de Sitter holographic complexity. The late-time growth of our complexity functional is linear and proportional to the number of degrees of freedom associated to the cosmological horizon, and therefore does not exhibit hyperfast growth. Our results imply the dS static patch is characterized by a quantum mechanical system, with a finite dimensional Hilbert space whose evolution is governed by a chaotic Hamiltonian.
Submitted 13 August, 2025;
originally announced August 2025.
-
Black Shell Thermodynamics
Authors:
Ulf Danielsson,
Vyshnav Mohan,
Larus Thorlacius
Abstract:
Black shells have been proposed as black hole mimickers, i.e. horizonless ultra-compact objects that replace black holes. In this paper, we assume the existence of black shells and consider their thermodynamic properties, but remain agnostic about their wider role in gravitational physics. An ambient negative cosmological constant is introduced in order to have a well-defined canonical ensemble, leading to a rich phase structure. In particular, the Hawking-Page transition between thermal AdS vacuum and large AdS black holes is split in two, with an intermediate black shell phase, which may play a role in gauge/gravity duality at finite volume. Similarly, for non-vanishing electric charge below a critical value, a black shell phase separates two black hole phases at low and high temperatures. Above the critical charge, there are no phase transitions and large AdS black holes always have the lowest free energy.
Submitted 2 June, 2025;
originally announced June 2025.
-
Event-based Neural Spike Detection Using Spiking Neural Networks for Neuromorphic iBMI Systems
Authors:
Chanwook Hwang,
Biyan Zhou,
Ye Ke,
Vivek Mohan,
Jong Hwan Ko,
Arindam Basu
Abstract:
Implantable brain-machine interfaces (iBMIs) are evolving to record from thousands of neurons wirelessly but face challenges in data bandwidth, power consumption, and implant size. We propose a novel Spiking Neural Network Spike Detector (SNN-SPD) that processes event-based neural data generated via delta modulation and pulse count modulation, converting signals into sparse events. By leveraging the temporal dynamics and inherent sparsity of spiking neural networks, our method improves spike detection performance while maintaining low computational overhead suitable for implantable devices. Our experimental results demonstrate that the proposed SNN-SPD achieves an accuracy of 95.72% at high noise levels (standard deviation 0.2), which is about 2% higher than the existing Artificial Neural Network Spike Detector (ANN-SPD). Moreover, SNN-SPD requires only 0.41% of the computation and about 26.62% of the weight parameters compared to ANN-SPD, with zero multiplications. This approach balances efficiency and performance, enabling effective data compression and power savings for next-generation iBMIs.
Submitted 10 May, 2025;
originally announced May 2025.
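The delta-modulation front end the abstract mentions, which turns a sampled waveform into sparse events, works as sketched below. This is the generic scheme, with an illustrative signal and threshold; the paper's actual parameters and hardware realization are not shown.

```python
import numpy as np

# Sketch of delta modulation: emit a sparse +/-1 event whenever the input
# drifts more than `threshold` from the last reconstructed level. Only the
# events need to be transmitted, which is the bandwidth win for iBMIs.
def delta_modulate(signal, threshold=0.1):
    events, level = [], signal[0]
    for t, x in enumerate(signal):
        while x - level > threshold:
            level += threshold
            events.append((t, +1))   # ON event: signal rose past the level
        while level - x > threshold:
            level -= threshold
            events.append((t, -1))   # OFF event: signal fell below the level
    return events

sig = np.sin(np.linspace(0, 2 * np.pi, 200))
events = delta_modulate(sig, threshold=0.1)
# Far fewer events than raw samples: that sparsity is what the SNN consumes.
assert 0 < len(events) < len(sig)
```

The reconstructed `level` trace stays within one threshold of the input, so downstream spike detection operates on the event stream rather than the dense waveform.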
-
Architectural Exploration of Hybrid Neural Decoders for Neuromorphic Implantable BMI
Authors:
Vivek Mohan,
Biyan Zhou,
Zhou Wang,
Anil Bharath,
Emmanuel Drakakis,
Arindam Basu
Abstract:
This work presents an efficient decoding pipeline for neuromorphic implantable brain-machine interfaces (Neu-iBMI), leveraging sparse neural event data from an event-based neural sensing scheme. We introduce a tunable event filter (EvFilter), which also functions as a spike detector (EvFilter-SPD), significantly reducing the number of events processed for decoding by 192X and 554X, respectively. The proposed pipeline achieves high decoding performance, up to $R^2=0.73$, with ANN- and SNN-based decoders, eliminating the need for the signal recovery, spike detection, or sorting commonly performed in conventional iBMI systems. The SNN-Decoder reduces the computations and memory required by 5-23X compared to NN- and LSTM-Decoders, while the ST-NN-Decoder delivers performance similar to an LSTM-Decoder while requiring 2.5X fewer resources. This streamlined approach significantly reduces computational and memory demands, making it ideal for low-power, on-implant, or wearable iBMIs.
Submitted 9 May, 2025;
originally announced May 2025.
-
Black Hole Singularities from Holographic Complexity
Authors:
Vyshnav Mohan
Abstract:
Using a second law of complexity, we prove a black hole singularity theorem. By introducing the notion of trapped extremal surfaces, we show that their existence implies null geodesic incompleteness inside globally hyperbolic black holes. We also demonstrate that the vanishing of the growth rate of the volume of extremal surfaces provides a sharp diagnostic of the black hole singularity. In static, uncharged, spherically symmetric spacetimes, this corresponds to the growth rate of spacelike extremal surfaces going to zero at the singularity. In charged or rotating spacetimes, such as the Reissner-Nordström and Kerr black holes, we identify novel timelike extremal surfaces that exhibit the same behavior at the timelike singularity.
Submitted 15 June, 2025; v1 submitted 14 April, 2025;
originally announced April 2025.
-
Late-Time Saturation of Black Hole Complexity
Authors:
Friðrik Freyr Gautason,
Vyshnav Mohan,
Lárus Thorlacius
Abstract:
The holographic complexity of a static spherically symmetric black hole, defined as the volume of an extremal surface, grows linearly with time at late times in general relativity. The growth comes from a region at a constant transverse area inside the black hole and continues forever in the classical theory. In this region the volume complexity of any spherically symmetric black hole in $d+1$ spacetime dimensions reduces to a geodesic length in an effective two-dimensional JT-gravity theory. The length in JT-gravity has been argued to saturate at very late times via non-perturbative corrections obtained from a random matrix description of the gravity theory. The same argument, applied to our effective JT-gravity description of the volume complexity, leads to complexity saturation at times of exponential order in the Bekenstein-Hawking entropy of a $d+1$-dimensional black hole. Along the way, we explore a simple toy model for complexity growth, based on a discretisation of Nielsen complexity geometry, that can be analytically shown to exhibit the expected late-time complexity saturation.
Submitted 20 March, 2025; v1 submitted 24 February, 2025;
originally announced February 2025.
-
Learning in Log-Domain: Subthreshold Analog AI Accelerator Based on Stochastic Gradient Descent
Authors:
Momen K Tageldeen,
Yacine Belgaid,
Vivek Mohan,
Zhou Wang,
Emmanuel M Drakakis
Abstract:
The rapid proliferation of AI models, coupled with growing demand for edge deployment, necessitates the development of AI hardware that is both high-performance and energy-efficient. In this paper, we propose a novel analog accelerator architecture designed for AI/ML training workloads using stochastic gradient descent with L2 regularization (SGDr). The architecture leverages log-domain circuits in subthreshold MOS and incorporates volatile memory. We establish a mathematical framework for solving SGDr in the continuous time domain and detail the mapping of SGDr learning equations to log-domain circuits. By operating in the analog domain and utilizing weak inversion, the proposed design achieves significant reductions in transistor area and power consumption compared to digital implementations. Experimental results demonstrate that the architecture closely approximates ideal behavior, with a mean square error below 0.87% and precision as low as 8 bits. Furthermore, the architecture supports a wide range of hyperparameters. This work paves the way for energy-efficient analog AI hardware with on-chip training capabilities.
Submitted 22 January, 2025;
originally announced January 2025.
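The SGDr learning rule realized by the circuit, stochastic gradient descent with L2 regularization, integrated in continuous time, can be sketched with a forward-Euler discretization of the ODE $\dot{w} = -\eta\,(\nabla_w L(w) + \lambda w)$. The 1-D least-squares problem and all constants below are illustrative; the log-domain subthreshold implementation is of course not captured here.

```python
import numpy as np

# Sketch of SGDr as a discretized continuous-time update,
#   dw/dt = -eta * (grad_loss(w) + lam * w),
# applied to a 1-D least-squares fit. Constants are illustrative.
def sgdr_fit(xs, ys, eta=0.1, lam=0.01, steps=500):
    w = 0.0
    for _ in range(steps):
        grad = np.mean(2 * (w * xs - ys) * xs)  # d/dw of mean squared error
        w -= eta * (grad + lam * w)             # SGD step with L2 decay
    return w

xs = np.array([1.0, 2.0, 3.0])
ys = 0.5 * xs                     # data generated with true slope 0.5
w = sgdr_fit(xs, ys)
# The L2 term shrinks the converged estimate slightly below the true 0.5.
assert 0.4 < w < 0.5
```

The fixed point solves $\nabla_w L(w) + \lambda w = 0$, so the regularizer biases the weight toward zero, which is the behavior the analog circuit must reproduce with its volatile memory.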
-
Non-Perturbative Corrections to Charged Black Hole Evaporation
Authors:
Vyshnav Mohan,
Lárus Thorlacius
Abstract:
The recent work of Brown et al. (arXiv:2411.03447) demonstrated that the low-temperature evaporation rate of a large near-extremal charged black hole is significantly reduced from semiclassical expectations. The quantum corrections responsible for the deviation come from Schwarzian modes of an emergent Jackiw-Teitelboim gravity description of the near-horizon geometry of the black hole. Using a one-parameter family of non-perturbative Airy completions, we extend these results to incorporate non-perturbative effects. At large parameter value, the non-perturbative evaporation rate is even smaller than the perturbative JT gravity results. The disparity becomes especially pronounced at very low energies, where the non-perturbative neutral Hawking flux is suppressed by a double exponential in the entropy of the black hole, effectively stopping its evaporation until the next charged particle is emitted via the Schwinger effect. We also explore an alternative family of Bessel completions for which the non-perturbative energy flux exceeds the perturbative JT gravity prediction.
Submitted 3 December, 2024; v1 submitted 20 November, 2024;
originally announced November 2024.
-
The Llama 3 Herd of Models
Authors:
Aaron Grattafiori,
Abhimanyu Dubey,
Abhinav Jauhri,
Abhinav Pandey,
Abhishek Kadian,
Ahmad Al-Dahle,
Aiesha Letman,
Akhil Mathur,
Alan Schelten,
Alex Vaughan,
Amy Yang,
Angela Fan,
Anirudh Goyal,
Anthony Hartshorn,
Aobo Yang,
Archi Mitra,
Archie Sravankumar,
Artem Korenev,
Arthur Hinsvark,
Arun Rao,
Aston Zhang,
Aurelien Rodriguez,
Austen Gregerson,
Ava Spataru,
Baptiste Roziere
, et al. (536 additional authors not shown)
Abstract:
Modern artificial intelligence (AI) systems are powered by foundation models. This paper presents a new set of foundation models, called Llama 3. It is a herd of language models that natively support multilinguality, coding, reasoning, and tool usage. Our largest model is a dense Transformer with 405B parameters and a context window of up to 128K tokens. This paper presents an extensive empirical evaluation of Llama 3. We find that Llama 3 delivers comparable quality to leading language models such as GPT-4 on a plethora of tasks. We publicly release Llama 3, including pre-trained and post-trained versions of the 405B parameter language model and our Llama Guard 3 model for input and output safety. The paper also presents the results of experiments in which we integrate image, video, and speech capabilities into Llama 3 via a compositional approach. We observe this approach performs competitively with the state-of-the-art on image, video, and speech recognition tasks. The resulting models are not yet being broadly released as they are still under development.
Submitted 23 November, 2024; v1 submitted 31 July, 2024;
originally announced July 2024.
-
Hybrid Event-Frame Neural Spike Detector for Neuromorphic Implantable BMI
Authors:
Vivek Mohan,
Wee Peng Tay,
Arindam Basu
Abstract:
This work introduces two novel neural spike detection schemes intended for use in next-generation neuromorphic brain-machine interfaces (iBMIs). The first, an Event-based Spike Detector (Ev-SPD), which examines the temporal neighborhood of a neural event for spike detection, is designed for in-vivo processing and offers high sensitivity and decent accuracy (94-97%). The second, a Neural Network-based Spike Detector (NN-SPD), which operates on hybrid temporal event frames, provides an off-implant solution using shallow neural networks with impressive detection accuracy (96-99%) and minimal false detections. These methods are evaluated using a synthetic dataset with varying noise levels and validated through comparison with ground truth data. The results highlight their potential in next-gen neuromorphic iBMI systems and emphasize the need to explore this direction further to understand their resource-efficient and high-performance capabilities for practical iBMI settings.
Submitted 13 May, 2024;
originally announced May 2024.
-
Blockchains, MEV and the knapsack problem: a primer
Authors:
Vijay Mohan,
Peyman Khezr
Abstract:
In this paper, we take a close look at a problem labeled maximal extractable value (MEV), which arises in a blockchain due to the ability of a block producer to manipulate the order of transactions within a block. Indeed, blockchains such as Ethereum have spent considerable resources addressing this issue and have redesigned the block production process to account for MEV. This paper provides an overview of the MEV problem and tracks how Ethereum has adapted to its presence. A vital aspect of the block building exercise is that it is a variant of the knapsack problem. Consequently, this paper highlights the role of designing auctions to fill a knapsack, or knapsack auctions, in alleviating the MEV problem. Overall, this paper presents a survey of the main issues and an accessible primer for researchers and students wishing to explore the economics of block building and MEV further.
Submitted 27 March, 2024;
originally announced March 2024.
-
Strategic Bidding in Knapsack Auctions
Authors:
Peyman Khezr,
Vijay Mohan,
Lionel Page
Abstract:
This paper examines knapsack auctions as a method to solve the knapsack problem with incomplete information, where object values are private and sizes are public. We analyze three auction types, uniform price (UP), discriminatory price (DP), and generalized second price (GSP), to determine efficient resource allocation in these settings. Using a Greedy algorithm for allocating objects, we analyze the bidding behavior, revenue, and efficiency of these three auctions using theory, lab experiments, and AI-enriched simulations. Our results suggest that the uniform-price auction has the highest level of truthful bidding and efficiency, while the discriminatory price and generalized second-price auctions are superior in terms of revenue generation. This study not only deepens the understanding of auction-based approaches to NP-hard problems but also provides practical insights for market design.
Submitted 30 April, 2024; v1 submitted 29 February, 2024;
originally announced March 2024.
-
Towards Neuromorphic Compression based Neural Sensing for Next-Generation Wireless Implantable Brain Machine Interface
Authors:
Vivek Mohan,
Wee Peng Tay,
Arindam Basu
Abstract:
This work introduces a neuromorphic compression based neural sensing architecture with an address-event-representation-inspired readout protocol for massively parallel, next-generation wireless iBMI. The architectural trade-offs and implications of the proposed method are quantitatively analyzed in terms of compression ratio and spike information preservation. For the latter, we use metrics such as the root-mean-square error and the correlation coefficient between the original and recovered signal to assess the effect of neuromorphic compression on spike shape. Furthermore, we use accuracy, sensitivity, and false detection rate to understand the effect of compression on downstream iBMI tasks, specifically spike detection. We demonstrate that a data compression ratio of $50-100$ can be achieved, $5-18\times$ more than prior work, by selective transmission of event pulses corresponding to neural spikes. We obtain a correlation coefficient of $\approx0.9$ and a spike detection accuracy of over $90\%$ both in the worst-case analysis involving a $10K$-channel simulated recording and in the typical analysis using $100$- or $384$-channel real neural recordings. We also analyze the collision handling capability and scalability of the proposed pipeline.
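The two waveform-level metrics named above can be computed as follows; this is a minimal sketch of the metrics themselves, not the paper's evaluation pipeline, and the function name is an assumption.

```python
import numpy as np

def reconstruction_metrics(original, recovered):
    """Spike-shape preservation metrics for lossy neural compression:
    root-mean-square error and Pearson correlation coefficient
    between the original and recovered waveforms."""
    original = np.asarray(original, dtype=float)
    recovered = np.asarray(recovered, dtype=float)
    rmse = np.sqrt(np.mean((original - recovered) ** 2))
    corr = np.corrcoef(original, recovered)[0, 1]
    return rmse, corr

# A perfectly recovered waveform has zero RMSE and correlation 1.
sig = np.sin(np.linspace(0, 2 * np.pi, 100))
rmse, corr = reconstruction_metrics(sig, sig)
```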
Submitted 14 December, 2023;
originally announced December 2023.
-
State-independent Black Hole Interiors from the Crossed Product
Authors:
Chethan Krishnan,
Vyshnav Mohan
Abstract:
Opinion is divided about the nature of state dependence in the black hole interior. Some argue that it is a necessary feature, while others argue it is a bug. In this paper, we consider the extended half-sided modular translation $U(s_0)$ (with $s_0 > 0$) of Leutheusser and Liu that takes us inside the horizon. We note that we can use this operator to construct a modular Hamiltonian $H$ and a conjugation $J$ on the infalling time-evolved wedges. The original thermofield double translates to a new cyclic and separating vector in the shifted algebra. We use these objects and the Connes cocycle to repeat Witten's crossed product construction in this new setting, and to obtain a Type II$_\infty$ algebra that is independent of the various choices, in particular that of the cyclic separating vector. Our emergent times are implicitly boundary-dressed. But if one admits an ``extra'' observer in the interior, we argue that the (state-independent) algebra can be Type I or Type II$_1$ instead of Type II$_\infty$, depending on whether the observer's light cone contains an entire Cauchy slice or not. Along with these general considerations, we present some specific calculations in the setting of the Poincare BTZ black hole. We identify a generalization of modular translations in BTZ-Kruskal coordinates that is pointwise (as opposed to non-local) and is analytically tractable, exploiting a connection with the covering AdS-space. These evolutions can reach the singularity.
Submitted 29 May, 2024; v1 submitted 9 October, 2023;
originally announced October 2023.
-
Krylov Complexity of Open Quantum Systems: From Hard Spheres to Black Holes
Authors:
Vyshnav Mohan
Abstract:
We examine the complexity of quasi-static chaotic open quantum systems. As a prototypical example, we analytically compute the Krylov complexity of a slowly leaking hard-sphere gas using Berry's conjecture. We then connect it to the holographic complexity of a $d+1$-dimensional evaporating black hole using the Complexity=Volume proposal. We model the black hole spacetime by stitching together a sequence of static Schwarzschild patches across incoming negative energy null shock waves. Under certain identification of parameters, we find the late time complexity growth rate during each quasi-static equilibrium to be the same in both systems.
Submitted 3 December, 2023; v1 submitted 21 August, 2023;
originally announced August 2023.
-
Vector quantization loss analysis in VQGANs: a single-GPU ablation study for image-to-image synthesis
Authors:
Luv Verma,
Varun Mohan
Abstract:
This study performs an ablation analysis of Vector Quantized Generative Adversarial Networks (VQGANs), concentrating on image-to-image synthesis utilizing a single NVIDIA A100 GPU. The work explores the nuanced effects of varying critical parameters, including the number of epochs, image count, and the attributes of codebook vectors and latent dimensions, specifically within the constraint of limited resources. Notably, we focus on the vector quantization loss, keeping other hyperparameters and loss components (GAN loss) fixed, in order to gain a deeper understanding of the discrete latent space and to explore how varying its size affects reconstruction. Although our results do not surpass existing benchmarks, our findings shed significant light on VQGAN's behaviour for a smaller dataset, particularly concerning artifacts, codebook size optimization, and comparative analysis with Principal Component Analysis (PCA). The study also uncovers a promising direction by introducing 2D positional encodings, revealing a marked reduction in artifacts and insights into balancing clarity and overfitting.
Submitted 9 August, 2023;
originally announced August 2023.
-
Interference-Managed Local Service Insertion for 5G Broadcast
Authors:
M. V. Abhay Mohan,
K. Giridhar
Abstract:
Broadcast of localized TV content enables tailored content delivery catering to the requirements of a regional user base. 5G multicast-broadcast service (MBS) requires a spectrally efficient broadcast solution that enables the change of content from one local service area (LSA) to another. A frequency reuse factor of unity between two adjacent LSAs causes their boundary region to become saturated with co-channel interference (CCI). Increasing the reuse factor will reduce the CCI at the cost of degraded spectral efficiency. This paper addresses frequency and transmit power planning that manages the CCI at the LSA boundary to achieve a satisfactory trade-off between spectral efficiency and broadcast coverage.
Submitted 12 March, 2023; v1 submitted 1 September, 2022;
originally announced October 2022.
-
Interpreting the Bulk Page Curve: A Vestige of Locality on Holographic Screens
Authors:
Chethan Krishnan,
Vyshnav Mohan
Abstract:
Areas of extremal surfaces anchored to sub-regions on screens in Minkowski space satisfy various entanglement entropy inequalities. In 2+1 dimensions where the arguments are simplest, we demonstrate (a) monogamy of mutual information, (b) various versions of (strong) subadditivity, (c) various inequalities involving the entanglement of purification, as well as (d) the reflection inequality and (e) the Araki-Lieb inequality. Just as in AdS, Linden-Winter and the tower of Cadney-Linden-Winter inequalities are satisfied trivially. All of these are purely geometric (and therefore unambiguous) statements, and we expect them to hold semi-classically when $G_N \rightarrow 0$. The results of arXiv:2103.17253 suggest that it is unlikely that there is non-analyticity at $G_N=0$. These observations have relevance for the Page phase transition in flat space black holes observed with respect to a screen in arXiv:2005.02993 and arXiv:2006.06872. In particular, they constitute a Lorentzian argument that these extremal surface transitions are indeed phase transitions of $some$ suitably defined entanglement entropy associated to $subregions$ on the screen.
Submitted 27 December, 2021;
originally announced December 2021.
-
Interference-Aware Accurate Signal Recovery in sub-1 GHz UHF Band Reuse-1 Cellular OFDMA Downlinks
Authors:
Abhay Mohan M V,
Giridhar K
Abstract:
Reuse-1 systems operating in the sub-1 GHz UHF band are limited by substantial co-channel interference (CCI). In such orthogonal frequency division multiple access (OFDMA) cellular systems, the inter-sector or inter-tower interference (ITI) makes accurate signal recovery quite challenging as sub-1 GHz bands only support single-input single-output (SISO) links. Interference-aware receiver algorithms are essential to mitigate the ITI in such low-frequency bands. Such algorithms enable ubiquitous mobile broadband access over the entire homeland, say with >95% geographical coverage with quality of service guarantees. One element of the interference-aware signal recovery is the least-squares-based joint channel estimation scheme that uses non-orthogonal pilot subcarriers. This estimator is then compared with a variant that uses orthogonal pilot subcarriers to bring out the advantage of this joint estimator. It is shown that the proposed joint estimator requires fewer pilots to be well-determined when compared to its under-determined orthogonal counterpart. Moreover, it is easy to implement and does not require any knowledge of channel statistics. This work also derives a compensation factor needed for the interference-aware detector in the presence of inter-carrier interference (ICI) originating from multiple transmitters. Simulation results show that the proposed joint channel estimator outperforms traditional estimators at moderate to high frequency selectivity. The proposed compensation factor to the joint detector is found to be essential for recovering the transmitted signal in the absence of phase-tracking pilots.
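The least-squares joint channel estimation step described above can be illustrated with a toy flat-fading example: stack the known pilot symbols of both co-channel transmitters into one matrix and solve for both channel gains jointly. The dimensions, the noiseless flat-fading setting, and the variable names below are illustrative assumptions, not the paper's receiver.

```python
import numpy as np

rng = np.random.default_rng(0)

# Each row of A holds the known pilot symbols of the two co-channel
# transmitters on one (non-orthogonal) pilot subcarrier; h stacks the
# two flat-fading channel gains to be estimated jointly.
n_pilots = 8
A = rng.standard_normal((n_pilots, 2)) + 1j * rng.standard_normal((n_pilots, 2))
h_true = np.array([0.8 + 0.3j, -0.5 + 0.1j])
y = A @ h_true  # received pilots, noiseless for clarity

# Joint least-squares estimate of both channels from shared pilots.
h_est, *_ = np.linalg.lstsq(A, y, rcond=None)
```

The system is well-determined as soon as there are at least as many pilot rows as unknown channel coefficients, which is the sense in which the joint estimator needs fewer pilots than an orthogonal per-transmitter scheme.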
Submitted 15 November, 2022; v1 submitted 1 August, 2021;
originally announced August 2021.
-
Hints of Gravitational Ergodicity: Berry's Ensemble and the Universality of the Semi-Classical Page Curve
Authors:
Chethan Krishnan,
Vyshnav Mohan
Abstract:
Recent developments on black holes have shown that a unitarity-compatible Page curve can be obtained from an ensemble-averaged semi-classical approximation. In this paper, we emphasize (1) that this peculiar manifestation of unitarity is not specific to black holes, and (2) that it can emerge from a single realization of an underlying unitary theory. To make things explicit, we consider a hard sphere gas leaking slowly from a small box into a bigger box. This is a quantum chaotic system in which we expect to see the Page curve in the full unitary description, while semi-classically, eigenstates are expected to behave as though they live in Berry's ensemble. We reproduce the unitarity-compatible Page curve of this system, semi-classically. The computation has structural parallels to replica wormholes, relies crucially on ensemble averaging at each epoch, and reveals the interplay between the multiple time-scales in the problem. Working with the ensemble averaged $state$ rather than the entanglement entropy, we can also engineer an information "paradox". Our system provides a concrete example in which the ensemble underlying the semi-classical Page curve is an ergodic proxy for a time average, and not an explicit average over many theories. The questions we address here are logically independent of the existence of horizons, so we expect that semi-classical gravity should also be viewed in a similar light.
Submitted 4 May, 2021; v1 submitted 15 February, 2021;
originally announced February 2021.
-
Quantum crypto-economics: Blockchain prediction markets for the evolution of quantum technology
Authors:
Peter P. Rohde,
Vijay Mohan,
Sinclair Davidson,
Chris Berg,
Darcy Allen,
Gavin K. Brennen,
Jason Potts
Abstract:
Two of the most important technological advancements currently underway are the advent of quantum technologies, and the transitioning of global financial systems towards cryptographic assets, notably blockchain-based cryptocurrencies and smart contracts. There is, however, an important interplay between the two, given that, in due course, quantum technology will have the ability to directly compromise the cryptographic foundations of blockchain. We explore this complex interplay by building financial models for quantum failure in various scenarios, including pricing quantum risk premiums. We call this quantum crypto-economics.
Submitted 1 February, 2021;
originally announced February 2021.
-
Roadmap on quantum nanotechnologies
Authors:
Arne Laucht,
Frank Hohls,
Niels Ubbelohde,
M Fernando Gonzalez-Zalba,
David J Reilly,
Søren Stobbe,
Tim Schröder,
Pasquale Scarlino,
Jonne V Koski,
Andrew Dzurak,
Chih-Hwan Yang,
Jun Yoneda,
Ferdinand Kuemmeth,
Hendrik Bluhm,
Jarryd Pla,
Charles Hill,
Joe Salfi,
Akira Oiwa,
Juha T Muhonen,
Ewold Verhagen,
Matthew D LaHaye,
Hyun Ho Kim,
Adam W Tsen,
Dimitrie Culcer,
Attila Geresdi
, et al. (4 additional authors not shown)
Abstract:
Quantum phenomena are typically observable at length and time scales smaller than those of our everyday experience, often involving individual particles or excitations. The past few decades have seen a revolution in the ability to structure matter at the nanoscale, and experiments at the single particle level have become commonplace. This has opened wide new avenues for exploring and harnessing quantum mechanical effects in condensed matter. These quantum phenomena, in turn, have the potential to revolutionize the way we communicate, compute and probe the nanoscale world. Here, we review developments in key areas of quantum research in light of the nanotechnologies that enable them, with a view to what the future holds. Materials and devices with nanoscale features are used for quantum metrology and sensing, as building blocks for quantum computing, and as sources and detectors for quantum communication. They enable explorations of quantum behaviour and unconventional states in nano- and opto-mechanical systems, low-dimensional systems, molecular devices, nano-plasmonics, quantum electrodynamics, scanning tunnelling microscopy, and more. This rapidly expanding intersection of nanotechnology and quantum science/technology is mutually beneficial to both fields, laying claim to some of the most exciting scientific leaps of the last decade, with more on the horizon.
Submitted 19 January, 2021;
originally announced January 2021.
-
Adversary Models for Mobile Device Authentication
Authors:
René Mayrhofer,
Vishwath Mohan,
Stephan Sigg
Abstract:
Mobile device authentication has been a highly active research topic for over 10 years, with a vast range of methods having been proposed and analyzed. In related areas such as secure channel protocols, remote authentication, or desktop user authentication, strong, systematic, and increasingly formal threat models have already been established and are used to qualitatively and quantitatively compare different methods. Unfortunately, the analysis of mobile device authentication is often based on weak adversary models, suggesting overly optimistic results on their respective security. In this article, we first introduce a new classification of adversaries to better analyze and compare mobile device authentication methods. We then apply this classification to a systematic literature survey. The survey shows that security is still an afterthought and that most proposed protocols lack a comprehensive security analysis. Our proposed classification of adversaries provides a strong uniform adversary model that can offer a comparable and transparent classification of security properties in mobile device authentication methods.
Submitted 21 September, 2020;
originally announced September 2020.
-
EBBINNOT: A Hardware Efficient Hybrid Event-Frame Tracker for Stationary Dynamic Vision Sensors
Authors:
Vivek Mohan,
Deepak Singla,
Tarun Pulluri,
Andres Ussa,
Pradeep Kumar Gopalakrishnan,
Pao-Sheng Sun,
Bharath Ramesh,
Arindam Basu
Abstract:
As an alternative sensing paradigm, dynamic vision sensors (DVS) have recently been explored to tackle scenarios where conventional sensors result in high data rates and processing times. This paper presents a hybrid event-frame approach for detecting and tracking objects recorded by a stationary neuromorphic sensor, thereby exploiting the sparse DVS output in a low-power setting for traffic monitoring. Specifically, we propose a hardware-efficient processing pipeline that optimizes memory and computational needs to enable long-term battery-powered usage for IoT applications. To exploit the background removal property of a static DVS, we propose an event-based binary image creation that signals the presence or absence of events in a frame duration. This reduces the memory requirement and enables the use of simple algorithms like median filtering and connected component labeling for denoising and region proposal, respectively. To overcome the fragmentation issue, we propose a YOLO-inspired neural-network-based detector and classifier that merges fragmented region proposals. Finally, we propose a new tracker that exploits the overlap between detections and tracks, with heuristics to overcome occlusion. The proposed pipeline is evaluated with more than 5 hours of traffic recordings spanning three different locations on two different neuromorphic sensors (DVS and CeleX) and demonstrates similar performance on both. Compared to existing event-based feature trackers, our method provides similar accuracy while needing approximately 6 times fewer computations. To the best of our knowledge, this is the first time a stationary-DVS-based traffic monitoring solution has been extensively compared to simultaneously recorded RGB frame-based methods, showing tremendous promise by outperforming state-of-the-art deep learning solutions.
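The binary-image, median-filter, and region-proposal stages of such a pipeline can be sketched with plain numpy as follows. The window size, 4-connectivity, and function names are illustrative assumptions; the network-based merge step and the overlap tracker are beyond this sketch.

```python
import numpy as np
from collections import deque

def events_to_binary(events, shape):
    """Mark pixels that received at least one event in the frame window."""
    img = np.zeros(shape, dtype=np.uint8)
    for x, y in events:
        img[y, x] = 1
    return img

def median3(img):
    """3x3 binary median filter (majority vote) for denoising."""
    padded = np.pad(img, 1)
    out = np.zeros_like(img)
    h, w = img.shape
    for i in range(h):
        for j in range(w):
            out[i, j] = 1 if padded[i:i + 3, j:j + 3].sum() >= 5 else 0
    return out

def connected_components(img):
    """4-connected component labeling via BFS; returns bounding boxes
    as (x_min, y_min, x_max, y_max) region proposals."""
    seen = np.zeros_like(img, dtype=bool)
    boxes = []
    h, w = img.shape
    for i in range(h):
        for j in range(w):
            if img[i, j] and not seen[i, j]:
                q = deque([(i, j)])
                seen[i, j] = True
                ys, xs = [i], [j]
                while q:
                    a, b = q.popleft()
                    for da, db in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        na, nb = a + da, b + db
                        if 0 <= na < h and 0 <= nb < w and img[na, nb] and not seen[na, nb]:
                            seen[na, nb] = True
                            q.append((na, nb))
                            ys.append(na)
                            xs.append(nb)
                boxes.append((min(xs), min(ys), max(xs), max(ys)))
    return boxes

# A 3x3 blob of events plus one isolated noise event: the median filter
# suppresses the noise pixel and labeling yields a single proposal.
frame = events_to_binary(
    [(x, y) for x in range(2, 5) for y in range(2, 5)] + [(8, 8)], (10, 10))
boxes = connected_components(median3(frame))
```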
Submitted 9 May, 2022; v1 submitted 30 May, 2020;
originally announced June 2020.
-
A 75kb SRAM in 65nm CMOS for In-Memory Computing Based Neuromorphic Image Denoising
Authors:
Sumon Kumar Bose,
Vivek Mohan,
Arindam Basu
Abstract:
This paper presents an in-memory computing (IMC) architecture for image denoising. The proposed SRAM based in-memory processing framework works in tandem with approximate computing on a binary image generated from neuromorphic vision sensors. Implemented in TSMC 65nm process, the proposed architecture enables approximately 2000X energy savings (approximately 222X from IMC) compared to a digital implementation when tested with the video recordings from a DAVIS sensor and achieves a peak throughput of 1.25-1.66 frames/us.
Submitted 23 March, 2020;
originally announced March 2020.
-
Causal Learning by a Robot with Semantic-Episodic Memory in an Aesop's Fable Experiment
Authors:
Ajaz A. Bhat,
Vishwanathan Mohan
Abstract:
Corvids, apes, and children solve The Crow and The Pitcher task (from Aesop's Fables) indicating a causal understanding of the task. By cumulatively interacting with different objects, how can cognitive agents abstract the underlying cause-effect relations to predict affordances of novel objects? We address this question by re-enacting the Aesop's Fable task on a robot and present a) a brain-guided neural model of semantic-episodic memory; with b) four task-agnostic learning rules that compare expectations from recalled past episodes with the current scenario to progressively extract the hidden causal relations. The ensuing robot behaviours illustrate causal learning; and predictions for novel objects converge to Archimedes' principle, independent of both the objects explored during learning and the order of their cumulative exploration.
Submitted 29 February, 2020;
originally announced March 2020.
-
Machine Learning ${\cal N}=8, D=5$ Gauged Supergravity
Authors:
Chethan Krishnan,
Vyshnav Mohan,
Soham Ray
Abstract:
Type IIB string theory on a 5-sphere gives rise to ${\cal N}=8, SO(6)$ gauged supergravity in five dimensions. Motivated by the fact that this is the context of the most widely studied example of the AdS/CFT correspondence, we undertake an investigation of its critical points. The scalar manifold is an $E_{6(6)}/USp(8)$ coset, and the challenge is that it is 42-dimensional. We take a Machine Learning approach to the problem using TensorFlow, and this results in a substantial increase in the number of known critical points. Our list of 32 critical points contains all five of the previously known ones, including an ${\cal N}=2$ supersymmetric point identified by Khavaev, Pilch and Warner.
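A common numerical strategy for locating critical points of a scalar potential is to minimize $|\nabla V|^2$ by gradient descent; the abstract only says a TensorFlow-based search was used, so treating it this way is an assumption. Here is a toy 2D analogue of the idea with plain numpy, on a potential whose critical points are known.

```python
import numpy as np

# Toy potential V(x, y) = x^4/4 - x^2/2 + y^2/2, with critical points
# at (0, 0) and (+-1, 0). We descend on f = |grad V|^2, whose zeros
# are exactly the critical points of V.
def grad_V(p):
    x, y = p
    return np.array([x**3 - x, y])

def grad_f(p):
    # Hand-computed gradient of f = (x^3 - x)^2 + y^2.
    x, y = p
    return np.array([2 * (x**3 - x) * (3 * x**2 - 1), 2 * y])

p = np.array([0.8, 0.5])  # starting point; step size chosen for stability
for _ in range(2000):
    p = p - 0.05 * grad_f(p)
# p converges to the critical point near (1, 0).
```

In the 42-dimensional coset the gradient would come from automatic differentiation rather than a hand-derived formula, but the objective is the same.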
Submitted 20 March, 2020; v1 submitted 28 February, 2020;
originally announced February 2020.
-
Extreme Classification in Log Memory using Count-Min Sketch: A Case Study of Amazon Search with 50M Products
Authors:
Tharun Medini,
Qixuan Huang,
Yiqiu Wang,
Vijai Mohan,
Anshumali Shrivastava
Abstract:
In the last decade, it has been shown that many hard AI tasks, especially in NLP, can be naturally modeled as extreme classification problems leading to improved precision. However, such models are prohibitively expensive to train due to the memory blow-up in the last layer. For example, a reasonable softmax layer for the dataset of interest in this paper can easily reach well beyond 100 billion parameters (>400 GB memory). To alleviate this problem, we present Merged-Average Classifiers via Hashing (MACH), a generic K-classification algorithm where memory provably scales at O(logK) without any strong assumption on the classes. MACH is subtly a count-min sketch structure in disguise, which uses universal hashing to reduce classification with a large number of classes to a few embarrassingly parallel and independent classification tasks with a small (constant) number of classes. MACH naturally provides a technique for zero-communication model parallelism. We experiment with 6 datasets, some multiclass and some multilabel, and show consistent improvement over respective state-of-the-art baselines. In particular, we train an end-to-end deep classifier on a private product search dataset sampled from Amazon Search Engine with 70 million queries and 49.46 million products. MACH outperforms, by a significant margin, the state-of-the-art extreme classification models deployed on commercial search engines: Parabel and dense embedding models. Our largest model has 6.4 billion parameters and trains in less than 35 hours on a single p3.16x machine. Our training times are 7-10x faster, and our memory footprints are 2-4x smaller than the best baselines. This training time is also significantly lower than the one reported by Google's mixture of experts (MoE) language model on a comparable model size and hardware.
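The count-min-sketch flavour of the decoding can be sketched as follows; the hash construction, shapes, and names are illustrative assumptions rather than the released implementation. R independent universal hashes map K classes into B buckets, and a class is scored by averaging the probability of its bucket across the R small models.

```python
import numpy as np

K, B, R = 1000, 32, 4  # classes, buckets per repetition, repetitions
rng = np.random.default_rng(0)

# Universal-style hashes h_r(k) = ((a_r * k + b_r) mod p) mod B.
p = 1_000_003  # a prime comfortably larger than K
a = rng.integers(1, p, size=R)
b = rng.integers(0, p, size=R)

def bucket(k, r):
    return int((a[r] * k + b[r]) % p % B)

def class_scores(bucket_probs):
    """bucket_probs: (R, B) per-repetition bucket probabilities.
    Scores class k by the mean probability of its bucket across the
    R repetitions; the prediction is the argmax of these scores."""
    return np.array([
        np.mean([bucket_probs[r, bucket(k, r)] for r in range(R)])
        for k in range(K)
    ])

# If every small model puts all its mass on the true class's bucket,
# that class attains the maximal possible score of 1.
true_k = 123
probs = np.zeros((R, B))
for r in range(R):
    probs[r, bucket(true_k, r)] = 1.0
scores = class_scores(probs)
```

Since every bucket probability is at most 1, an impostor class can reach score 1 only by colliding with the true class in all R hashes, which is exponentially unlikely in R; that is the count-min-sketch intuition behind the O(log K) memory claim.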
Submitted 28 October, 2019;
originally announced October 2019.
-
A Study of Context Dependencies in Multi-page Product Search
Authors:
Keping Bi,
Choon Hui Teo,
Yesh Dattatreya,
Vijai Mohan,
W. Bruce Croft
Abstract:
In product search, users tend to browse results on multiple search result pages (SERPs) (e.g., for queries on clothing and shoes) before deciding which item to purchase. Users' clicks can be considered implicit feedback which indicates their preferences and can be used to re-rank subsequent SERPs. Relevance feedback (RF) techniques are usually employed to deal with such scenarios. However, these methods are designed for document retrieval, where relevance is the most important criterion. In contrast, product search engines need to retrieve items that are not only relevant but also satisfactory in terms of customers' preferences. Personalization based on users' purchase history has been shown to be effective in product search. However, this method captures users' long-term interest, which does not always align with their short-term interest, and does not benefit customers with little or no purchase history. In this paper, we study RF techniques based on both long-term and short-term context dependencies in multi-page product search. We also propose an end-to-end context-aware embedding model which can capture both types of context. Our experimental results show that short-term context leads to much better performance compared with long-term and no context. Moreover, our proposed model is more effective than state-of-the-art word-based RF models.
Submitted 9 January, 2020; v1 submitted 9 September, 2019;
originally announced September 2019.
-
Leverage Implicit Feedback for Context-aware Product Search
Authors:
Keping Bi,
Choon Hui Teo,
Yesh Dattatreya,
Vijai Mohan,
W. Bruce Croft
Abstract:
Product search serves as an important entry point for online shopping. In contrast to web search, the retrieved results in product search not only need to be relevant but also should satisfy customers' preferences in order to elicit purchases. Previous work has shown the efficacy of purchase history in personalized product search. However, customers with little or no purchase history do not benefit from personalized product search. Furthermore, preferences extracted from a customer's purchase history are usually long-term and may not always align with her short-term interests. Hence, in this paper, we leverage clicks within a query session, as implicit feedback, to represent users' hidden intents, which further act as the basis for re-ranking subsequent result pages for the query. It has been studied extensively to model user preference with implicit feedback in recommendation tasks. However, there has been little research on modeling users' short-term interest in product search. We study whether short-term context could help promote users' ideal item in the following result pages for a query. Furthermore, we propose an end-to-end context-aware embedding model which can capture long-term and short-term context dependencies. Our experimental results on the datasets collected from the search log of a commercial product search engine show that short-term context leads to much better performance compared with long-term and no context. Our results also show that our proposed model is more effective than word-based context-aware models.
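The idea of re-ranking with session clicks can be illustrated with a deliberately simple baseline, not the paper's embedding model: blend each candidate's original relevance score with its cosine similarity to the mean embedding of the items clicked earlier in the same query session. All names and the blending weight are assumptions for illustration.

```python
import math

def rerank_by_click_context(candidates, clicked, alpha=0.5):
    """Toy short-term-context re-ranker.

    candidates: list of (item_id, relevance_score, embedding) tuples
    clicked: list of embeddings (equal-length float lists) of items
             clicked earlier in the session
    alpha: weight on the original relevance score vs. click similarity
    """
    dim = len(clicked[0])
    # Mean embedding of clicked items acts as the short-term intent vector.
    ctx = [sum(e[i] for e in clicked) / len(clicked) for i in range(dim)]
    ctx_norm = math.sqrt(sum(c * c for c in ctx))

    def blended(cand):
        _, score, emb = cand
        dot = sum(e * c for e, c in zip(emb, ctx))
        sim = dot / (math.sqrt(sum(e * e for e in emb)) * ctx_norm)
        return alpha * score + (1 - alpha) * sim

    return [cid for cid, _, _ in sorted(candidates, key=blended, reverse=True)]

# Item "a" matches the clicked item's embedding and overtakes the
# slightly more relevant but off-intent item "b".
ranked = rerank_by_click_context(
    [("a", 0.5, [1.0, 0.0]), ("b", 0.6, [0.0, 1.0])],
    [[1.0, 0.0]])
```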
Submitted 9 January, 2020; v1 submitted 4 September, 2019;
originally announced September 2019.
-
Automatic Repair and Type Binding of Undeclared Variables using Neural Networks
Authors:
Venkatesh Theru Mohan,
Ali Jannesari
Abstract:
Deep learning has been used in program analysis to predict hidden software defects from software defect datasets, to detect security vulnerabilities using generative adversarial networks, and to identify syntax errors by training a neural machine translation model on program code. However, all these approaches require either defect datasets or bug-free, executable source code for training the deep learning model. Our neural network model is trained with neither defect datasets nor bug-free source code; instead, it is trained on the structural and semantic details of the Abstract Syntax Tree (AST), where each node represents a construct appearing in the source code. The model fixes one of the most common semantic errors, undeclared variable errors, and also infers the variables' type information before program compilation. With this approach, the model correctly locates and identifies the errors in 81% of the programs in the Prutor dataset of 1059 programs containing only undeclared variable errors, and also infers their types correctly in 80% of the programs.
Submitted 14 July, 2019;
originally announced July 2019.
-
Semantic Product Search
Authors:
Priyanka Nigam,
Yiwei Song,
Vijai Mohan,
Vihan Lakshman,
Weitian Ding,
Ankit Shingavi,
Choon Hui Teo,
Hao Gu,
Bing Yin
Abstract:
We study the problem of semantic matching in product search, that is, given a customer query, retrieve all semantically related products from the catalog. Pure lexical matching via an inverted index falls short in this respect due to several factors: a) lack of understanding of hypernyms, synonyms, and antonyms, b) fragility to morphological variants (e.g. "woman" vs. "women"), and c) sensitivity to spelling errors. To address these issues, we train a deep learning model for semantic matching using customer behavior data. Much of the recent work on large-scale semantic search using deep learning focuses on ranking for web search. In contrast, semantic matching for product search presents several novel challenges, which we elucidate in this paper. We address these challenges by a) developing a new loss function that has an inbuilt threshold to differentiate between random negative examples, impressed but not purchased examples, and positive examples (purchased items), b) using average pooling in conjunction with n-grams to capture short-range linguistic patterns, c) using hashing to handle out of vocabulary tokens, and d) using a model parallel training architecture to scale across 8 GPUs. We present compelling offline results that demonstrate at least 4.7% improvement in Recall@100 and 14.5% improvement in mean average precision (MAP) over baseline state-of-the-art semantic search methods using the same tokenization method. Moreover, we present results and discuss learnings from online A/B tests which demonstrate the efficacy of our method.
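The multi-threshold loss described in point (a) can be illustrated with a minimal sketch. This is not the paper's actual implementation; the threshold values, the 0.1 margin, and the label encoding below are invented for the example:

```python
def three_way_hinge_loss(score, label, thresholds=(0.2, 0.55, 0.9)):
    """Hinge-style loss with separate targets for the three example types.

    label: 0 = random negative, 1 = impressed but not purchased, 2 = purchased.
    The thresholds sketch the idea of an inbuilt threshold separating the
    three classes within a single piecewise objective (values are invented).
    """
    t_neg, t_mid, t_pos = thresholds
    if label == 0:
        # Random negatives: penalize any score above the low threshold.
        return max(0.0, score - t_neg)
    elif label == 1:
        # Impressed-but-not-purchased: pull scores toward a middle band.
        return max(0.0, abs(score - t_mid) - 0.1)
    else:
        # Purchases: penalize any score below the high threshold.
        return max(0.0, t_pos - score)
```

Under this toy objective, a purchased item scored at 0.95 incurs zero loss, while a random negative scored at 0.5 is penalized.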
Submitted 1 July, 2019;
originally announced July 2019.
-
Compressing Gradient Optimizers via Count-Sketches
Authors:
Ryan Spring,
Anastasios Kyrillidis,
Vijai Mohan,
Anshumali Shrivastava
Abstract:
Many popular first-order optimization methods (e.g., Momentum, AdaGrad, Adam) accelerate the convergence rate of deep learning models. However, these algorithms require auxiliary parameters, which cost additional memory proportional to the number of parameters in the model. The problem is becoming more severe as deep learning models continue to grow larger in order to learn from complex, large-scale datasets. Our proposed solution is to maintain a linear sketch to compress the auxiliary variables. We demonstrate that our technique has the same performance as the full-sized baseline, while using significantly less space for the auxiliary variables. Theoretically, we prove that count-sketch optimization maintains the SGD convergence rate, while gracefully reducing memory usage for large models. On the large-scale 1-Billion Word dataset, we save 25% of the memory used during training (8.6 GB instead of 11.7 GB) by compressing the Adam optimizer in the Embedding and Softmax layers, with negligible accuracy and performance loss. For an Amazon extreme classification task with over 49.5 million classes, we also reduce the training time by 38% by increasing the mini-batch size 3.5x using our count-sketch optimizer.
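A minimal count-sketch illustrates the core idea of compressing a large vector of auxiliary optimizer variables (e.g., Adam's second moments, keyed by parameter index) into a fixed-size table. The hash scheme, depth, and width here are illustrative choices, not the paper's settings:

```python
import numpy as np

class CountSketch:
    """Linear sketch storing approximate per-key accumulators in
    O(depth * width) space, independent of the number of keys."""

    def __init__(self, depth=3, width=1 << 16, seed=0):
        rng = np.random.default_rng(seed)
        self.depth, self.width = depth, width
        self.table = np.zeros((depth, width))
        # Per-row salts for the index hash and the +/-1 sign hash
        # (a simple multiplicative hash, for illustration only).
        self.idx_salt = rng.integers(1, 1 << 31, size=depth)
        self.sign_salt = rng.integers(1, 1 << 31, size=depth)

    def _hashes(self, key):
        idx = (key * self.idx_salt) % self.width
        sign = np.where((key * self.sign_salt) % 2 == 0, 1.0, -1.0)
        return idx, sign

    def update(self, key, delta):
        # Add the signed update into one cell per row.
        idx, sign = self._hashes(key)
        self.table[np.arange(self.depth), idx] += sign * delta

    def query(self, key):
        # Median across rows is robust to occasional hash collisions.
        idx, sign = self._hashes(key)
        return np.median(sign * self.table[np.arange(self.depth), idx])
```

An optimizer would call `update` with each parameter's statistic and `query` to read it back approximately, trading a small estimation error for a memory footprint that no longer scales with the model size.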
Submitted 26 February, 2019; v1 submitted 31 January, 2019;
originally announced February 2019.
-
Adaptive, Personalized Diversity for Visual Discovery
Authors:
Choon Hui Teo,
Houssam Nassif,
Daniel Hill,
Sriram Srinavasan,
Mitchell Goodman,
Vijai Mohan,
SVN Vishwanathan
Abstract:
Search queries are appropriate when users have explicit intent, but they perform poorly when the intent is difficult to express or if the user is simply looking to be inspired. Visual browsing systems allow e-commerce platforms to address these scenarios while offering the user an engaging shopping experience. Here we explore extensions in the direction of adaptive personalization and item diversification within Stream, a new form of visual browsing and discovery by Amazon. Our system presents the user with a diverse set of interesting items while adapting to user interactions. Our solution consists of three components: (1) a Bayesian regression model for scoring the relevance of items while leveraging uncertainty, (2) a submodular diversification framework that re-ranks the top-scoring items based on category, and (3) personalized category preferences learned from the user's behavior. When tested on live traffic, our algorithms show a strong lift in click-through rate and session duration.
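Component (2) can be sketched as a greedy re-ranker that maximizes relevance plus a submodular category-coverage term, so that repeats of a category earn diminishing returns. The field names, the equal weighting of the two terms, and the square-root coverage function are assumptions for the example, not the system's actual design:

```python
import math

def diversify(items, k):
    """Greedily select k items balancing relevance against category diversity.

    items: list of dicts with hypothetical fields "score" and "category".
    """
    chosen, per_category = [], {}
    candidates = list(items)
    while candidates and len(chosen) < k:
        def gain(item):
            n = per_category.get(item["category"], 0)
            # sqrt is concave, so each additional item from the same
            # category contributes a smaller marginal coverage gain.
            coverage_gain = math.sqrt(n + 1) - math.sqrt(n)
            return item["score"] + coverage_gain
        best = max(candidates, key=gain)
        candidates.remove(best)
        chosen.append(best)
        per_category[best["category"]] = per_category.get(best["category"], 0) + 1
    return chosen
```

With two near-tied items from category "A" and a weaker item from "B", the second pick goes to "B": the concave coverage term outweighs the small relevance gap.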
Submitted 2 October, 2018;
originally announced October 2018.
-
Low resolution spectroscopy of selected Algol systems
Authors:
D. Shanti Priya,
J. Rukmini,
M. Parthasarathy,
D. K. Sahu,
Vijay Mohan,
B. C. Bhatt,
Vineet S. Thomas
Abstract:
The analysis of spectroscopic data for 30 Algol-type binaries is presented. All these systems are short-period Algols having primaries with spectral types B and A. Dominant spectral lines were identified in the collected spectra and their equivalent widths were calculated. All the spectra were examined for the presence of mass transfer, a disk or circumstellar matter, and chromospheric emission. We also present the first spectroscopic and period studies for a few Algols and conclude that high-resolution spectra within and outside the primary minimum are needed for a better understanding of these Algol-type close binaries.
Submitted 1 March, 2018;
originally announced March 2018.
-
Quantum interactions, Predictability and Emergence Of Gravity
Authors:
Vyshnav Mohan
Abstract:
In this paper, we will show that gravity can emerge from an effective field theory, obtained by tracing out the fermionic system from an interacting quantum field theory, when we impose the condition that the field equations must be Cauchy predictable. The source of the gravitational field can be identified with the quantum interactions that existed in the interacting QFT. This relation is very similar to the ER = EPR conjecture and strongly relies on the fact that the emergence of a classical theory depends on the underlying quantum processes and interactions. We consider two concrete examples for reaching the result: one where initially there was no gravity and another where gravity was present. The latter case results in first-order corrections to Einstein's equations and immediately reproduces well-known results such as effective event horizons and gravitational birefringence.
Submitted 14 February, 2018;
originally announced February 2018.
-
Back-reaction of quantum processes and modified gravitational dynamics
Authors:
Vyshnav Mohan
Abstract:
In this paper, we seek a modified theory of gravity that accounts for the back-reaction of QED on curved spacetime. It is already known that vacuum fluctuations induce interactions between gravity and photons. An effective action for electromagnetism, which encodes the details of such quantum processes, is used to obtain a set of modified Maxwell's equations and a new Lagrangian for the dynamics of gravity. Imposing Cauchy predictability on these modified Maxwell's equations leads us to a new effective metric, from which the Lagrangian can be calculated. The new Lagrangian for gravity turns out to be a function not only of higher-order derivatives of the metric tensor but also of the polarization of the photon. This immediately results in phenomena such as gravitational birefringence and the existence of black holes with polarization-dependent event horizons.
Submitted 29 August, 2017; v1 submitted 26 August, 2017;
originally announced August 2017.
-
Study of the plutino object (208996) 2003 AZ84 from stellar occultations: size, shape and topographic features
Authors:
A. Dias-Oliveira,
B. Sicardy,
J. L. Ortiz,
F. Braga-Ribas,
R. Leiva,
R. Vieira-Martins,
G. Benedetti-Rossi,
J. I. B. Camargo,
M. Assafin,
A. R. Gomes-Junior,
T. Baug,
T. Chandrasekhar,
J. Desmars,
R. Duffard,
P. Santos-Sanz,
Z. Ergang,
S. Ganesh,
Y. Ikari,
P. Irawati,
J. Jain,
Z. Liying,
A. Richichi,
Q. Shengbang,
R. Behrend,
Z. Benkhaldoun
, et al. (38 additional authors not shown)
Abstract:
We present results derived from four stellar occultations by the plutino object (208996) 2003~AZ$_{84}$, detected on January 8, 2011 (single-chord event), February 3, 2012 (multi-chord), December 2, 2013 (single-chord) and November 15, 2014 (multi-chord). Our observations rule out an oblate-spheroid solution for 2003~AZ$_{84}$'s shape. Instead, assuming hydrostatic equilibrium, we find that a Jacobi triaxial solution with semi-axes $(470 \pm 20) \times (383 \pm 10) \times (245 \pm 8)$~km (axis ratios $b/a = 0.82 \pm 0.05$ and $c/a = 0.52 \pm 0.02$) can better account for all our occultation observations. Combining these dimensions with the rotation period of the body (6.75~h) and the amplitude of its rotation light curve, we derive a density $\rho = 0.87 \pm 0.01$~g~cm$^{-3}$ and a geometric albedo $p_V = 0.097 \pm 0.009$. A grazing chord observed during the 2014 occultation reveals a topographic feature along 2003~AZ$_{84}$'s limb, which can be interpreted as an abrupt chasm of width $\sim 23$~km and depth $> 8$~km, or a smooth depression of width $\sim 80$~km and depth $\sim 13$~km (or an intermediate feature between those two extremes).
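The quoted axis ratios follow directly from the fitted semi-axes; a quick consistency check on the central values (quoted uncertainties ignored):

```python
# Consistency check: the quoted Jacobi axis ratios follow from the
# fitted semi-axes (central values only).
a, b, c = 470.0, 383.0, 245.0      # semi-axes in km

b_over_a = b / a                   # ~0.815, quoted as 0.82 +/- 0.05
c_over_a = c / a                   # ~0.521, quoted as 0.52 +/- 0.02
```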
Submitted 30 May, 2017;
originally announced May 2017.
-
SALT observation of X-ray pulse reprocessing in 4U 1626-67
Authors:
Gayathri Raman,
Biswajit Paul,
Dipankar Bhattacharya,
Vijay Mohan
Abstract:
We investigate optical reprocessing of X-rays in the LMXB pulsar 4U 1626-67 in its current spin-up phase, using observations with the Southern African Large Telescope (SALT), near-simultaneous observations with Swift-XRT, and non-simultaneous RXTE-PCA observations, and present the results of timing analysis. Using SALT observations carried out on 5 and 6 March 2014, we detect some interesting reprocessing signatures. We detect a weak optical Quasi-Periodic Oscillation (QPO) in the power density spectrum of March 5 at 48 mHz with a fractional rms of 3.3%, even though the source shows no corresponding X-ray QPO in the spin-up phase. In the light curve obtained on March 5, we detect a coherent pulsation at the spin period of ~7.677 s. A previously known, slightly down-shifted side-band is also detected at 129.92 mHz. The frequency spacing between the main pulse and this side-band differs from earlier observations, though the statistical significance of the difference is limited. The light curve of March 6 displays short-timescale variability in the form of flares lasting a few minutes. Folded pulse profiles from this night's data show an interesting trend of pulse-peak drifting. This drift could be due to (i) rapid changes in the reprocessing agent, such as orbital motion of an accretion-disk warp around the neutron star, or (ii) intrinsic pulse-phase changes in the X-rays. We also examine some X-ray light curves obtained with RXTE-PCA during 2008-2010 for pulse-shape changes on short timescales during X-ray flares.
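The quoted numbers fix the side-band spacing by simple arithmetic; the conversion of that spacing to a beat period is a back-of-the-envelope illustration, not a result from the paper:

```python
# Spin frequency implied by the coherent pulsation, and the spacing
# to the down-shifted side-band quoted in the abstract.
spin_period_s = 7.677                  # coherent optical pulsation
spin_freq_mhz = 1e3 / spin_period_s    # ~130.26 mHz

sideband_mhz = 129.92                  # observed side-band frequency
spacing_mhz = spin_freq_mhz - sideband_mhz   # ~0.34 mHz

# A spacing of ~0.34 mHz corresponds to a beat period of ~49 minutes.
beat_period_min = 1e3 / spacing_mhz / 60.0
```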
Submitted 17 March, 2016; v1 submitted 4 February, 2016;
originally announced February 2016.
-
V899 Mon: An Outbursting Protostar With Peculiar Light Curve And Its Transition Phases
Authors:
J. P. Ninan,
D. K. Ojha,
T. Baug,
B. C. Bhatt,
V. Mohan,
S. K. Ghosh,
A. Men'shchikov,
G. C. Anupama,
M. Tamura,
Th. Henning
Abstract:
We present a detailed study of V899 Mon (a new member of the FUors/EXors family of young low-mass stars undergoing outburst), based on our long-term monitoring of the source from November 2009 to April 2015. Our optical and near-infrared photometric and spectroscopic monitoring recorded the source transitioning from its first outburst to a short quiescence phase ($<$ 1 year), and then returning to a second outburst. We report the evolution of the outflows from the inner region of the disk as the accretion rate evolved across various epochs. Our high-resolution (R$\sim$37000) optical spectrum resolves interesting clumpy structures in the outflow traced by various lines. A change in far-infrared flux was also detected between the two outburst epochs. Based on our observations, we constrain various stellar and envelope parameters of V899 Mon, as well as the kinematics of its accretion and outflow. The photometric and spectroscopic properties of this source fall between those of classical FUors and EXors. Our investigation of V899 Mon hints that an instability associated with magnetospheric accretion is the physical cause of the sudden, short-duration pause of the outburst in 2011. It is also a good candidate for explaining similar short pauses in the outbursts of some other FUors/EXors sources.
Submitted 24 October, 2015;
originally announced October 2015.
-
Molecular hydrogen from z = 0.0963 DLA towards the QSO J1619+3342
Authors:
Raghunathan Srianand,
Hadi Rahmani,
Sowgat Muzahid,
Vijay Mohan
Abstract:
We report the detection of H2 in a zabs = 0.0963 Damped Lyman-α (DLA) system towards the zem = 0.4716 QSO J1619+3342. This DLA has log N(H I) = 20.55 (0.10), 18.13 < log N(H2) < 18.40, [S/H] = -0.62 (0.13), [Fe/S] = -1.00 (0.17) and a molecular fraction -2.11 < log f(H2) < -1.85. The gas kinetic temperature inferred from the rotational level population is in the range 95 - 132 K. We do not detect C I or C II* absorption from this system. Using deep R- and V-band images, we identify a sub-L* galaxy at an impact parameter of 14 kpc from the line of sight, with a consistent photometric redshift, as a possible host for the absorber. We use the photoionization code CLOUDY to derive the physical conditions in the H2 component using the observational constraints from H2, C I, C II* and Mg I. All the observations can be consistently explained if one or more of the following is true: (i) carbon is underabundant by more than 0.6 dex, as seen in halo stars with Z ~ 0.1 Z_sun, (ii) the H I associated with the H2 component is less than 50% of the H I measured along the line of sight, and (iii) the H2 formation rate on dust grains is at least a factor of two higher than what is typically used in analytic calculations for the Milky Way interstellar medium. Even when these conditions are satisfied, the gas kinetic temperature in the models is much lower than that inferred from the ortho-to-para ratio of the molecular hydrogen. Alternatively, the high kinetic temperature could be a consequence of non-radiative heating processes, seen in hydrodynamical simulations, contributing to the gas heating.
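The quoted bounds on log f(H2) can be reproduced from the column densities via the standard definition f(H2) = 2N(H2) / (2N(H2) + N(HI)):

```python
import math

# Reproduce the quoted molecular-fraction bounds from the column
# densities (all column densities in cm^-2, given as log10 values).
log_NHI = 20.55

def log_molecular_fraction(log_NH2, log_NHI=log_NHI):
    NH2, NHI = 10.0 ** log_NH2, 10.0 ** log_NHI
    return math.log10(2 * NH2 / (2 * NH2 + NHI))

lower = log_molecular_fraction(18.13)   # ~ -2.12, quoted as -2.11
upper = log_molecular_fraction(18.40)   # ~ -1.86, quoted as -1.85
```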
Submitted 20 June, 2014;
originally announced June 2014.
-
Variability in Low Ionization Broad Absorption Line Outflows
Authors:
M. Vivek,
R. Srianand,
P. Petitjean,
V. Mohan,
A. Mahabal,
S. Samui
Abstract:
We present results of our time-variability studies of Mg II and Al III absorption lines in a sample of 22 Low Ionization Broad Absorption Line QSOs (LoBAL QSOs) at 0.2 <= zem <= 2.1, using the 2m telescope at IUCAA Girawali Observatory over timescales of 10 days to 7.69 years in the QSO's rest frame. Spectra are analysed in conjunction with photometric light curves from the Catalina Real-Time Transient Survey. Long-timescale (i.e., >= 1 year) absorption-line variability is seen in 8 cases (36% of systems), while only 4 of them (18% of systems) show variability over short timescales (i.e., < 1 year). We notice a tendency for highly variable LoBAL QSOs to have high ejection velocity, low equivalent width and low redshift. The detection rate of variability in LoBAL QSOs showing Fe fine-structure lines (FeLoBAL QSOs) is lower than that seen in non-Fe LoBAL QSOs. Absorption-line variability is more frequently detected in QSOs whose continuum is dominated by Fe emission lines than in the rest of the QSOs. Confirming these trends with a bigger sample will give vital clues for understanding the physical distinction between different BAL QSO sub-classes. We correlate the absorption-line variability with various parameters derived from the continuum light curves and find no clear correlation between continuum-flux and absorption-line variabilities. However, sources with large absorption-line variability also show large variability in their light curves. We also see appearance/disappearance of absorption components in 2 cases and clear indications of profile variations in 4 cases. The observed variability is best explained by a combination of processes driven by continuum variations and clouds transiting across the line of sight.
Submitted 12 February, 2014;
originally announced February 2014.
-
Re-appearance of McNeil's nebula (V1647 Orionis) and its outburst environment
Authors:
J. P. Ninan,
D. K. Ojha,
B. C. Bhatt,
S. K. Ghosh,
V. Mohan,
K. K. Mallick,
M. Tamura,
Th. Henning
Abstract:
We present a detailed study of McNeil's nebula (V1647 Ori) in its ongoing outburst phase from September 2008 to March 2013. Our 124 nights of photometric observations were carried out in the optical V, R, I and near-infrared J, H, K bands, and 59 nights of medium-resolution spectroscopic observations covered the 5200 - 9000 Ang wavelength range. All observations were carried out with the 2-m Himalayan Chandra Telescope and the 2-m IUCAA Girawali Telescope. Our observations show that over the last four and a half years, V1647 Ori and region C near the Herbig-Haro object HH 22A have been undergoing a slow dimming, at rates of ~0.04 mag/yr and ~0.05 mag/yr respectively in the R-band, 6 times slower than the rate during the similar stage of V1647 Ori's 2003 outburst. We detected a change in the flux distribution over the reflection nebula, implying changes in the circumstellar matter distribution between the 2003 and 2008 outbursts. Apart from a steady wind of velocity ~350 km/s, we detected two episodic magnetic-reconnection-driven winds. Forbidden [O I] 6300 Ang and [Fe II] 7155 Ang lines were also detected, implying shock regions, probably from jets. We tried to explain the outburst timescales of V1647 Ori using the standard models of FUor-like outbursts and found that pure thermal-instability models like Bell & Lin (1994) cannot explain the variations in timescales. In the framework of various instability models, we conclude that one possible reason for the sudden ending of the 2003 outburst in November 2005 was a low-density region or gap in the inner region (~1 AU) of the disc.
Submitted 19 September, 2013;
originally announced September 2013.
-
The outburst and nature of young eruptive low mass stars in dark clouds
Authors:
J. P. Ninan,
D. K. Ojha,
B. C. Bhatt,
K. K. Mallick,
A. Tej,
D. K. Sahu,
S. K. Ghosh,
V. Mohan
Abstract:
The FU Orionis (FUor) or EX Orionis (EXor) phenomenon has attracted increasing attention in recent years and is now accepted as a crucial element in the early evolution of low-mass stars. FUor and EXor eruptions of young stellar objects (YSOs) are caused by strongly enhanced accretion from the surrounding disk. FUors display optical outbursts of $\sim$ 4 mag or more that last for several decades, whereas EXors show smaller outbursts ($\Delta m \sim$ 2 - 3 mag) that last from a few months to a few years and may occur repeatedly. Therefore, FUor/EXor eruptions represent a rare but very important phenomenon in early stellar evolution, during which a young low-mass YSO brightens by up to several optical magnitudes. Hence, long-term observations of this class of eruptive variables are important for designing theoretical models of low-mass star formation. In this paper, we present recent results from our long-term monitoring observations of three rare types of eruptive young variables with the 2-m Himalayan {\it Chandra} Telescope (HCT) and the 2-m IUCAA Girawali Observatory (IGO) telescope.
Submitted 24 June, 2012;
originally announced June 2012.