Skip to main content

Showing 1–8 of 8 results for author: Rohrer, D

Searching in archive cs. Search in all archives.
.
  1. arXiv:2604.12374  [pdf, ps, other

    cs.LG cs.AI cs.CL

    Nemotron 3 Super: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning

    Authors: NVIDIA, :, Aakshita Chandiramani, Aaron Blakeman, Abdullahi Olaoye, Abhibha Gupta, Abhilash Somasamudramath, Abhinav Khattar, Adeola Adesoba, Adi Renduchintala, Adil Asif, Aditya Agrawal, Aditya Vavre, Ahmad Kiswani, Aishwarya Padmakumar, Ajay Hotchandani, Akanksha Shukla, Akhiad Bercovich, Aleksander Ficek, Aleksandr Shaposhnikov, Alex Gronskiy, Alex Kondratenko, Alex Neefus, Alex Steiner, Alex Yang , et al. (522 additional authors not shown)

    Abstract: We describe the pre-training, post-training, and quantization of Nemotron 3 Super, a 120 billion (active 12 billion) parameter hybrid Mamba-Attention Mixture-of-Experts model. Nemotron 3 Super is the first model in the Nemotron 3 family to 1) be pre-trained in NVFP4, 2) leverage LatentMoE, a new Mixture-of-Experts architecture that optimizes for both accuracy per FLOP and accuracy per parameter, a… ▽ More

    Submitted 14 April, 2026; originally announced April 2026.

  2. arXiv:2512.20856  [pdf, ps, other

    cs.CL cs.AI cs.LG

    NVIDIA Nemotron 3: Efficient and Open Intelligence

    Authors: NVIDIA, :, Aaron Blakeman, Aaron Grattafiori, Aarti Basant, Abhibha Gupta, Abhinav Khattar, Adi Renduchintala, Aditya Vavre, Akanksha Shukla, Akhiad Bercovich, Aleksander Ficek, Aleksandr Shaposhnikov, Alex Kondratenko, Alexander Bukharin, Alexandre Milesi, Ali Taghibakhshi, Alisa Liu, Amelia Barton, Ameya Sunil Mahabaleshwarkar, Amir Klein, Amit Zuker, Amnon Geifman, Amy Shen, Anahita Bhiwandiwalla , et al. (334 additional authors not shown)

    Abstract: We introduce the Nemotron 3 family of models - Nano, Super, and Ultra. These models deliver strong agentic, reasoning, and conversational capabilities. The Nemotron 3 family uses a Mixture-of-Experts hybrid Mamba-Transformer architecture to provide best-in-class throughput and context lengths of up to 1M tokens. Super and Ultra models are trained with NVFP4 and incorporate LatentMoE, a novel appro… ▽ More

    Submitted 23 December, 2025; originally announced December 2025.

  3. arXiv:2512.20848  [pdf, ps, other

    cs.CL cs.AI cs.LG

    Nemotron 3 Nano: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning

    Authors: NVIDIA, :, Aaron Blakeman, Aaron Grattafiori, Aarti Basant, Abhibha Gupta, Abhinav Khattar, Adi Renduchintala, Aditya Vavre, Akanksha Shukla, Akhiad Bercovich, Aleksander Ficek, Aleksandr Shaposhnikov, Alex Kondratenko, Alexander Bukharin, Alexandre Milesi, Ali Taghibakhshi, Alisa Liu, Amelia Barton, Ameya Sunil Mahabaleshwarkar, Amir Klein, Amit Zuker, Amnon Geifman, Amy Shen, Anahita Bhiwandiwalla , et al. (289 additional authors not shown)

    Abstract: We present Nemotron 3 Nano 30B-A3B, a Mixture-of-Experts hybrid Mamba-Transformer language model. Nemotron 3 Nano was pretrained on 25 trillion text tokens, including more than 3 trillion new unique tokens over Nemotron 2, followed by supervised fine tuning and large-scale RL on diverse environments. Nemotron 3 Nano achieves better accuracy than our previous generation Nemotron 2 Nano while activa… ▽ More

    Submitted 23 December, 2025; originally announced December 2025.

  4. arXiv:2511.21990  [pdf, ps, other

    cs.LG cs.AI cs.CR

    A Safety and Security Framework for Real-World Agentic Systems

    Authors: Shaona Ghosh, Barnaby Simkin, Kyriacos Shiarlis, Soumili Nandi, Dan Zhao, Matthew Fiedler, Julia Bazinska, Nikki Pope, Roopa Prabhu, Daniel Rohrer, Michael Demoret, Bartley Richardson

    Abstract: This paper introduces a dynamic and actionable framework for securing agentic AI systems in enterprise deployment. We contend that safety and security are not merely fixed attributes of individual models but also emergent properties arising from the dynamic interactions among models, orchestrators, tools, and data within their operating environments. We propose a new way of identification of novel… ▽ More

    Submitted 26 November, 2025; originally announced November 2025.

  5. arXiv:2508.14444  [pdf, ps, other

    cs.CL cs.AI cs.LG

    NVIDIA Nemotron Nano 2: An Accurate and Efficient Hybrid Mamba-Transformer Reasoning Model

    Authors: NVIDIA, :, Aarti Basant, Abhijit Khairnar, Abhijit Paithankar, Abhinav Khattar, Adithya Renduchintala, Aditya Malte, Akhiad Bercovich, Akshay Hazare, Alejandra Rico, Aleksander Ficek, Alex Kondratenko, Alex Shaposhnikov, Alexander Bukharin, Ali Taghibakhshi, Amelia Barton, Ameya Sunil Mahabaleshwarkar, Amy Shen, Andrew Tao, Ann Guan, Anna Shors, Anubhav Mandarwal, Arham Mehta, Arun Venkatesan , et al. (192 additional authors not shown)

    Abstract: We introduce Nemotron-Nano-9B-v2, a hybrid Mamba-Transformer language model designed to increase throughput for reasoning workloads while achieving state-of-the-art accuracy compared to similarly-sized models. Nemotron-Nano-9B-v2 builds on the Nemotron-H architecture, in which the majority of the self-attention layers in the common Transformer architecture are replaced with Mamba-2 layers, to achi… ▽ More

    Submitted 2 September, 2025; v1 submitted 20 August, 2025; originally announced August 2025.

  6. arXiv:2410.07890  [pdf, other

    stat.ML cs.LG

    Identifying latent disease factors differently expressed in patient subgroups using group factor analysis

    Authors: Fabio S. Ferreira, John Ashburner, Arabella Bouzigues, Chatrin Suksasilp, Lucy L. Russell, Phoebe H. Foster, Eve Ferry-Bolder, John C. van Swieten, Lize C. Jiskoot, Harro Seelaar, Raquel Sanchez-Valle, Robert Laforce, Caroline Graff, Daniela Galimberti, Rik Vandenberghe, Alexandre de Mendonca, Pietro Tiraboschi, Isabel Santana, Alexander Gerhard, Johannes Levin, Sandro Sorbi, Markus Otto, Florence Pasquier, Simon Ducharme, Chris R. Butler , et al. (11 additional authors not shown)

    Abstract: In this study, we propose a novel approach to uncover subgroup-specific and subgroup-common latent factors addressing the challenges posed by the heterogeneity of neurological and mental disorders, which hinder disease understanding, treatment development, and outcome prediction. The proposed approach, sparse Group Factor Analysis (GFA) with regularised horseshoe priors, was implemented with proba… ▽ More

    Submitted 10 October, 2024; originally announced October 2024.

    Comments: 38 pages, 14 figures

  7. arXiv:2305.14553  [pdf

    cs.CR cs.AI cs.CY

    Adversarial Machine Learning and Cybersecurity: Risks, Challenges, and Legal Implications

    Authors: Micah Musser, Andrew Lohn, James X. Dempsey, Jonathan Spring, Ram Shankar Siva Kumar, Brenda Leong, Christina Liaghati, Cindy Martinez, Crystal D. Grant, Daniel Rohrer, Heather Frase, Jonathan Elliott, John Bansemer, Mikel Rodriguez, Mitt Regan, Rumman Chowdhury, Stefan Hermanek

    Abstract: In July 2022, the Center for Security and Emerging Technology (CSET) at Georgetown University and the Program on Geopolitics, Technology, and Governance at the Stanford Cyber Policy Center convened a workshop of experts to examine the relationship between vulnerabilities in artificial intelligence systems and more traditional types of software vulnerabilities. Topics discussed included the extent… ▽ More

    Submitted 23 May, 2023; originally announced May 2023.

  8. arXiv:2305.03413  [pdf, other

    eess.IV cs.CV cs.LG q-bio.QM

    Domain-agnostic segmentation of thalamic nuclei from joint structural and diffusion MRI

    Authors: Henry F. J. Tregidgo, Sonja Soskic, Mark D. Olchanyi, Juri Althonayan, Benjamin Billot, Chiara Maffei, Polina Golland, Anastasia Yendiki, Daniel C. Alexander, Martina Bocchetta, Jonathan D. Rohrer, Juan Eugenio Iglesias

    Abstract: The human thalamus is a highly connected subcortical grey-matter structure within the brain. It comprises dozens of nuclei with different function and connectivity, which are affected differently by disease. For this reason, there is growing interest in studying the thalamic nuclei in vivo with MRI. Tools are available to segment the thalamus from 1 mm T1 scans, but the contrast of the lateral and… ▽ More

    Submitted 5 May, 2023; originally announced May 2023.

    Comments: Under review