Showing 1–2 of 2 results for author: JM, A

  1. arXiv:2601.06747  [pdf, ps, other]

    cs.AI

    FinForge: Semi-Synthetic Financial Benchmark Generation

    Authors: Glenn Matlin, Akhil Theerthala, Anant Gupta, Anirudh JM, Rayan Castilla, Yi Mei Ng, Sudheer Chava

    Abstract: Evaluating Language Models (LMs) in specialized, high-stakes domains such as finance remains a significant challenge due to the scarcity of open, high-quality, and domain-specific datasets. Existing general-purpose benchmarks provide broad coverage but lack the depth and domain fidelity needed to assess LMs' capabilities for real-world financial reasoning, which requires both conceptual understand…

    Submitted 19 January, 2026; v1 submitted 10 January, 2026; originally announced January 2026.

  2. arXiv:2512.08965  [pdf]

    cs.LG cs.AI cs.CL

    Financial Instruction Following Evaluation (FIFE)

    Authors: Glenn Matlin, Siddharth, Anirudh JM, Aditya Shukla, Yahya Hassan, Sudheer Chava

    Abstract: Language Models (LMs) struggle with complex, interdependent instructions, particularly in high-stakes domains like finance where precision is critical. We introduce FIFE, a novel, high-difficulty benchmark designed to assess LM instruction-following capabilities for financial analysis tasks. FIFE comprises 88 human-authored prompts and employs a verification system with chainable, verifiable const…

    Submitted 30 November, 2025; originally announced December 2025.

    Comments: Accepted at NeurIPS 2025 Generative AI in Finance Workshop (GenAI Finance), San Diego. Camera-ready version. Code and data: https://github.com/gtfintechlab/FIFE/