Showing 1–5 of 5 results for author: Shmatova, M

Searching in archive cs.
  1. arXiv:2512.14576

    cs.CL cs.AI

    Low-Resource, High-Impact: Building Corpora for Inclusive Language Technologies

    Authors: Ekaterina Artemova, Laurie Burchell, Daryna Dementieva, Shu Okabe, Mariya Shmatova, Pedro Ortiz Suarez

    Abstract: This tutorial (https://tum-nlp.github.io/low-resource-tutorial) is designed for NLP practitioners, researchers, and developers working with multilingual and low-resource languages who seek to create more equitable and socially impactful language technologies. Participants will walk away with a practical toolkit for building end-to-end NLP pipelines for underrepresented languages -- from data colle…

    Submitted 16 December, 2025; originally announced December 2025.

    Comments: Tutorial accepted to LREC 2026

  2. arXiv:2508.14909

    cs.CL

    Preliminary Ranking of WMT25 General Machine Translation Systems

    Authors: Tom Kocmi, Eleftherios Avramidis, Rachel Bawden, Ondřej Bojar, Konstantin Dranch, Anton Dvorkovich, Sergey Dukanov, Natalia Fedorova, Mark Fishel, Markus Freitag, Thamme Gowda, Roman Grundkiewicz, Barry Haddow, Marzena Karpinska, Philipp Koehn, Howard Lakougna, Jessica Lundin, Kenton Murray, Masaaki Nagata, Stefano Perrella, Lorenzo Proietti, Martin Popel, Maja Popović, Parker Riley, Mariya Shmatova, et al. (3 additional authors not shown)

    Abstract: We present the preliminary rankings of machine translation (MT) systems submitted to the WMT25 General Machine Translation Shared Task, as determined by automatic evaluation metrics. Because these rankings are derived from automatic evaluation, they may exhibit a bias toward systems that employ re-ranking techniques, such as Quality Estimation or Minimum Bayes Risk decoding. The official WMT25 ran…

    Submitted 24 August, 2025; v1 submitted 11 August, 2025; originally announced August 2025.

  3. arXiv:2407.19884

    cs.CL

    Preliminary WMT24 Ranking of General MT Systems and LLMs

    Authors: Tom Kocmi, Eleftherios Avramidis, Rachel Bawden, Ondrej Bojar, Anton Dvorkovich, Christian Federmann, Mark Fishel, Markus Freitag, Thamme Gowda, Roman Grundkiewicz, Barry Haddow, Marzena Karpinska, Philipp Koehn, Benjamin Marie, Kenton Murray, Masaaki Nagata, Martin Popel, Maja Popovic, Mariya Shmatova, Steinþór Steingrímsson, Vilém Zouhar

    Abstract: This is the preliminary ranking of WMT24 General MT systems based on automatic metrics. The official ranking will be a human evaluation, which is superior to the automatic ranking and supersedes it. The purpose of this report is not to interpret any findings but only provide preliminary results to the participants of the General MT task that may be useful during the writing of the system submissio…

    Submitted 29 July, 2024; originally announced July 2024.

  4. arXiv:2406.11580

    cs.CL

    Error Span Annotation: A Balanced Approach for Human Evaluation of Machine Translation

    Authors: Tom Kocmi, Vilém Zouhar, Eleftherios Avramidis, Roman Grundkiewicz, Marzena Karpinska, Maja Popović, Mrinmaya Sachan, Mariya Shmatova

    Abstract: High-quality Machine Translation (MT) evaluation relies heavily on human judgments. Comprehensive error classification methods, such as Multidimensional Quality Metrics (MQM), are expensive as they are time-consuming and can only be done by experts, whose availability may be limited especially for low-resource languages. On the other hand, just assigning overall scores, like Direct Assessment (DA)…

    Submitted 18 October, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

  5. arXiv:2107.07455

    cs.LG cs.AI stat.ML

    Shifts: A Dataset of Real Distributional Shift Across Multiple Large-Scale Tasks

    Authors: Andrey Malinin, Neil Band, Alexander Ganshin, German Chesnokov, Yarin Gal, Mark J. F. Gales, Alexey Noskov, Andrey Ploskonosov, Liudmila Prokhorenkova, Ivan Provilkov, Vatsal Raina, Vyas Raina, Denis Roginskiy, Mariya Shmatova, Panos Tigas, Boris Yangel

    Abstract: There has been significant research done on developing methods for improving robustness to distributional shift and uncertainty estimation. In contrast, only limited work has examined developing standard datasets and benchmarks for assessing these approaches. Additionally, most work on uncertainty estimation and robustness has developed new techniques based on small-scale regression or image class…

    Submitted 11 February, 2022; v1 submitted 15 July, 2021; originally announced July 2021.