User profiles for Matthew R. Gormley
Matthew R. Gormley, Carnegie Mellon University. Verified email at cs.cmu.edu. Cited by 2332.
Annotated Gigaword
C Napoles, MR Gormley… - Proceedings of the joint …, 2012 - aclanthology.org
We have created layers of annotation on the English Gigaword v. 5 corpus to render it
useful as a standardized corpus for knowledge extraction and distributional semantics. Most …
In-context learning with long-context models: An in-depth exploration
As model context lengths continue to increase, the number of demonstrations that can be
provided in-context approaches the size of entire training datasets. We study the behavior of in-…
Improved relation extraction with feature-rich compositional embedding models
Compositional embedding models build a representation (or embedding) for a linguistic
structure based on its component word embeddings. We propose a Feature-rich Compositional …
Bilingual lexicon induction with semi-supervision in non-isometric embedding spaces
Recent work on bilingual lexicon induction (BLI) has frequently depended either on aligned
bilingual lexicons or on distribution matching, often with an assumption about the isometry of …
Limitations of autoregressive models and their alternatives
Standard autoregressive language models perform only polynomial-time computation to
compute the probability of the next symbol. While this is attractive, it means they cannot model …
Effective convolutional attention network for multi-label clinical document classification
Multi-label document classification (MLDC) problems can be challenging, especially for long
documents with a large label set and a long-tail distribution over labels. In this paper, we …
Leveraging pretrained models for automatic summarization of doctor-patient conversations
Fine-tuning pretrained models for automatically summarizing doctor-patient conversation
transcripts presents many challenges: limited training data, significant domain shift, long and …
It's MBR all the way down: Modern generation techniques through the lens of minimum Bayes risk
…, A Xie, G Neubig, MR Gormley - Proceedings of the Big …, 2023 - aclanthology.org
Minimum Bayes Risk (MBR) decoding is a method for choosing the outputs of a machine
learning system based not on the output with the highest probability, but the output with the …
SummQA at MEDIQA-Chat 2023: In-context learning with GPT-4 for medical summarization
…, M Palavalli, A Bertsch, MR Gormley - Proceedings of the …, 2023 - aclanthology.org
Medical dialogue summarization is challenging due to the unstructured nature of medical
conversations, the use of medical terminology in gold summaries, and the need to identify key …
MDACE: MIMIC documents annotated with code evidence
…, R Klopfer, E Lu, B Striner, MR Gormley - Proceedings of the …, 2023 - aclanthology.org
We introduce a dataset for evidence/rationale extraction on an extreme multi-label classification
task over long medical documents. One such task is Computer-Assisted Coding (CAC) …