Showing 1–2 of 2 results for author: Morris, V

Search v0.5.6 released 2020-02-24

arXiv:2309.08593

cs.LG

Attention-Only Transformers and Implementing MLPs with Attention Heads

Authors: Robert Huben, Valerie Morris

Abstract: The transformer architecture is widely used in machine learning models and consists of two alternating sublayers: attention heads and MLPs. We prove that an MLP neuron can be implemented by a masked attention head with internal dimension 1 so long as the MLP's activation function comes from a restricted class including SiLU and close approximations of ReLU and GeLU. This allows one to convert an MLP-and-attention transformer into an attention-only transformer at the cost of greatly increasing the number of attention heads. We also prove that attention heads can perform the components of an MLP (linear transformations and activation functions) separately. Finally, we prove that attention heads can encode arbitrary masking patterns in their weight matrices to within arbitrarily small error.

Submitted 15 September, 2023; originally announced September 2023.

Comments: 11 pages

arXiv:1811.01272

physics.data-an cs.IT physics.ao-ph

doi 10.3390/e20120931

Anomaly Detection in Paleoclimate Records using Permutation Entropy

Authors: Joshua Garland, Tyler R. Jones, Michael Neuder, Valerie Morris, James W. C. White, Elizabeth Bradley

Abstract: Permutation entropy techniques can be useful in identifying anomalies in paleoclimate data records, including noise, outliers, and post-processing issues. We demonstrate this using weighted and unweighted permutation entropy of water-isotope records in a deep polar ice core. In one region of these isotope records, our previous calculations revealed an abrupt change in the complexity of the traces: specifically, in the amount of new information that appeared at every time step. We conjectured that this effect was due to noise introduced by an older laboratory instrument. In this paper, we validate that conjecture by re-analyzing a section of the ice core using a more-advanced version of the laboratory instrument. The anomalous noise levels are absent from the permutation entropy traces of the new data. In other sections of the core, we show that permutation entropy techniques can be used to identify anomalies in the raw data that are not associated with climatic or glaciological processes, but rather effects occurring during field work, laboratory analysis, or data post-processing. These examples make it clear that permutation entropy is a useful forensic tool for identifying sections of data that require targeted re-analysis, and can even be useful in guiding that analysis.

Submitted 29 November, 2018; v1 submitted 3 November, 2018; originally announced November 2018.

Comments: 15 pages, 7 figures

Journal ref: Entropy 2018, 20(12), 931
