Learning Hierarchical Structure with LSTMs

dc.contributor.advisor Grissom, Alvin
dc.contributor.author Paris, Tomas
dc.date.accessioned 2021-07-12T11:57:42Z
dc.date.available 2021-07-12T11:57:42Z
dc.date.issued 2021
dc.identifier.uri http://hdl.handle.net/10066/23546
dc.description.abstract Recurrent Neural Networks (RNNs) are successful at modeling language because of their ability to recognize patterns over inputs of arbitrary length using their internal memory. However, the information kept in that memory decays over time due to the vanishing gradient problem. Long Short-Term Memory (LSTM) units mitigate this problem with forget gates, which help reserve memory for only the important data. The model has thus become very popular in natural language processing (NLP) because of its ability to capture context, and compared to earlier models used in NLP, LSTMs excel at language modeling. However, some aspects of their success in the field have surprised researchers. Their apparent ability to model syntax suggests that they use learning mechanisms we do not yet fully understand. Research has been done on LSTMs and language syntax in an effort to further the field, yet an exhaustive account of how the inside of an LSTM works and what needs improving has yet to be compiled. Here, we draw on previous research and some final experiments to provide a clear picture of how LSTMs model hierarchical syntax.
dc.description.sponsorship Haverford College. Department of Computer Science
dc.language.iso eng
dc.rights.uri http://creativecommons.org/licenses/by-nc/4.0/
dc.title Learning Hierarchical Structure with LSTMs
dc.type Thesis
dc.rights.access Open Access
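
The forget-gate mechanism mentioned in the abstract can be illustrated with a minimal sketch of a single LSTM step. The NumPy code below is an illustrative assumption, not code from the thesis; the weight layout, names, and dimensions are hypothetical, but the gated, additive cell-state update it computes is the standard LSTM formulation that helps gradients persist over long inputs.

# Minimal sketch of one LSTM time step in NumPy (illustrative only; not from the thesis).
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM step. W, U, b stack the parameters for the input (i),
    forget (f), output (o) gates and the candidate cell state (g), in that order."""
    hidden = h_prev.shape[0]
    z = W @ x_t + U @ h_prev + b              # all gate pre-activations at once
    i = sigmoid(z[0 * hidden:1 * hidden])     # input gate: how much new information to write
    f = sigmoid(z[1 * hidden:2 * hidden])     # forget gate: how much old memory to keep
    o = sigmoid(z[2 * hidden:3 * hidden])     # output gate: how much memory to expose
    g = np.tanh(z[3 * hidden:4 * hidden])     # candidate values for the cell state
    c_t = f * c_prev + i * g                  # additive update mitigates vanishing gradients
    h_t = o * np.tanh(c_t)
    return h_t, c_t

# Tiny usage example with random parameters (hypothetical dimensions).
rng = np.random.default_rng(0)
d_in, d_hid = 8, 16
W = rng.normal(scale=0.1, size=(4 * d_hid, d_in))
U = rng.normal(scale=0.1, size=(4 * d_hid, d_hid))
b = np.zeros(4 * d_hid)
h, c = np.zeros(d_hid), np.zeros(d_hid)
for x in rng.normal(size=(5, d_in)):          # a short input sequence
    h, c = lstm_step(x, h, c, W, U, b)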

