Learning Hierarchical Structure with LSTMs
Date
2021
Department
Haverford College. Department of Computer Science
Type
Thesis
Language
eng
Access Restrictions
Open Access
Abstract
Recurrent Neural Networks (RNNs) are successful at modeling language because their internal memory lets them recognize patterns over inputs of arbitrary length. However, the information kept in that memory decays over time due to a problem known as vanishing gradients. Long Short-Term Memory (LSTM) units mitigate this problem with forget gates, which help reserve their memory for only the important data. LSTMs have thus become very popular in natural language processing (NLP) because they are able to model context, and compared to earlier models used in NLP, they excel at language modeling. However, some aspects of their success in the field have surprised researchers. Their apparent ability to model syntax suggests that they rely on learning mechanisms we do not yet fully understand. Research has been done on LSTMs and language syntax in an effort to further the field, yet an exhaustive account of how the inside of an LSTM works and what needs improving has yet to be compiled. Here, we hope to use previous research and some final experiments to provide a clear picture of how LSTMs model hierarchical syntax.
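The forget-gate mechanism mentioned in the abstract can be illustrated with a minimal sketch of a single LSTM cell step. The sketch below uses NumPy, and the parameter names (W, U, b) and the function lstm_step are hypothetical choices for illustration; it is not code from the thesis.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM time step.

    W (4H x D), U (4H x H), and b (4H,) stack the parameters for the
    forget (f), input (i), candidate (g), and output (o) gates,
    each of hidden size H.
    """
    H = h_prev.shape[0]
    z = W @ x_t + U @ h_prev + b      # all gate pre-activations at once
    f = sigmoid(z[0*H:1*H])           # forget gate: how much old memory to keep
    i = sigmoid(z[1*H:2*H])           # input gate: how much new content to write
    g = np.tanh(z[2*H:3*H])           # candidate memory content
    o = sigmoid(z[3*H:4*H])           # output gate: how much memory to expose
    c_t = f * c_prev + i * g          # cell state: gated mix of old and new memory
    h_t = o * np.tanh(c_t)            # hidden state passed to the next step
    return h_t, c_t
```

Because the cell state c_t is updated additively and the forget gate can stay close to 1 for information the model deems important, gradients along the cell state decay far more slowly than in a plain RNN, which is the property the abstract refers to.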