Topic Modeling, Named-Entity Recognition, and Network Analysis of Literary Corpora

Date
2022
Department
Haverford College. Department of Computer Science
Type
Thesis
Language
eng
Access Restrictions
Open Access
Abstract
In this thesis, we review the literature on applying two natural language processing techniques, topic modeling and named-entity recognition (character identification), to collections of literary fiction. These techniques allow us to efficiently identify the dominant themes in a text and to locate named entities in relation to those themes. The same process can be extended to an entire corpus to gauge the presence of themes across multiple works. We also investigate how these data can be represented as networks, which give researchers human-readable maps of themes and entities across the corpus.