Exploring the Role of Emojis in Tweets for Authorship Attribution
Date
2019
Department
Swarthmore College. Dept. of Linguistics
Language
en
Terms of Use
Full copyright to this work is retained by the student author. It may only be used for non-commercial, research, and educational purposes. All other uses are restricted.
Abstract
Authorship attribution research has long focused primarily on determining the authorship
of books or other long texts (Mosteller and Wallace, 1963; Gamon, 2004). Only
recently have scholars turned to authorship attribution of short texts such as tweets
(Eder, 2010; Schwartz et al., 2013; Mikros and Perifanos, 2013). Motivated by the rise
of emoji use, this research explores whether emojis are a useful linguistic feature for
authorship attribution of tweets. An emoji-rich dataset was created because none existed
at the time of this research. A Naive Bayes classifier was used as the authorship
attribution model. A baseline feature set consisting of commonly used authorship
attribution features was augmented with emoji-rich features to perform authorship
attribution of tweets. My results show that targeting emojis in the feature set yields a
relative increase in accuracy of at least 30% (raising the accuracy from 65% to 85%,
a gain of 20 percentage points).
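
The approach described in the abstract can be illustrated with a minimal sketch. This is not the author's implementation: the tokenizer, the emoji pattern, the `EMOJI_` feature prefix, the add-one smoothing, and the toy tweets are all assumptions made for illustration. It shows a multinomial Naive Bayes classifier over word tokens, optionally augmented with one token per emoji.

```python
import math
import re
from collections import Counter, defaultdict

# Rough emoji matcher over common Unicode blocks (an assumption; it does not
# handle ZWJ sequences, skin-tone modifiers, or every emoji code point).
EMOJI_RE = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")


def extract_features(tweet, use_emojis=True):
    """Return lowercase word tokens, plus one EMOJI_x token per emoji if enabled."""
    text = EMOJI_RE.sub(" ", tweet).lower()
    features = re.findall(r"[a-z0-9']+", text)
    if use_emojis:
        features += ["EMOJI_" + e for e in EMOJI_RE.findall(tweet)]
    return features


class NaiveBayesAttributor:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, tweets, authors, use_emojis=True):
        self.use_emojis = use_emojis
        self.author_counts = Counter(authors)        # tweets per author
        self.feature_counts = defaultdict(Counter)   # feature counts per author
        self.vocab = set()
        for tweet, author in zip(tweets, authors):
            feats = extract_features(tweet, use_emojis)
            self.feature_counts[author].update(feats)
            self.vocab.update(feats)
        return self

    def predict(self, tweet):
        feats = extract_features(tweet, self.use_emojis)
        total = sum(self.author_counts.values())
        best_author, best_score = None, -math.inf
        for author, n in self.author_counts.items():
            counts = self.feature_counts[author]
            denom = sum(counts.values()) + len(self.vocab)
            # log prior + sum of smoothed log likelihoods
            score = math.log(n / total)
            for f in feats:
                score += math.log((counts[f] + 1) / denom)
            if score > best_score:
                best_author, best_score = author, score
        return best_author


# Toy data (hypothetical): both authors use the same words but different emojis.
tweets = ["good morning 🌞", "coffee time 🌞", "good morning 🎸", "coffee time 🎸"]
authors = ["alice", "alice", "bob", "bob"]

model = NaiveBayesAttributor().fit(tweets, authors, use_emojis=True)
prediction = model.predict("lunch 🎸")
print(prediction)
```

In this toy setup the word tokens carry no information about the author, so the emoji tokens alone decide the prediction. Real tweets would need a larger baseline feature set and a held-out evaluation to measure any accuracy change.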