Exploring the Role of Emojis in Tweets for Authorship Attribution

Date
2019
Department
Swarthmore College. Dept. of Linguistics
Language
en
Terms of Use
Full copyright to this work is retained by the student author. It may only be used for non-commercial, research, and educational purposes. All other uses are restricted.
Abstract
Authorship attribution research has long focused primarily on determining the authorship of books or other long texts (Mosteller and Wallace, 1963; Gamon, 2004). Only recently have scholars begun applying authorship attribution to short texts such as tweets (Eder, 2010; Schwartz et al., 2013; Mikros and Perifanos, 2013). Given the rise in emoji use, this research explores whether emojis are a useful linguistic feature for authorship attribution of tweets. Because no emoji-rich dataset existed at the time of this research, one was created. A Naive Bayes classifier served as the authorship attribution model. A baseline feature set of commonly used authorship attribution features was augmented with emoji-based features to perform authorship attribution of tweets. My results show that targeting emojis in the feature set yields a relative increase in accuracy of at least 30% (raising accuracy from 65% to 85%, a gain of 20 percentage points).
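The general approach described in the abstract, a Naive Bayes classifier whose baseline features are augmented with emoji features, can be illustrated with a minimal sketch. This is not the author's implementation: the word-token baseline, the rough emoji pattern, the `extract_features` helper, the `NaiveBayes` class, and the toy tweets are all assumptions made for illustration, and the thesis's actual feature set and dataset are not reproduced here.

```python
import math
import re
from collections import Counter, defaultdict

# Rough emoji matcher (assumption): covers common emoji blocks only,
# not the full Unicode emoji set.
EMOJI_RE = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")
WORD_RE = re.compile(r"[a-z']+")


def extract_features(tweet, use_emojis=True):
    """Return lowercase word tokens, plus emoji tokens when enabled."""
    features = ["w=" + w for w in WORD_RE.findall(tweet.lower())]
    if use_emojis:
        features += ["e=" + e for e in EMOJI_RE.findall(tweet)]
    return features


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, tweets, authors, use_emojis=True):
        self.use_emojis = use_emojis
        self.author_counts = Counter(authors)
        self.feature_counts = defaultdict(Counter)
        self.vocab = set()
        for tweet, author in zip(tweets, authors):
            for f in extract_features(tweet, use_emojis):
                self.feature_counts[author][f] += 1
                self.vocab.add(f)
        return self

    def predict(self, tweet):
        total = sum(self.author_counts.values())
        best_author, best_score = None, -math.inf
        for author in sorted(self.author_counts):
            counts = self.feature_counts[author]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus smoothed log likelihood of each known feature.
            score = math.log(self.author_counts[author] / total)
            for f in extract_features(tweet, self.use_emojis):
                if f in self.vocab:  # skip features never seen in training
                    score += math.log((counts[f] + 1) / denom)
            if score > best_score:
                best_author, best_score = author, score
        return best_author


# Toy data (hypothetical): both authors use the same words,
# so only the emoji distinguishes them.
train_tweets = [
    "great game tonight 🔥",
    "so tired today 🔥",
    "great game tonight 🎉",
    "so tired today 🎉",
]
train_authors = ["alice", "alice", "bob", "bob"]

model = NaiveBayes().fit(train_tweets, train_authors, use_emojis=True)
prediction = model.predict("great game 🎉")
```

In this toy setup the word features are identical across authors, so a model fitted with `use_emojis=False` has no basis for telling them apart, while the emoji-augmented model can. Real tweets would of course carry far more signal in the baseline features than this example suggests.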