Sentiment Analysis of Egyptian Arabic in Social Media

dc.contributor.advisor: Kumar, Deepak
dc.contributor.advisor: Darwish, Manar
dc.contributor.author: Abdalkader, Mohamed
dc.date.accessioned: 2014-12-01T15:49:00Z
dc.date.available: 2014-12-01T15:49:00Z
dc.date.issued: 2014
dc.description.abstract: Sentiment analysis is an emerging area of application fueled by growing public participation in online social media. Much work has been done on sentiment analysis in English, while less has been done on other languages such as Mandarin and Arabic. Arabic is spoken by hundreds of millions of people in over twenty countries. Modern Standard Arabic (MSA) is used online mostly by newspapers and other official sources, whereas social media posts and blogs written by individuals are typically in Dialectal Arabic (DA). My senior thesis explores ways to increase the accuracy of automated sentiment analysis of Egyptian Arabic by exploiting features specific to Arabic. I found that the baseline algorithm's most common mistake is classifying tweets that carry a sentiment as neutral. Using Minimum Edit Distance (MED) and the ISRI Arabic stemmer, I was able to decrease the error of the baseline algorithm by 31% without adding any new entries to the lexicon. My approach overcomes not only the challenge of different morphological forms but also misspelling and informal writing. While I cannot empirically compare my results with those of other authors because I am using a different data set, my approach reaches an accuracy of 78%, an improvement of 14.7% over the baseline.
dc.description.sponsorship: Haverford College. Department of Computer Science
dc.identifier.uri: http://hdl.handle.net/10066/15080
dc.language.iso: eng
dc.rights.access: Haverford users only
dc.rights.uri: http://creativecommons.org/licenses/by-nc/3.0/us/
dc.subject.lcsh: Natural language processing (Computer science)
dc.subject.lcsh: Computational linguistics
dc.subject.lcsh: Public opinion -- Data processing
dc.subject.lcsh: Data mining
dc.subject.lcsh: Egyptian language -- Computer programs
dc.subject.lcsh: Egyptian language -- Data processing
dc.title: Sentiment Analysis of Egyptian Arabic in Social Media
dc.type: Thesis
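The abstract mentions using Minimum Edit Distance (MED) so that misspelled or informally written words can still be matched against an existing sentiment lexicon. The following is a minimal sketch of that general idea, not the thesis's actual implementation: the function names, the two-word toy lexicon, and the `max_distance` threshold are all assumptions made for illustration, and the stemming step (for example with NLTK's `ISRIStemmer`) is left out to keep the example self-contained.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance with unit cost for insert, delete, substitute."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def lookup_polarity(token: str, lexicon: dict, max_distance: int = 1) -> int:
    """Return the polarity of the closest lexicon entry, or 0 (neutral).

    max_distance is a hypothetical threshold, not a value from the thesis.
    """
    if token in lexicon:
        return lexicon[token]
    best = min(lexicon, key=lambda w: edit_distance(token, w), default=None)
    if best is not None and edit_distance(token, best) <= max_distance:
        return lexicon[best]
    return 0


# Toy lexicon (assumed for illustration): +1 positive, -1 negative.
LEXICON = {
    "جميل": 1,   # "beautiful"
    "وحش": -1,   # "bad" in Egyptian Arabic
}

# Informal spellings with a repeated letter still match at distance 1.
print(lookup_polarity("جمييل", LEXICON))
print(lookup_polarity("وحشش", LEXICON))
# An unrelated word stays neutral.
print(lookup_polarity("كتاب", LEXICON))
```

A fuzzy lookup of this kind is one way a classifier could stop labelling sentiment-bearing tweets as neutral without adding lexicon entries. A real system would likely need a length-dependent threshold, since a distance of 1 is a large change for a three-letter word.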
Files
Original bundle
2014AbdalkaderM_thesis.pdf (665.46 KB, Adobe Portable Document Format): Thesis
2014AbdalkaderM_release.pdf (90.11 KB, Adobe Portable Document Format): Release
License bundle
license.txt (1.71 KB): Item-specific license agreed upon to submission