Foreign accented speech transcription and accent recognition using a game-based approach

Swarthmore College. Dept. of Linguistics
Thesis (B.A.)
Terms of Use
Full copyright to this work is retained by the student author. It may only be used for non-commercial, research, and educational purposes. All other uses are restricted.
While significant improvements have been made in reducing the sentence error rate (SER) and word error rate (WER) of automatic speech recognition (ASR) technology, existing systems still face considerable difficulty parsing non-native speech. Two methods are common in adapting ASR systems to accommodate foreign accented speech. In the first, accent detection and identification are followed by an accent-specific acoustic model (Faria 2006, Chen et al. 2001) or dictionary (Fung and Kat 1999). Accents have also been classified by severity (Zheng et al. 2005, Bartkova and Jouvet 2007). The alternative is to use acoustic or phonetic models from both native and non-native speech (Bouselmi et al. 2006, Matsunaga et al. 2003). It has been shown that the use of accent-specific data improves recognition rate (Arslan and Hansen 1996, Humphries et al. 1996), but success rates vary among languages. In either case, specific information needs to be obtained regarding particular accents, and the process of adapting existing corpora to train language models is both time-consuming and tedious, limiting advances in the field. We introduce the Foreign Accented Speech Transcription Game (FASTGame) as a way to transform the transcription process into a more enjoyable format. The FASTGame is a 'game with a purpose' designed to obtain normalized orthographic transcriptions of foreign accented speech from native listeners. The FASTGame is accessible online through the social networking website Facebook and contains two tasks. The first asks the player to determine, as rapidly as possible, the native language of a foreign accented speaker of English from four available options. Players are incentivized by scores that reflect how well they perform; for this task, scores are based on accuracy and speed. In addition to supporting an examination of the specific cues that trigger accent recognition, the data allow analysis of how users respond to novel accents.
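As an illustration of how a score might combine accuracy and speed in the first task, the following is a minimal sketch only. The abstract does not give the FASTGame's actual scoring formula, so the function name `accent_task_score`, the linear time decay, the 10-second limit, and the point values are all assumptions made for the example.

```python
def accent_task_score(correct: bool, response_seconds: float,
                      max_points: int = 100, time_limit: float = 10.0) -> int:
    """Hypothetical score for the accent-identification task.

    Assumed rule (not taken from the thesis): a wrong answer earns nothing;
    a correct answer earns half of max_points, plus up to the other half
    depending on how much of the time limit is left.
    """
    if response_seconds < 0:
        raise ValueError("response_seconds must be non-negative")
    if not correct:
        return 0
    # Fraction of the time limit still remaining, clamped to [0, 1].
    remaining = max(0.0, time_limit - response_seconds) / time_limit
    return round(max_points * (0.5 + 0.5 * remaining))


if __name__ == "__main__":
    # A correct answer given halfway through the assumed time limit.
    print(accent_task_score(True, 5.0))
```

Under these assumptions, a correct answer always outscores an incorrect one, and faster correct answers outscore slower ones, which is the only property the abstract states.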
The second task asks the player to transcribe a phrase spoken by a foreign accented speaker of English. Players' scores are calculated based on agreement with other users; if no transcriptions have yet been written for a recording, scores are assigned randomly. All transcriptions for a particular recording are then aggregated, and the correct transcription is generated based on agreement among multiple players. Existing continuous speech recognition software fails to produce accurate transcriptions for such recordings, which also vary in audio quality and accent severity. Time-aligning the transcriptions collected through this game yields valuable training data that can be used to improve language models for accented speech. In both tasks of the game, steps are taken to avoid repeated plays and undesirable data conditioning. The FASTGame was created as an alternative to existing methods for obtaining transcriptions, and its primary merit is in supplementing large speech corpora with additional data in a relatively inexpensive and effortless manner.
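The agreement-based aggregation described for the second task could be sketched roughly as follows. This is not the FASTGame's implementation: the normalization rules, the exact-match majority vote, the `min_agreement` threshold, and all function names are assumptions chosen to keep the example small.

```python
import re
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, replace punctuation (except apostrophes) with spaces,
    and collapse runs of whitespace. These rules are an assumption."""
    text = re.sub(r"[^\w\s']", " ", text.lower())
    return " ".join(text.split())


def aggregate(transcriptions, min_agreement=2):
    """Return the most frequent normalized transcription, or None when
    fewer than min_agreement players produced the same text."""
    counts = Counter(normalize(t) for t in transcriptions if t.strip())
    if not counts:
        return None
    best, votes = counts.most_common(1)[0]
    return best if votes >= min_agreement else None


def agreement_score(candidate, previous, max_points=100):
    """Hypothetical score: share of earlier transcriptions that match the
    candidate after normalization. Returns None when there is nothing to
    compare against (the case the abstract scores randomly)."""
    if not previous:
        return None
    target = normalize(candidate)
    matches = sum(1 for p in previous if normalize(p) == target)
    return round(max_points * matches / len(previous))


if __name__ == "__main__":
    plays = ["The cat sat.", "the cat sat", "The cat sat!", "a cat sat"]
    print(aggregate(plays))
    print(agreement_score("The cat sat", plays))
```

Exact matching after normalization is the simplest possible agreement rule; a word-level alignment across transcriptions would tolerate partial disagreement, at the cost of a longer example.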