Browsing by Author "Chandlee, Jane"
- Building a Linguistics based Loss Function for Dialogue Generation (2020) St. Clair, Jack; Chandlee, Jane
  This paper will investigate different loss functions used for various natural language processing (NLP) machine learning tasks. These loss functions have proven their worth in the area of machine translation, but they have been shown to be inadequate for the task of dialogue generation. Thus, this paper proposes some potential additions to these loss functions that add more linguistic information, with the goal of improving dialogue generation and bringing machine learning algorithms closer to creating human-like dialogue.
- Computational Modeling of Musical Enculturation: An Investigation of Multicultural Music Learning Using Self-Organizing Maps (2016) Pazdera, Jesse; Chandlee, Jane; Boltz, Marilyn
  Past research has shown that children are able to implicitly learn the underlying melodic structure of their native culture's musical system, even without formal musical training. Although implicit musical learning has been well studied, little is known about non-Western and multicultural musical enculturation. The present study addressed these issues through three experiments using self-organizing maps (SOMs), a type of neural model, to simulate implicit musical learning. Experiment 1 used SOMs to simulate Western, Chinese, and Hindustani musical enculturation, each learned independently from one another. Experiment 2 simulated a child growing up in a multicultural context, to investigate whether they might learn the structure of multiple native systems. Experiment 3 simulated an adult encountering an unfamiliar culture, to examine whether adults – not only children – may implicitly acquire the syntax of new musical systems. Results generally supported the plausibility of successful multicultural learning, with the caveat that certain systems disrupted the learning of others. Our findings led to further discussion of cross-cultural similarities between musical systems and the implications of these connections.
- Defining Language in the Wake of Primate Language Research (2019) Meyer, Hanna L.; Chandlee, Jane
  This text examines language as it is used in the animal language debate through pragmatic and structural linguistic perspectives on primate language research. The surge of primate language studies in America in the 1970s generated a wave of public and academic interest in animal language that continues today in the form of ongoing primate research both in the lab and in the wild. These studies have forced the field to examine the way it conceptualizes language, as well as the criteria we currently use to define it. I argue that the traditional linguistic approach to defining language, which measures language through surface-level features, as can be seen in Hockett's design features, cannot fully describe language, and instead must include a more pragmatic perspective in order to more accurately measure primate language. Finally, I argue that the term language, as it has historically been used to describe only human language, is useless, as its exclusivity ignores the gradient of the complexity of higher mental faculties across the evolutionary tree.
- Distilling Knowledge from Wikipedia for Augmented Speech Recognition (2019) Warner, Tai Vongsathorn; Chandlee, Jane
- Evaluating the Existence and Nature of the Critical Period Hypothesis in Second Language Acquisition (2022) Jayasankar, John M.; Chandlee, Jane
  This paper seeks to investigate the existence and nature of the Critical Period Hypothesis (CPH) in Second Language Acquisition (L2A). I conduct an extensive literature review of many studies spanning five decades and many domains of research. I advocate for multiple critical periods (CPs) for various aspects of language acquisition (morphology, syntax, phonology, phonotactics, grammar, semantics, pragmatics), each with their own unique discontinuity between ultimate attainment (UA) and age of acquisition (AoA). I expose gaps and highlight sources of debate within current literature, such as the validity of UA as a yardstick for evaluating L2A proficiency, problematic statistical methodology for modeling the discontinuities in the AoA-UA function, language acquisition transfer interference from first language acquisition into L2A, and individualistic traits such as language aptitude and motivation. I examine methodological differences in existing literature, with a particular focus on incorrect assumptions and statistical techniques that lead to false conclusions being drawn about the shape of the AoA-UA function in testing for the CPH. Ultimately, I advocate for the re-analysis of past studies using different methodological techniques to generate new AoA-UA function graphs to discern whether there are real discontinuities or not. I hypothesize that correct and repeatable statistical modeling and proper experimental design will facilitate the discovery of multiple CPs that occur in a robust sequential order, with unique onsets, offsets, and discontinuities to each CP. I also hypothesize that individuals with common L1s and interlanguage systems share unique, predictable CP onsets and offsets that are robust within the group.
  This paper adds to the existing literature by first presenting an updated in-depth analysis of the current literature and then discussing how statistical errors in the existing literature may be contributing to the lack of robust evidence for multiple CPs in L2A.
- Exploring the Role of Emojis in Tweets for Authorship Attribution (2019) Ellison, Kennedy; Chandlee, Jane; Kumar, Deepak
  Authorship attribution research has long focused primarily on determining authorship of books or other large texts (Mosteller and L. Wallace, 1963; Gamon, 2004). Only recently have scholars turned to using authorship attribution on short texts or tweets (Eder, 2010; Schwartz et al., 2013; Mikros and Perifanos, 2013). Given the rise of emoji use, this research explores whether emojis are a useful linguistic feature for authorship attribution of tweets. An emoji-rich dataset was created, since none existed at the time of this research. A Naive Bayes classifier was used as the authorship attribution model. The baseline feature set, consisting of commonly used authorship attribution features, was augmented with emoji-rich features to perform authorship attribution of tweets. My results show that targeting emojis in the feature set yields a relative increase in accuracy of at least 30% (raising the accuracy from 65% to 85%).
- The fluidity of foreign language instruction; an intersection of personal teaching pedagogy and proposed second language teaching (SLT) principles (2019) Queen, Elizabeth Umutesi; Chandlee, Jane
  The number of second language teaching methods is increasing fast enough that some theories risk becoming obsolete before being practiced. However, a number of second language researchers have noted that the road from such theories to their practice is barely travelled. In the first chapter of his book, Principles and Practice in Second Language Acquisition, Krashen (1982) notes that there is a lack of interaction between second language theorists, applied linguistics researchers and teachers. He argues that the failure of researchers to communicate with teachers has resulted in the latter using their own intuition and experience to inform their teaching practice. In part, he suggests that theorists in both theoretical and applied linguistics could benefit from learning and teaching languages in order to gain a deeper understanding of language learning, and that instructors would benefit from the results of the research done by the theorists. Krashen's (1982) proposal, which is not uncommon, implies that there is a need for such an interaction between second language theorists, applied linguistics researchers and teachers, and that the teacher relying on their intuition is insufficient. In an attempt to find out whether such an interaction is crucial to language pedagogy, I first look at principles shared by three language teaching methods. From these shared principles, I then draw potential applications that I would expect an instructor informed by one or more of these methods to practice in their language classroom. I then present my research, which consists of an interview with a foreign language professor, analysis of his course materials and an observation of his class.
  I then compare the findings of my research to the potential applications of the methods discussed in this paper, with the aim of answering my research question. From my research, I conclude that the shared goal of second language theorists and second language instructors, maximizing opportunities for the learner's language development, leads them to more or less similar conclusions about what practices to pursue. As such, this interaction might not be imperative to second language pedagogy. It seems to be effectively replaced by the experience of instructors.
- Grapheme to Phoneme Conversion: Using Input Strictly Local Finite State Transducers (2019) Morgan, Gregory M.; Chandlee, Jane
  This thesis explores the many methods of Grapheme to Phoneme Conversion (G2P), including dictionary look-up, rule-based approaches, and probabilistic approaches such as Joint Sequence Models (JSM), Recurrent Neural Networks (RNN), and weighted finite-state transducers (WFST), as well as a discussion of letter-to-phoneme alignment methods. We then explain Strictly Local languages and functions and their previous applications in an Input Strictly Local FST Learning Algorithm. Finally, I propose a further application for G2P conversion by adapting the Input Strictly Local FST Learning Algorithm. My results indicate that while this algorithm had some success learning G2P, future work will be necessary to improve accuracy by implementing a probabilistic model.
- Handling Reduplication in a Morphological Analyzer for Wamesa (2020) Lin, Emily; Chandlee, Jane
  This thesis seeks to assess previous computational work done regarding reduplication in order to account for reduplication in a morphological analyzer for the Wamesa language. This paper includes a brief introduction to Wamesa, reduplication, two-level rules, and morphological analysis. The use of finite-state transducers as morphological analyzers is also discussed. Additionally, various methods that researchers have used to deal with reduplication in computational models are evaluated. Methods include the use of two-level rules and supplementing finite-state transducers with an equality operator, memory devices, and bidirectional reading capabilities. Ultimately, I propose additions and modifications to the Wamesa morphological analyzer, which involve the use of the Helsinki Finite-State Toolkit and two-level rules.
- The Intonational Phonology of Spoken Word Poetry (2018) Crum, Abigail H.; Chandlee, Jane
  Most linguistic research on intonational phonology and poetry has been focused on traditional poetic intonation (Byers 1980, Barney 1999). This thesis expands this area of study to include the sub-genre of spoken word poetry. Spoken word is a performative oral art form that encourages imaginative use of language and intonation. A trend called "poet voice" has developed out of this genre. Relatively little research has focused on rhetoric in spoken word (Stoudamire). Using the approaches of Byers (1980) and Barney (1999) to predicting how poetry will sound, I describe unique intonational features in one spoken word poem by Harry Baker (2014). I highlight the drawbacks and benefits of the prediction method as it pertains to transcription and pitch. I also offer insight about how the linguistic study of spoken word poetry can develop in the future.
- Learning Models for Opinion Mining Over a Fundamental Analysis Corpus (2017) Cassidy, Connor; Chandlee, Jane
  This paper presents a comprehensive survey of the methodologies and techniques used in opinion mining. Opinion mining, also known as sentiment analysis, refers to the use of natural language processing, text analysis, and computational linguistics to identify and extract subjective information from a variety of source materials. In this paper, we provide historical background and insight into the evolution of the methodologies that compose the cutting-edge techniques in sentiment analysis. Our goal in completing this survey is to compare the field's techniques in order to establish and propose a best-fit model for the mining of opinions within a financial news corpus. Based on results gathered from this research, we establish a method to accurately [16] extract sentiment from qualitative fundamental analysis text using Recursive Neural Tensor Networks [3] (RNTNs) along with phrase-level assessments of our corpus. In future work, we aim to use this proposed model to capture sentiment from our corpus at a topic level in order to assess and estimate the dynamic feedback effect of news in markets and sectors.
- Machine Translation and Vernacular: Interpreting the Informal (2017) Tien, Nora; Chandlee, Jane
  A literature review of the current methods of machine translation, in particular as they relate to the open problem of translating informal language.
- Mitigating Racial Bias in Social Media Hate Speech Detection (2022) Han, Jiangxue; Chandlee, Jane
  New ways of using language emerge in social media. While this has many positive aspects, it also leads to anti-social behavior, cyberbullying, online harassment, and hate speech. As a result, hate speech detection models are often used to recognize hate speech on social media and thus enable platforms to regulate the accounts that show such behavior. In this paper, I will establish that bias against users of African American English (AAE) exists in hate speech detection models and provide a literature review on current approaches to reducing such bias. I then propose to perform lexical and syntactic alternations to remove protected attributes of AAE before training, and to use an adversarial approach for training to generate hate speech predictions while mitigating racial bias.
- Music Transposition as a Method of Generating Data for Chord Recognition (2017) Wang, Alex; Chandlee, Jane
  The aim of this thesis is to evaluate the viability of transposition as a technique for generating new data that can be used to improve the accuracy of a chord recognition system. Transposition is the process of shifting all the notes in a piece by a fixed interval (i.e., changing the key of the piece). Any piece of music can be transposed 11 times before returning to its original key. We took each piece in our dataset and created 11 different versions of it, which effectively expanded our dataset by 12 times. This is a potential solution to the perennial problem in chord recognition: the lack of training data. It is well known that a machine learning model needs large volumes of training data, but labeled chord data is scarce. We want to see if transposition can help remedy this situation by providing a convenient way of creating more data. To test the effectiveness of this technique, we trained three different Hidden Markov models on different quantities of transformed data: a baseline model that contained no transformed data, an experimental model that contained the original tracks plus 5 transposed versions of each track, and an experimental model that contained the original tracks plus 11 transposed versions of each track. We achieved a recognition accuracy of 33.3% for our baseline model. We can say tentatively that transposition is not a viable technique for generating data.
- Prosody of Positive and Negative Conjunction in English and Bangla (2020) Herringshaw, Travis E.; Chandlee, Jane
  In addition to merely joining multiple constituents, conjunction can serve as an explicit indicator of how those constituents are related – viz., if they are in a positive or negative semantic relation. Positive conjunction conveys a positive relation and is facilitated with markers such as and, so, or because. Conversely, negative conjunction conveys a negative relation and is facilitated with markers such as but, yet, or though. Because conjunction reliably conveys positive and negative semantic relations, previous literature has used prosodic closeness of positively/negatively related conjuncts as a proxy for underlying semantic closeness, examining prosodic correlates of conjunction to infer “closeness” of positive/negative relations. Factors used to operationalize prosodic closeness have included inter-conjunct pitch reset and pause duration, for which a decrease in either is associated with increased closeness. Beyond possible insight into inherent closeness of semantic relations, investigation of these prosodic correlates is also useful in that if they are found to reliably deviate between positive and negative conjunction, then conjunction can be used to better inform efforts to simulate natural speech, as in text-to-speech. Previous studies have found pitch reset and pause duration to be less between positive conjuncts than between negative conjuncts in English, but have observed the opposite effect in Japanese. Tokizaki and Kuwana (2009) propose that negative relations are underlyingly closer, but that this effect is obscured in English.
  The current study analyzed inter-conjunct pitch reset and pause duration, as well as phrase-final lengthening, in various positive and negative conjunction structures in English and Bangla in an effort to address this account and, primarily, to determine whether conjunction is a reliable correlate of certain prosodic features. The current study replicated the apparent positivity-closeness effect in English, but could not draw conclusions from the Bangla results (beyond observation of pronounced phrase-final lengthening in negative conjunction), primarily due to methodological issues.
- Quantitative Metathesis in Ancient Greek (2018) Brown, Anita; Chandlee, Jane
  This paper explores the phenomenon known as 'quantitative metathesis' in Ancient Greek. Historically, this change, an apparent metathesis of vowel length, has been considered to be true metathesis by classicists, but recent scholarship has cast suspicion on this notion, not least because metathesis of vowel length is not a known change in any other language. In this paper, I present a review of previous scholarship on Greek quantitative metathesis, in addition to a cross-linguistic survey of general metathesis, with special attention to autosegmental theory. I conclude that Greek quantitative metathesis is not true metathesis, but rather a retention and reassociation of abstract timing units through the two individual (and well-attested) processes of antevocalic shortening and compensatory lengthening.