Institutional Scholarship

Single Deletions in Source Sentences Trigger Hallucinations in English-Chinese Machine Translation with Transformers: An Analysis

dc.contributor.advisor Grissom, Alvin
dc.contributor.author Shi, Ruikang
dc.date.accessioned 2021-07-12T11:57:42Z
dc.date.available 2021-07-12T11:57:42Z
dc.date.issued 2021
dc.identifier.uri http://hdl.handle.net/10066/23547
dc.description.abstract Machine translation is widely used, yet every machine translation model occasionally makes mistakes. I review previous studies of possible errors in machine translation, then elaborate on how mistyping can cause a severe translation error: HALLUCINATION. I conduct experiments to examine how deleting a single letter or a single word affects the probability of HALLUCINATION. The results show that both kinds of deletion can cause HALLUCINATION, with single-word deletion more likely to do so. They also show that training the model on more data can decrease the probability of HALLUCINATION. Moreover, untranslated proper nouns in the training data lead to INABILITY, a specific type of HALLUCINATION.
dc.description.sponsorship Haverford College. Department of Computer Science
dc.language.iso eng
dc.rights.uri http://creativecommons.org/licenses/by-nc/4.0/
dc.title Single Deletions in Source Sentences Trigger Hallucinations in English-Chinese Machine Translation with Transformers: An Analysis
dc.type Thesis
dc.rights.access Dark Archive until 2026-01-01, afterwards Open Access.


Except where otherwise noted, this item's license is described as http://creativecommons.org/licenses/by-nc/4.0/
