Single Deletions in Source Sentences Trigger Hallucinations in English-Chinese Machine Translation with Transformers: An Analysis

dc.contributor.advisor: Grissom, Alvin
dc.contributor.author: Shi, Ruikang
dc.date.accessioned: 2021-07-12T11:57:42Z
dc.date.available: 2021-07-12T11:57:42Z
dc.date.issued: 2021
dc.description.abstract: Machine translation is widely used, but every machine translation model sometimes makes mistakes. I review previous studies of possible errors in machine translation, then elaborate on how mistyping can cause a severe translation error: HALLUCINATION. I conduct experiments to examine the effect of deleting single letters or single words on the probability of HALLUCINATION. The results show that both kinds of deletion can cause HALLUCINATION, with single-word deletion the more likely trigger. They also show that training the model with more data can decrease the probability of HALLUCINATION. Moreover, untranslated proper nouns in the training data lead to INABILITY, a specific type of HALLUCINATION.
dc.description.sponsorship: Haverford College. Department of Computer Science
dc.identifier.uri: http://hdl.handle.net/10066/23547
dc.language.iso: eng
dc.rights.access: Dark Archive until 2026-01-01, afterwards Open Access.
dc.rights.uri: http://creativecommons.org/licenses/by-nc/4.0/
dc.title: Single Deletions in Source Sentences Trigger Hallucinations in English-Chinese Machine Translation with Transformers: An Analysis
dc.type: Thesis
Files
Original bundle
Name: 2021ShiR.pdf
Size: 973.81 KB
Format: Adobe Portable Document Format
Description: Thesis

Name: 2021ShiR_release.pdf
Size: 196.48 KB
Format: Adobe Portable Document Format
Description: ** Archive Staff Only **