Single Deletions in Source Sentences Trigger Hallucinations in English-Chinese Machine Translation with Transformers: An Analysis
Date
2021
Department
Haverford College. Department of Computer Science
Type
Thesis
Language
eng
Access Restrictions
Dark Archive until 2026-01-01, afterwards Open Access.
Abstract
Machine translation is widely used, yet every machine translation model occasionally makes mistakes. I review previous studies on the kinds of errors that occur in machine translation, then focus on how a typing mistake in the source sentence can cause a severe translation error: HALLUCINATION. I conduct experiments that measure how deleting a single letter or a single word from the source sentence affects the probability of HALLUCINATION. The results show that both kinds of deletion can trigger HALLUCINATION, and that deleting a single word does so with greater probability. They also show that training the model on more data decreases the probability of HALLUCINATION. Moreover, untranslated proper nouns in the training data lead to INABILITY, a specific type of HALLUCINATION.