Single Deletions in Source Sentences Trigger Hallucinations in English-Chinese Machine Translation with Transformers: An Analysis
Haverford College, Department of Computer Science
Dark Archive until 2026-01-01, afterwards Open Access.
Machine translation is widely used today, yet every machine translation model sometimes makes mistakes. I first review previous studies of error types in machine translation, then focus on how a simple typing mistake in the source sentence can trigger a severe translation error: HALLUCINATION. I conduct experiments to examine how deleting a single letter or a single word from the source sentence affects the probability of HALLUCINATION. The results show that both kinds of deletion can cause HALLUCINATION, with single-word deletion being the more likely trigger. They also show that training the model on more data lowers the probability of HALLUCINATION. Moreover, untranslated proper nouns in the training data lead to INABILITY, a specific type of HALLUCINATION.
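The two perturbations studied here, deleting a single letter or a single word from a source sentence, could be sketched as follows. This is a minimal illustration of the idea only; the function names and the uniform random sampling are my own assumptions, not the thesis's experimental code.

```python
import random

def delete_one_letter(sentence: str, rng: random.Random) -> str:
    """Return the sentence with one randomly chosen letter removed,
    simulating a single-character typo."""
    positions = [i for i, ch in enumerate(sentence) if ch.isalpha()]
    if not positions:
        return sentence  # nothing to delete
    i = rng.choice(positions)
    return sentence[:i] + sentence[i + 1:]

def delete_one_word(sentence: str, rng: random.Random) -> str:
    """Return the sentence with one randomly chosen word removed,
    simulating an accidentally dropped word."""
    words = sentence.split()
    if len(words) <= 1:
        return sentence  # keep at least one word
    words.pop(rng.randrange(len(words)))
    return " ".join(words)

rng = random.Random(0)
src = "The quick brown fox jumps over the lazy dog"
print(delete_one_letter(src, rng))
print(delete_one_word(src, rng))
```

Each perturbed sentence would then be fed to the Transformer model, and its output compared against the translation of the clean source to decide whether a HALLUCINATION occurred.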