The idea of using computers for translation of natural languages is as old as computers themselves [1]. However, achieving major success remained elusive, in spite of the unwavering efforts of machine translation (MT) research over the last 70 years. The main challenges faced by MT systems are correct resolution of the inherent ambiguity of language in the source text, and adequately expressing its intended meaning in the target language (translation adequacy) in a well-formed and fluent way (translation fluency). Among the key complications is the rich morphology in the source and especially in the target language [2]. For these reasons, the level of human translation has been thought to be the upper bound of achievable performance [3]. There are also other challenges in recent MT research, such as gender bias [4] or unsupervised MT [5], which are mostly orthogonal to the present work.

Deep learning has transformed multiple fields in recent years, ranging from computer vision [6] to artificial intelligence in games [7]. In line with these advances, the field of MT has shifted to deep-learning neural-based methods [8, 9, 10, 11], which replaced previous approaches such as rule-based systems [12] or statistical phrase-based methods [13, 14]. Relying on vast amounts of training data and unprecedented computing power, neural MT (NMT) models can now afford to access the complete information available anywhere in the source sentence and automatically learn which piece is useful at which stage of producing the output text. This removal of past independence assumptions is the key reason behind the dramatic improvement in translation quality. As a result, neural translation has even managed to considerably narrow the gap to human-translation quality on isolated sentences [15, 16].

In this work, we present a neural-based translation system, CUBBITT (Charles University Block-Backtranslation-Improved Transformer Translation), which significantly outperformed professional translators on isolated sentences in a prestigious competition, the WMT 2018 English–Czech News Translation Task [17]. We perform a new study with conditions that are more representative and far more challenging for MT, showing that CUBBITT conveys the meaning of news articles significantly better than human translators even when cross-sentence context is taken into account. In addition, we validate the methodological improvements using an automatic metric on English↔French and English↔Polish news articles. Finally, we provide insights into the principles underlying CUBBITT's key technological advancement and how it improves translation quality.

Fig. 1: a The input sentence is converted to a numerical representation and encoded into a deep representation by a six-layer encoder, which is subsequently decoded by a six-layer decoder into the translation in the target language. Layers of the encoder and decoder consist of self-attention and feed-forward layers, and the decoder also contains an encoder-decoder attention layer, whose input is the deep representation created by the last layer of the encoder. b Visualization of encoder self-attention between the first two layers (one attention head shown, focusing on "magazine" and "her"). The strong attention link between "magazine" and "gun" suggests why CUBBITT ultimately correctly translates "magazine" as "zásobník" (gun magazine) rather than "časopis" (e.g., news magazine). The attention link between "woman" and "her" illustrates how the system internally learns coreference. c Encoder-decoder attention on the second layer of the decoder. Two heads are shown in different colors, each focusing on a different translation aspect, which is described in italics.
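To make the encoder-decoder structure of Fig. 1a concrete, here is a minimal sketch using PyTorch's built-in `nn.Transformer` module. The hyperparameters (`d_model=512`, 8 heads, feed-forward width 2048) follow the common "base" Transformer configuration and are an assumption on our part; this excerpt does not state CUBBITT's actual training settings, and the vocabulary size and embeddings are illustrative placeholders.

```python
import torch
import torch.nn as nn

# Minimal sketch of the architecture described in Fig. 1a.
# Dimensions are assumed "base" Transformer values, not CUBBITT's
# actual configuration. Positional encodings are omitted for brevity.
VOCAB_SIZE = 32000   # hypothetical subword vocabulary size
D_MODEL = 512        # width of the "deep representation"

src_embed = nn.Embedding(VOCAB_SIZE, D_MODEL)  # numerical representation of input
tgt_embed = nn.Embedding(VOCAB_SIZE, D_MODEL)

transformer = nn.Transformer(
    d_model=D_MODEL,
    nhead=8,                 # parallel attention heads (cf. Fig. 1c)
    num_encoder_layers=6,    # six-layer encoder
    num_decoder_layers=6,    # six-layer decoder
    dim_feedforward=2048,    # feed-forward sublayer width
    batch_first=True,
)

# Toy batch: one sentence of 7 source tokens, 5 target tokens so far.
src = src_embed(torch.randint(0, VOCAB_SIZE, (1, 7)))
tgt = tgt_embed(torch.randint(0, VOCAB_SIZE, (1, 5)))

# The decoder attends to the deep representation produced by the last
# encoder layer via its encoder-decoder attention sublayers.
out = transformer(src, tgt)
print(out.shape)  # torch.Size([1, 5, 512])
```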
A critical feature of the encoder and decoder is self-attention, which allows identification and representation of relationships between sentence elements. While the encoder attention captures the relationships between the elements in the input sentence (Fig. 1b), the encoder-decoder attention learns the relationship between elements in the deep representation of the input sentence and elements in the translation (Fig. 1c). We note that the attention weights were learned spontaneously by the network, not inputted a priori.
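The attention weights discussed above are ordinary numbers produced by a softmax, which is why they can be extracted and visualized as in Fig. 1b, c. Below is a self-contained NumPy sketch of scaled dot-product self-attention for a single head, the mechanism underlying the Transformer; the projection matrices here are random stand-ins for parameters that a real network learns during training, not CUBBITT's weights.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, W_q, W_k, W_v):
    """Scaled dot-product self-attention for one head.

    X: (n_tokens, d_model) sentence representation.
    Returns the per-head output and the (n_tokens, n_tokens)
    attention-weight matrix that visualizations like Fig. 1b display.
    """
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    d_k = Q.shape[-1]
    weights = softmax(Q @ K.T / np.sqrt(d_k))  # each row sums to 1
    return weights @ V, weights

rng = np.random.default_rng(0)
n_tokens, d_model, d_k = 6, 512, 64   # e.g., a six-token sentence

# Random stand-ins for learned projections: in a trained model these
# weights emerge from the data, not from any a priori annotation.
W_q, W_k, W_v = (rng.normal(0, 0.02, (d_model, d_k)) for _ in range(3))
X = rng.normal(0, 1, (n_tokens, d_model))

out, weights = self_attention(X, W_q, W_k, W_v)
print(weights.round(2))  # e.g., the row for "magazine" would show
                         # how strongly it attends to "gun"
```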