Neural Machine Translation and its Evaluation using Generative Adversarial Network (敵対的生成ネットワークを用いたニューラル機械翻訳とその評価)

マツムラ, ユキオ; Matsumura, Yukio; 松村, 雪桜

http://hdl.handle.net/10748/00011317

Name / File	License	Action
T02054-001.pdf (874.2 kB)

Item type

Thesis or Dissertation(1)

Publication date

2020-04-01

Title

敵対的生成ネットワークを用いたニューラル機械翻訳とその評価 (Neural Machine Translation and its Evaluation using Generative Adversarial Network)

Language

jpn

Resource type

Resource type identifier

http://purl.org/coar/resource_type/c_46ec

Resource type

thesis

Author

松村, 雪桜

Author (reading)

マツムラ, ユキオ

Author alternative name

Matsumura, Yukio

Abstract

Description type

Abstract

Description

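In recent years, the advent of neural machine translation (NMT) has made machine translation an active research area. The original Encoder-Decoder model combines two recurrent neural networks (RNNs): an Encoder that converts the source sentence into an intermediate representation, and a Decoder that generates the target sentence from that representation one word at a time. Since it was proposed, translation accuracy has improved greatly through developments such as Attention, which lets the model weight and consider each word of the source sentence. However, problems remain: some words are frequently left untranslated and disappear, and unnecessary words appear or are repeated. In addition, BLEU is generally used to evaluate machine translation. BLEU measures accuracy on the basis of word n-gram precision, and words with different surface forms are treated as entirely different words, so it cannot evaluate a sentence with its meaning properly taken into account. It also requires a reference translation, that is, a correct translation, for evaluation.

The generative adversarial network (GAN), which has attracted attention in image generation, consists of two networks, a Generator and a Discriminator. The Discriminator identifies whether a given datum is real data or Generator output, while the Generator is trained adversarially to produce data that the Discriminator cannot distinguish; this enables the Generator to produce data close to the real data. GANs have also been tried in natural language processing, especially in machine translation. In these studies the Generator is a neural machine translation model, and the Discriminator is a classifier that, given a source sentence and a target sentence, predicts whether the target sentence is a reference translation or a system output. Training the two adversarially aims to improve the accuracy of the Generator, that is, of the neural machine translation model. Moreover, whereas conventional neural machine translation is optimized at the word level, a GAN makes it possible to optimize with sentence-level information as well, so the generated sentences are expected to be more natural.

In this study, we investigate the effect of GANs on neural machine translation in Japanese-English and English-Japanese translation. Focusing on the Discriminator, the classifier of target sentences, we also propose using the likeness to real data that it predicts as an evaluation method for machine translation. In the GAN setting the real data are human reference translations, so a pair of source and target sentences that looks like real data is likely to be a human translation, and the prediction can therefore be repurposed to evaluate translations. Furthermore, because the proposed method evaluates from the pair of source sentence and translation, it needs no reference translation at evaluation time, and it is therefore also expected to be usable for evaluating translations of sentences that have no reference, such as sentences from a monolingual corpus.

As the GAN architecture we implemented an improved version of Conditional Sequence GAN (CSGAN), and we used three methods for the training objective. The first is the GAN originally proposed by Goodfellow et al., in which the Discriminator performs two-class classification of the target sentence as real or generated and is trained with cross entropy. The second is Least Squares GAN (LSGAN), which, unlike the conventional GAN, predicts how real the target sentence is directly, without an activation function, and is trained with mean squared error. The third is Wasserstein GAN (WGAN), which is trained using the difference between the score the Discriminator predicts for real data (positive examples) and the score it predicts for generated data (negative examples). Because GAN training is very unstable, in every method we first pre-train the Generator, then pre-train the Discriminator on the Generator's outputs, and finally train the whole GAN.

For these three methods we run experiments on the Asian Scientific Paper Excerpt Corpus (ASPEC), report the accuracy of neural machine translation using evaluation metrics such as BLEU, and discuss the effect of GANs on neural machine translation. We also evaluate sentence-level translation quality by feeding the trained Discriminator source sentences together with the output sentences of actual translation systems from the Workshop on Asian Translation (WAT). We report the correlation between these scores and human evaluation scores using Kendall's rank correlation and discuss the results.

This thesis is organized as follows. Chapter 1 describes the proposal, contributions, and overview of the study as a whole. Chapter 2 describes the basic architecture of neural machine translation together with previous work. Chapter 3 describes previous work on GANs. Chapter 4 describes the machine translation method using GANs. Chapter 5 describes the machine translation evaluation method using GANs. Chapter 6 presents experimental results and discussion for the methods described in Chapters 4 and 5. Finally, Chapter 7 summarizes the study.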
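
The abstract above compares Discriminator scores with human evaluation scores using Kendall's rank correlation. A minimal sketch of that comparison follows. It assumes the tau-a variant (the abstract does not say which variant the thesis uses), and the function name `kendall_tau_a` and all score values are hypothetical illustrations, not data from the thesis.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs.

    Assumption: tau-a is used here for simplicity; tied pairs count as
    neither concordant nor discordant.
    """
    if len(xs) != len(ys):
        raise ValueError("score lists must have the same length")
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two items")
    concordant = discordant = 0
    for i, j in combinations(range(n), 2):
        sign = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if sign > 0:
            concordant += 1   # both metrics order the pair the same way
        elif sign < 0:
            discordant += 1   # the metrics disagree on this pair
    return (concordant - discordant) / (n * (n - 1) / 2)


# Hypothetical values: Discriminator scores and human scores for five outputs.
disc_scores = [0.91, 0.35, 0.62, 0.10, 0.77]
human_scores = [5, 2, 4, 1, 3]
tau = kendall_tau_a(disc_scores, human_scores)
print(f"Kendall tau-a = {tau:.2f}")
```

A value near 1 would mean the Discriminator ranks translations almost as the human judges do; a value near 0 would mean its ranking is unrelated to theirs.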

Abstract

Description type

Abstract

Description

In recent years, neural machine translation (NMT) has been studied all over the world. Since the encoder-decoder model, which combines two recurrent neural networks (RNNs), was proposed, NMT has gained great popularity in the machine translation community. However, the conventional encoder-decoder model works poorly on long sequences. Attention-based NMT predicts output words better by using a weighted sum of the encoder's hidden states as the context vector, which has improved translation quality, especially for long sentences. Nevertheless, NMT still has problems, for example over-translation, in which some words are translated repeatedly or unnecessary words are generated, and under-translation, in which some words are mistakenly left untranslated. BLEU is generally used to evaluate machine translation. However, BLEU is based on n-gram precision, so even when the system translates a source sentence correctly, BLEU penalizes the output if the surface forms of its words differ from the reference. Furthermore, BLEU requires a reference translation. The generative adversarial network (GAN), which is popular in image generation, consists of two networks: a generator, which generates data, and a discriminator, which distinguishes generated data from true data. The discriminator tries to tell true data from generated data, while the generator tries to produce data so close to the true data that the discriminator cannot tell them apart. Through this adversarial training, the generator learns to produce data close to the true data. GANs have also been applied to neural machine translation. Several previous studies treat the NMT model as the generator and use as the discriminator a classifier that, given a source sentence and a target sentence, distinguishes true sentences from generated ones. This adversarial training improves the quality of NMT.
Furthermore, the objective function of conventional NMT is optimized word by word, so the output is not guaranteed to be optimal as a sentence. GAN-based NMT, in contrast, is optimized using sentence-level information, so it is expected to generate more natural sentences. In this research, we examine the effect of GAN in Japanese-English and English-Japanese translation. We also focus on the discriminator, which predicts the correctness of a sentence, and propose an evaluation method for machine translation using GAN. In this setting, the true data are human translations. A sentence that the discriminator predicts to be correct is therefore likely to be close to a human translation, so the prediction can be used to evaluate machine translation. This method needs no reference at evaluation time because it uses only the pair of source sentence and system output, so it is expected to be applicable to translations of source sentences that have no target sentence, for example sentences from a monolingual corpus. In our experiments, we implemented a model architecture based on conditional sequence GAN (CSGAN) and trained it with three types of objective function. The first is the conventional GAN, which is trained using cross entropy. The second is least squares GAN (LSGAN), which predicts the correctness of the target sentence directly, without an activation function, and is trained using mean squared error. The last is Wasserstein GAN (WGAN), which uses the difference between the scores of true data and generated data during training. Note that in all methods we pre-trained both the generator and the discriminator using the baseline, because the training of GAN is very unstable. We ran experiments with GAN NMT on Japanese-English and English-Japanese translation using the Asian Scientific Paper Excerpt Corpus (ASPEC).
Furthermore, we evaluated the outputs of translation systems from the Workshop on Asian Translation (WAT) using GAN NMT. We confirmed the results in terms of both translation quality and evaluation quality. This paper is organized as follows. In Section 1, we introduce the overview of research on neural machine translation. In Section 2, we describe the architecture of neural machine translation. In Section 3, we introduce previous studies on generative adversarial networks. In Section 4, we describe the architecture of the generative adversarial network in neural machine translation. In Section 5, we propose the evaluation method for machine translation using a generative adversarial network. In Section 6, we show the experimental results on translation and evaluation using a generative adversarial network. Finally, in Section 7, we summarize this paper.
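
The three objective functions named in the abstract can be sketched as discriminator losses over a batch of raw scores. This is an illustration under stated assumptions rather than the thesis implementation: the abstract gives no formulas, so the LSGAN targets (1 for true data, 0 for generated data) and the omission of WGAN's Lipschitz constraint are assumptions, and all function names are hypothetical.

```python
import math


def softplus(z):
    """Numerically stable log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def mean(values):
    return sum(values) / len(values)


def gan_d_loss(real_logits, fake_logits):
    """Conventional GAN: two-class cross entropy on sigmoid outputs.

    -log(sigmoid(z)) equals softplus(-z), and
    -log(1 - sigmoid(z)) equals softplus(z).
    """
    real_term = mean([softplus(-z) for z in real_logits])
    fake_term = mean([softplus(z) for z in fake_logits])
    return real_term + fake_term


def lsgan_d_loss(real_scores, fake_scores):
    """LSGAN: mean squared error on raw scores, with no activation function.

    Assumption: target 1.0 for true data and 0.0 for generated data.
    """
    real_term = mean([(s - 1.0) ** 2 for s in real_scores])
    fake_term = mean([s ** 2 for s in fake_scores])
    return 0.5 * (real_term + fake_term)


def wgan_d_loss(real_scores, fake_scores):
    """WGAN: minimize (mean generated score) - (mean true score).

    Assumption: the Lipschitz constraint (for example weight clipping)
    is handled elsewhere and is not shown here.
    """
    return mean(fake_scores) - mean(real_scores)


# Hypothetical raw discriminator scores for two true and two generated pairs.
real = [2.0, 1.0]
fake = [-1.0, 0.5]
print(gan_d_loss(real, fake), lsgan_d_loss(real, fake), wgan_d_loss(real, fake))
```

In each case the generator would be trained with the opposite goal, pushing the scores of its outputs toward those of true data.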

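Description

Description type

Other

Description

Tokyo Metropolitan University (首都大学東京), 2019-03-25, Master (Engineering)

Bibliographic information

p. 1-28, issue date 2019-03-25

Alternative title

Neural Machine Translation and its Evaluation using Generative Adversarial Network

Degree name

Master (Engineering)

Degree-granting institution

Degree-granting institution name

Tokyo Metropolitan University (首都大学東京)

Date degree granted

2019-03-25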
