modify readme
jaywalnut310 committed Nov 2, 2017
1 parent 9ce177a commit 3ff8fc2
Showing 19 changed files with 41 additions and 10 deletions.
51 changes: 41 additions & 10 deletions README.md
@@ -1,30 +1,61 @@
# Transformer-without-Explicit-Segmentation
Transformer without Explicit Segmentation
===

"Transformer without Explicit Segmentation" running on Tensor2Tensor Library.

---

## Introduction

It implements the embedding algorithm suggested in the paper "Fully Character-Level Neural Machine Translation without Explicit Segmentation".
This project implements the embedding algorithm proposed in the paper "Fully Character-Level Neural Machine Translation without Explicit Segmentation". From here on, we call this algorithm "Character-Aware Embedding".

The embedding is used together with the Transformer architecture, which was proposed in the paper "Attention Is All You Need".
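As a rough illustration of how a character-aware embedding turns a character sequence into a shorter sequence of segment vectors, here is a minimal pure-Python sketch. It is not the code used in this repository: the function name `char_aware_embed`, the random weights, and every size (including `pool_size=5`) are assumptions chosen for illustration. It uses a single filter width and leaves out the multiple filter widths and highway layers described in the paper.

```python
import random


def char_aware_embed(char_ids, embed_table, conv_filters, pool_size):
    """Map character ids to a shorter sequence of segment vectors (toy sketch)."""
    # 1. Look up one embedding vector per character.
    chars = [embed_table[c] for c in char_ids]
    dim = len(chars[0])
    width = len(conv_filters[0])  # number of taps per filter

    # 2. 1-D convolution over time, zero-padded so the output keeps the input length.
    pad = width // 2
    zero = [0.0] * dim
    padded = [zero] * pad + chars + [zero] * (width - 1 - pad)
    conv = []
    for t in range(len(chars)):
        window = padded[t:t + width]
        feats = []
        for filt in conv_filters:
            # filt[k] is the weight vector (size `dim`) applied to tap k.
            s = sum(w * x for k in range(width) for w, x in zip(filt[k], window[k]))
            feats.append(max(0.0, s))  # ReLU
        conv.append(feats)

    # 3. Max-pool over time with stride == pool_size (non-overlapping windows).
    pooled = []
    for start in range(0, len(conv), pool_size):
        block = conv[start:start + pool_size]
        pooled.append([max(column) for column in zip(*block)])
    return pooled


# Hypothetical sizes and random weights, for illustration only.
random.seed(0)
vocab_size, dim, num_filters, width = 20, 4, 6, 3
embed_table = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(vocab_size)]
conv_filters = [
    [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(width)]
    for _ in range(num_filters)
]

segments = char_aware_embed([3, 7, 1, 12, 5, 9, 2], embed_table, conv_filters, pool_size=5)
print(len(segments), len(segments[0]))  # 7 characters -> 2 segment vectors of 6 features
```

The pooled vectors would then be passed to the Transformer encoder in place of word embeddings, so the encoder sees roughly one time step per `pool_size` characters.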

I am currently investigating which embedding variations and input/output types perform better.

* Comparison with Word/SubWord/Character/Byte-level input embedding
* Comparison with Word/SubWord/Character/Byte-level output embedding
* Multiple CNN layers instead of a single CNN layer

## Technical Report

### 1. Comparison [output type: (sub) word-level]
The results are not very good yet.
Resolving the discrepancy between training loss and evaluation loss should improve them.
The effect of explicit segmentation is likely to be most visible in "hard to segment" languages such as Chinese, Japanese, and Korean. That is, without Character-Aware Embedding, translation from these languages into easy-to-segment languages is unlikely to perform well unless a tokenizing preprocessing step is applied.

Therefore, I focus on translation tasks from such languages. In fact, these languages turn out to take less training time than alphabetic languages, because their words are generally shorter than those of alphabetic languages and the max-pooling size is fixed.
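The link between input length, a fixed max-pooling size, and training cost can be sketched with simple arithmetic. The character counts and `pool_size=5` below are made-up values, not measurements from this project. The sketch only assumes non-overlapping pooling and the fact that self-attention compares every pair of positions.

```python
import math


def pooled_length(num_chars, pool_size):
    """Number of time steps left after non-overlapping max-pooling."""
    return math.ceil(num_chars / pool_size)


def relative_attention_cost(num_steps):
    """Self-attention compares every pair of positions, so cost grows as steps**2."""
    return num_steps ** 2


# Hypothetical character counts for the same sentence in two writing systems.
for label, num_chars in [("shorter input", 30), ("longer input", 90)]:
    steps = pooled_length(num_chars, pool_size=5)
    print(label, steps, relative_attention_cost(steps))
# shorter input 6 36
# longer input 18 324
```

With the pooling size held fixed, an input with one third of the characters yields one third of the time steps and about one ninth of the pairwise attention work.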

### 1. Word-level Output

We use a small Korean-English parallel dataset.

![word_training_loss](pictures/comp_word_training_loss.png "word_training_loss")
![word_eval_loss](pictures/comp_word_eval_loss.png "word_eval_loss")
![word_bleu](pictures/comp_word_bleu.png "word_bleu")
![word_eval_metrics](pictures/comp_word_eval_metrics.png "word_eval_metrics")
#### 1.1. Comparison
The model with word-level input embedding is named 'koen_simple', and the model with Character-Aware input embedding is named 'koen_char2word_simple'. As Fig 1 shows, the evaluation loss of 'koen_char2word_simple' falls faster and further than that of 'koen_simple'. Training time is also reduced, because max-pooling reduces the number of time steps.
![comp_word_eval_loss](pictures/cae5a5c9ac505ea180c8cbd2630aabe3.png)
*Fig 1. Evaluation Loss Comparison (Word-level Output)*

Furthermore, 'koen_char2word_simple' performs as well as or better than 'koen_simple' on many metrics.
![comp_word_eval_metrics](pictures/5deb2b84292583297dca2c2407f53b6c.png)
*Fig 2. Evaluation Metrics Comparison (Word-level Output)*

### 2. Comparison [output type: character-level]
'koen_char2word_simple' is also more robust to out-of-vocabulary (OOV) input, owing to its convolution and character-level embedding scheme.

| Input \ Model | 'koen_char2word_simple' | 'koen_simple' |
|---|---|---|
| 바보 | ![](pictures/3585145a278db5457159f4e669512279.png) | ![](pictures/89739e520b83ccec69bf25a0133cdd8a.png) |
| 바아보 | ![](pictures/c4ea35c3ea10a90810fc34fd11044dc3.png) | ![](pictures/d9e615cd4f9c8171834e579f8d12130a.png) |
| 바아아보 | ![](pictures/2d272a1f110a6d5dbf65508bbc6be24d.png) | ![](pictures/255e9219923de0a1b801bf8126f27b04.png) |
| 아버지가방에들어가신다. | ![](pictures/24ebd5c0d5907114f29b9735a8a366fa.png) | ![](pictures/774810804cdb8a3dc1ca0f91ccab1b28.png) |
| 아버지가 방에 들어가신다. | ![](pictures/a26f67e0b0eb5ea834efd547e7a78d3e.png) | ![](pictures/1675eb17cbc1d86b1cab44c377bb5cbd.png) |
| 아버지 가방에 들어가신다. | ![](pictures/a894a9692b2285bf9811fdc495cf176f.png) | ![](pictures/e2ce12369041f64f7efe04f8d1d51d26.png) |

*Table 1. OOV test on sample sentences (Word-level Output)*
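To illustrate why character-level input is less exposed to OOV words, the toy sketch below builds a word vocabulary and a character vocabulary from two sentences and looks up inputs similar to those in Table 1. The vocabularies, the `<unk>` token, and the helper names are assumptions made for this example. This is not the tokenizer used in this repository, and it shows only the input lookup, not translation quality.

```python
UNK = "<unk>"


def word_ids(sentence, vocab):
    """Whitespace-split lookup: any unseen word becomes <unk>."""
    return [vocab.get(word, vocab[UNK]) for word in sentence.split()]


def char_ids(sentence, vocab):
    """Per-character lookup: only unseen characters become <unk>."""
    return [vocab.get(ch, vocab[UNK]) for ch in sentence]


def unk_rate(ids):
    """Fraction of positions mapped to <unk> (id 0)."""
    return ids.count(0) / len(ids)


# Toy "training data" used only to build the two vocabularies.
train = ["바보", "아버지가 방에 들어가신다."]
word_vocab = {UNK: 0}
char_vocab = {UNK: 0}
for sentence in train:
    for word in sentence.split():
        word_vocab.setdefault(word, len(word_vocab))
    for ch in sentence:
        char_vocab.setdefault(ch, len(char_vocab))

for text in ["바아보", "아버지가방에들어가신다."]:
    print(text, unk_rate(word_ids(text, word_vocab)), unk_rate(char_ids(text, char_vocab)))
# 바아보 1.0 0.0
# 아버지가방에들어가신다. 1.0 0.0
```

In this toy setting the word-level lookup loses the whole input to `<unk>`, while every character is still known, so a convolution over character embeddings still has usable signal.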

### 2. SubWord-level Output

### 3. Character-level Output

### 4. Byte-level Output

---
## Issues

## Reference
* [Attention Is All You Need](https://arxiv.org/pdf/1706.03762.pdf)
Binary file added pictures/1675eb17cbc1d86b1cab44c377bb5cbd.png
Binary file added pictures/24ebd5c0d5907114f29b9735a8a366fa.png
Binary file added pictures/255e9219923de0a1b801bf8126f27b04.png
Binary file added pictures/2d272a1f110a6d5dbf65508bbc6be24d.png
Binary file added pictures/3585145a278db5457159f4e669512279.png
Binary file added pictures/5deb2b84292583297dca2c2407f53b6c.png
Binary file added pictures/774810804cdb8a3dc1ca0f91ccab1b28.png
Binary file added pictures/89739e520b83ccec69bf25a0133cdd8a.png
Binary file added pictures/a26f67e0b0eb5ea834efd547e7a78d3e.png
Binary file added pictures/a894a9692b2285bf9811fdc495cf176f.png
Binary file added pictures/c4ea35c3ea10a90810fc34fd11044dc3.png
Binary file added pictures/cae5a5c9ac505ea180c8cbd2630aabe3.png
Binary file removed pictures/comp_word_bleu.png
Binary file not shown.
Binary file removed pictures/comp_word_eval_loss.png
Binary file not shown.
Binary file removed pictures/comp_word_eval_metrics.png
Binary file not shown.
Binary file removed pictures/comp_word_training_loss.png
Binary file not shown.
Binary file added pictures/d9e615cd4f9c8171834e579f8d12130a.png
Binary file added pictures/e2ce12369041f64f7efe04f8d1d51d26.png
