Training data: Mainichi newspaper article texts
| | 45 month | 75 month |
|---|---|---|
| period | '91/01-'94/09 | '91/01-'94/09, '95/01-'97/06 |
| data amount | 65M words | 118M words |
Language Model Compression
Baseline model (cutoff-1-1)
List of 20K Language Models
| model | 2-gram entries | 3-gram entries |
|---|---|---|
| 45month cutoff-1-1 | 1,238,929 | 4,733,916 |
| 45month cutoff-4-4 | 657,759 | 1,593,020 |
| 45month compress10% | 1,238,929 | 473,176 |
| 75month cutoff-1-1 | 1,675,803 | 7,445,209 |
| 75month cutoff-4-4 | 901,475 | 2,629,605 |
| 75month compress10% | 1,675,803 | 744,438 |
List of 60K Language Models
| model | 2-gram entries | 3-gram entries |
|---|---|---|
| 75month cutoff-1-1 | 2,420,231 | 8,368,507 |
| 75month compress10% | 2,420,231 | 836,852 |
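
As a quick sanity check on the figures above, the following sketch computes what fraction of the baseline (cutoff-1-1) 3-gram entries each compress10% model keeps. The counts are copied from the tables; the names `TRIGRAM_ENTRIES` and `retention_ratio` are made up for this illustration. The ratios come out close to 10%, and the 2-gram counts are identical to the baseline, which suggests that "compress10%" means keeping roughly a tenth of the 3-gram entries. This reading is an assumption: the listing does not say how the retained entries were chosen.

```python
# 3-gram entry counts copied from the tables above:
# (baseline cutoff-1-1, compress10%)
TRIGRAM_ENTRIES = {
    "20K 45month": (4_733_916, 473_176),
    "20K 75month": (7_445_209, 744_438),
    "60K 75month": (8_368_507, 836_852),
}


def retention_ratio(baseline: int, compressed: int) -> float:
    """Fraction of baseline 3-gram entries kept in the compressed model."""
    return compressed / baseline


# Print the retained share of 3-gram entries for each model.
for name, (baseline, compressed) in TRIGRAM_ENTRIES.items():
    ratio = retention_ratio(baseline, compressed)
    print(f"{name}: {ratio:.2%} of baseline 3-gram entries kept")
```

The same calculation applied to the cutoff-4-4 rows would show how much the higher cutoff shrinks both the 2-gram and 3-gram tables relative to the baseline.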
backward 3-gram (for forward-backward search)