List of Acoustic Models
| model | #states | #mixtures | gender |
| monophone | 129 | 4, 8, 16 | GD, GI |
| triphone 1000 | 1000 | 4, 8, 16 | GD |
| triphone 2000 | 2000 | 4, 8, 16 | GD, GI |
| triphone 3000 | 3000 | 4, 8, 16 | GD |
| PTM triphone | 3000/129 | 64 | GD, GI |
List of Japanese Phones
| a i u e o a: i: u: e: o: N w y |
| p py t k ky b by d dy g gy ts ch |
| m my n ny h hy f s sh z j r ry |
| q sp silB silE (pauses) |
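The phone list and the monophone row of the model table can be cross-checked with a little arithmetic. The sketch below assumes that every phone, pauses included, has three emitting states of its own. The source does not state this per phone, but it is the reading consistent with the 129 states listed for the monophone model. The names `PHONES` and `monophone_state_count` are illustrative only.

```python
# Phone inventory copied from the table above, one string per table row.
PHONES = (
    "a i u e o a: i: u: e: o: N w y "
    "p py t k ky b by d dy g gy ts ch "
    "m my n ny h hy f s sh z j r ry "
    "q sp silB silE"
).split()

# Assumption: each phone model has 3 emitting states (left-to-right HMM).
STATES_PER_PHONE = 3


def monophone_state_count(phones, states_per_phone=STATES_PER_PHONE):
    """Total emitting states when no states are shared between phones."""
    return len(set(phones)) * states_per_phone


if __name__ == "__main__":
    print(len(PHONES), monophone_state_count(PHONES))
```

With 43 phones and 3 states each, the count matches the 129 states given for the monophone model.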
Training: ASJ (Acoustical Society of Japan) databases
20K sentences / 132 speakers for each gender
Acoustic Analysis
| A/D | 16kHz,16bit |
| frame shift | 10ms |
| analysis | MFCC (12th order), LogPow |
| CMN | done for whole utterance |
pattern: MFCC + ΔMFCC + ΔLogPow (25 variables)
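The figures in the analysis table can be turned into a few concrete numbers. This is a minimal sketch, not the toolkit's actual front end. It assumes the 25-variable pattern consists of 12 static MFCCs, their 12 delta coefficients, and one delta log-power term, and that CMN means subtracting the per-coefficient mean over the whole utterance. All function names are hypothetical.

```python
SAMPLE_RATE_HZ = 16000   # A/D: 16 kHz
FRAME_SHIFT_MS = 10      # frame shift: 10 ms
MFCC_ORDER = 12          # MFCC (12th order)


def frame_shift_samples(sample_rate_hz=SAMPLE_RATE_HZ, shift_ms=FRAME_SHIFT_MS):
    """Number of samples between the starts of consecutive frames."""
    return sample_rate_hz * shift_ms // 1000


def feature_dim(mfcc_order=MFCC_ORDER):
    """Assumed layout: static MFCC + delta MFCC + one delta log-power."""
    return mfcc_order + mfcc_order + 1


def cmn(frames):
    """Cepstral mean normalization over a whole utterance.

    `frames` is a list of equal-length coefficient lists; the mean of each
    coefficient across all frames is subtracted from every frame.
    """
    if not frames:
        return []
    n = len(frames)
    dim = len(frames[0])
    means = [sum(frame[d] for frame in frames) / n for d in range(dim)]
    return [[frame[d] - means[d] for d in range(dim)] for frame in frames]


if __name__ == "__main__":
    print(frame_shift_samples(), feature_dim())
```

Because the mean is taken over the whole utterance, this form of CMN needs the complete utterance before any frame can be normalized.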
HMM
left-to-right 3 states (excluding initial & final)
decision tree-based clustering:
(logical triphone 21000)
(physical triphone 8000)
PTM (Phonetic Tied-Mixture) model
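The left-to-right topology listed under HMM can be sketched as a transition matrix. This is only an illustration: it assumes no skip transitions, a non-emitting initial and final state around the three emitting states, and a placeholder self-loop probability of 0.5. The trained models would have their own values, and the name `left_to_right_transitions` is hypothetical.

```python
def left_to_right_transitions(n_emitting=3, self_loop=0.5):
    """Build an (n_emitting + 2) x (n_emitting + 2) transition matrix.

    Row and column 0 are the non-emitting initial state, the last row and
    column are the non-emitting final state. Each emitting state may stay
    where it is or advance to the next state; it can never go back.
    """
    n = n_emitting + 2
    a = [[0.0] * n for _ in range(n)]
    a[0][1] = 1.0  # the initial state always enters the first emitting state
    for i in range(1, n_emitting + 1):
        a[i][i] = self_loop            # stay in the same state
        a[i][i + 1] = 1.0 - self_loop  # advance to the next state
    return a  # the final state's row stays all zeros (no outgoing arcs)


if __name__ == "__main__":
    for row in left_to_right_transitions():
        print(row)
```

The matrix is upper triangular, which is what makes the model left-to-right.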