The Transition Model in Kaldi

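Kaldi's Transition Model mainly keeps track of the HMM topologies and the HMM state information for every phone: the phone id, the state id, the id of the corresponding probability density function (pdf), and the separately indexed transition-id. Note that it stores pdf ids, not the pdfs themselves. This post is an outline from an earlier walkthrough; the goal is simply to make the relationships among these ids clear.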

Structure of the topo file

The topo file is read into `HmmTopology`, which has the following members:

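phones_     the list of phones
phone2idx_  which topology (an index into entries_) models each phone, e.g. how many states it has
entries_    the distinct topologies (2 in this example); each one is described by a sequence of states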

How a state is described

Emission probability: the pdf (probability density function)
Transition probabilities: transitions

About pdf classes

With <PdfClass>, forward_pdf_class and self_loop_pdf_class are the same
Otherwise they are given separately with <ForwardPdfClass> and <SelfLoopPdfClass>
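
To make the three members and the pdf-class tags concrete, here is a minimal Python sketch that parses a toy topo text into structures shaped like phones_, phone2idx_ and entries_. It is not Kaldi's parser: the function names `parse_state` and `parse_topo`, the toy topology, and the use of a dict for phone2idx are all assumptions made for illustration.

```python
import re

# Toy topology (an assumption for illustration): phones 1 and 2 share a
# 3-state left-to-right HMM; phone 3 uses a 1-state HMM whose forward and
# self-loop pdf classes differ.
TOPO = """
<Topology>
<TopologyEntry>
<ForPhones> 1 2 </ForPhones>
<State> 0 <PdfClass> 0 <Transition> 0 0.75 <Transition> 1 0.25 </State>
<State> 1 <PdfClass> 1 <Transition> 1 0.75 <Transition> 2 0.25 </State>
<State> 2 <PdfClass> 2 <Transition> 2 0.75 <Transition> 3 0.25 </State>
<State> 3 </State>
</TopologyEntry>
<TopologyEntry>
<ForPhones> 3 </ForPhones>
<State> 0 <ForwardPdfClass> 0 <SelfLoopPdfClass> 1 <Transition> 0 0.5 <Transition> 1 0.5 </State>
<State> 1 </State>
</TopologyEntry>
</Topology>
"""


def parse_state(body):
    """Parse the text between <State> and </State> into a dict."""
    tokens = body.split()
    # -1 marks a non-emitting state (no pdf class).
    state = {"forward_pdf_class": -1, "self_loop_pdf_class": -1, "transitions": []}
    i = 1  # tokens[0] is the state index, already implied by list position
    while i < len(tokens):
        tag = tokens[i]
        if tag == "<PdfClass>":  # one class used for both roles
            state["forward_pdf_class"] = state["self_loop_pdf_class"] = int(tokens[i + 1])
            i += 2
        elif tag == "<ForwardPdfClass>":
            state["forward_pdf_class"] = int(tokens[i + 1])
            i += 2
        elif tag == "<SelfLoopPdfClass>":
            state["self_loop_pdf_class"] = int(tokens[i + 1])
            i += 2
        elif tag == "<Transition>":  # (destination state, probability)
            state["transitions"].append((int(tokens[i + 1]), float(tokens[i + 2])))
            i += 3
        else:
            raise ValueError(f"unexpected token {tag!r}")
    return state


def parse_topo(text):
    """Return (phones, phone2idx, entries) for a topo text."""
    phones, phone2idx, entries = [], {}, []
    for entry in re.findall(r"<TopologyEntry>(.*?)</TopologyEntry>", text, re.S):
        for_phones = re.search(r"<ForPhones>(.*?)</ForPhones>", entry, re.S).group(1)
        entries.append([parse_state(s) for s in re.findall(r"<State>(.*?)</State>", entry, re.S)])
        for phone in map(int, for_phones.split()):
            phones.append(phone)
            phone2idx[phone] = len(entries) - 1  # which topology models this phone
    return sorted(phones), phone2idx, entries


phones, phone2idx, entries = parse_topo(TOPO)
print(phones, phone2idx, len(entries))
```

In this toy input, phones 1 and 2 map to entry 0 and phone 3 maps to entry 1, so `entries` holds two topologies, as in the description above.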

What is needed to initialize the Transition Model

HmmTopology
ContextDependency: the decision tree. Internally it holds an EventMap: given a query, it returns an answer

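Structure of set.txt

Each line of set.txt lists phones that share one pdf, i.e. one GMM. The decision tree is an index tree built on top of this grouping. When the Transition Model is initialized here, only monophone modeling is involved, so the tree structure looks fairly simple (two figures are attached at the end).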

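About EventMap

SE: SplitEventMap: typically splits on the phone-id at the central position [key = P_]
TE: TableEventMap: typically looks up by pdf-class [key = kPdfClass = -1]
CE: ConstantEventMap: one-to-one; it only returns an answer
What `copy-tree --binary=false tree -` shows: a recursive dump of the tree, i.e. the nested SE/TE/CE nodes visited while walking the structure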
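
The three node types can be sketched in a few lines of Python. This is only a model of the idea, not Kaldi's classes: the `map` method name, the dict-based event, and the toy tree (phones 1 and 2 sharing pdfs, phone 3 having its own) are assumptions. The key for the central phone is taken to be 0, which is what P_ would be for a monophone tree.

```python
K_PDF_CLASS = -1  # mirrors the kPdfClass key mentioned above
P = 0             # key of the central phone in a monophone tree (assumption)


class ConstantEventMap:
    """Leaf node: always returns the same answer (a pdf-id)."""

    def __init__(self, answer):
        self.answer = answer

    def map(self, event):
        return self.answer


class TableEventMap:
    """Looks up the value of one key in a table of child nodes."""

    def __init__(self, key, table):
        self.key = key
        self.table = table

    def map(self, event):
        child = self.table.get(event[self.key])
        return None if child is None else child.map(event)


class SplitEventMap:
    """Binary question: is the value of `key` in `yes_set`?"""

    def __init__(self, key, yes_set, yes, no):
        self.key = key
        self.yes_set = set(yes_set)
        self.yes = yes
        self.no = no

    def map(self, event):
        branch = self.yes if event[self.key] in self.yes_set else self.no
        return branch.map(event)


# Toy tree: phones 1 and 2 share pdfs 0..2, phone 3 owns pdfs 3..5.
tree = SplitEventMap(
    P, {1, 2},
    yes=TableEventMap(K_PDF_CLASS, {c: ConstantEventMap(c) for c in range(3)}),
    no=TableEventMap(K_PDF_CLASS, {c: ConstantEventMap(3 + c) for c in range(3)}),
)

# Query: central phone 2, pdf-class 1.
pdf_id = tree.map({P: 2, K_PDF_CLASS: 1})
print(pdf_id)
```

A query is just a set of (key, value) pairs; each node consumes one key and delegates to a child until a ConstantEventMap answers with a pdf-id.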

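What the Transition Model does

Keeps the topologies used to model the phones, i.e. the HmmTopology
Keeps the (phone_id, state_id, pdf_id) information for every state in the model; the index into this list is called the transition_state
The mapping from a transition_state to its first transition_id
The mapping from a transition_id to its transition_state
The mapping from a transition_id to its pdf_id
The transition probability for each transition_id

The corresponding storage is: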

HmmTopology topo_;
std::vector<Tuple> tuples_;
std::vector<int32> state2id_;
std::vector<int32> id2state_;
std::vector<int32> id2pdf_id_;
Vector<BaseFloat> log_probs_;

How the tuples [phone-id, state-id, pdf-id] are obtained

From the index tree we can get [phone-id, pdf-class, pdf-id]

Index this by pdf-id and store it in pdf_info; the size of this structure is the number of pdfs to be modeled

// pdf_id => [phone_id, pdf_class], [phone_id, pdf_class]...
std::vector<std::vector<std::pair<int32, int32> > > pdf_info;

From the topo structure we can get [phone-id, pdf-class, state-id]

Index this by [phone-id, pdf-class] and store it in to_hmm_state_list

// phone_id pdf_class => state1 state2...
std::map<std::pair<int32, int32>, std::vector<int32> > to_hmm_state_list;

Merge [pdf-id: phone-id, pdf-class] with [phone-id, pdf-class: state-id]

tuples = []
for pdf_id in range(len(pdf_info)):
    for phone_id, pdf_class in pdf_info[pdf_id]:
        # every HMM state of this phone that uses this pdf class
        for state_id in to_hmm_state_list[(phone_id, pdf_class)]:
            tuples.append((phone_id, state_id, pdf_id))
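
As a worked example of the merge, the following sketch runs it on toy data. The inputs are assumptions: two phones (1 and 2) that share three pdfs, each modeled by a 3-state topology in which HMM state i uses pdf class i. The final sort is there so that transition-states end up grouped by phone; as far as I recall Kaldi sorts its tuples as well, but treat that as an assumption.

```python
from collections import defaultdict

# Toy topology (assumption): pdf class of each emitting HMM state, per phone.
state_pdf_classes = {1: [0, 1, 2], 2: [0, 1, 2]}

# (phone_id, pdf_class) => HMM states that use this pdf class
to_hmm_state_list = defaultdict(list)
for phone_id, classes in state_pdf_classes.items():
    for state_id, pdf_class in enumerate(classes):
        to_hmm_state_list[(phone_id, pdf_class)].append(state_id)

# pdf_id => [(phone_id, pdf_class), ...]; phones 1 and 2 share every pdf.
pdf_info = [
    [(1, 0), (2, 0)],
    [(1, 1), (2, 1)],
    [(1, 2), (2, 2)],
]

tuples = []
for pdf_id, phone_classes in enumerate(pdf_info):
    for phone_id, pdf_class in phone_classes:
        for state_id in to_hmm_state_list[(phone_id, pdf_class)]:
            tuples.append((phone_id, state_id, pdf_id))

tuples.sort()  # group by phone, then by HMM state

for transition_state, t in enumerate(tuples, start=1):
    print(transition_state, t)
```

Each printed row is one transition_state together with its (phone_id, state_id, pdf_id) tuple; with this toy input there are six of them, three per phone.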

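How `state2id_`, `id2state_` and `id2pdf_id_` are obtained

transition_state indices start from 1
The number of transition_states equals the size of tuples_; in other words, tuples_ is indexed by transition_state (entry transition_state - 1, since the vector itself is 0-based)

Building state2id_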

first_tid = 1
for trans_state in range(1, len(tuples_) + 1):
    state2id_[trans_state] = first_tid
    # num_of_tids_in_state: number of outgoing transitions of this HMM state in the topology
    first_tid += num_of_tids_in_state

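Inverting state2id_ gives id2state_: every transition_id from state2id_[s] up to (but not including) state2id_[s + 1] maps back to s
For id2pdf_id_, use the tid to find its transition_state, then take that state's forward pdf, or its self-loop pdf when the tid is a self-loop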
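
The three mappings can be built in a single pass, sketched below on toy data. The inputs are assumptions: six tuples for two phones sharing one 3-state left-to-right topology, with <PdfClass> used everywhere so that forward and self-loop pdfs coincide (otherwise id2pdf_id would need to pick between the two). Index 0 of each list is a dummy because transition-states and transition-ids both start at 1.

```python
# Toy topology shared by both phones (assumption):
# transitions[hmm_state] = [(destination state, probability), ...]
transitions = [
    [(0, 0.75), (1, 0.25)],
    [(1, 0.75), (2, 0.25)],
    [(2, 0.75), (3, 0.25)],
]

# (phone_id, hmm_state, pdf_id), indexed by transition_state - 1
tuples = [(1, 0, 0), (1, 1, 1), (1, 2, 2), (2, 0, 0), (2, 1, 1), (2, 2, 2)]

num_states = len(tuples)
state2id = [0] * (num_states + 2)  # slot 0 unused; last slot marks the end
id2state = [0]                     # slot 0 unused
id2pdf_id = [-1]                   # slot 0 unused

next_tid = 1
for trans_state, (phone_id, hmm_state, pdf_id) in enumerate(tuples, start=1):
    state2id[trans_state] = next_tid  # first transition-id of this state
    for _dest, _prob in transitions[hmm_state]:
        id2state.append(trans_state)  # inverse of state2id
        id2pdf_id.append(pdf_id)      # same pdf for self-loop and forward here
        next_tid += 1
state2id[num_states + 1] = next_tid   # one past the last transition-id

print(state2id)
print(id2state)
print(id2pdf_id)
```

Because every HMM state in this toy topology has two outgoing transitions, the first transition-ids advance in steps of two.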

How `log_probs_` is obtained

for each tid:
    find: trans_state = id2state_[tid]
    get : trans_index = tid - state2id_[trans_state]
    find: cur_tuple = tuples_[trans_state - 1]
    find: topo_entry by cur_tuple.phone
    get : log_trans_prob = log(topo_entry[cur_tuple.state].transitions[trans_index].prob)
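
The lookup above can be run end to end on toy data. Everything below is an assumption made for illustration: the single 3-state topology shared by phones 1 and 2, and the hard-coded state2id and id2state lists that correspond to it. The values computed this way are just the logs of the probabilities written in the topology.

```python
import math

# Toy topology shared by both phones (assumption):
# transitions[hmm_state] = [(destination state, probability), ...]
transitions = [
    [(0, 0.75), (1, 0.25)],
    [(1, 0.75), (2, 0.25)],
    [(2, 0.75), (3, 0.25)],
]

# (phone_id, hmm_state, pdf_id), indexed by transition_state - 1
tuples = [(1, 0, 0), (1, 1, 1), (1, 2, 2), (2, 0, 0), (2, 1, 1), (2, 2, 2)]

# Mappings for this toy model; index 0 is unused in both lists.
state2id = [0, 1, 3, 5, 7, 9, 11, 13]
id2state = [0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6]

log_probs = [0.0] * len(id2state)  # slot 0 unused
for tid in range(1, len(id2state)):
    trans_state = id2state[tid]
    # position of this transition among the outgoing transitions of its state
    trans_index = tid - state2id[trans_state]
    phone_id, hmm_state, _pdf_id = tuples[trans_state - 1]
    _dest, prob = transitions[hmm_state][trans_index]
    log_probs[tid] = math.log(prob)

print(log_probs[1:])
```

Note that the offset subtracted from the tid is the first transition-id of the state (from state2id), not the transition_state index itself.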

Data format conversion

Inspecting the model's transition information
show-transitions phones.txt final.mdl

Visualizing the tree with the dot command (see Graphviz for details)

# -Gsize sets the size, -T sets the output type (png, jpg, pdf, etc.)
draw-tree phones.txt tree | dot -Gsize=80,100 -Tpng > tree.png
draw-tree phones.txt tree | dot -Tpdf > tree.pdf

Converting the GMM model to text format

# write to standard output
gmm-copy --binary=false final.mdl -

Converting the tree to text format

# write to standard output, e.g.:
copy-tree --binary=false tree -