Using the EM Algorithm to Estimate Parameters in Phylogenetic Tree Construction
-
Abstract: Phylogenetics studies the evolutionary relationships among species. Its nucleotide substitution models usually assume that sequence evolution involves no missing or censored data, an assumption that is rarely satisfied in practice. To address this, we apply the EM algorithm to parameter estimation for phylogenetic trees built from observed sequences that contain insertions or deletions but whose length is assumed to remain unchanged, laying the groundwork for constructing good phylogenetic trees from sequences with missing data. The focus is on using the EM algorithm to estimate parameters, such as the optimal branch lengths of rooted or unrooted trees, for DNA sequences with missing data under the Jukes-Cantor model and the Kimura model.