Abstract: To improve the accuracy of protein-coding region (exon) prediction, an exon prediction algorithm based on near-period-3 features is proposed. The near-period-3 clustering power spectra of exons and introns are extracted as template features. A DNA sequence whose exon and intron ranges are unknown is divided into frames by a sliding window, and each frame is compared with the template features and classified by a weighted Euclidean distance. Experiments that vary the features used, the number of features, the frame length, and the gene sequence weights show that the proposed algorithm achieves higher prediction accuracy than the period-3 algorithm.
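The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a Voss-style binary indicator mapping for the four bases, takes the DFT power at the bins around k = N/3 as the near-period-3 feature vector, and substitutes a plain mean of labelled frames for the clustering step. All function names, `half_width`, `frame_len`, and `step` are hypothetical choices made for illustration.

```python
import cmath
import math

BASES = "ACGT"


def near_period3_features(frame, half_width=1):
    """DFT power at bins around k = N/3, summed over the four
    binary indicator sequences (assumed Voss-style mapping)."""
    n = len(frame)
    center = round(n / 3)
    feats = []
    for k in range(center - half_width, center + half_width + 1):
        power = 0.0
        for base in BASES:
            # DFT coefficient of the indicator sequence of this base at bin k
            coeff = sum(cmath.exp(-2j * math.pi * k * i / n)
                        for i, ch in enumerate(frame) if ch == base)
            power += abs(coeff) ** 2
        feats.append(power / n)
    return feats


def mean_vector(vectors):
    """Template feature: element-wise mean of labelled frames
    (an assumed stand-in for the clustering step)."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def weighted_distance(x, template, weights):
    """Weighted Euclidean distance between a frame and a template."""
    return math.sqrt(sum(w * (a - b) ** 2
                         for w, a, b in zip(weights, x, template)))


def predict_frames(seq, exon_tpl, intron_tpl, weights,
                   frame_len=60, step=3, half_width=1):
    """Slide a frame along seq and label it by the nearer template."""
    labels = []
    for start in range(0, len(seq) - frame_len + 1, step):
        f = near_period3_features(seq[start:start + frame_len], half_width)
        d_exon = weighted_distance(f, exon_tpl, weights)
        d_intron = weighted_distance(f, intron_tpl, weights)
        labels.append((start, "exon" if d_exon < d_intron else "intron"))
    return labels
```

In use, the two templates would be built with `mean_vector` from frames of annotated exons and introns, and `predict_frames` would then be applied to an unannotated sequence. The weights and frame length are the parameters the experiments are said to vary.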