On calculating the probability of a set of orthologous sequences
Junfeng Liu1,2, Liang Chen3, Hongyu Zhao4, Dirk F Moore1,2, Yong Lin1,2, Weichung Joe Shih1,2
1Biometrics Division, The Cancer, Institute of New Jersey, New Brunswick, NJ, USA; 2Department of Biostatistics, School of Public Health, University of Medicine and Dentistry of New Jersey, Piscataway, NJ, USA; 3Department of Biological Sciences, University of Southern California, Los Angeles, CA, USA; 4Department of Epidemiology and Public Health, Yale University School of Medicine, New Haven, CT, USA
Abstract: Probabilistic DNA sequence models have been intensively applied to genome research. Within the evolutionary biology framework, this article investigates the feasibility for rigorously estimating the probability of a set of orthologous DNA sequences which evolve from a common progenitor. We propose Monte Carlo integration algorithms to sample the unknown ancestral and/or root sequences a posteriori conditional on a reference sequence and apply pairwise Needleman–Wunsch alignment between the sampled and nonreference species sequences to estimate the probability. We test our algorithms on both simulated and real sequences and compare calculated probabilities from Monte Carlo integration to those induced by single multiple alignment.
Keywords: evolution, Jukes–Cantor model, Monte Carlo integration, Needleman–Wunsch alignment, orthologous
This work is published and licensed by Dove Medical Press Limited. The full terms of this license are available at https://www.dovepress.com/terms.php and incorporate the Creative Commons Attribution - Non Commercial (unported, v3.0) License. By accessing the work you hereby accept the Terms. Non-commercial uses of the work are permitted without any further permission from Dove Medical Press Limited, provided the work is properly attributed. For permission for commercial use of this work, please see paragraphs 4.2 and 5 of our Terms.Download Article [PDF]