WorldWideScience

Sample records for phylogenetic estimation methods

  1. Coalescent methods for estimating phylogenetic trees.

    Science.gov (United States)

    Liu, Liang; Yu, Lili; Kubatko, Laura; Pearl, Dennis K; Edwards, Scott V

    2009-10-01

    We review recent models to estimate phylogenetic trees under the multispecies coalescent. Although the distinction between gene trees and species trees has come to the fore of phylogenetics, only recently have methods been developed that explicitly estimate species trees. Of the several factors that can cause gene tree heterogeneity and discordance with the species tree, deep coalescence due to random genetic drift in branches of the species tree has been modeled most thoroughly. Bayesian approaches to estimating species trees utilize two likelihood functions, one of which has been widely used in traditional phylogenetics and involves the model of nucleotide substitution, and the second of which is less familiar to phylogeneticists and involves the probability distribution of gene trees given a species tree. Other recent parametric and nonparametric methods for estimating species trees involve parsimony criteria, summary statistics, supertree and consensus methods. Species tree approaches are an appropriate goal for systematics, appear to work well in some cases where concatenation can be misleading, and suggest that sampling many independent loci will be paramount. Such methods can also be challenging to implement because of the complexity of the models and computational time. In addition, further elaboration of the simplest of coalescent models will be required to incorporate commonly known issues such as deviation from the molecular clock, gene flow and other genetic forces.
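
    The "probability distribution of gene trees given a species tree" mentioned above has a well-known closed form in the simplest case of three species. The sketch below is an illustration only, not code from the reviewed methods; the function name and the input `t` (internal branch length in coalescent units) are assumptions made for this example.

```python
import math


def three_taxon_gene_tree_probs(t):
    """Gene tree topology probabilities for the species tree ((A,B),C).

    t: length of the internal branch in coalescent units (assumed input).
    Under the multispecies coalescent, the A and B lineages fail to
    coalesce in the internal branch with probability exp(-t); in that
    case each of the three topologies is equally likely.
    """
    if t < 0:
        raise ValueError("branch length must be non-negative")
    p_deep = math.exp(-t)          # probability of deep coalescence
    discordant = p_deep / 3.0      # each discordant topology
    return {
        "((A,B),C)": 1.0 - 2.0 * discordant,  # matches the species tree
        "((A,C),B)": discordant,
        "((B,C),A)": discordant,
    }
```

    With a very short internal branch the three topologies become nearly equally probable, which is one way to see why sampling many independent loci matters.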

  2. Incompletely resolved phylogenetic trees inflate estimates of phylogenetic conservatism.

    Science.gov (United States)

    Davies, T Jonathan; Kraft, Nathan J B; Salamin, Nicolas; Wolkovich, Elizabeth M

    2012-02-01

    The tendency for more closely related species to share similar traits and ecological strategies can be explained by their longer shared evolutionary histories and represents phylogenetic conservatism. How strongly species traits co-vary with phylogeny can significantly impact how we analyze cross-species data and can influence our interpretation of assembly rules in the rapidly expanding field of community phylogenetics. Phylogenetic conservatism is typically quantified by analyzing the distribution of species values on the phylogenetic tree that connects them. Many phylogenetic approaches, however, assume a completely sampled phylogeny: while we have good estimates of deeper phylogenetic relationships for many species-rich groups, such as birds and flowering plants, we often lack information on more recent interspecific relationships (i.e., within a genus). A common solution has been to represent these relationships as polytomies on trees using taxonomy as a guide. Here we show that such trees can dramatically inflate estimates of phylogenetic conservatism quantified using S. P. Blomberg et al.'s K statistic. Using simulations, we show that even randomly generated traits can appear to be phylogenetically conserved on poorly resolved trees. We provide a simple rarefaction-based solution that can reliably retrieve unbiased estimates of K, and we illustrate our method using data on first flowering times from Thoreau's woods (Concord, Massachusetts, USA).

  3. Estimating evolutionary rates using time-structured data: a general comparison of phylogenetic methods.

    Science.gov (United States)

    Duchêne, Sebastián; Geoghegan, Jemma L; Holmes, Edward C; Ho, Simon Y W

    2016-11-15

    In rapidly evolving pathogens, including viruses and some bacteria, genetic change can accumulate over short time-frames. Accordingly, their sampling times can be used to calibrate molecular clocks, allowing estimation of evolutionary rates. Methods for estimating rates from time-structured data vary in how they treat phylogenetic uncertainty and rate variation among lineages. We compiled 81 virus data sets and estimated nucleotide substitution rates using root-to-tip regression, least-squares dating and Bayesian inference. Although estimates from these three methods were often congruent, this largely relied on the choice of clock model. In particular, relaxed-clock models tended to produce higher rate estimates than methods that assume constant rates. Discrepancies in rate estimates were also associated with high among-lineage rate variation, and phylogenetic and temporal clustering. These results provide insights into the factors that affect the reliability of rate estimates from time-structured sequence data, emphasizing the importance of clock-model testing. Contact: sduchene@unimelb.edu.au or garzonsebastian@hotmail.com. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.
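
    Of the three approaches compared above, root-to-tip regression is the simplest to illustrate: root-to-tip distances are regressed on sampling dates, and the slope is read as the substitution rate. The following is a minimal sketch using ordinary least squares; the function name and the input format are assumptions for this example and do not come from the study.

```python
def root_to_tip_rate(dates, distances):
    """Estimate a substitution rate by root-to-tip regression.

    dates: sampling times of the tips (e.g. decimal years).
    distances: root-to-tip distances (substitutions per site), same order.
    Returns (rate, root_date): the regression slope and the x-intercept,
    the latter being a rough estimate of the age of the root.
    """
    n = len(dates)
    if n != len(distances) or n < 2:
        raise ValueError("need two or more paired observations")
    mean_t = sum(dates) / n
    mean_d = sum(distances) / n
    sxx = sum((t - mean_t) ** 2 for t in dates)
    if sxx == 0:
        raise ValueError("all sampling dates are identical")
    sxy = sum((t - mean_t) * (d - mean_d) for t, d in zip(dates, distances))
    rate = sxy / sxx                      # slope of the regression line
    intercept = mean_d - rate * mean_t
    root_date = -intercept / rate if rate != 0 else None
    return rate, root_date
```

    This treats the tips as independent data points and assumes a single constant rate, which is part of why its estimates can differ from those of relaxed-clock Bayesian analyses.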

  4. A new effective method for estimating missing values in the sequence data prior to phylogenetic analysis

    Directory of Open Access Journals (Sweden)

    Abdoulaye Baniré Diallo

    2006-01-01

    In this article we address the problem of phylogenetic inference from nucleic acid data containing missing bases. We introduce a new effective approach, called “Probabilistic estimation of missing values” (PEMV), allowing one to estimate unknown nucleotides prior to computing the evolutionary distances between them. We show that the new method improves the accuracy of phylogenetic inference compared to the existing methods “Ignoring Missing Sites” (IMS) and “Proportional Distribution of Missing and Ambiguous Bases” (PDMAB) included in the PAUP software [26]. The proposed strategy for estimating missing nucleotides is based on probabilistic formulae developed in the framework of the Jukes-Cantor [10] and Kimura 2-parameter [11] models. The relative performances of the new method were assessed through simulations carried out with the SeqGen program [20], for data generation, and the BioNJ method [7], for inferring phylogenies. We also compared the new method to the DNAML program [5] and “Matrix Representation using Parsimony” (MRP) [13], [19] considering an example of 66 eutherian mammals originally analyzed in [17].
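
    For orientation, the baseline that PEMV is compared against can be sketched briefly. The example below computes a Jukes-Cantor distance while simply skipping sites with a missing base, in the spirit of the "Ignoring Missing Sites" strategy; it is not the PEMV method itself, and the function name and the set of symbols treated as missing are assumptions for this illustration.

```python
import math

# Symbols treated as missing data in this sketch (an assumption).
MISSING = set("N?-")


def jc69_distance_ignoring_missing(seq_a, seq_b):
    """Jukes-Cantor distance between two aligned sequences.

    Sites where either sequence has a missing base are skipped, so the
    proportion of differences p is computed over the remaining sites.
    The distance is d = -(3/4) * ln(1 - (4/3) * p).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in MISSING or b in MISSING:
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no sites without missing data")
    p = differing / compared
    if p >= 0.75:
        return math.inf  # the correction is undefined at saturation
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)
```

    Skipping sites discards information whenever many bases are missing, which is the situation the abstract's probabilistic estimation is meant to improve on.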

  5. Ant-Based Phylogenetic Reconstruction (ABPR): A new distance algorithm for phylogenetic estimation based on ant colony optimization

    Directory of Open Access Journals (Sweden)

    Karla Vittori

    2008-12-01

    We propose a new distance algorithm for phylogenetic estimation based on Ant Colony Optimization (ACO), named Ant-Based Phylogenetic Reconstruction (ABPR). ABPR joins two taxa iteratively based on evolutionary distance among sequences, while also accounting for the quality of the phylogenetic tree built according to the total length of the tree. Similar to optimization algorithms for phylogenetic estimation, the algorithm allows exploration of a larger set of nearly optimal solutions. We applied the algorithm to four empirical data sets of mitochondrial DNA ranging from 12 to 186 sequences, and from 898 to 16,608 base pairs, and covering taxonomic levels from populations to orders. We show that ABPR performs better than the commonly used Neighbor-Joining algorithm, except when sequences are too closely related (e.g., population-level sequences). The phylogenetic relationships recovered at and above species level by ABPR agree with conventional views. However, like other algorithms of phylogenetic estimation, the proposed algorithm failed to recover expected relationships when distances are too similar or when rates of evolution are very variable, leading to the problem of long-branch attraction. ABPR, as well as other ACO-based algorithms, is emerging as a fast and accurate alternative method of phylogenetic estimation for large data sets.

  6. The performance of phylogenetic algorithms in estimating haplotype genealogies with migration.

    Science.gov (United States)

    Salzburger, Walter; Ewing, Greg B; Von Haeseler, Arndt

    2011-05-01

    Genealogies estimated from haplotypic genetic data play a prominent role in various biological disciplines in general and in phylogenetics, population genetics and phylogeography in particular. Several software packages have specifically been developed for the purpose of reconstructing genealogies from closely related, and hence, highly similar haplotype sequence data. Here, we use simulated data sets to test the performance of traditional phylogenetic algorithms, neighbour-joining, maximum parsimony and maximum likelihood in estimating genealogies from nonrecombining haplotypic genetic data. We demonstrate that these methods are suitable for constructing genealogies from sets of closely related DNA sequences with or without migration. As genealogies based on phylogenetic reconstructions are fully resolved, but not necessarily bifurcating, and without reticulations, these approaches outperform widespread 'network' constructing methods. In our simulations of coalescent scenarios involving panmictic, symmetric and asymmetric migration, we found that phylogenetic reconstruction methods performed well, while the statistical parsimony approach as implemented in TCS performed poorly. Overall, parsimony as implemented in the PHYLIP package performed slightly better than other methods. We further point out that we are not making the case that widespread 'network' constructing methods are bad, but that traditional phylogenetic tree finding methods are applicable to haplotypic data and exhibit reasonable performance with respect to accuracy and robustness. We also discuss some of the problems of converting a tree to a haplotype genealogy, in particular that it is nonunique. © 2011 Blackwell Publishing Ltd.

  7. The Independent Evolution Method Is Not a Viable Phylogenetic Comparative Method.

    Directory of Open Access Journals (Sweden)

    Randi H Griffin

    Phylogenetic comparative methods (PCMs) use data on species traits and phylogenetic relationships to shed light on evolutionary questions. Recently, Smaers and Vinicius suggested a new PCM, Independent Evolution (IE), which purportedly employs a novel model of evolution based on Felsenstein's Adaptive Peak Model. The authors found that IE improves upon previous PCMs by producing more accurate estimates of ancestral states, as well as separate estimates of evolutionary rates for each branch of a phylogenetic tree. Here, we document substantial theoretical and computational issues with IE. When data are simulated under a simple Brownian motion model of evolution, IE produces severely biased estimates of ancestral states and changes along individual branches. We show that these branch-specific changes are essentially ancestor-descendant or "directional" contrasts, and draw parallels between IE and previous PCMs such as "minimum evolution". Additionally, while comparisons of branch-specific changes between variables have been interpreted as reflecting the relative strength of selection on those traits, we demonstrate through simulations that regressing IE estimated branch-specific changes against one another gives a biased estimate of the scaling relationship between these variables, and provides no advantages or insights beyond established PCMs such as phylogenetically independent contrasts. In light of our findings, we discuss the results of previous papers that employed IE. We conclude that Independent Evolution is not a viable PCM, and should not be used in comparative analyses.

  8. Estimating the Effective Sample Size of Tree Topologies from Bayesian Phylogenetic Analyses

    Science.gov (United States)

    Lanfear, Robert; Hua, Xia; Warren, Dan L.

    2016-01-01

    Bayesian phylogenetic analyses estimate posterior distributions of phylogenetic tree topologies and other parameters using Markov chain Monte Carlo (MCMC) methods. Before making inferences from these distributions, it is important to assess their adequacy. To this end, the effective sample size (ESS) estimates how many truly independent samples of a given parameter the output of the MCMC represents. The ESS of a parameter is frequently much lower than the number of samples taken from the MCMC because sequential samples from the chain can be non-independent due to autocorrelation. Typically, phylogeneticists use a rule of thumb that the ESS of all parameters should be greater than 200. However, we have no method to calculate an ESS of tree topology samples, despite the fact that the tree topology is often the parameter of primary interest and is almost always central to the estimation of other parameters. That is, we lack a method to determine whether we have adequately sampled one of the most important parameters in our analyses. In this study, we address this problem by developing methods to estimate the ESS for tree topologies. We combine these methods with two new diagnostic plots for assessing posterior samples of tree topologies, and compare their performance on simulated and empirical data sets. Combined, the methods we present provide new ways to assess the mixing and convergence of phylogenetic tree topologies in Bayesian MCMC analyses. PMID:27435794
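
    For a numerical parameter, the ESS described above is commonly estimated from the sample autocorrelations as N / (1 + 2 Σ ρ_k). The sketch below shows that scalar-parameter calculation only; it is not the tree-topology method developed in the study, and both the function name and the truncation rule (stop at the first non-positive autocorrelation) are simplifying assumptions that differ from what specific MCMC tools do.

```python
def effective_sample_size(samples):
    """Rough ESS of a scalar MCMC trace.

    Uses ESS = N / (1 + 2 * sum(rho_k)), where rho_k is the lag-k sample
    autocorrelation, summed until the first non-positive value.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need two or more samples")
    mean = sum(samples) / n
    centered = [x - mean for x in samples]
    variance = sum(c * c for c in centered) / n
    if variance == 0:
        raise ValueError("trace is constant; ESS is undefined")
    rho_sum = 0.0
    for lag in range(1, n):
        autocov = sum(centered[i] * centered[i + lag] for i in range(n - lag)) / n
        rho = autocov / variance
        if rho <= 0:
            break  # simple truncation rule (an assumption of this sketch)
        rho_sum += rho
    return n / (1.0 + 2.0 * rho_sum)
```

    A trace that lingers in one region before moving to another gives a value far below the number of samples, which is the situation the 200-sample rule of thumb is meant to flag.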

  9. Community Phylogenetics: Assessing Tree Reconstruction Methods and the Utility of DNA Barcodes

    Science.gov (United States)

    Boyle, Elizabeth E.; Adamowicz, Sarah J.

    2015-01-01

    Studies examining phylogenetic community structure have become increasingly prevalent, yet little attention has been given to the influence of the input phylogeny on metrics that describe phylogenetic patterns of co-occurrence. Here, we examine the influence of branch length, tree reconstruction method, and amount of sequence data on measures of phylogenetic community structure, as well as the phylogenetic signal (Pagel’s λ) in morphological traits, using Trichoptera larval communities from Churchill, Manitoba, Canada. We find that model-based tree reconstruction methods and the use of a backbone family-level phylogeny improve estimations of phylogenetic community structure. In addition, trees built using the barcode region of cytochrome c oxidase subunit I (COI) alone accurately predict metrics of phylogenetic community structure obtained from a multi-gene phylogeny. Input tree did not alter overall conclusions drawn for phylogenetic signal, as significant phylogenetic structure was detected in two body size traits across input trees. As the discipline of community phylogenetics continues to expand, it is important to investigate the best approaches to accurately estimate patterns. Our results suggest that emerging large datasets of DNA barcode sequences provide a vast resource for studying the structure of biological communities. PMID:26110886

  10. Effects of phylogenetic reconstruction method on the robustness of species delimitation using single-locus data.

    Science.gov (United States)

    Tang, Cuong Q; Humphreys, Aelys M; Fontaneto, Diego; Barraclough, Timothy G; Paradis, Emmanuel

    2014-10-01

    Coalescent-based species delimitation methods combine population genetic and phylogenetic theory to provide an objective means for delineating evolutionarily significant units of diversity. The generalised mixed Yule coalescent (GMYC) and the Poisson tree process (PTP) are methods that use ultrametric (GMYC or PTP) or non-ultrametric (PTP) gene trees as input, intended for use mostly with single-locus data such as DNA barcodes. Here, we assess how robust the GMYC and PTP are to different phylogenetic reconstruction and branch smoothing methods. We reconstruct over 400 ultrametric trees using up to 30 different combinations of phylogenetic and smoothing methods and perform over 2000 separate species delimitation analyses across 16 empirical data sets. We then assess how variable diversity estimates are, in terms of richness and identity, with respect to species delimitation, phylogenetic and smoothing methods. The PTP method generally generates diversity estimates that are more robust to different phylogenetic methods. The GMYC is more sensitive, but provides consistent estimates for BEAST trees. The lower consistency of GMYC estimates is likely a result of differences among gene trees introduced by the smoothing step. Unresolved nodes (real anomalies or methodological artefacts) affect both GMYC and PTP estimates, but have a greater effect on GMYC estimates. Branch smoothing is a difficult step and perhaps an underappreciated source of bias that may be widespread among studies of diversity and diversification. Nevertheless, careful choice of phylogenetic method does produce equivalent PTP and GMYC diversity estimates. We recommend simultaneous use of the PTP model with any model-based gene tree (e.g. RAxML) and GMYC approaches with BEAST trees for obtaining species hypotheses.

  11. Bayesian phylogenetic estimation of fossil ages.

    Science.gov (United States)

    Drummond, Alexei J; Stadler, Tanja

    2016-07-19

    Recent advances have allowed for both morphological fossil evidence and molecular sequences to be integrated into a single combined inference of divergence dates under the rule of Bayesian probability. In particular, the fossilized birth-death tree prior and the Lewis-Mk model of discrete morphological evolution allow for the estimation of both divergence times and phylogenetic relationships between fossil and extant taxa. We exploit this statistical framework to investigate the internal consistency of these models by producing phylogenetic estimates of the age of each fossil in turn, within two rich and well-characterized datasets of fossil and extant species (penguins and canids). We find that the estimation accuracy of fossil ages is generally high with credible intervals seldom excluding the true age and median relative error in the two datasets of 5.7% and 13.2%, respectively. The median relative standard error (RSD) was 9.2% and 7.2%, respectively, suggesting good precision, although with some outliers. In fact, in the two datasets we analyse, the phylogenetic estimate of fossil age is on average less than 2 Myr from the mid-point age of the geological strata from which it was excavated. The high level of internal consistency found in our analyses suggests that the Bayesian statistical model employed is an adequate fit for both the geological and morphological data, and provides evidence from real data that the framework used can accurately model the evolution of discrete morphological traits coded from fossil and extant taxa. 
    We anticipate that this approach will have diverse applications beyond divergence time dating, including dating fossils that are temporally unconstrained, testing of the 'morphological clock', and for uncovering potential model misspecification and/or data errors when controversial phylogenetic hypotheses are obtained based on combined divergence dating analyses. This article is part of the themed issue 'Dating species divergences using

  12. Point estimates in phylogenetic reconstructions

    OpenAIRE

    Benner, Philipp; Bacak, Miroslav; Bourguignon, Pierre-Yves

    2013-01-01

    Motivation: The construction of statistics for summarizing posterior samples returned by a Bayesian phylogenetic study has so far been hindered by the poor geometric insights available into the space of phylogenetic trees, and ad hoc methods such as the derivation of a consensus tree make up for the ill-definition of the usual concepts of posterior mean, while bootstrap methods mitigate the absence of a sound concept of variance. Yielding satisfactory results with sufficiently concentrated pos...

  13. Phylogenetic comparative methods on phylogenetic networks with reticulations.

    Science.gov (United States)

    Bastide, Paul; Solís-Lemus, Claudia; Kriebel, Ricardo; Sparks, K William; Ané, Cécile

    2018-04-25

    The goal of Phylogenetic Comparative Methods (PCMs) is to study the distribution of quantitative traits among related species. The observed traits are often seen as the result of a Brownian Motion (BM) along the branches of a phylogenetic tree. Reticulation events such as hybridization, gene flow or horizontal gene transfer, can substantially affect a species' traits, but are not modeled by a tree. Phylogenetic networks have been designed to represent reticulate evolution. As they become available for downstream analyses, new models of trait evolution are needed, applicable to networks. One natural extension of the BM is to use a weighted average model for the trait of a hybrid, at a reticulation point. We develop here an efficient recursive algorithm to compute the phylogenetic variance matrix of a trait on a network, in only one preorder traversal of the network. We then extend the standard PCM tools to this new framework, including phylogenetic regression with covariates (or phylogenetic ANOVA), ancestral trait reconstruction, and Pagel's λ test of phylogenetic signal. The trait of a hybrid is sometimes outside of the range of its two parents, for instance because of hybrid vigor or hybrid depression. These two phenomena are rather commonly observed in present-day hybrids. Transgressive evolution can be modeled as a shift in the trait value following a reticulation point. We develop a general framework to handle such shifts, and take advantage of the phylogenetic regression view of the problem to design statistical tests for ancestral transgressive evolution in the evolutionary history of a group of species. We study the power of these tests in several scenarios, and show that recent events have indeed the strongest impact on the trait distribution of present-day taxa. We apply those methods to a dataset of Xiphophorus fishes, to confirm and complete previous analysis in this group. 
    All the methods developed here are available in the Julia package PhyloNetworks.
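
    The single-pass computation of the variance matrix can be sketched from the model as stated in the abstract: Brownian motion along each edge and a weighted average of the parental values at a hybrid node. The code below is an illustrative reconstruction under those assumptions with a unit BM rate; it is not the PhyloNetworks implementation, and the input format (a preorder list of nodes with parent name, branch length and inheritance weight) is hypothetical.

```python
def network_variance_matrix(nodes):
    """Variance matrix of a BM trait on a network (unit rate), in one pass.

    nodes: list of (name, parents) with every parent listed before its
    children. parents is a list of (parent_name, branch_length, weight);
    it is empty for the root, has one entry of weight 1 for a tree node,
    and has several entries whose weights sum to 1 for a hybrid node.
    Returns (index, V), where index maps names to rows of V.
    """
    index = {}
    size = len(nodes)
    V = [[0.0] * size for _ in range(size)]
    for i, (name, parents) in enumerate(nodes):
        index[name] = i
        if not parents:
            continue  # root value is fixed, so its variance is 0
        if abs(sum(w for _, _, w in parents) - 1.0) > 1e-9:
            raise ValueError("inheritance weights must sum to 1")
        # Covariance with any earlier node: weighted average over parents.
        for j in range(i):
            cov = sum(w * V[index[p]][j] for p, _, w in parents)
            V[i][j] = V[j][i] = cov
        # Variance: weighted parental variances plus edge lengths,
        # plus cross terms between pairs of parents.
        var = sum(w * w * (V[index[p]][index[p]] + length)
                  for p, length, w in parents)
        for a in range(len(parents)):
            for b in range(a + 1, len(parents)):
                pa, _, wa = parents[a]
                pb, _, wb = parents[b]
                var += 2.0 * wa * wb * V[index[pa]][index[pb]]
        V[i][i] = var
    return index, V
```

    On a network with no hybrid nodes this reduces to the familiar tree case, where the covariance of two tips is the length of their shared path from the root.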

  14. Estimating phylogenetic trees from genome-scale data.

    Science.gov (United States)

    Liu, Liang; Xi, Zhenxiang; Wu, Shaoyuan; Davis, Charles C; Edwards, Scott V

    2015-12-01

    The heterogeneity of signals in the genomes of diverse organisms poses challenges for traditional phylogenetic analysis. Phylogenetic methods known as "species tree" methods have been proposed to directly address one important source of gene tree heterogeneity, namely the incomplete lineage sorting that occurs when evolving lineages radiate rapidly, resulting in a diversity of gene trees from a single underlying species tree. Here we review theory and empirical examples that help clarify conflicts between species tree and concatenation methods, and misconceptions in the literature about the performance of species tree methods. Considering concatenation as a special case of the multispecies coalescent model helps explain differences in the behavior of the two methods on phylogenomic data sets. Recent work suggests that species tree methods are more robust than concatenation approaches to some of the classic challenges of phylogenetic analysis, including rapidly evolving sites in DNA sequences and long-branch attraction. We show that approaches, such as binning, designed to augment the signal in species tree analyses can distort the distribution of gene trees and are inconsistent. Computationally efficient species tree methods incorporating biological realism are a key to phylogenetic analysis of whole-genome data. © 2015 New York Academy of Sciences.

  15. BIMLR: a method for constructing rooted phylogenetic networks from rooted phylogenetic trees.

    Science.gov (United States)

    Wang, Juan; Guo, Maozu; Xing, Linlin; Che, Kai; Liu, Xiaoyan; Wang, Chunyu

    2013-09-15

    Rooted phylogenetic trees constructed from different datasets (e.g. from different genes) are often conflicting with one another, i.e. they cannot be integrated into a single phylogenetic tree. Phylogenetic networks have become an important tool in molecular evolution, and rooted phylogenetic networks are able to represent conflicting rooted phylogenetic trees. Hence, the development of appropriate methods to compute rooted phylogenetic networks from rooted phylogenetic trees has attracted considerable research interest of late. The CASS algorithm proposed by van Iersel et al. is able to construct much simpler networks than other available methods, but it is extremely slow, and the networks it constructs are dependent on the order of the input data. Here, we introduce an improved CASS algorithm, BIMLR. We show that BIMLR is faster than CASS and less dependent on the input data order. Moreover, BIMLR is able to construct much simpler networks than almost all other methods. BIMLR is available at http://nclab.hit.edu.cn/wangjuan/BIMLR/. © 2013 Elsevier B.V. All rights reserved.

  16. Phylogenetic reconstruction methods: an overview.

    Science.gov (United States)

    De Bruyn, Alexandre; Martin, Darren P; Lefeuvre, Pierre

    2014-01-01

    Initially designed to infer evolutionary relationships based on morphological and physiological characters, phylogenetic reconstruction methods have greatly benefited from recent developments in molecular biology and sequencing technologies with a number of powerful methods having been developed specifically to infer phylogenies from macromolecular data. This chapter, while presenting an overview of basic concepts and methods used in phylogenetic reconstruction, is primarily intended as a simplified step-by-step guide to the construction of phylogenetic trees from nucleotide sequences using fairly up-to-date maximum likelihood methods implemented in freely available computer programs. While the analysis of chloroplast sequences from various Vanilla species is used as an illustrative example, the techniques covered here are relevant to the comparative analysis of homologous sequence datasets sampled from any group of organisms.

  17. Phylogenetic uncertainty can bias the number of evolutionary transitions estimated from ancestral state reconstruction methods.

    Science.gov (United States)

    Duchêne, Sebastian; Lanfear, Robert

    2015-09-01

    Ancestral state reconstruction (ASR) is a popular method for exploring the evolutionary history of traits that leave little or no trace in the fossil record. For example, it has been used to test hypotheses about the number of evolutionary origins of key life-history traits such as oviparity, or key morphological structures such as wings. Many studies that use ASR have suggested that the number of evolutionary origins of such traits is higher than was previously thought. The scope of such inferences is increasing rapidly, facilitated by the construction of very large phylogenies and life-history databases. In this paper, we use simulations to show that the number of evolutionary origins of a trait tends to be overestimated when the phylogeny is not perfect. In some cases, the estimated number of transitions can be several fold higher than the true value. Furthermore, we show that the bias is not always corrected by standard approaches to account for phylogenetic uncertainty, such as repeating the analysis on a large collection of possible trees. These findings have important implications for studies that seek to estimate the number of origins of a trait, particularly those that use large phylogenies that are associated with considerable uncertainty. We discuss the implications of this bias, and methods to ameliorate it. © 2015 Wiley Periodicals, Inc.

  18. Diversity Dynamics in Nymphalidae Butterflies: Effect of Phylogenetic Uncertainty on Diversification Rate Shift Estimates

    Science.gov (United States)

    Peña, Carlos; Espeland, Marianne

    2015-01-01

    The species rich butterfly family Nymphalidae has been used to study evolutionary interactions between plants and insects. Theories of insect-hostplant dynamics predict accelerated diversification due to key innovations. In evolutionary biology, analysis of maximum credibility trees in the software MEDUSA (modelling evolutionary diversity using stepwise AIC) is a popular method for estimation of shifts in diversification rates. We investigated whether phylogenetic uncertainty can produce different results by extending the method across a random sample of trees from the posterior distribution of a Bayesian run. Using the MultiMEDUSA approach, we found that phylogenetic uncertainty greatly affects diversification rate estimates. Different trees produced diversification rates ranging from high values to almost zero for the same clade, and both significant rate increase and decrease in some clades. Only four out of 18 significant shifts found on the maximum clade credibility tree were consistent across most of the sampled trees. Among these, we found accelerated diversification for Ithomiini butterflies. We used the binary speciation and extinction model (BiSSE) and found that a hostplant shift to Solanaceae is correlated with increased net diversification rates in Ithomiini, congruent with the diffuse cospeciation hypothesis. Our results show that taking phylogenetic uncertainty into account when estimating net diversification rate shifts is of great importance, as very different results can be obtained when using the maximum clade credibility tree and other trees from the posterior distribution. PMID:25830910

  19. Diversity dynamics in Nymphalidae butterflies: effect of phylogenetic uncertainty on diversification rate shift estimates.

    Directory of Open Access Journals (Sweden)

    Carlos Peña

    The species rich butterfly family Nymphalidae has been used to study evolutionary interactions between plants and insects. Theories of insect-hostplant dynamics predict accelerated diversification due to key innovations. In evolutionary biology, analysis of maximum credibility trees in the software MEDUSA (modelling evolutionary diversity using stepwise AIC) is a popular method for estimation of shifts in diversification rates. We investigated whether phylogenetic uncertainty can produce different results by extending the method across a random sample of trees from the posterior distribution of a Bayesian run. Using the MultiMEDUSA approach, we found that phylogenetic uncertainty greatly affects diversification rate estimates. Different trees produced diversification rates ranging from high values to almost zero for the same clade, and both significant rate increase and decrease in some clades. Only four out of 18 significant shifts found on the maximum clade credibility tree were consistent across most of the sampled trees. Among these, we found accelerated diversification for Ithomiini butterflies. We used the binary speciation and extinction model (BiSSE) and found that a hostplant shift to Solanaceae is correlated with increased net diversification rates in Ithomiini, congruent with the diffuse cospeciation hypothesis. Our results show that taking phylogenetic uncertainty into account when estimating net diversification rate shifts is of great importance, as very different results can be obtained when using the maximum clade credibility tree and other trees from the posterior distribution.

  20. An ant colony optimization algorithm for phylogenetic estimation under the minimum evolution principle

    Directory of Open Access Journals (Sweden)

    Milinkovitch Michel C

    2007-11-01

    Background: Distance matrix methods constitute a major family of phylogenetic estimation methods, and the minimum evolution (ME) principle (aiming at recovering the phylogeny with shortest length) is one of the most commonly used optimality criteria for estimating phylogenetic trees. The major difficulty for its application is that the number of possible phylogenies grows exponentially with the number of taxa analyzed and the minimum evolution principle is known to belong to the NP-hard class of problems. Results: In this paper, we introduce an Ant Colony Optimization (ACO) algorithm to estimate phylogenies under the minimum evolution principle. ACO is an optimization technique inspired by the foraging behavior of real ant colonies. This behavior is exploited in artificial ant colonies for the search of approximate solutions to discrete optimization problems. Conclusion: We show that the ACO algorithm is potentially competitive in comparison with state-of-the-art algorithms for the minimum evolution principle. This is the first application of an ACO algorithm to the phylogenetic estimation problem.
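
    The growth in the number of candidate phylogenies can be made concrete with the standard count of unrooted, fully resolved trees on n labelled taxa, (2n − 5)!! = 3 × 5 × ... × (2n − 5). The short sketch below evaluates it; the function name is chosen for this illustration only.

```python
def count_unrooted_binary_trees(n_taxa):
    """Number of distinct unrooted binary trees on n labelled taxa.

    Equals the double factorial (2n - 5)!! for n >= 3.
    """
    if n_taxa < 3:
        raise ValueError("need three or more taxa")
    count = 1
    for factor in range(3, 2 * n_taxa - 4, 2):  # odd factors 3 .. 2n-5
        count *= factor
    return count
```

    Ten taxa already give 2,027,025 possible trees, so exhaustive search quickly becomes impractical and heuristics such as ACO are used instead.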

  1. phangorn: phylogenetic analysis in R.

    Science.gov (United States)

    Schliep, Klaus Peter

    2011-02-15

    phangorn is a package for phylogenetic reconstruction and analysis in the R language. Previously it was only possible to estimate phylogenetic trees with distance methods in R. phangorn now offers the possibility of reconstructing phylogenies with distance-based methods, maximum parsimony or maximum likelihood (ML) and performing Hadamard conjugation. Extending the general ML framework, this package provides the possibility of estimating mixture and partition models. Furthermore, phangorn offers several functions for comparing trees, phylogenetic models or splits, simulating character data and performing congruence analyses. phangorn can be obtained through the CRAN homepage http://cran.r-project.org/web/packages/phangorn/index.html. phangorn is licensed under GPL 2.

  2. New weighting methods for phylogenetic tree reconstruction using multiple loci.

    Science.gov (United States)

    Misawa, Kazuharu; Tajima, Fumio

    2012-08-01

    Efficient determination of evolutionary distances is important for the correct reconstruction of phylogenetic trees. The performance of the pooled distance required for reconstructing a phylogenetic tree can be improved by applying large weights to appropriate distances for reconstructing phylogenetic trees and small weights to inappropriate distances. We developed two weighting methods, the modified Tajima-Takezaki method and the modified least-squares method, for reconstructing phylogenetic trees from multiple loci. By computer simulations, we found that both of the new methods were more efficient in reconstructing correct topologies than the no-weight method. Hence, we reconstructed hominoid phylogenetic trees from mitochondrial DNA using our new methods, and found that the levels of bootstrap support were significantly increased by the modified Tajima-Takezaki and by the modified least-squares method.

  3. The Efficacy of Consensus Tree Methods for Summarizing Phylogenetic Relationships from a Posterior Sample of Trees Estimated from Morphological Data.

    Science.gov (United States)

    O'Reilly, Joseph E; Donoghue, Philip C J

    2018-03-01

    Consensus trees are required to summarize trees obtained through MCMC sampling of a posterior distribution, providing an overview of the distribution of estimated parameters such as topology, branch lengths, and divergence times. Numerous consensus tree construction methods are available, each presenting a different interpretation of the tree sample. The rise of morphological clock and sampled-ancestor methods of divergence time estimation, in which times and topology are coestimated, has increased the popularity of the maximum clade credibility (MCC) consensus tree method. The MCC method assumes that the sampled, fully resolved topology with the highest clade credibility is an adequate summary of the most probable clades, with parameter estimates from compatible sampled trees used to obtain the marginal distributions of parameters such as clade ages and branch lengths. Using both simulated and empirical data, we demonstrate that MCC trees, and trees constructed using the similar maximum a posteriori (MAP) method, often include poorly supported and incorrect clades when summarizing diffuse posterior samples of trees. We demonstrate that the paucity of information in morphological data sets contributes to the inability of MCC and MAP trees to accurately summarise the posterior distribution. Conversely, majority-rule consensus (MRC) trees represent a lower proportion of incorrect nodes when summarizing the same posterior samples of trees. Thus, we advocate the use of MRC trees, in place of MCC or MAP trees, in attempts to summarize the results of Bayesian phylogenetic analyses of morphological data.
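
    The core of a majority-rule consensus can be sketched in a few lines: count how often each clade appears in the tree sample and keep only the clades found in more than half of the trees. The sketch below assumes each rooted tree is already reduced to a set of clades (each clade a frozenset of taxon names); the four-tree sample and the function name are hypothetical and are not taken from the study.

```python
from collections import Counter


def majority_rule_clades(trees, threshold=0.5):
    """Return {clade: frequency} for clades present in more than `threshold` of trees.

    Each tree is an iterable of clades; each clade is a frozenset of taxon names.
    """
    n_trees = len(trees)
    # set(tree) guards against a clade being listed twice in one tree.
    counts = Counter(clade for tree in trees for clade in set(tree))
    return {
        clade: count / n_trees
        for clade, count in counts.items()
        if count / n_trees > threshold
    }


def clade(taxa):
    """Helper: build a clade from a string of single-letter taxon names."""
    return frozenset(taxa)


# Hypothetical posterior sample of four rooted trees on taxa A-E.
sample = [
    {clade("AB"), clade("ABC"), clade("DE")},   # (((A,B),C),(D,E))
    {clade("AB"), clade("ABC"), clade("DE")},   # (((A,B),C),(D,E))
    {clade("AB"), clade("CD"), clade("CDE")},   # ((A,B),((C,D),E))
    {clade("BC"), clade("ABC"), clade("DE")},   # (((B,C),A),(D,E))
]

consensus = majority_rule_clades(sample)
```

    Clades seen in only one of the four trees are dropped, so poorly supported groupings do not enter the summary, whereas an MCC or MAP summary would report every clade of a single sampled tree.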

  4. LifePrint: a novel k-tuple distance method for construction of phylogenetic trees

    Directory of Open Access Journals (Sweden)

    Fabián Reyes-Prieto

    2011-01-01

    Fabián Reyes-Prieto (1), Adda J García-Chéquer (1), Hueman Jaimes-Díaz (1), Janet Casique-Almazán (1), Juana M Espinosa-Lara (1), Rosaura Palma-Orozco (2), Alfonso Méndez-Tenorio (1), Rogelio Maldonado-Rodríguez (1), Kenneth L Beattie (3). (1) Laboratory of Biotechnology and Genomic Bioinformatics, Department of Biochemistry, National School of Biological Sciences, (2) Superior School of Computer Sciences, National Polytechnic Institute, Mexico City, Mexico; (3) Amerigenics Inc, Crossville, Tennessee, USA. Purpose: Here we describe LifePrint, a sequence alignment-independent k-tuple distance method to estimate relatedness between complete genomes. Methods: We designed a representative sample of all possible DNA tuples of length 9 (9-tuples). The final sample comprises 1878 tuples (called the LifePrint set of 9-tuples; LPS9) that are distinct from each other by at least two internal and noncontiguous nucleotide differences. For validation of our k-tuple distance method, we analyzed several real and simulated viroid genomes. Using different distance metrics, we scrutinized diverse viroid genomes to estimate the k-tuple distances between these genomic sequences. Then we used the estimated genomic k-tuple distances to construct phylogenetic trees using the neighbor-joining algorithm. A comparison of the accuracy of LPS9 and the previously reported 5-tuple method was made using symmetric differences between the trees estimated from each method and a simulated “true” phylogenetic tree. Results: The identified optimal search scheme for LPS9 allows only up to two nucleotide differences between each 9-tuple and the scrutinized genome. Similarity search results of simulated viroid genomes indicate that, in most cases, LPS9 is able to detect single-base substitutions between genomes efficiently. Analysis of simulated genomic variants with a high proportion of base substitutions indicates that LPS9 is able to discern relationships between genomic variants with up to 40% of nucleotide
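
    As a rough illustration of what an alignment-free k-tuple distance looks like, the sketch below compares two sequences by the Euclidean distance between their k-tuple frequency profiles. This is a generic k-tuple distance, not the LifePrint method: it counts every overlapping k-tuple rather than the selected LPS9 set, and it does not allow mismatches. The function names and the default k are assumptions made for the example.

```python
from collections import Counter
from math import sqrt


def ktuple_profile(seq, k):
    """Relative frequencies of all overlapping k-tuples in a sequence."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {tup: c / total for tup, c in counts.items()}


def ktuple_distance(seq_a, seq_b, k=3):
    """Euclidean distance between the k-tuple frequency profiles of two sequences."""
    prof_a = ktuple_profile(seq_a, k)
    prof_b = ktuple_profile(seq_b, k)
    tuples = set(prof_a) | set(prof_b)
    return sqrt(sum((prof_a.get(t, 0.0) - prof_b.get(t, 0.0)) ** 2 for t in tuples))


if __name__ == "__main__":
    a = "ACGTACGTACGTTTGACA"
    b = "ACGTACGAACGTTTGACA"  # one substitution relative to a
    c = "GGGCCCGGGCCCAAATTT"
    print(ktuple_distance(a, b), ktuple_distance(a, c))
```

    A matrix of such pairwise distances can then be passed to a distance-based tree builder such as neighbor-joining, which is the general workflow the abstract describes.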

  5. Molecular Phylogenetic: Organism Taxonomy Method Based on Evolution History

    Directory of Open Access Journals (Sweden)

    N.L.P Indi Dharmayanti

    2011-03-01

    Phylogenetics is the taxonomic classification of organisms based on their evolutionary history, that is, their phylogeny; it is a part of systematics whose objective is to determine the phylogeny of organisms from their characteristics. Phylogenetic analysis of amino acid and protein sequences has become an important area of sequence analysis. Phylogenetic analysis can be used to follow rapid changes in a species such as a virus. A phylogenetic tree is a two-dimensional graph that shows the relationships among organisms or, in particular, among their gene sequences. The separated sequences are referred to as taxa (singular: taxon), defined as phylogenetically distinct units on the tree. The tree consists of outer branches, or leaves, which represent taxa, and nodes and branches, which represent the relationships among taxa. When the nucleotide sequences of two different organisms are similar, they are inferred to have descended from a common ancestor. Three methods are used in phylogenetics: (1) maximum parsimony, (2) distance, and (3) maximum likelihood. These methods are generally applied to construct the evolutionary tree, or the best tree, for determining sequence variation within a group. Each method is usually used for different analyses and data.

  6. Estimation of rates-across-sites distributions in phylogenetic substitution models.

    Science.gov (United States)

    Susko, Edward; Field, Chris; Blouin, Christian; Roger, Andrew J

    2003-10-01

    Previous work has shown that it is often essential to account for the variation in rates at different sites in phylogenetic models in order to avoid phylogenetic artifacts such as long branch attraction. In most current models, the gamma distribution is used for the rates-across-sites distributions and is implemented as an equal-probability discrete gamma. In this article, we introduce discrete distribution estimates with large numbers of equally spaced rate categories allowing us to investigate the appropriateness of the gamma model. With large numbers of rate categories, these discrete estimates are flexible enough to approximate the shape of almost any distribution. Likelihood ratio statistical tests and a nonparametric bootstrap confidence-bound estimation procedure based on the discrete estimates are presented that can be used to test the fit of a parametric family. We applied the methodology to several different protein data sets, and found that although the gamma model often provides a good parametric model for this type of data, rate estimates from an equal-probability discrete gamma model with a small number of categories will tend to underestimate the largest rates. In cases when the gamma model assumption is in doubt, rate estimates coming from the discrete rate distribution estimate with a large number of rate categories provide a robust alternative to gamma estimates. An alternative implementation of the gamma distribution is proposed that, for equal numbers of rate categories, is computationally more efficient during optimization than the standard gamma implementation and can provide more accurate estimates of site rates.

  7. Genes with minimal phylogenetic information are problematic for coalescent analyses when gene tree estimation is biased.

    Science.gov (United States)

    Xi, Zhenxiang; Liu, Liang; Davis, Charles C

    2015-11-01

    The development and application of coalescent methods are undergoing rapid changes. One little explored area that bears on the application of gene-tree-based coalescent methods to species tree estimation is gene informativeness. Here, we investigate the accuracy of these coalescent methods when genes have minimal phylogenetic information, including the implementation of the multilocus bootstrap approach. Using simulated DNA sequences, we demonstrate that genes with minimal phylogenetic information can produce unreliable gene trees (i.e., high error in gene tree estimation), which may in turn reduce the accuracy of species tree estimation using gene-tree-based coalescent methods. We demonstrate that this problem can be alleviated by sampling more genes, as is commonly done in large-scale phylogenomic analyses. This applies even when these genes are minimally informative. If gene tree estimation is biased, however, gene-tree-based coalescent analyses will produce inconsistent results, which cannot be remedied by increasing the number of genes. In this case, it is not the gene-tree-based coalescent methods that are flawed, but rather the input data (i.e., estimated gene trees). Along these lines, the commonly used program PhyML has a tendency to infer one particular bifurcating topology even though it is best represented as a polytomy. We additionally corroborate these findings by analyzing the 183-locus mammal data set assembled by McCormack et al. (2012) using ultra-conserved elements (UCEs) and flanking DNA. Lastly, we demonstrate that when employing the multilocus bootstrap approach on this 183-locus data set, there is no strong conflict between species trees estimated from concatenation and gene-tree-based coalescent analyses, as has been previously suggested by Gatesy and Springer (2014). Copyright © 2015 Elsevier Inc. All rights reserved.

  8. Quartet-net: a quartet-based method to reconstruct phylogenetic networks.

    Science.gov (United States)

    Yang, Jialiang; Grünewald, Stefan; Wan, Xiu-Feng

    2013-05-01

    Phylogenetic networks can model reticulate evolutionary events such as hybridization, recombination, and horizontal gene transfer. However, reconstructing such networks is not trivial. Popular character-based methods are computationally inefficient, whereas distance-based methods cannot guarantee reconstruction accuracy because pairwise genetic distances only reflect partial information about a reticulate phylogeny. To balance accuracy and computational efficiency, here we introduce a quartet-based method to construct a phylogenetic network from a multiple sequence alignment. Unlike distances that only reflect the relationship between a pair of taxa, quartets contain information on the relationships among four taxa; these quartets provide adequate capacity to infer a more accurate phylogenetic network. In applications to simulated and biological data sets, we demonstrate that this novel method is robust and effective in reconstructing reticulate evolutionary events and it has the potential to infer more accurate phylogenetic distances than other conventional phylogenetic network construction methods such as Neighbor-Joining, Neighbor-Net, and Split Decomposition. This method can be used in constructing phylogenetic networks from simple evolutionary events involving a few reticulate events to complex evolutionary histories involving a large number of reticulate events. Software called "Quartet-Net" has been implemented and is available at http://sysbio.cvm.msstate.edu/QuartetNet/.

  9. Phylogenetic analysis using parsimony and likelihood methods.

    Science.gov (United States)

    Yang, Z

    1996-02-01

    The assumptions underlying the maximum-parsimony (MP) method of phylogenetic tree reconstruction were intuitively examined by studying the way the method works. Computer simulations were performed to corroborate the intuitive examination. Parsimony appears to involve very stringent assumptions concerning the process of sequence evolution, such as constancy of substitution rates between nucleotides, constancy of rates across nucleotide sites, and equal branch lengths in the tree. For practical data analysis, the requirement of equal branch lengths means similar substitution rates among lineages (the existence of an approximate molecular clock), relatively long interior branches, and also few species in the data. However, a small amount of evolution is neither a necessary nor a sufficient requirement of the method. The difficulties involved in the application of current statistical estimation theory to tree reconstruction were discussed, and it was suggested that the approach proposed by Felsenstein (1981, J. Mol. Evol. 17: 368-376) for topology estimation, as well as its many variations and extensions, differs fundamentally from the maximum likelihood estimation of a conventional statistical parameter. Evidence was presented showing that the Felsenstein approach does not share the asymptotic efficiency of the maximum likelihood estimator of a statistical parameter. Computer simulations were performed to study the probability that MP recovers the true tree under a hierarchy of models of nucleotide substitution; its performance relative to the likelihood method was especially noted. The results appeared to support the intuitive examination of the assumptions underlying MP. When a simple model of nucleotide substitution was assumed to generate data, the probability that MP recovers the true topology could be as high as, or even higher than, that for the likelihood method. When the assumed model became more complex and realistic, e.g., when substitution rates were

  10. Estimating phylogenetic relationships despite discordant gene trees across loci: the species tree of a diverse species group of feather mites (Acari: Proctophyllodidae).

    Science.gov (United States)

    Knowles, Lacey L; Klimov, Pavel B

    2011-11-01

    With the increased availability of multilocus sequence data, the lack of concordance of gene trees estimated for independent loci has focused attention on both the biological processes producing the discord and the methodologies used to estimate phylogenetic relationships. What has emerged is a suite of new analytical tools for phylogenetic inference--species tree approaches. In contrast to traditional phylogenetic methods that are stymied by the idiosyncrasies of gene trees, approaches for estimating species trees explicitly take into account the cause of discord among loci and, in the process, provide a direct estimate of phylogenetic history (i.e. the history of species divergence, not divergence of specific loci). We illustrate the utility of species tree estimates with an analysis of a diverse group of feather mites, the pinnatus species group (genus Proctophyllodes). Discord among four sequenced nuclear loci is consistent with theoretical expectations, given the short time separating speciation events (as evident by short internodes relative to terminal branch lengths in the trees). Nevertheless, many of the relationships are well resolved in a Bayesian estimate of the species tree; the analysis also highlights ambiguous aspects of the phylogeny that require additional loci. The broad utility of species tree approaches is discussed, and specifically, their application to groups with high speciation rates--a history of diversification with particular prevalence in host/parasite systems where species interactions can drive rapid diversification.

  11. Monte Carlo estimation of total variation distance of Markov chains on large spaces, with application to phylogenetics.

    Science.gov (United States)

    Herbei, Radu; Kubatko, Laura

    2013-03-26

    Markov chains are widely used for modeling in many areas of molecular biology and genetics. As the complexity of such models advances, it becomes increasingly important to assess the rate at which a Markov chain converges to its stationary distribution in order to carry out accurate inference. A common measure of convergence to the stationary distribution is the total variation distance, but this measure can be difficult to compute when the state space of the chain is large. We propose a Monte Carlo method to estimate the total variation distance that can be applied in this situation, and we demonstrate how the method can be efficiently implemented by taking advantage of GPU computing techniques. We apply the method to two Markov chains on the space of phylogenetic trees, and discuss the implications of our findings for the development of algorithms for phylogenetic inference.
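
    For a small state space the total variation distance can be computed exactly as half the sum of absolute differences between two probability vectors, which makes the quantity being estimated easy to see. The sketch below tracks that distance between the distribution of a two-state chain and its stationary distribution over successive steps. It is not the authors' Monte Carlo estimator, which targets state spaces too large to enumerate; the transition matrix and function names are made up for illustration.

```python
def total_variation(p, q):
    """Total variation distance between two discrete distributions on the same states."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def step(dist, transition):
    """Advance a row-vector distribution by one step of the chain."""
    n = len(dist)
    return [sum(dist[i] * transition[i][j] for i in range(n)) for j in range(n)]


def tv_trajectory(start, transition, stationary, n_steps):
    """TV distance to the stationary distribution after 0, 1, ..., n_steps steps."""
    dist = list(start)
    distances = [total_variation(dist, stationary)]
    for _ in range(n_steps):
        dist = step(dist, transition)
        distances.append(total_variation(dist, stationary))
    return distances


# Hypothetical two-state chain; its stationary distribution is (5/6, 1/6).
P = [[0.9, 0.1],
     [0.5, 0.5]]
PI = [5 / 6, 1 / 6]

trajectory = tv_trajectory([0.0, 1.0], P, PI, n_steps=10)
```

    In this toy chain the distance shrinks by a constant factor each step; on tree space the distributions cannot be enumerated, which is the situation a sampling-based estimate is meant to address.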

  12. Tree imbalance causes a bias in phylogenetic estimation of evolutionary timescales using heterochronous sequences.

    Science.gov (United States)

    Duchêne, David; Duchêne, Sebastian; Ho, Simon Y W

    2015-07-01

    Phylogenetic estimation of evolutionary timescales has become routine in biology, forming the basis of a wide range of evolutionary and ecological studies. However, there are various sources of bias that can affect these estimates. We investigated whether tree imbalance, a property that is commonly observed in phylogenetic trees, can lead to reduced accuracy or precision of phylogenetic timescale estimates. We analysed simulated data sets with calibrations at internal nodes and at the tips, taking into consideration different calibration schemes and levels of tree imbalance. We also investigated the effect of tree imbalance on two empirical data sets: mitogenomes from primates and serial samples of the African swine fever virus. In analyses calibrated using dated, heterochronous tips, we found that tree imbalance had a detrimental impact on precision and produced a bias in which the overall timescale was underestimated. A pronounced effect was observed in analyses with shallow calibrations. The greatest decreases in accuracy usually occurred in the age estimates for medium and deep nodes of the tree. In contrast, analyses calibrated at internal nodes did not display a reduction in estimation accuracy or precision due to tree imbalance. Our results suggest that molecular-clock analyses can be improved by increasing taxon sampling, with the specific aims of including deeper calibrations, breaking up long branches and reducing tree imbalance. © 2014 John Wiley & Sons Ltd.

  13. SATé-II: very fast and accurate simultaneous estimation of multiple sequence alignments and phylogenetic trees.

    Science.gov (United States)

    Liu, Kevin; Warnow, Tandy J; Holder, Mark T; Nelesen, Serita M; Yu, Jiaye; Stamatakis, Alexandros P; Linder, C Randal

    2012-01-01

    Highly accurate estimation of phylogenetic trees for large data sets is difficult, in part because multiple sequence alignments must be accurate for phylogeny estimation methods to be accurate. Coestimation of alignments and trees has been attempted but currently only SATé estimates reasonably accurate trees and alignments for large data sets in practical time frames (Liu K., Raghavan S., Nelesen S., Linder C.R., Warnow T. 2009b. Rapid and accurate large-scale coestimation of sequence alignments and phylogenetic trees. Science. 324:1561-1564). Here, we present a modification to the original SATé algorithm that improves upon SATé (which we now call SATé-I) in terms of speed and of phylogenetic and alignment accuracy. SATé-II uses a different divide-and-conquer strategy than SATé-I and so produces smaller more closely related subsets than SATé-I; as a result, SATé-II produces more accurate alignments and trees, can analyze larger data sets, and runs more efficiently than SATé-I. Generally, SATé is a metamethod that takes an existing multiple sequence alignment method as an input parameter and boosts the quality of that alignment method. SATé-II-boosted alignment methods are significantly more accurate than their unboosted versions, and trees based upon these improved alignments are more accurate than trees based upon the original alignments. Because SATé-I used maximum likelihood (ML) methods that treat gaps as missing data to estimate trees and because we found a correlation between the quality of tree/alignment pairs and ML scores, we explored the degree to which SATé's performance depends on using ML with gaps treated as missing data to determine the best tree/alignment pair. We present two lines of evidence that using ML with gaps treated as missing data to optimize the alignment and tree produces very poor results. First, we show that the optimization problem where a set of unaligned DNA sequences is given and the output is the tree and alignment of

  14. A scalable method for identifying frequent subtrees in sets of large phylogenetic trees.

    Science.gov (United States)

    Ramu, Avinash; Kahveci, Tamer; Burleigh, J Gordon

    2012-10-03

    We consider the problem of finding the maximum frequent agreement subtrees (MFASTs) in a collection of phylogenetic trees. Existing methods for this problem often do not scale beyond datasets with around 100 taxa. Our goal is to address this problem for datasets with over a thousand taxa and hundreds of trees. We develop a heuristic solution that aims to find MFASTs in sets of many, large phylogenetic trees. Our method works in multiple phases. In the first phase, it identifies small candidate subtrees from the set of input trees which serve as the seeds of larger subtrees. In the second phase, it combines these small seeds to build larger candidate MFASTs. In the final phase, it performs a post-processing step that ensures that we find a frequent agreement subtree that is not contained in a larger frequent agreement subtree. We demonstrate that this heuristic can easily handle data sets with 1000 taxa, greatly extending the estimation of MFASTs beyond current methods. Although this heuristic does not guarantee to find all MFASTs or the largest MFAST, it found the MFAST in all of our synthetic datasets where we could verify the correctness of the result. It also performed well on large empirical data sets. Its performance is robust to the number and size of the input trees. Overall, this method provides a simple and fast way to identify strongly supported subtrees within large phylogenetic hypotheses.

  15. Assessing the Goodness of Fit of Phylogenetic Comparative Methods: A Meta-Analysis and Simulation Study.

    Directory of Open Access Journals (Sweden)

    Dwueng-Chwuan Jhwueng

    Phylogenetic comparative methods (PCMs) have been applied widely in analyzing data from related species but their fit to data is rarely assessed. Can one determine whether any particular comparative method is typically more appropriate than others by examining comparative data sets? I conducted a meta-analysis of 122 phylogenetic data sets found by searching all papers in JEB, Blackwell Synergy and JSTOR published in 2002-2005 for the purpose of assessing the fit of PCMs. The number of species in these data sets ranged from 9 to 117. I used the Akaike information criterion to compare PCMs, and then fit PCMs to bivariate data sets through REML analysis. Correlation estimates between two traits and bootstrapped confidence intervals of correlations from each model were also compared. For phylogenies of less than one hundred taxa, the Independent Contrast method and the independent, non-phylogenetic models provide the best fit. For bivariate analysis, correlations from different PCMs are qualitatively similar so that actual correlations from real data seem to be robust to the PCM chosen for the analysis. Therefore, researchers might apply the PCM they believe best describes the evolutionary mechanisms underlying their data.
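
    Comparing models by the Akaike information criterion comes down to a one-line formula, AIC = 2k - 2 ln L, where k is the number of free parameters and ln L the maximised log-likelihood; the model with the lowest AIC is preferred. The sketch below applies it to two placeholder models. The log-likelihoods and parameter counts are invented for illustration and are not results from the meta-analysis.

```python
from math import exp


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def akaike_weights(aic_values):
    """Relative support for each model, normalised to sum to 1."""
    best = min(aic_values.values())
    relative = {name: exp(-0.5 * (value - best)) for name, value in aic_values.items()}
    total = sum(relative.values())
    return {name: r / total for name, r in relative.items()}


# Hypothetical fits: (maximised log-likelihood, number of free parameters).
fits = {"model_A": (-100.0, 2), "model_B": (-99.5, 3)}

scores = {name: aic(loglik, k) for name, (loglik, k) in fits.items()}
weights = akaike_weights(scores)
best_model = min(scores, key=scores.get)
```

    Here the extra parameter of the second model raises its log-likelihood by only 0.5, which is not enough to offset the penalty of 2, so the simpler model is preferred.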

  16. Calculation of evolutionary correlation between individual genes and full-length genome: a method useful for choosing phylogenetic markers for molecular epidemiology.

    Directory of Open Access Journals (Sweden)

    Shuai Wang

    Individual genes or regions are still commonly used to estimate the phylogenetic relationships among viral isolates. The genomic regions that can faithfully provide assessments consistent with those predicted with full-length genome sequences would be preferable to serve as good candidates of the phylogenetic markers for molecular epidemiological studies of many viruses. Here we employed a statistical method to evaluate the evolutionary relationships between individual viral genes and full-length genomes without tree construction as a way to determine which gene can match the genome well in phylogenetic analyses. This method was performed by calculation of linear correlations between the genetic distance matrices of aligned individual gene sequences and aligned genome sequences. We applied this method to the phylogenetic analyses of porcine circovirus 2 (PCV2), measles virus (MV), hepatitis E virus (HEV) and Japanese encephalitis virus (JEV). Phylogenetic trees were constructed for comparisons and the possible factors affecting the method accuracy were also discussed in the calculations. The results revealed that this method could produce results consistent with those of previous studies about the proper consensus sequences that could be successfully used as phylogenetic markers. And our results also suggested that these evolutionary correlations could provide useful information for identifying genes that could be used effectively to infer the genetic relationships.
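
    The correlation step described above can be sketched as a Pearson correlation between the off-diagonal entries of two pairwise distance matrices computed over the same isolates. The code below is a minimal illustration of that idea under the assumption that a plain Pearson coefficient on the upper triangles is used; the matrices are invented, and the exact statistic and distance model used in the study may differ.

```python
from math import sqrt


def upper_triangle(matrix):
    """Flatten the entries above the diagonal of a square distance matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when all distances are equal")
    return sxy / sqrt(sxx * syy)


def matrix_correlation(gene_dist, genome_dist):
    """Correlation between gene-based and genome-based pairwise distances."""
    return pearson(upper_triangle(gene_dist), upper_triangle(genome_dist))


# Hypothetical distances among four isolates (same isolate order in every matrix).
genome = [[0.0, 0.1, 0.3, 0.4],
          [0.1, 0.0, 0.3, 0.4],
          [0.3, 0.3, 0.0, 0.2],
          [0.4, 0.4, 0.2, 0.0]]
gene_a = [[2 * d for d in row] for row in genome]  # evolves faster, same pattern
gene_b = [[0.0, 0.3, 0.1, 0.1],
          [0.3, 0.0, 0.2, 0.1],
          [0.1, 0.2, 0.0, 0.3],
          [0.1, 0.1, 0.3, 0.0]]                    # conflicting pattern
```

    A gene whose distances are a rescaled copy of the genome distances correlates perfectly regardless of its rate, so under this criterion it would be the better marker candidate of the two.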

  17. Phylogenetic representativeness: a new method for evaluating taxon sampling in evolutionary studies

    Directory of Open Access Journals (Sweden)

    Passamonti Marco

    2010-04-01

    Background: Taxon sampling is a major concern in phylogenetic studies. Incomplete, biased, or improper taxon sampling can lead to misleading results in reconstructing evolutionary relationships. Several theoretical methods are available to optimize taxon choice in phylogenetic analyses. However, most involve some knowledge about the genetic relationships of the group of interest (i.e., the ingroup), or even a well-established phylogeny itself; these data are not always available in general phylogenetic applications. Results: We propose a new method to assess taxon sampling developing Clarke and Warwick statistics. This method aims to measure the "phylogenetic representativeness" of a given sample or set of samples and it is based entirely on the pre-existing available taxonomy of the ingroup, which is commonly known to investigators. Moreover, our method also accounts for instability and discordance in taxonomies. A Python-based script suite, called PhyRe, has been developed to implement all analyses we describe in this paper. Conclusions: We show that this method is sensitive and allows direct discrimination between representative and unrepresentative samples. It is also informative about the addition of taxa to improve taxonomic coverage of the ingroup. Provided that the investigators' expertise is mandatory in this field, phylogenetic representativeness makes up an objective touchstone in planning phylogenetic studies.

  18. Distance-Based Phylogenetic Methods Around a Polytomy.

    Science.gov (United States)

    Davidson, Ruth; Sullivant, Seth

    2014-01-01

    Distance-based phylogenetic algorithms attempt to solve the NP-hard least-squares phylogeny problem by mapping an arbitrary dissimilarity map representing biological data to a tree metric. The set of all dissimilarity maps is a Euclidean space properly containing the space of all tree metrics as a polyhedral fan. Outputs of distance-based tree reconstruction algorithms such as UPGMA and neighbor-joining are points in the maximal cones in the fan. Tree metrics with polytomies lie at the intersections of maximal cones. A phylogenetic algorithm divides the space of all dissimilarity maps into regions based upon which combinatorial tree is reconstructed by the algorithm. Comparison of phylogenetic methods can be done by comparing the geometry of these regions. We use polyhedral geometry to compare the local nature of the subdivisions induced by least-squares phylogeny, UPGMA, and neighbor-joining when the true tree has a single polytomy with exactly four neighbors. Our results suggest that in some circumstances, UPGMA and neighbor-joining poorly match least-squares phylogeny.
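
    Whether a dissimilarity map behaves like a tree metric on four taxa can be checked with the classical four-point condition: of the three pairwise sums d(i,j)+d(k,l), d(i,k)+d(j,l) and d(i,l)+d(j,k), the two largest must be equal. When all three are equal the internal branch has length zero, which is the polytomy case. The sketch below classifies a single quartet this way; it illustrates the geometry only and is not the polyhedral analysis of the paper. The function name, tolerance and example matrices are assumptions.

```python
def classify_quartet(d, i, j, k, l, tol=1e-9):
    """Classify four taxa under the four-point condition.

    Returns ("resolved", split), ("star", None) or ("not a tree metric", None),
    where split is the pairing separated by the internal branch.
    """
    sums = sorted([
        (d[i][j] + d[k][l], ((i, j), (k, l))),
        (d[i][k] + d[j][l], ((i, k), (j, l))),
        (d[i][l] + d[j][k], ((i, l), (j, k))),
    ])
    (smallest, split), (middle, _), (largest, _) = sums
    if largest - middle > tol:
        return ("not a tree metric", None)   # two largest sums differ
    if middle - smallest <= tol:
        return ("star", None)                # all three equal: polytomy
    return ("resolved", split)               # smallest sum gives the split


# Tree ((0,1),(2,3)) with pendant branches of length 1 and internal branch 2.
resolved = [[0, 2, 4, 4],
            [2, 0, 4, 4],
            [4, 4, 0, 2],
            [4, 4, 2, 0]]
# Star tree: four pendant branches of length 1 meeting at a single node.
star = [[0, 2, 2, 2],
        [2, 0, 2, 2],
        [2, 2, 0, 2],
        [2, 2, 2, 0]]
# Perturbing one distance of the resolved tree breaks the condition.
noisy = [row[:] for row in resolved]
noisy[0][3] = noisy[3][0] = 5
```

    Real data almost never satisfy the condition exactly, which is why methods such as UPGMA, neighbor-joining and least squares must each decide which nearby tree to return, and why they can disagree near a polytomy.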

  19. Bayesian models for comparative analysis integrating phylogenetic uncertainty

    Directory of Open Access Journals (Sweden)

    Villemereuil Pierre de

    2012-06-01

    Background: Uncertainty in comparative analyses can come from at least two sources: a) phylogenetic uncertainty in the tree topology or branch lengths, and b) uncertainty due to intraspecific variation in trait values, either due to measurement error or natural individual variation. Most phylogenetic comparative methods do not account for such uncertainties. Not accounting for these sources of uncertainty leads to false perceptions of precision (confidence intervals will be too narrow) and inflated significance in hypothesis testing (e.g. p-values will be too small). Although there is some application-specific software for fitting Bayesian models accounting for phylogenetic error, more general and flexible software is desirable. Methods: We developed models to directly incorporate phylogenetic uncertainty into a range of analyses that biologists commonly perform, using a Bayesian framework and Markov Chain Monte Carlo analyses. Results: We demonstrate applications in linear regression, quantification of phylogenetic signal, and measurement error models. Phylogenetic uncertainty was incorporated by applying a prior distribution for the phylogeny, where this distribution consisted of the posterior tree sets from Bayesian phylogenetic tree estimation programs. The models were analysed using simulated data sets, and applied to a real data set on plant traits, from rainforest plant species in Northern Australia. Analyses were performed using the free and open source software OpenBUGS and JAGS. Conclusions: Incorporating phylogenetic uncertainty through an empirical prior distribution of trees leads to more precise estimation of regression model parameters than using a single consensus tree and enables a more realistic estimation of confidence intervals. In addition, models incorporating measurement errors and/or individual variation, in one or both variables, are easily formulated in the Bayesian framework. We show that BUGS is a useful, flexible

  20. Bayesian models for comparative analysis integrating phylogenetic uncertainty

    Science.gov (United States)

    2012-01-01

    Background Uncertainty in comparative analyses can come from at least two sources: a) phylogenetic uncertainty in the tree topology or branch lengths, and b) uncertainty due to intraspecific variation in trait values, either due to measurement error or natural individual variation. Most phylogenetic comparative methods do not account for such uncertainties. Not accounting for these sources of uncertainty leads to false perceptions of precision (confidence intervals will be too narrow) and inflated significance in hypothesis testing (e.g. p-values will be too small). Although there is some application-specific software for fitting Bayesian models accounting for phylogenetic error, more general and flexible software is desirable. Methods We developed models to directly incorporate phylogenetic uncertainty into a range of analyses that biologists commonly perform, using a Bayesian framework and Markov Chain Monte Carlo analyses. Results We demonstrate applications in linear regression, quantification of phylogenetic signal, and measurement error models. Phylogenetic uncertainty was incorporated by applying a prior distribution for the phylogeny, where this distribution consisted of the posterior tree sets from Bayesian phylogenetic tree estimation programs. The models were analysed using simulated data sets, and applied to a real data set on plant traits, from rainforest plant species in Northern Australia. Analyses were performed using the free and open source software OpenBUGS and JAGS. Conclusions Incorporating phylogenetic uncertainty through an empirical prior distribution of trees leads to more precise estimation of regression model parameters than using a single consensus tree and enables a more realistic estimation of confidence intervals. In addition, models incorporating measurement errors and/or individual variation, in one or both variables, are easily formulated in the Bayesian framework. We show that BUGS is a useful, flexible general purpose tool for

  1. Quartet-based methods to reconstruct phylogenetic networks.

    Science.gov (United States)

    Yang, Jialiang; Grünewald, Stefan; Xu, Yifei; Wan, Xiu-Feng

    2014-02-20

    Phylogenetic networks are employed to visualize evolutionary relationships among a group of nucleotide sequences, genes or species when reticulate events like hybridization, recombination, reassortment and horizontal gene transfer are believed to be involved. In comparison to traditional distance-based methods, quartet-based methods consider more information in the reconstruction process and thus have the potential to be more accurate. We introduce QuartetSuite, which includes a set of new quartet-based methods, namely QuartetS, QuartetA, and QuartetM, to reconstruct phylogenetic networks from nucleotide sequences. We tested their performance and compared them with other popular methods on two simulated nucleotide sequence data sets: one generated from a tree topology and the other from a complicated evolutionary history containing three reticulate events. We further validated these methods on two real data sets: a bacterial data set consisting of seven concatenated genes of 36 bacterial species and an influenza data set related to recently emerging H7N9 low pathogenic avian influenza viruses in China. QuartetS, QuartetA, and QuartetM have the potential to accurately reconstruct evolutionary scenarios from simple branching trees to complicated networks containing many reticulate events. These methods could provide insights into the understanding of complicated biological evolutionary processes such as bacterial taxonomy and reassortment of influenza viruses.

  2. A Penalized Likelihood Framework For High-Dimensional Phylogenetic Comparative Methods And An Application To New-World Monkeys Brain Evolution.

    Science.gov (United States)

    Clavel, Julien; Aristide, Leandro; Morlon, Hélène

    2018-06-19

    Working with high-dimensional phylogenetic comparative datasets is challenging because likelihood-based multivariate methods suffer from low statistical performance as the number of traits p approaches the number of species n and because some computational complications occur when p exceeds n. Alternative phylogenetic comparative methods have recently been proposed to deal with the large p small n scenario but their use and performance are limited. Here we develop a penalized likelihood framework to deal with high-dimensional comparative datasets. We propose various penalizations and methods for selecting the intensity of the penalties. We apply this general framework to the estimation of parameters (the evolutionary trait covariance matrix and parameters of the evolutionary model) and model comparison for the high-dimensional multivariate Brownian (BM), Early-burst (EB), Ornstein-Uhlenbeck (OU) and Pagel's lambda models. We show using simulations that our penalized likelihood approach dramatically improves the estimation of evolutionary trait covariance matrices and model parameters when p approaches n, and allows for their accurate estimation when p equals or exceeds n. In addition, we show that penalized likelihood models can be efficiently compared using the Generalized Information Criterion (GIC). We implement these methods, as well as the related estimation of ancestral states and the computation of phylogenetic PCA, in the R packages RPANDA and mvMORPH. Finally, we illustrate the utility of the proposed framework by evaluating evolutionary model fit, analyzing integration patterns, and reconstructing evolutionary trajectories for a high-dimensional 3-D dataset of brain shape in the New World monkeys. We find clear support for an Early-burst model, suggesting an early diversification of brain morphology during the ecological radiation of the clade. Penalized likelihood offers an efficient way to deal with high-dimensional multivariate comparative data.

  3. FPGA Acceleration of the phylogenetic likelihood function for Bayesian MCMC inference methods

    Directory of Open Access Journals (Sweden)

    Bakos Jason D

    2010-04-01

    Background Maximum likelihood (ML)-based phylogenetic inference has become a popular method for estimating the evolutionary relationships among species based on genomic sequence data. This method is used in applications such as RAxML, GARLI, MrBayes, PAML, and PAUP. The Phylogenetic Likelihood Function (PLF) is an important kernel computation for this method. The PLF consists of a loop with no conditional behavior or dependencies between iterations. As such it contains a high potential for exploiting parallelism using micro-architectural techniques. In this paper, we describe a technique for mapping the PLF and supporting logic onto a Field Programmable Gate Array (FPGA)-based co-processor. By leveraging the FPGA's on-chip DSP modules and the high-bandwidth local memory attached to the FPGA, the resultant co-processor can accelerate ML-based methods and outperform state-of-the-art multi-core processors. Results We use the MrBayes 3 tool as a framework for designing our co-processor. For large datasets, we estimate that our accelerated MrBayes, if run on a current-generation FPGA, achieves a 10× speedup relative to software running on a state-of-the-art server-class microprocessor. The FPGA-based implementation achieves its performance by deeply pipelining the likelihood computations, performing multiple floating-point operations in parallel, and through a natural log approximation that is chosen specifically to leverage a deeply pipelined custom architecture. Conclusions Heterogeneous computing, which combines general-purpose processors with special-purpose co-processors such as FPGAs and GPUs, is a promising approach for high-performance phylogeny inference as shown by the growing body of literature in this field. FPGAs in particular are well-suited for this task because of their low power consumption as compared to many-core processors and Graphics Processor Units (GPUs).
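    The loop structure described in this abstract can be illustrated with a minimal Python sketch of the PLF kernel for one internal node. This is not the authors' FPGA design or the MrBayes code: the Jukes-Cantor substitution model, the list-of-lists data layout, and all function names are assumptions made only for illustration. The point it shows is that each site is processed independently, which is what makes the loop easy to pipeline and parallelise.

```python
import math


def jc69_transition_matrix(branch_length):
    """4x4 Jukes-Cantor transition probabilities (assumed model, for illustration).

    branch_length is in expected substitutions per site.
    """
    decay = math.exp(-4.0 * branch_length / 3.0)
    same = 0.25 + 0.75 * decay
    diff = 0.25 - 0.25 * decay
    return [[same if i == j else diff for j in range(4)] for i in range(4)]


def tip_clv(sequence):
    """Conditional likelihood vectors for an observed sequence over A, C, G, T."""
    index = {"A": 0, "C": 1, "G": 2, "T": 3}
    clv = []
    for base in sequence:
        entry = [0.0, 0.0, 0.0, 0.0]
        entry[index[base]] = 1.0
        clv.append(entry)
    return clv


def plf_node(left_clv, right_clv, p_left, p_right):
    """Combine the CLVs of two children into the CLV of their parent."""
    parent = []
    # No iteration depends on another: every site can be computed in parallel.
    for left_site, right_site in zip(left_clv, right_clv):
        entry = []
        for i in range(4):
            left_sum = sum(p_left[i][j] * left_site[j] for j in range(4))
            right_sum = sum(p_right[i][j] * right_site[j] for j in range(4))
            entry.append(left_sum * right_sum)
        parent.append(entry)
    return parent


def site_likelihoods(root_clv):
    """Per-site likelihoods at the root, assuming equal base frequencies."""
    return [sum(0.25 * value for value in entry) for entry in root_clv]


if __name__ == "__main__":
    p = jc69_transition_matrix(0.1)
    root = plf_node(tip_clv("AC"), tip_clv("AG"), p, p)
    print(site_likelihoods(root))
```

    In a real implementation the per-site likelihoods would be combined as a sum of logarithms to avoid numerical underflow, which is where a fast log approximation such as the one mentioned in the abstract becomes relevant.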

  4. Phylogenetic trees in bioinformatics

    Energy Technology Data Exchange (ETDEWEB)

    Burr, Tom L [Los Alamos National Laboratory

    2008-01-01

    Genetic data is often used to infer evolutionary relationships among a collection of viruses, bacteria, animal or plant species, or other operational taxonomic units (OTU). A phylogenetic tree depicts such relationships and provides a visual representation of the estimated branching order of the OTUs. Tree estimation is unique for several reasons, including: the types of data used to represent each OTU; the use of probabilistic nucleotide substitution models; the inference goals involving both tree topology and branch length, and the huge number of possible trees for a given sample of a very modest number of OTUs, which implies that finding the best tree(s) to describe the genetic data for each OTU is computationally demanding. Bioinformatics is too large a field to review here. We focus on that aspect of bioinformatics that includes study of similarities in genetic data from multiple OTUs. Although research questions are diverse, a common underlying challenge is to estimate the evolutionary history of the OTUs. Therefore, this paper reviews the role of phylogenetic tree estimation in bioinformatics, available methods and software, and identifies areas for additional research and development.

  5. Inferring Phylogenetic Networks Using PhyloNet.

    Science.gov (United States)

    Wen, Dingqiao; Yu, Yun; Zhu, Jiafan; Nakhleh, Luay

    2018-07-01

    PhyloNet was released in 2008 as a software package for representing and analyzing phylogenetic networks. At the time of its release, the main functionalities in PhyloNet consisted of measures for comparing network topologies and a single heuristic for reconciling gene trees with a species tree. Since then, PhyloNet has grown significantly. The software package now includes a wide array of methods for inferring phylogenetic networks from data sets of unlinked loci while accounting for both reticulation (e.g., hybridization) and incomplete lineage sorting. In particular, PhyloNet now allows for maximum parsimony, maximum likelihood, and Bayesian inference of phylogenetic networks from gene tree estimates. Furthermore, Bayesian inference directly from sequence data (sequence alignments or biallelic markers) is implemented. Maximum parsimony is based on an extension of the "minimizing deep coalescences" criterion to phylogenetic networks, whereas maximum likelihood and Bayesian inference are based on the multispecies network coalescent. All methods allow for multiple individuals per species. As computing the likelihood of a phylogenetic network is computationally hard, PhyloNet allows for evaluation and inference of networks using a pseudolikelihood measure. PhyloNet summarizes the results of the various analyses and generates phylogenetic networks in the extended Newick format that is readily viewable by existing visualization software.

  6. Tetrapods on the EDGE: Overcoming data limitations to identify phylogenetic conservation priorities

    Science.gov (United States)

    Gray, Claudia L.; Wearn, Oliver R.; Owen, Nisha R.

    2018-01-01

    The scale of the ongoing biodiversity crisis requires both effective conservation prioritisation and urgent action. As extinction is non-random across the tree of life, it is important to prioritise threatened species which represent large amounts of evolutionary history. The EDGE metric prioritises species based on their Evolutionary Distinctiveness (ED), which measures the relative contribution of a species to the total evolutionary history of their taxonomic group, and Global Endangerment (GE), or extinction risk. EDGE prioritisations rely on adequate phylogenetic and extinction risk data to generate meaningful priorities for conservation. However, comprehensive phylogenetic trees of large taxonomic groups are extremely rare and, even when available, become quickly out-of-date due to the rapid rate of species descriptions and taxonomic revisions. Thus, it is important that conservationists can use the available data to incorporate evolutionary history into conservation prioritisation. We compared published and new methods to estimate missing ED scores for species absent from a phylogenetic tree whilst simultaneously correcting the ED scores of their close taxonomic relatives. We found that following artificial removal of species from a phylogenetic tree, the new method provided the closest estimates of their “true” ED score, differing from the true ED score by an average of less than 1%, compared to the 31% and 38% difference of the previous methods. The previous methods also substantially under- and over-estimated scores as more species were artificially removed from a phylogenetic tree. We therefore used the new method to estimate ED scores for all tetrapods. From these scores we updated EDGE prioritisation rankings for all tetrapod species with IUCN Red List assessments, including the first EDGE prioritisation for reptiles. Further, we identified criteria to identify robust priority species in an effort to further inform conservation action whilst

  7. Phylogenetic diversity and relationships among species of genus ...

    African Journals Online (AJOL)

    Fifty-six Nicotiana species were used to construct phylogenetic trees and to assess the genetic relationships between them. Genetic distances estimated from RAPD analysis were used to construct phylogenetic trees using the Phylogenetic Inference Package (PHYLIP). Since phylogenetic relationships estimated for closely ...

  8. How does cognition evolve? Phylogenetic comparative psychology

    Science.gov (United States)

    Matthews, Luke J.; Hare, Brian A.; Nunn, Charles L.; Anderson, Rindy C.; Aureli, Filippo; Brannon, Elizabeth M.; Call, Josep; Drea, Christine M.; Emery, Nathan J.; Haun, Daniel B. M.; Herrmann, Esther; Jacobs, Lucia F.; Platt, Michael L.; Rosati, Alexandra G.; Sandel, Aaron A.; Schroepfer, Kara K.; Seed, Amanda M.; Tan, Jingzhi; van Schaik, Carel P.; Wobber, Victoria

    2014-01-01

    Now more than ever animal studies have the potential to test hypotheses regarding how cognition evolves. Comparative psychologists have developed new techniques to probe the cognitive mechanisms underlying animal behavior, and they have become increasingly skillful at adapting methodologies to test multiple species. Meanwhile, evolutionary biologists have generated quantitative approaches to investigate the phylogenetic distribution and function of phenotypic traits, including cognition. In particular, phylogenetic methods can quantitatively (1) test whether specific cognitive abilities are correlated with life history (e.g., lifespan), morphology (e.g., brain size), or socio-ecological variables (e.g., social system), (2) measure how strongly phylogenetic relatedness predicts the distribution of cognitive skills across species, and (3) estimate the ancestral state of a given cognitive trait using measures of cognitive performance from extant species. Phylogenetic methods can also be used to guide the selection of species comparisons that offer the strongest tests of a priori predictions of cognitive evolutionary hypotheses (i.e., phylogenetic targeting). Here, we explain how an integration of comparative psychology and evolutionary biology will answer a host of questions regarding the phylogenetic distribution and history of cognitive traits, as well as the evolutionary processes that drove their evolution. PMID:21927850

  9. How does cognition evolve? Phylogenetic comparative psychology.

    Science.gov (United States)

    MacLean, Evan L; Matthews, Luke J; Hare, Brian A; Nunn, Charles L; Anderson, Rindy C; Aureli, Filippo; Brannon, Elizabeth M; Call, Josep; Drea, Christine M; Emery, Nathan J; Haun, Daniel B M; Herrmann, Esther; Jacobs, Lucia F; Platt, Michael L; Rosati, Alexandra G; Sandel, Aaron A; Schroepfer, Kara K; Seed, Amanda M; Tan, Jingzhi; van Schaik, Carel P; Wobber, Victoria

    2012-03-01

    Now more than ever animal studies have the potential to test hypotheses regarding how cognition evolves. Comparative psychologists have developed new techniques to probe the cognitive mechanisms underlying animal behavior, and they have become increasingly skillful at adapting methodologies to test multiple species. Meanwhile, evolutionary biologists have generated quantitative approaches to investigate the phylogenetic distribution and function of phenotypic traits, including cognition. In particular, phylogenetic methods can quantitatively (1) test whether specific cognitive abilities are correlated with life history (e.g., lifespan), morphology (e.g., brain size), or socio-ecological variables (e.g., social system), (2) measure how strongly phylogenetic relatedness predicts the distribution of cognitive skills across species, and (3) estimate the ancestral state of a given cognitive trait using measures of cognitive performance from extant species. Phylogenetic methods can also be used to guide the selection of species comparisons that offer the strongest tests of a priori predictions of cognitive evolutionary hypotheses (i.e., phylogenetic targeting). Here, we explain how an integration of comparative psychology and evolutionary biology will answer a host of questions regarding the phylogenetic distribution and history of cognitive traits, as well as the evolutionary processes that drove their evolution.

  10. Unrealistic phylogenetic trees may improve phylogenetic footprinting.

    Science.gov (United States)

    Nettling, Martin; Treutler, Hendrik; Cerquides, Jesus; Grosse, Ivo

    2017-06-01

    The computational investigation of DNA binding motifs from binding sites is one of the classic tasks in bioinformatics and a prerequisite for understanding gene regulation as a whole. Due to the development of sequencing technologies and the increasing number of available genomes, approaches based on phylogenetic footprinting become increasingly attractive. Phylogenetic footprinting requires phylogenetic trees with attached substitution probabilities for quantifying the evolution of binding sites, but these trees and substitution probabilities are typically not known and cannot be estimated easily. Here, we investigate the influence of phylogenetic trees with different substitution probabilities on the classification performance of phylogenetic footprinting using synthetic and real data. For synthetic data we find that the classification performance is highest when the substitution probability used for phylogenetic footprinting is similar to that used for data generation. For real data, however, we typically find that the classification performance of phylogenetic footprinting surprisingly increases with increasing substitution probabilities and is often highest for unrealistically high substitution probabilities close to one. This finding suggests that choosing realistic model assumptions might not always yield optimal predictions in general and that choosing unrealistically high substitution probabilities close to one might actually improve the classification performance of phylogenetic footprinting. The proposed PF is implemented in Java and can be downloaded from https://github.com/mgledi/PhyFoo. Contact: martin.nettling@informatik.uni-halle.de. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press.

  11. Visualizing phylogenetic tree landscapes.

    Science.gov (United States)

    Wilgenbusch, James C; Huang, Wen; Gallivan, Kyle A

    2017-02-02

    Genomic-scale sequence alignments are increasingly used to infer phylogenies in order to better understand the processes and patterns of evolution. Different partitions within these new alignments (e.g., genes, codon positions, and structural features) often favor hundreds if not thousands of competing phylogenies. Summarizing and comparing phylogenies obtained from multi-source data sets using current consensus tree methods discards valuable information and can disguise potential methodological problems. Discovery of efficient and accurate dimensionality reduction methods used to display at once in 2 or 3 dimensions the relationship among these competing phylogenies will help practitioners diagnose the limits of current evolutionary models and potential problems with phylogenetic reconstruction methods when analyzing large multi-source data sets. We introduce several dimensionality reduction methods to visualize in 2 and 3 dimensions the relationship among competing phylogenies obtained from gene partitions found in three mid- to large-size mitochondrial genome alignments. We test the performance of these dimensionality reduction methods by applying several goodness-of-fit measures. The intrinsic dimensionality of each data set is also estimated to determine whether projections in 2 and 3 dimensions can be expected to reveal meaningful relationships among trees from different data partitions. Several new approaches to aid in the comparison of different phylogenetic landscapes are presented. Curvilinear Components Analysis (CCA) and a stochastic gradient descent (SGD) optimization method give the best representation of the original tree-to-tree distance matrix for each of the three mitochondrial genome alignments and greatly outperformed the method currently used to visualize tree landscapes. The CCA + SGD method converged at least as fast as previously applied methods for visualizing tree landscapes. We demonstrate for all three mtDNA alignments that 3D

  12. Fast and accurate phylogenetic reconstruction from high-resolution whole-genome data and a novel robustness estimator.

    Science.gov (United States)

    Lin, Y; Rajan, V; Moret, B M E

    2011-09-01

    The rapid accumulation of whole-genome data has renewed interest in the study of genomic rearrangements. Comparative genomics, evolutionary biology, and cancer research all require models and algorithms to elucidate the mechanisms, history, and consequences of these rearrangements. However, even simple models lead to NP-hard problems, particularly in the area of phylogenetic analysis. Current approaches are limited to small collections of genomes and low-resolution data (typically a few hundred syntenic blocks). Moreover, whereas phylogenetic analyses from sequence data are deemed incomplete unless bootstrapping scores (a measure of confidence) are given for each tree edge, no equivalent to bootstrapping exists for rearrangement-based phylogenetic analysis. We describe a fast and accurate algorithm for rearrangement analysis that scales up, in both time and accuracy, to modern high-resolution genomic data. We also describe a novel approach to estimate the robustness of results, an equivalent to the bootstrapping analysis used in sequence-based phylogenetic reconstruction. We present the results of extensive testing on both simulated and real data showing that our algorithm returns very accurate results, while scaling linearly with the size of the genomes and cubically with their number. We also present extensive experimental results showing that our approach to robustness testing provides excellent estimates of confidence, which, moreover, can be tuned to trade off thresholds between false positives and false negatives. Together, these two novel approaches enable us to attack heretofore intractable problems, such as phylogenetic inference for high-resolution vertebrate genomes, as we demonstrate on a set of six vertebrate genomes with 8,380 syntenic blocks. A copy of the software is available on demand.

  13. Phylogenetic trees

    OpenAIRE

    Baños, Hector; Bushek, Nathaniel; Davidson, Ruth; Gross, Elizabeth; Harris, Pamela E.; Krone, Robert; Long, Colby; Stewart, Allen; Walker, Robert

    2016-01-01

    We introduce the package PhylogeneticTrees for Macaulay2 which allows users to compute phylogenetic invariants for group-based tree models. We provide some background information on phylogenetic algebraic geometry and show how the package PhylogeneticTrees can be used to calculate a generating set for a phylogenetic ideal as well as a lower bound for its dimension. Finally, we show how methods within the package can be used to compute a generating set for the join of any two ideals.

  14. On the Quirks of Maximum Parsimony and Likelihood on Phylogenetic Networks

    OpenAIRE

    Bryant, Christopher; Fischer, Mareike; Linz, Simone; Semple, Charles

    2015-01-01

    Maximum parsimony is one of the most frequently-discussed tree reconstruction methods in phylogenetic estimation. However, in recent years it has become more and more apparent that phylogenetic trees are often not sufficient to describe evolution accurately. For instance, processes like hybridization or lateral gene transfer that are commonplace in many groups of organisms and result in mosaic patterns of relationships cannot be represented by a single phylogenetic tree. This is why phylogene...

  15. Estimating bacterial diversity for ecological studies: methods, metrics, and assumptions.

    Directory of Open Access Journals (Sweden)

    Julia Birtel

    Methods to estimate microbial diversity have developed rapidly in an effort to understand the distribution and diversity of microorganisms in natural environments. For bacterial communities, the 16S rRNA gene is the phylogenetic marker gene of choice, but most studies select only a specific region of the 16S rRNA to estimate bacterial diversity. Whereas biases derived from DNA extraction, primer choice and PCR amplification are well documented, we here address how the choice of variable region can influence a wide range of standard ecological metrics, such as species richness, phylogenetic diversity, β-diversity and rank-abundance distributions. We have used Illumina paired-end sequencing to estimate the bacterial diversity of 20 natural lakes across Switzerland derived from three trimmed variable 16S rRNA regions (V3, V4, V5). Species richness, phylogenetic diversity, community composition, β-diversity, and rank-abundance distributions differed significantly between 16S rRNA regions. Overall, patterns of diversity quantified by the V3 and V5 regions were more similar to one another than those assessed by the V4 region. Similar results were obtained when analyzing the datasets with different sequence similarity thresholds used during sequence clustering and when the same analysis was used on a reference dataset of sequences from the Greengenes database. In addition, we measured species richness from the same lake samples using ARISA Fingerprinting, but did not find a strong relationship between species richness estimated by Illumina and ARISA. We conclude that the selection of 16S rRNA region significantly influences the estimation of bacterial diversity and species distributions and that caution is warranted when comparing data from different variable regions as well as when using different sequencing techniques.

  16. Efficient parsimony-based methods for phylogenetic network reconstruction.

    Science.gov (United States)

    Jin, Guohua; Nakhleh, Luay; Snir, Sagi; Tuller, Tamir

    2007-01-15

    Phylogenies, the evolutionary histories of groups of organisms, play a major role in representing relationships among biological entities. Although many biological processes can be effectively modeled as tree-like relationships, others, such as hybrid speciation and horizontal gene transfer (HGT), result in networks, rather than trees, of relationships. Hybrid speciation is a significant evolutionary mechanism in plants, fish and other groups of species. HGT plays a major role in bacterial genome diversification and is a significant mechanism by which bacteria develop resistance to antibiotics. Maximum parsimony is one of the most commonly used criteria for phylogenetic tree inference. Roughly speaking, inference based on this criterion seeks the tree that minimizes the amount of evolution. In 1990, Jotun Hein proposed using this criterion for inferring the evolution of sequences subject to recombination. Preliminary results on small synthetic datasets in Nakhleh et al. (2005) demonstrated the criterion's application to phylogenetic network reconstruction in general and HGT detection in particular. However, the naive algorithms used by the authors are inapplicable to large datasets due to their demanding computational requirements. Further, no rigorous theoretical analysis of computing the criterion was given, nor was it tested on biological data. In the present work we prove that the problem of scoring the parsimony of a phylogenetic network is NP-hard and provide an improved fixed parameter tractable algorithm for it. Further, we devise efficient heuristics for parsimony-based reconstruction of phylogenetic networks. We test our methods on both synthetic and biological data (rbcL gene in bacteria) and obtain very promising results.
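    As background for the parsimony criterion mentioned in this abstract, the sketch below scores a fixed tree with Fitch's classical algorithm. It covers only the tree case, not the network scoring problem that the abstract proves NP-hard, and it is not the authors' algorithm. The nested-tuple tree encoding, the function names and the toy alignment are assumptions made for illustration.

```python
def fitch_site(tree, states):
    """Return (candidate state set, change count) for one site.

    tree is a leaf name (str) or a 2-tuple of subtrees (rooted binary tree).
    states maps each leaf name to its character at this site.
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch_site(left, states)
    right_set, right_cost = fitch_site(right, states)
    common = left_set & right_set
    if common:
        # The children agree on at least one state: no extra change needed.
        return common, left_cost + right_cost
    # The children disagree: take the union and count one change.
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_length(tree, alignment):
    """Sum the Fitch change counts over all sites of an alignment."""
    n_sites = len(next(iter(alignment.values())))
    total = 0
    for site in range(n_sites):
        column = {taxon: sequence[site] for taxon, sequence in alignment.items()}
        total += fitch_site(tree, column)[1]
    return total


if __name__ == "__main__":
    # Hypothetical four-taxon alignment, used only to compare two topologies.
    alignment = {"a": "AAG", "b": "AAG", "c": "CAT", "d": "CAT"}
    print(parsimony_length((("a", "b"), ("c", "d")), alignment))
    print(parsimony_length((("a", "c"), ("b", "d")), alignment))
```

    Under the criterion, the topology with the smaller total is preferred. Searching over all topologies, or over networks, is the expensive part and is not attempted here.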

  17. Functional & phylogenetic diversity of copepod communities

    Science.gov (United States)

    Benedetti, F.; Ayata, S. D.; Blanco-Bercial, L.; Cornils, A.; Guilhaumon, F.

    2016-02-01

    The diversity of natural communities is classically estimated through species identification (taxonomic diversity) but can also be estimated from the ecological functions performed by the species (functional diversity), or from the phylogenetic relationships among them (phylogenetic diversity). Estimating functional diversity requires the definition of specific functional traits, i.e., phenotypic characteristics that impact fitness and are relevant to ecosystem functioning. Estimating phylogenetic diversity requires the description of phylogenetic relationships, for instance by using molecular tools. In the present study, we focused on the functional and phylogenetic diversity of copepod surface communities in the Mediterranean Sea. First, we implemented a specific trait database for the most commonly-sampled and abundant copepod species of the Mediterranean Sea. Our database includes 191 species, described by seven traits encompassing diverse ecological functions: minimal and maximal body length, trophic group, feeding type, spawning strategy, diel vertical migration and vertical habitat. Clustering analysis in the functional trait space revealed that Mediterranean copepods can be gathered into groups that have different ecological roles. Second, we reconstructed a phylogenetic tree using the available sequences of 18S rRNA. Our tree included 154 of the analyzed Mediterranean copepod species. We used these two datasets to describe the functional and phylogenetic diversity of copepod surface communities in the Mediterranean Sea. The replacement component (turn-over) and the species richness difference component (nestedness) of the beta diversity indices were identified. Finally, by comparing various and complementary aspects of plankton diversity (taxonomic, functional, and phylogenetic diversity) we were able to gain a better understanding of the relationships among the zooplankton community, biodiversity, ecosystem function, and environmental forcing.
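    The abstract above does not say which index family was used to split beta diversity into replacement and richness difference. As one possible illustration, the sketch below assumes the Jaccard-based decomposition for presence/absence data, in which the dissimilarity between two communities is the sum of a replacement term and a richness difference term. The function name and the species labels are hypothetical.

```python
def jaccard_partition(community_a, community_b):
    """Split Jaccard dissimilarity into replacement and richness difference.

    Assumes presence/absence data; each community is an iterable of species names.
    """
    set_a, set_b = set(community_a), set(community_b)
    shared = len(set_a & set_b)
    only_a = len(set_a - set_b)
    only_b = len(set_b - set_a)
    total = shared + only_a + only_b
    if total == 0:
        raise ValueError("both communities are empty")
    return {
        # Species swapped one-for-one between the two communities.
        "replacement": 2 * min(only_a, only_b) / total,
        # Species gained or lost without being replaced.
        "richness_difference": abs(only_a - only_b) / total,
        # Total Jaccard dissimilarity, the sum of the two terms above.
        "dissimilarity": (only_a + only_b) / total,
    }


if __name__ == "__main__":
    # Hypothetical species lists for two sampling stations.
    station_1 = ["sp1", "sp2", "sp3", "sp4"]
    station_2 = ["sp3", "sp4", "sp5"]
    print(jaccard_partition(station_1, station_2))
```

    The same partition can be written for other coefficients (for example Sørensen) by changing the denominator. Functional and phylogenetic versions replace species counts with trait-space volumes or branch lengths, which this sketch does not cover.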

  18. Fast algorithms for computing phylogenetic divergence time.

    Science.gov (United States)

    Crosby, Ralph W; Williams, Tiffani L

    2017-12-06

    The inference of species divergence time is a key step in most phylogenetic studies. Methods have been available for the last ten years to perform the inference, but the performance of the methods does not yet scale well to studies with hundreds of taxa and thousands of DNA base pairs. For example, a study of 349 primate taxa was estimated to require over 9 months of processing time. In this work, we present a new algorithm, AncestralAge, that significantly improves the performance of the divergence time process. As part of AncestralAge, we demonstrate a new method for the computation of phylogenetic likelihood and our experiments show a 90% improvement in likelihood computation time on the aforementioned dataset of 349 primate taxa with over 60,000 DNA base pairs. Additionally, we show that our new method for the computation of the Bayesian prior on node ages reduces the running time for this computation on the 349 taxa dataset by 99%. Through the use of these new algorithms we open up the ability to perform divergence time inference on large phylogenetic studies.

  19. EM for phylogenetic topology reconstruction on nonhomogeneous data.

    Science.gov (United States)

    Ibáñez-Marcelo, Esther; Casanellas, Marta

    2014-06-17

    The reconstruction of the phylogenetic tree topology of four taxa remains one of the main challenges in phylogenetics. Its difficulties lie in considering evolutionary models that are not too restrictive, and in correctly dealing with the long-branch attraction problem. The correct reconstruction of 4-taxon trees is crucial for making quartet-based methods work and being able to recover large phylogenies. We adapt the well-known expectation-maximization algorithm to evolutionary Markov models on phylogenetic 4-taxon trees. We then use this algorithm to estimate the substitution parameters, compute the corresponding likelihood, and infer the most likely quartet. In this paper we consider an expectation-maximization method for maximizing the likelihood of (time nonhomogeneous) evolutionary Markov models on trees. We study its success in reconstructing 4-taxon topologies and its performance as an input method in quartet-based phylogenetic reconstruction methods such as QFIT and QuartetSuite. Our results show that the method proposed here outperforms neighbor-joining and the usual (time-homogeneous continuous-time) maximum likelihood methods on 4-leaved trees with among-lineage instantaneous rate heterogeneity, and performs similarly to the usual continuous-time maximum likelihood when the data satisfy the assumptions of both methods. The method presented in this paper is well suited for reconstructing the topology of any number of taxa via quartet-based methods and is highly accurate, especially regarding largely divergent trees and time nonhomogeneous data.

  20. Efficient Detection of Repeating Sites to Accelerate Phylogenetic Likelihood Calculations.

    Science.gov (United States)

    Kobert, K; Stamatakis, A; Flouri, T

    2017-03-01

    The phylogenetic likelihood function (PLF) is the major computational bottleneck in several applications of evolutionary biology such as phylogenetic inference, species delimitation, model selection, and divergence times estimation. Given the alignment, a tree and the evolutionary model parameters, the likelihood function computes the conditional likelihood vectors for every node of the tree. Vector entries for which all input data are identical result in redundant likelihood operations which, in turn, yield identical conditional values. Such operations can be omitted for improving run-time and, using appropriate data structures, reducing memory usage. We present a fast, novel method for identifying and omitting such redundant operations in phylogenetic likelihood calculations, and assess the performance improvement and memory savings attained by our method. Using empirical and simulated data sets, we show that a prototype implementation of our method yields up to 12-fold speedups and uses up to 78% less memory than one of the fastest and most highly tuned implementations of the PLF currently available. Our method is generic and can seamlessly be integrated into any phylogenetic likelihood implementation. [Algorithms; maximum likelihood; phylogenetic likelihood function; phylogenetics]. © The Author(s) 2016. Published by Oxford University Press, on behalf of the Society of Systematic Biologists.
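    The idea of skipping redundant per-site work can be illustrated with a minimal Python sketch. It is not the authors' implementation or data structure: the nested-tuple tree format, the function name site_classes and the toy alignment are assumptions made here for illustration. Each node assigns an integer class to every alignment site; two sites share a class exactly when all sequence data below that node are identical, so a conditional likelihood vector would only need to be computed once per class.

```python
from typing import Dict, List, Tuple, Union

# A rooted binary tree as nested 2-tuples with taxon names at the leaves
# (hypothetical format chosen for this sketch).
Tree = Union[str, Tuple["Tree", "Tree"]]


def site_classes(tree: Tree, alignment: Dict[str, str],
                 stats: List[Tuple[int, int]]) -> List[int]:
    """Return one class id per site; equal ids mean identical data below this node.

    For every inner node, (number of distinct classes, number of sites) is
    appended to `stats` in post-order.
    """
    if isinstance(tree, str):
        # Leaf: the class of a site is determined by its character.
        keys = list(alignment[tree])
    else:
        left = site_classes(tree[0], alignment, stats)
        right = site_classes(tree[1], alignment, stats)
        # Inner node: a site's class is the pair of its children's classes.
        keys = list(zip(left, right))
    ids: Dict[object, int] = {}
    classes = [ids.setdefault(k, len(ids)) for k in keys]
    if not isinstance(tree, str):
        stats.append((len(ids), len(classes)))
    return classes


example_tree = (("A", "B"), ("C", "D"))
example_alignment = {"A": "AAAA", "B": "CCCC", "C": "ACGT", "D": "ACGT"}
example_stats: List[Tuple[int, int]] = []
site_classes(example_tree, example_alignment, example_stats)
print(example_stats)  # [(1, 4), (4, 4), (4, 4)]
```

    In this toy case the subtree (A, B) sees the same data at all four sites, so one likelihood computation there could serve all of them, while the other two inner nodes have no repeats.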

  1. Phylogenetic diversity and biodiversity indices on phylogenetic networks.

    Science.gov (United States)

    Wicke, Kristina; Fischer, Mareike

    2018-04-01

    In biodiversity conservation it is often necessary to prioritize the species to conserve. Existing approaches to prioritization, e.g. the Fair Proportion Index and the Shapley Value, are based on phylogenetic trees and rank species according to their contribution to overall phylogenetic diversity. However, in many cases evolution is not treelike and thus, phylogenetic networks have been developed as a generalization of phylogenetic trees, allowing for the representation of non-treelike evolutionary events, such as hybridization. Here, we extend the concepts of phylogenetic diversity and phylogenetic diversity indices from phylogenetic trees to phylogenetic networks. On the one hand, we consider the treelike content of a phylogenetic network, e.g. the (multi)set of phylogenetic trees displayed by a network and the so-called lowest stable ancestor tree associated with it. On the other hand, we derive the phylogenetic diversity of subsets of taxa and biodiversity indices directly from the internal structure of the network. We consider both approaches that are independent of so-called inheritance probabilities as well as approaches that explicitly incorporate these probabilities. Furthermore, we introduce our software package NetDiversity, which is implemented in Perl and allows for the calculation of all generalized measures of phylogenetic diversity and generalized phylogenetic diversity indices established in this note that are independent of inheritance probabilities. We apply our methods to a phylogenetic network representing the evolutionary relationships among swordtails and platyfishes (Xiphophorus: Poeciliidae), a group of species characterized by widespread hybridization. Copyright © 2018 Elsevier Inc. All rights reserved.
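    For orientation, the tree-based Fair Proportion Index mentioned above can be sketched in a few lines of Python. This covers only the classical tree case, not the network generalizations or the NetDiversity package; the child-list dictionary format and the function name fair_proportion are assumptions for illustration. Each edge length is shared equally among the leaves descending from that edge, and a leaf's index is the sum of its shares along the path from the root.

```python
from typing import Dict, List, Tuple

# children[node] = list of (child, edge_length); leaves have no entry
# (hypothetical input format for this sketch).
Children = Dict[str, List[Tuple[str, float]]]


def fair_proportion(children: Children, root: str) -> Dict[str, float]:
    """Fair Proportion index of every leaf of a rooted tree."""
    leaves_below: Dict[str, int] = {}

    def count(node: str) -> int:
        kids = children.get(node, [])
        leaves_below[node] = 1 if not kids else sum(count(c) for c, _ in kids)
        return leaves_below[node]

    count(root)
    index: Dict[str, float] = {}

    def walk(node: str, acc: float) -> None:
        kids = children.get(node, [])
        if not kids:
            index[node] = acc
        for child, length in kids:
            # The edge above `child` is split among the leaves below it.
            walk(child, acc + length / leaves_below[child])

    walk(root, 0.0)
    return index


example = {"root": [("u", 1.0), ("C", 3.0)], "u": [("A", 2.0), ("B", 2.0)]}
print(fair_proportion(example, "root"))  # {'A': 2.5, 'B': 2.5, 'C': 3.0}
```

    The indices sum to 8.0, the total branch length of the example tree, which is the sense in which the index distributes overall phylogenetic diversity among species.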

  2. New approaches to phylogenetic tree search and their application to large numbers of protein alignments.

    Science.gov (United States)

    Whelan, Simon

    2007-10-01

    Phylogenetic tree estimation plays a critical role in a wide variety of molecular studies, including molecular systematics, phylogenetics, and comparative genomics. Finding the optimal tree relating a set of sequences using score-based (optimality criterion) methods, such as maximum likelihood and maximum parsimony, may require all possible trees to be considered, which is not feasible even for modest numbers of sequences. In practice, trees are estimated using heuristics that represent a trade-off between topological accuracy and speed. I present a series of novel algorithms suitable for score-based phylogenetic tree reconstruction that demonstrably improve the accuracy of tree estimates while maintaining high computational speeds. The heuristics function by allowing the efficient exploration of large numbers of trees through novel hill-climbing and resampling strategies. These heuristics, and other computational approximations, are implemented for maximum likelihood estimation of trees in the program Leaphy, and its performance is compared to other popular phylogenetic programs. Trees are estimated from 4059 different protein alignments using a selection of phylogenetic programs and the likelihoods of the tree estimates are compared. Trees estimated using Leaphy are found to have equal to or better likelihoods than trees estimated using other phylogenetic programs in 4004 (98.6%) families and provide a unique best tree that no other program found in 1102 (27.1%) families. The improvement is particularly marked for larger families (80 to 100 sequences), where Leaphy finds a unique best tree in 81.7% of families.

  3. An Evaluation of Phylogenetic Methods for Reconstructing Transmitted HIV Variants using Longitudinal Clonal HIV Sequence Data

    Science.gov (United States)

    McCloskey, Rosemary M.; Liang, Richard H.; Harrigan, P. Richard; Brumme, Zabrina L.

    2014-01-01

    ABSTRACT A population of human immunodeficiency virus (HIV) within a host often descends from a single transmitted/founder virus. The high mutation rate of HIV, coupled with long delays between infection and diagnosis, make isolating and characterizing this strain a challenge. In theory, ancestral reconstruction could be used to recover this strain from sequences sampled in chronic infection; however, the accuracy of phylogenetic techniques in this context is unknown. To evaluate the accuracy of these methods, we applied ancestral reconstruction to a large panel of published longitudinal clonal and/or single-genome-amplification HIV sequence data sets with at least one intrapatient sequence set sampled within 6 months of infection or seroconversion (n = 19,486 sequences, median [interquartile range] = 49 [20 to 86] sequences/set). The consensus of the earliest sequences was used as the best possible estimate of the transmitted/founder. These sequences were compared to ancestral reconstructions from sequences sampled at later time points using both phylogenetic and phylogeny-naive methods. Overall, phylogenetic methods conferred a 16% improvement in reproducing the consensus of early sequences, compared to phylogeny-naive methods. This relative advantage increased with intrapatient sequence diversity (P reconstructing ancestral indel variation, especially within indel-rich regions of the HIV genome. Although further improvements are needed, our results indicate that phylogenetic methods for ancestral reconstruction significantly outperform phylogeny-naive alternatives, and we identify experimental conditions and study designs that can enhance accuracy of transmitted/founder virus reconstruction. IMPORTANCE When HIV is transmitted into a new host, most of the viruses fail to infect host cells. Consequently, an HIV infection tends to be descended from a single “founder” virus. A priority target for the vaccine research, these transmitted/founder viruses are

  4. Untangling hybrid phylogenetic signals: horizontal gene transfer and artifacts of phylogenetic reconstruction.

    Science.gov (United States)

    Beiko, Robert G; Ragan, Mark A

    2009-01-01

    Phylogenomic methods can be used to investigate the tangled evolutionary relationships among genomes. Building 'all the trees of all the genes' can potentially identify common pathways of horizontal gene transfer (HGT) among taxa at varying levels of phylogenetic depth. Phylogenetic affinities can be aggregated and merged with the information about genetic linkage and biochemical function to examine hypotheses of adaptive evolution via HGT. Additionally, the use of many genetic data sets increases the power of statistical tests for phylogenetic artifacts. However, large-scale phylogenetic analyses pose several challenges, including the necessary abandonment of manual validation techniques, the need to translate inferred phylogenetic discordance into inferred HGT events, and the challenges involved in aggregating results from search-based inference methods. In this chapter we describe a tree search procedure to recover the most parsimonious pathways of HGT, and examine some of the assumptions that are made by this method.

  5. Relating phylogenetic trees to transmission trees of infectious disease outbreaks.

    Science.gov (United States)

    Ypma, Rolf J F; van Ballegooijen, W Marijn; Wallinga, Jacco

    2013-11-01

    Transmission events are the fundamental building blocks of the dynamics of any infectious disease. Much about the epidemiology of a disease can be learned when these individual transmission events are known or can be estimated. Such estimations are difficult and generally feasible only when detailed epidemiological data are available. The genealogy estimated from genetic sequences of sampled pathogens is another rich source of information on transmission history. Optimal inference of transmission events calls for the combination of genetic data and epidemiological data into one joint analysis. A key difficulty is that the transmission tree, which describes the transmission events between infected hosts, differs from the phylogenetic tree, which describes the ancestral relationships between pathogens sampled from these hosts. The trees differ both in timing of the internal nodes and in topology. These differences become more pronounced when a higher fraction of infected hosts is sampled. We show how the phylogenetic tree of sampled pathogens is related to the transmission tree of an outbreak of an infectious disease, by the within-host dynamics of pathogens. We provide a statistical framework to infer key epidemiological and mutational parameters by simultaneously estimating the phylogenetic tree and the transmission tree. We test the approach using simulations and illustrate its use on an outbreak of foot-and-mouth disease. The approach unifies existing methods in the emerging field of phylodynamics with transmission tree reconstruction methods that are used in infectious disease epidemiology.

  6. A Bayesian phylogenetic approach to estimating the stability of linguistic features and the genetic biasing of tone.

    Science.gov (United States)

    Dediu, Dan

    2011-02-07

    Language is a hallmark of our species and understanding linguistic diversity is an area of major interest. Genetic factors influencing the cultural transmission of language provide a powerful and elegant explanation for aspects of the present day linguistic diversity and a window into the emergence and evolution of language. In particular, it has recently been proposed that linguistic tone (the usage of voice pitch to convey lexical and grammatical meaning) is biased by two genes involved in brain growth and development, ASPM and Microcephalin. This hypothesis predicts that tone is a stable characteristic of language because of its 'genetic anchoring'. The present paper tests this prediction using a Bayesian phylogenetic framework applied to a large set of linguistic features and language families, using multiple software implementations, data codings, stability estimations, linguistic classifications and outgroup choices. The results of these different methods and datasets show a large agreement, suggesting that this approach produces reliable estimates of the stability of linguistic data. Moreover, linguistic tone is found to be stable across methods and datasets, providing suggestive support for the hypothesis of genetic influences on its distribution.

  7. Phylogenetic Analysis Using Protein Mass Spectrometry.

    Science.gov (United States)

    Ma, Shiyong; Downard, Kevin M; Wong, Jason W H

    2017-01-01

    Through advances in molecular biology, comparative analysis of DNA sequences is currently the cornerstone in the study of molecular evolution and phylogenetics. Nevertheless, protein mass spectrometry offers some unique opportunities to enable phylogenetic analyses in organisms where DNA may be difficult or costly to obtain. To date, the methods of phylogenetic analysis using protein mass spectrometry can be classified into three categories: (1) de novo protein sequencing followed by classical phylogenetic reconstruction, (2) direct phylogenetic reconstruction using proteolytic peptide mass maps, and (3) mapping of mass spectral data onto classical phylogenetic trees. In this chapter, we provide a brief description of the three methods and the protocol for each method along with relevant tools and algorithms.

  8. Estimation of main diversification time-points of hantaviruses using phylogenetic analyses of complete genomes.

    Science.gov (United States)

    Castel, Guillaume; Tordo, Noël; Plyusnin, Alexander

    2017-04-02

    Because of the great variability of their reservoir hosts, hantaviruses are excellent models to evaluate the dynamics of virus-host co-evolution. Intriguing questions remain about the timescale of the diversification events that influenced this evolution. In this paper we attempted to estimate the first ever timing of hantavirus diversification based on thirty-five available complete genomes representing five major groups of hantaviruses and the assumption of co-speciation of hantaviruses with their respective mammal hosts. Phylogenetic analyses were used to estimate the main diversification points during hantavirus evolution in mammals, while host diversification was mostly estimated from independent calibrators taken from fossil records. Our results support an earlier developed hypothesis of co-speciation of known hantaviruses with their respective mammal hosts and hence a common ancestor for all hantaviruses carried by placental mammals. Copyright © 2017 Elsevier B.V. All rights reserved.

  9. PhyDesign: an online application for profiling phylogenetic informativeness

    Directory of Open Access Journals (Sweden)

    Townsend Jeffrey P

    2011-05-01

    Abstract Background The rapid increase in the number of sequenced genomes for species across the tree of life is revealing a diverse suite of orthologous genes that could potentially be employed to inform molecular phylogenetic studies that encompass broader taxonomic sampling. Optimal usage of this diversity of loci requires user-friendly tools to facilitate widespread cost-effective locus prioritization for phylogenetic sampling. The Townsend (2007) phylogenetic informativeness provides a unique empirical metric for guiding marker selection. However, no software or automated methodology to evaluate sequence alignments and estimate the phylogenetic informativeness metric has been available. Results Here, we present PhyDesign, a platform-independent online application that implements the Townsend (2007) phylogenetic informativeness analysis, providing a quantitative prediction of the utility of loci to solve specific phylogenetic questions. An easy-to-use interface facilitates uploading of alignments and ultrametric trees to calculate and depict profiles of informativeness over specified time ranges, and provides rankings of locus prioritization for epochs of interest. Conclusions By providing these profiles, PhyDesign facilitates locus prioritization, increasing the efficiency of sequencing for phylogenetic purposes compared to traditional studies with more laborious and low-capacity screening methods, as well as increasing the accuracy of phylogenetic studies. Together with a manual and sample files, the application is freely accessible at http://phydesign.townsend.yale.edu.

  10. Phylogenetic comparative methods complement discriminant function analysis in ecomorphology.

    Science.gov (United States)

    Barr, W Andrew; Scott, Robert S

    2014-04-01

    In ecomorphology, Discriminant Function Analysis (DFA) has been used as evidence for the presence of functional links between morphometric variables and ecological categories. Here we conduct simulations of characters containing phylogenetic signal to explore the performance of DFA under a variety of conditions. Characters were simulated using a phylogeny of extant antelope species from known habitats. Characters were modeled with no biomechanical relationship to the habitat category; the only sources of variation were body mass, phylogenetic signal, or random "noise." DFA on the discriminability of habitat categories was performed using subsets of the simulated characters, and Phylogenetic Generalized Least Squares (PGLS) was performed for each character. Analyses were repeated with randomized habitat assignments. When simulated characters lacked phylogenetic signal and/or habitat assignments were random, ecomorphology. Copyright © 2013 Wiley Periodicals, Inc.

  11. Phylogenetic relationships among populations of Pristurus rupestris Blanford, 1874 (Sauria: Sphaerodactylidae) in southern Iran

    OpenAIRE

    YOUSOFI, SUGOL; POUYANI, ESKANDAR RASTEGAR; HOJATI, VIDA

    2015-01-01

    We examined intraspecific relationships of the subspecies Pristurus rupestris iranicus from the northern Persian Gulf area (Hormozgan, Bushehr, and Sistan and Baluchestan provinces). Phylogenetic relationships among these samples were estimated based on the mitochondrial cytochrome b gene. We used three methods of phylogenetic tree reconstruction (maximum likelihood, maximum parsimony, and Bayesian inference). The sampled populations were divided into 5 clades but exhibit little genetic diver...

  12. Comparison of Boolean analysis and standard phylogenetic methods using artificially evolved and natural mt-tRNA sequences from great apes.

    Science.gov (United States)

    Ari, Eszter; Ittzés, Péter; Podani, János; Thi, Quynh Chi Le; Jakó, Eena

    2012-04-01

    Boolean analysis (or BOOL-AN; Jakó et al., 2009. BOOL-AN: A method for comparative sequence analysis and phylogenetic reconstruction. Mol. Phylogenet. Evol. 52, 887-97.), a recently developed method for sequence comparison uses the Iterative Canonical Form of Boolean functions. It considers sequence information in a way entirely different from standard phylogenetic methods (i.e. Maximum Parsimony, Maximum-Likelihood, Neighbor-Joining, and Bayesian analysis). The performance and reliability of Boolean analysis were tested and compared with the standard phylogenetic methods, using artificially evolved - simulated - nucleotide sequences and the 22 mitochondrial tRNA genes of the great apes. At the outset, we assumed that the phylogeny of Hominidae is generally well established, and the guide tree of artificial sequence evolution can also be used as a benchmark. These offer a possibility to compare and test the performance of different phylogenetic methods. Trees were reconstructed by each method from 2500 simulated sequences and 22 mitochondrial tRNA sequences. We also introduced a special re-sampling method for Boolean analysis on permuted sequence sites, the P-BOOL-AN procedure. Considering the reliability values (branch support values of consensus trees and Robinson-Foulds distances) we used for simulated sequence trees produced by different phylogenetic methods, BOOL-AN appeared as the most reliable method. Although the mitochondrial tRNA sequences of great apes are relatively short (59-75 bases long) and the ratio of their constant characters is about 75%, BOOL-AN, P-BOOL-AN and the Bayesian approach produced the same tree-topology as the established phylogeny, while the outcomes of Maximum Parsimony, Maximum-Likelihood and Neighbor-Joining methods were equivocal. We conclude that Boolean analysis is a promising alternative to existing methods of sequence comparison for phylogenetic reconstruction and congruence analysis. Copyright © 2012 Elsevier Inc. All

  13. Phylogenetic analyses of Vitis (Vitaceae) based on complete chloroplast genome sequences: effects of taxon sampling and phylogenetic methods on resolving relationships among rosids.

    Science.gov (United States)

    Jansen, Robert K; Kaittanis, Charalambos; Saski, Christopher; Lee, Seung-Bum; Tomkins, Jeffrey; Alverson, Andrew J; Daniell, Henry

    2006-04-09

    The Vitaceae (grape) is an economically important family of angiosperms whose phylogenetic placement is currently unresolved. Recent phylogenetic analyses based on one to several genes have suggested several alternative placements of this family, including sister to Caryophyllales, asterids, Saxifragales, Dilleniaceae or to rest of rosids, though support for these different results has been weak. There has been a recent interest in using complete chloroplast genome sequences for resolving phylogenetic relationships among angiosperms. These studies have clarified relationships among several major lineages but they have also emphasized the importance of taxon sampling and the effects of different phylogenetic methods for obtaining accurate phylogenies. We sequenced the complete chloroplast genome of Vitis vinifera and used these data to assess relationships among 27 angiosperms, including nine taxa of rosids. The Vitis vinifera chloroplast genome is 160,928 bp in length, including a pair of inverted repeats of 26,358 bp that are separated by small and large single copy regions of 19,065 bp and 89,147 bp, respectively. The gene content and order of Vitis is identical to many other unrearranged angiosperm chloroplast genomes, including tobacco. Phylogenetic analyses using maximum parsimony and maximum likelihood were performed on DNA sequences of 61 protein-coding genes for two datasets with 28 or 29 taxa, including eight or nine taxa from four of the seven currently recognized major clades of rosids. Parsimony and likelihood phylogenies of both data sets provide strong support for the placement of Vitaceae as sister to the remaining rosids. However, the position of the Myrtales and support for the monophyly of the eurosid I clade differs between the two data sets and the two methods of analysis. In parsimony analyses, the inclusion of Gossypium is necessary to obtain trees that support the monophyly of the eurosid I clade. However, maximum likelihood analyses place

  14. Phylogenetic analyses of Vitis (Vitaceae) based on complete chloroplast genome sequences: effects of taxon sampling and phylogenetic methods on resolving relationships among rosids

    Directory of Open Access Journals (Sweden)

    Alverson Andrew J

    2006-04-01

    Abstract Background The Vitaceae (grape) is an economically important family of angiosperms whose phylogenetic placement is currently unresolved. Recent phylogenetic analyses based on one to several genes have suggested several alternative placements of this family, including sister to Caryophyllales, asterids, Saxifragales, Dilleniaceae or to rest of rosids, though support for these different results has been weak. There has been a recent interest in using complete chloroplast genome sequences for resolving phylogenetic relationships among angiosperms. These studies have clarified relationships among several major lineages but they have also emphasized the importance of taxon sampling and the effects of different phylogenetic methods for obtaining accurate phylogenies. We sequenced the complete chloroplast genome of Vitis vinifera and used these data to assess relationships among 27 angiosperms, including nine taxa of rosids. Results The Vitis vinifera chloroplast genome is 160,928 bp in length, including a pair of inverted repeats of 26,358 bp that are separated by small and large single copy regions of 19,065 bp and 89,147 bp, respectively. The gene content and order of Vitis is identical to many other unrearranged angiosperm chloroplast genomes, including tobacco. Phylogenetic analyses using maximum parsimony and maximum likelihood were performed on DNA sequences of 61 protein-coding genes for two datasets with 28 or 29 taxa, including eight or nine taxa from four of the seven currently recognized major clades of rosids. Parsimony and likelihood phylogenies of both data sets provide strong support for the placement of Vitaceae as sister to the remaining rosids. However, the position of the Myrtales and support for the monophyly of the eurosid I clade differs between the two data sets and the two methods of analysis. In parsimony analyses, the inclusion of Gossypium is necessary to obtain trees that support the monophyly of the eurosid I clade

  15. On the quirks of maximum parsimony and likelihood on phylogenetic networks.

    Science.gov (United States)

    Bryant, Christopher; Fischer, Mareike; Linz, Simone; Semple, Charles

    2017-03-21

    Maximum parsimony is one of the most frequently-discussed tree reconstruction methods in phylogenetic estimation. However, in recent years it has become more and more apparent that phylogenetic trees are often not sufficient to describe evolution accurately. For instance, processes like hybridization or lateral gene transfer that are commonplace in many groups of organisms and result in mosaic patterns of relationships cannot be represented by a single phylogenetic tree. This is why phylogenetic networks, which can display such events, are becoming of more and more interest in phylogenetic research. It is therefore necessary to extend concepts like maximum parsimony from phylogenetic trees to networks. Several suggestions for possible extensions can be found in recent literature, for instance the softwired and the hardwired parsimony concepts. In this paper, we analyze the so-called big parsimony problem under these two concepts, i.e. we investigate maximum parsimonious networks and analyze their properties. In particular, we show that finding a softwired maximum parsimony network is possible in polynomial time. We also show that the set of maximum parsimony networks for the hardwired definition always contains at least one phylogenetic tree. Lastly, we investigate some parallels of parsimony to different likelihood concepts on phylogenetic networks. Copyright © 2017 Elsevier Ltd. All rights reserved.
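    As background for the tree case that these network concepts extend, the parsimony score of a single character on a fixed rooted binary tree can be computed with Fitch's algorithm. The Python sketch below is only that tree-level building block; it does not implement the softwired or hardwired network definitions studied in the paper, and the nested-tuple tree format and the name fitch are assumptions for illustration.

```python
from typing import Dict, Set, Tuple, Union

# Rooted binary tree as nested 2-tuples with taxon names at the leaves
# (hypothetical format for this sketch).
Tree = Union[str, Tuple["Tree", "Tree"]]


def fitch(tree: Tree, states: Dict[str, str]) -> Tuple[Set[str], int]:
    """Return (candidate root states, minimum number of state changes)."""
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_cost = fitch(tree[0], states)
    right_set, right_cost = fitch(tree[1], states)
    common = left_set & right_set
    if common:
        # The children agree on at least one state: no extra change needed.
        return common, left_cost + right_cost
    # The children disagree: one change is required on an edge below this node.
    return left_set | right_set, left_cost + right_cost + 1


tree = (("A", "B"), ("C", "D"))
print(fitch(tree, {"A": "G", "B": "G", "C": "T", "D": "G"})[1])  # 1
print(fitch(tree, {"A": "G", "B": "T", "C": "G", "D": "T"})[1])  # 2
```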

  16. Comparison of cluster-based and source-attribution methods for estimating transmission risk using large HIV sequence databases.

    Science.gov (United States)

    Le Vu, Stéphane; Ratmann, Oliver; Delpech, Valerie; Brown, Alison E; Gill, O Noel; Tostevin, Anna; Fraser, Christophe; Volz, Erik M

    2018-06-01

    Phylogenetic clustering of HIV sequences from a random sample of patients can reveal epidemiological transmission patterns, but interpretation is hampered by limited theoretical support, and the statistical properties of clustering analysis remain poorly understood. Alternatively, source attribution methods allow fitting of HIV transmission models and thereby quantify aspects of disease transmission. A simulation study was conducted to assess error rates of clustering methods for detecting transmission risk factors. We modeled HIV epidemics among men having sex with men and generated phylogenies comparable to those that can be obtained from HIV surveillance data in the UK. Clustering and source attribution approaches were applied to evaluate their ability to identify patient attributes as transmission risk factors. We find that commonly used methods show a misleading association between cluster size or odds of clustering and covariates that are correlated with time since infection, regardless of their influence on transmission. Clustering methods usually have higher error rates and lower sensitivity than source attribution methods for identifying transmission risk factors. However, neither method provides robust estimates of transmission risk ratios. Source attribution methods can alleviate drawbacks of phylogenetic clustering, but formal population genetic modeling may be required to estimate quantitative transmission risk factors. Copyright © 2017 The Authors. Published by Elsevier B.V. All rights reserved.

  17. Phylogenetic inference with weighted codon evolutionary distances.

    Science.gov (United States)

    Criscuolo, Alexis; Michel, Christian J

    2009-04-01

    We develop a new approach to estimate a matrix of pairwise evolutionary distances from a codon-based alignment based on a codon evolutionary model. The method first computes a standard distance matrix for each of the three codon positions. Then these three distance matrices are weighted according to an estimate of the global evolutionary rate of each codon position and averaged into a unique distance matrix. Using a large set of both real and simulated codon-based alignments of nucleotide sequences, we show that this approach leads to distance matrices that have a significantly better treelikeness compared to those obtained by standard nucleotide evolutionary distances. We also propose an alternative weighting to eliminate the part of the noise often associated with some codon positions, particularly the third position, which is known to induce a fast evolutionary rate. Simulation results show that fast distance-based tree reconstruction algorithms on distance matrices based on this codon position weighting can lead to phylogenetic trees that are at least as accurate as, if not better than, those inferred by maximum likelihood. Finally, a well-known multigene dataset composed of eight yeast species and 106 codon-based alignments is reanalyzed and shows that our codon evolutionary distances allow building a phylogenetic tree which is similar to those obtained by non-distance-based methods (e.g., maximum parsimony and maximum likelihood) and also significantly improved compared to standard nucleotide evolutionary distance estimates.
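    The two-step procedure described above (one distance matrix per codon position, then a weighted average) can be sketched in Python as follows. This is a simplified illustration, not the authors' estimator: it uses plain p-distances in place of a model-based distance, and the weights are supplied by the caller because the abstract does not give the weighting formula. The function names and the example weights (2, 2, 1) are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Sequence, Tuple

Matrix = Dict[Tuple[str, str], float]


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites, ignoring columns with a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    return sum(x != y for x, y in pairs) / len(pairs) if pairs else 0.0


def codon_position_distances(seqs: Dict[str, str]) -> List[Matrix]:
    """One pairwise distance matrix for each of the three codon positions."""
    names = sorted(seqs)
    matrices = []
    for pos in range(3):
        sub = {n: seqs[n][pos::3] for n in names}  # every third site
        matrices.append({(i, j): p_distance(sub[i], sub[j])
                         for i, j in combinations(names, 2)})
    return matrices


def weighted_distance(matrices: List[Matrix], weights: Sequence[float]) -> Matrix:
    """Weighted average of the per-position matrices (weights are assumptions)."""
    total = sum(weights)
    return {pair: sum(w * m[pair] for w, m in zip(weights, matrices)) / total
            for pair in matrices[0]}


seqs = {"s1": "ATGGCA", "s2": "ATGGCC", "s3": "CTGACT"}
combined = weighted_distance(codon_position_distances(seqs), (2.0, 2.0, 1.0))
print(combined)
```

    Here the third position is down-weighted relative to the first two, which mirrors the motivation given in the abstract; the resulting matrix could then be passed to any distance-based tree builder.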

  18. TCS: a new multiple sequence alignment reliability measure to estimate alignment accuracy and improve phylogenetic tree reconstruction.

    Science.gov (United States)

    Chang, Jia-Ming; Di Tommaso, Paolo; Notredame, Cedric

    2014-06-01

    Multiple sequence alignment (MSA) is a key modeling procedure when analyzing biological sequences. Homology and evolutionary modeling are the most common applications of MSAs. Both are known to be sensitive to the underlying MSA accuracy. In this work, we show how this problem can be partly overcome using the transitive consistency score (TCS), an extended version of the T-Coffee scoring scheme. Using this local evaluation function, we show that one can identify the most reliable portions of an MSA, as judged from BAliBASE and PREFAB structure-based reference alignments. We also show how this measure can be used to improve phylogenetic tree reconstruction using both an established simulated data set and a novel empirical yeast data set. For this purpose, we describe a novel lossless alternative to site filtering that involves overweighting the trustworthy columns. Our approach relies on the T-Coffee framework; it uses libraries of pairwise alignments to evaluate any third party MSA. Pairwise projections can be produced using fast or slow methods, thus allowing a trade-off between speed and accuracy. We compared TCS with Heads-or-Tails, GUIDANCE, Gblocks, and trimAl and found it to lead to significantly better estimates of structural accuracy and more accurate phylogenetic trees. The software is available from www.tcoffee.org/Projects/tcs. © The Author 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved.

  19. AST: an automated sequence-sampling method for improving the taxonomic diversity of gene phylogenetic trees.

    Science.gov (United States)

    Zhou, Chan; Mao, Fenglou; Yin, Yanbin; Huang, Jinling; Gogarten, Johann Peter; Xu, Ying

    2014-01-01

    A challenge in phylogenetic inference of gene trees is how to properly sample a large pool of homologous sequences to derive a good representative subset of sequences. Such a need arises in various applications, e.g. when (1) accuracy-oriented phylogenetic reconstruction methods may not be able to deal with a large pool of sequences due to their high demand for computing resources; (2) applications analyzing a collection of gene trees may prefer to use trees with fewer operational taxonomic units (OTUs), for instance for the detection of horizontal gene transfer events by identifying phylogenetic conflicts; and (3) the pool of available sequences is biased towards extensively studied species. In the past, the creation of subsamples often relied on manual selection. Here we present an Automated sequence-Sampling method for improving the Taxonomic diversity of gene phylogenetic trees, AST, to obtain representative sequences that maximize the taxonomic diversity of the sampled sequences. To demonstrate the effectiveness of AST, we applied it to four problems, namely, inference of the evolutionary histories of the small ribosomal subunit protein S5 of E. coli, 16S ribosomal RNAs and glycosyl-transferase gene family 8, and a study of ancient horizontal gene transfers from bacteria to plants. Our results show that the resolution of our computational results is almost as good as that of manual inference by domain experts, hence making the tool generally useful for phylogenetic studies by non-phylogeny specialists. The program is available at http://csbl.bmb.uga.edu/~zhouchan/AST.php.

  20. [Analysis of phylogenetic criteria for estimation of the rank of taxa in methane-oxidizing bacteria].

    Science.gov (United States)

    Romanovskaia, V A; Rokitko, P V

    2011-01-01

    To assess whether phylogenetic criteria can be used to estimate the rank of taxa, we estimated the intraspecies, interspecies, and intergeneric relatedness of methanotrophs on the basis of 16S rRNA gene sequences. We used 16S rRNA gene sequences of the studied isolates of obligate methanotrophs deposited in the UCM (Ukrainian Collection of Microorganisms) and of type strains of other obligate methanotroph species (from the GenBank database). The levels of interspecies and intergeneric relatedness are shown to differ between families of methanotrophs, and therefore they can be used to differentiate taxa only within a single family. The analysis showed that the taxonomic position of the following should be reconsidered: (1) two phenotypically similar species of Methylomonas (M. aurantiaca and M. fodinarum), whose 16S rRNA genes are 99.4% similar and whose total DNA is up to 80% similar, which points to strain-level rather than species-level differences; (2) the species Methylomicrobium agile and M. album, which are phylogenetically closer to the genus Methylobacter (97% affinity) than to Methylomicrobium (94% affinity); (3) the genera of the family Beijerinckiaceae (Methylocella and Methylocapsa) and the genera of the family Methylocystaceae (Methylosinus and Methylocystis), because the high level of relatedness (97% and more) of these bacteria to other methanotrophic genera within the same family practically corresponds to the range of relatedness of species (within some genera) in the family Methylococcaceae. In determining phylogenetic criteria that can characterize the ranks of taxa, we found that the levels of interspecies relatedness of methanotrophic genera of the families Methylocystaceae and Beijerinckiaceae (97.8-99.1% and 97.8%, respectively) considerably exceed the level of genera formation in the family Methylococcaceae (94.0-98.2%) and, moreover, approach the value of

  1. Encoding phylogenetic trees in terms of weighted quartets.

    Science.gov (United States)

    Grünewald, Stefan; Huber, Katharina T; Moulton, Vincent; Semple, Charles

    2008-04-01

    One of the main problems in phylogenetics is to develop systematic methods for constructing evolutionary or phylogenetic trees. For a set of species X, an edge-weighted phylogenetic X-tree or phylogenetic tree is a (graph theoretical) tree with leaf set X and no degree 2 vertices, together with a map assigning a non-negative length to each edge of the tree. Within phylogenetics, several methods have been proposed for constructing such trees that work by trying to piece together quartet trees on X, i.e. phylogenetic trees each having four leaves in X. Hence, it is of interest to characterise when a collection of quartet trees corresponds to a (unique) phylogenetic tree. Recently, Dress and Erdös provided such a characterisation for binary phylogenetic trees, that is, phylogenetic trees all of whose internal vertices have degree 3. Here we provide a new characterisation for arbitrary phylogenetic trees.
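
    To make the link between an edge-weighted tree and its quartets concrete, the sketch below derives the quartet induced on every four leaves from pairwise leaf distances using the four-point condition: of the three pairings, the one with the smallest distance sum gives the split. It assumes that a quartet's weight is the length of the interior path separating the two pairs, which the abstract does not state; the adjacency-dict tree format and the function names are likewise hypothetical.

```python
from itertools import combinations


def leaf_distances(tree):
    """Path lengths between all pairs of leaves of an edge-weighted tree.

    tree: dict mapping node -> {neighbour: edge length} (undirected).
    """
    leaves = sorted(n for n in tree if len(tree[n]) == 1)
    dist = {}
    for src in leaves:
        stack = [(src, None, 0.0)]
        while stack:
            node, parent, d = stack.pop()
            if node != src and len(tree[node]) == 1:
                dist[src, node] = d
            for nbr, w in tree[node].items():
                if nbr != parent:
                    stack.append((nbr, node, d + w))
    return leaves, dist


def weighted_quartets(tree, eps=1e-12):
    """Map each resolved quartet split ((a, b), (c, d)) to its interior length."""
    leaves, d = leaf_distances(tree)
    quartets = {}
    for a, b, c, e in combinations(leaves, 4):
        sums = {
            ((a, b), (c, e)): d[a, b] + d[c, e],
            ((a, c), (b, e)): d[a, c] + d[b, e],
            ((a, e), (b, c)): d[a, e] + d[b, c],
        }
        split = min(sums, key=sums.get)
        weight = (max(sums.values()) - sums[split]) / 2
        if weight > eps:  # weight 0 means the four leaves form an unresolved star
            quartets[split] = weight
    return quartets


# Toy tree: leaves a, b attach to u; leaves c, d attach to v; u-v has length 3.
toy = {
    "a": {"u": 1.0}, "b": {"u": 2.0}, "c": {"v": 1.0}, "d": {"v": 1.0},
    "u": {"a": 1.0, "b": 2.0, "v": 3.0},
    "v": {"c": 1.0, "d": 1.0, "u": 3.0},
}
toy_quartets = weighted_quartets(toy)
```

    Trees with vertices of degree greater than 3 yield fewer resolved quartets, which is why the function skips zero-weight cases instead of reporting an arbitrary split.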

  2. The phylogenetic distribution of extrafloral nectaries in plants.

    Science.gov (United States)

    Weber, Marjorie G; Keeler, Kathleen H

    2013-06-01

    Understanding the evolutionary patterns of ecologically relevant traits is a central goal in plant biology. However, for most important traits, we lack the comprehensive understanding of their taxonomic distribution needed to evaluate their evolutionary mode and tempo across the tree of life. Here we evaluate the broad phylogenetic patterns of a common plant-defence trait found across vascular plants: extrafloral nectaries (EFNs), plant glands that secrete nectar and are located outside the flower. EFNs typically defend plants indirectly by attracting invertebrate predators who reduce herbivory. Records of EFNs published over the last 135 years were compiled. After accounting for changes in taxonomy, phylogenetic comparative methods were used to evaluate patterns of EFN evolution, using a phylogeny of over 55 000 species of vascular plants. Using comparisons of parametric and non-parametric models, the true number of species with EFNs likely to exist beyond the current list was estimated. To date, EFNs have been reported in 3941 species representing 745 genera in 108 families, about 1-2 % of vascular plant species and approx. 21 % of families. They are found in 33 of 65 angiosperm orders. Foliar nectaries are known in four of 36 fern families. Extrafloral nectaries are unknown in early angiosperms, magnoliids and gymnosperms. They occur throughout monocotyledons, yet most EFNs are found within eudicots, with the bulk of species with EFNs being rosids. Phylogenetic analyses strongly support the repeated gain and loss of EFNs across plant clades, especially in more derived dicot families, and suggest that EFNs are found in a minimum of 457 independent lineages. However, model selection methods estimate that the number of unreported cases of EFNs may be as high as the number of species already reported. EFNs are widespread and evolutionarily labile traits that have repeatedly evolved a remarkable number of times in vascular plants. Our current understanding of the

  3. pplacer: linear time maximum-likelihood and Bayesian phylogenetic placement of sequences onto a fixed reference tree

    Directory of Open Access Journals (Sweden)

    Kodner Robin B

    2010-10-01

    Background: Likelihood-based phylogenetic inference is generally considered to be the most reliable classification method for unknown sequences. However, traditional likelihood-based phylogenetic methods cannot be applied to large volumes of short reads from next-generation sequencing due to computational complexity issues and lack of phylogenetic signal. "Phylogenetic placement," where a reference tree is fixed and the unknown query sequences are placed onto the tree via a reference alignment, is a way to bring the inferential power offered by likelihood-based approaches to large data sets. Results: This paper introduces pplacer, a software package for phylogenetic placement and subsequent visualization. The algorithm can place twenty thousand short reads on a reference tree of one thousand taxa per hour per processor, has essentially linear time and memory complexity in the number of reference taxa, and is easy to run in parallel. Pplacer features calculation of the posterior probability of a placement on an edge, which is a statistically rigorous way of quantifying uncertainty on an edge-by-edge basis. It also can inform the user of the positional uncertainty for query sequences by calculating expected distance between placement locations, which is crucial in the estimation of uncertainty with a well-sampled reference tree. The software provides visualizations using branch thickness and color to represent number of placements and their uncertainty. A simulation study using reads generated from 631 COG alignments shows a high level of accuracy for phylogenetic placement over a wide range of alignment diversity, and the power of edge uncertainty estimates to measure placement confidence. Conclusions: Pplacer enables efficient phylogenetic placement and subsequent visualization, making likelihood-based phylogenetics methodology practical for large collections of reads; it is freely available as source code, binaries, and a web service.

  4. Minimum variance rooting of phylogenetic trees and implications for species tree reconstruction.

    Science.gov (United States)

    Mai, Uyen; Sayyari, Erfan; Mirarab, Siavash

    2017-01-01

    Phylogenetic trees inferred using commonly used models of sequence evolution are unrooted, but the root position matters both for interpretation and downstream applications. This issue has been long recognized; however, whether the potential for discordance between the species tree and gene trees impacts methods of rooting a phylogenetic tree has not been extensively studied. In this paper, we introduce a new method of rooting a tree based on its branch length distribution; our method, which minimizes the variance of root-to-tip distances, is inspired by the traditional midpoint rerooting and is justified when deviations from the strict molecular clock are random. Like midpoint rerooting, the method can be implemented in a linear-time algorithm. In extensive simulations that consider discordance between gene trees and the species tree, we show that the new method is more accurate than midpoint rerooting, but its relative accuracy compared to using outgroups to root gene trees depends on the size of the dataset and levels of deviations from the strict clock. We show high levels of error for all methods of rooting estimated gene trees due to factors that include effects of gene tree discordance, deviations from the clock, and gene tree estimation error. Our simulations, however, did not reveal significant differences between two equivalent methods for species tree estimation that use rooted and unrooted input, namely, STAR and NJst. Nevertheless, our results point to limitations of existing scalable rooting methods.
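
    The criterion named in this abstract, minimizing the variance of root-to-tip distances, can be sketched by brute force. The code below is not the authors' linear-time algorithm: it visits every edge, recomputes tip distances from scratch (quadratic time overall), and uses the fact that the variance is a quadratic function of the root position along an edge, so its minimizer has a closed form that is then clamped to the edge. The adjacency-dict tree format and all function names are hypothetical.

```python
from statistics import pvariance


def tip_distances(tree, start, blocked):
    """Distances from `start` to every tip reachable without entering `blocked`."""
    out, stack = [], [(start, blocked, 0.0)]
    while stack:
        node, parent, d = stack.pop()
        onward = [n for n in tree[node] if n != parent]
        if not onward:          # no way onward: this node is a tip
            out.append(d)
        for n in onward:
            stack.append((n, node, d + tree[node][n]))
    return out


def min_variance_root(tree):
    """Return (variance, (u, v), x): root on edge u-v at distance x from u."""
    best = None
    for u in tree:
        for v, length in tree[u].items():
            if not u < v:       # visit each undirected edge once
                continue
            side_u = tip_distances(tree, u, v)
            side_v = tip_distances(tree, v, u)
            # Root-to-tip distance of tip i at position x is c[i] + s[i] * x.
            c = side_u + [d + length for d in side_v]
            s = [1.0] * len(side_u) + [-1.0] * len(side_v)
            n = len(c)
            mean_c, mean_s = sum(c) / n, sum(s) / n
            cov = sum(ci * si for ci, si in zip(c, s)) / n - mean_c * mean_s
            x = -cov / (1.0 - mean_s ** 2)      # unconstrained minimizer
            x = min(max(x, 0.0), length)        # keep the root on the edge
            var = pvariance([ci + si * x for ci, si in zip(c, s)])
            if best is None or var < best[0]:
                best = (var, (u, v), x)
    return best


# Clock-like toy tree: tips A, B attach to X; tips C, D attach to Y.
toy = {
    "A": {"X": 1.0}, "B": {"X": 1.0}, "C": {"Y": 1.0}, "D": {"Y": 1.0},
    "X": {"A": 1.0, "B": 1.0, "Y": 2.0},
    "Y": {"C": 1.0, "D": 1.0, "X": 2.0},
}
variance, edge, offset = min_variance_root(toy)
```

    On this toy tree the root lands in the middle of the X-Y edge, where all four tips are equidistant. Node names must be mutually comparable (for example, all strings) for the edge de-duplication to work.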

  5. Phylogenetic framework for coevolutionary studies: a compass for exploring jungles of tangled trees.

    Science.gov (United States)

    Martínez-Aquino, Andrés

    2016-08-01

    Phylogenetics is used to detect past evolutionary events, from how species originated to how their ecological interactions with other species arose, which can mirror cophylogenetic patterns. Cophylogenetic reconstructions uncover past ecological relationships between taxa through inferred coevolutionary events on trees, for example, codivergence, duplication, host-switching, and loss. These events can be detected by cophylogenetic analyses based on nodes and the length and branching pattern of the phylogenetic trees of symbiotic associations, for example, host-parasite. In the past 2 decades, algorithms have been developed for cophylogenetic analyses and implemented in different software, for example, statistical congruence index and event-based methods. Based on the combination of these approaches, it is possible to integrate temporal information into cophylogenetic inference, such as estimates of lineage divergence times between 2 taxa, for example, hosts and parasites. Additionally, advances in phylogenetic biogeography that apply methods based on parametric process models and combined Bayesian approaches can be useful for interpreting coevolutionary histories in a scenario of biogeographical area connectivity through time. This article briefly reviews the basics of parasitology and provides an overview of software packages in cophylogenetic methods. Thus, the objective here is to present a phylogenetic framework for coevolutionary studies, with special emphasis on groups of parasitic organisms. Researchers wishing to undertake phylogeny-based coevolutionary studies can use this review as a "compass" when "walking" through jungles of tangled phylogenetic trees.

  6. The space of ultrametric phylogenetic trees.

    Science.gov (United States)

    Gavryushkin, Alex; Drummond, Alexei J

    2016-08-21

    The reliability of a phylogenetic inference method from genomic sequence data is ensured by its statistical consistency. Bayesian inference methods produce a sample of phylogenetic trees from the posterior distribution given sequence data. Hence the question of statistical consistency of such methods is equivalent to the consistency of the summary of the sample. More generally, statistical consistency is ensured by the tree space used to analyse the sample. In this paper, we consider two standard parameterisations of phylogenetic time-trees used in evolutionary models: inter-coalescent interval lengths and absolute times of divergence events. For each of these parameterisations we introduce a natural metric space on ultrametric phylogenetic trees. We compare the introduced spaces with existing models of tree space and formulate several formal requirements that a metric space on phylogenetic trees must possess in order to be a satisfactory space for statistical analysis, and justify them. We show that only a few known constructions of the space of phylogenetic trees satisfy these requirements. However, our results suggest that these basic requirements are not enough to distinguish between the two metric spaces we introduce and that the choice between metric spaces requires additional properties to be considered. In particular, the summary tree minimising the squared distance to the trees from the sample might be different for different parameterisations. This suggests that further fundamental insight is needed into the problem of statistical consistency of phylogenetic inference methods. Copyright © 2016 The Authors. Published by Elsevier Ltd. All rights reserved.

  7. Improved Maximum Parsimony Models for Phylogenetic Networks.

    Science.gov (United States)

    Van Iersel, Leo; Jones, Mark; Scornavacca, Celine

    2018-05-01

    Phylogenetic networks are well suited to represent evolutionary histories comprising reticulate evolution. Several methods aiming at reconstructing explicit phylogenetic networks have been developed in the last two decades. In this article, we propose a new definition of maximum parsimony for phylogenetic networks that makes it possible to model biological scenarios that cannot be modeled by the definitions currently present in the literature (namely, the "hardwired" and "softwired" parsimony). Building on this new definition, we provide several algorithmic results that lay the foundations for new parsimony-based methods for phylogenetic network reconstruction.

  8. Statistically Consistent k-mer Methods for Phylogenetic Tree Reconstruction.

    Science.gov (United States)

    Allman, Elizabeth S; Rhodes, John A; Sullivant, Seth

    2017-02-01

    Frequencies of k-mers in sequences are sometimes used as a basis for inferring phylogenetic trees without first obtaining a multiple sequence alignment. We show that a standard approach of using the squared Euclidean distance between k-mer vectors to approximate a tree metric can be statistically inconsistent. To remedy this, we derive model-based distance corrections for orthologous sequences without gaps, which lead to consistent tree inference. The identifiability of model parameters from k-mer frequencies is also studied. Finally, we report simulations showing that the corrected distance outperforms many other k-mer methods, even when sequences are generated with an insertion and deletion process. These results have implications for multiple sequence alignment as well since k-mer methods are usually the first step in constructing a guide tree for such algorithms.
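
    For orientation, the uncorrected quantity that this abstract starts from, the squared Euclidean distance between k-mer frequency vectors, can be computed as in the sketch below. This is the standard approach the authors show can be statistically inconsistent; their model-based correction is not reproduced here. The function names and the DNA alphabet default are assumptions.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(seq, k, alphabet="ACGT"):
    """Relative frequency of every possible k-mer over `alphabet` in `seq`.

    Windows containing characters outside `alphabet` count toward the total
    but have no slot of their own in the returned vector.
    """
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return [counts["".join(p)] / total for p in product(alphabet, repeat=k)]


def squared_euclidean(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(u, v))


# Made-up sequences, used only to exercise the functions.
same = squared_euclidean(kmer_frequencies("ACGTACGT", 2),
                         kmer_frequencies("ACGTACGT", 2))
far = squared_euclidean(kmer_frequencies("AAAA", 1),
                        kmer_frequencies("CCCC", 1))
```

    A matrix of such pairwise values could be passed to any distance-based tree builder, which is exactly the setting in which the abstract reports inconsistency without a correction.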

  10. Phenotypic and phylogenetic identification of coliform bacteria obtained from 12 USEPA approved coliform methods

    KAUST Repository

    Zhang, Ya; Hong, Pei-Ying; LeChevallier, Mark W.; Liu, Wen-Tso

    2015-01-01

    The current definition of coliform bacteria is method-dependent, and when different culture-based methods are used, discrepancies in results can occur and affect the accuracy in identifying true coliforms. This study used an alternative approach to identify true coliforms by combining the phenotypic traits of the coliform isolates and the phylogenetic affiliation of 16S rRNA gene sequences together with the use of lacZ and uidA genes. A collection of 1404 isolates from 12 US Environmental Protection Agency approved coliform-testing methods were characterized based on their phylogenetic affiliations and responses to their original isolation medium and Lauryl Tryptose broth, m-Endo and MI agar media. Isolates were phylogenetically classified into 32 true coliform or targeted Enterobacteriaceae (TE) groups, and 14 non-coliform or non-targeted Enterobacteriaceae (NTE) groups. It was statistically shown that detecting true-positive (TP) events is more challenging than detecting true-negative (TN) events. Furthermore, most false-negative (FN) events were associated with four TE groups (i.e., Serratia group I, Providencia, Proteus, and Morganella), and most false-positive (FP) events with two NTE groups, Aeromonas and Plesiomonas. In Escherichia coli testing, 18 out of 145 E. coli isolates identified by those enzymatic methods were validated as FNs. The reasons behind the FP and FN reactions could be explained through the analysis of the lacZ and uidA genes. Overall, combining the analyses of 16S rRNA, lacZ and uidA genes with the growth responses of TE and NTE on culture-based media is an effective way to evaluate the performance of coliform detection methods.

  11. Some limitations of public sequence data for phylogenetic inference (in plants).

    Science.gov (United States)

    Hinchliff, Cody E; Smith, Stephen Andrew

    2014-01-01

    The GenBank database contains essentially all of the nucleotide sequence data generated for published molecular systematic studies, but for the majority of taxa these data remain sparse. GenBank has value for phylogenetic methods that leverage data-mining and rapidly improving computational methods, but the limits imposed by the sparse structure of the data are not well understood. Here we present a tree representing 13,093 land plant genera--an estimated 80% of extant plant diversity--to illustrate the potential of public sequence data for broad phylogenetic inference in plants, and we explore the limits to inference imposed by the structure of these data using theoretical foundations from phylogenetic data decisiveness. We find that despite very high levels of missing data (over 96%), the present data retain the potential to inform over 86.3% of all possible phylogenetic relationships. Most of these relationships, however, are informed by small amounts of data--approximately half are informed by fewer than four loci, and more than 99% are informed by fewer than fifteen. We also apply an information theoretic measure of branch support to assess the strength of phylogenetic signal in the data, revealing many poorly supported branches concentrated near the tips of the tree, where data are sparse and the limiting effects of this sparseness are stronger. We argue that limits to phylogenetic inference and signal imposed by low data coverage may pose significant challenges for comprehensive phylogenetic inference at the species level. Computational requirements provide additional limits for large reconstructions, but these may be overcome by methodological advances, whereas insufficient data coverage can only be remedied by additional sampling effort. We conclude that public databases have exceptional value for modern systematics and evolutionary biology, and that a continued emphasis on expanding taxonomic and genomic coverage will play a critical role in developing

  12. Phylogenetic Trees From Sequences

    Science.gov (United States)

    Ryvkin, Paul; Wang, Li-San

    In this chapter, we review important concepts and approaches for phylogeny reconstruction from sequence data. We first cover some basic definitions and properties of phylogenetics, and briefly explain how scientists model sequence evolution and measure sequence divergence. We then discuss three major approaches for phylogenetic reconstruction: distance-based phylogenetic reconstruction, maximum parsimony, and maximum likelihood. In the third part of the chapter, we review how multiple phylogenies are compared by consensus methods and how to assess confidence using bootstrapping. At the end of the chapter are two sections that list popular software packages and additional reading.

  13. Ultrafast Approximation for Phylogenetic Bootstrap

    NARCIS (Netherlands)

    Bui Quang Minh; Nguyen, Thi; von Haeseler, Arndt

    Nonparametric bootstrap has been a widely used tool in phylogenetic analysis to assess the clade support of phylogenetic trees. However, with the rapidly growing amount of data, this task remains a computational bottleneck. Recently, approximation methods such as the RAxML rapid bootstrap (RBS) and

  14. Molecular Phylogenetics: Mathematical Framework and Unsolved Problems

    Science.gov (United States)

    Xia, Xuhua

    Phylogenetic relationships are essential in dating evolutionary events, reconstructing ancestral genes, predicting sites that are important to natural selection, and, ultimately, understanding genomic evolution. Three categories of phylogenetic methods are currently used: the distance-based, the maximum parsimony, and the maximum likelihood methods. Here, I present the mathematical framework of these methods and their rationales, provide computational details for each of them, illustrate analytically and numerically the potential biases inherent in these methods, and outline computational challenges and unresolved problems. This is followed by a brief discussion of the Bayesian approach that has been recently used in molecular phylogenetics.

  15. Big and slow: phylogenetic estimates of molecular evolution in baleen whales (suborder mysticeti).

    Science.gov (United States)

    Jackson, J A; Baker, C S; Vant, M; Steel, D J; Medrano-González, L; Palumbi, S R

    2009-11-01

    Baleen whales are the largest animals that have ever lived. To develop an improved estimation of substitution rate for nuclear and mitochondrial DNA for this taxon, we implemented a relaxed-clock phylogenetic approach using three fossil calibration dates: the divergence between odontocetes and mysticetes approximately 34 million years ago (Ma), between the balaenids and balaenopterids approximately 28 Ma, and the time to most recent common ancestor within the Balaenopteridae approximately 12 Ma. We examined seven mitochondrial genomes, a large number of mitochondrial control region sequences (219 haplotypes for 465 bp) and nine nuclear introns representing five species of whales, within which multiple species-specific alleles were sequenced to account for within-species diversity (1-15 for each locus). The total data set represents >1.65 Mbp of mitogenome and nuclear genomic sequence. The estimated substitution rate for the humpback whale control region (3.9%/million years, My) was higher than previous estimates for baleen whales but slow relative to other mammal species with similar generation times (e.g., human-chimp mean rate > 20%/My). The mitogenomic third codon position rate was also slow relative to other mammals (mean estimate 1%/My compared with a mammalian average of 9.8%/My for the cytochrome b gene). The mean nuclear genomic substitution rate (0.05%/My) was substantially slower than average synonymous estimates for other mammals (0.21-0.37%/My across a range of studies). The nuclear and mitogenome rate estimates for baleen whales were thus roughly consistent with an 8- to 10-fold slowing due to a combination of large body size and long generation times. Surprisingly, despite the large data set of nuclear intron sequences, there was only weak and conflicting support for alternate hypotheses about the phylogeny of balaenopterid whales, suggesting that interspecies introgressions or a rapid radiation has obscured species relationships in the nuclear genome.

  17. Towards a formal genealogical classification of the Lezgian languages (North Caucasus): testing various phylogenetic methods on lexical data.

    Science.gov (United States)

    Kassian, Alexei

    2015-01-01

    A lexicostatistical classification is proposed for 20 languages and dialects of the Lezgian group of the North Caucasian family, based on meticulously compiled 110-item wordlists, published as part of the Global Lexicostatistical Database project. The lexical data have been subsequently analyzed with the aid of the principal phylogenetic methods, both distance-based and character-based: Starling neighbor joining (StarlingNJ), Neighbor joining (NJ), Unweighted pair group method with arithmetic mean (UPGMA), Bayesian Markov chain Monte Carlo (MCMC), Unweighted maximum parsimony (UMP). Cognation indexes within the input matrix were marked by two different algorithms: traditional etymological approach and phonetic similarity, i.e., the automatic method of consonant classes (Levenshtein distances). For several reasons (above all, the high lexicographic quality of the wordlists and a consensus about the Lezgian phylogeny among Caucasologists), the Lezgian database is a perfect testing area for appraisal of phylogenetic methods. For the etymology-based input matrix, all the phylogenetic methods, with the possible exception of UMP, have yielded trees that are sufficiently compatible with each other to generate a consensus phylogenetic tree of the Lezgian lects. The obtained consensus tree agrees with the traditional expert classification as well as some of the previously proposed formal classifications of this linguistic group. Contrary to theoretical expectations, the UMP method has suggested the least plausible tree of all. In the case of the phonetic similarity-based input matrix, the distance-based methods (StarlingNJ, NJ, UPGMA) have produced trees that are rather close to the consensus etymology-based tree and the traditional expert classification, whereas the character-based methods (Bayesian MCMC, UMP) have yielded less likely topologies.
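
    Of the distance-based methods listed in this abstract, UPGMA is the simplest to sketch. The example below is a generic textbook UPGMA that returns only the nested-tuple topology (no branch lengths); it is not the Starling implementation or the study's pipeline, and the lect names and distances are invented placeholders, not Lezgian data.

```python
from itertools import combinations


def upgma(names, dist):
    """Cluster `names` by UPGMA and return the topology as nested tuples.

    dist: dict mapping frozenset({a, b}) -> distance between two names.
    """
    size = {name: 1 for name in names}   # cluster label -> number of members
    d = dict(dist)
    while len(size) > 1:
        # Merge the closest pair of current clusters.
        a, b = min(combinations(size, 2), key=lambda p: d[frozenset(p)])
        merged = (a, b)
        for c in size:
            if c not in (a, b):
                # Size-weighted average keeps every original item equally weighted.
                d[frozenset((merged, c))] = (
                    size[a] * d[frozenset((a, c))]
                    + size[b] * d[frozenset((b, c))]
                ) / (size[a] + size[b])
        size[merged] = size.pop(a) + size.pop(b)
    return next(iter(size))


# Hypothetical lects with made-up lexicostatistical distances
# (for instance, 1 minus the share of cognate wordlist items).
lects = ["L1", "L2", "L3", "L4"]
pairs = {
    frozenset(("L1", "L2")): 0.1,
    frozenset(("L3", "L4")): 0.2,
    frozenset(("L1", "L3")): 0.6,
    frozenset(("L1", "L4")): 0.6,
    frozenset(("L2", "L3")): 0.6,
    frozenset(("L2", "L4")): 0.6,
}
topology = upgma(lects, pairs)
```

    UPGMA assumes a roughly constant rate of change across lineages, which is one reason studies such as this one compare it against neighbor joining and character-based methods.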

  18. Philosophy and phylogenetic inference: a comparison of likelihood and parsimony methods in the context of Karl Popper's writings on corroboration.

    Science.gov (United States)

    de Queiroz, K; Poe, S

    2001-06-01

    Advocates of cladistic parsimony methods have invoked the philosophy of Karl Popper in an attempt to argue for the superiority of those methods over phylogenetic methods based on Ronald Fisher's statistical principle of likelihood. We argue that the concept of likelihood in general, and its application to problems of phylogenetic inference in particular, are highly compatible with Popper's philosophy. Examination of Popper's writings reveals that his concept of corroboration is, in fact, based on likelihood. Moreover, because probabilistic assumptions are necessary for calculating the probabilities that define Popper's corroboration, likelihood methods of phylogenetic inference--with their explicit probabilistic basis--are easily reconciled with his concept. In contrast, cladistic parsimony methods, at least as described by certain advocates of those methods, are less easily reconciled with Popper's concept of corroboration. If those methods are interpreted as lacking probabilistic assumptions, then they are incompatible with corroboration. Conversely, if parsimony methods are to be considered compatible with corroboration, then they must be interpreted as carrying implicit probabilistic assumptions. Thus, the non-probabilistic interpretation of cladistic parsimony favored by some advocates of those methods is contradicted by an attempt by the same authors to justify parsimony methods in terms of Popper's concept of corroboration. In addition to being compatible with Popperian corroboration, the likelihood approach to phylogenetic inference permits researchers to test the assumptions of their analytical methods (models) in a way that is consistent with Popper's ideas about the provisional nature of background knowledge.

  19. Rooting phylogenetic trees under the coalescent model using site pattern probabilities.

    Science.gov (United States)

    Tian, Yuan; Kubatko, Laura

    2017-12-19

    Phylogenetic tree inference is a fundamental tool to estimate ancestor-descendant relationships among different species. In phylogenetic studies, identification of the root - the most recent common ancestor of all sampled organisms - is essential for complete understanding of the evolutionary relationships. Rooted trees benefit most downstream applications of phylogenies, such as species classification or the study of adaptation. Often, trees can be rooted by using outgroups, which are species that are known to be more distantly related to the sampled organisms than any other species in the phylogeny. However, outgroups are not always available in evolutionary research. In this study, we develop a new method for rooting species trees under the coalescent model, by developing a series of hypothesis tests for rooting quartet phylogenies using site pattern probabilities. The power of this method is examined by simulation studies and by application to an empirical North American rattlesnake data set. The method shows high accuracy across the simulation conditions considered, and performs well for the rattlesnake data. Thus, it provides a computationally efficient way to accurately root species-level phylogenies that incorporates the coalescent process. The method is robust to variation in substitution model, but is sensitive to the assumption of a molecular clock. Our study establishes a computationally practical method for rooting species trees that is more efficient than traditional methods. The method will benefit numerous evolutionary studies that require rooting a phylogenetic tree without having to specify outgroups.

  20. Functional and phylogenetic ecology in R

    CERN Document Server

    Swenson, Nathan G

    2014-01-01

    Functional and Phylogenetic Ecology in R is designed to teach readers to use R for phylogenetic and functional trait analyses. Over the past decade, a dizzying array of tools and methods were generated to incorporate phylogenetic and functional information into traditional ecological analyses. Increasingly these tools are implemented in R, thus greatly expanding their impact. Researchers getting started in R can use this volume as a step-by-step entryway into phylogenetic and functional analyses for ecology in R. More advanced users will be able to use this volume as a quick reference to understand particular analyses. The volume begins with an introduction to the R environment and handling relevant data in R. Chapters then cover phylogenetic and functional metrics of biodiversity; null modeling and randomizations for phylogenetic and functional trait analyses; integrating phylogenetic and functional trait information; and interfacing the R environment with a popular C-based program. This book presents a uni...

  1. Independent contrasts and PGLS regression estimators are equivalent.

    Science.gov (United States)

    Blomberg, Simon P; Lefevre, James G; Wells, Jessie A; Waterhouse, Mary

    2012-05-01

    We prove that the slope parameter of the ordinary least squares regression of phylogenetically independent contrasts (PICs) conducted through the origin is identical to the slope parameter of the method of generalized least squares (GLSs) regression under a Brownian motion model of evolution. This equivalence has several implications: 1. Understanding the structure of the linear model for GLS regression provides insight into when and why phylogeny is important in comparative studies. 2. The limitations of the PIC regression analysis are the same as the limitations of the GLS model. In particular, phylogenetic covariance applies only to the response variable in the regression and the explanatory variable should be regarded as fixed. Calculation of PICs for explanatory variables should be treated as a mathematical idiosyncrasy of the PIC regression algorithm. 3. Since the GLS estimator is the best linear unbiased estimator (BLUE), the slope parameter estimated using PICs is also BLUE. 4. If the slope is estimated using different branch lengths for the explanatory and response variables in the PIC algorithm, the estimator is no longer the BLUE, so this is not recommended. Finally, we discuss whether or not and how to accommodate phylogenetic covariance in regression analyses, particularly in relation to the problem of phylogenetic uncertainty. This discussion is from both frequentist and Bayesian perspectives.
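
    The equivalence stated in this abstract can be checked numerically. The following Python sketch is only an illustration under stated assumptions: the four-taxon tree, its branch lengths and the trait values are hypothetical, and the helper names (contrasts, fill_covariance) are not taken from the paper. It computes the slope from contrasts regressed through the origin and the GLS slope under a Brownian motion covariance matrix built from the same tree.

```python
import numpy as np

# Hypothetical tree. A tip is (name, branch_length); an internal node is
# ((left_child, right_child), branch_length). The root has length 0.
AB = ((("A", 1.0), ("B", 2.0)), 0.5)
CD = ((("C", 1.5), ("D", 0.5)), 1.0)
ROOT = ((AB, CD), 0.0)
NAMES = ["A", "B", "C", "D"]

def contrasts(node, trait, out):
    """Felsenstein's algorithm: append standardized contrasts to `out` and
    return (node value, branch length lengthened for estimation error)."""
    head, length = node
    if isinstance(head, str):
        return trait[head], length
    (x1, v1), (x2, v2) = (contrasts(child, trait, out) for child in head)
    out.append((x1 - x2) / np.sqrt(v1 + v2))
    value = (x1 / v1 + x2 / v2) / (1.0 / v1 + 1.0 / v2)
    return value, length + v1 * v2 / (v1 + v2)

def fill_covariance(node, depth, C, index):
    """Brownian-motion covariance: C[i, j] is the path length from the root
    shared by tips i and j. Returns the tip names below `node`."""
    head, length = node
    depth += length
    if isinstance(head, str):
        C[index[head], index[head]] = depth
        return [head]
    left, right = (fill_covariance(child, depth, C, index) for child in head)
    for a in left:
        for b in right:
            C[index[a], index[b]] = C[index[b], index[a]] = depth
    return left + right

# Hypothetical trait values for the four tips.
x = {"A": 1.0, "B": 2.5, "C": 4.0, "D": 3.0}
y = {"A": 0.8, "B": 2.0, "C": 5.1, "D": 3.3}

cx, cy = [], []
contrasts(ROOT, x, cx)
contrasts(ROOT, y, cy)
cx, cy = np.array(cx), np.array(cy)
pic_slope = (cx @ cy) / (cx @ cx)          # OLS on contrasts, through the origin

index = {name: i for i, name in enumerate(NAMES)}
C = np.zeros((len(NAMES), len(NAMES)))
fill_covariance(ROOT, 0.0, C, index)
X = np.column_stack([np.ones(len(NAMES)), [x[n] for n in NAMES]])
yv = np.array([y[n] for n in NAMES])
C_inv = np.linalg.inv(C)
beta = np.linalg.solve(X.T @ C_inv @ X, X.T @ C_inv @ yv)   # GLS estimator
gls_slope = beta[1]

print(pic_slope, gls_slope)
```

    The two printed slopes should agree up to floating-point error, provided the same branch lengths are used for both variables, in line with point 4 of the abstract.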

  2. Phylogenetic analysis of molecular and morphological data highlights uncertainty in the relationships of fossil and living species of Elopomorpha (Actinopterygii: Teleostei).

    Science.gov (United States)

    Dornburg, Alex; Friedman, Matt; Near, Thomas J

    2015-08-01

    Elopomorpha is one of the three main clades of living teleost fishes and includes a range of disparate lineages including eels, tarpons, bonefishes, and halosaurs. Elopomorphs were among the first groups of fishes investigated using Hennigian phylogenetic methods and continue to be the object of intense phylogenetic scrutiny due to their economic significance, diversity, and crucial evolutionary status as the sister group of all other teleosts. While portions of the phylogenetic backbone for Elopomorpha are consistent between studies, the relationships among Albula, Pterothrissus, Notacanthiformes, and Anguilliformes remain contentious and difficult to evaluate. This lack of phylogenetic resolution is problematic as fossil lineages are often described and placed taxonomically based on an assumed sister group relationship between Albula and Pterothrissus. In addition, phylogenetic studies using morphological data that sample elopomorph fossil lineages often do not include notacanthiform or anguilliform lineages, potentially introducing a bias toward interpreting fossils as members of the common stem of Pterothrissus and Albula. Here we provide a phylogenetic analysis of DNA sequences sampled from multiple nuclear genes that include representative taxa from Albula, Pterothrissus, Notacanthiformes and Anguilliformes. We integrate our molecular dataset with a morphological character matrix that spans both living and fossil elopomorph lineages. Our results reveal substantial uncertainty in the placement of Pterothrissus as well as all sampled fossil lineages, questioning the stability of the taxonomy of fossil Elopomorpha. However, despite topological uncertainty, our integration of fossil lineages into a Bayesian time calibrated framework provides divergence time estimates for the clade that are consistent with previously published age estimates based on the elopomorph fossil record and molecular estimates resulting from traditional node-dating methods.

  3. Molecular Phylogenetics: Concepts for a Newcomer.

    Science.gov (United States)

    Ajawatanawong, Pravech

    Molecular phylogenetics is the study of evolutionary relationships among organisms using molecular sequence data. The aim of this review is to introduce the important terminology and general concepts of tree reconstruction to biologists who lack a strong background in the field of molecular evolution. Some modern phylogenetic programs are easy to use because of their user-friendly interfaces, but understanding the phylogenetic algorithms and substitution models, which are based on advanced statistics, is still important for the analysis and interpretation without a guide. Briefly, there are five general steps in carrying out a phylogenetic analysis: (1) sequence data preparation, (2) sequence alignment, (3) choosing a phylogenetic reconstruction method, (4) identification of the best tree, and (5) evaluating the tree. Concepts in this review enable biologists to grasp the basic ideas behind phylogenetic analysis and also help provide a sound basis for discussions with expert phylogeneticists.

  4. Model checking software for phylogenetic trees using distribution and database methods

    Directory of Open Access Journals (Sweden)

    Requeno José Ignacio

    2013-12-01

    Model checking, a generic and formal paradigm stemming from computer science based on temporal logics, has been proposed for the study of biological properties that emerge from the labeling of the states defined over the phylogenetic tree. This strategy allows us to use generic software tools already present in the industry. However, the performance of traditional model checking is penalized when scaling the system for large phylogenies. To this end, two strategies are presented here. The first one consists of partitioning the phylogenetic tree into a set of subgraphs, each one representing a subproblem to be verified, so as to speed up the computation time and distribute the memory consumption. The second strategy is based on uncoupling the information associated with each state of the phylogenetic tree (mainly, the DNA sequence) and exporting it to an external tool for the management of large information systems. The integration of all these approaches outperforms the results of monolithic model checking and helps us to execute the verification of properties in a real phylogenetic tree.

  5. Undergraduate Students’ Difficulties in Reading and Constructing Phylogenetic Tree

    Science.gov (United States)

    Sa'adah, S.; Tapilouw, F. S.; Hidayat, T.

    2017-02-01

    Representation is a very important tool for communicating scientific concepts. Biologists produce phylogenetic representations to express their understanding of evolutionary relationships. The phylogenetic tree is a visual representation that depicts a hypothesis about evolutionary relationships and is widely used in the biological sciences. The use of phylogenetic trees is currently growing across many disciplines in biology. Consequently, learning about phylogenetic trees has become an important part of biological education and an interesting area for biology education research. However, research has shown that many students struggle to interpret the information that phylogenetic trees depict. The purpose of this study was to investigate undergraduate students’ difficulties in reading and constructing a phylogenetic tree. The study used a descriptive method. We used questionnaires, interviews, multiple-choice and open-ended questions, reflective journals and observations. The findings showed that students experience difficulties, especially in constructing a phylogenetic tree. The students’ responses indicated that the main reason for difficulty in constructing a phylogenetic tree is placing taxa in the tree based on the data provided, so that the constructed tree does not describe the actual evolutionary relationships (incorrect relatedness). Students also have difficulty determining the sister group, character synapomorphies and autapomorphies from the data provided (character table), and comparing phylogenetic trees. According to them, building a phylogenetic tree is more difficult than reading one. The findings of this study provide information to help undergraduate instructors and students overcome learning difficulties in reading and constructing phylogenetic trees.

  6. Toward a phylogenetic chronology of ancient Gaulish, Celtic, and Indo-European.

    Science.gov (United States)

    Forster, Peter; Toth, Alfred

    2003-07-22

    Indo-European is the largest and best-documented language family in the world, yet the reconstruction of the Indo-European tree, first proposed in 1863, has remained controversial. Complications may include ascertainment bias when choosing the linguistic data, and disregard for the wave model of 1872 when attempting to reconstruct the tree. Essentially analogous problems were solved in evolutionary genetics by DNA sequencing and phylogenetic network methods, respectively. We now adapt these tools to linguistics, and analyze Indo-European language data, focusing on Celtic and in particular on the ancient Celtic language of Gaul (modern France), by using bilingual Gaulish-Latin inscriptions. Our phylogenetic network reveals an early split of Celtic within Indo-European. Interestingly, the next branching event separates Gaulish (Continental Celtic) from the British (Insular Celtic) languages, with Insular Celtic subsequently splitting into Brythonic (Welsh, Breton) and Goidelic (Irish and Scottish Gaelic). Taken together, the network thus suggests that the Celtic language arrived in the British Isles as a single wave (and then differentiated locally), rather than in the traditional two-wave scenario ("P-Celtic" to Britain and "Q-Celtic" to Ireland). The phylogenetic network furthermore permits the estimation of time in analogy to genetics, and we obtain tentative dates for Indo-European at 8100 BC +/- 1,900 years, and for the arrival of Celtic in Britain at 3200 BC +/- 1,500 years. The phylogenetic method is easily executed by hand and promises to be an informative approach for many problems in historical linguistics.

  7. Maximum parsimony, substitution model, and probability phylogenetic trees.

    Science.gov (United States)

    Weng, J F; Thomas, D A; Mareels, I

    2011-01-01

    The problem of inferring phylogenies (phylogenetic trees) is one of the main problems in computational biology. There are three main methods for inferring phylogenies: Maximum Parsimony (MP), Distance Matrix (DM) and Maximum Likelihood (ML), of which the MP method is the most well-studied and popular. In the MP method the optimization criterion is the number of nucleotide substitutions, computed from the differences in the investigated nucleotide sequences. However, the MP method is often criticized because it only counts the substitutions observable at the current time, and all the unobservable substitutions that actually occurred in the evolutionary history are omitted. In order to take into account the unobservable substitutions, some substitution models have been established and they are now widely used in the DM and ML methods, but these substitution models cannot be used within the classical MP method. Recently the authors proposed a probability representation model for phylogenetic trees, and the reconstructed trees in this model are called probability phylogenetic trees. One of the advantages of the probability representation model is that it can include a substitution model to infer phylogenetic trees based on the MP principle. In this paper we explain how to use a substitution model in the reconstruction of probability phylogenetic trees and show the advantage of this approach with examples.
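
    The classical MP criterion described in this abstract, the number of substitutions implied by the observed sequences, can be sketched in Python with Fitch's small-parsimony count on a fixed rooted binary tree. This is the classical criterion only, not the authors' probability representation model; the taxon names, the toy alignment and the two topologies are hypothetical.

```python
def fitch(node, states):
    """Return (candidate state set, minimum substitutions) for one site."""
    if isinstance(node, str):                  # tip: its observed state, no cost
        return {states[node]}, 0
    left, right = node
    set_l, cost_l = fitch(left, states)
    set_r, cost_r = fitch(right, states)
    shared = set_l & set_r
    if shared:                                 # children agree: no extra change
        return shared, cost_l + cost_r
    return set_l | set_r, cost_l + cost_r + 1  # disagreement: one substitution

def parsimony_score(tree, alignment):
    """Sum the per-site Fitch counts over all alignment columns."""
    n_sites = len(next(iter(alignment.values())))
    return sum(
        fitch(tree, {taxon: seq[i] for taxon, seq in alignment.items()})[1]
        for i in range(n_sites)
    )

# Hypothetical toy alignment and two candidate rooted binary topologies.
alignment = {"t1": "ACGT", "t2": "ACGA", "t3": "GCTT", "t4": "GCTA"}
tree_a = (("t1", "t2"), ("t3", "t4"))
tree_b = (("t1", "t3"), ("t2", "t4"))
score_a = parsimony_score(tree_a, alignment)
score_b = parsimony_score(tree_b, alignment)
print(score_a, score_b)
```

    On this toy input tree_a needs 4 substitutions and tree_b needs 5, so MP would prefer tree_a. The count covers only changes implied by differences among the observed tips, which is the limitation the abstract points out.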

  8. Short segment search method for phylogenetic analysis using nested sliding windows

    Science.gov (United States)

    Iskandar, A. A.; Bustamam, A.; Trimarsanto, H.

    2017-10-01

    For maximal accuracy, phylogenetic analysis in bioinformatics requires the coding DNA sequence (CDS) segment. However, analysis of the CDS costs a lot of time and money, so a short segment that is representative of the CDS, such as the envelope protein segment or the non-structural 3 (NS3) segment, is needed. After a sliding window is applied, a better short segment than the envelope protein segment and NS3 is found. This paper discusses a mathematical method that analyzes sequences using nested sliding windows to find a short segment that is representative of the whole genome. The results show that our method can find a short segment that is, in topological terms, about 6.57% more representative of the CDS segment than an envelope segment or NS3 segment.

  9. Undergraduate Students’ Initial Ability in Understanding Phylogenetic Tree

    Science.gov (United States)

    Sa'adah, S.; Hidayat, T.; Sudargo, Fransisca

    2017-04-01

    The phylogenetic tree is a visual representation that depicts a hypothesis about the evolutionary relationships among taxa. Evolutionary experts use this representation to evaluate the evidence for evolution. The use of the phylogenetic tree is currently growing across many disciplines in biology. Consequently, learning about the phylogenetic tree has become an important part of biological education and an interesting area of biology education research. The skill of understanding and reasoning about the phylogenetic tree (called tree thinking) is important for biology students. However, research has shown that many students have difficulty interpreting, constructing, and comparing phylogenetic trees, and hold misconceptions about them. Students are often not taught how to reason about the evolutionary relationships depicted in the diagram. Students are also not provided with information about the underlying theory and process of phylogenetics. This study aims to investigate the initial ability of undergraduate students in understanding and reasoning about the phylogenetic tree. The research method is descriptive. Students were given multiple-choice questions and an essay representing tree-thinking elements. The correct answers were converted to percentages. Each student was also given a questionnaire. The results showed that the undergraduate students’ initial ability in understanding and reasoning about the phylogenetic tree is low. Many students were not able to answer questions about the phylogenetic tree. Only 19% of undergraduate students answered correctly on the indicator for evaluating the evolutionary relationships among taxa, 25% on the indicator for applying the concept of the clade, and 17% on the indicator for determining character evolution, and only a few undergraduate students could construct a phylogenetic tree.

  10. Auto-validating von Neumann rejection sampling from small phylogenetic tree spaces

    Directory of Open Access Journals (Sweden)

    York Thomas

    2009-01-01

    Background: In phylogenetic inference one is interested in obtaining samples from the posterior distribution over the tree space on the basis of some observed DNA sequence data. One of the simplest sampling methods is the rejection sampler due to von Neumann. Here we introduce an auto-validating version of the rejection sampler, via interval analysis, to rigorously draw samples from posterior distributions over small phylogenetic tree spaces. Results: The posterior samples from the auto-validating sampler are used to rigorously (i) estimate posterior probabilities for different rooted topologies based on mitochondrial DNA from human, chimpanzee and gorilla, (ii) conduct a non-parametric test of rate variation between protein-coding and tRNA-coding sites from three primates and (iii) obtain a posterior estimate of the human-neanderthal divergence time. Conclusion: This solves the open problem of rigorously drawing independent and identically distributed samples from the posterior distribution over rooted and unrooted small tree spaces (3 or 4 taxa) based on any multiply-aligned sequence data.
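
    For readers unfamiliar with the von Neumann rejection sampler mentioned in this abstract, here is a minimal Python sketch of the plain (not auto-validating) version. The target density is a hypothetical unnormalised toy function of a single branch length, not the paper's phylogenetic posterior, and the envelope bound is obtained analytically rather than by interval analysis as the authors do.

```python
import math
import random

def target(t):
    """Unnormalised toy density over a branch length t (an assumption)."""
    return t * math.exp(-3.0 * t)

def rejection_sample(n, lo=0.0, hi=5.0, seed=1):
    """Draw n independent samples from `target` restricted to [lo, hi]."""
    rng = random.Random(seed)
    # Envelope height: the analytic maximum of target, attained at t = 1/3.
    bound = 1.0 / (3.0 * math.e)
    samples = []
    while len(samples) < n:
        t = rng.uniform(lo, hi)                # propose from the uniform envelope
        if rng.random() * bound <= target(t):  # accept with probability target/bound
            samples.append(t)
    return samples

samples = rejection_sample(5000)
sample_mean = sum(samples) / len(samples)
print(round(sample_mean, 3))
```

    The accepted draws are independent and identically distributed only if the bound truly dominates the target everywhere on the interval. The toy target is proportional to a Gamma density with shape 2 and rate 3, so the sample mean should be close to 2/3.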

  11. Species divergence and phylogenetic variation of ecophysiological traits in lianas and trees.

    Science.gov (United States)

    Rios, Rodrigo S; Salgado-Luarte, Cristian; Gianoli, Ernesto

    2014-01-01

    The climbing habit is an evolutionary key innovation in plants because it is associated with enhanced clade diversification. We tested whether patterns of species divergence and variation of three ecophysiological traits that are fundamental for plant adaptation to light environments (maximum photosynthetic rate [A(max)], dark respiration rate [R(d)], and specific leaf area [SLA]) are consistent with this key innovation. Using data reported from four tropical forests and three temperate forests, we compared phylogenetic distance among species as well as the evolutionary rate, phylogenetic distance and phylogenetic signal of those traits in lianas and trees. Estimates of evolutionary rates showed that R(d) evolved faster in lianas, while SLA evolved faster in trees. The mean phylogenetic distance was 1.2 times greater among liana species than among tree species. Likewise, estimates of phylogenetic distance indicated that lianas were less related than by chance alone (phylogenetic evenness across 63 species), and trees were more related than expected by chance (phylogenetic clustering across 71 species). Lianas showed evenness for R(d), while trees showed phylogenetic clustering for this trait. In contrast, for SLA, lianas exhibited phylogenetic clustering and trees showed phylogenetic evenness. Lianas and trees showed patterns of ecophysiological trait variation among species that were independent of phylogenetic relatedness. We found support for the expected pattern of greater species divergence in lianas, but did not find consistent patterns regarding ecophysiological trait evolution and divergence. R(d) followed the species-level pattern, i.e., greater divergence/evolution in lianas compared to trees, while the opposite occurred for SLA and no pattern was detected for A(max). R(d) may have driven lianas' divergence across forest environments, and might contribute to diversification in climber clades.

  12. [Phylogenetic analysis of closely related Leuconostoc citreum species based on partial housekeeping genes].

    Science.gov (United States)

    Lv, Qiang; Chen, Ming; Xu, Haiyan; Song, Yuqin; Sun, Zhihong; Dan, Tong; Sun, Tiansong

    2013-07-04

    Using the 16S rRNA, dnaA, murC and pyrG gene sequences, we identified the phylogenetic relationships among closely related Leuconostoc citreum species. Seven Leu. citreum strains originally isolated from sourdough were characterized by PCR amplification of the dnaA, murC and pyrG gene sequences, which were determined to assess their suitability as phylogenetic markers. Then, we estimated the genetic distances and constructed phylogenetic trees for the 16S rRNA gene and the above-mentioned three housekeeping genes, combining them with published corresponding sequences. Comparison of the phylogenetic trees showed that the topologies of the three housekeeping gene trees were consistent with that of the 16S rRNA gene tree. The homology of closely related Leu. citreum species differed among the dnaA, murC, pyrG and 16S rRNA gene sequences, ranging from 75.5% to 97.2%, 50.2% to 99.7%, 65.0% to 99.8% and 98.5% to 100%, respectively. The phylogenetic relationships inferred from the three housekeeping gene sequences were highly consistent with the results from the 16S rRNA gene sequence, while the genetic distances of these housekeeping genes were much higher than that of the 16S rRNA gene. Consequently, the dnaA, murC and pyrG genes are suitable for the classification and identification of closely related Leu. citreum species.

  13. ["Long-branch Attraction" artifact in phylogenetic reconstruction].

    Science.gov (United States)

    Li, Yi-Wei; Yu, Li; Zhang, Ya-Ping

    2007-06-01

    Phylogenetic reconstruction among various organisms not only helps us understand their evolutionary history but also addresses several fundamental evolutionary questions. Understanding the evolutionary relationships among organisms establishes the foundation for investigations in other biological disciplines. However, almost all the widely used phylogenetic methods have limitations and fail to eliminate systematic errors effectively, preventing the reconstruction of true organismal relationships. The "Long-branch Attraction" (LBA) artifact is one of the most disturbing factors in phylogenetic reconstruction. This review summarizes the concept of LBA, the methods for analyzing it and the strategies for avoiding it. In addition, several typical examples are provided, and approaches to avoiding and resolving the LBA artifact are discussed.

  14. Phylogenetic relationship among Kenyan sorghum germplasms ...

    African Journals Online (AJOL)

    Mr Kiboi

    phylogenetic relationships based on 10 DNA fragments at AltSB loci with SbMATE, ORF9 and MITE primers. .... estimate the overall genetic diversity in Kenyan sorghum lines: Cheprot et al. 3529 ..... EARN project and Generation Challenge (GCP), ... genetics and molecular biology of plant aluminum resistance and toxicity.

  15. Ecological toxicity estimation of solid waste products of Tekely Ore Mining and Processing Enterprise of OJSC 'Kaztsink' using biological testing methods

    International Nuclear Information System (INIS)

    Vetrinskaya, N.I.; Goldobina, E.A.; Kosmukhambetov, A.R.; Kulikova, O.V.; Ismailova, Zh.B.; Gurikova, N.D.; Kozlova, N.V.

    2001-01-01

    The results of assessing solid waste products by biological testing methods are examined, using test objects of different phylogenetic development levels (simple aquatic animals, algae, higher aquatic plants). A correlation is found between the lead and zinc content in the leaching extract and the response of all test objects. It is concluded that a comprehensive, rapid and economical analysis using biological testing methods is necessary for monitoring industrial waste products and other man-made pollutants. (author)

  16. Methods for the quantitative comparison of molecular estimates of clade age and the fossil record.

    Science.gov (United States)

    Clarke, Julia A; Boyd, Clint A

    2015-01-01

    Approaches quantifying the relative congruence, or incongruence, of molecular divergence estimates and the fossil record have been limited. Previously proposed methods are largely node specific, assessing incongruence at particular nodes for which both fossil data and molecular divergence estimates are available. These existing metrics, and other methods that quantify incongruence across topologies including entirely extinct clades, have so far not taken into account uncertainty surrounding both the divergence estimates and the ages of fossils. They have also treated molecular divergence estimates younger than previously assessed fossil minimum estimates of clade age as if they were the same as cases in which they were older. However, these cases are not the same. Recovered divergence dates younger than compared oldest known occurrences require prior hypotheses regarding the phylogenetic position of the compared fossil record and standard assumptions about the relative timing of morphological and molecular change to be incorrect. Older molecular dates, by contrast, are consistent with an incomplete fossil record and do not require prior assessments of the fossil record to be unreliable in some way. Here, we compare previous approaches and introduce two new descriptive metrics. Both metrics explicitly incorporate information on uncertainty by utilizing the 95% confidence intervals on estimated divergence dates and data on stratigraphic uncertainty concerning the age of the compared fossils. Metric scores are maximized when these ranges are overlapping. MDI (minimum divergence incongruence) discriminates between situations where molecular estimates are younger or older than known fossils reporting both absolute fit values and a number score for incompatible nodes. DIG range (divergence implied gap range) allows quantification of the minimum increase in implied missing fossil record induced by enforcing a given set of molecular-based estimates. These metrics are used

  17. Phylogenetic networks do not need to be complex: using fewer reticulations to represent conflicting clusters

    NARCIS (Netherlands)

    Iersel, van L.J.J.; Kelk, S.M.; Rupp, R.; Huson, D.H.

    2010-01-01

    Phylogenetic trees are widely used to display estimates of how groups of species have evolved. Each phylogenetic tree can be seen as a collection of clusters, subgroups of the species that evolved from a common ancestor. When phylogenetic trees are obtained for several datasets (e.g. for different

  18. Statistical assignment of DNA sequences using Bayesian phylogenetics

    DEFF Research Database (Denmark)

    Terkelsen, Kasper Munch; Boomsma, Wouter Krogh; Huelsenbeck, John P.

    2008-01-01

    We provide a new automated statistical method for DNA barcoding based on a Bayesian phylogenetic analysis. The method is based on automated database sequence retrieval, alignment, and phylogenetic analysis using a custom-built program for Bayesian phylogenetic analysis. We show on real data that the method outperforms Blast searches as a measure of confidence and can help eliminate 80% of all false assignments based on best Blast hit. However, the most important advance of the method is that it provides statistically meaningful measures of confidence. We apply the method to a re-analysis of previously published ancient DNA data and show that, with high statistical confidence, most of the published sequences are in fact of Neanderthal origin. However, there are several cases of chimeric sequences that are comprised of a combination of both Neanderthal and modern human DNA.

  19. The best of both worlds: Phylogenetic eigenvector regression and mapping

    Directory of Open Access Journals (Sweden)

    José Alexandre Felizola Diniz Filho

    2015-09-01

    Eigenfunction analyses have been widely used to model patterns of autocorrelation in time, space and phylogeny. In a phylogenetic context, Diniz-Filho et al. (1998) proposed what they called Phylogenetic Eigenvector Regression (PVR), in which pairwise phylogenetic distances among species are submitted to a Principal Coordinate Analysis, and eigenvectors are then used as explanatory variables in regression, correlation or ANOVAs. More recently, a new approach called Phylogenetic Eigenvector Mapping (PEM) was proposed, with the main advantage of explicitly incorporating a model-based warping in phylogenetic distance in which an Ornstein-Uhlenbeck (O-U) process is fitted to data before eigenvector extraction. Here we compared PVR and PEM with respect to estimated phylogenetic signal, correlated evolution under alternative evolutionary models and phylogenetic imputation, using simulated data. Despite similarity between the two approaches, PEM has a slightly higher prediction ability and is more general than the original PVR. Even so, in a conceptual sense, PEM may provide a technique in the best of both worlds, combining the flexibility of data-driven and empirical eigenfunction analyses and the sound insights provided by evolutionary models well known in comparative analyses.
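
    The PVR procedure summarised in this abstract (a Principal Coordinate Analysis of pairwise phylogenetic distances, followed by regression of a trait on the leading eigenvectors) can be sketched as follows. This is a minimal illustration, assuming a hypothetical four-species distance matrix and trait vector; the function names pcoa and pvr_r2 are not from the paper, and the O-U warping step of PEM is not included.

```python
import numpy as np

def pcoa(D):
    """Principal Coordinate Analysis: eigen-decompose the double-centred
    matrix -0.5 * J D^2 J and sort eigenvectors by decreasing eigenvalue."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    G = -0.5 * J @ (D ** 2) @ J
    vals, vecs = np.linalg.eigh(G)
    order = np.argsort(vals)[::-1]
    return vals[order], vecs[:, order]

def pvr_r2(D, trait, k):
    """R^2 of an OLS regression of `trait` on the first k phylogenetic
    eigenvectors, read here as a rough measure of phylogenetic signal."""
    _, vecs = pcoa(D)
    X = np.column_stack([np.ones(len(trait)), vecs[:, :k]])
    beta, *_ = np.linalg.lstsq(X, trait, rcond=None)
    resid = trait - X @ beta
    return 1.0 - (resid @ resid) / np.sum((trait - trait.mean()) ** 2)

# Hypothetical pairwise phylogenetic distances: two pairs of sister species.
D = np.array([
    [0.0, 2.0, 6.0, 6.0],
    [2.0, 0.0, 6.0, 6.0],
    [6.0, 6.0, 0.0, 2.0],
    [6.0, 6.0, 2.0, 0.0],
])
trait = np.array([1.0, 1.2, 3.0, 3.1])   # hypothetical trait values
r2 = pvr_r2(D, trait, k=1)
print(round(r2, 3))
```

    In this toy case the first eigenvector separates the two sister pairs, so a single eigenvector already explains most of the trait variance. How many eigenvectors to retain is a modelling decision that the sketch leaves to the caller through k.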

  20. Phylogenetic rooting using minimal ancestor deviation.

    Science.gov (United States)

    Tria, Fernando Domingues Kümmel; Landan, Giddy; Dagan, Tal

    2017-06-19

    Ancestor-descendent relations play a cardinal role in evolutionary theory. Those relations are determined by rooting phylogenetic trees. Existing rooting methods are hampered by evolutionary rate heterogeneity or the unavailability of auxiliary phylogenetic information. Here we present a rooting approach, the minimal ancestor deviation (MAD) method, which accommodates heterotachy by using all pairwise topological and metric information in unrooted trees. We demonstrate the performance of the method, in comparison to existing rooting methods, by the analysis of phylogenies from eukaryotes and prokaryotes. MAD correctly recovers the known root of eukaryotes and uncovers evidence for the origin of cyanobacteria in the ocean. MAD is more robust and consistent than existing methods, provides measures of the root inference quality and is applicable to any tree with branch lengths.

  1. TCS: a web server for multiple sequence alignment evaluation and phylogenetic reconstruction.

    Science.gov (United States)

    Chang, Jia-Ming; Di Tommaso, Paolo; Lefort, Vincent; Gascuel, Olivier; Notredame, Cedric

    2015-07-01

    This article introduces the Transitive Consistency Score (TCS) web server, a service making it possible to estimate the local reliability of protein multiple sequence alignments (MSAs) using the TCS index. The evaluation can be used to identify the aligned positions most likely to contain structurally analogous residues and also most likely to support an accurate phylogenetic reconstruction. The TCS scoring scheme has been shown to be an accurate predictor of structural alignment correctness among commonly used methods. It has also been shown to outperform common filtering schemes like Gblocks or trimAl when doing MSA post-processing prior to phylogenetic tree reconstruction. The web server is available from http://tcoffee.crg.cat/tcs. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  2. Phylogenetic search through partial tree mixing

    Science.gov (United States)

    2012-01-01

    Background: Recent advances in sequencing technology have created large data sets upon which phylogenetic inference can be performed. Current research is limited by the prohibitive time necessary to perform tree search on a reasonable number of individuals. This research develops new phylogenetic algorithms that can operate on tens of thousands of species in a reasonable amount of time through several innovative search techniques. Results: When compared to popular phylogenetic search algorithms, better trees are found much more quickly for large data sets. These algorithms are incorporated in the PSODA application available at http://dna.cs.byu.edu/psoda. Conclusions: The use of Partial Tree Mixing in a partition based tree space allows the algorithm to quickly converge on near optimal tree regions. These regions can then be searched in a methodical way to determine the overall optimal phylogenetic solution. PMID:23320449

  3. Phylogenetic relationships of Hemiptera inferred from mitochondrial and nuclear genes.

    Science.gov (United States)

    Song, Nan; Li, Hu; Cai, Wanzhi; Yan, Fengming; Wang, Jianyun; Song, Fan

    2016-11-01

    Here, we reconstructed the Hemiptera phylogeny based on the expanded mitochondrial protein-coding genes and the nuclear 18S rRNA gene, separately. The differential rates of change across lineages may be associated with the long-branch attraction (LBA) effect and result in conflicting estimates of phylogeny from different types of data. To reduce the potential effects of systematic biases on inferences of topology, various data coding schemes, a site removal method, and different algorithms were used in phylogenetic reconstruction. We show that the outgroups Phthiraptera, Thysanoptera, and the ingroup Sternorrhyncha share similar base composition, and exhibit "long branches" relative to other hemipterans. Thus, the long-branch attraction between these groups is suspected to cause the failure to recover Hemiptera under the homogeneous model. In contrast, a monophyletic Hemiptera is supported when a heterogeneous model is used in the analysis. Although higher level phylogenetic relationships within Hemiptera remain to be resolved, analyses are beginning to converge on a stable phylogeny.

  4. Phylogenetic turnover during subtropical forest succession across environmental and phylogenetic scales.

    Science.gov (United States)

    Purschke, Oliver; Michalski, Stefan G; Bruelheide, Helge; Durka, Walter

    2017-12-01

    Although spatial and temporal patterns of phylogenetic community structure during succession are inherently interlinked and assembly processes vary with environmental and phylogenetic scales, successional studies of community assembly have yet to integrate spatial and temporal components of community structure, while accounting for scaling issues. To gain insight into the processes that generate biodiversity after disturbance, we combine analyses of spatial and temporal phylogenetic turnover across phylogenetic scales, accounting for covariation with environmental differences. We compared phylogenetic turnover, at the species- and individual-level, within and between five successional stages, representing woody plant communities in a subtropical forest chronosequence. We decomposed turnover at different phylogenetic depths and assessed its covariation with between-plot abiotic differences. Phylogenetic turnover between stages was low relative to species turnover and was not explained by abiotic differences. However, within the late-successional stages, there was high presence-/absence-based turnover (clustering) that occurred deep in the phylogeny and covaried with environmental differentiation. Our results support a deterministic model of community assembly where (i) phylogenetic composition is constrained through successional time, but (ii) toward late succession, species sorting into preferred habitats according to niche traits that are conserved deep in phylogeny, becomes increasingly important.

  5. Anchoring quartet-based phylogenetic distances and applications to species tree reconstruction.

    Science.gov (United States)

    Sayyari, Erfan; Mirarab, Siavash

    2016-11-11

    Inferring species trees from gene trees using the coalescent-based summary methods has been the subject of much attention, yet new scalable and accurate methods are needed. We introduce DISTIQUE, a new statistically consistent summary method for inferring species trees from gene trees under the coalescent model. We generalize our results to arbitrary phylogenetic inference problems; we show that two arbitrarily chosen leaves, called anchors, can be used to estimate relative distances between all other pairs of leaves by inferring relevant quartet trees. This results in a family of distance-based tree inference methods, with running times ranging between quadratic to quartic in the number of leaves. We show in simulated studies that DISTIQUE has comparable accuracy to leading coalescent-based summary methods and reduced running times.

  6. Extraction and phylogenetic survey of extracellular and intracellular DNA in marine sediments

    DEFF Research Database (Denmark)

    Torti, Andrea

    indeed inflate richness estimates of sediments microbial communities, and point to a role of bioturbation in shaping the prokaryotic diversity of the eDNA pool at the investigated site. Analysis of 18S RNA gene sequences revealed a diverse collection of eukaryotic taxa throughout the sediment column......DNA, and validated for minimal cell lysis during the eDNA extraction process. The optimized method was applied to investigate and compare the bacterial, archaeal, and eukaryotic diversity within iDNA and eDNA pools, in the context of differing geochemical and lithological zones in the Holocene sediment column...... of Aarhus Bay (Denmark). Using high-throughput sequencing technologies, I first explored whether, and to what extent, prokaryotic eDNA parallels the phylogenetic composition of the local microbiome. Phylogenetic analyses revealed that, in near-surface sediments influenced by faunal activities, 50% of all...

  7. Phylogenetic fields through time: temporal dynamics of geographical co-occurrence and phylogenetic structure within species ranges.

    Science.gov (United States)

    Villalobos, Fabricio; Carotenuto, Francesco; Raia, Pasquale; Diniz-Filho, José Alexandre F

    2016-04-05

    Species co-occur with different sets of other species across their geographical distribution, which can be either closely or distantly related. Such co-occurrence patterns and their phylogenetic structure within individual species ranges represent what we call the species phylogenetic fields (PFs). These PFs allow investigation of the role of historical processes--speciation, extinction and dispersal--in shaping species co-occurrence patterns, in both extinct and extant species. Here, we investigate PFs of large mammalian species during the last 3 Myr, and how these correlate with trends in diversification rates. Using the fossil record, we evaluate species' distributional and co-occurrence patterns along with their phylogenetic structure. We apply a novel Bayesian framework on fossil occurrences to estimate diversification rates through time. Our findings highlight the effect of evolutionary processes and past climatic changes on species' distributions and co-occurrences. From the Late Pliocene to the Recent, mammal species seem to have responded in an individualistic manner to climate changes and diversification dynamics, co-occurring with different sets of species from different lineages across their geographical ranges. These findings stress the difficulty of forecasting potential effects of future climate changes on biodiversity. © 2016 The Author(s).

  8. Accurate and robust phylogeny estimation based on profile distances: a study of the Chlorophyceae (Chlorophyta)

    Directory of Open Access Journals (Sweden)

    Rahmann Sven

    2004-06-01

    Full Text Available Background: In phylogenetic analysis we face the problem that several subclade topologies are known or easily inferred and well supported by bootstrap analysis, but basal branching patterns cannot be unambiguously estimated by the usual methods (maximum parsimony (MP), neighbor-joining (NJ), or maximum likelihood (ML)), nor are they well supported. We represent each subclade by a sequence profile and estimate evolutionary distances between profiles to obtain a matrix of distances between subclades. Results: Our estimator of profile distances generalizes the maximum likelihood estimator of sequence distances. The basal branching pattern can be estimated by any distance-based method, such as neighbor-joining. Our method (profile neighbor-joining, PNJ) then inherits the accuracy and robustness of profiles and the time efficiency of neighbor-joining. Conclusions: Phylogenetic analysis of Chlorophyceae with traditional methods (MP, NJ, ML and MrBayes) reveals seven well supported subclades, but the methods disagree on the basal branching pattern. The tree reconstructed by our method is better supported and can be confirmed by known morphological characters. Moreover, the accuracy is significantly improved as shown by parametric bootstrap.
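
    The abstract notes that, once a matrix of distances between subclades is available, the basal branching pattern can be estimated by any distance-based method such as neighbor-joining. A minimal sketch of standard neighbor-joining is given here for orientation. It does not reproduce the paper's profile-distance estimator: the five-taxon distance matrix is a hypothetical stand-in for such distances, and the function name and the internal node labels (`u1`, `u2`, ...) are assumptions of this example.

```python
from itertools import combinations

def neighbor_joining(names, dist):
    """Standard neighbor-joining.
    names: list of labels; dist: dict mapping frozenset({a, b}) -> distance.
    Returns a list of (node, neighbor, branch_length) edges of the unrooted tree."""
    d = dict(dist)
    nodes = list(names)
    edges = []
    counter = 0
    while len(nodes) > 2:
        n = len(nodes)
        # Net divergence of each node from all others.
        r = {i: sum(d[frozenset((i, k))] for k in nodes if k != i) for i in nodes}
        # Join the pair minimizing Q(i, j) = (n - 2) * d(i, j) - r(i) - r(j).
        i, j = min(combinations(nodes, 2),
                   key=lambda p: (n - 2) * d[frozenset(p)] - r[p[0]] - r[p[1]])
        dij = d[frozenset((i, j))]
        counter += 1
        u = f"u{counter}"                           # hypothetical label for the new internal node
        len_i = dij / 2 + (r[i] - r[j]) / (2 * (n - 2))
        len_j = dij - len_i
        edges.append((i, u, len_i))
        edges.append((j, u, len_j))
        # Distances from the new node to every remaining node.
        for k in nodes:
            if k not in (i, j):
                d[frozenset((u, k))] = (d[frozenset((i, k))] + d[frozenset((j, k))] - dij) / 2
        nodes = [k for k in nodes if k not in (i, j)] + [u]
    a, b = nodes
    edges.append((a, b, d[frozenset((a, b))]))      # connect the last two nodes
    return edges

# Hypothetical additive distances between five subclades (not from the paper).
names = ["a", "b", "c", "d", "e"]
raw = {("a", "b"): 5, ("a", "c"): 9, ("a", "d"): 9, ("a", "e"): 8,
       ("b", "c"): 10, ("b", "d"): 10, ("b", "e"): 9,
       ("c", "d"): 8, ("c", "e"): 7, ("d", "e"): 3}
dist = {frozenset(pair): float(value) for pair, value in raw.items()}

edges = neighbor_joining(names, dist)
for node, neighbor, length in edges:
    print(node, neighbor, length)
```

    Because the toy matrix is additive, the recovered branch lengths reproduce the input distances exactly; with estimated distances this would hold only approximately.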

  9. Electrical estimating methods

    CERN Document Server

    Del Pico, Wayne J

    2014-01-01

    Simplify the estimating process with the latest data, materials, and practices. Electrical Estimating Methods, Fourth Edition is a comprehensive guide to estimating electrical costs, with data provided by leading construction database RS Means. The book covers the materials and processes encountered by the modern contractor, and provides all the information professionals need to make the most precise estimate. The fourth edition has been updated to reflect the changing materials, techniques, and practices in the field, and provides the most recent Means cost data available. The complexity of el

  10. Effects of asymmetric nuclear introgression, introgressive mitochondrial sweep, and purifying selection on phylogenetic reconstruction and divergence estimates in the Pacific clade of Locustella warblers.

    Science.gov (United States)

    Drovetski, Sergei V; Semenov, Georgy; Red'kin, Yaroslav A; Sotnikov, Vladimir N; Fadeev, Igor V; Koblik, Evgeniy A

    2015-01-01

    When isolated but reproductively compatible populations expand geographically and meet, simulations predict asymmetric introgression of neutral loci from a local to invading taxon. Genetic introgression may affect phylogenetic reconstruction by obscuring topology and divergence estimates. We combined phylogenetic analysis of sequences from one mtDNA and 12 nuDNA loci with analysis of gene flow among 5 species of Pacific Locustella warblers to test for presence of genetic introgression and its effects on tree topology and divergence estimates. Our data showed that nuDNA introgression was substantial and asymmetrical among all members of superspecies groups whereas mtDNA showed no introgression except a single species pair where the invader's mtDNA was swept by mtDNA of the local species. This introgressive sweep of mtDNA had the opposite direction of the nuDNA introgression and resulted in the paraphyly of the local species' mtDNA haplotypes with respect to those of the invader. The multilocus nuDNA species tree resolved all inter- and intraspecific relationships despite substantial introgression. However, the node ages on the species tree may be underestimated as suggested by the differences in node age estimates based on non-introgressing mtDNA and introgressing nuDNA. In turn, the introgressive sweep and strong purifying selection appear to elongate internal branches in the mtDNA gene tree.

  11. A program for verification of phylogenetic network models.

    Science.gov (United States)

    Gunawan, Andreas D M; Lu, Bingxin; Zhang, Louxin

    2016-09-01

    Genetic material is transferred in a non-reproductive manner across species more frequently than commonly thought, particularly in the bacteria kingdom. On one hand, extant genomes are thus more properly considered as a fusion product of both reproductive and non-reproductive genetic transfers. This has motivated researchers to adopt phylogenetic networks to study genome evolution. On the other hand, a gene's evolution is usually tree-like and has been studied for over half a century. Accordingly, the relationships between phylogenetic trees and networks are the basis for the reconstruction and verification of phylogenetic networks. One important problem in verifying a network model is determining whether or not certain existing phylogenetic trees are displayed in a phylogenetic network. This problem is formally called the tree containment problem. It is NP-complete even for binary phylogenetic networks. We design an exponential time but efficient method for determining whether or not a phylogenetic tree is displayed in an arbitrary phylogenetic network. It is developed on the basis of the so-called reticulation-visible property of phylogenetic networks. A C-program is available for download on http://www.math.nus.edu.sg/∼matzlx/tcp_package. Contact: matzlx@nus.edu.sg. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  12. Open Reading Frame Phylogenetic Analysis on the Cloud

    Directory of Open Access Journals (Sweden)

    Che-Lun Hung

    2013-01-01

    Full Text Available Phylogenetic analysis has become essential in researching the evolutionary relationships between viruses. These relationships are depicted on phylogenetic trees, in which viruses are grouped based on sequence similarity. Viral evolutionary relationships are identified from open reading frames rather than from complete sequences. Recently, cloud computing has become popular for developing internet-based bioinformatics tools. Biocloud is an efficient, scalable, and robust bioinformatics computing service. In this paper, we propose a cloud-based open reading frame phylogenetic analysis service. The proposed service integrates the Hadoop framework, virtualization technology, and phylogenetic analysis methods to provide a high-availability, large-scale bioservice. In a case study, we analyze the phylogenetic relationships among Norovirus. Evolutionary relationships are elucidated by aligning different open reading frame sequences. The proposed platform correctly identifies the evolutionary relationships between members of Norovirus.

  13. Genome-wide comparative analysis of phylogenetic trees: the prokaryotic forest of life.

    Science.gov (United States)

    Puigbò, Pere; Wolf, Yuri I; Koonin, Eugene V

    2012-01-01

    Genome-wide comparison of phylogenetic trees is becoming an increasingly common approach in evolutionary genomics, and a variety of approaches for such comparison have been developed. In this article, we present several methods for comparative analysis of large numbers of phylogenetic trees. To compare phylogenetic trees taking into account the bootstrap support for each internal branch, the Boot-Split Distance (BSD) method is introduced as an extension of the previously developed Split Distance method for tree comparison. The BSD method implements the straightforward idea that comparison of phylogenetic trees can be made more robust by treating tree splits differentially depending on the bootstrap support. Approaches are also introduced for detecting tree-like and net-like evolutionary trends in the phylogenetic Forest of Life (FOL), i.e., the entirety of the phylogenetic trees for conserved genes of prokaryotes. The principal method employed for this purpose includes mapping quartets of species onto trees to calculate the support of each quartet topology and so to quantify the tree and net contributions to the distances between species. We describe the application of these methods to analyze the FOL and the results obtained with these methods. These results support the concept of the Tree of Life (TOL) as a central evolutionary trend in the FOL as opposed to the traditional view of the TOL as a "species tree."
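
    The idea of comparing trees by their splits, and of treating splits differently depending on bootstrap support, can be illustrated with a small sketch. It is not the published Split Distance or BSD formula: the normalization, the weighting scheme, the function names and the two toy trees with their support values are all assumptions chosen for illustration.

```python
def canonical(side, taxa):
    """Represent a split by the side that does not contain the reference taxon."""
    side = frozenset(side)
    reference = min(taxa)
    return side if reference not in side else frozenset(taxa) - side

def split_distance(splits_a, splits_b):
    """Fraction of internal splits that are not shared by both trees."""
    a, b = set(splits_a), set(splits_b)
    return len(a ^ b) / (len(a) + len(b))

def weighted_split_distance(splits_a, splits_b):
    """Same idea, but each split counts in proportion to its bootstrap support
    (an assumed weighting, so weakly supported conflicts contribute less)."""
    shared = set(splits_a) & set(splits_b)
    total = sum(splits_a.values()) + sum(splits_b.values())
    agreeing = sum(splits_a[s] + splits_b[s] for s in shared)
    return 1 - agreeing / total

taxa = {"A", "B", "C", "D", "E"}

# Hypothetical trees, each given as {internal split: bootstrap support}.
tree_1 = {canonical({"A", "B"}, taxa): 0.90, canonical({"D", "E"}, taxa): 0.60}
tree_2 = {canonical({"A", "B"}, taxa): 0.95, canonical({"C", "D"}, taxa): 0.30}

plain = split_distance(tree_1, tree_2)
weighted = weighted_split_distance(tree_1, tree_2)
print(plain, weighted)
```

    In this toy case the two trees agree on the well supported split and disagree only on weakly supported ones, so the weighted distance comes out smaller than the unweighted one.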

  14. Mapping Phylogenetic Trees to Reveal Distinct Patterns of Evolution.

    Science.gov (United States)

    Kendall, Michelle; Colijn, Caroline

    2016-10-01

    Evolutionary relationships are frequently described by phylogenetic trees, but a central barrier in many fields is the difficulty of interpreting data containing conflicting phylogenetic signals. We present a metric-based method for comparing trees which extracts distinct alternative evolutionary relationships embedded in data. We demonstrate detection and resolution of phylogenetic uncertainty in a recent study of anole lizards, leading to alternate hypotheses about their evolutionary relationships. We use our approach to compare trees derived from different genes of Ebolavirus and find that the VP30 gene has a distinct phylogenetic signature composed of three alternatives that differ in the deep branching structure. Keywords: phylogenetics, evolution, tree metrics, genetics, sequencing. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  15. A parametric method for assessing diversification-rate variation in phylogenetic trees.

    Science.gov (United States)

    Shah, Premal; Fitzpatrick, Benjamin M; Fordyce, James A

    2013-02-01

    Phylogenetic hypotheses are frequently used to examine variation in rates of diversification across the history of a group. Patterns of diversification-rate variation can be used to infer underlying ecological and evolutionary processes responsible for patterns of cladogenesis. Most existing methods examine rate variation through time. Methods for examining differences in diversification among groups are more limited. Here, we present a new method, parametric rate comparison (PRC), that explicitly compares diversification rates among lineages in a tree using a variety of standard statistical distributions. PRC can identify subclades of the tree where diversification rates are at variance with the remainder of the tree. A randomization test can be used to evaluate how often such variance would appear by chance alone. The method also allows for comparison of diversification rate among a priori defined groups. Further, the application of the PRC method is not restricted to monophyletic groups. We examined the performance of PRC using simulated data, which showed that PRC has acceptable false-positive rates and statistical power to detect rate variation. We apply the PRC method to the well-studied radiation of North American Plethodon salamanders, and support the inference that the large-bodied Plethodon glutinosus clade has a higher historical rate of diversification compared to other Plethodon salamanders. © 2012 The Author(s). Evolution© 2012 The Society for the Study of Evolution.

  16. An attempt to reconstruct phylogenetic relationships within Caribbean nummulitids: simulating relationships and tracing character evolution

    Science.gov (United States)

    Eder, Wolfgang; Ives Torres-Silva, Ana; Hohenegger, Johann

    2017-04-01

    Phylogenetic analysis and trees based on molecular data are broadly applied and used to infer genetic and biogeographic relationships in recent larger foraminifera. Molecular phylogenetics is intensively used within recent nummulitids; however, for fossil representatives these trees are only of minor informational value. Hence, within paleontological studies a phylogenetic approach through morphometric analysis is of much higher value. To tackle phylogenetic relationships within the nummulitid family, a much higher number of morphological characters must be measured than are commonly used in biometric studies, where mostly parameters describing embryonic size (e.g., proloculus diameter, deuteroloculus diameter) and/or the marginal spiral (e.g., spiral diagrams, spiral indices) are studied. For this purpose 11 growth-independent and/or growth-invariant characters have been used to describe the morphological variability of equatorial thin sections of seven Caribbean nummulitid taxa (Nummulites striatoreticulatus, N. macgillavry, Palaeonummulites willcoxi, P. floridensis, P. soldadensis, P. trinitatensis and P. ocalanus) and one outgroup taxon (Ranikothalia bermudezi). Using these characters, phylogenetic trees were calculated using a restricted maximum likelihood algorithm (REML), and results were cross-checked by ordination and cluster analysis. The squared-change parsimony method was run to reconstruct ancestral states, as well as to simulate the evolution of the chosen characters along the calculated phylogenetic tree, and independent-contrast analysis was used to estimate confidence intervals. Based on these simulations, phylogenetic tendencies of certain characters proposed for nummulitids (e.g., Cope's rule or nepionic acceleration) can be tested to determine whether these tendencies are valid for the whole family or only for certain clades. At least, within the Caribbean nummulitids, phylogenetic trends along some growth-independent characters of the embryo (e.g., first

  17. A method for investigating relative timing information on phylogenetic trees.

    Science.gov (United States)

    Ford, Daniel; Matsen, Frederick A; Stadler, Tanja

    2009-04-01

    In this paper, we present a new way to describe the timing of branching events in phylogenetic trees. Our description is in terms of the relative timing of diversification events between sister clades; as such it is complementary to existing methods using lineages-through-time plots which consider diversification in aggregate. The method can be applied to look for evidence of diversification happening in lineage-specific "bursts", or the opposite, where diversification between 2 clades happens in an unusually regular fashion. In order to be able to distinguish interesting events from stochasticity, we discuss 2 classes of neutral models on trees with relative timing information and develop a statistical framework for testing these models. These model classes include both the coalescent with ancestral population size variation and global rate speciation-extinction models. We end the paper with 2 example applications: first, we show that the evolution of the hepatitis C virus deviates from the coalescent with arbitrary population size. Second, we analyze a large tree of ants, demonstrating that a period of elevated diversification rates does not appear to have occurred in a bursting manner.

  18. Comparative evolutionary diversity and phylogenetic structure across multiple forest dynamics plots: a mega-phylogeny approach

    Directory of Open Access Journals (Sweden)

    David Lee Erickson

    2014-11-01

    Full Text Available Forest dynamics plots, which now span longitudes, latitudes, and habitat types across the globe, offer unparalleled insights into the ecological and evolutionary processes that determine how species are assembled into communities. Understanding phylogenetic relationships among species in a community has become an important component of assessing assembly processes. However, the application of evolutionary information to questions in community ecology has been limited in large part by the lack of accurate estimates of phylogenetic relationships among individual species found within communities, and is particularly limiting in comparisons between communities. Therefore, streamlining and maximizing the information content of these community phylogenies is a priority. To test the viability and advantage of a multi-community phylogeny, we constructed a multi-plot mega-phylogeny of 1,347 species of trees across 15 forest dynamics plots in the ForestGEO network using DNA barcode sequence data (rbcL, matK and psbA-trnH) and compared community phylogenies for each individual plot with respect to support for topology and branch lengths, which affect evolutionary inference of community processes. The levels of taxonomic differentiation across the phylogeny were examined by quantifying the frequency of resolved nodes throughout. In addition, three phylogenetic distance metrics that are commonly used to infer assembly processes were estimated for each plot (Phylogenetic Distance [PD], Mean Phylogenetic Distance [MPD], and Mean Nearest Taxon Distance [MNTD]). Lastly, we examine the partitioning of phylogenetic diversity among community plots through quantification of inter-community MPD and MNTD. Overall, evolutionary relationships were highly resolved across the DNA barcode-based mega-phylogeny, and phylogenetic resolution for each community plot was improved when estimated within the context of the mega-phylogeny.
Likewise, when compared with phylogenies for
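
    Two of the metrics named in this abstract, Mean Phylogenetic Distance (MPD) and Mean Nearest Taxon Distance (MNTD), have simple within-community definitions that can be sketched as follows. The pairwise distances and the taxon labels are hypothetical and the function names are assumptions of this example; in practice the distances would be read from the tree (for instance as patristic distances), which is not shown here.

```python
from itertools import combinations

def mpd(taxa, dist):
    """Mean Phylogenetic Distance: average distance over all pairs of taxa in the community."""
    pairs = list(combinations(taxa, 2))
    return sum(dist[frozenset(p)] for p in pairs) / len(pairs)

def mntd(taxa, dist):
    """Mean Nearest Taxon Distance: average distance from each taxon to its closest co-occurring relative."""
    nearest = [min(dist[frozenset((t, other))] for other in taxa if other != t)
               for t in taxa]
    return sum(nearest) / len(nearest)

# Hypothetical pairwise distances between four species (not from the study).
raw = {("A", "B"): 2.0, ("A", "C"): 6.0, ("A", "D"): 6.0,
       ("B", "C"): 6.0, ("B", "D"): 6.0, ("C", "D"): 4.0}
dist = {frozenset(pair): value for pair, value in raw.items()}

plot_all = ["A", "B", "C", "D"]      # a plot containing every species
plot_sub = ["A", "B", "C"]           # a plot missing species D

print(mpd(plot_all, dist), mntd(plot_all, dist))
print(mpd(plot_sub, dist), mntd(plot_sub, dist))
```

    MPD is sensitive to deep structure in the tree, while MNTD reflects relationships near the tips, which is why the two are often reported together.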

  19. Enumerating all maximal frequent subtrees in collections of phylogenetic trees.

    Science.gov (United States)

    Deepak, Akshay; Fernández-Baca, David

    2014-01-01

    A common problem in phylogenetic analysis is to identify frequent patterns in a collection of phylogenetic trees. The goal is, roughly, to find a subset of the species (taxa) on which all or some significant subset of the trees agree. One popular method to do so is through maximum agreement subtrees (MASTs). MASTs are also used, among other things, as a metric for comparing phylogenetic trees, computing congruence indices and to identify horizontal gene transfer events. We give algorithms and experimental results for two approaches to identify common patterns in a collection of phylogenetic trees, one based on agreement subtrees, called maximal agreement subtrees, the other on frequent subtrees, called maximal frequent subtrees. These approaches can return subtrees on larger sets of taxa than MASTs, and can reveal new common phylogenetic relationships not present in either MASTs or the majority rule tree (a popular consensus method). Our current implementation is available on the web at https://code.google.com/p/mfst-miner/. Our computational results confirm that maximal agreement subtrees and all maximal frequent subtrees can reveal a more complete phylogenetic picture of the common patterns in collections of phylogenetic trees than maximum agreement subtrees; they are also often more resolved than the majority rule tree. Further, our experiments show that enumerating maximal frequent subtrees is considerably more practical than enumerating ordinary (not necessarily maximal) frequent subtrees.

  20. Assessing the relationships between phylogenetic and functional singularities in sharks (Chondrichthyes).

    Science.gov (United States)

    Cachera, Marie; Le Loc'h, François

    2017-08-01

    The relationships between diversity and ecosystem functioning have become a major focus of science. A crucial issue is to estimate functional diversity, as it is expected to affect ecosystem dynamics and stability. However, depending on the ecosystem, it may be challenging or even impossible to directly measure ecological functions and thus functional diversity. Phylogenetic diversity has recently been considered as a proxy for functional diversity. Phylogenetic diversity is indeed supposed to match functional diversity if functions are traits conserved through evolution. However, in cases of adaptive radiation and/or evolutionary convergence, a mismatch may appear between species' phylogenetic and functional singularities. Using a highly threatened taxon, sharks, this study aimed to explore the relationships between phylogenetic and functional diversities and singularities. Different statistical computations were used in order to test both a methodological issue (phylogenetic reconstruction) and, more broadly, a theoretical question: the predictive power of phylogeny for functional diversity. Despite these several methodological approaches, a mismatch between phylogeny and function was highlighted. This mismatch revealed that (i) functions are apparently nonconservative in shark species, and (ii) phylogenetic singularity is not a proxy for functional singularity. Functions appeared not to be conserved along the evolution of sharks, raising the conservation challenge of identifying and protecting both phylogenetically and functionally singular species. Facing the current rate of species loss, it is indeed of major importance to target phylogenetically singular species to protect genetic diversity and also functionally singular species in order to maintain particular functions within ecosystems.

  1. Metagenomic species profiling using universal phylogenetic marker genes

    DEFF Research Database (Denmark)

    Sunagawa, Shinichi; Mende, Daniel R; Zeller, Georg

    2013-01-01

    To quantify known and unknown microorganisms at species-level resolution using shotgun sequencing data, we developed a method that establishes metagenomic operational taxonomic units (mOTUs) based on single-copy phylogenetic marker genes. Applied to 252 human fecal samples, the method revealed that on average 43% of the species abundance and 58% of the richness cannot be captured by current reference genome-based methods. An implementation of the method is available at http://www.bork.embl.de/software/mOTU/.

  2. Reconstructible phylogenetic networks: do not distinguish the indistinguishable.

    Science.gov (United States)

    Pardi, Fabio; Scornavacca, Celine

    2015-04-01

    Phylogenetic networks represent the evolution of organisms that have undergone reticulate events, such as recombination, hybrid speciation or lateral gene transfer. An important way to interpret a phylogenetic network is in terms of the trees it displays, which represent all the possible histories of the characters carried by the organisms in the network. Interestingly, however, different networks may display exactly the same set of trees, an observation that poses a problem for network reconstruction: from the perspective of many inference methods such networks are "indistinguishable". This is true for all methods that evaluate a phylogenetic network solely on the basis of how well the displayed trees fit the available data, including all methods based on input data consisting of clades, triples, quartets, or trees with any number of taxa, and also sequence-based approaches such as popular formalisations of maximum parsimony and maximum likelihood for networks. This identifiability problem is partially solved by accounting for branch lengths, although this merely reduces the frequency of the problem. Here we propose that network inference methods should only attempt to reconstruct what they can uniquely identify. To this end, we introduce a novel definition of what constitutes a uniquely reconstructible network. For any given set of indistinguishable networks, we define a canonical network that, under mild assumptions, is unique and thus representative of the entire set. Given data that underwent reticulate evolution, only the canonical form of the underlying phylogenetic network can be uniquely reconstructed. While on the methodological side this will imply a drastic reduction of the solution space in network inference, for the study of reticulate evolution this is a fundamental limitation that will require an important change of perspective when interpreting phylogenetic networks.

  3. Reconstructible phylogenetic networks: do not distinguish the indistinguishable.

    Directory of Open Access Journals (Sweden)

    Fabio Pardi

    2015-04-01

    Full Text Available Phylogenetic networks represent the evolution of organisms that have undergone reticulate events, such as recombination, hybrid speciation or lateral gene transfer. An important way to interpret a phylogenetic network is in terms of the trees it displays, which represent all the possible histories of the characters carried by the organisms in the network. Interestingly, however, different networks may display exactly the same set of trees, an observation that poses a problem for network reconstruction: from the perspective of many inference methods such networks are "indistinguishable". This is true for all methods that evaluate a phylogenetic network solely on the basis of how well the displayed trees fit the available data, including all methods based on input data consisting of clades, triples, quartets, or trees with any number of taxa, and also sequence-based approaches such as popular formalisations of maximum parsimony and maximum likelihood for networks. This identifiability problem is partially solved by accounting for branch lengths, although this merely reduces the frequency of the problem. Here we propose that network inference methods should only attempt to reconstruct what they can uniquely identify. To this end, we introduce a novel definition of what constitutes a uniquely reconstructible network. For any given set of indistinguishable networks, we define a canonical network that, under mild assumptions, is unique and thus representative of the entire set. Given data that underwent reticulate evolution, only the canonical form of the underlying phylogenetic network can be uniquely reconstructed. While on the methodological side this will imply a drastic reduction of the solution space in network inference, for the study of reticulate evolution this is a fundamental limitation that will require an important change of perspective when interpreting phylogenetic networks.

  4. Evaluation of phylogenetic reconstruction methods using bacterial whole genomes: a simulation based study [version 1; referees: 1 approved, 2 approved with reservations]

    Directory of Open Access Journals (Sweden)

    John A. Lees

    2018-03-01

    Full Text Available Background: Phylogenetic reconstruction is a necessary first step in many analyses which use whole genome sequence data from bacterial populations. There are many available methods to infer phylogenies, and these have various advantages and disadvantages, but few unbiased comparisons of the range of approaches have been made. Methods: We simulated data from a defined “true tree” using a realistic evolutionary model. We built phylogenies from this data using a range of methods, and compared reconstructed trees to the true tree using two measures, noting the computational time needed for different phylogenetic reconstructions. We also used real data from Streptococcus pneumoniae alignments to compare individual core gene trees to a core genome tree. Results: We found that, as expected, maximum likelihood trees from good quality alignments were the most accurate, but also the most computationally intensive. Using less accurate phylogenetic reconstruction methods, we were able to obtain results of comparable accuracy; we found that approximate results can rapidly be obtained using genetic distance based methods. In real data we found that highly conserved core genes, such as those involved in translation, gave an inaccurate tree topology, whereas genes involved in recombination events gave inaccurate branch lengths. We also show a tree-of-trees, relating the results of different phylogenetic reconstructions to each other. Conclusions: We recommend three approaches, depending on requirements for accuracy and computational time. Quicker approaches that do not perform full maximum likelihood optimisation may be useful for many analyses requiring a phylogeny, as generating a high quality input alignment is likely to be the major limiting factor of accurate tree topology. We have publicly released our simulated data and code to enable further comparisons.

  5. Phylogenetic turnover during subtropical forest succession across environmental and phylogenetic scales

    OpenAIRE

    Purschke, Oliver; Michalski, Stefan G.; Bruelheide, Helge; Durka, Walter

    2017-01-01

    Abstract Although spatial and temporal patterns of phylogenetic community structure during succession are inherently interlinked and assembly processes vary with environmental and phylogenetic scales, successional studies of community assembly have yet to integrate spatial and temporal components of community structure, while accounting for scaling issues. To gain insight into the processes that generate biodiversity after disturbance, we combine analyses of spatial and temporal phylogenetic ...

  6. Supermatrix and species tree methods resolve phylogenetic relationships within the big cats, Panthera (Carnivora: Felidae).

    Science.gov (United States)

    Davis, Brian W; Li, Gang; Murphy, William J

    2010-07-01

    The pantherine lineage of cats diverged from the remainder of modern Felidae less than 11 million years ago and consists of the five big cats of the genus Panthera, the lion, tiger, jaguar, leopard, and snow leopard, as well as the closely related clouded leopard. A significant problem exists with respect to the precise phylogeny of these highly threatened great cats. Despite multiple publications on the subject, no two molecular studies have reconstructed Panthera with the same topology. These evolutionary relationships remain unresolved partially due to the recent and rapid radiation of pantherines in the Pliocene, individual speciation events occurring within less than 1 million years, and probable introgression between lineages following their divergence. We provide an alternative, highly supported interpretation of the evolutionary history of the pantherine lineage using novel and published DNA sequence data from the autosomes, both sex chromosomes and the mitochondrial genome. New sequences were generated for 39 single-copy regions of the felid Y chromosome, as well as four mitochondrial and four autosomal gene segments, totaling 28.7 kb. Phylogenetic analysis of these new data, combined with all published data in GenBank, highlighted the prevalence of phylogenetic disparities stemming either from the amplification of a mitochondrial to nuclear translocation event (numt), or errors in species identification. Our 47.6 kb combined dataset was analyzed as a supermatrix and with respect to individual partitions using maximum likelihood and Bayesian phylogenetic inference, in conjunction with Bayesian Estimation of Species Trees (BEST) which accounts for heterogeneous gene histories. Our results yield a robust consensus topology supporting the monophyly of lion and leopard, with jaguar sister to these species, as well as a sister species relationship of tiger and snow leopard. These results highlight new avenues for the study of speciation genomics and

  7. A new algorithm to construct phylogenetic networks from trees.

    Science.gov (United States)

    Wang, J

    2014-03-06

    Developing appropriate methods for constructing phylogenetic networks from tree sets is an important problem, and much research is currently being undertaken in this area. BIMLR is an algorithm that constructs phylogenetic networks from tree sets. The algorithm can construct a much simpler network than other available methods. Here, we introduce an improved version of the BIMLR algorithm, QuickCass. QuickCass changes the selection strategy of the labels of leaves below the reticulate nodes, i.e., the nodes with an indegree of at least 2 in BIMLR. We show that QuickCass can construct simpler phylogenetic networks than BIMLR. Furthermore, we show that QuickCass is a polynomial-time algorithm when the output network that is constructed by QuickCass is binary.

  8. Comparative evolutionary diversity and phylogenetic structure across multiple forest dynamics plots: a mega-phylogeny approach

    Science.gov (United States)

    Erickson, David L.; Jones, Frank A.; Swenson, Nathan G.; Pei, Nancai; Bourg, Norman A.; Chen, Wenna; Davies, Stuart J.; Ge, Xue-jun; Hao, Zhanqing; Howe, Robert W.; Huang, Chun-Lin; Larson, Andrew J.; Lum, Shawn K. Y.; Lutz, James A.; Ma, Keping; Meegaskumbura, Madhava; Mi, Xiangcheng; Parker, John D.; Fang-Sun, I.; Wright, S. Joseph; Wolf, Amy T.; Ye, W.; Xing, Dingliang; Zimmerman, Jess K.; Kress, W. John

    2014-01-01

    Forest dynamics plots, which now span longitudes, latitudes, and habitat types across the globe, offer unparalleled insights into the ecological and evolutionary processes that determine how species are assembled into communities. Understanding phylogenetic relationships among species in a community has become an important component of assessing assembly processes. However, the application of evolutionary information to questions in community ecology has been limited in large part by the lack of accurate estimates of phylogenetic relationships among individual species found within communities, and is particularly limiting in comparisons between communities. Therefore, streamlining and maximizing the information content of these community phylogenies is a priority. To test the viability and advantage of a multi-community phylogeny, we constructed a multi-plot mega-phylogeny of 1347 species of trees across 15 forest dynamics plots in the ForestGEO network using DNA barcode sequence data (rbcL, matK, and psbA-trnH) and compared community phylogenies for each individual plot with respect to support for topology and branch lengths, which affect evolutionary inference of community processes. The levels of taxonomic differentiation across the phylogeny were examined by quantifying the frequency of resolved nodes throughout. In addition, three phylogenetic distance (PD) metrics that are commonly used to infer assembly processes were estimated for each plot [PD, Mean Phylogenetic Distance (MPD), and Mean Nearest Taxon Distance (MNTD)]. Lastly, we examine the partitioning of phylogenetic diversity among community plots through quantification of inter-community MPD and MNTD. Overall, evolutionary relationships were highly resolved across the DNA barcode-based mega-phylogeny, and phylogenetic resolution for each community plot was improved when estimated within the context of the mega-phylogeny. Likewise, when compared with phylogenies for individual plots, estimates of
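
    Two of the metrics named in the abstract above, MPD and MNTD, can be illustrated with a minimal sketch. It assumes the unweighted (presence/absence) forms of both metrics and a precomputed matrix of pairwise phylogenetic distances between species; the distance values and taxon names below are hypothetical and are not taken from the study.

```python
from itertools import combinations


def mpd(dist, taxa):
    """Mean pairwise phylogenetic distance among the taxa of one community."""
    pairs = list(combinations(taxa, 2))
    return sum(dist[a][b] for a, b in pairs) / len(pairs)


def mntd(dist, taxa):
    """Mean distance from each taxon to its nearest relative in the community."""
    nearest = [min(dist[a][b] for b in taxa if b != a) for a in taxa]
    return sum(nearest) / len(nearest)


# Hypothetical symmetric distance matrix, for illustration only.
dist = {
    "A": {"A": 0.0, "B": 2.0, "C": 6.0, "D": 6.0},
    "B": {"A": 2.0, "B": 0.0, "C": 6.0, "D": 6.0},
    "C": {"A": 6.0, "B": 6.0, "C": 0.0, "D": 4.0},
    "D": {"A": 6.0, "B": 6.0, "C": 4.0, "D": 0.0},
}
community = ["A", "B", "C"]  # species present in one hypothetical plot

print(mpd(dist, community), mntd(dist, community))
```

    MPD averages over all species pairs and so reflects deep structure in the tree, whereas MNTD only looks at each species' closest relative and is more sensitive to structure near the tips.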

  9. Unrecorded Alcohol Consumption: Quantitative Methods of Estimation

    OpenAIRE

    Razvodovsky, Y. E.

    2010-01-01

    Keywords: unrecorded alcohol; methods of estimation. In this paper we focus on methods for estimating the level of unrecorded alcohol consumption. Present methods allow only an approximate estimate of this level. Taking into consideration the extreme importance of such data, further investigation is necessary to improve the reliability of methods for estimating unrecorded alcohol consumption.

  10. Reconstructing phylogenetic networks using maximum parsimony.

    Science.gov (United States)

    Nakhleh, Luay; Jin, Guohua; Zhao, Fengmei; Mellor-Crummey, John

    2005-01-01

    Phylogenies - the evolutionary histories of groups of organisms - are one of the most widely used tools throughout the life sciences, as well as objects of research within systematics, evolutionary biology, epidemiology, etc. Almost every tool devised to date to reconstruct phylogenies produces trees; yet it is widely understood and accepted that trees oversimplify the evolutionary histories of many groups of organisms, most prominently bacteria (because of horizontal gene transfer) and plants (because of hybrid speciation). Various methods and criteria have been introduced for phylogenetic tree reconstruction. Parsimony is one of the most widely used and studied criteria, and various accurate and efficient heuristics for reconstructing trees based on parsimony have been devised. Jotun Hein suggested a straightforward extension of the parsimony criterion to phylogenetic networks. In this paper we formalize this concept, and provide the first experimental study of the quality of parsimony as a criterion for constructing and evaluating phylogenetic networks. Our results show that, when extended to phylogenetic networks, the parsimony criterion produces promising results. In a great majority of the cases in our experiments, the parsimony criterion accurately predicts the numbers and placements of non-tree events.
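
    As a rough illustration of the parsimony criterion discussed above, the sketch below scores one site on a rooted binary tree with Fitch's algorithm, then scores a network for that site as the minimum over the trees it displays. The second step is an assumption about how the extension to networks can be formalized, not a reproduction of the paper's implementation; the trees, the site patterns and the function names are hypothetical.

```python
def fitch(tree, states):
    """Fitch small parsimony for one site.

    tree: a leaf name (str) or a pair (left, right) of subtrees.
    states: dict mapping each leaf name to its character state.
    Returns (candidate state set at this node, minimum number of changes).
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch(left, states)
    right_set, right_cost = fitch(right, states)
    common = left_set & right_set
    if common:
        # Children can agree on a state: no extra change at this node.
        return common, left_cost + right_cost
    # Children disagree: one change is needed on a branch below this node.
    return left_set | right_set, left_cost + right_cost + 1


def network_site_score(displayed_trees, states):
    """Assumed network score for one site: best score over the displayed trees."""
    return min(fitch(tree, states)[1] for tree in displayed_trees)


# Two hypothetical trees displayed by one hypothetical network.
tree_1 = (("a", "b"), ("c", "d"))
tree_2 = (("a", "c"), ("b", "d"))
site = {"a": "A", "b": "G", "c": "A", "d": "G"}

print(fitch(tree_1, site)[1], fitch(tree_2, site)[1])
print(network_site_score([tree_1, tree_2], site))
```

    Summing such per-site scores over an alignment would give a total score; a site that fits one displayed tree poorly can still score well on the network if another displayed tree explains it.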

  11. Enumerating all maximal frequent subtrees in collections of phylogenetic trees

    Science.gov (United States)

    2014-01-01

    Background A common problem in phylogenetic analysis is to identify frequent patterns in a collection of phylogenetic trees. The goal is, roughly, to find a subset of the species (taxa) on which all or some significant subset of the trees agree. One popular method to do so is through maximum agreement subtrees (MASTs). MASTs are also used, among other things, as a metric for comparing phylogenetic trees, computing congruence indices and to identify horizontal gene transfer events. Results We give algorithms and experimental results for two approaches to identify common patterns in a collection of phylogenetic trees, one based on agreement subtrees, called maximal agreement subtrees, the other on frequent subtrees, called maximal frequent subtrees. These approaches can return subtrees on larger sets of taxa than MASTs, and can reveal new common phylogenetic relationships not present in either MASTs or the majority rule tree (a popular consensus method). Our current implementation is available on the web at https://code.google.com/p/mfst-miner/. Conclusions Our computational results confirm that maximal agreement subtrees and all maximal frequent subtrees can reveal a more complete phylogenetic picture of the common patterns in collections of phylogenetic trees than maximum agreement subtrees; they are also often more resolved than the majority rule tree. Further, our experiments show that enumerating maximal frequent subtrees is considerably more practical than enumerating ordinary (not necessarily maximal) frequent subtrees. PMID:25061474

  12. Bayesian methods outperform parsimony but at the expense of precision in the estimation of phylogeny from discrete morphological data.

    Science.gov (United States)

    O'Reilly, Joseph E; Puttick, Mark N; Parry, Luke; Tanner, Alastair R; Tarver, James E; Fleming, James; Pisani, Davide; Donoghue, Philip C J

    2016-04-01

    Different analytical methods can yield competing interpretations of evolutionary history and, currently, there is no definitive method for phylogenetic reconstruction using morphological data. Parsimony has been the primary method for analysing morphological data, but there has been a resurgence of interest in the likelihood-based Mk-model. Here, we test the performance of the Bayesian implementation of the Mk-model relative to both equal and implied-weight implementations of parsimony. Using simulated morphological data, we demonstrate that the Mk-model outperforms equal-weights parsimony in terms of topological accuracy, and implied-weights performs the most poorly. However, the Mk-model produces phylogenies that have less resolution than parsimony methods. This difference in the accuracy and precision of parsimony and Bayesian approaches to topology estimation needs to be considered when selecting a method for phylogeny reconstruction. © 2016 The Authors.

  13. DNA barcode analysis: a comparison of phylogenetic and statistical classification methods.

    Science.gov (United States)

    Austerlitz, Frederic; David, Olivier; Schaeffer, Brigitte; Bleakley, Kevin; Olteanu, Madalina; Leblois, Raphael; Veuille, Michel; Laredo, Catherine

    2009-11-10

    DNA barcoding aims to assign individuals to given species according to their sequence at a small locus, generally part of the CO1 mitochondrial gene. Amongst other issues, this raises the question of how to deal with within-species genetic variability and potential transpecific polymorphism. In this context, we examine several assignation methods belonging to two main categories: (i) phylogenetic methods (neighbour-joining and PhyML) that attempt to account for the genealogical framework of DNA evolution and (ii) supervised classification methods (k-nearest neighbour, CART, random forest and kernel methods). These methods range from basic to elaborate. We investigated the ability of each method to correctly classify query sequences drawn from samples of related species using both simulated and real data. Simulated data sets were generated using coalescent simulations in which we varied the genealogical history, mutation parameter, sample size and number of species. No method was found to be the best in all cases. The simplest method of all, "one nearest neighbour", was found to be the most reliable with respect to changes in the parameters of the data sets. The parameter most influencing the performance of the various methods was molecular diversity of the data. Addition of genetically independent loci--nuclear genes--improved the predictive performance of most methods. The study implies that taxonomists can influence the quality of their analyses either by choosing a method best-adapted to the configuration of their sample, or, given a certain method, increasing the sample size or altering the amount of molecular diversity. This can be achieved either by sequencing more mtDNA or by sequencing additional nuclear genes. In the latter case, they may also have to modify their data analysis method.
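
    The "one nearest neighbour" rule highlighted above can be sketched in a few lines. This sketch assumes pre-aligned sequences of equal length and uses the number of mismatched sites as the distance, which may differ from the distance used in the study; the reference sequences and species labels are hypothetical.

```python
def mismatches(seq_a, seq_b):
    """Number of differing sites between two aligned sequences of equal length."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(seq_a, seq_b))


def assign_species(query, reference):
    """Assign the query to the species of its closest reference sequence.

    reference: list of (species_label, aligned_sequence) pairs.
    Ties are resolved in favour of the first closest reference.
    """
    species, _ = min(reference, key=lambda record: mismatches(query, record[1]))
    return species


# Hypothetical reference barcodes, for illustration only.
reference = [
    ("species_1", "ACGTACGT"),
    ("species_1", "ACGTACGA"),
    ("species_2", "TTGTACCC"),
]

print(assign_species("ACGTACGG", reference))
```

    The rule has no parameters to tune, which is one plausible reason for its stability across data sets, although the abstract itself does not state a cause.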

  14. DNA barcode analysis: a comparison of phylogenetic and statistical classification methods

    Directory of Open Access Journals (Sweden)

    Leblois Raphael

    2009-11-01

    Full Text Available Abstract Background DNA barcoding aims to assign individuals to given species according to their sequence at a small locus, generally part of the CO1 mitochondrial gene. Amongst other issues, this raises the question of how to deal with within-species genetic variability and potential transpecific polymorphism. In this context, we examine several assignation methods belonging to two main categories: (i) phylogenetic methods (neighbour-joining and PhyML) that attempt to account for the genealogical framework of DNA evolution and (ii) supervised classification methods (k-nearest neighbour, CART, random forest and kernel methods). These methods range from basic to elaborate. We investigated the ability of each method to correctly classify query sequences drawn from samples of related species using both simulated and real data. Simulated data sets were generated using coalescent simulations in which we varied the genealogical history, mutation parameter, sample size and number of species. Results No method was found to be the best in all cases. The simplest method of all, "one nearest neighbour", was found to be the most reliable with respect to changes in the parameters of the data sets. The parameter most influencing the performance of the various methods was molecular diversity of the data. Addition of genetically independent loci - nuclear genes - improved the predictive performance of most methods. Conclusion The study implies that taxonomists can influence the quality of their analyses either by choosing a method best-adapted to the configuration of their sample, or, given a certain method, increasing the sample size or altering the amount of molecular diversity. This can be achieved either by sequencing more mtDNA or by sequencing additional nuclear genes. In the latter case, they may also have to modify their data analysis method.

  15. Evaluating the relationship between evolutionary divergence and phylogenetic accuracy in AFLP data sets.

    Science.gov (United States)

    García-Pereira, María Jesús; Caballero, Armando; Quesada, Humberto

    2010-05-01

    Using in silico amplified fragment length polymorphism (AFLP) fingerprints, we explore the relationship between sequence similarity and phylogeny accuracy to test when, in terms of genetic divergence, the quality of AFLP data becomes too low to be informative for a reliable phylogenetic reconstruction. We generated DNA sequences with known phylogenies using balanced and unbalanced trees with recent, uniform and ancient radiations, and average branch lengths (from the most internal node to the tip) ranging from 0.02 to 0.4 substitutions per site. The resulting sequences were used to emulate the AFLP procedure. Trees were estimated by maximum parsimony (MP), neighbor-joining (NJ), and minimum evolution (ME) methods from both DNA sequences and virtual AFLP fingerprints. The estimated trees were compared with the reference trees using a score that measures overall differences in both topology and relative branch length. As expected, the accuracy of AFLP-based phylogenies decreased dramatically in the more divergent data sets. Above a divergence of approximately 0.05, AFLP-based phylogenies were largely inaccurate irrespective of the distinct topology, radiation model, or phylogenetic method used. This value represents an upper bound of expected tree accuracy for data sets with a simple divergence history; AFLP data sets with a similar divergence but with unbalanced topologies and short ancestral branches produced much less accurate trees. The lack of homology of AFLP bands quickly increases with divergence and reaches its maximum value (100%) at a divergence of only 0.4. Low guanine-cytosine (GC) contents increase the number of nonhomologous bands in AFLP data sets and lead to less reliable trees. However, the effect of the lack of band homology on tree accuracy is surprisingly small relative to the negative impact due to the low information content of AFLP characters. Tree-building methods based on genetic distance displayed similar trends and outperformed parsimony

  16. Robustness of ancestral sequence reconstruction to phylogenetic uncertainty.

    Science.gov (United States)

    Hanson-Smith, Victor; Kolaczkowski, Bryan; Thornton, Joseph W

    2010-09-01

    Ancestral sequence reconstruction (ASR) is widely used to formulate and test hypotheses about the sequences, functions, and structures of ancient genes. Ancestral sequences are usually inferred from an alignment of extant sequences using a maximum likelihood (ML) phylogenetic algorithm, which calculates the most likely ancestral sequence assuming a probabilistic model of sequence evolution and a specific phylogeny--typically the tree with the ML. The true phylogeny is seldom known with certainty, however. ML methods ignore this uncertainty, whereas Bayesian methods incorporate it by integrating the likelihood of each ancestral state over a distribution of possible trees. It is not known whether Bayesian approaches to phylogenetic uncertainty improve the accuracy of inferred ancestral sequences. Here, we use simulation-based experiments under both simplified and empirically derived conditions to compare the accuracy of ASR carried out using ML and Bayesian approaches. We show that incorporating phylogenetic uncertainty by integrating over topologies very rarely changes the inferred ancestral state and does not improve the accuracy of the reconstructed ancestral sequence. Ancestral state reconstructions are robust to uncertainty about the underlying tree because the conditions that produce phylogenetic uncertainty also make the ancestral state identical across plausible trees; conversely, the conditions under which different phylogenies yield different inferred ancestral states produce little or no ambiguity about the true phylogeny. Our results suggest that ML can produce accurate ASRs, even in the face of phylogenetic uncertainty. Using Bayesian integration to incorporate this uncertainty is neither necessary nor beneficial.

  17. Whole Genome Phylogenetic Tree Reconstruction using Colored de Bruijn Graphs

    OpenAIRE

    Lyman, Cole

    2017-01-01

    We present kleuren, a novel assembly-free method to reconstruct phylogenetic trees using the Colored de Bruijn Graph. kleuren works by constructing the Colored de Bruijn Graph and then traversing it, finding bubble structures in the graph that provide phylogenetic signal. The bubbles are then aligned and concatenated to form a supermatrix, from which a phylogenetic tree is inferred. We introduce the algorithm that kleuren uses to accomplish this task, and show its performance on reconstructin...

  18. On the Shapley Value of Unrooted Phylogenetic Trees.

    Science.gov (United States)

    Wicke, Kristina; Fischer, Mareike

    2018-01-17

    The Shapley value, a solution concept from cooperative game theory, has recently been considered for both unrooted and rooted phylogenetic trees. Here, we focus on the Shapley value of unrooted trees and first revisit the so-called split counts of a phylogenetic tree and the Shapley transformation matrix that allows for the calculation of the Shapley value from the edge lengths of a tree. We show that non-isomorphic trees may have permutation-equivalent Shapley transformation matrices and permutation-equivalent null spaces. This implies that estimating the split counts associated with a tree or the Shapley values of its leaves does not suffice to reconstruct the correct tree topology. We then turn to the use of the Shapley value as a prioritization criterion in biodiversity conservation and compare it to a greedy solution concept. Here, we show that for certain phylogenetic trees, the Shapley value may fail as a prioritization criterion, meaning that the diversity spanned by the top k species (ranked by their Shapley values) cannot approximate the total diversity of all n species.

  19. Efficient FPT Algorithms for (Strict) Compatibility of Unrooted Phylogenetic Trees.

    Science.gov (United States)

    Baste, Julien; Paul, Christophe; Sau, Ignasi; Scornavacca, Celine

    2017-04-01

    In phylogenetics, a central problem is to infer the evolutionary relationships between a set of species X; these relationships are often depicted via a phylogenetic tree-a tree having its leaves labeled bijectively by elements of X and without degree-2 nodes-called the "species tree." One common approach for reconstructing a species tree consists in first constructing several phylogenetic trees from primary data (e.g., DNA sequences originating from some species in X), and then constructing a single phylogenetic tree maximizing the "concordance" with the input trees. The obtained tree is our estimation of the species tree and, when the input trees are defined on overlapping-but not identical-sets of labels, is called "supertree." In this paper, we focus on two problems that are central when combining phylogenetic trees into a supertree: the compatibility and the strict compatibility problems for unrooted phylogenetic trees. These problems are strongly related, respectively, to the notions of "containing as a minor" and "containing as a topological minor" in the graph community. Both problems are known to be fixed parameter tractable in the number of input trees k, by using their expressibility in monadic second-order logic and a reduction to graphs of bounded treewidth. Motivated by the fact that the dependency on k of these algorithms is prohibitively large, we give the first explicit dynamic programming algorithms for solving these problems, both running in time [Formula: see text], where n is the total size of the input.

  20. Phylogenetic Analysis of Shewanella Strains by DNA Relatedness Derived from Whole Genome Microarray DNA-DNA Hybridization and Comparisons with Other Methods

    International Nuclear Information System (INIS)

    Wu, Liyou; Yi, T.Y.; Van Nostrand, Joy; Zhou, Jizhong

    2010-01-01

    Phylogenetic analyses were done for the Shewanella strains isolated from Baltic Sea (38 strains), US DOE Hanford Uranium bioremediation site (Hanford Reach of the Columbia River (HRCR), 11 strains), Pacific Ocean and Hawaiian sediments (8 strains), and strains from other resources (16 strains) with three outgroup strains, Rhodopseudomonas palustris, Clostridium cellulolyticum, and Thermoanaerobacter ethanolicus X514, using DNA relatedness derived from WCGA-based DNA-DNA hybridizations, sequence similarities of 16S rRNA gene and gyrB gene, and sequence similarities of 6 loci of Shewanella genome selected from a shared gene list of the Shewanella strains with whole genome sequenced based on the average nucleotide identity of them (ANI). The phylogenetic trees based on 16S rRNA and gyrB gene sequences, and DNA relatedness derived from WCGA hybridizations of the tested Shewanella strains share exactly the same sub-clusters with very few exceptions, in which the strains were basically grouped by species. However, the phylogenetic analysis based on DNA relatedness derived from WCGA hybridizations dramatically increased the differentiation resolution at the species and strain levels within the Shewanella genus. When the tree based on DNA relatedness derived from WCGA hybridizations was compared to the tree based on the combined sequences of the selected functional genes (6 loci), we found that the resolutions of both methods are similar, but the clustering of the tree based on DNA relatedness derived from WCGA hybridizations was clearer. These results indicate that WCGA-based DNA-DNA hybridization is an ideal alternative to conventional DNA-DNA hybridization methods and it is superior to the phylogenetics methods based on sequence similarities of single genes. Detailed analysis is being performed for the re-classification of the strains examined.

  1. Phylogenetic Analysis of Shewanella Strains by DNA Relatedness Derived from Whole Genome Microarray DNA-DNA Hybridization and Comparison with Other Methods

    Energy Technology Data Exchange (ETDEWEB)

    Wu, Liyou; Yi, T. Y.; Van Nostrand, Joy; Zhou, Jizhong

    2010-05-17

    Phylogenetic analyses were done for Shewanella strains isolated from the Baltic Sea (38 strains), the US DOE Hanford Uranium bioremediation site [Hanford Reach of the Columbia River (HRCR), 11 strains], Pacific Ocean and Hawaiian sediments (8 strains), and other sources (16 strains), with three outgroup strains, Rhodopseudomonas palustris, Clostridium cellulolyticum, and Thermoanaerobacter ethanolicus X514. The analyses used DNA relatedness derived from WCGA-based DNA-DNA hybridizations, sequence similarities of the 16S rRNA and gyrB genes, and sequence similarities of 6 loci of the Shewanella genome, selected from a list of genes shared by the whole-genome-sequenced Shewanella strains on the basis of their average nucleotide identity (ANI). The phylogenetic trees based on 16S rRNA and gyrB gene sequences and on DNA relatedness derived from WCGA hybridizations of the tested Shewanella strains share the same sub-clusters with very few exceptions, and the strains were largely grouped by species. However, the phylogenetic analysis based on DNA relatedness derived from WCGA hybridizations dramatically increased the resolution of differentiation at the species and strain levels within the genus Shewanella. When the tree based on DNA relatedness derived from WCGA hybridizations was compared to the tree based on the combined sequences of the selected functional genes (6 loci), we found that the resolution of the two methods was similar, but the clustering of the tree based on DNA relatedness derived from WCGA hybridizations was clearer. These results indicate that WCGA-based DNA-DNA hybridization is an ideal alternative to conventional DNA-DNA hybridization methods and that it is superior to phylogenetic methods based on sequence similarities of single genes. Detailed analysis is being performed for the re-classification of the strains examined.

  2. Targeted amplicon sequencing (TAS): a scalable next-gen approach to multilocus, multitaxa phylogenetics.

    Science.gov (United States)

    Bybee, Seth M; Bracken-Grissom, Heather; Haynes, Benjamin D; Hermansen, Russell A; Byers, Robert L; Clement, Mark J; Udall, Joshua A; Wilcox, Edward R; Crandall, Keith A

    2011-01-01

    Next-gen sequencing technologies have revolutionized data collection in genetic studies and advanced genome biology to novel frontiers. However, to date, next-gen technologies have been used principally for whole genome sequencing and transcriptome sequencing. Yet many questions in population genetics and systematics rely on sequencing specific genes of known function or diversity levels. Here, we describe a targeted amplicon sequencing (TAS) approach capitalizing on next-gen capacity to sequence large numbers of targeted gene regions from a large number of samples. Our TAS approach is easily scalable, simple in execution, neither time- nor labor-intensive, relatively inexpensive, and can be applied to a broad diversity of organisms and/or genes. Our TAS approach includes a bioinformatic application, BarcodeCrucher, that takes raw next-gen sequence reads, performs quality control checks, and converts the data into FASTA format organized by gene and sample, ready for phylogenetic analyses. We demonstrate our approach by sequencing targeted genes of known phylogenetic utility to estimate a phylogeny for the Pancrustacea. We generated data from 44 taxa using 68 different 10-bp multiplexing identifiers. The overall quality of data produced was robust and was informative for phylogeny estimation. The potential for this method to produce copious amounts of data from a single 454 plate (e.g., 325 taxa for 24 loci) significantly reduces sequencing expenses incurred from traditional Sanger sequencing. We further discuss the advantages and disadvantages of this method, while offering suggestions to enhance the approach.
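
    The step of sorting reads by multiplexing identifier and writing FASTA per sample can be illustrated with a minimal sketch. It is not the BarcodeCrucher implementation, whose internals the abstract does not describe. The function names, the identifier-to-sample table, and the reads are hypothetical, and the sketch assumes each read begins with its 10-bp identifier and skips quality control entirely.

```python
from collections import defaultdict

MID_LENGTH = 10  # the abstract reports 10-bp multiplexing identifiers


def demultiplex(reads, mid_to_sample):
    """Group reads by sample using the leading identifier (assumed layout).

    reads: iterable of (read_id, sequence) pairs.
    mid_to_sample: dict mapping identifier sequence -> sample name.
    Returns (reads grouped by sample, reads that could not be assigned).
    """
    by_sample = defaultdict(list)
    unassigned = []
    for read_id, seq in reads:
        mid, insert = seq[:MID_LENGTH], seq[MID_LENGTH:]
        sample = mid_to_sample.get(mid)
        if sample is None or not insert:
            unassigned.append((read_id, seq))
        else:
            by_sample[sample].append((read_id, insert))
    return dict(by_sample), unassigned


def to_fasta(records):
    """Render (identifier, sequence) pairs as FASTA text."""
    return "".join(f">{rid}\n{seq}\n" for rid, seq in records)


# Hypothetical identifiers and reads, for illustration only.
mids = {"ACGTACGTAC": "sample_A", "TGCATGCATG": "sample_B"}
reads = [
    ("read1", "ACGTACGTACGGGTTTAAA"),
    ("read2", "TGCATGCATGCCCAAATTT"),
    ("read3", "NNNNNNNNNNGGGTTTAAA"),  # identifier not in the table
]
by_sample, unassigned = demultiplex(reads, mids)
fasta_a = to_fasta(by_sample["sample_A"])
```

    A real pipeline would also need to tolerate sequencing errors in the identifier and to split reads by gene, neither of which is attempted here.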

  3. Nonbinary Tree-Based Phylogenetic Networks.

    Science.gov (United States)

    Jetten, Laura; van Iersel, Leo

    2018-01-01

    Rooted phylogenetic networks are used to describe evolutionary histories that contain non-treelike evolutionary events such as hybridization and horizontal gene transfer. In some cases, such histories can be described by a phylogenetic base-tree with additional linking arcs, which can, for example, represent gene transfer events. Such phylogenetic networks are called tree-based. Here, we consider two possible generalizations of this concept to nonbinary networks, which we call tree-based and strictly-tree-based nonbinary phylogenetic networks. We give simple graph-theoretic characterizations of tree-based and strictly-tree-based nonbinary phylogenetic networks. Moreover, we show for each of these two classes that it can be decided in polynomial time whether a given network is contained in the class. Our approach also provides a new view on tree-based binary phylogenetic networks. Finally, we discuss two examples of nonbinary phylogenetic networks in biology and show how our results can be applied to them.

  4. Maximizing the phylogenetic diversity of seed banks.

    Science.gov (United States)

    Griffiths, Kate E; Balding, Sharon T; Dickie, John B; Lewis, Gwilym P; Pearce, Tim R; Grenyer, Richard

    2015-04-01

    Ex situ conservation efforts such as those of zoos, botanical gardens, and seed banks will form a vital complement to in situ conservation actions over the coming decades. It is therefore necessary to pay the same attention to the biological diversity represented in ex situ conservation facilities as is often paid to protected-area networks. Building the phylogenetic diversity of ex situ collections will strengthen our capacity to respond to biodiversity loss. Since 2000, the Millennium Seed Bank Partnership has banked seed from 14% of the world's plant species. We assessed the taxonomic, geographic, and phylogenetic diversity of the Millennium Seed Bank collection of legumes (Leguminosae). We compared the collection with all known legume genera, their known geographic range (at country and regional levels), and a genus-level phylogeny of the legume family constructed for this study. Over half the phylogenetic diversity of legumes at the genus level was represented in the Millennium Seed Bank. However, pragmatic prioritization of species of economic importance and endangerment has led to the banking of a less-than-optimal phylogenetic diversity and prioritization of range-restricted species risks an underdispersed collection. The current state of the phylogenetic diversity of legumes in the Millennium Seed Bank could be substantially improved through the strategic banking of relatively few additional taxa. Our method draws on tools that are widely applied to in situ conservation planning, and it can be used to evaluate and improve the phylogenetic diversity of ex situ collections. © 2014 Society for Conservation Biology.

  5. Applying phylogenetic analysis to viral livestock diseases: moving beyond molecular typing.

    Science.gov (United States)

    Olvera, Alex; Busquets, Núria; Cortey, Marti; de Deus, Nilsa; Ganges, Llilianne; Núñez, José Ignacio; Peralta, Bibiana; Toskano, Jennifer; Dolz, Roser

    2010-05-01

    Changes in livestock production systems in recent years have altered the presentation of many diseases resulting in the need for more sophisticated control measures. At the same time, new molecular assays have been developed to support the diagnosis of animal viral disease. Nucleotide sequences generated by these diagnostic techniques can be used in phylogenetic analysis to infer phenotypes by sequence homology and to perform molecular epidemiology studies. In this review, some key elements of phylogenetic analysis are highlighted, such as the selection of the appropriate neutral phylogenetic marker, the proper phylogenetic method and different techniques to test the reliability of the resulting tree. Examples are given of current and future applications of phylogenetic reconstructions in viral livestock diseases. Copyright 2009 Elsevier Ltd. All rights reserved.

  6. Inferring influenza global transmission networks without complete phylogenetic information.

    Science.gov (United States)

    Aris-Brosou, Stéphane

    2014-03-01

    Influenza is one of the most severe respiratory infections affecting humans throughout the world, yet the dynamics of its global transmission network are still contentious. Here, I describe a novel combination of phylogenetics, time series, and graph theory to analyze 14.25 years of data stratified in space and in time, focusing on the main target of the human immune response, the hemagglutinin gene. While bypassing the complete phylogenetic inference of huge data sets, the method still extracts information suggesting that waves of genetic or of nucleotide diversity circulate continuously around the globe for subtypes that undergo sustained transmission over several seasons, such as H3N2 and pandemic H1N1/09, while diversity of prepandemic H1N1 viruses had until 2009 a noncontinuous transmission pattern consistent with a source/sink model. Irrespective of the shift in the structure of H1N1 diversity circulation with the emergence of the pandemic H1N1/09 strain, US prevalence peaks during the winter months when genetic diversity is at its lowest. This suggests that a dominant strain is generally responsible for epidemics and that monitoring genetic and/or nucleotide diversity in real time could provide public health agencies with an indirect estimate of prevalence.

  7. Reconstruction of certain phylogenetic networks from their tree-average distances.

    Science.gov (United States)

    Willson, Stephen J

    2013-10-01

    Trees are commonly utilized to describe the evolutionary history of a collection of biological species, in which case the trees are called phylogenetic trees. Often these are reconstructed from data by making use of distances between extant species corresponding to the leaves of the tree. Because of increased recognition of the possibility of hybridization events, more attention is being given to the use of phylogenetic networks that are not necessarily trees. This paper describes the reconstruction of certain such networks from the tree-average distances between the leaves. For a certain class of phylogenetic networks, a polynomial-time method is presented to reconstruct the network from the tree-average distances. The method is proved to work if there is a single reticulation cycle.

  8. Boundary methods for mode estimation

    Science.gov (United States)

    Pierson, William E., Jr.; Ulug, Batuhan; Ahalt, Stanley C.

    1999-08-01

    This paper investigates the use of Boundary Methods (BMs), a collection of tools used for distribution analysis, as a method for estimating the number of modes associated with a given data set. Model order information of this type is required by several pattern recognition applications. The BM technique provides a novel approach to this parameter estimation problem and is comparable in terms of both accuracy and computations to other popular mode estimation techniques currently found in the literature and automatic target recognition applications. This paper explains the methodology used in the BM approach to mode estimation. Also, this paper briefly reviews other common mode estimation techniques and describes the empirical investigation used to explore the relationship of the BM technique to other mode estimation techniques. Specifically, the accuracy and computational efficiency of the BM technique are compared quantitatively to a mixture of Gaussians (MOG) approach and a k-means approach to model order estimation. The stopping criterion of the MOG and k-means techniques is the Akaike Information Criterion (AIC).
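
    As a rough illustration of AIC-based model order selection, the sketch below scores candidate numbers of modes with the standard formula AIC = 2k - 2 ln L and keeps the lowest score. The log-likelihood values are invented for illustration, and the parameter counts assume one-dimensional Gaussian mixtures (3K - 1 free parameters for K components). The paper's own fitting procedure is not reproduced.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def select_order(fits):
    """Pick the candidate model order with the lowest AIC.

    fits: dict mapping number of modes -> (log_likelihood, n_params).
    Returns (best_order, dict of AIC scores per order).
    """
    scores = {order: aic(ll, k) for order, (ll, k) in fits.items()}
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical fits: values are made up, not taken from the paper.
fits = {
    1: (-520.0, 2),
    2: (-480.0, 5),
    3: (-478.5, 8),
}
best_order, scores = select_order(fits)
```

    Here the third component improves the likelihood only slightly, so the penalty term 2k outweighs the gain and the two-mode model is selected.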

  9. Incorporating phylogenetic information for the definition of floristic districts in hyperdiverse Amazon forests: Implications for conservation.

    Science.gov (United States)

    Guevara Andino, Juan Ernesto; Pitman, Nigel C A; Ter Steege, Hans; Mogollón, Hugo; Ceron, Carlos; Palacios, Walter; Oleas, Nora; Fine, Paul V A

    2017-11-01

    Using complementary metrics to evaluate phylogenetic diversity can facilitate the delimitation of floristic units and conservation priority areas. In this study, we describe the spatial patterns of phylogenetic alpha and beta diversity, phylogenetic endemism, and evolutionary distinctiveness of the hyperdiverse Ecuadorian Amazon forests and define priority areas for conservation. We established a network of 62 one-hectare plots in terra firme forests of the Ecuadorian Amazon. In these plots, we tagged, collected, and identified every single adult tree with dbh ≥10 cm. These data were combined with a regional community phylogenetic tree to calculate different phylogenetic diversity (PD) metrics in order to create spatial models. We used Loess regression to estimate the spatial variation of taxonomic and phylogenetic beta diversity as well as phylogenetic endemism and evolutionary distinctiveness. We found evidence for the definition of three floristic districts in the Ecuadorian Amazon, supported by both taxonomic and phylogenetic diversity data. Areas with high levels of phylogenetic endemism and evolutionary distinctiveness in Ecuadorian Amazon forests are unprotected. Furthermore, these areas are severely threatened by proposed plans for large-scale oil extraction and mining and should be prioritized in conservation planning for this region.

  10. The prevalence of terraced treescapes in analyses of phylogenetic data sets.

    Science.gov (United States)

    Dobrin, Barbara H; Zwickl, Derrick J; Sanderson, Michael J

    2018-04-04

    The pattern of data availability in a phylogenetic data set may lead to the formation of terraces, collections of equally optimal trees. Terraces can arise in tree space if trees are scored with parsimony or with partitioned, edge-unlinked maximum likelihood. Theory predicts that terraces can be large, but their prevalence in contemporary data sets has never been surveyed. We selected 26 data sets and phylogenetic trees reported in recent literature and investigated the terraces to which the trees would belong, under a common set of inference assumptions. We examined terrace size as a function of the sampling properties of the data sets, including taxon coverage density (the proportion of taxon-by-gene positions with any data present) and a measure of gene sampling "sufficiency". We evaluated each data set in relation to the theoretical minimum gene sampling depth needed to reduce terrace size to a single tree, and explored the impact of the terraces found in replicate trees in bootstrap methods. Terraces were identified in nearly all data sets with taxon coverage densities [...] tree. Terraces found during bootstrap resampling reduced overall support. If certain inference assumptions apply, trees estimated from empirical data sets often belong to large terraces of equally optimal trees. Terrace size correlates with data set sampling properties. Data sets seldom include enough genes to reduce terrace size to one tree. When bootstrap replicate trees lie on a terrace, statistical support for phylogenetic hypotheses may be reduced. Although some of the published analyses surveyed were conducted with edge-linked inference models (which do not induce terraces), unlinked models have been used and advocated. The present study describes the potential impact of that inference assumption on phylogenetic inference in the context of the kinds of multigene data sets now widely assembled for large-scale tree construction.
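
    Taxon coverage density, as defined in the abstract, is simple to compute from a taxon-by-gene presence/absence matrix. The sketch below assumes such a matrix is available as nested lists of 0/1 values. The function name and the example matrix are hypothetical.

```python
def taxon_coverage_density(matrix):
    """Proportion of taxon-by-gene cells that contain any data.

    matrix: list of rows (one per taxon), each a list of 0/1 flags
    (one per gene), where 1 means the taxon has data for that gene.
    """
    cells = [cell for row in matrix for cell in row]
    if not cells:
        raise ValueError("matrix must contain at least one cell")
    return sum(1 for cell in cells if cell) / len(cells)


# Hypothetical data set: 4 taxa (rows) by 3 genes (columns).
presence = [
    [1, 1, 1],
    [1, 0, 1],
    [0, 1, 1],
    [1, 1, 0],
]
density = taxon_coverage_density(presence)  # 9 of 12 cells filled
```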

  11. New substitution models for rooting phylogenetic trees.

    Science.gov (United States)

    Williams, Tom A; Heaps, Sarah E; Cherlin, Svetlana; Nye, Tom M W; Boys, Richard J; Embley, T Martin

    2015-09-26

    The root of a phylogenetic tree is fundamental to its biological interpretation, but standard substitution models do not provide any information on its position. Here, we describe two recently developed models that relax the usual assumptions of stationarity and reversibility, thereby facilitating root inference without the need for an outgroup. We compare the performance of these models on a classic test case for phylogenetic methods, before considering two highly topical questions in evolutionary biology: the deep structure of the tree of life and the root of the archaeal radiation. We show that all three alignments contain meaningful rooting information that can be harnessed by these new models, thus complementing and extending previous work based on outgroup rooting. In particular, our analyses exclude the root of the tree of life from the eukaryotes or Archaea, placing it on the bacterial stem or within the Bacteria. They also exclude the root of the archaeal radiation from several major clades, consistent with analyses using other rooting methods. Overall, our results demonstrate the utility of non-reversible and non-stationary models for rooting phylogenetic trees, and identify areas where further progress can be made. © 2015 The Authors.

  12. Application of unweighted pair group methods with arithmetic average (UPGMA) for identification of kinship types and spreading of ebola virus through establishment of phylogenetic tree

    Science.gov (United States)

    Andriani, Tri; Irawan, Mohammad Isa

    2017-08-01

    Ebola Virus Disease (EVD) is a disease caused by a virus of the genus Ebolavirus (EBOV), family Filoviridae. Ebola virus is classified into five types, namely Zaire ebolavirus (ZEBOV), Sudan ebolavirus (SEBOV), Bundibugyo ebolavirus (BEBOV), Tai Forest ebolavirus, also known as Cote d'Ivoire ebolavirus (CIEBOV), and Reston ebolavirus (REBOV). Kinship among the types of Ebola virus can be identified using phylogenetic trees. In this study, the phylogenetic tree was constructed by the UPGMA method, with multiple alignment performed using a progressive method. The resulting tree shows that Tai Forest ebolavirus is closely related to Bundibugyo ebolavirus, although the areas in which the two epidemics spread are far apart. The genetic distance between Bundibugyo ebolavirus and Tai Forest ebolavirus is 0.3725. The similarity of Tai Forest ebolavirus to Bundibugyo ebolavirus is thus not influenced by the geographic proximity of the areas in which the epidemics spread.
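
    For readers unfamiliar with UPGMA, the following is a minimal sketch of the general algorithm, not the authors' code. It repeatedly merges the two closest clusters, places the new node at half their distance, and updates distances as size-weighted averages. The taxa A to D and their distances are a toy example and are not Ebola virus data.

```python
from itertools import combinations


def upgma(names, dist):
    """Cluster taxa by UPGMA.

    names: list of taxon labels.
    dist: dict mapping frozenset({a, b}) -> distance between a and b.
    Returns a list of (cluster, height) pairs in merge order, where a
    cluster is a nested tuple and height is half the merge distance.
    """
    sizes = {name: 1 for name in names}  # active cluster -> number of leaves
    d = dict(dist)
    merges = []
    while len(sizes) > 1:
        # Find the closest pair of active clusters.
        a, b = min(combinations(sizes, 2), key=lambda p: d[frozenset(p)])
        height = d[frozenset((a, b))] / 2
        size_a, size_b = sizes.pop(a), sizes.pop(b)
        new = (a, b)
        # Distance to every other cluster is the size-weighted average.
        for c in sizes:
            d[frozenset((new, c))] = (
                size_a * d[frozenset((a, c))] + size_b * d[frozenset((b, c))]
            ) / (size_a + size_b)
        sizes[new] = size_a + size_b
        merges.append((new, height))
    return merges


# Toy distance matrix (hypothetical values).
pairs = {
    ("A", "B"): 2.0, ("A", "C"): 6.0, ("A", "D"): 10.0,
    ("B", "C"): 6.0, ("B", "D"): 10.0, ("C", "D"): 10.0,
}
dist = {frozenset(k): v for k, v in pairs.items()}
merges = upgma(["A", "B", "C", "D"], dist)
heights = [h for _, h in merges]
```

    In this toy case A and B join first, then C, then D. UPGMA assumes a constant rate of evolution, so the resulting tree is ultrametric.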

  13. Phylogenetic signal dissection identifies the root of starfishes.

    Directory of Open Access Journals (Sweden)

    Roberto Feuda

    Relationships within the class Asteroidea have remained controversial for almost 100 years and, despite many attempts to resolve this problem using molecular data, no consensus has yet emerged. Using two nuclear genes and a taxon sampling covering the major asteroid clades, we show that non-phylogenetic signal created by three factors (long branch attraction, compositional heterogeneity and the use of poorly fitting models of evolution) has confounded accurate estimation of phylogenetic relationships. To overcome the effect of this non-phylogenetic signal we analyse the data using non-homogeneous models, site stripping and the creation of subpartitions aimed to reduce or amplify the systematic error, and calculate Bayes Factor support for a selection of previously suggested topological arrangements of asteroid orders. We show that most of the previous alternative hypotheses are not supported in the most reliable data partitions, including the previously suggested placement of either Forcipulatida or Paxillosida as sister group to the other major branches. The best-supported solution places Velatida as the sister group to other asteroids, and the implications of this finding for the morphological evolution of asteroids are presented.

  14. Phylo_dCor: distance correlation as a novel metric for phylogenetic profiling.

    Science.gov (United States)

    Sferra, Gabriella; Fratini, Federica; Ponzi, Marta; Pizzi, Elisabetta

    2017-09-05

    Elaboration of powerful methods to predict functional and/or physical protein-protein interactions from genome sequence is one of the main tasks in the post-genomic era. Phylogenetic profiling allows the prediction of protein-protein interactions at a whole genome level in both Prokaryotes and Eukaryotes. For this reason it is considered one of the most promising methods. Here, we propose an improvement of phylogenetic profiling that enables the handling of large genomic datasets and the inference of global protein-protein interactions. This method uses the distance correlation as a new measure of phylogenetic profile similarity. We constructed robust reference sets and developed Phylo-dCor, a parallelized version of the algorithm for calculating the distance correlation that makes it applicable to large genomic data. Using Saccharomyces cerevisiae and Escherichia coli genome datasets, we showed that Phylo-dCor outperforms previously described phylogenetic profiling methods based on mutual information and Pearson's correlation as measures of profile similarity. In this work, we constructed and assessed robust reference sets and propose the distance correlation as a measure for comparing phylogenetic profiles. To make it applicable to large genomic data, we developed Phylo-dCor, a parallelized version of the algorithm for calculating the distance correlation. Two R scripts that can be run on a wide range of machines are available upon request.
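
    The sample distance correlation itself can be sketched in a few lines. The code below is a plain, unparallelized illustration of the standard definition (double-centered pairwise distance matrices) and is not the Phylo-dCor R implementation. It assumes each profile is a vector with one numeric entry per genome, and the example profiles are hypothetical.

```python
import math


def _centered_distances(values):
    """Pairwise absolute differences, double-centered by row and grand means."""
    n = len(values)
    d = [[abs(values[i] - values[j]) for j in range(n)] for i in range(n)]
    row_means = [sum(row) / n for row in d]  # equal to column means (symmetric)
    grand_mean = sum(row_means) / n
    return [
        [d[i][j] - row_means[i] - row_means[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def distance_correlation(x, y):
    """Sample distance correlation between two equal-length profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    n = len(x)
    a, b = _centered_distances(x), _centered_distances(y)

    def mean_product(p, q):
        return sum(p[i][j] * q[i][j] for i in range(n) for j in range(n)) / (n * n)

    dcov2 = mean_product(a, b)
    dvar_x, dvar_y = mean_product(a, a), mean_product(b, b)
    if dvar_x == 0 or dvar_y == 0:
        return 0.0  # a constant profile carries no information
    return math.sqrt(max(dcov2, 0.0) / math.sqrt(dvar_x * dvar_y))


# Hypothetical presence/absence profiles across six genomes.
profile_a = [1, 1, 0, 0, 1, 0]
profile_b = [1 - v for v in profile_a]  # exact complement of profile_a
same = distance_correlation(profile_a, profile_a)
complement = distance_correlation(profile_a, profile_b)
```

    Because the measure depends only on pairwise distances within each profile, a profile and its exact complement score 1, so anti-correlated presence patterns are detected as well as matching ones.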

  15. Transforming phylogenetic networks: Moving beyond tree space

    OpenAIRE

    Huber, Katharina T.; Moulton, Vincent; Wu, Taoyang

    2016-01-01

    Phylogenetic networks are a generalization of phylogenetic trees that are used to represent reticulate evolution. Unrooted phylogenetic networks form a special class of such networks, which naturally generalize unrooted phylogenetic trees. In this paper we define two operations on unrooted phylogenetic networks, one of which is a generalization of the well-known nearest-neighbor interchange (NNI) operation on phylogenetic trees. We show that any unrooted phylogenetic network can be transforme...

  16. Estimation of subcriticality of TCA using 'indirect estimation method for calculation error'

    International Nuclear Information System (INIS)

    Naito, Yoshitaka; Yamamoto, Toshihiro; Arakawa, Takuya; Sakurai, Kiyoshi

    1996-01-01

    To estimate the subcriticality of the neutron multiplication factor in a fissile system, the 'Indirect Estimation Method for Calculation Error' is proposed. This method obtains the calculation error of the neutron multiplication factor by correlating measured values with the corresponding calculated ones. This method was applied to the source multiplication and to the pulse neutron experiments conducted at TCA, and the calculation error of MCNP 4A was estimated. In the source multiplication method, the deviation of the measured neutron count rate distributions from the calculated ones gives an estimate of the accuracy of the calculated k_eff. In the pulse neutron method, the calculation errors of the prompt neutron decay constants give the accuracy of the calculated k_eff. (author)

  17. Construction of a phylogenetic tree of photosynthetic prokaryotes based on average similarities of whole genome sequences.

    Directory of Open Access Journals (Sweden)

    Soichirou Satoh

    Phylogenetic trees have been constructed for a wide range of organisms using gene sequence information, especially through the identification of orthologous genes that have been vertically inherited. The number of available complete genome sequences is rapidly increasing, and many tools for construction of genome trees based on whole genome sequences have been proposed. However, a reasonable method of using complete genome sequences for construction of phylogenetic trees has not yet been established. We have developed a method for construction of phylogenetic trees based on the average sequence similarities of whole genome sequences. We used this method to examine the phylogeny of 115 photosynthetic prokaryotes, i.e., cyanobacteria, Chlorobi, proteobacteria, Chloroflexi, Firmicutes and nonphotosynthetic organisms including Archaea. Although the bootstrap values for the branching order of phyla were low, probably due to lateral gene transfer and saturated mutation, the obtained tree was largely consistent with the previously reported phylogenetic trees, indicating that this method is a robust alternative to traditional phylogenetic methods.

  18. Phylogenetic Inference of HIV Transmission Clusters

    Directory of Open Access Journals (Sweden)

    Vlad Novitsky

    2017-10-01

    Better understanding the structure and dynamics of HIV transmission networks is essential for designing the most efficient interventions to prevent new HIV transmissions, and ultimately for gaining control of the HIV epidemic. The inference of phylogenetic relationships and the interpretation of results rely on the definition of the HIV transmission cluster. The definition of the HIV cluster is complex and dependent on multiple factors, including the design of sampling, accuracy of sequencing, precision of sequence alignment, evolutionary models, the phylogenetic method of inference, and specified thresholds for cluster support. While the majority of studies focus on clusters, non-clustered cases could also be highly informative. A new dimension in the analysis of the global and local HIV epidemics is the concept of phylogenetically distinct HIV sub-epidemics. The identification of active HIV sub-epidemics reveals spreading viral lineages and may help in the design of targeted interventions. HIV clustering can also be affected by sampling density. Obtaining a proper sampling density may increase statistical power and reduce sampling bias, so sampling density should be taken into account in study design and in interpretation of phylogenetic results. Finally, recent advances in long-range genotyping may enable more accurate inference of HIV transmission networks. If performed in real time, it could both inform public-health strategies and be clinically relevant (e.g., drug-resistance testing).

  19. DendroBLAST: approximate phylogenetic trees in the absence of multiple sequence alignments.

    Science.gov (United States)

    Kelly, Steven; Maini, Philip K

    2013-01-01

    The rapidly growing availability of genome information has created considerable demand for both fast and accurate phylogenetic inference algorithms. We present a novel method called DendroBLAST for reconstructing phylogenetic dendrograms/trees from protein sequences using BLAST. This method differs from other methods by incorporating a simple model of sequence evolution to test the effect of introducing sequence changes on the reliability of the bipartitions in the inferred tree. Using realistic simulated sequence data we demonstrate that this method produces phylogenetic trees that are more accurate than other commonly used distance-based methods, though not as accurate as maximum likelihood methods from good-quality multiple sequence alignments. In addition to tests on simulated data, we use DendroBLAST to generate input trees for a supertree reconstruction of the phylogeny of the Archaea. This independent analysis produces an approximate phylogeny of the Archaea that has both high precision and recall when compared to previously published analyses of the same dataset using conventional methods. Taken together these results demonstrate that approximate phylogenetic trees can be produced in the absence of multiple sequence alignments, and we propose that these trees will provide a platform for improving and informing downstream bioinformatic analysis. A web implementation of the DendroBLAST method is freely available for use at http://www.dendroblast.com/.

  20. DendroBLAST: approximate phylogenetic trees in the absence of multiple sequence alignments.

    Directory of Open Access Journals (Sweden)

    Steven Kelly

    The rapidly growing availability of genome information has created considerable demand for both fast and accurate phylogenetic inference algorithms. We present a novel method called DendroBLAST for reconstructing phylogenetic dendrograms/trees from protein sequences using BLAST. This method differs from other methods by incorporating a simple model of sequence evolution to test the effect of introducing sequence changes on the reliability of the bipartitions in the inferred tree. Using realistic simulated sequence data we demonstrate that this method produces phylogenetic trees that are more accurate than other commonly used distance-based methods, though not as accurate as maximum likelihood methods from good-quality multiple sequence alignments. In addition to tests on simulated data, we use DendroBLAST to generate input trees for a supertree reconstruction of the phylogeny of the Archaea. This independent analysis produces an approximate phylogeny of the Archaea that has both high precision and recall when compared to previously published analyses of the same dataset using conventional methods. Taken together these results demonstrate that approximate phylogenetic trees can be produced in the absence of multiple sequence alignments, and we propose that these trees will provide a platform for improving and informing downstream bioinformatic analysis. A web implementation of the DendroBLAST method is freely available for use at http://www.dendroblast.com/.

  1. Applying species-tree analyses to deep phylogenetic histories: challenges and potential suggested from a survey of empirical phylogenetic studies.

    Science.gov (United States)

    Lanier, Hayley C; Knowles, L Lacey

    2015-02-01

    Coalescent-based methods for species-tree estimation are becoming a dominant approach for reconstructing species histories from multi-locus data, with most of the studies examining these methodologies focused on recently diverged species. However, deeper phylogenies, such as the datasets that comprise many Tree of Life (ToL) studies, also exhibit gene-tree discordance. This discord may also arise from the stochastic sorting of gene lineages during the speciation process (i.e., reflecting the random coalescence of gene lineages in ancestral populations). It remains unknown whether guidelines regarding methodologies and numbers of loci established by simulation studies at shallow tree depths translate into accurate species relationships for deeper phylogenetic histories. We address this knowledge gap and specifically identify the challenges and limitations of species-tree methods that account for coalescent variance for deeper phylogenies. Using simulated data with characteristics informed by empirical studies, we evaluate both the accuracy of estimated species trees and the characteristics associated with recalcitrant nodes, with a specific focus on whether coalescent variance is generally responsible for the lack of resolution. By determining the proportion of coalescent genealogies that support a particular node, we demonstrate that (1) species-tree methods account for coalescent variance at deep nodes and (2) mutational variance, not gene-tree discord arising from the coalescent, posed the primary challenge for accurate reconstruction across the tree. For example, many nodes were accurately resolved despite predicted discord from the random coalescence of gene lineages, and nodes with poor support were distributed across a range of depths (i.e., they were not restricted to particular recent divergences). Given their broad taxonomic scope and large sampling of taxa, deep-level phylogenies pose several potential methodological complications including

  2. A Multi-Criterion Evolutionary Approach Applied to Phylogenetic Reconstruction

    OpenAIRE

    Cancino, W.; Delbem, A.C.B.

    2010-01-01

    In this paper, we propose an MOEA approach, called PhyloMOEA, which solves the phylogenetic inference problem using maximum parsimony and maximum likelihood criteria. The development of PhyloMOEA was motivated by several studies in the literature (Huelsenbeck, 1995; Jin & Nei, 1990; Kuhner & Felsenstein, 1994; Tateno et al., 1994), which point out that various phylogenetic inference methods lead to inconsistent solutions. Techniques using parsimony and likelihood criteria yield different tr...

  3. Developing a statistically powerful measure for quartet tree inference using phylogenetic identities and Markov invariants.

    Science.gov (United States)

    Sumner, Jeremy G; Taylor, Amelia; Holland, Barbara R; Jarvis, Peter D

    2017-12-01

    Recently there has been renewed interest in phylogenetic inference methods based on phylogenetic invariants, alongside the related Markov invariants. Broadly speaking, both these approaches give rise to polynomial functions of sequence site patterns that, in expectation value, either vanish for particular evolutionary trees (in the case of phylogenetic invariants) or have well understood transformation properties (in the case of Markov invariants). While both approaches have been valued for their intrinsic mathematical interest, it is not clear how they relate to each other, and to what extent they can be used as practical tools for inference of phylogenetic trees. In this paper, by focusing on the special case of binary sequence data and quartets of taxa, we are able to view these two different polynomial-based approaches within a common framework. To motivate the discussion, we present three desirable statistical properties that we argue any invariant-based phylogenetic method should satisfy: (1) sensible behaviour under reordering of input sequences; (2) stability as the taxa evolve independently according to a Markov process; and (3) explicit dependence on the assumption of a continuous-time process. Motivated by these statistical properties, we develop and explore several new phylogenetic inference methods. In particular, we develop a statistically bias-corrected version of the Markov invariants approach which satisfies all three properties. We also extend previous work by showing that the phylogenetic invariants can be implemented in such a way as to satisfy property (3). A simulation study shows that, in comparison to other methods, our new proposed approach based on bias-corrected Markov invariants is extremely powerful for phylogenetic inference. The binary case is of particular theoretical interest as, in this case only, the Markov invariants can be expressed as linear combinations of the phylogenetic invariants. A wider implication of this is that, for

  4. The phylogenetic relationships among infraorders and superfamilies of Diptera based on morphological evidence

    DEFF Research Database (Denmark)

    Lambkin, Christine L.; Sinclair, Bradley J.; Pape, Thomas

    2013-01-01

    Members of the megadiverse insect order Diptera (flies) have successfully colonized all continents and nearly all habitats. There are more than 154 000 described fly species, representing 10-12% of animal species. Elucidating the phylogenetic relationships of such a large component of global...... biodiversity is challenging, but significant advances have been made in the last few decades. Since Hennig first discussed the monophyly of major groupings, Diptera has attracted much study, but most researchers have used non-numerical qualitative methods to assess morphological data. More recently......, quantitative phylogenetic methods have been used on both morphological and molecular data. All previous quantitative morphological studies addressed narrower phylogenetic problems, often below the suborder or infraorder level. Here we present the first numerical analysis of phylogenetic relationships...

  5. Accurate phylogenetic tree reconstruction from quartets: a heuristic approach.

    Science.gov (United States)

    Reaz, Rezwana; Bayzid, Md Shamsuzzoha; Rahman, M Sohel

    2014-01-01

    Supertree methods construct trees on a set of taxa (species) by combining many smaller trees on overlapping subsets of the entire set of taxa. A 'quartet' is an unrooted tree over 4 taxa; hence quartet-based supertree methods combine many 4-taxon unrooted trees into a single, coherent tree over the complete set of taxa. Quartet-based phylogeny reconstruction methods have received considerable attention in recent years. An accurate and efficient quartet-based method might be competitive with the current best phylogenetic tree reconstruction methods (such as maximum likelihood or Bayesian MCMC analyses), without being as computationally intensive. In this paper, we present a novel and highly accurate quartet-based phylogenetic tree reconstruction method. We performed an extensive experimental study to evaluate the accuracy and scalability of our approach on both simulated and biological datasets.
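    To make the notion of a quartet concrete, the following Python sketch extracts the 4-taxon topology that a larger unrooted tree induces on four chosen taxa. It is not the method of the paper above. It assumes a tree is stored as a list of its non-trivial splits (one side of each bipartition), and the function name `induced_quartet` and the five-taxon example tree are hypothetical.

```python
from itertools import combinations

def induced_quartet(splits, a, b, c, d):
    """Return the quartet topology induced on taxa a, b, c, d.

    `splits` holds one side of each non-trivial bipartition of the tree.
    The result is a frozenset of two pairs (e.g. AB|CD), or None when no
    split separates the four taxa two-against-two (an unresolved quartet).
    """
    quartet = frozenset({a, b, c, d})
    for side in splits:
        inside = quartet & side
        if len(inside) == 2:
            return frozenset({inside, quartet - inside})
    return None

# Hypothetical unrooted tree ((A,B),C,(D,E)) written as its two non-trivial splits.
tree_splits = [frozenset("AB"), frozenset("DE")]

# Every 4-taxon subset of a fully resolved tree induces exactly one quartet.
quartets = {
    four: induced_quartet(tree_splits, *four)
    for four in combinations("ABCDE", 4)
}
```

    A quartet-based supertree method works in the opposite direction: it starts from a collection of such 4-taxon trees and searches for a full tree that agrees with as many of them as possible.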

  6. Predicting community structure in snakes on Eastern Nearctic islands using ecological neutral theory and phylogenetic methods.

    Science.gov (United States)

    Burbrink, Frank T; McKelvy, Alexander D; Pyron, R Alexander; Myers, Edward A

    2015-11-22

    Predicting species presence and richness on islands is important for understanding the origins of communities and how likely it is that species will disperse and resist extinction. The equilibrium theory of island biogeography (ETIB) and, as a simple model of sampling abundances, the unified neutral theory of biodiversity (UNTB), predict that in situations where mainland to island migration is high, species-abundance relationships explain the presence of taxa on islands. Thus, more abundant mainland species should have a higher probability of occurring on adjacent islands. In contrast to UNTB, if certain groups have traits that permit them to disperse to islands better than other taxa, then phylogeny may be more predictive of which taxa will occur on islands. Taking surveys of 54 island snake communities in the Eastern Nearctic along with mainland communities that have abundance data for each species, we use phylogenetic assembly methods and UNTB estimates to predict island communities. Species richness is predicted by island area, whereas turnover from the mainland to island communities is random with respect to phylogeny. Community structure appears to be ecologically neutral and abundance on the mainland is the best predictor of presence on islands. With regard to young and proximate islands, where allopatric or cladogenetic speciation is not a factor, we find that simple neutral models following UNTB and ETIB predict the structure of island communities. © 2015 The Author(s).

  7. Heuristic introduction to estimation methods

    International Nuclear Information System (INIS)

    Feeley, J.J.; Griffith, J.M.

    1982-08-01

    The methods and concepts of optimal estimation and control have been very successfully applied in the aerospace industry during the past 20 years. Although similarities exist between the problems (control, modeling, measurements) in the aerospace and nuclear power industries, the methods and concepts have found only scant acceptance in the nuclear industry. Differences in technical language seem to be a major reason for the slow transfer of estimation and control methods to the nuclear industry. Therefore, this report was written to present certain important and useful concepts with a minimum of specialized language. By employing a simple example throughout the report, the importance of several information and uncertainty sources is stressed and optimal ways of using or allowing for these sources are presented. This report discusses optimal estimation problems. A future report will discuss optimal control problems.

  8. Environmental and spatial drivers of taxonomic, functional, and phylogenetic characteristics of bat communities in human-modified landscapes

    Science.gov (United States)

    Fagan, Matthew E.; Willig, Michael R.

    2016-01-01

    Background Assembly of species into communities following human disturbance (e.g., deforestation, fragmentation) may be governed by spatial (e.g., dispersal) or environmental (e.g., niche partitioning) mechanisms. Variation partitioning has been used to broadly disentangle spatial and environmental mechanisms, and approaches utilizing functional and phylogenetic characteristics of communities have been implemented to determine the relative importance of particular environmental (or niche-based) mechanisms. Nonetheless, few studies have integrated these quantitative approaches to comprehensively assess the relative importance of particular structuring processes. Methods We employed a novel variation partitioning approach to evaluate the relative importance of particular spatial and environmental drivers of taxonomic, functional, and phylogenetic aspects of bat communities in a human-modified landscape in Costa Rica. Specifically, we estimated the amount of variation in species composition (taxonomic structure) and in two aspects of functional and phylogenetic structure (i.e., composition and dispersion) along a forest loss and fragmentation gradient that are uniquely explained by landscape characteristics (i.e., environment) or space to assess the importance of competing mechanisms. Results The unique effects of space on taxonomic, functional and phylogenetic structure were consistently small. In contrast, landscape characteristics (i.e., environment) played an appreciable role in structuring bat communities. Spatially-structured landscape characteristics explained 84% of the variation in functional or phylogenetic dispersion, and the unique effects of landscape characteristics significantly explained 14% of the variation in species composition. Furthermore, variation in bat community structure was primarily due to differences in dispersion of species within functional or phylogenetic space along the gradient, rather than due to differences in functional or

  9. Phylogenetic relationships in Solanaceae and related species based on cpDNA sequence from plastid trnE-trnT region

    Directory of Open Access Journals (Sweden)

    Danila Montewka Melotto-Passarin

    2008-01-01

    Intergenic spacers of chloroplast DNA (cpDNA) are very useful in phylogenetic and population genetic studies of plant species, to study their potential integration in phylogenetic analysis. The non-coding trnE-trnT intergenic spacer of cpDNA was analyzed to assess the nucleotide sequence polymorphism of 16 Solanaceae species and to estimate its ability to contribute to the resolution of phylogenetic studies of this group. Multiple alignments of DNA sequences of trnE-trnT intergenic spacer made the identification of nucleotide variability in this region possible and the phylogeny was estimated by maximum parsimony and rooted with Convolvulaceae Ipomoea batatas, the most closely related family. Besides, this intergenic spacer was tested for the phylogenetic ability to differentiate taxonomic levels. For this purpose, species from four other families were analyzed and compared with Solanaceae species. Results confirmed polymorphism in the trnE-trnT region at different taxonomic levels.

  10. Dynamically heterogenous partitions and phylogenetic inference: an evaluation of analytical strategies with cytochrome b and ND6 gene sequences in cranes.

    Science.gov (United States)

    Krajewski, C; Fain, M G; Buckley, L; King, D G

    1999-11-01

    Debates over whether molecular sequence data should be partitioned for phylogenetic analysis often confound two types of heterogeneity among partitions. We distinguish historical heterogeneity (i.e., different partitions have different evolutionary relationships) from dynamic heterogeneity (i.e., different partitions show different patterns of sequence evolution) and explore the impact of the latter on phylogenetic accuracy and precision with a two-gene, mitochondrial data set for cranes. The well-established phylogeny of cranes allows us to contrast tree-based estimates of relevant parameter values with estimates based on pairwise comparisons and to ascertain the effects of incorporating different amounts of process information into phylogenetic estimates. We show that codon positions in the cytochrome b and NADH dehydrogenase subunit 6 genes are dynamically heterogenous under both Poisson and invariable-sites + gamma-rates versions of the F84 model and that heterogeneity includes variation in base composition and transition bias as well as substitution rate. Estimates of transition-bias and relative-rate parameters from pairwise sequence comparisons were comparable to those obtained as tree-based maximum likelihood estimates. Neither rate-category nor mixed-model partitioning strategies resulted in a loss of phylogenetic precision relative to unpartitioned analyses. We suggest that weighted-average distances provide a computationally feasible alternative to direct maximum likelihood estimates of phylogeny for mixed-model analyses of large, dynamically heterogenous data sets. Copyright 1999 Academic Press.

  11. Phylogenetic affinity of tree shrews to Glires is attributed to fast evolution rate.

    Science.gov (United States)

    Lin, Jiannan; Chen, Guangfeng; Gu, Liang; Shen, Yuefeng; Zheng, Meizhu; Zheng, Weisheng; Hu, Xinjie; Zhang, Xiaobai; Qiu, Yu; Liu, Xiaoqing; Jiang, Cizhong

    2014-02-01

    Previous phylogenetic analyses have led to incongruent evolutionary relationships between tree shrews and other suborders of Euarchontoglires. What caused the incongruence remains elusive. In this study, we identified 6845 orthologous genes among seventeen placental mammals. Tree shrews and Primates were monophyletic in the phylogenetic trees derived from the first and/or second codon positions, whereas tree shrews and Glires formed a monophyletic group in the trees derived from the third or all codon positions. The same topology was obtained in the phylogeny inference using the slowly and fast evolving genes, respectively. This incongruence was likely attributable to the fast substitution rate in tree shrews and Glires. Notably, sequence GC content alone was not informative enough to resolve the controversial phylogenetic relationships between tree shrews, Glires, and Primates. Finally, estimation of the confidence of the tree selection strongly supported the phylogenetic affiliation of tree shrews to Primates as a monophyletic group. Copyright © 2013 Elsevier Inc. All rights reserved.

  12. On Nakhleh's metric for reduced phylogenetic networks

    OpenAIRE

    Cardona, Gabriel; Llabrés, Mercè; Rosselló, Francesc; Valiente Feruglio, Gabriel Alejandro

    2009-01-01

    We prove that Nakhleh’s metric for reduced phylogenetic networks is also a metric on the classes of tree-child phylogenetic networks, semibinary tree-sibling time consistent phylogenetic networks, and multilabeled phylogenetic trees. We also prove that it separates distinguishable phylogenetic networks. In this way, it becomes the strongest dissimilarity measure for phylogenetic networks available so far. Furthermore, we propose a generalization of that metric that separates arbitrary phyl...

  13. A Method of Nuclear Software Reliability Estimation

    International Nuclear Information System (INIS)

    Park, Gee Yong; Eom, Heung Seop; Cheon, Se Woo; Jang, Seung Cheol

    2011-01-01

    A method for estimating software reliability for nuclear safety software is proposed. This method is based on the software reliability growth model (SRGM), in which the behavior of software failure is assumed to follow a non-homogeneous Poisson process. Several modeling schemes are presented in order to estimate and predict more precisely the number of software defects from a small amount of software failure data. Bayesian statistical inference is employed to estimate the model parameters by incorporating the software test cases into the model. It is found that this method is capable of accurately estimating the remaining number of software defects of the on-demand type, which directly affect safety trip functions. The software reliability can be estimated from a model equation, and one method of obtaining the software reliability is proposed.

  14. Transforming phylogenetic networks: Moving beyond tree space.

    Science.gov (United States)

    Huber, Katharina T; Moulton, Vincent; Wu, Taoyang

    2016-09-07

    Phylogenetic networks are a generalization of phylogenetic trees that are used to represent reticulate evolution. Unrooted phylogenetic networks form a special class of such networks, which naturally generalize unrooted phylogenetic trees. In this paper we define two operations on unrooted phylogenetic networks, one of which is a generalization of the well-known nearest-neighbor interchange (NNI) operation on phylogenetic trees. We show that any unrooted phylogenetic network can be transformed into any other such network using only these operations. This generalizes the well-known fact that any phylogenetic tree can be transformed into any other such tree using only NNI operations. It also allows us to define a generalization of tree space and to define some new metrics on unrooted phylogenetic networks. To prove our main results, we employ some fascinating new connections between phylogenetic networks and cubic graphs that we have recently discovered. Our results should be useful in developing new strategies to search for optimal phylogenetic networks, a topic that has recently generated some interest in the literature, as well as for providing new ways to compare networks. Copyright © 2016 Elsevier Ltd. All rights reserved.
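    The nearest-neighbor interchange on trees, which the paper above generalizes to networks, can be illustrated with a short Python sketch. It covers only the tree case around a single internal edge, not the network operations the authors define. The nested-tuple representation and the name `nni_neighbors` are assumptions made for illustration.

```python
def nni_neighbors(edge):
    """Return the two NNI rearrangements around one internal edge.

    `edge` is ((a, b), (c, d)): subtrees a and b hang off one end of the
    internal edge, c and d off the other. An NNI swaps one subtree from
    each side, so AB|CD has exactly two neighbors: AC|BD and AD|BC.
    """
    (a, b), (c, d) = edge
    return [((a, c), (b, d)), ((a, d), (b, c))]

# The leaves may be single taxa or whole subtrees (nested tuples).
neighbors = nni_neighbors((("A", "B"), ("C", ("D", "E"))))
```

    Applying the same move at every internal edge of a tree yields its full NNI neighborhood, which is the basic step of many tree-search heuristics.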

  15. PhyLIS: a simple GNU/Linux distribution for phylogenetics and phyloinformatics.

    Science.gov (United States)

    Thomson, Robert C

    2009-07-30

    PhyLIS is a free GNU/Linux distribution that is designed to provide a simple, standardized platform for phylogenetic and phyloinformatic analysis. The operating system incorporates most commonly used phylogenetic software, which has been pre-compiled and pre-configured, allowing for straightforward application of phylogenetic methods and development of phyloinformatic pipelines in a stable Linux environment. The software is distributed as a live CD and can be installed directly or run from the CD without making changes to the computer. PhyLIS is available for free at http://www.eve.ucdavis.edu/rcthomson/phylis/.

  16. Method-related estimates of sperm vitality.

    Science.gov (United States)

    Cooper, Trevor G; Hellenkemper, Barbara

    2009-01-01

    Comparison of methods that estimate viability of human spermatozoa by monitoring head membrane permeability revealed that wet preparations (whether using positive or negative phase-contrast microscopy) generated significantly higher percentages of nonviable cells than did air-dried eosin-nigrosin smears. Only with the latter method did the sum of motile (presumed live) and stained (presumed dead) preparations never exceed 100%, making this the method of choice for sperm viability estimates.

  17. A format for phylogenetic placements.

    Directory of Open Access Journals (Sweden)

    Frederick A Matsen

    We have developed a unified format for phylogenetic placements, that is, mappings of environmental sequence data (e.g., short reads) into a phylogenetic tree. We are motivated to do so by the growing number of tools for computing and post-processing phylogenetic placements, and the lack of an established standard for storing them. The format is lightweight, versatile, extensible, and is based on the JSON format, which can be parsed by most modern programming languages. Our format is already implemented in several tools for computing and post-processing parsimony- and likelihood-based phylogenetic placements and has worked well in practice. We believe that establishing a standard format for analyzing read placements at this early stage will lead to a more efficient development of powerful and portable post-analysis tools for the growing applications of phylogenetic placement.
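    Because the format is JSON-based, a placement file can be read with a standard JSON parser. The Python sketch below shows one way this might look. The abstract does not list the keys, so the key names (`tree`, `fields`, `placements`, `n`, `p`), the edge-numbered tree string, and the example values are all assumptions for illustration and should be checked against the published specification.

```python
import json

# Hypothetical placement document; key names and values are illustrative only.
doc_text = """
{
  "tree": "((A:0.2{0},B:0.09{1}):0.7{2},C:0.5{3}){4};",
  "fields": ["edge_num", "like_weight_ratio"],
  "placements": [
    {"n": ["read_1"], "p": [[0, 0.9], [2, 0.1]]},
    {"n": ["read_2"], "p": [[3, 0.6], [1, 0.4]]}
  ],
  "version": 3
}
"""

def best_edges(text):
    """Map each read name to the edge with the highest placement weight."""
    doc = json.loads(text)
    fields = doc["fields"]
    edge_i = fields.index("edge_num")
    weight_i = fields.index("like_weight_ratio")
    best = {}
    for placement in doc["placements"]:
        # Each row of "p" is one candidate placement, ordered as in "fields".
        top = max(placement["p"], key=lambda row: row[weight_i])
        for name in placement["n"]:
            best[name] = top[edge_i]
    return best

best = best_edges(doc_text)
```

    Looking up column positions through the `fields` list, rather than hard-coding them, keeps the reader robust to files that store additional per-placement values.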

  18. The power and pitfalls of HIV phylogenetics in public health.

    Science.gov (United States)

    Brooks, James I; Sandstrom, Paul A

    2013-07-25

    Phylogenetics is the application of comparative studies of genetic sequences in order to infer evolutionary relationships among organisms. This tool can be used as a form of molecular epidemiology to enhance traditional population-level communicable disease surveillance. Phylogenetic study has resulted in new paradigms being created in the field of communicable diseases and this commentary aims to provide the reader with an explanation of how phylogenetics can be used in tracking infectious diseases. Special emphasis will be placed upon the application of phylogenetics as a tool to help elucidate HIV transmission patterns and the limitations to these methods when applied to forensic analysis. Understanding infectious disease epidemiology in order to prevent new transmissions is the sine qua non of public health. However, with increasing epidemiological resolution, there may be an associated potential loss of privacy to the individual. It is within this context that we aim to promote the discussion on how to use phylogenetics to achieve important public health goals, while at the same time protecting the rights of the individual.

  19. Bayesian nonparametric clustering in phylogenetics: modeling antigenic evolution in influenza.

    Science.gov (United States)

    Cybis, Gabriela B; Sinsheimer, Janet S; Bedford, Trevor; Rambaut, Andrew; Lemey, Philippe; Suchard, Marc A

    2018-01-30

    Influenza is responsible for up to 500,000 deaths every year, and antigenic variability represents much of its epidemiological burden. To visualize antigenic differences across many viral strains, antigenic cartography methods use multidimensional scaling on binding assay data to map influenza antigenicity onto a low-dimensional space. Analysis of such assay data ideally leads to natural clustering of influenza strains of similar antigenicity that correlate with sequence evolution. To understand the dynamics of these antigenic groups, we present a framework that jointly models genetic and antigenic evolution by combining multidimensional scaling of binding assay data, Bayesian phylogenetic machinery and nonparametric clustering methods. We propose a phylogenetic Chinese restaurant process that extends the current process to incorporate the phylogenetic dependency structure between strains in the modeling of antigenic clusters. With this method, we are able to use the genetic information to better understand the evolution of antigenicity throughout epidemics, as shown in applications of this model to H1N1 influenza. Copyright © 2017 John Wiley & Sons, Ltd.
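    For orientation, the standard (non-phylogenetic) Chinese restaurant process that the paper above extends can be simulated in a few lines of Python. This is a minimal sketch of the textbook process only and does not include the phylogenetic dependency structure the authors add. The function name `sample_crp` and the parameter values are illustrative assumptions.

```python
import random

def sample_crp(n_items, alpha, seed=0):
    """Draw one partition of n_items from a Chinese restaurant process.

    Item i joins existing cluster k with probability proportional to the
    cluster's current size, or opens a new cluster with probability
    proportional to the concentration parameter alpha (alpha > 0).
    """
    rng = random.Random(seed)
    assignments = []  # cluster index of each item, in arrival order
    counts = []       # current size of each cluster
    for _ in range(n_items):
        weights = counts + [alpha]  # last slot stands for "new cluster"
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)
        else:
            counts[k] += 1
        assignments.append(k)
    return assignments, counts

# Hypothetical run: 20 strains, concentration 1.0.
assignments, counts = sample_crp(20, alpha=1.0, seed=42)
```

    Larger values of `alpha` favor more, smaller clusters, and the number of clusters is not fixed in advance, which is what makes the process attractive for nonparametric clustering.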

  20. Phylogenetic inertia and Darwin's higher law.

    Science.gov (United States)

    Shanahan, Timothy

    2011-03-01

    The concept of 'phylogenetic inertia' is routinely deployed in evolutionary biology as an alternative to natural selection for explaining the persistence of characteristics that appear sub-optimal from an adaptationist perspective. However, in many of these contexts the precise meaning of 'phylogenetic inertia' and its relationship to selection are far from clear. After tracing the history of the concept of 'inertia' in evolutionary biology, I argue that treating phylogenetic inertia and natural selection as alternative explanations is mistaken because phylogenetic inertia is, from a Darwinian point of view, simply an expected effect of selection. Although Darwin did not discuss 'phylogenetic inertia,' he did assert the explanatory priority of selection over descent. An analysis of 'phylogenetic inertia' provides a perspective from which to assess Darwin's view. Copyright © 2010 Elsevier Ltd. All rights reserved.

  1. Secondary structure analyses of the nuclear rRNA internal transcribed spacers and assessment of its phylogenetic utility across the Brassicaceae (mustards).

    Directory of Open Access Journals (Sweden)

    Patrick P Edger

    The internal transcribed spacers of the nuclear ribosomal RNA gene cluster, termed ITS1 and ITS2, are the most frequently used nuclear markers for phylogenetic analyses across many eukaryotic groups including most plant families. The reasons for the popularity of these markers include: 1. Ease of amplification due to high copy number of the gene clusters, 2. Available cost-effective methods and highly conserved primers, 3. Rapidly evolving markers (i.e. variable between closely related species), and 4. The assumption (and/or treatment) that these sequences are non-functional, neutrally evolving phylogenetic markers. Here, our analyses of ITS1 and ITS2 for 50 species suggest that both sequences are instead under selective constraints to preserve proper secondary structure, likely to maintain complete self-splicing functions, and thus are not neutrally-evolving phylogenetic markers. Our results indicate the majority of sequence sites are co-evolving with other positions to form proper secondary structure, which has implications for phylogenetic inference. We also found that the lowest energy state and total number of possible alternate secondary structures are highly significantly different between ITS regions and random sequences with an identical overall length and Guanine-Cytosine (GC) content. Lastly, we review recent evidence highlighting some additional problematic issues with using these regions as the sole markers for phylogenetic studies, and thus strongly recommend additional markers and cost-effective approaches for future studies to estimate phylogenetic relationships.

  2. The transposition distance for phylogenetic trees

    OpenAIRE

    Rossello, Francesc; Valiente, Gabriel

    2006-01-01

    The search for similarity and dissimilarity measures on phylogenetic trees has been motivated by the computation of consensus trees, the search by similarity in phylogenetic databases, and the assessment of clustering results in bioinformatics. The transposition distance for fully resolved phylogenetic trees is a recent addition to the extensive collection of available metrics for comparing phylogenetic trees. In this paper, we generalize the transposition distance from fully resolved to arbi...

  3. Comparison of sequence-based and structure-based phylogenetic ...

    Indian Academy of Sciences (India)

    Prakash

    phylogenetic tree construction methods, has been considered as an equivalent of .... Further detailed analysis described is restricted to the first two groups only. ..... Aspartate-ammonia ligase. Plant virus ..... enzymatic activities?; Trends ...

  4. Assessment of phylogenetic sensitivity for reconstructing HIV-1 epidemiological relationships.

    Science.gov (United States)

    Beloukas, Apostolos; Magiorkinis, Emmanouil; Magiorkinis, Gkikas; Zavitsanou, Asimina; Karamitros, Timokratis; Hatzakis, Angelos; Paraskevis, Dimitrios

    2012-06-01

    Phylogenetic analysis has been extensively used as a tool for the reconstruction of epidemiological relations for research or for forensic purposes. It was our objective to assess the sensitivity of different phylogenetic methods and various phylogenetic programs to reconstruct epidemiological links among HIV-1 infected patients, that is, the probability of revealing a true transmission relationship. Multiple datasets (90) were prepared consisting of HIV-1 sequences in protease (PR) and partial reverse transcriptase (RT) sampled from patients with documented epidemiological relationship (target population), and from unrelated individuals (control population) belonging to the same HIV-1 subtype as the target population. Each dataset varied regarding the number, the geographic origin and the transmission risk groups of the sequences among the control population. Phylogenetic trees were inferred by neighbor-joining (NJ), maximum likelihood heuristics (hML) and Bayesian methods. All clusters of sequences belonging to the target population were correctly reconstructed by NJ and Bayesian methods receiving high bootstrap and posterior probability (PP) support, respectively. On the other hand, TreePuzzle failed to reconstruct or provide significant support for several clusters; high puzzling step support was associated with the inclusion of control sequences from the same geographic area as the target population. In contrast, all clusters were correctly reconstructed by hML as implemented in PhyML 3.0 receiving high bootstrap support. We report that under the conditions of our study, hML using PhyML, NJ and Bayesian methods were the most sensitive for the reconstruction of epidemiological links mostly from sexually infected individuals. Copyright © 2012 Elsevier B.V. All rights reserved.

  5. Bias in phylogenetic reconstruction of vertebrate rhodopsin sequences.

    Science.gov (United States)

    Chang, B S; Campbell, D L

    2000-08-01

    Two spurious nodes were found in phylogenetic analyses of vertebrate rhodopsin sequences in comparison with well-established vertebrate relationships. These spurious reconstructions were well supported in bootstrap analyses and occurred independently of the method of phylogenetic analysis used (parsimony, distance, or likelihood). Use of this data set of vertebrate rhodopsin sequences allowed us to exploit established vertebrate relationships, as well as the considerable amount known about the molecular evolution of this gene, in order to identify important factors contributing to the spurious reconstructions. Simulation studies using parametric bootstrapping indicate that it is unlikely that the spurious nodes in the parsimony analyses are due to long branches or other topological effects. Rather, they appear to be due to base compositional bias at third positions, codon bias, and convergent evolution at nucleotide positions encoding the hydrophobic residues isoleucine, leucine, and valine. LogDet distance methods, as well as maximum-likelihood methods which allow for nonstationary changes in base composition, reduce but do not entirely eliminate support for the spurious resolutions. Inclusion of five additional rhodopsin sequences in the phylogenetic analyses largely corrected one of the spurious reconstructions while leaving the other unaffected. The additional sequences not only were more proximal to the corrected node, but were also found to have intermediate levels of base composition and codon bias as compared with neighboring sequences on the tree. This study shows that the spurious reconstructions can be corrected either by excluding third positions, as well as those encoding the amino acids Ile, Val, and Leu (which may not be ideal, as these sites can contain useful phylogenetic signal for other parts of the tree), or by the addition of sequences that reduce problems associated with convergent evolution.

  6. FPGA Hardware Acceleration of a Phylogenetic Tree Reconstruction with Maximum Parsimony Algorithm

    OpenAIRE

    BLOCK, Henry; MARUYAMA, Tsutomu

    2017-01-01

    In this paper, we present an FPGA hardware implementation for a phylogenetic tree reconstruction with a maximum parsimony algorithm. We base our approach on a particular stochastic local search algorithm that uses the Progressive Neighborhood and the Indirect Calculation of Tree Lengths method. This method is widely used for the acceleration of the phylogenetic tree reconstruction algorithm in software. In our implementation, we define a tree structure and accelerate the search by parallel an...

  7. Resolving ambiguity in the phylogenetic relationship of genotypes A, B, and C of hepatitis B virus

    Science.gov (United States)

    2013-01-01

    Background Hepatitis B virus (HBV) is an important infectious agent that causes widespread concern because billions of people are infected by at least 8 different HBV genotypes worldwide. However, reconstruction of the phylogenetic relationship between HBV genotypes is difficult. Specifically, the phylogenetic relationships among genotypes A, B, and C are not clear from previous studies because of the confounding effects of genotype recombination. In order to clarify the evolutionary relationships, a rigorous approach is required that can effectively explore genetic sequences with recombination. Result In the present study, the phylogenetic relationships of the HBV genotypes were reconstructed using a consensus phylogeny of phylogenetic trees of HBV genome segments. The reliability of the reconstructed phylogeny was extensively evaluated by the agreement among local phylogenies of genome segments. The reconstructed phylogenetic tree revealed that HBV genotypes B and C had a closer phylogenetic relationship than genotypes A and B or A and C. Evaluations showed that the consensus method was capable of reconstructing reliable phylogenetic relationships in the presence of recombinants. Conclusion The consensus method implemented in this study provides an alternative approach for reconstructing reliable phylogenetic relationships for viruses with possible genetic recombination. Our approach revealed the phylogenetic relationships of genotypes A, B, and C of HBV. PMID:23758960

  8. The Impact of Reconstruction Methods, Phylogenetic Uncertainty and Branch Lengths on Inference of Chromosome Number Evolution in American Daisies (Melampodium, Asteraceae).

    Science.gov (United States)

    McCann, Jamie; Schneeweiss, Gerald M; Stuessy, Tod F; Villaseñor, Jose L; Weiss-Schneeweiss, Hanna

    2016-01-01

    Chromosome number change (polyploidy and dysploidy) plays an important role in plant diversification and speciation. Investigating chromosome number evolution commonly entails ancestral state reconstruction performed within a phylogenetic framework, which is, however, prone to uncertainty, whose effects on evolutionary inferences are insufficiently understood. Using the chromosomally diverse plant genus Melampodium (Asteraceae) as model group, we assess the impact of reconstruction method (maximum parsimony, maximum likelihood, Bayesian methods), branch length model (phylograms versus chronograms) and phylogenetic uncertainty (topological and branch length uncertainty) on the inference of chromosome number evolution. We also address the suitability of the maximum clade credibility (MCC) tree as single representative topology for chromosome number reconstruction. Each of the listed factors causes considerable incongruence among chromosome number reconstructions. Discrepancies between inferences on the MCC tree from those made by integrating over a set of trees are moderate for ancestral chromosome numbers, but severe for the difference of chromosome gains and losses, a measure of the directionality of dysploidy. Therefore, reliance on single trees, such as the MCC tree, is strongly discouraged and model averaging, taking both phylogenetic and model uncertainty into account, is recommended. For studying chromosome number evolution, dedicated models implemented in the program ChromEvol and ordered maximum parsimony may be most appropriate. Chromosome number evolution in Melampodium follows a pattern of bidirectional dysploidy (starting from x = 11 to x = 9 and x = 14, respectively) with no prevailing direction.

  9. The Impact of Reconstruction Methods, Phylogenetic Uncertainty and Branch Lengths on Inference of Chromosome Number Evolution in American Daisies (Melampodium, Asteraceae).

    Directory of Open Access Journals (Sweden)

    Jamie McCann

    Full Text Available Chromosome number change (polyploidy and dysploidy) plays an important role in plant diversification and speciation. Investigating chromosome number evolution commonly entails ancestral state reconstruction performed within a phylogenetic framework, which is, however, prone to uncertainty, whose effects on evolutionary inferences are insufficiently understood. Using the chromosomally diverse plant genus Melampodium (Asteraceae) as model group, we assess the impact of reconstruction method (maximum parsimony, maximum likelihood, Bayesian methods), branch length model (phylograms versus chronograms) and phylogenetic uncertainty (topological and branch length uncertainty) on the inference of chromosome number evolution. We also address the suitability of the maximum clade credibility (MCC) tree as single representative topology for chromosome number reconstruction. Each of the listed factors causes considerable incongruence among chromosome number reconstructions. Discrepancies between inferences on the MCC tree from those made by integrating over a set of trees are moderate for ancestral chromosome numbers, but severe for the difference of chromosome gains and losses, a measure of the directionality of dysploidy. Therefore, reliance on single trees, such as the MCC tree, is strongly discouraged and model averaging, taking both phylogenetic and model uncertainty into account, is recommended. For studying chromosome number evolution, dedicated models implemented in the program ChromEvol and ordered maximum parsimony may be most appropriate. Chromosome number evolution in Melampodium follows a pattern of bidirectional dysploidy (starting from x = 11 to x = 9 and x = 14, respectively) with no prevailing direction.

  10. The phylogenetic likelihood library.

    Science.gov (United States)

    Flouri, T; Izquierdo-Carrasco, F; Darriba, D; Aberer, A J; Nguyen, L-T; Minh, B Q; Von Haeseler, A; Stamatakis, A

    2015-03-01

    We introduce the Phylogenetic Likelihood Library (PLL), a highly optimized application programming interface for developing likelihood-based phylogenetic inference and postanalysis software. The PLL implements appropriate data structures and functions that allow users to quickly implement common, error-prone, and labor-intensive tasks, such as likelihood calculations, model parameter as well as branch length optimization, and tree space exploration. The highly optimized and parallelized implementation of the phylogenetic likelihood function and a thorough documentation provide a framework for rapid development of scalable parallel phylogenetic software. By example of two likelihood-based phylogenetic codes we show that the PLL improves the sequential performance of current software by a factor of 2-10 while requiring only 1 month of programming time for integration. We show that, when numerical scaling for preventing floating point underflow is enabled, the double precision likelihood calculations in the PLL are up to 1.9 times faster than those in BEAGLE. On an empirical DNA dataset with 2000 taxa the AVX version of PLL is 4 times faster than BEAGLE (scaling enabled and required). The PLL is available at http://www.libpll.org under the GNU General Public License (GPL). © The Author(s) 2014. Published by Oxford University Press, on behalf of the Society of Systematic Biologists.

  11. PhyLIS: A Simple GNU/Linux Distribution for Phylogenetics and Phyloinformatics

    Directory of Open Access Journals (Sweden)

    Robert C. Thomson

    2009-01-01

    Full Text Available PhyLIS is a free GNU/Linux distribution that is designed to provide a simple, standardized platform for phylogenetic and phyloinformatic analysis. The operating system incorporates most commonly used phylogenetic software, which has been pre-compiled and pre-configured, allowing for straightforward application of phylogenetic methods and development of phyloinformatic pipelines in a stable Linux environment. The software is distributed as a live CD and can be installed directly or run from the CD without making changes to the computer. PhyLIS is available for free at http://www.eve.ucdavis.edu/rcthomson/phylis/.

  12. Locating a tree in a phylogenetic network

    NARCIS (Netherlands)

    Iersel, van L.J.J.; Semple, C.; Steel, M.A.

    2010-01-01

    Phylogenetic trees and networks are leaf-labelled graphs that are used to describe evolutionary histories of species. The Tree Containment problem asks whether a given phylogenetic tree is embedded in a given phylogenetic network. Given a phylogenetic network and a cluster of species, the Cluster

  13. Construction of phylogenetic trees by kernel-based comparative analysis of metabolic networks.

    Science.gov (United States)

    Oh, S June; Joung, Je-Gun; Chang, Jeong-Ho; Zhang, Byoung-Tak

    2006-06-06

    To infer the tree of life requires knowledge of the common characteristics of each species descended from a common ancestor as the measuring criteria and a method to calculate the distance between the resulting values of each measure. Conventional phylogenetic analysis based on genomic sequences provides information about the genetic relationships between different organisms. In contrast, comparative analysis of metabolic pathways in different organisms can yield insights into their functional relationships under different physiological conditions. However, evaluating the similarities or differences between metabolic networks is a computationally challenging problem, and systematic methods of doing this are desirable. Here we introduce a graph-kernel method for computing the similarity between metabolic networks in polynomial time, and use it to profile metabolic pathways and to construct phylogenetic trees. To compare the structures of metabolic networks in organisms, we adopted the exponential graph kernel, which is a kernel-based approach with a labeled graph that includes a label matrix and an adjacency matrix. To construct the phylogenetic trees, we used an unweighted pair-group method with arithmetic mean, i.e., a hierarchical clustering algorithm. We applied the kernel-based network profiling method in a comparative analysis of nine carbohydrate metabolic networks from 81 biological species encompassing Archaea, Eukaryota, and Eubacteria. The resulting phylogenetic hierarchies generally support the tripartite scheme of three domains rather than the two domains of prokaryotes and eukaryotes. By combining the kernel machines with metabolic information, the method infers the context of biosphere development that covers physiological events required for adaptation by genetic reconstruction. The results show that one may obtain a global view of the tree of life by comparing the metabolic pathway structures using meta-level information rather than sequence
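
    The tree-building step named above, the unweighted pair-group method with arithmetic mean (UPGMA), can be sketched in a few lines of Python. This is an illustrative sketch only: the input format (a dict keyed by `frozenset` pairs) and the output encoding (nested tuples of left subtree, right subtree, node height) are assumptions, and the distances would in practice be derived from the kernel similarities, which are not reproduced here.

```python
def upgma(names, dist):
    """Build a UPGMA tree.

    names: list of taxon labels.
    dist: dict mapping frozenset({a, b}) -> distance for every pair.
    Returns a nested tuple (left, right, height); leaves are plain labels.
    """
    clusters = {name: (name, 1) for name in names}  # key -> (subtree, size)
    d = dict(dist)
    while len(clusters) > 1:
        keys = list(clusters)
        # Find the closest pair of current clusters.
        a, b = min(
            ((x, y) for i, x in enumerate(keys) for y in keys[i + 1:]),
            key=lambda pair: d[frozenset(pair)],
        )
        (tree_a, size_a) = clusters.pop(a)
        (tree_b, size_b) = clusters.pop(b)
        merged = (a, b)  # hashable key for the new cluster
        height = d[frozenset((a, b))] / 2
        # Distance to every other cluster is the size-weighted average.
        for k in clusters:
            d[frozenset((merged, k))] = (
                size_a * d[frozenset((a, k))] + size_b * d[frozenset((b, k))]
            ) / (size_a + size_b)
        clusters[merged] = ((tree_a, tree_b, height), size_a + size_b)
    return next(iter(clusters.values()))[0]
```

    UPGMA assumes a constant rate of change along all lineages, so the resulting hierarchy is best read as a clustering of the networks rather than as a calibrated phylogeny.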

  14. Construction of phylogenetic trees by kernel-based comparative analysis of metabolic networks

    Directory of Open Access Journals (Sweden)

    Chang Jeong-Ho

    2006-06-01

    Full Text Available Abstract Background To infer the tree of life requires knowledge of the common characteristics of each species descended from a common ancestor as the measuring criteria and a method to calculate the distance between the resulting values of each measure. Conventional phylogenetic analysis based on genomic sequences provides information about the genetic relationships between different organisms. In contrast, comparative analysis of metabolic pathways in different organisms can yield insights into their functional relationships under different physiological conditions. However, evaluating the similarities or differences between metabolic networks is a computationally challenging problem, and systematic methods of doing this are desirable. Here we introduce a graph-kernel method for computing the similarity between metabolic networks in polynomial time, and use it to profile metabolic pathways and to construct phylogenetic trees. Results To compare the structures of metabolic networks in organisms, we adopted the exponential graph kernel, which is a kernel-based approach with a labeled graph that includes a label matrix and an adjacency matrix. To construct the phylogenetic trees, we used an unweighted pair-group method with arithmetic mean, i.e., a hierarchical clustering algorithm. We applied the kernel-based network profiling method in a comparative analysis of nine carbohydrate metabolic networks from 81 biological species encompassing Archaea, Eukaryota, and Eubacteria. The resulting phylogenetic hierarchies generally support the tripartite scheme of three domains rather than the two domains of prokaryotes and eukaryotes. Conclusion By combining the kernel machines with metabolic information, the method infers the context of biosphere development that covers physiological events required for adaptation by genetic reconstruction. The results show that one may obtain a global view of the tree of life by comparing the metabolic pathway

  15. On the use of cartographic projections in visualizing phylo-genetic tree space

    Directory of Open Access Journals (Sweden)

    Clement Mark

    2010-06-01

    Full Text Available Abstract Phylogenetic analysis is becoming an increasingly important tool for biological research. Applications include epidemiological studies, drug development, and evolutionary analysis. Phylogenetic search is a known NP-Hard problem. The size of the data sets which can be analyzed is limited by the exponential growth in the number of trees that must be considered as the problem size increases. A better understanding of the problem space could lead to better methods, which in turn could lead to the feasible analysis of more data sets. We present a definition of phylogenetic tree space and a visualization of this space that shows significant exploitable structure. This structure can be used to develop search methods capable of handling much larger data sets.

  16. An evaluation of phylogenetic informativeness profiles and the molecular phylogeny of diplazontinae (Hymenoptera, Ichneumonidae).

    Science.gov (United States)

    Klopfstein, Seraina; Kropf, Christian; Quicke, Donald L J

    2010-03-01

    How to quantify the phylogenetic information content of a data set is a longstanding question in phylogenetics, influencing both the assessment of data quality in completed studies and the planning of future phylogenetic projects. Recently, a method has been developed that profiles the phylogenetic informativeness (PI) of a data set through time by linking its site-specific rates of change to its power to resolve relationships at different timescales. Here, we evaluate the performance of this method in the case of 2 standard genetic markers for phylogenetic reconstruction, 28S ribosomal RNA and cytochrome oxidase subunit 1 (CO1) mitochondrial DNA, with maximum parsimony, maximum likelihood, and Bayesian analyses of relationships within a group of parasitoid wasps (Hymenoptera: Ichneumonidae, Diplazontinae). Retrieving PI profiles of the 2 genes from our own and from 3 additional data sets, we find that the method repeatedly overestimates the performance of the more quickly evolving CO1 compared with 28S. We explore possible reasons for this bias, including phylogenetic uncertainty, violation of the molecular clock assumption, model misspecification, and nonstationary nucleotide composition. As none of these provides a sufficient explanation of the observed discrepancy, we use simulated data sets, based on an idealized setting, to show that the optimum evolutionary rate decreases with increasing number of taxa. We suggest that this relationship could explain why the formula derived from the 4-taxon case overrates the performance of higher versus lower rates of evolution in our case and that caution should be taken when the method is applied to data sets including more than 4 taxa.

  17. Dimensional Reduction for the General Markov Model on Phylogenetic Trees.

    Science.gov (United States)

    Sumner, Jeremy G

    2017-03-01

    We present a method of dimensional reduction for the general Markov model of sequence evolution on a phylogenetic tree. We show that taking certain linear combinations of the associated random variables (site pattern counts) reduces the dimensionality of the model from exponential in the number of extant taxa, to quadratic in the number of taxa, while retaining the ability to statistically identify phylogenetic divergence events. A key feature is the identification of an invariant subspace which depends only bilinearly on the model parameters, in contrast to the usual multi-linear dependence in the full space. We discuss potential applications including the computation of split (edge) weights on phylogenetic trees from observed sequence data.

  18. Spectrum estimation method based on marginal spectrum

    International Nuclear Information System (INIS)

    Cai Jianhua; Hu Weiwen; Wang Xianchun

    2011-01-01

    The FFT method cannot meet the basic requirements of power spectrum estimation for non-stationary and short signals. A new spectrum estimation method based on the marginal spectrum from the Hilbert-Huang transform (HHT) is proposed. The procedure for obtaining the marginal spectrum in the HHT method is given, and the linear property of the marginal spectrum is demonstrated. The physical meaning and the frequency resolution of the marginal spectrum are then analyzed in comparison with the FFT method. The Hilbert spectrum estimation algorithm is discussed in detail, and simulation results are given. The theory and simulations show that, for short and non-stationary signals, the frequency resolution and estimation precision of the HHT method are better than those of the FFT method. (authors)

  19. PAL: an object-oriented programming library for molecular evolution and phylogenetics.

    Science.gov (United States)

    Drummond, A; Strimmer, K

    2001-07-01

    Phylogenetic Analysis Library (PAL) is a collection of Java classes for use in molecular evolution and phylogenetics. PAL provides a modular environment for the rapid construction of both special-purpose and general analysis programs. PAL version 1.1 consists of 145 public classes or interfaces in 13 packages, including classes for models of character evolution, maximum-likelihood estimation, and the coalescent, with a total of more than 27000 lines of code. The PAL project is set up as a collaborative project to facilitate contributions from other researchers. AVAILABILITY: The program is free and is available at http://www.pal-project.org. It requires Java 1.1 or later. PAL is licensed under the GNU General Public License.

  20. Locating a tree in a phylogenetic network

    OpenAIRE

    van Iersel, Leo; Semple, Charles; Steel, Mike

    2010-01-01

    Phylogenetic trees and networks are leaf-labelled graphs that are used to describe evolutionary histories of species. The Tree Containment problem asks whether a given phylogenetic tree is embedded in a given phylogenetic network. Given a phylogenetic network and a cluster of species, the Cluster Containment problem asks whether the given cluster is a cluster of some phylogenetic tree embedded in the network. Both problems are known to be NP-complete in general. In this article, we consider t...

  1. Nonbinary tree-based phylogenetic networks

    OpenAIRE

    Jetten, Laura; van Iersel, Leo

    2016-01-01

    Rooted phylogenetic networks are used to describe evolutionary histories that contain non-treelike evolutionary events such as hybridization and horizontal gene transfer. In some cases, such histories can be described by a phylogenetic base-tree with additional linking arcs, which can for example represent gene transfer events. Such phylogenetic networks are called tree-based. Here, we consider two possible generalizations of this concept to nonbinary networks, which we call tree-based and st...

  2. Forensic application of phylogenetic analyses - Exploration of suspected HIV-1 transmission case.

    Science.gov (United States)

    Siljic, Marina; Salemovic, Dubravka; Cirkovic, Valentina; Pesic-Pavlovic, Ivana; Ranin, Jovan; Todorovic, Marija; Nikolic, Slobodan; Jevtovic, Djordje; Stanojevic, Maja

    2017-03-01

    Transmission of human immunodeficiency virus (HIV) between individuals may have important legal implications and therefore may come to require forensic investigation based upon phylogenetic analysis. In criminal trials, results of phylogenetic analyses have been used as evidence of responsibility for HIV transmission. In Serbia, as in many countries worldwide, exposure and deliberate transmission of HIV are criminalized. We present the results of applying state-of-the-art phylogenetic analyses, based on pol and env genetic sequences, in the exploration of suspected HIV transmission among three subjects: a man and two women, with the a priori assumption that transmission ran from one of the women to the man. Phylogenetic methods included neighbor-joining (NJ), maximum likelihood (ML) and Bayesian methods of phylogenetic tree reconstruction and hypothesis testing, which have been shown to be the most sensitive for reconstructing epidemiological links, mostly among sexually infected individuals. An end-point limiting-dilution PCR (EPLD-PCR) assay, generating a minimum of 10 sequences per genetic region per subject, was performed to assess HIV quasispecies distribution and to explore the direction of HIV transmission between the three subjects. Phylogenetic analysis revealed that the viral sequences from the three subjects were more genetically related to each other than to other strains circulating in the same area with a similar epidemiological profile, forming a strongly supported transmission chain, which could favour the a priori hypothesis of one of the women infecting the man. However, in the EPLD-based phylogenetic trees for both the pol and env genetic regions, viral sequences of one subject (the man) were paraphyletic to those of the two other subjects (the women), implying a direction of transmission opposite to the a priori assumption. The dated tree in our analysis confirmed the clustering pattern of query sequences. Still, in the context of unsampled sequences and

  3. SNPhylo: a pipeline to construct a phylogenetic tree from huge SNP data.

    Science.gov (United States)

    Lee, Tae-Ho; Guo, Hui; Wang, Xiyin; Kim, Changsoo; Paterson, Andrew H

    2014-02-26

    Phylogenetic trees are widely used for genetic and evolutionary studies in various organisms. Advanced sequencing technology has dramatically enriched data available for constructing phylogenetic trees based on single nucleotide polymorphisms (SNPs). However, massive SNP data makes it difficult to perform reliable analysis, and there has been no ready-to-use pipeline to generate phylogenetic trees from these data. We developed a new pipeline, SNPhylo, to construct phylogenetic trees based on large SNP datasets. The pipeline may enable users to construct a phylogenetic tree from three representative SNP data file formats. In addition, in order to increase reliability of a tree, the pipeline has steps such as removing low quality data and considering linkage disequilibrium. A maximum likelihood method for the inference of phylogeny is also adopted in generation of a tree in our pipeline. Using SNPhylo, users can easily produce a reliable phylogenetic tree from a large SNP data file. Thus, this pipeline can help a researcher focus more on interpretation of the results of analysis of voluminous data sets, rather than manipulations necessary to accomplish the analysis.

  4. [New isolation methods and phylogenetic diversity of actinobacteria from hypersaline beach in Aksu].

    Science.gov (United States)

    Zhang, Yao; Xia, Zhanfeng; Cao, Xinbo; Li, Jun; Zhang, Lili

    2013-08-04

    We explored 4 new methods to improve the isolation of actinobacterial resources from high salt areas. Optimized media based on 4 new strategies were used for isolating actinobacteria from hypersaline beaches. Glycerin-arginine, trehalose-creatine, glycerol-aspartic acid, mannitol-casein, casein-mannitol, mannitol-alanine, chitosan-asparagine and Gauze's No. 1 were used as basic media. The new isolation strategy comprised 4 methods: ten-fold dilution culture, simulation of the original environment, actinobacterial culture guided by culture-independent molecular detection, and reference to actinobacterial media for brackish marine environments. The 16S rRNA genes of the isolates were amplified with bacterial universal primers. The 16S rRNA gene sequences were compared with sequences obtained from GenBank databases. We constructed a phylogenetic tree with the neighbor-joining method. No actinobacterial strains were isolated on the 8 media of the control group, while 403 strains were isolated by the new strategies. The isolates obtained by the new methods were members of 14 genera (Streptomyces, Streptomonospora, Saccharomonospora, Plantactinospora, Nocardia, Amycolatopsis, Glycomyces, Micromonospora, Nocardiopsis, Isoptericola, Nonomuraea, Thermobifida, Actinopolyspora, Actinomadura) of 10 families in 8 suborders. The most abundant and diverse isolates belonged to the two suborders Streptomycineae (69.96%) and Streptosporangineae (9.68%) within the phylum Actinobacteria, including 9 potential novel species. The new isolation methods significantly improved actinobacterial culturability from hypersaline areas and yielded many potential novel species, providing a new and more effective way to isolate actinobacterial resources in hypersaline environments.

  5. Global patterns of amphibian phylogenetic diversity

    DEFF Research Database (Denmark)

    Fritz, Susanne; Rahbek, Carsten

    2012-01-01

    Aim  Phylogenetic diversity can provide insight into how evolutionary processes may have shaped contemporary patterns of species richness. Here, we aim to test for the influence of phylogenetic history on global patterns of amphibian species richness, and to identify areas where macroevolutionary...... processes such as diversification and dispersal have left strong signatures on contemporary species richness. Location  Global; equal-area grid cells of approximately 10,000 km2. Methods  We generated an amphibian global supertree (6111 species) and repeated analyses with the largest available molecular...... phylogeny (2792 species). We combined each tree with global species distributions to map four indices of phylogenetic diversity. To investigate congruence between global spatial patterns of amphibian species richness and phylogenetic diversity, we selected Faith’s phylogenetic diversity (PD) index...
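
    Faith's phylogenetic diversity (PD), the index selected in the abstract above, is the total branch length of the subtree spanning a set of species. The sketch below computes a rooted variant that counts every branch on the paths from the chosen species up to the root. The tree encoding (a dict from each non-root node to its parent and branch length) and the function name are assumptions made for illustration.

```python
def faith_pd(parent, taxa):
    """Rooted Faith's PD for a set of taxa.

    parent: dict mapping node -> (parent_node, branch_length);
            the root is the only node absent from the dict.
    taxa: iterable of leaf names present in one grid cell or sample.
    """
    visited = set()
    total = 0.0
    for node in taxa:
        # Walk towards the root, counting each branch only once.
        while node in parent and node not in visited:
            visited.add(node)
            up, length = parent[node]
            total += length
            node = up
    return total
```

    Applied per grid cell, with `taxa` set to the species occurring there, a function like this would give one PD value per cell that can then be compared with species richness.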

  6. Detecting Network Communities: An Application to Phylogenetic Analysis

    Science.gov (United States)

    Andrade, Roberto F. S.; Rocha-Neto, Ivan C.; Santos, Leonardo B. L.; de Santana, Charles N.; Diniz, Marcelo V. C.; Lobão, Thierry Petit; Goés-Neto, Aristóteles; Pinho, Suani T. R.; El-Hani, Charbel N.

    2011-01-01

    This paper proposes a new method to identify communities in generally weighted complex networks and applies it to phylogenetic analysis. In this case, weights correspond to the similarity indexes among protein sequences, which can be used for network construction so that the network structure can be analyzed to recover phylogenetically useful information from its properties. The analyses discussed here are mainly based on the modular character of protein similarity networks, explored through the Newman-Girvan algorithm, with the help of the neighborhood matrix. The most relevant networks are found when the network topology changes abruptly, revealing distinct modules related to the sets of organisms to which the proteins belong. Sound biological information can be retrieved by the computational routines used in the network approach, without using biological assumptions other than those incorporated by BLAST. Usually, all the main bacterial phyla and, in some cases, also some bacterial classes corresponded totally (100%) or to a great extent (>70%) to the modules. We checked for internal consistency in the obtained results, and we scored close to 84% of matches for community pertinence when comparisons between the results were performed. To illustrate how to use the network-based method, we employed data for enzymes involved in the chitin metabolic pathway that are present in more than 100 organisms from an original data set containing 1,695 organisms, downloaded from GenBank on May 19, 2007. A preliminary comparison between the outcomes of the network-based method and the results of methods based on Bayesian, distance, likelihood, and parsimony criteria suggests that the former is as reliable as these commonly used methods. We conclude that the network-based method can be used as a powerful tool for retrieving modularity information from weighted networks, which is useful for phylogenetic analysis. PMID:21573202

  7. Tree-Based Unrooted Phylogenetic Networks.

    Science.gov (United States)

    Francis, A; Huber, K T; Moulton, V

    2018-02-01

    Phylogenetic networks are a generalization of phylogenetic trees that are used to represent non-tree-like evolutionary histories that arise in organisms such as plants and bacteria, or uncertainty in evolutionary histories. An unrooted phylogenetic network on a non-empty, finite set X of taxa, or network, is a connected, simple graph in which every vertex has degree 1 or 3 and whose leaf set is X. It is called a phylogenetic tree if the underlying graph is a tree. In this paper we consider properties of tree-based networks, that is, networks that can be constructed by adding edges into a phylogenetic tree. We show that although they have some properties in common with their rooted analogues which have recently drawn much attention in the literature, they have some striking differences in terms of both their structural and computational properties. We expect that our results could eventually have applications to, for example, detecting horizontal gene transfer or hybridization which are important factors in the evolution of many organisms.

  8. PhyloSift: phylogenetic analysis of genomes and metagenomes.

    Science.gov (United States)

    Darling, Aaron E; Jospin, Guillaume; Lowe, Eric; Matsen, Frederick A; Bik, Holly M; Eisen, Jonathan A

    2014-01-01

    Like all organisms on the planet, environmental microbes are subject to the forces of molecular evolution. Metagenomic sequencing provides a means to access the DNA sequence of uncultured microbes. By combining DNA sequencing of microbial communities with evolutionary modeling and phylogenetic analysis we might obtain new insights into microbiology and also provide a basis for practical tools such as forensic pathogen detection. In this work we present an approach to leverage phylogenetic analysis of metagenomic sequence data to conduct several types of analysis. First, we present a method to conduct phylogeny-driven Bayesian hypothesis tests for the presence of an organism in a sample. Second, we present a means to compare community structure across a collection of many samples and develop direct associations between the abundance of certain organisms and sample metadata. Third, we apply new tools to analyze the phylogenetic diversity of microbial communities and again demonstrate how this can be associated to sample metadata. These analyses are implemented in an open source software pipeline called PhyloSift. As a pipeline, PhyloSift incorporates several other programs including LAST, HMMER, and pplacer to automate phylogenetic analysis of protein coding and RNA sequences in metagenomic datasets generated by modern sequencing platforms (e.g., Illumina, 454).

  9. PhyloSift: phylogenetic analysis of genomes and metagenomes

    Directory of Open Access Journals (Sweden)

    Aaron E. Darling

    2014-01-01

    Full Text Available Like all organisms on the planet, environmental microbes are subject to the forces of molecular evolution. Metagenomic sequencing provides a means to access the DNA sequence of uncultured microbes. By combining DNA sequencing of microbial communities with evolutionary modeling and phylogenetic analysis we might obtain new insights into microbiology and also provide a basis for practical tools such as forensic pathogen detection. In this work we present an approach to leverage phylogenetic analysis of metagenomic sequence data to conduct several types of analysis. First, we present a method to conduct phylogeny-driven Bayesian hypothesis tests for the presence of an organism in a sample. Second, we present a means to compare community structure across a collection of many samples and develop direct associations between the abundance of certain organisms and sample metadata. Third, we apply new tools to analyze the phylogenetic diversity of microbial communities and again demonstrate how this can be associated to sample metadata. These analyses are implemented in an open source software pipeline called PhyloSift. As a pipeline, PhyloSift incorporates several other programs including LAST, HMMER, and pplacer to automate phylogenetic analysis of protein coding and RNA sequences in metagenomic datasets generated by modern sequencing platforms (e.g., Illumina, 454).

  10. Species trees for the tree swallows (Genus Tachycineta): an alternative phylogenetic hypothesis to the mitochondrial gene tree.

    Science.gov (United States)

    Dor, Roi; Carling, Matthew D; Lovette, Irby J; Sheldon, Frederick H; Winkler, David W

    2012-10-01

    The New World swallow genus Tachycineta comprises nine species that collectively have a wide geographic distribution and remarkable variation both within- and among-species in ecologically important traits. Existing phylogenetic hypotheses for Tachycineta are based on mitochondrial DNA sequences, thus they provide estimates of a single gene tree. In this study we sequenced multiple individuals from each species at 16 nuclear intron loci. We used gene concatenated approaches (Bayesian and maximum likelihood) as well as coalescent-based species tree inference to reconstruct phylogenetic relationships of the genus. We examined the concordance and conflict between the nuclear and mitochondrial trees and between concatenated and coalescent-based inferences. Our results provide an alternative phylogenetic hypothesis to the existing mitochondrial DNA estimate of phylogeny. This new hypothesis provides a more accurate framework in which to explore trait evolution and examine the evolution of the mitochondrial genome in this group. Copyright © 2012 Elsevier Inc. All rights reserved.

  11. Reconstruction of phylogenetic trees of prokaryotes using maximal common intervals.

    Science.gov (United States)

    Heydari, Mahdi; Marashi, Sayed-Amir; Tusserkani, Ruzbeh; Sadeghi, Mehdi

    2014-10-01

    One of the fundamental problems in bioinformatics is phylogenetic tree reconstruction, which can be used for classifying living organisms into different taxonomic clades. The classical approach to this problem is based on a marker such as 16S ribosomal RNA. Since evolutionary events like genomic rearrangements are not included in reconstructions of phylogenetic trees based on single genes, much effort has been made to find other characteristics for phylogenetic reconstruction in recent years. With the increasing availability of completely sequenced genomes, gene order can be considered as a new solution for this problem. In the present work, we applied maximal common intervals (MCIs) in two or more genomes to infer their distance and to reconstruct their evolutionary relationship. Additionally, measures based on uncommon segments (UCS's), i.e., those genomic segments which are not detected as part of any of the MCIs, are also used for phylogenetic tree reconstruction. We applied these two types of measures for reconstructing the phylogenetic tree of 63 prokaryotes with known COG (clusters of orthologous groups) families. Similarity between the MCI-based (resp. UCS-based) reconstructed phylogenetic trees and the phylogenetic tree obtained from NCBI taxonomy browser is as high as 93.1% (resp. 94.9%). We show that in the case of this diverse dataset of prokaryotes, tree reconstruction based on MCI and UCS outperforms most of the currently available methods based on gene orders, including breakpoint distance and DCJ. We additionally tested our new measures on a dataset of 13 closely-related bacteria from the genus Prochlorococcus. In this case, distances like rearrangement distance, breakpoint distance and DCJ proved to be useful, while our new measures are still appropriate for phylogenetic reconstruction. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.

  12. System and method for traffic signal timing estimation

    KAUST Repository

    Dumazert, Julien; Claudel, Christian G.

    2015-01-01

    A method and system for estimating traffic signals. The method and system can include constructing trajectories of probe vehicles from GPS data emitted by the probe vehicles, estimating traffic signal cycles, combining the estimates, and computing the traffic signal timing by maximizing a scoring function based on the estimates. Estimating traffic signal cycles can be based on transition times of the probe vehicles starting after a traffic signal turns green.

  13. System and method for traffic signal timing estimation

    KAUST Repository

    Dumazert, Julien

    2015-12-30

    A method and system for estimating traffic signals. The method and system can include constructing trajectories of probe vehicles from GPS data emitted by the probe vehicles, estimating traffic signal cycles, combining the estimates, and computing the traffic signal timing by maximizing a scoring function based on the estimates. Estimating traffic signal cycles can be based on transition times of the probe vehicles starting after a traffic signal turns green.

  14. Phylogenetically informed logic relationships improve detection of biological network organization

    Science.gov (United States)

    2011-01-01

    Background: A "phylogenetic profile" refers to the presence or absence of a gene across a set of organisms, and it has been proven valuable for understanding gene functional relationships and network organization. Despite this success, few studies have attempted to search beyond just pairwise relationships among genes. Here we search for logic relationships involving three genes, and explore their potential application in gene network analyses. Results: Taking advantage of a phylogenetic matrix constructed from the large orthologs database Roundup, we invented a method to create balanced profiles for individual triplets of genes that guarantee equal weight on the different phylogenetic scenarios of coevolution between genes. When we applied this idea to LAPP, the method to search for logic triplets of genes, the balanced profiles resulted in significant performance improvement and the discovery of hundreds of thousands more putative triplets than unadjusted profiles. We found that logic triplets detected biological network organization and identified key proteins and their functions, ranging from neighbouring proteins in local pathways, to well separated proteins in the whole pathway, and to the interactions among different pathways at the system level. Finally, our case study suggested that the directionality in a logic relationship and the profile of a triplet could disclose the connectivity between the triplet and surrounding networks. Conclusion: Balanced profiles are superior to the raw profiles employed by traditional methods of phylogenetic profiling in searching for high order gene sets. Gene triplets can provide valuable information in detection of biological network organization and identification of key genes at different levels of cellular interaction. PMID:22172058

  15. Nodal distances for rooted phylogenetic trees.

    Science.gov (United States)

    Cardona, Gabriel; Llabrés, Mercè; Rosselló, Francesc; Valiente, Gabriel

    2010-08-01

    Dissimilarity measures for (possibly weighted) phylogenetic trees based on the comparison of their vectors of path lengths between pairs of taxa have been present in the systematics literature since the early seventies. For rooted phylogenetic trees, however, these vectors can only separate non-weighted binary trees, and therefore these dissimilarity measures are metrics only on this class of rooted phylogenetic trees. In this paper we overcome this problem by splitting, in a suitable way, each path length between two taxa into two lengths. We prove that the resulting splitted path lengths matrices single out arbitrary rooted phylogenetic trees with nested taxa and arcs weighted in the set of positive real numbers. This allows the definition of metrics on this general class of rooted phylogenetic trees by comparing these matrices through metrics in spaces $M_n(\mathbb{R})$ of real-valued $n \times n$ matrices. We conclude this paper by establishing some basic facts about the metrics for non-weighted phylogenetic trees defined in this way using $L^p$ metrics on $M_n(\mathbb{R})$, with $p \in \mathbb{R}_{>0}$.
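
    As a rough illustration of the path-length idea in the abstract above, the following Python sketch computes, for a toy rooted tree, the classical vector of path lengths between pairs of taxa and one possible "split" of each path length into two parts (the number of edges from the most recent common ancestor down to each taxon). The child-to-parent dictionary, the function names, and this particular reading of the split are assumptions made for illustration; the abstract does not give the authors' exact definitions.

```python
from itertools import combinations

def ancestors(parent, node):
    """Return the path from node up to the root, with node first."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path

def split_path_length(parent, a, b):
    """Edges from the most recent common ancestor down to a and down to b.

    Assumed reading of 'splitting each path length into two lengths'.
    """
    depth_from_a = {n: i for i, n in enumerate(ancestors(parent, a))}
    for j, n in enumerate(ancestors(parent, b)):
        if n in depth_from_a:
            return depth_from_a[n], j
    raise ValueError("nodes are not in the same tree")

def path_length_vector(parent, taxa):
    """Classical vector of path lengths over all unordered pairs of taxa."""
    return [sum(split_path_length(parent, a, b))
            for a, b in combinations(sorted(taxa), 2)]

# Hypothetical toy tree ((A,B),C) stored as child -> parent.
parent = {"A": "x", "B": "x", "x": "r", "C": "r"}
vec = path_length_vector(parent, ["A", "B", "C"])   # pairs: AB, AC, BC
split_ac = split_path_length(parent, "A", "C")
```

    The unweighted case is used here only to keep the sketch short; weighted arcs would replace the edge counts with sums of arc weights.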

  16. Reverse survival method of fertility estimation: An evaluation

    Directory of Open Access Journals (Sweden)

    Thomas Spoorenberg

    2014-07-01

    Background: For the most part, demographers have relied on the ever-growing body of sample surveys collecting full birth history to derive total fertility estimates in less statistically developed countries. Yet alternative methods of fertility estimation can return very consistent total fertility estimates by using only basic demographic information. Objective: This paper evaluates the consistency and sensitivity of the reverse survival method -- a fertility estimation method based on population data by age and sex collected in one census or a single-round survey. Methods: A simulated population was first projected over 15 years using a set of fertility and mortality age and sex patterns. The projected population was then reverse survived using the Excel template FE_reverse_4.xlsx, provided with Timæus and Moultrie (2012). Reverse survival fertility estimates were then compared for consistency to the total fertility rates used to project the population. The sensitivity was assessed by introducing a series of distortions in the projection of the population and comparing the difference implied in the resulting fertility estimates. Results: The reverse survival method produces total fertility estimates that are very consistent and hardly affected by erroneous assumptions on the age distribution of fertility or by the use of incorrect mortality levels, trends, and age patterns. The quality of the age and sex population data that is 'reverse survived' determines the consistency of the estimates. The contribution of the method for the estimation of past and present trends in total fertility is illustrated through its application to the population data of five countries characterized by distinct fertility levels and data quality issues. Conclusions: Notwithstanding its simplicity, the reverse survival method of fertility estimation has seldom been applied. The method can be applied to a large body of existing and easily available population data

  17. A Consistent Phylogenetic Backbone for the Fungi

    Science.gov (United States)

    Ebersberger, Ingo; de Matos Simoes, Ricardo; Kupczok, Anne; Gube, Matthias; Kothe, Erika; Voigt, Kerstin; von Haeseler, Arndt

    2012-01-01

    The kingdom of fungi provides model organisms for biotechnology, cell biology, genetics, and life sciences in general. Only when their phylogenetic relationships are stably resolved, can individual results from fungal research be integrated into a holistic picture of biology. However, and despite recent progress, many deep relationships within the fungi remain unclear. Here, we present the first phylogenomic study of an entire eukaryotic kingdom that uses a consistency criterion to strengthen phylogenetic conclusions. We reason that branches (splits) recovered with independent data and different tree reconstruction methods are likely to reflect true evolutionary relationships. Two complementary phylogenomic data sets based on 99 fungal genomes and 109 fungal expressed sequence tag (EST) sets analyzed with four different tree reconstruction methods shed light from different angles on the fungal tree of life. Eleven additional data sets address specifically the phylogenetic position of Blastocladiomycota, Ustilaginomycotina, and Dothideomycetes, respectively. The combined evidence from the resulting trees supports the deep-level stability of the fungal groups toward a comprehensive natural system of the fungi. In addition, our analysis reveals methodologically interesting aspects. Enrichment for EST encoded data—a common practice in phylogenomic analyses—introduces a strong bias toward slowly evolving and functionally correlated genes. Consequently, the generalization of phylogenomic data sets as collections of randomly selected genes cannot be taken for granted. A thorough characterization of the data to assess possible influences on the tree reconstruction should therefore become a standard in phylogenomic analyses. PMID:22114356

  18. Molecular phylogenetics of mastodon and Tyrannosaurus rex.

    Science.gov (United States)

    Organ, Chris L; Schweitzer, Mary H; Zheng, Wenxia; Freimark, Lisa M; Cantley, Lewis C; Asara, John M

    2008-04-25

    We report a molecular phylogeny for a nonavian dinosaur, extending our knowledge of trait evolution within nonavian dinosaurs into the macromolecular level of biological organization. Fragments of collagen alpha1(I) and alpha2(I) proteins extracted from fossil bones of Tyrannosaurus rex and Mammut americanum (mastodon) were analyzed with a variety of phylogenetic methods. Despite missing sequence data, the mastodon groups with elephant and the T. rex groups with birds, consistent with predictions based on genetic and morphological data for mastodon and on morphological data for T. rex. Our findings suggest that molecular data from long-extinct organisms may have the potential for resolving relationships at critical areas in the vertebrate evolutionary tree that have, so far, been phylogenetically intractable.

  19. Statistically Efficient Methods for Pitch and DOA Estimation

    DEFF Research Database (Denmark)

    Jensen, Jesper Rindom; Christensen, Mads Græsbøll; Jensen, Søren Holdt

    2013-01-01

    Traditionally, direction-of-arrival (DOA) and pitch estimation of multichannel, periodic sources have been considered as two separate problems. Separate estimation may render the task of resolving sources with similar DOA or pitch impossible, and it may decrease the estimation accuracy. Therefore, it was recently considered to estimate the DOA and pitch jointly. In this paper, we propose two novel methods for DOA and pitch estimation. They both yield maximum-likelihood estimates in white Gaussian noise scenarios, where the SNR may be different across channels, as opposed to state-of-the-art methods...

  20. Automatic selection of reference taxa for protein-protein interaction prediction with phylogenetic profiling

    DEFF Research Database (Denmark)

    Simonsen, Martin; Maetschke, S.R.; Ragan, M.A.

    2012-01-01

    Motivation: Phylogenetic profiling methods can achieve good accuracy in predicting protein–protein interactions, especially in prokaryotes. Recent studies have shown that the choice of reference taxa (RT) is critical for accurate prediction, but with more than 2500 fully sequenced taxa publicly......: We present three novel methods for automating the selection of RT, using machine learning based on known protein–protein interaction networks. One of these methods in particular, Tree-Based Search, yields greatly improved prediction accuracies. We further show that different methods for constituting phylogenetic profiles often require very different RT sets to support high prediction accuracy....

  1. Multigene molecular phylogenetics reveals true morels (Morchella) are especially species-rich in China

    Science.gov (United States)

    The phylogenetic diversity of true morels (Morchella) in China was estimated by initially analyzing nuclear ribosomal internal transcribed spacer (ITS) rDNA sequences from 361 specimens collected in 21 provinces during the 2003-2011 growing seasons, together with six collections obtained on loan fro...

  2. Phylogenetic versus functional signals in the evolution of form-function relationships in terrestrial vision.

    Science.gov (United States)

    Motani, Ryosuke; Schmitz, Lars

    2011-08-01

    Phylogeny is deeply pertinent to evolutionary studies. Traits that perform a body function are expected to be strongly influenced by physical "requirements" of the function. We investigated if such traits exhibit phylogenetic signals, and, if so, how phylogenetic noise biases quantification of form-function relationships. A form-function system that is strongly influenced by physics, namely the relationship between eye morphology and visual optics in amniotes, was used. We quantified the correlation between form (i.e., eye morphology) and function (i.e., ocular optics) while varying the level of phylogenetic bias removal through adjusting Pagel's λ. Ocular soft-tissue dimensions exhibited the highest correlation with ocular optics when 1% of phylogenetic bias expected from Brownian motion was removed (i.e., λ = 0.01); the value for hard-tissue data was 8%. A small degree of phylogenetic bias therefore exists in morphology despite the stringent functional constraints. We also devised a phylogenetically informed discriminant analysis and recorded the effects of phylogenetic bias on this method using the same data. Use of proper λ values during phylogenetic bias removal improved misidentification rates in resulting classifications when prior probabilities were assumed to be equal. Even a small degree of phylogenetic bias affected the classification resulting from phylogenetically informed discriminant analysis. © 2011 The Author(s). Evolution © 2011 The Society for the Study of Evolution.

  3. On Tree-Based Phylogenetic Networks.

    Science.gov (United States)

    Zhang, Louxin

    2016-07-01

    A large class of phylogenetic networks can be obtained from trees by the addition of horizontal edges between the tree edges. These networks are called tree-based networks. We present a simple necessary and sufficient condition for tree-based networks and prove that a universal tree-based network exists for any number of taxa that contains as its base every phylogenetic tree on the same set of taxa. This answers two problems posted by Francis and Steel recently. A byproduct is a computer program for generating random binary phylogenetic networks under the uniform distribution model.

  4. Universal artifacts affect the branching of phylogenetic trees, not universal scaling laws.

    Science.gov (United States)

    Altaba, Cristian R

    2009-01-01

    The superficial resemblance of phylogenetic trees to other branching structures allows searching for macroevolutionary patterns. However, such trees are just statistical inferences of particular historical events. Recent meta-analyses report finding regularities in the branching pattern of phylogenetic trees. But is this supported by evidence, or are such regularities just methodological artifacts? If so, is there any signal in a phylogeny? In order to evaluate the impact of polytomies and imbalance on tree shape, the distribution of all binary and polytomic trees of up to 7 taxa was assessed in tree-shape space. The relationship between the proportion of outgroups and the amount of imbalance introduced with them was assessed applying four different tree-building methods to 100 combinations from a set of 10 ingroup and 9 outgroup species, and performing covariance analyses. The relevance of this analysis was explored taking 61 published phylogenies, based on nucleic acid sequences and involving various taxa, taxonomic levels, and tree-building methods. All methods of phylogenetic inference are quite sensitive to the artifacts introduced by outgroups. However, published phylogenies appear to be subject to a rather effective, albeit rather intuitive control against such artifacts. The data and methods used to build phylogenetic trees are varied, so any meta-analysis is subject to pitfalls due to their uneven intrinsic merits, which translate into artifacts in tree shape. The binary branching pattern is an imposition of methods, and seldom reflects true relationships in intraspecific analyses, yielding artifactual polytomies in short trees. Above the species level, the departure of real trees from simplistic random models is caused at least by two natural factors--uneven speciation and extinction rates; and artifacts such as choice of taxa included in the analysis, and imbalance introduced by outgroups and basal paraphyletic taxa. This artifactual imbalance accounts

  5. Maximum Parsimony on Phylogenetic networks

    Science.gov (United States)

    2012-01-01

    Background: Phylogenetic networks are generalizations of phylogenetic trees, that are used to model evolutionary events in various contexts. Several different methods and criteria have been introduced for reconstructing phylogenetic trees. Maximum Parsimony is a character-based approach that infers a phylogenetic tree by minimizing the total number of evolutionary steps required to explain a given set of data assigned on the leaves. Exact solutions for optimizing parsimony scores on phylogenetic trees have been introduced in the past. Results: In this paper, we define the parsimony score on networks as the sum of the substitution costs along all the edges of the network; and show that certain well-known algorithms that calculate the optimum parsimony score on trees, such as Sankoff and Fitch algorithms extend naturally for networks, barring conflicting assignments at the reticulate vertices. We provide heuristics for finding the optimum parsimony scores on networks. Our algorithms can be applied for any cost matrix that may contain unequal substitution costs of transforming between different characters along different edges of the network. We analyzed this for experimental data on 10 leaves or fewer with at most 2 reticulations and found that for almost all networks, the bounds returned by the heuristics matched with the exhaustively determined optimum parsimony scores. Conclusion: The parsimony score we define here does not directly reflect the cost of the best tree in the network that displays the evolution of the character. However, when searching for the most parsimonious network that describes a collection of characters, it becomes necessary to add additional cost considerations to prefer simpler structures, such as trees over networks. The parsimony score on a network that we describe here takes into account the substitution costs along the additional edges incident on each reticulate vertex, in addition to the substitution costs along the other edges which are
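
    The abstract above builds on the Fitch algorithm for trees. As a minimal sketch of that tree-only starting point (not the network extension the paper describes), the Python below runs the standard bottom-up Fitch pass with unit substitution costs on a binary tree for a single character. The nested-tuple tree encoding, the leaf names, and the example states are assumptions chosen for illustration.

```python
def fitch(tree, states):
    """Bottom-up Fitch pass for one character on a rooted binary tree.

    tree   : a leaf name (str) or a 2-tuple (left_subtree, right_subtree)
    states : dict mapping each leaf name to its observed character state
    Returns (candidate_state_set, parsimony_score) for the subtree root.
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch(left, states)
    right_set, right_cost = fitch(right, states)
    common = left_set & right_set
    if common:
        # The children agree on at least one state: no extra substitution.
        return common, left_cost + right_cost
    # The children disagree: take the union and count one substitution.
    return left_set | right_set, left_cost + right_cost + 1

# Hypothetical four-taxon example.
tree = (("A", "B"), ("C", "D"))
states = {"A": "G", "B": "G", "C": "T", "D": "G"}
root_set, score = fitch(tree, states)
```

    Unequal substitution costs, which the abstract also mentions, would call for the Sankoff dynamic program instead of this set-based pass.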

  6. A Fast Soft Bit Error Rate Estimation Method

    Directory of Open Access Journals (Sweden)

    Ait-Idir Tarik

    2010-01-01

    We have suggested in a previous publication a method to estimate the Bit Error Rate (BER) of a digital communications system instead of using the famous Monte Carlo (MC) simulation. This method was based on the estimation of the probability density function (pdf) of soft observed samples. The kernel method was used for the pdf estimation. In this paper, we suggest using a Gaussian Mixture (GM) model. The Expectation Maximisation algorithm is used to estimate the parameters of this mixture. The optimal number of Gaussians is computed by using Mutual Information Theory. The analytical expression of the BER is therefore simply given by using the different estimated parameters of the Gaussian Mixture. Simulation results are presented to compare the three mentioned methods: Monte Carlo, Kernel and Gaussian Mixture. We analyze the performance of the proposed BER estimator in the framework of a multiuser code division multiple access system and show that attractive performance is achieved compared with conventional MC or Kernel aided techniques. The results show that the GM method can drastically reduce the needed number of samples to estimate the BER in order to reduce the required simulation run-time, even at very low BER.

  7. A Distance Measure for Genome Phylogenetic Analysis

    Science.gov (United States)

    Cao, Minh Duc; Allison, Lloyd; Dix, Trevor

    Phylogenetic analyses of species based on single genes or parts of the genomes are often inconsistent because of factors such as variable rates of evolution and horizontal gene transfer. The availability of more and more sequenced genomes allows phylogeny construction from complete genomes that is less sensitive to such inconsistency. For such long sequences, construction methods like maximum parsimony and maximum likelihood are often not possible due to their intensive computational requirements. Another class of tree construction methods, namely distance-based methods, requires a measure of distance between any two genomes. Some measures, such as evolutionary edit distance of gene order and gene content, are computationally expensive or do not perform well when the gene content of the organisms is similar. This study presents an information theoretic measure of genetic distances between genomes based on the biological compression algorithm expert model. We demonstrate that our distance measure can be applied to reconstruct the consensus phylogenetic tree of a number of Plasmodium parasites from their genomes, the statistical bias of which would mislead conventional analysis methods. Our approach is also used to successfully construct a plausible evolutionary tree for the γ-Proteobacteria group whose genomes are known to contain many horizontally transferred genes.

  8. Statistical error estimation of the Feynman-α method using the bootstrap method

    International Nuclear Information System (INIS)

    Endo, Tomohiro; Yamamoto, Akio; Yagi, Takahiro; Pyeon, Cheol Ho

    2016-01-01

    The applicability of the bootstrap method is investigated for estimating the statistical error of the Feynman-α method, one of the subcritical measurement techniques based on reactor noise analysis. In the Feynman-α method, the statistical error can be estimated simply from multiple measurements of reactor noise; however, repeating the measurements requires additional measurement time. Using a resampling technique called the 'bootstrap method', the standard deviation and confidence interval of measurement results obtained by the Feynman-α method can be estimated as the statistical error from only a single measurement of reactor noise. In order to validate our proposed technique, we carried out a passive measurement of reactor noise without any external source, i.e., with only the inherent neutron source from spontaneous fission and (α,n) reactions in nuclear fuels, at the Kyoto University Criticality Assembly. Through the actual measurement, it is confirmed that the bootstrap method is applicable to approximately estimate the statistical error of measurement results obtained by the Feynman-α method. (author)
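
    To make the resampling idea in the abstract above concrete, here is a minimal Python sketch of a plain bootstrap applied to a single series of gate counts. It assumes the usual Feynman Y statistic (variance-to-mean ratio minus one) and simple independent resampling with replacement; the synthetic counts, the function names, and the choice of population variance are illustrative assumptions, and the paper's actual resampling scheme may differ.

```python
import random
import statistics

def feynman_y(counts):
    """Variance-to-mean ratio minus one for a list of gate counts."""
    mean = statistics.fmean(counts)
    return statistics.pvariance(counts) / mean - 1.0

def bootstrap_sd(counts, n_resamples=500, seed=0):
    """Bootstrap standard deviation of Feynman Y from one measurement."""
    rng = random.Random(seed)
    n = len(counts)
    replicates = []
    for _ in range(n_resamples):
        # Draw n counts with replacement from the single observed series.
        resample = [counts[rng.randrange(n)] for _ in range(n)]
        replicates.append(feynman_y(resample))
    return statistics.stdev(replicates)

# Synthetic, strictly positive counts standing in for measured data.
data_rng = random.Random(1)
counts = [data_rng.randint(5, 15) for _ in range(200)]
y_hat = feynman_y(counts)
y_sd = bootstrap_sd(counts)
```

    A percentile confidence interval could be read off the sorted replicates in the same loop; it is omitted here to keep the sketch short.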

  9. Bacterial phylogenetic reconstruction from whole genomes is robust to recombination but demographic inference is not.

    Science.gov (United States)

    Hedge, Jessica; Wilson, Daniel J

    2014-11-25

    Phylogenetic inference in bacterial genomics is fundamental to understanding problems such as population history, antimicrobial resistance, and transmission dynamics. The field has been plagued by an apparent state of contradiction since the distorting effects of recombination on phylogeny were discovered more than a decade ago. Researchers persist with detailed phylogenetic analyses while simultaneously acknowledging that recombination seriously misleads inference of population dynamics and selection. Here we resolve this paradox by showing that phylogenetic tree topologies based on whole genomes robustly reconstruct the clonal frame topology but that branch lengths are badly skewed. Surprisingly, removing recombining sites can exacerbate branch length distortion caused by recombination. Phylogenetic tree reconstruction is a popular approach for understanding the relatedness of bacteria in a population from differences in their genome sequences. However, bacteria frequently exchange regions of their genomes by a process called homologous recombination, which violates a fundamental assumption of phylogenetic methods. Since many researchers continue to use phylogenetics for recombining bacteria, it is important to understand how recombination affects the conclusions drawn from these analyses. We find that whole-genome sequences afford great accuracy in reconstructing evolutionary relationships despite concerns surrounding the presence of recombination, but the branch lengths of the phylogenetic tree are indeed badly distorted. Surprisingly, methods to reduce the impact of recombination on branch lengths can exacerbate the problem. Copyright © 2014 Hedge and Wilson.

  10. Comprehensive Phylogenetic Analysis of Bovine Non-aureus Staphylococci Species Based on Whole-Genome Sequencing

    Science.gov (United States)

    Naushad, Sohail; Barkema, Herman W.; Luby, Christopher; Condas, Larissa A. Z.; Nobrega, Diego B.; Carson, Domonique A.; De Buck, Jeroen

    2016-01-01

    Non-aureus staphylococci (NAS), a heterogeneous group of a large number of species and subspecies, are the most frequently isolated pathogens from intramammary infections in dairy cattle. Phylogenetic relationships among bovine NAS species are controversial and have mostly been determined based on single-gene trees. Herein, we analyzed phylogeny of bovine NAS species using whole-genome sequencing (WGS) of 441 distinct isolates. In addition, evolutionary relationships among bovine NAS were estimated from multilocus data of 16S rRNA, hsp60, rpoB, sodA, and tuf genes and sequences from these and numerous other single genes/proteins. All phylogenies were created with FastTree, Maximum-Likelihood, Maximum-Parsimony, and Neighbor-Joining methods. Regardless of methodology, WGS-trees clearly separated bovine NAS species into five monophyletic coherent clades. Furthermore, there were consistent interspecies relationships within clades in all WGS phylogenetic reconstructions. Except for the Maximum-Parsimony tree, multilocus data analysis similarly produced five clades. There were large variations in determining clades and interspecies relationships in single gene/protein trees, under different methods of tree constructions, highlighting limitations of using single genes for determining bovine NAS phylogeny. However, based on WGS data, we established a robust phylogeny of bovine NAS species, unaffected by method or model of evolutionary reconstructions. Therefore, it is now possible to determine associations between phylogeny and many biological traits, such as virulence, antimicrobial resistance, environmental niche, geographical distribution, and host specificity. PMID:28066335

  11. Phylogenetic placement of two species known only from resting spores

    DEFF Research Database (Denmark)

    Hajek, Ann E; Gryganskyi, Andrii; Bittner, Tonya

    2016-01-01

    Molecular methods were used to determine the generic placement of two species of Entomophthorales known only from resting spores. Historically, these species would belong in the form-genus Tarichium, but this classification provides no information about phylogenetic relationships. Using DNA from resting spores, Zoophthora independentia, infecting Tipula (Lunatipula) submaculata in New York State, is now described as a new species and Tarichium porteri, described in 1942, which infects Tipula (Triplicitipula) colei in Tennessee, is transferred to the genus Zoophthora. We have shown that use of molecular methods can assist with determination of the phylogenetic relations of specimens within the form-genus Tarichium for an already described species and a new species for which only resting spores are available.

  12. treespace: Statistical exploration of landscapes of phylogenetic trees.

    Science.gov (United States)

    Jombart, Thibaut; Kendall, Michelle; Almagro-Garcia, Jacob; Colijn, Caroline

    2017-11-01

    The increasing availability of large genomic data sets as well as the advent of Bayesian phylogenetics facilitates the investigation of phylogenetic incongruence, which can result in the impossibility of representing phylogenetic relationships using a single tree. While sometimes considered as a nuisance, phylogenetic incongruence can also reflect meaningful biological processes as well as relevant statistical uncertainty, both of which can yield valuable insights in evolutionary studies. We introduce a new tool for investigating phylogenetic incongruence through the exploration of phylogenetic tree landscapes. Our approach, implemented in the R package treespace, combines tree metrics and multivariate analysis to provide low-dimensional representations of the topological variability in a set of trees, which can be used for identifying clusters of similar trees and group-specific consensus phylogenies. treespace also provides a user-friendly web interface for interactive data analysis and is integrated alongside existing standards for phylogenetics. It fills a gap in the current phylogenetics toolbox in R and will facilitate the investigation of phylogenetic results. © 2017 The Authors. Molecular Ecology Resources Published by John Wiley & Sons Ltd.

  13. A MONTE-CARLO METHOD FOR ESTIMATING THE CORRELATION EXPONENT

    NARCIS (Netherlands)

    MIKOSCH, T; WANG, QA

    We propose a Monte Carlo method for estimating the correlation exponent of a stationary ergodic sequence. The estimator can be considered as a bootstrap version of the classical Hill estimator. A simulation study shows that the method yields reasonable estimates.
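
    For orientation, the classical Hill estimator mentioned above can be sketched as follows. This is only the textbook estimator of a tail index from the k largest observations, not the bootstrap variant proposed in the record; the function name and the Pareto test sample are assumptions for illustration.

```python
import numpy as np


def hill_estimator(sample, k):
    """Classical Hill estimate of the tail index from the k largest observations."""
    x = np.sort(np.asarray(sample, dtype=float))[::-1]  # descending order statistics
    if not 0 < k < len(x):
        raise ValueError("k must satisfy 0 < k < len(sample)")
    if x[k] <= 0:
        raise ValueError("the (k+1)-th largest observation must be positive")
    # Mean log-excess of the k largest values over the (k+1)-th largest value.
    mean_log_excess = np.mean(np.log(x[:k] / x[k]))
    return 1.0 / mean_log_excess


rng = np.random.default_rng(0)
pareto_sample = rng.pareto(2.0, size=100_000) + 1.0  # classical Pareto, tail index 2
estimate = hill_estimator(pareto_sample, k=2_000)
```

    The choice of k trades bias against variance; the value used here is arbitrary.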

  14. Nonbinary Tree-Based Phylogenetic Networks

    NARCIS (Netherlands)

    Jetten, L.; van Iersel, L.J.J.

    2018-01-01

    Rooted phylogenetic networks are used to describe evolutionary histories that contain non-treelike evolutionary events such as hybridization and horizontal gene transfer. In some cases, such histories can be described by a phylogenetic base-tree with additional linking arcs, which can for example

  15. GenNon-h: Generating multiple sequence alignments on nonhomogeneous phylogenetic trees

    Directory of Open Access Journals (Sweden)

    Kedzierska Anna M

    2012-08-01

    Background: A number of software packages are available to generate DNA multiple sequence alignments (MSAs) evolved under continuous-time Markov processes on phylogenetic trees. On the other hand, methods of simulating the DNA MSA directly from the transition matrices do not exist. Moreover, existing software is restricted to time-reversible models and is not optimized to generate nonhomogeneous data (i.e., placing distinct substitution rates at different lineages). Results: We present the first package designed to generate MSAs evolving under discrete-time Markov processes on phylogenetic trees, directly from probability substitution matrices. Based on the input model and a phylogenetic tree in the Newick format (with branch lengths measured as the expected number of substitutions per site), the algorithm produces DNA alignments of desired length. GenNon-h is publicly available for download. Conclusion: The software presented here is an efficient tool to generate DNA MSAs on a given phylogenetic tree. GenNon-h provides the user with nonstationary or nonhomogeneous phylogenetic data that are well suited for testing complex biological hypotheses, exploring the limits of the reconstruction algorithms and their robustness to such models.
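
    The idea of generating an alignment directly from per-edge substitution matrices can be sketched in a few lines of Python. This is not GenNon-h itself: the nested-dict tree format (instead of Newick), the function name and the example matrices are all assumptions made for illustration.

```python
import numpy as np

BASES = np.array(list("ACGT"))


def simulate_alignment(tree, root_dist, length, rng):
    """Simulate DNA sequences at the leaves of a rooted tree.

    `tree` is a nested dict {"name": str, "children": [(matrix, subtree), ...]},
    where each `matrix` is a 4x4 row-stochastic substitution matrix for that edge.
    Edges may carry different matrices, which gives a nonhomogeneous process.
    """
    root_states = rng.choice(4, size=length, p=root_dist)
    alignment = {}

    def evolve(node, states):
        if not node.get("children"):
            alignment[node["name"]] = "".join(BASES[states])
            return
        for matrix, child in node["children"]:
            # Draw each child state from the matrix row of its parent state
            # by inverting the row-wise cumulative distribution.
            cdf = np.cumsum(np.asarray(matrix, dtype=float), axis=1)
            u = rng.random(length)
            child_states = (u[:, None] > cdf[states]).sum(axis=1)
            evolve(child, np.minimum(child_states, 3))  # guard against rounding

    evolve(tree, root_states)
    return alignment


rng = np.random.default_rng(1)
slow = np.full((4, 4), 0.01) + np.eye(4) * 0.96  # each row sums to 1
fast = np.full((4, 4), 0.10) + np.eye(4) * 0.60
tree = {"name": "root", "children": [
    (slow, {"name": "taxon1"}),
    (fast, {"name": "inner", "children": [
        (slow, {"name": "taxon2"}),
        (fast, {"name": "taxon3"}),
    ]}),
]}
msa = simulate_alignment(tree, [0.25] * 4, 50, rng)
```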

  16. The Impact of Reconstruction Methods, Phylogenetic Uncertainty and Branch Lengths on Inference of Chromosome Number Evolution in American Daisies (Melampodium, Asteraceae)

    OpenAIRE

    McCann, Jamie; Schneeweiss, Gerald M.; Stuessy, Tod F.; Villaseñor, Jose L.; Weiss-Schneeweiss, Hanna

    2016-01-01

    Chromosome number change (polyploidy and dysploidy) plays an important role in plant diversification and speciation. Investigating chromosome number evolution commonly entails ancestral state reconstruction performed within a phylogenetic framework, which is, however, prone to uncertainty, whose effects on evolutionary inferences are insufficiently understood. Using the chromosomally diverse plant genus Melampodium (Asteraceae) as model group, we assess the impact of reconstruction method (ma...

  17. A Generalized Autocovariance Least-Squares Method for Covariance Estimation

    DEFF Research Database (Denmark)

    Åkesson, Bernt Magnus; Jørgensen, John Bagterp; Poulsen, Niels Kjølstad

    2007-01-01

    A generalization of the autocovariance least-squares method for estimating noise covariances is presented. The method can estimate mutually correlated system and sensor noise and can be used with both the predicting and the filtering form of the Kalman filter.

  18. Phylogenetic relationships of Chaetomium isolates based on the ...

    African Journals Online (AJOL)

    Biotech Unit

    2013-02-27

    Phylogenetic analysis of Chaetomium species. The evolutionary history was inferred using the maximum parsimony method. The bootstrap consensus tree inferred from 1000 replicates is taken to represent the evolutionary history of the taxa analyzed (Felsenstein, 1985). The MP tree was obtained using.

  19. Molecular phylogenetics and historical biogeography of Rhinolophus bats.

    Science.gov (United States)

    Stoffberg, Samantha; Jacobs, David S; Mackie, Iain J; Matthee, Conrad A

    2010-01-01

    The phylogenetic relationships within the horseshoe bats (genus Rhinolophus) are poorly resolved, particularly at deeper levels within the tree. We present a better-resolved phylogenetic hypothesis for 30 rhinolophid species based on parsimony and Bayesian analyses of the mitochondrial cytochrome b gene and three nuclear introns (TG, THY and PRKC1). Strong support was found for the existence of two geographic clades within the monophyletic Rhinolophidae: an African group and an Oriental assemblage. The relaxed Bayesian clock method indicated that the two rhinolophid clades diverged approximately 35 million years ago and results from Dispersal Vicariance (DIVA) analysis suggest that the horseshoe bats arose in Asia and subsequently dispersed into Europe and Africa.

  20. Different relationships between temporal phylogenetic turnover and phylogenetic similarity and in two forests were detected by a new null model.

    Science.gov (United States)

    Huang, Jian-Xiong; Zhang, Jian; Shen, Yong; Lian, Ju-yu; Cao, Hong-lin; Ye, Wan-hui; Wu, Lin-fang; Bin, Yue

    2014-01-01

    Ecologists have been monitoring community dynamics with the purpose of understanding the rates and causes of community change. However, there is a lack of monitoring of community dynamics from the perspective of phylogeny. We attempted to understand temporal phylogenetic turnover in a 50 ha tropical forest (Barro Colorado Island, BCI) and a 20 ha subtropical forest (Dinghushan in southern China, DHS). To obtain temporal phylogenetic turnover under random conditions, two null models were used. The first shuffled names of species that are widely used in community phylogenetic analyses. The second simulated demographic processes with careful consideration on the variation in dispersal ability among species and the variations in mortality both among species and among size classes. With the two models, we tested the relationships between temporal phylogenetic turnover and phylogenetic similarity at different spatial scales in the two forests. Results were more consistent with previous findings using the second null model suggesting that the second null model is more appropriate for our purposes. With the second null model, a significantly positive relationship was detected between phylogenetic turnover and phylogenetic similarity in BCI at a 10 m×10 m scale, potentially indicating phylogenetic density dependence. This relationship in DHS was significantly negative at three of five spatial scales. This could indicate abiotic filtering processes for community assembly. Using variation partitioning, we found phylogenetic similarity contributed to variation in temporal phylogenetic turnover in the DHS plot but not in BCI plot. The mechanisms for community assembly in BCI and DHS vary from phylogenetic perspective. Only the second null model detected this difference indicating the importance of choosing a proper null model.
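
    The first null model above, shuffling species names across the phylogeny, can be sketched as follows. The metric used here (mean pairwise phylogenetic distance, MPD) is a stand-in chosen for brevity and is not the temporal turnover measure of the study; the distance matrix, species names and function names are hypothetical.

```python
import numpy as np


def mean_pairwise_distance(dist, names, community):
    """Mean phylogenetic distance over all pairs of species present in `community`."""
    idx = [names.index(species) for species in community]
    sub = dist[np.ix_(idx, idx)]
    upper = sub[np.triu_indices(len(idx), k=1)]
    return float(upper.mean())


def shuffle_null(dist, names, community, n_perm, rng):
    """Null distribution of MPD obtained by shuffling species names across the tips."""
    null = np.empty(n_perm)
    for i in range(n_perm):
        shuffled = list(rng.permutation(names))
        null[i] = mean_pairwise_distance(dist, shuffled, community)
    return null


names = ["sp1", "sp2", "sp3", "sp4"]
dist = np.array([
    [0.0, 2.0, 6.0, 6.0],
    [2.0, 0.0, 6.0, 6.0],
    [6.0, 6.0, 0.0, 2.0],
    [6.0, 6.0, 2.0, 0.0],
])
community = ["sp1", "sp2"]  # two sister species

observed = mean_pairwise_distance(dist, names, community)
null = shuffle_null(dist, names, community, n_perm=200, rng=np.random.default_rng(0))
ses = (observed - null.mean()) / null.std()  # negative values suggest clustering
```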

  1. Phylogenetic structure in tropical hummingbird communities

    DEFF Research Database (Denmark)

    Graham, Catherine H; Parra, Juan L; Rahbek, Carsten

    2009-01-01

    How biotic interactions, current and historical environment, and biogeographic barriers determine community structure is a fundamental question in ecology and evolution, especially in diverse tropical regions. To evaluate patterns of local and regional diversity, we quantified the phylogenetic composition of 189 hummingbird communities in Ecuador. We assessed how species and phylogenetic composition changed along environmental gradients and across biogeographic barriers. We show that humid, low-elevation communities are phylogenetically overdispersed (coexistence of distant relatives), a pattern that is consistent with the idea that competition influences the local composition of hummingbirds. At higher elevations communities are phylogenetically clustered (coexistence of close relatives), consistent with the expectation of environmental filtering, which may result from the challenge of sustaining...

  2. Constructing phylogenetic trees using interacting pathways.

    Science.gov (United States)

    Wan, Peng; Che, Dongsheng

    2013-01-01

    Phylogenetic trees are used to represent evolutionary relationships among biological species or organisms. The construction of phylogenetic trees is based on the similarities or differences of their physical or genetic features. Traditional approaches to constructing phylogenetic trees mainly focus on physical features. The recent advancement of high-throughput technologies has led to the accumulation of huge amounts of biological data, which in turn has changed the way biological studies are conducted in various aspects. In this paper, we report our approach to building phylogenetic trees using the information of interacting pathways. We have applied hierarchical clustering to two domains of organisms: eukaryotes and prokaryotes. Our preliminary results have shown the effectiveness of using the interacting pathways in revealing evolutionary relationships.

  3. Unemployment estimation: Spatial point referenced methods and models

    KAUST Repository

    Pereira, Soraia

    2017-06-26

    Since the 4th quarter of 2014, the Portuguese Labor Force Survey has geo-referenced its sampling units, namely the dwellings in which the surveys are carried out. This opens new possibilities for analysing and estimating unemployment and its spatial distribution across any region. The labor force survey chooses, according to preestablished sampling criteria, a certain number of dwellings across the nation and surveys the number of unemployed in these dwellings. Based on this survey, the National Statistical Institute of Portugal presently uses direct estimation methods to estimate the national unemployment figures. Recently, there has been increased interest in estimating these figures in smaller areas. Direct estimation methods, due to reduced sample sizes in small areas, tend to produce fairly large sampling variations; therefore, model-based methods, which tend to

  4. Long-branch attraction bias and inconsistency in Bayesian phylogenetics.

    Science.gov (United States)

    Kolaczkowski, Bryan; Thornton, Joseph W

    2009-12-09

    Bayesian inference (BI) of phylogenetic relationships uses the same probabilistic models of evolution as its precursor maximum likelihood (ML), so BI has generally been assumed to share ML's desirable statistical properties, such as largely unbiased inference of topology given an accurate model and increasingly reliable inferences as the amount of data increases. Here we show that BI, unlike ML, is biased in favor of topologies that group long branches together, even when the true model and prior distributions of evolutionary parameters over a group of phylogenies are known. Using experimental simulation studies and numerical and mathematical analyses, we show that this bias becomes more severe as more data are analyzed, causing BI to infer an incorrect tree as the maximum a posteriori phylogeny with asymptotically high support as sequence length approaches infinity. BI's long branch attraction bias is relatively weak when the true model is simple but becomes pronounced when sequence sites evolve heterogeneously, even when this complexity is incorporated in the model. This bias--which is apparent under both controlled simulation conditions and in analyses of empirical sequence data--also makes BI less efficient and less robust to the use of an incorrect evolutionary model than ML. Surprisingly, BI's bias is caused by one of the method's stated advantages--that it incorporates uncertainty about branch lengths by integrating over a distribution of possible values instead of estimating them from the data, as ML does. Our findings suggest that trees inferred using BI should be interpreted with caution and that ML may be a more reliable framework for modern phylogenetic analysis.

  5. Phylogenetic classification of bony fishes.

    Science.gov (United States)

    Betancur-R, Ricardo; Wiley, Edward O; Arratia, Gloria; Acero, Arturo; Bailly, Nicolas; Miya, Masaki; Lecointre, Guillaume; Ortí, Guillermo

    2017-07-06

    Fish classifications, as those of most other taxonomic groups, are being transformed drastically as new molecular phylogenies provide support for natural groups that were unanticipated by previous studies. A brief review of the main criteria used by ichthyologists to define their classifications during the last 50 years, however, reveals slow progress towards using an explicit phylogenetic framework. Instead, the trend has been to rely, in varying degrees, on deep-rooted anatomical concepts and authority, often mixing taxa with explicit phylogenetic support with arbitrary groupings. Two leading sources in ichthyology frequently used for fish classifications (JS Nelson's volumes of Fishes of the World and W. Eschmeyer's Catalog of Fishes) fail to adopt a global phylogenetic framework despite much recent progress made towards the resolution of the fish Tree of Life. The first explicit phylogenetic classification of bony fishes was published in 2013, based on a comprehensive molecular phylogeny (www.deepfin.org). We here update the first version of that classification by incorporating the most recent phylogenetic results. The updated classification presented here is based on phylogenies inferred using molecular and genomic data for nearly 2000 fishes. A total of 72 orders (and 79 suborders) are recognized in this version, compared with 66 orders in version 1. The phylogeny resolves placement of 410 families, or ~80% of the total of 514 families of bony fishes currently recognized. The ordinal status of 30 percomorph families included in this study, however, remains uncertain (incertae sedis in the series Carangaria, Ovalentaria, or Eupercaria). Comments to support taxonomic decisions and comparisons with conflicting taxonomic groups proposed by others are presented. We also highlight cases where morphological support exists for the groups being classified. This version of the phylogenetic classification of bony fishes is substantially improved, providing resolution

  6. Nuclear and cpDNA sequences combined provide strong inference of higher phylogenetic relationships in the phlox family (Polemoniaceae).

    Science.gov (United States)

    Johnson, Leigh A; Chan, Lauren M; Weese, Terri L; Busby, Lisa D; McMurry, Samuel

    2008-09-01

    Members of the phlox family (Polemoniaceae) serve as useful models for studying various evolutionary and biological processes. Despite its biological importance, no family-wide phylogenetic estimate based on multiple DNA regions with complete generic sampling is available. Here, we analyze one nuclear and five chloroplast DNA sequence regions (nuclear ITS, chloroplast matK, trnL intron plus trnL-trnF intergenic spacer, and the trnS-trnG, trnD-trnT, and psbM-trnD intergenic spacers) using parsimony and Bayesian methods, as well as assessments of congruence and long branch attraction, to explore phylogenetic relationships among 84 ingroup species representing all currently recognized Polemoniaceae genera. Relationships inferred from the ITS and concatenated chloroplast regions are similar overall. A combined analysis provides strong support for the monophyly of Polemoniaceae and subfamilies Acanthogilioideae, Cobaeoideae, and Polemonioideae. Relationships among subfamilies, and thus for the precise root of Polemoniaceae, remain poorly supported. Within the largest subfamily, Polemonioideae, four clades corresponding to tribes Polemonieae, Phlocideae, Gilieae, and Loeselieae receive strong support. The monogeneric Polemonieae appears sister to Phlocideae. Relationships within Polemonieae, Phlocideae, and Gilieae are mostly consistent between analyses and data permutations. Many relationships within Loeselieae remain uncertain. Overall, inferred phylogenetic relationships support a higher-level classification for Polemoniaceae proposed in 2000.

  7. On the Methods for Estimating the Corneoscleral Limbus.

    Science.gov (United States)

    Jesus, Danilo A; Iskander, D Robert

    2017-08-01

    The aim of this study was to develop computational methods for estimating limbus position based on the measurements of three-dimensional (3-D) corneoscleral topography and ascertain whether corneoscleral limbus routinely estimated from the frontal image corresponds to that derived from topographical information. Two new computational methods for estimating the limbus position are proposed: one based on approximating the raw anterior eye height data by series of Zernike polynomials and one that combines the 3-D corneoscleral topography with the frontal grayscale image acquired with the digital camera built into the profilometer. The proposed methods are contrasted with a previously described image-only-based procedure and with a technique of manual image annotation. The estimates of corneoscleral limbus radius were characterized by high precision. The group average (mean ± standard deviation) of the maximum difference between estimates derived from all considered methods was 0.27 ± 0.14 mm and reached up to 0.55 mm. The four estimating methods lead to statistically significant differences (nonparametric analysis of variance (ANOVA) test, p < 0.05). Precise topographical limbus demarcation is possible either from the frontal digital images of the eye or from the 3-D topographical information of the corneoscleral region. However, the results demonstrated that the corneoscleral limbus estimated from the anterior eye topography does not always correspond to that obtained through image-only based techniques. The experimental findings have shown that 3-D topography of the anterior eye, in the absence of a gold standard, has the potential to become a new computational methodology for estimating the corneoscleral limbus.

  8. Phylogenetic relationships within and among Brassica species from ...

    African Journals Online (AJOL)

    STORAGESEVER

    2008-05-02

    May 2, 2008 ... Inappropriate tree reconstruction methods would pose a problem only in the basal relationships rather than in terminal taxa; the paraphyly observed in this study applied mostly to terminal taxa. This study recovered sufficient phylogenetic characters to separate accessions of the same species, making.

  9. Phylogenetic analysis at deep timescales: unreliable gene trees, bypassed hidden support, and the coalescence/concatalescence conundrum.

    Science.gov (United States)

    Gatesy, John; Springer, Mark S

    2014-11-01

    Large datasets are required to solve difficult phylogenetic problems that are deep in the Tree of Life. Currently, two divergent systematic methods are commonly applied to such datasets: the traditional supermatrix approach (= concatenation) and "shortcut" coalescence (= coalescence methods wherein gene trees and the species tree are not co-estimated). When applied to ancient clades, these contrasting frameworks often produce congruent results, but in recent phylogenetic analyses of Placentalia (placental mammals), this is not the case. A recent series of papers has alternatively disputed and defended the utility of shortcut coalescence methods at deep phylogenetic scales. Here, we examine this exchange in the context of published phylogenomic data from Mammalia; in particular we explore two critical issues - the delimitation of data partitions ("genes") in coalescence analysis and hidden support that emerges with the combination of such partitions in phylogenetic studies. Hidden support - increased support for a clade in combined analysis of all data partitions relative to the support evident in separate analyses of the various data partitions, is a hallmark of the supermatrix approach and a primary rationale for concatenating all characters into a single matrix. In the most extreme cases of hidden support, relationships that are contradicted by all gene trees are supported when all of the genes are analyzed together. A valid fear is that shortcut coalescence methods might bypass or distort character support that is hidden in individual loci because small gene fragments are analyzed in isolation. Given the extensive systematic database for Mammalia, the assumptions and applicability of shortcut coalescence methods can be assessed with rigor to complement a small but growing body of simulation work that has directly compared these methods to concatenation. 
    We document several remarkable cases of hidden support in both supermatrix and coalescence paradigms and argue

  10. Comparison of methods for estimating carbon in harvested wood products

    International Nuclear Information System (INIS)

    Claudia Dias, Ana; Louro, Margarida; Arroja, Luis; Capela, Isabel

    2009-01-01

    There is a great diversity of methods for estimating carbon storage in harvested wood products (HWP) and, therefore, it is extremely important to agree internationally on the methods to be used in national greenhouse gas inventories. This study compares three methods for estimating carbon accumulation in HWP: the method suggested by Winjum et al. (Winjum method), the tier 2 method proposed by the IPCC Good Practice Guidance for Land Use, Land-Use Change and Forestry (GPG LULUCF) (GPG tier 2 method) and a method consistent with GPG LULUCF tier 3 methods (GPG tier 3 method). Carbon accumulation in HWP was estimated for Portugal under three accounting approaches: stock-change, production and atmospheric-flow. The uncertainty in the estimates was also evaluated using Monte Carlo simulation. The estimates of carbon accumulation in HWP obtained with the Winjum method differed substantially from the estimates obtained with the other methods, because this method tends to overestimate carbon accumulation with the stock-change and the production approaches and tends to underestimate carbon accumulation with the atmospheric-flow approach. The estimates of carbon accumulation provided by the GPG methods were similar, but the GPG tier 3 method reported the lowest uncertainties. For the GPG methods, the atmospheric-flow approach produced the largest estimates of carbon accumulation, followed by the production approach and the stock-change approach, in that order. A sensitivity analysis showed that using the "best" available data on production and trade of HWP produces larger estimates of carbon accumulation than using data from the Food and Agriculture Organization.

  11. BEAGLE: an application programming interface and high-performance computing library for statistical phylogenetics.

    Science.gov (United States)

    Ayres, Daniel L; Darling, Aaron; Zwickl, Derrick J; Beerli, Peter; Holder, Mark T; Lewis, Paul O; Huelsenbeck, John P; Ronquist, Fredrik; Swofford, David L; Cummings, Michael P; Rambaut, Andrew; Suchard, Marc A

    2012-01-01

    Phylogenetic inference is fundamental to our understanding of most aspects of the origin and evolution of life, and in recent years, there has been a concentration of interest in statistical approaches such as Bayesian inference and maximum likelihood estimation. Yet, for large data sets and realistic or interesting models of evolution, these approaches remain computationally demanding. High-throughput sequencing can yield data for thousands of taxa, but scaling to such problems using serial computing often necessitates the use of nonstatistical or approximate approaches. The recent emergence of graphics processing units (GPUs) provides an opportunity to leverage their excellent floating-point computational performance to accelerate statistical phylogenetic inference. A specialized library for phylogenetic calculation would allow existing software packages to make more effective use of available computer hardware, including GPUs. Adoption of a common library would also make it easier for other emerging computing architectures, such as field programmable gate arrays, to be used in the future. We present BEAGLE, an application programming interface (API) and library for high-performance statistical phylogenetic inference. The API provides a uniform interface for performing phylogenetic likelihood calculations on a variety of compute hardware platforms. The library includes a set of efficient implementations and can currently exploit hardware including GPUs using NVIDIA CUDA, central processing units (CPUs) with Streaming SIMD Extensions and related processor supplementary instruction sets, and multicore CPUs via OpenMP. To demonstrate the advantages of a common API, we have incorporated the library into several popular phylogenetic software packages. The BEAGLE library is free open source software licensed under the Lesser GPL and available from http://beagle-lib.googlecode.com. An example client program is available as public domain software.
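
    The phylogenetic likelihood calculation that such a library accelerates can be illustrated with a minimal, unoptimized Python sketch of Felsenstein's pruning algorithm for a single alignment column. It does not use or reproduce the BEAGLE API; the tree encoding, the Jukes-Cantor (JC69) model choice and all function names are assumptions made for illustration.

```python
import numpy as np

STATE = {"A": 0, "C": 1, "G": 2, "T": 3}


def jc69_matrix(t):
    """Jukes-Cantor transition probabilities for branch length t (substitutions/site)."""
    decay = np.exp(-4.0 * t / 3.0)
    return np.full((4, 4), 0.25 * (1.0 - decay)) + np.eye(4) * decay


def partial_likelihoods(node):
    """Felsenstein pruning: likelihood of the data below `node` for each node state.

    A leaf is a base character; an internal node is a list of (child, branch_length).
    """
    if isinstance(node, str):
        partial = np.zeros(4)
        partial[STATE[node]] = 1.0
        return partial
    partial = np.ones(4)
    for child, branch_length in node:
        # Sum over child states, then multiply contributions of all children.
        partial *= jc69_matrix(branch_length) @ partial_likelihoods(child)
    return partial


def site_likelihood(tree):
    """Likelihood of one alignment column under JC69 with uniform root frequencies."""
    return float(np.full(4, 0.25) @ partial_likelihoods(tree))


# Column "A", "A", "C" on the rooted tree ((leaf1, leaf2), leaf3).
tree = [([("A", 0.1), ("A", 0.1)], 0.05), ("C", 0.3)]
likelihood = site_likelihood(tree)
```

    In practice this computation is repeated for every site and every proposed tree, which is why vectorized and GPU implementations matter.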

  12. Phylogenetic tree construction using trinucleotide usage profile (TUP).

    Science.gov (United States)

    Chen, Si; Deng, Lih-Yuan; Bowman, Dale; Shiau, Jyh-Jen Horng; Wong, Tit-Yee; Madahian, Behrouz; Lu, Henry Horng-Shing

    2016-10-06

    It has been a challenging task to build a genome-wide phylogenetic tree for a large group of species containing a large number of genes with long nucleotide sequences. The most popular method, called feature frequency profile (FFP-k), finds the frequency distribution for all words of certain length k over the whole genome sequence using (overlapping) windows of the same length. For a satisfactory result, the recommended word length (k) ranges from 6 to 15 and it may not be a multiple of 3 (codon length). The total number of possible words needed for FFP-k can range from 4^6 = 4096 to 4^15. We propose a simple improvement over the popular FFP method using only a typical word length of 3. A new method, called Trinucleotide Usage Profile (TUP), is proposed based only on the (relative) frequency distribution using non-overlapping windows of length 3. The total number of possible words needed for TUP is 4^3 = 64, which is much less than the total count for the recommended optimal "resolution" for FFP. To build a phylogenetic tree, we propose first representing each of the species by a TUP vector and then using an appropriate distance measure between pairs of the TUP vectors for the tree construction. In particular, we propose summarizing a DNA sequence by a matrix of three rows corresponding to three reading frames, recording the frequency distribution of the non-overlapping words of length 3 in each of the reading frames. We also provide a numerical measure for comparing trees constructed with various methods. Compared to the FFP method, our empirical study showed that the proposed TUP method is more capable of building phylogenetic trees with a stronger biological support. We further provide some justifications on this from the information theory viewpoint. Unlike the FFP method, the TUP method takes advantage of the fact that the start of the first reading frame is (usually) known. Without this information, the FFP method could only rely on the frequency distribution of
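
    The three-row summary described above (one row of 64 relative trinucleotide frequencies per reading frame, counted in non-overlapping windows) can be sketched in Python as follows. The function names and the Euclidean distance are assumptions for illustration; the record does not specify which distance measure the authors use.

```python
from itertools import product

import numpy as np

CODONS = ["".join(p) for p in product("ACGT", repeat=3)]  # 4^3 = 64 words
INDEX = {codon: i for i, codon in enumerate(CODONS)}


def tup_matrix(sequence):
    """Return a 3 x 64 matrix of relative trinucleotide frequencies, one row per frame."""
    sequence = sequence.upper()
    profile = np.zeros((3, 64))
    for frame in range(3):
        # Non-overlapping windows of length 3, starting at offset `frame`.
        for start in range(frame, len(sequence) - 2, 3):
            word = sequence[start:start + 3]
            if word in INDEX:  # skip words with ambiguous bases such as N
                profile[frame, INDEX[word]] += 1
        total = profile[frame].sum()
        if total > 0:
            profile[frame] /= total
    return profile


def tup_distance(seq_a, seq_b):
    """Euclidean distance between TUP matrices (an assumed, illustrative choice)."""
    return float(np.linalg.norm(tup_matrix(seq_a) - tup_matrix(seq_b)))


profile = tup_matrix("ATGATGATG")
```

    A pairwise matrix of such distances could then be passed to any distance-based tree-building method.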

  13. Fast phylogenetic DNA barcoding

    DEFF Research Database (Denmark)

    Terkelsen, Kasper Munch; Boomsma, Wouter Krogh; Willerslev, Eske

    2008-01-01

    We present a heuristic approach to the DNA assignment problem based on phylogenetic inferences using constrained neighbour joining and non-parametric bootstrapping. We show that this method performs as well as the more computationally intensive full Bayesian approach in an analysis of 500 insect DNA sequences obtained from GenBank. We also analyse a previously published dataset of environmental DNA sequences from soil from New Zealand and Siberia, and use these data to illustrate the fact that statistical approaches to the DNA assignment problem allow for more appropriate criteria for determining the taxonomic level at which a particular DNA sequence can be assigned.

  14. Revisiting the Zingiberales: using multiplexed exon capture to resolve ancient and recent phylogenetic splits in a charismatic plant lineage

    Directory of Open Access Journals (Sweden)

    Chodon Sass

    2016-01-01

    The Zingiberales are an iconic order of monocotyledonous plants comprising eight families with distinctive and diverse floral morphologies and representing an important ecological element of tropical and subtropical forests. While the eight families are demonstrated to be monophyletic, phylogenetic relationships among these families remain unresolved. Neither combined morphological and molecular studies nor recent attempts to resolve family relationships using sequence data from whole plastomes have resulted in a well-supported, family-level phylogenetic hypothesis of relationships. Here we approach this challenge by leveraging the complete genome of one member of the order, Musa acuminata, together with transcriptome information from each of the other seven families to design a set of nuclear loci that can be enriched from highly divergent taxa with a single array-based capture of indexed genomic DNA. A total of 494 exons from 418 nuclear genes were captured for 53 ingroup taxa. The entire plastid genome was also captured for the same 53 taxa. Of the total genes captured, 308 nuclear and 68 plastid genes were used for phylogenetic estimation. The concatenated plastid and nuclear dataset supports the position of Musaceae as sister to the remaining seven families. Moreover, the combined dataset recovers known intra- and inter-family phylogenetic relationships with generally high bootstrap support. This is a flexible and cost-effective method that gives the broader plant biology community a tool for generating phylogenomic-scale sequence data in non-model systems at varying evolutionary depths.

  15. Evaluation of non cyanide methods for hemoglobin estimation

    Directory of Open Access Journals (Sweden)

    Vinaya B Shah

    2011-01-01

    Background: The hemoglobincyanide (HiCN) method for measuring hemoglobin is used extensively worldwide; its advantage is the ready availability of a stable and internationally accepted reference standard calibrator. However, its use may create a problem, as the disposal of large volumes of cyanide-containing reagent waste constitutes a potential toxic hazard. Aims and Objective: As an alternative to Drabkin's method of Hb estimation, we attempted to estimate hemoglobin by two non-cyanide methods: the alkaline hematin detergent (AHD-575) method, using Triton X-100 as lyser, and the alkaline-borax method, using quaternary ammonium detergents as lyser. Materials and Methods: The hemoglobin (Hb) results on 200 samples of varying Hb concentrations obtained by these two cyanide-free methods were compared with a cyanmethemoglobin method on a light-emitting diode (LED) based colorimeter. Hemoglobin was also estimated in 100 blood donors and 25 blood samples of infants and compared by these methods. The statistical analysis used was Pearson's correlation coefficient. Results: The response of the non-cyanide methods is linear for serially diluted blood samples over the Hb concentration range from 3 g/dl to 20 g/dl. The non-cyanide methods have a precision of ± 0.25 g/dl (coefficient of variation = 2.34%) and are suitable for use with fixed-wavelength instruments or with colorimeters at wavelengths of 530 nm and 580 nm. Correlation of these two methods was excellent (r = 0.98). The evaluation has shown them to be as reliable and reproducible as HiCN for measuring hemoglobin at all concentrations. The reagents used in the non-cyanide methods are non-biohazardous and did not affect the reliability of data determination, and the cost was less than that of the HiCN method. Conclusions: Thus, non-cyanide methods of Hb estimation offer the possibility of safe and quality Hb estimation and should prove useful for routine laboratory use. Non-cyanide methods are easily incorporated in hemoglobinometers
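
    The method comparison above rests on Pearson's correlation coefficient. As a minimal sketch of that statistic (the function name `pearson_r` and the paired readings are hypothetical and are not data from the study), it can be computed as follows:

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length samples."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares around the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical paired Hb readings (g/dl) from a reference and a test method.
reference = [5.1, 8.4, 11.0, 13.2, 16.5]
test = [5.0, 8.6, 10.9, 13.5, 16.3]
r = pearson_r(reference, test)
```

    A value of r close to 1 indicates a strong linear association between two methods; it does not by itself show that they agree in absolute terms.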

  16. Effects of logging and recruitment on community phylogenetic structure in 32 permanent forest plots of Kampong Thom, Cambodia.

    Science.gov (United States)

    Toyama, Hironori; Kajisa, Tsuyoshi; Tagane, Shuichiro; Mase, Keiko; Chhang, Phourin; Samreth, Vanna; Ma, Vuthy; Sokh, Heng; Ichihashi, Ryuji; Onoda, Yusuke; Mizoue, Nobuya; Yahara, Tetsukazu

    2015-02-19

    Ecological communities including tropical rainforest are rapidly changing under various disturbances caused by increasing human activities. Recently in Cambodia, illegal logging and clear-felling for agriculture have been increasing. Here, we study the effects of logging, mortality and recruitment of plot trees on phylogenetic community structure in 32 plots in Kampong Thom, Cambodia. Each plot was 0.25 ha; 28 plots were established in primary evergreen forests and four were established in secondary dry deciduous forests. Measurements were made in 1998, 2000, 2004 and 2010, and logging, recruitment and mortality of each tree were recorded. We estimated phylogeny using rbcL and matK gene sequences and quantified phylogenetic α and β diversity. Within communities, logging decreased phylogenetic diversity, and increased overall phylogenetic clustering and terminal phylogenetic evenness. Between communities, logging increased phylogenetic similarity between evergreen and deciduous plots. On the other hand, recruitment had opposite effects both within and between communities. The observed patterns can be explained by environmental homogenization under logging. Logging is biased to particular species and larger diameter at breast height, and forest patrol has been effective in decreasing logging. © 2015 The Author(s) Published by the Royal Society. All rights reserved.

  17. Assessing the Accuracy of Ancestral Protein Reconstruction Methods

    OpenAIRE

    Williams, Paul D; Pollock, David D; Blackburne, Benjamin P; Goldstein, Richard A

    2006-01-01

    The phylogenetic inference of ancestral protein sequences is a powerful technique for the study of molecular evolution, but any conclusions drawn from such studies are only as good as the accuracy of the reconstruction method. Every inference method leads to errors in the ancestral protein sequence, resulting in potentially misleading estimates of the ancestral protein's properties. To assess the accuracy of ancestral protein reconstruction methods, we performed computational population evolu...

  18. Combining Phylogenetic and Occurrence Information for Risk Assessment of Pest and Pathogen Interactions with Host Plants

    Directory of Open Access Journals (Sweden)

    Ángel L. Robles-Fernández

    2017-08-01

    Phytosanitary agencies conduct plant biosecurity activities, including early detection of potential introduction pathways, to improve control and eradication of pest and pathogen incursions. For such actions, analytical tools based on solid scientific knowledge regarding plant-pest or pathogen relationships for pest risk assessment are needed. Recent evidence indicating that closely related species share a higher chance of becoming infected or attacked by pests has allowed the identification of taxa with different degrees of vulnerability. Here, we use information readily available online about pest-host interactions and their geographic distributions, in combination with host phylogenetic reconstructions, to estimate a pest-host interaction (in some cases infection) index in geographic space as a more comprehensive, spatially explicit tool for risk assessment. We demonstrate this protocol using phylogenetic relationships for 20 beetle species and 235 host plant genera: first, we estimate the probability of a host sharing pests, and second, we project the index in geographic space. Overall, the predictions allow identification of the pest-host interaction type (e.g., generalist or specialist), which is largely determined by both host range and phylogenetic constraints. Furthermore, the results can be valuable in terms of identifying hotspots where pests and vulnerable hosts interact. This knowledge is useful for anticipating biological invasions or spreading of disease. We suggest that our understanding of biotic interactions will improve after combining information from multiple dimensions of biodiversity at multiple scales (e.g., phylogenetic signal and host-vector-pathogen geographic distribution).

  19. A Bayesian framework to estimate diversification rates and their variation through time and space

    Directory of Open Access Journals (Sweden)

    Silvestro Daniele

    2011-10-01

    Background: Patterns of species diversity are the result of speciation and extinction processes, and molecular phylogenetic data can provide valuable information to derive their variability through time and across clades. Bayesian Markov chain Monte Carlo methods offer a promising framework to incorporate phylogenetic uncertainty when estimating rates of diversification. Results: We introduce a new approach to estimate diversification rates in a Bayesian framework over a distribution of trees under various constant and variable rate birth-death and pure-birth models, and test it on simulated phylogenies. Furthermore, speciation and extinction rates and their posterior credibility intervals can be estimated while accounting for non-random taxon sampling. The framework is particularly suitable for hypothesis testing using Bayes factors, as we demonstrate by analyzing dated phylogenies of Chondrostoma (Cyprinidae) and Lupinus (Fabaceae). In addition, we develop a model that extends the rate estimation to a meta-analysis framework in which different data sets are combined in a single analysis to detect general temporal and spatial trends in diversification. Conclusions: Our approach provides a flexible framework for the estimation of diversification parameters and hypothesis testing while simultaneously accounting for uncertainties in the divergence times and incomplete taxon sampling.

  20. Phylogenetic reconstruction of endophytic fungal isolates using internal transcribed spacer 2 (ITS2) region.

    Science.gov (United States)

    GokulRaj, Kathamuthu; Sundaresan, Natesan; Ganeshan, Enthai Jagan; Rajapriya, Pandi; Muthumary, Johnpaul; Sridhar, Jayavel; Pandi, Mohan

    2014-01-01

    Endophytic fungi are inhabitants of plants that live most of their life cycle asymptomatically and mainly confer protection and ecological advantages on the host plant. In the present study, 48 endophytic fungi were isolated from the leaves of three medicinal plants and characterized based on ITS2 sequence-secondary structure analysis. ITS2 secondary structures were elucidated with the minimum free energy method (MFOLD version 3.1), and a consensus structure for each genus was generated with 4SALE. ProfDistS was used to generate the ITS2 sequence-structure based phylogenetic tree. The isolates belonged to the Ascomycetes, representing 5 orders and 6 genera. Colletotrichum/Glomerella spp., Diaporthe/Phomopsis spp., and Alternaria spp. were predominantly observed, while Cochliobolus sp., Cladosporium sp., and Emericella sp. were represented by singletons. The constructed phylogenetic tree has well-resolved monophyletic groups with >50% bootstrap support. Secondary structure based fungal systematics improves not only the stability but also the precision of phylogenetic inference. The above ITS2-based phylogenetic analysis was performed for our 48 isolates along with sequences of known ex-types taken from GenBank, which confirms the efficiency of the proposed method. Further, we propose ITS2 as a superlative marker for reconstructing phylogenetic relationships at different taxonomic levels because of its shorter length.

  1. Methods for estimating the semivariogram

    DEFF Research Database (Denmark)

    Lophaven, Søren Nymand; Carstensen, Niels Jacob; Rootzen, Helle

    2002-01-01

    In the existing literature various methods for modelling the semivariogram have been proposed, while only a few studies have compared different approaches. In this paper we compare eight approaches for modelling the semivariogram, i.e. six approaches based on least squares estimation...... maximum likelihood performed better than the least squares approaches. We also applied maximum likelihood and least squares estimation to a real dataset, containing measurements of salinity at 71 sampling stations in the Kattegat basin. This showed that the calculation of spatial predictions...
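
    Least squares approaches to modelling a semivariogram usually start from an empirical semivariogram. The abstract does not say which empirical estimator was used; the sketch below assumes the classical (Matheron) estimator, gamma(h) = sum of squared differences / (2 N(h)), with distance bins of a fixed width. The function name and the toy data are hypothetical.

```python
import math
from collections import defaultdict


def empirical_semivariogram(coords, values, bin_width):
    """Classical (Matheron) estimator, binned by separation distance.

    Returns {bin_index: gamma}, where bin k collects all pairs whose
    distance h satisfies k * bin_width <= h < (k + 1) * bin_width.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    n = len(values)
    for i in range(n):
        for j in range(i + 1, n):
            h = math.dist(coords[i], coords[j])
            b = int(h // bin_width)
            sums[b] += (values[i] - values[j]) ** 2
            counts[b] += 1
    # Half the mean squared difference within each distance bin.
    return {b: sums[b] / (2 * counts[b]) for b in sorted(counts)}


# Toy example: three stations on a line, one unit apart.
gamma = empirical_semivariogram([(0, 0), (1, 0), (2, 0)], [1.0, 2.0, 4.0], 1.0)
```

    A parametric model (for example spherical or exponential) would then be fitted to these binned values by least squares, whereas maximum likelihood works on the raw observations directly.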

  2. The Past Sure is Tense: On Interpreting Phylogenetic Divergence Time Estimates.

    Science.gov (United States)

    Brown, Joseph W; Smith, Stephen A

    2018-03-01

    Divergence time estimation (the calibration of a phylogeny to geological time) is an integral first step in modeling the tempo of biological evolution (traits and lineages). However, despite increasingly sophisticated methods to infer divergence times from molecular genetic sequences, the estimated ages of many nodes across the tree of life contrast significantly and consistently with timeframes conveyed by the fossil record. This is perhaps best exemplified by crown angiosperms, where molecular clock (Triassic) estimates predate the oldest (Early Cretaceous) undisputed angiosperm fossils by tens of millions of years or more. While the incompleteness of the fossil record is a common concern, issues of data limitation and model inadequacy are viable (if underexplored) alternative explanations. In this vein, Beaulieu et al. (2015) convincingly demonstrated how methods of divergence time inference can be misled by both (i) extreme state-dependent molecular substitution rate heterogeneity and (ii) biased sampling of representative major lineages. These results demonstrate the impact of (potentially common) model violations. Here, we suggest another potential challenge: that the configuration of the statistical inference problem (i.e., the parameters, their relationships, and associated priors) alone may preclude the reconstruction of the paleontological timeframe for the crown age of angiosperms. We demonstrate, through sampling from the joint prior (formed by combining the tree (diversification) prior with the calibration densities specified for fossil-calibrated nodes), that with no data present at all, an Early Cretaceous age for crown angiosperms is rejected (i.e., has essentially zero probability). More worrisome, however, is that for the 24 nodes calibrated by fossils, almost all have indistinguishable marginal prior and posterior age distributions when employing routine lognormal fossil calibration priors.
These results indicate that there is inadequate information in

  3. Bayesian Inference Methods for Sparse Channel Estimation

    DEFF Research Database (Denmark)

    Pedersen, Niels Lovmand

    2013-01-01

    This thesis deals with sparse Bayesian learning (SBL) with application to radio channel estimation. As opposed to the classical approach for sparse signal representation, we focus on the problem of inferring complex signals. Our investigations within SBL constitute the basis for the development...... of Bayesian inference algorithms for sparse channel estimation. Sparse inference methods aim at finding the sparse representation of a signal given in some overcomplete dictionary of basis vectors. Within this context, one of our main contributions to the field of SBL is a hierarchical representation...... analysis of the complex prior representation, where we show that the ability to induce sparse estimates of a given prior heavily depends on the inference method used and, interestingly, whether real or complex variables are inferred. We also show that the Bayesian estimators derived from the proposed...

  4. Phylogenetic estimates of diversification rate are affected by molecular rate variation.

    Science.gov (United States)

    Duchêne, D A; Hua, X; Bromham, L

    2017-10-01

    Molecular phylogenies are increasingly being used to investigate the patterns and mechanisms of macroevolution. In particular, node heights in a phylogeny can be used to detect changes in rates of diversification over time. Such analyses rest on the assumption that node heights in a phylogeny represent the timing of diversification events, which in turn rests on the assumption that evolutionary time can be accurately predicted from DNA sequence divergence. But there are many influences on the rate of molecular evolution, which might also influence node heights in molecular phylogenies, and thus affect estimates of diversification rate. In particular, a growing number of studies have revealed an association between the net diversification rate estimated from phylogenies and the rate of molecular evolution. Such an association might, by influencing the relative position of node heights, systematically bias estimates of diversification time. We simulated the evolution of DNA sequences under several scenarios where rates of diversification and molecular evolution vary through time, including models where diversification and molecular evolutionary rates are linked. We show that commonly used methods, including metric-based, likelihood and Bayesian approaches, can have a low power to identify changes in diversification rate when molecular substitution rates vary. Furthermore, the association between the rates of speciation and molecular evolution rate can cause the signature of a slowdown or speedup in speciation rates to be lost or misidentified. These results suggest that the multiple sources of variation in molecular evolutionary rates need to be considered when inferring macroevolutionary processes from phylogenies. © 2017 European Society For Evolutionary Biology. Journal of Evolutionary Biology © 2017 European Society For Evolutionary Biology.

  5. Applying a multiobjective metaheuristic inspired by honey bees to phylogenetic inference.

    Science.gov (United States)

    Santander-Jiménez, Sergio; Vega-Rodríguez, Miguel A

    2013-10-01

    The development of increasingly popular multiobjective metaheuristics has allowed bioinformaticians to deal with optimization problems in computational biology where multiple objective functions must be taken into account. One of the most relevant research topics that can benefit from these techniques is phylogenetic inference. Throughout the years, different researchers have proposed their own view about the reconstruction of ancestral evolutionary relationships among species. As a result, biologists often report different phylogenetic trees from a same dataset when considering distinct optimality principles. In this work, we detail a multiobjective swarm intelligence approach based on the novel Artificial Bee Colony algorithm for inferring phylogenies. The aim of this paper is to propose a complementary view of phylogenetics according to the maximum parsimony and maximum likelihood criteria, in order to generate a set of phylogenetic trees that represent a compromise between these principles. Experimental results on a variety of nucleotide data sets and statistical studies highlight the relevance of the proposal with regard to other multiobjective algorithms and state-of-the-art biological methods. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.

  6. Mosasauroid phylogeny under multiple phylogenetic methods provides new insights on the evolution of aquatic adaptations in the group.

    Directory of Open Access Journals (Sweden)

    Tiago R Simões

    Mosasauroids were a successful lineage of squamate reptiles (lizards and snakes) that radiated during the Late Cretaceous (95-66 million years ago). They can be considered one of the few lineages in the evolutionary history of tetrapods to have acquired a fully aquatic lifestyle, similarly to whales, ichthyosaurs and plesiosaurs. Despite a long history of research on this group, their phylogenetic relationships have only been tested so far using traditional (unweighted) maximum parsimony. However, hypotheses of mosasauroid relationships and the recently proposed multiple origins of aquatically adapted pelvic and pedal features in this group can be more thoroughly tested by methods that take into account variation in branch lengths and evolutionary rates. In this study, we present the first mosasauroid phylogenetic analysis performed under different analytical methods, including maximum likelihood, Bayesian inference, and implied weighting maximum parsimony. The results indicate a lack of congruence in the topological position of halisaurines and Dallasaurus. Additionally, the genus Prognathodon is paraphyletic under all hypotheses. Interestingly, a number of traditional mosasauroid clades become weakly supported, or unresolved, under Bayesian analyses. The reduced resolutions in some consensus trees create ambiguities concerning the evolution of fully aquatic pelvic/pedal conditions under many analyses. However, when enough resolution was obtained, reversals of the pelvic/pedal conditions were favoured by parsimony and likelihood ancestral state reconstructions instead of independent origins of aquatic features in mosasauroids. It is concluded that most of the observed discrepancies among the results can be associated with different analytical procedures, but also due to limited postcranial data on halisaurines, yaguarasaurines and Dallasaurus.

  7. Mosasauroid phylogeny under multiple phylogenetic methods provides new insights on the evolution of aquatic adaptations in the group

    Science.gov (United States)

    Vernygora, Oksana; Paparella, Ilaria; Jimenez-Huidobro, Paulina; Caldwell, Michael W.

    2017-01-01

    Mosasauroids were a successful lineage of squamate reptiles (lizards and snakes) that radiated during the Late Cretaceous (95–66 million years ago). They can be considered one of the few lineages in the evolutionary history of tetrapods to have acquired a fully aquatic lifestyle, similarly to whales, ichthyosaurs and plesiosaurs. Despite a long history of research on this group, their phylogenetic relationships have only been tested so far using traditional (unweighted) maximum parsimony. However, hypotheses of mosasauroid relationships and the recently proposed multiple origins of aquatically adapted pelvic and pedal features in this group can be more thoroughly tested by methods that take into account variation in branch lengths and evolutionary rates. In this study, we present the first mosasauroid phylogenetic analysis performed under different analytical methods, including maximum likelihood, Bayesian inference, and implied weighting maximum parsimony. The results indicate a lack of congruence in the topological position of halisaurines and Dallasaurus. Additionally, the genus Prognathodon is paraphyletic under all hypotheses. Interestingly, a number of traditional mosasauroid clades become weakly supported, or unresolved, under Bayesian analyses. The reduced resolutions in some consensus trees create ambiguities concerning the evolution of fully aquatic pelvic/pedal conditions under many analyses. However, when enough resolution was obtained, reversals of the pelvic/pedal conditions were favoured by parsimony and likelihood ancestral state reconstructions instead of independent origins of aquatic features in mosasauroids. It is concluded that most of the observed discrepancies among the results can be associated with different analytical procedures, but also due to limited postcranial data on halisaurines, yaguarasaurines and Dallasaurus. PMID:28467456

  8. Which came first: The lizard or the egg? Robustness in phylogenetic reconstruction of ancestral states.

    Science.gov (United States)

    Wright, April M; Lyons, Kathleen M; Brandley, Matthew C; Hillis, David M

    2015-09-01

    Changes in parity mode between egg-laying (oviparity) and live-bearing (viviparity) have occurred repeatedly throughout vertebrate evolution. Oviparity is the ancestral amniote state, and viviparity has evolved many times independently within amniotes (especially in lizards and snakes), with possibly a few reversions to oviparity. In amniotes, the shelled egg is considered a complex structure that is unlikely to re-evolve if lost (i.e., it is an example of Dollo's Principle). However, a recent ancestral state reconstruction analysis concluded that viviparity was the ancestral state of squamate reptiles (lizards and snakes), and that oviparity re-evolved from viviparity many times throughout the evolutionary history of squamates. Here, we re-evaluate support for this provocative conclusion by testing the sensitivity of the analysis to model assumptions and estimates of squamate phylogeny. We found that the models and methods used for parity mode reconstruction are highly sensitive to the specific estimate of phylogeny used, and that the point estimate of phylogeny used to suggest that viviparity is the root state of the squamate tree is far from an optimal phylogenetic solution. The ancestral state reconstructions are also highly sensitive to model choice and specific values of model parameters. A method that is designed to account for biases in taxon sampling actually accentuates, rather than lessens, those biases with respect to ancestral state reconstructions. In contrast to recent conclusions from the same data set, we find that ancestral state reconstruction analyses provide highly equivocal support for the number and direction of transitions between oviparity and viviparity in squamates. Moreover, the reconstructions of ancestral parity state are highly dependent on the assumptions of each model. We conclude that the common ancestor of squamates was oviparous, and subsequent evolutionary transitions to viviparity were common, but reversals to oviparity were

  9. treeman: an R package for efficient and intuitive manipulation of phylogenetic trees.

    Science.gov (United States)

    Bennett, Dominic J; Sutton, Mark D; Turvey, Samuel T

    2017-01-07

    Phylogenetic trees are hierarchical structures used for representing the inter-relationships between biological entities. They are the most common tool for representing evolution and are essential to a range of fields across the life sciences. The manipulation of phylogenetic trees, in terms of adding or removing tips, is often performed by researchers not just for reasons of management but also for performing simulations in order to understand the processes of evolution. Despite this, the most common programming language among biologists, R, has few class structures well suited to these tasks. We present an R package that contains a new class, called TreeMan, for representing the phylogenetic tree. This class has a list structure allowing phylogenetic trees to be manipulated more efficiently. Computational running times are reduced because of the ready ability to vectorise and parallelise methods. Development is also improved due to fewer lines of code being required for performing manipulation processes. We present three use cases (pinning missing taxa to a supertree, simulating evolution with a tree-growth model and detecting significant phylogenetic turnover) that demonstrate the new package's speed and simplicity.

  10. Order statistics & inference estimation methods

    CERN Document Server

    Balakrishnan, N

    1991-01-01

    The literature on order statistics and inference is quite extensive and covers a large number of fields, but most of it is dispersed throughout numerous publications. This volume is the consolidation of the most important results and places an emphasis on estimation. Both theoretical and computational procedures are presented to meet the needs of researchers, professionals, and students. The methods of estimation discussed are well-illustrated with numerous practical examples from both the physical and life sciences, including sociology, psychology, and electrical and chemical engineering. A co

  11. Investigation of MLE in nonparametric estimation methods of reliability function

    International Nuclear Information System (INIS)

    Ahn, Kwang Won; Kim, Yoon Ik; Chung, Chang Hyun; Kim, Kil Yoo

    2001-01-01

    There have been many attempts to estimate a reliability function. In the ESReDA 20th seminar, a new nonparametric method was proposed. The major point of that paper is how to use censored data efficiently. Generally, there are three kinds of approach to estimating a reliability function nonparametrically: the Reduced Sample Method, the Actuarial Method and the Product-Limit (PL) Method. These three methods have some limitations, so we suggest an advanced method that reflects censored information more efficiently. In many instances there will be a unique maximum likelihood estimator (MLE) of an unknown parameter, and often it may be obtained by differentiation. It is well known that the three methods generally used to estimate a reliability function nonparametrically have maximum likelihood estimators that exist uniquely. So, the MLE of the new method is derived in this study. The procedure for calculating the MLE is similar to that of the PL estimator. The difference between the two is that in the new method the mass (or weight) of each is influenced by the others, whereas in the PL estimator it is not.
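
    For orientation, the standard Product-Limit (Kaplan-Meier) estimator that the abstract uses as its point of comparison can be sketched as below. This is the textbook PL estimator only, not the new method the abstract proposes, whose details are not given here; the function name and the toy data are hypothetical.

```python
def product_limit(times, observed):
    """Product-Limit (Kaplan-Meier) estimate of the reliability function.

    times:    failure or censoring time of each unit
    observed: True if the unit failed at that time, False if censored
    Returns a list of (t, R(t)) pairs at each distinct failure time.
    """
    data = sorted(zip(times, observed))
    at_risk = len(data)
    reliability = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        failures = 0
        removed = 0
        # Group all units that share the same time t.
        while i < len(data) and data[i][0] == t:
            failures += 1 if data[i][1] else 0
            removed += 1
            i += 1
        if failures:
            # Multiply by the conditional probability of surviving past t.
            reliability *= 1 - failures / at_risk
            curve.append((t, reliability))
        # Failed and censored units both leave the risk set after t.
        at_risk -= removed
    return curve


# Toy data: failures at t = 1, 3, 4 and one unit censored at t = 2.
curve = product_limit([1, 2, 3, 4], [True, False, True, True])
```

    Censored units contribute only by shrinking the risk set, which is the sense in which the PL estimator uses censored information.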

  12. Detection of Horizontal Gene Transfers from Phylogenetic Comparisons

    Science.gov (United States)

    Pylro, Victor Satler; Vespoli, Luciano de Souza; Duarte, Gabriela Frois; Yotoko, Karla Suemy Clemente

    2012-01-01

    Bacterial phylogenies have become one of the most important challenges for microbial ecology. This field started in the mid-1970s with the aim of using the sequence of the small subunit ribosomal RNA (16S) tool to infer bacterial phylogenies. Phylogenetic hypotheses based on other sequences usually give conflicting topologies that reveal different evolutionary histories, which in some cases may be the result of horizontal gene transfer events. Currently, one of the major goals of molecular biology is to understand the role that horizontal gene transfer plays in species adaptation and evolution. In this work, we compared the phylogenetic tree based on 16S with the tree based on dszC, a gene involved in the cleavage of carbon-sulfur bonds. Bacteria of several genera perform this survival task when living in environments lacking free mineral sulfur. The biochemical pathway of the desulphurization process was extensively studied due to its economic importance, since this step is expensive and indispensable in fuel production. Our results clearly show that horizontal gene transfer events could be detected using common phylogenetic methods with gene sequences obtained from public sequence databases. PMID:22675653

  13. Methods to estimate the genetic risk

    International Nuclear Information System (INIS)

    Ehling, U.H.

    1989-01-01

    The estimation of the radiation-induced genetic risk to human populations is based on the extrapolation of results from animal experiments. Radiation-induced mutations are stochastic events. The probability of the event depends on the dose; the degree of the damage does not. There are two main approaches to making genetic risk estimates. One of these, termed the direct method, expresses risk in terms of expected frequencies of genetic changes induced per unit dose. The other, referred to as the doubling dose method or the indirect method, expresses risk in relation to the observed incidence of genetic disorders now present in man. The advantage of the indirect method is that not only can Mendelian mutations be quantified, but also other types of genetic disorders. The disadvantages of the method are the uncertainties in determining the current incidence of genetic disorders in humans and, in addition, the estimation of the genetic component of congenital anomalies, anomalies expressed later and constitutional and degenerative diseases. Using the direct method we estimated that 20-50 dominant radiation-induced mutations would be expected in 19 000 offspring born to parents exposed in Hiroshima and Nagasaki, but only a small proportion of these mutants would have been detected with the techniques used for the population study. These methods were used to predict the genetic damage from the fallout of the reactor accident at Chernobyl in the vicinity of Southern Germany. The lack of knowledge of the interaction of chemicals with ionizing radiation and the discrepancy between the high safety standards for radiation protection and the low level of knowledge for the toxicological evaluation of chemical mutagens will be emphasized. (author)

  14. A method of estimating log weights.

    Science.gov (United States)

    Charles N. Mann; Hilton H. Lysons

    1972-01-01

    This paper presents a practical method of estimating the weights of logs before they are yarded. Knowledge of log weights is required to achieve optimum loading of modern yarding equipment. Truckloads of logs are weighed and measured to obtain a local density index (pounds per cubic foot) for a species of logs. The density index is then used to estimate the weights of...
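
    The density-index idea described above reduces to simple arithmetic. The sketch below assumes the index is the pooled ratio of total truckload weight to total truckload volume (the abstract does not say how loads are combined) and that log volumes in cubic feet are already known; the function names and all numbers are hypothetical.

```python
def density_index(truckload_weights_lb, truckload_volumes_ft3):
    """Local density index in pounds per cubic foot, pooled over truckloads."""
    return sum(truckload_weights_lb) / sum(truckload_volumes_ft3)


def estimate_log_weight(volume_ft3, index_lb_per_ft3):
    """Estimated weight of one log (pounds) from its volume and the index."""
    return volume_ft3 * index_lb_per_ft3


# Illustrative numbers only: two weighed and measured truckloads.
index = density_index([50000, 48000], [1000, 960])
weight = estimate_log_weight(40, index)
```

    The index is specific to a species and locality, so a separate index would be needed for each species of logs being yarded.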

  15. Conformation of phylogenetic relationship of Penaeidae shrimp based on morphometric and molecular investigations.

    Science.gov (United States)

    Rajakumaran, P; Vaseeharan, B; Jayakumar, R; Chidambara, R

    2014-01-01

    An accurate understanding of the phylogenetic relationships among Penaeidae shrimp is important for academia and the fisheries industry. Morphometric and randomly amplified polymorphic DNA (RAPD) analyses were used to infer the phylogenetic relationships among 13 Penaeidae shrimp. For the morphometric analysis, forty variables and the total length of the shrimp were measured for each species, and the effect of size variation was removed. The size-normalized values obtained were subjected to UPGMA (Unweighted Pair-Group Method with Arithmetic Mean) cluster analysis. For the RAPD analysis, four primers showed reliable differentiation between species, and the correlation coefficient between the DNA banding patterns of the 13 Penaeidae species was used to construct a UPGMA dendrogram. The phylogenetic relationships from the morphometric and molecular analyses of the Penaeidae species were found to be congruent. We conclude that, as the results from the morphometric investigations concur with the molecular ones, the phylogenetic relationships obtained for the studied Penaeidae can be considered reliable.
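
    UPGMA, the clustering step named above, repeatedly merges the two closest clusters and recomputes distances to the new cluster as a size-weighted average. The sketch below is a minimal pure-Python version that assumes a symmetric distance matrix is already available (in the study it would be derived from size-normalized morphometric values or from RAPD banding-pattern correlations); the function name and the toy matrix are hypothetical.

```python
def upgma(labels, matrix):
    """Minimal UPGMA on a symmetric distance matrix.

    Returns (tree, merges): tree is a nested tuple of labels, and merges
    lists (left, right, height) for each join, with height = distance / 2.
    """
    # Cluster id -> (subtree, number of leaves).
    clusters = {i: (label, 1) for i, label in enumerate(labels)}
    # Pairwise distances keyed by (smaller id, larger id).
    dist = {(i, j): matrix[i][j]
            for i in range(len(labels)) for j in range(i + 1, len(labels))}
    next_id = len(labels)
    merges = []
    while len(clusters) > 1:
        # Pick the closest pair of clusters.
        (a, b), d_ab = min(dist.items(), key=lambda kv: kv[1])
        tree_a, size_a = clusters.pop(a)
        tree_b, size_b = clusters.pop(b)
        # Distance from the merged cluster to every remaining cluster is
        # the average over leaves, i.e. weighted by cluster size.
        new_dist = {}
        for c in clusters:
            d_ac = dist[(min(a, c), max(a, c))]
            d_bc = dist[(min(b, c), max(b, c))]
            new_dist[(c, next_id)] = (size_a * d_ac + size_b * d_bc) / (size_a + size_b)
        dist = {k: v for k, v in dist.items() if a not in k and b not in k}
        dist.update(new_dist)
        clusters[next_id] = ((tree_a, tree_b), size_a + size_b)
        merges.append((tree_a, tree_b, d_ab / 2))
        next_id += 1
    (tree, _), = clusters.values()
    return tree, merges


# Toy example: A and B are close to each other, C is distant from both.
tree, merges = upgma(["A", "B", "C"], [[0, 2, 6], [2, 0, 6], [6, 6, 0]])
```

    Running the same routine on a morphometric matrix and on a molecular matrix and comparing the two trees is one way to check the kind of congruence the abstract reports.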

  16. A Fast LMMSE Channel Estimation Method for OFDM Systems

    Directory of Open Access Journals (Sweden)

    Zhou Wen

    2009-01-01

    A fast linear minimum mean square error (LMMSE) channel estimation method has been proposed for Orthogonal Frequency Division Multiplexing (OFDM) systems. In comparison with the conventional LMMSE channel estimation, the proposed channel estimation method does not require statistical knowledge of the channel in advance and avoids the inversion of a large matrix by using the fast Fourier transform (FFT) operation. Therefore, the computational complexity can be reduced significantly. The normalized mean square errors (NMSEs) of the proposed method and the conventional LMMSE estimation have been derived. Numerical results show that the NMSE of the proposed method is very close to that of the conventional LMMSE method, which is also verified by computer simulation. In addition, computer simulation shows that the performance of the proposed method is almost the same as that of the conventional LMMSE method in terms of bit error rate (BER).

  17. A Computationally Efficient Method for Polyphonic Pitch Estimation

    Directory of Open Access Journals (Sweden)

    Ruohua Zhou

    2009-01-01

    This paper presents a computationally efficient method for polyphonic pitch estimation. The method employs the Fast Resonator Time-Frequency Image (RTFI) as the basic time-frequency analysis tool. The approach is composed of two main stages. First, a preliminary pitch estimation is obtained by means of a simple peak-picking procedure in the pitch energy spectrum. This spectrum is calculated from the original RTFI energy spectrum according to harmonic grouping principles. Then the incorrect estimations are removed according to spectral irregularity and knowledge of the harmonic structures of the music notes played on commonly used music instruments. The new approach is compared with a variety of other frame-based polyphonic pitch estimation methods, and results demonstrate the high performance and computational efficiency of the approach.

  18. Rearrangement moves on rooted phylogenetic networks.

    Science.gov (United States)

    Gambette, Philippe; van Iersel, Leo; Jones, Mark; Lafond, Manuel; Pardi, Fabio; Scornavacca, Celine

    2017-08-01

    Phylogenetic tree reconstruction is usually done by local search heuristics that explore the space of the possible tree topologies via simple rearrangements of their structure. Tree rearrangement heuristics have been used in combination with practically all optimization criteria in use, from maximum likelihood and parsimony to distance-based principles, and in a Bayesian context. Their basic components are rearrangement moves that specify all possible ways of generating alternative phylogenies from a given one, and whose fundamental property is to be able to transform, by repeated application, any phylogeny into any other phylogeny. Despite their long tradition in tree-based phylogenetics, very little research has gone into studying similar rearrangement operations for phylogenetic networks, that is, phylogenies explicitly representing scenarios that include reticulate events such as hybridization, horizontal gene transfer, population admixture, and recombination. To fill this gap, we propose "horizontal" moves that ensure that every network of a certain complexity can be reached from any other network of the same complexity, and "vertical" moves that ensure reachability between networks of different complexities. When applied to phylogenetic trees, our horizontal moves, named rNNI and rSPR, reduce to the best-known moves on rooted phylogenetic trees, nearest-neighbor interchange and rooted subtree pruning and regrafting. Besides a number of reachability results, which separate the contributions of horizontal and vertical moves, we prove that rNNI moves are local versions of rSPR moves, and provide bounds on the sizes of the rNNI neighborhoods. The paper focuses on the most biologically meaningful versions of phylogenetic networks, where edges are oriented and reticulation events clearly identified. Moreover, our rearrangement moves are robust to the fact that networks with higher complexity usually allow a better fit with the data. Our goal is to provide a solid basis for

  19. Rearrangement moves on rooted phylogenetic networks.

    Directory of Open Access Journals (Sweden)

    Philippe Gambette

    2017-08-01

    Phylogenetic tree reconstruction is usually done by local search heuristics that explore the space of the possible tree topologies via simple rearrangements of their structure. Tree rearrangement heuristics have been used in combination with practically all optimization criteria in use, from maximum likelihood and parsimony to distance-based principles, and in a Bayesian context. Their basic components are rearrangement moves that specify all possible ways of generating alternative phylogenies from a given one, and whose fundamental property is to be able to transform, by repeated application, any phylogeny into any other phylogeny. Despite their long tradition in tree-based phylogenetics, very little research has gone into studying similar rearrangement operations for phylogenetic networks, that is, phylogenies explicitly representing scenarios that include reticulate events such as hybridization, horizontal gene transfer, population admixture, and recombination. To fill this gap, we propose "horizontal" moves that ensure that every network of a certain complexity can be reached from any other network of the same complexity, and "vertical" moves that ensure reachability between networks of different complexities. When applied to phylogenetic trees, our horizontal moves, named rNNI and rSPR, reduce to the best-known moves on rooted phylogenetic trees, nearest-neighbor interchange and rooted subtree pruning and regrafting. Besides a number of reachability results, which separate the contributions of horizontal and vertical moves, we prove that rNNI moves are local versions of rSPR moves, and provide bounds on the sizes of the rNNI neighborhoods. The paper focuses on the most biologically meaningful versions of phylogenetic networks, where edges are oriented and reticulation events clearly identified. Moreover, our rearrangement moves are robust to the fact that networks with higher complexity usually allow a better fit with the data. Our goal is to provide

  20. Phylogenetic tests of distribution patterns in South Asia: towards

    Indian Academy of Sciences (India)

    The last four decades have seen an increasing integration of phylogenetics and biogeography. However, a dearth of phylogenetic studies has precluded such biogeographic analyses in South Asia until recently. Noting the increase in phylogenetic research and interest in phylogenetic biogeography in the region, we ...

  1. Fourier transform inequalities for phylogenetic trees.

    Science.gov (United States)

    Matsen, Frederick A

    2009-01-01

    Phylogenetic invariants are not the only constraints on site-pattern frequency vectors for phylogenetic trees. A mutation matrix, by its definition, is the exponential of a matrix with non-negative off-diagonal entries; this positivity requirement implies non-trivial constraints on the site-pattern frequency vectors. We call these additional constraints "edge-parameter inequalities". In this paper, we first motivate the edge-parameter inequalities by considering a pathological site-pattern frequency vector corresponding to a quartet tree with a negative internal edge. This site-pattern frequency vector nevertheless satisfies all of the constraints described up to now in the literature. We next describe two complete sets of edge-parameter inequalities for the group-based models; these constraints are square-free monomial inequalities in the Fourier transformed coordinates. These inequalities, along with the phylogenetic invariants, form a complete description of the set of site-pattern frequency vectors corresponding to bona fide trees. Said in mathematical language, this paper explicitly presents two finite lists of inequalities in Fourier coordinates of the form "monomial ≤ 1", each list characterizing the phylogenetically relevant semialgebraic subsets of the phylogenetic varieties.
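
    As a hedged illustration of why a positivity requirement on edges turns into a bound of the form "monomial ≤ 1", consider the simplest group-based setting: a single edge under the two-state symmetric model. This is a worked sketch and not taken from the paper; the unit-rate matrix Q, the edge length t and the symbol θ are assumptions introduced here.

```latex
% Illustrative assumption: two-state symmetric model, unit rate, one edge of length t.
Q = \begin{pmatrix} -1 & 1 \\ 1 & -1 \end{pmatrix},
\qquad
M(t) = e^{tQ}
     = \frac{1}{2}
       \begin{pmatrix}
         1 + e^{-2t} & 1 - e^{-2t} \\
         1 - e^{-2t} & 1 + e^{-2t}
       \end{pmatrix}.

% The nontrivial eigenvalue of M(t) plays the role of the Fourier coordinate of the edge:
\theta = e^{-2t},
\qquad
t \ge 0 \iff 0 < \theta \le 1 .
```

    In this sketch a negative edge length would give θ > 1 and negative off-diagonal entries in M(t), which is the single-edge analogue of the pathological quartet described in the abstract.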

  2. A Comparative Study of Distribution System Parameter Estimation Methods

    Energy Technology Data Exchange (ETDEWEB)

    Sun, Yannan; Williams, Tess L.; Gourisetti, Sri Nikhil Gup

    2016-07-17

    In this paper, we compare two parameter estimation methods for distribution systems: residual sensitivity analysis and state-vector augmentation with a Kalman filter. These two methods were originally proposed for transmission systems, and are still the most commonly used methods for parameter estimation. Distribution systems have much lower measurement redundancy than transmission systems. Therefore, estimating parameters is much more difficult. To increase the robustness of parameter estimation, the two methods are applied with combined measurement snapshots (measurement sets taken at different points in time), so that the redundancy for computing the parameter values is increased. The advantages and disadvantages of both methods are discussed. The results of this paper show that state-vector augmentation is a better approach for parameter estimation in distribution systems. Simulation studies are done on a modified version of IEEE 13-Node Test Feeder with varying levels of measurement noise and non-zero error in the other system model parameters.

  3. Evaluation of three paediatric weight estimation methods in Singapore.

    Science.gov (United States)

    Loo, Pei Ying; Chong, Shu-Ling; Lek, Ngee; Bautista, Dianne; Ng, Kee Chong

    2013-04-01

    Rapid paediatric weight estimation methods in the emergency setting have not been evaluated for South East Asian children. This study aims to assess the accuracy and precision of three such methods in Singapore children: Broselow-Luten (BL) tape, Advanced Paediatric Life Support (APLS) (estimated weight (kg) = 2 (age + 4)) and Luscombe (estimated weight (kg) = 3 (age) + 7) formulae. We recruited 875 patients aged 1-10 years in a Paediatric Emergency Department in Singapore over a 2-month period. For each patient, true weight and height were determined. True height was cross-referenced to the BL tape markings and used to derive estimated weight (virtual BL tape method), while patient's round-down age (in years) was used to derive estimated weights using APLS and Luscombe formulae, respectively. The percentage difference between the true and estimated weights was calculated. For each method, the bias and extent of agreement were quantified using Bland-Altman method (mean percentage difference (MPD) and 95% limits of agreement (LOA)). The proportion of weight estimates within 10% of true weight (p₁₀) was determined. The BL tape method marginally underestimated weights (MPD +0.6%; 95% LOA -26.8% to +28.1%; p₁₀ 58.9%). The APLS formula underestimated weights (MPD +7.6%; 95% LOA -26.5% to +41.7%; p₁₀ 45.7%). The Luscombe formula overestimated weights (MPD -7.4%; 95% LOA -51.0% to +36.2%; p₁₀ 37.7%). Of the three methods we evaluated, the BL tape method provided the most accurate and precise weight estimation for Singapore children. The APLS and Luscombe formulae underestimated and overestimated the children's weights, respectively, and were considerably less precise. © 2013 The Authors. Journal of Paediatrics and Child Health © 2013 Paediatrics and Child Health Division (Royal Australasian College of Physicians).
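
    The two age-based formulae and the agreement statistics named in the abstract can be sketched in a few lines of Python. This is a minimal sketch, not the authors' analysis code: the function names are hypothetical, and the sign convention and denominator of the percentage difference ((true - estimated) / true, so that a positive value means underestimation) are assumptions chosen to match the signs reported above.

```python
from statistics import mean, stdev

def apls_weight(age_years: int) -> float:
    """APLS formula from the abstract: weight (kg) = 2 * (age + 4)."""
    return 2 * (age_years + 4)

def luscombe_weight(age_years: int) -> float:
    """Luscombe formula from the abstract: weight (kg) = 3 * age + 7."""
    return 3 * age_years + 7

def percent_difference(true_kg: float, estimated_kg: float) -> float:
    """Assumed convention: positive means the method underestimated."""
    return 100.0 * (true_kg - estimated_kg) / true_kg

def bland_altman(differences):
    """Mean difference and 95% limits of agreement (mean +/- 1.96 SD)."""
    m = mean(differences)
    s = stdev(differences)  # sample standard deviation
    return m, m - 1.96 * s, m + 1.96 * s

def proportion_within(differences, tolerance: float = 10.0) -> float:
    """Share of estimates whose absolute percentage difference is <= tolerance."""
    return sum(abs(d) <= tolerance for d in differences) / len(differences)

# For a child aged 6 (rounded down): APLS gives 20 kg, Luscombe gives 25 kg.
apls_six = apls_weight(6)
luscombe_six = luscombe_weight(6)
```

    Given a list of (age, true weight) pairs, mapping `percent_difference` over the estimates and passing the result to `bland_altman` and `proportion_within` would yield the MPD, the 95% LOA and p₁₀ for each formula under the stated assumptions.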

  4. Phylogenetic Signal in AFLP Data Sets

    NARCIS (Netherlands)

    Koopman, W.J.M.

    2005-01-01

    AFLP markers provide a potential source of phylogenetic information for molecular systematic studies. However, there are properties of restriction fragment data that limit phylogenetic interpretation of AFLPs. These are (a) possible nonindependence of fragments, (b) problems of homology assignment

  5. Including RNA secondary structures improves accuracy and robustness in reconstruction of phylogenetic trees.

    Science.gov (United States)

    Keller, Alexander; Förster, Frank; Müller, Tobias; Dandekar, Thomas; Schultz, Jörg; Wolf, Matthias

    2010-01-15

    In several studies, secondary structures of ribosomal genes have been used to improve the quality of phylogenetic reconstructions. An extensive evaluation of the benefits of secondary structure, however, is lacking. This is the first study to counter this deficiency. We inspected the accuracy and robustness of phylogenetics with individual secondary structures by simulation experiments for artificial tree topologies with up to 18 taxa and for divergence levels in the range of typical phylogenetic studies. We chose the internal transcribed spacer 2 of the ribosomal cistron as an exemplary marker region. Simulation integrated the coevolution process of sequences with secondary structures. Additionally, the phylogenetic power of marker size duplication was investigated and compared with sequence and sequence-structure reconstruction methods. The results clearly show that accuracy and robustness of Neighbor Joining trees are largely improved by structural information in contrast to sequence-only data, whereas a doubled marker size only accounts for robustness. Individual secondary structures of ribosomal RNA sequences provide a valuable gain of information content that is useful for phylogenetics. Thus, the usage of ITS2 sequence together with secondary structure for taxonomic inferences is recommended. Other reconstruction methods such as maximum likelihood, Bayesian inference or maximum parsimony may equally profit from secondary structure inclusion. This article was reviewed by Shamil Sunyaev, Andrea Tanzer (nominated by Frank Eisenhaber) and Eugene V. Koonin. For the full reviews, please go to the Reviewers' comments section.

  6. Joint Pitch and DOA Estimation Using the ESPRIT method

    DEFF Research Database (Denmark)

    Wu, Yuntao; Amir, Leshem; Jensen, Jesper Rindom

    2015-01-01

    In this paper, the problem of joint multi-pitch and direction-of-arrival (DOA) estimation for multi-channel harmonic sinusoidal signals is considered. A spatio-temporal matrix signal model for a uniform linear array is defined, and then the ESPRIT method based on subspace techniques that exploits the invariance property in the time domain is first used to estimate the multi-pitch frequencies of multiple harmonic signals. Using the estimated pitch frequencies, DOA estimates based on the ESPRIT method are then obtained by using the shift invariance structure in the spatial domain. Compared to the existing state-of-the-art algorithms, the proposed method based on ESPRIT without 2-D searching is computationally more efficient but performs similarly. An asymptotic performance analysis of the DOA and pitch estimation of the proposed method is also presented. Finally, the effectiveness of the proposed...

  7. Disturbance by an endemic rodent in an arid shrubland is a habitat filter: effects on plant invasion and taxonomical, functional and phylogenetic community structure.

    Science.gov (United States)

    Escobedo, Víctor M; Rios, Rodrigo S; Salgado-Luarte, Cristian; Stotz, Gisela C; Gianoli, Ernesto

    2017-03-01

    Disturbance often drives plant invasion and may modify community assembly. However, little is known about how these modifications of community patterns occur in terms of taxonomic, functional and phylogenetic structure. This study evaluated in an arid shrubland the influence of disturbance by an endemic rodent on community functional divergence and phylogenetic structure as well as on plant invasion. It was expected that disturbance would operate as a habitat filter favouring exotic species with short life cycles. Sixteen plots were sampled along a disturbance gradient caused by the endemic fossorial rodent Spalacopus cyanus, measuring community parameters and estimating functional divergence for life history traits (functional dispersion index) and the relative contribution to functional divergence of exotic and native species. The phylogenetic signal (Pagel's lambda) and phylogenetic community structure (mean phylogenetic distance and mean nearest taxon phylogenetic distance) were also estimated. The use of a continuous approach to the disturbance gradient allowed the identification of non-linear relationships between disturbance and community parameters. The relationship between disturbance and both species richness and abundance was positive for exotic species and negative for native species. Disturbance modified community composition, and exotic species were associated with more disturbed sites. Disturbance increased trait convergence, which resulted in phylogenetic clustering because traits showed a significant phylogenetic signal. The relative contribution of exotic species to functional divergence increased, while that of natives decreased, with disturbance. Exotic and native species were not phylogenetically distinct. Disturbance by rodents in this arid shrubland constitutes a habitat filter over phylogeny-dependent life history traits, leading to phylogenetic clustering, and drives invasion by favouring species with short life cycles. Results can be

  8. Phylogenetic analysis of rabbit haemorrhagic disease virus (RHDV) strains isolated in Poland.

    Science.gov (United States)

    Fitzner, Andrzej; Niedbalski, Wieslaw

    2017-10-01

    The aim of this study was to characterise the nucleotide and amino acid sequence of complete genomes (7.5 kb) from RHDV strains isolated in Poland and estimate the genetic variability in different elements of the viral RNA. In addition, the sequence of Polish RHDV isolates collected from 1988 to 2015 was compared with the sequences of other European RHDV, including the RHDVa and RHDV2/RHDVb subtypes. The complete sequence was developed by the compilation of partial nucleotide sequences. This sequence consisted of approximately 7428 nucleotides. For comparison of nucleotide sequences and the development of phylogenetic trees of Polish RHDV isolates and reference RHDV strains representing the main phylogenetic groups of classical RHDV, RHDVa and RHDV2 as well as the non-pathogenic rabbit lagovirus RCV, the BLAST software with blastn and MEGA6 with the neighbour-joining method were applied. The complete nucleotide sequence of Polish isolates of RHDV has also been entered into GenBank. For comparative analysis, nineteen complete sequences representing the main RHDV genetic types available in GenBank were used. The results of phylogenetic analysis of Polish RHDV strains reveal the presence of three classical RHDV genogroups (G2, G4 and G5) and an RHDVa variant (G6). The oldest RHDV isolates (KGM 1988, PD 1989 and MAL 1994) belong to genogroup G2. It can be assumed that the elimination of these strains from the environment probably occurred at the turn of 1994 and 1995. Genogroup G2 was replaced by the phylogenetically younger BLA 1994 and OPO 2004 strains from genogroup G4, which probably originated from the G3 lineage, represented by the Italian strain BS89. The last representatives of classical RHDV in Poland are isolates GSK 1988 and ZD0 2000 from genogroup G5. A single clade contains the Polish RHDV strains from 2004-2015 (GRZ 2004, KRY 2004, L145 2004, W147 2005, SKO 2013, GLE 2013, RED1 2013, STR 2012, STR2 2013, STR 2014, BIE 2015) identified as RHDVa, which clustered

  9. The Development of Three Long Universal Nuclear Protein-Coding Locus Markers and Their Application to Osteichthyan Phylogenetics with Nested PCR

    Science.gov (United States)

    Zhang, Peng

    2012-01-01

    Background: Universal nuclear protein-coding locus (NPCL) markers that are applicable across diverse taxa and show good phylogenetic discrimination have broad applications in molecular phylogenetic studies. For example, RAG1, a representative NPCL marker, has been successfully used to make phylogenetic inferences within all major osteichthyan groups. However, such markers with broad working range and high phylogenetic performance are still scarce. It is necessary to develop more universal NPCL markers comparable to RAG1 for osteichthyan phylogenetics. Methodology/Principal Findings: We developed three long universal NPCL markers (>1.6 kb each) based on single-copy nuclear genes (KIAA1239, SACS and TTN) that possess large exons and exhibit the appropriate evolutionary rates. We then compared their phylogenetic utilities with that of the reference marker RAG1 in 47 jawed vertebrate species. In comparison with RAG1, each of the three long universal markers yielded similar topologies and branch supports, all in congruence with the currently accepted osteichthyan phylogeny. To compare their phylogenetic performance visually, we also estimated the phylogenetic informativeness (PI) profile for each of the four long universal NPCL markers. The PI curves indicated that SACS performed best over the whole timescale, while RAG1, KIAA1239 and TTN exhibited similar phylogenetic performances. In addition, we compared the success of nested PCR and standard PCR when amplifying NPCL marker fragments. The amplification success rate and efficiency of the nested PCR were overwhelmingly higher than those of standard PCR. Conclusions/Significance: Our work clearly demonstrates the superiority of nested PCR over the conventional PCR in phylogenetic studies and develops three long universal NPCL markers (KIAA1239, SACS and TTN) with the nested PCR strategy. The three markers exhibit high phylogenetic utilities in osteichthyan phylogenetics and can be widely used as pilot genes for

  10. Mitogenomic phylogenetic analyses of the Delphinidae with an emphasis on the Globicephalinae

    Directory of Open Access Journals (Sweden)

    de Stephanis Renaud

    2011-03-01

    Background: Previous DNA-based phylogenetic studies of the Delphinidae family suggest it has undergone rapid diversification, as characterised by unresolved and poorly supported taxonomic relationships (polytomies) for some of the species within this group. Using an increased amount of sequence data we test between alternative hypotheses of soft polytomies caused by rapid speciation, slow evolutionary rate and/or insufficient sequence data, and hard polytomies caused by simultaneous speciation within this family. Combining the mitogenome sequences of five new and 12 previously published species within the Delphinidae, we used Bayesian and maximum-likelihood methods to estimate the phylogeny from partitioned and unpartitioned mitogenome sequences. Further ad hoc tests were then conducted to estimate the support for alternative topologies. Results: We found high support for all the relationships within our reconstructed phylogenies, and topologies were consistent between the Bayesian and maximum-likelihood trees inferred from partitioned and unpartitioned data. Resolved relationships included the placement of the killer whale (Orcinus orca) as sister taxon to the rest of the Globicephalinae subfamily, placement of the Risso's dolphin (Grampus griseus) within the Globicephalinae subfamily, removal of the white-beaked dolphin (Lagenorhynchus albirostris) from the Delphininae subfamily and the placement of the rough-toothed dolphin (Steno bredanensis) as sister taxon to the rest of the Delphininae subfamily rather than within the Globicephalinae subfamily. The additional testing of alternative topologies allowed us to reject all other putative relationships, with the exception that we were unable to reject the hypothesis that the relationship between L. albirostris and the Globicephalinae and Delphininae subfamilies was polytomic. Conclusion: Despite their rapid diversification, the increased sequence data yielded by mitogenomes enables the

  11. A Channelization-Based DOA Estimation Method for Wideband Signals

    Directory of Open Access Journals (Sweden)

    Rui Guo

    2016-07-01

    In this paper, we propose a novel direction of arrival (DOA) estimation method for wideband signals with sensor arrays. The proposed method splits the wideband array output into multiple frequency sub-channels and estimates the signal parameters using a digital channelization receiver. Based on the output sub-channels, a channelization-based incoherent signal subspace method (Channelization-ISM) and a channelization-based test of orthogonality of projected subspaces method (Channelization-TOPS) are proposed. Channelization-ISM applies narrowband signal subspace methods on each sub-channel independently. Then the arithmetic mean or geometric mean of the estimated DOAs from each sub-channel gives the final result. Channelization-TOPS measures the orthogonality between the signal and the noise subspaces of the output sub-channels to estimate DOAs. The proposed channelization-based method isolates signals in different bandwidths reasonably and improves the output SNR. It outperforms the conventional ISM and TOPS methods in estimation accuracy and dynamic range, especially in real environments. In addition, the parallel processing architecture makes it easy to implement in hardware. A wideband digital array radar (DAR) using direct wideband radio frequency (RF) digitization is presented. Experiments carried out in a microwave anechoic chamber with the wideband DAR are presented to demonstrate the performance. The results verify the effectiveness of the proposed method.
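
    The final combining step described for Channelization-ISM, taking the arithmetic or geometric mean of the per-sub-channel DOA estimates, can be sketched as follows. This is a minimal sketch of the averaging step only, not the authors' implementation: the function name `combine_doas` and the assumption that each sub-channel has already produced one DOA estimate in degrees for the same source are hypothetical.

```python
import math

def combine_doas(doas_deg, method="arithmetic"):
    """Combine per-sub-channel DOA estimates (degrees) for one source.

    Hypothetical helper: assumes a narrowband subspace estimator has
    already been run independently on every sub-channel.
    """
    if not doas_deg:
        raise ValueError("need at least one sub-channel estimate")
    if method == "arithmetic":
        return sum(doas_deg) / len(doas_deg)
    if method == "geometric":
        # The geometric mean is only defined for strictly positive values.
        if any(d <= 0 for d in doas_deg):
            raise ValueError("geometric mean requires positive angles")
        return math.exp(sum(math.log(d) for d in doas_deg) / len(doas_deg))
    raise ValueError(f"unknown method: {method}")

# Three sub-channels reporting slightly different angles for one source.
estimate = combine_doas([19.6, 20.1, 20.3])
```

    The geometric mean is less sensitive to a single large outlying estimate but cannot handle zero or negative angles, which is why the sketch rejects them rather than guessing a convention.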

  12. Long-branch attraction bias and inconsistency in Bayesian phylogenetics.

    Directory of Open Access Journals (Sweden)

    Bryan Kolaczkowski

    Bayesian inference (BI) of phylogenetic relationships uses the same probabilistic models of evolution as its precursor maximum likelihood (ML), so BI has generally been assumed to share ML's desirable statistical properties, such as largely unbiased inference of topology given an accurate model and increasingly reliable inferences as the amount of data increases. Here we show that BI, unlike ML, is biased in favor of topologies that group long branches together, even when the true model and prior distributions of evolutionary parameters over a group of phylogenies are known. Using experimental simulation studies and numerical and mathematical analyses, we show that this bias becomes more severe as more data are analyzed, causing BI to infer an incorrect tree as the maximum a posteriori phylogeny with asymptotically high support as sequence length approaches infinity. BI's long branch attraction bias is relatively weak when the true model is simple but becomes pronounced when sequence sites evolve heterogeneously, even when this complexity is incorporated in the model. This bias, which is apparent under both controlled simulation conditions and in analyses of empirical sequence data, also makes BI less efficient and less robust to the use of an incorrect evolutionary model than ML. Surprisingly, BI's bias is caused by one of the method's stated advantages: that it incorporates uncertainty about branch lengths by integrating over a distribution of possible values instead of estimating them from the data, as ML does. Our findings suggest that trees inferred using BI should be interpreted with caution and that ML may be a more reliable framework for modern phylogenetic analysis.

  13. A next-generation sequencing method for overcoming the multiple gene copy problem in polyploid phylogenetics, applied to Poa grasses

    Directory of Open Access Journals (Sweden)

    Robin Charles

    2011-03-01

    Background: Polyploidy is important from a phylogenetic perspective because of its immense past impact on evolution and its potential future impact on diversification, survival and adaptation, especially in plants. Molecular population genetics studies of polyploid organisms have been difficult because of problems in sequencing multiple-copy nuclear genes using Sanger sequencing. This paper describes a method for sequencing a barcoded mixture of targeted gene regions using next-generation sequencing methods to overcome these problems. Results: Using 64 3-bp barcodes, we successfully sequenced three chloroplast and two nuclear gene regions (each of which contained two gene copies with up to two alleles per individual) in a total of 60 individuals across 11 species of Australian Poa grasses. This method had high replicability, a low sequencing error rate (after appropriate quality control) and a low rate of missing data. Eighty-eight percent of the 320 gene/individual combinations produced sequence reads, and >80% of individuals produced sufficient reads to detect all four possible nuclear alleles of the homeologous nuclear loci with 95% probability. We applied this method to a group of sympatric Australian alpine Poa species, which we discovered to share an allopolyploid ancestor with a group of American Poa species. All markers revealed extensive allele sharing among the Australian species and so we recommend that the current taxonomy be re-examined. We also detected hypermutation in the trnH-psbA marker, suggesting it should not be used as a land plant barcode region. Some markers indicated differentiation between Tasmanian and mainland samples. Significant positive spatial genetic structure was detected at Conclusions: Our results demonstrate that 454 sequencing of barcoded amplicon mixtures can be used to reliably sample all alleles of homeologous loci in polyploid species and successfully investigate phylogenetic relationships among

  14. Nucleotide diversity and phylogenetic relationships among ...

    Indian Academy of Sciences (India)

    NIRAJ SINGH

    for phylogenetic analysis of Gladiolus and related taxa using combined datasets from chloroplast genome. The psbA–trnH ... phylogenetic relationships among cultivars could be useful for hybridization programmes for further improvement of the crop. [Singh N. ... breeding in nature, and exhibited diverse pollination mech-.

  15. Phylogenetic relationships among anuran trypanosomes as revealed by riboprinting.

    Science.gov (United States)

    Clark, C G; Martin, D S; Diamond, L S

    1995-01-01

    Twenty trypanosome isolates from Anura (frogs and toads) assigned to several species were characterized by riboprinting (restriction enzyme digestion of polymerase chain reaction amplified small subunit ribosomal RNA genes). Restriction site polymorphisms allowed distinction of all the recognized species and no intraspecific variation in riboprint patterns was detected. Phylogenetic reconstruction using parsimony and distance estimates based on restriction fragment comigration showed Trypanosoma chattoni to be only distantly related to the other species, while T. ranarum and T. fallisi appear to be sister taxa despite showing non-overlapping host specificities.

  16. Re-Evaluation of Phylogenetic Relationships among Species of the Mangrove Genus Avicennia from Indo-West Pacific Based on Multilocus Analyses.

    Science.gov (United States)

    Li, Xinnian; Duke, Norman C; Yang, Yuchen; Huang, Lishi; Zhu, Yuxiang; Zhang, Zhang; Zhou, Renchao; Zhong, Cairong; Huang, Yelin; Shi, Suhua

    2016-01-01

    Avicennia L. (Avicenniaceae), one of the most diverse mangrove genera, is distributed widely in tropical and subtropical intertidal zones worldwide. Five species of Avicennia in the Indo-West Pacific region have been previously described. However, their phylogenetic relationships were determined based on morphological and allozyme data. To enhance our understanding of evolutionary patterns in the clade, we carried out a molecular phylogenetic study using wide sampling and multiple loci. Our results support two monophyletic clades across all species worldwide in Avicennia: an Atlantic-East Pacific (AEP) lineage and an Indo-West Pacific (IWP) lineage. This split is in line with the biogeographic distribution of the clade. Focusing on the IWP branch, we reconstructed a detailed phylogenetic tree based on sequences from 25 nuclear genes. The results identified three distinct subclades, (1) A. rumphiana and A. alba, (2) A. officinalis and A. integra, and (3) the A. marina complex, with high bootstrap support. The results strongly corresponded to two morphological traits in floral structure: stigma position in relation to the anthers and style length. Using Bayesian dating methods, we estimated that diversification of the IWP lineage dates to the late Miocene (c. 6.0 million years ago) and may have been driven largely by fluctuating sea levels since that time.

  17. Re-Evaluation of Phylogenetic Relationships among Species of the Mangrove Genus Avicennia from Indo-West Pacific Based on Multilocus Analyses.

    Directory of Open Access Journals (Sweden)

    Xinnian Li

    Avicennia L. (Avicenniaceae), one of the most diverse mangrove genera, is distributed widely in tropical and subtropical intertidal zones worldwide. Five species of Avicennia in the Indo-West Pacific region have been previously described. However, their phylogenetic relationships were determined based on morphological and allozyme data. To enhance our understanding of evolutionary patterns in the clade, we carried out a molecular phylogenetic study using wide sampling and multiple loci. Our results support two monophyletic clades across all species worldwide in Avicennia: an Atlantic-East Pacific (AEP) lineage and an Indo-West Pacific (IWP) lineage. This split is in line with the biogeographic distribution of the clade. Focusing on the IWP branch, we reconstructed a detailed phylogenetic tree based on sequences from 25 nuclear genes. The results identified three distinct subclades, (1) A. rumphiana and A. alba, (2) A. officinalis and A. integra, and (3) the A. marina complex, with high bootstrap support. The results strongly corresponded to two morphological traits in floral structure: stigma position in relation to the anthers and style length. Using Bayesian dating methods, we estimated that diversification of the IWP lineage dates to the late Miocene (c. 6.0 million years ago) and may have been driven largely by fluctuating sea levels since that time.

  18. Bears in a forest of gene trees: phylogenetic inference is complicated by incomplete lineage sorting and gene flow.

    Science.gov (United States)

    Kutschera, Verena E; Bidon, Tobias; Hailer, Frank; Rodi, Julia L; Fain, Steven R; Janke, Axel

    2014-08-01

    Ursine bears are a mammalian subfamily that comprises six morphologically and ecologically distinct extant species. Previous phylogenetic analyses of concatenated nuclear genes could not resolve all relationships among bears, and appeared to conflict with the mitochondrial phylogeny. Evolutionary processes such as incomplete lineage sorting and introgression can cause gene tree discordance and complicate phylogenetic inferences, but are not accounted for in phylogenetic analyses of concatenated data. We generated a high-resolution data set of autosomal introns from several individuals per species and of Y-chromosomal markers. Incorporating intraspecific variability in coalescence-based phylogenetic and gene flow estimation approaches, we traced the genealogical history of individual alleles. Considerable heterogeneity among nuclear loci and discordance between nuclear and mitochondrial phylogenies were found. A species tree with divergence time estimates indicated that ursine bears diversified within less than 2 My. Consistent with a complex branching order within a clade of Asian bear species, we identified unidirectional gene flow from Asian black into sloth bears. Moreover, gene flow detected from brown into American black bears can explain the conflicting placement of the American black bear in mitochondrial and nuclear phylogenies. These results highlight that both incomplete lineage sorting and introgression are prominent evolutionary forces even on time scales up to several million years. Complex evolutionary patterns are not adequately captured by strictly bifurcating models, and can only be fully understood when analyzing multiple independently inherited loci in a coalescence framework. Phylogenetic incongruence among gene trees hence needs to be recognized as a biologically meaningful signal. © The Author 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  19. Estimation of pump operational state with model-based methods

    International Nuclear Information System (INIS)

    Ahonen, Tero; Tamminen, Jussi; Ahola, Jero; Viholainen, Juha; Aranto, Niina; Kestilae, Juha

    2010-01-01

    Pumps are widely used in industry, and they account for 20% of the industrial electricity consumption. Since the speed variation is often the most energy-efficient method to control the head and flow rate of a centrifugal pump, frequency converters are used with induction motor-driven pumps. Although a frequency converter can estimate the operational state of an induction motor without external measurements, the state of a centrifugal pump or other load machine is not typically considered. The pump is, however, usually controlled on the basis of the required flow rate or output pressure. As the pump operational state can be estimated with a general model having adjustable parameters, external flow rate or pressure measurements are not necessary to determine the pump flow rate or output pressure. Hence, external measurements could be replaced with an adjustable model for the pump that uses estimates of the motor operational state. Besides control purposes, modelling the pump operation can provide useful information for energy auditing and optimization purposes. In this paper, two model-based methods for pump operation estimation are presented. Factors affecting the accuracy of the estimation methods are analyzed. The applicability of the methods is verified by laboratory measurements and tests in two pilot installations. Test results indicate that the estimation methods can be applied to the analysis and control of pump operation. The accuracy of the methods is sufficient for auditing purposes, and the methods can inform the user if the pump is driven inefficiently.

  20. Evaluating the phylogenetic signal limit from mitogenomes, slow evolving nuclear genes, and the concatenation approach. New insights into the Lacertini radiation using fast evolving nuclear genes and species trees.

    Science.gov (United States)

    Mendes, Joana; Harris, D James; Carranza, Salvador; Salvi, Daniele

    2016-07-01

    Estimating the phylogeny of lacertid lizards, and particularly the tribe Lacertini has been challenging, possibly due to the fast radiation of this group resulting in a hard polytomy. However this is still an open question, as concatenated data primarily from mitochondrial markers have been used so far whereas in a recent phylogeny based on a compilation of these data within a squamate supermatrix the basal polytomy seems to be resolved. In this study, we estimate phylogenetic relationships between all Lacertini genera using for the first time DNA sequences from five fast evolving nuclear genes (acm4, mc1r, pdc, βfib and reln) and two mitochondrial genes (nd4 and 12S). We generated a total of 529 sequences from 88 species and used Maximum Likelihood and Bayesian Inference methods based on concatenated multilocus dataset as well as a coalescent-based species tree approach with the aim of (i) shedding light on the basal relationships of Lacertini (ii) assessing the monophyly of genera which were previously questioned, and (iii) discussing differences between estimates from this and previous studies based on different markers, and phylogenetic methods. Results uncovered (i) a new phylogenetic clade formed by the monotypic genera Archaeolacerta, Zootoca, Teira and Scelarcis; and (ii) support for the monophyly of the Algyroides clade, with two sister species pairs represented by western (A. marchi and A. fitzingeri) and eastern (A. nigropunctatus and A. moreoticus) lineages. In both cases the members of these groups show peculiar morphology and very different geographical distributions, suggesting that they are relictual groups that were once diverse and widespread. They probably originated about 11-13 million years ago during early events of speciation in the tribe, and the split between their members is estimated to be only slightly older. This scenario may explain why mitochondrial markers (possibly saturated at higher divergence levels) or slower nuclear markers

  1. Population Estimation with Mark and Recapture Method Program

    International Nuclear Information System (INIS)

    Limohpasmanee, W.; Kaewchoung, W.

    1998-01-01

    Population estimates are important for planning insect control, especially control using the sterile insect technique (SIT), and can also be used to evaluate the efficiency of a control method. Because the calculations are complex, mark-and-recapture methods of population estimation have not been widely used. This program was therefore developed in QBasic to make the estimation accurate and easier. The program implements six methods: Seber's, Jolly-Seber's, Jackson's, Ito's, Hamada's and Yamamura's. The results were compared with those of the original methods and found to be accurate, and the program is easier to apply.
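
    As a rough illustration of the kind of calculation such a program automates, the sketch below computes the simplest two-sample mark-recapture estimate, the Lincoln-Petersen index in Chapman's bias-corrected form. This is an assumption chosen for brevity: the abstract does not give the formulas of the six methods it lists, and the function name and the sample numbers are hypothetical.

```python
def chapman_estimate(marked_released, caught_second, recaptured_marked):
    """Two-sample mark-recapture estimate of population size.

    marked_released   -- animals marked and released in the first sample (M)
    caught_second     -- animals caught in the second sample (C)
    recaptured_marked -- marked animals among the second sample (R)

    Returns (N_hat, variance) using Chapman's bias-corrected form of the
    Lincoln-Petersen estimator. Illustrative only; not one of the six
    methods named in the abstract unless the source says otherwise.
    """
    M, C, R = marked_released, caught_second, recaptured_marked
    if min(M, C, R) < 0:
        raise ValueError("counts must be non-negative")
    if R > min(M, C):
        raise ValueError("recaptures cannot exceed either sample size")

    # Chapman's estimator: adding 1 to each count keeps it finite when R = 0.
    n_hat = (M + 1) * (C + 1) / (R + 1) - 1

    # Approximate variance of the Chapman estimator.
    variance = ((M + 1) * (C + 1) * (M - R) * (C - R)) / ((R + 1) ** 2 * (R + 2))
    return n_hat, variance


if __name__ == "__main__":
    # Hypothetical release of 200 marked insects, 150 caught later, 30 marked.
    n_hat, var = chapman_estimate(200, 150, 30)
    print(f"estimated population: {n_hat:.1f} (SE {var ** 0.5:.1f})")
```

    Multi-sample open-population methods such as Jolly-Seber need capture histories over several occasions and are considerably more involved, which is presumably the complexity the abstract refers to.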

  2. New methods of testing nonlinear hypothesis using iterative NLLS estimator

    Science.gov (United States)

    Mahaboob, B.; Venkateswarlu, B.; Mokeshrayalu, G.; Balasiddamuni, P.

    2017-11-01

    This research paper discusses the method of testing nonlinear hypotheses using the iterative Nonlinear Least Squares (NLLS) estimator, as explained by Takeshi Amemiya [1]. In the present paper, however, a modified Wald test statistic due to Engle, Robert [6] is proposed to test nonlinear hypotheses using the iterative NLLS estimator. An alternative method for testing nonlinear hypotheses using the iterative NLLS estimator based on nonlinear studentized residuals is also proposed, and an innovative method of testing nonlinear hypotheses using the iterative restricted NLLS estimator is derived. Pesaran and Deaton [10] explained the methods of testing nonlinear hypotheses. This paper uses asymptotic properties of the nonlinear least squares estimator proposed by Jenrich [8]. The main purpose of this paper is to provide innovative methods of testing nonlinear hypotheses using the iterative NLLS estimator, the iterative NLLS estimator based on nonlinear studentized residuals, and the iterative restricted NLLS estimator. Eakambaram et al. [12] discussed least absolute deviation estimation versus the nonlinear regression model with heteroscedastic errors, and also studied the problem of heteroscedasticity with reference to nonlinear regression models with a suitable illustration. William Grene [13] examined the interaction effect in nonlinear models discussed by Ai and Norton [14] and suggested ways to examine the effects that do not involve statistical testing. Peter [15] provided guidelines for identifying composite hypotheses and addressing the probability of false rejection for multiple hypotheses.

  3. Niche conservatism and dispersal limitation cause large-scale phylogenetic structure in the New World palm flora

    DEFF Research Database (Denmark)

    Eiserhardt, Wolf L.; Svenning, J.-C.; Baker, William J.

    similarity decays after speciation depends on the rates of niche evolution and dispersal. If dispersal is slow compared to the tempo of lineage diversification, distributions change little during clade diversification. Phylogenetic niche conservatism precludes distributional shifts in environmental space......, and to the degree that distributions are limited by the niche, also in geographic space. Using phylogenetic turnover methods, we simultaneously analysed the distributions of all New World palms (n=547) and inferred to which degree phylogenetic niche conservatism and dispersal limitation, respectively, caused...

  4. Curious parallels and curious connections--phylogenetic thinking in biology and historical linguistics.

    Science.gov (United States)

    Atkinson, Quentin D; Gray, Russell D

    2005-08-01

    In The Descent of Man (1871), Darwin observed "curious parallels" between the processes of biological and linguistic evolution. These parallels mean that evolutionary biologists and historical linguists seek answers to similar questions and face similar problems. As a result, the theory and methodology of the two disciplines have evolved in remarkably similar ways. In addition to Darwin's curious parallels of process, there are a number of equally curious parallels and connections between the development of methods in biology and historical linguistics. Here we briefly review the parallels between biological and linguistic evolution and contrast the historical development of phylogenetic methods in the two disciplines. We then look at a number of recent studies that have applied phylogenetic methods to language data and outline some current problems shared by the two fields.

  5. First phylogenetic analysis of Ehrlichia canis in dogs and ticks from Mexico. Preliminary study

    Directory of Open Access Journals (Sweden)

    Carolina G. Sosa-Gutiérrez

    2016-09-01

    Objective. Phylogenetic characterization of Ehrlichia canis in naturally infected dogs and in ticks, diagnosed by PCR and sequencing of the 16S rRNA gene, and comparison of the different isolates found in American countries. Materials and methods. Blood samples were collected from 139 dogs that had clinical manifestations suggestive of this disease and were infested with ticks; part of the 16S rRNA gene was sequenced and aligned with 17 sequences reported in American countries. Two phylogenetic trees were constructed using the maximum likelihood and maximum parsimony methods. Results. E. canis was detected in 25/139 (18.0%) dogs and 29/139 (20.9%) ticks. The clinical manifestations presented were fever, fatigue, depression and vomiting. Rhipicephalus sanguineus, Dermacentor variabilis and Haemaphysalis leporis-palustris ticks were positive for E. canis. Phylogenetic analysis showed that the sequences from dogs and ticks in Mexico form a third group that diverges from the sequences from South America and the USA. Conclusions. This is the first phylogenetic analysis of E. canis in Mexico. The sequences from Mexico differ from those reported in South America and the USA. This research lays the foundation for further study of genetic variability.

  6. Integrating evolution into geographical ecology: a phylogenetic perspective on palm distributions and community composition across scales

    DEFF Research Database (Denmark)

    Eiserhardt, Wolf L.; Svenning, J.-C.; Kissling, W. Daniel

    in 430 transects in the Western Amazon, b) a set of range maps for all American palms (550 spp.), and c) global country-level presence/ absence data of all (>2400) palm species. These data were analysed with novel phylogenetic community structure and turnover methods. Globally, the phylogenetic structure...

  7. Phylogenetic Position of Barbus lacerta Heckel, 1843

    Directory of Open Access Journals (Sweden)

    Mustafa Korkmaz

    2015-11-01

    As a result, five clades emerged from the phylogenetic reconstruction, and in the phylogenetic tree Barbus lacerta was determined to be the sister group of Barbus macedonicus, Barbus oligolepis and Barbus plebejus complex.

  8. Including RNA secondary structures improves accuracy and robustness in reconstruction of phylogenetic trees

    Directory of Open Access Journals (Sweden)

    Dandekar Thomas

    2010-01-01

    Background: In several studies, secondary structures of ribosomal genes have been used to improve the quality of phylogenetic reconstructions. An extensive evaluation of the benefits of secondary structure, however, is lacking. Results: This is the first study to counter this deficiency. We inspected the accuracy and robustness of phylogenetics with individual secondary structures by simulation experiments for artificial tree topologies with up to 18 taxa and for divergency levels in the range of typical phylogenetic studies. We chose the internal transcribed spacer 2 of the ribosomal cistron as an exemplary marker region. Simulation integrated the coevolution process of sequences with secondary structures. Additionally, the phylogenetic power of marker size duplication was investigated and compared with sequence and sequence-structure reconstruction methods. The results clearly show that accuracy and robustness of Neighbor Joining trees are largely improved by structural information in contrast to sequence only data, whereas a doubled marker size only accounts for robustness. Conclusions: Individual secondary structures of ribosomal RNA sequences provide a valuable gain of information content that is useful for phylogenetics. Thus, the usage of ITS2 sequence together with secondary structure for taxonomic inferences is recommended. Other reconstruction methods such as maximum likelihood, Bayesian inference or maximum parsimony may equally profit from secondary structure inclusion. Reviewers: This article was reviewed by Shamil Sunyaev, Andrea Tanzer (nominated by Frank Eisenhaber) and Eugene V. Koonin.

  9. The phylogenetics of succession can guide restoration

    DEFF Research Database (Denmark)

    Shooner, Stephanie; Chisholm, Chelsea Lee; Davies, T. Jonathan

    2015-01-01

    Phylogenetic tools have increasingly been used in community ecology to describe the evolutionary relationships among co-occurring species. In studies of succession, such tools may allow us to identify the evolutionary lineages most suited for particular stages of succession and habitat...... rehabilitation. However, to date, these two applications have been largely separate. Here, we suggest that information on phylogenetic community structure might help to inform community restoration strategies following major disturbance. Our study examined phylogenetic patterns of succession based...... for species sorting along abiotic gradients (slope and aspect) on the mine sites that had been abandoned for the longest. Synthesis and applications. Understanding the trajectory of succession is critical for restoration efforts. Our results suggest that early colonizers represent a phylogenetically random...

  10. Effects of Phylogenetic Tree Style on Student Comprehension

    Science.gov (United States)

    Dees, Jonathan Andrew

    Phylogenetic trees are powerful tools of evolutionary biology that have become prominent across the life sciences. Consequently, learning to interpret and reason from phylogenetic trees is now an essential component of biology education. However, students often struggle to understand these diagrams, even after explicit instruction. One factor that has been observed to affect student understanding of phylogenetic trees is style (i.e., diagonal or bracket). The goal of this dissertation research was to systematically explore effects of style on student interpretations and construction of phylogenetic trees in the context of an introductory biology course. Before instruction, students were significantly more accurate with bracket phylogenetic trees for a variety of interpretation and construction tasks. Explicit instruction that balanced the use of diagonal and bracket phylogenetic trees mitigated some, but not all, style effects. After instruction, students were significantly more accurate for interpretation tasks involving taxa relatedness and construction exercises when using the bracket style. Based on this dissertation research and prior studies on style effects, I advocate for introductory biology instructors to use only the bracket style. Future research should examine causes of style effects and variables other than style to inform the development of research-based instruction that best supports student understanding of phylogenetic trees.

  11. TreeCluster: Massively scalable transmission clustering using phylogenetic trees

    OpenAIRE

    Moshiri, Alexander

    2018-01-01

    Background: The ability to infer transmission clusters from molecular data is critical to designing and evaluating viral control strategies. Viral sequencing datasets are growing rapidly, but standard methods of transmission cluster inference do not scale well beyond thousands of sequences. Results: I present TreeCluster, a cross-platform tool that performs transmission cluster inference on a given phylogenetic tree orders of magnitude faster than existing inference methods and supports multi...

  12. Nucleotide diversity and phylogenetic relationships among ...

    Indian Academy of Sciences (India)

    Navya

    2 attached at the base of tree as the diverging Iridaceae relative's lineage. Present study revealed that psbA-trnH region are useful in addressing questions of phylogenetic relationships among the Gladiolus cultivars, as these intergenic spacers are more variable and have more phylogenetically informative sites than the ...

  13. Estimation methods for nonlinear state-space models in ecology

    DEFF Research Database (Denmark)

    Pedersen, Martin Wæver; Berg, Casper Willestofte; Thygesen, Uffe Høgsbro

    2011-01-01

    The use of nonlinear state-space models for analyzing ecological systems is increasing. A wide range of estimation methods for such models is available to ecologists; however, it is not always clear which method is the appropriate one to choose. To this end, three approaches to estimation in the theta...... logistic model for population dynamics were benchmarked by Wang (2007). Similarly, we examine and compare the estimation performance of three alternative methods using simulated data. The first approach is to partition the state-space into a finite number of states and formulate the problem as a hidden...... Markov model (HMM). The second method uses the mixed effects modeling and fast numerical integration framework of the AD Model Builder (ADMB) open-source software. The third alternative is to use the popular Bayesian framework of BUGS. The study showed that state and parameter estimation performance...
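
    To make the benchmark setting concrete, the sketch below simulates data from a theta-logistic state-space model, assuming the log-scale form commonly written as X_t = X_{t-1} + r0 (1 - (exp(X_{t-1}) / K)^theta) + e_t with observations Y_t = X_t + u_t, where e_t and u_t are Gaussian. The exact parameterization used in the study is not given in the abstract, so the function name, argument names and parameter values here are hypothetical.

```python
import math
import random


def simulate_theta_logistic(n_steps, r0, K, theta, q_sd, r_sd, x0, seed=0):
    """Simulate log-abundance states and noisy observations.

    n_steps -- number of time steps to simulate
    r0      -- maximum per-step growth rate
    K       -- carrying capacity (on the natural, not log, scale)
    theta   -- shape parameter of the density dependence
    q_sd    -- standard deviation of the process noise e_t
    r_sd    -- standard deviation of the observation noise u_t
    x0      -- initial log-abundance
    """
    rng = random.Random(seed)  # local generator so runs are reproducible
    x = x0
    states, observations = [], []
    for _ in range(n_steps):
        # State equation: density-dependent growth plus process noise.
        growth = r0 * (1.0 - (math.exp(x) / K) ** theta)
        x = x + growth + rng.gauss(0.0, q_sd)
        states.append(x)
        # Observation equation: the state is seen with measurement error.
        observations.append(x + rng.gauss(0.0, r_sd))
    return states, observations


if __name__ == "__main__":
    xs, ys = simulate_theta_logistic(
        n_steps=200, r0=0.5, K=1000.0, theta=0.5,
        q_sd=0.1, r_sd=0.2, x0=math.log(100.0), seed=1,
    )
    print(f"final state {xs[-1]:.3f}, final observation {ys[-1]:.3f}")
```

    Any of the three estimation approaches named in the abstract would take a series like `ys` as input and try to recover the states and the parameters.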

  14. SATCHMO-JS: a webserver for simultaneous protein multiple sequence alignment and phylogenetic tree construction.

    Science.gov (United States)

    Hagopian, Raffi; Davidson, John R; Datta, Ruchira S; Samad, Bushra; Jarvis, Glen R; Sjölander, Kimmen

    2010-07-01

    We present the jump-start simultaneous alignment and tree construction using hidden Markov models (SATCHMO-JS) web server for simultaneous estimation of protein multiple sequence alignments (MSAs) and phylogenetic trees. The server takes as input a set of sequences in FASTA format, and outputs a phylogenetic tree and MSA; these can be viewed online or downloaded from the website. SATCHMO-JS is an extension of the SATCHMO algorithm, and employs a divide-and-conquer strategy to jump-start SATCHMO at a higher point in the phylogenetic tree, reducing the computational complexity of the progressive all-versus-all HMM-HMM scoring and alignment. Results on a benchmark dataset of 983 structurally aligned pairs from the PREFAB benchmark dataset show that SATCHMO-JS provides a statistically significant improvement in alignment accuracy over MUSCLE, Multiple Alignment using Fast Fourier Transform (MAFFT), ClustalW and the original SATCHMO algorithm. The SATCHMO-JS webserver is available at http://phylogenomics.berkeley.edu/satchmo-js. The datasets used in these experiments are available for download at http://phylogenomics.berkeley.edu/satchmo-js/supplementary/.

  15. Bin mode estimation methods for Compton camera imaging

    International Nuclear Information System (INIS)

    Ikeda, S.; Odaka, H.; Uemura, M.; Takahashi, T.; Watanabe, S.; Takeda, S.

    2014-01-01

    We study the image reconstruction problem of a Compton camera which consists of semiconductor detectors. The image reconstruction is formulated as a statistical estimation problem. We employ a bin-mode estimation (BME) and extend an existing framework to a Compton camera with multiple scatterers and absorbers. Two estimation algorithms are proposed: an accelerated EM algorithm for the maximum likelihood estimation (MLE) and a modified EM algorithm for the maximum a posteriori (MAP) estimation. Numerical simulations demonstrate the potential of the proposed methods
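
    For orientation, the sketch below shows the plain (unaccelerated) EM iteration for maximum likelihood estimation from binned Poisson counts, which is the textbook starting point for algorithms of this kind. It is not the accelerated or MAP variant proposed in the paper, and the system matrix, the counts and the function name are hypothetical.

```python
def mlem(system_matrix, counts, n_iter=50):
    """Plain ML-EM for binned Poisson data.

    system_matrix[i][j] -- probability that an emission from source pixel j
                           is recorded in detector bin i
    counts[i]           -- observed number of events in bin i
    Returns the estimated source intensity per pixel.
    """
    n_bins = len(system_matrix)
    n_pix = len(system_matrix[0])

    # Sensitivity of each pixel: total probability of being detected at all.
    sens = [sum(system_matrix[i][j] for i in range(n_bins)) for j in range(n_pix)]

    lam = [1.0] * n_pix  # flat, strictly positive starting image
    for _ in range(n_iter):
        # Forward projection: expected counts under the current image.
        expected = [sum(a * l for a, l in zip(row, lam)) for row in system_matrix]
        # Ratio of observed to expected counts in each bin.
        ratio = [y / e if e > 0 else 0.0 for y, e in zip(counts, expected)]
        # Back-project the ratios and apply the multiplicative update.
        lam = [
            lam[j] / sens[j] * sum(system_matrix[i][j] * ratio[i] for i in range(n_bins))
            if sens[j] > 0 else 0.0
            for j in range(n_pix)
        ]
    return lam


if __name__ == "__main__":
    A = [[0.5, 0.2], [0.3, 0.3], [0.2, 0.5]]  # hypothetical 3-bin, 2-pixel system
    y = [10, 6, 8]
    print(mlem(A, y, n_iter=100))
```

    The update is multiplicative, so the estimate stays non-negative; its slow convergence is one common motivation for accelerated variants.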

  16. Methods for the estimation of uranium ore reserves

    International Nuclear Information System (INIS)

    1985-01-01

    The Manual is designed mainly to provide assistance in uranium ore reserve estimation methods to mining engineers and geologists with limited experience in estimating reserves, especially to those working in developing countries. This Manual deals with the general principles of evaluation of metalliferous deposits but also takes into account the radioactivity of uranium ores. The methods presented have been generally accepted in the international uranium industry

  17. Phylogenetic distribution of large-scale genome patchiness

    Directory of Open Access Journals (Sweden)

    Hackenberg Michael

    2008-04-01

    Background: The phylogenetic distribution of large-scale genome structure (i.e. mosaic compositional patchiness) has been explored mainly by analytical ultracentrifugation of bulk DNA. However, with the availability of large, good-quality chromosome sequences, and the recently developed computational methods to directly analyze patchiness on the genome sequence, an evolutionary comparative analysis can be carried out at the sequence level. Results: The local variations in the scaling exponent of the Detrended Fluctuation Analysis are used here to analyze large-scale genome structure and directly uncover the characteristic scales present in genome sequences. Furthermore, through shuffling experiments of selected genome regions, computationally-identified, isochore-like regions were identified as the biological source for the uncovered large-scale genome structure. The phylogenetic distribution of short- and large-scale patchiness was determined in the best-sequenced genome assemblies from eleven eukaryotic genomes: mammals (Homo sapiens, Pan troglodytes, Mus musculus, Rattus norvegicus, and Canis familiaris), birds (Gallus gallus), fishes (Danio rerio), invertebrates (Drosophila melanogaster and Caenorhabditis elegans), plants (Arabidopsis thaliana) and yeasts (Saccharomyces cerevisiae). We found large-scale patchiness of genome structure, associated with in silico determined, isochore-like regions, throughout this wide phylogenetic range. Conclusion: Large-scale genome structure is detected by directly analyzing DNA sequences in a wide range of eukaryotic chromosome sequences, from human to yeast. In all these genomes, large-scale patchiness can be associated with the isochore-like regions, as directly detected in silico at the sequence level.

  18. Measures of phylogenetic differentiation provide robust and complementary insights into microbial communities.

    Science.gov (United States)

    Parks, Donovan H; Beiko, Robert G

    2013-01-01

    High-throughput sequencing techniques have made large-scale spatial and temporal surveys of microbial communities routine. Gaining insight into microbial diversity requires methods for effectively analyzing and visualizing these extensive data sets. Phylogenetic β-diversity measures address this challenge by allowing the relationship between large numbers of environmental samples to be explored using standard multivariate analysis techniques. Despite the success and widespread use of phylogenetic β-diversity measures, an extensive comparative analysis of these measures has not been performed. Here, we compare 39 measures of phylogenetic β diversity in order to establish the relative similarity of these measures along with key properties and performance characteristics. While many measures are highly correlated, those commonly used within microbial ecology were found to be distinct from those popular within classical ecology, and from the recently recommended Gower and Canberra measures. Many of the measures are surprisingly robust to different rootings of the gene tree, the choice of similarity threshold used to define operational taxonomic units, and the presence of outlying basal lineages. Measures differ considerably in their sensitivity to rare organisms, and the effectiveness of measures can vary substantially under alternative models of differentiation. Consequently, the depth of sequencing required to reveal underlying patterns of relationships between environmental samples depends on the selected measure. Our results demonstrate that using complementary measures of phylogenetic β diversity can further our understanding of how communities are phylogenetically differentiated. Open-source software implementing the phylogenetic β-diversity measures evaluated in this manuscript is available at http://kiwi.cs.dal.ca/Software/ExpressBetaDiversity.

  19. A SOFTWARE RELIABILITY ESTIMATION METHOD TO NUCLEAR SAFETY SOFTWARE

    Directory of Open Access Journals (Sweden)

    GEE-YONG PARK

    2014-02-01

    A method for estimating software reliability for nuclear safety software is proposed in this paper. This method is based on the software reliability growth model (SRGM), where the behavior of software failure is assumed to follow a non-homogeneous Poisson process. Two types of modeling schemes based on a particular underlying method are proposed in order to more precisely estimate and predict the number of software defects based on very rare software failure data. The Bayesian statistical inference is employed to estimate the model parameters by incorporating software test cases as a covariate into the model. It was identified that these models are capable of reasonably estimating the remaining number of software defects which directly affects the reactor trip functions. The software reliability might be estimated from these modeling equations, and one approach of obtaining software reliability value is proposed in this paper.
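
    The abstract does not say which non-homogeneous Poisson process it builds on. As a hedged illustration, the sketch below uses the Goel-Okumoto model, a widely used NHPP-based SRGM with mean value function m(t) = a (1 - exp(-b t)), to show how the expected number of remaining defects and a reliability value follow from fitted parameters. The choice of model, the function names and the parameter values are all assumptions, and the Bayesian fitting step described in the abstract is not shown.

```python
import math


def mean_failures(t, a, b):
    """Expected cumulative number of failures observed by test time t.

    a -- expected total number of defects in the software
    b -- per-defect detection rate
    """
    return a * (1.0 - math.exp(-b * t))


def expected_remaining(t, a, b):
    """Expected number of defects still undetected at test time t."""
    return a - mean_failures(t, a, b)


def reliability(x, t, a, b):
    """Probability of no failure in (t, t + x] given testing stopped at t.

    For an NHPP, the number of failures in an interval is Poisson with mean
    m(t + x) - m(t), so the probability of zero failures is exp of minus that.
    """
    return math.exp(-(mean_failures(t + x, a, b) - mean_failures(t, a, b)))


if __name__ == "__main__":
    a, b = 100.0, 0.05  # hypothetical fitted parameters
    t = 10.0            # hypothetical elapsed test time
    print(f"expected remaining defects: {expected_remaining(t, a, b):.2f}")
    print(f"reliability over next unit of time: {reliability(1.0, t, a, b):.4f}")
```

    In a Bayesian treatment such as the one the abstract describes, `a` and `b` would be replaced by draws from their posterior distribution, giving a distribution for the remaining defects instead of a single value.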

  20. Phylogenetic Patterns of Extinction Risk in the Eastern Arc Ecosystems, an African Biodiversity Hotspot

    OpenAIRE

    Yessoufou, Kowiyou; Daru, Barnabas H.; Davies, T. Jonathan

    2012-01-01

    There is an urgent need to reduce drastically the rate at which biodiversity is declining worldwide. Phylogenetic methods are increasingly being recognised as providing a useful framework for predicting future losses, and guiding efforts for pre-emptive conservation actions. In this study, we used a reconstructed phylogenetic tree of angiosperm species of the Eastern Arc Mountains - an important African biodiversity hotspot - and described the distribution of extinction risk across taxonomic ...

  1. Characterization of Escherichia coli Phylogenetic Groups ...

    African Journals Online (AJOL)

    Background: Escherichia coli strains mainly fall into four phylogenetic groups (A, B1, B2, and D), and virulent extra‑intestinal strains mainly belong to groups B2 and D. Aim: The aim was to determine the association between phylogenetic groups of E. coli causing extraintestinal infections (ExPEC) regarding the site of ...

  2. Phylogenetic tests of a Cercopithecus monkey hybrid reveal X ...

    African Journals Online (AJOL)

    A captive Cercopithecus nictitans × C. cephus male was examined at loci on the X- and Y-chromosomes as a test of previously described phylogenetic methods for identifying hybrid Cercopithecus monkeys. The results confirm the reliability of such assays, indicating that they can be of immediate utility for studies of wild ...

  3. Evaluation and reliability of bone histological age estimation methods

    African Journals Online (AJOL)

    Human age estimation at death plays a vital role in forensic anthropology and bioarchaeology. Researchers used morphological and histological methods to estimate human age from their skeletal remains. This paper discussed different histological methods that used human long bones and ribs to determine age ...

  4. Evolution of feeding specialization in Tanganyikan scale-eating cichlids: a molecular phylogenetic approach

    Directory of Open Access Journals (Sweden)

    Nishida Mutsumi

    2007-10-01

    Background: Cichlid fishes in Lake Tanganyika exhibit remarkable diversity in their feeding habits. Among them, seven species in the genus Perissodus are known for their unique feeding habit of scale eating with specialized feeding morphology and behaviour. Although the origin of the scale-eating habit has long been questioned, its evolutionary process is still unknown. In the present study, we conducted interspecific phylogenetic analyses for all nine known species in the tribe Perissodini (seven Perissodus and two Haplotaxodon species) using amplified fragment length polymorphism (AFLP) analyses of the nuclear DNA. On the basis of the resultant phylogenetic frameworks, the evolution of their feeding habits was traced using data from analyses of stomach contents, habitat depths, and observations of oral jaw tooth morphology. Results: AFLP analyses resolved the phylogenetic relationships of the Perissodini, strongly supporting monophyly for each species. The character reconstruction of feeding ecology based on the AFLP tree suggested that scale eating evolved from general carnivorous feeding to highly specialized scale eating. Furthermore, scale eating is suggested to have evolved in deepwater habitats in the lake. Oral jaw tooth shape was also estimated to have diverged in step with specialization for scale eating. Conclusion: The present evolutionary analyses of feeding ecology and morphology based on the obtained phylogenetic tree demonstrate for the first time the evolutionary process leading from generalised to highly specialized scale eating, with diversification in feeding morphology and behaviour among species.

  5. Phylogenetic evidence for cladogenetic polyploidization in land plants.

    Science.gov (United States)

    Zhan, Shing H; Drori, Michal; Goldberg, Emma E; Otto, Sarah P; Mayrose, Itay

    2016-07-01

    Polyploidization is a common and recurring phenomenon in plants and is often thought to be a mechanism of "instant speciation". Whether polyploidization is associated with the formation of new species (cladogenesis) or simply occurs over time within a lineage (anagenesis), however, has never been assessed systematically. We tested this hypothesis using phylogenetic and karyotypic information from 235 plant genera (mostly angiosperms). We first constructed a large database of combined sequence and chromosome number data sets using an automated procedure. We then applied likelihood models (ClaSSE) that estimate the degree of synchronization between polyploidization and speciation events in maximum likelihood and Bayesian frameworks. Our maximum likelihood analysis indicated that 35 genera supported a model that includes cladogenetic transitions over a model with only anagenetic transitions, whereas three genera supported a model that incorporates anagenetic transitions over one with only cladogenetic transitions. Furthermore, the Bayesian analysis supported a preponderance of cladogenetic change in four genera but did not support a preponderance of anagenetic change in any genus. Overall, these phylogenetic analyses provide the first broad confirmation that polyploidization is temporally associated with speciation events, suggesting that it is indeed a major speciation mechanism in plants, at least in some genera. © 2016 Botanical Society of America.

  6. A Practical Algorithm for Reconstructing Level-1 Phylogenetic Networks

    NARCIS (Netherlands)

    K.T. Huber; L.J.J. van Iersel (Leo); S.M. Kelk (Steven); R. Suchecki

    2010-01-01

    Recently much attention has been devoted to the construction of phylogenetic networks which generalize phylogenetic trees in order to accommodate complex evolutionary processes. Here we present an efficient, practical algorithm for reconstructing level-1 phylogenetic networks - a type of

  7. A practical algorithm for reconstructing level-1 phylogenetic networks

    NARCIS (Netherlands)

    Huber, K.T.; Iersel, van L.J.J.; Kelk, S.M.; Suchecki, R.

    2011-01-01

    Recently, much attention has been devoted to the construction of phylogenetic networks which generalize phylogenetic trees in order to accommodate complex evolutionary processes. Here, we present an efficient, practical algorithm for reconstructing level-1 phylogenetic networks-a type of network

  8. Application of agglomerative clustering for analyzing phylogenetically on bacterium of saliva

    Science.gov (United States)

    Bustamam, A.; Fitria, I.; Umam, K.

    2017-07-01

    Analyzing populations of Streptococcus bacteria is important because these species can cause dental caries, periodontal disease, halitosis (bad breath) and other problems. This paper discusses the phylogenetic relationships among Streptococcus bacteria in saliva using a phylogenetic tree built with agglomerative clustering methods. Starting from Streptococcus DNA sequences obtained from GenBank, characteristic features are extracted from the sequences. The extracted features form a matrix, which is normalized using min-max normalization, and genetic distances are then calculated using the Manhattan distance. The agglomerative clustering techniques considered are single linkage, complete linkage and average linkage. In the agglomerative algorithm, the number of groups starts equal to the number of individual species. The most similar species are grouped first, with similarity decreasing at each step, until a single group is formed. The result of the grouping is a phylogenetic tree whose branches join at a given distance level: the smaller the distance, the greater the similarity between species. The implementation uses R, an open-source program.
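
    The pipeline described in this abstract (min-max normalization, Manhattan distance, then single, complete or average linkage) can be sketched in a few lines. The sketch below is written in Python rather than the R used by the authors, and the feature vectors and species labels are hypothetical placeholders; the abstract does not say which features were extracted from the DNA sequences.

```python
def min_max_normalize(rows):
    """Rescale every feature column to the range [0, 1]."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [[(v - l) / (h - l) if h > l else 0.0
             for v, l, h in zip(row, lo, hi)] for row in rows]


def manhattan(u, v):
    """Manhattan (city-block) distance between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(u, v))


# How the pairwise distances between two clusters are combined.
LINKAGES = {
    "single": min,
    "complete": max,
    "average": lambda d: sum(d) / len(d),
}


def agglomerate(labels, rows, linkage="average"):
    """Merge the two closest clusters until one remains.

    Returns the tree as nested tuples plus the list of merges
    (left subtree, right subtree, joining distance).
    """
    points = min_max_normalize(rows)
    clusters = [[i] for i in range(len(points))]  # member indices
    trees = list(labels)                          # subtree per cluster
    merges = []
    combine = LINKAGES[linkage]
    while len(clusters) > 1:
        best = None
        for p in range(len(clusters)):
            for q in range(p + 1, len(clusters)):
                dists = [manhattan(points[i], points[j])
                         for i in clusters[p] for j in clusters[q]]
                d = combine(dists)
                if best is None or d < best[0]:
                    best = (d, p, q)
        d, p, q = best
        merges.append((trees[p], trees[q], d))
        clusters[p] = clusters[p] + clusters[q]
        trees[p] = (trees[p], trees[q])
        del clusters[q]
        del trees[q]
    return trees[0], merges


# Hypothetical feature matrix: one row per species, one column per feature.
tree, merges = agglomerate(["s1", "s2", "s3"],
                           [[0, 0], [1, 0], [10, 10]],
                           linkage="single")
print(tree)
```

    Each entry of `merges` records the distance at which two branches join, which is the information a dendrogram plots on its height axis.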

  9. Multiple alignment analysis on phylogenetic tree of the spread of SARS epidemic using distance method

    Science.gov (United States)

    Amiroch, S.; Pradana, M. S.; Irawan, M. I.; Mukhlash, I.

    2017-09-01

    Multiple Alignment (MA) is a particularly important tool for studying viral genomes and determining the evolutionary process of a specific virus. Applying MA to the spread of the severe acute respiratory syndrome (SARS) epidemic is of interest because the virus spread so quickly a few years ago that it drew medical attention in many countries. Although a great deal of software exists for processing multiple sequences, the use of pairwise alignment to build the MA is very important to consider. Previous research used the Super Pairwise Alignment algorithm to align the sequences for MA, whereas this study used the Needleman-Wunsch dynamic programming algorithm, simulated in Matlab. The analysis of the MA yielded stable and unstable regions, which indicate the positions where mutations occur, the network topology produced by the distance-method phylogenetic tree of the SARS epidemic, and the area network of mutations.
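
    For readers unfamiliar with the pairwise step named in this abstract, the following is a minimal sketch of Needleman-Wunsch global alignment. It is written in Python rather than the Matlab used by the authors, and the scoring values (match +1, mismatch -1, gap -1) are assumptions chosen for illustration; the abstract does not state the scoring scheme.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment of strings a and b by dynamic programming.

    Returns (optimal score, aligned a, aligned b). The scoring
    parameters are illustrative defaults, not values from the paper.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against the empty string costs one gap per symbol.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    # Fill the table: best of diagonal (match/mismatch), up (gap in b), left (gap in a).
    for i in range(1, rows):
        for j in range(1, cols):
            step = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + step,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            step = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + step:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[-1][-1], "".join(reversed(out_a)), "".join(reversed(out_b))


print(needleman_wunsch("ACGT", "AGT"))
```

    Columns in which all aligned sequences agree correspond to what the abstract calls stable regions; columns with mismatches or gaps mark candidate mutation positions.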

  10. Folding and unfolding phylogenetic trees and networks.

    Science.gov (United States)

    Huber, Katharina T; Moulton, Vincent; Steel, Mike; Wu, Taoyang

    2016-12-01

    Phylogenetic networks are rooted, labelled directed acyclic graphs which are commonly used to represent reticulate evolution. There is a close relationship between phylogenetic networks and multi-labelled trees (MUL-trees). Indeed, any phylogenetic network N can be "unfolded" to obtain a MUL-tree U(N) and, conversely, a MUL-tree T can in certain circumstances be "folded" to obtain a phylogenetic network F(T) that exhibits T. In this paper, we study properties of the operations U and F in more detail. In particular, we introduce the class of stable networks, phylogenetic networks N for which F(U(N)) is isomorphic to N, characterise such networks, and show that they are related to the well-known class of tree-sibling networks. We also explore how the concept of displaying a tree in a network N can be related to displaying the tree in the MUL-tree U(N). To do this, we develop a phylogenetic analogue of graph fibrations. This allows us to view U(N) as the analogue of the universal cover of a digraph, and to establish a close connection between displaying trees in U(N) and reconciling phylogenetic trees with networks.

  11. Conservation threats and the phylogenetic utility of IUCN Red List rankings in Incilius toads.

    Science.gov (United States)

    Schachat, Sandra R; Mulcahy, Daniel G; Mendelson, Joseph R

    2016-02-01

    Phylogenetic analysis of extinction threat is an emerging tool in the field of conservation. However, there are problems with the methods and data as commonly used. Phylogenetic sampling usually extends to the level of family or genus, but International Union for Conservation of Nature (IUCN) rankings are available only for individual species, and, although different species within a taxonomic group may have the same IUCN rank, the species may have been ranked as such for different reasons. Therefore, IUCN rank may not reflect evolutionary history and thus may not be appropriate for use in a phylogenetic context. To be used appropriately, threat-risk data should reflect the cause of extinction threat rather than the IUCN threat ranking. In a case study of the toad genus Incilius, with phylogenetic sampling at the species level (so that the resolution of the phylogeny matches character data from the IUCN Red List), we analyzed causes of decline and IUCN threat rankings by calculating metrics of phylogenetic signal (such as Fritz and Purvis' D). We also analyzed the extent to which cause of decline and threat ranking overlap by calculating phylogenetic correlation between these 2 types of character data. Incilius species varied greatly in both threat ranking and cause of decline; this variability would be lost at a coarser taxonomic resolution. We found far more phylogenetic signal, likely correlated with evolutionary history, for causes of decline than for IUCN threat ranking. Individual causes of decline and IUCN threat rankings were largely uncorrelated on the phylogeny. Our results demonstrate the importance of character selection and taxonomic resolution when extinction threat is analyzed in a phylogenetic context. © 2015 Society for Conservation Biology.

  12. Topological variation in single-gene phylogenetic trees

    OpenAIRE

    Castresana, Jose

    2007-01-01

    A recent large-scale phylogenomic study has shown the great degree of topological variation that can be found among eukaryotic phylogenetic trees constructed from single genes, highlighting the problems that can be associated with gene sampling in phylogenetic studies.

  13. Phylogenetic relationships and divergence dates of softshell turtles (Testudines: Trionychidae) inferred from complete mitochondrial genomes.

    Science.gov (United States)

    Li, H; Liu, J; Xiong, L; Zhang, H; Zhou, H; Yin, H; Jing, W; Li, J; Shi, Q; Wang, Y; Liu, J; Nie, L

    2017-05-01

    The softshell turtles (Trionychidae) are one of the most widely distributed reptile groups in the world, and fossils have been found on all continents except Antarctica. The phylogenetic relationships among members of this group have been previously studied; however, disagreements regarding its taxonomy persist, and its phylogeography and divergence times are still poorly understood. Here, we present a comprehensive mitogenomic study of softshell turtles. We sequenced the complete mitochondrial genomes of 10 softshell turtles, in addition to the GenBank sequences of Dogania subplana, Lissemys punctata, and Trionyx triunguis, which together cover all extant genera within Trionychidae except for Cyclanorbis and Cycloderma. These data were combined with other mitogenomes of turtles for phylogenetic analyses. Divergence time calibration and ancestral reconstruction were calculated using BEAST and RASP software, respectively. Our phylogenetic analyses indicate that Trionychidae is the sister taxon of Carettochelyidae, and support the monophyly of Trionychinae and Cyclanorbinae, which is consistent with morphological data and molecular analysis. Our phylogenetic analyses have established a sister taxon relationship between the Asian Rafetus and the Asian Palea + Pelodiscus + Dogania + Nilssonia + Amyda, whereas a previous study grouped the Asian Rafetus with the American Apalone. The results of divergence time estimates and area ancestral reconstruction show that extant Trionychidae originated in Asia around 108 million years ago (Ma), and radiations mainly occurred during two warm periods, namely Late Cretaceous-Early Eocene and Oligocene. By combining the estimated divergence time and the reconstructed ancestral area of softshell turtles, we determined that the dispersal of softshell turtles out of Asia may have taken three routes. Furthermore, the times of dispersal seem to be in agreement with the time of the India-Asia collision and opening of the Bering Strait, which

  14. Study on Top-Down Estimation Method of Software Project Planning

    Institute of Scientific and Technical Information of China (English)

    ZHANG Jun-guang; L(U) Ting-jie; ZHAO Yu-mei

    2006-01-01

    This paper studies a new software project planning method, using actual project data, in order to make software project plans more effective. From the perspective of system theory, our new method regards a software project plan as an associative unit for study. During a top-down estimation of a software project, the Program Evaluation and Review Technique (PERT) method and the analogy method are combined to estimate its size; effort estimates and specific schedules are then obtained according to the distribution of effort across phases. This allows a set of practical and feasible planning methods to be constructed. Actual data indicate that this set of methods can lead to effective software project planning.
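
    As background for the PERT step mentioned in this abstract, the classical PERT three-point estimate combines optimistic, most likely and pessimistic values into an expected value and a standard deviation. The sketch below shows only that textbook formula; how the paper combines it with the analogy method is not described in the abstract, and the function name and input figures are hypothetical.

```python
def pert_estimate(optimistic, most_likely, pessimistic):
    """Classical PERT three-point estimate.

    Returns (expected value, standard deviation) using the standard
    weights 1-4-1 and the usual (P - O) / 6 spread approximation.
    """
    expected = (optimistic + 4 * most_likely + pessimistic) / 6
    std_dev = (pessimistic - optimistic) / 6
    return expected, std_dev


# Hypothetical size estimates for one module, e.g. in thousands of lines of code.
expected, std_dev = pert_estimate(2, 4, 12)
print(expected, std_dev)
```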

  15. Sequence comparison and phylogenetic analysis of core gene of ...

    African Journals Online (AJOL)

    Phylogenetic analysis suggests that our sequences are clustered with sequences reported from Japan. This is the first phylogenetic analysis of HCV core gene from Pakistani population. Our sequences and sequences from Japan are grouped into same cluster in the phylogenetic tree. Sequence comparison and ...

  16. Analysis of Acorus calamus chloroplast genome and its phylogenetic implications.

    Science.gov (United States)

    Goremykin, Vadim V; Holland, Barbara; Hirsch-Ernst, Karen I; Hellwig, Frank H

    2005-09-01

    Determining the phylogenetic relationships among the major lines of angiosperms is a long-standing problem, yet the uncertainty as to the phylogenetic affinity of these lines persists. While a number of studies have suggested that the ANITA (Amborella-Nymphaeales-Illiciales-Trimeniales-Aristolochiales) grade is basal within angiosperms, studies of complete chloroplast genome sequences also suggested an alternative tree, wherein the line leading to the grasses branches first among the angiosperms. To improve taxon sampling in the existing chloroplast genome data, we sequenced the chloroplast genome of the monocot Acorus calamus. We generated a concatenated alignment (89,436 positions for 15 taxa), encompassing almost all sequences usable for phylogeny reconstruction within spermatophytes. The data still contain support for both the ANITA-basal and grasses-basal hypotheses. Using simulations we can show that were the ANITA-basal hypothesis true, parsimony (and distance-based methods with many models) would be expected to fail to recover it. The self-evident explanation for this failure appears to be a long-branch attraction (LBA) between the clade of grasses and the out-group. However, this LBA cannot explain the discrepancies observed between tree topology recovered using the maximum likelihood (ML) method and the topologies recovered using the parsimony and distance-based methods when grasses are deleted. Furthermore, the fact that neither maximum parsimony nor distance methods consistently recover the ML tree, when according to the simulations they would be expected to, when the out-group (Pinus) is deleted, suggests that either the generating tree is not correct or the best symmetric model is misspecified (or both). We demonstrate that the tree recovered under ML is extremely sensitive to model specification and that the best symmetric model is misspecified. Hence, we remain agnostic regarding phylogenetic relationships among basal angiosperm lineages.

  17. Fast and accurate methods for phylogenomic analyses

    Directory of Open Access Journals (Sweden)

    Warnow Tandy

    2011-10-01

    Abstract Background Species phylogenies are not estimated directly, but rather through phylogenetic analyses of different gene datasets. However, true gene trees can differ from the true species tree (and hence from one another) due to biological processes such as horizontal gene transfer, incomplete lineage sorting, and gene duplication and loss, so that no single gene tree is a reliable estimate of the species tree. Several methods have been developed to estimate species trees from estimated gene trees, differing according to the specific algorithmic technique used and the biological model used to explain differences between species and gene trees. Relatively little is known about the relative performance of these methods. Results We report on a study evaluating several different methods for estimating species trees from sequence datasets, simulating sequence evolution under a complex model including indels (insertions and deletions), substitutions, and incomplete lineage sorting. The most important finding of our study is that some fast and simple methods are nearly as accurate as the most accurate methods, which employ sophisticated statistical methods and are computationally quite intensive. We also observe that methods that explicitly consider errors in the estimated gene trees produce more accurate trees than methods that assume the estimated gene trees are correct. Conclusions Our study shows that highly accurate estimations of species trees are achievable, even when gene trees differ from each other and from the species tree, and that these estimations can be obtained using fairly simple and computationally tractable methods.

  18. Sensitivity of metrics of phylogenetic structure to scale, source of data and species pool of hummingbird assemblages along elevational gradients.

    Directory of Open Access Journals (Sweden)

    Sebastián González-Caro

    Patterns of phylogenetic structure of assemblages are increasingly used to gain insight into the ecological and evolutionary processes involved in the assembly of co-occurring species. Metrics of phylogenetic structure can be sensitive to scaling issues and data availability. Here we empirically assess the sensitivity of four metrics of phylogenetic structure of assemblages to changes in (i) the source of data, (ii) the spatial grain at which assemblages are defined, and (iii) the definition of species pools using hummingbird (Trochilidae) assemblages along an elevational gradient in Colombia. We also discuss some of the implications in terms of the potential mechanisms driving these patterns. To explore how the source of data influences phylogenetic structure we defined assemblages using three sources of data: field inventories, museum specimens, and range maps. Assemblages were defined at two spatial grains: coarse-grained (elevational bands of 800-m width) and fine-grained (1-km² plots). We used three different species pools: all species contained in assemblages, all species within half-degree quadrats, and all species either above or below 2000 m elevation. Metrics considering phylogenetic relationships among all species within assemblages showed phylogenetic clustering at high elevations and phylogenetic evenness in the lowlands, whereas those metrics considering only the closest co-occurring relatives showed the opposite trend. This result suggests that using multiple metrics of phylogenetic structure should provide greater insight into the mechanisms shaping assemblage structure. The source and spatial grain of data had important influences on estimates of both richness and phylogenetic structure. Metrics considering the co-occurrence of close relatives were particularly sensitive to changes in the spatial grain. Assemblages based on range maps included more species and showed less phylogenetic structure than assemblages based on museum or field

  19. Fast Computations for Measures of Phylogenetic Beta Diversity.

    Directory of Open Access Journals (Sweden)

    Constantinos Tsirogiannis

    For many applications in ecology, it is important to examine the phylogenetic relations between two communities of species. More formally, consider a phylogenetic tree and let A and B be two samples of its tips, representing the examined communities. We want to compute a value that expresses the phylogenetic diversity between A and B in this tree. There exist several measures that can do this; these are the so-called phylogenetic beta diversity (β-diversity) measures. Two popular measures of this kind are the Community Distance (CD) and the Common Branch Length (CBL). In most applications, it is not sufficient to compute the value of a beta diversity measure for two communities A and B; we also want to know if this value is relatively large or small compared to all possible pairs of communities in the tree that have the same size. To decide this, the ideal approach is to compute a standardised index that involves the mean and the standard deviation of this measure among all pairs of species samples that have the same number of elements as A and B. However, no method exists for computing this index exactly and efficiently for CD and CBL. We present analytical expressions for computing the expectation and the standard deviation of CD and CBL. Based on these expressions, we describe efficient algorithms for computing the standardised indices of the two measures. Using standard algorithmic analysis, we provide guarantees on the theoretical efficiency of our algorithms. We implemented our algorithms and measured their efficiency in practice. Our implementations compute the standardised indices of CD and CBL in less than twenty seconds for a hundred pairs of samples on trees with 7 ⋅ 10^4 tips. Our implementations are available through the R package PhyloMeasures.
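
    To make the idea of a standardised index concrete, the sketch below computes it by brute-force enumeration on a toy tree. This is deliberately not the authors' approach: the paper derives analytical expressions precisely because enumeration is infeasible on large trees. The sketch assumes that CD is the mean path length between a tip of A and a tip of B, and that the null model draws the two samples independently (so they may overlap); both are assumptions, as are the tree and all names used.

```python
from itertools import combinations, product
from statistics import mean, pstdev

# Hypothetical toy tree ((a:1,b:1):1,(c:1,d:1):1), stored as
# child -> (parent, branch length). "root" has no entry.
PARENT = {
    "a": ("ab", 1.0), "b": ("ab", 1.0),
    "c": ("cd", 1.0), "d": ("cd", 1.0),
    "ab": ("root", 1.0), "cd": ("root", 1.0),
}
TIPS = ["a", "b", "c", "d"]


def ancestors(node):
    """Map each node on the path from `node` to the root to its distance from `node`."""
    dist, out = 0.0, {node: 0.0}
    while node in PARENT:
        node, length = PARENT[node]
        dist += length
        out[node] = dist
    return out


def tip_distance(x, y):
    """Path length between two tips, measured through their lowest common ancestor."""
    ax, ay = ancestors(x), ancestors(y)
    return min(ax[k] + ay[k] for k in ax if k in ay)


def community_distance(a, b):
    """Assumed definition of CD: mean path length over all pairs (tip in a, tip in b)."""
    return mean(tip_distance(x, y) for x in a for y in b)


def standardised_cd(a, b):
    """(observed - null mean) / null standard deviation, by exhaustive enumeration."""
    null = [community_distance(x, y)
            for x, y in product(combinations(TIPS, len(a)),
                                combinations(TIPS, len(b)))]
    return (community_distance(a, b) - mean(null)) / pstdev(null)


print(community_distance(["a", "b"], ["c", "d"]))
print(standardised_cd(["a"], ["c"]))
```

    The enumeration grows combinatorially with the number of tips, which illustrates why closed-form expressions for the mean and standard deviation matter on trees with tens of thousands of tips.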

  20. An improved method for estimating the frequency correlation function

    KAUST Repository

    Chelli, Ali; Pätzold, Matthias

    2012-01-01

    For time-invariant frequency-selective channels, the transfer function is a superposition of waves having different propagation delays and path gains. In order to estimate the frequency correlation function (FCF) of such channels, the frequency averaging technique can be utilized. The obtained FCF can be expressed as a sum of auto-terms (ATs) and cross-terms (CTs). The ATs are caused by the autocorrelation of individual path components. The CTs are due to the cross-correlation of different path components. These CTs have no physical meaning and lead to an estimation error. We propose a new estimation method aiming to improve the estimation accuracy of the FCF of a band-limited transfer function. The basic idea behind the proposed method is to introduce a kernel function aiming to reduce the CT effect, while preserving the ATs. In this way, we can improve the estimation of the FCF. The performance of the proposed method and the frequency averaging technique is analyzed using a synthetically generated transfer function. We show that the proposed method is more accurate than the frequency averaging technique. The accurate estimation of the FCF is crucial for the system design. In fact, we can determine the coherence bandwidth from the FCF. The exact knowledge of the coherence bandwidth is beneficial in both the design and the optimization of frequency interleaving and pilot arrangement schemes. © 2012 IEEE.
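
    The split into auto-terms and cross-terms can be made explicit with a short derivation. The notation below (path gains c_n, delays τ_n, averaging bandwidth B) is assumed for illustration and is not taken from the paper; it also assumes, for simplicity, that both H(f) and H(f+ν) are available over the whole averaging window.

```latex
% Assumed channel model: N paths with gains c_n and delays \tau_n
H(f) = \sum_{n=1}^{N} c_n \, e^{-j 2\pi f \tau_n}

% Frequency-averaged FCF estimate over a window of width B
\hat{r}(\nu) = \frac{1}{B} \int_{-B/2}^{B/2} H^{*}(f)\, H(f+\nu)\, \mathrm{d}f
  = \underbrace{\sum_{n=1}^{N} |c_n|^{2} \, e^{-j 2\pi \nu \tau_n}}_{\text{auto-terms}}
  + \underbrace{\sum_{n \neq m} c_n^{*} c_m \, e^{-j 2\pi \nu \tau_m}\,
      \operatorname{sinc}\!\bigl(B(\tau_m - \tau_n)\bigr)}_{\text{cross-terms}},
\qquad \operatorname{sinc}(x) = \frac{\sin(\pi x)}{\pi x}
```

    Under these assumptions the cross-terms vanish only as B grows large relative to the inverse delay differences, which is consistent with the abstract's point that they cause an estimation error for a band-limited transfer function.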

  1. An improved method for estimating the frequency correlation function

    KAUST Repository

    Chelli, Ali

    2012-04-01

    For time-invariant frequency-selective channels, the transfer function is a superposition of waves having different propagation delays and path gains. In order to estimate the frequency correlation function (FCF) of such channels, the frequency averaging technique can be utilized. The obtained FCF can be expressed as a sum of auto-terms (ATs) and cross-terms (CTs). The ATs are caused by the autocorrelation of individual path components. The CTs are due to the cross-correlation of different path components. These CTs have no physical meaning and lead to an estimation error. We propose a new estimation method aiming to improve the estimation accuracy of the FCF of a band-limited transfer function. The basic idea behind the proposed method is to introduce a kernel function aiming to reduce the CT effect, while preserving the ATs. In this way, we can improve the estimation of the FCF. The performance of the proposed method and the frequency averaging technique is analyzed using a synthetically generated transfer function. We show that the proposed method is more accurate than the frequency averaging technique. The accurate estimation of the FCF is crucial for the system design. In fact, we can determine the coherence bandwidth from the FCF. The exact knowledge of the coherence bandwidth is beneficial in both the design and the optimization of frequency interleaving and pilot arrangement schemes. © 2012 IEEE.

  2. Predicting rates of interspecific interaction from phylogenetic trees.

    Science.gov (United States)

    Nuismer, Scott L; Harmon, Luke J

    2015-01-01

    Integrating phylogenetic information can potentially improve our ability to explain species' traits, patterns of community assembly, the network structure of communities, and ecosystem function. In this study, we use mathematical models to explore the ecological and evolutionary factors that modulate the explanatory power of phylogenetic information for communities of species that interact within a single trophic level. We find that phylogenetic relationships among species can influence trait evolution and rates of interaction among species, but only under particular models of species interaction. For example, when interactions within communities are mediated by a mechanism of phenotype matching, phylogenetic trees make specific predictions about trait evolution and rates of interaction. In contrast, if interactions within a community depend on a mechanism of phenotype differences, phylogenetic information has little, if any, predictive power for trait evolution and interaction rate. Together, these results make clear and testable predictions for when and how evolutionary history is expected to influence contemporary rates of species interaction. © 2014 John Wiley & Sons Ltd/CNRS.

  3. Environmental and spatial drivers of taxonomic, functional, and phylogenetic characteristics of bat communities in human-modified landscapes.

    Science.gov (United States)

    Cisneros, Laura M; Fagan, Matthew E; Willig, Michael R

    2016-01-01

    Assembly of species into communities following human disturbance (e.g., deforestation, fragmentation) may be governed by spatial (e.g., dispersal) or environmental (e.g., niche partitioning) mechanisms. Variation partitioning has been used to broadly disentangle spatial and environmental mechanisms, and approaches utilizing functional and phylogenetic characteristics of communities have been implemented to determine the relative importance of particular environmental (or niche-based) mechanisms. Nonetheless, few studies have integrated these quantitative approaches to comprehensively assess the relative importance of particular structuring processes. We employed a novel variation partitioning approach to evaluate the relative importance of particular spatial and environmental drivers of taxonomic, functional, and phylogenetic aspects of bat communities in a human-modified landscape in Costa Rica. Specifically, we estimated the amount of variation in species composition (taxonomic structure) and in two aspects of functional and phylogenetic structure (i.e., composition and dispersion) along a forest loss and fragmentation gradient that are uniquely explained by landscape characteristics (i.e., environment) or space to assess the importance of competing mechanisms. The unique effects of space on taxonomic, functional and phylogenetic structure were consistently small. In contrast, landscape characteristics (i.e., environment) played an appreciable role in structuring bat communities. Spatially-structured landscape characteristics explained 84% of the variation in functional or phylogenetic dispersion, and the unique effects of landscape characteristics significantly explained 14% of the variation in species composition. Furthermore, variation in bat community structure was primarily due to differences in dispersion of species within functional or phylogenetic space along the gradient, rather than due to differences in functional or phylogenetic composition. 
Variation

  4. Phylogenetic Structure of Foliar Spectral Traits in Tropical Forest Canopies

    Directory of Open Access Journals (Sweden)

    Kelly M. McManus

    2016-02-01

    The Spectranomics approach to tropical forest remote sensing has established a link between foliar reflectance spectra and the phylogenetic composition of tropical canopy tree communities vis-à-vis the taxonomic organization of biochemical trait variation. However, a direct relationship between phylogenetic affiliation and foliar reflectance spectra of species has not been established. We sought to develop this relationship by quantifying the extent to which underlying patterns of phylogenetic structure drive interspecific variation among foliar reflectance spectra within three Neotropical canopy tree communities with varying levels of soil fertility. We interpreted the resulting spectral patterns of phylogenetic signal in the context of foliar biochemical traits that may contribute to the spectral-phylogenetic link. We utilized a multi-model ensemble to elucidate trait-spectral relationships, and quantified phylogenetic signal for spectral wavelengths and traits using Pagel’s lambda statistic. Foliar reflectance spectra showed evidence of phylogenetic influence primarily within the visible and shortwave infrared spectral regions. These regions were also selected by the multi-model ensemble as those most important to the quantitative prediction of several foliar biochemical traits. Patterns of phylogenetic organization of spectra and traits varied across sites and with soil fertility, indicative of the complex interactions between the environmental and phylogenetic controls underlying patterns of biodiversity.

  5. Is invasion success of Australian trees mediated by their native biogeography, phylogenetic history, or both?

    Science.gov (United States)

    Miller, Joseph T; Hui, Cang; Thornhill, Andrew; Gallien, Laure; Le Roux, Johannes J; Richardson, David M

    2016-12-30

    For a plant species to become invasive it has to progress along the introduction-naturalization-invasion (INI) continuum which reflects the joint direction of niche breadth. Identification of traits that correlate with and drive species invasiveness along the continuum is a major focus of invasion biology. If invasiveness is underlain by heritable traits, and if such traits are phylogenetically conserved, then we would expect non-native species with different introduction status (i.e. position along the INI continuum) to show phylogenetic signal. This study uses two clades that contain a large number of invasive tree species from the genera Acacia and Eucalyptus to test whether geographic distribution and a novel phylogenetic conservation method can predict which species have been introduced, become naturalized, and become invasive. Our results suggest that no phylogenetic signal underlies the introduction status for either group of trees, except for introduced acacias. The more invasive acacia clade contains invasive species that have smoother geographic distributions and are more marginal in the phylogenetic network. The less invasive eucalyptus group contains invasive species that are more clustered geographically, more centrally located in the phylogenetic network and have phylogenetic distances between invasive and non-invasive species that are trending toward the mean pairwise distance. This suggests that highly invasive groups may be identified because they have invasive species with smoother and faster expanding native distributions and are located more to the edges of phylogenetic networks than less invasive groups. Published by Oxford University Press on behalf of the Annals of Botany Company.

  6. A comparison of analysis methods to estimate contingency strength.

    Science.gov (United States)

    Lloyd, Blair P; Staubitz, Johanna L; Tapp, Jon T

    2018-05-09

    To date, several data analysis methods have been used to estimate contingency strength, yet few studies have compared these methods directly. To compare the relative precision and sensitivity of four analysis methods (i.e., exhaustive event-based, nonexhaustive event-based, concurrent interval, concurrent+lag interval), we applied all methods to a simulated data set in which several response-dependent and response-independent schedules of reinforcement were programmed. We evaluated the degree to which contingency strength estimates produced from each method (a) corresponded with expected values for response-dependent schedules and (b) showed sensitivity to parametric manipulations of response-independent reinforcement. Results indicated both event-based methods produced contingency strength estimates that aligned with expected values for response-dependent schedules, but differed in sensitivity to response-independent reinforcement. The precision of interval-based methods varied by analysis method (concurrent vs. concurrent+lag) and schedule type (continuous vs. partial), and showed similar sensitivities to response-independent reinforcement. Recommendations and considerations for measuring contingencies are identified. © 2018 Society for the Experimental Analysis of Behavior.

  7. Building a Phylogenetic Tree of the Human and Ape Superfamily Using DNA-DNA Hybridization Data

    Science.gov (United States)

    Maier, Caroline Alexander

    2004-01-01

    The study describes the process of DNA-DNA hybridization and the history of its use by Sibley and Ahlquist in simple, straightforward, and interesting language that students can easily understand and use to create their own phylogenetic tree of the hominoid superfamily. They calibrate the DNA clock and use it to estimate the divergence dates of the various…

  8. Plant-available soil water capacity: estimation methods and implications

    Directory of Open Access Journals (Sweden)

    Bruno Montoani Silva

    2014-04-01

    The plant-available water capacity of the soil is defined as the water content between field capacity and wilting point, and has wide practical application in land-use planning. In a representative profile of the Cerrado Oxisol, methods for estimating the wilting point were studied and compared, using a WP4-T psychrometer and Richards chamber for undisturbed and disturbed samples. In addition, the field capacity was estimated by the water content at 6, 10, and 33 kPa and by the inflection point of the water retention curve, calculated by the van Genuchten and cubic polynomial models. We found that the field capacity moisture determined at the inflection point was higher than by the other methods, and that even at the inflection point the estimates differed, according to the model used. With the WP4-T psychrometer, the estimated water content at the permanent wilting point was significantly lower. We concluded that the estimation of the available water holding capacity is markedly influenced by the estimation methods, which has to be taken into consideration because of the practical importance of this parameter.
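
    The quantities compared in this abstract can be sketched in a few lines. The example below is only an illustration: it assumes the van Genuchten retention model with the common constraint m = 1 - 1/n, takes the inflection point of the curve plotted against the logarithm of suction, and uses a conventional 1500 kPa wilting-point suction that the abstract does not state. All parameter values and function names are hypothetical.

```python
import math

def van_genuchten_theta(h, theta_r, theta_s, alpha, n):
    """Water content at suction h (kPa) under the van Genuchten model, with m = 1 - 1/n."""
    m = 1.0 - 1.0 / n
    return theta_r + (theta_s - theta_r) / (1.0 + (alpha * h) ** n) ** m

def inflection_suction(alpha, n):
    """Suction at the inflection point of the curve plotted against ln(h)."""
    m = 1.0 - 1.0 / n
    return (1.0 / alpha) * (1.0 / m) ** (1.0 / n)

def available_water(h_fc, theta_r, theta_s, alpha, n, h_wp=1500.0):
    """Available water capacity: content at field capacity minus content at wilting point."""
    fc = van_genuchten_theta(h_fc, theta_r, theta_s, alpha, n)
    wp = van_genuchten_theta(h_wp, theta_r, theta_s, alpha, n)
    return fc - wp

# Hypothetical parameters for illustration only (not values from the study).
params = dict(theta_r=0.10, theta_s=0.55, alpha=0.5, n=1.5)

# Compare the field-capacity criteria named in the abstract.
for h_fc in (6.0, 10.0, 33.0, inflection_suction(params["alpha"], params["n"])):
    awc = available_water(h_fc, **params)
    print(f"field capacity at {h_fc:6.2f} kPa -> AWC = {awc:.3f}")
```

    Changing only the field-capacity criterion changes the computed capacity, which is the kind of method dependence the abstract reports.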

  9. Nonparametric methods for volatility density estimation

    NARCIS (Netherlands)

    Es, van Bert; Spreij, P.J.C.; Zanten, van J.H.

    2009-01-01

    Stochastic volatility modelling of financial processes has become increasingly popular. The proposed models usually contain a stationary volatility process. We will motivate and review several nonparametric methods for estimation of the density of the volatility process. Both models based on

  10. Fusion rule estimation using vector space methods

    International Nuclear Information System (INIS)

    Rao, N.S.V.

    1997-01-01

    In a system of $N$ sensors, the sensor $S_j$, $j = 1, 2, \ldots, N$, outputs $Y^{(j)} \in \mathbb{R}$ according to an unknown probability distribution $P(Y^{(j)} \mid X)$, corresponding to input $X \in [0, 1]$. A training $n$-sample $(X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n)$ is given, where $Y_i = (Y_i^{(1)}, Y_i^{(2)}, \ldots, Y_i^{(N)})$ such that $Y_i^{(j)}$ is the output of $S_j$ in response to input $X_i$. The problem is to estimate a fusion rule $f : \mathbb{R}^N \to [0, 1]$, based on the sample, such that the expected square error is minimized over a family of functions $\mathcal{F}$ that constitute a vector space. The function $f^*$ that minimizes the expected error cannot be computed since the underlying densities are unknown, and only an approximation $\hat{f}$ to $f^*$ is feasible. We estimate the sample size sufficient to ensure that $\hat{f}$ provides a close approximation to $f^*$ with a high probability. The advantages of vector space methods are two-fold: (a) the sample size estimate is a simple function of the dimensionality of $\mathcal{F}$, and (b) the estimate $\hat{f}$ can be easily computed by well-known least square methods in polynomial time. The results are applicable to the classical potential function methods and also to a recently proposed special class of sigmoidal feedforward neural networks.
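
    The least-squares step mentioned in point (b) can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's procedure: the vector space of candidate rules is taken to be the span of a constant and the raw sensor outputs, the three simulated sensors and their noise levels are invented, and the helper names are hypothetical.

```python
import random

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x

def basis(y):
    """Assumed basis of the function space: a constant plus each sensor output."""
    return [1.0] + list(y)

def fit_fusion_rule(samples):
    """Coefficients minimizing the empirical squared error over span(basis)."""
    d = len(basis(samples[0][1]))
    gram = [[0.0] * d for _ in range(d)]
    rhs = [0.0] * d
    for x, y in samples:
        phi = basis(y)
        for i in range(d):
            rhs[i] += phi[i] * x
            for j in range(d):
                gram[i][j] += phi[i] * phi[j]
    return solve_linear_system(gram, rhs)  # normal equations

def fuse(coeffs, y):
    """Evaluate the fitted rule and clip it to the target range [0, 1]."""
    value = sum(c * p for c, p in zip(coeffs, basis(y)))
    return min(1.0, max(0.0, value))

# Simulated training sample: three noisy sensors observing x in [0, 1].
rng = random.Random(0)
samples = []
for _ in range(500):
    x = rng.random()
    y = (x + rng.gauss(0, 0.05), x + rng.gauss(0, 0.20), x + rng.gauss(0, 0.10))
    samples.append((x, y))

coeffs = fit_fusion_rule(samples)
print("fitted coefficients:", [round(c, 3) for c in coeffs])
```

    Because each single sensor is itself a member of the assumed span, the fitted rule cannot have a larger training error than the best individual sensor.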

  11. Ecomorphology and phylogenetic risk: Implications for habitat reconstruction using fossil bovids.

    Science.gov (United States)

    Scott, Robert S; Barr, W Andrew

    2014-08-01

    Reconstructions of paleohabitats are necessary aids in understanding hominin evolution. The morphology of species from relevant sites, understood in terms of functional relationships to habitat (termed ecomorphology), offers a direct link to habitat. Bovids are a speciose radiation that includes many habitat specialists and are abundant in the fossil record. Thus, bovids are extremely common in ecomorphological analyses. However, bovid phylogeny and habitat preference are related, which raises the possibility that analyses linking habitat with morphology are not 'taxon free' but 'taxon-dependent.' Here we analyze eight relative dimensions and one shape index of the metatarsal for a sample of 72 bovid species and one antilocaprid. The selected variables have been previously shown to have strong associations with habitat and to have functional explanations for these associations. Phylogenetic generalized least squares analyses of these variables, including habitat and size, resulted in estimates for the parameter lambda (used to model phylogenetic signal) varying from zero to one. Thus, while phylogeny, morphology, and habitat all march together among the bovids, the odds that phylogeny confounds ecomorphological analyses may vary depending on particular morphological characteristics. While large values of lambda do not necessarily indicate that habitat differences are unimportant drivers of morphology, we consider the low value of lambda for relative metatarsal width suggestive that conclusions about habitat built on observations of this particular morphology carry with them less 'phylogenetic risk.' We suggest that the way forward for ecomorphology is grounded in functionally relevant observations and careful consideration of phylogeny designed to bracket probable habitat preferences appropriately. Separate consideration of different morphological variables may help to determine the level of 'phylogenetic risk' attached to conclusions linking habitat and morphology.
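
    For readers unfamiliar with the lambda parameter, the sketch below shows the usual lambda transformation of a phylogenetic covariance matrix, in which the off-diagonal (shared-history) terms are multiplied by lambda while the variances are left alone. The three-species covariance matrix is a made-up example, and the function name is hypothetical; the abstract does not describe its implementation.

```python
def lambda_transform(cov, lam):
    """Scale shared-history covariances by lam; leave the variances unchanged."""
    n = len(cov)
    return [[cov[i][j] if i == j else lam * cov[i][j] for j in range(n)]
            for i in range(n)]

# Made-up Brownian-motion covariance: species 0 and 1 share 2 units of history,
# species 2 shares none with either; every tip is 3 units from the root.
cov = [
    [3.0, 2.0, 0.0],
    [2.0, 3.0, 0.0],
    [0.0, 0.0, 3.0],
]

for lam in (0.0, 0.5, 1.0):
    print(lam, lambda_transform(cov, lam))
```

    With lambda equal to zero the species are treated as independent, and with lambda equal to one the full phylogenetic covariance is retained, which is why low lambda estimates are read as low phylogenetic signal.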

  12. Fast optimization of statistical potentials for structurally constrained phylogenetic models

    Directory of Open Access Journals (Sweden)

    Rodrigue Nicolas

    2009-09-01

    Background: Statistical approaches for protein design are relevant in the field of molecular evolutionary studies. In recent years, new, so-called structurally constrained (SC) models of protein-coding sequence evolution have been proposed, which use statistical potentials to assess sequence-structure compatibility. In a previous work, we defined a statistical framework for optimizing knowledge-based potentials especially suited to SC models. Our method used the maximum likelihood principle and provided what we call the joint potentials. However, the method required numerical estimations by the use of computationally heavy Markov Chain Monte Carlo sampling algorithms. Results: Here, we develop an alternative optimization procedure, based on a leave-one-out argument coupled to fast gradient descent algorithms. We assess that the leave-one-out potential yields very similar results to the joint approach developed previously, both in terms of the resulting potential parameters, and by Bayes factor evaluation in a phylogenetic context. On the other hand, the leave-one-out approach results in a considerable computational benefit (up to a 1,000-fold decrease in computational time for the optimization procedure). Conclusion: Due to its computational speed, the optimization method we propose offers an attractive alternative for the design and empirical evaluation of alternative forms of potentials, using large data sets and high-dimensional parameterizations.

  13. A Benchmark Estimate for the Capital Stock. An Optimal Consistency Method

    OpenAIRE

    Jose Miguel Albala-Bertrand

    2001-01-01

    There are alternative methods to estimate a capital stock for a benchmark year. These methods, however, do not allow for an independent check, which could establish whether the estimated benchmark level is too high or too low. I propose here an optimal consistency method (OCM), which may allow estimating a capital stock level for a benchmark year and/or checking the consistency of alternative estimates of a benchmark capital stock.

  14. Phylogeny and phylogenetic classification of the antbirds, ovenbirds, woodcreepers, and allies (Aves: Passeriformes: Infraorder Furnariides)

    Science.gov (United States)

    Moyle, R.G.; Chesser, R.T.; Brumfield, R.T.; Tello, J.G.; Marchese, D.J.; Cracraft, J.

    2009-01-01

    The infraorder Furnariides is a diverse group of suboscine passerine birds comprising a substantial component of the Neotropical avifauna. The included species encompass a broad array of morphologies and behaviours, making them appealing for evolutionary studies, but the size of the group (ca. 600 species) has limited well-sampled higher-level phylogenetic studies. Using DNA sequence data from the nuclear RAG-1 and RAG-2 exons, we undertook a phylogenetic analysis of the Furnariides sampling 124 (more than 88%) of the genera. Basal relationships among family-level taxa differed depending on phylogenetic method, but all topologies had little nodal support, mirroring the results from earlier studies in which discerning relationships at the base of the radiation was also difficult. In contrast, branch support for family-rank taxa and for many relationships within those clades was generally high. Our results support the Melanopareidae and Grallariidae as distinct from the Rhinocryptidae and Formicariidae, respectively. Within the Furnariides our data contradict some recent phylogenetic hypotheses and suggest that further study is needed to resolve these discrepancies. Of the few genera represented by multiple species, several were not monophyletic, indicating that additional systematic work remains within furnariine families and must include dense taxon sampling. We use this study as a basis for proposing a new phylogenetic classification for the group and in the process erect new family-group names for clades having high branch support across methods. © 2009 The Willi Hennig Society.

  15. Thermodynamic properties of organic compounds estimation methods, principles and practice

    CERN Document Server

    Janz, George J

    1967-01-01

    Thermodynamic Properties of Organic Compounds: Estimation Methods, Principles and Practice, Revised Edition focuses on the progression of practical methods in computing the thermodynamic characteristics of organic compounds. Divided into two parts with eight chapters, the book concentrates first on the methods of estimation. Topics presented are statistical and combined thermodynamic functions; free energy change and equilibrium conversions; and estimation of thermodynamic properties. The next discussions focus on the thermodynamic properties of simple polyatomic systems by statistical the

  16. Comparing Phylogenetic Trees by Matching Nodes Using the Transfer Distance Between Partitions.

    Science.gov (United States)

    Bogdanowicz, Damian; Giaro, Krzysztof

    2017-05-01

    The ability to quantify the dissimilarity of different phylogenetic trees describing the relationship between the same group of taxa is required in various types of phylogenetic studies. For example, such metrics are used to assess the quality of phylogeny construction methods, to define optimization criteria in supertree building algorithms, or to find horizontal gene transfer (HGT) events. Among the set of metrics described so far in the literature, the most commonly used seems to be the Robinson-Foulds distance. In this article, we define a new metric for rooted trees, the Matching Pair (MP) distance. The MP metric uses the concept of the minimum-weight perfect matching in a complete bipartite graph constructed from partitions of all pairs of leaves of the compared phylogenetic trees. We analyze the properties of the MP metric and present computational experiments showing its potential applicability in tasks related to finding the HGT events.
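
    As a point of reference for the baseline metric named above, the following sketch computes the Robinson-Foulds distance between two rooted trees as the number of clusters (leaf sets under internal nodes) found in exactly one tree. It does not implement the MP distance proposed in the article. The nested-tuple tree encoding and the function names are assumptions made for illustration, and some authors halve this count.

```python
def clusters(tree):
    """Return (leaf set of the subtree, clusters of all internal nodes in it)."""
    if isinstance(tree, str):          # a leaf is represented by its name
        return frozenset([tree]), set()
    leaves = frozenset()
    found = set()
    for child in tree:
        child_leaves, child_clusters = clusters(child)
        leaves |= child_leaves
        found |= child_clusters
    found.add(leaves)                  # the cluster induced by this internal node
    return leaves, found

def rf_distance(t1, t2):
    """Number of clusters present in exactly one of the two rooted trees."""
    leaves1, c1 = clusters(t1)
    leaves2, c2 = clusters(t2)
    if leaves1 != leaves2:
        raise ValueError("trees must be defined on the same leaf set")
    return len(c1 ^ c2)

t1 = ((("A", "B"), "C"), ("D", "E"))
t2 = ((("A", "C"), "B"), ("D", "E"))
print(rf_distance(t1, t2))  # 2: cluster {A, B} versus cluster {A, C}
```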

  17. A Group Contribution Method for Estimating Cetane and Octane Numbers

    Energy Technology Data Exchange (ETDEWEB)

    Kubic, William Louis [Los Alamos National Lab. (LANL), Los Alamos, NM (United States). Process Modeling and Analysis Group

    2016-07-28

    Much of the research on advanced biofuels is devoted to the study of novel chemical pathways for converting nonfood biomass into liquid fuels that can be blended with existing transportation fuels. Many compounds under consideration are not found in the existing fuel supplies. Often, the physical properties needed to assess the viability of a potential biofuel are not available. The only reliable information available may be the molecular structure. Group contribution methods for estimating physical properties from molecular structure have been used for more than 60 years. The most common application is estimation of thermodynamic properties. More recently, group contribution methods have been developed for estimating rate dependent properties including cetane and octane numbers. Often, published group contribution methods are limited in terms of types of functional groups and range of applicability. In this study, a new, broadly-applicable group contribution method based on an artificial neural network was developed to estimate the cetane number, research octane number, and motor octane number of hydrocarbons and oxygenated hydrocarbons. The new method is more accurate over a greater range of molecular weights and structural complexity than existing group contribution methods for estimating cetane and octane numbers.

  18. The Role of the Phylogenetic Diversity Measure, PD, in Bio-informatics: Getting the Definition Right

    Directory of Open Access Journals (Sweden)

    Daniel P. Faith

    2006-01-01

    A recent paper in this journal (Faith and Baker, 2006) described bio-informatics challenges in the application of the PD (phylogenetic diversity) measure of Faith (1992a), and highlighted the use of the root of the phylogenetic tree, as implied by the original definition of PD. A response paper (Crozier et al. 2006) stated that 1) the (Faith, 1992a) PD definition did not include the use of the root of the tree, and 2) Moritz and Faith (1998) changed the PD definition to include the root. Both characterizations are here refuted. Examples from Faith (1992a,b) document the link from the definition to the use of the root of the overall tree, and a survey of papers over the past 15 years by Faith and colleagues demonstrates that the stated PD definition has remained the same as that in the original 1992 study. PD’s estimation of biodiversity at the level of “feature diversity” is seen to have provided the original rationale for the measure’s consideration of the root of the phylogenetic tree.
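
    To make the role of the root concrete, the sketch below computes PD for a set of taxa as the total branch length on the union of the paths from each taxon up to the root, which is the root-inclusive reading the abstract defends. The toy tree, its branch lengths, and the function name are hypothetical.

```python
def phylogenetic_diversity(parent, taxa):
    """Sum of branch lengths on the union of paths from each taxon to the root."""
    visited = set()
    total = 0.0
    for taxon in taxa:
        node = taxon
        # climb until the root, or until a branch that was already counted
        while node in parent and node not in visited:
            visited.add(node)
            up, length = parent[node]
            total += length
            node = up
    return total

# Toy tree: child -> (parent, branch length). "root" has no entry.
parent = {
    "A": ("n1", 1.0),
    "B": ("n1", 1.0),
    "n1": ("root", 2.0),
    "C": ("root", 3.0),
}

print(phylogenetic_diversity(parent, ["A", "B"]))       # 4.0, includes the 2.0 branch to the root
print(phylogenetic_diversity(parent, ["A", "B", "C"]))  # 7.0, the whole tree
```

    In this toy tree, a definition that ignored the root would score the pair A and B at 2.0 (only the two terminal branches), so the choice of definition changes the value whenever the selected taxa do not span the root.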

  19. Structural Reliability Using Probability Density Estimation Methods Within NESSUS

    Science.gov (United States)

    Chamis, Chrisos C. (Technical Monitor); Godines, Cody Ric

    2003-01-01

    A reliability analysis studies a mathematical model of a physical system, taking into account uncertainties of design variables; common results are estimations of a response density, which also implies estimations of its parameters. Some common density parameters include the mean value, the standard deviation, and specific percentile(s) of the response, which are measures of central tendency, variation, and probability regions, respectively. Reliability analyses are important since the results can lead to different designs by calculating the probability of observing safe responses in each of the proposed designs. All of this is done at the expense of added computational time as compared to a single deterministic analysis which will result in one value of the response out of many that make up the density of the response. Sampling methods, such as Monte Carlo (MC) and Latin hypercube sampling (LHS), can be used to perform reliability analyses and can compute nonlinear response density parameters even if the response is dependent on many random variables. Hence, both methods are very robust; however, they are computationally expensive to use in the estimation of the response density parameters. These are 2 of the 13 stochastic methods that are contained within the Numerical Evaluation of Stochastic Structures Under Stress (NESSUS) program. NESSUS is a probabilistic finite element analysis (FEA) program that was developed through funding from NASA Glenn Research Center (GRC). It has the additional capability of being linked to other analysis programs; therefore, probabilistic fluid dynamics, fracture mechanics, and heat transfer are only a few of what is possible with this software. The LHS method is the newest addition to the stochastic methods within NESSUS. Part of this work was to enhance NESSUS with the LHS method. The new LHS module is complete, has been successfully integrated with NESSUS, and has been used to study four different test cases that have been
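
    The difference between the two sampling methods named in this record can be sketched generically. The example below is not NESSUS code and does not reflect its interface: it draws plain Monte Carlo points and Latin hypercube points on the unit square and estimates the mean of a made-up response function. The function names and the response are hypothetical.

```python
import random

def monte_carlo(n_samples, n_dims, rng):
    """Independent uniform points on the unit cube."""
    return [tuple(rng.random() for _ in range(n_dims)) for _ in range(n_samples)]

def latin_hypercube(n_samples, n_dims, rng):
    """One point per equal-probability stratum in every dimension."""
    columns = []
    for _ in range(n_dims):
        # one uniform draw inside each stratum, then shuffle the stratum order
        column = [(i + rng.random()) / n_samples for i in range(n_samples)]
        rng.shuffle(column)
        columns.append(column)
    return [tuple(col[i] for col in columns) for i in range(n_samples)]

def response(x):
    """Made-up nonlinear response of two uniform design variables."""
    return x[0] ** 2 + 3.0 * x[1]

rng = random.Random(1)
n = 200
mc_mean = sum(response(p) for p in monte_carlo(n, 2, rng)) / n
lhs_mean = sum(response(p) for p in latin_hypercube(n, 2, rng)) / n
exact_mean = 1.0 / 3.0 + 1.5
print(f"exact {exact_mean:.4f}  MC {mc_mean:.4f}  LHS {lhs_mean:.4f}")
```

    Stratifying every variable is what lets LHS cover each marginal distribution evenly with the same number of model evaluations.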

  20. Phylogenetic community structure: temporal variation in fish assemblage

    OpenAIRE

    Santorelli, Sergio; Magnusson, William; Ferreira, Efrem; Caramaschi, Erica; Zuanon, Jansen; Amadio, Sidnéia

    2014-01-01

    Hypotheses about phylogenetic relationships among species allow inferences about the mechanisms that affect species coexistence. Nevertheless, most studies assume that phylogenetic patterns identified are stable over time. We used data on monthly samples of fish from a single lake over 10 years to show that the structure in phylogenetic assemblages varies over time and conclusions depend heavily on the time scale investigated. The data set was organized in guild structures and temporal scales...

  1. Polytomy identification in microbial phylogenetic reconstruction

    Directory of Open Access Journals (Sweden)

    Lin Guan

    2011-12-01

    Background: A phylogenetic tree, showing ancestral relations among organisms, is commonly represented as a rooted tree with sets of bifurcating branches (dichotomies) for simplicity, although polytomies (multifurcating branches) may reflect more accurate evolutionary relationships. To represent the true evolutionary relationships, it is important to systematically identify the polytomies from a bifurcating tree and generate a taxonomy-compatible multifurcating tree. For this purpose we propose a novel approach, "PolyPhy", which would classify a set of bifurcating branches of a phylogenetic tree into a set of branches with dichotomies and polytomies by considering genome distances among genomes and tree topological properties. Results: PolyPhy employs a machine learning technique, a BLR (Bayesian logistic regression) classifier, to identify possible bifurcating subtrees as polytomies from the trees produced by ComPhy. Other than considering genome-scale distances between all pairs of species, PolyPhy also takes into account different properties of tree topology between dichotomy and polytomy, such as long-branch retraction and short-branch contraction, and quantifies these properties into comparable rates among different sub-branches. We extract three tree topological features, 'LR' (Leaf rate), 'IntraR' (Intra-subset branch rate) and 'InterR' (Inter-subset branch rate), all of which are calculated from bifurcating tree branch sets for classification. We have achieved an F-measure (balanced measure between precision and recall) of 81% with about 0.9 area under the curve (AUC) of ROC. Conclusions: PolyPhy is a fast and robust method to identify polytomies from phylogenetic trees based on genome-wide inference of evolutionary relationships among genomes. The software package and test data can be downloaded from http://digbio.missouri.edu/ComPhy/phyloTreeBiNonBi-1.0.zip.

  2. Potentials and limitations of histone repeat sequences for phylogenetic reconstruction of Sophophora.

    Science.gov (United States)

    Baldo, A M; Les, D H; Strausbaugh, L D

    1999-11-01

    Simplified DNA sequence acquisition has provided many new data sets that are useful for phylogenetic reconstruction, including single- and multiple-copy nuclear and organellar genes. Although transcribed regions receive much attention, nontranscribed regions have recently been added to the repertoire of sequences suitable for phylogenetic studies, especially for closely related taxa. We evaluated the efficacy of a small portion of the histone repeat for phylogenetic reconstruction among Drosophila species. Histone repeats in invertebrates offer distinct advantages similar to those of widely used ribosomal repeats. First, the units are tandemly repeated and undergo concerted evolution. Second, histone repeats include both highly conserved coding and variable intergenic regions. This composition facilitates application of "universal" primers spanning potentially informative sites. We examined a small region of the histone repeat, including the intergenic spacer segments of coding regions from the divergently transcribed H2A and H2B histone genes. The spacer (about 230 bp) exists as a mosaic with highly conserved functional motifs interspersed with rapidly diverging regions; the former aid in alignment of the spacer. There are no ambiguities in alignment of coding regions. Coding and noncoding regions were analyzed together and separately for phylogenetic information. Parsimony, distance, and maximum-likelihood methods successfully retrieve the corroborated phylogeny for the taxa examined. This study demonstrates the resolving power of a small histone region which may now be added to the growing collection of phylogenetically useful DNA sequences.

  3. Motion estimation using point cluster method and Kalman filter.

    Science.gov (United States)

    Senesh, M; Wolf, A

    2009-05-01

    The most frequently used method in a three dimensional human gait analysis involves placing markers on the skin of the analyzed segment. This introduces a significant artifact, which strongly influences the bone position and orientation and joint kinematic estimates. In this study, we tested and evaluated the effect of adding a Kalman filter procedure to the previously reported point cluster technique (PCT) in the estimation of a rigid body motion. We demonstrated the procedures by motion analysis of a compound planar pendulum from indirect opto-electronic measurements of markers attached to an elastic appendage that is restrained to slide along the rigid body long axis. The elastic frequency is close to the pendulum frequency, as in the biomechanical problem, where the soft tissue frequency content is similar to the actual movement of the bones. Comparison of the real pendulum angle to that obtained by several estimation procedures (PCT, Kalman filter followed by PCT, and low pass filter followed by PCT) enables evaluation of the accuracy of the procedures. When comparing the maximal amplitude, no effect was noted by adding the Kalman filter; however, a closer look at the signal revealed that the estimated angle based only on the PCT method was very noisy, with fluctuations, while the estimated angle based on the Kalman filter followed by the PCT was a smooth signal. It was also noted that the instantaneous frequencies obtained from the estimated angle based on the PCT method are more dispersed than those obtained from the estimated angle based on the Kalman filter followed by the PCT method. Addition of a Kalman filter to the PCT method in the estimation procedure of rigid body motion results in a smoother signal that better represents the real motion, with less signal distortion than when using a digital low pass filter. Furthermore, it can be concluded that adding a Kalman filter to the PCT procedure substantially reduces the dispersion of the maximal and minimal
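
    The smoothing effect described in this record can be illustrated with a deliberately simple filter. The sketch below is not the authors' procedure and involves no point cluster technique: it applies a scalar Kalman filter with a random-walk state model (an assumption made for brevity) to a simulated noisy pendulum angle. The signal, the noise level, and the tuning values are all invented.

```python
import math
import random

def kalman_smooth(measurements, process_var, measurement_var):
    """Scalar Kalman filter with a random-walk state model."""
    estimate = measurements[0]
    error_var = measurement_var
    filtered = [estimate]
    for z in measurements[1:]:
        # predict: state unchanged, uncertainty grows by the process variance
        error_var += process_var
        # update: blend prediction and measurement using the Kalman gain
        gain = error_var / (error_var + measurement_var)
        estimate += gain * (z - estimate)
        error_var *= 1.0 - gain
        filtered.append(estimate)
    return filtered

def roughness(signal):
    """Sum of squared successive differences, a crude measure of noisiness."""
    return sum((b - a) ** 2 for a, b in zip(signal, signal[1:]))

rng = random.Random(42)
dt, sigma = 0.01, 0.05
true_angle = [0.3 * math.cos(2 * math.pi * 0.5 * k * dt) for k in range(500)]
noisy_angle = [a + rng.gauss(0.0, sigma) for a in true_angle]
smoothed = kalman_smooth(noisy_angle, process_var=1e-4, measurement_var=sigma ** 2)

print("roughness raw     :", round(roughness(noisy_angle), 4))
print("roughness filtered:", round(roughness(smoothed), 4))
```

    A random-walk model trades some lag for smoothness, so a model that also tracks angular velocity would likely follow a swinging pendulum more faithfully; the process variance controls that trade-off.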

  4. Ghost-tree: creating hybrid-gene phylogenetic trees for diversity analyses.

    Science.gov (United States)

    Fouquier, Jennifer; Rideout, Jai Ram; Bolyen, Evan; Chase, John; Shiffer, Arron; McDonald, Daniel; Knight, Rob; Caporaso, J Gregory; Kelley, Scott T

    2016-02-24

    methods for larger effect sizes. The Silva/UNITE-based ghost tree presented here can be easily integrated into existing fungal analysis pipelines to enhance the resolution of fungal community differences and improve understanding of these communities in built environments. The ghost-tree software package can also be used to develop phylogenetic trees for other marker gene sets that afford different taxonomic resolution, or for bridging genome trees with amplicon trees. ghost-tree is pip-installable. All source code, documentation, and test code are available under the BSD license at https://github.com/JTFouquier/ghost-tree .

  5. Phylogenetic inference in Rafflesiales: the influence of rate heterogeneity and horizontal gene transfer

    Directory of Open Access Journals (Sweden)

    Vidal-Russell Romina

    2004-10-01

    Background: The phylogenetic relationships among the holoparasites of Rafflesiales have remained enigmatic for over a century. Recent molecular phylogenetic studies using the mitochondrial matR gene placed Rafflesia, Rhizanthes and Sapria (Rafflesiaceae s. str.) in the angiosperm order Malpighiales and Mitrastema (Mitrastemonaceae) in Ericales. These phylogenetic studies did not, however, sample two additional groups traditionally classified within Rafflesiales (Apodanthaceae and Cytinaceae). Here we provide molecular phylogenetic evidence using DNA sequence data from mitochondrial and nuclear genes for representatives of all genera in Rafflesiales. Results: Our analyses indicate that the phylogenetic affinities of the large-flowered clade and Mitrastema, ascertained using mitochondrial matR, are congruent with results from nuclear SSU rDNA when these data are analyzed using maximum likelihood and Bayesian methods. The relationship of Cytinaceae to Malvales was recovered in all analyses. Relationships between Apodanthaceae and photosynthetic angiosperms varied depending upon the data partition: Malvales (3-gene), Cucurbitales (matR) or Fabales (atp1). The latter incongruencies suggest that horizontal gene transfer (HGT) may be affecting the mitochondrial gene topologies. The lack of association between Mitrastema and Ericales using atp1 is suggestive of HGT, but greater sampling within eudicots is needed to test this hypothesis further. Conclusions: Rafflesiales are not monophyletic but composed of three or four independent lineages (families): Rafflesiaceae, Mitrastemonaceae, Apodanthaceae and Cytinaceae. Long-branch attraction appears to be misleading parsimony analyses of nuclear small-subunit rDNA data, but model-based methods (maximum likelihood and Bayesian analyses) recover a topology that is congruent with the mitochondrial matR gene tree, thus providing compelling evidence for organismal relationships. Horizontal gene transfer appears to

  6. Reconciling taxonomy and phylogenetic inference: formalism and algorithms for describing discord and inferring taxonomic roots

    Directory of Open Access Journals (Sweden)

    Matsen Frederick A

    2012-05-01

    Background: Although taxonomy is often used informally to evaluate the results of phylogenetic inference and the root of phylogenetic trees, algorithmic methods to do so are lacking. Results: In this paper we formalize these procedures and develop algorithms to solve the relevant problems. In particular, we introduce a new algorithm that solves a "subcoloring" problem to express the difference between a taxonomy and a phylogeny at a given rank. This algorithm improves upon the current best algorithm in terms of asymptotic complexity for the parameter regime of interest; we also describe a branch-and-bound algorithm that saves orders of magnitude in computation on real data sets. We also develop a formalism and an algorithm for rooting phylogenetic trees according to a taxonomy. Conclusions: The algorithms in this paper, and the associated freely-available software, will help biologists better use and understand taxonomically labeled phylogenetic trees.

  7. An Estimation Method for number of carrier frequency

    Directory of Open Access Journals (Sweden)

    Xiong Peng

    2015-01-01

    This paper proposes a method that uses AR model power spectrum estimation based on the Burg algorithm to estimate the number of carrier frequencies in a single pulse. In modern electronic and information warfare, the pulse signal form of radar is complex and changeable, and the single pulse with multiple carrier frequencies is the most typical form, such as the frequency shift keying (FSK) signal, the frequency shift keying with linear frequency modulation (FSK-LFM) hybrid modulation signal and the frequency shift keying with bi-phase shift keying (FSK-BPSK) hybrid modulation signal. For this kind of single pulse with multiple carrier frequencies, this paper fits an AR model to the complex signal and then computes the power spectrum based on the Burg algorithm to show the effect. Experimental results show that the estimation method can still determine the number of carrier frequencies accurately even when the signal-to-noise ratio (SNR) is very low.
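
    The general idea can be sketched as follows: fit an AR model with the Burg recursion, evaluate the AR power spectrum, and count its peaks. This is an illustrative sketch and not the paper's method: it uses a real-valued simulated FSK pulse with two carriers and mild noise, a fixed model order of 4 (each real carrier needs a pair of poles, so the order bounds how many carriers can be resolved), and a plain local-maximum count. All signal parameters and function names are assumptions.

```python
import cmath
import math
import random

def burg_ar(x, order):
    """Burg estimate of AR coefficients a (a[0] = 1) and prediction error power."""
    a = [1.0]
    power = sum(v * v for v in x) / len(x)
    f = list(x[1:])    # forward prediction errors
    b = list(x[:-1])   # backward prediction errors, delayed by one sample
    for m in range(1, order + 1):
        num = -2.0 * sum(fi * bi for fi, bi in zip(f, b))
        den = sum(fi * fi for fi in f) + sum(bi * bi for bi in b)
        k = num / den  # reflection coefficient
        ext = a + [0.0]
        a = [ext[i] + k * ext[m - i] for i in range(m + 1)]
        power *= 1.0 - k * k
        # update both error sequences, then drop one sample to keep them aligned
        f, b = ([fi + k * bi for fi, bi in zip(f, b)][1:],
                [bi + k * fi for fi, bi in zip(f, b)][:-1])
    return a, power

def ar_spectrum(a, power, freqs):
    """AR power spectrum at normalized frequencies (cycles per sample)."""
    out = []
    for fr in freqs:
        resp = sum(c * cmath.exp(-2j * math.pi * fr * k) for k, c in enumerate(a))
        out.append(power / abs(resp) ** 2)
    return out

def spectral_peaks(freqs, spectrum):
    """Frequencies of strict interior local maxima of the spectrum."""
    return [freqs[i] for i in range(1, len(freqs) - 1)
            if spectrum[i - 1] < spectrum[i] > spectrum[i + 1]]

# Simulated FSK pulse: 200 samples at each of two carriers, plus white noise.
rng = random.Random(7)
carriers = (0.10, 0.30)
pulse = [math.cos(2 * math.pi * fc * n) + rng.gauss(0.0, 0.05)
         for fc in carriers for n in range(200)]

freqs = [0.5 * i / 1000 for i in range(1001)]
a, power = burg_ar(pulse, order=4)
peaks = spectral_peaks(freqs, ar_spectrum(a, power, freqs))
print("estimated number of carriers:", len(peaks))
print("peak frequencies:", [round(p, 3) for p in peaks])
```

    With a higher model order, spurious low peaks from noise can appear, so a practical detector would likely need a height threshold or an order-selection rule, neither of which is shown here.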

  8. Phylogenetic relationships, character evolution, and taxonomic implications within the slipper lobsters (Crustacea: Decapoda: Scyllaridae).

    Science.gov (United States)

    Yang, Chien-Hui; Bracken-Grissom, Heather; Kim, Dohyup; Crandall, Keith A; Chan, Tin-Yam

    2012-01-01

    The slipper lobsters belong to the family Scyllaridae which contains a total of 20 genera and 89 species distributed across four subfamilies (Arctidinae, Ibacinae, Scyllarinae, and Theninae). We have collected nucleotide sequence data from regions of five different genes (16S, 18S, COI, 28S, H3) to estimate phylogenetic relationships among 54 species from the Scyllaridae with a focus on the species rich subfamily Scyllarinae. We have included in our analyses at least one representative from all 20 genera in the Scyllaridae and 35 of the 52 species within the Scyllarinae. Our resulting phylogenetic estimate shows the subfamilies are monophyletic, except for Ibacinae, which has paraphyletic relationships among genera. Many of the genera within the Scyllarinae form non-monophyletic groups, while the genera from all other subfamilies form well supported clades. We discuss the implications of this history on the evolution of morphological characters and ecological transitions (nearshore vs. offshore) within the slipper lobsters. Finally, we identify, through ancestral state character reconstructions, key morphological features diagnostic of the major clades of diversity within the Scyllaridae and relate this character evolution to current taxonomy and classification. Copyright © 2011 Elsevier Inc. All rights reserved.

  9. Species boundaries and phylogenetic relationships in the critically endangered Asian box turtle genus Cuora.

    Science.gov (United States)

    Spinks, Phillip Q; Thomson, Robert C; Zhang, YaPing; Che, Jing; Wu, Yonghua; Shaffer, H Bradley

    2012-06-01

    Turtles are currently the most endangered major clade of vertebrates on earth, and Asian box turtles (Cuora) are in catastrophic decline. Effective management of this diverse turtle clade has been hampered by human-mediated, and perhaps natural hybridization, resulting in discordance between mitochondrial and nuclear markers and confusion regarding species boundaries and phylogenetic relationships among hypothesized species of Cuora. Here, we present analyses of mitochondrial and nuclear DNA data for all 12 currently hypothesized species to resolve both species boundaries and phylogenetic relationships. Our 15-gene, 40-individual nuclear data set was frequently in conflict with our mitochondrial data set; based on its general concordance with published morphological analyses and the strength of 15 independent estimates of evolutionary history, we interpret the nuclear data as representing the most reliable estimate of species boundaries and phylogeny of Cuora. Our results strongly reiterate the necessity of using multiple nuclear markers for phylogeny and species delimitation in these animals, including any form of DNA "barcoding", and point to Cuora as an important case study where reliance on mitochondrial DNA can lead to incorrect species identification. Copyright © 2012 Elsevier Inc. All rights reserved.

  10. Hydrological model uncertainty due to spatial evapotranspiration estimation methods

    Science.gov (United States)

    Yu, Xuan; Lamačová, Anna; Duffy, Christopher; Krám, Pavel; Hruška, Jakub

    2016-05-01

    Evapotranspiration (ET) continues to be a difficult process to estimate in seasonal and long-term water balances in catchment models. Approaches to estimate ET typically use vegetation parameters (e.g., leaf area index [LAI], interception capacity) obtained from field observation, remote sensing data, national or global land cover products, and/or simulated by ecosystem models. In this study we attempt to quantify the uncertainty that spatial evapotranspiration estimation introduces into hydrological simulations when the age of the forest is not precisely known. The Penn State Integrated Hydrologic Model (PIHM) was implemented for the Lysina headwater catchment, located at 50°03′N, 12°40′E in the western part of the Czech Republic. The spatial forest patterns were digitized from forest age maps made available by the Czech Forest Administration. Two ET methods were implemented in the catchment model: the Biome-BGC forest growth sub-model (1-way coupled to PIHM) and the fixed-seasonal LAI method. From these two approaches simulation scenarios were developed. We combined the estimated spatial forest age maps and two ET estimation methods to drive PIHM. A set of spatial hydrologic regime and streamflow regime indices were calculated from the modeling results for each method. Intercomparison of the hydrological responses to the spatial vegetation patterns suggested considerable variation in soil moisture and recharge and a small uncertainty in the groundwater table elevation and streamflow. The hydrologic modeling with ET estimated by Biome-BGC generated less uncertainty due to the plant physiology-based method. The implication of this research is that overall hydrologic variability induced by uncertain management practices was reduced by implementing vegetation models in the catchment models.

  11. Virulence, serotype and phylogenetic groups of diarrhoeagenic ...

    African Journals Online (AJOL)

    Dr DADIE Thomas

    2014-02-17

    The virulence, serotype and phylogenetic traits of diarrhoeagenic Escherichia coli were detected in 502 strains isolated during digestive infections. Molecular detection of the target virulence genes, rfb gene of operon O and phylogenetic grouping genes chuA, yjaA and TSPE4.C2 was performed.
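
    The three grouping markers named above are the ones used in the triplex PCR scheme of Clermont et al. (2000). The abstract does not spell out the assignment rule, so the sketch below assumes that widely used decision tree; the function name and argument names are hypothetical.

```python
def clermont_group(chua: bool, yjaa: bool, tspe4c2: bool) -> str:
    """Assign an E. coli phylogenetic group from triplex PCR results.

    Assumes the classic chuA / yjaA / TSPE4.C2 decision tree; each argument
    is True when the corresponding amplicon was detected.
    """
    if chua:
        # chuA-positive strains fall in B2 or D, split by yjaA
        return "B2" if yjaa else "D"
    # chuA-negative strains fall in B1 or A, split by TSPE4.C2
    return "B1" if tspe4c2 else "A"


if __name__ == "__main__":
    for profile in [(True, True, False), (True, False, True),
                    (False, False, True), (False, True, False)]:
        print(profile, "->", clermont_group(*profile))
```

    Revised versions of the scheme add further markers and groups, so this covers only the four-group assignment implied by the three markers listed.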

  12. False discovery rate control incorporating phylogenetic tree increases detection power in microbiome-wide multiple testing.

    Science.gov (United States)

    Xiao, Jian; Cao, Hongyuan; Chen, Jun

    2017-09-15

    Next generation sequencing technologies have enabled the study of the human microbiome through direct sequencing of microbial DNA, resulting in an enormous amount of microbiome sequencing data. One unique characteristic of microbiome data is the phylogenetic tree that relates all the bacterial species. Closely related bacterial species have a tendency to exhibit a similar relationship with the environment or disease. Thus, incorporating the phylogenetic tree information can potentially improve the detection power for microbiome-wide association studies, where hundreds or thousands of tests are conducted simultaneously to identify bacterial species associated with a phenotype of interest. Despite much progress in multiple testing procedures such as false discovery rate (FDR) control, methods that take into account the phylogenetic tree are largely limited. We propose a new FDR control procedure that incorporates the prior structure information and apply it to microbiome data. The proposed procedure is based on a hierarchical model, where a structure-based prior distribution is designed to utilize the phylogenetic tree. By borrowing information from neighboring bacterial species, we are able to improve the statistical power of detecting associated bacterial species while controlling the FDR at desired levels. When the phylogenetic tree is mis-specified or non-informative, our procedure achieves a similar power as traditional procedures that do not take into account the tree structure. We demonstrate the performance of our method through extensive simulations and real microbiome datasets. We identified far more alcohol-drinking associated bacterial species than traditional methods. R package StructFDR is available from CRAN. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved.
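
    For orientation, the "traditional procedures" that ignore the tree can be illustrated with the Benjamini-Hochberg step-up rule. The sketch below is that standard baseline only, not the tree-based procedure in StructFDR, and the function name is hypothetical.

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Return a list of booleans: True where the hypothesis is rejected.

    Standard Benjamini-Hochberg step-up rule: find the largest rank k with
    p_(k) <= alpha * k / m and reject the k smallest p-values.
    """
    m = len(pvalues)
    # indices of the p-values in ascending order
    order = sorted(range(m), key=lambda i: pvalues[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= alpha * rank / m:
            k_max = rank
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


if __name__ == "__main__":
    p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
    print(benjamini_hochberg(p, alpha=0.05))
```

    A structure-aware method would replace the flat threshold with one informed by neighboring taxa on the tree; that part is deliberately left out here.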

  13. Consumptive use of upland rice as estimated by different methods

    International Nuclear Information System (INIS)

    Chhabda, P.R.; Varade, S.B.

    1985-01-01

    The consumptive use of upland rice (Oryza sativa Linn.) grown during the wet season (kharif) as estimated by modified Penman, radiation, pan-evaporation and Hargreaves methods showed a variation from computed consumptive use estimated by the gravimetric method. The variability increased with an increase in the irrigation interval, and decreased with an increase in the level of N applied. The average variability was smallest for the pan-evaporation method, which could reliably be used for estimating water requirement of upland rice if percolation losses are considered.
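
    As a rough illustration of the pan-evaporation approach, the sketch below assumes the usual FAO-style relation in which reference evapotranspiration is pan evaporation times a pan coefficient, and crop consumptive use is that value times a crop coefficient. Percolation is added separately, as the abstract recommends. The function name and all coefficient values are hypothetical and are not taken from the study.

```python
def crop_water_requirement(e_pan_mm, pan_coefficient, crop_coefficient,
                           percolation_mm=0.0):
    """Estimate daily water requirement (mm) from pan evaporation.

    reference ET = pan_coefficient * e_pan_mm
    crop ET      = crop_coefficient * reference ET
    requirement  = crop ET + percolation losses
    """
    if e_pan_mm < 0 or percolation_mm < 0:
        raise ValueError("evaporation and percolation must be non-negative")
    reference_et = pan_coefficient * e_pan_mm
    crop_et = crop_coefficient * reference_et
    return crop_et + percolation_mm


if __name__ == "__main__":
    # Hypothetical inputs: 6 mm/day pan evaporation, Kp = 0.5, Kc = 1.0,
    # 2 mm/day percolation loss.
    print(crop_water_requirement(6.0, 0.5, 1.0, percolation_mm=2.0))
```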

  14. Phylogenetic molecular function annotation

    International Nuclear Information System (INIS)

    Engelhardt, Barbara E; Jordan, Michael I; Repo, Susanna T; Brenner, Steven E

    2009-01-01

    It is now easier to discover thousands of protein sequences in a new microbial genome than it is to biochemically characterize the specific activity of a single protein of unknown function. The molecular functions of protein sequences have typically been predicted using homology-based computational methods, which rely on the principle that homologous proteins share a similar function. However, some protein families include groups of proteins with different molecular functions. A phylogenetic approach for predicting molecular function (sometimes called 'phylogenomics') is an effective means to predict protein molecular function. These methods incorporate functional evidence from all members of a family that have functional characterizations using the evolutionary history of the protein family to make robust predictions for the uncharacterized proteins. However, they are often difficult to apply on a genome-wide scale because of the time-consuming step of reconstructing the phylogenies of each protein to be annotated. Our automated approach for function annotation using phylogeny, the SIFTER (Statistical Inference of Function Through Evolutionary Relationships) methodology, uses a statistical graphical model to compute the probabilities of molecular functions for unannotated proteins. Our benchmark tests showed that SIFTER provides accurate functional predictions on various protein families, outperforming other available methods.

  15. The influence of molecular markers and methods on inferring the phylogenetic relationships between the representatives of the Arini (parrots, Psittaciformes), determined on the basis of their complete mitochondrial genomes.

    Science.gov (United States)

    Urantowka, Adam Dawid; Kroczak, Aleksandra; Mackiewicz, Paweł

    2017-07-14

    Conures are a morphologically diverse group of Neotropical parrots classified as members of the tribe Arini, which has recently been subjected to a taxonomic revision. The previously broadly defined Aratinga genus of this tribe has been split into the 'true' Aratinga and three additional genera, Eupsittula, Psittacara and Thectocercus. Popular markers used in the reconstruction of the parrots' phylogenies derive from mitochondrial DNA. However, current phylogenetic analyses seem to indicate conflicting relationships between Aratinga and other conures, and also among other Arini members. Therefore, it is not clear if the mtDNA phylogenies can reliably define the species tree. The inconsistencies may result from the variable evolution rate of the markers used or their weak phylogenetic signal. To resolve these controversies and to assess to what extent the phylogenetic relationships in the tribe Arini can be inferred from mitochondrial genomes, we compared representative Arini mitogenomes as well as examined the usefulness of the individual mitochondrial markers and the efficiency of various phylogenetic methods. Single molecular markers produced inconsistent tree topologies, while different methods offered various topologies even for the same marker. A significant disagreement in these tree topologies occurred for cytb, nd2 and nd6 genes, which are commonly used in parrot phylogenies. The strongest phylogenetic signal was found in the control region and RNA genes. However, these markers cannot be used alone in inferring Arini phylogenies because they do not provide fully resolved trees. The most reliable phylogeny of the parrots under study is obtained only on the concatenated set of all mitochondrial markers. The analyses established significantly resolved relationships within the former Aratinga representatives and the main genera of the tribe Arini. 
Such mtDNA phylogeny can be in agreement with the species tree, owing to its match with synapomorphic features in

  16. Methods for estimating low-flow statistics for Massachusetts streams

    Science.gov (United States)

    Ries, Kernell G.; Friesz, Paul J.

    2000-01-01

    Methods and computer software are described in this report for determining flow duration, low-flow frequency statistics, and August median flows. These low-flow statistics can be estimated for unregulated streams in Massachusetts using different methods depending on whether the location of interest is at a streamgaging station, a low-flow partial-record station, or an ungaged site where no data are available. Low-flow statistics for streamgaging stations can be estimated using standard U.S. Geological Survey methods described in the report. The MOVE.1 mathematical method and a graphical correlation method can be used to estimate low-flow statistics for low-flow partial-record stations. The MOVE.1 method is recommended when the relation between measured flows at a partial-record station and daily mean flows at a nearby, hydrologically similar streamgaging station is linear, and the graphical method is recommended when the relation is curved. Equations are presented for computing the variance and equivalent years of record for estimates of low-flow statistics for low-flow partial-record stations when either a single or multiple index stations are used to determine the estimates. The drainage-area ratio method or regression equations can be used to estimate low-flow statistics for ungaged sites where no data are available. The drainage-area ratio method is generally as accurate as or more accurate than regression estimates when the drainage-area ratio for an ungaged site is between 0.3 and 1.5 times the drainage area of the index data-collection site. Regression equations were developed to estimate the natural, long-term 99-, 98-, 95-, 90-, 85-, 80-, 75-, 70-, 60-, and 50-percent duration flows; the 7-day, 2-year and the 7-day, 10-year low flows; and the August median flow for ungaged sites in Massachusetts. Streamflow statistics and basin characteristics for 87 to 133 streamgaging stations and low-flow partial-record stations were used to develop the equations. 
The
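
    The drainage-area ratio method mentioned above can be sketched as follows, assuming the simplest form in which flow scales linearly with drainage area; the report itself may specify a different exponent or units. The 0.3 to 1.5 range comes from the abstract, while the function name is hypothetical.

```python
def drainage_area_ratio_estimate(q_index, area_index, area_ungaged):
    """Scale a low-flow statistic from an index site to an ungaged site.

    Returns (estimate, within_recommended_range), where the range check is
    0.3 <= area_ungaged / area_index <= 1.5 as quoted in the abstract.
    Assumes flow is proportional to drainage area (exponent of 1).
    """
    if area_index <= 0 or area_ungaged <= 0:
        raise ValueError("drainage areas must be positive")
    ratio = area_ungaged / area_index
    within_range = 0.3 <= ratio <= 1.5
    return q_index * ratio, within_range


if __name__ == "__main__":
    # Hypothetical numbers: index-site statistic of 10 flow units on a
    # 50-unit drainage area, ungaged site draining 25 units.
    print(drainage_area_ratio_estimate(10.0, 50.0, 25.0))
```

    Outside the recommended range the flag is False, which is the case where the abstract points to regression equations instead.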

  17. Comparing Methods for Estimating Direct Costs of Adverse Drug Events.

    Science.gov (United States)

    Gyllensten, Hanna; Jönsson, Anna K; Hakkarainen, Katja M; Svensson, Staffan; Hägg, Staffan; Rehnberg, Clas

    2017-12-01

    To estimate how direct health care costs resulting from adverse drug events (ADEs) and cost distribution are affected by methodological decisions regarding identification of ADEs, assigning relevant resource use to ADEs, and estimating costs for the assigned resources. ADEs were identified from medical records and diagnostic codes for a random sample of 4970 Swedish adults during a 3-month study period in 2008 and were assessed for causality. Results were compared for five cost evaluation methods, including different methods for identifying ADEs, assigning resource use to ADEs, and for estimating costs for the assigned resources (resource use method, proportion of registered cost method, unit cost method, diagnostic code method, and main diagnosis method). Different levels of causality for ADEs and ADEs' contribution to health care resource use were considered. Using the five methods, the maximum estimated overall direct health care costs resulting from ADEs ranged from Sk10,000 (Sk = Swedish krona; ~€1,500 in 2016 values) using the diagnostic code method to more than Sk3,000,000 (~€414,000) using the unit cost method in our study population. The most conservative definitions for ADEs' contribution to health care resource use and the causality of ADEs resulted in average costs per patient ranging from Sk0 using the diagnostic code method to Sk4066 (~€500) using the unit cost method. The estimated costs resulting from ADEs varied considerably depending on the methodological choices. The results indicate that costs for ADEs need to be identified through medical record review and by using detailed unit cost data. Copyright © 2017 International Society for Pharmacoeconomics and Outcomes Research (ISPOR). Published by Elsevier Inc. All rights reserved.

  18. Phase difference estimation method based on data extension and Hilbert transform

    International Nuclear Information System (INIS)

    Shen, Yan-lin; Tu, Ya-qing; Chen, Lin-jun; Shen, Ting-ao

    2015-01-01

    To improve the precision and anti-interference performance of phase difference estimation for non-integer periods of sampling signals, a phase difference estimation method based on data extension and Hilbert transform is proposed. Estimated phase difference is obtained by means of data extension, Hilbert transform, cross-correlation, auto-correlation, and weighted phase average. Theoretical analysis shows that the proposed method suppresses the end effects of Hilbert transform effectively. The results of simulations and field experiments demonstrate that the proposed method improves the anti-interference performance of phase difference estimation and has better performance of phase difference estimation than the correlation, Hilbert transform, and data extension-based correlation methods, which contribute to improving the measurement precision of the Coriolis mass flowmeter. (paper)
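
    The plain Hilbert-transform approach, one of the baselines the abstract compares against, can be sketched with the standard library alone. This is not the proposed data-extension method: it builds the analytic signal through a direct DFT and reads the phase difference from the summed product of one signal with the conjugate of the other. All function names are hypothetical, and the O(n^2) DFT is only meant for short records.

```python
import cmath
import math


def dft(x, inverse=False):
    """Direct O(n^2) discrete Fourier transform (inverse includes the 1/n)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
                  for t in range(n))
        out.append(acc / n if inverse else acc)
    return out


def analytic_signal(x):
    """Return x + i*H[x] by zeroing the negative-frequency DFT bins."""
    n = len(x)
    spectrum = dft(x)
    for k in range(n):
        if k == 0 or (n % 2 == 0 and k == n // 2):
            gain = 1          # DC and Nyquist bins are kept once
        elif k <= (n - 1) // 2:
            gain = 2          # positive frequencies are doubled
        else:
            gain = 0          # negative frequencies are removed
        spectrum[k] *= gain
    return dft(spectrum, inverse=True)


def phase_difference(x1, x2):
    """Estimate phase(x1) - phase(x2) in radians for same-frequency signals."""
    z1, z2 = analytic_signal(x1), analytic_signal(x2)
    return cmath.phase(sum(a * b.conjugate() for a, b in zip(z1, z2)))


if __name__ == "__main__":
    n, cycles, true_shift = 64, 4, 0.3
    s1 = [math.sin(2 * math.pi * cycles * t / n + true_shift) for t in range(n)]
    s2 = [math.sin(2 * math.pi * cycles * t / n) for t in range(n)]
    print(phase_difference(s1, s2))
```

    The demo samples an integer number of periods, where this baseline behaves well. With a non-integer number of periods the end effects of the transform bias the estimate, which is the situation the abstract's data-extension step is meant to address.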

  19. Gray bootstrap method for estimating frequency-varying random vibration signals with small samples

    Directory of Open Access Journals (Sweden)

    Wang Yanqing

    2014-04-01

    During environment testing, the estimation of random vibration signals (RVS) is an important technique for the airborne platform safety and reliability. However, the available methods, including the extreme value envelope method (EVEM), statistical tolerances method (STM) and improved statistical tolerance method (ISTM), require large samples and a typical probability distribution. Moreover, the frequency-varying characteristic of RVS is usually not taken into account. The gray bootstrap method (GBM) is proposed to solve the problem of estimating frequency-varying RVS with small samples. Firstly, the estimated indexes are obtained, including the estimated interval, the estimated uncertainty, the estimated value, the estimated error and the estimated reliability. In addition, GBM is applied to estimating the single flight testing of a certain aircraft. Finally, in order to evaluate the estimation performance, GBM is compared with the bootstrap method (BM) and gray method (GM) in testing analysis. The result shows that GBM is superior for estimating dynamic signals with small samples, and the estimated reliability is shown to be 100% at the given confidence level.

  20. Simple method for the estimation of glomerular filtration rate

    Energy Technology Data Exchange (ETDEWEB)

    Groth, T [Group for Biomedical Informatics, Uppsala Univ. Data Center, Uppsala (Sweden); Tengstroem, B [District General Hospital, Skoevde (Sweden)

    1977-02-01

    A simple method is presented for indirect estimation of the glomerular filtration rate from two venous blood samples, drawn after a single injection of a small dose of (¹²⁵I)sodium iothalamate (10 µCi). The method does not require exact dosage, as the first sample, taken a few minutes (t = 5 min) after injection, is used to normalize the value of the second sample, which should be taken between 2 and 4 h after injection. The glomerular filtration rate, as measured by standard inulin clearance, may then be predicted from the logarithm of the normalized value and linear regression formulas with a standard error of estimate of the order of 1 to 2 ml/min/1.73 m². The slope-intercept method for direct estimation of glomerular filtration rate is also evaluated and found to significantly underestimate standard inulin clearance. The normalized 'single-point' method is concluded to be superior to the slope-intercept method and more sophisticated methods using curve fitting technique, with regard to predictive force and clinical applicability.

  1. Phylogenetic patterns of extinction risk in the eastern arc ecosystems, an African biodiversity hotspot.

    Science.gov (United States)

    Yessoufou, Kowiyou; Daru, Barnabas H; Davies, T Jonathan

    2012-01-01

    There is an urgent need to reduce drastically the rate at which biodiversity is declining worldwide. Phylogenetic methods are increasingly being recognised as providing a useful framework for predicting future losses, and guiding efforts for pre-emptive conservation actions. In this study, we used a reconstructed phylogenetic tree of angiosperm species of the Eastern Arc Mountains - an important African biodiversity hotspot - and described the distribution of extinction risk across taxonomic ranks and phylogeny. We provide evidence for both taxonomic and phylogenetic selectivity in extinction risk. However, we found that selectivity varies with IUCN extinction risk category. Vulnerable species are more closely related than expected by chance, whereas endangered and critically endangered species are not significantly clustered on the phylogeny. We suggest that the general observation for taxonomic and phylogenetic selectivity (i.e. phylogenetic signal, the tendency of closely related species to share similar traits) in extinction risks is therefore largely driven by vulnerable species, and not necessarily the most highly threatened. We also used information on altitudinal distribution and climate to generate a predictive model of at-risk species richness, and found that greater threatened species richness is found at higher altitude, allowing for more informed conservation decision making. Our results indicate that evolutionary history can help predict plant susceptibility to extinction threats in the hyper-diverse but woefully-understudied Eastern Arc Mountains, and illustrate the contribution of phylogenetic approaches in conserving African floristic biodiversity where detailed ecological and evolutionary data are often lacking.

  2. Phylogenetic patterns of extinction risk in the eastern arc ecosystems, an African biodiversity hotspot.

    Directory of Open Access Journals (Sweden)

    Kowiyou Yessoufou

    There is an urgent need to reduce drastically the rate at which biodiversity is declining worldwide. Phylogenetic methods are increasingly being recognised as providing a useful framework for predicting future losses, and guiding efforts for pre-emptive conservation actions. In this study, we used a reconstructed phylogenetic tree of angiosperm species of the Eastern Arc Mountains - an important African biodiversity hotspot - and described the distribution of extinction risk across taxonomic ranks and phylogeny. We provide evidence for both taxonomic and phylogenetic selectivity in extinction risk. However, we found that selectivity varies with IUCN extinction risk category. Vulnerable species are more closely related than expected by chance, whereas endangered and critically endangered species are not significantly clustered on the phylogeny. We suggest that the general observation for taxonomic and phylogenetic selectivity (i.e. phylogenetic signal, the tendency of closely related species to share similar traits) in extinction risks is therefore largely driven by vulnerable species, and not necessarily the most highly threatened. We also used information on altitudinal distribution and climate to generate a predictive model of at-risk species richness, and found that greater threatened species richness is found at higher altitude, allowing for more informed conservation decision making. Our results indicate that evolutionary history can help predict plant susceptibility to extinction threats in the hyper-diverse but woefully-understudied Eastern Arc Mountains, and illustrate the contribution of phylogenetic approaches in conserving African floristic biodiversity where detailed ecological and evolutionary data are often lacking.

  3. The performance of the Congruence Among Distance Matrices (CADM) test in phylogenetic analysis

    Science.gov (United States)

    2011-01-01

    Background: CADM is a statistical test used to estimate the level of Congruence Among Distance Matrices. It has been shown in previous studies to have a correct rate of type I error and good power when applied to dissimilarity matrices and to ultrametric distance matrices. Contrary to most other tests of incongruence used in phylogenetic analysis, the null hypothesis of the CADM test assumes complete incongruence of the phylogenetic trees instead of congruence. In this study, we performed computer simulations to assess the type I error rate and power of the test. It was applied to additive distance matrices representing phylogenies and to genetic distance matrices obtained from nucleotide sequences of different lengths that were simulated on randomly generated trees of varying sizes, and under different evolutionary conditions. Results: Our results showed that the test has an accurate type I error rate and good power. As expected, power increased with the number of objects (i.e., taxa), the number of partially or completely congruent matrices and the level of congruence among distance matrices. Conclusions: Based on our results, we suggest that CADM is an excellent candidate to test for congruence and, when present, to estimate its level in phylogenomic studies where numerous genes are analysed simultaneously. PMID:21388552
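
    The abstract does not give the test statistic. The sketch below assumes, following the published description of CADM, that congruence is measured by Kendall's coefficient of concordance W computed on the rank-transformed upper triangles of the distance matrices, and that significance is assessed by permuting the objects of each matrix independently. No tie correction is applied, and all function names are hypothetical.

```python
import random
from itertools import combinations


def unfold(matrix):
    """Upper-triangle distances of a symmetric matrix as a flat list."""
    n = len(matrix)
    return [matrix[i][j] for i, j in combinations(range(n), 2)]


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in order[start:end + 1]:
            ranks[k] = mean_rank
        start = end + 1
    return ranks


def kendall_w(matrices):
    """Kendall's W among unfolded, ranked distance matrices (no tie correction)."""
    ranked = [average_ranks(unfold(m)) for m in matrices]
    p, m = len(ranked), len(ranked[0])
    sums = [sum(r[k] for r in ranked) for k in range(m)]
    mean_sum = p * (m + 1) / 2
    s = sum((x - mean_sum) ** 2 for x in sums)
    return 12 * s / (p ** 2 * (m ** 3 - m))


def permute_objects(matrix, rng):
    """Apply one random relabelling to rows and columns together."""
    n = len(matrix)
    perm = list(range(n))
    rng.shuffle(perm)
    return [[matrix[perm[i]][perm[j]] for j in range(n)] for i in range(n)]


def cadm_permutation_test(matrices, n_perm=999, seed=0):
    """Return (observed W, permutation p-value) under H0 of incongruence."""
    rng = random.Random(seed)
    observed = kendall_w(matrices)
    hits = 1  # the observed value counts as one outcome under H0
    for _ in range(n_perm):
        shuffled = [permute_objects(m, rng) for m in matrices]
        if kendall_w(shuffled) >= observed - 1e-12:
            hits += 1
    return observed, hits / (n_perm + 1)


if __name__ == "__main__":
    d = [[0, 1, 2, 3], [1, 0, 4, 5], [2, 4, 0, 6], [3, 5, 6, 0]]
    print(cadm_permutation_test([d, d]))
```

    Two identical matrices give W = 1, the fully congruent extreme. Because the null hypothesis is incongruence, a small p-value here is evidence for congruence, matching the abstract's description.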

  4. The performance of the Congruence Among Distance Matrices (CADM) test in phylogenetic analysis

    Directory of Open Access Journals (Sweden)

    Lapointe François-Joseph

    2011-03-01

    Background: CADM is a statistical test used to estimate the level of Congruence Among Distance Matrices. It has been shown in previous studies to have a correct rate of type I error and good power when applied to dissimilarity matrices and to ultrametric distance matrices. Contrary to most other tests of incongruence used in phylogenetic analysis, the null hypothesis of the CADM test assumes complete incongruence of the phylogenetic trees instead of congruence. In this study, we performed computer simulations to assess the type I error rate and power of the test. It was applied to additive distance matrices representing phylogenies and to genetic distance matrices obtained from nucleotide sequences of different lengths that were simulated on randomly generated trees of varying sizes, and under different evolutionary conditions. Results: Our results showed that the test has an accurate type I error rate and good power. As expected, power increased with the number of objects (i.e., taxa), the number of partially or completely congruent matrices and the level of congruence among distance matrices. Conclusions: Based on our results, we suggest that CADM is an excellent candidate to test for congruence and, when present, to estimate its level in phylogenomic studies where numerous genes are analysed simultaneously.

  5. Potential pitfalls of modelling ribosomal RNA data in phylogenetic tree reconstruction: evidence from case studies in the Metazoa.

    Science.gov (United States)

    Letsch, Harald O; Kjer, Karl M

    2011-05-27

    Failure to account for covariation patterns in helical regions of ribosomal RNA (rRNA) genes has the potential to misdirect the estimation of the phylogenetic signal of the data. Furthermore, the extremes of length variation among taxa, combined with regional substitution rate variation can mislead the alignment of rRNA sequences and thus distort subsequent tree reconstructions. However, recent developments in phylogenetic methodology now allow a comprehensive integration of secondary structures in alignment and tree reconstruction analyses based on rRNA sequences, which has been shown to correct some of these problems. Here, we explore the potentials of RNA substitution models and the interactions of specific model setups with the inherent pattern of covariation in rRNA stems and substitution rate variation among loop regions. We found an explicit impact of RNA substitution models on tree reconstruction analyses. The application of specific RNA models in tree reconstructions is hampered by interaction between the appropriate modelling of covarying sites in stem regions, and excessive homoplasy in some loop regions. RNA models often failed to recover reasonable trees when single-stranded regions are excessively homoplastic, because these regions contribute a greater proportion of the data when covarying sites are essentially downweighted. In this context, the RNA6A model outperformed all other models, including the more parametrized RNA7 and RNA16 models. Our results depict a trade-off between increased accuracy in estimation of interdependencies in helical regions with the risk of magnifying positions lacking phylogenetic signal. We can therefore conclude that caution is warranted when applying rRNA covariation models, and suggest that loop regions be independently screened for phylogenetic signal, and eliminated when they are indistinguishable from random noise. 
In addition to covariation and homoplasy, other factors, like non-stationarity of substitution rates

  6. Phylogenetics links monster larva to deep-sea shrimp.

    Science.gov (United States)

    Bracken-Grissom, Heather D; Felder, Darryl L; Vollmer, Nicole L; Martin, Joel W; Crandall, Keith A

    2012-10-01

    Mid-water plankton collections commonly include bizarre and mysterious developmental stages that differ conspicuously from their adult counterparts in morphology and habitat. Unaware of the existence of planktonic larval stages, early zoologists often misidentified these unique morphologies as independent adult lineages. Many such mistakes have since been corrected by collecting larvae, raising them in the lab, and identifying the adult forms. However, challenges arise when the larva is remarkably rare in nature and relatively inaccessible due to its changing habitats over the course of ontogeny. The mid-water marine species Cerataspis monstrosa (Gray 1828) is an armored crustacean larva whose adult identity has remained a mystery for over 180 years. Our phylogenetic analyses, based in part on recent collections from the Gulf of Mexico, provide definitive evidence that the rare, yet broadly distributed larva, C. monstrosa, is an early developmental stage of the globally distributed deepwater aristeid shrimp, Plesiopenaeus armatus. Divergence estimates and phylogenetic relationships across five genes confirm the larva and adult are the same species. Our work demonstrates the diagnostic power of molecular systematics in instances where larval rearing seldom succeeds and morphology and habitat are not indicative of identity. Larval-adult linkages not only aid in our understanding of biodiversity, they provide insights into the life history, distribution, and ecology of an organism.

  7. Comparison of methods for estimating premorbid intelligence

    OpenAIRE

    Bright, Peter; van der Linde, Ian

    2018-01-01

    To evaluate impact of neurological injury on cognitive performance it is typically necessary to derive a baseline (or ‘premorbid’) estimate of a patient’s general cognitive ability prior to the onset of impairment. In this paper, we consider a range of common methods for producing this estimate, including those based on current best performance, embedded ‘hold/no hold’ tests, demographic information, and word reading ability. Ninety-two neurologically healthy adult participants were assessed ...

  8. Competitive interactions between forest trees are driven by species' trait hierarchy, not phylogenetic or functional similarity: implications for forest community assembly.

    Science.gov (United States)

    Kunstler, Georges; Lavergne, Sébastien; Courbaud, Benoît; Thuiller, Wilfried; Vieilledent, Ghislain; Zimmermann, Niklaus E; Kattge, Jens; Coomes, David A

    2012-08-01

    The relative importance of competition vs. environmental filtering in the assembly of communities is commonly inferred from their functional and phylogenetic structure, on the grounds that similar species compete most strongly for resources and are therefore less likely to coexist locally. This approach ignores the possibility that competitive effects can be determined by relative positions of species on a hierarchy of competitive ability. Using growth data, we estimated 275 interaction coefficients between tree species in the French mountains. We show that interaction strengths are mainly driven by trait hierarchy and not by functional or phylogenetic similarity. On the basis of this result, we thus propose that functional and phylogenetic convergence in local tree community might be due to competition-sorting species with different competitive abilities and not only environmental filtering as commonly assumed. We then show a functional and phylogenetic convergence of forest structure with increasing plot age, which supports this view. © 2012 Blackwell Publishing Ltd/CNRS.

  9. A numerical integration-based yield estimation method for integrated circuits

    International Nuclear Information System (INIS)

    Liang Tao; Jia Xinzhang

    2011-01-01

    A novel integration-based yield estimation method is developed for yield optimization of integrated circuits. This method tries to integrate the joint probability density function on the acceptability region directly. To achieve this goal, the simulated performance data of unknown distribution should be converted to follow a multivariate normal distribution by using Box-Cox transformation (BCT). In order to reduce the estimation variances of the model parameters of the density function, orthogonal array-based modified Latin hypercube sampling (OA-MLHS) is presented to generate samples in the disturbance space during simulations. The principle of variance reduction of model parameters estimation through OA-MLHS together with BCT is also discussed. Two yield estimation examples, a fourth-order OTA-C filter and a three-dimensional (3D) quadratic function are used for comparison of our method with Monte Carlo based methods including Latin hypercube sampling and importance sampling under several combinations of sample sizes and yield values. Extensive simulations show that our method is superior to other methods with respect to accuracy and efficiency under all of the given cases. Therefore, our method is more suitable for parametric yield optimization. (semiconductor integrated circuits)
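
    For context, the Monte Carlo baseline with Latin hypercube sampling that the abstract compares against can be sketched as below. This is not the authors' integration-based method and it omits the Box-Cox transformation and the orthogonal-array modification. It assumes independent standard-normal disturbances, and the function names and the acceptability test are hypothetical.

```python
import random
from statistics import NormalDist


def latin_hypercube(n_samples, n_dims, rng):
    """Draw n_samples points in the unit cube with one point per stratum
    along every dimension."""
    columns = []
    for _ in range(n_dims):
        # one uniform draw inside each of n_samples equal-width strata
        col = [(k + rng.random()) / n_samples for k in range(n_samples)]
        rng.shuffle(col)  # decouple the strata across dimensions
        columns.append(col)
    return [tuple(col[i] for col in columns) for i in range(n_samples)]


def estimate_yield(is_acceptable, n_samples, n_dims, seed=0):
    """Fraction of sampled disturbance vectors that meet the specification."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    eps = 1e-12  # keep probabilities strictly inside (0, 1) for inv_cdf
    passed = 0
    for u in latin_hypercube(n_samples, n_dims, rng):
        x = tuple(std_normal.inv_cdf(min(max(ui, eps), 1 - eps)) for ui in u)
        if is_acceptable(x):
            passed += 1
    return passed / n_samples


if __name__ == "__main__":
    # Hypothetical acceptability region: both disturbances within 2 sigma.
    spec = lambda x: all(abs(v) < 2.0 for v in x)
    print(estimate_yield(spec, n_samples=2000, n_dims=2))
```

    Stratifying each dimension is what reduces the variance relative to plain random sampling; the abstract's method aims to go further by integrating a fitted density over the acceptability region directly.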

  10. A numerical integration-based yield estimation method for integrated circuits

    Energy Technology Data Exchange (ETDEWEB)

    Liang Tao; Jia Xinzhang, E-mail: tliang@yahoo.cn [Key Laboratory of Ministry of Education for Wide Bandgap Semiconductor Materials and Devices, School of Microelectronics, Xidian University, Xi'an 710071 (China)

    2011-04-15

    A novel integration-based yield estimation method is developed for yield optimization of integrated circuits. This method tries to integrate the joint probability density function on the acceptability region directly. To achieve this goal, the simulated performance data of unknown distribution should be converted to follow a multivariate normal distribution by using Box-Cox transformation (BCT). In order to reduce the estimation variances of the model parameters of the density function, orthogonal array-based modified Latin hypercube sampling (OA-MLHS) is presented to generate samples in the disturbance space during simulations. The principle of variance reduction of model parameters estimation through OA-MLHS together with BCT is also discussed. Two yield estimation examples, a fourth-order OTA-C filter and a three-dimensional (3D) quadratic function are used for comparison of our method with Monte Carlo based methods including Latin hypercube sampling and importance sampling under several combinations of sample sizes and yield values. Extensive simulations show that our method is superior to other methods with respect to accuracy and efficiency under all of the given cases. Therefore, our method is more suitable for parametric yield optimization. (semiconductor integrated circuits)

  11. PALM: a paralleled and integrated framework for phylogenetic inference with automatic likelihood model selectors.

    Directory of Open Access Journals (Sweden)

    Shu-Hwa Chen

    phylogenetic relationship not only by vanquishing the computation difficulty of ML methods but also providing statistic methods for model selection and bootstrapping. The proposed approach can reduce calculation time, which is particularly relevant when querying a large data set. PALM can be accessed online at http://palm.iis.sinica.edu.tw.

  12. Correction of Misclassifications Using a Proximity-Based Estimation Method

    Directory of Open Access Journals (Sweden)

    Shmulevich Ilya

    2004-01-01

    An estimation method for correcting misclassifications in signal and image processing is presented. The method is based on the use of context-based (temporal or spatial) information in a sliding-window fashion. The classes can be purely nominal, that is, an ordering of the classes is not required. The method employs nonlinear operations based on class proximities defined by a proximity matrix. Two case studies are presented. In the first, the proposed method is applied to one-dimensional signals for processing data that are obtained by a musical key-finding algorithm. In the second, the estimation method is applied to two-dimensional signals for correction of misclassifications in images. In the first case study, the proximity matrix employed by the estimation method follows directly from music perception studies, whereas in the second case study, the optimal proximity matrix is obtained with genetic algorithms as the learning rule in a training-based optimization framework. Simulation results are presented in both case studies and the degree of improvement in classification accuracy that is obtained by the proposed method is assessed statistically using Kappa analysis.
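
    The abstract does not spell out the nonlinear operation, so the following is only one plausible sketch of a sliding-window, proximity-based correction: each label is replaced by the class with the smallest total dissimilarity to the labels in its window. The function name, the dissimilarity values and the tie-breaking rule are assumptions, not the authors' definitions.

```python
def correct_labels(labels, distance, half_width=1):
    """Replace each label by the class closest, in total, to the labels in its window.

    distance[c][w] is an assumed dissimilarity between classes c and w
    (small values mean the classes are near each other). Ties are broken
    by the sorted order of the class names.
    """
    classes = sorted(distance)
    corrected = []
    for i in range(len(labels)):
        lo = max(0, i - half_width)
        hi = min(len(labels), i + half_width + 1)
        window = labels[lo:hi]
        # Pick the class minimising the summed dissimilarity to the window.
        best = min(classes, key=lambda c: sum(distance[c][w] for w in window))
        corrected.append(best)
    return corrected


# Hypothetical dissimilarities between three musical keys.
KEY_DISTANCE = {
    "C": {"C": 0, "G": 1, "F#": 6},
    "G": {"C": 1, "G": 0, "F#": 5},
    "F#": {"C": 6, "G": 5, "F#": 0},
}

print(correct_labels(["C", "C", "F#", "C", "C"], KEY_DISTANCE))
# prints ['C', 'C', 'C', 'C', 'C']
```

    Because only a dissimilarity matrix is needed, the classes can stay purely nominal, which matches the setting described above.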

  13. Phylogenetically Acquired Representations and Evolutionary Algorithms.

    OpenAIRE

    Wozniak, Adrianna

    2006-01-01

    First, we explain why Genetic Algorithms (GAs), inspired by the Modern Synthesis, do not accurately model biological evolution: they are an artificial version of artificial selection rather than of natural selection. Focusing on optimisation, we propose two improvements to GAs, with the aim of successfully generating adapted, desired behaviour. The first one concerns phylogenetic grounding of meaning, a way to avoid the Symbol Grounding Problem. We give a definition of Phylogenetically Acquired Re...

  14. Stock price estimation using ensemble Kalman Filter square root method

    Science.gov (United States)

    Karya, D. F.; Katias, P.; Herlambang, T.

    2018-04-01

    Shares are securities that evidence an individual's or corporation's ownership of, or equity in, an enterprise, especially a public company whose shares are traded. Investment in stock trading is a likely choice for investors because it offers attractive profits. To choose a safe investment, investors need a way of assessing the prices of the stocks they buy so as to help optimize their profits. An effective method of analysis that reduces the risk investors bear is to predict or estimate the stock price. Estimation is carried out because a problem can sometimes be solved by using previous information or data relevant to the problem. The contribution of this paper is that the estimates of stock prices in the high, low, and close categories can be used by investors in making investment decisions. In this paper, stock price estimation was performed using the Ensemble Kalman Filter Square Root method (EnKF-SR) and the Ensemble Kalman Filter method (EnKF). The simulation results showed that the estimate obtained with the EnKF method was more accurate than that obtained with EnKF-SR, with an estimation error of about 0.2 % for EnKF and 2.6 % for EnKF-SR.

  15. Improvement of Accuracy for Background Noise Estimation Method Based on TPE-AE

    Science.gov (United States)

    Itai, Akitoshi; Yasukawa, Hiroshi

    This paper proposes a background noise estimation method based on the tensor product expansion with a median and a Monte Carlo simulation. We have shown that a tensor product expansion with the absolute error method is effective for estimating background noise; however, the conventional method might not estimate the background noise properly. In this paper, it is shown that the estimation accuracy can be improved by using the proposed methods.

  16. Methods for risk estimation in nuclear energy

    Energy Technology Data Exchange (ETDEWEB)

    Gauvenet, A [CEA, 75 - Paris (France)

    1979-01-01

    The author presents methods for estimating the different risks related to nuclear energy: immediate or delayed risks, individual or collective risks, risks of accidents and long-term risks. These methods have attained a highly valid level of elaboration and their application to other industrial or human problems is currently under way, especially in English-speaking countries.

  17. A program to compute the soft Robinson-Foulds distance between phylogenetic networks.

    Science.gov (United States)

    Lu, Bingxin; Zhang, Louxin; Leong, Hon Wai

    2017-03-14

    Over the past two decades, phylogenetic networks have been studied to model reticulate evolutionary events. The relationships among phylogenetic networks, phylogenetic trees and clusters serve as the basis for reconstruction and comparison of phylogenetic networks. To understand these relationships, two problems are raised: the tree containment problem, which asks whether a phylogenetic tree is displayed in a phylogenetic network, and the cluster containment problem, which asks whether a cluster is represented at a node in a phylogenetic network. Both problems are NP-complete. A fast exponential-time algorithm for the cluster containment problem on arbitrary networks is developed and implemented in C. The resulting program is further extended into a computer program for fast computation of the Soft Robinson-Foulds distance between phylogenetic networks. Two computer programs are developed for facilitating reconstruction and validation of phylogenetic network models in evolutionary and comparative genomics. Our simulation tests indicated that they are fast enough for use in practice. Additionally, our simulation data indicate that the distribution of the Soft Robinson-Foulds distance between phylogenetic networks is unlikely to be normal.
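
    For orientation, the classic Robinson-Foulds distance on rooted trees (not the soft variant for networks described above, which needs a cluster containment solver) can be sketched as the size of the symmetric difference of the two cluster sets. The nested-tuple tree encoding and the function names are assumptions made for this illustration.

```python
def clusters(tree):
    """Return (leaf set, clusters of all internal nodes) for a rooted tree.

    A tree is a nested tuple of children; anything that is not a tuple is a leaf.
    """
    if not isinstance(tree, tuple):
        return frozenset([tree]), set()
    leaves = frozenset()
    found = set()
    for child in tree:
        child_leaves, child_clusters = clusters(child)
        leaves |= child_leaves
        found |= child_clusters
    found.add(leaves)  # the cluster represented at this internal node
    return leaves, found


def rf_distance(t1, t2):
    """Number of clusters present in exactly one of the two trees."""
    leaves1, c1 = clusters(t1)
    leaves2, c2 = clusters(t2)
    if leaves1 != leaves2:
        raise ValueError("trees must share the same leaf set")
    return len(c1 ^ c2)


balanced = (("a", "b"), ("c", "d"))
swapped = (("a", "c"), ("b", "d"))
caterpillar = ((("a", "b"), "c"), "d")

print(rf_distance(balanced, swapped))      # prints 4
print(rf_distance(balanced, caterpillar))  # prints 2
```

    In the network setting, deciding whether a cluster is represented at a node is the NP-complete cluster containment problem, which is why this simple set comparison does not carry over directly.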

  18. YBYRÁ facilitates comparison of large phylogenetic trees.

    Science.gov (United States)

    Machado, Denis Jacob

    2015-07-01

    The number and size of tree topologies that are being compared by phylogenetic systematists are increasing due to technological advancements in high-throughput DNA sequencing. However, we still lack tools to facilitate comparison among phylogenetic trees with a large number of terminals. The "YBYRÁ" project integrates software solutions for data analysis in phylogenetics. It comprises tools for (1) topological distance calculation based on the number of shared splits or clades, (2) sensitivity analysis and automatic generation of sensitivity plots and (3) clade diagnoses based on different categories of synapomorphies. YBYRÁ also provides (4) an original framework to facilitate the search for potential rogue taxa based on how much they affect average matching split distances (using MSdist). YBYRÁ facilitates comparison of large phylogenetic trees and outperforms competing software in terms of usability and time efficiency, especially for large data sets. The programs that comprise this toolkit are written in Python, hence they do not require installation and have minimum dependencies. The entire project is available under an open-source licence at http://www.ib.usp.br/grant/anfibios/researchSoftware.html .

  19. Using tree diversity to compare phylogenetic heuristics.

    Science.gov (United States)

    Sul, Seung-Jin; Matthews, Suzanne; Williams, Tiffani L

    2009-04-29

    Evolutionary trees are family trees that represent the relationships between a group of organisms. Phylogenetic heuristics are used to search stochastically for the best-scoring trees in tree space. Given that better tree scores are believed to be better approximations of the true phylogeny, traditional evaluation techniques have used tree scores to determine the heuristics that find the best scores in the fastest time. We develop new techniques to evaluate phylogenetic heuristics based on both tree scores and topologies to compare Pauprat and Rec-I-DCM3, two popular Maximum Parsimony search algorithms. Our results show that although Pauprat and Rec-I-DCM3 find the trees with the same best scores, topologically these trees are quite different. Furthermore, the Rec-I-DCM3 trees cluster distinctly from the Pauprat trees. In addition to our heatmap visualizations that use parsimony scores and the Robinson-Foulds distance to compare best-scoring trees found by the two heuristics, we also develop entropy-based methods to show the diversity of the trees found. Overall, Pauprat identifies more diverse trees than Rec-I-DCM3. Our work shows that there is value in comparing heuristics beyond the parsimony scores that they find. Pauprat is a slower heuristic than Rec-I-DCM3. However, our work shows that there is tremendous value in using Pauprat to reconstruct trees, especially since it finds identically scoring but topologically distinct trees. Hence, instead of discounting Pauprat, effort should go into improving its implementation. Ultimately, improved performance measures lead to better phylogenetic heuristics and will result in better approximations of the true evolutionary history of the organisms of interest.

  20. Disentangling the phylogenetic and ecological components of spider phenotypic variation.

    Science.gov (United States)

    Gonçalves-Souza, Thiago; Diniz-Filho, José Alexandre Felizola; Romero, Gustavo Quevedo

    2014-01-01

    An understanding of how the degree of phylogenetic relatedness influences the ecological similarity among species is crucial to inferring the mechanisms governing the assembly of communities. We evaluated the relative importance of spider phylogenetic relationships and ecological niche (plant morphological variables) to the variation in spider body size and shape by comparing spiders at different scales: (i) between bromeliads and dicot plants (i.e., habitat scale) and (ii) among bromeliads with distinct architectural features (i.e., microhabitat scale). We partitioned the interspecific variation in body size and shape into phylogenetic components (which express trait values as expected from phylogenetic relationships among species) and ecological components (which express trait values independent of phylogenetic relationships). At the habitat scale, bromeliad spiders were larger and flatter than spiders associated with the surrounding dicots. At this scale, plant morphology sorted out closely related spiders. Our results showed that spider flatness is phylogenetically clustered at the habitat scale, whereas it is phylogenetically overdispersed at the microhabitat scale, although phylogenetic signal is present at both scales. Taken together, these results suggest that whereas at the habitat scale selective colonization affects spider body size and shape, at fine scales both selective colonization and adaptive evolution determine spider body shape. By partitioning the phylogenetic and ecological components of phenotypic variation, we were able to disentangle the evolutionary history of distinct spider traits and show that plant architecture plays a role in the evolution of spider body size and shape. We also discuss the relevance of considering multiple scales when studying phylogenetic community structure.

  1. Study on Comparison of Bidding and Pricing Behavior Distinction between Estimate Methods

    Science.gov (United States)

    Morimoto, Emi; Namerikawa, Susumu

    The most characteristic recent trend in bidding and pricing behavior is the increasing number of bidders bidding just above the criteria for low-price bidding investigations. The contractor's markup is the difference between the bidding price and the execution price. In public works bidding in Japan, the contractor's markup is therefore effectively the difference between the criteria price for low-price bidding investigations and the execution price. In practice, bidders' strategies and behavior have been controlled by public engineers' budgets. Estimation and bidding are inseparably linked in the Japanese public works procurement system. Trials of the unit price-type estimation method began in 2004. The accumulated estimation method, on the other hand, is one of the general methods used in public works. There are thus two types of standard estimation methods in Japan. In this study, we performed a statistical analysis of the 2008 bid information for civil engineering works of the Ministry of Land, Infrastructure, and Transportation. The analysis raises several issues indicating that bidding and pricing behavior is related to the estimation method used for public works bids in Japan. The two types of standard estimation methods produce different results in the number of bidders (the bid/no-bid decision) and in the distribution of bid prices (the markup decision). The comparison of bid price distributions showed that the percentage of bids concentrated at the criteria for low-price bidding investigations tended to be higher for large public works estimated by the unit price-type method than for those estimated by the accumulated method. In addition, the number of bidders for public works estimated by the unit price-type method tends to increase significantly. Unit price-type estimation is likely to have been one of the factors in construction companies' decisions on whether to participate in bidding.

  2. Inferring Phylogenetic Networks from Gene Order Data

    Directory of Open Access Journals (Sweden)

    Alexey Anatolievich Morozov

    2013-01-01

    Existing algorithms allow us to infer phylogenetic networks from sequences (DNA, protein or binary), sets of trees, and distance matrices, but there are no methods to build them using the gene order data as an input. Here we describe several methods to build split networks from the gene order data, perform simulation studies, and use our methods for analyzing and interpreting different real gene order datasets. All proposed methods are based on intermediate data, which can be generated from genome structures under study and used as an input for network construction algorithms. Three intermediates are used: set of jackknife trees, distance matrix, and binary encoding. According to simulations and case studies, the best intermediates are jackknife trees and distance matrix (when used with the Neighbor-Net algorithm). Binary encoding can also be useful, but only when the methods mentioned above cannot be used.

  3. A different approach to estimate nonlinear regression model using numerical methods

    Science.gov (United States)

    Mahaboob, B.; Venkateswarlu, B.; Mokeshrayalu, G.; Balasiddamuni, P.

    2017-11-01

    This research paper concerns computational methods based on numerical analysis, namely the Gauss-Newton method and gradient algorithm methods (the Newton-Raphson method, the Steepest Descent or Steepest Ascent algorithm, the Method of Scoring, and the Method of Quadratic Hill-Climbing), for estimating the parameters of a nonlinear regression model in a very different way. Principles of matrix calculus have been used to discuss the gradient algorithm methods. Yonathan Bard [1] discussed a comparison of gradient methods for the solution of nonlinear parameter estimation problems. However, this article discusses an analytical approach to the gradient algorithm methods in a different way. This paper describes a new iterative technique, namely a Gauss-Newton method, which differs from the iterative technique proposed by Gordon K. Smyth [2]. Hans Georg Bock et al. [10] proposed numerical methods for parameter estimation in DAEs (differential algebraic equations). Isabel Reis Dos Santos et al. [11] introduced a weighted least squares procedure for estimating the unknown parameters of a nonlinear regression metamodel. For large-scale nonsmooth convex minimization, the Hager and Zhang (HZ) conjugate gradient method and the modified HZ (MHZ) method were presented by Gonglin Yuan et al. [12].
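
    To make the Gauss-Newton idea concrete, here is a minimal sketch of the textbook iteration (not the variant proposed in the paper) for a hypothetical two-parameter model y = a * exp(b * x). Each step solves the normal equations (J^T J) delta = J^T r for the residuals r and Jacobian J, then updates the parameters. The model, data and starting values are assumptions chosen for illustration.

```python
import math


def gauss_newton(xs, ys, a, b, iterations=20):
    """Fit y = a * exp(b * x) by textbook Gauss-Newton from the start (a, b)."""
    for _ in range(iterations):
        # Accumulate the entries of J^T J and J^T r.
        s_aa = s_ab = s_bb = g_a = g_b = 0.0
        for x, y in zip(xs, ys):
            e = math.exp(b * x)
            r = y - a * e      # residual
            ja = e             # d(model)/da
            jb = a * x * e     # d(model)/db
            s_aa += ja * ja
            s_ab += ja * jb
            s_bb += jb * jb
            g_a += ja * r
            g_b += jb * r
        det = s_aa * s_bb - s_ab * s_ab
        if det == 0.0:
            raise ZeroDivisionError("singular normal equations")
        # Solve the 2x2 system by Cramer's rule and take the full step.
        a += (s_bb * g_a - s_ab * g_b) / det
        b += (s_aa * g_b - s_ab * g_a) / det
    return a, b


xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [2.0 * math.exp(0.5 * x) for x in xs]  # noiseless synthetic data
a_hat, b_hat = gauss_newton(xs, ys, a=1.8, b=0.45)
```

    With noiseless data and a start this close to the truth, the estimates should converge to a = 2 and b = 0.5. Poorer starting values may call for a damped step, which this sketch omits.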

  4. Phylogenetic system and zoogeography of the Plecoptera.

    Science.gov (United States)

    Zwick, P

    2000-01-01

    Information about the phylogenetic relationships of Plecoptera is summarized. The few characters supporting monophyly of the order are outlined. Several characters of possible significance for the search for the closest relatives of the stoneflies are discussed, but the sister-group of the order remains unknown. Numerous characters supporting the presently recognized phylogenetic system of Plecoptera are presented, alternative classifications are discussed, and suggestions for future studies are made. Notes on zoogeography are appended. The order as such is old (Permian fossils), but phylogenetic relationships and global distribution patterns suggest that evolution of the extant suborders started with the breakup of Pangaea. There is evidence of extensive recent speciation in all parts of the world.

  5. Cophenetic metrics for phylogenetic trees, after Sokal and Rohlf.

    Science.gov (United States)

    Cardona, Gabriel; Mir, Arnau; Rosselló, Francesc; Rotger, Lucía; Sánchez, David

    2013-01-16

    Phylogenetic tree comparison metrics are an important tool in the study of evolution, and hence the definition of such metrics is an interesting problem in phylogenetics. In a paper in Taxon fifty years ago, Sokal and Rohlf proposed to measure quantitatively the difference between a pair of phylogenetic trees by first encoding them by means of their half-matrices of cophenetic values, and then comparing these matrices. This idea has been used several times since then to define dissimilarity measures between phylogenetic trees but, to our knowledge, no proper metric on weighted phylogenetic trees with nested taxa based on this idea has been formally defined and studied yet. Actually, the cophenetic values of pairs of different taxa alone are not enough to single out phylogenetic trees with weighted arcs or nested taxa. For every (rooted) phylogenetic tree T, let its cophenetic vector φ(T) consist of all pairs of cophenetic values between pairs of taxa in T and all depths of taxa in T. It turns out that these cophenetic vectors single out weighted phylogenetic trees with nested taxa. We then define a family of cophenetic metrics d_{φ,p} by comparing these cophenetic vectors by means of L^p norms, and we study, either analytically or numerically, some of their basic properties: neighbors, diameter, distribution, and their rank correlation with each other and with other metrics. The cophenetic metrics can be safely used on weighted phylogenetic trees with nested taxa and no restriction on degrees, and they can be computed in O(n^2) time, where n stands for the number of taxa. The metrics d_{φ,1} and d_{φ,2} have positively skewed distributions, and they show a low rank correlation with the Robinson-Foulds metric and the nodal metrics, and a very high correlation with each other and with the splitted nodal metrics. The diameter of d_{φ,p}, for p ≥ 1, is in O(n^{(p+2)/p}), and thus for low p they are more discriminative, having a wider range of values.
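
    The construction can be illustrated with a small sketch for the simplest case: rooted trees with unit arc weights and no nested taxa. The cophenetic value of two taxa is taken as the depth of their lowest common ancestor, and the metric as an L^p norm of the difference of the two vectors. The nested-tuple tree encoding and the function names are assumptions made for this illustration.

```python
def cophenetic_vector(tree):
    """Map each pair (i, j) with i <= j to the depth of the lowest common ancestor.

    For i == j the value is the depth of the taxon itself. A tree is a nested
    tuple of children with unit arc weights; non-tuples are taxa.
    """
    values = {}

    def visit(node, depth):
        if not isinstance(node, tuple):
            values[(node, node)] = depth
            return [node]
        groups = [visit(child, depth + 1) for child in node]
        # Taxa from different children have this node as lowest common ancestor.
        for a in range(len(groups)):
            for b in range(a + 1, len(groups)):
                for i in groups[a]:
                    for j in groups[b]:
                        values[tuple(sorted((i, j)))] = depth
        return [leaf for group in groups for leaf in group]

    visit(tree, 0)
    return values


def cophenetic_distance(t1, t2, p=1):
    """L^p norm of the difference of the two cophenetic vectors."""
    v1, v2 = cophenetic_vector(t1), cophenetic_vector(t2)
    if v1.keys() != v2.keys():
        raise ValueError("trees must share the same taxa")
    return sum(abs(v1[k] - v2[k]) ** p for k in v1) ** (1.0 / p)


balanced = (("a", "b"), ("c", "d"))
swapped = (("a", "c"), ("b", "d"))

print(cophenetic_distance(balanced, swapped, p=1))  # prints 4.0
print(cophenetic_distance(balanced, swapped, p=2))  # prints 2.0
```

    Each pair of taxa is assigned exactly once, at its lowest common ancestor, so the vector is built in time quadratic in the number of taxa, consistent with the O(n^2) bound quoted above.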

  6. Ore reserve estimation: a summary of principles and methods

    International Nuclear Information System (INIS)

    Marques, J.P.M.

    1985-01-01

    The mining industry has experienced substantial improvements with the increasing utilization of computerized and electronic devices throughout the last few years. In the ore reserve estimation field the main methods have undergone recent advances in order to improve their overall efficiency. This paper presents the three main groups of ore reserve estimation methods presently used worldwide: Conventional, Statistical and Geostatistical, and elaborates a detailed description and comparative analysis of each. The Conventional Methods are the oldest, least complex and most widely employed. The Geostatistical Methods are the most recent, most precise and most complex. The Statistical Methods are intermediate to the others in complexity, diffusion and chronological order. (D.J.M.) [pt

  7. Pattern of phylogenetic diversification of the Cychrini ground beetles in the world as deduced mainly from sequence comparisons of the mitochondrial genes.

    Science.gov (United States)

    Su, Zhi-Hui; Imura, Yûki; Okamoto, Munehiro; Osawa, Syozo

    2004-02-04

    The phylogenetic position of the tribe Cychrini within the subfamily Carabinae (the family Carabidae) was estimated by comparing the nucleotide sequences of the mitochondrial NADH dehydrogenase subunit 5 (ND5) gene and the nuclear 28S ribosomal DNA (rDNA). The phylogenetic trees suggest that the Cychrini would most probably be the oldest line within the Carabinae. Phylogenetic trees were constructed by comparing the mitochondrial cytochrome c oxidase subunit I (COI) gene sequences from 33 species of the Cychrini from various localities that include the whole distribution ranges of the representative species within all the known genera in the world. The trees suggest that the Cychrini members radiated into a number of phylogenetic lineages within a short period, starting about 44 million years ago (MYA). Most of the phylogenetic lineages or sublineages are geographically linked, each consisting of a single or only a few species with scarce morphological differentiation in spite of their long evolutionary histories (silent or near-silent evolution [see Adv. Biophys. 36 (1999) 65; J. Mol. Evol. 53 (2001) 517]). This fact suggests that the geographic isolation per se did not bring about conspicuous morphological differentiation. The phylogenetic lineages of the Cychrini correspond well to the taxonomically defined genera and subgenera.

  8. A Bayes linear Bayes method for estimation of correlated event rates.

    Science.gov (United States)

    Quigley, John; Wilson, Kevin J; Walls, Lesley; Bedford, Tim

    2013-12-01

    Typically, full Bayesian estimation of correlated event rates can be computationally challenging since estimators are intractable. When estimation of event rates represents one activity within a larger modeling process, there is an incentive to develop more efficient inference than provided by a full Bayesian model. We develop a new subjective inference method for correlated event rates based on a Bayes linear Bayes model under the assumption that events are generated from a homogeneous Poisson process. To reduce the elicitation burden we introduce homogenization factors to the model and, as an alternative to a subjective prior, an empirical method using the method of moments is developed. Inference under the new method is compared against estimates obtained under a full Bayesian model, which takes a multivariate gamma prior, where the predictive and posterior distributions are derived in terms of well-known functions. The mathematical properties of both models are presented. A simulation study shows that the Bayes linear Bayes inference method and the full Bayesian model provide equally reliable estimates. An illustrative example, motivated by a problem of estimating correlated event rates across different users in a simple supply chain, shows how ignoring the correlation leads to biased estimation of event rates. © 2013 Society for Risk Analysis.

  9. Dual ant colony operational modal analysis parameter estimation method

    Science.gov (United States)

    Sitarz, Piotr; Powałka, Bartosz

    2018-01-01

    Operational Modal Analysis (OMA) is a common technique used to examine the dynamic properties of a system. Contrary to experimental modal analysis, the input signal is generated in the object's ambient environment. Operational modal analysis mainly aims at determining the number of pole pairs and at estimating modal parameters. Many methods are used for parameter identification. Some methods operate in the time domain while others operate in the frequency domain. The former use correlation functions, the latter spectral density functions. However, while some methods require the user to select poles from a stabilisation diagram, others try to automate the selection process. The dual ant colony operational modal analysis parameter estimation method (DAC-OMA) presents a new approach to the problem, avoiding issues involved in the stabilisation diagram. The presented algorithm is fully automated. It uses deterministic methods to define the interval of estimated parameters, thus reducing the problem to an optimisation task, which is conducted with dedicated software based on the ant colony optimisation algorithm. The combination of deterministic methods restricting parameter intervals and artificial intelligence yields very good results, also for closely spaced modes and significantly varied mode shapes within one measurement point.

  10. Reversible polymorphism-aware phylogenetic models and their application to tree inference.

    Science.gov (United States)

    Schrempf, Dominik; Minh, Bui Quang; De Maio, Nicola; von Haeseler, Arndt; Kosiol, Carolin

    2016-10-21

    We present a reversible Polymorphism-Aware Phylogenetic Model (revPoMo) for species tree estimation from genome-wide data. revPoMo enables the reconstruction of large scale species trees for many within-species samples. It expands the alphabet of DNA substitution models to include polymorphic states, thereby naturally accounting for incomplete lineage sorting. We implemented revPoMo in the maximum likelihood software IQ-TREE. A simulation study and an application to great apes data show that the runtimes of our approach and standard substitution models are comparable but that revPoMo has much better accuracy in estimating trees, divergence times and mutation rates. The advantage of revPoMo is that an increase of sample size per species improves estimations but does not increase runtime. Therefore, revPoMo is a valuable tool with several applications, from speciation dating to species tree reconstruction. Copyright © 2016 The Authors. Published by Elsevier Ltd. All rights reserved.

  11. Phylogenetic signal in the acoustic parameters of the advertisement calls of four clades of anurans.

    Science.gov (United States)

    Gingras, Bruno; Mohandesan, Elmira; Boko, Drasko; Fitch, W Tecumseh

    2013-07-01

    Anuran vocalizations, especially their advertisement calls, are largely species-specific and can be used to identify taxonomic affiliations. Because anurans are not vocal learners, their vocalizations are generally assumed to have a strong genetic component. This suggests that the degree of similarity between advertisement calls may be related to large-scale phylogenetic relationships. To test this hypothesis, advertisement calls from 90 species belonging to four large clades (Bufo, Hylinae, Leptodactylus, and Rana) were analyzed. Phylogenetic distances were estimated based on the DNA sequences of the 12S mitochondrial ribosomal RNA gene, and, for a subset of 49 species, on the rhodopsin gene. Mean values for five acoustic parameters (coefficient of variation of root-mean-square amplitude, dominant frequency, spectral flux, spectral irregularity, and spectral flatness) were computed for each species. We then tested for phylogenetic signal on the body-size-corrected residuals of these five parameters, using three statistical tests (Moran's I, Mantel, and Blomberg's K) and three models of genetic distance (pairwise distances, Abouheif's proximities, and the variance-covariance matrix derived from the phylogenetic tree). A significant phylogenetic signal was detected for most acoustic parameters on the 12S dataset, across statistical tests and genetic distance models, both for the entire sample of 90 species and within clades in several cases. A further analysis on a subset of 49 species using genetic distances derived from rhodopsin and from 12S broadly confirmed the results obtained on the larger sample, indicating that the phylogenetic signals observed in these acoustic parameters can be detected using a variety of genetic distance models derived either from a variable mitochondrial sequence or from a conserved nuclear gene. We found a robust relationship, in a large number of species, between anuran phylogenetic relatedness and acoustic similarity in the

  12. Estimation of water percolation by different methods using TDR

    Directory of Open Access Journals (Sweden)

    Alisson Jadavi Pereira da Silva

    2014-02-01

    Detailed knowledge on water percolation into the soil in irrigated areas is fundamental for solving problems of drainage, pollution and the recharge of underground aquifers. The aim of this study was to evaluate the percolation estimated by time-domain-reflectometry (TDR) in a drainage lysimeter. We used Darcy's law with K(θ) functions determined by field and laboratory methods and by the change in water storage in the soil profile at 16 points of moisture measurement at different time intervals. A sandy clay soil was saturated and covered with plastic sheet to prevent evaporation and an internal drainage trial in a drainage lysimeter was installed. The relationship between the observed and estimated percolation values was evaluated by linear regression analysis. The results suggest that percolation in the field or laboratory can be estimated based on continuous monitoring with TDR, and at short time intervals, of the variations in soil water storage. The precision and accuracy of this approach are similar to those of the lysimeter and it has advantages over the other evaluated methods, of which the most relevant are the possibility of estimating percolation in short time intervals and exemption from the predetermination of soil hydraulic properties such as water retention and hydraulic conductivity. The estimates obtained by the Darcy-Buckingham equation for percolation levels using the function K(θ) predicted by the method of Hillel et al. (1972) provided water percolation estimates compatible with those obtained in the lysimeter at time intervals greater than 1 h. The methods of Libardi et al. (1980), Sisson et al. (1980) and van Genuchten (1980) underestimated water percolation.

  13. Phylogenetic Analyses of Armillaria Reveal at Least 15 Phylogenetic Lineages in China, Seven of Which Are Associated with Cultivated Gastrodia elata.

    Directory of Open Access Journals (Sweden)

    Ting Guo

    Full Text Available Fungal species of Armillaria, which can act as plant pathogens and/or symbionts of the Chinese traditional medicinal herb Gastrodia elata ("Tianma"), are ecologically and economically important and have consequently attracted the attention of mycologists. However, their taxonomy has been highly dependent on morphological characterization and mating tests. In this study, we phylogenetically analyzed Chinese Armillaria samples using the sequences of the internal transcribed spacer region, translation elongation factor-1 alpha gene and beta-tubulin gene. Our data revealed at least 15 phylogenetic lineages of Armillaria from China, of which seven were newly discovered and two were recorded from China for the first time. Fourteen Chinese biological species of Armillaria, which were previously defined based on mating tests, could be assigned to the 15 phylogenetic lineages identified herein. Seven of the 15 phylogenetic lineages were found to be disjunctively distributed in different continents of the Northern Hemisphere, while eight were revealed to be endemic to certain continents. In addition, we found that seven phylogenetic lineages of Armillaria were used for the cultivation of Tianma, only two of which had been recorded to be associated with Tianma previously. We also illustrated that G. elata f. glauca ("Brown Tianma") and G. elata f. elata ("Red Tianma"), two cultivars of Tianma grown in different regions of China, form symbiotic relationships with different phylogenetic lineages of Armillaria. These findings should aid the development of Tianma cultivation in China.

  14. Phylogenetic relationships within and among Brassica species from ...

    African Journals Online (AJOL)

    Consequently, two potentially susceptible B. napus accessions were identified. The high polymorphic information content (PIC) and number of phylogenetically informative bands established RAPD as a useful tool for phylogenetic reconstruction, quantification of genetic diversity for conservation, cultivar classification and ...

  15. Climate reconstruction analysis using coexistence likelihood estimation (CRACLE): a method for the estimation of climate using vegetation.

    Science.gov (United States)

    Harbert, Robert S; Nixon, Kevin C

    2015-08-01

    • Plant distributions have long been understood to be correlated with the environmental conditions to which species are adapted. Climate is one of the major components driving species distributions. Therefore, it is expected that the plants coexisting in a community are reflective of the local environment, particularly climate. • Presented here is a method for the estimation of climate from local plant species coexistence data. The method, Climate Reconstruction Analysis using Coexistence Likelihood Estimation (CRACLE), is a likelihood-based method that employs specimen collection data at a global scale for the inference of species climate tolerance. CRACLE calculates the maximum joint likelihood of coexistence given individual species climate tolerance characterization to estimate the expected climate. • Plant distribution data for more than 4000 species were used to show that this method accurately infers expected climate profiles for 165 sites with diverse climatic conditions. Estimates differ from the WorldClim global climate model by less than 1.5°C on average for mean annual temperature and less than ∼250 mm for mean annual precipitation. This is a significant improvement upon other plant-based climate-proxy methods. • CRACLE validates long hypothesized interactions between climate and local associations of plant species. Furthermore, CRACLE successfully estimates climate that is consistent with the widely used WorldClim model and therefore may be applied to the quantitative estimation of paleoclimate in future studies. © 2015 Botanical Society of America, Inc.
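
    The joint-likelihood idea in the abstract above can be illustrated with a minimal sketch. It assumes each species' climate tolerance is summarised by a Gaussian density over a single climate variable, which is a simplification made here for illustration; the function names, tolerance values and search grid are hypothetical and are not taken from the CRACLE software.

```python
import numpy as np

def joint_log_likelihood(grid, means, sds):
    """Sum the per-species Gaussian log-densities at every candidate climate value."""
    grid = np.asarray(grid, dtype=float)[:, None]     # shape (G, 1)
    means = np.asarray(means, dtype=float)[None, :]   # shape (1, S)
    sds = np.asarray(sds, dtype=float)[None, :]       # shape (1, S)
    log_pdf = -0.5 * np.log(2 * np.pi * sds**2) - (grid - means) ** 2 / (2 * sds**2)
    return log_pdf.sum(axis=1)                        # shape (G,)

def estimate_climate(grid, means, sds):
    """Return the grid value that maximises the joint likelihood of coexistence."""
    ll = joint_log_likelihood(grid, means, sds)
    return float(np.asarray(grid, dtype=float)[np.argmax(ll)])

# Hypothetical tolerances (mean annual temperature, degrees C) for two coexisting species.
grid = np.linspace(0.0, 25.0, 2501)
estimate = estimate_climate(grid, means=[10.0, 20.0], sds=[1.0, 2.0])
```

    Under the Gaussian assumption the maximum sits at the precision-weighted mean of the species optima, so the narrowly tolerant species pulls the estimate towards its own optimum.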

  16. Using ESTs for phylogenomics: Can one accurately infer a phylogenetic tree from a gappy alignment?

    Directory of Open Access Journals (Sweden)

    Hartmann Stefanie

    2008-03-01

    Full Text Available Abstract Background While full genome sequences are still only available for a handful of taxa, large collections of partial gene sequences are available for many more. The alignment of partial gene sequences results in a multiple sequence alignment containing large gaps that are arranged in a staggered pattern. The consequences of this pattern of missing data on the accuracy of phylogenetic analysis are not well understood. We conducted a simulation study to determine the accuracy of phylogenetic trees obtained from gappy alignments using three commonly used phylogenetic reconstruction methods (Neighbor Joining, Maximum Parsimony, and Maximum Likelihood) and studied ways to improve the accuracy of trees obtained from such datasets. Results We found that the pattern of gappiness in multiple sequence alignments derived from partial gene sequences substantially compromised phylogenetic accuracy even in the absence of alignment error. The decline in accuracy was beyond what would be expected based on the amount of missing data. The decline was particularly dramatic for Neighbor Joining and Maximum Parsimony, where the majority of gappy alignments contained 25% to 40% incorrect quartets. To improve the accuracy of the trees obtained from a gappy multiple sequence alignment, we examined two approaches. In the first approach, alignment masking, potentially problematic columns and input sequences are excluded from the dataset. Even in the absence of alignment error, masking improved phylogenetic accuracy up to 100-fold. However, masking retained, on average, only 83% of the input sequences. In the second approach, alignment subdivision, the missing data is statistically modelled in order to retain as many sequences as possible in the phylogenetic analysis. Subdivision resulted in more modest improvements to alignment accuracy, but succeeded in including almost all of the input sequences. Conclusion These results demonstrate that partial gene

  17. Using ESTs for phylogenomics: can one accurately infer a phylogenetic tree from a gappy alignment?

    Science.gov (United States)

    Hartmann, Stefanie; Vision, Todd J

    2008-03-26

    While full genome sequences are still only available for a handful of taxa, large collections of partial gene sequences are available for many more. The alignment of partial gene sequences results in a multiple sequence alignment containing large gaps that are arranged in a staggered pattern. The consequences of this pattern of missing data on the accuracy of phylogenetic analysis are not well understood. We conducted a simulation study to determine the accuracy of phylogenetic trees obtained from gappy alignments using three commonly used phylogenetic reconstruction methods (Neighbor Joining, Maximum Parsimony, and Maximum Likelihood) and studied ways to improve the accuracy of trees obtained from such datasets. We found that the pattern of gappiness in multiple sequence alignments derived from partial gene sequences substantially compromised phylogenetic accuracy even in the absence of alignment error. The decline in accuracy was beyond what would be expected based on the amount of missing data. The decline was particularly dramatic for Neighbor Joining and Maximum Parsimony, where the majority of gappy alignments contained 25% to 40% incorrect quartets. To improve the accuracy of the trees obtained from a gappy multiple sequence alignment, we examined two approaches. In the first approach, alignment masking, potentially problematic columns and input sequences are excluded from the dataset. Even in the absence of alignment error, masking improved phylogenetic accuracy up to 100-fold. However, masking retained, on average, only 83% of the input sequences. In the second approach, alignment subdivision, the missing data is statistically modelled in order to retain as many sequences as possible in the phylogenetic analysis. Subdivision resulted in more modest improvements to alignment accuracy, but succeeded in including almost all of the input sequences. These results demonstrate that partial gene sequences and gappy multiple sequence alignments can pose a
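
    The masking idea described in the two records above (excluding problematic columns and sequences) can be sketched as follows. This is only an illustration under simple assumptions: a column is treated as problematic when its gap fraction exceeds a threshold, and a sequence is dropped when too few residues remain. The function name, the thresholds and the toy alignment are hypothetical and do not reproduce the published procedure.

```python
def mask_alignment(seqs, max_gap_fraction=0.5, min_residues=1):
    """Drop gap-rich columns, then drop sequences left with too few residues.

    seqs maps a sequence name to an aligned string; "-" marks a gap.
    """
    if not seqs:
        return {}
    names = list(seqs)
    length = len(seqs[names[0]])
    if any(len(seqs[n]) != length for n in names):
        raise ValueError("all aligned sequences must have the same length")

    # Keep a column only if its share of gaps is at or below the threshold.
    keep_cols = [
        i for i in range(length)
        if sum(seqs[n][i] == "-" for n in names) / len(names) <= max_gap_fraction
    ]
    masked = {n: "".join(seqs[n][i] for i in keep_cols) for n in names}

    # Keep a sequence only if enough non-gap characters survive the column mask.
    return {n: s for n, s in masked.items() if sum(c != "-" for c in s) >= min_residues}

# Toy staggered alignment (hypothetical data).
alignment = {"t1": "ACGT--", "t2": "ACG---", "t3": "--GTAC", "t4": "------"}
masked = mask_alignment(alignment, max_gap_fraction=0.5, min_residues=2)
```

    The trade-off reported in the abstract is visible even in this toy case: the mask removes the sparsest columns but also discards the all-gap sequence t4.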

  18. Molecular phylogenetic reconstruction of the endemic Asian salamander family Hynobiidae (Amphibia, Caudata).

    Science.gov (United States)

    Weisrock, David W; Macey, J Robert; Matsui, Masafumi; Mulcahy, Daniel G; Papenfuss, Theodore J

    2013-01-01

    The salamander family Hynobiidae contains over 50 species and has been the subject of a number of molecular phylogenetic investigations aimed at reconstructing branches across the entire family. In general, studies using the greatest amount of sequence data have used reduced taxon sampling, while the study with the greatest taxon sampling has used a limited sequence data set. Here, we provide insights into the phylogenetic history of the Hynobiidae using both dense taxon sampling and a large mitochondrial DNA sequence data set. We report exclusive new mitochondrial DNA data of 2566 aligned bases (with 151 excluded sites, of included sites 1157 are variable with 957 parsimony informative). This is sampled from two genic regions encoding a 12S-16S region (the 3' end of 12S rRNA, tRNA(Val), and the 5' end of 16S rRNA), and a ND2-COI region (ND2, tRNA(Trp), tRNA(Ala), tRNA(Asn), the origin for light strand replication--O(L), tRNA(Cys), tRNA(Tyr), and the 5' end of COI). Analyses using parsimony, Bayesian, and maximum likelihood optimality criteria produce similar phylogenetic trees, with discordant branches generally receiving low levels of branch support. Monophyly of the Hynobiidae is strongly supported across all analyses, as is the sister relationship and deep divergence between the genus Onychodactylus and all remaining hynobiids. Within this latter grouping our phylogenetic results identify six clades that are relatively divergent from one another, but for which there is minimal support for their phylogenetic placement. This includes the genus Batrachuperus, the genus Hynobius, the genus Pachyhynobius, the genus Salamandrella, a clade containing the genera Ranodon and Paradactylodon, and a clade containing the genera Liua and Pseudohynobius. This latter clade receives low bootstrap support in the parsimony analysis, but is consistent across all three analytical methods. Our results also clarify a number of well-supported relationships within the larger

  19. A New Method for Estimation of Velocity Vectors

    DEFF Research Database (Denmark)

    Jensen, Jørgen Arendt; Munk, Peter

    1998-01-01

    The paper describes a new method for determining the velocity vector of a remotely sensed object using either sound or electromagnetic radiation. The movement of the object is determined from a field with spatial oscillations in both the axial direction of the transducer and in one or two...... directions transverse to the axial direction. By using a number of pulse emissions, the inter-pulse movement can be estimated and the velocity found from the estimated movement and the time between pulses. The method is based on the principle of using transverse spatial modulation for making the received...

  20. Comparison of methods used for estimating pharmacist counseling behaviors.

    Science.gov (United States)

    Schommer, J C; Sullivan, D L; Wiederholt, J B

    1994-01-01

    To compare the rates reported for provision of types of information conveyed by pharmacists among studies for which different methods of estimation were used and different dispensing situations were studied. Empiric studies conducted in the US, reported from 1982 through 1992, were selected from International Pharmaceutical Abstracts, MEDLINE, and noncomputerized sources. Empiric studies were selected for review if they reported the provision of at least three types of counseling information. Four components of methods used for estimating pharmacist counseling behaviors were extracted and summarized in a table: (1) sample type and area, (2) sampling unit, (3) sample size, and (4) data collection method. In addition, situations that were investigated in each study were compiled. Twelve studies met our inclusion criteria. Patients were interviewed via telephone in four studies and were surveyed via mail in two studies. Pharmacists were interviewed via telephone in one study and surveyed via mail in two studies. For three studies, researchers visited pharmacy sites for data collection using the shopper method or observation method. Studies with similar methods and situations provided similar results. Data collected by using patient surveys, pharmacist surveys, and observation methods can provide useful estimations of pharmacist counseling behaviors if researchers measure counseling for specific, well-defined dispensing situations.

  1. Visualising very large phylogenetic trees in three dimensional hyperbolic space

    Directory of Open Access Journals (Sweden)

    Liberles David A

    2004-04-01

    Full Text Available Abstract Background Common existing phylogenetic tree visualisation tools are not able to display readable trees with more than a few thousand nodes. These existing methodologies are based in two dimensional space. Results We introduce the idea of visualising phylogenetic trees in three dimensional hyperbolic space with the Walrus graph visualisation tool and have developed a conversion tool that enables the conversion of standard phylogenetic tree formats to Walrus' format. With Walrus, it becomes possible to visualise and navigate phylogenetic trees with more than 100,000 nodes. Conclusion Walrus enables desktop visualisation of very large phylogenetic trees in 3 dimensional hyperbolic space. This application is potentially useful for visualisation of the tree of life and for functional genomics derivatives, like The Adaptive Evolution Database (TAED).

  2. Detecting taxonomic and phylogenetic signals in equid cheek teeth: towards new palaeontological and archaeological proxies

    Science.gov (United States)

    Mohaseb, A.; Peigné, S.; Debue, K.; Orlando, L.; Mashkour, M.

    2017-01-01

    The Plio–Pleistocene evolution of Equus and the subsequent domestication of horses and donkeys remains poorly understood, due to the lack of phenotypic markers capable of tracing this evolutionary process in the palaeontological/archaeological record. Using images from 345 specimens, encompassing 15 extant taxa of equids, we quantified the occlusal enamel folding pattern in four mandibular cheek teeth with a single geometric morphometric protocol. We initially investigated the protocol accuracy by assigning each tooth to its correct anatomical position and taxonomic group. We then contrasted the phylogenetic signal present in each tooth shape with an exome-wide phylogeny from 10 extant equine species. We estimated the strength of the phylogenetic signal using a Brownian motion model of evolution with multivariate K statistic, and mapped the dental shape along the molecular phylogeny using an approach based on squared-change parsimony. We found clear evidence for the relevance of dental phenotypes to accurately discriminate all modern members of the genus Equus and capture their phylogenetic relationships. These results are valuable for both palaeontologists and zooarchaeologists exploring the spatial and temporal dynamics of the evolutionary history of the horse family, up to the latest domestication trajectories of horses and donkeys. PMID:28484618
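
    For readers unfamiliar with the K statistic mentioned above, the following is a minimal sketch of the univariate Blomberg's K under a Brownian motion model, computed from a trait vector and the phylogenetic variance-covariance matrix. The study itself reports a multivariate K for shape data, which this sketch does not implement; the function name and the example tree are assumptions made for illustration.

```python
import numpy as np

def blomberg_k(x, C):
    """Univariate Blomberg's K for trait values x and phylogenetic covariance C."""
    x = np.asarray(x, dtype=float)
    C = np.asarray(C, dtype=float)
    n = len(x)
    inv_c = np.linalg.inv(C)
    ones = np.ones(n)

    # Phylogenetically corrected mean (generalised least squares estimate of the root state).
    denom = ones @ inv_c @ ones
    a = (ones @ inv_c @ x) / denom
    r = x - a

    # Observed ratio MSE0 / MSE; the shared 1/(n-1) factors cancel.
    observed = (r @ r) / (r @ inv_c @ r)
    # Expected value of that ratio under Brownian motion on this tree.
    expected = (np.trace(C) - n / denom) / (n - 1)
    return observed / expected

# Hypothetical covariance for the tree ((A:1,B:1):1,(C:1,D:1):1).
C = np.array([[2.0, 1.0, 0.0, 0.0],
              [1.0, 2.0, 0.0, 0.0],
              [0.0, 0.0, 2.0, 1.0],
              [0.0, 0.0, 1.0, 2.0]])
k_value = blomberg_k([1.0, 1.2, 3.0, 3.1], C)
```

    Values near 1 match the Brownian expectation, values above 1 indicate that close relatives are more similar than expected, and values below 1 indicate the opposite.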

  3. Ridge regression estimator: combining unbiased and ordinary ridge regression methods of estimation

    Directory of Open Access Journals (Sweden)

    Sharad Damodar Gore

    2009-10-01

    Full Text Available Statistical literature has several methods for coping with multicollinearity. This paper introduces a new shrinkage estimator, called modified unbiased ridge (MUR). This estimator is obtained from unbiased ridge regression (URR) in the same way that ordinary ridge regression (ORR) is obtained from ordinary least squares (OLS). Properties of MUR are derived. Results on its matrix mean squared error (MMSE) are obtained. MUR is compared with ORR and URR in terms of MMSE. These results are illustrated with an example based on data generated by Hoerl and Kennard (1975).
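
    As background to the relationship between ORR and OLS mentioned above, here is a minimal sketch of the two textbook estimators on nearly collinear data. It does not implement URR or the MUR estimator proposed in the paper; the simulated data, the seed and the ridge parameter k are arbitrary illustrative choices.

```python
import numpy as np

def ols(X, y):
    """Ordinary least squares: solve (X'X) b = X'y."""
    return np.linalg.solve(X.T @ X, X.T @ y)

def ridge(X, y, k):
    """Ordinary ridge regression: solve (X'X + k I) b = X'y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ y)

# Simulated, nearly collinear predictors (hypothetical data).
rng = np.random.default_rng(0)
x1 = rng.normal(size=50)
x2 = x1 + 0.01 * rng.normal(size=50)
X = np.column_stack([x1, x2])
y = x1 + x2 + rng.normal(size=50)

beta_ols = ols(X, y)
beta_ridge = ridge(X, y, k=1.0)
```

    Adding k to the diagonal of X'X shrinks the coefficient vector, which trades a small bias for a lower variance when the predictors are highly correlated.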

  4. Climate-driven extinctions shape the phylogenetic structure of temperate tree floras.

    Science.gov (United States)

    Eiserhardt, Wolf L; Borchsenius, Finn; Plum, Christoffer M; Ordonez, Alejandro; Svenning, Jens-Christian

    2015-03-01

    When taxa go extinct, unique evolutionary history is lost. If extinction is selective, and the intrinsic vulnerabilities of taxa show phylogenetic signal, more evolutionary history may be lost than expected under random extinction. Under what conditions this occurs is insufficiently known. We show that late Cenozoic climate change induced phylogenetically selective regional extinction of northern temperate trees because of phylogenetic signal in cold tolerance, leading to significantly and substantially larger than random losses of phylogenetic diversity (PD). The surviving floras in regions that experienced stronger extinction are phylogenetically more clustered, indicating that non-random losses of PD are of increasing concern with increasing extinction severity. Using simulations, we show that a simple threshold model of survival given a physiological trait with phylogenetic signal reproduces our findings. Our results send a strong warning that we may expect future assemblages to be phylogenetically and possibly functionally depauperate if anthropogenic climate change affects taxa similarly. © 2015 John Wiley & Sons Ltd/CNRS.

  5. Increased phylogenetic resolution using target enrichment in Rubus

    Science.gov (United States)

    Phylogenetic analyses in Rubus L. have been challenging due to polyploidy, hybridization, and apomixis within the genus. Wide morphological diversity occurs within and between species, contributing to challenges at lower and higher systematic levels. Phylogenetic inferences to date have been based o...

  6. Phylogenetic relationships of African sunbird-like warblers: Moho ...

    African Journals Online (AJOL)

    Phylogenetic relationships of African sunbird-like warblers: Moho ( Hypergerus atriceps ), Green Hylia ( Hylia prasina ) and Tit-hylia ( Pholidornis rushiae ) ... different points in avian evolution reduces the phylogenetic signal in molecular sequence data, making difficult the reconstruction of relationships among taxa resulting ...

  7. The Cladistic Basis for the Phylogenetic Diversity (PD) Measure Links Evolutionary Features to Environmental Gradients and Supports Broad Applications of Microbial Ecology’s “Phylogenetic Beta Diversity” Framework

    Directory of Open Access Journals (Sweden)

    Rob Knight

    2009-11-01

    Full Text Available The PD measure of phylogenetic diversity interprets branch lengths cladistically to make inferences about feature diversity. PD calculations extend conventional species-level ecological indices to the features level. The “phylogenetic beta diversity” framework developed by microbial ecologists calculates PD-dissimilarities between community localities. Interpretation of these PD-dissimilarities at the feature level explains the framework’s success in producing ordinations revealing environmental gradients. An example gradients space using PD-dissimilarities illustrates how evolutionary features form unimodal response patterns to gradients. This features model supports new application of existing species-level methods that are robust to unimodal responses, plus novel applications relating to climate change, commercial products discovery, and community assembly.
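
    A minimal sketch of how PD and a PD-based dissimilarity can be computed on a small rooted tree is given below. PD is taken here as the total length of the branches connecting a set of tips to the root, and the dissimilarity as one minus the shared fraction of branch length, which is only one of several possible definitions. The tree encoding, node names and function names are hypothetical.

```python
def branches_spanned(tree, tips):
    """Return the branches (named by their child node) linking the tips to the root.

    tree maps each non-root node to a (parent, branch_length) pair.
    """
    spanned = set()
    for node in tips:
        # Walk towards the root; stop early once an already-visited branch is reached.
        while node in tree and node not in spanned:
            spanned.add(node)
            node = tree[node][0]
    return spanned

def phylogenetic_diversity(tree, tips):
    """Sum of branch lengths spanned by the tips (root included)."""
    return sum(tree[n][1] for n in branches_spanned(tree, tips))

def pd_dissimilarity(tree, tips_a, tips_b):
    """One minus the fraction of spanned branch length shared by two communities."""
    a, b = branches_spanned(tree, tips_a), branches_spanned(tree, tips_b)
    shared = sum(tree[n][1] for n in a & b)
    total = sum(tree[n][1] for n in a | b)
    return 1.0 - shared / total

# Hypothetical tree ((A:1,B:1)X:2,(C:1.5,D:1.5)Y:1.5)root.
tree = {"A": ("X", 1.0), "B": ("X", 1.0), "X": ("root", 2.0),
        "C": ("Y", 1.5), "D": ("Y", 1.5), "Y": ("root", 1.5)}
pd_ab = phylogenetic_diversity(tree, {"A", "B"})
pd_ac = phylogenetic_diversity(tree, {"A", "C"})
dissim = pd_dissimilarity(tree, {"A"}, {"B"})
```

    Communities A+B and A+C both hold two species, yet the second spans more branch length, which is the sense in which PD extends species-level counts to the feature level.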

  8. Interpreting the universal phylogenetic tree

    Science.gov (United States)

    Woese, C. R.

    2000-01-01

    The universal phylogenetic tree not only spans all extant life, but its root and earliest branchings represent stages in the evolutionary process before modern cell types had come into being. The evolution of the cell is an interplay between vertically derived and horizontally acquired variation. Primitive cellular entities were necessarily simpler and more modular in design than are modern cells. Consequently, horizontal gene transfer early on was pervasive, dominating the evolutionary dynamic. The root of the universal phylogenetic tree represents the first stage in cellular evolution when the evolving cell became sufficiently integrated and stable to the erosive effects of horizontal gene transfer that true organismal lineages could exist.

  9. A simple method to estimate interwell autocorrelation

    Energy Technology Data Exchange (ETDEWEB)

    Pizarro, J.O.S.; Lake, L.W. [Univ. of Texas, Austin, TX (United States)

    1997-08-01

    The estimation of autocorrelation in the lateral or interwell direction is important when performing reservoir characterization studies using stochastic modeling. This paper presents a new method to estimate the interwell autocorrelation based on parameters, such as the vertical range and the variance, that can be estimated with commonly available data. We used synthetic fields that were generated from stochastic simulations to provide data to construct the estimation charts. These charts relate the ratio of areal to vertical variance and the autocorrelation range (expressed variously) in two directions. Three different semivariogram models were considered: spherical, exponential and truncated fractal. The overall procedure is demonstrated using field data. We find that the approach gives the most self-consistent results when it is applied to previously identified facies. Moreover, the autocorrelation trends follow the depositional pattern of the reservoir, which gives confidence in the validity of the approach.
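
    Two of the semivariogram models named above have standard closed forms, sketched here for reference. The sketch assumes zero nugget and, for the exponential model, the practical-range convention in which the curve reaches about 95% of the sill at the range; the truncated fractal model is not covered, and the parameter values are hypothetical.

```python
import numpy as np

def spherical(h, sill, a):
    """Spherical semivariogram: rises to the sill at lag a and stays flat beyond it."""
    r = np.minimum(np.asarray(h, dtype=float) / a, 1.0)
    return sill * (1.5 * r - 0.5 * r**3)

def exponential(h, sill, a):
    """Exponential semivariogram with practical range a (about 95% of the sill at h = a)."""
    h = np.asarray(h, dtype=float)
    return sill * (1.0 - np.exp(-3.0 * h / a))

# Hypothetical lags and parameters.
lags = np.array([0.0, 50.0, 100.0, 200.0])
gamma_sph = spherical(lags, sill=2.0, a=100.0)
gamma_exp = exponential(lags, sill=2.0, a=100.0)
```

    Comparing the two curves at the same range shows why the choice of model matters for the charts: the spherical model reaches the sill exactly, while the exponential model only approaches it.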

  10. A phylogenetic method to perform genome-wide association studies in microbes that accounts for population structure and recombination.

    Directory of Open Access Journals (Sweden)

    Caitlin Collins

    2018-02-01

    Full Text Available Genome-Wide Association Studies (GWAS) in microbial organisms have the potential to vastly improve the way we understand, manage, and treat infectious diseases. Yet, microbial GWAS methods established thus far remain insufficiently able to capitalise on the growing wealth of bacterial and viral genetic sequence data. Facing clonal population structure and homologous recombination, existing GWAS methods struggle to achieve both the precision necessary to reject spurious findings and the power required to detect associations in microbes. In this paper, we introduce a novel phylogenetic approach that has been tailor-made for microbial GWAS, which is applicable to organisms ranging from purely clonal to frequently recombining, and to both binary and continuous phenotypes. Our approach is robust to the confounding effects of both population structure and recombination, while maintaining high statistical power to detect associations. Thorough testing via application to simulated data provides strong support for the power and specificity of our approach and demonstrates the advantages offered over alternative cluster-based and dimension-reduction methods. Two applications to Neisseria meningitidis illustrate the versatility and potential of our method, confirming previously-identified penicillin resistance loci and resulting in the identification of both well-characterised and novel drivers of invasive disease. Our method is implemented as an open-source R package called treeWAS which is freely available at https://github.com/caitiecollins/treeWAS.

  11. Extended molecular phylogenetics and revised systematics of Malagasy scincine lizards.

    Science.gov (United States)

    Erens, Jesse; Miralles, Aurélien; Glaw, Frank; Chatrou, Lars W; Vences, Miguel

    2017-02-01

    Among the endemic biota of Madagascar, skinks are a diverse radiation of lizards that exhibit a striking ecomorphological variation, and could provide an interesting system to study body-form evolution in squamate reptiles. We provide a new phylogenetic hypothesis for Malagasy skinks of the subfamily Scincinae based on an extended molecular dataset comprising 8060 bp from three mitochondrial and nine nuclear loci. Our analysis also increases taxon sampling of the genus Amphiglossus by including 16 out of 25 nominal species. Additionally, we examined whether the molecular phylogenetic patterns coincide with morphological differentiation in the species currently assigned to this genus. Various methods of inference recover a mostly strongly supported phylogeny with three main clades of Amphiglossus. However, relationships among these three clades and the limb-reduced genera Grandidierina, Voeltzkowia and Pygomeles remain uncertain. Supported by a variety of morphological differences (predominantly related to the degree of body elongation), but considering the remaining phylogenetic uncertainty, we propose a redefinition of Amphiglossus into three different genera (Amphiglossus sensu stricto, Flexiseps new genus, and Brachyseps new genus) to remove the non-monophyly of Amphiglossus sensu lato and to facilitate future studies on this fascinating group of lizards. Copyright © 2016 Elsevier Inc. All rights reserved.

  12. Monogenean anchor morphometry: systematic value, phylogenetic signal, and evolution

    Science.gov (United States)

    Soo, Oi Yoon Michelle; Tan, Wooi Boon; Lim, Lee Hong Susan

    2016-01-01

    Background. Anchors are one of the important attachment appendages for monogenean parasites. Common descent and evolutionary processes have left their mark on anchor morphometry, in the form of patterns of shape and size variation useful for systematic and evolutionary studies. When combined with morphological and molecular data, analysis of anchor morphometry can potentially answer a wide range of biological questions. Materials and Methods. We used data from anchor morphometry, body size and morphology of 13 Ligophorus (Monogenea: Ancyrocephalidae) species infecting two marine mugilid (Teleostei: Mugilidae) fish hosts: Moolgarda buchanani (Bleeker) and Liza subviridis (Valenciennes) from Malaysia. Anchor shape and size data (n = 530) were generated using methods of geometric morphometrics. We used 28S rRNA, 18S rRNA, and ITS1 sequence data to infer a maximum likelihood phylogeny. We discriminated species using principal component and cluster analysis of shape data. Adams’s Kmult was used to detect phylogenetic signal in anchor shape. Phylogeny-correlated size and shape changes were investigated using continuous character mapping and directional statistics, respectively. We assessed morphological constraints in anchor morphometry using phylogenetic regression of anchor shape against body size and anchor size. Anchor morphological integration was studied using partial least squares method. The association between copulatory organ morphology and anchor shape and size in phylomorphospace was used to test the Rohde-Hobbs hypothesis. We created monogeneaGM, a new R package that integrates analyses of monogenean anchor geometric morphometric data with morphological and phylogenetic data. Results. We discriminated 12 of the 13 Ligophorus species using anchor shape data. Significant phylogenetic signal was detected in anchor shape. Thus, we discovered new morphological characters based on anchor shaft shape, the length between the inner root point and the outer root

  13. Complete mitochondrial genome of the Indian peafowl (Pavo cristatus), with phylogenetic analysis in Phasianidae.

    Science.gov (United States)

    Zhou, Tai-Cheng; Sha, Tao; Irwin, David M; Zhang, Ya-Ping

    2015-01-01

    Pavo cristatus, known as the Indian peafowl, is endemic to India and Sri Lanka and has been domesticated for its ornamental and food value. However, its phylogenetic status is still debated. Here, to clarify the phylogenetic status of P. cristatus within Phasianidae, we analyzed its mitochondrial genome (mtDNA). The complete mitochondrial DNA (mtDNA) genome was determined using 34 pairs of primers. Our data show that the mtDNA genome of P. cristatus is 16,686 bp in length. Molecular phylogenetic analysis of P. cristatus was performed along with 22 complete mtDNA genomes belonging to other species in Phasianidae using Bayesian and maximum likelihood methods, where Aythya americana and Anas platyrhynchos were used as outgroups. Our results show that P. cristatus has its closest genetic affinity with Pavo muticus and belongs to a clade that contains Gallus, Bambusicola and Francolinus.

  14. THE METHODS FOR ESTIMATING REGIONAL PROFESSIONAL MOBILE RADIO MARKET POTENTIAL

    Directory of Open Access Journals (Sweden)

    Y.A. Korobeynikov

    2008-12-01

    Full Text Available The paper presents the author’s methods of estimating the potential of the regional professional mobile radio market, which belongs to the high-tech b2b markets. These methods take into consideration such market peculiarities as the great range and complexity of products, technological constraints, and the infrastructure development needed for operating the technological systems. The paper gives an estimate of the professional mobile radio market potential in the Perm region. This estimate is already used by one of the systems integrators for its strategy development.

  15. Utilization of complete chloroplast genomes for phylogenetic studies

    NARCIS (Netherlands)

    Ramlee, Shairul Izan Binti

    2016-01-01

    Chloroplast DNA sequence polymorphisms are a primary source of data in many plant phylogenetic studies. The chloroplast genome is relatively conserved in its evolution making it an ideal molecule to retain phylogenetic signals. The chloroplast genome is also largely, but not completely, free from

  16. Phylogenetic position of Loricifera inferred from nearly complete 18S and 28S rRNA gene sequences

    OpenAIRE

    Yamasaki, Hiroshi; Fujimoto, Shinta; Miyazaki, Katsumi

    2015-01-01

    Background Loricifera is an enigmatic metazoan phylum; its morphology appeared to place it with Priapulida and Kinorhyncha in the group Scalidophora which, along with Nematoida (Nematoda and Nematomorpha), comprised the group Cycloneuralia. Scarce molecular data have suggested an alternative phylogenetic hypothesis, that the phylum Loricifera is a sister taxon to Nematomorpha, although the actual phylogenetic position of the phylum remains unclear. Methods Ecdysozoan phylogeny was reconstruct...

  17. Comparative study of the geostatistical ore reserve estimation method over the conventional methods

    International Nuclear Information System (INIS)

    Kim, Y.C.; Knudsen, H.P.

    1975-01-01

    Part I contains a comprehensive treatment of the comparative study of the geostatistical ore reserve estimation method over the conventional methods. The conventional methods chosen for comparison were: (a) the polygon method, (b) the inverse of the distance squared method, and (c) a method similar to (b) but allowing different weights in different directions. Briefly, the overall result from this comparative study is in favor of the use of geostatistics in most cases because the method has lived up to its theoretical claims. A good exposition on the theory of geostatistics, the adopted study procedures, conclusions and recommended future research are given in Part I. Part II of this report contains the results of the second and the third study objectives, which are to assess the potential benefits that can be derived by the introduction of the geostatistical method to the current state-of-the-art in uranium reserve estimation method and to be instrumental in generating the acceptance of the new method by practitioners through illustrative examples, assuming its superiority and practicality. These are given in the form of illustrative examples on the use of geostatistics and the accompanying computer program user's guide

  18. Using phylogenetically-informed annotation (PIA) to search for light-interacting genes in transcriptomes from non-model organisms.

    Science.gov (United States)

    Speiser, Daniel I; Pankey, M Sabrina; Zaharoff, Alexander K; Battelle, Barbara A; Bracken-Grissom, Heather D; Breinholt, Jesse W; Bybee, Seth M; Cronin, Thomas W; Garm, Anders; Lindgren, Annie R; Patel, Nipam H; Porter, Megan L; Protas, Meredith E; Rivera, Ajna S; Serb, Jeanne M; Zigler, Kirk S; Crandall, Keith A; Oakley, Todd H

    2014-11-19

    Tools for high throughput sequencing and de novo assembly make the analysis of transcriptomes (i.e. the suite of genes expressed in a tissue) feasible for almost any organism. Yet a challenge for biologists is that it can be difficult to assign identities to gene sequences, especially from non-model organisms. Phylogenetic analyses are one useful method for assigning identities to these sequences, but such methods tend to be time-consuming because of the need to re-calculate trees for every gene of interest and each time a new data set is analyzed. In response, we employed existing tools for phylogenetic analysis to produce a computationally efficient, tree-based approach for annotating transcriptomes or new genomes that we term Phylogenetically-Informed Annotation (PIA), which places uncharacterized genes into pre-calculated phylogenies of gene families. We generated maximum likelihood trees for 109 genes from a Light Interaction Toolkit (LIT), a collection of genes that underlie the function or development of light-interacting structures in metazoans. To do so, we searched protein sequences predicted from 29 fully-sequenced genomes and built trees using tools for phylogenetic analysis in the Osiris package of Galaxy (an open-source workflow management system). Next, to rapidly annotate transcriptomes from organisms that lack sequenced genomes, we repurposed a maximum likelihood-based Evolutionary Placement Algorithm (implemented in RAxML) to place sequences of potential LIT genes on to our pre-calculated gene trees. Finally, we implemented PIA in Galaxy and used it to search for LIT genes in 28 newly-sequenced transcriptomes from the light-interacting tissues of a range of cephalopod mollusks, arthropods, and cubozoan cnidarians. Our new trees for LIT genes are available on the Bitbucket public repository ( http://bitbucket.org/osiris_phylogenetics/pia/ ) and we demonstrate PIA on a publicly-accessible web server ( http://galaxy-dev.cnsi.ucsb.edu/pia/ ). 

  19. Estimating misclassification error: a closer look at cross-validation based methods

    Directory of Open Access Journals (Sweden)

    Ounpraseuth Songthip

    2012-11-01

    Background: To estimate a classifier’s error in predicting future observations, bootstrap methods have been proposed as reduced-variation alternatives to traditional cross-validation (CV) methods based on sampling without replacement. Monte Carlo (MC) simulation studies aimed at estimating the true misclassification error conditional on the training set are commonly used to compare CV methods. We conducted an MC simulation study to compare a new method of bootstrap CV (BCV) to k-fold CV for estimating classification error. Findings: For the low-dimensional conditions simulated, the modest positive bias of k-fold CV contrasted sharply with the substantial negative bias of the new BCV method. This behavior was corroborated using a real-world dataset of prognostic gene-expression profiles in breast cancer patients. Our simulation results demonstrate some extreme characteristics of variance and bias that can occur due to a fault in the design of CV exercises aimed at estimating the true conditional error of a classifier, and that appear not to have been fully appreciated in previous studies. Although CV is a sound practice for estimating a classifier’s generalization error, using CV to estimate the fixed misclassification error of a trained classifier conditional on the training set is problematic. While MC simulation of this estimation exercise can correctly represent the average bias of a classifier, it will overstate the between-run variance of the bias. Conclusions: We recommend k-fold CV over the new BCV method for estimating a classifier’s generalization error. The extreme negative bias of BCV is too high a price to pay for its reduced variance.
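
    For readers unfamiliar with k-fold CV, the sketch below shows the basic procedure: each observation is held out exactly once, and the misclassification rate is averaged over all held-out predictions. The nearest-centroid classifier, the one-dimensional toy data and all function names are assumptions chosen for brevity; the study's own classifiers and the BCV method are not reproduced.

```python
def k_fold_indices(n, k):
    """Deterministic folds: observation i goes to fold i % k."""
    return [[i for i in range(n) if i % k == f] for f in range(k)]


def fit_centroids(xs, ys):
    """Toy classifier: store the mean of x for each class label."""
    sums, counts = {}, {}
    for x, y in zip(xs, ys):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def predict(centroids, x):
    """Assign x to the class with the nearest centroid."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))


def k_fold_error(xs, ys, k):
    """Estimate the misclassification rate by k-fold cross-validation."""
    n = len(xs)
    errors = 0
    for test_idx in k_fold_indices(n, k):
        held_out = set(test_idx)
        train_x = [xs[i] for i in range(n) if i not in held_out]
        train_y = [ys[i] for i in range(n) if i not in held_out]
        model = fit_centroids(train_x, train_y)
        # Count errors only on observations the model never saw.
        errors += sum(predict(model, xs[i]) != ys[i] for i in test_idx)
    return errors / n


xs = [0.0, 0.1, 0.2, 0.3, 5.0, 5.1, 5.2, 5.3]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
print(k_fold_error(xs, ys, k=4))  # 0.0 on this cleanly separated toy data
```

    In practice the folds would usually be drawn at random; the deterministic assignment here only keeps the example reproducible.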

  20. Maximum Likelihood-Based Methods for Target Velocity Estimation with Distributed MIMO Radar

    Directory of Open Access Journals (Sweden)

    Zhenxin Cao

    2018-02-01

    The estimation problem for target velocity is addressed for a distributed multi-input multi-output (MIMO) radar system. A maximum likelihood (ML)-based estimation method is derived with knowledge of the target position. Then, for the scenario without knowledge of the target position, an iterative method is proposed to estimate the target velocity by updating the position information iteratively. Moreover, the Cramér-Rao Lower Bounds (CRLBs) for both scenarios are derived, and the performance degradation of velocity estimation without the position information is also expressed. Simulation results show that the proposed estimation methods can approach the CRLBs, and that the velocity estimation performance can be further improved by increasing either the number of radar antennas or the accuracy of the target position information. Furthermore, compared with the existing methods, a better estimation performance can be achieved.

  1. Paleogenetic analyses reveal unsuspected phylogenetic affinities between mice and the extinct Malpaisomys insularis, an endemic rodent of the Canaries.

    Directory of Open Access Journals (Sweden)

    Marie Pagès

    BACKGROUND: The lava mouse, Malpaisomys insularis, was endemic to the Eastern Canary islands and became extinct at the beginning of the 14th century when the Europeans reached the archipelago. Studies to determine Malpaisomys' phylogenetic affinities, based on morphological characters, remained inconclusive because morphological changes experienced by this insular rodent make phylogenetic investigations a real challenge. Over 20 years since its first description, Malpaisomys' phylogenetic position remains enigmatic. METHODOLOGY/PRINCIPAL FINDINGS: In this study, we resolved this issue using molecular characters. Mitochondrial and nuclear markers were successfully amplified from subfossils of three lava mouse samples. Molecular phylogenetic reconstructions revealed, without any ambiguity, unsuspected relationships between Malpaisomys and extant mice (genus Mus, Murinae). Moreover, through molecular dating we estimated the origin of the Malpaisomys/mouse clade at 6.9 Ma, corresponding to the maximal age at which the archipelago was colonised by the Malpaisomys ancestor via natural rafting. CONCLUSION/SIGNIFICANCE: This study reconsiders the derived morphological characters of Malpaisomys in light of this unexpected molecular finding. To reconcile molecular and morphological data, we propose to consider Malpaisomys insularis as an insular lineage of mouse.

  2. A database of phylogenetically atypical genes in archaeal and bacterial genomes, identified using the DarkHorse algorithm

    Directory of Open Access Journals (Sweden)

    Allen Eric E

    2008-10-01

    Background: The process of horizontal gene transfer (HGT) is believed to be widespread in Bacteria and Archaea, but little comparative data is available addressing its occurrence in complete microbial genomes. Collection of high-quality, automated HGT prediction data based on phylogenetic evidence has previously been impractical for large numbers of genomes at once, due to prohibitive computational demands. DarkHorse, a recently described statistical method for discovering phylogenetically atypical genes on a genome-wide basis, provides a means to solve this problem through lineage probability index (LPI) ranking scores. LPI scores inversely reflect phylogenetic distance between a test amino acid sequence and its closest available database matches. Proteins with low LPI scores are good horizontal gene transfer candidates; those with high scores are not. Description: The DarkHorse algorithm has been applied to 955 microbial genome sequences, and the results organized into a web-searchable relational database, called the DarkHorse HGT Candidate Resource (http://darkhorse.ucsd.edu). Users can select individual genomes or groups of genomes to screen by LPI score, search for protein functions by descriptive annotation or amino acid sequence similarity, or select proteins with unusual G+C composition in their underlying coding sequences. The search engine reports LPI scores for match partners as well as query sequences, providing the opportunity to explore whether potential HGT donor sequences are phylogenetically typical or atypical within their own genomes. This information can be used to predict whether or not sufficient information is available to build a well-supported phylogenetic tree using the potential donor sequence.
Conclusion The DarkHorse HGT Candidate database provides a powerful, flexible set of tools for identifying phylogenetically atypical proteins, allowing researchers to explore both individual HGT events in single genomes, and

  3. Benchmarking Foot Trajectory Estimation Methods for Mobile Gait Analysis

    Directory of Open Access Journals (Sweden)

    Julius Hannink

    2017-08-01

    Mobile gait analysis systems based on inertial sensing on the shoe are applied in a wide range of applications. Especially for medical applications, they can give new insights into motor impairment in, e.g., neurodegenerative disease and help objectify patient assessment. One key component in these systems is the reconstruction of the foot trajectories from inertial data. In the literature, various methods for this task have been proposed. However, performance is evaluated on a variety of datasets due to the lack of large, generally accepted benchmark datasets. This hinders a fair comparison of methods. In this work, we implement three orientation estimation and three double integration schemes for use in a foot trajectory estimation pipeline. All methods are drawn from the literature and evaluated against a marker-based motion capture reference. We provide a fair comparison on the same dataset consisting of 735 strides from 16 healthy subjects. As a result, the implemented methods are ranked and we identify the most suitable processing pipeline for foot trajectory estimation in the context of mobile gait analysis.
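
    To make the phrase "double integration scheme" concrete, here is a minimal one-axis sketch: acceleration is integrated to velocity with the trapezoidal rule, a linear drift is removed so that velocity returns to zero at the end of the stride, and the corrected velocity is integrated again to position. The zero-velocity assumption at both stride boundaries, the linear drift model and the function name are assumptions for illustration; the abstract does not say which three schemes the authors benchmarked.

```python
def integrate_stride(acc, dt):
    """Double-integrate one stride of 1-D acceleration samples.

    acc: gravity-free acceleration samples in m/s^2 (hypothetical input).
    dt:  sampling interval in seconds.
    Assumes the foot is at rest at the first and last sample.
    Returns (velocity, position) lists of the same length as acc.
    """
    # First integration: trapezoidal rule, starting from rest.
    vel = [0.0]
    for a0, a1 in zip(acc, acc[1:]):
        vel.append(vel[-1] + 0.5 * (a0 + a1) * dt)

    # Linear de-drifting: force the final velocity back to zero.
    n = len(vel) - 1
    if n > 0:
        drift = vel[-1]
        vel = [v - drift * i / n for i, v in enumerate(vel)]

    # Second integration: trapezoidal rule on the corrected velocity.
    pos = [0.0]
    for v0, v1 in zip(vel, vel[1:]):
        pos.append(pos[-1] + 0.5 * (v0 + v1) * dt)
    return vel, pos


# A constant sensor bias of 0.1 m/s^2 is added to a clean signal;
# linear de-drifting cancels its effect on velocity.
biased = [a + 0.1 for a in [0.0, 1.0, 0.0, -1.0, 0.0]]
velocity, position = integrate_stride(biased, dt=1.0)
print(round(position[-1], 6))
```

    A full pipeline would first rotate the sensor readings into a global frame using the estimated orientation and subtract gravity; this sketch starts after that step.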

  4. On the information content of discrete phylogenetic characters.

    Science.gov (United States)

    Bordewich, Magnus; Deutschmann, Ina Maria; Fischer, Mareike; Kasbohm, Elisa; Semple, Charles; Steel, Mike

    2017-12-16

    Phylogenetic inference aims to reconstruct the evolutionary relationships of different species based on genetic (or other) data. Discrete characters are a particular type of data, which contain information on how the species should be grouped together. However, it has long been known that some characters contain more information than others. For instance, a character that assigns the same state to each species groups all of them together and so provides no insight into the relationships of the species considered. At the other extreme, a character that assigns a different state to each species also conveys no phylogenetic signal. In this manuscript, we study a natural combinatorial measure of the information content of an individual character and analyse properties of characters that provide the maximum phylogenetic information, particularly, the number of states such a character uses and how the different states have to be distributed among the species or taxa of the phylogenetic tree.

  5. DendroPy: a Python library for phylogenetic computing.

    Science.gov (United States)

    Sukumaran, Jeet; Holder, Mark T

    2010-06-15

    DendroPy is a cross-platform library for the Python programming language that provides for object-oriented reading, writing, simulation and manipulation of phylogenetic data, with an emphasis on phylogenetic tree operations. DendroPy uses a splits-hash mapping to perform rapid calculations of tree distances, similarities and shape under various metrics. It contains rich simulation routines to generate trees under a number of different phylogenetic and coalescent models. DendroPy's data simulation and manipulation facilities, in conjunction with its support of a broad range of phylogenetic data formats (NEXUS, Newick, PHYLIP, FASTA, NeXML, etc.), allow it to serve a useful role in various phyloinformatics and phylogeographic pipelines. The stable release of the library is available for download and automated installation through the Python Package Index site (http://pypi.python.org/pypi/DendroPy), while the active development source code repository is available to the public from GitHub (http://github.com/jeetsukumaran/DendroPy).
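
    The idea behind computing tree distances from splits can be sketched without the library: each internal edge of a tree is encoded as a bitmask of the taxa on one side, and two trees are compared by set operations on those masks. The code below is a pure-Python illustration that computes the unweighted Robinson-Foulds distance this way. It does not use or reproduce DendroPy's API, and the nested-tuple tree representation and all function names are assumptions for this example.

```python
def leaf_names(tree):
    """Collect leaf labels from a tree given as nested tuples of strings."""
    if isinstance(tree, str):
        return [tree]
    names = []
    for child in tree:
        names.extend(leaf_names(child))
    return names


def splits(tree, index):
    """Return the set of non-trivial splits of `tree` as integer bitmasks."""
    full = (1 << len(index)) - 1
    found = set()

    def visit(node):
        if isinstance(node, str):
            return 1 << index[node]
        mask = 0
        for child in node:
            mask |= visit(child)
        # Normalise so each split is stored as the side without taxon 0;
        # this makes the encoding independent of where the tree is rooted.
        norm = full ^ mask if mask & 1 else mask
        # Keep only splits with at least two taxa on each side.
        if bin(norm).count("1") >= 2 and bin(full ^ norm).count("1") >= 2:
            found.add(norm)
        return mask

    visit(tree)
    return found


def rf_distance(tree_a, tree_b):
    """Unweighted Robinson-Foulds distance between two trees on the same taxa."""
    taxa = sorted(leaf_names(tree_a))
    if taxa != sorted(leaf_names(tree_b)):
        raise ValueError("trees must share the same taxon set")
    index = {name: i for i, name in enumerate(taxa)}
    # Splits present in exactly one of the two trees.
    return len(splits(tree_a, index) ^ splits(tree_b, index))


t1 = (("A", "B"), ("C", ("D", "E")))
t2 = (("A", "C"), ("B", ("D", "E")))
print(rf_distance(t1, t2))  # 2: the trees share the split DE | ABC only
```

    Because each split is a single integer, membership tests and symmetric differences are hash lookups, which is what makes this style of comparison fast on large tree samples.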

  6. Statistical methods of parameter estimation for deterministically chaotic time series

    Science.gov (United States)