WorldWideScience

Sample records for phylogenetic analysis inferred

  1. Inferring Phylogenetic Networks Using PhyloNet.

    Science.gov (United States)

    Wen, Dingqiao; Yu, Yun; Zhu, Jiafan; Nakhleh, Luay

    2018-07-01

    PhyloNet was released in 2008 as a software package for representing and analyzing phylogenetic networks. At the time of its release, the main functionalities in PhyloNet consisted of measures for comparing network topologies and a single heuristic for reconciling gene trees with a species tree. Since then, PhyloNet has grown significantly. The software package now includes a wide array of methods for inferring phylogenetic networks from data sets of unlinked loci while accounting for both reticulation (e.g., hybridization) and incomplete lineage sorting. In particular, PhyloNet now allows for maximum parsimony, maximum likelihood, and Bayesian inference of phylogenetic networks from gene tree estimates. Furthermore, Bayesian inference directly from sequence data (sequence alignments or biallelic markers) is implemented. Maximum parsimony is based on an extension of the "minimizing deep coalescences" criterion to phylogenetic networks, whereas maximum likelihood and Bayesian inference are based on the multispecies network coalescent. All methods allow for multiple individuals per species. As computing the likelihood of a phylogenetic network is computationally hard, PhyloNet allows for evaluation and inference of networks using a pseudolikelihood measure. PhyloNet summarizes the results of the various analyses and generates phylogenetic networks in the extended Newick format that is readily viewable by existing visualization software.
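
    As a rough illustration of the extended Newick output mentioned above, the sketch below lists the reticulation (hybrid) node labels in a network string. It assumes the common convention that a hybrid node is tagged with `#` followed by an optional type and an integer (for example `#H1`) and appears once per parent edge. The function name and the example network are hypothetical; this is not PhyloNet's own parser.

```python
import re
from collections import Counter

def reticulation_labels(enewick: str) -> dict:
    """Count occurrences of each hybrid-node tag (e.g. 'H1') in an
    extended Newick string. Assumes tags look like '#H1', '#LGT2' or '#3'."""
    tags = re.findall(r"#([A-Za-z]*\d+)", enewick)
    return dict(Counter(tags))

# Hypothetical three-taxon network with one reticulation node H1.
network = "((A,(B)#H1),(#H1,C));"
labels = reticulation_labels(network)
print(labels)
```

    A tag that occurs twice corresponds to a node with two parents, which is what distinguishes a network from a tree in this notation.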

  2. Phylogenetic Inference of HIV Transmission Clusters

    Directory of Open Access Journals (Sweden)

    Vlad Novitsky

    2017-10-01

    Better understanding the structure and dynamics of HIV transmission networks is essential for designing the most efficient interventions to prevent new HIV transmissions, and ultimately for gaining control of the HIV epidemic. The inference of phylogenetic relationships and the interpretation of results rely on the definition of the HIV transmission cluster. The definition of the HIV cluster is complex and dependent on multiple factors, including the design of sampling, accuracy of sequencing, precision of sequence alignment, evolutionary models, the phylogenetic method of inference, and specified thresholds for cluster support. While the majority of studies focus on clusters, non-clustered cases could also be highly informative. A new dimension in the analysis of the global and local HIV epidemics is the concept of phylogenetically distinct HIV sub-epidemics. The identification of active HIV sub-epidemics reveals spreading viral lineages and may help in the design of targeted interventions. HIV clustering can also be affected by sampling density. Obtaining a proper sampling density may increase statistical power and reduce sampling bias, so sampling density should be taken into account in study design and in interpretation of phylogenetic results. Finally, recent advances in long-range genotyping may enable more accurate inference of HIV transmission networks. If performed in real time, it could both inform public-health strategies and be clinically relevant (e.g., drug-resistance testing).
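
    To make the dependence of cluster definitions on thresholds concrete, here is a minimal sketch that groups sequences whose pairwise genetic distance falls at or below a cutoff (single-linkage grouping). The distance values, sample names, and the choice of a pure distance criterion are assumptions for illustration; the abstract does not prescribe a specific rule, and published studies often combine distance with branch-support thresholds.

```python
def threshold_clusters(distances, threshold):
    """Single-linkage clusters from pairwise distances.

    distances: dict mapping (sample_a, sample_b) -> genetic distance.
    Returns a list of sets with at least two members; samples left
    alone are the 'non-clustered' cases.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for (a, b), d in distances.items():
        ra, rb = find(a), find(b)  # registers both samples
        if d <= threshold and ra != rb:
            parent[ra] = rb

    groups = {}
    for x in list(parent):
        groups.setdefault(find(x), set()).add(x)
    return [g for g in groups.values() if len(g) >= 2]

# Made-up distances (substitutions per site) among four sequences.
example = {
    ("s1", "s2"): 0.010, ("s2", "s3"): 0.012, ("s1", "s3"): 0.030,
    ("s3", "s4"): 0.080, ("s1", "s4"): 0.090, ("s2", "s4"): 0.085,
}
print(threshold_clusters(example, 0.015))
print(threshold_clusters(example, 0.005))
```

    With these invented numbers the looser cutoff yields one cluster of three sequences and the stricter cutoff yields none, which mirrors the point that the cluster definition drives the result.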

  3. Co-Inheritance Analysis within the Domains of Life Substantially Improves Network Inference by Phylogenetic Profiling.

    Directory of Open Access Journals (Sweden)

    Junha Shin

    Phylogenetic profiling, a network inference method based on gene inheritance profiles, has been widely used to construct functional gene networks in microbes. However, its utility for network inference in higher eukaryotes has been limited. An improved algorithm with an in-depth understanding of pathway evolution may overcome this limitation. In this study, we investigated the effects of taxonomic structures on co-inheritance analysis using 2,144 reference species in four query species: Escherichia coli, Saccharomyces cerevisiae, Arabidopsis thaliana, and Homo sapiens. We observed three clusters of reference species based on a principal component analysis of the phylogenetic profiles, which correspond to the three domains of life (Archaea, Bacteria, and Eukaryota), suggesting that pathways are inherited primarily within specific domains or lower-ranked taxonomic groups during speciation. Hence, the co-inheritance pattern within a taxonomic group may be eroded by confounding inheritance patterns from irrelevant taxonomic groups. We demonstrated that co-inheritance analysis within domains substantially improved network inference not only in microbe species but also in the higher eukaryotes, including humans. Although we observed two sub-domain clusters of reference species within Eukaryota, co-inheritance analysis within these sub-domain taxonomic groups only marginally improved network inference. Therefore, we conclude that co-inheritance analysis within domains is the optimal approach to network inference with the given reference species. The construction of a series of human gene networks with increasing sample sizes of the reference species for each domain revealed that the size of the high-accuracy networks increased as additional reference species genomes were included, suggesting that within-domain co-inheritance analysis will continue to expand human gene networks as genomes of additional species are sequenced. Taken together, we propose that co
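
    The idea of scoring co-inheritance within a single domain, rather than across all reference species, can be sketched as follows. Mutual information between binary presence/absence profiles is used here as the similarity score; that choice, the toy profiles, and the function names are assumptions for illustration and are not taken from the abstract.

```python
import math

def mutual_information(a, b):
    """Mutual information (in bits) between two equal-length 0/1 profiles."""
    n = len(a)
    mi = 0.0
    for x in (0, 1):
        px = sum(1 for i in a if i == x) / n
        for y in (0, 1):
            py = sum(1 for j in b if j == y) / n
            pxy = sum(1 for i, j in zip(a, b) if i == x and j == y) / n
            if pxy > 0:
                mi += pxy * math.log2(pxy / (px * py))
    return mi

def within_domain_mi(profile_a, profile_b, domains, domain):
    """Score two gene profiles using only reference species in one domain."""
    idx = [k for k, d in enumerate(domains) if d == domain]
    return mutual_information([profile_a[k] for k in idx],
                              [profile_b[k] for k in idx])

# Toy data: 4 bacterial and 4 eukaryotic reference species (hypothetical).
domains = ["Bacteria"] * 4 + ["Eukaryota"] * 4
gene_a = [1, 0, 1, 0, 1, 1, 0, 0]
gene_b = [1, 0, 1, 0, 1, 0, 1, 0]

pooled = mutual_information(gene_a, gene_b)
bacteria_only = within_domain_mi(gene_a, gene_b, domains, "Bacteria")
print(pooled, bacteria_only)
```

    In this toy case the two genes are perfectly co-inherited among the bacterial species, but pooling all species dilutes that signal, which is the confounding effect the abstract describes.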

  4. An Improved Binary Differential Evolution Algorithm to Infer Tumor Phylogenetic Trees.

    Science.gov (United States)

    Liang, Ying; Liao, Bo; Zhu, Wen

    2017-01-01

    Tumourigenesis is a mutation accumulation process, which is likely to start with a mutated founder cell. The evolutionary nature of tumor development makes phylogenetic models suitable for inferring tumor evolution through genetic variation data. Copy number variation (CNV) is the major genetic marker of the genome with more genes, disease loci, and functional elements involved. Fluorescence in situ hybridization (FISH) accurately measures the copy numbers of multiple genes in hundreds of single cells. We propose an improved binary differential evolution algorithm, BDEP, to infer tumor phylogenetic trees based on the FISH platform. The topology analysis of the tumor progression tree shows that the pathway of tumor subcell expansion varies greatly during different stages of tumor formation. The classification experiment shows that tree-based features are better than data-based features in distinguishing tumors. The constructed phylogenetic trees perform well in characterizing the tumor development process and outperform other similar algorithms.

  5. Some limitations of public sequence data for phylogenetic inference (in plants).

    Science.gov (United States)

    Hinchliff, Cody E; Smith, Stephen Andrew

    2014-01-01

    The GenBank database contains essentially all of the nucleotide sequence data generated for published molecular systematic studies, but for the majority of taxa these data remain sparse. GenBank has value for phylogenetic methods that leverage data-mining and rapidly improving computational methods, but the limits imposed by the sparse structure of the data are not well understood. Here we present a tree representing 13,093 land plant genera--an estimated 80% of extant plant diversity--to illustrate the potential of public sequence data for broad phylogenetic inference in plants, and we explore the limits to inference imposed by the structure of these data using theoretical foundations from phylogenetic data decisiveness. We find that despite very high levels of missing data (over 96%), the present data retain the potential to inform over 86.3% of all possible phylogenetic relationships. Most of these relationships, however, are informed by small amounts of data--approximately half are informed by fewer than four loci, and more than 99% are informed by fewer than fifteen. We also apply an information theoretic measure of branch support to assess the strength of phylogenetic signal in the data, revealing many poorly supported branches concentrated near the tips of the tree, where data are sparse and the limiting effects of this sparseness are stronger. We argue that limits to phylogenetic inference and signal imposed by low data coverage may pose significant challenges for comprehensive phylogenetic inference at the species level. Computational requirements provide additional limits for large reconstructions, but these may be overcome by methodological advances, whereas insufficient data coverage can only be remedied by additional sampling effort. We conclude that public databases have exceptional value for modern systematics and evolutionary biology, and that a continued emphasis on expanding taxonomic and genomic coverage will play a critical role in developing

  6. TREEFINDER: a powerful graphical analysis environment for molecular phylogenetics

    Directory of Open Access Journals (Sweden)

    von Haeseler Arndt

    2004-06-01

    Background: Most analysis programs for inferring molecular phylogenies are difficult to use, in particular for researchers with little programming experience. Results: TREEFINDER is an easy-to-use integrative platform-independent analysis environment for molecular phylogenetics. In this paper the main features of TREEFINDER (version of April 2004) are described. TREEFINDER is written in ANSI C and Java and implements powerful statistical approaches for inferring gene trees and related analyses. In addition, it provides a user-friendly graphical interface and a phylogenetic programming language. Conclusions: TREEFINDER is a versatile framework for analyzing phylogenetic data across different platforms that is suited for both exploratory and advanced studies.

  7. Genetic variation and phylogenetic relationship analysis of Jatropha curcas L. inferred from nrDNA ITS sequences.

    Science.gov (United States)

    Guo, Guo-Ye; Chen, Fang; Shi, Xiao-Dong; Tian, Yin-Shuai; Yu, Mao-Qun; Han, Xue-Qin; Yuan, Li-Chun; Zhang, Ying

    2016-01-01

    Genetic variation and phylogenetic relationships among 102 Jatropha curcas accessions from Asia, Africa, and the Americas were assessed using the internal transcribed spacer region of nuclear ribosomal DNA (nrDNA ITS). The average G+C content (65.04%) was considerably higher than the A+T (34.96%) content. The estimated genetic diversity revealed moderate genetic variation. The pairwise genetic divergences (GD) between haplotypes were evaluated and ranged from 0.000 to 0.017, suggesting a higher level of genetic differentiation in Mexican accessions than those of other regions. Phylogenetic relationships and intraspecific divergence were inferred by Bayesian inference (BI), maximum parsimony (MP), and median joining (MJ) network analysis and were generally resolved. The J. curcas accessions were consistently divided into three lineages, groups A, B, and C, which demonstrated distant geographical isolation and genetic divergence between American accessions and those from other regions. The MJ network analysis confirmed that Central America was the possible center of origin. The putative migration route suggested that J. curcas was distributed from Mexico or Brazil, via Cape Verde and then split into two routes. One route was dispersed to Spain, then migrated to China, eventually spreading to southeastern Asia, while the other route was dispersed to Africa, via Madagascar and migrated to China, later spreading to southeastern Asia. Copyright © 2016 Académie des sciences. Published by Elsevier SAS. All rights reserved.
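
    The two summary statistics quoted above, G+C content and pairwise genetic divergence, can be illustrated with a short sketch. The abstract does not state which distance measure was used, so the uncorrected p-distance (proportion of differing sites) is an assumption here, and the sequences are invented.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C among unambiguous bases (A, C, G, T)."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    return sum(b in "GC" for b in bases) / len(bases)

def p_distance(s1: str, s2: str) -> float:
    """Proportion of differing sites between two aligned sequences,
    ignoring columns where either sequence has a gap or ambiguity code."""
    pairs = [(a, b) for a, b in zip(s1.upper(), s2.upper())
             if a in "ACGT" and b in "ACGT"]
    return sum(a != b for a, b in pairs) / len(pairs)

# Invented aligned fragments, for illustration only.
hap1 = "GGCCGTACGC-GCAT"
hap2 = "GGCCGTACGCAGCTT"
print(gc_content(hap1), p_distance(hap1, hap2))
```

    Identical haplotypes give a distance of 0, which is the lower bound of the 0.000 to 0.017 range reported in the abstract.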

  8. Inferring 'weak spots' in phylogenetic trees: application to mosasauroid nomenclature.

    Science.gov (United States)

    Madzia, Daniel; Cau, Andrea

    2017-01-01

    Mosasauroid squamates represented the apex predators within the Late Cretaceous marine and occasionally also freshwater ecosystems. Proper understanding of the origin of their ecological adaptations or paleobiogeographic dispersals requires adequate knowledge of their phylogeny. The studies assessing the position of mosasauroids on the squamate evolutionary tree and their origins have long given conflicting results. The phylogenetic relationships within Mosasauroidea, however, have experienced only minor changes throughout the last decades. Considering the substantial improvements in phylogenetic methodology that have occurred in recent years, resulting, among other things, in numerous alterations in the phylogenetic hypotheses of other fossil amniotes, we test the robustness of our understanding of mosasauroid beginnings and their evolutionary history. We re-examined a data set that results from modifications assembled in the course of the last 20 years and performed multiple parsimony analyses and Bayesian tip-dating analysis. Following the inferred topologies and the 'weak spots' in the phylogeny of mosasauroids, we revise the nomenclature of the 'traditionally' recognized mosasauroid clades, to acknowledge the overall weakness among branches and the alternative topologies suggested previously, and discuss several factors that might have an impact on the differing phylogenetic hypotheses and their statistical support.

  9. A Genome-Scale Investigation of How Sequence, Function, and Tree-Based Gene Properties Influence Phylogenetic Inference.

    Science.gov (United States)

    Shen, Xing-Xing; Salichos, Leonidas; Rokas, Antonis

    2016-09-02

    Molecular phylogenetic inference is inherently dependent on choices in both methodology and data. Many insightful studies have shown how choices in methodology, such as the model of sequence evolution or optimality criterion used, can strongly influence inference. In contrast, much less is known about the impact of choices in the properties of the data, typically genes, on phylogenetic inference. We investigated the relationships between 52 gene properties (24 sequence-based, 19 function-based, and 9 tree-based) with each other and with three measures of phylogenetic signal in two assembled data sets of 2,832 yeast and 2,002 mammalian genes. We found that most gene properties, such as evolutionary rate (measured through the percent average of pairwise identity across taxa) and total tree length, were highly correlated with each other. Similarly, several gene properties, such as gene alignment length, Guanine-Cytosine content, and the proportion of tree distance on internal branches divided by relative composition variability (treeness/RCV), were strongly correlated with phylogenetic signal. Analysis of partial correlations between gene properties and phylogenetic signal, in which gene evolutionary rate and alignment length were simultaneously controlled, showed similar patterns of correlations, albeit weaker in strength. Examination of the relative importance of each gene property on phylogenetic signal identified gene alignment length, along with the number of parsimony-informative sites and variable sites, as the most important predictors. Interestingly, the subsets of gene properties that optimally predicted phylogenetic signal differed considerably across our three phylogenetic measures and two data sets; however, gene alignment length and RCV were consistently included as predictors of all three phylogenetic measures in both yeasts and mammals. These results suggest that a handful of sequence-based gene properties are reliable predictors of phylogenetic signal

  10. Applying a multiobjective metaheuristic inspired by honey bees to phylogenetic inference.

    Science.gov (United States)

    Santander-Jiménez, Sergio; Vega-Rodríguez, Miguel A

    2013-10-01

    The development of increasingly popular multiobjective metaheuristics has allowed bioinformaticians to deal with optimization problems in computational biology where multiple objective functions must be taken into account. One of the most relevant research topics that can benefit from these techniques is phylogenetic inference. Throughout the years, different researchers have proposed their own view about the reconstruction of ancestral evolutionary relationships among species. As a result, biologists often report different phylogenetic trees from the same dataset when considering distinct optimality principles. In this work, we detail a multiobjective swarm intelligence approach based on the novel Artificial Bee Colony algorithm for inferring phylogenies. The aim of this paper is to propose a complementary view of phylogenetics according to the maximum parsimony and maximum likelihood criteria, in order to generate a set of phylogenetic trees that represent a compromise between these principles. Experimental results on a variety of nucleotide data sets and statistical studies highlight the relevance of the proposal with regard to other multiobjective algorithms and state-of-the-art biological methods. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
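
    A "set of phylogenetic trees that represent a compromise" between parsimony and likelihood is, in multiobjective terms, a Pareto front. The sketch below filters candidate trees to the non-dominated ones, minimizing parsimony score and maximizing log-likelihood. It illustrates only the dominance filter, not the Artificial Bee Colony search itself; the tree names and scores are made up.

```python
def pareto_front(trees):
    """Return the non-dominated trees.

    trees: list of (name, parsimony_score, log_likelihood) tuples.
    Lower parsimony score is better; higher log-likelihood is better.
    """
    def dominates(a, b):
        no_worse = a[1] <= b[1] and a[2] >= b[2]
        strictly_better = a[1] < b[1] or a[2] > b[2]
        return no_worse and strictly_better

    return [t for t in trees if not any(dominates(o, t) for o in trees)]

# Hypothetical candidate trees with invented scores.
candidates = [
    ("T1", 100, -2500.0),
    ("T2", 102, -2490.0),
    ("T3", 103, -2510.0),
    ("T4", 100, -2495.0),
]
front = pareto_front(candidates)
print([name for name, _, _ in front])
```

    Here T4 dominates T1 and T3, while T2 and T4 trade one criterion for the other, so both remain on the front.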

  11. Calibrated birth-death phylogenetic time-tree priors for Bayesian inference.

    Science.gov (United States)

    Heled, Joseph; Drummond, Alexei J

    2015-05-01

    Here we introduce a general class of multiple calibration birth-death tree priors for use in Bayesian phylogenetic inference. All tree priors in this class separate ancestral node heights into a set of "calibrated nodes" and "uncalibrated nodes" such that the marginal distribution of the calibrated nodes is user-specified whereas the density ratio of the birth-death prior is retained for trees with equal values for the calibrated nodes. We describe two formulations, one in which the calibration information informs the prior on ranked tree topologies, through the (conditional) prior, and the other which factorizes the prior on divergence times and ranked topologies, thus allowing uniform, or any arbitrary prior distribution on ranked topologies. Although the first of these formulations has some attractive properties, the algorithm we present for computing its prior density is computationally intensive. However, the second formulation is always faster and computationally efficient for up to six calibrations. We demonstrate the utility of the new class of multiple-calibration tree priors using both small simulations and a real-world analysis and compare the results to existing schemes. The two new calibrated tree priors described in this article offer greater flexibility and control of prior specification in calibrated time-tree inference and divergence time dating, and will remove the need for indirect approaches to the assessment of the combined effect of calibration densities and tree priors in Bayesian phylogenetic inference. © The Author(s) 2014. Published by Oxford University Press, on behalf of the Society of Systematic Biologists.

  12. Reconciling taxonomy and phylogenetic inference: formalism and algorithms for describing discord and inferring taxonomic roots

    Directory of Open Access Journals (Sweden)

    Matsen Frederick A

    2012-05-01

    Background: Although taxonomy is often used informally to evaluate the results of phylogenetic inference and the root of phylogenetic trees, algorithmic methods to do so are lacking. Results: In this paper we formalize these procedures and develop algorithms to solve the relevant problems. In particular, we introduce a new algorithm that solves a "subcoloring" problem to express the difference between a taxonomy and a phylogeny at a given rank. This algorithm improves upon the current best algorithm in terms of asymptotic complexity for the parameter regime of interest; we also describe a branch-and-bound algorithm that saves orders of magnitude in computation on real data sets. We also develop a formalism and an algorithm for rooting phylogenetic trees according to a taxonomy. Conclusions: The algorithms in this paper, and the associated freely-available software, will help biologists better use and understand taxonomically labeled phylogenetic trees.

  13. Phylogenetic relationships of Hemiptera inferred from mitochondrial and nuclear genes.

    Science.gov (United States)

    Song, Nan; Li, Hu; Cai, Wanzhi; Yan, Fengming; Wang, Jianyun; Song, Fan

    2016-11-01

    Here, we reconstructed the Hemiptera phylogeny based on the expanded mitochondrial protein-coding genes and the nuclear 18S rRNA gene, separately. The differential rates of change across lineages may be associated with the long-branch attraction (LBA) effect and result in conflicting estimates of phylogeny from different types of data. To reduce the potential effects of systematic biases on inferences of topology, various data coding schemes, a site removal method, and different algorithms were utilized in phylogenetic reconstruction. We show that the outgroups Phthiraptera, Thysanoptera, and the ingroup Sternorrhyncha share similar base composition, and exhibit "long branches" relative to other hemipterans. Thus, the long-branch attraction between these groups is suspected to cause the failure to recover Hemiptera under the homogeneous model. In contrast, a monophyletic Hemiptera is supported when a heterogeneous model is utilized in the analysis. Although higher level phylogenetic relationships within Hemiptera remain to be answered, consensus between analyses is beginning to converge on a stable phylogeny.

  14. Developing a statistically powerful measure for quartet tree inference using phylogenetic identities and Markov invariants.

    Science.gov (United States)

    Sumner, Jeremy G; Taylor, Amelia; Holland, Barbara R; Jarvis, Peter D

    2017-12-01

    Recently there has been renewed interest in phylogenetic inference methods based on phylogenetic invariants, alongside the related Markov invariants. Broadly speaking, both these approaches give rise to polynomial functions of sequence site patterns that, in expectation value, either vanish for particular evolutionary trees (in the case of phylogenetic invariants) or have well understood transformation properties (in the case of Markov invariants). While both approaches have been valued for their intrinsic mathematical interest, it is not clear how they relate to each other, and to what extent they can be used as practical tools for inference of phylogenetic trees. In this paper, by focusing on the special case of binary sequence data and quartets of taxa, we are able to view these two different polynomial-based approaches within a common framework. To motivate the discussion, we present three desirable statistical properties that we argue any invariant-based phylogenetic method should satisfy: (1) sensible behaviour under reordering of input sequences; (2) stability as the taxa evolve independently according to a Markov process; and (3) explicit dependence on the assumption of a continuous-time process. Motivated by these statistical properties, we develop and explore several new phylogenetic inference methods. In particular, we develop a statistically bias-corrected version of the Markov invariants approach which satisfies all three properties. We also extend previous work by showing that the phylogenetic invariants can be implemented in such a way as to satisfy property (3). A simulation study shows that, in comparison to other methods, our new proposed approach based on bias-corrected Markov invariants is extremely powerful for phylogenetic inference. The binary case is of particular theoretical interest as, in this case only, the Markov invariants can be expressed as linear combinations of the phylogenetic invariants. A wider implication of this is that, for

  15. galaxie--CGI scripts for sequence identification through automated phylogenetic analysis.

    Science.gov (United States)

    Nilsson, R Henrik; Larsson, Karl-Henrik; Ursing, Björn M

    2004-06-12

    The prevalent use of similarity searches like BLAST to identify sequences and species implicitly assumes the reference database to be of extensive sequence sampling. This is often not the case, restraining the correctness of the outcome as a basis for sequence identification. Phylogenetic inference outperforms similarity searches in retrieving correct phylogenies and consequently sequence identities, and a project was initiated to design a freely available script package for sequence identification through automated Web-based phylogenetic analysis. Three CGI scripts were designed to facilitate qualified sequence identification from a Web interface. Query sequences are aligned to pre-made alignments or to alignments made by ClustalW with entries retrieved from a BLAST search. The subsequent phylogenetic analysis is based on the PHYLIP package for inferring neighbor-joining and parsimony trees. The scripts are highly configurable. A service installation and a version for local use are found at http://andromeda.botany.gu.se/galaxiewelcome.html and http://galaxie.cgb.ki.se

  16. Applying phylogenetic analysis to viral livestock diseases: moving beyond molecular typing.

    Science.gov (United States)

    Olvera, Alex; Busquets, Núria; Cortey, Marti; de Deus, Nilsa; Ganges, Llilianne; Núñez, José Ignacio; Peralta, Bibiana; Toskano, Jennifer; Dolz, Roser

    2010-05-01

    Changes in livestock production systems in recent years have altered the presentation of many diseases resulting in the need for more sophisticated control measures. At the same time, new molecular assays have been developed to support the diagnosis of animal viral disease. Nucleotide sequences generated by these diagnostic techniques can be used in phylogenetic analysis to infer phenotypes by sequence homology and to perform molecular epidemiology studies. In this review, some key elements of phylogenetic analysis are highlighted, such as the selection of the appropriate neutral phylogenetic marker, the proper phylogenetic method and different techniques to test the reliability of the resulting tree. Examples are given of current and future applications of phylogenetic reconstructions in viral livestock diseases. Copyright 2009 Elsevier Ltd. All rights reserved.

  17. PALM: a paralleled and integrated framework for phylogenetic inference with automatic likelihood model selectors.

    Directory of Open Access Journals (Sweden)

    Shu-Hwa Chen

    BACKGROUND: Selecting an appropriate substitution model and deriving a tree topology for a given sequence set are essential in phylogenetic analysis. However, such time-consuming, computationally intensive tasks rely on knowledge of substitution model theories and related expertise to run through all possible combinations of several separate programs. To ensure a thorough and efficient analysis and avert tedious manipulations of various programs, this work presents an intuitive framework, the phylogenetic reconstruction with automatic likelihood model selectors (PALM), with convincing, updated algorithms and a best-fit model selection mechanism for seamless phylogenetic analysis. METHODOLOGY: As an integrated framework of ClustalW, PhyML, MODELTEST, ProtTest, and several in-house programs, PALM evaluates the fitness of 56 substitution models for nucleotide sequences and 112 substitution models for protein sequences with scores in various criteria. The input for PALM can be either sequences in FASTA format or a sequence alignment file in PHYLIP format. To accelerate the computing of maximum likelihood and bootstrapping, this work integrates MPICH2/PhyML, PalmMonitor and Palm job controller across several machines with multiple processors and adopts the task parallelism approach. Moreover, an intuitive and interactive web component, PalmTree, is developed for displaying and operating the output tree with options of tree rooting, branch swapping, viewing the branch length values, and viewing bootstrapping scores, as well as removing nodes to restart analysis iteratively. SIGNIFICANCE: The workflow of PALM is straightforward and coherent. Via a succinct, user-friendly interface, researchers unfamiliar with phylogenetic analysis can easily use this server to submit sequences, retrieve the output, and re-submit a job based on a previous result if some sequences are to be deleted or added for phylogenetic reconstruction. PALM results in an inference of
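
    Scoring substitution models "in various criteria" typically means information criteria such as AIC and BIC, which penalize the maximized log-likelihood by the number of free parameters. The sketch below ranks three nucleotide models this way. The log-likelihood values and alignment length are invented for illustration, only substitution-model parameters are counted (branch lengths are assumed to be shared across models), and the abstract does not state which criteria PALM reports.

```python
import math

def aic(log_lik: float, k: int) -> float:
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * k - 2 * log_lik

def bic(log_lik: float, k: int, n_sites: int) -> float:
    """Bayesian information criterion: k ln n - 2 ln L."""
    return k * math.log(n_sites) - 2 * log_lik

# model name -> (maximized log-likelihood, free substitution parameters)
# Log-likelihoods are made-up numbers, not results from any real data set.
fits = {
    "JC69": (-3050.0, 0),
    "HKY85": (-3000.0, 4),
    "GTR": (-2998.0, 8),
}
n_sites = 1000  # hypothetical alignment length

best_aic = min(fits, key=lambda m: aic(*fits[m]))
best_bic = min(fits, key=lambda m: bic(*fits[m], n_sites))
print(best_aic, best_bic)
```

    Lower scores are better under both criteria; in this invented example the extra parameters of GTR do not buy enough likelihood to beat HKY85.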

  18. Bacterial phylogenetic reconstruction from whole genomes is robust to recombination but demographic inference is not.

    Science.gov (United States)

    Hedge, Jessica; Wilson, Daniel J

    2014-11-25

    Phylogenetic inference in bacterial genomics is fundamental to understanding problems such as population history, antimicrobial resistance, and transmission dynamics. The field has been plagued by an apparent state of contradiction since the distorting effects of recombination on phylogeny were discovered more than a decade ago. Researchers persist with detailed phylogenetic analyses while simultaneously acknowledging that recombination seriously misleads inference of population dynamics and selection. Here we resolve this paradox by showing that phylogenetic tree topologies based on whole genomes robustly reconstruct the clonal frame topology but that branch lengths are badly skewed. Surprisingly, removing recombining sites can exacerbate branch length distortion caused by recombination. Phylogenetic tree reconstruction is a popular approach for understanding the relatedness of bacteria in a population from differences in their genome sequences. However, bacteria frequently exchange regions of their genomes by a process called homologous recombination, which violates a fundamental assumption of phylogenetic methods. Since many researchers continue to use phylogenetics for recombining bacteria, it is important to understand how recombination affects the conclusions drawn from these analyses. We find that whole-genome sequences afford great accuracy in reconstructing evolutionary relationships despite concerns surrounding the presence of recombination, but the branch lengths of the phylogenetic tree are indeed badly distorted. Surprisingly, methods to reduce the impact of recombination on branch lengths can exacerbate the problem. Copyright © 2014 Hedge and Wilson.

  19. Large-scale inference of gene function through phylogenetic annotation of Gene Ontology terms: case study of the apoptosis and autophagy cellular processes.

    Science.gov (United States)

    Feuermann, Marc; Gaudet, Pascale; Mi, Huaiyu; Lewis, Suzanna E; Thomas, Paul D

    2016-01-01

    We previously reported a paradigm for large-scale phylogenomic analysis of gene families that takes advantage of the large corpus of experimentally supported Gene Ontology (GO) annotations. This 'GO Phylogenetic Annotation' approach integrates GO annotations from evolutionarily related genes across ∼100 different organisms in the context of a gene family tree, in which curators build an explicit model of the evolution of gene functions. GO Phylogenetic Annotation models the gain and loss of functions in a gene family tree, which is used to infer the functions of uncharacterized (or incompletely characterized) gene products, even for human proteins that are relatively well studied. Here, we report our results from applying this paradigm to two well-characterized cellular processes, apoptosis and autophagy. This revealed several important observations with respect to GO annotations and how they can be used for function inference. Notably, we applied only a small fraction of the experimentally supported GO annotations to infer function in other family members. The majority of other annotations describe indirect effects, phenotypes or results from high throughput experiments. In addition, we show here how feedback from phylogenetic annotation leads to significant improvements in the PANTHER trees, the GO annotations and GO itself. Thus GO phylogenetic annotation both increases the quantity and improves the accuracy of the GO annotations provided to the research community. We expect these phylogenetically based annotations to be of broad use in gene enrichment analysis as well as other applications of GO annotations. Database URL: http://amigo.geneontology.org/amigo. © The Author(s) 2016. Published by Oxford University Press.

  20. Bryozoans are returning home: recolonization of freshwater ecosystems inferred from phylogenetic relationships.

    Science.gov (United States)

    Koletić, Nikola; Novosel, Maja; Rajević, Nives; Franjević, Damjan

    2015-01-01

    Bryozoans are aquatic invertebrates that inhabit all types of aquatic ecosystems. They are small animals that form large colonies by asexual budding. Colonies can reach the size of several tens of centimeters, while individual units within a colony are the size of a few millimeters. Each individual within a colony works as a separate zooid and is genetically identical to every other individual within the same colony. Most freshwater species of bryozoans belong to the Phylactolaemata class, while several species that tolerate brackish water belong to the Gymnolaemata class. Tissue samples for this study were collected in the rivers of the Adriatic and Danube basins and in the wetland areas in the continental part of Croatia (Europe). Freshwater and brackish taxa of bryozoans were genetically analyzed to reconstruct phylogenetic relationships between freshwater and brackish taxa of the Phylactolaemata and Gymnolaemata classes and to determine the role of brackish species in colonizing freshwater and marine ecosystems. Phylogenetic relationships inferred from the genes for 18S rRNA, 28S rRNA, COI, and the ITS2 region confirmed Phylactolaemata bryozoans as the radix bryozoan group. Phylogenetic analysis proved the close relationship of Phylactolaemata bryozoans with taxa from the Phoronida phylum as well as the separation of the Lophopodidae family from other families within the Plumatellida genus. Comparative analysis of existing knowledge about the phylogeny of bryozoans and the expansion of known evolutionary hypotheses is proposed with the model of settlement of marine and freshwater ecosystems by the bryozoans group during their evolutionary past. In this case study, brackish bryozoan taxa represent a link for this ecological phylogenetic hypothesis. Comparison of brackish bryozoan species Lophopus crystallinus and Conopeum seurati confirmed a dual colonization of freshwater ecosystems throughout the evolution of this group of animals.

  1. SWPhylo - A Novel Tool for Phylogenomic Inferences by Comparison of Oligonucleotide Patterns and Integration of Genome-Based and Gene-Based Phylogenetic Trees.

    Science.gov (United States)

    Yu, Xiaoyu; Reva, Oleg N

    2018-01-01

    Modern phylogenetic studies may benefit from the analysis of complete genome sequences of various microorganisms. Evolutionary inferences based on genome-scale analysis are believed to be more accurate than the gene-based alternative. However, the computational complexity of current phylogenomic procedures, inappropriateness of standard phylogenetic tools to process genome-wide data, and lack of reliable substitution models which correlate with alignment-free phylogenomic approaches deter microbiologists from using these opportunities. For example, the super-matrix and super-tree approaches of phylogenomics use multiple integrated genomic loci or individual gene-based trees to infer an overall consensus tree. However, these approaches potentially multiply errors of gene annotation and sequence alignment, not to mention the computational complexity and laboriousness of the methods. In this article, we demonstrate that the annotation- and alignment-free comparison of genome-wide tetranucleotide frequencies, termed oligonucleotide usage patterns (OUPs), allowed a fast and reliable inference of phylogenetic trees. These were congruent to the corresponding whole genome super-matrix trees in terms of tree topology when compared with other known approaches including 16S ribosomal RNA and GyrA protein sequence comparison, complete genome-based MAUVE, and CVTree methods. A Web-based program to perform the alignment-free OUP-based phylogenomic inferences was implemented at http://swphylo.bi.up.ac.za/. Applicability of the tool was tested on different taxa from subspecies to intergeneric levels. Distinguishing between closely related taxonomic units may be enforced by providing the program with alignments of marker protein sequences, e.g., GyrA.
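    The alignment-free comparison described above can be illustrated with a minimal sketch. It counts overlapping tetranucleotides in each sequence, normalizes them to frequencies, and compares two profiles. The function names and the use of Euclidean distance are assumptions made for illustration; the abstract does not state which distance measure SWPhylo applies to OUPs.

```python
from collections import Counter
from itertools import product
import math

# All 256 possible tetranucleotides, in a fixed order.
TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tetranucleotide_frequencies(seq):
    """Return relative frequencies of the 256 tetranucleotides in seq.

    Overlapping windows are counted; windows containing characters
    other than A, C, G, T are ignored.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    total = sum(counts[k] for k in TETRAMERS)
    if total == 0:
        raise ValueError("sequence contains no unambiguous tetranucleotide")
    return [counts[k] / total for k in TETRAMERS]


def profile_distance(p, q):
    """Euclidean distance between two frequency profiles (an assumed
    stand-in, not necessarily the measure used by SWPhylo)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


if __name__ == "__main__":
    # Toy sequences, invented for the example.
    f1 = tetranucleotide_frequencies("ACGTACGTTTGACCA")
    f2 = tetranucleotide_frequencies("ACGTTCGTATGACGA")
    print(profile_distance(f1, f2))
```

    A matrix of such pairwise distances could then be passed to any distance-based tree-building method.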

  2. Nuclear and cpDNA sequences combined provide strong inference of higher phylogenetic relationships in the phlox family (Polemoniaceae).

    Science.gov (United States)

    Johnson, Leigh A; Chan, Lauren M; Weese, Terri L; Busby, Lisa D; McMurry, Samuel

    2008-09-01

    Members of the phlox family (Polemoniaceae) serve as useful models for studying various evolutionary and biological processes. Despite its biological importance, no family-wide phylogenetic estimate based on multiple DNA regions with complete generic sampling is available. Here, we analyze one nuclear and five chloroplast DNA sequence regions (nuclear ITS, chloroplast matK, trnL intron plus trnL-trnF intergeneric spacer, and the trnS-trnG, trnD-trnT, and psbM-trnD intergenic spacers) using parsimony and Bayesian methods, as well as assessments of congruence and long branch attraction, to explore phylogenetic relationships among 84 ingroup species representing all currently recognized Polemoniaceae genera. Relationships inferred from the ITS and concatenated chloroplast regions are similar overall. A combined analysis provides strong support for the monophyly of Polemoniaceae and subfamilies Acanthogilioideae, Cobaeoideae, and Polemonioideae. Relationships among subfamilies, and thus for the precise root of Polemoniaceae, remain poorly supported. Within the largest subfamily, Polemonioideae, four clades corresponding to tribes Polemonieae, Phlocideae, Gilieae, and Loeselieae receive strong support. The monogeneric Polemonieae appears sister to Phlocideae. Relationships within Polemonieae, Phlocideae, and Gilieae are mostly consistent between analyses and data permutations. Many relationships within Loeselieae remain uncertain. Overall, inferred phylogenetic relationships support a higher-level classification for Polemoniaceae proposed in 2000.

  3. Phylogenetic reconstruction methods: an overview.

    Science.gov (United States)

    De Bruyn, Alexandre; Martin, Darren P; Lefeuvre, Pierre

    2014-01-01

    Initially designed to infer evolutionary relationships based on morphological and physiological characters, phylogenetic reconstruction methods have greatly benefited from recent developments in molecular biology and sequencing technologies with a number of powerful methods having been developed specifically to infer phylogenies from macromolecular data. This chapter, while presenting an overview of basic concepts and methods used in phylogenetic reconstruction, is primarily intended as a simplified step-by-step guide to the construction of phylogenetic trees from nucleotide sequences using fairly up-to-date maximum likelihood methods implemented in freely available computer programs. While the analysis of chloroplast sequences from various Vanilla species is used as an illustrative example, the techniques covered here are relevant to the comparative analysis of homologous sequences datasets sampled from any group of organisms.

  4. Phylogenetic diversity and relationships among species of genus ...

    African Journals Online (AJOL)

    Fifty-six Nicotiana species were used to construct phylogenetic trees and to assess the genetic relationships between them. Genetic distances estimated from RAPD analysis were used to construct phylogenetic trees using the Phylogenetic Inference Package (PHYLIP). Since phylogenetic relationships estimated for closely ...

  5. SWPhylo – A Novel Tool for Phylogenomic Inferences by Comparison of Oligonucleotide Patterns and Integration of Genome-Based and Gene-Based Phylogenetic Trees

    Science.gov (United States)

    Yu, Xiaoyu; Reva, Oleg N

    2018-01-01

    Modern phylogenetic studies may benefit from the analysis of complete genome sequences of various microorganisms. Evolutionary inferences based on genome-scale analysis are believed to be more accurate than the gene-based alternative. However, the computational complexity of current phylogenomic procedures, inappropriateness of standard phylogenetic tools to process genome-wide data, and lack of reliable substitution models which correlate with alignment-free phylogenomic approaches deter microbiologists from using these opportunities. For example, the super-matrix and super-tree approaches of phylogenomics use multiple integrated genomic loci or individual gene-based trees to infer an overall consensus tree. However, these approaches potentially multiply errors of gene annotation and sequence alignment, not to mention the computational complexity and laboriousness of the methods. In this article, we demonstrate that the annotation- and alignment-free comparison of genome-wide tetranucleotide frequencies, termed oligonucleotide usage patterns (OUPs), allowed a fast and reliable inference of phylogenetic trees. These were congruent to the corresponding whole genome super-matrix trees in terms of tree topology when compared with other known approaches including 16S ribosomal RNA and GyrA protein sequence comparison, complete genome-based MAUVE, and CVTree methods. A Web-based program to perform the alignment-free OUP-based phylogenomic inferences was implemented at http://swphylo.bi.up.ac.za/. Applicability of the tool was tested on different taxa from subspecies to intergeneric levels. Distinguishing between closely related taxonomic units may be enforced by providing the program with alignments of marker protein sequences, e.g., GyrA. PMID:29511354

  6. Phylogenetic inference with weighted codon evolutionary distances.

    Science.gov (United States)

    Criscuolo, Alexis; Michel, Christian J

    2009-04-01

    We develop a new approach to estimate a matrix of pairwise evolutionary distances from a codon-based alignment based on a codon evolutionary model. The method first computes a standard distance matrix for each of the three codon positions. Then these three distance matrices are weighted according to an estimate of the global evolutionary rate of each codon position and averaged into a unique distance matrix. Using a large set of both real and simulated codon-based alignments of nucleotide sequences, we show that this approach leads to distance matrices that have a significantly better treelikeness compared to those obtained by standard nucleotide evolutionary distances. We also propose an alternative weighting to eliminate the part of the noise often associated with some codon positions, particularly the third position, which is known to induce a fast evolutionary rate. Simulation results show that fast distance-based tree reconstruction algorithms on distance matrices based on this codon position weighting can lead to phylogenetic trees that are at least as accurate as, if not better, than those inferred by maximum likelihood. Finally, a well-known multigene dataset composed of eight yeast species and 106 codon-based alignments is reanalyzed and shows that our codon evolutionary distances allow building a phylogenetic tree which is similar to those obtained by non-distance-based methods (e.g., maximum parsimony and maximum likelihood) and also significantly improved compared to standard nucleotide evolutionary distance estimates.
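    The per-position scheme described above (one distance matrix per codon position, then a weighted average) can be sketched as follows. This is only an illustration under stated assumptions: the uncorrected p-distance stands in for the "standard distance", and the weights are supplied by the caller, because the abstract does not give the formula the authors use to derive weights from evolutionary rates. All function names are hypothetical.

```python
def p_distance(a, b):
    """Proportion of differing sites, skipping gap characters."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return sum(x != y for x, y in pairs) / len(pairs)


def codon_position_matrices(seqs):
    """Build one pairwise distance matrix per codon position.

    seqs maps a taxon name to an in-frame, aligned coding sequence.
    Returns (names, [matrix_pos1, matrix_pos2, matrix_pos3]).
    """
    names = list(seqs)
    matrices = []
    for pos in range(3):
        # Every third site starting at this codon position.
        sub = {n: seqs[n][pos::3] for n in names}
        matrices.append(
            [[p_distance(sub[a], sub[b]) for b in names] for a in names]
        )
    return names, matrices


def weighted_average(matrices, weights):
    """Combine the matrices into one, using caller-supplied weights."""
    total = sum(weights)
    n = len(matrices[0])
    return [
        [sum(w * m[i][j] for w, m in zip(weights, matrices)) / total
         for j in range(n)]
        for i in range(n)
    ]


if __name__ == "__main__":
    # Toy alignment, invented for the example.
    alignment = {"a": "ATGAAA", "b": "ATGAAG", "c": "CTGAAG"}
    names, mats = codon_position_matrices(alignment)
    # Setting the third weight to zero drops third positions entirely.
    print(weighted_average(mats, [1, 1, 0]))
```

    Giving the third position a small or zero weight mirrors the alternative weighting mentioned in the abstract, which aims to suppress noise from fast-evolving third positions.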

  7. Inferring Phylogenetic Networks from Gene Order Data

    Directory of Open Access Journals (Sweden)

    Alexey Anatolievich Morozov

    2013-01-01

    Existing algorithms allow us to infer phylogenetic networks from sequences (DNA, protein or binary), sets of trees, and distance matrices, but there are no methods to build them using the gene order data as an input. Here we describe several methods to build split networks from the gene order data, perform simulation studies, and use our methods for analyzing and interpreting different real gene order datasets. All proposed methods are based on intermediate data, which can be generated from genome structures under study and used as an input for network construction algorithms. Three intermediates are used: set of jackknife trees, distance matrix, and binary encoding. According to simulations and case studies, the best intermediates are jackknife trees and distance matrix (when used with the Neighbor-Net algorithm). Binary encoding can also be useful, but only when the methods mentioned above cannot be used.

  8. Using ESTs for phylogenomics: Can one accurately infer a phylogenetic tree from a gappy alignment?

    Directory of Open Access Journals (Sweden)

    Hartmann Stefanie

    2008-03-01

    sequences and gappy multiple sequence alignments can pose a major problem for phylogenetic analysis. The concern will be greatest for high-throughput phylogenomic analyses, in which Neighbor Joining is often the preferred method due to its computational efficiency. Both approaches can be used to increase the accuracy of phylogenetic inference from a gappy alignment. The choice between the two approaches will depend upon how robust the application is to the loss of sequences from the input set, with alignment masking generally giving a much greater improvement in accuracy but at the cost of discarding a larger number of the input sequences.

  9. Using ESTs for phylogenomics: can one accurately infer a phylogenetic tree from a gappy alignment?

    Science.gov (United States)

    Hartmann, Stefanie; Vision, Todd J

    2008-03-26

    major problem for phylogenetic analysis. The concern will be greatest for high-throughput phylogenomic analyses, in which Neighbor Joining is often the preferred method due to its computational efficiency. Both approaches can be used to increase the accuracy of phylogenetic inference from a gappy alignment. The choice between the two approaches will depend upon how robust the application is to the loss of sequences from the input set, with alignment masking generally giving a much greater improvement in accuracy but at the cost of discarding a larger number of the input sequences.

  10. Multiple sequence alignment accuracy and phylogenetic inference.

    Science.gov (United States)

    Ogden, T Heath; Rosenberg, Michael S

    2006-04-01

    Phylogenies are often thought to be more dependent upon the specifics of the sequence alignment rather than on the method of reconstruction. Simulation of sequences containing insertion and deletion events was performed in order to determine the role that alignment accuracy plays during phylogenetic inference. Data sets were simulated for pectinate, balanced, and random tree shapes under different conditions (ultrametric equal branch length, ultrametric random branch length, nonultrametric random branch length). Comparisons between hypothesized alignments and true alignments enabled determination of two measures of alignment accuracy, that of the total data set and that of individual branches. In general, our results indicate that as alignment error increases, topological accuracy decreases. This trend was much more pronounced for data sets derived from more pectinate topologies. In contrast, for balanced, ultrametric, equal branch length tree shapes, alignment inaccuracy had little average effect on tree reconstruction. These conclusions are based on average trends of many analyses under different conditions, and any one specific analysis, independent of the alignment accuracy, may recover very accurate or inaccurate topologies. Maximum likelihood and Bayesian, in general, outperformed neighbor joining and maximum parsimony in terms of tree reconstruction accuracy. Results also indicated that as the length of the branch and of the neighboring branches increase, alignment accuracy decreases, and the length of the neighboring branches is the major factor in topological accuracy. Thus, multiple-sequence alignment can be an important factor in downstream effects on topological reconstruction.

  11. Phylogenetic relationships and timing of diversification in gonorynchiform fishes inferred using nuclear gene DNA sequences (Teleostei: Ostariophysi).

    Science.gov (United States)

    Near, Thomas J; Dornburg, Alex; Friedman, Matt

    2014-11-01

    The Gonorynchiformes are the sister lineage of the species-rich Otophysi and provide important insights into the diversification of ostariophysan fishes. Phylogenies of gonorynchiforms inferred using morphological characters and mtDNA gene sequences provide differing resolutions with regard to the sister lineage of all other gonorynchiforms (Chanos vs. Gonorynchus) and support for monophyly of the two miniaturized lineages Cromeria and Grasseichthys. In this study the phylogeny and divergence times of gonorynchiforms are investigated with DNA sequences sampled from nine nuclear genes and a published morphological character matrix. Bayesian phylogenetic analyses reveal substantial congruence among individual gene trees with inferences from eight genes placing Gonorynchus as the sister lineage to all other gonorynchiforms. Seven gene trees resolve Cromeria and Grasseichthys as a clade, supporting previous inferences using morphological characters. Phylogenies resulting from either concatenating the nuclear genes, performing a multispecies coalescent species tree analysis, or combining the morphological and nuclear gene DNA sequences resolve Gonorynchus as the living sister lineage of all other gonorynchiforms, strongly support the monophyly of Cromeria and Grasseichthys, and resolve a clade containing Parakneria, Cromeria, and Grasseichthys. The morphological dataset, which includes 13 gonorynchiform fossil taxa that range in age from Early Cretaceous to Eocene, was analyzed in combination with DNA sequences from the nine nuclear genes and a relaxed molecular clock to estimate times of evolutionary divergence. This "tip dating" strategy accommodates uncertainty in the phylogenetic resolution of fossil taxa that provide calibration information in the relaxed molecular clock analysis. The estimated age of the most recent common ancestor (MRCA) of living gonorynchiforms is slightly older than estimates from previous node dating efforts, but the molecular tip dating

  12. Inferring influenza global transmission networks without complete phylogenetic information.

    Science.gov (United States)

    Aris-Brosou, Stéphane

    2014-03-01

    Influenza is one of the most severe respiratory infections affecting humans throughout the world, yet the dynamics of its global transmission network are still contentious. Here, I describe a novel combination of phylogenetics, time series, and graph theory to analyze 14.25 years of data stratified in space and in time, focusing on the main target of the human immune response, the hemagglutinin gene. While bypassing the complete phylogenetic inference of huge data sets, the method still extracts information suggesting that waves of genetic or of nucleotide diversity circulate continuously around the globe for subtypes that undergo sustained transmission over several seasons, such as H3N2 and pandemic H1N1/09, while diversity of prepandemic H1N1 viruses had until 2009 a noncontinuous transmission pattern consistent with a source/sink model. Irrespective of the shift in the structure of H1N1 diversity circulation with the emergence of the pandemic H1N1/09 strain, US prevalence peaks during the winter months when genetic diversity is at its lowest. This suggests that a dominant strain is generally responsible for epidemics and that monitoring genetic and/or nucleotide diversity in real time could provide public health agencies with an indirect estimate of prevalence.

  13. Compositional and mutational rate heterogeneity in mitochondrial genomes and its effect on the phylogenetic inferences of Cimicomorpha (Hemiptera: Heteroptera).

    Science.gov (United States)

    Yang, Huanhuan; Li, Teng; Dang, Kai; Bu, Wenjun

    2018-04-18

    Mitochondrial genome (mt-genome) data can potentially return artefactual relationships in the higher-level phylogenetic inference of insects due to the biases of accelerated substitution rates and compositional heterogeneity. Previous studies based on mt-genome data alone showed a paraphyly of Cimicomorpha (Insecta, Hemiptera) due to the positions of the families Tingidae and Reduviidae rather than the monophyly that was supported based on morphological characters, morphological and molecular combined data and large scale molecular datasets. Various strategies have been proposed to ameliorate the effects of potential mt-genome biases, including dense taxon sampling, removal of third codon positions or purine-pyrimidine coding and the use of site-heterogeneous models. In this study, we sequenced the mt-genomes of five additional Tingidae species and discussed the compositional and mutational rate heterogeneity in mt-genomes and its effect on the phylogenetic inferences of Cimicomorpha by implementing the bias-reduction strategies mentioned above. Heterogeneity in nucleotide composition and mutational biases were found in mt protein-coding genes, and the third codon exhibited high levels of saturation. Dense taxon sampling of Tingidae and Reduviidae and the other common strategies mentioned above were insufficient to recover the monophyly of the well-established group Cimicomorpha. When the sites with weak phylogenetic signals in the dataset were removed, the remaining dataset of mt-genomes can support the monophyly of Cimicomorpha; this support demonstrates that mt-genomes possess strong phylogenetic signals for the inference of higher-level phylogeny of this group. Comparison of the ratio of the removal of amino acids for each PCG showed that ATP8 has the highest ratio while CO1 has the lowest. This pattern is largely congruent with the evolutionary rate of 13 PCGs that ATP8 represents the highest evolutionary rate, whereas CO1 appears to be the lowest. Notably

  14. Intraspecific relationship within the genus Convolvulus L. inferred by rbcL gene using different phylogenetic approaches

    International Nuclear Information System (INIS)

    Kausar, S.; Qamarunnisa, S.

    2016-01-01

    A molecular systematics analysis was conducted using sequence data of the chloroplast rbcL gene for the genus Convolvulus L., by distance and character based phylogenetic methods. Fifteen representative members from the genus Convolvulus L. were included as the ingroup, whereas two members from a sister family, Solanaceae, were taken as the outgroup to root the tree. Intraspecific relationships within Convolvulus were inferred by distance matrix, maximum parsimony and Bayesian analysis. The transition/transversion ratio was also calculated, and it was revealed that in the investigated Convolvulus species, transitional changes were more prevalent in the rbcL gene. The rbcL gene was observed to be conserved in the present study, as it does not show major variations between the examined species. The distance matrix showed minimal genetic variation between some species (C. glomeratus and C. pyrrhotrichus), thus identifying them as close relatives. The parsimony and Bayesian analyses revealed almost identical clades; however, the maximum parsimony based tree was unable to establish relationships between some Convolvulus species. The Bayesian inference method was found to be the method of choice for establishing intraspecific associations between Convolvulus species using rbcL data, as it clearly defined the connections supported by posterior probability values. (author)
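    The transition/transversion comparison mentioned above can be illustrated with a minimal sketch that classifies each difference between two aligned sequences. It counts raw observed differences only and applies no model-based correction; the abstract does not say how the authors computed the ratio, so this is an assumption. The function name and the example sequences are hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def ts_tv_counts(a, b):
    """Count transitions and transversions between two aligned sequences.

    A transition swaps purine for purine (A<->G) or pyrimidine for
    pyrimidine (C<->T); every other substitution is a transversion.
    Sites with gaps or ambiguous characters are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == y or x not in "ACGT" or y not in "ACGT":
            continue
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    return transitions, transversions


if __name__ == "__main__":
    # Toy sequences, invented for the example.
    ts, tv = ts_tv_counts("AACGT", "GATGA")
    print(ts, tv)
    if tv:
        print("Ts/Tv ratio:", ts / tv)
```

    A ratio above 1 for raw counts would be in line with the abstract's observation that transitional changes predominate.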

  15. Untangling hybrid phylogenetic signals: horizontal gene transfer and artifacts of phylogenetic reconstruction.

    Science.gov (United States)

    Beiko, Robert G; Ragan, Mark A

    2009-01-01

    Phylogenomic methods can be used to investigate the tangled evolutionary relationships among genomes. Building 'all the trees of all the genes' can potentially identify common pathways of horizontal gene transfer (HGT) among taxa at varying levels of phylogenetic depth. Phylogenetic affinities can be aggregated and merged with the information about genetic linkage and biochemical function to examine hypotheses of adaptive evolution via HGT. Additionally, the use of many genetic data sets increases the power of statistical tests for phylogenetic artifacts. However, large-scale phylogenetic analyses pose several challenges, including the necessary abandonment of manual validation techniques, the need to translate inferred phylogenetic discordance into inferred HGT events, and the challenges involved in aggregating results from search-based inference methods. In this chapter we describe a tree search procedure to recover the most parsimonious pathways of HGT, and examine some of the assumptions that are made by this method.

  16. Inference of Transmission Network Structure from HIV Phylogenetic Trees.

    Science.gov (United States)

    Giardina, Federica; Romero-Severson, Ethan Obie; Albert, Jan; Britton, Tom; Leitner, Thomas

    2017-01-01

    Phylogenetic inference is an attractive means to reconstruct transmission histories and epidemics. However, there is not a perfect correspondence between transmission history and virus phylogeny. Both node height and topological differences may occur, depending on the interaction between within-host evolutionary dynamics and between-host transmission patterns. To investigate these interactions, we added a within-host evolutionary model in epidemiological simulations and examined if the resulting phylogeny could recover different types of contact networks. To further improve realism, we also introduced patient-specific differences in infectivity across disease stages, and on the epidemic level we considered incomplete sampling and the age of the epidemic. Second, we implemented an inference method based on approximate Bayesian computation (ABC) to discriminate among three well-studied network models and jointly estimate both network parameters and key epidemiological quantities such as the infection rate. Our ABC framework used both topological and distance-based tree statistics for comparison between simulated and observed trees. Overall, our simulations showed that a virus time-scaled phylogeny (genealogy) may be substantially different from the between-host transmission tree. This has important implications for the interpretation of what a phylogeny reveals about the underlying epidemic contact network. In particular, we found that while the within-host evolutionary process obscures the transmission tree, the diversification process and infectivity dynamics also add discriminatory power to differentiate between different types of contact networks. We also found that the possibility to differentiate contact networks depends on how far an epidemic has progressed, where distance-based tree statistics have more power early in an epidemic. Finally, we applied our ABC inference on two different outbreaks from the Swedish HIV-1 epidemic.

  17. Phylogenetic analysis in Myrcia section Aulomyrcia and inferences on plant diversity in the Atlantic rainforest.

    Science.gov (United States)

    Staggemeier, Vanessa Graziele; Diniz-Filho, José Alexandre Felizola; Forest, Félix; Lucas, Eve

    2015-04-01

    Myrcia section Aulomyrcia includes ∼120 species that are endemic to the Neotropics and disjunctly distributed in the moist Amazon and Atlantic coastal forests of Brazil. This paper presents the first comprehensive phylogenetic study of this group and this phylogeny is used as a basis to evaluate recent classification systems and to test alternative hypotheses associated with the history of this clade. Fifty-three taxa were sampled out of the 120 species currently recognized, plus 40 outgroup taxa, for one nuclear marker (ribosomal internal transcribed spacer) and four plastid markers (psbA-trnH, trnL-trnF, trnQ-rpS16 and ndhF). The relationships were reconstructed based on Bayesian and maximum likelihood analyses. Additionally, a likelihood approach, 'geographic state speciation and extinction', was used to estimate region-dependent rates of speciation, extinction and dispersal, comparing historically climatic stable areas (refugia) and unstable areas. Maximum likelihood and Bayesian inferences indicate that Myrcia and Marlierea are polyphyletic, and the internal groupings recovered are characterized by combinations of morphological characters. Phylogenetic relationships support a link between Amazonian and north-eastern species and between north-eastern and south-eastern species. Lower extinction rates within glacial refugia suggest that these areas were important in maintaining diversity in the Atlantic forest biodiversity hotspot. This study provides a robust phylogenetic framework to address important ecological questions for Myrcia s.l. within an evolutionary context, and supports the need to unite taxonomically the two traditional genera Myrcia and Marlierea in an expanded Myrcia s.l. Furthermore, this study offers valuable insights into the diversification of plant species in the highly impacted Atlantic forest of South America; evidence is presented that the lowest extinction rates are found inside refugia and that range expansion from unstable areas

  18. Construction of phylogenetic trees by kernel-based comparative analysis of metabolic networks.

    Science.gov (United States)

    Oh, S June; Joung, Je-Gun; Chang, Jeong-Ho; Zhang, Byoung-Tak

    2006-06-06

    To infer the tree of life requires knowledge of the common characteristics of each species descended from a common ancestor as the measuring criteria and a method to calculate the distance between the resulting values of each measure. Conventional phylogenetic analysis based on genomic sequences provides information about the genetic relationships between different organisms. In contrast, comparative analysis of metabolic pathways in different organisms can yield insights into their functional relationships under different physiological conditions. However, evaluating the similarities or differences between metabolic networks is a computationally challenging problem, and systematic methods of doing this are desirable. Here we introduce a graph-kernel method for computing the similarity between metabolic networks in polynomial time, and use it to profile metabolic pathways and to construct phylogenetic trees. To compare the structures of metabolic networks in organisms, we adopted the exponential graph kernel, which is a kernel-based approach with a labeled graph that includes a label matrix and an adjacency matrix. To construct the phylogenetic trees, we used an unweighted pair-group method with arithmetic mean, i.e., a hierarchical clustering algorithm. We applied the kernel-based network profiling method in a comparative analysis of nine carbohydrate metabolic networks from 81 biological species encompassing Archaea, Eukaryota, and Eubacteria. The resulting phylogenetic hierarchies generally support the tripartite scheme of three domains rather than the two domains of prokaryotes and eukaryotes. By combining the kernel machines with metabolic information, the method infers the context of biosphere development that covers physiological events required for adaptation by genetic reconstruction. The results show that one may obtain a global view of the tree of life by comparing the metabolic pathway structures using meta-level information rather than sequence
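    The tree-building step named above, the unweighted pair-group method with arithmetic mean (UPGMA), can be sketched in a few lines. This is a generic textbook version operating on a plain distance matrix; how the authors converted graph-kernel similarities into distances is not stated in the abstract, so the input matrix here is invented for illustration and the function name is hypothetical.

```python
def upgma(names, dist):
    """Cluster taxa by UPGMA and return the topology as nested tuples.

    names: list of taxon labels.
    dist:  symmetric matrix (list of lists) of pairwise distances.
    Branch lengths are omitted to keep the sketch short.
    """
    # cluster id -> (subtree, number of leaves in it)
    clusters = {i: (names[i], 1) for i in range(len(names))}
    # distances keyed by (smaller id, larger id)
    d = {(i, j): dist[i][j]
         for i in range(len(names)) for j in range(i + 1, len(names))}
    next_id = len(names)
    while len(clusters) > 1:
        # Pick the closest pair of clusters.
        (a, b), _ = min(d.items(), key=lambda kv: kv[1])
        tree_a, size_a = clusters.pop(a)
        tree_b, size_b = clusters.pop(b)
        # Distance from the merged cluster to each remaining cluster is
        # the size-weighted mean of the two old distances.
        merged = {}
        for c in clusters:
            d_ac = d[(min(a, c), max(a, c))]
            d_bc = d[(min(b, c), max(b, c))]
            merged[c] = (size_a * d_ac + size_b * d_bc) / (size_a + size_b)
        d = {k: v for k, v in d.items() if a not in k and b not in k}
        for c, value in merged.items():
            d[(c, next_id)] = value  # next_id is larger than any existing id
        clusters[next_id] = ((tree_a, tree_b), size_a + size_b)
        next_id += 1
    (tree, _), = clusters.values()
    return tree


if __name__ == "__main__":
    # Toy distance matrix, invented for the example.
    labels = ["A", "B", "C"]
    matrix = [[0, 2, 6],
              [2, 0, 6],
              [6, 6, 0]]
    print(upgma(labels, matrix))
```

    UPGMA assumes roughly clock-like distances, which is one reason it is usually described as a hierarchical clustering method rather than a full phylogenetic inference procedure.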

  19. Construction of phylogenetic trees by kernel-based comparative analysis of metabolic networks

    Directory of Open Access Journals (Sweden)

    Chang Jeong-Ho

    2006-06-01

    Background: To infer the tree of life requires knowledge of the common characteristics of each species descended from a common ancestor as the measuring criteria and a method to calculate the distance between the resulting values of each measure. Conventional phylogenetic analysis based on genomic sequences provides information about the genetic relationships between different organisms. In contrast, comparative analysis of metabolic pathways in different organisms can yield insights into their functional relationships under different physiological conditions. However, evaluating the similarities or differences between metabolic networks is a computationally challenging problem, and systematic methods of doing this are desirable. Here we introduce a graph-kernel method for computing the similarity between metabolic networks in polynomial time, and use it to profile metabolic pathways and to construct phylogenetic trees. Results: To compare the structures of metabolic networks in organisms, we adopted the exponential graph kernel, which is a kernel-based approach with a labeled graph that includes a label matrix and an adjacency matrix. To construct the phylogenetic trees, we used an unweighted pair-group method with arithmetic mean, i.e., a hierarchical clustering algorithm. We applied the kernel-based network profiling method in a comparative analysis of nine carbohydrate metabolic networks from 81 biological species encompassing Archaea, Eukaryota, and Eubacteria. The resulting phylogenetic hierarchies generally support the tripartite scheme of three domains rather than the two domains of prokaryotes and eukaryotes. Conclusion: By combining the kernel machines with metabolic information, the method infers the context of biosphere development that covers physiological events required for adaptation by genetic reconstruction. The results show that one may obtain a global view of the tree of life by comparing the metabolic pathway

  20. Phylogenetic Relationships of Pseudorasbora, Pseudopungtungia, and Pungtungia (Teleostei; Cypriniformes; Gobioninae) Inferred from Multiple Nuclear Gene Sequences

    Directory of Open Access Journals (Sweden)

    Keun-Yong Kim

    2013-01-01

    Gobionine species belonging to the genera Pseudorasbora, Pseudopungtungia, and Pungtungia (Teleostei; Cypriniformes; Cyprinidae) have been heavily studied because of problems of taxonomy, threats of extinction, invasion, and human health. Nucleotide sequences of three nuclear genes, that is, recombination activating protein gene 1 (rag1), recombination activating gene 2 (rag2), and early growth response 1 gene (egr1), from Pseudorasbora, Pseudopungtungia, and Pungtungia species residing in China, Japan, and Korea, were analyzed to elucidate their intergeneric and interspecific phylogenetic relationships. In the phylogenetic tree inferred from their multiple gene sequences, Pseudorasbora, Pseudopungtungia and Pungtungia species ramified into three phylogenetically distinct clades: the “tenuicorpa” clade composed of Pseudopungtungia tenuicorpa, the “parva” clade composed of all Pseudorasbora species/subspecies, and the “herzi” clade composed of Pseudopungtungia nigra and Pungtungia herzi. The genus Pseudorasbora was recovered as monophyletic, while the genus Pseudopungtungia was recovered as polyphyletic. Our phylogenetic result implies the unstable taxonomic status of the genus Pseudopungtungia.

  1. Philosophy and phylogenetic inference: a comparison of likelihood and parsimony methods in the context of Karl Popper's writings on corroboration.

    Science.gov (United States)

    de Queiroz, K; Poe, S

    2001-06-01

    Advocates of cladistic parsimony methods have invoked the philosophy of Karl Popper in an attempt to argue for the superiority of those methods over phylogenetic methods based on Ronald Fisher's statistical principle of likelihood. We argue that the concept of likelihood in general, and its application to problems of phylogenetic inference in particular, are highly compatible with Popper's philosophy. Examination of Popper's writings reveals that his concept of corroboration is, in fact, based on likelihood. Moreover, because probabilistic assumptions are necessary for calculating the probabilities that define Popper's corroboration, likelihood methods of phylogenetic inference--with their explicit probabilistic basis--are easily reconciled with his concept. In contrast, cladistic parsimony methods, at least as described by certain advocates of those methods, are less easily reconciled with Popper's concept of corroboration. If those methods are interpreted as lacking probabilistic assumptions, then they are incompatible with corroboration. Conversely, if parsimony methods are to be considered compatible with corroboration, then they must be interpreted as carrying implicit probabilistic assumptions. Thus, the non-probabilistic interpretation of cladistic parsimony favored by some advocates of those methods is contradicted by an attempt by the same authors to justify parsimony methods in terms of Popper's concept of corroboration. In addition to being compatible with Popperian corroboration, the likelihood approach to phylogenetic inference permits researchers to test the assumptions of their analytical methods (models) in a way that is consistent with Popper's ideas about the provisional nature of background knowledge.

  2. The space of ultrametric phylogenetic trees.

    Science.gov (United States)

    Gavryushkin, Alex; Drummond, Alexei J

    2016-08-21

    The reliability of a phylogenetic inference method from genomic sequence data is ensured by its statistical consistency. Bayesian inference methods produce a sample of phylogenetic trees from the posterior distribution given sequence data. Hence the question of statistical consistency of such methods is equivalent to the consistency of the summary of the sample. More generally, statistical consistency is ensured by the tree space used to analyse the sample. In this paper, we consider two standard parameterisations of phylogenetic time-trees used in evolutionary models: inter-coalescent interval lengths and absolute times of divergence events. For each of these parameterisations we introduce a natural metric space on ultrametric phylogenetic trees. We compare the introduced spaces with existing models of tree space and formulate several formal requirements that a metric space on phylogenetic trees must possess in order to be a satisfactory space for statistical analysis, and justify them. We show that only a few known constructions of the space of phylogenetic trees satisfy these requirements. However, our results suggest that these basic requirements are not enough to distinguish between the two metric spaces we introduce and that the choice between metric spaces requires additional properties to be considered. Particularly, that the summary tree minimising the square distance to the trees from the sample might be different for different parameterisations. This suggests that further fundamental insight is needed into the problem of statistical consistency of phylogenetic inference methods. Copyright © 2016 The Authors. Published by Elsevier Ltd.. All rights reserved.

  3. Phylogenetic relationships of Chaetomium isolates based on the ...

    African Journals Online (AJOL)

    Biotech Unit

    2013-02-27

    Phylogenetic analysis of Chaetomium species. The evolutionary history was inferred using the maximum parsimony method. The bootstrap consensus tree inferred from 1000 replicates is taken to represent the evolutionary history of the taxa analyzed (Felsenstein, 1985). The MP tree was obtained using...

  4. Consequences of recombination on traditional phylogenetic analysis

    DEFF Research Database (Denmark)

    Schierup, M H; Hein, J

    2000-01-01

    We investigate the shape of a phylogenetic tree reconstructed from sequences evolving under the coalescent with recombination. The motivation is that evolutionary inferences are often made from phylogenetic trees reconstructed from population data even though recombination may well occur (mtDNA or viral sequences) or does occur (nuclear sequences). We investigate the size and direction of biases when a single tree is reconstructed ignoring recombination. Standard software (PHYLIP) was used to construct the best phylogenetic tree from sequences simulated under the coalescent with recombination. With recombination present, the length of terminal branches and the total branch length are larger, and the time to the most recent common ancestor smaller, than for a tree reconstructed from sequences evolving with no recombination. The effects are pronounced even for small levels of recombination that may...

  5. Phylogenetic inference of calyptrates, with the first mitogenomes for Gasterophilinae (Diptera: Oestridae) and Paramacronychiinae (Diptera: Sarcophagidae)

    Science.gov (United States)

    Zhang, Dong; Yan, Liping; Zhang, Ming; Chu, Hongjun; Cao, Jie; Li, Kai; Hu, Defu; Pape, Thomas

    2016-01-01

    The complete mitogenome of the horse stomach bot fly Gasterophilus pecorum (Fabricius) and a near-complete mitogenome of Wohlfahrt's wound myiasis fly Wohlfahrtia magnifica (Schiner) were sequenced. The mitogenomes contain the typical 37 mitogenes found in metazoans, organized in the same order and orientation as in other cyclorrhaphan Diptera. Phylogenetic analyses of mitogenomes from 38 calyptrate taxa with and without two non-calyptrate outgroups were performed using Bayesian Inference and Maximum Likelihood. Three sub-analyses were performed on the concatenated data: (1) not partitioned; (2) partitioned by gene; (3) 3rd codon positions of protein-coding genes omitted. We estimated the contribution of each of the mitochondrial genes for phylogenetic analysis, as well as the effect of some popular methodologies on calyptrate phylogeny reconstruction. In the favoured trees, the Oestroidea are nested within the muscoid grade. Relationships at the family level within Oestroidea are (remaining Calliphoridae (Sarcophagidae (Oestridae, Pollenia + Tachinidae))). Our mito-phylogenetic reconstruction of the Calyptratae presents the most extensive taxon coverage so far, and the risk of long-branch attraction is reduced by an appropriate selection of outgroups. We find that in the Calyptratae the ND2, ND5, ND1, COIII, and COI genes are more phylogenetically informative compared with other mitochondrial protein-coding genes. Our study provides evidence that data partitioning and the inclusion of conserved tRNA genes have little influence on calyptrate phylogeny reconstruction, and that the 3rd codon positions of protein-coding genes are not saturated and therefore should be included. PMID:27019632

  6. The Reliability and Stability of an Inferred Phylogenetic Tree from Empirical Data.

    Science.gov (United States)

    Katsura, Yukako; Stanley, Craig E; Kumar, Sudhir; Nei, Masatoshi

    2017-03-01

    The reliability of a phylogenetic tree obtained from empirical data is usually measured by the bootstrap probability (Pb) of interior branches of the tree. If the bootstrap probability is high for most branches, the tree is considered to be reliable. If some interior branches show relatively low bootstrap probabilities, we are not sure that the inferred tree is really reliable. Here, we propose another quantity measuring the reliability of the tree called the stability of a subtree. This quantity refers to the probability (Ps) of obtaining a subtree of an inferred tree. We then show that if the tree is to be reliable, both Pb and Ps must be high. We also show that Ps is given by a bootstrap probability of the subtree with the closest outgroup sequence, and a computer program, RESTA, for computing the Pb and Ps values is presented. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  7. Phylogenetics of neotropical Platymiscium (Leguminosae

    DEFF Research Database (Denmark)

    Saslis-Lagoudakis, C. Haris; Chase, Mark W; Robinson, Daniel N

    2008-01-01

    Platymiscium is a neotropical legume genus of forest trees in the Pterocarpus clade of the pantropical "dalbergioid" clade. It comprises 19 species (29 taxa), distributed from Mexico to southern Brazil. This study presents a molecular phylogenetic analysis of Platymiscium and allies inferred from...

  8. A taxonomic and phylogenetic re-appraisal of the genus Curvularia

    Science.gov (United States)

    Species of Curvularia are important plant and human pathogens worldwide. In this study, the genus Curvularia is re-assessed based on molecular phylogenetic analysis and morphological observations of available isolates and specimens. A multi-gene phylogenetic tree inferred from ITS, TEF and GPDH gene...

  9. FPGA Acceleration of the phylogenetic likelihood function for Bayesian MCMC inference methods

    Directory of Open Access Journals (Sweden)

    Bakos Jason D

    2010-04-01

    Background: Likelihood (ML)-based phylogenetic inference has become a popular method for estimating the evolutionary relationships among species based on genomic sequence data. This method is used in applications such as RAxML, GARLI, MrBayes, PAML, and PAUP. The Phylogenetic Likelihood Function (PLF) is an important kernel computation for this method. The PLF consists of a loop with no conditional behavior or dependencies between iterations. As such it contains a high potential for exploiting parallelism using micro-architectural techniques. In this paper, we describe a technique for mapping the PLF and supporting logic onto a Field Programmable Gate Array (FPGA)-based co-processor. By leveraging the FPGA's on-chip DSP modules and the high-bandwidth local memory attached to the FPGA, the resultant co-processor can accelerate ML-based methods and outperform state-of-the-art multi-core processors. Results: We use the MrBayes 3 tool as a framework for designing our co-processor. For large datasets, we estimate that our accelerated MrBayes, if run on a current-generation FPGA, achieves a 10× speedup relative to software running on a state-of-the-art server-class microprocessor. The FPGA-based implementation achieves its performance by deeply pipelining the likelihood computations, performing multiple floating-point operations in parallel, and through a natural log approximation that is chosen specifically to leverage a deeply pipelined custom architecture. Conclusions: Heterogeneous computing, which combines general-purpose processors with special-purpose co-processors such as FPGAs and GPUs, is a promising approach for high-performance phylogeny inference as shown by the growing body of literature in this field. FPGAs in particular are well-suited for this task because of their low power consumption as compared to many-core processors and Graphics Processor Units (GPUs).
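The structure of the PLF described above, a loop over alignment sites with no dependencies between iterations, can be seen in a minimal software sketch of Felsenstein's pruning recursion. This is an illustration only, not the MrBayes kernel or the FPGA design: it assumes the Jukes-Cantor (JC69) substitution model with equal base frequencies, and the tree encoding and the names `jc69`, `partial` and `log_likelihood` are hypothetical.

```python
import math

BASES = "ACGT"


def jc69(t):
    """JC69 transition probability matrix for a branch of length t."""
    e = math.exp(-4.0 * t / 3.0)
    same, diff = 0.25 + 0.75 * e, 0.25 - 0.25 * e
    return [[same if i == j else diff for j in range(4)] for i in range(4)]


def partial(node, site):
    """Conditional likelihood vector of `node` for one alignment column.

    A leaf is its sequence (str); an internal node is a list of
    (child, branch_length) pairs.
    """
    if isinstance(node, str):
        return [1.0 if b == node[site] else 0.0 for b in BASES]
    vec = [1.0] * 4
    for child, t in node:
        p, cl = jc69(t), partial(child, site)
        for i in range(4):
            vec[i] *= sum(p[i][j] * cl[j] for j in range(4))
    return vec


def log_likelihood(tree, n_sites):
    # Each site is evaluated independently of the others, which is
    # what makes this loop a natural target for parallel hardware.
    return sum(
        math.log(sum(0.25 * x for x in partial(tree, s)))
        for s in range(n_sites)
    )


# Toy three-taxon tree with made-up sequences and branch lengths.
toy_tree = [([("ACGT", 0.1), ("ACGA", 0.1)], 0.05), ("ACTT", 0.2)]
print(log_likelihood(toy_tree, 4))
```

Production implementations typically compute the per-site vectors iteratively over a post-order traversal and rescale them to avoid underflow; both are omitted here for brevity.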

  10. Data set for phylogenetic tree and RAMPAGE Ramachandran plot analysis of SODs in Gossypium raimondii and G. arboreum.

    Science.gov (United States)

    Wang, Wei; Xia, Minxuan; Chen, Jie; Deng, Fenni; Yuan, Rui; Zhang, Xiaopei; Shen, Fafu

    2016-12-01

    The data presented in this paper is supporting the research article "Genome-Wide Analysis of Superoxide Dismutase Gene Family in Gossypium raimondii and G. arboreum" [1]. In this data article, we present phylogenetic tree showing dichotomy with two different clusters of SODs inferred by the Bayesian method of MrBayes (version 3.2.4), "Bayesian phylogenetic inference under mixed models" [2], Ramachandran plots of G. raimondii and G. arboreum SODs, the protein sequence used to generate 3D structure of proteins and the template accession via SWISS-MODEL server, "SWISS-MODEL: modelling protein tertiary and quaternary structure using evolutionary information." [3] and motif sequences of SODs identified by InterProScan (version 4.8) with the Pfam database, "Pfam: the protein families database" [4].

  11. Phylogenetic Analysis Using Protein Mass Spectrometry.

    Science.gov (United States)

    Ma, Shiyong; Downard, Kevin M; Wong, Jason W H

    2017-01-01

    Through advances in molecular biology, comparative analysis of DNA sequences is currently the cornerstone in the study of molecular evolution and phylogenetics. Nevertheless, protein mass spectrometry offers some unique opportunities to enable phylogenetic analyses in organisms where DNA may be difficult or costly to obtain. To date, the methods of phylogenetic analysis using protein mass spectrometry can be classified into three categories: (1) de novo protein sequencing followed by classical phylogenetic reconstruction, (2) direct phylogenetic reconstruction using proteolytic peptide mass maps, and (3) mapping of mass spectral data onto classical phylogenetic trees. In this chapter, we provide a brief description of the three methods and the protocol for each method along with relevant tools and algorithms.

  12. Integrated Automatic Workflow for Phylogenetic Tree Analysis Using Public Access and Local Web Services.

    Science.gov (United States)

    Damkliang, Kasikrit; Tandayya, Pichaya; Sangket, Unitsa; Pasomsub, Ekawat

    2016-11-28

    At present, coding sequences (CDS) continue to be discovered, and larger CDS are being revealed frequently. Approaches and related tools have also been developed and upgraded concurrently, especially for phylogenetic tree analysis. This paper proposes an integrated automatic Taverna workflow for phylogenetic tree inference using public access web services at the European Bioinformatics Institute (EMBL-EBI) and the Swiss Institute of Bioinformatics (SIB), and our own deployed local web services. The workflow input is a set of CDS in the Fasta format. The workflow supports 1,000 to 20,000 bootstrap replicates. The workflow performs tree inference using the Parsimony (PARS), Distance Matrix - Neighbor Joining (DIST-NJ), and Maximum Likelihood (ML) algorithms of the EMBOSS PHYLIPNEW package based on our proposed Multiple Sequence Alignment (MSA) similarity score. The local web services are implemented and deployed in two types, using the Soaplab2 and Apache Axis2 deployment. SOAP and Java Web Service (JWS) services provide WSDL endpoints to Taverna Workbench, a workflow manager. The workflow has been validated, the performance has been measured, and its results have been verified. Our workflow's execution time is less than ten minutes for inferring a tree with 10,000 bootstrap replicates. This paper proposes a new integrated automatic workflow which will be beneficial to bioinformaticians with an intermediate level of knowledge and experience. All local services have been deployed at our portal http://bioservices.sci.psu.ac.th.

  13. The Impact of Reconstruction Methods, Phylogenetic Uncertainty and Branch Lengths on Inference of Chromosome Number Evolution in American Daisies (Melampodium, Asteraceae).

    Science.gov (United States)

    McCann, Jamie; Schneeweiss, Gerald M; Stuessy, Tod F; Villaseñor, Jose L; Weiss-Schneeweiss, Hanna

    2016-01-01

    Chromosome number change (polyploidy and dysploidy) plays an important role in plant diversification and speciation. Investigating chromosome number evolution commonly entails ancestral state reconstruction performed within a phylogenetic framework, which is, however, prone to uncertainty, whose effects on evolutionary inferences are insufficiently understood. Using the chromosomally diverse plant genus Melampodium (Asteraceae) as model group, we assess the impact of reconstruction method (maximum parsimony, maximum likelihood, Bayesian methods), branch length model (phylograms versus chronograms) and phylogenetic uncertainty (topological and branch length uncertainty) on the inference of chromosome number evolution. We also address the suitability of the maximum clade credibility (MCC) tree as single representative topology for chromosome number reconstruction. Each of the listed factors causes considerable incongruence among chromosome number reconstructions. Discrepancies between inferences on the MCC tree from those made by integrating over a set of trees are moderate for ancestral chromosome numbers, but severe for the difference of chromosome gains and losses, a measure of the directionality of dysploidy. Therefore, reliance on single trees, such as the MCC tree, is strongly discouraged and model averaging, taking both phylogenetic and model uncertainty into account, is recommended. For studying chromosome number evolution, dedicated models implemented in the program ChromEvol and ordered maximum parsimony may be most appropriate. Chromosome number evolution in Melampodium follows a pattern of bidirectional dysploidy (starting from x = 11 to x = 9 and x = 14, respectively) with no prevailing direction.

  14. The Impact of Reconstruction Methods, Phylogenetic Uncertainty and Branch Lengths on Inference of Chromosome Number Evolution in American Daisies (Melampodium, Asteraceae).

    Directory of Open Access Journals (Sweden)

    Jamie McCann

    Chromosome number change (polyploidy and dysploidy) plays an important role in plant diversification and speciation. Investigating chromosome number evolution commonly entails ancestral state reconstruction performed within a phylogenetic framework, which is, however, prone to uncertainty, whose effects on evolutionary inferences are insufficiently understood. Using the chromosomally diverse plant genus Melampodium (Asteraceae) as model group, we assess the impact of reconstruction method (maximum parsimony, maximum likelihood, Bayesian methods), branch length model (phylograms versus chronograms) and phylogenetic uncertainty (topological and branch length uncertainty) on the inference of chromosome number evolution. We also address the suitability of the maximum clade credibility (MCC) tree as single representative topology for chromosome number reconstruction. Each of the listed factors causes considerable incongruence among chromosome number reconstructions. Discrepancies between inferences on the MCC tree from those made by integrating over a set of trees are moderate for ancestral chromosome numbers, but severe for the difference of chromosome gains and losses, a measure of the directionality of dysploidy. Therefore, reliance on single trees, such as the MCC tree, is strongly discouraged and model averaging, taking both phylogenetic and model uncertainty into account, is recommended. For studying chromosome number evolution, dedicated models implemented in the program ChromEvol and ordered maximum parsimony may be most appropriate. Chromosome number evolution in Melampodium follows a pattern of bidirectional dysploidy (starting from x = 11 to x = 9 and x = 14, respectively) with no prevailing direction.

  15. Parametric inference for biological sequence analysis.

    Science.gov (United States)

    Pachter, Lior; Sturmfels, Bernd

    2004-11-16

    One of the major successes in computational biology has been the unification, by using the graphical model formalism, of a multitude of algorithms for annotating and comparing biological sequences. Graphical models that have been applied to these problems include hidden Markov models for annotation, tree models for phylogenetics, and pair hidden Markov models for alignment. A single algorithm, the sum-product algorithm, solves many of the inference problems that are associated with different statistical models. This article introduces the polytope propagation algorithm for computing the Newton polytope of an observation from a graphical model. This algorithm is a geometric version of the sum-product algorithm and is used to analyze the parametric behavior of maximum a posteriori inference calculations for graphical models.
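The claim that a single recursion serves many inference problems can be illustrated on a hidden Markov model: the same forward-style pass gives the marginal probability of the observations under (+, ×) and the probability of the best hidden path under (max, ×). The sketch below shows only this semiring swap, not the polytope propagation algorithm itself; the two-state model parameters are invented and the name `hmm_recursion` is hypothetical.

```python
from functools import reduce
import operator


def hmm_recursion(obs, init, trans, emit, add, mul=operator.mul):
    """Forward-style recursion over an HMM in the semiring (add, mul).

    add=operator.add -> probability of the observations (sum-product)
    add=max          -> probability of the best hidden path (max-product)
    """
    states = range(len(init))
    f = [mul(init[s], emit[s][obs[0]]) for s in states]
    for o in obs[1:]:
        f = [
            mul(reduce(add, [mul(f[r], trans[r][s]) for r in states]), emit[s][o])
            for s in states
        ]
    return reduce(add, f)


# Made-up two-state model with a binary observation alphabet.
INIT = [0.6, 0.4]
TRANS = [[0.7, 0.3], [0.4, 0.6]]
EMIT = [[0.9, 0.1], [0.2, 0.8]]

marginal = hmm_recursion([0, 1, 1], INIT, TRANS, EMIT, add=operator.add)
best_path = hmm_recursion([0, 1, 1], INIT, TRANS, EMIT, add=max)
print(marginal, best_path)
```

Keeping the recursion fixed and changing only the arithmetic is the design idea that the geometric version described in the abstract builds on.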

  16. Molecular Phylogenetic: Organism Taxonomy Method Based on Evolution History

    Directory of Open Access Journals (Sweden)

    N.L.P Indi Dharmayanti

    2011-03-01

    Phylogenetics is the taxonomic classification of organisms based on their evolutionary history (phylogeny); it is the part of systematics that aims to determine the phylogeny of organisms from their characteristics. Phylogenetic analysis of amino acid and protein sequences is an important area of sequence analysis. It can be used to follow the rapid change of a species such as a virus. A phylogenetic tree is a two-dimensional graph that shows the relationships among organisms or, in particular, among their gene sequences. The separate sequences are referred to as taxa (singular: taxon), defined as phylogenetically distinct units on the tree. The tree consists of outer branches, or leaves, which represent taxa, and of nodes and branches, which represent the relationships among taxa. When the nucleotide sequences of two different organisms are similar, the organisms are inferred to be descended from a common ancestor. Three methods are used in phylogenetics: (1) maximum parsimony, (2) distance, and (3) maximum likelihood. These methods are generally applied to construct the evolutionary tree, or the best tree, to determine sequence variation within a group. Each method is usually used for different analyses and data.

  17. [Genome-wide identification, phylogenetic analysis and expression profiling of the WOX family genes in Solanum lycopersicum].

    Science.gov (United States)

    Li, Xiao-xu; Liu, Cheng; Li, Wei; Zhang, Zeng-lin; Gao, Xiao-ming; Zhou, Hui; Guo, Yong-feng

    2016-05-01

    Members of the plant-specific WOX transcription factor family have been reported to play important roles in cell to cell communication as well as other physiological and developmental processes. In this study, ten members of the WOX transcription factor family were identified in Solanum lycopersicum with HMMER. Neighbor-joining phylogenetic tree, maximum-likelihood tree and Bayesian-inference tree were constructed and similar topologies were shown using the protein sequences of the homeodomain. Phylogenetic study revealed that the 25 WOX family members from Arabidopsis and tomato fall into three clades and nine subfamilies. The patterns of exon-intron structures and organization of conserved domains in Arabidopsis and tomato were consistent based on the phylogenetic results. Transcriptome analysis showed that the expression patterns of SlWOXs were different in different tissue types. Gene Ontology (GO) analysis suggested that, as transcription factors, the SlWOX family members could be involved in a number of biological processes including cell to cell communication and tissue development. Our results are useful for future studies on WOX family members in tomato and other plant species.

  18. Efficient parsimony-based methods for phylogenetic network reconstruction.

    Science.gov (United States)

    Jin, Guohua; Nakhleh, Luay; Snir, Sagi; Tuller, Tamir

    2007-01-15

    Phylogenies, the evolutionary histories of groups of organisms, play a major role in representing relationships among biological entities. Although many biological processes can be effectively modeled as tree-like relationships, others, such as hybrid speciation and horizontal gene transfer (HGT), result in networks, rather than trees, of relationships. Hybrid speciation is a significant evolutionary mechanism in plants, fish and other groups of species. HGT plays a major role in bacterial genome diversification and is a significant mechanism by which bacteria develop resistance to antibiotics. Maximum parsimony is one of the most commonly used criteria for phylogenetic tree inference. Roughly speaking, inference based on this criterion seeks the tree that minimizes the amount of evolution. In 1990, Jotun Hein proposed using this criterion for inferring the evolution of sequences subject to recombination. Preliminary results on small synthetic datasets (Nakhleh et al., 2005) demonstrated the criterion's application to phylogenetic network reconstruction in general and HGT detection in particular. However, the naive algorithms used by the authors are inapplicable to large datasets due to their demanding computational requirements. Further, no rigorous theoretical analysis of computing the criterion was given, nor was it tested on biological data. In the present work we prove that the problem of scoring the parsimony of a phylogenetic network is NP-hard and provide an improved fixed parameter tractable algorithm for it. Further, we devise efficient heuristics for parsimony-based reconstruction of phylogenetic networks. We test our methods on both synthetic and biological data (rbcL gene in bacteria) and obtain very promising results.
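For a fixed tree, "the amount of evolution" in the parsimony criterion can be counted with Fitch's algorithm. The sketch below covers only that easy case of scoring a given binary tree; it is not the network-scoring algorithm of this paper, which the abstract states is NP-hard. The tree encoding, the toy alignment, and the names `fitch` and `parsimony_score` are assumptions made for illustration.

```python
def fitch(node, states):
    """Return (candidate state set, minimum number of changes) for a subtree.

    node:   a leaf name (str) or a pair (left, right) of subtrees
    states: dict mapping leaf name -> observed character state
    """
    if isinstance(node, str):
        return {states[node]}, 0
    (ls, lc), (rs, rc) = fitch(node[0], states), fitch(node[1], states)
    common = ls & rs
    if common:
        # Children agree on at least one state: no extra change needed.
        return common, lc + rc
    # Children disagree: one change is charged at this node.
    return ls | rs, lc + rc + 1


def parsimony_score(tree, alignment):
    """Sum the Fitch change counts over all alignment columns."""
    n_sites = len(next(iter(alignment.values())))
    return sum(
        fitch(tree, {name: seq[s] for name, seq in alignment.items()})[1]
        for s in range(n_sites)
    )


# Made-up two-site alignment for four taxa.
toy_alignment = {"a": "AA", "b": "AC", "c": "GC", "d": "GC"}
print(parsimony_score((("a", "b"), ("c", "d")), toy_alignment))
print(parsimony_score((("a", "c"), ("b", "d")), toy_alignment))
```

Under the criterion, the first topology is preferred because it explains the same data with fewer changes.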

  19. Prokaryotic diversity, composition structure, and phylogenetic analysis of microbial communities in leachate sediment ecosystems.

    Science.gov (United States)

    Liu, Jingjing; Wu, Weixiang; Chen, Chongjun; Sun, Faqian; Chen, Yingxu

    2011-09-01

    In order to obtain insight into the prokaryotic diversity and community in leachate sediment, a culture-independent DNA-based molecular phylogenetic approach was performed with archaeal and bacterial 16S rRNA gene clone libraries derived from leachate sediment of an aged landfill. A total of 59 archaeal and 283 bacterial rDNA phylotypes were identified in 425 archaeal and 375 bacterial analyzed clones. All archaeal clones distributed within two archaeal phyla of the Euryarchaeota and Crenarchaeota, and well-defined methanogen lineages, especially Methanosaeta spp., are the most numerically dominant species of the archaeal community. Phylogenetic analysis of the bacterial library revealed a variety of pollutant-degrading and biotransforming microorganisms, including 18 distinct phyla. A substantial fraction of bacterial clones showed low levels of similarity with any previously documented sequences and thus might be taxonomically new. Chemical characteristics and phylogenetic inferences indicated that (1) ammonium-utilizing bacteria might form consortia to alleviate or avoid the negative influence of high ammonium concentration on other microorganisms, and (2) members of the Crenarchaeota found in the sediment might be involved in ammonium oxidation. This study is the first to report the composition of the microbial assemblages and phylogenetic characteristics of prokaryotic populations extant in leachate sediment. Additional work on microbial activity and contaminant biodegradation remains to be explored.

  20. Phylogenetic analysis of Bacillus subtilis strains applicable to natto (fermented soybean) production.

    Science.gov (United States)

    Kubo, Yuji; Rooney, Alejandro P; Tsukakoshi, Yoshiki; Nakagawa, Rikio; Hasegawa, Hiromasa; Kimura, Keitarou

    2011-09-01

    Spore-forming Bacillus strains that produce extracellular poly-γ-glutamic acid were screened for their application to natto (fermented soybean food) fermentation. Among the 424 strains, including Bacillus subtilis and B. amyloliquefaciens, which we isolated from rice straw, 59 were capable of fermenting natto. Biotin auxotrophism was tightly linked to natto fermentation. A multilocus nucleotide sequence of six genes (rpoB, purH, gyrA, groEL, polC, and 16S rRNA) was used for phylogenetic analysis, and amplified fragment length polymorphism (AFLP) analysis was also conducted on the natto-fermenting strains. The ability to ferment natto was inferred from the two principal components of the AFLP banding pattern, and natto-fermenting strains formed a tight cluster within the B. subtilis subsp. subtilis group.

  1. phangorn: phylogenetic analysis in R.

    Science.gov (United States)

    Schliep, Klaus Peter

    2011-02-15

    phangorn is a package for phylogenetic reconstruction and analysis in the R language. Previously it was only possible to estimate phylogenetic trees with distance methods in R. phangorn now offers the possibility of reconstructing phylogenies with distance-based methods, maximum parsimony or maximum likelihood (ML) and performing Hadamard conjugation. Extending the general ML framework, this package provides the possibility of estimating mixture and partition models. Furthermore, phangorn offers several functions for comparing trees, phylogenetic models or splits, simulating character data and performing congruence analyses. phangorn can be obtained through the CRAN homepage http://cran.r-project.org/web/packages/phangorn/index.html. phangorn is licensed under GPL 2.

  2. Open Reading Frame Phylogenetic Analysis on the Cloud

    Directory of Open Access Journals (Sweden)

    Che-Lun Hung

    2013-01-01

    Phylogenetic analysis has become essential in researching the evolutionary relationships between viruses. These relationships are depicted on phylogenetic trees, in which viruses are grouped based on sequence similarity. Viral evolutionary relationships are identified from open reading frames rather than from complete sequences. Recently, cloud computing has become popular for developing internet-based bioinformatics tools. Biocloud is an efficient, scalable, and robust bioinformatics computing service. In this paper, we propose a cloud-based open reading frame phylogenetic analysis service. The proposed service integrates the Hadoop framework, virtualization technology, and phylogenetic analysis methods to provide a high-availability, large-scale bioservice. In a case study, we analyze the phylogenetic relationships among Norovirus. Evolutionary relationships are elucidated by aligning different open reading frame sequences. The proposed platform correctly identifies the evolutionary relationships between members of Norovirus.

  3. Targeted Enrichment of Large Gene Families for Phylogenetic Inference: Phylogeny and Molecular Evolution of Photosynthesis Genes in the Portullugo Clade (Caryophyllales).

    Science.gov (United States)

    Moore, Abigail J; Vos, Jurriaan M De; Hancock, Lillian P; Goolsby, Eric; Edwards, Erika J

    2018-05-01

    Hybrid enrichment is an increasingly popular approach for obtaining hundreds of loci for phylogenetic analysis across many taxa quickly and cheaply. The genes targeted for sequencing are typically single-copy loci, which facilitate a more straightforward sequence assembly and homology assignment process. However, this approach limits the inclusion of most genes of functional interest, which often belong to multi-gene families. Here, we demonstrate the feasibility of including large gene families in hybrid enrichment protocols for phylogeny reconstruction and subsequent analyses of molecular evolution, using a new set of bait sequences designed for the "portullugo" (Caryophyllales), a moderately sized lineage of flowering plants (~2200 species) that includes the cacti and harbors many evolutionary transitions to C4 and CAM photosynthesis. Including multi-gene families allowed us to simultaneously infer a robust phylogeny and construct a dense sampling of sequences for a major enzyme of C4 and CAM photosynthesis, which revealed the accumulation of adaptive amino acid substitutions associated with C4 and CAM origins in particular paralogs. Our final set of matrices for phylogenetic analyses included 75-218 loci across 74 taxa, with ~50% matrix completeness across data sets. Phylogenetic resolution was greatly improved across the tree, at both shallow and deep levels. Concatenation and coalescent-based approaches both resolve the sister lineage of the cacti with strong support: Anacampserotaceae + Portulacaceae, two lineages of mostly diminutive succulent herbs of warm, arid regions. In spite of this congruence, BUCKy concordance analyses demonstrated strong and conflicting signals across gene trees. Our results add to the growing number of examples illustrating the complexity of phylogenetic signals in genomic-scale data.

  4. The Impact of Reconstruction Methods, Phylogenetic Uncertainty and Branch Lengths on Inference of Chromosome Number Evolution in American Daisies (Melampodium, Asteraceae)

    OpenAIRE

    McCann, Jamie; Schneeweiss, Gerald M.; Stuessy, Tod F.; Villaseñor, Jose L.; Weiss-Schneeweiss, Hanna

    2016-01-01

    Chromosome number change (polyploidy and dysploidy) plays an important role in plant diversification and speciation. Investigating chromosome number evolution commonly entails ancestral state reconstruction performed within a phylogenetic framework, which is, however, prone to uncertainty, whose effects on evolutionary inferences are insufficiently understood. Using the chromosomally diverse plant genus Melampodium (Asteraceae) as model group, we assess the impact of reconstruction method (ma...

  5. Sequence comparison and phylogenetic analysis of core gene of ...

    African Journals Online (AJOL)

    Phylogenetic analysis suggests that our sequences are clustered with sequences reported from Japan. This is the first phylogenetic analysis of HCV core gene from Pakistani population. Our sequences and sequences from Japan are grouped into same cluster in the phylogenetic tree. Sequence comparison and ...

  6. Robustness of ancestral sequence reconstruction to phylogenetic uncertainty.

    Science.gov (United States)

    Hanson-Smith, Victor; Kolaczkowski, Bryan; Thornton, Joseph W

    2010-09-01

    Ancestral sequence reconstruction (ASR) is widely used to formulate and test hypotheses about the sequences, functions, and structures of ancient genes. Ancestral sequences are usually inferred from an alignment of extant sequences using a maximum likelihood (ML) phylogenetic algorithm, which calculates the most likely ancestral sequence assuming a probabilistic model of sequence evolution and a specific phylogeny--typically the tree with the ML. The true phylogeny is seldom known with certainty, however. ML methods ignore this uncertainty, whereas Bayesian methods incorporate it by integrating the likelihood of each ancestral state over a distribution of possible trees. It is not known whether Bayesian approaches to phylogenetic uncertainty improve the accuracy of inferred ancestral sequences. Here, we use simulation-based experiments under both simplified and empirically derived conditions to compare the accuracy of ASR carried out using ML and Bayesian approaches. We show that incorporating phylogenetic uncertainty by integrating over topologies very rarely changes the inferred ancestral state and does not improve the accuracy of the reconstructed ancestral sequence. Ancestral state reconstructions are robust to uncertainty about the underlying tree because the conditions that produce phylogenetic uncertainty also make the ancestral state identical across plausible trees; conversely, the conditions under which different phylogenies yield different inferred ancestral states produce little or no ambiguity about the true phylogeny. Our results suggest that ML can produce accurate ASRs, even in the face of phylogenetic uncertainty. Using Bayesian integration to incorporate this uncertainty is neither necessary nor beneficial.

  7. Bears in a forest of gene trees: phylogenetic inference is complicated by incomplete lineage sorting and gene flow.

    Science.gov (United States)

    Kutschera, Verena E; Bidon, Tobias; Hailer, Frank; Rodi, Julia L; Fain, Steven R; Janke, Axel

    2014-08-01

    Ursine bears are a mammalian subfamily that comprises six morphologically and ecologically distinct extant species. Previous phylogenetic analyses of concatenated nuclear genes could not resolve all relationships among bears, and appeared to conflict with the mitochondrial phylogeny. Evolutionary processes such as incomplete lineage sorting and introgression can cause gene tree discordance and complicate phylogenetic inferences, but are not accounted for in phylogenetic analyses of concatenated data. We generated a high-resolution data set of autosomal introns from several individuals per species and of Y-chromosomal markers. Incorporating intraspecific variability in coalescence-based phylogenetic and gene flow estimation approaches, we traced the genealogical history of individual alleles. Considerable heterogeneity among nuclear loci and discordance between nuclear and mitochondrial phylogenies were found. A species tree with divergence time estimates indicated that ursine bears diversified within less than 2 My. Consistent with a complex branching order within a clade of Asian bear species, we identified unidirectional gene flow from Asian black into sloth bears. Moreover, gene flow detected from brown into American black bears can explain the conflicting placement of the American black bear in mitochondrial and nuclear phylogenies. These results highlight that both incomplete lineage sorting and introgression are prominent evolutionary forces even on time scales up to several million years. Complex evolutionary patterns are not adequately captured by strictly bifurcating models, and can only be fully understood when analyzing multiple independently inherited loci in a coalescence framework. Phylogenetic incongruence among gene trees hence needs to be recognized as a biologically meaningful signal. © The Author 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  8. DendroBLAST: approximate phylogenetic trees in the absence of multiple sequence alignments.

    Science.gov (United States)

    Kelly, Steven; Maini, Philip K

    2013-01-01

    The rapidly growing availability of genome information has created considerable demand for both fast and accurate phylogenetic inference algorithms. We present a novel method called DendroBLAST for reconstructing phylogenetic dendrograms/trees from protein sequences using BLAST. This method differs from other methods by incorporating a simple model of sequence evolution to test the effect of introducing sequence changes on the reliability of the bipartitions in the inferred tree. Using realistic simulated sequence data we demonstrate that this method produces phylogenetic trees that are more accurate than other commonly-used distance based methods though not as accurate as maximum likelihood methods from good quality multiple sequence alignments. In addition to tests on simulated data, we use DendroBLAST to generate input trees for a supertree reconstruction of the phylogeny of the Archaea. This independent analysis produces an approximate phylogeny of the Archaea that has both high precision and recall when compared to previously published analysis of the same dataset using conventional methods. Taken together these results demonstrate that approximate phylogenetic trees can be produced in the absence of multiple sequence alignments, and we propose that these trees will provide a platform for improving and informing downstream bioinformatic analysis. A web implementation of the DendroBLAST method is freely available for use at http://www.dendroblast.com/.

  10. SNPhylo: a pipeline to construct a phylogenetic tree from huge SNP data.

    Science.gov (United States)

    Lee, Tae-Ho; Guo, Hui; Wang, Xiyin; Kim, Changsoo; Paterson, Andrew H

    2014-02-26

    Phylogenetic trees are widely used for genetic and evolutionary studies in various organisms. Advanced sequencing technology has dramatically enriched data available for constructing phylogenetic trees based on single nucleotide polymorphisms (SNPs). However, massive SNP data makes it difficult to perform reliable analysis, and there has been no ready-to-use pipeline to generate phylogenetic trees from these data. We developed a new pipeline, SNPhylo, to construct phylogenetic trees based on large SNP datasets. The pipeline may enable users to construct a phylogenetic tree from three representative SNP data file formats. In addition, in order to increase reliability of a tree, the pipeline has steps such as removing low quality data and considering linkage disequilibrium. A maximum likelihood method for the inference of phylogeny is also adopted in generation of a tree in our pipeline. Using SNPhylo, users can easily produce a reliable phylogenetic tree from a large SNP data file. Thus, this pipeline can help a researcher focus more on interpretation of the results of analysis of voluminous data sets, rather than manipulations necessary to accomplish the analysis.

  11. Fast and scalable inference of multi-sample cancer lineages.

    KAUST Repository

    Popic, Victoria; Salari, Raheleh; Hajirasouliha, Iman; Kashef-Haghighi, Dorna; West, Robert B; Batzoglou, Serafim

    2015-01-01

    Somatic variants can be used as lineage markers for the phylogenetic reconstruction of cancer evolution. Since somatic phylogenetics is complicated by sample heterogeneity, novel specialized tree-building methods are required for cancer phylogeny reconstruction. We present LICHeE (Lineage Inference for Cancer Heterogeneity and Evolution), a novel method that automates the phylogenetic inference of cancer progression from multiple somatic samples. LICHeE uses variant allele frequencies of somatic single nucleotide variants obtained by deep sequencing to reconstruct multi-sample cell lineage trees and infer the subclonal composition of the samples. LICHeE is open source and available at http://viq854.github.io/lichee .

  12. Fast and scalable inference of multi-sample cancer lineages.

    KAUST Repository

    Popic, Victoria

    2015-05-06

    Somatic variants can be used as lineage markers for the phylogenetic reconstruction of cancer evolution. Since somatic phylogenetics is complicated by sample heterogeneity, novel specialized tree-building methods are required for cancer phylogeny reconstruction. We present LICHeE (Lineage Inference for Cancer Heterogeneity and Evolution), a novel method that automates the phylogenetic inference of cancer progression from multiple somatic samples. LICHeE uses variant allele frequencies of somatic single nucleotide variants obtained by deep sequencing to reconstruct multi-sample cell lineage trees and infer the subclonal composition of the samples. LICHeE is open source and available at http://viq854.github.io/lichee .

  13. A Multi-Criterion Evolutionary Approach Applied to Phylogenetic Reconstruction

    OpenAIRE

    Cancino, W.; Delbem, A.C.B.

    2010-01-01

    In this paper, we proposed an MOEA approach, called PhyloMOEA which solves the phylogenetic inference problem using maximum parsimony and maximum likelihood criteria. The PhyloMOEA's development was motivated by several studies in the literature (Huelsenbeck, 1995; Jin & Nei, 1990; Kuhner & Felsenstein, 1994; Tateno et al., 1994), which point out that various phylogenetic inference methods lead to inconsistent solutions. Techniques using parsimony and likelihood criteria yield to different tr...

  14. Inferring phylogenetic networks by the maximum parsimony criterion: a case study.

    Science.gov (United States)

    Jin, Guohua; Nakhleh, Luay; Snir, Sagi; Tuller, Tamir

    2007-01-01

    Horizontal gene transfer (HGT) may result in genes whose evolutionary histories disagree with each other, as well as with the species tree. In this case, reconciling the species and gene trees results in a network of relationships, known as the "phylogenetic network" of the set of species. A phylogenetic network that incorporates HGT consists of an underlying species tree that captures vertical inheritance and a set of edges which model the "horizontal" transfer of genetic material. In a series of papers, Nakhleh and colleagues have recently formulated a maximum parsimony (MP) criterion for phylogenetic networks, provided an array of computationally efficient algorithms and heuristics for computing it, and demonstrated its plausibility on simulated data. In this article, we study the performance and robustness of this criterion on biological data. Our findings indicate that MP is very promising when its application is extended to the domain of phylogenetic network reconstruction and HGT detection. In all cases we investigated, the MP criterion detected the correct number of HGT events required to map the evolutionary history of a gene data set onto the species phylogeny. Furthermore, our results indicate that the criterion is robust with respect to both incomplete taxon sampling and the use of different site substitution matrices. Finally, our results show that the MP criterion is very promising in detecting HGT in chimeric genes, whose evolutionary histories are a mix of vertical and horizontal evolution. Besides the performance analysis of MP, our findings offer new insights into the evolution of 4 biological data sets and new possible explanations of HGT scenarios in their evolutionary history.

  15. Phylogenetic rooting using minimal ancestor deviation.

    Science.gov (United States)

    Tria, Fernando Domingues Kümmel; Landan, Giddy; Dagan, Tal

    2017-06-19

    Ancestor-descendent relations play a cardinal role in evolutionary theory. Those relations are determined by rooting phylogenetic trees. Existing rooting methods are hampered by evolutionary rate heterogeneity or the unavailability of auxiliary phylogenetic information. Here we present a rooting approach, the minimal ancestor deviation (MAD) method, which accommodates heterotachy by using all pairwise topological and metric information in unrooted trees. We demonstrate the performance of the method, in comparison to existing rooting methods, by the analysis of phylogenies from eukaryotes and prokaryotes. MAD correctly recovers the known root of eukaryotes and uncovers evidence for the origin of cyanobacteria in the ocean. MAD is more robust and consistent than existing methods, provides measures of the root inference quality and is applicable to any tree with branch lengths.

  16. Fast algorithms for computing phylogenetic divergence time.

    Science.gov (United States)

    Crosby, Ralph W; Williams, Tiffani L

    2017-12-06

    The inference of species divergence time is a key step in most phylogenetic studies. Methods have been available for the last ten years to perform the inference, but the performance of the methods does not yet scale well to studies with hundreds of taxa and thousands of DNA base pairs. For example, a study of 349 primate taxa was estimated to require over 9 months of processing time. In this work, we present a new algorithm, AncestralAge, that significantly improves the performance of the divergence time process. As part of AncestralAge, we demonstrate a new method for the computation of phylogenetic likelihood, and our experiments show a 90% improvement in likelihood computation time on the aforementioned dataset of 349 primate taxa with over 60,000 DNA base pairs. Additionally, we show that our new method for the computation of the Bayesian prior on node ages reduces the running time for this computation on the 349-taxon dataset by 99%. Through the use of these new algorithms we open up the ability to perform divergence time inference on large phylogenetic studies.

  17. Comparative phylogenetic analysis of intergenic spacers and small ...

    African Journals Online (AJOL)

    The phylogenetic analysis of test isolates included assessment of variation in sequences and length of IGS and SSU-rRNA genes with reference to 16 different microsporidian sequences. The results proved that IGS sequences have more variation than SSU-rRNA gene sequences. Analysis of phylogenetic trees reveal that ...

  18. Maximum parsimony, substitution model, and probability phylogenetic trees.

    Science.gov (United States)

    Weng, J F; Thomas, D A; Mareels, I

    2011-01-01

    The problem of inferring phylogenies (phylogenetic trees) is one of the main problems in computational biology. There are three main methods for inferring phylogenies: Maximum Parsimony (MP), Distance Matrix (DM) and Maximum Likelihood (ML), of which the MP method is the most well-studied and popular. In the MP method the optimization criterion is the number of nucleotide substitutions, computed from the differences among the investigated nucleotide sequences. However, the MP method is often criticized because it counts only the substitutions observable at the current time and omits all the unobservable substitutions that actually occurred in the evolutionary history. To take the unobservable substitutions into account, some substitution models have been established and are now widely used in the DM and ML methods, but these substitution models cannot be used within the classical MP method. Recently the authors proposed a probability representation model for phylogenetic trees; the trees reconstructed in this model are called probability phylogenetic trees. One of the advantages of the probability representation model is that it can include a substitution model to infer phylogenetic trees based on the MP principle. In this paper we explain how to use a substitution model in the reconstruction of probability phylogenetic trees and show the advantage of this approach with examples.
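
    The classical MP criterion described in this abstract (the number of substitutions a tree requires) can be illustrated with Fitch's small-parsimony algorithm. The sketch below is not the authors' probability representation model; it only counts observable substitutions on a fixed rooted binary tree. The nested-tuple tree encoding and the function names are assumptions made for illustration.

```python
def fitch_site_score(tree, leaf_states):
    """Minimum number of substitutions for one site on a rooted binary tree.

    tree: a leaf name (str) or a 2-tuple of subtrees, e.g. (("A", "B"), ("C", "D")).
    leaf_states: dict mapping each leaf name to its observed nucleotide.
    (Both encodings are illustrative assumptions.)
    """
    def visit(node):
        if isinstance(node, str):
            # A leaf: its state set is the observed nucleotide, at no cost.
            return {leaf_states[node]}, 0
        left, right = node
        left_set, left_cost = visit(left)
        right_set, right_cost = visit(right)
        common = left_set & right_set
        if common:
            # The children can agree on a state: no extra substitution.
            return common, left_cost + right_cost
        # The children disagree: take the union and count one substitution.
        return left_set | right_set, left_cost + right_cost + 1

    return visit(tree)[1]


def parsimony_score(tree, alignment):
    """Sum the Fitch score over all columns of an alignment (dict name -> sequence)."""
    n_sites = len(next(iter(alignment.values())))
    return sum(
        fitch_site_score(tree, {name: seq[i] for name, seq in alignment.items()})
        for i in range(n_sites)
    )


if __name__ == "__main__":
    alignment = {"A": "AAC", "B": "AAC", "C": "GAT", "D": "GAT"}
    print(parsimony_score((("A", "B"), ("C", "D")), alignment))
    print(parsimony_score((("A", "C"), ("B", "D")), alignment))
```

    Under this criterion the tree with the lower score is preferred; substitutions that occurred but left no trace in the present-day sequences are not counted, which is the limitation the abstract points out.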

  19. Molecular cytogenetic characterisation and phylogenetic analysis of the seven cultivated Vigna species (Fabaceae).

    Science.gov (United States)

    She, C-W; Jiang, X-H; Ou, L-J; Liu, J; Long, K-L; Zhang, L-H; Duan, W-T; Zhao, W; Hu, J-C

    2015-01-01

    The genomic organisation of the seven cultivated Vigna species, V. unguiculata, V. subterranea, V. angularis, V. umbellata, V. radiata, V. mungo and V. aconitifolia, was determined using sequential combined PI and DAPI (CPD) staining and dual-colour fluorescence in situ hybridisation (FISH) with 5S and 45S rDNA probes. For phylogenetic analyses, comparative genomic in situ hybridisation (cGISH) onto somatic chromosomes and sequence analysis of the internal transcribed spacer (ITS) of 45S rDNA were used. Quantitative karyotypes were established using chromosome measurements, fluorochrome bands and rDNA FISH signals. All species had symmetrical karyotypes composed of only metacentric or metacentric and submetacentric chromosomes. Distinct heterochromatin differentiation was revealed by CPD staining and DAPI counterstaining after FISH. The rDNA sites among all species differed in their number, location and size. cGISH of V. umbellata genomic DNA to the chromosomes of all species produced strong signals in all centromeric regions of V. umbellata and V. angularis, weak signals in all pericentromeric regions of V. aconitifolia, and CPD-banded proximal regions of V. mungo var. mungo. Molecular phylogenetic trees showed that V. angularis and V. umbellata were the closest relatives, and V. mungo and V. aconitifolia were relatively closely related; these species formed a group that was separated from another group comprising V. radiata, V. unguiculata ssp. sesquipedalis and V. subterranea. This result was consistent with the phylogenetic relationships inferred from the heterochromatin and cGISH patterns; thus, fluorochrome banding and cGISH are efficient tools for the phylogenetic analysis of Vigna species. © 2014 German Botanical Society and The Royal Botanical Society of the Netherlands.

  20. Bootstrapping phylogenies inferred from rearrangement data

    Directory of Open Access Journals (Sweden)

    Lin Yu

    2012-08-01

    Background: Large-scale sequencing of genomes has enabled the inference of phylogenies based on the evolution of genomic architecture, under such events as rearrangements, duplications, and losses. Many evolutionary models and associated algorithms have been designed over the last few years and have found use in comparative genomics and phylogenetic inference. However, the assessment of phylogenies built from such data has not been properly addressed to date. The standard method used in sequence-based phylogenetic inference is the bootstrap, but it relies on a large number of homologous characters that can be resampled; yet in the case of rearrangements, the entire genome is a single character. Alternatives such as the jackknife suffer from the same problem, while likelihood tests cannot be applied in the absence of well established probabilistic models. Results: We present a new approach to the assessment of distance-based phylogenetic inference from whole-genome data; our approach combines features of the jackknife and the bootstrap and remains nonparametric. For each feature of our method, we give an equivalent feature in the sequence-based framework; we also present the results of extensive experimental testing, in both sequence-based and genome-based frameworks. Through the feature-by-feature comparison and the experimental results, we show that our bootstrapping approach is on par with the classic phylogenetic bootstrap used in sequence-based reconstruction, and we establish the clear superiority of the classic bootstrap for sequence data and of our corresponding new approach for rearrangement data over proposed variants. Finally, we test our approach on a small dataset of mammalian genomes, verifying that the support values match current thinking about the respective branches. Conclusions: Our method is the first to provide a standard of assessment to match that of the classic phylogenetic bootstrap for aligned sequences. Its

  1. Bootstrapping phylogenies inferred from rearrangement data.

    Science.gov (United States)

    Lin, Yu; Rajan, Vaibhav; Moret, Bernard M E

    2012-08-29

    Large-scale sequencing of genomes has enabled the inference of phylogenies based on the evolution of genomic architecture, under such events as rearrangements, duplications, and losses. Many evolutionary models and associated algorithms have been designed over the last few years and have found use in comparative genomics and phylogenetic inference. However, the assessment of phylogenies built from such data has not been properly addressed to date. The standard method used in sequence-based phylogenetic inference is the bootstrap, but it relies on a large number of homologous characters that can be resampled; yet in the case of rearrangements, the entire genome is a single character. Alternatives such as the jackknife suffer from the same problem, while likelihood tests cannot be applied in the absence of well established probabilistic models. We present a new approach to the assessment of distance-based phylogenetic inference from whole-genome data; our approach combines features of the jackknife and the bootstrap and remains nonparametric. For each feature of our method, we give an equivalent feature in the sequence-based framework; we also present the results of extensive experimental testing, in both sequence-based and genome-based frameworks. Through the feature-by-feature comparison and the experimental results, we show that our bootstrapping approach is on par with the classic phylogenetic bootstrap used in sequence-based reconstruction, and we establish the clear superiority of the classic bootstrap for sequence data and of our corresponding new approach for rearrangement data over proposed variants. Finally, we test our approach on a small dataset of mammalian genomes, verifying that the support values match current thinking about the respective branches. Our method is the first to provide a standard of assessment to match that of the classic phylogenetic bootstrap for aligned sequences. Its support values follow a similar scale and its receiver
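
    For orientation, the classic sequence-based bootstrap that this abstract uses as its reference point resamples alignment columns (homologous characters) with replacement. The sketch below shows only that resampling step, not the authors' rearrangement-based procedure; the function names and the dict-of-strings alignment format are assumptions made for illustration.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one bootstrap replicate of an alignment (dict name -> sequence).

    Columns are drawn with replacement, and the same column indices are
    applied to every sequence so that homologous characters stay together.
    """
    n_sites = len(next(iter(alignment.values())))
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {
        name: "".join(seq[i] for i in columns)
        for name, seq in alignment.items()
    }


def bootstrap_replicates(alignment, n_replicates, seed=0):
    """Generate n_replicates resampled alignments from a seeded generator."""
    rng = random.Random(seed)
    return [bootstrap_alignment(alignment, rng) for _ in range(n_replicates)]


if __name__ == "__main__":
    alignment = {"A": "ACGT", "B": "AGGT", "C": "TCGA"}
    for replicate in bootstrap_replicates(alignment, 3, seed=1):
        print(replicate)
```

    In a full analysis a tree would be inferred from each replicate, and the support for a branch would be the fraction of replicate trees that contain it. With rearrangement data the whole genome is a single character, so there are no columns to resample, which is the gap the abstract addresses.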

  2. Inferring phylogenetic trees from the knowledge of rare evolutionary events.

    Science.gov (United States)

    Hellmuth, Marc; Hernandez-Rosales, Maribel; Long, Yangjing; Stadler, Peter F

    2018-06-01

    Rare events have played an increasing role in molecular phylogenetics as potentially homoplasy-poor characters. In this contribution we analyze the phylogenetic information content from a combinatorial point of view by considering the binary relation on the set of taxa defined by the existence of a single event separating two taxa. We show that the graph-representation of this relation must be a tree. Moreover, we characterize completely the relationship between the tree of such relations and the underlying phylogenetic tree. With directed operations such as tandem-duplication-random-loss events in mind we demonstrate how non-symmetric information constrains the position of the root in the partially reconstructed phylogeny.

  3. Phylogenetic Analysis of Bacillus subtilis Strains Applicable to Natto (Fermented Soybean) Production

    Science.gov (United States)

    Kubo, Yuji; Rooney, Alejandro P.; Tsukakoshi, Yoshiki; Nakagawa, Rikio; Hasegawa, Hiromasa; Kimura, Keitarou

    2011-01-01

    Spore-forming Bacillus strains that produce extracellular poly-γ-glutamic acid were screened for their application to natto (fermented soybean food) fermentation. Among the 424 strains, including Bacillus subtilis and B. amyloliquefaciens, which we isolated from rice straw, 59 were capable of fermenting natto. Biotin auxotrophism was tightly linked to natto fermentation. A multilocus nucleotide sequence of six genes (rpoB, purH, gyrA, groEL, polC, and 16S rRNA) was used for phylogenetic analysis, and amplified fragment length polymorphism (AFLP) analysis was also conducted on the natto-fermenting strains. The ability to ferment natto was inferred from the two principal components of the AFLP banding pattern, and natto-fermenting strains formed a tight cluster within the B. subtilis subsp. subtilis group. PMID:21764950

  4. Increased phylogenetic resolution using target enrichment in Rubus

    Science.gov (United States)

    Phylogenetic analyses in Rubus L. have been challenging due to polyploidy, hybridization, and apomixis within the genus. Wide morphological diversity occurs within and between species, contributing to challenges at lower and higher systematic levels. Phylogenetic inferences to date have been based o...

  5. Phylogenetic analysis of Gossypium L. using restriction fragment length polymorphism of repeated sequences.

    Science.gov (United States)

    Zhang, Meiping; Rong, Ying; Lee, Mi-Kyung; Zhang, Yang; Stelly, David M; Zhang, Hong-Bin

    2015-10-01

    Cotton is the world's leading textile fiber crop and is also grown as a bioenergy and food crop. Knowledge of the phylogeny of closely related species and the genome origin and evolution of polyploid species is significant for advanced genomics research and breeding. We have reconstructed the phylogeny of the cotton genus, Gossypium L., and deciphered the genome origin and evolution of its five polyploid species by restriction fragment analysis of repeated sequences. Nuclear DNA from 84 accessions representing 35 species and all eight genomes of the genus was analyzed. The phylogenetic tree of the genus was reconstructed using the parsimony method on 1033 polymorphic repeated sequence restriction fragments. The genome origin of its polyploids was determined by calculating the diploid-polyploid restriction fragment correspondence (RFC). The tree is consistent with the morphological classification, genome designation and geographic distribution of the species at subgenus, section and subsection levels. Gossypium lobatum (D7) was unambiguously shown to have the highest RFC with the D-subgenomes of all five polyploids of the genus, while the common ancestor of Gossypium herbaceum (A1) and Gossypium arboreum (A2) likely contributed to the A-subgenomes of the polyploids. These results provide a comprehensive phylogenetic tree of the cotton genus and new insights into the genome origin and evolution of its polyploid species. The results also further demonstrate a simple, rapid and inexpensive method suitable for phylogenetic analysis of closely related species, especially congeneric species, and the inference of genome origin of polyploids, which constitute over 70% of flowering plants.

  6. Phylogenetic relationships of true butterflies (Lepidoptera: Papilionoidea) inferred from COI, 16S rRNA and EF-1α sequences.

    Science.gov (United States)

    Kim, Man Il; Wan, Xinlong; Kim, Min Jee; Jeong, Heon Cheon; Ahn, Neung-Ho; Kim, Ki-Gyoung; Han, Yeon Soo; Kim, Iksoo

    2010-11-01

    The molecular phylogenetic relationships among true butterfly families (superfamily Papilionoidea) have been a matter of substantial controversy; this debate has led to several competing hypotheses. Two of the most compelling of those hypotheses involve the relationships of (Nymphalidae + Lycaenidae) + (Pieridae + Papilionidae) and (((Nymphalidae + Lycaenidae) + Pieridae) + Papilionidae). In this study, approximately 3,500 nucleotide sequences from cytochrome oxidase subunit I (COI), 16S ribosomal RNA (16S rRNA), and elongation factor-1 alpha (EF-1α) were sequenced from 83 species belonging to four true butterfly families, along with those of three outgroup species belonging to three lepidopteran superfamilies. These sequences were subjected to phylogenetic reconstruction via Bayesian Inference (BI), Maximum Likelihood (ML), and Maximum Parsimony (MP) algorithms. The monophyletic Pieridae and monophyletic Papilionidae evidenced good recovery in all analyses, but in some analyses, the monophylies of the Lycaenidae and Nymphalidae were hampered by the inclusion of single species of the lycaenid subfamily Miletinae and the nymphalid subfamily Danainae. Excluding those singletons, all phylogenetic analyses among the four true butterfly families clearly identified the Nymphalidae as the sister to the Lycaenidae and identified this group as a sister to the Pieridae, with the Papilionidae identified as the most basal lineage to the true butterfly, thus supporting the hypothesis: (Papilionidae + (Pieridae + (Nymphalidae + Lycaenidae))).

  7. A new effective method for estimating missing values in the sequence data prior to phylogenetic analysis

    Directory of Open Access Journals (Sweden)

    Abdoulaye Baniré Diallo

    2006-01-01

    In this article we address the problem of phylogenetic inference from nucleic acid data containing missing bases. We introduce a new effective approach, called “Probabilistic estimation of missing values” (PEMV), allowing one to estimate unknown nucleotides prior to computing the evolutionary distances between them. We show that the new method improves the accuracy of phylogenetic inference compared to the existing methods “Ignoring Missing Sites” (IMS) and “Proportional Distribution of Missing and Ambiguous Bases” (PDMAB) included in the PAUP software [26]. The proposed strategy for estimating missing nucleotides is based on probabilistic formulae developed in the framework of the Jukes-Cantor [10] and Kimura 2-parameter [11] models. The relative performances of the new method were assessed through simulations carried out with the SeqGen program [20], for data generation, and the BioNJ method [7], for inferring phylogenies. We also compared the new method to the DNAML program [5] and “Matrix Representation using Parsimony” (MRP) [13], [19] considering an example of 66 eutherian mammals originally analyzed in [17].
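
    The Jukes-Cantor and Kimura 2-parameter models named in this abstract each give a closed-form pairwise distance. The sketch below is not the PEMV method; it only illustrates the simpler "ignore missing sites" baseline under the textbook formulas, assuming that any character other than A, C, G or T counts as missing. The function names are hypothetical.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def site_proportions(seq1, seq2):
    """Return (P, Q), the proportions of transitions and transversions,
    counted only over sites where both sequences carry a known base."""
    compared = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in BASES or b not in BASES:
            continue  # assumption: anything but A/C/G/T is a missing base
        compared += 1
        if a == b:
            continue
        if (a in PURINES) == (b in PURINES):
            transitions += 1      # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1    # purine<->pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    return transitions / compared, transversions / compared


def jukes_cantor(seq1, seq2):
    """Jukes-Cantor distance: d = -3/4 * ln(1 - 4p/3), p = P + Q."""
    P, Q = site_proportions(seq1, seq2)
    return -0.75 * math.log(1 - 4 * (P + Q) / 3)


def kimura_2p(seq1, seq2):
    """Kimura 2-parameter distance: d = -1/2 * ln((1 - 2P - Q) * sqrt(1 - 2Q))."""
    P, Q = site_proportions(seq1, seq2)
    return -0.5 * math.log((1 - 2 * P - Q) * math.sqrt(1 - 2 * Q))


if __name__ == "__main__":
    s1, s2 = "ACGTACGTAC", "ACGTACGTAT"   # one transition in ten sites
    print(round(jukes_cantor(s1, s2), 4))  # 0.1073
    print(round(kimura_2p(s1, s2), 4))     # 0.1116
```

    Both formulas are undefined for saturated pairs (the logarithm's argument becomes non-positive), in which case math.log raises ValueError; a real pipeline would need to handle that case explicitly.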

  8. Fast phylogenetic DNA barcoding

    DEFF Research Database (Denmark)

    Terkelsen, Kasper Munch; Boomsma, Wouter Krogh; Willerslev, Eske

    2008-01-01

    We present a heuristic approach to the DNA assignment problem based on phylogenetic inferences using constrained neighbour joining and non-parametric bootstrapping. We show that this method performs as well as the more computationally intensive full Bayesian approach in an analysis of 500 insect...... DNA sequences obtained from GenBank. We also analyse a previously published dataset of environmental DNA sequences from soil from New Zealand and Siberia, and use these data to illustrate the fact that statistical approaches to the DNA assignment problem allow for more appropriate criteria...... for determining the taxonomic level at which a particular DNA sequence can be assigned....
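
    Non-parametric bootstrapping, as mentioned in this abstract, resamples alignment columns with replacement and re-infers a tree from each replicate. The sketch below covers only the resampling step; the dict-of-strings alignment layout and the function name are assumptions for illustration, and the tree-building step (constrained neighbour joining in the paper) is left out.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one bootstrap replicate of an alignment.

    alignment: dict mapping sequence name -> aligned sequence (equal lengths).
    rng: a random.Random instance, passed in so results are reproducible.
    """
    names = list(alignment)
    length = len(alignment[names[0]])
    if any(len(alignment[name]) != length for name in names):
        raise ValueError("sequences must have equal length")
    # Draw column indices with replacement; the same indices are applied to
    # every sequence so that each resampled column stays intact.
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(alignment[name][i] for i in columns) for name in names}


if __name__ == "__main__":
    aln = {"taxon1": "ACGTAC", "taxon2": "ACGTTC", "taxon3": "ATGTAC"}
    replicate = bootstrap_alignment(aln, random.Random(42))
    for name, seq in replicate.items():
        print(name, seq)
```

    Support for a grouping is then typically reported as the fraction of replicate trees in which it appears.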

  9. HIV forensics: pitfalls and acceptable standards in the use of phylogenetic analysis as evidence in criminal investigations of HIV transmission.

    Science.gov (United States)

    Bernard, E J; Azad, Y; Vandamme, A M; Weait, M; Geretti, A M

    2007-09-01

    Phylogenetic analysis - the study of the genetic relatedness between HIV strains - has recently been used in criminal prosecutions as evidence of responsibility for HIV transmission. In these trials, the expert opinion of virologists has been of critical importance. Phylogenetic analysis of HIV gene sequences is complex and its findings do not achieve the levels of certainty obtained with the forensic analysis of human DNA. Although two individuals may carry HIV strains that are closely related, these will not necessarily be unique to the two parties and could extend to other persons within the same transmission network. For forensic purposes, phylogenetic analysis should be conducted under strictly controlled conditions by laboratories with relevant expertise applying rigorous methods. It is vitally important to include the right controls, which should be epidemiologically and temporally relevant to the parties under investigation. Use of inappropriate controls can exaggerate any relatedness between the virus strains of the complainant and defendant as being strikingly unique. It will often be difficult to obtain the relevant controls. If convenient but less appropriate controls are used, interpretation of the findings should be tempered accordingly. Phylogenetic analysis cannot prove that HIV transmission occurred directly between two individuals. However, it can exonerate individuals by demonstrating that the defendant carries a virus strain unrelated to that of the complainant. Expert witnesses should acknowledge the limitations of the inferences that might be made and choose the correct language in both written and verbal testimony.

  10. Inverse Ising inference with correlated samples

    International Nuclear Information System (INIS)

    Obermayer, Benedikt; Levine, Erel

    2014-01-01

    Correlations between two variables of a high-dimensional system can be indicative of an underlying interaction, but can also result from indirect effects. Inverse Ising inference is a method to distinguish one from the other. Essentially, the parameters of the least constrained statistical model are learned from the observed correlations such that direct interactions can be separated from indirect correlations. Among many other applications, this approach has been helpful for protein structure prediction, because residues which interact in the 3D structure often show correlated substitutions in a multiple sequence alignment. In this context, samples used for inference are not independent but share an evolutionary history on a phylogenetic tree. Here, we discuss the effects of correlations between samples on global inference. Such correlations could arise due to phylogeny but also via other slow dynamical processes. We present a simple analytical model to address the resulting inference biases, and develop an exact method accounting for background correlations in alignment data by combining phylogenetic modeling with an adaptive cluster expansion algorithm. We find that popular reweighting schemes are only marginally effective at removing phylogenetic bias, suggest a rescaling strategy that yields better results, and provide evidence that our conclusions carry over to the frequently used mean-field approach to the inverse Ising problem.

  11. Phylogenetic position of Loricifera inferred from nearly complete 18S and 28S rRNA gene sequences.

    Science.gov (United States)

    Yamasaki, Hiroshi; Fujimoto, Shinta; Miyazaki, Katsumi

    2015-01-01

    Loricifera is an enigmatic metazoan phylum; its morphology appeared to place it with Priapulida and Kinorhyncha in the group Scalidophora which, along with Nematoida (Nematoda and Nematomorpha), comprised the group Cycloneuralia. Scarce molecular data have suggested an alternative phylogenetic hypothesis, that the phylum Loricifera is a sister taxon to Nematomorpha, although the actual phylogenetic position of the phylum remains unclear. Ecdysozoan phylogeny was reconstructed through maximum-likelihood (ML) and Bayesian inference (BI) analyses of nuclear 18S and 28S rRNA gene sequences from 60 species representing all eight ecdysozoan phyla, and including a newly collected loriciferan species. Ecdysozoa comprised two clades with high support values in both the ML and BI trees. One consisted of Priapulida and Kinorhyncha, and the other of Loricifera, Nematoida, and Panarthropoda (Tardigrada, Onychophora, and Arthropoda). The relationships between Loricifera, Nematoida, and Panarthropoda were not well resolved. Loricifera appears to be closely related to Nematoida and Panarthropoda, rather than grouping with Priapulida and Kinorhyncha, as had been suggested by previous studies. Thus, both Scalidophora and Cycloneuralia are polyphyletic or paraphyletic groups. In addition, Loricifera and Nematomorpha did not emerge as sister groups.

  12. Streptococcus massiliensis in the human mouth: a phylogenetic approach for the inference of bacterial habitats.

    Science.gov (United States)

    Póntigo, F; Silva, C; Moraga, M; Flores, S V

    2015-12-29

    Streptococcus is a diverse bacterial lineage. Species of this genus occupy a myriad of environments inside humans and other animals. Despite the elucidation of several of these habitats, many remain to be identified. Here, we explore a methodological approach to reveal unknown bacterial environments. Specifically, we inferred the phylogeny of the Mitis group by analyzing the sequences of eight genes. In addition, information regarding habitat use of species belonging to this group was obtained from the scientific literature. The oral cavity emerged as a potential, previously unknown, environment of Streptococcus massiliensis. This phylogeny-based prediction was confirmed by species-specific polymerase chain reaction (PCR) amplification. We propose employing a similar approach, i.e., use of bibliographic data and molecular phylogenetics as predictive methods, and species-specific PCR as confirmation, in order to reveal other unknown habitats in further bacterial taxa.

  13. Molecular phylogeny of Systellognatha (Plecoptera: Arctoperlaria) inferred from mitochondrial genome sequences.

    Science.gov (United States)

    Chen, Zhi-Teng; Zhao, Meng-Yuan; Xu, Cheng; Du, Yu-Zhou

    2018-05-01

    The infraorder Systellognatha is the most species-rich clade in the insect order Plecoptera and includes six families in two superfamilies: Pteronarcyoidea (Pteronarcyidae, Peltoperlidae, and Styloperlidae) and Perloidea (Perlidae, Perlodidae, and Chloroperlidae). To resolve the debatable phylogeny of Systellognatha, we carried out the first mitochondrial phylogenetic analysis covering all the six families, including three newly sequenced mitogenomes from two families (Perlodidae and Peltoperlidae) and 15 published mitogenomes. The three newly reported mitogenomes share conserved mitogenomic features with other sequenced stoneflies. For phylogenetic analyses, we assembled five datasets with two inference methods to assess their influence on topology and nodal support within Systellognatha. The results indicated that inclusion of the third codon positions of PCGs, exclusion of rRNA genes, the use of nucleotide datasets and Bayesian inference could improve the phylogenetic reconstruction of Systellognatha. The monophyly of Perloidea was supported in the mitochondrial phylogeny, but Pteronarcyoidea was recovered as paraphyletic and remained controversial. In this mitochondrial phylogenetic study, the relationships within Systellognatha were recovered as (((Perlidae + (Perlodidae + Chloroperlidae)) + (Pteronarcyidae + Styloperlidae)) + Peltoperlidae).

  14. Inference of miRNA targets using evolutionary conservation and pathway analysis

    Directory of Open Access Journals (Sweden)

    van Nimwegen Erik

    2007-03-01

    Background: MicroRNAs have emerged as important regulatory genes in a variety of cellular processes and, in recent years, hundreds of such genes have been discovered in animals. In contrast, functional annotations are available only for a very small fraction of these miRNAs, and even in these cases only partially. Results: We developed a general Bayesian method for the inference of miRNA target sites, in which, for each miRNA, we explicitly model the evolution of orthologous target sites in a set of related species. Using this method we predict target sites for all known miRNAs in flies, worms, fish, and mammals. By comparing our predictions in fly with a reference set of experimentally tested miRNA-mRNA interactions we show that our general method performs at least as well as the most accurate methods available to date, including ones specifically tailored for target prediction in fly. An important novel feature of our model is that it explicitly infers the phylogenetic distribution of functional target sites, independently for each miRNA. This allows us to infer species-specific and clade-specific miRNA targeting. We also show that, in long human 3' UTRs, miRNA target sites occur preferentially near the start and near the end of the 3' UTR. To characterize miRNA function beyond the predicted lists of targets we further present a method to infer significant associations between the sets of targets predicted for individual miRNAs and specific biochemical pathways, in particular those of the KEGG pathway database. We show that this approach retrieves several known functional miRNA-mRNA associations, and predicts novel functions for known miRNAs in cell growth and in development. Conclusion: We have presented a Bayesian target prediction algorithm without any tunable parameters, that can be applied to sequences from any clade of species. The algorithm automatically infers the phylogenetic distribution of functional sites for each miRNA, and

  15. Computerized Assessment of Communication for Cognitive Stimulation for People with Cognitive Decline Using Spectral-Distortion Measures and Phylogenetic Inference

    Science.gov (United States)

    Pham, Tuan D.; Oyama-Higa, Mayumi; Truong, Cong-Thang; Okamoto, Kazushi; Futaba, Terufumi; Kanemoto, Shigeru; Sugiyama, Masahide; Lampe, Lisa

    2015-01-01

    Therapeutic communication and interpersonal relationships in care homes can help people to improve their mental wellbeing. Assessment of the efficacy of these dynamic and complex processes is necessary for psychosocial planning and management. This paper presents a pilot application of photoplethysmography in synchronized physiological measurements of communications between the care-giver and people with dementia. Signal-based evaluations of the therapy can be carried out using the measures of spectral distortion and the inference of phylogenetic trees. The proposed computational models can be of assistance and cost-effectiveness in caring for and monitoring people with cognitive decline. PMID:25803586

  16. Computerized assessment of communication for cognitive stimulation for people with cognitive decline using spectral-distortion measures and phylogenetic inference.

    Directory of Open Access Journals (Sweden)

    Tuan D Pham

    Therapeutic communication and interpersonal relationships in care homes can help people to improve their mental wellbeing. Assessment of the efficacy of these dynamic and complex processes is necessary for psychosocial planning and management. This paper presents a pilot application of photoplethysmography in synchronized physiological measurements of communications between the care-giver and people with dementia. Signal-based evaluations of the therapy can be carried out using the measures of spectral distortion and the inference of phylogenetic trees. The proposed computational models can be of assistance and cost-effectiveness in caring for and monitoring people with cognitive decline.

  17. Long-branch attraction bias and inconsistency in Bayesian phylogenetics.

    Science.gov (United States)

    Kolaczkowski, Bryan; Thornton, Joseph W

    2009-12-09

    Bayesian inference (BI) of phylogenetic relationships uses the same probabilistic models of evolution as its precursor maximum likelihood (ML), so BI has generally been assumed to share ML's desirable statistical properties, such as largely unbiased inference of topology given an accurate model and increasingly reliable inferences as the amount of data increases. Here we show that BI, unlike ML, is biased in favor of topologies that group long branches together, even when the true model and prior distributions of evolutionary parameters over a group of phylogenies are known. Using experimental simulation studies and numerical and mathematical analyses, we show that this bias becomes more severe as more data are analyzed, causing BI to infer an incorrect tree as the maximum a posteriori phylogeny with asymptotically high support as sequence length approaches infinity. BI's long branch attraction bias is relatively weak when the true model is simple but becomes pronounced when sequence sites evolve heterogeneously, even when this complexity is incorporated in the model. This bias--which is apparent under both controlled simulation conditions and in analyses of empirical sequence data--also makes BI less efficient and less robust to the use of an incorrect evolutionary model than ML. Surprisingly, BI's bias is caused by one of the method's stated advantages--that it incorporates uncertainty about branch lengths by integrating over a distribution of possible values instead of estimating them from the data, as ML does. Our findings suggest that trees inferred using BI should be interpreted with caution and that ML may be a more reliable framework for modern phylogenetic analysis.

  18. Phylogenetic community structure: temporal variation in fish assemblage

    OpenAIRE

    Santorelli, Sergio; Magnusson, William; Ferreira, Efrem; Caramaschi, Erica; Zuanon, Jansen; Amadio, Sidnéia

    2014-01-01

    Hypotheses about phylogenetic relationships among species allow inferences about the mechanisms that affect species coexistence. Nevertheless, most studies assume that phylogenetic patterns identified are stable over time. We used data on monthly samples of fish from a single lake over 10 years to show that the structure in phylogenetic assemblages varies over time and conclusions depend heavily on the time scale investigated. The data set was organized in guild structures and temporal scales...

  19. TreeCluster: Massively scalable transmission clustering using phylogenetic trees

    OpenAIRE

    Moshiri, Alexander

    2018-01-01

    Background: The ability to infer transmission clusters from molecular data is critical to designing and evaluating viral control strategies. Viral sequencing datasets are growing rapidly, but standard methods of transmission cluster inference do not scale well beyond thousands of sequences. Results: I present TreeCluster, a cross-platform tool that performs transmission cluster inference on a given phylogenetic tree orders of magnitude faster than existing inference methods and supports multi...

  20. PhyloSift: phylogenetic analysis of genomes and metagenomes.

    Science.gov (United States)

    Darling, Aaron E; Jospin, Guillaume; Lowe, Eric; Matsen, Frederick A; Bik, Holly M; Eisen, Jonathan A

    2014-01-01

    Like all organisms on the planet, environmental microbes are subject to the forces of molecular evolution. Metagenomic sequencing provides a means to access the DNA sequence of uncultured microbes. By combining DNA sequencing of microbial communities with evolutionary modeling and phylogenetic analysis we might obtain new insights into microbiology and also provide a basis for practical tools such as forensic pathogen detection. In this work we present an approach to leverage phylogenetic analysis of metagenomic sequence data to conduct several types of analysis. First, we present a method to conduct phylogeny-driven Bayesian hypothesis tests for the presence of an organism in a sample. Second, we present a means to compare community structure across a collection of many samples and develop direct associations between the abundance of certain organisms and sample metadata. Third, we apply new tools to analyze the phylogenetic diversity of microbial communities and again demonstrate how this can be associated to sample metadata. These analyses are implemented in an open source software pipeline called PhyloSift. As a pipeline, PhyloSift incorporates several other programs including LAST, HMMER, and pplacer to automate phylogenetic analysis of protein coding and RNA sequences in metagenomic datasets generated by modern sequencing platforms (e.g., Illumina, 454).

  1. PhyloSift: phylogenetic analysis of genomes and metagenomes

    Directory of Open Access Journals (Sweden)

    Aaron E. Darling

    2014-01-01

    Like all organisms on the planet, environmental microbes are subject to the forces of molecular evolution. Metagenomic sequencing provides a means to access the DNA sequence of uncultured microbes. By combining DNA sequencing of microbial communities with evolutionary modeling and phylogenetic analysis we might obtain new insights into microbiology and also provide a basis for practical tools such as forensic pathogen detection. In this work we present an approach to leverage phylogenetic analysis of metagenomic sequence data to conduct several types of analysis. First, we present a method to conduct phylogeny-driven Bayesian hypothesis tests for the presence of an organism in a sample. Second, we present a means to compare community structure across a collection of many samples and develop direct associations between the abundance of certain organisms and sample metadata. Third, we apply new tools to analyze the phylogenetic diversity of microbial communities and again demonstrate how this can be associated to sample metadata. These analyses are implemented in an open source software pipeline called PhyloSift. As a pipeline, PhyloSift incorporates several other programs including LAST, HMMER, and pplacer to automate phylogenetic analysis of protein coding and RNA sequences in metagenomic datasets generated by modern sequencing platforms (e.g., Illumina, 454).

  2. MulRF: a software package for phylogenetic analysis using multi-copy gene trees.

    Science.gov (United States)

    Chaudhary, Ruchi; Fernández-Baca, David; Burleigh, John Gordon

    2015-02-01

    MulRF is a platform-independent software package for phylogenetic analysis using multi-copy gene trees. It seeks the species tree that minimizes the Robinson-Foulds (RF) distance to the input trees using a generalization of the RF distance to multi-labeled trees. The underlying generic tree distance measure and fast running time make MulRF useful for inferring phylogenies from large collections of gene trees, in which multiple evolutionary processes as well as phylogenetic error may contribute to gene tree discord. MulRF implements several features for customizing the species tree search and assessing the results, and it provides a user-friendly graphical user interface (GUI) with tree visualization. The species tree search is implemented in C++ and the GUI in Java Swing. MulRF's executable as well as sample datasets and manual are available at http://genome.cs.iastate.edu/CBL/MulRF/, and the source code is available at https://github.com/ruchiherself/MulRFRepo.
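
    The Robinson-Foulds distance that MulRF generalizes can be illustrated on ordinary singly labeled trees. The sketch below is not MulRF's multi-labeled generalization and does not use its code; it computes the standard unrooted RF distance as the number of non-trivial bipartitions found in exactly one of two trees. Representing trees as nested Python tuples with string leaf names is an assumption made for brevity.

```python
def leaf_sets(tree, out):
    """Return the leaf set under `tree`, appending every internal node's
    leaf set to `out` along the way."""
    if isinstance(tree, str):
        return frozenset([tree])
    leaves = frozenset().union(*(leaf_sets(child, out) for child in tree))
    out.append(leaves)
    return leaves


def splits(tree):
    """Return (all leaves, set of non-trivial bipartitions) for a tree.

    Each bipartition is stored as the side that does NOT contain a fixed
    reference leaf, so the result is independent of where the tree is rooted.
    """
    clusters = []
    all_leaves = leaf_sets(tree, clusters)
    reference = min(all_leaves)
    result = set()
    for cluster in clusters:
        side = all_leaves - cluster if reference in cluster else cluster
        # Keep only splits with at least two leaves on each side.
        if 2 <= len(side) <= len(all_leaves) - 2:
            result.add(side)
    return all_leaves, result


def robinson_foulds(tree1, tree2):
    """Number of bipartitions present in exactly one of the two trees."""
    leaves1, splits1 = splits(tree1)
    leaves2, splits2 = splits(tree2)
    if leaves1 != leaves2:
        raise ValueError("trees must share the same leaf set")
    return len(splits1 ^ splits2)


if __name__ == "__main__":
    t1 = (("A", "B"), ("C", "D"))
    t2 = (("A", "C"), ("B", "D"))
    t3 = ("A", ("B", ("C", "D")))       # same unrooted topology as t1
    print(robinson_foulds(t1, t2))      # 2
    print(robinson_foulds(t1, t3))      # 0
```

    A species-tree search in this spirit would sum such distances over all input gene trees and look for the candidate tree with the smallest total.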

  3. Evaluation of atpB nucleotide sequences for phylogenetic studies of ferns and other pteridophytes.

    Science.gov (United States)

    Wolf, P

    1997-10-01

    Inferring basal relationships among vascular plants poses a major challenge to plant systematists. The divergence events that describe these relationships occurred long ago and considerable homoplasy has since accrued for both molecular and morphological characters. A potential solution is to examine phylogenetic analyses from multiple data sets. Here I present a new source of phylogenetic data for ferns and other pteridophytes. I sequenced the chloroplast gene atpB from 23 pteridophyte taxa and used maximum parsimony to infer relationships. A 588-bp region of the gene appeared to contain a statistically significant amount of phylogenetic signal and the resulting trees were largely congruent with similar analyses of nucleotide sequences from rbcL. However, a combined analysis of atpB plus rbcL produced a better resolved tree than did either data set alone. In the shortest trees, leptosporangiate ferns formed a monophyletic group. Also, I detected a well-supported clade of Psilotaceae (Psilotum and Tmesipteris) plus Ophioglossaceae (Ophioglossum and Botrychium). The demonstrated utility of atpB suggests that sequences from this gene should play a role in phylogenetic analyses that incorporate data from chloroplast genes, nuclear genes, morphology, and fossil data.

  4. Cenozoic biogeography and evolution in direct-developing frogs of Central America (Leptodactylidae: Eleutherodactylus) as inferred from a phylogenetic analysis of nuclear and mitochondrial genes.

    Science.gov (United States)

    Crawford, Andrew J; Smith, Eric N

    2005-06-01

    We report the first phylogenetic analysis of DNA sequence data for the Central American component of the genus Eleutherodactylus (Anura: Leptodactylidae: Eleutherodactylinae), one of the most ubiquitous, diverse, and abundant components of the Neotropical amphibian fauna. We obtained DNA sequence data from 55 specimens representing 45 species. Sampling was focused on Central America, but also included Bolivia, Brazil, Jamaica, and the USA. We sequenced 1460 contiguous base pairs (bp) of the mitochondrial genome containing ND2 and five neighboring tRNA genes, plus 1300 bp of the c-myc nuclear gene. The resulting phylogenetic inferences were broadly concordant between data sets and among analytical methods. The subgenus Craugastor is monophyletic and its initial radiation was potentially rapid and adaptive. Within Craugastor, the earliest splits separate three northern Central American species groups, milesi, augusti, and alfredi, from a clade comprising the rest of Craugastor. Within the latter clade, the rhodopis group as formerly recognized comprises three deeply divergent clades that do not form a monophyletic group; we therefore restrict the content of the rhodopis group to one of two northern clades, and use new names for the other northern (mexicanus group) and one southern clade (bransfordii group). The new rhodopis and bransfordii groups together form the sister taxon to a clade comprising the biporcatus, fitzingeri, mexicanus, and rugulosus groups. We used a Bayesian MCMC approach together with geological and biogeographic assumptions to estimate divergence times from the combined DNA sequence data. Our results corroborated three independent dispersal events for the origins of Central American Eleutherodactylus: (1) an ancestor of Craugastor entered northern Central America from South America in the early Paleocene, (2) an ancestor of the subgenus Syrrhophus entered northern Central America from the Caribbean at the end of the Eocene, and (3) a wave of

  5. The power and pitfalls of HIV phylogenetics in public health.

    Science.gov (United States)

    Brooks, James I; Sandstrom, Paul A

    2013-07-25

    Phylogenetics is the application of comparative studies of genetic sequences in order to infer evolutionary relationships among organisms. This tool can be used as a form of molecular epidemiology to enhance traditional population-level communicable disease surveillance. Phylogenetic study has resulted in new paradigms being created in the field of communicable diseases and this commentary aims to provide the reader with an explanation of how phylogenetics can be used in tracking infectious diseases. Special emphasis will be placed upon the application of phylogenetics as a tool to help elucidate HIV transmission patterns and the limitations to these methods when applied to forensic analysis. Understanding infectious disease epidemiology in order to prevent new transmissions is the sine qua non of public health. However, with increasing epidemiological resolution, there may be an associated potential loss of privacy to the individual. It is within this context that we aim to promote the discussion on how to use phylogenetics to achieve important public health goals, while at the same time protecting the rights of the individual.

  6. The influence of molecular markers and methods on inferring the phylogenetic relationships between the representatives of the Arini (parrots, Psittaciformes), determined on the basis of their complete mitochondrial genomes.

    Science.gov (United States)

    Urantowka, Adam Dawid; Kroczak, Aleksandra; Mackiewicz, Paweł

    2017-07-14

    Conures are a morphologically diverse group of Neotropical parrots classified as members of the tribe Arini, which has recently been subjected to a taxonomic revision. The previously broadly defined Aratinga genus of this tribe has been split into the 'true' Aratinga and three additional genera, Eupsittula, Psittacara and Thectocercus. Popular markers used in the reconstruction of the parrots' phylogenies derive from mitochondrial DNA. However, current phylogenetic analyses seem to indicate conflicting relationships between Aratinga and other conures, and also among other Arini members. Therefore, it is not clear if the mtDNA phylogenies can reliably define the species tree. The inconsistencies may result from the variable evolution rate of the markers used or their weak phylogenetic signal. To resolve these controversies and to assess to what extent the phylogenetic relationships in the tribe Arini can be inferred from mitochondrial genomes, we compared representative Arini mitogenomes as well as examined the usefulness of the individual mitochondrial markers and the efficiency of various phylogenetic methods. Single molecular markers produced inconsistent tree topologies, while different methods offered various topologies even for the same marker. A significant disagreement in these tree topologies occurred for cytb, nd2 and nd6 genes, which are commonly used in parrot phylogenies. The strongest phylogenetic signal was found in the control region and RNA genes. However, these markers cannot be used alone in inferring Arini phylogenies because they do not provide fully resolved trees. The most reliable phylogeny of the parrots under study is obtained only on the concatenated set of all mitochondrial markers. The analyses established significantly resolved relationships within the former Aratinga representatives and the main genera of the tribe Arini. 
Such mtDNA phylogeny can be in agreement with the species tree, owing to its match with synapomorphic features in

  7. Long-branch attraction bias and inconsistency in Bayesian phylogenetics.

    Directory of Open Access Journals (Sweden)

    Bryan Kolaczkowski

    Bayesian inference (BI) of phylogenetic relationships uses the same probabilistic models of evolution as its precursor maximum likelihood (ML), so BI has generally been assumed to share ML's desirable statistical properties, such as largely unbiased inference of topology given an accurate model and increasingly reliable inferences as the amount of data increases. Here we show that BI, unlike ML, is biased in favor of topologies that group long branches together, even when the true model and prior distributions of evolutionary parameters over a group of phylogenies are known. Using experimental simulation studies and numerical and mathematical analyses, we show that this bias becomes more severe as more data are analyzed, causing BI to infer an incorrect tree as the maximum a posteriori phylogeny with asymptotically high support as sequence length approaches infinity. BI's long branch attraction bias is relatively weak when the true model is simple but becomes pronounced when sequence sites evolve heterogeneously, even when this complexity is incorporated in the model. This bias--which is apparent under both controlled simulation conditions and in analyses of empirical sequence data--also makes BI less efficient and less robust to the use of an incorrect evolutionary model than ML. Surprisingly, BI's bias is caused by one of the method's stated advantages--that it incorporates uncertainty about branch lengths by integrating over a distribution of possible values instead of estimating them from the data, as ML does. Our findings suggest that trees inferred using BI should be interpreted with caution and that ML may be a more reliable framework for modern phylogenetic analysis.

  8. Sorting through the chaff, nDNA gene trees for phylogenetic inference and hybrid identification of annual sunflowers (Helianthus sect. Helianthus).

    Science.gov (United States)

    Moody, Michael L; Rieseberg, Loren H

    2012-07-01

    The annual sunflowers (Helianthus sect. Helianthus) present a formidable challenge for phylogenetic inference because of ancient hybrid speciation, recent introgression, and suspected issues with deep coalescence. Here we analyze sequence data from 11 nuclear DNA (nDNA) genes for multiple genotypes of species within the section to (1) reconstruct the phylogeny of this group; (2) explore the utility of nDNA gene trees for detecting hybrid speciation and introgression; and (3) test an empirical method of hybrid identification based on the phylogenetic congruence of nDNA gene trees from tightly linked genes. We uncovered considerable topological heterogeneity among gene trees with or without three previously identified hybrid species included in the analyses, as well as a general lack of reciprocal monophyly of species. Nonetheless, partitioned Bayesian analyses provided strong support for the reciprocal monophyly of all species except H. annuus (0.89 PP), the most widespread and abundant annual sunflower. Previous hypotheses of relationships among taxa were generally strongly supported (1.0 PP), except among taxa typically associated with H. annuus, apparently due to the paraphyly of the latter in all gene trees. While the individual nDNA gene trees provided a useful means for detecting recent hybridization, identification of ancient hybridization was problematic for all ancient hybrid species, even when linkage was considered. We discuss biological factors that affect the efficacy of phylogenetic methods for hybrid identification.

  9. Phylogenetic trees in bioinformatics

    Energy Technology Data Exchange (ETDEWEB)

    Burr, Tom L [Los Alamos National Laboratory

    2008-01-01

    Genetic data is often used to infer evolutionary relationships among a collection of viruses, bacteria, animal or plant species, or other operational taxonomic units (OTU). A phylogenetic tree depicts such relationships and provides a visual representation of the estimated branching order of the OTUs. Tree estimation is unique for several reasons, including: the types of data used to represent each OTU; the use of probabilistic nucleotide substitution models; the inference goals involving both tree topology and branch length; and the huge number of possible trees for a given sample of a very modest number of OTUs, which implies that finding the best tree(s) to describe the genetic data for each OTU is computationally demanding. Bioinformatics is too large a field to review here. We focus on that aspect of bioinformatics that includes study of similarities in genetic data from multiple OTUs. Although research questions are diverse, a common underlying challenge is to estimate the evolutionary history of the OTUs. Therefore, this paper reviews the role of phylogenetic tree estimation in bioinformatics, available methods and software, and identifies areas for additional research and development.
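
    As an illustration of how quickly the number of possible trees grows with the number of OTUs, the standard counting result is (2n-5)!! unrooted and (2n-3)!! rooted fully bifurcating topologies on n labelled OTUs. The short Python sketch below evaluates these formulas; the function names are illustrative and do not come from the paper.

```python
def double_factorial(k: int) -> int:
    """Return k * (k - 2) * (k - 4) * ... down to 1 or 2; returns 1 when k <= 1."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result


def num_unrooted_binary_trees(n_taxa: int) -> int:
    """Number of unrooted, fully bifurcating topologies on n labelled taxa: (2n - 5)!!."""
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa for an unrooted binary tree")
    return double_factorial(2 * n_taxa - 5)


def num_rooted_binary_trees(n_taxa: int) -> int:
    """Number of rooted, fully bifurcating topologies on n labelled taxa: (2n - 3)!!."""
    if n_taxa < 2:
        raise ValueError("need at least 2 taxa for a rooted binary tree")
    return double_factorial(2 * n_taxa - 3)


if __name__ == "__main__":
    for n in (4, 5, 10, 20):
        print(n, num_unrooted_binary_trees(n), num_rooted_binary_trees(n))
```

    Already at 10 OTUs there are 2,027,025 unrooted topologies, which is why exhaustive search is replaced by heuristics for all but the smallest data sets.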

  10. On Unrooted and Root-Uncertain Variants of Several Well-Known Phylogenetic Network Problems

    NARCIS (Netherlands)

    van Iersel, L.J.J.; Kelk, Steven; Stougie, Leen; Boes, Olivier

    2017-01-01

    The hybridization number problem requires us to embed a set of binary rooted phylogenetic trees into a binary rooted phylogenetic network such that the number of nodes with indegree two is minimized. However, from a biological point of view accurately inferring the root location in a phylogenetic

  11. Phylogenetic analysis and DNA-based species confirmation in Anopheles (Nyssorhynchus).

    Directory of Open Access Journals (Sweden)

    Peter G Foster

    Full Text Available Specimens of neotropical Anopheles (Nyssorhynchus) were collected and identified morphologically. We amplified three genes for phylogenetic analysis: the single-copy nuclear white and CAD genes, and the COI barcode region. Since we had multiple specimens for most species we were able to test how well the single or combined genes were able to corroborate morphologically defined species by placing the species into exclusive groups. We found that single genes, including the COI barcode region, were poor at confirming species, but that the three genes combined were able to do so much better. This has implications for species identification, species delimitation, and species discovery, and we caution that single genes are not enough. Higher level groupings were partially resolved with some well-supported groupings, whereas others were found to be either polyphyletic or paraphyletic. There were examples of known groups, such as the Myzorhynchella Section, which were poorly supported with single genes but were well supported with combined genes. From this we can infer that more sequence data will be needed in order to show more higher-level groupings with good support. We got unambiguously good support (0.94-1.0 Bayesian posterior probability) from all DNA-based analyses for a grouping of An. dunhami with An. nuneztovari and An. goeldii, and because of this and because of morphological similarities we propose that An. dunhami be included in the Nuneztovari Complex. We obtained phylogenetic corroboration for new species which had been recognised by morphological differences; these will need to be formally described and named.

  12. Reconstructible phylogenetic networks: do not distinguish the indistinguishable.

    Science.gov (United States)

    Pardi, Fabio; Scornavacca, Celine

    2015-04-01

    Phylogenetic networks represent the evolution of organisms that have undergone reticulate events, such as recombination, hybrid speciation or lateral gene transfer. An important way to interpret a phylogenetic network is in terms of the trees it displays, which represent all the possible histories of the characters carried by the organisms in the network. Interestingly, however, different networks may display exactly the same set of trees, an observation that poses a problem for network reconstruction: from the perspective of many inference methods such networks are "indistinguishable". This is true for all methods that evaluate a phylogenetic network solely on the basis of how well the displayed trees fit the available data, including all methods based on input data consisting of clades, triples, quartets, or trees with any number of taxa, and also sequence-based approaches such as popular formalisations of maximum parsimony and maximum likelihood for networks. This identifiability problem is partially solved by accounting for branch lengths, although this merely reduces the frequency of the problem. Here we propose that network inference methods should only attempt to reconstruct what they can uniquely identify. To this end, we introduce a novel definition of what constitutes a uniquely reconstructible network. For any given set of indistinguishable networks, we define a canonical network that, under mild assumptions, is unique and thus representative of the entire set. Given data that underwent reticulate evolution, only the canonical form of the underlying phylogenetic network can be uniquely reconstructed. While on the methodological side this will imply a drastic reduction of the solution space in network inference, for the study of reticulate evolution this is a fundamental limitation that will require an important change of perspective when interpreting phylogenetic networks.

  13. Reconstructible phylogenetic networks: do not distinguish the indistinguishable.

    Directory of Open Access Journals (Sweden)

    Fabio Pardi

    2015-04-01

    Full Text Available Phylogenetic networks represent the evolution of organisms that have undergone reticulate events, such as recombination, hybrid speciation or lateral gene transfer. An important way to interpret a phylogenetic network is in terms of the trees it displays, which represent all the possible histories of the characters carried by the organisms in the network. Interestingly, however, different networks may display exactly the same set of trees, an observation that poses a problem for network reconstruction: from the perspective of many inference methods such networks are "indistinguishable". This is true for all methods that evaluate a phylogenetic network solely on the basis of how well the displayed trees fit the available data, including all methods based on input data consisting of clades, triples, quartets, or trees with any number of taxa, and also sequence-based approaches such as popular formalisations of maximum parsimony and maximum likelihood for networks. This identifiability problem is partially solved by accounting for branch lengths, although this merely reduces the frequency of the problem. Here we propose that network inference methods should only attempt to reconstruct what they can uniquely identify. To this end, we introduce a novel definition of what constitutes a uniquely reconstructible network. For any given set of indistinguishable networks, we define a canonical network that, under mild assumptions, is unique and thus representative of the entire set. Given data that underwent reticulate evolution, only the canonical form of the underlying phylogenetic network can be uniquely reconstructed. While on the methodological side this will imply a drastic reduction of the solution space in network inference, for the study of reticulate evolution this is a fundamental limitation that will require an important change of perspective when interpreting phylogenetic networks.

  14. Quartet-net: a quartet-based method to reconstruct phylogenetic networks.

    Science.gov (United States)

    Yang, Jialiang; Grünewald, Stefan; Wan, Xiu-Feng

    2013-05-01

    Phylogenetic networks can model reticulate evolutionary events such as hybridization, recombination, and horizontal gene transfer. However, reconstructing such networks is not trivial. Popular character-based methods are computationally inefficient, whereas distance-based methods cannot guarantee reconstruction accuracy because pairwise genetic distances only reflect partial information about a reticulate phylogeny. To balance accuracy and computational efficiency, here we introduce a quartet-based method to construct a phylogenetic network from a multiple sequence alignment. Unlike distances that only reflect the relationship between a pair of taxa, quartets contain information on the relationships among four taxa; these quartets provide adequate capacity to infer a more accurate phylogenetic network. In applications to simulated and biological data sets, we demonstrate that this novel method is robust and effective in reconstructing reticulate evolutionary events and it has the potential to infer more accurate phylogenetic distances than other conventional phylogenetic network construction methods such as Neighbor-Joining, Neighbor-Net, and Split Decomposition. This method can be used in constructing phylogenetic networks from simple evolutionary events involving a few reticulate events to complex evolutionary histories involving a large number of reticulate events. A software called "Quartet-Net" is implemented and available at http://sysbio.cvm.msstate.edu/QuartetNet/.
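
    The abstract does not describe how Quartet-Net derives its quartets, so the following Python sketch is not the Quartet-Net algorithm. It only illustrates the general point that four taxa carry topological information a single pairwise distance cannot: under the classic four-point condition, the pairing of taxa with the smallest sum of within-pair distances indicates the quartet split. The function name and the dictionary-based distance representation are assumptions made for this example.

```python
from typing import Dict, FrozenSet, Tuple

Distances = Dict[FrozenSet[str], float]
Split = Tuple[Tuple[str, str], Tuple[str, str]]


def best_quartet_split(dist: Distances, a: str, b: str, c: str, e: str) -> Split:
    """Pick the quartet split favoured by the four-point condition.

    For tree-like (additive) distances with true split ab|ce,
    d(a,b) + d(c,e) is the smallest of the three pairwise sums.
    """
    def d(x: str, y: str) -> float:
        return dist[frozenset((x, y))]

    sums = {
        ((a, b), (c, e)): d(a, b) + d(c, e),
        ((a, c), (b, e)): d(a, c) + d(b, e),
        ((a, e), (b, c)): d(a, e) + d(b, c),
    }
    # The minimum sum corresponds to the pairing that avoids the internal edge.
    return min(sums, key=sums.get)


if __name__ == "__main__":
    # Additive distances from the tree ((a,b),(c,e)): pendant edges 1, internal edge 2.
    dist = {
        frozenset(("a", "b")): 2.0, frozenset(("c", "e")): 2.0,
        frozenset(("a", "c")): 4.0, frozenset(("a", "e")): 4.0,
        frozenset(("b", "c")): 4.0, frozenset(("b", "e")): 4.0,
    }
    print(best_quartet_split(dist, "a", "b", "c", "e"))
```

    With reticulate data the three sums are often all distinct, and it is this residual conflict that network methods try to represent rather than discard.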

  15. BEAGLE: an application programming interface and high-performance computing library for statistical phylogenetics.

    Science.gov (United States)

    Ayres, Daniel L; Darling, Aaron; Zwickl, Derrick J; Beerli, Peter; Holder, Mark T; Lewis, Paul O; Huelsenbeck, John P; Ronquist, Fredrik; Swofford, David L; Cummings, Michael P; Rambaut, Andrew; Suchard, Marc A

    2012-01-01

    Phylogenetic inference is fundamental to our understanding of most aspects of the origin and evolution of life, and in recent years, there has been a concentration of interest in statistical approaches such as Bayesian inference and maximum likelihood estimation. Yet, for large data sets and realistic or interesting models of evolution, these approaches remain computationally demanding. High-throughput sequencing can yield data for thousands of taxa, but scaling to such problems using serial computing often necessitates the use of nonstatistical or approximate approaches. The recent emergence of graphics processing units (GPUs) provides an opportunity to leverage their excellent floating-point computational performance to accelerate statistical phylogenetic inference. A specialized library for phylogenetic calculation would allow existing software packages to make more effective use of available computer hardware, including GPUs. Adoption of a common library would also make it easier for other emerging computing architectures, such as field programmable gate arrays, to be used in the future. We present BEAGLE, an application programming interface (API) and library for high-performance statistical phylogenetic inference. The API provides a uniform interface for performing phylogenetic likelihood calculations on a variety of compute hardware platforms. The library includes a set of efficient implementations and can currently exploit hardware including GPUs using NVIDIA CUDA, central processing units (CPUs) with Streaming SIMD Extensions and related processor supplementary instruction sets, and multicore CPUs via OpenMP. To demonstrate the advantages of a common API, we have incorporated the library into several popular phylogenetic software packages. The BEAGLE library is free open source software licensed under the Lesser GPL and available from http://beagle-lib.googlecode.com. An example client program is available as public domain software.
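
    For readers unfamiliar with what a "phylogenetic likelihood calculation" involves, here is a minimal pure-Python sketch of Felsenstein's pruning algorithm for one alignment site under the Jukes-Cantor (JC69) model. It does not use or imitate the BEAGLE API; the tree encoding (a leaf is a name, an internal node is a tuple `(left, left_branch_length, right, right_branch_length)`) and all function names are assumptions made for this example.

```python
import math

BASES = "ACGT"


def jc69_prob(same: bool, t: float) -> float:
    """JC69 transition probability along a branch of length t (substitutions/site)."""
    decay = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * decay if same else 0.25 - 0.25 * decay


def conditional_likelihoods(node, tip_states):
    """Return [P(tip data below node | node has base x) for x in BASES]."""
    if isinstance(node, str):
        # Leaf: likelihood 1 for the observed base, 0 for the others.
        return [1.0 if base == tip_states[node] else 0.0 for base in BASES]
    left, t_left, right, t_right = node
    lik_left = conditional_likelihoods(left, tip_states)
    lik_right = conditional_likelihoods(right, tip_states)
    result = []
    for parent in BASES:
        # Sum over the possible child states on each descending branch.
        from_left = sum(jc69_prob(parent == child, t_left) * lik_left[i]
                        for i, child in enumerate(BASES))
        from_right = sum(jc69_prob(parent == child, t_right) * lik_right[i]
                         for i, child in enumerate(BASES))
        result.append(from_left * from_right)
    return result


def site_likelihood(root, tip_states) -> float:
    """Likelihood of one site, using the uniform JC69 base frequencies at the root."""
    return sum(0.25 * value for value in conditional_likelihoods(root, tip_states))


if __name__ == "__main__":
    tree = (("a", 0.1, "b", 0.1), 0.05, "c", 0.2)
    print(site_likelihood(tree, {"a": "A", "b": "A", "c": "G"}))
```

    The inner sums are repeated for every site, node and candidate tree, which is the workload that a dedicated library can vectorise or move to a GPU.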

  16. Phylogenetic search through partial tree mixing

    Science.gov (United States)

    2012-01-01

    Background: Recent advances in sequencing technology have created large data sets upon which phylogenetic inference can be performed. Current research is limited by the prohibitive time necessary to perform tree search on a reasonable number of individuals. This research develops new phylogenetic algorithms that can operate on tens of thousands of species in a reasonable amount of time through several innovative search techniques. Results: When compared to popular phylogenetic search algorithms, better trees are found much more quickly for large data sets. These algorithms are incorporated in the PSODA application available at http://dna.cs.byu.edu/psoda. Conclusions: The use of Partial Tree Mixing in a partition based tree space allows the algorithm to quickly converge on near optimal tree regions. These regions can then be searched in a methodical way to determine the overall optimal phylogenetic solution. PMID:23320449

  17. Towards an integrated phylogenetic classification of the Tremellomycetes.

    Science.gov (United States)

    Liu, X-Z; Wang, Q-M; Göker, M; Groenewald, M; Kachalkin, A V; Lumbsch, H T; Millanes, A M; Wedin, M; Yurkov, A M; Boekhout, T; Bai, F-Y

    2015-06-01

    Families and genera assigned to Tremellomycetes have been mainly circumscribed by morphology and for the yeasts also by biochemical and physiological characteristics. This phenotype-based classification is largely in conflict with molecular phylogenetic analyses. Here a phylogenetic classification framework for the Tremellomycetes is proposed based on the results of phylogenetic analyses from a seven-genes dataset covering the majority of tremellomycetous yeasts and closely related filamentous taxa. Circumscriptions of the taxonomic units at the order, family and genus levels recognised were quantitatively assessed using the phylogenetic rank boundary optimisation (PRBO) and modified general mixed Yule coalescent (GMYC) tests. In addition, a comprehensive phylogenetic analysis on an expanded LSU rRNA (D1/D2 domains) gene sequence dataset covering as many as available teleomorphic and filamentous taxa within Tremellomycetes was performed to investigate the relationships between yeasts and filamentous taxa and to examine the stability of undersampled clades. Based on the results inferred from molecular data and morphological and physiochemical features, we propose an updated classification for the Tremellomycetes. We accept five orders, 17 families and 54 genera, including seven new families and 18 new genera. In addition, seven families and 17 genera are emended and one new species name and 185 new combinations are proposed. We propose to use the term pro tempore or pro tem. in abbreviation to indicate the species names that are temporarily maintained.

  18. The phylogenetic likelihood library.

    Science.gov (United States)

    Flouri, T; Izquierdo-Carrasco, F; Darriba, D; Aberer, A J; Nguyen, L-T; Minh, B Q; Von Haeseler, A; Stamatakis, A

    2015-03-01

    We introduce the Phylogenetic Likelihood Library (PLL), a highly optimized application programming interface for developing likelihood-based phylogenetic inference and postanalysis software. The PLL implements appropriate data structures and functions that allow users to quickly implement common, error-prone, and labor-intensive tasks, such as likelihood calculations, model parameter as well as branch length optimization, and tree space exploration. The highly optimized and parallelized implementation of the phylogenetic likelihood function and a thorough documentation provide a framework for rapid development of scalable parallel phylogenetic software. By example of two likelihood-based phylogenetic codes we show that the PLL improves the sequential performance of current software by a factor of 2-10 while requiring only 1 month of programming time for integration. We show that, when numerical scaling for preventing floating point underflow is enabled, the double precision likelihood calculations in the PLL are up to 1.9 times faster than those in BEAGLE. On an empirical DNA dataset with 2000 taxa the AVX version of PLL is 4 times faster than BEAGLE (scaling enabled and required). The PLL is available at http://www.libpll.org under the GNU General Public License (GPL). © The Author(s) 2014. Published by Oxford University Press, on behalf of the Society of Systematic Biologists.

  19. SuperTRI: A new approach based on branch support analyses of multiple independent data sets for assessing reliability of phylogenetic inferences.

    Science.gov (United States)

    Ropiquet, Anne; Li, Blaise; Hassanin, Alexandre

    2009-09-01

    Supermatrix and supertree are two methods for constructing a phylogenetic tree by using multiple data sets. However, these methods are not a panacea, as conflicting signals between data sets can lead to misinterpretation of the evolutionary history of taxa. In particular, the supermatrix approach is expected to be misleading if the species-tree signal is not dominant after the combination of the data sets. Moreover, most current supertree methods suffer from two limitations: (i) they ignore or misinterpret secondary (non-dominant) phylogenetic signals of the different data sets; and (ii) the logical basis of node robustness measures is unclear. To overcome these limitations, we propose a new approach, called SuperTRI, which is based on the branch support analyses of the independent data sets, and where the reliability of the nodes is assessed using three measures: the supertree Bootstrap percentage and two other values calculated from the separate analyses: the mean branch support (mean Bootstrap percentage or mean posterior probability) and the reproducibility index. The SuperTRI approach is tested on a data matrix including seven genes for 82 taxa of the family Bovidae (Mammalia, Ruminantia), and the results are compared to those found with the supermatrix approach. The phylogenetic analyses of the supermatrix and independent data sets were done using four methods of tree reconstruction: Bayesian inference, maximum likelihood, and unweighted and weighted maximum parsimony. The results indicate, firstly, that the SuperTRI approach shows less sensitivity to the four phylogenetic methods, secondly, that it is more accurate to interpret the relationships among taxa, and thirdly, that interesting conclusions on introgression and radiation can be drawn from the comparisons between SuperTRI and supermatrix analyses.

  20. Whole Genome Phylogenetic Tree Reconstruction using Colored de Bruijn Graphs

    OpenAIRE

    Lyman, Cole

    2017-01-01

    We present kleuren, a novel assembly-free method to reconstruct phylogenetic trees using the Colored de Bruijn Graph. kleuren works by constructing the Colored de Bruijn Graph and then traversing it, finding bubble structures in the graph that provide phylogenetic signal. The bubbles are then aligned and concatenated to form a supermatrix, from which a phylogenetic tree is inferred. We introduce the algorithm that kleuren uses to accomplish this task, and show its performance on reconstructin...

  1. Phylogenetic Conflict in Bears Identified by Automated Discovery of Transposable Element Insertions in Low-Coverage Genomes

    Science.gov (United States)

    Gallus, Susanne; Janke, Axel

    2017-01-01

    Phylogenetic reconstruction from transposable elements (TEs) offers an additional perspective to study evolutionary processes. However, detecting phylogenetically informative TE insertions requires tedious experimental work, limiting the power of phylogenetic inference. Here, we analyzed the genomes of seven bear species using high-throughput sequencing data to detect thousands of TE insertions. The newly developed pipeline for TE detection called TeddyPi (TE detection and discovery for Phylogenetic Inference) identified 150,513 high-quality TE insertions in the genomes of ursine and tremarctine bears. By integrating different TE insertion callers and using a stringent filtering approach, the TeddyPi pipeline produced highly reliable TE insertion calls, which were confirmed by extensive in vitro validation experiments. Analysis of single nucleotide substitutions in the flanking regions of the TEs shows that these substitutions correlate with the phylogenetic signal from the TE insertions. Our phylogenomic analyses show that TEs are a major driver of genomic variation in bears and enabled phylogenetic reconstruction of a well-resolved species tree, despite strong signals for incomplete lineage sorting and introgression. The analyses show that the Asiatic black, sun, and sloth bear form a monophyletic clade, in which phylogenetic incongruence originates from incomplete lineage sorting. TeddyPi is open source and can be adapted to various TE and structural variation callers. The pipeline makes it possible to confidently extract thousands of TE insertions even from low-coverage genomes (∼10×) of nonmodel organisms. This opens new possibilities for biologists to study phylogenies and evolutionary processes as well as rates and patterns of (retro-)transposition and structural variation. PMID:28985298

  2. The prevalence of terraced treescapes in analyses of phylogenetic data sets.

    Science.gov (United States)

    Dobrin, Barbara H; Zwickl, Derrick J; Sanderson, Michael J

    2018-04-04

    The pattern of data availability in a phylogenetic data set may lead to the formation of terraces, collections of equally optimal trees. Terraces can arise in tree space if trees are scored with parsimony or with partitioned, edge-unlinked maximum likelihood. Theory predicts that terraces can be large, but their prevalence in contemporary data sets has never been surveyed. We selected 26 data sets and phylogenetic trees reported in recent literature and investigated the terraces to which the trees would belong, under a common set of inference assumptions. We examined terrace size as a function of the sampling properties of the data sets, including taxon coverage density (the proportion of taxon-by-gene positions with any data present) and a measure of gene sampling "sufficiency". We evaluated each data set in relation to the theoretical minimum gene sampling depth needed to reduce terrace size to a single tree, and explored the impact of the terraces found in replicate trees in bootstrap methods. Terraces were identified in nearly all data sets with taxon coverage densities tree. Terraces found during bootstrap resampling reduced overall support. If certain inference assumptions apply, trees estimated from empirical data sets often belong to large terraces of equally optimal trees. Terrace size correlates to data set sampling properties. Data sets seldom include enough genes to reduce terrace size to one tree. When bootstrap replicate trees lie on a terrace, statistical support for phylogenetic hypotheses may be reduced. Although some of the published analyses surveyed were conducted with edge-linked inference models (which do not induce terraces), unlinked models have been used and advocated. The present study describes the potential impact of that inference assumption on phylogenetic inference in the context of the kinds of multigene data sets now widely assembled for large-scale tree construction.
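
    The abstract defines taxon coverage density as the proportion of taxon-by-gene positions with any data present. A minimal Python sketch of that calculation follows; the boolean presence/absence matrix and the function name are assumptions made for this example, and the paper's measure of gene sampling "sufficiency" is not reproduced because the abstract does not define it.

```python
from typing import Sequence


def taxon_coverage_density(presence: Sequence[Sequence[bool]]) -> float:
    """Fraction of taxon-by-gene cells that contain any data.

    presence[i][j] is True when taxon i has at least some sequence for gene j.
    """
    n_taxa = len(presence)
    if n_taxa == 0:
        raise ValueError("presence matrix has no taxa")
    n_genes = len(presence[0])
    if n_genes == 0 or any(len(row) != n_genes for row in presence):
        raise ValueError("presence matrix must be rectangular and non-empty")
    filled = sum(1 for row in presence for cell in row if cell)
    return filled / (n_taxa * n_genes)


if __name__ == "__main__":
    # Three taxa, four genes; seven of the twelve cells have data.
    matrix = [
        [True, True, False, True],
        [True, False, False, True],
        [False, True, True, False],
    ]
    print(taxon_coverage_density(matrix))
```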

  3. Universal artifacts affect the branching of phylogenetic trees, not universal scaling laws.

    Science.gov (United States)

    Altaba, Cristian R

    2009-01-01

    The superficial resemblance of phylogenetic trees to other branching structures allows searching for macroevolutionary patterns. However, such trees are just statistical inferences of particular historical events. Recent meta-analyses report finding regularities in the branching pattern of phylogenetic trees. But is this supported by evidence, or are such regularities just methodological artifacts? If so, is there any signal in a phylogeny? In order to evaluate the impact of polytomies and imbalance on tree shape, the distribution of all binary and polytomic trees of up to 7 taxa was assessed in tree-shape space. The relationship between the proportion of outgroups and the amount of imbalance introduced with them was assessed applying four different tree-building methods to 100 combinations from a set of 10 ingroup and 9 outgroup species, and performing covariance analyses. The relevance of this analysis was explored taking 61 published phylogenies, based on nucleic acid sequences and involving various taxa, taxonomic levels, and tree-building methods. All methods of phylogenetic inference are quite sensitive to the artifacts introduced by outgroups. However, published phylogenies appear to be subject to a rather effective, albeit rather intuitive control against such artifacts. The data and methods used to build phylogenetic trees are varied, so any meta-analysis is subject to pitfalls due to their uneven intrinsic merits, which translate into artifacts in tree shape. The binary branching pattern is an imposition of methods, and seldom reflects true relationships in intraspecific analyses, yielding artifactual polytomies in short trees. Above the species level, the departure of real trees from simplistic random models is caused at least by two natural factors--uneven speciation and extinction rates; and artifacts such as choice of taxa included in the analysis, and imbalance introduced by outgroups and basal paraphyletic taxa. 
This artifactual imbalance accounts

  4. Relationships among North American and Japanese Laetiporus isolates inferred from molecular phylogenetics and single-spore incompatibility reactions

    Science.gov (United States)

    Mark T. Banik; Daniel L. Lindner; Yuko Ota; Tsutomu Hattori

    2010-01-01

    Relationships were investigated among North American and Japanese isolates of Laetiporus using phylogenetic analysis of ITS sequences and single-spore isolate incompatibility. Single-spore isolate pairings revealed no significant compatibility between North American and Japanese isolates. ITS analysis revealed 12 clades within the core ...

  5. A new fast method for inferring multiple consensus trees using k-medoids.

    Science.gov (United States)

    Tahiri, Nadia; Willems, Matthieu; Makarenkov, Vladimir

    2018-04-05

    Gene trees carry important information about specific evolutionary patterns which characterize the evolution of the corresponding gene families. However, a reliable species consensus tree cannot be inferred from a multiple sequence alignment of a single gene family or from the concatenation of alignments corresponding to gene families having different evolutionary histories. These evolutionary histories can be quite different due to horizontal transfer events or to ancient gene duplications which cause the emergence of paralogs within a genome. Many methods have been proposed to infer a single consensus tree from a collection of gene trees. Still, the application of these tree merging methods can lead to the loss of specific evolutionary patterns which characterize some gene families or some groups of gene families. Thus, the problem of inferring multiple consensus trees from a given set of gene trees becomes relevant. We describe a new fast method for inferring multiple consensus trees from a given set of phylogenetic trees (i.e. additive trees or X-trees) defined on the same set of species (i.e. objects or taxa). The traditional consensus approach yields a single consensus tree. We use the popular k-medoids partitioning algorithm to divide a given set of trees into several clusters of trees. We propose novel versions of the well-known Silhouette and Caliński-Harabasz cluster validity indices that are adapted for tree clustering with k-medoids. The efficiency of the new method was assessed using both synthetic and real data, such as a well-known phylogenetic dataset consisting of 47 gene trees inferred for 14 archaeal organisms. The method described here allows inference of multiple consensus trees from a given set of gene trees. It can be used to identify groups of gene trees having similar intragroup and different intergroup evolutionary histories. The main advantage of our method is that it is much faster than the existing tree clustering approaches, while
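
    The general idea of partitioning a set of trees with k-medoids can be sketched in Python as below. This is not the authors' implementation: it assumes the Robinson-Foulds distance as the tree metric, uses a plain alternating assign/update loop with caller-supplied starting medoids, and omits the adapted Silhouette and Caliński-Harabasz indices. Each tree is assumed to be given as a set of non-trivial splits, with every split encoded canonically as the frozenset of taxa on the side that excludes a fixed reference taxon.

```python
from typing import FrozenSet, List, Sequence, Set, Tuple

Split = FrozenSet[str]
Tree = Set[Split]


def rf_distance(t1: Tree, t2: Tree) -> int:
    """Robinson-Foulds distance: number of splits found in exactly one of the trees."""
    return len(t1 ^ t2)


def assign(trees: Sequence[Tree], medoids: Sequence[int]) -> List[int]:
    """Label each tree with the index (0..k-1) of its nearest medoid."""
    return [min(range(len(medoids)), key=lambda j: rf_distance(t, trees[medoids[j]]))
            for t in trees]


def k_medoids(trees: Sequence[Tree], initial_medoids: Sequence[int],
              max_iter: int = 100) -> Tuple[List[int], List[int]]:
    """Alternate assignment and medoid update until the medoids stop changing."""
    medoids = list(initial_medoids)
    for _ in range(max_iter):
        labels = assign(trees, medoids)
        new_medoids = []
        for j, old in enumerate(medoids):
            members = [i for i, lab in enumerate(labels) if lab == j]
            if not members:
                new_medoids.append(old)  # keep the old medoid for an empty cluster
                continue
            # New medoid: the member with the smallest total distance to its cluster.
            new_medoids.append(min(
                members,
                key=lambda i: sum(rf_distance(trees[i], trees[m]) for m in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(trees, medoids)


if __name__ == "__main__":
    fs = frozenset
    trees = [
        {fs("bc"), fs("de")}, {fs("bc"), fs("de")}, {fs("bc"), fs("bcd")},
        {fs("bd"), fs("ce")}, {fs("bd"), fs("ce")}, {fs("bd"), fs("bde")},
    ]
    print(k_medoids(trees, initial_medoids=[2, 3]))
```

    Each medoid is itself one of the input trees, so a consensus tree can afterwards be computed separately within each cluster.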

  6. Phylogenetic comparative methods complement discriminant function analysis in ecomorphology.

    Science.gov (United States)

    Barr, W Andrew; Scott, Robert S

    2014-04-01

    In ecomorphology, Discriminant Function Analysis (DFA) has been used as evidence for the presence of functional links between morphometric variables and ecological categories. Here we conduct simulations of characters containing phylogenetic signal to explore the performance of DFA under a variety of conditions. Characters were simulated using a phylogeny of extant antelope species from known habitats. Characters were modeled with no biomechanical relationship to the habitat category; the only sources of variation were body mass, phylogenetic signal, or random "noise." DFA on the discriminability of habitat categories was performed using subsets of the simulated characters, and Phylogenetic Generalized Least Squares (PGLS) was performed for each character. Analyses were repeated with randomized habitat assignments. When simulated characters lacked phylogenetic signal and/or habitat assignments were random, ecomorphology. Copyright © 2013 Wiley Periodicals, Inc.

  7. A molecular phylogenetic appraisal of the acanthostomines Acanthostomum and Timoniella and their position within Cryptogonimidae (Trematoda: Opisthorchioidea).

    Science.gov (United States)

    Martínez-Aquino, Andrés; Vidal-Martínez, Victor M; Aguirre-Macedo, M Leopoldina

    2017-01-01

    The phylogenetic position of three taxa from two trematode genera, belonging to the subfamily Acanthostominae (Opisthorchioidea: Cryptogonimidae), were analysed using partial 28S ribosomal DNA (Domains 1-2) and internal transcribed spacers (ITS1-5.8S-ITS2). Bayesian inference and Maximum likelihood analyses of combined 28S rDNA and ITS1 + 5.8S + ITS2 sequences indicated the monophyly of the genus Acanthostomum (A. cf. americanum and A. burminis) and paraphyly of the Acanthostominae. These phylogenetic relationships were consistent in analyses of 28S alone and of concatenated 28S + ITS1 + 5.8S + ITS2 sequences. Based on molecular phylogenetic analyses, the subfamily Acanthostominae is therefore a paraphyletic taxon, in contrast with previous classifications based on morphological data. Phylogenetic patterns of host specificity inferred from adult stages of other cryptogonimid taxa are also well supported. However, analyses using additional genera and species are necessary to support the phylogenetic inferences from this study. Our molecular phylogenetic reconstruction linked two larval stages of A. cf. americanum cercariae and metacercariae. Here, we present the evolutionary and ecological implications of parasitic infections in freshwater and brackish environments.

  8. A molecular phylogenetic appraisal of the acanthostomines Acanthostomum and Timoniella and their position within Cryptogonimidae (Trematoda: Opisthorchioidea)

    Directory of Open Access Journals (Sweden)

    Andrés Martínez-Aquino

    2017-12-01

    The phylogenetic position of three taxa from two trematode genera, belonging to the subfamily Acanthostominae (Opisthorchioidea: Cryptogonimidae), was analysed using partial 28S ribosomal DNA (Domains 1–2) and internal transcribed spacers (ITS1–5.8S–ITS2). Bayesian inference and Maximum likelihood analyses of combined 28S rDNA and ITS1 + 5.8S + ITS2 sequences indicated the monophyly of the genus Acanthostomum (A. cf. americanum and A. burminis) and paraphyly of the Acanthostominae. These phylogenetic relationships were consistent in analyses of 28S alone and of concatenated 28S + ITS1 + 5.8S + ITS2 sequences. Based on molecular phylogenetic analyses, the subfamily Acanthostominae is therefore a paraphyletic taxon, in contrast with previous classifications based on morphological data. Phylogenetic patterns of host specificity inferred from adult stages of other cryptogonimid taxa are also well supported. However, analyses using additional genera and species are necessary to support the phylogenetic inferences from this study. Our molecular phylogenetic reconstruction linked two larval stages of A. cf. americanum, cercariae and metacercariae. Here, we present the evolutionary and ecological implications of parasitic infections in freshwater and brackish environments.

  9. Multispecies coalescent analysis of the early diversification of neotropical primates: phylogenetic inference under strong gene trees/species tree conflict.

    Science.gov (United States)

    Schrago, Carlos G; Menezes, Albert N; Furtado, Carolina; Bonvicino, Cibele R; Seuanez, Hector N

    2014-11-05

    Neotropical primates (NP) are presently distributed in the New World from Mexico to northern Argentina, comprising three large families, Cebidae, Atelidae, and Pitheciidae, as a consequence of their diversification following their separation from Old World anthropoids near the Eocene/Oligocene boundary, some 40 Ma. The evolution of NP has been intensively investigated in the last decade by studies focusing on their phylogeny and timescale. However, despite major efforts, the phylogenetic relationship between these three major clades and the age of their last common ancestor are still controversial because these inferences were based on limited numbers of loci and dating analyses that did not consider the evolutionary variation associated with the distribution of gene trees within the proposed phylogenies. We show, by multispecies coalescent analyses of selected genome segments spanning 92,496,904 bp, that the early diversification of extant NP was marked by a 2-fold increase of their effective population size and that Atelids and Cebids are more closely related to each other than to Pitheciids. The molecular phylogeny of NP has been difficult to resolve because of population-level phenomena in the early evolution of the lineage. The association of evolutionary variation with the distribution of gene trees within proposed phylogenies is crucial for distinguishing the mean genetic divergence between species (the mean coalescent time between loci) from speciation time. This approach, based on extensive genomic data provided by new generation DNA sequencing, provides more accurate reconstructions of phylogenies and timescales for all organisms. © The Author(s) 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
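
The distinction drawn above between mean genetic divergence and speciation time can be illustrated with a small simulation. This is a minimal sketch under the standard neutral coalescent, not the authors' analysis: it assumes one lineage sampled per species, a constant diploid ancestral population, and no gene flow after the split. The function name and the values of `T_SPLIT` and `N_ANC` are hypothetical and chosen only for illustration.

```python
import random


def mean_gene_divergence(split_time, n_e, n_loci, seed=0):
    """Average pairwise coalescence time (in generations) over independent loci.

    Assumes one lineage per species, a constant diploid ancestral population
    of effective size n_e, and no gene flow after the species split.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_loci):
        # Looking backward in time, the two lineages cannot coalesce until
        # they reach the ancestral population; there the waiting time is
        # exponential with mean 2 * n_e generations.
        total += split_time + rng.expovariate(1.0 / (2 * n_e))
    return total / n_loci


T_SPLIT = 1_000_000  # hypothetical speciation time, in generations
N_ANC = 100_000      # hypothetical ancestral effective population size

estimate = mean_gene_divergence(T_SPLIT, N_ANC, n_loci=20_000)
print(f"speciation time:      {T_SPLIT}")
print(f"mean gene divergence: {estimate:.0f}")
```

Under these assumptions the mean gene divergence exceeds the speciation time by roughly 2 × `N_ANC` generations, so a larger ancestral population widens the gap between the two quantities.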

  10. Phylogenetic Analysis of Phytophthora Species Based on Mitochondrial and Nuclear DNA Sequences

    NARCIS (Netherlands)

    Kroon, L.P.N.M.; Bakker, F.T.; Bosch, van den G.B.M.; Bonants, P.J.M.; Flier, W.G.

    2004-01-01

    A molecular phylogenetic analysis of the genus Phytophthora was performed; 113 isolates from 48 Phytophthora species were included in this analysis. Phylogenetic analyses were performed on regions of mitochondrial (cytochrome c oxidase subunit 1; NADH dehydrogenase subunit 1) and nuclear gene

  11. Phylogenetic relationships within Echinococcus and Taenia tapeworms (Cestoda: Taeniidae): an inference from nuclear protein-coding genes.

    Science.gov (United States)

    Knapp, Jenny; Nakao, Minoru; Yanagida, Tetsuya; Okamoto, Munehiro; Saarma, Urmas; Lavikainen, Antti; Ito, Akira

    2011-12-01

    The family Taeniidae of tapeworms is composed of two genera, Echinococcus and Taenia, which obligately parasitize mammals including humans. Inferring phylogeny via molecular markers is the only way to trace back their evolutionary histories. However, molecular dating approaches are lacking so far. Here we established new markers from nuclear protein-coding genes for RNA polymerase II second largest subunit (rpb2), phosphoenolpyruvate carboxykinase (pepck) and DNA polymerase delta (pold). Bayesian inference and maximum likelihood analyses of the concatenated gene sequences allowed us to reconstruct phylogenetic trees for taeniid parasites. The tree topologies clearly demonstrated that Taenia is paraphyletic and that the clade of Echinococcus oligarthrus and Echinococcus vogeli is sister to all other members of Echinococcus. Both species are endemic in Central and South America, and their definitive hosts originated from carnivores that immigrated from North America after the formation of the Panamanian land bridge about 3 million years ago (Ma). A time-calibrated phylogeny was estimated by a Bayesian relaxed-clock method based on the assumption that the most recent common ancestor of E. oligarthrus and E. vogeli existed during the late Pliocene (3.0 Ma). The results suggest that a clade of Taenia including human-pathogenic species diversified primarily in the late Miocene (11.2 Ma), whereas Echinococcus started to diversify later, at the end of the Miocene (5.8 Ma). Close genetic relationships among the members of Echinococcus imply that the genus is a young group in which speciation and global radiation occurred rapidly. Copyright © 2011 Elsevier Inc. All rights reserved.

  12. Relating phylogenetic trees to transmission trees of infectious disease outbreaks.

    Science.gov (United States)

    Ypma, Rolf J F; van Ballegooijen, W Marijn; Wallinga, Jacco

    2013-11-01

    Transmission events are the fundamental building blocks of the dynamics of any infectious disease. Much about the epidemiology of a disease can be learned when these individual transmission events are known or can be estimated. Such estimations are difficult and generally feasible only when detailed epidemiological data are available. The genealogy estimated from genetic sequences of sampled pathogens is another rich source of information on transmission history. Optimal inference of transmission events calls for the combination of genetic data and epidemiological data into one joint analysis. A key difficulty is that the transmission tree, which describes the transmission events between infected hosts, differs from the phylogenetic tree, which describes the ancestral relationships between pathogens sampled from these hosts. The trees differ both in timing of the internal nodes and in topology. These differences become more pronounced when a higher fraction of infected hosts is sampled. We show how the phylogenetic tree of sampled pathogens is related to the transmission tree of an outbreak of an infectious disease, by the within-host dynamics of pathogens. We provide a statistical framework to infer key epidemiological and mutational parameters by simultaneously estimating the phylogenetic tree and the transmission tree. We test the approach using simulations and illustrate its use on an outbreak of foot-and-mouth disease. The approach unifies existing methods in the emerging field of phylodynamics with transmission tree reconstruction methods that are used in infectious disease epidemiology.

  13. One tree to link them all: a phylogenetic dataset for the European tetrapoda.

    Science.gov (United States)

    Roquet, Cristina; Lavergne, Sébastien; Thuiller, Wilfried

    2014-08-08

    With the ever-increasing availability of phylogenetically informative data, the last decade has seen an upsurge of ecological studies incorporating information on evolutionary relationships among species. However, detailed species-level phylogenies, which are necessary for comprehensive large-scale eco-phylogenetic analyses, are still lacking for many large groups and regions. Here, we provide a dataset of 100 dated phylogenetic trees for all European tetrapods based on a mixture of supermatrix and supertree approaches. Phylogenetic inference was performed separately for each of the main Tetrapoda groups of Europe except mammals (i.e. amphibians, birds, squamates and turtles) by means of maximum likelihood (ML) analyses of a supermatrix, applying a tree constraint at the family (amphibians and squamates) or order (birds and turtles) levels based on consensus knowledge. For each group, we inferred 100 ML trees to be able to provide a phylogenetic dataset that accounts for phylogenetic uncertainty, and assessed node support with bootstrap analyses. Each tree was dated using penalized-likelihood and fossil calibration. The trees obtained were well-supported by existing knowledge and previous phylogenetic studies. For mammals, we modified the most complete supertree dataset available in the literature to include a recent update of the Carnivora clade. As a final step, we merged the phylogenetic trees of all groups to obtain a set of 100 phylogenetic trees for all European Tetrapoda species for which data was available (91%). We provide this phylogenetic dataset (100 chronograms) for the purpose of comparative analyses, macro-ecological or community ecology studies aiming to incorporate phylogenetic information while accounting for phylogenetic uncertainty.

  14. Phylogenetic relationships in the genus Leonardoxa (Leguminosae: Caesalpinioideae) inferred from chloroplast trnL intron and trnL-trnF intergenic spacer sequences.

    Science.gov (United States)

    Brouat, Carine; Gielly, Ludovic; McKey, Doyle

    2001-01-01

    The African genus Leonardoxa (Leguminosae: Caesalpinioideae) comprises two Congolean species and a group of four mostly allopatric subspecies principally located in Cameroon and clustered together in the L. africana complex. Leonardoxa provides a good opportunity to investigate the evolutionary history of ant-plant mutualisms, as it exhibits various grades of ant-plant interactions from diffuse to obligate and symbiotic associations. We present in this paper the first molecular phylogenetic study of this genus. We sequenced both the chloroplast DNA trnL intron (677 aligned base pairs [bp]) and trnL-trnF intergene spacer (598 aligned bp). Inferred phylogenetic relationships suggested first that the genus is paraphyletic. The L. africana complex is clearly separated from the two Congolean species, and the integrity of the genus is thus in question. In the L. africana complex, our data showed a lack of congruence between clades suggested by morphological and chloroplast characters. This, and the low level of molecular divergence found between subspecies, suggests gene flow and introgressive events in the L. africana complex.

  15. Phylogenetic utility of ribosomal genes for reconstructing the phylogeny of five Chinese satyrine tribes (Lepidoptera, Nymphalidae)

    Directory of Open Access Journals (Sweden)

    Mingsheng Yang

    2015-03-01

    Satyrinae is one of twelve subfamilies of the butterfly family Nymphalidae, which currently includes nine tribes. However, phylogenetic relationships among them remain largely unresolved, though various studies have been conducted based on both morphological and molecular data. Moreover, ribosomal genes have never been used in tribe-level phylogenetic analyses of Satyrinae. In this study we investigate for the first time the phylogenetic relationships among the tribes Elymniini, Amathusiini, Zetherini and Melanitini, which are indicated to be a monophyletic group, and the Satyrini, using two ribosomal genes (28s rDNA and 16s rDNA) and four protein-coding genes (EF-1α, COI, COII and Cytb). We mainly aim to assess the phylogenetic informativeness of the ribosomal genes as well as clarify the relationships among different tribes. Our results show the two ribosomal genes generally have the same high phylogenetic informativeness compared with EF-1α; and we infer that the 28s rDNA would show better informativeness if 28s rDNA sequence data were obtained for each sampled taxon. The placement of the monotypic genus Callarge Leech in Zetherini is confirmed for the first time based on molecular evidence. In addition, our maximum likelihood (ML) and Bayesian inference (BI) trees consistently show that the involved Satyrinae including the Amathusiini is monophyletic with high support values. Although the relationships among the five tribes are identical among ML and BI analyses and are mostly strongly supported in BI analysis, those in ML analysis are weakly or moderately supported. Therefore, the relationships among the five tribes recovered herein need further verification based on broader taxon sampling.

  16. STRIDE: Species Tree Root Inference from Gene Duplication Events.

    Science.gov (United States)

    Emms, David M; Kelly, Steven

    2017-12-01

    The correct interpretation of any phylogenetic tree is dependent on that tree being correctly rooted. We present STRIDE, a fast, effective, and outgroup-free method for identification of gene duplication events and species tree root inference in large-scale molecular phylogenetic analyses. STRIDE identifies sets of well-supported in-group gene duplication events from a set of unrooted gene trees, and analyses these events to infer a probability distribution over an unrooted species tree for the location of its root. We show that STRIDE correctly identifies the root of the species tree in multiple large-scale molecular phylogenetic data sets spanning a wide range of timescales and taxonomic groups. We demonstrate that the novel probability model implemented in STRIDE can accurately represent the ambiguity in species tree root assignment for data sets where information is limited. Furthermore, application of STRIDE to outgroup-free inference of the origin of the eukaryotic tree resulted in a root probability distribution that provides additional support for leading hypotheses for the origin of the eukaryotes. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  17. BLAST-EXPLORER helps you building datasets for phylogenetic analysis

    Directory of Open Access Journals (Sweden)

    Claverie Jean-Michel

    2010-01-01

    Background: The right sampling of homologous sequences for phylogenetic or molecular evolution analyses is a crucial step, the quality of which can have a significant impact on the final interpretation of the study. There is no single way of constructing datasets suitable for phylogenetic analysis, because this task intimately depends on the scientific question we want to address. Moreover, database mining software such as BLAST, which is routinely used for searching homologous sequences, is not specifically optimized for this task. Results: To fill this gap, we designed BLAST-Explorer, an original and friendly web-based application that combines a BLAST search with a suite of tools that allows interactive, phylogenetic-oriented exploration of the BLAST results and flexible selection of homologous sequences among the BLAST hits. Once the selection of the BLAST hits is done using BLAST-Explorer, the corresponding sequences can be imported locally for external analysis or passed to the phylogenetic tree reconstruction pipelines available on the Phylogeny.fr platform. Conclusions: BLAST-Explorer provides a simple, intuitive and interactive graphical representation of the BLAST results and allows selection and retrieval of the BLAST hit sequences based on a wide range of criteria. Although BLAST-Explorer primarily aims at helping the construction of sequence datasets for further phylogenetic study, it can also be used as a standard BLAST server with enriched output. BLAST-Explorer is available at http://www.phylogeny.fr

  18. Phylogenetic and biogeographic analysis of sphaerexochine trilobites.

    Directory of Open Access Journals (Sweden)

    Curtis R Congreve

    BACKGROUND: Sphaerexochinae is a speciose and widely distributed group of cheirurid trilobites. Their temporal range extends from the earliest Ordovician through the Silurian, and they survived the end Ordovician mass extinction event (the second largest mass extinction in Earth history). Prior to this study, the individual evolutionary relationships within the group had yet to be determined utilizing rigorous phylogenetic methods. Understanding these evolutionary relationships is important for producing a stable classification of the group, and will be useful in elucidating the effects the end Ordovician mass extinction had on the evolutionary and biogeographic history of the group. METHODOLOGY/PRINCIPAL FINDINGS: Cladistic parsimony analysis of cheirurid trilobites assigned to the subfamily Sphaerexochinae was conducted to evaluate phylogenetic patterns and produce a hypothesis of relationship for the group. This study utilized the program TNT, and the analysis included thirty-one taxa and thirty-nine characters. The results of this analysis were then used in a Lieberman-modified Brooks Parsimony Analysis to analyze biogeographic patterns during the Ordovician-Silurian. CONCLUSIONS/SIGNIFICANCE: The genus Sphaerexochus was found to be monophyletic, consisting of two smaller clades (one composed entirely of Ordovician species and another composed of Silurian and Ordovician species). By contrast, the genus Kawina was found to be paraphyletic. It is a basal grade that also contains taxa formerly assigned to Cydonocephalus. Phylogenetic patterns suggest Sphaerexochinae is a relatively distinctive trilobite clade because it appears to have been largely unaffected by the end Ordovician mass extinction. Finally, the biogeographic analysis yields two major conclusions about Sphaerexochus biogeography: Bohemia and Avalonia were close enough during the Silurian to exchange taxa; and during the Ordovician there was dispersal between Eastern Laurentia and

  19. Dinoflagellate phylogeny as inferred from heat shock protein 90 and ribosomal gene sequences.

    Directory of Open Access Journals (Sweden)

    Mona Hoppenrath

    2010-10-01

    evolutionary past. Nonetheless, the more comprehensive analysis of Hsp90 sequences enabled us to infer phylogenetic interrelationships of dinoflagellates more rigorously. For instance, the phylogenetic position of Noctiluca, which possesses several unusual features, was incongruent with previous phylogenetic studies. Therefore, the generation of additional dinoflagellate Hsp90 sequences is expected to refine the stem group of athecate species observed here and contribute to future multi-gene analyses of dinoflagellate interrelationships.

  20. Genome-wide identification, characterization and phylogenetic analysis of 50 catfish ATP-binding cassette (ABC) transporter genes.

    Science.gov (United States)

    Liu, Shikai; Li, Qi; Liu, Zhanjiang

    2013-01-01

    Although a large set of full-length transcripts was recently assembled in catfish, annotation of large gene families, especially those with duplications, is still a great challenge. Most often, complexities in annotation cause mis-identification and thereby much confusion in the scientific literature. As such, detailed phylogenetic analysis and/or orthology analysis are required for annotation of genes involved in gene families. The ATP-binding cassette (ABC) transporter gene superfamily is a large gene family that encodes membrane proteins that transport a diverse set of substrates across membranes, playing important roles in protecting organisms from diverse environments. In this work, we identified a set of 50 ABC transporters in the catfish genome. Phylogenetic analysis allowed their identification and annotation into seven subfamilies, including 9 ABCA genes, 12 ABCB genes, 12 ABCC genes, 5 ABCD genes, 2 ABCE genes, 4 ABCF genes and 6 ABCG genes. Most ABC transporters are conserved among vertebrates, though cases of recent gene duplications and gene losses do exist. Gene duplications in catfish were found for ABCA1, ABCB3, ABCB6, ABCC5, ABCD3, ABCE1, ABCF2 and ABCG2. The whole set of catfish ABC transporters provides the essential genomic resources for future biochemical, toxicological and physiological studies of ABC drug efflux transporters. The establishment of orthologies should allow functional inferences with the information from model species, though the function of lineage-specific genes can be distinct because of specific living environments with different selection pressures.
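
As a quick consistency check on the figures quoted above, the per-subfamily counts can be tallied against the stated total of 50 genes. The counts are taken directly from the abstract; the variable names are only illustrative.

```python
# Gene counts per ABC subfamily, as listed in the abstract.
subfamily_counts = {
    "ABCA": 9,
    "ABCB": 12,
    "ABCC": 12,
    "ABCD": 5,
    "ABCE": 2,
    "ABCF": 4,
    "ABCG": 6,
}

total = sum(subfamily_counts.values())
print(f"{len(subfamily_counts)} subfamilies, {total} genes")
```

The seven subfamily counts sum to 50, matching the number of transporters reported.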

  1. Phylogenetic Conflict in Bears Identified by Automated Discovery of Transposable Element Insertions in Low-Coverage Genomes.

    Science.gov (United States)

    Lammers, Fritjof; Gallus, Susanne; Janke, Axel; Nilsson, Maria A

    2017-10-01

    Phylogenetic reconstruction from transposable elements (TEs) offers an additional perspective to study evolutionary processes. However, detecting phylogenetically informative TE insertions requires tedious experimental work, limiting the power of phylogenetic inference. Here, we analyzed the genomes of seven bear species using high-throughput sequencing data to detect thousands of TE insertions. The newly developed pipeline for TE detection called TeddyPi (TE detection and discovery for Phylogenetic Inference) identified 150,513 high-quality TE insertions in the genomes of ursine and tremarctine bears. By integrating different TE insertion callers and using a stringent filtering approach, the TeddyPi pipeline produced highly reliable TE insertion calls, which were confirmed by extensive in vitro validation experiments. Analysis of single nucleotide substitutions in the flanking regions of the TEs shows that these substitutions correlate with the phylogenetic signal from the TE insertions. Our phylogenomic analyses show that TEs are a major driver of genomic variation in bears and enabled phylogenetic reconstruction of a well-resolved species tree, despite strong signals for incomplete lineage sorting and introgression. The analyses show that the Asiatic black, sun, and sloth bear form a monophyletic clade, in which phylogenetic incongruence originates from incomplete lineage sorting. TeddyPi is open source and can be adapted to various TE and structural variation callers. The pipeline makes it possible to confidently extract thousands of TE insertions even from low-coverage genomes (∼10×) of nonmodel organisms. This opens new possibilities for biologists to study phylogenies and evolutionary processes as well as rates and patterns of (retro-)transposition and structural variation. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  2. A human genome-wide library of local phylogeny predictions for whole-genome inference problems

    Directory of Open Access Journals (Sweden)

    Schwartz Russell

    2008-08-01

    Background: Many common inference problems in computational genetics depend on inferring aspects of the evolutionary history of a data set given a set of observed modern sequences. Detailed predictions of the full phylogenies are therefore of value in improving our ability to make further inferences about population history and sources of genetic variation. Making phylogenetic predictions on the scale needed for whole-genome analysis is, however, extremely computationally demanding. Results: In order to facilitate phylogeny-based predictions on a genomic scale, we develop a library of maximum parsimony phylogenies within local regions spanning all autosomal human chromosomes based on Haplotype Map variation data. We demonstrate the utility of this library for population genetic inferences by examining a tree statistic we call 'imperfection,' which measures the reuse of variant sites within a phylogeny. This statistic is significantly predictive of recombination rate, shows additional regional and population-specific conservation, and allows us to identify outlier genes likely to have experienced unusual amounts of variation in recent human history. Conclusion: Recent theoretical advances in algorithms for phylogenetic tree reconstruction have made it possible to perform large-scale inferences of local maximum parsimony phylogenies from single nucleotide polymorphism (SNP) data. As results from the imperfection statistic demonstrate, phylogeny predictions encode substantial information useful for detecting genomic features and population history. This data set should serve as a platform for many kinds of inferences one may wish to make about human population history and genetic variation.
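
The record above builds on maximum parsimony phylogenies. As a minimal sketch of the scoring step only, Fitch's algorithm counts the fewest state changes one character needs on a fixed rooted binary tree. This is not the authors' pipeline and not their definition of the imperfection statistic; the tree, the taxon names and the site patterns below are hypothetical.

```python
def fitch_score(tree, states):
    """Return the minimum number of state changes for one character.

    tree:   nested 2-tuples of leaf names (a rooted binary tree).
    states: dict mapping each leaf name to its observed state.
    """
    changes = 0

    def state_set(node):
        nonlocal changes
        if isinstance(node, str):  # leaf: its observed state
            return {states[node]}
        left, right = (state_set(child) for child in node)
        common = left & right
        if common:                 # children agree on at least one state
            return common
        changes += 1               # disjoint sets: one change is required
        return left | right

    state_set(tree)
    return changes


tree = (("A", "B"), ("C", "D"))                       # hypothetical topology
compatible = {"A": 0, "B": 0, "C": 1, "D": 1}         # site fits the tree
homoplastic = {"A": 0, "B": 1, "C": 0, "D": 1}        # site conflicts with it

print(fitch_score(tree, compatible))
print(fitch_score(tree, homoplastic))
```

A variable binary site needs at least one change on any tree, so a score above one means the site must have mutated more than once on that particular topology.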

  3. Phylogeny Inference of Closely Related Bacterial Genomes: Combining the Features of Both Overlapping Genes and Collinear Genomic Regions

    Science.gov (United States)

    Zhang, Yan-Cong; Lin, Kui

    2015-01-01

    Overlapping genes (OGs) represent one type of widespread genomic feature in bacterial genomes and have been used as rare genomic markers in phylogeny inference of closely related bacterial species. However, the inference may experience a decrease in performance for phylogenomic analysis of too closely or too distantly related genomes. Another drawback of OGs as phylogenetic markers is that they usually take little account of the effects of genomic rearrangement on the similarity estimation, such as intra-chromosome/genome translocations, horizontal gene transfer, and gene losses. To explore such effects on the accuracy of phylogeny reconstruction, we combine phylogenetic signals of OGs with collinear genomic regions, here called locally collinear blocks (LCBs). By putting these together, we refine our previous metric of pairwise similarity between two closely related bacterial genomes. As a case study, we used this new method to reconstruct the phylogenies of 88 Enterobacteriale genomes of the class Gammaproteobacteria. Our results demonstrated that the topological accuracy of the inferred phylogeny was improved when both OGs and LCBs were simultaneously considered, suggesting that combining these two phylogenetic markers may reduce, to some extent, the influence of gene loss on phylogeny inference. Such phylogenomic studies, we believe, will help us to explore a more effective approach to increasing the robustness of phylogeny reconstruction of closely related bacterial organisms. PMID:26715828

  4. Phylogenetic reconstruction of endophytic fungal isolates using internal transcribed spacer 2 (ITS2) region.

    Science.gov (United States)

    GokulRaj, Kathamuthu; Sundaresan, Natesan; Ganeshan, Enthai Jagan; Rajapriya, Pandi; Muthumary, Johnpaul; Sridhar, Jayavel; Pandi, Mohan

    2014-01-01

    Endophytic fungi are inhabitants of plants that live most of their life cycle asymptomatically and mainly confer protection and ecological advantages on the host plant. In the present study, 48 endophytic fungi were isolated from the leaves of three medicinal plants and characterized based on ITS2 sequence - secondary structure analysis. ITS2 secondary structures were elucidated with the minimum free energy method (MFOLD version 3.1) and the consensus structure of each genus was generated by 4SALE. ProfDistS was used to generate the ITS2 sequence-structure based phylogenetic tree. Our isolates belonged to the Ascomycetes, representing 5 orders and 6 genera. Colletotrichum/Glomerella spp., Diaporthae/Phomopsis spp., and Alternaria spp., were predominantly observed while Cochliobolus sp., Cladosporium sp., and Emericella sp., were represented by singletons. The constructed phylogenetic tree has well resolved monophyletic groups with >50% bootstrap value support. Secondary structure based fungal systematics improves not only the stability but also the precision of phylogenetic inference. The above ITS2 based phylogenetic analysis was performed for our 48 isolates along with sequences of known ex-types taken from GenBank, which confirms the efficiency of the proposed method. Further, we propose ITS2 as a superlative marker for reconstructing phylogenetic relationships at different taxonomic levels owing to its shorter length.

  5. Phylogenetic and recombination analysis of tomato spotted wilt virus.

    Directory of Open Access Journals (Sweden)

    Sen Lian

    Tomato spotted wilt virus (TSWV) severely damages and reduces the yield of many economically important plants worldwide. In this study, we determined the whole-genome sequences of 10 TSWV isolates recently identified from various regions and hosts in Korea. Phylogenetic analysis of these 10 isolates as well as the three previously sequenced isolates indicated that the 13 Korean TSWV isolates could be divided into two groups reflecting either two different origins or divergences of Korean TSWV isolates. In addition, the complete nucleotide sequences for the 13 Korean TSWV isolates along with previously sequenced TSWV RNA segments from Korea and other countries were subjected to phylogenetic and recombination analysis. The phylogenetic analysis indicated that both the RNA L and RNA M segments of most Korean isolates might have originated in Western Europe and North America but that the RNA S segments for all Korean isolates might have originated in China and Japan. Recombination analysis identified a total of 12 recombination events among all isolates and segments and five recombination events among the 13 Korean isolates; among the five recombinants from Korea, three contained the whole RNA L segment, suggesting reassortment rather than recombination. Our analyses provide evidence that both recombination and reassortment have contributed to the molecular diversity of TSWV.

  6. Evolutionary inference via the Poisson Indel Process.

    Science.gov (United States)

    Bouchard-Côté, Alexandre; Jordan, Michael I

    2013-01-22

    We address the problem of the joint statistical inference of phylogenetic trees and multiple sequence alignments from unaligned molecular sequences. This problem is generally formulated in terms of string-valued evolutionary processes along the branches of a phylogenetic tree. The classic evolutionary process, the TKF91 model [Thorne JL, Kishino H, Felsenstein J (1991) J Mol Evol 33(2):114-124] is a continuous-time Markov chain model composed of insertion, deletion, and substitution events. Unfortunately, this model gives rise to an intractable computational problem: The computation of the marginal likelihood under the TKF91 model is exponential in the number of taxa. In this work, we present a stochastic process, the Poisson Indel Process (PIP), in which the complexity of this computation is reduced to linear. The Poisson Indel Process is closely related to the TKF91 model, differing only in its treatment of insertions, but it has a global characterization as a Poisson process on the phylogeny. Standard results for Poisson processes allow key computations to be decoupled, which yields the favorable computational profile of inference under the PIP model. We present illustrative experiments in which Bayesian inference under the PIP model is compared with separate inference of phylogenies and alignments.

  7. Loss of the flagellum happened only once in the fungal lineage: phylogenetic structure of Kingdom Fungi inferred from RNA polymerase II subunit genes

    Directory of Open Access Journals (Sweden)

    Hodson Matthew C

    2006-09-01

    Full Text Available Background: At present, there is not a widely accepted consensus view regarding the phylogenetic structure of kingdom Fungi although two major phyla, Ascomycota and Basidiomycota, are clearly delineated. Regarding the lower fungi, Zygomycota and Chytridiomycota, a variety of proposals have been advanced. Microsporidia may or may not be fungi; the Glomales (vesicular-arbuscular mycorrhizal fungi) may or may not constitute a fifth fungal phylum, and the loss of the flagellum may have occurred either once or multiple times during fungal evolution. All of these issues are capable of being resolved by a molecular phylogenetic analysis which achieves strong statistical support for major branches. To date, no fungal phylogeny based upon molecular characters has satisfied this criterion. Results: Using the translated amino acid sequences of the RPB1 and RPB2 genes, we have inferred a fungal phylogeny that consists largely of well-supported monophyletic phyla. Our major results, each with significant statistical support, are: (1) Microsporidia are sister to kingdom Fungi and are not members of Zygomycota; that is, Microsporidia and fungi originated from a common ancestor. (2) Chytridiomycota, the only fungal phylum having a developmental stage with a flagellum, is paraphyletic and is the basal lineage. (3) Zygomycota is monophyletic based upon sampling of Trichomycetes, Zygomycetes, and Glomales. (4) Zygomycota, Basidiomycota, and Ascomycota form a monophyletic group separate from Chytridiomycota. (5) Basidiomycota and Ascomycota are monophyletic sister groups. Conclusion: In general, this paper highlights the evolutionary position and significance of the lower fungi (Zygomycota and Chytridiomycota). Our results suggest that loss of the flagellum happened only once during early stages of fungal evolution; consequently, the majority of fungi, unlike plants and animals, are nonflagellated. The phylogeny we infer from gene sequences is the first one that is

  8. Loss of the flagellum happened only once in the fungal lineage: phylogenetic structure of kingdom Fungi inferred from RNA polymerase II subunit genes.

    Science.gov (United States)

    Liu, Yajuan J; Hodson, Matthew C; Hall, Benjamin D

    2006-09-29

    At present, there is not a widely accepted consensus view regarding the phylogenetic structure of kingdom Fungi although two major phyla, Ascomycota and Basidiomycota, are clearly delineated. Regarding the lower fungi, Zygomycota and Chytridiomycota, a variety of proposals have been advanced. Microsporidia may or may not be fungi; the Glomales (vesicular-arbuscular mycorrhizal fungi) may or may not constitute a fifth fungal phylum, and the loss of the flagellum may have occurred either once or multiple times during fungal evolution. All of these issues are capable of being resolved by a molecular phylogenetic analysis which achieves strong statistical support for major branches. To date, no fungal phylogeny based upon molecular characters has satisfied this criterion. Using the translated amino acid sequences of the RPB1 and RPB2 genes, we have inferred a fungal phylogeny that consists largely of well-supported monophyletic phyla. Our major results, each with significant statistical support, are: (1) Microsporidia are sister to kingdom Fungi and are not members of Zygomycota; that is, Microsporidia and fungi originated from a common ancestor. (2) Chytridiomycota, the only fungal phylum having a developmental stage with a flagellum, is paraphyletic and is the basal lineage. (3) Zygomycota is monophyletic based upon sampling of Trichomycetes, Zygomycetes, and Glomales. (4) Zygomycota, Basidiomycota, and Ascomycota form a monophyletic group separate from Chytridiomycota. (5) Basidiomycota and Ascomycota are monophyletic sister groups. In general, this paper highlights the evolutionary position and significance of the lower fungi (Zygomycota and Chytridiomycota). Our results suggest that loss of the flagellum happened only once during early stages of fungal evolution; consequently, the majority of fungi, unlike plants and animals, are nonflagellated. The phylogeny we infer from gene sequences is the first one that is congruent with the widely accepted morphology

  9. BioMatriX: Sequence analysis, structure visualization, phylogenetics ...

    African Journals Online (AJOL)

    bmx-biomatrix.blogspot.com) developed for biological science community to augment scientific research regarding genomics, proteomics, phylogenetics and linkage analysis in one platform. BioMatriX offers multi-functional services to perform ...

  10. An attempt to reconstruct phylogenetic relationships within Caribbean nummulitids: simulating relationships and tracing character evolution

    Science.gov (United States)

    Eder, Wolfgang; Ives Torres-Silva, Ana; Hohenegger, Johann

    2017-04-01

    Phylogenetic analysis and trees based on molecular data are broadly applied and used to infer genetic and biogeographic relationships in recent larger foraminifera. Molecular phylogenetics is intensively used within recent nummulitids; however, for fossil representatives these trees are only of minor informational value. Hence, within paleontological studies a phylogenetic approach through morphometric analysis is of much higher value. To tackle phylogenetic relationships within the nummulitid family, a much higher number of morphological characters must be measured than are commonly used in biometric studies, where mostly parameters describing embryonic size (e.g., proloculus diameter, deuteroloculus diameter) and/or the marginal spiral (e.g., spiral diagrams, spiral indices) are studied. For this purpose 11 growth-independent and/or growth-invariant characters have been used to describe the morphological variability of equatorial thin sections of seven Caribbean nummulitid taxa (Nummulites striatoreticulatus, N. macgillavry, Palaeonummulites willcoxi, P. floridensis, P. soldadensis, P. trinitatensis and P. ocalanus) and one outgroup taxon (Ranikothalia bermudezi). Using these characters, phylogenetic trees were calculated using a restricted maximum likelihood algorithm (REML), and results are cross-checked by ordination and cluster analysis. The squared-change parsimony method has been run to reconstruct ancestral states, as well as to simulate the evolution of the chosen characters along the calculated phylogenetic tree, and independent-contrast analysis was used to estimate confidence intervals. Based on these simulations, phylogenetic tendencies of certain characters proposed for nummulitids (e.g., Cope's rule or nepionic acceleration) can be tested, whether these tendencies are valid for the whole family or only for certain clades. At least, within the Caribbean nummulitids, phylogenetic trends along some growth-independent characters of the embryo (e.g., first

  11. Phylogenetic analysis of HSP70 and cyt b gene sequences for Chinese Leishmania isolates and ultrastructural characteristics of Chinese Leishmania sp.

    Science.gov (United States)

    Yuan, Dongmei; Qin, Hanxiao; Zhang, Jianguo; Liao, Lin; Chen, Qiwei; Chen, Dali; Chen, Jianping

    2017-02-01

    Leishmaniasis is a worldwide epidemic disease caused by the genus Leishmania, which is still endemic in the west and northwest areas of China. Some viewpoints of the traditional taxonomy of Chinese Leishmania have been challenged by recent phylogenetic studies based on different molecular markers. However, the taxonomic positions and phylogenetic relationships of Chinese Leishmania isolates remain controversial and call for more data and further analysis. In this study, the heat shock protein 70 (HSP70) gene and cytochrome b (cyt b) gene were used for phylogenetic analysis of Chinese Leishmania isolates from patients, dogs, gerbils, and sand flies in different geographic origins. In addition, for the undescribed Leishmania sp. in China, the ultrastructure of three Chinese Leishmania sp. strains (MHOM/CN/90/SC10H2, SD, GL) was observed by transmission electron microscopy. Bayesian trees from HSP70 and cyt b congruently indicated that the 14 Chinese Leishmania isolates belong to three Leishmania species including L. donovani complex, L. gerbilli, and L. (Sauroleishmania) sp. Their identity further confirmed that the undescribed Leishmania species causing visceral Leishmaniasis (VL) in China is closely related to L. tarentolae. The phylogenetic results from HSP70 also suggested the classification of subspecies within L. donovani complex: KXG-918, KXG-927, KXG-Liu, KXG-Xu, 9044, SC6, and KXG-65 belong to L. donovani; Cy, WenChuan, and 801 were proposed to be L. infantum. Through transmission electron microscopy, unexpectedly, the Golgi apparatus was not observed in SC10H2, SD, and GL, which was similar to previous reports of reptilian Leishmania. The statistical analysis of microtubule counts separated SC10H2, SD, and GL as one group from any other reference strain (L. donovani MHOM/IN/80/DD8; L. tropica MHOM/SU/74/K27; L. gerbilli MRHO/CN/60/GERBILLI). The ultrastructural characteristics of Leishmania sp. partly lend support to the phylogenetic inference that

  12. Phylogeny and biogeography of highly diverged freshwater fish species (Leuciscinae, Cyprinidae, Teleostei) inferred from mitochondrial genome analysis.

    Science.gov (United States)

    Imoto, Junichi M; Saitoh, Kenji; Sasaki, Takeshi; Yonezawa, Takahiro; Adachi, Jun; Kartavtsev, Yuri P; Miya, Masaki; Nishida, Mutsumi; Hanzawa, Naoto

    2013-02-10

    The distribution of freshwater taxa is a good biogeographic model to study pattern and process of vicariance and dispersal. The subfamily Leuciscinae (Cyprinidae, Teleostei) consists of many species distributed widely in Eurasia and North America. Leuciscinae have been divided into two phyletic groups, leuciscin and phoxinin. The phylogenetic relationships between major clades within the subfamily are poorly understood, largely because of the overwhelming diversity of the group. The origin of the Far Eastern phoxinin is an interesting question regarding the evolutionary history of Leuciscinae. Here we present phylogenetic analysis of 31 species of Leuciscinae and outgroups based on complete mitochondrial genome sequences to clarify the phylogenetic relationships and to infer the evolutionary history of the subfamily. Phylogenetic analysis suggests that the Far Eastern phoxinin species comprised the monophyletic clades Tribolodon, Pseudaspius, Oreoleuciscus and Far Eastern Phoxinus. The Far Eastern phoxinin clade was independent of other Leuciscinae lineages and was closer to North American phoxinins than European leuciscins. All of our analyses also suggested that leuciscins and phoxinins each constituted monophyletic groups. Divergence time estimation suggested that Leuciscinae species diverged from outgroups such as Tincinae 83.3 million years ago (Mya) in the Late Cretaceous and that leuciscin and phoxinin shared a common ancestor 70.7 Mya. Radiation of Leuciscinae lineages occurred during the Late Cretaceous to Paleocene. This period also witnessed the radiation of tetrapods. Reconstruction of ancestral areas indicates Leuciscinae species originated within Europe. Leuciscin species evolved in Europe and the ancestor of phoxinin was distributed in North America. The Far Eastern phoxinins would have dispersed from North America to Far East across the Beringia land bridge. The present study suggests important roles for the continental rearrangements during the

  13. Phylogenetic relationships within the cyst-forming nematodes (Nematoda, Heteroderidae) based on analysis of sequences from the ITS regions of ribosomal DNA.

    Science.gov (United States)

    Subbotin, S A; Vierstraete, A; De Ley, P; Rowe, J; Waeyenberge, L; Moens, M; Vanfleteren, J R

    2001-10-01

    The ITS1, ITS2, and 5.8S gene sequences of nuclear ribosomal DNA from 40 taxa of the family Heteroderidae (including the genera Afenestrata, Cactodera, Heterodera, Globodera, Punctodera, Meloidodera, Cryphodera, and Thecavermiculatus) were sequenced and analyzed. The ITS regions displayed high levels of sequence divergence within Heteroderinae and compared to outgroup taxa. Unlike recent findings in root knot nematodes, ITS sequence polymorphism does not appear to complicate phylogenetic analysis of cyst nematodes. Phylogenetic analyses with maximum-parsimony, minimum-evolution, and maximum-likelihood methods were performed with a range of computer alignments, including elision and culled alignments. All multiple alignments and phylogenetic methods yielded similar basic structure for phylogenetic relationships of Heteroderidae. The cyst-forming nematodes are represented by six main clades corresponding to morphological characters and host specialization, with certain clades assuming different positions depending on alignment procedure and/or method of phylogenetic inference. Hypotheses of monophyly of Punctoderinae and Heteroderinae are, respectively, strongly and moderately supported by the ITS data across most alignments. Close relationships were revealed between the Avenae and the Sacchari groups and between the Humuli group and the species H. salixophila within Heteroderinae. The Goettingiana group occupies a basal position within this subfamily. The validity of the genera Afenestrata and Bidera was tested and is discussed based on molecular data. We conclude that ITS sequence data are appropriate for studies of relationships within the different species groups and less so for recovery of more ancient speciations within Heteroderidae. Copyright 2001 Academic Press.

  14. Phylogenetic inferences of Atelinae (Platyrrhini) based on multi-directional chromosome painting in Brachyteles arachnoides, Ateles paniscus paniscus and Ateles b. marginatus.

    Science.gov (United States)

    de Oliveira, E H C; Neusser, M; Pieczarka, J C; Nagamachi, C; Sbalqueiro, I J; Müller, S

    2005-01-01

    We performed multi-directional chromosome painting in a comparative cytogenetic study of the three Atelinae species Brachyteles arachnoides, Ateles paniscus paniscus and Ateles belzebuth marginatus, in order to reconstruct phylogenetic relationships within this Platyrrhini subfamily. Comparative chromosome maps between these species were established by multi-color fluorescence in situ hybridization (FISH) employing human, Saguinus oedipus and Lagothrix lagothricha chromosome-specific probes. The three species included in this study and four previously analyzed species from all four Atelinae genera were subjected to a phylogenetic analysis on the basis of a data matrix comprised of 82 discrete chromosome characters. The results confirmed that Atelinae represent a monophyletic clade with a putative ancestral karyotype of 2n = 62 chromosomes. Phylogenetic analysis revealed an evolutionary branching sequence [Alouatta [Brachyteles [Lagothrix and Ateles

  15. Human synthetic lethal inference as potential anti-cancer target gene detection

    Directory of Open Access Journals (Sweden)

    Solé Ricard V

    2009-12-01

    Full Text Available Background: Two genes are called synthetic lethal (SL) if mutation of either alone is not lethal, but mutation of both leads to death or a significant decrease in the organism's fitness. The detection of SL gene pairs constitutes a promising alternative for anti-cancer therapy. As cancer cells exhibit a large number of mutations, the identification of these mutated genes' SL partners may provide specific anti-cancer drug candidates, with minor perturbations to the healthy cells. Since existing SL data is mainly restricted to yeast screenings, the road towards human SL candidates is limited to inference methods. Results: In the present work, we use phylogenetic analysis and database manipulation (BioGRID for interactions, Ensembl and NCBI for homology, Gene Ontology for GO attributes) in order to reconstruct the phylogenetically-inferred SL gene network for human. In addition, available data on cancer mutated genes (COSMIC and Cancer Gene Census databases) as well as on existing approved drugs (DrugBank database) supports our selection of cancer-therapy candidates. Conclusions: Our work provides a complementary alternative to the current methods for drug discovery and gene target identification in anti-cancer research. Novel SL screening analysis and the use of highly curated databases would contribute to improving the results of this methodology.

  16. Phylogenetic analysis and victim contact tracing of rabies virus from humans and dogs in Bali, Indonesia.

    Science.gov (United States)

    Mahardika, G N K; Dibia, N; Budayanti, N S; Susilawathi, N M; Subrata, K; Darwinata, A E; Wignall, F S; Richt, J A; Valdivia-Granda, W A; Sudewi, A A R

    2014-06-01

    The emergence of human and animal rabies in Bali since November 2008 has attracted local, national and international interest. The potential origin and time of introduction of rabies virus to Bali is described. The nucleoprotein (N) gene of rabies virus from dog brain and human clinical specimens was sequenced using an automated DNA sequencer. Phylogenetic inference with Bayesian Markov Chain Monte Carlo (MCMC) analysis using the Bayesian Evolutionary Analysis by Sampling Trees (BEAST) v. 1.7.5 software confirmed that the outbreak of rabies in Bali was caused by an Indonesian lineage virus following a single introduction. The ancestor of Bali viruses was the descendant of a virus from Kalimantan. Contact tracing showed that the event most likely occurred in early 2008. The introduction of rabies into a large unvaccinated dog population in Bali clearly demonstrates the risk of disease transmission for government agencies and should lead to an increased preparedness and efforts for sustained risk reduction to prevent such events from occurring in future.

  17. First evidence of Leishmania infection in European brown hare (Lepus europaeus) in Greece: GIS analysis and phylogenetic position within the Leishmania spp.

    Science.gov (United States)

    Tsokana, C N; Sokos, C; Giannakopoulos, A; Mamuris, Z; Birtsas, P; Papaspyropoulos, K; Valiakos, G; Spyrou, V; Lefkaditis, M; Chatzopoulos, D C; Kantere, M; Manolakou, K; Touloudi, A; Burriel, A Rodi; Ferroglio, E; Hadjichristodoulou, C; Billinis, C

    2016-01-01

    Although the existence of a sylvatic transmission cycle of Leishmania spp., independent from the domestic cycle, has been proposed, data are scarce on Leishmania infection in wild mammals in Greece. In this study, we aimed to investigate the presence of Leishmania infection in the European brown hare in Greece, to infer the phylogenetic position of the Leishmania parasites detected in hares in Greece, and to identify any possible correlation between Leishmania infection in hares with environmental parameters, using the geographical information system (GIS). Spleen samples from 166 hares were tested by internal transcribed spacer-1 (ITS-1)-nested PCR for the detection of Leishmania DNA. Phylogenetic analysis was performed on Leishmania sequences from hares in Greece in conjunction with Leishmania sequences from dogs in Greece and 46 Leishmania sequences retrieved from GenBank. The Leishmania DNA prevalence in hares was found to be 23.49 % (95 % confidence interval (CI) 17.27-30.69). The phylogenetic analysis confirmed that the Leishmania sequences from hares in Greece belong in the Leishmania donovani complex. The widespread Leishmania infection in hares should be taken into consideration because under specific circumstances, this species can act as a reservoir host. This study suggests that the role of wild animals, including hares, in the epidemiology of Leishmania spp. in Greece deserves further elucidation.
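
    The prevalence figure in the abstract above can be checked with a short calculation. The following is a minimal sketch, not the authors' method: the count of 39 positive hares is an assumption back-calculated from the reported 23.49 % of 166 samples, and the Wilson score interval is used only as one common choice, since the abstract does not say which interval method produced 17.27-30.69.

```python
import math


def wilson_interval(positives, total, z=1.959964):
    """Wilson score confidence interval for a binomial proportion.

    z defaults to the standard normal quantile for a 95 % interval.
    """
    p = positives / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total
                               + z * z / (4 * total * total)) / denom
    return centre - half_width, centre + half_width


# ASSUMPTION: 39 positives is inferred from 23.49 % of 166; it is not stated
# in the abstract.
positives, total = 39, 166
prevalence = positives / total
low, high = wilson_interval(positives, total)
print(f"prevalence = {100 * prevalence:.2f} %")
print(f"95 % Wilson interval = {100 * low:.2f} to {100 * high:.2f} %")
```

    An exact (Clopper-Pearson) interval would give slightly different bounds, so small differences from the published interval are expected.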

  18. Phylogenetic analysis of hepatitis B virus in Pakistan

    International Nuclear Information System (INIS)

    Baig, S.; Hasnain, N.U.

    2008-01-01

    To identify the distribution pattern of Hepatitis B Virus (HBV) genotype in a group of patients and to study its phylogenetic divergence. Two hundred and one HBV infected patients were genotyped for this study. All HBsAg positive individuals, either healthy carriers or suffering from conditions such as acute or chronic hepatitis, cirrhosis and hepatocellular carcinoma were included. Hepatitis B patients co-infected with other hepatic viruses were excluded. Hepatitis B virus DNA was extracted from serum and subjected to a nested PCR, using type-specific primers for genotype detection. Phylogenetic analysis was performed in the pre-S1 through S genes of HBV. The divergence was studied through 15 sequences of 967 bp submitted to the DDBJ/EMBL/GenBank databases, accessible under accession numbers EF584640 through EF584654. Out of 201 patients tested, 156 were males and 45 were females. Genotype D was the predominant type found in 128 (64%) patients followed by A in 47 (23%) and mixed A/D in 26 (13%). Phylogenetic analysis confirmed the dominance of genotype D and subtype ayw2, which closely resembled HBV strains that circulate in Iran, India and Japan. (author)
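
    The genotype percentages reported above follow directly from the stated counts. A minimal sketch of that arithmetic, using only the numbers given in the abstract (the variable names are illustrative):

```python
# Genotype counts as reported in the abstract (201 patients in total).
genotype_counts = {"D": 128, "A": 47, "A/D mixed": 26}

total = sum(genotype_counts.values())

# Percentage of patients per genotype, rounded to whole percent as in the abstract.
genotype_percent = {
    genotype: round(100 * count / total)
    for genotype, count in genotype_counts.items()
}

for genotype, percent in genotype_percent.items():
    print(f"genotype {genotype}: {genotype_counts[genotype]} of {total} ({percent}%)")
```

    The three counts sum to the 201 patients tested, so the categories are exhaustive for this cohort.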

  19. Visualizing phylogenetic tree landscapes.

    Science.gov (United States)

    Wilgenbusch, James C; Huang, Wen; Gallivan, Kyle A

    2017-02-02

    Genomic-scale sequence alignments are increasingly used to infer phylogenies in order to better understand the processes and patterns of evolution. Different partitions within these new alignments (e.g., genes, codon positions, and structural features) often favor hundreds if not thousands of competing phylogenies. Summarizing and comparing phylogenies obtained from multi-source data sets using current consensus tree methods discards valuable information and can disguise potential methodological problems. Discovery of efficient and accurate dimensionality reduction methods used to display at once in 2 or 3 dimensions the relationship among these competing phylogenies will help practitioners diagnose the limits of current evolutionary models and potential problems with phylogenetic reconstruction methods when analyzing large multi-source data sets. We introduce several dimensionality reduction methods to visualize in 2 and 3 dimensions the relationship among competing phylogenies obtained from gene partitions found in three mid- to large-size mitochondrial genome alignments. We test the performance of these dimensionality reduction methods by applying several goodness-of-fit measures. The intrinsic dimensionality of each data set is also estimated to determine whether projections in 2 and 3 dimensions can be expected to reveal meaningful relationships among trees from different data partitions. Several new approaches to aid in the comparison of different phylogenetic landscapes are presented. Curvilinear Components Analysis (CCA) and a stochastic gradient descent (SGD) optimization method give the best representation of the original tree-to-tree distance matrix for each of the three mitochondrial genome alignments and greatly outperformed the method currently used to visualize tree landscapes. The CCA + SGD method converged at least as fast as previously applied methods for visualizing tree landscapes. We demonstrate for all three mtDNA alignments that 3D

  20. Integrative taxonomy of ciliates: Assessment of molecular phylogenetic content and morphological homology testing.

    Science.gov (United States)

    Vďačný, Peter

    2017-10-01

    The very diverse and comparatively complex morphology of ciliates has given rise to numerous taxonomic concepts. However, the information content of the utilized molecular markers has seldom been explored prior to phylogenetic analyses and taxonomic decisions. Likewise, robust testing of morphological homology statements and the apomorphic nature of diagnostic characters of ciliate taxa is rarely carried out. Four phylogenetic techniques that may help address these issues are reviewed. (1) Split spectrum analysis serves to determine the exact number and quality of nucleotide positions supporting individual nodes in phylogenetic trees and to discern long-branch artifacts that cause spurious phylogenies. (2) Network analysis can depict all possible evolutionary trajectories inferable from the dataset and locate and measure the conflict between them. (3) A priori likelihood mapping tests the suitability of data for reconstruction of a well resolved tree, visualizes the tree-likeness of quartets, and assesses the support of an internal branch of a given tree topology. (4) Reconstruction of ancestral morphologies can be applied for analyzing homology and apomorphy statements without circular reasoning. Since these phylogenetic tools are rarely used, their principles and interpretation are introduced and exemplified using various groups of ciliates. Finally, environmental sequencing data are discussed in this light. Copyright © 2017 The Author. Published by Elsevier GmbH. All rights reserved.

  1. Causal inference in survival analysis using pseudo-observations

    DEFF Research Database (Denmark)

    Andersen, Per K; Syriopoulou, Elisavet; Parner, Erik T

    2017-01-01

    Causal inference for non-censored response variables, such as binary or quantitative outcomes, is often based on either (1) direct standardization ('G-formula') or (2) inverse probability of treatment assignment weights ('propensity score'). To do causal inference in survival analysis, one needs ...

  2. Phylogenetic Information Content of Copepoda Ribosomal DNA Repeat Units: ITS1 and ITS2 Impact

    Science.gov (United States)

    Zagoskin, Maxim V.; Lazareva, Valentina I.; Grishanin, Andrey K.; Mukha, Dmitry V.

    2014-01-01

    The utility of various regions of the ribosomal repeat unit for phylogenetic analysis was examined in 16 species representing four families, nine genera, and two orders of the subclass Copepoda (Crustacea). Fragments approximately 2000 bp in length containing the ribosomal DNA (rDNA) 18S and 28S gene fragments, the 5.8S gene, and the internal transcribed spacer regions I and II (ITS1 and ITS2) were amplified and analyzed. The DAMBE (Data Analysis in Molecular Biology and Evolution) software was used to analyze the saturation of nucleotide substitutions; this test revealed the suitability of both the 28S gene fragment and the ITS1/ITS2 rDNA regions for the reconstruction of phylogenetic trees. Distance (minimum evolution) and probabilistic (maximum likelihood, Bayesian) analyses of the data revealed that the 28S rDNA and the ITS1 and ITS2 regions are informative markers for inferring phylogenetic relationships among families of copepods and within the Cyclopidae family and associated genera. Split-graph analysis of concatenated ITS1/ITS2 rDNA regions of cyclopoid copepods suggested that the Mesocyclops, Thermocyclops, and Macrocyclops genera share complex evolutionary relationships. This study revealed that the ITS1 and ITS2 regions potentially represent different phylogenetic signals. PMID:25215300

  3. Novel Approaches for Phylogenetic Inference from Morphological Data and Total-Evidence Dating in Squamate Reptiles (Lizards, Snakes, and Amphisbaenians).

    Science.gov (United States)

    Pyron, R Alexander

    2017-01-01

    Here, I combine previously underutilized models and priors to perform more biologically realistic phylogenetic inference from morphological data, with an example from squamate reptiles. When coding morphological characters, it is often possible to denote ordered states with explicit reference to observed or hypothetical ancestral conditions. Using this logic, we can integrate across character-state labels and estimate meaningful rates of forward and backward transitions from plesiomorphy to apomorphy. I refer to this approach as MkA, for “asymmetric.” The MkA model incorporates the biological reality of limited reversal for many phylogenetically informative characters, and significantly increases likelihoods in the empirical data sets. Despite this, the phylogeny of Squamata remains contentious. Total-evidence analyses using combined morphological and molecular data and the MkA approach tend toward recent consensus estimates supporting a nested Iguania. However, support for this topology is not unambiguous across data sets or analyses, and no mechanism has been proposed to explain the widespread incongruence between partitions, or the hidden support for various topologies in those partitions. Furthermore, different morphological data sets produced by different authors contain both different characters and different states for the same or similar characters, resulting in drastically different placements for many important fossil lineages. Effort is needed to standardize ontology for morphology, resolve incongruence, and estimate a robust phylogeny. The MkA approach provides a preliminary avenue for investigating morphological evolution while accounting for temporal evidence and asymmetry in character-state changes.

  4. Continental scale patterns and predictors of fern richness and phylogenetic diversity

    Directory of Open Access Journals (Sweden)

    Nathalie Nagalingum

    2015-04-01

    Full Text Available Because ferns have a wide range of habitat preferences and are widely distributed, they are an ideal group for understanding how diversity is distributed. Here we examine fern diversity on a broad-scale using standard and corrected richness measures as well as phylogenetic indices; in addition we determine the environmental predictors of each diversity metric. Using the combined records of Australian herbaria, a dataset of over 60,000 records was obtained for 89 genera to infer richness. A phylogenetic tree of all the genera was constructed and combined with the herbarium records to obtain phylogenetic diversity patterns. A hotspot of both taxic and phylogenetic diversity occurs in the Wet Tropics of northeastern Australia. Although considerable diversity is distributed along the eastern coast, some important regions of diversity are identified only after sample-standardization of richness and through the phylogenetic metric. Of all of the metrics, annual precipitation was identified as the most explanatory variable, in part, in agreement with global and regional fern studies. Precipitation was combined with a different variable for each different metric. For corrected richness, precipitation is combined with temperature seasonality, while correlation of phylogenetic diversity to precipitation plus radiation indicates support for the species-energy hypothesis. Significantly high and significantly low phylogenetic diversity were found in geographically separate areas. These areas are correlated with different climatic conditions such as seasonality in precipitation. The use of phylogenetic metrics identifies additional areas of significant diversity, some of which have not been revealed using traditional taxonomic analyses, suggesting that different ecological and evolutionary processes have operated over the continent. Our study demonstrates that it is possible and vital to incorporate evolutionary metrics when inferring biodiversity hotspots

  5. [Phylogeny of protostome moulting animals (Ecdysozoa) inferred from 18S and 28S rRNA gene sequences].

    Science.gov (United States)

    Petrov, N B; Vladychenskaia, N S

    2005-01-01

    The reliability of reconstructed phylogenetic relationships within a group of protostome moulting animals was evaluated by comparing sets of 18S and 28S rRNA gene sequences, both taken separately and combined. Reliability was assessed by the bootstrap support values of the major phylogenetic tree nodes and by the degree of congruence of phylogenetic trees inferred by various methods. By both criteria, phylogenetic trees reconstructed from the combined 18S and 28S rRNA gene sequences were better than those inferred from 18S and 28S sequences taken separately. The results are consistent with the phylogenetic hypothesis separating protostome animals into two major clades, moulting Ecdysozoa (Priapulida + Kinorhyncha, Nematoda + Nematomorpha, Onychophora + Tardigrada, Myriapoda + Chelicerata, Crustacea + Hexapoda) and unmoulting Lophotrochozoa (Plathelminthes, Nemertini, Annelida, Mollusca, Echiura, Sipuncula). Clade Cephalorhyncha does not include nematomorphs (Nematomorpha). We conclude that it is necessary to use combined 18S and 28S data in phylogenetic studies.

  6. Phylogenetic diversity analysis of Trichoderma species based on ...

    African Journals Online (AJOL)

    vi-4177/CSAU be assigned as the type strains of a species of genus Trichoderma based on phylogenetic tree analysis together with the 18S rRNA gene sequence search in Ribosomal Database Project, small subunit rRNA and large subunit ...

  7. Phylogenetic inference in Rafflesiales: the influence of rate heterogeneity and horizontal gene transfer

    Directory of Open Access Journals (Sweden)

    Vidal-Russell Romina

    2004-10-01

    Full Text Available Abstract Background: The phylogenetic relationships among the holoparasites of Rafflesiales have remained enigmatic for over a century. Recent molecular phylogenetic studies using the mitochondrial matR gene placed Rafflesia, Rhizanthes and Sapria (Rafflesiaceae s. str.) in the angiosperm order Malpighiales and Mitrastema (Mitrastemonaceae) in Ericales. These phylogenetic studies did not, however, sample two additional groups traditionally classified within Rafflesiales (Apodanthaceae and Cytinaceae). Here we provide molecular phylogenetic evidence using DNA sequence data from mitochondrial and nuclear genes for representatives of all genera in Rafflesiales. Results: Our analyses indicate that the phylogenetic affinities of the large-flowered clade and Mitrastema, ascertained using mitochondrial matR, are congruent with results from nuclear SSU rDNA when these data are analyzed using maximum likelihood and Bayesian methods. The relationship of Cytinaceae to Malvales was recovered in all analyses. Relationships between Apodanthaceae and photosynthetic angiosperms varied depending upon the data partition: Malvales (3-gene), Cucurbitales (matR) or Fabales (atp1). The latter incongruencies suggest that horizontal gene transfer (HGT) may be affecting the mitochondrial gene topologies. The lack of association between Mitrastema and Ericales using atp1 is suggestive of HGT, but greater sampling within eudicots is needed to test this hypothesis further. Conclusions: Rafflesiales are not monophyletic but composed of three or four independent lineages (families): Rafflesiaceae, Mitrastemonaceae, Apodanthaceae and Cytinaceae. Long-branch attraction appears to be misleading parsimony analyses of nuclear small-subunit rDNA data, but model-based methods (maximum likelihood and Bayesian analyses) recover a topology that is congruent with the mitochondrial matR gene tree, thus providing compelling evidence for organismal relationships. Horizontal gene transfer appears to

  8. Global patterns and drivers of phylogenetic structure in island floras

    NARCIS (Netherlands)

    Weigelt, P.; Kissling, W.D.; Kisel, Y.; Fritz, S.A.; Karger, D.N.; Kessler, A.; Lehtonen, S.; Svenning, J.-C.; Kreft, H.

    2015-01-01

    Islands are ideal for investigating processes that shape species assemblages because they are isolated and have discrete boundaries. Quantifying phylogenetic assemblage structure allows inferences about these processes, in particular dispersal, environmental filtering and in-situ speciation. Here,

  9. Data for constructing insect genome content matrices for phylogenetic analysis and functional annotation

    Directory of Open Access Journals (Sweden)

    Jeffrey Rosenfeld

    2016-03-01

    Full Text Available Twenty-one fully sequenced and well-annotated insect genomes were used to construct genome content matrices for phylogenetic analysis and functional annotation of insect genomes. To examine the role of e-value cutoff in ortholog determination, we used scaled e-value cutoffs and a single linkage clustering approach. The present communication includes (1) a list of the genomes used to construct the genome content phylogenetic matrices, (2) a nexus file with the data matrices used in phylogenetic analysis, (3) a nexus file with the Newick trees generated by phylogenetic analysis, (4) an Excel file listing the Core (CORE) genes and Unique (UNI) genes found in five insect groups, and (5) a figure showing a plot of consistency index (CI) versus percent of unannotated genes that are apomorphies in the data set for gene losses and gains and bar plots of gains and losses for four consistency index (CI) cutoffs.

  10. On the information content of discrete phylogenetic characters.

    Science.gov (United States)

    Bordewich, Magnus; Deutschmann, Ina Maria; Fischer, Mareike; Kasbohm, Elisa; Semple, Charles; Steel, Mike

    2017-12-16

    Phylogenetic inference aims to reconstruct the evolutionary relationships of different species based on genetic (or other) data. Discrete characters are a particular type of data, which contain information on how the species should be grouped together. However, it has long been known that some characters contain more information than others. For instance, a character that assigns the same state to each species groups all of them together and so provides no insight into the relationships of the species considered. At the other extreme, a character that assigns a different state to each species also conveys no phylogenetic signal. In this manuscript, we study a natural combinatorial measure of the information content of an individual character and analyse properties of characters that provide the maximum phylogenetic information, particularly, the number of states such a character uses and how the different states have to be distributed among the species or taxa of the phylogenetic tree.
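
    The two uninformative extremes described above can be checked with the standard parsimony-informativeness criterion: a character is informative only if at least two of its states each occur in two or more taxa. The sketch below applies that criterion; it is a simpler, widely used test and is not the combinatorial information measure studied in the paper, and the function name is hypothetical.

```python
from collections import Counter


def is_parsimony_informative(states):
    """Return True if at least two states each occur in two or more taxa.

    `states` is a sequence with one character state per taxon.
    """
    counts = Counter(states)
    shared_states = [state for state, n in counts.items() if n >= 2]
    return len(shared_states) >= 2
```

    A character coded `AAAAAA` (one state for all taxa) and a character coded `ABCDEF` (a different state for every taxon) both fail this test, whereas `AABBCC` passes.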

  11. Soft-tissue anatomy of the extant hominoids: a review and phylogenetic analysis

    Science.gov (United States)

    Gibbs, S; Collard, M; Wood, B

    2002-01-01

    This paper reports the results of a literature search for information about the soft-tissue anatomy of the extant non-human hominoid genera, Pan, Gorilla, Pongo and Hylobates, together with the results of a phylogenetic analysis of these data plus comparable data for Homo. Information on the four extant non-human hominoid genera was located for 240 out of the 1783 soft-tissue structures listed in the Nomina Anatomica. Numerically these data are biased so that information about some systems (e.g. muscles) and some regions (e.g. the forelimb) are over-represented, whereas other systems and regions (e.g. the veins and the lymphatics of the vascular system, the head region) are either under-represented or not represented at all. Screening to ensure that the data were suitable for use in a phylogenetic analysis reduced the number of eligible soft-tissue structures to 171. These data, together with comparable data for modern humans, were converted into discontinuous character states suitable for phylogenetic analysis and then used to construct a taxon-by-character matrix. This matrix was used in two tests of the hypothesis that soft-tissue characters can be relied upon to reconstruct hominoid phylogenetic relationships. In the first, parsimony analysis was used to identify cladograms requiring the smallest number of character state changes. In the second, the phylogenetic bootstrap was used to determine the confidence intervals of the most parsimonious clades. The parsimony analysis yielded a single most parsimonious cladogram that matched the molecular cladogram. Similarly the bootstrap analysis yielded clades that were compatible with the molecular cladogram; a (Homo, Pan) clade was supported by 95% of the replicates, and a (Gorilla, Pan, Homo) clade by 96%. These are the first hominoid morphological data to provide statistically significant support for the clades favoured by the molecular evidence. PMID:11833653

  12. Phylogenetic prediction of Alternaria leaf blight resistance in wild and cultivated species of carrots (Daucus, Apiaceae)

    Science.gov (United States)

    Plant scientists make inferences and predictions from phylogenetic trees to solve scientific problems. Crop losses due to disease damage is an important problem that many plant breeders would like to solve, so the ability to predict traits like disease resistance from phylogenetic trees derived from...

  13. A Comparative Analysis of Fuzzy Inference Engines in Context of ...

    African Journals Online (AJOL)

    Fuzzy inference engine has found successful applications in a wide variety of fields, such as automatic control, data classification, decision analysis, expert engines, time series prediction, robotics, pattern recognition, etc. This paper presents a comparative analysis of three fuzzy inference engines, max-product, max-min ...
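
    For readers unfamiliar with the engines named above, the sketch below shows the two compositions that usually go by the names max-min and max-product, applied to fuzzy relations stored as nested lists of membership values. It assumes the record refers to these textbook compositions; the function names are hypothetical and nothing here is taken from the paper itself.

```python
def compose(r, s, t_norm):
    """Compose fuzzy relations r (m x n) and s (n x p) with the given t-norm.

    Entry (i, k) of the result is the maximum over j of t_norm(r[i][j], s[j][k]).
    """
    return [
        [max(t_norm(r[i][j], s[j][k]) for j in range(len(s))) for k in range(len(s[0]))]
        for i in range(len(r))
    ]


def max_min(r, s):
    """Max-min composition: the t-norm is the minimum."""
    return compose(r, s, min)


def max_product(r, s):
    """Max-product composition: the t-norm is ordinary multiplication."""
    return compose(r, s, lambda a, b: a * b)
```

    The two compositions agree when memberships are 0 or 1 and differ for intermediate values, where the product is never larger than the minimum.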

  14. Anchoring quartet-based phylogenetic distances and applications to species tree reconstruction.

    Science.gov (United States)

    Sayyari, Erfan; Mirarab, Siavash

    2016-11-11

    Inferring species trees from gene trees using the coalescent-based summary methods has been the subject of much attention, yet new scalable and accurate methods are needed. We introduce DISTIQUE, a new statistically consistent summary method for inferring species trees from gene trees under the coalescent model. We generalize our results to arbitrary phylogenetic inference problems; we show that two arbitrarily chosen leaves, called anchors, can be used to estimate relative distances between all other pairs of leaves by inferring relevant quartet trees. This results in a family of distance-based tree inference methods, with running times ranging from quadratic to quartic in the number of leaves. We show in simulated studies that DISTIQUE has comparable accuracy to leading coalescent-based summary methods and reduced running times.
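
    The anchoring idea can be pictured by listing the quartets that involve one fixed pair of anchors. The sketch below covers only this enumeration step, under the assumption that each remaining pair of leaves is combined with both anchors; it does not implement DISTIQUE's distance estimation, and the function name is hypothetical.

```python
from itertools import combinations


def anchored_quartets(leaves, anchor_a, anchor_b):
    """List the quartets formed by two fixed anchors and every pair of other leaves."""
    others = [leaf for leaf in leaves if leaf not in (anchor_a, anchor_b)]
    return [(anchor_a, anchor_b, x, y) for x, y in combinations(others, 2)]
```

    For n leaves and one anchor pair this yields (n-2)(n-3)/2 quartets, which grows quadratically in n, whereas the set of all four-leaf subsets grows quartically. That is consistent with the range of running times quoted in the abstract.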

  15. Characterization and phylogenetic analysis of α-gliadin gene ...

    Indian Academy of Sciences (India)

    Supplementary data: Characterization and phylogenetic analysis of α-gliadin gene sequences reveals significant genomic divergence in Triticeae species. Guang-Rong Li, Tao Lang, En-Nian Yang, Cheng Liu ... The MITE insertion at the 3' UTR is boxed. Figure 2. The secondary structure of MITE insertion in HM452949.

  16. Phylogenetic analysis of subgenus vigna species using nuclear ribosomal RNA ITS: evidence of hybridization among Vigna unguiculata subspecies.

    Science.gov (United States)

    Vijaykumar, Archana; Saini, Ajay; Jawali, Narendra

    2010-01-01

    Molecular phylogeny among species belonging to subgenus Vigna (genus Vigna) was inferred based on internal transcribed spacer (ITS) sequences of 18S-5.8S-26S ribosomal RNA gene unit. Analysis showed a total of 356 polymorphic sites of which approximately 80% were parsimony informative. Phylogenetic reconstruction by neighbor joining and maximum parsimony methods placed the 57 Vigna accessions (belonging to 15 species) into 5 major clades. Five species viz. Vigna heterophylla, Vigna pubigera, Vigna parkeri, Vigna laurentii, and Vigna gracilis whose position in the subgenus was previously not known were placed in the section Vigna. A single accession (Vigna unguiculata ssp. tenuis, NI 1637) harbored 2 intragenomic ITS variants, indicative of 2 different types of ribosomal DNA (rDNA) repeat units. ITS variant type-I was close to ITS from V. unguiculata ssp. pubescens, whereas type-II was close to V. unguiculata ssp. tenuis. Transcript analysis clearly demonstrates that in accession NI 1637, rDNA repeat units with only type-II ITS variants are transcriptionally active. Evidence from sequence analysis (of 5.8S, ITS1, and ITS2) and secondary structure analysis (of ITS1 and ITS2) indicates that the type-I ITS variant probably does not belong to the pseudogenic rDNA repeat units. The results from phylogenetic and transcript analysis suggest that the rDNA units with the type-I ITS may have introgressed as a result of hybridization (between ssp. tenuis and ssp. pubescens); however, it has been epigenetically silenced. The results also demonstrate differential evolution of ITS sequence among wild and cultivated forms of V. unguiculata.

  17. Barcoding and Phylogenetic Inferences in Nine Mugilid Species (Pisces, Mugiliformes)

    Directory of Open Access Journals (Sweden)

    Neonila Polyakova

    2013-10-01

    Full Text Available Accurate identification of fish and fish products, from eggs to adults, is important in many areas. Grey mullets of the family Mugilidae are distributed worldwide and inhabit marine, estuarine, and freshwater environments in all tropical and temperate regions. Various Mugilid species are commercially important species in fishery and aquaculture of many countries. For the present study we have chosen two Mugilid genes with different phylogenetic signals: relatively variable mitochondrial cytochrome oxidase subunit I (COI and conservative nuclear rhodopsin (RHO. We examined their diversity within and among 9 Mugilid species belonging to 4 genera, many of which have been examined from multiple specimens, with the goal of determining whether DNA barcoding can achieve unambiguous species recognition of Mugilid species. The data obtained showed that information based on COI sequences was diagnostic not only for species-level identification but also for recognition of intraspecific units, e.g., allopatric populations of circumtropical Mugil cephalus, or even native and acclimatized specimens of Chelon haematocheila. All RHO sequences appeared strictly species specific. Based on the data obtained, we conclude that COI, as well as RHO sequencing can be used to unambiguously identify fish species. Topologies of phylogeny based on RHO and COI sequences coincided with each other, while together they had a good phylogenetic signal.

  18. REFGEN and TREENAMER: Automated Sequence Data Handling for Phylogenetic Analysis in the Genomic Era

    Science.gov (United States)

    Leonard, Guy; Stevens, Jamie R.; Richards, Thomas A.

    2009-01-01

    The phylogenetic analysis of nucleotide sequences and increasingly that of amino acid sequences is used to address a number of biological questions. Access to extensive datasets, including numerous genome projects, means that standard phylogenetic analyses can include many hundreds of sequences. Unfortunately, most phylogenetic analysis programs do not tolerate the sequence naming conventions of genome databases. Managing large numbers of sequences and standardizing sequence labels for use in phylogenetic analysis programs can be a time consuming and laborious task. Here we report the availability of an online resource for the management of gene sequences recovered from public access genome databases such as GenBank. These web utilities include the facility for renaming every sequence in a FASTA alignment file, with each sequence label derived from a user-defined combination of the species name and/or database accession number. This facility enables the user to keep track of the branching order of the sequences/taxa during multiple tree calculations and re-optimisations. Post phylogenetic analysis, these webpages can then be used to rename every label in the subsequent tree files (with a user-defined combination of species name and/or database accession number). Together these programs drastically reduce the time required for managing sequence alignments and labelling phylogenetic figures. Additional features of our platform include the automatic removal of identical accession numbers (recorded in the report file) and generation of species and accession number lists for use in supplementary materials or figure legends. PMID:19812722
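
    The relabelling step described above can be sketched in a few lines. This is not the REFGEN code: the function name, the accession-to-species lookup table and the `Species_name_ACCESSION` label format are all assumptions chosen for illustration.

```python
def rename_fasta(lines, species_by_accession):
    """Replace FASTA headers with labels built from species name and accession.

    `species_by_accession` is a hypothetical lookup table mapping an accession
    number to a species name. Headers with unknown accessions are left unchanged.
    """
    renamed = []
    for line in lines:
        if line.startswith(">"):
            parts = line[1:].split()
            accession = parts[0] if parts else ""
            species = species_by_accession.get(accession)
            if species is not None:
                # Spaces are replaced because many tree programs reject them in labels.
                line = ">" + species.replace(" ", "_") + "_" + accession
        renamed.append(line)
    return renamed
```

    Keeping the accession in the new label means the same lookup can later be reversed to relabel tips in the resulting tree files.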

  19. AST: an automated sequence-sampling method for improving the taxonomic diversity of gene phylogenetic trees.

    Science.gov (United States)

    Zhou, Chan; Mao, Fenglou; Yin, Yanbin; Huang, Jinling; Gogarten, Johann Peter; Xu, Ying

    2014-01-01

    A challenge in phylogenetic inference of gene trees is how to properly sample a large pool of homologous sequences to derive a good representative subset of sequences. Such a need arises in various applications, e.g. when (1) accuracy-oriented phylogenetic reconstruction methods may not be able to deal with a large pool of sequences due to their high demand in computing resources; (2) applications analyzing a collection of gene trees may prefer to use trees with fewer operational taxonomic units (OTUs), for instance for the detection of horizontal gene transfer events by identifying phylogenetic conflicts; and (3) the pool of available sequences is biased towards extensively studied species. In the past, the creation of subsamples often relied on manual selection. Here we present an Automated sequence-Sampling method for improving the Taxonomic diversity of gene phylogenetic trees, AST, to obtain representative sequences that maximize the taxonomic diversity of the sampled sequences. To demonstrate the effectiveness of AST, we have tested it to solve four problems, namely, inference of the evolutionary histories of the small ribosomal subunit protein S5 of E. coli, 16S ribosomal RNAs and glycosyl-transferase gene family 8, and a study of ancient horizontal gene transfers from bacteria to plants. Our results show that the resolution of our computational results is almost as good as that of manual inference by domain experts, hence making the tool generally useful to phylogenetic studies by non-phylogeny specialists. The program is available at http://csbl.bmb.uga.edu/~zhouchan/AST.php.
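
    To make the sampling problem concrete, the sketch below picks sequences by cycling over taxa, so that over-represented species cannot crowd out the others. It is a plain round-robin heuristic written for illustration, not the AST algorithm, whose details the abstract does not give; the function name and record layout are hypothetical.

```python
from collections import defaultdict


def sample_by_taxon(records, k):
    """Pick up to k sequence ids, taking one per taxon in turn.

    `records` is an iterable of (sequence_id, taxon) pairs.
    """
    groups = defaultdict(list)
    for seq_id, taxon in records:
        groups[taxon].append(seq_id)
    picked = []
    # Keep cycling over the taxa until k ids are chosen or every group is empty.
    while len(picked) < k and any(groups.values()):
        for taxon in list(groups):
            if groups[taxon] and len(picked) < k:
                picked.append(groups[taxon].pop(0))
    return picked
```

    With three E. coli sequences and one each from two other species, a sample of three contains one sequence from each species rather than three from E. coli.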

  20. Evolution of larval life mode of Oecophoridae (Lepidoptera: Gelechioidea) inferred from molecular phylogeny.

    Science.gov (United States)

    Kim, Sora; Kaila, Lauri; Lee, Seunghwan

    2016-08-01

    Phylogenetic relationships within the family Oecophoridae have been poorly understood. Consequently, the subfamily- and genus-level classifications within this family are problematic. A comprehensive phylogenetic analysis of Oecophoridae, the concealer moths, was performed based on analysis of 4444 base pairs of mitochondrial COI, nuclear ribosomal RNA genes (18S and 28S) and nuclear protein coding genes (IDH, MDH, Rps5, EF1a and wingless) for 82 taxa. Data were analyzed using maximum likelihood (ML), parsimony (MP) and Bayesian (BP) phylogenetic frameworks. Phylogenetic analyses indicated that (i) genera Casmara, Tyrolimnas and Pseudodoxia did not belong to Oecophoridae, suggesting that Oecophoridae s. authors was not monophyletic; (ii) other oecophorids comprising two subfamilies, Pleurotinae and Oecophorinae, were nested within the same clade; and (iii) Martyringa, Acryptolechia and Periacmini were clustered with core Xyloryctidae. They appeared to be a sister lineage to core Oecophoridae. BayesTraits was used to reconstruct ancestral character states in order to infer historical microhabitat patterns and the sheltering strategy of larvae. Reconstruction of the ancestral microhabitat of oecophorids indicated that oecophorids might have evolved from dried plant feeders and further convergently specialized. The ancestral larval sheltering strategy of oecophorids might have been the use of a self-made silk tube, a shift from leaf mining. Copyright © 2016 Elsevier Inc. All rights reserved.

  1. SILVA tree viewer: interactive web browsing of the SILVA phylogenetic guide trees.

    Science.gov (United States)

    Beccati, Alan; Gerken, Jan; Quast, Christian; Yilmaz, Pelin; Glöckner, Frank Oliver

    2017-09-30

    Phylogenetic trees are an important tool to study the evolutionary relationships among organisms. The huge amount of available taxa poses difficulties in their interactive visualization. This hampers the interaction with the users to provide feedback for the further improvement of the taxonomic framework. The SILVA Tree Viewer is a web application designed for visualizing large phylogenetic trees without requiring the download of any software tool or data files. The SILVA Tree Viewer is based on Web Geographic Information Systems (Web-GIS) technology with a PostgreSQL backend. It enables zoom and pan functionalities similar to Google Maps. The SILVA Tree Viewer enables access to two phylogenetic (guide) trees provided by the SILVA database: the SSU Ref NR99 inferred from high-quality, full-length small subunit sequences, clustered at 99% sequence identity and the LSU Ref inferred from high-quality, full-length large subunit sequences. The Tree Viewer provides tree navigation, search and browse tools as well as an interactive feedback system to collect any kinds of requests ranging from taxonomy to data curation and improving the tool itself.

  2. Phylogenetic relationships in Peniocereus (Cactaceae) inferred from plastid DNA sequence data.

    Science.gov (United States)

    Arias, Salvador; Terrazas, Teresa; Arreola-Nava, Hilda J; Vázquez-Sánchez, Monserrat; Cameron, Kenneth M

    2005-10-01

    The phylogenetic relationships of Peniocereus (Cactaceae) species were studied using parsimony analyses of DNA sequence data. The plastid rpl16 and trnL-F regions were sequenced for 98 taxa including 17 species of Peniocereus, representatives from all genera of tribe Pachycereeae, four genera of tribe Hylocereeae, as well as from three additional outgroup genera of tribes Calymmantheae, Notocacteae, and Trichocereeae. Phylogenetic analyses support neither the monophyly of Peniocereus as currently circumscribed, nor the monophyly of tribe Pachycereeae since species of Peniocereus subgenus Pseudoacanthocereus are embedded within tribe Hylocereeae. Furthermore, these results show that the eight species of Peniocereus subgenus Peniocereus (Peniocereus sensu stricto) form a well-supported clade within subtribe Pachycereinae; P. serpentinus is also a member of this subtribe, but is sister to Bergerocactus. Moreover, Nyctocereus should be resurrected as a monotypic genus. Species of Peniocereus subgenus Pseudoacanthocereus are positioned among species of Acanthocereus within tribe Hylocereeae, indicating that they may be better classified within that genus. A number of morphological and anatomical characters, especially related to the presence or absence of dimorphic branches, are discussed to support these relationships.

  3. Phylogenetic framework for coevolutionary studies: a compass for exploring jungles of tangled trees.

    Science.gov (United States)

    Martínez-Aquino, Andrés

    2016-08-01

    Phylogenetics is used to detect past evolutionary events, from how species originated to how their ecological interactions with other species arose, which can mirror cophylogenetic patterns. Cophylogenetic reconstructions uncover past ecological relationships between taxa through inferred coevolutionary events on trees, for example, codivergence, duplication, host-switching, and loss. These events can be detected by cophylogenetic analyses based on nodes and the length and branching pattern of the phylogenetic trees of symbiotic associations, for example, host-parasite. In the past 2 decades, algorithms have been developed for cophylogenetic analyses and implemented in different software, for example, statistical congruence index and event-based methods. Based on the combination of these approaches, it is possible to integrate temporal information into cophylogenetic inference, such as estimates of lineage divergence times between 2 taxa, for example, hosts and parasites. Additionally, the advances in phylogenetic biogeography applying methods based on parametric process models and combined Bayesian approaches can be useful for interpreting coevolutionary histories in a scenario of biogeographical area connectivity through time. This article briefly reviews the basics of parasitology and provides an overview of software packages in cophylogenetic methods. Thus, the objective here is to present a phylogenetic framework for coevolutionary studies, with special emphasis on groups of parasitic organisms. Researchers wishing to undertake phylogeny-based coevolutionary studies can use this review as a "compass" when "walking" through jungles of tangled phylogenetic trees.

  4. REFGEN and TREENAMER: Automated Sequence Data Handling for Phylogenetic Analysis in the Genomic Era

    Directory of Open Access Journals (Sweden)

    Guy Leonard

    2009-01-01

    Full Text Available The phylogenetic analysis of nucleotide sequences and increasingly that of amino acid sequences is used to address a number of biological questions. Access to extensive datasets, including numerous genome projects, means that standard phylogenetic analyses can include many hundreds of sequences. Unfortunately, most phylogenetic analysis programs do not tolerate the sequence naming conventions of genome databases. Managing large numbers of sequences and standardizing sequence labels for use in phylogenetic analysis programs can be a time consuming and laborious task. Here we report the availability of an online resource for the management of gene sequences recovered from public access genome databases such as GenBank. These web utilities include the facility for renaming every sequence in a FASTA alignment file, with each sequence label derived from a user-defined combination of the species name and/or database accession number. This facility enables the user to keep track of the branching order of the sequences/taxa during multiple tree calculations and re-optimisations. Post phylogenetic analysis, these webpages can then be used to rename every label in the subsequent tree files (with a user-defined combination of species name and/or database accession number). Together these programs drastically reduce the time required for managing sequence alignments and labelling phylogenetic figures. Additional features of our platform include the automatic removal of identical accession numbers (recorded in the report file) and generation of species and accession number lists for use in supplementary materials or figure legends.

  5. Evolutionary relationships in Aspergillus section Fumigati inferred from partial beta-tubulin and hydrophobin sequences

    DEFF Research Database (Denmark)

    Geiser, D.M.; Frisvad, Jens Christian; Taylor, J.W.

    1998-01-01

    are heterothallic. Phylogenetic relationships were inferred among members of Aspergillus section Fumigati based on partial DNA sequences from the benA beta-tubulin and rodA hydrophobin genes. Aspergillus clavatus was chosen as an outgroup. The two gene regions provided nearly equal numbers of phylogenetically...... informative nucleotide characters. The rodA region possessed a considerably higher level of inferred amino acid variation than did the benA region. The results of a partition homogeneity test showed that the benA and rodA data sets were not in significant conflict, and the topologies of the most parsimonious...

  6. Phylogenetic relationships and divergence dates of softshell turtles (Testudines: Trionychidae) inferred from complete mitochondrial genomes.

    Science.gov (United States)

    Li, H; Liu, J; Xiong, L; Zhang, H; Zhou, H; Yin, H; Jing, W; Li, J; Shi, Q; Wang, Y; Liu, J; Nie, L

    2017-05-01

    The softshell turtles (Trionychidae) are one of the most widely distributed reptile groups in the world, and fossils have been found on all continents except Antarctica. The phylogenetic relationships among members of this group have been previously studied; however, disagreements regarding its taxonomy remain, and its phylogeography and divergence times are still poorly understood. Here, we present a comprehensive mitogenomic study of softshell turtles. We sequenced the complete mitochondrial genomes of 10 softshell turtles, which, together with the GenBank sequences of Dogania subplana, Lissemys punctata and Trionyx triunguis, cover all extant genera within Trionychidae except for Cyclanorbis and Cycloderma. These data were combined with other mitogenomes of turtles for phylogenetic analyses. Divergence time calibration and ancestral reconstruction were calculated using BEAST and RASP software, respectively. Our phylogenetic analyses indicate that Trionychidae is the sister taxon of Carettochelyidae, and support the monophyly of Trionychinae and Cyclanorbinae, which is consistent with morphological data and molecular analysis. Our phylogenetic analyses have established a sister taxon relationship between the Asian Rafetus and the Asian Palea + Pelodiscus + Dogania + Nilssonia + Amyda, whereas a previous study grouped the Asian Rafetus with the American Apalone. The results of divergence time estimates and area ancestral reconstruction show that extant Trionychidae originated in Asia around 108 million years ago (Ma), and radiations mainly occurred during two warm periods, namely Late Cretaceous-Early Eocene and Oligocene. By combining the estimated divergence time and the reconstructed ancestral area of softshell turtles, we determined that the dispersal of softshell turtles out of Asia may have taken three routes. Furthermore, the times of dispersal seem to be in agreement with the time of the India-Asia collision and opening of the Bering Strait, which

  7. Phylogenetic patterns in populations of Chilean species of the genus Orestias (Teleostei: Cyprinodontidae): results of mitochondrial DNA analysis.

    Science.gov (United States)

    Lüssen, Arne; Falk, Thomas M; Villwock, Wolfgang

    2003-10-01

    Patterns of molecular genetic differentiation among taxa of the "agassii species complex" (Parenti, 1984) were analysed based on partial mtDNA control region sequences. Special attention has been paid to Chilean populations of Orestias agassii and species from isolated lakes of northern Chile, e.g., O. agassii, Orestias chungarensis, Orestias parinacotensis, Orestias laucaensis, and Orestias ascotanensis. Orestias tschudii, Orestias luteus, and Orestias ispi were analysed comparatively. Our findings support the utility of mtDNA control region sequences for phylogenetic studies within the "agassii species complex" and confirmed the monophyly of this particular lineage, excluding O. luteus. However, the monophyly of further morphologically defined lineages within the "agassii complex" appears doubtful. No support was found for the utility of these data sets for inferring phylogenetic relationships between more distantly related taxa originating from Lake Titicaca.

  8. galaxieEST: addressing EST identity through automated phylogenetic analysis.

    Science.gov (United States)

    Nilsson, R Henrik; Rajashekar, Balaji; Larsson, Karl-Henrik; Ursing, Björn M

    2004-07-05

    Research involving expressed sequence tags (ESTs) is intricately coupled to the existence of large, well-annotated sequence repositories. Comparatively complete and satisfactory annotated public sequence libraries are, however, available only for a limited range of organisms, rendering the absence of sequences and gene structure information a tangible problem for those working with taxa lacking an EST or genome sequencing project. Paralogous genes belonging to the same gene family but distinguished by derived characteristics are particularly prone to misidentification and erroneous annotation; high but incomplete levels of sequence similarity are typically difficult to interpret and have formed the basis of many unsubstantiated assumptions of orthology. In these cases, a phylogenetic study of the query sequence together with the most similar sequences in the database may be of great value to the identification process. In order to facilitate this laborious procedure, a project to employ automated phylogenetic analysis in the identification of ESTs was initiated. galaxieEST is an open source Perl-CGI script package designed to complement traditional similarity-based identification of EST sequences through employment of automated phylogenetic analysis. It uses a series of BLAST runs as a sieve to retrieve nucleotide and protein sequences for inclusion in neighbour joining and parsimony analyses; the output includes the BLAST output, the results of the phylogenetic analyses, and the corresponding multiple alignments. galaxieEST is available as an on-line web service for identification of fungal ESTs and for download / local installation for use with any organism group at http://galaxie.cgb.ki.se/galaxieEST.html. By addressing sequence relatedness in addition to similarity, galaxieEST provides an integrative view on EST origin and identity, which may prove particularly useful in cases where similarity searches return one or more pertinent, but not full, matches and

  9. Extended molecular phylogenetics and revised systematics of Malagasy scincine lizards.

    Science.gov (United States)

    Erens, Jesse; Miralles, Aurélien; Glaw, Frank; Chatrou, Lars W; Vences, Miguel

    2017-02-01

    Among the endemic biota of Madagascar, skinks are a diverse radiation of lizards that exhibit a striking ecomorphological variation, and could provide an interesting system to study body-form evolution in squamate reptiles. We provide a new phylogenetic hypothesis for Malagasy skinks of the subfamily Scincinae based on an extended molecular dataset comprising 8060 bp from three mitochondrial and nine nuclear loci. Our analysis also increases taxon sampling of the genus Amphiglossus by including 16 out of 25 nominal species. Additionally, we examined whether the molecular phylogenetic patterns coincide with morphological differentiation in the species currently assigned to this genus. Various methods of inference recover a mostly strongly supported phylogeny with three main clades of Amphiglossus. However, relationships among these three clades and the limb-reduced genera Grandidierina, Voeltzkowia and Pygomeles remain uncertain. Supported by a variety of morphological differences (predominantly related to the degree of body elongation), but considering the remaining phylogenetic uncertainty, we propose a redefinition of Amphiglossus into three different genera (Amphiglossus sensu stricto, Flexiseps new genus, and Brachyseps new genus) to remove the non-monophyly of Amphiglossus sensu lato and to facilitate future studies on this fascinating group of lizards. Copyright © 2016 Elsevier Inc. All rights reserved.

  10. Phylogenetic classification of bony fishes.

    Science.gov (United States)

    Betancur-R, Ricardo; Wiley, Edward O; Arratia, Gloria; Acero, Arturo; Bailly, Nicolas; Miya, Masaki; Lecointre, Guillaume; Ortí, Guillermo

    2017-07-06

    Fish classifications, as those of most other taxonomic groups, are being transformed drastically as new molecular phylogenies provide support for natural groups that were unanticipated by previous studies. A brief review of the main criteria used by ichthyologists to define their classifications during the last 50 years, however, reveals slow progress towards using an explicit phylogenetic framework. Instead, the trend has been to rely, in varying degrees, on deep-rooted anatomical concepts and authority, often mixing taxa with explicit phylogenetic support with arbitrary groupings. Two leading sources in ichthyology frequently used for fish classifications (JS Nelson's volumes of Fishes of the World and W. Eschmeyer's Catalog of Fishes) fail to adopt a global phylogenetic framework despite much recent progress made towards the resolution of the fish Tree of Life. The first explicit phylogenetic classification of bony fishes was published in 2013, based on a comprehensive molecular phylogeny (www.deepfin.org). We here update the first version of that classification by incorporating the most recent phylogenetic results. The updated classification presented here is based on phylogenies inferred using molecular and genomic data for nearly 2000 fishes. A total of 72 orders (and 79 suborders) are recognized in this version, compared with 66 orders in version 1. The phylogeny resolves placement of 410 families, or ~80% of the total of 514 families of bony fishes currently recognized. The ordinal status of 30 percomorph families included in this study, however, remains uncertain (incertae sedis in the series Carangaria, Ovalentaria, or Eupercaria). Comments to support taxonomic decisions and comparisons with conflicting taxonomic groups proposed by others are presented. We also highlight cases where morphological support exists for the groups being classified. This version of the phylogenetic classification of bony fishes is substantially improved, providing resolution

  11. Phylogenetic relationships among populations of Pristurus rupestris Blanford, 1874 (Sauria: Sphaerodactylidae) in southern Iran

    OpenAIRE

    YOUSOFI, SUGOL; POUYANI, ESKANDAR RASTEGAR; HOJATI, VIDA

    2015-01-01

    We examined intraspecific relationships of the subspecies Pristurus rupestris iranicus from the northern Persian Gulf area (Hormozgan, Bushehr, and Sistan and Baluchestan provinces). Phylogenetic relationships among these samples were estimated based on the mitochondrial cytochrome b gene. We used three methods of phylogenetic tree reconstruction (maximum likelihood, maximum parsimony, and Bayesian inference). The sampled populations were divided into 5 clades but exhibit little genetic diver...

  12. Evolutionary history of tall fescue morphotypes inferred from molecular phylogenetics of the Lolium-Festuca species complex

    Directory of Open Access Journals (Sweden)

    Stewart Alan V

    2010-10-01

    Full Text Available Abstract Background The agriculturally important pasture grass tall fescue (Festuca arundinacea Schreb. syn. Lolium arundinaceum (Schreb.) Darbysh.) is an outbreeding allohexaploid that may be more accurately described as a species complex consisting of three major (Continental, Mediterranean and rhizomatous) morphotypes. Observation of hybrid infertility in some crossing combinations between morphotypes suggests the possibility of independent origins from different diploid progenitors. This study aims to clarify the evolutionary relationships between each tall fescue morphotype through phylogenetic analysis using two low-copy nuclear genes (encoding plastid acetyl-CoA carboxylase [Acc1] and centroradialis [CEN]), the nuclear ribosomal DNA internal transcribed spacer (rDNA ITS) and the chloroplast DNA (cpDNA) genome-located matK gene. Other taxa within the closely related Lolium-Festuca species complex were also included in the study, to increase understanding of evolutionary processes in a taxonomic group characterised by multiple inter-specific hybridisation events. Results Putative homoeologous sequences from both nuclear genes were obtained from each polyploid species and compared to counterparts from 15 diploid taxa. Phylogenetic reconstruction confirmed F. pratensis and F. arundinacea var. glaucescens as probable progenitors to Continental tall fescue, and these species are also likely to be ancestral to the rhizomatous morphotype. However, these two morphotypes are sufficiently distinct to be located in separate clades based on the ITS-derived data set. All four of the generated data sets suggest independent evolution of the Mediterranean and Continental morphotypes, with minimal affinity between cognate sequence haplotypes. No obvious candidate progenitor species for Mediterranean tall fescues were identified, and only two putative sub-genome-specific haplotypes were identified for this morphotype. Conclusions This study describes the first

  13. Fast and accurate phylogenetic reconstruction from high-resolution whole-genome data and a novel robustness estimator.

    Science.gov (United States)

    Lin, Y; Rajan, V; Moret, B M E

    2011-09-01

    The rapid accumulation of whole-genome data has renewed interest in the study of genomic rearrangements. Comparative genomics, evolutionary biology, and cancer research all require models and algorithms to elucidate the mechanisms, history, and consequences of these rearrangements. However, even simple models lead to NP-hard problems, particularly in the area of phylogenetic analysis. Current approaches are limited to small collections of genomes and low-resolution data (typically a few hundred syntenic blocks). Moreover, whereas phylogenetic analyses from sequence data are deemed incomplete unless bootstrapping scores (a measure of confidence) are given for each tree edge, no equivalent to bootstrapping exists for rearrangement-based phylogenetic analysis. We describe a fast and accurate algorithm for rearrangement analysis that scales up, in both time and accuracy, to modern high-resolution genomic data. We also describe a novel approach to estimate the robustness of results-an equivalent to the bootstrapping analysis used in sequence-based phylogenetic reconstruction. We present the results of extensive testing on both simulated and real data showing that our algorithm returns very accurate results, while scaling linearly with the size of the genomes and cubically with their number. We also present extensive experimental results showing that our approach to robustness testing provides excellent estimates of confidence, which, moreover, can be tuned to trade off thresholds between false positives and false negatives. Together, these two novel approaches enable us to attack heretofore intractable problems, such as phylogenetic inference for high-resolution vertebrate genomes, as we demonstrate on a set of six vertebrate genomes with 8,380 syntenic blocks. A copy of the software is available on demand.

  14. Prokaryotic Phylogenies Inferred from Whole-Genome Sequence and Annotation Data

    Directory of Open Access Journals (Sweden)

    Wei Du

    2013-01-01

    Full Text Available Phylogenetic trees are used to represent the evolutionary relationship among various groups of species. In this paper, a novel method for inferring prokaryotic phylogenies using multiple genomic information is proposed. The method is called CGCPhy and based on the distance matrix of orthologous gene clusters between whole-genome pairs. CGCPhy comprises four main steps. First, orthologous genes are determined by sequence similarity, genomic function, and genomic structure information. Second, genes involving potential HGT events are eliminated, since such genes are considered to be the highly conserved genes across different species and the genes located on fragments with abnormal genome barcode. Third, we calculate the distance of the orthologous gene clusters between each genome pair in terms of the number of orthologous genes in conserved clusters. Finally, the neighbor-joining method is employed to construct phylogenetic trees across different species. CGCPhy has been examined on different datasets from 617 complete single-chromosome prokaryotic genomes and achieved applicative accuracies on different species sets in agreement with Bergey's taxonomy in quartet topologies. Simulation results show that CGCPhy achieves high average accuracy and has a low standard deviation on different datasets, so it has an applicative potential for phylogenetic analysis.
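
    The final step described above, building a tree from a pairwise distance matrix by neighbor-joining, can be sketched as follows. This is a minimal textbook version of the neighbor-joining procedure, not CGCPhy's own code; the function name `neighbor_joining`, the internal node labels (`node0`, `node1`, ...) and the five-taxon toy matrix are illustrative assumptions and do not come from the paper.

```python
from itertools import combinations


def neighbor_joining(dist):
    """Build an unrooted tree from a symmetric distance table.

    dist maps each taxon to a dict {other_taxon: distance} (no self entries).
    Returns a list of (node_a, node_b, branch_length) edges.
    """
    d = {a: dict(row) for a, row in dist.items()}  # work on a copy
    edges = []
    next_id = 0
    while len(d) > 2:
        n = len(d)
        total = {a: sum(d[a].values()) for a in d}
        # Join the pair that minimises Q(a, b) = (n - 2) d(a, b) - r(a) - r(b).
        a, b = min(
            combinations(d, 2),
            key=lambda p: (n - 2) * d[p[0]][p[1]] - total[p[0]] - total[p[1]],
        )
        new = f"node{next_id}"  # hypothetical label for the new internal node
        next_id += 1
        len_a = 0.5 * d[a][b] + (total[a] - total[b]) / (2 * (n - 2))
        len_b = d[a][b] - len_a
        edges += [(a, new, len_a), (b, new, len_b)]
        # Distances from the new node to every remaining node.
        row = {c: 0.5 * (d[a][c] + d[b][c] - d[a][b]) for c in d if c not in (a, b)}
        del d[a], d[b]
        for c in d:
            del d[c][a], d[c][b]
            d[c][new] = row[c]
        d[new] = row
    a, b = d  # two nodes remain; connect them directly
    edges.append((a, b, d[a][b]))
    return edges


# Toy additive distance matrix for five taxa (made-up values).
toy = {
    "a": {"b": 5, "c": 9, "d": 9, "e": 8},
    "b": {"a": 5, "c": 10, "d": 10, "e": 9},
    "c": {"a": 9, "b": 10, "d": 8, "e": 7},
    "d": {"a": 9, "b": 10, "c": 8, "e": 3},
    "e": {"a": 8, "b": 9, "c": 7, "d": 3},
}
for edge in neighbor_joining(toy):
    print(edge)
```

    In a CGCPhy-like pipeline the input table would hold the orthologous-gene-cluster distances between genome pairs rather than these toy values.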

  15. Comprehensive untargeted metabolomics of Lychnnophorinae subtribe (Asteraceae: Vernonieae) in a phylogenetic context.

    Science.gov (United States)

    Martucci, Maria Elvira Poleti; Loeuille, Benoit; Pirani, José Rubens; Gobbo-Neto, Leonardo

    2018-01-01

    Members of the subtribe Lychnophorinae occur mostly within the Cerrado domain of the Brazilian Central Plateau. The relationships between its 11 genera, as well as between Lychnophorinae and other subtribes belonging to the tribe Vernonieae, have recently been investigated upon a phylogeny based on molecular and morphological data. We report the use of a comprehensive untargeted metabolomics approach, combining HPLC-MS and GC-MS data, followed by multivariate analyses aiming to assess the congruence between metabolomics data and the phylogenetic hypothesis, as well as its potential as a chemotaxonomic tool. We analyzed 78 species by UHPLC-MS and GC-MS in both positive and negative ionization modes. The metabolic profiles obtained for these species were treated in MetAlign and in MSClust and the matrices generated were used in SIMCA for hierarchical cluster analyses, principal component analyses and orthogonal partial least square discriminant analysis. The results showed that metabolomic analyses are mostly congruent with the phylogenetic hypothesis especially at lower taxonomic levels (Lychnophora or Eremanthus). Our results confirm that data generated using metabolomics provide evidence for chemotaxonomical studies, especially for phylogenetic inference of the Lychnophorinae subtribe and insight into the evolution of the secondary metabolites of this group.

  16. Bayesian models for comparative analysis integrating phylogenetic uncertainty

    Directory of Open Access Journals (Sweden)

    Villemereuil Pierre de

    2012-06-01

    Full Text Available Abstract Background Uncertainty in comparative analyses can come from at least two sources: a) phylogenetic uncertainty in the tree topology or branch lengths, and b) uncertainty due to intraspecific variation in trait values, either due to measurement error or natural individual variation. Most phylogenetic comparative methods do not account for such uncertainties. Not accounting for these sources of uncertainty leads to false perceptions of precision (confidence intervals will be too narrow) and inflated significance in hypothesis testing (e.g. p-values will be too small). Although there is some application-specific software for fitting Bayesian models accounting for phylogenetic error, more general and flexible software is desirable. Methods We developed models to directly incorporate phylogenetic uncertainty into a range of analyses that biologists commonly perform, using a Bayesian framework and Markov Chain Monte Carlo analyses. Results We demonstrate applications in linear regression, quantification of phylogenetic signal, and measurement error models. Phylogenetic uncertainty was incorporated by applying a prior distribution for the phylogeny, where this distribution consisted of the posterior tree sets from Bayesian phylogenetic tree estimation programs. The models were analysed using simulated data sets, and applied to a real data set on plant traits, from rainforest plant species in Northern Australia. Analyses were performed using the free and open source software OpenBUGS and JAGS. Conclusions Incorporating phylogenetic uncertainty through an empirical prior distribution of trees leads to more precise estimation of regression model parameters than using a single consensus tree and enables a more realistic estimation of confidence intervals. In addition, models incorporating measurement errors and/or individual variation, in one or both variables, are easily formulated in the Bayesian framework. We show that BUGS is a useful, flexible

  17. Bayesian models for comparative analysis integrating phylogenetic uncertainty

    Science.gov (United States)

    2012-01-01

    Background Uncertainty in comparative analyses can come from at least two sources: a) phylogenetic uncertainty in the tree topology or branch lengths, and b) uncertainty due to intraspecific variation in trait values, either due to measurement error or natural individual variation. Most phylogenetic comparative methods do not account for such uncertainties. Not accounting for these sources of uncertainty leads to false perceptions of precision (confidence intervals will be too narrow) and inflated significance in hypothesis testing (e.g. p-values will be too small). Although there is some application-specific software for fitting Bayesian models accounting for phylogenetic error, more general and flexible software is desirable. Methods We developed models to directly incorporate phylogenetic uncertainty into a range of analyses that biologists commonly perform, using a Bayesian framework and Markov Chain Monte Carlo analyses. Results We demonstrate applications in linear regression, quantification of phylogenetic signal, and measurement error models. Phylogenetic uncertainty was incorporated by applying a prior distribution for the phylogeny, where this distribution consisted of the posterior tree sets from Bayesian phylogenetic tree estimation programs. The models were analysed using simulated data sets, and applied to a real data set on plant traits, from rainforest plant species in Northern Australia. Analyses were performed using the free and open source software OpenBUGS and JAGS. Conclusions Incorporating phylogenetic uncertainty through an empirical prior distribution of trees leads to more precise estimation of regression model parameters than using a single consensus tree and enables a more realistic estimation of confidence intervals. In addition, models incorporating measurement errors and/or individual variation, in one or both variables, are easily formulated in the Bayesian framework. We show that BUGS is a useful, flexible general purpose tool for

  18. Including RNA secondary structures improves accuracy and robustness in reconstruction of phylogenetic trees.

    Science.gov (United States)

    Keller, Alexander; Förster, Frank; Müller, Tobias; Dandekar, Thomas; Schultz, Jörg; Wolf, Matthias

    2010-01-15

    In several studies, secondary structures of ribosomal genes have been used to improve the quality of phylogenetic reconstructions. An extensive evaluation of the benefits of secondary structure, however, is lacking. This is the first study to counter this deficiency. We inspected the accuracy and robustness of phylogenetics with individual secondary structures by simulation experiments for artificial tree topologies with up to 18 taxa and for divergency levels in the range of typical phylogenetic studies. We chose the internal transcribed spacer 2 of the ribosomal cistron as an exemplary marker region. Simulation integrated the coevolution process of sequences with secondary structures. Additionally, the phylogenetic power of marker size duplication was investigated and compared with sequence and sequence-structure reconstruction methods. The results clearly show that accuracy and robustness of Neighbor Joining trees are largely improved by structural information in contrast to sequence-only data, whereas a doubled marker size only accounts for robustness. Individual secondary structures of ribosomal RNA sequences provide a valuable gain of information content that is useful for phylogenetics. Thus, the usage of ITS2 sequence together with secondary structure for taxonomic inferences is recommended. Other reconstruction methods such as maximum likelihood, Bayesian inference or maximum parsimony may equally profit from secondary structure inclusion. This article was reviewed by Shamil Sunyaev, Andrea Tanzer (nominated by Frank Eisenhaber) and Eugene V. Koonin. For the full reviews, please go to the Reviewers' comments section.

  19. Phylogenetic analysis of anemone fishes of the Persian Gulf using ...

    African Journals Online (AJOL)

    2008-06-17

    Jun 17, 2008 ... genetic diversity among samples was investigated by phylogenetic analysis. Results show that there is ... more about the living organisms found in this region. Many marine ... Kish (modified from Pous et al., 2004). Table 2.

  20. Molecular Phylogenetics: Concepts for a Newcomer.

    Science.gov (United States)

    Ajawatanawong, Pravech

    Molecular phylogenetics is the study of evolutionary relationships among organisms using molecular sequence data. The aim of this review is to introduce the important terminology and general concepts of tree reconstruction to biologists who lack a strong background in the field of molecular evolution. Some modern phylogenetic programs are easy to use because of their user-friendly interfaces, but understanding the phylogenetic algorithms and substitution models, which are based on advanced statistics, is still important for the analysis and interpretation without a guide. Briefly, there are five general steps in carrying out a phylogenetic analysis: (1) sequence data preparation, (2) sequence alignment, (3) choosing a phylogenetic reconstruction method, (4) identification of the best tree, and (5) evaluating the tree. Concepts in this review enable biologists to grasp the basic ideas behind phylogenetic analysis and also help provide a sound basis for discussions with expert phylogeneticists.

  1. Niche conservatism and dispersal limitation cause large-scale phylogenetic structure in the New World palm flora

    DEFF Research Database (Denmark)

    Eiserhardt, Wolf L.; Svenning, J.-C.; Baker, William J.

    similarity decays after speciation depends on the rates of niche evolution and dispersal. If dispersal is slow compared to the tempo of lineage diversification, distributions change little during clade diversification. Phylogenetic niche conservatism precludes distributional shifts in environmental space......, and to the degree that distributions are limited by the niche, also in geographic space. Using phylogenetic turnover methods, we simultaneously analysed the distributions of all New World palms (n=547) and inferred to which degree phylogenetic niche conservatism and dispersal limitation, respectively, caused...

  2. Causal inference in survival analysis using pseudo-observations.

    Science.gov (United States)

    Andersen, Per K; Syriopoulou, Elisavet; Parner, Erik T

    2017-07-30

    Causal inference for non-censored response variables, such as binary or quantitative outcomes, is often based on either (1) direct standardization ('G-formula') or (2) inverse probability of treatment assignment weights ('propensity score'). To do causal inference in survival analysis, one needs to address right-censoring, and often, special techniques are required for that purpose. We will show how censoring can be dealt with 'once and for all' by means of so-called pseudo-observations when doing causal inference in survival analysis. The pseudo-observations can be used as a replacement of the outcomes without censoring when applying 'standard' causal inference methods, such as (1) or (2) earlier. We study this idea for estimating the average causal effect of a binary treatment on the survival probability, the restricted mean lifetime, and the cumulative incidence in a competing risks situation. The methods will be illustrated in a small simulation study and via a study of patients with acute myeloid leukemia who received either myeloablative or non-myeloablative conditioning before allogeneic hematopoietic cell transplantation. We will estimate the average causal effect of the conditioning regime on outcomes such as the 3-year overall survival probability and the 3-year risk of chronic graft-versus-host disease. Copyright © 2017 John Wiley & Sons, Ltd.
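
    To make the idea of pseudo-observations concrete, the sketch below computes them for the survival probability at a fixed time point. It assumes the usual jackknife definition, n times the full-sample estimate minus (n - 1) times the leave-one-out estimate, with a Kaplan-Meier estimator underneath; the abstract itself does not spell out the formula, and the function names `km_survival` and `pseudo_observations` are hypothetical.

```python
from typing import List, Sequence


def km_survival(times: Sequence[float], events: Sequence[int], t: float) -> float:
    """Kaplan-Meier estimate of S(t) = P(T > t).

    events[i] is 1 if times[i] is an observed event and 0 if it is censored.
    """
    surv = 1.0
    # Multiply the conditional survival factors over distinct event times <= t.
    for u in sorted({ti for ti, ei in zip(times, events) if ei == 1 and ti <= t}):
        at_risk = sum(1 for ti in times if ti >= u)
        died = sum(1 for ti, ei in zip(times, events) if ti == u and ei == 1)
        surv *= 1.0 - died / at_risk
    return surv


def pseudo_observations(times: Sequence[float], events: Sequence[int], t: float) -> List[float]:
    """Jackknife pseudo-observations for S(t): n * S_hat - (n - 1) * S_hat_without_i."""
    n = len(times)
    full = km_survival(times, events, t)
    result = []
    for i in range(n):
        loo_times = [x for j, x in enumerate(times) if j != i]
        loo_events = [x for j, x in enumerate(events) if j != i]
        result.append(n * full - (n - 1) * km_survival(loo_times, loo_events, t))
    return result


# Made-up follow-up times; 0 marks a censored subject.
print(pseudo_observations([2, 4, 5, 6, 8], [1, 1, 0, 1, 1], t=5))
```

    Without censoring, each pseudo-observation reduces to the indicator that the subject survived past t, which is why the values can stand in for the uncensored outcome in a standard regression or weighting analysis.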

  3. Genome BLAST distance phylogenies inferred from whole plastid and whole mitochondrion genome sequences

    Directory of Open Access Journals (Sweden)

    Holland Barbara R

    2006-07-01

    Full Text Available Abstract Background Phylogenetic methods which do not rely on multiple sequence alignments are important tools in inferring trees directly from completely sequenced genomes. Here, we extend the recently described Genome BLAST Distance Phylogeny (GBDP) strategy to compute phylogenetic trees from all completely sequenced plastid genomes currently available and from a selection of mitochondrial genomes representing the major eukaryotic lineages. BLASTN, TBLASTX, or combinations of both are used to locate high-scoring segment pairs (HSPs) between two sequences from which pairwise similarities and distances are computed in different ways resulting in a total of 96 GBDP variants. The suitability of these distance formulae for phylogeny reconstruction is directly estimated by computing a recently described measure of "treelikeness", the so-called δ value, from the respective distance matrices. Additionally, we compare the trees inferred from these matrices using UPGMA, NJ, BIONJ, FastME, or STC, respectively, with the NCBI taxonomy tree of the taxa under study. Results Our results indicate that, at this taxonomic level, plastid genomes are much more valuable for inferring phylogenies than are mitochondrial genomes, and that distances based on breakpoints are of little use. Distances based on the proportion of "matched" HSP length to average genome length were best for tree estimation. Additionally we found that using TBLASTX instead of BLASTN and, particularly, combining TBLASTX and BLASTN leads to a small but significant increase in accuracy. Other factors do not significantly affect the phylogenetic outcome. The BIONJ algorithm results in phylogenies most in accordance with the current NCBI taxonomy, with NJ and FastME performing insignificantly worse, and STC performing as well if applied to high quality distance matrices. δ values are found to be a reliable predictor of phylogenetic accuracy. Conclusion Using the most treelike distance matrices, as

  4. What is the phylogenetic signal limit from mitogenomes? The reconciliation between mitochondrial and nuclear data in the Insecta class phylogeny

    Directory of Open Access Journals (Sweden)

    Talavera Gerard

    2011-10-01

    limits of the phylogenetic signal that can be extracted from Insecta mitogenomes. Based on the combined use of the five best topology-performing genes we obtained comparable results to whole mitogenomes, highlighting the important role of data quality. Conclusion We show for the first time that mitogenomic data agrees with nuclear and morphological data for several of the most controversial insect evolutionary relationships, adding a new independent source of evidence to study relationships among insect orders. We propose that deeper divergences cannot be inferred with the currently available methods due to sequence saturation and compositional bias inconsistencies. Our exploratory analysis indicates that the CAT model is the best at dealing with LBA and it could be useful for other groups and datasets with similar phylogenetic difficulties.

  5. Genome-wide Studies of Mycolic Acid Bacteria: Computational Identification and Analysis of a Minimal Genome

    KAUST Repository

    Kamanu, Frederick Kinyua

    2012-01-01

    to monophyletic (16S small ribosomal subunit) in delineating a total of 52 mycolic acid bacterial species. Phylogenetic inference was performed using the neighbor-joining method. To further refine phylogenetic analysis and to take advantage of the widespread

  6. Detecting Network Communities: An Application to Phylogenetic Analysis

    Science.gov (United States)

    Andrade, Roberto F. S.; Rocha-Neto, Ivan C.; Santos, Leonardo B. L.; de Santana, Charles N.; Diniz, Marcelo V. C.; Lobão, Thierry Petit; Goés-Neto, Aristóteles; Pinho, Suani T. R.; El-Hani, Charbel N.

    2011-01-01

    This paper proposes a new method to identify communities in generally weighted complex networks and applies it to phylogenetic analysis. In this case, weights correspond to the similarity indexes among protein sequences, which can be used for network construction so that the network structure can be analyzed to recover phylogenetically useful information from its properties. The analyses discussed here are mainly based on the modular character of protein similarity networks, explored through the Newman-Girvan algorithm, with the help of the neighborhood matrix. The most relevant networks are found when the network topology changes abruptly revealing distinct modules related to the sets of organisms to which the proteins belong. Sound biological information can be retrieved by the computational routines used in the network approach, without using biological assumptions other than those incorporated by BLAST. Usually, all the main bacterial phyla and, in some cases, also some bacterial classes corresponded totally (100%) or to a great extent (>70%) to the modules. We checked for internal consistency in the obtained results, and we scored close to 84% of matches for community pertinence when comparisons between the results were performed. To illustrate how to use the network-based method, we employed data for enzymes involved in the chitin metabolic pathway that are present in more than 100 organisms from an original data set containing 1,695 organisms, downloaded from GenBank on May 19, 2007. A preliminary comparison between the outcomes of the network-based method and the results of methods based on Bayesian, distance, likelihood, and parsimony criteria suggests that the former is as reliable as these commonly used methods. We conclude that the network-based method can be used as a powerful tool for retrieving modularity information from weighted networks, which is useful for phylogenetic analysis. PMID:21573202

  7. Genome-wide comparative analysis of phylogenetic trees: the prokaryotic forest of life.

    Science.gov (United States)

    Puigbò, Pere; Wolf, Yuri I; Koonin, Eugene V

    2012-01-01

    Genome-wide comparison of phylogenetic trees is becoming an increasingly common approach in evolutionary genomics, and a variety of approaches for such comparison have been developed. In this article, we present several methods for comparative analysis of large numbers of phylogenetic trees. To compare phylogenetic trees taking into account the bootstrap support for each internal branch, the Boot-Split Distance (BSD) method is introduced as an extension of the previously developed Split Distance method for tree comparison. The BSD method implements the straightforward idea that comparison of phylogenetic trees can be made more robust by treating tree splits differentially depending on the bootstrap support. Approaches are also introduced for detecting tree-like and net-like evolutionary trends in the phylogenetic Forest of Life (FOL), i.e., the entirety of the phylogenetic trees for conserved genes of prokaryotes. The principal method employed for this purpose includes mapping quartets of species onto trees to calculate the support of each quartet topology and so to quantify the tree and net contributions to the distances between species. We describe the application of these methods to analyze the FOL and the results obtained with these methods. These results support the concept of the Tree of Life (TOL) as a central evolutionary trend in the FOL as opposed to the traditional view of the TOL as a "species tree."
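
    As a rough illustration of the split-based comparison that BSD extends, the sketch below extracts the non-trivial splits (bipartitions of the leaf set) from two trees over the same taxa and reports the fraction of splits that are not shared. The normalisation used here (unshared splits divided by the total number of splits in both trees) is one common choice and is an assumption; the abstract does not give the exact Split Distance or BSD formula, and the bootstrap weighting of BSD is deliberately left out. Trees are written as nested tuples, and all function names are hypothetical.

```python
from typing import FrozenSet, Set, Tuple, Union

Tree = Union[str, Tuple["Tree", ...]]  # a leaf name or a tuple of subtrees


def leaf_set(tree: Tree) -> FrozenSet[str]:
    """Collect all leaf names under a (sub)tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def splits(tree: Tree) -> Set[FrozenSet[str]]:
    """Return the non-trivial bipartitions of the tree, each stored as one canonical side."""
    everything = leaf_set(tree)
    found: Set[FrozenSet[str]] = set()

    def walk(node: Tree) -> FrozenSet[str]:
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(walk(child) for child in node))
        # Skip splits that separate a single leaf (or nothing) from the rest.
        if 1 < len(below) < len(everything) - 1:
            # Store the side whose sorted leaf list comes first, so that both
            # orientations of the same bipartition compare equal.
            found.add(min(below, everything - below, key=sorted))
        return below

    walk(tree)
    return found


def split_distance(t1: Tree, t2: Tree) -> float:
    """Fraction of splits present in only one of the two trees (0 = same topology)."""
    s1, s2 = splits(t1), splits(t2)
    total = len(s1) + len(s2)
    return len(s1 ^ s2) / total if total else 0.0


tree_one = (("A", "B"), ("C", ("D", "E")))
tree_two = (("A", "C"), ("B", ("D", "E")))
print(split_distance(tree_one, tree_two))
```

    A bootstrap-aware variant in the spirit of BSD would weight each split by its support instead of counting every split equally.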

  8. Intraspecific differentiation of Paramecium novaurelia strains (Ciliophora, Protozoa) inferred from phylogenetic analysis of ribosomal and mitochondrial DNA variation.

    Science.gov (United States)

    Tarcz, Sebastian

    2013-01-01

    Paramecium novaurelia Beale and Schneller, 1954, was first found in Scotland and is known to occur mainly in Europe, where it is the most common species of the P. aurelia complex. In recent years, two non-European localities have been described: Turkey and the United States of America. This article presents the analysis of intraspecific variability among 25 strains of P. novaurelia with the application of ribosomal and mitochondrial loci (ITS1-5.8S-ITS2, 5' large subunit rDNA (5'LSU rDNA) and cytochrome c oxidase subunit 1 (COI) mtDNA). The mean distance observed for all of the studied P. novaurelia sequence pairs was p=0.008/0.016/0.092 (ITS1-5.8S-ITS2/5'LSU rDNA/COI). Phylogenetic trees (NJ/MP/BI) based on a comparison of all of the analysed sequences show that the studied strains of P. novaurelia form a distinct clade, separate from the P. caudatum outgroup, and are divided into two clusters (A and B) and two branches (C and D). The occurrence of substantial genetic differentiation within P. novaurelia, confirmed by the analysed DNA fragments, indicates a rapid evolution of particular species within the Paramecium genus. Copyright © 2012 Elsevier GmbH. All rights reserved.

  9. Assessment of phylogenetic sensitivity for reconstructing HIV-1 epidemiological relationships.

    Science.gov (United States)

    Beloukas, Apostolos; Magiorkinis, Emmanouil; Magiorkinis, Gkikas; Zavitsanou, Asimina; Karamitros, Timokratis; Hatzakis, Angelos; Paraskevis, Dimitrios

    2012-06-01

    Phylogenetic analysis has been extensively used as a tool for the reconstruction of epidemiological relations for research or for forensic purposes. It was our objective to assess the sensitivity of different phylogenetic methods and various phylogenetic programs to reconstruct epidemiological links among HIV-1 infected patients, that is, the probability to reveal a true transmission relationship. Multiple datasets (90) were prepared consisting of HIV-1 sequences in protease (PR) and partial reverse transcriptase (RT) sampled from patients with documented epidemiological relationship (target population), and from unrelated individuals (control population) belonging to the same HIV-1 subtype as the target population. Each dataset varied regarding the number, the geographic origin and the transmission risk groups of the sequences among the control population. Phylogenetic trees were inferred by neighbor-joining (NJ), maximum likelihood heuristics (hML) and Bayesian methods. All clusters of sequences belonging to the target population were correctly reconstructed by NJ and Bayesian methods receiving high bootstrap and posterior probability (PP) support, respectively. On the other hand, TreePuzzle failed to reconstruct or provide significant support for several clusters; high puzzling step support was associated with the inclusion of control sequences from the same geographic area as the target population. In contrast, all clusters were correctly reconstructed by hML as implemented in PhyML 3.0 receiving high bootstrap support. We report that under the conditions of our study, hML using PhyML, NJ and Bayesian methods were the most sensitive for the reconstruction of epidemiological links mostly from sexually infected individuals. Copyright © 2012 Elsevier B.V. All rights reserved.

  10. Phylogenetic analysis of the genus Hordeum using repetitive DNA sequences

    DEFF Research Database (Denmark)

    Svitashev, S.; Bryngelsson, T.; Vershinin, A.

    1994-01-01

    A set of six cloned barley (Hordeum vulgare) repetitive DNA sequences was used for the analysis of phylogenetic relationships among 31 species (46 taxa) of the genus Hordeum, using molecular hybridization techniques. In situ hybridization experiments showed dispersed organization of the sequences...

  11. Efficient Detection of Repeating Sites to Accelerate Phylogenetic Likelihood Calculations.

    Science.gov (United States)

    Kobert, K; Stamatakis, A; Flouri, T

    2017-03-01

    The phylogenetic likelihood function (PLF) is the major computational bottleneck in several applications of evolutionary biology such as phylogenetic inference, species delimitation, model selection, and divergence times estimation. Given the alignment, a tree and the evolutionary model parameters, the likelihood function computes the conditional likelihood vectors for every node of the tree. Vector entries for which all input data are identical result in redundant likelihood operations which, in turn, yield identical conditional values. Such operations can be omitted for improving run-time and, using appropriate data structures, reducing memory usage. We present a fast, novel method for identifying and omitting such redundant operations in phylogenetic likelihood calculations, and assess the performance improvement and memory savings attained by our method. Using empirical and simulated data sets, we show that a prototype implementation of our method yields up to 12-fold speedups and uses up to 78% less memory than one of the fastest and most highly tuned implementations of the PLF currently available. Our method is generic and can seamlessly be integrated into any phylogenetic likelihood implementation. [Algorithms; maximum likelihood; phylogenetic likelihood function; phylogenetics]. © The Author(s) 2016. Published by Oxford University Press, on behalf of the Society of Systematic Biologists.
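
    The idea of redundant likelihood operations can be illustrated with a small sketch. It is not the authors' implementation: it assumes a rooted binary tree written as nested tuples, and the names `repeat_classes` and `_relabel` are hypothetical. At each node, two sites receive the same class identifier when the data below that node are identical, so only one conditional likelihood vector entry per class would need to be computed.

```python
from typing import Dict, Hashable, List

def _relabel(keys: List[Hashable]) -> List[int]:
    """Map each distinct key to a small integer, in order of first appearance."""
    seen: Dict[Hashable, int] = {}
    return [seen.setdefault(k, len(seen)) for k in keys]

def repeat_classes(node, alignment: Dict[str, str], out: dict) -> List[int]:
    """Assign a repeat class to every site at every node of the tree.

    node      : a tip name (str) or a pair (left_subtree, right_subtree)
    alignment : tip name -> aligned sequence
    out       : filled with node -> list of class ids, one per site
    """
    if isinstance(node, str):
        # At a tip, the class of a site is simply its character state.
        ids = _relabel(list(alignment[node]))
    else:
        left = repeat_classes(node[0], alignment, out)
        right = repeat_classes(node[1], alignment, out)
        # Two sites repeat at this node iff they repeat in both children.
        ids = _relabel(list(zip(left, right)))
    out[node] = ids
    return ids

if __name__ == "__main__":
    aln = {"A": "ACAC", "B": "AGAG", "C": "TTTA"}
    tree = (("A", "B"), "C")
    classes = {}
    repeat_classes(tree, aln, classes)
    for node, ids in classes.items():
        print(node, ids, "distinct:", len(set(ids)), "of", len(ids))
```

    In this toy alignment the subtree (A, B) has only two distinct site classes among four sites, so half of its per-site likelihood operations would be repeats.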

  12. Short Tree, Long Tree, Right Tree, Wrong Tree: New Acquisition Bias Corrections for Inferring SNP Phylogenies.

    Science.gov (United States)

    Leaché, Adam D; Banbury, Barbara L; Felsenstein, Joseph; de Oca, Adrián Nieto-Montes; Stamatakis, Alexandros

    2015-11-01

    Single nucleotide polymorphisms (SNPs) are useful markers for phylogenetic studies owing in part to their ubiquity throughout the genome and ease of collection. Restriction site associated DNA sequencing (RADseq) methods are becoming increasingly popular for SNP data collection, but an assessment of the best practises for using these data in phylogenetics is lacking. We use computer simulations, and new double digest RADseq (ddRADseq) data for the lizard family Phrynosomatidae, to investigate the accuracy of RAD loci for phylogenetic inference. We compare the two primary ways RAD loci are used during phylogenetic analysis, including the analysis of full sequences (i.e., SNPs together with invariant sites), or the analysis of SNPs on their own after excluding invariant sites. We find that using full sequences rather than just SNPs is preferable from the perspectives of branch length and topological accuracy, but not of computational time. We introduce two new acquisition bias corrections for dealing with alignments composed exclusively of SNPs, a conditional likelihood method and a reconstituted DNA approach. The conditional likelihood method conditions on the presence of variable characters only (the number of invariant sites that are unsampled but known to exist is not considered), while the reconstituted DNA approach requires the user to specify the exact number of unsampled invariant sites prior to the analysis. Under simulation, branch length biases increase with the amount of missing data for both acquisition bias correction methods, but branch length accuracy is much improved in the reconstituted DNA approach compared to the conditional likelihood approach. 
Phylogenetic analyses of the empirical data using concatenation or a coalescent-based species tree approach provide strong support for many of the accepted relationships among phrynosomatid lizards, suggesting that RAD loci contain useful phylogenetic signal across a range of divergence times despite the
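
    The two corrections described above can be written out in their commonly used form. This is a sketch under assumptions, not a transcription of the paper: the notation ($D_i$ for the $i$-th SNP column, $c_x$ for the constant column in which every tip has state $x$, $w_x$ for the user-specified number of unsampled invariant sites of state $x$, $T$ for the tree and $\theta$ for the model parameters) is introduced here for illustration.

```latex
% Conditional likelihood: condition each SNP column on being variable.
\[
  L_{\mathrm{cond}}(D_i \mid T, \theta)
  = \frac{L(D_i \mid T, \theta)}
         {1 - \sum_{x \in \{A,C,G,T\}} L(c_x \mid T, \theta)}
\]

% Reconstituted DNA: add back the known number of invariant sites.
\[
  \log L_{\mathrm{recon}}
  = \sum_{i=1}^{n} \log L(D_i \mid T, \theta)
  + \sum_{x \in \{A,C,G,T\}} w_x \, \log L(c_x \mid T, \theta)
\]
```

    The first form needs no information about the unsampled sites, whereas the second requires the counts $w_x$ to be supplied before the analysis, which matches the distinction drawn in the abstract.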

  13. Phylogenetic relationships of German heavy draught horse breeds inferred from mitochondrial DNA D-loop variation.

    Science.gov (United States)

    Aberle, K S; Hamann, H; Drögemüller, C; Distl, O

    2007-04-01

    We analysed a 610-bp mitochondrial (mt)DNA D-loop fragment in a sample of German draught horse breeds and compared the polymorphic sites with sequences from Arabian, Hanoverian, Exmoor, Icelandic, Sorraia and Przewalski's Horses as well as with Suffolk, Shire and Belgian horses. In a total of 65 horses, 70 polymorphic sites representing 47 haplotypes were observed. The average percentage of polymorphic sites was 11.5% for the mtDNA fragment analysed. In the nine different draught horse breeds including South German, Mecklenburg, Saxon Thuringa coldblood, Rhenisch German, Schleswig Draught Horse, Black Forest Horse, Shire, Suffolk and Belgian, 61 polymorphic sites and 24 haplotypes were found. The phylogenetic analysis failed to show monophyletic groups for the draught horses. The analysis indicated that the draught horse populations investigated consist of diverse genetic groups with respect to their maternal lineage.

  14. Including RNA secondary structures improves accuracy and robustness in reconstruction of phylogenetic trees

    Directory of Open Access Journals (Sweden)

    Dandekar Thomas

    2010-01-01

    Full Text Available Abstract Background: In several studies, secondary structures of ribosomal genes have been used to improve the quality of phylogenetic reconstructions. An extensive evaluation of the benefits of secondary structure, however, is lacking. Results: This is the first study to counter this deficiency. We inspected the accuracy and robustness of phylogenetics with individual secondary structures by simulation experiments for artificial tree topologies with up to 18 taxa and for divergency levels in the range of typical phylogenetic studies. We chose the internal transcribed spacer 2 of the ribosomal cistron as an exemplary marker region. Simulation integrated the coevolution process of sequences with secondary structures. Additionally, the phylogenetic power of marker size duplication was investigated and compared with sequence and sequence-structure reconstruction methods. The results clearly show that accuracy and robustness of Neighbor Joining trees are largely improved by structural information in contrast to sequence only data, whereas a doubled marker size only accounts for robustness. Conclusions: Individual secondary structures of ribosomal RNA sequences provide a valuable gain of information content that is useful for phylogenetics. Thus, the usage of ITS2 sequence together with secondary structure for taxonomic inferences is recommended. Other reconstruction methods such as maximum likelihood, Bayesian inference or maximum parsimony may equally profit from secondary structure inclusion. Reviewers: This article was reviewed by Shamil Sunyaev, Andrea Tanzer (nominated by Frank Eisenhaber) and Eugene V. Koonin. For the full reviews, please go to the Reviewers' comments section.

  15. The impact of phenotypic and molecular data on the inference of Colletotrichum diversity associated with Musa.

    Science.gov (United States)

    Vieira, Willie A S; Lima, Waléria G; Nascimento, Eduardo S; Michereff, Sami J; Câmara, Marcos P S; Doyle, Vinson P

    2017-01-01

    Developing a comprehensive and reliable taxonomy for the Colletotrichum gloeosporioides species complex will require adopting data standards on the basis of an understanding of how methodological choices impact morphological evaluations and phylogenetic inference. We explored the impact of methodological choices in a morphological and molecular evaluation of Colletotrichum species associated with banana in Brazil. The choice of alignment filtering algorithm has a significant impact on topological inference and the retention of phylogenetically informative sites. Similarly, the choice of phylogenetic marker affects the delimitation of species boundaries, particularly if low phylogenetic signal is confounded with strong discordance, and inference of the species tree from multiple-gene trees. According to both phylogenetic informativeness profiling and Bayesian concordance analyses, the most informative loci are DNA lyase (APN2), intergenic spacer (IGS) between DNA lyase and the mating-type locus MAT1-2-1 (APN2/MAT-IGS), calmodulin (CAL), glyceraldehyde-3-phosphate dehydrogenase (GAPDH), glutamine synthetase (GS), β-tubulin (TUB2), and a new marker, the intergenic spacer between GAPDH and an hypothetical protein (GAP2-IGS). Cornmeal agar minimizes the variance in conidial dimensions compared with potato dextrose agar and synthetic nutrient-poor agar, such that species are more readily distinguishable based on phenotypic differences. We apply these insights to investigate the diversity of Colletotrichum species associated with banana anthracnose in Brazil and report C. musae, C. tropicale, C. theobromicola, and C. siamense in association with banana anthracnose. One lineage did not cluster with any previously described species and is described here as C. chrysophilum.

  16. Disentangling the phylogenetic and ecological components of spider phenotypic variation.

    Science.gov (United States)

    Gonçalves-Souza, Thiago; Diniz-Filho, José Alexandre Felizola; Romero, Gustavo Quevedo

    2014-01-01

    An understanding of how the degree of phylogenetic relatedness influences the ecological similarity among species is crucial to inferring the mechanisms governing the assembly of communities. We evaluated the relative importance of spider phylogenetic relationships and ecological niche (plant morphological variables) to the variation in spider body size and shape by comparing spiders at different scales: (i) between bromeliads and dicot plants (i.e., habitat scale) and (ii) among bromeliads with distinct architectural features (i.e., microhabitat scale). We partitioned the interspecific variation in body size and shape into phylogenetic components (which express trait values as expected from phylogenetic relationships among species) and ecological components (which express trait values independent of phylogenetic relationships). At the habitat scale, bromeliad spiders were larger and flatter than spiders associated with the surrounding dicots. At this scale, plant morphology sorted out closely related spiders. Our results showed that spider flatness is phylogenetically clustered at the habitat scale, whereas it is phylogenetically overdispersed at the microhabitat scale, although phylogenetic signal is present at both scales. Taken together, these results suggest that whereas at the habitat scale selective colonization affects spider body size and shape, at fine scales both selective colonization and adaptive evolution determine spider body shape. By partitioning the phylogenetic and ecological components of phenotypic variation, we were able to disentangle the evolutionary history of distinct spider traits and show that plant architecture plays a role in the evolution of spider body size and shape. We also discussed the relevance of considering multiple scales when studying phylogenetic community structure.

  17. An analysis pipeline for the inference of protein-protein interaction networks

    Energy Technology Data Exchange (ETDEWEB)

    Taylor, Ronald C.; Singhal, Mudita; Daly, Don S.; Gilmore, Jason M.; Cannon, William R.; Domico, Kelly O.; White, Amanda M.; Auberry, Deanna L.; Auberry, Kenneth J.; Hooker, Brian S.; Hurst, G. B.; McDermott, Jason E.; McDonald, W. H.; Pelletier, Dale A.; Schmoyer, Denise A.; Wiley, H. S.

    2009-12-01

    An analysis pipeline has been created for deployment of a novel algorithm, the Bayesian Estimator of Protein-Protein Association Probabilities (BEPro), for use in the reconstruction of protein-protein interaction networks. We have combined the Software Environment for BIological Network Inference (SEBINI), an interactive environment for the deployment and testing of network inference algorithms that use high-throughput data, and the Collective Analysis of Biological Interaction Networks (CABIN), software that allows integration and analysis of protein-protein interaction and gene-to-gene regulatory evidence obtained from multiple sources, to allow interactions computed by BEPro to be stored, visualized, and further analyzed. Incorporating BEPro into SEBINI and automatically feeding the resulting inferred network into CABIN, we have created a structured workflow for protein-protein network inference and supplemental analysis from sets of mass spectrometry bait-prey experiment data. SEBINI demo site: https://www.emsl.pnl.gov/SEBINI/ Contact: ronald.taylor@pnl.gov. BEPro is available at http://www.pnl.gov/statistics/BEPro3/index.htm. Contact: ds.daly@pnl.gov. CABIN is available at http://www.sysbio.org/dataresources/cabin.stm. Contact: mudita.singhal@pnl.gov.

  18. Early-branching euteleost relationships: areas of congruence between concatenation and coalescent model inferences

    Directory of Open Access Journals (Sweden)

    Matthew A. Campbell

    2017-09-01

    Full Text Available Phylogenetic inference based on evidence from DNA sequences has led to significant strides in the development of a stable and robustly supported framework for the vertebrate tree of life. To date, the bulk of those advances have relied on sequence data from a small number of genome regions that have proven unable to produce satisfactory answers to consistently recalcitrant phylogenetic questions. Here, we re-examine phylogenetic relationships among early-branching euteleostean fish lineages classically grouped in the Protacanthopterygii using DNA sequence data surrounding ultraconserved elements. We report and examine a dataset of thirty-four OTUs with 17,957 aligned characters from fifty-three nuclear loci. Phylogenetic analysis is conducted in concatenated, joint gene trees and species tree estimation and summary coalescent frameworks. All analytical frameworks yield supporting evidence for existing hypotheses of relationship for the placement of Lepidogalaxias salamandroides, monophyly of the Stomiatii and the presence of an esociform + salmonid clade. Lepidogalaxias salamandroides and the Esociformes + Salmoniformes are successive sister lineages to all other euteleosts in the majority of analyses. The concatenated and joint gene trees and species tree analysis types produce high support values for this arrangement. However, inter-relationships of Argentiniformes, Stomiatii and Neoteleostei remain uncertain as they varied by analysis type while receiving strong and contradictory indices of support. Topological differences between analysis types are also apparent within the otomorph and the percomorph taxa in the data set. Our results identify concordant areas with strong support for relationships within and between early-branching euteleost lineages but they also reveal limitations in the ability of larger datasets to conclusively resolve other aspects of that phylogeny.

  19. Early-branching euteleost relationships: areas of congruence between concatenation and coalescent model inferences.

    Science.gov (United States)

    Campbell, Matthew A; Alfaro, Michael E; Belasco, Max; López, J Andrés

    2017-01-01

    Phylogenetic inference based on evidence from DNA sequences has led to significant strides in the development of a stable and robustly supported framework for the vertebrate tree of life. To date, the bulk of those advances have relied on sequence data from a small number of genome regions that have proven unable to produce satisfactory answers to consistently recalcitrant phylogenetic questions. Here, we re-examine phylogenetic relationships among early-branching euteleostean fish lineages classically grouped in the Protacanthopterygii using DNA sequence data surrounding ultraconserved elements. We report and examine a dataset of thirty-four OTUs with 17,957 aligned characters from fifty-three nuclear loci. Phylogenetic analysis is conducted in concatenated, joint gene trees and species tree estimation and summary coalescent frameworks. All analytical frameworks yield supporting evidence for existing hypotheses of relationship for the placement of Lepidogalaxias salamandroides, monophyly of the Stomiatii and the presence of an esociform + salmonid clade. Lepidogalaxias salamandroides and the Esociformes + Salmoniformes are successive sister lineages to all other euteleosts in the majority of analyses. The concatenated and joint gene trees and species tree analysis types produce high support values for this arrangement. However, inter-relationships of Argentiniformes, Stomiatii and Neoteleostei remain uncertain as they varied by analysis type while receiving strong and contradictory indices of support. Topological differences between analysis types are also apparent within the otomorph and the percomorph taxa in the data set. Our results identify concordant areas with strong support for relationships within and between early-branching euteleost lineages but they also reveal limitations in the ability of larger datasets to conclusively resolve other aspects of that phylogeny.

  20. Identification and phylogenetic inferences on stocks of sharks affected by the fishing industry off the Northern coast of Brazil

    Directory of Open Access Journals (Sweden)

    Luis Fernando da Silva Rodrigues-Filho

    2009-01-01

    Full Text Available The ongoing decline in abundance and diversity of shark stocks, primarily due to uncontrolled fishery exploitation, is a worldwide problem. An additional problem for the development of conservation and management programmes is the identification of species diversity within a given area, given the morphological similarities among shark species, and the typical disembarkation of processed carcasses which are almost impossible to differentiate. The main aim of the present study was to identify those shark species being exploited off northern Brazil, by using the 12S-16S molecular marker. For this, DNA sequences were obtained from 122 specimens collected on the docks and the fish market in Bragança, in the Brazilian state of Pará. We identified at least 11 species. Three-quarters of the specimens collected were either Carcharhinus porosus or Rhizoprionodon sp., while a notable absence was the daggernose shark, Isogomphodon oxyrhynchus, previously one of the most common species in local catches. The study emphasises the value of molecular techniques for the identification of cryptic shark species, and the potential of the 12S-16S marker as a tool for phylogenetic inferences in a study of elasmobranchs.

  1. Phylogenetic comparative methods on phylogenetic networks with reticulations.

    Science.gov (United States)

    Bastide, Paul; Solís-Lemus, Claudia; Kriebel, Ricardo; Sparks, K William; Ané, Cécile

    2018-04-25

    The goal of Phylogenetic Comparative Methods (PCMs) is to study the distribution of quantitative traits among related species. The observed traits are often seen as the result of a Brownian Motion (BM) along the branches of a phylogenetic tree. Reticulation events such as hybridization, gene flow or horizontal gene transfer, can substantially affect a species' traits, but are not modeled by a tree. Phylogenetic networks have been designed to represent reticulate evolution. As they become available for downstream analyses, new models of trait evolution are needed, applicable to networks. One natural extension of the BM is to use a weighted average model for the trait of a hybrid, at a reticulation point. We develop here an efficient recursive algorithm to compute the phylogenetic variance matrix of a trait on a network, in only one preorder traversal of the network. We then extend the standard PCM tools to this new framework, including phylogenetic regression with covariates (or phylogenetic ANOVA), ancestral trait reconstruction, and Pagel's λ test of phylogenetic signal. The trait of a hybrid is sometimes outside of the range of its two parents, for instance because of hybrid vigor or hybrid depression. These two phenomena are rather commonly observed in present-day hybrids. Transgressive evolution can be modeled as a shift in the trait value following a reticulation point. We develop a general framework to handle such shifts, and take advantage of the phylogenetic regression view of the problem to design statistical tests for ancestral transgressive evolution in the evolutionary history of a group of species. We study the power of these tests in several scenarios, and show that recent events have indeed the strongest impact on the trait distribution of present-day taxa. We apply those methods to a dataset of Xiphophorus fishes, to confirm and complete previous analysis in this group. 
All the methods developed here are available in the Julia package PhyloNetworks.
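
    The one-pass computation of the variance matrix can be sketched as follows. This is a minimal illustration in Python, not the PhyloNetworks code: the input format and the function name `network_variance` are assumptions made here. It uses the weighted-average model from the abstract, in which a hybrid's trait is the inheritance-weighted mean of the values arriving along its parent edges, and it fixes the root variance at zero.

```python
from typing import Dict, List, Tuple

# A node is (name, parents); each parent entry is
# (parent_name, branch_length, inheritance_weight gamma).
# Tree nodes have one parent with gamma = 1; hybrids have two with gammas summing to 1.
Node = Tuple[str, List[Tuple[str, float, float]]]

def network_variance(nodes: List[Node], sigma2: float = 1.0):
    """Brownian-motion covariance of all nodes, filled in a single preorder pass.

    `nodes` must list every parent before its children (preorder).
    Returns (index, V) where index maps node name -> row/column of V.
    """
    index: Dict[str, int] = {}
    V: List[List[float]] = []
    for name, parents in nodes:
        i = len(index)
        # Covariance with each earlier node: weighted average over the parents.
        row = [sum(g * V[index[p]][j] for p, _, g in parents) for j in range(i)]
        # Variance: weighted parent variances plus branch lengths ...
        var = sum(g * g * (V[index[p]][index[p]] + l) for p, l, g in parents)
        # ... plus the covariance between the two parents of a hybrid.
        for a in range(len(parents)):
            for b in range(a + 1, len(parents)):
                (pa, _, ga), (pb, _, gb) = parents[a], parents[b]
                var += 2.0 * ga * gb * V[index[pa]][index[pb]]
        for j in range(i):
            V[j].append(row[j])      # keep the matrix symmetric
        V.append(row + [var])
        index[name] = i
    return index, [[sigma2 * v for v in r] for r in V]

if __name__ == "__main__":
    net = [
        ("R", []),                                   # root
        ("A", [("R", 1.0, 1.0)]),
        ("B", [("R", 1.0, 1.0)]),
        ("H", [("A", 0.0, 0.5), ("B", 0.0, 0.5)]),   # hybrid of A and B
    ]
    idx, V = network_variance(net)
    print("Var(H) =", V[idx["H"]][idx["H"]], " Cov(H, A) =", V[idx["H"]][idx["A"]])
```

    In this toy network the hybrid H averages two independent lineages of variance 1, so its variance is 0.5 and its covariance with each parent is 0.5.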

  2. MPBoot: fast phylogenetic maximum parsimony tree inference and bootstrap approximation.

    Science.gov (United States)

    Hoang, Diep Thi; Vinh, Le Sy; Flouri, Tomáš; Stamatakis, Alexandros; von Haeseler, Arndt; Minh, Bui Quang

    2018-02-02

    The nonparametric bootstrap is widely used to measure the branch support of phylogenetic trees. However, bootstrapping is computationally expensive and remains a bottleneck in phylogenetic analyses. Recently, an ultrafast bootstrap approximation (UFBoot) approach was proposed for maximum likelihood analyses. However, such an approach is still missing for maximum parsimony. To close this gap we present MPBoot, an adaptation and extension of UFBoot to compute branch supports under the maximum parsimony principle. MPBoot works for both uniform and non-uniform cost matrices. Our analyses on biological DNA and protein showed that under uniform cost matrices, MPBoot runs on average 4.7 (DNA) to 7 times (protein data) (range: 1.2-20.7) faster than the standard parsimony bootstrap implemented in PAUP*; but 1.6 (DNA) to 4.1 times (protein data) slower than the standard bootstrap with a fast search routine in TNT (fast-TNT). However, for non-uniform cost matrices MPBoot is 5 (DNA) to 13 times (protein data) (range: 0.3-63.9) faster than fast-TNT. We note that MPBoot achieves better scores more frequently than PAUP* and fast-TNT. However, this effect is less pronounced if an intensive but slower search in TNT is invoked. Moreover, experiments on large-scale simulated data show that while both PAUP* and TNT bootstrap estimates are too conservative, MPBoot bootstrap estimates appear more unbiased. MPBoot provides an efficient alternative to the standard maximum parsimony bootstrap procedure. It shows favorable performance in terms of run time, the capability of finding a maximum parsimony tree, and high bootstrap accuracy on simulated as well as empirical data sets. MPBoot is easy-to-use, open-source and available at http://www.cibiv.at/software/mpboot.
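
    For orientation, the standard nonparametric bootstrap that MPBoot approximates can be sketched in a few lines. This is the textbook procedure, not MPBoot's algorithm: alignment columns are resampled with replacement, a tree is inferred from each replicate, and the support of a clade is the fraction of replicates containing it. The tree search is passed in as a callable; `closest_pair` below is a deliberately trivial, hypothetical stand-in for a real parsimony search.

```python
import random
from collections import Counter
from typing import Callable, Dict, FrozenSet, Iterable

Alignment = Dict[str, str]
Clade = FrozenSet[str]

def bootstrap_support(alignment: Alignment,
                      infer_clades: Callable[[Alignment], Iterable[Clade]],
                      replicates: int = 100,
                      seed: int = 0) -> Dict[Clade, float]:
    """Fraction of bootstrap replicates in which each clade is recovered."""
    rng = random.Random(seed)
    names = list(alignment)
    n_sites = len(alignment[names[0]])
    counts: Counter = Counter()
    for _ in range(replicates):
        # Resample site columns with replacement, keeping rows aligned.
        cols = [rng.randrange(n_sites) for _ in range(n_sites)]
        resampled = {n: "".join(alignment[n][c] for c in cols) for n in names}
        counts.update(set(infer_clades(resampled)))
    return {clade: k / replicates for clade, k in counts.items()}

def closest_pair(aln: Alignment):
    """Toy 'tree search': report the pair of taxa with the fewest differences."""
    names = sorted(aln)
    pairs = [(sum(x != y for x, y in zip(aln[a], aln[b])), a, b)
             for i, a in enumerate(names) for b in names[i + 1:]]
    _, a, b = min(pairs)
    return [frozenset((a, b))]

if __name__ == "__main__":
    aln = {"A": "AAAA", "B": "AAAA", "C": "CCCC"}
    print(bootstrap_support(aln, closest_pair, replicates=50))
```

    Every replicate requires a full tree search, which is why the procedure is expensive and why approximations that reuse work across replicates are attractive.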

  3. Auto-validating von Neumann rejection sampling from small phylogenetic tree spaces

    Directory of Open Access Journals (Sweden)

    York Thomas

    2009-01-01

    Full Text Available Abstract Background: In phylogenetic inference one is interested in obtaining samples from the posterior distribution over the tree space on the basis of some observed DNA sequence data. One of the simplest sampling methods is the rejection sampler due to von Neumann. Here we introduce an auto-validating version of the rejection sampler, via interval analysis, to rigorously draw samples from posterior distributions over small phylogenetic tree spaces. Results: The posterior samples from the auto-validating sampler are used to rigorously (i) estimate posterior probabilities for different rooted topologies based on mitochondrial DNA from human, chimpanzee and gorilla, (ii) conduct a non-parametric test of rate variation between protein-coding and tRNA-coding sites from three primates and (iii) obtain a posterior estimate of the human-Neanderthal divergence time. Conclusion: This solves the open problem of rigorously drawing independent and identically distributed samples from the posterior distribution over rooted and unrooted small tree spaces (3 or 4 taxa) based on any multiply-aligned sequence data.
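
    The plain von Neumann rejection sampler that the paper builds on can be sketched as below. This is not the auto-validating version: here the upper bound on the unnormalised posterior is simply assumed to be known, whereas the paper obtains a rigorous envelope through interval analysis. The three topology strings and their weights are made-up illustration values, not results from the study.

```python
import random
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")

def rejection_sample(candidates: Sequence[T],
                     unnorm_density: Callable[[T], float],
                     bound: float,
                     rng: random.Random) -> T:
    """Draw one exact sample from a target known only up to a constant.

    Proposals are uniform over `candidates`; `bound` must be at least the
    largest value of `unnorm_density` (assumed known in this sketch).
    """
    while True:
        x = rng.choice(candidates)
        # Accept with probability unnorm_density(x) / bound.
        if rng.random() * bound < unnorm_density(x):
            return x

if __name__ == "__main__":
    # Hypothetical unnormalised posterior weights for three rooted topologies.
    weights = {"((H,C),G)": 3.0, "((H,G),C)": 1.0, "((C,G),H)": 0.0}
    rng = random.Random(42)
    draws: List[str] = [
        rejection_sample(list(weights), weights.get, max(weights.values()), rng)
        for _ in range(2000)
    ]
    for topology in weights:
        print(topology, draws.count(topology) / len(draws))
```

    Accepted draws are independent and identically distributed from the normalised target, which is the property the abstract emphasises; the cost is that a loose bound wastes many proposals.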

  4. Comprehensive untargeted metabolomics of Lychnophorinae subtribe (Asteraceae: Vernonieae) in a phylogenetic context.

    Directory of Open Access Journals (Sweden)

    Maria Elvira Poleti Martucci

    Full Text Available Members of the subtribe Lychnophorinae occur mostly within the Cerrado domain of the Brazilian Central Plateau. The relationships between its 11 genera, as well as between Lychnophorinae and other subtribes belonging to the tribe Vernonieae, have recently been investigated using a phylogeny based on molecular and morphological data. We report the use of a comprehensive untargeted metabolomics approach, combining HPLC-MS and GC-MS data, followed by multivariate analyses aiming to assess the congruence between metabolomics data and the phylogenetic hypothesis, as well as its potential as a chemotaxonomic tool. We analyzed 78 species by UHPLC-MS and GC-MS in both positive and negative ionization modes. The metabolic profiles obtained for these species were treated in MetAlign and in MSClust and the matrices generated were used in SIMCA for hierarchical cluster analyses, principal component analyses and orthogonal partial least square discriminant analysis. The results showed that metabolomic analyses are mostly congruent with the phylogenetic hypothesis, especially at lower taxonomic levels (Lychnophora or Eremanthus). Our results confirm that data generated using metabolomics provide evidence for chemotaxonomical studies, especially for phylogenetic inference of the Lychnophorinae subtribe and insight into the evolution of the secondary metabolites of this group.

  5. Species trees for the tree swallows (Genus Tachycineta): an alternative phylogenetic hypothesis to the mitochondrial gene tree.

    Science.gov (United States)

    Dor, Roi; Carling, Matthew D; Lovette, Irby J; Sheldon, Frederick H; Winkler, David W

    2012-10-01

    The New World swallow genus Tachycineta comprises nine species that collectively have a wide geographic distribution and remarkable variation both within- and among-species in ecologically important traits. Existing phylogenetic hypotheses for Tachycineta are based on mitochondrial DNA sequences, thus they provide estimates of a single gene tree. In this study we sequenced multiple individuals from each species at 16 nuclear intron loci. We used gene concatenated approaches (Bayesian and maximum likelihood) as well as coalescent-based species tree inference to reconstruct phylogenetic relationships of the genus. We examined the concordance and conflict between the nuclear and mitochondrial trees and between concatenated and coalescent-based inferences. Our results provide an alternative phylogenetic hypothesis to the existing mitochondrial DNA estimate of phylogeny. This new hypothesis provides a more accurate framework in which to explore trait evolution and examine the evolution of the mitochondrial genome in this group. Copyright © 2012 Elsevier Inc. All rights reserved.

  6. Inference of Large Phylogenies Using Neighbour-Joining

    DEFF Research Database (Denmark)

    Simonsen, Martin; Mailund, Thomas; Pedersen, Christian Nørgaard Storm

    2011-01-01

    The neighbour-joining method is a widely used method for phylogenetic reconstruction which scales to thousands of taxa. However, advances in sequencing technology have made data sets with more than 10,000 related taxa widely available. Inference of such large phylogenies takes hours or days using...... the Neighbour-Joining method on a normal desktop computer because of the O(n^3) running time. RapidNJ is a search heuristic which reduces the running time of the Neighbour-Joining method significantly but at the cost of an increased memory consumption making inference of large phylogenies infeasible. We present...... two extensions for RapidNJ which reduce the memory requirements and allow phylogenies with more than 50,000 taxa to be inferred efficiently on a desktop computer. Furthermore, an improved version of the search heuristic is presented which reduces the running time of RapidNJ on many data...
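
    The cubic running time mentioned above is easy to see in a direct implementation of canonical neighbour-joining. The sketch below is the textbook algorithm, not RapidNJ or its extensions; the function name and the dictionary-based distance format are choices made here for illustration. Each of the roughly n iterations scans all O(n^2) pairs for the minimum of the Q criterion, which is the step that search heuristics such as RapidNJ aim to shorten.

```python
from typing import Dict, List, Tuple

Join = Tuple[str, str, str, float, float]   # (child_a, child_b, new_node, len_a, len_b)

def neighbor_joining(names: List[str],
                     dist: Dict[str, Dict[str, float]]):
    """Canonical neighbour-joining on a symmetric distance table.

    Returns (joins, last_edge): the list of joins in order, and the final
    edge (node_a, node_b, length) connecting the last two remaining nodes.
    """
    d = {a: dict(dist[a]) for a in names}
    active = list(names)
    joins: List[Join] = []
    while len(active) > 2:
        n = len(active)
        # Row sums of the current distance table.
        r = {a: sum(d[a][b] for b in active if b != a) for a in active}
        # O(n^2) scan for the pair minimising Q(a, b) = (n-2) d(a,b) - r(a) - r(b).
        best = None
        for i, a in enumerate(active):
            for b in active[i + 1:]:
                q = (n - 2) * d[a][b] - r[a] - r[b]
                if best is None or q < best[0]:
                    best = (q, a, b)
        _, a, b = best
        len_a = 0.5 * d[a][b] + (r[a] - r[b]) / (2 * (n - 2))
        len_b = d[a][b] - len_a
        u = f"u{len(joins)}"
        d[u] = {}
        for c in active:
            if c not in (a, b):
                d[u][c] = d[c][u] = 0.5 * (d[a][c] + d[b][c] - d[a][b])
        active = [c for c in active if c not in (a, b)] + [u]
        joins.append((a, b, u, len_a, len_b))
    x, y = active
    return joins, (x, y, d[x][y])

if __name__ == "__main__":
    taxa = ["a", "b", "c", "d", "e"]
    rows = [[0, 5, 9, 9, 8],
            [5, 0, 10, 10, 9],
            [9, 10, 0, 8, 7],
            [9, 10, 8, 0, 3],
            [8, 9, 7, 3, 0]]
    table = {s: {t: float(rows[i][j]) for j, t in enumerate(taxa) if t != s}
             for i, s in enumerate(taxa)}
    joins, last = neighbor_joining(taxa, table)
    for step in joins:
        print(step)
    print("final edge:", last)
```

    On this five-taxon example the first join pairs a and b with branch lengths 2 and 3.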

  7. Genetic and phylogenetic analysis of ten Gobiidae species in China ...

    African Journals Online (AJOL)

    To study the genetic and phylogenetic relationship of gobioid fishes in China, the representatives of 10 gobioid fishes from 2 subfamilies in China were examined by amplified fragment length polymorphism (AFLP) analysis. We established 220 AFLP bands for 45 individuals from the 10 species, and the percentage of ...

  8. Phylogenetic study of Class Armophorea (Alveolata, Ciliophora) based on 18S-rDNA data

    Directory of Open Access Journals (Sweden)

    Thiago da Silva Paiva

    2013-01-01

    Full Text Available The 18S rDNA phylogeny of Class Armophorea, a group of anaerobic ciliates, is proposed based on an analysis of 44 sequences (out of 195) retrieved from the NCBI/GenBank database. Emphasis was placed on the use of two nucleotide alignment criteria that involved variation in the gap-opening and gap-extension parameters and the use of rRNA secondary structure to orientate multiple-alignment. A sensitivity analysis of 76 data sets was run to assess the effect of variations in indel parameters on tree topologies. Bayesian inference, maximum likelihood and maximum parsimony phylogenetic analyses were used to explore how different analytic frameworks influenced the resulting hypotheses. A sensitivity analysis revealed that the relationships among higher taxa of the Intramacronucleata were dependent upon how indels were determined during multiple-alignment of nucleotides. The phylogenetic analyses rejected the monophyly of the Armophorea most of the time and consistently indicated that the Metopidae and Nyctotheridae were related to the Litostomatea. There was no consensus on the placement of the Caenomorphidae, which could be a sister group of the Metopidae + Nyctotheridae, or could have diverged at the base of the Spirotrichea branch or the Intramacronucleata tree.

  9. A nuclear phylogenetic analysis: SNPs, indels and SSRs deliver new insights into the relationships in the 'true citrus fruit trees' group (Citrinae, Rutaceae) and the origin of cultivated species.

    Science.gov (United States)

    Garcia-Lor, Andres; Curk, Franck; Snoussi-Trifa, Hager; Morillon, Raphael; Ancillo, Gema; Luro, François; Navarro, Luis; Ollitrault, Patrick

    2013-01-01

    Despite differences in morphology, the genera representing 'true citrus fruit trees' are sexually compatible, and their phylogenetic relationships remain unclear. Most of the important commercial 'species' of Citrus are believed to be of interspecific origin. By studying polymorphisms of 27 nuclear genes, the average molecular differentiation between species was estimated and some phylogenetic relationships between 'true citrus fruit trees' were clarified. Sanger sequencing of PCR-amplified fragments from 18 genes involved in metabolite biosynthesis pathways and nine putative genes for salt tolerance was performed for 45 genotypes of Citrus and relatives of Citrus to mine single nucleotide polymorphisms (SNPs) and indel polymorphisms. Fifty nuclear simple sequence repeats (SSRs) were also analysed. A total of 16 238 kb of DNA was sequenced for each genotype, and 1097 single nucleotide polymorphisms (SNPs) and 50 indels were identified. These polymorphisms were more valuable than SSRs for inter-taxon differentiation. Nuclear phylogenetic analysis revealed that Citrus reticulata and Fortunella form a cluster that is differentiated from the clade that includes three other basic taxa of cultivated citrus (C. maxima, C. medica and C. micrantha). These results confirm the taxonomic subdivision between the subgenera Metacitrus and Archicitrus. A few genes displayed positive selection patterns within or between species, but most of them displayed neutral patterns. The phylogenetic inheritance patterns of the analysed genes were inferred for commercial Citrus spp. Numerous molecular polymorphisms (SNPs and indels), which are potentially useful for the analysis of interspecific genetic structures, have been identified. The nuclear phylogenetic network for Citrus and its sexually compatible relatives was consistent with the geographical origins of these genera. The positive selection observed for a few genes will help further works to analyse the molecular basis of the

  10. Constructing level-2 phylogenetic networks from triplets

    OpenAIRE

    Iersel, Leo; Keijsper, J.C.M.; Kelk, Steven; Stougie, Leen; Hagen, F.; Boekhout, T.; Vingron, M.; Wong, L.

    2009-01-01

    Jansson and Sung showed that, given a dense set of input triplets T (representing hypotheses about the local evolutionary relationships of triplets of taxa), it is possible to determine in polynomial time whether there exists a level-1 network consistent with T, and if so to construct such a network (Inferring a Level-1 Phylogenetic Network from a Dense Set of Rooted Triplets, Theoretical Computer Science, 363, pp. 60-68 (2006)). Here we extend this work by showing that this probl...
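
    To make the notion of a rooted triplet concrete, the following minimal Python sketch tests whether a rooted tree displays a triplet ab|c. It is an illustration only and not the algorithm of the paper: the nested-tuple tree encoding and the names `leaves` and `displays_triplet` are assumptions introduced here, and the sketch handles trees, not level-1 or level-2 networks.

```python
def leaves(tree):
    """Return the set of leaf labels below a node (a leaf is a string)."""
    if isinstance(tree, str):
        return {tree}
    result = set()
    for child in tree:
        result |= leaves(child)
    return result


def displays_triplet(tree, a, b, c):
    """Return True if the rooted tree displays the triplet ab|c.

    Assumes a, b and c are distinct leaves of the tree. The triplet holds
    when some subtree contains a and b but not c, i.e. the lowest common
    ancestor of a and b lies strictly below that of a and c.
    """
    if isinstance(tree, str):
        return False
    for child in tree:
        below = leaves(child)
        if a in below and b in below:
            if c not in below:
                return True
            # a, b and c all sit in this child: look deeper.
            return displays_triplet(child, a, b, c)
    # a and b first meet at this node, which also covers c.
    return False


# Hypothetical four-taxon tree (((a,b),c),d) written as nested tuples.
example_tree = ((("a", "b"), "c"), "d")
print(displays_triplet(example_tree, "a", "b", "c"))
print(displays_triplet(example_tree, "a", "c", "b"))
```

    A tree is consistent with a triplet set T when this check succeeds for every triplet in T; the dense case treated in the abstract requires a triplet for every three taxa.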

  11. Spatial and temporal heterogeneity in high-grade serous ovarian cancer: a phylogenetic analysis.

    Directory of Open Access Journals (Sweden)

    Roland F Schwarz

    2015-02-01

    The major clinical challenge in the treatment of high-grade serous ovarian cancer (HGSOC) is the development of progressive resistance to platinum-based chemotherapy. The objective of this study was to determine whether intra-tumour genetic heterogeneity resulting from clonal evolution and the emergence of subclonal tumour populations in HGSOC was associated with the development of resistant disease. Evolutionary inference and phylogenetic quantification of heterogeneity were performed using the MEDICC algorithm on high-resolution whole genome copy number profiles and selected genome-wide sequencing of 135 spatially and temporally separated samples from 14 patients with HGSOC who received platinum-based chemotherapy. Samples were obtained from the clinical CTCR-OV03/04 studies, and patients were enrolled between 20 July 2007 and 22 October 2009. Median follow-up of the cohort was 31 mo (interquartile range 22-46 mo), censored after 26 October 2013. Outcome measures were overall survival (OS) and progression-free survival (PFS). There were marked differences in the degree of clonal expansion (CE) between patients (median 0.74, interquartile range 0.66-1.15), and dichotomization by median CE showed worse survival in CE-high cases (PFS 12.7 versus 10.1 mo, p = 0.009; OS 42.6 versus 23.5 mo, p = 0.003). Bootstrap analysis with resampling showed that the 95% confidence intervals for the hazard ratios for PFS and OS in the CE-high group were greater than 1.0. These data support a relationship between heterogeneity and survival but do not precisely determine its effect size. Relapsed tissue was available for two patients in the CE-high group, and phylogenetic analysis showed that the prevalent clonal population at clinical recurrence arose from early divergence events. A subclonal population marked by a NF1 deletion showed a progressive increase in tumour allele fraction during chemotherapy. This study demonstrates that quantitative measures of intra

  12. Phylogenetic analysis of fungal ABC transporters.

    Science.gov (United States)

    Kovalchuk, Andriy; Driessen, Arnold J M

    2010-03-16

    The superfamily of ABC proteins is among the largest known in nature. Its members are mainly, but not exclusively, involved in the transport of a broad range of substrates across biological membranes. Many contribute to multidrug resistance in microbial pathogens and cancer cells. The diversity of ABC proteins in fungi is comparable with that in multicellular animals, but so far fungal ABC proteins have barely been studied. We performed a phylogenetic analysis of the ABC proteins extracted from the genomes of 27 fungal species from 18 orders representing 5 fungal phyla, thereby covering the most important groups. Our analysis demonstrated that some of the subfamilies of ABC proteins remained highly conserved in fungi, while others have undergone a remarkable group-specific diversification. Members of the various fungal phyla also differed significantly in the number of ABC proteins found in their genomes, which is especially reduced in the yeasts S. cerevisiae and S. pombe. Data obtained during our analysis should contribute to a better understanding of the diversity of the fungal ABC proteins and provide important clues about their possible biological functions.

  13. Comparative evolutionary diversity and phylogenetic structure across multiple forest dynamics plots: a mega-phylogeny approach

    Directory of Open Access Journals (Sweden)

    David Lee Erickson

    2014-11-01

    Forest dynamics plots, which now span longitudes, latitudes, and habitat types across the globe, offer unparalleled insights into the ecological and evolutionary processes that determine how species are assembled into communities. Understanding phylogenetic relationships among species in a community has become an important component of assessing assembly processes. However, the application of evolutionary information to questions in community ecology has been limited in large part by the lack of accurate estimates of phylogenetic relationships among individual species found within communities, and is particularly limiting in comparisons between communities. Therefore, streamlining and maximizing the information content of these community phylogenies is a priority. To test the viability and advantage of a multi-community phylogeny, we constructed a multi-plot mega-phylogeny of 1,347 species of trees across 15 forest dynamics plots in the ForestGEO network using DNA barcode sequence data (rbcL, matK and psbA-trnH) and compared community phylogenies for each individual plot with respect to support for topology and branch lengths, which affect evolutionary inference of community processes. The levels of taxonomic differentiation across the phylogeny were examined by quantifying the frequency of resolved nodes throughout. In addition, three phylogenetic distance metrics that are commonly used to infer assembly processes were estimated for each plot (Phylogenetic Distance [PD], Mean Phylogenetic Distance [MPD], and Mean Nearest Taxon Distance [MNTD]). Lastly, we examine the partitioning of phylogenetic diversity among community plots through quantification of inter-community MPD and MNTD. Overall, evolutionary relationships were highly resolved across the DNA barcode-based mega-phylogeny, and phylogenetic resolution for each community plot was improved when estimated within the context of the mega-phylogeny. Likewise, when compared with phylogenies for
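
    Two of the metrics named in this abstract, MPD and MNTD, can be sketched from a matrix of pairwise tip-to-tip distances. The Python below is a minimal illustration under stated assumptions: it uses the unweighted presence/absence forms, the distance values and taxon names are hypothetical, and the function names `mpd` and `mntd` are introduced here. The study itself may have used abundance-weighted or standardized variants.

```python
from itertools import combinations


def mpd(dist, taxa):
    """Mean distance over all unordered pairs of distinct taxa."""
    pairs = list(combinations(taxa, 2))
    return sum(dist[a][b] for a, b in pairs) / len(pairs)


def mntd(dist, taxa):
    """Mean, over taxa, of the distance to the nearest other taxon."""
    nearest = [min(dist[a][b] for b in taxa if b != a) for a in taxa]
    return sum(nearest) / len(nearest)


# Hypothetical patristic distances (summed branch lengths between tips).
dist = {
    "A": {"A": 0.0, "B": 2.0, "C": 6.0},
    "B": {"A": 2.0, "B": 0.0, "C": 6.0},
    "C": {"A": 6.0, "B": 6.0, "C": 0.0},
}
community = ["A", "B", "C"]
print(mpd(dist, community))
print(mntd(dist, community))
```

    MPD is driven by deep splits in the tree because every pair contributes, whereas MNTD reflects only how close each species is to its nearest relative in the plot.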

  14. New substitution models for rooting phylogenetic trees.

    Science.gov (United States)

    Williams, Tom A; Heaps, Sarah E; Cherlin, Svetlana; Nye, Tom M W; Boys, Richard J; Embley, T Martin

    2015-09-26

    The root of a phylogenetic tree is fundamental to its biological interpretation, but standard substitution models do not provide any information on its position. Here, we describe two recently developed models that relax the usual assumptions of stationarity and reversibility, thereby facilitating root inference without the need for an outgroup. We compare the performance of these models on a classic test case for phylogenetic methods, before considering two highly topical questions in evolutionary biology: the deep structure of the tree of life and the root of the archaeal radiation. We show that all three alignments contain meaningful rooting information that can be harnessed by these new models, thus complementing and extending previous work based on outgroup rooting. In particular, our analyses exclude the root of the tree of life from the eukaryotes or Archaea, placing it on the bacterial stem or within the Bacteria. They also exclude the root of the archaeal radiation from several major clades, consistent with analyses using other rooting methods. Overall, our results demonstrate the utility of non-reversible and non-stationary models for rooting phylogenetic trees, and identify areas where further progress can be made. © 2015 The Authors.

  15. Dynamically heterogenous partitions and phylogenetic inference: an evaluation of analytical strategies with cytochrome b and ND6 gene sequences in cranes.

    Science.gov (United States)

    Krajewski, C; Fain, M G; Buckley, L; King, D G

    1999-11-01

    Debates over whether molecular sequence data should be partitioned for phylogenetic analysis often confound two types of heterogeneity among partitions. We distinguish historical heterogeneity (i.e., different partitions have different evolutionary relationships) from dynamic heterogeneity (i.e., different partitions show different patterns of sequence evolution) and explore the impact of the latter on phylogenetic accuracy and precision with a two-gene, mitochondrial data set for cranes. The well-established phylogeny of cranes allows us to contrast tree-based estimates of relevant parameter values with estimates based on pairwise comparisons and to ascertain the effects of incorporating different amounts of process information into phylogenetic estimates. We show that codon positions in the cytochrome b and NADH dehydrogenase subunit 6 genes are dynamically heterogenous under both Poisson and invariable-sites + gamma-rates versions of the F84 model and that heterogeneity includes variation in base composition and transition bias as well as substitution rate. Estimates of transition-bias and relative-rate parameters from pairwise sequence comparisons were comparable to those obtained as tree-based maximum likelihood estimates. Neither rate-category nor mixed-model partitioning strategies resulted in a loss of phylogenetic precision relative to unpartitioned analyses. We suggest that weighted-average distances provide a computationally feasible alternative to direct maximum likelihood estimates of phylogeny for mixed-model analyses of large, dynamically heterogenous data sets. Copyright 1999 Academic Press.

  16. Phylogenetic analysis of West Nile virus, Nuevo Leon State, Mexico.

    Science.gov (United States)

    Blitvich, Bradley J; Fernández-Salas, Ildefonso; Contreras-Cordero, Juan F; Loroño-Pino, María A; Marlenee, Nicole L; Díaz, Francisco J; González-Rojas, José I; Obregón-Martínez, Nelson; Chiu-García, Jorge A; Black, William C; Beaty, Barry J

    2004-07-01

    West Nile virus RNA was detected in brain tissue from a horse that died in June 2003 in Nuevo Leon State, Mexico. Nucleotide sequencing and phylogenetic analysis of the premembrane and envelope genes showed that the virus was most closely related to West Nile virus isolates collected in Texas in 2002.

  17. Evaluating Fast Maximum Likelihood-Based Phylogenetic Programs Using Empirical Phylogenomic Data Sets

    Science.gov (United States)

    Zhou, Xiaofan; Shen, Xing-Xing; Hittinger, Chris Todd

    2018-01-01

    The sizes of the data matrices assembled to resolve branches of the tree of life have increased dramatically, motivating the development of programs for fast, yet accurate, inference. For example, several different fast programs have been developed in the very popular maximum likelihood framework, including RAxML/ExaML, PhyML, IQ-TREE, and FastTree. Although these programs are widely used, a systematic evaluation and comparison of their performance using empirical genome-scale data matrices has so far been lacking. To address this question, we evaluated these four programs on 19 empirical phylogenomic data sets with hundreds to thousands of genes and up to 200 taxa with respect to likelihood maximization, tree topology, and computational speed. For single-gene tree inference, we found that the more exhaustive and slower strategies (ten searches per alignment) outperformed faster strategies (one tree search per alignment) using RAxML, PhyML, or IQ-TREE. Interestingly, single-gene trees inferred by the three programs yielded comparable coalescent-based species tree estimations. For concatenation-based species tree inference, IQ-TREE consistently achieved the best-observed likelihoods for all data sets, and RAxML/ExaML was a close second. In contrast, PhyML often failed to complete concatenation-based analyses, whereas FastTree was the fastest but generated lower likelihood values and more dissimilar tree topologies in both types of analyses. Finally, data matrix properties, such as the number of taxa and the strength of phylogenetic signal, sometimes substantially influenced the programs’ relative performance. Our results provide real-world gene and species tree phylogenetic inference benchmarks to inform the design and execution of large-scale phylogenomic data analyses. PMID:29177474
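
    One common way to quantify how dissimilar two tree topologies are is the Robinson-Foulds distance. The abstract does not state which topology measure was used, so the Python below is only a sketch of the general idea: it computes a rooted, clade-based variant on nested-tuple trees, and the encoding and the names `clades` and `rf_distance` are assumptions made for illustration.

```python
def clades(tree):
    """Return (leaf set, set of clades with more than one leaf)."""
    if isinstance(tree, str):
        return frozenset([tree]), set()
    below = frozenset()
    found = set()
    for child in tree:
        child_leaves, child_clades = clades(child)
        below |= child_leaves
        found |= child_clades
    found.add(below)
    return below, found


def rf_distance(tree1, tree2):
    """Count the clades that occur in exactly one of the two rooted trees."""
    leaves1, clades1 = clades(tree1)
    leaves2, clades2 = clades(tree2)
    if leaves1 != leaves2:
        raise ValueError("trees must contain the same taxa")
    return len(clades1 ^ clades2)


# Hypothetical four-taxon topologies.
balanced = (("a", "b"), ("c", "d"))
ladder = ((("a", "b"), "c"), "d")
print(rf_distance(balanced, ladder))
```

    Programs that infer unrooted trees are usually compared with the bipartition-based form of the distance instead; the clade-based form shown here is simply the shortest one to write down.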

  18. A Distance Measure for Genome Phylogenetic Analysis

    Science.gov (United States)

    Cao, Minh Duc; Allison, Lloyd; Dix, Trevor

    Phylogenetic analyses of species based on single genes or parts of the genomes are often inconsistent because of factors such as variable rates of evolution and horizontal gene transfer. The availability of more and more sequenced genomes allows phylogeny construction from complete genomes that is less sensitive to such inconsistency. For such long sequences, construction methods like maximum parsimony and maximum likelihood are often not possible due to their intensive computational requirements. Another class of tree construction methods, namely distance-based methods, requires a measure of distance between any two genomes. Some measures, such as evolutionary edit distance of gene order and gene content, are computationally expensive or do not perform well when the gene content of the organisms is similar. This study presents an information theoretic measure of genetic distances between genomes based on the biological compression algorithm expert model. We demonstrate that our distance measure can be applied to reconstruct the consensus phylogenetic tree of a number of Plasmodium parasites from their genomes, the statistical bias of which would mislead conventional analysis methods. Our approach is also used to successfully construct a plausible evolutionary tree for the γ-Proteobacteria group whose genomes are known to contain many horizontally transferred genes.
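
    The general idea of a compression-based distance can be sketched with a general-purpose compressor. The Python below computes the normalized compression distance (NCD) with `zlib`; this is a stand-in chosen for illustration and is not the expert model or the exact measure used in the study. The function names and the synthetic sequences are assumptions.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Size in bytes of the zlib-compressed input at maximum level."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


# Synthetic data: a random sequence, a lightly mutated copy of it,
# and an unrelated random sequence.
rng = random.Random(0)
seq_a = "".join(rng.choice("ACGT") for _ in range(2000))
seq_b = "".join("A" if i % 100 == 0 else base for i, base in enumerate(seq_a))
seq_c = "".join(rng.choice("ACGT") for _ in range(2000))

print(ncd(seq_a.encode(), seq_b.encode()))  # related pair
print(ncd(seq_a.encode(), seq_c.encode()))  # unrelated pair
```

    A matrix of such distances can be passed to any distance-based tree builder. Note that `zlib` only looks back 32 KB, so it is unsuitable for whole genomes; a compressor designed for DNA, as in the study, is needed at that scale.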

  19. First phylogenetic analysis of Ehrlichia canis in dogs and ticks from Mexico. Preliminary study

    Directory of Open Access Journals (Sweden)

    Carolina G. Sosa-Gutiérrez

    2016-09-01

    Objective. To characterize phylogenetically Ehrlichia canis in naturally infected dogs and in ticks, diagnosed by PCR and sequencing of the 16S rRNA gene, and to compare the isolates with those found in other American countries. Materials and methods. Blood samples were collected from 139 dogs that showed clinical manifestations suggestive of the disease and were infested with ticks; part of the 16S rRNA gene was sequenced and aligned with 17 sequences reported from American countries. Two phylogenetic trees were constructed, using the maximum likelihood and maximum parsimony methods. Results. E. canis was detected in 25/139 (18.0%) dogs and 29/139 (20.9%) ticks. The clinical manifestations were fever, fatigue, depression and vomiting. Rhipicephalus sanguineus, Dermacentor variabilis and Haemaphysalis leporis-palustris ticks were positive for E. canis. Phylogenetic analysis showed that the sequences from dogs and ticks in Mexico form a third group that diverges from the sequences from South America and the USA. Conclusions. This is the first phylogenetic analysis of E. canis in Mexico. The Mexican sequences differ from those reported in South America and the USA. This research lays the foundation for further study of genetic variability.

  20. A nuclear phylogenetic analysis: SNPs, indels and SSRs deliver new insights into the relationships in the ‘true citrus fruit trees’ group (Citrinae, Rutaceae) and the origin of cultivated species

    Science.gov (United States)

    Garcia-Lor, Andres; Curk, Franck; Snoussi-Trifa, Hager; Morillon, Raphael; Ancillo, Gema; Luro, François; Navarro, Luis; Ollitrault, Patrick

    2013-01-01

    Background and Aims Despite differences in morphology, the genera representing ‘true citrus fruit trees’ are sexually compatible, and their phylogenetic relationships remain unclear. Most of the important commercial ‘species’ of Citrus are believed to be of interspecific origin. By studying polymorphisms of 27 nuclear genes, the average molecular differentiation between species was estimated and some phylogenetic relationships between ‘true citrus fruit trees’ were clarified. Methods Sanger sequencing of PCR-amplified fragments from 18 genes involved in metabolite biosynthesis pathways and nine putative genes for salt tolerance was performed for 45 genotypes of Citrus and relatives of Citrus to mine single nucleotide polymorphisms (SNPs) and indel polymorphisms. Fifty nuclear simple sequence repeats (SSRs) were also analysed. Key Results A total of 16 238 kb of DNA was sequenced for each genotype, and 1097 single nucleotide polymorphisms (SNPs) and 50 indels were identified. These polymorphisms were more valuable than SSRs for inter-taxon differentiation. Nuclear phylogenetic analysis revealed that Citrus reticulata and Fortunella form a cluster that is differentiated from the clade that includes three other basic taxa of cultivated citrus (C. maxima, C. medica and C. micrantha). These results confirm the taxonomic subdivision between the subgenera Metacitrus and Archicitrus. A few genes displayed positive selection patterns within or between species, but most of them displayed neutral patterns. The phylogenetic inheritance patterns of the analysed genes were inferred for commercial Citrus spp. Conclusions Numerous molecular polymorphisms (SNPs and indels), which are potentially useful for the analysis of interspecific genetic structures, have been identified. The nuclear phylogenetic network for Citrus and its sexually compatible relatives was consistent with the geographical origins of these genera. The positive selection observed for a few genes will

  1. Stay True to the Sound of History: Philology, Phylogenetics and Information Engineering in Musicology

    Directory of Open Access Journals (Sweden)

    Sebastiano Verde

    2018-02-01

    This work investigates computational musicology for the study of tape music works tackling the problems concerning stemmatics. These philological problems have been analyzed with an innovative approach considering the peculiarities of audio tape recordings. The paper presents a phylogenetic reconstruction strategy that relies on digitizing the analyzed tapes and then converting each audio track into a two-dimensional spectrogram. This conversion allows adopting a set of computer vision tools to align and equalize different tracks in order to infer the most likely transformation that converts one track into another. In the presented approach, the main editing techniques, intentional and unintentional alterations and different configurations of a tape recorded are estimated in phylogeny analysis. The proposed solution presents a satisfying robustness to the adoption of the wrong reading setup together with a good reconstruction accuracy of the phylogenetic tree. The reconstructed dependencies proved to be correct or plausible in 90% of the experimental cases.

  2. Reconstruction of phylogenetic relationships in dermatomycete genus Trichophyton Malmsten 1848 based on ribosomal internal transcribed spacer region, partial 28S rRNA and beta-tubulin genes sequences.

    Science.gov (United States)

    Pchelin, Ivan M; Zlatogursky, Vasily V; Rudneva, Mariya V; Chilina, Galina A; Rezaei-Matehkolaei, Ali; Lavnikevich, Dmitry M; Vasilyeva, Natalya V; Taraskina, Anastasia E

    2016-09-01

    Trichophyton spp. are important causative agents of superficial mycoses. The phylogeny of the genus and accurate strain identification, based on the ribosomal ITS region sequencing, are still under development. The present work is aimed at (i) inferring the genus phylogeny from partial ITS, LSU and BT2 sequences and (ii) describing ribosomal ITS region polymorphism in 15 strains of Trichophyton interdigitale. We performed DNA sequence-based species identification and phylogenetic analysis on 48 strains belonging to the genus Trichophyton. Phylogenetic relationships were inferred by maximum likelihood and Bayesian methods on concatenated ITS, LSU and BT2 sequences. Ribosomal ITS region polymorphisms were assessed directly on the alignment. By phylogenetic reconstruction, we reveal major anthropophilic and zoophilic species clusters in the genus Trichophyton. We describe several sequences of the ITS region of T. interdigitale that do not fit in the traditional polymorphism scheme, and propose emendations to this scheme for discrimination between ITS sequence types in T. interdigitale. The new polymorphism scheme will allow inclusion of a wider spectrum of isolates while retaining its explanatory power. This scheme was also found to be partially congruent with the NTS typing technique. © 2016 Blackwell Verlag GmbH.

  3. Meaningful mediation analysis : Plausible causal inference and informative communication

    NARCIS (Netherlands)

    Pieters, Rik

    2017-01-01

    Statistical mediation analysis has become the technique of choice in consumer research to make causal inferences about the influence of a treatment on an outcome via one or more mediators. This tutorial aims to strengthen two weak links that impede statistical mediation analysis from reaching its

  4. Transcriptional and phylogenetic analysis of five complete ambystomatid salamander mitochondrial genomes.

    Science.gov (United States)

    Samuels, Amy K; Weisrock, David W; Smith, Jeramiah J; France, Katherine J; Walker, John A; Putta, Srikrishna; Voss, S Randal

    2005-04-11

    We report on a study that extended mitochondrial transcript information from a recent EST project to obtain complete mitochondrial genome sequence for 5 tiger salamander complex species (Ambystoma mexicanum, A. t. tigrinum, A. andersoni, A. californiense, and A. dumerilii). We describe, for the first time, aspects of mitochondrial transcription in a representative amphibian, and then use complete mitochondrial sequence data to examine salamander phylogeny at both deep and shallow levels of evolutionary divergence. The available mitochondrial ESTs for A. mexicanum (N=2481) and A. t. tigrinum (N=1205) provided 92% and 87% coverage of the mitochondrial genome, respectively. Complete mitochondrial sequences for all species were rapidly obtained by using long distance PCR and DNA sequencing. A number of genome structural characteristics (base pair length, base composition, gene number, gene boundaries, codon usage) were highly similar among all species and to other distantly related salamanders. Overall, mitochondrial transcription in Ambystoma approximated the pattern observed in other vertebrates. We inferred from the mapping of ESTs onto mtDNA that transcription occurs from both heavy and light strand promoters and continues around the entire length of the mtDNA, followed by post-transcriptional processing. However, the observation of many short transcripts corresponding to rRNA genes indicates that transcription may often terminate prematurely to bias transcription of rRNA genes; indeed an rRNA transcription termination signal sequence was observed immediately following the 16S rRNA gene. Phylogenetic analyses of salamander family relationships consistently grouped Ambystomatidae in a clade containing Cryptobranchidae and Hynobiidae, to the exclusion of Salamandridae. This robust result suggests a novel alternative hypothesis because previous studies have consistently identified Ambystomatidae and Salamandridae as closely related taxa. 
Phylogenetic analyses of tiger

  5. Phylogenetic analyses of Vitis (Vitaceae) based on complete chloroplast genome sequences: effects of taxon sampling and phylogenetic methods on resolving relationships among rosids.

    Science.gov (United States)

    Jansen, Robert K; Kaittanis, Charalambos; Saski, Christopher; Lee, Seung-Bum; Tomkins, Jeffrey; Alverson, Andrew J; Daniell, Henry

    2006-04-09

    The Vitaceae (grape) is an economically important family of angiosperms whose phylogenetic placement is currently unresolved. Recent phylogenetic analyses based on one to several genes have suggested several alternative placements of this family, including sister to Caryophyllales, asterids, Saxifragales, Dilleniaceae or to rest of rosids, though support for these different results has been weak. There has been a recent interest in using complete chloroplast genome sequences for resolving phylogenetic relationships among angiosperms. These studies have clarified relationships among several major lineages but they have also emphasized the importance of taxon sampling and the effects of different phylogenetic methods for obtaining accurate phylogenies. We sequenced the complete chloroplast genome of Vitis vinifera and used these data to assess relationships among 27 angiosperms, including nine taxa of rosids. The Vitis vinifera chloroplast genome is 160,928 bp in length, including a pair of inverted repeats of 26,358 bp that are separated by small and large single copy regions of 19,065 bp and 89,147 bp, respectively. The gene content and order of Vitis is identical to many other unrearranged angiosperm chloroplast genomes, including tobacco. Phylogenetic analyses using maximum parsimony and maximum likelihood were performed on DNA sequences of 61 protein-coding genes for two datasets with 28 or 29 taxa, including eight or nine taxa from four of the seven currently recognized major clades of rosids. Parsimony and likelihood phylogenies of both data sets provide strong support for the placement of Vitaceae as sister to the remaining rosids. However, the position of the Myrtales and support for the monophyly of the eurosid I clade differs between the two data sets and the two methods of analysis. In parsimony analyses, the inclusion of Gossypium is necessary to obtain trees that support the monophyly of the eurosid I clade. However, maximum likelihood analyses place

  6. Phylogenetic analyses of Vitis (Vitaceae) based on complete chloroplast genome sequences: effects of taxon sampling and phylogenetic methods on resolving relationships among rosids

    Directory of Open Access Journals (Sweden)

    Alverson Andrew J

    2006-04-01

    Background The Vitaceae (grape) is an economically important family of angiosperms whose phylogenetic placement is currently unresolved. Recent phylogenetic analyses based on one to several genes have suggested several alternative placements of this family, including sister to Caryophyllales, asterids, Saxifragales, Dilleniaceae or to rest of rosids, though support for these different results has been weak. There has been a recent interest in using complete chloroplast genome sequences for resolving phylogenetic relationships among angiosperms. These studies have clarified relationships among several major lineages but they have also emphasized the importance of taxon sampling and the effects of different phylogenetic methods for obtaining accurate phylogenies. We sequenced the complete chloroplast genome of Vitis vinifera and used these data to assess relationships among 27 angiosperms, including nine taxa of rosids. Results The Vitis vinifera chloroplast genome is 160,928 bp in length, including a pair of inverted repeats of 26,358 bp that are separated by small and large single copy regions of 19,065 bp and 89,147 bp, respectively. The gene content and order of Vitis is identical to many other unrearranged angiosperm chloroplast genomes, including tobacco. Phylogenetic analyses using maximum parsimony and maximum likelihood were performed on DNA sequences of 61 protein-coding genes for two datasets with 28 or 29 taxa, including eight or nine taxa from four of the seven currently recognized major clades of rosids. Parsimony and likelihood phylogenies of both data sets provide strong support for the placement of Vitaceae as sister to the remaining rosids. However, the position of the Myrtales and support for the monophyly of the eurosid I clade differs between the two data sets and the two methods of analysis. In parsimony analyses, the inclusion of Gossypium is necessary to obtain trees that support the monophyly of the eurosid I clade

  7. A phylogenetic transform enhances analysis of compositional microbiota data.

    Science.gov (United States)

    Silverman, Justin D; Washburne, Alex D; Mukherjee, Sayan; David, Lawrence A

    2017-02-15

    Surveys of microbial communities (microbiota), typically measured as relative abundance of species, have illustrated the importance of these communities in human health and disease. Yet, statistical artifacts commonly plague the analysis of relative abundance data. Here, we introduce the PhILR transform, which incorporates microbial evolutionary models with the isometric log-ratio transform to allow off-the-shelf statistical tools to be safely applied to microbiota surveys. We demonstrate that analyses of community-level structure can be applied to PhILR transformed data with performance on benchmarks rivaling or surpassing standard tools. Additionally, by decomposing distance in the PhILR transformed space, we identified neighboring clades that may have adapted to distinct human body sites. Decomposing variance revealed that covariation of bacterial clades within human body sites increases with phylogenetic relatedness. Together, these findings illustrate how the PhILR transform combines statistical and phylogenetic models to overcome compositional data challenges and enable evolutionary insights relevant to microbial communities.
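
    The isometric log-ratio transform mentioned in this abstract builds each coordinate as a "balance" between two groups of parts, and in a phylogenetic setting the two groups are the tips on either side of an internal node. The Python below sketches a single unweighted balance. It is an illustration under assumptions and not the PhILR package itself: PhILR additionally supports taxon and branch-length weights, and the function names and abundance values here are hypothetical.

```python
import math


def geometric_mean(values):
    """Geometric mean of strictly positive numbers."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def ilr_balance(numerator, denominator):
    """Unweighted ILR balance between two groups of abundances.

    balance = sqrt(r * s / (r + s)) * ln(gm(numerator) / gm(denominator)),
    where r and s are the group sizes and gm is the geometric mean.
    Inputs must be strictly positive, so zero counts need a pseudocount
    (or another zero-handling strategy) before this step.
    """
    r, s = len(numerator), len(denominator)
    scale = math.sqrt(r * s / (r + s))
    return scale * math.log(geometric_mean(numerator) / geometric_mean(denominator))


# Hypothetical relative abundances for the tips below one internal node:
# two taxa on the left branch, one taxon on the right branch.
left_clade = [0.4, 0.4]
right_clade = [0.2]
print(ilr_balance(left_clade, right_clade))
```

    Because the balance depends only on ratios, multiplying every abundance by the same constant leaves it unchanged, which is what allows standard statistical tools to be applied to relative abundance data.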

  8. Inference algorithms and learning theory for Bayesian sparse factor analysis

    International Nuclear Information System (INIS)

    Rattray, Magnus; Sharp, Kevin; Stegle, Oliver; Winn, John

    2009-01-01

    Bayesian sparse factor analysis has many applications; for example, it has been applied to the problem of inferring a sparse regulatory network from gene expression data. We describe a number of inference algorithms for Bayesian sparse factor analysis using a slab and spike mixture prior. These include well-established Markov chain Monte Carlo (MCMC) and variational Bayes (VB) algorithms as well as a novel hybrid of VB and Expectation Propagation (EP). For the case of a single latent factor we derive a theory for learning performance using the replica method. We compare the MCMC and VB/EP algorithm results with simulated data to the theoretical prediction. The results for MCMC agree closely with the theory as expected. Results for VB/EP are slightly sub-optimal but show that the new algorithm is effective for sparse inference. In large-scale problems MCMC is infeasible due to computational limitations and the VB/EP algorithm then provides a very useful computationally efficient alternative.
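
    The sparsity in this model comes from the mixture prior: each factor loading is either exactly zero (the spike) or drawn from a broad distribution (the slab). The Python below sketches drawing loadings from such a prior with a Gaussian slab. It illustrates the prior only and none of the MCMC, VB or EP inference algorithms; the function name, parameter names and values are assumptions introduced here.

```python
import random


def sample_spike_and_slab(n, inclusion_prob, slab_sd, rng):
    """Draw n loadings from a spike-and-slab prior.

    Each loading is a zero-mean Gaussian with standard deviation slab_sd
    with probability inclusion_prob, and exactly zero otherwise.
    """
    loadings = []
    for _ in range(n):
        if rng.random() < inclusion_prob:
            loadings.append(rng.gauss(0.0, slab_sd))
        else:
            loadings.append(0.0)
    return loadings


# Hypothetical setting: 20 genes, about 10% of them linked to the factor.
rng = random.Random(1)
loadings = sample_spike_and_slab(20, 0.1, 1.0, rng)
print(sum(1 for w in loadings if w != 0.0), "non-zero loadings out of", len(loadings))
```

    Inference then amounts to estimating, for every loading, the posterior probability that it came from the slab rather than the spike.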

  9. Inference algorithms and learning theory for Bayesian sparse factor analysis

    Energy Technology Data Exchange (ETDEWEB)

    Rattray, Magnus; Sharp, Kevin [School of Computer Science, University of Manchester, Manchester M13 9PL (United Kingdom); Stegle, Oliver [Max-Planck-Institute for Biological Cybernetics, Tuebingen (Germany); Winn, John, E-mail: magnus.rattray@manchester.ac.u [Microsoft Research Cambridge, Roger Needham Building, Cambridge, CB3 0FB (United Kingdom)

    2009-12-01

    Bayesian sparse factor analysis has many applications; for example, it has been applied to the problem of inferring a sparse regulatory network from gene expression data. We describe a number of inference algorithms for Bayesian sparse factor analysis using a slab and spike mixture prior. These include well-established Markov chain Monte Carlo (MCMC) and variational Bayes (VB) algorithms as well as a novel hybrid of VB and Expectation Propagation (EP). For the case of a single latent factor we derive a theory for learning performance using the replica method. We compare the MCMC and VB/EP algorithm results with simulated data to the theoretical prediction. The results for MCMC agree closely with the theory as expected. Results for VB/EP are slightly sub-optimal but show that the new algorithm is effective for sparse inference. In large-scale problems MCMC is infeasible due to computational limitations and the VB/EP algorithm then provides a very useful computationally efficient alternative.

  10. From learning taxonomies to phylogenetic learning: Integration of 16S rRNA gene data into FAME-based bacterial classification

    Science.gov (United States)

    2010-01-01

    Background: Machine learning techniques have been shown to improve bacterial species classification based on fatty acid methyl ester (FAME) data. Nonetheless, FAME analysis has a limited resolution for discrimination of bacteria at the species level. In this paper, we approach the species classification problem from a taxonomic point of view. Such a taxonomy or tree is typically obtained by applying clustering algorithms on FAME data or on 16S rRNA gene data. The knowledge gained from the tree can then be used to evaluate FAME-based classifiers, resulting in a novel framework for bacterial species classification. Results: In view of learning in a taxonomic framework, we consider two types of trees. First, a FAME tree is constructed with a supervised divisive clustering algorithm. Subsequently, based on 16S rRNA gene sequence analysis, phylogenetic trees are inferred by the NJ and UPGMA methods. In this second approach, the species classification problem is based on the combination of two different types of data. Herein, 16S rRNA gene sequence data is used for phylogenetic tree inference and the corresponding binary tree splits are learned based on FAME data. We call this learning approach 'phylogenetic learning'. Supervised Random Forest models are developed to train the classification tasks in a stratified cross-validation setting. In this way, better classification results are obtained for species that are typically hard to distinguish by a single or flat multi-class classification model. Conclusions: FAME-based bacterial species classification is successfully evaluated in a taxonomic framework. Although the proposed approach does not improve the overall accuracy compared to flat multi-class classification, it has some distinct advantages. First, it has better capabilities for distinguishing species on which flat multi-class classification fails. Secondly, the hierarchical classification structure makes it easy to evaluate and visualize the resolution of FAME data for

  11. From learning taxonomies to phylogenetic learning: Integration of 16S rRNA gene data into FAME-based bacterial classification

    Directory of Open Access Journals (Sweden)

    Dawyndt Peter

    2010-01-01

    Full Text Available Abstract Background: Machine learning techniques have been shown to improve bacterial species classification based on fatty acid methyl ester (FAME) data. Nonetheless, FAME analysis has a limited resolution for discrimination of bacteria at the species level. In this paper, we approach the species classification problem from a taxonomic point of view. Such a taxonomy or tree is typically obtained by applying clustering algorithms on FAME data or on 16S rRNA gene data. The knowledge gained from the tree can then be used to evaluate FAME-based classifiers, resulting in a novel framework for bacterial species classification. Results: In view of learning in a taxonomic framework, we consider two types of trees. First, a FAME tree is constructed with a supervised divisive clustering algorithm. Subsequently, based on 16S rRNA gene sequence analysis, phylogenetic trees are inferred by the NJ and UPGMA methods. In this second approach, the species classification problem is based on the combination of two different types of data. Herein, 16S rRNA gene sequence data is used for phylogenetic tree inference and the corresponding binary tree splits are learned based on FAME data. We call this learning approach 'phylogenetic learning'. Supervised Random Forest models are developed to train the classification tasks in a stratified cross-validation setting. In this way, better classification results are obtained for species that are typically hard to distinguish by a single or flat multi-class classification model. Conclusions: FAME-based bacterial species classification is successfully evaluated in a taxonomic framework. Although the proposed approach does not improve the overall accuracy compared to flat multi-class classification, it has some distinct advantages. First, it has better capabilities for distinguishing species on which flat multi-class classification fails. Secondly, the hierarchical classification structure makes it easy to evaluate and visualize the

  12. From learning taxonomies to phylogenetic learning: integration of 16S rRNA gene data into FAME-based bacterial classification.

    Science.gov (United States)

    Slabbinck, Bram; Waegeman, Willem; Dawyndt, Peter; De Vos, Paul; De Baets, Bernard

    2010-01-30

    Machine learning techniques have been shown to improve bacterial species classification based on fatty acid methyl ester (FAME) data. Nonetheless, FAME analysis has a limited resolution for discrimination of bacteria at the species level. In this paper, we approach the species classification problem from a taxonomic point of view. Such a taxonomy or tree is typically obtained by applying clustering algorithms on FAME data or on 16S rRNA gene data. The knowledge gained from the tree can then be used to evaluate FAME-based classifiers, resulting in a novel framework for bacterial species classification. In view of learning in a taxonomic framework, we consider two types of trees. First, a FAME tree is constructed with a supervised divisive clustering algorithm. Subsequently, based on 16S rRNA gene sequence analysis, phylogenetic trees are inferred by the NJ and UPGMA methods. In this second approach, the species classification problem is based on the combination of two different types of data. Herein, 16S rRNA gene sequence data is used for phylogenetic tree inference and the corresponding binary tree splits are learned based on FAME data. We call this learning approach 'phylogenetic learning'. Supervised Random Forest models are developed to train the classification tasks in a stratified cross-validation setting. In this way, better classification results are obtained for species that are typically hard to distinguish by a single or flat multi-class classification model. FAME-based bacterial species classification is successfully evaluated in a taxonomic framework. Although the proposed approach does not improve the overall accuracy compared to flat multi-class classification, it has some distinct advantages. First, it has better capabilities for distinguishing species on which flat multi-class classification fails. Secondly, the hierarchical classification structure makes it easy to evaluate and visualize the resolution of FAME data for the discrimination of bacterial

  13. Phylogenetic position of Loricifera inferred from nearly complete 18S and 28S rRNA gene sequences

    OpenAIRE

    Yamasaki, Hiroshi; Fujimoto, Shinta; Miyazaki, Katsumi

    2015-01-01

    Background Loricifera is an enigmatic metazoan phylum; its morphology appeared to place it with Priapulida and Kinorhyncha in the group Scalidophora which, along with Nematoida (Nematoda and Nematomorpha), comprised the group Cycloneuralia. Scarce molecular data have suggested an alternative phylogenetic hypothesis, that the phylum Loricifera is a sister taxon to Nematomorpha, although the actual phylogenetic position of the phylum remains unclear. Methods Ecdysozoan phylogeny was reconstruct...

  14. Identification of putative orthologous genes for the phylogenetic reconstruction of temperate woody bamboos (Poaceae: Bambusoideae).

    Science.gov (United States)

    Zhang, Li-Na; Zhang, Xian-Zhi; Zhang, Yu-Xiao; Zeng, Chun-Xia; Ma, Peng-Fei; Zhao, Lei; Guo, Zhen-Hua; Li, De-Zhu

    2014-09-01

    The temperate woody bamboos (Arundinarieae) are highly diverse in morphology but lack a substantial amount of genetic variation. The taxonomy of this lineage is intractable, and the relationships within the tribe have not been well resolved. Recent studies indicated that this tribe could have a complex evolutionary history. Although phylogenetic studies of the tribe have been carried out, most of these phylogenetic reconstructions were based on plastid data, which provide lower phylogenetic resolution compared with nuclear data. In this study, we intended to identify a set of desirable nuclear genes for resolving the phylogeny of the temperate woody bamboos. Using two different methodologies, we identified 209 and 916 genes, respectively, as putative single copy orthologous genes. A total of 112 genes was successfully amplified and sequenced by next-generation sequencing technologies in five species sampled from the tribe. As most of the genes exhibited intra-individual allele heterozygotes, we investigated phylogenetic utility by reconstructing the phylogeny based on individual genes. Discordance among gene trees was observed and, to resolve the conflict, we performed a range of analyses using BUCKy and HybTree. While caution should be taken when inferring a phylogeny from multiple conflicting genes, our analysis indicated that 74 of the 112 investigated genes are potential markers for resolving the phylogeny of the temperate woody bamboos. © 2014 John Wiley & Sons Ltd.

  15. The evolutionary history of ferns inferred from 25 low-copy nuclear genes.

    Science.gov (United States)

    Rothfels, Carl J; Li, Fay-Wei; Sigel, Erin M; Huiet, Layne; Larsson, Anders; Burge, Dylan O; Ruhsam, Markus; Deyholos, Michael; Soltis, Douglas E; Stewart, C Neal; Shaw, Shane W; Pokorny, Lisa; Chen, Tao; dePamphilis, Claude; DeGironimo, Lisa; Chen, Li; Wei, Xiaofeng; Sun, Xiao; Korall, Petra; Stevenson, Dennis W; Graham, Sean W; Wong, Gane K-S; Pryer, Kathleen M

    2015-07-01

    • Understanding fern (monilophyte) phylogeny and its evolutionary timescale is critical for broad investigations of the evolution of land plants, and for providing the point of comparison necessary for studying the evolution of the fern sister group, seed plants. Molecular phylogenetic investigations have revolutionized our understanding of fern phylogeny; however, to date, these studies have relied almost exclusively on plastid data. • Here we take a curated phylogenomics approach to infer the first broad fern phylogeny from multiple nuclear loci, by combining broad taxon sampling (73 ferns and 12 outgroup species) with focused character sampling (25 loci comprising 35877 bp), along with rigorous alignment, orthology inference and model selection. • Our phylogeny corroborates some earlier inferences and provides novel insights; in particular, we find strong support for Equisetales as sister to the rest of ferns, Marattiales as sister to leptosporangiate ferns, and Dennstaedtiaceae as sister to the eupolypods. Our divergence-time analyses reveal that divergences among the extant fern orders all occurred prior to ∼200 MYA. Finally, our species-tree inferences are congruent with analyses of concatenated data, but generally with lower support. Those cases where species-tree support values are higher than expected involve relationships that have been supported by smaller plastid datasets, suggesting that deep coalescence may be reducing support from the concatenated nuclear data. • Our study demonstrates the utility of a curated phylogenomics approach to inferring fern phylogeny, and highlights the need to consider underlying data characteristics, along with data quantity, in phylogenetic studies. © 2015 Botanical Society of America, Inc.

  16. EM for phylogenetic topology reconstruction on nonhomogeneous data.

    Science.gov (United States)

    Ibáñez-Marcelo, Esther; Casanellas, Marta

    2014-06-17

    The reconstruction of the phylogenetic tree topology of four taxa is still one of the main challenges in phylogenetics. Its difficulties lie in considering evolutionary models that are not too restrictive, and in correctly dealing with the long-branch attraction problem. The correct reconstruction of 4-taxon trees is crucial for making quartet-based methods work and being able to recover large phylogenies. We adapt the well-known expectation-maximization algorithm to evolutionary Markov models on phylogenetic 4-taxon trees. We then use this algorithm to estimate the substitution parameters, compute the corresponding likelihood, and infer the most likely quartet. In this paper we consider an expectation-maximization method for maximizing the likelihood of (time nonhomogeneous) evolutionary Markov models on trees. We study its success in reconstructing 4-taxon topologies and its performance as an input method in quartet-based phylogenetic reconstruction methods such as QFIT and QuartetSuite. Our results show that the method proposed here outperforms neighbor-joining and the usual (time-homogeneous continuous-time) maximum likelihood methods on 4-leaved trees with among-lineage instantaneous rate heterogeneity, and performs similarly to the usual continuous-time maximum likelihood when the data satisfy the assumptions of both methods. The method presented in this paper is well suited for reconstructing the topology of any number of taxa via quartet-based methods and is highly accurate, especially for largely divergent trees and time nonhomogeneous data.

  17. Conus pennaceus : a phylogenetic analysis of the Mozambican ...

    African Journals Online (AJOL)

    The genus Conus has over 500 species and is the most species-rich taxon of marine invertebrates. Based on mitochondrial DNA, this study focuses on the phylogenetics of Conus, particularly the pennaceus complex collected along the Mozambican coast. Phylogenetic trees based on both the 16S and the 12S ribosomal ...

  18. Phylogenetic Analysis of Dengue Virus in Bangkalan, Madura Island, East Java Province, Indonesia.

    Science.gov (United States)

    Sucipto, Teguh Hari; Kotaki, Tomohiro; Mulyatno, Kris Cahyo; Churrotin, Siti; Labiqah, Amaliah; Soegijanto, Soegeng; Kameoka, Masanori

    2018-01-01

    Dengue virus (DENV) infection is a major health issue in tropical and subtropical areas. Indonesia is one of the biggest dengue endemic countries in the world. In the present study, the phylogenetic analysis of DENV in Bangkalan, Madura Island, Indonesia, was performed in order to obtain a clearer understanding of its dynamics in this country. A total of 359 blood samples from dengue-suspected patients were collected between 2012 and 2014. Serotyping was conducted using a multiplex Reverse Transcriptase-Polymerase Chain Reaction and a phylogenetic analysis of E gene sequences was performed using the Bayesian Markov chain Monte Carlo (MCMC) method. 17 out of 359 blood samples (4.7%) were positive for the isolation of DENV. Serotyping and the phylogenetic analysis revealed the predominance of DENV-1 genotype I (9/17, 52.9%), followed by DENV-2 Cosmopolitan type (7/17, 41.2%) and DENV-3 genotype I (1/17, 5.9%). DENV-4 was not isolated. The Madura Island isolates showed high nucleotide similarity to other Indonesian isolates, indicating frequent virus circulation in Indonesia. The results of the present study highlight the importance of continuous viral surveillance in dengue endemic areas in order to obtain a clearer understanding of the dynamics of DENV in Indonesia.

  19. Efficient FPT Algorithms for (Strict) Compatibility of Unrooted Phylogenetic Trees.

    Science.gov (United States)

    Baste, Julien; Paul, Christophe; Sau, Ignasi; Scornavacca, Celine

    2017-04-01

    In phylogenetics, a central problem is to infer the evolutionary relationships between a set of species X; these relationships are often depicted via a phylogenetic tree (a tree having its leaves labeled bijectively by elements of X and without degree-2 nodes) called the "species tree." One common approach for reconstructing a species tree consists in first constructing several phylogenetic trees from primary data (e.g., DNA sequences originating from some species in X), and then constructing a single phylogenetic tree maximizing the "concordance" with the input trees. The obtained tree is our estimation of the species tree and, when the input trees are defined on overlapping (but not identical) sets of labels, is called a "supertree." In this paper, we focus on two problems that are central when combining phylogenetic trees into a supertree: the compatibility and the strict compatibility problems for unrooted phylogenetic trees. These problems are strongly related, respectively, to the notions of "containing as a minor" and "containing as a topological minor" in the graph community. Both problems are known to be fixed parameter tractable in the number of input trees k, by using their expressibility in monadic second-order logic and a reduction to graphs of bounded treewidth. Motivated by the fact that the dependency on k of these algorithms is prohibitively large, we give the first explicit dynamic programming algorithms for solving these problems, both running in time [Formula: see text], where n is the total size of the input.

  20. Recursive algorithms for phylogenetic tree counting.

    Science.gov (United States)

    Gavryushkina, Alexandra; Welch, David; Drummond, Alexei J

    2013-10-28

    In Bayesian phylogenetic inference we are interested in distributions over a space of trees. The number of trees in a tree space is an important characteristic of the space and is useful for specifying prior distributions. When all samples come from the same time point and no prior information is available on divergence times, the tree counting problem is easy. However, when fossil evidence is used in the inference to constrain the tree or data are sampled serially, new tree spaces arise and counting the number of trees is more difficult. We describe an algorithm that is polynomial in the number of sampled individuals for counting resolutions of a constraint tree, assuming that the number of constraints is fixed. We generalise this algorithm to counting resolutions of a fully ranked constraint tree. We describe a quadratic algorithm for counting the number of possible fully ranked trees on n sampled individuals. We introduce a new type of tree, called a fully ranked tree with sampled ancestors, and describe a cubic time algorithm for counting the number of such trees on n sampled individuals. These algorithms should be employed for Bayesian Markov chain Monte Carlo inference when fossil data are included or data are serially sampled.
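    For the "easy" case the abstract mentions (all samples from the same time point, no constraints), closed-form counts are standard results: the number of rooted binary leaf-labelled topologies on n leaves is (2n-3)!!, and the number of fully ranked trees is n!(n-1)!/2^(n-1). The Python sketch below computes both; the function names are illustrative, and the paper's algorithms for constrained, serially sampled, or sampled-ancestor trees are not reproduced.

```python
from math import factorial


def count_rooted_topologies(n: int) -> int:
    """Rooted binary leaf-labelled tree topologies on n leaves: (2n-3)!!."""
    if n < 1:
        raise ValueError("n must be at least 1")
    count = 1
    for k in range(3, 2 * n - 2, 2):  # odd factors 3, 5, ..., 2n-3
        count *= k
    return count


def count_ranked_trees(n: int) -> int:
    """Fully ranked trees on n contemporaneous leaves.

    Going back in time, each coalescence picks one of C(k, 2) pairs among the
    k remaining lineages, so the count is the product of C(k, 2) for k = 2..n,
    which equals n! (n-1)! / 2^(n-1).
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    count = 1
    for k in range(2, n + 1):
        count *= k * (k - 1) // 2
    return count


if __name__ == "__main__":
    for n in range(2, 7):
        print(n, count_rooted_topologies(n), count_ranked_trees(n))
```

    The gap between the two counts grows quickly with n, which is one reason the choice of tree space matters when specifying a prior.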

  1. Phylogenetic diversity and biodiversity indices on phylogenetic networks.

    Science.gov (United States)

    Wicke, Kristina; Fischer, Mareike

    2018-04-01

    In biodiversity conservation it is often necessary to prioritize the species to conserve. Existing approaches to prioritization, e.g. the Fair Proportion Index and the Shapley Value, are based on phylogenetic trees and rank species according to their contribution to overall phylogenetic diversity. However, in many cases evolution is not treelike and thus, phylogenetic networks have been developed as a generalization of phylogenetic trees, allowing for the representation of non-treelike evolutionary events, such as hybridization. Here, we extend the concepts of phylogenetic diversity and phylogenetic diversity indices from phylogenetic trees to phylogenetic networks. On the one hand, we consider the treelike content of a phylogenetic network, e.g. the (multi)set of phylogenetic trees displayed by a network and the so-called lowest stable ancestor tree associated with it. On the other hand, we derive the phylogenetic diversity of subsets of taxa and biodiversity indices directly from the internal structure of the network. We consider both approaches that are independent of so-called inheritance probabilities as well as approaches that explicitly incorporate these probabilities. Furthermore, we introduce our software package NetDiversity, which is implemented in Perl and allows for the calculation of all generalized measures of phylogenetic diversity and generalized phylogenetic diversity indices established in this note that are independent of inheritance probabilities. We apply our methods to a phylogenetic network representing the evolutionary relationships among swordtails and platyfishes (Xiphophorus: Poeciliidae), a group of species characterized by widespread hybridization. Copyright © 2018 Elsevier Inc. All rights reserved.
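    To make the tree-based starting point concrete, here is a minimal Python sketch of the Fair Proportion Index on a rooted tree: each branch length is shared equally among the leaves descending from that branch, and a leaf's index is the sum of its shares along the path to the root. The dictionary-based tree encoding and the function name are assumptions for illustration; the network generalizations described in the abstract and the NetDiversity package itself are not reproduced.

```python
def fair_proportion(parent, length):
    """Fair Proportion Index for every leaf of a rooted tree.

    parent: dict mapping each node to its parent (the root maps to None).
    length: dict mapping each non-root node to the length of the edge above it.
    """
    internal = {p for p in parent.values() if p is not None}
    leaves = [v for v in parent if v not in internal]

    # Number of leaves below each edge (an edge is identified by its lower node).
    below = {v: 0 for v in parent}
    for leaf in leaves:
        v = leaf
        while v is not None:
            below[v] += 1
            v = parent[v]

    index = {}
    for leaf in leaves:
        total, v = 0.0, leaf
        while parent[v] is not None:
            total += length[v] / below[v]  # equal share of this branch
            v = parent[v]
        index[leaf] = total
    return index


if __name__ == "__main__":
    # Toy tree ((A:2, B:2):1, C:3) with root "r" and internal node "u".
    parent = {"r": None, "u": "r", "C": "r", "A": "u", "B": "u"}
    length = {"u": 1.0, "C": 3.0, "A": 2.0, "B": 2.0}
    print(fair_proportion(parent, length))
```

    On a tree the indices sum to the total branch length, i.e. the phylogenetic diversity of the full taxon set, which is a convenient sanity check.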

  2. Bayesian phylogeny analysis via stochastic approximation Monte Carlo

    KAUST Repository

    Cheon, Sooyoung; Liang, Faming

    2009-01-01

    in simulating from the posterior distribution of phylogenetic trees, rendering the inference ineffective. In this paper, we apply an advanced Monte Carlo algorithm, the stochastic approximation Monte Carlo algorithm, to Bayesian phylogeny analysis. Our method

  3. Whole genome sequence phylogenetic analysis of four Mexican rabies viruses isolated from cattle.

    Science.gov (United States)

    Bárcenas-Reyes, I; Loza-Rubio, E; Cantó-Alarcón, G J; Luna-Cozar, J; Enríquez-Vázquez, A; Barrón-Rodríguez, R J; Milián-Suazo, F

    2017-08-01

    Phylogenetic analysis of the rabies virus in molecular epidemiology has been traditionally performed on partial sequences of the genome, such as the N, G, and P genes; however, that approach raises concerns about the discriminatory power compared to whole genome sequencing. In this study we characterized four strains of the rabies virus isolated from cattle in Querétaro, Mexico by comparing the whole genome sequence to that of strains from the American, European and Asian continents. Four cattle brain samples positive for rabies and characterized as AgV11, genotype 1, were used in the study. A cDNA sequence was generated by reverse transcription PCR (RT-PCR) using oligo dT. cDNA samples were sequenced on an Illumina NextSeq 500 platform. The phylogenetic analysis was performed with MEGA 6.0. Minimum evolution phylogenetic trees were constructed with the Neighbor-Joining method and bootstrapped with 1000 replicates. Three large and seven small clusters were formed with the 26 sequences used. The largest cluster grouped strains from different species in South America: Brazil and French Guiana. The second cluster grouped five strains from Mexico. A Mexican strain reported in a different study was highly related to our four strains, suggesting a common source of infection. The phylogenetic analysis shows that the type of host is different for the different regions in the American Continent; rabies is more related to bats. It was concluded that the rabies virus in central Mexico is genetically stable and that it is transmitted by the vampire bat Desmodus rotundus. Copyright © 2017 Elsevier Ltd. All rights reserved.

  4. Inferring the palaeoenvironment of ancient bacteria on the basis of resurrected proteins

    Science.gov (United States)

    Gaucher, Eric A.; Thomson, J. Michael; Burgan, Michelle F.; Benner, Steven A.

    2003-01-01

    Features of the physical environment surrounding an ancestral organism can be inferred by reconstructing sequences of ancient proteins made by those organisms, resurrecting these proteins in the laboratory, and measuring their properties. Here, we resurrect candidate sequences for elongation factors of the Tu family (EF-Tu) found at ancient nodes in the bacterial evolutionary tree, and measure their activities as a function of temperature. The ancient EF-Tu proteins have temperature optima of 55-65 degrees C. This value seems to be robust with respect to uncertainties in the ancestral reconstruction. This suggests that the ancient bacteria that hosted these particular genes were thermophiles, and neither hyperthermophiles nor mesophiles. This conclusion can be compared and contrasted with inferences drawn from an analysis of the lengths of branches in trees joining proteins from contemporary bacteria, the distribution of thermophily in derived bacterial lineages, the inferred G + C content of ancient ribosomal RNA, and the geological record combined with assumptions concerning molecular clocks. The study illustrates the use of experimental palaeobiochemistry and assumptions about deep phylogenetic relationships between bacteria to explore the character of ancient life.

  5. Ultrafast Approximation for Phylogenetic Bootstrap

    NARCIS (Netherlands)

    Bui Quang Minh; Nguyen, Thi; von Haeseler, Arndt

    Nonparametric bootstrap has been a widely used tool in phylogenetic analysis to assess the clade support of phylogenetic trees. However, with the rapidly growing amount of data, this task remains a computational bottleneck. Recently, approximation methods such as the RAxML rapid bootstrap (RBS) and

  6. Detection of Horizontal Gene Transfers from Phylogenetic Comparisons

    Science.gov (United States)

    Pylro, Victor Satler; Vespoli, Luciano de Souza; Duarte, Gabriela Frois; Yotoko, Karla Suemy Clemente

    2012-01-01

    Bacterial phylogenies have become one of the most important challenges for microbial ecology. This field started in the mid-1970s with the aim of using the sequence of the small subunit ribosomal RNA (16S) tool to infer bacterial phylogenies. Phylogenetic hypotheses based on other sequences usually give conflicting topologies that reveal different evolutionary histories, which in some cases may be the result of horizontal gene transfer events. Currently, one of the major goals of molecular biology is to understand the role that horizontal gene transfer plays in species adaptation and evolution. In this work, we compared the phylogenetic tree based on 16S with the tree based on dszC, a gene involved in the cleavage of carbon-sulfur bonds. Bacteria of several genera perform this survival task when living in environments lacking free mineral sulfur. The biochemical pathway of the desulphurization process was extensively studied due to its economic importance, since this step is expensive and indispensable in fuel production. Our results clearly show that horizontal gene transfer events could be detected using common phylogenetic methods with gene sequences obtained from public sequence databases. PMID:22675653

  7. Phylogenetic affinity of tree shrews to Glires is attributed to fast evolution rate.

    Science.gov (United States)

    Lin, Jiannan; Chen, Guangfeng; Gu, Liang; Shen, Yuefeng; Zheng, Meizhu; Zheng, Weisheng; Hu, Xinjie; Zhang, Xiaobai; Qiu, Yu; Liu, Xiaoqing; Jiang, Cizhong

    2014-02-01

    Previous phylogenetic analyses have led to incongruent evolutionary relationships between tree shrews and other suborders of Euarchontoglires. What caused the incongruence remains elusive. In this study, we identified 6845 orthologous genes among seventeen placental mammals. Tree shrews and Primates were monophyletic in the phylogenetic trees derived from the first and/or second codon positions whereas tree shrews and Glires formed a monophyly in the trees derived from the third or all codon positions. The same topology was obtained in the phylogeny inference using the slowly and fast evolving genes, respectively. This incongruence was likely attributed to the fast substitution rate in tree shrews and Glires. Notably, sequence GC content alone was not informative to resolve the controversial phylogenetic relationships between tree shrews, Glires, and Primates. Finally, estimation of the confidence of the tree selection strongly supported the phylogenetic affiliation of tree shrews to Primates as a monophyly. Copyright © 2013 Elsevier Inc. All rights reserved.

  8. treespace: Statistical exploration of landscapes of phylogenetic trees.

    Science.gov (United States)

    Jombart, Thibaut; Kendall, Michelle; Almagro-Garcia, Jacob; Colijn, Caroline

    2017-11-01

    The increasing availability of large genomic data sets as well as the advent of Bayesian phylogenetics facilitates the investigation of phylogenetic incongruence, which can result in the impossibility of representing phylogenetic relationships using a single tree. While sometimes considered as a nuisance, phylogenetic incongruence can also reflect meaningful biological processes as well as relevant statistical uncertainty, both of which can yield valuable insights in evolutionary studies. We introduce a new tool for investigating phylogenetic incongruence through the exploration of phylogenetic tree landscapes. Our approach, implemented in the R package treespace, combines tree metrics and multivariate analysis to provide low-dimensional representations of the topological variability in a set of trees, which can be used for identifying clusters of similar trees and group-specific consensus phylogenies. treespace also provides a user-friendly web interface for interactive data analysis and is integrated alongside existing standards for phylogenetics. It fills a gap in the current phylogenetics toolbox in R and will facilitate the investigation of phylogenetic results. © 2017 The Authors. Molecular Ecology Resources Published by John Wiley & Sons Ltd.
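    The first half of the approach described above (a tree metric applied to every pair of trees) can be sketched in a few lines of Python. The example uses the Robinson-Foulds distance on rooted trees, chosen here only because it is simple to state; it is not claimed to be the metric treespace uses by default, and the nested-tuple tree encoding and function names are assumptions for illustration. The resulting distance matrix is the kind of input a multivariate method such as multidimensional scaling would then reduce to a few dimensions.

```python
def clades(tree):
    """Set of clades (as frozensets of leaf names) of a nested-tuple rooted tree."""
    found = set()

    def walk(node):
        if isinstance(node, str):  # a leaf
            return frozenset([node])
        leaves = frozenset().union(*(walk(child) for child in node))
        found.add(leaves)
        return leaves

    walk(tree)
    return found


def rf_distance(tree_a, tree_b):
    """Rooted Robinson-Foulds distance: clades present in exactly one tree."""
    return len(clades(tree_a) ^ clades(tree_b))


def distance_matrix(trees):
    """Pairwise distances between all trees in a list."""
    return [[rf_distance(a, b) for b in trees] for a in trees]


if __name__ == "__main__":
    trees = [
        (("A", "B"), ("C", "D")),
        (("A", "C"), ("B", "D")),
        ((("A", "B"), "C"), "D"),
    ]
    for row in distance_matrix(trees):
        print(row)
```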

  9. Bootstrap-based Support of HGT Inferred by Maximum Parsimony

    Directory of Open Access Journals (Sweden)

    Nakhleh Luay

    2010-05-01

    Full Text Available Abstract Background: Maximum parsimony is one of the most commonly used criteria for reconstructing phylogenetic trees. Recently, Nakhleh and co-workers extended this criterion to enable reconstruction of phylogenetic networks, and demonstrated its application to detecting reticulate evolutionary relationships. However, one of the major problems with this extension has been that it favors more complex evolutionary relationships over simpler ones, thus having the potential for overestimating the amount of reticulation in the data. An ad hoc solution to this problem that has been used entails inspecting the improvement in the parsimony length as more reticulation events are added to the model, and stopping when the improvement is below a certain threshold. Results: In this paper, we address this problem in a more systematic way, by proposing a nonparametric bootstrap-based measure of support of inferred reticulation events, and using it to determine the number of those events, as well as their placements. A number of samples is generated from the given sequence alignment, and reticulation events are inferred based on each sample. Finally, the support of each reticulation event is quantified based on the inferences made over all samples. Conclusions: We have implemented our method in the NEPAL software tool (available publicly at http://bioinfo.cs.rice.edu/), and studied its performance on both biological and simulated data sets. While our studies show very promising results, they also highlight issues that are inherently challenging when applying the maximum parsimony criterion to detect reticulate evolution.

  10. Bootstrap-based support of HGT inferred by maximum parsimony.

    Science.gov (United States)

    Park, Hyun Jung; Jin, Guohua; Nakhleh, Luay

    2010-05-05

    Maximum parsimony is one of the most commonly used criteria for reconstructing phylogenetic trees. Recently, Nakhleh and co-workers extended this criterion to enable reconstruction of phylogenetic networks, and demonstrated its application to detecting reticulate evolutionary relationships. However, one of the major problems with this extension has been that it favors more complex evolutionary relationships over simpler ones, thus having the potential for overestimating the amount of reticulation in the data. An ad hoc solution to this problem that has been used entails inspecting the improvement in the parsimony length as more reticulation events are added to the model, and stopping when the improvement is below a certain threshold. In this paper, we address this problem in a more systematic way, by proposing a nonparametric bootstrap-based measure of support of inferred reticulation events, and using it to determine the number of those events, as well as their placements. A number of samples is generated from the given sequence alignment, and reticulation events are inferred based on each sample. Finally, the support of each reticulation event is quantified based on the inferences made over all samples. We have implemented our method in the NEPAL software tool (available publicly at http://bioinfo.cs.rice.edu/), and studied its performance on both biological and simulated data sets. While our studies show very promising results, they also highlight issues that are inherently challenging when applying the maximum parsimony criterion to detect reticulate evolution.
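    The resampling scheme described above (generate samples from the alignment, infer events on each, tally support over all samples) can be sketched generically in Python. Everything named here is hypothetical: `infer_events` stands in for whatever procedure infers reticulation events from an alignment (in the paper, a maximum parsimony network search, which is not reproduced), and the alignment is assumed to be a dictionary of equal-length sequences.

```python
import random
from collections import Counter


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement (nonparametric bootstrap)."""
    n_sites = len(next(iter(alignment.values())))
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {taxon: "".join(seq[c] for c in columns) for taxon, seq in alignment.items()}


def event_support(alignment, infer_events, n_replicates=100, seed=0):
    """Fraction of bootstrap replicates in which each inferred event appears.

    infer_events: user-supplied callable (hypothetical placeholder) that maps
    an alignment to an iterable of hashable event descriptions.
    """
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_replicates):
        replicate = bootstrap_alignment(alignment, rng)
        counts.update(set(infer_events(replicate)))  # count each event once per replicate
    return {event: c / n_replicates for event, c in counts.items()}
```

    Events with low support across replicates would then be candidates for exclusion, which is how a support measure can guide the number of reticulations retained.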

  11. Ancestry inference using principal component analysis and spatial analysis: a distance-based analysis to account for population substructure.

    Science.gov (United States)

    Byun, Jinyoung; Han, Younghun; Gorlov, Ivan P; Busam, Jonathan A; Seldin, Michael F; Amos, Christopher I

    2017-10-16

    Accurate inference of genetic ancestry is of fundamental interest to many biomedical, forensic, and anthropological research areas. Genetic ancestry memberships may relate to genetic disease risks. In a genome association study, failing to account for differences in genetic ancestry between cases and controls may also lead to false-positive results. Although a number of strategies for inferring and taking into account the confounding effects of genetic ancestry are available, applying them to large studies (tens of thousands of samples) is challenging. The goal of this study is to develop an approach for inferring genetic ancestry of samples with unknown ancestry among closely related populations and to provide accurate estimates of ancestry for application to large-scale studies. In this study we developed a novel distance-based approach, Ancestry Inference using Principal component analysis and Spatial analysis (AIPS), that incorporates an Inverse Distance Weighted (IDW) interpolation method from spatial analysis to assign individuals to population memberships. We demonstrate the benefits of AIPS in analyzing population substructure, specifically in comparison with the four most commonly used tools (EIGENSTRAT, STRUCTURE, fastSTRUCTURE, and ADMIXTURE), using genotype data from various intra-European panels and European-Americans. While the aforementioned commonly used tools performed poorly in inferring ancestry from a large number of subpopulations, AIPS accurately distinguished variations between and within subpopulations. Our results show that AIPS can be applied to large-scale data sets to discriminate the modest variability among intra-continental populations as well as for characterizing inter-continental variation. The method we developed will protect against spurious associations when mapping the genetic basis of a disease. Our approach is a more accurate and computationally efficient method for inferring genetic ancestry in large-scale genetic studies.
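
    As a rough illustration of inverse distance weighting in principal-component space, the sketch below assigns membership scores to a sample of unknown ancestry from labelled reference samples. It is not the AIPS code: the function name, the `power` parameter, and the toy coordinates are assumptions, and the published method may weight or normalize differently.

```python
import math
from collections import defaultdict


def idw_membership(query, ref_coords, ref_labels, power=2.0):
    """Inverse-distance-weighted membership of `query` in each reference population.

    query      -- PC coordinates of the sample with unknown ancestry
    ref_coords -- PC coordinates of reference samples
    ref_labels -- population label of each reference sample
    power      -- IDW exponent (assumed value; larger means more local weighting)
    """
    scores = defaultdict(float)
    for coords, label in zip(ref_coords, ref_labels):
        d = math.dist(query, coords)
        if d == 0.0:
            return {label: 1.0}  # exact match with a reference sample
        scores[label] += 1.0 / d ** power
    total = sum(scores.values())
    return {label: s / total for label, s in scores.items()}


if __name__ == "__main__":
    # Toy two-dimensional PC coordinates; values are made up for illustration.
    refs = [(0.0, 0.0), (0.0, 0.2), (1.0, 1.0)]
    labels = ["pop_A", "pop_A", "pop_B"]
    print(idw_membership((0.1, 0.0), refs, labels))
```

    The scores sum to one, so the largest score can be read as the assigned population, and the remaining scores as an indication of how ambiguous the assignment is.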

  12. [A phylogenetic analysis of plant communities of Teberda Biosphere Reserve].

    Science.gov (United States)

    Shulakov, A A; Egorov, A V; Onipchenko, V G

    2016-01-01

    Phylogenetic analysis of communities is based on the comparison of distances on the phylogenetic tree between species of a community under study and those distances in random samples taken out of local flora. It makes it possible to determine to what extent a community composition is formed by more closely related species (i.e., "clustered") or, on the opposite, it is more even and includes species that are less related with each other. The first case is usually interpreted as a result of strong influence caused by abiotic factors, due to which species with similar ecology, a priori more closely related, would remain. In the second case, biotic factors, such as competition, may come to the fore and lead to forming a community out of distant clades due to divergence of their ecological niches. The aim of this study was to explore the phylogenetic structure in communities of the northwestern Caucasus at two spatial scales - the scale of area from 4 to 100 m2 and the smaller scale within a community. The list of local flora of the alpine belt has been composed using the database of geobotanic descriptions carried out in Teberda Biosphere Reserve at true altitudes exceeding 1800 m. It includes 585 species of flowering plants belonging to 57 families. Basal groups of flowering plants are not represented in the list. At the scale of communities of three classes, namely Thlaspietea rotundifolii - communities formed on screes and pebbles, Calluno-Ulicetea - alpine meadow, and Mulgedio-Aconitetea subalpine meadows, have not demonstrated significant distinction of phylogenetic structure. At intra level, for alpine meadows the larger share of closely related species (clustered community) is detected. Significantly clustered happen to be those communities developing on rocks (class Asplenietea trichomanis) and alpine (class Juncetea trifidi). At the same time, alpine lichen proved to have even phylogenetic structure at the small scale.
Alpine (class Salicetea herbaceae) that
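
    The comparison against random samples from the local flora that this abstract describes can be sketched as a standardized effect size of mean pairwise phylogenetic distance. The sketch below is an assumption about one common way to do this, not the authors' procedure; the function names and the toy distance table are hypothetical.

```python
import itertools
import random
import statistics


def mean_pairwise_distance(species, dist):
    """Mean phylogenetic distance over all pairs of species in a community."""
    pairs = list(itertools.combinations(sorted(species), 2))
    return sum(dist[frozenset(p)] for p in pairs) / len(pairs)


def ses_mpd(community, flora, dist, n_random=999, seed=0):
    """Standardized effect size: negative suggests clustering, positive evenness."""
    rng = random.Random(seed)
    observed = mean_pairwise_distance(community, dist)
    null = [mean_pairwise_distance(rng.sample(flora, len(community)), dist)
            for _ in range(n_random)]
    return (observed - statistics.mean(null)) / statistics.pstdev(null)


if __name__ == "__main__":
    # Toy flora: two clades of three species; distances are made up.
    clade = {"a1": 1, "a2": 1, "a3": 1, "b1": 2, "b2": 2, "b3": 2}
    flora = sorted(clade)
    dist = {frozenset((x, y)): (1.0 if clade[x] == clade[y] else 10.0)
            for x, y in itertools.combinations(flora, 2)}
    print(ses_mpd(["a1", "a2", "a3"], flora, dist))  # closely related species
    print(ses_mpd(["a1", "b1"], flora, dist))        # distantly related species
```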

  13. Phylogenetic analysis of Pelecaniformes (Aves) based on osteological data: implications for waterbird phylogeny and fossil calibration studies.

    Directory of Open Access Journals (Sweden)

    Nathan D Smith

    2010-10-01

    Full Text Available Debate regarding the monophyly and relationships of the avian order Pelecaniformes represents a classic example of discord between morphological and molecular estimates of phylogeny. This lack of consensus hampers interpretation of the group's fossil record, which has major implications for understanding patterns of character evolution (e.g., the evolution of wing-propelled diving) and temporal diversification (e.g., the origins of modern families). Relationships of the Pelecaniformes were inferred through parsimony analyses of an osteological dataset encompassing 59 taxa and 464 characters. The relationships of the Plotopteridae, an extinct family of wing-propelled divers, and several other fossil pelecaniforms (Limnofregata, Prophaethon, Lithoptila, ?Borvocarbo stoeffelensis) were also assessed. The antiquity of these taxa and their purported status as stem members of extant families makes them valuable for studies of higher-level avian diversification. Pelecaniform monophyly is not recovered, with Phaethontidae recovered as distantly related to all other pelecaniforms, which are supported as a monophyletic Steganopodes. Some anatomical partitions of the dataset possess different phylogenetic signals, and partitioned analyses reveal that these discrepancies are localized outside of Steganopodes, and primarily due to a few labile taxa. The Plotopteridae are recovered as the sister taxon to Phalacrocoracoidea, and the relationships of other fossil pelecaniforms representing key calibration points are well supported, including Limnofregata (sister taxon to Fregatidae), Prophaethon and Lithoptila (successive sister taxa to Phaethontidae), and ?Borvocarbo stoeffelensis (sister taxon to Phalacrocoracidae). These relationships are invariant when 'backbone' constraints based on recent avian phylogenies are imposed. Relationships of extant pelecaniforms inferred from morphology are more congruent with molecular phylogenies than previously assumed, though

  14. The conquering of North America: dated phylogenetic and biogeographic inference of migratory behavior in bee hummingbirds.

    Science.gov (United States)

    Licona-Vera, Yuyini; Ornelas, Juan Francisco

    2017-06-05

    Geographical and temporal patterns of diversification in bee hummingbirds (Mellisugini) were assessed with respect to the evolution of migration, critical for colonization of North America. We generated a dated multilocus phylogeny of the Mellisugini based on a dense sampling using Bayesian inference, maximum-likelihood and maximum parsimony methods, and reconstructed the ancestral states of distributional areas in a Bayesian framework and migratory behavior using maximum parsimony, maximum-likelihood and re-rooting methods. All phylogenetic analyses confirmed monophyly of the Mellisugini and the inclusion of Atthis, Calothorax, Doricha, Eulidia, Mellisuga, Microstilbon, Myrmia, Tilmatura, and Thaumastura. Mellisugini consists of two clades: (1) South American species (including Tilmatura dupontii), and (2) species distributed in North and Central America and the Caribbean islands. The second clade consists of four subclades: Mexican (Calothorax, Doricha) and Caribbean (Archilochus, Calliphlox, Mellisuga) sheartails, Calypte, and Selasphorus (incl. Atthis). Coalescent-based dating places the origin of the Mellisugini in the mid-to-late Miocene, with crown ages of most subclades in the early Pliocene, and subsequent species splits in the Pleistocene. Bee hummingbirds reached western North America by the end of the Miocene and the ancestral mellisuginid (bee hummingbirds) was reconstructed as sedentary, with four independent gains of migratory behavior during the evolution of the Mellisugini. Early colonization of North America and subsequent evolution of migration best explained biogeographic and diversification patterns within the Mellisugini. The repeated evolution of long-distance migration by different lineages was critical for the colonization of North America, contributing to the radiation of bee hummingbirds. Comparative phylogeography is needed to test whether the repeated evolution of migration resulted from northward expansion of southern sedentary

  15. Reconstruction of phylogenetic trees of prokaryotes using maximal common intervals.

    Science.gov (United States)

    Heydari, Mahdi; Marashi, Sayed-Amir; Tusserkani, Ruzbeh; Sadeghi, Mehdi

    2014-10-01

    One of the fundamental problems in bioinformatics is phylogenetic tree reconstruction, which can be used for classifying living organisms into different taxonomic clades. The classical approach to this problem is based on a marker such as 16S ribosomal RNA. Since evolutionary events like genomic rearrangements are not included in reconstructions of phylogenetic trees based on single genes, much effort has been made to find other characteristics for phylogenetic reconstruction in recent years. With the increasing availability of completely sequenced genomes, gene order can be considered as a new solution for this problem. In the present work, we applied maximal common intervals (MCIs) in two or more genomes to infer their distance and to reconstruct their evolutionary relationship. Additionally, measures based on uncommon segments (UCS's), i.e., those genomic segments which are not detected as part of any of the MCIs, are also used for phylogenetic tree reconstruction. We applied these two types of measures for reconstructing the phylogenetic tree of 63 prokaryotes with known COG (clusters of orthologous groups) families. Similarity between the MCI-based (resp. UCS-based) reconstructed phylogenetic trees and the phylogenetic tree obtained from NCBI taxonomy browser is as high as 93.1% (resp. 94.9%). We show that in the case of this diverse dataset of prokaryotes, tree reconstruction based on MCI and UCS outperforms most of the currently available methods based on gene orders, including breakpoint distance and DCJ. We additionally tested our new measures on a dataset of 13 closely-related bacteria from the genus Prochlorococcus. In this case, distances like rearrangement distance, breakpoint distance and DCJ proved to be useful, while our new measures are still appropriate for phylogenetic reconstruction. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
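
    To make the notion of a common interval concrete, the sketch below enumerates gene sets that are contiguous in two gene orders. It assumes both genomes contain the same genes exactly once and uses a simple quadratic scan; it does not reproduce the paper's maximality criterion, its handling of more than two genomes, or its distance measures, and the function name is hypothetical.

```python
def common_intervals(g1, g2):
    """Return gene sets (size >= 2) that occupy contiguous positions in both orders.

    Assumes g1 and g2 are permutations of the same genes with no duplicates.
    """
    pos = {gene: i for i, gene in enumerate(g2)}
    found = []
    for i in range(len(g1)):
        lo = hi = pos[g1[i]]
        for j in range(i + 1, len(g1)):
            p = pos[g1[j]]
            lo, hi = min(lo, p), max(hi, p)
            # g1[i..j] is contiguous in g2 exactly when its positions span j - i
            if hi - lo == j - i:
                found.append(frozenset(g1[i:j + 1]))
    return found


if __name__ == "__main__":
    for interval in common_intervals(list("abcde"), list("baced")):
        print(sorted(interval))
```

    Segments of a genome not covered by any such interval would correspond, loosely, to the uncommon segments mentioned in the abstract.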

  16. Molecular identification and phylogenetic analysis of Wuchereria bancrofti from human blood samples in Egypt.

    Science.gov (United States)

    Abdel-Shafi, Iman R; Shoieb, Eman Y; Attia, Samar S; Rubio, José M; Ta-Tang, Thuy-Huong; El-Badry, Ayman A

    2017-03-01

    Lymphatic filariasis (LF) is a serious vector-borne health problem, and Wuchereria bancrofti (W.b) is the major cause of LF worldwide and is focally endemic in Egypt. Identification of filarial infection using traditional morphologic and immunological criteria can be difficult and lead to misdiagnosis. The aim of the present study was molecular detection of W.b in residents in endemic areas in Egypt, sequence variance analysis, and phylogenetic analysis of W.b DNA. Collected blood samples from residents in filariasis endemic areas in five governorates were subjected to semi-nested PCR targeting repeated DNA sequence, for detection of W.b DNA. PCR products were sequenced; subsequently, a phylogenetic analysis of the obtained sequences was performed. Out of 300 blood samples, W.b DNA was identified in 48 (16%). Sequencing analysis confirmed PCR results identifying only W.b species. Sequence alignment and phylogenetic analysis indicated genetically distinct clusters of W.b among the study population. Study results demonstrated that the semi-nested PCR proved to be an effective diagnostic tool for accurate and rapid detection of W.b infections in nano-epidemics and is applicable for samples collected in the daytime as well as the night time. Sequencing of PCR products and phylogenetic analysis revealed three different nucleotide sequence variants. Further genetic studies of W.b in Egypt and other endemic areas are needed to distinguish related strains and the various ecological as well as drug effects exerted on them to support W.b elimination.

  17. Assessing the Goodness of Fit of Phylogenetic Comparative Methods: A Meta-Analysis and Simulation Study.

    Directory of Open Access Journals (Sweden)

    Dwueng-Chwuan Jhwueng

    Full Text Available Phylogenetic comparative methods (PCMs) have been applied widely in analyzing data from related species but their fit to data is rarely assessed. Can one determine whether any particular comparative method is typically more appropriate than others by examining comparative data sets? I conducted a meta-analysis of 122 phylogenetic data sets found by searching all papers in JEB, Blackwell Synergy and JSTOR published in 2002-2005 for the purpose of assessing the fit of PCMs. The number of species in these data sets ranged from 9 to 117. I used the Akaike information criterion to compare PCMs, and then fit PCMs to bivariate data sets through REML analysis. Correlation estimates between two traits and bootstrapped confidence intervals of correlations from each model were also compared. For phylogenies of fewer than one hundred taxa, the Independent Contrast method and the independent, non-phylogenetic models provide the best fit. For bivariate analysis, correlations from different PCMs are qualitatively similar so that actual correlations from real data seem to be robust to the PCM chosen for the analysis. Therefore, researchers might apply the PCM they believe best describes the evolutionary mechanisms underlying their data.
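
    For readers unfamiliar with the model comparison step, the Akaike information criterion is AIC = 2k - 2 ln L, where k is the number of fitted parameters and L the maximized likelihood; lower values indicate a better trade-off between fit and complexity. The sketch below ranks candidate models by AIC difference. The model names, log-likelihoods, and parameter counts are invented for illustration and are not results from the study.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def rank_models(fits):
    """Sort models by AIC difference from the best model.

    fits -- mapping of model name to (maximized log-likelihood, parameter count)
    """
    scores = {name: aic(ll, k) for name, (ll, k) in fits.items()}
    best = min(scores.values())
    return sorted(((name, s - best) for name, s in scores.items()),
                  key=lambda item: item[1])


if __name__ == "__main__":
    # Hypothetical fits; the numbers are placeholders, not data from the paper.
    fits = {
        "non_phylogenetic": (-52.0, 2),
        "independent_contrasts": (-51.5, 2),
        "ornstein_uhlenbeck": (-51.0, 3),
    }
    for name, delta in rank_models(fits):
        print(f"{name}: delta AIC = {delta:.2f}")
```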

  18. Comparison of Boolean analysis and standard phylogenetic methods using artificially evolved and natural mt-tRNA sequences from great apes.

    Science.gov (United States)

    Ari, Eszter; Ittzés, Péter; Podani, János; Thi, Quynh Chi Le; Jakó, Eena

    2012-04-01

    Boolean analysis (or BOOL-AN; Jakó et al., 2009. BOOL-AN: A method for comparative sequence analysis and phylogenetic reconstruction. Mol. Phylogenet. Evol. 52, 887-97.), a recently developed method for sequence comparison uses the Iterative Canonical Form of Boolean functions. It considers sequence information in a way entirely different from standard phylogenetic methods (i.e. Maximum Parsimony, Maximum-Likelihood, Neighbor-Joining, and Bayesian analysis). The performance and reliability of Boolean analysis were tested and compared with the standard phylogenetic methods, using artificially evolved - simulated - nucleotide sequences and the 22 mitochondrial tRNA genes of the great apes. At the outset, we assumed that the phylogeny of Hominidae is generally well established, and the guide tree of artificial sequence evolution can also be used as a benchmark. These offer a possibility to compare and test the performance of different phylogenetic methods. Trees were reconstructed by each method from 2500 simulated sequences and 22 mitochondrial tRNA sequences. We also introduced a special re-sampling method for Boolean analysis on permuted sequence sites, the P-BOOL-AN procedure. Considering the reliability values (branch support values of consensus trees and Robinson-Foulds distances) we used for simulated sequence trees produced by different phylogenetic methods, BOOL-AN appeared as the most reliable method. Although the mitochondrial tRNA sequences of great apes are relatively short (59-75 bases long) and the ratio of their constant characters is about 75%, BOOL-AN, P-BOOL-AN and the Bayesian approach produced the same tree-topology as the established phylogeny, while the outcomes of Maximum Parsimony, Maximum-Likelihood and Neighbor-Joining methods were equivocal. We conclude that Boolean analysis is a promising alternative to existing methods of sequence comparison for phylogenetic reconstruction and congruence analysis. Copyright © 2012 Elsevier Inc. All
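
    The Robinson-Foulds distance used above as a reliability measure counts the groupings present in one tree but not in the other. The sketch below is a simplified rooted variant that compares clades of trees written as nested tuples; the study's exact variant (for example, unrooted bipartitions or a normalized score) is not stated here, so treat this as an assumption. The second topology in the demo is a hypothetical alternative used only for comparison.

```python
def clades(tree):
    """Set of non-trivial clades (as frozensets of leaf names) in a nested-tuple tree."""
    found = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        leafset = frozenset().union(*(leaves(child) for child in node))
        found.add(leafset)
        return leafset

    all_leaves = leaves(tree)
    found.discard(all_leaves)  # the root clade is shared by every tree on these taxa
    return found


def rf_distance(t1, t2):
    """Number of clades found in exactly one of the two trees."""
    return len(clades(t1) ^ clades(t2))


if __name__ == "__main__":
    established = ((("human", "chimpanzee"), "gorilla"), "orangutan")
    alternative = ((("human", "gorilla"), "chimpanzee"), "orangutan")  # hypothetical
    print(rf_distance(established, established))
    print(rf_distance(established, alternative))
```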

  19. Phylogenetic inertia and Darwin's higher law.

    Science.gov (United States)

    Shanahan, Timothy

    2011-03-01

    The concept of 'phylogenetic inertia' is routinely deployed in evolutionary biology as an alternative to natural selection for explaining the persistence of characteristics that appear sub-optimal from an adaptationist perspective. However, in many of these contexts the precise meaning of 'phylogenetic inertia' and its relationship to selection are far from clear. After tracing the history of the concept of 'inertia' in evolutionary biology, I argue that treating phylogenetic inertia and natural selection as alternative explanations is mistaken because phylogenetic inertia is, from a Darwinian point of view, simply an expected effect of selection. Although Darwin did not discuss 'phylogenetic inertia,' he did assert the explanatory priority of selection over descent. An analysis of 'phylogenetic inertia' provides a perspective from which to assess Darwin's view. Copyright © 2010 Elsevier Ltd. All rights reserved.

  20. Phylogenetic analysis of nitrite, nitric oxide, and nitrous oxide respiratory enzymes reveal a complex evolutionary history for denitrification.

    Science.gov (United States)

    Jones, Christopher M; Stres, Blaz; Rosenquist, Magnus; Hallin, Sara

    2008-09-01

    Denitrification is a facultative respiratory pathway in which nitrite (NO2(-)), nitric oxide (NO), and nitrous oxide (N2O) are successively reduced to nitrogen gas (N(2)), effectively closing the nitrogen cycle. The ability to denitrify is widely dispersed among prokaryotes, and this polyphyletic distribution has raised the possibility of horizontal gene transfer (HGT) having a substantial role in the evolution of denitrification. Comparisons of 16S rRNA and denitrification gene phylogenies in recent studies support this possibility; however, these results remain speculative as they are based on visual comparisons of phylogenies from partial sequences. We reanalyzed publicly available nirS, nirK, norB, and nosZ partial sequences using Bayesian and maximum likelihood phylogenetic inference. Concomitant analysis of denitrification genes with 16S rRNA sequences from the same organisms showed substantial differences between the trees, which were supported by examining the posterior probability of monophyletic constraints at different taxonomic levels. Although these differences suggest HGT of denitrification genes, the presence of structural variants for nirK, norB, and nosZ makes it difficult to determine HGT from other evolutionary events. Additional analysis using phylogenetic networks and likelihood ratio tests of phylogenies based on full-length sequences retrieved from genomes also revealed significant differences in tree topologies among denitrification and 16S rRNA gene phylogenies, with the exception of the nosZ gene phylogeny within the data set of the nirK-harboring genomes. However, inspection of codon usage and G + C content plots from complete genomes gave no evidence for recent HGT. Instead, the close proximity of denitrification gene copies in the genomes of several denitrifying bacteria suggests duplication. Although HGT cannot be ruled out as a factor in the evolution of denitrification genes, our analysis suggests that other phenomena, such gene

  1. Orthology prediction at scalable resolution by phylogenetic tree analysis

    NARCIS (Netherlands)

    Heijden, R.T.J.M. van der; Snel, B.; Noort, V. van; Huynen, M.A.

    2007-01-01

    BACKGROUND: Orthology is one of the cornerstones of gene function prediction. Dividing the phylogenetic relations between genes into either orthologs or paralogs is however an oversimplification. Already in two-species gene-phylogenies, the complicated, non-transitive nature of phylogenetic

  2. A format for phylogenetic placements.

    Directory of Open Access Journals (Sweden)

    Frederick A Matsen

    Full Text Available We have developed a unified format for phylogenetic placements, that is, mappings of environmental sequence data (e.g., short reads) into a phylogenetic tree. We are motivated to do so by the growing number of tools for computing and post-processing phylogenetic placements, and the lack of an established standard for storing them. The format is lightweight, versatile, extensible, and is based on the JSON format, which can be parsed by most modern programming languages. Our format is already implemented in several tools for computing and post-processing parsimony- and likelihood-based phylogenetic placements and has worked well in practice. We believe that establishing a standard format for analyzing read placements at this early stage will lead to a more efficient development of powerful and portable post-analysis tools for the growing applications of phylogenetic placement.

  3. A phylogenetic analysis of normal modes evolution in enzymes and its relationship to enzyme function.

    Science.gov (United States)

    Lai, Jason; Jin, Jing; Kubelka, Jan; Liberles, David A

    2012-09-21

    Since the dynamic nature of protein structures is essential for enzymatic function, it is expected that functional evolution can be inferred from the changes in protein dynamics. However, dynamics can also diverge neutrally with sequence substitution between enzymes without changes of function. In this study, a phylogenetic approach is implemented to explore the relationship between enzyme dynamics and function through evolutionary history. Protein dynamics are described by normal mode analysis based on a simplified harmonic potential force field applied to the reduced C(α) representation of the protein structure while enzymatic function is described by Enzyme Commission numbers. Similarity of the binding pocket dynamics at each branch of the protein family's phylogeny was analyzed in two ways: (1) explicitly by quantifying the normal mode overlap calculated for the reconstructed ancestral proteins at each end and (2) implicitly using a diffusion model to obtain the reconstructed lineage-specific changes in the normal modes. Both explicit and implicit ancestral reconstruction identified generally faster rates of change in dynamics compared with the expected change from neutral evolution at the branches of potential functional divergences for the α-amylase, D-isomer-specific 2-hydroxyacid dehydrogenase, and copper-containing amine oxidase protein families. Normal mode analysis added additional information over just comparing the RMSD of static structures. However, the branch-specific changes were not statistically significant compared to background function-independent neutral rates of change of dynamic properties and blind application of the analysis would not enable prediction of changes in enzyme specificity. Copyright © 2012 Elsevier Ltd. All rights reserved.

  4. The relationships within the Chaitophorinae and Drepanosiphinae (Hemiptera, Aphididae) inferred from molecular-based phylogeny and comprehensive morphological data

    Science.gov (United States)

    Wieczorek, Karina; Lachowska-Cierlik, Dorota; Kajtoch, Łukasz; Kanturski, Mariusz

    2017-01-01

    The Chaitophorinae is a bionomically diverse Holarctic subfamily of Aphididae. The current classification includes two tribes: the Chaitophorini associated with deciduous trees and shrubs, and Siphini that feed on monocotyledonous plants. We present the first phylogenetic hypothesis for the subfamily, based on molecular and morphological datasets. Molecular analyses were based on the mitochondrial gene cytochrome oxidase subunit I (COI) and the nuclear gene elongation factor-1α (EF-1α). Phylogenetic inferences were obtained individually on each of genes and joined alignments using Bayesian inference (BI) and Maximum likelihood (ML). In phylogenetic trees reconstructed on the basis of nuclear and mitochondrial genes as well as a morphological dataset, the monophyly of Siphini and the genus Chaitophorus was supported. Periphyllus forms independent lineages from Chaitophorus and Siphini. Within this genus two clades comprising European and Asiatic species, respectively, were indicated. Concerning relationships within the subfamily, EF-1α and joined COI and EF-1α genes analysis strongly supports the hypothesis that Chaitophorini do not form a monophyletic clade. Periphyllus is a sister group to a clade containing Chaitophorus and Siphini. The Asiatic unit of Periphyllus also includes Trichaitophorus koyaensis. The analysis of morphological dataset under equally weighted parsimony also supports the view that Chaitophorini is an artificial taxon, as Lambersaphis pruinosae and Pseudopterocomma hughi, both traditionally included in the Chaitophorini, formed independent lineages. COI analyses support consistent groups within the subfamily, but relationships between groups are poorly resolved. These analyses were extended to include the species of closely related and phylogenetically unstudied subfamily Drepanosiphinae, which produced congruent results. Genera Drepanosiphum and Depanaphis are monophyletic and sister. The position of Yamatocallis tokyoensis differs in the

  5. Statistically Consistent k-mer Methods for Phylogenetic Tree Reconstruction.

    Science.gov (United States)

    Allman, Elizabeth S; Rhodes, John A; Sullivant, Seth

    2017-02-01

    Frequencies of k-mers in sequences are sometimes used as a basis for inferring phylogenetic trees without first obtaining a multiple sequence alignment. We show that a standard approach of using the squared Euclidean distance between k-mer vectors to approximate a tree metric can be statistically inconsistent. To remedy this, we derive model-based distance corrections for orthologous sequences without gaps, which lead to consistent tree inference. The identifiability of model parameters from k-mer frequencies is also studied. Finally, we report simulations showing that the corrected distance outperforms many other k-mer methods, even when sequences are generated with an insertion and deletion process. These results have implications for multiple sequence alignment as well since k-mer methods are usually the first step in constructing a guide tree for such algorithms.
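
    The uncorrected quantity that this abstract identifies as potentially inconsistent, the squared Euclidean distance between k-mer frequency vectors, can be computed as in the sketch below. The model-based correction derived by the authors is not reproduced; function names are hypothetical, and k-mers containing characters outside the alphabet are simply left out of the vector.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(seq, k, alphabet="ACGT"):
    """Relative frequency of every possible k-mer, in lexicographic order."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return [counts["".join(p)] / total for p in product(alphabet, repeat=k)]


def squared_euclidean(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


if __name__ == "__main__":
    s1, s2 = "ACGTACGTAC", "ACGTTCGTAA"
    d = squared_euclidean(kmer_frequencies(s1, 2), kmer_frequencies(s2, 2))
    print(d)
```

    A matrix of such pairwise values could then be passed to a distance-based tree builder, which is where the consistency question raised in the abstract becomes relevant.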

  6. FuncPatch: a web server for the fast Bayesian inference of conserved functional patches in protein 3D structures.

    Science.gov (United States)

    Huang, Yi-Fei; Golding, G Brian

    2015-02-15

    A number of statistical phylogenetic methods have been developed to infer conserved functional sites or regions in proteins. Many methods, e.g. Rate4Site, apply the standard phylogenetic models to infer site-specific substitution rates and totally ignore the spatial correlation of substitution rates in protein tertiary structures, which may reduce their power to identify conserved functional patches in protein tertiary structures when the sequences used in the analysis are highly similar. The 3D sliding window method has been proposed to infer conserved functional patches in protein tertiary structures, but the window size, which reflects the strength of the spatial correlation, must be predefined and is not inferred from data. We recently developed GP4Rate to solve these problems under the Bayesian framework. Unfortunately, GP4Rate is computationally slow. Here, we present an intuitive web server, FuncPatch, to perform a fast approximate Bayesian inference of conserved functional patches in protein tertiary structures. Both simulations and four case studies based on empirical data suggest that FuncPatch is a good approximation to GP4Rate. However, FuncPatch is orders of magnitude faster than GP4Rate. In addition, simulations suggest that FuncPatch is potentially a useful tool complementary to Rate4Site, but the 3D sliding window method is less powerful than FuncPatch and Rate4Site. The functional patches predicted by FuncPatch in the four case studies are supported by experimental evidence, which corroborates the usefulness of FuncPatch. The software FuncPatch is freely available at the web site, http://info.mcmaster.ca/yifei/FuncPatch. Contact: golding@mcmaster.ca. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  7. Detection and phylogenetic analysis of bacteriophage WO in spiders (Araneae).

    Science.gov (United States)

    Yan, Qian; Qiao, Huping; Gao, Jin; Yun, Yueli; Liu, Fengxiang; Peng, Yu

    2015-11-01

    Phage WO is a bacteriophage found in Wolbachia. Herein, we present the first phylogenetic study of WOs that infect spiders (Araneae). Seven species of spiders (Araneus alternidens, Nephila clavata, Hylyphantes graminicola, Prosoponoides sinensis, Pholcus crypticolens, Coleosoma octomaculatum, and Nurscia albofasciata) from six families were infected by Wolbachia and WO, followed by comprehensive sequence analysis. Interestingly, WO could only be detected in Wolbachia-infected spiders. The relative infection rates of those seven species of spiders were 75, 100, 88.9, 100, 62.5, 72.7, and 100%, respectively. Our results indicated that both Wolbachia and WO were found in three different body parts of N. clavata, and WO could be passed to the next generation of H. graminicola by vertical transmission. There were three different WO sequences in A. alternidens and two different WO sequences in C. octomaculatum. Only one sequence of WO was found for the other five species of spiders. The discovered WO sequences ranged from 239 to 311 bp. A phylogenetic tree was generated using maximum likelihood (ML) based on the orf7 gene sequences. According to the phylogenetic tree, WOs in N. clavata and H. graminicola were clustered in the same group. WOs from A. alternidens (WAlt1) and C. octomaculatum (WOct2) were closely related to another clade, whereas WO in P. sinensis was classified as a sole cluster.

  8. Comparative analyses of plastid genomes from fourteen Cornales species: inferences for phylogenetic relationships and genome evolution.

    Science.gov (United States)

    Fu, Chao-Nan; Li, Hong-Tao; Milne, Richard; Zhang, Ting; Ma, Peng-Fei; Yang, Jing; Li, De-Zhu; Gao, Lian-Ming

    2017-12-08

    The Cornales is the basal lineage of the asterids, the largest angiosperm clade. Phylogenetic relationships within the order were previously not fully resolved. Fifteen plastid genomes representing 14 species, ten genera and seven families of Cornales were newly sequenced for comparative analyses of genome features, evolution, and phylogenomics based on different partitioning schemes and filtering strategies. All plastomes of the 14 Cornales species had the typical quadripartite structure with a genome size ranging from 156,567 bp to 158,715 bp, which included two inverted repeats (25,859-26,451 bp) separated by a large single-copy region (86,089-87,835 bp) and a small single-copy region (18,250-18,856 bp). These plastomes encoded the same set of 114 unique genes including 31 transfer RNA, 4 ribosomal RNA and 79 coding genes, with an identical gene order across all examined Cornales species. Two genes (rpl22 and ycf15) contained premature stop codons in seven and five species respectively. The phylogenetic relationships among all sampled species were fully resolved with maximum support. Different filtering strategies (none, light and strict) of sequence alignment did not have an effect on these relationships. The topology recovered from coding and noncoding data sets was the same as for the whole plastome, regardless of filtering strategy. Moreover, mutational hotspots and highly informative regions were identified. Phylogenetic relationships among families and intergeneric relationships within families of Cornales were well resolved. Different filtering strategies and partitioning schemes do not influence the relationships. Plastid genomes have great potential to resolve deep phylogenetic relationships of plants.

  9. Phylogenetic inferences of Nepenthes species in Peninsular Malaysia revealed by chloroplast (trnL intron) and nuclear (ITS) DNA sequences.

    Science.gov (United States)

    Bunawan, Hamidun; Yen, Choong Chee; Yaakop, Salmah; Noor, Normah Mohd

    2017-01-26

    The chloroplastic trnL intron and the nuclear internal transcribed spacer (ITS) region were sequenced for 11 Nepenthes species recorded in Peninsular Malaysia to examine their phylogenetic relationship and to evaluate the usage of trnL intron and ITS sequences for phylogenetic reconstruction of this genus. Phylogeny reconstruction was carried out using neighbor-joining, maximum parsimony and Bayesian analyses. All the trees revealed two major clusters, a lowland group consisting of N. ampullaria, N. mirabilis, N. gracilis and N. rafflesiana, and another containing both intermediately distributed species (N. albomarginata and N. benstonei) and four highland species (N. sanguinea, N. macfarlanei, N. ramispina and N. alba). The trnL intron and ITS sequences proved to provide phylogenetic informative characters for deriving a phylogeny of Nepenthes species in Peninsular Malaysia. To our knowledge, this is the first molecular phylogenetic study of Nepenthes species occurring along an altitudinal gradient in Peninsular Malaysia.

  10. The Cladistic Basis for the Phylogenetic Diversity (PD) Measure Links Evolutionary Features to Environmental Gradients and Supports Broad Applications of Microbial Ecology’s “Phylogenetic Beta Diversity” Framework

    Directory of Open Access Journals (Sweden)

    Rob Knight

    2009-11-01

    The PD measure of phylogenetic diversity interprets branch lengths cladistically to make inferences about feature diversity. PD calculations extend conventional species-level ecological indices to the features level. The “phylogenetic beta diversity” framework developed by microbial ecologists calculates PD-dissimilarities between community localities. Interpretation of these PD-dissimilarities at the feature level explains the framework’s success in producing ordinations revealing environmental gradients. An example gradients space using PD-dissimilarities illustrates how evolutionary features form unimodal response patterns to gradients. This features model supports new application of existing species-level methods that are robust to unimodal responses, plus novel applications relating to climate change, commercial products discovery, and community assembly.
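
    The PD calculation described above can be illustrated with a minimal sketch. The toy tree, its branch lengths, and the function names below are hypothetical, and the dissimilarity shown is a simple Jaccard-type (UniFrac-like) ratio of unshared to total branch length, chosen for illustration rather than taken from the article.

```python
# Hypothetical toy tree: each node maps to (parent, length of the branch to that parent).
TREE = {
    "A": ("n1", 1.0),
    "B": ("n1", 2.0),
    "C": ("n2", 3.0),
    "n1": ("n2", 0.5),
    "n2": (None, 0.0),  # root
}


def branches_spanned(taxa, tree=TREE):
    """Return the nodes whose parent branch lies on some taxon-to-root path."""
    spanned = set()
    for taxon in taxa:
        node = taxon
        # Stop early once we reach a node that was already visited:
        # all of its ancestors have been collected too.
        while node is not None and node not in spanned:
            spanned.add(node)
            node = tree[node][0]
    return spanned


def faith_pd(taxa, tree=TREE):
    """Rooted PD: total length of the branches spanned by the taxa."""
    return sum(tree[node][1] for node in branches_spanned(taxa, tree))


def pd_dissimilarity(taxa_a, taxa_b, tree=TREE):
    """Branch length found in only one community, divided by the length found in either."""
    a = branches_spanned(taxa_a, tree)
    b = branches_spanned(taxa_b, tree)
    union = sum(tree[node][1] for node in a | b)
    shared = sum(tree[node][1] for node in a & b)
    return (union - shared) / union if union > 0 else 0.0


if __name__ == "__main__":
    print(faith_pd({"A", "B"}))
    print(pd_dissimilarity({"A"}, {"B"}))
```

    Treating each branch as a bundle of features is what lets a species-level index be reused at the feature level: two localities are similar to the extent that they share branch length.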

  11. The phylogenetic position of the roughskin skate Dipturus trachyderma (Krefft & Stehmann, 1975) (Rajiformes, Rajidae) inferred from the mitochondrial genome.

    Science.gov (United States)

    Vargas-Caro, Carolina; Bustamante, Carlos; Lamilla, Julio; Bennett, Michael B; Ovenden, Jennifer R

    2016-07-01

    The complete mitochondrial genome of the roughskin skate Dipturus trachyderma is described from 1 455 724 sequences obtained using Illumina NGS technology. Total length of the mitogenome was 16 909 base pairs, comprising 2 rRNAs, 13 protein-coding genes, 22 tRNAs and 2 non-coding regions. Phylogenetic analysis based on mtDNA revealed low genetic divergence among longnose skates, in particular those dwelling on the continental shelf and slope off the coasts of Chile and Argentina.

  12. Molecular phylogenetic analysis of Fasciola flukes from eastern India.

    Science.gov (United States)

    Hayashi, Kei; Ichikawa-Seki, Madoka; Mohanta, Uday Kumar; Singh, T Shantikumar; Shoriki, Takuya; Sugiyama, Hiromu; Itagaki, Tadashi

    2015-10-01

    Fasciola flukes from eastern India were characterized on the basis of spermatogenesis status and nuclear ITS1. Both Fasciola gigantica and aspermic Fasciola flukes were detected in Imphal, Kohima, and Gantoku districts. The sequences of mitochondrial nad1 were analyzed to infer their phylogenetical relationship with neighboring countries. The haplotypes of aspermic Fasciola flukes were identical or showed a single nucleotide substitution compared to those from populations in the neighboring countries, corroborating the previous reports that categorized them in the same lineage. However, the prevalence of aspermic Fasciola flukes in eastern India was lower than those in the neighboring countries, suggesting that they have not dispersed throughout eastern India. In contrast, F. gigantica was predominant and well diversified, and the species was thought to be distributed in the area for a longer time than the aspermic Fasciola flukes. Fasciola gigantica populations from eastern India were categorized into two distinct haplogroups A and B. The level of their genetic diversity suggests that populations belonging to haplogroup A have dispersed from the west side of the Indian subcontinent to eastern India with the artificial movement of domestic cattle, Bos indicus, whereas populations belonging to haplogroup B might have spread from Myanmar to eastern India with domestic buffaloes, Bubalus bubalis. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  13. A gateway for phylogenetic analysis powered by grid computing featuring GARLI 2.0.

    Science.gov (United States)

    Bazinet, Adam L; Zwickl, Derrick J; Cummings, Michael P

    2014-09-01

    We introduce molecularevolution.org, a publicly available gateway for high-throughput, maximum-likelihood phylogenetic analysis powered by grid computing. The gateway features a garli 2.0 web service that enables a user to quickly and easily submit thousands of maximum likelihood tree searches or bootstrap searches that are executed in parallel on distributed computing resources. The garli web service allows one to easily specify partitioned substitution models using a graphical interface, and it performs sophisticated post-processing of phylogenetic results. Although the garli web service has been used by the research community for over three years, here we formally announce the availability of the service, describe its capabilities, highlight new features and recent improvements, and provide details about how the grid system efficiently delivers high-quality phylogenetic results. © The Author(s) 2014. Published by Oxford University Press, on behalf of the Society of Systematic Biologists.

  14. Phylogenetic systematics of the genus Echinococcus (Cestoda: Taeniidae).

    Science.gov (United States)

    Nakao, Minoru; Lavikainen, Antti; Yanagida, Tetsuya; Ito, Akira

    2013-11-01

    Echinococcosis is a serious helminthic zoonosis in humans, livestock and wildlife. The pathogenic organisms are members of the genus Echinococcus (Cestoda: Taeniidae). Life cycles of Echinococcus spp. are consistently dependent on predator-prey association between two obligate mammalian hosts. Carnivores (canids and felids) serve as definitive hosts for adult tapeworms and their herbivore prey (ungulates, rodents and lagomorphs) as intermediate hosts for metacestode larvae. Humans are involved as an accidental host for metacestode infections. The metacestodes develop in various internal organs, particularly in liver and lungs. Each metacestode of Echinococcus spp. has an organotropism and a characteristic form known as a unilocular (cystic), alveolar or polycystic hydatid. Recent molecular phylogenetic studies have demonstrated that the type species, Echinococcus granulosus, causing cystic echinococcosis is a cryptic species complex. Therefore, the orthodox taxonomy of Echinococcus established from morphological criteria has been revised from the standpoint of phylogenetic systematics. Nine valid species including newly resurrected taxa are recognised as a result of the revision. This review summarises the recent advances in the phylogenetic systematics of Echinococcus, together with the historical backgrounds and molecular epidemiological aspects of each species. A new phylogenetic tree inferred from the mitochondrial genomes of all valid Echinococcus spp. is also presented. The taxonomic nomenclature for Echinococcus oligarthrus is shown to be incorrect and this name should be replaced with Echinococcus oligarthra. Copyright © 2013 Australian Society for Parasitology Inc. Published by Elsevier Ltd. All rights reserved.

  15. Monte Carlo estimation of total variation distance of Markov chains on large spaces, with application to phylogenetics.

    Science.gov (United States)

    Herbei, Radu; Kubatko, Laura

    2013-03-26

    Markov chains are widely used for modeling in many areas of molecular biology and genetics. As the complexity of such models advances, it becomes increasingly important to assess the rate at which a Markov chain converges to its stationary distribution in order to carry out accurate inference. A common measure of convergence to the stationary distribution is the total variation distance, but this measure can be difficult to compute when the state space of the chain is large. We propose a Monte Carlo method to estimate the total variation distance that can be applied in this situation, and we demonstrate how the method can be efficiently implemented by taking advantage of GPU computing techniques. We apply the method to two Markov chains on the space of phylogenetic trees, and discuss the implications of our findings for the development of algorithms for phylogenetic inference.
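
    As a rough illustration of estimating total variation distance by sampling, the sketch below uses the identity TV(p, q) = E_{x~p}[max(0, 1 - q(x)/p(x))]. This is a generic estimator chosen for illustration, not necessarily the one proposed in the article, and it assumes that both distributions can be evaluated pointwise. The distributions P and Q and the function names are hypothetical.

```python
import random


def tv_exact(p, q):
    """Exact total variation distance between two discrete distributions."""
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))


def tv_monte_carlo(p, q, n_samples=20000, seed=0):
    """Estimate TV(p, q) as the sample mean of max(0, 1 - q(x)/p(x)) for x drawn from p."""
    rng = random.Random(seed)
    states = list(range(len(p)))
    draws = rng.choices(states, weights=p, k=n_samples)
    # States with p(x) == 0 are never drawn, so the ratio is always defined.
    return sum(max(0.0, 1.0 - q[x] / p[x]) for x in draws) / n_samples


# Hypothetical three-state example.
P = [0.5, 0.3, 0.2]
Q = [0.2, 0.3, 0.5]

if __name__ == "__main__":
    print(tv_exact(P, Q), tv_monte_carlo(P, Q))
```

    The exact sum visits every state, which is infeasible on tree space; the sampled version only touches the states that are actually drawn.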

  16. Inferring species trees from incongruent multi-copy gene trees using the Robinson-Foulds distance

    Science.gov (United States)

    2013-01-01

    Background: Constructing species trees from multi-copy gene trees remains a challenging problem in phylogenetics. One difficulty is that the underlying genes can be incongruent due to evolutionary processes such as gene duplication and loss, deep coalescence, or lateral gene transfer. Gene tree estimation errors may further exacerbate the difficulties of species tree estimation. Results: We present a new approach for inferring species trees from incongruent multi-copy gene trees that is based on a generalization of the Robinson-Foulds (RF) distance measure to multi-labeled trees (mul-trees). We prove that it is NP-hard to compute the RF distance between two mul-trees; however, it is easy to calculate this distance between a mul-tree and a singly-labeled species tree. Motivated by this, we formulate the RF problem for mul-trees (MulRF) as follows: Given a collection of multi-copy gene trees, find a singly-labeled species tree that minimizes the total RF distance from the input mul-trees. We develop and implement a fast SPR-based heuristic algorithm for the NP-hard MulRF problem. We compare the performance of the MulRF method (available at http://genome.cs.iastate.edu/CBL/MulRF/) with several gene tree parsimony approaches using gene tree simulations that incorporate gene tree error, gene duplications and losses, and/or lateral transfer. The MulRF method produces more accurate species trees than gene tree parsimony approaches. We also demonstrate that the MulRF method infers in minutes a credible plant species tree from a collection of nearly 2,000 gene trees. Conclusions: Our new phylogenetic inference method, based on a generalized RF distance, makes it possible to quickly estimate species trees from large genomic data sets. Since the MulRF method, unlike gene tree parsimony, is based on a generic tree distance measure, it is appealing for analyses of genomic data sets, in which many processes such as deep coalescence, recombination, gene duplication and losses as
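
    For orientation, the ordinary RF distance that MulRF generalizes can be sketched as follows. This is the rooted, cluster-based variant for singly-labeled trees only; it does not handle mul-trees and is not the MulRF implementation. The nested-tuple tree encoding and the function names are assumptions made for the example.

```python
def clusters(tree):
    """Return the non-trivial clades of a rooted tree as frozensets of leaf labels.

    A tree is a nested tuple; a leaf is a string.
    """
    found = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(leaves(child) for child in node))
        found.add(below)
        return below

    all_leaves = leaves(tree)
    found.discard(all_leaves)  # the root clade is shared by every tree on these leaves
    return found


def rf_distance(tree_a, tree_b):
    """Number of clades present in exactly one of the two trees."""
    return len(clusters(tree_a) ^ clusters(tree_b))


# Hypothetical four-taxon examples.
T1 = ((("A", "B"), "C"), "D")
T2 = ((("A", "C"), "B"), "D")
T3 = (("A", "B"), ("C", "D"))

if __name__ == "__main__":
    print(rf_distance(T1, T2), rf_distance(T1, T3))
```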

  17. Bayesian phylogenetic estimation of fossil ages.

    Science.gov (United States)

    Drummond, Alexei J; Stadler, Tanja

    2016-07-19

    Recent advances have allowed for both morphological fossil evidence and molecular sequences to be integrated into a single combined inference of divergence dates under the rule of Bayesian probability. In particular, the fossilized birth-death tree prior and the Lewis-Mk model of discrete morphological evolution allow for the estimation of both divergence times and phylogenetic relationships between fossil and extant taxa. We exploit this statistical framework to investigate the internal consistency of these models by producing phylogenetic estimates of the age of each fossil in turn, within two rich and well-characterized datasets of fossil and extant species (penguins and canids). We find that the estimation accuracy of fossil ages is generally high with credible intervals seldom excluding the true age and median relative error in the two datasets of 5.7% and 13.2%, respectively. The median relative standard deviation (RSD) was 9.2% and 7.2%, respectively, suggesting good precision, although with some outliers. In fact, in the two datasets we analyse, the phylogenetic estimate of fossil age is on average less than 2 Myr from the mid-point age of the geological strata from which it was excavated. The high level of internal consistency found in our analyses suggests that the Bayesian statistical model employed is an adequate fit for both the geological and morphological data, and provides evidence from real data that the framework used can accurately model the evolution of discrete morphological traits coded from fossil and extant taxa. We anticipate that this approach will have diverse applications beyond divergence time dating, including dating fossils that are temporally unconstrained, testing of the 'morphological clock', and for uncovering potential model misspecification and/or data errors when controversial phylogenetic hypotheses are obtained based on combined divergence dating analyses. This article is part of the themed issue 'Dating species divergences using

  18. Orthology prediction at scalable resolution by phylogenetic tree analysis

    Directory of Open Access Journals (Sweden)

    Huynen Martijn A

    2007-03-01

    Background: Orthology is one of the cornerstones of gene function prediction. Dividing the phylogenetic relations between genes into either orthologs or paralogs is however an oversimplification. Already in two-species gene-phylogenies, the complicated, non-transitive nature of phylogenetic relations results in inparalogs and outparalogs. For situations with more than two species we lack semantics to specifically describe the phylogenetic relations, let alone to exploit them. Published procedures to extract orthologous groups from phylogenetic trees do not allow identification of orthology at various levels of resolution, nor do they document the relations between the orthologous groups. Results: We introduce "levels of orthology" to describe the multi-level nature of gene relations. This is implemented in a program LOFT (Levels of Orthology From Trees) that assigns hierarchical orthology numbers to genes based on a phylogenetic tree. To decide upon speciation and gene duplication events in a tree LOFT can be instructed either to perform classical species-tree reconciliation or to use the species overlap between partitions in the tree. The hierarchical orthology numbers assigned by LOFT effectively summarize the phylogenetic relations between genes. The resulting high-resolution orthologous groups are depicted in colour, facilitating visual inspection of (large) trees. A benchmark for orthology prediction, that takes into account the varying levels of orthology between genes, shows that the phylogeny-based high-resolution orthology assignments made by LOFT are reliable. Conclusion: The "levels of orthology" concept offers high resolution, reliable orthology, while preserving the relations between orthologous groups. A Windows as well as a preliminary Java version of LOFT is available from the LOFT website http://www.cmbi.ru.nl/LOFT.
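
    The species-overlap criterion mentioned above can be sketched in a few lines. This is only the duplication-versus-speciation decision, not LOFT's hierarchical numbering, and the nested-tuple gene tree, the "species_gene" leaf naming, and the function names are all assumptions made for the example.

```python
def species_of(leaf):
    """Assumed naming convention: 'species_geneid', e.g. 'human_a'."""
    return leaf.split("_")[0]


def label_events(gene_tree):
    """Label each internal node as 'duplication' if its child subtrees share
    at least one species, otherwise as 'speciation'.

    Returns a list of (event, sorted gene names below the node) in post-order.
    """
    events = []

    def visit(node):
        if isinstance(node, str):
            return {species_of(node)}, [node]
        seen_species, genes, overlap = set(), [], False
        for child in node:
            child_species, child_genes = visit(child)
            if seen_species & child_species:
                overlap = True
            seen_species |= child_species
            genes.extend(child_genes)
        events.append(("duplication" if overlap else "speciation", sorted(genes)))
        return seen_species, genes

    visit(gene_tree)
    return events


# Hypothetical gene tree: one duplication predating the human/mouse split.
GENE_TREE = (("human_a", "mouse_a"), ("human_b", "mouse_b"))

if __name__ == "__main__":
    for event, genes in label_events(GENE_TREE):
        print(event, genes)
```

    The appeal of the overlap rule is that it needs no trusted species tree, at the cost of ignoring gene losses that a full reconciliation would infer.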

  19. Expressed sequence tags as a tool for phylogenetic analysis of placental mammal evolution.

    Directory of Open Access Journals (Sweden)

    Morgan Kullberg

    BACKGROUND: We investigate the usefulness of expressed sequence tags, ESTs, for establishing divergences within the tree of placental mammals. This is done on the example of the established relationships among primates (human), lagomorphs (rabbit), rodents (rat and mouse), artiodactyls (cow), carnivorans (dog) and proboscideans (elephant). METHODOLOGY/PRINCIPAL FINDINGS: We have produced 2000 ESTs (1.2 mega bases) from a marsupial mouse and characterized the data for their use in phylogenetic analysis. The sequences were used to identify putative orthologous sequences from whole genome projects. Although most ESTs stem from single sequence reads, the frequency of potential sequencing errors was found to be lower than allelic variation. Most of the sequences represented slowly evolving housekeeping-type genes, with an average amino acid distance of 6.6% between human and mouse. Positive Darwinian selection was identified at only a few single sites. Phylogenetic analyses of the EST data yielded trees that were consistent with those established from whole genome projects. CONCLUSIONS: The general quality of EST sequences and the general absence of positive selection in these sequences make ESTs an attractive tool for phylogenetic analysis. The EST approach allows, at reasonable costs, a fast extension of data sampling from species outside the genome projects.

  20. A proposal for a multivariate quantitative approach to infer karyological relationships among taxa

    Directory of Open Access Journals (Sweden)

    Lorenzo Peruzzi

    2014-12-01

    Until now, basic karyological parameters have been used in different ways by researchers to infer karyological relationships among organisms. In the present study, we propose a standardized approach to this aim, integrating six different, not redundant, parameters in a multivariate PCoA analysis. These parameters are chromosome number, basic chromosome number, total haploid chromosome length, MCA (Mean Centromeric Asymmetry), CVCL (Coefficient of Variation of Chromosome Length) and CVCI (Coefficient of Variation of Centromeric Index). The method is exemplified with the application to several plant taxa, and its significance and limits are discussed in the light of current phylogenetic knowledge of these groups.
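
    Three of the parameters listed above can be computed from arm-length measurements, as in the sketch below. The abstract does not give formulas, so the definitions used here are assumptions: MCA as the mean of (L - S)/(L + S) x 100 over chromosomes with long arm L and short arm S, centromeric index as S/(L + S), and each coefficient of variation as sample standard deviation divided by mean, x 100. The karyotype and function names are hypothetical, and the PCoA step itself is not shown.

```python
from statistics import mean, stdev


def karyotype_parameters(chromosomes):
    """chromosomes: list of (long_arm, short_arm) lengths for a haploid set."""
    lengths = [long_arm + short_arm for long_arm, short_arm in chromosomes]
    centromeric_index = [s / (l + s) for l, s in chromosomes]
    asymmetry = [(l - s) / (l + s) for l, s in chromosomes]
    return {
        "total_haploid_length": sum(lengths),
        "MCA": mean(asymmetry) * 100,                                     # assumed definition
        "CVCL": stdev(lengths) / mean(lengths) * 100,                     # assumed definition
        "CVCI": stdev(centromeric_index) / mean(centromeric_index) * 100,  # assumed definition
    }


# Hypothetical karyotype with three chromosomes (arbitrary length units).
KARYOTYPE = [(3.0, 1.0), (2.0, 2.0), (3.0, 3.0)]

if __name__ == "__main__":
    for name, value in karyotype_parameters(KARYOTYPE).items():
        print(name, round(value, 3))
```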

  1. Yleaf: Software for Human Y-Chromosomal Haplogroup Inference from Next-Generation Sequencing Data.

    Science.gov (United States)

    Ralf, Arwin; Montiel González, Diego; Zhong, Kaiyin; Kayser, Manfred

    2018-05-01

    Next-generation sequencing (NGS) technologies offer immense possibilities given the large genomic data they simultaneously deliver. The human Y-chromosome serves as a good example of how NGS benefits various applications in evolution, anthropology, genealogy, and forensics. Prior to NGS, the Y-chromosome phylogenetic tree consisted of a few hundred branches; based on NGS data, it now contains many thousands. The complexity of both the Y tree and NGS data poses challenges for haplogroup assignment. For effective analysis and interpretation of Y-chromosome NGS data, we present Yleaf, a publicly available, automated, user-friendly software for high-resolution Y-chromosome haplogroup inference independently of library and sequencing methods.

  2. PHYLOGENETIC RELATIONSHIPS AMONGST 10 Durio SPECIES BASED ON PCR-RFLP ANALYSIS OF TWO CHLOROPLAST GENES

    Directory of Open Access Journals (Sweden)

    Panca J. Santoso

    2013-07-01

    Twenty-seven species of Durio have been identified in Sabah and Sarawak, Malaysia, but their relationships have not been studied. This study was conducted to analyse phylogenetic relationships amongst 10 Durio species in Malaysia using PCR-RFLP on two chloroplast DNA genes, i.e. ndhC-trnV and rbcL. DNA was extracted from young leaves of 11 accessions from 10 Durio species collected from the Tenom Agriculture Research Station, Sabah, and University Agriculture Park, Universiti Putra Malaysia. Two pairs of oligonucleotide primers, N1-N2 and rbcL1-rbcL2, were used to flank the target regions ndhC-trnV and rbcL. Eight restriction enzymes, HindIII, BsuRI, PstI, TaqI, MspI, SmaI, BshNI, and EcoR130I, were used to digest the amplicons. Based on the results of PCR-RFLP on the ndhC-trnV gene, the 10 Durio species were grouped into five distinct clusters, and the accessions generally showed high variation. However, based on the results of PCR-RFLP on the rbcL gene, the species were grouped into three distinct clusters, and generally showed low variation. This means that the ndhC-trnV gene is more reliable for phylogenetic analysis at lower taxonomic levels of Durio species or for diversity analysis, while the rbcL gene is a reliable marker for phylogenetic analysis at higher taxonomic levels. PCR-RFLP on the ndhC-trnV and rbcL genes could therefore be considered a useful marker system for phylogenetic analysis amongst Durio species. These findings might be used in further molecular marker-assisted Durio breeding programs.

  3. GapCoder automates the use of indel characters in phylogenetic analysis.

    Science.gov (United States)

    Young, Nelson D; Healy, John

    2003-02-19

    Several ways of incorporating indels into phylogenetic analysis have been suggested. Simple indel coding has two strengths: (1) biological realism and (2) efficiency of analysis. In the method, each indel with different start and/or end positions is considered to be a separate character. The presence/absence of these indel characters is then added to the data set. We have written a program, GapCoder, to automate this procedure. The program can input PIR format aligned datasets, find the indels and add the indel-based characters. The output is a NEXUS format file, which includes a table showing what region each indel character is based on. If regions are excluded from analysis, this table makes it easy to identify the corresponding indel characters for exclusion. Manual implementation of the simple indel coding method can be very time-consuming, especially in data sets where indels are numerous and/or overlapping. GapCoder automates this method and is therefore particularly useful during procedures where phylogenetic analyses need to be repeated many times, such as when different alignments are being explored or when various taxon or character sets are being explored. GapCoder is currently available for Windows from http://www.home.duq.edu/~youngnd/GapCoder.
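
    The coding rule described above can be sketched as follows. This is not GapCoder's code and does not read PIR or write NEXUS; it only derives the presence/absence characters from an alignment held in a dictionary. Two choices here are assumptions: a sequence whose longer gap fully spans another indel is scored "?" (inapplicable) for that indel, and leading or trailing gaps are treated like any other gap. The alignment and function names are hypothetical.

```python
import re


def gap_intervals(seq):
    """Return the half-open (start, end) interval of every run of '-' in a sequence."""
    return [match.span() for match in re.finditer(r"-+", seq)]


def simple_indel_coding(alignment):
    """alignment: dict mapping taxon name to an aligned sequence of equal length.

    Returns (indels, matrix): the sorted list of distinct gap intervals and,
    per taxon, a string with one character per indel ('1', '0' or '?').
    """
    gaps = {name: gap_intervals(seq) for name, seq in alignment.items()}
    indels = sorted({interval for intervals in gaps.values() for interval in intervals})
    matrix = {}
    for name, intervals in gaps.items():
        row = []
        for start, end in indels:
            if (start, end) in intervals:
                row.append("1")  # this exact indel is present
            elif any(s <= start and end <= e for s, e in intervals):
                row.append("?")  # a longer gap spans it: state cannot be scored
            else:
                row.append("0")  # indel absent
        matrix[name] = "".join(row)
    return indels, matrix


# Hypothetical alignment.
ALIGNMENT = {
    "t1": "ACGTACGT",
    "t2": "AC--ACGT",
    "t3": "AC----GT",
    "t4": "ACGTA-GT",
}

if __name__ == "__main__":
    indels, matrix = simple_indel_coding(ALIGNMENT)
    print(indels)
    for name, row in matrix.items():
        print(name, row)
```

    Keeping the interval list alongside the matrix plays the role of the table the abstract mentions: it shows which alignment region each indel character came from.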

  4. Nucleotide diversity and phylogenetic relationships among ...

    Indian Academy of Sciences (India)

    NIRAJ SINGH

    for phylogenetic analysis of Gladiolus and related taxa using combined datasets from chloroplast genome. The psbA–trnH ... phylogenetic relationships among cultivars could be useful for hybridization programmes for further improvement of the crop. [Singh N. ... breeding in nature, and exhibited diverse pollination mech-.

  5. Unrealistic phylogenetic trees may improve phylogenetic footprinting.

    Science.gov (United States)

    Nettling, Martin; Treutler, Hendrik; Cerquides, Jesus; Grosse, Ivo

    2017-06-01

    The computational investigation of DNA binding motifs from binding sites is one of the classic tasks in bioinformatics and a prerequisite for understanding gene regulation as a whole. Due to the development of sequencing technologies and the increasing number of available genomes, approaches based on phylogenetic footprinting become increasingly attractive. Phylogenetic footprinting requires phylogenetic trees with attached substitution probabilities for quantifying the evolution of binding sites, but these trees and substitution probabilities are typically not known and cannot be estimated easily. Here, we investigate the influence of phylogenetic trees with different substitution probabilities on the classification performance of phylogenetic footprinting using synthetic and real data. For synthetic data we find that the classification performance is highest when the substitution probability used for phylogenetic footprinting is similar to that used for data generation. For real data, however, we typically find that the classification performance of phylogenetic footprinting surprisingly increases with increasing substitution probabilities and is often highest for unrealistically high substitution probabilities close to one. This finding suggests that choosing realistic model assumptions might not always yield optimal predictions in general and that choosing unrealistically high substitution probabilities close to one might actually improve the classification performance of phylogenetic footprinting. The proposed PF is implemented in JAVA and can be downloaded from https://github.com/mgledi/PhyFoo. Contact: martin.nettling@informatik.uni-halle.de. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press.

  6. Estimating phylogenetic relationships despite discordant gene trees across loci: the species tree of a diverse species group of feather mites (Acari: Proctophyllodidae).

    Science.gov (United States)

    Knowles, Lacey L; Klimov, Pavel B

    2011-11-01

    With the increased availability of multilocus sequence data, the lack of concordance of gene trees estimated for independent loci has focused attention on both the biological processes producing the discord and the methodologies used to estimate phylogenetic relationships. What has emerged is a suite of new analytical tools for phylogenetic inference--species tree approaches. In contrast to traditional phylogenetic methods that are stymied by the idiosyncrasies of gene trees, approaches for estimating species trees explicitly take into account the cause of discord among loci and, in the process, provide a direct estimate of phylogenetic history (i.e. the history of species divergence, not divergence of specific loci). We illustrate the utility of species tree estimates with an analysis of a diverse group of feather mites, the pinnatus species group (genus Proctophyllodes). Discord among four sequenced nuclear loci is consistent with theoretical expectations, given the short time separating speciation events (as evident by short internodes relative to terminal branch lengths in the trees). Nevertheless, many of the relationships are well resolved in a Bayesian estimate of the species tree; the analysis also highlights ambiguous aspects of the phylogeny that require additional loci. The broad utility of species tree approaches is discussed, and specifically, their application to groups with high speciation rates--a history of diversification with particular prevalence in host/parasite systems where species interactions can drive rapid diversification.

  7. Inference of Well-Typings for Logic Programs with Application to Termination Analysis

    DEFF Research Database (Denmark)

    Bruynooghe, M.; Gallagher, John Patrick; Humbeeck, W. Van

    2005-01-01

    A method is developed to infer a polymorphic well-typing for a logic program. Our motivation is to improve the automation of termination analysis by deriving types from which norms can automatically be constructed. Previous work on type-based termination analysis used either types declared...... by the user, or automatically generated monomorphic types describing the success set of predicates. The latter types are less precise and result in weaker termination conditions than those obtained from declared types. Our type inference procedure involves solving set constraints generated from the program...... and derives a well-typing in contrast to a success-set approximation. Experiments so far show that our automatically inferred well-typings are close to the declared types and result in termination conditions that are as strong as those obtained with declared types. We describe the method, its implementation...

  8. Penicillium simile sp. nov. revealed by morphological and phylogenetic analysis.

    Science.gov (United States)

    Davolos, Domenico; Pietrangeli, Biancamaria; Persiani, Anna Maria; Maggi, Oriana

    2012-02-01

    The morphology of three phenetically identical Penicillium isolates, collected from the bioaerosol in a restoration laboratory in Italy, displayed macro- and microscopic characteristics that were similar though not completely ascribable to Penicillium raistrickii. For this reason, a phylogenetic approach based on DNA sequencing analysis was performed to establish both the taxonomic status and the evolutionary relationships of these three peculiar isolates in relation to previously described species of the genus Penicillium. We used four nuclear loci (both rRNA and protein coding genes) that have previously proved useful for the molecular investigation of taxa belonging to the genus Penicillium at various evolutionary levels. The internal transcribed spacer region (ITS1-5.8S-ITS2), domains D1 and D2 of the 28S rDNA, a region of the tubulin beta chain gene (benA) and part of the calmodulin gene (cmd) were amplified by PCR and sequenced. Analysis of the rRNA genes and of the benA and cmd sequence data indicates the presence of three isogenic isolates belonging to a genetically distinct species of the genus Penicillium, here described and named Penicillium simile sp. nov. (ATCC MYA-4591(T)  = CBS 129191(T)). This novel species is phylogenetically different from P. raistrickii and other related species of the genus Penicillium (e.g. Penicillium scabrosum), from which it can be distinguished on the basis of morphological trait analysis.

  9. Mixed integer linear programming for maximum-parsimony phylogeny inference.

    Science.gov (United States)

    Sridhar, Srinath; Lam, Fumei; Blelloch, Guy E; Ravi, R; Schwartz, Russell

    2008-01-01

    Reconstruction of phylogenetic trees is a fundamental problem in computational biology. While excellent heuristic methods are available for many variants of this problem, new advances in phylogeny inference will be required if we are to be able to continue to make effective use of the rapidly growing stores of variation data now being gathered. In this paper, we present two integer linear programming (ILP) formulations to find the most parsimonious phylogenetic tree from a set of binary variation data. One method uses a flow-based formulation that can produce exponential numbers of variables and constraints in the worst case. The method has, however, proven extremely efficient in practice on datasets that are well beyond the reach of the available provably efficient methods, solving several large mtDNA and Y-chromosome instances within a few seconds and giving provably optimal results in times competitive with fast heuristics that cannot guarantee optimality. An alternative formulation establishes that the problem can be solved with a polynomial-sized ILP. We further present a web server developed based on the exponential-sized ILP that performs fast maximum parsimony inferences and serves as a front end to a database of precomputed phylogenies spanning the human genome.
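
    To make the parsimony objective concrete, the sketch below scores fixed candidate trees against binary data with Fitch's small-parsimony algorithm and compares the totals. This is not either ILP formulation from the article, which searches over trees rather than scoring given ones. The nested-tuple binary trees, the data matrix, and the function names are hypothetical.

```python
def fitch_site_score(tree, states):
    """Minimum number of state changes for one site on a rooted binary tree.

    tree: nested 2-tuples with string leaves; states: leaf name -> state.
    """
    changes = 0

    def visit(node):
        nonlocal changes
        if isinstance(node, str):
            return {states[node]}
        left, right = (visit(child) for child in node)
        common = left & right
        if common:
            return common
        changes += 1  # children disagree: one change is needed on this pair of branches
        return left | right

    visit(tree)
    return changes


def parsimony_score(tree, matrix):
    """Sum of per-site Fitch scores; matrix maps leaf name -> string of '0'/'1'."""
    n_sites = len(next(iter(matrix.values())))
    return sum(
        fitch_site_score(tree, {taxon: row[i] for taxon, row in matrix.items()})
        for i in range(n_sites)
    )


# Hypothetical binary variation data for four taxa and three sites.
MATRIX = {"A": "110", "B": "100", "C": "011", "D": "001"}
TREE_1 = (("A", "B"), ("C", "D"))
TREE_2 = (("A", "C"), ("B", "D"))

if __name__ == "__main__":
    print(parsimony_score(TREE_1, MATRIX), parsimony_score(TREE_2, MATRIX))
```

    Scoring a given tree is cheap; the hard part addressed by the ILP formulations is finding the tree with the minimum score among all possible trees.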

  10. Phylogenetic Diversity, Distribution, and Cophylogeny of Giant Bacteria (Epulopiscium) with their Surgeonfish Hosts in the Red Sea

    KAUST Repository

    Miyake, Sou

    2016-03-14

    reflected by inferred differences in the host diets. Overall, our analysis identified a large phylogenetic diversity of Epulopiscium (up to 10% sequence divergence of 16S rRNA genes), which lets us hypothesize that there are multiple species that are spread across guts of different host species.

  11. Phylogenetic Diversity, Distribution, and Cophylogeny of Giant Bacteria (Epulopiscium) with their Surgeonfish Hosts in the Red Sea

    Science.gov (United States)

    Miyake, Sou; Ngugi, David K.; Stingl, Ulrich

    2016-01-01

    reflected by inferred differences in the host diets. Overall, our analysis identified a large phylogenetic diversity of Epulopiscium (up to 10% sequence divergence of 16S rRNA genes), which lets us hypothesize that there are multiple species that are spread across guts of different host species. PMID:27014209

  12. Phylogenetic Diversity, Distribution, and Cophylogeny of Giant Bacteria (Epulopiscium) with their Surgeonfish Hosts in the Red Sea

    KAUST Repository

    Miyake, Sou; Ngugi, David; Stingl, Ulrich

    2016-01-01

    reflected by inferred differences in the host diets. Overall, our analysis identified a large phylogenetic diversity of Epulopiscium (up to 10% sequence divergence of 16S rRNA genes), which lets us hypothesize that there are multiple species that are spread across guts of different host species.

  13. Phylogenetic analysis of several Thermus strains from Rehai of Tengchong, Yunnan, China.

    Science.gov (United States)

    Lin, Lianbing; Zhang, Jie; Wei, Yunlin; Chen, Chaoyin; Peng, Qian

    2005-10-01

    Several Thermus strains were isolated from 10 hot springs of the Rehai geothermal area in Tengchong, Yunnan province. The diversity of Thermus strains was examined by sequencing the 16S rRNA genes and comparing their sequences. Phylogenetic analysis showed that the 16S rDNA sequences from the Rehai geothermal isolates form four branches in the phylogenetic tree and had greater than 95.9% similarity in the phylogroup. Secondary structure comparison also indicated that the 16S rRNA from the Rehai geothermal isolates have unique secondary structure characteristics in helix 6, helix 9, and helix 10 (reference to Escherichia coli). This research is the first attempt to reveal the diversity of Thermus strains that are distributed in the Rehai geothermal area.

  14. Phylogenetic trees

    OpenAIRE

    Baños, Hector; Bushek, Nathaniel; Davidson, Ruth; Gross, Elizabeth; Harris, Pamela E.; Krone, Robert; Long, Colby; Stewart, Allen; Walker, Robert

    2016-01-01

    We introduce the package PhylogeneticTrees for Macaulay2 which allows users to compute phylogenetic invariants for group-based tree models. We provide some background information on phylogenetic algebraic geometry and show how the package PhylogeneticTrees can be used to calculate a generating set for a phylogenetic ideal as well as a lower bound for its dimension. Finally, we show how methods within the package can be used to compute a generating set for the join of any two ideals.

  15. Phylogenetic relationships between Sarcocystis species from reindeer and other Sarcocystidae deduced from ssu rRNA gene sequences

    DEFF Research Database (Denmark)

    Dahlgren, S.S.; Oliveira, Rodrigo Gouveia; Gjerde, B.

    2008-01-01

    any effect on previously inferred phylogenetic relationships within the Sarcocystidae. The complete small subunit (ssu) rRNA gene sequences of all six Sarcocystis species from reindeer were used in the phylogenetic analyses along with ssu rRNA gene sequences of 85 other members of the Coccidea. Trees...... the six species in phylogenetic analyses of the Sarcocystidae, and also to investigate the phylogenetic relationships between the species from reindeer and those from other hosts. The study also aimed at revealing whether the inclusion of six Sarcocystis species from the same intermediate host would have....... tarandivulpes, formed a sister group to other Sarcocystis species with a canine definitive host. The position of S. hardangeri on the tree suggested that it uses another type of definitive host than the other Sarcocystis species in this clade. Considering the geographical distribution and infection intensity...

  16. Phylogenetic analysis of methanogens from the bovine rumen

    Directory of Open Access Journals (Sweden)

    Forster Robert J

    2001-05-01

    Full Text Available Abstract Background Interest in methanogens from ruminants has resulted from the role of methane in global warming and from the fact that cattle typically lose 6% of ingested energy as methane. Several species of methanogens have been isolated from ruminants. However they are difficult to culture, few have been consistently found in high numbers, and it is likely that major species of rumen methanogens are yet to be identified. Results Total DNA from clarified bovine rumen fluid was amplified using primers specific for Archaeal 16S rRNA gene sequences (rDNA). Phylogenetic analysis of 41 rDNA sequences identified three clusters of methanogens. The largest cluster contained two distinct subclusters with rDNA sequences similar to Methanobrevibacter ruminantium 16S rDNA. A second cluster contained sequences related to 16S rDNA from Methanosphaera stadtmanae, an organism not previously described in the rumen. The third cluster contained rDNA sequences that may form a novel group of rumen methanogens. Conclusions The current set of 16S rRNA hybridization probes targeting methanogenic Archaea does not cover the phylogenetic diversity present in the rumen and possibly other gastro-intestinal tract environments. New probes and quantitative PCR assays are needed to determine the distribution of the newly identified methanogen clusters in rumen microbial communities.

  17. Phylogenetic analyses provide insights into the historical biogeography and evolution of Brachyrhaphis fishes.

    Science.gov (United States)

    Ingley, Spencer J; Reina, Ruth G; Bermingham, Eldredge; Johnson, Jerald B

    2015-08-01

    The livebearing fish genus Brachyrhaphis (Poeciliidae) has become an increasingly important model in evolution and ecology research, yet the phylogeny of this group is not well understood, nor has it been examined thoroughly using modern phylogenetic methods. Here, we present the first comprehensive phylogenetic analysis of Brachyrhaphis by using four molecular markers (3mtDNA, 1nucDNA) to infer relationships among species in this genus. We tested the validity of this genus as a monophyletic group using extensive outgroup sampling based on recent phylogenetic hypotheses of Poeciliidae. We also tested the validity of recently described species of Brachyrhaphis that are part of the B. episcopi complex in Panama. Finally, we examined the impact of historical events on diversification of Brachyrhaphis, and made predictions regarding the role of different ecological environments on evolutionary diversification where known historical events apparently fail to explain speciation. Based on our results, we reject the monophyly of Brachyrhaphis, and question the validity of two recently described species (B. hessfeldi and B. roswithae). Historical biogeography of Brachyrhaphis generally agrees with patterns found in other freshwater taxa in Lower Central America, which show that geological barriers frequently predict speciation. Specifically, we find evidence in support of an 'island' model of Lower Central American formation, which posits that the nascent isthmus was partitioned by several marine connections before linking North and South America. In some cases where historic events (e.g., vicariance) fail to explain allopatric species breaks in Brachyrhaphis, ecological processes (e.g., divergent predation environments) offer additional insight into our understanding of phylogenetic diversification in this group. Copyright © 2015 Elsevier Inc. All rights reserved.

  18. Phylogenetic relationships among species of Lutzomyia, subgenus Lutzomyia (Diptera: Psychodidae).

    Science.gov (United States)

    Pinto, Israel S; Filho, José D Andrade; Santos, Claudiney B; Falqueto, Aloísio; Leite, Yuri L R

    2010-01-01

    Lutzomyia França is the largest and most diverse sand fly genus in the New World and contains all the species involved in the transmission of American visceral leishmaniasis (AVL). Morphological characters were used to test the monophyly and to infer phylogenetic relationships among members of the Lutzomyia subgenus. Fifty-two morphological characters from male and female adult specimens belonging to 18 species of Lu. (Lutzomyia) were scored and analyzed. The resulting phylogeny confirms the monophyly of this subgenus and reveals four main internal clades. These four clades, however, do not support the classification of the subgenus in two series, longipalpis and cavernicola, because neither is necessarily monophyletic. Knowledge on phylogenetic relationships among these relevant vectors of AVL should be used as a tool for monitoring target taxa and a first step for establishing an early warning system for disease control.

  19. Inference of Tumor Phylogenies with Improved Somatic Mutation Discovery

    KAUST Repository

    Salari, Raheleh

    2013-01-01

    Next-generation sequencing technologies provide a powerful tool for studying genome evolution during progression of advanced diseases such as cancer. Although many recent studies have employed new sequencing technologies to detect mutations across multiple, genetically related tumors, current methods do not exploit available phylogenetic information to improve the accuracy of their variant calls. Here, we present a novel algorithm that uses somatic single nucleotide variations (SNVs) in multiple, related tissue samples as lineage markers for phylogenetic tree reconstruction. Our method then leverages the inferred phylogeny to improve the accuracy of SNV discovery. Experimental analyses demonstrate that our method achieves up to 32% improvement for somatic SNV calling of multiple related samples over the accuracy of GATK's Unified Genotyper, the state-of-the-art multisample SNV caller. © 2013 Springer-Verlag.

  20. Effects analysis fuzzy inference system in nuclear problems using approximate reasoning

    International Nuclear Information System (INIS)

    Guimaraes, Antonio C.F.; Franklin Lapa, Celso Marcelo

    2004-01-01

    In this paper, a fuzzy inference system modeling technique applied to failure mode and effects analysis (FMEA) is introduced for nuclear reactor problems. This method uses the concept of a pure fuzzy logic system to treat the traditional FMEA parameters: probabilities of occurrence, severity and detection. The auxiliary feed-water system of a typical two-loop pressurized water reactor (PWR) was used as a practical example in this analysis. The kernel result is the conceptual comparison between the traditional risk priority number (RPN) and the fuzzy risk priority number (FRPN) obtained from expert opinion. The set of results demonstrated the great potential of the inference system and the advantage of the gray approach in this class of problems.
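    The traditional risk priority number mentioned above is conventionally the product of the occurrence, severity and detection scores. The sketch below shows only that classical calculation, not the fuzzy variant from the paper; the 1 to 10 scoring scale, the failure-mode names and the scores are illustrative assumptions, not data from the study.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    name: str
    occurrence: int  # assumed 1..10 scale, higher = more frequent
    severity: int    # assumed 1..10 scale, higher = more severe
    detection: int   # assumed 1..10 scale, higher = harder to detect


def rpn(mode: FailureMode) -> int:
    """Classical risk priority number: occurrence * severity * detection."""
    for score in (mode.occurrence, mode.severity, mode.detection):
        if not 1 <= score <= 10:
            raise ValueError("scores are assumed to lie on a 1..10 scale")
    return mode.occurrence * mode.severity * mode.detection


# Hypothetical failure modes, invented for illustration only.
modes = [
    FailureMode("pump fails to start", occurrence=3, severity=8, detection=4),
    FailureMode("valve stuck closed", occurrence=2, severity=9, detection=6),
]

# Rank failure modes from highest to lowest priority.
ranked = sorted(modes, key=rpn, reverse=True)
for mode in ranked:
    print(mode.name, rpn(mode))
```

    A fuzzy version would replace the plain product with rule-based inference over linguistic categories, which is the part the paper contributes and which this sketch does not attempt.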

  1. Ixodes ricinus tick lipocalins: identification, cloning, phylogenetic analysis and biochemical characterization.

    Directory of Open Access Journals (Sweden)

    Jérôme Beaufays

    Full Text Available BACKGROUND: During their blood meal, ticks secrete a wide variety of proteins that interfere with their host's defense mechanisms. Among these proteins, lipocalins play a major role in the modulation of the inflammatory response. METHODOLOGY/PRINCIPAL FINDINGS: Screening a cDNA library in association with RT-PCR and RACE methodologies allowed us to identify 14 new lipocalin genes in the salivary glands of the Ixodes ricinus hard tick. A computational in-depth structural analysis confirmed that LIRs belong to the lipocalin family. These proteins were called LIR for "Lipocalin from I. ricinus" and numbered from 1 to 14 (LIR1 to LIR14). According to their percentage identity/similarity, LIR proteins may be assigned to 6 distinct phylogenetic groups. The mature proteins have calculated pM and pI varying from 21.8 kDa to 37.2 kDa and from 4.45 to 9.57 respectively. In a western blot analysis, all recombinant LIRs appeared as a series of thin bands at 50-70 kDa, suggesting extensive glycosylation, which was experimentally confirmed by treatment with N-glycosidase F. In addition, the in vivo expression analysis of LIRs in I. ricinus, examined by RT-PCR, showed homogeneous expression profiles for certain phylogenetic groups and relatively heterogeneous profiles for other groups. Finally, we demonstrated that LIR6 codes for a protein that specifically binds leukotriene B4. CONCLUSIONS/SIGNIFICANCE: This work confirms that, regarding their biochemical properties, expression profile, and sequence signature, lipocalins in Ixodes hard tick genus, and more specifically in the Ixodes ricinus species, are segregated into distinct phylogenetic groups suggesting potential distinct function. This was particularly demonstrated by the ability of LIR6 to scavenge leukotriene B4. The other LIRs did not bind any of the ligands tested, such as 5-hydroxytryptamine, ADP, norepinephrine, platelet activating factor, prostaglandins D2 and E2, and finally leukotrienes B4 and C

  2. Phylo_dCor: distance correlation as a novel metric for phylogenetic profiling.

    Science.gov (United States)

    Sferra, Gabriella; Fratini, Federica; Ponzi, Marta; Pizzi, Elisabetta

    2017-09-05

    Elaboration of powerful methods to predict functional and/or physical protein-protein interactions from genome sequence is one of the main tasks in the post-genomic era. Phylogenetic profiling allows the prediction of protein-protein interactions at a whole genome level in both Prokaryotes and Eukaryotes. For this reason it is considered one of the most promising methods. Here, we propose an improvement of phylogenetic profiling that enables the handling of large genomic datasets and the inference of global protein-protein interactions. This method uses the distance correlation as a new measure of phylogenetic profile similarity. We constructed robust reference sets and developed Phylo-dCor, a parallelized version of the algorithm for calculating the distance correlation that makes it applicable to large genomic data. Using Saccharomyces cerevisiae and Escherichia coli genome datasets, we showed that Phylo-dCor outperforms phylogenetic profiling methods previously described based on the mutual information and Pearson's correlation as measures of profile similarity. In this work, we constructed and assessed robust reference sets and propose the distance correlation as a measure for comparing phylogenetic profiles. To make it applicable to large genomic data, we developed Phylo-dCor, a parallelized version of the algorithm for calculating the distance correlation. Two R scripts that can be run on a wide range of machines are available upon request.
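    To make the similarity measure concrete, here is a minimal, unparallelized sketch of the standard sample distance correlation between two profiles, each treated as a vector with one value per genome. It is not the Phylo-dCor code (which the abstract describes as R scripts); the function names and the toy profiles are assumptions made for illustration.

```python
import math


def _centered_distances(values):
    """Double-centered matrix of pairwise absolute differences."""
    n = len(values)
    d = [[abs(values[i] - values[j]) for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in d]  # equals column means (d is symmetric)
    grand_mean = sum(row_mean) / n
    return [
        [d[i][j] - row_mean[i] - row_mean[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def distance_correlation(x, y):
    """Sample distance correlation of two equal-length numeric profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    n = len(x)
    a = _centered_distances(x)
    b = _centered_distances(y)
    dcov2 = sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / n**2
    dvar_x = sum(a[i][j] ** 2 for i in range(n) for j in range(n)) / n**2
    dvar_y = sum(b[i][j] ** 2 for i in range(n) for j in range(n)) / n**2
    denom = math.sqrt(dvar_x * dvar_y)
    if denom == 0.0:
        return 0.0  # a constant profile carries no dependence information
    return math.sqrt(max(dcov2, 0.0) / denom)


# Toy presence/absence profiles across five hypothetical genomes.
profile_a = [1, 0, 1, 1, 0]
profile_b = [1, 0, 1, 1, 0]
profile_c = [1, 1, 1, 1, 1]
print(distance_correlation(profile_a, profile_b))
print(distance_correlation(profile_a, profile_c))
```

    This version costs quadratic time and memory per pair of profiles, which is presumably why a parallelized implementation is attractive for genome-scale data.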

  3. Novel Method To Identify Source-Associated Phylogenetic Clustering Shows that Listeria monocytogenes Includes Niche-Adapted Clonal Groups with Distinct Ecological Preferences

    DEFF Research Database (Denmark)

    Nightingale, K. K.; Lyles, K.; Ayodele, M.

    2006-01-01

    population are identified (TreeStats test). Analysis of sequence data for 120 L. monocytogenes isolates revealed evidence of clustering between isolates from the same source, based on the phylogenies inferred from actA and inlA (P = 0.02 and P = 0.07, respectively; SourceCluster test). Overall, the Tree...... are biologically valid. Overall, our data show that (i) the SourceCluster and TreeStats tests can identify biologically meaningful source-associated phylogenetic clusters and (ii) L. monocytogenes includes clonal groups that have adapted to infect specific host species or colonize nonhost environments......., including humans, animals, and food. If the null hypothesis that the genetic distances for isolates within and between source populations are identical can be rejected (SourceCluster test), then particular clades in the phylogenetic tree with significant overrepresentation of sequences from a given source...

  4. Developmental and Ultrastructural Characterization and Phylogenetic Analysis of Trypanosoma herthameyeri n. sp. of Brazilian Leptodactilydae Frogs.

    Science.gov (United States)

    Attias, Márcia; Sato, Lyslaine H; Ferreira, Robson C; Takata, Carmen S A; Campaner, Marta; Camargo, Erney P; Teixeira, Marta M G; de Souza, Wanderley

    2016-09-01

    We described the phylogenetic affiliation, development in cultures and ultrastructural features of a trypanosome of Leptodacylus chaquensis from the Pantanal biome of Brazil. In the inferred phylogeny, this trypanosome nested into the Anura clade of the basal Aquatic clade of Trypanosoma, but was separate from all known species within this clade. This finding enabled us to describe it as Trypanosoma herthameyeri n. sp., which also infects other Leptodacylus species from the Pantanal and Caatinga biomes. Trypanosoma herthameyeri multiplies as small rounded forms clumped together and evolving into multiple-fission forms and rosettes of epimastigotes released as long forms with long flagella; scarce trypomastigotes and glove-like forms are common in stationary-phase cultures. For the first time, a trypanosome from an amphibian was observed by field emission scanning electron microscopy, revealing a cytostome opening, well-developed flagellar lamella, and many grooves in pumpkin-like forms. Transmission electron microscopy showed highly developed Golgi complexes, relaxed catenation of KDNA, and a rich set of spongiome tubules in a regular parallel arrangement to the flagellar pocket as confirmed by electron tomography. Considering the basal position in the phylogenetic tree, developmental and ultrastructural data of T. herthameyeri are valuable for evolutionary studies of trypanosome architecture and cell biology. © 2016 The Author(s) Journal of Eukaryotic Microbiology © 2016 International Society of Protistologists.

  5. Comparative evolutionary diversity and phylogenetic structure across multiple forest dynamics plots: a mega-phylogeny approach

    Science.gov (United States)

    Erickson, David L.; Jones, Frank A.; Swenson, Nathan G.; Pei, Nancai; Bourg, Norman A.; Chen, Wenna; Davies, Stuart J.; Ge, Xue-jun; Hao, Zhanqing; Howe, Robert W.; Huang, Chun-Lin; Larson, Andrew J.; Lum, Shawn K. Y.; Lutz, James A.; Ma, Keping; Meegaskumbura, Madhava; Mi, Xiangcheng; Parker, John D.; Fang-Sun, I.; Wright, S. Joseph; Wolf, Amy T.; Ye, W.; Xing, Dingliang; Zimmerman, Jess K.; Kress, W. John

    2014-01-01

    Forest dynamics plots, which now span longitudes, latitudes, and habitat types across the globe, offer unparalleled insights into the ecological and evolutionary processes that determine how species are assembled into communities. Understanding phylogenetic relationships among species in a community has become an important component of assessing assembly processes. However, the application of evolutionary information to questions in community ecology has been limited in large part by the lack of accurate estimates of phylogenetic relationships among individual species found within communities, and is particularly limiting in comparisons between communities. Therefore, streamlining and maximizing the information content of these community phylogenies is a priority. To test the viability and advantage of a multi-community phylogeny, we constructed a multi-plot mega-phylogeny of 1347 species of trees across 15 forest dynamics plots in the ForestGEO network using DNA barcode sequence data (rbcL, matK, and psbA-trnH) and compared community phylogenies for each individual plot with respect to support for topology and branch lengths, which affect evolutionary inference of community processes. The levels of taxonomic differentiation across the phylogeny were examined by quantifying the frequency of resolved nodes throughout. In addition, three phylogenetic distance (PD) metrics that are commonly used to infer assembly processes were estimated for each plot [PD, Mean Phylogenetic Distance (MPD), and Mean Nearest Taxon Distance (MNTD)]. Lastly, we examine the partitioning of phylogenetic diversity among community plots through quantification of inter-community MPD and MNTD. Overall, evolutionary relationships were highly resolved across the DNA barcode-based mega-phylogeny, and phylogenetic resolution for each community plot was improved when estimated within the context of the mega-phylogeny. Likewise, when compared with phylogenies for individual plots, estimates of
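    As a rough illustration of two of the metrics named above, the sketch below computes MPD and MNTD for a single community from a table of pairwise phylogenetic distances, using the usual unweighted definitions (mean over all species pairs, and mean distance from each species to its nearest relative in the community). The taxon labels and distances are invented; in practice the distances would come from the branch lengths of the phylogeny.

```python
from itertools import combinations


def _distance(dist, a, b):
    """Look up a symmetric pairwise distance stored under either key order."""
    return dist[(a, b)] if (a, b) in dist else dist[(b, a)]


def mpd(community, dist):
    """Mean pairwise phylogenetic distance among species in a community."""
    pairs = list(combinations(community, 2))
    if not pairs:
        raise ValueError("need at least two species")
    return sum(_distance(dist, a, b) for a, b in pairs) / len(pairs)


def mntd(community, dist):
    """Mean distance from each species to its nearest relative in the community."""
    if len(community) < 2:
        raise ValueError("need at least two species")
    nearest = [
        min(_distance(dist, a, b) for b in community if b != a)
        for a in community
    ]
    return sum(nearest) / len(nearest)


# Hypothetical pairwise distances for four taxa (illustrative values only).
dist = {
    ("A", "B"): 2.0, ("A", "C"): 6.0, ("B", "C"): 6.0,
    ("A", "D"): 10.0, ("B", "D"): 10.0, ("C", "D"): 10.0,
}
plot = ["A", "B", "C"]
print(mpd(plot, dist), mntd(plot, dist))
```

    Because both metrics depend directly on branch lengths, poorly resolved or poorly supported community phylogenies would shift them, which is the concern the abstract raises.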

  6. Phylo.io: Interactive Viewing and Comparison of Large Phylogenetic Trees on the Web.

    Science.gov (United States)

    Robinson, Oscar; Dylus, David; Dessimoz, Christophe

    2016-08-01

    Phylogenetic trees are pervasively used to depict evolutionary relationships. Increasingly, researchers need to visualize large trees and compare multiple large trees inferred for the same set of taxa (reflecting uncertainty in the tree inference or genuine discordance among the loci analyzed). Existing tree visualization tools are however not well suited to these tasks. In particular, side-by-side comparison of trees can prove challenging beyond a few dozen taxa. Here, we introduce Phylo.io, a web application to visualize and compare phylogenetic trees side-by-side. Its distinctive features are: highlighting of similarities and differences between two trees, automatic identification of the best matching rooting and leaf order, scalability to large trees, high usability, multiplatform support via standard HTML5 implementation, and possibility to store and share visualizations. The tool can be freely accessed at http://phylo.io and can easily be embedded in other web servers. The code for the associated JavaScript library is available at https://github.com/DessimozLab/phylo-io under an MIT open source license. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  7. Maximum Parsimony on Phylogenetic networks

    Science.gov (United States)

    2012-01-01

    Background Phylogenetic networks are generalizations of phylogenetic trees that are used to model evolutionary events in various contexts. Several different methods and criteria have been introduced for reconstructing phylogenetic trees. Maximum Parsimony is a character-based approach that infers a phylogenetic tree by minimizing the total number of evolutionary steps required to explain a given set of data assigned on the leaves. Exact solutions for optimizing parsimony scores on phylogenetic trees have been introduced in the past. Results In this paper, we define the parsimony score on networks as the sum of the substitution costs along all the edges of the network, and show that certain well-known algorithms that calculate the optimum parsimony score on trees, such as the Sankoff and Fitch algorithms, extend naturally to networks, barring conflicting assignments at the reticulate vertices. We provide heuristics for finding the optimum parsimony scores on networks. Our algorithms can be applied for any cost matrix that may contain unequal substitution costs of transforming between different characters along different edges of the network. We analyzed this for experimental data on 10 leaves or fewer with at most 2 reticulations and found that for almost all networks, the bounds returned by the heuristics matched with the exhaustively determined optimum parsimony scores. Conclusion The parsimony score we define here does not directly reflect the cost of the best tree in the network that displays the evolution of the character. However, when searching for the most parsimonious network that describes a collection of characters, it becomes necessary to add additional cost considerations to prefer simpler structures, such as trees over networks. The parsimony score on a network that we describe here takes into account the substitution costs along the additional edges incident on each reticulate vertex, in addition to the substitution costs along the other edges which are
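    For reference, the tree version of the Fitch algorithm mentioned above can be sketched in a few lines. This is the classical single-character, unit-cost case on a rooted binary tree, not the network extension developed in the paper; the nested-tuple tree encoding and the leaf names are assumptions made for the example.

```python
def fitch_score(tree, states):
    """Minimum number of state changes for one character on a rooted binary tree.

    `tree` is a leaf name (str) or a pair (left, right) of subtrees;
    `states` maps each leaf name to its observed character state.
    """
    def visit(node):
        if isinstance(node, str):
            return {states[node]}, 0
        left, right = node
        left_set, left_cost = visit(left)
        right_set, right_cost = visit(right)
        common = left_set & right_set
        if common:
            # Children agree on at least one state: no change needed here.
            return common, left_cost + right_cost
        # Children disagree: take the union and count one substitution.
        return left_set | right_set, left_cost + right_cost + 1

    return visit(tree)[1]


# Hypothetical four-leaf tree and two characters.
tree = (("a", "b"), ("c", "d"))
print(fitch_score(tree, {"a": "A", "b": "A", "c": "C", "d": "C"}))
print(fitch_score(tree, {"a": "A", "b": "C", "c": "A", "d": "C"}))
```

    With unequal substitution costs, the Sankoff dynamic program would be used instead; on a network, a reticulate vertex has more than one parent, which is where the conflicting assignments noted in the abstract can arise.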

  8. Fossil biogeography: a new model to infer dispersal, extinction and sampling from palaeontological data.

    Science.gov (United States)

    Silvestro, Daniele; Zizka, Alexander; Bacon, Christine D; Cascales-Miñana, Borja; Salamin, Nicolas; Antonelli, Alexandre

    2016-04-05

    Methods in historical biogeography have revolutionized our ability to infer the evolution of ancestral geographical ranges from phylogenies of extant taxa, the rates of dispersals, and biotic connectivity among areas. However, extant taxa are likely to provide limited and potentially biased information about past biogeographic processes, due to extinction, asymmetrical dispersals and variable connectivity among areas. Fossil data hold considerable information about past distribution of lineages, but suffer from largely incomplete sampling. Here we present a new dispersal-extinction-sampling (DES) model, which estimates biogeographic parameters using fossil occurrences instead of phylogenetic trees. The model estimates dispersal and extinction rates while explicitly accounting for the incompleteness of the fossil record. Rates can vary between areas and through time, thus providing the opportunity to assess complex scenarios of biogeographic evolution. We implement the DES model in a Bayesian framework and demonstrate through simulations that it can accurately infer all the relevant parameters. We demonstrate the use of our model by analysing the Cenozoic fossil record of land plants and inferring dispersal and extinction rates across Eurasia and North America. Our results show that biogeographic range evolution is not a time-homogeneous process, as assumed in most phylogenetic analyses, but varies through time and between areas. In our empirical assessment, this is shown by the striking predominance of plant dispersals from Eurasia into North America during the Eocene climatic cooling, followed by a shift in the opposite direction, and finally, a balance in biotic interchange since the middle Miocene. We conclude by discussing the potential of fossil-based analyses to test biogeographic hypotheses and improve phylogenetic methods in historical biogeography. © 2016 The Author(s).

  9. Species trees from consensus single nucleotide polymorphism (SNP) data: Testing phylogenetic approaches with simulated and empirical data.

    Science.gov (United States)

    Schmidt-Lebuhn, Alexander N; Aitken, Nicola C; Chuah, Aaron

    2017-11-01

    Datasets of hundreds or thousands of SNPs (Single Nucleotide Polymorphisms) from multiple individuals per species are increasingly used to study population structure, species delimitation and shallow phylogenetics. The principal software tool to infer species or population trees from SNP data is currently the BEAST template SNAPP which uses a Bayesian coalescent analysis. However, it is computationally extremely demanding and tolerates only small amounts of missing data. We used simulated and empirical SNPs from plants (Australian Craspedia, Asteraceae, and Pelargonium, Geraniaceae) to compare species trees produced (1) by SNAPP, (2) using SVD quartets, and (3) using Bayesian and parsimony analysis with several different approaches to summarising data from multiple samples into one set of traits per species. Our aims were to explore the impact of tree topology and missing data on the results, and to test which data summarising and analyses approaches would best approximate the results obtained from SNAPP for empirical data. SVD quartets retrieved the correct topology from simulated data, as did SNAPP except in the case of a very unbalanced phylogeny. Both methods failed to retrieve the correct topology when large amounts of data were missing. Bayesian analysis of species level summary data scoring the two alleles of each SNP as independent characters and parsimony analysis of data scoring each SNP as one character produced trees with branch length distributions closest to the true trees on which SNPs were simulated. For empirical data, Bayesian inference and Dollo parsimony analysis of data scored allele-wise produced phylogenies most congruent with the results of SNAPP. In the case of study groups divergent enough for missing data to be phylogenetically informative (because of additional mutations preventing amplification of genomic fragments or bioinformatic establishment of homology), scoring of SNP data as a presence/absence matrix irrespective of allele

  10. Phylogenetic relationships, diversification and expansion of chili peppers (Capsicum, Solanaceae).

    Science.gov (United States)

    Carrizo García, Carolina; Barfuss, Michael H J; Sehr, Eva M; Barboza, Gloria E; Samuel, Rosabelle; Moscone, Eduardo A; Ehrendorfer, Friedrich

    2016-07-01

    Capsicum (Solanaceae), native to the tropical and temperate Americas, comprises the well-known sweet and hot chili peppers and several wild species. So far, only partial taxonomic and phylogenetic analyses have been done for the genus. Here, the phylogenetic relationships between nearly all taxa of Capsicum were explored to test the monophyly of the genus and to obtain a better knowledge of species relationships, diversification and expansion. Thirty-four of approximately 35 Capsicum species were sampled. Maximum parsimony and Bayesian inference analyses were performed using two plastid markers (matK and psbA-trnH) and one single-copy nuclear gene (waxy). The evolutionary changes of nine key features were reconstructed following the parsimony ancestral states method. Ancestral areas were reconstructed through a Bayesian Markov chain Monte Carlo analysis. Capsicum forms a monophyletic clade, with Lycianthes as a sister group, following both phylogenetic approaches. Eleven well-supported clades (four of them monotypic) can be recognized within Capsicum, although some interspecific relationships need further analysis. A few features are useful to characterize different clades (e.g. fruit anatomy, chromosome base number), whereas some others are highly homoplastic (e.g. seed colour). The origin of Capsicum is postulated in an area along the Andes of western to north-western South America. The expansion of the genus has followed a clockwise direction around the Amazon basin, towards central and south-eastern Brazil, then back to western South America, and finally northwards to Central America. New insights are provided regarding interspecific relationships, character evolution, and geographical origin and expansion of Capsicum. A clearly distinct early-diverging clade can be distinguished, centred in western-north-western South America. Subsequent rapid speciation has led to the origin of the remaining clades. The diversification of Capsicum has culminated in the origin

  11. Phylogenetic relationships, diversification and expansion of chili peppers (Capsicum, Solanaceae)

    Science.gov (United States)

    Carrizo García, Carolina; Barfuss, Michael H. J.; Sehr, Eva M.; Barboza, Gloria E.; Samuel, Rosabelle; Moscone, Eduardo A.; Ehrendorfer, Friedrich

    2016-01-01

    Background and Aims Capsicum (Solanaceae), native to the tropical and temperate Americas, comprises the well-known sweet and hot chili peppers and several wild species. So far, only partial taxonomic and phylogenetic analyses have been done for the genus. Here, the phylogenetic relationships between nearly all taxa of Capsicum were explored to test the monophyly of the genus and to obtain a better knowledge of species relationships, diversification and expansion. Methods Thirty-four of approximately 35 Capsicum species were sampled. Maximum parsimony and Bayesian inference analyses were performed using two plastid markers (matK and psbA-trnH) and one single-copy nuclear gene (waxy). The evolutionary changes of nine key features were reconstructed following the parsimony ancestral states method. Ancestral areas were reconstructed through a Bayesian Markov chain Monte Carlo analysis. Key Results Capsicum forms a monophyletic clade, with Lycianthes as a sister group, following both phylogenetic approaches. Eleven well-supported clades (four of them monotypic) can be recognized within Capsicum, although some interspecific relationships need further analysis. A few features are useful to characterize different clades (e.g. fruit anatomy, chromosome base number), whereas some others are highly homoplastic (e.g. seed colour). The origin of Capsicum is postulated in an area along the Andes of western to north-western South America. The expansion of the genus has followed a clockwise direction around the Amazon basin, towards central and south-eastern Brazil, then back to western South America, and finally northwards to Central America. Conclusions New insights are provided regarding interspecific relationships, character evolution, and geographical origin and expansion of Capsicum. A clearly distinct early-diverging clade can be distinguished, centred in western–north-western South America. Subsequent rapid speciation has led to the origin of the remaining clades. The

  12. Algorithms for MDC-Based Multi-locus Phylogeny Inference

    Science.gov (United States)

    Yu, Yun; Warnow, Tandy; Nakhleh, Luay

    One of the criteria for inferring a species tree from a collection of gene trees, when gene tree incongruence is assumed to be due to incomplete lineage sorting (ILS), is to minimize deep coalescences (MDC). Exact algorithms for inferring the species tree from rooted, binary gene trees under MDC were recently introduced. Nevertheless, in phylogenetic analyses of biological data sets, estimated gene trees may differ from true gene trees, be incompletely resolved, and not necessarily be rooted. In this paper, we propose new MDC formulations for the cases where the gene trees are unrooted/binary, rooted/non-binary, and unrooted/non-binary. Further, we prove structural theorems that allow us to extend the algorithms for the rooted/binary gene tree case to these cases in a straightforward manner. Finally, we study the performance of these methods in coalescent-based computer simulations.
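
    As an illustration of the MDC criterion described above, the following is a minimal Python sketch that scores a candidate species tree against rooted gene trees by counting extra lineages. It is not the authors' algorithm: it only evaluates a given species tree rather than searching for the best one, and it assumes one sampled allele per species and trees written as nested tuples of leaf names. All function names are hypothetical.

```python
def leaf_set(tree):
    """Return the set of leaf names below a node (a leaf is a plain string)."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def clusters(tree):
    """Yield the leaf set (cluster) of every node in the tree."""
    yield leaf_set(tree)
    if not isinstance(tree, str):
        for child in tree:
            yield from clusters(child)


def maximal_clades(gene_tree, cluster):
    """Count maximal gene-tree clades whose leaves all fall inside `cluster`."""
    if leaf_set(gene_tree) <= cluster:
        return 1
    if isinstance(gene_tree, str):
        return 0
    return sum(maximal_clades(child, cluster) for child in gene_tree)


def extra_lineages(species_tree, gene_tree):
    """Sum, over species-tree clusters, the lineages beyond the first one."""
    return sum(
        max(maximal_clades(gene_tree, cluster) - 1, 0)
        for cluster in clusters(species_tree)
    )


def mdc_score(species_tree, gene_trees):
    """Total deep-coalescence cost of a species tree over all gene trees."""
    return sum(extra_lineages(species_tree, g) for g in gene_trees)


species = (("A", "B"), "C")
genes = [(("A", "B"), "C"), (("A", "C"), "B")]
print(mdc_score(species, genes))  # 1
```

    For the species tree ((A,B),C), the concordant gene tree costs nothing, while the discordant gene tree ((A,C),B) costs one extra lineage, because the A and B lineages fail to coalesce on the branch above (A,B).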

  13. Phylogenetic Variants of Rickettsia africae, and Incidental Identification of "Candidatus Rickettsia Moyalensis" in Kenya.

    Science.gov (United States)

    Kimita, Gathii; Mutai, Beth; Nyanjom, Steven Ger; Wamunyokoli, Fred; Waitumbi, John

    2016-07-01

    Rickettsia africae, the etiological agent of African tick bite fever, is widely distributed in sub-Saharan Africa. Contrary to reports of its homogeneity, a localized study in Asembo, Kenya recently reported high genetic diversity. The present study aims to elucidate the extent of this heterogeneity by examining archived Rickettsia africae DNA samples collected from different eco-regions of Kenya. To evaluate their phylogenetic relationships, archived genomic DNA obtained from 57 ticks a priori identified to contain R. africae by comparison to ompA, ompB and gltA genes was used to amplify five rickettsial genes, i.e. gltA, ompA, ompB, 17kDa and sca4. The resulting amplicons were sequenced. Translated amino acid alignments were used to guide the nucleotide alignments. Single gene and concatenated alignments were used to infer phylogenetic relationships. Out of the 57 DNA samples, three were determined to be R. aeschlimannii and not R. africae. One sample turned out to be a novel rickettsia, and an interim name of "Candidatus Rickettsia moyalensis" is proposed. The bona fide R. africae formed two distinct clades. Clade I contained 9% of the samples and branched with the validated R. africae str ESF-5, while clade II (two samples) formed a distinct sub-lineage. These data support the use of multiple genes for phylogenetic inferences. It is determined that, despite its recent emergence, the R. africae lineage is diverse. These data also provide evidence of a novel Rickettsia species, Candidatus Rickettsia moyalensis.

  14. Phylogenetic relationships of the genus Phanerochaete inferred from the internal transcribed spacer region

    Science.gov (United States)

    Theodorus H. de Koker; Karen K. Nakasone; Jacques Haarhof; Harold H. Burdsall; Bernard J.H. Janse

    2003-01-01

    Phanerochaete is a genus of resupinate homobasidiomycetes that are saprophytic on woody debris and logs. Morphological studies in the past indicated that Phanerochaete is a heterogeneous assemblage of species. In this study the internal transcribed spacer (ITS) region of the nuclear ribosomal DNA was used to test the monophyly of the genus Phanerochaete and to infer...

  15. Phylogenetic and comparative gene expression analysis of barley (Hordeum vulgare) WRKY transcription factor family reveals putatively retained functions between monocots and dicots

    Energy Technology Data Exchange (ETDEWEB)

    Mangelsen, Elke; Kilian, Joachim; Berendzen, Kenneth W.; Kolukisaoglu, Uner; Harter, Klaus; Jansson, Christer; Wanke, Dierk

    2008-02-01

    WRKY proteins belong to the WRKY-GCM1 superfamily of zinc finger transcription factors that have been subject to a large plant-specific diversification. For the cereal crop barley (Hordeum vulgare), three different WRKY proteins have been characterized so far, as regulators in sucrose signaling, in pathogen defense, and in response to cold and drought, respectively. However, their phylogenetic relationship remained unresolved. In this study, we used the available sequence information to identify a minimum number of 45 barley WRKY transcription factor (HvWRKY) genes. According to their structural features the HvWRKY factors were classified into the previously defined polyphyletic WRKY subgroups 1 to 3. Furthermore, we could assign putative orthologs of the HvWRKY proteins in Arabidopsis and rice. While in most cases clades of orthologous proteins were formed within each group or subgroup, other clades were composed of paralogous proteins for the grasses and Arabidopsis only, which is indicative of specific gene radiation events. To gain insight into their putative functions, we examined expression profiles of WRKY genes from publicly available microarray data resources and found group specific expression patterns. While putative orthologs of the HvWRKY transcription factors have been inferred from phylogenetic sequence analysis, we performed a comparative expression analysis of WRKY genes in Arabidopsis and barley. Indeed, highly correlative expression profiles were found between some of the putative orthologs. HvWRKY genes have not only undergone radiation in monocot or dicot species, but exhibit evolutionary traits specific to grasses. HvWRKY proteins exhibited not only sequence similarities between orthologs with Arabidopsis, but also relatedness in their expression patterns. This correlative expression is indicative for a putative conserved function of related WRKY proteins in mono- and dicot species.

  16. Phenotypic differentiation and phylogenetic signal of wing shape in western European biting midges, Culicoides spp., of the subgenus Avaritia

    DEFF Research Database (Denmark)

    Muñoz-Muñoz, F.; Talavera, S.; Carpenter, S.

    2014-01-01

    of cytochrome oxidase subunit I barcode sequencing and geometric morphometric analyses to investigate wing shape as a means to infer species identification within this subgenus. In addition the congruence of morphological data with different phylogenetic hypotheses is tested. Five different species...

  17. Incompletely resolved phylogenetic trees inflate estimates of phylogenetic conservatism.

    Science.gov (United States)

    Davies, T Jonathan; Kraft, Nathan J B; Salamin, Nicolas; Wolkovich, Elizabeth M

    2012-02-01

    The tendency for more closely related species to share similar traits and ecological strategies can be explained by their longer shared evolutionary histories and represents phylogenetic conservatism. How strongly species traits co-vary with phylogeny can significantly impact how we analyze cross-species data and can influence our interpretation of assembly rules in the rapidly expanding field of community phylogenetics. Phylogenetic conservatism is typically quantified by analyzing the distribution of species values on the phylogenetic tree that connects them. Many phylogenetic approaches, however, assume a completely sampled phylogeny: while we have good estimates of deeper phylogenetic relationships for many species-rich groups, such as birds and flowering plants, we often lack information on more recent interspecific relationships (i.e., within a genus). A common solution has been to represent these relationships as polytomies on trees using taxonomy as a guide. Here we show that such trees can dramatically inflate estimates of phylogenetic conservatism quantified using S. P. Blomberg et al.'s K statistic. Using simulations, we show that even randomly generated traits can appear to be phylogenetically conserved on poorly resolved trees. We provide a simple rarefaction-based solution that can reliably retrieve unbiased estimates of K, and we illustrate our method using data on first flowering times from Thoreau's woods (Concord, Massachusetts, USA).

  18. Reconstruction of mitogenomes by NGS and phylogenetic implications for leaf beetles.

    Science.gov (United States)

    Song, Nan; Yin, Xinming; Zhao, Xincheng; Chen, Junhua; Yin, Jian

    2017-11-30

    Mitochondrial genome (mitogenome) sequences are frequently used to infer phylogenetic relationships of insects at different taxonomic levels. Next-generation sequencing (NGS) techniques are revolutionizing many fields of biology, and allow for acquisition of insect mitogenomes for large numbers of species simultaneously. In this study, 20 full or partial mitogenomes were sequenced from pooled genomic DNA samples by NGS for leaf beetles (Chrysomelidae). Combined with published mitogenome sequences, a higher-level phylogeny of Chrysomelidae was reconstructed under maximum likelihood and Bayesian inference with different models and various data treatments. The results revealed support for a basal position of Bruchinae within Chrysomelidae. In addition, two major subfamily groupings were recovered: one including seven subfamilies, namely Donaciinae, Criocerinae, Spilopyrinae, Cassidinae, Cryptocephalinae, Chlamisinae and Eumolpinae, and another containing a non-monophyletic Chrysomelinae and a monophyletic Galerucinae.

  20. Simultaneous inference of phylogenetic and transmission trees in infectious disease outbreaks.

    Science.gov (United States)

    Klinkenberg, Don; Backer, Jantien A; Didelot, Xavier; Colijn, Caroline; Wallinga, Jacco

    2017-05-01

    Whole-genome sequencing of pathogens from host samples becomes more and more routine during infectious disease outbreaks. These data provide information on possible transmission events which can be used for further epidemiologic analyses, such as identification of risk factors for infectivity and transmission. However, the relationship between transmission events and sequence data is obscured by uncertainty arising from four largely unobserved processes: transmission, case observation, within-host pathogen dynamics and mutation. To properly resolve transmission events, these processes need to be taken into account. Recent years have seen much progress in theory and method development, but existing applications make simplifying assumptions that often break up the dependency between the four processes, or are tailored to specific datasets with matching model assumptions and code. To obtain a method with wider applicability, we have developed a novel approach to reconstruct transmission trees with sequence data. Our approach combines elementary models for transmission, case observation, within-host pathogen dynamics, and mutation, under the assumption that the outbreak is over and all cases have been observed. We use Bayesian inference with MCMC for which we have designed novel proposal steps to efficiently traverse the posterior distribution, taking account of all unobserved processes at once. This allows for efficient sampling of transmission trees from the posterior distribution, and robust estimation of consensus transmission trees. We implemented the proposed method in a new R package phybreak. The method performs well in tests of both new and published simulated data. We apply the model to five datasets on densely sampled infectious disease outbreaks, covering a wide range of epidemiological settings. Using only sampling times and sequences as data, our analyses confirmed the original results or improved on them: the more realistic infection times place more

  1. Simultaneous inference of phylogenetic and transmission trees in infectious disease outbreaks.

    Directory of Open Access Journals (Sweden)

    Don Klinkenberg

    2017-05-01

    Whole-genome sequencing of pathogens from host samples becomes more and more routine during infectious disease outbreaks. These data provide information on possible transmission events which can be used for further epidemiologic analyses, such as identification of risk factors for infectivity and transmission. However, the relationship between transmission events and sequence data is obscured by uncertainty arising from four largely unobserved processes: transmission, case observation, within-host pathogen dynamics and mutation. To properly resolve transmission events, these processes need to be taken into account. Recent years have seen much progress in theory and method development, but existing applications make simplifying assumptions that often break up the dependency between the four processes, or are tailored to specific datasets with matching model assumptions and code. To obtain a method with wider applicability, we have developed a novel approach to reconstruct transmission trees with sequence data. Our approach combines elementary models for transmission, case observation, within-host pathogen dynamics, and mutation, under the assumption that the outbreak is over and all cases have been observed. We use Bayesian inference with MCMC for which we have designed novel proposal steps to efficiently traverse the posterior distribution, taking account of all unobserved processes at once. This allows for efficient sampling of transmission trees from the posterior distribution, and robust estimation of consensus transmission trees. We implemented the proposed method in a new R package phybreak. The method performs well in tests of both new and published simulated data. We apply the model to five datasets on densely sampled infectious disease outbreaks, covering a wide range of epidemiological settings. Using only sampling times and sequences as data, our analyses confirmed the original results or improved on them: the more realistic infection

  2. Phylogenetic inferences of Nepenthes species in Peninsular Malaysia revealed by chloroplast (trnL intron) and nuclear (ITS) DNA sequences

    OpenAIRE

    Bunawan, Hamidun; Yen, Choong Chee; Yaakop, Salmah; Noor, Normah Mohd

    2017-01-01

    Background The chloroplastic trnL intron and the nuclear internal transcribed spacer (ITS) region were sequenced for 11 Nepenthes species recorded in Peninsular Malaysia to examine their phylogenetic relationship and to evaluate the usage of trnL intron and ITS sequences for phylogenetic reconstruction of this genus. Results Phylogeny reconstruction was carried out using neighbor-joining, maximum parsimony and Bayesian analyses. All the trees revealed two major clusters, a lowland group consi...

  3. Reversible polymorphism-aware phylogenetic models and their application to tree inference.

    Science.gov (United States)

    Schrempf, Dominik; Minh, Bui Quang; De Maio, Nicola; von Haeseler, Arndt; Kosiol, Carolin

    2016-10-21

    We present a reversible Polymorphism-Aware Phylogenetic Model (revPoMo) for species tree estimation from genome-wide data. revPoMo enables the reconstruction of large scale species trees for many within-species samples. It expands the alphabet of DNA substitution models to include polymorphic states, thereby naturally accounting for incomplete lineage sorting. We implemented revPoMo in the maximum likelihood software IQ-TREE. A simulation study and an application to great apes data show that the runtimes of our approach and standard substitution models are comparable but that revPoMo has much better accuracy in estimating trees, divergence times and mutation rates. The advantage of revPoMo is that an increase of sample size per species improves estimations but does not increase runtime. Therefore, revPoMo is a valuable tool with several applications, from speciation dating to species tree reconstruction. Copyright © 2016 The Authors. Published by Elsevier Ltd. All rights reserved.
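
    The alphabet expansion mentioned above can be pictured with a small Python sketch. It assumes the usual PoMo-style formulation, in which a virtual population of n individuals is either fixed for one nucleotide or polymorphic for exactly two. The parameter n and the function name are illustrative and come neither from the abstract nor from IQ-TREE.

```python
from itertools import combinations

NUCLEOTIDES = "ACGT"


def pomo_states(n):
    """List PoMo-style states for an assumed virtual population of size n.

    Each state is a tuple of (nucleotide, count) pairs whose counts sum to n.
    """
    # Four monomorphic states: every individual carries the same nucleotide.
    fixed = [((base, n),) for base in NUCLEOTIDES]
    # Polymorphic states: two nucleotides present with counts i and n - i.
    polymorphic = [
        ((a, i), (b, n - i))
        for a, b in combinations(NUCLEOTIDES, 2)
        for i in range(1, n)
    ]
    return fixed + polymorphic


states = pomo_states(10)
print(len(states))  # 58, i.e. 4 + 6 * (10 - 1)
```

    Under this assumption the state count grows only linearly with n, which fits the abstract's remark that larger samples per species need not increase runtime.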

  4. A phylogenetic framework for root lesion nematodes of the genus Pratylenchus (Nematoda): Evidence from 18S and D2-D3 expansion segments of 28S ribosomal RNA genes and morphological characters.

    Science.gov (United States)

    Subbotin, Sergei A; Ragsdale, Erik J; Mullens, Teresa; Roberts, Philip A; Mundo-Ocampo, Manuel; Baldwin, James G

    2008-08-01

    The root lesion nematodes of the genus Pratylenchus Filipjev, 1936 are migratory endoparasites of plant roots, considered among the most widespread and important nematode parasites in a variety of crops. We obtained gene sequences from the D2 and D3 expansion segments of partial 28S rRNA and from 18S rRNA for 31 populations belonging to 11 valid and two unidentified species of root lesion nematodes and five outgroup taxa. These datasets were analyzed using maximum parsimony and Bayesian inference. The alignments were generated using the secondary structure models for these molecules and analyzed with Bayesian inference under the standard models and the complex model, considering helices under the doublet model and loops and bulges under the general time reversible model. The phylogenetic informativeness of morphological characters was tested by reconstruction of their histories on rRNA-based trees using parallel parsimony and Bayesian approaches. Phylogenetic and sequence analyses of the 28S D2-D3 dataset with 145 accessions for 28 species and the 18S dataset with 68 accessions for 15 species confirmed, across large numbers of geographically diverse isolates, that most classical morphospecies are monophyletic. Phylogenetic analyses revealed at least six distinct major clades of examined Pratylenchus species, and these clades are generally congruent with those defined by characters derived from lip patterns, numbers of lip annules, and spermatheca shape. Morphological results suggest the need for sophisticated character discovery and analysis for morphology-based phylogenetics in nematodes.

  5. PyElph - a software tool for gel images analysis and phylogenetics

    Directory of Open Access Journals (Sweden)

    Pavel Ana Brânduşa

    2012-01-01

    Background This paper presents PyElph, a software tool which automatically extracts data from gel images, computes the molecular weights of the analyzed molecules or fragments, compares DNA patterns which result from experiments with molecular genetic markers and generates phylogenetic trees computed by five clustering methods, using the information extracted from the analyzed gel image. The software can be successfully used for population genetics, phylogenetics, taxonomic studies and other applications which require gel image analysis. Researchers and students working in molecular biology and genetics would benefit greatly from the proposed software because it is free, open source, easy to use, has a friendly Graphical User Interface and does not depend on specific image acquisition devices like other commercial programs with similar functionalities do. Results The PyElph software tool is entirely implemented in Python, which is a very popular programming language among the bioinformatics community. It provides a very friendly Graphical User Interface which was designed in six steps that gradually lead to the results. The user is guided through the following steps: image loading and preparation, lane detection, band detection, molecular weights computation based on a molecular weight marker, band matching and finally, the computation and visualization of phylogenetic trees. A strong point of the software is the visualization component for the processed data. The Graphical User Interface provides operations for image manipulation and highlights lanes, bands and band matching in the analyzed gel image. All the data and images generated in each step can be saved. The software has been tested on several DNA patterns obtained from experiments with different genetic markers. Examples of genetic markers which can be analyzed using PyElph are RFLP (Restriction Fragment Length Polymorphism), AFLP (Amplified Fragment Length Polymorphism), RAPD

  6. Intricate patterns of phylogenetic relationships in the olive family as inferred from multi-locus plastid and nuclear DNA sequence analyses: a close-up on Chionanthus and Noronhia (Oleaceae).

    Science.gov (United States)

    Hong-Wa, Cynthia; Besnard, Guillaume

    2013-05-01

    Noronhia represents the most successful radiation of the olive family (Oleaceae) in Madagascar with more than 40 named endemic species distributed in all ecoregions from sea level to high mountains. Its position within the subtribe Oleinae has, however, been largely unresolved and its evolutionary history has remained unexplored. In this study, we generated a dataset of plastid (trnL-F, trnT-L, trnS-G, trnK-matK) and nuclear (internal transcribed spacer [ITS]) DNA sequences to infer phylogenetic relationships within Oleinae and to examine evolutionary patterns within Noronhia. Our sample included most species of Noronhia and representatives of the ten other extant genera within the subtribe with an emphasis on Chionanthus. Bayesian inferences and maximum likelihood analyses of plastid and nuclear data indicated several instances of paraphyly and polyphyly within Oleinae, with some geographic signal. Both plastid and ITS data showed a polyphyletic Noronhia that included Indian Ocean species of Chionanthus. They also found close relationships between Noronhia and African Chionanthus. However, the plastid data showed little clear differentiation between Noronhia and the African Chionanthus whereas relationships suggested by the nuclear ITS data were more consistent with taxonomy and geography. We used molecular dating to discriminate between hybridization and lineage sorting/gene duplication as alternative explanations for these topological discordances and to infer the biogeographic history of Noronhia. Hybridization between African Chionanthus and Noronhia could not be ruled out. However, Noronhia has long been established in Madagascar after a likely Cenozoic dispersal from Africa, suggesting any hybridization between representatives of African and Malagasy taxa was ancient. 
    In any case, the African and Indian Ocean Chionanthus and Noronhia together formed a strongly supported monophyletic clade distinct and distant from other Chionanthus, which calls for a revised

  7. Effects of methodology and analysis strategy on robustness of pestivirus phylogeny.

    Science.gov (United States)

    Liu, Lihong; Xia, Hongyan; Baule, Claudia; Belák, Sándor; Wahlberg, Niklas

    2010-01-01

    Phylogenetic analysis of pestiviruses is a useful tool for classifying novel pestiviruses and for revealing their phylogenetic relationships. In this study, robustness of pestivirus phylogenies has been compared by analyses of the 5'UTR, and complete N(pro) and E2 gene regions separately and combined, performed by four methods: neighbour-joining (NJ), maximum parsimony (MP), maximum likelihood (ML), and Bayesian inference (BI). The strategy of analysing the combined sequence dataset by BI, ML, and MP methods resulted in a single, well-supported tree topology, indicating a reliable and robust pestivirus phylogeny. By contrast, the single-gene analysis strategy resulted in 12 trees of different topologies, revealing different relationships among pestiviruses. These results indicate that the strategies and methodologies are two vital aspects affecting the robustness of the pestivirus phylogeny. The strategy and methodologies outlined in this paper may have a broader application in inferring phylogeny of other RNA viruses.

  8. Estimating phylogenetic trees from genome-scale data.

    Science.gov (United States)

    Liu, Liang; Xi, Zhenxiang; Wu, Shaoyuan; Davis, Charles C; Edwards, Scott V

    2015-12-01

    The heterogeneity of signals in the genomes of diverse organisms poses challenges for traditional phylogenetic analysis. Phylogenetic methods known as "species tree" methods have been proposed to directly address one important source of gene tree heterogeneity, namely the incomplete lineage sorting that occurs when evolving lineages radiate rapidly, resulting in a diversity of gene trees from a single underlying species tree. Here we review theory and empirical examples that help clarify conflicts between species tree and concatenation methods, and misconceptions in the literature about the performance of species tree methods. Considering concatenation as a special case of the multispecies coalescent model helps explain differences in the behavior of the two methods on phylogenomic data sets. Recent work suggests that species tree methods are more robust than concatenation approaches to some of the classic challenges of phylogenetic analysis, including rapidly evolving sites in DNA sequences and long-branch attraction. We show that approaches, such as binning, designed to augment the signal in species tree analyses can distort the distribution of gene trees and are inconsistent. Computationally efficient species tree methods incorporating biological realism are a key to phylogenetic analysis of whole-genome data. © 2015 New York Academy of Sciences.

  9. Phylogenetic analysis of rabbit haemorrhagic disease virus (RHDV) strains isolated in Poland.

    Science.gov (United States)

    Fitzner, Andrzej; Niedbalski, Wieslaw

    2017-10-01

    The aim of this study was to characterise the nucleotide and amino acid sequences of complete genomes (7.5 kb) from RHDV strains isolated in Poland and to estimate the genetic variability in different elements of the viral RNA. In addition, the sequences of Polish RHDV strains isolated from 1988 to 2015 were compared with the sequences of other European RHDV, including the RHDVa and RHDV2/RHDVb subtypes. The complete sequence was assembled from partial nucleotide sequences and consisted of approximately 7428 nucleotides. To compare nucleotide sequences and build phylogenetic trees of Polish RHDV isolates and reference RHDV strains representing the main phylogenetic groups of classical RHDV, RHDVa and RHDV2, as well as the non-pathogenic rabbit lagovirus RCV, BLAST (blastn) and MEGA6 with the neighbour-joining method were used. The complete nucleotide sequences of the Polish RHDV isolates have also been entered into GenBank. For comparative analysis, nineteen complete sequences representing the main RHDV genetic types available in GenBank were used. The phylogenetic analysis of Polish RHDV strains reveals the presence of three classical RHDV genogroups (G2, G4 and G5) and an RHDVa variant (G6). The oldest RHDV isolates (KGM 1988, PD 1989 and MAL 1994) belong to genogroup G2. It can be assumed that the elimination of these strains from the environment probably occurred at the turn of 1994 and 1995. Genogroup G2 was replaced by the phylogenetically younger BLA 1994 and OPO 2004 strains from genogroup G4, which probably originated from the G3 lineage, represented by the Italian strain BS89. The last representatives of classical RHDV in Poland are isolates GSK 1988 and ZD0 2000 from genogroup G5. A single clade contains the Polish RHDV strains from 2004-2015 (GRZ 2004, KRY 2004, L145 2004, W147 2005, SKO 2013, GLE 2013, RED1 2013, STR 2012, STR2 2013, STR 2014, BIE 2015) identified as RHDVa, which clustered

  10. Molecular phylogenetic analysis of Chinese indigenous blue-shelled chickens inferred from whole genomic region of the SLCO1B3 gene.

    Science.gov (United States)

    Dalirsefat, Seyed Benyamin; Dong, Xianggui; Deng, Xuemei

    2015-08-01

    In total, 246 individuals from 8 Chinese indigenous blue- and brown-shelled chicken populations (Yimeng Blue, Wulong Blue, Lindian Blue, Dongxiang Blue, Lushi Blue, Jingmen Blue, Dongxiang Brown, and Lushi Brown) were genotyped for 21 SNP markers from the SLCO1B3 gene to evaluate phylogenetic relationships. As a representative of nonblue-shelled breeds, White Leghorn was included in the study for reference. A high proportion of SNP polymorphism was observed in Chinese chicken populations, ranging from 89% in Jingmen Blue to 100% in most populations, with a mean of 95% across all populations. The White Leghorn breed showed the lowest polymorphism, accounting for 43% of total SNPs. The mean expected heterozygosity varied from 0.11 in Dongxiang Blue to 0.46 in Yimeng Blue. Analysis of molecular variance (AMOVA) for 2 groups of Chinese chickens based on eggshell color type revealed 52% within-group and 43% between-group variations of the total genetic variation. As expected, FST and Reynolds' genetic distance were greatest between White Leghorn and Chinese chicken populations, with average values of 0.40 and 0.55, respectively. The first and second principal coordinates explained approximately 92% of the total variation and supported the clustering of the populations according to their eggshell color type and historical origins. STRUCTURE analysis showed a considerable source of variation among populations for the clustering into blue-shelled and nonblue-shelled chicken populations. The low estimation of genetic differentiation (FST) between Chinese chicken populations is possibly due to a common historical origin and high gene flow. Remarkably similar population classifications were obtained with all methods used in the study. Aligning endogenous avian retroviral (EAV)-HP insertion sequences showed no difference among the blue-shelled chickens. © 2015 Poultry Science Association Inc.
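
    The diversity statistics reported above can be illustrated for biallelic SNPs. The Python sketch below uses the textbook definitions He = 2p(1 - p) and FST = (HT - HS) / HT with equally weighted populations. The abstract does not say which estimators or software the authors used, so the functions and the allele frequencies here are purely illustrative.

```python
def expected_heterozygosity(p):
    """Expected heterozygosity of one biallelic SNP with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def mean_he(freqs):
    """Mean expected heterozygosity over the SNPs of one population."""
    return sum(expected_heterozygosity(p) for p in freqs) / len(freqs)


def fst(pop_a, pop_b):
    """Simple FST = (HT - HS) / HT for two equally weighted populations.

    pop_a and pop_b are lists of allele frequencies at the same SNPs.
    """
    # HS: average within-population heterozygosity.
    hs = (mean_he(pop_a) + mean_he(pop_b)) / 2.0
    # HT: heterozygosity computed from the pooled (mean) allele frequencies.
    pooled = [(pa + pb) / 2.0 for pa, pb in zip(pop_a, pop_b)]
    ht = mean_he(pooled)
    return 0.0 if ht == 0 else (ht - hs) / ht


# Made-up frequencies for three SNPs in two hypothetical populations.
blue = [0.5, 0.4, 0.9]
brown = [0.5, 0.1, 0.2]
print(round(mean_he(blue), 3), round(fst(blue, brown), 3))
```

    Because a biallelic SNP cannot exceed He = 0.5, a mean of 0.46 as reported for Yimeng Blue would correspond to allele frequencies close to one half at most loci.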

  11. Phylogenetic and Diversity Analysis of Dactylis glomerata Subspecies Using SSR and IT-ISJ Markers.

    Science.gov (United States)

    Yan, Defei; Zhao, Xinxin; Cheng, Yajuan; Ma, Xiao; Huang, Linkai; Zhang, Xinquan

    2016-10-31

    The genus Dactylis, an important forage crop, has a wide geographical distribution in temperate regions. While this genus is thought to include a single species, Dactylis glomerata, this species encompasses many subspecies whose relationships have not been fully characterized. In this study, the genetic diversity and phylogenetic relationships of nine representative Dactylis subspecies were examined using SSR and IT-ISJ markers. In total, 21 pairs of SSR primers and 15 pairs of IT-ISJ primers were used to amplify 295 polymorphic bands with polymorphic rates of 100%. The average polymorphic information contents (PICs) of SSR and IT-ISJ markers were 0.909 and 0.780, respectively. The combined data of the two markers indicated a high level of genetic diversity among the nine D. glomerata subspecies, with a Nei's gene diversity index value of 0.283 and Shannon's diversity of 0.448. Preliminary phylogenetic analysis results revealed that the 20 accessions could be divided into three groups (A, B, C). Furthermore, they could be divided into five clusters, which is similar to the structure analysis with K = 5. Phylogenetic placement in these three groups may be related to the distribution ranges and the climate types of the subspecies in each group. Group A contained eight accessions of four subspecies, originating from the west Mediterranean, while Group B contained seven accessions of three subspecies, originating from the east Mediterranean.
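
    The polymorphic information content and Shannon's diversity values quoted above are functions of allele (or band) frequencies. A minimal Python sketch follows, using the standard formulas PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2 and I = -sum(p_i ln p_i). How the authors scored SSR and IT-ISJ bands is not stated in the abstract, so the input frequencies are hypothetical.

```python
import math
from itertools import combinations


def pic(freqs):
    """Polymorphic information content for allele frequencies summing to 1."""
    homozygosity = sum(p * p for p in freqs)
    cross = sum(2.0 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross


def shannon_index(freqs):
    """Shannon's diversity index, skipping alleles with zero frequency."""
    return -sum(p * math.log(p) for p in freqs if p > 0)


# Hypothetical allele frequencies at one marker.
alleles = [0.5, 0.3, 0.2]
print(round(pic(alleles), 3), round(shannon_index(alleles), 3))
```

    With this formula, PIC approaches 1 only when a marker has many alleles at similar frequencies, which would be consistent with the higher average reported for the multi-allelic SSR markers.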

  12. Opposing assembly mechanisms in a neotropical dry forest: implications for phylogenetic and functional community ecology.

    Science.gov (United States)

    Swenson, Nathan G; Enquist, Brian J

    2009-08-01

    Species diversity is promoted and maintained by ecological and evolutionary processes operating on species attributes through space and time. The degree to which variability in species function regulates distribution and promotes coexistence of species has been debated. Previous work has attempted to quantify the relative importance of species function by using phylogenetic relatedness as a proxy for functional similarity. The key assumption of this approach is that function is phylogenetically conserved. If this assumption is supported, then the phylogenetic dispersion in a community should mirror the functional dispersion. Here we quantify functional trait dispersion along several key axes of tree life-history variation and on multiple spatial scales in a Neotropical dry-forest community. We next compare these results to previously reported patterns of phylogenetic dispersion in this same forest. We find that, at small spatial scales, coexisting species are typically more functionally clustered than expected, but traits related to adult and regeneration niches are overdispersed. This outcome was repeated when the analyses were stratified by size class. Some of the trait dispersion results stand in contrast to the previously reported phylogenetic dispersion results. In order to address this inconsistency we examined the strength of phylogenetic signal in traits at different depths in the phylogeny. We argue that: (1) while phylogenetic relatedness may be a good general multivariate proxy for ecological similarity, it may have a reduced capacity to depict the functional mechanisms behind species coexistence when coexisting species simultaneously converge and diverge in function; and (2) the previously used metric of phylogenetic signal provided erroneous inferences about trait dispersion when married with patterns of phylogenetic dispersion.

  13. The complete mitochondrial genome of Somanniathelphusa boyangensis and phylogenetic analysis of Genus Somanniathelphusa (Crustacea: Decapoda: Parathelphusidae).

    Directory of Open Access Journals (Sweden)

    Xin-Nan Jia

    In this study, the authors first obtained the mitochondrial genome of Somanniathelphusa boyangensis. The results showed that the mitochondrial genome is 17,032 bp in length, includes 13 protein-coding genes, 2 rRNA genes, 22 tRNA genes and 1 putative control region, and has the A+T bias characteristic of metazoan mitochondrial genomes. All tRNA genes display the typical clover-leaf secondary structure except tRNASer(AGN), which has lost the dihydrouridine arm. The GenBank database contains the mitochondrial genomes of representatives of approximately 22 families of Brachyura, comprising 56 species, including 4 species of freshwater crab. The authors established the phylogenetic relationships using the maximum likelihood and Bayesian inference methods. The phylogenetic relationship indicated that the molecular taxonomy of S. boyangensis is consistent with current morphological classification, and Parathelphusidae and Potamidae are derived within the freshwater clade or as part of it. In addition, the authors used the COX1 sequence of Somanniathelphusa in GenBank and the COX1 sequence of S. boyangensis to estimate the divergence time of this genus. The results showed that the divergence time of Somanniathelphusa qiongshanensis is consistent with the separation of Hainan Island from mainland China in the Beibu Gulf, and the divergence time for Somanniathelphusa taiwanensis and Somanniathelphusa amoyensis is consistent with the separation of Taiwan Province from Mainland China at Fujian Province. These data indicate that geologic events influenced speciation of the genus Somanniathelphusa.

  14. Phylogenetic analysis and taxonomic revision of Physodactylinae (Coleoptera, Elateridae)

    Directory of Open Access Journals (Sweden)

    Simone Policena Rosa

    2014-01-01

    A phylogeny based on male morphological characters and taxonomic revision of the Physodactylinae genera are presented. The phylogenetic analysis based on 66 male characters resulted in the polyphyly of Physodactylinae which comprises four independent lineages. Oligostethius and Idiotropia from Africa were found to be sister groups. Teslasena from Brazil was corroborated as belonging to Cardiophorinae clade. The South American genera Physodactylus and Dactylophysus were found to be sister groups and phylogenetically related to Heterocrepidius species. The Oriental Toxognathus resulted as sister group of that clade plus (Dicrepidius ramicornis (Lissomus sp, Physorhynus erythrocephalus)). Taxonomic revisions include diagnoses and redescriptions of genera and distributional records and illustrations of species. Keys to species of Teslasena, Toxognathus, Dactylophysus and Physodactylus are also provided. Teslasena lucasi is synonymized with T. femoralis. A new species of Dactylophysus is described, D. hirtus sp. nov., and lectotypes are designated to non-conspecific D. mendax sensu Fleutiaux and Heterocrepidius mendax Candèze. Physodactylus niger is removed from synonymy under P. oberthuri; P. carreti is synonymized with P. niger; P. obesus and P. testaceus are synonymized with P. sulcatus. Nine new species are described in Physodactylus: P. asper sp. nov., P. brunneus sp. nov., P. chassaini sp. nov., P. flavifrons sp. nov., P. girardi sp. nov., P. gounellei sp. nov., P. latithorax sp. nov., P. patens sp. nov. and P. tuberculatus sp. nov.

  15. First phylogenetic analysis of dengue virus serotype 4 circulating in Espírito Santo state, Brazil, in 2013 and 2014.

    Science.gov (United States)

    Vicente, C R; Pannuti, C S; Urbano, P R; Felix, A C; Cerutti Junior, C; Herbinger, K-H; Fröschl, G; Romano, C M

    2018-01-01

    The purpose of the present study was to reconstruct the phylogeny of dengue virus serotype 4 (DENV-4) that was circulating in Espírito Santo state, Brazil, in 2013 and 2014, and to discuss the epidemiological implications associated with this evolutionary hypothesis. Partial envelope gene of eight DENV-4 samples from Espírito Santo state were sequenced and aligned with 72 worldwide DENV-4 reference sequences from GenBank. A phylogenetic tree was reconstructed through Bayesian Inference and the Time of the Most Recent Common Ancestor was estimated. The study detected the circulation of DENV-4 genotype II in Espírito Santo state, which was closely related to strains from the states of Mato Grosso collected in 2012 and of São Paulo sampled in 2015. This cluster emerged around 2011, approximately 4 years after the entry of the genotype II in Brazil through its northern states, possibly imported from Venezuela and Colombia. This is so far the first phylogenetic study of the DENV-4 circulating in Espírito Santo state and shows the importance of an internal route of dengue viral circulation in Brazil to the introduction of the virus into this state.

  16. Statistical assignment of DNA sequences using Bayesian phylogenetics

    DEFF Research Database (Denmark)

    Terkelsen, Kasper Munch; Boomsma, Wouter Krogh; Huelsenbeck, John P.

    2008-01-01

    We provide a new automated statistical method for DNA barcoding based on a Bayesian phylogenetic analysis. The method is based on automated database sequence retrieval, alignment, and phylogenetic analysis using a custom-built program for Bayesian phylogenetic analysis. We show on real data that the method outperforms Blast searches as a measure of confidence and can help eliminate 80% of all false assignment based on best Blast hit. However, the most important advance of the method is that it provides statistically meaningful measures of confidence. We apply the method to a re-analysis of previously published ancient DNA data and show that, with high statistical confidence, most of the published sequences are in fact of Neanderthal origin. However, there are several cases of chimeric sequences that are comprised of a combination of both Neanderthal and modern human DNA.

  17. Analysis of Domain Architecture and Phylogenetics of Family 2 Glycoside Hydrolases (GH2).

    Directory of Open Access Journals (Sweden)

    David Talens-Perales

    In this work we report a detailed analysis of the topology and phylogenetics of family 2 glycoside hydrolases (GH2). We distinguish five topologies or domain architectures based on the presence and distribution of protein domains defined in Pfam and Interpro databases. All of them share a central TIM barrel (catalytic module) with two β-sandwich domains (non-catalytic) at the N-terminal end, but differ in the occurrence and nature of additional non-catalytic modules at the C-terminal region. Phylogenetic analysis was based on the sequence of the Pfam Glyco_hydro_2_C catalytic module present in most GH2 proteins. Our results led us to propose a model in which evolutionary diversity of GH2 enzymes is driven by the addition of different non-catalytic domains at the C-terminal region. This model accounts for the divergence of β-galactosidases from β-glucuronidases, the diversification of β-galactosidases with different transglycosylation specificities, and the emergence of bicistronic β-galactosidases. This study also allows the identification of groups of functionally uncharacterized protein sequences with potential biotechnological interest.

  18. Trypanosoma (Megatrypanum) melophagium in the sheep ked Melophagus ovinus from organic farms in Croatia: phylogenetic inferences support restriction to sheep and sheep keds and close relationship with trypanosomes from other ruminant species.

    Science.gov (United States)

    Martinković, Franjo; Matanović, Krešimir; Rodrigues, Adriana C; Garcia, Herakles A; Teixeira, Marta M G

    2012-01-01

    Trypanosoma (Megatrypanum) melophagium is a parasite of sheep transmitted by sheep keds, the sheep-restricted ectoparasite Melophagus ovinus (Diptera: Hippoboscidae). Sheep keds were 100% prevalent in sheep from five organic farms in Croatia, Southeastern Europe, whereas trypanosomes morphologically compatible with T. melophagium were 86% prevalent in the guts of the sheep keds. Multilocus phylogenetic analyses using sequences of small subunit rRNA, glycosomal glyceraldehyde-3-phosphate dehydrogenase, spliced leader, and internal transcribed spacer 1 of the rDNA distinguished T. melophagium from all allied trypanosomes from other ruminant species and placed the trypanosome in the subgenus Megatrypanum. Trypanosomes from sheep keds from Croatia and Scotland, the only available isolates for comparison, shared identical sequences. All biologic and phylogenetic inferences support the restriction of T. melophagium to sheep and, especially, to the sheep keds. The comparison of trypanosomes from sheep, cattle, and deer from the same country, which was never achieved before this work, strongly supported the host-restricted specificity of trypanosomes of the subgenus Megatrypanum. Our findings indicate that with the expansion of organic farms, both sheep keds and T. melophagium may re-emerge as parasitic infections of sheep. © 2011 The Author(s) Journal of Eukaryotic Microbiology © 2011 International Society of Protistologists.

  19. DNA Translator and Aligner: HyperCard utilities to aid phylogenetic analysis of molecules.

    Science.gov (United States)

    Eernisse, D J

    1992-04-01

    DNA Translator and Aligner are molecular phylogenetics HyperCard stacks for Macintosh computers. They manipulate sequence data to provide graphical gene mapping, conversions, translations and manual multiple-sequence alignment editing. DNA Translator is able to convert documented GenBank or EMBL sequences into linearized, rescalable gene maps whose gene sequences are extractable by clicking on the corresponding map button or by selection from a scrolling list. Provided gene maps, complete with extractable sequences, consist of nine metazoan, one yeast, and one ciliate mitochondrial DNAs and three green plant chloroplast DNAs. Single or multiple sequences can be manipulated to aid in phylogenetic analysis. Sequences can be translated between nucleic acids and proteins in either direction with flexible support of alternate genetic codes and ambiguous nucleotide symbols. Multiple aligned sequence output from diverse sources can be converted to Nexus, Hennig86 or PHYLIP format for subsequent phylogenetic analysis. Input or output alignments can be examined with Aligner, a convenient accessory stack included in the DNA Translator package. Aligner is an editor for the manual alignment of up to 100 sequences that toggles between display of matched characters and normal unmatched sequences. DNA Translator also generates graphic displays of amino acid coding and codon usage frequency relative to all other, or only synonymous, codons for approximately 70 select organism-organelle combinations. Codon usage data is compatible with spreadsheet or UWGCG formats for incorporation of additional molecules of interest. The complete package is available via anonymous ftp and is free for non-commercial uses.
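
    The translation step described above (nucleic acid to protein, with alternate genetic codes and ambiguous symbols) can be illustrated in a modern language. The following is a minimal sketch in Python, not a reconstruction of the HyperCard stacks: the names `translate`, `STANDARD` and `VERTEBRATE_MITO` are hypothetical, and mapping any codon that contains an ambiguity symbol to "X" is a simplifying assumption.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, listed in the conventional TCAG order
# (first base varies slowest, third base fastest).
STANDARD_AA = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), STANDARD_AA)}

# Vertebrate mitochondrial code, expressed as its differences from the standard code.
VERTEBRATE_MITO = {**STANDARD, "TGA": "W", "ATA": "M", "AGA": "*", "AGG": "*"}


def translate(seq, table=STANDARD):
    """Translate codon by codon; codons not in the table (ambiguous) become 'X'."""
    seq = seq.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return "".join(table.get(seq[i:i + 3], "X") for i in range(0, usable, 3))
```

    Passing the table as a parameter keeps the alternate codes as plain data: `translate("ATGTGATAA")` reads TGA as a stop, while `translate("ATGTGATAA", VERTEBRATE_MITO)` reads it as tryptophan.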

  20. The complete mitochondrial genome of Pallisentis celatus (Acanthocephala) with phylogenetic analysis of acanthocephalans and rotifers.

    Science.gov (United States)

    Pan, Ting Shuang; Nie, Pin

    2013-07-01

    Acanthocephalans are a small group of obligate endoparasites. They and rotifers have recently been placed in a group called Syndermata. However, phylogenetic relationships within classes of acanthocephalans, and between them and rotifers, have not been well resolved, possibly due to the lack of molecular data suitable for such analysis. In this study, the mitochondrial (mt) genome was sequenced from Pallisentis celatus (Van Cleave, 1928), an acanthocephalan in the class Eoacanthocephala, an intestinal parasite of rice-field eel, Monopterus albus (Zuiew, 1793), in China. The complete mt genome sequence of P. celatus is 13 855 bp long, containing 36 genes including 12 protein-coding genes, 22 transfer RNAs (tRNAs) and 2 ribosomal RNAs (rRNAs) as reported for other acanthocephalan species. All genes are encoded on the same strand and in the same direction. Phylogenetic analysis indicated that acanthocephalans are closely related to a clade containing bdelloids, which then correlates with the clade containing monogononts. The class Eoacanthocephala, containing P. celatus and Paratenuisentis ambiguus (Van Cleave, 1921), was closely related to the Palaeacanthocephala. This indicates that acanthocephalans may simply cluster among groups of rotifers. However, resolving the phylogenetic relationships among all classes of acanthocephalans and between them and rotifers may require further sampling and more molecular data.

  1. Monogenean anchor morphometry: systematic value, phylogenetic signal, and evolution

    Science.gov (United States)

    Soo, Oi Yoon Michelle; Tan, Wooi Boon; Lim, Lee Hong Susan

    2016-01-01

    Background. Anchors are one of the important attachment appendages for monogenean parasites. Common descent and evolutionary processes have left their mark on anchor morphometry, in the form of patterns of shape and size variation useful for systematic and evolutionary studies. When combined with morphological and molecular data, analysis of anchor morphometry can potentially answer a wide range of biological questions. Materials and Methods. We used data from anchor morphometry, body size and morphology of 13 Ligophorus (Monogenea: Ancyrocephalidae) species infecting two marine mugilid (Teleostei: Mugilidae) fish hosts: Moolgarda buchanani (Bleeker) and Liza subviridis (Valenciennes) from Malaysia. Anchor shape and size data (n = 530) were generated using methods of geometric morphometrics. We used 28S rRNA, 18S rRNA, and ITS1 sequence data to infer a maximum likelihood phylogeny. We discriminated species using principal component and cluster analysis of shape data. Adams’s Kmult was used to detect phylogenetic signal in anchor shape. Phylogeny-correlated size and shape changes were investigated using continuous character mapping and directional statistics, respectively. We assessed morphological constraints in anchor morphometry using phylogenetic regression of anchor shape against body size and anchor size. Anchor morphological integration was studied using partial least squares method. The association between copulatory organ morphology and anchor shape and size in phylomorphospace was used to test the Rohde-Hobbs hypothesis. We created monogeneaGM, a new R package that integrates analyses of monogenean anchor geometric morphometric data with morphological and phylogenetic data. Results. We discriminated 12 of the 13 Ligophorus species using anchor shape data. Significant phylogenetic signal was detected in anchor shape. Thus, we discovered new morphological characters based on anchor shaft shape, the length between the inner root point and the outer root

  2. The complete mitochondrial genome of rabbit pinworm Passalurus ambiguus: genome characterization and phylogenetic analysis.

    Science.gov (United States)

    Liu, Guo-Hua; Li, Sheng; Zou, Feng-Cai; Wang, Chun-Ren; Zhu, Xing-Quan

    2016-01-01

    Passalurus ambiguus (Nematoda: Oxyuridae) is a common pinworm which parasitizes the caecum and colon of rabbits. Despite its significance as a pathogen, the epidemiology, genetics, systematics, and biology of this pinworm remain poorly understood. In the present study, we sequenced the complete mitochondrial (mt) genome of P. ambiguus. The circular mt genome is 14,023 bp in size and encodes 36 genes, including 12 protein-coding, two ribosomal RNA, and 22 transfer RNA genes. The mt gene order of P. ambiguus is the same as that of Wellcomia siamensis, but distinct from that of Enterobius vermicularis. Phylogenetic analyses based on concatenated amino acid sequences of 12 protein-coding genes by Bayesian inference (BI) showed that P. ambiguus was more closely related to W. siamensis than to E. vermicularis. This mt genome provides novel genetic markers for studying the molecular epidemiology, population genetics, and systematics of pinworms of animals and humans, and should have implications for the diagnosis, prevention, and control of passaluriasis in rabbits and other animals.

  3. Large-Scale Genomic Analysis of Codon Usage in Dengue Virus and Evaluation of Its Phylogenetic Dependence

    Science.gov (United States)

    Lara-Ramírez, Edgar E.; Salazar, Ma Isabel; López-López, María de Jesús; Salas-Benito, Juan Santiago; Sánchez-Varela, Alejandro

    2014-01-01

    The increasing number of dengue virus (DENV) genome sequences available makes it possible to identify the factors contributing to DENV evolution. In the present study, the codon usage in serotypes 1–4 (DENV1–4) has been explored for 3047 sequenced genomes using different statistical methods. The correlation analysis of total GC content (GC) with GC content at the three nucleotide positions of codons (GC1, GC2, and GC3) as well as the effective number of codons (ENC, ENCp) versus GC3 plots revealed mutational bias and purifying selection pressures as the major forces influencing the codon usage, but with distinct pressure on specific nucleotide positions in the codon. The correspondence analysis (CA) and clustering analysis on relative synonymous codon usage (RSCU) within each serotype showed similar clustering patterns to the phylogenetic analysis of nucleotide sequences for DENV1–4. These clustering patterns are strongly related to the virus geographic origin. The phylogenetic dependence analysis also suggests that stabilizing selection acts on the codon usage bias. Our large-scale analysis reveals new features of DENV genomic evolution. PMID:25136631
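
    Two of the statistics named in this abstract, position-specific GC content (GC1, GC2, GC3) and relative synonymous codon usage (RSCU), are easy to state in code. The sketch below is illustrative only and assumes the standard genetic code and an in-frame coding sequence; the function names are hypothetical, and unlike some published GC3 definitions it does not exclude Met, Trp or stop codons. RSCU is taken as the observed count of a codon divided by the mean count over its synonymous family.

```python
from collections import Counter, defaultdict
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def split_codons(seq):
    """Return the complete, unambiguous codons of an in-frame sequence."""
    seq = seq.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3
    triplets = (seq[i:i + 3] for i in range(0, usable, 3))
    return [c for c in triplets if c in CODE]


def gc_by_position(seq):
    """Fraction of G or C at codon positions 1, 2 and 3 (GC1, GC2, GC3)."""
    codons = split_codons(seq)
    if not codons:
        raise ValueError("no complete unambiguous codons")
    return tuple(sum(c[k] in "GC" for c in codons) / len(codons) for k in range(3))


def rscu(seq):
    """RSCU per codon: count * family size / total count of the synonymous family."""
    counts = Counter(split_codons(seq))
    families = defaultdict(list)
    for codon, aa in CODE.items():
        families[aa].append(codon)
    values = {}
    for aa, family in families.items():
        total = sum(counts[c] for c in family)
        if aa == "*" or total == 0:
            continue  # skip stop codons and amino acids absent from the sequence
        for c in family:
            values[c] = counts[c] * len(family) / total
    return values
```

    An RSCU of 1.0 means a codon is used exactly as often as expected under equal usage within its family; values above 1.0 indicate preferred codons.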

  4. Phylogenetic relationships among Neoechinorhynchus species (Acanthocephala: Neoechinorhynchidae) from North-East Asia based on molecular data.

    Science.gov (United States)

    Malyarchuk, Boris; Derenko, Miroslava; Mikhailova, Ekaterina; Denisova, Galina

    2014-02-01

    Phylogenetic and statistical analyses of DNA sequences of two genes, cytochrome oxidase subunit 1 (cox 1) of the mitochondrial DNA and 18S subunit of the nuclear ribosomal RNA (18S rRNA), were used to characterize Neoechinorhynchus species from fishes collected in different localities of North-East Asia. It has been found that four species can be clearly recognized using molecular markers: Neoechinorhynchus tumidus, Neoechinorhynchus beringianus, Neoechinorhynchus simansularis and Neoechinorhynchus salmonis. 18S sequences ascribed to Neoechinorhynchus crassus specimens from North-East Asia were identical to those of N. tumidus, but differed substantially from North American N. crassus. We renamed North-East Asian N. crassus specimens to N. sp., although the possibility that they represent a subspecies of N. tumidus cannot be excluded, taking into account a relatively small distance between cox 1 sequences of North-East Asian specimens of N. crassus and N. tumidus. Maximum likelihood, maximum parsimony and Bayesian inference analyses were performed for phylogeny reconstruction. All the phylogenetic trees showed that North-East Asian species of Neoechinorhynchus analyzed in this study represent independent clades, with the only exception of N. tumidus and N. sp. for 18S data. Phylogenetic analysis has shown that the majority of species sampled (N. tumidus+N. sp., N. simansularis and N. beringianus) are probably very closely related, while N. salmonis occupies a separate position in the trees, possibly indicating a North American origin of this species. © 2013.

  5. YBYRÁ facilitates comparison of large phylogenetic trees.

    Science.gov (United States)

    Machado, Denis Jacob

    2015-07-01

    The number and size of tree topologies that are being compared by phylogenetic systematists is increasing due to technological advancements in high-throughput DNA sequencing. However, we still lack tools to facilitate comparison among phylogenetic trees with a large number of terminals. The "YBYRÁ" project integrates software solutions for data analysis in phylogenetics. It comprises tools for (1) topological distance calculation based on the number of shared splits or clades, (2) sensitivity analysis and automatic generation of sensitivity plots and (3) clade diagnoses based on different categories of synapomorphies. YBYRÁ also provides (4) an original framework to facilitate the search for potential rogue taxa based on how much they affect average matching split distances (using MSdist). YBYRÁ facilitates comparison of large phylogenetic trees and outperforms competing software in terms of usability and time efficiency, especially for large data sets. The programs that comprise this toolkit are written in Python, hence they do not require installation and have minimum dependencies. The entire project is available under an open-source licence at http://www.ib.usp.br/grant/anfibios/researchSoftware.html .
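
    The idea behind item (1), a topological distance based on shared clades, can be sketched briefly. The code below is not taken from YBYRÁ and does not use its interface; it assumes rooted trees written as nested Python tuples of leaf names, and the function names are hypothetical. It counts the non-trivial clades the two trees share and the clades found in only one of them (a Robinson-Foulds-style count).

```python
def clades(tree):
    """Return the set of non-trivial clades (as frozensets of leaf names)."""
    found = set()

    def walk(node):
        if isinstance(node, str):  # a leaf
            return frozenset([node])
        leaves = frozenset().union(*(walk(child) for child in node))
        found.add(leaves)
        return leaves

    all_leaves = walk(tree)
    found.discard(all_leaves)  # the root clade is shared by every tree on these leaves
    return found


def clade_distance(tree_a, tree_b):
    """Return (number of shared clades, number of clades present in only one tree)."""
    a, b = clades(tree_a), clades(tree_b)
    return len(a & b), len(a ^ b)
```

    For example, `clade_distance(((("A", "B"), "C"), ("D", "E")), ((("A", "C"), "B"), ("D", "E")))` gives two shared clades and a distance of two, because only the placement of B and C differs.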

  6. Tree-average distances on certain phylogenetic networks have their weights uniquely determined.

    Science.gov (United States)

    Willson, Stephen J

    2012-01-01

    A phylogenetic network N has vertices corresponding to species and arcs corresponding to direct genetic inheritance from the species at the tail to the species at the head. Measurements of DNA are often made on species in the leaf set, and one seeks to infer properties of the network, possibly including the graph itself. In the case of phylogenetic trees, distances between extant species are frequently used to infer the phylogenetic trees by methods such as neighbor-joining. This paper proposes a tree-average distance for networks more general than trees. The notion requires a weight on each arc measuring the genetic change along the arc. For each displayed tree the distance between two leaves is the sum of the weights along the path joining them. At a hybrid vertex, each character is inherited from one of its parents. We will assume that for each hybrid there is a probability that the inheritance of a character is from a specified parent. Assume that the inheritance events at different hybrids are independent. Then for each displayed tree there will be a probability that the inheritance of a given character follows the tree; this probability may be interpreted as the probability of the tree. The tree-average distance between the leaves is defined to be the expected value of their distance in the displayed trees. For a class of rooted networks that includes rooted trees, it is shown that the weights and the probabilities at each hybrid vertex can be calculated given the network and the tree-average distances between the leaves. Hence these weights and probabilities are uniquely determined. The hypotheses on the networks include that hybrid vertices have indegree exactly 2 and that vertices that are not leaves have a tree-child.
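
    The definition of the tree-average distance can be made concrete with a small brute-force sketch. It assumes non-negative arc weights and a network stored as a dictionary mapping each non-root vertex to its incoming arcs `(parent, weight, probability)`; hybrid vertices have two such arcs whose probabilities sum to 1. The representation, the names and the example network are hypothetical. The code enumerates every displayed tree, so it is only practical for a handful of hybrids and is not the identifiability argument of the paper.

```python
from itertools import product


def _path_lengths(v, parent_arc):
    """Map v and each of its ancestors to the path length from v."""
    lengths = {v: 0.0}
    total = 0.0
    while v in parent_arc:
        v, weight = parent_arc[v]
        total += weight
        lengths[v] = total
    return lengths


def tree_average_distance(arcs, a, b):
    """Expected distance between leaves a and b over all displayed trees."""
    hybrids = [v for v, incoming in arcs.items() if len(incoming) > 1]
    tree_arcs = {v: inc[0][:2] for v, inc in arcs.items() if len(inc) == 1}
    expected = 0.0
    # Each displayed tree keeps exactly one incoming arc per hybrid vertex.
    for choice in product(*(arcs[h] for h in hybrids)):
        parent_arc = dict(tree_arcs)
        prob = 1.0  # independent inheritance events multiply
        for h, (parent, weight, p) in zip(hybrids, choice):
            parent_arc[h] = (parent, weight)
            prob *= p
        da, db = _path_lengths(a, parent_arc), _path_lengths(b, parent_arc)
        # The path joining a and b turns at their lowest common ancestor.
        expected += prob * min(da[x] + db[x] for x in da if x in db)
    return expected


# Hypothetical example: root r, tree vertices u and v, hybrid h with parents u and v.
NETWORK = {
    "u": [("r", 1.0, 1.0)],
    "v": [("r", 1.0, 1.0)],
    "a": [("u", 1.0, 1.0)],
    "b": [("v", 1.0, 1.0)],
    "h": [("u", 0.5, 0.3), ("v", 0.5, 0.7)],
    "c": [("h", 1.0, 1.0)],
}
```

    In this example the distance from a to c is 2.5 in the tree where h inherits from u and 4.5 where it inherits from v, so the tree-average distance is 0.3 × 2.5 + 0.7 × 4.5 = 3.9.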

  7. Molecular diagnosis and phylogenetic analysis of Babesia bigemina and Babesia bovis hemoparasites from cattle in South Africa.

    Science.gov (United States)

    Mtshali, Moses Sibusiso; Mtshali, Phillip Senzo

    2013-08-08

    Babesia parasites, mainly Babesia bovis and B. bigemina, are tick-borne hemoparasites inducing bovine babesiosis in cattle globally. The clinical signs of the disease include, among others, anemia, fever and hemoglobinuria. Babesiosis is known to occur in tropical and subtropical regions of the world. In this study, we aim to provide information about the occurrence and phylogenetic relationship of B. bigemina and B. bovis species in cattle from different locations in nine provinces of South Africa. A total of 430 blood samples were randomly collected from apparently healthy cattle. These samples were genetically tested for Babesia parasitic infections using nested PCR assays with species-specific primers. Nested PCR assays with Group I primer sets revealed that the overall prevalence of B. bigemina and B. bovis in all bovine samples tested was 64.7% (95% CI = 60.0-69.0) and 35.1% (95% CI = 30.6-39.8), respectively. Only 117/430 (27.2%) animals had a mixed infection. The highest prevalence of 87.5% (95% CI = 77.2-93.5) for B. bigemina was recorded in the Free State province collection sites (Ficksburg, Philippolis and Botshabelo), while North West collection sites had the highest number of animals infected with B. bovis (65.5%; 95% CI = 52.7-76.4). Phylograms were inferred based on B. bigemina-specific gp45 and B. bovis-specific rap-1 nucleotide sequences obtained with Group II nested PCR primers. Phylogenetic analysis of gp45 sequences revealed significant differences in the genotypes of B. bigemina isolates investigated, including those of strains published in GenBank. On the other hand, a phylogeny based on B. bovis rap-1 sequences indicated a similar trend of clustering among the sequences of B. bovis isolates investigated in this study. This study demonstrates the occurrence of Babesia parasites in cattle from different provinces of South Africa. It was also noted that the situation of Babesia parasitic infection in cattle from certain areas

  8. Mitochondrial genome of the stonefly Kamimuria wangi (Plecoptera: Perlidae) and phylogenetic position of plecoptera based on mitogenomes.

    Science.gov (United States)

    Yu-Han, Qian; Hai-Yan, Wu; Xiao-Yu, Ji; Wei-Wei, Yu; Yu-Zhou, Du

    2014-01-01

    This study determined the mitochondrial genome sequence of the stonefly, Kamimuria wangi. In order to investigate the relatedness of stonefly to other members of Neoptera, a phylogenetic analysis was undertaken based on 13 protein-coding genes of mitochondrial genomes in 13 representative insects. The mitochondrial genome of the stonefly is a circular molecule consisting of 16,179 nucleotides and contains the 37 genes typically found in other insects. A 10-bp poly-T stretch was observed in the A+T-rich region of the K. wangi mitochondrial genome. Downstream of the poly-T stretch, two regions were located with potential ability to form stem-loop structures; these were designated stem-loop 1 (positions 15848-15651) and stem-loop 2 (15965-15998). The arrangement of genes and nucleotide composition of the K. wangi mitogenome are similar to those in Pteronarcys princeps, suggesting a conserved genome evolution within the Plecoptera. Phylogenetic analysis using maximum likelihood and Bayesian inference of 13 protein-coding genes supported a novel relationship between the Plecoptera and Ephemeroptera. The results contradict the existence of a monophyletic Plectoptera and Plecoptera as sister taxa to Embiidina, and thus requires further analyses with additional mitogenome sampling at the base of the Neoptera.

  9. Mitochondrial genome of the stonefly Kamimuria wangi (Plecoptera: Perlidae) and phylogenetic position of plecoptera based on mitogenomes.

    Directory of Open Access Journals (Sweden)

    Qian Yu-Han

    This study determined the mitochondrial genome sequence of the stonefly, Kamimuria wangi. In order to investigate the relatedness of stonefly to other members of Neoptera, a phylogenetic analysis was undertaken based on 13 protein-coding genes of mitochondrial genomes in 13 representative insects. The mitochondrial genome of the stonefly is a circular molecule consisting of 16,179 nucleotides and contains the 37 genes typically found in other insects. A 10-bp poly-T stretch was observed in the A+T-rich region of the K. wangi mitochondrial genome. Downstream of the poly-T stretch, two regions were located with potential ability to form stem-loop structures; these were designated stem-loop 1 (positions 15848-15651) and stem-loop 2 (15965-15998). The arrangement of genes and nucleotide composition of the K. wangi mitogenome are similar to those in Pteronarcys princeps, suggesting a conserved genome evolution within the Plecoptera. Phylogenetic analysis using maximum likelihood and Bayesian inference of 13 protein-coding genes supported a novel relationship between the Plecoptera and Ephemeroptera. The results contradict the existence of a monophyletic Plectoptera and Plecoptera as sister taxa to Embiidina, and thus requires further analyses with additional mitogenome sampling at the base of the Neoptera.

  10. Phylogenetic position of the giant anuran trypanosomes Trypanosoma chattoni, Trypanosoma fallisi, Trypanosoma mega, Trypanosoma neveulemairei, and Trypanosoma ranarum inferred from 18S rRNA gene sequences.

    Science.gov (United States)

    Martin, Donald S; Wright, André-Denis G; Barta, John R; Desser, Sherwin S

    2002-06-01

    Phylogenetic relationships within the kinetoplastid flagellates were inferred from comparisons of small-subunit ribosomal RNA gene sequences. These included 5 new gene sequences, Trypanosoma fallisi (2,239 bp), Trypanosoma chattoni (2,180 bp), Trypanosoma mega (2,211 bp), Trypanosoma neveulemairei (2,197 bp), and Trypanosoma ranarum (2,203 bp). Trees produced using maximum-parsimony and distance-matrix methods (least-squares, neighbor-joining, and maximum-likelihood), supported by strong bootstrap and quartet-puzzle analyses, indicated that the trypanosomes are a monophyletic group that divides into 2 major lineages, the salivarian trypanosomes and the nonsalivarian trypanosomes. The nonsalivarian trypanosomes further divide into 2 lineages, 1 containing trypanosomes of birds, mammals, and reptiles and the other containing trypanosomes of fish, reptiles, and anurans. Among the giant trypanosomes, T. chattoni is clearly shown to be distantly related to all the other anuran trypanosome species. Trypanosoma mega is closely associated with T. fallisi and T. ranarum, whereas T. neveulemairei and Trypanosoma rotatorium are sister taxa. The branching order of the anuran trypanosomes suggests that some toad trypanosomes may have evolved by host switching from frogs to toads.

  11. Polytomy identification in microbial phylogenetic reconstruction

    Directory of Open Access Journals (Sweden)

    Lin Guan

    2011-12-01

    Background: A phylogenetic tree, showing ancestral relations among organisms, is commonly represented as a rooted tree with sets of bifurcating branches (dichotomies) for simplicity, although polytomies (multifurcating branches) may reflect more accurate evolutionary relationships. To represent the true evolutionary relationships, it is important to systematically identify the polytomies from a bifurcating tree and generate a taxonomy-compatible multifurcating tree. For this purpose we propose a novel approach, "PolyPhy", which would classify a set of bifurcating branches of a phylogenetic tree into a set of branches with dichotomies and polytomies by considering genome distances among genomes and tree topological properties. Results: PolyPhy employs a machine learning technique, the BLR (Bayesian logistic regression) classifier, to identify possible bifurcating subtrees as polytomies from the trees resulting from ComPhy. Other than considering genome-scale distances between all pairs of species, PolyPhy also takes into account different properties of tree topology between dichotomy and polytomy, such as long-branch retraction and short-branch contraction, and quantifies these properties into comparable rates among different sub-branches. We extract three tree topological features, 'LR' (Leaf rate), 'IntraR' (Intra-subset branch rate) and 'InterR' (Inter-subset branch rate), all of which are calculated from bifurcating tree branch sets for classification. We have achieved an F-measure (balanced measure between precision and recall) of 81% with about 0.9 area under the curve (AUC) of ROC. Conclusions: PolyPhy is a fast and robust method to identify polytomies from phylogenetic trees based on genome-wide inference of evolutionary relationships among genomes. The software package and test data can be downloaded from http://digbio.missouri.edu/ComPhy/phyloTreeBiNonBi-1.0.zip.
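
    The F-measure quoted in this record is the harmonic mean of precision and recall. The sketch below shows the standard formula only; the confusion counts are made up for illustration and are not taken from the PolyPhy study, and the function name is hypothetical.

```python
def f_measure(precision, recall, beta=1.0):
    """F-beta score; beta=1 gives the balanced F-measure (harmonic mean)."""
    if precision < 0 or recall < 0:
        raise ValueError("precision and recall must be non-negative")
    denom = beta ** 2 * precision + recall
    if denom == 0:
        return 0.0  # no true positives at all
    return (1 + beta ** 2) * precision * recall / denom


# Illustrative counts only (assumption): polytomy calls vs. a reference taxonomy.
true_pos, false_pos, false_neg = 40, 10, 10
precision = true_pos / (true_pos + false_pos)
recall = true_pos / (true_pos + false_neg)
score = f_measure(precision, recall)
print(f"precision={precision:.2f} recall={recall:.2f} F={score:.2f}")
```

    Because the harmonic mean is dominated by the smaller of its two inputs, a classifier cannot reach a high F-measure by trading away recall for precision or the reverse.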

  12. Accurate reconstruction of insertion-deletion histories by statistical phylogenetics.

    Directory of Open Access Journals (Sweden)

    Oscar Westesson

    The Multiple Sequence Alignment (MSA) is a computational abstraction that represents a partial summary either of indel history, or of structural similarity. Taking the former view (indel history), it is possible to use formal automata theory to generalize the phylogenetic likelihood framework for finite substitution models (Dayhoff's probability matrices and Felsenstein's pruning algorithm) to arbitrary-length sequences. In this paper, we report results of a simulation-based benchmark of several methods for reconstruction of indel history. The methods tested include a relatively new algorithm for statistical marginalization of MSAs that sums over a stochastically-sampled ensemble of the most probable evolutionary histories. For mammalian evolutionary parameters on several different trees, the single most likely history sampled by our algorithm appears less biased than histories reconstructed by other MSA methods. The algorithm can also be used for alignment-free inference, where the MSA is explicitly summed out of the analysis. As an illustration of our method, we discuss reconstruction of the evolutionary histories of human protein-coding genes.
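
    For readers unfamiliar with the pruning algorithm mentioned in this record, the following is a minimal sketch of it for a single alignment column. It covers only the classical substitution-only case, not the automata-based indel generalization the paper describes. The Jukes-Cantor (JC69) model, the uniform root distribution, the nested-list tree encoding and the toy tree are all assumptions chosen for brevity.

```python
import math

BASES = "ACGT"


def jc69(t):
    """JC69 transition matrix for a branch of t expected substitutions per site."""
    e = math.exp(-4.0 * t / 3.0)
    same, diff = 0.25 + 0.75 * e, 0.25 - 0.25 * e
    return [[same if i == j else diff for j in range(4)] for i in range(4)]


def partial(node, site):
    """Conditional likelihoods L[x] = P(observed tips below node | node state x).

    A leaf is a taxon name (str); an internal node is a list of
    (child, branch_length) pairs.
    """
    if isinstance(node, str):
        return [1.0 if b == site[node] else 0.0 for b in BASES]
    result = [1.0] * 4
    for child, length in node:
        p = jc69(length)
        below = partial(child, site)
        for x in range(4):
            result[x] *= sum(p[x][y] * below[y] for y in range(4))
    return result


def site_likelihood(tree, site):
    """Sum the root partials against the uniform JC69 stationary distribution."""
    return sum(0.25 * v for v in partial(tree, site))


# Toy rooted tree ((A:0.1, B:0.1):0.05, C:0.2); taxon names are placeholders.
tree = [([("A", 0.1), ("B", 0.1)], 0.05), ("C", 0.2)]
print(site_likelihood(tree, {"A": "G", "B": "G", "C": "T"}))
```

    Each node is visited once per site, so the cost grows linearly with the number of taxa rather than with the number of possible ancestral state assignments.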

  13. Monophyly of Archaeplastida supergroup and relationships among its lineages in the light of phylogenetic and phylogenomic studies. Are we close to a consensus?

    Directory of Open Access Journals (Sweden)

    Paweł Mackiewicz

    2014-12-01

    One of the key evolutionary events on the scale of the biosphere was an endosymbiosis between a heterotrophic eukaryote and a cyanobacterium, resulting in a primary plastid. Such an organelle is characteristic of three eukaryotic lineages, glaucophytes, red algae and green plants. The three groups are usually united under the common name Archaeplastida or Plantae in modern taxonomic classifications, which indicates they are considered monophyletic. The methods generally used to verify this monophyly are phylogenetic analyses. In this article we review up-to-date results of such analyses and discuss their inconsistencies. Although phylogenies of plastid genes suggest a single primary endosymbiosis, which is assumed to mean a common origin of the Archaeplastida, different phylogenetic trees based on nuclear markers show monophyly, paraphyly, polyphyly or unresolved topologies of Archaeplastida hosts. The difficulties in reconstructing host cell relationships could result from stochastic and systematic biases in data sets, including different substitution rates and patterns, gene paralogy and horizontal/endosymbiotic gene transfer into eukaryotic lineages, which attract Archaeplastida in phylogenetic trees. Based on results to date, it is possible neither to confirm nor to refute alternative evolutionary scenarios to a single primary endosymbiosis. Nevertheless, if trees supporting monophyly are considered, relationships inferred among Archaeplastida lineages can be discussed. Phylogenetic analyses based on nuclear genes clearly show the earlier divergence of glaucophytes from red algae and green plants. Plastid genes suggest a more complicated history, but at least some studies are congruent with this concept. Additional research involving more representatives of glaucophytes and many understudied lineages of Eukaryota can improve the inference of phylogenetic relationships related to the Archaeplastida. In addition, alternative approaches not directly

  14. Molecular evolution of Adh and LEAFY and the phylogenetic utility of their introns in Pyrus (Rosaceae).

    Science.gov (United States)

    Zheng, Xiaoyan; Hu, Chunyun; Spooner, David; Liu, Jing; Cao, Jiashu; Teng, Yuanwen

    2011-09-14

    The genus Pyrus belongs to the tribe Pyreae (the former subfamily Maloideae) of the family Rosaceae, and includes one of the most important commercial fruit crops, pear. The phylogeny of Pyrus has not been definitively reconstructed. In our previous efforts, the internal transcribed spacer region (ITS) revealed a poorly resolved phylogeny due to non-concerted evolution of nrDNA arrays. Therefore, introns of low copy nuclear genes (LCNG) are explored here for improved resolution. However, paralogs and lineage sorting are still two challenges for applying LCNGs in phylogenetic studies, and at least two independent nuclear loci should be compared. In this work the second intron of LEAFY and the alcohol dehydrogenase gene (Adh) were selected to investigate their molecular evolution and phylogenetic utility. DNA sequence analyses revealed a complex ortholog and paralog structure of Adh genes in Pyrus and Malus, the pears and apples. Comparisons between sequences from RT-PCR and genomic PCR indicate that some Adh homologs are putatively nonfunctional. A partial region of Adh1 was sequenced for 18 Pyrus species and three subparalogs representing Adh1-1 were identified. These led to poorly resolved phylogenies due to low sequence divergence and the inclusion of putative recombinants. For the second intron of LEAFY, multiple inparalogs were discovered for both LFY1int2 and LFY2int2. LFY1int2 is inadequate for phylogenetic analysis due to lineage sorting of two inparalogs. LFY2int2-N, however, showed a relatively high sequence divergence and led to the best-resolved phylogeny. This study documents the coexistence of outparalogs and inparalogs, and lineage sorting of these paralogs and orthologous copies. It reveals putative recombinants that can lead to incorrect phylogenetic inferences, and presents an improved phylogenetic resolution of Pyrus using LFY2int2-N. Our study represents the first phylogenetic analyses based on LCNGs in Pyrus. 
Ancient and recent duplications lead

  15. Molecular evolution of Adh and LEAFY and the phylogenetic utility of their introns in Pyrus (Rosaceae)

    Directory of Open Access Journals (Sweden)

    Cao Jiashu

    2011-09-01

    Background: The genus Pyrus belongs to the tribe Pyreae (the former subfamily Maloideae) of the family Rosaceae, and includes one of the most important commercial fruit crops, pear. The phylogeny of Pyrus has not been definitively reconstructed. In our previous efforts, the internal transcribed spacer region (ITS) revealed a poorly resolved phylogeny due to non-concerted evolution of nrDNA arrays. Therefore, introns of low copy nuclear genes (LCNG) are explored here for improved resolution. However, paralogs and lineage sorting are still two challenges for applying LCNGs in phylogenetic studies, and at least two independent nuclear loci should be compared. In this work the second intron of LEAFY and the alcohol dehydrogenase gene (Adh) were selected to investigate their molecular evolution and phylogenetic utility. Results: DNA sequence analyses revealed a complex ortholog and paralog structure of Adh genes in Pyrus and Malus, the pears and apples. Comparisons between sequences from RT-PCR and genomic PCR indicate that some Adh homologs are putatively nonfunctional. A partial region of Adh1 was sequenced for 18 Pyrus species and three subparalogs representing Adh1-1 were identified. These led to poorly resolved phylogenies due to low sequence divergence and the inclusion of putative recombinants. For the second intron of LEAFY, multiple inparalogs were discovered for both LFY1int2 and LFY2int2. LFY1int2 is inadequate for phylogenetic analysis due to lineage sorting of two inparalogs. LFY2int2-N, however, showed a relatively high sequence divergence and led to the best-resolved phylogeny. This study documents the coexistence of outparalogs and inparalogs, and lineage sorting of these paralogs and orthologous copies. It reveals putative recombinants that can lead to incorrect phylogenetic inferences, and presents an improved phylogenetic resolution of Pyrus using LFY2int2-N.
Conclusions: Our study represents the first phylogenetic analyses based

  16. Phylogenetic relationships in Asarum: Effect of data partitioning and a revised classification.

    Science.gov (United States)

    Sinn, Brandon T; Kelly, Lawrence M; Freudenstein, John V

    2015-05-01

    Generic boundaries and infrageneric relationships among the charismatic temperate magnoliid Asarum sensu lato (Aristolochiaceae) have long been uncertain. Previous molecular phylogenetic analyses used either plastid or nuclear loci alone and varied greatly in their taxonomic implications for the genus. We analyzed additional molecular markers from the nuclear and plastid genomes, reevaluated the possibility of a derived loss of autonomous self-pollination, and investigated the topological effects of matrix-partitioning-scheme choice. We sequenced seven plastid regions and the nuclear ITS1-ITS2 region of 58 individuals representing all previously recognized Asarum s.l. segregate genera and the monotypic genus Saruma. Matrices were partitioned using common a priori partitioning schemes and PartitionFinder. Topologies that were recovered using a priori partitioning of matrices differed from those recovered using a PartitionFinder-selected scheme, and by analysis method. We recovered six monophyletic groups that we circumscribed into three subgenera and six sections. Putative fungal mimic characters served as synapomorphies only for subgenus Heterotropa. Subgenus Geotaenium, a new subgenus, was recovered as sister to the remainder of Asarum by ML analyses of highly partitioned datasets. Section Longistylis, also newly named, is sister to section Hexastylis. Our analyses do not unambiguously support a single origin for all fungal-mimicry characters. Topologies recovered through the analysis of PartitionFinder-optimized matrices can differ drastically from those inferred from a priori partitioned matrices, and by analytical method. We recommend that investigators evaluate the topological effects of matrix partitioning using multiple methods of phylogenetic reconstruction. © 2015 Botanical Society of America, Inc.

  17. Data partitions, Bayesian analysis and phylogeny of the zygomycetous fungal family Mortierellaceae, inferred from nuclear ribosomal DNA sequences.

    Directory of Open Access Journals (Sweden)

    Tamás Petkovits

    Although the fungal order Mortierellales constitutes one of the largest classical groups of Zygomycota, its phylogeny is poorly understood and no modern taxonomic revision is currently available. In the present study, 90 type and reference strains were used to infer a comprehensive phylogeny of Mortierellales from the sequence data of the complete ITS region and the LSU and SSU genes, with special attention to the monophyly of the genus Mortierella. Out of 15 alternative partitioning strategies compared on the basis of Bayes factors, the one with the highest number of partitions was found optimal (with mixture models yielding the best likelihood and tree length values), implying a higher complexity of evolutionary patterns in the ribosomal genes than generally recognized. Modeling the ITS1, 5.8S, and ITS2 loci separately improved model fit significantly as compared to treating all as one and the same partition. Further, within-partition mixture models suggest that not only do the SSU, LSU and ITS regions evolve under qualitatively and/or quantitatively different constraints, but that significant heterogeneity can also be found within these loci. The phylogenetic analysis indicated that the genus Mortierella is paraphyletic with respect to the genera Dissophora, Gamsiella and Lobosporangium, and the resulting phylogeny contradicts the previous, morphology-based sectional classification of Mortierella. Based on tree structure and phenotypic traits, we recognize 12 major clades, for which we attempt to summarize phenotypic similarities. M. longicollis is closely related to the outgroup taxon Rhizopus oryzae, suggesting that it belongs to the Mucorales.
Our results demonstrate that traits used in previous classifications of the Mortierellales are highly homoplastic and that the Mortierellales is in need of reclassification, where new, phylogenetically informative phenotypic traits should be identified, with molecular phylogenies playing a decisive role.
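
    Comparing partitioning strategies by Bayes factors reduces to comparing marginal log-likelihoods, which can be sketched as follows. The scheme names and log-likelihood values are invented for illustration and do not come from this study; how the marginal likelihoods were estimated there is not stated in the record.

```python
def two_ln_bayes_factor(ln_marginal_1, ln_marginal_0):
    """Return 2*ln(BF10) from the marginal log-likelihoods of models 1 and 0."""
    return 2.0 * (ln_marginal_1 - ln_marginal_0)


# Hypothetical marginal log-likelihoods for three partitioning schemes.
schemes = {
    "unpartitioned": -25310.4,
    "by_locus": -25102.7,
    "by_locus_ITS_split": -25041.9,
}

best = max(schemes, key=schemes.get)
for name, ln_ml in sorted(schemes.items(), key=lambda kv: kv[1]):
    stat = two_ln_bayes_factor(schemes[best], ln_ml)
    print(f"{name:22s} lnML={ln_ml:10.1f}  2lnBF(best vs this)={stat:7.1f}")
```

    On the commonly used Kass and Raftery scale, values of 2 ln BF above about 10 are read as very strong support for the better-fitting model.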

  18. NGS combined with phylogenetic analysis to detect HIV-1 dual infection in Romanian people who inject drugs.

    Science.gov (United States)

    Popescu, Bogdan; Banica, Leontina; Nicolae, Ionelia; Radu, Eugen; Niculescu, Iulia; Abagiu, Adrian; Otelea, Dan; Paraschiv, Simona

    2018-04-04

    Dual HIV infections are possible and likely in people who inject drugs (PWID). Thirty-eight newly diagnosed patients, 19 PWID and 19 heterosexually HIV infected were analysed. V2-V3 loop of HIV-1 env gene was sequenced on the NGS platform 454 GSJunior (Roche). HIV-1 dual/multiple infections were identified in five PWID. For three of these patients, the reconstructed variants belonged to pure F1 subtype and CRF14_BG strains according to phylogenetic analysis. New recombinant forms between these parental strains were identified in two PWID samples. NGS data can provide, with the help of phylogenetic analysis, important insights about the intra-host sub-population structure. Copyright © 2018. Published by Elsevier Masson SAS.

  19. Functional & phylogenetic diversity of copepod communities

    Science.gov (United States)

    Benedetti, F.; Ayata, S. D.; Blanco-Bercial, L.; Cornils, A.; Guilhaumon, F.

    2016-02-01

    The diversity of natural communities is classically estimated through species identification (taxonomic diversity) but can also be estimated from the ecological functions performed by the species (functional diversity), or from the phylogenetic relationships among them (phylogenetic diversity). Estimating functional diversity requires the definition of specific functional traits, i.e., phenotypic characteristics that impact fitness and are relevant to ecosystem functioning. Estimating phylogenetic diversity requires the description of phylogenetic relationships, for instance by using molecular tools. In the present study, we focused on the functional and phylogenetic diversity of copepod surface communities in the Mediterranean Sea. First, we implemented a specific trait database for the most commonly-sampled and abundant copepod species of the Mediterranean Sea. Our database includes 191 species, described by seven traits encompassing diverse ecological functions: minimal and maximal body length, trophic group, feeding type, spawning strategy, diel vertical migration and vertical habitat. Clustering analysis in the functional trait space revealed that Mediterranean copepods can be gathered into groups that have different ecological roles. Second, we reconstructed a phylogenetic tree using the available sequences of 18S rRNA. Our tree included 154 of the analyzed Mediterranean copepod species. We used these two datasets to describe the functional and phylogenetic diversity of copepod surface communities in the Mediterranean Sea. The replacement component (turn-over) and the species richness difference component (nestedness) of the beta diversity indices were identified. Finally, by comparing various and complementary aspects of plankton diversity (taxonomic, functional, and phylogenetic diversity) we were able to gain a better understanding of the relationships among the zooplankton community, biodiversity, ecosystem function, and environmental forcing.
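
    The split of beta diversity into a replacement part and a richness-difference part can be illustrated for a single pair of communities. The sketch below uses the Jaccard-based Podani-family decomposition; whether the study used exactly this variant is an assumption, and the species labels are placeholders.

```python
def beta_components(site1, site2):
    """Jaccard dissimilarity split into replacement and richness difference.

    a = shared species, b and c = species unique to each site.
    """
    s1, s2 = set(site1), set(site2)
    a, b, c = len(s1 & s2), len(s1 - s2), len(s2 - s1)
    n = a + b + c
    if n == 0:
        raise ValueError("both communities are empty")
    return {
        "total": (b + c) / n,
        "replacement": 2 * min(b, c) / n,
        "richness_difference": abs(b - c) / n,
    }


# Placeholder presence lists for two sampling stations.
station_1 = ["sp1", "sp2", "sp3", "sp4"]
station_2 = ["sp3", "sp4", "sp5"]
parts = beta_components(station_1, station_2)
print(parts)
```

    The two components always add up to the total dissimilarity, because 2*min(b, c) + |b - c| equals b + c.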

  20. Phylogenetic turnover during subtropical forest succession across environmental and phylogenetic scales.

    Science.gov (United States)

    Purschke, Oliver; Michalski, Stefan G; Bruelheide, Helge; Durka, Walter

    2017-12-01

    Although spatial and temporal patterns of phylogenetic community structure during succession are inherently interlinked and assembly processes vary with environmental and phylogenetic scales, successional studies of community assembly have yet to integrate spatial and temporal components of community structure, while accounting for scaling issues. To gain insight into the processes that generate biodiversity after disturbance, we combine analyses of spatial and temporal phylogenetic turnover across phylogenetic scales, accounting for covariation with environmental differences. We compared phylogenetic turnover, at the species- and individual-level, within and between five successional stages, representing woody plant communities in a subtropical forest chronosequence. We decomposed turnover at different phylogenetic depths and assessed its covariation with between-plot abiotic differences. Phylogenetic turnover between stages was low relative to species turnover and was not explained by abiotic differences. However, within the late-successional stages, there was high presence-/absence-based turnover (clustering) that occurred deep in the phylogeny and covaried with environmental differentiation. Our results support a deterministic model of community assembly where (i) phylogenetic composition is constrained through successional time, but (ii) toward late succession, species sorting into preferred habitats according to niche traits that are conserved deep in phylogeny, becomes increasingly important.

  1. Aujeszky's disease in red fox (Vulpes vulpes): phylogenetic analysis unravels an unexpected epidemiologic link.

    Science.gov (United States)

    Caruso, Claudio; Dondo, Alessandro; Cerutti, Francesco; Masoero, Loretta; Rosamilia, Alfonso; Zoppi, Simona; D'Errico, Valeria; Grattarola, Carla; Acutis, Pier Luigi; Peletto, Simone

    2014-07-01

    We describe Aujeszky's disease in a female red fox (Vulpes vulpes). Although wild boar (Sus scrofa) would be the expected source of infection, phylogenetic analysis suggested a domestic rather than a wild source of virus, underscoring the importance of biosecurity measures in pig farms to prevent contact with wild animals.

  2. A preliminary mitochondrial genome phylogeny of Orthoptera (Insecta) and approaches to maximizing phylogenetic signal found within mitochondrial genome data.

    Science.gov (United States)

    Fenn, J Daniel; Song, Hojun; Cameron, Stephen L; Whiting, Michael F

    2008-10-01

    The phylogenetic utility of mitochondrial genomes (mtgenomes) is examined using the framework of a preliminary phylogeny of Orthoptera. This study presents five newly sequenced genomes from four orthopteran families. While all ensiferan and polyneopteran taxa retain the ancestral gene order, all caeliferan lineages including the newly sequenced caeliferan species contain a tRNA rearrangement from the insect ground plan tRNA(Lys)(K)-tRNA(Asp)(D) swapping to tRNA(Asp) (D)-tRNA(Lys) (K) confirming that this rearrangement is a possible molecular synapomorphy for this suborder. The phylogenetic signal in mtgenomes is rigorously examined under the analytical regimens of parsimony, maximum likelihood and Bayesian inference, along with how gene inclusion/exclusion, data recoding, gap coding, and different partitioning schemes influence the phylogenetic reconstruction. When all available data are analyzed simultaneously, the monophyly of Orthoptera and its two suborders, Caelifera and Ensifera, are consistently recovered in the context of our taxon sampling, regardless of the optimality criteria. When protein-coding genes are analyzed as a single partition, nearly identical topology to the combined analyses is recovered, suggesting that much of the signals of the mtgenome come from the protein-coding genes. Transfer and ribosomal RNAs perform poorly when analyzed individually, but contribute signal when analyzed in combination with the protein-coding genes. Inclusion of third codon position of the protein-coding genes does not negatively affect the phylogenetic reconstruction when all genes are analyzed together, whereas recoding of the protein-coding genes into amino acid sequences introduces artificial resolution. Over-partitioning in a Bayesian framework appears to have a negative effect in achieving convergence. 
Our findings suggest that the best phylogenetic inferences are made when all available nucleotide data from the mtgenome are analyzed simultaneously, and that

  3. Short branches lead to systematic artifacts when BLAST searches are used as surrogate for phylogenetic reconstruction.

    Science.gov (United States)

    Dick, Amanda A; Harlow, Timothy J; Gogarten, J Peter

    2017-02-01

    Long Branch Attraction (LBA) is a well-known artifact in phylogenetic reconstruction when dealing with branch length heterogeneity. Here we show another phenomenon, Short Branch Attraction (SBA), which occurs when BLAST searches, a phenetic analysis, are used as a surrogate method for phylogenetic analysis. This error also results from branch length heterogeneity, but this time it is the short branches that are attracting. The SBA artifact is reciprocal and can be returned 100% of the time when multiple branches differ in length by a factor of more than two. SBA is an intended feature of BLAST searches, but becomes an issue, when top scoring BLAST hit analyses are used to infer Horizontal Gene Transfers (HGTs), assign taxonomic category with environmental sequence data in phylotyping, or gather homologous sequences for building gene families. SBA can lead researchers to believe that there has been a HGT event when only vertical descent has occurred, cause slowly evolving taxa to be over-represented and quickly evolving taxa to be under-represented in phylotyping, or systematically exclude quickly evolving taxa from analyses. SBA also contributes to the changing results of top scoring BLAST hit analyses as the database grows, because more slowly evolving taxa, or short branches, are added over time, introducing more potential for SBA. SBA can be detected by examining reciprocal best BLAST hits among a larger group of taxa, including the known closest phylogenetic neighbors. Therefore, one should look for this phenomenon when conducting best BLAST hit analyses as a surrogate method to identify HGTs, in phylotyping, or when using BLAST to gather homologous sequences. Copyright © 2016 Elsevier Inc. All rights reserved.
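
    The reciprocal best hit bookkeeping referred to in this record can be sketched as below. The input is assumed to be (query, subject, bit score) tuples already parsed from BLAST tabular output; sequence identifiers and scores are invented. The sketch only flags one-way top hits for inspection and does not by itself establish or rule out short branch attraction.

```python
def best_hits(hits):
    """Map each query to its top-scoring subject.

    hits: iterable of (query, subject, bitscore) tuples.
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Pairs (a, b) where a's best hit is b and b's best hit is a."""
    forward, reverse = best_hits(a_vs_b), best_hits(b_vs_a)
    return sorted((q, s) for q, s in forward.items() if reverse.get(s) == q)


# Invented example: genome A searched against genome B, and the reverse.
a_vs_b = [("a1", "b1", 250.0), ("a1", "b2", 180.0),
          ("a2", "b2", 300.0), ("a3", "b2", 120.0)]
b_vs_a = [("b1", "a1", 245.0), ("b2", "a2", 295.0), ("b2", "a3", 110.0)]

pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
one_way = sorted(set(best_hits(a_vs_b).items()) - set(pairs))
print("reciprocal:", pairs)
print("one-way top hits to inspect:", one_way)
```

    Repeating the comparison across several taxa, including the known closest phylogenetic neighbors, follows the check the abstract recommends.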

  4. A bootstrap based analysis pipeline for efficient classification of phylogenetically related animal miRNAs

    Directory of Open Access Journals (Sweden)

    Gu Xun

    2007-03-01

    Background: Phylogenetically related miRNAs (miRNA families) convey important information about the function and evolution of miRNAs. Due to the special sequence features of miRNAs, pair-wise sequence identity between miRNA precursors alone is often inadequate for unequivocally judging the phylogenetic relationships between miRNAs. Most of the current methods for miRNA classification rely heavily on manual inspection and lack measurements of the reliability of the results. Results: In this study, we designed an analysis pipeline (the Phylogeny-Bootstrap-Cluster (PBC) pipeline) to identify miRNA families based on branch stability in the bootstrap trees derived from overlapping genome-wide miRNA sequence sets. We tested the PBC analysis pipeline with the miRNAs from six animal species, H. sapiens, M. musculus, G. gallus, D. rerio, D. melanogaster, and C. elegans. The resulting classification was compared with the miRNA families defined in miRBase. The two classifications were largely consistent. Conclusion: The PBC analysis pipeline is an efficient method for classifying large numbers of heterogeneous miRNA sequences. It requires minimum human involvement and provides measurements of the reliability of the classification results.
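
    Branch stability in bootstrap trees rests on two simple operations: resampling alignment columns with replacement, and counting how often a grouping recurs across replicate trees. The sketch below shows both in generic form. It is not the PBC pipeline itself; the tree-building step is omitted, and the sequence names, toy alignment and replicate clade sets are assumptions.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one bootstrap replicate: columns resampled with replacement."""
    names = list(alignment)
    length = len(alignment[names[0]])
    if any(len(alignment[n]) != length for n in names):
        raise ValueError("all aligned sequences must have the same length")
    columns = [rng.randrange(length) for _ in range(length)]
    return {n: "".join(alignment[n][i] for i in columns) for n in names}


def clade_support(replicate_clades, clade):
    """Fraction of replicate trees (given as sets of clades) containing clade."""
    clade = frozenset(clade)
    return sum(clade in rep for rep in replicate_clades) / len(replicate_clades)


# Toy aligned precursors with placeholder names.
alignment = {"mir-x1": "UGAGGUAG", "mir-x2": "UGAGGUAU", "mir-y1": "CAUUGCAC"}
replicate = bootstrap_alignment(alignment, random.Random(42))

# Clades recovered from four hypothetical replicate trees.
replicates = [
    {frozenset({"mir-x1", "mir-x2"})},
    {frozenset({"mir-x1", "mir-x2"})},
    {frozenset({"mir-x2", "mir-y1"})},
    {frozenset({"mir-x1", "mir-x2"})},
]
print(clade_support(replicates, {"mir-x1", "mir-x2"}))
```

    A grouping that survives in most replicates is treated as stable; the threshold used to call a family is a design choice not shown here.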

  5. Phylogenomic and MALDI-TOF MS analysis of Streptococcus sinensis HKU4T reveals a distinct phylogenetic clade in the genus Streptococcus.

    Science.gov (United States)

    Teng, Jade L L; Huang, Yi; Tse, Herman; Chen, Jonathan H K; Tang, Ying; Lau, Susanna K P; Woo, Patrick C Y

    2014-10-20

    Streptococcus sinensis is a recently discovered human pathogen isolated from blood cultures of patients with infective endocarditis. Its phylogenetic position, as well as those of its closely related species, remains inconclusive when single genes are used for phylogenetic analysis. For example, S. sinensis branched out from members of the anginosus, mitis, and sanguinis groups in the 16S ribosomal RNA gene phylogenetic tree, but it was clustered with members of the anginosus and sanguinis groups when groEL gene sequences were used for analysis. In this study, we sequenced the draft genome of S. sinensis and used a polyphasic approach, including concatenated genes, whole genomes, and matrix-assisted laser desorption ionization-time of flight mass spectrometry to analyze the phylogeny of S. sinensis. The size of the S. sinensis draft genome is 2.06 Mb, with GC content of 42.2%. Phylogenetic analysis using 50 concatenated genes or whole genomes revealed that S. sinensis formed a distinct cluster with Streptococcus oligofermentans and Streptococcus cristatus, and these three streptococci were clustered with the "sanguinis group." As for phylogenetic analysis using hierarchical cluster analysis of the mass spectra of streptococci, S. sinensis also formed a distinct cluster with S. oligofermentans and S. cristatus, but these three streptococci were clustered with the "mitis group." On the basis of the findings, we propose a novel group, named "sinensis group," to include S. sinensis, S. oligofermentans, and S. cristatus, in the Streptococcus genus. Our study also illustrates the power of phylogenomic analyses for resolving ambiguities in bacterial taxonomy. © The Author(s) 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  6. Molecular and morphological data supporting phylogenetic reconstruction of the genus Goniothalamus (Annonaceae), including a reassessment of previous infrageneric classifications.

    Science.gov (United States)

    Tang, Chin Cheung; Thomas, Daniel C; Saunders, Richard M K

    2015-09-01

    Data is presented in support of a phylogenetic reconstruction of the species-rich early-divergent angiosperm genus Goniothalamus (Annonaceae) (Tang et al., Mol. Phylogenetic Evol., 2015) [1], inferred using chloroplast DNA (cpDNA) sequences. The data includes a list of primers for amplification and sequencing for nine cpDNA regions: atpB-rbcL, matK, ndhF, psbA-trnH, psbM-trnD, rbcL, trnL-F, trnS-G, and ycf1, the voucher information and molecular data (GenBank accession numbers) of 67 ingroup Goniothalamus accessions and 14 outgroup accessions selected from across the tribe Annoneae, and aligned data matrices for each gene region. We also present our Bayesian phylogenetic reconstructions for Goniothalamus, with information on previous infrageneric classifications superimposed to enable an evaluation of monophyly, together with a taxon-character data matrix (with 15 morphological characters scored for 66 Goniothalamus species and seven other species from the tribe Annoneae that are shown to be phylogenetically correlated).

  7. Molecular and morphological data supporting phylogenetic reconstruction of the genus Goniothalamus (Annonaceae), including a reassessment of previous infrageneric classifications

    Directory of Open Access Journals (Sweden)

    Chin Cheung Tang

    2015-09-01

    Data is presented in support of a phylogenetic reconstruction of the species-rich early-divergent angiosperm genus Goniothalamus (Annonaceae) (Tang et al., Mol. Phylogenetic Evol., 2015) [1], inferred using chloroplast DNA (cpDNA) sequences. The data includes a list of primers for amplification and sequencing for nine cpDNA regions: atpB-rbcL, matK, ndhF, psbA-trnH, psbM-trnD, rbcL, trnL-F, trnS-G, and ycf1, the voucher information and molecular data (GenBank accession numbers) of 67 ingroup Goniothalamus accessions and 14 outgroup accessions selected from across the tribe Annoneae, and aligned data matrices for each gene region. We also present our Bayesian phylogenetic reconstructions for Goniothalamus, with information on previous infrageneric classifications superimposed to enable an evaluation of monophyly, together with a taxon-character data matrix (with 15 morphological characters scored for 66 Goniothalamus species and seven other species from the tribe Annoneae that are shown to be phylogenetically correlated).

  8. Characterization of the complete mitochondrial genome of Acanthoscelides obtectus (Coleoptera: Chrysomelidae: Bruchinae) with phylogenetic analysis.

    Science.gov (United States)

    Yao, Jie; Yang, Hong; Dai, Renhuai

    2017-10-01

    Acanthoscelides obtectus is a common species of the subfamily Bruchinae and a worldwide-distributed seed-feeding beetle. The complete mitochondrial genome of A. obtectus is 16,130 bp in length with an A + T content of 76.4%. It has a positive AT skew and a negative GC skew. The mitogenome of A. obtectus contains 13 protein-coding genes (PCGs), 22 tRNA genes, two rRNA genes and a non-coding region (D-loop). All PCGs start with an ATN codon, and seven (ND3, ATP6, COIII, ND3, ND4L, ND6, and Cytb) of them terminate with TAA, while the remaining five (COI, COII, ND1, ND4, and ND5) terminate with a single T; ATP8 terminates with TGA. Except for tRNA(Ser), the secondary structures of 21 tRNAs that can be folded into a typical clover-leaf structure were identified. The secondary structures of lrRNA and srRNA were also predicted in this study. There are six domains with 48 helices in lrRNA and three domains with 32 helices in srRNA. The control region of A. obtectus is 1354 bp in size with the highest A + T content (83.5%) in a mitochondrial gene. Thirteen PCGs in 19 species have been used to infer their phylogenetic relationships. Our results show that A. obtectus belongs to the family Chrysomelidae (subfamily Bruchinae). This is the first study on phylogenetic analyses involving the mitochondrial genes of A. obtectus and could provide basic data for future studies of mitochondrial genome diversities and the evolution of related insect lineages.
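
    The composition statistics reported in this record follow the usual definitions: AT skew = (A - T) / (A + T) and GC skew = (G - C) / (G + C). A minimal sketch, using a short invented sequence rather than the actual mitogenome:

```python
def composition_stats(seq):
    """A+T content, AT skew and GC skew of a nucleotide sequence."""
    seq = seq.upper()
    a, c, g, t = (seq.count(base) for base in "ACGT")
    total = a + c + g + t
    if total == 0 or a + t == 0 or g + c == 0:
        raise ValueError("sequence must contain A/T and G/C bases")
    return {
        "at_content": (a + t) / total,   # ambiguous bases are ignored
        "at_skew": (a - t) / (a + t),    # > 0 means more A than T
        "gc_skew": (g - c) / (g + c),    # < 0 means more C than G
    }


# Toy strand (assumption), chosen to give a positive AT and negative GC skew.
stats = composition_stats("AAAATTTCCG")
print(stats)
```

    Skews are strand-specific, so the values depend on which strand is reported.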

  9. Large-Scale Genomic Analysis of Codon Usage in Dengue Virus and Evaluation of Its Phylogenetic Dependence

    Directory of Open Access Journals (Sweden)

    Edgar E. Lara-Ramírez

    2014-01-01

    Full Text Available The increasing number of dengue virus (DENV) genome sequences available allows identifying the contributing factors to DENV evolution. In the present study, the codon usage in serotypes 1–4 (DENV1–4) has been explored for 3047 sequenced genomes using different statistical methods. The correlation analysis of total GC content (GC) with GC content at the three nucleotide positions of codons (GC1, GC2, and GC3), as well as the effective number of codons (ENC, ENCp) versus GC3 plots, revealed mutational bias and purifying selection pressures as the major forces influencing the codon usage, but with distinct pressure on specific nucleotide positions in the codon. The correspondence analysis (CA) and clustering analysis on relative synonymous codon usage (RSCU) within each serotype showed similar clustering patterns to the phylogenetic analysis of nucleotide sequences for DENV1–4. These clustering patterns are strongly related to the virus geographic origin. The phylogenetic dependence analysis also suggests that stabilizing selection acts on the codon usage bias. Our large-scale analysis reveals new features of DENV genomic evolution.
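
    Two of the statistics named above, position-specific GC content (GC1, GC2, GC3) and RSCU, can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' pipeline: it uses the standard genetic code, the usual RSCU definition (observed codon count divided by the mean count over that amino acid's synonymous codons), and hypothetical function names `gc_by_position` and `rscu`.

```python
from collections import Counter
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_BASES, repeat=3), _AMINO)}


def _codons(cds: str) -> list:
    """Split an in-frame coding sequence into codons, dropping a trailing partial codon."""
    cds = cds.upper().replace("U", "T")
    return [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]


def gc_by_position(cds: str) -> tuple:
    """Return (GC1, GC2, GC3): the G+C fraction at each codon position."""
    codons = _codons(cds)
    if not codons:
        raise ValueError("sequence contains no complete codon")
    return tuple(
        sum(codon[pos] in "GC" for codon in codons) / len(codons)
        for pos in range(3)
    )


def rscu(cds: str) -> dict:
    """Relative synonymous codon usage for every amino acid present in cds.

    RSCU = observed count / mean count over the synonymous codon family.
    Stop codons and codons with ambiguous bases are skipped.
    """
    counts = Counter(c for c in _codons(cds) if c in CODON_TABLE)
    values = {}
    for amino in set(CODON_TABLE.values()) - {"*"}:
        family = [c for c, aa in CODON_TABLE.items() if aa == amino]
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid absent from this sequence
        for codon in family:
            values[codon] = counts[codon] * len(family) / total
    return values
```

    An RSCU of 1.0 means a codon is used as often as expected under equal usage of its synonyms; values above 1.0 indicate preferred codons.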

  10. A phylogenetic analysis of the sugar porters in hemiascomycetous yeasts.

    Science.gov (United States)

    Palma, Margarida; Goffeau, André; Spencer-Martins, Isabel; Baret, Philippe V

    2007-01-01

    A total of 214 members of the sugar porter (SP) family (TC 2.A.1.1) from eight hemiascomycetous yeasts: Saccharomyces cerevisiae, Candida glabrata, Kluyveromyces lactis, Ashbya (Eremothecium) gossypii, Debaryomyces hansenii, Yarrowia lipolytica, Candida albicans and Pichia stipitis, were identified. The yeast SPs were classified in 13 different phylogenetic clusters. Specific sugar substrates could be allocated to nine phylogenetic clusters, including two novel TC clusters that are specific to fungi, i.e. the glycerol:H(+) symporter (2.A.1.1.38) and the high-affinity glucose transporter (2.A.1.1.39). Four phylogenetic clusters are identified by the preliminary fifth number Z23, Z24, Z25 and Z26 and the substrates of their members remain undetermined. The amplification of the SP clusters across the Hemiascomycetes reflects adaptation to specific carbon and energy sources available in the habitat of each yeast species. (c) 2007 S. Karger AG, Basel.

  11. Complex phylogenetic placement of Ilex species (Aquifoliaceae): a case study of molecular phylogeny

    International Nuclear Information System (INIS)

    Yi, F.; Sun, L.; Xiao, P.G.; Hao, D.C.

    2017-01-01

    To investigate the phylogenetic relationships among Ilex species distributed in China, we analyzed two alignments including 4,698 characters corresponding to six plastid sequences (matK, rbcL, atpB-rbcL, trnL-F, psbA-trnH, and rpl32-trnL) and 1,748 characters corresponding to two nuclear sequences (ITS and nepGS). Using different partitioning strategies and approaches (i.e., Bayesian inference, maximum likelihood, and maximum parsimony) for phylogeny reconstruction, different topologies and clade supports were determined. A total of 18 Ilex species was divided into two major groups (group I and II) in both plastid and nuclear phylogenies with some incongruences. Potential hybridization events may account, in part, for those phylogenetic uncertainties. The analyses, together with previously identified sequences, indicated that all 18 species were recovered within Eurasia or Asia/North America groups based on plastid data. Meanwhile, the species in group II in the nuclear phylogeny were placed in the Aquifolium clade, as inferred from traditional classification, whereas the species in group I belonged to several other clades. The divergence time of most of the 18 Ilex species was estimated to be not more than 10 million years ago. Based on the results of this study, we concluded that paleogeographical events and past climate changes during the same period might have played important roles in these diversifications. (author)

  12. jsPhyloSVG: a javascript library for visualizing interactive and vector-based phylogenetic trees on the web.

    Science.gov (United States)

    Smits, Samuel A; Ouverney, Cleber C

    2010-08-18

    Many software packages have been developed to address the need for generating phylogenetic trees intended for print. With an increased use of the web to disseminate scientific literature, there is a need for phylogenetic trees to be viewable across many types of devices and feature some of the interactive elements that are integral to the browsing experience. We propose a novel approach for publishing interactive phylogenetic trees. We present a javascript library, jsPhyloSVG, which facilitates constructing interactive phylogenetic trees from raw Newick or phyloXML formats directly within the browser in Scalable Vector Graphics (SVG) format. It is designed to work across all major browsers and renders an alternative format for those browsers that do not support SVG. The library provides tools for building rectangular and circular phylograms with integrated charting. Interactive features may be integrated and made to respond to events such as clicks on any element of the tree, including labels. jsPhyloSVG is an open-source solution for rendering dynamic phylogenetic trees. It is capable of generating complex and interactive phylogenetic trees across all major browsers without the need for plugins. It is novel in supporting the ability to interpret the tree inference formats directly, exposing the underlying markup to data-mining services. The library source code, extensive documentation and live examples are freely accessible at www.jsphylosvg.com.

  13. Revisiting the phylogeny of Zoanthidea (Cnidaria: Anthozoa): Staggered alignment of hypervariable sequences improves species tree inference.

    Science.gov (United States)

    Swain, Timothy D

    2018-01-01

    The recent rapid proliferation of novel taxon identification in the Zoanthidea has been accompanied by a parallel propagation of gene trees as a tool of species discovery, but not a corresponding increase in our understanding of phylogeny. This disparity is caused by the trade-off between the capabilities of automated DNA sequence alignment and data content of genes applied to phylogenetic inference in this group. Conserved genes or segments are easily aligned across the order, but produce poorly resolved trees; hypervariable genes or segments contain the evolutionary signal necessary for resolution and robust support, but sequence alignment is daunting. Staggered alignments are a form of phylogeny-informed sequence alignment composed of a mosaic of local and universal regions that allow phylogenetic inference to be applied to all nucleotides from both hypervariable and conserved gene segments. Comparisons between species tree phylogenies inferred from all data (staggered alignment) and hypervariable-excluded data (standard alignment) demonstrate improved confidence and greater topological agreement with other sources of data for the complete-data tree. This novel phylogeny is the most comprehensive to date (in terms of taxa and data) and can serve as an expandable tool for evolutionary hypothesis testing in the Zoanthidea. Spanish language abstract available in Text S1. Translation by L. O. Swain, DePaul University, Chicago, Illinois, 60604, USA. Copyright © 2017 Elsevier Inc. All rights reserved.

  14. Recombination Blurs Phylogenetic Groups Routine Assignment in Escherichia coli: Setting the Record Straight

    Science.gov (United States)

    Turrientes, María-Carmen; González-Alba, José-María; del Campo, Rosa; Baquero, María-Rosario; Cantón, Rafael; Baquero, Fernando; Galán, Juan Carlos

    2014-01-01

    The characterization of population structures plays a major role in understanding outbreaks and the dynamics of bacterial spreading. In Escherichia coli, the widely used combination of a multiplex-PCR scheme together with goeBURST has some limitations. The purpose of this study is to show that the combination of different phylogenetic approaches based on concatenated sequences of MLST genes results in a more precise assignment of E. coli phylogenetic groups, a more complete understanding of population structure and the reconstruction of ancestral clones. A collection of 80 Escherichia coli strains of different origins was analyzed following Clermont's and Doumith's multiplex-PCR schemes. Doumith's multiplex-PCR showed only 1.7% misassignment, whereas Clermont's 2000 protocol reached 14.0%, although the discrepancies reached 30% and 38.7%, respectively, when recombinant C, F and E phylogroups were considered. Therefore, correct phylogroup attribution is highly variable and depends on the clonal composition of the sample. As for the population structure of these E. coli strains, including 48 E. coli genomes from GenBank, goeBURST provides a quite dispersed population structure, whereas the NeighborNet approach reveals a complex population structure. MLST-based eBURST can infer different founder genotypes; for instance, ST23/ST88 could be detected as the founder genotypes for STC23. However, phylogenetic reconstructions might suggest ST410 as the ancestor clone and several evolutionary trajectories with different founders. To improve our routine understanding of E. coli molecular epidemiology, we propose a strategy based on three successive steps: first, to discriminate three main groups A/B1/C, D/F/E and B2 following Doumith's protocol; second, to visualize the population structure based on MLST genes according to goeBURST, using NeighborNet to establish more complex relationships among STs; and third, to perform a cost-free characterization of evolutionary trajectories in variants

  15. Inferring large-scale patterns of niche evolution and dispersal limitation from the phylogenetic composition of assemblages: A case study on New World palms

    DEFF Research Database (Denmark)

    Eiserhardt, Wolf L.; Svenning, J.-C.; Baker, William J.

    How fast species’ environmental tolerances can evolve is crucial for their survival prospect under climate change. Phylogenetic information can yield insights into the tempo of niche evolution. Phylogenetic community structure (PCS) complements the more widely used approach of studying niche...

  16. Fossil gaps inferred from phylogenies alter the apparent nature of diversification in dragonflies and their relatives

    Directory of Open Access Journals (Sweden)

    Nicholson David B

    2011-09-01

    Full Text Available Background: The fossil record has suggested that clade growth may differ in marine and terrestrial taxa, supporting equilibrial models in the former and expansionist models in the latter. However, incomplete sampling may bias findings based on fossil data alone. To attempt to correct for such bias, we assemble phylogenetic supertrees on one of the oldest clades of insects, the Odonatoidea (dragonflies, damselflies and their extinct relatives), using MRP and MRC. We use the trees to determine when, and in what clades, changes in taxonomic richness have occurred. We then test whether equilibrial or expansionist models are supported by fossil data alone, and whether findings differ when phylogenetic information is used to infer gaps in the fossil record. Results: There is broad agreement in family-level relationships between both supertrees, though with some uncertainty along the backbone of the tree regarding dragonflies (Anisoptera). "Anisozygoptera" are shown to be paraphyletic when fossil information is taken into account. In both trees, decreases in net diversification are associated with species-poor extant families (Neopetaliidae, Hemiphlebiidae), and an upshift is associated with Calopterygidae + Polythoridae. When ghost ranges are inferred from the fossil record, many families are shown to have much earlier origination dates. In a phylogenetic context, the number of family-level lineages is shown to be up to twice as high as the fossil record alone suggests through the Cretaceous and Cenozoic, and a logistic increase in richness is detected in contrast to an exponential increase indicated by fossils alone. Conclusions: Our analysis supports the notion that taxa which appear to have diversified exponentially using fossil data may in fact have diversified more logistically. This in turn suggests that one of the major apparent differences between the marine and terrestrial fossil record may simply be an artifact of incomplete sampling.
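
    The ghost-range idea used above rests on one rule: sister lineages originate at the same time, so a lineage must be at least as old as the oldest fossil of its sister. The sketch below illustrates that rule for terminal taxa only; it is an assumption-laden toy, not the authors' method. The tree encoding (nested tuples), the function name `ghost_ranges`, the taxon names and the ages are all hypothetical.

```python
def ghost_ranges(tree, first_appearance):
    """Infer the ghost range of each terminal taxon.

    tree             : nested tuples of taxon names, e.g. (("a", "b"), "c")
    first_appearance : oldest fossil age of each taxon in Ma (larger = older)

    A node is taken to be as old as the oldest fossil among its descendants.
    A terminal's ghost range is the gap between its parent node's age and
    its own first appearance. Ghost lineages of internal clades are not
    reported in this simplified version.
    """
    ghosts = {}

    def node_age(node):
        if isinstance(node, str):
            return first_appearance[node]
        child_ages = [node_age(child) for child in node]
        age = max(child_ages)
        for child, child_age in zip(node, child_ages):
            if isinstance(child, str):
                ghosts[child] = age - child_age
        return age

    node_age(tree)
    return ghosts


# Hypothetical taxa and ages, for illustration only.
toy_tree = (("taxon_a", "taxon_b"), "taxon_c")
toy_ages = {"taxon_a": 100.0, "taxon_b": 150.0, "taxon_c": 200.0}
toy_ghosts = ghost_ranges(toy_tree, toy_ages)
```

    In the toy input, `taxon_a` first appears 50 Myr after its sister `taxon_b`, so it receives a 50 Myr ghost range; adding such ranges to observed ranges is what raises lineage counts relative to fossils alone.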

  17. Phylogenetic Reconstruction as a Broadly Applicable Teaching Tool in the Biology Classroom: The Value of Data in Estimating Likely Answers

    Science.gov (United States)

    Julius, Matthew L.; Schoenfuss, Heiko L.

    2006-01-01

    This laboratory exercise introduces students to a fundamental tool in evolutionary biology--phylogenetic inference. Students are required to create a data set via observation and through mining preexisting data sets. These student data sets are then used to develop and compare competing hypotheses of vertebrate phylogeny. The exercise uses readily…

  18. Revealing pancrustacean relationships: phylogenetic analysis of ribosomal protein genes places Collembola (springtails) in a monophyletic Hexapoda and reinforces the discrepancy between mitochondrial and nuclear DNA markers.

    Science.gov (United States)

    Timmermans, M J T N; Roelofs, D; Mariën, J; van Straalen, N M

    2008-03-12

    In recent years, several new hypotheses on phylogenetic relations among arthropods have been proposed on the basis of DNA sequences. One of the challenged hypotheses is the monophyly of hexapods. This discussion originated from analyses based on mitochondrial DNA datasets that, due to an unusual positioning of Collembola, suggested that the hexapod body plan evolved at least twice. Here, we re-evaluate the position of Collembola using ribosomal protein gene sequences. In total 48 ribosomal proteins were obtained for the collembolan Folsomia candida. These 48 sequences were aligned with sequence data on 35 other ecdysozoans. Each ribosomal protein gene was available for 25% to 86% of the taxa. However, the total sequence information was unequally distributed over the taxa and ranged between 4% and 100%. A concatenated dataset was constructed (5034 inferred amino acids in length), of which ~66% of the positions were filled. Phylogenetic tree reconstructions, using Maximum Likelihood, Maximum Parsimony, and Bayesian methods, resulted in a topology that supports monophyly of Hexapoda. Although ribosomal proteins in general may not evolve independently, they once more appear highly valuable for phylogenetic reconstruction. Our analyses clearly suggest that Hexapoda is monophyletic. This underpins the inconsistency between nuclear and mitochondrial datasets when analyzing pancrustacean relationships. Caution is needed when applying mitochondrial markers in deep phylogeny.

  19. Mistaking geography for biology: inferring processes from species distributions.

    Science.gov (United States)

    Warren, Dan L; Cardillo, Marcel; Rosauer, Dan F; Bolnick, Daniel I

    2014-10-01

    Over the past few decades, there has been a rapid proliferation of statistical methods that infer evolutionary and ecological processes from data on species distributions. These methods have led to considerable new insights, but they often fail to account for the effects of historical biogeography on present-day species distributions. Because the geography of speciation can lead to patterns of spatial and temporal autocorrelation in the distributions of species within a clade, this can result in misleading inferences about the importance of deterministic processes in generating spatial patterns of biodiversity. In this opinion article, we discuss ways in which patterns of species distributions driven by historical biogeography are often interpreted as evidence of particular evolutionary or ecological processes. We focus on three areas that are especially prone to such misinterpretations: community phylogenetics, environmental niche modelling, and analyses of beta diversity (compositional turnover of biodiversity). Crown Copyright © 2014. Published by Elsevier Ltd. All rights reserved.

  20. Principal component analysis and the locus of the Fréchet mean in the space of phylogenetic trees.

    Science.gov (United States)

    Nye, Tom M W; Tang, Xiaoxian; Weyenberg, Grady; Yoshida, Ruriko

    2017-12-01

    Evolutionary relationships are represented by phylogenetic trees, and a phylogenetic analysis of gene sequences typically produces a collection of these trees, one for each gene in the analysis. Analysis of samples of trees is difficult due to the multi-dimensionality of the space of possible trees. In Euclidean spaces, principal component analysis is a popular method of reducing high-dimensional data to a low-dimensional representation that preserves much of the sample's structure. However, the space of all phylogenetic trees on a fixed set of species does not form a Euclidean vector space, and methods adapted to tree space are needed. Previous work introduced the notion of a principal geodesic in this space, analogous to the first principal component. Here we propose a geometric object for tree space similar to the kth principal component in Euclidean space: the locus of the weighted Fréchet mean of k + 1 vertex trees when the weights vary over the k-simplex. We establish some basic properties of these objects, in particular showing that they have dimension k, and propose algorithms for projection onto these surfaces and for finding the principal locus associated with a sample of trees. Simulation studies demonstrate that these algorithms perform well, and analyses of two datasets, containing Apicomplexa and African coelacanth genomes respectively, reveal important structure from the second principal components.
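
    For orientation, the weighted Fréchet mean and its locus can be written out as below. The notation is assumed for illustration and is not taken from the abstract: k is the index of the component, v_0, ..., v_k are the k + 1 vertex trees, d is the geodesic distance on tree space T, w is a weight vector on the k-simplex, and Pi denotes the locus.

```latex
\mu(w) \;=\; \operatorname*{arg\,min}_{x \in \mathcal{T}} \; \sum_{i=0}^{k} w_i \, d(x, v_i)^2,
\qquad
w \in \Delta^{k} = \Bigl\{ w \in \mathbb{R}^{k+1} : w_i \ge 0,\ \sum_{i=0}^{k} w_i = 1 \Bigr\},
\qquad
\Pi(v_0, \dots, v_k) \;=\; \bigl\{\, \mu(w) : w \in \Delta^{k} \,\bigr\}.
```

    In a Euclidean space the same construction gives the convex hull of the vertices, which is why the locus serves as a tree-space analogue of a principal component.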

  1. Phylogenetic position of the North American isolate of Pasteuria that parasitizes the soybean cyst nematode, Heterodera glycines, as inferred from 16S rDNA sequence analysis.

    Science.gov (United States)

    Atibalentja, N; Noel, G R; Domier, L L

    2000-03-01

    A 1341 bp sequence of the 16S rDNA of an undescribed species of Pasteuria that parasitizes the soybean cyst nematode, Heterodera glycines, was determined and then compared with a homologous sequence of Pasteuria ramosa, a parasite of cladoceran water fleas of the family Daphnidae. The two Pasteuria sequences, which diverged from each other by a dissimilarity index of 7%, also were compared with the 16S rDNA sequences of 30 other bacterial species to determine the phylogenetic position of the genus Pasteuria among the Gram-positive eubacteria. Phylogenetic analyses using maximum-likelihood, maximum-parsimony and neighbour-joining methods showed that the Heterodera glycines-infecting Pasteuria and its sister species, P. ramosa, form a distinct line of descent within the Alicyclobacillus group of the Bacillaceae. These results are consistent with the view that the genus Pasteuria is a deeply rooted member of the Clostridium-Bacillus-Streptococcus branch of the Gram-positive eubacteria, neither related to the actinomycetes nor closely related to true endospore-forming bacteria.
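
    A pairwise dissimilarity such as the 7% quoted above is often computed as the proportion of aligned sites that differ (the uncorrected p-distance). The abstract does not say how its dissimilarity index was calculated, so the sketch below is only an assumed reading; the function name `p_distance` and the example sequences are hypothetical.

```python
def p_distance(seq_a: str, seq_b: str, skip: str = "-?N") -> float:
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap or ambiguity character
    (any character listed in `skip`) are excluded from the comparison.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x in skip or y in skip:
            continue
        compared += 1
        if x != y:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return mismatches / compared


# Toy alignment: one difference in ten comparable sites.
toy = p_distance("ACGTACGTAC", "ACGTACGTAT")
```

    On this definition, a 7% dissimilarity over a 1341 bp alignment would correspond to roughly 94 differing sites, if all sites were comparable.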

  2. Genetic variation and phylogenetic relationships of the ectomycorrhizal Floccularia luteovirens on the Qinghai-Tibet Plateau.

    Science.gov (United States)

    Xing, Rui; Gao, Qing-Bo; Zhang, Fa-Qi; Fu, Peng-Cheng; Wang, Jiu-Li; Yan, Hui-Ying; Chen, Shi-Long

    2017-08-01

    Floccularia luteovirens, an ectomycorrhizal fungus, is widely distributed on the Qinghai-Tibet Plateau. As an edible fungus, it is famous for its unique flavor. Previous studies have mainly focused on the chemical composition and genetic structure of this species. However, the phylogenetic relationship between genotypes remains unknown. In this study, the genetic variation and phylogenetic relationship between the genotypes of F. luteovirens on the Qinghai-Tibet Plateau were estimated through the analysis of two protein-coding genes (rpb1 and ef-1α) from 398 individuals collected from 24 wild populations. The sample covered the entire range of this species during all the growth seasons from 2011 to 2015. Thirteen genotypes were detected and moderate genetic diversity was revealed. Based on the results of network analysis and the maximum likelihood (ML), maximum parsimony (MP), and Bayesian inference (BI) analyses, the genotypes H-1, H-4, H-6, H-8, H-10, and H-11 were grouped into one clade. Additionally, relatively higher genotype diversity (average h value of 0.722) and unique genotypes were found in the northeast edge of the Qinghai-Tibet Plateau; combined with the results of mismatch analysis and neutrality tests, this indicated that the southeast Qinghai-Tibet Plateau was a refuge for F. luteovirens during historical geological or climatic events (uplifting of the Qinghai-Tibet Plateau or the Last Glacial Maximum). Furthermore, the present distribution of the species on the Qinghai-Tibet Plateau has resulted from recent population expansion. Our findings provide a foundation for the future study of the evolutionary history and the speciation of this species.
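
    The h value reported above is a genotype (haplotype) diversity. A minimal sketch follows, assuming the widely used unbiased estimator h = n/(n - 1) * (1 - sum of squared genotype frequencies); the abstract does not state which estimator was applied, and the function name `haplotype_diversity` and the sample are hypothetical.

```python
from collections import Counter


def haplotype_diversity(genotypes) -> float:
    """Unbiased haplotype diversity of one population sample.

    genotypes: one genotype label per sampled individual.
    h = n / (n - 1) * (1 - sum(p_i ** 2)), where p_i is the sample
    frequency of genotype i. Assumed estimator, see the lead-in.
    """
    n = len(genotypes)
    if n < 2:
        raise ValueError("need at least two individuals")
    counts = Counter(genotypes)
    sum_sq = sum((count / n) ** 2 for count in counts.values())
    return n / (n - 1) * (1 - sum_sq)


# Hypothetical population: four individuals, two genotypes in equal numbers.
toy_h = haplotype_diversity(["H-1", "H-1", "H-4", "H-4"])
```

    The value is 0 when every individual carries the same genotype and approaches 1 when nearly every individual carries a different one.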

  3. Molecular Characterization and Comparative Phylogenetic Analysis of Phytases from Fungi with Their Prospective Applications

    Directory of Open Access Journals (Sweden)

    Sharad Tiwari

    2013-01-01

    Full Text Available Plant seeds that have high phytate content are used as animal feed. Phytases, enzymes that catalyze the breakdown of phytate into inorganic phosphorus and myoinositol phosphate derivatives, have been intensively studied in recent years and gained immense attention because of their application in reducing phytate content in animal feed and food for human consumption, thus indirectly lowering environmental pollution caused by undigested phytate. This review summarises current knowledge on recent developments in fungal and yeast phytases. A comparative account of the diverse sources, physiological roles, molecular characteristics and regulation mechanisms of phytases is given. The phylogenetic relationship of phytases from different classes of fungi is studied in detail. It is inferred on the basis of phylogeny that phytases from Ascomycetes and Basidiomycetes differ in their amino acid sequences and therefore fall in separate clades in the tree. The prospective biotechnological applications of microbial phytases, such as animal feed additives, probiotics and pharmaceuticals, as well as in aquaculture, the food industry, paper manufacturing, and the development of transgenic plants and animals, with special reference to their use as biofertilizers, are also emphasised in this review.

  4. Phylogenetic diversity in the core group of Peziza inferred from ITS sequences and morphology

    DEFF Research Database (Denmark)

    Hansen, K.; Læssøe, Thomas; Pfister, D.H.

    2002-01-01

    Species delimitation within the core group of Peziza is highly controversial. The group, typified by P. vesiculosa, is morphologically coherent and in previous analyses of LSU rDNA sequences it formed a highly supported clade. Phylogenetic diversity and species limits were investigated within......), shallowly cup- to disc-shaped apothecia (A) and large (up to 15 cm), deeply cup-shaped to expanded apothecia (B). The overall exciple structure (a stratified or non-stratified medullary layer) and to some degree spore surface relief, likewise support the groupings. Clade A contains taxa with smooth...... that populations on a diverse array of substrates may be closely related, or indeed, conspecific....

  5. BEASTling: A software tool for linguistic phylogenetics using BEAST 2

    Science.gov (United States)

    Forkel, Robert; Kaiping, Gereon A.; Atkinson, Quentin D.

    2017-01-01

    We present a new open source software tool called BEASTling, designed to simplify the preparation of Bayesian phylogenetic analyses of linguistic data using the BEAST 2 platform. BEASTling transforms comparatively short and human-readable configuration files into the XML files used by BEAST to specify analyses. By taking advantage of Creative Commons-licensed data from the Glottolog language catalog, BEASTling allows the user to conveniently filter datasets using names for recognised language families, to impose monophyly constraints so that inferred language trees are backward compatible with Glottolog classifications, or to assign geographic location data to languages for phylogeographic analyses. Support for the emerging cross-linguistic linked data format (CLDF) permits easy incorporation of data published in cross-linguistic linked databases into analyses. BEASTling is intended to make the power of Bayesian analysis more accessible to historical linguists without strong programming backgrounds, in the hopes of encouraging communication and collaboration between those developing computational models of language evolution (who are typically not linguists) and relevant domain experts. PMID:28796784

  6. BEASTling: A software tool for linguistic phylogenetics using BEAST 2.

    Directory of Open Access Journals (Sweden)

    Luke Maurits

    Full Text Available We present a new open source software tool called BEASTling, designed to simplify the preparation of Bayesian phylogenetic analyses of linguistic data using the BEAST 2 platform. BEASTling transforms comparatively short and human-readable configuration files into the XML files used by BEAST to specify analyses. By taking advantage of Creative Commons-licensed data from the Glottolog language catalog, BEASTling allows the user to conveniently filter datasets using names for recognised language families, to impose monophyly constraints so that inferred language trees are backward compatible with Glottolog classifications, or to assign geographic location data to languages for phylogeographic analyses. Support for the emerging cross-linguistic linked data format (CLDF) permits easy incorporation of data published in cross-linguistic linked databases into analyses. BEASTling is intended to make the power of Bayesian analysis more accessible to historical linguists without strong programming backgrounds, in the hopes of encouraging communication and collaboration between those developing computational models of language evolution (who are typically not linguists) and relevant domain experts.

  7. Genetic characterization and phylogenetic analysis of porcine circovirus type 2 (PCV2) in Serbia.

    Science.gov (United States)

    Savic, Bozidar; Milicevic, Vesna; Jakic-Dimic, Dobrila; Bojkovski, Jovan; Prodanovic, Radisa; Kureljusic, Branislav; Potkonjak, Aleksandar; Savic, Borivoje

    2012-01-01

    Porcine circovirus type 2 (PCV2) is the main causative agent of postweaning multisystemic wasting syndrome (PMWS). To characterize and determine the genetic diversity of PCV2 in the porcine population of Serbia, nucleotide and deduced amino acid sequences of the open reading frame 2 (ORF2) of PCV2 collected from the tissues of pigs that either had died as a result of PMWS or did not exhibit disease symptoms were analyzed. Sequencing and phylogenetic analysis showed considerable diversity among PCV2 ORF2 sequences and the existence of two main PCV2 genotypes, PCV2b and PCV2a, with at least three clusters, 1A/B, 1C and 2D. In order to provide further proof that the 1C strain is circulating in the porcine population, the whole viral genome of one PCV2 isolate was sequenced. Genotyping and phylogenetic analysis using the entire viral genome sequences confirmed that there was a PMWS-associated 1C strain emerging in Serbia. Our analysis also showed that PCV2b is dominant in the porcine population, and that it is exclusively associated with PMWS occurrences in the country. These data constitute a useful basis for further epidemiological studies regarding the heterogeneity of PCV2 strains on the European continent.

  8. Phylogeny of the Celastraceae inferred from phytochrome B gene sequence and morphology.

    Science.gov (United States)

    Simmons, M P; Clevinger, C C; Savolainen, V; Archer, R H; Mathews, S; Doyle, J J

    2001-02-01

    Phylogenetic relationships within Celastraceae were inferred using a simultaneous analysis of 61 morphological characters and 1123 base pairs of phytochrome B exon 1 from the nuclear genome. No gaps were inferred, and the gene tree topology suggests that the primers were specific to a single locus that did not duplicate among the lineages sampled. This region of phytochrome B was most useful for examining relationships among closely related genera. Fifty-one species from 38 genera of Celastraceae were sampled. The Celastraceae sensu lato (including Hippocrateaceae) were resolved as a monophyletic group. Loesener's subfamilies and tribes of Celastraceae were not supported. The Hippocrateaceae were resolved as a monophyletic group nested within a paraphyletic Celastraceae sensu stricto. Goupia was resolved as more closely related to Euphorbiaceae, Corynocarpaceae, and Linaceae than to Celastraceae. Plagiopteron (Flacourtiaceae) was resolved as the sister group of Hippocrateoideae. Brexia (Brexiaceae) was resolved as closely related to Elaeodendron and Pleurostylia. Canotia was resolved as the sister group of Acanthothamnus within Celastraceae. Perrottetia and Mortonia were resolved as the sister group of the rest of the Celastraceae. Siphonodon was resolved as a derived member of Celastraceae. Maytenus was resolved as three disparate groups, suggesting that this large genus needs to be recircumscribed.

  9. [Phylogenetic analysis of genomes of Vibrio cholerae strains isolated on the territory of Rostov region].

    Science.gov (United States)

    Kuleshov, K V; Markelov, M L; Dedkov, V G; Vodop'ianov, A S; Kermanov, A V; Pisanov, R V; Kruglikov, V D; Mazrukho, A B; Maleev, V V; Shipulin, G A

    2013-01-01

    The aim was to determine the origin of 2 Vibrio cholerae strains isolated on the territory of Rostov region by using full genome sequencing data. The toxigenic strain 2011EL-301 V. cholerae O1 El Tor Inaba No. 301 (ctxAB+, tcpA+) and the nontoxigenic strain V. cholerae O1 Ogawa P-18785 (ctxAB-, tcpA+) were studied. Sequencing was carried out on the MiSeq platform. Phylogenetic analysis of the genomes obtained was carried out based on comparison of the conserved part of the studied genomes and 54 previously sequenced genomes. The genome of strain 2011EL-301 was represented by 164 contigs with an average coverage of 100 and an N50 of 132 kb; that of strain P-18785 was represented by 159 contigs with a coverage of 69 and an N50 of 83 kb. The contigs obtained for strain 2011EL-301 were deposited in the DDBJ/EMBL/GenBank databases under accession number AJFN02000000, and those for strain P-18785 under ANHS00000000. A total of 716 protein-coding orthologous genes were detected. Based on phylogenetic analysis, strain P-18785 belongs to the PG-1 subgroup (a group of predecessor strains of the 7th pandemic). Strain 2011EL-301 belongs to the group of strains of the 7th pandemic and is included in the cluster with later isolates that are associated with cases of cholera in South Africa and cases of import of cholera to the USA from Pakistan. The data obtained make it possible to establish phylogenetic connections with V. cholerae strains isolated earlier.
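
    The N50 values quoted for the two assemblies summarise contig length: N50 is the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembly. A minimal sketch of that standard definition follows; the function name `n50` and the contig lengths are hypothetical, not the lengths from these assemblies.

```python
def n50(contig_lengths) -> int:
    """Return the N50 of an assembly from its contig lengths.

    Contigs are taken from longest to shortest until their cumulative
    length reaches half of the total; the length of the contig that
    crosses the halfway point is the N50.
    """
    ordered = sorted(contig_lengths, reverse=True)
    if not ordered:
        raise ValueError("no contigs given")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise AssertionError("unreachable: cumulative sum always reaches half")


# Hypothetical contig lengths in kb.
toy_n50 = n50([80, 70, 50, 40, 30, 20])
```

    Here the total is 290 kb, and the two longest contigs (80 + 70 = 150 kb) already pass the 145 kb halfway mark, so the N50 is 70 kb.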

  10. Molecular cloning, phylogenetic analysis and heat shock response of Babesia gibsoni heat shock protein 90.

    Science.gov (United States)

    Yamasaki, Masahiro; Tsuboi, Yoshihiro; Taniyama, Yusuke; Uchida, Naohiro; Sato, Reeko; Nakamura, Kensuke; Ohta, Hiroshi; Takiguchi, Mitsuyoshi

    2016-09-01

    The Babesia gibsoni heat shock protein 90 (BgHSP90) gene was cloned and sequenced. The gene was 2,610 bp long and contained two introns. It was amplified from cDNA corresponding to the full-length coding sequence (CDS), with an open reading frame of 2,148 bp. A phylogenetic analysis of the CDS of the HSP90 gene showed that B. gibsoni was most closely related to B. bovis and Babesia sp. BQ1/Lintan and lay within a phylogenetic cluster of protozoa. Moreover, the mRNA transcription profile of BgHSP90 in parasites exposed to high temperature was examined by quantitative real-time reverse transcription-polymerase chain reaction. BgHSP90 levels were elevated when the parasites were incubated at 43°C for 1 hr.

  11. Subgrouping Automata: automatic sequence subgrouping using phylogenetic tree-based optimum subgrouping algorithm.

    Science.gov (United States)

    Seo, Joo-Hyun; Park, Jihyang; Kim, Eun-Mi; Kim, Juhan; Joo, Keehyoung; Lee, Jooyoung; Kim, Byung-Gee

    2014-02-01

    Sequence subgrouping for a given sequence set can enable various informative tasks such as the functional discrimination of sequence subsets and the functional inference of unknown sequences. Because an identity threshold for sequence subgrouping may vary according to the given sequence set, it is highly desirable to construct a robust subgrouping algorithm which automatically identifies an optimal identity threshold and generates subgroups for a given sequence set. To this end, an automatic sequence subgrouping method named 'Subgrouping Automata' was constructed. First, the tree analysis module analyzes the structure of the tree and calculates all possible subgroups at each node. The sequence similarity analysis module calculates the average sequence similarity for all subgroups at each node. The representative sequence generation module finds a representative sequence for each subgroup using profile analysis and self-scoring. Average sequence similarities are calculated for all nodes, and 'Subgrouping Automata' searches for the node showing the statistically largest increase in sequence similarity using Student's t-value. The node with the maximum t-value, which gives the most significant difference in average sequence similarity between two adjacent nodes, is taken as the optimum subgrouping node in the phylogenetic tree. Further analysis showed that the optimum subgrouping node from SA prevents under-subgrouping and over-subgrouping. Copyright © 2013. Published by Elsevier Ltd.
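
    The node-selection step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the input layout (a root-to-leaf list of nodes, each holding the average similarities of its subgroups), the use of a pooled two-sample Student's t statistic, and the names student_t and optimum_node are all assumptions made for illustration.

```python
from math import sqrt
from statistics import mean, variance


def student_t(a, b):
    """Pooled two-sample Student's t statistic for mean(a) - mean(b).

    Assumes each sample has at least two values and nonzero pooled variance.
    """
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))


def optimum_node(similarities_by_node):
    """Return (index, t) of the node with the largest t-value vs. its predecessor.

    similarities_by_node is a hypothetical layout: nodes ordered from the root
    toward the leaves, each a list of average subgroup similarities.
    """
    best_index, best_t = None, float("-inf")
    for i in range(1, len(similarities_by_node)):
        t = student_t(similarities_by_node[i], similarities_by_node[i - 1])
        if t > best_t:
            best_index, best_t = i, t
    return best_index, best_t


# Toy data (invented): the similarity jump is largest between nodes 1 and 2.
nodes = [
    [0.30, 0.32, 0.31],
    [0.35, 0.33, 0.36],
    [0.70, 0.72, 0.69],
    [0.74, 0.71, 0.73],
]
index, t_value = optimum_node(nodes)
print(index, round(t_value, 1))
```

    On this toy input the third node (index 2) is selected, because the increase in average similarity relative to the preceding node is largest there in units of its standard error.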

  12. The combination of phylogenetic analysis with epidemiological and serological data to track HIV-1 transmission in a sexual transmission case.

    Directory of Open Access Journals (Sweden)

    Min Chen

    Full Text Available To investigate the linkage of HIV transmission from a man to a woman through unprotected sexual contact without disclosing his HIV-positive status. Combined with epidemiological information and serological tests, phylogenetic analysis was used to test the a priori hypothesis of HIV transmission from the man to the woman. Control subjects from the same location, infected with HIV through heterosexual intercourse, were also sampled. Phylogenetic analyses were performed using the consensus gag, pol and env sequences obtained from blood samples of the man, the woman and the local control subjects. The env quasispecies of the man, the woman, and two controls were also obtained using single genome amplification and sequencing (SGA/S) to explore the paraphyletic relationship by phylogenetic analysis. Epidemiological information and serological tests indicated that the man was infected with HIV-1 earlier than the woman. Phylogenetic analyses of the consensus sequences showed a monophyletic cluster for the man and woman in all three genomic regions. Furthermore, gag sequences of the man and woman shared a unique recombination pattern from subtypes B and C, which was different from those of CRF07_BC or CRF08_BC observed in the local samples. These findings indicated that the viral sequences from the two subjects display a high level of similarity. Further, viral quasispecies from the man exhibited a paraphyletic relationship with those from the woman in the Bayesian and maximum-likelihood (ML) phylogenetic trees of the env region, which supported the transmission direction from the man to the woman. In the context of epidemiological and serological evidence, the results of phylogenetic analyses support the transmission from the man to the woman.

  13. Detection and phylogenetic analysis of infectious pancreatic necrosis virus in Chile.

    Science.gov (United States)

    Tapia, D; Eissler, Y; Torres, P; Jorquera, E; Espinoza, J C; Kuznar, J

    2015-10-27

    Infectious pancreatic necrosis virus (IPNV) is the etiological agent of a highly contagious disease that is endemic to salmon farming in Chile and causes great economic losses to the industry. Here we compared different diagnostic methods to detect IPNV in field samples, including 3 real-time reverse transcription PCR (qRT-PCR) assays, cell culture isolation, and indirect fluorescent antibody test (IFAT). Additionally, we performed a phylogenetic analysis to investigate the genogroups prevailing in Chile, as well as their geographic distribution and virulence. The 3 qRT-PCR assays used primers that targeted regions of the VP2 and VP1 genes of the virus and were tested in 46 samples, presenting a fair agreement within their results. All samples were positive for at least 2 of the qRT-PCR assays, 29 were positive for cell culture, and 23 for IFAT, showing less sensitivity for these latter 2 methods. For the phylogenetic analysis, portions of 1180 and 523 bp of the VP2 region of segment A were amplified by RT-PCR, sequenced and compared with sequences from reference strains and from isolates reported by previous studies carried out in Chile. Most of the sequenced isolates belonged to genogroup 5 (European origin), and 5 were classified within genogroup 1 (American origin). Chilean isolates formed clusters within each of the genogroups found, evidencing a clear differentiation from the reference strains. To our knowledge, this is the most extensive study completed for IPNV in Chile, covering isolates from sea- and freshwater salmon farms and showing a high prevalence of this virus in the country.

  14. A Phylogenetic and Phenotypic Analysis of Salmonella enterica Serovar Weltevreden, an Emerging Agent of Diarrheal Disease in Tropical Regions.

    Directory of Open Access Journals (Sweden)

    Carine Makendi

    2016-02-01

    Full Text Available Salmonella enterica serovar Weltevreden (S. Weltevreden) is an emerging cause of diarrheal and invasive disease in humans residing in tropical regions. Despite the regional and international emergence of this Salmonella serovar, relatively little is known about its genetic diversity, genomics or virulence potential in model systems. Here we used whole genome sequencing and bioinformatics analyses to define the phylogenetic structure of a diverse global selection of S. Weltevreden. Phylogenetic analysis of more than 100 isolates demonstrated that the population of S. Weltevreden can be segregated into two main phylogenetic clusters, one associated predominantly with continental Southeast Asia and the other more internationally dispersed. Subcluster analysis suggested the local evolution of S. Weltevreden within specific geographical regions. Four of the isolates were sequenced using long read sequencing to produce high quality reference genomes. Phenotypic analysis in Hep-2 cells and in a murine infection model indicated that S. Weltevreden were significantly attenuated in these models compared to the classical S. Typhimurium reference strain SL1344. Our work outlines novel insights into this important emerging pathogen and provides a baseline understanding for future research studies.

  15. Phylogenetic analysis of Sarcocystis nesbitti (Coccidia: Sarcocystidae) suggests a snake as its probable definitive host

    Science.gov (United States)

    Sarcocystis nesbitti was first described by Mandour in 1969 from rhesus monkey muscle. Its definitive host remains unknown. The 18S rRNA gene of Sarcocystis nesbitti was amplified, sequenced, and subjected to phylogenetic analysis. Among those congeners available for comparison, it shares closest affinit...

  16. Inferring Group Processes from Computer-Mediated Affective Text Analysis

    Energy Technology Data Exchange (ETDEWEB)

    Schryver, Jack C [ORNL; Begoli, Edmon [ORNL; Jose, Ajith [Missouri University of Science and Technology; Griffin, Christopher [Pennsylvania State University

    2011-02-01

    Political communications in the form of unstructured text convey rich connotative meaning that can reveal underlying group social processes. Previous research has focused on sentiment analysis at the document level, but we extend this analysis to sub-document levels through a detailed analysis of affective relationships between entities extracted from a document. Instead of pure sentiment analysis, which is just positive or negative, we explore nuances of affective meaning in 22 affect categories. Our affect propagation algorithm automatically calculates and displays extracted affective relationships among entities in graphical form in our prototype (TEAMSTER), starting with seed lists of affect terms. Several useful metrics are defined to infer underlying group processes by aggregating affective relationships discovered in a text. Our approach has been validated with annotated documents from the MPQA corpus, achieving a performance gain of 74% over comparable random guessers.

  17. Molecular phylogenetic trees - On the validity of the Goodman-Moore augmentation algorithm

    Science.gov (United States)

    Holmquist, R.

    1979-01-01

    A response is made to the reply of Nei and Tateno (1979) to the letter of Holmquist (1978) supporting the validity of the augmentation algorithm of Moore (1977) in reconstructions of nucleotide substitutions by means of the maximum parsimony principle. It is argued that the overestimation of the augmented numbers of nucleotide substitutions (augmented distances) found by Tateno and Nei (1978) is due to an unrepresentative data sample and that it is only necessary that evolution be stochastically uniform in different regions of the phylogenetic network for the augmentation method to be useful. The importance of the average value of the true distance over all links is explained, and the relative variances of the true and augmented distances are calculated to be almost identical. The effects of topological changes in the phylogenetic tree on the augmented distance and the question of the correctness of ancestral sequences inferred by the method of parsimony are also clarified.

  18. BIMLR: a method for constructing rooted phylogenetic networks from rooted phylogenetic trees.

    Science.gov (United States)

    Wang, Juan; Guo, Maozu; Xing, Linlin; Che, Kai; Liu, Xiaoyan; Wang, Chunyu

    2013-09-15

    Rooted phylogenetic trees constructed from different datasets (e.g. from different genes) are often conflicting with one another, i.e. they cannot be integrated into a single phylogenetic tree. Phylogenetic networks have become an important tool in molecular evolution, and rooted phylogenetic networks are able to represent conflicting rooted phylogenetic trees. Hence, the development of appropriate methods to compute rooted phylogenetic networks from rooted phylogenetic trees has attracted considerable research interest of late. The CASS algorithm proposed by van Iersel et al. is able to construct much simpler networks than other available methods, but it is extremely slow, and the networks it constructs are dependent on the order of the input data. Here, we introduce an improved CASS algorithm, BIMLR. We show that BIMLR is faster than CASS and less dependent on the input data order. Moreover, BIMLR is able to construct much simpler networks than almost all other methods. BIMLR is available at http://nclab.hit.edu.cn/wangjuan/BIMLR/. © 2013 Elsevier B.V. All rights reserved.

  19. The Development of Three Long Universal Nuclear Protein-Coding Locus Markers and Their Application to Osteichthyan Phylogenetics with Nested PCR

    Science.gov (United States)

    Zhang, Peng

    2012-01-01

    Background Universal nuclear protein-coding locus (NPCL) markers that are applicable across diverse taxa and show good phylogenetic discrimination have broad applications in molecular phylogenetic studies. For example, RAG1, a representative NPCL marker, has been successfully used to make phylogenetic inferences within all major osteichthyan groups. However, such markers with broad working range and high phylogenetic performance are still scarce. It is necessary to develop more universal NPCL markers comparable to RAG1 for osteichthyan phylogenetics. Methodology/Principal Findings We developed three long universal NPCL markers (>1.6 kb each) based on single-copy nuclear genes (KIAA1239, SACS and TTN) that possess large exons and exhibit the appropriate evolutionary rates. We then compared their phylogenetic utilities with that of the reference marker RAG1 in 47 jawed vertebrate species. In comparison with RAG1, each of the three long universal markers yielded similar topologies and branch supports, all in congruence with the currently accepted osteichthyan phylogeny. To compare their phylogenetic performance visually, we also estimated the phylogenetic informativeness (PI) profile for each of the four long universal NPCL markers. The PI curves indicated that SACS performed best over the whole timescale, while RAG1, KIAA1239 and TTN exhibited similar phylogenetic performances. In addition, we compared the success of nested PCR and standard PCR when amplifying NPCL marker fragments. The amplification success rate and efficiency of the nested PCR were overwhelmingly higher than those of standard PCR. Conclusions/Significance Our work clearly demonstrates the superiority of nested PCR over the conventional PCR in phylogenetic studies and develops three long universal NPCL markers (KIAA1239, SACS and TTN) with the nested PCR strategy. The three markers exhibit high phylogenetic utilities in osteichthyan phylogenetics and can be widely used as pilot genes for

  20. Phylogenetic relationships of typical antbirds (Thamnophilidae) and test of incongruence based on Bayes factors

    Directory of Open Access Journals (Sweden)

    Nylander Johan AA

    2004-07-01

    Full Text Available Abstract Background The typical antbirds (Thamnophilidae) form a monophyletic and diverse family of suboscine passerines that inhabit neotropical forests. However, the phylogenetic relationships within this assemblage are poorly understood. Herein, we present a hypothesis of the generic relationships of this group based on Bayesian inference analyses of two nuclear introns and the mitochondrial cytochrome b gene. The level of phylogenetic congruence between the individual genes has been investigated utilizing Bayes factors. We also explore how changes in the substitution models affected the observed incongruence between partitions of our data set. Results The phylogenetic analysis supports both novel relationships and traditional groupings. Among the more interesting novel relationships suggested is that the Terenura antwrens, the wing-banded antbird (Myrmornis torquata), the spot-winged antshrike (Pygiptila stellaris) and the russet antshrike (Thamnistes anabatinus) are sisters to all other typical antbirds. The remaining genera fall into two major clades. The first includes antshrikes, antvireos and the Herpsilochmus antwrens, while the second clade consists of most antwren genera, the Myrmeciza antbirds, the "professional" ant-following antbirds, and allied species. Our results also support previously suggested polyphyly of Myrmotherula antwrens and Myrmeciza antbirds. The tests of phylogenetic incongruence, using Bayes factors, suggest that allowing the gene partitions to have separate topology parameters clearly increased the model likelihood. However, changing a component of the nucleotide substitution model had a much higher impact on the model likelihood. Conclusions The phylogenetic results are in broad agreement with traditional classification of the typical antbirds, but some relationships are unexpected based on external morphology. In these cases their true affinities may have been obscured by convergent evolution and

  1. Phylogenetic Analysis of Apple scar skin viroid Isolates in Korea

    Directory of Open Access Journals (Sweden)

    Kang Hee Cho

    2015-12-01

    Full Text Available To identify genome sequences of Apple scar skin viroid (ASSVd) isolates in Korea, a field survey was performed in ‘Hongro’ apple orchards located at eight sites in South Korea (Bongwha, Cheongsong, Dangjin, Gimchoen, Muju, Mungyeong, Suwon, and Yeongwol). ASSVd was detected by RT-PCR, and the PCR fragments were cloned into a cloning vector. Full-length viral genomes of eight ASSVd isolates were sequenced and compared with 21 isolates reported previously from Korea, India, China, Japan and Greece. The eight isolates in this study showed 92.2-99.7% nucleotide sequence identity with those reported previously. Phylogenetic analysis showed that seven isolates reported in this study belong to the same group, distinct from other groups.

  2. Phylogenetic analysis of Tibetan mastiffs based on mitochondrial hypervariable region I.

    Science.gov (United States)

    Ren, Zhanjun; Chen, Huiling; Yang, Xuejiao; Zhang, Chengdong

    2017-03-01

    The number of Tibetan mastiffs, a precious germplasm resource and part of the cultural heritage, has recently been decreasing sharply. Therefore, the genetic diversity of Tibetan mastiffs needs to be studied to clarify their phylogenetic relationships and lay the foundation for resource protection, rational development and utilization of Tibetan mastiffs. We sequenced hypervariable region I of mitochondrial DNA (mtDNA) of 110 individuals from Tibet region and Gansu province. A total of 12 polymorphic sites were identified, which defined eight haplotypes, of which H4 and H8 were unique to the Tibetan population, with H8 being identified for the first time. The haplotype diversity (Hd: 0.808), nucleotide diversity (Pi: 0.603%), and the average number of nucleotide differences (K: 3.917) of Tibetan mastiffs from Gansu were higher than those from Tibet region (Hd: 0.794; Pi: 0.589%; K: 3.831), which revealed higher genetic diversity in Gansu. In terms of the total population, the genetic variation was low. The median-joining network and phylogenetic tree based on the mtDNA hypervariable region I showed that Tibetan mastiffs, like other domestic dogs, originated from grey wolves and had different histories of maternal origin. The mismatch distribution analysis and neutrality tests indicated that Tibetan mastiffs were in genetic equilibrium or in a population decline.
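
    The three diversity statistics reported in this abstract (Hd, Pi and K) have standard textbook definitions, sketched below for illustration. The abstract does not say which software or estimator the authors used, so the formulas (Nei's haplotype diversity with the n/(n-1) correction, and Pi as K divided by the aligned length) are assumptions, and the four toy sequences are invented.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(haplotypes):
    """Nei's haplotype diversity: Hd = n/(n-1) * (1 - sum of squared frequencies)."""
    n = len(haplotypes)
    counts = Counter(haplotypes)
    return n / (n - 1) * (1 - sum((c / n) ** 2 for c in counts.values()))


def mean_pairwise_differences(seqs):
    """K: average number of differing sites over all pairs of aligned sequences."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return total / len(pairs)


def nucleotide_diversity(seqs):
    """Pi: K divided by alignment length (assumes equal-length, gap-free sequences)."""
    return mean_pairwise_differences(seqs) / len(seqs[0])


# Invented toy alignment: three haplotypes among four individuals.
toy = ["ACGT", "ACGT", "ACGA", "TCGA"]
print(round(haplotype_diversity(toy), 4))        # Hd
print(round(mean_pairwise_differences(toy), 4))  # K
print(round(nucleotide_diversity(toy), 4))       # Pi
```

    Under these definitions Pi equals K divided by the number of aligned sites, which is why the two statistics rise and fall together in the Gansu and Tibet comparisons.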

  3. The Complete Mitochondrial Genome of Corizus tetraspilus (Hemiptera: Rhopalidae) and Phylogenetic Analysis of Pentatomomorpha

    Science.gov (United States)

    Guo, Zhong-Long; Wang, Juan; Shen, Yu-Ying

    2015-01-01

    Insect mitochondrial genomes (mitogenomes) are the most extensively used source of genetic information for molecular evolution, phylogenetics and population genetics. Pentatomomorpha (>14,000 species) is the second largest infraorder of Heteroptera and of great economic importance. To better understand the diversity and phylogeny within Pentatomomorpha, we sequenced and annotated the complete mitogenome of Corizus tetraspilus (Hemiptera: Rhopalidae), an important pest of alfalfa in China. We analyzed the main features of the C. tetraspilus mitogenome, and provided a comparative analysis with four other Coreoidea species. Our results reveal that gene content, gene arrangement, nucleotide composition, codon usage, rRNA structures and sequences of mitochondrial transcription termination factor are conserved in Coreoidea. Comparative analysis shows that different protein-coding genes have been subject to different evolutionary rates correlated with the G+C content. All the transfer RNA genes found in Coreoidea have the typical clover leaf secondary structure, except for trnS1 (AGN), which lacks the dihydrouridine (DHU) arm and possesses an unusual anticodon stem (9 bp vs. the normal 5 bp). The control regions (CRs) among Coreoidea are highly variable in size, of which the CR of C. tetraspilus is the smallest (440 bp), making the C. tetraspilus mitogenome the smallest (14,989 bp) among all completely sequenced Coreoidea mitogenomes. No conserved motifs are found in the CRs of Coreoidea. In addition, the A+T content (60.68%) of the CR of C. tetraspilus is much lower than that of the entire mitogenome (74.88%), and is the lowest among Coreoidea. Phylogenetic analyses based on mitogenomic data support the monophyly of each superfamily within Pentatomomorpha, and recognize a phylogenetic relationship of (Aradoidea + (Pentatomoidea + (Lygaeoidea + (Pyrrhocoroidea + Coreoidea)))). PMID:26042898
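
    The A+T content figures quoted in this abstract (60.68% for the control region, 74.88% for the whole mitogenome) are simple base-composition percentages. A minimal sketch of that calculation follows; the function name at_content and the decision to ignore ambiguous bases are illustrative assumptions, and the example sequence is invented rather than taken from the C. tetraspilus mitogenome.

```python
def at_content(sequence):
    """Percentage of A and T among the unambiguous bases (A, C, G, T) of a sequence."""
    seq = sequence.upper()
    counted = [base for base in seq if base in "ACGT"]  # skip N, gaps, etc. (assumption)
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    at = sum(base in "AT" for base in counted)
    return 100.0 * at / len(counted)


# Invented example: 4 of 6 bases are A or T.
print(round(at_content("AATTGC"), 2))
```

    Applying the same function to a control region and to the full genome sequence would give the two percentages being compared.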

  4. pplacer: linear time maximum-likelihood and Bayesian phylogenetic placement of sequences onto a fixed reference tree

    Directory of Open Access Journals (Sweden)

    Kodner Robin B

    2010-10-01

    Full Text Available Abstract Background Likelihood-based phylogenetic inference is generally considered to be the most reliable classification method for unknown sequences. However, traditional likelihood-based phylogenetic methods cannot be applied to large volumes of short reads from next-generation sequencing due to computational complexity issues and lack of phylogenetic signal. "Phylogenetic placement," where a reference tree is fixed and the unknown query sequences are placed onto the tree via a reference alignment, is a way to bring the inferential power offered by likelihood-based approaches to large data sets. Results This paper introduces pplacer, a software package for phylogenetic placement and subsequent visualization. The algorithm can place twenty thousand short reads on a reference tree of one thousand taxa per hour per processor, has essentially linear time and memory complexity in the number of reference taxa, and is easy to run in parallel. Pplacer features calculation of the posterior probability of a placement on an edge, which is a statistically rigorous way of quantifying uncertainty on an edge-by-edge basis. It also can inform the user of the positional uncertainty for query sequences by calculating expected distance between placement locations, which is crucial in the estimation of uncertainty with a well-sampled reference tree. The software provides visualizations using branch thickness and color to represent number of placements and their uncertainty. A simulation study using reads generated from 631 COG alignments shows a high level of accuracy for phylogenetic placement over a wide range of alignment diversity, and the power of edge uncertainty estimates to measure placement confidence. Conclusions Pplacer enables efficient phylogenetic placement and subsequent visualization, making likelihood-based phylogenetics methodology practical for large collections of reads; it is freely available as source code, binaries, and a web service.
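
    To make the two uncertainty measures in this abstract concrete, the sketch below normalizes per-edge log-likelihoods into placement weights and then computes a weight-averaged distance between candidate placement locations. It is a simplified illustration, not pplacer's code: the function names, the plain normalization (rather than pplacer's actual posterior computation) and the precomputed distance matrix are all assumptions.

```python
from math import exp


def placement_weights(log_likelihoods):
    """Normalize per-edge log-likelihoods into weights that sum to 1.

    Subtracting the maximum first keeps exp() from underflowing.
    """
    top = max(log_likelihoods)
    raw = [exp(value - top) for value in log_likelihoods]
    total = sum(raw)
    return [value / total for value in raw]


def expected_distance(weights, distances):
    """Sum over all pairs (i, j) of w_i * w_j * d_ij.

    distances is a hypothetical symmetric matrix of tree distances between
    the candidate placement locations, with zeros on the diagonal.
    """
    n = len(weights)
    return sum(
        weights[i] * weights[j] * distances[i][j]
        for i in range(n)
        for j in range(n)
    )


# Invented example: two equally likely edges that are 2.0 apart on the tree.
weights = placement_weights([-100.0, -100.0])
print(weights)
print(expected_distance(weights, [[0.0, 2.0], [2.0, 0.0]]))
```

    A query whose weight is concentrated on one edge gives an expected distance near zero, while weight spread over distant edges gives a large value, which is the sense in which the measure reports positional uncertainty.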

  5. Phylogenetic relationships of Palaearctic Formica species (Hymenoptera, Formicidae) based on mitochondrial cytochrome B sequences.

    Directory of Open Access Journals (Sweden)

    Anna V Goropashnaya

    Full Text Available Ants of genus Formica demonstrate variation in social organization and represent model species for ecological, behavioral, evolutionary studies and testing theoretical implications of the kin selection theory. Subgeneric division of the Formica ants based on morphology has been questioned and remained unclear after an allozyme study on genetic differentiation between 13 species representing all subgenera was conducted. In the present study, the phylogenetic relationships within the genus were examined using mitochondrial DNA sequences of the cytochrome b and a part of the NADH dehydrogenase subunit 6. All 23 Formica species sampled in the Palaearctic clustered according to the subgeneric affiliation except F. uralensis that formed a separate phylogenetic group. Unlike Coptoformica and Formica s. str., the subgenus Serviformica did not form a tight cluster but more likely consisted of a few small clades. The genetic distances between the subgenera were around 10%, implying approximate divergence time of 5 Myr if we used the conventional insect divergence rate of 2% per Myr. Within-subgenus divergence estimates were 6.69% in Serviformica, 3.61% in Coptoformica, 1.18% in Formica s. str., which supported our previous results on relatively rapid speciation in the latter subgenus. The phylogeny inferred from DNA sequences provides a necessary framework against which the evolution of social traits can be compared. We discuss implications of inferred phylogeny for the evolution of social traits.
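
    The divergence-time estimate in this abstract is a one-step molecular-clock calculation: sequence divergence divided by the assumed rate. The sketch below reproduces that arithmetic. Applying the same rate to the within-subgenus divergences is an extrapolation added here for illustration, not a result stated in the abstract.

```python
def divergence_time_myr(percent_divergence, rate_percent_per_myr=2.0):
    """Divergence time in Myr under a strict clock: divergence / rate.

    The default of 2% per Myr is the conventional insect rate quoted in the abstract.
    """
    return percent_divergence / rate_percent_per_myr


# Between subgenera: about 10% divergence at 2% per Myr.
print(divergence_time_myr(10.0))

# Illustrative extrapolation (not from the abstract) to within-subgenus values.
for name, divergence in [("Serviformica", 6.69), ("Coptoformica", 3.61), ("Formica s. str.", 1.18)]:
    print(name, round(divergence_time_myr(divergence), 3))
```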

  6. Competitive interactions between forest trees are driven by species' trait hierarchy, not phylogenetic or functional similarity: implications for forest community assembly.

    Science.gov (United States)

    Kunstler, Georges; Lavergne, Sébastien; Courbaud, Benoît; Thuiller, Wilfried; Vieilledent, Ghislain; Zimmermann, Niklaus E; Kattge, Jens; Coomes, David A

    2012-08-01

    The relative importance of competition vs. environmental filtering in the assembly of communities is commonly inferred from their functional and phylogenetic structure, on the grounds that similar species compete most strongly for resources and are therefore less likely to coexist locally. This approach ignores the possibility that competitive effects can be determined by relative positions of species on a hierarchy of competitive ability. Using growth data, we estimated 275 interaction coefficients between tree species in the French mountains. We show that interaction strengths are mainly driven by trait hierarchy and not by functional or phylogenetic similarity. On the basis of this result, we thus propose that functional and phylogenetic convergence in local tree community might be due to competition-sorting species with different competitive abilities and not only environmental filtering as commonly assumed. We then show a functional and phylogenetic convergence of forest structure with increasing plot age, which supports this view. © 2012 Blackwell Publishing Ltd/CNRS.

  7. Effect of site-specific heterogeneous evolution on phylogenetic reconstruction: a simple evaluation.

    Science.gov (United States)

    Cheng, Qiqun; Su, Zhixi; Zhong, Yang; Gu, Xun

    2009-07-15

    Recent studies have shown that heterogeneous evolution, long neglected, may mislead phylogenetic analysis. We evaluate the effect of heterogeneous evolution on phylogenetic analysis, using 18 fish mitogenomic coding sequences as an example. Using the software DIVERGE, we identify 198 amino acid sites that have experienced heterogeneous evolution. After removing these sites, the remaining sites are shown to be virtually homogeneous in evolutionary rate. There are some differences between phylogenetic trees built with heterogeneous sites ("before tree") and without heterogeneous sites ("after tree"). Our study demonstrates that for phylogenetic reconstruction, an effective approach is to identify and remove sites with heterogeneous evolution, and suggests that researchers can use the software DIVERGE to remove the influence of heterogeneous evolution before reconstructing phylogenetic trees.

  8. A phylogenetic perspective on the individual species-area relationship in temperate and tropical tree communities.

    Science.gov (United States)

    Yang, Jie; Swenson, Nathan G; Cao, Min; Chuyong, George B; Ewango, Corneille E N; Howe, Robert; Kenfack, David; Thomas, Duncan; Wolf, Amy; Lin, Luxiang

    2013-01-01

    Ecologists have historically used species-area relationships (SARs) as a tool to understand the spatial distribution of species. Recent work has extended SARs to focus on individual-level distributions to generate individual species area relationships (ISARs). The ISAR approach quantifies whether individuals of a species tend to have more or less species richness surrounding them than expected by chance. By identifying richness 'accumulators' and 'repellers', respectively, the ISAR approach has been used to infer the relative importance of abiotic and biotic interactions and neutrality. A clear limitation of the SAR and ISAR approaches is that all species are treated as evolutionarily independent, whereas a large amount of work has now shown that local tree neighborhoods exhibit non-random phylogenetic structure given the species richness. Here, we use nine tropical and temperate forest dynamics plots to ask: (i) do ISARs change predictably across latitude?; (ii) is the phylogenetic diversity in the neighborhood of species accumulators and repellers higher or lower than that expected given the observed species richness?; and (iii) are species accumulators and repellers distributed non-randomly on the community phylogenetic tree? The results indicate no clear trend in ISARs from the temperate zone to the tropics and that the phylogenetic diversity surrounding the individuals of species is generally only non-random on very local scales. Interestingly, the distribution of species accumulators and repellers was non-random on the community phylogenies, suggesting the presence of phylogenetic signal in the ISAR across latitude.

  9. Molecular Identification of Dendrobium Species (Orchidaceae) Based on the DNA Barcode ITS2 Region and Its Application for Phylogenetic Study.

    Science.gov (United States)

    Feng, Shangguo; Jiang, Yan; Wang, Shang; Jiang, Mengying; Chen, Zhe; Ying, Qicai; Wang, Huizhong

    2015-09-11

    The over-collection and habitat destruction of natural Dendrobium populations for their commercial medicinal value has led to these plants being under severe threat of extinction. In addition, many Dendrobium plants are similarly shaped and easily confused when not in flower. In the present study, we examined the application of the ITS2 region in barcoding and phylogenetic analyses of Dendrobium species (Orchidaceae). For barcoding, ITS2 regions of 43 samples in Dendrobium were amplified. In combination with sequences from GenBank, the sequences were aligned using Clustal W and genetic distances were computed using MEGA V5.1. The success rate of PCR amplification and sequencing was 100%. There was a significant divergence between the inter- and intra-specific genetic distances of ITS2 regions, and the presence of a barcoding gap was obvious. Based on the BLAST1, nearest distance and TaxonGAP methods, our results showed that the ITS2 regions could successfully identify the species of most Dendrobium samples examined. Second, we used ITS2 as a DNA marker to infer phylogenetic relationships of 64 Dendrobium species. The results showed that cluster analysis using the ITS2 region mainly supported the relationship between the species of Dendrobium established by traditional morphological methods and many previous molecular analyses. To sum up, the ITS2 region can not only be used as an efficient barcode to identify Dendrobium species, but also has the potential to contribute to the phylogenetic analysis of the genus Dendrobium.
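
    The "barcoding gap" mentioned in this abstract is commonly checked by comparing the largest within-species distance with the smallest between-species distance. The sketch below shows that comparison on invented distances. The authors computed their distances in MEGA, and the abstract does not state the exact gap criterion they applied, so this min/max rule and the function name barcoding_gap are assumptions.

```python
def barcoding_gap(intraspecific, interspecific):
    """Smallest between-species distance minus largest within-species distance.

    A positive value means the two distance distributions do not overlap,
    which is one common working definition of a barcoding gap.
    """
    return min(interspecific) - max(intraspecific)


# Invented pairwise distances, for illustration only.
within = [0.00, 0.01, 0.02]
between = [0.05, 0.08, 0.11]
gap = barcoding_gap(within, between)
print(round(gap, 3), "gap present" if gap > 0 else "no gap")
```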

  10. Molecular Identification of Dendrobium Species (Orchidaceae) Based on the DNA Barcode ITS2 Region and Its Application for Phylogenetic Study

    Directory of Open Access Journals (Sweden)

    Shangguo Feng

    2015-09-01

    Full Text Available The over-collection and habitat destruction of natural Dendrobium populations for their commercial medicinal value has led to these plants being under severe threat of extinction. In addition, many Dendrobium plants are similarly shaped and easily confused when not in flower. In the present study, we examined the application of the ITS2 region in barcoding and phylogenetic analyses of Dendrobium species (Orchidaceae). For barcoding, ITS2 regions of 43 samples in Dendrobium were amplified. In combination with sequences from GenBank, the sequences were aligned using Clustal W and genetic distances were computed using MEGA V5.1. The success rate of PCR amplification and sequencing was 100%. There was a significant divergence between the inter- and intra-specific genetic distances of ITS2 regions, and the presence of a barcoding gap was obvious. Based on the BLAST1, nearest distance and TaxonGAP methods, our results showed that the ITS2 regions could successfully identify the species of most Dendrobium samples examined. Second, we used ITS2 as a DNA marker to infer phylogenetic relationships of 64 Dendrobium species. The results showed that cluster analysis using the ITS2 region mainly supported the relationship between the species of Dendrobium established by traditional morphological methods and many previous molecular analyses. To sum up, the ITS2 region can not only be used as an efficient barcode to identify Dendrobium species, but also has the potential to contribute to the phylogenetic analysis of the genus Dendrobium.

  11. Phylogenetic analysis reveals a cryptic species Blastomyces gilchristii, sp. nov. within the human pathogenic fungus Blastomyces dermatitidis.

    Directory of Open Access Journals (Sweden)

    Elizabeth M Brown

    Analysis of the population genetic structure of microbial species is of fundamental importance to many scientific disciplines because it can identify cryptic species, reveal reproductive mode, and elucidate processes that contribute to pathogen evolution. Here, we examined the population genetic structure and geographic differentiation of the sexual, dimorphic fungus Blastomyces dermatitidis, the causative agent of blastomycosis. Criteria for Genealogical Concordance Phylogenetic Species Recognition (GCPSR) applied to seven nuclear loci (arf6, chs2, drk1, fads, pyrF, tub1, and its-2) from 78 clinical and environmental isolates identified two previously unrecognized phylogenetic species. Four of seven single gene phylogenies examined (chs2, drk1, pyrF, and its-2) supported the separation of Phylogenetic Species 1 (PS1) and Phylogenetic Species 2 (PS2), which were also well differentiated in the concatenated chs2-drk1-fads-pyrF-tub1-arf6-its2 genealogy with all isolates falling into one of two evolutionarily independent lineages. Phylogenetic species were genetically distinct with interspecific divergence 4-fold greater than intraspecific divergence and a high Fst value (0.772, P<0.001) indicative of restricted gene flow between PS1 and PS2. Whereas panmixia expected of a single freely recombining population was not observed, recombination was detected when PS1 and PS2 were assessed separately, suggesting reproductive isolation. Random mating among PS1 isolates, which were distributed across North America, was only detected after partitioning isolates into six geographic regions. The PS2 population, found predominantly in the hyper-endemic regions of northwestern Ontario, Wisconsin, and Minnesota, contained a substantial clonal component with random mating detected only among unique genotypes in the population. These analyses provide evidence for a genetically divergent clade within Blastomyces dermatitidis, which we use to describe a novel species

  12. Phylogenetic analysis of West Nile virus isolated in Italy in 2008.

    Science.gov (United States)

    Savini, G; Monaco, F; Calistri, P; Lelli, R

    2008-11-27

    In Italy the first occurrence of West Nile virus (WNV) infection was reported in the Tuscany region during the late summer of 1998. In August 2008, WNV infection re-emerged in Italy, in areas surrounding the Po river delta, involving three regions: Lombardy, Emilia Romagna and Veneto. WNV was isolated from blood and organ samples of one horse, one donkey, one pigeon (Columba livia) and three magpies (Pica pica). The phylogenetic analysis of the isolates, conducted on 255 bp in the region coding for the E protein, indicates that these isolates belong to lineage I among the European strains. According to the analysis, both the 1998 and 2008 Italian strains as well as isolates from Romania, Russia, Senegal and Kenya fell in the same sub-cluster.

  13. MEGA5: Molecular Evolutionary Genetics Analysis Using Maximum Likelihood, Evolutionary Distance, and Maximum Parsimony Methods

    Science.gov (United States)

    Tamura, Koichiro; Peterson, Daniel; Peterson, Nicholas; Stecher, Glen; Nei, Masatoshi; Kumar, Sudhir

    2011-01-01

    Comparative analysis of molecular sequence data is essential for reconstructing the evolutionary histories of species and inferring the nature and extent of selective forces shaping the evolution of genes and species. Here, we announce the release of Molecular Evolutionary Genetics Analysis version 5 (MEGA5), which is user-friendly software for mining online databases, building sequence alignments and phylogenetic trees, and using methods of evolutionary bioinformatics in basic biology, biomedicine, and evolution. The newest addition in MEGA5 is a collection of maximum likelihood (ML) analyses for inferring evolutionary trees, selecting best-fit substitution models (nucleotide or amino acid), inferring ancestral states and sequences (along with probabilities), and estimating evolutionary rates site-by-site. In computer simulation analyses, ML tree inference algorithms in MEGA5 compared favorably with other software packages in terms of computational efficiency and the accuracy of the estimates of phylogenetic trees, substitution parameters, and rate variation among sites. The MEGA user interface has now been enhanced to be activity driven, making it easier to use for both beginners and experienced scientists. This version of MEGA is intended for the Windows platform, and it has been configured for effective use on Mac OS X and Linux desktops. It is available free of charge from http://www.megasoftware.net. PMID:21546353

  14. Are Ichthyosporea animals or fungi? Bayesian phylogenetic analysis of elongation factor 1alpha of Ichthyophonus irregularis.

    Science.gov (United States)

    Ragan, Mark A; Murphy, Colleen A; Rand, Thomas G

    2003-12-01

    Ichthyosporea is a recently recognized group of morphologically simple eukaryotes, many of which cause disease in aquatic organisms. Ribosomal RNA sequence analyses place Ichthyosporea near the divergence of the animal and fungal lineages, but do not allow resolution of its exact phylogenetic position. Some of the best evidence for a specific grouping of animals and fungi (Opisthokonta) has come from elongation factor 1alpha, not only phylogenetic analysis of sequences but also the presence or absence of short insertions and deletions. We sequenced the EF-1alpha gene from the ichthyosporean parasite Ichthyophonus irregularis and determined its phylogenetic position using neighbor-joining, parsimony and Bayesian methods. We also sequenced EF-1alpha genes from four chytrids to provide broader representation within fungi. Sequence analyses and the presence of a characteristic 12 amino acid insertion strongly indicate that I. irregularis is a member of Opisthokonta, but do not resolve whether I. irregularis is a specific relative of animals or of fungi. However, the EF-1alpha of I. irregularis exhibits a two amino acid deletion heretofore reported only among fungi.

  15. Genome-wide analysis of SINA family in plants and their phylogenetic relationships.

    Science.gov (United States)

    Wang, Meng; Jin, Ying; Fu, Junjie; Zhu, Yun; Zheng, Jun; Hu, Jian; Wang, Guoying

    2008-06-01

    SINA genes in plants are part of a multigene family with 5 members in Arabidopsis thaliana, 10 members in Populus trichocarpa, 6 members in Oryza sativa, at least 6 members in Zea mays and at least 1 member in Physcomitrella patens. Six members in maize were confirmed by RT-PCR. All SINAs have one RING domain and one SINA domain. These two domains are highly conserved in plants. According to the motif organization and phylogenetic tree, SINA family members were divided into 2 groups. In addition, through semi-quantitative RT-PCR analysis of maize members and Digital Northern analysis of Arabidopsis and rice members, we found that the tissue expression patterns are more diverse in monocot than in Arabidopsis.

  16. Characterizing the phylogenetic tree community structure of a protected tropical rain forest area in Cameroon.

    Science.gov (United States)

    Manel, Stéphanie; Couvreur, Thomas L P; Munoz, François; Couteron, Pierre; Hardy, Olivier J; Sonké, Bonaventure

    2014-01-01

    Tropical rain forests, the richest terrestrial ecosystems in biodiversity on Earth, are highly threatened by global changes. This paper aims to infer the mechanisms governing species tree assemblages by characterizing the phylogenetic structure of a tropical rain forest in a protected area of the Congo Basin, the Dja Faunal Reserve (Cameroon). We re-analyzed a dataset of 11538 individuals belonging to 372 taxa found along nine transects spanning five habitat types. We generated a dated phylogenetic tree including all sampled taxa to partition the phylogenetic diversity of the nine transects into alpha and beta components at the level of the transects and of the habitat types. The variation in phylogenetic composition among transects did not deviate from a random pattern at the scale of the Dja Faunal Reserve, probably due to a common history and weak environmental variation across the park. This lack of phylogenetic structure combined with an isolation-by-distance pattern of taxonomic diversity suggests that neutral dispersal limitation is a major driver of community assembly in the Dja. To assess any lack of sensitivity to the variation in habitat types, we restricted the analyses of transects to the terra firme primary forest and found results consistent with those of the whole dataset at the level of the transects. In addition to previous analyses, we detected a weak but significant phylogenetic turnover among habitat types, suggesting that species sort in varying environments, even though this sorting does not dominate the overall phylogenetic structure. Finer analyses of clades indicated a signal of clustering for species from the Annonaceae family, while species from the Apocynaceae family indicated overdispersion. These results can contribute to the conservation of the park by improving our understanding of the processes dictating community assembly in these hyperdiverse but threatened regions of the world.

  17. Characterizing the Phylogenetic Tree Community Structure of a Protected Tropical Rain Forest Area in Cameroon

    Science.gov (United States)

    Munoz, François; Couteron, Pierre; Hardy, Olivier J.; Sonké, Bonaventure

    2014-01-01

    Tropical rain forests, the richest terrestrial ecosystems in biodiversity on Earth, are highly threatened by global changes. This paper aims to infer the mechanisms governing species tree assemblages by characterizing the phylogenetic structure of a tropical rain forest in a protected area of the Congo Basin, the Dja Faunal Reserve (Cameroon). We re-analyzed a dataset of 11538 individuals belonging to 372 taxa found along nine transects spanning five habitat types. We generated a dated phylogenetic tree including all sampled taxa to partition the phylogenetic diversity of the nine transects into alpha and beta components at the level of the transects and of the habitat types. The variation in phylogenetic composition among transects did not deviate from a random pattern at the scale of the Dja Faunal Reserve, probably due to a common history and weak environmental variation across the park. This lack of phylogenetic structure combined with an isolation-by-distance pattern of taxonomic diversity suggests that neutral dispersal limitation is a major driver of community assembly in the Dja. To assess any lack of sensitivity to the variation in habitat types, we restricted the analyses of transects to the terra firme primary forest and found results consistent with those of the whole dataset at the level of the transects. In addition to previous analyses, we detected a weak but significant phylogenetic turnover among habitat types, suggesting that species sort in varying environments, even though this sorting does not dominate the overall phylogenetic structure. Finer analyses of clades indicated a signal of clustering for species from the Annonaceae family, while species from the Apocynaceae family indicated overdispersion. These results can contribute to the conservation of the park by improving our understanding of the processes dictating community assembly in these hyperdiverse but threatened regions of the world. PMID:24936786

  18. Identification of Tunisian Leishmania spp. by PCR amplification of cysteine proteinase B (cpb) genes and phylogenetic analysis.

    Science.gov (United States)

    Chaouch, Melek; Fathallah-Mili, Akila; Driss, Mehdi; Lahmadi, Ramzi; Ayari, Chiraz; Guizani, Ikram; Ben Said, Moncef; Benabderrazak, Souha

    2013-03-01

    Discrimination of the Old World Leishmania parasites is important for diagnosis and epidemiological studies of leishmaniasis. We have developed PCR assays that allow discrimination between the Tunisian species Leishmania major, Leishmania tropica and Leishmania infantum. The identification was performed by a simple PCR targeting cysteine protease B (cpb) gene copies. These PCR assays can serve as routine molecular biology tools for discriminating Leishmania spp. from different geographical origins and different clinical forms. Our assays can also inform studies of the cpb gene in drug, diagnostics and vaccine research. The PCR products of the cpb gene and the N-acetylglucosamine-1-phosphate transferase (nagt) Leishmania gene were sequenced and aligned. Phylogenetic trees of Leishmania based on cpb and nagt sequences are close in topology and present the classic distribution of Leishmania in the Old World. The phylogenetic analysis has enabled the characterization and identification of different strains, using both multicopy (cpb) and single copy (nagt) genes. Indeed, the cpb phylogenetic analysis allowed us to identify the Tunisian Leishmania killicki species, and a group that gathers the least evolved isolates of the Leishmania donovani complex, which originated in East Africa. This clustering confirms the African origin for the visceralizing species of the L. donovani complex.

  19. Conformation of phylogenetic relationship of Penaeidae shrimp based on morphometric and molecular investigations.

    Science.gov (United States)

    Rajakumaran, P; Vaseeharan, B; Jayakumar, R; Chidambara, R

    2014-01-01

    An accurate understanding of the phylogenetic relationships among Penaeidae shrimp is important for both academic research and the fisheries industry. Morphometric and randomly amplified polymorphic DNA (RAPD) analyses were used to infer the phylogenetic relationships among 13 Penaeidae shrimp species. For the morphometric analysis, forty variables and the total length of the shrimp were measured for each species, and the effect of size variation was removed. The size-normalized values were subjected to UPGMA (Unweighted Pair-Group Method with Arithmetic Mean) cluster analysis. For the RAPD analysis, the four primers showed reliable differentiation between species, and the correlation coefficients between the DNA banding patterns of the 13 Penaeidae species were used to construct a UPGMA dendrogram. The phylogenetic relationships obtained from the morphometric and molecular analyses were congruent. We conclude that, because the morphometric results concur with the molecular ones, the phylogenetic relationships obtained for the studied Penaeidae can be considered reliable.
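UPGMA, the clustering method named in this abstract, repeatedly merges the two closest clusters and sets the distance from the merged cluster to every other cluster to the size-weighted mean of the two old distances. The sketch below is a minimal illustration of that procedure only: the function name `upgma`, the taxon labels and the distance values are hypothetical, branch lengths are omitted, and nothing here reproduces the study's data.

```python
def upgma(names, dist):
    """Build a UPGMA topology as nested tuples.

    names: list of taxon labels.
    dist:  dict mapping a pair (a, b) of labels to their distance.
    """
    sizes = {n: 1 for n in names}                      # cluster -> number of leaves
    d = {frozenset(k): v for k, v in dist.items()}     # unordered pair -> distance
    while len(sizes) > 1:
        pair = min(d, key=d.get)                       # closest pair of clusters
        a, b = sorted(pair, key=str)                   # stable order for the tuple
        merged = (a, b)
        new_size = sizes[a] + sizes[b]
        for c in sizes:
            if c == a or c == b:
                continue
            dac = d.pop(frozenset((a, c)))
            dbc = d.pop(frozenset((b, c)))
            # Size-weighted average: the arithmetic mean over all leaf pairs.
            d[frozenset((merged, c))] = (sizes[a] * dac + sizes[b] * dbc) / new_size
        del d[pair]
        del sizes[a], sizes[b]
        sizes[merged] = new_size
    return next(iter(sizes))


# Hypothetical distances between four taxa.
toy_dist = {
    ("A", "B"): 2.0,
    ("A", "C"): 6.0,
    ("B", "C"): 6.0,
    ("A", "D"): 10.0,
    ("B", "D"): 10.0,
    ("C", "D"): 10.0,
}
tree = upgma(["A", "B", "C", "D"], toy_dist)
print(tree)
```

For real analyses, a library routine such as average-linkage hierarchical clustering would normally be preferred over a hand-written loop.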

  20. Evolutionary rates at codon sites may be used to align sequences and infer protein domain function

    Directory of Open Access Journals (Sweden)

    Hazelhurst Scott

    2010-03-01

    Background Sequence alignments form part of many investigations in molecular biology, including the determination of phylogenetic relationships, the prediction of protein structure and function, and the measurement of evolutionary rates. However, to obtain meaningful results, a significant degree of sequence similarity is required to ensure that the alignments are accurate and the inferences correct. Limitations arise when sequence similarity is low, which is particularly problematic when working with fast-evolving genes, evolutionary distant taxa, genomes with nucleotide biases, and cases of convergent evolution. Results A novel approach was conceptualized to address the "low sequence similarity" alignment problem. We developed an alignment algorithm termed FIRE (Functional Inference using the Rates of Evolution), which aligns sequences using the evolutionary rate at codon sites, as measured by the dN/dS ratio, rather than nucleotide or amino acid residues. FIRE was used to test the hypotheses that evolutionary rates can be used to align sequences and that the alignments may be used to infer protein domain function. Using a range of test data, we found that aligning domains based on evolutionary rates was possible even when sequence similarity was very low (for example, antibody variable regions). Furthermore, the alignment has the potential to infer protein domain function, indicating that domains with similar functions are subject to similar evolutionary constraints. These data suggest that an evolutionary rate-based approach to sequence analysis (particularly when combined with structural data) may be used to study cases of convergent evolution or when sequences have very low similarity. However, when aligning homologous gene sets with sequence similarity, FIRE did not perform as well as the best traditional alignment algorithms indicating that the conventional approach of aligning residues as opposed to evolutionary rates remains the

  1. Evolution of oil-producing trichomes in Sisyrinchium (Iridaceae): insights from the first comprehensive phylogenetic analysis of the genus

    Science.gov (United States)

    Chauveau, Olivier; Eggers, Lilian; Raquin, Christian; Silvério, Adriano; Brown, Spencer; Couloux, Arnaud; Cruaud, Corine; Kaltchuk-Santos, Eliane; Yockteng, Roxana; Souza-Chies, Tatiana T.; Nadot, Sophie

    2011-01-01

    Background and Aims Sisyrinchium (Iridaceae: Iridoideae: Sisyrinchieae) is one of the largest, most widespread and most taxonomically complex genera in Iridaceae, with all species except one native to the American continent. Phylogenetic relationships within the genus were investigated and the evolution of oil-producing structures related to specialized oil-bee pollination examined. Methods Phylogenetic analyses based on eight molecular markers obtained from 101 Sisyrinchium accessions representing 85 species were conducted in the first extensive phylogenetic analysis of the genus. Total evidence analyses confirmed the monophyly of the genus and retrieved nine major clades weakly connected to the subdivisions previously recognized. The resulting phylogenetic hypothesis was used to reconstruct biogeographical patterns, and to trace the evolutionary origin of glandular trichomes present in the flowers of several species. Key Results and Conclusions Glandular trichomes evolved three times independently in the genus. In two cases, these glandular trichomes are oil-secreting, suggesting that the corresponding flowers might be pollinated by oil-bees. Biogeographical patterns indicate expansions from Central America and the northern Andes to the subandean ranges between Chile and Argentina and to the extended area of the Paraná river basin. The distribution of oil-flower species across the phylogenetic trees suggests that oil-producing trichomes may have played a key role in the diversification of the genus, a hypothesis that requires future testing. PMID:21527419

  2. Analysis of kinetoplast cytochrome b gene of 16 Leishmania isolates from different foci of China: different species of Leishmania in China and their phylogenetic inference

    Science.gov (United States)

    2013-01-01

    Background Leishmania species belong to the family Trypanosomatidae and cause leishmaniasis, a geographically widespread disease that infects humans and other vertebrates. This disease remains endemic in China. Due to the large geographic area and complex ecological environment, the taxonomic position and phylogenetic relationship of Chinese Leishmania isolates remain uncertain. A recent internal transcribed spacer 1 and cytochrome oxidase II phylogeny of Chinese Leishmania isolates has challenged some aspects of their traditional taxonomy as well as cladistics hypotheses of their phylogeny. The current study was designed to provide further disease background and sequence analysis. Methods We systematically analyzed 50 cytochrome b (cyt b) gene sequences of 19 isolates (16 from China, 3 from other countries) sequenced after polymerase chain reaction (PCR) using a special primer for cyt b as well as 31 sequences downloaded from GenBank. After alignment, the data were analyzed using the maximum parsimony, Bayesian and network methods. Results Sequences of six haplotypes representing 10 Chinese isolates formed a monophyletic group and clustered with Leishmania tarentolae. The isolates GS1, GS7, XJ771 of this study from China clustered with other isolates of Leishmania donovani complex. The isolate JS1 was a sister to Leishmania tropica, which represented an L. tropica complex instead of clustering with L. donovani complex or with the other 10 Chinese isolates. The isolates KXG-2 and GS-GER20 formed a monophyletic group with Leishmania turanica from central Asia. In the different phylogenetic trees, all of the Chinese isolates occurred in at least four groups regardless of geographic distribution.
Conclusions The undescribed Leishmania species of China, which are clearly causative agents of canine leishmaniasis and human visceral leishmaniasis and are related to Sauroleishmania, may have evolved from a common ancestral parasite that came from the Americas and may have

  3. Molecular diagnosis and phylogenetic analysis of Babesia bigemina and Babesia bovis hemoparasites from cattle in South Africa

    Science.gov (United States)

    2013-01-01

    Background Babesia parasites, mainly Babesia bovis and B. bigemina, are tick-borne hemoparasites inducing bovine babesiosis in cattle globally. The clinical signs of the disease include, among others, anemia, fever and hemoglobinuria. Babesiosis is known to occur in tropical and subtropical regions of the world. In this study, we aim to provide information about the occurrence and phylogenetic relationship of B. bigemina and B. bovis species in cattle from different locations in nine provinces of South Africa. A total of 430 blood samples were randomly collected from apparently healthy cattle. These samples were genetically tested for Babesia parasitic infections using nested PCR assays with species-specific primers. Results Nested PCR assays with Group I primer sets revealed that the overall prevalence of B. bigemina and B. bovis in all bovine samples tested was 64.7% (95% CI = 60.0-69.0) and 35.1% (95% CI = 30.6-39.8), respectively. Only 117/430 (27.2%) animals had a mixed infection. The highest prevalence of 87.5% (95% CI = 77.2-93.5) for B. bigemina was recorded in the Free State province collection sites (Ficksburg, Philippolis and Botshabelo), while North West collection sites had the highest number of animals infected with B. bovis (65.5%; 95% CI = 52.7-76.4). Phylograms were inferred based on B. bigemina-specific gp45 and B. bovis-specific rap-1 nucleotide sequences obtained with Group II nested PCR primers. Phylogenetic analysis of gp45 sequences revealed significant differences in the genotypes of B. bigemina isolates investigated, including those of strains published in GenBank. On the other hand, a phylogeny based on B. bovis rap-1 sequences indicated a similar trend of clustering among the sequences of B. bovis isolates investigated in this study. Conclusion This study demonstrates the occurrence of Babesia parasites in cattle from different provinces of South Africa. It was also noted that the situation of Babesia parasitic infection
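The prevalence figures in this abstract are reported with 95% confidence intervals, but the abstract does not say how the intervals were computed. As a hedged illustration, the sketch below uses the Wilson score interval, which is an assumption rather than the authors' stated method. The count of 278 positives is also an assumption, back-calculated from 64.7% of 430 samples.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score confidence interval for a binomial proportion.

    z = 1.96 corresponds to an approximate 95% interval.
    """
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need 0 <= successes <= n and n > 0")
    p = successes / n
    denom = 1 + z ** 2 / n
    center = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return center - half, center + half


# Assumed count: 278 of 430 samples positive for B. bigemina (about 64.7%).
low, high = wilson_interval(278, 430)
print(f"prevalence = {278 / 430:.1%}, 95% CI = {low:.1%} to {high:.1%}")
```

Under these assumptions the interval comes out at roughly 60.0% to 69.0%, which agrees with the range quoted in the abstract, although agreement alone does not confirm which method the authors used.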

  4. Molecular phylogenetic and expression analysis of the complete WRKY transcription factor family in maize.

    Science.gov (United States)

    Wei, Kai-Fa; Chen, Juan; Chen, Yan-Feng; Wu, Ling-Juan; Xie, Dao-Xin

    2012-04-01

    The WRKY transcription factors function in plant growth and development, and in the response to biotic and abiotic stresses. Although many studies have focused on the functional identification of the WRKY transcription factors, much less is known about molecular phylogenetic and global expression analysis of the complete WRKY family in maize. In this study, we identified 136 WRKY proteins coded by 119 genes in the B73 inbred line from the complete genome and named them in an orderly manner. Then, a comprehensive phylogenetic analysis of five species was performed to explore the origin and evolutionary patterns of these WRKY genes, and the result showed that gene duplication is the major driving force for the origin of new groups and subgroups and functional divergence during evolution. Chromosomal location analysis of maize WRKY genes indicated that 20 gene clusters are distributed unevenly in the genome. Microarray-based expression analysis has revealed that 131 WRKY transcripts encoded by 116 genes may participate in the regulation of maize growth and development. Among them, 102 transcripts are stably expressed with a coefficient of variation (CV) value of <15%; transcripts of WRKY genes with a CV value of >15% are further analysed to discover new organ- or tissue-specific genes. In addition, microarray analyses of transcriptional responses to drought stress and fungal infection showed that maize WRKY proteins are involved in stress responses. All these results contribute to a deep probing into the roles of WRKY transcription factors in maize growth and development and stress tolerance.
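The coefficient of variation (CV) used in this abstract is the standard deviation of a transcript's expression values divided by their mean, expressed as a percentage. The sketch below shows one way the 15% threshold might be applied. The function names and expression values are hypothetical, and the use of the population standard deviation is an assumption, since the abstract does not specify it.

```python
import statistics


def coefficient_of_variation(values):
    """CV as a percentage: population standard deviation divided by the mean."""
    mean = statistics.fmean(values)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return statistics.pstdev(values) / mean * 100


def is_stably_expressed(values, threshold=15.0):
    """True when the CV across tissues falls below the threshold (15% in the abstract)."""
    return coefficient_of_variation(values) < threshold


# Hypothetical expression levels of two transcripts across four tissues.
stable_transcript = [8.0, 12.0, 10.0, 10.0]
variable_transcript = [2.0, 18.0, 10.0, 10.0]

print(f"{coefficient_of_variation(stable_transcript):.1f}%", is_stably_expressed(stable_transcript))
print(f"{coefficient_of_variation(variable_transcript):.1f}%", is_stably_expressed(variable_transcript))
```

Transcripts that fail the stability check would be the candidates for the organ- or tissue-specific follow-up the abstract mentions.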

  5. Identification and phylogenetic analysis of novel cytochrome P450 1A genes from ungulate species.

    Science.gov (United States)

    Darwish, Wageh Sobhy; Kawai, Yusuke; Ikenaka, Yoshinori; Yamamoto, Hideaki; Muroya, Tarou; Ishizuka, Mayumi

    2010-09-01

    As part of an ongoing effort to understand the biological response of wild and domestic ungulates to different environmental pollutants such as dioxin-like compounds, cDNAs encoding for CYP1A1 and CYP1A2 were cloned and characterized. Four novel CYP1A cDNA fragments from the livers of four wild ungulates (elephant, hippopotamus, tapir and deer) were identified. Three fragments from hippopotamus, tapir and deer were classified as CYP1A2, and the other fragment from elephant was designated as CYP1A1/2. The deduced amino acid sequences of these fragment CYP1As showed identities ranging from 76 to 97% with other animal CYP1As. The phylogenetic analysis of these fragments showed that both elephant and hippopotamus CYP1As made separate branches, while tapir and deer CYP1As were located beside that of horse and cattle respectively in the phylogenetic tree. Analysis of dN/dS ratio among the identified CYP1As indicated that odd toed ungulate CYP1A2s were exposed to different selection pressure.

  6. Phylogenetic turnover during subtropical forest succession across environmental and phylogenetic scales

    OpenAIRE

    Purschke, Oliver; Michalski, Stefan G.; Bruelheide, Helge; Durka, Walter

    2017-01-01

    Abstract Although spatial and temporal patterns of phylogenetic community structure during succession are inherently interlinked and assembly processes vary with environmental and phylogenetic scales, successional studies of community assembly have yet to integrate spatial and temporal components of community structure, while accounting for scaling issues. To gain insight into the processes that generate biodiversity after disturbance, we combine analyses of spatial and temporal phylogenetic ...

  7. Time clustered sampling can inflate the inferred substitution rate in foot-and-mouth disease virus analyses

    DEFF Research Database (Denmark)

    Pedersen, Casper-Emil Tingskov; Frandsen, Peter; Wekesa, Sabenzia N.

    2015-01-01

    abundance of sequence data sampled under widely different schemes, an effort to keep results consistent and comparable is needed. This study emphasizes commonly disregarded problems in the inference of evolutionary rates in viral sequence data when sampling is unevenly distributed on a temporal scale...... through a study of the foot-and-mouth (FMD) disease virus serotypes SAT 1 and SAT 2. Our study shows that clustered temporal sampling in phylogenetic analyses of FMD viruses will strongly bias the inferences of substitution rates and tMRCA because the inferred rates in such data sets reflect a rate closer...... to the mutation rate rather than the substitution rate. Estimating evolutionary parameters from viral sequences should be performed with due consideration of the differences in short-term and longer-term evolutionary processes occurring within sets of temporally sampled viruses, and studies should carefully...

  8. MEGA-CC: computing core of molecular evolutionary genetics analysis program for automated and iterative data analysis.

    Science.gov (United States)

    Kumar, Sudhir; Stecher, Glen; Peterson, Daniel; Tamura, Koichiro

    2012-10-15

    There is a growing need in the research community to apply the molecular evolutionary genetics analysis (MEGA) software tool for batch processing a large number of datasets and to integrate it into analysis workflows. Therefore, we now make available the computing core of the MEGA software as a stand-alone executable (MEGA-CC), along with an analysis prototyper (MEGA-Proto). MEGA-CC provides users with access to all the computational analyses available through MEGA's graphical user interface version. This includes methods for multiple sequence alignment, substitution model selection, evolutionary distance estimation, phylogeny inference, substitution rate and pattern estimation, tests of natural selection and ancestral sequence inference. Additionally, we have upgraded the source code for phylogenetic analysis using the maximum likelihood methods for parallel execution on multiple processors and cores. Here, we describe MEGA-CC and outline the steps for using MEGA-CC in tandem with MEGA-Proto for iterative and automated data analysis. http://www.megasoftware.net/.

  9. Phylogeny, character evolution, and biogeography of Cuscuta (dodders; Convolvulaceae) inferred from coding plastid and nuclear sequences.

    Science.gov (United States)

    García, Miguel A; Costea, Mihai; Kuzmina, Maria; Stefanović, Saša

    2014-04-01

    The parasitic genus Cuscuta, containing some 200 species circumscribed traditionally in three subgenera, is nearly cosmopolitan, occurring in a wide range of habitats and hosts. Previous molecular studies, on subgenera Grammica and Cuscuta, delimited major clades within these groups. However, the sequences used were unalignable among subgenera, preventing the phylogenetic comparison across the genus. We conducted a broad phylogenetic study using rbcL and nrLSU sequences covering the morphological, physiological, and geographical diversity of Cuscuta. We used parsimony methods to reconstruct ancestral states for taxonomically important characters. Biogeographical inferences were obtained using statistical and Bayesian approaches. Four well-supported major clades are resolved. Two of them correspond to subgenera Monogynella and Grammica. Subgenus Cuscuta is paraphyletic, with section Pachystigma sister to subgenus Grammica. Previously described cases of strongly supported discordance between plastid and nuclear phylogenies, interpreted as reticulation events, are confirmed here and three new cases are detected. Dehiscent fruits and globose stigmas are inferred as ancestral character states, whereas the ancestral style number is ambiguous. Biogeographical reconstructions suggest an Old World origin for the genus and subsequent spread to the Americas as a consequence of one long-distance dispersal. Hybridization may play an important yet underestimated role in the evolution of Cuscuta. Our results disagree with scenarios of evolution (polarity) previously proposed for several taxonomically important morphological characters, and with their usage and significance. While several cases of long-distance dispersal are inferred, vicariance or dispersal to adjacent areas emerges as the dominant biogeographical pattern.

  10. Do Branch Lengths Help to Locate a Tree in a Phylogenetic Network?

    Science.gov (United States)

    Gambette, Philippe; van Iersel, Leo; Kelk, Steven; Pardi, Fabio; Scornavacca, Celine

    2016-09-01

    Phylogenetic networks are increasingly used in evolutionary biology to represent the history of species that have undergone reticulate events such as horizontal gene transfer, hybrid speciation and recombination. One of the most fundamental questions that arise in this context is whether the evolution of a gene with one copy in all species can be explained by a given network. In mathematical terms, this is often translated in the following way: is a given phylogenetic tree contained in a given phylogenetic network? Recently this tree containment problem has been widely investigated from a computational perspective, but most studies have only focused on the topology of the phylogenies, ignoring a piece of information that, in the case of phylogenetic trees, is routinely inferred by evolutionary analyses: branch lengths. These measure the amount of change (e.g., nucleotide substitutions) that has occurred along each branch of the phylogeny. Here, we study a number of versions of the tree containment problem that explicitly account for branch lengths. We show that, although length information has the potential to locate more precisely a tree within a network, the problem is computationally hard in its most general form. On a positive note, for a number of special cases of biological relevance, we provide algorithms that solve this problem efficiently. This includes the case of networks of limited complexity, for which it is possible to recover, among the trees contained by the network with the same topology as the input tree, the closest one in terms of branch lengths.
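
    The tree containment question described above can be illustrated with a small brute-force sketch. This is not one of the algorithms from the paper: it simply tries every choice of one parent per reticulation node, which is exponential in the number of reticulations. The toy network, its node names, and its branch lengths are hypothetical. Each displayed tree is summarised as a map from clusters (sets of leaves below an edge) to the total branch length above that cluster, so suppressed degree-two nodes have their edge lengths added together.

```python
from collections import defaultdict
from itertools import product


def displayed_trees(parents, leaves):
    """Yield {cluster: branch length above it} for each tree displayed by the network.

    parents maps every node to a list of (parent, length) pairs:
    [] for the root, one pair for tree nodes, two pairs for reticulations.
    """
    all_leaves = frozenset(leaves)
    retics = [v for v, ps in parents.items() if len(ps) == 2]
    for choice in product(range(2), repeat=len(retics)):
        pick = dict(zip(retics, choice))
        # Keep exactly one incoming edge per non-root node.
        kept = {v: ps[pick.get(v, 0)] for v, ps in parents.items() if ps}
        children = defaultdict(list)
        for child, (parent, _) in kept.items():
            children[parent].append(child)

        def cluster(v):
            if v in leaves:
                return frozenset([v])
            return frozenset().union(*(cluster(c) for c in children[v]))

        lengths = defaultdict(float)
        for child, (_, length) in kept.items():
            cl = cluster(child)
            # Skip dead ends (empty cluster) and edges above the full leaf set.
            if cl and cl != all_leaves:
                lengths[cl] += length  # merges edges across suppressed nodes
        yield dict(lengths)


def tree_containment(parents, leaves, query, tol=1e-9):
    """Return (topology_contained, contained_with_matching_branch_lengths)."""
    topo = exact = False
    for disp in displayed_trees(parents, leaves):
        if set(disp) == set(query):
            topo = True
            if all(abs(disp[c] - query[c]) <= tol for c in query):
                exact = True
    return topo, exact


# Hypothetical network: leaf b hangs below reticulation h, whose parents are x and y.
leaves = {"a", "b", "c"}
network = {
    "r": [],
    "x": [("r", 1.0)],
    "y": [("r", 1.0)],
    "a": [("x", 2.0)],
    "c": [("y", 2.0)],
    "h": [("x", 1.0), ("y", 1.0)],
    "b": [("h", 1.0)],
}

fs = frozenset
query_ok = {fs("a"): 2.0, fs("b"): 2.0, fs("ab"): 1.0, fs("c"): 3.0}
query_wrong_len = {fs("a"): 2.0, fs("b"): 5.0, fs("ab"): 1.0, fs("c"): 3.0}
query_absent = {fs("a"): 1.0, fs("c"): 1.0, fs("ac"): 1.0, fs("b"): 1.0}

print(tree_containment(network, leaves, query_ok))         # (True, True)
print(tree_containment(network, leaves, query_wrong_len))  # (True, False)
print(tree_containment(network, leaves, query_absent))     # (False, False)
```

    The second query shows the point made in the abstract: a tree whose topology is displayed by the network may still be excluded once branch lengths are taken into account.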

  11. Mitochondrial DNA genomes organization and phylogenetic relationships analysis of eight anemonefishes (Pomacentridae: Amphiprioninae).

    Directory of Open Access Journals (Sweden)

    Jianlong Li

    Anemonefishes (Pomacentridae: Amphiprioninae) are a group of 30 valid coral reef fish species whose phylogenetic relationships are still under debate. The eight available mitogenomes of anemonefishes were used to reconstruct the molecular phylogenetic tree; six were obtained in this study (Amphiprion clarkii, A. frenatus, A. percula, A. perideraion, A. polymnus and Premnas biaculeatus) and two from GenBank (A. bicinctus and A. ocellaris). The seven Amphiprion species represent all four subgenera, and P. biaculeatus is the only species of Premnas. The eight mitogenomes encoded 13 protein-coding genes, two rRNA genes, 22 tRNA genes and two main non-coding regions, with gene arrangement and translation direction basically identical to those of other typical vertebrate mitogenomes. Among the 13 protein-coding genes, ND5 had the same length (1,866 bp) in A. ocellaris (AP006017) and A. percula (KJ174497), three nucleotides shorter than in the other six anemonefishes. Both forms of ND5, however, could be translated into amino acids successfully. Only four mitogenomes had tandem repeats in the D-loop; the repeats were located downstream of the Conserved Sequence Block rather than upstream and were repeated in a simple way. Phylogenetic utility was tested with Bayesian and Maximum Likelihood methods using all 13 protein-coding genes. The results strongly supported that the subfamily Amphiprioninae is monophyletic and that P. biaculeatus should be assigned to the genus Amphiprion. Premnas biaculeatus, together with the percula complex, was revealed to be among the ancient anemonefish species. The tree forms of ND1, COIII, ND4, Cytb, Cytb+12S rRNA, Cytb+COI and Cytb+COI+12S rRNA were similar to that of the 13 protein-coding genes; we therefore suggest that the most suitable single mitochondrial gene for phylogenetic analysis of anemonefishes may be Cytb. Additional mitogenomes of anemonefishes with a combination of nuclear markers will be useful to

  12. Hybrid Origins of Citrus Varieties Inferred from DNA Marker Analysis of Nuclear and Organelle Genomes

    Science.gov (United States)

    Kitajima, Akira; Nonaka, Keisuke; Yoshioka, Terutaka; Ohta, Satoshi; Goto, Shingo; Toyoda, Atsushi; Fujiyama, Asao; Mochizuki, Takako; Nagasaki, Hideki; Kaminuma, Eli; Nakamura, Yasukazu

    2016-01-01

    Most indigenous citrus varieties are assumed to be natural hybrids, but their parentage has so far been determined in only a few cases because of their wide genetic diversity and the low transferability of DNA markers. Here we infer the parentage of indigenous citrus varieties using simple sequence repeat and indel markers developed from various citrus genome sequence resources. Parentage tests with 122 known hybrids using the selected DNA markers certify their transferability among those hybrids. Identity tests confirm that most variant strains are selected mutants, but we find four types of kunenbo (Citrus nobilis) and three types of tachibana (Citrus tachibana) for which we suggest different origins. Structure analysis with DNA markers that are in Hardy–Weinberg equilibrium deduces three basic taxa coinciding with the current understanding of citrus ancestors. Genotyping analysis of 101 indigenous citrus varieties with 123 selected DNA markers, using allele-sharing and parentage tests, infers both parents of 22 indigenous citrus varieties, including Satsuma, Temple, and iyo, and single parents of 45 indigenous citrus varieties, including kunenbo, C. ichangensis, and Ichang lemon. Genotyping analysis of chloroplast and mitochondrial genomes using 11 DNA markers classifies their cytoplasmic genotypes into 18 categories and deduces the combination of seed and pollen parents. Likelihood ratio analysis verifies the inferred parentages with significant scores. The reconstructed genealogy identifies 12 types of varieties consisting of Kishu, kunenbo, yuzu, koji, sour orange, dancy, kobeni mikan, sweet orange, tachibana, Cleopatra, willowleaf mandarin, and pummelo, which have played pivotal roles in the occurrence of these indigenous varieties. The inferred parentage of the indigenous varieties confirms their hybrid origins, as found by recent studies. PMID:27902727
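
    The allele-sharing and parentage tests mentioned above can be sketched as simple Mendelian compatibility checks on diploid marker genotypes. This is a minimal illustration under stated assumptions, not the authors' procedure: the marker names, allele sizes, and sample labels are hypothetical, genotyping error and null alleles are ignored, and no likelihood ratio is computed.

```python
def shares_allele(child, parent):
    """Single-parent test: the child shares at least one allele with the
    candidate parent at every marker."""
    return all(set(child[m]) & set(parent[m]) for m in child)


def compatible_pair(child, parent1, parent2):
    """Parent-pair test: at every marker, one child allele can come from
    parent1 and the other from parent2 (in either order)."""
    for marker, (a, b) in child.items():
        g1, g2 = parent1[marker], parent2[marker]
        if not ((a in g1 and b in g2) or (b in g1 and a in g2)):
            return False
    return True


# Hypothetical genotypes: marker -> (allele, allele). SSR alleles are fragment
# sizes in bp; the indel marker is coded 0 (deletion) / 1 (insertion).
child = {"SSR1": (150, 154), "SSR2": (200, 200), "INDEL1": (1, 0)}
cand_a = {"SSR1": (150, 152), "SSR2": (200, 204), "INDEL1": (1, 1)}
cand_b = {"SSR1": (154, 154), "SSR2": (200, 202), "INDEL1": (0, 0)}
cand_c = {"SSR1": (150, 156), "SSR2": (202, 204), "INDEL1": (0, 1)}

print(compatible_pair(child, cand_a, cand_b))  # True
print(compatible_pair(child, cand_a, cand_c))  # False
print(shares_allele(child, cand_c))            # False
```

    In this toy data, cand_c is excluded as a parent because it carries neither allele of the child at SSR2, while cand_a and cand_b together can account for both alleles of the child at every marker.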

  13. Revealing pancrustacean relationships: Phylogenetic analysis of ribosomal protein genes places Collembola (springtails) in a monophyletic Hexapoda and reinforces the discrepancy between mitochondrial and nuclear DNA markers

    Directory of Open Access Journals (Sweden)

    Mariën J

    2008-03-01

    Background: In recent years, several new hypotheses on phylogenetic relations among arthropods have been proposed on the basis of DNA sequences. One of the challenged hypotheses is the monophyly of hexapods. This discussion originated from analyses based on mitochondrial DNA datasets that, due to an unusual positioning of Collembola, suggested that the hexapod body plan evolved at least twice. Here, we re-evaluate the position of Collembola using ribosomal protein gene sequences. Results: In total, 48 ribosomal proteins were obtained for the collembolan Folsomia candida. These 48 sequences were aligned with sequence data on 35 other ecdysozoans. Each ribosomal protein gene was available for 25% to 86% of the taxa. However, the total sequence information was unequally distributed over the taxa and ranged between 4% and 100%. A concatenated dataset was constructed (5034 inferred amino acids in length), of which ~66% of the positions were filled. Phylogenetic tree recon