Sample records for accurate phylogenetic classification

  1. Accurate phylogenetic classification of DNA fragments based on sequence composition

    Energy Technology Data Exchange (ETDEWEB)

    McHardy, Alice C.; Garcia Martin, Hector; Tsirigos, Aristotelis; Hugenholtz, Philip; Rigoutsos, Isidore


    Metagenome studies have retrieved vast amounts of sequence out of a variety of environments, leading to novel discoveries and great insights into the uncultured microbial world. Except for very simple communities, diversity makes sequence assembly and analysis a very challenging problem. To understand the structure and function of microbial communities, a taxonomic characterization of the obtained sequence fragments is highly desirable, yet currently limited mostly to those sequences that contain phylogenetic marker genes. We show that for clades at the rank of domain down to genus, sequence composition allows the very accurate phylogenetic characterization of genomic sequence. We developed a composition-based classifier, PhyloPythia, for de novo phylogenetic sequence characterization and have trained it on a data set of 340 genomes. By extensive evaluation experiments we show that the method is accurate across all taxonomic ranks considered, even for sequences that originate from novel organisms and are as short as 1 kb. Application to two metagenome datasets obtained from samples of phosphorus-removing sludge showed that the method allows the accurate classification at genus level of most sequence fragments from the dominant populations, while at the same time correctly characterizing even larger parts of the samples at higher taxonomic levels.
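
    As a rough illustration of what "sequence composition" can mean in practice, the sketch below turns a DNA fragment into a vector of relative k-mer frequencies. It is only a minimal example under stated assumptions: the function name kmer_profile, the choice of k, and the toy sequence are hypothetical, and this is not the feature set or classifier actually used by PhyloPythia.

```python
from collections import Counter
from itertools import product


def kmer_profile(seq: str, k: int = 4) -> dict[str, float]:
    """Return the relative frequency of every DNA k-mer (4**k entries) in seq."""
    seq = seq.upper()
    counts = Counter(
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= set("ACGT")  # skip windows containing N or other symbols
    )
    total = sum(counts.values())
    all_kmers = ("".join(p) for p in product("ACGT", repeat=k))
    return {kmer: (counts[kmer] / total if total else 0.0) for kmer in all_kmers}


# Toy fragment (made up for illustration); real fragments would be about 1 kb or longer.
profile = kmer_profile("ACGTACGTNACGT", k=2)
```

    A vector like this could then be passed to any supervised classifier trained on fragments of known taxonomic origin.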

  2. Concepts of Classification and Taxonomy. Phylogenetic Classification

    CERN Document Server

    Fraix-Burnet, Didier


    Phylogenetic approaches to classification have been heavily developed in biology by bioinformaticians. But these techniques have applications in other fields, in particular in linguistics. Their main characteristic is to search for relationships between the objects or species under study, instead of grouping them by similarity. They are thus rather well suited for any kind of evolutionary objects. For nearly fifteen years, astrocladistics has explored the use of Maximum Parsimony (or cladistics) for astronomical objects like galaxies or globular clusters. In this lesson we will learn how it works. 1 Why phylogenetic tools in astrophysics? 1.1 History of classification The need for classifying living organisms is very ancient, and the first classification system can be dated back to the Greeks. The goal was very practical since it was intended to distinguish between edible and toxic foods, or kind and dangerous animals. Simple resemblance was used and has been used for centuries. Basically, until the XVIIIth...

  3. Concepts of Classification and Taxonomy Phylogenetic Classification (United States)

    Fraix-Burnet, D.


    Phylogenetic approaches to classification have been heavily developed in biology by bioinformaticians. But these techniques have applications in other fields, in particular in linguistics. Their main characteristic is to search for relationships between the objects or species under study, instead of grouping them by similarity. They are thus rather well suited for any kind of evolutionary objects. For nearly fifteen years, astrocladistics has explored the use of Maximum Parsimony (or cladistics) for astronomical objects like galaxies or globular clusters. In this lesson we will learn how it works.

  4. A higher-level phylogenetic classification of the Fungi

    NARCIS (Netherlands)

    Hibbett, D.S.; Binder, M.; Bischoff, J.F.; Blackwell, M.; Cannon, P.F.; Eriksson, O.E.; Huhndorf, S.; James, T.; Kirk, P.M.; Lücking, R.; Thorsten Lumbsch, H.; Lutzoni, F.; Brandon Matheny, P.; McLaughlin, D.J.; Powell, M.J.; Redhead, S.; Schoch, C.L.; Spatafora, J.W.; Stalpers, J.A.; Vilgalys, R.; Aime, M.C.; Aptroot, A.; Bauer, R.; Begerow, D.; Benny, G.L.; Castlebury, L.A.; Crous, P.W.; Dai, Y.C.; Gams, W.; Geiser, D.M.; Griffith, G.W.; Gueidan, C.; Hawksworth, D.L.; Hestmark, G.; Hosaka, K.; Humber, R.A.; Hyde, K.D.; Ironside, J.E.; Koljalg, U.; Kurtzman, C.P.; Larsson, K.H.; Lichtwardt, R.; Longcore, J.; Miadlikowska, J.; Miller, A.; Moncalvo, J.M.; Mozley-Standridge, S.; Oberwinkler, F.; Parmasto, E.; Reeb, V.; Rogers, J.D.; Roux, Le C.; Ryvarden, L.; Sampaio, J.P.; Schüssler, A.; Sugiyama, J.; Thorn, R.G.; Tibell, L.; Untereiner, W.A.; Walker, C.; Wang, Z.; Weir, A.; Weiss, M.; White, M.M.; Winka, K.; Yao, Y.J.; Zhang, N.


    A comprehensive phylogenetic classification of the kingdom Fungi is proposed, with reference to recent molecular phylogenetic analyses, and with input from diverse members of the fungal taxonomic community. The classification includes 195 taxa, down to the level of order, of which 16 are described o

  5. Descriptive Statistics of the Genome: Phylogenetic Classification of Viruses. (United States)

    Hernandez, Troy; Yang, Jie


    The typical process for classifying and submitting a newly sequenced virus to the NCBI database involves two steps. First, a BLAST search is performed to determine likely family candidates. That is followed by checking the candidate families with the pairwise sequence alignment tool for similar species. The submitter's judgment is then used to determine the most likely species classification. The aim of this article is to show that this process can be automated into a fast, accurate, one-step process using the proposed alignment-free method and properly implemented machine learning techniques. We present a new family of alignment-free vectorizations of the genome, the generalized vector, that maintains the speed of existing alignment-free methods while outperforming all available methods. This new alignment-free vectorization uses the frequency of genomic words (k-mers), as is done in the composition vector, and incorporates descriptive statistics of those k-mers' positional information, as inspired by the natural vector. We analyze five different characterizations of genome similarity using k-nearest neighbor classification and evaluate these on two collections of viruses totaling over 10,000 viruses. We show that our proposed method performs better than, or as well as, other methods at every level of the phylogenetic hierarchy. The data and R code are available upon request.
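
    The idea of combining k-mer frequencies with positional statistics, then classifying by nearest neighbor, can be sketched as follows. This is an illustrative simplification, not the authors' generalized vector: it assumes only one positional statistic (the mean relative position of each k-mer), uses a single nearest neighbor with Euclidean distance, and the function names, sequences, and family labels are all hypothetical.

```python
from itertools import product
from math import dist


def kmer_position_vector(seq: str, k: int = 2) -> list[float]:
    """For each k-mer: relative frequency, then mean relative position (0.0 if absent)."""
    positions = {"".join(p): [] for p in product("ACGT", repeat=k)}
    n_windows = len(seq) - k + 1
    for i in range(n_windows):
        word = seq[i:i + k]
        if word in positions:
            # Position is scaled to [0, 1] so genomes of different length are comparable.
            positions[word].append(i / max(n_windows - 1, 1))
    vector = []
    for word in sorted(positions):
        hits = positions[word]
        vector.append(len(hits) / max(n_windows, 1))
        vector.append(sum(hits) / len(hits) if hits else 0.0)
    return vector


def nearest_label(query: str, labelled: list[tuple[str, str]], k: int = 2) -> str:
    """Return the label of the reference sequence closest to the query (1-NN)."""
    q = kmer_position_vector(query, k)
    return min(labelled, key=lambda item: dist(q, kmer_position_vector(item[0], k)))[1]


# Toy, made-up sequences and labels purely for illustration.
reference = [("AAAAAATTTTTT", "family_A"), ("GCGCGCGCGCGC", "family_B")]
predicted = nearest_label("AAAATTTTTTTT", reference)
```

    Extending the vector with further statistics (for example the variance of positions) would only require appending more values per k-mer.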

  6. Accurate reconstruction of insertion-deletion histories by statistical phylogenetics.

    Directory of Open Access Journals (Sweden)

    Oscar Westesson

    The Multiple Sequence Alignment (MSA) is a computational abstraction that represents a partial summary either of indel history, or of structural similarity. Taking the former view (indel history), it is possible to use formal automata theory to generalize the phylogenetic likelihood framework for finite substitution models (Dayhoff's probability matrices and Felsenstein's pruning algorithm) to arbitrary-length sequences. In this paper, we report results of a simulation-based benchmark of several methods for reconstruction of indel history. The methods tested include a relatively new algorithm for statistical marginalization of MSAs that sums over a stochastically-sampled ensemble of the most probable evolutionary histories. For mammalian evolutionary parameters on several different trees, the single most likely history sampled by our algorithm appears less biased than histories reconstructed by other MSA methods. The algorithm can also be used for alignment-free inference, where the MSA is explicitly summed out of the analysis. As an illustration of our method, we discuss reconstruction of the evolutionary histories of human protein-coding genes.

  7. Accurate molecular classification of cancer using simple rules

    Directory of Open Access Journals (Sweden)

    Gotoh Osamu


    Background: One intractable problem with using microarray data analysis for cancer classification is how to reduce the extremely high-dimensionality gene feature data to remove the effects of noise. Feature selection is often used to address this problem by selecting informative genes from among thousands or tens of thousands of genes. However, most of the existing methods of microarray-based cancer classification utilize too many genes to achieve accurate classification, which often hampers the interpretability of the models. For a better understanding of the classification results, it is desirable to develop simpler rule-based models with as few marker genes as possible. Methods: We screened a small number of informative single genes and gene pairs on the basis of their depended degrees proposed in rough sets. Applying the decision rules induced by the selected genes or gene pairs, we constructed cancer classifiers. We tested the efficacy of the classifiers by leave-one-out cross-validation (LOOCV) of training sets and classification of independent test sets. Results: We applied our methods to five cancerous gene expression datasets: leukemia (acute lymphoblastic leukemia [ALL] vs. acute myeloid leukemia [AML]), lung cancer, prostate cancer, breast cancer, and leukemia (ALL vs. mixed-lineage leukemia [MLL] vs. AML). Accurate classification outcomes were obtained by utilizing just one or two genes. Some genes that correlated closely with the pathogenesis of relevant cancers were identified. In terms of both classification performance and algorithm simplicity, our approach outperformed or at least matched existing methods. Conclusion: In cancerous gene expression datasets, a small number of genes, even one or two if selected correctly, is capable of achieving an ideal cancer classification effect. This finding also means that very simple rules may perform well for cancerous class prediction.
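
    Leave-one-out cross-validation of a one-gene decision rule can be sketched in a few lines. The rule below is a plain midpoint threshold between class means, swapped in for illustration; it is not the rough-set rule induction described in the abstract, it assumes exactly two classes that both remain present in every training fold, and the expression values and function names are made up.

```python
def fit_threshold_rule(values, labels):
    """Learn 'expression > t -> higher-mean class' from one gene; t is the midpoint of class means."""
    classes = sorted(set(labels))
    means = {
        c: sum(v for v, l in zip(values, labels) if l == c) / labels.count(c)
        for c in classes
    }
    low, high = sorted(classes, key=means.get)  # assumes exactly two classes
    threshold = (means[low] + means[high]) / 2
    return lambda x: high if x > threshold else low


def loocv_accuracy(values, labels):
    """Hold out each sample once, refit the rule on the rest, and score the held-out sample."""
    correct = 0
    for i in range(len(values)):
        train_values = values[:i] + values[i + 1:]
        train_labels = labels[:i] + labels[i + 1:]
        rule = fit_threshold_rule(train_values, train_labels)
        correct += rule(values[i]) == labels[i]
    return correct / len(values)


# Hypothetical expression levels of a single gene in six samples.
expression = [1.0, 1.2, 0.9, 3.1, 2.8, 3.3]
classes = ["ALL", "ALL", "ALL", "AML", "AML", "AML"]
accuracy = loocv_accuracy(expression, classes)
```

    Because the rule is refit inside every fold, the held-out sample never influences its own threshold, which is the point of the LOOCV estimate.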

  8. Redundancy-Free, Accurate Analytical Center Machine for Classification

    Institute of Scientific and Technical Information of China (English)

    ZHENG Fanzi; QIU Zhengding; Leng Yonggang; Yue Jianhai


    Analytical center machine (ACM) has remarkable generalization performance based on analytical center of version space and outperforms SVM. From the analysis of geometry of machine learning and principle of ACM, it is shown that some training patterns are redundant to the definition of version space. Redundant patterns push the ACM classifier away from the analytical center of the prime version space so that the generalization performance degrades; at the same time, redundant patterns slow down the classifier and reduce the efficiency of storage. Thus, an incremental algorithm is proposed to remove redundant patterns and is embedded into the ACM framework, yielding a redundancy-free accurate analytical center machine (RFA-ACM) for classification. Experiments with the Heart, Thyroid, and Banana datasets demonstrate the validity of RFA-ACM.

  9. Automatic classification and accurate size measurement of blank mask defects (United States)

    Bhamidipati, Samir; Paninjath, Sankaranarayanan; Pereira, Mark; Buck, Peter


    complexity of defects encountered. The variety arises due to factors such as defect nature, size, shape and composition; and the optical phenomena occurring around the defect. This paper focuses on preliminary characterization results, in terms of classification and size estimation, obtained by Calibre MDPAutoClassify tool on a variety of mask blank defects. It primarily highlights the challenges faced in achieving the results with reference to the variety of defects observed on blank mask substrates and the underlying complexities which make accurate defect size measurement an important and challenging task.

  10. Accurate mobile malware detection and classification in the cloud. (United States)

    Wang, Xiaolei; Yang, Yuexiang; Zeng, Yingzhi


    As the dominant smartphone operating system, Android has attracted the attention of malware authors and researchers alike. The number of types of Android malware is increasing rapidly despite the considerable number of proposed malware analysis systems. In this paper, by taking advantage of the low false-positive rate of misuse detection and the ability of anomaly detection to detect zero-day malware, we propose a novel hybrid detection system based on a new open-source framework, CuckooDroid, which enables the use of Cuckoo Sandbox's features to analyze Android malware through dynamic and static analysis. Our proposed system mainly consists of two parts: an anomaly detection engine that detects abnormal apps through dynamic analysis, and a signature detection engine that detects and classifies known malware by combining static and dynamic analysis. We evaluate our system using 5560 malware samples and 6000 benign samples. Experiments show that our anomaly detection engine with dynamic analysis is capable of detecting zero-day malware with a low false negative rate (1.16%) and an acceptable false positive rate (1.30%); it is worth noting that our signature detection engine with hybrid analysis can accurately classify malware samples with an average positive rate of 98.94%. Considering the intensive computing resources required by the static and dynamic analysis, our proposed detection system should be deployed off-device, such as in the cloud. App store markets and ordinary users can access our detection system for malware detection through a cloud service.

  11. How accurate are the European Union's classifications of chemical substances. (United States)

    Rudén, Christina; Hansson, Sven Ove


    The European Commission has decided on harmonized classifications for a large number of individual chemicals according to its own directive for classification and labeling of dangerous substances. We have compared the harmonized classifications for acute oral toxicity to the acute oral toxicity data available in the RTECS database. Of the 992 substances eligible for this comparison, 15% were assigned a too low danger class and 8% a too high danger class according to the RTECS data. Due to insufficient transparency (scientific documentation of the classification decisions is not available), the causes of this discrepancy can only be hypothesized. We propose that the scientific motivations of future classifications be published and that the apparent over- and underclassifications in the present system be either explained or rectified, according to what are the facts in the matter.

  12. HIPPI: highly accurate protein family classification with ensembles of HMMs

    Directory of Open Access Journals (Sweden)

    Nam-phuong Nguyen


    Background: Given a new biological sequence, detecting membership in a known family is a basic step in many bioinformatics analyses, with applications to protein structure and function prediction and metagenomic taxon identification and abundance profiling, among others. Yet family identification of sequences that are distantly related to sequences in public databases or that are fragmentary remains one of the more difficult analytical problems in bioinformatics. Results: We present a new technique for family identification called HIPPI (Hierarchical Profile Hidden Markov Models for Protein family Identification). HIPPI uses a novel technique to represent a multiple sequence alignment for a given protein family or superfamily by an ensemble of profile hidden Markov models computed using HMMER. An evaluation of HIPPI on the Pfam database shows that HIPPI has better overall precision and recall than blastp, HMMER, and pipelines based on HHsearch, and maintains good accuracy even for fragmentary query sequences and for protein families with low average pairwise sequence identity, both conditions where other methods degrade in accuracy. Conclusion: HIPPI provides accurate protein family identification and is robust to difficult model conditions. Our results, combined with observations from previous studies, show that ensembles of profile Hidden Markov models can better represent multiple sequence alignments than a single profile Hidden Markov model, and thus can improve downstream analyses for various bioinformatic tasks. Further research is needed to determine the best practices for building the ensemble of profile Hidden Markov models. HIPPI is available on GitHub at .

  13. Accurate Classification of RNA Structures Using Topological Fingerprints (United States)

    Li, Kejie; Gribskov, Michael


    While RNAs are well known to possess complex structures, functionally similar RNAs often have little sequence similarity. While the exact size and spacing of base-paired regions vary, functionally similar RNAs have pronounced similarity in the arrangement, or topology, of base-paired stems. Furthermore, predicted RNA structures often lack pseudoknots (a crucial aspect of biological activity), and are only partially correct, or incomplete. A topological approach addresses all of these difficulties. In this work we describe each RNA structure as a graph that can be converted to a topological spectrum (RNA fingerprint). The set of subgraphs in an RNA structure, its RNA fingerprint, can be compared with the fingerprints of other RNA structures to identify and correctly classify functionally related RNAs. Topologically similar RNAs can be identified even when a large fraction, up to 30%, of the stems are omitted, indicating that highly accurate structures are not necessary. We investigate the performance of the RNA fingerprint approach on a set of eight highly curated RNA families, with diverse sizes and functions, containing pseudoknots, and with little sequence similarity–an especially difficult test set. In spite of the difficult test set, the RNA fingerprint approach is very successful (ROC AUC > 0.95). Due to the inclusion of pseudoknots, the RNA fingerprint approach both covers a wider range of possible structures than methods based only on secondary structure, and its tolerance for incomplete structures suggests that it can be applied even to predicted structures. Source code is freely available at PMID:27755571

  14. The phagotrophic origin of eukaryotes and phylogenetic classification of Protozoa. (United States)

    Cavalier-Smith, T


    ancestrally biciliate clade, named 'bikonts'. The apparently conflicting rRNA and protein trees can be reconciled with each other and this ultrastructural interpretation if long-branch distortions, some mechanistically explicable, are allowed for. Bikonts comprise two groups: corticoflagellates, with a younger anterior cilium, no centrosomal cone and ancestrally a semi-rigid cell cortex with a microtubular band on either side of the posterior mature centriole; and Rhizaria [a new infrakingdom comprising Cercozoa (now including Ascetosporea classis nov.), Retaria phylum nov., Heliozoa and Apusozoa phylum nov.], having a centrosomal cone or radiating microtubules and two microtubular roots and a soft surface, frequently with reticulopodia. Corticoflagellates comprise photokaryotes (Plantae and chromalveolates, both ancestrally with cortical alveoli) and Excavata (a new protozoan infrakingdom comprising Loukozoa, Discicristata and Archezoa, ancestrally with three microtubular roots). All basal eukaryotic radiations were of mitochondrial aerobes; hydrogenosomes evolved polyphyletically from mitochondria long afterwards, the persistence of their double envelope long after their genomes disappeared being a striking instance of membrane heredity. I discuss the relationship between the 13 protozoan phyla recognized here and revise higher protozoan classification by updating as subkingdoms Lankester's 1878 division of Protozoa into Corticata (Excavata, Alveolata; with prominent cortical microtubules and ancestrally localized cytostome--the Parabasalia probably secondarily internalized the cytoskeleton) and Gymnomyxa [infrakingdoms Sarcomastigota (Choanozoa, Amoebozoa) and Rhizaria; both ancestrally with a non-cortical cytoskeleton of radiating singlet microtubules and a relatively soft cell surface with diffused feeding]. As the eukaryote root almost certainly lies within Gymnomyxa, probably among the Sarcomastigota, Corticata are derived. 
    Following the single symbiogenetic origin of

  15. Phylogeny and phylogenetic classification of the antbirds, ovenbirds, woodcreepers, and allies (Aves: Passeriformes: Infraorder Furnariides) (United States)

    Moyle, R.G.; Chesser, R.T.; Brumfield, R.T.; Tello, J.G.; Marchese, D.J.; Cracraft, J.


    The infraorder Furnariides is a diverse group of suboscine passerine birds comprising a substantial component of the Neotropical avifauna. The included species encompass a broad array of morphologies and behaviours, making them appealing for evolutionary studies, but the size of the group (ca. 600 species) has limited well-sampled higher-level phylogenetic studies. Using DNA sequence data from the nuclear RAG-1 and RAG-2 exons, we undertook a phylogenetic analysis of the Furnariides sampling 124 (more than 88%) of the genera. Basal relationships among family-level taxa differed depending on phylogenetic method, but all topologies had little nodal support, mirroring the results from earlier studies in which discerning relationships at the base of the radiation was also difficult. In contrast, branch support for family-rank taxa and for many relationships within those clades was generally high. Our results support the Melanopareidae and Grallariidae as distinct from the Rhinocryptidae and Formicariidae, respectively. Within the Furnariides our data contradict some recent phylogenetic hypotheses and suggest that further study is needed to resolve these discrepancies. Of the few genera represented by multiple species, several were not monophyletic, indicating that additional systematic work remains within furnariine families and must include dense taxon sampling. We use this study as a basis for proposing a new phylogenetic classification for the group and in the process erect new family-group names for clades having high branch support across methods. © 2009 The Willi Hennig Society.

  16. A bootstrap based analysis pipeline for efficient classification of phylogenetically related animal miRNAs

    Directory of Open Access Journals (Sweden)

    Gu Xun


    Background: Phylogenetically related miRNAs (miRNA families) convey important information about the function and evolution of miRNAs. Due to the special sequence features of miRNAs, pair-wise sequence identity between miRNA precursors alone is often inadequate for unequivocally judging the phylogenetic relationships between miRNAs. Most of the current methods for miRNA classification rely heavily on manual inspection and lack measurements of the reliability of the results. Results: In this study, we designed an analysis pipeline (the Phylogeny-Bootstrap-Cluster (PBC) pipeline) to identify miRNA families based on branch stability in the bootstrap trees derived from overlapping genome-wide miRNA sequence sets. We tested the PBC analysis pipeline with the miRNAs from six animal species, H. sapiens, M. musculus, G. gallus, D. rerio, D. melanogaster, and C. elegans. The resulting classification was compared with the miRNA families defined in miRBase. The two classifications were largely consistent. Conclusion: The PBC analysis pipeline is an efficient method for classifying large numbers of heterogeneous miRNA sequences. It requires minimum human involvement and provides measurements of the reliability of the classification results.

  17. Molecular phylogenetic perspectives for character classification and convergence: Framing some issues with nematode vulval appendages and telotylenchid tail termini (United States)

    Characters flagged as convergent based on newer molecular phylogenetic trees inform both practical identification and more esoteric classification. Nematode morphological characters such as lateral lines, bullae and laciniae are quite independent structures from those similarly named in other organi...

  18. Using ESTs for phylogenomics: Can one accurately infer a phylogenetic tree from a gappy alignment?

    Directory of Open Access Journals (Sweden)

    Hartmann Stefanie


    Background: While full genome sequences are still only available for a handful of taxa, large collections of partial gene sequences are available for many more. The alignment of partial gene sequences results in a multiple sequence alignment containing large gaps that are arranged in a staggered pattern. The consequences of this pattern of missing data on the accuracy of phylogenetic analysis are not well understood. We conducted a simulation study to determine the accuracy of phylogenetic trees obtained from gappy alignments using three commonly used phylogenetic reconstruction methods (Neighbor Joining, Maximum Parsimony, and Maximum Likelihood) and studied ways to improve the accuracy of trees obtained from such datasets. Results: We found that the pattern of gappiness in multiple sequence alignments derived from partial gene sequences substantially compromised phylogenetic accuracy even in the absence of alignment error. The decline in accuracy was beyond what would be expected based on the amount of missing data. The decline was particularly dramatic for Neighbor Joining and Maximum Parsimony, where the majority of gappy alignments contained 25% to 40% incorrect quartets. To improve the accuracy of the trees obtained from a gappy multiple sequence alignment, we examined two approaches. In the first approach, alignment masking, potentially problematic columns and input sequences are excluded from the dataset. Even in the absence of alignment error, masking improved phylogenetic accuracy up to 100-fold. However, masking retained, on average, only 83% of the input sequences. In the second approach, alignment subdivision, the missing data is statistically modelled in order to retain as many sequences as possible in the phylogenetic analysis. Subdivision resulted in more modest improvements to alignment accuracy, but succeeded in including almost all of the input sequences. Conclusion: These results demonstrate that partial gene

  19. Classification of Phylogenetic Profiles for Protein Function Prediction: An SVM Approach (United States)

    Kotaru, Appala Raju; Joshi, Ramesh C.

    Predicting the function of an uncharacterized protein is a major challenge in the post-genomic era due to the problem's complexity and scale. Knowledge of protein function is a crucial link in the development of new drugs, better crops, and even biochemicals such as biofuels. Recently, numerous high-throughput experimental procedures have been invented to investigate the mechanisms leading to the accomplishment of a protein's function, and the phylogenetic profile is one of them. A phylogenetic profile is a way of representing a protein that encodes the evolutionary history of proteins. In this paper we propose a method for classifying phylogenetic profiles using a supervised machine learning method, support vector machine (SVM) classification with a radial basis function kernel, to identify functionally linked proteins. We experimentally evaluated the performance of the classifier with the linear and polynomial kernels and compared the results with the existing tree kernel. In our study we used proteins of the budding yeast Saccharomyces cerevisiae genome. We generated the phylogenetic profiles of 2465 yeast genes and used the functional annotations that are available in the MIPS database. Our experiments show that the radial basis kernel performs similarly to the polynomial kernel in some functional classes, that together they are better than the linear and tree kernels, and that overall the radial basis kernel outperformed the polynomial, linear, and tree kernels. In analyzing these results we show that it is feasible to use an SVM classifier with a radial basis function kernel to predict gene functionality using phylogenetic profiles.
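
    To make the radial basis function kernel concrete, the sketch below evaluates K(x, y) = exp(-gamma * ||x - y||^2) on binary phylogenetic profiles, where each position records whether a homolog of the protein is present (1) or absent (0) in one reference genome. The profiles, the gamma value, and the function name are hypothetical; the abstract does not state the encoding or the kernel parameters the authors used.

```python
from math import exp


def rbf_kernel(x, y, gamma=0.5):
    """Radial basis function kernel: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return exp(-gamma * squared_distance)


# Made-up presence/absence profiles across five reference genomes.
profile_a = [1, 0, 1, 1, 0]
profile_b = [1, 0, 1, 0, 0]  # differs from profile_a in one genome
profile_c = [0, 1, 0, 0, 1]  # differs from profile_a in all five genomes

k_ab = rbf_kernel(profile_a, profile_b)
k_ac = rbf_kernel(profile_a, profile_c)
```

    Proteins with similar profiles get kernel values close to 1 and dissimilar ones values close to 0, which is the similarity an SVM would use to separate functional classes.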

  20. Accurate and interpretable classification of microspectroscopy pixels using artificial neural networks. (United States)

    Manescu, Petru; Jong Lee, Young; Camp, Charles; Cicerone, Marcus; Brady, Mary; Bajcsy, Peter


    This paper addresses the problem of classifying materials from microspectroscopy at a pixel level. The challenges lie in identifying discriminatory spectral features and obtaining accurate and interpretable models relating spectra and class labels. We approach the problem by designing a supervised classifier from a tandem of Artificial Neural Network (ANN) models that identify relevant features in raw spectra and achieve high classification accuracy. The tandem of ANN models is meshed with classification rule extraction methods to lower the model complexity and to achieve interpretability of the resulting model. The contribution of the work is in designing each ANN model based on the microspectroscopy hypothesis about a discriminatory feature of a certain target class being composed of a linear combination of spectra. The novelty lies in meshing ANN and decision rule models into a tandem configuration to achieve accurate and interpretable classification results. The proposed method was evaluated using a set of broadband coherent anti-Stokes Raman scattering (BCARS) microscopy cell images (600 000  pixel-level spectra) and a reference four-class rule-based model previously created by biochemical experts. The generated classification rule-based model was on average 85% accurate measured by the DICE pixel label similarity metric, and on average 96% similar to the reference rules measured by the vector cosine metric.
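
    The two evaluation metrics named above can be written down compactly. This is a minimal sketch assuming the standard per-class Dice coefficient over pixel labels and the standard cosine similarity between numeric vectors; how the authors aggregate Dice across classes or encode rules as vectors is not given in the abstract, and the labels and vectors below are invented.

```python
from math import sqrt


def dice_similarity(predicted, reference, cls):
    """Dice coefficient for one class: 2 * |P and R| / (|P| + |R|) over pixel labels."""
    predicted_hits = sum(p == cls for p in predicted)
    reference_hits = sum(r == cls for r in reference)
    both = sum(p == cls and r == cls for p, r in zip(predicted, reference))
    total = predicted_hits + reference_hits
    return 2 * both / total if total else 1.0  # class absent from both: treat as perfect agreement


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical pixel labels from a generated model and a reference model.
predicted_labels = ["lipid", "lipid", "protein", "water"]
reference_labels = ["lipid", "protein", "protein", "water"]
lipid_dice = dice_similarity(predicted_labels, reference_labels, "lipid")
rule_cosine = cosine_similarity([1.0, 0.0], [1.0, 1.0])
```

    Dice rewards overlap in the pixel labels themselves, while the cosine metric compares the direction of two vectors regardless of their magnitude.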

  1. Accurate crop classification using hierarchical genetic fuzzy rule-based systems (United States)

    Topaloglou, Charalampos A.; Mylonas, Stelios K.; Stavrakoudis, Dimitris G.; Mastorocostas, Paris A.; Theocharis, John B.


    This paper investigates the effectiveness of an advanced classification system for accurate crop classification using very high resolution (VHR) satellite imagery. Specifically, a recently proposed genetic fuzzy rule-based classification system (GFRBCS) is employed, namely, the Hierarchical Rule-based Linguistic Classifier (HiRLiC). HiRLiC's model comprises a small set of simple IF-THEN fuzzy rules, easily interpretable by humans. One of its most important attributes is that its learning algorithm requires minimum user interaction, since the most important learning parameters affecting the classification accuracy are determined by the learning algorithm automatically. HiRLiC is applied in a challenging crop classification task, using a SPOT5 satellite image over an intensively cultivated area in a lake-wetland ecosystem in northern Greece. A rich set of higher-order spectral and textural features is derived from the initial bands of the (pan-sharpened) image, resulting in an input space comprising 119 features. The experimental analysis proves that HiRLiC compares favorably to other interpretable classifiers of the literature, both in terms of structural complexity and classification accuracy. Its testing accuracy was very close to that obtained by complex state-of-the-art classification systems, such as the support vector machines (SVM) and random forest (RF) classifiers. Nevertheless, visual inspection of the derived classification maps shows that HiRLiC is characterized by higher generalization properties, providing more homogeneous classifications than the competitors. Moreover, the runtime required to produce the thematic map was orders of magnitude lower than that of the competitors.

  2. Molecular phylogenetic evaluation of classification and scenarios of character evolution in calcareous sponges (Porifera, Class Calcarea).

    Directory of Open Access Journals (Sweden)

    Oliver Voigt

    Calcareous sponges (Phylum Porifera, Class Calcarea) are known to be taxonomically difficult. Previous molecular studies have revealed many discrepancies between classically recognized taxa and the observed relationships at the order, family and genus levels; these inconsistencies question underlying hypotheses regarding the evolution of certain morphological characters. Therefore, we extended the available taxa and character set by sequencing the complete small subunit (SSU) rDNA and the almost complete large subunit (LSU) rDNA of additional key species and complemented this dataset by substantially increasing the length of available LSU sequences. Phylogenetic analyses provided new hypotheses about the relationships of Calcarea and about the evolution of certain morphological characters. We tested our phylogeny against competing phylogenetic hypotheses presented by previous classification systems. Our data reject the current order-level classification by again finding non-monophyletic Leucosolenida, Clathrinida and Murrayonida. In the subclass Calcinea, we recovered a clade that includes all species with a cortex, which is largely consistent with the previously proposed order Leucettida. Other orders that had been rejected in the current system were not found, but could not be rejected in our tests either. We found several additional families and genera polyphyletic: the families Leucascidae and Leucaltidae and the genus Leucetta in Calcinea, and in Calcaronea the family Amphoriscidae and the genus Ute. Our phylogeny also provided support for the vaguely suspected close relationship of several members of Grantiidae with giant cortical diactines to members of Heteropiidae. Similarly, our analyses revealed several unexpected affinities, such as a sister group relationship between Leucettusa (Leucaltidae) and Leucettidae and between Leucascandra (Jenkinidae) and Sycon carteri (Sycettidae). According to our results, the taxonomy of Calcarea is in

  3. Molecular phylogenetic evaluation of classification and scenarios of character evolution in calcareous sponges (Porifera, Class Calcarea). (United States)

    Voigt, Oliver; Wülfing, Eilika; Wörheide, Gert


    Calcareous sponges (Phylum Porifera, Class Calcarea) are known to be taxonomically difficult. Previous molecular studies have revealed many discrepancies between classically recognized taxa and the observed relationships at the order, family and genus levels; these inconsistencies question underlying hypotheses regarding the evolution of certain morphological characters. Therefore, we extended the available taxa and character set by sequencing the complete small subunit (SSU) rDNA and the almost complete large subunit (LSU) rDNA of additional key species and complemented this dataset by substantially increasing the length of available LSU sequences. Phylogenetic analyses provided new hypotheses about the relationships of Calcarea and about the evolution of certain morphological characters. We tested our phylogeny against competing phylogenetic hypotheses presented by previous classification systems. Our data reject the current order-level classification by again finding non-monophyletic Leucosolenida, Clathrinida and Murrayonida. In the subclass Calcinea, we recovered a clade that includes all species with a cortex, which is largely consistent with the previously proposed order Leucettida. Other orders that had been rejected in the current system were not found, but could not be rejected in our tests either. We found several additional families and genera polyphyletic: the families Leucascidae and Leucaltidae and the genus Leucetta in Calcinea, and in Calcaronea the family Amphoriscidae and the genus Ute. Our phylogeny also provided support for the vaguely suspected close relationship of several members of Grantiidae with giant cortical diactines to members of Heteropiidae. Similarly, our analyses revealed several unexpected affinities, such as a sister group relationship between Leucettusa (Leucaltidae) and Leucettidae and between Leucascandra (Jenkinidae) and Sycon carteri (Sycettidae). According to our results, the taxonomy of Calcarea is in desperate need of a

  4. Phylogenetic Classification Of Bartonella Species By Comparing The Two-Component System Response Regulator Feup Sequences

    Directory of Open Access Journals (Sweden)

    Mhamad Abou-Hamdan


    The bacterial genus Bartonella is classified in the alpha-2 Proteobacteria on the basis of 16S rDNA sequence comparison. The Bartonella two-component system feuPQ is found in nearly all bacterial species. We investigated the usefulness of the response regulator feuP gene sequence in the classification of 18 well characterized Bartonella species. Phylogenetic relationships were inferred using parsimony, neighbour-joining and maximum-likelihood methods. Reliable classifications of most of the studied species were obtained. Bartonella were divided into two supported clades containing two supported clusters each. These results were similar to our previous data obtained with groEL, ftsZ and ribC gene sequences. The wide range of feuP DNA sequence similarity (78.6 to 96.5%) among Bartonella species makes it a promising candidate for multi-locus sequence typing (MLST) of clinical isolates. This is the first report proving the usefulness of feuP sequences in bartonellae classification at the species level.

  5. Rapid phylogenetic and functional classification of short genomic fragments with signature peptides

    Directory of Open Access Journals (Sweden)

    Berendzen Joel


    Background: Classification is difficult for shotgun metagenomics data from environments such as soils, where the diversity of sequences is high and where reference sequences from close relatives may not exist. Approaches based on sequence-similarity scores must deal with the confounding effects that inheritance and functional pressures exert on the relation between scores and phylogenetic distance, while approaches based on sequence alignment and tree-building are typically limited to a small fraction of gene families. We describe an approach based on finding one or more exact matches between a read and a precomputed set of peptide 10-mers. Results: At even the largest phylogenetic distances, thousands of 10-mer peptide exact matches can be found between pairs of bacterial genomes. Genes that share one or more peptide 10-mers typically have high reciprocal BLAST scores. Among a set of 403 representative bacterial genomes, some 20 million 10-mer peptides were found to be shared. We assign each of these peptides as a signature of a particular node in a phylogenetic reference tree based on the RNA polymerase genes. We classify the phylogeny of a genomic fragment (e.g., read) at the most specific node on the reference tree that is consistent with the phylogeny of observed signature peptides it contains. Using both synthetic data from four newly-sequenced soil-bacterium genomes and ten real soil metagenomics data sets, we demonstrate a sensitivity and specificity comparable to that of the MEGAN metagenomics analysis package using BLASTX against the NR database. Phylogenetic and functional similarity metrics applied to real metagenomics data indicate a signal-to-noise ratio of approximately 400 for distinguishing among environments. Our method assigns ~6.6 Gbp/hr on a single CPU, compared with 25 kbp/hr for methods based on BLASTX against the NR database. Conclusions: Classification by exact matching against a precomputed list of signature
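
    The exact-match assignment described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the tiny reference tree, the signature table and the function names are hypothetical, the input is assumed to be an already translated peptide (six-frame translation of the read is omitted), and "most specific consistent node" is interpreted here as the deepest matched node when all hits lie on one lineage, falling back to the lowest common ancestor when they conflict.

```python
from typing import Dict, Iterator, List, Optional

K = 10  # peptide word length used in the abstract

# Hypothetical reference tree, stored as child -> parent (the root has parent None).
PARENT: Dict[str, Optional[str]] = {
    "Bacteria": None,
    "Proteobacteria": "Bacteria",
    "Firmicutes": "Bacteria",
    "Pseudomonas": "Proteobacteria",
    "Bacillus": "Firmicutes",
}


def lineage(node: str) -> List[str]:
    """Return the path from the root down to `node`."""
    path: List[str] = []
    cur: Optional[str] = node
    while cur is not None:
        path.append(cur)
        cur = PARENT[cur]
    return path[::-1]


def kmers(peptide: str, k: int = K) -> Iterator[str]:
    """Yield every overlapping k-mer of a peptide string."""
    for i in range(len(peptide) - k + 1):
        yield peptide[i:i + k]


def classify(peptide: str, signatures: Dict[str, str]) -> Optional[str]:
    """Assign a peptide to a tree node from its exact signature matches."""
    hits = [signatures[w] for w in kmers(peptide) if w in signatures]
    if not hits:
        return None  # no signature peptide found: leave unclassified
    paths = [lineage(n) for n in hits]
    longest = max(paths, key=len)
    # All hits on a single root-to-leaf lineage: take the deepest node.
    if all(p == longest[:len(p)] for p in paths):
        return longest[-1]
    # Conflicting hits: back off to the lowest common ancestor.
    common = None
    for level in zip(*paths):
        if len(set(level)) != 1:
            break
        common = level[0]
    return common


# Hypothetical signature table: 10-mer -> node it is a signature of.
SIGNATURES = {
    "MKTAYIAKQR": "Pseudomonas",
    "AYIAKQRQIS": "Proteobacteria",
    "GGGGGGGGGG": "Bacillus",
}
```

    Because each lookup is a dictionary access on a fixed-length word, the cost per read grows only with read length, which is consistent with the throughput advantage over BLASTX that the abstract reports.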

  6. HMM-FRAME: accurate protein domain classification for metagenomic sequences containing frameshift errors

    Directory of Open Access Journals (Sweden)

    Sun Yanni


    Background: Protein domain classification is an important step in metagenomic annotation. The state-of-the-art method for protein domain classification is profile HMM-based alignment. However, the relatively high rates of insertions and deletions in homopolymer regions of pyrosequencing reads create frameshifts, causing conventional profile HMM alignment tools to generate alignments with marginal scores. This makes error-containing gene fragments unclassifiable with conventional tools. Thus, there is a need for an accurate domain classification tool that can detect and correct sequencing errors. Results: We introduce HMM-FRAME, a protein domain classification tool based on an augmented Viterbi algorithm that can incorporate error models from different sequencing platforms. HMM-FRAME corrects sequencing errors and classifies putative gene fragments into domain families. It achieved high error detection sensitivity and specificity in a data set with annotated errors. We applied HMM-FRAME in Targeted Metagenomics and a published metagenomic data set. The results showed that our tool can correct frameshifts in error-containing sequences, generate much longer alignments with significantly smaller E-values, and classify more sequences into their native families. Conclusions: HMM-FRAME provides a complementary protein domain classification tool to conventional profile HMM-based methods for data sets containing frameshifts. Its current implementation is best used for small-scale metagenomic data sets. The source code of HMM-FRAME can be downloaded at and at

  7. Accurate and reliable cancer classification based on probabilistic inference of pathway activity.

    Directory of Open Access Journals (Sweden)

    Junjie Su

    With the advent of high-throughput technologies for measuring genome-wide expression profiles, a large number of methods have been proposed for discovering diagnostic markers that can accurately discriminate between different classes of a disease. However, factors such as the small sample size of typical clinical data, the inherent noise in high-throughput measurements, and the heterogeneity across different samples, often make it difficult to find reliable gene markers. To overcome this problem, several studies have proposed the use of pathway-based markers, instead of individual gene markers, for building the classifier. Given a set of known pathways, these methods estimate the activity level of each pathway by summarizing the expression values of its member genes, and use the pathway activities for classification. It has been shown that pathway-based classifiers typically yield more reliable results compared to traditional gene-based classifiers. In this paper, we propose a new classification method based on probabilistic inference of pathway activities. For a given sample, we compute the log-likelihood ratio between different disease phenotypes based on the expression level of each gene. The activity of a given pathway is then inferred by combining the log-likelihood ratios of the constituent genes. We apply the proposed method to the classification of breast cancer metastasis, and show that it achieves higher accuracy and identifies more reproducible pathway markers compared to several existing pathway activity inference methods.
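
    The two steps named in this abstract (a per-gene log-likelihood ratio, then a combination over the pathway's genes) can be sketched as follows. This is an illustration under stated assumptions, not the paper's exact procedure: each gene's expression is assumed to be Gaussian within each phenotype, the ratios are combined by a plain sum, and all function names and the toy training values are hypothetical.

```python
import math
from statistics import mean, stdev
from typing import Dict, Sequence, Tuple

GaussParams = Tuple[float, float]            # (mean, standard deviation)
GeneModel = Tuple[GaussParams, GaussParams]  # (phenotype A, phenotype B)


def gaussian_logpdf(x: float, mu: float, sigma: float) -> float:
    """Log density of a normal distribution N(mu, sigma^2) at x."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)


def fit_gene(values_a: Sequence[float], values_b: Sequence[float]) -> GeneModel:
    """Estimate one Gaussian per phenotype from training expression values."""
    return (mean(values_a), stdev(values_a)), (mean(values_b), stdev(values_b))


def gene_llr(x: float, model: GeneModel) -> float:
    """Log-likelihood ratio of phenotype A versus B for one expression value."""
    (mu_a, sd_a), (mu_b, sd_b) = model
    return gaussian_logpdf(x, mu_a, sd_a) - gaussian_logpdf(x, mu_b, sd_b)


def pathway_activity(sample: Dict[str, float],
                     pathway: Sequence[str],
                     models: Dict[str, GeneModel]) -> float:
    """Combine member-gene ratios (assumed here: a simple sum)."""
    return sum(gene_llr(sample[g], models[g]) for g in pathway)


# Toy, made-up training data for two genes of one hypothetical pathway.
MODELS = {
    "g1": fit_gene([2.0, 2.2, 1.8], [0.0, 0.2, -0.2]),
    "g2": fit_gene([5.0, 5.5, 4.5], [3.0, 3.5, 2.5]),
}
PATHWAY = ["g1", "g2"]
```

    With this sign convention, a positive activity favours phenotype A and a negative one favours phenotype B; the resulting per-pathway values would then serve as features for a downstream classifier.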

  8. Towards a formal genealogical classification of the Lezgian languages (North Caucasus): testing various phylogenetic methods on lexical data.

    Directory of Open Access Journals (Sweden)

    Alexei Kassian

    A lexicostatistical classification is proposed for 20 languages and dialects of the Lezgian group of the North Caucasian family, based on meticulously compiled 110-item wordlists, published as part of the Global Lexicostatistical Database project. The lexical data have been subsequently analyzed with the aid of the principal phylogenetic methods, both distance-based and character-based: Starling neighbor joining (StarlingNJ), Neighbor joining (NJ), Unweighted pair group method with arithmetic mean (UPGMA), Bayesian Markov chain Monte Carlo (MCMC), Unweighted maximum parsimony (UMP). Cognation indexes within the input matrix were marked by two different algorithms: traditional etymological approach and phonetic similarity, i.e., the automatic method of consonant classes (Levenshtein distances). Due to certain reasons (first of all, high lexicographic quality of the wordlists and a consensus about the Lezgian phylogeny among Caucasologists), the Lezgian database is a perfect testing area for appraisal of phylogenetic methods. For the etymology-based input matrix, all the phylogenetic methods, with the possible exception of UMP, have yielded trees that are sufficiently compatible with each other to generate a consensus phylogenetic tree of the Lezgian lects. The obtained consensus tree agrees with the traditional expert classification as well as some of the previously proposed formal classifications of this linguistic group. Contrary to theoretical expectations, the UMP method has suggested the least plausible tree of all. In the case of the phonetic similarity-based input matrix, the distance-based methods (StarlingNJ, NJ, UPGMA) have produced the trees that are rather close to the consensus etymology-based tree and the traditional expert classification, whereas the character-based methods (Bayesian MCMC, UMP) have yielded less likely topologies.
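
    As a rough illustration of the phonetic-similarity idea mentioned in this abstract, the sketch below reduces words to strings of consonant classes and compares them with the Levenshtein edit distance. The class table is a toy assumption made up for this example (it is not the table used by the Global Lexicostatistical Database), and the function names are hypothetical.

```python
from typing import Dict, List

# Toy consonant-class table (an assumption for illustration only).
CONSONANT_CLASS: Dict[str, str] = {
    "p": "P", "b": "P", "f": "P",
    "t": "T", "d": "T",
    "k": "K", "g": "K",
    "s": "S", "z": "S",
    "m": "M", "n": "N",
    "r": "R", "l": "R",
}


def to_classes(word: str) -> str:
    """Map consonants to their class symbol and drop everything else."""
    return "".join(CONSONANT_CLASS[ch] for ch in word.lower() if ch in CONSONANT_CLASS)


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-symbol insertions, deletions and substitutions."""
    prev: List[int] = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            cur.append(min(prev[j] + 1,          # delete from a
                           cur[j - 1] + 1,       # insert into a
                           prev[j - 1] + cost))  # substitute or match
        prev = cur
    return prev[-1]


def class_distance(word_a: str, word_b: str) -> int:
    """Edit distance between the consonant-class skeletons of two words."""
    return levenshtein(to_classes(word_a), to_classes(word_b))
```

    Pairwise distances of this kind, collected over a wordlist, would form the input matrix for the distance-based tree methods (NJ, UPGMA) that the abstract compares.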

  9. Molecular and morphological data supporting phylogenetic reconstruction of the genus Goniothalamus (Annonaceae), including a reassessment of previous infrageneric classifications

    Directory of Open Access Journals (Sweden)

    Chin Cheung Tang


    Data is presented in support of a phylogenetic reconstruction of the species-rich early-divergent angiosperm genus Goniothalamus (Annonaceae) (Tang et al., Mol. Phylogenetic Evol., 2015) [1], inferred using chloroplast DNA (cpDNA) sequences. The data includes a list of primers for amplification and sequencing for nine cpDNA regions: atpB-rbcL, matK, ndhF, psbA-trnH, psbM-trnD, rbcL, trnL-F, trnS-G, and ycf1, the voucher information and molecular data (GenBank accession numbers) of 67 ingroup Goniothalamus accessions and 14 outgroup accessions selected from across the tribe Annoneae, and aligned data matrices for each gene region. We also present our Bayesian phylogenetic reconstructions for Goniothalamus, with information on previous infrageneric classifications superimposed to enable an evaluation of monophyly, together with a taxon-character data matrix (with 15 morphological characters scored for 66 Goniothalamus species and seven other species from the tribe Annoneae) that are shown to be phylogenetically correlated.

  10. Photometric brown-dwarf classification. I. A method to identify and accurately classify large samples of brown dwarfs without spectroscopy (United States)

    Skrzypek, N.; Warren, S. J.; Faherty, J. K.; Mortlock, D. J.; Burgasser, A. J.; Hewett, P. C.


    Aims: We present a method, named photo-type, to identify and accurately classify L and T dwarfs onto the standard spectral classification system using photometry alone. This enables the creation of large and deep homogeneous samples of these objects efficiently, without the need for spectroscopy. Methods: We created a catalogue of point sources with photometry in 8 bands, ranging from 0.75 to 4.6 μm, selected from an area of 3344 deg^2, by combining SDSS, UKIDSS LAS, and WISE data. Sources with 13.0 < J < 17.5, and Y - J > 0.8, were then classified by comparison against template colours of quasars, stars, and brown dwarfs. The L and T templates, spectral types L0 to T8, were created by identifying previously known sources with spectroscopic classifications, and fitting polynomial relations between colour and spectral type. Results: Of the 192 known L and T dwarfs with reliable photometry in the surveyed area and magnitude range, 189 are recovered by our selection and classification method. We have quantified the accuracy of the classification method both externally, with spectroscopy, and internally, by creating synthetic catalogues and accounting for the uncertainties. We find that, brighter than J = 17.5, photo-type classifications are accurate to one spectral sub-type, and are therefore competitive with spectroscopic classifications. The resultant catalogue of 1157 L and T dwarfs will be presented in a companion paper.
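
    The template comparison in this abstract can be pictured as a chi-squared fit of observed colours against polynomial colour-versus-type relations. The sketch below is only an illustration of that idea: the polynomial coefficients are invented placeholders (not the published templates), only two colours are used, the quasar and star templates are left out, and the function names are hypothetical.

```python
from typing import Dict, Iterable, Sequence

# Hypothetical polynomial coefficients (c0, c1, c2, ...) giving a colour as a
# function of numeric spectral type t, where 0 = L0, 10 = T0 and 18 = T8.
TEMPLATE_POLYS: Dict[str, Sequence[float]] = {
    "Y-J": [1.2, 0.02],
    "J-H": [0.8, 0.05, -0.005],
}


def polyval(coeffs: Sequence[float], x: float) -> float:
    """Evaluate c0 + c1*x + c2*x^2 + ... with Horner's scheme."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def template_colours(t: float) -> Dict[str, float]:
    """Predicted colours for numeric spectral type t."""
    return {name: polyval(p, t) for name, p in TEMPLATE_POLYS.items()}


def chi2(observed: Dict[str, float], errors: Dict[str, float], t: float) -> float:
    """Chi-squared misfit between observed colours and the template at type t."""
    tmpl = template_colours(t)
    return sum(((observed[c] - tmpl[c]) / errors[c]) ** 2 for c in observed)


def classify(observed: Dict[str, float], errors: Dict[str, float],
             types: Iterable[int] = range(0, 19)) -> int:
    """Return the numeric spectral type with the smallest chi-squared."""
    return min(types, key=lambda t: chi2(observed, errors, t))


def type_name(t: int) -> str:
    """Convert a numeric type to a label such as 'L5' or 'T2'."""
    return f"L{t}" if t < 10 else f"T{t - 10}"
```

    Weighting each colour by its photometric error lets well-measured bands dominate the fit, which is one plausible reason a purely photometric classification can reach the sub-type accuracy quoted in the abstract for bright sources.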

  11. Towards a phylogenetic classification of dendrocoelid freshwater planarians (Platyhelminthes): a morphological and eclectic approach

    NARCIS (Netherlands)

    Sluys, R.; Kawakatsu, M.


    We explore and review the taxonomic distribution of morphological features that may be used as supporting apomorphies for the monophyletic status of various taxa in future, more comprehensive phylogenetic analyses of the dendrocoelid freshwater planarians and their close relatives. Characters examin

  12. A practical approach to accurate classification and staging of mycosis fungoides and Sézary syndrome. (United States)

    Thomas, Bjorn Rhys; Whittaker, Sean


    Cutaneous T-cell lymphomas are rare, distinct forms of non-Hodgkin's lymphoma, of which mycosis fungoides (MF) and Sézary syndrome (SS) are two of the most common. Careful, clear classification and staging of these lymphomas allow dermatologists to commence appropriate therapy and allow correct prognostic stratification for those patients affected. Of note, patients with more advanced disease will require multi-disciplinary input in determining specialist therapy. Literature has been summarized into an outline for classification/staging of MF and SS with the aim to provide clinical dermatologists with a concise review.

  13. Towards a phylogenetic classification of reef corals: The Indo-Pacific genera Merulina, Goniastrea and Scapophyllia (Scleractinia, Merulinidae)

    KAUST Repository

    Huang, Danwei


    Recent advances in scleractinian systematics and taxonomy have been achieved through the integration of molecular and morphological data, as well as rigorous analysis using phylogenetic methods. In this study, we continue in our pursuit of a phylogenetic classification by examining the evolutionary relationships between the closely related reef coral genera Merulina, Goniastrea, Paraclavarina and Scapophyllia (Merulinidae). In particular, we address the extreme polyphyly of Favites and Goniastrea that was discovered a decade ago. We sampled 145 specimens belonging to 16 species from a wide geographic range in the Indo-Pacific, focusing especially on type localities, including the Red Sea, western Indian Ocean and central Pacific. Tree reconstructions based on both nuclear and mitochondrial markers reveal a novel lineage composed of three species previously placed in Favites and Goniastrea. Morphological analyses indicate that this clade, Paragoniastrea Huang, Benzoni & Budd, gen. n., has a unique combination of corallite and subcorallite features observable with scanning electron microscopy and thin sections. Molecular and morphological evidence furthermore indicates that the monotypic genus Paraclavarina is nested within Merulina, and the former is therefore synonymised. © 2014 Royal Swedish Academy of Sciences.

  14. Chemical classification of cattle. 2. Phylogenetic tree and specific status of the Zebu. (United States)

    Manwell, C; Baker, C M


    Phylogenetic trees for the ten major breed groups of cattle were constructed by Farris's (1972) maximum parsimony method, or Fitch & Margoliash's (1967) method, which averages out the deviation over the entire assemblage. Both techniques yield essentially identical trees. The phylogenetic tree for the ten major cattle breed groups can be superimposed on a map of Europe and western Asia, the root of the tree being close to the 'fertile crescent' in Asia Minor, believed to be a primary centre of bovine domestication. For some but not all protein variants there is a cline of gene frequencies as one proceeds from the British Isles and northwest Europe towards southeast Europe and Asia Minor, with the most extreme gene frequencies in the Zebu breeds of India. It is not clear to what extent the observed clines are primary or secondary, i.e., consequent to the initial migrations of cattle towards the end of the Pleistocene or consequent to the many migrations of man with his domesticated cattle. Such clines as exist are not in themselves sufficient to prove either selection versus genetic drift or to establish taxonomic ranking. Contrary to some suggestions in the literature, the biochemical evidence supports Linnaeus's original conclusions: Bos taurus and Bos indicus are distinct species.

  15. Archaeal-eubacterial mergers in the origin of Eukarya: phylogenetic classification of life.


    Margulis, L


    A symbiosis-based phylogeny leads to a consistent, useful classification system for all life. "Kingdoms" and "Domains" are replaced by biological names for the most inclusive taxa: Prokarya (bacteria) and Eukarya (symbiosis-derived nucleated organisms). The earliest Eukarya, anaerobic mastigotes, hypothetically originated from permanent whole-cell fusion between members of Archaea (e.g., Thermoplasma-like organisms) and of Eubacteria (e.g., Spirochaeta-like organisms). Molecular biology, life...

  16. Archaeal-eubacterial mergers in the origin of Eukarya: phylogenetic classification of life (United States)

    Margulis, L.


    A symbiosis-based phylogeny leads to a consistent, useful classification system for all life. "Kingdoms" and "Domains" are replaced by biological names for the most inclusive taxa: Prokarya (bacteria) and Eukarya (symbiosis-derived nucleated organisms). The earliest Eukarya, anaerobic mastigotes, hypothetically originated from permanent whole-cell fusion between members of Archaea (e.g., Thermoplasma-like organisms) and of Eubacteria (e.g., Spirochaeta-like organisms). Molecular biology, life-history, and fossil record evidence support the reunification of bacteria as Prokarya while subdividing Eukarya into uniquely defined subtaxa: Protoctista, Animalia, Fungi, and Plantae.

  17. Classification algorithms with multi-modal data fusion could accurately distinguish neuromyelitis optica from multiple sclerosis. (United States)

    Eshaghi, Arman; Riyahi-Alam, Sadjad; Saeedi, Roghayyeh; Roostaei, Tina; Nazeri, Arash; Aghsaei, Aida; Doosti, Rozita; Ganjgahi, Habib; Bodini, Benedetta; Shakourirad, Ali; Pakravan, Manijeh; Ghana'ati, Hossein; Firouznia, Kavous; Zarei, Mojtaba; Azimi, Amir Reza; Sahraian, Mohammad Ali


    Neuromyelitis optica (NMO) exhibits substantial similarities to multiple sclerosis (MS) in clinical manifestations and imaging results and has long been considered a variant of MS. With the advent of a specific biomarker in NMO, known as anti-aquaporin 4, this assumption has changed; however, the differential diagnosis remains challenging and it is still not clear whether a combination of neuroimaging and clinical data could be used to aid clinical decision-making. Computer-aided diagnosis is a rapidly evolving process that holds great promise to facilitate objective differential diagnoses of disorders that show similar presentations. In this study, we aimed to use a powerful method for multi-modal data fusion, known as multi-kernel learning, and performed automatic diagnosis of subjects. We included 30 patients with NMO, 25 patients with MS and 35 healthy volunteers and performed multi-modal imaging with T1-weighted high resolution scans, diffusion tensor imaging (DTI) and resting-state functional MRI (fMRI). In addition, subjects underwent clinical examinations and cognitive assessments. We included 18 a priori predictors from neuroimaging, clinical and cognitive measures in the initial model. We used 10-fold cross-validation to learn the importance of each modality, train and finally test the model performance. The mean accuracy in differentiating between MS and NMO was 88%, where visible white matter lesion load, normal appearing white matter (DTI) and functional connectivity had the most important contributions to the final classification. In a multi-class classification problem we distinguished between all 3 groups (MS, NMO and healthy controls) with an average accuracy of 84%. In this classification, visible white matter lesion load, functional connectivity, and cognitive scores were the 3 most important modalities. Our work provides preliminary evidence that computational tools can be used to help make an objective differential diagnosis of NMO and MS.
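
    The 10-fold cross-validation mentioned in this abstract amounts to splitting the subjects into ten disjoint test folds and training on the remainder each time. The sketch below covers only that bookkeeping, assuming a simple shuffled (unstratified) split; it does not implement multi-kernel learning, and the function name and seed are hypothetical.

```python
import random
from typing import List, Tuple


def k_fold_indices(n: int, k: int = 10, seed: int = 0) -> List[Tuple[List[int], List[int]]]:
    """Return k (train_indices, test_indices) pairs covering n subjects."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)          # fixed seed for reproducibility
    folds = [idx[i::k] for i in range(k)]     # k disjoint, near-equal folds
    splits = []
    for i in range(k):
        test = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        splits.append((train, test))
    return splits


# The two-class problem in the abstract has 30 NMO + 25 MS = 55 subjects.
SPLITS = k_fold_indices(30 + 25, k=10)
```

    The reported mean accuracy would then be the average, over the ten splits, of the fraction of held-out subjects classified correctly.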

  18. Classification algorithms with multi-modal data fusion could accurately distinguish neuromyelitis optica from multiple sclerosis

    Directory of Open Access Journals (Sweden)

    Arman Eshaghi


    Neuromyelitis optica (NMO) exhibits substantial similarities to multiple sclerosis (MS) in clinical manifestations and imaging results and has long been considered a variant of MS. With the advent of a specific biomarker in NMO, known as anti-aquaporin 4, this assumption has changed; however, the differential diagnosis remains challenging and it is still not clear whether a combination of neuroimaging and clinical data could be used to aid clinical decision-making. Computer-aided diagnosis is a rapidly evolving process that holds great promise to facilitate objective differential diagnoses of disorders that show similar presentations. In this study, we aimed to use a powerful method for multi-modal data fusion, known as multi-kernel learning, and performed automatic diagnosis of subjects. We included 30 patients with NMO, 25 patients with MS and 35 healthy volunteers and performed multi-modal imaging with T1-weighted high resolution scans, diffusion tensor imaging (DTI) and resting-state functional MRI (fMRI). In addition, subjects underwent clinical examinations and cognitive assessments. We included 18 a priori predictors from neuroimaging, clinical and cognitive measures in the initial model. We used 10-fold cross-validation to learn the importance of each modality, train and finally test the model performance. The mean accuracy in differentiating between MS and NMO was 88%, where visible white matter lesion load, normal appearing white matter (DTI) and functional connectivity had the most important contributions to the final classification. In a multi-class classification problem we distinguished between all 3 groups (MS, NMO and healthy controls) with an average accuracy of 84%. In this classification, visible white matter lesion load, functional connectivity, and cognitive scores were the 3 most important modalities. Our work provides preliminary evidence that computational tools can be used to help make an objective differential diagnosis

  19. Automatic phylogenetic classification of bacterial beta-lactamase sequences including structural and antibiotic substrate preference information. (United States)

    Ma, Jianmin; Eisenhaber, Frank; Maurer-Stroh, Sebastian


    Beta lactams comprise the largest and still most effective group of antibiotics, but bacteria can gain resistance through different beta lactamases that can degrade these antibiotics. We developed a user friendly tree building web server that allows users to assign beta lactamase sequences to their respective molecular classes and subclasses. Further clinically relevant information includes whether the gene is typically chromosomal or transferable through plasmids, as well as a list of the antibiotics that the most closely related reference sequences are known to target and cause resistance against. This web server can automatically build three phylogenetic trees: the first tree with closely related sequences from a Tachyon search against the NCBI nr database, the second tree with curated reference beta lactamase sequences, and the third tree built specifically from substrate binding pocket residues of the curated reference beta lactamase sequences. We show that the latter is better suited to recover antibiotic substrate assignments through nearest neighbor annotation transfer. The users can also choose to build a structural model for the query sequence and view the binding pocket residues of their query relative to other beta lactamases in the sequence alignment as well as in the 3D structure relative to bound antibiotics. This web server is freely available at

  20. Deceptive desmas: molecular phylogenetics suggests a new classification and uncovers convergent evolution of lithistid demosponges.

    Directory of Open Access Journals (Sweden)

    Astrid Schuster

    Reconciling the fossil record with molecular phylogenies to enhance the understanding of animal evolution is a challenging task, especially for taxa with a mostly poor fossil record, such as sponges (Porifera). 'Lithistida', a polyphyletic group of recent and fossil sponges, are an exception as they provide the richest fossil record among demosponges. Lithistids, currently encompassing 13 families, 41 genera and >300 recent species, are defined by the common possession of peculiar siliceous spicules (desmas) that characteristically form rigid articulated skeletons. Their phylogenetic relationships are to a large extent unresolved and there has been no (taxonomically) comprehensive analysis to formally reallocate lithistid taxa to their closest relatives. This study, based on the most comprehensive molecular and morphological investigation of 'lithistid' demosponges to date, corroborates some previous weakly-supported hypotheses, and provides novel insights into the evolutionary relationships of the previous 'order Lithistida'. Based on molecular data (partial mtDNA CO1 and 28S rDNA sequences), we show that 8 out of 13 'Lithistida' families belong to the order Astrophorida, whereas Scleritodermidae and Siphonidiidae form a separate monophyletic clade within Tetractinellida. Most lithistid astrophorids are dispersed between different clades of the Astrophorida and we propose to formally reallocate them, respectively. Corallistidae, Theonellidae and Phymatellidae are monophyletic, whereas the families Pleromidae and Scleritodermidae are polyphyletic. Family Desmanthidae is polyphyletic and groups within Halichondriidae; we formally propose a reallocation. The sister group relationship of the family Vetulinidae to Spongillida is confirmed and we propose here for the first time to include Vetulina into a new Order Sphaerocladina. Megascleres and microscleres possibly evolved and/or were lost several times independently in different 'lithistid' taxa, and

  1. The International Classification of Headache Disorders: accurate diagnosis of orofacial pain? (United States)

    Benoliel, R; Birman, N; Eliav, E; Sharav, Y


    The aim was to apply diagnostic criteria, as published by the International Headache Society (IHS), to the diagnosis of orofacial pain. A total of 328 consecutive patients with orofacial pain were collected over a period of 2 years. The orofacial pain clinic routinely employs criteria published by the IHS, the American Academy of Orofacial Pain (AAOP) and the Research Diagnostic Criteria for Temporomandibular Disorders (RDCTMD). Employing IHS criteria, 184 patients were successfully diagnosed (56%), including 34 with persistent idiopathic facial pain. In the remaining 144 we applied AAOP/RDCTMD criteria and diagnosed 120 as masticatory myofascial pain (MMP) resulting in a diagnostic efficiency of 92.7% (304/328) when applying the three classifications (IHS, AAOP, RDCTMD). Employing further published criteria, 23 patients were diagnosed as neurovascular orofacial pain (NVOP, facial migraine) and one as a neuropathy secondary to connective tissue disease. All the patients were therefore allocated to predefined diagnoses. MMP is clearly defined by AAOP and the RDCTMD. However, NVOP is not defined by any of the above classification systems. The features of MMP and NVOP are presented and analysed with calculations for positive (PPV) and negative predictive values (NPV). In MMP the combination of facial pain aggravated by jaw movement, and the presence of three or more tender muscles resulted in a PPV = 0.82 and a NPV = 0.86. For NVOP the combination of facial pain, throbbing quality, autonomic and/or systemic features and attack duration of > 60 min gave a PPV = 0.71 and a NPV = 0.95. Expansion of the IHS system is needed so as to integrate more orofacial pain syndromes.
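
    The figures in this abstract follow from simple ratios, which the short sketch below reproduces. The patient counts (184, 120, 23, 1 and 328) are taken from the abstract; the function names are hypothetical, and the predictive-value helpers use the standard definitions PPV = TP / (TP + FP) and NPV = TN / (TN + FN), for which the abstract gives only the resulting values, not the underlying counts.

```python
def ppv(true_pos: int, false_pos: int) -> float:
    """Positive predictive value: share of positive calls that are correct."""
    return true_pos / (true_pos + false_pos)


def npv(true_neg: int, false_neg: int) -> float:
    """Negative predictive value: share of negative calls that are correct."""
    return true_neg / (true_neg + false_neg)


def percent(part: int, total: int) -> float:
    """Express part/total as a percentage."""
    return 100.0 * part / total


TOTAL = 328            # consecutive orofacial pain patients
IHS = 184              # diagnosed with IHS criteria
AAOP_RDCTMD = 120      # masticatory myofascial pain via AAOP/RDCTMD
NVOP = 23              # neurovascular orofacial pain (further criteria)
NEUROPATHY = 1         # neuropathy secondary to connective tissue disease

ihs_rate = percent(IHS, TOTAL)                     # about 56%
combined_rate = percent(IHS + AAOP_RDCTMD, TOTAL)  # about 92.7% (304/328)
```

    The four diagnostic groups sum to 328, which matches the abstract's statement that all patients were allocated to a predefined diagnosis.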

  2. Photometric brown-dwarf classification. I. A method to identify and accurately classify large samples of brown dwarfs without spectroscopy

    CERN Document Server

    Skrzypek, Nathalie; Faherty, Jacqueline K; Mortlock, Daniel J; Burgasser, Adam J; Hewett, Paul C


    Aims. We present a method, named photo-type, to identify and accurately classify L and T dwarfs onto the standard spectral classification system using photometry alone. This enables the creation of large and deep homogeneous samples of these objects efficiently, without the need for spectroscopy. Methods. We created a catalogue of point sources with photometry in 8 bands, ranging from 0.75 to 4.6 microns, selected from an area of 3344 deg^2, by combining SDSS, UKIDSS LAS, and WISE data. Sources with 13.0 0.8, were then classified by comparison against template colours of quasars, stars, and brown dwarfs. The L and T templates, spectral types L0 to T8, were created by identifying previously known sources with spectroscopic classifications, and fitting polynomial relations between colour and spectral type. Results. Of the 192 known L and T dwarfs with reliable photometry in the surveyed area and magnitude range, 189 are recovered by our selection and classification method. We have quantified the accuracy of th...

  3. Treatment response classification of liver metastatic disease evaluated on imaging. Are RECIST unidimensional measurements accurate? (United States)

    Mantatzis, Michael; Kakolyris, Stylianos; Amarantidis, Kyriakos; Karayiannakis, Anastasios; Prassopoulos, Panos


    The purpose of this study was to evaluate the accuracy of unidimensional measurements (response evaluation criteria in solid tumors, RECIST) compared with volumetric measurements in patients with liver metastases undergoing chemotherapy. Forty-four patients with newly diagnosed liver lesions underwent three MRI examinations at treatment initiation, during chemotherapy, and immediately post-treatment. Measurements based on RECIST guidelines and volume calculations were performed on the "target" lesions (TLs). The two methods were in agreement in 64/77 of patients and 253/301 of individual lesions classification in response categories ("good" agreement, Cohen kappa = 0.735 and 0.741, respectively). In 16.88% of the comparisons the two methods stratified patients to a different response category; 27.6% of TLs did not follow the response category of the patient in whom lesions were located. The actual volume of TLs differs from the calculated volume of a sphere with the same diameter. Our study supports the use of volumetric techniques that may overcome certain disadvantages of unidimensional measurements.
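
    For readers unfamiliar with the agreement statistic used here, the following is a minimal sketch of Cohen's kappa computed from a square agreement table. The 2x2 table is hypothetical (the abstract does not give the per-category counts), and `cohen_kappa` is an illustrative name. The last lines only check that 64 agreements out of 77 comparisons corresponds to the 16.88% disagreement quoted above.

```python
def cohen_kappa(table):
    """Cohen's kappa for a square table: rows = method A, columns = method B."""
    k = len(table)
    n = sum(sum(row) for row in table)
    # Observed agreement: proportion of cases on the diagonal.
    p_observed = sum(table[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance from the marginal totals.
    p_expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / n ** 2
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical table with two response categories.
kappa = cohen_kappa([[20, 5], [10, 15]])

# Arithmetic from the abstract: 64 of 77 comparisons agreed.
disagreement = 1 - 64 / 77
print(f"kappa={kappa:.2f} disagreement={disagreement:.2%}")
```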

  4. A Void Reference Sensor-Multiple Signal Classification Algorithm for More Accurate Direction of Arrival Estimation of Low Altitude Target

    Institute of Scientific and Technical Information of China (English)

    XIAO Hui; SUN Jin-cai; YUAN Jun; NIU Yi-long


    The MUSIC (multiple signal classification) algorithm is an established method for direction of arrival (DOA) estimation. This paper presents a modified MUSIC algorithm for more accurate estimation of low-altitude targets. The possibility of better performance is analyzed using a void reference sensor (VRS) in the MUSIC algorithm. The following two topics are discussed: 1) the time delay formula and the VRS-MUSIC algorithm with the VRS located on the negative z-axis; 2) the DOA estimation results of the VRS-MUSIC and MUSIC algorithms. The simulation results show that the VRS-MUSIC algorithm has three advantages compared with MUSIC: 1) when the signal-to-noise ratio (SNR) is more than -5 dB, the direction estimation error is half that obtained by MUSIC; 2) the side lobe is lower and the stability is better; 3) the size of array that the algorithm requires is smaller.
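
    As background, the sketch below shows the standard MUSIC pseudospectrum for a uniform linear array. It is not the VRS-MUSIC variant of this paper, whose time delay formula is not given in the abstract. The array geometry (half-wavelength spacing), the simulated single source at 20 degrees, the noise level, and all function names are assumptions made for illustration.

```python
import numpy as np


def steering_vector(theta_deg, n_sensors, spacing=0.5):
    """Steering vector of a uniform linear array; spacing in wavelengths."""
    k = np.arange(n_sensors)
    return np.exp(-2j * np.pi * spacing * k * np.sin(np.deg2rad(theta_deg)))


def music_spectrum(snapshots, n_sources, angles_deg, spacing=0.5):
    """Standard MUSIC pseudospectrum from an (n_sensors, n_snapshots) array."""
    n_sensors, n_snap = snapshots.shape
    covariance = snapshots @ snapshots.conj().T / n_snap
    # eigh returns eigenvalues in ascending order, so the first
    # n_sensors - n_sources eigenvectors span the noise subspace.
    _, eigvecs = np.linalg.eigh(covariance)
    noise_subspace = eigvecs[:, : n_sensors - n_sources]
    spectrum = np.empty(len(angles_deg))
    for i, theta in enumerate(angles_deg):
        a = steering_vector(theta, n_sensors, spacing)
        projection = noise_subspace.conj().T @ a
        spectrum[i] = 1.0 / np.linalg.norm(projection) ** 2
    return spectrum


# Simulated data: one source at 20 degrees, 8 sensors, 200 snapshots.
rng = np.random.default_rng(0)
n_sensors, n_snap, true_doa = 8, 200, 20.0
signal = rng.standard_normal(n_snap) + 1j * rng.standard_normal(n_snap)
sensor_noise = 0.1 * (rng.standard_normal((n_sensors, n_snap))
                      + 1j * rng.standard_normal((n_sensors, n_snap)))
X = np.outer(steering_vector(true_doa, n_sensors), signal) + sensor_noise

angles = np.arange(-90.0, 90.5, 0.5)
P = music_spectrum(X, n_sources=1, angles_deg=angles)
estimate = angles[np.argmax(P)]  # angle at the pseudospectrum peak
print(estimate)
```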

  5. Protein clustering and RNA phylogenetic reconstruction of the influenza A [corrected] virus NS1 protein allow an update in classification and identification of motif conservation.

    Directory of Open Access Journals (Sweden)

    Edgar E Sevilla-Reyes

    The non-structural protein 1 (NS1) of influenza A virus (IAV), coded by its third most diverse gene, interacts with multiple molecules within infected cells. NS1 is involved in host immune response regulation and is a potential contributor to the virus host range. Early phylogenetic analyses using 50 sequences led to the classification of NS1 gene variants into groups (alleles A and B). We reanalyzed NS1 diversity using 14,716 complete NS IAV sequences, downloaded from public databases, without host bias. Removal of sequence redundancy and further structured clustering at 96.8% amino acid similarity produced 415 clusters that enhanced our capability to detect distinct subgroups and lineages, which were assigned a numerical nomenclature. Maximum likelihood phylogenetic reconstruction using RNA sequences indicated the previously identified deep branching separating group A from group B, with five distinct subgroups within A as well as two and five lineages within the A4 and A5 subgroups, respectively. Our classification model proposes that sequence patterns in thirteen amino acid positions are sufficient to fit >99.9% of all currently available NS1 sequences into the A subgroups/lineages or the B group. This classification reduces host and virus bias through the prioritization of NS1 RNA phylogenetics over host or virus phenetics. We found significant sequence conservation within the subgroups and lineages with characteristic patterns of functional motifs, such as the differential binding of CPSF30 and crk/crkL or the availability of a C-terminal PDZ-binding motif. To understand selection pressures and evolution acting on NS1, it is necessary to organize the available data. This updated classification may help to clarify and organize the study of NS1 interactions and pathogenic differences and allow the drawing of further functional inferences on sequences in each group, subgroup and lineage rather than on a strain-by-strain basis.

  6. Revisiting the phylogeny of Bombacoideae (Malvaceae): Novel relationships, morphologically cohesive clades, and a new tribal classification based on multilocus phylogenetic analyses. (United States)

    Carvalho-Sobrinho, Jefferson G; Alverson, William S; Alcantara, Suzana; Queiroz, Luciano P; Mota, Aline C; Baum, David A


    Bombacoideae (Malvaceae) is a clade of deciduous trees with a marked dominance in many forests, especially in the Neotropics. The historical lack of a well-resolved phylogenetic framework for Bombacoideae hinders studies in this ecologically important group. We reexamined phylogenetic relationships in this clade based on a matrix of 6465 nuclear (ETS, ITS) and plastid (matK, trnL-trnF, trnS-trnG) DNA characters. We used maximum parsimony, maximum likelihood, and Bayesian inference to infer relationships among 108 species (∼70% of the total number of known species). We analyzed the evolution of selected morphological traits: trunk or branch prickles, calyx shape, endocarp type, seed shape, and seed number per fruit, using ML reconstructions of their ancestral states to identify possible synapomorphies for major clades. Novel phylogenetic relationships emerged from our analyses, including three major lineages marked by fruit or seed traits: the winged-seed clade (Bernoullia, Gyranthera, and Huberodendron), the spongy endocarp clade (Adansonia, Aguiaria, Catostemma, Cavanillesia, and Scleronema), and the Kapok clade (Bombax, Ceiba, Eriotheca, Neobuchia, Pachira, Pseudobombax, Rhodognaphalon, and Spirotheca). The Kapok clade, the most diverse lineage of the subfamily, includes sister relationships (i) between Pseudobombax and "Pochota fendleri" a historically incertae sedis taxon, and (ii) between the Paleotropical genera Bombax and Rhodognaphalon, implying just two bombacoid dispersals to the Old World, the other one involving Adansonia. This new phylogenetic framework offers new insights and a promising avenue for further evolutionary studies. In view of this information, we present a new tribal classification of the subfamily, accompanied by an identification key.

  7. Comprehensive phylogenetic reconstructions of African swine fever virus: proposal for a new classification and molecular dating of the virus.

    Directory of Open Access Journals (Sweden)

    Vincent Michaud

    African swine fever (ASF) is a highly lethal disease of domestic pigs caused by the only known DNA arbovirus. It was first described in Kenya in 1921 and since then many isolates have been collected worldwide. However, although several phylogenetic studies have been carried out to understand the relationships between the isolates, no molecular dating analyses have been achieved so far. In this paper, comprehensive phylogenetic reconstructions were made using newly generated, publicly available sequences of hundreds of ASFV isolates from the past 70 years. Analyses focused on B646L, CP204L, and E183L genes from 356, 251, and 123 isolates, respectively. Phylogenetic analyses were achieved using maximum likelihood and Bayesian coalescence methods. A new lineage-based nomenclature is proposed to designate 35 different clusters. In addition, dating of ASFV origin was carried out from the molecular data sets. To avoid bias, diversity due to positive selection or recombination events was neutralized. The molecular clock analyses revealed that ASFV strains currently circulating have evolved over 300 years, with a time to the most recent common ancestor (TMRCA) in the early 18th century.

  8. The molecular genetics and morphometry-based Endometrial Intraepithelial Neoplasia classification system predicts disease progression in Endometrial hyperplasia more accurately than the 1994 World Health Organization classification system

    NARCIS (Netherlands)

    Baak, JP; Mutter, GL; Robboy, S; van Diest, PJ; Uyterlinde, AM; Orbo, A; Palazzo, J; Fiane, B; Lovslett, K; Burger, C; Voorhorst, F; Verheijen, RH


    BACKGROUND. The objective of this study was to compare the accuracy of disease progression prediction of the molecular genetics and morphometry-based Endometrial Intraepithelial Neoplasia (EIN) and World Health Organization 1994 (WHO94) classification systems in patients with endometrial hyperplasia

  9. Classification (United States)

    Clary, Renee; Wandersee, James


    In this article, Renee Clary and James Wandersee describe the beginnings of "Classification," which lies at the very heart of science and depends upon pattern recognition. Clary and Wandersee approach patterns by first telling the story of the "Linnaean classification system," introduced by Carl Linnaeus (1707-1778), who is…

  10. A species independent universal bio-detection microarray for pathogen forensics and phylogenetic classification of unknown microorganisms

    Directory of Open Access Journals (Sweden)

    McCormick John


    Background: The ability to differentiate a bioterrorist attack or an accidental release of a research pathogen from a naturally occurring pandemic or disease event is crucial to the safety and security of this nation by enabling an appropriate and rapid response. It is critical in samples from an infected patient, the environment, or a laboratory to quickly and accurately identify the precise pathogen including natural or engineered variants and to classify new pathogens in relation to those that are known. Current approaches for pathogen detection rely on prior genomic sequence information. Given the enormous spectrum of genetic possibilities, a field deployable, robust technology, such as a universal (any species) microarray has near-term potential to address these needs. Results: A new and comprehensive sequence-independent array (Universal Bio-Signature Detection Array) was designed with approximately 373,000 probes. The main feature of this array is that the probes are computationally derived and sequence independent. There is one probe for each possible 9-mer sequence, thus 4^9 (262,144) probes. Each genome hybridized on this array has a unique pattern of signal intensities corresponding to each of these probes. These signal intensities were used to generate an un-biased cluster analysis of signal intensity hybridization patterns that can easily distinguish species into accepted and known phylogenomic relationships. Within limits, the array is highly sensitive and is able to detect synthetically mixed pathogens. Examples of unique hybridization signal intensity patterns are presented for different Brucella species as well as relevant host species and other pathogens. These results demonstrate the utility of the UBDA array as a diagnostic tool in pathogen forensics. Conclusions: This pathogen detection system is fast, accurate and can be applied to any species. 
Hybridization patterns are unique to a specific genome and these can be used
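
    The probe count in this abstract is the number of distinct 9-mers over the four-letter DNA alphabet. The sketch below enumerates them and also builds a k-mer count profile for a short sequence. The count profile is only an in-silico analogue that we assume for illustration; the array itself measures hybridization signal intensities, not counts. The function names and the toy sequence are ours.

```python
from collections import Counter
from itertools import product


def all_kmers(k, alphabet="ACGT"):
    """Every possible k-mer over the alphabet, in lexicographic order."""
    return ["".join(letters) for letters in product(alphabet, repeat=k)]


def kmer_profile(sequence, k):
    """Count each overlapping k-mer occurring in the sequence."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


n_probes = len(all_kmers(9))  # one probe per possible 9-mer
print(n_probes)

# Toy sequence, used only to show the shape of a profile.
profile = kmer_profile("ACGTACGTACGT", 3)
print(profile.most_common(2))
```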

  11. Fast, Simple and Accurate Handwritten Digit Classification by Training Shallow Neural Network Classifiers with the 'Extreme Learning Machine' Algorithm. (United States)

    McDonnell, Mark D; Tissera, Migel D; Vladusich, Tony; van Schaik, André; Tapson, Jonathan


    Recent advances in training deep (multi-layer) architectures have inspired a renaissance in neural network use. For example, deep convolutional networks are becoming the default option for difficult tasks on large datasets, such as image and speech recognition. However, here we show that error rates below 1% on the MNIST handwritten digit benchmark can be replicated with shallow non-convolutional neural networks. This is achieved by training such networks using the 'Extreme Learning Machine' (ELM) approach, which also enables a very rapid training time (∼ 10 minutes). Adding distortions, as is common practise for MNIST, reduces error rates even further. Our methods are also shown to be capable of achieving less than 5.5% error rates on the NORB image database. To achieve these results, we introduce several enhancements to the standard ELM algorithm, which individually and in combination can significantly improve performance. The main innovation is to ensure each hidden-unit operates only on a randomly sized and positioned patch of each image. This form of random 'receptive field' sampling of the input ensures the input weight matrix is sparse, with about 90% of weights equal to zero. Furthermore, combining our methods with a small number of iterations of a single-batch backpropagation method can significantly reduce the number of hidden-units required to achieve a particular performance. Our close to state-of-the-art results for MNIST and NORB suggest that the ease of use and accuracy of the ELM algorithm for designing a single-hidden-layer neural network classifier should cause it to be given greater consideration either as a standalone method for simpler problems, or as the final classification stage in deep neural networks applied to more difficult problems.
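
    The core of the ELM approach can be sketched in a few lines: the hidden-layer weights are drawn at random and left untrained, and only the output weights are fitted by regularised least squares. The sketch below shows that standard scheme on a toy two-class problem. It does not implement the authors' enhancements (such as random receptive-field patches), and the tanh activation, the ridge value, the hidden-layer size, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def train_elm(X, targets_onehot, n_hidden, rng, ridge=1e-3):
    """Fit a single-hidden-layer ELM; only the output weights are learned."""
    W = rng.standard_normal((X.shape[1], n_hidden))  # random, never trained
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)                           # hidden activations
    # Ridge-regularised least squares for the output weights.
    beta = np.linalg.solve(H.T @ H + ridge * np.eye(n_hidden),
                           H.T @ targets_onehot)
    return W, b, beta


def predict_elm(X, W, b, beta):
    """Predicted class index for each row of X."""
    return np.argmax(np.tanh(X @ W + b) @ beta, axis=1)


# Toy data: two Gaussian blobs standing in for two classes.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2.0, 1.0, (100, 2)),
               rng.normal(2.0, 1.0, (100, 2))])
labels = np.repeat([0, 1], 100)
Y = np.eye(2)[labels]

W, b, beta = train_elm(X, Y, n_hidden=50, rng=rng)
accuracy = np.mean(predict_elm(X, W, b, beta) == labels)
print(accuracy)
```

    Because no iterative optimisation of the hidden layer is involved, training reduces to one linear solve, which is consistent with the short training times the abstract reports.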

  12. Fast, Simple and Accurate Handwritten Digit Classification by Training Shallow Neural Network Classifiers with the 'Extreme Learning Machine' Algorithm.

    Directory of Open Access Journals (Sweden)

    Mark D McDonnell

    Recent advances in training deep (multi-layer) architectures have inspired a renaissance in neural network use. For example, deep convolutional networks are becoming the default option for difficult tasks on large datasets, such as image and speech recognition. However, here we show that error rates below 1% on the MNIST handwritten digit benchmark can be replicated with shallow non-convolutional neural networks. This is achieved by training such networks using the 'Extreme Learning Machine' (ELM) approach, which also enables a very rapid training time (∼ 10 minutes). Adding distortions, as is common practise for MNIST, reduces error rates even further. Our methods are also shown to be capable of achieving less than 5.5% error rates on the NORB image database. To achieve these results, we introduce several enhancements to the standard ELM algorithm, which individually and in combination can significantly improve performance. The main innovation is to ensure each hidden-unit operates only on a randomly sized and positioned patch of each image. This form of random 'receptive field' sampling of the input ensures the input weight matrix is sparse, with about 90% of weights equal to zero. Furthermore, combining our methods with a small number of iterations of a single-batch backpropagation method can significantly reduce the number of hidden-units required to achieve a particular performance. Our close to state-of-the-art results for MNIST and NORB suggest that the ease of use and accuracy of the ELM algorithm for designing a single-hidden-layer neural network classifier should cause it to be given greater consideration either as a standalone method for simpler problems, or as the final classification stage in deep neural networks applied to more difficult problems.

  13. Phylogenetic trees


    Baños, Hector; Bushek, Nathaniel; Davidson, Ruth; Gross, Elizabeth; Harris, Pamela E.; Krone, Robert; Long, Colby; Stewart, Allen; Walker, Robert


    We introduce the package PhylogeneticTrees for Macaulay2 which allows users to compute phylogenetic invariants for group-based tree models. We provide some background information on phylogenetic algebraic geometry and show how the package PhylogeneticTrees can be used to calculate a generating set for a phylogenetic ideal as well as a lower bound for its dimension. Finally, we show how methods within the package can be used to compute a generating set for the join of any two ideals.

  14. Phylogenetic Status of an Unrecorded Species of Curvularia, C. spicifera, Based on Current Classification System of Curvularia and Bipolaris Group Using Multi Loci. (United States)

    Jeon, Sun Jeong; Nguyen, Thi Thuong Thuong; Lee, Hyang Burm


    A seed-borne fungus, Curvularia sp. EML-KWD01, was isolated from an indigenous wheat seed by standard blotter method. This fungus was characterized based on the morphological characteristics and molecular phylogenetic analysis. Phylogenetic status of the fungus was determined using sequences of three loci: rDNA internal transcribed spacer, large ribosomal subunit, and glyceraldehyde 3-phosphate dehydrogenase gene. Multi loci sequencing analysis revealed that this fungus was Curvularia spicifera within Curvularia group 2 of family Pleosporaceae.

  15. A High Resolution/Accurate Mass (HRAM) Data-Dependent MS3 Neutral Loss Screening, Classification, and Relative Quantitation Methodology for Carbonyl Compounds in Saliva (United States)

    Dator, Romel; Carrà, Andrea; Maertens, Laura; Guidolin, Valeria; Villalta, Peter W.; Balbo, Silvia


    Reactive carbonyl compounds (RCCs) are ubiquitous in the environment and are generated endogenously as a result of various physiological and pathological processes. These compounds can react with biological molecules inducing deleterious processes believed to be at the basis of their toxic effects. Several of these compounds are implicated in neurotoxic processes, aging disorders, and cancer. Therefore, a method characterizing exposures to these chemicals will provide insights into how they may influence overall health and contribute to disease pathogenesis. Here, we have developed a high resolution accurate mass (HRAM) screening strategy allowing simultaneous identification and relative quantitation of DNPH-derivatized carbonyls in human biological fluids. The screening strategy involves the diagnostic neutral loss of hydroxyl radical triggering MS3 fragmentation, which is only observed in positive ionization mode of DNPH-derivatized carbonyls. Unique fragmentation pathways were used to develop a classification scheme for characterizing known and unanticipated/unknown carbonyl compounds present in saliva. Furthermore, a relative quantitation strategy was implemented to assess variations in the levels of carbonyl compounds before and after exposure using deuterated d3-DNPH. This relative quantitation method was tested on human samples before and after exposure to specific amounts of alcohol. The nano-electrospray ionization (nano-ESI) in positive mode afforded excellent sensitivity with detection limits on-column in the high-attomole levels. To the best of our knowledge, this is the first report of a method using HRAM neutral loss screening of carbonyl compounds. In addition, the method allows simultaneous characterization and relative quantitation of DNPH-derivatized compounds using nano-ESI in positive mode.

  16. Didiscus verdensis spec. nov. (Porifera: Halichondrida) from the Cape Verde Islands, with a revision and phylogenetic classification of the genus Didiscus

    NARCIS (Netherlands)

    Hiemstra, F.; Soest, van R.W.M.


    A new species of the circumtropical/subtropical genus Didiscus Dendy, 1922 is described from the Cape Verde Islands. Based on a phylogenetic analysis of all known species of the genus, using morphological and microscopical (including SEM) characters, it was demonstrated that the new species is close

  17. Evolutionary Phylogenetic Networks: Models and Issues (United States)

    Nakhleh, Luay

    Phylogenetic networks are special graphs that generalize phylogenetic trees to allow for modeling of non-treelike evolutionary histories. The ability to sequence multiple genetic markers from a set of organisms and the conflicting evolutionary signals that these markers provide in many cases, have propelled research and interest in phylogenetic networks to the forefront in computational phylogenetics. Nonetheless, the term 'phylogenetic network' has been generically used to refer to a class of models whose core shared property is tree generalization. Several excellent surveys of the different flavors of phylogenetic networks and methods for their reconstruction have been written recently. However, unlike these surveys, this chapter focuses specifically on one type of phylogenetic networks, namely evolutionary phylogenetic networks, which explicitly model reticulate evolutionary events. Further, this chapter focuses less on surveying existing tools, and addresses in more detail issues that are central to the accurate reconstruction of phylogenetic networks.

  18. Molecular phylogenetics of Alchemilla, Aphanes and Lachemilla (Rosaceae) inferred from plastid and nuclear intron and spacer DNA sequences, with comments on generic classification. (United States)

    Gehrke, B; Bräuchler, C; Romoleroux, K; Lundberg, M; Heubl, G; Eriksson, T


    Alchemilla (the lady's mantles) is a well known but inconspicuous group in the Rosaceae, notable for its ornamental leaves and pharmaceutical properties. The systematics of Alchemilla has remained poorly understood, most likely due to confusion resulting from apomixis, polyploidisation and hybridisation, which are frequently observed in the group, and which have led to the description of a large number of (micro-) species. A molecular phylogeny of the genus, including all sections of Alchemilla and Lachemilla as well as five representatives of Aphanes, based on the analysis of the chloroplast trnL-trnF and the nuclear ITS regions is presented here. Gene phylogenies reconstructed from the nuclear and chloroplast sequence data were largely congruent. Limited conflict between the data partitions was observed with respect to a small number of taxa. This is likely to be the result of hybridisation/introgression or incomplete lineage sorting. Four distinct clades were resolved, corresponding to major geographical division and life forms: Eurasian Alchemilla, annual Aphanes, South American Lachemilla and African Alchemilla. We argue for a wider circumscription of the genus Alchemilla, including Lachemilla and Aphanes, based on the morphology and the phylogenetic relationships between the different clades.

  19. A preliminary phylogenetic analysis of the New World Helopini (Coleoptera, Tenebrionidae, Tenebrioninae) indicates the need for profound rearrangements of the classification

    Directory of Open Access Journals (Sweden)

    Paulina Cifuentes-Ruiz


    Helopini is a diverse tribe in the subfamily Tenebrioninae with a worldwide distribution. The New World helopine species have not been reviewed recently and several doubts emerge regarding their generic assignment as well as the naturalness of the tribe and subordinate taxa. To assess these questions, a preliminary cladistic analysis was conducted with emphasis on sampling the genera distributed in the New World, but including representatives from other regions. The parsimony analysis includes 30 ingroup species from America, Europe and Asia of the subtribes Helopina and Cylindrinotina, plus three outgroups, and 67 morphological characters. Construction of the matrix resulted in the discovery of morphological character states not previously reported for the tribe, particularly from the genitalia of New World species. A consensus of the 12 most parsimonious trees supports the monophyly of the tribe based on a unique combination of characters, including one synapomorphy. None of the subtribes or the genera of the New World represented by more than one species (Helops Fabricius, Nautes Pascoe and Tarpela Bates) were recovered as monophyletic. Helopina was recovered as paraphyletic in relation to Cylindrinotina. One Nearctic species of Helops and one Palearctic species of Tarpela (subtribe Helopina) were more closely related to species of Cylindrinotina. A relatively derived clade, mainly composed by Neotropical species, was found; it includes seven species of Tarpela, seven species of Nautes, and three species of Helops, two Nearctic and one Neotropical. Our results reveal the need to deeply re-evaluate the current classification of the tribe and subordinated taxa, but a broader taxon sampling and further character exploration is needed in order to fully recognize monophyletic groups at different taxonomic levels (from subtribes to genera).

  20. Stratification of co-evolving genomic groups using ranked phylogenetic profiles

    Directory of Open Access Journals (Sweden)

    Tsoka Sophia


    Background: Previous methods of detecting the taxonomic origins of arbitrary sequence collections, with a significant impact to genome analysis and in particular metagenomics, have primarily focused on compositional features of genomes. The evolutionary patterns of phylogenetic distribution of genes or proteins, represented by phylogenetic profiles, provide an alternative approach for the detection of taxonomic origins, but typically suffer from low accuracy. Herein, we present rank-BLAST, a novel approach for the assignment of protein sequences into genomic groups of the same taxonomic origin, based on the ranking order of phylogenetic profiles of target genes or proteins across the reference database. Results: The rank-BLAST approach is validated by computing the phylogenetic profiles of all sequences for five distinct microbial species of varying degrees of phylogenetic proximity, against a reference database of 243 fully sequenced genomes. The approach - a combination of sequence searches, statistical estimation and clustering - analyses the degree of sequence divergence between sets of protein sequences and allows the classification of protein sequences according to the species of origin with high accuracy, allowing taxonomic classification of 64% of the proteins studied. In most cases, a main cluster is detected, representing the corresponding species. Secondary, functionally distinct and species-specific clusters exhibit different patterns of phylogenetic distribution, thus flagging gene groups of interest. Detailed analyses of such cases are provided as examples. Conclusion: Our results indicate that the rank-BLAST approach can capture the taxonomic origins of sequence collections in an accurate and efficient manner. The approach can be useful both for the analysis of genome evolution and the detection of species groups in metagenomics samples.

  1. Accurate Arabic Script Language/Dialect Classification (United States)


    dialects. language identification, Arabic, dialect, natural language processing, machine learning. Stephen C. Tratz. ...Arabic, Farsi, Urdu), Cyrillic script (Bulgarian, Russian, Ukrainian), and Devanagari script (Hindi, Marathi, Nepali). They use Mechanical Turk 1, which can be a useful feature. The Java port of the LIBLINEAR (Fan et al., 2008) machine learning software package1 is used to train all our

  2. Taxonomic update on proposed nomenclature and classification changes for bacteria of medical importance, 2016. (United States)

    Janda, J Michael


    A key aspect of medical, public health, and diagnostic microbiology laboratories is the accurate identification and rapid reporting and communication to medical staff regarding patients with infectious agents of clinical importance. Microbial taxonomy in the age of molecular diagnostics and phylogenetics creates changes in taxonomy at a logarithmic rate further complicating this process. This update focuses on the description of new species and classification changes proposed in 2016.

  3. Advances in phylogenetic studies of Nematoda

    Institute of Scientific and Technical Information of China (English)


    Nematoda is a metazoan group whose extremely high diversity is second only to that of Insecta. Caenorhabditis elegans is now a favored experimental model animal in modern developmental biology, genetics and genomics studies. However, the phylogeny of Nematoda and the phylogenetic position of the phylum within the animal kingdom have long been debated. Recent molecular phylogenetic studies have posed major challenges to the traditional nematode classification. The new phylogenies not only placed the Nematoda in the Ecdysozoa and divided the phylum into five clades, but also provided new insights into animal molecular identification and phylogenetic biodiversity studies. The present paper reviews major progress and remaining problems in the current molecular phylogenetic studies of Nematoda, and discusses the likely future directions of this field.

  4. Phylogenetic Trees From Sequences (United States)

    Ryvkin, Paul; Wang, Li-San

    In this chapter, we review important concepts and approaches for phylogeny reconstruction from sequence data. We first cover some basic definitions and properties of phylogenetics, and briefly explain how scientists model sequence evolution and measure sequence divergence. We then discuss three major approaches for phylogenetic reconstruction: distance-based phylogenetic reconstruction, maximum parsimony, and maximum likelihood. In the third part of the chapter, we review how multiple phylogenies are compared by consensus methods and how to assess confidence using bootstrapping. At the end of the chapter are two sections that list popular software packages and additional reading.
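
    As a small illustration of how sequence divergence can be measured before a distance-based reconstruction, the sketch below computes the proportion of differing sites between aligned sequences and applies the Jukes-Cantor correction, d = -(3/4) ln(1 - 4p/3). The abstract does not say which substitution model the chapter uses, so the choice of Jukes-Cantor is an assumption, and the taxon names and sequences are invented for the example.

```python
import math
from itertools import combinations


def p_distance(a, b):
    """Proportion of aligned sites at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def jukes_cantor(p):
    """Jukes-Cantor corrected distance for an observed proportion p."""
    if p >= 0.75:
        raise ValueError("correction is undefined for p >= 0.75")
    return -0.75 * math.log(1 - 4 * p / 3)


# Invented aligned sequences for three hypothetical taxa.
alignment = {
    "taxon_a": "ACGTACGTAC",
    "taxon_b": "ACGTACGTTT",
    "taxon_c": "ACGAACGTTT",
}

# Pairwise corrected distances, the input to methods such as UPGMA
# or neighbor joining.
distances = {
    (u, v): jukes_cantor(p_distance(alignment[u], alignment[v]))
    for u, v in combinations(alignment, 2)
}
for pair, d in distances.items():
    print(pair, round(d, 4))
```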

  5. Taxonomic Identity Resolution of Highly Phylogenetically Related Strains and Selection of Phylogenetic Markers by Using Genome-Scale Methods: The Bacillus pumilus Group Case (United States)

    Espariz, Martín; Zuljan, Federico A.; Esteban, Luis; Magni, Christian


    Bacillus pumilus group strains have been studied due to their agronomic, biotechnological or pharmaceutical potential. Classifying strains of this taxonomic group at species level is a challenging procedure since it is composed of seven species that share among them over 99.5% of 16S rRNA gene identity. In this study, first, a whole-genome in silico approach was used to accurately demarcate B. pumilus group strains, as a case of highly phylogenetically related taxa, at the species level. In order to achieve that and consequently to validate or correct taxonomic identities of genomes in public databases, an average nucleotide identity correlation, a core-based phylogenomic and a gene function repertory analyses were performed. Eventually, more than 50% of such genomes were found to be misclassified. Hierarchical clustering of gene functional repertoires was also used to infer ecotypes among B. pumilus group species. Furthermore, for the first time the machine-learning algorithm Random Forest was used to rank genes in order of their importance for species classification. We found that ybbP, a gene involved in the synthesis of cyclic di-AMP, was the most important gene for accurately predicting species identity among B. pumilus group strains. Finally, principal component analysis was used to classify strains based on the distances between their ybbP genes. The methodologies described could be utilized more broadly to identify other highly phylogenetically related species in metagenomic or epidemiological assessments. PMID:27658251

  6. Phylogenetic reconstruction of the wolf spiders (Araneae: Lycosidae) using sequences from the 12S rRNA, 28S rRNA, and NADH1 genes: implications for classification, biogeography, and the evolution of web building behavior. (United States)

    Murphy, Nicholas P; Framenau, Volker W; Donnellan, Stephen C; Harvey, Mark S; Park, Yung-Chul; Austin, Andrew D


    Current knowledge of the evolutionary relationships amongst the wolf spiders (Araneae: Lycosidae) is based on assessment of morphological similarity or phylogenetic analysis of a small number of taxa. In order to enhance the current understanding of lycosid relationships, phylogenies of 70 lycosid species were reconstructed by parsimony and Bayesian methods using three molecular markers; the mitochondrial genes 12S rRNA, NADH1, and the nuclear gene 28S rRNA. The resultant trees from the mitochondrial markers were used to assess the current taxonomic status of the Lycosidae and to assess the evolutionary history of sheet-web construction in the group. The results suggest that a number of genera are not monophyletic, including Lycosa, Arctosa, Alopecosa, and Artoria. At the subfamilial level, the status of Pardosinae needs to be re-assessed, and the position of a number of genera within their respective subfamilies is in doubt (e.g., Hippasa and Arctosa in Lycosinae and Xerolycosa, Aulonia and Hygrolycosa in Venoniinae). In addition, a major clade of strictly Australasian taxa may require the creation of a new subfamily. The analysis of sheet-web building in Lycosidae revealed that the interpretation of this trait as an ancestral state relies on two factors: (1) an asymmetrical model favoring the loss of sheet-webs and (2) that the suspended silken tube of Pirata is directly descended from sheet-web building. Paralogous copies of the nuclear 28S rRNA gene were sequenced, confounding the interpretation of the phylogenetic analysis and suggesting that a cautionary approach should be taken to the further use of this gene for lycosid phylogenetic analysis.

  7. Phylogenetic lineages in Entomophthoromycota

    NARCIS (Netherlands)

    Gryganskyi, A.P.; Humber, R.A.; Smith, M.E.; Hodge, K.; Huang, B.; Voigt, K.; Vilgalys, R.


    Entomophthoromycota is one of six major phylogenetic lineages among the former phylum Zygomycota. These early terrestrial fungi share evolutionarily ancestral characters such as coenocytic mycelium and gametangiogamy as a sexual process resulting in zygospore formation. Previous molecular studies ha

  8. A new higher classification of planarian flatworms (Platyhelminthes, Tricladida)

    NARCIS (Netherlands)

    Sluys, R.; Kawakatsu, M.; Riutort, M.; Baguñà, J.


    This paper presents a revised classification for the higher taxa within the Tricladida. A historical sketch is provided of the higher classificatory systems of triclad flatworms. As far as possible, the new classification is based on published phylogenetic studies. A phylogenetic tree generalizing c

  9. Rounding up the usual suspects: a standard target-gene approach for resolving the interfamilial phylogenetic relationships of ecribellate orb-weaving spiders with a new family-rank classification (Araneae, Araneoidea)

    DEFF Research Database (Denmark)

    Dimitrov, Dimitar; Benevidas, Ligia R.; Arnedo, Miquel A.;


    We test the limits of the spider superfamily Araneoidea and reconstruct its interfamilial relationships using standard molecular markers. The taxon sample (363 terminals) comprises for the first time representatives of all araneoid families, including the first molecular data of the family...... Synaphridae. We use the resulting phylogenetic framework to study web evolution in araneoids. Araneoidea is monophyletic and sister to Nicodamoidea rank. n. Orbiculariae are not monophyletic and also include the RTA clade, Oecobiidae and Hersiliidae. Deinopoidea is paraphyletic with respect to a lineage...... holarchaeids but the family remains diphyletic even if Holarchaea is considered an anapid. The orb-web is ancient, having evolved by the early Jurassic; a single origin of the orb with multiple “losses” is implied by our analyses. By the late Jurassic, the orb-web had already been transformed into different...

  10. Clustering with phylogenetic tools in astrophysics

    CERN Document Server

    Fraix-Burnet, Didier


    Phylogenetic approaches are finding more and more applications outside the field of biology. Astrophysics is no exception since an overwhelming amount of multivariate data has appeared in the last twenty years or so. In particular, the diversification of galaxies throughout the evolution of the Universe quite naturally invokes phylogenetic approaches. We have demonstrated that Maximum Parsimony brings useful astrophysical results, and we now proceed toward the analyses of large datasets for galaxies. In this talk I present how we solve the major difficulties for this goal: the choice of the parameters, their discretization, and the analysis of a high number of objects with an unsupervised NP-hard classification technique like cladistics. 1. Introduction How do the galaxy form, and when? How did the galaxy evolve and transform themselves to create the diversity we observe? What are the progenitors to present-day galaxies? To answer these big questions, observations throughout the Universe and the physical mode...

  11. Host specificity and phylogenetic relationships of chicken and turkey parvoviruses (United States)

    Previous reports indicate that the newly discovered chicken parvoviruses (ChPV) and turkey parvoviruses (TuPV) are very similar to each other, yet they represent different species within a new genus of Parvoviridae. Currently, strain classification is based on the phylogenetic analysis of a 561 bas...

  12. Phylogenetic placement of two species known only from resting spores

    DEFF Research Database (Denmark)

    Hajek, Ann E; Gryganskyi, Andrii; Bittner, Tonya;


    Molecular methods were used to determine the generic placement of two species of Entomophthorales known only from resting spores. Historically, these species would belong in the form-genus Tarichium, but this classification provides no information about phylogenetic relationships. Using DNA from...

  13. First phylogenetic analyses of galaxy evolution

    CERN Document Server

    Fraix-Burnet, D


    The Hubble tuning fork diagram, based on morphology, has always been the preferred scheme for classification of galaxies and is still the only one originally built from historical/evolutionary relationships. At the opposite, biologists have long taken into account the parenthood links of living entities for classification purposes. Assuming branching evolution of galaxies as a "descent with modification", we show that the concepts and tools of phylogenetic systematics widely used in biology can be heuristically transposed to the case of galaxies. This approach that we call "astrocladistics" has been first applied to Dwarf Galaxies of the Local Group and provides the first evolutionary galaxy tree. The cladogram is sufficiently solid to support the existence of a hierarchical organization in the diversity of galaxies, making it possible to track ancestral types of galaxies. We also find that morphology is a summary of more fundamental properties. Astrocladistics applied to cosmology simulated galaxies can, uns...

  14. Charles Darwin, beetles and phylogenetics. (United States)

    Beutel, Rolf G; Friedrich, Frank; Leschen, Richard A B


    Here, we review Charles Darwin's relation to beetles and developments in coleopteran systematics in the last two centuries. Darwin was an enthusiastic beetle collector. He used beetles to illustrate different evolutionary phenomena in his major works, and astonishingly, an entire sub-chapter is dedicated to beetles in "The Descent of Man". During his voyage on the Beagle, Darwin was impressed by the high diversity of beetles in the tropics, and he remarked that, to his surprise, the majority of species were small and inconspicuous. However, despite his obvious interest in the group, he did not get involved in beetle taxonomy, and his theoretical work had little immediate impact on beetle classification. The development of taxonomy and classification in the late nineteenth and earlier twentieth century was mainly characterised by the exploration of new character systems (e.g. larval features and wing venation). In the mid-twentieth century, Hennig's new methodology to group lineages by derived characters revolutionised systematics of Coleoptera and other organisms. As envisioned by Darwin and Ernst Haeckel, the new Hennigian approach enabled systematists to establish classifications truly reflecting evolution. Roy A. Crowson and Howard E. Hinton, who both made tremendous contributions to coleopterology, had an ambivalent attitude towards the Hennigian ideas. The Mickoleit school combined detailed anatomical work with a classical Hennigian character evaluation, with stepwise tree building, comparatively few characters and a priori polarity assessment without explicit use of the outgroup comparison method. The rise of cladistic methods in the 1970s had a strong impact on beetle systematics. Cladistic computer programs facilitated parsimony analyses of large data matrices, mostly morphological characters not requiring detailed anatomical investigations. Molecular studies on beetle phylogeny started in the 1990s with modest taxon sampling and limited DNA data. This has

  15. Charles Darwin, beetles and phylogenetics (United States)

    Beutel, Rolf G.; Friedrich, Frank; Leschen, Richard A. B.


    Here, we review Charles Darwin’s relation to beetles and developments in coleopteran systematics in the last two centuries. Darwin was an enthusiastic beetle collector. He used beetles to illustrate different evolutionary phenomena in his major works, and astonishingly, an entire sub-chapter is dedicated to beetles in “The Descent of Man”. During his voyage on the Beagle, Darwin was impressed by the high diversity of beetles in the tropics, and he remarked that, to his surprise, the majority of species were small and inconspicuous. However, despite his obvious interest in the group, he did not get involved in beetle taxonomy, and his theoretical work had little immediate impact on beetle classification. The development of taxonomy and classification in the late nineteenth and earlier twentieth century was mainly characterised by the exploration of new character systems (e.g. larval features and wing venation). In the mid-twentieth century, Hennig’s new methodology to group lineages by derived characters revolutionised systematics of Coleoptera and other organisms. As envisioned by Darwin and Ernst Haeckel, the new Hennigian approach enabled systematists to establish classifications truly reflecting evolution. Roy A. Crowson and Howard E. Hinton, who both made tremendous contributions to coleopterology, had an ambivalent attitude towards the Hennigian ideas. The Mickoleit school combined detailed anatomical work with a classical Hennigian character evaluation, with stepwise tree building, comparatively few characters and a priori polarity assessment without explicit use of the outgroup comparison method. The rise of cladistic methods in the 1970s had a strong impact on beetle systematics. Cladistic computer programs facilitated parsimony analyses of large data matrices, mostly morphological characters not requiring detailed anatomical investigations. Molecular studies on beetle phylogeny started in the 1990s with modest taxon sampling and limited DNA data

  16. ClassyFlu: classification of influenza A viruses with Discriminatively trained profile-HMMs.

    Directory of Open Access Journals (Sweden)

    Sandra Van der Auwera

    Accurate and rapid characterization of influenza A virus (IAV) hemagglutinin (HA) and neuraminidase (NA) sequences with respect to subtype and clade is at the basis of extended diagnostic services and implicit to molecular epidemiologic studies. ClassyFlu is a new tool and web service for the classification of IAV sequences of the HA and NA gene into subtypes and phylogenetic clades using discriminatively trained profile hidden Markov models (HMMs), one for each subtype or clade. ClassyFlu merely requires as input unaligned, full-length or partial HA or NA DNA sequences. It enables rapid and highly accurate assignment of HA sequences to subtypes H1-H17 but particularly focusses on the finer grained assignment of sequences of highly pathogenic avian influenza viruses of subtype H5N1 according to the cladistics proposed by the H5N1 Evolution Working Group. NA sequences are classified into subtypes N1-N10. ClassyFlu was compared to semiautomatic classification approaches using BLAST and phylogenetics and additionally for H5 sequences to the new "Highly Pathogenic H5N1 Clade Classification Tool" (IRD-CT) proposed by the Influenza Research Database. Our results show that both web tools (ClassyFlu and IRD-CT), although based on different methods, are nearly equivalent in performance and both are more accurate and faster than semiautomatic classification. A retraining of ClassyFlu to altered cladistics as well as an extension of ClassyFlu to other IAV genome segments or fragments thereof is undemanding. This is exemplified by unambiguous assignment to a distinct cluster within subtype H7 of sequences of H7N9 viruses which emerged in China early in 2013 and caused more than 130 human infections. ClassyFlu is a free web service. For local execution, the ClassyFlu source code in PERL is freely available.

  17. CREST--classification resources for environmental sequence tags.

    Directory of Open Access Journals (Sweden)

    Anders Lanzén

    Sequencing of taxonomic or phylogenetic markers is becoming a fast and efficient method for studying environmental microbial communities. This has resulted in a steadily growing collection of marker sequences, most notably of the small-subunit (SSU) ribosomal RNA gene, and an increased understanding of microbial phylogeny, diversity and community composition patterns. However, to utilize these large datasets together with new sequencing technologies, a reliable and flexible system for taxonomic classification is critical. We developed CREST (Classification Resources for Environmental Sequence Tags), a set of resources and tools for generating and utilizing custom taxonomies and reference datasets for classification of environmental sequences. CREST uses an alignment-based classification method with the lowest common ancestor algorithm. It also uses explicit rank similarity criteria to reduce false positives and identify novel taxa. We implemented this method in a web server, a command line tool and the graphical user interfaced program MEGAN. Further, we provide the SSU rRNA reference database and taxonomy SilvaMod, derived from the publicly available SILVA SSURef, for classification of sequences from bacteria, archaea and eukaryotes. Using cross-validation and environmental datasets, we compared the performance of CREST and SilvaMod to the RDP Classifier. We also utilized Greengenes as a reference database, both with CREST and the RDP Classifier. These analyses indicate that CREST performs better than alignment-free methods with higher recall rate (sensitivity) as well as precision, and with the ability to accurately identify most sequences from novel taxa. Classification using SilvaMod performed better than with Greengenes, particularly when applied to environmental sequences. CREST is freely available under a GNU General Public License (v3).
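
    The lowest common ancestor idea mentioned in this abstract can be sketched as follows. This is a minimal illustration only, not CREST's implementation: the function name and the example lineages are assumptions, and each reference hit is assumed to be given as a list of taxon names ordered from the highest rank downwards.

```python
from typing import List, Sequence


def lowest_common_ancestor(lineages: Sequence[Sequence[str]]) -> List[str]:
    """Return the longest lineage prefix shared by all reference hits.

    Each lineage is ordered from the highest rank (e.g. domain) to the
    lowest (e.g. genus). The query is assigned to the last taxon of the
    returned prefix.
    """
    if not lineages:
        return []
    shared: List[str] = []
    # zip stops at the shortest lineage, so ranks missing from any hit
    # can never be part of the shared prefix.
    for ranks in zip(*lineages):
        if all(r == ranks[0] for r in ranks):
            shared.append(ranks[0])
        else:
            break
    return shared


if __name__ == "__main__":
    # Hypothetical best hits for one query sequence.
    hits = [
        ["Bacteria", "Firmicutes", "Bacilli", "Bacillales", "Bacillaceae", "Bacillus"],
        ["Bacteria", "Firmicutes", "Bacilli", "Bacillales", "Staphylococcaceae", "Staphylococcus"],
    ]
    print(lowest_common_ancestor(hits))
```

    In this example the two hits disagree at family level, so the query would be assigned no lower than the order Bacillales.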

  18. Fast phylogenetic DNA barcoding

    DEFF Research Database (Denmark)

    Terkelsen, Kasper Munch; Boomsma, Wouter Krogh; Willerslev, Eske


    We present a heuristic approach to the DNA assignment problem based on phylogenetic inferences using constrained neighbour joining and non-parametric bootstrapping. We show that this method performs as well as the more computationally intensive full Bayesian approach in an analysis of 500 insect...... DNA sequences obtained from GenBank. We also analyse a previously published dataset of environmental DNA sequences from soil from New Zealand and Siberia, and use these data to illustrate the fact that statistical approaches to the DNA assignment problem allow for more appropriate criteria...... for determining the taxonomic level at which a particular DNA sequence can be assigned....
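
    The non-parametric bootstrapping step named in this abstract can be illustrated with a short sketch. It is not the authors' code: the function name is hypothetical, and the sketch only shows how one bootstrap replicate is drawn by resampling alignment columns with replacement. The tree building on each replicate (constrained neighbour joining in the abstract) is left out.

```python
import random
from typing import List


def bootstrap_alignment(alignment: List[str], rng: random.Random) -> List[str]:
    """Draw one bootstrap replicate of a multiple sequence alignment.

    Columns (sites) are sampled with replacement, and the same sampled
    columns are used for every sequence so that each site pattern stays intact.
    """
    if not alignment:
        raise ValueError("alignment is empty")
    n_sites = len(alignment[0])
    if any(len(seq) != n_sites for seq in alignment):
        raise ValueError("all sequences must have the same aligned length")
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return ["".join(seq[i] for i in columns) for seq in alignment]


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed only to make the demo repeatable
    toy = ["ACGTAC", "ACGAAC", "TCGAAT"]
    for _ in range(3):
        print(bootstrap_alignment(toy, rng))
```

    Support for an assignment would then be estimated as the fraction of replicates in which the query falls inside the same clade.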

  19. Update on diabetes classification. (United States)

    Thomas, Celeste C; Philipson, Louis H


    This article highlights the difficulties in creating a definitive classification of diabetes mellitus in the absence of a complete understanding of the pathogenesis of the major forms. This brief review shows the evolving nature of the classification of diabetes mellitus. No classification scheme is ideal, and all have some overlap and inconsistencies. The only diabetes in which it is possible to accurately diagnose by DNA sequencing, monogenic diabetes, remains undiagnosed in more than 90% of the individuals who have diabetes caused by one of the known gene mutations. The point of classification, or taxonomy, of disease, should be to give insight into both pathogenesis and treatment. It remains a source of frustration that all schemes of diabetes mellitus continue to fall short of this goal.

  20. Phylogenetic trees in bioinformatics

    Energy Technology Data Exchange (ETDEWEB)

    Burr, Tom L [Los Alamos National Laboratory


    Genetic data is often used to infer evolutionary relationships among a collection of viruses, bacteria, animal or plant species, or other operational taxonomic units (OTU). A phylogenetic tree depicts such relationships and provides a visual representation of the estimated branching order of the OTUs. Tree estimation is unique for several reasons, including: the types of data used to represent each OTU; the use of probabilistic nucleotide substitution models; the inference goals involving both tree topology and branch length, and the huge number of possible trees for a given sample of a very modest number of OTUs, which implies that finding the best tree(s) to describe the genetic data for each OTU is computationally demanding. Bioinformatics is too large a field to review here. We focus on that aspect of bioinformatics that includes study of similarities in genetic data from multiple OTUs. Although research questions are diverse, a common underlying challenge is to estimate the evolutionary history of the OTUs. Therefore, this paper reviews the role of phylogenetic tree estimation in bioinformatics, available methods and software, and identifies areas for additional research and development.

  1. Entanglement, Invariants, and Phylogenetics (United States)

    Sumner, J. G.


    This thesis develops and expands upon known techniques of mathematical physics relevant to the analysis of the popular Markov model of phylogenetic trees required in biology to reconstruct the evolutionary relationships of taxonomic units from biomolecular sequence data. The techniques of mathematical physics are plethora and have been developed for some time. The Markov model of phylogenetics and its analysis is a relatively new technique where most progress to date has been achieved by using discrete mathematics. This thesis takes a group theoretical approach to the problem by beginning with a remarkable mathematical parallel to the process of scattering in particle physics. This is shown to equate to branching events in the evolutionary history of molecular units. The major technical result of this thesis is the derivation of existence proofs and computational techniques for calculating polynomial group invariant functions on a multi-linear space where the group action is that relevant to a Markovian time evolution. The practical results of this thesis are an extended analysis of the use of invariant functions in distance based methods and the presentation of a new reconstruction technique for quartet trees which is consistent with the most general Markov model of sequence evolution.

  2. Molecular Phylogenetic: Organism Taxonomy Method Based on Evolution History

    Directory of Open Access Journals (Sweden)

    N.L.P Indi Dharmayanti


    Phylogenetics is described as the taxonomic classification of an organism based on its evolutionary history, namely its phylogeny, and as a part of systematic science whose objective is to determine the phylogeny of organisms according to their characteristics. Phylogenetic analysis of amino acid and protein sequences has become an important area of sequence analysis. Phylogenetic analysis can be used to follow the rapid change of a species such as a virus. The phylogenetic tree is a two-dimensional graphic that shows the relationships among organisms or, in particular, among their gene sequences. The separated sequences are referred to as taxa (singular taxon), which are defined as phylogenetically distinct units on the tree. The tree consists of outer branches, or leaves, that represent taxa, and nodes and branches that represent the relationships among taxa. When the nucleotide sequences from two different organisms are similar, they are inferred to be descended from a common ancestor. Three methods are used in phylogenetics, namely (1) maximum parsimony, (2) distance, and (3) maximum likelihood. These methods are generally applied to construct the evolutionary tree, or the best tree, for determining sequence variation in a group. Each method is usually used for different analyses and data.

  3. Incompletely resolved phylogenetic trees inflate estimates of phylogenetic conservatism. (United States)

    Davies, T Jonathan; Kraft, Nathan J B; Salamin, Nicolas; Wolkovich, Elizabeth M


    The tendency for more closely related species to share similar traits and ecological strategies can be explained by their longer shared evolutionary histories and represents phylogenetic conservatism. How strongly species traits co-vary with phylogeny can significantly impact how we analyze cross-species data and can influence our interpretation of assembly rules in the rapidly expanding field of community phylogenetics. Phylogenetic conservatism is typically quantified by analyzing the distribution of species values on the phylogenetic tree that connects them. Many phylogenetic approaches, however, assume a completely sampled phylogeny: while we have good estimates of deeper phylogenetic relationships for many species-rich groups, such as birds and flowering plants, we often lack information on more recent interspecific relationships (i.e., within a genus). A common solution has been to represent these relationships as polytomies on trees using taxonomy as a guide. Here we show that such trees can dramatically inflate estimates of phylogenetic conservatism quantified using S. P. Blomberg et al.'s K statistic. Using simulations, we show that even randomly generated traits can appear to be phylogenetically conserved on poorly resolved trees. We provide a simple rarefaction-based solution that can reliably retrieve unbiased estimates of K, and we illustrate our method using data on first flowering times from Thoreau's woods (Concord, Massachusetts, USA).

  4. On Nakhleh's metric for reduced phylogenetic networks. (United States)

    Cardona, Gabriel; Llabrés, Mercè; Rosselló, Francesc; Valiente, Gabriel


    We prove that Nakhleh's metric for reduced phylogenetic networks is also a metric on the classes of tree-child phylogenetic networks, semibinary tree-sibling time consistent phylogenetic networks, and multilabeled phylogenetic trees. We also prove that it separates distinguishable phylogenetic networks. In this way, it becomes the strongest dissimilarity measure for phylogenetic networks available so far. Furthermore, we propose a generalization of that metric that separates arbitrary phylogenetic networks.

  5. Dengue virus type 3 in Brazil: a phylogenetic perspective

    Directory of Open Access Journals (Sweden)

    Josélio Maria Galvão de Araújo


    Circulation of a new dengue virus (DENV-3) genotype was recently described in Brazil and Colombia, but the precise classification of this genotype has been controversial. Here we perform phylogenetic and nucleotide-distance analyses of the envelope gene, which support the subdivision of DENV-3 strains into five distinct genotypes (GI to GV) and confirm the classification of the new South American genotype as GV. The extremely low genetic distances between Brazilian GV strains and the prototype Philippines/L11423 GV strain isolated in 1956 raise important questions regarding the origin of GV in South America.

  6. Bayesian phylogenetic estimation of fossil ages (United States)

    Drummond, Alexei J.; Stadler, Tanja


    Recent advances have allowed for both morphological fossil evidence and molecular sequences to be integrated into a single combined inference of divergence dates under the rule of Bayesian probability. In particular, the fossilized birth–death tree prior and the Lewis-Mk model of discrete morphological evolution allow for the estimation of both divergence times and phylogenetic relationships between fossil and extant taxa. We exploit this statistical framework to investigate the internal consistency of these models by producing phylogenetic estimates of the age of each fossil in turn, within two rich and well-characterized datasets of fossil and extant species (penguins and canids). We find that the estimation accuracy of fossil ages is generally high with credible intervals seldom excluding the true age and median relative error in the two datasets of 5.7% and 13.2%, respectively. The median relative standard error (RSD) was 9.2% and 7.2%, respectively, suggesting good precision, although with some outliers. In fact, in the two datasets we analyse, the phylogenetic estimate of fossil age is on average less than 2 Myr from the mid-point age of the geological strata from which it was excavated. The high level of internal consistency found in our analyses suggests that the Bayesian statistical model employed is an adequate fit for both the geological and morphological data, and provides evidence from real data that the framework used can accurately model the evolution of discrete morphological traits coded from fossil and extant taxa. We anticipate that this approach will have diverse applications beyond divergence time dating, including dating fossils that are temporally unconstrained, testing of the ‘morphological clock', and for uncovering potential model misspecification and/or data errors when controversial phylogenetic hypotheses are obtained based on combined divergence dating analyses. This article is part of the themed issue ‘Dating species divergences

  7. Quartets and unrooted phylogenetic networks. (United States)

    Gambette, Philippe; Berry, Vincent; Paul, Christophe


    Phylogenetic networks were introduced to describe evolution in the presence of exchanges of genetic material between coexisting species or individuals. Split networks in particular were introduced as a special kind of abstract network to visualize conflicts between phylogenetic trees which may correspond to such exchanges. More recently, methods were designed to reconstruct explicit phylogenetic networks (whose vertices can be interpreted as biological events) from triplet data. In this article, we link abstract and explicit networks through their combinatorial properties, by introducing the unrooted analog of level-k networks. In particular, we give an equivalence theorem between circular split systems and unrooted level-1 networks. We also show how to adapt to quartets some existing results on triplets, in order to reconstruct unrooted level-k phylogenetic networks. These results give an interesting perspective on the combinatorics of phylogenetic networks and also raise algorithmic and combinatorial questions.

  8. Application of Data Mining in Protein Sequence Classification

    Directory of Open Access Journals (Sweden)

    Suprativ Saha


    Protein sequence classification involves feature selection for accurate classification. Popular protein sequence classification techniques involve extraction of specific features from the sequences. Researchers apply well-known classification techniques such as neural networks, genetic algorithms, fuzzy ARTMAP and rough set classifiers for accurate classification. This paper presents a review of three different classification models: the neural network model, the fuzzy ARTMAP model and the rough set classifier model. This is followed by a new technique for classifying protein sequences. The proposed model is implemented with a purpose-built tool and tries to reduce the computational overheads encountered by earlier approaches and increase the accuracy of classification.

  9. Three Domains, Not Five Kingdoms: A Phylogenetic Classification System. (United States)

    Peirce, Susan K.


    Argues that the Woesian three domain view of life should replace the five kingdom taxonomic scheme presented in most general biology texts and courses. Presents evidence for employing the three domain scheme and a related activity for classroom use. Contains 11 references. (WRM)

  10. Phylogenetic and biogeographic analysis of sphaerexochine trilobites.

    Directory of Open Access Journals (Sweden)

    Curtis R Congreve

    BACKGROUND: Sphaerexochinae is a speciose and widely distributed group of cheirurid trilobites. Their temporal range extends from the earliest Ordovician through the Silurian, and they survived the end Ordovician mass extinction event (the second largest mass extinction in Earth history). Prior to this study, the individual evolutionary relationships within the group had yet to be determined utilizing rigorous phylogenetic methods. Understanding these evolutionary relationships is important for producing a stable classification of the group, and will be useful in elucidating the effects the end Ordovician mass extinction had on the evolutionary and biogeographic history of the group. METHODOLOGY/PRINCIPAL FINDINGS: Cladistic parsimony analysis of cheirurid trilobites assigned to the subfamily Sphaerexochinae was conducted to evaluate phylogenetic patterns and produce a hypothesis of relationship for the group. This study utilized the program TNT, and the analysis included thirty-one taxa and thirty-nine characters. The results of this analysis were then used in a Lieberman-modified Brooks Parsimony Analysis to analyze biogeographic patterns during the Ordovician-Silurian. CONCLUSIONS/SIGNIFICANCE: The genus Sphaerexochus was found to be monophyletic, consisting of two smaller clades (one composed entirely of Ordovician species and another composed of Silurian and Ordovician species). By contrast, the genus Kawina was found to be paraphyletic. It is a basal grade that also contains taxa formerly assigned to Cydonocephalus. Phylogenetic patterns suggest Sphaerexochinae is a relatively distinctive trilobite clade because it appears to have been largely unaffected by the end Ordovician mass extinction. Finally, the biogeographic analysis yields two major conclusions about Sphaerexochus biogeography: Bohemia and Avalonia were close enough during the Silurian to exchange taxa; and during the Ordovician there was dispersal between Eastern Laurentia and

  11. High-resolution phylogenetic microbial community profiling

    Energy Technology Data Exchange (ETDEWEB)

    Singer, Esther; Coleman-Derr, Devin; Bowman, Brett; Schwientek, Patrick; Clum, Alicia; Copeland, Alex; Ciobanu, Doina; Cheng, Jan-Fang; Gies, Esther; Hallam, Steve; Tringe, Susannah; Woyke, Tanja


    The representation of bacterial and archaeal genome sequences is strongly biased towards cultivated organisms, which belong to merely four phylogenetic groups. Functional information and inter-phylum level relationships are still largely underexplored for candidate phyla, which are often referred to as microbial dark matter. Furthermore, a large portion of the 16S rRNA gene records in the GenBank database are labeled as environmental samples and unclassified, which is in part due to low read accuracy, potential chimeric sequences produced during PCR amplifications and the low resolution of short amplicons. In order to improve the phylogenetic classification of novel species and advance our knowledge of the ecosystem function of uncultivated microorganisms, high-throughput full length 16S rRNA gene sequencing methodologies with reduced biases are needed. We evaluated the performance of PacBio single-molecule real-time (SMRT) sequencing in high-resolution phylogenetic microbial community profiling. For this purpose, we compared PacBio and Illumina metagenomic shotgun and 16S rRNA gene sequencing of a mock community as well as of an environmental sample from Sakinaw Lake, British Columbia. Sakinaw Lake is known to contain a large percentage of microbial species from candidate phyla. Sequencing results show that community structure based on PacBio shotgun and 16S rRNA gene sequences is highly similar in both the mock and the environmental communities. Resolution power and community representation accuracy from SMRT sequencing data appeared to be independent of GC content of microbial genomes and was higher when compared to Illumina-based metagenome shotgun and 16S rRNA gene (iTag) sequences, e.g. full-length sequencing resolved all 23 OTUs in the mock community, while iTags did not resolve closely related species. SMRT sequencing hence offers various potential benefits when characterizing uncharted microbial communities.

  12. Speaking Fluently And Accurately

    Institute of Scientific and Technical Information of China (English)



    Even after many years of study, students make frequent mistakes in English. In addition, many students still need a long time to think of what they want to say. For some reason, in spite of all the studying, students are still not quite fluent. When I teach, I use one technique that helps students not only speak more accurately, but also more fluently. That technique is dictations.

  13. Semiparametric Gaussian copula classification


    Zhao, Yue; Wegkamp, Marten


    This paper studies the binary classification of two distributions with the same Gaussian copula in high dimensions. Under this semiparametric Gaussian copula setting, we derive an accurate semiparametric estimator of the log density ratio, which leads to our empirical decision rule and a bound on its associated excess risk. Our estimation procedure takes advantage of the potential sparsity as well as the low noise condition in the problem, which allows us to achieve faster convergence rate of...

  14. Phylogenetics and the human microbiome. (United States)

    Matsen, Frederick A


    The human microbiome is the ensemble of genes in the microbes that live inside and on the surface of humans. Because microbial sequencing information is now much easier to come by than phenotypic information, there has been an explosion of sequencing and genetic analysis of microbiome samples. Much of the analytical work for these sequences involves phylogenetics, at least indirectly, but methodology has developed in a somewhat different direction than for other applications of phylogenetics. In this article, I review the field and its methods from the perspective of a phylogeneticist, as well as describing current challenges for phylogenetics coming from this type of work.

  15. An Optimization-Based Sampling Scheme for Phylogenetic Trees (United States)

    Misra, Navodit; Blelloch, Guy; Ravi, R.; Schwartz, Russell

    Much modern work in phylogenetics depends on statistical sampling approaches to phylogeny construction to estimate probability distributions of possible trees for any given input data set. Our theoretical understanding of sampling approaches to phylogenetics remains far less developed than that for optimization approaches, however, particularly with regard to the number of sampling steps needed to produce accurate samples of tree partition functions. Despite the many advantages in principle of being able to sample trees from sophisticated probabilistic models, we have little theoretical basis for concluding that the prevailing sampling approaches do in fact yield accurate samples from those models within realistic numbers of steps. We propose a novel approach to phylogenetic sampling intended to be both efficient in practice and more amenable to theoretical analysis than the prevailing methods. The method depends on replacing the standard tree rearrangement moves with an alternative Markov model in which one solves a theoretically hard but practically tractable optimization problem on each step of sampling. The resulting method can be applied to a broad range of standard probability models, yielding practical algorithms for efficient sampling and rigorous proofs of accurate sampling for some important special cases. We demonstrate the efficiency and versatility of the method in an analysis of uncertainty in tree inference over varying input sizes. In addition to providing a new practical method for phylogenetic sampling, the technique is likely to prove applicable to many similar problems involving sampling over combinatorial objects weighted by a likelihood model.

  16. [Foundations of the new phylogenetics]. (United States)

    Pavlinov, I Ia


    The evolutionary idea is the core of modern biology. Because of this, phylogenetics, which deals with historical reconstructions in biology, takes a priority position among biological disciplines. The second half of the 20th century witnessed the growth of great interest in phylogenetic reconstructions at the macrotaxonomic level, which replaced the microevolutionary studies that dominated during the 1930s-60s. This meant a shift from population thinking to phylogenetic thinking, but it was not a revival of classical phylogenetics; rather, a new approach emerged that was baptized The New Phylogenetics. It arose from the merging of three disciplines that developed independently during the 1960s-70s, namely cladistics, numerical phyletics, and molecular phylogenetics (now basically genophyletics). Thus, the new phylogenetics can be defined as a branch of evolutionary biology aimed at elaborating "parsimonious" cladistic hypotheses by means of numerical methods on the basis of mostly molecular data. Classical phylogenetics, as the historical predecessor of the new one, emerged on the basis of the natural-philosophical worldview, which included a superorganismal idea of biota. According to that view, historical development (phylogeny) was thought to be analogous to individual development (ontogeny), so its most basic features were progressive parallel developments of "parts" (taxa), supplemented with the Darwinian concept of monophyly. Two predominant traditions diverged within classical phylogenetics according to the particular interpretation of the relation between these concepts. One of them (Cope, Severtzow) belittled monophyly and paid most attention to progressive parallel developments of morphological traits. Such an attitude turned this kind of phylogenetics into semogenetics, dealing primarily with the evolution of structures and not of taxa. Another tradition (Haeckel) considered both monophyletic and parallel origins of taxa jointly: in the middle of the 20th century it was split into

  17. Contextual classification of multispectral image data: Approximate algorithm (United States)

    Tilton, J. C. (Principal Investigator)


    An approximation to a classification algorithm incorporating spatial context information in a general, statistical manner is presented which is computationally less intensive. Classifications that are nearly as accurate are produced.

  18. Integrated classification of inflammatory myopathies. (United States)

    Allenbach, Y; Benveniste, O; Goebel, H-H; Stenzel, W


    Inflammatory myopathies comprise a multitude of diverse diseases, most often occurring in complex clinical settings. To ensure accurate diagnosis, multidisciplinary expertise is required. Here, we propose a comprehensive myositis classification that incorporates clinical, morphological and molecular data as well as autoantibody profile. This review focuses on recent advances in myositis research, in particular, the correlation between autoantibodies and morphological or clinical phenotypes that can be used as the basis for an 'integrated' classification system.

  19. Skeletal Rigidity of Phylogenetic Trees

    CERN Document Server

    Cheng, Howard; Li, Brian; Risteski, Andrej


    Motivated by geometric origami and the straight skeleton construction, we outline a map between spaces of phylogenetic trees and spaces of planar polygons. The limitations of this map are studied through explicit examples, culminating in the proof of a structural rigidity result.

  20. Community Phylogenetics: Assessing Tree Reconstruction Methods and the Utility of DNA Barcodes. (United States)

    Boyle, Elizabeth E; Adamowicz, Sarah J


    Studies examining phylogenetic community structure have become increasingly prevalent, yet little attention has been given to the influence of the input phylogeny on metrics that describe phylogenetic patterns of co-occurrence. Here, we examine the influence of branch length, tree reconstruction method, and amount of sequence data on measures of phylogenetic community structure, as well as the phylogenetic signal (Pagel's λ) in morphological traits, using Trichoptera larval communities from Churchill, Manitoba, Canada. We find that model-based tree reconstruction methods and the use of a backbone family-level phylogeny improve estimations of phylogenetic community structure. In addition, trees built using the barcode region of cytochrome c oxidase subunit I (COI) alone accurately predict metrics of phylogenetic community structure obtained from a multi-gene phylogeny. Input tree did not alter overall conclusions drawn for phylogenetic signal, as significant phylogenetic structure was detected in two body size traits across input trees. As the discipline of community phylogenetics continues to expand, it is important to investigate the best approaches to accurately estimate patterns. Our results suggest that emerging large datasets of DNA barcode sequences provide a vast resource for studying the structure of biological communities.

  1. Quantum Simulation of Phylogenetic Trees

    CERN Document Server

    Ellinas, Demosthenes


    Quantum simulations constructing probability tensors of biological multi-taxa in phylogenetic trees are proposed, in terms of positive trace preserving maps, describing evolving systems of quantum walks with multiple walkers. Basic phylogenetic models applying on trees of various topologies are simulated following appropriate decoherent quantum circuits. Quantum simulations of statistical inference for aligned sequences of biological characters are provided in terms of a quantum pruning map operating on likelihood operator observables, utilizing state-observable duality and measurement theory.

  2. Phylogenetic incongruence in E. coli O104: understanding the evolutionary relationships of emerging pathogens in the face of homologous recombination.

    Directory of Open Access Journals (Sweden)

    Weilong Hao

    Escherichia coli O104:H4 was identified as an emerging pathogen during the spring and summer of 2011 and was responsible for a widespread outbreak that resulted in the deaths of 50 people and sickened over 4075. Traditional phenotypic and genotypic assays, such as serotyping, pulsed field gel electrophoresis (PFGE), and multilocus sequence typing (MLST), permit identification and classification of bacterial pathogens, but cannot accurately resolve relationships among genotypically similar but pathotypically different isolates. To understand the evolutionary origins of E. coli O104:H4, we sequenced two strains isolated in Ontario, Canada. One was epidemiologically linked to the 2011 outbreak, and the second, unrelated isolate, was obtained in 2010. MLST analysis indicated that both isolates are of the same sequence type (ST678), but whole-genome sequencing revealed differences in chromosomal and plasmid content. Through comprehensive phylogenetic analysis of five O104:H4 ST678 genomes, we identified 167 genes in three gene clusters that have undergone homologous recombination with distantly related E. coli strains. These recombination events have resulted in unexpectedly high sequence diversity within the same sequence type. Failure to recognize or adjust for homologous recombination can result in phylogenetic incongruence. Understanding the extent of homologous recombination among different strains of the same sequence type may explain the pathotypic differences between the ON2010 and ON2011 strains and help shed new light on the emergence of this new pathogen.

  3. Factors that affect large subunit ribosomal DNA amplicon sequencing studies of fungal communities: classification method, primer choice, and error.

    Directory of Open Access Journals (Sweden)

    Teresita M Porter

    Nuclear large subunit ribosomal DNA is widely used in fungal phylogenetics and, to an increasing extent, also in amplicon-based environmental sequencing. The relatively short reads produced by next-generation sequencing, however, make primer choice and sequence error important variables for obtaining accurate taxonomic classifications. In this simulation study we tested the performance of three classification methods: (1) a similarity-based method (BLAST + Metagenomic Analyzer, MEGAN); (2) a composition-based method (Ribosomal Database Project naïve Bayesian classifier, NBC); and (3) a phylogeny-based method (Statistical Assignment Package, SAP). We also tested the effects of sequence length, primer choice, and sequence error on classification accuracy and perceived community composition. Using a leave-one-out cross validation approach, results for classifications to the genus rank were as follows: BLAST + MEGAN had the lowest error rate and was particularly robust to sequence error; SAP accuracy was highest when long LSU query sequences were classified; and NBC ran significantly faster than the other tested methods. All methods performed poorly with the shortest 50-100 bp sequences. Increasing simulated sequence error reduced classification accuracy. Community shifts were detected due to sequence error and primer selection even though there was no change in the underlying community composition. Short read datasets from individual primers, as well as pooled datasets, appear to only approximate the true community composition. We hope this work informs investigators of some of the factors that affect the quality and interpretation of their environmental gene surveys.
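
    The leave-one-out evaluation mentioned in this abstract can be illustrated with a minimal sketch. The classifier below is an assumption made purely for illustration (a nearest-reference rule based on shared k-mers); it is not BLAST + MEGAN, the RDP classifier, or SAP, and the sequences and genus labels are invented toy data.

```python
from collections import Counter

def kmers(seq, k=3):
    """Count the overlapping k-mers in a sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def classify(query, references, k=3):
    """Toy classifier (illustrative assumption): return the label of the
    reference sequence that shares the most k-mers with the query."""
    q = kmers(query, k)

    def shared(ref_seq):
        # Multiset intersection: number of k-mer occurrences in common.
        return sum((q & kmers(ref_seq, k)).values())

    best_label, _ = max(references, key=lambda rec: shared(rec[1]))
    return best_label

def leave_one_out_error(records, k=3):
    """Hold out each (label, sequence) record in turn, classify it against
    the remaining records, and return the fraction misclassified."""
    errors = 0
    for idx, (label, seq) in enumerate(records):
        training = records[:idx] + records[idx + 1:]
        if classify(seq, training, k) != label:
            errors += 1
    return errors / len(records)

# Invented toy reference set: two genera with two sequences each,
# plus one genus represented by a single sequence.
records = [
    ("GenusA", "AAAAAAAAAC"),
    ("GenusA", "AAAAAAAAAG"),
    ("GenusB", "CCCCCCCCCT"),
    ("GenusB", "CCCCCCCCCA"),
    ("GenusC", "GGGGGGGGGG"),
]
print(leave_one_out_error(records))
```

    In this toy set the only error comes from the genus with a single reference sequence: once it is held out, no reference of that genus remains, so it cannot be assigned correctly. Singletons of this kind are one reason leave-one-out error rates depend on reference database coverage.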

  4. apex: phylogenetics with multiple genes. (United States)

    Jombart, Thibaut; Archer, Frederick; Schliep, Klaus; Kamvar, Zhian; Harris, Rebecca; Paradis, Emmanuel; Goudet, Jérome; Lapp, Hilmar


    Genetic sequences of multiple genes are becoming increasingly common for a wide range of organisms including viruses, bacteria and eukaryotes. While such data may sometimes be treated as a single locus, in practice, a number of biological and statistical phenomena can lead to phylogenetic incongruence. In such cases, different loci should, at least as a preliminary step, be examined and analysed separately. The R software has become a popular platform for phylogenetics, with several packages implementing distance-based, parsimony and likelihood-based phylogenetic reconstruction, and an even greater number of packages implementing phylogenetic comparative methods. Unfortunately, basic data structures and tools for analysing multiple genes have so far been lacking, thereby limiting potential for investigating phylogenetic incongruence. In this study, we introduce the new R package apex to fill this gap. apex implements new object classes, which extend existing standards for storing DNA and amino acid sequences, and provides a number of convenient tools for handling, visualizing and analysing these data. In this study, we introduce the main features of the package and illustrate its functionalities through the analysis of a simple data set.


    Institute of Scientific and Technical Information of China (English)

    Li Jun; Zhang Shunyi; Lu Yanqing; Yan Junrong


    Accurate and real-time classification of network traffic is significant to network operation and management such as QoS differentiation, traffic shaping and security surveillance. However, with many newly emerged P2P applications using dynamic port numbers, masquerading techniques, and payload encryption to avoid detection, traditional classification approaches turn out to be ineffective. In this paper, we present a layered hybrid system to classify current Internet traffic, motivated by the variety of network activities and their requirements for traffic classification. The proposed method could achieve fast and accurate traffic classification with low overheads and robustness to accommodate both known and unknown/encrypted applications. Furthermore, it is feasible to use it in the context of real-time traffic classification. Our experimental results show the distinct advantages of the proposed classification system, compared with the one-step Machine Learning (ML) approach.

  6. Absolute Pitch in Boreal Chickadees and Humans: Exceptions that Test a Phylogenetic Rule (United States)

    Weisman, Ronald G.; Balkwill, Laura-Lee; Hoeschele, Marisa; Moscicki, Michele K.; Bloomfield, Laurie L.; Sturdy, Christopher B.


    This research examined generality of the phylogenetic rule that birds discriminate frequency ranges more accurately than mammals. Human absolute pitch chroma possessors accurately tracked transitions between frequency ranges. Independent tests showed that they used note naming (pitch chroma) to remap the tones into ranges; neither possessors nor…

  7. Functional Basis of Microorganism Classification.

    Directory of Open Access Journals (Sweden)

    Chengsheng Zhu


    Correctly identifying nearest "neighbors" of a given microorganism is important in industrial and clinical applications where close relationships imply similar treatment. Microbial classification based on similarity of physiological and genetic organism traits (polyphasic similarity) is experimentally difficult and, arguably, subjective. Evolutionary relatedness, inferred from phylogenetic markers, facilitates classification but does not guarantee functional identity between members of the same taxon or lack of similarity between different taxa. Using over thirteen hundred sequenced bacterial genomes, we built a novel function-based microorganism classification scheme, functional-repertoire similarity-based organism network (FuSiON; flattened to fusion). Our scheme is phenetic, based on a network of quantitatively defined organism relationships across the known prokaryotic space. It correlates significantly with the current taxonomy, but the observed discrepancies reveal both (1) the inconsistency of functional diversity levels among different taxa and (2) an (unsurprising) bias towards prioritizing, for classification purposes, relatively minor traits of particular interest to humans. Our dynamic network-based organism classification is independent of the arbitrary pairwise organism similarity cut-offs traditionally applied to establish taxonomic identity. Instead, it reveals natural, functionally defined organism groupings and is thus robust in handling organism diversity. Additionally, fusion can use organism meta-data to highlight the specific environmental factors that drive microbial diversification. Our approach provides a complementary view to cladistic assignments and holds important clues for further exploration of microbial lifestyles. Fusion is a more practical fit for biomedical, industrial, and ecological applications, as many of these rely on understanding the functional capabilities of the microbes in their environment and are less

  8. Phylogenetic analysis of a spontaneous cocoa bean fermentation metagenome reveals new insights into its bacterial and fungal community diversity.

    Directory of Open Access Journals (Sweden)

    Koen Illeghems

    This is the first report on the phylogenetic analysis of the community diversity of a single spontaneous cocoa bean box fermentation sample through a metagenomic approach involving 454 pyrosequencing. Several sequence-based and composition-based taxonomic profiling tools were used and evaluated to avoid software-dependent results and their outcome was validated by comparison with previously obtained culture-dependent and culture-independent data. Overall, this approach revealed a wider bacterial (mainly γ-Proteobacteria) and fungal diversity than previously found. Further, the use of a combination of different classification methods, in a software-independent way, helped to understand the actual composition of the microbial ecosystem under study. In addition, bacteriophage-related sequences were found. The bacterial diversity depended partially on the methods used, as composition-based methods predicted a wider diversity than sequence-based methods, and as classification methods based solely on phylogenetic marker genes predicted a more restricted diversity compared with methods that took all reads into account. The metagenomic sequencing analysis identified Hanseniaspora uvarum, Hanseniaspora opuntiae, Saccharomyces cerevisiae, Lactobacillus fermentum, and Acetobacter pasteurianus as the prevailing species. Also, the presence of occasional members of the cocoa bean fermentation process was revealed (such as Erwinia tasmaniensis, Lactobacillus brevis, Lactobacillus casei, Lactobacillus rhamnosus, Lactococcus lactis, Leuconostoc mesenteroides, and Oenococcus oeni). Furthermore, the sequence reads associated with viral communities were of a restricted diversity, dominated by Myoviridae and Siphoviridae, and reflecting Lactobacillus as the dominant host. To conclude, an accurate overview of all members of a cocoa bean fermentation process sample was revealed, indicating the superiority of metagenomic sequencing over previously used techniques.

  9. Phylogenetic biogeography and taxonomy of disjunctly distributed bryophytes

    Institute of Scientific and Technical Information of China (English)



    More than 200 research papers on the molecular phylogeny and phylogenetic biogeography of bryophytes have been published since the beginning of this millennium. These papers corroborated assumptions of a complex genetic structure of morphologically circumscribed bryophytes, and raised reservations against many morphologically justified species concepts, especially within the mosses. However, many molecular studies allowed for corrections and modifications of morphological classification schemes. Several studies reported that the phylogenetic structure of disjunctly distributed bryophyte species reflects their geographical ranges rather than morphological disparities. Molecular data led to new appraisals of distribution ranges and allowed for the reconstruction of refugia and migration routes. Intercontinental ranges of bryophytes are often caused by dispersal rather than geographical vicariance. Many distribution patterns of disjunct bryophytes are likely formed by processes such as short distance dispersal, rare long distance dispersal events, extinction, recolonization and diversification.

  10. PoInTree: A Polar and Interactive Phylogenetic Tree

    Institute of Scientific and Technical Information of China (English)

    Carreras Marco; Gianti Eleonora; Sartori Luca; Plyte Simon Edward; Isacchi Antonella; Bosotti Roberta


    PoInTree (Polar and Interactive Tree) is an application that allows users to build, visualize, and customize phylogenetic trees in a polar, interactive, and highly flexible view. It takes as input a FASTA file or multiple alignment formats. Phylogenetic tree calculation is based on a sequence distance method and utilizes the Neighbor Joining (NJ) algorithm. It also allows displaying precalculated trees of the major protein families based on Pfam classification. In PoInTree, nodes can be dynamically opened and closed and distances between genes are graphically represented. The tree root can be centered on a selected leaf. Text search, color-coding and label display are integrated. The visualizer can be connected to an Oracle database containing information on sequences and other biological data, helping to guide their interpretation within a given protein family across multiple species. The application is written in Borland Delphi and based on the VCL TeeChart Pro 6 graphical component (Steema Software).
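
    For readers unfamiliar with Neighbor Joining, a single joining step of the standard algorithm can be sketched as follows. This is a generic textbook formulation written in Python for illustration; it is not PoInTree's Delphi implementation, and the function name and the five-taxon distance matrix are assumptions chosen only to make the example concrete.

```python
def nj_step(names, D):
    """One Neighbor Joining step on a symmetric distance matrix D (n > 2).

    Returns the pair of taxa to join, their branch lengths to the new
    internal node, and the distances from that node to every other taxon.
    """
    n = len(names)
    totals = [sum(row) for row in D]

    # Q criterion: Q(i, j) = (n - 2) * d(i, j) - sum_k d(i, k) - sum_k d(j, k).
    # The pair with the smallest Q is joined.
    best = None
    for i in range(n):
        for j in range(i + 1, n):
            q = (n - 2) * D[i][j] - totals[i] - totals[j]
            if best is None or q < best[0]:
                best = (q, i, j)
    _, i, j = best

    # Branch lengths from the joined taxa to the new internal node.
    limb_i = 0.5 * D[i][j] + (totals[i] - totals[j]) / (2 * (n - 2))
    limb_j = D[i][j] - limb_i

    # Distances from the new node to each remaining taxon.
    new_dists = {
        names[k]: 0.5 * (D[i][k] + D[j][k] - D[i][j])
        for k in range(n)
        if k not in (i, j)
    }
    return (names[i], names[j]), (limb_i, limb_j), new_dists

# Illustrative five-taxon distance matrix (made-up values).
names = ["a", "b", "c", "d", "e"]
D = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
pair, limbs, new_dists = nj_step(names, D)
print(pair, limbs, new_dists)
```

    Repeating this step on the reduced matrix until three nodes remain yields the full unrooted tree. Note that the pair with the smallest raw distance (d and e here) is not necessarily the pair joined first, because the Q criterion corrects for each taxon's average distance to all others.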

  11. Phylogenetic relationships among Maloideae species (United States)

    The Maloideae is a highly diverse sub-family of the Rosaceae containing several agronomically important species (Malus sp. and Pyrus sp.) and their wild relatives. Previous phylogenetic work within the group has revealed extensive intergeneric hybridization and polyploidization. In order to develop...

  12. Tissue Classification

    DEFF Research Database (Denmark)

    Van Leemput, Koen; Puonti, Oula


    Computational methods for automatically segmenting magnetic resonance images of the brain have seen tremendous advances in recent years. So-called tissue classification techniques, aimed at extracting the three main brain tissue classes (white matter, gray matter, and cerebrospinal fluid), are now...... well established. In their simplest form, these methods classify voxels independently based on their intensity alone, although much more sophisticated models are typically used in practice. This article aims to give an overview of often-used computational techniques for brain tissue classification...

  13. Text Classification using Artificial Intelligence

    CERN Document Server

    Kamruzzaman, S M


    Text classification is the process of classifying documents into predefined categories based on their content. It is the automated assignment of natural language texts to predefined categories. Text classification is the primary requirement of text retrieval systems, which retrieve texts in response to a user query, and text understanding systems, which transform text in some way such as producing summaries, answering questions or extracting data. Existing supervised learning algorithms for classifying text need sufficient documents to learn accurately. This paper presents a new algorithm for text classification using artificial intelligence technique that requires fewer documents for training. Instead of using words, word relation i.e. association rules from these words is used to derive feature set from pre-classified text documents. The concept of naïve Bayes classifier is then used on derived features and finally only a single concept of genetic algorithm has been added for final classification. A syste...

  14. Text Classification using Data Mining

    CERN Document Server

    Kamruzzaman, S M; Hasan, Ahmed Ryadh


    Text classification is the process of classifying documents into predefined categories based on their content. It is the automated assignment of natural language texts to predefined categories. Text classification is the primary requirement of text retrieval systems, which retrieve texts in response to a user query, and text understanding systems, which transform text in some way such as producing summaries, answering questions or extracting data. Existing supervised learning algorithms to automatically classify text need sufficient documents to learn accurately. This paper presents a new algorithm for text classification using data mining that requires fewer documents for training. Instead of using words, word relation i.e. association rules from these words is used to derive feature set from pre-classified text documents. The concept of Naive Bayes classifier is then used on derived features and finally only a single concept of Genetic Algorithm has been added for final classification. A system based on the...

  15. Cyber-infrastructure for Fusarium (CiF): Three integrated platforms supporting strain identification, phylogenetics, comparative genomics, and knowledge sharing (United States)

    The fungal genus Fusarium includes many plant and/or animal pathogenic species and produces diverse toxins. Although accurate identification is critical for managing such threats, it is difficult to identify Fusarium morphologically. Fortunately, extensive molecular phylogenetic studies, founded on ...

  16. Transporter Classification Database (TCDB) (United States)

    U.S. Department of Health & Human Services — The Transporter Classification Database details a comprehensive classification system for membrane transport proteins known as the Transporter Classification (TC)...

  17. Molecular identification and phylogenetic study of Demodex caprae. (United States)

    Zhao, Ya-E; Cheng, Juan; Hu, Li; Ma, Jun-Xian


    The DNA barcode has been widely used in species identification and phylogenetic analysis since 2003, but there have been no reports in Demodex. In this study, to obtain an appropriate DNA barcode for Demodex, molecular identification of Demodex caprae based on mitochondrial cox1 was conducted. Firstly, individual adults and eggs of D. caprae were obtained for genomic DNA (gDNA) extraction; secondly, the mitochondrial cox1 fragment was amplified, cloned, and sequenced; thirdly, cox1 fragments of D. caprae were aligned with those of other Demodex retrieved from GenBank; finally, the intra- and inter-specific divergences were computed and the phylogenetic trees were reconstructed to analyze phylogenetic relationships in Demodex. Results obtained from seven 429-bp fragments of D. caprae showed that sequence identities were above 99.1% among three adults and four eggs. The intraspecific divergences in D. caprae, Demodex folliculorum, Demodex brevis, and Demodex canis were 0.0-0.9, 0.5-0.9, 0.0-0.2, and 0.0-0.5%, respectively, while the interspecific divergences between D. caprae and D. folliculorum, D. canis, and D. brevis were 20.3-20.9, 21.8-23.0, and 25.0-25.3%, respectively. The interspecific divergences were 10 times higher than the intraspecific ones, indicating a considerable barcoding gap. Furthermore, the phylogenetic trees showed that four Demodex species gathered separately, representing independent species; and Demodex folliculorum gathered with canine Demodex, D. caprae, and D. brevis in sequence. In conclusion, the selected 429-bp mitochondrial cox1 gene is an appropriate DNA barcode for molecular classification, identification, and phylogenetic analysis of Demodex. D. caprae is an independent species and D. folliculorum is closer to D. canis than to D. caprae or D. brevis.
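
    The comparison of intra- and inter-specific divergences described in this abstract can be sketched in a few lines. The abstract does not state which distance measure was used, so the uncorrected p-distance below is an assumption; the function names and the short toy sequences are likewise hypothetical and are not the study's cox1 data.

```python
from itertools import combinations

def p_distance(a, b):
    """Uncorrected p-distance: fraction of differing sites between two
    aligned sequences of equal length."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x != y) / len(a)

def divergences(seqs_by_species):
    """Split all pairwise distances into within-species (intra) and
    between-species (inter) lists."""
    labelled = [(sp, s) for sp, seqs in seqs_by_species.items() for s in seqs]
    intra, inter = [], []
    for (sp1, s1), (sp2, s2) in combinations(labelled, 2):
        (intra if sp1 == sp2 else inter).append(p_distance(s1, s2))
    return intra, inter

def has_barcoding_gap(intra, inter):
    """A barcoding gap exists when every between-species distance exceeds
    every within-species distance."""
    return max(intra) < min(inter)

# Invented toy alignment (10 sites, two sequences per species).
toy = {
    "species_1": ["ACGTACGTAC", "ACGTACGTAT"],
    "species_2": ["TTGTACCTAC", "TTGTACCTAC"],
}
intra, inter = divergences(toy)
print(max(intra), min(inter), has_barcoding_gap(intra, inter))
```

    With real barcode data the same comparison would normally be run on a proper multiple alignment, often with a model-corrected distance in place of the p-distance.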

  18. Stochastic Models for Phylogenetic Trees on Higher-order Taxa

    CERN Document Server

    Aldous, David; Popovic, Lea


    Simple stochastic models for phylogenetic trees on species have been well studied. But much paleontology data concerns time series or trees on higher-order taxa, and any broad picture of relationships between extant groups requires use of higher-order taxa. A coherent model for trees on (say) genera should involve both a species-level model and a model for the classification scheme by which species are assigned to genera. We present a general framework for such models, and describe three alternate classification schemes. Combining with the species-level model of Aldous-Popovic (2005), one gets models for higher-order trees, and we initiate analytic study of such models. In particular we derive formulas for the lifetime of genera, for the distribution of number of species per genus, and for the offspring structure of the tree on genera.

  19. Mitochondrial coi in phylogenetic relationships of Laimaphelenchus belgradiensis (Nematoda: Aphelenchoididae)

    Directory of Open Access Journals (Sweden)

    Oro Violeta


    Nematodes of the genus Laimaphelenchus are tiny organisms. Some parts of their body are measured in nanometers. The identification and classification of such organisms is a complex task. Previously, the major source of classification was morphology based on anatomical characters and measurements. Nowadays, this approach is supplemented by "nano-morphology" based on scanning electron microscopy and by molecular data and phylogeny, resulting in molecular systematics. Laimaphelenchus belgradiensis is a recently described species. Since the cytochrome c oxidase subunit I gene has been successful in DNA-based species diagnosis, it was chosen as a molecular marker to infer the phylogeny of the newly discovered species. Phylogenetic relationships were based on Bayesian inference, the pairwise distances and the content of nitrogenous bases. Great genetic diversity was observed among close and distant species. [Project of the Ministry of Science of the Republic of Serbia, no. TR 31018 and no. III 46007]

  20. Classification and Analysis of Computer Network Traffic

    DEFF Research Database (Denmark)

    Bujlow, Tomasz


    various classification modes (decision trees, rulesets, boosting, softening thresholds) regarding the classification accuracy and the time required to create the classifier. We showed how to use our VBS tool to obtain per-flow, per-application, and per-content statistics of traffic in computer networks...... classification (as by using transport layer port numbers, Deep Packet Inspection (DPI), statistical classification) and assessed their usefulness in particular areas. We found that the classification techniques based on port numbers are not accurate anymore as most applications use dynamic port numbers, while...... DPI is relatively slow, requires a lot of processing power, and causes a lot of privacy concerns. Statistical classifiers based on Machine Learning Algorithms (MLAs) were shown to be fast and accurate. At the same time, they do not consume a lot of resources and do not cause privacy concerns. However...

  1. HIV classification using coalescent theory

    Energy Technology Data Exchange (ETDEWEB)

    Zhang, Ming [Los Alamos National Laboratory]; Leitner, Thomas K [Los Alamos National Laboratory]; Korber, Bette T [Los Alamos National Laboratory]


    Algorithms for subtype classification and breakpoint detection of HIV-1 sequences are based on a classification system of HIV-1. Hence, their quality depends heavily on this system. Because of the way the current HIV-1 nomenclature came about, it contains inconsistencies such as the following: the phylogenetic distance between subtypes B and D is remarkably small compared with other pairs of subtypes, and is in fact more like the distance between a pair of sub-subtypes, Robertson et al. (2000); subtypes E and I no longer exist, since they were discovered to be composed of recombinants, Robertson et al. (2000); it is currently debated whether, instead of CRF02 being a recombinant of subtypes A and G, subtype G should be designated a circulating recombinant form (CRF) and CRF02 a subtype, Abecasis et al. (2007); and there are 8 complete and over 400 partial HIV genomes in the LANL database which belong neither to a subtype nor to a CRF (denoted by U). Moreover, the current classification system is somewhat arbitrary, like all complex classification systems that were created manually. It is therefore desirable to deduce the classification system of HIV systematically by an algorithm. Of course, this problem is not restricted to HIV, but applies to all fast-mutating and recombining viruses. Our work addresses the simpler subproblem of scoring classifications of given input sequences of some virus species (a classification denotes a partition of the input sequences into several subtypes and CRFs). To this end, we reconstruct ancestral recombination graphs (ARGs) of the input sequences under restrictions determined by the given classification. These restrictions are imposed to ensure that the reconstructed ARGs do not contradict the classification under consideration. Then, we find the ARG with maximal probability by means of Markov Chain Monte Carlo methods. The probability of the most probable ARG is interpreted as a score for the classification. To our

  2. Vestige: Maximum likelihood phylogenetic footprinting

    Directory of Open Access Journals (Sweden)

    Maxwell Peter


    Background: Phylogenetic footprinting is the identification of functional regions of DNA by their evolutionary conservation. This is achieved by comparing orthologous regions from multiple species and identifying the DNA regions that have diverged less than neutral DNA. Vestige is a phylogenetic footprinting package built on the PyEvolve toolkit that uses probabilistic molecular evolutionary modelling to represent aspects of sequence evolution, including the conventional divergence measure employed by other footprinting approaches. In addition to measuring the divergence, Vestige allows the expansion of the definition of a phylogenetic footprint to include variation in the distribution of any molecular evolutionary processes. This is achieved by displaying the distribution of model parameters that represent partitions of molecular evolutionary substitutions. Examination of the spatial incidence of these effects across regions of the genome can identify DNA segments that differ in the nature of the evolutionary process. Results: Vestige was applied to a reference dataset of the SCL locus from four species and provided clear identification of the known conserved regions in this dataset. To demonstrate the flexibility to use diverse models of molecular evolution and dissect the nature of the evolutionary process, Vestige was used to footprint the Ka/Ks ratio in primate BRCA1 with a codon model of evolution. Two regions of putative adaptive evolution were identified, illustrating the ability of Vestige to represent the spatial distribution of distinct molecular evolutionary processes. Conclusion: Vestige provides a flexible, open platform for phylogenetic footprinting. Underpinned by the PyEvolve toolkit, Vestige provides a framework for visualising the signatures of evolutionary processes across the genome of numerous organisms simultaneously. By exploiting the maximum-likelihood statistical framework, the complex interplay between mutational

  3. Phylogenetic analysis of otospiralin protein (United States)

    Torktaz, Ibrahim; Behjati, Mohaddeseh; Rostami, Amin


    Background: The fibrocyte-specific protein otospiralin is a small protein widely expressed in the central nervous system, in neuronal cell bodies and glia. The increased expression of otospiralin in reactive astrocytes implicates it in signaling pathways and reparative mechanisms subsequent to injury. Indeed, otospiralin is considered to be essential for the survival of fibrocytes of the mesenchymal nonsensory regions of the cochlea. Other functions of this protein are not yet completely understood. Materials and Methods: Amino acid sequences of otospiralin from 12 vertebrates were obtained from the National Center for Biotechnology Information database. Phylogenetic analysis and phylogeny estimation were performed using the MEGA 5.0.5 program, and a neighbor-joining tree was constructed with this software. Results: In this computational study, the phylogenetic tree of otospiralin was investigated and dendrograms of otospiralin were drawn. Alignment was performed with the MUSCLE method using the UPGMB algorithm. An entropy plot was also computed to better illustrate amino acid variation in this protein. Conclusion: In the present study, we used otospiralin sequences from 12 different species and, by constructing a phylogenetic tree, suggested an outgroup for some related species. PMID:27099854
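
    The entropy plot mentioned in this abstract can be illustrated with a minimal sketch that computes Shannon entropy for each column of an alignment. The abstract does not give the exact formula or software settings, so the base-2 logarithm and the handling of gaps (ignored) are assumptions. The four sequences are invented toy data, not otospiralin.

```python
import math
from collections import Counter


def column_entropy(column: str) -> float:
    """Shannon entropy (bits) of the residues in one alignment column.

    Gap characters ('-') are ignored; an all-gap column scores 0.
    """
    residues = [c for c in column if c != "-"]
    if not residues:
        return 0.0
    n = len(residues)
    counts = Counter(residues)
    return -sum((k / n) * math.log2(k / n) for k in counts.values())


def entropy_profile(alignment):
    """Entropy of every column of a list of equal-length aligned sequences."""
    return [column_entropy("".join(col)) for col in zip(*alignment)]


# Toy protein alignment (assumption: illustrative only).
toy_alignment = ["MKLV", "MKIV", "MRLV", "MKLA"]
profile = entropy_profile(toy_alignment)
print(profile)
```

    A fully conserved column scores 0 bits, and the score rises as more residue types share the column, which is what makes the plot useful for spotting variable sites.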

  4. Neuromuscular disease classification system (United States)

    Sáez, Aurora; Acha, Begoña; Montero-Sánchez, Adoración; Rivas, Eloy; Escudero, Luis M.; Serrano, Carmen


    Diagnosis of neuromuscular diseases is based on subjective visual assessment of patient biopsies by a specialist pathologist. A system for objective analysis and classification of muscular dystrophies and neurogenic atrophies from fluorescence microscopy images of muscle biopsies is presented. The procedure starts with an accurate segmentation of the muscle fibers using mathematical morphology and a watershed transform. Feature extraction is carried out in two parts: 24 features that pathologists take into account to diagnose the diseases, and 58 structural features that the human eye cannot see. The structural features treat the biopsy as a graph in which each fiber is a node and two nodes are connected if the corresponding fibers are adjacent. Feature selection using sequential forward selection and sequential backward selection, classification using a Fuzzy ARTMAP neural network, and a study of severity grading are performed on these two sets of features. A database of 91 images was used: 71 images for training and 20 for testing. A classification error of 0% was obtained. It is concluded that adding features undetectable by human visual inspection improves the categorization of atrophic patterns.

  5. HoxPred: automated classification of Hox proteins using combinations of generalised profiles

    Directory of Open Access Journals (Sweden)

    Leyns Luc


    Background: Correct identification of individual Hox proteins is an essential basis for their study in diverse research fields. Common methods to classify Hox proteins focus on the homeodomain that characterises homeobox transcription factors. Classification is hampered by the high conservation of this short domain. Phylogenetic tree reconstruction is a widely used but time-consuming classification method. Results: We have developed an automated procedure, HoxPred, that classifies Hox proteins into their homology groups. The method relies on a discriminant analysis that classifies Hox proteins according to their scores for a combination of protein generalised profiles. 54 generalised profiles dedicated to each Hox homology group were produced de novo from a curated dataset of vertebrate Hox proteins. Several classification methods were investigated to select the most accurate discriminant functions. These functions were then incorporated into the HoxPred program. Conclusion: HoxPred shows a mean accuracy of 97%. Predictions on the recently sequenced stickleback fish proteome identified 44 Hox proteins, including HoxC1a, so far found only in zebrafish. Using the Uniprot databank, we demonstrate that HoxPred can efficiently contribute to large-scale automatic annotation of Hox proteins into their paralogous groups. As orthologous group predictions show a higher risk of misclassification, they should be corroborated by additional supporting evidence. HoxPred is accessible via SOAP and a Web interface. Complete datasets, results and source code are available at the same site.

  6. A phylogenetic re-analysis of groupers with applications for ciguatera fish poisoning.

    Directory of Open Access Journals (Sweden)

    Charlotte Schoelinck

    Ciguatera fish poisoning (CFP) is a significant public health problem due to dinoflagellates. It is responsible for one of the highest reported incidences of seafood-borne illness, and groupers are commonly reported as a source of CFP due to their position in the food chain. With the role of recent climate change in harmful algal blooms, CFP cases might become more frequent and more geographically widespread. Since there is no appropriate treatment for CFP, the most efficient solution is to regulate fish consumption. Such a strategy can only work if the fish sold are correctly identified, and it has been repeatedly shown that misidentifications and species substitutions occur in fish markets. We provide here both a DNA-barcoding reference for groupers and a new phylogenetic reconstruction based on five genes and a comprehensive taxonomical sampling. We analyse the correlation between the geographic range of species and their susceptibility to ciguatera accumulation, and the co-occurrence of ciguatoxins in closely related species, using both character mapping and statistical methods. Misidentifications were encountered in public databases, precluding accurate species identifications. Epinephelinae now includes only twelve genera (vs. 15 previously). Comparisons with the ciguatera incidences show that in some genera most species are ciguateric, but statistical tests display only a moderate correlation with the phylogeny. Atlantic species were rarely contaminated, with ciguatera occurrences being restricted to the South Pacific. The recent changes in classification based on the reanalyses of the relationships within Epinephelidae have an impact on the interpretation of the ciguatera distribution in the genera. In this context, and to improve the monitoring of fish trade and safety, we need to obtain extensive data on contamination at the species level. Accurate species identifications through DNA barcoding are thus an essential tool in controlling CFP since

  7. Making Mosquito Taxonomy Useful: A Stable Classification of Tribe Aedini that Balances Utility with Current Knowledge of Evolutionary Relationships. (United States)

    Wilkerson, Richard C; Linton, Yvonne-Marie; Fonseca, Dina M; Schultz, Ted R; Price, Dana C; Strickman, Daniel A


    The tribe Aedini (Family Culicidae) contains approximately one-quarter of the known species of mosquitoes, including vectors of deadly or debilitating disease agents. This tribe contains the genus Aedes, which is one of the three most familiar genera of mosquitoes. During the past decade, Aedini has been the focus of a series of extensive morphology-based phylogenetic studies published by Reinert, Harbach, and Kitching (RH&K). Those authors created 74 new, elevated or resurrected genera from what had been the single genus Aedes, almost tripling the number of genera in the entire family Culicidae. The proposed classification is based on subjective assessments of the "number and nature of the characters that support the branches" subtending particular monophyletic groups in the results of cladistic analyses of a large set of morphological characters of representative species. To gauge the stability of RH&K's generic groupings we reanalyzed their data with unweighted parsimony jackknife and maximum-parsimony analyses, with and without ordering 14 of the characters as in RH&K. We found that their phylogeny was largely weakly supported and their taxonomic rankings failed priority and other useful taxon-naming criteria. Consequently, we propose simplified aedine generic designations that 1) restore a classification system that is useful for the operational community; 2) enhance the ability of taxonomists to accurately place new species into genera; 3) maintain the progress toward a natural classification based on monophyletic groups of species; and 4) correct the current classification system that is subject to instability as new species are described and existing species more thoroughly defined. We do not challenge the phylogenetic hypotheses generated by the above-mentioned series of morphological studies. However, we reduce the ranks of the genera and subgenera of RH&K to subgenera or informal species groups, respectively, to preserve stability as new data become

  8. Comparison of tree-child phylogenetic networks. (United States)

    Cardona, Gabriel; Rosselló, Francesc; Valiente, Gabriel


    Phylogenetic networks are a generalization of phylogenetic trees that allow for the representation of nontreelike evolutionary events, like recombination, hybridization, or lateral gene transfer. While much progress has been made to find practical algorithms for reconstructing a phylogenetic network from a set of sequences, all attempts to endorse a class of phylogenetic networks (strictly extending the class of phylogenetic trees) with a well-founded distance measure have, to the best of our knowledge and with the only exception of the bipartition distance on regular networks, failed so far. In this paper, we present and study a new meaningful class of phylogenetic networks, called tree-child phylogenetic networks, and we provide an injective representation of these networks as multisets of vectors of natural numbers, their path multiplicity vectors. We then use this representation to define a distance on this class that extends the well-known Robinson-Foulds distance for phylogenetic trees and to give an alignment method for pairs of networks in this class. Simple polynomial algorithms for reconstructing a tree-child phylogenetic network from its path multiplicity vectors, for computing the distance between two tree-child phylogenetic networks and for aligning a pair of tree-child phylogenetic networks, are provided. They have been implemented as a Perl package and a Java applet, which can be found at
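
    The path multiplicity vectors described in this abstract can be illustrated with a minimal sketch. For each node it counts the paths to every leaf, collects those vectors into a multiset, and compares two networks by the size of the multiset symmetric difference. The abstract does not spell out that form of the distance, so it is an assumption here, and this is not the authors' Perl or Java implementation. The networks and function names are hypothetical.

```python
from collections import Counter


def mu_vectors(children, leaves):
    """Multiset of path multiplicity vectors of a rooted DAG.

    children: dict mapping each internal node to a list of its children.
    leaves:   ordered list of leaf labels (fixes the vector coordinates).
    """
    index = {leaf: i for i, leaf in enumerate(leaves)}
    cache = {}

    def mu(node):
        if node in cache:
            return cache[node]
        if node in index:
            # A leaf has exactly one (empty) path to itself.
            vec = tuple(1 if i == index[node] else 0
                        for i in range(len(leaves)))
        else:
            # Paths from a node are the union of paths via each child.
            vec = tuple(0 for _ in leaves)
            for child in children[node]:
                vec = tuple(a + b for a, b in zip(vec, mu(child)))
        cache[node] = vec
        return vec

    nodes = set(children) | {c for cs in children.values() for c in cs}
    return Counter(mu(n) for n in nodes)


def mu_distance(m1, m2):
    """Size of the symmetric difference of two multisets of vectors."""
    return sum(((m1 - m2) + (m2 - m1)).values())


leaves = ["a", "b", "c"]
tree1 = {"r": ["x", "c"], "x": ["a", "b"]}        # ((a,b),c)
tree2 = {"r": ["a", "y"], "y": ["b", "c"]}        # (a,(b,c))
# Toy network: leaf b hangs below hybrid node h, which has parents u and v.
net = {"r": ["u", "v"], "u": ["a", "h"], "v": ["h", "c"], "h": ["b"]}

m1, m2, mn = (mu_vectors(g, leaves) for g in (tree1, tree2, net))
print(mu_distance(m1, m2), mu_distance(m1, mn))
```

    For the two trees, the only vectors that differ are those of the clusters {a,b} and {b,c}, which is the sense in which such a distance generalizes the Robinson-Foulds comparison of clusters.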

  9. Functional and phylogenetic ecology in R

    CERN Document Server

    Swenson, Nathan G


    Functional and Phylogenetic Ecology in R is designed to teach readers to use R for phylogenetic and functional trait analyses. Over the past decade, a dizzying array of tools and methods were generated to incorporate phylogenetic and functional information into traditional ecological analyses. Increasingly these tools are implemented in R, thus greatly expanding their impact. Researchers getting started in R can use this volume as a step-by-step entryway into phylogenetic and functional analyses for ecology in R. More advanced users will be able to use this volume as a quick reference to understand particular analyses. The volume begins with an introduction to the R environment and handling relevant data in R. Chapters then cover phylogenetic and functional metrics of biodiversity; null modeling and randomizations for phylogenetic and functional trait analyses; integrating phylogenetic and functional trait information; and interfacing the R environment with a popular C-based program. This book presents a uni...

  10. Molecular phylogenetics of New World searobins (Triglidae; Prionotinae). (United States)

    Portnoy, David S; Willis, Stuart C; Hunt, Elizabeth; Swift, Dominic G; Gold, John R; Conway, Kevin W


    Phylogenetic relationships among members of the New World searobin genera Bellator and Prionotus (Family Triglidae, Subfamily Prionotinae) and among other searobins in the families Triglidae and Peristediidae were investigated using both mitochondrial and nuclear DNA sequences. Phylogenetic hypotheses derived from maximum likelihood and Bayesian methodologies supported a monophyletic Prionotinae that included four well resolved clades of uncertain relationship; three contained species in the genus Prionotus and one contained species in the genus Bellator. Bellator was always recovered within the genus Prionotus, a result supported by post hoc model testing. Two nominal species of Prionotus (P. alatus and P. paralatus) were not recovered as exclusive lineages, suggesting the two may comprise a single species. Phylogenetic hypotheses also supported a monophyletic Triglidae but only if armored searobins (Family Peristediidae) were included. A robust morphological assessment is needed to further characterize relationships and suggest classification of clades within Prionotinae; for the time being we recommend that Bellator be considered a synonym of Prionotus. Relationships between armored searobins (Family Peristediidae) and searobins (Family Triglidae) and relationships within Triglidae also warrant further study.

  11. Phylogenetic trees and Euclidean embeddings. (United States)

    Layer, Mark; Rhodes, John A


    It was recently observed by de Vienne et al. (Syst Biol 60(6):826-832, 2011) that a simple square root transformation of distances between taxa on a phylogenetic tree allowed for an embedding of the taxa into Euclidean space. While the justification for this was based on a diffusion model of continuous character evolution along the tree, here we give a direct and elementary explanation for it that provides substantial additional insight. We use this embedding to reinterpret the differences between the NJ and BIONJ tree building algorithms, providing one illustration of how this embedding reflects tree structures in data.
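
    One way to see why the square root transformation gives a Euclidean embedding is sketched below. It is an illustrative construction under stated assumptions, not necessarily the argument given in the paper. Each edge of a rooted tree gets its own coordinate axis, and a taxon is placed at the square root of the edge length on every axis along its root path. The squared Euclidean distance between two taxa then sums the lengths of the edges on exactly one of the two root paths, which is the tree distance. The tree and all names are hypothetical.

```python
import math
from itertools import combinations

# Toy rooted tree: child -> (parent, branch length). Each edge is
# identified by its child endpoint.
parent = {
    "x": ("root", 0.5),
    "a": ("x", 1.0),
    "b": ("x", 2.0),
    "c": ("root", 1.5),
}
taxa = ["a", "b", "c"]


def root_path(node):
    """Set of edges on the path from node up to the root."""
    edges = set()
    while node in parent:
        edges.add(node)
        node = parent[node][0]
    return edges


def tree_distance(u, v):
    """Path length between u and v: edges on exactly one root path."""
    return sum(parent[e][1] for e in root_path(u) ^ root_path(v))


def embed(node):
    """One coordinate per edge: sqrt(length) if the edge is on the root path."""
    path = root_path(node)
    return [math.sqrt(parent[e][1]) if e in path else 0.0
            for e in sorted(parent)]


def euclidean(p, q):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))


for u, v in combinations(taxa, 2):
    print(u, v, tree_distance(u, v), euclidean(embed(u), embed(v)) ** 2)
```

    The printed pairs show the squared Euclidean distance matching the tree distance up to floating-point rounding.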

  12. Phylogenetic assignment of Mycobacterium tuberculosis Beijing clinical isolates in Japan by maximum a posteriori estimation. (United States)

    Seto, Junji; Wada, Takayuki; Iwamoto, Tomotada; Tamaru, Aki; Maeda, Shinji; Yamamoto, Kaori; Hase, Atsushi; Murakami, Koichi; Maeda, Eriko; Oishi, Akira; Migita, Yuji; Yamamoto, Taro; Ahiko, Tadayuki


    Intra-species phylogeny of Mycobacterium tuberculosis has been regarded as a clue to estimate its potential risk to develop drug-resistance and various epidemiological tendencies. Genotypic characterization of variable number of tandem repeats (VNTR), a standard tool to ascertain transmission routes, has been improving as a public health effort, but determining phylogenetic information from those efforts alone is difficult. We present a platform based on maximum a posteriori (MAP) estimation to estimate phylogenetic information for M. tuberculosis clinical isolates from individual profiles of VNTR types. This study used 1245 M. tuberculosis clinical isolates obtained throughout Japan for construction of an MAP estimation formula. Two MAP estimation formulae, classification of Beijing family and other lineages, and classification of five Beijing sublineages (ST11/26, STK, ST3, and ST25/19 belonging to the ancient Beijing subfamily and modern Beijing subfamily), were created based on 24 loci VNTR (24Beijing-VNTR) profiles and phylogenetic information of the isolates. Recursive estimation based on the formulae showed high concordance with their authentic phylogeny by multi-locus sequence typing (MLST) of the isolates. The formulae might further support phylogenetic estimation of the Beijing lineage M. tuberculosis from the VNTR genotype with various geographic backgrounds. These results suggest that MAP estimation can function as a reliable probabilistic process to append phylogenetic information to VNTR genotypes of M. tuberculosis independently, which might improve the usage of genotyping data for control, understanding, prevention, and treatment of TB.

  13. Taxon sampling and the accuracy of phylogenetic analyses

    Institute of Scientific and Technical Information of China (English)

    Tracy A. HEATH; Shannon M. HEDTKE; David M. HILLIS


    Appropriate and extensive taxon sampling is one of the most important determinants of accurate phylogenetic estimation. In addition, accuracy of inferences about evolutionary processes obtained from phylogenetic analyses is improved significantly by thorough taxon sampling efforts. Many recent efforts to improve phylogenetic estimates have focused instead on increasing sequence length or the number of overall characters in the analysis, and this often does have a beneficial effect on the accuracy of phylogenetic analyses. However, phylogenetic analyses of few taxa (but each represented by many characters) can be subject to strong systematic biases, which in turn produce high measures of repeatability (such as bootstrap proportions) in support of incorrect or misleading phylogenetic results. Thus, it is important for phylogeneticists to consider both the sampling of taxa, as well as the sampling of characters, in designing phylogenetic studies. Taxon sampling also improves estimates of evolutionary parameters derived from phylogenetic trees, and is thus important for improved applications of phylogenetic analyses. Analysis of sensitivity to taxon inclusion, the possible effects of long-branch attraction, and sensitivity of parameter estimation for model-based methods should be a part of any careful and thorough phylogenetic analysis. Furthermore, recent improvements in phylogenetic algorithms and in computational power have removed many constraints on analyzing large, thoroughly sampled data sets. Thorough taxon sampling is thus one of the most practical ways to improve the accuracy of phylogenetic estimates, as well as the accuracy of biological inferences that are based on these phylogenetic trees.

  14. Experimental design in phylogenetics: testing predictions from expected information. (United States)

    San Mauro, Diego; Gower, David J; Cotton, James A; Zardoya, Rafael; Wilkinson, Mark; Massingham, Tim


    Taxon and character sampling are central to phylogenetic experimental design; yet, we lack general rules. Goldman introduced a method to construct efficient sampling designs in phylogenetics, based on the calculation of expected Fisher information given a probabilistic model of sequence evolution. The considerable potential of this approach remains largely unexplored. In an earlier study, we applied Goldman's method to a problem in the phylogenetics of caecilian amphibians and made an a priori evaluation and testable predictions of which taxon additions would increase information about a particular weakly supported branch of the caecilian phylogeny by the greatest amount. We have now gathered mitogenomic and rag1 sequences (some newly determined for this study) from additional caecilian species and studied how information (both expected and observed) and bootstrap support vary as each new taxon is individually added to our previous data set. This provides the first empirical test of specific predictions made using Goldman's method for phylogenetic experimental design. Our results empirically validate the top 3 (more intuitive) taxon addition predictions made in our previous study, but only information results validate unambiguously the 4th (less intuitive) prediction. This highlights a complex relationship between information and support, reflecting that each measures different things: Information is related to the ability to estimate branch length accurately and support to the ability to estimate the tree topology accurately. Thus, an increase in information may be correlated with but does not necessitate an increase in support. Our results also provide the first empirical validation of the widely held intuition that additional taxa that join the tree proximal to poorly supported internal branches are more informative and enhance support more than additional taxa that join the tree more distally. Our work supports the view that adding more data for a single (well

  15. Multiple sequence alignment accuracy and phylogenetic inference. (United States)

    Ogden, T Heath; Rosenberg, Michael S


    Phylogenies are often thought to be more dependent upon the specifics of the sequence alignment rather than on the method of reconstruction. Simulation of sequences containing insertion and deletion events was performed in order to determine the role that alignment accuracy plays during phylogenetic inference. Data sets were simulated for pectinate, balanced, and random tree shapes under different conditions (ultrametric equal branch length, ultrametric random branch length, nonultrametric random branch length). Comparisons between hypothesized alignments and true alignments enabled determination of two measures of alignment accuracy, that of the total data set and that of individual branches. In general, our results indicate that as alignment error increases, topological accuracy decreases. This trend was much more pronounced for data sets derived from more pectinate topologies. In contrast, for balanced, ultrametric, equal branch length tree shapes, alignment inaccuracy had little average effect on tree reconstruction. These conclusions are based on average trends of many analyses under different conditions, and any one specific analysis, independent of the alignment accuracy, may recover very accurate or inaccurate topologies. Maximum likelihood and Bayesian, in general, outperformed neighbor joining and maximum parsimony in terms of tree reconstruction accuracy. Results also indicated that as the length of the branch and of the neighboring branches increase, alignment accuracy decreases, and the length of the neighboring branches is the major factor in topological accuracy. Thus, multiple-sequence alignment can be an important factor in downstream effects on topological reconstruction.

  16. Phylogenetic Conservatism in Plant Phenology (United States)

    Davies, T. Jonathan; Wolkovich, Elizabeth M.; Kraft, Nathan J. B.; Salamin, Nicolas; Allen, Jenica M.; Ault, Toby R.; Betancourt, Julio L.; Bolmgren, Kjell; Cleland, Elsa E.; Cook, Benjamin I.; Crimmins, Theresa M.; Mazer, Susan J.; McCabe, Gregory J.; Pau, Stephanie; Regetz, Jim; Schwartz, Mark D.; Travers, Steven E.


    Phenological events, defined points in the life cycle of a plant or animal, have been regarded as highly plastic traits, reflecting flexible responses to various environmental cues. The ability of a species to track, via shifts in phenological events, the abiotic environment through time might dictate its vulnerability to future climate change. Understanding the predictors and drivers of phenological change is therefore critical. Here, we evaluated evidence for phylogenetic conservatism, the tendency for closely related species to share similar ecological and biological attributes, in phenological traits across flowering plants. We aggregated published and unpublished data on timing of first flower and first leaf, encompassing 4000 species at 23 sites across the Northern Hemisphere. We reconstructed the phylogeny for the set of included species, first, using the software program Phylomatic, and second, from DNA data. We then quantified phylogenetic conservatism in plant phenology within and across sites. We show that more closely related species tend to flower and leaf at similar times. By contrasting mean flowering times within and across sites, however, we illustrate that it is not the time of year that is conserved, but rather the phenological responses to a common set of abiotic cues. Our findings suggest that species cannot be treated as statistically independent when modelling phenological responses. Closely related species tend to resemble each other in the timing of their life-history events, a likely product of evolutionarily conserved responses to environmental cues. The search for the underlying drivers of phenology must therefore account for species' shared evolutionary histories.

  17. Classification of Bacteria and Archaea: past, present and future. (United States)

    Schleifer, Karl Heinz


    The late 19th century was the beginning of bacterial taxonomy and bacteria were classified on the basis of phenotypic markers. The distinction of prokaryotes and eukaryotes was introduced in the 1960s. Numerical taxonomy improved phenotypic identification but provided little information on the phylogenetic relationships of prokaryotes. Later on, chemotaxonomic and genotypic methods were widely used for a more satisfactory classification. Archaea were first classified as a separate group of prokaryotes in 1977. The current classification of Bacteria and Archaea is based on an operational-based model, the so-called polyphasic approach, comprised of phenotypic, chemotaxonomic and genotypic data, as well as phylogenetic information. The provisional status Candidatus has been established for describing uncultured prokaryotic cells for which their phylogenetic relationship has been determined and their authenticity revealed by in situ probing. The ultimate goal is to achieve a theory-based classification system based on a phylogenetic/evolutionary concept. However, there are currently two contradictory opinions about the future classification of Bacteria and Archaea. A group of mostly molecular biologists posits that the yet-unclear effect of gene flow, in particular lateral gene transfer, makes the line of descent difficult, if not impossible, to describe. However, even in the face of genomic fluidity it seems that the typical geno- and phenotypic characteristics of a taxon are still maintained, and are sufficient for reliable classification and identification of Bacteria and Archaea. There are many well-defined genotypic clusters that are congruent with known species delineated by polyphasic approaches. Comparative sequence analysis of certain core genes, including rRNA genes, may be useful for the characterization of higher taxa, whereas various character genes may be suitable as phylogenetic markers for the delineation of lower taxa. Nevertheless, there may still be

  18. Automatic web services classification based on rough set theory

    Institute of Scientific and Technical Information of China (English)

    陈立; 张英; 宋自林; 苗壮


    With the development of web services technology, the number of services available on the internet is growing day by day. To achieve automatic and accurate service classification, which benefits service-related tasks, a method for service classification based on rough set theory was proposed. First, the service descriptions were preprocessed and represented as vectors. Inspired by attribute reduction based on discernibility matrices in rough set theory, and taking into account the characteristics of the decision table for service classification, a dimensionality reduction method based on continuous discernibility matrices was proposed. Finally, service classification was carried out automatically. In the experiment, the proposed method achieved satisfactory classification results in all five test categories. The results show that the proposed method is accurate and could be used in practical web service classification.

  19. Global patterns of amphibian phylogenetic diversity

    DEFF Research Database (Denmark)

    Fritz, Susanne; Rahbek, Carsten


    Aim: Phylogenetic diversity can provide insight into how evolutionary processes may have shaped contemporary patterns of species richness. Here, we aim to test for the influence of phylogenetic history on global patterns of amphibian species richness, and to identify areas where macroevolutionary... phylogeny (2792 species). We combined each tree with global species distributions to map four indices of phylogenetic diversity. To investigate congruence between global spatial patterns of amphibian species richness and phylogenetic diversity, we selected Faith’s phylogenetic diversity (PD) index and the total taxonomic distinctness (TTD) index, because we found that the variance of the other two indices we examined (average taxonomic distinctness and mean root distance) strongly depended on species richness. We then identified regions with unusually high or low phylogenetic diversity given...

  20. Molecular Phylogenetics: Concepts for a Newcomer. (United States)

    Ajawatanawong, Pravech


    Molecular phylogenetics is the study of evolutionary relationships among organisms using molecular sequence data. The aim of this review is to introduce the important terminology and general concepts of tree reconstruction to biologists who lack a strong background in the field of molecular evolution. Some modern phylogenetic programs are easy to use because of their user-friendly interfaces, but understanding the phylogenetic algorithms and substitution models, which are based on advanced statistics, is still important for the analysis and interpretation without a guide. Briefly, there are five general steps in carrying out a phylogenetic analysis: (1) sequence data preparation, (2) sequence alignment, (3) choosing a phylogenetic reconstruction method, (4) identification of the best tree, and (5) evaluating the tree. Concepts in this review enable biologists to grasp the basic ideas behind phylogenetic analysis and also help provide a sound basis for discussions with expert phylogeneticists.

  1. Tripartitions do not always discriminate phylogenetic networks. (United States)

    Cardona, Gabriel; Rosselló, Francesc; Valiente, Gabriel


    Phylogenetic networks are a generalization of phylogenetic trees that allow for the representation of non-treelike evolutionary events, like recombination, hybridization, or lateral gene transfer. In a recent series of papers devoted to the study of reconstructibility of phylogenetic networks, Moret, Nakhleh, Warnow and collaborators introduced the so-called tripartition metric for phylogenetic networks. In this paper we show that, in fact, this tripartition metric does not satisfy the separation axiom of distances (zero distance means isomorphism, or, in a more relaxed version, zero distance means indistinguishability in some specific sense) in any of the subclasses of phylogenetic networks where it is claimed to do so. We also present a subclass of phylogenetic networks whose members can be singled out by means of their sets of tripartitions (or even clusters), and hence where the latter can be used to define a meaningful metric.

  2. Classification in context

    DEFF Research Database (Denmark)

    Mai, Jens Erik


    This paper surveys classification research literature, discusses various classification theories, and shows that the focus has traditionally been on establishing a scientific foundation for classification research. This paper argues that a shift has taken place, and suggests that contemporary classification research focus on contextual information as the guide for the design and construction of classification schemes....

  3. Phylogenetic diversity of Amazonian tree communities


    Honorio Coronado, Eurídice N.; Dexter, Kyle G.; Pennington, R. Toby; Chave, Jérôme; Lewis, Simon L.; Alexiades, Miguel N.; Alvarez, Esteban; Alves de Oliveira, Atila; Amaral, Iêda L.; Araujo-Murakami, Alejandro; Arets, Eric J. M. M.; Aymard, Gerardo A.; Baraloto, Christopher; Bonal, Damien; Brienen, Roel


    Aim: To examine variation in the phylogenetic diversity (PD) of tree communities across geographical and environmental gradients in Amazonia. Location: Two hundred and eighty-three c. 1 ha forest inventory plots from across Amazonia. Methods: We evaluated PD as the total phylogenetic branch length across species in each plot (PDss), the mean pairwise phylogenetic distance between species (MPD), the mean nearest taxon distance (MNTD) and their equivalents standardized for species richness (ses...
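
    The two distance-based measures named in this abstract, MPD and MNTD, can be illustrated with a minimal Python sketch. It assumes the pairwise phylogenetic (patristic) distances between the species in a plot are already available; the distance values, the taxon labels and the function names are hypothetical and are not taken from the study.

```python
from itertools import combinations


def mean_pairwise_distance(dist, taxa):
    """MPD: average phylogenetic distance over all pairs of distinct taxa."""
    pairs = list(combinations(taxa, 2))
    return sum(dist[frozenset(pair)] for pair in pairs) / len(pairs)


def mean_nearest_taxon_distance(dist, taxa):
    """MNTD: average, over taxa, of the distance to the closest other taxon."""
    nearest = [
        min(dist[frozenset((a, b))] for b in taxa if b != a)
        for a in taxa
    ]
    return sum(nearest) / len(nearest)


# Hypothetical patristic distances for a three-species plot (illustration only).
dist = {
    frozenset(("A", "B")): 2.0,
    frozenset(("A", "C")): 6.0,
    frozenset(("B", "C")): 6.0,
}
plot = ["A", "B", "C"]

mpd = mean_pairwise_distance(dist, plot)        # (2 + 6 + 6) / 3
mntd = mean_nearest_taxon_distance(dist, plot)  # (2 + 2 + 6) / 3
```

    In this toy plot MNTD is smaller than MPD because it only looks at each species' closest relative, whereas MPD also counts the long distances to species C. The richness-standardized versions mentioned in the abstract would require a null model and are not sketched here.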

  4. Relevant phylogenetic invariants of evolutionary models

    CERN Document Server

    Casanellas, Marta


    Recently there have been several attempts to provide a whole set of generators of the ideal of the algebraic variety associated with a phylogenetic tree evolving under an algebraic model. These algebraic varieties have proven useful in phylogenetics. In this paper we prove that, for phylogenetic reconstruction purposes, it is enough to consider generators coming from the edges of the tree, the so-called edge invariants. This is the algebraic analogue of Buneman's Splits Equivalence Theorem. The interest of this result lies in its potential applications in phylogenetics for widely used evolutionary models such as Jukes-Cantor, Kimura 2 and 3 parameters, and General Markov models.

  5. Efficient segmentation by sparse pixel classification

    DEFF Research Database (Denmark)

    Dam, Erik B; Loog, Marco


    Segmentation methods based on pixel classification are powerful but often slow. We introduce two general algorithms, based on sparse classification, for optimizing the computation while still obtaining accurate segmentations. The computational costs of the algorithms are derived, and they are demonstrated on real 3-D magnetic resonance imaging and 2-D radiograph data. We show that each algorithm is optimal for specific tasks, and that both algorithms allow a speedup of one or more orders of magnitude on typical segmentation tasks....

  6. Phylogenetic signal dissection identifies the root of starfishes. (United States)

    Feuda, Roberto; Smith, Andrew B


    Relationships within the class Asteroidea have remained controversial for almost 100 years and, despite many attempts to resolve this problem using molecular data, no consensus has yet emerged. Using two nuclear genes and a taxon sampling covering the major asteroid clades we show that non-phylogenetic signal created by three factors--Long Branch Attraction, compositional heterogeneity and the use of poorly fitting models of evolution--has confounded accurate estimation of phylogenetic relationships. To overcome the effect of this non-phylogenetic signal we analyse the data using non-homogeneous models, site stripping and the creation of subpartitions aimed to reduce or amplify the systematic error, and calculate Bayes Factor support for a selection of previously suggested topological arrangements of asteroid orders. We show that most of the previous alternative hypotheses are not supported in the most reliable data partitions, including the previously suggested placement of either Forcipulatida or Paxillosida as sister group to the other major branches. The best-supported solution places Velatida as the sister group to other asteroids, and the implications of this finding for the morphological evolution of asteroids are presented.

  7. Progress, pitfalls and parallel universes: a history of insect phylogenetics. (United States)

    Kjer, Karl M; Simon, Chris; Yavorskaya, Margarita; Beutel, Rolf G


    The phylogeny of insects has been both extensively studied and vigorously debated for over a century. A relatively accurate deep phylogeny had been produced by 1904. It was not substantially improved in topology until recently when phylogenomics settled many long-standing controversies. Intervening advances came instead through methodological improvement. Early molecular phylogenetic studies (1985-2005), dominated by a few genes, provided datasets that were too small to resolve controversial phylogenetic problems. Adding to the lack of consensus, this period was characterized by a polarization of philosophies, with individuals belonging to either parsimony or maximum-likelihood camps; each largely ignoring the insights of the other. The result was an unfortunate detour in which the few perceived phylogenetic revolutions published by both sides of the philosophical divide were probably erroneous. The size of datasets has been growing exponentially since the mid-1980s accompanied by a wave of confidence that all relationships will soon be known. However, large datasets create new challenges, and a large number of genes does not guarantee reliable results. If history is a guide, then the quality of conclusions will be determined by an improved understanding of both molecular and morphological evolution, and not simply the number of genes analysed.

  8. Phylogenetic signal dissection identifies the root of starfishes.

    Directory of Open Access Journals (Sweden)

    Roberto Feuda

    Relationships within the class Asteroidea have remained controversial for almost 100 years and, despite many attempts to resolve this problem using molecular data, no consensus has yet emerged. Using two nuclear genes and a taxon sampling covering the major asteroid clades we show that non-phylogenetic signal created by three factors--Long Branch Attraction, compositional heterogeneity and the use of poorly fitting models of evolution--has confounded accurate estimation of phylogenetic relationships. To overcome the effect of this non-phylogenetic signal we analyse the data using non-homogeneous models, site stripping and the creation of subpartitions aimed to reduce or amplify the systematic error, and calculate Bayes Factor support for a selection of previously suggested topological arrangements of asteroid orders. We show that most of the previous alternative hypotheses are not supported in the most reliable data partitions, including the previously suggested placement of either Forcipulatida or Paxillosida as sister group to the other major branches. The best-supported solution places Velatida as the sister group to other asteroids, and the implications of this finding for the morphological evolution of asteroids are presented.

  9. Hyperspectral image classification using functional data analysis. (United States)

    Li, Hong; Xiao, Guangrun; Xia, Tian; Tang, Y Y; Li, Luoqing


    The large number of spectral bands acquired by hyperspectral imaging sensors allows us to better distinguish many subtle objects and materials. Unlike other classical hyperspectral image classification methods in the multivariate analysis framework, in this paper, a novel method using functional data analysis (FDA) for accurate classification of hyperspectral images has been proposed. The central idea of FDA is to treat multivariate data as continuous functions. From this perspective, the spectral curve of each pixel in the hyperspectral images is naturally viewed as a function. This can be beneficial for making full use of the abundant spectral information. The relevance between adjacent pixel elements in the hyperspectral images can also be utilized reasonably. Functional principal component analysis is applied to solve the classification problem of these functions. Experimental results on three hyperspectral images show that the proposed method can achieve higher classification accuracies in comparison to some state-of-the-art hyperspectral image classification methods.

  10. Conflicting phylogenetic position of Schizosaccharomyces pombe

    NARCIS (Netherlands)

    Kuramae, Eiko E.; Robert, Vincent; Snel, Berend; Boekhout, Teun


    The phylogenetic position of the fission yeast Schizosaccharomyces pombe in the fungal Tree of Life is still controversial. Three alternative phylogenetic positions have been proposed in the literature, namely (1) a position basal to the Hemiascomycetes and Euascomycetes, (2) a position as a sister

  11. Efficient Computation of Popular Phylogenetic Tree Measures

    DEFF Research Database (Denmark)

    Tsirogiannis, Constantinos; Sandel, Brody Steven; Cheliotis, Dimitris


    Given a phylogenetic tree $\mathcal{T}$ of n nodes, and a sample R of its tips (leaf nodes), a very common problem in ecological and evolutionary research is to evaluate a distance measure for the elements in R. Two of the most common measures of this kind are the Mean Pairwise Distance... software package for processing phylogenetic trees....

  12. Insect phylogenetics in the digital age. (United States)

    Dietrich, Christopher H; Dmitriev, Dmitry A


    Insect systematists have long used digital data management tools to facilitate phylogenetic research. Web-based platforms developed over the past several years support creation of comprehensive, openly accessible data repositories and analytical tools that support large-scale collaboration, accelerating efforts to document Earth's biota and reconstruct the Tree of Life. New digital tools have the potential to further enhance insect phylogenetics by providing efficient workflows for capturing and analyzing phylogenetically relevant data. Recent initiatives streamline various steps in phylogenetic studies and provide community access to supercomputing resources. In the near future, automated, web-based systems will enable researchers to complete a phylogenetic study from start to finish using resources linked together within a single portal and incorporate results into a global synthesis.

  13. Use of whole genome sequences to develop a molecular phylogenetic framework for Rhodococcus fascians and the Rhodococcus genus

    Directory of Open Access Journals (Sweden)

    Allison L. Creason


    The accurate diagnosis of diseases caused by pathogenic bacteria requires a stable species classification. Rhodococcus fascians is the only documented member of its ill-defined genus that is capable of causing disease on a wide range of agriculturally important plants. Comparisons of genome sequences generated from isolates of Rhodococcus associated with diseased plants revealed a level of genetic diversity consistent with them representing multiple species. To test this, we generated a tree based on more than 1700 homologous sequences from plant-associated isolates of Rhodococcus, and obtained support from additional approaches that measure and cluster based on genome similarities. Results were consistent in supporting the definition of new Rhodococcus species within clades containing phytopathogenic members. We also used the genome sequences, along with other rhodococcal genome sequences to construct a molecular phylogenetic tree as a framework for resolving the Rhodococcus genus. Results indicated that Rhodococcus has the potential for having 20 species and also confirmed a need to revisit the taxonomic groupings within Rhodococcus.

  14. Plasmid Classification in an Era of Whole-Genome Sequencing: Application in Studies of Antibiotic Resistance Epidemiology (United States)

    Orlek, Alex; Stoesser, Nicole; Anjum, Muna F.; Doumith, Michel; Ellington, Matthew J.; Peto, Tim; Crook, Derrick; Woodford, Neil; Walker, A. Sarah; Phan, Hang; Sheppard, Anna E.


    Plasmids are extra-chromosomal genetic elements ubiquitous in bacteria, and commonly transmissible between host cells. Their genomes include variable repertoires of ‘accessory genes,’ such as antibiotic resistance genes, as well as ‘backbone’ loci which are largely conserved within plasmid families, and often involved in key plasmid-specific functions (e.g., replication, stable inheritance, mobility). Classifying plasmids into different types according to their phylogenetic relatedness provides insight into the epidemiology of plasmid-mediated antibiotic resistance. Current typing schemes exploit backbone loci associated with replication (replicon typing), or plasmid mobility (MOB typing). Conventional PCR-based methods for plasmid typing remain widely used. With the emergence of whole-genome sequencing (WGS), large datasets can be analyzed using in silico plasmid typing methods. However, short reads from popular high-throughput sequencers can be challenging to assemble, so complete plasmid sequences may not be accurately reconstructed. Therefore, localizing resistance genes to specific plasmids may be difficult, limiting epidemiological insight. Long-read sequencing will become increasingly popular as costs decline, especially when resolving accurate plasmid structures is the primary goal. This review discusses the application of plasmid classification in WGS-based studies of antibiotic resistance epidemiology; novel in silico plasmid analysis tools are highlighted. Due to the diverse and plastic nature of plasmid genomes, current typing schemes do not classify all plasmids, and identifying conserved, phylogenetically concordant genes for subtyping and phylogenetics is challenging. Analyzing plasmids as nodes in a network that represents gene-sharing relationships between plasmids provides a complementary way to assess plasmid diversity, and allows inferences about horizontal gene transfer to be made. PMID:28232822

  15. Classifying the bacterial gut microbiota of termites and cockroaches: A curated phylogenetic reference database (DictDb). (United States)

    Mikaelyan, Aram; Köhler, Tim; Lampert, Niclas; Rohland, Jeffrey; Boga, Hamadi; Meuser, Katja; Brune, Andreas


    Recent developments in sequencing technology have given rise to a large number of studies that assess bacterial diversity and community structure in termite and cockroach guts based on large amplicon libraries of 16S rRNA genes. Although these studies have revealed important ecological and evolutionary patterns in the gut microbiota, classification of the short sequence reads is limited by the taxonomic depth and resolution of the reference databases used in the respective studies. Here, we present a curated reference database for accurate taxonomic analysis of the bacterial gut microbiota of dictyopteran insects. The Dictyopteran gut microbiota reference Database (DictDb) is based on the Silva database but was significantly expanded by the addition of clones from 11 mostly unexplored termite and cockroach groups, which increased the inventory of bacterial sequences from dictyopteran guts by 26%. The taxonomic depth and resolution of DictDb was significantly improved by a general revision of the taxonomic guide tree for all important lineages, including a detailed phylogenetic analysis of the Treponema and Alistipes complexes, the Fibrobacteres, and the TG3 phylum. The performance of this first documented version of DictDb (v. 3.0) using the revised taxonomic guide tree in the classification of short-read libraries obtained from termites and cockroaches was highly superior to that of the current Silva and RDP databases. DictDb uses an informative nomenclature that is consistent with the literature also for clades of uncultured bacteria and provides an invaluable tool for anyone exploring the gut community structure of termites and cockroaches.

  16. Phylogenetic analysis of Pectinidae (Bivalvia) based on the ribosomal DNA internal transcribed spacer region

    Institute of Scientific and Technical Information of China (English)


    The ribosomal DNA internal transcribed spacer (ITS) region is a useful genomic region for understanding evolutionary and genetic relationships. In the current study, the molecular phylogenetic analysis of Pectinidae (Mollusca: Bivalvia) was performed using the nucleotide sequences of the nuclear ITS region in nine species of this family. The sequences were obtained from the scallop species Argopecten irradians, Mizuhopecten yessoensis, Amusium pleuronectes and Mimachlamys nobilis, and compared with the published sequences of Aequipecten opercularis, Chlamys farreri, C. distorta, M. varia, Pecten maximus, and an outgroup species Perna viridis. The molecular phylogenetic tree was constructed by the neighbor-joining and maximum parsimony methods. Phylogenetic analysis based on ITS1, ITS2, or their combination always yielded trees of similar topology. The results support the morphological classifications of bivalves and are nearly consistent with the classification of two subfamilies (Chlamydinae and Pectininae) formulated by Waller. However, A. irradians and A. opercularis grouped with the genus Amusium, which suggests that they may belong to the subfamily Pectininae. The data are incompatible with the conclusion of Waller, who placed them in Chlamydinae on the basis of morphological characteristics. These results provide new insights into the evolutionary relationships among scallop species and contribute to the improvement of existing classification systems.

  17. A statistical approach to root system classification.

    Directory of Open Access Journals (Sweden)

    Gernot Bodner


    Plant root systems have a key role in ecology and agronomy. Despite the rapid increase in root studies, there is still no classification that allows distinguishing among distinctive characteristics within the diversity of rooting strategies. Our hypothesis is that a multivariate approach for plant functional type identification in ecology can be applied to the classification of root systems. We demonstrate that combining principal component and cluster analysis yields a meaningful classification of rooting types based on morphological traits. The classification method presented is based on a data-defined statistical procedure without a priori decision on the classifiers. Biplot inspection is used to determine key traits and to ensure stability in cluster based grouping. The classification method is exemplified with simulated root architectures and morphological field data. Simulated root architectures showed that morphological attributes with spatial distribution parameters capture most distinctive features within root system diversity. While developmental type (tap vs. shoot-borne systems) is a strong, but coarse classifier, topological traits provide the most detailed differentiation among distinctive groups. Adequacy of commonly available morphologic traits for classification is supported by field data. Three rooting types emerged from measured data, distinguished by diameter/weight, density and spatial distribution respectively. Similarity of root systems within distinctive groups was the joint result of phylogenetic relation and environmental as well as human selection pressure. We concluded that the data-defined classification is appropriate for integration of knowledge obtained with different root measurement methods and at various scales. Currently root morphology is the most promising basis for classification due to widely used common measurement protocols. To capture details of root diversity efforts in architectural measurement

  18. Classification of the web

    DEFF Research Database (Denmark)

    Mai, Jens Erik


    This paper discusses the challenges faced by investigations into the classification of the Web and outlines inquiries that are needed to use principles for bibliographic classification to construct classifications of the Web. This paper suggests that the classification of the Web meets challenges...

  19. Cluster Based Text Classification Model

    DEFF Research Database (Denmark)

    Nizamani, Sarwat; Memon, Nasrullah; Wiil, Uffe Kock


    We propose a cluster based classification model for suspicious email detection and other text classification tasks. The text classification tasks comprise many training examples that require a complex classification model. Using clusters for classification makes the model simpler and increases...

  20. Phylogenetic systematics of the Eucarida (Crustacea Malacostraca)

    Directory of Open Access Journals (Sweden)

    Martin L. Christoffersen


    Ninety-four morphological characters belonging to particular ontogenetic sequences within the Eucarida were used to produce a hierarchy of 128 evolutionary novelties (73 synapomorphies and 55 homoplasies) and to delimit 15 monophyletic taxa. The following combined Recent-fossil sequenced phylogenetic classification is proposed: Superorder Eucarida; Order Euphausiacea; Family Bentheuphausiidae; Family Euphausiidae; Order Amphionidacea; Order Decapoda; Suborder Penaeidea; Suborder Pleocyemata; Infraorder Stenopodidea; Infraorder Reptantia; Infraorder Procarididea; Infraorder Caridea. The position of the Amphionidacea as the sister-group of the Decapoda is corroborated, while the Reptantia are proposed to be the sister-group of the Procarididea + Caridea for the first time. The fossil groups Uncina Quenstedt, 1850, and Palaeopalaemon Whitfield, 1880, are included as incertae sedis taxa within the Reptantia, which establishes the minimum ages of all the higher taxa of Eucarida except the Procarididea and Caridea in the Upper Devonian. The fossil group "Pygocephalomorpha" Beurlen, 1930, of uncertain status as a monophyletic taxon, is provisionally considered to belong to the "stem-group" of the Reptantia.
Among the more important characters hypothesized to have evolved in the stem-lineage of each eucaridan monophyletic taxon are: (1) in Eucarida, attachment of post-zoeal carapace to all thoracic somites; (2) in Euphausiacea, reduction of endopod of eighth thoracopod; (3) in Bentheuphausiidae, compound eyes vestigial, associated with abyssal life; (4) in Euphausiidae, loss of endopod of eighth thoracopod and development of specialized luminescent organs; (5) in Amphionidacea + Decapoda, ambulatory ability of thoracic exopods reduced, scaphognathite, one pair of maxillipedes, pleurobranch gill series and carapace covering gills, associated with loss of pelagic life; (6) in Amphionidacea, unique thoracic brood pouch in females formed by inflated carapace and

  1. Phylogenetic analysis of the kinesin superfamily from Physcomitrella

    Directory of Open Access Journals (Sweden)

    Zhiyuan Shen


    Kinesins are an ancient superfamily of microtubule-dependent motors. They participate in an extensive and diverse list of essential cellular functions, including mitosis, cytokinesis, cell polarization, cell elongation, flagellar development, and intracellular transport. Based on phylogenetic relationships, the kinesin superfamily has been subdivided into 14 families, which are represented in most eukaryotic phyla. The functions of these families are sometimes conserved between species, but important variations in function across species have been observed. Plants possess most kinesin families including a few plant-specific families. With the availability of an ever increasing number of genome sequences from plants, it is important to document the complete complement of kinesins present in a given organism. This will help develop a molecular framework to explore the function of each family using genetics, biochemistry and cell biology. The moss Physcomitrella patens has emerged as a powerful model organism to study gene function in plants, which makes it a key candidate to explore complex gene families, such as the kinesin superfamily. Here we report a detailed phylogenetic characterization of the 71 kinesins of the kinesin superfamily in Physcomitrella. We found a remarkable conservation of families and subfamily classes with Arabidopsis, which is important for future comparative analysis of function. Some of the families, such as the kinesin 14s, are composed of fewer members in moss, while other families, such as the kinesin 12s, are greatly expanded. To improve the comparison between species, and to simplify communication between research groups, we propose a classification of subfamilies based on our phylogenetic analysis.

  2. Classification issues related to neuropathic trigeminal pain. (United States)

    Zakrzewska, Joanna M


    The goal of a classification system of medical conditions is to facilitate accurate communication, to ensure that each condition is described uniformly and universally and that all data banks for the storage and retrieval of research and clinical data related to the conditions are consistent. Classification entails deciding which kinds of diagnostic entities should be recognized and how to order them in a meaningful way. Currently there are 3 major pain classification systems of relevance to orofacial pain: The International Association for the Study of Pain classification system, the International Headache Society classification system, and the Research Diagnostic Criteria for Temporomandibular Disorders (RDC/TMD). All use different methodologies, and only the RDC/TMD take into account social and psychologic factors in the classification of conditions. Classification systems need to be reliable, valid, comprehensive, generalizable, and flexible, and they need to be tested using consensus views of experts as well as the available literature. There is an urgent need for a robust classification system for neuropathic trigeminal pain.

  3. Testing for phylogenetic signal in biological traits: the ubiquity of cross-product statistics. (United States)

    Pavoine, Sandrine; Ricotta, Carlo


    To evaluate rates of evolution, to establish tests of correlation between two traits, or to investigate to what degree the phylogeny of a species assemblage is predictive of a trait value, so-called tests for phylogenetic signal are used. Being based on different approaches, these tests are generally thought to possess quite different statistical performances. In this article, we show that the Blomberg et al. K and K*, the Abouheif index, the Moran's I, and the Mantel correlation are all based on a cross-product statistic, and are thus all related to each other when they are associated with a permutation test of phylogenetic signal. What changes is only the way phylogenetic and trait similarities (or dissimilarities) among the tips of a phylogeny are computed. The definitions of the phylogenetic and trait-based (dis)similarities among tips thus determine the performance of the tests. We briefly discuss the biological and statistical consequences (in terms of power and type I error of the tests) of the observed relatedness among the statistics that allow tests for phylogenetic signal. Blomberg et al. K* statistic appears as one of the most efficient approaches to test for phylogenetic signal. When branch lengths are not available or not accurate, Abouheif's Cmean statistic is a powerful alternative to K*.
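
    The cross-product idea described in this abstract can be sketched with Moran's I, one of the statistics it lists, combined with a permutation test. This is a minimal Python illustration under stated assumptions: the proximity matrix, the trait values and the function names are hypothetical, and the one-sided p-value convention used here is a common choice rather than the one prescribed by the authors.

```python
import random


def morans_i(weights, trait):
    """Moran's I: a normalized cross-product of centred trait values,
    weighted by the phylogenetic proximity between tips i and j."""
    n = len(trait)
    mean = sum(trait) / n
    z = [x - mean for x in trait]
    cross = sum(weights[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    s0 = sum(sum(row) for row in weights)
    return (n / s0) * cross / sum(v * v for v in z)


def permutation_test(weights, trait, n_perm=999, seed=0):
    """Shuffle trait values among tips and count how often the permuted
    statistic reaches the observed one (one-sided test)."""
    rng = random.Random(seed)
    observed = morans_i(weights, trait)
    shuffled = list(trait)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        # Small tolerance so that ties are not lost to floating-point rounding.
        if morans_i(weights, shuffled) >= observed - 1e-12:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical proximities: tips 0-1 and tips 2-3 are sister pairs.
weights = [
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
]
trait = [1.0, 1.2, 5.0, 5.1]  # sister tips carry similar values

observed, p_value = permutation_test(weights, trait)
```

    Swapping in a different proximity matrix or a different trait similarity, while keeping the same permutation scheme, is what distinguishes the statistics compared in the abstract. With only four tips the permutation distribution is far too coarse for a meaningful test; the example is meant only to show the mechanics.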

  4. Establishment and application of medication error classification standards in nursing care based on the International Classification of Patient Safety

    Directory of Open Access Journals (Sweden)

    Xiao-Ping Zhu


    Conclusion: Application of this classification system will help nursing administrators to accurately detect system- and process-related defects leading to medication errors, and enable the factors to be targeted to improve the level of patient safety management.

  5. Molecular Phylogenetics: Mathematical Framework and Unsolved Problems (United States)

    Xia, Xuhua

    Phylogenetic relationship is essential in dating evolutionary events, reconstructing ancestral genes, predicting sites that are important to natural selection, and, ultimately, understanding genomic evolution. Three categories of phylogenetic methods are currently used: the distance-based, the maximum parsimony, and the maximum likelihood method. Here, I present the mathematical framework of these methods and their rationales, provide computational details for each of them, illustrate analytically and numerically the potential biases inherent in these methods, and outline computational challenges and unresolved problems. This is followed by a brief discussion of the Bayesian approach that has been recently used in molecular phylogenetics.
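
    As a small illustration of the distance-based category mentioned in this abstract, the following Python sketch computes an evolutionary distance between two aligned sequences. The abstract does not name a substitution model; the Jukes-Cantor correction is assumed here because it is the simplest one, and the sequences and function names are hypothetical.

```python
import math


def p_distance(seq_a, seq_b):
    """Proportion of aligned sites at which the two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    diffs = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return diffs / len(seq_a)


def jukes_cantor_distance(seq_a, seq_b):
    """Jukes-Cantor distance d = -(3/4) * ln(1 - 4p/3), which corrects the
    observed proportion p for multiple substitutions at the same site."""
    p = p_distance(seq_a, seq_b)
    if p >= 0.75:
        raise ValueError("sequences too divergent for the Jukes-Cantor correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Two hypothetical aligned sequences differing at 2 of 10 sites (p = 0.2).
d = jukes_cantor_distance("ACGTACGTAC", "ACGTACGTTT")
```

    The corrected distance is always at least as large as the raw proportion of differing sites, and the correction grows as the sequences diverge. A matrix of such distances would then be passed to a tree-building step such as neighbor-joining, which is not sketched here.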

  6. On Tree-Based Phylogenetic Networks. (United States)

    Zhang, Louxin


    A large class of phylogenetic networks can be obtained from trees by the addition of horizontal edges between the tree edges. These networks are called tree-based networks. We present a simple necessary and sufficient condition for tree-based networks and prove that a universal tree-based network exists for any number of taxa that contains as its base every phylogenetic tree on the same set of taxa. This answers two problems posed recently by Francis and Steel. A byproduct is a computer program for generating random binary phylogenetic networks under the uniform distribution model.

  7. Estudio filogenético de los géneros de Lithinini de Sudamérica Austral (Lepidoptera, Geometridae): una nueva clasificación / Phylogenetic study of the genera of Lithinini (Lepidoptera, Geometridae) from southern South America: a new classification

    Directory of Open Access Journals (Sweden)

    Luis E. Parra


    In this work we evaluate the taxonomy of the Lithinini of Austral South America based on a phylogenetic analysis. In our analysis we used Catophoenissa as the outgroup. Two approaches were used to evaluate phylogenetic relationships: (1) the parsimony criterion, and (2) Bayesian inference. Parsimony analysis was conducted in PAUP software, and Bayesian analysis with Markov chain Monte Carlo using the BayesPhylogenies software. Our results based on the phylogenetic hypothesis suggest a new taxonomic order for Austral American Lithinini. The valid genera are: Asestra Warren, Acauro Rindge, Calta Rindge, Euclidiodes Warren, Franciscoia Orfila and Schajovskoy, Incalvertia Bartlett-Calvert, Lacaria Orfila and Schajovskoy, Laneco Rindge, Maeandrogonaria Butler, Martindoelloia Orfila and Schajovskoy, Nucara Rindge, Odontothera Butler, Proteopharmacis Warren, Psilaspilates Butler, Rhinoligia Warren and Tanagridia Butler. The main changes with respect to the previous taxonomic order are: (1) Yalpa Rindge is a junior synonym of Odontothera; (2) the genus Rhinoligia Warren is incorporated into the Lithinini; (3) while our analysis reaffirms that Siopla Rindge is a junior synonym of Asestra, Yapoma Rindge and Duraglia Rindge are synonyms of Euclidiodes Warren, while Callemo Rindge and Guara Rindge are synonyms of Tanagridia; (4) the genera Calta Rindge, Incalvertia Rindge, Odontothera Butler and Proteopharmacis Warren, synonymized by Pitkin, are redefined, revalidated and incorporated into the Lithinini tribe. A new species for the genus Franciscoia, F. ediliae Parra, is described. A catalogue of the genera and species of the tribe in the region, and the figures of adults and genitalia of some species are included.

  8. DendroBLAST: approximate phylogenetic trees in the absence of multiple sequence alignments.

    Directory of Open Access Journals (Sweden)

    Steven Kelly

    The rapidly growing availability of genome information has created considerable demand for both fast and accurate phylogenetic inference algorithms. We present a novel method called DendroBLAST for reconstructing phylogenetic dendrograms/trees from protein sequences using BLAST. This method differs from other methods by incorporating a simple model of sequence evolution to test the effect of introducing sequence changes on the reliability of the bipartitions in the inferred tree. Using realistic simulated sequence data we demonstrate that this method produces phylogenetic trees that are more accurate than other commonly-used distance based methods though not as accurate as maximum likelihood methods from good quality multiple sequence alignments. In addition to tests on simulated data, we use DendroBLAST to generate input trees for a supertree reconstruction of the phylogeny of the Archaea. This independent analysis produces an approximate phylogeny of the Archaea that has both high precision and recall when compared to previously published analysis of the same dataset using conventional methods. Taken together these results demonstrate that approximate phylogenetic trees can be produced in the absence of multiple sequence alignments, and we propose that these trees will provide a platform for improving and informing downstream bioinformatic analysis. A web implementation of the DendroBLAST method is freely available for use at

  9. Molecular systematics of the Amazonian genus Aldina, a phylogenetically enigmatic ectomycorrhizal lineage of papilionoid legumes. (United States)

    Ramos, Gustavo; de Lima, Haroldo Cavalcante; Prenner, Gerhard; de Queiroz, Luciano Paganucci; Zartman, Charles E; Cardoso, Domingos


    Aldina (Leguminosae) is among the very few ecologically successful ectomycorrhizal lineages in a family largely marked by the evolution of nodulating symbiosis. The genus comprises 20 species predominantly distributed in Amazonia and has been traditionally classified in the tribe Swartzieae because of its radial flowers with an entire calyx and numerous free stamens. The taxonomy of Aldina is complicated due to its poor representation in herbaria and the lack of a robust phylogenetic hypothesis of relationship. Recent phylogenetic analyses of matK and trnL sequences confirmed the placement of Aldina in the 50-kb inversion clade, although the genus remained phylogenetically isolated or unresolved in the context of the evolutionary history of the main early-branching papilionoid lineages. We performed maximum likelihood and Bayesian analyses of combined chloroplast datasets (matK, rbcL, and trnL) and explored the effect of incomplete taxa or missing data in order to shed light on the enigmatic phylogenetic position of Aldina. Unexpectedly, a sister relationship of Aldina with the Andira clade (Andira and Hymenolobium) is revealed. We suggest that a new tribal phylogenetic classification of the papilionoid legumes should place Aldina along with Andira and Hymenolobium. These results highlight yet another example of the independent evolution of radial floral symmetry within the early-branching Papilionoideae, a large collection of florally heterogeneous lineages dominated by papilionate or bilaterally symmetric flower morphology.

  10. [Comparative leaf anatomy and phylogenetic relationships of 11 species of Laeliinae with emphasis on Brassavola (Orchidaceae)]. (United States)

    Noguera-Savelli, Eliana; Jáuregui, Damelis


    Brassavola inhabits a wide altitude range and habitat types from Northern Mexico to Northern Argentina. Classification schemes in plants have normally used vegetative and floral characters, but when species are very similar, as in this genus, conflicts arise in species delimitation, and alternative methods should be applied. In this study we explored the taxonomic and phylogenetic value of the anatomical structure of leaves in Brassavola; as ingroup, seven species of Brassavola were considered, and as an outgroup Guarianthe skinneri, Laelia anceps, Rhyncholaelia digbyana and Rhyncholaelia glauca were evaluated. Leaf anatomical characters were studied in freehand cross sections of the middle portion with a light microscope. Ten vegetative anatomical characters were selected and coded for the phylogenetic analysis. Phylogenetic reconstruction was carried out under maximum parsimony using the program NONA through WinClada. Overall, Brassavola species reveal a wide variety of anatomical characters, many of them associated with xeromorphic plants: thick cuticle, hypodermis and cells of the mesophyll with spiral thickenings in the secondary wall. Moreover, mesophyll is either homogeneous or heterogeneous, often with extravascular bundles of fibers near the epidermis in both terete and flat leaves. All vascular bundles are collateral, arranged in more than one row in the mesophyll. The phylogenetic analysis did not resolve internal relationships of the genus; we obtained a polytomy, indicating that the anatomical characters by themselves have little phylogenetic value in Brassavola. We concluded that few anatomical characters are phylogenetically important; however, they would provide more support to elucidate the phylogenetic relationships in the Orchidaceae and other plant groups if they are used in conjunction with morphological and/or molecular characters.

  11. Efficient and accurate fragmentation methods. (United States)

    Pruitt, Spencer R; Bertoni, Colleen; Brorsen, Kurt R; Gordon, Mark S


    Conspectus Three novel fragmentation methods that are available in the electronic structure program GAMESS (general atomic and molecular electronic structure system) are discussed in this Account. The fragment molecular orbital (FMO) method can be combined with any electronic structure method to perform accurate calculations on large molecular species with no reliance on capping atoms or empirical parameters. The FMO method is highly scalable and can take advantage of massively parallel computer systems. For example, the method has been shown to scale nearly linearly on up to 131 000 processor cores for calculations on large water clusters. There have been many applications of the FMO method to large molecular clusters, to biomolecules (e.g., proteins), and to materials that are used as heterogeneous catalysts. The effective fragment potential (EFP) method is a model potential approach that is fully derived from first principles and has no empirically fitted parameters. Consequently, an EFP can be generated for any molecule by a simple preparatory GAMESS calculation. The EFP method provides accurate descriptions of all types of intermolecular interactions, including Coulombic interactions, polarization/induction, exchange repulsion, dispersion, and charge transfer. The EFP method has been applied successfully to the study of liquid water, π-stacking in substituted benzenes and in DNA base pairs, solvent effects on positive and negative ions, electronic spectra and dynamics, non-adiabatic phenomena in electronic excited states, and nonlinear excited state properties. The effective fragment molecular orbital (EFMO) method is a merger of the FMO and EFP methods, in which interfragment interactions are described by the EFP potential, rather than the less accurate electrostatic potential. The use of EFP in this manner facilitates the use of a smaller value for the distance cut-off (Rcut). Rcut determines the distance at which EFP interactions replace fully quantum

  12. Accurate determination of antenna directivity

    DEFF Research Database (Denmark)

    Dich, Mikael


    The derivation of a formula for accurate estimation of the total radiated power from a transmitting antenna for which the radiated power density is known in a finite number of points on the far-field sphere is presented. The main application of the formula is determination of directivity from power......-pattern measurements. The derivation is based on the theory of spherical wave expansion of electromagnetic fields, which also establishes a simple criterion for the required number of samples of the power density. An array antenna consisting of Hertzian dipoles is used to test the accuracy and rate of convergence...
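    To make the quantity in this record concrete, the sketch below estimates directivity from sampled radiation intensity using D = 4π·U_max / P_rad, with P_rad obtained by a plain midpoint-rule integration over the sphere. This is a swapped-in elementary quadrature, not the spherical-wave-expansion formula the record describes; the function names and grid sizes are assumptions. A Hertzian dipole (intensity proportional to sin²θ) serves as the check case, since its directivity is known to be 1.5.

```python
import math


def directivity(intensity, n_theta=180, n_phi=360):
    """Estimate 4*pi*U_max / P_rad from samples on a midpoint (theta, phi) grid."""
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    total_power = 0.0
    peak = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            u = intensity(theta, phi)
            peak = max(peak, u)
            # Surface element on the unit sphere: sin(theta) dtheta dphi.
            total_power += u * math.sin(theta) * d_theta * d_phi
    return 4.0 * math.pi * peak / total_power


def hertzian_dipole(theta, phi):
    """Normalised radiation intensity of a z-directed Hertzian dipole."""
    return math.sin(theta) ** 2


def isotropic(theta, phi):
    """Reference radiator with equal intensity in every direction."""
    return 1.0


d_dipole = directivity(hertzian_dipole)
d_iso = directivity(isotropic)
print(d_dipole, d_iso)
```

    Because the peak is taken from the sampled grid, a coarse grid can miss a narrow main beam; the required number of samples is exactly the question the record's sampling criterion addresses.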

  13. The disentangling number for phylogenetic mixtures

    CERN Document Server

    Sullivant, Seth


    We provide a logarithmic upper bound for the disentangling number on unordered lists of leaf-labeled trees. This result is useful for analyzing phylogenetic mixture models. The proof depends on interpreting multisets of trees as high-dimensional contingency tables.

  14. Trends and concepts in fern classification (United States)

    Christenhusz, Maarten J. M.; Chase, Mark W.


    Background and Aims: Throughout the history of fern classification, familial and generic concepts have been highly labile. Many classifications and evolutionary schemes have been proposed during the last two centuries, reflecting different interpretations of the available evidence. Knowledge of fern structure and life histories has increased through time, providing more evidence on which to base ideas of possible relationships, and classification has changed accordingly. This paper reviews previous classifications of ferns and presents ideas on how to achieve a more stable consensus. Scope: An historical overview is provided from the first to the most recent fern classifications, from which conclusions are drawn on past changes and future trends. The problematic concept of family in ferns is discussed, with a particular focus on how this has changed over time. The history of molecular studies and the most recent findings are also presented. Key Results: Fern classification generally shows a trend from highly artificial, based on an interpretation of a few extrinsic characters, via natural classifications derived from a multitude of intrinsic characters, towards more evolutionary circumscriptions of groups that do not in general align well with the distribution of these previously used characters. It also shows a progression from a few broad family concepts to systems that recognized many more narrowly and highly controversially circumscribed families; currently, the number of families recognized is stabilizing somewhere between these extremes. Placement of many genera was uncertain until the arrival of molecular phylogenetics, which has rapidly been improving our understanding of fern relationships. As a collective category, the so-called ‘fern allies’ (e.g. Lycopodiales, Psilotaceae, Equisetaceae) were unsurprisingly found to be polyphyletic, and the term should be abandoned. Lycopodiaceae, Selaginellaceae and Isoëtaceae form a clade (the lycopods) that is

  15. Phylogenetic distribution of fungal sterols.

    Directory of Open Access Journals (Sweden)

    John D Weete

    BACKGROUND: Ergosterol has been considered the "fungal sterol" for almost 125 years; however, additional sterol data superimposed on a recent molecular phylogeny of kingdom Fungi reveal a different and more complex situation. METHODOLOGY/PRINCIPAL FINDINGS: The interpretation of sterol distribution data in a modern phylogenetic context indicates that there is a clear trend from cholesterol and other Δ5 sterols in the earliest diverging fungal species to ergosterol in later diverging fungi. There are, however, deviations from this pattern in certain clades. Sterols of the diverse zoosporic and zygosporic forms exhibit structural diversity with cholesterol and 24-ethyl-Δ5 sterols in zoosporic taxa, and 24-methyl sterols in zygosporic fungi. For example, each of the three monophyletic lineages of zygosporic fungi has distinctive major sterols, ergosterol in Mucorales, 22-dihydroergosterol in Dimargaritales, Harpellales, and Kickxellales (DHK clade), and 24-methyl cholesterol in Entomophthorales. Other departures from ergosterol as the dominant sterol include: 24-ethyl cholesterol in Glomeromycota, 24-ethyl cholest-7-enol and 24-ethyl-cholesta-7,24(28)-dienol in rust fungi, brassicasterol in Taphrinales and hypogeous pezizalean species, and cholesterol in Pneumocystis. CONCLUSIONS/SIGNIFICANCE: Five dominant end products of sterol biosynthesis (cholesterol, ergosterol, 24-methyl cholesterol, 24-ethyl cholesterol, brassicasterol), and intermediates in the formation of 24-ethyl cholesterol, are major sterols in 175 species of Fungi. Although most fungi in the most speciose clades have ergosterol as a major sterol, sterols are more varied than currently understood, and their distribution supports certain clades of Fungi in current fungal phylogenies. In addition to the intellectual importance of understanding evolution of sterol synthesis in fungi, there is practical importance because certain antifungal drugs (e.g., azoles) target reactions in

  16. Genome-based Taxonomic Classification of Bacteroidetes

    Directory of Open Access Journals (Sweden)

    Richard L. Hahnke


    The bacterial phylum Bacteroidetes, characterized by a distinct gliding motility, occurs in a broad variety of ecosystems, habitats, life styles and physiologies. Accordingly, taxonomic classification of the phylum, based on a limited number of features, proved difficult and controversial in the past, for example, when decisions were based on unresolved phylogenetic trees of the 16S rRNA gene sequence. Here we use a large collection of type-strain genomes from Bacteroidetes and closely related phyla for assessing their taxonomy based on the principles of phylogenetic classification and trees inferred from genome-scale data. No significant conflict between 16S rRNA gene and whole-genome phylogenetic analysis is found, whereas many but not all of the involved taxa are supported as monophyletic groups, particularly in the genome-scale trees. Phenotypic and phylogenomic features support the separation of Balneolaceae as new phylum Balneolaeota from Rhodothermaeota and of Saprospiraceae as new class Saprospiria from Chitinophagia. Epilithonimonas is nested within the older genus Chryseobacterium and without significant phenotypic differences; thus merging the two genera is proposed. Similarly, Vitellibacter is proposed to be included in Aequorivita. Flexibacter is confirmed as being heterogeneous and dissected, yielding six distinct genera. Hallella seregens is a later heterotypic synonym of Prevotella dentalis. Compared to values directly calculated from genome sequences, the G+C content mentioned in many species descriptions is too imprecise; moreover, corrected G+C content values have a significantly better fit to the phylogeny. Corresponding emendations of species descriptions are provided where necessary. Whereas most observed conflict with the current classification of Bacteroidetes is already visible in 16S rRNA gene trees, as expected whole-genome phylogenies are much better resolved.

  17. Genome-Based Taxonomic Classification of Bacteroidetes. (United States)

    Hahnke, Richard L; Meier-Kolthoff, Jan P; García-López, Marina; Mukherjee, Supratim; Huntemann, Marcel; Ivanova, Natalia N; Woyke, Tanja; Kyrpides, Nikos C; Klenk, Hans-Peter; Göker, Markus


    The bacterial phylum Bacteroidetes, characterized by a distinct gliding motility, occurs in a broad variety of ecosystems, habitats, life styles, and physiologies. Accordingly, taxonomic classification of the phylum, based on a limited number of features, proved difficult and controversial in the past, for example, when decisions were based on unresolved phylogenetic trees of the 16S rRNA gene sequence. Here we use a large collection of type-strain genomes from Bacteroidetes and closely related phyla for assessing their taxonomy based on the principles of phylogenetic classification and trees inferred from genome-scale data. No significant conflict between 16S rRNA gene and whole-genome phylogenetic analysis is found, whereas many but not all of the involved taxa are supported as monophyletic groups, particularly in the genome-scale trees. Phenotypic and phylogenomic features support the separation of Balneolaceae as new phylum Balneolaeota from Rhodothermaeota and of Saprospiraceae as new class Saprospiria from Chitinophagia. Epilithonimonas is nested within the older genus Chryseobacterium and without significant phenotypic differences; thus merging the two genera is proposed. Similarly, Vitellibacter is proposed to be included in Aequorivita. Flexibacter is confirmed as being heterogeneous and dissected, yielding six distinct genera. Hallella seregens is a later heterotypic synonym of Prevotella dentalis. Compared to values directly calculated from genome sequences, the G+C content mentioned in many species descriptions is too imprecise; moreover, corrected G+C content values have a significantly better fit to the phylogeny. Corresponding emendations of species descriptions are provided where necessary. Whereas most observed conflict with the current classification of Bacteroidetes is already visible in 16S rRNA gene trees, as expected whole-genome phylogenies are much better resolved.

  18. How does cognition evolve? Phylogenetic comparative psychology. (United States)

    MacLean, Evan L; Matthews, Luke J; Hare, Brian A; Nunn, Charles L; Anderson, Rindy C; Aureli, Filippo; Brannon, Elizabeth M; Call, Josep; Drea, Christine M; Emery, Nathan J; Haun, Daniel B M; Herrmann, Esther; Jacobs, Lucia F; Platt, Michael L; Rosati, Alexandra G; Sandel, Aaron A; Schroepfer, Kara K; Seed, Amanda M; Tan, Jingzhi; van Schaik, Carel P; Wobber, Victoria


    Now more than ever animal studies have the potential to test hypotheses regarding how cognition evolves. Comparative psychologists have developed new techniques to probe the cognitive mechanisms underlying animal behavior, and they have become increasingly skillful at adapting methodologies to test multiple species. Meanwhile, evolutionary biologists have generated quantitative approaches to investigate the phylogenetic distribution and function of phenotypic traits, including cognition. In particular, phylogenetic methods can quantitatively (1) test whether specific cognitive abilities are correlated with life history (e.g., lifespan), morphology (e.g., brain size), or socio-ecological variables (e.g., social system), (2) measure how strongly phylogenetic relatedness predicts the distribution of cognitive skills across species, and (3) estimate the ancestral state of a given cognitive trait using measures of cognitive performance from extant species. Phylogenetic methods can also be used to guide the selection of species comparisons that offer the strongest tests of a priori predictions of cognitive evolutionary hypotheses (i.e., phylogenetic targeting). Here, we explain how an integration of comparative psychology and evolutionary biology will answer a host of questions regarding the phylogenetic distribution and history of cognitive traits, as well as the evolutionary processes that drove their evolution.

  19. A practical guide to phylogenetics for nonexperts. (United States)

    O'Halloran, Damien


    Many researchers, across incredibly diverse foci, are applying phylogenetics to their research question(s). However, many researchers are new to this topic and so it presents inherent problems. Here we compile a practical introduction to phylogenetics for nonexperts. We outline in a step-by-step manner, a pipeline for generating reliable phylogenies from gene sequence datasets. We begin with a user-guide for similarity search tools via online interfaces as well as local executables. Next, we explore programs for generating multiple sequence alignments followed by protocols for using software to determine best-fit models of evolution. We then outline protocols for reconstructing phylogenetic relationships via maximum likelihood and Bayesian criteria and finally describe tools for visualizing phylogenetic trees. While this is not by any means an exhaustive description of phylogenetic approaches, it does provide the reader with practical starting information on key software applications commonly utilized by phylogeneticists. The vision for this article would be that it could serve as a practical training tool for researchers embarking on phylogenetic studies and also serve as an educational resource that could be incorporated into a classroom or teaching-lab.

  20. Phylogenetic approaches to natural product structure prediction. (United States)

    Ziemert, Nadine; Jensen, Paul R


    Phylogenetics is the study of the evolutionary relatedness among groups of organisms. Molecular phylogenetics uses sequence data to infer these relationships for both organisms and the genes they maintain. With the large amount of publicly available sequence data, phylogenetic inference has become increasingly important in all fields of biology. In the case of natural product research, phylogenetic relationships are proving to be highly informative in terms of delineating the architecture and function of the genes involved in secondary metabolite biosynthesis. Polyketide synthases and nonribosomal peptide synthetases provide model examples in which individual domain phylogenies display different predictive capacities, resolving features ranging from substrate specificity to structural motifs associated with the final metabolic product. This chapter provides examples in which phylogeny has proven effective in terms of predicting functional or structural aspects of secondary metabolism. The basics of how to build a reliable phylogenetic tree are explained along with information about programs and tools that can be used for this purpose. Furthermore, it introduces the Natural Product Domain Seeker, a recently developed Web tool that employs phylogenetic logic to classify ketosynthase and condensation domains based on established enzyme architecture and biochemical function.

  1. Nodal distances for rooted phylogenetic trees. (United States)

    Cardona, Gabriel; Llabrés, Mercè; Rosselló, Francesc; Valiente, Gabriel


    Dissimilarity measures for (possibly weighted) phylogenetic trees based on the comparison of their vectors of path lengths between pairs of taxa have been present in the systematics literature since the early seventies. For rooted phylogenetic trees, however, these vectors can only separate non-weighted binary trees, and therefore these dissimilarity measures are metrics only on this class of rooted phylogenetic trees. In this paper we overcome this problem by splitting, in a suitable way, each path length between two taxa into two lengths. We prove that the resulting split path-length matrices single out arbitrary rooted phylogenetic trees with nested taxa and arcs weighted in the set of positive real numbers. This allows the definition of metrics on this general class of rooted phylogenetic trees by comparing these matrices through metrics in spaces $M_n(\mathbb{R})$ of real-valued $n \times n$ matrices. We conclude this paper by establishing some basic facts about the metrics for non-weighted phylogenetic trees defined in this way using $L^p$ metrics on $M_n(\mathbb{R})$, with $p \in \mathbb{R}_{>0}$.
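    One reading of the splitting idea in this record is sketched below: for each ordered pair of taxa (i, j), store the distance from their lowest common ancestor down to i, so that the ordinary path length between i and j is split into the entries (i, j) and (j, i). Two trees are then compared entrywise with an L^p norm. The tree encoding (a child-to-parent map with branch lengths), the function names, and the example trees are all hypothetical, and the sketch assumes positive branch lengths; it is not taken from the paper itself.

```python
def ancestor_distances(parent, leaf):
    """Map each ancestor of `leaf` (including itself) to its distance from `leaf`."""
    dist = {leaf: 0.0}
    node, total = leaf, 0.0
    while node in parent:
        node_up, length = parent[node]
        total += length
        dist[node_up] = total
        node = node_up
    return dist


def split_path_lengths(parent, taxa):
    """Entry (i, j) is the distance from the lowest common ancestor of i, j to i."""
    up = {t: ancestor_distances(parent, t) for t in taxa}
    table = {}
    for i in taxa:
        for j in taxa:
            if i == j:
                continue
            common = [a for a in up[i] if a in up[j]]
            # With positive branch lengths the nearest common ancestor is the LCA.
            lca = min(common, key=lambda a: up[i][a])
            table[(i, j)] = up[i][lca]
    return table


def lp_distance(table_a, table_b, p=2.0):
    """L^p distance between two split path-length tables on the same taxa."""
    return sum(abs(table_a[k] - table_b[k]) ** p for k in table_a) ** (1.0 / p)


taxa = ["a", "b", "c"]
# Tree 1: ((a:1, b:3)x:1, c:2)r   Tree 2: ((a:1, c:3)x:1, b:2)r
tree1 = {"a": ("x", 1.0), "b": ("x", 3.0), "x": ("r", 1.0), "c": ("r", 2.0)}
tree2 = {"a": ("x", 1.0), "c": ("x", 3.0), "x": ("r", 1.0), "b": ("r", 2.0)}
m1 = split_path_lengths(tree1, taxa)
m2 = split_path_lengths(tree2, taxa)
print(lp_distance(m1, m2, p=1.0), lp_distance(m1, m2, p=2.0))
```

    Keeping the two halves of each path separate is what lets the table tell apart rooted trees whose unsplit path-length vectors coincide.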

  2. Texture Classification based on Gabor Wavelet

    Directory of Open Access Journals (Sweden)

    Amandeep Kaur


    This paper presents a comparison of texture classification algorithms based on Gabor wavelets. The focus of this paper is on the feature extraction scheme for texture classification. The texture features of an image can be classified using texture descriptors. In this paper we have used the Homogeneous Texture Descriptor, which is based on Gabor wavelets. For texture classification, we have used an online texture database, Brodatz’s database, and three well-known classifiers: Support Vector Machine, K-nearest neighbor method and decision tree induction method. The results show that classification using Support Vector Machines gives better results as compared to the other classifiers. It can accurately discriminate between testing image data and training data.

  3. A statistical approach to root system classification. (United States)

    Bodner, Gernot; Leitner, Daniel; Nakhforoosh, Alireza; Sobotik, Monika; Moder, Karl; Kaul, Hans-Peter


    Plant root systems have a key role in ecology and agronomy. Despite the rapid increase in root studies, there is still no classification that allows distinguishing among distinctive characteristics within the diversity of rooting strategies. Our hypothesis is that a multivariate approach for "plant functional type" identification in ecology can be applied to the classification of root systems. The classification method presented is based on a data-defined statistical procedure without a priori decision on the classifiers. The study demonstrates that principal component based rooting types provide efficient and meaningful multi-trait classifiers. The classification method is exemplified with simulated root architectures and morphological field data. Simulated root architectures showed that morphological attributes with spatial distribution parameters capture most distinctive features within root system diversity. While developmental type (tap vs. shoot-borne systems) is a strong, but coarse classifier, topological traits provide the most detailed differentiation among distinctive groups. Adequacy of commonly available morphologic traits for classification is supported by field data. Rooting types emerging from measured data were mainly distinguished as diameter/weight-dominated and density-dominated types. Similarity of root systems within distinctive groups was the joint result of phylogenetic relation and environmental as well as human selection pressure. We concluded that the data-defined classification is appropriate for integration of knowledge obtained with different root measurement methods and at various scales. Currently root morphology is the most promising basis for classification due to widely used common measurement protocols. To capture details of root diversity, efforts in architectural measurement techniques are essential.

  4. Influence of pansharpening techniques in obtaining accurate vegetation thematic maps (United States)

    Ibarrola-Ulzurrun, Edurne; Gonzalo-Martin, Consuelo; Marcello-Ruiz, Javier


    In recent decades, there has been a decline in natural resources, making it important to develop reliable methodologies for their management. The appearance of very high resolution sensors has offered a practical and cost-effective means for good environmental management. In this context, improvements are needed to obtain higher-quality information in order to get reliable classified images. Pansharpening enhances the spatial resolution of the multispectral bands by incorporating information from the panchromatic image. The main goal of the study is to implement pixel-based and object-based classification techniques applied to imagery fused with different pansharpening algorithms, and to evaluate the thematic maps generated, which serve to obtain accurate information for the conservation of natural resources. A vulnerable heterogeneous ecosystem in the Canary Islands (Spain), Teide National Park, was chosen, and Worldview-2 high resolution imagery was employed. The classes of interest were set by the National Park conservation managers. Seven pansharpening techniques (GS, FIHS, HCS, MTF based, Wavelet `à trous' and Weighted Wavelet `à trous' through Fractal Dimension Maps) were chosen in order to improve the data quality with the goal of analyzing the vegetation classes. Next, different classification algorithms were applied using pixel-based and object-based approaches, and an accuracy assessment of the different thematic maps obtained was performed. The highest classification accuracy was obtained by applying the Support Vector Machine classifier with the object-based approach to the image fused with Weighted Wavelet `à trous' through Fractal Dimension Maps. Finally, we highlight the difficulty of classification in the Teide ecosystem due to the heterogeneity and the small size of the species. Thus, it is important to obtain accurate thematic maps for further studies in the management and conservation of natural resources.

  5. Accurate Modeling of Advanced Reflectarrays

    DEFF Research Database (Denmark)

    Zhou, Min

    Analysis and optimization methods for the design of advanced printed reflectarrays have been investigated, and the study is focused on developing an accurate and efficient simulation tool. For the analysis, a good compromise between accuracy and efficiency can be obtained using the spectral domain...... to the POT. The GDOT can optimize for the size as well as the orientation and position of arbitrarily shaped array elements. Both co- and cross-polar radiation can be optimized for multiple frequencies, dual polarization, and several feed illuminations. Several contoured beam reflectarrays have been designed...... using the GDOT to demonstrate its capabilities. To verify the accuracy of the GDOT, two offset contoured beam reflectarrays that radiate a high-gain beam on a European coverage have been designed and manufactured, and subsequently measured at the DTU-ESA Spherical Near-Field Antenna Test Facility...

  6. The Accurate Particle Tracer Code

    CERN Document Server

    Wang, Yulei; Qin, Hong; Yu, Zhi


    The Accurate Particle Tracer (APT) code is designed for large-scale particle simulations on dynamical systems. Based on a large variety of advanced geometric algorithms, APT possesses long-term numerical accuracy and stability, which are critical for solving multi-scale and non-linear problems. Under the well-designed integrated and modularized framework, APT serves as a universal platform for researchers from different fields, such as plasma physics, accelerator physics, space science, fusion energy research, computational mathematics, software engineering, and high-performance computation. The APT code consists of seven main modules, including the I/O module, the initialization module, the particle pusher module, the parallelization module, the field configuration module, the external force-field module, and the extendible module. The I/O module, supported by Lua and Hdf5 projects, provides a user-friendly interface for both numerical simulation and data analysis. A series of new geometric numerical methods...

  7. Accurate ab initio spin densities

    CERN Document Server

    Boguslawski, Katharina; Legeza, Örs; Reiher, Markus


    We present an approach for the calculation of spin density distributions for molecules that require very large active spaces for a qualitatively correct description of their electronic structure. Our approach is based on the density-matrix renormalization group (DMRG) algorithm to calculate the spin density matrix elements as basic quantity for the spatially resolved spin density distribution. The spin density matrix elements are directly determined from the second-quantized elementary operators optimized by the DMRG algorithm. As an analytic convergence criterion for the spin density distribution, we employ our recently developed sampling-reconstruction scheme [J. Chem. Phys. 2011, 134, 224101] to build an accurate complete-active-space configuration-interaction (CASCI) wave function from the optimized matrix product states. The spin density matrix elements can then also be determined as an expectation value employing the reconstructed wave function expansion. Furthermore, the explicit reconstruction of a CA...

  8. Accurate thickness measurement of graphene. (United States)

    Shearer, Cameron J; Slattery, Ashley D; Stapleton, Andrew J; Shapter, Joseph G; Gibson, Christopher T


    Graphene has emerged as a material with a vast variety of applications. The electronic, optical and mechanical properties of graphene are strongly influenced by the number of layers present in a sample. As a result, the dimensional characterization of graphene films is crucial, especially with the continued development of new synthesis methods and applications. A number of techniques exist to determine the thickness of graphene films including optical contrast, Raman scattering and scanning probe microscopy techniques. Atomic force microscopy (AFM), in particular, is used extensively since it provides three-dimensional images that enable the measurement of the lateral dimensions of graphene films as well as the thickness, and by extension the number of layers present. However, in the literature AFM has proven to be inaccurate with a wide range of measured values for single layer graphene thickness reported (between 0.4 and 1.7 nm). This discrepancy has been attributed to tip-surface interactions, image feedback settings and surface chemistry. In this work, we use standard and carbon nanotube modified AFM probes and a relatively new AFM imaging mode known as PeakForce tapping mode to establish a protocol that will allow users to accurately determine the thickness of graphene films. In particular, the error in measuring the first layer is reduced from 0.1-1.3 nm to 0.1-0.3 nm. Furthermore, in the process we establish that the graphene-substrate adsorbate layer and imaging force, in particular the pressure the tip exerts on the surface, are crucial components in the accurate measurement of graphene using AFM. These findings can be applied to other 2D materials.

  9. Accurate thickness measurement of graphene (United States)

    Shearer, Cameron J.; Slattery, Ashley D.; Stapleton, Andrew J.; Shapter, Joseph G.; Gibson, Christopher T.


    Graphene has emerged as a material with a vast variety of applications. The electronic, optical and mechanical properties of graphene are strongly influenced by the number of layers present in a sample. As a result, the dimensional characterization of graphene films is crucial, especially with the continued development of new synthesis methods and applications. A number of techniques exist to determine the thickness of graphene films including optical contrast, Raman scattering and scanning probe microscopy techniques. Atomic force microscopy (AFM), in particular, is used extensively since it provides three-dimensional images that enable the measurement of the lateral dimensions of graphene films as well as the thickness, and by extension the number of layers present. However, in the literature AFM has proven to be inaccurate with a wide range of measured values for single layer graphene thickness reported (between 0.4 and 1.7 nm). This discrepancy has been attributed to tip-surface interactions, image feedback settings and surface chemistry. In this work, we use standard and carbon nanotube modified AFM probes and a relatively new AFM imaging mode known as PeakForce tapping mode to establish a protocol that will allow users to accurately determine the thickness of graphene films. In particular, the error in measuring the first layer is reduced from 0.1-1.3 nm to 0.1-0.3 nm. Furthermore, in the process we establish that the graphene-substrate adsorbate layer and imaging force, in particular the pressure the tip exerts on the surface, are crucial components in the accurate measurement of graphene using AFM. These findings can be applied to other 2D materials.

  10. Typology, classification and systematization of innovative projects and initiatives in the company


    Baklanova Julia O.


    The author compares the definitions of typology, classification and systematization, and illustrates them using the example of innovative projects and initiatives of the company. The typology and classification are based on the methodology of Benko K. and McFarlan. To obtain a more accurate result, the tasks of typology, classification and systematization need to be integrated.

  11. Typology, classification and systematization of innovative projects and initiatives in the company

    Directory of Open Access Journals (Sweden)

    Baklanova Julia O.


    The author compares the definitions of typology, classification and systematization, and illustrates them using the example of innovative projects and initiatives of the company. The typology and classification are based on the methodology of Benko K. and McFarlan. To obtain a more accurate result, the tasks of typology, classification and systematization need to be integrated.

  12. Inter-rater reliability of the EPUAP pressure ulcer classification system using photographs.

    NARCIS (Netherlands)

    Defloor, T.; Schoonhoven, L.


    BACKGROUND: Many classification systems for grading pressure ulcers are discussed in the literature. Correct identification and classification of a pressure ulcer is important for accurate reporting of the magnitude of the problem, and for timely prevention. The reliability of pressure ulcer classif

  13. Classification of cultivated plants.

    NARCIS (Netherlands)

    Brandenburg, W.A.


    Agricultural practice demands principles for classification, starting from the basal entity in cultivated plants: the cultivar. In establishing biosystematic relationships between wild, weedy and cultivated plants, the species concept needs re-examination. Combining of botanic classification, based

  14. Aircraft Operations Classification System (United States)

    Harlow, Charles; Zhu, Weihong


    Accurate data is important in the aviation planning process. In this project we consider systems for measuring aircraft activity at airports. This would include determining the type of aircraft such as jet, helicopter, single engine, and multiengine propeller. Some of the issues involved in deploying technologies for monitoring aircraft operations are cost, reliability, and accuracy. In addition, the system must be field portable and acceptable at airports. A comparison of technologies was conducted and it was decided that an aircraft monitoring system should be based upon acoustic technology. A multimedia relational database was established for the study. The information contained in the database consists of airport information, runway information, acoustic records, photographic records, a description of the event (takeoff, landing), aircraft type, and environmental information. We extracted features from the time signal and the frequency content of the signal. A multi-layer feed-forward neural network was chosen as the classifier. Training and testing results were obtained. We were able to obtain classification results of over 90 percent for training and testing for takeoff events.

  15. Barcoding and Phylogenetic Inferences in Nine Mugilid Species (Pisces, Mugiliformes)

    Directory of Open Access Journals (Sweden)

    Neonila Polyakova


    Accurate identification of fish and fish products, from eggs to adults, is important in many areas. Grey mullets of the family Mugilidae are distributed worldwide and inhabit marine, estuarine, and freshwater environments in all tropical and temperate regions. Various Mugilid species are commercially important species in fishery and aquaculture of many countries. For the present study we have chosen two Mugilid genes with different phylogenetic signals: relatively variable mitochondrial cytochrome oxidase subunit I (COI) and conservative nuclear rhodopsin (RHO). We examined their diversity within and among 9 Mugilid species belonging to 4 genera, many of which have been examined from multiple specimens, with the goal of determining whether DNA barcoding can achieve unambiguous species recognition of Mugilid species. The data obtained showed that information based on COI sequences was diagnostic not only for species-level identification but also for recognition of intraspecific units, e.g., allopatric populations of circumtropical Mugil cephalus, or even native and acclimatized specimens of Chelon haematocheila. All RHO sequences appeared strictly species specific. Based on the data obtained, we conclude that COI, as well as RHO sequencing can be used to unambiguously identify fish species. Topologies of phylogeny based on RHO and COI sequences coincided with each other, while together they had a good phylogenetic signal.

  16. Cirrhosis Classification Based on Texture Classification of Random Features

    Directory of Open Access Journals (Sweden)

    Hui Liu


    Accurate staging of hepatic cirrhosis is important in investigating the cause and slowing down the effects of cirrhosis. Computer-aided diagnosis (CAD) can provide doctors with an alternative second opinion and assist them to make a specific treatment with accurate cirrhosis stage. MRI has many advantages, including high resolution for soft tissue, no radiation, and multiparameters imaging modalities. So in this paper, multisequences MRIs, including T1-weighted, T2-weighted, arterial, portal venous, and equilibrium phase, are applied. However, CAD does not meet the clinical needs of cirrhosis and few researchers are concerned with it at present. Cirrhosis is characterized by the presence of widespread fibrosis and regenerative nodules in the hepatic, leading to different texture patterns of different stages. So, extracting texture feature is the primary task. Compared with typical gray level cooccurrence matrix (GLCM) features, texture classification from random features provides an effective way, and we adopt it and propose CCTCRF for triple classification (normal, early, and middle and advanced stage). CCTCRF does not need strong assumptions except the sparse character of image, contains sufficient texture information, includes concise and effective process, and makes case decision with high accuracy. Experimental results also illustrate the satisfying performance and they are also compared with typical NN with GLCM.

  17. Cirrhosis classification based on texture classification of random features. (United States)

    Liu, Hui; Shao, Ying; Guo, Dongmei; Zheng, Yuanjie; Zhao, Zuowei; Qiu, Tianshuang


    Accurate staging of hepatic cirrhosis is important in investigating the cause and slowing down the effects of cirrhosis. Computer-aided diagnosis (CAD) can provide doctors with an alternative second opinion and assist them to make a specific treatment with accurate cirrhosis stage. MRI has many advantages, including high resolution for soft tissue, no radiation, and multiparameters imaging modalities. So in this paper, multisequences MRIs, including T1-weighted, T2-weighted, arterial, portal venous, and equilibrium phase, are applied. However, CAD does not meet the clinical needs of cirrhosis and few researchers are concerned with it at present. Cirrhosis is characterized by the presence of widespread fibrosis and regenerative nodules in the hepatic, leading to different texture patterns of different stages. So, extracting texture feature is the primary task. Compared with typical gray level cooccurrence matrix (GLCM) features, texture classification from random features provides an effective way, and we adopt it and propose CCTCRF for triple classification (normal, early, and middle and advanced stage). CCTCRF does not need strong assumptions except the sparse character of image, contains sufficient texture information, includes concise and effective process, and makes case decision with high accuracy. Experimental results also illustrate the satisfying performance and they are also compared with typical NN with GLCM.

  18. Random forest classification of etiologies for an orphan disease. (United States)

    Speiser, Jaime Lynn; Durkalski, Valerie L; Lee, William M


    Classification of objects into pre-defined groups based on known information is a fundamental problem in the field of statistics. Although approaches for solving this problem exist, finding an accurate classification method can be challenging in an orphan disease setting, where data are minimal and often not normally distributed. The purpose of this paper is to illustrate the application of the random forest (RF) classification procedure in a real clinical setting and discuss typical questions that arise in the general classification framework as well as offer interpretations of RF results. This paper includes methods for assessing predictive performance, importance of predictor variables, and observation-specific information.

  19. Increased taxon sampling greatly reduces phylogenetic error. (United States)

    Zwickl, Derrick J; Hillis, David M


    Several authors have argued recently that extensive taxon sampling has a positive and important effect on the accuracy of phylogenetic estimates. However, other authors have argued that there is little benefit of extensive taxon sampling, and so phylogenetic problems can or should be reduced to a few exemplar taxa as a means of reducing the computational complexity of the phylogenetic analysis. In this paper we examined five aspects of study design that may have led to these different perspectives. First, we considered the measurement of phylogenetic error across a wide range of taxon sample sizes, and conclude that the expected error based on randomly selecting trees (which varies by taxon sample size) must be considered in evaluating error in studies of the effects of taxon sampling. Second, we addressed the scope of the phylogenetic problems defined by different samples of taxa, and argue that phylogenetic scope needs to be considered in evaluating the importance of taxon-sampling strategies. Third, we examined the claim that fast and simple tree searches are as effective as more thorough searches at finding near-optimal trees that minimize error. We show that a more complete search of tree space reduces phylogenetic error, especially as the taxon sample size increases. Fourth, we examined the effects of simple versus complex simulation models on taxonomic sampling studies. Although benefits of taxon sampling are apparent for all models, data generated under more complex models of evolution produce higher overall levels of error and show greater positive effects of increased taxon sampling. Fifth, we asked if different phylogenetic optimality criteria show different effects of taxon sampling. Although we found strong differences in effectiveness of different optimality criteria as a function of taxon sample size, increased taxon sampling improved the results from all the common optimality criteria. Nonetheless, the method that showed the lowest overall

  20. Fourier transform inequalities for phylogenetic trees. (United States)

    Matsen, Frederick A


    Phylogenetic invariants are not the only constraints on site-pattern frequency vectors for phylogenetic trees. A mutation matrix, by its definition, is the exponential of a matrix with non-negative off-diagonal entries; this positivity requirement implies non-trivial constraints on the site-pattern frequency vectors. We call these additional constraints "edge-parameter inequalities". In this paper, we first motivate the edge-parameter inequalities by considering a pathological site-pattern frequency vector corresponding to a quartet tree with a negative internal edge. This site-pattern frequency vector nevertheless satisfies all of the constraints described up to now in the literature. We next describe two complete sets of edge-parameter inequalities for the group-based models; these constraints are square-free monomial inequalities in the Fourier transformed coordinates. These inequalities, along with the phylogenetic invariants, form a complete description of the set of site-pattern frequency vectors corresponding to bona fide trees. Said in mathematical language, this paper explicitly presents two finite lists of inequalities in Fourier coordinates of the form "monomial ≤ 1", each list characterizing the phylogenetically relevant semialgebraic subsets of the phylogenetic varieties.
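
    The positivity requirement described in this abstract can be illustrated on a single edge of the simplest group-based model, the two-state symmetric model. The sketch below is not taken from the paper: the function names and the rate and length values are illustrative assumptions. It uses the closed form of exp(Qt) for Q = [[-a, a], [a, -a]] and shows that the edge's Fourier (Hadamard) parameter stays at or below 1 for a non-negative edge length and exceeds 1 for a negative one.

```python
import math


def two_state_transition(rate, length):
    """Closed form of exp(Q * length) for Q = [[-rate, rate], [rate, -rate]].

    Returns the 2x2 transition matrix and theta = exp(-2 * rate * length),
    the Fourier (Hadamard) parameter of the edge. Names are hypothetical.
    """
    theta = math.exp(-2.0 * rate * length)
    same = 0.5 * (1.0 + theta)   # probability that the state is unchanged
    diff = 0.5 * (1.0 - theta)   # probability that the state has flipped
    return [[same, diff], [diff, same]], theta


def fourier_parameter(matrix):
    """Recover theta as P(same) - P(different) from the transition matrix."""
    return matrix[0][0] - matrix[0][1]


# A bona fide edge: non-negative rate and length give 0 < theta <= 1.
matrix_ok, theta_ok = two_state_transition(rate=0.5, length=1.0)

# A "negative edge": the same formula yields theta > 1,
# which breaks an inequality of the form "theta <= 1".
matrix_bad, theta_bad = two_state_transition(rate=0.5, length=-1.0)

print(theta_ok, fourier_parameter(matrix_ok))
print(theta_bad, fourier_parameter(matrix_bad))
```

    This covers only the single-edge analogue. The inequalities in the paper are stated for monomials in the Fourier coordinates of site-pattern frequencies on a whole tree.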

  1. Worldwide phylogenetic relationship of avian poxviruses (United States)

    Gyuranecz, Miklós; Foster, Jeffrey T.; Dán, Ádám; Ip, Hon S.; Egstad, Kristina F.; Parker, Patricia G.; Higashiguchi, Jenni M.; Skinner, Michael A.; Höfle, Ursula; Kreizinger, Zsuzsa; Dorrestein, Gerry M.; Solt, Szabolcs; Sós, Endre; Kim, Young Jun; Uhart, Marcela; Pereda, Ariel; González-Hein, Gisela; Hidalgo, Hector; Blanco, Juan-Manuel; Erdélyi, Károly


    Poxvirus infections have been found in 230 species of wild and domestic birds worldwide in both terrestrial and marine environments. This ubiquity raises the question of how infection has been transmitted and globally dispersed. We present a comprehensive global phylogeny of 111 novel poxvirus isolates in addition to all available sequences from GenBank. Phylogenetic analysis of Avipoxvirus genus has traditionally relied on one gene region (4b core protein). In this study we have expanded the analyses to include a second locus (DNA polymerase gene), allowing for a more robust phylogenetic framework, finer genetic resolution within specific groups and the detection of potential recombination. Our phylogenetic results reveal several major features of avipoxvirus evolution and ecology and propose an updated avipoxvirus taxonomy, including three novel subclades. The characterization of poxviruses from 57 species of birds in this study extends the current knowledge of their host range and provides the first evidence of the phylogenetic effect of genetic recombination of avipoxviruses. The repeated occurrence of avian family or order-specific grouping within certain clades (e.g. starling poxvirus, falcon poxvirus, raptor poxvirus, etc.) indicates a marked role of host adaptation, while the sharing of poxvirus species within prey-predator systems emphasizes the capacity for cross-species infection and limited host adaptation. Our study provides a broad and comprehensive phylogenetic analysis of the Avipoxvirus genus, an ecologically and environmentally important viral group, to formulate a genome sequencing strategy that will clarify avipoxvirus taxonomy.

  2. Primate molecular phylogenetics in a genomic era. (United States)

    Ting, Nelson; Sterner, Kirstin N


    A primary objective of molecular phylogenetics is to use molecular data to elucidate the evolutionary history of living organisms. Dr. Morris Goodman founded the journal Molecular Phylogenetics and Evolution as a forum where scientists could further our knowledge about the tree of life, and he recognized that the inference of species trees is a first and fundamental step to addressing many important evolutionary questions. In particular, Dr. Goodman was interested in obtaining a complete picture of the primate species tree in order to provide an evolutionary context for the study of human adaptations. A number of recent studies use multi-locus datasets to infer well-resolved and well-supported primate phylogenetic trees using consensus approaches (e.g., supermatrices). It is therefore tempting to assume that we have a complete picture of the primate tree, especially above the species level. However, recent theoretical and empirical work in the field of molecular phylogenetics demonstrates that consensus methods might provide a false sense of support at certain nodes. In this brief review we discuss the current state of primate molecular phylogenetics and highlight the importance of exploring the use of coalescent-based analyses that have the potential to better utilize information contained in multi-locus data.

  3. Teaching Molecular Phylogenetics through Investigating a Real-World Phylogenetic Problem (United States)

    Zhang, Xiaorong


    A phylogenetics exercise is incorporated into the "Introduction to biocomputing" course, a junior-level course at Savannah State University. This exercise is designed to help students learn important concepts and practical skills in molecular phylogenetics through solving a real-world problem. In this application, students are required to identify…

  4. A More Accurate Fourier Transform

    CERN Document Server

    Courtney, Elya


    Fourier transform methods are used to analyze functions and data sets to provide frequencies, amplitudes, and phases of underlying oscillatory components. Fast Fourier transform (FFT) methods offer speed advantages over evaluation of explicit integrals (EI) that define Fourier transforms. This paper compares frequency, amplitude, and phase accuracy of the two methods for well resolved peaks over a wide array of data sets including cosine series with and without random noise and a variety of physical data sets, including atmospheric $\mathrm{CO_2}$ concentrations, tides, temperatures, sound waveforms, and atomic spectra. The FFT uses MIT's FFTW3 library. The EI method uses the rectangle method to compute the areas under the curve via complex math. Results support the hypothesis that EI methods are more accurate than FFT methods. Errors range from 5 to 10 times higher when determining peak frequency by FFT, 1.4 to 60 times higher for peak amplitude, and 6 to 10 times higher for phase under a peak. The ability t...
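
    A minimal sketch of one way to read the EI-versus-FFT comparison follows. It is not the paper's code, and the signal, sampling values and function names are illustrative assumptions. The explicit integral is approximated with the rectangle rule and can be evaluated at any trial frequency. Evaluating the same sum only at the discrete Fourier bin frequencies stands in for the FFT here, since an FFT returns values at exactly those frequencies.

```python
import cmath
import math


def explicit_fourier(samples, dt, freq):
    """Rectangle-rule estimate of the integral of x(t) * exp(-2*pi*i*freq*t) dt."""
    total = sum(
        x * cmath.exp(-2j * math.pi * freq * k * dt)
        for k, x in enumerate(samples)
    )
    return total * dt


def peak_frequency(samples, dt, trial_freqs):
    """Trial frequency at which the transform magnitude is largest."""
    return max(trial_freqs, key=lambda f: abs(explicit_fourier(samples, dt, f)))


# Hypothetical test signal: a 2.6 Hz cosine sampled at 32 Hz for 4 seconds.
n, dt, f_true = 128, 1.0 / 32.0, 2.6
signal = [math.cos(2.0 * math.pi * f_true * k * dt) for k in range(n)]

# Frequencies an FFT would report: multiples of 1 / (n * dt) = 0.25 Hz.
dft_bins = [k / (n * dt) for k in range(n // 2 + 1)]

# Frequencies scanned by the explicit integral: any grid we like, here 0.01 Hz steps.
fine_grid = [k * 0.01 for k in range(501)]

f_dft = peak_frequency(signal, dt, dft_bins)
f_ei = peak_frequency(signal, dt, fine_grid)

print("peak on DFT bins:", f_dft)
print("peak on fine grid:", f_ei)
```

    In this sketch the accuracy gain comes from the finer frequency grid, at the cost of one full sum per trial frequency. Whether that accounts for all of the differences reported in the abstract is not something the sketch can show.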

  5. PhyTB: Phylogenetic tree visualisation and sample positioning for M. tuberculosis

    KAUST Repository

    Benavente, Ernest D


    Background Phylogenetic-based classification of M. tuberculosis and other bacterial genomes is a core analysis for studying evolutionary hypotheses, disease outbreaks and transmission events. Whole genome sequencing is providing new insights into the genomic variation underlying intra- and inter-strain diversity, thereby assisting with the classification and molecular barcoding of the bacteria. One roadblock to strain investigation is the lack of user-interactive solutions to interrogate and visualise variation within a phylogenetic tree setting. Results We have developed a web-based tool called PhyTB to assist phylogenetic tree visualisation and identification of M. tuberculosis clade-informative polymorphism. Variant Call Format files can be uploaded to determine a sample position within the tree. A map view summarises the geographical distribution of alleles and strain-types. The utility of the PhyTB is demonstrated on sequence data from 1,601 M. tuberculosis isolates. Conclusion PhyTB contextualises M. tuberculosis genomic variation within epidemiological, geographical and phylogenic settings. Further tool utility is possible by incorporating large variants and phenotypic data (e.g. drug-resistance profiles), and an assessment of genotype-phenotype associations. Source code is available to develop similar websites for other organisms.

  6. Phylogenetic analysis of the Trypanosoma genus based on the heat-shock protein 70 gene. (United States)

    Fraga, Jorge; Fernández-Calienes, Aymé; Montalvo, Ana Margarita; Maes, Ilse; Deborggraeve, Stijn; Büscher, Philippe; Dujardin, Jean-Claude; Van der Auwera, Gert


    Trypanosome evolution has so far been studied essentially on the basis of phylogenetic analyses of small subunit ribosomal RNA (SSU-rRNA) and glycosomal glyceraldehyde-3-phosphate dehydrogenase (gGAPDH) genes. We used for the first time the 70kDa heat-shock protein gene (hsp70) to investigate the phylogenetic relationships among 11 Trypanosoma species on the basis of 1380 nucleotides from 76 sequences corresponding to 65 strains. We also constructed a phylogeny based on combined datasets of SSU-rDNA, gGAPDH and hsp70 sequences. The obtained clusters can be correlated with the sections and subgenus classifications of mammal-infecting trypanosomes except for Trypanosoma theileri and Trypanosoma rangeli. Our analysis supports the classification of Trypanosoma species into clades rather than into sections and subgenera, some of which are polyphyletic. Nine clades were recognized: Trypanosoma carassi, Trypanosoma congolense, Trypanosoma cruzi, Trypanosoma grayi, Trypanosoma lewisi, T. rangeli, T. theileri, Trypanosoma vivax and Trypanozoon. These results are consistent with existing knowledge of the genus' phylogeny. Within the T. cruzi clade, three groups of T. cruzi discrete typing units could be clearly distinguished, corresponding to TcI, TcIII, and TcII+V+VI, while support for TcIV was lacking. Phylogenetic analyses based on hsp70 demonstrated that this molecular marker can be applied for discriminating most of the Trypanosoma species and clades.

  7. SUMAC: Constructing Phylogenetic Supermatrices and Assessing Partially Decisive Taxon Coverage


    William A. Freyman


    The amount of phylogenetically informative sequence data in GenBank is growing at an exponential rate, and large phylogenetic trees are increasingly used in research. Tools are needed to construct phylogenetic sequence matrices from GenBank data and evaluate the effect of missing data. Supermatrix Constructor (SUMAC) is a tool to data-mine GenBank, construct phylogenetic supermatrices, and assess the phylogenetic decisiveness of a matrix given the pattern of missing sequence data. SUMAC calcu...

  8. Phylogenetic inference under varying proportions of indel-induced alignment gaps

    Directory of Open Access Journals (Sweden)

    Gadagkar Sudhindra R


    Background The effect of alignment gaps on phylogenetic accuracy has been the subject of numerous studies. In this study, we investigated the relationship between the total number of gapped sites and phylogenetic accuracy, when the gaps were introduced (by means of computer simulation) to reflect indel (insertion/deletion) events during the evolution of DNA sequences. The resulting (true) alignments were subjected to commonly used gap treatment and phylogenetic inference methods. Results (1) In general, there was a strong – almost deterministic – relationship between the amount of gap in the data and the level of phylogenetic accuracy when the alignments were very "gappy", (2) gaps resulting from deletions (as opposed to insertions) contributed more to the inaccuracy of phylogenetic inference, (3) the probabilistic methods (Bayesian, PhyML & "MLε", a method implemented in DNAML in PHYLIP) performed better at most levels of gap percentage when compared to parsimony (MP) and distance (NJ) methods, with Bayesian analysis being clearly the best, (4) methods that treat gapped sites as missing data yielded less accurate trees when compared to those that attribute phylogenetic signal to the gapped sites (by coding them as binary character data – presence/absence, or as in the MLε method), and (5) in general, the accuracy of phylogenetic inference depended upon the amount of available data when the gaps resulted from mainly deletion events, and the amount of missing data when insertion events were equally likely to have caused the alignment gaps. Conclusion When gaps in an alignment are a consequence of indel events in the evolution of the sequences, the accuracy of phylogenetic analysis is likely to improve if: (1) alignment gaps are categorized as arising from insertion events or deletion events and then treated separately in the analysis, (2) the evolutionary signal provided by indels is harnessed in the phylogenetic analysis, and (3) methods that
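
    The presence/absence coding of gapped sites mentioned in this abstract can be sketched as follows. This is a simplified per-column illustration, not the procedure used in the study: the function name and the toy alignment are assumptions, and coding schemes used in practice may treat a run of adjacent gap columns as a single character.

```python
def gaps_to_binary(alignment):
    """Code each gapped column as a binary character per sequence.

    alignment: dict mapping sequence name to an aligned string ('-' = gap).
    Returns a dict mapping each name to a string of '1' (residue present)
    and '0' (gap), with one position per column that contains a gap.
    """
    names = list(alignment)
    length = len(alignment[names[0]])
    gapped_columns = [
        i for i in range(length)
        if any(alignment[name][i] == "-" for name in names)
    ]
    return {
        name: "".join(
            "0" if alignment[name][i] == "-" else "1" for i in gapped_columns
        )
        for name in names
    }


# Hypothetical toy alignment: columns 1 and 3 (0-based) contain gaps.
toy_alignment = {
    "taxonA": "ACG-T",
    "taxonB": "A-G-T",
    "taxonC": "ACGTT",
}
binary_characters = gaps_to_binary(toy_alignment)
print(binary_characters)
```

    The resulting 0/1 strings can be appended to the nucleotide matrix as an extra character block, so that the gaps contribute signal rather than being treated as missing data.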

  9. Visualizing Phylogenetic Treespace Using Cartographic Projections (United States)

    Sundberg, Kenneth; Clement, Mark; Snell, Quinn

    Phylogenetic analysis is becoming an increasingly important tool for biological research. Applications include epidemiological studies, drug development, and evolutionary analysis. Phylogenetic search is a known NP-Hard problem. The size of the data sets which can be analyzed is limited by the exponential growth in the number of trees that must be considered as the problem size increases. A better understanding of the problem space could lead to better methods, which in turn could lead to the feasible analysis of more data sets. We present a definition of phylogenetic tree space and a visualization of this space that shows significant exploitable structure. This structure can be used to develop search methods capable of handling much larger datasets.
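
    The growth in the number of candidate trees mentioned in this abstract can be made concrete with the standard count of unrooted, fully resolved topologies on n labelled taxa, (2n-5)!! = 3 x 5 x ... x (2n-5). The sketch below is not from the paper and the function name is an illustrative assumption.

```python
def num_unrooted_binary_trees(n_taxa):
    """Count unrooted, fully resolved tree topologies on n labelled taxa.

    Uses the double factorial (2n-5)!! for n >= 3; for fewer than three
    taxa there is only one possible topology.
    """
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n-5.
    for k in range(3, 2 * n_taxa - 4, 2):
        count *= k
    return count


for n in (4, 5, 10, 20, 50):
    print(n, num_unrooted_binary_trees(n))
```

    Each added taxon multiplies the count by the next odd number, which is why exhaustive search stops being feasible after a few dozen taxa and heuristic search methods are used instead.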

  10. Phylogenetic invariants for group-based models

    CERN Document Server

    Donten-Bury, Maria


    In this paper we investigate properties of algebraic varieties representing group-based phylogenetic models. We give the (first) example of a nonnormal general group-based model for an abelian group. Following Kaie Kubjas we also determine some invariants of group-based models showing that the associated varieties do not have to be deformation equivalent. We propose a method of generating many phylogenetic invariants and in particular we show that our approach gives the whole ideal of the claw tree for 3-Kimura model under the assumption of the conjecture of Sturmfels and Sullivant. This, combined with the results of Sturmfels and Sullivant, would make it possible to determine all phylogenetic invariants for any tree for 3-Kimura model and possibly for other group-based models.

  11. Morphological and molecular convergences in mammalian phylogenetics. (United States)

    Zou, Zhengting; Zhang, Jianzhi


    Phylogenetic trees reconstructed from molecular sequences are often considered more reliable than those reconstructed from morphological characters, in part because convergent evolution, which confounds phylogenetic reconstruction, is believed to be rarer for molecular sequences than for morphologies. However, neither the validity of this belief nor its underlying cause is known. Here, comparing thousands of characters of each type that have been used for inferring the phylogeny of mammals, we find that on average morphological characters indeed experience many more convergences than amino acid sites, but this disparity is explained by fewer states per character rather than an intrinsically higher susceptibility to convergence for morphologies than sequences. We show by computer simulation and actual data analysis that a simple method for identifying and removing convergence-prone characters improves phylogenetic accuracy, potentially enabling, when necessary, the inclusion of morphologies and hence fossils for reliable tree inference.

  12. Phylogenetic structure in tropical hummingbird communities

    DEFF Research Database (Denmark)

    Graham, Catherine H; Parra, Juan L; Rahbek, Carsten;


    composition of 189 hummingbird communities in Ecuador. We assessed how species and phylogenetic composition changed along environmental gradients and across biogeographic barriers. We show that humid, low-elevation communities are phylogenetically overdispersed (coexistence of distant relatives), a pattern...... an expensive means of locomotion at high elevations. We found that communities in the lowlands on opposite sides of the Andes tend to be phylogenetically similar despite their large differences in species composition, a pattern implicating the Andes as an important dispersal barrier. In contrast, along...... the steep environmental gradient between the lowlands and the Andes we found evidence that species turnover is comprised of relatively distantly related species. The integration of local and regional patterns of diversity across environmental gradients and biogeographic barriers provides insight...

  13. Consequences of recombination on traditional phylogenetic analysis

    DEFF Research Database (Denmark)

    Schierup, M H; Hein, J


    We investigate the shape of a phylogenetic tree reconstructed from sequences evolving under the coalescent with recombination. The motivation is that evolutionary inferences are often made from phylogenetic trees reconstructed from population data even though recombination may well occur (mtDNA...... or viral sequences) or does occur (nuclear sequences). We investigate the size and direction of biases when a single tree is reconstructed ignoring recombination. Standard software (PHYLIP) was used to construct the best phylogenetic tree from sequences simulated under the coalescent with recombination....... With recombination present, the length of terminal branches and the total branch length are larger, and the time to the most recent common ancestor smaller, than for a tree reconstructed from sequences evolving with no recombination. The effects are pronounced even for small levels of recombination that may...

  14. Probabilistic graphical model representation in phylogenetics. (United States)

    Höhna, Sebastian; Heath, Tracy A; Boussau, Bastien; Landis, Michael J; Ronquist, Fredrik; Huelsenbeck, John P


    Recent years have seen a rapid expansion of the model space explored in statistical phylogenetics, emphasizing the need for new approaches to statistical model representation and software development. Clear communication and representation of the chosen model is crucial for: (i) reproducibility of an analysis, (ii) model development, and (iii) software design. Moreover, a unified, clear and understandable framework for model representation lowers the barrier for beginners and nonspecialists to grasp complex phylogenetic models, including their assumptions and parameter/variable dependencies. Graphical modeling is a unifying framework that has gained in popularity in the statistical literature in recent years. The core idea is to break complex models into conditionally independent distributions. The strength lies in the comprehensibility, flexibility, and adaptability of this formalism, and the large body of computational work based on it. Graphical models are well-suited to teach statistical models, to facilitate communication among phylogeneticists and in the development of generic software for simulation and statistical inference. Here, we provide an introduction to graphical models for phylogeneticists and extend the standard graphical model representation to the realm of phylogenetics. We introduce a new graphical model component, tree plates, to capture the changing structure of the subgraph corresponding to a phylogenetic tree. We describe a range of phylogenetic models using the graphical model framework and introduce modules to simplify the representation of standard components in large and complex models. Phylogenetic model graphs can be readily used in simulation, maximum likelihood inference, and Bayesian inference using, for example, Metropolis-Hastings or Gibbs sampling of the posterior distribution.

  15. Classification, disease, and diagnosis. (United States)

    Jutel, Annemarie


    Classification shapes medicine and guides its practice. Understanding classification must be part of the quest to better understand the social context and implications of diagnosis. Classifications are part of the human work that provides a foundation for the recognition and study of illness: deciding how the vast expanse of nature can be partitioned into meaningful chunks, stabilizing and structuring what is otherwise disordered. This article explores the aims of classification, their embodiment in medical diagnosis, and the historical traditions of medical classification. It provides a brief overview of the aims and principles of classification and their relevance to contemporary medicine. It also demonstrates how classifications operate as social framing devices that enable and disable communication, assert and refute authority, and are important items for sociological study.

  16. Automatic classification of blank substrate defects (United States)

    Boettiger, Tom; Buck, Peter; Paninjath, Sankaranarayanan; Pereira, Mark; Ronald, Rob; Rost, Dan; Samir, Bhamidipati


    Mask preparation stages are crucial in mask manufacturing, since this mask is to later act as a template for a considerable number of dies on the wafer. Defects on the initial blank substrate, and subsequent cleaned and coated substrates, can have a profound impact on the usability of the finished mask. This emphasizes the need for early and accurate identification of blank substrate defects and the risk they pose to the patterned reticle. While Automatic Defect Classification (ADC) is a well-developed technology for inspection and analysis of defects on patterned wafers and masks in the semiconductor industry, ADC for mask blanks is still in the early stages of adoption and development. Calibre ADC is a powerful analysis tool for fast, accurate, consistent and automatic classification of defects on mask blanks. Accurate, automated classification of mask blanks leads to better usability of blanks by enabling defect avoidance technologies during mask writing. Detailed information on blank defects can help to select appropriate job-decks to be written on the mask by defect avoidance tools [1][4][5]. Smart algorithms separate critical defects from the potentially large number of non-critical defects or false defects detected at various stages during mask blank preparation. Mechanisms used by Calibre ADC to identify and characterize defects include defect location and size, signal polarity (dark, bright) in both transmitted and reflected review images, and distinguishing defect signals from background noise in defect images. The Calibre ADC engine then uses a decision tree to translate this information into a defect classification code. Using this automated process improves classification accuracy, repeatability and speed, while avoiding the subjectivity of human judgment compared to the alternative of manual defect classification by trained personnel [2]. This paper focuses on the results from the evaluation of Automatic Defect Classification (ADC) product at MP Mask

  17. Quality-Oriented Classification of Aircraft Material Based on SVM

    Directory of Open Access Journals (Sweden)

    Hongxia Cai


    Existing material classification schemes were proposed to improve inventory management. However, different materials have different quality-related attributes, especially in the aircraft industry. In order to reduce cost without sacrificing quality, we propose a quality-oriented material classification system that considers material quality character, quality cost, and quality influence. The Analytic Hierarchy Process supports feature selection and classification decisions. We use an improved Kraljic Portfolio Matrix to establish a three-dimensional classification model. Aircraft materials can be divided into eight types, including general type, key type, risk type, and leveraged type. To improve the classification accuracy for the various materials, a Support Vector Machine (SVM) algorithm is introduced. Finally, we compare the SVM and a BP neural network in the application. The results show that the SVM algorithm is more efficient and accurate and that the quality-oriented material classification is valuable.

  18. Identification and Classification of Rhizobia by Matrix-Assisted Laser Desorption/Ionization Time-Of-Flight Mass Spectrometry. (United States)

    Jia, Rui Zong; Zhang, Rong Juan; Wei, Qing; Chen, Wen Feng; Cho, Il Kyu; Chen, Wen Xin; Li, Qing X

    Mass spectrometry (MS) has been widely used for specific, sensitive and rapid analysis of proteins and has shown a high potential for bacterial identification and characterization. Type strains of four species of rhizobia and Escherichia coli DH5α were employed as reference bacteria to optimize various parameters for identification and classification of species of rhizobia by matrix-assisted laser desorption/ionization time-of-flight MS (MALDI TOF MS). The parameters optimized included culture medium states (liquid or solid), bacterial growth phases, colony storage temperature and duration, and protein data processing to enhance the bacterial identification resolution, accuracy and reliability. The medium state had little effect on the mass spectra of protein profiles. A suitable sampling time was between the exponential phase and the stationary phase. Consistent protein mass spectral profiles were observed for E. coli colonies pre-grown for 14 days and rhizobia for 21 days at 4°C or 21°C. A dendrogram of 75 rhizobial strains of 4 genera was constructed based on MALDI TOF mass spectra and the topological patterns agreed well with those in the 16S rDNA phylogenetic tree. The potential of developing a mass spectral database for all rhizobia species was assessed with blind samples. The entire process from sample preparation to accurate identification and classification of species required approximately one hour.

  19. The complete mitochondrial genome sequence of Acentrogobius sp. (Gobiiformes: Gobiidae) and phylogenetic studies of Gobiidae. (United States)

    Yang, Qiu-Hua; Lin, Qi; He, Li-Bin; Huang, Rui-Fang; Lin, Ke-Bing; Ge, Hui; Wu, Jian-Shao; Zhou, Chen


    At present, few morphological descriptions are available for Acentrogobius species, and some issues in their species classification and phylogeny remain confused. In this study, we first determined and described the complete mitochondrial genome of Acentrogobius sp. The complete mitogenome sequence is 17,083 bp in length, containing 13 protein-coding genes, two rRNA genes, 22 tRNA genes, a putative control region (CR), and a light-strand replication origin (OL). The overall base composition is 28.9% A, 26.2% T, 28.5% C, and 16.4% G, with a slight AT bias (55.1%). To further validate the newly determined sequences, phylogenetic trees involving all the Gobiidae species available in the GenBank database were constructed. These results are expected to provide useful molecular data for species identification and further phylogenetic studies of Gobiiformes.
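
    The composition figures in this record can be checked with simple arithmetic. The following is a minimal Python sketch, not the authors' pipeline: the helper names `base_composition` and `at_content` and the toy sequence are hypothetical, and only the four reported percentages come from the abstract.

```python
from collections import Counter


def base_composition(seq: str) -> dict[str, float]:
    """Return the percentage of A, C, G and T in seq (other symbols are ignored)."""
    counts = Counter(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return {base: 100.0 * counts[base] / total for base in "ACGT"}


def at_content(composition: dict[str, float]) -> float:
    """AT content is the sum of the A and T percentages."""
    return composition["A"] + composition["T"]


# Percentages as reported in the abstract (not recomputed from the genome).
reported = {"A": 28.9, "T": 26.2, "C": 28.5, "G": 16.4}
reported_at = round(at_content(reported), 1)
reported_total = round(sum(reported.values()), 1)

# Toy sequence (hypothetical) to show the counting step itself.
toy = base_composition("ATGCATTA")
toy_at = at_content(toy)
```

    Applied to the reported values, the A and T percentages add up to the stated AT bias, and the four percentages together account for the whole sequence.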

  20. Phylogenetics of the Phlebotomine Sand Fly Group Verrucarum (Diptera: Psychodidae: Lutzomyia) (United States)

    Cohnstaedt, Lee W.; Beati, Lorenza; Caceres, Abraham G.; Ferro, Cristina; Munstermann, Leonard E.


    Within the sand fly genus Lutzomyia, the Verrucarum species group contains several of the principal vectors of American cutaneous leishmaniasis and human bartonellosis in the Andean region of South America. The group encompasses 40 species for which the taxonomic status, phylogenetic relationships, and role of each species in disease transmission remain unresolved. Mitochondrial cytochrome c oxidase I (COI) phylogenetic analysis of a 667-bp fragment supported the morphological classification of the Verrucarum group into series. Genetic sequences from seven species were grouped in well-supported monophyletic lineages. Four species, however, clustered in two paraphyletic lineages that indicate conspecificity—the Lutzomyia longiflocosa–Lutzomyia sauroida pair and the Lutzomyia quasitownsendi–Lutzomyia torvida pair. COI sequences were also evaluated as a taxonomic tool based on interspecific genetic variability within the Verrucarum group and the intraspecific variability of one of its members, Lutzomyia verrucarum, across its known distribution. PMID:21633028

  1. Phylogenetic position of Oryzolejeunea (Lejeuneaceae, Marchantiophyta): Evidence from molecular markers and morphology

    Institute of Scientific and Technical Information of China (English)

    Wen YE; Yu-Mei WEI; Alfons SCHÄFER-VERWIMP; Rui-Liang ZHU


    The systematic position of the small neotropical genus Oryzolejeunea (three spp.) has long been controversial. Phylogenetic analyses of molecular data for the present study using DNA markers (trnL, psbA, and a nuclear ribosomal internal transcribed spacer [nrITS] region) show that the genus is nested in Lejeunea. The results not only reveal the phylogenetic position of Oryzolejeunea for the first time, but also challenge the taxonomic value of the proximal hyaline papilla as a key feature in Lejeunea. The present study shows the urgent need for a reassessment of the perimeters of the genus Lejeunea and its infrageneric classification. Three new combinations, namely Lejeunea saccatiloba, Lejeunea grolleana, and Lejeunea venezuelana, are proposed.

  2. Security classification of information

    Energy Technology Data Exchange (ETDEWEB)

    Quist, A.S.


    Certain governmental information must be classified for national security reasons. However, the national security benefits from classifying information are usually accompanied by significant costs -- those due to a citizenry not fully informed on governmental activities, the extra costs of operating classified programs and procuring classified materials (e.g., weapons), the losses to our nation when advances made in classified programs cannot be utilized in unclassified programs. The goal of a classification system should be to clearly identify that information which must be protected for national security reasons and to ensure that information not needing such protection is not classified. This document was prepared to help attain that goal. This document is the first of a planned four-volume work that comprehensively discusses the security classification of information. Volume 1 broadly describes the need for classification, the basis for classification, and the history of classification in the United States from colonial times until World War 2. Classification of information since World War 2, under Executive Orders and the Atomic Energy Acts of 1946 and 1954, is discussed in more detail, with particular emphasis on the classification of atomic energy information. Adverse impacts of classification are also described. Subsequent volumes will discuss classification principles, classification management, and the control of certain unclassified scientific and technical information. 340 refs., 6 tabs.

  3. Security classification of information

    Energy Technology Data Exchange (ETDEWEB)

    Quist, A.S.


    This document is the second of a planned four-volume work that comprehensively discusses the security classification of information. The main focus of Volume 2 is on the principles for classification of information. Included herein are descriptions of the two major types of information that governments classify for national security reasons (subjective and objective information), guidance to use when determining whether information under consideration for classification is controlled by the government (a necessary requirement for classification to be effective), information disclosure risks and benefits (the benefits and costs of classification), standards to use when balancing information disclosure risks and benefits, guidance for assigning classification levels (Top Secret, Secret, or Confidential) to classified information, guidance for determining how long information should be classified (classification duration), classification of associations of information, classification of compilations of information, and principles for declassifying and downgrading information. Rules or principles of certain areas of our legal system (e.g., trade secret law) are sometimes mentioned to provide added support to some of those classification principles.

  4. Multilocus phylogenetic analysis of the genus Aeromonas. (United States)

    Martinez-Murcia, Antonio J; Monera, Arturo; Saavedra, M Jose; Oncina, Remedios; Lopez-Alvarez, Monserrate; Lara, Erica; Figueras, M Jose


    A broad multilocus phylogenetic analysis (MLPA) of the representative diversity of a genus offers the opportunity to incorporate concatenated inter-species phylogenies into bacterial systematics. Recent analyses based on single housekeeping genes have provided coherent phylogenies of Aeromonas. However, to date, a multi-gene phylogenetic analysis has never been tackled. In the present study, the intra- and inter-species phylogenetic relationships of 115 strains representing all Aeromonas species described to date were investigated by MLPA. The study included the independent analysis of seven single gene fragments (gyrB, rpoD, recA, dnaJ, gyrA, dnaX, and atpD), and the tree resulting from the concatenated 4705 bp sequence. The phylogenies obtained were consistent with each other, and clustering agreed with the Aeromonas taxonomy recognized to date. The highest clustering robustness was found for the concatenated tree (i.e. all Aeromonas species split into 100% bootstrap clusters). Both possible chronometric distortions and poor resolution encountered when using single-gene analysis were buffered in the concatenated MLPA tree. However, reliable phylogenetic species delineation required an MLPA including several "bona fide" strains representing all described species.

  5. The phylogenetics of succession can guide restoration

    DEFF Research Database (Denmark)

    Shooner, Stephanie; Chisholm, Chelsea Lee; Davies, T. Jonathan


    Phylogenetic tools have increasingly been used in community ecology to describe the evolutionary relationships among co-occurring species. In studies of succession, such tools may allow us to identify the evolutionary lineages most suited for particular stages of succession and habitat rehabilita...

  6. Quantifying MCMC exploration of phylogenetic tree space. (United States)

    Whidden, Chris; Matsen, Frederick A


    In order to gain an understanding of the effectiveness of phylogenetic Markov chain Monte Carlo (MCMC), it is important to understand how quickly the empirical distribution of the MCMC converges to the posterior distribution. In this article, we investigate this problem on phylogenetic tree topologies with a metric that is especially well suited to the task: the subtree prune-and-regraft (SPR) metric. This metric directly corresponds to the minimum number of MCMC rearrangements required to move between trees in common phylogenetic MCMC implementations. We develop a novel graph-based approach to analyze tree posteriors and find that the SPR metric is much more informative than simpler metrics that are unrelated to MCMC moves. In doing so, we show conclusively that topological peaks do occur in Bayesian phylogenetic posteriors from real data sets as sampled with standard MCMC approaches, investigate the efficiency of Metropolis-coupled MCMC (MCMCMC) in traversing the valleys between peaks, and show that conditional clade distribution (CCD) can have systematic problems when there are multiple peaks.

  7. Constructing Student Problems in Phylogenetic Tree Construction. (United States)

    Brewer, Steven D.

    Evolution is often equated with natural selection and is taught from a primarily functional perspective while comparative and historical approaches, which are critical for developing an appreciation of the power of evolutionary theory, are often neglected. This report describes a study of expert problem-solving in phylogenetic tree construction.…

  8. pplacer: linear time maximum-likelihood and Bayesian phylogenetic placement of sequences onto a fixed reference tree

    Directory of Open Access Journals (Sweden)

    Kodner Robin B


    Background: Likelihood-based phylogenetic inference is generally considered to be the most reliable classification method for unknown sequences. However, traditional likelihood-based phylogenetic methods cannot be applied to large volumes of short reads from next-generation sequencing due to computational complexity issues and lack of phylogenetic signal. "Phylogenetic placement," where a reference tree is fixed and the unknown query sequences are placed onto the tree via a reference alignment, is a way to bring the inferential power offered by likelihood-based approaches to large data sets. Results: This paper introduces pplacer, a software package for phylogenetic placement and subsequent visualization. The algorithm can place twenty thousand short reads on a reference tree of one thousand taxa per hour per processor, has essentially linear time and memory complexity in the number of reference taxa, and is easy to run in parallel. Pplacer features calculation of the posterior probability of a placement on an edge, which is a statistically rigorous way of quantifying uncertainty on an edge-by-edge basis. It also can inform the user of the positional uncertainty for query sequences by calculating expected distance between placement locations, which is crucial in the estimation of uncertainty with a well-sampled reference tree. The software provides visualizations using branch thickness and color to represent number of placements and their uncertainty. A simulation study using reads generated from 631 COG alignments shows a high level of accuracy for phylogenetic placement over a wide range of alignment diversity, and the power of edge uncertainty estimates to measure placement confidence. Conclusions: Pplacer enables efficient phylogenetic placement and subsequent visualization, making likelihood-based phylogenetics methodology practical for large collections of reads; it is freely available as source code, binaries, and a web service.
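
    To illustrate what edge-by-edge placement uncertainty can look like, here is a simplified Python sketch. It is not pplacer's implementation: it only normalizes hypothetical per-edge log-likelihoods into weights that sum to one, which amounts to assuming a flat prior over edges. The function name, edge labels, and log-likelihood values are all invented for illustration.

```python
import math


def likelihood_weight_ratios(log_likelihoods: dict[str, float]) -> dict[str, float]:
    """Normalize per-edge log-likelihoods into weights that sum to 1.

    Subtracting the maximum before exponentiating avoids underflow, since
    phylogenetic log-likelihoods are typically large negative numbers.
    """
    peak = max(log_likelihoods.values())
    unnormalized = {edge: math.exp(ll - peak) for edge, ll in log_likelihoods.items()}
    total = sum(unnormalized.values())
    return {edge: weight / total for edge, weight in unnormalized.items()}


# Hypothetical log-likelihoods of one query read placed on three reference edges.
example = {"edge_3": -1000.0, "edge_7": -1000.0, "edge_9": -1005.0}
weights = likelihood_weight_ratios(example)
best_edge = max(weights, key=weights.get)
```

    In this toy case two edges fit the read equally well, so each receives just under half of the weight, and the read cannot be assigned confidently to a single edge.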

  9. Threat diversity will erode mammalian phylogenetic diversity in the near future.

    Directory of Open Access Journals (Sweden)

    Clémentine M A Jono

    To reduce the accelerating rate of phylogenetic diversity loss, many studies have searched for mechanisms that could explain why certain species are at risk, whereas others are not. In particular, it has been demonstrated that species might be affected by both extrinsic threat factors as well as intrinsic biological traits that could render a species more sensitive to extinction; here, we focus on extrinsic factors. Recently, the International Union for Conservation of Nature developed a new classification of threat types, including climate change, urbanization, pollution, agriculture and aquaculture, and harvesting/hunting. We have used this new classification to analyze two main factors that could explain the expected future loss of mammalian phylogenetic diversity: 1. differences in the type of threats that affect mammals and 2. differences in the number of major threats that accumulate for a single species. Our results showed that Cetartiodactyla, Diprotodontia, Monotremata, Perissodactyla, Primates, and Proboscidea could lose a high proportion of their current phylogenetic diversity in the coming decades. In contrast, Chiroptera, Didelphimorphia, and Rodentia could lose less phylogenetic diversity than expected if extinctions were random. Some mammalian clades, including Marsupiala, Chiroptera, and a subclade of Primates, are affected by particular threat types, most likely due solely to their geographic locations and associations with particular habitats. However, regardless of the geography, habitat, and taxon considered, it is not the threat type, but the threat diversity that determines the extinction risk for species and clades. Thus, some mammals might be randomly located in areas subjected to a large diversity of threats; they might also accumulate detrimental traits that render them sensitive to different threats, which is a characteristic that could be associated with large body size. Any action reducing threat diversity is

  10. 38 CFR 4.46 - Accurate measurement. (United States)


    ... 38 Pensions, Bonuses, and Veterans' Relief 1 2010-07-01 2010-07-01 false Accurate measurement. 4... RATING DISABILITIES Disability Ratings The Musculoskeletal System § 4.46 Accurate measurement. Accurate measurement of the length of stumps, excursion of joints, dimensions and location of scars with respect...

  11. Phyloclimatic modeling: combining phylogenetics and bioclimatic modeling. (United States)

    Yesson, C; Culham, A


    We investigate the impact of past climates on plant diversification by tracking the "footprint" of climate change on a phylogenetic tree. Diversity within the cosmopolitan carnivorous plant genus Drosera (Droseraceae) is focused within Mediterranean climate regions. We explore whether this diversity is temporally linked to Mediterranean-type climatic shifts of the mid-Miocene and whether climate preferences are conservative over phylogenetic timescales. Phyloclimatic modeling combines environmental niche (bioclimatic) modeling with phylogenetics in order to study evolutionary patterns in relation to climate change. We present the largest and most complete such example to date using Drosera. The bioclimatic models of extant species demonstrate clear phylogenetic patterns; this is particularly evident for the tuberous sundews from southwestern Australia (subgenus Ergaleium). We employ a method for establishing confidence intervals of node ages on a phylogeny using replicates from a Bayesian phylogenetic analysis. This chronogram shows that many clades, including subgenus Ergaleium and section Bryastrum, diversified during the establishment of the Mediterranean-type climate. Ancestral reconstructions of bioclimatic models demonstrate a pattern of preference for this climate type within these groups. Ancestral bioclimatic models are projected into palaeo-climate reconstructions for the time periods indicated by the chronogram. We present two such examples that each generate plausible estimates of ancestral lineage distribution, which are similar to their current distributions. This is the first study to attempt bioclimatic projections on evolutionary time scales. The sundews appear to have diversified in response to local climate development. Some groups are specialized for Mediterranean climates, others show wide-ranging generalism. This demonstrates that Phyloclimatic modeling could be repeated for other plant groups and is fundamental to the understanding of

  12. Ontologies vs. Classification Systems

    DEFF Research Database (Denmark)

    Madsen, Bodil Nistrup; Erdman Thomsen, Hanne


    What is an ontology compared to a classification system? Is a taxonomy a kind of classification system or a kind of ontology? These are questions that we meet when working with people from industry and public authorities, who need methods and tools for concept clarification, for developing meta data sets or for obtaining advanced search facilities. In this paper we will present an attempt at answering these questions. We will give a presentation of various types of ontologies and briefly introduce terminological ontologies. Furthermore we will argue that classification systems, e.g. product...... classification systems and meta data taxonomies, should be based on ontologies.

  13. AGN Zoo and Classifications of Active Galaxies (United States)

    Mickaelian, Areg M.


    We review the variety of Active Galactic Nuclei (AGN) classes (so-called "AGN zoo") and classification schemes of galaxies by activity types based on their optical emission-line spectrum, as well as other parameters and other than optical wavelength ranges. A historical overview of discoveries of various types of active galaxies is given, including Seyfert galaxies, radio galaxies, QSOs, BL Lacertae objects, Starbursts, LINERs, etc. Various kinds of AGN diagnostics are discussed. All known AGN types and subtypes are presented and described to have a homogeneous classification scheme based on the optical emission-line spectra and in many cases, also other parameters. Problems connected with accurate classifications and open questions related to AGN and their classes are discussed and summarized.

  14. Text Classification Using Sentential Frequent Itemsets

    Institute of Scientific and Technical Information of China (English)

    Shi-Zhu Liu; He-Ping Hu


    Text classification techniques mostly rely on single term analysis of the document data set, while more concepts, especially the specific ones, are usually conveyed by sets of terms. To achieve a more accurate text classifier, more informative features, including frequent co-occurring words in the same sentence and their weights, are particularly important in such scenarios. In this paper, we propose a novel approach using the sentential frequent itemset, a concept that comes from association rule mining, for text classification, which views a sentence rather than a document as a transaction, and uses a variable precision rough set based method to evaluate each sentential frequent itemset's contribution to the classification. Experiments over the Reuters and newsgroup corpora are carried out, which validate the practicability of the proposed system.
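
    The idea of treating each sentence as a transaction can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' system: it enumerates word sets by brute force rather than with a dedicated mining algorithm such as Apriori, it omits the rough set based weighting, and the function names, the `min_support` threshold, and the sample text are all hypothetical.

```python
import re
from collections import Counter
from itertools import combinations


def sentence_transactions(text: str) -> list[set[str]]:
    """Split text into sentences and turn each one into a set of lowercase words."""
    transactions = []
    for sentence in re.split(r"[.!?]+", text.lower()):
        words = set(re.findall(r"[a-z]+", sentence))
        if words:
            transactions.append(words)
    return transactions


def frequent_itemsets(transactions, min_support: int, max_size: int = 2) -> dict:
    """Count word sets of up to max_size words that co-occur in one sentence,
    keeping those found in at least min_support sentences."""
    counts = Counter()
    for words in transactions:
        ordered = sorted(words)  # sorted so each itemset has one canonical tuple
        for size in range(1, max_size + 1):
            for itemset in combinations(ordered, size):
                counts[itemset] += 1
    return {itemset: n for itemset, n in counts.items() if n >= min_support}


doc = "Oil prices rose. Crude oil prices fell sharply. Wheat exports rose."
transactions = sentence_transactions(doc)
frequent = frequent_itemsets(transactions, min_support=2)
```

    Here the pair ("oil", "prices") is frequent because both words appear together in two sentences, whereas "oil" and "rose" each occur twice in the document but share only one sentence, so a document-level transaction would have counted their co-occurrence differently.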

  15. Information gathering for CLP classification


    Ida Marcello; Felice Giordano; Francesca Marina Costamagna


    Regulation 1272/2008 includes provisions for two types of classification: harmonised classification and self-classification. The harmonised classification of substances is decided at Community level and a list of harmonised classifications is included in the Annex VI of the classification, labelling and packaging Regulation (CLP). If a chemical substance is not included in the harmonised classification list it must be self-classified, based on available information, according to the requireme...

  16. Classification of Spreadsheet Errors


    Rajalingham, Kamalasen; Chadwick, David R.; Knight, Brian


    This paper describes a framework for a systematic classification of spreadsheet errors. This classification or taxonomy of errors is aimed at facilitating analysis and comprehension of the different types of spreadsheet errors. The taxonomy is an outcome of an investigation of the widespread problem of spreadsheet errors and an analysis of specific types of these errors. This paper contains a description of the various elements and categories of the classification and is supported by appropri...

  17. Study for Updated Gout Classification Criteria

    DEFF Research Database (Denmark)

    Taylor, William J; Fransen, Jaap; Jansen, Tim L


    OBJECTIVE: To determine which clinical, laboratory, and imaging features most accurately distinguished gout from non-gout. METHODS: We performed a cross-sectional study of consecutive rheumatology clinic patients with ≥1 swollen joint or subcutaneous tophus. Gout was defined by synovial fluid or ...... (MTP1) joint ever involved (multivariate OR 2.30), location of currently tender joints in other foot/ankle (multivariate OR 2.28) or MTP1 joint (multivariate OR 2.82), serum urate level >6 mg/dl (0.36 mmoles/liter; multivariate OR 3.35), ultrasound double contour sign (multivariate OR 7...... been identified for further evaluation for new gout classification criteria. Ultrasound findings and degree of uricemia add discriminating value, and will significantly contribute to more accurate classification criteria....

  18. Automatic lexical classification: bridging research and practice. (United States)

    Korhonen, Anna


    Natural language processing (NLP)--the automatic analysis, understanding and generation of human language by computers--is vitally dependent on accurate knowledge about words. Because words change their behaviour between text types, domains and sub-languages, a fully accurate static lexical resource (e.g. a dictionary, word classification) is unattainable. Researchers are now developing techniques that could be used to automatically acquire or update lexical resources from textual data. If successful, the automatic approach could considerably enhance the accuracy and portability of language technologies, such as machine translation, text mining and summarization. This paper reviews the recent and on-going research in automatic lexical acquisition. Focusing on lexical classification, it discusses the many challenges that still need to be met before the approach can benefit NLP on a large scale.

  19. A new measure to study phylogenetic relations in the brown algal order Ectocarpales: The "codon impact parameter"

    Indian Academy of Sciences (India)

    Smarajit Das; Jayprokas Chakrabarti; Zhumur Ghosh; Satyabrata Sahoo; Bibekanand Mallick


    We analyse forty-seven chloroplast genes of the large subunit of RuBisCO, from the algal order Ectocarpales, sourced from GenBank. Codon-usage weighted by the nucleotide base-bias defines our score called the codon-impact-parameter. This score is used to obtain phylogenetic relations amongst the 47 Ectocarpales. We compare our classification with the ones done earlier.

  20. Independent Comparison of Popular DPI Tools for Traffic Classification

    DEFF Research Database (Denmark)

    Bujlow, Tomasz; Carela-Español, Valentín; Barlet-Ros, Pere


    Deep Packet Inspection (DPI) is the state-of-the-art technology for traffic classification. According to the conventional wisdom, DPI is the most accurate classification technique. Consequently, most popular products, either commercial or open-source, rely on some sort of DPI for traffic classifi......, application and web service). We carefully built a labeled dataset with more than 750K flows, which contains traffic from popular applications. We used the Volunteer-Based System (VBS), developed at Aalborg University, to guarantee the correct labeling of the dataset. We released this dataset, including full...

  1. Metrics for phylogenetic networks II: nodal and triplets metrics. (United States)

    Cardona, Gabriel; Llabrés, Mercè; Rosselló, Francesc; Valiente, Gabriel


    The assessment of phylogenetic network reconstruction methods requires the ability to compare phylogenetic networks. This is the second in a series of papers devoted to the analysis and comparison of metrics for tree-child time consistent phylogenetic networks on the same set of taxa. In this paper, we generalize to phylogenetic networks two metrics that have already been introduced in the literature for phylogenetic trees: the nodal distance and the triplets distance. We prove that they are metrics on any class of tree-child time consistent phylogenetic networks on the same set of taxa, as well as some basic properties for them. To prove these results, we introduce a reduction/expansion procedure that can be used not only to establish properties of tree-child time consistent phylogenetic networks by induction, but also to generate all tree-child time consistent phylogenetic networks with a given number of leaves.

  2. Marine turtle mitogenome phylogenetics and evolution

    DEFF Research Database (Denmark)

    Duchene, S.; Frey, A.; Alfaro-Núñez, A.;


    The sea turtles are a group of Cretaceous origin containing seven recognized living species: leatherback, hawksbill, Kemp's ridley, olive ridley, loggerhead, green, and flatback. The leatherback is the single member of the Dermochelyidae family, whereas all other sea turtles belong in Cheloniidae...... distributions, shedding light on complex migration patterns and possible geographic or climatic events as driving forces of sea-turtle distribution. We have sequenced complete mitogenomes for all sea-turtle species, including samples from their geographic range extremes, and performed phylogenetic analyses...... to assess sea-turtle evolution with a large molecular dataset. We found variation in the length of the ATP8 gene and a highly variable site in ND4 near a proton translocation channel in the resulting protein. Complete mitogenomes show strong support and resolution for phylogenetic relationships among all...

  3. Morphological Phylogenetics in the Genomic Age. (United States)

    Lee, Michael S Y; Palci, Alessandro


    Evolutionary trees underpin virtually all of biology, and the wealth of new genomic data has enabled us to reconstruct them with increasing detail and confidence. While phenotypic (typically morphological) traits are becoming less important in reconstructing evolutionary trees, they still serve vital and unique roles in phylogenetics, even for living taxa for which vast amounts of genetic information are available. Morphology remains a powerful independent source of evidence for testing molecular clades, and - through fossil phenotypes - the primary means for time-scaling phylogenies. Morphological phylogenetics is therefore vital for transforming undated molecular topologies into dated evolutionary trees. However, if morphology is to be employed to its full potential, biologists need to start scrutinising phenotypes in a more objective fashion, models of phenotypic evolution need to be improved, and approaches for analysing phenotypic traits and fossils together with genomic data need to be refined.

  4. Alignment-free phylogenetics and population genetics. (United States)

    Haubold, Bernhard


    Phylogenetics and population genetics are central disciplines in evolutionary biology. Both are based on comparative data, today usually DNA sequences. These have become so plentiful that alignment-free sequence comparison is of growing importance in the race between scientists and sequencing machines. In phylogenetics, efficient distance computation is the major contribution of alignment-free methods. A distance measure should reflect the number of substitutions per site, which underlies classical alignment-based phylogeny reconstruction. Alignment-free distance measures are either based on word counts or on match lengths, and I apply examples of both approaches to simulated and real data to assess their accuracy and efficiency. While phylogeny reconstruction is based on the number of substitutions, in population genetics, the distribution of mutations along a sequence is also considered. This distribution can be explored by match lengths, thus opening the prospect of alignment-free population genomics.
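
    The word-count family of alignment-free distances mentioned above can be illustrated with a minimal sketch. It compares the relative k-mer frequency profiles of two sequences using a plain Euclidean distance. The function names, the default k = 3, and the choice of Euclidean distance are assumptions made for illustration; this is not the author's estimator, and it does not estimate the number of substitutions per site.

```python
from collections import Counter
from math import sqrt


def kmer_counts(seq: str, k: int) -> Counter:
    """Count every overlapping word (k-mer) of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def word_count_distance(a: str, b: str, k: int = 3) -> float:
    """Euclidean distance between the relative k-mer frequencies of a and b.

    Illustrative only: a generic word-frequency distance, not a
    substitutions-per-site estimate.
    """
    ca, cb = kmer_counts(a, k), kmer_counts(b, k)
    na, nb = sum(ca.values()), sum(cb.values())
    if na == 0 or nb == 0:
        raise ValueError("both sequences must contain at least k characters")
    # A Counter returns 0 for words that are absent from one sequence.
    words = set(ca) | set(cb)
    return sqrt(sum((ca[w] / na - cb[w] / nb) ** 2 for w in words))
```

    Because no alignment is computed, the cost grows linearly with sequence length, which is the efficiency argument the abstract makes for this class of methods.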

  5. Phylogenetic paleobiogeography of Late Ordovician Laurentian brachiopods

    Directory of Open Access Journals (Sweden)

    Jennifer E. Bauer


    Phylogenetic biogeographic analysis of four brachiopod genera was used to uncover large-scale geologic drivers of Late Ordovician biogeographic differentiation in Laurentia. Previously generated phylogenetic hypotheses were converted into area cladograms, ancestral geographic ranges were optimized and speciation events characterized as via dispersal or vicariance, when possible. Area relationships were reconstructed using Lieberman-modified Brooks Parsimony Analysis. The resulting area cladograms indicate tectonic and oceanographic changes were the primary geologic drivers of biogeographic patterns within the focal taxa. The Taconic tectophase contributed to the separation of the Appalachian and Central basins as well as the two midcontinent basins, whereas sea level rise following the Boda Event promoted interbasinal dispersal. Three migration pathways into the Cincinnati Basin were recognized, which supports the multiple pathway hypothesis for the Richmondian Invasion.

  6. Multiple sparse representations classification

    NARCIS (Netherlands)

    E. Plenge (Esben); S.K. Klein (Stefan); W.J. Niessen (Wiro); E. Meijering (Erik)


    Sparse representations classification (SRC) is a powerful technique for pixelwise classification of images and it is increasingly being used for a wide variety of image analysis tasks. The method uses sparse representation and learned redundant dictionaries to classify image pixels. In t

  7. Library Classification 2020 (United States)

    Harris, Christopher


    In this article the author explores how a new library classification system might be designed using some aspects of the Dewey Decimal Classification (DDC) and ideas from other systems to create something that works for school libraries in the year 2020. By examining what works well with the Dewey Decimal System, what features should be carried…

  8. MINER: software for phylogenetic motif identification


    La, David; Livesay, Dennis R.


    MINER is web-based software for phylogenetic motif (PM) identification. PMs are sequence regions (fragments) that conserve the overall familial phylogeny. PMs have been shown to correspond to a wide variety of catalytic regions, substrate-binding sites and protein interfaces, making them ideal functional site predictions. The MINER output provides an intuitive interface for interactive PM sequence analysis and structural visualization. The web implementation of MINER is freely available at . ...

  9. Phylogenetics and Computational Biology of Multigene Families (United States)

    Liò, Pietro; Brilli, Matteo; Fani, Renato

    This chapter introduces the study of the major evolutionary forces operating in large gene families. The reconstruction of duplication history and phylogenetic analysis provide an interpretative framework of the evolution of multigene families. We present here two case studies, the first coming from Eukaryotes (chemokine receptors) and the second from Prokaryotes (TIM barrel proteins), showing how functional and structural constraints have shaped gene duplication events.

  10. Phylogenetic estimation with partial likelihood tensors

    CERN Document Server

    Sumner, J G


    We present an alternative method for calculating likelihoods in molecular phylogenetics. Our method is based on partial likelihood tensors, which are generalizations of partial likelihood vectors, as used in Felsenstein's approach. Exploiting a lexicographic sorting and partial likelihood tensors, it is possible to obtain significant computational savings. We show this on a range of simulated data by enumerating all numerical calculations that are required by our method and the standard approach.
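
    For orientation, the standard approach that the abstract takes as its baseline (Felsenstein's pruning with partial likelihood vectors) can be sketched for a single alignment site as follows. The Jukes-Cantor substitution model, the nested-list tree encoding, and all function names are assumptions chosen to keep the example short. The tensor-based method itself is not reproduced here.

```python
import math

BASES = "ACGT"


def jc69(t: float):
    """Jukes-Cantor transition matrix for branch length t (substitutions per site)."""
    e = math.exp(-4.0 * t / 3.0)
    same, diff = 0.25 + 0.75 * e, 0.25 - 0.25 * e
    return [[same if i == j else diff for j in range(4)] for i in range(4)]


def partial(node):
    """Partial likelihood vector of a node.

    A leaf is a single base letter; an internal node is a list of
    (child, branch_length) pairs (a hypothetical encoding for this sketch).
    """
    if isinstance(node, str):
        return [1.0 if b == node else 0.0 for b in BASES]
    vec = [1.0] * 4
    for child, t in node:
        p, child_vec = jc69(t), partial(child)
        for i in range(4):
            # Sum over the child's possible states, given parent state i.
            vec[i] *= sum(p[i][j] * child_vec[j] for j in range(4))
    return vec


def site_likelihood(root) -> float:
    """Likelihood of one site, using uniform base frequencies at the root."""
    return sum(0.25 * x for x in partial(root))
```

    For example, `site_likelihood([("A", 0.1), ([("C", 0.05), ("C", 0.05)], 0.2)])` evaluates a three-leaf tree. Each internal node costs a fixed number of multiplications per state, which is the calculation count the abstract compares against.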

  11. Enhancing Accuracy of Plant Leaf Classification Techniques

    Directory of Open Access Journals (Sweden)

    C. S. Sumathi


    Plants have become an important source of energy, and are a fundamental piece in the puzzle to solve the problem of global warming. Living beings also depend on plants for their food, hence it is of great importance to know about the plants growing around us and to preserve them. Automatic plant leaf classification is widely researched. This paper investigates the efficiency of MLP learning algorithms for plant leaf classification. Incremental back propagation, Levenberg–Marquardt and batch propagation learning algorithms are investigated. Plant leaf images are examined using three different Multi-Layer Perceptron (MLP) modelling techniques. Back propagation done in batch manner increases the accuracy of plant leaf classification. Results reveal that batch training is faster and more accurate than MLP with incremental training and Levenberg–Marquardt based learning for plant leaf classification. Various levels of semi-batch training used on 9 species of 15 samples each, a total of 135 instances, show a roughly linear increase in classification accuracy.

  12. Classifier in Age classification

    Directory of Open Access Journals (Sweden)

    B. Santhi


    The face is an important feature of human beings, and various properties of a person can be derived by analyzing it. The objective of the study is to design a classifier for age using facial images. Age classification is essential in many applications like crime detection, employment and face detection. The proposed algorithm contains four phases: preprocessing, feature extraction, feature selection and classification. The classification employs two class labels, namely child and old. This study addresses the limitations in the existing classifiers, as it uses the Grey Level Co-occurrence Matrix (GLCM) for feature extraction and Support Vector Machine (SVM) for classification. This improves the accuracy of the classification as it outperforms the existing methods.

  13. Uncertain-tree: discriminating among competing approaches to the phylogenetic analysis of phenotype data (United States)

    Tanner, Alastair R.; Fleming, James F.; Tarver, James E.; Pisani, Davide


    Morphological data provide the only means of classifying the majority of life's history, but the choice between competing phylogenetic methods for the analysis of morphology is unclear. Traditionally, parsimony methods have been favoured but recent studies have shown that these approaches are less accurate than the Bayesian implementation of the Mk model. Here we expand on these findings in several ways: we assess the impact of tree shape and maximum-likelihood estimation using the Mk model, as well as analysing data composed of both binary and multistate characters. We find that all methods struggle to correctly resolve deep clades within asymmetric trees, and when analysing small character matrices. The Bayesian Mk model is the most accurate method for estimating topology, but with lower resolution than other methods. Equal weights parsimony is more accurate than implied weights parsimony, and maximum-likelihood estimation using the Mk model is the least accurate method. We conclude that the Bayesian implementation of the Mk model should be the default method for phylogenetic estimation from phenotype datasets, and we explore the implications of our simulations in reanalysing several empirical morphological character matrices. A consequence of our finding is that high levels of resolution or the ability to classify species or groups with much confidence should not be expected when using small datasets. It is now necessary to depart from the traditional parsimony paradigms of constructing character matrices, towards datasets constructed explicitly for Bayesian methods. PMID:28077778

  14. Phylogenetic analysis of cubilin (CUBN) gene. (United States)

    Shaik, Abjal Pasha; Alsaeed, Abbas H; Kiranmayee, S; Bammidi, Vk; Sultana, Asma


    Cubilin (CUBN; also known as intrinsic factor-cobalamin receptor [Homo sapiens Entrez Pubmed ref NM_001081.3; NG_008967.1; GI: 119606627]), located in the epithelium of intestine and kidney, acts as a receptor for intrinsic factor - vitamin B12 complexes. Mutations in CUBN may play a role in autosomal recessive megaloblastic anemia. The current study investigated the possible role of CUBN in evolution using phylogenetic testing. A total of 588 BLAST hits were found for the cubilin query sequence and these hits showed putative conserved domain, CUB superfamily (as of 27th Nov 2012). A first-pass phylogenetic tree was constructed to identify the taxa which most often contained the CUBN sequences. Following this, we narrowed down the search by manually deleting sequences which were not CUBN. A repeat phylogenetic analysis of 25 taxa was performed using PhyML, RAxML and TreeDyn software to confirm that CUBN is a conserved protein emphasizing its importance as an extracellular domain and being present in proteins mostly known to be involved in development in many chordate taxa but not found in prokaryotes, plants and yeast. No horizontal gene transfers have been found between different taxa.

  15. Incongruencies in Vaccinia Virus Phylogenetic Trees

    Directory of Open Access Journals (Sweden)

    Chad Smithson


    Over the years, as more complete poxvirus genomes have been sequenced, phylogenetic studies of these viruses have become more prevalent. In general, the results show similar relationships between the poxvirus species; however, some inconsistencies are notable. Previous analyses of the viral genomes contained within the vaccinia virus (VACV)-Dryvax vaccine revealed that their phylogenetic relationships were sometimes clouded by low bootstrapping confidence. To analyze the VACV-Dryvax genomes in detail, a new tool-set was developed and integrated into the Base-By-Base bioinformatics software package. Analyses showed that fewer unique positions were present in each VACV-Dryvax genome than expected. A series of patterns, each containing several single nucleotide polymorphisms (SNPs), were identified that were counter to the results of the phylogenetic analysis. The VACV genomes were found to contain short DNA sequence blocks that matched more distantly related clades. Additionally, similar non-conforming SNP patterns were observed in (1) the variola virus clade; (2) some cowpox clades; and (3) VACV-CVA, the direct ancestor of VACV-MVA. Thus, traces of past recombination events are common in the various orthopoxvirus clades, including those associated with smallpox and cowpox viruses.

  16. Phylogenetic conservatism of environmental niches in mammals. (United States)

    Cooper, Natalie; Freckleton, Rob P; Jetz, Walter


    Phylogenetic niche conservatism is the pattern where close relatives occupy similar niches, whereas distant relatives are more dissimilar. We suggest that niche conservatism will vary across clades in relation to their characteristics. Specifically, we investigate how conservatism of environmental niches varies among mammals according to their latitude, range size, body size and specialization. We use the Brownian rate parameter, σ², to measure the rate of evolution in key variables related to the ecological niche and define the more conserved group as the one with the slower rate of evolution. We find that tropical, small-ranged and specialized mammals have more conserved thermal niches than temperate, large-ranged or generalized mammals. Partitioning niche conservatism into its spatial and phylogenetic components, we find that spatial effects on niche variables are generally greater than phylogenetic effects. This suggests that recent evolution and dispersal have more influence on species' niches than more distant evolutionary events. These results have implications for our understanding of the role of niche conservatism in species richness patterns and for gauging the potential for species to adapt to global change.

  17. Photometric Supernova Classification with Machine Learning (United States)

    Lochner, Michelle; McEwen, Jason D.; Peiris, Hiranya V.; Lahav, Ofer; Winter, Max K.


    Automated photometric supernova classification has become an active area of research in recent years in light of current and upcoming imaging surveys such as the Dark Energy Survey (DES) and the Large Synoptic Survey Telescope, given that spectroscopic confirmation of type for all supernovae discovered will be impossible. Here, we develop a multi-faceted classification pipeline, combining existing and new approaches. Our pipeline consists of two stages: extracting descriptive features from the light curves and classification using a machine learning algorithm. Our feature extraction methods vary from model-dependent techniques, namely SALT2 fits, to more independent techniques that fit parametric models to curves, to a completely model-independent wavelet approach. We cover a range of representative machine learning algorithms, including naive Bayes, k-nearest neighbors, support vector machines, artificial neural networks, and boosted decision trees (BDTs). We test the pipeline on simulated multi-band DES light curves from the Supernova Photometric Classification Challenge. Using the commonly used area under the curve (AUC) of the Receiver Operating Characteristic as a metric, we find that the SALT2 fits and the wavelet approach, with the BDTs algorithm, each achieve an AUC of 0.98, where 1 represents perfect classification. We find that a representative training set is essential for good classification, whatever the feature set or algorithm, with implications for spectroscopic follow-up. Importantly, we find that by using either the SALT2 or the wavelet feature sets with a BDT algorithm, accurate classification is possible purely from light curve data, without the need for any redshift information.
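
    The AUC metric quoted above has a simple rank interpretation: it is the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as one half. The sketch below computes it by direct pairwise comparison. The function name and the toy labels and scores are illustrative assumptions, not data or code from the paper.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    labels: iterable of 0/1 class labels; scores: classifier scores,
    where a higher score means "more likely positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Each (positive, negative) pair contributes 1 for a win and 0.5 for a tie.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: three of the four positive/negative pairs are ordered correctly.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

    The pairwise form is quadratic in the number of examples. It is meant to make the definition concrete, and a sort-based rank-sum computation would be preferable for large test sets.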

  18. Multisensor Target Detection And Classification (United States)

    Ruck, Dennis W.; Rogers, Steven K.; Mills, James P.; Kabrisky, Matthew


    In this paper a new approach to the detection and classification of tactical targets using a multifunction laser radar sensor is developed. Targets of interest are tanks, jeeps, trucks, and other vehicles. Doppler images are segmented by developing a new technique which compensates for spurious doppler returns. Relative range images are segmented using an approach based on range gradients. The resultant shapes in the segmented images are then classified using Zernike moment invariants as shape descriptors. Two classification decision rules are implemented: a classical statistical nearest-neighbor approach and a multilayer perceptron architecture. The doppler segmentation algorithm was applied to a set of 180 real sensor images. An accurate segmentation was obtained for 89 percent of the images. The new doppler segmentation proved to be a robust method, and the moment invariants were effective in discriminating the tactical targets. Tanks were classified correctly 86 percent of the time. The most important result of this research is the demonstration of the use of a new information processing architecture for image processing applications.

  19. Kappa Coefficients for Circular Classifications

    NARCIS (Netherlands)

    Warrens, Matthijs J.; Pratiwi, Bunga C.


    Circular classifications are classification scales with categories that exhibit a certain periodicity. Since linear scales have endpoints, the standard weighted kappas used for linear scales are not appropriate for analyzing agreement between two circular classifications. A family of kappa coefficie

  20. Classification of pmoA amplicon pyrosequences using BLAST and the lowest common ancestor method in MEGAN

    Directory of Open Access Journals (Sweden)

    Marc Gregory Dumont


    The classification of high-throughput sequencing data of protein-encoding genes is not as well established as for 16S rRNA. The objective of this work was to develop a simple and accurate method of classifying large datasets of pmoA sequences, a common marker for methanotrophic bacteria. A taxonomic system for pmoA was developed based on a phylogenetic analysis of available sequences. The taxonomy incorporates the known diversity of pmoA present in public databases, including both sequences from cultivated and uncultivated organisms. Representative sequences from closely related genes, such as those encoding the bacterial ammonia monooxygenase, were also included in the pmoA taxonomy. In total, 53 low-level taxa (genus-level) are included. Using previously published datasets of high-throughput pmoA amplicon sequence data, we tested two approaches for classifying pmoA: a naïve Bayesian classifier and BLAST. Classification of pmoA sequences based on BLAST analyses was performed using the lowest common ancestor (LCA) algorithm in MEGAN, a software program commonly used for the analysis of metagenomic data. Both the naïve Bayesian and BLAST methods were able to classify pmoA sequences and provided similar classifications; however, the naïve Bayesian classifier was prone to misclassifying contaminant sequences present in the datasets. Another advantage of the BLAST/LCA method was that it provided a user-interpretable output and enabled novelty detection at various levels, from highly divergent pmoA sequences to genus-level novelty.
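
    The core idea of lowest-common-ancestor assignment can be sketched as follows: a read is placed at the deepest taxon shared by the lineages of all its accepted database hits. The function name and the placeholder taxon labels are hypothetical. Any filtering of hits by score before this step is omitted, so this should not be read as MEGAN's implementation.

```python
def lowest_common_ancestor(lineages):
    """Return the longest shared root-to-leaf prefix of the given lineages.

    lineages: list of lists, each ordered from the root down to the hit's taxon.
    """
    shared = []
    # zip stops at the shortest lineage, so a hit assigned at a higher rank
    # naturally limits how deep the consensus can go.
    for ranks in zip(*lineages):
        if all(r == ranks[0] for r in ranks):
            shared.append(ranks[0])
        else:
            break
    return shared


# Placeholder lineages for the hits of one read (labels are invented).
hits = [
    ["root", "phylum_X", "family_Y", "genus_A"],
    ["root", "phylum_X", "family_Y", "genus_B"],
]
print(lowest_common_ancestor(hits)[-1])  # family_Y
```

    A read whose hits disagree at genus level is therefore reported at family level instead of being forced into one genus, which is what makes the output interpretable when the sequences are novel.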

  1. A multigene phylogenetic synthesis for the class Lecanoromycetes (Ascomycota): 1307 fungi representing 1139 infrageneric taxa, 317 genera and 66 families. (United States)

    Miadlikowska, Jolanta; Kauff, Frank; Högnabba, Filip; Oliver, Jeffrey C; Molnár, Katalin; Fraker, Emily; Gaya, Ester; Hafellner, Josef; Hofstetter, Valérie; Gueidan, Cécile; Otálora, Mónica A G; Hodkinson, Brendan; Kukwa, Martin; Lücking, Robert; Björk, Curtis; Sipman, Harrie J M; Burgaz, Ana Rosa; Thell, Arne; Passo, Alfredo; Myllys, Leena; Goward, Trevor; Fernández-Brime, Samantha; Hestmark, Geir; Lendemer, James; Lumbsch, H Thorsten; Schmull, Michaela; Schoch, Conrad L; Sérusiaux, Emmanuël; Maddison, David R; Arnold, A Elizabeth; Lutzoni, François; Stenroos, Soili


    The Lecanoromycetes is the largest class of lichenized Fungi, and one of the most species-rich classes in the kingdom. Here we provide a multigene phylogenetic synthesis (using three ribosomal RNA-coding and two protein-coding genes) of the Lecanoromycetes based on 642 newly generated and 3329 publicly available sequences representing 1139 taxa, 317 genera, 66 families, 17 orders and five subclasses (four currently recognized: Acarosporomycetidae, Lecanoromycetidae, Ostropomycetidae, Umbilicariomycetidae; and one provisionarily recognized, 'Candelariomycetidae'). Maximum likelihood phylogenetic analyses on four multigene datasets assembled using a cumulative supermatrix approach with a progressively higher number of species and missing data (5-gene, 5+4-gene, 5+4+3-gene and 5+4+3+2-gene datasets) show that the current classification includes non-monophyletic taxa at various ranks, which need to be recircumscribed and require revisionary treatments based on denser taxon sampling and more loci. Two newly circumscribed orders (Arctomiales and Hymeneliales in the Ostropomycetidae) and three families (Ramboldiaceae and Psilolechiaceae in the Lecanorales, and Strangosporaceae in the Lecanoromycetes inc. sed.) are introduced. The potential resurrection of the families Eigleraceae and Lopadiaceae is considered here to alleviate phylogenetic and classification disparities. An overview of the photobionts associated with the main fungal lineages in the Lecanoromycetes based on available published records is provided. A revised schematic classification at the family level in the phylogenetic context of widely accepted and newly revealed relationships across Lecanoromycetes is included. The cumulative addition of taxa with an increasing amount of missing data (i.e., a cumulative supermatrix approach, starting with taxa for which sequences were available for all five targeted genes and ending with the addition of taxa for which only two genes have been sequenced) revealed

  2. A New Classification Method to Overcome Over-Branching

    Institute of Scientific and Technical Information of China (English)

    ZHOU Aoying(周傲英); QIAN Weining(钱卫宁); QIAN Hailei(钱海蕾); JIN Wen(金文)


    Classification is an important technique in data mining. The decision trees built by most of the existing classification algorithms commonly feature over-branching, which will lead to poor efficiency in the subsequent classification period. In this paper, we present a new value-oriented classification method, which aims at building accurately proper-sized decision trees while reducing over-branching as much as possible, based on the concepts of frequent-pattern-node and exceptive-child-node. The experiments show that while using relevant analysis as pre-processing, our classification method, without loss of accuracy, can eliminate the over-branching greatly in decision trees more effectively and efficiently than other algorithms do.

  3. Intraregional classification of wine via ICP-MS elemental fingerprinting. (United States)

    Coetzee, P P; van Jaarsveld, F P; Vanhaecke, F


    The feasibility of elemental fingerprinting in the classification of wines according to their provenance vineyard soil was investigated in the relatively small geographical area of a single wine district. Results for the Stellenbosch wine district (Western Cape Wine Region, South Africa), comprising an area of less than 1,000 km², suggest that classification of wines from different estates (120 wines from 23 estates) is indeed possible using accurate elemental data and multivariate statistical analysis based on a combination of principal component analysis, cluster analysis, and discriminant analysis. This is the first study to demonstrate the successful classification of wines at estate level in a single wine district in South Africa. The elements B, Ba, Cs, Cu, Mg, Rb, Sr, Tl and Zn were identified as suitable indicators. White and red wines were grouped in separate data sets to allow successful classification of wines. Correlation between wine classification and soil type distributions in the area was observed.

  4. Behavior Based Social Dimensions Extraction for Multi-Label Classification. (United States)

    Li, Le; Xu, Junyi; Xiao, Weidong; Ge, Bin


    Classification based on social dimensions is commonly used to handle the multi-label classification task in heterogeneous networks. However, traditional methods, which mostly rely on the community detection algorithms to extract the latent social dimensions, produce unsatisfactory performance when community detection algorithms fail. In this paper, we propose a novel behavior based social dimensions extraction method to improve the classification performance in multi-label heterogeneous networks. In our method, nodes' behavior features, instead of community memberships, are used to extract social dimensions. By introducing Latent Dirichlet Allocation (LDA) to model the network generation process, nodes' connection behaviors with different communities can be extracted accurately, which are applied as latent social dimensions for classification. Experiments on various public datasets reveal that the proposed method can obtain satisfactory classification results in comparison to other state-of-the-art methods on smaller social dimensions.

  5. Cancer classification based on gene expression using neural networks. (United States)

    Hu, H P; Niu, Z J; Bai, Y P; Tan, X H


    Based on gene expression, we have classified 53 colon cancer patients with UICC II into two groups: relapse and no relapse. Samples were taken from each patient, and gene information was extracted. Of the 53 samples examined, 500 genes were considered proper through analyses by S-Kohonen, BP, and SVM neural networks. Classification accuracy obtained by S-Kohonen neural network reaches 91%, which was more accurate than classification by BP and SVM neural networks. The results show that S-Kohonen neural network is more plausible for classification and has a certain feasibility and validity as compared with BP and SVM neural networks.

  6. Rapid and accurate pyrosequencing of angiosperm plastid genomes

    Directory of Open Access Journals (Sweden)

    Farmerie William G


    Background: Plastid genome sequence information is vital to several disciplines in plant biology, including phylogenetics and molecular biology. The past five years have witnessed a dramatic increase in the number of completely sequenced plastid genomes, fuelled largely by advances in conventional Sanger sequencing technology. Here we report a further significant reduction in time and cost for plastid genome sequencing through the successful use of a newly available pyrosequencing platform, the Genome Sequencer 20 (GS 20) System (454 Life Sciences Corporation), to rapidly and accurately sequence the whole plastid genomes of the basal eudicot angiosperms Nandina domestica (Berberidaceae) and Platanus occidentalis (Platanaceae). Results: More than 99.75% of each plastid genome was simultaneously obtained during two GS 20 sequence runs, to an average depth of coverage of 24.6× in Nandina and 17.3× in Platanus. The Nandina and Platanus plastid genomes shared essentially identical gene complements and possessed the typical angiosperm plastid structure and gene arrangement. To assess the accuracy of the GS 20 sequence, over 45 kilobases of sequence were generated for each genome using conventional sequencing. Overall error rates of 0.043% and 0.031% were observed in GS 20 sequence for Nandina and Platanus, respectively. More than 97% of all observed errors were associated with homopolymer runs, with ~60% of all errors associated with homopolymer runs of 5 or more nucleotides and ~50% of all errors associated with regions of extensive homopolymer runs. No substitution errors were present in either genome. Error rates were generally higher in the single-copy and noncoding regions of both plastid genomes relative to the inverted repeat and coding regions. Conclusion: Highly accurate and essentially complete sequence information was obtained for the Nandina and Platanus plastid genomes using the GS 20 System. More importantly, the high accuracy

  7. A Note on Encodings of Phylogenetic Networks of Bounded Level

    CERN Document Server

    Gambette, Philippe


    Driven by the need for better models that allow one to shed light into the question how life's diversity has evolved, phylogenetic networks have now joined phylogenetic trees in the center of phylogenetics research. Like phylogenetic trees, such networks canonically induce collections of phylogenetic trees, clusters, and triplets, respectively. Thus it is not surprising that many network approaches aim to reconstruct a phylogenetic network from such collections. Related to the well-studied perfect phylogeny problem, the following question is of fundamental importance in this context: When does one of the above collections encode (i.e. uniquely describe) the network that induces it? In this note, we present a complete answer to this question for the special case of a level-1 (phylogenetic) network by characterizing those level-1 networks for which an encoding in terms of one (or equivalently all) of the above collections exists. Given that this type of network forms the first layer of the rich hierarchy of lev...

  8. Multilocus phylogeny of the New-World mud turtles (Kinosternidae) supports the traditional classification of the group. (United States)

    Spinks, Phillip Q; Thomson, Robert C; Gidiş, Müge; Bradley Shaffer, H


    A goal of modern taxonomy is to develop classifications that reflect current phylogenetic relationships and are as stable as possible given the inherent uncertainties in much of the tree of life. Here, we provide an in-depth phylogenetic analysis, based on 14 nuclear loci comprising 10,305 base pairs of aligned sequence data from all but two species of the turtle family Kinosternidae, to determine whether recent proposed changes to the group's classification are justified and necessary. We conclude that those proposed changes were based on (1) mtDNA gene tree anomalies, (2) preliminary analyses that do not fully capture the breadth of geographic variation necessary to motivate taxonomic changes, and (3) changes in rank that are not motivated by non-monophyletic groups. Our recommendation, for this and other similar cases, is that taxonomic changes be made only when phylogenetic results that are statistically well-supported and corroborated by multiple independent lines of genetic evidence indicate that non-monophyletic groups are currently recognized and need to be corrected. We hope that other members of the phylogenetics community will join us in proposing taxonomic changes only when the strongest phylogenetic data demand such changes, and in so doing that we can move toward stable, phylogenetically informed classifications of lasting value.

  9. Phylogenetic relationships of 18 passerines based on Adenylate Kinase Intron 5 sequences

    Institute of Scientific and Technical Information of China (English)

    GUO Hui-yan; YU Hui-xin; BAI Su-ying; MA Yu-kun


    The 18 bird species studied are traditionally assigned to the muscicapids, robins and sylviids among the passerines, but their classification has long been disputed. In this experiment, the phylogenetic relationships of the 18 passerine species were studied using Adenylate Kinase Intron 5 (AK5) sequences and DNA techniques. By comparing the sequences with each other, phylogenetic trees of the 18 species were constructed using the Neighbor-Joining (NJ) and Maximum-Parsimony (MP) methods. The results showed that sylviids should be listed as an independent family, while robins and flycatchers should be placed in Muscicapidae. Since the phylogenetic relationship between long-tailed tits and Old World warblers is closer than that between long-tailed tits and parids, the long-tailed tits should be separated from Paridae and placed in Aegithalidae. Muscicapidae and Paridae are known to be two monophyletic families, but Sylviidae is not a monophyletic group. AK5 sequences were particularly effective at resolving close interspecific relationships within genera.

  10. Diversity of Clonostachys species assessed by molecular phylogenetics and MALDI-TOF mass spectrometry. (United States)

    Abreu, Lucas M; Moreira, Gláucia M; Ferreira, Douglas; Rodrigues-Filho, Edson; Pfenning, Ludwig H


    We assessed the species diversity among 45 strains of Clonostachys from different substrates and localities in Brazil using molecular phylogenetics, and compared the results with the phenotypic classification of strains obtained from matrix-assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF MS). Phylogenetic analyses were based on beta tubulin (Tub), ITS-LSU rDNA, and a combined Tub-ITS DNA dataset. MALDI-TOF MS analyses were performed using intact conidia and conidiophores of strains cultivated on oatmeal agar and 4% malt extract agar. Six known species were identified: Clonostachys byssicola, Clonostachys candelabrum, Clonostachys pseudochroleuca, Clonostachys rhizophaga, Clonostachys rogersoniana, and Clonostachys rosea. Two clades and two singleton lineages did not correspond to known species represented in the reference DNA dataset and were identified as Clonostachys sp. 1-4. Multivariate cluster analyses of MALDI-TOF MS data classified the strains into eight clusters and three singletons, corresponding to the ten identified species plus one additional cluster containing two strains of C. rogersoniana that split from the other co-specific strains. The consistent results of MALDI-TOF MS supported the identification of strains assigned to C. byssicola and C. pseudochroleuca, which did not form well supported clades in all phylogenetic analyses, but formed distinct clusters in the MALDI-TOF dendrograms.

  11. Molecular phylogenetics and historical biogeography amid shifting continents in the cockles and giant clams (Bivalvia: Cardiidae). (United States)

    Herrera, Nathanael D; Ter Poorten, Jan Johan; Bieler, Rüdiger; Mikkelsen, Paula M; Strong, Ellen E; Jablonski, David; Steppan, Scott J


    Reconstructing historical biogeography of the marine realm is complicated by indistinct barriers and, over deeper time scales, a dynamic landscape shaped by plate tectonics. Here we present the most extensive examination of model-based historical biogeography among marine invertebrates to date. We conducted the largest phylogenetic and molecular clock analyses to date for the bivalve family Cardiidae (cockles and giant clams) with three unlinked loci for 110 species representing 37 of the 50 genera. Ancestral ranges were reconstructed using the dispersal-extinction-cladogenesis (DEC) method with a time-stratified paleogeographic model wherein dispersal rates varied with shifting tectonics. Results were compared to previous classifications and the extensive paleontological record. Six of the eight prior subfamily groupings were found to be para- or polyphyletic. Cardiidae originated and subsequently diversified in the tropical Indo-Pacific starting in the Late Triassic. Eastern Atlantic species were mainly derived from the tropical Indo-Mediterranean region via the Tethys Sea. In contrast, the western Atlantic fauna was derived from Indo-Pacific clades. Our phylogenetic results demonstrated greater concordance with geography than did previous phylogenies based on morphology. Time-stratifying the DEC reconstruction improved the fit and was highly consistent with paleo-ocean currents and paleogeography. Lastly, combining molecular phylogenetics with a rich and well-documented fossil record allowed us to test the accuracy and precision of biogeographic range reconstructions.

  12. Molecular phylogenetics unveils the ancient evolutionary origins of the enigmatic fairy armadillos. (United States)

    Delsuc, Frédéric; Superina, Mariella; Tilak, Marie-Ka; Douzery, Emmanuel J P; Hassanin, Alexandre


    Fairy armadillos or pichiciegos (Xenarthra, Dasypodidae) are among the most elusive mammals. Due to their subterranean and nocturnal lifestyle, their basic biology and evolutionary history remain virtually unknown. Two distinct species with allopatric distributions are recognized: Chlamyphorus truncatus is restricted to central Argentina, while Calyptophractus retusus occurs in the Gran Chaco of Argentina, Paraguay, and Bolivia. To test their monophyly and resolve their phylogenetic affinities within armadillos, we obtained sequence data from modern and museum specimens for two mitochondrial genes (12S RNA [MT-RNR1] and NADH dehydrogenase 1 [MT-ND1]) and two nuclear exons (breast cancer 1 early onset exon 11 [BRCA1] and von Willebrand factor exon 28 [VWF]). Phylogenetic analyses provided a reference phylogeny and timescale for living xenarthran genera. Our results reveal monophyletic pichiciegos as members of a major armadillo subfamily (Chlamyphorinae). Their strictly fossorial lifestyle probably evolved as a response to the Oligocene aridification that occurred in South America after their divergence from Tolypeutinae around 32 million years ago (Mya). The ancient divergence date (∼17Mya) for separation between the two species supports their taxonomic classification into distinct genera. The synchronicity with Middle Miocene marine incursions along the Paraná river basin suggests a vicariant origin for pichiciegos by the disruption of their ancestral range. Their phylogenetic distinctiveness and rarity in the wild argue in favor of high conservation priority.

  13. Genome-wide identification and phylogenetic analysis of the ERF gene family in cucumbers

    Directory of Open Access Journals (Sweden)

    Lifang Hu


    Members of the ERF transcription-factor family participate in a number of biological processes, viz., responses to hormones, adaptation to biotic and abiotic stress, metabolism regulation, beneficial symbiotic interactions, cell differentiation and developmental processes. So far, no tissue-expression profile of any cucumber ERF protein has been reported in detail. Recent completion of the cucumber full-genome sequence has come to facilitate, not only genome-wide analysis of ERF family members in cucumbers themselves, but also a comparative analysis with those in Arabidopsis and rice. In this study, 103 hypothetical ERF family genes in the cucumber genome were identified, phylogenetic analysis indicating their classification into 10 groups, designated I to X. Motif analysis further indicated that most of the conserved motifs outside the AP2/ERF domain, are selectively distributed among the specific clades in the phylogenetic tree. From chromosomal localization and genome distribution analysis, it appears that tandem-duplication may have contributed to CsERF gene expansion. Intron/exon structure analysis indicated that a few CsERFs still conserved the former intron-position patterns existent in the common ancestor of monocots and eudicots. Expression analysis revealed the widespread distribution of the cucumber ERF gene family within plant tissues, thereby implying the probability of their performing various roles therein. Furthermore, members of some groups presented mutually similar expression patterns that might be related to their phylogenetic groups.

  14. Automated protein subfamily identification and classification.

    Directory of Open Access Journals (Sweden)

    Duncan P Brown


    Function prediction by homology is widely used to provide preliminary functional annotations for genes for which experimental evidence of function is unavailable or limited. This approach has been shown to be prone to systematic error, including percolation of annotation errors through sequence databases. Phylogenomic analysis avoids these errors in function prediction but has been difficult to automate for high-throughput application. To address this limitation, we present a computationally efficient pipeline for phylogenomic classification of proteins. This pipeline uses the SCI-PHY (Subfamily Classification in Phylogenomics) algorithm for automatic subfamily identification, followed by subfamily hidden Markov model (HMM) construction. A simple and computationally efficient scoring scheme using family and subfamily HMMs enables classification of novel sequences to protein families and subfamilies. Sequences representing entirely novel subfamilies are differentiated from those that can be classified to subfamilies in the input training set using logistic regression. Subfamily HMM parameters are estimated using an information-sharing protocol, enabling subfamilies containing even a single sequence to benefit from conservation patterns defining the family as a whole or in related subfamilies. SCI-PHY subfamilies correspond closely to functional subtypes defined by experts and to conserved clades found by phylogenetic analysis. Extensive comparisons of subfamily and family HMM performances show that subfamily HMMs dramatically improve the separation between homologous and non-homologous proteins in sequence database searches. Subfamily HMMs also provide extremely high specificity of classification and can be used to predict entirely novel subtypes. The SCI-PHY Web server at allows users to upload a multiple sequence alignment for subfamily identification and subfamily HMM construction. Biologists wishing to

  15. Complete mitochondrial genomes elucidate phylogenetic relationships of the deep-sea octocoral families Coralliidae and Paragorgiidae (United States)

    Figueroa, Diego F.; Baco, Amy R.


    In the past decade, molecular phylogenetic analyses of octocorals have shown that the current morphological taxonomic classification of these organisms needs to be revised. The latest phylogenetic analyses show that most octocorals can be divided into three main clades. One of these clades contains the families Coralliidae and Paragorgiidae. These families share several taxonomically important characters and it has been suggested that they may not be monophyletic; with the possibility of the Coralliidae being a derived branch of the Paragorgiidae. Uncertainty exists not only in the relationship of these two families, but also in the classification of the two genera that make up the Coralliidae, Corallium and Paracorallium. Molecular analyses suggest that the genus Corallium is paraphyletic, and it can be divided into two main clades, with the Paracorallium as members of one of these clades. In this study we sequenced the whole mitochondrial genome of five species of Paragorgia and of five species of Corallium to use in a phylogenetic analysis to achieve two main objectives; the first to elucidate the phylogenetic relationship between the Paragorgiidae and Coralliidae and the second to determine whether the genera Corallium and Paracorallium are monophyletic. Our results show that other members of the Coralliidae share the two novel mitochondrial gene arrangements found in a previous study in Corallium konojoi and Paracorallium japonicum; and that the Corallium konojoi arrangement is also found in the Paragorgiidae. Our phylogenetic reconstruction based on all the protein coding genes and ribosomal RNAs of the mitochondrial genome suggest that the Coralliidae are not a derived branch of the Paragorgiidae, but rather a monophyletic sister branch to the Paragorgiidae. While our manuscript was in review a study was published using morphological data and several fragments from mitochondrial genes to redefine the taxonomy of the Coralliidae. Paracorallium was subsumed

  16. 78 FR 54970 - Cotton Futures Classification: Optional Classification Procedure (United States)


    ... process in March 2012 (77 FR 5379). When verified by a futures classification, Smith-Doxey data serves as... Classification: Optional Classification Procedure AGENCY: Agricultural Marketing Service, USDA. ACTION: Proposed... for the addition of an optional cotton futures classification procedure--identified and known...

  17. Two new rapid SNP-typing methods for classifying Mycobacterium tuberculosis complex into the main phylogenetic lineages.

    Directory of Open Access Journals (Sweden)

    David Stucki

    There is increasing evidence that strain variation in Mycobacterium tuberculosis complex (MTBC) might influence the outcome of tuberculosis infection and disease. To assess genotype-phenotype associations, phylogenetically robust molecular markers and appropriate genotyping tools are required. Most current genotyping methods for MTBC are based on mobile or repetitive DNA elements. Because these elements are prone to convergent evolution, the corresponding genotyping techniques are suboptimal for phylogenetic studies and strain classification. By contrast, single nucleotide polymorphisms (SNPs) are ideal markers for classifying MTBC into phylogenetic lineages, as they exhibit very low degrees of homoplasy. In this study, we developed two complementary SNP-based genotyping methods to classify strains into the six main human-associated lineages of MTBC, the "Beijing" sublineage, and the clade comprising Mycobacterium bovis and Mycobacterium caprae. Phylogenetically informative SNPs were obtained from 22 MTBC whole-genome sequences. The first assay, referred to as MOL-PCR, is a ligation-dependent PCR with signal detection by fluorescent microspheres and a Luminex flow cytometer, which simultaneously interrogates eight SNPs. The second assay is based on six individual TaqMan real-time PCR assays for singleplex SNP-typing. We compared MOL-PCR and TaqMan results in two panels of clinical MTBC isolates. Both methods agreed fully when assigning 36 well-characterized strains into the main phylogenetic lineages. The sensitivity in allele-calling was 98.6% and 98.8% for MOL-PCR and TaqMan, respectively. Typing of an additional panel of 78 unknown clinical isolates revealed 99.2% and 100% sensitivity in allele-calling, respectively, and 100% agreement in lineage assignment between both methods. While MOL-PCR and TaqMan are both highly sensitive and specific, MOL-PCR is ideal for classification of isolates with no previous information, whereas TaqMan is faster

  18. Learning Apache Mahout classification

    CERN Document Server

    Gupta, Ashish


    If you are a data scientist who has some experience with the Hadoop ecosystem and machine learning methods and want to try out classification on large datasets using Mahout, this book is ideal for you. Knowledge of Java is essential.

  19. Pitch Based Sound Classification

    DEFF Research Database (Denmark)

    Nielsen, Andreas Brinch; Hansen, Lars Kai; Kjems, U


    A sound classification model is presented that can classify signals into music, noise and speech. The model extracts the pitch of the signal using the harmonic product spectrum. Based on the pitch estimate and a pitch error measure, features are created and used in a probabilistic model with soft-max output function. Both linear and quadratic inputs are used. The model is trained on 2 hours of sound and tested on publicly available data. A test classification error below 0.05 with 1 s classification windows is achieved. Furthermore, it is shown that linear input performs as well as a quadratic, and that even though classification gets marginally better, not much is achieved by increasing the window size beyond 1 s.
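
    The pitch-extraction step named in this abstract can be illustrated with a minimal sketch of the harmonic product spectrum: the magnitude spectrum is multiplied by copies of itself sampled at integer multiples of each bin, and the bin with the largest product is taken as the fundamental. The function names, the naive DFT, the default of three harmonics and the synthetic test tone are all assumptions made for illustration; the abstract does not describe the authors' implementation.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 0..N/2 (slow, but needs no third-party library)."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = 0j
        for t, x in enumerate(frame):
            acc += x * cmath.exp(-2j * math.pi * k * t / n)
        mags.append(abs(acc))
    return mags


def hps_pitch(frame, sample_rate, harmonics=3):
    """Estimate the fundamental frequency in Hz with the harmonic product spectrum.

    `harmonics` (hypothetical default) is the number of spectrum copies multiplied.
    """
    mags = magnitude_spectrum(frame)
    n = len(frame)
    max_bin = (len(mags) - 1) // harmonics  # highest bin whose harmonics stay in range
    best_bin, best_score = 1, -1.0
    for k in range(1, max_bin + 1):  # skip bin 0 (DC)
        score = 1.0
        for h in range(1, harmonics + 1):
            score *= mags[h * k]
        if score > best_score:
            best_bin, best_score = k, score
    return best_bin * sample_rate / n


def synthetic_tone(f0, sample_rate, n, amplitudes=(1.0, 0.5, 0.33, 0.25)):
    """Illustrative test signal: a fundamental at f0 plus decaying harmonics."""
    return [
        sum(a * math.sin(2 * math.pi * f0 * (i + 1) * t / sample_rate)
            for i, a in enumerate(amplitudes))
        for t in range(n)
    ]
```

    For example, `hps_pitch(synthetic_tone(200.0, 8000, 800), 8000)` analyses one 0.1 s frame whose harmonics fall exactly on DFT bins. A practical system would use an FFT and windowing instead of the naive transform.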

  20. A phylum-level phylogenetic classification of zygomycete fungi based on genome-scale data (United States)

    Zygomycete fungi were classified as a single phylum, Zygomycota, based on sexual reproduction by zygospores, frequent asexual reproduction by sporangia, absence of multicellular sporocarps, and production of coenocytic hyphae, all with some exceptions. Molecular phylogenies based on one or a few gen...

  1. Evidence of Statistical Inconsistency of Phylogenetic Methods in the Presence of Multiple Sequence Alignment Uncertainty. (United States)

    Md Mukarram Hossain, A S; Blackburne, Benjamin P; Shah, Abhijeet; Whelan, Simon


    Evolutionary studies usually use a two-step process to investigate sequence data. Step one estimates a multiple sequence alignment (MSA) and step two applies phylogenetic methods to ask evolutionary questions of that MSA. Modern phylogenetic methods infer evolutionary parameters using maximum likelihood or Bayesian inference, mediated by a probabilistic substitution model that describes sequence change over a tree. The statistical properties of these methods mean that more data directly translates to an increased confidence in downstream results, providing the substitution model is adequate and the MSA is correct. Many studies have investigated the robustness of phylogenetic methods in the presence of substitution model misspecification, but few have examined the statistical properties of those methods when the MSA is unknown. This simulation study examines the statistical properties of the complete two-step process when inferring sequence divergence and the phylogenetic tree topology. Both nucleotide and amino acid analyses are negatively affected by the alignment step, both through inaccurate guide tree estimates and through overfitting to that guide tree. For many alignment tools these effects become more pronounced when additional sequences are added to the analysis. Nucleotide sequences are particularly susceptible, with MSA errors leading to statistical support for long-branch attraction artifacts, which are usually associated with gross substitution model misspecification. Amino acid MSAs are more robust, but do tend to arbitrarily resolve multifurcations in favor of the guide tree. No inference strategies produce consistently accurate estimates of divergence between sequences, although amino acid MSAs are again more accurate than their nucleotide counterparts. We conclude with some practical suggestions about how to limit the effect of MSA uncertainty on evolutionary inference.

  2. MINER: software for phylogenetic motif identification. (United States)

    La, David; Livesay, Dennis R


    MINER is web-based software for phylogenetic motif (PM) identification. PMs are sequence regions (fragments) that conserve the overall familial phylogeny. PMs have been shown to correspond to a wide variety of catalytic regions, substrate-binding sites and protein interfaces, making them ideal functional site predictions. The MINER output provides an intuitive interface for interactive PM sequence analysis and structural visualization. The web implementation of MINER is freely available at Source code is available to the academic community on request.

  3. Phylogenetic analysis of cubilin (CUBN) gene


    Shaik, Abjal Pasha; Alsaeed, Abbas H; Kiranmayee, S; Bammidi, VK; Sultana, Asma


    Cubilin (CUBN; also known as intrinsic factor-cobalamin receptor [Homo sapiens Entrez Pubmed ref NM_001081.3; NG_008967.1; GI: 119606627]), located in the epithelium of the intestine and kidney, acts as a receptor for intrinsic factor – vitamin B12 complexes. Mutations in CUBN may play a role in autosomal recessive megaloblastic anemia. The current study investigated the possible role of CUBN in evolution using phylogenetic testing. A total of 588 BLAST hits were found for the cubilin query seque...

  4. Completion of the classification

    CERN Document Server

    Strade, Helmut


    This is the last of three volumes about "Simple Lie Algebras over Fields of Positive Characteristic" by Helmut Strade, presenting the state of the art of the structure and classification of Lie algebras over fields of positive characteristic. In this monograph the proof of the Classification Theorem presented in the first volume is concluded. It collects all the important results on the topic which can be found only in scattered scientific literature so far.

  5. Twitter content classification



    This paper delivers a new Twitter content classification framework based on sixteen existing Twitter studies and a grounded theory analysis of a personal Twitter history. It expands the existing understanding of Twitter as a multifunction tool for personal, professional, commercial and phatic communications with a split-level classification scheme that offers broad categorization and specific subcategories for deeper insight into the real-world application of the service.

  6. Expected Classification Accuracy

    Directory of Open Access Journals (Sweden)

    Lawrence M. Rudner


    Every time we make a classification based on a test score, we should expect some number of misclassifications. Some examinees whose true ability is within a score range will have observed scores outside of that range. A procedure for providing a classification table of true and expected scores is developed for polytomously scored items under item response theory and applied to state assessment data. A simplified procedure for estimating the table entries is also presented.
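
    The idea of a classification table of observed versus expected true categories can be sketched as follows. This is only an illustration under a stated assumption that is not taken from the abstract: each examinee's true ability is treated as normally distributed around the ability estimate, with the standard error as its standard deviation. The function names and the input format (pairs of estimate and standard error) are hypothetical.

```python
import math


def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def category_probabilities(theta_hat, se, cut_scores):
    """Probability that the true ability lies in each score category.

    Assumption (not from the source): true ability ~ Normal(theta_hat, se).
    """
    bounds = [-math.inf] + sorted(cut_scores) + [math.inf]
    cdf = [normal_cdf(b, theta_hat, se) for b in bounds]
    return [cdf[i + 1] - cdf[i] for i in range(len(bounds) - 1)]


def observed_category(theta_hat, cut_scores):
    """Index of the category the ability estimate itself falls into."""
    return sum(theta_hat >= c for c in sorted(cut_scores))


def expected_classification_table(examinees, cut_scores):
    """Rows: observed category; columns: expected share of true categories."""
    k = len(cut_scores) + 1
    table = [[0.0] * k for _ in range(k)]
    for theta_hat, se in examinees:
        row = observed_category(theta_hat, cut_scores)
        probs = category_probabilities(theta_hat, se, cut_scores)
        for col, p in enumerate(probs):
            table[row][col] += p
    n = len(examinees)
    return [[value / n for value in row] for row in table]


def expected_accuracy(table):
    """Expected proportion of correct classifications (diagonal sum)."""
    return sum(table[i][i] for i in range(len(table)))
```

    Off-diagonal cells of the table estimate the share of examinees expected to be misclassified; an estimate sitting exactly on a cut score contributes one half to each neighbouring category.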

  7. Accurate Medium-Term Wind Power Forecasting in a Censored Classification Framework

    DEFF Research Database (Denmark)

    Dahl, Christian M.; Croonenbroeck, Carsten


    We provide a wind power forecasting methodology that exploits many of the actual data's statistical features, in particular both-sided censoring. While other tools ignore many of the important “stylized facts” or provide forecasts for short-term horizons only, our approach focuses on medium-term forecasts, which are especially necessary for practitioners in the forward electricity markets of many power trading places; for example, NASDAQ OMX Commodities (formerly Nord Pool OMX Commodities) in northern Europe. We show that our model produces turbine-specific forecasts that are significantly more...

  8. Separation of Benign and Malicious Network Events for Accurate Malware Family Classification (United States)


    unseen malicious behavior, while addressing code obfuscation, but they cost more time and resources to run the given malware sample in the sandbox...they come at a high cost [10] Observed Traffic Web S/W Updates Malware Blend  All  Sources   Fig. 1. Observed Traffic from a Host due to the...First, CHATTER relies on fine-grained events, since attributes extracted from one packet are treated as different events, e.g., inbound , DNS, and

  9. A Distance Measure for Genome Phylogenetic Analysis (United States)

    Cao, Minh Duc; Allison, Lloyd; Dix, Trevor

    Phylogenetic analyses of species based on single genes or parts of the genomes are often inconsistent because of factors such as variable rates of evolution and horizontal gene transfer. The availability of more and more sequenced genomes allows phylogeny construction from complete genomes that is less sensitive to such inconsistency. For such long sequences, construction methods like maximum parsimony and maximum likelihood are often not possible due to their intensive computational requirements. Another class of tree construction methods, namely distance-based methods, requires a measure of distance between any two genomes. Some measures such as evolutionary edit distance of gene order and gene content are computationally expensive or do not perform well when the gene content of the organisms is similar. This study presents an information theoretic measure of genetic distances between genomes based on the biological compression algorithm expert model. We demonstrate that our distance measure can be applied to reconstruct the consensus phylogenetic tree of a number of Plasmodium parasites from their genomes, the statistical bias of which would mislead conventional analysis methods. Our approach is also used to successfully construct a plausible evolutionary tree for the γ-Proteobacteria group whose genomes are known to contain many horizontally transferred genes.
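
    The general idea of a compression-based distance between sequences can be sketched as below. Note the substitution: this sketch uses the normalized compression distance with the general-purpose zlib compressor, not the expert model compression algorithm or the specific measure described in the abstract, and the function names are hypothetical. It only shows how compressed sizes can be turned into a pairwise distance table of the kind a distance-based tree method takes as input.

```python
import zlib


def compressed_size(data):
    """Length in bytes of the zlib-compressed input (stand-in compressor)."""
    return len(zlib.compress(data, 9))


def ncd(x, y):
    """Normalized compression distance between two byte strings.

    Values near 0 mean the sequences share most of their information;
    values near 1 mean they share little.
    """
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def distance_matrix(genomes):
    """Pairwise distances for a dict mapping names to sequences (bytes)."""
    names = sorted(genomes)
    return {
        (a, b): ncd(genomes[a], genomes[b])
        for a in names
        for b in names
        if a < b
    }
```

    Because zlib only looks back 32 KB, this stand-in is unsuitable for whole genomes; it is meant for short illustrative sequences, which is one reason a dedicated biological compressor is needed in practice.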

  10. A phylogenetic blueprint for a modern whale. (United States)

    Gatesy, John; Geisler, Jonathan H; Chang, Joseph; Buell, Carl; Berta, Annalisa; Meredith, Robert W; Springer, Mark S; McGowen, Michael R


    The emergence of Cetacea in the Paleogene represents one of the most profound macroevolutionary transitions within Mammalia. The move from a terrestrial habitat to a committed aquatic lifestyle engendered wholesale changes in anatomy, physiology, and behavior. The results of this remarkable transformation are extant whales that include the largest, biggest brained, fastest swimming, loudest, deepest diving mammals, some of which can detect prey with a sophisticated echolocation system (Odontoceti - toothed whales), and others that batch feed using racks of baleen (Mysticeti - baleen whales). A broad-scale reconstruction of the evolutionary remodeling that culminated in extant cetaceans has not yet been based on integration of genomic and paleontological information. Here, we first place Cetacea relative to extant mammalian diversity, and assess the distribution of support among molecular datasets for relationships within Artiodactyla (even-toed ungulates, including Cetacea). We then merge trees derived from three large concatenations of molecular and fossil data to yield a composite hypothesis that encompasses many critical events in the evolutionary history of Cetacea. By combining diverse evidence, we infer a phylogenetic blueprint that outlines the stepwise evolutionary development of modern whales. This hypothesis represents a starting point for more detailed, comprehensive phylogenetic reconstructions in the future, and also highlights the synergistic interaction between modern (genomic) and traditional (morphological+paleontological) approaches that ultimately must be exploited to provide a rich understanding of evolutionary history across the entire tree of Life.

  11. Phylogenetic diversity of Mesorhizobium in chickpea

    Indian Academy of Sciences (India)

    Dong Hyun Kim; Mayank Kaashyap; Abhishek Rathore; Roma R Das; Swathi Parupalli; Hari D Upadhyaya; S Gopalakrishnan; Pooran M Gaur; Sarvjeet Singh; Jagmeet Kaur; Mohammad Yasin; Rajeev K Varshney


    Crop domestication, in general, has reduced genetic diversity in the cultivated gene pool of chickpea (Cicer arietinum) as compared with wild species (C. reticulatum, C. bijugum). To explore the impact of domestication on symbiosis, 10 accessions of chickpea, including 4 accessions of C. arietinum and 3 accessions each of C. reticulatum and C. bijugum, were selected and DNA was extracted from their nodules. To distinguish chickpea symbionts, preliminary sequence analysis was attempted with 9 genes (16S rRNA, atpD, dnaJ, glnA, gyrB, nifH, nifK, nodD and recA), of which 3 genes (gyrB, nifK and nodD) were selected for further phylogenetic analysis based on sufficient sequence diversity. Phylogenetic analysis and sequence diversity for the 3 genes demonstrated that sequences from C. reticulatum were more diverse. Nodule occupancy by the dominant symbiont also indicated that C. reticulatum (60%) could host a more diverse set of symbionts than cultivated chickpea (80%). The study demonstrated that wild chickpeas (C. reticulatum) could be used for selecting more diverse symbionts under field conditions, and it implies that chickpea domestication affected symbiosis negatively in addition to reducing genetic diversity.

  12. Epitope discovery with phylogenetic hidden Markov models.

    LENUS (Irish Health Repository)

    Lacerda, Miguel


    Existing methods for the prediction of immunologically active T-cell epitopes are based on the amino acid sequence or structure of pathogen proteins. Additional information regarding the locations of epitopes may be acquired by considering the evolution of viruses in hosts with different immune backgrounds. In particular, immune-dependent evolutionary patterns at sites within or near T-cell epitopes can be used to enhance epitope identification. We have developed a mutation-selection model of T-cell epitope evolution that allows the human leukocyte antigen (HLA) genotype of the host to influence the evolutionary process. This is one of the first examples of the incorporation of environmental parameters into a phylogenetic model and has many other potential applications where the selection pressures exerted on an organism can be related directly to environmental factors. We combine this novel evolutionary model with a hidden Markov model to identify contiguous amino acid positions that appear to evolve under immune pressure in the presence of specific host immune alleles and that therefore represent potential epitopes. This phylogenetic hidden Markov model provides a rigorous probabilistic framework that can be combined with sequence or structural information to improve epitope prediction. As a demonstration, we apply the model to a data set of HIV-1 protein-coding sequences and host HLA genotypes.

  13. Laboratory Building for Accurate Determination of Plutonium

    Institute of Scientific and Technical Information of China (English)


    The accurate determination of plutonium is one of the most important assay techniques of nuclear fuel, also the key of the chemical measurement transfer and the base of the nuclear material balance. An

  14. Comparative assessment of performance and genome dependence among phylogenetic profiling methods

    Directory of Open Access Journals (Sweden)

    Wu Jie


    Background: The rapidly increasing speed with which genome sequence data can be generated will be accompanied by an exponential increase in the number of sequenced eukaryotes. With the increasing number of sequenced eukaryotic genomes comes a need for bioinformatic techniques to aid in functional annotation. Ideally, genome context based techniques such as proximity, fusion, and phylogenetic profiling, which have been so successful in prokaryotes, could be utilized in eukaryotes. Here we explore the application of phylogenetic profiling, a method that exploits the evolutionary co-occurrence of genes in the assignment of functional linkages, to eukaryotic genomes. Results: In order to evaluate the performance of phylogenetic profiling in eukaryotes, we assessed the relative performance of commonly used profile construction techniques and genome compositions in predicting functional linkages in both prokaryotic and eukaryotic organisms. When predicting linkages in E. coli with a prokaryotic profile, the use of continuous values constructed from transformed BLAST bit-scores performed better than profiles composed of discretized E-values; the use of discretized E-values resulted in more accurate linkages when using S. cerevisiae as the query organism. Extending this analysis by incorporating several eukaryotic genomes in profiles containing a majority of prokaryotes resulted in similar overall accuracy, but with a surprising reduction in pathway diversity among the most significant linkages. Furthermore, the application of phylogenetic profiling using profiles composed of only eukaryotes resulted in the loss of the strong correlation between common KEGG pathway membership and profile similarity score. Profile construction methods, orthology definitions, ontology and domain complexity were explored as possible sources of the poor performance of eukaryotic profiles, but with no improvement in results. Conclusion: Given the current set of
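
    The profile comparison described in this record can be illustrated with a minimal sketch. It assumes one particular construction, which the abstract does not spell out: best-hit E-values are discretized with a hypothetical cutoff of 1e-5, and two genes are compared by the Jaccard index of their presence/absence profiles. The gene names, E-values, cutoff, and the choice of Jaccard similarity are all illustrative assumptions, not the authors' procedure.

```python
def discretize_evalues(evalues, threshold=1e-5):
    """Turn best-hit E-values (None = no hit) into a 0/1 presence profile.

    The 1e-5 cutoff is an assumed, illustrative value.
    """
    return [1 if e is not None and e <= threshold else 0 for e in evalues]


def jaccard(p, q):
    """Jaccard similarity of two equal-length binary profiles."""
    both = sum(1 for a, b in zip(p, q) if a and b)
    either = sum(1 for a, b in zip(p, q) if a or b)
    return both / either if either else 0.0


# Hypothetical best-hit E-values for two genes across five reference genomes.
gene_a = [1e-30, None, 1e-12, 0.5, 1e-8]
gene_b = [1e-25, None, 1e-9, None, 1e-3]

profile_a = discretize_evalues(gene_a)
profile_b = discretize_evalues(gene_b)

# Genes with similar co-occurrence patterns score close to 1.
similarity = jaccard(profile_a, profile_b)
print(profile_a, profile_b, round(similarity, 3))
```

    A continuous variant would keep a transformed bit-score per genome instead of a 0/1 value and compare profiles with a correlation measure; which variant works better depends, per the abstract, on the query organism.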

  15. The origin and diversification of eukaryotes: problems with molecular phylogenetics and molecular clock estimation. (United States)

    Roger, Andrew J; Hug, Laura A


    Determining the relationships among and divergence times for the major eukaryotic lineages remains one of the most important and controversial outstanding problems in evolutionary biology. The sequencing and phylogenetic analyses of ribosomal RNA (rRNA) genes led to the first nearly comprehensive phylogenies of eukaryotes in the late 1980s, and supported a view where cellular complexity was acquired during the divergence of extant unicellular eukaryote lineages. More recently, however, refinements in analytical methods coupled with the availability of many additional genes for phylogenetic analysis showed that much of the deep structure of early rRNA trees was artefactual. Recent phylogenetic analyses of multiple genes and the discovery of important molecular and ultrastructural phylogenetic characters have resolved eukaryotic diversity into six major hypothetical groups. Yet relationships among these groups remain poorly understood because of saturation of sequence changes on the billion-year time-scale, possible rapid radiations of major lineages, phylogenetic artefacts and endosymbiotic or lateral gene transfer among eukaryotes. Estimating the divergence dates between the major eukaryote lineages using molecular analyses is even more difficult than phylogenetic estimation. Error in such analyses comes from a myriad of sources including: (i) calibration fossil dates, (ii) the assumed phylogenetic tree, (iii) the nucleotide or amino acid substitution model, (iv) substitution number (branch length) estimates, (v) the model of how rates of evolution change over the tree, (vi) error inherent in the time estimates for a given model and (vii) how multiple gene data are treated. By reanalysing datasets from recently published molecular clock studies, we show that when errors from these various sources are properly accounted for, the confidence intervals on inferred dates can be very large. Furthermore, estimated dates of divergence vary hugely depending on the methods

  16. Understanding the Code: keeping accurate records. (United States)

    Griffith, Richard


    In his continuing series looking at the legal and professional implications of the Nursing and Midwifery Council's revised Code of Conduct, Richard Griffith discusses the elements of accurate record keeping under Standard 10 of the Code. This article considers the importance of accurate record keeping for the safety of patients and protection of district nurses. The legal implications of records are explained along with how district nurses should write records to ensure these legal requirements are met.

  17. A Novel Vehicle Classification Using Embedded Strain Gauge Sensors

    Directory of Open Access Journals (Sweden)

    Qi Wang


    Abstract: This paper presents a new vehicle classification method and develops a traffic monitoring detector to provide reliable vehicle classification to aid traffic management systems. The approach measures the dynamic strain caused by vehicles crossing the pavement to obtain the corresponding vehicle parameters (wheelbase and number of axles) and then accurately classify the vehicle. A system prototype with five embedded strain sensors was developed to validate the accuracy and effectiveness of the classification method. From the arrangement of the sensors and the different times at which a vehicle arrives at them, one can accurately estimate the vehicle's speed and, from it, the vehicle wheelbase and number of axles. Because of measurement errors and vehicle characteristics, there is a lot of overlap between vehicle wheelbase patterns. Therefore, directly setting up a fixed threshold for vehicle classification often leads to low-accuracy results. Machine learning pattern recognition is believed to be one of the most effective tools for dealing with this problem. In this study, support vector machines (SVMs) were used to integrate the classification features extracted from the strain sensors to automatically classify vehicles into five types, ranging from small vehicles to combination trucks, along the lines of the Federal Highway Administration vehicle classification guide. Test bench and field experiments are introduced in this paper. Two support vector machine classification algorithms (one-against-all and one-against-one) are used to classify single sensor data and multiple sensor combination data. Comparison of the two classification method results shows that the classification accuracy is very close using single data or multiple data. Our results indicate that using multiclass SVM-based fusion multiple sensor data significantly improves
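
    The feature-extraction step in this record (speed from sensor arrival times, then wheelbase from axle arrival times) reduces to simple kinematics. The sketch below assumes constant speed across the sensor array; the sensor spacing, the timestamps, and the function names are hypothetical and not taken from the paper.

```python
def estimate_speed(sensor_spacing_m, t_first_s, t_second_s):
    """Speed (m/s) from the time the same axle takes to travel between two sensors."""
    dt = t_second_s - t_first_s
    if dt <= 0:
        raise ValueError("second sensor must be hit after the first")
    return sensor_spacing_m / dt


def estimate_wheelbases(speed_mps, axle_times_s):
    """Axle-to-axle distances (m) from successive axle arrivals at one sensor."""
    return [speed_mps * (later - earlier)
            for earlier, later in zip(axle_times_s, axle_times_s[1:])]


# Hypothetical readings: sensors 2.0 m apart, first axle hits them 0.10 s apart.
speed = estimate_speed(2.0, 0.00, 0.10)

# Two axles pass the first sensor 0.14 s apart.
axle_times = [0.00, 0.14]
wheelbases = estimate_wheelbases(speed, axle_times)
axle_count = len(axle_times)

print(f"speed={speed:.1f} m/s, wheelbases={wheelbases}, axles={axle_count}")
```

    The resulting wheelbase and axle-count values are the kind of features the abstract feeds to the SVM classifiers; because wheelbase ranges overlap between vehicle types, a fixed threshold on these numbers alone would classify poorly.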

  18. Phylogenetics of early branching eudicots: Comparing phylogenetic signal across plastid introns, spacers, and genes

    Institute of Scientific and Technical Information of China (English)

    Anna-Magdalena BARNISKE; Thomas BORSCH; Kai MÜLLER; Michael KRUG; Andreas WORBERG; Christoph NEINHUIS; Dietmar QUANDT


    Recent phylogenetic analyses revealed a grade with Ranunculales, Sabiales, Proteales, Trochodendrales, and Buxales as first branching eudicots, with the respective positions of Proteales and Sabiales still lacking statistical confidence. As previous analyses of conserved plastid genes remain inconclusive, we aimed to use and evaluate a representative set of plastid introns (group Ⅰ: trnL; group Ⅱ: petD, rpl16, trnK) and intergenic spacers (trnL-F, petB-petD, atpB-rbcL, rps3-rpl16) in comparison to the rapidly evolving matK and slowly evolving atpB and rbcL genes. Overall patterns of microstructural mutations converged across genomic regions, underscoring the existence of a general mutational pattern throughout the plastid genome. Phylogenetic signal differed strongly between functionally and structurally different genomic regions and was highest in matK, followed by spacers, then group Ⅱ and group Ⅰ introns. The more conserved atpB and rbcL coding regions showed distinctly lower phylogenetic information content. Parsimony, maximum likelihood, and Bayesian phylogenetic analyses based on the combined dataset of non-coding and rapidly evolving regions (>14 000 aligned characters) converged to a backbone topology of eudicots with Ranunculales branching first, a Proteales-Sabiales clade second, followed by Trochodendrales and Buxales. Gunnerales generally appeared as sister to all remaining core eudicots with maximum support. Our results show that a small number of intron and spacer sequences allow similar insights into phylogenetic relationships of eudicots compared to datasets of many combined genes. The non-coding proportion of the plastid genome thus can be considered an important information source for plastid phylogenomics.

  19. Phylogenetic positions of several amitochondriate protozoa-Evidence from phylogenetic analysis of DNA topoisomerase II

    Institute of Scientific and Technical Information of China (English)

    HE De; DONG Jiuhong; WEN Jianfan; XIN Dedong; LU Siqi


    Several groups of parasitic protozoa, as represented by Giardia, Trichomonas, Entamoeba and Microsporida, were once widely considered to be the most primitive extant eukaryotic group, Archezoa. The main evidence for this is their 'lacking mitochondria' and possessing some other primitive features between prokaryotes and eukaryotes, and being basal to all eukaryotes with mitochondria in phylogenies inferred from many molecules. Some authors even proposed that these organisms diverged before the endosymbiotic origin of mitochondria within eukaryotes. This view was once considered to be very significant to the study of the origin and evolution of eukaryotic cells (eukaryotes). However, in recent years this has been challenged by accumulating evidence from new studies. Here the sequences of DNA topoisomerase II in G. lamblia, T. vaginalis and E. histolytica were first identified by PCR and sequencing; then, combining them with the sequence data of the microsporidian Encephalitozoon cuniculi and other eukaryotic groups of different evolutionary positions from GenBank, phylogenetic trees were constructed by various methods to investigate the evolutionary positions of these amitochondriate protozoa. Our results showed that, since the characteristics of DNA topoisomerase II allow it to avoid the 'long-branch attraction' defect that appeared in previous phylogenetic analyses, our trees can not only effectively reflect the widely accepted relationships of different major eukaryotic groups, but also reveal phylogenetic positions for these amitochondriate protozoa that differ from those in previous phylogenetic trees. They are not the earliest-branching eukaryotes, but diverged after some mitochondriate organisms such as kinetoplastids and mycetozoan; they are not a united group but occupy different phylogenetic positions. Combining with the recent cytological findings of mitochondria-like organelles in them, we think that though some of them (e.g. diplomonads, as represented

  20. Phylogenetic constraints in key functional traits behind species' climate niches

    DEFF Research Database (Denmark)

    Kellermann, Vanessa; Loeschcke, Volker; Hoffmann, Ary A;


    adapted to similar environments or alternatively phylogenetic inertia. For desiccation resistance, weak phylogenetic inertia was detected; ancestral trait reconstruction, however, revealed a deep divergence that could be traced back to the genus level. Despite drosophilids’ high evolutionary potential......) for 92–95 Drosophila species and assessed their importance for geographic distributions, while controlling for acclimation, phylogeny, and spatial autocorrelation. Employing an array of phylogenetic analyses, we documented moderate-to-strong phylogenetic signal in both desiccation and cold resistance....... Desiccation and cold resistance were clearly linked to species distributions because significant associations between traits and climatic variables persisted even after controlling for phylogeny. We used different methods to untangle whether phylogenetic signal reflected phylogenetically related species...

  1. Association of Virulence Genotype with Phylogenetic Background in Comparison to Different Seropathotypes of Shiga Toxin-Producing Escherichia coli Isolates (United States)

    Girardeau, Jean Pierre; Dalmasso, Alessandra; Bertin, Yolande; Ducrot, Christian; Bord, Séverine; Livrelli, Valérie; Vernozy-Rozand, Christine; Martin, Christine


    The distribution of virulent factors (VFs) in 287 Shiga toxin-producing Escherichia coli (STEC) strains that were classified according to Karmali et al. into five seropathotypes (M. A. Karmali, M. Mascarenhas, S. Shen, K. Ziebell, S. Johnson, R. Reid-Smith, J. Isaac-Renton, C. Clark, K. Rahn, and J. B. Kaper, J. Clin. Microbiol. 41:4930-4940, 2003) was investigated. The associations of VFs with phylogenetic background were assessed among the strains in comparison with the different seropathotypes. The phylogenetic analysis showed that STEC strains segregated mainly in phylogenetic group B1 (70%) and revealed the substantial prevalence (19%) of STEC belonging to phylogenetic group A (designated STEC-A). The presence of virulent clonal groups in seropathotypes that are associated with disease and their absence from seropathotypes that are not associated with disease support the concept of seropathotype classification. Although certain VFs (eae, stx2-EDL933, stx2-vha, and stx2-vhb) were concentrated in seropathotypes associated with disease, others (astA, HPI, stx1c, and stx2-NV206) were concentrated in seropathotypes that are not associated with disease. Taken together with the observation that the STEC-A group was exclusively composed of strains lacking eae recovered from seropathotypes that are not associated with disease, the “atypical” virulence pattern suggests that STEC-A strains comprise a distinct category of STEC strains. A practical benefit of our phylogenetic analysis of STEC strains is that phylogenetic group A status appears to be highly predictive of “nonvirulent” seropathotypes. PMID:16333104

  2. Segmentation Assisted Food Classification for Dietary Assessment. (United States)

    Zhu, Fengqing; Bosch, Marc; Schap, Tusarebecca; Khanna, Nitin; Ebert, David S; Boushey, Carol J; Delp, Edward J


    Accurate methods and tools to assess food and nutrient intake are essential for the association between diet and health. Preliminary studies have indicated that the use of a mobile device with a built-in camera to obtain images of the food consumed may provide a less burdensome and more accurate method for dietary assessment. We are developing methods to identify food items using a single image acquired from the mobile device. Our goal is to automatically determine the regions in an image where a particular food is located (segmentation) and correctly identify the food type based on its features (classification or food labeling). Images of foods are segmented using Normalized Cuts based on intensity and color. Color and texture features are extracted from each segmented food region. Classification decisions for each segmented region are made using support vector machine methods. The segmentation of each food region is refined based on feedback from the output of classifier to provide more accurate estimation of the quantity of food consumed.

  3. Automatic selection of reference taxa for protein-protein interaction prediction with phylogenetic profiling

    DEFF Research Database (Denmark)

    Simonsen, Martin; Maetschke, S.R.; Ragan, M.A.


    Motivation: Phylogenetic profiling methods can achieve good accuracy in predicting protein–protein interactions, especially in prokaryotes. Recent studies have shown that the choice of reference taxa (RT) is critical for accurate prediction, but with more than 2500 fully sequenced taxa publicly......: We present three novel methods for automating the selection of RT, using machine learning based on known protein–protein interaction networks. One of these methods in particular, Tree-Based Search, yields greatly improved prediction accuracies. We further show that different methods for constituting...

  4. Prediction and classification of respiratory motion

    CERN Document Server

    Lee, Suk Jin


    This book describes recent radiotherapy technologies including tools for measuring target position during radiotherapy and tracking-based delivery systems. This book presents a customized prediction of respiratory motion with clustering from multiple patient interactions. The proposed method contributes to the improvement of patient treatments by considering breathing pattern for the accurate dose calculation in radiotherapy systems. Real-time tumor-tracking, where the prediction of irregularities becomes relevant, has yet to be clinically established. The statistical quantitative modeling for irregular breathing classification, in which commercial respiration traces are retrospectively categorized into several classes based on breathing pattern are discussed as well. The proposed statistical classification may provide clinical advantages to adjust the dose rate before and during the external beam radiotherapy for minimizing the safety margin. In the first chapter following the Introduction  to this book, we...

  5. GPCRTree: online hierarchical classification of GPCR function

    Directory of Open Access Journals (Sweden)

    Timmis Jon


    Background: G protein-coupled receptors (GPCRs) play important physiological roles transducing extracellular signals into intracellular responses. Approximately 50% of all marketed drugs target a GPCR. There remains considerable interest in effectively predicting the function of a GPCR from its primary sequence. Findings: Using techniques drawn from data mining and proteochemometrics, an alignment-free approach to GPCR classification has been devised. It uses a simple representation of a protein's physical properties. GPCRTree, a publicly-available internet server, implements an algorithm that classifies GPCRs at the class, sub-family and sub-subfamily level. Conclusion: A selective top-down classifier was developed which assigns sequences within a GPCR hierarchy. Compared to other publicly available GPCR prediction servers, GPCRTree is considerably more accurate at every level of classification. The server has been available online since March 2008 at URL:

  6. Search techniques in intelligent classification systems

    CERN Document Server

    Savchenko, Andrey V


    A unified methodology for categorizing various complex objects is presented in this book. Through probability theory, novel asymptotically minimax criteria suitable for practical applications in imaging and data analysis are examined including the special cases such as the Jensen-Shannon divergence and the probabilistic neural network. An optimal approximate nearest neighbor search algorithm, which allows faster classification of databases is featured. Rough set theory, sequential analysis and granular computing are used to improve performance of the hierarchical classifiers. Practical examples in face identification (including deep neural networks), isolated commands recognition in voice control system and classification of visemes captured by the Kinect depth camera are included. This approach creates fast and accurate search procedures by using exact probability densities of applied dissimilarity measures. This book can be used as a guide for independent study and as supplementary material for a technicall...

  7. Best Practices for Data Sharing in Phylogenetic Research (United States)

    Cranston, Karen; Harmon, Luke J.; O'Leary, Maureen A.; Lisle, Curtis


    As phylogenetic data becomes increasingly available, along with associated data on species’ genomes, traits, and geographic distributions, the need to ensure data availability and reuse becomes more and more acute. In this paper, we provide ten “simple rules” that we view as best practices for data sharing in phylogenetic research. These rules will help lead towards a future phylogenetics where data can easily be archived, shared, reused, and repurposed across a wide variety of projects. PMID:24987572

  8. A common tendency for phylogenetic overdispersion in mammalian assemblages


    Cooper, Natalie; RODRIGUEZ, JESUS; Purvis, Andy


    Competition has long been proposed as an important force in structuring mammalian communities. Although early work recognised that competition has a phylogenetic dimension, only with recent increases in the availability of phylogenies have true phylogenetic investigations of mammalian community structure become possible. We test whether the phylogenetic structure of 142 assemblages from three mammalian clades (New World monkeys, North American ground squirrels and Australasian po...

  9. S1 gene-based phylogeny of infectious bronchitis virus: An attempt to harmonize virus classification. (United States)

    Valastro, Viviana; Holmes, Edward C; Britton, Paul; Fusaro, Alice; Jackwood, Mark W; Cattoli, Giovanni; Monne, Isabella


    Infectious bronchitis virus (IBV) is the causative agent of a highly contagious disease that results in severe economic losses to the global poultry industry. The virus exists in a wide variety of genetically distinct viral types, and both phylogenetic analysis and measures of pairwise similarity among nucleotide or amino acid sequences have been used to classify IBV strains. However, there is currently no consensus on the method by which IBV sequences should be compared, and heterogeneous genetic group designations that are inconsistent with phylogenetic history have been adopted, leading to the confusing coexistence of multiple genotyping schemes. Herein, we propose a simple and repeatable phylogeny-based classification system combined with an unambiguous and rational lineage nomenclature for the assignment of IBV strains. By using complete nucleotide sequences of the S1 gene we determined the phylogenetic structure of IBV, which in turn allowed us to define 6 genotypes that together comprise 32 distinct viral lineages and a number of inter-lineage recombinants. Because of extensive rate variation among IBVs, we suggest that the inference of phylogenetic relationships alone represents a more appropriate criterion for sequence classification than pairwise sequence comparisons. The adoption of an internationally accepted viral nomenclature is crucial for future studies of IBV epidemiology and evolution, and the classification scheme presented here can be updated and revised should novel S1 sequences become available.

  10. Application of fuzzy classification in modern primary dental care

    Directory of Open Access Journals (Sweden)

    Yauheni Veryha


    This paper describes a framework for implementing fuzzy classifications in primary dental care services. Dental practices aim to provide the highest quality services for their patients. To achieve this, it is important that dentists are able to obtain patients' opinions about their experiences in the dental practice and to evaluate them accurately. We propose the use of fuzzy classification to combine various assessment criteria into one general measure of patients' satisfaction with primary dental care services. The proposed framework can be used in conventional dental practice information systems and easily integrated with those already in use. The benefits of the proposed fuzzy classification approach include more flexible and accurate analysis of patients' feedback, combining verbal and numeric data. To confirm our theory, a prototype was developed based on the Microsoft SQL Server database management system for two criteria used in dental practices, namely making an appointment with a dentist and waiting time for dental care services.
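
    To make the idea of combining criteria into one fuzzy measure concrete, here is a minimal sketch for the two criteria named in this record. Everything numeric is an assumption for illustration: the linear "falling" membership shape, the breakpoints (2 and 14 days, 10 and 60 minutes), and the equal weights are hypothetical and are not taken from the paper or its SQL Server prototype.

```python
def falling_membership(value, full, zero):
    """Degree of satisfaction: 1.0 at or below `full`, 0.0 at or above `zero`,
    decreasing linearly in between."""
    if value <= full:
        return 1.0
    if value >= zero:
        return 0.0
    return (zero - value) / (zero - full)


def overall_satisfaction(days_to_appointment, waiting_minutes, weights=(0.5, 0.5)):
    """Weighted mean of the two membership degrees (assumed aggregation rule)."""
    appointment = falling_membership(days_to_appointment, full=2, zero=14)
    waiting = falling_membership(waiting_minutes, full=10, zero=60)
    w_appointment, w_waiting = weights
    return (w_appointment * appointment + w_waiting * waiting) / (w_appointment + w_waiting)


# Hypothetical patient: appointment after 8 days, 35 minutes in the waiting room.
score = overall_satisfaction(days_to_appointment=8, waiting_minutes=35)
print(round(score, 2))
```

    Unlike a sharp cutoff (for example "waiting over 30 minutes is unsatisfactory"), the graded membership lets a 29-minute and a 31-minute wait receive nearly the same score, which is the flexibility the abstract attributes to the fuzzy approach.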

  11. Platelet-rich plasma: the PAW classification system. (United States)

    DeLong, Jeffrey M; Russell, Ryan P; Mazzocca, Augustus D


    Platelet-rich plasma (PRP) has been the subject of hundreds of publications in recent years. Reports of its effects in tissue, both positive and negative, have generated great interest in the orthopaedic community. Protocols for PRP preparation vary widely between authors and are often not well documented in the literature, making results difficult to compare or replicate. A classification system is needed to more accurately compare protocols and results and effectively group studies together for meta-analysis. Although some classification systems have been proposed, no single system takes into account the multitude of variables that determine the efficacy of PRP. In this article we propose a simple method for organizing and comparing results in the literature. The PAW classification system is based on 3 components: (1) the absolute number of Platelets, (2) the manner in which platelet Activation occurs, and (3) the presence or absence of White cells. By analyzing these 3 variables, we are able to accurately compare publications.

  12. Convolutional Neural Networks for patient-specific ECG classification. (United States)

    Kiranyaz, Serkan; Ince, Turker; Hamila, Ridha; Gabbouj, Moncef


    We propose a fast and accurate patient-specific electrocardiogram (ECG) classification and monitoring system using an adaptive implementation of 1D Convolutional Neural Networks (CNNs) that can fuse feature extraction and classification into a unified learner. In this way, a dedicated CNN will be trained for each patient by using relatively small common and patient-specific training data and thus it can also be used to classify long ECG records such as Holter registers in a fast and accurate manner. Alternatively, such a solution can conveniently be used for real-time ECG monitoring and early alert system on a light-weight wearable device. The experimental results demonstrate that the proposed system achieves a superior classification performance for the detection of ventricular ectopic beats (VEB) and supraventricular ectopic beats (SVEB).

  13. Dilated Chi-Square: a novel interestingness measure to build accurate and compact decision list


    Lan, Yu; Janssens, Davy; Chen, Guoqing; Wets, Geert


    Associative classification has attracted significant attention in recent years. This paper proposes a novel interestingness measure, named dilated chi-square, to statistically reveal the interdependence between the antecedents and the consequent of classification rules. Using dilated chi-square, instead of confidence, as the primary ranking criterion for rules under the framework of the popular CBA algorithm, the adapted algorithm presented in this paper can empirically generate more accurate and mu...

  14. Applications of phylogenetics to solve practical problems in insect conservation. (United States)

    Buckley, Thomas R


    Phylogenetic approaches have much promise for the setting of conservation priorities and resource allocation. There has been significant development of analytical methods for the measurement of phylogenetic diversity within and among ecological communities as a way of setting conservation priorities. Application of these tools to insects has been low, as has the uptake by conservation managers. A critical reason for the lack of uptake is the scarcity of detailed phylogenetic and species distribution data for much of insect diversity. Environmental DNA technologies offer a means for the high-throughput collection of phylogenetic data across landscapes for conservation planning.

  15. Disentangling the phylogenetic and ecological components of spider phenotypic variation. (United States)

    Gonçalves-Souza, Thiago; Diniz-Filho, José Alexandre Felizola; Romero, Gustavo Quevedo


    An understanding of how the degree of phylogenetic relatedness influences the ecological similarity among species is crucial to inferring the mechanisms governing the assembly of communities. We evaluated the relative importance of spider phylogenetic relationships and ecological niche (plant morphological variables) to the variation in spider body size and shape by comparing spiders at different scales: (i) between bromeliads and dicot plants (i.e., habitat scale) and (ii) among bromeliads with distinct architectural features (i.e., microhabitat scale). We partitioned the interspecific variation in body size and shape into phylogenetic (that express trait values as expected by phylogenetic relationships among species) and ecological components (that express trait values independent of phylogenetic relationships). At the habitat scale, bromeliad spiders were larger and flatter than spiders associated with the surrounding dicots. At this scale, plant morphology sorted out closely related spiders. Our results showed that spider flatness is phylogenetically clustered at the habitat scale, whereas it is phylogenetically overdispersed at the microhabitat scale, although phylogenetic signal is present at both scales. Taken together, these results suggest that whereas at the habitat scale selective colonization affects spider body size and shape, at fine scales both selective colonization and adaptive evolution determine spider body shape. By partitioning the phylogenetic and ecological components of phenotypic variation, we were able to disentangle the evolutionary history of distinct spider traits and show that plant architecture plays a role in the evolution of spider body size and shape. We also discussed the relevance of considering multiple scales when studying phylogenetic community structure.

  16. Mitochondrial genome organization and vertebrate phylogenetics

    Directory of Open Access Journals (Sweden)

    Pereira Sérgio Luiz


    With the advent of DNA sequencing techniques the organization of the vertebrate mitochondrial genome shows variation between higher taxonomic levels. The most conserved gene order is found in placental mammals, turtles, fishes, some lizards and Xenopus. Birds, other species of lizards, crocodilians, marsupial mammals, snakes, tuatara, lamprey, and some other amphibians and one species of fish have gene orders that are less conserved. The most probable mechanism for new gene rearrangements seems to be tandem duplication and multiple deletion events, always associated with tRNA sequences. Some new rearrangements seem to be typical of monophyletic groups and the use of data from these groups may be useful for answering phylogenetic questions involving vertebrate higher taxonomic levels. Other features such as the secondary structure of tRNA, and the start and stop codons of protein-coding genes may also be useful in comparisons of vertebrate mitochondrial genomes.

  17. Tanglegrams: a Reduction Tool for Mathematical Phylogenetics. (United States)

    Matsen, Frederick; Billey, Sara; Kas, Arnold; Konvalinka, Matjaz


    Many discrete mathematics problems in phylogenetics are defined in terms of the relative labeling of pairs of leaf-labeled trees. These relative labelings are naturally formalized as tanglegrams, which have previously been an object of study in coevolutionary analysis. Although there has been considerable work on planar drawings of tanglegrams, they have not been fully explored as combinatorial objects until recently. In this paper, we describe how many discrete mathematical questions on trees "factor" through a problem on tanglegrams, and how understanding that factoring can simplify analysis. Depending on the problem, it may be useful to consider an unordered version of tanglegrams, and/or their unrooted counterparts. For all of these definitions, we show how the isomorphism types of tanglegrams can be understood in terms of double cosets of the symmetric group, and we investigate their automorphisms. Understanding tanglegrams better will isolate the distinct problems on leaf-labeled pairs of trees and reveal natural symmetries of spaces associated with such problems.

  18. The rapidly changing landscape of insect phylogenetics. (United States)

    Maddison, David R


    Insect phylogenetics is being profoundly changed by many innovations. Although rapid developments in genomics have center stage, key progress has been made in phenomics, field and museum science, digital databases and pipelines, analytical tools, and the culture of science. The importance of these methodological and cultural changes to the pace of inference of the hexapod Tree of Life is discussed. The innovations have the potential, when synthesized and mobilized in ways as yet unforeseen, to shine light on the million or more clades in insects, and infer their composition with confidence. There are many challenges to overcome before insects can enter the 'phylocognisant age', but because of the promise of genomics, phenomics, and informatics, that is now an imaginable future.

  19. Zika Virus: Emergence, Phylogenetics, Challenges, and Opportunities. (United States)

    Rajah, Maaran M; Pardy, Ryan D; Condotta, Stephanie A; Richer, Martin J; Sagan, Selena M


    Zika virus (ZIKV) is an emerging arthropod-borne pathogen that has recently gained notoriety due to its rapid and ongoing geographic expansion and its novel association with neurological complications. Reports of ZIKV-associated Guillain-Barré syndrome as well as fetal microcephaly place emphasis on the need to develop preventative measures and therapeutics to combat ZIKV infection. Thus, it is imperative that models to study ZIKV replication and pathogenesis and the immune response are developed in conjunction with integrated vector control strategies to mount an efficient response to the pandemic. This paper summarizes the current state of knowledge on ZIKV, including the clinical features, phylogenetic analyses, pathogenesis, and the immune response to infection. Potential challenges in developing diagnostic tools, treatment, and prevention strategies are also discussed.

  20. The Shapley Value of Phylogenetic Trees

    CERN Document Server

    Haake, Claus-Jochen; Su, Francis Edward


    Every weighted tree corresponds naturally to a cooperative game that we call a "tree game"; it assigns to each subset of leaves the sum of the weights of the minimal subtree spanned by those leaves. In the context of phylogenetic trees, the leaves are species and this assignment captures the diversity present in the coalition of species considered. We consider the Shapley value of tree games and suggest a biological interpretation. We determine the linear transformation M that shows the dependence of the Shapley value on the edge weights of the tree, and we also compute a null space basis of M. Both depend on the "split counts" of the tree. Finally, we characterize the Shapley value on tree games by four axioms, a counterpart to Shapley's original theorem on the larger class of cooperative games.
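
    The tree game and its Shapley value can be illustrated with a brute-force sketch. The three-leaf star tree, its edge weights, and the names `LEAVES`, `EDGES`, `tree_game_value` and `shapley_values` are hypothetical choices made for this example, not the authors' code; the sketch simply applies the standard Shapley formula to the game described in the abstract, using the fact that an edge belongs to the minimal spanning subtree exactly when the coalition has leaves on both sides of it.

```python
from fractions import Fraction
from itertools import combinations
from math import factorial

# Hypothetical example: a star tree with three leaves. Each edge is stored as
# (set of leaves on one side of the edge, edge weight).
LEAVES = ("a", "b", "c")
EDGES = [({"a"}, 1), ({"b"}, 2), ({"c"}, 3)]


def tree_game_value(coalition, edges=EDGES):
    """Total weight of the minimal subtree spanned by the coalition's leaves."""
    coalition = set(coalition)
    total = 0
    for side, weight in edges:
        # An edge is in the spanning subtree only if it separates two chosen leaves.
        if coalition & side and coalition - side:
            total += weight
    return total


def shapley_values(leaves=LEAVES, edges=EDGES):
    """Exact Shapley value of every leaf, by enumerating all coalitions."""
    n = len(leaves)
    values = {}
    for leaf in leaves:
        others = [x for x in leaves if x != leaf]
        phi = Fraction(0)
        for k in range(n):
            # Standard Shapley coefficient for coalitions of size k.
            coeff = Fraction(factorial(k) * factorial(n - k - 1), factorial(n))
            for subset in combinations(others, k):
                gain = tree_game_value(subset + (leaf,), edges) - tree_game_value(subset, edges)
                phi += coeff * gain
        values[leaf] = phi
    return values
```

    On this toy tree the values sum to the total edge weight, as the efficiency property of the Shapley value requires. The enumeration is exponential in the number of leaves, so it is only suitable for small illustrations.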

  1. Inferring Phylogenetic Networks from Gene Order Data

    Directory of Open Access Journals (Sweden)

    Alexey Anatolievich Morozov


    Existing algorithms allow us to infer phylogenetic networks from sequences (DNA, protein or binary), sets of trees, and distance matrices, but there are no methods to build them using gene order data as an input. Here we describe several methods to build split networks from gene order data, perform simulation studies, and use our methods for analyzing and interpreting different real gene order datasets. All proposed methods are based on intermediate data, which can be generated from the genome structures under study and used as an input for network construction algorithms. Three intermediates are used: a set of jackknife trees, a distance matrix, and a binary encoding. According to simulations and case studies, the best intermediates are jackknife trees and the distance matrix (when used with the Neighbor-Net algorithm). Binary encoding can also be useful, but only when the methods mentioned above cannot be used.

  2. Acoustic classification of dwellings

    DEFF Research Database (Denmark)

    Berardi, Umberto; Rasmussen, Birgit


    Schemes for the classification of dwellings according to different building performances have been proposed in the last years worldwide. The general idea behind these schemes relates to the positive impact a higher label, and thus a better performance, should have. In particular, focusing on sound insulation performance, national schemes for sound classification of dwellings have been developed in several European countries. These schemes define acoustic classes according to different levels of sound insulation. Due to the lack of coordination among countries, a significant diversity in terms... exchanging experiences about constructions fulfilling different classes, reducing trade barriers, and finally increasing the sound insulation of dwellings.

  3. Classification of hand eczema

    DEFF Research Database (Denmark)

    Agner, T; Aalto-Korte, K; Andersen, K E;


    BACKGROUND: Classification of hand eczema (HE) is mandatory in epidemiological and clinical studies, and also important in clinical work. OBJECTIVES: The aim was to test a recently proposed classification system of HE in clinical practice in a prospective multicentre study. METHODS: Patients were...... HE, protein contact dermatitis/contact urticaria, hyperkeratotic endogenous eczema and vesicular endogenous eczema, respectively. An additional diagnosis was given if symptoms indicated that factors additional to the main diagnosis were of importance for the disease. RESULTS: Four hundred and twenty......%) could not be classified. 38% had one additional diagnosis and 26% had two or more additional diagnoses. Eczema on feet was found in 30% of the patients, statistically significantly more frequently associated with hyperkeratotic and vesicular endogenous eczema. CONCLUSION: We find that the classification...

  4. Cellular image classification

    CERN Document Server

    Xu, Xiang; Lin, Feng


    This book introduces new techniques for cellular image feature extraction, pattern recognition and classification. The authors use the antinuclear antibodies (ANAs) in patient serum as the subjects and the Indirect Immunofluorescence (IIF) technique as the imaging protocol to illustrate the applications of the described methods. Throughout the book, the authors provide evaluations for the proposed methods on two publicly available human epithelial (HEp-2) cell datasets: ICPR2012 dataset from the ICPR'12 HEp-2 cell classification contest and ICIP2013 training dataset from the ICIP'13 Competition on cells classification by fluorescent image analysis. First, the reading of imaging results is significantly influenced by one’s qualification and reading systems, causing high intra- and inter-laboratory variance. The authors present a low-order LP21 fiber mode for optical single cell manipulation and imaging staining patterns of HEp-2 cells. A focused four-lobed mode distribution is stable and effective in optical...

  5. Sound classification of dwellings

    DEFF Research Database (Denmark)

    Rasmussen, Birgit


    National schemes for sound classification of dwellings exist in more than ten countries in Europe, typically published as national standards. The schemes define quality classes reflecting different levels of acoustical comfort. Main criteria concern airborne and impact sound insulation between....... Descriptors, range of quality levels, number of quality classes, class intervals, denotations and descriptions vary across Europe. The diversity is an obstacle for exchange of experience about constructions fulfilling different classes, implying also trade barriers. Thus, a harmonized classification scheme...... is needed, and a European COST Action TU0901 "Integrating and Harmonizing Sound Insulation Aspects in Sustainable Urban Housing Constructions", has been established and runs 2009-2013, one of the main objectives being to prepare a proposal for a European sound classification scheme with a number of quality...

  6. Classification problem in CBIR

    Directory of Open Access Journals (Sweden)

    Tatiana Jaworska


    At present a great deal of research is being done in different aspects of Content-Based Image Retrieval (CBIR). Image classification is one of the most important tasks in image retrieval that must be dealt with. The primary issue we have addressed is: how can fuzzy set theory be used to handle crisp image data? We propose fuzzy rule-based classification of image objects. To achieve this goal we have built fuzzy rule-based classifiers for crisp data. In this paper we present the results of fuzzy rule-based classification in our CBIR. Furthermore, these results are used to construct a search engine taking into account data mining.

  7. Supernova Photometric Classification Challenge

    CERN Document Server

    Kessler, Richard; Jha, Saurabh; Kuhlmann, Stephen


    We have publicly released a blinded mix of simulated SNe, with types (Ia, Ib, Ic, II) selected in proportion to their expected rate. The simulation is realized in the griz filters of the Dark Energy Survey (DES) with realistic observing conditions (sky noise, point spread function and atmospheric transparency) based on years of recorded conditions at the DES site. Simulations of non-Ia type SNe are based on spectroscopically confirmed light curves that include unpublished non-Ia samples donated from the Carnegie Supernova Project (CSP), the Supernova Legacy Survey (SNLS), and the Sloan Digital Sky Survey-II (SDSS-II). We challenge scientists to run their classification algorithms and report a type for each SN. A spectroscopically confirmed subset is provided for training. The goals of this challenge are to (1) learn the relative strengths and weaknesses of the different classification algorithms, (2) use the results to improve classification algorithms, and (3) understand what spectroscopically confirmed sub-...

  8. Information gathering for CLP classification. (United States)

    Marcello, Ida; Giordano, Felice; Costamagna, Francesca Marina


    Regulation 1272/2008 includes provisions for two types of classification: harmonised classification and self-classification. The harmonised classification of substances is decided at Community level and a list of harmonised classifications is included in Annex VI of the classification, labelling and packaging Regulation (CLP). If a chemical substance is not included in the harmonised classification list it must be self-classified, based on available information, according to the requirements of Annex I of the CLP Regulation. CLP stipulates that harmonised classification will be performed for carcinogenic, mutagenic or toxic to reproduction substances (CMR substances) and for respiratory sensitisers category 1, and for other hazard classes on a case-by-case basis. The first step of classification is the gathering of available and relevant information. This paper presents the procedure for gathering information and obtaining data. The data quality is also discussed.

  9. The paradox of atheoretical classification

    DEFF Research Database (Denmark)

    Hjørland, Birger


    A distinction can be made between “artificial classifications” and “natural classifications,” where artificial classifications may adequately serve some limited purposes, but natural classifications are overall most fruitful by allowing inference and thus many different purposes. There is strong...... support for the view that a natural classification should be based on a theory (and, of course, that the most fruitful theory provides the most fruitful classification). Nevertheless, atheoretical (or “descriptive”) classifications are often produced. Paradoxically, atheoretical classifications may...... be very successful. The best example of a successful “atheoretical” classification is probably the prestigious Diagnostic and Statistical Manual of Mental Disorders (DSM) since its third edition from 1980. Based on such successes one may ask: Should the claim that classifications ideally are natural...

  10. Information gathering for CLP classification

    Directory of Open Access Journals (Sweden)

    Ida Marcello


    Regulation 1272/2008 includes provisions for two types of classification: harmonised classification and self-classification. The harmonised classification of substances is decided at Community level and a list of harmonised classifications is included in Annex VI of the classification, labelling and packaging Regulation (CLP). If a chemical substance is not included in the harmonised classification list it must be self-classified, based on available information, according to the requirements of Annex I of the CLP Regulation. CLP stipulates that harmonised classification will be performed for carcinogenic, mutagenic or toxic to reproduction substances (CMR substances) and for respiratory sensitisers category 1, and for other hazard classes on a case-by-case basis. The first step of classification is the gathering of available and relevant information. This paper presents the procedure for gathering information and obtaining data. The data quality is also discussed.

  11. Phylogenetic insights into Andean plant diversification

    Directory of Open Access Journals (Sweden)

    Federico Luebert


    Andean orogeny is considered one of the most important events for the development of current plant diversity in South America. We compare available phylogenetic studies and divergence time estimates for plant lineages that may have diversified in response to Andean orogeny. The influence of the Andes on plant diversification is separated into four major groups: the Andes as a source of new high-elevation habitats, as a vicariant barrier, as a North-South corridor, and as a generator of new environmental conditions outside the Andes. Biogeographical relationships between the Andes and other regions are also considered. Divergence time estimates indicate that high-elevation lineages originated and diversified during or after the major phases of Andean uplift (Mid-Miocene to Pliocene), although there are some exceptions. As expected, Andean mid-elevation lineages tend to be older than high-elevation groups. Most clades with disjunct distribution on both sides of the Andes diverged during Andean uplift. Inner-Andean clades also tend to have divergence times during or after Andean uplift. This is interpreted as evidence of vicariance. Dispersal along the Andes has been shown to occur in either direction, mostly dated after the Andean uplift. Divergence time estimates of plant groups outside the Andes encompass a wider range of ages, indicating that the Andes may not necessarily be the cause of these diversifications. The Andes are biogeographically related to all neighbouring areas, especially Central America, with floristic interchanges in both directions since Early Miocene times. Direct biogeographical relationships between the Andes and other disjunct regions have also been shown in phylogenetic studies, especially with the eastern Brazilian highlands and North America. The history of the Andean flora is complex and plant diversification has been driven by a variety of processes, including environmental change, adaptation, and biotic interactions.

  12. Minimum Error Entropy Classification

    CERN Document Server

    Marques de Sá, Joaquim P; Santos, Jorge M F; Alexandre, Luís A


    This book explains the minimum error entropy (MEE) concept applied to data classification machines. Theoretical results on the inner workings of the MEE concept, in its application to solving a variety of classification problems, are presented in the wider realm of risk functionals. Researchers and practitioners will also find in the book a detailed presentation of practical data classifiers using MEE. These include multi‐layer perceptrons, recurrent neural networks, complex-valued neural networks, modular neural networks, and decision trees. A clustering algorithm using an MEE‐like concept is also presented. Examples, tests, evaluation experiments and comparisons with similar machines using classic approaches complement the descriptions.

  13. Latent classification models

    DEFF Research Database (Denmark)

    Langseth, Helge; Nielsen, Thomas Dyhre


    One of the simplest, and yet most consistently well-performing, sets of classifiers is the naive Bayes models. These models rely on two assumptions: (i) all the attributes used to describe an instance are conditionally independent given the class of that instance, and (ii) all attributes follow a specific... parametric family of distributions. In this paper we propose a new set of models for classification in continuous domains, termed latent classification models. The latent classification model can roughly be seen as combining the naive Bayes model with a mixture of factor analyzers, thereby relaxing the assumptions... classification model, and we demonstrate empirically that the accuracy of the proposed model is significantly higher than the accuracy of other probabilistic classifiers.
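
    The two naive Bayes assumptions named in the abstract can be made concrete with a small sketch of the baseline model (not the latent classification model itself, which the truncated abstract does not specify). It assumes a Gaussian as the parametric family; the function names `fit_gaussian_nb` and `predict` and the variance floor are illustrative choices for this example only.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(rows, labels):
    """Estimate a class prior plus a per-attribute mean and variance for each class."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)
    model = {}
    for label, members in grouped.items():
        count = len(members)
        columns = list(zip(*members))
        means = [sum(col) / count for col in columns]
        # A small floor keeps every variance strictly positive.
        variances = [
            max(sum((x - m) ** 2 for x in col) / count, 1e-9)
            for col, m in zip(columns, means)
        ]
        model[label] = (count / len(rows), means, variances)
    return model


def predict(model, row):
    """Return the class with the largest log posterior (up to a shared constant)."""
    def log_posterior(label):
        prior, means, variances = model[label]
        score = math.log(prior)
        # Assumption (i): attributes are independent given the class,
        # so their log-densities simply add up.
        for x, m, v in zip(row, means, variances):
            # Assumption (ii): each attribute is Gaussian within a class.
            score += -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        return score
    return max(model, key=log_posterior)
```

    Because each attribute contributes its own independent term, correlated attributes are effectively counted twice; this is the kind of restriction that the abstract says the latent classification model relaxes.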

  14. Bosniak Classification system

    DEFF Research Database (Denmark)

    Graumann, Ole; Osther, Susanne Sloth; Karstoft, Jens;


    Purpose: To investigate the inter- and intra-observer agreement among experienced uroradiologists when categorizing complex renal cysts according to the Bosniak classification. Material and Methods: The original categories of 100 cystic renal masses were chosen as “Gold Standard” (GS), established...... to the calculated weighted κ all readers performed “very good” for both inter-observer and intra-observer variation. Most variation was seen in cysts categorized as Bosniak II, IIF, and III. These results show that radiologists who evaluate complex renal cysts routinely may apply the Bosniak classification...
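
    A weighted κ of the kind reported above can be sketched as follows. The abstract does not say which weighting scheme was used, so linear agreement weights are an assumption here, as are the function name `linear_weighted_kappa` and the encoding of the ordinal categories as integers 0..n-1.

```python
def linear_weighted_kappa(ratings_a, ratings_b, n_categories):
    """Cohen's kappa with linear agreement weights for ordinal categories 0..n-1."""
    n = len(ratings_a)
    span = n_categories - 1

    def weight(i, j):
        # Full credit for exact agreement, partial credit for near misses.
        return 1 - abs(i - j) / span

    # Weighted agreement actually observed between the two readers.
    observed = sum(weight(a, b) for a, b in zip(ratings_a, ratings_b)) / n
    # Weighted agreement expected by chance from each reader's marginal frequencies.
    share_a = [ratings_a.count(c) / n for c in range(n_categories)]
    share_b = [ratings_b.count(c) / n for c in range(n_categories)]
    expected = sum(
        share_a[i] * share_b[j] * weight(i, j)
        for i in range(n_categories)
        for j in range(n_categories)
    )
    return (observed - expected) / (1 - expected)
```

    The function is undefined when chance agreement equals 1 (for example, when both readers put every case in the same single category), so a real analysis would need to handle that case separately.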

  15. Classification problem in CBIR


    Tatiana Jaworska


    At present a great deal of research is being done in different aspects of Content-Based Im-age Retrieval (CBIR). Image classification is one of the most important tasks in image re-trieval that must be dealt with. The primary issue we have addressed is: how can the fuzzy set theory be used to handle crisp image data. We propose fuzzy rule-based classification of image objects. To achieve this goal we have built fuzzy rule-based classifiers for crisp data. In this paper we present the results ...

  16. Classification of iconic images


    Zrianina, Mariia; Kopf, Stephan


    Iconic images represent an abstract topic and use a presentation that is intuitively understood within a certain cultural context. For example, the abstract topic “global warming” may be represented by a polar bear standing alone on an ice floe. Such images are widely used in media and their automatic classification can help to identify high-level semantic concepts. This paper presents a system for the classification of iconic images. It uses a variation of the Bag of Visual Words approach wi...

  17. Assessing Measures of Order Flow Toxicity via Perfect Trade Classification

    DEFF Research Database (Denmark)

    Andersen, Torben G.; Bondarenko, Oleg

    The VPIN metric involves decomposing volume into active buys and sells. We use the best-bid-offer (BBO) files from the CME Group to construct (near) perfect trade classification measures for the E-mini S&P 500 futures contract. We investigate the accuracy of the ELO Bulk Volume Classification (BVC) scheme...... and find it inferior to a standard tick rule based on individual transactions. Moreover, when VPIN is constructed from accurate classification, it behaves in a diametrically opposite way to BVC-VPIN. We also find the latter to have forecast power for short-term volatility solely because it generates...... systematic classification errors that are correlated with trading volume and return volatility. When controlling for trading intensity and volatility, the BVC-VPIN measure has no incremental predictive power for future volatility. We conclude that VPIN is not suitable for measuring order flow imbalances....
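
    For readers unfamiliar with the tick rule mentioned above, a minimal sketch of its common textbook form follows. It is not taken from the paper: the function name `tick_rule`, the +1/-1/0 encoding, and the choice to leave the first trade undetermined are assumptions made for this example.

```python
def tick_rule(prices):
    """Label each trade +1 (buyer-initiated), -1 (seller-initiated) or 0 (undetermined)."""
    labels = []
    previous_price = None
    previous_label = 0
    for price in prices:
        if previous_price is None or price == previous_price:
            # First trade or zero tick: carry the last label forward.
            label = previous_label
        elif price > previous_price:
            label = 1   # uptick
        else:
            label = -1  # downtick
        labels.append(label)
        previous_price, previous_label = price, label
    return labels
```

    For example, `tick_rule([100.0, 100.25, 100.25, 100.0, 100.0, 100.5])` labels the sequence as undetermined, buy, buy, sell, sell, buy.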

  18. A Syntactic Classification based Web Page Ranking Algorithm

    CERN Document Server

    Mukhopadhyay, Debajyoti; Kim, Young-Chon


    Existing search engines sometimes give unsatisfactory search results because the results are not categorized. If there is some means of knowing the user's preference about the search results and ranking pages according to that preference, the results will be more useful and accurate for the user. In the present paper a web page ranking algorithm is proposed based on syntactic classification of web pages. Syntactic classification does not consider the meaning of the content of a web page. The proposed approach mainly consists of three steps: select some properties of web pages based on the user's demand, measure them, and give a different weight to each property during ranking for different types of pages. The existence of syntactic classification is supported by running the fuzzy c-means algorithm and neural network classification on a set of web pages. The change in ranking for different types of pages but for the same query string is also demonstrated.

  19. Land Cover Classification Using ALOS Imagery For Penang, Malaysia (United States)

    Sim, C. K.; Abdullah, K.; MatJafri, M. Z.; Lim, H. S.


    This paper presents the potential of integrating optical and radar remote sensing data to improve automatic land cover mapping. The analysis involved standard image processing, and consists of spectral signature extraction and application of a statistical decision rule to identify land cover categories. A maximum likelihood classifier is utilized to determine different land cover categories. Ground reference data from sites throughout the study area are collected for training and validation. The land cover information was extracted from the digital data using PCI Geomatica 10.3.2 software package. The variations in classification accuracy due to a number of radar imaging processing techniques are studied. The relationship between the processing window and the land classification is also investigated. The classification accuracies from the optical and radar feature combinations are studied. Our research finds that fusion of radar and optical significantly improved classification accuracies. This study indicates that the land cover/use can be mapped accurately by using this approach.

  20. Automatic classification of time-variable X-ray sources

    CERN Document Server

    Lo, Kitty K; Murphy, Tara; Gaensler, B M


    To maximize the discovery potential of future synoptic surveys, especially in the field of transient science, it will be necessary to use automatic classification to identify some of the astronomical sources. The data mining technique of supervised classification is suitable for this problem. Here, we present a supervised learning method to automatically classify variable X-ray sources in the second XMM-Newton serendipitous source catalog (2XMMi-DR2). Random Forest is our classifier of choice since it is one of the most accurate learning algorithms available. Our training set consists of 873 variable sources and their features are derived from time series, spectra, and other multi-wavelength contextual information. The 10-fold cross validation accuracy of the training data is ~97% on a seven-class data set. We applied the trained classification model to 411 unknown variable 2XMM sources to produce a probabilistically classified catalog. Using the classification margin and the Random Forest der...

  1. Ensemble polarimetric SAR image classification based on contextual sparse representation (United States)

    Zhang, Lamei; Wang, Xiao; Zou, Bin; Qiao, Zhijun


    Polarimetric SAR (PolSAR) image interpretation has become one of the most interesting topics, in which the construction of a reasonable and effective image classification technique is of key importance. Sparse representation represents the data using the most succinct sparse atoms of an over-complete dictionary, and its advantages have been confirmed in the field of PolSAR classification. However, like any ordinary classifier, it is imperfect in several respects. Ensemble learning is therefore introduced to address this issue: a plurality of different learners are trained, and their individual results are combined to obtain more accurate and ideal learning results. This paper accordingly presents a polarimetric SAR image classification method based on the ensemble learning of sparse representation to achieve the optimal classification.

  2. Flying insect detection and classification with inexpensive sensors. (United States)

    Chen, Yanping; Why, Adena; Batista, Gustavo; Mafra-Neto, Agenor; Keogh, Eamonn


    An inexpensive, noninvasive system that could accurately classify flying insects would have important implications for entomological research, and allow for the development of many useful applications in vector and pest control for both medical and agricultural entomology. Given this, the last sixty years have seen many research efforts devoted to this task. To date, however, none of this research has had a lasting impact. In this work, we show that pseudo-acoustic optical sensors can produce superior data; that additional features, both intrinsic and extrinsic to the insect's flight behavior, can be exploited to improve insect classification; that a Bayesian classification approach makes it possible to efficiently learn classification models that are very robust to over-fitting; and that a general classification framework makes it easy to incorporate an arbitrary number of features. We demonstrate the findings with large-scale experiments that dwarf all previous works combined, as measured by the number of insects and the number of species considered.

  3. morePhyML: improving the phylogenetic tree space exploration with PhyML 3. (United States)

    Criscuolo, Alexis


    PhyML is a widely used Maximum Likelihood (ML) phylogenetic tree inference software based on a standard hill-climbing method. Starting from an initial tree, version 3 of PhyML explores the tree space by using "Nearest Neighbor Interchange" (NNI) or "Subtree Pruning and Regrafting" (SPR) tree swapping techniques in order to find the ML phylogenetic tree. NNI-based local searches are fast but can often get trapped in local optima, whereas it is expected that the larger (but slower to cover) SPR-based neighborhoods will lead to trees with higher likelihood. Here, I verify that PhyML infers more likely trees with SPRs than with NNIs in almost all cases. However, I also show that the SPR-based local search of PhyML often does not succeed at locating the ML tree. To improve the tree space exploration, I deliver a script, named morePhyML, which allows escaping from local optima by performing character reweighting. This ML tree search strategy, named ratchet, often leads to higher likelihood estimates. Based on the analysis of a large number of amino acid and nucleotide data sets, I show that in many cases morePhyML allows inferring more accurate phylogenetic trees than several other recently developed ML tree inference programs.

  4. The complete chloroplast genome sequences of five Epimedium species: lights into phylogenetic and taxonomic analyses

    Directory of Open Access Journals (Sweden)

    Yanjun Zhang


    Epimedium L. is a phylogenetically and economically important genus in the family Berberidaceae. Here we sequenced the complete chloroplast (cp) genomes of four Epimedium species using Illumina sequencing technology via a combination of de novo and reference-guided assembly; combined with the previously reported cp genome sequence of E. koreanum, this is the first comprehensive cp genome analysis of Epimedium. The five Epimedium cp genomes exhibited a typical quadripartite, circular structure that was rather conserved in genomic structure and the synteny of gene order. However, these cp genomes presented obvious variations at the boundaries of the four regions because of the expansion and contraction of the inverted repeat (IR) region and the single-copy (SC) boundary regions. The trnQ-UUG duplication occurred in the five Epimedium cp genomes but was not found in the other basal eudicotyledons. Rapidly evolving cp genome regions were detected among the five cp genomes, and differences in simple sequence repeats (SSRs) and repeat sequences were identified. Phylogenetic relationships among the five Epimedium species based on their cp genomes were on the whole in accordance with the updated system of the genus, but indicated that the evolutionary relationships and the divisions of the genus need further investigation with more evidence. The availability of these cp genomes provides valuable genetic information for accurate species identification, taxonomy, phylogenetic resolution and the study of the evolution of Epimedium, and will assist in the exploration and utilization of Epimedium plants.

  5. Ascospore morphology is a poor predictor of the phylogenetic relationships of Neurospora and Gelasinospora. (United States)

    Dettman, J R; Harbinski, F M; Taylor, J W


    The genera Neurospora and Gelasinospora are conventionally distinguished by differences in ascospore ornamentation, with elevated longitudinal ridges (ribs) separated by depressed grooves (veins) in Neurospora and spherical or oval indentations (pits) in Gelasinospora. The phylogenetic relationships of representatives of 12 Neurospora and 4 Gelasinospora species were assessed with the DNA sequences of four nuclear genes. Within the genus Neurospora, the 5 outbreeding conidiating species form a monophyletic group with N. discreta as the most divergent, and 4 of the homothallic species form a monophyletic group. In combined analysis, each of the conventionally defined Gelasinospora species was more closely related to a Neurospora species than to another Gelasinospora species. Evidently, the Neurospora and Gelasinospora species included in this study do not represent two clearly resolved monophyletic sister genera, but instead represent a polyphyletic group of taxa with close phylogenetic relationships and significant morphological similarities. Ascospore morphology, the character that the distinction between the genera Neurospora and Gelasinospora is based upon, was not an accurate predictor of phylogenetic relationships.

  6. [Classification of diabetes: an increasing heterogeneity]. (United States)

    Corcillo, Antonella; Corcillo Vionnet, Antonella; Jornayvaz, François R


    Diabetes mellitus is usually subdivided into type 1 and type 2. Despite precise criteria, distinguishing between these two types of diabetes can be difficult because some cases show features of both classes. Adults aged 20 to 40 are particularly at risk of presenting an intermediate type of diabetes and are thus subject to misclassification. The distinction between these subtypes is relevant because it affects the therapeutic decision and the outcome, which depends on insulin supply and therefore on the evolution to insulin dependence. Thus, it seems important to review a new and more accurate classification of diabetes in order to offer more appropriate care to patients.

  7. Spectral classification using convolutional neural networks

    CERN Document Server

    Hála, Pavel


    There is a great need for accurate and autonomous spectral classification methods in astrophysics. This thesis is about training a convolutional neural network (ConvNet) to recognize an object class (quasar, star or galaxy) from one-dimensional spectra only. The author developed several scripts and C programs for dataset preparation, preprocessing and postprocessing of the data. The EBLearn library (developed by Pierre Sermanet and Yann LeCun) was used to create the ConvNets. Application to a dataset of more than 60000 spectra yielded a success rate of nearly 95%. This thesis conclusively proved the great potential of convolutional neural networks and deep learning methods in astrophysics.

  8. Accurate tracking control in LOM application

    Institute of Scientific and Technical Information of China (English)


    The fabrication of an accurate prototype directly from a CAD model in a short time depends on accurate tracking control and reference trajectory planning in Laminated Object Manufacture (LOM) applications. An improvement in contour accuracy is obtained by introducing a tracking controller and a trajectory generation policy. A model of the X-Y positioning system of the LOM machine is developed as the design basis of the tracking controller. The ZPETC (Zero Phase Error Tracking Controller) is used to eliminate single-axis following error and thus reduce the contour error. The simulation is developed on a Matlab model based on a retrofitted LOM machine, and a satisfactory result is obtained.

  9. Motif-Based Text Mining of Microbial Metagenome Redundancy Profiling Data for Disease Classification

    Directory of Open Access Journals (Sweden)

    Yin Wang


    Full Text Available Background. Text data of 16S rRNA are informative for classifications of microbiota-associated diseases. However, the raw text data need to be systematically processed so that features for classification can be defined/extracted; moreover, the high-dimension feature spaces generated by the text data also pose an additional difficulty. Results. Here we present a Phylogenetic Tree-Based Motif Finding algorithm (PMF) to analyze 16S rRNA text data. By integrating phylogenetic rules and other statistical indexes for classification, we can effectively reduce the dimension of the large feature spaces generated by the text datasets. Using the retrieved motifs in combination with common classification methods, we can discriminate different samples of both pneumonia and dental caries better than other existing methods. Conclusions. We extend the phylogenetic approaches to perform supervised learning on microbiota text data to discriminate the pathological states for pneumonia and dental caries. The results have shown that PMF may enhance the efficiency and reliability in analyzing high-dimension text data.

  10. Identification of astigmatid mites using the second internal transcribed spacer (ITS2) region and its application for phylogenetic study. (United States)

    Noge, Koji; Mori, Naoki; Tanaka, Chihiro; Nishida, Ritsuo; Tsuda, Mitsuya; Kuwahara, Yasumasa


    The second internal transcribed spacer (ITS2) of nuclear ribosomal DNA from 73 specimens of Astigmata was analyzed by PCR amplification and DNA sequencing. The length of the ITS2 region varied from 282 to 592 bp. The interspecific variation based on consensus sequences was more than 4.1%, while the intraspecific or intra-individual variation was from 0 to 5.7%. The variation between geographically separated populations (0-3.2%) was almost the same as the variation within strains. The sequences of the ITS2 region of Astigmata were concluded to be species-specific. The phylogenetic tree inferred from the ITS2 region supported Zachvatkin's morphological classification in the subfamily Rhizoglyphinae. The species-specific ITS2 sequence is useful for the species identification of astigmatid mites and for studying low-level phylogenetic relationships.

  11. Validation and Classification of Web Services using Equalization Validation Classification

    Directory of Open Access Journals (Sweden)



    Full Text Available In the business process world, web services present a managed middleware to connect a huge number of services. Web service transaction is a mechanism to compose services with their desired quality parameters. If an enormous number of transactions occur, the provider cannot acquire accurate data at the correct time. So it is necessary to reduce the overburden of web service transactions. In order to reduce the excess of transactions from customers to providers, this paper proposes a new method called Equalization Validation Classification. This method introduces a new weight-reducing algorithm called the Efficient Trim Down (ETD) algorithm to reduce the overburden of the incoming client requests. When the proposed algorithm is compared with decision tree algorithms (J48, Random Tree, Random Forest, AD Tree), it produces better accuracy and validation than the existing algorithms. The proposed trimming method was analyzed with the decision tree algorithms, and the results show that the ETD algorithm provides better performance in terms of improved accuracy with effective validation. Therefore, the proposed method provides a good gateway to reduce the overburden of the client requests in web services. Moreover, analyzing the requests arriving from a vast number of clients and preventing illegitimate requests saves the service provider time.

  12. Classification of waste packages

    Energy Technology Data Exchange (ETDEWEB)

    Mueller, H.P.; Sauer, M.; Rojahn, T. [Versuchsatomkraftwerk GmbH, Kahl am Main (Germany)


    A barrel gamma scanning unit has been in use at the VAK for the classification of radioactive waste materials since 1998. The unit provides the facility operator with the data required for classification of waste barrels. Once these data have been entered into the AVK data processing system, the radiological status of raw waste as well as pre-treated and processed waste can be tracked from the point of origin to the point at which the waste is delivered to a final storage. Since the barrel gamma scanning unit was commissioned in 1998, approximately 900 barrels have been measured and the relevant data required for classification collected and analyzed. Based on the positive results of experience in the use of the mobile barrel gamma scanning unit, the VAK now offers the classification of barrels as a service to external users. Depending upon waste quantity accumulation, this measurement unit offers facility operators a reliable and time-saving and cost-effective means of identifying and documenting the radioactivity inventory of barrels scheduled for final storage. (orig.)

  13. Improving Student Question Classification (United States)

    Heiner, Cecily; Zachary, Joseph L.


    Students in introductory programming classes often articulate their questions and information needs incompletely. Consequently, the automatic classification of student questions to provide automated tutorial responses is a challenging problem. This paper analyzes 411 questions from an introductory Java programming course by reducing the natural…

  14. Event Classification using Concepts

    NARCIS (Netherlands)

    Boer, M.H.T. de; Schutte, K.; Kraaij, W.


    The semantic gap is one of the challenges in the GOOSE project. In this paper a Semantic Event Classification (SEC) system is proposed as an initial step in tackling the semantic gap challenge in the GOOSE project. This system uses semantic text analysis, multiple feature detectors using the BoW mod

  15. Nearest convex hull classification

    NARCIS (Netherlands)

    G.I. Nalbantov (Georgi); P.J.F. Groenen (Patrick); J.C. Bioch (Cor)


    Consider the classification task of assigning a test object to one of two or more possible groups, or classes. An intuitive way to proceed is to assign the object to the class to which the distance is minimal. As a distance measure to a class, we propose here to use the distance to the

  16. Classification of myocardial infarction

    DEFF Research Database (Denmark)

    Saaby, Lotte; Poulsen, Tina Svenstrup; Hosbond, Susanne Elisabeth;


    The classification of myocardial infarction into 5 types was introduced in 2007 as an important component of the universal definition. In contrast to the plaque rupture-related type 1 myocardial infarction, type 2 myocardial infarction is considered to be caused by an imbalance between demand and...

  17. Recurrent neural collective classification. (United States)

    Monner, Derek D; Reggia, James A


    With the recent surge in availability of data sets containing not only individual attributes but also relationships, classification techniques that take advantage of predictive relationship information have gained in popularity. The most popular existing collective classification techniques have a number of limitations: some of them generate arbitrary and potentially lossy summaries of the relationship data, whereas others ignore directionality and strength of relationships. Popular existing techniques make use of only direct neighbor relationships when classifying a given entity, ignoring potentially useful information contained in expanded neighborhoods of radius greater than one. We present a new technique that we call recurrent neural collective classification (RNCC), which avoids arbitrary summarization, uses information about relationship directionality and strength, and through recursive encoding, learns to leverage larger relational neighborhoods around each entity. Experiments with synthetic data sets show that RNCC can make effective use of relationship data for both direct and expanded neighborhoods. Further experiments demonstrate that our technique outperforms previously published results of several collective classification methods on a number of real-world data sets.

  18. Sandwich classification theorem

    Directory of Open Access Journals (Sweden)

    Alexey Stepanov


    Full Text Available The present note arises from the author's talk at the conference ``Ischia Group Theory 2014''. For subgroups $F \le N$ of a group $G$, denote by $\mathrm{Lat}(F,N)$ the set of all subgroups of $N$ containing $F$. Let $D$ be a subgroup of $G$. In this note we study the lattice $\mathcal{L} = \mathrm{Lat}(D,G)$ and the lattice $\mathcal{L}'$ of subgroups of $G$ normalized by $D$. We say that $\mathcal{L}$ satisfies the sandwich classification theorem if $\mathcal{L}$ splits into a disjoint union of sandwiches $\mathrm{Lat}(F, N_G(F))$ over all subgroups $F$ such that the normal closure of $D$ in $F$ coincides with $F$. Here $N_G(F)$ denotes the normalizer of $F$ in $G$. A similar notion of sandwich classification is introduced for the lattice $\mathcal{L}'$. If $D$ is perfect, i.e., coincides with its commutator subgroup, then it turns out that the sandwich classification theorems for $\mathcal{L}$ and $\mathcal{L}'$ are equivalent. We also show how to find the basic subgroup $F$ of sandwiches for $\mathcal{L}'$ and review sandwich classification theorems in algebraic groups over rings.

  19. Dynamic Latent Classification Model

    DEFF Research Database (Denmark)

    Zhong, Shengtong; Martínez, Ana M.; Nielsen, Thomas Dyhre

    as possible. Motivated by this problem setting, we propose a generative model for dynamic classification in continuous domains. At each time point the model can be seen as combining a naive Bayes model with a mixture of factor analyzers (FA). The latent variables of the FA are used to capture the dynamics...... in the process as well as modeling dependences between attributes....

  20. Classifications in popular music

    NARCIS (Netherlands)

    van Venrooij, A.; Schmutz, V.; Wright, J.D.


    The categorical system of popular music, such as genre categories, is a highly differentiated and dynamic classification system. In this article we present work that studies different aspects of these categorical systems in popular music. Following the work of Paul DiMaggio, we focus on four questio

  1. Shark Teeth Classification (United States)

    Brown, Tom; Creel, Sally; Lee, Velda


    On a recent autumn afternoon at Harmony Leland Elementary in Mableton, Georgia, students in a fifth-grade science class investigated the essential process of classification--the act of putting things into groups according to some common characteristics or attributes. While they may have honed these skills earlier in the week by grouping their own…

  2. Accurate Switched-Voltage voltage averaging circuit


    金光, 一幸; 松本, 寛樹


    This paper proposes an accurate Switched-Voltage (SV) voltage averaging circuit. It is presented to compensate for NMOS mismatch error in the MOS differential type voltage averaging circuit. The proposed circuit consists of a voltage averaging circuit and an SV sample/hold (S/H) circuit. It can operate using nonoverlapping three-phase clocks. The performance of this circuit is verified by PSpice simulations.

  3. Accurate overlaying for mobile augmented reality

    NARCIS (Netherlands)

    Pasman, W; van der Schaaf, A; Lagendijk, RL; Jansen, F.W.


    Mobile augmented reality requires accurate alignment of virtual information with objects visible in the real world. We describe a system for mobile communications to be developed to meet these strict alignment criteria using a combination of computer vision, inertial tracking and low-latency renderi

  4. PhyDesign: an online application for profiling phylogenetic informativeness

    Directory of Open Access Journals (Sweden)

    Townsend Jeffrey P


    Full Text Available Background. The rapid increase in the number of sequenced genomes for species across the tree of life is revealing a diverse suite of orthologous genes that could potentially be employed to inform molecular phylogenetic studies that encompass broader taxonomic sampling. Optimal usage of this diversity of loci requires user-friendly tools to facilitate widespread cost-effective locus prioritization for phylogenetic sampling. The Townsend (2007) phylogenetic informativeness provides a unique empirical metric for guiding marker selection. However, no software or automated methodology to evaluate sequence alignments and estimate the phylogenetic informativeness metric has been available. Results. Here, we present PhyDesign, a platform-independent online application that implements the Townsend (2007) phylogenetic informativeness analysis, providing a quantitative prediction of the utility of loci to solve specific phylogenetic questions. An easy-to-use interface facilitates uploading of alignments and ultrametric trees to calculate and depict profiles of informativeness over specified time ranges, and provides rankings of locus prioritization for epochs of interest. Conclusions. By providing these profiles, PhyDesign facilitates locus prioritization, increasing the efficiency of sequencing for phylogenetic purposes compared to traditional studies with more laborious and low-capacity screening methods, as well as increasing the accuracy of phylogenetic studies. Together with a manual and sample files, the application is freely accessible at

  5. Utilization of complete chloroplast genomes for phylogenetic studies

    NARCIS (Netherlands)

    Ramlee, Shairul Izan Binti


    Chloroplast DNA sequence polymorphisms are a primary source of data in many plant phylogenetic studies. The chloroplast genome is relatively conserved in its evolution making it an ideal molecule to retain phylogenetic signals. The chloroplast genome is also largely, but not completely, free from ot

  6. Student Interpretations of Phylogenetic Trees in an Introductory Biology Course (United States)

    Dees, Jonathan; Momsen, Jennifer L.; Niemi, Jarad; Montplaisir, Lisa


    Phylogenetic trees are widely used visual representations in the biological sciences and the most important visual representations in evolutionary biology. Therefore, phylogenetic trees have also become an important component of biology education. We sought to characterize reasoning used by introductory biology students in interpreting taxa…

  7. Phylogenetic Analysis of Viridans Group Streptococci Causing Endocarditis (United States)

    Simmon, Keith E.; Hall, Lori; Woods, Christopher W.; Marco, Francesc; Miro, Jose M.; Cabell, Christopher; Hoen, Bruno; Marin, Mercedes; Utili, Riccardo; Giannitsioti, Efthymia; Doco-Lecompte, Thanh; Bradley, Suzanne; Mirrett, Stanley; Tambic, Arjana; Ryan, Suzanne; Gordon, David; Jones, Phillip; Korman, Tony; Wray, Dannah; Reller, L. Barth; Tripodi, Marie-Francoise; Plesiat, Patrick; Morris, Arthur J.; Lang, Selwyn; Murdoch, David R.; Petti, Cathy A.


    Identification of viridans group streptococci (VGS) to the species level is difficult because VGS exchange genetic material. We performed multilocus DNA target sequencing to assess phylogenetic concordance of VGS for a well-defined clinical syndrome. The hierarchy of sequence data was often discordant, underscoring the importance of establishing biological relevance for finer phylogenetic distinctions. PMID:18650347

  8. Phylogenetic analysis of viridans group streptococci causing endocarditis. (United States)

    Simmon, Keith E; Hall, Lori; Woods, Christopher W; Marco, Francesc; Miro, Jose M; Cabell, Christopher; Hoen, Bruno; Marin, Mercedes; Utili, Riccardo; Giannitsioti, Efthymia; Doco-Lecompte, Thanh; Bradley, Suzanne; Mirrett, Stanley; Tambic, Arjana; Ryan, Suzanne; Gordon, David; Jones, Phillip; Korman, Tony; Wray, Dannah; Reller, L Barth; Tripodi, Marie-Francoise; Plesiat, Patrick; Morris, Arthur J; Lang, Selwyn; Murdoch, David R; Petti, Cathy A


    Identification of viridans group streptococci (VGS) to the species level is difficult because VGS exchange genetic material. We performed multilocus DNA target sequencing to assess phylogenetic concordance of VGS for a well-defined clinical syndrome. The hierarchy of sequence data was often discordant, underscoring the importance of establishing biological relevance for finer phylogenetic distinctions.

  9. Phylogenetic diversity (PD) and biodiversity conservation: some bioinformatics challenges

    Directory of Open Access Journals (Sweden)

    Daniel P. Faith


    Full Text Available Biodiversity conservation addresses information challenges through estimations encapsulated in measures of diversity. A quantitative measure of phylogenetic diversity, “PD”, has been defined as the minimum total length of all the phylogenetic branches required to span a given set of taxa on the phylogenetic tree (Faith 1992a). While a recent paper incorrectly characterizes PD as not including information about deeper phylogenetic branches, PD applications over the past decade document the proper incorporation of shared deep branches when assessing the total PD of a set of taxa. Current PD applications to macroinvertebrate taxa in streams of New South Wales, Australia illustrate the practical importance of this definition. Phylogenetic lineages, often corresponding to new, “cryptic”, taxa, are restricted to a small number of stream localities. A recent case of human impact causing loss of taxa in one locality implies a higher PD value for another locality, because it now uniquely represents a deeper branch. This molecular-based phylogenetic pattern supports the use of DNA barcoding programs for biodiversity conservation planning. Here, PD assessments side-step the contentious use of barcoding-based “species” designations. Bio-informatics challenges include combining different phylogenetic evidence, optimization problems for conservation planning, and effective integration of phylogenetic information with environmental and socio-economic data.
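
    The PD definition quoted in this abstract can be illustrated with a small sketch. Everything below is an assumption made for illustration: the toy tree, its branch lengths, the function name, and the `include_root` switch are not from the paper. With `include_root=True` the branches between the set's most recent common ancestor and the root are counted (one reading of "shared deep branches"); with `include_root=False` only the minimal subtree connecting the taxa is summed.

```python
def phylogenetic_diversity(parent, length, taxa, include_root=True):
    """Sum the branch lengths needed to connect `taxa` on a rooted tree.

    parent: maps each non-root node to its parent node.
    length: maps each non-root node to the length of the branch above it.
    """
    taxa = set(taxa)
    below = {}  # node -> number of selected taxa at or below that node
    for taxon in taxa:
        node = taxon
        while node in parent:  # the root has no entry in `parent`
            below[node] = below.get(node, 0) + 1
            node = parent[node]
    total = 0.0
    for node, count in below.items():
        # A branch carrying every selected taxon lies above their common
        # ancestor; it is only counted when the root path is included.
        if include_root or count < len(taxa):
            total += length[node]
    return total


# Hypothetical tree: root -> (X, D); X -> (Y, C); Y -> (A, B)
parent = {"A": "Y", "B": "Y", "Y": "X", "C": "X", "X": "root", "D": "root"}
length = {"A": 1.0, "B": 1.0, "Y": 1.0, "C": 2.0, "X": 1.0, "D": 3.0}

pd_ab = phylogenetic_diversity(parent, length, ["A", "B"])
pd_ab_unrooted = phylogenetic_diversity(parent, length, ["A", "B"], include_root=False)
pd_ac = phylogenetic_diversity(parent, length, ["A", "C"])
```

    In this toy tree the sister taxa A and B share most of their path to the root, so adding C to A contributes more new branch length than adding B does.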

  10. Efficient Fingercode Classification (United States)

    Sun, Hong-Wei; Law, Kwok-Yan; Gollmann, Dieter; Chung, Siu-Leung; Li, Jian-Bin; Sun, Jia-Guang

    In this paper, we present an efficient fingerprint classification algorithm which is an essential component in many critical security application systems, e.g., systems in the e-government and e-finance domains. Fingerprint identification is one of the most important security requirements in homeland security systems such as personnel screening and anti-money laundering. The problem of fingerprint identification involves searching (matching) the fingerprint of a person against each of the fingerprints of all registered persons. To enhance performance and reliability, a common approach is to reduce the search space by firstly classifying the fingerprints and then performing the search in the respective class. Jain et al. proposed a fingerprint classification algorithm based on a two-stage classifier, which uses a K-nearest neighbor classifier in its first stage. The fingerprint classification algorithm is based on the fingercode representation which is an encoding of fingerprints that has been demonstrated to be an effective fingerprint biometric scheme because of its ability to capture both local and global details in a fingerprint image. We enhance this approach by improving the efficiency of the K-nearest neighbor classifier for fingercode-based fingerprint classification. Our research firstly investigates the various fast search algorithms in vector quantization (VQ) and the potential application in fingerprint classification, and then proposes two efficient algorithms based on the pyramid-based search algorithms in VQ. Experimental results on DB1 of FVC 2004 demonstrate that our algorithms can outperform the full search algorithm and the original pyramid-based search algorithms in terms of computational efficiency without sacrificing accuracy.

  11. Site-specific time heterogeneity of the substitution process and its impact on phylogenetic inference

    Directory of Open Access Journals (Sweden)

    Philippe Hervé


    Full Text Available Background. Model violations constitute the major limitation in inferring accurate phylogenies. Characterizing properties of the data that are not being correctly handled by current models is therefore of prime importance. One of the properties of protein evolution is the variation of the relative rate of substitutions across sites and over time, the latter being the phenomenon called heterotachy. Its effect on phylogenetic inference has recently obtained considerable attention, which led to the development of new models of sequence evolution. However, thus far focus has been on the quantitative heterogeneity of the evolutionary process, thereby overlooking more qualitative variations. Results. We studied the importance of variation of the site-specific amino-acid substitution process over time and its possible impact on phylogenetic inference. We used the CAT model to define an infinite mixture of substitution processes characterized by equilibrium frequencies over the twenty amino acids, a useful proxy for qualitatively estimating the evolutionary process. Using two large datasets, we show that qualitative changes in site-specific substitution properties over time occurred significantly. To test whether this unaccounted qualitative variation can lead to an erroneous phylogenetic tree, we analyzed a concatenation of mitochondrial proteins in which Cnidaria and Porifera were erroneously grouped. The progressive removal of the sites with the most heterogeneous CAT profiles across clades led to the recovery of the monophyly of Eumetazoa (Cnidaria+Bilateria), suggesting that this heterogeneity can negatively influence phylogenetic inference. Conclusion. The time-heterogeneity of the amino-acid replacement process is therefore an important evolutionary aspect that should be incorporated in future models of sequence change.

  12. Open Reading Frame Phylogenetic Analysis on the Cloud

    Directory of Open Access Journals (Sweden)

    Che-Lun Hung


    Full Text Available Phylogenetic analysis has become essential in researching the evolutionary relationships between viruses. These relationships are depicted on phylogenetic trees, in which viruses are grouped based on sequence similarity. Viral evolutionary relationships are identified from open reading frames rather than from complete sequences. Recently, cloud computing has become popular for developing internet-based bioinformatics tools. Biocloud is an efficient, scalable, and robust bioinformatics computing service. In this paper, we propose a cloud-based open reading frame phylogenetic analysis service. The proposed service integrates the Hadoop framework, virtualization technology, and phylogenetic analysis methods to provide a high-availability, large-scale bioservice. In a case study, we analyze the phylogenetic relationships among Norovirus. Evolutionary relationships are elucidated by aligning different open reading frame sequences. The proposed platform correctly identifies the evolutionary relationships between members of Norovirus.

  13. Visualising very large phylogenetic trees in three dimensional hyperbolic space

    Directory of Open Access Journals (Sweden)

    Liberles David A


    Full Text Available Background. Common existing phylogenetic tree visualisation tools are not able to display readable trees with more than a few thousand nodes. These existing methodologies are based in two-dimensional space. Results. We introduce the idea of visualising phylogenetic trees in three-dimensional hyperbolic space with the Walrus graph visualisation tool and have developed a conversion tool that enables the conversion of standard phylogenetic tree formats to Walrus' format. With Walrus, it becomes possible to visualise and navigate phylogenetic trees with more than 100,000 nodes. Conclusion. Walrus enables desktop visualisation of very large phylogenetic trees in three-dimensional hyperbolic space. This application is potentially useful for visualisation of the tree of life and for functional genomics derivatives, like The Adaptive Evolution Database (TAED).

  14. Free classification of American English dialects by native and non-native listeners. (United States)

    Clopper, Cynthia G; Bradlow, Ann R


    Most second language acquisition research focuses on linguistic structures, and less research has examined the acquisition of sociolinguistic patterns. The current study explored the perceptual classification of regional dialects of American English by native and non-native listeners using a free classification task. Results revealed similar classification strategies for the native and non-native listeners. However, the native listeners were more accurate overall than the non-native listeners. In addition, the non-native listeners were less able to make use of constellations of cues to accurately classify the talkers by dialect. However, the non-native listeners were able to attend to cues that were either phonologically or sociolinguistically relevant in their native language. These results suggest that non-native listeners can use information in the speech signal to classify talkers by regional dialect, but that their lack of signal-independent cultural knowledge about variation in the second language leads to less accurate classification performance.

  15. A Multi-Classification Method of Improved SVM-based Information Fusion for Traffic Parameters Forecasting

    Directory of Open Access Journals (Sweden)

    Hongzhuan Zhao


    Full Text Available With the enrichment of perception methods, the modern transportation system has many physical objects whose states are influenced by many information factors, so that it is a typical Cyber-Physical System (CPS). Thus, traffic information is generally multi-sourced, heterogeneous and hierarchical. Existing research results show that accurate classification of multi-sourced traffic information in the process of information fusion can achieve better parameter forecasting performance. To solve the problem of accurate traffic information classification, by analysing the characteristics of the multi-sourced traffic information and using a redefined binary tree to overcome the shortcomings of the original Support Vector Machine (SVM) classification in information fusion, a multi-classification method using improved SVM in information fusion for traffic parameters forecasting is proposed. An experiment was conducted to examine the performance of the proposed scheme, and the results reveal that the method can obtain more accurate and practical outcomes.

  16. Novel accurate bacterial discrimination by MALDI-time-of-flight MS based on ribosomal proteins coding in S10-spc-alpha operon at strain level S10-GERMS. (United States)

    Tamura, Hiroto; Hotta, Yudai; Sato, Hiroaki


    Matrix-assisted laser-desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) is one of the most widely used mass-based approaches for bacterial identification and classification because of the simple sample preparation and extremely rapid analysis within a few minutes. To establish the accurate MALDI-TOF MS bacterial discrimination method at strain level, the ribosomal subunit proteins coded in the S10-spc-alpha operon, which encodes half of the ribosomal subunit protein and is highly conserved in eubacterial genomes, were selected as reliable biomarkers. This method, named the S10-GERMS method, revealed that the strains of genus Pseudomonas were successfully identified and discriminated at species and strain levels, respectively; therefore, the S10-GERMS method was further applied to discriminate the pathovar of P. syringae. The eight selected biomarkers (L24, L30, S10, S12, S14, S16, S17, and S19) suggested the rapid discrimination of P. syringae at the strain (pathovar) level. The S10-GERMS method appears to be a powerful tool for rapid and reliable bacterial discrimination and successful phylogenetic characterization. In this article, an overview of the utilization of results from the S10-GERMS method is presented, highlighting the characterization of the Lactobacillus casei group and discrimination of the bacteria of genera Bacillus and Sphingopyxis despite only two and one base difference in the 16S rRNA gene sequence, respectively.

  17. Novel Accurate Bacterial Discrimination by MALDI-Time-of-Flight MS Based on Ribosomal Proteins Coding in S10-spc-alpha Operon at Strain Level S10-GERMS (United States)

    Tamura, Hiroto; Hotta, Yudai; Sato, Hiroaki


    Matrix-assisted laser-desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) is one of the most widely used mass-based approaches for bacterial identification and classification because of the simple sample preparation and extremely rapid analysis within a few minutes. To establish the accurate MALDI-TOF MS bacterial discrimination method at strain level, the ribosomal subunit proteins coded in the S10-spc-alpha operon, which encodes half of the ribosomal subunit protein and is highly conserved in eubacterial genomes, were selected as reliable biomarkers. This method, named the S10-GERMS method, revealed that the strains of genus Pseudomonas were successfully identified and discriminated at species and strain levels, respectively; therefore, the S10-GERMS method was further applied to discriminate the pathovar of P. syringae. The eight selected biomarkers (L24, L30, S10, S12, S14, S16, S17, and S19) suggested the rapid discrimination of P. syringae at the strain (pathovar) level. The S10-GERMS method appears to be a powerful tool for rapid and reliable bacterial discrimination and successful phylogenetic characterization. In this article, an overview of the utilization of results from the S10-GERMS method is presented, highlighting the characterization of the Lactobacillus casei group and discrimination of the bacteria of genera Bacillus and Sphingopyxis despite only two and one base difference in the 16S rRNA gene sequence, respectively.

  18. Fast Structural Search in Phylogenetic Databases

    Directory of Open Access Journals (Sweden)

    William H. Piel


    Full Text Available As the size of phylogenetic databases grows, the need for efficiently searching these databases arises. Thanks to previous and ongoing research, searching by attribute value and by text has become commonplace in these databases. However, searching by topological or physical structure, especially for large databases and especially for approximate matches, is still an art. We propose structural search techniques that, given a query or pattern tree P and a database of phylogenies D, find trees in D that are sufficiently close to P. The “closeness” is a measure of the topological relationships in P that are found to be the same or similar in a tree D in D. We develop a filtering technique that accelerates searches and present algorithms for rooted and unrooted trees where the trees can be weighted or unweighted. Experimental results on comparing the similarity measure with existing tree metrics and on evaluating the efficiency of the search techniques demonstrate that the proposed approach is promising.

  19. Comprehensive phylogenetic analysis of bacterial reverse transcriptases.

    Directory of Open Access Journals (Sweden)

    Nicolás Toro

    Much less is known about reverse transcriptases (RTs) in prokaryotes than in eukaryotes, with most prokaryotic enzymes still uncharacterized. Two surveys involving BLAST searches for RT genes in prokaryotic genomes revealed the presence of large numbers of diverse, uncharacterized RTs and RT-like sequences. Here, using consistent annotation across all sequenced bacterial species from GenBank and other sources via RAST, available from the PATRIC (Pathogenic Resource Integration Center) platform, we have compiled the data for currently annotated reverse transcriptases from completely sequenced bacterial genomes. RT sequences are broadly distributed across bacterial phyla, but green sulfur bacteria and cyanobacteria have the highest levels of RT sequence diversity (≤85% identity) per genome. By contrast, phylum Actinobacteria, for which a large number of genomes have been sequenced, was found to have a low RT sequence diversity. Phylogenetic analyses revealed that bacterial RTs could be classified into 17 main groups: group II introns, retrons/retron-like RTs, diversity-generating retroelements (DGRs), Abi-like RTs, CRISPR-Cas-associated RTs, group II-like RTs (G2L), and 11 other groups of RTs of unknown function. Proteobacteria had the highest potential functional diversity, as they possessed most of the RT groups. Group II introns and DGRs were the most widely distributed RTs in bacterial phyla. Our results provide insights into bacterial RT phylogeny and the basis for an update of annotation systems based on sequence/domain homology.

  20. Phylogenetic autocorrelation under distinct evolutionary processes. (United States)

    Diniz-Filho, J A


    I show how phylogenetic correlograms track distinct microevolutionary processes and can be used as empirical descriptors of the relationship between interspecific covariance (V(B)) and time since divergence (t). Data were simulated under models of gradual and speciational change, using increasing levels of stabilizing selection in a stochastic Ornstein-Uhlenbeck (O-U) process, on a phylogeny of 42 species. For each simulated dataset, correlograms were constructed using Moran's I coefficients estimated at five time slices, established at constant intervals. The correlograms generated under different evolutionary models differ significantly according to F-values derived from analysis of variance comparing Moran's I at each time slice and based on Wilks' lambda from multivariate analysis of variance comparing their overall profiles in a two-way design. Under Brownian motion or with small restraining forces in the O-U process, correlograms were better fit by a linear model. However, increasing restraining forces in the O-U process cause a lack of linear fit, and correlograms are better described by exponential models. These patterns are better fit for gradual than for speciational modes of change. Correlograms can be used as a diagnostic method and to describe the V(B)/t relationship before using methods to analyze correlated evolution that assume (or perform statistically better when) this relationship is linear.
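The correlogram construction described in this abstract can be illustrated with a minimal sketch. It assumes the standard form of Moran's I with binary weights (a pair of species gets weight 1 when its divergence time falls in the current time slice). The function names, the four-species example and the slice boundaries are hypothetical and are not taken from the paper.

```python
def morans_i(values, weights):
    """Moran's I for trait values under a symmetric weight matrix.

    Returns None when the class contains no pairs or the trait is constant.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    s0 = sum(weights[i][j] for i, j in pairs)        # total weight in this class
    den = sum(d * d for d in dev)                    # total squared deviation
    if s0 == 0 or den == 0:
        return None
    num = sum(weights[i][j] * dev[i] * dev[j] for i, j in pairs)
    return (n / s0) * (num / den)


def correlogram(values, dist, breaks):
    """Moran's I per distance class; class k holds pairs with breaks[k] < d <= breaks[k+1]."""
    n = len(values)
    result = []
    for lo, hi in zip(breaks[:-1], breaks[1:]):
        w = [[1.0 if i != j and lo < dist[i][j] <= hi else 0.0 for j in range(n)]
             for i in range(n)]
        result.append(morans_i(values, w))
    return result


# Hypothetical example: tree ((A,B),(C,D)); sister species are 2 time units apart,
# species in different clades are 6 units apart.
traits = [1.0, 1.2, 3.0, 3.2]
dist = [[0, 2, 6, 6],
        [2, 0, 6, 6],
        [6, 6, 0, 2],
        [6, 6, 2, 0]]
profile = correlogram(traits, dist, breaks=[0, 2, 6])
print(profile)  # positive I in the short-distance class, negative in the long-distance class
```

In this toy case close relatives have similar trait values, so Moran's I is positive in the first class and negative in the second. The shape of such a profile across time slices is what the abstract compares between evolutionary models.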

  1. Phylogenetic analysis of heterothallic Neurospora species. (United States)

    Skupski, M P; Jackson, D A; Natvig, D O


    We examined the phylogenetic relationships among five heterothallic species of Neurospora using restriction fragment polymorphisms derived from cosmid probes and sequence data from the upstream regions of two genes, al-1 and frq. Distance, maximum likelihood, and parsimony trees derived from the data support the hypothesis that strains assigned to N. sitophila, N. discreta, and N. tetrasperma form respective monophyletic groups. Strains assigned to N. intermedia and N. crassa, however, did not form two respective monophyletic groups, consistent with a previous suggestion based on analysis of mitochondrial DNAs that N. crassa and N. intermedia may be incompletely resolved sister taxa. Trees derived from restriction fragments and the al-1 sequence position N. tetrasperma as the sister species of N. sitophila. None of the trees produced by our data supported a previous analysis of sequences in the region of the mating type idiomorph that grouped N. crassa and N. sitophila as sister taxa, as well as N. intermedia and N. tetrasperma as sister taxa. Moreover, sequences from al-1, frq, and the mating-type region produced different trees when analyzed separately. The lack of consensus obtained with different sequences could result from the sorting of ancestral polymorphism during speciation or gene flow across species boundaries, or both.

  2. Ultrastructure, biology, and phylogenetic relationships of kinorhyncha. (United States)

    Neuhaus, Birger; Higgins, Robert P


    The article summarizes current knowledge mainly about the (functional) morphology and ultrastructure, but also about the biology, development, and evolution of the Kinorhyncha. The Kinorhyncha are microscopic, bilaterally symmetrical, exclusively free-living, benthic, marine animals and ecologically part of the meiofauna. They occur throughout the world from the intertidal to the deep sea, generally in sediments but sometimes associated with plants or other animals. From adult stages 141 species are known, but 38 species have been described from juvenile stages. The trunk is arranged into 11 segments as evidenced by cuticular plates, sensory spots, setae or spines, nervous system, musculature, and subcuticular glands. The ultrastructure of several organ systems and the postembryonic development are known for very few species. Almost no data are available about the embryology and only a single gene has been sequenced for a single species. The phylogenetic relationships within Kinorhyncha are unresolved. Priapulida, Loricifera, and Kinorhyncha are grouped together as Scalidophora, but arguments are found for every possible sistergroup relationship within this taxon. The recently published Ecdysozoa hypothesis suggests a closer relationship of the Scalidophora, Nematoda, Nematomorpha, Tardigrada, Onychophora, and Arthropoda.

  3. Identifiability of large phylogenetic mixture models. (United States)

    Rhodes, John A; Sullivant, Seth


    Phylogenetic mixture models are statistical models of character evolution allowing for heterogeneity. Each of the classes in some unknown partition of the characters may evolve by different processes, or even along different trees. Such models are of increasing interest for data analysis, as they can capture the variety of evolutionary processes that may be occurring across long sequences of DNA or proteins. The fundamental question of whether parameters of such a model are identifiable is difficult to address, due to the complexity of the parameterization. Identifiability is, however, essential to their use for statistical inference. We analyze mixture models on large trees, with many mixture components, showing that both numerical and tree parameters are indeed identifiable in these models when all trees are the same. This provides a theoretical justification for some current empirical studies, and indicates that extensions to even more mixture components should be theoretically well behaved. We also extend our results to certain mixtures on different trees, using the same algebraic techniques.

  4. Phylogenetic analysis of fungal ABC transporters

    Directory of Open Access Journals (Sweden)

    Driessen Arnold JM


    Background The superfamily of ABC proteins is among the largest known in nature. Its members are mainly, but not exclusively, involved in the transport of a broad range of substrates across biological membranes. Many contribute to multidrug resistance in microbial pathogens and cancer cells. The diversity of ABC proteins in fungi is comparable with that in multicellular animals, but so far fungal ABC proteins have barely been studied. Results We performed a phylogenetic analysis of the ABC proteins extracted from the genomes of 27 fungal species from 18 orders representing 5 fungal phyla, thereby covering the most important groups. Our analysis demonstrated that some of the subfamilies of ABC proteins remained highly conserved in fungi, while others have undergone a remarkable group-specific diversification. Members of the various fungal phyla also differed significantly in the number of ABC proteins found in their genomes, which is especially reduced in the yeasts S. cerevisiae and S. pombe. Conclusions Data obtained during our analysis should contribute to a better understanding of the diversity of the fungal ABC proteins and provide important clues about their possible biological functions.

  5. Mayaro virus: complete nucleotide sequence and phylogenetic relationships with other alphaviruses. (United States)

    Lavergne, Anne; de Thoisy, Benoît; Lacoste, Vincent; Pascalis, Hervé; Pouliquen, Jean-François; Mercier, Véronique; Tolou, Hugues; Dussart, Philippe; Morvan, Jacques; Talarmin, Antoine; Kazanji, Mirdad


    Mayaro (MAY) virus is a member of the genus Alphavirus in the family Togaviridae. Alphaviruses are distributed throughout the world and cause a wide range of diseases in humans and animals. Here, we determined the complete nucleotide sequence of MAY from a viral strain isolated from a French Guianese patient. The deduced MAY genome was 11,429 nucleotides in length, excluding the 5' cap nucleotide and 3' poly(A) tail. Nucleotide and amino acid homologies, as well as phylogenetic analyses of the obtained sequence, confirmed that MAY is not a recombinant virus and belongs to the Semliki Forest complex according to the antigenic complex classification. Furthermore, analyses based on the E1 region revealed that MAY is closely related to Una virus, the only other South American virus clustering with the Old World viruses. On the basis of our results and of alphavirus diversity and pathogenicity, we suggest that alphaviruses may have an Old World origin.

  6. Accurate colorimetric feedback for RGB LED clusters (United States)

    Man, Kwong; Ashdown, Ian


    We present an empirical model of LED emission spectra that is applicable to both InGaN and AlInGaP high-flux LEDs, and which accurately predicts their relative spectral power distributions over a wide range of LED junction temperatures. We further demonstrate with laboratory measurements that changes in LED spectral power distribution with temperature can be accurately predicted with first- or second-order equations. This provides the basis for a real-time colorimetric feedback system for RGB LED clusters that can maintain the chromaticity of white light at constant intensity to within +/-0.003 Δuv over a range of 45 degrees Celsius, and to within 0.01 Δuv when dimmed over an intensity range of 10:1.

  7. Accurate guitar tuning by cochlear implant musicians.

    Directory of Open Access Journals (Sweden)

    Thomas Lu

    Modern cochlear implant (CI) users understand speech but find difficulty in music appreciation due to poor pitch perception. Still, some deaf musicians continue to perform with their CI. Here we show unexpected results that CI musicians can reliably tune a guitar by CI alone and, under controlled conditions, match simultaneously presented tones to <0.5 Hz. One subject had normal contralateral hearing and produced more accurate tuning with CI than his normal ear. To understand these counterintuitive findings, we presented tones sequentially and found that tuning error was larger at ∼30 Hz for both subjects. A third subject, a non-musician CI user with normal contralateral hearing, showed similar trends in performance between CI and normal hearing ears but with less precision. This difference, along with electric analysis, showed that accurate tuning was achieved by listening to beats rather than discriminating pitch, effectively turning a spectral task into a temporal discrimination task.

  8. Efficient Accurate Context-Sensitive Anomaly Detection

    Institute of Scientific and Technical Information of China (English)


    For program behavior-based anomaly detection, the only way to ensure accurate monitoring is to construct an efficient and precise program behavior model. A new program behavior-based anomaly detection model, called the combined pushdown automaton (CPDA) model, was proposed, which is based on static binary executable analysis. The CPDA model incorporates the optimized call stack walk and code instrumentation technique to gain complete context information. Thereby the proposed method can detect more attacks, while retaining good performance.

  9. On accurate determination of contact angle (United States)

    Concus, P.; Finn, R.


    Methods are proposed that exploit a microgravity environment to obtain highly accurate measurement of contact angle. These methods, which are based on our earlier mathematical results, do not require detailed measurement of a liquid free-surface, as they incorporate discontinuous or nearly-discontinuous behavior of the liquid bulk in certain container geometries. Physical testing is planned in the forthcoming IML-2 space flight and in related preparatory ground-based experiments.

  10. Accurate Control of Josephson Phase Qubits (United States)


    61 (1986). 23 K. Kraus, States, Effects, and Operations: Fundamental Notions of Quantum Theory, Lecture Notes in Physics, Vol. 190 (Springer-Verlag... PHYSICAL REVIEW B 68, 224518 (2003) Accurate control of Josephson phase qubits Matthias Steffen,1,2,* John M. Martinis,3 and Isaac L. Chuang1 1Center...for Bits and Atoms and Department of Physics, MIT, Cambridge, Massachusetts 02139, USA 2Solid State and Photonics Laboratory, Stanford University

  11. Accurate guitar tuning by cochlear implant musicians. (United States)

    Lu, Thomas; Huang, Juan; Zeng, Fan-Gang


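    Modern cochlear implant (CI) users understand speech but find difficulty in music appreciation due to poor pitch perception. Still, some deaf musicians continue to perform with their CI. Here we show unexpected results that CI musicians can reliably tune a guitar by CI alone and, under controlled conditions, match simultaneously presented tones to <0.5 Hz. One subject had normal contralateral hearing and produced more accurate tuning with CI than his normal ear. To understand these counterintuitive findings, we presented tones sequentially and found that tuning error was larger at ∼30 Hz for both subjects. A third subject, a non-musician CI user with normal contralateral hearing, showed similar trends in performance between CI and normal hearing ears but with less precision. This difference, along with electric analysis, showed that accurate tuning was achieved by listening to beats rather than discriminating pitch, effectively turning a spectral task into a temporal discrimination task.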

  12. Synthesizing Accurate Floating-Point Formulas


    Ioualalen, Arnault; Martel, Matthieu


    Many critical embedded systems perform floating-point computations yet their accuracy is difficult to assert and strongly depends on how formulas are written in programs. In this article, we focus on the synthesis of accurate formulas mathematically equal to the original formulas occurring in source codes. In general, an expression may be rewritten in many ways. To avoid any combinatorial explosion, we use an intermediate representation, called APEG, enabling us to rep...

  13. Fast Image Texture Classification Using Decision Trees (United States)

    Thompson, David R.


    Texture analysis would permit improved autonomous, onboard science data interpretation for adaptive navigation, sampling, and downlink decisions. These analyses would assist with terrain analysis and instrument placement in both macroscopic and microscopic image data products. Unfortunately, most state-of-the-art texture analysis demands computationally expensive convolutions of filters involving many floating-point operations. This makes them infeasible for radiation-hardened computers and spaceflight hardware. A new method approximates traditional texture classification of each image pixel with a fast decision-tree classifier. The classifier uses image features derived from simple filtering operations involving integer arithmetic. The texture analysis method is therefore amenable to implementation on FPGA (field-programmable gate array) hardware. Image features based on the "integral image" transform produce descriptive and efficient texture descriptors. Training the decision tree on a set of training data yields a classification scheme that produces reasonable approximations of optimal "texton" analysis at a fraction of the computational cost. A decision-tree learning algorithm employing the traditional k-means criterion of inter-cluster variance is used to learn tree structure from training data. The result is an efficient and accurate summary of surface morphology in images. This work is an evolutionary advance that unites several previous algorithms (k-means clustering, integral images, decision trees) and applies them to a new problem domain (morphology analysis for autonomous science during remote exploration). Advantages include order-of-magnitude improvements in runtime, feasibility for FPGA hardware, and significant improvements in texture classification accuracy.
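The "integral image" transform mentioned in this abstract can be sketched in a few lines. This is a generic summed-area table using only integer additions and subtractions, not the authors' implementation. The function names and the 3x3 test image are hypothetical.

```python
def integral_image(img):
    """Summed-area table with a zero border: ii[y][x] = sum of img[0:y][0:x]."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]                    # running sum along the current row
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def box_sum(ii, top, left, bottom, right):
    """Sum of pixels in rows top..bottom-1 and columns left..right-1, in four lookups."""
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


# Hypothetical 3x3 image with integer pixel values.
img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
ii = integral_image(img)
whole = box_sum(ii, 0, 0, 3, 3)          # all nine pixels
lower_right = box_sum(ii, 1, 1, 3, 3)    # 5 + 6 + 8 + 9
# A simple box-difference feature: left column minus right column.
left_minus_right = box_sum(ii, 0, 0, 3, 1) - box_sum(ii, 0, 2, 3, 3)
```

Once the table is built, any rectangular sum costs four lookups regardless of box size. That constant cost is why box-based features suit integer-only hardware such as the FPGAs mentioned in the abstract.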


    Directory of Open Access Journals (Sweden)

    Puranik Prashant K


    The biopharmaceutical classification system (BCS) has been developed to provide a scientific approach for classifying drug compounds based on solubility as related to dose and intestinal permeability in combination with the dissolution properties of the oral immediate release dosage form. The BCS is intended to provide a regulatory tool for replacing certain bioequivalence (BE) studies by accurate in vitro dissolution tests. This review describes the three dimensionless numbers used in the BCS: the absorption number, the dissolution number, and the dose number. The biowaiver is an important tool for formulation development. Bioavailability (BA) and BE play a central role in pharmaceutical product development, and BE studies are presently being conducted for New Drug Applications (NDAs) of new compounds, in supplementary NDAs for new medical indications and product line extensions, in Abbreviated New Drug Applications (ANDAs) of generic products, and in applications for scale-up and post-approval changes. The principles of the BCS classification system can be applied to NDA and ANDA approvals as well as to scale-up and post-approval changes in drug manufacturing. BCS classification can therefore save pharmaceutical companies a significant amount of development time and reduce costs. The aim of the present review is to present the status of BCS and discuss its future application in pharmaceutical product development.

  15. Accurate structural correlations from maximum likelihood superpositions.

    Directory of Open Access Journals (Sweden)

    Douglas L Theobald


    The cores of globular proteins are densely packed, resulting in complicated networks of structural interactions. These interactions in turn give rise to dynamic structural correlations over a wide range of time scales. Accurate analysis of these complex correlations is crucial for understanding biomolecular mechanisms and for relating structure to function. Here we report a highly accurate technique for inferring the major modes of structural correlation in macromolecules using likelihood-based statistical analysis of sets of structures. This method is generally applicable to any ensemble of related molecules, including families of nuclear magnetic resonance (NMR) models, different crystal forms of a protein, and structural alignments of homologous proteins, as well as molecular dynamics trajectories. Dominant modes of structural correlation are determined using principal components analysis (PCA) of the maximum likelihood estimate of the correlation matrix. The correlations we identify are inherently independent of the statistical uncertainty and dynamic heterogeneity associated with the structural coordinates. We additionally present an easily interpretable method ("PCA plots") for displaying these positional correlations by color-coding them onto a macromolecular structure. Maximum likelihood PCA of structural superpositions, and the structural PCA plots that illustrate the results, will facilitate the accurate determination of dynamic structural correlations analyzed in diverse fields of structural biology.

  16. Phylogenetic relationships of Salvia (Lamiaceae) in China: Evidence from DNA sequence datasets

    Institute of Scientific and Technical Information of China (English)

    Qian-Quan LI; Min-Hui LI; Qing-Jun YUAN; Zhan-Hu CUI; Lu-Qi HUANG; Pei-Gen XIAO


    With 84 native species, China is a center of distribution of the genus Salvia (Lamiaceae). These species are mainly distributed in Yunnan and Sichuan provinces (southwestern China), notably the Hengduan Mountain region. Traditionally, the Chinese Salvia has been classified into four subgenera, Salvia, Sclarea, Jungia, and Allagospadonopsis. We tested this classification using molecular phylogenetic analysis of 43 species of Salvia from China, six from Japan, and four introduced species. The nuclear ribosomal internal transcribed spacer region and three chloroplast regions (rbcL, matK, and trnH-psbA) were analyzed by maximum parsimony, maximum likelihood, and Bayesian methods. Our results showed that the Chinese (except Salvia deserta) and Japanese Salvia species formed a well-supported clade; S. deserta from Xinjiang grouped with Salvia officinalis of Europe. In addition, all introduced Salvia species in China were relatively distantly related to the native Chinese Salvia. Our results differed from the subgeneric and section classifications in Flora Reipublicae Popularis Sinicae. We suggested that sections Eusphace and Pleiphace should be united in a new subgenus and that sect. Notiosphace should be removed from subg. Sclarea and form a new subgenus. Our data could not distinguish a boundary between subg. Allagospadonopsis and sect. Drymosphace (subg. Sclarea); the latter should be reduced into the former. Further clarification of the phylogenetic relationships within Salvia and between Salvia and related genera will require broader taxonomic sampling and more molecular markers.

  17. Object Based and Pixel Based Classification Using Rapideye Satellite Imager of ETI-OSA, Lagos, Nigeria

    Directory of Open Access Journals (Sweden)

    Esther Oluwafunmilayo Makinde


    Several studies have been carried out to find an appropriate method to classify remote sensing data. Traditional classification approaches are all pixel-based and do not utilize the spatial information within an object, which is an important source of information for image classification. Thus, this study compared the pixel-based and object-based classification algorithms using a RapidEye satellite image of Eti-Osa LGA, Lagos. In the object-oriented approach, the image was segmented into homogeneous areas by suitable parameters such as scale parameter, compactness, shape etc. Classification based on segments was done by a nearest neighbour classifier. In the pixel-based classification, the spectral angle mapper was used to classify the images. The user accuracies for each class using object-based classification were 98.31% for waterbody, 92.31% for vegetation, 86.67% for bare soil and 90.57% for built up, while the user accuracies for the pixel-based classification were 98.28% for waterbody, 84.06% for vegetation, 86.36% for bare soil and 79.41% for built up. These classification techniques were subjected to accuracy assessment, and the overall accuracy of the object-based classification was 94.47%, while that of the pixel-based classification was 86.64%. The results of classification and accuracy assessment show that the object-based approach gave more accurate and satisfying results.
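The user accuracy and overall accuracy figures quoted in this abstract are standard confusion-matrix summaries. The sketch below shows how they are typically computed, assuming rows hold the mapped (classified) class and columns the reference class. The function name and the two-class counts are invented for illustration and are not the study's data.

```python
def accuracy_report(matrix, labels):
    """Return (overall accuracy, user's accuracy per class).

    matrix[i][j] = number of samples mapped as class i whose reference class is j.
    """
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(labels)))
    users = {}
    for i, label in enumerate(labels):
        row_total = sum(matrix[i])                  # everything mapped as this class
        users[label] = matrix[i][i] / row_total if row_total else None
    return correct / total, users


# Hypothetical counts for two classes (not taken from the study).
labels = ["waterbody", "vegetation"]
matrix = [[50, 2],    # mapped as waterbody: 50 correct, 2 actually vegetation
          [5, 43]]    # mapped as vegetation: 5 actually waterbody, 43 correct
overall, users = accuracy_report(matrix, labels)
print(f"overall accuracy: {overall:.2%}")
for label, value in users.items():
    print(f"user's accuracy for {label}: {value:.2%}")
```

User's accuracy answers how often a pixel labelled with a class on the map truly belongs to it. That is why it is reported per class, alongside a single overall figure for the whole map.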

  18. Hand eczema classification

    DEFF Research Database (Denmark)

    Diepgen, T L; Andersen, Klaus Ejner; Brandao, F M;


    Summary Background Hand eczema is a long-lasting disease with a high prevalence in the background population. The disease has severe, negative effects on quality of life and sometimes on social status. Epidemiological studies have identified risk factors for onset and prognosis, but treatment...... of the disease is rarely evidence based, and a classification system for different subdiagnoses of hand eczema is not agreed upon. Randomized controlled trials investigating the treatment of hand eczema are called for. For this, as well as for clinical purposes, a generally accepted classification system...... for hand eczema is needed. Objectives The present study attempts to characterize subdiagnoses of hand eczema with respect to basic demographics, medical history and morphology. Methods Clinical data from 416 patients with hand eczema from 10 European patch test clinics were assessed. Results...

  19. Multilingual documentation and classification. (United States)

    Donnelly, Kevin


    Health care providers around the world have used classification systems for decades as a basis for documentation, communications, statistical reporting, reimbursement and research. In more recent years machine-readable medical terminologies have taken on greater importance with the adoption of electronic health records and the need for greater granularity of data in clinical systems. Use of a clinical terminology harmonised with classifications, implemented within a clinical information system, will enable the delivery of many patient health benefits including electronic clinical decision support, disease screening and enhanced patient safety. In order to be usable these systems must be translated into the language of use, without losing meaning. It is evident that today one system cannot meet all requirements which call for collaboration and harmonisation in order to achieve true interoperability on a multilingual basis.

  20. [New classification of vasculitis]. (United States)

    Anić, Branimir


    Vasculitis syndrome comprises a heterogeneous group of inflammatory rheumatic diseases whose common feature is inflammation in the blood vessel wall. Establishing the diagnosis of vasculitis is one of the greatest challenges in medicine. Clinical presentation of vasculitis depends on the extent of an organ system affection, as well as on the total number of affected organs. A great range of clinical presentations of vasculitis and the low incidence of the disease impede systematic clinical investigation of vasculitis. The needs of clinical routine and the need for conducting systematic clinical investigations require a clear distinction of individual clinical entities. Different classifications of vasculitis syndrome have been proposed: according to etiology, pathogenesis, histological finding in the affected vessels, affection of individual organs and organ systems. This paper presents and comments on new and recent classifications and nomenclature of vasculitic entities proposed at the second conference in Chapel Hill.

  1. Bosniak classification system

    DEFF Research Database (Denmark)

    Graumann, Ole; Osther, Susanne Sloth; Karstoft, Jens;


    at MR and CEUS imaging and those at CT. PURPOSE: To compare diagnostic accuracy of MR, CEUS, and CT when categorizing complex renal cystic masses according to the Bosniak classification. MATERIAL AND METHODS: From February 2011 to June 2012, 46 complex renal cysts were prospectively evaluated by three...... readers. Each mass was categorized according to the Bosniak classification and CT was chosen as gold standard. Kappa was calculated for diagnostic accuracy and data was compared with pathological results. RESULTS: CT images found 27 BII, six BIIF, seven BIII, and six BIV. Forty-three cysts could...... one category lower. Pathologic correlation in six lesions revealed four malignant and two benign lesions. CONCLUSION: CEUS and MR both up- and downgraded renal cysts compared to CT, and until these non-radiation modalities have been refined and adjusted, CT should remain the gold standard...

  2. Reconstruction of Family-Level Phylogenetic Relationships within Demospongiae (Porifera) Using Nuclear Encoded Housekeeping Genes (United States)

    Hill, Malcolm S.; Hill, April L.; Lopez, Jose; Peterson, Kevin J.; Pomponi, Shirley; Diaz, Maria C.; Thacker, Robert W.; Adamska, Maja; Boury-Esnault, Nicole; Cárdenas, Paco; Chaves-Fonnegra, Andia; Danka, Elizabeth; De Laine, Bre-Onna; Formica, Dawn; Hajdu, Eduardo; Lobo-Hajdu, Gisele; Klontz, Sarah; Morrow, Christine C.; Patel, Jignasa; Picton, Bernard; Pisani, Davide; Pohlmann, Deborah; Redmond, Niamh E.; Reed, John; Richey, Stacy; Riesgo, Ana; Rubin, Ewelina; Russell, Zach; Rützler, Klaus; Sperling, Erik A.; di Stefano, Michael; Tarver, James E.; Collins, Allen G.


    Background Demosponges are challenging for phylogenetic systematics because of their plastic and relatively simple morphologies and many deep divergences between major clades. To improve understanding of the phylogenetic relationships within Demospongiae, we sequenced and analyzed seven nuclear housekeeping genes involved in a variety of cellular functions from a diverse group of sponges. Methodology/Principal Findings We generated data from each of the four sponge classes (i.e., Calcarea, Demospongiae, Hexactinellida, and Homoscleromorpha), but focused on family-level relationships within demosponges. With data for 21 newly sampled families, our Maximum Likelihood and Bayesian-based approaches recovered previously phylogenetically defined taxa: Keratosap, Myxospongiaep, Spongillidap, Haploscleromorphap (the marine haplosclerids) and Democlaviap. We found conflicting results concerning the relationships of Keratosap and Myxospongiaep to the remaining demosponges, but our results strongly supported a clade of Haploscleromorphap+Spongillidap+Democlaviap. In contrast to hypotheses based on mitochondrial genome and ribosomal data, nuclear housekeeping gene data suggested that freshwater sponges (Spongillidap) are sister to Haploscleromorphap rather than part of Democlaviap. Within Keratosap, we found equivocal results as to the monophyly of Dictyoceratida. Within Myxospongiaep, Chondrosida and Verongida were monophyletic. A well-supported clade within Democlaviap, Tetractinellidap, composed of all sampled members of Astrophorina and Spirophorina (including the only lithistid in our analysis), was consistently revealed as the sister group to all other members of Democlaviap. Within Tetractinellidap, we did not recover monophyletic Astrophorina or Spirophorina. Our results also reaffirmed the monophyly of order Poecilosclerida (excluding Desmacellidae and Raspailiidae), and polyphyly of Hadromerida and Halichondrida. Conclusions/Significance These results, using an

  3. Close phylogenetic relationship between Angolan and Romanian HIV-1 subtype F1 isolates (United States)

    Guimarães, Monick L; Vicente, Ana Carolina P; Otsuki, Koko; da Silva, Rosa Ferreira FC; Francisco, Moises; da Silva, Filomena Gomes; Serrano, Ducelina; Morgado, Mariza G; Bello, Gonzalo


    Background Here, we investigated the phylogenetic relationships of the HIV-1 subtype F1 circulating in Angola with subtype F1 strains sampled worldwide and reconstructed the evolutionary history of this subtype in Central Africa. Methods Forty-six HIV-1-positive samples were collected in Angola in 2006 and subtyped at the env-gp41 region. Partial env-gp120 and pol-RT sequences and near full-length genomes from those env-gp41 subtype F1 samples were further generated. Phylogenetic analyses of partial and full-length subtype F1 strains isolated worldwide were carried out. The onset date of the subtype F1 epidemic in Central Africa was estimated using a Bayesian Markov chain Monte Carlo approach. Results Nine Angolan samples were classified as subtype F1 based on the analysis of the env-gp41 region. All nine Angolan sequences were also classified as subtype F1 in both env-gp120 and pol-RT genomic regions, and near full-length genome analysis of four of these samples confirmed their classification as "pure" subtype F1. Phylogenetic analyses of subtype F1 strains isolated worldwide revealed that isolates from the Democratic Republic of Congo (DRC) were the earliest branching lineages within the subtype F1 phylogeny. Most strains from Angola segregated in a monophyletic group together with Romanian sequences, whereas South American F1 sequences emerged as an independent cluster. The origin of the subtype F1 epidemic in Central Africa was estimated at 1958 (1934–1971). Conclusion "Pure" subtype F1 strains are common in Angola and seem to be the result of a single founder event. Subtype F1 sequences from Angola are closely related to those described in Romania, and only distantly related to the subtype F1 lineage circulating in South America. Original diversification of subtype F1 probably occurred within the DRC around the late 1950s. PMID:19386115

  4. Close phylogenetic relationship between Angolan and Romanian HIV-1 subtype F1 isolates

    Directory of Open Access Journals (Sweden)

    Serrano Ducelina


    Background: Here, we investigated the phylogenetic relationships of the HIV-1 subtype F1 circulating in Angola with subtype F1 strains sampled worldwide and reconstructed the evolutionary history of this subtype in Central Africa. Methods: Forty-six HIV-1-positive samples were collected in Angola in 2006 and subtyped at the env-gp41 region. Partial env-gp120 and pol-RT sequences and near full-length genomes from those env-gp41 subtype F1 samples were further generated. Phylogenetic analyses of partial and full-length subtype F1 strains isolated worldwide were carried out. The onset date of the subtype F1 epidemic in Central Africa was estimated using a Bayesian Markov chain Monte Carlo approach. Results: Nine Angolan samples were classified as subtype F1 based on the analysis of the env-gp41 region. All nine Angolan sequences were also classified as subtype F1 in both env-gp120 and pol-RT genomic regions, and near full-length genome analysis of four of these samples confirmed their classification as "pure" subtype F1. Phylogenetic analyses of subtype F1 strains isolated worldwide revealed that isolates from the Democratic Republic of Congo (DRC) were the earliest branching lineages within the subtype F1 phylogeny. Most strains from Angola segregated in a monophyletic group together with Romanian sequences, whereas South American F1 sequences emerged as an independent cluster. The origin of the subtype F1 epidemic in Central Africa was estimated at 1958 (1934–1971). Conclusion: "Pure" subtype F1 strains are common in Angola and seem to be the result of a single founder event. Subtype F1 sequences from Angola are closely related to those described in Romania, and only distantly related to the subtype F1 lineage circulating in South America. Original diversification of subtype F1 probably occurred within the DRC around the late 1950s.

  5. Phylogenetic relationships of the operculate land snail genus Cyclophorus Montfort, 1810 in Thailand. (United States)

    Nantarat, Nattawadee; Tongkerd, Piyoros; Sutcharit, Chirasak; Wade, Christopher M; Naggs, Fred; Panha, Somsak


    Operculate land snails of the genus Cyclophorus are distributed widely in sub-tropical and tropical Asia. Shell morphology is traditionally used for species identification in Cyclophorus but their shells exhibit considerable variation both within and between populations; species limits have been extremely difficult to determine and are poorly understood. Many currently recognized species have discontinuous distributions over large ranges but geographical barriers and low mobility of snails are likely to have led to long periods of isolation resulting in cryptic speciation of allopatric populations. As a contribution towards solving these problems, we reconstructed the molecular phylogeny of 87 Cyclophorus specimens, representing 29 nominal species (of which one was represented by four subspecies), plus three related out-group species. Molecular phylogenetic analyses were used to investigate geographic limits and speciation scenarios. The analyses of COI, 16S rRNA and 28S rRNA gene fragments were performed using neighbour-joining (NJ), maximum likelihood (ML), and Bayesian inference (BI) methods. All the obtained phylogenetic trees were congruent with each other and in most cases confirmed the species level classification. However, at least three nominate species were polyphyletic. Both C. fulguratus and C. volvulus appear to be species complexes, suggesting that populations of these species from different geographical areas of Thailand are cryptic species. C. aurantiacus pernobilis is distinct and likely to be a different species from the other members of the C. aurantiacus species complex.

  6. Molecular phylogenetics and morphological evolution of St. John's wort (Hypericum; Hypericaceae). (United States)

    Nürk, Nicolai M; Madriñán, Santiago; Carine, Mark A; Chase, Mark W; Blattner, Frank R


    Phylogenetic hypotheses for the large cosmopolitan genus Hypericum (St. John's wort) have previously been based on morphology, and molecular studies have thus far included only a few species. In this study, we used 360 sequences of the internal transcribed spacer (ITS) region of nuclear ribosomal DNA (nrDNA) for 206 species representing Hypericum (incl. Triadenum and Thornea) and three other genera of Hypericaceae to generate an explicit phylogenetic hypothesis for the genus using parsimony and model-based methods. The results indicate that the small genus Triadenum is nested in a clade within Hypericum containing most of the New World species. Sister to Hypericum is Thornea from Central America. Within Hypericum, three large clades and two smaller grades were found; these are based on their general morphology, especially characters used previously in taxonomy of the genus. Relative to the most recent classification, around 60% of the sections of Hypericum were monophyletic. We used a Bayesian approach to reconstruct ancestral states of selected morphological characters, which resulted in recognition of characters that support major clades within the genus and a revised interpretation of morphological evolution in Hypericum. The shrubby habit represents the plesiomorphic state from which herbs evolved several times. Arborescent species have radiated convergently in high-elevation habitats in tropical Africa and South America.

  7. The mitochondrial genome of Atrijuglans hetaohei Yang (Lepidoptera: Gelechioidea) and related phylogenetic analyses. (United States)

    Wang, Qiqi; Zhang, Zhengqing; Tang, Guanghui


    Complete mitochondrial genome sequences are of great importance for better understanding the genome-level characteristics and phylogenetic relationships among related species. In this study, the complete mitochondrial genome of Atrijuglans hetaohei Yang is sequenced and analyzed, which is 15,379bp in length (GenBank: KT581634) and contains a typical set of 13 protein-coding genes, 22 tRNA genes, two rRNA genes and a non-coding region (control region). Except for cox1 gene that is initiated by CGA codon, all protein-coding genes start with ATN codons and end with the stop codon T, TA or TAA. All tRNAs have a typical clover-leaf secondary structure, except for trnS1, of which the DHU arm could not form a stable stem-loop structure. The secondary structure of rrnL and rrnS consists of 49 helices and 33 helices, respectively. Phylogenetic analyses of the complete mitochondrial genome sequences and of the amino acid sequences for 13 mitochondrial protein-coding genes among related species support the view that A. hetaohei is more closely related to the Gelechioidea than Yponomeutoidea. This result is consistent with a previous classification based on morphology.

  8. Extensive mitochondrial gene arrangements in coleoid Cephalopoda and their phylogenetic implications. (United States)

    Akasaki, Tetsuya; Nikaido, Masato; Tsuchiya, Kotaro; Segawa, Susumu; Hasegawa, Masami; Okada, Norihiro


    We determined the complete mitochondrial genomes of five cephalopods of the Subclass Coleoidea (Suborder Oegopsida: Watasenia scintillans, Todarodes pacificus, Suborder Myopsida: Sepioteuthis lessoniana, Order Sepiida: Sepia officinalis, and Order Octopoda: Octopus ocellatus) and used them to infer phylogenetic relationships. In our Maximum Likelihood (ML) tree, sepiids (cuttlefish) are at the most basal position of all decapodiformes, and oegopsids and myopsids form a monophyletic clade, thus supporting the traditional classification of the Order Teuthida. We detected extensive gene rearrangements in the mitochondrial genomes of broad cephalopod groups. It is likely that the arrangements of mitochondrial genes in Oegopsida and Sepiida were derived from those of Octopoda, which is thought to be the ancestral order, by entire gene duplication and random gene loss. Oegopsida in particular has undergone long-range gene duplications. We also found that the mitochondrial gene arrangement of Sepioteuthis lessoniana differs from that of Loligo bleekeri, although they belong to the same family. Analysis of both the phylogenetic tree and mitochondrial gene rearrangements of coleoid Cephalopoda suggests that each mitochondrial gene arrangement was acquired after the divergence of each lineage.

  9. Distributional patterns of the Neotropical genus Thecomyia Perty (Diptera, Sciomyzidae) and phylogenetic support

    Directory of Open Access Journals (Sweden)

    Amanda Ciprandi Pires


    The distributional pattern of the genus Thecomyia Perty, 1833 was defined using panbiogeographic tools, and analyzed based on the phylogeny of the group. This study sought to establish biogeographical homologies in the Neotropical region between different species of the genus, based on their distribution pattern and later corroboration through its phylogeny. Eight individual tracks and 16 generalized tracks were identified, established along nearly the entire swath of the Neotropics. Individual tracks are the basic units of a panbiogeographic study, and correspond to the hypothesis of minimum distribution of the organisms involved. The generalized tracks, obtained from the spatial congruence between two or more individual tracks, are important in the identification of smaller areas of endemism. Thus, we found evidence from the generalized tracks in support of the previous classification for the Neotropical region. The Amazon domain is indicated as an area of outstanding importance in the diversification of the group, by the confluence of generalized tracks and biogeographic nodes in the region. Most of the generalized tracks and biogeographical nodes were congruent with the phylogenetic hypothesis of the genus, indicating support of the primary biogeographical homologies originally defined by the track analysis.

  10. Reappraisal of phylogenetic status and genetic diversity analysis of Asian population of Lentinula edodes

    Institute of Scientific and Technical Information of China (English)


    Phylogenetic relationships within the genus Lentinula are constructed based on the sequenced ITS fragments of 60 Chinese wild L. edodes isolates and the sequence data of 48 isolates of different species from other districts downloaded from GenBank. The 108 isolates of the genus Lentinula are divided into two branches and seven groups, one branch and two groups in the New World, and the other branch and five groups in the Old World, and the clustering of isolates into groups corresponds clearly with the classification of the morphological species. Asian isolates are partitioned into groups I and V, two of the five groups of the Old World, whose germplasm resources the phylogenetic analysis shows to be of great importance. Group V, which fills a gap in geographic distribution, has become one of the mainstream groups with an increased isolate number, while group I tends to separate into two subgroups (Ia and Ib) with a large number of isolates covering most tested districts, suggesting that China (or Asia) is an important genetic diversity center of the natural population of the genus Lentinula. Genetic analysis of Asian isolates based on groups Ia, Ib and V indicates that diversity is greatest in the east coastal land, the northwestern highland, and the southwestern China and Himalayas districts, which are the three priorities for diversity protection of the Asian Lentinula population.

  11. Classification of nanopolymers

    Energy Technology Data Exchange (ETDEWEB)

    Larena, A; Tur, A [Department of Chemical Industrial Engineering and Environment, Universidad Politecnica de Madrid, E.T.S. Ingenieros Industriales, C/ Jose Gutierrez Abascal, Madrid (Spain)]; Baranauskas, V [Faculdade de Engenharia Eletrica e Computacao, Departamento de Semicondutores, Instrumentos e Fotonica, Universidade Estadual de Campinas, UNICAMP, Av. Albert Einstein N.400, 13 083-852 Campinas SP Brasil (Brazil)]


    Nanopolymers with different structures, shapes, and functional forms have recently been prepared using several techniques. Nanopolymers are the most promising basic building blocks for mounting complex and simple hierarchical nanosystems. The applications of nanopolymers are extremely broad and polymer-based nanotechnologies are fast emerging. We propose a nanopolymer classification scheme based on self-assembled structures, non-self-assembled structures, and on the number of dimensions in the nanometer range (nD).

  12. Evolvement of Classification Society

    Institute of Scientific and Technical Information of China (English)

    Xu Hua


    The classification society emerged as an independent industry, perhaps at first in response to the interests shared among shipowners, cargo owners and insurers. Today, as an indispensable link of the international maritime industry, the role of class has changed fundamentally. Starting from the demand of the insurers: Seaborne trade, transport and insurance industries began to emerge successively in the 17th century. The massive risk and benefit brought by seaborne transport posed a difficult problem for insurers.

  13. Classification and regression trees

    CERN Document Server

    Breiman, Leo; Olshen, Richard A; Stone, Charles J


    The methodology used to construct tree structured rules is the focus of this monograph. Unlike many other statistical procedures, which moved from pencil and paper to calculators, this text's use of trees was unthinkable before computers. Both the practical and theoretical sides have been developed in the authors' study of tree methods. Classification and Regression Trees reflects these two sides, covering the use of trees as a data analysis method, and in a more mathematical framework, proving some of their fundamental properties.

  14. The best of both worlds: Phylogenetic eigenvector regression and mapping

    Directory of Open Access Journals (Sweden)

    José Alexandre Felizola Diniz Filho


    Eigenfunction analyses have been widely used to model patterns of autocorrelation in time, space and phylogeny. In a phylogenetic context, Diniz-Filho et al. (1998) proposed what they called Phylogenetic Eigenvector Regression (PVR), in which pairwise phylogenetic distances among species are submitted to a Principal Coordinate Analysis, and eigenvectors are then used as explanatory variables in regression, correlation or ANOVAs. More recently, a new approach called Phylogenetic Eigenvector Mapping (PEM) was proposed, with the main advantage of explicitly incorporating a model-based warping in phylogenetic distance in which an Ornstein-Uhlenbeck (O-U) process is fitted to data before eigenvector extraction. Here we compared PVR and PEM with respect to estimated phylogenetic signal, correlated evolution under alternative evolutionary models and phylogenetic imputation, using simulated data. Despite similarity between the two approaches, PEM has a slightly higher prediction ability and is more general than the original PVR. Even so, in a conceptual sense, PEM may provide a technique in the best of both worlds, combining the flexibility of data-driven and empirical eigenfunction analyses and the sound insights provided by evolutionary models well known in comparative analyses.
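
    The PVR procedure described in this abstract (Principal Coordinate Analysis of pairwise phylogenetic distances, then regression on the resulting eigenvectors) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `pcoa_eigenvectors` and `pvr_fit`, the choice of ordinary least squares, and the number of retained eigenvectors `k` are all assumptions made for illustration.

```python
import numpy as np

def pcoa_eigenvectors(dist, k):
    """Principal Coordinate Analysis of a symmetric distance matrix.

    Returns the k eigenvectors with the largest eigenvalues, and those
    eigenvalues. Function name and signature are hypothetical.
    """
    d = np.asarray(dist, dtype=float)
    n = d.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    # Double-centre the squared distances (Gower's transformation).
    gower = -0.5 * centering @ (d ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gower)
    order = np.argsort(eigvals)[::-1][:k]
    return eigvecs[:, order], eigvals[order]

def pvr_fit(dist, trait, k):
    """Regress a trait on the first k phylogenetic eigenvectors (OLS, assumed)."""
    trait = np.asarray(trait, dtype=float)
    vecs, _ = pcoa_eigenvectors(dist, k)
    design = np.column_stack([np.ones(len(trait)), vecs])
    coef, *_ = np.linalg.lstsq(design, trait, rcond=None)
    fitted = design @ coef
    ss_res = np.sum((trait - fitted) ** 2)
    ss_tot = np.sum((trait - trait.mean()) ** 2)
    # R^2 is often read as the share of trait variation tied to phylogeny.
    return coef, 1.0 - ss_res / ss_tot
```

    In this sketch the returned R^2 plays the role of an estimated phylogenetic signal; how many eigenvectors to keep is a modelling decision the abstract does not specify.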

  15. Student interpretations of phylogenetic trees in an introductory biology course. (United States)

    Dees, Jonathan; Momsen, Jennifer L; Niemi, Jarad; Montplaisir, Lisa


    Phylogenetic trees are widely used visual representations in the biological sciences and the most important visual representations in evolutionary biology. Therefore, phylogenetic trees have also become an important component of biology education. We sought to characterize reasoning used by introductory biology students in interpreting taxa relatedness on phylogenetic trees, to measure the prevalence of correct taxa-relatedness interpretations, and to determine how student reasoning and correctness change in response to instruction and over time. Counting synapomorphies and nodes between taxa were the most common forms of incorrect reasoning, which presents a pedagogical dilemma concerning labeled synapomorphies on phylogenetic trees. Students also independently generated an alternative form of correct reasoning using monophyletic groups, the use of which decreased in popularity over time. Approximately half of all students were able to correctly interpret taxa relatedness on phylogenetic trees, and many memorized correct reasoning without understanding its application. Broad initial instruction that allowed students to generate inferences on their own contributed very little to phylogenetic tree understanding, while targeted instruction on evolutionary relationships improved understanding to some extent. Phylogenetic trees, which can directly affect student understanding of evolution, appear to offer introductory biology instructors a formidable pedagogical challenge.

  16. Nonlinear Time Series Model for Shape Classification Using Neural Networks

    Institute of Scientific and Technical Information of China (English)


    A complex nonlinear exponential autoregressive (CNEAR) model for invariant feature extraction is developed for recognizing arbitrary shapes on a plane. A neural network is used to calculate the CNEAR coefficients. The coefficients, which constitute the feature set, are proven to be invariant to boundary transformations such as translation, rotation, scale and choice of starting point in tracing the boundary. The feature set is then used as the input to a complex multilayer perceptron (C-MLP) network for learning and classification. Experimental results show that complicated shapes can be accurately recognized even with the low-order model and that the classification method has good fault tolerance when noise is present.

  17. Classification of Meteorological Drought

    Institute of Scientific and Technical Information of China (English)

    Zhang Qiang; Zou Xukai; Xiao Fengjin; Lu Houquan; Liu Haibo; Zhu Changhan; An Shunqing


    Background The national standard of the Classification of Meteorological Drought (GB/T 20481-2006) was developed by the National Climate Center in cooperation with the Chinese Academy of Meteorological Sciences, the National Meteorological Centre and the Department of Forecasting and Disaster Mitigation under the China Meteorological Administration (CMA), and was formally released and implemented in November 2006. In 2008, this Standard won the second prize of the China Standard Innovation and Contribution Awards issued by SAC. Developed through independent innovation, it is the first national standard published to monitor meteorological drought disaster and the first standard in China and around the world specifying the classification of drought. Since its release in 2006, the national standard of Classification of Meteorological Drought has been used by CMA as the operational index to monitor and assess drought, has gradually been adopted by provincial meteorological bureaus, and has been applied to the drought early warning release standard in the Methods of Release and Propagation of Meteorological Disaster Early Warning Signal.

  18. Short Text Classification: A Survey

    Directory of Open Access Journals (Sweden)

    Ge Song


    With the recent explosive growth of e-commerce and online communication, a new genre of text, short text, has been extensively applied in many areas. Much research therefore focuses on short text mining. Classifying short text is a challenge owing to its inherent characteristics, such as sparseness, large scale, immediacy and non-standardization. Traditional methods struggle with short text classification mainly because the few words in a short text cannot represent the feature space and the relationship between words and documents. Several studies and reviews on text classification have appeared recently; however, only a few focus on short text classification. This paper discusses the characteristics of short text and the difficulty of short text classification. We then introduce the existing popular work on short text classifiers and models, including short text classification using semantic analysis, semi-supervised short text classification, ensemble short text classification, and real-time classification. The evaluation of short text classification is also analyzed. Finally, we summarize the existing classification technology and discuss development trends in short text classification.

  19. Maximum mutual information regularized classification

    KAUST Repository

    Wang, Jim Jing-Yan


    In this paper, a novel pattern classification approach is proposed by regularizing the classifier learning to maximize mutual information between the classification response and the true class label. We argue that, with the learned classifier, the uncertainty of the true class label of a data sample should be reduced by knowing its classification response as much as possible. The reduced uncertainty is measured by the mutual information between the classification response and the true class label. To this end, when learning a linear classifier, we propose to maximize the mutual information between classification responses and true class labels of training samples, besides minimizing the classification error and reducing the classifier complexity. An objective function is constructed by modeling mutual information with entropy estimation, and it is optimized by a gradient descent method in an iterative algorithm. Experiments on two real world pattern classification problems show the significant improvements achieved by maximum mutual information regularization.
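
    The quantity at the centre of this abstract, the mutual information between classification responses and true class labels, can be illustrated with a simple plug-in estimate. The sketch below assumes discrete (already thresholded) responses and uses empirical frequencies; it does not reproduce the paper's entropy estimator or its gradient-based optimization, and the function name `mutual_information` is hypothetical.

```python
import numpy as np

def mutual_information(responses, labels):
    """Empirical mutual information (in nats) between two discrete sequences.

    A plug-in estimate from observed frequencies; assumed here only to
    illustrate the quantity being maximized, not the paper's estimator.
    """
    responses = np.asarray(responses)
    labels = np.asarray(labels)
    mi = 0.0
    for r in np.unique(responses):
        p_r = np.mean(responses == r)
        for y in np.unique(labels):
            p_y = np.mean(labels == y)
            p_ry = np.mean((responses == r) & (labels == y))
            if p_ry > 0:
                mi += p_ry * np.log(p_ry / (p_r * p_y))
    return mi
```

    A response that always matches a balanced binary label attains the maximum of log 2 nats, while a response that is statistically independent of the label scores zero, which is why a higher value indicates that knowing the response removes more uncertainty about the label.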

  20. Preliminary Study of Phylogenetic Relationship of Rice Field Chironomidae (Diptera) Inferred From DNA Sequences of Mitochondrial Cytochrome Oxidase Subunit I

    Directory of Open Access Journals (Sweden)

    Salman A. Al-Shami


    Problem statement: Chironomidae have been recorded in rice fields throughout the world, including in countries such as India, Australia and the USA. Some studies provide keys to genus level and note the difficulty of identifying the larvae to species level. Chironomid research has been hindered by difficulties in specimen preparation, identification, morphology and literature. Systematic, phylogenetic and taxonomic studies of insects developed quickly with the emergence of molecular techniques. These techniques provide an effective tool toward more accurate identification of ambiguous chironomid species. Approach: Samples of chironomid larvae were collected from rice plots at Bukit Merah Agricultural Experimental Station (BMAES), Penang, Malaysia. A 710 bp fragment of the mitochondrial gene Cytochrome Oxidase subunit I (COI) was amplified and sequenced. Results: Five species of Chironomidae were morphologically identified: three species of subfamily Chironominae (Chironomus kiiensis, Polypedilum trigonus, Tanytarsus formosanus) and two species of subfamily Tanypodinae (Clinotanypus sp. and Tanypus punctipennis). The phylogenetic relationship among these species was investigated. High sequence divergence was observed between two individuals of the presumed C. kiiensis and it is suggested that more than one species may be present. However, the intraspecific sequence divergence was lower between the other species of the Tanypodinae subfamily. Interestingly, Tanytarsus formosanus showed a close phylogenetic relationship to Tanypodinae species, and this presumably reflects co-evolutionary traits of different subfamilies. Conclusion: The sequence of the mtDNA cytochrome oxidase subunit I gene has proven useful to investigate the phylogenetic relationship among the ambiguous species of chironomids.

  1. Polarimetric Synthetic Aperture Radar Image Classification by a Hybrid Method

    Institute of Scientific and Technical Information of China (English)

    Kamran Ullah Khan; YANG Jian


    Different methods proposed so far for accurate classification of land cover types in polarimetric synthetic aperture radar (SAR) images are data specific and no general method is available. A novel hybrid framework for this classification was developed in this work. A set of effective features derived from the coherence matrix of polarimetric SAR data was proposed. Constituents of the feature set are wavelet, texture, and nonlinear features. The proposed feature set has a strong discrimination power. A neural network was used as the classification engine in a unique way. By exploiting the speed of the conjugate gradient method and the convergence rate of the Levenberg-Marquardt method (near the optimal point), an overall speed up of the classification procedure was achieved. Principal component analysis (PCA) was used to shrink the dimension of the feature vector without sacrificing much of the classification accuracy. The proposed approach is compared with the maximum likelihood estimator (MLE) based on the complex Wishart distribution and the results show the superiority of the proposed method, with the average classification accuracy by the proposed method (95.4%) higher than that of the MLE (93.77%). Use of PCA to reduce the dimensionality of the feature vector helps reduce the memory requirements and computational cost, thereby enhancing the speed of the process.
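
    The dimensionality-reduction step mentioned in this abstract (PCA applied to the feature vector before classification) might look like the following minimal sketch. The function name `pca_reduce`, the SVD-based formulation, and the number of retained components are assumptions; the abstract does not say how the authors computed or truncated the components.

```python
import numpy as np

def pca_reduce(features, n_components):
    """Project feature vectors onto their leading principal components.

    features: array of shape (n_samples, n_features).
    Returns the reduced features and the fraction of variance each
    retained component explains. Hypothetical helper for illustration.
    """
    x = np.asarray(features, dtype=float)
    centred = x - x.mean(axis=0)
    # Rows of vt are the principal directions, ordered by singular value.
    _, singular, vt = np.linalg.svd(centred, full_matrices=False)
    explained = singular ** 2 / np.sum(singular ** 2)
    return centred @ vt[:n_components].T, explained[:n_components]
```

    Inspecting the explained-variance fractions is one common way to choose how many components to keep so that little discriminative information is lost while memory and computation shrink.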

  2. Exploiting multi-context analysis in semantic image classification

    Institute of Scientific and Technical Information of China (English)

    TIAN Yong-hong; HUANG Tie-jun; GAO Wen


    As the popularity of digital images is rapidly increasing on the Internet, research on technologies for semantic image classification has become an important research topic. However, the well-known content-based image classification methods do not overcome the so-called semantic gap problem in which low-level visual features cannot represent the high-level semantic content of images. Image classification using visual and textual information often performs poorly since the extracted textual features are often too limited to accurately represent the images. In this paper, we propose a semantic image classification approach using multi-context analysis. For a given image, we model the relevant textual information as its multi-modal context, and regard the related images connected by hyperlinks as its link context. Two kinds of context analysis models, i.e., cross-modal correlation analysis and link-based correlation model, are used to capture the correlation among different modals of features and the topical dependency among images induced by the link structure. We propose a new collective classification model called relational support vector classifier (RSVC) based on the well-known Support Vector Machines (SVMs) and the link-based correlation model. Experiments showed that the proposed approach significantly improved classification accuracy over that of SVM classifiers using visual and/or textual features.

  3. Fuzzy Aspect Based Opinion Classification System for Mining Tourist Reviews

    Directory of Open Access Journals (Sweden)

    Muhammad Afzaal


    Due to the large number of opinions available on websites, tourists are often overwhelmed with information and find it extremely difficult to use the available information to make a decision about the tourist places to visit. A number of opinion mining methods have been proposed in the past to identify and classify an opinion as positive or negative. Recently, aspect based opinion mining has been introduced, which targets the various aspects present in the opinion text. A number of existing aspect based opinion classification methods are available in the literature, but very limited research work has targeted the automatic aspect identification and extraction of implicit, infrequent, and coreferential aspects. Aspect based classification suffers from the presence of irrelevant sentences in a typical user review. Such sentences make the data noisy and degrade the classification accuracy of the machine learning algorithms. This paper presents a fuzzy aspect based opinion classification system which efficiently extracts aspects from user opinions and performs near-accurate classification. We conducted experiments on real world datasets to evaluate the effectiveness of our proposed system. Experimental results show that the proposed system not only is effective in aspect extraction but also improves the classification accuracy.

  4. Evaluating Phylogenetic Informativeness as a Predictor of Phylogenetic Signal for Metazoan, Fungal, and Mammalian Phylogenomic Data Sets

    Directory of Open Access Journals (Sweden)

    Francesc López-Giráldez


    Phylogenetic research is often stymied by selection of a marker that leads to poor phylogenetic resolution despite considerable cost and effort. Profiles of phylogenetic informativeness provide a quantitative measure for prioritizing gene sampling to resolve branching order in a particular epoch. To evaluate the utility of these profiles, we analyzed phylogenomic data sets from metazoans, fungi, and mammals, thus encompassing diverse time scales and taxonomic groups. We also evaluated the utility of profiles created based on simulated data sets. We found that genes selected via their informativeness dramatically outperformed haphazard sampling of markers. Furthermore, our analyses demonstrate that the original phylogenetic informativeness method can be extended to trees with more than four taxa. Thus, although the method currently predicts phylogenetic signal without specifically accounting for the misleading effects of stochastic noise, it is robust to the effects of homoplasy. The phylogenetic informativeness rankings obtained will allow other researchers to select advantageous genes for future studies within these clades, maximizing return on effort and investment. Genes identified might also yield efficient experimental designs for phylogenetic inference for many sister clades and outgroup taxa that are closely related to the diverse groups of organisms analyzed.

  5. Accurate measurement of unsteady state fluid temperature (United States)

    Jaremkiewicz, Magdalena


    In this paper, two accurate methods for determining the transient fluid temperature were presented. Measurements were conducted for boiling water since its temperature is known. At the beginning the thermometers are at the ambient temperature and next they are immediately immersed into saturated water. The measurements were carried out with two thermometers of different construction but with the same housing outer diameter equal to 15 mm. One of them is a K-type industrial thermometer widely available commercially. The temperature indicated by the thermometer was corrected considering the thermometers as the first or second order inertia devices. The new design of a thermometer was proposed and also used to measure the temperature of boiling water. Its characteristic feature is a cylinder-shaped housing with the sheath thermocouple located in its center. The temperature of the fluid was determined based on measurements taken in the axis of the solid cylindrical element (housing) using the inverse space marching method. Measurements of the transient temperature of the air flowing through the wind tunnel using the same thermometers were also carried out. The proposed measurement technique provides more accurate results compared with measurements using industrial thermometers in conjunction with simple temperature correction using the inertial thermometer model of the first or second order. By comparing the results, it was demonstrated that the new thermometer allows obtaining the fluid temperature much faster and with higher accuracy in comparison to the industrial thermometer. Accurate measurements of the fast changing fluid temperature are possible due to the low inertia thermometer and fast space marching method applied for solving the inverse heat conduction problem.
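
    The simple correction mentioned in this abstract, which treats the thermometer as a first-order inertia device, can be sketched as follows. Under the standard first-order model the fluid temperature is recovered as T_fluid(t) = T(t) + tau * dT/dt, where T is the indicated temperature and tau is the thermometer time constant. The function name `correct_first_order`, the finite-difference derivative, and the assumption that tau is already known are illustrative choices; the paper's inverse space marching method is not reproduced here.

```python
import numpy as np

def correct_first_order(time, indicated, tau):
    """Estimate fluid temperature from a thermometer modelled as first order.

    time: sample times in seconds; indicated: thermometer readings;
    tau: time constant in seconds (assumed known, e.g. from calibration).
    """
    time = np.asarray(time, dtype=float)
    indicated = np.asarray(indicated, dtype=float)
    # Numerical derivative of the indicated temperature with respect to time.
    rate = np.gradient(indicated, time)
    return indicated + tau * rate
```

    With measured data the numerical derivative amplifies noise, so in practice the readings would usually be smoothed before differentiation; that step is omitted from this sketch.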

  6. Phylogeny and classification of the trapdoor spider genus Myrmekiaphila: an integrative approach to evaluating taxonomic hypotheses.

    Directory of Open Access Journals (Sweden)

    Ashley L Bailey

    Full Text Available BACKGROUND: Revised by Bond and Platnick in 2007, the trapdoor spider genus Myrmekiaphila comprises 11 species. Species delimitation and placement within one of three species groups was based on modifications of the male copulatory device. Because a phylogeny of the group was not available these species groups might not represent monophyletic lineages; species definitions likewise were untested hypotheses. The purpose of this study is to reconstruct the phylogeny of Myrmekiaphila species using molecular data to formally test the delimitation of species and species-groups. We seek to refine a set of established systematic hypotheses by integrating across molecular and morphological data sets. METHODS AND FINDINGS: Phylogenetic analyses comprising Bayesian searches were conducted for a mtDNA matrix composed of contiguous 12S rRNA, tRNA-val, and 16S rRNA genes and a nuclear DNA matrix comprising the glutamyl and prolyl tRNA synthetase gene each consisting of 1348 and 481 bp, respectively. Separate analyses of the mitochondrial and nuclear genome data and a concatenated data set yield M. torreya and M. millerae paraphyletic with respect to M. coreyi and M. howelli and polyphyletic fluviatilis and foliata species groups. CONCLUSIONS: Despite the perception that molecular data present a solution to a crisis in taxonomy, studies like this demonstrate the efficacy of an approach that considers data from multiple sources. A DNA barcoding approach during the species discovery process would fail to recognize at least two species (M. coreyi and M. howelli), whereas a combined approach more accurately assesses species diversity and illuminates speciation pattern and process. Concomitantly these data also demonstrate that morphological characters likewise fail in their ability to recover monophyletic species groups and result in an unnatural classification.
Optimizations of these characters demonstrate a pattern of "Dollo evolution" wherein a complex character

  7. Niche Genetic Algorithm with Accurate Optimization Performance

    Institute of Scientific and Technical Information of China (English)

    LIU Jian-hua; YAN De-kun


    Based on the crowding mechanism, a novel niche genetic algorithm is proposed that dynamically records the evolutionary direction during evolution. After evolution, the precision of the solutions can be greatly improved by local searching along the recorded direction. Simulation shows that this algorithm can not only maintain population diversity but also find accurate solutions. Although this method takes more time than the standard GA, it is well worth applying in cases that demand high solution precision.

  8. Accurate estimation of indoor travel times

    DEFF Research Database (Denmark)

    Prentow, Thor Siiger; Blunck, Henrik; Stisen, Allan


    the InTraTime method for accurately estimating indoor travel times via mining of historical and real-time indoor position traces. The method learns during operation both travel routes, travel times and their respective likelihood---both for routes traveled as well as for sub-routes thereof. In...... are collected within the building complex. Results indicate that InTraTime is superior with respect to metrics such as deployment cost, maintenance cost and estimation accuracy, yielding an average deviation from actual travel times of 11.7 %. This accuracy was achieved despite using a minimal-effort setup...

  9. Accurate diagnosis is essential for amebiasis

    Institute of Scientific and Technical Information of China (English)


    Amebiasis is one of the three most common causes of death from parasitic disease, and Entamoeba histolytica is one of the most widely distributed parasites in the world. In particular, Entamoeba histolytica infection is a significant health problem in amebiasis-endemic areas of developing countries, with a significant impact on infant mortality[1]. In recent years a worldwide increase in the number of patients with amebiasis has refocused attention on this important infection. On the other hand, improvements in the quality of parasitological methods and the widespread use of accurate techniques have improved our knowledge about the disease.

  10. The first accurate description of an aurora (United States)

    Schröder, Wilfried


    As technology has advanced, the scientific study of auroral phenomena has increased by leaps and bounds. A look back at the earliest descriptions of aurorae offers an interesting look into how medieval scholars viewed the subjects that we study. Although there are earlier fragmentary references in the literature, the first accurate description of the aurora borealis appears to be that published by the German Catholic scholar Konrad von Megenberg (1309-1374) in his book Das Buch der Natur (The Book of Nature). The book was written between 1349 and 1350.

  11. New law requires 'medically accurate' lesson plans. (United States)


    The California Legislature has passed a bill requiring all textbooks and materials used to teach about AIDS be medically accurate and objective. Statements made within the curriculum must be supported by research conducted in compliance with scientific methods, and published in peer-reviewed journals. Some of the current lesson plans were found to contain scientifically unsupported and biased information. In addition, the bill requires material to be "free of racial, ethnic, or gender biases." The legislation is supported by a wide range of interests, but opposed by the California Right to Life Education Fund, because they believe it discredits abstinence-only material.

  12. Universality: Accurate Checks in Dyson's Hierarchical Model (United States)

    Godina, J. J.; Meurice, Y.; Oktay, M. B.


    In this talk we present high-accuracy calculations of the susceptibility near βc for Dyson's hierarchical model in D = 3. Using linear fitting, we estimate the leading (γ) and subleading (Δ) exponents. Independent estimates are obtained by calculating the first two eigenvalues of the linearized renormalization group transformation. We found γ = 1.29914073 ± 10^-8 and Δ = 0.4259469 ± 10^-7, independently of the choice of local integration measure (Ising or Landau-Ginzburg). After a suitable rescaling, the approximate fixed points for a large class of local measures coincide accurately with a fixed point constructed by Koch and Wittwer.
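
    The linear-fitting step can be illustrated with a short sketch. It assumes only the standard leading scaling form of the susceptibility, chi ~ (βc - β)^(-γ), so that log(chi) is linear in log(βc - β) with slope -γ. The values of beta_c and gamma_true below are made-up inputs for synthetic data, and the subleading correction governed by Δ is deliberately ignored; this is not the authors' procedure.

```python
import math

def fit_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Synthetic susceptibility data following the pure power law (assumed values).
beta_c = 1.0
gamma_true = 1.3
betas = [beta_c - 10.0 ** (-k) for k in range(2, 8)]
chis = [(beta_c - b) ** (-gamma_true) for b in betas]

# Fit log(chi) against log(beta_c - beta); the exponent is minus the slope.
xs = [math.log(beta_c - b) for b in betas]
ys = [math.log(c) for c in chis]
gamma_est = -fit_slope(xs, ys)
```

    On real data the subleading term bends the log-log plot away from a straight line, which is why a second exponent has to be fitted alongside the leading one.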

  13. Decision Fusion Based on Hyperspectral and Multispectral Satellite Imagery for Accurate Forest Species Mapping

    Directory of Open Access Journals (Sweden)

    Dimitris G. Stavrakoudis


    Full Text Available This study investigates the effectiveness of combining multispectral very high resolution (VHR) and hyperspectral satellite imagery through a decision fusion approach, for accurate forest species mapping. Initially, two fuzzy classifications are conducted, one for each satellite image, using a fuzzy output support vector machine (SVM). The classification result from the hyperspectral image is then resampled to the multispectral’s spatial resolution and the two sources are combined using a simple yet efficient fusion operator. Thus, the complementary information provided from the two sources is effectively exploited, without having to resort to computationally demanding and time-consuming typical data fusion or vector stacking approaches. The effectiveness of the proposed methodology is validated in a complex Mediterranean forest landscape, comprising spectrally similar and spatially intermingled species. The decision fusion scheme resulted in an accuracy increase of 8% compared to the classification using only the multispectral imagery, whereas the increase was even higher compared to the classification using only the hyperspectral satellite image. Perhaps most importantly, its accuracy was significantly higher than alternative multisource fusion approaches, although the latter are characterized by much higher computation, storage, and time requirements.
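
    Decision fusion of two fuzzy classifications can be sketched as follows. The abstract does not specify the fusion operator, so this sketch substitutes a weighted average of the per-class membership degrees followed by an arg-max; the function name, the weight, and the membership values are all hypothetical. It also assumes both outputs are already resampled to the same pixel grid.

```python
def fuse_memberships(memberships_a, memberships_b, weight_a=0.5):
    """Fuse two per-class fuzzy membership vectors for one pixel.

    Uses a weighted average as a stand-in fusion operator (an assumption,
    not the operator used in the study) and returns the winning class index
    together with the fused membership vector.
    """
    fused = [
        weight_a * a + (1.0 - weight_a) * b
        for a, b in zip(memberships_a, memberships_b)
    ]
    label = max(range(len(fused)), key=fused.__getitem__)
    return label, fused

# Hypothetical membership degrees for three species classes at one pixel.
multispectral = [0.50, 0.45, 0.05]   # slightly favours class 0
hyperspectral = [0.20, 0.70, 0.10]   # clearly favours class 1

label, fused = fuse_memberships(multispectral, hyperspectral)
```

    In this example the multispectral classifier alone would pick class 0, but the fused decision is class 1, because the hyperspectral source is much more confident at this pixel.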

  14. Phylogenetics and reticulate evolution in Pistacia (Anacardiaceae). (United States)

    Yi, Tingshuang; Wen, Jun; Golan-Goldhirsh, Avi; Parfitt, Dan E


    The systematic position and intrageneric relationships of the economically important Pistacia species (Anacardiaceae) are controversial. The phylogeny of Pistacia was assessed using five data sets: sequences of nuclear ribosomal ITS, the third intron of the nuclear nitrate reductase gene (NIA-i3), and the plastid ndhF, trnL-F and trnC-trnD. Significant discordance was detected among ITS, NIA-i3, and the combined plastid DNA data sets. ITS, NIA-i3, and the combined plastid data sets were analyzed separately using Bayesian and parsimony methods. Both the ITS and the NIA-i3 data sets resolved the relationships among Pistacia species well; however, these two data sets had significant discordance. The ITS phylogeny best reflects the evolutionary relationships among Pistacia species. Lineage sorting of the NIA-i3 alleles may explain the conflicts between the NIA-i3 and the ITS data sets. The combined analysis of three plastid DNA data sets resolved Pistacia species into three major clades, within which only a few subclades were supported. Pistacia was shown to be monophyletic in all three analyses. The previous intrageneric classification was largely inconsistent with the molecular data. Some Pistacia species appear not to be genealogical species, and evidence for reticulate evolution is presented. Pistacia saportae was shown to be a hybrid with P. lentiscus (maternal) and P. terebinthus (paternal) as the parental taxa.

  15. Markov invariants, plethysms, and phylogenetics (the long version)

    CERN Document Server

    Sumner, J G; Jermiin, L S; Jarvis, P D


    We explore model based techniques of phylogenetic tree inference exercising Markov invariants. Markov invariants are group invariant polynomials and are distinct from what is known in the literature as phylogenetic invariants, although we establish a commonality in some special cases. We show that the simplest Markov invariant forms the foundation of the Log-Det distance measure. We take as our primary tool group representation theory, and show that it provides a general framework for analysing Markov processes on trees. From this algebraic perspective, the inherent symmetries of these processes become apparent, and focusing on plethysms, we are able to define Markov invariants and give existence proofs. We give an explicit technique for constructing the invariants, valid for any number of character states and taxa. For phylogenetic trees with three and four leaves, we demonstrate that the corresponding Markov invariants can be fruitfully exploited in applied phylogenetic studies.

  16. A common tendency for phylogenetic overdispersion in mammalian assemblages. (United States)

    Cooper, Natalie; Rodríguez, Jesús; Purvis, Andy


    Competition has long been proposed as an important force in structuring mammalian communities. Although early work recognized that competition has a phylogenetic dimension, only with recent increases in the availability of phylogenies have true phylogenetic investigations of mammalian community structure become possible. We test whether the phylogenetic structure of 142 assemblages from three mammalian clades (New World monkeys, North American ground squirrels and Australasian possums) shows the imprint of competition. The full set of assemblages display a highly significant tendency for members to be more distantly related than expected by chance (phylogenetic overdispersion). The overdispersion is also significant within two of the clades (monkeys and squirrels) separately. This is the first demonstration of widespread overdispersion in mammal assemblages and implies an important role for either competition between close relatives where traits are conserved, habitat filtering where distant relatives share convergent traits, or both.

  17. Phylogenetic and functional diversity in large carnivore assemblages. (United States)

    Dalerum, F


    Large terrestrial carnivores are important ecological components and prominent flagship species, but are often extinction prone owing to a combination of biological traits and high levels of human persecution. This study combines phylogenetic and functional diversity evaluations of global and continental large carnivore assemblages to provide a framework for conservation prioritization both between and within assemblages. Species-rich assemblages of large carnivores simultaneously had high phylogenetic and functional diversity, but species contributions to phylogenetic and functional diversity components were not positively correlated. The results further provide ecological justification for the largest carnivore species as a focus for conservation action, and suggests that range contraction is a likely cause of diminishing carnivore ecosystem function. This study highlights that preserving species-rich carnivore assemblages will capture both high phylogenetic and functional diversity, but that prioritizing species within assemblages will involve trade-offs between optimizing contemporary ecosystem function versus the evolutionary potential for future ecosystem performance.

  18. Phylogenetic relationships of Salmonella based on rRNA sequences

    DEFF Research Database (Denmark)

    Christensen, H.; Nordentoft, Steen; Olsen, J.E.


    To establish the phylogenetic relationships between the subspecies of Salmonella enterica (official name Salmonella choleraesuis), Salmonella bongori and related members of Enterobacteriaceae, sequence comparison of rRNA was performed by maximum-likelihood analysis. The two Salmonella species wer...

  19. Classification of LiDAR Data with Point Based Classification Methods (United States)

    Yastikli, N.; Cetin, Z.


    LiDAR is one of the most effective systems for 3 dimensional (3D) data collection in wide areas. Nowadays, airborne LiDAR data is used frequently in various applications such as object extraction, 3D modelling, change detection and revision of maps with increasing point density and accuracy. The classification of the LiDAR points is the first step of LiDAR data processing chain and should be handled in proper way since the 3D city modelling, building extraction, DEM generation, etc. applications directly use the classified point clouds. The different classification methods can be seen in recent researches and most of researches work with the gridded LiDAR point cloud. In grid based data processing of the LiDAR data, the characteristic point loss in the LiDAR point cloud especially vegetation and buildings or losing height accuracy during the interpolation stage are inevitable. In this case, the possible solution is the use of the raw point cloud data for classification to avoid data and accuracy loss in gridding process. In this study, the point based classification possibilities of the LiDAR point cloud is investigated to obtain more accurate classes. The automatic point based approaches, which are based on hierarchical rules, have been proposed to achieve ground, building and vegetation classes using the raw LiDAR point cloud data. In proposed approaches, every single LiDAR point is analyzed according to their features such as height, multi-return, etc. then automatically assigned to the class which they belong to. The use of un-gridded point cloud in proposed point based classification process helped the determination of more realistic rule sets. The detailed parameter analyses have been performed to obtain the most appropriate parameters in the rule sets to achieve accurate classes. The hierarchical rule sets were created for proposed Approach 1 (using selected spatial-based and echo-based features) and Approach 2 (using only selected spatial-based features

  20. How Accurately can we Calculate Thermal Systems?

    Energy Technology Data Exchange (ETDEWEB)

    Cullen, D; Blomquist, R N; Dean, C; Heinrichs, D; Kalugin, M A; Lee, M; Lee, Y; MacFarlan, R; Nagaya, Y; Trkov, A


    I would like to determine how accurately a variety of neutron transport code packages (code and cross section libraries) can calculate simple integral parameters, such as K_eff, for systems that are sensitive to thermal neutron scattering. Since we will only consider theoretical systems, we cannot really determine absolute accuracy compared to any real system. Therefore rather than accuracy, it would be more precise to say that I would like to determine the spread in answers that we obtain from a variety of code packages. This spread should serve as an excellent indicator of how accurately we can really model and calculate such systems today. Hopefully, eventually this will lead to improvements in both our codes and the thermal scattering models that they use in the future. In order to accomplish this I propose a number of extremely simple systems that involve thermal neutron scattering that can be easily modeled and calculated by a variety of neutron transport codes. These are theoretical systems designed to emphasize the effects of thermal scattering, since that is what we are interested in studying. I have attempted to keep these systems very simple, and yet at the same time they include most, if not all, of the important thermal scattering effects encountered in a large, water-moderated, uranium fueled thermal system, i.e., our typical thermal reactors.

  1. Accurate pattern registration for integrated circuit tomography

    Energy Technology Data Exchange (ETDEWEB)

    Levine, Zachary H.; Grantham, Steven; Neogi, Suneeta; Frigo, Sean P.; McNulty, Ian; Retsch, Cornelia C.; Wang, Yuxin; Lucatorto, Thomas B.


    As part of an effort to develop high resolution microtomography for engineered structures, a two-level copper integrated circuit interconnect was imaged using 1.83 keV x rays at 14 angles employing a full-field Fresnel zone plate microscope. A major requirement for high resolution microtomography is the accurate registration of the reference axes in each of the many views needed for a reconstruction. A reconstruction with 100 nm resolution would require registration accuracy of 30 nm or better. This work demonstrates that even images that have strong interference fringes can be used to obtain accurate fiducials through the use of Radon transforms. We show that we are able to locate the coordinates of the rectilinear circuit patterns to 28 nm. The procedure is validated by agreement between an x-ray parallax measurement of 1.41 ± 0.17 μm and a measurement of 1.58 ± 0.08 μm from a scanning electron microscope image of a cross section.

  2. Accurate determination of characteristic relative permeability curves (United States)

    Krause, Michael H.; Benson, Sally M.


    A recently developed technique to accurately characterize sub-core scale heterogeneity is applied to investigate the factors responsible for flowrate-dependent effective relative permeability curves measured on core samples in the laboratory. The dependency of laboratory measured relative permeability on flowrate has long been both supported and challenged by a number of investigators. Studies have shown that this apparent flowrate dependency is a result of both sub-core scale heterogeneity and outlet boundary effects. However this has only been demonstrated numerically for highly simplified models of porous media. In this paper, flowrate dependency of effective relative permeability is demonstrated using two rock cores, a Berea Sandstone and a heterogeneous sandstone from the Otway Basin Pilot Project in Australia. Numerical simulations of steady-state coreflooding experiments are conducted at a number of injection rates using a single set of input characteristic relative permeability curves. Effective relative permeability is then calculated from the simulation data using standard interpretation methods for calculating relative permeability from steady-state tests. Results show that simplified approaches may be used to determine flowrate-independent characteristic relative permeability provided flow rate is sufficiently high, and the core heterogeneity is relatively low. It is also shown that characteristic relative permeability can be determined at any typical flowrate, and even for geologically complex models, when using accurate three-dimensional models.

  3. Accurate pose estimation for forensic identification (United States)

    Merckx, Gert; Hermans, Jeroen; Vandermeulen, Dirk


    In forensic authentication, one aims to identify the perpetrator among a series of suspects or distractors. A fundamental problem in any recognition system that aims for identification of subjects in a natural scene is the lack of constraints on viewing and imaging conditions. In forensic applications, identification proves even more challenging, since most surveillance footage is of abysmal quality. In this context, robust methods for pose estimation are paramount. In this paper we will therefore present a new pose estimation strategy for very low quality footage. Our approach uses 3D-2D registration of a textured 3D face model with the surveillance image to obtain accurate far field pose alignment. Starting from an inaccurate initial estimate, the technique uses novel similarity measures based on the monogenic signal to guide a pose optimization process. We will illustrate the descriptive strength of the introduced similarity measures by using them directly as a recognition metric. Through validation, using both real and synthetic surveillance footage, our pose estimation method is shown to be accurate, and robust to lighting changes and image degradation.

  4. Accurate taxonomic assignment of short pyrosequencing reads. (United States)

    Clemente, José C; Jansson, Jesper; Valiente, Gabriel


    Ambiguities in the taxonomy dependent assignment of pyrosequencing reads are usually resolved by mapping each read to the lowest common ancestor in a reference taxonomy of all those sequences that match the read. This conservative approach has the drawback of mapping a read to a possibly large clade that may also contain many sequences not matching the read. A more accurate taxonomic assignment of short reads can be made by mapping each read to the node in the reference taxonomy that provides the best precision and recall. We show that given a suffix array for the sequences in the reference taxonomy, a short read can be mapped to the node of the reference taxonomy with the best combined value of precision and recall in time linear in the size of the taxonomy subtree rooted at the lowest common ancestor of the matching sequences. An accurate taxonomic assignment of short reads can thus be made with about the same efficiency as when mapping each read to the lowest common ancestor of all matching sequences in a reference taxonomy. We demonstrate the effectiveness of our approach on several metagenomic datasets of marine and gut microbiota.
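
    The contrast between lowest-common-ancestor mapping and precision/recall mapping can be sketched on a toy taxonomy. This is a naive illustration, not the linear-time suffix-array method described in the abstract. It assumes that the "combined value" is the F-measure (harmonic mean of precision and recall); the tree, node names, and matches are made up.

```python
def leaves_under(tree, node):
    """Return the set of leaf names (reference sequences) below a node."""
    children = tree.get(node, [])
    if not children:
        return {node}
    found = set()
    for child in children:
        found |= leaves_under(tree, child)
    return found

def best_node(tree, root, matches):
    """Return the node below root with the highest F-measure for the matches.

    precision = matching leaves under node / all leaves under node
    recall    = matching leaves under node / all matching leaves
    """
    best, best_f = None, -1.0
    stack = [root]
    while stack:
        node = stack.pop()
        leaves = leaves_under(tree, node)
        hits = len(leaves & matches)
        if hits:
            precision = hits / len(leaves)
            recall = hits / len(matches)
            f_measure = 2 * precision * recall / (precision + recall)
            if f_measure > best_f:
                best, best_f = node, f_measure
        stack.extend(tree.get(node, []))
    return best, best_f

# Toy reference taxonomy: internal nodes map to their children.
tree = {
    "root": ["A", "B"],
    "A": ["a1", "a2", "a3"],
    "B": ["b1", "b2", "b3", "b4", "b5"],
}
# Reference sequences matched by one read; their lowest common ancestor is "root".
matches = {"a1", "a2", "b1"}

node, score = best_node(tree, "root", matches)
```

    Here the lowest common ancestor is the root, which also covers five non-matching sequences, while clade A gives the best balance (precision 2/3, recall 2/3) and is selected instead.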

  5. Ecological and phylogenetic influences on maxillary dentition in snakes

    Directory of Open Access Journals (Sweden)

    Kate Jackson


    Full Text Available The maxillary dentition of snakes was used as a system with which to investigate the relative importance of the interacting forces of ecological selective pressures and phylogenetic constraints in determining morphology. The maxillary morphology of three groups of snakes having different diets, with each group comprising two distinct lineages (boids and colubroids), was examined. Our results suggest that dietary selective pressures may be more significant than phylogenetic history in shaping maxillary morphology.

  6. Phylogenetic community ecology of soil biodiversity using mitochondrial metagenomics. (United States)

    Andújar, Carmelo; Arribas, Paula; Ruzicka, Filip; Crampton-Platt, Alex; Timmermans, Martijn J T N; Vogler, Alfried P


    High-throughput DNA methods hold great promise for the study of taxonomically intractable mesofauna of the soil. Here, we assess species diversity and community structure in a phylogenetic framework, by sequencing total DNA from bulk specimen samples and assembly of mitochondrial genomes. The combination of mitochondrial metagenomics and DNA barcode sequencing of 1494 specimens in 69 soil samples from three geographic regions in southern Iberia revealed >300 species of soil Coleoptera (beetles) from a broad spectrum of phylogenetic lineages. A set of 214 mitochondrial sequences longer than 3000 bp was generated and used to estimate a well-supported phylogenetic tree of the order Coleoptera. Shorter sequences, including cox1 barcodes, were placed on this mitogenomic tree. Raw Illumina reads were mapped against all available sequences to test for species present in local samples. This approach simultaneously established the species richness, phylogenetic composition and community turnover at species and phylogenetic levels. We find a strong signature of vertical structuring in soil fauna that shows high local community differentiation between deep soil and superficial horizons at phylogenetic levels. Within the two vertical layers, turnover among regions was primarily at the tip (species) level and was stronger in the deep soil than leaf litter communities, pointing to layer-mediated drivers determining species diversification, spatial structure and evolutionary assembly of soil communities. This integrated phylogenetic framework opens the application of phylogenetic community ecology to the mesofauna of the soil, among the most diverse and least well-understood ecosystems, and will propel both theoretical and applied soil science.

  7. Sequence Classification: 893607 [

    Lifescience Database Archive (English)

    Full Text Available ial component of the MIND kinetochore complex (Mtw1p Including Nnf1p-Nsl1p-Dsn1p) which joins kinetochore subunits DNA to those contacting microtubules; required for accurate chromosome segregation; Nsl1p || ...

  8. Sequence Classification: 894861 [

    Lifescience Database Archive (English)

    Full Text Available ial component of the MIND kinetochore complex (Mtw1p Including Nnf1p-Nsl1p-Dsn1p) which joins kinetochore subunits DNA to those contacting microtubules; required for accurate chromosome segregation; Nnf1p || ...

  9. Identification of Microorganisms by High Resolution Tandem Mass Spectrometry with Accurate Statistical Significance (United States)

    Alves, Gelio; Wang, Guanghui; Ogurtsov, Aleksey Y.; Drake, Steven K.; Gucek, Marjan; Suffredini, Anthony F.; Sacks, David B.; Yu, Yi-Kuo


    Correct and rapid identification of microorganisms is the key to the success of many important applications in health and safety, including, but not limited to, infection treatment, food safety, and biodefense. With the advance of mass spectrometry (MS) technology, the speed of identification can be greatly improved. However, the increasing number of microbes sequenced is challenging correct microbial identification because of the large number of choices present. To properly disentangle candidate microbes, one needs to go beyond apparent morphology or simple `fingerprinting'; to correctly prioritize the candidate microbes, one needs to have accurate statistical significance in microbial identification. We meet these challenges by using peptidome profiles of microbes to better separate them and by designing an analysis method that yields accurate statistical significance. Here, we present an analysis pipeline that uses tandem MS (MS/MS) spectra for microbial identification or classification. We have demonstrated, using MS/MS data of 81 samples, each composed of a single known microorganism, that the proposed pipeline can correctly identify microorganisms at least at the genus and species levels. We have also shown that the proposed pipeline computes accurate statistical significances, i.e., E-values for identified peptides and unified E-values for identified microorganisms. The proposed analysis pipeline has been implemented in MiCId, a freely available software for Microorganism Classification and Identification. MiCId is available for download at

  10. Classification in Medical Imaging

    DEFF Research Database (Denmark)

    Chen, Chen

    Classification is extensively used in the context of medical image analysis for the purpose of diagnosis or prognosis. In order to classify image content correctly, one needs to extract efficient features with discriminative properties and build classifiers based on these features. In addition...... to segment breast tissue and pectoral muscle area from the background in mammogram. The second focus is the choices of metric and its influence to the feasibility of a classifier, especially on k-nearest neighbors (k-NN) algorithm, with medical applications on breast cancer prediction and calcification...


    Directory of Open Access Journals (Sweden)

    I. P. Prokopenko


    Full Text Available Correctly organized nutritive and pharmacological support is an important component of an athlete's preparation for competitions, an optimal shape maintenance, fast recovery and rehabilitation after traumas and defatigation. Special products of enhanced biological value (BAS for athletes nutrition are used with this purpose. Easy-to-use energy sources are administered into athlete's organism, yielded materials and biologically active substances which regulate and activate exchange reactions which proceed with difficulties during certain physical trainings. The article presents sport supplements classification which can be used before warm-up and trainings, after trainings and in competitions breaks.

  12. [Classification of headache disorders]. (United States)

    Heinze, A; Heinze-Kuhn, K; Göbel, H


    In 2003 the International Headache Society (IHS) published the second edition of the International Classification of Headache Disorders. Diagnostic criteria for no fewer than 206 separate headache diagnoses are presented in the parts (I) primary headaches, (II) secondary headaches and (III) cranial neuralgia, central and primary facial pain. The secondary headaches are classified according to etiology, and the primary headaches according to phenomenology. It is the task of the headache specialist to identify the correct headache diagnosis with the smallest effort possible. Both the differentiation between secondary and primary headaches and the differentiation between the various primary headaches are of equal importance.

  13. Efficient FPT Algorithms for (Strict) Compatibility of Unrooted Phylogenetic Trees. (United States)

    Baste, Julien; Paul, Christophe; Sau, Ignasi; Scornavacca, Celine


    In phylogenetics, a central problem is to infer the evolutionary relationships between a set of species X; these relationships are often depicted via a phylogenetic tree (a tree having its leaves labeled bijectively by elements of X and without degree-2 nodes) called the "species tree." One common approach for reconstructing a species tree consists in first constructing several phylogenetic trees from primary data (e.g., DNA sequences originating from some species in X), and then constructing a single phylogenetic tree maximizing the "concordance" with the input trees. The obtained tree is our estimation of the species tree and, when the input trees are defined on overlapping (but not identical) sets of labels, is called a "supertree." In this paper, we focus on two problems that are central when combining phylogenetic trees into a supertree: the compatibility and the strict compatibility problems for unrooted phylogenetic trees. These problems are strongly related, respectively, to the notions of "containing as a minor" and "containing as a topological minor" in the graph community. Both problems are known to be fixed parameter tractable in the number of input trees k, by using their expressibility in monadic second-order logic and a reduction to graphs of bounded treewidth. Motivated by the fact that the dependency on k of these algorithms is prohibitively large, we give the first explicit dynamic programming algorithms for solving these problems, both running in time [Formula: see text], where n is the total size of the input.

  14. PyElph - a software tool for gel images analysis and phylogenetics

    Directory of Open Access Journals (Sweden)

    Pavel Ana Brânduşa


    (Random Amplification of Polymorphic DNA) and STR (Short Tandem Repeat). The similarity between the DNA sequences is computed and used to generate phylogenetic trees, which are very useful for population genetics studies and taxonomic classification. Conclusions: PyElph decreases the effort and time spent processing data from gel images by providing an automatic step-by-step gel image analysis system with a friendly Graphical User Interface. The proposed free software tool is suitable for researchers and students who do not have access to expensive commercial software and image acquisition devices.

  15. Complete generic-level phylogenetic analyses of palms (Arecaceae) with comparisons of supertree and supermatrix approaches. (United States)

    Baker, William J; Savolainen, Vincent; Asmussen-Lange, Conny B; Chase, Mark W; Dransfield, John; Forest, Félix; Harley, Madeline M; Uhl, Natalie W; Wilkinson, Mark


    Supertree and supermatrix methods have great potential in the quest to build the tree of life and yet they remain controversial, with most workers opting for one approach or the other, but rarely both. Here, we employed both methods to construct phylogenetic trees of all genera of palms (Arecaceae/Palmae), an iconic angiosperm family of great economic importance. We assembled a supermatrix consisting of 16 partitions, comprising DNA sequence data, plastid restriction fragment length polymorphism data, and morphological data for all genera, from which a highly resolved and well-supported phylogenetic tree was built despite abundant missing data. To construct supertrees, we used variants of matrix representation with parsimony (MRP) analysis based on input trees generated directly from subsamples of the supermatrix. All supertrees were highly resolved. Standard MRP with bootstrap-weighted matrix elements performed most effectively in this case, generating trees with the greatest congruence with the supermatrix tree and fewest clades unsupported by any input tree. Nonindependence due to input trees based on combinations of data partitions was an acceptable trade-off for improvements in supertree performance. Irreversible MRP and the use of strictly independent input trees only provided no obvious benefits. Contrary to previous claims, we found that unsupported clades are not infrequent under some MRP implementations, with up to 13% of clades lacking support from any input tree in some irreversible MRP supertrees. To build a formal synthesis, we assessed the cross-corroboration between supermatrix trees and the variant supertrees using semistrict consensus, enumerating shared clades and compatible clades. The semistrict consensus of the supermatrix tree and the most congruent supertree contained 160 clades (of a maximum of 204), 137 of which were present in both trees. 
The relationships recovered by these trees strongly support the current phylogenetic classification
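
    The matrix representation step behind MRP supertrees can be illustrated with a minimal sketch. It shows only the plain binary coding (one character per input-tree clade: 1 for taxa inside the clade, 0 for other taxa in that tree, ? for taxa missing from that tree), not the bootstrap-weighted or irreversible variants compared in the abstract. The function name and the toy taxa are assumptions made for illustration.

```python
def mrp_matrix(trees):
    """Build a plain MRP character matrix.

    trees: list of (taxa, clades) pairs, where taxa is the set of leaf
    labels of one input tree and clades is a list of sets of leaf labels.
    Returns a dict mapping each taxon to its character string.
    """
    all_taxa = sorted(set().union(*(taxa for taxa, _ in trees)))
    matrix = {t: [] for t in all_taxa}
    for taxa, clades in trees:
        for clade in clades:
            for t in all_taxa:
                if t not in taxa:
                    matrix[t].append("?")   # taxon absent from this input tree
                elif t in clade:
                    matrix[t].append("1")   # taxon inside the clade
                else:
                    matrix[t].append("0")   # taxon in the tree, outside the clade
    return {t: "".join(chars) for t, chars in matrix.items()}


# Hypothetical input: two small trees on overlapping taxon sets.
toy_trees = [
    ({"A", "B", "C", "D"}, [{"A", "B"}]),
    ({"B", "C", "E"}, [{"B", "C"}]),
]
toy_matrix = mrp_matrix(toy_trees)
```

    The resulting matrix would then be handed to a parsimony program; that search step is outside this sketch.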

  16. Classification of smooth Fano polytopes

    DEFF Research Database (Denmark)

    Øbro, Mikkel

    A simplicial lattice polytope containing the origin in the interior is called a smooth Fano polytope if the vertices of every facet form a basis of the lattice. The study of smooth Fano polytopes is motivated by their connection to toric varieties. The thesis concerns the classification of smooth Fano polytopes up to isomorphism. A smooth Fano -polytope can have at most vertices. In case of vertices an explicit classification is known. The thesis contains the classification in case of vertices. Classifications of smooth Fano -polytopes for fixed exist only for . In the thesis an algorithm for the classification of smooth Fano -polytopes for any given is presented. The algorithm has been implemented and used to obtain the complete classification for .

  17. Phylogenetic analysis of the complete mitochondrial genome of Madurella mycetomatis confirms its taxonomic position within the order Sordariales.

    Directory of Open Access Journals (Sweden)

    Wendy W J van de Sande

    BACKGROUND: Madurella mycetomatis is the most common cause of human eumycetoma. The genus Madurella has been characterized by overall sterility on mycological media. Due to this sterility and the absence of other reliable morphological and ultrastructural characters, the taxonomic classification of Madurella has long been a challenge. Mitochondria are of monophyletic origin and mitochondrial genomes have been proven to be useful in phylogenetic analyses. RESULTS: The first complete mitochondrial DNA genome of a mycetoma-causative agent was sequenced using 454 sequencing. The mitochondrial genome of M. mycetomatis is a circular DNA molecule with a size of 45,590 bp, encoding the small and the large subunit rRNAs, 27 tRNAs, 11 genes encoding subunits of respiratory chain complexes, 2 ATP synthase subunits, 5 hypothetical proteins, and 6 intronic proteins including the ribosomal protein rps3. In phylogenetic analyses using amino acid sequences of the proteins involved in respiratory chain complexes and the 2 ATP synthases, M. mycetomatis clustered together with members of the order Sordariales and was most closely related to Chaetomium thermophilum. Analyses of the gene order showed that a similar gene order is found within the order Sordariales. Furthermore, the tRNA order also seemed mostly conserved. CONCLUSION: Phylogenetic analyses of fungal mitochondrial genomes confirmed that M. mycetomatis belongs to the order Sordariales and that it is most closely related to Chaetomium thermophilum, with which it also shares a comparable gene and tRNA order.

  18. Monte Carlo estimation of total variation distance of Markov chains on large spaces, with application to phylogenetics. (United States)

    Herbei, Radu; Kubatko, Laura


    Markov chains are widely used for modeling in many areas of molecular biology and genetics. As the complexity of such models advances, it becomes increasingly important to assess the rate at which a Markov chain converges to its stationary distribution in order to carry out accurate inference. A common measure of convergence to the stationary distribution is the total variation distance, but this measure can be difficult to compute when the state space of the chain is large. We propose a Monte Carlo method to estimate the total variation distance that can be applied in this situation, and we demonstrate how the method can be efficiently implemented by taking advantage of GPU computing techniques. We apply the method to two Markov chains on the space of phylogenetic trees, and discuss the implications of our findings for the development of algorithms for phylogenetic inference.
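
    For orientation, the quantity being estimated can be computed exactly when the state space is tiny. The sketch below is not the Monte Carlo estimator of the abstract; it only shows the definition of total variation distance, TV(p, q) = 1/2 * sum over states of |p(s) - q(s)|, on a hypothetical two-state chain. On spaces as large as tree space this exact sum is infeasible, which is the situation the abstract addresses. All names and numbers are illustrative assumptions.

```python
def total_variation(p, q):
    """Exact TV distance between two distributions given as dicts state -> probability."""
    states = set(p) | set(q)
    return 0.5 * sum(abs(p.get(s, 0.0) - q.get(s, 0.0)) for s in states)


def step_distribution(dist, transition):
    """Push a distribution one step through a transition matrix (dict of dicts)."""
    out = {}
    for state, mass in dist.items():
        for target, prob in transition[state].items():
            out[target] = out.get(target, 0.0) + mass * prob
    return out


def tv_after(n_steps, start, transition, stationary):
    """TV distance to the stationary distribution after n_steps from a point mass."""
    dist = {start: 1.0}
    for _ in range(n_steps):
        dist = step_distribution(dist, transition)
    return total_variation(dist, stationary)


# Hypothetical two-state chain; its stationary distribution is (5/6, 1/6).
P = {"A": {"A": 0.9, "B": 0.1}, "B": {"A": 0.5, "B": 0.5}}
PI = {"A": 5.0 / 6.0, "B": 1.0 / 6.0}
```

    For this chain the distance shrinks by a factor of 0.4 (the second eigenvalue) per step, so `tv_after(n, "B", P, PI)` falls geometrically with n.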

  19. Link prediction boosted psychiatry disorder classification for functional connectivity network (United States)

    Li, Weiwei; Mei, Xue; Wang, Hao; Zhou, Yu; Huang, Jiashuang


    A functional connectivity network (FCN) is an effective tool in psychiatric disorder classification, and represents the cross-correlation of the regional blood oxygenation level dependent signal. However, an FCN is often incomplete because it suffers from missing and spurious edges. To accurately classify psychiatric disorders and healthy controls with the incomplete FCN, we first `repair' the FCN with link prediction, and then extract the clustering coefficients as features to build a weak classifier for every FCN. Finally, we apply a boosting algorithm to combine these weak classifiers and improve classification accuracy. Our method is tested on three psychiatric disorder datasets, covering Alzheimer's Disease, Schizophrenia and Attention Deficit Hyperactivity Disorder. The experimental results show that our method not only significantly improves the classification accuracy, but also efficiently reconstructs the incomplete FCN.
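
    The feature-extraction step mentioned above (clustering coefficients of a network) can be sketched as follows. This is a generic local clustering coefficient for an unweighted, undirected graph, offered as an assumption about the kind of feature involved; the abstract does not say which variant, thresholding, or link-prediction score the authors use, and the toy graph is hypothetical.

```python
from itertools import combinations


def clustering_coefficients(adj):
    """Local clustering coefficient of every node.

    adj: dict mapping node -> set of neighbouring nodes (undirected graph).
    For a node with k >= 2 neighbours the coefficient is the fraction of
    the k*(k-1)/2 neighbour pairs that are themselves connected.
    """
    coeffs = {}
    for node, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            coeffs[node] = 0.0  # convention: undefined cases set to zero
            continue
        links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
        coeffs[node] = 2.0 * links / (k * (k - 1))
    return coeffs


# Hypothetical graph: a triangle (0, 1, 2) with a pendant node 3 attached to 2.
toy_graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
toy_features = clustering_coefficients(toy_graph)
```

    The per-node values would form one feature vector per subject, on which a weak classifier could be trained before boosting.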

  20. Hierarchical Maximum Margin Learning for Multi-Class Classification

    CERN Document Server

    Yang, Jian-Bo


    Due to myriads of classes, designing accurate and efficient classifiers becomes very challenging for multi-class classification. Recent research has shown that class structure learning can greatly facilitate multi-class learning. In this paper, we propose a novel method to learn the class structure for multi-class classification problems. The class structure is assumed to be a binary hierarchical tree. To learn such a tree, we propose a maximum separating margin method to determine the child nodes of any internal node. The proposed method ensures that the two class groups represented by any two sibling nodes are most separable. In the experiments, we evaluate the accuracy and efficiency of the proposed method against other multi-class classification methods on real world large-scale problems. The results show that the proposed method outperforms benchmark methods in terms of accuracy for most datasets and performs comparably with other class structure learning methods in terms of efficiency for all datasets.

  1. AdaBoost for Improved Voice-Band Signal Classification

    Institute of Scientific and Technical Information of China (English)


    A good voice-band signal classification can not only enable the safe application of speech coding techniques and the implementation of a Digital Signal Interpolation (DSI) system, but also facilitate network administration and planning by providing accurate voice-band traffic analysis. A new method is proposed to detect and classify the presence of various voice-band signals on the General Switched Telephone Network (GSTN). The method uses a combination of simple base classifiers through the AdaBoost algorithm. The conventional classification features for voice-band data classification are combined and optimized by the AdaBoost algorithm and the spectral subtraction method. Experiments show the simplicity, effectiveness, efficiency and flexibility of the method.

  2. Classification of Pulse Waveforms Using Edit Distance with Real Penalty

    Directory of Open Access Journals (Sweden)

    Zhang Dongyu


    Advances in sensor and signal processing techniques have provided effective tools for quantitative research in traditional Chinese pulse diagnosis (TCPD). Because of the inevitable intraclass variation of pulse patterns, the automatic classification of pulse waveforms has remained a difficult problem. In this paper, by referring to the edit distance with real penalty (ERP) and the recent progress in k-nearest neighbors (KNN) classifiers, we propose two novel ERP-based KNN classifiers. Taking advantage of the metric property of ERP, we first develop an ERP-induced inner product and a Gaussian ERP kernel, then embed them into difference-weighted KNN classifiers, and finally develop two novel classifiers for pulse waveform classification. The experimental results show that the proposed classifiers are effective for accurate classification of pulse waveforms.
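
    The ERP distance at the core of these classifiers is a dynamic program over two sequences in which an unmatched sample is compared against a fixed gap value. A minimal sketch, assuming scalar samples and a gap value of 0 (the usual default, but an assumption here), is given below; it covers only the distance, not the ERP-induced inner product, the Gaussian ERP kernel, or the difference-weighted KNN rule of the paper.

```python
def erp_distance(x, y, gap=0.0):
    """Edit distance with real penalty between two numeric sequences.

    d[i][j] is the cost of aligning the first i samples of x with the
    first j samples of y. A sample may be matched to a sample of the other
    sequence (cost |a - b|) or to a gap (cost |a - gap|).
    """
    n, m = len(x), len(y)
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = d[i - 1][0] + abs(x[i - 1] - gap)
    for j in range(1, m + 1):
        d[0][j] = d[0][j - 1] + abs(y[j - 1] - gap)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d[i][j] = min(
                d[i - 1][j - 1] + abs(x[i - 1] - y[j - 1]),  # match both samples
                d[i - 1][j] + abs(x[i - 1] - gap),           # x sample against a gap
                d[i][j - 1] + abs(y[j - 1] - gap),           # y sample against a gap
            )
    return d[n][m]
```

    Because every gap is charged against the same constant, the distance satisfies the triangle inequality, which is the metric property the abstract relies on.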

  3. Combinatorial Approach of Associative Classification


    P. R. Pal; R.C. Jain


    Association rule mining and classification are two important techniques of data mining in knowledge discovery process. Integration of these two has produced class association rule mining or associative classification techniques, which in many cases have shown better classification accuracy than conventional classifiers. Motivated by this study we have explored and applied the combinatorial mathematics in class association rule mining in this paper. Our algorithm is based on producing co...

  4. The future of general classification

    DEFF Research Database (Denmark)

    Mai, Jens Erik


    Discusses problems related to accessing multiple collections using a single retrieval language. Surveys the concepts of interoperability and switching language. Finds that mapping between more indexing languages always will be an approximation. Surveys the issues related to general classification...... and contrasts that to special classifications. Argues for the use of general classifications to provide access to collections nationally and internationally. © 2003 by The Haworth Press, Inc. All rights reserved....

  5. A Classification Leveraged Object Detector


    Sun, Miao; Han, Tony X.; He, Zhihai


    Currently, the state-of-the-art image classification algorithms outperform the best available object detector by a big margin in terms of average precision. We, therefore, propose a simple yet principled approach that allows us to leverage object detection through image classification on supporting regions specified by a preliminary object detector. Using a simple bag-of-words model based image classification algorithm, we leveraged the performance of the deformable model objector from 35.9%...

  6. A new classification of glaucomas

    Directory of Open Access Journals (Sweden)

    Bordeianu CD


    Constantin-Dan Bordeianu, Private Practice, Ploiesti, Prahova, Romania. Purpose: To suggest a new glaucoma classification that is pathogenic, etiologic, and clinical. Methods: After discussing the logical pathway used in criteria selection, the paper presents the new classification and compares it with the classification currently in use, that is, the one issued by the European Glaucoma Society in 2008. Results: The paper proves that the new classification is clear (being based on a coherent and consistently followed set of criteria), is comprehensive (framing all forms of glaucoma), and helps in understanding the sickness (in that it uses a logical framing system). The great advantage is that it facilitates therapeutic decision making in that it offers direct therapeutic suggestions and avoids errors leading to disasters. Moreover, the scheme remains open to any new development. Conclusion: The suggested classification is a pathogenic, etiologic, and clinical classification that fulfills the conditions of an ideal classification. The suggested classification is the first classification in which the main criterion is consistently used for the first 5 to 7 crossings until its differentiation capabilities are exhausted. Then, secondary criteria (etiologic and clinical) pick up the relay until each form finds its logical place in the scheme. In order to avoid unclear aspects, the genetic criterion is no longer used, being replaced by age, one of the clinical criteria. The suggested classification brings only benefits to all categories of ophthalmologists: the beginners will have a tool to better understand the sickness and to ease their decision making, whereas the experienced doctors will have their practice simplified. For all doctors, errors leading to therapeutic disasters will be less likely to happen.
Finally, researchers will have the object of their work gathered in the group of glaucoma with unknown or uncertain pathogenesis, whereas

  7. Toward Accurate and Quantitative Comparative Metagenomics (United States)

    Nayfach, Stephen; Pollard, Katherine S.


    Shotgun metagenomics and computational analysis are used to compare the taxonomic and functional profiles of microbial communities. Leveraging this approach to understand roles of microbes in human biology and other environments requires quantitative data summaries whose values are comparable across samples and studies. Comparability is currently hampered by the use of abundance statistics that do not estimate a meaningful parameter of the microbial community and biases introduced by experimental protocols and data-cleaning approaches. Addressing these challenges, along with improving study design, data access, metadata standardization, and analysis tools, will enable accurate comparative metagenomics. We envision a future in which microbiome studies are replicable and new metagenomes are easily and rapidly integrated with existing data. Only then can the potential of metagenomics for predictive ecological modeling, well-powered association studies, and effective microbiome medicine be fully realized. PMID:27565341

  8. Apparatus for accurately measuring high temperatures (United States)

    Smith, D.D.

    The present invention is a thermometer used for measuring furnace temperatures in the range of about 1800° to 2700°C. The thermometer comprises a broadband multicolor thermal radiation sensor positioned to be in optical alignment with the end of a blackbody sight tube extending into the furnace. A valve-shutter arrangement is positioned between the radiation sensor and the sight tube, and a chamber for containing a charge of high pressure gas is positioned between the valve-shutter arrangement and the radiation sensor. A momentary opening of the valve-shutter arrangement allows a pulse of the high pressure gas to purge the sight tube of air-borne thermal radiation contaminants, which permits the radiation sensor to accurately measure the thermal radiation emanating from the end of the sight tube.

  9. Accurate renormalization group analyses in neutrino sector

    Energy Technology Data Exchange (ETDEWEB)

    Haba, Naoyuki [Graduate School of Science and Engineering, Shimane University, Matsue 690-8504 (Japan); Kaneta, Kunio [Kavli IPMU (WPI), The University of Tokyo, Kashiwa, Chiba 277-8568 (Japan); Takahashi, Ryo [Graduate School of Science and Engineering, Shimane University, Matsue 690-8504 (Japan); Yamaguchi, Yuya [Department of Physics, Faculty of Science, Hokkaido University, Sapporo 060-0810 (Japan)


    We investigate accurate renormalization group analyses in neutrino sector between ν-oscillation and seesaw energy scales. We consider decoupling effects of top quark and Higgs boson on the renormalization group equations of light neutrino mass matrix. Since the decoupling effects are given in the standard model scale and independent of high energy physics, our method can basically apply to any models beyond the standard model. We find that the decoupling effects of Higgs boson are negligible, while those of top quark are not. Particularly, the decoupling effects of top quark affect neutrino mass eigenvalues, which are important for analyzing predictions such as mass squared differences and neutrinoless double beta decay in an underlying theory existing at high energy scale.


    Directory of Open Access Journals (Sweden)

    Natalia Romanova


    New types of criminal groups are emerging in modern society. These types have their own criminal subculture. The research objective is to develop new parameters for classifying modern criminal groups, create a new typology of criminal groups and identify some features of their subculture. The research methodology is based on the system approach, which includes analysis of documentary sources (materials of a criminal case), conversations with the members of the criminal group, testing of the members of the criminal group and observation. As a result of the conducted research, we have created a new classification of criminal groups. The first type is a group that is lawful in form and criminal in content (i.e., its target is criminal enrichment). The second type is a criminal organization run by so-called "white-collars" who "remain in the shadow". The third type is traditional criminal groups. The fourth type is the criminal group that openly demonstrates its criminal activity.

  11. Supply chain planning classification (United States)

    Hvolby, Hans-Henrik; Trienekens, Jacques; Bonde, Hans


    Industry is experiencing a need to shift focus from internal production planning towards planning in the supply network. In this respect, customer-oriented thinking is becoming almost a common good amongst companies in the supply network. An increase in the use of information technology is needed to enable companies to better tune their production planning with customers and suppliers. Information technology opportunities and supply chain planning systems help companies to monitor and control their supplier network. In spite of these developments, most links in today's supply chains make individual plans, because the real demand information is not available throughout the chain. The current systems and processes of the supply chains are not designed to meet the requirements now placed upon them. For long term relationships with suppliers and customers, an integrated decision-making process is needed in order to obtain a satisfactory result for all parties, especially when customized production and short lead-time are in focus. An effective value chain makes inventory available and visible among the value chain members, minimizes response time and optimizes total inventory value held throughout the chain. In this paper a supply chain planning classification grid is presented, based on current manufacturing classifications and supply chain planning initiatives.

  12. PSC: protein surface classification. (United States)

    Tseng, Yan Yuan; Li, Wen-Hsiung


    We recently proposed to classify proteins by their functional surfaces. Using the structural attributes of functional surfaces, we inferred the pairwise relationships of proteins and constructed an expandable database of protein surface classification (PSC). As the functional surface(s) of a protein is the local region where the protein performs its function, our classification may reflect the functional relationships among proteins. Currently, PSC contains a library of 1974 surface types that include 25,857 functional surfaces identified from 24,170 bound structures. The search tool in PSC empowers users to explore related surfaces that share similar local structures and core functions. Each functional surface is characterized by structural attributes, which are geometric, physicochemical or evolutionary features. The attributes have been normalized as descriptors and integrated to produce a profile for each functional surface in PSC. In addition, binding ligands are recorded for comparisons among homologs. PSC allows users to exploit related binding surfaces to reveal the changes in functionally important residues on homologs that have led to functional divergence during evolution. The substitutions at the key residues of a spatial pattern may determine the functional evolution of a protein. In PSC, a pool of changes in residues on similar functional surfaces is provided.

  13. Holistic facial expression classification (United States)

    Ghent, John; McDonald, J.


    This paper details a procedure for classifying facial expressions. This is a growing and relatively new type of problem within computer vision. One of the fundamental problems when classifying facial expressions in previous approaches is the lack of a consistent method of measuring expression. This paper solves this problem by the computation of the Facial Expression Shape Model (FESM). This statistical model of facial expression is based on an anatomical analysis of facial expression called the Facial Action Coding System (FACS). We use the term Action Unit (AU) to describe a movement of one or more muscles of the face; all expressions can be described using the AUs defined by FACS. The shape model is calculated by marking the face with 122 landmark points. We use Principal Component Analysis (PCA) to analyse how the landmark points move with respect to each other and to lower the dimensionality of the problem. Using the FESM in conjunction with Support Vector Machines (SVM) we classify facial expressions. SVMs are a powerful machine learning technique based on optimisation theory. This project is largely concerned with statistical models, machine learning techniques and psychological tools used in the classification of facial expression. This holistic approach to expression classification provides a means for a level of interaction with a computer that is a significant step forward in human-computer interaction.

  14. Accurate Weather Forecasting for Radio Astronomy (United States)

    Maddalena, Ronald J.


    The NRAO Green Bank Telescope routinely observes at wavelengths from 3 mm to 1 m. As with all mm-wave telescopes, observing conditions depend upon the variable atmospheric water content. The site provides over 100 days/yr when opacities are low enough for good observing at 3 mm, but winds on the open-air structure reduce the time suitable for 3-mm observing where pointing is critical. Thus, to maximize productivity the observing wavelength needs to match weather conditions. For 6 years the telescope has used a dynamic scheduling system (recently upgraded) that requires accurate multi-day forecasts for winds and opacities. Since opacity forecasts are not provided by the National Weather Service (NWS), I have developed an automated system that takes available forecasts, derives forecasted opacities, and deploys the results on the web in user-friendly graphical overviews ( rmaddale/Weather). The system relies on the "North American Mesoscale" models, which are updated by the NWS every 6 hrs, have a 12 km horizontal resolution, 1 hr temporal resolution, run to 84 hrs, and have 60 vertical layers that extend to 20 km. Each forecast consists of a time series of ground conditions, cloud coverage, etc., and, most importantly, temperature, pressure and humidity as a function of height. I use Liebe's MWP model (Radio Science, 20, 1069, 1985) to determine the absorption in each layer for each hour for 30 observing wavelengths. Radiative transfer provides, for each hour and wavelength, the total opacity and the radio brightness of the atmosphere, which contributes substantially at some wavelengths to Tsys and the observational noise. Comparisons of measured and forecasted Tsys at 22.2 and 44 GHz imply that the forecasted opacities are good to about 0.01 Nepers, which is sufficient for forecasting and accurate calibration. Reliability is high out to 2 days and degrades slowly for longer-range forecasts.
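
    The radiative transfer step described above can be illustrated with a simple layered-atmosphere sketch. It assumes a plane-parallel atmosphere viewed at zenith, with each layer described by a hypothetical absorption coefficient, thickness and temperature; the absorption model itself, the airmass dependence and the cosmic background are left out. It is meant only to show how per-layer absorption turns into a total opacity and an atmospheric brightness temperature, not to reproduce the forecasting system.

```python
import math


def opacity_and_brightness(layers):
    """Zenith opacity and atmospheric brightness temperature seen from the ground.

    layers: list of (absorption_per_km, thickness_km, temperature_K) tuples,
    ordered from the ground upwards.
    Returns (total opacity in nepers, brightness temperature in kelvin).
    """
    tau_below = 0.0  # opacity accumulated between the ground and the current layer
    t_atm = 0.0
    for kappa, dz, temp in layers:
        dtau = kappa * dz
        # Emission of this layer, attenuated by everything beneath it.
        t_atm += temp * (1.0 - math.exp(-dtau)) * math.exp(-tau_below)
        tau_below += dtau
    return tau_below, t_atm


# Hypothetical isothermal two-layer atmosphere.
toy_layers = [(0.01, 1.0, 280.0), (0.02, 2.0, 280.0)]
toy_tau, toy_tatm = opacity_and_brightness(toy_layers)
```

    For an isothermal atmosphere at temperature T the sum collapses to T * (1 - exp(-tau)), which gives a quick sanity check on the layer bookkeeping.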

  15. Phylogenetic taxonomy of the family Chlorobiaceae on the basis of 16S rRNA and fmo (Fenna-Matthews-Olson protein) gene sequences. (United States)

    Imhoff, Johannes F


    A new taxonomy of the green sulfur bacteria is proposed, based on phylogenetic relationships determined using the sequences of the independent 16S rRNA and fmo (Fenna-Matthews-Olson protein) genes, and supported by the DNA G + C content and sequence signatures. Comparison of the traditional classification system for these bacteria with their phylogenetic relationship yielded a confusing picture, because properties used for classification (such as cell morphology, photosynthetic pigments and substrate utilization) do not concur with their phylogeny. Using the genetic information available, strains and species assigned to the genera Chlorobium, Pelodictyon and Prosthecochloris are considered, and the following changes are proposed. Pelodictyon luteolum is transferred to the genus Chlorobium as Chlorobium luteolum comb. nov. Pelodictyon clathratiforme and Pelodictyon phaeoclathratiforme are transferred to the genus Chlorobium and combined into one species, Chlorobium clathratiforme comb. nov. The name Pelodictyon will become a synonym of Chlorobium. Strains known as Chlorobium limicola subsp. thiosulfatophilum that have a low DNA G + C content (52-52.5 mol%) are treated as strains of Chlorobium limicola; those with a high DNA G + C content (58.1 mol%) are transferred to Chlorobaculum gen. nov., as Chlorobaculum thiosulfatiphilum sp. nov. Chlorobium tepidum is transferred to Chlorobaculum tepidum comb. nov., and defined as the type species of the genus Chlorobaculum. Strains assigned to Chlorobium phaeobacteroides, but phylogenetically distant from the type strain of this species, are assigned to Chlorobium limicola and to Chlorobaculum limnaeum sp. nov. Strains known as Chlorobium vibrioforme subsp. thiosulfatophilum are transferred to Chlorobaculum parvum sp. nov. Chlorobium chlorovibrioides is transferred to 'Chlorobaculum chlorovibrioides' comb. nov. The type strain of Chlorobium vibrioforme is phylogenetically related to Prosthecochloris, and is therefore

  16. Homotopy Classification of Multiaxial Actions

    CERN Document Server

    Cappell, Sylvain; Yan, Min


    A U(n)-manifold is multiaxial if the isotropy groups are always conjugate to unitary subgroups. The classification and the concordance of such manifolds have been studied by Davis, Hsiang and Morgan under much more strict conditions. We show that in general, without much extra condition, the homotopy classification of multiaxial manifolds can be split into a direct sum of the classification of pairs of adjacent strata, which can be computed by the classical surgery theory. Moreover, we also compute the homotopy classification for the case of the standard representation sphere. We also present the result for the similar multiaxial Sp(n)-manifolds.

  17. Phylogeny and classification of phylum Cercozoa (Protozoa). (United States)

    Cavalier-Smith, Thomas; Chao, Ema E Y


    The protozoan phylum Cercozoa embraces numerous ancestrally biciliate zooflagellates, euglyphid and other filose testate amoebae, chlorarachnean algae, phytomyxean plant parasites (e.g. Plasmodiophora, Phagomyxa), the animal-parasitic Ascetosporea, and Gromia. We report 18S rRNA sequences of 27 culturable zooflagellates, many previously of unknown taxonomic position. Phylogenetic analysis shows that all belong to Cercozoa. We revise cercozoan classification in the light of our analysis and ultrastructure, adopting two subphyla: Filosa subphyl. nov. a clade comprising Monadofilosa and Reticulofilosa, ranked as superclasses, ancestrally having the same very rare base-pair substitution as all opisthokonts; and subphylum Endomyxa emend. comprising classes Phytomyxea (Plasmodiophorida, Phagomyxida), Ascetosporea (Haplosporidia, Paramyxida, Claustrosporida ord. nov.) and Gromiidea cl. nov., which did not. Monadofilosa comprise Sarcomonadea, zooflagellates with a propensity to glide on their posterior cilium and/or generate filopodia (e.g. Metopion; Cercomonas; Heteromitidae - Heteromita, Bodomorpha, Proleptomonas and Allantion) and two new classes: Imbricatea (with silica scales: Euglyphida; Thaumatomonadida, including Alias, Thaumatomastix) and Thecofilosea (Cryomonadida; Tectofilosida ord. nov. - non-scaly filose amoebae, e.g. Pseudodifflugia). Reticulofilosa comprise classes Chlorarachnea, Spongomonadea and Proteomyxidea (e.g. Massisteria, Gymnophrys, a Dimorpha-like protozoan). Cercozoa, now with nine classes and 17 orders (four new), will probably include many, possibly most, other filose and reticulose amoebae and zooflagellates not yet assigned to phyla.


    Institute of Scientific and Technical Information of China (English)

    张晓明; 蒋大真; 等


    By obtaining a feasible filter function, reconstructed images can be obtained with linear interpolation and filtered backprojection techniques. Considering the gray-level and spatial correlation information of each pixel's neighbours, a new supervised classification method is put forward for the reconstructed images. An experiment with a noisy image shows that the method is feasible and accurate compared with ideal phantoms.

  19. Reconstruction-classification method for quantitative photoacoustic tomography

    CERN Document Server

    Malone, Emma; Cox, Ben T; Arridge, Simon R


    We propose a combined reconstruction-classification method for simultaneously recovering absorption and scattering in turbid media from images of absorbed optical energy. This method exploits knowledge that optical parameters are determined by a limited number of classes to iteratively improve their estimate. Numerical experiments show that the proposed approach allows for accurate recovery of absorption and scattering in 2 and 3 dimensions, and delivers superior image quality with respect to traditional reconstruction-only approaches.

  20. The complete chloroplast genome sequence of Ampelopsis: gene organization, comparative analysis and phylogenetic relationships to other angiosperms

    Directory of Open Access Journals (Sweden)

    Gurusamy Raman


    Full Text Available Ampelopsis brevipedunculata is an economically important plant that belongs to the Vitaceae family of angiosperms. The phylogenetic placement of Vitaceae is still unresolved. Recent phylogenetic studies suggested that it should be placed in various alternative families including Caryophyllaceae, Asteraceae, Saxifragaceae, Dilleniaceae, or with the rest of the rosid families. However, these analyses provided weak supportive results because they were based on only one of several genes. Accordingly, complete chloroplast genome sequences are required to resolve the phylogenetic relationships among angiosperms. Recent phylogenetic analyses based on the complete chloroplast genome sequence suggested strong support for the position of Vitaceae as the earliest diverging lineage of rosids and placed it as a sister to the remaining rosids. These studies also revealed relationships among several major lineages of angiosperms; however, they highlighted the significance of taxon sampling for obtaining accurate phylogenies. In the present study, we sequenced the complete chloroplast genome of A. brevipedunculata and used these data to assess the relationships among 32 angiosperms, including 18 taxa of rosids. The Ampelopsis chloroplast genome is 161,090 bp in length, and includes a pair of inverted repeats of 26,394 bp that are separated by small and large single copy regions of 19,036 bp and 89,266 bp, respectively. The gene content and order of Ampelopsis is identical to many other unrearranged angiosperm chloroplast genomes, including Vitis and tobacco. A phylogenetic tree constructed based on 70 protein-coding genes of 33 angiosperms showed that both Saxifragales and Vitaceae diverged from the rosid clade and formed two clades with 100% bootstrap value. The position of the Vitaceae is sister to Saxifragales, and both are the basal and earliest diverging lineages. Moreover, Saxifragales forms a sister clade to Vitaceae of rosids. Overall, the results of
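
    The region sizes quoted in this abstract can be checked against the stated total. The sketch below assumes the usual quadripartite layout of a chloroplast genome (one large single-copy region, one small single-copy region, and two copies of the inverted repeat); the variable and function names are illustrative and do not come from the paper.

```python
# Region lengths in bp, as quoted in the abstract.
LSC = 89_266   # large single-copy region
SSC = 19_036   # small single-copy region
IR = 26_394    # one copy of the inverted repeat


def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total genome length assuming one LSC, one SSC and two IR copies."""
    return lsc + ssc + 2 * ir


total = quadripartite_length(LSC, SSC, IR)
print(total)  # 161090, matching the reported genome length
```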

  1. Approaching system equilibrium with accurate or not accurate feedback information in a two-route system (United States)

    Zhao, Xiao-mei; Xie, Dong-fan; Li, Qi


    With the development of intelligent transport systems, advanced information feedback strategies have been developed to reduce traffic congestion and enhance capacity. However, previous strategies provide accurate information to travelers, and our simulation results show that accurate information brings negative effects, especially in the delayed case. Travelers prefer the route reported to be in the best condition, but delayed information reflects past rather than current traffic conditions. Travelers therefore make wrong routing decisions, which decreases capacity, increases oscillations, and drives the system away from equilibrium. To avoid these negative effects, bounded rationality is taken into account by introducing a boundedly rational threshold BR. When the difference between the two routes is less than BR, the routes are chosen with equal probability. Bounded rationality helps to improve efficiency in terms of capacity, oscillation, and the deviation from the system equilibrium.
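
    The threshold rule described in this abstract can be sketched in a few lines. The abstract does not say which quantity is fed back to travelers or what happens outside the threshold, so the sketch assumes a reported cost per route (lower is better) and assumes that the cheaper route is taken whenever the difference is at least BR. The function name `choose_route` and its parameters are hypothetical.

```python
import random


def choose_route(cost_a: float, cost_b: float, br: float,
                 rng: random.Random) -> str:
    """Pick route 'A' or 'B' from reported costs (lower is better).

    Assumption: if the reported costs differ by less than the threshold
    `br`, the traveler is indifferent and picks either route with
    probability 1/2; otherwise the route with the lower cost is chosen.
    """
    if abs(cost_a - cost_b) < br:
        return rng.choice(["A", "B"])
    return "A" if cost_a < cost_b else "B"


rng = random.Random(0)
# Inside the threshold: the split between routes should be roughly even.
picks = [choose_route(10.0, 10.4, br=1.0, rng=rng) for _ in range(1000)]
print(picks.count("A") / len(picks))
# Outside the threshold: the cheaper route is always chosen.
print(choose_route(10.0, 12.0, br=1.0, rng=rng))  # prints "A"
```

    Setting `br=0` recovers the fully rational traveler who always follows the reported information.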

  2. The probability of a gene tree topology within a phylogenetic network with applications to hybridization detection.

    Directory of Open Access Journals (Sweden)

    Yun Yu

    Full Text Available Gene tree topologies have proven a powerful data source for various tasks, including species tree inference and species delimitation. Consequently, methods for computing probabilities of gene trees within species trees have been developed and widely used in probabilistic inference frameworks. All these methods assume an underlying multispecies coalescent model. However, when reticulate evolutionary events such as hybridization occur, these methods are inadequate, as they do not account for such events. Methods that account for both hybridization and deep coalescence in computing the probability of a gene tree topology currently exist for very limited cases. However, no such methods exist for general cases, owing primarily to the fact that it is currently unknown how to compute the probability of a gene tree topology within the branches of a phylogenetic network. Here we present a novel method for computing the probability of gene tree topologies on phylogenetic networks and demonstrate its application to the inference of hybridization in the presence of incomplete lineage sorting. We reanalyze a Saccharomyces species data set for which multiple analyses had converged on a species tree candidate. Using our method, though, we show that an evolutionary hypothesis involving hybridization in this group has better support than one of strict divergence. A similar reanalysis on a group of three Drosophila species shows that the data is consistent with hybridization. Further, using extensive simulation studies, we demonstrate the power of gene tree topologies at obtaining accurate estimates of branch lengths and hybridization probabilities of a given phylogenetic network. Finally, we discuss identifiability issues with detecting hybridization, particularly in cases that involve extinction or incomplete sampling of taxa.
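
    For orientation, the simplest case of a gene tree probability under the multispecies coalescent, which the abstract takes as its starting point, is a three-taxon species tree with no hybridization. The sketch below is that textbook special case only and is not the network method presented in the paper. With internal branch length t in coalescent units, the gene tree matching the species tree has probability 1 - (2/3)exp(-t), and each of the two discordant topologies has probability (1/3)exp(-t). The function name is illustrative.

```python
import math


def three_taxon_gene_tree_probs(t: float) -> dict:
    """Gene tree topology probabilities for the species tree ((A,B),C).

    `t` is the length of the internal branch in coalescent units.
    """
    # Probability that the A and B lineages fail to coalesce on that branch.
    p_no_coal = math.exp(-t)
    return {
        "((A,B),C)": 1.0 - (2.0 / 3.0) * p_no_coal,  # concordant topology
        "((A,C),B)": p_no_coal / 3.0,                # discordant topology
        "((B,C),A)": p_no_coal / 3.0,                # discordant topology
    }


for t in (0.0, 0.5, 2.0):
    print(t, three_taxon_gene_tree_probs(t))
```

    At t = 0 all three topologies are equally likely, and the concordant topology dominates as t grows.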

  3. Phylogenetic ANOVA: The Expression Variance and Evolution Model for Quantitative Trait Evolution. (United States)

    Rohlfs, Rori V; Nielsen, Rasmus


    A number of methods have been developed for modeling the evolution of a quantitative trait on a phylogeny. These methods have received renewed interest in the context of genome-wide studies of gene expression, in which the expression levels of many genes can be modeled as quantitative traits. We here develop a new method for joint analyses of quantitative traits within- and between species, the Expression Variance and Evolution (EVE) model. The model parameterizes the ratio of population to evolutionary expression variance, facilitating a wide variety of analyses, including a test for lineage-specific shifts in expression level, and a phylogenetic ANOVA that can detect genes with increased or decreased ratios of expression divergence to diversity, analogous to the famous Hudson Kreitman Aguadé (HKA) test used to detect selection at the DNA level. We use simulations to explore the properties of these tests under a variety of circumstances and show that the phylogenetic ANOVA is more accurate than the standard ANOVA (no accounting for phylogeny) sometimes used in transcriptomics. We then apply the EVE model to a mammalian phylogeny of 15 species typed for expression levels in liver tissue. We identify genes with high expression divergence between species as candidates for expression level adaptation, and genes with high expression diversity within species as candidates for expression level conservation and/or plasticity. Using the test for lineage-specific expression shifts, we identify several candidate genes for expression level adaptation on the catarrhine and human lineages, including genes putatively related to dietary changes in humans. We compare these results to those reported previously using a model which ignores expression variance within species, uncovering important differences in performance. We demonstrate the necessity for a phylogenetic model in comparative expression studies and show the utility of the EVE model to detect expression divergence

  4. [An accurate identification method for Chinese materia medica--systematic identification of Chinese materia medica]. (United States)

    Wang, Xue-Yong; Liao, Cai-Li; Liu, Si-Qi; Liu, Chun-Sheng; Shao, Ai-Juan; Huang, Lu-Qi


    This paper puts forward a more accurate method for the identification of Chinese materia medica (CMM), the systematic identification of Chinese materia medica (SICMM), which might solve difficulties in CMM identification that arise with ordinary traditional methods. The concepts, mechanisms and methods of SICMM are systematically introduced, and its feasibility is demonstrated by experiments. The establishment of SICMM will solve problems in the identification of Chinese materia medica not only in phenotypic characters such as morphology, microstructure and chemical constituents, but also in further discovering the evolution and classification of species, subspecies and populations of medicinal plants. The establishment of SICMM will advance the identification of CMM and create a broader space for study.

  5. Automatically high accurate and efficient photomask defects management solution for advanced lithography manufacture (United States)

    Zhu, Jun; Chen, Lijun; Ma, Lantao; Li, Dejian; Jiang, Wei; Pan, Lihong; Shen, Huiting; Jia, Hongmin; Hsiang, Chingyun; Cheng, Guojie; Ling, Li; Chen, Shijie; Wang, Jun; Liao, Wenkui; Zhang, Gary


    Defect review is a time-consuming job, and human error makes results inconsistent. Defects located in don't-care areas, such as defects in dark areas, do not hurt yield and need not be reviewed. However, defects in critical areas, such as defects in clear areas, can impact yield dramatically and need more attention during review. As integrated circuit dimensions decrease, thousands of mask defects or even more are routinely detected during an inspection. Traditional manual or simple classification approaches are unable to meet efficiency and accuracy requirements. This paper focuses on an automatic defect management and classification solution using the image output of Lasertec inspection equipment and Anchor pattern-centric image processing technology. The system can handle a large number of defects with quick and accurate defect classification results. Our experiment includes Die to Die and Single Die modes. The classification accuracy can reach 87.4% and 93.3%. No critical or printable defects are missed in our test cases. The missed-classification rates are 0.25% and 0.24% in Die to Die mode and Single Die mode, respectively. This miss rate is encouraging and acceptable for application on a production line. The result can be output and reloaded back to the inspection machine for further review. This step helps users validate uncertain defects with clear, magnified images when the captured images cannot provide enough information to make a judgment. The system effectively reduces expensive inline defect review time. As a fully inline automated defect management solution, the system could be compatible with the current inspection approach and integrated with optical simulation, and even a scoring function, to guide wafer-level defect inspection.

  6. 15 CFR 2008.9 - Classification guides. (United States)


    ... 15 Commerce and Foreign Trade 3 2010-01-01 2010-01-01 false Classification guides. 2008.9 Section... REPRESENTATIVE Derivative Classification § 2008.9 Classification guides. Classification guides shall be issued by... direct derivative classification, shall identify the information to be protected in specific and...

  7. 32 CFR 2400.15 - Classification guides. (United States)


    ... 32 National Defense 6 2010-07-01 2010-07-01 false Classification guides. 2400.15 Section 2400.15... Derivative Classification § 2400.15 Classification guides. (a) OSTP shall issue and maintain classification guides to facilitate the proper and uniform derivative classification of information. These guides...

  8. 14 CFR 1203.412 - Classification guides. (United States)


    ... 14 Aeronautics and Space 5 2010-01-01 2010-01-01 false Classification guides. 1203.412 Section... PROGRAM Guides for Original Classification § 1203.412 Classification guides. (a) General. A classification guide, based upon classification determinations made by appropriate program and...

  9. 7 CFR 27.34 - Classification procedure. (United States)


    ... 7 Agriculture 2 2010-01-01 2010-01-01 false Classification procedure. 27.34 Section 27.34... REGULATIONS COTTON CLASSIFICATION UNDER COTTON FUTURES LEGISLATION Regulations Classification and Micronaire Determinations § 27.34 Classification procedure. Classification shall proceed as rapidly as possible, but...

  10. 22 CFR 9.6 - Derivative classification. (United States)


    ... CFR 2001.22. (c) Department of State Classification Guide. The Department of State Classification... 22 Foreign Relations 1 2010-04-01 2010-04-01 false Derivative classification. 9.6 Section 9.6... classification. (a) Definition. Derivative classification is the incorporating, paraphrasing, restating...

  11. Uprooting phylogenetic uncertainty in coalescent species delimitation: A meta-analysis of empirical studies

    Institute of Scientific and Technical Information of China (English)



    Phylogenetic and phylogeographic studies rely on the accurate quantification of biodiversity. In recent studies of taxonomically ambiguous groups, species boundaries are often determined based on multi-locus sequence data. Bayesian Phylogenetics and Phylogeography (BPP) is a coalescent-based method frequently used to delimit species; however, empirical studies suggest that the requirement of a user-specified guide tree biases the range of possible outcomes. We evaluate fifteen multi-locus datasets using the most recent iteration of BPP, which eliminates the need for a user-specified guide tree and reconstructs the species tree in synchrony with species delimitation (= unguided species delimitation). We found that the number of species recovered with guided versus unguided species delimitation was the same except for two cases, and that posterior probabilities were generally lower for the unguided analyses as a result of searching across species trees in addition to species delimitation models. The guide trees used in previous studies were often discordant with the species tree topologies estimated by BPP. We also compared species trees estimated using BPP and *BEAST and found that when the topologies are the same, BPP tends to give higher posterior probabilities [Current Zoology 61 (5): 866–873, 2015].

  12. Utility of ITS sequence data for phylogenetic reconstruction of Italian Quercus spp. (United States)

    Bellarosa, Rosanna; Simeone, Marco C; Papini, Alessio; Schirone, Bartolomeo


    Nuclear ribosomal DNA sequences encoding the 5.8S RNA and the flanking internal transcribed spacers (ITS1 and ITS2) were used to test the phylogenetic relationships within 12 Italian Quercus taxa (Fagaceae). Hypotheses of sequence orthology are tested by detailed inspection of some basic features of oak ITS sequences (i.e., general patterns of conserved domains, thermodynamic stability and predicted conformation of the secondary structure of transcripts) that also allowed more accurate sequence alignment. Analysis of ITS variation supported three monophyletic groups, corresponding to subg. Cerris, Schlerophyllodrys (=Ilex group sensu Nixon) and Quercus, as proposed by Schwarz [Feddes Rep., Sonderbeih. D, 1-200]. A derivation of the "Cerris group" from the "Ilex group" is suggested, with Q. cerris sister to the rest of the "Cerris group." Quercus pubescens was found to be sister to the rest of the "Quercus group." The status of hybrispecies of Q. crenata (Q. cerris × Q. suber) and Q. morisii (Q. ilex × Q. suber) was evaluated and discussed. Finally, the phylogenetic position of the Italian species in a broader context of the genus is presented. The utility of the ITS marker to assess the molecular systematics of oaks is therefore confirmed. The importance of Italy as a region with a high degree of diversity at the population and genetic level is discussed.

  13. PhyloBayes MPI: phylogenetic reconstruction with infinite mixtures of profiles in a parallel environment. (United States)

    Lartillot, Nicolas; Rodrigue, Nicolas; Stubbs, Daniel; Richer, Jacques


    Modeling across site variation of the substitution process is increasingly recognized as important for obtaining more accurate phylogenetic reconstructions. Both finite and infinite mixture models have been proposed and have been shown to significantly improve on classical single-matrix models. Compared with their finite counterparts, infinite mixtures have a greater expressivity. However, they are computationally more challenging. This has resulted in practical compromises in the design of infinite mixture models. In particular, a fast but simplified version of a Dirichlet process model over equilibrium frequency profiles implemented in PhyloBayes has often been used in recent phylogenomics studies, while more refined model structures, more realistic and empirically more fit, have been practically out of reach. We introduce a message passing interface version of PhyloBayes, implementing the Dirichlet process mixture models as well as more classical empirical matrices and finite mixtures. The parallelization is made efficient thanks to the combination of two algorithmic strategies: a partial Gibbs sampling update of the tree topology and the use of a truncated stick-breaking representation for the Dirichlet process prior. The implementation shows close to linear gains in computational speed for up to 64 cores, thus allowing faster phylogenetic reconstruction under complex mixture models. PhyloBayes MPI is freely available from our website
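
    The truncated stick-breaking representation mentioned in this abstract can be illustrated generically. The sketch below is not the PhyloBayes MPI implementation; it only shows the standard construction in which each weight takes a Beta(1, alpha) fraction of the stick that remains, with the last component absorbing the remainder so that the weights sum to one. The names `alpha`, `k_max`, and `truncated_stick_breaking` are assumptions made for illustration.

```python
import random


def truncated_stick_breaking(alpha: float, k_max: int,
                             rng: random.Random) -> list:
    """Draw k_max mixture weights from a truncated stick-breaking prior."""
    weights = []
    remaining = 1.0  # length of stick not yet assigned to a component
    for k in range(k_max):
        # The final component takes all that is left, which is what makes
        # the truncation a proper probability vector.
        v = 1.0 if k == k_max - 1 else rng.betavariate(1.0, alpha)
        weights.append(remaining * v)
        remaining *= 1.0 - v
    return weights


rng = random.Random(42)
w = truncated_stick_breaking(alpha=2.0, k_max=10, rng=rng)
print(len(w), sum(w))
```

    Larger values of `alpha` spread the mass over more components, while a fixed `k_max` bounds the number of components that need to be stored.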

  14. A higher level classification of all living organisms.

    Directory of Open Access Journals (Sweden)

    Michael A Ruggiero

    Full Text Available We present a consensus classification of life to embrace the more than 1.6 million species already provided by more than 3,000 taxonomists' expert opinions in a unified and coherent, hierarchically ranked system known as the Catalogue of Life (CoL). The intent of this collaborative effort is to provide a hierarchical classification serving not only the needs of the CoL's database providers but also the diverse public-domain user community, most of whom are familiar with the Linnaean conceptual system of ordering taxon relationships. This classification is neither phylogenetic nor evolutionary but instead represents a consensus view that accommodates taxonomic choices and practical compromises among diverse expert opinions, public usages, and conflicting evidence about the boundaries between taxa and the ranks of major taxa, including kingdoms. Certain key issues, some not fully resolved, are addressed in particular. Beyond its immediate use as a management tool for the CoL and ITIS (Integrated Taxonomic Information System), it is immediately valuable as a reference for taxonomic and biodiversity research, as a tool for societal communication, and as a classificatory "backbone" for biodiversity databases, museum collections, libraries, and textbooks. Such a modern comprehensive hierarchy has not previously existed at this level of specificity.

  15. A higher level classification of all living organisms. (United States)

    Ruggiero, Michael A; Gordon, Dennis P; Orrell, Thomas M; Bailly, Nicolas; Bourgoin, Thierry; Brusca, Richard C; Cavalier-Smith, Thomas; Guiry, Michael D; Kirk, Paul M


    We present a consensus classification of life to embrace the more than 1.6 million species already provided by more than 3,000 taxonomists' expert opinions in a unified and coherent, hierarchically ranked system known as the Catalogue of Life (CoL). The intent of this collaborative effort is to provide a hierarchical classification serving not only the needs of the CoL's database providers but also the diverse public-domain user community, most of whom are familiar with the Linnaean conceptual system of ordering taxon relationships. This classification is neither phylogenetic nor evolutionary but instead represents a consensus view that accommodates taxonomic choices and practical compromises among diverse expert opinions, public usages, and conflicting evidence about the boundaries between taxa and the ranks of major taxa, including kingdoms. Certain key issues, some not fully resolved, are addressed in particular. Beyond its immediate use as a management tool for the CoL and ITIS (Integrated Taxonomic Information System), it is immediately valuable as a reference for taxonomic and biodiversity research, as a tool for societal communication, and as a classificatory "backbone" for biodiversity databases, museum collections, libraries, and textbooks. Such a modern comprehensive hierarchy has not previously existed at this level of specificity.

  16. Transient Detection and Classification

    CERN Document Server

    Becker, Andrew C


    I provide an incomplete inventory of the astronomical variability that will be found by next-generation time-domain astronomical surveys. These phenomena span the distance range from near-Earth satellites to the farthest Gamma Ray Bursts. The surveys that detect these transients will issue alerts to the greater astronomical community; this decision process must be extremely robust to avoid a slew of "false" alerts, and to maintain the community's trust in the surveys. I review the functionality required of both the surveys and the telescope networks that will be following them up, and the role of VOEvents in this process. Finally, I offer some ideas about object and event classification, which will be explored more thoroughly by other articles in these proceedings.

  17. Nonlinear estimation and classification

    CERN Document Server

    Hansen, Mark; Holmes, Christopher; Mallick, Bani; Yu, Bin


    Researchers in many disciplines face the formidable task of analyzing massive amounts of high-dimensional and highly-structured data. This is due in part to recent advances in data collection and computing technologies. As a result, fundamental statistical research is being undertaken in a variety of different fields. Driven by the complexity of these new problems, and fueled by the explosion of available computer power, highly adaptive, non-linear procedures are now essential components of modern "data analysis," a term that we liberally interpret to include speech and pattern recognition, classification, data compression and signal processing. The development of new, flexible methods combines advances from many sources, including approximation theory, numerical analysis, machine learning, signal processing and statistics. The proposed workshop intends to bring together eminent experts from these fields in order to exchange ideas and forge directions for the future.

  18. Spectral Classification Beyond M

    CERN Document Server

    Leggett, S K; Burgasser, A J; Jones, H R A; Marley, M S; Tsuji, T


    Significant populations of field L and T dwarfs are now known, and we anticipate the discovery of even cooler dwarfs by Spitzer and ground-based infrared surveys. However, as the number of known L and T dwarfs increases so does the range in their observational properties, and difficulties have arisen in interpreting the observations. Although modellers have made significant advances, the complexity of the very low temperature, high pressure, photospheres means that problems remain such as the treatment of grain condensation as well as incomplete and non-equilibrium molecular chemistry. Also, there are several parameters which control the observed spectral energy distribution - effective temperature, grain sedimentation efficiency, metallicity and gravity - and their effects are not well understood. In this paper, based on a splinter session, we discuss classification schemes for L and T dwarfs, their dependency on wavelength, and the effects of the parameters T_eff, f_sed, [m/H] and log g on optical and infra...

  19. Estuary Classification Revisited

    CERN Document Server

    Guha, Anirban


    The governing equations of a tidally averaged, width-averaged, rectangular estuary have been investigated. It is shown theoretically that the dynamics of an estuary are entirely controlled by three parameters: (i) the Estuarine Froude number, (ii) the Tidal Froude number and (iii) the Estuarine Aspect ratio. The momentum, salinity and integral salt balance equations can be completely expressed in terms of these control variables. The estuary classification problem has also been reinvestigated. It is found that these three control variables can completely specify the estuary type. Comparison with real estuary data shows a very good match. Additionally, we show that the well-accepted leading-order estuarine integral salt balance equation is inconsistent with the leading-order salinity equation in an order-of-magnitude sense.

  20. Phylogenetic performance of mitochondrial protein-coding genes of Oncomelania hupensis in resolving relationships between landscape populations

    Institute of Scientific and Technical Information of China (English)

    Shi-Zhu LI; Li ZHANG; Lin MA; Wei HU; Shan LV; Qin LIU; Ying-Jun QIAN


    Oncomelania hupensis is the unique intermediate host of Schistosoma japonicum, which plays a key role in the transmission of human blood fluke Schistosoma. The complete mitochondrial (mt) genome of O. hupensis has been characterized; however, the phylogenetic performance of mt protein-coding genes (PCGs) of the snail remains unclear. In this study, 11 whole mt genomes of snails collected from four different ecological settings in China and the Philippines were sequenced. The mt genome sizes ranged from 15 183 to 15 216 bp, with G + C contents from 32.4% to 33.4%. A total of 15 251 characters were generated from the multiple sequence alignment. Of 2711 (17.8%) polymorphic sites, 56.22% (1524) were parsimony sites. The mt genomes' phylogenetic trees were reconstructed using minimum evolution, neighbor joining, maximum likelihood, maximum parsimony, and Bayesian tree estimate methods, and two main distinct clades were identified: (i) the isolate from mountainous regions; (ii) the remaining isolates, which included three inner branches. All phylogenetic trees of the 13 PCGs were generated by running 1000 bootstrap replicates and compared with the complete mtDNA tree; the classification accuracy ranged from 21.23% to 87.87%, and the topological distance of phylogenetic trees between PCGs ranged from 5 to 14. Therefore, the performance of PCGs can be divided into good (COⅠ, ND2, ND5, and ND3), medium (COⅡ, ATP6, ND1, ND6, Cytb, ND4, and COⅢ), and poor (ATP8 and ND4L). This study represents the first analysis of mt genome diversity of the O. hupensis snail and the phylogenetic performance of mt PCGs. It presents clear evidence that the snail populations can be separated into four landscape genetic populations in mainland China based on whole mt genomes. The identification of the phylogenetic performance of PCGs provides new insight into the intensive genetic diversity study using mtDNA markers for the snail.
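
    The two percentages quoted for the alignment follow directly from the site counts. The short check below recomputes them, assuming that 17.8% is taken relative to the 15 251 aligned characters and 56.22% relative to the 2711 polymorphic sites; the variable names are illustrative.

```python
alignment_length = 15_251   # characters in the multiple sequence alignment
polymorphic = 2_711         # polymorphic sites
parsimony = 1_524           # parsimony sites among the polymorphic ones

pct_polymorphic = 100 * polymorphic / alignment_length
pct_parsimony = 100 * parsimony / polymorphic

print(f"{pct_polymorphic:.1f}% of aligned characters are polymorphic")
print(f"{pct_parsimony:.2f}% of polymorphic sites are parsimony sites")
```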