WorldWideScience

Sample records for accurate phylogenetic classification

  1. Accurate phylogenetic classification of DNA fragments based on sequence composition

    McHardy, Alice C.; Garcia Martin, Hector; Tsirigos, Aristotelis; Hugenholtz, Philip; Rigoutsos, Isidore

    2006-05-01

    Metagenome studies have retrieved vast amounts of sequence out of a variety of environments, leading to novel discoveries and great insights into the uncultured microbial world. Except for very simple communities, diversity makes sequence assembly and analysis a very challenging problem. To understand the structure and function of microbial communities, a taxonomic characterization of the obtained sequence fragments is highly desirable, yet currently limited mostly to those sequences that contain phylogenetic marker genes. We show that for clades at the rank of domain down to genus, sequence composition allows the very accurate phylogenetic characterization of genomic sequence. We developed a composition-based classifier, PhyloPythia, for de novo phylogenetic sequence characterization and have trained it on a data set of 340 genomes. By extensive evaluation experiments we show that the method is accurate across all taxonomic ranks considered, even for sequences that originate from novel organisms and are as short as 1 kb. Application to two metagenome datasets obtained from samples of phosphorus-removing sludge showed that the method allows the accurate classification at genus level of most sequence fragments from the dominant populations, while at the same time correctly characterizing even larger parts of the samples at higher taxonomic levels.
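The abstract above does not say which composition features PhyloPythia uses. As a minimal sketch of what "sequence composition" can mean in practice, the block below turns a DNA fragment into a normalized k-mer frequency vector, which a downstream classifier could consume. The function name `kmer_profile`, the default `k = 4`, and the handling of ambiguous bases are assumptions made for illustration, not details taken from the record.

```python
from collections import Counter
from itertools import product


def kmer_profile(sequence: str, k: int = 4) -> list[float]:
    """Return the normalized k-mer frequency vector of a DNA fragment.

    Hypothetical helper: the feature definition is an assumption,
    not PhyloPythia's documented feature set.
    """
    sequence = sequence.upper()
    alphabet = "ACGT"
    # Count every overlapping window of length k made only of A, C, G, T;
    # windows containing ambiguous bases (e.g. N) are skipped.
    counts = Counter(
        sequence[i:i + k]
        for i in range(len(sequence) - k + 1)
        if set(sequence[i:i + k]) <= set(alphabet)
    )
    total = sum(counts.values())
    # Fixed lexicographic ordering so vectors from different fragments align.
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[kmer] / total for kmer in kmers]


# Toy fragment; real inputs would be fragments of roughly 1 kb or longer.
profile = kmer_profile("ACGTACGTAC", k=2)
```

With `k = 2` the vector has 16 entries that sum to 1; with `k = 4` it has 256. Longer k-mers carry more signal but need longer fragments to be estimated reliably.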

  2. Concepts of Classification and Taxonomy. Phylogenetic Classification

    Fraix-Burnet, Didier

    2016-01-01

    Phylogenetic approaches to classification have been heavily developed in biology by bioinformaticians. But these techniques have applications in other fields, in particular in linguistics. Their main characteristic is to search for relationships between the objects or species under study, instead of grouping them by similarity. They are thus rather well suited for any kind of evolutionary objects. For nearly fifteen years, astrocladistics has explored the use of Maximum Parsimony (or cladistics) for astronomical objects like galaxies or globular clusters. In this lesson we will learn how it works. 1 Why phylogenetic tools in astrophysics? 1.1 History of classification The need for classifying living organisms is very ancient, and the first classification system can be dated back to the Greeks. The goal was very practical since it was intended to distinguish between edible and toxic foods, or harmless and dangerous animals. Simple resemblance was used and has been used for centuries. Basically, until the XVIIIth...

  3. Concepts of Classification and Taxonomy Phylogenetic Classification

    Fraix-Burnet, D.

    2016-05-01

    Phylogenetic approaches to classification have been heavily developed in biology by bioinformaticians. But these techniques have applications in other fields, in particular in linguistics. Their main characteristic is to search for relationships between the objects or species under study, instead of grouping them by similarity. They are thus rather well suited for any kind of evolutionary objects. For nearly fifteen years, astrocladistics has explored the use of Maximum Parsimony (or cladistics) for astronomical objects like galaxies or globular clusters. In this lesson we will learn how it works.

  4. Classification and Phylogenetics of Myxozoa

    Fiala, Ivan; Bartošová-Sojková, Pavla; Whipps, C. M.

    Cham: Springer International Publishing, 2015 - (Okamura, B.; Gruhl, A.; Bartholomew, J.), pp. 85-110 ISBN 978-3-319-14752-9 Institutional support: RVO:60077344 Keywords : Taxonomy * Classification * Myxosporea * Actinosporea * Spore * Phylogeny Subject RIV: EG - Zoology

  5. ACCURATE TIME SERIES CLASSIFICATION USING SHAPELETS

    M. Arathi; A. GOVARDHAN

    2014-01-01

    Time series data are sequences of values measured over time. One of the most recent approaches to classification of time series data is to find shapelets within a data set. Time series shapelets are time series subsequences which represent a class. In order to compare two time series sequences, existing work uses the Euclidean distance measure. The problem with Euclidean distance is that it requires data to be standardized if scales ...
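To make the role of Euclidean distance and standardization concrete, here is a minimal sketch of the distance between a series and a candidate shapelet, taken as the smallest Euclidean distance between the shapelet and any same-length window of the series. Using z-normalization as the standardization step is an assumption on our part; the abstract is truncated before it says what the authors actually do. The names `z_normalize` and `shapelet_distance` are hypothetical.

```python
import math


def z_normalize(values):
    """Rescale a window to zero mean and unit variance (assumed standardization)."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    if std == 0:
        # A constant window carries no shape information.
        return [0.0] * len(values)
    return [(v - mean) / std for v in values]


def shapelet_distance(series, shapelet):
    """Smallest Euclidean distance between the shapelet and any window of the series."""
    m = len(shapelet)
    target = z_normalize(shapelet)
    best = math.inf
    # Slide a window of the shapelet's length over the whole series.
    for start in range(len(series) - m + 1):
        window = z_normalize(series[start:start + m])
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(window, target)))
        best = min(best, dist)
    return best


# The rising ramp 1, 2, 3 matches the shapelet 10, 20, 30 once scale is removed.
d = shapelet_distance([0, 0, 1, 2, 3, 0, 0], [10, 20, 30])
```

Without the normalization step the same pair would be far apart in raw Euclidean distance, which is the scale problem the abstract points to.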

  6. Accurate Reconstruction of Insertion-Deletion Histories by Statistical Phylogenetics

    Westesson, O; Lunter, G.; Paten, B; Holmes, I

    2012-01-01

    The Multiple Sequence Alignment (MSA) is a computational abstraction that represents a partial summary either of indel history, or of structural similarity. Taking the former view (indel history), it is possible to use formal automata theory to generalize the phylogenetic likelihood framework for finite substitution models (Dayhoff's probability matrices and Felsenstein's pruning algorithm) to arbitrary-length sequences. In this paper, we report results of a simulation-based benchmark of seve...

  7. Accurate molecular classification of cancer using simple rules

    Gotoh Osamu; Wang Xiaosheng

    2009-01-01

    Background: One intractable problem with using microarray data analysis for cancer classification is how to reduce the extremely high-dimensionality gene feature data to remove the effects of noise. Feature selection is often used to address this problem by selecting informative genes from among thousands or tens of thousands of genes. However, most of the existing methods of microarray-based cancer classification utilize too many genes to achieve accurate classification, which often ...

  8. Accurate reconstruction of insertion-deletion histories by statistical phylogenetics.

    Oscar Westesson

    The Multiple Sequence Alignment (MSA) is a computational abstraction that represents a partial summary either of indel history, or of structural similarity. Taking the former view (indel history), it is possible to use formal automata theory to generalize the phylogenetic likelihood framework for finite substitution models (Dayhoff's probability matrices and Felsenstein's pruning algorithm) to arbitrary-length sequences. In this paper, we report results of a simulation-based benchmark of several methods for reconstruction of indel history. The methods tested include a relatively new algorithm for statistical marginalization of MSAs that sums over a stochastically-sampled ensemble of the most probable evolutionary histories. For mammalian evolutionary parameters on several different trees, the single most likely history sampled by our algorithm appears less biased than histories reconstructed by other MSA methods. The algorithm can also be used for alignment-free inference, where the MSA is explicitly summed out of the analysis. As an illustration of our method, we discuss reconstruction of the evolutionary histories of human protein-coding genes.

  9. Phylogenetics, classification, and biogeography of the treefrogs (Amphibia: Anura: Arboranae).

    Duellman, William E; Marion, Angela B; Hedges, S Blair

    2016-01-01

    A phylogenetic analysis of sequences from 503 species of hylid frogs and four outgroup taxa resulted in 16,128 aligned sites of 19 genes. The molecular data were subjected to a maximum likelihood analysis that resulted in a new phylogenetic tree of treefrogs. A conservative new classification based on the tree has (1) three families composing an unranked taxon, Arboranae, (2) nine subfamilies (five resurrected, one new), and (3) six resurrected generic names and five new generic names. Using the results of a maximum likelihood timetree, times of divergence were determined. For the most part these times of divergence correlated well with historical geologic events. The arboranan frogs originated in South America in the Late Mesozoic or Early Cenozoic. The family Pelodryadidae diverged from its South American relative, Phyllomedusidae, in the Eocene and invaded Australia via Antarctica. There were two dispersals from South America to North America in the Paleogene. One lineage was the ancestral stock of Acris and its relatives, whereas the other lineage, subfamily Hylinae, differentiated into a myriad of genera in Middle America. PMID:27394762

  10. Accurate molecular classification of cancer using simple rules

    Gotoh Osamu

    2009-10-01

    Background: One intractable problem with using microarray data analysis for cancer classification is how to reduce the extremely high-dimensionality gene feature data to remove the effects of noise. Feature selection is often used to address this problem by selecting informative genes from among thousands or tens of thousands of genes. However, most of the existing methods of microarray-based cancer classification utilize too many genes to achieve accurate classification, which often hampers the interpretability of the models. For a better understanding of the classification results, it is desirable to develop simpler rule-based models with as few marker genes as possible. Methods: We screened a small number of informative single genes and gene pairs on the basis of their depended degrees proposed in rough sets. Applying the decision rules induced by the selected genes or gene pairs, we constructed cancer classifiers. We tested the efficacy of the classifiers by leave-one-out cross-validation (LOOCV) of training sets and classification of independent test sets. Results: We applied our methods to five cancerous gene expression datasets: leukemia (acute lymphoblastic leukemia [ALL] vs. acute myeloid leukemia [AML]), lung cancer, prostate cancer, breast cancer, and leukemia (ALL vs. mixed-lineage leukemia [MLL] vs. AML). Accurate classification outcomes were obtained by utilizing just one or two genes. Some genes that correlated closely with the pathogenesis of relevant cancers were identified. In terms of both classification performance and algorithm simplicity, our approach outperformed or at least matched existing methods. Conclusion: In cancerous gene expression datasets, a small number of genes, even one or two if selected correctly, is capable of achieving an ideal cancer classification effect. This finding also means that very simple rules may perform well for cancerous class prediction.
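The evaluation idea in this abstract, a one-gene decision rule scored by leave-one-out cross-validation, can be sketched as follows. Note the substitution: the authors select genes by rough-set depended degrees, whereas this sketch simply fits a single expression threshold on one gene. The function names, the threshold search, and the expression values are all hypothetical and serve only to illustrate LOOCV.

```python
def fit_threshold_rule(values, labels):
    """Fit a one-gene rule 'expression > t => class A, else class B' on training data."""
    ordered = sorted(set(values))
    # Candidate thresholds are midpoints between consecutive distinct values.
    candidates = [(a + b) / 2 for a, b in zip(ordered, ordered[1:])] or [ordered[0]]
    classes = sorted(set(labels))
    best = None
    for t in candidates:
        # Try both orientations of the rule.
        for high, low in ((classes[0], classes[-1]), (classes[-1], classes[0])):
            correct = sum((high if v > t else low) == y for v, y in zip(values, labels))
            if best is None or correct > best[0]:
                best = (correct, t, high, low)
    _, t, high, low = best
    return lambda v: high if v > t else low


def loocv_accuracy(values, labels):
    """Leave-one-out cross-validation: refit on n-1 samples, test on the held-out one."""
    hits = 0
    for i in range(len(values)):
        rule = fit_threshold_rule(values[:i] + values[i + 1:], labels[:i] + labels[i + 1:])
        hits += rule(values[i]) == labels[i]
    return hits / len(values)


# Made-up expression levels of one gene in six samples (not data from the paper).
expression = [1.0, 1.2, 0.9, 3.1, 2.8, 3.3]
classes = ["ALL", "ALL", "ALL", "AML", "AML", "AML"]
accuracy = loocv_accuracy(expression, classes)
```

Because the rule is refit for every held-out sample, the reported accuracy never uses a sample both to choose the threshold and to test it.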

  11. Automatic classification and accurate size measurement of blank mask defects

    Bhamidipati, Samir; Paninjath, Sankaranarayanan; Pereira, Mark; Buck, Peter

    2015-07-01

    complexity of defects encountered. The variety arises due to factors such as defect nature, size, shape and composition; and the optical phenomena occurring around the defect. This paper focuses on preliminary characterization results, in terms of classification and size estimation, obtained by Calibre MDPAutoClassify tool on a variety of mask blank defects. It primarily highlights the challenges faced in achieving the results with reference to the variety of defects observed on blank mask substrates and the underlying complexities which make accurate defect size measurement an important and challenging task.

  12. Accurate mobile malware detection and classification in the cloud.

    Wang, Xiaolei; Yang, Yuexiang; Zeng, Yingzhi

    2015-01-01

    As the dominant smartphone operating system, Android has attracted the attention of malware authors and researchers alike. The number of types of Android malware is increasing rapidly regardless of the considerable number of proposed malware analysis systems. In this paper, by taking advantage of the low false-positive rate of misuse detection and the ability of anomaly detection to detect zero-day malware, we propose a novel hybrid detection system based on a new open-source framework, CuckooDroid, which enables the use of Cuckoo Sandbox's features to analyze Android malware through dynamic and static analysis. Our proposed system mainly consists of two parts: an anomaly detection engine that detects abnormal apps through dynamic analysis, and a signature detection engine that detects and classifies known malware with a combination of static and dynamic analysis. We evaluate our system using 5560 malware samples and 6000 benign samples. Experiments show that our anomaly detection engine with dynamic analysis is capable of detecting zero-day malware with a low false negative rate (1.16%) and an acceptable false positive rate (1.30%); it is worth noting that our signature detection engine with hybrid analysis can accurately classify malware samples with an average positive rate of 98.94%. Considering the intensive computing resources required by the static and dynamic analysis, our proposed detection system should be deployed off-device, such as in the Cloud. The app store markets and ordinary users can access our detection system for malware detection through a cloud service. PMID:26543718

  13. Phylogenetics.

    Sleator, Roy D

    2011-04-01

    The recent rapid expansion in the DNA and protein databases, arising from large-scale genomic and metagenomic sequence projects, has forced significant development in the field of phylogenetics: the study of the evolutionary relatedness of the planet's inhabitants. Advances in phylogenetic analysis have greatly transformed our view of the landscape of evolutionary biology, transcending the view of the tree of life that has shaped evolutionary theory since Darwinian times. Indeed, modern phylogenetic analysis no longer focuses on the restricted Darwinian-Mendelian model of vertical gene transfer, but must also consider the significant degree of lateral gene transfer, which connects and shapes almost all living things. Herein, I review the major tree-building methods, their strengths, weaknesses and future prospects. PMID:21249334

  14. Phylogeny and phylogenetic classification of the antbirds, ovenbirds, woodcreepers, and allies (Aves: Passeriformes: Infraorder Furnariides)

    Moyle, R.G.; Chesser, R.T.; Brumfield, R.T.; Tello, J.G.; Marchese, D.J.; Cracraft, J.

    2009-01-01

    The infraorder Furnariides is a diverse group of suboscine passerine birds comprising a substantial component of the Neotropical avifauna. The included species encompass a broad array of morphologies and behaviours, making them appealing for evolutionary studies, but the size of the group (ca. 600 species) has limited well-sampled higher-level phylogenetic studies. Using DNA sequence data from the nuclear RAG-1 and RAG-2 exons, we undertook a phylogenetic analysis of the Furnariides sampling 124 (more than 88%) of the genera. Basal relationships among family-level taxa differed depending on phylogenetic method, but all topologies had little nodal support, mirroring the results from earlier studies in which discerning relationships at the base of the radiation was also difficult. In contrast, branch support for family-rank taxa and for many relationships within those clades was generally high. Our results support the Melanopareidae and Grallariidae as distinct from the Rhinocryptidae and Formicariidae, respectively. Within the Furnariides our data contradict some recent phylogenetic hypotheses and suggest that further study is needed to resolve these discrepancies. Of the few genera represented by multiple species, several were not monophyletic, indicating that additional systematic work remains within furnariine families and must include dense taxon sampling. We use this study as a basis for proposing a new phylogenetic classification for the group and in the process erect new family-group names for clades having high branch support across methods. © 2009 The Willi Hennig Society.

  15. Molecular phylogenetic perspectives for character classification and convergence: Framing some issues with nematode vulval appendages and telotylenchid tail termini

    Characters flagged as convergent based on newer molecular phylogenetic trees inform both practical identification and more esoteric classification. Nematode morphological characters such as lateral lines, bullae and laciniae are quite independent structures from those similarly named in other organi...

  16. A bootstrap based analysis pipeline for efficient classification of phylogenetically related animal miRNAs

    Gu Xun

    2007-03-01

    Background: Phylogenetically related miRNAs (miRNA families) convey important information of the function and evolution of miRNAs. Due to the special sequence features of miRNAs, pair-wise sequence identity between miRNA precursors alone is often inadequate for unequivocally judging the phylogenetic relationships between miRNAs. Most of the current methods for miRNA classification rely heavily on manual inspection and lack measurements of the reliability of the results. Results: In this study, we designed an analysis pipeline (the Phylogeny-Bootstrap-Cluster (PBC) pipeline) to identify miRNA families based on branch stability in the bootstrap trees derived from overlapping genome-wide miRNA sequence sets. We tested the PBC analysis pipeline with the miRNAs from six animal species, H. sapiens, M. musculus, G. gallus, D. rerio, D. melanogaster, and C. elegans. The resulting classification was compared with the miRNA families defined in miRBase. The two classifications were largely consistent. Conclusion: The PBC analysis pipeline is an efficient method for classifying large numbers of heterogeneous miRNA sequences. It requires minimum human involvement and provides measurements of the reliability of the classification results.

  17. Correlation between the Chemotaxonomic Classifications of the essential oils of 48 Eucalyptus species harvested from Tunisia and their Phylogenetic Classification

    Elaissi Ameur

    2014-03-01

    Various chemical classes (monoterpenes hydrocarbons, oxygenated monoterpenes, sesquiterpenes hydrocarbons, oxygenated sesquiterpenes, esters, ketones, non classified compounds and non identified compounds) and twenty five of the main components from the essential oils of 48 Tunisian Eucalyptus species have been reported. The compounds include 1,8-cineole, torquatone, p-cymene, spathulenol, trans-pinocarveol, α-pinene, borneol, cryptone, 4-methyl-2-pentyl acetate, globulol, isoamyl isovalerate, α-terpineol, (E,E)-farnesol, viridiflorol, aromadendrene, terpinen-4-ol, β-eudesmol, α-eudesmol, limonene, D-piperitone, caryophyllene oxide, β-phellandrene, bicyclogermacrene, α-phellandrene and benzaldehyde, as principal components when analysed by GC-MS. The comparison of this classification to the phylogenetic classification showed a divergence for the majority of the species; however, some concordance was found.

  18. INDUS - a composition-based approach for rapid and accurate taxonomic classification of metagenomic sequences

    Mohammed, Monzoorul Haque; Ghosh, Tarini Shankar; Reddy, Rachamalla Maheedhar; Reddy, Chennareddy Venkata Siva Kumar; Singh, Nitin Kumar; Sharmila S Mande

    2011-01-01

    Background: Taxonomic classification of metagenomic sequences is the first step in metagenomic analysis. Existing taxonomic classification approaches are of two types, similarity-based and composition-based. Similarity-based approaches, though accurate and specific, are extremely slow. Since metagenomic projects generate millions of sequences, adopting similarity-based approaches becomes virtually infeasible for research groups having modest computational resources. In this study, we present ...

  19. Molecular phylogenetic evaluation of classification and scenarios of character evolution in calcareous sponges (Porifera, Class Calcarea).

    Oliver Voigt

    Calcareous sponges (Phylum Porifera, Class Calcarea) are known to be taxonomically difficult. Previous molecular studies have revealed many discrepancies between classically recognized taxa and the observed relationships at the order, family and genus levels; these inconsistencies question underlying hypotheses regarding the evolution of certain morphological characters. Therefore, we extended the available taxa and character set by sequencing the complete small subunit (SSU) rDNA and the almost complete large subunit (LSU) rDNA of additional key species and complemented this dataset by substantially increasing the length of available LSU sequences. Phylogenetic analyses provided new hypotheses about the relationships of Calcarea and about the evolution of certain morphological characters. We tested our phylogeny against competing phylogenetic hypotheses presented by previous classification systems. Our data reject the current order-level classification by again finding non-monophyletic Leucosolenida, Clathrinida and Murrayonida. In the subclass Calcinea, we recovered a clade that includes all species with a cortex, which is largely consistent with the previously proposed order Leucettida. Other orders that had been rejected in the current system were not found, but could not be rejected in our tests either. We found several additional families and genera polyphyletic: the families Leucascidae and Leucaltidae and the genus Leucetta in Calcinea, and in Calcaronea the family Amphoriscidae and the genus Ute. Our phylogeny also provided support for the vaguely suspected close relationship of several members of Grantiidae with giantortical diactines to members of Heteropiidae. Similarly, our analyses revealed several unexpected affinities, such as a sister group relationship between Leucettusa (Leucaltidae) and Leucettidae and between Leucascandra (Jenkinidae) and Sycon carteri (Sycettidae). According to our results, the taxonomy of Calcarea is in

  20. Phylogenetic systematics and a revised generic classification of anthidiine bees (Hymenoptera: Megachilidae).

    Litman, Jessica R; Griswold, Terry; Danforth, Bryan N

    2016-07-01

    The bee tribe Anthidiini (Hymenoptera: Megachilidae) is a large, cosmopolitan group of solitary bees that exhibit intriguing nesting behavior. We present the first molecular-based phylogenetic analysis of relationships within Anthidiini using model-based methods and a large, multi-locus dataset (five nuclear genes, 5081 base pairs), as well as a combined analysis using our molecular dataset in conjunction with a previously published morphological matrix. We discuss the evolution of nesting behavior in Anthidiini and the relationship between nesting material and female mandibular morphology. Following an examination of the morphological characters historically used to recognize anthidiine genera, we recommend the use of a molecular-based phylogenetic backbone to define taxonomic groups prior to the assignment of diagnostic morphological characters for these groups. Finally, our results reveal the paraphyly of numerous genera and have significant consequences for anthidiine classification. In order to promote a classification system based on stable, monophyletic clades, we hereby make the following changes to Michener's (2007) classification: The subgenera Afranthidium (Zosteranthidium) Michener and Griswold, 1994, Afranthidium (Branthidium) Pasteels, 1969 and Afranthidium (Immanthidium) Pasteels, 1969 are moved into the genus Pseudoanthidium, thus forming the new combinations Pseudoanthidium (Zosteranthidium), Pseudoanthidium (Branthidium), and Pseudoanthidium (Immanthidium). The genus Neanthidium Pasteels, 1969 is also moved into the genus Pseudoanthidium, thus forming the new combination Pseudoanthidium (Neanthidium). Based on morphological characters shared with our new definition of the genus Pseudoanthidium, the subgenus Afranthidium (Mesanthidiellum) Pasteels, 1969 and the genus Gnathanthidium Pasteels, 1969 are also moved into the genus Pseudoanthidium, thus forming the new combinations Pseudoanthidium (Mesanthidiellum) and Pseudoanthidium (Gnathanthidium

  1. Molecular phylogenetic evaluation of classification and scenarios of character evolution in calcareous sponges (Porifera, Class Calcarea).

    Voigt, Oliver; Wülfing, Eilika; Wörheide, Gert

    2012-01-01

    Calcareous sponges (Phylum Porifera, Class Calcarea) are known to be taxonomically difficult. Previous molecular studies have revealed many discrepancies between classically recognized taxa and the observed relationships at the order, family and genus levels; these inconsistencies question underlying hypotheses regarding the evolution of certain morphological characters. Therefore, we extended the available taxa and character set by sequencing the complete small subunit (SSU) rDNA and the almost complete large subunit (LSU) rDNA of additional key species and complemented this dataset by substantially increasing the length of available LSU sequences. Phylogenetic analyses provided new hypotheses about the relationships of Calcarea and about the evolution of certain morphological characters. We tested our phylogeny against competing phylogenetic hypotheses presented by previous classification systems. Our data reject the current order-level classification by again finding non-monophyletic Leucosolenida, Clathrinida and Murrayonida. In the subclass Calcinea, we recovered a clade that includes all species with a cortex, which is largely consistent with the previously proposed order Leucettida. Other orders that had been rejected in the current system were not found, but could not be rejected in our tests either. We found several additional families and genera polyphyletic: the families Leucascidae and Leucaltidae and the genus Leucetta in Calcinea, and in Calcaronea the family Amphoriscidae and the genus Ute. Our phylogeny also provided support for the vaguely suspected close relationship of several members of Grantiidae with giantortical diactines to members of Heteropiidae. Similarly, our analyses revealed several unexpected affinities, such as a sister group relationship between Leucettusa (Leucaltidae) and Leucettidae and between Leucascandra (Jenkinidae) and Sycon carteri (Sycettidae). According to our results, the taxonomy of Calcarea is in desperate need of a

  2. Rapid phylogenetic and functional classification of short genomic fragments with signature peptides

    Berendzen Joel

    2012-08-01

    Background: Classification is difficult for shotgun metagenomics data from environments such as soils, where the diversity of sequences is high and where reference sequences from close relatives may not exist. Approaches based on sequence-similarity scores must deal with the confounding effects that inheritance and functional pressures exert on the relation between scores and phylogenetic distance, while approaches based on sequence alignment and tree-building are typically limited to a small fraction of gene families. We describe an approach based on finding one or more exact matches between a read and a precomputed set of peptide 10-mers. Results: At even the largest phylogenetic distances, thousands of 10-mer peptide exact matches can be found between pairs of bacterial genomes. Genes that share one or more peptide 10-mers typically have high reciprocal BLAST scores. Among a set of 403 representative bacterial genomes, some 20 million 10-mer peptides were found to be shared. We assign each of these peptides as a signature of a particular node in a phylogenetic reference tree based on the RNA polymerase genes. We classify the phylogeny of a genomic fragment (e.g., read) at the most specific node on the reference tree that is consistent with the phylogeny of observed signature peptides it contains. Using both synthetic data from four newly-sequenced soil-bacterium genomes and ten real soil metagenomics data sets, we demonstrate a sensitivity and specificity comparable to that of the MEGAN metagenomics analysis package using BLASTX against the NR database. Phylogenetic and functional similarity metrics applied to real metagenomics data indicate a signal-to-noise ratio of approximately 400 for distinguishing among environments. Our method assigns ~6.6 Gbp/hr on a single CPU, compared with 25 kbp/hr for methods based on BLASTX against the NR database. Conclusions: Classification by exact matching against a precomputed list of signature
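The classification step described in this abstract can be sketched roughly as follows, under several assumptions. The read is assumed to be already translated into a peptide string (the translation of DNA reads is omitted). The signature table is a plain dictionary from 10-mer to a lineage written as a tuple from the root downward. Conflicting hits are resolved by trimming to the deepest node shared by all of them, which is one plausible reading of "most specific node that is consistent", not necessarily the authors' exact rule. The peptides and taxon assignments below are invented for illustration.

```python
def peptide_kmers(peptide, k=10):
    """All overlapping k-mers of a peptide string."""
    return {peptide[i:i + k] for i in range(len(peptide) - k + 1)}


def classify(peptide, signatures, k=10):
    """Return the deepest lineage consistent with every signature hit, or None."""
    hits = [signatures[kmer] for kmer in peptide_kmers(peptide, k) if kmer in signatures]
    if not hits:
        return None
    # Start from the most specific hit and trim it wherever another hit disagrees.
    best = max(hits, key=len)
    for lineage in hits:
        shared = 0
        for a, b in zip(best, lineage):
            if a != b:
                break
            shared += 1
        if shared < len(lineage):
            # This hit leaves the path of `best`: keep only the common ancestor.
            best = best[:shared]
    return best


# Hypothetical signature table (exact-match lookup, no alignment involved).
signatures = {
    "MKTAYIAKQR": ("Bacteria", "Proteobacteria"),
    "KTAYIAKQRQ": ("Bacteria", "Proteobacteria", "Gammaproteobacteria"),
    "GSHMLEDPVA": ("Bacteria", "Firmicutes"),
}

consistent = classify("MKTAYIAKQRQ", signatures)
conflicting = classify("MKTAYIAKQRQGSHMLEDPVA", signatures)
```

Since every lookup is a hash-table probe, the cost per read grows with read length and not with the size of the reference set, which fits the throughput contrast with BLASTX-based methods reported in the abstract.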

  3. Accurate crop classification using hierarchical genetic fuzzy rule-based systems

    Topaloglou, Charalampos A.; Mylonas, Stelios K.; Stavrakoudis, Dimitris G.; Mastorocostas, Paris A.; Theocharis, John B.

    2014-10-01

    This paper investigates the effectiveness of an advanced classification system for accurate crop classification using very high resolution (VHR) satellite imagery. Specifically, a recently proposed genetic fuzzy rule-based classification system (GFRBCS) is employed, namely, the Hierarchical Rule-based Linguistic Classifier (HiRLiC). HiRLiC's model comprises a small set of simple IF-THEN fuzzy rules, easily interpretable by humans. One of its most important attributes is that its learning algorithm requires minimum user interaction, since the most important learning parameters affecting the classification accuracy are determined by the learning algorithm automatically. HiRLiC is applied in a challenging crop classification task, using a SPOT5 satellite image over an intensively cultivated area in a lake-wetland ecosystem in northern Greece. A rich set of higher-order spectral and textural features is derived from the initial bands of the (pan-sharpened) image, resulting in an input space comprising 119 features. The experimental analysis proves that HiRLiC compares favorably to other interpretable classifiers of the literature, both in terms of structural complexity and classification accuracy. Its testing accuracy was very close to that obtained by complex state-of-the-art classification systems, such as the support vector machines (SVM) and random forest (RF) classifiers. Nevertheless, visual inspection of the derived classification maps shows that HiRLiC is characterized by higher generalization properties, providing more homogeneous classifications than the competitors. Moreover, the runtime requirements for producing the thematic map were orders of magnitude lower than those of the competitors.

  4. HMM-FRAME: accurate protein domain classification for metagenomic sequences containing frameshift errors

    Sun Yanni

    2011-05-01

    Background: Protein domain classification is an important step in metagenomic annotation. The state-of-the-art method for protein domain classification is profile HMM-based alignment. However, the relatively high rates of insertions and deletions in homopolymer regions of pyrosequencing reads create frameshifts, causing conventional profile HMM alignment tools to generate alignments with marginal scores. This makes error-containing gene fragments unclassifiable with conventional tools. Thus, there is a need for an accurate domain classification tool that can detect and correct sequencing errors. Results: We introduce HMM-FRAME, a protein domain classification tool based on an augmented Viterbi algorithm that can incorporate error models from different sequencing platforms. HMM-FRAME corrects sequencing errors and classifies putative gene fragments into domain families. It achieved high error detection sensitivity and specificity in a data set with annotated errors. We applied HMM-FRAME in Targeted Metagenomics and a published metagenomic data set. The results showed that our tool can correct frameshifts in error-containing sequences, generate much longer alignments with significantly smaller E-values, and classify more sequences into their native families. Conclusions: HMM-FRAME provides a complementary protein domain classification tool to conventional profile HMM-based methods for data sets containing frameshifts. Its current implementation is best used for small-scale metagenomic data sets. The source code of HMM-FRAME can be downloaded at http://www.cse.msu.edu/~zhangy72/hmmframe/ and at https://sourceforge.net/projects/hmm-frame/.

  5. The challenge of producing an accurate statewide land cover classification of digital satellite data

    A general land use/land cover data set for South Carolina produced from 1989/1990 SPOT multispectral data is presented. This data set incorporates eight categories: urban/built-up, agricultural/grass, scrub/shrub, forest, water, forested wetland, nonforested wetland, and barren. A statewide inventory of these land use/land cover 'associations' is prepared using integrated pcERDAS and prARC/INFO software by the South Carolina Land Resources Commission with unsupervised classification and reclassification routines, and subsequent air photo verification. Land cover data are produced by county and evaluated for reliability (88-percent average classification accuracy). Multiple applications are served by accurate and timely county land cover inventories for resource management and economic development at state and local government levels, specifically for purposes of land use planning and site location analysis. 6 refs

  6. Towards a formal genealogical classification of the Lezgian languages (North Caucasus): testing various phylogenetic methods on lexical data.

    Kassian, Alexei

    2015-01-01

    A lexicostatistical classification is proposed for 20 languages and dialects of the Lezgian group of the North Caucasian family, based on meticulously compiled 110-item wordlists, published as part of the Global Lexicostatistical Database project. The lexical data have been subsequently analyzed with the aid of the principal phylogenetic methods, both distance-based and character-based: Starling neighbor joining (StarlingNJ), Neighbor joining (NJ), Unweighted pair group method with arithmetic mean (UPGMA), Bayesian Markov chain Monte Carlo (MCMC), Unweighted maximum parsimony (UMP). Cognation indexes within the input matrix were marked by two different algorithms: traditional etymological approach and phonetic similarity, i.e., the automatic method of consonant classes (Levenshtein distances). Due to certain reasons (first of all, high lexicographic quality of the wordlists and a consensus about the Lezgian phylogeny among Caucasologists), the Lezgian database is a perfect testing area for appraisal of phylogenetic methods. For the etymology-based input matrix, all the phylogenetic methods, with the possible exception of UMP, have yielded trees that are sufficiently compatible with each other to generate a consensus phylogenetic tree of the Lezgian lects. The obtained consensus tree agrees with the traditional expert classification as well as some of the previously proposed formal classifications of this linguistic group. Contrary to theoretical expectations, the UMP method has suggested the least plausible tree of all. In the case of the phonetic similarity-based input matrix, the distance-based methods (StarlingNJ, NJ, UPGMA) have produced the trees that are rather close to the consensus etymology-based tree and the traditional expert classification, whereas the character-based methods (Bayesian MCMC, UMP) have yielded less likely topologies. PMID:25719456
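
    Of the distance-based methods named in this abstract, UPGMA is the simplest to illustrate. The following is a minimal sketch of generic UPGMA clustering, not the authors' pipeline; the lect labels "A", "B", "C" and the distance values are hypothetical placeholders, and branch lengths are omitted for brevity.

```python
from itertools import combinations


def upgma(labels, matrix):
    """Build a UPGMA tree topology (nested tuples) from a symmetric distance matrix.

    labels: list of taxon names (hypothetical placeholders in the example).
    matrix: matrix[i][j] is the distance between labels[i] and labels[j].
    """
    # Each cluster id maps to (subtree, number of leaves in the subtree).
    clusters = {i: (labels[i], 1) for i in range(len(labels))}
    # Pairwise distances keyed by (smaller_id, larger_id).
    dist = {(i, j): matrix[i][j] for i, j in combinations(range(len(labels)), 2)}
    next_id = len(labels)
    while len(clusters) > 1:
        # Merge the closest pair of clusters.
        a, b = min(dist, key=dist.get)
        tree_a, size_a = clusters.pop(a)
        tree_b, size_b = clusters.pop(b)
        # Distance to the merged cluster is the size-weighted average.
        new_dist = {}
        for c in clusters:
            d_ac = dist[(min(a, c), max(a, c))]
            d_bc = dist[(min(b, c), max(b, c))]
            new_dist[(c, next_id)] = (size_a * d_ac + size_b * d_bc) / (size_a + size_b)
        # Drop every distance that involved the two merged clusters.
        dist = {k: v for k, v in dist.items() if a not in k and b not in k}
        dist.update(new_dist)
        clusters[next_id] = ((tree_a, tree_b), size_a + size_b)
        next_id += 1
    (tree, _), = clusters.values()
    return tree


# Hypothetical toy distances: A and B are close, C is distant from both.
toy_tree = upgma(["A", "B", "C"], [[0, 2, 6], [2, 0, 6], [6, 6, 0]])
```

    In practice the distance matrix would be derived from the wordlists (for example, the share of non-cognate items between two lects); that step is outside this sketch.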

  7. DNA barcode analysis: a comparison of phylogenetic and statistical classification methods

    Leblois Raphael

    2009-11-01

    Abstract Background DNA barcoding aims to assign individuals to given species according to their sequence at a small locus, generally part of the CO1 mitochondrial gene. Amongst other issues, this raises the question of how to deal with within-species genetic variability and potential transpecific polymorphism. In this context, we examine several assignation methods belonging to two main categories: (i) phylogenetic methods (neighbour-joining and PhyML) that attempt to account for the genealogical framework of DNA evolution and (ii) supervised classification methods (k-nearest neighbour, CART, random forest and kernel methods). These methods range from basic to elaborate. We investigated the ability of each method to correctly classify query sequences drawn from samples of related species using both simulated and real data. Simulated data sets were generated using coalescent simulations in which we varied the genealogical history, mutation parameter, sample size and number of species. Results No method was found to be the best in all cases. The simplest method of all, "one nearest neighbour", was found to be the most reliable with respect to changes in the parameters of the data sets. The parameter most influencing the performance of the various methods was molecular diversity of the data. Addition of genetically independent loci - nuclear genes - improved the predictive performance of most methods. Conclusion The study implies that taxonomists can influence the quality of their analyses either by choosing a method best-adapted to the configuration of their sample, or, given a certain method, increasing the sample size or altering the amount of molecular diversity. This can be achieved either by sequencing more mtDNA or by sequencing additional nuclear genes. In the latter case, they may also have to modify their data analysis method.
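
    The "one nearest neighbour" rule singled out in this abstract can be sketched in a few lines. This is an illustrative sketch only: it assumes pre-aligned, equal-length sequences and uses the Hamming distance, which may differ from the distance used in the study; the species labels and sequences are made up.

```python
def hamming(a: str, b: str) -> int:
    """Count mismatching positions between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def one_nearest_neighbour(query, references):
    """Return the species label of the reference sequence closest to the query.

    references: list of (label, sequence) pairs.
    Ties are broken in favour of the earliest reference in the list.
    """
    best_label, _ = min(references, key=lambda ref: hamming(query, ref[1]))
    return best_label


# Hypothetical reference barcodes for two species.
references = [
    ("species_A", "ACGTACGTAC"),
    ("species_A", "ACGTACGTAT"),
    ("species_B", "TCGAACGGAC"),
    ("species_B", "TCGAACGGAT"),
]
assigned = one_nearest_neighbour("ACGTACGAAC", references)
```

    The query differs from the first species_A reference at one position and from the closest species_B reference at three, so it is assigned to species_A.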

  8. A classification of the Chloridoideae (Poaceae) based on multi-gene phylogenetic trees.

    Peterson, Paul M; Romaschenko, Konstantin; Johnson, Gabriel

    2010-05-01

    We conducted a molecular phylogenetic study of the subfamily Chloridoideae using six plastid DNA sequences (ndhA intron, ndhF, rps16-trnK, rps16 intron, rps3, and rpl32-trnL) and a single nuclear ITS DNA sequence. Our large original data set includes 246 species (17.3%) representing 95 genera (66%) of the grasses currently placed in the Chloridoideae. The maximum likelihood and Bayesian analysis of DNA sequences provides strong support for the monophyly of the Chloridoideae; followed by, in order of divergence: a Triraphideae clade with Neyraudia sister to Triraphis; an Eragrostideae clade with the Cotteinae (includes Cottea and Enneapogon) sister to the Uniolinae (includes Entoplocamia, Tetrachne, and Uniola), and a terminal Eragrostidinae clade of Ectrosia, Harpachne, and Psammagrostis embedded in a polyphyletic Eragrostis; a Zoysieae clade with Urochondra sister to a Zoysiinae (Zoysia) clade, and a terminal Sporobolinae clade that includes Spartina, Calamovilfa, Pogoneura, and Crypsis embedded in a polyphyletic Sporobolus; and a very large terminal Cynodonteae clade that includes 13 monophyletic subtribes. The Cynodonteae includes, in alphabetical order: Aeluropodinae (Aeluropus); Boutelouinae (Bouteloua); Eleusininae (includes Apochiton, Astrebla with Schoenefeldia embedded, Austrochloris, Brachyachne, Chloris, Cynodon with Brachyachne embedded in part, Eleusine, Enteropogon with Eustachys embedded in part, Eustachys, Chrysochloa, Coelachyrum, Leptochloa with Dinebra embedded, Lepturus, Lintonia, Microchloa, Saugetia, Schoenefeldia, Sclerodactylon, Tetrapogon, and Trichloris); Hilariinae (Hilaria); Monanthochloinae (includes Distichlis, Monanthochloe, and Reederochloa); Muhlenbergiinae (Muhlenbergia with Aegopogon, Bealia, Blepharoneuron, Chaboissaea, Lycurus, Pereilema, Redfieldia, Schaffnerella, and Schedonnardus all embedded); Orcuttiinae (includes Orcuttia and Tuctoria); Pappophorinae (includes Neesiochloa and Pappophorum); Scleropogoninae (includes

  9. GPD: a graph pattern diffusion kernel for accurate graph classification with applications in cheminformatics.

    Smalter, Aaron; Huan, Jun Luke; Jia, Yi; Lushington, Gerald

    2010-01-01

    Graph data mining is an active research area. Graphs are general modeling tools to organize information from heterogeneous sources and have been applied in many scientific, engineering, and business fields. With the fast accumulation of graph data, building highly accurate predictive models for graph data emerges as a new challenge that has not been fully explored in the data mining community. In this paper, we demonstrate a novel technique called graph pattern diffusion (GPD) kernel. Our idea is to leverage existing frequent pattern discovery methods and to explore the application of kernel classifier (e.g., support vector machine) in building highly accurate graph classification. In our method, we first identify all frequent patterns from a graph database. We then map subgraphs to graphs in the graph database and use a process we call "pattern diffusion" to label nodes in the graphs. Finally, we designed a graph alignment algorithm to compute the inner product of two graphs. We have tested our algorithm using a number of chemical structure data. The experimental results demonstrate that our method is significantly better than competing methods such as those kernel functions based on paths, cycles, and subgraphs. PMID:20431140

  10. DNA barcode analysis: a comparison of phylogenetic and statistical classification methods.

    Leblois Raphael; Olteanu Madalina; Bleakley Kevin; Schaeffer Brigitte; David Olivier; Austerlitz Frederic; Veuille Michel; Laredo Catherine

    2009-01-01

    Abstract Background DNA barcoding aims to assign individuals to given species according to their sequence at a small locus, generally part of the CO1 mitochondrial gene. Amongst other issues, this raises the question of how to deal with within-species genetic variability and potential transpecific polymorphism. In this context, we examine several assignation methods belonging to two main categories: (i) phylogenetic methods (neighbour-joining and PhyML) that attempt to account for the genealo...

  11. Addition of wsp sequences to the Wolbachia phylogenetic tree and stability of the classification.

    Pintureau, B; Chaudier, S; Lassablière, F; Charles, H; Grenier, S

    2000-10-01

    Wolbachia are symbiotic bacteria altering reproductive characters of numerous arthropods. Their most recent phylogeny and classification are based on sequences of the wsp gene. We sequenced wsp gene from six Wolbachia strains infecting six Trichogramma species that live as egg parasitoids on many insects. This allows us to test the effect of the addition of sequences on the Wolbachia phylogeny and to check the classification of Wolbachia infecting Trichogramma. The six Wolbachia studied are classified in the B supergroup. They confirm the monophyletic structure of the B Wolbachia in Trichogramma but introduce small differences in the Wolbachia classification. Modifications include the definition of a new group, Sem, for Wolbachia of T. semblidis and the merging of the two closely related groups, Sib and Kay. Specific primers were determined and tested for the Sem group. PMID:11040288

  12. Towards a phylogenetic classification of reef corals: The Indo-Pacific genera Merulina, Goniastrea and Scapophyllia (Scleractinia, Merulinidae)

    Huang, Danwei

    2014-06-03

    Recent advances in scleractinian systematics and taxonomy have been achieved through the integration of molecular and morphological data, as well as rigorous analysis using phylogenetic methods. In this study, we continue in our pursuit of a phylogenetic classification by examining the evolutionary relationships between the closely related reef coral genera Merulina, Goniastrea, Paraclavarina and Scapophyllia (Merulinidae). In particular, we address the extreme polyphyly of Favites and Goniastrea that was discovered a decade ago. We sampled 145 specimens belonging to 16 species from a wide geographic range in the Indo-Pacific, focusing especially on type localities, including the Red Sea, western Indian Ocean and central Pacific. Tree reconstructions based on both nuclear and mitochondrial markers reveal a novel lineage composed of three species previously placed in Favites and Goniastrea. Morphological analyses indicate that this clade, Paragoniastrea Huang, Benzoni & Budd, gen. n., has a unique combination of corallite and subcorallite features observable with scanning electron microscopy and thin sections. Molecular and morphological evidence furthermore indicates that the monotypic genus Paraclavarina is nested within Merulina, and the former is therefore synonymised. © 2014 Royal Swedish Academy of Sciences.

  13. Chemical classification of cattle. 2. Phylogenetic tree and specific status of the Zebu.

    Manwell, C; Baker, C M

    1980-01-01

    Phylogenetic trees for the ten major breed groups of cattle were constructed by Farris's (1972) maximum parsimony method, or Fitch & Margoliash's (1967) method, which averages out the deviation over the entire assemblage. Both techniques yield essentially identical trees. The phylogenetic tree for the ten major cattle breed groups can be superimposed on a map of Europe and western Asia, the root of the tree being close to the 'fertile crescent' in Asia Minor, believed to be a primary centre of bovine domestication. For some but not all protein variants there is a cline of gene frequencies as one proceeds from the British Isles and northwest Europe towards southeast Europe and Asia Minor, with the most extreme gene frequencies in the Zebu breeds of India. It is not clear to what extent the observed clines are primary or secondary, i.e., consequent to the initial migrations of cattle towards the end of the Pleistocene or consequent to the many migrations of man with his domesticated cattle. Such clines as exist are not in themselves sufficient to prove either selection versus genetic drift or to establish taxonomic ranking. Contrary to some suggestions in the literature, the biochemical evidence supports Linnaeus's original conclusions: Bos taurus and Bos indicus are distinct species. PMID:7458002

  14. Archaeal-eubacterial mergers in the origin of Eukarya: phylogenetic classification of life.

    Margulis, L

    1996-01-01

    A symbiosis-based phylogeny leads to a consistent, useful classification system for all life. "Kingdoms" and "Domains" are replaced by biological names for the most inclusive taxa: Prokarya (bacteria) and Eukarya (symbiosis-derived nucleated organisms). The earliest Eukarya, anaerobic mastigotes, hypothetically originated from permanent whole-cell fusion between members of Archaea (e.g., Thermoplasma-like organisms) and of Eubacteria (e.g., Spirochaeta-like organisms). Molecular biology, life...

  15. Nucleotide sequence and phylogenetic classification of candidate human papilloma virus type 92

    From a basal cell carcinoma (BCC) the complete genome of candidate human papillomavirus (HPV) type 92 was characterized. Phylogenetically, the candidate HPV 92 was relatively distantly related to other cutaneous HPV types within the B1 group. By quantitative real time PCR, 94 viral copies were present per cell in the BCC and another BCC contained 1 viral copy per cell. Lower copy numbers were found in two solar keratoses (1 copy per 33 cells and 1 copy per 60 cells) and two squamous cell carcinomas (1 copy per 436 cells and 1 copy per 1143 cells). The high viral load of HPV 92 in two BCCs differs from the low amount of HPV DNA reported from nonmelanoma skin cancers

  16. A Highly Accurate Classification of TM Data through Correction of Atmospheric Effects

    Bill Smith; Frank Scarpace; Widad Elmahboub

    2009-01-01

    Atmospheric correction impacts on the accuracy of satellite image-based land cover classification are a growing concern among scientists. In this study, the principal objective was to enhance classification accuracy by minimizing contamination effects from aerosol scattering in Landsat TM images due to the variation in solar zenith angle corresponding to cloud-free earth targets. We have derived a mathematical model for aerosols to compute and subtract the aerosol scattering noise per pixel o...

  17. From learning taxonomies to phylogenetic learning: Integration of 16S rRNA gene data into FAME-based bacterial classification

    2010-01-01

    Background Machine learning techniques have been shown to improve bacterial species classification based on fatty acid methyl ester (FAME) data. Nonetheless, FAME analysis has a limited resolution for discrimination of bacteria at the species level. In this paper, we approach the species classification problem from a taxonomic point of view. Such a taxonomy or tree is typically obtained by applying clustering algorithms on FAME data or on 16S rRNA gene data. The knowledge gained from the tree can then be used to evaluate FAME-based classifiers, resulting in a novel framework for bacterial species classification. Results In view of learning in a taxonomic framework, we consider two types of trees. First, a FAME tree is constructed with a supervised divisive clustering algorithm. Subsequently, based on 16S rRNA gene sequence analysis, phylogenetic trees are inferred by the NJ and UPGMA methods. In this second approach, the species classification problem is based on the combination of two different types of data. Herein, 16S rRNA gene sequence data is used for phylogenetic tree inference and the corresponding binary tree splits are learned based on FAME data. We call this learning approach 'phylogenetic learning'. Supervised Random Forest models are developed to train the classification tasks in a stratified cross-validation setting. In this way, better classification results are obtained for species that are typically hard to distinguish by a single or flat multi-class classification model. Conclusions FAME-based bacterial species classification is successfully evaluated in a taxonomic framework. Although the proposed approach does not improve the overall accuracy compared to flat multi-class classification, it has some distinct advantages. First, it has better capabilities for distinguishing species on which flat multi-class classification fails. Secondly, the hierarchical classification structure makes it easy to evaluate and visualize the resolution of FAME data for

  18. From learning taxonomies to phylogenetic learning: Integration of 16S rRNA gene data into FAME-based bacterial classification

    Dawyndt Peter

    2010-01-01

    Abstract Background Machine learning techniques have been shown to improve bacterial species classification based on fatty acid methyl ester (FAME) data. Nonetheless, FAME analysis has a limited resolution for discrimination of bacteria at the species level. In this paper, we approach the species classification problem from a taxonomic point of view. Such a taxonomy or tree is typically obtained by applying clustering algorithms on FAME data or on 16S rRNA gene data. The knowledge gained from the tree can then be used to evaluate FAME-based classifiers, resulting in a novel framework for bacterial species classification. Results In view of learning in a taxonomic framework, we consider two types of trees. First, a FAME tree is constructed with a supervised divisive clustering algorithm. Subsequently, based on 16S rRNA gene sequence analysis, phylogenetic trees are inferred by the NJ and UPGMA methods. In this second approach, the species classification problem is based on the combination of two different types of data. Herein, 16S rRNA gene sequence data is used for phylogenetic tree inference and the corresponding binary tree splits are learned based on FAME data. We call this learning approach 'phylogenetic learning'. Supervised Random Forest models are developed to train the classification tasks in a stratified cross-validation setting. In this way, better classification results are obtained for species that are typically hard to distinguish by a single or flat multi-class classification model. Conclusions FAME-based bacterial species classification is successfully evaluated in a taxonomic framework. Although the proposed approach does not improve the overall accuracy compared to flat multi-class classification, it has some distinct advantages. First, it has better capabilities for distinguishing species on which flat multi-class classification fails. Secondly, the hierarchical classification structure makes it easy to evaluate and visualize the

  19. Archaeal-eubacterial mergers in the origin of Eukarya: phylogenetic classification of life

    Margulis, L.

    1996-01-01

    A symbiosis-based phylogeny leads to a consistent, useful classification system for all life. "Kingdoms" and "Domains" are replaced by biological names for the most inclusive taxa: Prokarya (bacteria) and Eukarya (symbiosis-derived nucleated organisms). The earliest Eukarya, anaerobic mastigotes, hypothetically originated from permanent whole-cell fusion between members of Archaea (e.g., Thermoplasma-like organisms) and of Eubacteria (e.g., Spirochaeta-like organisms). Molecular biology, life-history, and fossil record evidence support the reunification of bacteria as Prokarya while subdividing Eukarya into uniquely defined subtaxa: Protoctista, Animalia, Fungi, and Plantae.

  20. Assignment of Calibration Information to Deeper Phylogenetic Nodes is More Effective in Obtaining Precise and Accurate Divergence Time Estimates.

    Mello, Beatriz; Schrago, Carlos G

    2014-01-01

    Divergence time estimation has become an essential tool for understanding macroevolutionary events. Molecular dating aims to obtain reliable inferences, which, within a statistical framework, means jointly increasing the accuracy and precision of estimates. Bayesian dating methods exhibit the property of a linear relationship between uncertainty and estimated divergence dates. This relationship occurs even if the number of sites approaches infinity and places a limit on the maximum precision of node ages. However, how the placement of calibration information may affect the precision of divergence time estimates remains an open question. In this study, relying on simulated and empirical data, we investigated how the location of calibration within a phylogeny affects the accuracy and precision of time estimates. We found that calibration priors set at median and deep phylogenetic nodes were associated with higher precision values compared to analyses involving calibration at the shallowest node. The results were independent of the tree symmetry. An empirical mammalian dataset produced results that were consistent with those generated by the simulated sequences. Assigning time information to the deeper nodes of a tree is crucial to guarantee the accuracy and precision of divergence times. This finding highlights the importance of the appropriate choice of outgroups in molecular dating. PMID:24855333

  1. A Highly Accurate Classification of TM Data through Correction of Atmospheric Effects

    Bill Smith

    2009-07-01

    Atmospheric correction impacts on the accuracy of satellite image-based land cover classification are a growing concern among scientists. In this study, the principal objective was to enhance classification accuracy by minimizing contamination effects from aerosol scattering in Landsat TM images due to the variation in solar zenith angle corresponding to cloud-free earth targets. We have derived a mathematical model for aerosols to compute and subtract the aerosol scattering noise per pixel of different vegetation classes from TM images of Nicolet in north-eastern Wisconsin. An algorithm in C++ has been developed with iterations to simulate, model, and correct for the solar zenith angle influences on scattering. Results from a supervised classification with corrected TM images showed increased class accuracy for land cover types over uncorrected images. The overall accuracy of the supervised classification was improved substantially (between 13% and 18%). The z-score shows a significant difference between the corrected data and the raw data (between 4.0 and 12.0). Therefore, the atmospheric correction was essential for enhancing the image classification.

  2. Molecular Phylogenetic Classification of Streptomycetes Isolated from the Rhizosphere of Tropical Legume (Paraserianthes falcataria (L.) Nielsen)

    LANGKAH SEMBIRING

    2009-09-01

    Intrageneric diversity of 556 streptomycetes isolated from the rhizosphere of a tropical legume was determined using a molecular taxonomic method based on 16S rDNA. A total of 46 isolates were taken to represent 37 colour groups of the isolates. The 16S rDNA was amplified and subsequently sequenced, and the sequence data were aligned with streptomycete sequences retrieved from the Ribosomal Database Project (RDP) data. Phylogenetic trees were generated using the PHYLIP software package, and the matrices of nucleotide similarity and nucleotide difference were generated using PHYDIT software. The results confirmed and extended the value of 16S rDNA sequencing in streptomycete systematics. The 16S rDNA sequence data showed that most of the tested colour group representatives formed new centers of taxonomic variation within the genus Streptomyces. The generic assignment of these organisms was underpinned by 16S rDNA sequence data, which also suggested that most of the strains represented new centers of taxonomic variation. The taxonomic data indicate that diverse populations of streptomycetes are associated with the roots of the tropical legume (P. falcataria). Therefore, the combination of selective isolation and molecular taxonomic procedures used in this study provides a powerful way of uncovering new centers of taxonomic variation within the genus Streptomyces.

  3. Deceptive desmas: molecular phylogenetics suggests a new classification and uncovers convergent evolution of lithistid demosponges.

    Astrid Schuster

    Reconciling the fossil record with molecular phylogenies to enhance the understanding of animal evolution is a challenging task, especially for taxa with a mostly poor fossil record, such as sponges (Porifera). 'Lithistida', a polyphyletic group of recent and fossil sponges, are an exception as they provide the richest fossil record among demosponges. Lithistids, currently encompassing 13 families, 41 genera and >300 recent species, are defined by the common possession of peculiar siliceous spicules (desmas) that characteristically form rigid articulated skeletons. Their phylogenetic relationships are to a large extent unresolved and there has been no (taxonomically) comprehensive analysis to formally reallocate lithistid taxa to their closest relatives. This study, based on the most comprehensive molecular and morphological investigation of 'lithistid' demosponges to date, corroborates some previous weakly-supported hypotheses, and provides novel insights into the evolutionary relationships of the previous 'order Lithistida'. Based on molecular data (partial mtDNA CO1 and 28S rDNA sequences), we show that 8 out of 13 'Lithistida' families belong to the order Astrophorida, whereas Scleritodermidae and Siphonidiidae form a separate monophyletic clade within Tetractinellida. Most lithistid astrophorids are dispersed between different clades of the Astrophorida and we propose to formally reallocate them accordingly. Corallistidae, Theonellidae and Phymatellidae are monophyletic, whereas the families Pleromidae and Scleritodermidae are polyphyletic. Family Desmanthidae is polyphyletic and groups within Halichondriidae; we formally propose a reallocation. The sister group relationship of the family Vetulinidae to Spongillida is confirmed and we propose here for the first time to include Vetulina into a new Order Sphaerocladina. Megascleres and microscleres possibly evolved and/or were lost several times independently in different 'lithistid' taxa, and

  4. Phylogenetic analysis and classification of the Brassica rapa SET-domain protein family

    Huang Yong

    2011-12-01

    Abstract Background The SET (Su(var)3-9, Enhancer-of-zeste, Trithorax) domain is an evolutionarily conserved sequence of approximately 130-150 amino acids, and constitutes the catalytic site of lysine methyltransferases (KMTs). KMTs perform many crucial biological functions via histone methylation of chromatin. Histone methylation marks are interpreted differently depending on the histone type (i.e. H3 or H4), the lysine position (e.g. H3K4, H3K9, H3K27, H3K36 or H4K20) and the number of added methyl groups (i.e. me1, me2 or me3). For example, H3K4me3 and H3K36me3 are associated with transcriptional activation, but H3K9me2 and H3K27me3 are associated with gene silencing. The substrate specificity and activity of KMTs are determined by sequences within the SET domain and other regions of the protein. Results Here we identified 49 SET-domain proteins from the recently sequenced Brassica rapa genome. We performed sequence similarity and protein domain organization analysis of these proteins, along with the SET-domain proteins from the dicot Arabidopsis thaliana, the monocots Oryza sativa and Brachypodium distachyon, and the green alga Ostreococcus tauri. We showed that plant SET-domain proteins can be grouped into 6 distinct classes, namely KMT1, KMT2, KMT3, KMT6, KMT7 and S-ET. Apart from the S-ET class, which has an interrupted SET domain and may be involved in methylation of nonhistone proteins, the other classes have characteristics of histone methyltransferases exhibiting different substrate specificities: KMT1 for H3K9, KMT2 for H3K4, KMT3 for H3K36, KMT6 for H3K27 and KMT7 also for H3K4. We also propose a coherent and rational nomenclature for plant SET-domain proteins. Comparisons of sequence similarity and synteny of B. rapa and A. thaliana SET-domain proteins revealed recent gene duplication events for some KMTs. Conclusion This study provides the first characterization of the SET-domain KMT proteins of B. rapa. Phylogenetic analysis data

  5. GPD: A Graph Pattern Diffusion Kernel for Accurate Graph Classification with Applications in Cheminformatics

    Smalter, Aaron; Huan, Jun; Jia, Yi; Lushington, Gerald

    2010-01-01

    Graph data mining is an active research area. Graphs are general modeling tools to organize information from heterogeneous sources and have been applied in many scientific, engineering, and business fields. With the fast accumulation of graph data, building highly accurate predictive models for graph data emerges as a new challenge that has not been fully explored in the data mining community. In this paper, we demonstrate a novel technique called graph pattern diffusion (GPD) kernel. Our ide...

  6. Two fast and accurate heuristic RBF learning rules for data classification.

    Rouhani, Modjtaba; Javan, Dawood S

    2016-03-01

    This paper presents new Radial Basis Function (RBF) learning methods for classification problems. The proposed methods use heuristics to determine the spreads, the centers and the number of hidden neurons of the network in such a way that higher efficiency is achieved with fewer neurons, while the learning algorithm remains fast and simple. To keep the network size limited, neurons are added to the network recursively until a termination condition is met. Each neuron covers some of the training data. The termination condition is to cover all training data or to reach the maximum number of neurons. In each step, the center and spread of the new neuron are selected based on maximization of its coverage. Maximizing the coverage of the neurons leads to a network with fewer neurons and hence lower VC dimension and better generalization. Using the power exponential distribution function as the activation function of hidden neurons, and in the light of the new learning approaches, it is proved that all data become linearly separable in the space of hidden layer outputs, which implies that there exist linear output layer weights with zero training error. The proposed methods are applied to some well-known datasets, and the simulation results, compared with SVM and some other leading RBF learning methods, show their satisfactory and comparable performance. PMID:26797472
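
    The coverage-driven loop described in this abstract can be illustrated with a simple greedy sketch. The abstract does not give the exact heuristics, so the choices below are assumptions made for illustration only: candidate centers are training points, the spread is taken as the distance to the nearest point of another class, and a neuron "covers" same-class points strictly inside that radius. The function name greedy_cover is hypothetical.

```python
import math


def greedy_cover(points, labels, max_neurons):
    """Greedily pick (center, radius, label) neurons that cover the most points.

    Stops when every training point is covered or max_neurons is reached,
    mirroring the termination condition stated in the abstract.
    """
    n = len(points)
    uncovered = set(range(n))
    neurons = []
    while uncovered and len(neurons) < max_neurons:
        best = None  # (coverage count, center index, radius, covered indices)
        for i in sorted(uncovered):
            # Assumed spread rule: distance to the nearest other-class point.
            others = [math.dist(points[i], points[j])
                      for j in range(n) if labels[j] != labels[i]]
            radius = min(others) if others else math.inf
            covered = {j for j in uncovered
                       if labels[j] == labels[i]
                       and math.dist(points[i], points[j]) < radius}
            if best is None or len(covered) > best[0]:
                best = (len(covered), i, radius, covered)
        _, i, radius, covered = best
        neurons.append((points[i], radius, labels[i]))
        uncovered -= covered
    return neurons


# Hypothetical toy data: two well-separated classes in the plane.
toy_points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
toy_labels = [0, 0, 0, 1, 1, 1]
toy_neurons = greedy_cover(toy_points, toy_labels, max_neurons=10)
```

    On the toy data one neuron per class suffices; output-layer weights and the power exponential activation mentioned in the abstract are not modelled here.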

  7. Protein clustering and RNA phylogenetic reconstruction of the influenza A [corrected] virus NS1 protein allow an update in classification and identification of motif conservation.

    Edgar E Sevilla-Reyes

    The non-structural protein 1 (NS1) of influenza A virus (IAV), coded by its third most diverse gene, interacts with multiple molecules within infected cells. NS1 is involved in host immune response regulation and is a potential contributor to the virus host range. Early phylogenetic analyses using 50 sequences led to the classification of NS1 gene variants into groups (alleles) A and B. We reanalyzed NS1 diversity using 14,716 complete NS IAV sequences, downloaded from public databases, without host bias. Removal of sequence redundancy and further structured clustering at 96.8% amino acid similarity produced 415 clusters that enhanced our capability to detect distinct subgroups and lineages, which were assigned a numerical nomenclature. Maximum likelihood phylogenetic reconstruction using RNA sequences indicated the previously identified deep branching separating group A from group B, with five distinct subgroups within A as well as two and five lineages within the A4 and A5 subgroups, respectively. Our classification model proposes that sequence patterns in thirteen amino acid positions are sufficient to fit >99.9% of all currently available NS1 sequences into the A subgroups/lineages or the B group. This classification reduces host and virus bias through the prioritization of NS1 RNA phylogenetics over host or virus phenetics. We found significant sequence conservation within the subgroups and lineages with characteristic patterns of functional motifs, such as the differential binding of CPSF30 and crk/crkL or the availability of a C-terminal PDZ-binding motif. To understand selection pressures and evolution acting on NS1, it is necessary to organize the available data. This updated classification may help to clarify and organize the study of NS1 interactions and pathogenic differences and allow the drawing of further functional inferences on sequences in each group, subgroup and lineage rather than on a strain-by-strain basis.

  8. Phylogenetic Classification and Species Identification of Dermatophyte Strains Based on DNA Sequences of Nuclear Ribosomal Internal Transcribed Spacer 1 Regions

    Makimura, Koichi; Tamura, Yoshiko; Mochizuki, Takashi; Hasegawa, Atsuhiko; Tajiri, Yoshito; Hanazawa, Ryo; Uchida, Katsuhisa; Saito, Hiuga; Yamaguchi, Hideyo

    1999-01-01

    The mutual phylogenetic relationships of dermatophytes of the genera Trichophyton, Microsporum, and Epidermophyton were demonstrated by using internal transcribed spacer 1 (ITS1) region ribosomal DNA sequences. Trichophyton spp. and Microsporum spp. form a cluster in the phylogenetic tree with Epidermophyton floccosum as an outgroup, and within this cluster, all Trichophyton spp. except Trichophyton terrestre form a nested cluster (100% bootstrap support). Members of dermatophytes in the clus...

  9. TIPP: taxonomic identification and phylogenetic profiling

    Nguyen, Nam-phuong; Mirarab, Siavash; Liu, Bo; Pop, Mihai; Warnow, Tandy

    2014-01-01

    Motivation: Abundance profiling (also called ‘phylogenetic profiling’) is a crucial step in understanding the diversity of a metagenomic sample, and one of the basic techniques used for this is taxonomic identification of the metagenomic reads. Results: We present taxon identification and phylogenetic profiling (TIPP), a new marker-based taxon identification and abundance profiling method. TIPP combines SATé-enabled phylogenetic placement, a phylogenetic placement method, with statistical techniques to control the classification precision and recall, and results in improved abundance profiles. TIPP is highly accurate even in the presence of high indel errors and novel genomes, and matches or improves on previous approaches, including NBC, mOTU, PhymmBL, MetaPhyler and MetaPhlAn. Availability and implementation: Software and supplementary materials are available at http://www.cs.utexas.edu/users/phylo/software/sepp/tipp-submission/. Contact: warnow@illinois.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25359891

  10. Revisiting the phylogeny of Bombacoideae (Malvaceae): Novel relationships, morphologically cohesive clades, and a new tribal classification based on multilocus phylogenetic analyses.

    Carvalho-Sobrinho, Jefferson G; Alverson, William S; Alcantara, Suzana; Queiroz, Luciano P; Mota, Aline C; Baum, David A

    2016-08-01

    Bombacoideae (Malvaceae) is a clade of deciduous trees with a marked dominance in many forests, especially in the Neotropics. The historical lack of a well-resolved phylogenetic framework for Bombacoideae hinders studies in this ecologically important group. We reexamined phylogenetic relationships in this clade based on a matrix of 6465 nuclear (ETS, ITS) and plastid (matK, trnL-trnF, trnS-trnG) DNA characters. We used maximum parsimony, maximum likelihood, and Bayesian inference to infer relationships among 108 species (∼70% of the total number of known species). We analyzed the evolution of selected morphological traits: trunk or branch prickles, calyx shape, endocarp type, seed shape, and seed number per fruit, using ML reconstructions of their ancestral states to identify possible synapomorphies for major clades. Novel phylogenetic relationships emerged from our analyses, including three major lineages marked by fruit or seed traits: the winged-seed clade (Bernoullia, Gyranthera, and Huberodendron), the spongy endocarp clade (Adansonia, Aguiaria, Catostemma, Cavanillesia, and Scleronema), and the Kapok clade (Bombax, Ceiba, Eriotheca, Neobuchia, Pachira, Pseudobombax, Rhodognaphalon, and Spirotheca). The Kapok clade, the most diverse lineage of the subfamily, includes sister relationships (i) between Pseudobombax and "Pochota fendleri" a historically incertae sedis taxon, and (ii) between the Paleotropical genera Bombax and Rhodognaphalon, implying just two bombacoid dispersals to the Old World, the other one involving Adansonia. This new phylogenetic framework offers new insights and a promising avenue for further evolutionary studies. In view of this information, we present a new tribal classification of the subfamily, accompanied by an identification key. PMID:27154210

  11. Phylogenetic Classification of Trichophyton mentagrophytes Complex Strains Based on DNA Sequences of Nuclear Ribosomal Internal Transcribed Spacer 1 Regions

    Makimura, Koichi; Mochizuki, Takashi; Hasegawa, Atsuhiko; Uchida, Katsuhisa; Saito, Hiuga; Yamaguchi, Hideyo

    1998-01-01

    Using internal transcribed spacer 1 (ITS1) region ribosomal DNA sequences from 37 stock strains and clinical isolates provisionally termed Trichophyton mentagrophytes complex in Japan, we demonstrated the mutual phylogenetic relationships of these strains. Members of this complex were classified into 3 ITS1-homologous groups and 13 ITS1-identical groups by their sequences. ITS1-homologous group I consists of Arthroderma vanbreuseghemii, T. mentagrophytes human isolates, and several strains of...

  12. Increasing the data size to accurately reconstruct the phylogenetic relationships between nine subgroups of the Drosophila melanogaster species group (Drosophilidae, Diptera).

    Yang, Yong; Hou, Zhuo-Cheng; Qian, Yuan-Huai; Kang, Han; Zeng, Qing-Tao

    2012-01-01

    Previous phylogenetic analyses of the melanogaster species group have led to conflicting hypotheses concerning their relationships; therefore the addition of new sequence data is necessary to resolve the phylogeny of this species group. Here we present new data derived from 17 genes and representing 48 species to reconstruct the phylogeny of the melanogaster group. A variety of statistical tests, as well as maximum likelihood mapping analysis, were performed to estimate data quality, suggesting that all genes contributed substantially to resolving the phylogeny. Each locus was analyzed individually using maximum likelihood (ML), and the concatenated dataset (12,988 bp) was analyzed using partitioned ML and Bayesian analyses. The separate analyses produced various phylogenetic relationships; however, the phylogenetic topologies from ML and Bayesian analyses based on the concatenated dataset were, at the subgroup level, completely identical to each other with high levels of support. Our results recovered three major clades: the ananassae subgroup, followed by the montium subgroup; the melanogaster subgroup and the oriental subgroups form the third monophyletic clade, in which melanogaster (takahashii, suzukii) forms one subclade and ficusphila [eugracilis (elegans, rhopaloa)] forms another. However, more data are necessary to determine the phylogenetic position of Drosophila lucipennis, which proved difficult to place. PMID:21985965

  13. DEFLATE Compression Algorithm Corrects for Overestimation of Phylogenetic Diversity by Grantham Approach to Single-Nucleotide Polymorphism Classification

    Arran Schlosberg

    2014-05-01

    Improvements in speed and cost of genome sequencing are resulting in increasing numbers of novel non-synonymous single nucleotide polymorphisms (nsSNPs) in genes known to be associated with disease. The large number of nsSNPs makes laboratory-based classification infeasible and familial co-segregation with disease is not always possible. In-silico methods for classification or triage are thus utilised. A popular tool based on multiple-species sequence alignments (MSAs) and work by Grantham, Align-GVGD, has been shown to underestimate deleterious effects, particularly as sequence numbers increase. We utilised the DEFLATE compression algorithm to account for expected variation across a number of species. With the adjusted Grantham measure we derived a means of quantitatively clustering known neutral and deleterious nsSNPs from the same gene; this was then used to assign novel variants to the most appropriate cluster as a means of binary classification. Scaling of clusters allows for inter-gene comparison of variants through a single pathogenicity score. The approach improves upon the classification accuracy of Align-GVGD while correcting for sensitivity to large MSAs. Open-source code and a web server are made available at https://github.com/aschlosberg/CompressGV.
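
    To give a feel for why a compressor can serve as a proxy for variation, the sketch below compares the DEFLATE-compressed size of a fully conserved alignment column with that of a highly variable one, using Python's standard `zlib` module. It is only an illustration of the general idea: the toy columns, the 200-species column length and the helper `deflate_size` are assumptions, and this is not the scoring scheme used by the paper or by CompressGV.

```python
import random
import zlib

def deflate_size(column: str) -> int:
    """Size in bytes of the zlib-wrapped DEFLATE stream for one alignment column."""
    return len(zlib.compress(column.encode("ascii"), 9))

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
rng = random.Random(0)  # fixed seed so the toy data is reproducible

# Hypothetical columns: one residue per species, 200 species each.
conserved = "L" * 200
variable = "".join(rng.choice(AMINO_ACIDS) for _ in range(200))

# A conserved column compresses to far fewer bytes than a variable one.
print(deflate_size(conserved), deflate_size(variable))
```

    Because the compressed size grows with the amount of unpredictable variation rather than simply with the number of sequences, a measure of this kind is one plausible way to temper a score's sensitivity to large alignments.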

  14. When proglottids and scoleces conflict: phylogenetic relationships and a family-level classification of the Lecanicephalidea (Platyhelminthes: Cestoda).

    Jensen, Kirsten; Caira, Janine N; Cielocha, Joanna J; Littlewood, D Timothy J; Waeschenbach, Andrea

    2016-05-01

    This study presents the first comprehensive phylogenetic analysis of the interrelationships of the morphologically diverse elasmobranch-hosted tapeworm order Lecanicephalidea, based on molecular sequence data. With almost half of current generic diversity having been erected or resurrected within the last decade, an apparent conflict between scolex morphology and proglottid anatomy has hampered the assignment of many of these genera to families. Maximum likelihood and Bayesian analyses of two nuclear markers (D1-D3 of lsrDNA and complete ssrDNA) and two mitochondrial markers (partial rrnL and partial cox1) for 61 lecanicephalidean species representing 22 of the 25 valid genera were conducted; new sequence data were generated for 43 species and 11 genera, including three undescribed genera. The monophyly of the order was confirmed in all but the analyses based on cox1 data alone. Sesquipedalapex placed among species of Anteropora and was thus synonymized with the latter genus. Based on analyses of the concatenated dataset, eight major groups emerged which are herein formally recognised at the familial level. Existing family names (i.e., Lecanicephalidae, Polypocephalidae, Tetragonocephalidae, and Cephalobothriidae) are maintained for four of the eight clades, and new families are proposed for the remaining four groups (Aberrapecidae n. fam., Eniochobothriidae n. fam., Paraberrapecidae n. fam., and Zanobatocestidae n. fam.). The four new families and the Tetragonocephalidae are monogeneric, while the Cephalobothriidae, Lecanicephalidae and Polypocephalidae comprise seven, eight and four genera, respectively. As a result of their unusual morphologies, the three genera not included here (i.e., Corrugatocephalum, Healyum and Quadcuspibothrium) are considered incertae sedis within the order until their familial affinities can be examined in more detail. All eight families are newly circumscribed based on morphological features and a key to the families is provided

  15. Classification

    Clary, Renee; Wandersee, James

    2013-01-01

    In this article, Renee Clary and James Wandersee describe the beginnings of "Classification," which lies at the very heart of science and depends upon pattern recognition. Clary and Wandersee approach patterns by first telling the story of the "Linnaean classification system," introduced by Carl Linnaeus (1707-1778), who is…

  16. A non-contact method based on multiple signal classification algorithm to reduce the measurement time for accurately heart rate detection

    Bechet, P.; Mitran, R.; Munteanu, M.

    2013-08-01

    Non-contact methods for the assessment of vital signs are of great interest for specialists due to the benefits obtained in both medical and special applications, such as those for surveillance, monitoring, and search and rescue. This paper investigates the possibility of implementing a digital processing algorithm based on the MUSIC (Multiple Signal Classification) parametric spectral estimation in order to reduce the observation time needed to accurately measure the heart rate. It demonstrates that, by properly dimensioning the signal subspace, the MUSIC algorithm can be optimized in order to accurately assess the heart rate during an 8-28 s time interval. The validation of the processing algorithm performance was achieved by minimizing the mean error of the heart rate after performing simultaneous comparative measurements on several subjects. In order to calculate the error, the reference value of the heart rate was measured using a classic measurement system through direct contact.

  17. Fast, Simple and Accurate Handwritten Digit Classification by Training Shallow Neural Network Classifiers with the 'Extreme Learning Machine' Algorithm.

    Mark D McDonnell

    Recent advances in training deep (multi-layer) architectures have inspired a renaissance in neural network use. For example, deep convolutional networks are becoming the default option for difficult tasks on large datasets, such as image and speech recognition. However, here we show that error rates below 1% on the MNIST handwritten digit benchmark can be replicated with shallow non-convolutional neural networks. This is achieved by training such networks using the 'Extreme Learning Machine' (ELM) approach, which also enables a very rapid training time (∼10 minutes). Adding distortions, as is common practise for MNIST, reduces error rates even further. Our methods are also shown to be capable of achieving less than 5.5% error rates on the NORB image database. To achieve these results, we introduce several enhancements to the standard ELM algorithm, which individually and in combination can significantly improve performance. The main innovation is to ensure each hidden-unit operates only on a randomly sized and positioned patch of each image. This form of random 'receptive field' sampling of the input ensures the input weight matrix is sparse, with about 90% of weights equal to zero. Furthermore, combining our methods with a small number of iterations of a single-batch backpropagation method can significantly reduce the number of hidden-units required to achieve a particular performance. Our close to state-of-the-art results for MNIST and NORB suggest that the ease of use and accuracy of the ELM algorithm for designing a single-hidden-layer neural network classifier should cause it to be given greater consideration either as a standalone method for simpler problems, or as the final classification stage in deep neural networks applied to more difficult problems.
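
    The random receptive-field idea in this abstract can be illustrated with a short NumPy sketch: each hidden unit gets fixed random input weights that are non-zero only inside one randomly sized and positioned image patch, and only the output weights are fitted, here by ridge-regularised least squares. The patch-size distribution, the tanh activation, the ridge penalty and the function names are assumptions made for illustration; the abstract does not specify them, and this is not the author's implementation.

```python
import numpy as np

def random_patch_mask(rng, side, min_size=2):
    """Flat 0/1 mask selecting one random rectangle in a side x side image."""
    h = rng.integers(min_size, side + 1)   # patch height in [min_size, side]
    w = rng.integers(min_size, side + 1)   # patch width  in [min_size, side]
    top = rng.integers(0, side - h + 1)
    left = rng.integers(0, side - w + 1)
    mask = np.zeros((side, side))
    mask[top:top + h, left:left + w] = 1.0
    return mask.ravel()

def train_elm(X, T, side, n_hidden=50, ridge=1e-2, seed=0):
    """Fit an ELM: random masked input weights, least-squares output weights."""
    rng = np.random.default_rng(seed)
    masks = np.stack([random_patch_mask(rng, side) for _ in range(n_hidden)])
    # Input weights are random, sparse (zero outside each patch) and never trained.
    W_in = rng.standard_normal((n_hidden, side * side)) * masks
    H = np.tanh(X @ W_in.T)                # hidden-layer activations
    # Ridge regression for the output layer: (H^T H + ridge * I) W_out = H^T T
    W_out = np.linalg.solve(H.T @ H + ridge * np.eye(n_hidden), H.T @ T)
    return W_in, W_out

def predict_elm(X, W_in, W_out):
    """Class scores for flattened images X; argmax over columns gives the label."""
    return np.tanh(X @ W_in.T) @ W_out
```

    Here `X` holds flattened `side` × `side` images (one per row) and `T` holds one-hot targets. With patch sizes drawn uniformly as above, the weight matrix will be less sparse than the roughly 90% zeros reported in the abstract; a distribution favouring smaller patches would be needed to approach that figure.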

  18. Accurate Classification of Books and Documents Using the Chinese Library Classification (5th Edition)

    汤彩霞

    2011-01-01

    This paper discusses, from three aspects, how to use the Chinese Library Classification (5th Edition) (hereafter CLC5) to classify books and documents more accurately: carrying out preliminary preparation for CLC5, such as comparing the new and old classification schemes; understanding and mastering some of the general classification rules of CLC5; and drawing up the library's own classification regulations for adopting CLC5.

  19. SpineAnalyzer™ is an accurate and precise method of vertebral fracture detection and classification on dual-energy lateral vertebral assessment scans

    Osteoporotic fractures of the spine are associated with significant morbidity, are highly predictive of hip fractures, but frequently do not present clinically. When there is a low to moderate clinical suspicion of vertebral fracture, which would not justify acquisition of a radiograph, vertebral fracture assessment (VFA) using Dual-energy X-ray Absorptiometry (DXA) offers a low-dose opportunity for diagnosis. Different approaches to the classification of vertebral fractures have been documented. The aim of this study was to measure the precision and accuracy of SpineAnalyzer™, a quantitative morphometry software program. Lateral vertebral assessment images of 64 men were analysed using SpineAnalyzer™ and standard GE Lunar software. The images were also analysed by two expert readers using a semi-quantitative approach. Agreement between groups ranged from 95.99% to 98.60%. The intra-rater precision for the application of SpineAnalyzer™ to vertebrae was poor in the upper thoracic regions, but good elsewhere. SpineAnalyzer™ is a reproducible and accurate method for measuring vertebral height and quantifying vertebral fractures from VFA scans. - Highlights: • Vertebral fracture assessment (VFA) using Dual-energy X-ray Absorptiometry (DXA) offers a low-dose opportunity for diagnosis. • Agreement between VFA software (SpineAnalyzer™) and expert readers is high. • Intra-rater precision of SpineAnalyzer™ applied to upper thoracic vertebrae is poor, but good elsewhere. • SpineAnalyzer™ is reproducible and accurate for vertebral height measurement and fracture quantification from VFA scans

  20. Didiscus verdensis spec. nov. (Porifera: Halichondrida) from the Cape Verde Islands, with a revision and phylogenetic classification of the genus Didiscus

    Hiemstra, F.; Soest, van R.W.M.

    1991-01-01

    A new species of the circumtropical/subtropical genus Didiscus Dendy, 1922 is described from the Cape Verde Islands. Based on a phylogenetic analysis of all known species of the genus, using morphological and microscopical (including SEM) characters, it was demonstrated that the new species is close

  1. Cyber infrastructure for Fusarium: three integrated platforms supporting strain identification, phylogenetics, comparative genomics and knowledge sharing

    Park, Bongsoo; Park, Jongsun; Cheong, Kyeong-Chae; Choi, Jaeyoung; Jung, Kyongyong; Kim, Donghan; Lee, Yong-Hwan; Ward, Todd J.; O'Donnell, Kerry; Geiser, David M.; Kang, Seogchan

    2010-01-01

    The fungal genus Fusarium includes many plant and/or animal pathogenic species and produces diverse toxins. Although accurate species identification is critical for managing such threats, it is difficult to identify Fusarium morphologically. Fortunately, extensive molecular phylogenetic studies, founded on well-preserved culture collections, have established a robust foundation for Fusarium classification. Genomes of four Fusarium species have been published with more being currently sequence...

  2. Molecular Phylogenetic: Organism Taxonomy Method Based on Evolution History

    N.L.P Indi Dharmayanti

    2011-01-01

    Phylogenetics is described as the taxonomic classification of an organism based on its evolutionary history, namely its phylogeny, and as a part of systematics whose objective is to determine the phylogeny of organisms according to their characteristics. Phylogenetic analysis of amino acid and protein sequences has become an important area of sequence analysis. Phylogenetic analysis can be used to follow the rapid change of a species such as a virus. The phylogenetic evolution tree is a two dimensional of a spec...

  3. Rapid and accurate taxonomic classification of insect (class Insecta) cytochrome c oxidase subunit 1 (COI) DNA barcode sequences using a naïve Bayesian classifier

    Porter, Teresita M.; Gibson, Joel F; Shokralla, Shadi; Baird, Donald J.; Golding, G. Brian; Hajibabaei, Mehrdad

    2014-01-01

    Current methods to identify unknown insect (class Insecta) cytochrome c oxidase (COI barcode) sequences often rely on thresholds of distances that can be difficult to define, sequence similarity cut-offs, or monophyly. Some of the most commonly used metagenomic classification methods do not provide a measure of confidence for the taxonomic assignments they provide. The aim of this study was to use a naïve Bayesian classifier (Wang et al. Applied and Environmental Microbiology, 2007; 73: 5261)...

  4. The Revised Classification of Eukaryotes

    Adl, Sina M; Simpson, Alastair G.B.; Lane, Christopher E.; Lukeš, Julius; Bass, David; Bowser, Samuel S.; Brown, Matthew W.; Burki, Fabien; Dunthorn, Micah; Hampl, Vladimir; Heiss, Aaron; Hoppenrath, Mona; Lara, Enrique; Le Gall, Line; Lynn, Denis H.

    2013-01-01

    This revision of the classification of eukaryotes, which updates that of Adl et al. [J. Eukaryot. Microbiol. 52 (2005) 399], retains an emphasis on the protists and incorporates changes since 2005 that have resolved nodes and branches in phylogenetic trees. Whereas the previous revision was successful in re-introducing name stability to the classification, this revision provides a classification for lineages that were then still unresolved. The supergroups have withstood phylogenetic hypothes...

  5. ICGA-PSO-ELM approach for accurate multiclass cancer classification resulting in reduced gene sets in which genes encoding secreted proteins are highly represented.

    Saraswathi, Saras; Sundaram, Suresh; Sundararajan, Narasimhan; Zimmermann, Michael; Nilsen-Hamilton, Marit

    2011-01-01

    A combination of Integer-Coded Genetic Algorithm (ICGA) and Particle Swarm Optimization (PSO), coupled with the neural-network-based Extreme Learning Machine (ELM), is used for gene selection and cancer classification. ICGA is used with PSO-ELM to select an optimal set of genes, which is then used to build a classifier to develop an algorithm (ICGA-PSO-ELM) that can handle sparse data and sample imbalance. We evaluate the performance of ICGA-PSO-ELM and compare our results with existing methods in the literature. An investigation into the functions of the selected genes, using a systems biology approach, revealed that many of the identified genes are involved in cell signaling and proliferation. An analysis of these gene sets shows a larger representation of genes that encode secreted proteins than found in randomly selected gene sets. Secreted proteins constitute a major means by which cells interact with their surroundings. Mounting biological evidence has identified the tumor microenvironment as a critical factor that determines tumor survival and growth. Thus, the genes identified by this study that encode secreted proteins might provide important insights into the nature of the critical biological features in the microenvironment of each tumor type that allow these cells to thrive and proliferate. PMID:21233525

  6. Stratification of co-evolving genomic groups using ranked phylogenetic profiles

    Tsoka Sophia

    2009-10-01

    Background: Previous methods of detecting the taxonomic origins of arbitrary sequence collections, with a significant impact to genome analysis and in particular metagenomics, have primarily focused on compositional features of genomes. The evolutionary patterns of phylogenetic distribution of genes or proteins, represented by phylogenetic profiles, provide an alternative approach for the detection of taxonomic origins, but typically suffer from low accuracy. Herein, we present rank-BLAST, a novel approach for the assignment of protein sequences into genomic groups of the same taxonomic origin, based on the ranking order of phylogenetic profiles of target genes or proteins across the reference database. Results: The rank-BLAST approach is validated by computing the phylogenetic profiles of all sequences for five distinct microbial species of varying degrees of phylogenetic proximity, against a reference database of 243 fully sequenced genomes. The approach - a combination of sequence searches, statistical estimation and clustering - analyses the degree of sequence divergence between sets of protein sequences and allows the classification of protein sequences according to the species of origin with high accuracy, allowing taxonomic classification of 64% of the proteins studied. In most cases, a main cluster is detected, representing the corresponding species. Secondary, functionally distinct and species-specific clusters exhibit different patterns of phylogenetic distribution, thus flagging gene groups of interest. Detailed analyses of such cases are provided as examples. Conclusion: Our results indicate that the rank-BLAST approach can capture the taxonomic origins of sequence collections in an accurate and efficient manner. The approach can be useful both for the analysis of genome evolution and the detection of species groups in metagenomics samples.

  7. Photometric brown-dwarf classification. II. A homogeneous sample of 1361 L and T dwarfs brighter than J = 17.5 with accurate spectral types

    Skrzypek, N.; Warren, S. J.; Faherty, J. K.

    2016-04-01

    We present a homogeneous sample of 1361 L and T dwarfs brighter than J = 17.5 (of which 998 are new), from an effective area of 3070 deg², classified by the photo-type method to an accuracy of one spectral sub-type using izYJHKW1W2 photometry from SDSS+UKIDSS+WISE. Other than a small bias in the early L types, the sample is shown to be effectively complete to the magnitude limit, for all spectral types L0 to T8. The nature of the bias is an incompleteness estimated at 3% because peculiar blue L dwarfs of type L4 and earlier are classified late M. There is a corresponding overcompleteness because peculiar red (likely young) late M dwarfs are classified early L. Contamination of the sample is confirmed to be small: so far spectroscopy has been obtained for 19 sources in the catalogue and all are confirmed to be ultracool dwarfs. We provide coordinates and izYJHKW1W2 photometry of all sources. We identify an apparent discontinuity, Δm ~ 0.4 mag, in the Y - K colour between spectral types L7 and L8. We present near-infrared spectra of nine sources identified by photo-type as peculiar, including a new low-gravity source ULAS J005505.68+013436.0, with spectroscopic classification L2γ. We provide revised izYJHKW1W2 template colours for late M dwarfs, types M7 to M9. The catalogue is only available at the CDS via anonymous ftp to http://cdsarc.u-strasbg.fr (ftp://130.79.128.5) or via http://cdsarc.u-strasbg.fr/viz-bin/qcat?J/A+A/589/A49

  8. The need for improved identification and accurate classification of stages 3-5 Chronic Kidney Disease in primary care: retrospective cohort study.

    Poorva Jain

    BACKGROUND: Around ten percent of the population have been reported as having Chronic Kidney Disease (CKD), which is associated with increased cardiovascular mortality. Few previous studies have ascertained the chronicity of CKD. In the UK, a payment for performance (P4P) initiative incentivizes CKD (stages 3-5) recognition and management in primary care, but the impact of this has not been assessed. METHODS AND FINDINGS: Using data from 426 primary care practices (population 2,707,130), the age standardised prevalence of stages 3-5 CKD was identified using two consecutive estimated Glomerular Filtration Rates (eGFRs) seven days apart. Additionally the accuracy of practice CKD registers and the relationship between accurate identification of CKD and the achievement of P4P indicators was determined. Between 2005 and 2009, the prevalence of stages 3-5 CKD increased from 0.3% to 3.9%. In 2009, 30,440 patients (1.1% unadjusted) fulfilled biochemical criteria for CKD but were not on a practice CKD register (uncoded CKD) and 60,705 patients (2.2% unadjusted) were included on a practice CKD register but did not fulfil biochemical criteria (miscoded CKD). For patients with confirmed CKD, inclusion in a practice register was associated with increasing age, male sex, diabetes, hypertension, cardiovascular disease and increasing CKD stage (p<0.0001). Uncoded CKD patients compared to miscoded patients were less likely to achieve performance indicators for blood pressure (OR 0.84, 95% CI 0.82-0.86, p<0.001) or recorded albumin-creatinine ratio (OR 0.73, 0.70-0.76, p<0.001). CONCLUSIONS: The prevalence of stages 3-5 CKD, using two laboratory reported eGFRs, was lower than estimates from previous studies. Clinically significant discrepancies were identified between biochemically defined CKD and appearance on practice registers, with misclassification associated with sub-optimal care for some people with CKD.
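
    The unadjusted percentages quoted for uncoded and miscoded CKD can be checked directly from the counts given in the abstract. The short calculation below uses only those reported numbers; the variable and function names are illustrative.

```python
# Counts as reported in the abstract (2009 data).
population = 2_707_130   # patients across the 426 practices
uncoded = 30_440         # met biochemical criteria, not on a CKD register
miscoded = 60_705        # on a CKD register, did not meet biochemical criteria

def percent_of_population(count: int, total: int) -> float:
    """Unadjusted percentage of the total population, to one decimal place."""
    return round(100 * count / total, 1)

print(percent_of_population(uncoded, population))
print(percent_of_population(miscoded, population))
```

    This gives 1.1 and 2.2, matching the unadjusted figures stated above, and shows that miscoded patients outnumbered uncoded patients by roughly two to one.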

  9. Phylogenetic and phytogeographical relationships in Maloideae (Rosaceae) based on morphological and anatomical characters

    Aldasoro, J.J.; Aedo, C.; Navarro, C.

    2005-01-01

    Phylogenetic relationships among 24 genera of Rosaceae subfam. Maloideae and Spiraeoideae are explored by means of a cladistic analysis; 16 morphological and anatomical characters were included in the analysis. Published suprageneric classifications and characters used in these classifications are b

  10. Advances in phylogenetic studies of Nematoda

    2002-01-01

    Nematoda is a metazoan group with extremely high diversity, second only to Insecta. Caenorhabditis elegans is now a favored experimental model animal in modern developmental biology, genetics and genomics studies. However, the phylogeny of Nematoda and the phylogenetic position of the phylum within the animal kingdom have long been debated. Recent molecular phylogenetic studies have posed great challenges to the traditional nematode classification. The new phylogenies not only placed the Nematoda in the Ecdysozoa and divided the phylum into five clades, but also provided new insights into animal molecular identification and phylogenetic biodiversity studies. The present paper reviews major progress and remaining problems in the current molecular phylogenetic studies of Nematoda, and discusses the prospects for future development of this field.

  11. ClassyFlu: Classification of Influenza A Viruses with Discriminatively Trained Profile-HMMs

    Van der Auwera, Sandra; Bulla, Ingo; Ziller, Mario; Pohlmann, Anne; Harder, Timm; Stanke, Mario

    2014-01-01

    Accurate and rapid characterization of influenza A virus (IAV) hemagglutinin (HA) and neuraminidase (NA) sequences with respect to subtype and clade is at the basis of extended diagnostic services and implicit to molecular epidemiologic studies. ClassyFlu is a new tool and web service for the classification of IAV sequences of the HA and NA gene into subtypes and phylogenetic clades using discriminatively trained profile hidden Markov models (HMMs), one for each subtype or clade. ClassyFlu me...

  12. A genus-level classification of the family Thraupidae (Class Aves: Order Passeriformes).

    Burns, Kevin J; Unitt, Philip; Mason, Nicholas A

    2016-01-01

    The tanagers (Thraupidae) are a major component of the Neotropical avifauna, and vary in plumage colors, behaviors, morphologies, and ecologies. Globally, they represent nearly 4% of all avian species and are the largest family of songbirds. However, many currently used tanager genera are not monophyletic, based on analyses of molecular data that have accumulated over the past 25 years. Current genus-level classifications of tanagers have not been revised according to newly documented relationships of tanagers for various reasons: 1) the lack of a comprehensive phylogeny, 2) reluctance to lump existing genera into larger groups, and 3) the lack of available names for newly defined smaller groups. Here, we present two alternative classifications based on a newly published comprehensive phylogeny of tanagers. One of these classifications uses existing generic names, but defines them broadly. The other, which we advocate and follow here, provides new generic names for more narrowly defined groups. Under the latter, we propose eleven new genera (Asemospiza, Islerothraupis, Maschalethraupis, Chrysocorypha, Kleinothraupis, Castanozoster, Ephippiospingus, Chionodacryon, Pseudosaltator, Poecilostreptus, Stilpnia), and resurrect several generic names to form monophyletic taxa. Either of these classifications would allow taxonomic authorities to reconcile classification with current understanding of tanager phylogenetic relationships. Having a more phylogenetically accurate classification for tanagers will facilitate the study and conservation of this important Neotropical radiation of songbirds. PMID:27394344

  13. The evolution of HPV by means of a phylogenetic study.

    Isea, Raúl; Chaves, Juan L; Montes, Esther; Rubio-Montero, Antonio J; Mayo, Rafael

    2009-01-01

    In this work we demonstrate the adequacy of revising classification systems on the basis of molecular phylogenetic calculations that allow an arbitrary number of taxa and take advantage of high performance computing platforms, using Human papillomavirus (HPV) as the case study. To do so, we have analysed several phylogenetic trees which have been calculated with the PhyloGrid tool, a workflow developed in the framework of the EELA-2 Project. PMID:19593062

  14. Efficient multivariate sequence classification

    Kuksa, Pavel P.

    2014-01-01

    Kernel-based approaches for sequence classification have been successfully applied to a variety of domains, including text categorization, image classification, speech analysis, biological sequence analysis, time series and music classification, where they show some of the most accurate results. Typical kernel functions for sequences in these domains (e.g., bag-of-words, mismatch, or subsequence kernels) are restricted to discrete univariate (i.e. one-dimensional) string data, such ...

  15. Phylogenetic effective sample size

    Bartoszek, Krzysztof

    2015-01-01

    In this paper I address the question: how large is a phylogenetic sample? I propose a definition of a phylogenetic effective sample size for Brownian motion and Ornstein-Uhlenbeck processes - the regression effective sample size. I discuss how mutual information can be used to define an effective sample size in the non-normal process case and compare these two definitions to an already present concept of effective sample size (the mean effective sample size). Through a simulation study I find...

  16. 'Araphid' diatom classification and the 'absolute standard'

    Williams, David M.

    2009-01-01

    'Araphid' diatom classification is discussed from the point of view of an 'absolute standard' for taxonomic rank. The 'absolute standard' is the phylogenetic tree, its nodes, the included monophyletic groups and sub-groups. To illustrate this point a few species from the genus Licmophora are re-analysed and the resulting phylogenetic tree is discussed in terms of a possible classification, the groups and sub-groups and their ranks.

  17. Phylogenetically resolving epidemiologic linkage

    Romero-Severson, Ethan O.; Bulla, Ingo; Leitner, Thomas

    2016-01-01

    Although the use of phylogenetic trees in epidemiological investigations has become commonplace, their epidemiological interpretation has not been systematically evaluated. Here, we use an HIV-1 within-host coalescent model to probabilistically evaluate transmission histories of two epidemiologically linked hosts. Previous critique of phylogenetic reconstruction has claimed that direction of transmission is difficult to infer, and that the existence of unsampled intermediary links or common sources can never be excluded. The phylogenetic relationship between the HIV populations of epidemiologically linked hosts can be classified into six types of trees, based on cladistic relationships and whether the reconstruction is consistent with the true transmission history or not. We show that the direction of transmission and whether unsampled intermediary links or common sources existed make very different predictions about expected phylogenetic relationships: (i) Direction of transmission can often be established when paraphyly exists, (ii) intermediary links can be excluded when multiple lineages were transmitted, and (iii) when the sampled individuals’ HIV populations both are monophyletic a common source was likely the origin. Inconsistent results, suggesting the wrong transmission direction, were generally rare. In addition, the expected tree topology also depends on the number of transmitted lineages, the sample size, the time of the sample relative to transmission, and how fast the diversity increases after infection. Typically, 20 or more sequences per subject give robust results. We confirm our theoretical evaluations with analyses of real transmission histories and discuss how our findings should aid in interpreting phylogenetic results. PMID:26903617

  18. Clustering with phylogenetic tools in astrophysics

    Fraix-Burnet, Didier

    2016-01-01

    Phylogenetic approaches are finding more and more applications outside the field of biology. Astrophysics is no exception since an overwhelming amount of multivariate data has appeared in the last twenty years or so. In particular, the diversification of galaxies throughout the evolution of the Universe quite naturally invokes phylogenetic approaches. We have demonstrated that Maximum Parsimony brings useful astrophysical results, and we now proceed toward the analyses of large datasets for galaxies. In this talk I present how we solve the major difficulties for this goal: the choice of the parameters, their discretization, and the analysis of a high number of objects with an unsupervised NP-hard classification technique like cladistics. 1. Introduction How do galaxies form, and when? How did galaxies evolve and transform themselves to create the diversity we observe? What are the progenitors of present-day galaxies? To answer these big questions, observations throughout the Universe and the physical mode...

  19. A Universal Phylogenetic Tree.

    Offner, Susan

    2001-01-01

    Presents a universal phylogenetic tree suitable for use in high school and college-level biology classrooms. Illustrates the antiquity of life and that all life is related, even if it dates back 3.5 billion years. Reflects important evolutionary relationships and provides an exciting way to learn about the history of life. (SAH)

  20. Charles Darwin, beetles and phylogenetics

    Beutel, Rolf G.; Friedrich, Frank; Leschen, Richard A. B.

    2009-11-01

    Here, we review Charles Darwin’s relation to beetles and developments in coleopteran systematics in the last two centuries. Darwin was an enthusiastic beetle collector. He used beetles to illustrate different evolutionary phenomena in his major works, and astonishingly, an entire sub-chapter is dedicated to beetles in “The Descent of Man”. During his voyage on the Beagle, Darwin was impressed by the high diversity of beetles in the tropics, and he remarked that, to his surprise, the majority of species were small and inconspicuous. However, despite his obvious interest in the group, he did not get involved in beetle taxonomy, and his theoretical work had little immediate impact on beetle classification. The development of taxonomy and classification in the late nineteenth and earlier twentieth century was mainly characterised by the exploration of new character systems (e.g. larval features and wing venation). In the mid-twentieth century, Hennig’s new methodology to group lineages by derived characters revolutionised systematics of Coleoptera and other organisms. As envisioned by Darwin and Ernst Haeckel, the new Hennigian approach enabled systematists to establish classifications truly reflecting evolution. Roy A. Crowson and Howard E. Hinton, who both made tremendous contributions to coleopterology, had an ambivalent attitude towards the Hennigian ideas. The Mickoleit school combined detailed anatomical work with a classical Hennigian character evaluation, with stepwise tree building, comparatively few characters and a priori polarity assessment without explicit use of the outgroup comparison method. The rise of cladistic methods in the 1970s had a strong impact on beetle systematics. Cladistic computer programs facilitated parsimony analyses of large data matrices, mostly morphological characters not requiring detailed anatomical investigations. Molecular studies on beetle phylogeny started in the 1990s with modest taxon sampling and limited DNA data

  1. Phylogenetic molecular function annotation

    Barbara E Engelhardt; Jordan, Michael I.; Repo, Susanna T; Brenner, Steven E

    2009-01-01

    It is now easier to discover thousands of protein sequences in a new microbial genome than it is to biochemically characterize the specific activity of a single protein of unknown function. The molecular functions of protein sequences have typically been predicted using homology-based computational methods, which rely on the principle that homologous proteins share a similar function. However, some protein families include groups of proteins with different molecular functions. A phylogenetic ...

  2. Molecular phylogenetics before sequences

    Mark A. Ragan; Bernard, Guillaume; Chan, Cheong Xin

    2014-01-01

    From 1971 to 1985, Carl Woese and colleagues generated oligonucleotide catalogs of 16S/18S rRNAs from more than 400 organisms. Using these incomplete and imperfect data, Carl and his colleagues developed unprecedented insights into the structure, function, and evolution of the large RNA components of the translational apparatus. They recognized a third domain of life, revealed the phylogenetic backbone of bacteria (and its limitations), delineated taxa, and explored the tempo and mode of micr...

  3. Canonical phylogenetic ordination.

    Giannini, Norberto P

    2003-10-01

    A phylogenetic comparative method is proposed for estimating historical effects on comparative data using the partitions that compose a cladogram, i.e., its monophyletic groups. Two basic matrices, Y and X, are defined in the context of an ordinary linear model. Y contains the comparative data measured over t taxa. X consists of an initial tree matrix that contains all the xj monophyletic groups (each coded separately as a binary indicator variable) of the phylogenetic tree available for those taxa. The method seeks to define the subset of groups, i.e., a reduced tree matrix, that best explains the patterns in Y. This definition is accomplished via regression or canonical ordination (depending on the dimensionality of Y) coupled with Monte Carlo permutations. It is argued here that unrestricted permutations (i.e., under an equiprobable model) are valid for testing this specific kind of groupwise hypothesis. Phylogeny is either partialled out or, more properly, incorporated into the analysis in the form of component variation. Direct extensions allow for testing ecomorphological data controlled by phylogeny in a variation partitioning approach. Currently available statistical techniques make this method applicable under most univariate/multivariate models and metrics; two-way phylogenetic effects can be estimated as well. The simplest case (univariate Y), tested with simulations, yielded acceptable type I error rates. Applications presented include examples from evolutionary ethology, ecology, and ecomorphology. Results showed that the new technique detected previously overlooked variation clearly associated with phylogeny and that many phylogenetic effects on comparative data may occur at particular groups rather than across the entire tree. PMID:14530135
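    The tree matrix X described above can be illustrated with a small sketch. It is not the author's implementation: the nested-tuple tree encoding, the function names, and the choice to drop the all-taxa root group (whose indicator column would be constant) are assumptions made here for illustration only.

```python
# Minimal sketch (assumed encoding): a cladogram is a nested tuple whose
# leaves are taxon names. Each internal node defines one monophyletic group.

def clades(tree):
    """Return (leaves, groups) for a nested-tuple cladogram."""
    if isinstance(tree, str):
        return [tree], []
    leaves, groups = [], []
    for child in tree:
        child_leaves, child_groups = clades(child)
        leaves.extend(child_leaves)
        groups.extend(child_groups)
    groups.append(frozenset(leaves))  # the group rooted at this node
    return leaves, groups


def tree_matrix(tree):
    """Binary indicator matrix: one row per taxon, one column per group.

    The root group (all taxa) is dropped here because its column would be
    constant; that is an assumption of this sketch, not a stated rule.
    """
    taxa, groups = clades(tree)
    all_taxa = frozenset(taxa)
    groups = [g for g in groups if g != all_taxa]
    matrix = [[1 if t in g else 0 for g in groups] for t in taxa]
    return taxa, groups, matrix


# Hypothetical five-taxon cladogram ((A,B),(C,(D,E))).
taxa, groups, X = tree_matrix((("A", "B"), ("C", ("D", "E"))))
for taxon, row in zip(taxa, X):
    print(taxon, row)
```

    Each column of X could then serve as a candidate explanatory variable for Y in the regression or ordination step, which this sketch does not attempt.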

  4. Efficient segmentation by sparse pixel classification

    Dam, Erik B; Loog, Marco

    2008-01-01

    Segmentation methods based on pixel classification are powerful but often slow. We introduce two general algorithms, based on sparse classification, for optimizing the computation while still obtaining accurate segmentations. The computational costs of the algorithms are derived, and they are...

  5. Multiple sparse representations classification

    Plenge, Esben; Klein, Stefan; Niessen, Wiro; Meijering, Erik

    2015-01-01

    Sparse representations classification (SRC) is a powerful technique for pixelwise classification of images and it is increasingly being used for a wide variety of image analysis tasks. The method uses sparse representation and learned redundant dictionaries to classify image pixels. In this empirical study we propose to further leverage the redundancy of the learned dictionaries to achieve a more accurate classifier. In conventional SRC, each image pixel is associated with a small...

  6. Multiple Sparse Representations Classification

    Plenge, Esben; Klein, Stefan S.; Niessen, Wiro J.; Meijering, Erik

    2015-01-01

    Sparse representations classification (SRC) is a powerful technique for pixelwise classification of images and it is increasingly being used for a wide variety of image analysis tasks. The method uses sparse representation and learned redundant dictionaries to classify image pixels. In this empirical study we propose to further leverage the redundancy of the learned dictionaries to achieve a more accurate classifier. In conventional SRC, each image pixel is associated with a small patch surro...

  7. Nominal classification

    Senft, G.

    2007-01-01

    This handbook chapter summarizes some of the problems of nominal classification in language, presents and illustrates the various systems or techniques of nominal classification, and points out why nominal classification is one of the most interesting topics in Cognitive Linguistics.

  8. Associations of Leaf Spectra with Genetic and Phylogenetic Variation in Oaks: Prospects for Remote Detection of Biodiversity

    Jeannine Cavender-Bares

    2016-03-01

    Species and phylogenetic lineages have evolved to differ in the way that they acquire and deploy resources, with consequences for their physiological, chemical and structural attributes, many of which can be detected using spectral reflectance from leaves. Recent technological advances for assessing optical properties of plants offer opportunities to detect functional traits of organisms and differentiate levels of biological organization across the tree of life. Here, we connect leaf-level full range spectral data (400–2400 nm) of leaves to the hierarchical organization of plant diversity within the oak genus (Quercus) using field and greenhouse experiments in which environmental factors and plant age are controlled. We show that spectral data significantly differentiate populations within a species and that spectral similarity is significantly associated with phylogenetic similarity among species. We further show that hyperspectral information allows more accurate classification of taxa than spectrally-derived traits, which by definition are of lower dimensionality. Finally, model accuracy increases at higher levels in the hierarchical organization of plant diversity, such that we are able to better distinguish clades than species or populations. This pattern supports an evolutionary explanation for the degree of optical differentiation among plants and demonstrates potential for remote detection of genetic and phylogenetic diversity.

  9. Fast phylogenetic DNA barcoding

    Terkelsen, Kasper Munch; Boomsma, Wouter Krogh; Willerslev, Eske

    2008-01-01

    We present a heuristic approach to the DNA assignment problem based on phylogenetic inferences using constrained neighbour joining and non-parametric bootstrapping. We show that this method performs as well as the more computationally intensive full Bayesian approach in an analysis of 500 insect DNA sequences obtained from GenBank. We also analyse a previously published dataset of environmental DNA sequences from soil from New Zealand and Siberia, and use these data to illustrate the fact that statistical approaches to the DNA assignment problem allow for more appropriate criteria for determining the taxonomic level at which a particular DNA sequence can be assigned.

  10. Phylogenetic comparative assembly

    Husemann Peter

    2010-01-01

    Background: Recent high throughput sequencing technologies are capable of generating a huge amount of data for bacterial genome sequencing projects. Although current sequence assemblers successfully merge the overlapping reads, often several contigs remain which cannot be assembled any further. It is still costly and time consuming to close all the gaps in order to acquire the whole genomic sequence. Results: Here we propose an algorithm that takes several related genomes and their phylogenetic relationships into account to create a graph that contains the likelihood for each pair of contigs to be adjacent. Subsequently, this graph can be used to compute a layout graph that shows the most promising contig adjacencies in order to aid biologists in finishing the complete genomic sequence. The layout graph shows unique contig orderings where possible, and the best alternatives where necessary. Conclusions: Our new algorithm for contig ordering uses sequence similarity as well as phylogenetic information to estimate adjacencies of contigs. An evaluation of our implementation shows that it performs better than recent approaches while being much faster at the same time.

  11. Phylogenetic trees in bioinformatics

    Burr, Tom L [Los Alamos National Laboratory]

    2008-01-01

    Genetic data is often used to infer evolutionary relationships among a collection of viruses, bacteria, animal or plant species, or other operational taxonomic units (OTU). A phylogenetic tree depicts such relationships and provides a visual representation of the estimated branching order of the OTUs. Tree estimation is unique for several reasons, including: the types of data used to represent each OTU; the use of probabilistic nucleotide substitution models; the inference goals involving both tree topology and branch length, and the huge number of possible trees for a given sample of a very modest number of OTUs, which implies that finding the best tree(s) to describe the genetic data for each OTU is computationally demanding. Bioinformatics is too large a field to review here. We focus on that aspect of bioinformatics that includes study of similarities in genetic data from multiple OTUs. Although research questions are diverse, a common underlying challenge is to estimate the evolutionary history of the OTUs. Therefore, this paper reviews the role of phylogenetic tree estimation in bioinformatics, available methods and software, and identifies areas for additional research and development.
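    The "huge number of possible trees" can be made concrete with the standard count of unrooted, fully resolved (binary) tree topologies on n labelled OTUs, (2n-5)!! = 3 x 5 x ... x (2n-5). The short sketch below is an illustration added here, not material from the paper; the function name is hypothetical.

```python
def num_unrooted_binary_trees(n):
    """Number of unrooted binary tree topologies on n labelled OTUs (n >= 3).

    Uses the standard double-factorial count (2n-5)!! = 3 * 5 * ... * (2n-5).
    """
    if n < 3:
        raise ValueError("need at least 3 OTUs")
    count = 1
    for k in range(3, 2 * n - 4, 2):  # k = 3, 5, ..., 2n-5
        count *= k
    return count


for n in (4, 5, 10, 20):
    print(n, num_unrooted_binary_trees(n))
```

    Ten OTUs already give 2,027,025 candidate topologies, which is why exhaustive search is impractical for all but very small samples.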

  12. CREST--classification resources for environmental sequence tags.

    Anders Lanzén

    Sequencing of taxonomic or phylogenetic markers is becoming a fast and efficient method for studying environmental microbial communities. This has resulted in a steadily growing collection of marker sequences, most notably of the small-subunit (SSU) ribosomal RNA gene, and an increased understanding of microbial phylogeny, diversity and community composition patterns. However, to utilize these large datasets together with new sequencing technologies, a reliable and flexible system for taxonomic classification is critical. We developed CREST (Classification Resources for Environmental Sequence Tags), a set of resources and tools for generating and utilizing custom taxonomies and reference datasets for classification of environmental sequences. CREST uses an alignment-based classification method with the lowest common ancestor algorithm. It also uses explicit rank similarity criteria to reduce false positives and identify novel taxa. We implemented this method in a web server, a command line tool and the graphical user interface program MEGAN. Further, we provide the SSU rRNA reference database and taxonomy SilvaMod, derived from the publicly available SILVA SSURef, for classification of sequences from bacteria, archaea and eukaryotes. Using cross-validation and environmental datasets, we compared the performance of CREST and SilvaMod to the RDP Classifier. We also utilized Greengenes as a reference database, both with CREST and the RDP Classifier. These analyses indicate that CREST performs better than alignment-free methods with higher recall rate (sensitivity) as well as precision, and with the ability to accurately identify most sequences from novel taxa. Classification using SilvaMod performed better than with Greengenes, particularly when applied to environmental sequences. CREST is freely available under a GNU General Public License (v3) from http://apps.cbu.uib.no/crest and http://lcaclassifier.googlecode.com.
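    The lowest common ancestor idea mentioned above can be sketched in a few lines. This is a simplified illustration, not CREST's code: it assumes each alignment hit has already been mapped to a root-to-leaf lineage, and it omits the score filtering and rank similarity criteria the abstract refers to. The function name and the example lineages are hypothetical.

```python
def lowest_common_ancestor(lineages):
    """Return the longest taxonomy prefix shared by all hit lineages.

    Each lineage is a list of taxon names ordered from root to leaf.
    An empty result means the hits share no common rank.
    """
    if not lineages:
        return []
    common = []
    for ranks in zip(*lineages):  # stops at the shortest lineage
        if all(name == ranks[0] for name in ranks):
            common.append(ranks[0])
        else:
            break
    return common


# Hypothetical lineages of three reference hits for one query sequence.
hits = [
    ["Bacteria", "Proteobacteria", "Gammaproteobacteria", "Enterobacterales"],
    ["Bacteria", "Proteobacteria", "Gammaproteobacteria", "Pseudomonadales"],
    ["Bacteria", "Proteobacteria", "Betaproteobacteria"],
]
assignment = lowest_common_ancestor(hits)
print(assignment)
```

    Because the hits disagree below the phylum, the query is assigned only as far as Proteobacteria; a conservative assignment of this kind is the point of the LCA approach.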

  13. Molecular Phylogenetic: Organism Taxonomy Method Based on Evolution History

    N.L.P Indi Dharmayanti

    2011-03-01

    Phylogenetics is the taxonomic classification of organisms based on their evolutionary history, that is, their phylogeny; it is a part of systematics whose objective is to determine the phylogeny of organisms from their characteristics. Phylogenetic analysis of amino acid and protein sequences has become an important area of sequence analysis. Phylogenetic analysis can be used to follow rapid change in a species such as a virus. A phylogenetic tree is a two-dimensional graph that shows the relationships among organisms or, more particularly, among their gene sequences. The separate sequences are referred to as taxa (singular: taxon), defined as phylogenetically distinct units on the tree. The tree consists of outer branches, or leaves, which represent taxa, and nodes and branches, which represent the relationships among taxa. When the nucleotide sequences of two different organisms are similar, the organisms are inferred to be descended from a common ancestor. Three methods are used in phylogenetics: (1) maximum parsimony, (2) distance, and (3) maximum likelihood. These methods are generally applied to construct the evolutionary tree, or the best tree, for describing sequence variation within a group. Each method is usually used for different analyses and data.
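    As an illustration of the input to the distance method listed above, the sketch below computes the uncorrected p-distance, the proportion of differing sites between two aligned sequences. It is an example chosen here, not taken from the article; the function name and the rule of skipping sites with a gap character "-" are assumptions.

```python
def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap ('-') are skipped (an assumption
    of this sketch; other gap treatments exist).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


# Two hypothetical aligned fragments differing at 2 of 8 sites.
d = p_distance("ACGTACGT", "ACGTTCGA")
print(d)
```

    A matrix of such pairwise distances is what a distance-based tree-building algorithm such as neighbour joining takes as input.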

  14. Modeling body size evolution in Felidae under alternative phylogenetic hypotheses

    José Alexandre Felizola Diniz-Filho

    2009-01-01

    The use of phylogenetic comparative methods in ecological research has advanced during the last twenty years, mainly due to accurate phylogenetic reconstructions based on molecular data and computational and statistical advances. We used phylogenetic correlograms and phylogenetic eigenvector regression (PVR) to model body size evolution in 35 worldwide Felidae (Mammalia, Carnivora) species using two alternative phylogenies and published body size data. The purpose was not to contrast the phylogenetic hypotheses but to evaluate how analyses of body size evolution patterns can be affected by the phylogeny used for comparative analyses (CA). Both phylogenies produced a strong phylogenetic pattern, with closely related species having similar body sizes and the similarity decreasing with increasing distances in time. The PVR explained 65% to 67% of body size variation and all Moran's I values for the PVR residuals were non-significant, indicating that both these models explained phylogenetic structures in trait variation. Even though our results did not suggest that any phylogeny can be used for CA with the same power, or that “good” phylogenies are unnecessary for the correct interpretation of the evolutionary dynamics of ecological, biogeographical, physiological or behavioral patterns, it does suggest that developments in CA can, and indeed should, proceed without waiting for perfect and fully resolved phylogenies.

  15. Insights into the evolution of sorbitol metabolism: phylogenetic analysis of SDR196C family

    Sola Carvajal Agustín

    2012-08-01

    Background: Short chain dehydrogenases/reductases (SDR) are NAD(P)(H)-dependent oxidoreductases with a highly conserved 3D structure and of an early origin, which has allowed them to diverge into several families and enzymatic activities. The SDR196C family (http://www.sdr-enzymes.org) groups bacterial sorbitol dehydrogenases (SDH), which are of great industrial interest. In this study, we examine the phylogenetic relationship between the members of this family, and based on the findings and some sequence conserved blocks, a new and a more accurate classification is proposed. Results: The distribution of the 66 bacterial SDH species analyzed was limited to Gram-negative bacteria. Six different bacterial families were found, encompassing α-, β- and γ-proteobacteria. This broad distribution in terms of bacteria and niches agrees with that of SDR, which are found in all forms of life. A cluster analysis of sorbitol dehydrogenase revealed different types of gene organization, although with a common pattern in which the SDH gene is surrounded by sugar ABC transporter proteins, another SDR, a kinase, and several gene regulators. According to the obtained trees, six different lineages and three sublineages can be discerned. The phylogenetic analysis also suggested two different origins for SDH in β-proteobacteria and four origins for γ-proteobacteria. Finally, this subdivision was further confirmed by the differences observed in the sequence of the conserved blocks described for SDR and some specific blocks of SDH, and by a functional divergence analysis, which made it possible to establish new consensus sequences and specific fingerprints for the lineages and sublineages. Conclusion: SDH distribution agrees with that observed for SDR, indicating the importance of the polyol metabolism, as an alternative source of carbon and energy. The phylogenetic analysis pointed to six clearly defined lineages and three sublineages, and great variability in

  16. Ant-Based Phylogenetic Reconstruction (ABPR): A new distance algorithm for phylogenetic estimation based on ant colony optimization

    Karla Vittori

    2008-12-01

    We propose a new distance algorithm for phylogenetic estimation based on Ant Colony Optimization (ACO), named Ant-Based Phylogenetic Reconstruction (ABPR). ABPR joins two taxa iteratively based on evolutionary distance among sequences, while also accounting for the quality of the phylogenetic tree built according to the total length of the tree. Similar to optimization algorithms for phylogenetic estimation, the algorithm allows exploration of a larger set of nearly optimal solutions. We applied the algorithm to four empirical data sets of mitochondrial DNA ranging from 12 to 186 sequences, and from 898 to 16,608 base pairs, and covering taxonomic levels from populations to orders. We show that ABPR performs better than the commonly used Neighbor-Joining algorithm, except when sequences are too closely related (e.g., population-level sequences). The phylogenetic relationships recovered at and above species level by ABPR agree with conventional views. However, like other algorithms of phylogenetic estimation, the proposed algorithm failed to recover expected relationships when distances are too similar or when rates of evolution are very variable, leading to the problem of long-branch attraction. ABPR, as well as other ACO-based algorithms, is emerging as a fast and accurate alternative method of phylogenetic estimation for large data sets.

  17. CORE: a phylogenetically-curated 16S rDNA database of the core oral microbiome.

    Ann L Griffen

    Comparing bacterial 16S rDNA sequences to GenBank and other large public databases via BLAST often provides results of little use for identification and taxonomic assignment of the organisms of interest. The human microbiome, and in particular the oral microbiome, includes many taxa, and accurate identification of sequence data is essential for studies of these communities. For this purpose, a phylogenetically curated 16S rDNA database of the core oral microbiome, CORE, was developed. The goal was to include a comprehensive and minimally redundant representation of the bacteria that regularly reside in the human oral cavity with computationally robust classification at the level of species and genus. Clades of cultivated and uncultivated taxa were formed based on sequence analyses using multiple criteria, including maximum-likelihood-based topology and bootstrap support, genetic distance, and previous naming. A number of classification inconsistencies for previously named species, especially at the level of genus, were resolved. The performance of the CORE database for identifying clinical sequences was compared to that of three publicly available databases, GenBank nr/nt, RDP and HOMD, using a set of sequencing reads that had not been used in creation of the database. CORE offered improved performance compared to other public databases for identification of human oral bacterial 16S sequences by a number of criteria. In addition, the CORE database and phylogenetic tree provide a framework for measures of community divergence, and the focused size of the database offers advantages of efficiency for BLAST searching of large datasets. The CORE database is available as a searchable interface and for download at http://microbiome.osu.edu.

  18. Dengue virus type 3 in Brazil: a phylogenetic perspective

    Josélio Maria Galvão de Araújo

    2009-05-01

    Circulation of a new dengue virus (DENV-3) genotype was recently described in Brazil and Colombia, but the precise classification of this genotype has been controversial. Here we perform phylogenetic and nucleotide-distance analyses of the envelope gene, which support the subdivision of DENV-3 strains into five distinct genotypes (GI to GV) and confirm the classification of the new South American genotype as GV. The extremely low genetic distances between Brazilian GV strains and the prototype Philippines/L11423 GV strain isolated in 1956 raise important questions regarding the origin of GV in South America.

  19. Ultrafast Approximation for Phylogenetic Bootstrap

    Bui Quang Minh; Nguyen, Thi; von Haeseler, Arndt

    2013-01-01

    Nonparametric bootstrap has been a widely used tool in phylogenetic analysis to assess the clade support of phylogenetic trees. However, with the rapidly growing amount of data, this task remains a computational bottleneck. Recently, approximation methods such as the RAxML rapid bootstrap (RBS) and

  20. Quartets and unrooted phylogenetic networks.

    Gambette, Philippe; Berry, Vincent; Paul, Christophe

    2012-08-01

    Phylogenetic networks were introduced to describe evolution in the presence of exchanges of genetic material between coexisting species or individuals. Split networks in particular were introduced as a special kind of abstract network to visualize conflicts between phylogenetic trees which may correspond to such exchanges. More recently, methods were designed to reconstruct explicit phylogenetic networks (whose vertices can be interpreted as biological events) from triplet data. In this article, we link abstract and explicit networks through their combinatorial properties, by introducing the unrooted analog of level-k networks. In particular, we give an equivalence theorem between circular split systems and unrooted level-1 networks. We also show how to adapt to quartets some existing results on triplets, in order to reconstruct unrooted level-k phylogenetic networks. These results give an interesting perspective on the combinatorics of phylogenetic networks and also raise algorithmic and combinatorial questions. PMID:22809417

  1. PSG-Based Classification of Sleep Phases

    Králík, M.

    2015-01-01

    This work focuses on the classification of sleep phases using an artificial neural network. An unconventional approach was used to calculate classification features from polysomnographic (PSG) data of real patients. This approach makes it possible to increase the time resolution of the analysis and thus to achieve more accurate classification results.

  2. Classification of pmoA amplicon pyrosequences using BLAST and the lowest common ancestor method in MEGAN

    Dumont, Marc G.; Lüke, Claudia; Deng, Yongcui; Frenzel, Peter

    2014-01-01

    The classification of high-throughput sequencing data of protein-encoding genes is not as well established as for 16S rRNA. The objective of this work was to develop a simple and accurate method of classifying large datasets of pmoA sequences, a common marker for methanotrophic bacteria. A taxonomic system for pmoA was developed based on a phylogenetic analysis of available sequences. The taxonomy incorporates the known diversity of pmoA present in public databases, including both sequences f...
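
    The lowest common ancestor (LCA) step named in the title can be illustrated with a minimal Python sketch. It assumes each BLAST hit has already been mapped to a root-to-leaf lineage; the function name `lowest_common_ancestor` and the example lineages are illustrative assumptions, not MEGAN's implementation or the pmoA taxonomy developed in the paper.

```python
def lowest_common_ancestor(lineages):
    """Return the deepest taxon shared by all root-to-leaf lineages.

    Each lineage is a list of taxon names ordered from the root downwards.
    Returns None when the lineages share no common prefix (or none are given).
    """
    if not lineages:
        return None
    shared = None
    # Walk the ranks in parallel; stop at the first rank where the hits disagree.
    for taxa_at_rank in zip(*lineages):
        if len(set(taxa_at_rank)) != 1:
            break
        shared = taxa_at_rank[0]
    return shared


# Hypothetical lineages for two BLAST hits of one read (illustration only).
hits = [
    ["Bacteria", "Proteobacteria", "Gammaproteobacteria", "Methylococcaceae", "Methylobacter"],
    ["Bacteria", "Proteobacteria", "Gammaproteobacteria", "Methylococcaceae", "Methylomonas"],
]
assigned = lowest_common_ancestor(hits)  # the hits agree down to the family level
```

    A read whose hits disagree at the genus level is thus assigned conservatively to the deepest rank on which all hits agree.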

  3. Phylogenetic and biogeographic analysis of sphaerexochine trilobites.

    Curtis R Congreve

    BACKGROUND: Sphaerexochinae is a speciose and widely distributed group of cheirurid trilobites. Their temporal range extends from the earliest Ordovician through the Silurian, and they survived the end Ordovician mass extinction event (the second largest mass extinction in Earth history). Prior to this study, the individual evolutionary relationships within the group had yet to be determined utilizing rigorous phylogenetic methods. Understanding these evolutionary relationships is important for producing a stable classification of the group, and will be useful in elucidating the effects the end Ordovician mass extinction had on the evolutionary and biogeographic history of the group. METHODOLOGY/PRINCIPAL FINDINGS: Cladistic parsimony analysis of cheirurid trilobites assigned to the subfamily Sphaerexochinae was conducted to evaluate phylogenetic patterns and produce a hypothesis of relationship for the group. This study utilized the program TNT, and the analysis included thirty-one taxa and thirty-nine characters. The results of this analysis were then used in a Lieberman-modified Brooks Parsimony Analysis to analyze biogeographic patterns during the Ordovician-Silurian. CONCLUSIONS/SIGNIFICANCE: The genus Sphaerexochus was found to be monophyletic, consisting of two smaller clades (one composed entirely of Ordovician species and another composed of Silurian and Ordovician species). By contrast, the genus Kawina was found to be paraphyletic. It is a basal grade that also contains taxa formerly assigned to Cydonocephalus. Phylogenetic patterns suggest Sphaerexochinae is a relatively distinctive trilobite clade because it appears to have been largely unaffected by the end Ordovician mass extinction. Finally, the biogeographic analysis yields two major conclusions about Sphaerexochus biogeography: Bohemia and Avalonia were close enough during the Silurian to exchange taxa; and during the Ordovician there was dispersal between Eastern Laurentia and

  4. High-resolution phylogenetic microbial community profiling

    Singer, Esther; Coleman-Derr, Devin; Bowman, Brett; Schwientek, Patrick; Clum, Alicia; Copeland, Alex; Ciobanu, Doina; Cheng, Jan-Fang; Gies, Esther; Hallam, Steve; Tringe, Susannah; Woyke, Tanja

    2014-03-17

    The representation of bacterial and archaeal genome sequences is strongly biased towards cultivated organisms, which belong to merely four phylogenetic groups. Functional information and inter-phylum level relationships are still largely underexplored for candidate phyla, which are often referred to as microbial dark matter. Furthermore, a large portion of the 16S rRNA gene records in the GenBank database are labeled as environmental samples and unclassified, which is in part due to low read accuracy, potential chimeric sequences produced during PCR amplifications and the low resolution of short amplicons. In order to improve the phylogenetic classification of novel species and advance our knowledge of the ecosystem function of uncultivated microorganisms, high-throughput full length 16S rRNA gene sequencing methodologies with reduced biases are needed. We evaluated the performance of PacBio single-molecule real-time (SMRT) sequencing in high-resolution phylogenetic microbial community profiling. For this purpose, we compared PacBio and Illumina metagenomic shotgun and 16S rRNA gene sequencing of a mock community as well as of an environmental sample from Sakinaw Lake, British Columbia. Sakinaw Lake is known to contain a large percentage of microbial species from candidate phyla. Sequencing results show that community structure based on PacBio shotgun and 16S rRNA gene sequences is highly similar in both the mock and the environmental communities. Resolution power and community representation accuracy from SMRT sequencing data appeared to be independent of GC content of microbial genomes and was higher when compared to Illumina-based metagenome shotgun and 16S rRNA gene (iTag) sequences, e.g. full-length sequencing resolved all 23 OTUs in the mock community, while iTags did not resolve closely related species. SMRT sequencing hence offers various potential benefits when characterizing uncharted microbial communities.

  5. [Foundations of the new phylogenetics].

    Pavlinov, I Ia

    2004-01-01

    Evolutionary thinking is the core of modern biology. Because of this, phylogenetics, which deals with historical reconstructions in biology, holds a priority position among biological disciplines. The second half of the 20th century witnessed growing interest in phylogenetic reconstructions at the macrotaxonomic level, which replaced the microevolutionary studies that dominated from the 1930s to the 1960s. This meant a shift from population thinking to phylogenetic thinking, but it was not a revival of classical phylogenetics; rather, a new approach emerged that was christened the New Phylogenetics. It arose from the merging of three disciplines that had developed independently during the 1960s and 1970s, namely cladistics, numerical phyletics, and molecular phylogenetics (now basically genophyletics). Thus, the new phylogenetics can be defined as a branch of evolutionary biology aimed at elaborating "parsimonious" cladistic hypotheses by means of numerical methods on the basis of mostly molecular data. Classical phylogenetics, the historical predecessor of the new one, emerged from a natural-philosophical worldview that included a superorganismal idea of the biota. According to that view, historical development (phylogeny) was thought to be analogous to individual development (ontogeny), so its most basic features were progressive parallel developments of "parts" (taxa), supplemented with the Darwinian concept of monophyly. Two predominant traditions diverged within classical phylogenetics according to their particular interpretation of the relation between these concepts. One of them (Cope, Severtzow) belittled monophyly and paid most attention to progressive parallel developments of morphological traits. Such an attitude turned this kind of phylogenetics into semogenetics, dealing primarily with the evolution of structures and not of taxa. Another tradition (Haeckel) considered both monophyletic and parallel origins of taxa jointly: in the middle of the 20th century it was split into

  6. A phylogenetic analysis of the myxobacteria: basis for their classification

    Shimkets, L.; Woese, C. R.

    1992-01-01

    The primary sequence and secondary structural features of the 16S rRNA were compared for 12 different myxobacteria representing all the known cultivated genera. Analysis of these data shows the myxobacteria to form a monophyletic grouping consisting of three distinct families, which lies within the delta subdivision of the purple bacterial phylum. The composition of the families is consistent with differences in cell and spore morphology, cell behavior, and pigment and secondary metabolite production but is not correlated with the morphological complexity of the fruiting bodies. The Nannocystis exedens lineage has evolved at an unusually rapid pace and its rRNA shows numerous primary and secondary structural idiosyncrasies.

  7. Bayesian Classification in Medicine: The Transferability Question

    Zagoria, Ronald J.; Reggia, James A.; Price, Thomas R.; Banko, Maryann

    1981-01-01

    Using probabilities derived from a geographically distant patient population, we applied Bayesian classification to categorize stroke patients by etiology. Performance was assessed both by error rate and with a new linear accuracy coefficient. This approach to patient classification was found to be surprisingly accurate when compared to classification by two neurologists and to classification by the Bayesian method using “low cost” local and subjective probabilities. We conclude that for some...

  8. Quantum Simulation of Phylogenetic Trees

    Ellinas, Demosthenes; Jarvis, Peter

    2011-01-01

    Quantum simulations constructing probability tensors of biological multi-taxa in phylogenetic trees are proposed, in terms of positive trace preserving maps, describing evolving systems of quantum walks with multiple walkers. Basic phylogenetic models applied to trees of various topologies are simulated following appropriate decoherent quantum circuits. Quantum simulations of statistical inference for aligned sequences of biological characters are provided in terms of a quantum pruning map o...

  9. Application of Data Mining in Protein Sequence Classification

    Suprativ Saha

    2012-11-01

    Protein sequence classification involves feature selection for accurate classification. Popular protein sequence classification techniques involve extraction of specific features from the sequences. Researchers apply well-known classification techniques such as neural networks, genetic algorithms, Fuzzy ARTMAP, and rough set classifiers for accurate classification. This paper presents a review of three different classification models: the neural network model, the fuzzy ARTMAP model, and the rough set classifier model. This is followed by a new technique for classifying protein sequences. The proposed model is typically implemented with a custom-designed tool and tries to reduce the computational overheads encountered by earlier approaches and increase the accuracy of classification.

  10. Molecular systematics of Volvocales (Chlorophyceae, Chlorophyta) based on exhaustive 18S rRNA phylogenetic analyses.

    Nakada, Takashi; Misawa, Kazuharu; Nozaki, Hisayoshi

    2008-07-01

    The taxonomy of Volvocales (Chlorophyceae, Chlorophyta) was traditionally based solely on morphological characteristics. However, because recent molecular phylogeny largely contradicts the traditional subordinal and familial classifications, no classification system has yet been established that describes the subdivision of Volvocales in a manner consistent with the phylogenetic relationships. Towards development of a natural classification system at and above the generic level, identification and sorting of hundreds of sequences based on subjective phylogenetic definitions is a significant step. We constructed an 18S rRNA gene phylogeny based on 449 volvocalean sequences collected using exhaustive BLAST searches of the GenBank database. Many chimeric sequences, which can cause fallacious phylogenetic trees, were detected and excluded during data collection. The results revealed 21 strongly supported primary clades within phylogenetically redefined Volvocales. Phylogenetic classification following PhyloCode was proposed based on the presented 18S rRNA gene phylogeny along with the results of previous combined 18S and 26S rRNA and chloroplast multigene analyses. PMID:18430591

  11. The phylogenetic utility of chloroplast and nuclear DNA markers and the phylogeny of the Rubiaceae tribe Spermacoceae.

    Kårehed, Jesper; Groeninckx, Inge; Dessein, Steven; Motley, Timothy J; Bremer, Birgitta

    2008-12-01

    The phylogenetic utility of chloroplast (atpB-rbcL, petD, rps16, trnL-F) and nuclear (ETS, ITS) DNA regions was investigated for the tribe Spermacoceae of the coffee family (Rubiaceae). ITS was, despite often raised cautions of its utility at higher taxonomic levels, shown to provide the highest number of parsimony informative characters, in partitioned Bayesian analyses it yielded the fewest trees in the 95% credible set, it resolved the highest proportion of well resolved clades, and was the most accurate region as measured by the partition metric and the proportion of correctly resolved clades (well supported clades retrieved from a combined analysis regarded as "true"). For Hedyotis, the nuclear 5S-NTS was shown to be potentially as useful as ITS, despite its shorter sequence length. The chloroplast region being the most phylogenetically informative was the petD group II intron. We also present a phylogeny of Spermacoceae based on a Bayesian analysis of the four chloroplast regions, ITS, and ETS combined. Spermacoceae are shown to be monophyletic. Clades supported by high posterior probabilities are discussed, especially in respect to the current generic classification. Notably, Oldenlandia is polyphyletic, the two subgenera of Kohautia are not sister taxa, and Hedyotis should be treated in a narrow sense to include only Asian species. PMID:18950720

  12. Accurate classification of 17 AGNs detected with Swift/BAT

    Parisi, P; Jimenez-Bailon, E; Chavushyan, V; Malizia, A; Landi, R; Molina, M; Fiocchi, M; Palazzi, E; Bassani, L; Bazzano, A; Bird, A J; Dean, A J; Galaz, G; Mason, E; Minniti, D; Morelli, L; Stephen, J B; Ubertini, P

    2009-01-01

    Through an optical campaign performed at 5 telescopes located in the northern and the southern hemispheres, plus archival data from two on line sky surveys, we have obtained optical spectroscopy for 17 counterparts of suspected or poorly studied hard X-ray emitting active galactic nuclei (AGNs) detected with Swift/BAT in order to determine or better classify their nature. We find that 7 sources of our sample are Type 1 AGNs, 9 are Type 2 AGNs, and 1 object is an X-ray bright optically normal galaxy; the redshifts of these objects lie in a range between 0.012 and 0.286. For all these sources, X-ray data analysis was also performed to estimate their absorption column and to search for possible Compton thick candidates. Among our type 2 objects, we did not find any clear Compton thick AGN, but at least 6 out of 9 of them are highly absorbed (N_H > 10^23 cm^-2), while one does not require intrinsic absorption; i.e., it appears to be a naked Seyfert 2 galaxy.

  13. An Innovative Imputation and Classification Approach for Accurate Disease Prediction

    UshaRani, Yelipe; Sammulal, P.

    2016-01-01

    Imputing missing attribute values in medical datasets in order to extract hidden knowledge from them is a challenging research topic. One cannot eliminate missing values in medical records. Some tests may not have been conducted for cost reasons, values may have been missed when conducting clinical trials, or values may not have been recorded, to name some of the reasons. Data mining researchers have been proposing various approa...

  14. Accurate mobile malware detection and classification in the cloud

    Wang, Xiaolei; Yang, Yuexiang; Zeng, Yingzhi

    2015-01-01

    As the dominant smartphone operating system, Android has attracted the attention of malware authors and researchers alike. The number of types of Android malware is increasing rapidly despite the considerable number of proposed malware analysis systems. In this paper, by taking advantage of the low false-positive rate of misuse detection and the ability of anomaly detection to detect zero-day malware, we propose a novel hybrid detection system based on a new op...

  15. Phylogenetic placement of the ectomycorrhizal genus Cenococcum in Gloniaceae (Dothideomycetes).

    Spatafora, Joseph W; Owensby, C Alisha; Douhan, Greg W; Boehm, Eric W A; Schoch, Conrad L

    2012-01-01

    Cenococcum is a genus of ectomycorrhizal Ascomycota that has a broad host range and geographic distribution. It is not known to produce either meiotic or mitotic spores and is known to exist only in the form of hyphae, sclerotia and host-colonized ectomycorrhizal root tips. Due to its lack of sexual and asexual spores and reproductive structures, it has proven difficult to incorporate into traditional classification within Ascomycota. Molecular phylogenetic studies of ribosomal RNA placed Cenococcum in Dothideomycetes, but the definitive identification of closely related taxa remained elusive. Here we report a phylogenetic analysis of five nuclear loci (SSU, LSU, TEF1, RPB1, RPB2) of Dothideomycetes that placed Cenococcum as a close relative of the genus Glonium of Gloniaceae (Pleosporomycetidae incertae sedis) with strong statistical support. Glonium is a genus of saprobic Dothideomycetes that produces darkly pigmented, carbonaceous, hysteriate apothecia and is not known to be biotrophic. Evolution of ectomycorrhizae, Cenococcum and Dothideomycetes is discussed. PMID:22453119

  16. Accurate Finite Difference Algorithms

    Goodrich, John W.

    1996-01-01

    Two families of finite difference algorithms for computational aeroacoustics are presented and compared. All of the algorithms are single-step explicit methods; they have the same order of accuracy in both space and time, with examples up to eleventh order, and they have multidimensional extensions. One of the algorithm families has spectral-like high resolution. Propagation with high-order and high-resolution algorithms can produce accurate results after O(10^6) periods of propagation with eight grid points per wavelength.
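
    As a low-order illustration of what a single-step explicit method with equal order in space and time looks like, the sketch below implements the classical Lax-Wendroff scheme (second order in both) for the linear advection equation u_t + a u_x = 0 on a periodic grid. This is a standard textbook scheme swapped in for illustration, not one of the algorithms presented in the report; the function name `lax_wendroff_step` is an assumption.

```python
def lax_wendroff_step(u, courant):
    """Advance u by one time step of u_t + a*u_x = 0 on a periodic grid.

    courant is the Courant number a*dt/dx; the scheme is stable for
    |courant| <= 1 and is second-order accurate in both space and time.
    """
    n = len(u)
    new_u = []
    for j in range(n):
        left = u[j - 1]          # index -1 wraps around: periodic boundary
        right = u[(j + 1) % n]
        new_u.append(
            u[j]
            - 0.5 * courant * (right - left)                     # centered advection term
            + 0.5 * courant ** 2 * (right - 2.0 * u[j] + left)   # second-order time correction
        )
    return new_u


# With a Courant number of exactly 1 the update reduces to a one-cell shift.
shifted = lax_wendroff_step([0.0, 1.0, 0.0, 0.0], 1.0)
```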

  17. Multiple Sparse Representations Classification.

    Plenge, Esben; Klein, Stefan; Klein, Stefan S; Niessen, Wiro J; Meijering, Erik

    2015-01-01

    Sparse representations classification (SRC) is a powerful technique for pixelwise classification of images and it is increasingly being used for a wide variety of image analysis tasks. The method uses sparse representation and learned redundant dictionaries to classify image pixels. In this empirical study we propose to further leverage the redundancy of the learned dictionaries to achieve a more accurate classifier. In conventional SRC, each image pixel is associated with a small patch surrounding it. Using these patches, a dictionary is trained for each class in a supervised fashion. Commonly, redundant/overcomplete dictionaries are trained and image patches are sparsely represented by a linear combination of only a few of the dictionary elements. Given a set of trained dictionaries, a new patch is sparse coded using each of them, and subsequently assigned to the class whose dictionary yields the minimum residual energy. We propose a generalization of this scheme. The method, which we call multiple sparse representations classification (mSRC), is based on the observation that an overcomplete, class specific dictionary is capable of generating multiple accurate and independent estimates of a patch belonging to the class. So instead of finding a single sparse representation of a patch for each dictionary, we find multiple, and the corresponding residual energies provide an enhanced statistic which is used to improve classification. We demonstrate the efficacy of mSRC for three example applications: pixelwise classification of texture images, lumen segmentation in carotid artery magnetic resonance imaging (MRI), and bifurcation point detection in carotid artery MRI. We compare our method with conventional SRC, K-nearest neighbor, and support vector machine classifiers. The results show that mSRC outperforms SRC and the other reference methods. In addition, we present an extensive evaluation of the effect of the main mSRC parameters: patch size, dictionary size, and

  18. Strategic Classification

    Hardt, Moritz; Megiddo, Nimrod; Papadimitriou, Christos; Wootters, Mary

    2015-01-01

    Machine learning relies on the assumption that unseen test instances of a classification problem follow the same distribution as observed training data. However, this principle can break down when machine learning is used to make important decisions about the welfare (employment, education, health) of strategic individuals. Knowing information about the classifier, such individuals may manipulate their attributes in order to obtain a better classification outcome. As a result of this behavior...

  19. HYBRID INTERNET TRAFFIC CLASSIFICATION TECHNIQUE

    Li Jun; Zhang Shunyi; Lu Yanqing; Yan Junrong

    2009-01-01

    Accurate, real-time classification of network traffic is significant to network operation and management tasks such as QoS differentiation, traffic shaping and security surveillance. However, with many newly emerged P2P applications using dynamic port numbers, masquerading techniques, and payload encryption to avoid detection, traditional classification approaches turn out to be ineffective. In this paper, we present a layered hybrid system to classify current Internet traffic, motivated by the variety of network activities and their requirements for traffic classification. The proposed method can achieve fast and accurate traffic classification with low overhead and is robust enough to accommodate both known and unknown/encrypted applications. Furthermore, it is feasible to use in the context of real-time traffic classification. Our experimental results show the distinct advantages of the proposed classification system compared with the one-step Machine Learning (ML) approach.

  20. Phylogenetic Position of Barbus lacerta Heckel, 1843

    Mustafa Korkmaz

    2015-11-01

    Phylogenetic reconstruction yielded five clades, and in the phylogenetic tree Barbus lacerta was determined to be the sister group of Barbus macedonicus, Barbus oligolepis and Barbus plebejus complex.

  1. DendroBlast: approximate phylogenetic trees in the absence of multiple sequence alignments

    Kelly, S.; Maini, P. K.

    2013-01-01

    The rapidly growing availability of genome information has created considerable demand for both fast and accurate phylogenetic inference algorithms. We present a novel method called DendroBLAST for reconstructing phylogenetic dendrograms/trees from protein sequences using BLAST. This method differs from other methods by incorporating a simple model of sequence evolution to test the effect of introducing sequence changes on the reliability of the bipartitions in the inferred tree. Using realis...

  2. Phylogenetic Distribution of Fungal Sterols

    Weete, John D.; Abril, Maritza; Blackwell, Meredith

    2010-01-01

    Background: Ergosterol has been considered the “fungal sterol” for almost 125 years; however, additional sterol data superimposed on a recent molecular phylogeny of kingdom Fungi reveals a different and more complex situation. Methodology/Principal Findings: The interpretation of sterol distribution data in a modern phylogenetic context indicates that there is a clear trend from cholesterol and other Δ5 sterols in the earliest diverging fungal species to ergosterol in later diverging fungi. The...

  3. PHYLOGENETIC ANALYSIS AMONG FOUR SECTIONS OF GENUS DENDROBIUM SW. (ORCHIDACEAE) IN PENINSULAR MALAYSIA USING RBCL SEQUENCE DATA

    2013-01-01

    Phylogenetic analysis using chloroplast DNA, the ribulose-bisphosphate carboxylase gene (rbcL), was conducted to examine relationships among four sections of the genus Dendrobium (Orchidaceae): Aporum, Crumenata, Strongyle, and Bolbidium in Peninsular Malaysia. Classifications based on morphological characters have not been able to clearly divide these four sections; therefore, deeper and more detailed analyses are required to ascertain their status. In this study, the phylogenetic relationships amo...

  4. Combinatorial Approaches to Accurate Identification of Orthologous Genes

    Shi, Guanqun

    2011-01-01

    The accurate identification of orthologous genes across different species is a critical and challenging problem in comparative genomics and has a wide spectrum of biological applications including gene function inference, evolutionary studies and systems biology. During the past several years, many methods have been proposed for ortholog assignment based on sequence similarity, phylogenetic approaches, synteny information, and genome rearrangement. Although these methods share many commonly a...

  5. Text Classification using Artificial Intelligence

    Kamruzzaman, S M

    2010-01-01

    Text classification is the process of classifying documents into predefined categories based on their content. It is the automated assignment of natural language texts to predefined categories. Text classification is the primary requirement of text retrieval systems, which retrieve texts in response to a user query, and text understanding systems, which transform text in some way such as producing summaries, answering questions or extracting data. Existing supervised learning algorithms for classifying text need sufficient documents to learn accurately. This paper presents a new algorithm for text classification using an artificial intelligence technique that requires fewer documents for training. Instead of using words, word relations, i.e. association rules derived from these words, are used to derive the feature set from pre-classified text documents. The concept of the naive Bayes classifier is then used on the derived features, and finally a single concept of the genetic algorithm is added for final classification. A syste...

  6. Text Classification using Data Mining

    Kamruzzaman, S M; Hasan, Ahmed Ryadh

    2010-01-01

    Text classification is the process of classifying documents into predefined categories based on their content. It is the automated assignment of natural language texts to predefined categories. Text classification is the primary requirement of text retrieval systems, which retrieve texts in response to a user query, and text understanding systems, which transform text in some way such as producing summaries, answering questions or extracting data. Existing supervised learning algorithms to automatically classify text need sufficient documents to learn accurately. This paper presents a new algorithm for text classification using data mining that requires fewer documents for training. Instead of using words, word relations, i.e. association rules derived from these words, are used to derive the feature set from pre-classified text documents. The concept of the Naive Bayes classifier is then used on the derived features, and finally a single concept of the Genetic Algorithm is added for final classification. A system based on the...

  7. Transforming phylogenetic networks: Moving beyond tree space.

    Huber, Katharina T; Moulton, Vincent; Wu, Taoyang

    2016-09-01

    Phylogenetic networks are a generalization of phylogenetic trees that are used to represent reticulate evolution. Unrooted phylogenetic networks form a special class of such networks, which naturally generalize unrooted phylogenetic trees. In this paper we define two operations on unrooted phylogenetic networks, one of which is a generalization of the well-known nearest-neighbor interchange (NNI) operation on phylogenetic trees. We show that any unrooted phylogenetic network can be transformed into any other such network using only these operations. This generalizes the well-known fact that any phylogenetic tree can be transformed into any other such tree using only NNI operations. It also allows us to define a generalization of tree space and to define some new metrics on unrooted phylogenetic networks. To prove our main results, we employ some fascinating new connections between phylogenetic networks and cubic graphs that we have recently discovered. Our results should be useful in developing new strategies to search for optimal phylogenetic networks, a topic that has recently generated some interest in the literature, as well as for providing new ways to compare networks. PMID:27224010
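
    The nearest-neighbor interchange (NNI) on trees that the paper generalizes can be sketched in a few lines of Python. The sketch assumes an unrooted tree stored as a dictionary mapping each vertex to the set of its neighbors; the function name `nni_neighbors` and the vertex labels are illustrative assumptions, and the two network operations defined in the paper are not reproduced here.

```python
def nni_neighbors(adj, u, v):
    """Return the two trees reachable by one NNI across the internal edge (u, v).

    adj maps each vertex to the set of its neighbors in an unrooted tree.
    Both u and v must have degree 3. The input is left unmodified.
    """
    if v not in adj[u] or len(adj[u]) != 3 or len(adj[v]) != 3:
        raise ValueError("(u, v) must be an edge joining two degree-3 vertices")
    a, b = sorted(x for x in adj[u] if x != v)   # subtrees hanging off u; a stays put
    c, d = sorted(x for x in adj[v] if x != u)   # subtrees hanging off v
    trees = []
    for swapped in (c, d):
        new = {node: set(nbrs) for node, nbrs in adj.items()}
        # Detach b from u and the chosen subtree from v ...
        new[u].remove(b)
        new[b].remove(u)
        new[v].remove(swapped)
        new[swapped].remove(v)
        # ... then reattach them on opposite sides of the edge.
        new[u].add(swapped)
        new[swapped].add(u)
        new[v].add(b)
        new[b].add(v)
        trees.append(new)
    return trees


# Quartet tree AB|CD with internal edge (u, v).
quartet = {
    "A": {"u"}, "B": {"u"}, "C": {"v"}, "D": {"v"},
    "u": {"A", "B", "v"}, "v": {"C", "D", "u"},
}
first, second = nni_neighbors(quartet, "u", "v")   # topologies AC|BD and AD|BC
```

    For a four-leaf tree the two results are exactly the other two possible quartet topologies, which is the smallest case of the fact quoted above that NNI moves connect all trees on the same leaf set.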

  8. Functional and phylogenetic ecology in R

    Swenson, Nathan G

    2014-01-01

    Functional and Phylogenetic Ecology in R is designed to teach readers to use R for phylogenetic and functional trait analyses. Over the past decade, a dizzying array of tools and methods were generated to incorporate phylogenetic and functional information into traditional ecological analyses. Increasingly these tools are implemented in R, thus greatly expanding their impact. Researchers getting started in R can use this volume as a step-by-step entryway into phylogenetic and functional analyses for ecology in R. More advanced users will be able to use this volume as a quick reference to understand particular analyses. The volume begins with an introduction to the R environment and handling relevant data in R. Chapters then cover phylogenetic and functional metrics of biodiversity; null modeling and randomizations for phylogenetic and functional trait analyses; integrating phylogenetic and functional trait information; and interfacing the R environment with a popular C-based program. This book presents a uni...

  9. A phylogenetic re-analysis of groupers with applications for ciguatera fish poisoning.

    Charlotte Schoelinck

    Ciguatera fish poisoning (CFP) is a significant public health problem due to dinoflagellates. It is responsible for one of the highest reported incidences of seafood-borne illness, and groupers are commonly reported as a source of CFP due to their position in the food chain. With the role of recent climate change on harmful algal blooms, CFP cases might become more frequent and more geographically widespread. Since there is no appropriate treatment for CFP, the most efficient solution is to regulate fish consumption. Such a strategy can only work if the fish sold are correctly identified, and it has been repeatedly shown that misidentifications and species substitutions occur in fish markets. We provide here both a DNA-barcoding reference for groupers, and a new phylogenetic reconstruction based on five genes and a comprehensive taxonomical sampling. We analyse the correlation between geographic range of species and their susceptibility to ciguatera accumulation, and the co-occurrence of ciguatoxins in closely related species, using both character mapping and statistical methods. Misidentifications were encountered in public databases, precluding accurate species identifications. Epinephelinae now includes only twelve genera (vs. 15 previously). Comparisons with the ciguatera incidences show that in some genera most species are ciguateric, but statistical tests display only a moderate correlation with the phylogeny. Atlantic species were rarely contaminated, with ciguatera occurrences being restricted to the South Pacific. The recent changes in classification based on the reanalyses of the relationships within Epinephelidae have an impact on the interpretation of the ciguatera distribution in the genera. In this context and to improve the monitoring of fish trade and safety, we need to obtain extensive data on contamination at the species level. Accurate species identifications through DNA barcoding are thus an essential tool in controlling CFP since

  10. Transporter Classification Database (TCDB)

    U.S. Department of Health & Human Services — The Transporter Classification Database details a comprehensive classification system for membrane transport proteins known as the Transporter Classification (TC)...

  11. HIV classification using coalescent theory

    Zhang, Ming [Los Alamos National Laboratory]; Leitner, Thomas K [Los Alamos National Laboratory]; Korber, Bette T [Los Alamos National Laboratory]

    2008-01-01

    Algorithms for subtype classification and breakpoint detection of HIV-1 sequences are based on a classification system of HIV-1. Hence, their quality depends heavily on this system. Owing to the way the current HIV-1 nomenclature was created, it contains inconsistencies such as the following: the phylogenetic distance between subtypes B and D is remarkably small compared with other pairs of subtypes, and is in fact more like the distance between a pair of sub-subtypes (Robertson et al., 2000); subtypes E and I no longer exist, since they were discovered to be composed of recombinants (Robertson et al., 2000); it is currently being discussed whether, instead of CRF02 being a recombinant of subtypes A and G, subtype G should be designated as a circulating recombinant form (CRF) and CRF02 as a subtype (Abecasis et al., 2007); and there are 8 complete and over 400 partial HIV genomes in the LANL database which belong neither to a subtype nor to a CRF (denoted by U). Moreover, the current classification system is somewhat arbitrary, like all complex classification systems that were created manually. It is therefore desirable to deduce the classification system of HIV systematically by an algorithm. Of course, this problem is not restricted to HIV, but applies to all fast-mutating and recombining viruses. Our work addresses the simpler subproblem of scoring classifications of given input sequences of some virus species (a classification denotes a partition of the input sequences into several subtypes and CRFs). To this end, we reconstruct ancestral recombination graphs (ARGs) of the input sequences under restrictions determined by the given classification. These restrictions are imposed in order to ensure that the reconstructed ARGs do not contradict the classification under consideration. Then, we find the ARG with maximal probability by means of Markov chain Monte Carlo methods. The probability of the most probable ARG is interpreted as a score for the classification. To our

  12. Tissue Classification

    Van Leemput, Koen; Puonti, Oula

    2015-01-01

    Computational methods for automatically segmenting magnetic resonance images of the brain have seen tremendous advances in recent years. So-called tissue classification techniques, aimed at extracting the three main brain tissue classes (white matter, gray matter, and cerebrospinal fluid), are no...... software packages such as SPM, FSL, and FreeSurfer....

  13. Classifying Classification

    Novakowski, Janice

    2009-01-01

    This article describes the experience of a group of first-grade teachers as they tackled the science process of classification, a targeted learning objective for the first grade. While the two-year process was not easy and required teachers to teach in a new, more investigation-oriented way, the benefits were great. The project helped teachers and…

  14. HoxPred: automated classification of Hox proteins using combinations of generalised profiles

    Leyns Luc

    2007-07-01

    Full Text Available Background: Correct identification of individual Hox proteins is an essential basis for their study in diverse research fields. Common methods to classify Hox proteins focus on the homeodomain that characterises homeobox transcription factors. Classification is hampered by the high conservation of this short domain. Phylogenetic tree reconstruction is a widely used but time-consuming classification method. Results: We have developed an automated procedure, HoxPred, that classifies Hox proteins in their groups of homology. The method relies on a discriminant analysis that classifies Hox proteins according to their scores for a combination of protein generalised profiles. 54 generalised profiles dedicated to each Hox homology group were produced de novo from a curated dataset of vertebrate Hox proteins. Several classification methods were investigated to select the most accurate discriminant functions. These functions were then incorporated into the HoxPred program. Conclusion: HoxPred shows a mean accuracy of 97%. Predictions on the recently sequenced stickleback fish proteome identified 44 Hox proteins, including HoxC1a, so far found only in zebrafish. Using the Uniprot databank, we demonstrate that HoxPred can efficiently contribute to large-scale automatic annotation of Hox proteins into their paralogous groups. As orthologous group predictions show a higher risk of misclassification, they should be corroborated by additional supporting evidence. HoxPred is accessible via SOAP and Web interface http://cege.vub.ac.be/hoxpred/. Complete datasets, results and source code are available at the same site.

  15. Phylogenetics of neotropical Platymiscium (Leguminosae

    Saslis-Lagoudakis, C. Haris; Chase, Mark W; Robinson, Daniel N;

    2008-01-01

    Platymiscium is a neotropical legume genus of forest trees in the Pterocarpus clade of the pantropical "dalbergioid" clade. It comprises 19 species (29 taxa), distributed from Mexico to southern Brazil. This study presents a molecular phylogenetic analysis of Platymiscium and allies inferred from...... nuclear ribosomal (nrITS) and plastid (trnL, trnL-F and matK) DNA sequence data using parsimony and Bayesian methods. Divergence times are estimated using a Bayesian method assuming a relaxed molecular clock (multidivtime). Within the Pterocarpus clade, new sister relationships are recovered: Pterocarpus...

  16. Phylogenetic placement of the Spirosomaceae

    Woese, C. R.; Maloy, S.; Mandelco, L.; Raj, H. D.

    1990-01-01

    Comparative analysis of 16S rRNA sequences shows that the family Spirosomaceae belongs within the eubacterial phylum defined by the flavobacteria and bacteroides. Its constituent genera, Spirosoma, Flectobacillus, and Runella, form a monophyletic grouping therein. The phylogenetic assignment is based not only upon evolutionary distance analysis, but also upon sequence signatures and higher order structural synapomorphies in 16S rRNA. Another genus peripherally associated with the Spirosomaceae, Ancylobacter ("Microcyclus"), does not cluster with the flavobacteria and their relatives, but rather belongs to the alpha subdivision of the purple bacteria.

  17. Making Mosquito Taxonomy Useful: A Stable Classification of Tribe Aedini that Balances Utility with Current Knowledge of Evolutionary Relationships.

    Wilkerson, Richard C; Linton, Yvonne-Marie; Fonseca, Dina M; Schultz, Ted R; Price, Dana C; Strickman, Daniel A

    2015-01-01

    The tribe Aedini (Family Culicidae) contains approximately one-quarter of the known species of mosquitoes, including vectors of deadly or debilitating disease agents. This tribe contains the genus Aedes, which is one of the three most familiar genera of mosquitoes. During the past decade, Aedini has been the focus of a series of extensive morphology-based phylogenetic studies published by Reinert, Harbach, and Kitching (RH&K). Those authors created 74 new, elevated or resurrected genera from what had been the single genus Aedes, almost tripling the number of genera in the entire family Culicidae. The proposed classification is based on subjective assessments of the "number and nature of the characters that support the branches" subtending particular monophyletic groups in the results of cladistic analyses of a large set of morphological characters of representative species. To gauge the stability of RH&K's generic groupings we reanalyzed their data with unweighted parsimony jackknife and maximum-parsimony analyses, with and without ordering 14 of the characters as in RH&K. We found that their phylogeny was largely weakly supported and their taxonomic rankings failed priority and other useful taxon-naming criteria. Consequently, we propose simplified aedine generic designations that 1) restore a classification system that is useful for the operational community; 2) enhance the ability of taxonomists to accurately place new species into genera; 3) maintain the progress toward a natural classification based on monophyletic groups of species; and 4) correct the current classification system that is subject to instability as new species are described and existing species more thoroughly defined. We do not challenge the phylogenetic hypotheses generated by the above-mentioned series of morphological studies. However, we reduce the ranks of the genera and subgenera of RH&K to subgenera or informal species groups, respectively, to preserve stability as new data become

  18. Making Mosquito Taxonomy Useful: A Stable Classification of Tribe Aedini that Balances Utility with Current Knowledge of Evolutionary Relationships.

    Richard C Wilkerson

    Full Text Available The tribe Aedini (Family Culicidae) contains approximately one-quarter of the known species of mosquitoes, including vectors of deadly or debilitating disease agents. This tribe contains the genus Aedes, which is one of the three most familiar genera of mosquitoes. During the past decade, Aedini has been the focus of a series of extensive morphology-based phylogenetic studies published by Reinert, Harbach, and Kitching (RH&K). Those authors created 74 new, elevated or resurrected genera from what had been the single genus Aedes, almost tripling the number of genera in the entire family Culicidae. The proposed classification is based on subjective assessments of the "number and nature of the characters that support the branches" subtending particular monophyletic groups in the results of cladistic analyses of a large set of morphological characters of representative species. To gauge the stability of RH&K's generic groupings we reanalyzed their data with unweighted parsimony jackknife and maximum-parsimony analyses, with and without ordering 14 of the characters as in RH&K. We found that their phylogeny was largely weakly supported and their taxonomic rankings failed priority and other useful taxon-naming criteria. Consequently, we propose simplified aedine generic designations that 1) restore a classification system that is useful for the operational community; 2) enhance the ability of taxonomists to accurately place new species into genera; 3) maintain the progress toward a natural classification based on monophyletic groups of species; and 4) correct the current classification system that is subject to instability as new species are described and existing species more thoroughly defined. We do not challenge the phylogenetic hypotheses generated by the above-mentioned series of morphological studies. However, we reduce the ranks of the genera and subgenera of RH&K to subgenera or informal species groups, respectively, to preserve stability as new

  19. Phycas: software for Bayesian phylogenetic analysis.

    Lewis, Paul O; Holder, Mark T; Swofford, David L

    2015-05-01

    Phycas is open source, freely available Bayesian phylogenetics software written primarily in C++ but with a Python interface. Phycas specializes in Bayesian model selection for nucleotide sequence data, particularly the estimation of marginal likelihoods, central to computing Bayes Factors. Marginal likelihoods can be estimated using newer methods (Thermodynamic Integration and Generalized Steppingstone) that are more accurate than the widely used Harmonic Mean estimator. In addition, Phycas supports two posterior predictive approaches to model selection: Gelfand-Ghosh and Conditional Predictive Ordinates. The General Time Reversible family of substitution models, as well as a codon model, are available, and data can be partitioned with all parameters unlinked except tree topology and edge lengths. Phycas provides for analyses in which the prior on tree topologies allows polytomous trees as well as fully resolved trees, and provides for several choices for edge length priors, including a hierarchical model as well as the recently described compound Dirichlet prior, which helps avoid overly informative induced priors on tree length. PMID:25577605
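
    As a point of reference for the Harmonic Mean estimator mentioned in this record, the sketch below computes it in log space from a list of per-sample log-likelihoods. It is only an illustration of the textbook formula (the marginal likelihood is approximated by the harmonic mean of the sampled likelihoods); the function names are hypothetical and nothing here reflects the Phycas interface or its newer estimators.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_harmonic_mean_marginal_likelihood(log_likelihoods):
    """Log of the harmonic mean of the likelihoods of posterior samples.

    Harmonic mean: ML ~ n / sum(1 / L_i), so
    log ML ~ log(n) - logsumexp(-log L_i).
    """
    n = len(log_likelihoods)
    return math.log(n) - log_sum_exp([-ll for ll in log_likelihoods])


# Hypothetical log-likelihoods, standing in for values sampled by MCMC.
samples = [-1002.3, -1001.7, -1005.9, -1003.1]
estimate = log_harmonic_mean_marginal_likelihood(samples)
```

    Because the sum is dominated by the lowest-likelihood samples, a single poor sample can shift the estimate substantially, which is one common explanation for why more stable estimators are preferred.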

  20. Neuromuscular disease classification system

    Sáez, Aurora; Acha, Begoña; Montero-Sánchez, Adoración; Rivas, Eloy; Escudero, Luis M.; Serrano, Carmen

    2013-06-01

    Diagnosis of neuromuscular diseases is based on subjective visual assessment of biopsies from patients by the pathologist specialist. A system for objective analysis and classification of muscular dystrophies and neurogenic atrophies through muscle biopsy images of fluorescence microscopy is presented. The procedure starts with an accurate segmentation of the muscle fibers using mathematical morphology and a watershed transform. A feature extraction step is carried out in two parts: 24 features that pathologists take into account to diagnose the diseases and 58 structural features that the human eye cannot see, based on the assumption that the biopsy is considered as a graph, where the nodes are represented by each fiber, and two nodes are connected if two fibers are adjacent. A feature selection using sequential forward selection and sequential backward selection methods, a classification using a Fuzzy ARTMAP neural network, and a study of grading the severity are performed on these two sets of features. A database consisting of 91 images was used: 71 images for the training step and 20 as the test. A classification error of 0% was obtained. It is concluded that the addition of features undetectable by the human visual inspection improves the categorization of atrophic patterns.

  1. Many-core algorithms for statistical phylogenetics

    Suchard, Marc A.; Rambaut, Andrew

    2009-01-01

    Motivation: Statistical phylogenetics is computationally intensive, resulting in considerable attention meted on techniques for parallelization. Codon-based models allow for independent rates of synonymous and replacement substitutions and have the potential to more adequately model the process of protein-coding sequence evolution with a resulting increase in phylogenetic accuracy. Unfortunately, due to the high number of codon states, computational burden has largely thwarted phylogenetic re...

  2. Phylogenetic diversity of freshwater picocyanobacteria

    Callieri, Cristiana; Coci, Manuela

    2012-01-01

    Picocyanobacteria are photosynthetic prokaryotes, coccoid or rod-shaped, with a cell diameter < 2 µm. They are common in lakes and oceans, and abundant across a wide spectrum of trophic conditions (Callieri et al. 2012). The dominant genus of freshwater picocyanobacteria is Synechococcus. Analysis of 16S rRNA gene of freshwater Synechococcus showed its polyphyletic origin, requiring better insights in the present classification of the genus and possibly a revision. We isolated more than 40 pic...

  3. Phylogenetic organization of bacterial activity.

    Morrissey, Ember M; Mau, Rebecca L; Schwartz, Egbert; Caporaso, J Gregory; Dijkstra, Paul; van Gestel, Natasja; Koch, Benjamin J; Liu, Cindy M; Hayer, Michaela; McHugh, Theresa A; Marks, Jane C; Price, Lance B; Hungate, Bruce A

    2016-09-01

    Phylogeny is an ecologically meaningful way to classify plants and animals, as closely related taxa frequently have similar ecological characteristics, functional traits and effects on ecosystem processes. For bacteria, however, phylogeny has been argued to be an unreliable indicator of an organism's ecology owing to evolutionary processes more common to microbes such as gene loss and lateral gene transfer, as well as convergent evolution. Here we use advanced stable isotope probing with (13)C and (18)O to show that evolutionary history has ecological significance for in situ bacterial activity. Phylogenetic organization in the activity of bacteria sets the stage for characterizing the functional attributes of bacterial taxonomic groups. Connecting identity with function in this way will allow scientists to begin building a mechanistic understanding of how bacterial community composition regulates critical ecosystem functions. PMID:26943624

  4. DNA sequence analysis using hierarchical ART-based classification networks

    LeBlanc, C.; Hruska, S.I. [Florida State Univ., Tallahassee, FL (United States)]; Katholi, C.R.; Unnasch, T.R. [Univ. of Alabama, Birmingham, AL (United States)]

    1994-12-31

    Adaptive resonance theory (ART) describes a class of artificial neural network architectures that act as classification tools which self-organize, work in real-time, and require no retraining to classify novel sequences. We have adapted ART networks to provide support to scientists attempting to categorize tandem repeat DNA fragments from Onchocerca volvulus. In this approach, sequences of DNA fragments are presented to multiple ART-based networks which are linked together into two (or more) tiers; the first provides coarse sequence classification while the subsequent tiers refine the classifications as needed. The overall rating of the resulting classification of fragments is measured using statistical techniques based on those introduced to validate results from traditional phylogenetic analysis. Tests of the Hierarchical ART-based Classification Network, or HABclass network, indicate its value as a fast, easy-to-use classification tool which adapts to new data without retraining on previously classified data.

  5. Progress, pitfalls and parallel universes: a history of insect phylogenetics.

    Kjer, Karl M; Simon, Chris; Yavorskaya, Margarita; Beutel, Rolf G

    2016-08-01

    The phylogeny of insects has been both extensively studied and vigorously debated for over a century. A relatively accurate deep phylogeny had been produced by 1904. It was not substantially improved in topology until recently when phylogenomics settled many long-standing controversies. Intervening advances came instead through methodological improvement. Early molecular phylogenetic studies (1985-2005), dominated by a few genes, provided datasets that were too small to resolve controversial phylogenetic problems. Adding to the lack of consensus, this period was characterized by a polarization of philosophies, with individuals belonging to either parsimony or maximum-likelihood camps; each largely ignoring the insights of the other. The result was an unfortunate detour in which the few perceived phylogenetic revolutions published by both sides of the philosophical divide were probably erroneous. The size of datasets has been growing exponentially since the mid-1980s accompanied by a wave of confidence that all relationships will soon be known. However, large datasets create new challenges, and a large number of genes does not guarantee reliable results. If history is a guide, then the quality of conclusions will be determined by an improved understanding of both molecular and morphological evolution, and not simply the number of genes analysed. PMID:27558853

  6. LABEL: fast and accurate lineage assignment with assessment of H5N1 and H9N2 influenza A hemagglutinins.

    Samuel S Shepard

    Full Text Available The evolutionary classification of influenza genes into lineages is a first step in understanding their molecular epidemiology and can inform the subsequent implementation of control measures. We introduce a novel approach called Lineage Assignment By Extended Learning (LABEL) to rapidly determine cladistic information for any number of genes without the need for time-consuming sequence alignment, phylogenetic tree construction, or manual annotation. Instead, LABEL relies on hidden Markov model profiles and support vector machine training to hierarchically classify gene sequences by their similarity to pre-defined lineages. We assessed LABEL by analyzing the annotated hemagglutinin genes of highly pathogenic (H5N1) and low pathogenicity (H9N2) avian influenza A viruses. Using the WHO/FAO/OIE H5N1 evolution working group nomenclature, the LABEL pipeline quickly and accurately identified the H5 lineages of uncharacterized sequences. Moreover, we developed an updated clade nomenclature for the H9 hemagglutinin gene and show a similarly fast and reliable phylogenetic assessment with LABEL. While this study was focused on hemagglutinin sequences, LABEL could be applied to the analysis of any gene and shows great potential to guide molecular epidemiology activities, accelerate database annotation, and provide a data sorting tool for other large-scale bioinformatic studies.

  7. Vehicle Classification by Lane Allowance

    Vishakha Gaikwad

    2014-12-01

    Full Text Available Classification of vehicles from video is used for analysis of traffic, self-driving systems or security systems. This analysis is based on the shape, size, velocity and track of vehicles. These features characterize vehicles in background subtraction and feature extraction methods. Extraction is done by active contours and morphological operations. Extracted vehicles are classified by applying various classification techniques. The combination of features and classification techniques varies with the application. The proposed system uses a combination of K Nearest Neighbor (KNN) and Decision Tree techniques to overcome constraints. These constraints are instances of an object, overlapping of objects, and scaling factor. KNN is utilized to classify vehicles by size and lane. The decision tree combines these two features to classify accurately, which results in increased performance. This system classifies objects extracted from video into three classes: four-wheelers, bikers and heavy-duty vehicles.

  8. Phylogenetic comparative approaches for studying niche conservatism

    Cooper, Natalie; Jetz, Walter; Freckleton, Rob P.

    2010-01-01

    Analyses of phylogenetic niche conservatism (PNC) are becoming increasingly common. However, each analysis makes subtly different assumptions about the evolutionary mechanism that generates patterns of niche conservatism. To understand PNC, analyses should be conducted with reference to a clear underlying model, using appropriate methods. Here, we outline five macroevolutionary models that may underlie patterns of PNC (drift, niche retention, phylogenetic inertia, niche filling ? shifti...

  9. Demonstrating Biological Classification Using a Simulation of Natural Taxa.

    Vogt, Kenneth D.

    1995-01-01

    A review of introductory college level and high school biology texts reveals that concepts and theories behind classification are usually poorly discussed. Suggests ways in which card games can be used to teach differences between the phenetic and phylogenetic approaches. (LZ)

  10. Automatic web services classification based on rough set theory

    陈立; 张英; 宋自林; 苗壮

    2013-01-01

    With the development of web services technology, the number of existing services on the internet is growing day by day. In order to achieve automatic and accurate services classification, which can be beneficial for service-related tasks, a method for services classification based on rough set theory was proposed. First, the services descriptions were preprocessed and represented as vectors. Inspired by the attribute reduction based on discernibility matrices in rough set theory, and taking into account the characteristics of the decision table of services classification, a method based on continuous discernibility matrices was proposed for dimensionality reduction. Finally, services classification was processed automatically. In the experiment, the proposed method achieved satisfactory classification results in all five testing categories. The experimental results show that the proposed method is accurate and could be used in practical web services classification.
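
    To make the discernibility-matrix idea in this record concrete, the sketch below builds the classical (discrete) discernibility matrix of a small decision table: for every pair of objects with different decisions, it records the attributes on which they differ. This is the standard textbook construction, not the continuous variant the abstract proposes, and the attribute names and values are invented for illustration.

```python
from itertools import combinations


def discernibility_matrix(rows, decisions):
    """Map each pair (i, j) with different decisions to the set of
    condition attributes whose values differ between rows i and j."""
    matrix = {}
    for (i, a), (j, b) in combinations(enumerate(rows), 2):
        if decisions[i] != decisions[j]:
            matrix[(i, j)] = {attr for attr in a if a[attr] != b[attr]}
    return matrix


def core_attributes(matrix):
    """Attributes appearing as singleton entries cannot be removed
    without merging objects that have different decisions."""
    return {next(iter(s)) for s in matrix.values() if len(s) == 1}


# Hypothetical decision table: condition attributes plus a service category.
rows = [
    {"protocol": "soap", "domain": "travel"},
    {"protocol": "rest", "domain": "travel"},
    {"protocol": "soap", "domain": "finance"},
]
decisions = ["booking", "booking", "payment"]

matrix = discernibility_matrix(rows, decisions)
core = core_attributes(matrix)
```

    In this toy table only the hypothetical `domain` attribute is needed to separate the two categories, so an attribute reduction would keep it and could drop `protocol`.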

  11. Classification in Australia.

    McKinlay, John

    Despite some inroads by the Library of Congress Classification and short-lived experimentation with Universal Decimal Classification and Bliss Classification, Dewey Decimal Classification, with its ability in recent editions to be hospitable to local needs, remains the most widely used classification system in Australia. Although supplemented at…

  12. Classification in context

    Mai, Jens Erik

    2004-01-01

    This paper surveys classification research literature, discusses various classification theories, and shows that the focus has traditionally been on establishing a scientific foundation for classification research. This paper argues that a shift has taken place, and suggests that contemporary cla...... classification research focus on contextual information as the guide for the design and construction of classification schemes....

  13. Multi-borders classification

    Mills, Peter

    2014-01-01

    The number of possible methods of generalizing binary classification to multi-class classification increases exponentially with the number of class labels. Often, the best method of doing so will be highly problem dependent. Here we present classification software in which the partitioning of multi-class classification problems into binary classification problems is specified using a recursive control language.
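
    One way to picture the partitioning described in this record is as a nested structure over the class labels, where each internal node defines one binary problem. The sketch below is a minimal illustration under that assumption; the nested-tuple notation and function names are hypothetical and are not the control language of the software described.

```python
def leaves(node):
    """Return all class labels under a node of the hierarchy."""
    if isinstance(node, str):
        return [node]
    left, right = node
    return leaves(left) + leaves(right)


def binary_problems(node):
    """List the (left labels, right labels) binary problems defined by
    a nested-tuple hierarchy, visiting the root first."""
    if isinstance(node, str):
        return []
    left, right = node
    here = [(leaves(left), leaves(right))]
    return here + binary_problems(left) + binary_problems(right)


def one_vs_rest(labels):
    """The flat alternative: one binary problem per class label."""
    return [([c], [o for o in labels if o != c]) for c in labels]


# Hypothetical four-class problem split hierarchically.
hierarchy = (("cat", "dog"), ("car", "bus"))
hierarchical = binary_problems(hierarchy)
flat = one_vs_rest(leaves(hierarchy))
```

    A hierarchy over k labels yields k - 1 binary problems, while one-vs-rest yields k; many other partitionings are possible, which is the growth in options the abstract refers to.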

  14. Use of whole genome sequences to develop a molecular phylogenetic framework for Rhodococcus fascians and the Rhodococcus genus

    Allison L. Creason

    2014-08-01

    Full Text Available The accurate diagnosis of diseases caused by pathogenic bacteria requires a stable species classification. Rhodococcus fascians is the only documented member of its ill-defined genus that is capable of causing disease on a wide range of agriculturally important plants. Comparisons of genome sequences generated from isolates of Rhodococcus associated with diseased plants revealed a level of genetic diversity consistent with them representing multiple species. To test this, we generated a tree based on more than 1700 homologous sequences from plant-associated isolates of Rhodococcus, and obtained support from additional approaches that measure and cluster based on genome similarities. Results were consistent in supporting the definition of new Rhodococcus species within clades containing phytopathogenic members. We also used the genome sequences, along with other rhodococcal genome sequences to construct a molecular phylogenetic tree as a framework for resolving the Rhodococcus genus. Results indicated that Rhodococcus has the potential for having 20 species and also confirmed a need to revisit the taxonomic groupings within Rhodococcus.

  15. Phylogenetic analysis of Pectinidae (Bivalvia) based on the ribosomal DNA internal transcribed spacer region

    2007-01-01

    The ribosomal DNA internal transcribed spacer (ITS) region is a useful genomic region for understanding evolutionary and genetic relationships. In the current study, the molecular phylogenetic analysis of Pectinidae (Mollusca: Bivalvia) was performed using the nucleotide sequences of the nuclear ITS region in nine species of this family. The sequences were obtained from the scallop species Argopecten irradians, Mizuhopecten yessoensis, Amusium pleuronectes and Mimachlamys nobilis, and compared with the published sequences of Aequipecten opercularis, Chlamys farreri, C. distorta, M. varia, Pecten maximus, and an outgroup species Perna viridis. The molecular phylogenetic tree was constructed by the neighbor-joining and maximum parsimony methods. Phylogenetic analysis based on ITS1, ITS2, or their combination always yielded trees of similar topology. The results support the morphological classifications of bivalves and are nearly consistent with the classification into two subfamilies (Chlamydinae and Pectininae) formulated by Waller. However, A. irradians, together with A. opercularis, grouped with the genus Amusium, which suggests that they may belong to the subfamily Pectininae. The data are incompatible with the conclusion of Waller, who placed them in Chlamydinae on the basis of morphological characteristics. These results provide new insights into the evolutionary relationships among scallop species and contribute to the improvement of existing classification systems.

  16. Interactive multiclass segmentation using superpixel classification

    Mathieu, Bérengère; Crouzil, Alain; Puel, Jean-Baptiste

    2015-01-01

    This paper addresses the problem of interactive multiclass segmentation. We propose a fast and efficient new interactive segmentation method called Superpixel Classification-based Interactive Segmentation (SCIS). From a few strokes drawn by a human user over an image, this method extracts relevant semantic objects. To get a fast calculation and an accurate segmentation, SCIS uses superpixel over-segmentation and support vector machine classification. In this paper, we demonstrate that SCIS sig...

  17. Phylogenetic placement of Hydra and relationships within Aplanulata (Cnidaria: Hydrozoa).

    Nawrocki, Annalise M; Collins, Allen G; Hirano, Yayoi M; Schuchert, Peter; Cartwright, Paulyn

    2013-04-01

    The model organism Hydra belongs to the hydrozoan clade Aplanulata. Despite being a popular model system for development, little is known about the phylogenetic placement of this taxon or the relationships of its closest relatives. Previous studies have been conflicting regarding sister group relationships and have been unable to resolve deep nodes within the clade. In addition, there are several putative Aplanulata taxa that have never been sampled for molecular data or analyzed using multiple markers. Here, we combine the fast-evolving cytochrome oxidase 1 (CO1) mitochondrial marker with mitochondrial 16S, nuclear small ribosomal subunit (18S, SSU) and large ribosomal subunit (28S, LSU) sequences to examine relationships within the clade Aplanulata. We further discuss the relative contribution of four different molecular markers to resolving phylogenetic relationships within Aplanulata. Lastly, we report morphological synapomorphies for some of the major Aplanulata genera and families, and suggest new taxonomic classifications for two species of Aplanulata, Fukaurahydra anthoformis and Corymorpha intermedia, based on a preponderance of molecular and morphological data that justify the designation of these species to different genera. PMID:23280366

  18. The space of ultrametric phylogenetic trees.

    Gavryushkin, Alex; Drummond, Alexei J

    2016-08-21

    The reliability of a phylogenetic inference method from genomic sequence data is ensured by its statistical consistency. Bayesian inference methods produce a sample of phylogenetic trees from the posterior distribution given sequence data. Hence the question of statistical consistency of such methods is equivalent to the consistency of the summary of the sample. More generally, statistical consistency is ensured by the tree space used to analyse the sample. In this paper, we consider two standard parameterisations of phylogenetic time-trees used in evolutionary models: inter-coalescent interval lengths and absolute times of divergence events. For each of these parameterisations we introduce a natural metric space on ultrametric phylogenetic trees. We compare the introduced spaces with existing models of tree space and formulate several formal requirements that a metric space on phylogenetic trees must possess in order to be a satisfactory space for statistical analysis, and justify them. We show that only a few known constructions of the space of phylogenetic trees satisfy these requirements. However, our results suggest that these basic requirements are not enough to distinguish between the two metric spaces we introduce and that the choice between metric spaces requires additional properties to be considered. In particular, the summary tree minimising the square distance to the trees from the sample might be different for different parameterisations. This suggests that further fundamental insight is needed into the problem of statistical consistency of phylogenetic inference methods. PMID:27188249
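
    The two parameterisations named in this record can be related by a cumulative sum, and a small example shows why the choice matters for distances. The sketch below assumes two time-trees that share the same ranked topology, so each tree reduces to a vector of numbers; it uses a plain Euclidean distance on those vectors and is not the metric-space construction of the paper. All names and values are illustrative.

```python
import math
from itertools import accumulate


def intervals_to_times(intervals):
    """Absolute divergence times (measured from the tips) are the
    running sums of the inter-coalescent interval lengths."""
    return list(accumulate(intervals))


def times_to_intervals(times):
    """Inverse mapping: differences between consecutive event times."""
    return [t - prev for prev, t in zip([0.0] + times[:-1], times)]


def euclidean(u, v):
    """Euclidean distance between two equal-length coordinate vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Two hypothetical three-taxon time-trees with the same ranked topology,
# differing only in the length of the first inter-coalescent interval.
tree_a = [1.0, 1.0]
tree_b = [2.0, 1.0]

d_intervals = euclidean(tree_a, tree_b)
d_times = euclidean(intervals_to_times(tree_a), intervals_to_times(tree_b))
```

    Lengthening the first interval moves one coordinate in the interval parameterisation but shifts every later divergence time, so the same pair of trees is farther apart in the time parameterisation. This is a simple instance of why summaries that minimise squared distance can depend on the parameterisation.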

  19. Classification and knowledge

    Kurtz, Michael J.

    1989-01-01

    Automated procedures to classify objects are discussed. The classification problem is reviewed, and the relation of epistemology and classification is considered. The classification of stellar spectra and of resolved images of galaxies is addressed.

  20. Hazard classification methodology

    This document outlines the hazard classification methodology used to determine the hazard classification of the NIF LTAB, OAB, and the support facilities on the basis of radionuclides and chemicals. The hazard classification determines the safety analysis requirements for a facility.

  1. Remote Sensing Information Classification

    Rickman, Douglas L.

    2008-01-01

    This viewgraph presentation reviews the classification of Remote Sensing data in relation to epidemiology. Classification is a way to reduce the dimensionality and precision to something a human can understand. Classification changes SCALAR data into NOMINAL data.
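
    The statement that classification turns scalar data into nominal data can be made concrete with a threshold-based sketch. The class boundaries, the labels, and the pixel values below are all hypothetical (loosely modelled on a vegetation index) and are not taken from the presentation.

```python
from bisect import bisect_right

# Hypothetical class boundaries for a scalar index in [0, 1].
BOUNDARIES = [0.1, 0.3, 0.6]
# One more label than boundaries: below 0.1, 0.1-0.3, 0.3-0.6, above 0.6.
LABELS = [
    "water/bare",
    "sparse vegetation",
    "moderate vegetation",
    "dense vegetation",
]


def classify(value):
    """Map a scalar measurement to a nominal class label."""
    return LABELS[bisect_right(BOUNDARIES, value)]


pixels = [0.05, 0.28, 0.31, 0.74]
classes = [classify(v) for v in pixels]
print(classes)
```

    Note the loss of precision: 0.28 and 0.31 differ only slightly yet fall into different classes, while any two values between 0.3 and 0.6 become indistinguishable once classified.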

  2. Molecular systematics of the Amazonian genus Aldina, a phylogenetically enigmatic ectomycorrhizal lineage of papilionoid legumes.

    Ramos, Gustavo; de Lima, Haroldo Cavalcante; Prenner, Gerhard; de Queiroz, Luciano Paganucci; Zartman, Charles E; Cardoso, Domingos

    2016-04-01

    Aldina (Leguminosae) is among the very few ecologically successful ectomycorrhizal lineages in a family largely marked by the evolution of nodulating symbiosis. The genus comprises 20 species predominantly distributed in Amazonia and has been traditionally classified in the tribe Swartzieae because of its radial flowers with an entire calyx and numerous free stamens. The taxonomy of Aldina is complicated due to its poor representation in herbaria and the lack of a robust phylogenetic hypothesis of relationship. Recent phylogenetic analyses of matK and trnL sequences confirmed the placement of Aldina in the 50-kb inversion clade, although the genus remained phylogenetically isolated or unresolved in the context of the evolutionary history of the main early-branching papilionoid lineages. We performed maximum likelihood and Bayesian analyses of combined chloroplast datasets (matK, rbcL, and trnL) and explored the effect of incomplete taxa or missing data in order to shed light on the enigmatic phylogenetic position of Aldina. Unexpectedly, a sister relationship of Aldina with the Andira clade (Andira and Hymenolobium) is revealed. We suggest that a new tribal phylogenetic classification of the papilionoid legumes should place Aldina along with Andira and Hymenolobium. These results highlight yet another example of the independent evolution of radial floral symmetry within the early-branching Papilionoideae, a large collection of florally heterogeneous lineages dominated by papilionate or bilaterally symmetric flower morphology. PMID:26748266

  3. Texture Classification Based on Texton Features

    U Ravi Babu

    2012-08-01

    Texture analysis plays an important role in the interpretation, understanding and recognition of terrain, biomedical or microscopic images. To achieve high classification accuracy, the present paper proposes a new method based on textons. Each texture analysis method depends on how the selected texture features characterize the image. Whenever a new texture feature is derived, it is tested to determine whether it classifies the textures precisely. Not only the texture features but also the way in which they are applied is significant for precise and accurate texture classification and analysis. The present paper proposes a new texton-based method for efficient, rotationally invariant texture classification. The proposed Texton Features (TF) evaluate the relationship between the values of neighboring pixels. The proposed classification algorithm evaluates histogram-based techniques on TF for precise classification. The experimental results on various stone textures indicate the efficacy of the proposed method when compared to other methods.

  4. Discriminating the effects of phylogenetic hypothesis, tree resolution and clade age estimates on phylogenetic signal measurements.

    Seger, G D S; Duarte, L D S; Debastiani, V J; Kindel, A; Jarenkow, J A

    2013-09-01

    Understanding how species traits evolved over time is the central question to comprehend assembly rules that govern the phylogenetic structure of communities. The measurement of phylogenetic signal (PS) in ecologically relevant traits is a first step to understand phylogenetically structured community patterns. The different methods available to estimate PS make it difficult to choose which is most appropriate. Furthermore, alternative phylogenetic tree hypotheses, node resolution and clade age estimates might influence PS measurements. In this study, we evaluated to what extent these parameters affect different methods of PS analysis, and discuss advantages and disadvantages when selecting which method to use. We measured fruit/seed traits and flowering/fruiting phenology of endozoochoric species occurring in Southern Brazilian Araucaria forests and evaluated their PS using Mantel regressions, phylogenetic eigenvector regressions (PVR) and K statistic. Mantel regressions always gave less significant results compared to PVR and K statistic in all combinations of phylogenetic trees constructed. Moreover, a better phylogenetic resolution affected PS, independently of the method used to estimate it. Morphological seed traits tended to show higher PS than diaspores traits, while PS in flowering/fruiting phenology depended mostly on the method used to estimate it. This study demonstrates that different PS estimates are obtained depending on the chosen method and the phylogenetic tree resolution. This finding has implications for inferences on phylogenetic niche conservatism or ecological processes determining phylogenetic community structure. PMID:23368095

  5. Classification and Analysis of Computer Network Traffic

    Bujlow, Tomasz

    2014-01-01

    of traffic for academic purposes. We define the objective of this thesis as finding a way to evaluate the performance of various applications in a high-speed Internet infrastructure. To satisfy the objective, we needed to answer a number of research questions. The biggest extent of them concern techniques...... classification (as by using transport layer port numbers, Deep Packet Inspection (DPI), statistical classification) and assessed their usefulness in particular areas. We found that the classification techniques based on port numbers are not accurate anymore as most applications use dynamic port numbers, while...

  6. Nudivirus Genomics: Diversity and Classification

    Yong-jie Wang; John P. Burand; Johannes A. Jehle

    2007-01-01

    Nudiviruses represent a diverse group of arthropod-specific, rod-shaped dsDNA viruses. Due to similarities in pathology and morphology to members of the family Baculoviridae, they were previously classified as the so-called "non-occluded" baculoviruses. However, presently they are taxonomically orphaned and are not assigned to any virus family because of the lack of genetic relatedness to Baculoviridae. Here, we report on recent progress in the genomic analysis of Heliothis zea nudivirus 1 (HzNV-1), Oryctes rhinoceros nudivirus (OrNV), Gryllus bimaculatus nudivirus (GbNV) and Heliothis zea nudivirus 2 (HzNV-2). Gene content comparison and phylogenetic analyses indicated that the viruses share 15 core genes with baculoviruses and form a monophyletic sister group to them. Consequences of the genetic relationship are discussed for the classification of nudiviruses.

  7. DendroBLAST: approximate phylogenetic trees in the absence of multiple sequence alignments.

    Steven Kelly

    The rapidly growing availability of genome information has created considerable demand for both fast and accurate phylogenetic inference algorithms. We present a novel method called DendroBLAST for reconstructing phylogenetic dendrograms/trees from protein sequences using BLAST. This method differs from other methods by incorporating a simple model of sequence evolution to test the effect of introducing sequence changes on the reliability of the bipartitions in the inferred tree. Using realistic simulated sequence data we demonstrate that this method produces phylogenetic trees that are more accurate than other commonly used distance-based methods though not as accurate as maximum likelihood methods from good quality multiple sequence alignments. In addition to tests on simulated data, we use DendroBLAST to generate input trees for a supertree reconstruction of the phylogeny of the Archaea. This independent analysis produces an approximate phylogeny of the Archaea that has both high precision and recall when compared to previously published analysis of the same dataset using conventional methods. Taken together these results demonstrate that approximate phylogenetic trees can be produced in the absence of multiple sequence alignments, and we propose that these trees will provide a platform for improving and informing downstream bioinformatic analysis. A web implementation of the DendroBLAST method is freely available for use at http://www.dendroblast.com/.

  8. Phylogenetic structure in tropical hummingbird communities

    Graham, Catherine H; Parra, Juan L; Rahbek, Carsten;

    2009-01-01

    sustaining an expensive means of locomotion at high elevations. We found that communities in the lowlands on opposite sides of the Andes tend to be phylogenetically similar despite their large differences in species composition, a pattern implicating the Andes as an important dispersal barrier. In contrast......How biotic interactions, current and historical environment, and biogeographic barriers determine community structure is a fundamental question in ecology and evolution, especially in diverse tropical regions. To evaluate patterns of local and regional diversity, we quantified the phylogenetic...... composition of 189 hummingbird communities in Ecuador. We assessed how species and phylogenetic composition changed along environmental gradients and across biogeographic barriers. We show that humid, low-elevation communities are phylogenetically overdispersed (coexistence of distant relatives), a pattern...

  9. Marine turtle mitogenome phylogenetics and evolution

    Duchene, S.; Frey, A.; Alfaro-Núñez, A.;

    2012-01-01

    . Analyses of partial mitochondrial sequences and some nuclear markers have revealed phylogenetic inconsistencies within Cheloniidae, especially regarding the placement of the flatback. Population genetic studies based on D-Loop sequences have shown considerable structuring in species with broad geographic...

  10. A statistical approach to root system classification.

    Gernot Bodner

    2013-08-01

    Plant root systems have a key role in ecology and agronomy. Despite the rapid increase in root studies, there is still no classification that allows distinguishing among distinctive characteristics within the diversity of rooting strategies. Our hypothesis is that a multivariate approach for plant functional type identification in ecology can be applied to the classification of root systems. We demonstrate that combining principal component and cluster analysis yields a meaningful classification of rooting types based on morphological traits. The classification method presented is based on a data-defined statistical procedure without a priori decision on the classifiers. Biplot inspection is used to determine key traits and to ensure stability in cluster-based grouping. The classification method is exemplified with simulated root architectures and morphological field data. Simulated root architectures showed that morphological attributes with spatial distribution parameters capture most distinctive features within root system diversity. While developmental type (tap vs. shoot-borne systems) is a strong, but coarse classifier, topological traits provide the most detailed differentiation among distinctive groups. Adequacy of commonly available morphologic traits for classification is supported by field data. Three rooting types emerged from measured data, distinguished by diameter/weight, density and spatial distribution respectively. Similarity of root systems within distinctive groups was the joint result of phylogenetic relation and environmental as well as human selection pressure. We concluded that the data-defined classification is appropriate for integration of knowledge obtained with different root measurement methods and at various scales. Currently root morphology is the most promising basis for classification due to widely used common measurement protocols. To capture details of root diversity efforts in architectural measurement

  11. Phylogenetic relationships in the family Alloherpesviridae

    Waltzek, T.B.; Kelley, G.O.; Alfaro, M.E.; Kurobe, T.; Davison, A J; Hedrick, R.P.

    2009-01-01

    Phylogenetic relationships among herpesviruses (HVs) of mammals, birds, and reptiles have been studied extensively, whereas those among other HVs are relatively unexplored. We have reconstructed the phylogenetic relationships among 13 fish and amphibian HVs using maximum likelihood and Bayesian analyses of amino acid sequences predicted from parts of the DNA polymerase and terminase genes. The relationships among 6 of these viruses were confirmed using the partial DNA polymerase data plus the...

  12. On the analysis of phylogenetically paired designs

    Funk, Jennifer L.; Rakovski, Cyril S; Macpherson, J Michael

    2015-01-01

    As phylogenetically controlled experimental designs become increasingly common in ecology, the need arises for a standardized statistical treatment of these datasets. Phylogenetically paired designs circumvent the need for resolved phylogenies and have been used to compare species groups, particularly in the areas of invasion biology and adaptation. Despite the widespread use of this approach, the statistical analysis of paired designs has not been critically evaluated. We propose a mixed mod...

  13. Consequences of recombination on traditional phylogenetic analysis

    Schierup, M H; Hein, J

    2000-01-01

    We investigate the shape of a phylogenetic tree reconstructed from sequences evolving under the coalescent with recombination. The motivation is that evolutionary inferences are often made from phylogenetic trees reconstructed from population data even though recombination may well occur (mtDNA or...... recombination leads to a large overestimation of the substitution rate heterogeneity and the loss of the molecular clock. These results are discussed in relation to viral and mtDNA data sets. Udgivelsesdato: 2000-Oct...

  14. Phylogenetic Position of Barbus lacerta Heckel, 1843

    Mustafa Korkmaz

    2015-01-01

    The genus Barbus is characterized by a complex taxonomic structure, due to its high number of species and its morphological plasticity; it comprises more than 25 species in Europe, displaying different ecological preferences. Twenty-one taxa from the genus Barbus, including Barbus lacerta, were used in the phylogenetic analysis. Cytochrome oxidase I (COI) gene sequence analysis of Barbus lacerta is presented for the first time in this study. A phylogenetic tree (neighbor-joining and maximum likelihood analysis) was reco...

  15. Phylogenetic niche conservatism in C4 grasses.

    Liu, Hui; Edwards, Erika J; Freckleton, Robert P; Osborne, Colin P

    2012-11-01

    Photosynthetic pathway is used widely to discriminate plant functional types in studies of global change. However, independent evolutionary lineages of C(4) grasses with different variants of C(4) photosynthesis show different biogeographical relationships with mean annual precipitation, suggesting phylogenetic niche conservatism (PNC). To investigate how phylogeny and photosynthetic type differentiate C(4) grasses, we compiled a dataset of morphological and habitat information of 185 genera belonging to two monophyletic subfamilies, Chloridoideae and Panicoideae, which together account for 90 % of the world's C(4) grass species. We evaluated evolutionary variance and covariance of morphological and habitat traits. Strong phylogenetic signals were found in both morphological and habitat traits, arising mainly from the divergence of the two subfamilies. Genera in Chloridoideae had significantly smaller culm heights, leaf widths, 1,000-seed weights and stomata; they also appeared more in dry, open or saline habitats than those of Panicoideae. Controlling for phylogenetic structure showed significant covariation among morphological traits, supporting the hypothesis of phylogenetically independent scaling effects. However, associations between morphological and habitat traits showed limited phylogenetic covariance. Subfamily was a better explanation than photosynthetic type for the variance in most morphological traits. Morphology, habitat water availability, shading, and productivity are therefore all involved in the PNC of C(4) grass lineages. This study emphasized the importance of phylogenetic history in the ecology and biogeography of C(4) grasses, suggesting that divergent lineages need to be considered to fully understand the impacts of global change on plant distributions. PMID:22569558

  16. Phylogenetic distribution of fungal sterols.

    John D Weete

    BACKGROUND: Ergosterol has been considered the "fungal sterol" for almost 125 years; however, additional sterol data superimposed on a recent molecular phylogeny of kingdom Fungi reveals a different and more complex situation. METHODOLOGY/PRINCIPAL FINDINGS: The interpretation of sterol distribution data in a modern phylogenetic context indicates that there is a clear trend from cholesterol and other Delta(5) sterols in the earliest diverging fungal species to ergosterol in later diverging fungi. There are, however, deviations from this pattern in certain clades. Sterols of the diverse zoosporic and zygosporic forms exhibit structural diversity with cholesterol and 24-ethyl-Delta(5) sterols in zoosporic taxa, and 24-methyl sterols in zygosporic fungi. For example, each of the three monophyletic lineages of zygosporic fungi has distinctive major sterols, ergosterol in Mucorales, 22-dihydroergosterol in Dimargaritales, Harpellales, and Kickxellales (DHK clade), and 24-methyl cholesterol in Entomophthorales. Other departures from ergosterol as the dominant sterol include: 24-ethyl cholesterol in Glomeromycota, 24-ethyl cholest-7-enol and 24-ethyl-cholesta-7,24(28)-dienol in rust fungi, brassicasterol in Taphrinales and hypogeous pezizalean species, and cholesterol in Pneumocystis. CONCLUSIONS/SIGNIFICANCE: Five dominant end products of sterol biosynthesis (cholesterol, ergosterol, 24-methyl cholesterol, 24-ethyl cholesterol, brassicasterol), and intermediates in the formation of 24-ethyl cholesterol, are major sterols in 175 species of Fungi. Although most fungi in the most speciose clades have ergosterol as a major sterol, sterols are more varied than currently understood, and their distribution supports certain clades of Fungi in current fungal phylogenies. In addition to the intellectual importance of understanding evolution of sterol synthesis in fungi, there is practical importance because certain antifungal drugs (e.g., azoles) target reactions in

  17. Classification of the web

    Mai, Jens Erik

    2004-01-01

    This paper discusses the challenges faced by investigations into the classification of the Web and outlines inquiries that are needed to use principles for bibliographic classification to construct classifications of the Web. This paper suggests that the classification of the Web meets challenges...

  18. Towards accurate emergency response behavior

    Nuclear reactor operator emergency response behavior has persisted as a training problem through lack of information. The industry needs an accurate definition of operator behavior in adverse stress conditions, and training methods which will produce the desired behavior. Newly assembled information from fifty years of research into human behavior in both high and low stress provides a more accurate definition of appropriate operator response, and supports training methods which will produce the needed control room behavior. The research indicates that operator response in emergencies is divided into two modes, conditioned behavior and knowledge-based behavior. Methods which assure accurate conditioned behavior, and provide for the recovery of knowledge-based behavior, are described in detail.

  19. Phylogenetic and functional assessment of orthologs inference projects and methods.

    Adrian M Altenhoff

    2009-01-01

    Accurate genome-wide identification of orthologs is a central problem in comparative genomics, a fact reflected by the numerous orthology identification projects developed in recent years. However, only a few reports have compared their accuracy, and indeed, several recent efforts have not yet been systematically evaluated. Furthermore, orthology is typically only assessed in terms of function conservation, despite the phylogeny-based original definition of Fitch. We collected and mapped the results of nine leading orthology projects and methods (COG, KOG, Inparanoid, OrthoMCL, Ensembl Compara, Homologene, RoundUp, EggNOG, and OMA) and two standard methods (bidirectional best-hit and reciprocal smallest distance). We systematically compared their predictions with respect to both phylogeny and function, using six different tests. This required the mapping of millions of sequences, the handling of hundreds of millions of predicted pairs of orthologs, and the computation of tens of thousands of trees. In phylogenetic analysis or in functional analysis where high specificity is required, we find that OMA and Homologene perform best. At lower functional specificity but higher coverage level, OrthoMCL outperforms Ensembl Compara, and to a lesser extent Inparanoid. Lastly, the large coverage of the recent EggNOG can be of interest to build broad functional grouping, but the method is not specific enough for phylogenetic or detailed function analyses. In terms of general methodology, we observe that the more sophisticated tree reconstruction/reconciliation approach of Ensembl Compara was at times outperformed by pairwise comparison approaches, even in phylogenetic tests. Furthermore, we show that standard bidirectional best-hit often outperforms projects with more complex algorithms. First, the present study provides guidance for the broad community of orthology data users as to which database best suits their needs. Second, it introduces new methodology
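
    The bidirectional best-hit method named in this record can be sketched in a few lines: a gene pair is reported as a putative ortholog pair only when each gene is the other's top-scoring hit. The sketch assumes similarity scores are already available as dictionaries (for example, parsed from an all-against-all sequence search); the gene identifiers, the scores, and the function names are hypothetical, and the study's own implementation may differ.

```python
def best_hits(scores):
    """Map each query gene to its highest-scoring subject gene.

    `scores` maps (query, subject) -> similarity score, higher is better.
    On ties, the first pair encountered is kept.
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def bidirectional_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs that are each other's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Hypothetical scores for genomes A (a1..a3) and B (b1..b3).
a_vs_b = {
    ("a1", "b1"): 250, ("a1", "b2"): 90,
    ("a2", "b2"): 180,
    ("a3", "b2"): 120,
}
b_vs_a = {
    ("b1", "a1"): 245,
    ("b2", "a2"): 175, ("b2", "a3"): 118,
    ("b3", "a3"): 60,
}

pairs = bidirectional_best_hits(a_vs_b, b_vs_a)
print(pairs)
```

    Here a3 is left unpaired: its best hit is b2, but b2 prefers a2. That strictness is what makes the method simple and specific, at the cost of missing one-to-many relationships.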

  20. Hierarchical classification of glycoside hydrolases.

    Naumoff, D G

    2011-06-01

    This review deals with structural and functional features of glycoside hydrolases, a widespread group of enzymes present in almost all living organisms. Their catalytic domains are grouped into 120 amino acid sequence-based families in the international classification of the carbohydrate-active enzymes (CAZy database). At a higher hierarchical level some of these families are combined in 14 clans. Enzymes of the same clan have common evolutionary origin of their genes and share the most important functional characteristics such as composition of the active center, anomeric configuration of cleaved glycosidic bonds, and molecular mechanism of the catalyzed reaction (either inverting, or retaining). There are now extensive data in the literature concerning the relationship between glycoside hydrolase families belonging to different clans and/or included in none of them, as well as information on phylogenetic protein relationship within particular families. Summarizing these data allows us to propose a multilevel hierarchical classification of glycoside hydrolases and their homologs. It is shown that almost the whole variety of the enzyme catalytic domains can be brought into six main folds, large groups of proteins having the same three-dimensional structure and the supposed common evolutionary origin. PMID:21639842

  1. Scalable metagenomic taxonomy classification using a reference genome database

    Ames, Sasha K.; Hysom, David A.; Shea N. Gardner; Lloyd, G. Scott; Gokhale, Maya B.; Allen, Jonathan E.

    2013-01-01

    Motivation: Deep metagenomic sequencing of biological samples has the potential to recover otherwise difficult-to-detect microorganisms and accurately characterize biological samples with limited prior knowledge of sample contents. Existing metagenomic taxonomic classification algorithms, however, do not scale well to analyze large metagenomic datasets, and balancing classification accuracy with computational efficiency presents a fundamental challenge. Results: A method is presented to shift...

  2. PHYLOGENETIC ANALYSIS AMONG FOUR SECTIONS OF GENUS DENDROBIUM SW. (ORCHIDACEAE) IN PENINSULAR MALAYSIA USING RBCL SEQUENCE DATA

    Maryam Moudi

    2013-06-01

    Phylogenetic analysis using chloroplast DNA, the ribulose-bisphosphate carboxylase gene (rbcL), was conducted to examine relationships among four sections of the genus Dendrobium (Orchidaceae): Aporum, Crumenata, Strongyle, and Bolbidium in Peninsular Malaysia. Classifications based on morphological characters have not been able to clearly divide these four sections; therefore, deeper and more detailed analyses are required to ascertain their status. In this study, the phylogenetic relationships among species of the four sections were investigated to clarify whether to lump them into one section or reduce them to two.

  3. Barcoding and Phylogenetic Inferences in Nine Mugilid Species (Pisces, Mugiliformes)

    Neonila Polyakova

    2013-10-01

    Accurate identification of fish and fish products, from eggs to adults, is important in many areas. Grey mullets of the family Mugilidae are distributed worldwide and inhabit marine, estuarine, and freshwater environments in all tropical and temperate regions. Various Mugilid species are commercially important in the fisheries and aquaculture of many countries. For the present study we have chosen two Mugilid genes with different phylogenetic signals: the relatively variable mitochondrial cytochrome oxidase subunit I (COI) and the conservative nuclear rhodopsin (RHO). We examined their diversity within and among 9 Mugilid species belonging to 4 genera, many of which have been examined from multiple specimens, with the goal of determining whether DNA barcoding can achieve unambiguous species recognition of Mugilid species. The data obtained showed that information based on COI sequences was diagnostic not only for species-level identification but also for recognition of intraspecific units, e.g., allopatric populations of circumtropical Mugil cephalus, or even native and acclimatized specimens of Chelon haematocheila. All RHO sequences appeared strictly species specific. Based on the data obtained, we conclude that COI, as well as RHO, sequencing can be used to unambiguously identify fish species. Topologies of phylogeny based on RHO and COI sequences coincided with each other, while together they had a good phylogenetic signal.

  4. Accurate determination of antenna directivity

    Dich, Mikael

    1997-01-01

    The derivation of a formula for accurate estimation of the total radiated power from a transmitting antenna for which the radiated power density is known in a finite number of points on the far-field sphere is presented. The main application of the formula is determination of directivity from power...
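
    For orientation, directivity is the peak radiation intensity divided by the average intensity over the sphere, D = 4*pi*U_max / P_rad, where P_rad is the integral of U(theta, phi)*sin(theta) over the far-field sphere. The sketch below estimates P_rad from a finite grid of samples with a plain midpoint rule. That quadrature is an assumption chosen for simplicity and is not the formula derived in this record, which is not reproduced in the truncated abstract; the function name and grid sizes are likewise hypothetical.

```python
import math


def directivity(intensity, n_theta=180, n_phi=360):
    """Estimate directivity from radiation intensity sampled on a regular
    (theta, phi) grid, using midpoint-rule integration over the sphere.

    `intensity(theta, phi)` returns radiation intensity in arbitrary units.
    """
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    total_power = 0.0
    peak = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            u = intensity(theta, phi)
            peak = max(peak, u)
            # Surface element on the unit sphere: sin(theta) dtheta dphi.
            total_power += u * math.sin(theta) * d_theta * d_phi
    return 4.0 * math.pi * peak / total_power


# Two textbook patterns with known directivity: an isotropic radiator (1.0)
# and a short dipole with a sin^2(theta) pattern (1.5).
d_isotropic = directivity(lambda theta, phi: 1.0)
d_short_dipole = directivity(lambda theta, phi: math.sin(theta) ** 2)
print(round(d_isotropic, 3), round(d_short_dipole, 3))
```

    Because the peak is taken from the sampled points, a coarse grid can miss the true maximum and underestimate the directivity of narrow beams.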

  5. Phylogenetic analysis of the Trypanosoma genus based on the heat-shock protein 70 gene.

    Fraga, Jorge; Fernández-Calienes, Aymé; Montalvo, Ana Margarita; Maes, Ilse; Deborggraeve, Stijn; Büscher, Philippe; Dujardin, Jean-Claude; Van der Auwera, Gert

    2016-09-01

    Trypanosome evolution was so far essentially studied on the basis of phylogenetic analyses of small subunit ribosomal RNA (SSU-rRNA) and glycosomal glyceraldehyde-3-phosphate dehydrogenase (gGAPDH) genes. We used for the first time the 70kDa heat-shock protein gene (hsp70) to investigate the phylogenetic relationships among 11 Trypanosoma species on the basis of 1380 nucleotides from 76 sequences corresponding to 65 strains. We also constructed a phylogeny based on combined datasets of SSU-rDNA, gGAPDH and hsp70 sequences. The obtained clusters can be correlated with the sections and subgenus classifications of mammal-infecting trypanosomes except for Trypanosoma theileri and Trypanosoma rangeli. Our analysis supports the classification of Trypanosoma species into clades rather than in sections and subgenera, some of which being polyphyletic. Nine clades were recognized: Trypanosoma carassi, Trypanosoma congolense, Trypanosoma cruzi, Trypanosoma grayi, Trypanosoma lewisi, T. rangeli, T. theileri, Trypanosoma vivax and Trypanozoon. These results are consistent with existing knowledge of the genus' phylogeny. Within the T. cruzi clade, three groups of T. cruzi discrete typing units could be clearly distinguished, corresponding to TcI, TcIII, and TcII+V+VI, while support for TcIV was lacking. Phylogenetic analyses based on hsp70 demonstrated that this molecular marker can be applied for discriminating most of the Trypanosoma species and clades. PMID:27180897

  6. PhyTB: Phylogenetic tree visualisation and sample positioning for M. tuberculosis

    Benavente, Ernest D

    2015-05-13

    Background: Phylogenetic-based classification of M. tuberculosis and other bacterial genomes is a core analysis for studying evolutionary hypotheses, disease outbreaks and transmission events. Whole genome sequencing is providing new insights into the genomic variation underlying intra- and inter-strain diversity, thereby assisting with the classification and molecular barcoding of the bacteria. One roadblock to strain investigation is the lack of user-interactive solutions to interrogate and visualise variation within a phylogenetic tree setting. Results: We have developed a web-based tool called PhyTB (http://pathogenseq.lshtm.ac.uk/phytblive/index.php) to assist phylogenetic tree visualisation and identification of M. tuberculosis clade-informative polymorphism. Variant Call Format files can be uploaded to determine a sample position within the tree. A map view summarises the geographical distribution of alleles and strain-types. The utility of the PhyTB is demonstrated on sequence data from 1,601 M. tuberculosis isolates. Conclusion: PhyTB contextualises M. tuberculosis genomic variation within epidemiological, geographical and phylogenic settings. Further tool utility is possible by incorporating large variants and phenotypic data (e.g. drug-resistance profiles), and an assessment of genotype-phenotype associations. Source code is available to develop similar websites for other organisms (http://sourceforge.net/projects/phylotrack).

  7. Characterization of a branch of the phylogenetic tree

    We use a combination of analytic models and computer simulations to gain insight into the dynamics of evolution. Our results suggest that certain interesting phenomena should eventually emerge from the fossil record. For example, there should be a 'tortoise and hare effect': Those genera with the smallest species death rate are likely to survive much longer than genera with large species birth and death rates. A complete characterization of the behavior of a branch of the phylogenetic tree corresponding to a genus and accurate mathematical representations of the various stages are obtained. We apply our results to address certain controversial issues that have arisen in paleontology such as the importance of punctuated equilibrium and whether unique Cambrian phyla have survived to the present.

  8. Characterization of a branch of the phylogenetic tree.

    Samuel, Stuart A; Weng, Gezhi

    2003-02-21

    We use a combination of analytic models and computer simulations to gain insight into the dynamics of evolution. Our results suggest that certain interesting phenomena should eventually emerge from the fossil record. For example, there should be a "tortoise and hare effect": those genera with the smallest species death rate are likely to survive much longer than genera with large species birth and death rates. A complete characterization of the behavior of a branch of the phylogenetic tree corresponding to a genus and accurate mathematical representations of the various stages are obtained. We apply our results to address certain controversial issues that have arisen in paleontology such as the importance of punctuated equilibrium and whether unique Cambrian phyla have survived to the present. PMID:12623281
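
    The "tortoise and hare effect" can be illustrated with a textbook model. The sketch below assumes a linear birth-death process with constant per-species birth and death rates, starting from a single species; the abstract does not state which model the authors used, so this is an illustration rather than their method. The function names and the example rates are hypothetical.

```python
import math


def extinction_probability(birth, death, t):
    """Probability that a genus founded by one species is extinct by time t.

    Assumes a linear birth-death process with constant per-species rates
    (an illustrative assumption, not taken from the abstract).
    """
    if birth < 0 or death < 0 or t < 0:
        raise ValueError("rates and time must be non-negative")
    if math.isclose(birth, death):
        # Critical case: birth rate equals death rate.
        return birth * t / (1.0 + birth * t)
    growth = math.exp((birth - death) * t)
    return death * (growth - 1.0) / (birth * growth - death)


def survival_probability(birth, death, t):
    """Probability that at least one species of the genus is alive at time t."""
    return 1.0 - extinction_probability(birth, death, t)


# Two genera with balanced rates: a slow "tortoise" and a fast "hare".
tortoise = survival_probability(0.1, 0.1, 10.0)
hare = survival_probability(1.0, 1.0, 10.0)
print(f"tortoise: {tortoise:.3f}, hare: {hare:.3f}")
```

    Under these assumptions the genus with the smaller rates is more likely to persist to a given time, which is the qualitative pattern the abstract describes.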

  9. Morphological and molecular convergences in mammalian phylogenetics.

    Zou, Zhengting; Zhang, Jianzhi

    2016-01-01

    Phylogenetic trees reconstructed from molecular sequences are often considered more reliable than those reconstructed from morphological characters, in part because convergent evolution, which confounds phylogenetic reconstruction, is believed to be rarer for molecular sequences than for morphologies. However, neither the validity of this belief nor its underlying cause is known. Here comparing thousands of characters of each type that have been used for inferring the phylogeny of mammals, we find that on average morphological characters indeed experience much more convergences than amino acid sites, but this disparity is explained by fewer states per character rather than an intrinsically higher susceptibility to convergence for morphologies than sequences. We show by computer simulation and actual data analysis that a simple method for identifying and removing convergence-prone characters improves phylogenetic accuracy, potentially enabling, when necessary, the inclusion of morphologies and hence fossils for reliable tree inference. PMID:27585543

  10. Texture Classification based on Gabor Wavelet

    Amandeep Kaur

    2012-07-01

    This paper compares texture classification algorithms based on Gabor wavelets, focusing on the feature extraction scheme. The texture features of an image are captured with texture descriptors; here we use the Homogeneous Texture Descriptor, which is built on Gabor wavelets. For classification we use the online Brodatz texture database and three well-known classifiers: support vector machine, k-nearest neighbor, and decision tree induction. The results show that the support vector machine gives better results than the other classifiers. It can accurately discriminate between test image data and training data.

  11. Phylogenetic and Structural Analysis of Polyketide Synthases in Aspergilli

    Bhetariya, Preetida J.; Prajapati, Madhvi; Bhaduri, Asani; Mandal, Rahul Shubhra; Varma, Anupam; Madan, Taruna; Singh, Yogendra; Sarma, P. Usha

    2016-01-01

    Polyketide synthases (PKSs) of Aspergillus species are multidomain and multifunctional megaenzymes that play an important role in the synthesis of diverse polyketide compounds. Putative PKS protein sequences from Aspergillus species representing medically, agriculturally, and industrially important Aspergillus species were chosen and screened for in silico studies. Six candidate Aspergillus species, Aspergillus fumigatus Af293, Aspergillus flavus NRRL3357, Aspergillus niger CBS 513.88, Aspergillus terreus NIH2624, Aspergillus oryzae RIB40, and Aspergillus clavatus NRRL1, were selected to study the PKS phylogeny. Full-length PKS proteins and only ketosynthase (KS) domain sequence were retrieved for independent phylogenetic analysis from the aforementioned species, and phylogenetic analysis was performed with characterized fungal PKS. This resulted into grouping of Aspergilli PKSs into nonreducing (NR), partially reducing (PR), and highly reducing (HR) PKS enzymes. Eight distinct clades with unique domain arrangements were classified based on homology with functionally characterized PKS enzymes. Conserved motif signatures corresponding to each type of PKS were observed. Three proteins from Protein Data Bank corresponding to NR, PR, and HR type of PKS (XP_002384329.1, XP_753141.2, and XP_001402408.2, respectively) were selected for mapping of conserved motifs on three-dimensional structures of KS domain. Structural variations were found at the active sites on modeled NR, PR, and HR enzymes of Aspergillus. It was observed that the number of iteration cycles was dependent on the size of the cavity in the active site of the PKS enzyme correlating with a type with reducing or NR products, such as pigment, 6MSA, and lovastatin. The current study reports the grouping and classification of PKS proteins of Aspergilli for possible exploration of novel polyketides based on sequence homology; this information can be useful for selection of PKS for polyketide exploration and

  12. LC Classification as linked data

    Kevin Ford

    2013-01-01

    In 2009 and in 2011, the Library of Congress made two of its largest authority files – Subject Headings and Names – available as linked data via LC’s Linked Data Service, ID.LOC.GOV. Both are offered in MADS/RDF and SKOS. It is LC’s objective, in 2012, to publish another of its largest authority files as linked data: LC Classification. Whereas the source records for Subject Headings and Names are encoded in the MARC Authority format, from which there is a relatively straightforward mapping to MADS/RDF and SKOS, LC Classification records rely on the MARC Classification format. Mapping from LC Classification to MADS/RDF or SKOS has been a little more challenging. For example, records that represent classification ranges, which are not Concepts intended to be assigned, are not easily accommodated in SKOS. This presents additional problems when needing to accurately represent the relationships in RDF for LC Classification. With comparison to the publication of LCSH and Names at ID.LOC.GOV, this paper will examine issues encountered – and how those challenges were addressed – during the conversion of LC Classification to MADS/RDF and SKOS for release as linked data at ID.LOC.GOV.

  13. Typology, classification and systematization of innovative projects and initiatives in the company

    Baklanova Julia O.

    2012-04-01

    The author compares definitions of typology, classification and systematization, and illustrates them using the example of a company's innovative projects and initiatives. The typology and classification are based on the methodology of Benko K. and Mc Farlan. To obtain a more accurate result, the tasks of typology, classification and systematization need to be integrated.

  14. PANTHER version 7: improved phylogenetic trees, orthologs and collaboration with the Gene Ontology Consortium

    Mi, Huaiyu; Dong, Qing; Muruganujan, Anushya; Gaudet, Pascale; Lewis, Suzanna; Thomas, Paul D

    2009-01-01

    Protein Analysis THrough Evolutionary Relationships (PANTHER) is a comprehensive software system for inferring the functions of genes based on their evolutionary relationships. Phylogenetic trees of gene families form the basis for PANTHER and these trees are annotated with ontology terms describing the evolution of gene function from ancestral to modern day genes. One of the main applications of PANTHER is in accurate prediction of the functions of uncharacterized genes, based on their evolu...

  15. pplacer: linear time maximum-likelihood and Bayesian phylogenetic placement of sequences onto a fixed reference tree

    Kodner Robin B

    2010-10-01

    Background Likelihood-based phylogenetic inference is generally considered to be the most reliable classification method for unknown sequences. However, traditional likelihood-based phylogenetic methods cannot be applied to large volumes of short reads from next-generation sequencing due to computational complexity issues and lack of phylogenetic signal. "Phylogenetic placement," where a reference tree is fixed and the unknown query sequences are placed onto the tree via a reference alignment, is a way to bring the inferential power offered by likelihood-based approaches to large data sets. Results This paper introduces pplacer, a software package for phylogenetic placement and subsequent visualization. The algorithm can place twenty thousand short reads on a reference tree of one thousand taxa per hour per processor, has essentially linear time and memory complexity in the number of reference taxa, and is easy to run in parallel. Pplacer features calculation of the posterior probability of a placement on an edge, which is a statistically rigorous way of quantifying uncertainty on an edge-by-edge basis. It also can inform the user of the positional uncertainty for query sequences by calculating expected distance between placement locations, which is crucial in the estimation of uncertainty with a well-sampled reference tree. The software provides visualizations using branch thickness and color to represent number of placements and their uncertainty. A simulation study using reads generated from 631 COG alignments shows a high level of accuracy for phylogenetic placement over a wide range of alignment diversity, and the power of edge uncertainty estimates to measure placement confidence. Conclusions Pplacer enables efficient phylogenetic placement and subsequent visualization, making likelihood-based phylogenetics methodology practical for large collections of reads; it is freely available as source code, binaries, and a web service.
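
    The two uncertainty measures mentioned in the abstract can be sketched in a few lines. The code below is a simplified illustration and not pplacer's implementation or API: it assumes each candidate edge already has a log-likelihood for the query, normalizes those into weights (a true posterior would also need a prior and integration over attachment position, which is skipped here), and computes the expected distance between two placement locations drawn independently from those weights. The function names, the input log-likelihoods, and the distance matrix are hypothetical.

```python
import math


def normalized_likelihood_weights(log_likelihoods):
    """Turn per-edge log-likelihoods into weights that sum to 1.

    Subtracting the maximum before exponentiating avoids underflow.
    """
    if not log_likelihoods:
        raise ValueError("need at least one candidate edge")
    top = max(log_likelihoods)
    raw = [math.exp(value - top) for value in log_likelihoods]
    total = sum(raw)
    return [value / total for value in raw]


def expected_placement_distance(weights, distances):
    """Expected tree distance between two independently drawn placements.

    distances[i][j] is the path length between candidate locations i and j.
    """
    n = len(weights)
    return sum(
        weights[i] * weights[j] * distances[i][j]
        for i in range(n)
        for j in range(n)
    )


# Hypothetical query with three candidate edges.
weights = normalized_likelihood_weights([-1200.0, -1200.0, -1210.0])
distances = [
    [0.0, 0.2, 1.5],
    [0.2, 0.0, 1.4],
    [1.5, 1.4, 0.0],
]
spread = expected_placement_distance(weights, distances)
print(weights, spread)
```

    A small expected distance means the plausible placements sit close together on the tree even when no single edge has a dominant weight.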

  16. Phylogenetic position of Oryzolejeunea (Lejeuneaceae, Marchantiophyta): Evidence from molecular markers and morphology

    Wen YE; Yu-Mei WEI; Alfons SCHÄFER-VERWIMP; Rui-Liang ZHU

    2013-01-01

    The systematic position of the small neotropical genus Oryzolejeunea (three spp.) has long been controversial. Phylogenetic analyses of molecular data for the present study using DNA markers (trnL, psbA, and a nuclear ribosomal internal transcribed spacer [nrITS] region) show that the genus is nested in Lejeunea. The results not only reveal the phylogenetic position of Oryzolejeunea for the first time, but also challenge the taxonomic value of the proximal hyaline papilla as a key feature in Lejeunea. The present study shows the urgent need for a reassessment of the perimeters of the genus Lejeunea and its infrageneric classification. Three new combinations, namely Lejeunea saccatiloba, Lejeunea grolleana, and Lejeunea venezuelana, are proposed.

  17. The complete mitochondrial genome of Meriones libycus (Rodentia: Cricetidae) and its phylogenetic analysis.

    Luo, Guangjie; Liao, Jicheng

    2016-07-01

    Meriones libycus belongs to the genus Meriones in Gerbillinae. Its complete mitochondrial genome is 16,341 bp in length. The heavy strand contains 32.8% A, 13.1% G, 25.3% C, and 28.8% T, with protein-coding genes accounting for approximately 69.54%. Phylogenetic analysis showed that M. libycus and Meriones unguiculatus clustered together, consistent with primary morphological taxonomy. This study verifies the evolutionary status of M. libycus in Meriones at the molecular level. The mitochondrial genome is a significant supplement to the gene pool of Rodentia, and the phylogenetic analysis could provide important molecular evidence for the classification of Gerbillinae. PMID:26017047

  18. Hand eczema classification

    Diepgen, T L; Andersen, Klaus Ejner; Brandao, F M;

    2008-01-01

    the disease is rarely evidence based, and a classification system for different subdiagnoses of hand eczema is not agreed upon. Randomized controlled trials investigating the treatment of hand eczema are called for. For this, as well as for clinical purposes, a generally accepted classification system...... classification system for hand eczema is proposed. Conclusions It is suggested that this classification be used in clinical work and in clinical trials....

  19. Phylogenetic Memory of Developing Mammalian Dentition

    Peterková, Renata; Lesot, H.; Peterka, Miroslav

    2006-01-01

    Vol. 306, No. 3 (2006), pp. 234-250. ISSN 1552-5007 R&D Projects: GA ČR GA304/05/2665; GA MŠk OC B23.002 Institutional research plan: CEZ:AV0Z50390512 Keywords: Phylogenetic Memory Subject RIV: EB - Genetics; Molecular Biology Impact factor: 2.756, year: 2006

  20. DNA barcoding and phylogenetic relationships in Timaliidae.

    Huang, Z H; Ke, D H

    2015-01-01

    The Timaliidae, a diverse family of oscine passerine birds, has long been a subject of debate regarding its phylogeny. The mitochondrial cytochrome c oxidase subunit I (COI) gene has been used as a powerful marker for identification and phylogenetic studies of animal species. In the present study, we analyzed the COI barcodes of 71 species from 21 genera belonging to the family Timaliidae. Every bird species possessed a barcode distinct from that of other bird species. Kimura two-parameter (K2P) distances were calculated between barcodes. The average genetic distance between species was 18 times higher than the average genetic distance within species. The neighbor-joining method was used to construct a phylogenetic tree and all the species could be discriminated by their distinct clades within the phylogenetic tree. The results indicate that some currently recognized babbler genera might not be monophyletic, with the COI gene data supporting the hypothesis of polyphyly for Garrulax, Alcippe, and Minla. Thus, DNA barcoding is an effective molecular tool for Timaliidae species identification and phylogenetic inference. PMID:26125793
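
    The Kimura two-parameter distance used above has a closed form: with P the proportion of sites showing transitions and Q the proportion showing transversions, d = -1/2 ln((1 - 2P - Q) sqrt(1 - 2Q)). The sketch below applies that formula to two aligned sequences. The function name, the handling of gaps (sites with anything other than A, C, G, T are skipped), and the toy sequences are assumptions for illustration, not the authors' pipeline.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def k2p_distance(seq_a, seq_b):
    """Kimura two-parameter distance between two aligned DNA sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = 0
    transversions = 0
    compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in BASES or b not in BASES:
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1  # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent
    # for the correction to be defined.
    return -0.5 * math.log((1.0 - 2.0 * p - q) * math.sqrt(1.0 - 2.0 * q))


# Toy example: one transition in ten sites.
example = k2p_distance("AAAAAAAAAA", "GAAAAAAAAA")
print(round(example, 4))
```

    A matrix of such pairwise distances is the usual input to neighbor-joining tree construction.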

  1. Phylogenetic and phylogenomic overview of the Polyporales.

    Binder, Manfred; Justo, Alfredo; Riley, Robert; Salamov, Asaf; Lopez-Giraldez, Francesc; Sjökvist, Elisabet; Copeland, Alex; Foster, Brian; Sun, Hui; Larsson, Ellen; Larsson, Karl-Henrik; Townsend, Jeffrey; Grigoriev, Igor V; Hibbett, David S

    2013-01-01

    We present a phylogenetic and phylogenomic overview of the Polyporales. The newly sequenced genomes of Bjerkandera adusta, Ganoderma sp., and Phlebia brevispora are introduced and an overview of 10 currently available Polyporales genomes is provided. The new genomes are 39 500 000-49 900 000 bp and encode for 12 910-16 170 genes. We searched available genomes for single-copy genes and performed phylogenetic informativeness analyses to evaluate their potential for phylogenetic systematics of the Polyporales. Phylogenomic datasets (25, 71, 356 genes) were assembled for the 10 Polyporales species with genome data and compared with the most comprehensive dataset of Polyporales to date (six-gene dataset for 373 taxa, including taxa with missing data). Maximum likelihood and Bayesian phylogenetic analyses of genomic datasets yielded identical topologies, and the corresponding clades also were recovered in the 373-taxa dataset although with different support values in some datasets. Three previously recognized lineages of Polyporales, antrodia, core polyporoid and phlebioid clades, are supported in most datasets, while the status of the residual polyporoid clade remains uncertain and certain taxa (e.g. Gelatoporia, Grifola, Tyromyces) apparently do not belong to any of the major lineages of Polyporales. The most promising candidate single-copy genes are presented, and nodes in the Polyporales phylogeny critical for the suprageneric taxonomy of the order are identified and discussed. PMID:23935031

  2. Genomic repeat abundances contain phylogenetic signal

    Dodsworth, S.; Chase, M.W.; Kelly, L.J.; Leitch, I.J.; Macas, Jiří; Novák, Petr; Piednoël, M.; Weiß-Schneeweiss, H.; Leitch, A.R.

    2015-01-01

    Vol. 64, No. 1 (2015), pp. 112-126. ISSN 1063-5157 R&D Projects: GA ČR GBP501/12/G090 Institutional support: RVO:60077344 Keywords: Repetitive DNA * continuous characters * genomics * next-generation sequencing * phylogenetics Subject RIV: EB - Genetics; Molecular Biology Impact factor: 14.387, year: 2014

  3. Phylogenetic Relationships and Biogeographic History of Iriarteeae

    Bacon, Christine D.; Florez, Alexander; Balslev, Henrik;

    sequence data for 11 loci (5 chloroplast and 6 nuclear) to reconstruct a coalescent species tree and infer relationships amongst genera and species to, in turn, allow for tests of biogeography and community phylogenetics in the tribe. Our results define inter-generic relationships and resolve all genera as...

  4. Classification of articulators.

    Rihani, A

    1980-03-01

    A simple classification in familiar terms with definite, clear characteristics can be adopted. This classification system is based on the number of records used and the adjustments necessary for the articulator to accept these records. The classification divides the articulators into nonadjustable, semiadjustable, and fully adjustable articulators (Table I). PMID:6928204

  5. Aircraft Operations Classification System

    Harlow, Charles; Zhu, Weihong

    2001-01-01

    Accurate data is important in the aviation planning process. In this project we consider systems for measuring aircraft activity at airports. This would include determining the type of aircraft such as jet, helicopter, single engine, and multiengine propeller. Some of the issues involved in deploying technologies for monitoring aircraft operations are cost, reliability, and accuracy. In addition, the system must be field portable and acceptable at airports. A comparison of technologies was conducted and it was decided that an aircraft monitoring system should be based upon acoustic technology. A multimedia relational database was established for the study. The information contained in the database consists of airport information, runway information, acoustic records, photographic records, a description of the event (takeoff, landing), aircraft type, and environmental information. We extracted features from the time signal and the frequency content of the signal. A multi-layer feed-forward neural network was chosen as the classifier. Training and testing results were obtained. We were able to obtain classification results of over 90 percent for training and testing for takeoff events.

  6. Phylogenetic relationships of some species of the family Echinostomatidae Odner, 1910 (Trematoda), inferred from nuclear rDNA sequences and karyological analysis

    Gražina Stanevičiūtė; Virmantas Stunžėnas; Romualda Petkevičiūtė

    2015-01-01

    Abstract The family Echinostomatidae Looss, 1899 exhibits a substantial taxonomic diversity, morphological criteria adopted by different authors have resulted in its subdivision into an impressive number of subfamilies. The status of the subfamily Echinochasminae Odhner, 1910 was changed in various classifications. Genetic characteristics and phylogenetic analysis of four Echinostomatidae species – Echinochasmus sp., Echinochasmus coaxatus Dietz, 1909, Stephanoprora pseudoechinata (Olsson, 18...

  7. Cirrhosis classification based on texture classification of random features.

    Liu, Hui; Shao, Ying; Guo, Dongmei; Zheng, Yuanjie; Zhao, Zuowei; Qiu, Tianshuang

    2014-01-01

    Accurate staging of hepatic cirrhosis is important in investigating the cause and slowing down the effects of cirrhosis. Computer-aided diagnosis (CAD) can provide doctors with a second opinion and help them choose a specific treatment based on an accurate cirrhosis stage. MRI has many advantages, including high resolution for soft tissue, no radiation, and multiparameter imaging modalities. In this paper, multisequence MRIs, including T1-weighted, T2-weighted, arterial, portal venous, and equilibrium phase, are therefore used. However, CAD does not yet meet the clinical needs of cirrhosis, and few researchers are currently concerned with it. Cirrhosis is characterized by widespread fibrosis and regenerative nodules in the liver, leading to different texture patterns at different stages, so extracting texture features is the primary task. Compared with typical gray level cooccurrence matrix (GLCM) features, texture classification from random features provides an effective alternative; we adopt it and propose CCTCRF for three-class classification (normal, early, and middle and advanced stage). CCTCRF requires no strong assumptions beyond the sparsity of the image, retains sufficient texture information, uses a concise and effective process, and makes case decisions with high accuracy. Experimental results illustrate its satisfactory performance and are also compared with a typical NN with GLCM. PMID:24707317

  8. Phyloclimatic modeling: combining phylogenetics and bioclimatic modeling.

    Yesson, C; Culham, A

    2006-10-01

    We investigate the impact of past climates on plant diversification by tracking the "footprint" of climate change on a phylogenetic tree. Diversity within the cosmopolitan carnivorous plant genus Drosera (Droseraceae) is focused within Mediterranean climate regions. We explore whether this diversity is temporally linked to Mediterranean-type climatic shifts of the mid-Miocene and whether climate preferences are conservative over phylogenetic timescales. Phyloclimatic modeling combines environmental niche (bioclimatic) modeling with phylogenetics in order to study evolutionary patterns in relation to climate change. We present the largest and most complete such example to date using Drosera. The bioclimatic models of extant species demonstrate clear phylogenetic patterns; this is particularly evident for the tuberous sundews from southwestern Australia (subgenus Ergaleium). We employ a method for establishing confidence intervals of node ages on a phylogeny using replicates from a Bayesian phylogenetic analysis. This chronogram shows that many clades, including subgenus Ergaleium and section Bryastrum, diversified during the establishment of the Mediterranean-type climate. Ancestral reconstructions of bioclimatic models demonstrate a pattern of preference for this climate type within these groups. Ancestral bioclimatic models are projected into palaeo-climate reconstructions for the time periods indicated by the chronogram. We present two such examples that each generate plausible estimates of ancestral lineage distribution, which are similar to their current distributions. This is the first study to attempt bioclimatic projections on evolutionary time scales. The sundews appear to have diversified in response to local climate development. Some groups are specialized for Mediterranean climates, others show wide-ranging generalism. This demonstrates that Phyloclimatic modeling could be repeated for other plant groups and is fundamental to the understanding of

  9. Proposal for a revised classification of the Demospongiae (Porifera)

    Morrow, Christine; Cardenas, Paco

    2015-01-01

    Background: Demospongiae is the largest sponge class including 81% of all living sponges with nearly 7,000 species worldwide. Systema Porifera (2002) was the result of a large international collaboration to update the Demospongiae higher taxa classification, essentially based on morphological data. Since then, an increasing number of molecular phylogenetic studies have considerably shaken this taxonomic framework, with numerous polyphyletic groups revealed or confirmed and new clades discover...

  10. Identification and classification of silks using infrared spectroscopy

    M. Boulet-Audet; Vollrath, F.; Holland, C.

    2015-01-01

    ABSTRACT Lepidopteran silks number in the thousands and display a vast diversity of structures, properties and industrial potential. To map this remarkable biochemical diversity, we present an identification and screening method based on the infrared spectra of native silk feedstock and cocoons. Multivariate analysis of over 1214 infrared spectra obtained from 35 species allowed us to group silks into distinct hierarchies and a classification that agrees well with current phylogenetic data an...

  11. A new measure to study phylogenetic relations in the brown algal order Ectocarpales: The "codon impact parameter"

    Smarajit Das; Jayprokas Chakrabarti; Zhumur Ghosh; Satyabrata Sahoo; Bibekanand Mallick

    2005-12-01

    We analyse forty-seven chloroplast genes of the large subunit of RuBisCO, from the algal order Ectocarpales, sourced from GenBank. Codon-usage weighted by the nucleotide base-bias defines our score called the codon-impact-parameter. This score is used to obtain phylogenetic relations amongst the 47 Ectocarpales. We compare our classification with the ones done earlier.

  12. Phylogenetic Characterization of Transport Protein Superfamilies: Superiority of SuperfamilyTree Programs over Those Based on Multiple Alignments

    Chen, Jonathan S.; Reddy, Vamsee; Chen, Joshua H.; Shlykov, Maksim A; Zheng, Wei Hao; Cho, Jaehoon; Yen, Ming Ren; Saier, Milton H.

    2012-01-01

    Transport proteins function in the translocation of ions, solutes and macromolecules across cellular and organellar membranes. These integral membrane proteins fall into >600 families as tabulated in the Transporter Classification Database (www.tcdb.org). Recent studies, some of which are reported here, define distant phylogenetic relationships between families with the creation of superfamilies. Several of these are analyzed using a novel set of programs designed to allow reliable prediction...

  13. Strong phylogenetic signals and phylogenetic niche conservatism in ecophysiological traits across divergent lineages of Magnoliaceae

    Hui Liu; Qiuyuan Xu; Pengcheng He; Santiago, Louis S.; Keming Yang; Qing Ye

    2015-01-01

    The early diverged Magnoliaceae shows a historical temperate-tropical distribution among lineages indicating divergent evolution, yet which ecophysiological traits are phylogenetically conserved, and whether these traits are involved in correlated evolution remain unclear. Integrating phylogeny and 20 ecophysiological traits of 27 species, from the four largest sections of Magnoliaceae, we tested the phylogenetic signals of these traits and the correlated evolution between trait pairs. Phylog...

  14. Fast and accurate estimation for astrophysical problems in large databases

    Richards, Joseph W.

    2010-10-01

    A recent flood of astronomical data has created much demand for sophisticated statistical and machine learning tools that can rapidly draw accurate inferences from large databases of high-dimensional data. In this Ph.D. thesis, methods for statistical inference in such databases will be proposed, studied, and applied to real data. I use methods for low-dimensional parametrization of complex, high-dimensional data that are based on the notion of preserving the connectivity of data points in the context of a Markov random walk over the data set. I show how this simple parameterization of data can be exploited to: define appropriate prototypes for use in complex mixture models, determine data-driven eigenfunctions for accurate nonparametric regression, and find a set of suitable features to use in a statistical classifier. In this thesis, methods for each of these tasks are built up from simple principles, compared to existing methods in the literature, and applied to data from astronomical all-sky surveys. I examine several important problems in astrophysics, such as estimation of star formation history parameters for galaxies, prediction of redshifts of galaxies using photometric data, and classification of different types of supernovae based on their photometric light curves. Fast methods for high-dimensional data analysis are crucial in each of these problems because they all involve the analysis of complicated high-dimensional data in large, all-sky surveys. Specifically, I estimate the star formation history parameters for the nearly 800,000 galaxies in the Sloan Digital Sky Survey (SDSS) Data Release 7 spectroscopic catalog, determine redshifts for over 300,000 galaxies in the SDSS photometric catalog, and estimate the types of 20,000 supernovae as part of the Supernova Photometric Classification Challenge. Accurate predictions and classifications are imperative in each of these examples because these estimates are utilized in broader inference problems

  15. A user's guide to a data base of the diversity of Pseudomonas syringae and its application to classifying strains in this phylogenetic complex.

    Odile Berge

    The Pseudomonas syringae complex is composed of numerous genetic lineages of strains from both agricultural and environmental habitats including habitats closely linked to the water cycle. The new insights from the discovery of this bacterial species in habitats outside of agricultural contexts per se have led to the revelation of a wide diversity of strains in this complex beyond what was known from agricultural contexts. Here, through Multi Locus Sequence Typing (MLST) of 216 strains, we identified 23 clades within 13 phylogroups among which the seven previously described P. syringae phylogroups were included. The phylogeny of the core genome of 29 strains representing nine phylogroups was similar to the phylogeny obtained with MLST thereby confirming the robustness of MLST-phylogroups. We show that phenotypic traits rarely provide a satisfactory means for classification of strains even if some combinations are highly probable in some phylogroups. We demonstrate that the citrate synthase (cts) housekeeping gene can accurately predict the phylogenetic affiliation for more than 97% of strains tested. We propose a list of cts sequences to be used as a simple tool for quickly and precisely classifying new strains. Finally, our analysis leads to predictions about the diversity of P. syringae that is yet to be discovered. We present here an expandable framework mainly based on cts genetic analysis into which more diversity can be integrated.

  16. A user's guide to a data base of the diversity of Pseudomonas syringae and its application to classifying strains in this phylogenetic complex.

    Berge, Odile; Monteil, Caroline L; Bartoli, Claudia; Chandeysson, Charlotte; Guilbaud, Caroline; Sands, David C; Morris, Cindy E

    2014-01-01

    The Pseudomonas syringae complex is composed of numerous genetic lineages of strains from both agricultural and environmental habitats including habitats closely linked to the water cycle. The new insights from the discovery of this bacterial species in habitats outside of agricultural contexts per se have led to the revelation of a wide diversity of strains in this complex beyond what was known from agricultural contexts. Here, through Multi Locus Sequence Typing (MLST) of 216 strains, we identified 23 clades within 13 phylogroups among which the seven previously described P. syringae phylogroups were included. The phylogeny of the core genome of 29 strains representing nine phylogroups was similar to the phylogeny obtained with MLST thereby confirming the robustness of MLST-phylogroups. We show that phenotypic traits rarely provide a satisfactory means for classification of strains even if some combinations are highly probable in some phylogroups. We demonstrate that the citrate synthase (cts) housekeeping gene can accurately predict the phylogenetic affiliation for more than 97% of strains tested. We propose a list of cts sequences to be used as a simple tool for quickly and precisely classifying new strains. Finally, our analysis leads to predictions about the diversity of P. syringae that is yet to be discovered. We present here an expandable framework mainly based on cts genetic analysis into which more diversity can be integrated. PMID:25184292

  17. A User's Guide to a Data Base of the Diversity of Pseudomonas syringae and Its Application to Classifying Strains in This Phylogenetic Complex

    Berge, Odile; Monteil, Caroline L.; Bartoli, Claudia; Chandeysson, Charlotte; Guilbaud, Caroline; Sands, David C.; Morris, Cindy E.

    2014-01-01

    The Pseudomonas syringae complex is composed of numerous genetic lineages of strains from both agricultural and environmental habitats including habitats closely linked to the water cycle. The new insights from the discovery of this bacterial species in habitats outside of agricultural contexts per se have led to the revelation of a wide diversity of strains in this complex beyond what was known from agricultural contexts. Here, through Multi Locus Sequence Typing (MLST) of 216 strains, we identified 23 clades within 13 phylogroups among which the seven previously described P. syringae phylogroups were included. The phylogeny of the core genome of 29 strains representing nine phylogroups was similar to the phylogeny obtained with MLST thereby confirming the robustness of MLST-phylogroups. We show that phenotypic traits rarely provide a satisfactory means for classification of strains even if some combinations are highly probable in some phylogroups. We demonstrate that the citrate synthase (cts) housekeeping gene can accurately predict the phylogenetic affiliation for more than 97% of strains tested. We propose a list of cts sequences to be used as a simple tool for quickly and precisely classifying new strains. Finally, our analysis leads to predictions about the diversity of P. syringae that is yet to be discovered. We present here an expandable framework mainly based on cts genetic analysis into which more diversity can be integrated. PMID:25184292
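
    As a rough illustration of classifying a new strain from a single housekeeping-gene sequence, the sketch below assigns a query to the reference with the highest fraction of identical positions. This is a simplification and not the authors' procedure: the reference sequences, the phylogroup labels, and the function names are all hypothetical, and the sequences are assumed to be pre-aligned to equal length.

```python
def identity(a, b):
    """Fraction of matching positions between two equal-length, pre-aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)


def assign_phylogroup(query, references):
    """Return the (label, identity) of the reference sequence closest to the query."""
    best_label, best_score = None, -1.0
    for label, ref in references.items():
        score = identity(query, ref)
        if score > best_score:
            best_label, best_score = label, score
    return best_label, best_score


# Hypothetical 20-base reference fragments; real cts references would be far longer.
REFERENCES = {
    "phylogroup_A": "ATGGCTGACAAAAAAGCGCA",
    "phylogroup_B": "ATGGCAGATAAGAAAGCTCA",
    "phylogroup_C": "ATGTCTGACCGTAAAGCACA",
}

query = "ATGGCTGACAAAAAGGCGCA"  # hypothetical new strain, one mismatch from phylogroup_A
label, score = assign_phylogroup(query, REFERENCES)
print(label, score)
```

    A tree-based placement would be closer to a phylogenetic affiliation; nearest-reference identity is used here only because it fits in a few lines.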

  18. Phylogenetic and structural analysis of merkel cell polyomavirus VP1 in Brazilian samples.

    Baez, Camila F; Diaz, Nuria C; Venceslau, Marianna T; Luz, Flávio B; Guimarães, Maria Angelica A M; Zalis, Mariano G; Varella, Rafael B

    2016-08-01

    Our understanding of the phylogenetic and structural characteristics of the Merkel Cell Polyomavirus (MCPyV) is increasing but still scarce, especially in samples originating from South America. In order to investigate the properties of MCPyV circulating in the continent in more detail, MCPyV Viral Protein 1 (VP1) sequences from five basal cell carcinoma (BCC) and four saliva samples from Brazilian individuals were evaluated from the phylogenetic and structural standpoint, along with all complete MCPyV VP1 sequences available at Genbank database so far. The VP1 phylogenetic analysis confirmed the previously reported pattern of geographic distribution of MCPyV genotypes and the complexity of the South-American clade. The nine Brazilian samples were equally distributed in the South-American (3 saliva samples); North American/European (2 BCC and 1 saliva sample); and in the African clades (3 BCC). The classification of mutations according to the functional regions of VP1 protein revealed a differentiated pattern for South-American sequences, with higher number of mutations on the neutralizing epitope loops and lower on the region of C-terminus, responsible for capsid formation, when compared to other continents. In conclusion, the phylogenetic analysis showed that the distribution of Brazilian VP1 sequences agrees with the ethnic composition of the country, indicating that VP1 can be successfully used for MCPyV phylogenetic studies. Finally, the structural analysis suggests that some mutations could have impact on the protein folding, membrane binding or antibody escape, and therefore they should be further studied. PMID:27173789

  19. Accurate Image Super-Resolution Using Very Deep Convolutional Networks

    Kim, Jiwon; Lee, Jung Kwon; Lee, Kyoung Mu

    2015-01-01

    We present a highly accurate single-image super-resolution (SR) method. Our method uses a very deep convolutional network inspired by VGG-net used for ImageNet classification \cite{simonyan2015very}. We find increasing our network depth shows a significant improvement in accuracy. Our final model uses 20 weight layers. By cascading small filters many times in a deep network structure, contextual information over large image regions is exploited in an efficient way. With very deep networks, ho...

  20. Automatic classification of blank substrate defects

    Boettiger, Tom; Buck, Peter; Paninjath, Sankaranarayanan; Pereira, Mark; Ronald, Rob; Rost, Dan; Samir, Bhamidipati

    2014-10-01

    Mask preparation stages are crucial in mask manufacturing, since this mask is to later act as a template for a considerable number of dies on the wafer. Defects on the initial blank substrate, and subsequent cleaned and coated substrates, can have a profound impact on the usability of the finished mask. This emphasizes the need for early and accurate identification of blank substrate defects and the risk they pose to the patterned reticle. While Automatic Defect Classification (ADC) is a well-developed technology for inspection and analysis of defects on patterned wafers and masks in the semiconductor industry, ADC for mask blanks is still in the early stages of adoption and development. Calibre ADC is a powerful analysis tool for fast, accurate, consistent and automatic classification of defects on mask blanks. Accurate, automated classification of mask blanks leads to better usability of blanks by enabling defect avoidance technologies during mask writing. Detailed information on blank defects can help to select appropriate job-decks to be written on the mask by defect avoidance tools [1][4][5]. Smart algorithms separate critical defects from the potentially large number of non-critical defects or false defects detected at various stages during mask blank preparation. Mechanisms used by Calibre ADC to identify and characterize defects include defect location and size, signal polarity (dark, bright) in both transmitted and reflected review images, and distinguishing defect signals from background noise in defect images. The Calibre ADC engine then uses a decision tree to translate this information into a defect classification code. Using this automated process improves classification accuracy, repeatability and speed, while avoiding the subjectivity of human judgment compared to the alternative of manual defect classification by trained personnel [2]. This paper focuses on the results from the evaluation of Automatic Defect Classification (ADC) product at MP Mask

  1. Quality-Oriented Classification of Aircraft Material Based on SVM

    Hongxia Cai

    2014-01-01

    The existing material classification is proposed to improve the inventory management. However, different materials have different quality-related attributes, especially in the aircraft industry. In order to reduce the cost without sacrificing the quality, we propose a quality-oriented material classification system considering the material quality character, quality cost, and quality influence. Analytic Hierarchy Process helps to make feature selection and classification decision. We use the improved Kraljic Portfolio Matrix to establish the three-dimensional classification model. The aircraft materials can be divided into eight types, including general type, key type, risk type, and leveraged type. Aiming to improve the classification accuracy of various materials, the algorithm of Support Vector Machine is introduced. Finally, we compare the SVM and BP neural network in the application. The results prove that the SVM algorithm is more efficient and accurate and the quality-oriented material classification is valuable.

  2. Accurate ab initio spin densities

    Boguslawski, Katharina; Legeza, Örs; Reiher, Markus

    2012-01-01

    We present an approach for the calculation of spin density distributions for molecules that require very large active spaces for a qualitatively correct description of their electronic structure. Our approach is based on the density-matrix renormalization group (DMRG) algorithm to calculate the spin density matrix elements as basic quantity for the spatially resolved spin density distribution. The spin density matrix elements are directly determined from the second-quantized elementary operators optimized by the DMRG algorithm. As an analytic convergence criterion for the spin density distribution, we employ our recently developed sampling-reconstruction scheme [J. Chem. Phys. 2011, 134, 224101] to build an accurate complete-active-space configuration-interaction (CASCI) wave function from the optimized matrix product states. The spin density matrix elements can then also be determined as an expectation value employing the reconstructed wave function expansion. Furthermore, the explicit reconstruction of a CA...

  3. Accurate Modeling of Advanced Reflectarrays

    Zhou, Min

    of the incident field, the choice of basis functions, and the technique to calculate the far-field. Based on accurate reference measurements of two offset reflectarrays carried out at the DTU-ESA Spherical Near-Field Antenna Test Facility, it was concluded that the three latter factors are particularly important...... to the conventional phase-only optimization technique (POT), the geometrical parameters of the array elements are directly optimized to fulfill the far-field requirements, thus maintaining a direct relation between optimization goals and optimization variables. As a result, better designs can be obtained compared...... using the GDOT to demonstrate its capabilities. To verify the accuracy of the GDOT, two offset contoured beam reflectarrays that radiate a high-gain beam on a European coverage have been designed and manufactured, and subsequently measured at the DTU-ESA Spherical Near-Field Antenna Test Facility...

  4. Accurate thickness measurement of graphene

    Shearer, Cameron J.; Slattery, Ashley D.; Stapleton, Andrew J.; Shapter, Joseph G.; Gibson, Christopher T.

    2016-03-01

    Graphene has emerged as a material with a vast variety of applications. The electronic, optical and mechanical properties of graphene are strongly influenced by the number of layers present in a sample. As a result, the dimensional characterization of graphene films is crucial, especially with the continued development of new synthesis methods and applications. A number of techniques exist to determine the thickness of graphene films including optical contrast, Raman scattering and scanning probe microscopy techniques. Atomic force microscopy (AFM), in particular, is used extensively since it provides three-dimensional images that enable the measurement of the lateral dimensions of graphene films as well as the thickness, and by extension the number of layers present. However, in the literature AFM has proven to be inaccurate with a wide range of measured values for single layer graphene thickness reported (between 0.4 and 1.7 nm). This discrepancy has been attributed to tip-surface interactions, image feedback settings and surface chemistry. In this work, we use standard and carbon nanotube modified AFM probes and a relatively new AFM imaging mode known as PeakForce tapping mode to establish a protocol that will allow users to accurately determine the thickness of graphene films. In particular, the error in measuring the first layer is reduced from 0.1-1.3 nm to 0.1-0.3 nm. Furthermore, in the process we establish that the graphene-substrate adsorbate layer and imaging force, in particular the pressure the tip exerts on the surface, are crucial components in the accurate measurement of graphene using AFM. These findings can be applied to other 2D materials.

  5. Statistical assignment of DNA sequences using Bayesian phylogenetics

    Terkelsen, Kasper Munch; Boomsma, Wouter Krogh; Huelsenbeck, John P;

    2008-01-01

    We provide a new automated statistical method for DNA barcoding based on a Bayesian phylogenetic analysis. The method is based on automated database sequence retrieval, alignment, and phylogenetic analysis using a custom-built program for Bayesian phylogenetic analysis. We show on real data that...

  6. Phylogenetic Analysis of PRRSV from Danish Pigs

    Hjulsager, Charlotte Kristiane; Breum, Solvej Østergaard; Larsen, Lars Erik

    visualized with NJ-plot software. Genbank entries of Danish PRRSV sequences from the 1990s were included in the phylogenetic analysis. Translated sequences were aligned with current vaccine isolates. Results Both PRRSV EU and US type viruses were isolated from material submitted from Danish pigs in the...... phylogenetic analysis, in order to assess the applicability of vaccines currently used to control PRRSV infection in Danish pig herds. Materials and methods Lung tissue from samples submitted to the National Veterinary Institute during 2003-2008 for PRRSV diagnosis were screened for PRRSV by real-time RT......-PCR, essentially as described by Egli et al. 2001, on RNA extracted with RNeasy Mini Kit (QIAGEN). Complete open reading frames (ORF) ORF5 and ORF7 were PCR amplified as described (Oleksiewicz et al. 1998) and sequenced. Sequences were aligned and Neighbour-Joining trees were constructed with ClustalX. Trees were...

  7. The phylogenetics of succession can guide restoration

    Shooner, Stephanie; Chisholm, Chelsea Lee; Davies, T. Jonathan

    2015-01-01

    Phylogenetic tools have increasingly been used in community ecology to describe the evolutionary relationships among co-occurring species. In studies of succession, such tools may allow us to identify the evolutionary lineages most suited for particular stages of succession and habitat...... phylogenetically random subset of species from the local species pool. Over time, there appears to be selection for particular lineages that come to be filtered across space and environment. The species most appropriate for mine site restoration might, therefore, depend on the successional stage of the community and the local species composition. For example, in later succession, it could be more beneficial to facilitate establishment of more distant relatives. Our findings can improve management practices by...

  8. A phylogenetic analysis of Aquifex pyrophilus

    Burggraf, S.; Olsen, G. J.; Stetter, K. O.; Woese, C. R.

    1992-01-01

    The 16S rRNA of the bacterium Aquifex pyrophilus, a microaerophilic, oxygen-reducing hyperthermophile, has been sequenced directly from the PCR-amplified gene. Phylogenetic analyses show the Aq. pyrophilus lineage to be probably the deepest (earliest) in the (eu)bacterial tree. The addition of this deep branching to the bacterial tree further supports the argument that the Bacteria are of thermophilic ancestry.

  9. A Consistent Phylogenetic Backbone for the Fungi

    Ebersberger, Ingo; de Matos Simoes, Ricardo; Kupczok, Anne; Gube, Matthias; Kothe, Erika; Voigt, Kerstin; von Haeseler, Arndt

    2011-01-01

    The kingdom of fungi provides model organisms for biotechnology, cell biology, genetics, and life sciences in general. Only when their phylogenetic relationships are stably resolved, can individual results from fungal research be integrated into a holistic picture of biology. However, and despite recent progress, many deep relationships within the fungi remain unclear. Here, we present the first phylogenomic study of an entire eukaryotic kingdom that uses a consistency criterion to strengthen...

  10. Phylogenetic invariants for stationary base composition

    Allman, Elizabeth S.; Rhodes, John A.

    2004-01-01

    Changing base composition during the evolution of biological sequences can mislead some of the phylogenetic inference techniques in current use. However, detecting whether such a process has occurred may be difficult, since convergent evolution may lead to similar base frequencies emerging from different lineages. To study this situation, algebraic models of biological sequence evolution are introduced in which the base composition is fixed throughout evolution. Basic properties of the associ...

  11. Phylogenetic conservatism of functional traits in microorganisms

    Martiny, Adam C.; Treseder, Kathleen; Pusch, Gordon

    2012-01-01

    A central question in biology is how biodiversity influences ecosystem functioning. Underlying this is the relationship between organismal phylogeny and the presence of specific functional traits. The relationship is complicated by gene loss and convergent evolution, resulting in the polyphyletic distribution of many traits. In microorganisms, lateral gene transfer can further distort the linkage between phylogeny and the presence of specific functional traits. To identify the phylogenetic co...

  12. Phylogenetic tree shapes resolve disease transmission patterns

    Colijn, Caroline; Gardy, Jennifer

    2014-01-01

    Background and Objectives: Whole-genome sequencing is becoming popular as a tool for understanding outbreaks of communicable diseases, with phylogenetic trees being used to identify individual transmission events or to characterize outbreak-level overall transmission dynamics. Existing methods to infer transmission dynamics from sequence data rely on well-characterized infectious periods, epidemiological and clinical metadata which may not always be available, and typically require computatio...

  13. Recursive heuristic classification

    Wilkins, David C.

    1994-01-01

    The author will describe a new problem-solving approach called recursive heuristic classification, whereby a subproblem of heuristic classification is itself formulated and solved by heuristic classification. This allows the construction of more knowledge-intensive classification programs in a way that yields a clean organization. Further, standard knowledge acquisition and learning techniques for heuristic classification can be used to create, refine, and maintain the knowledge base associated with the recursively called classification expert system. The method of recursive heuristic classification was used in the Minerva blackboard shell for heuristic classification. Minerva recursively calls itself every problem-solving cycle to solve the important blackboard scheduler task, which involves assigning a desirability rating to alternative problem-solving actions. Knowing these ratings is critical to the use of an expert system as a component of a critiquing or apprenticeship tutoring system. One innovation of this research is a method called dynamic heuristic classification, which allows selection among dynamically generated classification categories instead of requiring them to be pre-enumerated.

  14. Security classification of information

    Quist, A.S.

    1993-04-01

    This document is the second of a planned four-volume work that comprehensively discusses the security classification of information. The main focus of Volume 2 is on the principles for classification of information. Included herein are descriptions of the two major types of information that governments classify for national security reasons (subjective and objective information), guidance to use when determining whether information under consideration for classification is controlled by the government (a necessary requirement for classification to be effective), information disclosure risks and benefits (the benefits and costs of classification), standards to use when balancing information disclosure risks and benefits, guidance for assigning classification levels (Top Secret, Secret, or Confidential) to classified information, guidance for determining how long information should be classified (classification duration), classification of associations of information, classification of compilations of information, and principles for declassifying and downgrading information. Rules or principles of certain areas of our legal system (e.g., trade secret law) are sometimes mentioned to provide added support to some of those classification principles.

  15. Incongruencies in Vaccinia Virus Phylogenetic Trees

    Chad Smithson

    2014-10-01

    Over the years, as more complete poxvirus genomes have been sequenced, phylogenetic studies of these viruses have become more prevalent. In general, the results show similar relationships between the poxvirus species; however, some inconsistencies are notable. Previous analyses of the viral genomes contained within the vaccinia virus (VACV)-Dryvax vaccine revealed that their phylogenetic relationships were sometimes clouded by low bootstrapping confidence. To analyze the VACV-Dryvax genomes in detail, a new tool-set was developed and integrated into the Base-By-Base bioinformatics software package. Analyses showed that fewer unique positions were present in each VACV-Dryvax genome than expected. A series of patterns, each containing several single nucleotide polymorphisms (SNPs), were identified that were counter to the results of the phylogenetic analysis. The VACV genomes were found to contain short DNA sequence blocks that matched more distantly related clades. Additionally, similar non-conforming SNP patterns were observed in (1) the variola virus clade; (2) some cowpox clades; and (3) VACV-CVA, the direct ancestor of VACV-MVA. Thus, traces of past recombination events are common in the various orthopoxvirus clades, including those associated with smallpox and cowpox viruses.

  16. A Consistent Phylogenetic Backbone for the Fungi

    Ebersberger, Ingo; de Matos Simoes, Ricardo; Kupczok, Anne; Gube, Matthias; Kothe, Erika; Voigt, Kerstin; von Haeseler, Arndt

    2012-01-01

    The kingdom of fungi provides model organisms for biotechnology, cell biology, genetics, and life sciences in general. Only when their phylogenetic relationships are stably resolved, can individual results from fungal research be integrated into a holistic picture of biology. However, and despite recent progress, many deep relationships within the fungi remain unclear. Here, we present the first phylogenomic study of an entire eukaryotic kingdom that uses a consistency criterion to strengthen phylogenetic conclusions. We reason that branches (splits) recovered with independent data and different tree reconstruction methods are likely to reflect true evolutionary relationships. Two complementary phylogenomic data sets based on 99 fungal genomes and 109 fungal expressed sequence tag (EST) sets analyzed with four different tree reconstruction methods shed light from different angles on the fungal tree of life. Eleven additional data sets address specifically the phylogenetic position of Blastocladiomycota, Ustilaginomycotina, and Dothideomycetes, respectively. The combined evidence from the resulting trees supports the deep-level stability of the fungal groups toward a comprehensive natural system of the fungi. In addition, our analysis reveals methodologically interesting aspects. Enrichment for EST encoded data—a common practice in phylogenomic analyses—introduces a strong bias toward slowly evolving and functionally correlated genes. Consequently, the generalization of phylogenomic data sets as collections of randomly selected genes cannot be taken for granted. A thorough characterization of the data to assess possible influences on the tree reconstruction should therefore become a standard in phylogenomic analyses. PMID:22114356

  17. The phylogenetic affinities of the extinct glyptodonts.

    Delsuc, Frédéric; Gibb, Gillian C; Kuch, Melanie; Billet, Guillaume; Hautier, Lionel; Southon, John; Rouillard, Jean-Marie; Fernicola, Juan Carlos; Vizcaíno, Sergio F; MacPhee, Ross D E; Poinar, Hendrik N

    2016-02-22

    Among the fossils of hitherto unknown mammals that Darwin collected in South America between 1832 and 1833 during the Beagle expedition were examples of the large, heavily armored herbivores later known as glyptodonts. Ever since, glyptodonts have fascinated evolutionary biologists because of their remarkable skeletal adaptations and seemingly isolated phylogenetic position even within their natural group, the cingulate xenarthrans (armadillos and their allies). In possessing a carapace comprised of fused osteoderms, the glyptodonts were clearly related to other cingulates, but their precise phylogenetic position as suggested by morphology remains unresolved. To provide a molecular perspective on this issue, we designed sequence-capture baits using in silico reconstructed ancestral sequences and successfully assembled the complete mitochondrial genome of Doedicurus sp., one of the largest glyptodonts. Our phylogenetic reconstructions establish that glyptodonts are in fact deeply nested within the armadillo crown-group, representing a distinct subfamily (Glyptodontinae) within family Chlamyphoridae. Molecular dating suggests that glyptodonts diverged no earlier than around 35 million years ago, in good agreement with their fossil record. Our results highlight the derived nature of the glyptodont morphotype, one aspect of which is a spectacular increase in body size until their extinction at the end of the last ice age. PMID:26906483

  18. Sequence exploration reveals information bias among molecular markers used in phylogenetic reconstruction for Colletotrichum species.

    Rampersad, Sephra N; Hosein, Fazeeda N; Carrington, Christine Vf

    2014-01-01

    The Colletotrichum gloeosporioides species complex is among the most destructive fungal plant pathogens in the world; however, identification of isolates of quarantine importance to the intra-specific level is confounded by a number of factors that affect phylogenetic reconstruction. Information bias and quality parameters were investigated to determine whether nucleotide sequence alignments and phylogenetic trees accurately reflect the genetic diversity and phylogenetic relatedness of individuals. Sequence exploration of GAPDH, ACT, TUB2 and ITS markers indicated that the query sequences had different patterns of nucleotide substitution but were without evidence of base substitution saturation. Regions of high entropy were much more dispersed in the ACT and GAPDH marker alignments than for the ITS and TUB2 markers. A discernible bimodal gap in the genetic distance frequency histograms was produced for the ACT and GAPDH markers which indicated successful separation of intra- and inter-specific sequences in the data set. Overall, analyses indicated clear differences in the ability of these markers to phylogenetically separate individuals to the intra-specific level which coincided with information bias. PMID:25392785

  19. Phylogenetic-based propagation of functional annotations within the Gene Ontology consortium.

    Gaudet, Pascale; Livstone, Michael S; Lewis, Suzanna E; Thomas, Paul D

    2011-09-01

    The goal of the Gene Ontology (GO) project is to provide a uniform way to describe the functions of gene products from organisms across all kingdoms of life and thereby enable analysis of genomic data. Protein annotations are either based on experiments or predicted from protein sequences. Since most sequences have not been experimentally characterized, most available annotations need to be based on predictions. To make as accurate inferences as possible, the GO Consortium's Reference Genome Project is using an explicit evolutionary framework to infer annotations of proteins from a broad set of genomes from experimental annotations in a semi-automated manner. Most components in the pipeline, such as selection of sequences, building multiple sequence alignments and phylogenetic trees, retrieving experimental annotations and depositing inferred annotations, are fully automated. However, the most crucial step in our pipeline relies on software-assisted curation by an expert biologist. This curation tool, Phylogenetic Annotation and INference Tool (PAINT) helps curators to infer annotations among members of a protein family. PAINT allows curators to make precise assertions as to when functions were gained and lost during evolution and record the evidence (e.g. experimentally supported GO annotations and phylogenetic information including orthology) for those assertions. In this article, we describe how we use PAINT to infer protein function in a phylogenetic context with emphasis on its strengths, limitations and guidelines. We also discuss specific examples showing how PAINT annotations compare with those generated by other highly used homology-based methods. PMID:21873635

  20. Ant-Based Phylogenetic Reconstruction (ABPR): A new distance algorithm for phylogenetic estimation based on ant colony optimization

    Karla Vittori; Alexandre C B Delbem; Pereira, Sérgio L

    2008-01-01

    We propose a new distance algorithm for phylogenetic estimation based on Ant Colony Optimization (ACO), named Ant-Based Phylogenetic Reconstruction (ABPR). ABPR joins two taxa iteratively based on evolutionary distance among sequences, while also accounting for the quality of the phylogenetic tree built according to the total length of the tree. Similar to optimization algorithms for phylogenetic estimation, the algorithm allows exploration of a larger set of nearly optimal solutions. We appl...

  1. A More Accurate Fourier Transform

    Courtney, Elya

    2015-01-01

    Fourier transform methods are used to analyze functions and data sets to provide frequencies, amplitudes, and phases of underlying oscillatory components. Fast Fourier transform (FFT) methods offer speed advantages over evaluation of explicit integrals (EI) that define Fourier transforms. This paper compares frequency, amplitude, and phase accuracy of the two methods for well resolved peaks over a wide array of data sets including cosine series with and without random noise and a variety of physical data sets, including atmospheric $\mathrm{CO_2}$ concentrations, tides, temperatures, sound waveforms, and atomic spectra. The FFT uses MIT's FFTW3 library. The EI method uses the rectangle method to compute the areas under the curve via complex math. Results support the hypothesis that EI methods are more accurate than FFT methods. Errors range from 5 to 10 times higher when determining peak frequency by FFT, 1.4 to 60 times higher for peak amplitude, and 6 to 10 times higher for phase under a peak. The ability t...
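
    The explicit-integral idea can be sketched in a few lines: approximate the Fourier integral with a rectangle-rule sum, which can be evaluated at any trial frequency rather than only at FFT bin centres. The sketch below is an illustration under stated assumptions, not the paper's code: the test signal, the frequency grid, and the function names are all hypothetical, and it uses only the Python standard library rather than FFTW3.

```python
import cmath
import math


def explicit_fourier_amplitude(samples, dt, freq):
    """Rectangle-rule estimate of the Fourier integral at one frequency,
    scaled so that a pure cosine of unit amplitude gives roughly 1."""
    total = 0j
    for n, x in enumerate(samples):
        total += x * cmath.exp(-2j * math.pi * freq * n * dt) * dt
    duration = len(samples) * dt
    return 2.0 * abs(total) / duration


def peak_frequency(samples, dt, f_lo, f_hi, steps):
    """Scan a uniform grid of trial frequencies and return the (frequency, amplitude) of the peak."""
    best_f, best_a = f_lo, -1.0
    for k in range(steps + 1):
        f = f_lo + (f_hi - f_lo) * k / steps
        a = explicit_fourier_amplitude(samples, dt, f)
        if a > best_a:
            best_f, best_a = f, a
    return best_f, best_a


# Hypothetical test signal: a 10 s unit cosine sampled at 100 Hz.
dt = 0.01
true_freq = 3.37
samples = [math.cos(2 * math.pi * true_freq * n * dt) for n in range(1000)]

# An FFT of this record has bins spaced 1 / (N * dt) apart.
bin_spacing = 1.0 / (len(samples) * dt)

# The explicit sum can be scanned on a much finer grid (0.005 Hz here).
f_est, a_est = peak_frequency(samples, dt, 3.0, 4.0, 200)
print(bin_spacing, f_est, a_est)
```

    The cost is one pass over the data per trial frequency, which is the speed trade-off against the FFT that the abstract mentions.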

  2. Classiology and soil classification

    Rozhkov, V. A.

    2012-03-01

    Classiology can be defined as a science studying the principles and rules of classification of objects of any nature. The development of the theory of classification and the particular methods for classifying objects are the main challenges of classiology; to a certain extent, they are close to the challenges of pattern recognition. The methodology of classiology integrates a wide range of methods and approaches: from expert judgment to formal logic, multivariate statistics, and informatics. Soil classification assumes generalization of available data and practical experience, formalization of our notions about soils, and their representation in the form of an information system. As an information system, soil classification is designed to predict the maximum number of a soil's properties from the position of this soil in the classification space. The existing soil classification systems do not completely satisfy the principles of classiology. The violation of logical basis, poor structuring, low integrity, and inadequate level of formalization make these systems verbal schemes rather than classification systems sensu stricto. The concept of classification as listing (enumeration) of objects makes it possible to introduce the notion of the information base of classification. For soil objects, this is the database of soil indices (properties) that might be applied for generating target-oriented soil classification system. Mathematical methods enlarge the prognostic capacity of classification systems; they can be applied to assess the quality of these systems and to recognize new soil objects to be included in the existing systems. The application of particular principles and rules of classiology for soil classification purposes is discussed in this paper.

  3. Efficient Pairwise Multilabel Classification

    Loza Mencía, Eneldo

    2013-01-01

    Multilabel classification learning is the task of learning a mapping between objects and sets of possibly overlapping classes and has gained increasing attention in recent times. A prototypical application scenario for multilabel classification is the assignment of a set of keywords to a document, a frequently encountered problem in the text classification domain. With upcoming Web 2.0 technologies, this domain is extended by a wide range of tag suggestion tasks and the trend definitely...

  4. Classifier in Age classification

    B. Santhi; R. Seethalakshmi

    2012-01-01

    The face is an important feature of human beings. We can derive various properties of a person by analyzing the face. The objective of the study is to design a classifier for age using facial images. Age classification is essential in many applications, such as crime detection, employment and face detection. The proposed algorithm contains four phases: preprocessing, feature extraction, feature selection and classification. The classification employs two class labels, namely Child and Old. This st...

  5. Text Classification Using Sentential Frequent Itemsets

    Shi-Zhu Liu; He-Ping Hu

    2007-01-01

    Text classification techniques mostly rely on single-term analysis of the document data set, while many concepts, especially specific ones, are usually conveyed by sets of terms. To achieve a more accurate text classifier, more informative features, including frequently co-occurring words in the same sentence and their weights, are particularly important in such scenarios. In this paper, we propose a novel approach to text classification using sentential frequent itemsets, a concept that comes from association rule mining. The approach views a sentence rather than a document as a transaction and uses a variable precision rough set based method to evaluate each sentential frequent itemset's contribution to the classification. Experiments on the Reuters and newsgroup corpora validate the practicability of the proposed system.
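
    The "sentence as transaction" idea can be sketched as below. This is an illustrative assumption of how such itemsets might be counted, not the authors' method: it uses a naive sentence splitter, counts only itemsets of up to two words, and omits the variable precision rough set weighting entirely. The names `sentence_transactions` and `frequent_itemsets` are hypothetical.

```python
import re
from collections import Counter
from itertools import combinations


def sentence_transactions(text):
    """Split text into sentences and return each sentence as a set of lowercase words."""
    sentences = re.split(r"[.!?]+", text.lower())
    transactions = [set(re.findall(r"[a-z]+", s)) for s in sentences]
    return [t for t in transactions if t]  # drop empty sentences


def frequent_itemsets(transactions, min_support=2, max_size=2):
    """Count word sets that co-occur in at least `min_support` sentences."""
    counts = Counter()
    for words in transactions:
        for size in range(1, max_size + 1):
            # Sorting makes each itemset a canonical tuple, e.g. ("prices", "stock").
            for combo in combinations(sorted(words), size):
                counts[combo] += 1
    return {items: c for items, c in counts.items() if c >= min_support}


if __name__ == "__main__":
    sample = "Stock prices fell. Stock prices rose. Prices matter."
    for items, support in sorted(frequent_itemsets(sentence_transactions(sample)).items()):
        print(items, support)
```

    Counting per sentence rather than per document keeps only word pairs that appear close together, which is the motivation the abstract gives for sentential itemsets.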

  6. AGN Zoo and Classifications of Active Galaxies

    Mickaelian, Areg M.

    2015-07-01

    We review the variety of Active Galactic Nuclei (AGN) classes (so-called "AGN zoo") and classification schemes of galaxies by activity types based on their optical emission-line spectrum, as well as other parameters and other than optical wavelength ranges. A historical overview of discoveries of various types of active galaxies is given, including Seyfert galaxies, radio galaxies, QSOs, BL Lacertae objects, Starbursts, LINERs, etc. Various kinds of AGN diagnostics are discussed. All known AGN types and subtypes are presented and described to have a homogeneous classification scheme based on the optical emission-line spectra and in many cases, also other parameters. Problems connected with accurate classifications and open questions related to AGN and their classes are discussed and summarized.

  7. An Ensemble Classification Algorithm for Hyperspectral Images

    K. Kavitha

    2014-04-01

    Hyperspectral image analysis has been used for many purposes in environmental monitoring, remote sensing, vegetation research and land cover classification. A hyperspectral image consists of many layers, each representing a specific wavelength. The layers stack on top of one another, forming a cube-like image for the entire spectrum. This work aims to classify hyperspectral images and to produce an accurate thematic map. Spatial information is collected from the hyperspectral images by applying morphological profiles and local binary patterns. The support vector machine is an efficient classification algorithm for hyperspectral images. A genetic algorithm is used to select the best features for classification. The selected features are then classified to obtain the classes and produce a thematic map. Experiments are carried out on the AVIRIS Indian Pines and ROSIS Pavia University data sets. The proposed method achieves an accuracy of 93% for Indian Pines and 92% for Pavia University.

  8. Aspects de la classification

    Mari, Jean-François; Napoli, Amedeo

    1996-01-01

    Numerical classification techniques have always been present in pattern recognition. Neural networks demonstrate their (very?) good classification properties every day, and classification is increasingly present in knowledge representation. This report therefore presents, purely by way of introduction, the mathematical, statistical, neuromimetic and cognitive aspects of classification.

  9. Ontologies vs. Classification Systems

    Madsen, Bodil Nistrup; Erdman Thomsen, Hanne

    2009-01-01

    What is an ontology compared to a classification system? Is a taxonomy a kind of classification system or a kind of ontology? These are questions that we meet when working with people from industry and public authorities, who need methods and tools for concept clarification, for developing meta...... data sets or for obtaining advanced search facilities. In this paper we will present an attempt at answering these questions. We will give a presentation of various types of ontologies and briefly introduce terminological ontologies. Furthermore we will argue that classification systems, e.g. product...... classification systems and meta data taxonomies, should be based on ontologies....

  10. A Note on Encodings of Phylogenetic Networks of Bounded Level

    Gambette, Philippe

    2009-01-01

    Driven by the need for better models that allow one to shed light on the question of how life's diversity has evolved, phylogenetic networks have now joined phylogenetic trees in the center of phylogenetics research. Like phylogenetic trees, such networks canonically induce collections of phylogenetic trees, clusters, and triplets, respectively. Thus it is not surprising that many network approaches aim to reconstruct a phylogenetic network from such collections. Related to the well-studied perfect phylogeny problem, the following question is of fundamental importance in this context: When does one of the above collections encode (i.e. uniquely describe) the network that induces it? In this note, we present a complete answer to this question for the special case of a level-1 (phylogenetic) network by characterizing those level-1 networks for which an encoding in terms of one (or equivalently all) of the above collections exists. Given that this type of network forms the first layer of the rich hierarchy of lev...

  11. Molecular phylogenetics and historical biogeography amid shifting continents in the cockles and giant clams (Bivalvia: Cardiidae).

    Herrera, Nathanael D; Ter Poorten, Jan Johan; Bieler, Rüdiger; Mikkelsen, Paula M; Strong, Ellen E; Jablonski, David; Steppan, Scott J

    2015-12-01

    Reconstructing historical biogeography of the marine realm is complicated by indistinct barriers and, over deeper time scales, a dynamic landscape shaped by plate tectonics. Here we present the most extensive examination of model-based historical biogeography among marine invertebrates to date. We conducted the largest phylogenetic and molecular clock analyses to date for the bivalve family Cardiidae (cockles and giant clams) with three unlinked loci for 110 species representing 37 of the 50 genera. Ancestral ranges were reconstructed using the dispersal-extinction-cladogenesis (DEC) method with a time-stratified paleogeographic model wherein dispersal rates varied with shifting tectonics. Results were compared to previous classifications and the extensive paleontological record. Six of the eight prior subfamily groupings were found to be para- or polyphyletic. Cardiidae originated and subsequently diversified in the tropical Indo-Pacific starting in the Late Triassic. Eastern Atlantic species were mainly derived from the tropical Indo-Mediterranean region via the Tethys Sea. In contrast, the western Atlantic fauna was derived from Indo-Pacific clades. Our phylogenetic results demonstrated greater concordance with geography than did previous phylogenies based on morphology. Time-stratifying the DEC reconstruction improved the fit and was highly consistent with paleo-ocean currents and paleogeography. Lastly, combining molecular phylogenetics with a rich and well-documented fossil record allowed us to test the accuracy and precision of biogeographic range reconstructions. PMID:26234273

  12. Phylogenetic relationships of 18 passerines based on Adenylate Kinase Intron 5 sequences

    GUO Hui-yan; YU Hui-xin; BAI Su-ying; MA Yu-kun

    2008-01-01

    The 18 bird species studied are known to belong to the muscicapids, robins and sylviids among the passerines, but their classification remains disputed. In this study, the phylogenetic relationships of 18 species of passerines were examined using Adenylate Kinase Intron 5 (AK5) sequences and DNA techniques. Based on pairwise sequence comparison, phylogenetic trees of the 18 species were constructed using the Neighbor-Joining (NJ) and Maximum-Parsimony (MP) methods. The results showed that sylviids should be listed as an independent family, while robins and flycatchers should be placed in Muscicapidae. Since the phylogenetic relationship between long-tailed tits and Old World warblers is closer than that between long-tailed tits and parids, the long-tailed tits should be separated from Paridae and placed in Aegithalidae. Muscicapidae and Paridae are known to be two monophyletic families, but Sylviidae is not a monophyletic group. AK5 sequences were more effective at resolving close relationships among species within genera.

  13. Phylogenetic diversity and biogeography of the Mamiellophyceae lineage of eukaryotic phytoplankton across the oceans.

    Monier, Adam; Worden, Alexandra Z; Richards, Thomas A

    2016-08-01

    High-throughput diversity amplicon sequencing of marine microbial samples has revealed that members of the Mamiellophyceae lineage are successful phytoplankton in many oceanic habitats. Indeed, these eukaryotic green algae can dominate the picoplanktonic biomass; however, given the broad expanses of the oceans, their geographical distributions and the phylogenetic diversity of some groups remain poorly characterized. As these algae play a foundational role in marine food webs, it is crucial to assess their global distribution in order to better predict potential changes in abundance and community structure. To this end, we analyzed the V9-18S small subunit rDNA sequences deposited from the Tara Oceans expedition to evaluate the diversity and biogeography of these phytoplankton. Our results show that the phylogenetic composition of Mamiellophyceae communities is in part determined by geographical provenance, and does not appear to be influenced - in the samples recovered - by water depth, at least at the resolution possible with the V9-18S. Phylogenetic classification of Mamiellophyceae sequences revealed that the Dolichomastigales order encompasses more sequence diversity than other orders in this lineage. These results indicate that a large fraction of the Mamiellophyceae diversity has been hitherto overlooked, likely because of a combination of size fraction, sequencing and geographical limitations. PMID:26929141

  14. Phylogenetic inferences reveal a large extent of novel biodiversity in chemically rich tropical marine cyanobacteria.

    Engene, Niclas; Gunasekera, Sarath P; Gerwick, William H; Paul, Valerie J

    2013-03-01

    Benthic marine cyanobacteria are known for their prolific biosynthetic capacities to produce structurally diverse secondary metabolites with biomedical application and their ability to form cyanobacterial harmful algal blooms. In an effort to provide taxonomic clarity to better guide future natural product drug discovery investigations and harmful algal bloom monitoring, this study investigated the taxonomy of tropical and subtropical natural product-producing marine cyanobacteria on the basis of their evolutionary relatedness. Our phylogenetic inferences of marine cyanobacterial strains responsible for over 100 bioactive secondary metabolites revealed an uneven taxonomic distribution, with a few groups being responsible for the vast majority of these molecules. Our data also suggest a high degree of novel biodiversity among natural product-producing strains that was previously overlooked by traditional morphology-based taxonomic approaches. This unrecognized biodiversity is primarily due to a lack of proper classification systems since the taxonomy of tropical and subtropical, benthic marine cyanobacteria has only recently been analyzed by phylogenetic methods. This evolutionary study provides a framework for a more robust classification system to better understand the taxonomy of tropical and subtropical marine cyanobacteria and the distribution of natural products in marine cyanobacteria. PMID:23315747

  15. Controlled recovery of phylogenetic communities from an evolutionary model using a network approach

    Sousa, Arthur M. Y. R.; Vieira, André P.; Prado, Carmen P. C.; Andrade, Roberto F. S.

    2016-04-01

    This work reports the use of a complex network approach to produce a phylogenetic classification tree of a simple evolutionary model. This approach has already been used to treat proteomic data of actual extant organisms, but an investigation of its reliability to retrieve a traceable evolutionary history is missing. The evolutionary model used includes key ingredients for the emergence of groups of related organisms by differentiation through random mutations and population growth, but purposefully omits other realistic ingredients that are not strictly necessary to originate an evolutionary history. This choice causes the model to depend only on a small set of parameters, controlling the mutation probability and the population of different species. Our results indicate that for a set of parameter values, the phylogenetic classification produced by the framework reproduces the actual evolutionary history with a very high average degree of accuracy. This includes parameter values where the species originated by the evolutionary dynamics have modular structures. In the more general context of community identification in complex networks, our model offers a simple setting for evaluating the effects, on the efficiency of community formation and identification, of the underlying dynamics generating the network itself.

  16. Complete mitochondrial genomes elucidate phylogenetic relationships of the deep-sea octocoral families Coralliidae and Paragorgiidae

    Figueroa, Diego F.; Baco, Amy R.

    2014-01-01

    In the past decade, molecular phylogenetic analyses of octocorals have shown that the current morphological taxonomic classification of these organisms needs to be revised. The latest phylogenetic analyses show that most octocorals can be divided into three main clades. One of these clades contains the families Coralliidae and Paragorgiidae. These families share several taxonomically important characters and it has been suggested that they may not be monophyletic, with the possibility of the Coralliidae being a derived branch of the Paragorgiidae. Uncertainty exists not only in the relationship of these two families, but also in the classification of the two genera that make up the Coralliidae, Corallium and Paracorallium. Molecular analyses suggest that the genus Corallium is paraphyletic, and it can be divided into two main clades, with the Paracorallium as members of one of these clades. In this study we sequenced the whole mitochondrial genome of five species of Paragorgia and of five species of Corallium to use in a phylogenetic analysis to achieve two main objectives: the first to elucidate the phylogenetic relationship between the Paragorgiidae and Coralliidae and the second to determine whether the genera Corallium and Paracorallium are monophyletic. Our results show that other members of the Coralliidae share the two novel mitochondrial gene arrangements found in a previous study in Corallium konojoi and Paracorallium japonicum, and that the Corallium konojoi arrangement is also found in the Paragorgiidae. Our phylogenetic reconstruction based on all the protein coding genes and ribosomal RNAs of the mitochondrial genome suggests that the Coralliidae are not a derived branch of the Paragorgiidae, but rather a monophyletic sister branch to the Paragorgiidae. While our manuscript was in review a study was published using morphological data and several fragments from mitochondrial genes to redefine the taxonomy of the Coralliidae. Paracorallium was subsumed

  17. [Sequence variation of mitochondrial cytochrome b gene and phylogenetic relationships among twelve species of Charadriiformes].

    Chen, Xiao-Fang; Wang, Xiang; Yuan, Xiao-Dong; Tang, Min-Qian; Li, Yu-Xiang; Guo, Yu-Mei; Li, Qing-Wei

    2003-05-01

    Studies of the phylogenetic relationships of the Charadriiformes have been largely based on conservative morphological characters. During the past 10 years, many studies on the evolutionary biology of birds adopted phylogenetic information obtained from mitochondrial DNA, but little work on the Charadriiformes has been reported to date. Therefore, the phylogenetic relationships and classification of the Charadriiformes remain controversial. In this study, we try to shed light on these relationships via DNA sequence analysis of the mitochondrial Cyt b gene in 12 species of Charadriiformes. It was a preliminary study of the origin and evolution of the species using nucleotide sequence data. Using well-known PCR techniques, the complete mitochondrial Cyt b gene sequences were amplified and sequenced from Charadrius mongolus, Charadrius alexandrinus, Numenius madagascariensis, Numenius arquata, Numenius phaeopus, Tringa totanus, Tringa glareola, Xenus cinereus, Arenaria interpres, Calidris tenuirostris, Recurvirostra avosetta and Haematopus ostralegus. The 1143 bp long DNA sequences of the gene from these species were obtained, in which 381 variable sites were identified without insertions or deletions. The nucleic acid sequence variation of the mitochondrial Cyt b gene was 5.16%-16.01% among these species. Phylogenetic trees constructed using the NJ method, MP method and ML method with Ciconia ciconia as the outgroup indicate that the 12 species of Charadriiformes examined in this study are clustered in two major clades. The first clade includes T. totanus, T. glareola, A. interpres, C. tenuirostris, X. cinereus, N. madagascariensis, N. arquata and N. phaeopus. The second one includes C. mongolus, C. alexandrinus, R. avosetta and H. ostralegus. Our molecular data show that the phylogenetic relationships among species of Scolopacidae are consistent with the classification based on morphological studies; R. avosetta and H. ostralegus are relatively closer

  18. Acute pancreatitis - severity classification, complications and outcome

    Andersson, Bodil

    2010-01-01

    Acute pancreatitis, with an annual incidence of approximately 35 per 100 000 inhabitants in Sweden, is in most cases mild and self-limiting. Severe acute pancreatitis, affecting 10-15% of the cases is, however, associated with severe complications and even death. The optimal management of acute pancreatitis includes accurate early prediction of the disease severity. The aims of this thesis were to investigate early severity classification, complications and outcome in acute pancreatitis patie...

  19. Ensemble methods for noise in classification problems

    Verbaeten, Sofie; Van Assche, Anneleen

    2003-01-01

    Ensemble methods combine a set of classifiers to construct a new classifier that is (often) more accurate than any of its component classifiers. In this paper, we use ensemble methods to identify noisy training examples. More precisely, we consider the problem of mislabeled training examples in classification tasks, and address this problem by pre-processing the training set, i.e. by identifying and removing outliers from the training set. We study a number of filter techniques that are based...

  20. BIOPHARMACEUTICAL CLASSIFICATION SYSTEM AND BIOWAVER: AN OVERVIEW

    Puranik Prashant K; Kasar Sagar Ashok; Gadade Deepak Dilip; Mali Prabha R

    2011-01-01

    The biopharmaceutical classification system (BCS) has been developed to provide a scientific approach for classifying drug compounds based on solubility as related to dose and intestinal permeability in combination with the dissolution properties of the oral immediate release dosage form. The BCS is intended to provide a regulatory tool for replacing certain bioequivalence (BE) studies by accurate in vitro dissolution tests. This review gives three dimensionless numbers which are used in BCS: absorptio...

  1. Protein structure database search and evolutionary classification

    Yang, Jinn-Moon; Tung, Chi-Hua

    2006-01-01

    As more protein structures become available and structural genomics efforts provide structural models in a genome-wide strategy, there is a growing need for fast and accurate methods for discovering homologous proteins and evolutionary classifications of newly determined structures. We have developed 3D-BLAST, in part, to address these issues. 3D-BLAST is as fast as BLAST and calculates the statistical significance (E-value) of an alignment to indicate the reliability of the prediction. Using...

  2. Independent Comparison of Popular DPI Tools for Traffic Classification

    Bujlow, Tomasz; Carela-Español, Valentín; Barlet-Ros, Pere

    2015-01-01

    Deep Packet Inspection (DPI) is the state-of-the-art technology for traffic classification. According to the conventional wisdom, DPI is the most accurate classification technique. Consequently, most popular products, either commercial or open-source, rely on some sort of DPI for traffic......, application and web service). We carefully built a labeled dataset with more than 750K flows, which contains traffic from popular applications. We used the Volunteer-Based System (VBS), developed at Aalborg University, to guarantee the correct labeling of the dataset. We released this dataset, including full...

  3. Classification of pmoA amplicon pyrosequences using BLAST and the lowest common ancestor method in MEGAN

    Marc Gregory Dumont

    2014-02-01

    The classification of high-throughput sequencing data of protein-encoding genes is not as well established as for 16S rRNA. The objective of this work was to develop a simple and accurate method of classifying large datasets of pmoA sequences, a common marker for methanotrophic bacteria. A taxonomic system for pmoA was developed based on a phylogenetic analysis of available sequences. The taxonomy incorporates the known diversity of pmoA present in public databases, including sequences from both cultivated and uncultivated organisms. Representative sequences from closely related genes, such as those encoding the bacterial ammonia monooxygenase, were also included in the pmoA taxonomy. In total, 53 low-level taxa (genus-level) are included. Using previously published datasets of high-throughput pmoA amplicon sequence data, we tested two approaches for classifying pmoA: a naïve Bayesian classifier and BLAST. Classification of pmoA sequences based on BLAST analyses was performed using the lowest common ancestor (LCA) algorithm in MEGAN, a software program commonly used for the analysis of metagenomic data. Both the naïve Bayesian and BLAST methods were able to classify pmoA sequences and provided similar classifications; however, the naïve Bayesian classifier was prone to misclassifying contaminant sequences present in the datasets. Another advantage of the BLAST/LCA method was that it provided a user-interpretable output and enabled novelty detection at various levels, from highly divergent pmoA sequences to genus-level novelty.
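
    The lowest common ancestor (LCA) step can be sketched as follows. This is a simplified illustration and not MEGAN's implementation: it assumes each BLAST hit arrives as a (bit score, lineage) pair, keeps hits above a minimum score and within a percentage of the best score, and assigns the read to the deepest rank shared by all kept lineages. The function names, the thresholds, and the example lineages are assumptions made for this sketch and are not the pmoA taxonomy from the paper.

```python
def lowest_common_ancestor(lineages):
    """Return the longest prefix of ranks shared by every lineage (root first)."""
    if not lineages:
        return []
    shared = []
    for ranks in zip(*lineages):
        if all(r == ranks[0] for r in ranks):
            shared.append(ranks[0])
        else:
            break
    return shared


def classify_read(hits, min_score=50.0, top_percent=10.0):
    """Assign a read to the LCA of its best hits.

    hits: list of (bit_score, lineage) pairs, lineage ordered from root to genus.
    Hits below min_score, or more than top_percent below the best score, are ignored.
    """
    kept = [(score, lineage) for score, lineage in hits if score >= min_score]
    if not kept:
        return []  # unassigned: no hit is good enough
    best = max(score for score, _ in kept)
    cutoff = best * (1.0 - top_percent / 100.0)
    return lowest_common_ancestor([lineage for score, lineage in kept if score >= cutoff])


if __name__ == "__main__":
    # Illustrative lineages only.
    gamma = ["Bacteria", "Proteobacteria", "Gammaproteobacteria",
             "Methylococcales", "Methylococcaceae"]
    hits = [
        (200.0, gamma + ["Methylobacter"]),
        (195.0, gamma + ["Methylomonas"]),
        (80.0, ["Bacteria", "Proteobacteria", "Alphaproteobacteria"]),
    ]
    print(classify_read(hits))
```

    A read whose best hits disagree at genus level is placed at the family or higher, which is one way the approach yields the interpretable, rank-aware output that the abstract describes.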

  4. The impact of incorporating molecular evolutionary model into predictions of phylogenetic signal and noise

    Jeffrey Peter Townsend

    2014-04-01

    Phylogenetic inference can be improved by the development and use of better models for inference given the data available, or by gathering more appropriate data given the potential inferences to be made. Numerous studies have demonstrated the crucial importance of selecting a best-fit model for conducting accurate phylogenetic inference given a data set, explicitly revealing how model choice affects the results of phylogenetic inferences. However, the importance of specifying a correct model of evolution for predictions of the best data to be gathered has never been examined. Here, we extend analyses of phylogenetic signal and noise that predict the potential to resolve nodes in a phylogeny to incorporate all time-reversible Markov models of nucleotide substitution. Extending previous results on the canonical four-taxon tree, our theory yields an analytical method that uses estimates of the rates of evolution and the model of molecular evolution to predict the distribution of signal, noise, and polytomy. We applied our methods to a study of 29 taxa of the yeast genus Candida and allied members to predict the power of five markers, COX2, ACT1, RPB1, RPB2, and D1/D2 LSU, to resolve a poorly supported backbone node corresponding to a clade of haploid Candida species, as well as nineteen other nodes that are reasonably short and at least moderately deep in the consensus tree. The use of simple, unrealistic models that did not take into account transition/transversion rate differences led to some discrepancies in predictions, but overall our results demonstrate that predictions of signal and noise in phylogenetics are fairly robust to model specification.

  5. Phylogenetic position of the spirochetal genus Cristispira

    Paster, B.J.; Pelletier, D.A.; Dewhirst, F.E.; Weisburg, W.G.; Fussing, V.; Poulsen, Lars K.; Dannenberg, S.; Schroeder, I.

    1996-01-01

    Comparative sequence analysis of 16S rRNA genes was used to determine the phylogenetic relationship of the genus Cristispira to other spirochetes. Since Cristispira organisms cannot presently be grown in vitro, 16S rRNA genes were amplified directly from bacterial DNA isolated from Cristispira a...... genus within this family. A fluorescently labeled DNA probe designed from the CP1 sequence was used for in situ hybridization experiments to verify that the sequence obtained was derived from the observed Cristispira cells....

  6. Library Classification 2020

    Harris, Christopher

    2013-01-01

    In this article the author explores how a new library classification system might be designed using some aspects of the Dewey Decimal Classification (DDC) and ideas from other systems to create something that works for school libraries in the year 2020. By examining what works well with the Dewey Decimal System, what features should be carried…

  7. Musings on galaxy classification

    Classification schemes and their utility are discussed with a number of examples, particularly for cD galaxies. Data suggest that primordial turbulence rather than tidal torques is responsible for most of the presently observed angular momentum of galaxies. Finally, some of the limitations on present-day schemes for galaxy classification are pointed out. 54 references, 4 figures, 3 tables

  8. A survey of feature selection models for classification

    B. Kalpana

    2012-01-01

    The success of a machine learning algorithm depends on the quality of the data. The data given for classification should not contain irrelevant or redundant attributes, which increase the processing time. The data set selected for classification should contain the right attributes for accurate results. Feature selection is an essential data processing step prior to applying a learning algorithm. Here we discuss some basic feature selection models and evaluation functions. Experimental results are compared for individual datasets with the filter and wrapper models.

  9. Multi-Organ Cancer Classification and Survival Analysis

    Bauer, Stefan; Carion, Nicolas; Schüffler, Peter; Fuchs, Thomas; Wild, Peter; Buhmann, Joachim M.

    2016-01-01

    Accurate and robust cell nuclei classification is the cornerstone for a wider range of tasks in digital and Computational Pathology. However, most machine learning systems require extensive labeling from expert pathologists for each individual problem at hand, with no or limited abilities for knowledge transfer between datasets and organ sites. In this paper we implement and evaluate a variety of deep neural network models and model ensembles for nuclei classification in renal cell cancer (RC...

  10. A proposal for the morphological classification and nomenclature of neurons

    Rong Jiang; Qiang Liu; Quan Liu; Shenquan Liu

    2011-01-01

    The morphological and functional characteristics of neurons are quite varied and complex. There is a need for a comprehensive approach for distinguishing and classifying neurons. Similar to the biological species classification system, this study proposes a morphological classification system for neurons based on principal component analysis. Based on four principal components of neuronal morphology derived from principal component analysis, a nomenclature system for neurons was obtained. This system can accurately distinguish between the same type of neuron from different species.

  11. Robust Eye Localization by Combining Classification and Regression Methods

    Pak Il Nam; Ri Song Jin; Peter Peer

    2014-01-01

    Eye localization is an important part of a face recognition system, because its precision closely affects the performance of the system. In this paper we analyze the limitations of classification and regression methods and propose a robust and accurate eye localization method combining these two methods. The classification method in eye localization is robust, but its precision is not so high, while the regression method is sensitive to the initial position, but in case the initial position is ...

  12. A Novel Fault Classification Scheme Based on Least Square SVM

    Dubey, Harishchandra; Tiwari, A. K.; Nandita; Ray, P. K.; Mohanty, S. R.; Kishor, Nand

    2016-01-01

    This paper presents a novel approach for fault classification and section identification in a series compensated transmission line based on least square support vector machine. The current signal corresponding to one-fourth of the post fault cycle is used as input to the proposed modular LS-SVM classifier. The proposed scheme uses four binary classifiers: three for selection of the three phases and a fourth for ground detection. The proposed classification scheme is found to be accurate and reliable in ...

  13. Enhancing Accuracy of Plant Leaf Classification Techniques

    C. S. Sumathi

    2014-03-01

    Plants have become an important source of energy and are a fundamental piece in the puzzle of solving global warming. Living beings also depend on plants for their food, so it is of great importance to know about the plants growing around us and to preserve them. Automatic plant leaf classification is widely researched. This paper investigates the efficiency of Multi-Layer Perceptron (MLP) learning algorithms for plant leaf classification. Incremental back propagation, Levenberg–Marquardt and batch propagation learning algorithms are investigated. Plant leaf images are examined using three different MLP modelling techniques. Back propagation done in a batch manner increases the accuracy of plant leaf classification. Results reveal that batch training is faster and more accurate than MLP with incremental training and Levenberg–Marquardt based learning for plant leaf classification. Various levels of semi-batch training, used on 9 species with 15 samples each (135 instances in total), show a roughly linear increase in classification accuracy.

  14. Photometric Supernova Classification with Machine Learning

    Lochner, Michelle; McEwen, Jason D.; Peiris, Hiranya V.; Lahav, Ofer; Winter, Max K.

    2016-08-01

    Automated photometric supernova classification has become an active area of research in recent years in light of current and upcoming imaging surveys such as the Dark Energy Survey (DES) and the Large Synoptic Survey Telescope, given that spectroscopic confirmation of type for all supernovae discovered will be impossible. Here, we develop a multi-faceted classification pipeline, combining existing and new approaches. Our pipeline consists of two stages: extracting descriptive features from the light curves and classification using a machine learning algorithm. Our feature extraction methods vary from model-dependent techniques, namely SALT2 fits, to more independent techniques that fit parametric models to curves, to a completely model-independent wavelet approach. We cover a range of representative machine learning algorithms, including naive Bayes, k-nearest neighbors, support vector machines, artificial neural networks, and boosted decision trees (BDTs). We test the pipeline on simulated multi-band DES light curves from the Supernova Photometric Classification Challenge. Using the commonly used area under the curve (AUC) of the Receiver Operating Characteristic as a metric, we find that the SALT2 fits and the wavelet approach, with the BDTs algorithm, each achieve an AUC of 0.98, where 1 represents perfect classification. We find that a representative training set is essential for good classification, whatever the feature set or algorithm, with implications for spectroscopic follow-up. Importantly, we find that by using either the SALT2 or the wavelet feature sets with a BDT algorithm, accurate classification is possible purely from light curve data, without the need for any redshift information.
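
    The AUC metric quoted above can be computed without tracing the ROC curve explicitly, using the rank-sum (Mann-Whitney U) identity. The sketch below is a generic illustration, not the authors' pipeline; the function name and the toy labels and scores are assumptions.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    labels: sequence of 0/1 values (1 = positive class).
    scores: classifier scores, higher meaning "more likely positive".
    """
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied scores starting at position i.
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average 1-based rank shared by the ties
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1

    n_pos = sum(label for _, label in pairs)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative example")

    rank_sum_pos = sum(r for r, (_, label) in zip(ranks, pairs) if label == 1)
    u_statistic = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u_statistic / (n_pos * n_neg)


print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

    A value of 1 means every positive example is scored above every negative one, while 0.5 corresponds to random ranking.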

  15. 38 CFR 4.46 - Accurate measurement.

    2010-07-01

    ... 38 Pensions, Bonuses, and Veterans' Relief 1 2010-07-01 2010-07-01 false Accurate measurement. 4... RATING DISABILITIES Disability Ratings The Musculoskeletal System § 4.46 Accurate measurement. Accurate measurement of the length of stumps, excursion of joints, dimensions and location of scars with respect...

  16. A Distance Measure for Genome Phylogenetic Analysis

    Cao, Minh Duc; Allison, Lloyd; Dix, Trevor

    Phylogenetic analyses of species based on single genes or parts of the genomes are often inconsistent because of factors such as variable rates of evolution and horizontal gene transfer. The availability of more and more sequenced genomes allows phylogeny construction from complete genomes that is less sensitive to such inconsistency. For such long sequences, construction methods like maximum parsimony and maximum likelihood are often not possible due to their intensive computational requirements. Another class of tree construction methods, namely distance-based methods, requires a measure of distance between any two genomes. Some measures, such as evolutionary edit distance of gene order and gene content, are computationally expensive or do not perform well when the gene content of the organisms is similar. This study presents an information theoretic measure of genetic distances between genomes based on the biological compression algorithm expert model. We demonstrate that our distance measure can be applied to reconstruct the consensus phylogenetic tree of a number of Plasmodium parasites from their genomes, the statistical bias of which would mislead conventional analysis methods. Our approach is also used to successfully construct a plausible evolutionary tree for the γ-Proteobacteria group whose genomes are known to contain many horizontally transferred genes.
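
    The general idea of a compression-based distance can be illustrated with the normalized compression distance (NCD). Note the substitution: this sketch uses the general-purpose `zlib` compressor, not the expert model named in the abstract, and the function names are assumptions chosen for illustration.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (compression level 9)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte sequences.

    Values near 0 suggest that one sequence is largely predictable from
    the other; values near 1 suggest the sequences share little structure.
    """
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def pairwise_ncd(genomes: dict) -> dict:
    """Distance table keyed by (name_a, name_b) for all ordered pairs."""
    names = sorted(genomes)
    return {(a, b): ncd(genomes[a], genomes[b]) for a in names for b in names}
```

    A table like the one returned by `pairwise_ncd` could then be passed to any distance-based tree construction method. Because zlib only looks back over a 32 KB window, this stand-in is suitable for short toy sequences rather than whole genomes.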

  17. Epitope discovery with phylogenetic hidden Markov models.

    Lacerda, Miguel

    2010-05-01

    Existing methods for the prediction of immunologically active T-cell epitopes are based on the amino acid sequence or structure of pathogen proteins. Additional information regarding the locations of epitopes may be acquired by considering the evolution of viruses in hosts with different immune backgrounds. In particular, immune-dependent evolutionary patterns at sites within or near T-cell epitopes can be used to enhance epitope identification. We have developed a mutation-selection model of T-cell epitope evolution that allows the human leukocyte antigen (HLA) genotype of the host to influence the evolutionary process. This is one of the first examples of the incorporation of environmental parameters into a phylogenetic model and has many other potential applications where the selection pressures exerted on an organism can be related directly to environmental factors. We combine this novel evolutionary model with a hidden Markov model to identify contiguous amino acid positions that appear to evolve under immune pressure in the presence of specific host immune alleles and that therefore represent potential epitopes. This phylogenetic hidden Markov model provides a rigorous probabilistic framework that can be combined with sequence or structural information to improve epitope prediction. As a demonstration, we apply the model to a data set of HIV-1 protein-coding sequences and host HLA genotypes.

  18. Phylogenetic analysis of honey bee behavioral evolution.

    Raffiudin, Rika; Crozier, Ross H

    2007-05-01

    DNA sequences from three mitochondrial (rrnL, cox2, nad2) and one nuclear gene (itpr) from all 9 known honey bee species (Apis), a 10th possible species, Apis dorsata binghami, and three outgroup species (Bombus terrestris, Melipona bicolor and Trigona fimbriata) were used to infer Apis phylogenetic relationships using Bayesian analysis. The dwarf honey bees were confirmed as basal, and the giant and cavity-nesting species to be monophyletic. All nodes were strongly supported except that grouping Apis cerana with A. nigrocincta. Two thousand post-burnin trees from the phylogenetic analysis were used in a Bayesian comparative analysis to explore the evolution of dance type, nest structure, comb structure and dance sound within Apis. The ancestral honey bee species was inferred with high support to have nested in the open, and to have more likely than not had a silent vertical waggle dance and a single comb. The common ancestor of the giant and cavity-dwelling bees is strongly inferred to have had a buzzing vertical directional dance. All pairwise combinations of characters showed strong association, but the multiple comparisons problem reduces the ability to infer associations between states between characters. Nevertheless, a buzzing dance is significantly associated with cavity-nesting, several vertical combs, and dancing vertically, a horizontal dance is significantly associated with a nest with a single comb wrapped around the support, and open nesting with a single pendant comb and a silent waggle dance. PMID:17123837

  19. Phylogenetic diversity of Mesorhizobium in chickpea

    Dong Hyun Kim; Mayank Kaashyap; Abhishek Rathore; Roma R Das; Swathi Parupalli; Hari D Upadhyaya; S Gopalakrishnan; Pooran M Gaur; Sarvjeet Singh; Jagmeet Kaur; Mohammad Yasin; Rajeev K Varshney

    2014-06-01

    Crop domestication, in general, has reduced genetic diversity in the cultivated gene pool of chickpea (Cicer arietinum) as compared with wild species (C. reticulatum, C. bijugum). To explore the impact of domestication on symbiosis, 10 accessions of chickpeas, including 4 accessions of C. arietinum and 3 accessions each of C. reticulatum and C. bijugum, were selected and DNA was extracted from their nodules. To distinguish the chickpea symbiont, preliminary sequence analysis was attempted with 9 genes (16S rRNA, atpD, dnaJ, glnA, gyrB, nifH, nifK, nodD and recA), of which 3 genes (gyrB, nifK and nodD) were selected for further phylogenetic analysis on the basis of sufficient sequence diversity. Phylogenetic analysis and sequence diversity for the 3 genes demonstrated that sequences from C. reticulatum were more diverse. Nodule occupancy by the dominant symbiont also indicated that C. reticulatum (60%) could have a greater variety of symbionts than cultivated chickpea (80%). The study demonstrated that wild chickpeas (C. reticulatum) could be used for selecting more diverse symbionts under field conditions, and it implies that chickpea domestication affected symbiosis negatively in addition to reducing genetic diversity.

  20. A phylogenetic blueprint for a modern whale.

    Gatesy, John; Geisler, Jonathan H; Chang, Joseph; Buell, Carl; Berta, Annalisa; Meredith, Robert W; Springer, Mark S; McGowen, Michael R

    2013-02-01

    The emergence of Cetacea in the Paleogene represents one of the most profound macroevolutionary transitions within Mammalia. The move from a terrestrial habitat to a committed aquatic lifestyle engendered wholesale changes in anatomy, physiology, and behavior. The results of this remarkable transformation are extant whales that include the largest, biggest brained, fastest swimming, loudest, deepest diving mammals, some of which can detect prey with a sophisticated echolocation system (Odontoceti - toothed whales), and others that batch feed using racks of baleen (Mysticeti - baleen whales). A broad-scale reconstruction of the evolutionary remodeling that culminated in extant cetaceans has not yet been based on integration of genomic and paleontological information. Here, we first place Cetacea relative to extant mammalian diversity, and assess the distribution of support among molecular datasets for relationships within Artiodactyla (even-toed ungulates, including Cetacea). We then merge trees derived from three large concatenations of molecular and fossil data to yield a composite hypothesis that encompasses many critical events in the evolutionary history of Cetacea. By combining diverse evidence, we infer a phylogenetic blueprint that outlines the stepwise evolutionary development of modern whales. This hypothesis represents a starting point for more detailed, comprehensive phylogenetic reconstructions in the future, and also highlights the synergistic interaction between modern (genomic) and traditional (morphological+paleontological) approaches that ultimately must be exploited to provide a rich understanding of evolutionary history across the entire tree of Life. PMID:23103570

  1. Behavior Based Social Dimensions Extraction for Multi-Label Classification.

    Le Li

    Classification based on social dimensions is commonly used to handle the multi-label classification task in heterogeneous networks. However, traditional methods, which mostly rely on the community detection algorithms to extract the latent social dimensions, produce unsatisfactory performance when community detection algorithms fail. In this paper, we propose a novel behavior based social dimensions extraction method to improve the classification performance in multi-label heterogeneous networks. In our method, nodes' behavior features, instead of community memberships, are used to extract social dimensions. By introducing Latent Dirichlet Allocation (LDA) to model the network generation process, nodes' connection behaviors with different communities can be extracted accurately, which are applied as latent social dimensions for classification. Experiments on various public datasets reveal that the proposed method can obtain satisfactory classification results in comparison to other state-of-the-art methods on smaller social dimensions.

  2. Behavior Based Social Dimensions Extraction for Multi-Label Classification.

    Li, Le; Xu, Junyi; Xiao, Weidong; Ge, Bin

    2016-01-01

    Classification based on social dimensions is commonly used to handle the multi-label classification task in heterogeneous networks. However, traditional methods, which mostly rely on the community detection algorithms to extract the latent social dimensions, produce unsatisfactory performance when community detection algorithms fail. In this paper, we propose a novel behavior based social dimensions extraction method to improve the classification performance in multi-label heterogeneous networks. In our method, nodes' behavior features, instead of community memberships, are used to extract social dimensions. By introducing Latent Dirichlet Allocation (LDA) to model the network generation process, nodes' connection behaviors with different communities can be extracted accurately, which are applied as latent social dimensions for classification. Experiments on various public datasets reveal that the proposed method can obtain satisfactory classification results in comparison to other state-of-the-art methods on smaller social dimensions. PMID:27049849

  3. Behavior Based Social Dimensions Extraction for Multi-Label Classification

    Li, Le; Xu, Junyi; Xiao, Weidong; Ge, Bin

    2016-01-01

    Classification based on social dimensions is commonly used to handle the multi-label classification task in heterogeneous networks. However, traditional methods, which mostly rely on the community detection algorithms to extract the latent social dimensions, produce unsatisfactory performance when community detection algorithms fail. In this paper, we propose a novel behavior based social dimensions extraction method to improve the classification performance in multi-label heterogeneous networks. In our method, nodes’ behavior features, instead of community memberships, are used to extract social dimensions. By introducing Latent Dirichlet Allocation (LDA) to model the network generation process, nodes’ connection behaviors with different communities can be extracted accurately, which are applied as latent social dimensions for classification. Experiments on various public datasets reveal that the proposed method can obtain satisfactory classification results in comparison to other state-of-the-art methods on smaller social dimensions. PMID:27049849

  4. Hyperspectral Data Classification Using Factor Graphs

    Makarau, A.; Müller, R.; Palubinskas, G.; Reinartz, P.

    2012-07-01

    Accurate classification of hyperspectral data is still a competitive task, and new classification methods are being developed to meet the needs of hyperspectral data use. The objective of this paper is to develop a new method for hyperspectral data classification that ensures classification model properties such as transferability, generalization and probabilistic interpretation. While factor graphs (undirected graphical models) are unfortunately not widely employed in remote sensing tasks, these models possess important properties, such as the ability to represent complex systems for modeling estimation and decision-making tasks. In this paper we present a new method for hyperspectral data classification using factor graphs. A factor graph (a bipartite graph consisting of variable and factor vertices) allows factorization of a more complex function, leading to the definition of variables (employed to store input data), latent variables (which bridge an abstract class to the data), and factors (which define prior probabilities for spectral features and abstract classes, map the input data to a mixture of spectral features, and bridge that mixture to an abstract class). Latent variables play an important role by defining a two-level mapping of the input spectral features to a class. Configuring (learning) the model on training data yields a parameter set that lets the model bridge the input data to a class. The classification algorithm is as follows. Spectral bands are separately pre-processed (using unsupervised clustering) so that they are defined on a finite domain (alphabet), which leads to a representation of the data as a multinomial distribution. The represented hyperspectral data is used as input evidence (the evidence vector is selected pixelwise) in a configured factor graph, and inference is run to obtain the posterior probability. Variational inference (mean field) yields plausible results with a low calculation time. Calculating the posterior probability for each class ...

  5. Constructing Phenetic and Phylogenetic Relationship Using Clad'97

    Estri Laras Arumningtyas

    2012-01-01

    Relationship construction holds a very important position in the classification process used to arrange the taxonomy of organisms. In taxonomy, the two most familiar relationship diagrams are the cladogram and the phenogram. In every construction activity, a researcher faces character state data from the taxa that become components of the diagram. The calculation used for construction often incorporates an iterative or repetitive process that needs time and precision. Calculating tools that produce both text and graphical output will hopefully decrease the time and errors involved in construction. The basic algorithm used for phylogenetic construction is that of Kluge and Farris (1969); phenetic construction uses cluster analysis with slight modification. The basic common algorithm used in the software computes a two-dimensional array holding the taxa x characters matrix and creates a distance or similarity matrix. In more detail, the program creates a one-dimensional array of taxonomic objects, and each object has further one-dimensional arrays containing the data commonly found in a taxonomic unit. The relationships between one object and the others are regulated by an object created from a class representing the taxonomic tree. The cladogram is constructed by calculating the nearest distance between each taxon (OTU) and creating one HTU at every bifurcation. The phenogram is constructed agglomeratively by searching for the highest similarity between taxa, which are then grouped into a new taxon. The program calculates numerical data after character scoring is done. The final result may differ for each user because of decisions made by the user during the construction process. This paper will hopefully attract people from systematic computation to develop the software further into an open-source, multi-platform tool.
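
    The phenogram steps described above (score characters, build a distance matrix from the taxa x characters matrix, then repeatedly group the most similar taxa) can be sketched as follows. This is not the Clad'97 code: the Manhattan distance, the average-linkage rule, the nested-tuple output and all names are assumptions made for illustration.

```python
from itertools import combinations


def manhattan(a, b):
    """Sum of absolute differences between two character-state vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def distance_matrix(char_matrix):
    """Pairwise distances from a {taxon: [character states]} mapping."""
    return {
        frozenset((s, t)): manhattan(char_matrix[s], char_matrix[t])
        for s, t in combinations(char_matrix, 2)
    }


def phenogram(char_matrix):
    """Agglomerative (average-linkage) grouping, returned as nested tuples."""
    dist = distance_matrix(char_matrix)
    # Each current cluster (a taxon name or a nested tuple) maps to its members.
    clusters = {name: [name] for name in char_matrix}

    def avg_distance(c1, c2):
        members1, members2 = clusters[c1], clusters[c2]
        total = sum(dist[frozenset((s, t))] for s in members1 for t in members2)
        return total / (len(members1) * len(members2))

    while len(clusters) > 1:
        # The most similar pair is the one with the smallest average distance.
        c1, c2 = min(combinations(clusters, 2), key=lambda p: avg_distance(*p))
        clusters[(c1, c2)] = clusters.pop(c1) + clusters.pop(c2)
    return next(iter(clusters))


scores = {"A": [0, 0, 0, 0], "B": [0, 0, 0, 1], "C": [1, 1, 1, 0], "D": [1, 1, 1, 1]}
print(phenogram(scores))  # (('A', 'B'), ('C', 'D'))
```

    Ties between equally similar pairs are broken here by input order, which is one way different users or settings could end up with different final diagrams.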

  6. Cluster Based Text Classification Model

    Nizamani, Sarwat; Memon, Nasrullah; Wiil, Uffe Kock

    2011-01-01

    We propose a cluster based classification model for suspicious email detection and other text classification tasks. The text classification tasks comprise many training examples that require a complex classification model. Using clusters for classification makes the model simpler and increases the...... classifier is trained on each cluster having reduced dimensionality and fewer examples. The experimental results show that the proposed model outperforms the existing classification models for the task of suspicious email detection and topic categorization on the Reuters-21578 and 20 Newsgroups...... datasets. Our model also outperforms A Decision Cluster Classification (ADCC) and the Decision Cluster Forest Classification (DCFC) models on the Reuters-21578 dataset....

  7. Phylogenetic positions of several amitochondriate protozoa-Evidence from phylogenetic analysis of DNA topoisomerase II

    HE De; DONG Jiuhong; WEN Jianfan; XIN Dedong; LU Siqi

    2005-01-01

    Several groups of parasitic protozoa, as represented by Giardia, Trichomonas, Entamoeba and Microsporida, were once widely considered to be the most primitive extant eukaryotic group, Archezoa. The main evidence for this was their 'lacking mitochondria', their possession of some other primitive features intermediate between prokaryotes and eukaryotes, and their position basal to all eukaryotes with mitochondria in phylogenies inferred from many molecules. Some authors even proposed that these organisms diverged before the endosymbiotic origin of mitochondria within eukaryotes. This view was once considered to be very significant to the study of the origin and evolution of eukaryotic cells (eukaryotes). However, in recent years it has been challenged by accumulating evidence from new studies. Here the sequences of DNA topoisomerase II in G. lamblia, T. vaginalis and E. histolytica were first identified by PCR and sequencing; then, combining these with the sequence data of the microsporidian Encephalitozoon cuniculi and of other eukaryotic groups of different evolutionary positions from GenBank, phylogenetic trees were constructed by various methods to investigate the evolutionary positions of these amitochondriate protozoa. Our results showed that, because the characteristics of DNA topoisomerase II allow it to avoid the 'long-branch attraction' artifact that appeared in previous phylogenetic analyses, our trees not only effectively reflect the widely accepted relationships among the major eukaryotic groups, but also reveal phylogenetic positions for these amitochondriate protozoa that differ from those in previous phylogenetic trees. They are not the earliest-branching eukaryotes, but diverged after some mitochondriate organisms such as kinetoplastids and mycetozoans; they are not a united group but occupy different phylogenetic positions. Combining with the recent cytological findings of mitochondria-like organelles in them, we think that though some of them (e.g. diplomonads, as represented

  8. Computational Prediction of Phylogenetically Conserved Sequence Motifs for Five Different Candidate Genes in Type II Diabetic Nephropathy

    P Srinivasan

    2012-07-01

    Background: Computational identification of phylogenetic motifs helps to clarify known functional features, including catalytic sites, substrate binding epitopes, and protein-protein interfaces. Furthermore, such motifs are strongly conserved among orthologs, indicating their evolutionary importance. The study aimed to analyze five candidate genes involved in type II diabetic nephropathy and to predict phylogenetic motifs from their corresponding orthologous protein sequences. Methods: AKR1B1, APOE, ENPP1, ELMO1 and IGFBP1 are genes that have been identified through experimental studies as important targets for type II diabetic nephropathy. Their corresponding protein sequences, structures and orthologous sequences were retrieved from UniProtKB, PDB, and the PHOG database, respectively. Multiple sequence alignments were constructed using ClustalW and phylogenetic motifs were identified using MINER. The occurrence of amino acids in the obtained phylogenetic motifs was generated using WebLogo and false positive expectations were calculated against phylogenetic similarity. Results: In total, 17 phylogenetic motifs (PMs) were identified from the five proteins; residues such as glycine, leucine, tryptophan and aspartic acid were found at appreciable frequency, whereas arginine was identified in all the predicted PMs. The result implies that these residues can be important to the functional and structural roles of the proteins, and the calculated false positive expectations imply that they were generally conserved in the traditional sense. Conclusion: The prediction of phylogenetic motifs is an accurate method for detecting functionally important conserved residues. The conserved motifs can be used as potential drug targets for type II diabetic nephropathy.

  9. Molecular phylogenetic evaluation of classification and scenarios of character evolution in calcareous sponges (Porifera, Class Calcarea).

    Oliver Voigt; Eilika Wülfing; Gert Wörheide

    2012-01-01

    Calcareous sponges (Phylum Porifera, Class Calcarea) are known to be taxonomically difficult. Previous molecular studies have revealed many discrepancies between classically recognized taxa and the observed relationships at the order, family and genus levels; these inconsistencies question underlying hypotheses regarding the evolution of certain morphological characters. Therefore, we extended the available taxa and character set by sequencing the complete small subunit (SSU) rDNA and the alm...

  10. Hierarchical Markov random-field modeling for texture classification in chest radiographs

    Vargas-Voracek, Rene; Floyd, Carey E., Jr.; Nolte, Loren W.; McAdams, Page

    1996-04-01

    A hierarchical Markov random field (MRF) modeling approach is presented for the classification of textures in selected regions of interest (ROIs) of chest radiographs. The procedure integrates possible texture classes and their spatial definition with other components present in an image such as noise and background trend. Classification is performed as a maximum a-posteriori (MAP) estimation of texture class and involves an iterative Gibbs-sampling technique. Two cases are studied: classification of lung parenchyma versus bone and classification of normal lung parenchyma versus miliary tuberculosis (MTB). Accurate classification was obtained for all examined cases, showing the potential of the proposed modeling approach for texture analysis of radiographic images.

  11. Pitch Based Sound Classification

    Nielsen, Andreas Brinch; Hansen, Lars Kai; Kjems, U.

    2006-01-01

    A sound classification model is presented that can classify signals into music, noise and speech. The model extracts the pitch of the signal using the harmonic product spectrum. Based on the pitch estimate and a pitch error measure, features are created and used in a probabilistic model with soft-max output function. Both linear and quadratic inputs are used. The model is trained on 2 hours of sound and tested on publicly available data. A test classification error below 0.05 with 1 s classif...
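
    The harmonic product spectrum mentioned above estimates pitch by multiplying the magnitude spectrum with downsampled copies of itself, so that the harmonics of the true fundamental line up on a single bin. The sketch below is a minimal pure-Python illustration; the frame length, Hann window, number of harmonics and function names are assumptions, and the naive DFT is only practical for short frames.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Naive O(n^2) DFT magnitudes for bins 0..n/2 (adequate for short frames)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def hps_pitch(frame, sample_rate, harmonics=3):
    """Estimate the fundamental frequency (Hz) of one frame with the HPS."""
    n = len(frame)
    # A Hann window reduces leakage between neighbouring frequency bins.
    windowed = [
        x * (0.5 - 0.5 * math.cos(2 * math.pi * t / (n - 1)))
        for t, x in enumerate(frame)
    ]
    spectrum = magnitude_spectrum(windowed)

    # Only bins whose highest considered harmonic still lies in the spectrum.
    limit = len(spectrum) // harmonics
    product = [1.0] * limit
    for h in range(1, harmonics + 1):
        for k in range(limit):
            product[k] *= spectrum[k * h]  # spectrum downsampled by factor h

    best_bin = max(range(1, limit), key=lambda k: product[k])  # skip the DC bin
    return best_bin * sample_rate / n
```

    Because the product rewards bins whose multiples also carry energy, the estimate can recover a fundamental that is weaker than its own harmonics. The frequency resolution is limited to `sample_rate / n` per bin.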

  12. Fast Computations for Measures of Phylogenetic Beta Diversity

    Tsirogiannis, Constantinos; Sandel, Brody

    2016-01-01

    For many applications in ecology, it is important to examine the phylogenetic relations between two communities of species. More formally, let T be a phylogenetic tree and let A and B be two samples of its tips, representing the examined communities. We want to compute a value that expresses the phylogenetic diversity between A and B in T . There exist several measures that can do this; these are the so-called phylogenetic beta diversity (β-diversity) measures. Two popular measures of this ki...

  13. Phylogenetic Structure of Foliar Spectral Traits in Tropical Forest Canopies

    Kelly M. McManus

    2016-02-01

    The Spectranomics approach to tropical forest remote sensing has established a link between foliar reflectance spectra and the phylogenetic composition of tropical canopy tree communities vis-à-vis the taxonomic organization of biochemical trait variation. However, a direct relationship between phylogenetic affiliation and foliar reflectance spectra of species has not been established. We sought to develop this relationship by quantifying the extent to which underlying patterns of phylogenetic structure drive interspecific variation among foliar reflectance spectra within three Neotropical canopy tree communities with varying levels of soil fertility. We interpreted the resulting spectral patterns of phylogenetic signal in the context of foliar biochemical traits that may contribute to the spectral-phylogenetic link. We utilized a multi-model ensemble to elucidate trait-spectral relationships, and quantified phylogenetic signal for spectral wavelengths and traits using Pagel’s lambda statistic. Foliar reflectance spectra showed evidence of phylogenetic influence primarily within the visible and shortwave infrared spectral regions. These regions were also selected by the multi-model ensemble as those most important to the quantitative prediction of several foliar biochemical traits. Patterns of phylogenetic organization of spectra and traits varied across sites and with soil fertility, indicative of the complex interactions between the environmental and phylogenetic controls underlying patterns of biodiversity.

  14. Automatic selection of reference taxa for protein-protein interaction prediction with phylogenetic profiling

    Simonsen, Martin; Maetschke, S.R.; Ragan, M.A.

    2012-01-01

    Motivation: Phylogenetic profiling methods can achieve good accuracy in predicting protein–protein interactions, especially in prokaryotes. Recent studies have shown that the choice of reference taxa (RT) is critical for accurate prediction, but with more than 2500 fully sequenced taxa publicly......: We present three novel methods for automating the selection of RT, using machine learning based on known protein–protein interaction networks. One of these methods in particular, Tree-Based Search, yields greatly improved prediction accuracies. We further show that different methods for constituting...

  15. Dating human cultural capacity using phylogenetic principles.

    Lind, J; Lindenfors, P; Ghirlanda, S; Lidén, K; Enquist, M

    2013-01-01

    Humans have genetically based unique abilities making complex culture possible; an assemblage of traits which we term "cultural capacity". The age of this capacity has long been subject to controversy. We apply phylogenetic principles to date this capacity, integrating evidence from archaeology, genetics, paleoanthropology, and linguistics. We show that cultural capacity is older than the first split in the modern human lineage, and at least 170,000 years old, based on data on hyoid bone morphology, FOXP2 alleles, agreement between genetic and language trees, fire use, burials, and the early appearance of tools comparable to those of modern hunter-gatherers. We cannot exclude that Neanderthals had cultural capacity some 500,000 years ago. A capacity for complex culture, therefore, must have existed before complex culture itself. It may even have originated long before. This seeming paradox is resolved by theoretical models suggesting that cultural evolution is exceedingly slow in its initial stages. PMID:23648831

  16. Phylogenetic position of the spirochetal genus Cristispira

    Paster, B.J.; Pelletier, D.A.; Dewhirst, F.E.;

    1996-01-01

    a cell-laden crystalline styles of the oyster Crassostrea virginica. The amplified products were then cloned into Escherichia coli plasmids. Sequence comparisons of the gene coding for 16S rRNA (rDNA) insert of one clone, designated CP1, indicated that it was spirochetal. The sequence of the 16S rDNA...... insert of another clone was mycoplasmal. The CP1 sequence possessed most of the individual base signatures that are unique to 16S rRNA (or rDNA) sequences of known spirochetes. CP1 branched deeply among other spirochetal genera within the family Spirochaetaceae, and accordingly, it represents a separate......Comparative sequence analysis of 16S rRNA genes was used to determine the phylogenetic relationship of the genus Cristispira to other spirochetes. Since Cristispira organisms cannot presently be grown in vitro, 16S rRNA genes were amplified directly from bacterial DNA isolated from Cristispira...

  17. Learning Apache Mahout classification

    Gupta, Ashish

    2015-01-01

    If you are a data scientist who has some experience with the Hadoop ecosystem and machine learning methods and want to try out classification on large datasets using Mahout, this book is ideal for you. Knowledge of Java is essential.

  18. Classification in Medical Imaging

    Chen, Chen

    detection in a cardiovascular disease study. The third focus is to deepen the understanding of classification mechanism by visualizing the knowledge learned by a classifier. More specifically, to build the most typical patterns recognized by the Fisher's linear discriminant rule with applications......Classification is extensively used in the context of medical image analysis for the purpose of diagnosis or prognosis. In order to classify image content correctly, one needs to extract efficient features with discriminative properties and build classifiers based on these features. In addition......, a good metric is required to measure distance or similarity between feature points so that the classification becomes feasible. Furthermore, in order to build a successful classifier, one needs to deeply understand how classifiers work. This thesis focuses on these three aspects of classification...

  19. S1 gene-based phylogeny of infectious bronchitis virus: An attempt to harmonize virus classification.

    Valastro, Viviana; Holmes, Edward C; Britton, Paul; Fusaro, Alice; Jackwood, Mark W; Cattoli, Giovanni; Monne, Isabella

    2016-04-01

    Infectious bronchitis virus (IBV) is the causative agent of a highly contagious disease that results in severe economic losses to the global poultry industry. The virus exists in a wide variety of genetically distinct viral types, and both phylogenetic analysis and measures of pairwise similarity among nucleotide or amino acid sequences have been used to classify IBV strains. However, there is currently no consensus on the method by which IBV sequences should be compared, and heterogeneous genetic group designations that are inconsistent with phylogenetic history have been adopted, leading to the confusing coexistence of multiple genotyping schemes. Herein, we propose a simple and repeatable phylogeny-based classification system combined with an unambiguous and rational lineage nomenclature for the assignment of IBV strains. By using complete nucleotide sequences of the S1 gene we determined the phylogenetic structure of IBV, which in turn allowed us to define 6 genotypes that together comprise 32 distinct viral lineages and a number of inter-lineage recombinants. Because of extensive rate variation among IBVs, we suggest that the inference of phylogenetic relationships alone represents a more appropriate criterion for sequence classification than pairwise sequence comparisons. The adoption of an internationally accepted viral nomenclature is crucial for future studies of IBV epidemiology and evolution, and the classification scheme presented here can be updated and revised as novel S1 sequences become available. PMID:26883378

  20. Inhibition in multiclass classification

    Huerta, Ramón; Vembu, Shankar; Amigó, José M.; Nowotny, Thomas; Elkan, Charles

    2012-01-01

    The role of inhibition is investigated in a multiclass support vector machine formalism inspired by the brain structure of insects. The so-called mushroom bodies have a set of output neurons, or classification functions, that compete with each other to encode a particular input. Strongly active output neurons depress or inhibit the remaining outputs without knowing which is correct or incorrect. Accordingly, we propose to use a classification function that embodies unselective inhibition and ...

  1. Twitter content classification

    Dann, Stephen

    2010-01-01

    This paper delivers a new Twitter content classification framework based on sixteen existing Twitter studies and a grounded theory analysis of a personal Twitter history. It expands the existing understanding of Twitter as a multifunction tool for personal, professional, commercial and phatic communications with a split-level classification scheme that offers broad categorization and specific subcategories for deeper insight into the real-world application of the service.

  2. Text classification method review

    Mahinovs, Aigars; Tiwari, Ashutosh; Roy, Rajkumar; Baxter, David

    2007-01-01

    With the explosion of information fuelled by the growth of the World Wide Web, it is no longer feasible for a human observer to understand all the data coming in or even classify it into categories. With this growth of information and the simultaneous growth of available computing power, automatic classification of data, particularly textual data, gains increasingly high importance. This paper provides a review of the generic text classification process, the phases of that process and met...

  3. Automatic Arabic Text Classification

    Al-harbi, S; Almuhareb, A.; Al-Thubaity, A; Khorsheed, M. S.; Al-Rajeh, A.

    2008-01-01

    Automated document classification is an important text mining task, especially given the rapid growth in the number of online documents in the Arabic language. Text classification aims to automatically assign a text to a predefined category based on linguistic features. Such a process has different useful applications including, but not restricted to, e-mail spam detection, web page content filtering, and automatic message routing. This paper presents the results of experiments on documen...

  4. Classification of Sleep Disorders

    Michael J. Thorpy

    2012-01-01

    The classification of sleep disorders is necessary to discriminate between disorders and to facilitate an understanding of symptoms, etiology, and pathophysiology that allows for appropriate treatment. The earliest classification systems, largely organized according to major symptoms (insomnia, excessive sleepiness, and abnormal events that occur during sleep), were unable to be based on pathophysiology because the cause of most sleep disorders was unknown. These 3 symptom-based categories ar...

  5. Latent classification models

    Langseth, Helge; Nielsen, Thomas Dyhre

    2005-01-01

    parametric family of distributions. In this paper we propose a new set of models for classification in continuous domains, termed latent classification models. The latent classification model can roughly be seen as combining the Naive Bayes (NB) model with a mixture of factor analyzers, thereby relaxing the assumptions of... classification model, and we demonstrate empirically that the accuracy of the proposed model is significantly higher than the accuracy of other probabilistic classifiers....

  6. Classifications of Software Transfers

    Wohlin, Claes; Smite, Darja

    2012-01-01

    Many companies have development sites around the globe. This inevitably means that development work may be transferred between the sites. This paper defines a classification of software transfer types; it divides transfers into three main types: full, partial and gradual transfers to describe the context of a transfer. The differences between transfer types, and hence the need for a classification, are illustrated with staffing curves for two different transfer types. The staffing curves are ...

  7. A New Classification Approach Based on Multiple Classification Rules

    Zhongmei Zhou

    2014-01-01

    A good classifier can correctly predict new data for which the class label is unknown, so it is important to construct a high-accuracy classifier. Hence, classification techniques are very useful in ubiquitous computing. Associative classification achieves higher classification accuracy than some traditional rule-based classification approaches. However, the approach also has two major deficiencies. First, it generates a very large number of association classification rules, especially when t...

  8. Supernova Photometric Lightcurve Classification

    Zaidi, Tayeb; Narayan, Gautham

    2016-01-01

    This is a preliminary report on photometric supernova classification. We first explore the properties of supernova light curves, and attempt to restructure the unevenly sampled and sparse data from assorted datasets to allow for processing and classification. The data was primarily drawn from the Dark Energy Survey (DES) simulated data, created for the Supernova Photometric Classification Challenge. This poster shows a method for producing a non-parametric representation of the light curve data, and applying a Random Forest classifier algorithm to distinguish between supernovae types. We examine the impact of Principal Component Analysis to reduce the dimensionality of the dataset, for future classification work. The classification code will be used in a stage of the ANTARES pipeline, created for use on the Large Synoptic Survey Telescope alert data and other wide-field surveys. The final figure-of-merit for the DES data in the r band was 60% for binary classification (Type I vs II). Zaidi was supported by the NOAO/KPNO Research Experiences for Undergraduates (REU) Program which is funded by the National Science Foundation Research Experiences for Undergraduates Program (AST-1262829).

  9. Progressive Classification Using Support Vector Machines

    Wagstaff, Kiri; Kocurek, Michael

    2009-01-01

    An algorithm for progressive classification of data, analogous to progressive rendering of images, makes it possible to compromise between speed and accuracy. This algorithm uses support vector machines (SVMs) to classify data. An SVM is a machine learning algorithm that builds a mathematical model of the desired classification concept by identifying the critical data points, called support vectors. Coarse approximations to the concept require only a few support vectors, while precise, highly accurate models require far more support vectors. Once the model has been constructed, the SVM can be applied to new observations. The cost of classifying a new observation is proportional to the number of support vectors in the model. When computational resources are limited, an SVM of the appropriate complexity can be produced. However, if the constraints are not known when the model is constructed, or if they can change over time, a method for adaptively responding to the current resource constraints is required. This capability is particularly relevant for spacecraft (or any other real-time systems) that perform onboard data analysis. The new algorithm enables the fast, interactive application of an SVM classifier to a new set of data. The classification process achieved by this algorithm is characterized as progressive because a coarse approximation to the true classification is generated rapidly and thereafter iteratively refined. The algorithm uses two SVMs: (1) a fast, approximate one and (2) a slow, highly accurate one. New data are initially classified by the fast SVM, producing a baseline approximate classification. For each classified data point, the algorithm calculates a confidence index that indicates the likelihood that it was classified correctly in the first pass. Next, the data points are sorted by their confidence indices and progressively reclassified by the slower, more accurate SVM, starting with the items most likely to be incorrectly classified. The user...
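
    The two-pass procedure described in this record can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `progressive_classify`, `fast_model`, `slow_model` and the `budget` parameter are hypothetical names, and the two models are simple threshold functions standing in for the fast and slow SVMs.

```python
from typing import Callable, List, Sequence, Tuple


def progressive_classify(
    points: Sequence[float],
    fast: Callable[[float], Tuple[int, float]],
    slow: Callable[[float], int],
    budget: int,
) -> List[int]:
    """Label every point with `fast`, then refine the least confident with `slow`."""
    first_pass = [fast(x) for x in points]
    labels = [label for label, _ in first_pass]
    # Sort indices by ascending confidence so likely mistakes are revisited first.
    order = sorted(range(len(points)), key=lambda i: first_pass[i][1])
    # `budget` caps how many points the expensive model may reclassify.
    for i in order[:budget]:
        labels[i] = slow(points[i])
    return labels


# Hypothetical stand-ins for the two SVMs (plain thresholds, not real SVMs).
def fast_model(x: float) -> Tuple[int, float]:
    # Confidence is assumed to be the distance from the decision boundary.
    return (1 if x > 0.5 else 0, abs(x - 0.5))


def slow_model(x: float) -> int:
    return 1 if x > 0.4 else 0


points = [0.1, 0.45, 0.9, 0.7]
baseline = progressive_classify(points, fast_model, slow_model, budget=0)
refined = progressive_classify(points, fast_model, slow_model, budget=1)
```

    With a budget of zero the result is the fast baseline; raising the budget moves the labels toward what the slow model alone would produce, which is the speed/accuracy trade-off the abstract describes.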

  10. Rapid and accurate pyrosequencing of angiosperm plastid genomes

    Farmerie William G

    2006-08-01

    Background. Plastid genome sequence information is vital to several disciplines in plant biology, including phylogenetics and molecular biology. The past five years have witnessed a dramatic increase in the number of completely sequenced plastid genomes, fuelled largely by advances in conventional Sanger sequencing technology. Here we report a further significant reduction in time and cost for plastid genome sequencing through the successful use of a newly available pyrosequencing platform, the Genome Sequencer 20 (GS 20) System (454 Life Sciences Corporation), to rapidly and accurately sequence the whole plastid genomes of the basal eudicot angiosperms Nandina domestica (Berberidaceae) and Platanus occidentalis (Platanaceae). Results. More than 99.75% of each plastid genome was simultaneously obtained during two GS 20 sequence runs, to an average depth of coverage of 24.6× in Nandina and 17.3× in Platanus. The Nandina and Platanus plastid genomes shared essentially identical gene complements and possessed the typical angiosperm plastid structure and gene arrangement. To assess the accuracy of the GS 20 sequence, over 45 kilobases of sequence were generated for each genome using conventional sequencing. Overall error rates of 0.043% and 0.031% were observed in GS 20 sequence for Nandina and Platanus, respectively. More than 97% of all observed errors were associated with homopolymer runs, with ~60% of all errors associated with homopolymer runs of 5 or more nucleotides and ~50% of all errors associated with regions of extensive homopolymer runs. No substitution errors were present in either genome. Error rates were generally higher in the single-copy and noncoding regions of both plastid genomes relative to the inverted repeat and coding regions. Conclusion. Highly accurate and essentially complete sequence information was obtained for the Nandina and Platanus plastid genomes using the GS 20 System. More importantly, the high accuracy...

  11. Hierarchical classification of social groups

    Витковская, Мария

    2001-01-01

    Classification problems are important for every science, sociology included. Social phenomena can be examined more deeply when viewed through the classification of social groups. At present, no single common classification of groups exists. This article offers a hierarchical classification of social groups.

  12. The Complete Chloroplast Genome Sequences of Five Epimedium Species: Lights into Phylogenetic and Taxonomic Analyses

    Zhang, Yanjun; Du, Liuwen; Liu, Ao; Chen, Jianjun; Wu, Li; Hu, Weiming; Zhang, Wei; Kim, Kyunghee; Lee, Sang-Choon; Yang, Tae-Jin; Wang, Ying

    2016-01-01

    Epimedium L. is a phylogenetically and economically important genus in the family Berberidaceae. We sequenced the complete chloroplast (cp) genomes of four Epimedium species using Illumina sequencing technology via a combination of de novo and reference-guided assembly; combined with the previously reported cp genome sequence of E. koreanum, this is the first comprehensive cp genome analysis of Epimedium. The five Epimedium cp genomes exhibited a typical quadripartite and circular structure that was rather conserved in genomic structure and the synteny of gene order. However, these cp genomes presented obvious variations at the boundaries of the four regions because of the expansion and contraction of the inverted repeat (IR) region and the single-copy (SC) boundary regions. The trnQ-UUG duplication occurred in the five Epimedium cp genomes and was not found in the other basal eudicotyledons. Rapidly evolving cp genome regions were detected among the five cp genomes, and differences in simple sequence repeats (SSR) and repeat sequences were identified. Phylogenetic relationships among the five Epimedium species based on their cp genomes were, on the whole, in accordance with the updated system of the genus, but indicated that the evolutionary relationships and the divisions of the genus need further investigation with more evidence. The availability of these cp genomes provides valuable genetic information for accurately identifying species and for the taxonomy, phylogenetic resolution and evolution of Epimedium, and will assist in the exploration and utilization of Epimedium plants. PMID:27014326

  13. Iris Image Classification Based on Hierarchical Visual Codebook.

    Zhenan Sun; Hui Zhang; Tieniu Tan; Jianyu Wang

    2014-06-01

    Iris recognition as a reliable method for personal identification has been well-studied with the objective to assign the class label of each iris image to a unique subject. In contrast, iris image classification aims to classify an iris image to an application specific category, e.g., iris liveness detection (classification of genuine and fake iris images), race classification (e.g., classification of iris images of Asian and non-Asian subjects), coarse-to-fine iris identification (classification of all iris images in the central database into multiple categories). This paper proposes a general framework for iris image classification based on texture analysis. A novel texture pattern representation method called Hierarchical Visual Codebook (HVC) is proposed to encode the texture primitives of iris images. The proposed HVC method is an integration of two existing Bag-of-Words models, namely Vocabulary Tree (VT), and Locality-constrained Linear Coding (LLC). The HVC adopts a coarse-to-fine visual coding strategy and takes advantages of both VT and LLC for accurate and sparse representation of iris texture. Extensive experimental results demonstrate that the proposed iris image classification method achieves state-of-the-art performance for iris liveness detection, race classification, and coarse-to-fine iris identification. A comprehensive fake iris image database simulating four types of iris spoof attacks is developed as the benchmark for research of iris liveness detection. PMID:26353275

  14. A Novel Vehicle Classification Using Embedded Strain Gauge Sensors

    Qi Wang

    2008-11-01

    This paper presents a new vehicle classification method and develops a traffic monitoring detector to provide reliable vehicle classification to aid traffic management systems. The basic principle of this approach is based on measuring the dynamic strain caused by vehicles across pavement to obtain the corresponding vehicle parameters – wheelbase and number of axles – to then accurately classify the vehicle. A system prototype with five embedded strain sensors was developed to validate the accuracy and effectiveness of the classification method. From the special arrangement of the sensors and the different times at which a vehicle arrives at the sensors, one can estimate the vehicle’s speed accurately, and correspondingly the vehicle wheelbase and number of axles. Because of measurement errors and vehicle characteristics, there is a lot of overlap between vehicle wheelbase patterns. Therefore, directly setting up a fixed threshold for vehicle classification often leads to low-accuracy results. Machine learning pattern recognition is believed to be one of the most effective tools for dealing with this problem. In this study, support vector machines (SVMs) were used to integrate the classification features extracted from the strain sensors to automatically classify vehicles into five types, ranging from small vehicles to combination trucks, along the lines of the Federal Highway Administration vehicle classification guide. Test bench and field experiments will be introduced in this paper. Two support vector machine classification algorithms (one-against-all, one-against-one) are used to classify single sensor data and multiple sensor combination data. Comparison of the two classification method results shows that the classification accuracy is very close using single data or multiple data. Our results indicate that using multiclass SVM-based fusion multiple sensor data significantly improves...
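
    The two multiclass schemes named in this record (one-against-all and one-against-one) differ only in how binary decisions are combined. The sketch below illustrates that difference under loud assumptions: the class names, the `CENTROIDS` wheelbase values and the distance-based `duel` and scorer functions are made-up stand-ins for trained binary SVMs, not values or code from the paper.

```python
from collections import Counter
from itertools import combinations
from typing import Callable, Dict, Sequence


def one_against_all(x: float, scorers: Dict[str, Callable[[float], float]]) -> str:
    """One scorer per class; the class whose scorer is highest wins."""
    return max(scorers, key=lambda label: scorers[label](x))


def one_against_one(
    x: float, labels: Sequence[str], duel: Callable[[str, str, float], str]
) -> str:
    """One binary contest per pair of classes; the class with most votes wins."""
    votes = Counter(duel(a, b, x) for a, b in combinations(labels, 2))
    return votes.most_common(1)[0][0]


# Hypothetical wheelbase centroids in metres (illustrative values only).
CENTROIDS = {"car": 2.7, "truck": 4.5, "bus": 6.0}

# Stand-in for per-class SVM decision values: closer to the centroid scores higher.
scorers = {label: (lambda x, c=c: -abs(x - c)) for label, c in CENTROIDS.items()}


def duel(a: str, b: str, x: float) -> str:
    # Stand-in for a pairwise SVM: the nearer centroid wins the contest.
    return a if abs(x - CENTROIDS[a]) <= abs(x - CENTROIDS[b]) else b


ova = one_against_all(4.4, scorers)
ovo = one_against_one(4.4, list(CENTROIDS), duel)
```

    For k classes, one-against-all needs k binary models while one-against-one needs k(k-1)/2, so the second scheme trades more models for simpler individual decisions.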

  15. Challenging of Facial Expressions Classification Systems: Survey, Critical Considerations and Direction of Future Work

    Amir Jamshidnezhad; M.D. Jan Nordin

    2012-01-01

    The main purpose of this study is to analyze the parameters of facial expression classification systems and their effects on system performance. In recent years, understanding of emotions has become a basic requirement in the development of Human Computer Interaction (HCI) systems. Therefore, an HCI system is highly dependent on accurate understanding of facial expression. The classification module is the main part of a facial expression recognition system. Numerous classification techniques were propose...

  16. Phylogenetic diversity (PD) and biodiversity conservation: some bioinformatics challenges

    Daniel P. Faith

    2006-01-01

    Biodiversity conservation addresses information challenges through estimations encapsulated in measures of diversity. A quantitative measure of phylogenetic diversity, “PD”, has been defined as the minimum total length of all the phylogenetic branches required to span a given set of taxa on the phylogenetic tree (Faith 1992a). While a recent paper incorrectly characterizes PD as not including information about deeper phylogenetic branches, PD applications over the past decade document the proper incorporation of shared deep branches when assessing the total PD of a set of taxa. Current PD applications to macroinvertebrate taxa in streams of New South Wales, Australia illustrate the practical importance of this definition. Phylogenetic lineages, often corresponding to new, “cryptic”, taxa, are restricted to a small number of stream localities. A recent case of human impact causing loss of taxa in one locality implies a higher PD value for another locality, because it now uniquely represents a deeper branch. This molecular-based phylogenetic pattern supports the use of DNA barcoding programs for biodiversity conservation planning. Here, PD assessments side-step the contentious use of barcoding-based “species” designations. Bioinformatics challenges include combining different phylogenetic evidence, optimization problems for conservation planning, and effective integration of phylogenetic information with environmental and socio-economic data.
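
    The PD definition quoted in this record (total length of the branches needed to span a set of taxa) can be illustrated with a small sketch. The tree encoding, the function name `phylogenetic_diversity` and the `include_root_path` switch are assumptions made for this example; the switch covers the common rooted variant in which branches shared by all selected taxa, down to the root, are also counted.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical tree, stored as child -> (parent, branch length):
#   root --1.0-- X --2.0-- A
#                  \--3.0-- B
#   root --4.0-- C
TREE: Dict[str, Tuple[str, float]] = {
    "A": ("X", 2.0),
    "B": ("X", 3.0),
    "X": ("root", 1.0),
    "C": ("root", 4.0),
}


def phylogenetic_diversity(
    parent: Dict[str, Tuple[str, float]],
    taxa: Iterable[str],
    include_root_path: bool = False,
) -> float:
    """Sum the lengths of the branches that span `taxa` on the tree."""
    selected = set(taxa)
    below: Dict[str, int] = {}
    # Count, for each branch, how many selected taxa lie beneath it.
    for taxon in selected:
        node = taxon
        while node in parent:
            below[node] = below.get(node, 0) + 1
            node = parent[node][0]
    total = 0.0
    for node, count in below.items():
        # A branch with every selected taxon beneath it lies above their common
        # ancestor; it is only counted in the rooted variant.
        if count < len(selected) or include_root_path:
            total += parent[node][1]
    return total


pd_ab = phylogenetic_diversity(TREE, ["A", "B"])
pd_ab_rooted = phylogenetic_diversity(TREE, ["A", "B"], include_root_path=True)
pd_abc = phylogenetic_diversity(TREE, ["A", "B", "C"])
```

    In this toy tree, adding taxon C to the set {A, B} brings in both C's own branch and the deeper branch above X, which is the "shared deep branches" effect the abstract emphasizes.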

  17. Product Classification in Supply Chain

    Xing, Lihong; Xu, Yaoxuan

    2010-01-01

    Oriflame is a famous international direct-sales cosmetics company with a complicated supply chain operation, but it lacks a product classification system. It is vital to design a product classification method in order to support Oriflame global supply planning and improve the supply chain performance. This article aims to investigate and design the multi-criteria of product classification, propose the classification model, suggest application areas of product classification results and intro...

  18. A Fuzzy Logic Based Sentiment Classification

    J.I.Sheeba

    2014-07-01

    Sentiment classification aims to detect information such as opinions and explicit or implicit feelings expressed in text. Most existing approaches are able to detect either explicit or implicit expressions of sentiment in the text, but only separately. The proposed framework detects both implicit and explicit expressions available in meeting transcripts. It classifies positive, negative, and neutral words and also identifies the topic of the particular meeting transcripts by using fuzzy logic. This paper aims to add some additional features for improving the classification method. The quality of the sentiment classification is improved using the proposed fuzzy logic framework, which includes features such as fuzzy rules and the fuzzy C-means algorithm. The quality of the output is evaluated using precision, recall, and F-measure, while the fuzzy C-means clustering is measured in terms of purity and entropy. The data set was validated using 10-fold cross-validation, and a 95% confidence interval was observed between the accuracy values. Finally, the proposed fuzzy logic method produced more than 85% accurate results, with a very low error rate compared to existing sentiment classification techniques.
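
    The evaluation measures named in this record (precision, recall, F-measure for the classifier; purity and entropy for the clustering) have standard definitions, sketched below. The function names and the toy labels are assumptions for illustration only; the numbers are not results from the paper.

```python
import math
from collections import Counter
from typing import Hashable, Sequence, Tuple


def precision_recall_f1(
    true: Sequence[Hashable], pred: Sequence[Hashable], positive: Hashable
) -> Tuple[float, float, float]:
    """Precision, recall and F-measure for one class treated as positive."""
    tp = sum(1 for t, p in zip(true, pred) if p == positive and t == positive)
    fp = sum(1 for t, p in zip(true, pred) if p == positive and t != positive)
    fn = sum(1 for t, p in zip(true, pred) if p != positive and t == positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def purity_and_entropy(
    clusters: Sequence[Hashable], labels: Sequence[Hashable]
) -> Tuple[float, float]:
    """Cluster purity and size-weighted entropy (in bits) against true labels."""
    groups = {}
    for cluster, label in zip(clusters, labels):
        groups.setdefault(cluster, []).append(label)
    n = len(labels)
    # Purity: fraction of points that belong to their cluster's majority label.
    purity = sum(max(Counter(g).values()) for g in groups.values()) / n
    entropy = 0.0
    for g in groups.values():
        h = -sum((m / len(g)) * math.log2(m / len(g)) for m in Counter(g).values())
        entropy += (len(g) / n) * h
    return purity, entropy


# Toy data, made up for this example.
p, r, f1 = precision_recall_f1(
    ["pos", "pos", "neg", "neg"], ["pos", "neg", "neg", "neg"], "pos"
)
purity, entropy = purity_and_entropy([0, 0, 1, 1], ["a", "a", "a", "b"])
```

    Higher purity and lower entropy both indicate clusters that line up more closely with the reference labels.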

  19. Phylogenetic relationships of typical antbirds (Thamnophilidae) and test of incongruence based on Bayes factors

    Nylander Johan AA

    2004-07-01

    Background. The typical antbirds (Thamnophilidae) form a monophyletic and diverse family of suboscine passerines that inhabit neotropical forests. However, the phylogenetic relationships within this assemblage are poorly understood. Herein, we present a hypothesis of the generic relationships of this group based on Bayesian inference analyses of two nuclear introns and the mitochondrial cytochrome b gene. The level of phylogenetic congruence between the individual genes has been investigated utilizing Bayes factors. We also explore how changes in the substitution models affected the observed incongruence between partitions of our data set. Results. The phylogenetic analysis supports both novel relationships and traditional groupings. Among the more interesting novel relationships suggested is that the Terenura antwrens, the wing-banded antbird (Myrmornis torquata), the spot-winged antshrike (Pygiptila stellaris) and the russet antshrike (Thamnistes anabatinus) are sisters to all other typical antbirds. The remaining genera fall into two major clades. The first includes antshrikes, antvireos and the Herpsilochmus antwrens, while the second clade consists of most antwren genera, the Myrmeciza antbirds, the "professional" ant-following antbirds, and allied species. Our results also support previously suggested polyphyly of Myrmotherula antwrens and Myrmeciza antbirds. The tests of phylogenetic incongruence, using Bayes factors, clearly suggest that allowing the gene partitions to have separate topology parameters increased the model likelihood. However, changing a component of the nucleotide substitution model had a much higher impact on the model likelihood. Conclusions. The phylogenetic results are in broad agreement with traditional classification of the typical antbirds, but some relationships are unexpected based on external morphology. In these cases their true affinities may have been obscured by convergent evolution and...

  20. Open Reading Frame Phylogenetic Analysis on the Cloud

    Che-Lun Hung

    2013-01-01

    Phylogenetic analysis has become essential in researching the evolutionary relationships between viruses. These relationships are depicted on phylogenetic trees, in which viruses are grouped based on sequence similarity. Viral evolutionary relationships are identified from open reading frames rather than from complete sequences. Recently, cloud computing has become popular for developing internet-based bioinformatics tools. Biocloud is an efficient, scalable, and robust bioinformatics computing service. In this paper, we propose a cloud-based open reading frame phylogenetic analysis service. The proposed service integrates the Hadoop framework, virtualization technology, and phylogenetic analysis methods to provide a high-availability, large-scale bioservice. In a case study, we analyze the phylogenetic relationships among Norovirus. Evolutionary relationships are elucidated by aligning different open reading frame sequences. The proposed platform correctly identifies the evolutionary relationships between members of Norovirus.

  1. Open reading frame phylogenetic analysis on the cloud.

    Hung, Che-Lun; Lin, Chun-Yuan

    2013-01-01

    Phylogenetic analysis has become essential in researching the evolutionary relationships between viruses. These relationships are depicted on phylogenetic trees, in which viruses are grouped based on sequence similarity. Viral evolutionary relationships are identified from open reading frames rather than from complete sequences. Recently, cloud computing has become popular for developing internet-based bioinformatics tools. Biocloud is an efficient, scalable, and robust bioinformatics computing service. In this paper, we propose a cloud-based open reading frame phylogenetic analysis service. The proposed service integrates the Hadoop framework, virtualization technology, and phylogenetic analysis methods to provide a high-availability, large-scale bioservice. In a case study, we analyze the phylogenetic relationships among Norovirus. Evolutionary relationships are elucidated by aligning different open reading frame sequences. The proposed platform correctly identifies the evolutionary relationships between members of Norovirus. PMID:23671843

  2. Visualising very large phylogenetic trees in three dimensional hyperbolic space

    Liberles David A

    2004-04-01

    Background. Common existing phylogenetic tree visualisation tools are not able to display readable trees with more than a few thousand nodes. These existing methodologies are based in two-dimensional space. Results. We introduce the idea of visualising phylogenetic trees in three-dimensional hyperbolic space with the Walrus graph visualisation tool and have developed a conversion tool that enables the conversion of standard phylogenetic tree formats to Walrus' format. With Walrus, it becomes possible to visualise and navigate phylogenetic trees with more than 100,000 nodes. Conclusion. Walrus enables desktop visualisation of very large phylogenetic trees in three-dimensional hyperbolic space. This application is potentially useful for visualisation of the tree of life and for functional genomics derivatives, like The Adaptive Evolution Database (TAED).

  3. Motif-Based Text Mining of Microbial Metagenome Redundancy Profiling Data for Disease Classification.

    Wang, Yin; Li, Rudong; Zhou, Yuhua; Ling, Zongxin; Guo, Xiaokui; Xie, Lu; Liu, Lei

    2016-01-01

    Background. Text data of 16S rRNA are informative for classifications of microbiota-associated diseases. However, the raw text data need to be systematically processed so that features for classification can be defined/extracted; moreover, the high-dimension feature spaces generated by the text data also pose an additional difficulty. Results. Here we present a Phylogenetic Tree-Based Motif Finding algorithm (PMF) to analyze 16S rRNA text data. By integrating phylogenetic rules and other statistical indexes for classification, we can effectively reduce the dimension of the large feature spaces generated by the text datasets. Using the retrieved motifs in combination with common classification methods, we can discriminate different samples of both pneumonia and dental caries better than other existing methods. Conclusions. We extend the phylogenetic approaches to perform supervised learning on microbiota text data to discriminate the pathological states for pneumonia and dental caries. The results have shown that PMF may enhance the efficiency and reliability in analyzing high-dimension text data. PMID:27057545

  4. The neuron classification problem

    Bota, Mihail; Swanson, Larry W.

    2007-01-01

    A systematic account of neuron cell types is a basic prerequisite for determining the vertebrate nervous system global wiring diagram. With comprehensive lineage and phylogenetic information unavailable, a general ontology based on structure-function taxonomy is proposed and implemented in a knowledge management system, and a prototype analysis of select regions (including retina, cerebellum, and hypothalamus) presented. The supporting Brain Architecture Knowledge Management System (BAMS) Neu...

  5. Site-specific time heterogeneity of the substitution process and its impact on phylogenetic inference

    Philippe Hervé

    2011-01-01

    Background. Model violations constitute the major limitation in inferring accurate phylogenies. Characterizing properties of the data that are not being correctly handled by current models is therefore of prime importance. One of the properties of protein evolution is the variation of the relative rate of substitutions across sites and over time, the latter being the phenomenon called heterotachy. Its effect on phylogenetic inference has recently obtained considerable attention, which led to the development of new models of sequence evolution. However, thus far focus has been on the quantitative heterogeneity of the evolutionary process, thereby overlooking more qualitative variations. Results. We studied the importance of variation of the site-specific amino-acid substitution process over time and its possible impact on phylogenetic inference. We used the CAT model to define an infinite mixture of substitution processes characterized by equilibrium frequencies over the twenty amino acids, a useful proxy for qualitatively estimating the evolutionary process. Using two large datasets, we show that qualitative changes in site-specific substitution properties over time occurred significantly. To test whether this unaccounted qualitative variation can lead to an erroneous phylogenetic tree, we analyzed a concatenation of mitochondrial proteins in which Cnidaria and Porifera were erroneously grouped. The progressive removal of the sites with the most heterogeneous CAT profiles across clades led to the recovery of the monophyly of Eumetazoa (Cnidaria+Bilateria), suggesting that this heterogeneity can negatively influence phylogenetic inference. Conclusion. The time-heterogeneity of the amino-acid replacement process is therefore an important evolutionary aspect that should be incorporated in future models of sequence change.

  6. Search techniques in intelligent classification systems

    Savchenko, Andrey V

    2016-01-01

    A unified methodology for categorizing various complex objects is presented in this book. Through probability theory, novel asymptotically minimax criteria suitable for practical applications in imaging and data analysis are examined, including special cases such as the Jensen-Shannon divergence and the probabilistic neural network. An optimal approximate nearest neighbor search algorithm, which allows faster classification of databases, is featured. Rough set theory, sequential analysis and granular computing are used to improve the performance of hierarchical classifiers. Practical examples in face identification (including deep neural networks), isolated command recognition in voice control systems and classification of visemes captured by the Kinect depth camera are included. This approach creates fast and accurate search procedures by using exact probability densities of applied dissimilarity measures. This book can be used as a guide for independent study and as supplementary material for a technicall...

  7. Prediction and classification of respiratory motion

    Lee, Suk Jin

    2014-01-01

    This book describes recent radiotherapy technologies including tools for measuring target position during radiotherapy and tracking-based delivery systems. This book presents a customized prediction of respiratory motion with clustering from multiple patient interactions. The proposed method contributes to the improvement of patient treatments by considering breathing pattern for the accurate dose calculation in radiotherapy systems. Real-time tumor-tracking, where the prediction of irregularities becomes relevant, has yet to be clinically established. The statistical quantitative modeling for irregular breathing classification, in which commercial respiration traces are retrospectively categorized into several classes based on breathing pattern, is discussed as well. The proposed statistical classification may provide clinical advantages to adjust the dose rate before and during the external beam radiotherapy for minimizing the safety margin. In the first chapter following the Introduction to this book, we...

  8. Accurate Medium-Term Wind Power Forecasting in a Censored Classification Framework

    Dahl, Christian M.; Croonenbroeck, Carsten

    2014-01-01

    -term forecasts, which are especially necessary for practitioners in the forward electricity markets of many power trading places; for example, NASDAQ OMX Commodities (formerly Nord Pool OMX Commodities) in northern Europe. We show that our model produces turbine-specific forecasts that are significantly more...

  9. Can Procalcitonin Be an Accurate Diagnostic Marker for the Classification of Diabetic Foot Ulcers?

    Jonaidi Jafari, Nematollah; Safaee Firouzabadi, Mahdi; Izadi, Morteza; Safaee Firouzabadi, Mohammad Sadegh; Saburi, Amin

    2014-01-01

    Background: The differentiation of infected diabetic foot ulcers (IDFU) from non-infected diabetic foot ulcers (NIDFU) is a challenging issue for clinicians. Objectives: Recently, procalcitonin (PCT) was introduced as a remarkable inflammatory marker. We aimed to evaluate the accuracy of PCT in comparison to other inflammatory markers for distinguishing IDFU from NIDFU. Materials and Methods: We evaluated PCT serum level as a marker of bacterial infection in patients with diabetic foot ulcers...

  10. Convolutional Neural Networks for patient-specific ECG classification.

    Kiranyaz, Serkan; Ince, Turker; Hamila, Ridha; Gabbouj, Moncef

    2015-08-01

    We propose a fast and accurate patient-specific electrocardiogram (ECG) classification and monitoring system using an adaptive implementation of 1D Convolutional Neural Networks (CNNs) that can fuse feature extraction and classification into a unified learner. In this way, a dedicated CNN will be trained for each patient by using relatively small common and patient-specific training data and thus it can also be used to classify long ECG records such as Holter registers in a fast and accurate manner. Alternatively, such a solution can conveniently be used for real-time ECG monitoring and early alert system on a light-weight wearable device. The experimental results demonstrate that the proposed system achieves a superior classification performance for the detection of ventricular ectopic beats (VEB) and supraventricular ectopic beats (SVEB). PMID:26736826

  11. The paradox of atheoretical classification

    Hjørland, Birger

    2016-01-01

    A distinction can be made between “artificial classifications” and “natural classifications,” where artificial classifications may adequately serve some limited purposes, but natural classifications are overall most fruitful by allowing inference and thus many different purposes. There is strong support for the view that a natural classification should be based on a theory (and, of course, that the most fruitful theory provides the most fruitful classification). Nevertheless, atheoretical (or “descriptive”) classifications are often produced. Paradoxically, atheoretical classifications may be very successful. The best example of a successful “atheoretical” classification is probably the prestigious Diagnostic and Statistical Manual of Mental Disorders (DSM) since its third edition from 1980. Based on such successes one may ask: Should the claim that classifications ideally are natural and...

  12. Information gathering for CLP classification

    Ida Marcello

    2011-01-01

    Full Text Available Regulation 1272/2008 includes provisions for two types of classification: harmonised classification and self-classification. The harmonised classification of substances is decided at Community level and a list of harmonised classifications is included in Annex VI of the classification, labelling and packaging Regulation (CLP). If a chemical substance is not included in the harmonised classification list it must be self-classified, based on available information, according to the requirements of Annex I of the CLP Regulation. CLP stipulates that harmonised classification will be performed for carcinogenic, mutagenic or toxic to reproduction substances (CMR substances) and for respiratory sensitisers category 1, and for other hazard classes on a case-by-case basis. The first step of classification is the gathering of available and relevant information. This paper presents the procedure for gathering information and obtaining data. The data quality is also discussed.

  13. Phylogenetic biodiversity assessment based on systematic nomenclature

    Ross H Crozier

    2006-01-01

    Full Text Available Biodiversity assessment demands objective measures, because ultimately conservation decisions must prioritize the use of limited resources for preserving taxa. The most general frameworks for the objective assessment of conservation worth are those that assess evolutionary distinctiveness, e.g. Genetic Diversity (Crozier 1992), Phylogenetic Diversity (Faith 1992), and Evolutionary History (Nee & May 1997). These measures all attempt to assess the conservation worth of any scheme based on how much of the encompassing phylogeny of organisms is preserved. However, their general applicability is limited by the small proportion of taxa that have been reliably placed in a phylogeny. Given that phylogenization of many interesting or important taxa is unlikely to occur soon, we present a framework for using taxonomy as a reasonable surrogate for phylogeny. Combining this framework with exhaustive searches for combinations of sites containing maximal diversity, we provide a proof-of-concept for assessing conservation schemes for systematized but un-phylogenised taxa spread over a series of sites. This is illustrated with data from four studies, on North Queensland flightless insects (Yeates et al. 2002), ants from a Florida Transect (Lubertazzi & Tschinkel 2003), New England bog ants (Gotelli & Ellison 2002) and a simulated distribution of the known New Zealand Lepidosauria (Daugherty et al. 1994). The results support this approach, indicating that species, genus and site numbers predict evolutionary history, to a degree depending on the size of the data set.
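
    The exhaustive search over site combinations described in this abstract can be illustrated with a small sketch. The abstract does not give the diversity measure, so the score below is an assumption: it counts the distinct nodes (higher ranks plus species) of the taxonomy spanned by the species present, as a crude stand-in for phylogenetic branch length. The names `taxonomic_diversity` and `best_site_sets`, and all data, are hypothetical.

```python
from itertools import combinations


def taxonomic_diversity(species_set, taxonomy):
    """Count the distinct taxonomy nodes spanned by a set of species.

    `taxonomy` maps a species name to its lineage, e.g. ("Family", "Genus").
    This node count is an assumed surrogate, not the paper's measure.
    """
    nodes = set()
    for species in species_set:
        lineage = taxonomy[species]
        # Every prefix of the lineage is one node of the taxonomy tree.
        for depth in range(1, len(lineage) + 1):
            nodes.add(lineage[:depth])
        nodes.add(lineage + (species,))
    return len(nodes)


def best_site_sets(sites, taxonomy, k):
    """Exhaustively score every combination of k sites; return the top score and all ties."""
    best_score, best = -1, []
    for combo in combinations(sorted(sites), k):
        covered = set().union(*(sites[name] for name in combo))
        score = taxonomic_diversity(covered, taxonomy)
        if score > best_score:
            best_score, best = score, [combo]
        elif score == best_score:
            best.append(combo)
    return best_score, best


# Hypothetical toy data: four species in three genera and two families.
taxonomy = {
    "sp1": ("F1", "G1"),
    "sp2": ("F1", "G1"),
    "sp3": ("F1", "G2"),
    "sp4": ("F2", "G3"),
}
sites = {"a": {"sp1", "sp2"}, "b": {"sp3"}, "c": {"sp4"}}
score, winners = best_site_sets(sites, taxonomy, k=2)
```

    Exhaustive enumeration grows combinatorially with the number of sites, so a sketch like this is only practical for small site lists.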

  14. Phylogenetic analyses of Andromedeae (Ericaceae subfam. Vaccinioideae).

    Kron, K A; Judd, W S; Crayn, D M

    1999-09-01

    Phylogenetic relationships within the Andromedeae and closely related taxa were investigated by means of cladistic analyses based on phenotypic (morphology, anatomy, chromosome number, and secondary chemistry) and molecular (rbcL and matK nucleotide sequences) characters. An analysis based on combined molecular and phenotypic characters indicates that the tribe is composed of two major clades: the Gaultheria group (incl. Andromeda, Chamaedaphne, Diplycosia, Gaultheria, Leucothoë, Pernettya, Tepuia, and Zenobia) and the Lyonia group (incl. Agarista, Craibiodendron, Lyonia, and Pieris). Andromedeae are shown to be paraphyletic in all analyses because the Vaccinieae link with some or all of the genera of the Gaultheria group. Oxydendrum is sister to the clade containing the Vaccinieae, Gaultheria group, and Lyonia group. The monophyly of Agarista, Lyonia, Pieris, and Gaultheria (incl. Pernettya) is supported, while that of Leucothoë is problematic. The close relationship of Andromeda and Zenobia is novel and was strongly supported in the molecular (but not morphological) analyses. Diplycosia, Tepuia, Gaultheria, and Pernettya form a well-supported clade, which can be diagnosed by the presence of fleshy calyx lobes and methyl salicylate. Recognition of Andromedeae is not reflective of our understanding of genealogical relationships and should be abandoned; the Lyonia group is formally recognized at the tribal level. PMID:10487817

  15. PHYLOGENETIC STUDY OF SOME STRAINS OF DUNALIELLA

    Duc Tran

    2013-01-01

    Full Text Available Dunaliella strains were isolated from a key site for salt production in Vietnam (Vinh Hao, Binh Thuan province). The strains were identified based on Internal Transcribed Spacer (ITS) markers. The phylogenetic tree revealed that these strains belong to the clades of Dunaliella salina and Dunaliella viridis. Results of this study confirm the ubiquitous nature of Dunaliella and suggest that strains of Dunaliella salina might be acquired locally worldwide for the production of beta-carotene. The identification of these species suggests the presence of other Dunaliella species (Dunaliella tertiolecta, Dunaliella primolecta, Dunaliella parva), but further investigation would be required to confirm their presence in Vietnam. We anticipate that the physiological and biochemical characteristics of these local species will be compared with imported strains in a future effort. This will facilitate selection of strains with the best potential for exploitation in the food, aquaculture and biofuel industries. The Dunaliella strains isolated and identified in this study are maintained at the Laboratory of Algal Biotechnology, International University and will be made available for research and educational institutions.

  16. Comprehensive phylogenetic analysis of bacterial reverse transcriptases.

    Nicolás Toro

    Full Text Available Much less is known about reverse transcriptases (RTs) in prokaryotes than in eukaryotes, with most prokaryotic enzymes still uncharacterized. Two surveys involving BLAST searches for RT genes in prokaryotic genomes revealed the presence of large numbers of diverse, uncharacterized RTs and RT-like sequences. Here, using consistent annotation across all sequenced bacterial species from GenBank and other sources via RAST, available from the PATRIC (Pathogenic Resource Integration Center) platform, we have compiled the data for currently annotated reverse transcriptases from completely sequenced bacterial genomes. RT sequences are broadly distributed across bacterial phyla, but green sulfur bacteria and cyanobacteria have the highest levels of RT sequence diversity (≤85% identity) per genome. By contrast, phylum Actinobacteria, for which a large number of genomes have been sequenced, was found to have a low RT sequence diversity. Phylogenetic analyses revealed that bacterial RTs could be classified into 17 main groups: group II introns, retrons/retron-like RTs, diversity-generating retroelements (DGRs), Abi-like RTs, CRISPR-Cas-associated RTs, group II-like RTs (G2L), and 11 other groups of RTs of unknown function. Proteobacteria had the highest potential functional diversity, as they possessed most of the RT groups. Group II introns and DGRs were the most widely distributed RTs in bacterial phyla. Our results provide insights into bacterial RT phylogeny and the basis for an update of annotation systems based on sequence/domain homology.

  17. A phylogenetic re-evaluation of Arthrinium.

    Crous, Pedro W; Groenewald, Johannes Z

    2013-07-01

    Although the genus Arthrinium (sexual morph Apiospora) is commonly isolated as an endophyte from a range of substrates, and is extremely interesting for the pharmaceutical industry, its molecular phylogeny has never been resolved. Based on morphology and DNA sequence data of the large subunit nuclear ribosomal RNA gene (LSU, 28S) and the internal transcribed spacers (ITS) and 5.8S rRNA gene of the nrDNA operon, the genus Arthrinium is shown to belong to Apiosporaceae in Xylariales. Arthrinium is morphologically and phylogenetically circumscribed, and the sexual genus Apiospora is treated as a synonym on the basis that Arthrinium is older, more commonly encountered, and more frequently used in literature. An epitype is designated for Arthrinium pterospermum, and several well-known species are redefined based on their morphology and sequence data of the translation elongation factor 1-alpha (TEF), beta-tubulin (TUB) and internal transcribed spacer (ITS1, 5.8S, ITS2) gene regions. Newly described are A. hydei on Bambusa tuldoides from Hong Kong, A. kogelbergense on dead culms of Restionaceae from South Africa, A. malaysianum on Macaranga hullettii from Malaysia, A. ovatum on Arundinaria hindsii from Hong Kong, A. phragmites on Phragmites australis from Italy, A. pseudospegazzinii on Macaranga hullettii from Malaysia, A. pseudosinense on bamboo from The Netherlands, and A. xenocordella from soil in Zimbabwe. Furthermore, the genera Pteroconium and Cordella are also reduced to synonymy, rejecting spore shape and the presence of setae as characters of generic significance separating them from Arthrinium. PMID:23898419

  18. Fast Structural Search in Phylogenetic Databases

    William H. Piel

    2005-01-01

    Full Text Available As the size of phylogenetic databases grows, the need for efficiently searching these databases arises. Thanks to previous and ongoing research, searching by attribute value and by text has become commonplace in these databases. However, searching by topological or physical structure, especially for large databases and especially for approximate matches, is still an art. We propose structural search techniques that, given a query or pattern tree P and a database of phylogenies D, find trees in D that are sufficiently close to P. The “closeness” is a measure of the topological relationships in P that are found to be the same or similar in a tree of D. We develop a filtering technique that accelerates searches and present algorithms for rooted and unrooted trees where the trees can be weighted or unweighted. Experimental results on comparing the similarity measure with existing tree metrics and on evaluating the efficiency of the search techniques demonstrate that the proposed approach is promising.
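
    To make the idea of searching a tree database by topological closeness concrete, here is a minimal sketch. It does not reproduce the similarity measure or the filtering technique of the paper, which the abstract does not specify; as a stand-in it scores two rooted trees by the Jaccard overlap of their clades (sets of leaves under each internal node). Trees are assumed to be nested tuples of leaf names, and the function names are hypothetical.

```python
def clades(tree):
    """Return the clades (frozensets of leaf names) of a rooted tree.

    A tree is a leaf name (str) or a tuple of subtrees. The root clade,
    containing every leaf, is included.
    """
    found = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(leaves(child) for child in node))
        found.add(members)
        return members

    leaves(tree)
    return found


def similarity(pattern, tree):
    """Jaccard overlap of clade sets: 1.0 means identical rooted topology."""
    cp, ct = clades(pattern), clades(tree)
    if not cp and not ct:
        return 1.0
    return len(cp & ct) / len(cp | ct)


def search(pattern, database, threshold):
    """Return names of database trees whose similarity to the pattern meets the threshold."""
    return [name for name, tree in database.items()
            if similarity(pattern, tree) >= threshold]


# Hypothetical query tree P and database D.
P = (("A", "B"), ("C", "D"))
D = {
    "t1": (("B", "A"), ("D", "C")),    # same topology, children reordered
    "t2": ((("A", "B"), "C"), "D"),    # shares the clade {A, B}
    "t3": (("A", "C"), ("B", "D")),    # shares only the root clade
}
hits = search(P, D, threshold=0.5)
```

    Because clades are compared as sets, the ordering of children does not affect the score. This sketch scans every tree in the database linearly, with no filtering step to accelerate the search.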

  19. Phylogenetic characterization of archaea in saltpan sediments.

    Ahmad, Nasier; Johri, Sarojini; Sultan, Phalisteen; Abdin, Malik Z; Qazi, Ghulam N

    2011-06-01

    A study was undertaken to investigate archaeal diversity in saltpan sediments of Goa, India by 16S rDNA-dependent molecular phylogeny. Small subunit rRNA genes (16S rDNA) from the saltpan sediment metagenome were amplified by polymerase chain reaction (PCR) using primers specific to the domain archaea. Ten unique phylotypes were obtained by PCR-based RFLP of 16S rRNA genes using the endonuclease MspI, which was most suitable to score the genetic diversity. These phylotypes spanned a wide range within the domain archaea, including both crenarchaeota and euryarchaeota. None of the retrieved crenarchaeota sequences could be grouped with previously cultured crenarchaeota; however, two sequences were related to haloarchaea. Most of the sequences determined were closely related to sequences that had previously been obtained from metagenomes of a variety of marine environments. The phylogenetic study of a site investigated for the first time revealed a low archaeal population but also as yet unclassified species, which may be specially adapted to the saltpan sediment of Goa. PMID:22654153

  20. Vertebral fracture classification

    de Bruijne, Marleen; Pettersen, Paola C.; Tankó, László B.; Nielsen, Mads

    2007-03-01

    A novel method for classification and quantification of vertebral fractures from X-ray images is presented. Using pairwise conditional shape models trained on a set of healthy spines, the most likely unfractured shape is estimated for each of the vertebrae in the image. The difference between the true shape and the reconstructed normal shape is an indicator for the shape abnormality. A statistical classification scheme with the two shapes as features is applied to detect, classify, and grade various types of deformities. In contrast with the current (semi-)quantitative grading strategies this method takes the full shape into account, it uses a patient-specific reference by combining population-based information on biological variation in vertebra shape and vertebra interrelations, and it provides a continuous measure of deformity. Good agreement with manual classification and grading is demonstrated on 204 lateral spine radiographs with in total 89 fractures.

  1. Classification problem in CBIR

    Tatiana Jaworska

    2013-04-01

    Full Text Available At present a great deal of research is being done in different aspects of Content-Based Image Retrieval (CBIR). Image classification is one of the most important tasks in image retrieval that must be dealt with. The primary issue we have addressed is: how can the fuzzy set theory be used to handle crisp image data? We propose fuzzy rule-based classification of image objects. To achieve this goal we have built fuzzy rule-based classifiers for crisp data. In this paper we present the results of fuzzy rule-based classification in our CBIR. Furthermore, these results are used to construct a search engine taking into account data mining.

  2. Supernova Photometric Classification Challenge

    Kessler, Richard; Jha, Saurabh; Kuhlmann, Stephen

    2010-01-01

    We have publicly released a blinded mix of simulated SNe, with types (Ia, Ib, Ic, II) selected in proportion to their expected rate. The simulation is realized in the griz filters of the Dark Energy Survey (DES) with realistic observing conditions (sky noise, point spread function and atmospheric transparency) based on years of recorded conditions at the DES site. Simulations of non-Ia type SNe are based on spectroscopically confirmed light curves that include unpublished non-Ia samples donated from the Carnegie Supernova Project (CSP), the Supernova Legacy Survey (SNLS), and the Sloan Digital Sky Survey-II (SDSS-II). We challenge scientists to run their classification algorithms and report a type for each SN. A spectroscopically confirmed subset is provided for training. The goals of this challenge are to (1) learn the relative strengths and weaknesses of the different classification algorithms, (2) use the results to improve classification algorithms, and (3) understand what spectroscopically confirmed sub-...

  3. Bosniak classification system

    Graumann, Ole; Osther, Susanne Sloth; Karstoft, Jens;

    2016-01-01

    BACKGROUND: The Bosniak classification was originally based on computed tomographic (CT) findings. Magnetic resonance (MR) and contrast-enhanced ultrasonography (CEUS) imaging may demonstrate findings that are not depicted at CT, and there may not always be a clear correlation between the findings at MR and CEUS imaging and those at CT. PURPOSE: To compare diagnostic accuracy of MR, CEUS, and CT when categorizing complex renal cystic masses according to the Bosniak classification. MATERIAL AND METHODS: From February 2011 to June 2012, 46 complex renal cysts were prospectively evaluated by three readers. Each mass was categorized according to the Bosniak classification and CT was chosen as gold standard. Kappa was calculated for diagnostic accuracy and data was compared with pathological results. RESULTS: CT images found 27 BII, six BIIF, seven BIII, and six BIV. Forty-three cysts could...

  4. Acoustic classification of dwellings

    Berardi, Umberto; Rasmussen, Birgit

    2014-01-01

    Schemes for the classification of dwellings according to different building performances have been proposed in the last years worldwide. The general idea behind these schemes relates to the positive impact a higher label, and thus a better performance, should have. In particular, focusing on sound insulation performance, national schemes for sound classification of dwellings have been developed in several European countries. These schemes define acoustic classes according to different levels of sound insulation. Due to the lack of coordination among countries, a significant diversity in terms of descriptors, number of classes, and class intervals occurred between national schemes. However, a proposal “acoustic classification scheme for dwellings” has been developed recently in the European COST Action TU0901 with 32 member countries. This proposal has been accepted as an ISO work item. This paper...

  5. Reconstruction of Family-Level Phylogenetic Relationships within Demospongiae (Porifera) Using Nuclear Encoded Housekeeping Genes

    Hill, Malcolm S.; Hill, April L.; Lopez, Jose; Peterson, Kevin J.; Pomponi, Shirley; Diaz, Maria C.; Thacker, Robert W.; Adamska, Maja; Boury-Esnault, Nicole; Cárdenas, Paco; Chaves-Fonnegra, Andia; Danka, Elizabeth; De Laine, Bre-Onna; Formica, Dawn; Hajdu, Eduardo; Lobo-Hajdu, Gisele; Klontz, Sarah; Morrow, Christine C.; Patel, Jignasa; Picton, Bernard; Pisani, Davide; Pohlmann, Deborah; Redmond, Niamh E.; Reed, John; Richey, Stacy; Riesgo, Ana; Rubin, Ewelina; Russell, Zach; Rützler, Klaus; Sperling, Erik A.; di Stefano, Michael; Tarver, James E.; Collins, Allen G.

    2013-01-01

    Background: Demosponges are challenging for phylogenetic systematics because of their plastic and relatively simple morphologies and many deep divergences between major clades. To improve understanding of the phylogenetic relationships within Demospongiae, we sequenced and analyzed seven nuclear housekeeping genes involved in a variety of cellular functions from a diverse group of sponges. Methodology/Principal Findings: We generated data from each of the four sponge classes (i.e., Calcarea, Demospongiae, Hexactinellida, and Homoscleromorpha), but focused on family-level relationships within demosponges. With data for 21 newly sampled families, our Maximum Likelihood and Bayesian-based approaches recovered previously phylogenetically defined taxa: Keratosa^p, Myxospongiae^p, Spongillida^p, Haploscleromorpha^p (the marine haplosclerids) and Democlavia^p. We found conflicting results concerning the relationships of Keratosa^p and Myxospongiae^p to the remaining demosponges, but our results strongly supported a clade of Haploscleromorpha^p+Spongillida^p+Democlavia^p. In contrast to hypotheses based on mitochondrial genome and ribosomal data, nuclear housekeeping gene data suggested that freshwater sponges (Spongillida^p) are sister to Haploscleromorpha^p rather than part of Democlavia^p. Within Keratosa^p, we found equivocal results as to the monophyly of Dictyoceratida. Within Myxospongiae^p, Chondrosida and Verongida were monophyletic. A well-supported clade within Democlavia^p, Tetractinellida^p, composed of all sampled members of Astrophorina and Spirophorina (including the only lithistid in our analysis), was consistently revealed as the sister group to all other members of Democlavia^p. Within Tetractinellida^p, we did not recover monophyletic Astrophorina or Spirophorina. Our results also reaffirmed the monophyly of order Poecilosclerida (excluding Desmacellidae and Raspailiidae), and polyphyly of Hadromerida and Halichondrida. Conclusions/Significance: These results, using an

  6. Molecular phylogenetics, species diversity, and biogeography of the Andean lizards of the genus Proctoporus (Squamata: Gymnophthalmidae).

    Goicoechea, Noemí; Padial, José M; Chaparro, Juan C; Castroviejo-Fisher, Santiago; De la Riva, Ignacio

    2012-12-01

    The family Gymnophthalmidae comprises ca. 220 described species of Neotropical lizards distributed from southern Mexico to Argentina. It includes 36 genera, among them Proctoporus, which contains six currently recognized species occurring across the yungas forests and wet montane grasslands of the Amazonian versant of the Andes from central Peru to central Bolivia. Here, we investigate the phylogenetic relationships and species limits of Proctoporus and closely related taxa by analyzing 2121 base pairs of mitochondrial (12S, 16S, and ND4) and nuclear (c-mos) genes. Our taxon sampling of 92 terminals includes all currently recognized species of Proctoporus and 15 additional species representing the most closely related groups to the genus. Maximum parsimony, maximum likelihood and Bayesian phylogenetic analyses recovered a congruent, fully resolved, and strongly supported hypothesis of relationships that challenges previous phylogenetic hypotheses and classifications, and biogeographic scenarios. Our main results are: (i) discovery of a strongly supported clade that includes all species of Proctoporus and within which are nested the monotypic Opipeuter xestus (a genus that we consider a junior synonym of Proctoporus), and two species of Euspondylus, that are therefore transferred to Proctoporus; (ii) the paraphyly of Proctoporus bolivianus with respect to P. subsolanus, which is proposed as a junior synonym of P. bolivianus; (iii) the detection of seven divergent and reciprocally monophyletic lineages (five of them previously assigned to P. bolivianus) that are considered confirmed candidate species, which implies that more candidate species are awaiting formal description and naming than currently recognized species in the genus; (iv) rejection of the hypothesis that Proctoporus diversified following a south to north pattern parallel to the elevation of the Andes; (v) species diversity in Proctoporus is the result of in situ diversification through vicariance in

  7. Phylogenetic relationships of Salvia (Lamiaceae) in China: Evidence from DNA sequence datasets

    Qian-Quan LI; Min-Hui LI; Qing-Jun YUAN; Zhan-Hu CUI; Lu-Qi HUANG; Pei-Gen XIAO

    2013-01-01

    With 84 native species, China is a center of distribution of the genus Salvia (Lamiaceae). These species are mainly distributed in Yunnan and Sichuan provinces (southwestern China), notably the Hengduan Mountain region. Traditionally, the Chinese Salvia has been classified into four subgenera, Salvia, Sclarea, Jungia, and Allagospadonopsis. We tested this classification using molecular phylogenetic analysis of 43 species of Salvia from China, six from Japan, and four introduced species. The nuclear ribosomal internal transcribed spacer region and three chloroplast regions (rbcL, matK, and trnH-psbA) were analyzed by maximum parsimony, maximum likelihood, and Bayesian methods. Our results showed that the Chinese (except Salvia deserta) and Japanese Salvia species formed a well-supported clade; S. deserta from Xinjiang grouped with Salvia officinalis of Europe. In addition, all introduced Salvia species in China were relatively distantly related to the native Chinese Salvia. Our results differed from the subgeneric and section classifications in Flora Reipublicae Popularis Sinicae. We suggested that sections Eusphace and Pleiphace should be united in a new subgenus and that sect. Notiosphace should be removed from subg. Sclarea and form a new subgenus. Our data could not distinguish a boundary between subg. Allagospadonopsis and sect. Drymosphace (subg. Sclarea); the latter should be reduced into the former. Further clarification of the phylogenetic relationships within Salvia and between Salvia and related genera will require broader taxonomic sampling and more molecular markers.

  8. Automatic classification of time-variable X-ray sources

    Lo, Kitty K; Murphy, Tara; Gaensler, B M

    2014-01-01

    To maximize the discovery potential of future synoptic surveys, especially in the field of transient science, it will be necessary to use automatic classification to identify some of the astronomical sources. The data mining technique of supervised classification is suitable for this problem. Here, we present a supervised learning method to automatically classify variable X-ray sources in the second XMM-Newton serendipitous source catalog (2XMMi-DR2). Random Forest is our classifier of choice since it is one of the most accurate learning algorithms available. Our training set consists of 873 variable sources and their features are derived from time series, spectra, and other multi-wavelength contextual information. The 10-fold cross validation accuracy of the training data is ~97% on a seven-class data set. We applied the trained classification model to 411 unknown variable 2XMM sources to produce a probabilistically classified catalog. Using the classification margin and the Random Forest der...

  9. Land Cover Classification Using ALOS Imagery For Penang, Malaysia

    This paper presents the potential of integrating optical and radar remote sensing data to improve automatic land cover mapping. The analysis involved standard image processing, consisting of spectral signature extraction and application of a statistical decision rule to identify land cover categories. A maximum likelihood classifier was used to determine the different land cover categories. Ground reference data from sites throughout the study area were collected for training and validation. The land cover information was extracted from the digital data using the PCI Geomatica 10.3.2 software package. The variations in classification accuracy due to a number of radar image processing techniques were studied. The relationship between the processing window and the land classification was also investigated, as were the classification accuracies obtained from combinations of optical and radar features. Our research finds that fusion of radar and optical data significantly improved classification accuracies. This study indicates that land cover/use can be mapped accurately by using this approach.

  10. Ensemble polarimetric SAR image classification based on contextual sparse representation

    Zhang, Lamei; Wang, Xiao; Zou, Bin; Qiao, Zhijun

    2016-05-01

    Polarimetric SAR (PolSAR) image interpretation has become one of the most interesting topics, in which the construction of a reasonable and effective image classification technique is of key importance. Sparse representation represents the data using the most succinct sparse atoms of an over-complete dictionary, and its advantages have also been confirmed in the field of PolSAR classification. However, like any ordinary classifier, it is imperfect in several respects. Ensemble learning is therefore introduced to address this issue: several different learners are trained and their individual outputs are combined into an integrated result that is more accurate than that of any single learner. This paper accordingly presents a polarimetric SAR image classification method based on the ensemble learning of sparse representation to achieve the optimal classification.

  11. Molecular phylogenetic relationships of China Seas groupers based on cytochrome b gene fragment sequences

    DING Shaoxiong; ZHUANG Xuan; GUO Feng; WANG Jun; SU Yongquan; ZHANG Qiyong; LI Qifu

    2006-01-01

    Classification and evolutionary relationships are important issues in the study of the groupers. A cytochrome b gene fragment from twenty-eight grouper species within six genera of the subfamily Epinephelinae was amplified using PCR techniques, and the sequences were analyzed to derive the phylogenetic relationships of the groupers from the China Seas. Genetic information indexes, including the Kimura 2-parameter genetic distance and Ts/Tv ratios, were generated using a variety of biological software packages. With Niphon spinosus, Pagrus major and Pagrus auriga as the designated outgroups, phylogenetic trees, which incorporate additional homologous sequences of other Epinephelus fishes from GenBank, were constructed based on the neighbor-joining (NJ), maximum-parsimony (MP), maximum-likelihood (ML) and minimum-evolution (ME) methods. Several conclusions were drawn from the DNA sequence analysis: (1) genus Plectropomus, which diverged early, is the most primitive group in the subfamily Epinephelinae; (2) genus Variola is more closely related to genus Cephalopholis than the other four genera; (3) genus Cephalopholis is a monophyletic group and more primitive than genus Epinephelus; (4) Promicrops lanceolatus and Cromileptes altivelis should be included in genus Epinephelus; (5) there exist two sister groups in genus Epinephelus.
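
    The Kimura 2-parameter distance and Ts/Tv ratio mentioned in this abstract can be computed from a pair of aligned sequences as sketched below. With P the proportion of transitions and Q the proportion of transversions, the distance is d = -1/2 ln((1 - 2P - Q) sqrt(1 - 2Q)). The function names and the example sequences are hypothetical and are not taken from the grouper data set.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def count_substitutions(seq1, seq2):
    """Return (transitions, transversions, compared sites) for two aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    ts = tv = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            ts += 1  # purine<->purine or pyrimidine<->pyrimidine
        else:
            tv += 1  # purine<->pyrimidine
    return ts, tv, compared

def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    ts, tv, n = count_substitutions(seq1, seq2)
    if n == 0:
        raise ValueError("no comparable sites")
    p = ts / n
    q = tv / n
    arg = (1.0 - 2.0 * p - q) * math.sqrt(1.0 - 2.0 * q) if q < 0.5 else 0.0
    if arg <= 0.0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(arg)

# Hypothetical 20-site alignment with two transitions and one transversion
seq_a = "ACGTACGTACGTACGTACGT"
seq_b = "GTCTACGTACGTACGTACGT"
ts, tv, n = count_substitutions(seq_a, seq_b)
distance = k2p_distance(seq_a, seq_b)
ts_tv_ratio = ts / tv
```

    For this toy alignment P = 0.10 and Q = 0.05, so the corrected distance (about 0.170) is slightly larger than the raw proportion of differing sites (0.15), as expected when multiple hits are taken into account.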

  12. Distributional patterns of the Neotropical genus Thecomyia Perty (Diptera, Sciomyzidae) and phylogenetic support

    Amanda Ciprandi Pires

    2011-03-01

    The distributional pattern of the genus Thecomyia Perty, 1833 was defined using panbiogeographic tools, and analyzed based on the phylogeny of the group. This study sought to establish biogeographical homologies in the Neotropical region between different species of the genus, based on their distribution pattern and later corroboration through its phylogeny. Eight individual tracks and 16 generalized tracks were identified, established along nearly the entire swath of the Neotropics. Individual tracks are the basic units of a panbiogeographic study, and correspond to the hypothesis of minimum distribution of the organisms involved. The generalized tracks, obtained from the spatial congruence between two or more individual tracks, are important in the identification of smaller areas of endemism. Thus, we found evidence from the generalized tracks in support of previous classification for the Neotropical region. The Amazon domain is indicated as an area of outstanding importance in the diversification of the group, by the confluence of generalized tracks and biogeographic nodes in the region. Most of the generalized tracks and biogeographical nodes were congruent with the phylogenetic hypothesis of the genus, indicating support of the primary biogeographical homologies originally defined by the track analysis.

  13. Reappraisal of phylogenetic status and genetic diversity analysis of Asian population of Lentinula edodes

    2006-01-01

    Phylogenetic relationships within the genus Lentinula were reconstructed from the sequenced ITS fragments of 60 Chinese wild L. edodes isolates and the sequence data of 48 isolates of different species from other districts downloaded from GenBank. The 108 Lentinula isolates are divided into two branches and seven groups: one branch with two groups in the New World, and the other branch with five groups in the Old World. The clustering of isolates into groups corresponds clearly with the classification of the morphological species. Asian isolates fall into groups I and V, two of the five Old World groups, and the phylogenetic analysis shows that the germplasm resources they represent are of great importance. Group V, which fills a gap in the geographic distribution, has become one of the mainstream groups as its number of isolates has increased, while group I, which contains a very large number of isolates and covers most of the tested districts, tends to split into two subgroups (Ia and Ib). This suggests that China (or Asia) is an important center of genetic diversity for the natural population of the genus Lentinula. Genetic analysis of Asian isolates based on groups Ia, Ib and V indicates that diversity is richest in the eastern coastal, northwestern highland, and southwestern China and Himalayas districts, which are therefore the three priorities for protecting the diversity of the Asian Lentinula population.

  14. Classification problem in CBIR

    Tatiana Jaworska

    2013-01-01

    At present a great deal of research is being done on different aspects of Content-Based Image Retrieval (CBIR). Image classification is one of the most important tasks in image retrieval that must be dealt with. The primary issue we have addressed is how fuzzy set theory can be used to handle crisp image data. We propose fuzzy rule-based classification of image objects. To achieve this goal we have built fuzzy rule-based classifiers for crisp data. In this paper we present the results ...

  15. Classification of syringomyelia.

    Milhorat, T H

    2000-01-01

    Syringomyelia poses special challenges for the clinician because of its complex symptomatology, uncertain pathogenesis, and multiple options of treatment. The purpose of this study was to classify intramedullary cavities according to their most salient pathological and clinical features. Pathological findings obtained in 175 individuals with tubular cavitations of the spinal cord were correlated with clinical and magnetic resonance (MR) imaging findings in a database of 927 patients. A classification system was developed in which the morbid anatomy, cause, and pathogenesis of these lesions are emphasized. The use of a disease-based classification of syringomyelia facilitates diagnosis and the interpretation of MR imaging findings and provides a guide to treatment. PMID:16676921

  16. Classification des rongeurs

    Mignon, Jacques; Hardouin, Jacques

    2003-01-01

    Readers of the BEDIM Bulletin sometimes seem to have difficulty with the scientific classification of the animals commonly known as "rodents". Given the disputes that persist today over how this classification should be established, this is hardly surprising. The brief synthesis that follows concerns animals that are, or could become, part of mini-livestock farming. The note aims at providing the main characteristics of the principal families of rodents relevan...

  17. The best of both worlds: Phylogenetic eigenvector regression and mapping

    José Alexandre Felizola Diniz Filho

    2015-09-01

    Eigenfunction analyses have been widely used to model patterns of autocorrelation in time, space and phylogeny. In a phylogenetic context, Diniz-Filho et al. (1998) proposed what they called Phylogenetic Eigenvector Regression (PVR), in which pairwise phylogenetic distances among species are submitted to a Principal Coordinate Analysis, and eigenvectors are then used as explanatory variables in regression, correlation or ANOVAs. More recently, a new approach called Phylogenetic Eigenvector Mapping (PEM) was proposed, with the main advantage of explicitly incorporating a model-based warping in phylogenetic distance in which an Ornstein-Uhlenbeck (O-U) process is fitted to data before eigenvector extraction. Here we compared PVR and PEM with respect to estimated phylogenetic signal, correlated evolution under alternative evolutionary models and phylogenetic imputation, using simulated data. Despite the similarity between the two approaches, PEM has a slightly higher prediction ability and is more general than the original PVR. Even so, in a conceptual sense, PEM may provide a technique in the best of both worlds, combining the flexibility of data-driven and empirical eigenfunction analyses and the sound insights provided by evolutionary models well known in comparative analyses.

  18. Spectral classification using convolutional neural networks

    Hála, Pavel

    2014-01-01

    There is a great need for accurate and autonomous spectral classification methods in astrophysics. This thesis is about training a convolutional neural network (ConvNet) to recognize an object class (quasar, star or galaxy) from one-dimensional spectra only. The author developed several scripts and C programs for dataset preparation, preprocessing and postprocessing of the data. The EBLearn library (developed by Pierre Sermanet and Yann LeCun) was used to create the ConvNets. Application to a dataset of more than 60000 spectra yielded a success rate of nearly 95%. This thesis conclusively proved the great potential of convolutional neural networks and deep learning methods in astrophysics.

  19. Laboratory Building for Accurate Determination of Plutonium

    2008-01-01

    The accurate determination of plutonium is one of the most important assay techniques of nuclear fuel, also the key of the chemical measurement transfer and the base of the nuclear material balance. An

  20. Improved Surgical Site Infection (SSI) rate through accurately assessed surgical wounds

    John, Honeymol; Nimeri, Abdelrahman; Ellahham, Samer

    2015-01-01

    Sheikh Khalifa Medical City's (SKMC) Surgery Institute was identified as a high outlier in Surgical Site Infections (SSI) based on the American College of Surgeons National Surgical Quality Improvement Program (ACS NSQIP) - Semi-Annual Report (SAR) in January 2012. The aim of this project was to improve SSI rates through accurate wound classification. We identified SSI rate reduction as a performance improvement and safety priority at SKMC, a tertiary referral center. We used the American Col...

  1. Application of kernel functions for accurate similarity search in large chemical databases

    2010-01-01

    Background: Similarity search in chemical structure databases is an important problem with many applications in chemical genomics, drug design, and efficient chemical probe screening among others. It is widely believed that structure based methods provide an efficient way to do the query. Recently various graph kernel functions have been designed to capture the intrinsic similarity of graphs. Though successful in constructing accurate predictive and classification models, graph kernel functions...

  2. A method for accurate, non-destructive diagnosis of congenital heart defects from heart specimens

    Schleich, Jean-Marc; Abdulla, Tariq; Houyel, Lucile; Paul, Jean-François; Summers, Ron; Dillenseger, Jean-Louis

    2013-01-01

    The accurate analysis of congenital heart defect (CHD) specimens is often difficult and up to now required the opening of the heart. The objective of this study is to define a non-destructive method that allows for the precise analysis of each specimen and its different cardiac components in order to improve classification of the defect and thus provide an indication of underpinning causal mechanisms. We propose a method in which the heart volume is acquired by a CT ...

  3. Disentangling the effect of body size and phylogenetic distances on zooplankton top-down control of algae.

    Gianuca, Andros T; Pantel, Jelena H; De Meester, Luc

    2016-04-13

    A negative consequence of biodiversity loss is reduced rates of ecosystem functions. Phylogenetic-based biodiversity indices have been claimed to provide more accurate predictions of ecosystem functioning than species diversity alone. This approach assumes that the most relevant traits for ecosystem functioning present a phylogenetic signal. Yet, traits mediating niche partitioning and resource uptake efficiency in animals can be labile. To assess the relative power of a key trait (body size) and phylogeny to predict zooplankton top-down control on phytoplankton, we manipulated trait and phylogenetic distances independently in microcosms while holding species richness constant. We found that body size provided strong predictions of top-down control. In contrast, phylogeny was a poor predictor of grazing rates. Size-related grazing efficiency asymmetry was mechanistically more important than niche differences in mediating ecosystem function in our experimental settings. Our study demonstrates a strong link between a single functional trait (i.e. body size) in zooplankton and trophic interactions, and urges cautious use of phylogenetic information and taxonomic diversity as substitutes for trait information to predict and understand ecosystem functions. PMID:27075258

  4. Preliminary Study of Phylogenetic Relationship of Rice Field Chironomidae (Diptera) Inferred From DNA Sequences of Mitochondrial Cytochrome Oxidase Subunit I

    Salman A. Al-Shami

    2009-01-01

    Problem statement: Chironomidae have been recorded in rice fields throughout the world, including in many countries such as India, Australia and the USA. Although some studies provide keys to genus level, they note the difficulty of identifying the larvae to species level. Chironomid research has been hindered by difficulties in specimen preparation, identification, morphology and literature. Systematic, phylogenetic and taxonomic studies of insects developed quickly with the emergence of molecular techniques. These techniques provide an effective tool toward more accurate identification of ambiguous chironomid species. Approach: Samples of chironomid larvae were collected from rice plots at Bukit Merah Agricultural Experimental Station (BMAES), Penang, Malaysia. A 710 bp fragment of the mitochondrial gene Cytochrome Oxidase subunit I (COI) was amplified and sequenced. Results: Five species of Chironomidae were morphologically identified: three species of subfamily Chironominae (Chironomus kiiensis, Polypedilum trigonus and Tanytarsus formosanus) and two species of subfamily Tanypodinae (Clinotanypus sp. and Tanypus punctipennis). The phylogenetic relationship among these species was investigated. High sequence divergence was observed between two individuals of the presumed C. kiiensis, and it is suggested that more than one species may be present. However, the intraspecific sequence divergence was lower between the other species of the Tanypodinae subfamily. Interestingly, Tanytarsus formosanus showed a close phylogenetic relationship to Tanypodinae species, and this presumably reflects co-evolutionary traits of different subfamilies. Conclusion: The sequence of the mtDNA cytochrome oxidase subunit I gene has proven useful to investigate the phylogenetic relationship among the ambiguous species of chironomids.

  5. Carotenogenesis diversification in phylogenetic lineages of Rhodophyta.

    Takaichi, Shinichi; Yokoyama, Akiko; Mochimaru, Mari; Uchida, Hiroko; Murakami, Akio

    2016-06-01

    Carotenoid composition is very diverse in Rhodophyta. In this study, we investigated whether this variation is related to the phylogeny of this group. Rhodophyta consists of seven classes, and they can be divided into two groups on the basis of their morphology. The unicellular group (Cyanidiophyceae, Porphyridiophyceae, Rhodellophyceae, and Stylonematophyceae) contained only β-carotene and zeaxanthin, "ZEA-type carotenoids." In contrast, within the macrophytic group (Bangiophyceae, Compsopogonophyceae, and Florideophyceae), Compsopogonophyceae contained antheraxanthin in addition to ZEA-type carotenoids, "ANT-type carotenoids," whereas Bangiophyceae contained α-carotene and lutein along with ZEA-type carotenoids, "LUT-type carotenoids." Florideophyceae is divided into five subclasses. Ahnfeltiophycidae, Hildenbrandiophycidae, and Nemaliophycidae contained LUT-type carotenoids. In Corallinophycidae, Hapalidiales and Lithophylloideae in Corallinales contained LUT-type carotenoids, whereas Corallinoideae in Corallinales contained ANT-type carotenoids. In Rhodymeniophycidae, most orders contained LUT-type carotenoids; however, only Gracilariales contained ANT-type carotenoids. There is a clear relationship between carotenoid composition and phylogenetics in Rhodophyta. Furthermore, we searched open genome databases of several red algae for references to the synthetic enzymes of the carotenoid types detected in this study. β-Carotene and zeaxanthin might be synthesized from lycopene, as in land plants. Antheraxanthin might require zeaxanthin epoxydase, whereas α-carotene and lutein might require two additional enzymes, as in land plants. Furthermore, Glaucophyta contained ZEA-type carotenoids, and Cryptophyta contained β-carotene, α-carotene, and alloxanthin, whose acetylenic group might be synthesized from zeaxanthin by an unknown enzyme. Therefore, we conclude that the presence or absence of the four enzymes is related to diversification of carotenoid

  6. Annals of morphology. Atavisms: phylogenetic Lazarus?

    Zanni, Ginevra; Opitz, John M

    2013-11-01

    Dedication: with highest respect and affection to Prof. Giovanni Neri on the eve of his official administrative retirement as Chair of the Institute of Medical Genetics of the Università Cattolica of Rome for leadership in medical genetics and medical science and friendship for decades. The concept "atavism," reversion, throwback, Rückschlag remains an epistemological challenge in biology; unwise or implausible over-interpretation of a given structure as such has led some to almost total skepticism as to its existence. Originating in botany in the 18th century it became applied to zoology (and humans) with increasing frequency over the last two centuries such that the very concept became widely discredited. Presently, atavisms have acquired a new life and reconsideration given certain reasonable criteria, including: (1) homology of structure of the postulated atavism to that of ancestral fossils or collateral species with plausible soft tissue reconstructions taking into account relationships of parts, obvious sites of origin and insertion of muscles, vascular channels, etc.; (2) most parsimonious, plausible phylogenetic assumptions; (3) evident rudimentary or vestigial anatomical state in prior generations or in morphogenesis of a given organism; (4) developmental instability in prior generations, that is, some closely related species facultatively with or without the trait; (5) genetic identity or phylogenomic similarity inferred in ancestors and corroborated in more or less closely related species. Fluctuating asymmetry may be the basis for the striking evolutionary diversification and common atavisms in limbs; however, strong selection and developmental constraints would make atavisms in, for example, cardiac or CNS development less likely. Thus, purported atavisms must be examined critically in light of the above criteria. PMID:24166815

  7. [Phylogenetic analysis of bacteria of extreme ecosystems].

    Romanovskaia, V A; Parfenova, V V; Bel'kova, N L; Sukhanova, E V; Gladka, G V; Tashireva, A A

    2014-01-01

    Phylogenetic analysis of aerobic chemoorganotrophic bacteria from two extreme regions (the Dead Sea and West Antarctica) was performed on the basis of the nucleotide sequences of the 16S rRNA gene. The thermotolerant and halotolerant spore-forming bacteria 7t1 and 7t3 from terrestrial ecosystems of the Dead Sea were identified as Bacillus licheniformis and B. subtilis subsp. subtilis, respectively. Given the remote position of the thermotolerant strain 6t1 from closely related strains in the Staphylococcus cluster, strain 6t1 can be regarded as Staphylococcus sp. In terrestrial ecosystems of Galindez Island (Antarctic), taxonomically diverse psychrotolerant bacteria were detected. Micrococcus luteus O-1 and Microbacterium trichothecenolyticum O-3 were isolated from ornithogenic soil. Strains 4r5, 5r5 and 40r5, isolated from grass and lichens, can be assigned to the genus Frondihabitans. These strains are taxonomically and ecologically isolated, and on the tree diagram they form a joint cluster with three Frondihabitans sp. isolates from lichen of the Austrian Alps and with the psychrotolerant, plant-associated F. cladoniiphilus CafT13(T). Isolates from black lichen at different stationary observation points on the south side of a vertical cliff were identified as Rhodococcus fascians 181n3, Sporosarcina aquimarina O-7 and Staphylococcus sp. 0-10. Arthrobacter sp. 28r5g1 was isolated from an orange fouling biofilm on top of the vertical cliff, and Serratia sp. 6r1g from moss. According to the results, Frondihabitans strains were the most frequently encountered among chemoorganotrophic aerobic bacteria in the Antarctic phytocenoses. PMID:25007437

  8. Pitch Based Sound Classification

    Nielsen, Andreas Brinch; Hansen, Lars Kai; Kjems, U

    2006-01-01

    A sound classification model is presented that can classify signals into music, noise and speech. The model extracts the pitch of the signal using the harmonic product spectrum. Based on the pitch estimate and a pitch error measure, features are created and used in a probabilistic model with soft...

  9. Shark Teeth Classification

    Brown, Tom; Creel, Sally; Lee, Velda

    2009-01-01

    On a recent autumn afternoon at Harmony Leland Elementary in Mableton, Georgia, students in a fifth-grade science class investigated the essential process of classification--the act of putting things into groups according to some common characteristics or attributes. While they may have honed these skills earlier in the week by grouping their own…

  10. Classification system: Netherlands

    Hartemink, A.E.

    2006-01-01

    Although people have always classified soils, it is only since the mid 19th century that soil classification emerged as an important topic within soil science. It forced soil scientists to think systematically about soils and their genesis, and it developed to facilitate communication between soil scienti

  11. Text document classification

    Novovičová, Jana

    No. 62 (2005), pp. 53-54. ISSN 0926-4981 R&D Projects: GA AV ČR IAA2075302; GA AV ČR KSK1019101; GA MŠk 1M0572 Institutional research plan: CEZ:AV0Z10750506 Keywords: document representation * categorization * classification Subject RIV: BD - Theory of Information

  12. Automated Stellar Spectral Classification

    Bailer-Jones, Coryn; Irwin, Mike; von Hippel, Ted

    1996-05-01

    Stellar classification has long been a useful tool for probing important astrophysical phenomena. Beyond simply categorizing stars it yields fundamental stellar parameters, acts as a probe of galactic abundance distributions and gives a first foothold on the cosmological distance ladder. The MK system in particular has survived on account of its robustness to changes in the calibrations of the physical parameters. Nonetheless, if stellar classification is to continue as a useful tool in stellar surveys, then it must adapt to keep pace with the large amounts of data which will be acquired as magnitude limits are pushed ever deeper. We are working on a project to automate the multi-parameter classification of visual stellar spectra, using artificial neural networks and other techniques. Our techniques have been developed with 10,000 spectra (B Analysis as a front-end compression of the data. Our continuing work also looks at the application of synthetic spectra to the direct classification of spectra in terms of the physical parameters of Teff, log g, and [Fe/H].

  13. Classification of waste packages

    Mueller, H.P.; Sauer, M.; Rojahn, T. [Versuchsatomkraftwerk GmbH, Kahl am Main (Germany)

    2001-07-01

    A barrel gamma scanning unit has been in use at the VAK for the classification of radioactive waste materials since 1998. The unit provides the facility operator with the data required for classification of waste barrels. Once these data have been entered into the AVK data processing system, the radiological status of raw waste as well as pre-treated and processed waste can be tracked from the point of origin to the point at which the waste is delivered to final storage. Since the barrel gamma scanning unit was commissioned in 1998, approximately 900 barrels have been measured and the relevant data required for classification collected and analyzed. Based on the positive experience gained in the use of the mobile barrel gamma scanning unit, the VAK now offers the classification of barrels as a service to external users. Depending upon waste quantity accumulation, this measurement unit offers facility operators a reliable, time-saving and cost-effective means of identifying and documenting the radioactivity inventory of barrels scheduled for final storage. (orig.)

  14. The Classification Conundrum.

    Granger, Charles R.

    1983-01-01

    Argues against the five-kingdom scheme of classification as using inconsistent criteria, ending up with divisions that are forced, not natural. Advocates an approach using cell type/complexity and modification of the metabolic machinery, recommending the five-kingdom scheme as starting point for class discussion on taxonomy and its conceptual…

  15. Improving Student Question Classification

    Heiner, Cecily; Zachary, Joseph L.

    2009-01-01

    Students in introductory programming classes often articulate their questions and information needs incompletely. Consequently, the automatic classification of student questions to provide automated tutorial responses is a challenging problem. This paper analyzes 411 questions from an introductory Java programming course by reducing the natural…

  16. Classifications in popular music

    A. van Venrooij; V. Schmutz

    2015-01-01

    The categorical system of popular music, such as genre categories, is a highly differentiated and dynamic classification system. In this article we present work that studies different aspects of these categorical systems in popular music. Following the work of Paul DiMaggio, we focus on four questio

  17. Dynamic Latent Classification Model

    Zhong, Shengtong; Martínez, Ana M.; Nielsen, Thomas Dyhre;

    possible. Motivated by this problem setting, we propose a generative model for dynamic classification in continuous domains. At each time point the model can be seen as combining a naive Bayes model with a mixture of factor analyzers (FA). The latent variables of the FA are used to capture the dynamics in...

  18. Classification of myocardial infarction

    Saaby, Lotte; Poulsen, Tina Svenstrup; Hosbond, Susanne Elisabeth;

    2013-01-01

    The classification of myocardial infarction into 5 types was introduced in 2007 as an important component of the universal definition. In contrast to the plaque rupture-related type 1 myocardial infarction, type 2 myocardial infarction is considered to be caused by an imbalance between demand and...

  19. [Classification of primary bone tumors].

    Dominok, G W; Frege, J

    1986-01-01

    An expanded classification for bone tumors is presented based on the well known international classification as well as earlier systems. The current status and future trends in this area are discussed. PMID:3461626

  20. Efficient Fingercode Classification

    Sun, Hong-Wei; Law, Kwok-Yan; Gollmann, Dieter; Chung, Siu-Leung; Li, Jian-Bin; Sun, Jia-Guang

    In this paper, we present an efficient fingerprint classification algorithm, which is an essential component in many critical security application systems, e.g., systems in the e-government and e-finance domains. Fingerprint identification is one of the most important security requirements in homeland security systems such as personnel screening and anti-money laundering. The problem of fingerprint identification involves searching (matching) the fingerprint of a person against each of the fingerprints of all registered persons. To enhance performance and reliability, a common approach is to reduce the search space by first classifying the fingerprints and then performing the search in the respective class. Jain et al. proposed a fingerprint classification algorithm based on a two-stage classifier, which uses a K-nearest neighbor classifier in its first stage. The fingerprint classification algorithm is based on the fingercode representation, an encoding of fingerprints that has been demonstrated to be an effective fingerprint biometric scheme because of its ability to capture both local and global details in a fingerprint image. We enhance this approach by improving the efficiency of the K-nearest neighbor classifier for fingercode-based fingerprint classification. Our research first investigates the various fast search algorithms in vector quantization (VQ) and their potential application in fingerprint classification, and then proposes two efficient algorithms based on the pyramid-based search algorithms in VQ. Experimental results on DB1 of FVC 2004 demonstrate that our algorithms can outperform the full search algorithm and the original pyramid-based search algorithms in terms of computational efficiency without sacrificing accuracy.
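
    To illustrate how a fast search can speed up a K-nearest neighbor classifier without changing its result, the sketch below uses partial distance search, a simple early-abandon technique from vector quantization. It is a swapped-in illustration and does not reproduce the pyramid-based algorithms proposed in the paper; the feature vectors, class labels and function name are hypothetical.

```python
import heapq
from collections import Counter

def knn_classify(query, train, k=3):
    """Majority label among the k nearest training vectors (squared Euclidean distance).

    train is a list of (vector, label) pairs. The distance accumulation for a
    candidate stops as soon as it can no longer beat the current k-th best.
    """
    heap = []  # max-heap emulated with negated distances: (-dist, index, label)
    for idx, (vec, label) in enumerate(train):
        bound = -heap[0][0] if len(heap) == k else float("inf")
        dist = 0.0
        for q, v in zip(query, vec):
            dist += (q - v) ** 2
            if dist >= bound:
                break  # early abandon: candidate cannot enter the top k
        else:
            if len(heap) < k:
                heapq.heappush(heap, (-dist, idx, label))
            else:
                heapq.heapreplace(heap, (-dist, idx, label))
    votes = Counter(label for _, _, label in heap)
    return votes.most_common(1)[0][0]

# Hypothetical three-component feature vectors standing in for fingercodes
train_codes = [
    ((0.10, 0.20, 0.10), "arch"),
    ((0.15, 0.25, 0.05), "arch"),
    ((0.20, 0.10, 0.15), "arch"),
    ((0.80, 0.90, 0.70), "whorl"),
    ((0.85, 0.80, 0.75), "whorl"),
    ((0.90, 0.85, 0.95), "whorl"),
]
```

    Because a candidate is abandoned only once its partial distance already reaches the current k-th best distance, the selected neighbors match those of a full search, apart from possible differences in how exact ties are broken.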

  1. Novel Accurate Bacterial Discrimination by MALDI-Time-of-Flight MS Based on Ribosomal Proteins Coding in S10-spc-alpha Operon at Strain Level S10-GERMS

    Tamura, Hiroto; Hotta, Yudai; Sato, Hiroaki

    2013-08-01

    Matrix-assisted laser-desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) is one of the most widely used mass-based approaches for bacterial identification and classification because of the simple sample preparation and extremely rapid analysis within a few minutes. To establish the accurate MALDI-TOF MS bacterial discrimination method at strain level, the ribosomal subunit proteins coded in the S10-spc-alpha operon, which encodes half of the ribosomal subunit proteins and is highly conserved in eubacterial genomes, were selected as reliable biomarkers. This method, named the S10-GERMS method, revealed that the strains of genus Pseudomonas were successfully identified and discriminated at species and strain levels, respectively; therefore, the S10-GERMS method was further applied to discriminate the pathovar of P. syringae. The eight selected biomarkers (L24, L30, S10, S12, S14, S16, S17, and S19) suggested the rapid discrimination of P. syringae at the strain (pathovar) level. The S10-GERMS method appears to be a powerful tool for rapid and reliable bacterial discrimination and successful phylogenetic characterization. In this article, an overview of the utilization of results from the S10-GERMS method is presented, highlighting the characterization of the Lactobacillus casei group and discrimination of the bacteria of genera Bacillus and Sphingopyxis despite only two and one base difference in the 16S rRNA gene sequence, respectively.

  2. Oral epithelial dysplasia classification systems

    Warnakulasuriya, S; Reibel, J; Bouquot, J;

    2008-01-01

    report, we review the oral epithelial dysplasia classification systems. The three classification schemes [oral epithelial dysplasia scoring system, squamous intraepithelial neoplasia and Ljubljana classification] were presented and the Working Group recommended epithelial dysplasia grading for routine....... Several studies have shown great interexaminer and intraexaminer variability in the assessment of the presence or absence and the grade of oral epithelial dysplasia. The Working Group considered the two class classification (no/questionable/ mild - low risk; moderate or severe - implying high risk) and...

  3. Asperisporium and Pantospora (Mycosphaerellaceae): epitypifications and phylogenetic placement

    Minnis, A.M.; Kennedy, A.H.; Grenier, D.B.; Rehner, S.A.; Bischoff, J.F.

    2012-01-01

    The species-rich family Mycosphaerellaceae contains considerable morphological diversity and includes numerous anamorphic genera, many of which are economically important plant pathogens. Recent revisions and phylogenetic research have resulted in taxonomic instability. Ameliorating this problem req

  4. Mapping phylogenetic endemism in R using georeferenced branch extents

    Guerin, Greg R.; Lowe, Andrew J.

    2015-12-01

    Applications are needed to map biodiversity from large-scale species occurrence datasets whilst seamlessly integrating with existing functions in R. Phylogenetic endemism (PE) is a biodiversity measure based on range-restricted phylogenetic diversity (PD). Current implementations use area of occupancy (AOO) or frequency to estimate the spatial range of branch-length (i.e. phylogenetic range-rarity), rather than extent of occurrence (EOO; i.e. georeferenced phylogenetic endemism), which is known to produce different range estimates. We present R functions to map PD or PE weighted by AOO or EOO (new georeferenced implementation), taking as inputs georeferenced species occurrences and a phylogeny. Non-parametric statistics distinguish PD/PE from trivial correlates of species richness and sampling intensity.

  5. Markov invariants, plethysms, and phylogenetics (the long version)

    Sumner, J G; Jermiin, L S; Jarvis, P D

    2008-01-01

    We explore model based techniques of phylogenetic tree inference exercising Markov invariants. Markov invariants are group invariant polynomials and are distinct from what is known in the literature as phylogenetic invariants, although we establish a commonality in some special cases. We show that the simplest Markov invariant forms the foundation of the Log-Det distance measure. We take as our primary tool group representation theory, and show that it provides a general framework for analysing Markov processes on trees. From this algebraic perspective, the inherent symmetries of these processes become apparent, and focusing on plethysms, we are able to define Markov invariants and give existence proofs. We give an explicit technique for constructing the invariants, valid for any number of character states and taxa. For phylogenetic trees with three and four leaves, we demonstrate that the corresponding Markov invariants can be fruitfully exploited in applied phylogenetic studies.

  6. Phylogenetic comparative methods complement discriminant function analysis in ecomorphology.

    Barr, W Andrew; Scott, Robert S

    2014-04-01

    In ecomorphology, Discriminant Function Analysis (DFA) has been used as evidence for the presence of functional links between morphometric variables and ecological categories. Here we conduct simulations of characters containing phylogenetic signal to explore the performance of DFA under a variety of conditions. Characters were simulated using a phylogeny of extant antelope species from known habitats. Characters were modeled with no biomechanical relationship to the habitat category; the only sources of variation were body mass, phylogenetic signal, or random "noise." DFA on the discriminability of habitat categories was performed using subsets of the simulated characters, and Phylogenetic Generalized Least Squares (PGLS) was performed for each character. Analyses were repeated with randomized habitat assignments. When simulated characters lacked phylogenetic signal and/or habitat assignments were random, ecomorphology. PMID:24382658

  7. Phylogenetic and biological species diversity within the Neurospora tetrasperma complex.

    Menkis, A; Bastiaans, E; Jacobson, D J; Johannesson, H

    2009-09-01

    The objective of this study was to explore the evolutionary history of the morphologically recognized filamentous ascomycete Neurospora tetrasperma, and to reveal the genetic and reproductive relationships among its individuals and populations. We applied both phylogenetic and biological species recognition to a collection of strains representing the geographic and genetic diversity of N. tetrasperma. First, we were able to confirm a monophyletic origin of N. tetrasperma. Furthermore, we found nine phylogenetic species within the morphospecies. When using the traditional broad biological species recognition all investigated strains of N. tetrasperma constituted a single biological species. In contrast, when using a quantitative measurement of the reproductive success, incorporating characters such as viability and fertility of offspring, we found a high congruence between the phylogenetic and biological species recognition. Taken together, phylogenetically and biologically defined groups of individuals exist in N. tetrasperma, and these should be taken into account in future studies of its life history traits. PMID:19682307

  8. Phylogenetic relationships of Salmonella based on rRNA sequences

    Christensen, H.; Nordentoft, Steen; Olsen, J.E.

    1998-01-01

    To establish the phylogenetic relationships between the subspecies of Salmonella enterica (official name Salmonella choleraesuis), Salmonella bongori and related members of Enterobacteriaceae, sequence comparison of rRNA was performed by maximum-likelihood analysis. The two Salmonella species were...

  9. Phylogenetic constraints in key functional traits behind species' climate niches

    Kellermann, Vanessa; Loeschcke, Volker; Hoffmann, Ary A; Kristensen, Torsten Nygård; Fløjgaard, Camilla; David, Jean R; Svenning, Jens-Christian; Overgaard, Johannes

    2012-01-01

    Species distributions are often constrained by climatic tolerances that are ultimately determined by evolutionary history and/or adaptive capacity, but these factors have rarely been partitioned. Here, we experimentally determined two key climatic niche traits (desiccation and cold resistance) for....... Desiccation and cold resistance were clearly linked to species distributions because significant associations between traits and climatic variables persisted even after controlling for phylogeny. We used different methods to untangle whether phylogenetic signal reflected phylogenetically related species...... adapted to similar environments or alternatively phylogenetic inertia. For desiccation resistance, weak phylogenetic inertia was detected; ancestral trait reconstruction, however, revealed a deep divergence that could be traced back to the genus level. Despite drosophilids’ high evolutionary potential...

  10. The paradox of atheoretical classification

    Hjørland, Birger

    2016-01-01

    sometimes termed “descriptive” classifications). Paradoxically atheoretical classifications may be very successful. The best example of a successful “atheoretical” classification is probably the prestigious Diagnostic and Statistical Manual of Mental Disorders (DSM) since its third edition from 1980. On the...

  11. Etiologic Classification in Ischemic Stroke

    Hakan Ay

    2011-01-01

    Ischemic stroke is an etiologically heterogenous disorder. Classification of ischemic stroke etiology into categories with discrete phenotypic, therapeutic, and prognostic features is indispensible to generate consistent information from stroke research. In addition, a functional classification of stroke etiology is critical to ensure unity among physicians and comparability among studies. There are two major approaches to etiologic classification in stroke. Phenotypic systems define subtypes...

  12. Asperisporium and Pantospora (Mycosphaerellaceae): epitypifications and phylogenetic placement

    Minnis, A.M.; Kennedy, A.H.; Grenier, D.B.; Rehner, S.A.; Bischoff, J.F.

    2012-01-01

    The species-rich family Mycosphaerellaceae contains considerable morphological diversity and includes numerous anamorphic genera, many of which are economically important plant pathogens. Recent revisions and phylogenetic research have resulted in taxonomic instability. Ameliorating this problem requires phylogenetic placement of type species of key genera. We present an examination of the type species of the anamorphic Asperisporium and Pantospora. Cultures isolated from recent port intercep...

  13. Hal: an Automated Pipeline for Phylogenetic Analyses of Genomic Data

    Robbertse, Barbara; Yoder, Ryan J.; Boyd, Alex; Reeves, John; Spatafora, Joseph W.

    2011-01-01

    The rapid increase in genomic and genome-scale data is resulting in unprecedented levels of discrete sequence data available for phylogenetic analyses. Major analytical impasses exist, however, prior to analyzing these data with existing phylogenetic software. Obstacles include the management of large data sets without standardized naming conventions, identification and filtering of orthologous clusters of proteins or genes, and the assembly of alignments of orthologous sequence data into ind...

  14. Phylogenetic analysis and development of probes for differentiating methylotrophic bacteria.

    Brusseau, G A; Bulygina, E S; Hanson, R S

    1994-01-01

    Fifteen small-subunit rRNAs from methylotrophic bacteria have been sequenced. Comparisons of these sequences with 22 previously published sequences further defined the phylogenetic relationships among these bacteria and illustrated the agreement between phylogeny and physiological characteristics of the bacteria. Phylogenetic trees were constructed with 16S rRNA sequences from methylotrophic bacteria and representative organisms from subdivisions within the class Proteobacteria on the basis o...

  15. Phylogenetic trees and the tropical geometry of flag varieties

    Manon, Christopher

    2012-01-01

    We will discuss some recent theorems relating the space of weighted phylogenetic trees to the tropical varieties of each flag variety of type A. We will also discuss the tropicalizations of the functions corresponding to semi-standard tableaux; in particular, we relate them to familiar functions from phylogenetics. We close with some remarks on the generalization of these results to the tropical geometry of arbitrary flag varieties. This involves the family of Bergman...

  16. Linguistic Phylogenetic Inference by PAM-like Matrices

    Delmestri, Antonella; Cristianini, Nello

    2010-01-01

    We apply to the task of linguistic phylogenetic inference a successful cognate identification learning model based on PAM-like matrices. We train our system and we employ the learned parameters for measuring the lexical distance between languages. We estimate phylogenetic trees using distance-based methods on an Indo-European database. Our results reproduce correctly all the established major language groups present in the dataset, are compatible with the Indo-European benchmark tree and incl...

  17. Phylogenetic analysis of porcine parvoviruses from swine samples in China

    Li Dong; Chen Yingli; Xie Baoxia; Bao Huifang; Bai Xingwen; Li Pinghua; Cao Yimei; Fu Yuanfang; Sun Pu; Lu Zengjun; Hao Xiaofang; Liu Zaixin

    2011-01-01

    Background: Porcine parvovirus (PPV) usually causes reproductive failure in sows. The objective of the present study was to analyze the phylogenetic distribution and perform molecular characterization of PPVs isolated in China, as well as to identify two field strains, LZ and JY. The data used in this study contained the available sequences for NS1 and VP2 from GenBank, as well as the two aforementioned Chinese strains. Results: Phylogenetic analysis shows that the PPV sequences are di...

  18. Exploration of phylogenetic data using a global sequence analysis method

    Giron Alain

    2005-11-01

    Background: Molecular phylogenetic methods are based on alignments of nucleic or peptidic sequences. The tremendous increase in molecular data permits phylogenetic analyses of very long sequences and of many species, but also requires methods to help manage large datasets. Results: Here we explore the phylogenetic signal present in molecular data by genomic signatures, defined as the set of frequencies of short oligonucleotides present in DNA sequences. Although violating many of the standard assumptions of traditional phylogenetic analyses – in particular explicit statements of homology inherent in character matrices – the use of the signature does permit the analysis of very long sequences, even those that are unalignable, and is therefore most useful in cases where alignment is questionable. We compare the results obtained by traditional phylogenetic methods to those inferred by the signature method for two genes: RAG1, which is easily alignable, and 18S RNA, where alignments are often ambiguous for some regions. We also apply this method to a multigene data set of 33 genes for 9 bacteria and one archaeal species as well as to the whole genome of a set of 16 γ-proteobacteria. In addition to delivering phylogenetic results comparable to traditional methods, the comparison of signatures for the sequences involved in the bacterial example identified putative candidates for horizontal gene transfers. Conclusion: The signature method is therefore a fast tool for exploring phylogenetic data, providing not only a pretreatment for discovering new sequence relationships, but also for identifying cases of sequence evolution that could confound traditional phylogenetic analysis.
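
    The signature defined in this abstract (the set of frequencies of short oligonucleotides in a DNA sequence) can be illustrated with a minimal Python sketch. The function names `signature` and `signature_distance`, the default word length `k=4`, and the use of a Euclidean distance between signatures are all assumptions made for illustration; the abstract does not state which word length or distance the authors used.

```python
from collections import Counter
from itertools import product
import math


def signature(seq, k=4):
    """Relative frequencies of all 4**k DNA k-mers in seq (hypothetical helper).

    Overlapping windows are counted; windows containing characters other
    than A, C, G, T are ignored.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[m] for m in kmers)
    if total == 0:
        raise ValueError("sequence contains no valid k-mers")
    return [counts[m] / total for m in kmers]


def signature_distance(a, b, k=4):
    """Euclidean distance between two signatures (the metric is an assumption)."""
    sa, sb = signature(a, k), signature(b, k)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(sa, sb)))
```

    Because no alignment is needed, a pairwise matrix of such distances can be computed for sequences of very different lengths and passed to any distance-based tree-building method.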

  19. FootPrinter3: phylogenetic footprinting in partially alignable sequences

    Fang, Fei; Blanchette, Mathieu

    2006-01-01

    FootPrinter3 is a web server for predicting transcription factor binding sites by using phylogenetic footprinting. Until now, phylogenetic footprinting approaches have been based either on multiple alignment analysis (e.g. PhyloVista, PhastCons), or on motif-discovery algorithms (e.g. FootPrinter2). FootPrinter3 integrates these two approaches, making use of local multiple sequence alignment blocks when those are available and reliable, but also allowing finding motifs in unalignable regions....

  20. Statistical Phylogenetic Tree Analysis Using Differences of Means

    Arnaoudova, Elissaveta; David C Haws; Huggins, Peter; Jaromczyk, Jerzy W; Moore, Neil; Schardl, Christopher L; Yoshida, Ruriko

    2010-01-01

    We propose a statistical method to test whether two phylogenetic trees with given alignments are significantly incongruent. Our method compares the two distributions of phylogenetic trees given by the input alignments, instead of comparing point estimations of trees. This statistical approach can be applied to gene tree analysis for example, detecting unusual events in genome evolution such as horizontal gene transfer and reshuffling. Our method uses difference of means to compare two distrib...

  1. Statistical phylogenetic tree analysis using differences of means

    Elissaveta Arnaoudova; David C Haws; Peter Huggins; Jerzy Jaromczyk; Niel Moore; Christopher Schardl; Ruriko Yoshida

    2010-01-01

    We propose a statistical method to test whether two phylogenetic trees with given alignments are significantly incongruent. Our method compares the two distributions of phylogenetic trees given by the input alignments, instead of comparing point estimations of trees. This statistical approach can be applied to gene tree analysis for example, detecting unusual events in genome evolution such as horizontal gene transfer and reshuffling. Our method uses difference of means to compare two distri...

  2. Ecological and phylogenetic influences on maxillary dentition in snakes

    Kate Jackson

    2010-12-01

    The maxillary dentition of snakes was used as a system with which to investigate the relative importance of the interacting forces of ecological selective pressures and phylogenetic constraints in determining morphology. The maxillary morphology of three groups of snakes having different diets, with each group comprising two distinct lineages — boids and colubroids — was examined. Our results suggest that dietary selective pressures may be more significant than phylogenetic history in shaping maxillary morphology.

  3. Acremonium phylogenetic overview and revision of Gliomastix, Sarocladium, and Trichothecium.

    2011-01-01

    Over 200 new sequences are generated for members of the genus Acremonium and related taxa including ribosomal small subunit sequences (SSU) for phylogenetic analysis and large subunit (LSU) sequences for phylogeny and DNA-based identification. Phylogenetic analysis reveals that within the Hypocreales, there are two major clusters containing multiple Acremonium species. One clade contains Acremonium sclerotigenum, the genus Emericellopsis, and the genus Geosmithia as prominent elements. The se...

  4. Phylogenetic community ecology of soil biodiversity using mitochondrial metagenomics.

    Andújar, Carmelo; Arribas, Paula; Ruzicka, Filip; Crampton-Platt, Alex; Timmermans, Martijn J T N; Vogler, Alfried P

    2015-07-01

    High-throughput DNA methods hold great promise for the study of taxonomically intractable mesofauna of the soil. Here, we assess species diversity and community structure in a phylogenetic framework, by sequencing total DNA from bulk specimen samples and assembly of mitochondrial genomes. The combination of mitochondrial metagenomics and DNA barcode sequencing of 1494 specimens in 69 soil samples from three geographic regions in southern Iberia revealed >300 species of soil Coleoptera (beetles) from a broad spectrum of phylogenetic lineages. A set of 214 mitochondrial sequences longer than 3000 bp was generated and used to estimate a well-supported phylogenetic tree of the order Coleoptera. Shorter sequences, including cox1 barcodes, were placed on this mitogenomic tree. Raw Illumina reads were mapped against all available sequences to test for species present in local samples. This approach simultaneously established the species richness, phylogenetic composition and community turnover at species and phylogenetic levels. We find a strong signature of vertical structuring in soil fauna that shows high local community differentiation between deep soil and superficial horizons at phylogenetic levels. Within the two vertical layers, turnover among regions was primarily at the tip (species) level and was stronger in the deep soil than leaf litter communities, pointing to layer-mediated drivers determining species diversification, spatial structure and evolutionary assembly of soil communities. This integrated phylogenetic framework opens the application of phylogenetic community ecology to the mesofauna of the soil, among the most diverse and least well-understood ecosystems, and will propel both theoretical and applied soil science. PMID:25865150

  5. Phylogenetic relationships among hadal amphipods of the Superfamily Lysianassoidea: Implications for taxonomy and biogeography

    Ritchie, H.; Jamieson, A. J.; Piertney, S. B.

    2015-11-01

    Amphipods of the superfamily Lysianassoidea are ubiquitous at hadal depths (>6000 m) and therefore are an ideal model group for investigating levels of endemism and the drivers of speciation in deep ocean trenches. The taxonomic classification of hadal amphipods is typically based on conventional morphological traits but it has been suggested that convergent evolution, phenotypic plasticity, intra-specific variability and ontogenetic variation may obscure the ability to robustly diagnose taxa and define species. Here we use phylogenetic analysis of DNA sequence variation at two mitochondrial (COI and 16S rDNA) and one nuclear (18S rDNA) regions to examine the evolutionary relationships among 25 putative amphipod species representing 14 genera and 11 families that were sampled from across seven hadal trenches. We identify several instances where species, genera and families do not resolve monophyletic clades, highlighting incongruence between the current taxonomic classification and the molecular phylogeny for this group. Our data also help extend and resolve the known biogeographic distributions for the different species, such as identifying the co-occurrence of Hirondellea dubia and Hirondellea gigas in the Mariana trench.

  6. Phylogeny and Classification of Prunus sensu lato (Rosaceae)

    Shuo Shi; Jinlu Li; Jiahui Sun; Jing Yu; Shiliang Zhou

    2013-01-01

    The classification of the economically important genus Prunus L. sensu lato (s.l.) is controversial due to the high levels of convergent or parallel evolution of morphological characters. In the present study, phylogenetic analyses of fifteen main segregates of Prunus s.l. represented by eighty-four species were conducted with maximum parsimony and Bayesian approaches using twelve chloroplast regions (atpB-rbcL, matK, ndhF, psbA-trnH, rbcL, rpL16, rpoC1, rps16, trnS-G, trnL, trnL-F and ycf1) and three nuclear genes (ITS, s6pdh and SbeI) to explore their infrageneric relationships. The results of these analyses were used to develop a new, phylogeny-based classification of Prunus s.l. Our phylogenetic reconstructions resolved three main clades of Prunus s.l. with strong support. We adopted a broad-sensed genus, Prunus, and recognised three subgenera corresponding to the three main clades: subgenus Padus, subgenus Cerasus and subgenus Prunus. Seven sections of subgenus Prunus were recognised. The dwarf cherries, which were previously assigned to subgenus Cerasus, were included in subgenus Prunus. One new section name, Prunus L. subgenus Prunus section Persicae (T. T. Yü & L. T. Lu) S. L. Zhou, and one new species name, Prunus tianshanica (Pojarkov) S. Shi, were proposed.

  7. Phylogenetic context determines the role of competition in adaptive radiation.

    Tan, Jiaqi; Slattery, Matthew R; Yang, Xian; Jiang, Lin

    2016-06-29

    Understanding ecological mechanisms regulating the evolution of biodiversity is of much interest to ecologists and evolutionary biologists. Adaptive radiation constitutes an important evolutionary process that generates biodiversity. Competition has long been thought to influence adaptive radiation, but the directionality of its effect and associated mechanisms remain ambiguous. Here, we report a rigorous experimental test of the role of competition on adaptive radiation using the rapidly evolving bacterium Pseudomonas fluorescens SBW25 interacting with multiple bacterial species that differed in their phylogenetic distance to the diversifying bacterium. We showed that the inhibitive effect of competitors on the adaptive radiation of P. fluorescens decreased as their phylogenetic distance increased. To explain this phylogenetic dependency of adaptive radiation, we linked the phylogenetic distance between P. fluorescens and its competitors to their niche and competitive fitness differences. Competitive fitness differences, which showed weak phylogenetic signal, reduced P. fluorescens abundance and thus diversification, whereas phylogenetically conserved niche differences promoted diversification. These results demonstrate the context dependency of competitive effects on adaptive radiation, and highlight the importance of past evolutionary history for ongoing evolutionary processes. PMID:27335414

  8. LAF: Logic Alignment Free and its application to bacterial genomes classification

    Weitschek, Emanuel; Cunial, Fabio; Felici, Giovanni

    2015-01-01

    Alignment-free algorithms can be used to estimate the similarity of biological sequences and hence are often applied to the phylogenetic reconstruction of genomes. Most of these algorithms rely on comparing the frequency of all the distinct substrings of fixed length (k-mers) that occur in the analyzed sequences. In this paper, we present Logic Alignment Free (LAF), a method that combines alignment-free techniques and rule-based classification algorithms in order to assign biological samples ...

  9. A phylogeny and revised classification of Squamata, including 4161 species of lizards and snakes

    Pyron, R. Alexander; Burbrink, Frank T.; Wiens, John J.

    2013-01-01

    Background: The extant squamates (>9400 known species of lizards and snakes) are one of the most diverse and conspicuous radiations of terrestrial vertebrates, but no studies have attempted to reconstruct a phylogeny for the group with large-scale taxon sampling. Such an estimate is invaluable for comparative evolutionary studies, and to address their classification. Here, we present the first large-scale phylogenetic estimate for Squamata. Results: The estimated phylogeny contains 4161 species...

  10. PyElph - a software tool for gel images analysis and phylogenetics

    Pavel Ana Brânduşa

    2012-01-01

    (Random Amplification of Polymorphic DNA) and STR (Short Tandem Repeat). The similarity between the DNA sequences is computed and used to generate phylogenetic trees, which are very useful for population genetics studies and taxonomic classification. Conclusions: PyElph decreases the effort and time spent processing data from gel images by providing an automatic step-by-step gel image analysis system with a friendly Graphical User Interface. The proposed free software tool is suitable for researchers and students who do not have access to expensive commercial software and image acquisition devices.

  11. A molecular phylogenetic investigation of Bakuella, Anteholosticha, and Caudiholosticha (Protista, Ciliophora, Hypotrichia) based on three-gene sequences.

    Lv, Zhao; Shao, Chen; Yi, Zhenzhen; Warren, Alan

    2015-01-01

    Traditionally, classifications of the Urostyloida have been mainly based on morphology and morphogenesis. Recent molecular phylogenetic analyses have been largely based on single-gene data for a limited number of taxa. Consequently, incongruence has arisen between the morphological/morphogenetic and the molecular data. In this study, the three phylogenetic markers (SSU rDNA, ITS1-5.8S-ITS2 region, and LSU-rDNA) of three urostyloid genera represented by four species (Bakuella granulifera, Anteholosticha monilata, Caudiholosticha sylvatica, and C. tetracirra) were sequenced to investigate their phylogeny. The results show that: (1) all three genera should be regarded as members of the order Urostyloida within the subclass Hypotrichia, as indicated by morphological characters; (2) phylogenetic analyses and sequence similarities both indicate that neither Anteholosticha nor Caudiholosticha is monophyletic and the systematic assignment of both genera awaits further evaluation; and (3) Bakuella has a closer relationship with Urostyla than with bakuellids (e.g. Apobakuella and Metaurostylopsis), suggesting Bakuella may belong to the family Urostylidae rather than the family Bakuellidae. PMID:25399810

  12. Phylogenetic relationships of the Dactylogyridae Bychowsky, 1933 (Monogenea: Dactylogyridea): the need for the systematic revision of the Ancyrocephalinae Bychowsky, 1937.

    Simková, Andrea; Plaisance, Laetitia; Matejusová, Iveta; Morand, Serge; Verneau, Olivier

    2003-01-01

    Phylogenetic analyses based on partial 18S rDNA sequences of polyonchoinean monogeneans were conducted in order to investigate the relationships between selected families and subfamilies of the Dactylogyrinea, mainly within the Dactylogyridae. We tested the status of the Ancyrocephalidae sensu Bychowsky & Nagibina (1978) and the Ancyrocephalinae sensu Kritsky & Boeger (1989). Within the Dactylogyrinea, the Diplectanidae and Dactylogyridae are well supported by maximum likelihood and maximum parsimony analyses, but their phylogenetic relationship with the Pseudomurraytrematidae remains unresolved. Phylogenetic relationships between the Pseudodactylogyrinae, Ancyrocephalinae, Ancylodiscoidinae and Dactylogyrinae indicate paraphyly of the Ancyrocephalidae sensu Bychowsky & Nagibina (1978). The group of species recently considered as the Dactylogyridae sensu Kritsky & Boeger (1989) comprises two sister groups. The first group includes the freshwater Ancyrocephalinae and the Ancylodiscoidinae. The second group includes the Pseudodactylogyrinae, Dactylogyrinae and the Ancyrocephalinae from the fish species Siganus doliatus and Tetraodon fluviatilis. The non-monophyly of the Ancyrocephalinae (i.e. the non-monophyly of the group of species recently considered as members of Ancyrocephalinae), previously suggested by Kritsky & Boeger (1989) using the morphological characters, indicates that classification of the Dactylogyridae needs to be revised. PMID:12567005

  13. Invariant Image Watermarking Using Accurate Zernike Moments

    Ismail A. Ismail

    2010-01-01

    Problem statement: Digital image watermarking is the most popular method for image authentication, copyright protection and content description. Zernike moments are the most widely used moments in image processing and pattern recognition. The magnitudes of Zernike moments are rotation invariant, so they can be used directly as a watermark signal or be further modified to carry embedded data. Zernike moments computed in Cartesian coordinates are not accurate due to geometrical and numerical errors. Approach: In this study, we employed a robust image-watermarking algorithm using accurate Zernike moments. These moments are computed in polar coordinates, where both approximation and geometric errors are removed. Accurate Zernike moments are used in image watermarking and proved to be robust against different kinds of geometric attacks. The performance of the proposed algorithm is evaluated using standard images. Results: Experimental results show that accurate Zernike moments achieve a higher degree of robustness than approximated ones against rotation, scaling, flipping, shearing and affine transformation. Conclusion: By computing accurate Zernike moments, the embedded watermark bits can be extracted at a low error rate.

  14. Two Influential Primate Classifications Logically Aligned

    Franz, Nico M.; Pier, Naomi M.; Reeder, Deeann M.; Chen, Mingmin; Yu, Shizhuo; Kianmajd, Parisa; Bowers, Shawn; Ludäscher, Bertram

    2016-01-01

    Classifications and phylogenies of perceived natural entities change in the light of new evidence. Taxonomic changes, translated into Code-compliant names, frequently lead to name:meaning dissociations across succeeding treatments. Classification standards such as the Mammal Species of the World (MSW) may experience significant levels of taxonomic change from one edition to the next, with potential costs to long-term, large-scale information integration. This circumstance challenges the biodiversity and phylogenetic data communities to express taxonomic congruence and incongruence in ways that both humans and machines can process, that is, to logically represent taxonomic alignments across multiple classifications. We demonstrate that such alignments are feasible for two classifications of primates corresponding to the second and third MSW editions. Our approach has three main components: (i) use of taxonomic concept labels, that is name sec. author (where sec. means according to), to assemble each concept hierarchy separately via parent/child relationships; (ii) articulation of select concepts across the two hierarchies with user-provided Region Connection Calculus (RCC-5) relationships; and (iii) the use of an Answer Set Programming toolkit to infer and visualize logically consistent alignments of these input constraints. Our use case entails the Primates sec. Groves (1993; MSW2–317 taxonomic concepts; 233 at the species level) and Primates sec. Groves (2005; MSW3–483 taxonomic concepts; 376 at the species level). Using 402 RCC-5 input articulations, the reasoning process yields a single, consistent alignment and 153,111 Maximally Informative Relations that constitute a comprehensive meaning resolution map for every concept pair in the Primates sec. MSW2/MSW3. The complete alignment, and various partitions thereof, facilitate quantitative analyses of name:meaning dissociation, revealing that nearly one in three taxonomic names are not reliable across

  15. Two Influential Primate Classifications Logically Aligned.

    Franz, Nico M; Pier, Naomi M; Reeder, Deeann M; Chen, Mingmin; Yu, Shizhuo; Kianmajd, Parisa; Bowers, Shawn; Ludäscher, Bertram

    2016-07-01

    Classifications and phylogenies of perceived natural entities change in the light of new evidence. Taxonomic changes, translated into Code-compliant names, frequently lead to name:meaning dissociations across succeeding treatments. Classification standards such as the Mammal Species of the World (MSW) may experience significant levels of taxonomic change from one edition to the next, with potential costs to long-term, large-scale information integration. This circumstance challenges the biodiversity and phylogenetic data communities to express taxonomic congruence and incongruence in ways that both humans and machines can process, that is, to logically represent taxonomic alignments across multiple classifications. We demonstrate that such alignments are feasible for two classifications of primates corresponding to the second and third MSW editions. Our approach has three main components: (i) use of taxonomic concept labels, that is, "name sec. author" (where "sec." means "according to"), to assemble each concept hierarchy separately via parent/child relationships; (ii) articulation of select concepts across the two hierarchies with user-provided Region Connection Calculus (RCC-5) relationships; and (iii) the use of an Answer Set Programming toolkit to infer and visualize logically consistent alignments of these input constraints. Our use case entails the Primates sec. Groves (1993; MSW2, with 317 taxonomic concepts; 233 at the species level) and Primates sec. Groves (2005; MSW3, with 483 taxonomic concepts; 376 at the species level). Using 402 RCC-5 input articulations, the reasoning process yields a single, consistent alignment and 153,111 Maximally Informative Relations that constitute a comprehensive meaning resolution map for every concept pair in the Primates sec. MSW2/MSW3. The complete alignment, and various partitions thereof, facilitate quantitative analyses of name:meaning dissociation, revealing that nearly one in three taxonomic names are not reliable across treatments
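
    The five RCC-5 relations named in this abstract (congruence, proper inclusion in either direction, overlap, and disjointness) can be illustrated with plain sets. The following is a minimal sketch, assuming each taxonomic concept is represented by the set of members it contains; the function name, the relation symbols, and the example member labels are illustrative assumptions and are not taken from the authors' toolkit.

```python
def rcc5(a: frozenset, b: frozenset) -> str:
    """Return the RCC-5 relation between two non-empty sets of members."""
    if not a or not b:
        raise ValueError("RCC-5 regions must be non-empty")
    if a == b:
        return "=="   # congruent
    if a < b:
        return "<"    # a is a proper part of b
    if a > b:
        return ">"    # a properly includes b
    if a & b:
        return "><"   # partial overlap
    return "!"        # disjoint


# Hypothetical concepts: the same name may circumscribe different members
# in an earlier and a later treatment.
earlier = frozenset({"m1", "m2", "m3"})
later = frozenset({"m1", "m2"})
relation = rcc5(later, earlier)  # the later concept is narrower
```

    Comparing memberships this way only works when members are known; the approach described in the abstract instead reasons over user-provided articulations.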

  16. Phylogenetic aspects of the complement system.

    Zarkadis, I K; Mastellos, D; Lambris, J D

    2001-01-01

    During evolution two general systems of immunity have emerged: innate, or natural, immunity and adaptive (acquired), or specific, immunity. The innate system is phylogenetically older and is found in some form in all multicellular organisms, whereas the adaptive system appeared about 450 million years ago and is found in all vertebrates except jawless fish. The complement system in higher vertebrates plays an important role as an effector of both the innate and the acquired immune response, and also participates in various immunoregulatory processes. In lower vertebrates complement is activated by the alternative and lectin pathways and is primarily involved in the opsonization of foreign material. The Agnatha (the most primitive vertebrate species) possess the alternative and lectin pathways while cartilaginous fish are the first species in which the classical pathway appears following the emergence of immunoglobulins. The rest of the poikilothermic species, ranging from teleosts to reptilians, appear to contain a well-developed complement system resembling that of the homeothermic vertebrates. It seems that most of the complement components have appeared after the duplication of primordial genes encoding C3/C4/C5, fB/C2, C1s/C1r/MASP-1/MASP-2, and C6/C7/C8/C9 molecules, in a process that led to the formation of distinct activation pathways. However, unlike homeotherms, several species of poikilotherms (e.g. trout) have recently been shown to possess multiple forms of complement components (C3, factor B) that are structurally and functionally more diverse than those of higher vertebrates. We hypothesize that this remarkable diversity has allowed these animals to expand their innate capacity for immune recognition and response. Recent studies have also indicated the possible presence of complement receptors in protochordates and lower vertebrates. In conclusion, there is considerable evidence suggesting that the complement system is present in the entire lineage of

  17. Sequence Classification: 890247

    cient and accurate synthesis of DNA opposite cyclobutane pyrimidine dimers; homolog of human POLH and bacterial DinB proteins; Rad30p || http://www.ncbi.nlm.nih.gov/protein/6320627 ...

  18. A possibilistic approach to target classification

    This chapter describes an alternative to the Bayesian approach to target classification that is based on possibility theory. A possibilistic classifier minimizes the maximum cost of the classification decision taking into account the a posteriori possibilities of the target classes given the measured target attributes. The advantage of a possibilistic classifier when compared with a Bayesian classifier is that it requires only an ordinal ranking of the costs associated with the classification decisions and the uncertainty about the target class. Owing to its qualitative character, a possibilistic classifier is less sensitive to inaccuracies in a priori knowledge than a Bayesian classifier at the expense of a degraded performance in situations where accurate a priori knowledge is available. This robustness of the possibilistic classifier to inaccuracies in a priori knowledge is demonstrated in a case study where an average cost criterion is used to compare the performance of a possibilistic and a Bayesian classifier. It is shown that when the characteristics of the measured target attributes deviate strongly from the expected characteristics, the possibilistic classifier provides a lower average cost than a Bayesian classifier. (orig.)
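
    The contrast between the two decision rules can be sketched in a few lines. The abstract does not give the chapter's exact criterion, so the sketch below assumes a max-min form often used in qualitative decision theory: the worst case over classes of min(possibility, cost), with both values on a shared ordinal scale in [0, 1]. All names, classes, and numbers are hypothetical.

```python
def possibilistic_decision(possibility, cost):
    """Pick the decision whose worst plausible cost is smallest.

    possibility: dict mapping class -> posterior possibility in [0, 1]
    cost: dict mapping decision -> {class: cost on the same ordinal scale}
    """
    def worst_case(decision):
        # A cost only counts to the extent that its class is possible.
        return max(min(possibility[c], cost[decision][c]) for c in possibility)
    return min(cost, key=worst_case)


def bayesian_decision(probability, cost):
    """Pick the decision with the smallest expected cost."""
    def expected(decision):
        return sum(probability[c] * cost[decision][c] for c in probability)
    return min(cost, key=expected)


# Hypothetical two-class, two-decision example.
cost = {
    "engage": {"friend": 1.0, "hostile": 0.0},
    "hold": {"friend": 0.0, "hostile": 0.6},
}
possibility = {"friend": 1.0, "hostile": 0.4}
probability = {"friend": 0.7, "hostile": 0.3}

choice_poss = possibilistic_decision(possibility, cost)
choice_bayes = bayesian_decision(probability, cost)
```

    Because the possibilistic rule uses only min and max, any order-preserving rescaling of the costs leaves the decision unchanged, which matches the abstract's point that only an ordinal ranking is required.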

  19. Fast Image Texture Classification Using Decision Trees

    Thompson, David R.

    2011-01-01

    Texture analysis would permit improved autonomous, onboard science data interpretation for adaptive navigation, sampling, and downlink decisions. These analyses would assist with terrain analysis and instrument placement in both macroscopic and microscopic image data products. Unfortunately, most state-of-the-art texture analysis demands computationally expensive convolutions of filters involving many floating-point operations. This makes them infeasible for radiation-hardened computers and spaceflight hardware. A new method approximates traditional texture classification of each image pixel with a fast decision-tree classifier. The classifier uses image features derived from simple filtering operations involving integer arithmetic. The texture analysis method is therefore amenable to implementation on FPGA (field-programmable gate array) hardware. Image features based on the "integral image" transform produce descriptive and efficient texture descriptors. Training the decision tree on a set of training data yields a classification scheme that produces reasonable approximations of optimal "texton" analysis at a fraction of the computational cost. A decision-tree learning algorithm employing the traditional k-means criterion of inter-cluster variance is used to learn tree structure from training data. The result is an efficient and accurate summary of surface morphology in images. This work is an evolutionary advance that unites several previous algorithms (k-means clustering, integral images, decision trees) and applies them to a new problem domain (morphology analysis for autonomous science during remote exploration). Advantages include order-of-magnitude improvements in runtime, feasibility for FPGA hardware, and significant improvements in texture classification accuracy.
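
    The "integral image" transform mentioned above is what makes box-filter features cheap with integer arithmetic: after one pass over the image, the sum over any rectangle takes four lookups. A minimal pure-Python sketch follows; the function names and the tiny test image are illustrative assumptions, and the article's actual feature set is not reproduced here.

```python
def integral_image(img):
    """Build a summed-area table with an extra leading row and column of zeros."""
    height, width = len(img), len(img[0])
    sat = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0
        for x in range(width):
            row_sum += img[y][x]
            # Sum of everything above plus the running sum of this row.
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat


def box_sum(sat, top, left, bottom, right):
    """Sum of pixels in rows [top, bottom) and columns [left, right)."""
    return sat[bottom][right] - sat[top][right] - sat[bottom][left] + sat[top][left]


image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
table = integral_image(image)
lower_right = box_sum(table, 1, 1, 3, 3)  # pixels 5, 6, 8, 9
```

    Only additions and subtractions of integers are involved, which is consistent with the abstract's emphasis on avoiding floating-point convolutions.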

  20. Nominated Texture Based Cervical Cancer Classification

    Edwin Jayasingh Mariarputham

    2015-01-01

    Accurate classification of Pap smear images is a challenging task in medical image processing. It can be improved in two ways: by selecting suitable, well-defined, specific features, and by selecting the best classifier. This paper presents a nominated texture based cervical cancer (NTCC) classification system that classifies Pap smear images into one of seven classes. This is achieved by extracting well-defined texture features and selecting the best classifier. Seven sets of texture features (24 features) are extracted, which include relative size of nucleus and cytoplasm, dynamic range and first four moments of intensities of nucleus and cytoplasm, relative displacement of nucleus within the cytoplasm, gray level co-occurrence matrix, local binary pattern histogram, Tamura features, and edge orientation histogram. A few types of support vector machine (SVM) and neural network (NN) classifiers are used for the classification. The performance of the NTCC algorithm is tested and compared to other algorithms on the public image database of Herlev University Hospital, Denmark, with 917 Pap smear images. The SVM output is found to be best for most of the classes and gives better results for the remaining classes.

  1. Subphenotyping and Classification of Orofacial Clefts: Need for Orofacial Cleft Subphenotyping Calls for Revised Classification.

    McBride, W A; McIntyre, G T; Carroll, K; Mossey, P A

    2016-09-01

    Nonsyndromic orofacial clefting (OFC) describes a range of phenotypes that represent the most common craniofacial birth defects in humans, with an overall birth prevalence of 1:700 live births. Because of the lifelong negative implications on health and well-being associated with OFC and the numbers of people affected, quality research into its etiology, diagnosis, treatment outcomes, and preventative strategies is essential. A range of different methods is used for recording and classifying OFC subphenotypes, one of which is the International Classification of Diseases (ICD) system. However, there is a general perception that research is being hampered by a lack of sensitivity and specificity in grouping those with OFC into subphenotypes, with potential heterogeneity and confounding in epidemiologic, genetic, and genotype-phenotype correlation studies. This article provides a background to the necessity of OFC research, discusses current controversies within cleft subphenotyping, and provides a brief overview of current OFC classifications as well as their limitations. The LAHSHAL classification is described in the context of a potentially useful tool for OFC that could complement the ICD-10/ICD-11 Beta coding systems to become a simply understood, universally accepted, clinically friendly, and research-sensitive instrument. Empowering registries, clinicians, and researchers to use a common classification system would have significant implications for OFC research across the world at a time when accurate subphenotyping is crucial and health care research is becoming increasingly tailored toward the individual. PMID:26171570

  2. Fungal catalases: function, phylogenetic origin and structure.

    Hansberg, Wilhelm; Salas-Lizana, Rodolfo; Domínguez, Laura

    2012-09-15

    Most fungi have several monofunctional heme-catalases. Filamentous ascomycetes (Pezizomycotina) have two types of large-size subunit catalases (L1 and L2). L2-type are usually induced by different stressors and are extracellular enzymes; those from the L1-type are not inducible and accumulate in asexual spores. L2 catalases are important for growth and the start of cell differentiation, while L1 are required for spore germination. In addition, pezizomycetes have one to four small-size subunit catalases. Yeasts (Saccharomycotina) do not have large-subunit catalases and generally have one peroxisomal and one cytosolic small-subunit catalase. Small-subunit catalases are inhibited by substrate while large-subunit catalases are activated by H(2)O(2). Some small-subunit catalases bind NADPH preventing inhibition by substrate. We present a phylogenetic analysis revealing one or two events of horizontal gene transfers from Actinobacteria to a fungal ancestor before fungal diversification, as the origin of large-size subunit catalases. Other possible horizontal transfers of small- and large-subunit catalases genes were detected and one from bacteria to the fungus Malassezia globosa was analyzed in detail. All L2-type catalases analyzed presented a secretion signal peptide. Mucorales preserved only L2-type catalases, with one containing a secretion signal if two or more are present. Basidiomycetes have only L1-type catalases, all lacking signal peptide. Fungal small-size catalases are related to animal catalases and probably evolved from a common ancestor. However, there are several groups of small-size catalases. In particular, a conserved group of fungal sequences resemble plant catalases, whose phylogenetic origin was traced to a group of bacteria. This group probably has the heme orientation of plant catalases and could in principle bind NADPH. 
From almost a hundred small-subunit catalases only one fourth has a peroxisomal localization signal and in fact many fungi lack

  3. Sound classification of dwellings

    Rasmussen, Birgit

    2012-01-01

    needed, and a European COST Action TU0901 "Integrating and Harmonizing Sound Insulation Aspects in Sustainable Urban Housing Constructions", has been established and runs 2009-2013, one of the main objectives being to prepare a proposal for a European sound classification scheme with a number of quality......National schemes for sound classification of dwellings exist in more than ten countries in Europe, typically published as national standards. The schemes define quality classes reflecting different levels of acoustical comfort. Main criteria concern airborne and impact sound insulation between...... dwellings, facade sound insulation and installation noise. The schemes have been developed, implemented and revised gradually since the early 1990s. However, due to lack of coordination between countries, there are significant discrepancies, and new standards and revisions continue to increase the diversity...

  4. Soil Classification Using GATree

    Bhargavi, P

    2010-01-01

    This paper details the application of a genetic programming framework to build classification decision trees from soil data in order to classify soil texture. The database contains measurements of soil profile data. We have applied GATree for generating the classification decision tree. GATree is a decision tree builder that is based on Genetic Algorithms (GAs). The idea behind it is rather simple but powerful. Instead of using statistical metrics that are biased towards specific trees, we use a more flexible, global metric of tree quality that tries to optimize accuracy and size. GATree offers some unique features not to be found in any other tree inducers, while at the same time it can produce better results for many difficult problems. Experimental results are presented which illustrate the performance of generating the best decision tree for classifying soil texture on the soil data set.

  5. Sequence-based molecular phylogenetics and phylogeography of the American box turtles (Terrapene spp.) with support from DNA barcoding.

    Martin, Bradley T; Bernstein, Neil P; Birkhead, Roger D; Koukl, Jim F; Mussmann, Steven M; Placyk, John S

    2013-07-01

    The classification of the American box turtles (Terrapene spp.) has remained enigmatic to systematists. Previous comprehensive phylogenetic studies focused primarily on morphology. The goal of this study was to re-assess the classification of Terrapene spp. by obtaining DNA sequence data from a broad geographic range and from all four recognized species and 11 subspecies within the genus. Tissue samples were obtained for all taxa except for Terrapene nelsoni klauberi. DNA was extracted, and the mitochondrial DNA (mtDNA) cytochrome b (Cytb) and nuclear DNA (nucDNA) glyceraldehyde-3-phosphate-dehydrogenase (GAPD) genes were amplified via polymerase chain reaction and sequenced. In addition, the mtDNA gene commonly used for DNA barcoding (cytochrome oxidase c subunit I; COI) was amplified and sequenced to calculate pairwise percent DNA sequence divergence comparisons for each Terrapene taxon. The sequence data were analyzed using maximum likelihood and Bayesian phylogenetic inference, a molecular clock, AMOVAs, SAMOVAs, haplotype networks, and pairwise percent sequence divergence comparisons. Terrapene carolina mexicana and T. c. yucatana formed a monophyletic clade with T. c. triunguis, and this clade was paraphyletic to the rest of T. carolina. Terrapene ornata ornata and T. o. luteola lacked distinction phylogenetically, and Terrapene nelsoni was confirmed to be the sister taxon of T. ornata. Terrapene c. major, T. c. bauri, and Terrapene coahuila were not well resolved for some of the analyses. The DNA barcoding results indicated that all taxa were different species (>2% sequence divergence) except for T. c. triunguis - T. c. mexicana and T. o. ornata - T. o. luteola. The results suggest that T. c. triunguis should be elevated to species status (Terrapene mexicana), and mexicana and yucatana should be included in this group as subspecies. In addition, T. o. ornata and T. o. luteola should not be considered separate subspecies. 
The DNA barcoding data support these

  6. Short Text Classification: A Survey

    Ge Song

    2014-05-01

    With the recent explosive growth of e-commerce and online communication, a new genre of text, short text, has been extensively applied in many areas, and much research now focuses on short text mining. Classifying short text is challenging owing to its inherent characteristics, such as sparseness, large scale, immediacy, and non-standardization. Traditional methods struggle with short text classification mainly because the few words in a short text cannot represent the feature space and the relationship between words and documents. Several studies and reviews on text classification have appeared recently, but only a few focus on short text classification. This paper discusses the characteristics of short text and the difficulty of short text classification. We then introduce existing popular work on short text classifiers and models, including short text classification using semantic analysis, semi-supervised short text classification, ensemble short text classification, and real-time classification. The evaluation of short text classification is also analyzed. Finally, we summarize the existing classification technology and the prospects for the development of short text classification.

  7. Estuary Classification Revisited

    Guha, Anirban; Lawrence, Gregory A.

    2012-01-01

    This paper presents the governing equations of a tidally-averaged, width-averaged, rectangular estuary in completely nondimensionalized forms. Subsequently, we discover that the dynamics of an estuary is entirely controlled by only two variables: (i) the Estuarine Froude number, and (ii) a nondimensional number related to the Estuarine Aspect ratio and the Tidal Froude number. Motivated by this new observation, the problem of estuary classification is re-investigated. Our analysis shows that ...

  8. Classification of Arabic Documents

    Elbery, Ahmed

    2012-01-01

    Arabic language is a very rich language with complex morphology, so it has a very different and difficult structure than other languages. So it is important to build an Arabic Text Classifier (ATC) to deal with this complex language. The importance of text or document classification comes from its wide variety of application domains such as text indexing, document sorting, text filtering, and Web page categorization. Due to the immense amount of Arabic documents as well as the number of inter...

  9. Qatar content classification

    Handosa, Mohamed

    2014-01-01

    Short title: Qatar content classification. Long title: Develop methods and software for classifying Arabic texts into a taxonomy using machine learning. Contact person and their contact information: Tarek Kanan, . Project description: Starting 4/1/2012, and running through 12/31/2015, is a project to advance digital libraries in the country of Qatar. This is led by VT, but also involves Penn State, Texas A&M, and Qatar University. Tarek is a GRA on this effort. His di...

  10. Application of ant colony optimization in NPP classification fault location

    Nuclear Power Plant is a highly complex structural system with high safety requirements. Fault location appears to be particularly important to enhance its safety. Ant Colony Optimization is a new type of optimization algorithm, which is used in the fault location and classification of nuclear power plants in this paper. Taking the main coolant system of the first loop as the study object, using VB6.0 programming technology, the NPP fault location system is designed, and is tested against the related data in the literature. Test results show that the ant colony optimization can be used in the accurate classification fault location in the nuclear power plants. (authors)

  11. Maximum mutual information regularized classification

    Wang, Jim Jing-Yan

    2014-09-07

    In this paper, a novel pattern classification approach is proposed by regularizing the classifier learning to maximize mutual information between the classification response and the true class label. We argue that, with the learned classifier, the uncertainty of the true class label of a data sample should be reduced by knowing its classification response as much as possible. The reduced uncertainty is measured by the mutual information between the classification response and the true class label. To this end, when learning a linear classifier, we propose to maximize the mutual information between classification responses and true class labels of training samples, besides minimizing the classification error and reducing the classifier complexity. An objective function is constructed by modeling mutual information with entropy estimation, and it is optimized by a gradient descent method in an iterative algorithm. Experiments on two real world pattern classification problems show the significant improvements achieved by maximum mutual information regularization.
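
    The quantity being maximized can be made concrete with a small example. The paper models mutual information for a linear classifier via entropy estimation; the sketch below is a simplification that assumes discrete responses and uses a plug-in estimate from empirical counts. The function name and sample data are illustrative assumptions.

```python
from collections import Counter
from math import log2


def mutual_information(responses, labels):
    """Plug-in estimate of I(response; label) in bits from paired samples."""
    if len(responses) != len(labels):
        raise ValueError("responses and labels must have the same length")
    n = len(responses)
    joint = Counter(zip(responses, labels))
    response_counts = Counter(responses)
    label_counts = Counter(labels)
    mi = 0.0
    for (r, y), count in joint.items():
        p_joint = count / n
        p_independent = (response_counts[r] / n) * (label_counts[y] / n)
        mi += p_joint * log2(p_joint / p_independent)
    return mi


labels = [0, 0, 1, 1]
informative = mutual_information([0, 0, 1, 1], labels)    # response determines the label
uninformative = mutual_information([0, 1, 0, 1], labels)  # response independent of the label
```

    A response that fully determines a balanced binary label carries one bit of information about it, while an independent response carries none; a regularizer of this kind rewards classifiers closer to the first case.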

  12. Classification of Meteorological Drought

    Zhang Qiang; Zou Xukai; Xiao Fengjin; Lu Houquan; Liu Haibo; Zhu Changhan; An Shunqing

    2011-01-01

    Background: The national standard of the Classification of Meteorological Drought (GB/T 20481-2006) was developed by the National Climate Center in cooperation with the Chinese Academy of Meteorological Sciences, the National Meteorological Centre and the Department of Forecasting and Disaster Mitigation under the China Meteorological Administration (CMA), and was formally released and implemented in November 2006. In 2008, this Standard won the second prize of the China Standard Innovation and Contribution Awards issued by SAC. Developed through independent innovation, it is the first national standard published to monitor meteorological drought disaster and the first standard in China and around the world specifying the classification of drought. Since its release in 2006, the national standard of Classification of Meteorological Drought has been used by CMA as the operational index to monitor and assess drought, has gradually been adopted by provincial meteorological bureaus, and has been applied to the drought early warning release standard in the Methods of Release and Propagation of Meteorological Disaster Early Warning Signal.

  13. Histologic classification of gliomas.

    Perry, Arie; Wesseling, Pieter

    2016-01-01

    Gliomas form a heterogeneous group of tumors of the central nervous system (CNS) and are traditionally classified based on histologic type and malignancy grade. Most gliomas, the diffuse gliomas, show extensive infiltration in the CNS parenchyma. Diffuse gliomas can be further typed as astrocytic, oligodendroglial, or rare mixed oligodendroglial-astrocytic of World Health Organization (WHO) grade II (low grade), III (anaplastic), or IV (glioblastoma). Other gliomas generally have a more circumscribed growth pattern, with pilocytic astrocytomas (WHO grade I) and ependymal tumors (WHO grade I, II, or III) as the most frequent representatives. This chapter provides an overview of the histology of all glial neoplasms listed in the WHO 2016 classification, including the less frequent "nondiffuse" gliomas and mixed neuronal-glial tumors. For multiple decades the histologic diagnosis of these tumors formed a useful basis for assessment of prognosis and therapeutic management. However, it is now fully clear that information on the molecular underpinnings often allows for a more robust classification of (glial) neoplasms. Indeed, in the WHO 2016 classification, histologic and molecular findings are integrated in the definition of several gliomas. As such, this chapter and Chapter 6 are highly interrelated and neither should be considered in isolation. PMID:26948349

  14. BMGE (Block Mapping and Gathering with Entropy): a new software for selection of phylogenetic informative regions from multiple sequence alignments

    Gribaldo Simonetta

    2010-07-01

    Background: The quality of multiple sequence alignments plays an important role in the accuracy of phylogenetic inference. It has been shown that removing ambiguously aligned regions, but also other sources of bias such as highly variable (saturated) characters, can improve the overall performance of many phylogenetic reconstruction methods. A current scientific trend is to build phylogenetic trees from a large number of sequence datasets (semi-)automatically extracted from numerous complete genomes. Because these approaches do not allow a precise manual curation of each dataset, there exists a real need for efficient bioinformatic tools dedicated to this alignment character trimming step. Results: Here we present new software, named BMGE (Block Mapping and Gathering with Entropy), that is designed to select regions in a multiple sequence alignment that are suited for phylogenetic inference. For each character, BMGE computes a score closely related to an entropy value. Calculation of these entropy-like scores is weighted with BLOSUM or PAM similarity matrices in order to distinguish between biologically expected and unexpected variability for each aligned character. Sets of contiguous characters with a score above a given threshold are considered as not suited for phylogenetic inference and are then removed. Simulation analyses show that the character trimming performed by BMGE produces datasets leading to accurate trees, especially with alignments including distantly-related sequences. BMGE also implements trimming and recoding methods aimed at minimizing phylogeny reconstruction artefacts due to compositional heterogeneity. Conclusions: BMGE is able to perform biologically relevant trimming on a multiple alignment of DNA, codon or amino acid sequences. Java source code and executable are freely available at ftp://ftp.pasteur.fr/pub/GenSoft/projects/BMGE/.
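
    The idea of scoring each alignment column and discarding high-scoring ones can be sketched with plain Shannon entropy. This is a deliberate simplification and not BMGE's algorithm: it omits the BLOSUM/PAM weighting and the grouping of contiguous characters described above, and it treats gaps naively. The function names, the gap symbol, and the threshold are assumptions made for illustration.

```python
from collections import Counter
from math import log2


def column_entropy(column, gap="-"):
    """Shannon entropy in bits of the non-gap characters in one column."""
    residues = [c for c in column if c != gap]
    if not residues:
        return 0.0  # simplification: gap-only columns are scored as invariant
    n = len(residues)
    return -sum((k / n) * log2(k / n) for k in Counter(residues).values())


def trim_alignment(sequences, threshold):
    """Keep only the columns whose entropy does not exceed the threshold."""
    columns = list(zip(*sequences))
    keep = [i for i, col in enumerate(columns) if column_entropy(col) <= threshold]
    return ["".join(seq[i] for i in keep) for seq in sequences]


# Toy alignment: the third column is maximally variable, the rest are conserved.
alignment = ["ACGT", "ACCT", "ACTT", "ACAT"]
trimmed = trim_alignment(alignment, threshold=1.0)
```

    In this toy case the variable third column is dropped and the conserved columns survive; a similarity-weighted score would additionally distinguish conservative substitutions from radical ones.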

  15. The complete chloroplast genome sequence of Ampelopsis: gene organization, comparative analysis and phylogenetic relationships to other angiosperms

    Gurusamy Raman

    2016-03-01

    Ampelopsis brevipedunculata is an economically important plant that belongs to the Vitaceae family of angiosperms. The phylogenetic placement of Vitaceae is still unresolved. Recent phylogenetic studies suggested that it should be placed in various alternative families including Caryophyllaceae, Asteraceae, Saxifragaceae, Dilleniaceae, or with the rest of the rosid families. However, these analyses provided weak supportive results because they were based on only one of several genes. Accordingly, complete chloroplast genome sequences are required to resolve the phylogenetic relationships among angiosperms. Recent phylogenetic analyses based on the complete chloroplast genome sequence suggested strong support for the position of Vitaceae as the earliest diverging lineage of rosids and placed it as a sister to the remaining rosids. These studies also revealed relationships among several major lineages of angiosperms; however, they highlighted the significance of taxon sampling for obtaining accurate phylogenies. In the present study, we sequenced the complete chloroplast genome of A. brevipedunculata and used these data to assess the relationships among 32 angiosperms, including 18 taxa of rosids. The Ampelopsis chloroplast genome is 161,090 bp in length, and includes a pair of inverted repeats of 26,394 bp that are separated by small and large single copy regions of 19,036 bp and 89,266 bp, respectively. The gene content and order of Ampelopsis is identical to many other unrearranged angiosperm chloroplast genomes, including Vitis and tobacco. A phylogenetic tree constructed based on 70 protein-coding genes of 33 angiosperms showed that both Saxifragales and Vitaceae diverged from the rosid clade and formed two clades with 100% bootstrap value. The position of the Vitaceae is sister to Saxifragales, and both are the basal and earliest diverging lineages. Moreover, Saxifragales forms a sister clade to Vitaceae of rosids. Overall, the results of
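
    The reported region lengths can be checked against the reported total, since a chloroplast genome with this layout consists of one large single-copy region, one small single-copy region, and two copies of the inverted repeat. The short sketch below uses only the figures quoted in the abstract; the variable and function names are illustrative.

```python
# Lengths in base pairs, as quoted in the abstract.
LARGE_SINGLE_COPY = 89_266
SMALL_SINGLE_COPY = 19_036
INVERTED_REPEAT = 26_394   # length of each of the two copies
REPORTED_TOTAL = 161_090


def quadripartite_length(lsc, ssc, ir):
    """Total genome length: both single-copy regions plus two inverted repeats."""
    return lsc + ssc + 2 * ir


computed_total = quadripartite_length(
    LARGE_SINGLE_COPY, SMALL_SINGLE_COPY, INVERTED_REPEAT
)
```

    The computed sum equals the reported 161,090 bp, so the four figures in the abstract are internally consistent.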

  16. The probability of a gene tree topology within a phylogenetic network with applications to hybridization detection.

    Yun Yu

    Gene tree topologies have proven a powerful data source for various tasks, including species tree inference and species delimitation. Consequently, methods for computing probabilities of gene trees within species trees have been developed and widely used in probabilistic inference frameworks. All these methods assume an underlying multispecies coalescent model. However, when reticulate evolutionary events such as hybridization occur, these methods are inadequate, as they do not account for such events. Methods that account for both hybridization and deep coalescence in computing the probability of a gene tree topology currently exist for very limited cases. However, no such methods exist for general cases, owing primarily to the fact that it is currently unknown how to compute the probability of a gene tree topology within the branches of a phylogenetic network. Here we present a novel method for computing the probability of gene tree topologies on phylogenetic networks and demonstrate its application to the inference of hybridization in the presence of incomplete lineage sorting. We reanalyze a Saccharomyces species data set for which multiple analyses had converged on a species tree candidate. Using our method, though, we show that an evolutionary hypothesis involving hybridization in this group has better support than one of strict divergence. A similar reanalysis on a group of three Drosophila species shows that the data is consistent with hybridization. Further, using extensive simulation studies, we demonstrate the power of gene tree topologies at obtaining accurate estimates of branch lengths and hybridization probabilities of a given phylogenetic network. Finally, we discuss identifiability issues with detecting hybridization, particularly in cases that involve extinction or incomplete sampling of taxa.

  17. Accurate atomic data for industrial plasma applications

    Griesmann, U.; Bridges, J.M.; Roberts, J.R.; Wiese, W.L.; Fuhr, J.R. [National Inst. of Standards and Technology, Gaithersburg, MD (United States)

    1997-12-31

    Reliable branching fraction, transition probability and transition wavelength data for radiative dipole transitions of atoms and ions in plasma are important in many industrial applications. Optical plasma diagnostics and modeling of the radiation transport in electrical discharge plasmas (e.g. in electrical lighting) depend on accurate basic atomic data. NIST has an ongoing experimental research program to provide accurate atomic data for radiative transitions. The new NIST UV-vis-IR high resolution Fourier transform spectrometer has become an excellent tool for accurate and efficient measurements of numerous transition wavelengths and branching fractions in a wide wavelength range. Recently, the authors have also begun to employ photon counting techniques for very accurate measurements of branching fractions of weaker spectral lines with the intent to improve the overall accuracy for experimental branching fractions to better than 5%. They have now completed their studies of transition probabilities of Ne I and Ne II. The results agree well with recent calculations and for the first time provide reliable transition probabilities for many weak intercombination lines.

  18. More accurate picture of human body organs

    Computerized tomography and nuclear magnetic resonance tomography (NMRT) are revolutionary contributions to radiodiagnosis because they make it possible to obtain a more accurate image of human body organs. The principles of both methods are described. Attention is mainly devoted to NMRT, which has been in clinical use for only three years. It does not burden the organism with ionizing radiation. (Ha)

  19. Exploiting multi-context analysis in semantic image classification

    TIAN Yong-hong; HUANG Tie-jun; GAO Wen

    2005-01-01

    As the popularity of digital images is rapidly increasing on the Internet, research on technologies for semantic image classification has become an important research topic. However, the well-known content-based image classification methods do not overcome the so-called semantic gap problem in which low-level visual features cannot represent the high-level semantic content of images. Image classification using visual and textual information often performs poorly since the extracted textual features are often too limited to accurately represent the images. In this paper, we propose a semantic image classification approach using multi-context analysis. For a given image, we model the relevant textual information as its multi-modal context, and regard the related images connected by hyperlinks as its link context. Two kinds of context analysis models, i.e., cross-modal correlation analysis and link-based correlation model, are used to capture the correlation among different modals of features and the topical dependency among images induced by the link structure. We propose a new collective classification model called relational support vector classifier (RSVC) based on the well-known Support Vector Machines (SVMs) and the link-based correlation model. Experiments showed that the proposed approach significantly improved classification accuracy over that of SVM classifiers using visual and/or textual features.

  20. Polarimetric Synthetic Aperture Radar Image Classification by a Hybrid Method

    Kamran Ullah Khan; YANG Jian

    2007-01-01

    Different methods proposed so far for accurate classification of land cover types in polarimetric synthetic aperture radar (SAR) images are data-specific and no general method is available. A novel hybrid framework for this classification was developed in this work. A set of effective features derived from the coherence matrix of polarimetric SAR data was proposed. Constituents of the feature set are wavelet, texture, and nonlinear features. The proposed feature set has a strong discrimination power. A neural network was used as the classification engine in a unique way. By exploiting the speed of the conjugate gradient method and the convergence rate of the Levenberg-Marquardt method (near the optimal point), an overall speed-up of the classification procedure was achieved. Principal component analysis (PCA) was used to shrink the dimension of the feature vector without sacrificing much of the classification accuracy. The proposed approach is compared with the maximum likelihood estimator (MLE) based on the complex Wishart distribution and the results show the superiority of the proposed method, with the average classification accuracy by the proposed method (95.4%) higher than that of the MLE (93.77%). Use of PCA to reduce the dimensionality of the feature vector helps reduce the memory requirements and computational cost, thereby enhancing the speed of the process.

  1. Phylogenetic relationships of Malaysia’s long-tailed macaques, Macaca fascicularis, based on cytochrome b sequences

    Abdul-Latiff Muhammad Abu Bakar

    2014-05-01

    Phylogenetic relationships among Malaysia’s long-tailed macaques have yet to be established, despite abundant genetic studies of the species worldwide. The aims of this study are to examine the phylogenetic relationships of Macaca fascicularis in Malaysia and to test its classification as a morphological subspecies. A total of 25 genetic samples of M. fascicularis yielding 383 bp of Cytochrome b (Cyt b) sequences were used in phylogenetic analysis along with one sample each of M. nemestrina and M. arctoides used as outgroups. Sequence character analysis reveals that Cyt b locus is a highly conserved region with only 23% parsimony informative character detected among ingroups. Further analysis indicates a clear separation between populations originating from different regions; the Malay Peninsula versus Borneo Insular, the East Coast versus West Coast of the Malay Peninsula, and the island versus mainland Malay Peninsula populations. Phylogenetic trees (NJ, MP and Bayesian) portray a consistent clustering paradigm as Borneo’s population was distinguished from Peninsula’s population (98% and 100% bootstrap value in NJ and MP respectively and 1.00 posterior probability in Bayesian trees). The East coast population was separated from other Peninsula populations (100% in NJ and MP, 1.00 in Bayesian). West coast populations were divided into 2 clades: the North-South (54% in NJ, 45% in MP and 0.99 in Bayesian) and Island-Mainland (54% in NJ, 45% in MP and 0.99 in Bayesian). The results confirm the previous morphological assignment of 2 subspecies, M. f. fascicularis and M. f. argentimembris, in the Malay Peninsula. These populations should be treated as separate genetic entities in order to conserve the genetic diversity of Malaysia’s M. fascicularis. These findings are crucial in aiding the conservation management and translocation process of M. fascicularis populations in Malaysia.

  2. Phylogenetic performance of mitochondrial protein-coding genes of Oncomelania hupensis in resolving relationships between landscape populations

    Shi-Zhu LI; Li ZHANG; Lin MA; Wei HU; Shan LV; Qin LIU; Ying-Jun QIAN

    2013-01-01

    Oncomelania hupensis is the unique intermediate host of Schistosoma japonicum, which plays a key role in the transmission of human blood fluke Schistosoma. The complete mitochondrial (mt) genome of O. hupensis has been characterized; however, the phylogenetic performance of mt protein-coding genes (PCGs) of the snail remains unclear. In this study, 11 whole mt genomes of snails collected from four different ecological settings in China and the Philippines were sequenced. The mt genome sizes ranged from 15 183 to 15 216 bp, with the G + C contents from 32.4% to 33.4%. A total of 15 251 characters were generated from the multiple sequence alignment. Of 2711 (17.8%) polymorphic sites, 56.22% (1524) were parsimony sites. The mt genomes' phylogenetic trees were reconstructed using minimum evolution, neighbor joining, maximum likelihood, maximum parsimony, and Bayesian tree estimate methods, and two main distinct clades were identified: (i) the isolate from mountainous regions; (ii) the remaining isolate which included three inner branches. All phylogenetic trees of the 13 PCGs were generated by running 1000 bootstrap replicates and compared with the complete mtDNA tree, the classification accuracy ranging from 21.23% to 87.87%, the topological distance of phylogenetic trees between PCGs ranging from 5 to 14. Therefore, the performance of PCGs can be divided into good condition (COⅠ, ND2, ND5, and ND3), medium (COⅡ, ATP6, ND1, ND6, Cytb, ND4, and COⅢ), poor (ATP8 and ND4L). This study represents the first analysis of mt genome diversity of the O. hupensis snail and phylogenetic performance of mt PCGs. It presents clear evidence that the snail populations can be separated into four landscape genetic populations in mainland China based on whole mt genomes. The identification of the phylogenetic performance of PCGs provides new insight into the intensive genetic diversity study using mtDNA markers for the snail.

  3. PhyloBayes MPI: phylogenetic reconstruction with infinite mixtures of profiles in a parallel environment.

    Lartillot, Nicolas; Rodrigue, Nicolas; Stubbs, Daniel; Richer, Jacques

    2013-07-01

    Modeling across site variation of the substitution process is increasingly recognized as important for obtaining more accurate phylogenetic reconstructions. Both finite and infinite mixture models have been proposed and have been shown to significantly improve on classical single-matrix models. Compared with their finite counterparts, infinite mixtures have a greater expressivity. However, they are computationally more challenging. This has resulted in practical compromises in the design of infinite mixture models. In particular, a fast but simplified version of a Dirichlet process model over equilibrium frequency profiles implemented in PhyloBayes has often been used in recent phylogenomics studies, while more refined model structures, more realistic and empirically more fit, have been practically out of reach. We introduce a message passing interface version of PhyloBayes, implementing the Dirichlet process mixture models as well as more classical empirical matrices and finite mixtures. The parallelization is made efficient thanks to the combination of two algorithmic strategies: a partial Gibbs sampling update of the tree topology and the use of a truncated stick-breaking representation for the Dirichlet process prior. The implementation shows close to linear gains in computational speed for up to 64 cores, thus allowing faster phylogenetic reconstruction under complex mixture models. PhyloBayes MPI is freely available from our website www.phylobayes.org. PMID:23564032

  4. Biomarker Selection and Classification of “-Omics” Data Using a Two-Step Bayes Classification Framework

    Anunchai Assawamakin

    2013-01-01

    Identification of suitable biomarkers for accurate prediction of phenotypic outcomes is a goal for personalized medicine. However, current machine learning approaches are either too complex or perform poorly. Here, a novel two-step machine-learning framework is presented to address this need. First, a Naïve Bayes estimator is used to rank features, so that the top-ranked ones are most likely to contain the most informative features for prediction of the underlying biological classes. The top-ranked features are then used in a Hidden Naïve Bayes classifier to construct a classification prediction model from these filtered attributes. In order to obtain the minimum set of the most informative biomarkers, the bottom-ranked features are successively removed from the Naïve Bayes-filtered feature list one at a time, and the classification accuracy of the Hidden Naïve Bayes classifier is checked for each pruned feature set. The performance of the proposed two-step Bayes classification framework was tested on different types of -omics datasets including gene expression microarray, single nucleotide polymorphism microarray (SNP array), and surface-enhanced laser desorption/ionization time-of-flight (SELDI-TOF) proteomic data. The proposed two-step Bayes classification framework matched and, in some cases, outperformed other classification methods in terms of prediction accuracy, minimum number of classification markers, and computational time.
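
    The rank-then-prune loop described in this abstract can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: a plain Gaussian Naïve Bayes model stands in for both the ranking estimator and the Hidden Naïve Bayes classifier, accuracy is measured on the training data itself rather than by cross-validation, and the toy data and all function names are hypothetical.

```python
import math

def fit_gaussian_nb(rows, labels, features):
    """Estimate class priors and per-feature mean/variance (stand-in model)."""
    model = {}
    for cls in sorted(set(labels)):
        members = [r for r, y in zip(rows, labels) if y == cls]
        stats = []
        for f in features:
            vals = [r[f] for r in members]
            mean = sum(vals) / len(vals)
            # Small constant keeps the variance strictly positive.
            var = sum((v - mean) ** 2 for v in vals) / len(vals) + 1e-6
            stats.append((mean, var))
        model[cls] = (len(members) / len(rows), stats)
    return model

def predict(model, row, features):
    """Return the class with the highest log-posterior."""
    best_cls, best_score = None, -math.inf
    for cls, (prior, stats) in model.items():
        score = math.log(prior)
        for f, (mean, var) in zip(features, stats):
            score += -0.5 * math.log(2 * math.pi * var)
            score -= (row[f] - mean) ** 2 / (2 * var)
        if score > best_score:
            best_cls, best_score = cls, score
    return best_cls

def accuracy(rows, labels, features):
    """Training-set accuracy (a simplification; real use needs held-out data)."""
    model = fit_gaussian_nb(rows, labels, features)
    hits = sum(predict(model, r, features) == y for r, y in zip(rows, labels))
    return hits / len(rows)

def two_step_select(rows, labels):
    # Step 1: rank features by how well each one classifies on its own.
    ranked = sorted(range(len(rows[0])),
                    key=lambda f: accuracy(rows, labels, [f]), reverse=True)
    # Step 2: drop the bottom-ranked feature while accuracy does not fall.
    selected = ranked[:]
    best = accuracy(rows, labels, selected)
    while len(selected) > 1:
        trial = selected[:-1]
        score = accuracy(rows, labels, trial)
        if score < best:
            break
        selected, best = trial, score
    return selected, best

# Hypothetical toy data: feature 0 separates the classes, 1 and 2 are noise.
rows = [[0.1, 5, 3], [0.2, 7, 1], [0.0, 6, 2], [0.3, 8, 4],
        [1.9, 6, 1], [2.1, 8, 2], [2.0, 5, 4], [1.8, 7, 3]]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
selected, best = two_step_select(rows, labels)
```

    On this toy data the loop keeps only the informative feature. With real -omics data the accuracy check would need cross-validation to avoid selecting features that merely fit noise.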

  5. BLAST-EXPLORER helps you building datasets for phylogenetic analysis

    Claverie Jean-Michel

    2010-01-01

    Background: The right sampling of homologous sequences for phylogenetic or molecular evolution analyses is a crucial step, the quality of which can have a significant impact on the final interpretation of the study. There is no single way of constructing datasets suitable for phylogenetic analysis, because this task depends intimately on the scientific question to be addressed. Moreover, database mining software such as BLAST, which is routinely used for searching homologous sequences, is not specifically optimized for this task. Results: To fill this gap, we designed BLAST-Explorer, an original and friendly web-based application that combines a BLAST search with a suite of tools that allows interactive, phylogenetic-oriented exploration of the BLAST results and flexible selection of homologous sequences among the BLAST hits. Once the BLAST hits have been selected using BLAST-Explorer, the corresponding sequences can be imported locally for external analysis or passed to the phylogenetic tree reconstruction pipelines available on the Phylogeny.fr platform. Conclusions: BLAST-Explorer provides a simple, intuitive and interactive graphical representation of the BLAST results and allows selection and retrieval of the BLAST hit sequences based on a wide range of criteria. Although BLAST-Explorer primarily aims at helping the construction of sequence datasets for further phylogenetic study, it can also be used as a standard BLAST server with enriched output. BLAST-Explorer is available at http://www.phylogeny.fr

  6. SUMAC: Constructing Phylogenetic Supermatrices and Assessing Partially Decisive Taxon Coverage.

    Freyman, William A

    2015-01-01

    The amount of phylogenetically informative sequence data in GenBank is growing at an exponential rate, and large phylogenetic trees are increasingly used in research. Tools are needed to construct phylogenetic sequence matrices from GenBank data and evaluate the effect of missing data. Supermatrix Constructor (SUMAC) is a tool to data-mine GenBank, construct phylogenetic supermatrices, and assess the phylogenetic decisiveness of a matrix given the pattern of missing sequence data. SUMAC calculates a novel metric, Missing Sequence Decisiveness Scores (MSDS), which measures how much each individual missing sequence contributes to the decisiveness of the matrix. MSDS can be used to compare supermatrices and prioritize the acquisition of new sequence data. SUMAC constructs supermatrices either through an exploratory clustering of all GenBank sequences within a taxonomic group or by using guide sequences to build homologous clusters in a more targeted manner. SUMAC assembles supermatrices for any taxonomic group recognized in GenBank and is optimized to run on multicore computer systems by parallelizing multiple stages of operation. SUMAC is implemented as a Python package that can run as a stand-alone command-line program, or its modules and objects can be incorporated within other programs. SUMAC is released under the open source GPLv3 license and is available at https://github.com/wf8/sumac. PMID:26648681

  7. PhyloFinder: An intelligent search engine for phylogenetic tree databases

    Bansal Mukul S; Burleigh J Gordon; Chen Duhong; Fernández-Baca David

    2008-01-01

    Background: Bioinformatic tools are needed to store and access the rapidly growing phylogenetic data. These tools should enable users to identify existing phylogenetic trees containing a specified taxon or set of taxa and to compare a specified phylogenetic hypothesis to existing phylogenetic trees. Results: PhyloFinder is an intelligent search engine for phylogenetic databases that we have implemented using trees from TreeBASE. It enables taxonomic queries, in which it identifies tree...

  8. Classification of LiDAR Data with Point Based Classification Methods

    Yastikli, N.; Cetin, Z.

    2016-06-01

    LiDAR is one of the most effective systems for three-dimensional (3D) data collection over wide areas. With increasing point density and accuracy, airborne LiDAR data is now used frequently in applications such as object extraction, 3D modelling, change detection and map revision. Classification of the LiDAR points is the first step of the LiDAR data processing chain and should be handled properly, since applications such as 3D city modelling, building extraction and DEM generation use the classified point clouds directly. Various classification methods appear in recent research, and most of them work with the gridded LiDAR point cloud. In grid-based processing of LiDAR data, loss of characteristic points in the point cloud, especially for vegetation and buildings, and loss of height accuracy during the interpolation stage are inevitable. A possible solution is to classify the raw point cloud data, which avoids the data and accuracy loss of the gridding process. In this study, the possibilities of point-based classification of the LiDAR point cloud are investigated to obtain more accurate classes. Automatic point-based approaches, built on hierarchical rules, are proposed to derive ground, building and vegetation classes from the raw LiDAR point cloud data. In the proposed approaches, every single LiDAR point is analyzed according to its features, such as height, multi-return, etc., and then automatically assigned to the class to which it belongs. The use of the un-gridded point cloud in the proposed point-based classification process helped to determine more realistic rule sets. Detailed parameter analyses were performed to find the most appropriate parameters in the rule sets for achieving accurate classes. The hierarchical rule sets were created for proposed Approach 1 (using selected spatial-based and echo-based features) and Approach 2 (using only selected spatial-based features
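
    A hierarchical, per-point rule set of the kind described above might look like the sketch below. The abstract does not give the actual rules or parameters, so the feature names, the thresholds and the order of the rules are all hypothetical, and height above ground is assumed to have been computed beforehand.

```python
from dataclasses import dataclass

@dataclass
class LidarPoint:
    height_above_ground: float  # metres; assumed to be precomputed
    number_of_returns: int      # echoes recorded for the emitted pulse

# Hypothetical thresholds, chosen only for illustration.
GROUND_MAX_HEIGHT = 0.3
BUILDING_MIN_HEIGHT = 2.5

def classify_point(point: LidarPoint) -> str:
    """Apply the rules in order; the first rule that matches wins."""
    if point.height_above_ground <= GROUND_MAX_HEIGHT:
        return "ground"
    # Pulses that produce several echoes usually passed through foliage.
    if point.number_of_returns > 1:
        return "vegetation"
    if point.height_above_ground >= BUILDING_MIN_HEIGHT:
        return "building"
    return "unclassified"

points = [LidarPoint(0.1, 1), LidarPoint(6.0, 3), LidarPoint(8.0, 1),
          LidarPoint(1.0, 1)]
classes = [classify_point(p) for p in points]
```

    Because each point is tested on its own attributes, no gridding or interpolation step is involved, which is the property the abstract emphasises.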

  9. On the Classification of Psychology in General Library Classification Schemes.

    Soudek, Miluse

    1980-01-01

    Holds that traditional library classification systems are inadequate to handle psychological literature, and advocates the establishment of new theoretical approaches to bibliographic organization. (FM)

  10. Feedback about more accurate versus less accurate trials: differential effects on self-confidence and activation.

    Badami, Rokhsareh; VaezMousavi, Mohammad; Wulf, Gabriele; Namazizadeh, Mahdi

    2012-06-01

    One purpose of the present study was to examine whether self-confidence or anxiety would be differentially affected by feedback from more accurate rather than less accurate trials. The second purpose was to determine whether arousal variations (activation) would predict performance. On day 1, participants performed a golf putting task under one of two conditions: one group received feedback on the most accurate trials, whereas another group received feedback on the least accurate trials. On day 2, participants completed an anxiety questionnaire and performed a retention test. Skin conductance level, as a measure of arousal, was determined. The results indicated that feedback about more accurate trials resulted in more effective learning as well as increased self-confidence. Also, activation was a predictor of performance. PMID:22808705

  11. Phylogenetic evidence for a case of misleading rather than mislabeling in caviar in the United Kingdom.

    Johnson, Tania Aspasia; Iyengar, Arati

    2015-01-01

    Sturgeons and paddlefish are freshwater fish which are highly valued for their caviar. Despite the fact that every single species of sturgeon and paddlefish is listed under CITES, there are reports of illegal trade in caviar where products are deliberately mislabeled. Three samples of caviar purchased in the United Kingdom were investigated for accurate CITES labeling using COI and cyt b sequencing. Initial species identification was carried out using BLAST followed by phylogenetic analyses using both maximum parsimony and maximum likelihood methods. Results showed no evidence for mislabeling with respect to CITES labels in any of the three samples, but we observed clear evidence for a case of misleading the customer in one sample. PMID:25098816

  12. TreeFam: a curated database of phylogenetic trees of animal gene families

    Li, Heng; Coghlan, Avril; Ruan, Jue;

    2006-01-01

    TreeFam is a database of phylogenetic trees of gene families found in animals. It aims to develop a curated resource that presents the accurate evolutionary history of all animal gene families, as well as reliable ortholog and paralog assignments. Curated families are being added progressively, based on seed alignments and trees in a similar fashion to Pfam. Release 1.1 of TreeFam contains curated trees for 690 families and automatically generated trees for another 11 646 families. These represent over 128 000 genes from nine fully sequenced animal genomes and over 45 000 other animal proteins from UniProt; approximately 40-85% of proteins encoded in the fully sequenced animal genomes are included in TreeFam. TreeFam is freely available at http://www.treefam.org and http://treefam.genomics.org.cn. Publication date: 2006-Jan-1

  13. How Accurate is inv(A)*b?

    Druinsky, Alex

    2012-01-01

    Several widely-used textbooks lead the reader to believe that solving a linear system of equations Ax = b by multiplying the vector b by a computed inverse inv(A) is inaccurate. Virtually all other textbooks on numerical analysis and numerical linear algebra advise against using computed inverses without stating whether this is accurate or not. In fact, under reasonable assumptions on how the inverse is computed, x = inv(A)*b is as accurate as the solution computed by the best backward-stable solvers. This fact is not new, but obviously obscure. We review the literature on the accuracy of this computation and present a self-contained numerical analysis of it.
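
    The comparison discussed in this abstract can be tried on a small system. The sketch below is an illustration under stated assumptions rather than the paper's analysis: the inverse is formed column by column with Gaussian elimination and partial pivoting, the 3x3 matrix is a hypothetical well-conditioned example, and one such example cannot confirm or refute the general result.

```python
def lu_solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    # Work on an augmented copy so the inputs are left untouched.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def inverse(a):
    """Build inv(a) by solving a x = e_j for every unit vector e_j."""
    n = len(a)
    cols = [lu_solve(a, [1.0 if i == j else 0.0 for i in range(n)])
            for j in range(n)]
    # cols[j] is column j of the inverse; transpose to row-major form.
    return [[cols[j][i] for j in range(n)] for i in range(n)]

def matvec(a, x):
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]

def max_error(x, y):
    return max(abs(xi - yi) for xi, yi in zip(x, y))

# Hypothetical test system with a known exact solution.
a = [[4.0, 1.0, 2.0], [1.0, 5.0, 3.0], [2.0, 3.0, 6.0]]
x_true = [1.0, -2.0, 3.0]
b = matvec(a, x_true)

err_solve = max_error(lu_solve(a, b), x_true)          # direct solve
err_inv = max_error(matvec(inverse(a), b), x_true)     # x = inv(A) * b
```

    Comparing `err_solve` and `err_inv` on progressively worse-conditioned matrices is a simple way to explore the claim empirically.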

  14. Accurate guitar tuning by cochlear implant musicians.

    Thomas Lu

    Modern cochlear implant (CI) users understand speech but find difficulty in music appreciation due to poor pitch perception. Still, some deaf musicians continue to perform with their CI. Here we show unexpected results that CI musicians can reliably tune a guitar by CI alone and, under controlled conditions, match simultaneously presented tones to <0.5 Hz. One subject had normal contralateral hearing and produced more accurate tuning with CI than his normal ear. To understand these counterintuitive findings, we presented tones sequentially and found that tuning error was larger at ∼ 30 Hz for both subjects. A third subject, a non-musician CI user with normal contralateral hearing, showed similar trends in performance between CI and normal hearing ears but with less precision. This difference, along with electric analysis, showed that accurate tuning was achieved by listening to beats rather than discriminating pitch, effectively turning a spectral task into a temporal discrimination task.

  15. Accurate guitar tuning by cochlear implant musicians.

    Lu, Thomas; Huang, Juan; Zeng, Fan-Gang

    2014-01-01

    Modern cochlear implant (CI) users understand speech but find difficulty in music appreciation due to poor pitch perception. Still, some deaf musicians continue to perform with their CI. Here we show unexpected results that CI musicians can reliably tune a guitar by CI alone and, under controlled conditions, match simultaneously presented tones to <0.5 Hz. One subject had normal contralateral hearing and produced more accurate tuning with CI than his normal ear. To understand these counterintuitive findings, we presented tones sequentially and found that tuning error was larger at ∼ 30 Hz for both subjects. A third subject, a non-musician CI user with normal contralateral hearing, showed similar trends in performance between CI and normal hearing ears but with less precision. This difference, along with electric analysis, showed that accurate tuning was achieved by listening to beats rather than discriminating pitch, effectively turning a spectral task into a temporal discrimination task. PMID:24651081
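
    The beat mechanism mentioned in this abstract follows from a standard trigonometric identity: the sum of two equal-amplitude tones at f1 and f2 equals a tone at the mean frequency whose loudness waxes and wanes |f1 - f2| times per second. The sketch below checks that identity numerically; the two frequencies are hypothetical values chosen to give a 0.5 Hz beat and are not taken from the study.

```python
import math

def beat_frequency(f1, f2):
    """Number of loudness maxima per second for two simultaneous tones."""
    return abs(f1 - f2)

def two_tone(f1, f2, t):
    """Sum of two equal-amplitude sine tones at time t (seconds)."""
    return math.sin(2 * math.pi * f1 * t) + math.sin(2 * math.pi * f2 * t)

def envelope_form(f1, f2, t):
    """Same signal written as a slow envelope times a fast carrier."""
    envelope = 2 * math.cos(math.pi * (f1 - f2) * t)
    carrier = math.sin(math.pi * (f1 + f2) * t)
    return envelope * carrier

F1, F2 = 110.0, 110.5  # hypothetical: reference tone and slightly sharp string
times = [k / 1000 for k in range(2000)]  # two seconds in 1 ms steps
max_gap = max(abs(two_tone(F1, F2, t) - envelope_form(F1, F2, t))
              for t in times)
beats_per_second = beat_frequency(F1, F2)
```

    As the string approaches the reference pitch the beats slow down and vanish, so tuning reduces to judging a slow change in loudness, which is a temporal cue rather than a pitch cue.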

  16. SPORT FOOD ADDITIVE CLASSIFICATION

    I. P. Prokopenko

    2015-01-01

    Correctly organized nutritional and pharmacological support is an important component of an athlete's preparation for competition, maintenance of optimal form, and fast recovery and rehabilitation after injury and fatigue. Special products of enhanced biological value (BAS) for athletes' nutrition are used for this purpose. They supply the athlete's body with readily usable energy sources, building materials and biologically active substances that regulate and activate metabolic reactions which proceed with difficulty during certain kinds of physical training. The article presents a classification of sport supplements that can be used before warm-up and training, after training and in breaks during competition.

  17. Classification of Emergency Scenarios

    Muench, Mathieu

    2011-01-01

    In most of today's emergency scenarios information plays a crucial role. Therefore, information has to be constantly collected and shared among all rescue team members and this requires new innovative technologies. In this paper a classification of emergency scenarios is presented, describing their special characteristics and common strategies employed by rescue units to handle them. Based on interviews with professional firefighters, requirements for new systems are listed. The goal of this article is to support developers designing new systems by providing them a deeper look into the work of first responders.

  18. Classification of hand eczema

    Agner, T; Aalto-Korte, K; Andersen, K E;

    2015-01-01

    recruited from nine different tertiary referral centres. All patients underwent examination by specialists in dermatology and were checked using relevant allergy testing. Patients were classified into one of the six diagnostic subgroups of HE: allergic contact dermatitis, irritant contact dermatitis, atopic......%) could not be classified. 38% had one additional diagnosis and 26% had two or more additional diagnoses. Eczema on feet was found in 30% of the patients, statistically significantly more frequently associated with hyperkeratotic and vesicular endogenous eczema. CONCLUSION: We find that the classification...

  19. Classification of smooth Fano polytopes

    Øbro, Mikkel

    A simplicial lattice polytope containing the origin in the interior is called a smooth Fano polytope, if the vertices of every facet form a basis of the lattice. The study of smooth Fano polytopes is motivated by their connection to toric varieties. The thesis concerns the classification of smooth Fano polytopes up to isomorphism. A smooth Fano -polytope can have at most vertices. In case of vertices an explicit classification is known. The thesis contains the classification in case of vertices. Classifications of smooth Fano -polytopes for fixed exist only for . In the thesis an algorithm for the classification of smooth Fano -polytopes for any given is presented. The algorithm has been implemented and used to obtain the complete classification for ....

  20. Accurate Finite Difference Methods for Option Pricing

    Persson, Jonas

    2006-01-01

    Stock options are priced numerically using space- and time-adaptive finite difference methods. European options on one and several underlying assets are considered. These are priced with adaptive numerical algorithms including a second order method and a more accurate method. For American options we use the adaptive technique to price options on one stock with and without stochastic volatility. In all these methods emphasis is put on the control of errors to fulfill predefined tolerance level...

  1. Accurate, reproducible measurement of blood pressure.

    Campbell, N. R.; Chockalingam, A; Fodor, J. G.; McKay, D. W.

    1990-01-01

    The diagnosis of mild hypertension and the treatment of hypertension require accurate measurement of blood pressure. Blood pressure readings are altered by various factors that influence the patient, the techniques used and the accuracy of the sphygmomanometer. The variability of readings can be reduced if informed patients prepare in advance by emptying their bladder and bowel, by avoiding over-the-counter vasoactive drugs the day of measurement and by avoiding exposure to cold, caffeine con...

  2. Accurate variational forms for multiskyrmion configurations

    Jackson, A.D.; Weiss, C.; Wirzba, A.; Lande, A.

    1989-04-17

    Simple variational forms are suggested for the fields of a single skyrmion on a hypersphere, S_3(L), and of a face-centered cubic array of skyrmions in flat space, R_3. The resulting energies are accurate at the level of 0.2%. These approximate field configurations provide a useful alternative to brute-force solutions of the corresponding Euler equations.

  3. Efficient Accurate Context-Sensitive Anomaly Detection

    2007-01-01

    For program behavior-based anomaly detection, the only way to ensure accurate monitoring is to construct an efficient and precise program behavior model. A new program behavior-based anomaly detection model, called the combined pushdown automaton (CPDA) model, was proposed, which is based on static binary executable analysis. The CPDA model incorporates the optimized call stack walk and code instrumentation technique to gain complete context information. The proposed method can thereby detect more attacks while retaining good performance.

  4. Towards accurate modeling of moving contact lines

    Holmgren, Hanna

    2015-01-01

    The present thesis treats the numerical simulation of immiscible incompressible two-phase flows with moving contact lines. The conventional Navier–Stokes equations combined with a no-slip boundary condition lead to a non-integrable stress singularity at the contact line. The singularity in the model can be avoided by allowing the contact line to slip. Implementing slip conditions in an accurate way is not straightforward and different regularization techniques exist where ad-hoc procedures ...

  5. Automated simultaneous analysis phylogenetics (ASAP): an enabling tool for phlyogenomics

    Lee Ernest K

    2008-02-01

    Background: The availability of sequences from whole genomes to reconstruct the tree of life has the potential to enable the development of phylogenomic hypotheses in ways that have not been possible before. A significant bottleneck in the analysis of genomic-scale views of the tree of life is the time required for manual curation of genomic data into multi-gene phylogenetic matrices. Results: To keep pace with the exponentially growing volume of molecular data in the genomic era, we have developed an automated technique, ASAP (Automated Simultaneous Analysis Phylogenetics), to assemble these multigene/multi-species matrices and to evaluate the significance of individual genes within the context of a given phylogenetic hypothesis. Conclusion: Applications of ASAP may enable scientists to re-evaluate species relationships and to develop new phylogenomic hypotheses based on genome-scale data.

  6. Phylogenetic signals in the climatic niches of the world's amphibians

    Hof, Christian; Rahbek, Carsten; Araújo, Miguel B.

    2010-01-01

    The question of whether closely related species share similar ecological requirements has attracted increasing attention, because of its importance for understanding global diversity gradients and the impacts of climate change on species distributions. In fact, the assumption that related species are also ecologically similar has often been made, although the prevalence of such a phylogenetic signal in ecological niches remains heavily debated. Here, we provide a global analysis of phylogenetic niche relatedness for the world's amphibians. In particular, we assess which proportion of the variance in the realised climatic niches is explained on higher taxonomic levels, and whether the climatic niches of species within a given taxonomic group are more similar than between taxonomic groups. We found evidence for phylogenetic signals in realised climatic niches although the strength of the...

  7. A phylogenetic overview of the antrodia clade (Basidiomycota, Polyporales).

    Ortiz-Santana, Beatriz; Lindner, Daniel L; Miettinen, Otto; Justo, Alfredo; Hibbett, David S

    2013-01-01

    Phylogenetic relationships among members of the antrodia clade were investigated with molecular data from two nuclear ribosomal DNA regions, LSU and ITS. A total of 123 species representing 26 genera producing a brown rot were included in the present study. Three DNA datasets (combined LSU-ITS dataset, LSU dataset, ITS dataset) comprising sequences of 449 isolates were evaluated with three different phylogenetic analyses (maximum likelihood, maximum parsimony, Bayesian inference). We present a phylogenetic overview of the five main groups recovered: the fibroporia, laetiporus, postia, laricifomes and core antrodia groups. Not all of the main groups received strong support in the analyses, requiring further research. We were able to identify a number of well supported clades within the main groups. PMID:23935025

  8. TREEFINDER: a powerful graphical analysis environment for molecular phylogenetics

    von Haeseler Arndt

    2004-06-01

    Background: Most analysis programs for inferring molecular phylogenies are difficult to use, in particular for researchers with little programming experience. Results: TREEFINDER is an easy-to-use integrative platform-independent analysis environment for molecular phylogenetics. In this paper the main features of TREEFINDER (version of April 2004) are described. TREEFINDER is written in ANSI C and Java and implements powerful statistical approaches for inferring gene trees and related analyses. In addition, it provides a user-friendly graphical interface and a phylogenetic programming language. Conclusions: TREEFINDER is a versatile framework for analyzing phylogenetic data across different platforms that is suited both for exploratory as well as advanced studies.

  9. Constructing Phylogenetic Networks Based on the Isomorphism of Datasets

    Zhang, Zhibin; Li, Yanjuan

    2016-01-01

    Constructing rooted phylogenetic networks from rooted phylogenetic trees has become an important problem in molecular evolution. Many methods have been presented in this area, of which the most efficient, such as CASS, LNETWORK, and BIMLR, are based on the incompatible graph. This paper studies what the methods based on the incompatible graph have in common, the relationship between the incompatible graph and the phylogenetic network, and the topologies of incompatible graphs. We can find all the simplest datasets for a topology G and construct a network for every dataset. For any dataset 𝒞, we can compute a network from the network representing the simplest dataset that is isomorphic to 𝒞. This process saves time for the algorithms when constructing networks.

  10. Phylogenetic analysis of Ostreococcus virus sequences from the Patagonian Coast.

    Manrique, Julieta M; Calvo, Andrea Y; Jones, Leandro R

    2012-10-01

    A phylogenetic analysis of new Ostreococcus virus (OV) sequences from the Patagonian Coast, Argentina, and homologous sequences from public databases was performed. This analysis showed that the Patagonian sequences represented a divergent viral clade and that the rest of OV sequences analyzed here were clustered into six additional phylogenetic groups. Analyses of 18S gene libraries supported a close relationship of the Patagonian Ostreococcus host with clade A sequences described elsewhere, corroborating previous studies indicating that clade A strains are ubiquitous. Besides the Patagonian OV sequences, several phylogenetic groupings were linked to particular geographic locations, suggesting a role for allopatric cladogenesis in viral diversification. However, and in agreement with previous observations, other viral lineages included sequences with diverse geographic origins. These findings, together with analyses of ancestral trait trajectories performed here, are consistent with an evolutionary dynamics in which geographical isolation has a role in OV diversification but can be followed by rapid dispersion to remote places. PMID:22674355

  11. Accurate phase-shift velocimetry in rock

    Shukla, Matsyendra Nath; Vallatos, Antoine; Phoenix, Vernon R.; Holmes, William M.

    2016-06-01

    Spatially resolved Pulsed Field Gradient (PFG) velocimetry techniques can provide precious information concerning flow through opaque systems, including rocks. This velocimetry data is used to enhance flow models in a wide range of systems, from oil behaviour in reservoir rocks to contaminant transport in aquifers. Phase-shift velocimetry is the fastest way to produce velocity maps but critical issues have been reported when studying flow through rocks and porous media, leading to inaccurate results. Combining PFG measurements for flow through Bentheimer sandstone with simulations, we demonstrate that asymmetries in the molecular displacement distributions within each voxel are the main source of phase-shift velocimetry errors. We show that when flow-related average molecular displacements are negligible compared to self-diffusion ones, symmetric displacement distributions can be obtained while phase measurement noise is minimised. We elaborate a complete method for the production of accurate phase-shift velocimetry maps in rocks and low porosity media and demonstrate its validity for a range of flow rates. This development of accurate phase-shift velocimetry now enables more rapid and accurate velocity analysis, potentially helping to inform both industrial applications and theoretical models.

  12. Active Learning for Text Classification

    Hu, Rong

    2011-01-01

    Text classification approaches are used extensively to solve real-world challenges. The success or failure of text classification systems hangs on the datasets used to train them; without a good dataset it is impossible to build a quality system. This thesis examines the applicability of active learning in text classification for the rapid and economical creation of labelled training data. Four main contributions are made in this thesis. First, we present two novel selection strategies to cho...

  13. Random Forests for Poverty Classification

    Ruben Thoplan

    2014-01-01

    This paper applies a relatively novel method in data mining to address the issue of poverty classification in Mauritius. The random forests algorithm is applied to the census data in view of improving classification accuracy for poverty status. The analysis shows that the numbers of hours worked, age, education and sex are the most important variables in the classification of the poverty status of an individual. In addition, a clear poverty-gender gap is identified as women have higher chance...

  14. DCC Briefing Paper: Genre classification

    Abbott, Daisy; Kim, Yunhyong

    2008-01-01

    Genre classification is the process of grouping objects together based on defined similarities such as subject, format, style, or purpose. Genre classification as a means of managing information is already established in music (e.g. folk, blues, jazz) and text and is used, alongside topic classification, to organise materials in the commercial sector (the children's section of a bookshop) and intellectually (for example, in the Usenet newsgroup directory hierarchy). However, in the case o...

  15. Classification and Labelling for Biocides

    Rubbiani, Maristella

    2015-01-01

    CLP and biocides The EU Regulation (EC) No 1272/2008 on Classification, Labelling and Packaging of Substances and Mixtures, the CLP-Regulation, entered into force on 20th January, 2009. Since 1st December, 2010 the classification, labelling and packaging of substances has to comply with this Regulation. For mixtures, the rules of this Regulation are mandatory from 1st June, 2015; this means that until this date classification, labelling and packaging could either be carried out according to D...

  16. Classification of Pulse Waveforms Using Edit Distance with Real Penalty

    Zhang Dongyu

    2010-01-01

    Full Text Available Abstract Advances in sensor and signal processing techniques have provided effective tools for quantitative research in traditional Chinese pulse diagnosis (TCPD). Because of the inevitable intraclass variation of pulse patterns, the automatic classification of pulse waveforms has remained a difficult problem. In this paper, by referring to the edit distance with real penalty (ERP) and the recent progress in k-nearest neighbors (KNN) classifiers, we propose two novel ERP-based KNN classifiers. Taking advantage of the metric property of ERP, we first develop an ERP-induced inner product and a Gaussian ERP kernel, then embed them into difference-weighted KNN classifiers, and finally develop two novel classifiers for pulse waveform classification. The experimental results show that the proposed classifiers are effective for accurate classification of pulse waveforms.
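    The ERP distance named in the abstract above can be sketched as a small dynamic program. This is a minimal illustration, not the authors' implementation: it assumes the usual ERP recurrence with a constant gap value (`gap`, defaulting to 0), and the Gaussian kernel form `exp(-d^2 / (2 * sigma^2))` is an assumption, since the abstract does not spell out the kernel. The names `erp_distance` and `gaussian_erp_kernel` are hypothetical.

```python
import math


def erp_distance(x, y, gap=0.0):
    """Edit distance with real penalty between two numeric sequences.

    Assumes the standard recurrence: a match costs |x_i - y_j|, and a gap
    in either sequence costs the distance of the skipped value to `gap`.
    """
    n, m = len(x), len(y)
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    # Border cells: one sequence is entirely aligned against gaps.
    for i in range(1, n + 1):
        d[i][0] = d[i - 1][0] + abs(x[i - 1] - gap)
    for j in range(1, m + 1):
        d[0][j] = d[0][j - 1] + abs(y[j - 1] - gap)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d[i][j] = min(
                d[i - 1][j - 1] + abs(x[i - 1] - y[j - 1]),  # align x_i with y_j
                d[i - 1][j] + abs(x[i - 1] - gap),           # gap in y
                d[i][j - 1] + abs(y[j - 1] - gap),           # gap in x
            )
    return d[n][m]


def gaussian_erp_kernel(x, y, sigma=1.0, gap=0.0):
    """Gaussian kernel built on the ERP distance (kernel form is an assumption)."""
    dist = erp_distance(x, y, gap)
    return math.exp(-dist ** 2 / (2.0 * sigma ** 2))
```

    A kernel value of this kind could then serve as the similarity fed to a KNN-style classifier; the difference-weighting step described in the abstract is not reproduced here.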

  17. Robust tissue classification for reproducible wound assessment in telemedicine environments

    Wannous, Hazem; Treuillet, Sylvie; Lucas, Yves

    2010-04-01

    In telemedicine environments, a standardized and reproducible assessment of wounds, using a simple free-handled digital camera, is an essential requirement. However, to ensure robust tissue classification, particular attention must be paid to the complete design of the color processing chain. We introduce the key steps including color correction, merging of expert labeling, and segmentation-driven classification based on support vector machines. The tool thus developed ensures stability under lighting condition, viewpoint, and camera changes, to achieve accurate and robust classification of skin tissues. Clinical tests demonstrate that such an advanced tool, which forms part of a complete 3-D and color wound assessment system, significantly improves the monitoring of the healing process. It achieves an overlap score of 79.3 against 69.1% for a single expert, after mapping on the medical reference developed from the image labeling by a college of experts.

  18. AdaBoost for Improved Voice-Band Signal Classification

    2007-01-01

    Good voice-band signal classification not only enables the safe application of speech coding techniques and the implementation of a Digital Signal Interpolation (DSI) system, but also facilitates network administration and planning by providing accurate voice-band traffic analysis. A new method is proposed to detect and classify the presence of various voice-band signals on the General Switched Telephone Network (GSTN). The method uses a combination of simple base classifiers through the AdaBoost algorithm. The conventional classification features for voice-band data classification are combined and optimized by the AdaBoost algorithm and the spectral subtraction method. Experiments show the simplicity, effectiveness, efficiency and flexibility of the method.

  19. Data cache organization for accurate timing analysis

    Schoeberl, Martin; Huber, Benedikt; Puffitsch, Wolfgang

    2013-01-01

    Caches are essential to bridge the gap between the high latency main memory and the fast processor pipeline. Standard processor architectures implement two first-level caches to avoid a structural hazard in the pipeline: an instruction cache and a data cache. For tight worst-case execution times...... it is important to classify memory accesses as either cache hit or cache miss. The addresses of instruction fetches are known statically and static cache hit/miss classification is possible for the instruction cache. The access to data that is cached in the data cache is harder to predict statically. Several...... different data areas, such as stack, global data, and heap allocated data, share the same cache. Some addresses are known statically, other addresses are only known at runtime. With a standard cache organization all those different data areas must be considered by worst-case execution time analysis...

  20. Sequence Classification: 894861 [

    Full Text Available ial component of the MIND kinetochore complex (Mtw1p Including Nnf1p-Nsl1p-Dsn1p) which joins kinetochore subunits contact...ing DNA to those contacting microtubules; required for accurate chromosome segregation; Nnf1p || http://www.ncbi.nlm.nih.gov/protein/6322572 ...

  1. Sequence Classification: 893607 [

    Full Text Available ial component of the MIND kinetochore complex (Mtw1p Including Nnf1p-Nsl1p-Dsn1p) which joins kinetochore subunits contact...ing DNA to those contacting microtubules; required for accurate chromosome segregation; Nsl1p || http://www.ncbi.nlm.nih.gov/protein/6325023 ...

  2. High Frequency QRS ECG Accurately Detects Cardiomyopathy

    Schlegel, Todd T.; Arenare, Brian; Poulin, Gregory; Moser, Daniel R.; Delgado, Reynolds

    2005-01-01

    High frequency (HF, 150-250 Hz) analysis over the entire QRS interval of the ECG is more sensitive than conventional ECG for detecting myocardial ischemia. However, the accuracy of HF QRS ECG for detecting cardiomyopathy is unknown. We obtained simultaneous resting conventional and HF QRS 12-lead ECGs in 66 patients with cardiomyopathy (EF = 23.2 ± 6.1%, mean ± SD) and in 66 age- and gender-matched healthy controls using PC-based ECG software recently developed at NASA. The single most accurate ECG parameter for detecting cardiomyopathy was an HF QRS morphological score that takes into consideration the total number and severity of reduced amplitude zones (RAZs) present plus the clustering of RAZs together in contiguous leads. This RAZ score had an area under the receiver operator curve (ROC) of 0.91, and was 88% sensitive, 82% specific and 85% accurate for identifying cardiomyopathy at an optimum score cut-off of 140 points. Although conventional ECG parameters such as the QRS and QTc intervals were also significantly longer in patients than controls (P < 0.001, BBBs excluded), these conventional parameters were less accurate (area under the ROC = 0.77 and 0.77, respectively) than HF QRS morphological parameters for identifying underlying cardiomyopathy. The total amplitude of the HF QRS complexes, as measured by summed root mean square voltages (RMSVs), also differed between patients and controls (33.8 ± 11.5 vs. 41.5 ± 13.6 mV, respectively, P < 0.003), but this parameter was even less accurate in distinguishing the two groups (area under ROC = 0.67) than the HF QRS morphologic and conventional ECG parameters. Diagnostic accuracy was optimal (86%) when the RAZ score from the HF QRS ECG and the QTc interval from the conventional ECG were used simultaneously with cut-offs of ≥ 40 points and ≥ 445 ms, respectively. In conclusion 12-lead HF QRS ECG employing
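    As a check on how the sensitivity, specificity and accuracy figures above relate to one another, the sketch below computes all three from a confusion matrix. The counts are an assumption: the abstract reports only percentages and group sizes (66 patients, 66 controls), and 58 true positives with 54 true negatives are simply counts that reproduce the reported 88%, 82% and 85%. The function name `diagnostic_metrics` is hypothetical.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)                 # fraction of patients detected
    specificity = tn / (tn + fp)                 # fraction of controls cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)   # fraction of all subjects correct
    return sensitivity, specificity, accuracy


# Assumed counts (not given in the abstract): 66 patients and 66 controls.
sens, spec, acc = diagnostic_metrics(tp=58, fn=8, tn=54, fp=12)
print(f"{sens:.0%} sensitive, {spec:.0%} specific, {acc:.0%} accurate")
# prints "88% sensitive, 82% specific, 85% accurate"
```

    With equal group sizes, accuracy is the plain average of sensitivity and specificity, which is why 85% sits midway between the other two figures.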

  3. Classification & Structure of Blood Vessels

    ... Thyroid & Parathyroid Glands Adrenal Gland Pancreas Gonads Other Endocrine Glands Review Quiz Cardiovascular System Heart Structure of the Heart Physiology of the Heart Blood Classification & Structure of Blood ...

  4. Decision Fusion Based on Hyperspectral and Multispectral Satellite Imagery for Accurate Forest Species Mapping

    Dimitris G. Stavrakoudis

    2014-07-01

    Full Text Available This study investigates the effectiveness of combining multispectral very high resolution (VHR) and hyperspectral satellite imagery through a decision fusion approach, for accurate forest species mapping. Initially, two fuzzy classifications are conducted, one for each satellite image, using a fuzzy output support vector machine (SVM). The classification result from the hyperspectral image is then resampled to the multispectral image's spatial resolution and the two sources are combined using a simple yet efficient fusion operator. Thus, the complementary information provided from the two sources is effectively exploited, without having to resort to computationally demanding and time-consuming typical data fusion or vector stacking approaches. The effectiveness of the proposed methodology is validated in a complex Mediterranean forest landscape, comprising spectrally similar and spatially intermingled species. The decision fusion scheme resulted in an accuracy increase of 8% compared to the classification using only the multispectral imagery, whereas the increase was even higher compared to the classification using only the hyperspectral satellite image. Perhaps most importantly, its accuracy was significantly higher than alternative multisource fusion approaches, although the latter are characterized by much higher computation, storage, and time requirements.
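    The resample-then-fuse pipeline described above can be illustrated with a short sketch. The abstract does not say which fusion operator or resampling scheme was used, so everything here is an assumption made for illustration: nearest-neighbour upsampling by an integer factor, a weighted average of the two fuzzy membership maps, and a final per-pixel argmax. The names `upsample_nearest` and `fuse_decisions` are hypothetical.

```python
import numpy as np


def upsample_nearest(memberships, factor):
    """Repeat each coarse pixel `factor` times along rows and columns.

    `memberships` has shape (rows, cols, classes).
    """
    return np.repeat(np.repeat(memberships, factor, axis=0), factor, axis=1)


def fuse_decisions(ms_memberships, hs_memberships, factor, weight=0.5):
    """Fuse two fuzzy classification maps and return a hard label map.

    The weighted average is an assumed operator, not the one from the study.
    """
    hs_up = upsample_nearest(hs_memberships, factor)
    fused = weight * ms_memberships + (1.0 - weight) * hs_up
    return fused.argmax(axis=-1)


# Toy example: a 2x2 multispectral map and a 1x1 hyperspectral map, two classes.
ms = np.array([[[0.6, 0.4], [0.4, 0.6]],
               [[0.9, 0.1], [0.55, 0.45]]])
hs = np.array([[[0.2, 0.8]]])
labels = fuse_decisions(ms, hs, factor=2)
```

    In the toy example the hyperspectral source favours class 1 everywhere, so it overturns the multispectral decision only where the multispectral memberships are close to even.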

  5. Phylogenetic and recombination analysis of human bocavirus 2

    Li Huiying; Jin Miao; Huang Canping; Yu Jiemei; Xu Ziqian; Chen Jinan; Cheng Weixia; Zhang Ming; Jin Yu; Duan Zhao-jun

    2011-01-01

    Abstract Background: Human bocavirus 2 (HBoV2) and other human bocavirus species (HBoV, HBoV3, and HBoV4) have been discovered recently, but the precise phylogenetic relationships among these viruses are not yet clear. Methods: We collected samples from 632 children with diarrhea and 162 healthy children in Lanzhou, China. Using PCR, we screened for human bocavirus (HBoV), HBoV2, HBoV3 and HBoV4. The partial genes of NS, NP1 and VP, and two nearly complete sequences of HBoV2 were obtained. Result: Phylogenetic analysis showe...

  6. Phylogenetic significance of composition and crystal morphology of magnetosome minerals.

    Pósfai, Mihály; Lefèvre, Christopher T; Trubitsyn, Denis; Bazylinski, Dennis A; Frankel, Richard B

    2013-01-01

    Magnetotactic bacteria (MTB) biomineralize magnetosomes, nano-scale crystals of magnetite or greigite in membrane enclosures that comprise a permanent magnetic dipole in each cell. MTB control the mineral composition, habit, size, and crystallographic orientation of the magnetosomes, as well as their arrangement within the cell. Studies involving magnetosomes that contain mineral and biological phases require multidisciplinary efforts. Here we use crystallographic, genomic and phylogenetic perspectives to review the correlations between magnetosome mineral habits and the phylogenetic affiliations of MTB, and show that these correlations have important implications for the evolution of magnetosome synthesis, and thus magnetotaxis. PMID:24324461

  7. Phylogenetic significance of composition and crystal morphology of magnetosome minerals

    Mihály Pósfai

    2013-11-01

    Full Text Available Magnetotactic bacteria (MTB) biomineralize magnetosomes, nano-scale crystals of magnetite or greigite in membrane enclosures that comprise a permanent magnetic dipole in each cell. MTB control the mineral composition, habit, size, and crystallographic orientation of the magnetosomes, as well as their arrangement within the cell. Studies involving magnetosomes that contain mineral and biological phases require multidisciplinary efforts. Here we use crystallographic, genomic and phylogenetic perspectives to review the correlations between magnetosome mineral habits and the phylogenetic affiliations of MTB, and show that these correlations have important implications for the evolution of magnetosome synthesis, and thus magnetotaxis.

  8. First molecular detection and phylogenetic analysis of Anaplasma phagocytophilum in shelter dogs in Seoul, Korea.

    Lee, Sukyee; Lee, Seung-Hun; VanBik, Dorene; Kim, Neung-Hee; Kim, Kyoo-Tae; Goo, Youn-Kyoung; Rhee, Man Hee; Kwon, Oh-Deog; Kwak, Dongmi

    2016-07-01

    In this study, the status of Anaplasma phagocytophilum infection was assessed in shelter dogs in Seoul, Korea, with PCR and phylogenetic analyses. Nested PCR on 1058 collected blood samples revealed only one A. phagocytophilum positive sample (female, age ...). Genetic variability of A. phagocytophilum was evaluated by genotyping, using the 16S rRNA, groEL, and msp2 gene sequences of the positive sample. BLASTn analysis revealed that the 16S rRNA, groEL, and msp2 genes had 99.6%, 99.9%, and 100% identity with the following sequences deposited in GenBank: a cat 16S rRNA sequence from Korea (KR021166), a rat groEL sequence from Korea (KT220194), and a water deer msp2 sequence from Korea (HM752099), respectively. Phylogenetic analyses classified the groEL gene into two distinct groups (serine and alanine), whereas the msp2 gene showed a general classification into two groups (USA and Europe) that were further subgrouped according to region. To the best of our knowledge, this study is the first to describe the molecular diagnosis of A. phagocytophilum in dogs reared in Korea. In addition, the high genetic identity of the 16S rRNA and groEL sequences between humans and dogs from the same region suggests a possible epidemiological relation. Given the conditions of climate change, tick ecology, and recent incidence of human granulocytic anaplasmosis in Korea, the findings of this study underscore the need to establish appropriate control programs for tick-borne diseases in Korea. PMID:27130537

  9. Phylogenetic relationships, character evolution and biogeographic diversification of Pogostemon s.l. (Lamiaceae).

    Yao, Gang; Drew, Bryan T; Yi, Ting-Shuang; Yan, Hai-Fei; Yuan, Yong-Ming; Ge, Xue-Jun

    2016-05-01

    Pogostemon (Lamiaceae; Lamioideae) sensu lato is a large genus consisting of about 80 species with a disjunct African/Asian distribution. The infrageneric taxonomy of the genus has historically been troublesome due to morphological variability and putative convergent evolution within the genus. Notably, some species of Pogostemon are obligately aquatic, perhaps the only Lamiaceae taxa which exhibit this trait. Phylogenetic analyses using the nuclear ribosomal internal transcribed spacer (ITS) and five plastid regions (matK, rbcL, rps16, trnH-psbA, trnL-F), confirmed the monophyly of Pogostemon and its sister relationship with the genus Anisomeles. Pogostemon was resolved into two major clades, and none of the three morphologically defined subgenera of Pogostemon were supported as monophyletic. Inflorescence type (spikes with more than two lateral branches vs. a single terminal spike, or rarely with two lateral branches) is phylogenetically informative and consistent with the two main clades we recovered. Accordingly, a new infrageneric classification of Pogostemon consisting of two subgenera is proposed. Molecular dating and biogeographic diversification analyses suggest that Pogostemon split from its sister genus in southern and southeast Asia in the early Miocene. The early strengthening of the Asia monsoon system that was triggered by the uplifting of the Qinghai-Tibetan Plateau may have played an important role in the subsequent diversification of the genus. In addition, our results suggest that transoceanic long-distance dispersal of Pogostemon from Asia to Africa occurred at least twice, once in the late Miocene and again during the late-Miocene/early-Pliocene. PMID:26923493

  10. Use of manual densitometry in land cover classification

    Jordan, D. C.; Graves, D. H.; Hammetter, M. C.

    1978-01-01

    Through use of manual spot densitometry values derived from multitemporal 1:24,000 color infrared aircraft photography, areas as small as one hectare in the Cumberland Plateau in Kentucky were accurately classified into one of eight ground cover groups. If distinguishing between undisturbed and disturbed forest areas is the sole criterion of interest, classification results are highly accurate if based on imagery taken during foliated ground cover conditions. Multiseasonal imagery analysis was superior to single data analysis, and transparencies from prefoliated conditions gave better separation of conifers and hardwoods than did those from foliated conditions.

  11. Correction of Alar Retraction Based on Frontal Classification.

    Kim, Jae Hoon; Song, Jin Woo; Park, Sung Wan; Bartlett, Erica; Nguyen, Anh H

    2015-11-01

    Among the various types of alar deformations in Asians, alar retraction not only has the highest occurrence rate, but is also very complicated to treat because the ala is supported only by cartilage and its soft tissue envelope cannot be easily stretched. As patients' knowledge of aesthetic procedures is becoming more extensive due to increased information dissemination through various media, doctors must give more accurate, logical explanations of the procedures to be performed and their anticipated results, with an emphasis on relevant anatomical features, accurate diagnoses, detailed classifications, and various appropriate methods of surgery. PMID:26648808

  12. Real time classification of viruses in 12 dimensions.

    Yu, Chenglong; Hernandez, Troy; Zheng, Hui; Yau, Shek-Chung; Huang, Hsin-Hsiung; He, Rong Lucy; Yang, Jie; Yau, Stephen S-T

    2013-01-01

    The International Committee on Taxonomy of Viruses authorizes and organizes the taxonomic classification of viruses. Thus far, the detailed classifications for all viruses are neither complete nor free from dispute. For example, the current missing label rates in GenBank are 12.1% for family label and 30.0% for genus label. Using the proposed Natural Vector representation, all 2,044 single-segment referenced viral genomes in GenBank can be embedded in R^12. Unlike other approaches, this allows us to determine phylogenetic relations for all viruses at any level (e.g., Baltimore class, family, subfamily, genus, and species) in real time. Additionally, the proposed graphical representation for virus phylogeny provides a visualization of the distribution of viruses in R^12. Unlike the commonly used tree visualization methods which suffer from uniqueness and existence problems, our representation always exists and is unique. This approach is successfully used to predict and correct viral classification information, as well as to identify viral origins; e.g. a recent public health threat, the West Nile virus, is closer to the Japanese encephalitis antigenic complex based on our visualization. Based on cross-validation results, the accuracy rates of our predictions are as high as 98.2% for Baltimore class labels, 96.6% for family labels, 99.7% for subfamily labels and 97.2% for genus labels. PMID:23717598
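    To make the 12-dimensional embedding concrete, here is a minimal sketch of a natural vector for a DNA sequence. The abstract does not give the formula, so the definition used is an assumption based on the commonly cited form: for each of A, C, G and T, the count, the mean 1-based position, and the second central moment of the positions normalised by (count × sequence length). The function name `natural_vector` is hypothetical, and characters other than A, C, G, T are simply ignored in this sketch.

```python
def natural_vector(seq):
    """Return a 12-element list: 4 counts, 4 mean positions, 4 second moments.

    Assumed definition; positions are 1-based, and a base that never occurs
    contributes zeros for its mean and moment.
    """
    seq = seq.upper()
    n = len(seq)
    counts, means, moments = [], [], []
    for base in "ACGT":
        positions = [i + 1 for i, ch in enumerate(seq) if ch == base]
        n_k = len(positions)
        if n_k == 0:
            counts.append(0)
            means.append(0.0)
            moments.append(0.0)
            continue
        mu = sum(positions) / n_k
        d2 = sum((p - mu) ** 2 for p in positions) / (n_k * n)
        counts.append(n_k)
        means.append(mu)
        moments.append(d2)
    return counts + means + moments


vec = natural_vector("ACGT")
```

    Because every genome maps to a fixed-length vector, two genomes can be compared with an ordinary vector distance instead of a sequence alignment, which is consistent with the real-time claim in the abstract.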

  13. Hydrologic landscape regionalisation using deductive classification and random forests.

    Stuart C Brown

    Full Text Available Landscape classification and hydrological regionalisation studies are being increasingly used in ecohydrology to aid in the management and research of aquatic resources. We present a methodology for classifying hydrologic landscapes based on spatial environmental variables by employing non-parametric statistics and hybrid image classification. Our approach differed from previous classifications which have required the use of an a priori spatial unit (e.g. a catchment), which necessarily results in the loss of variability that is known to exist within those units. The use of a simple statistical approach to identify an appropriate number of classes eliminated the need for large amounts of post-hoc testing with different numbers of groups, or the selection and justification of an arbitrary number. Using statistical clustering, we identified 23 distinct groups within our training dataset. The use of a hybrid classification employing random forests extended this statistical clustering to an area of approximately 228,000 km² of south-eastern Australia without the need to rely on catchments, landscape units or stream sections. This extension resulted in a highly accurate regionalisation at both 30-m and 2.5-km resolution, and a less-accurate 10-km classification that would be more appropriate for use at a continental scale. A smaller case study, of an area covering 27,000 km², demonstrated that the method preserved the intra- and inter-catchment variability that is known to exist in local hydrology, based on previous research. Preliminary analysis linking the regionalisation to streamflow indices is promising, suggesting that the method could be used to predict streamflow behaviour in ungauged catchments. Our work therefore simplifies current classification frameworks that are becoming more popular in ecohydrology, while better retaining small-scale variability in hydrology, thus enabling future attempts to explain and visualise broad-scale hydrologic

  14. 78 FR 68983 - Cotton Futures Classification: Optional Classification Procedure

    2013-11-18

    ...-Doxey data into the cotton futures classification process in March 2012 (77 FR 5379). When verified by a... October 9, 2013 (78 FR 54970). AMS received two comments: one from a national trade organization... Agricultural Marketing Service 7 CFR Part 27 RIN 0581-AD33 Cotton Futures Classification:...

  15. 78 FR 54970 - Cotton Futures Classification: Optional Classification Procedure

    2013-09-09

    ... process in March 2012 (77 FR 5379). When verified by a futures classification, Smith-Doxey data serves as...; ] DEPARTMENT OF AGRICULTURE Agricultural Marketing Service 7 CFR Part 27 RIN 0581-AD33 Cotton Futures... for the addition of an optional cotton futures classification procedure--identified and known...

  16. Reconstruction-classification method for quantitative photoacoustic tomography

    Malone, Emma; Cox, Ben T; Arridge, Simon R

    2015-01-01

    We propose a combined reconstruction-classification method for simultaneously recovering absorption and scattering in turbid media from images of absorbed optical energy. This method exploits knowledge that optical parameters are determined by a limited number of classes to iteratively improve their estimate. Numerical experiments show that the proposed approach allows for accurate recovery of absorption and scattering in 2 and 3 dimensions, and delivers superior image quality with respect to traditional reconstruction-only approaches.

  17. SPIDERz: SuPport vector classification for IDEntifying Redshifts

    Jones, Evan; Singal, J.

    2016-08-01

    SPIDERz (SuPport vector classification for IDEntifying Redshifts) applies powerful support vector machine (SVM) optimization and statistical learning techniques to custom data sets to obtain accurate photometric redshift (photo-z) estimations. It is written for the IDL environment and can be applied to traditional data sets consisting of photometric band magnitudes, or alternatively to data sets with additional galaxy parameters (such as shape information) to investigate potential correlations between the extra galaxy parameters and redshift.

  18. IMAGE RECONSTRUCTION AND OBJECT CLASSIFICATION IN CT IMAGING SYSTEM

    张晓明; 蒋大真; et al.

    1995-01-01

    By obtaining a feasible filter function, reconstructed images can be obtained with linear interpolation and filtered backprojection techniques. Considering the gray-level and spatial correlation information of each pixel's neighbours, a new supervised classification method is put forward for the reconstructed images, and an experiment with a noisy image is performed. The result shows that the method is feasible and accurate compared with ideal phantoms.
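    Filtered backprojection with linear interpolation, as mentioned above, can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the authors' method: the abstract does not specify the filter function, so a plain ramp filter applied in the frequency domain stands in for it, the geometry is parallel-beam, and the names `ramp_filter` and `backproject` are hypothetical.

```python
import numpy as np


def ramp_filter(sinogram):
    """Apply a ramp filter to each projection (row) of the sinogram."""
    n_det = sinogram.shape[1]
    freqs = np.abs(np.fft.fftfreq(n_det))  # assumed filter: |frequency|
    spectrum = np.fft.fft(sinogram, axis=1) * freqs
    return np.real(np.fft.ifft(spectrum, axis=1))


def backproject(filtered, angles, size):
    """Smear each filtered projection back over a size x size image.

    Detector positions between samples are filled by linear interpolation.
    """
    n_det = filtered.shape[1]
    centre = (size - 1) / 2.0
    ys, xs = np.mgrid[0:size, 0:size] - centre
    detector = np.arange(n_det) - (n_det - 1) / 2.0
    image = np.zeros((size, size))
    for projection, theta in zip(filtered, angles):
        # Detector coordinate hit by each pixel at this view angle.
        t = xs * np.cos(theta) + ys * np.sin(theta)
        image += np.interp(t.ravel(), detector, projection).reshape(size, size)
    return image * np.pi / len(angles)


# Toy sinogram of a single point at the centre of the field of view.
n = 65
angles = np.linspace(0.0, np.pi, 90, endpoint=False)
sinogram = np.zeros((len(angles), n))
sinogram[:, n // 2] = 1.0
image = backproject(ramp_filter(sinogram), angles, n)
```

    The reconstructed image of the toy sinogram peaks at the central pixel; a classification step such as the one the abstract describes would then operate on images produced this way.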

  19. Automated Feature Design for Time Series Classification by Genetic Programming

    Harvey, Dustin Yewell

    2014-01-01

    Time series classification (TSC) methods discover and exploit patterns in time series and other one-dimensional signals. Although many accurate, robust classifiers exist for multivariate feature sets, general approaches are needed to extend machine learning techniques to make use of signal inputs. Numerous applications of TSC can be found in structural engineering, especially in the areas of structural health monitoring and non-destructive evaluation. Additionally, the fields of process contr...

  20. 14 CFR 1203.412 - Classification guides.

    2010-01-01

    ... 14 Aeronautics and Space 5 2010-01-01 2010-01-01 false Classification guides. 1203.412 Section 1203.412 Aeronautics and Space NATIONAL AERONAUTICS AND SPACE ADMINISTRATION INFORMATION SECURITY PROGRAM Guides for Original Classification § 1203.412 Classification guides. (a) General. A classification guide, based upon classification...