WorldWideScience

Sample records for datasets reveals strain

  1. Improving Phylogeny Reconstruction at the Strain Level Using Peptidome Datasets.

    Directory of Open Access Journals (Sweden)

    Aitor Blanco-Míguez

    2016-12-01

    Typical bacterial strain differentiation methods are often challenged by high genetic similarity between strains. To address this problem, we introduce a novel in silico peptide fingerprinting method based on conventional wet-lab protocols that enables the identification of potential strain-specific peptides. These can be further investigated using in vitro approaches, laying a foundation for the development of biomarker detection and application-specific methods. This novel method aims at reducing large amounts of comparative peptide data to binary matrices while maintaining a high phylogenetic resolution. The underlying case study concerns the Bacillus cereus group, namely the differentiation of Bacillus thuringiensis, Bacillus anthracis and Bacillus cereus strains. Results show that trees based on cytoplasmic and extracellular peptidomes are only marginally in conflict with those based on whole proteomes, as inferred by the established Genome-BLAST Distance Phylogeny (GBDP) method. Hence, these results indicate that the two approaches can most likely be used complementarily even in other organismal groups. The obtained results confirm previous reports about the misclassification of many strains within the B. cereus group. Moreover, our method was able to separate the B. anthracis strains with high resolution, similarly to the GBDP results as benchmarked via Bayesian inference and both Maximum Likelihood and Maximum Parsimony. In addition to the presented phylogenomic applications, whole-peptide fingerprinting might also become a valuable complementary technique to digital DNA-DNA hybridization, notably for bacterial classification at the species and subspecies level in the future.

  2. Improving Phylogeny Reconstruction at the Strain Level Using Peptidome Datasets.

    Science.gov (United States)

    Blanco-Míguez, Aitor; Meier-Kolthoff, Jan P; Gutiérrez-Jácome, Alberto; Göker, Markus; Fdez-Riverola, Florentino; Sánchez, Borja; Lourenço, Anália

    2016-12-01

    Typical bacterial strain differentiation methods are often challenged by high genetic similarity between strains. To address this problem, we introduce a novel in silico peptide fingerprinting method based on conventional wet-lab protocols that enables the identification of potential strain-specific peptides. These can be further investigated using in vitro approaches, laying a foundation for the development of biomarker detection and application-specific methods. This novel method aims at reducing large amounts of comparative peptide data to binary matrices while maintaining a high phylogenetic resolution. The underlying case study concerns the Bacillus cereus group, namely the differentiation of Bacillus thuringiensis, Bacillus anthracis and Bacillus cereus strains. Results show that trees based on cytoplasmic and extracellular peptidomes are only marginally in conflict with those based on whole proteomes, as inferred by the established Genome-BLAST Distance Phylogeny (GBDP) method. Hence, these results indicate that the two approaches can most likely be used complementarily even in other organismal groups. The obtained results confirm previous reports about the misclassification of many strains within the B. cereus group. Moreover, our method was able to separate the B. anthracis strains with high resolution, similarly to the GBDP results as benchmarked via Bayesian inference and both Maximum Likelihood and Maximum Parsimony. In addition to the presented phylogenomic applications, whole-peptide fingerprinting might also become a valuable complementary technique to digital DNA-DNA hybridization, notably for bacterial classification at the species and subspecies level in the future.
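
    The abstract above describes reducing comparative peptide data to binary matrices and identifying potential strain-specific peptides. The record does not give the authors' actual pipeline, so the following is only a minimal sketch of the general idea: the strain names, peptide strings, function names, and the choice of Jaccard distance are all assumptions made for illustration.

```python
def binary_matrix(peptidomes):
    """Build a presence/absence matrix from {strain: set_of_peptides}.

    Returns the sorted peptide list (the columns) and {strain: [0/1, ...]}.
    """
    peptides = sorted(set().union(*peptidomes.values()))
    rows = {
        strain: [1 if p in found else 0 for p in peptides]
        for strain, found in peptidomes.items()
    }
    return peptides, rows


def jaccard_distance(row_a, row_b):
    """1 - |shared| / |union| for two binary rows (assumed distance measure)."""
    shared = sum(1 for a, b in zip(row_a, row_b) if a and b)
    union = sum(1 for a, b in zip(row_a, row_b) if a or b)
    return 0.0 if union == 0 else 1.0 - shared / union


def strain_specific(peptides, rows):
    """List, per strain, the peptides present in that strain only."""
    result = {strain: [] for strain in rows}
    for j, peptide in enumerate(peptides):
        owners = [strain for strain, row in rows.items() if row[j]]
        if len(owners) == 1:
            result[owners[0]].append(peptide)
    return result


# Hypothetical toy input: two strains, three peptides.
peptidomes = {"strain_A": {"AAK", "GLR"}, "strain_B": {"AAK", "TTR"}}
peptides, rows = binary_matrix(peptidomes)
print(peptides)                        # ['AAK', 'GLR', 'TTR']
print(rows["strain_A"])                # [1, 1, 0]
print(strain_specific(peptides, rows)) # {'strain_A': ['GLR'], 'strain_B': ['TTR']}
```

    A pairwise distance matrix built with `jaccard_distance` could then be passed to any standard tree-building method; which method the authors used for their peptidome trees is not stated in this record.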

  3. Integrated Analysis of Alzheimer's Disease and Schizophrenia Dataset Revealed Different Expression Pattern in Learning and Memory.

    Science.gov (United States)

    Li, Wen-Xing; Dai, Shao-Xing; Liu, Jia-Qian; Wang, Qian; Li, Gong-Hua; Huang, Jing-Fei

    2016-01-01

    Alzheimer's disease (AD) and schizophrenia (SZ) are both accompanied by impaired learning and memory functions. This study aims to explore the expression profiles of learning or memory genes between AD and SZ. We downloaded 10 AD and 10 SZ datasets from GEO-NCBI for integrated analysis. These datasets were processed using the RMA algorithm and a global renormalization across all studies. An Empirical Bayes algorithm was then used to find the differentially expressed genes between patients and controls. The results showed that most of the differentially expressed genes were related to AD, whereas the gene expression profile was little affected in SZ. Furthermore, in terms of the number of differentially expressed genes, the fold change, and the brain region, there was a great difference in the expression of learning- or memory-related genes between AD and SZ. In AD, CALB1, GABRA5, and TAC1 were significantly downregulated in whole brain, frontal lobe, temporal lobe, and hippocampus. However, in SZ, only two genes, CRHBP and CX3CR1, were downregulated in hippocampus, and other brain regions were not affected. The effect of these genes on learning or memory impairment has been widely studied. It was suggested that these genes may play a crucial role in AD or SZ pathogenesis. The different gene expression patterns between AD and SZ on learning and memory functions in different brain regions revealed in our study may help to understand the different mechanisms of the two diseases.

  4. Vibrio cholerae classical biotype strains reveal distinct signatures in Mexico.

    Science.gov (United States)

    Alam, Munirul; Islam, M Tarequl; Rashed, Shah Manzur; Johura, Fatema-tuz; Bhuiyan, Nurul A; Delgado, Gabriela; Morales, Rosario; Mendez, Jose Luis; Navarro, Armando; Watanabe, Haruo; Hasan, Nur-A; Colwell, Rita R; Cravioto, Alejandro

    2012-07-01

    Vibrio cholerae O1 classical (CL) biotype caused the fifth and sixth pandemics, and probably the earlier cholera pandemics, before the El Tor (ET) biotype initiated the seventh pandemic in Asia in the 1970s by completely displacing the CL biotype. Although the CL biotype was thought to be extinct in Asia and although it had never been reported from Latin America, V. cholerae CL and ET biotypes, including a hybrid ET, were found associated with areas of cholera endemicity in Mexico between 1991 and 1997. In this study, CL biotype strains isolated from areas of cholera endemicity in Mexico between 1983 and 1997 were characterized in terms of major phenotypic and genetic traits and compared with CL biotype strains isolated in Bangladesh between 1962 and 1989. According to sero- and biotyping data, all V. cholerae strains tested had the major phenotypic and genotypic characteristics specific for the CL biotype. Antibiograms revealed the majority of the Bangladeshi strains to be resistant to trimethoprim-sulfamethoxazole, furazolidone, ampicillin, and gentamicin, while the Mexican strains were sensitive to all of these drugs, as well as to ciprofloxacin, erythromycin, and tetracycline. Pulsed-field gel electrophoresis (PFGE) of NotI-digested genomic DNA revealed characteristic banding patterns for all of the CL biotype strains although the Mexican strains differed from the Bangladeshi strains in 1 to 2 DNA bands. The difference was subtle but consistent, as confirmed by the subclustering patterns in the PFGE-based dendrogram, and can serve as a regional signature, suggesting the pre-1991 existence and evolution of the CL biotype strains in the Americas, independent from Asia.

  5. A Scalable Permutation Approach Reveals Replication and Preservation Patterns of Network Modules in Large Datasets.

    Science.gov (United States)

    Ritchie, Scott C; Watts, Stephen; Fearnley, Liam G; Holt, Kathryn E; Abraham, Gad; Inouye, Michael

    2016-07-01

    Network modules (topologically distinct groups of edges and nodes) that are preserved across datasets can reveal common features of organisms, tissues, cell types, and molecules. Many statistics to identify such modules have been developed, but testing their significance requires heuristics. Here, we demonstrate that current methods for assessing module preservation are systematically biased and produce skewed p values. We introduce NetRep, a rapid and computationally efficient method that uses a permutation approach to score module preservation without assuming data are normally distributed. NetRep produces unbiased p values and can distinguish between true and false positives during multiple hypothesis testing. We use NetRep to quantify preservation of gene coexpression modules across murine brain, liver, adipose, and muscle tissues. Complex patterns of multi-tissue preservation were revealed, including a liver-derived housekeeping module that displayed adipose- and muscle-specific association with body weight. Finally, we demonstrate the broader applicability of NetRep by quantifying preservation of bacterial networks in gut microbiota between men and women.
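
    The abstract above refers to scoring module preservation with a permutation approach. The sketch below shows a generic permutation test of that kind and is not NetRep's implementation or API: the statistic (mean pairwise edge weight), the function names, and the toy network are assumptions chosen for illustration. The null distribution is built by drawing random node sets of the same size as the module.

```python
import random


def mean_edge_weight(nodes, weights):
    """Mean weight over all node pairs; pairs missing from `weights` count as 0."""
    nodes = list(nodes)
    pairs = [(a, b) for i, a in enumerate(nodes) for b in nodes[i + 1:]]
    if not pairs:
        return 0.0
    return sum(weights.get(frozenset(p), 0.0) for p in pairs) / len(pairs)


def permutation_p_value(module, all_nodes, weights, n_perm=1000, seed=0):
    """Estimate how often a random node set scores at least as high as `module`.

    `all_nodes` must be a list (random.sample needs a sequence).
    The +1 terms keep the estimated p value strictly above zero.
    """
    rng = random.Random(seed)
    observed = mean_edge_weight(module, weights)
    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(all_nodes, len(module))
        if mean_edge_weight(sample, weights) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical test network: nodes a, b, c are tightly connected, the rest are not.
all_nodes = list("abcdefgh")
weights = {
    frozenset("ab"): 0.9,
    frozenset("ac"): 0.9,
    frozenset("bc"): 0.9,
}
p_dense = permutation_p_value(["a", "b", "c"], all_nodes, weights)
p_empty = permutation_p_value(["d", "e", "f"], all_nodes, weights)
print(p_dense, p_empty)
```

    Because the null is generated empirically from the data, no assumption about the distribution of the statistic is needed, which is the property the abstract emphasizes.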

  6. Re-inspection of small RNA sequence datasets reveals several novel human miRNA genes.

    Directory of Open Access Journals (Sweden)

    Thomas Birkballe Hansen

    BACKGROUND: miRNAs are key players in gene expression regulation. To fully understand the complex nature of cellular differentiation or initiation and progression of disease, it is important to assess the expression patterns of as many miRNAs as possible. Identifying novel miRNAs is therefore an essential prerequisite for a comprehensive and coherent understanding of cellular biology. METHODOLOGY/PRINCIPAL FINDINGS: Based on two extensive, but previously published, small RNA sequence datasets from human embryonic stem cells and human embryoid bodies, respectively [1], we identified 112 novel miRNA-like structures and were able to validate miRNA processing in 12 out of 17 investigated cases. Several miRNA candidates were furthermore substantiated by including additional available small RNA datasets, thereby demonstrating the power of combining datasets to identify miRNAs that otherwise may be assigned as experimental noise. CONCLUSIONS/SIGNIFICANCE: Our analysis highlights that existing datasets are not yet exhaustively studied and continuous re-analysis of the available data is important to uncover all features of small RNA sequencing.

  7. The largest human cognitive performance dataset reveals insights into the effects of lifestyle factors and aging

    Directory of Open Access Journals (Sweden)

    Daniel A Sternberg

    2013-06-01

    Making new breakthroughs in understanding the processes underlying human cognition may depend on the availability of very large datasets that have not historically existed in psychology and neuroscience. Lumosity is a web-based cognitive training platform that has grown to include over 600 million cognitive training task results from over 35 million individuals, comprising the largest existing dataset of human cognitive performance. As part of the Human Cognition Project, Lumosity’s collaborative research program to understand the human mind, Lumos Labs researchers and external research collaborators have begun to explore this dataset in order to uncover novel insights about the correlates of cognitive performance. This paper presents two preliminary demonstrations of some of the kinds of questions that can be examined with the dataset. The first example focuses on replicating known findings relating lifestyle factors to baseline cognitive performance in a demographically diverse, healthy population at a much larger scale than has previously been available. The second example examines a question that would likely be very difficult to study in laboratory-based and existing online experimental research approaches: specifically, how learning ability for different types of cognitive tasks changes with age. We hope that these examples will provoke the imagination of researchers who are interested in collaborating to answer fundamental questions about human cognitive performance.

  8. Meta-Analysis of High-Throughput Datasets Reveals Cellular Responses Following Hemorrhagic Fever Virus Infection

    Directory of Open Access Journals (Sweden)

    Gavin C. Bowick

    2011-05-01

    The continuing use of high-throughput assays to investigate cellular responses to infection is providing a large repository of information. Due to the large number of differentially expressed transcripts, often running into the thousands, the majority of these data have not been thoroughly investigated. Advances in techniques for the downstream analysis of high-throughput datasets are providing new methods for generating additional hypotheses for further investigation. The large number of experimental observations, combined with databases that correlate particular genes and proteins with canonical pathways, functions and diseases, allows for the bioinformatic exploration of functional networks that may be implicated in replication or pathogenesis. Herein, we provide an example of how analysis of published high-throughput datasets of cellular responses to hemorrhagic fever virus infection can generate additional functional data. We describe enrichment of genes involved in metabolism, post-translational modification and cardiac damage; potential roles for specific transcription factors; and a conserved involvement of a pathway based around cyclooxygenase-2. We believe that these types of analyses can provide virologists with additional hypotheses for continued investigation.

  9. Integrated analysis of ischemic stroke datasets revealed sex and age difference in anti-stroke targets

    Directory of Open Access Journals (Sweden)

    Wen-Xing Li

    2016-09-01

    Ischemic stroke is a common neurological disorder and its global burden is growing. This study aims to explore the effect of sex and age differences on ischemic stroke using integrated microarray datasets. The results showed a dramatic difference in whole gene expression profiles and influenced pathways between males and females, and also between old and young individuals. Furthermore, compared with old males, old female patients showed more serious biological function damage. However, among young subjects, females showed fewer affected pathways than males. Functional interaction networks showed these differentially expressed genes were mostly related to immune and inflammation-related functions. In addition, we found ARG1 and MMP9 were up-regulated in total and in all subgroups. Importantly, IL1A, IL1B, IL6, TNF and other anti-stroke target genes were up-regulated in males. However, these anti-stroke target genes showed low expression in females. This study found large sex and age differences in ischemic stroke, especially the opposite expression of anti-stroke target genes. Future studies are needed to uncover these pathological mechanisms, and to take appropriate prevention, treatment and rehabilitation measures.

  10. Multilocus dataset reveals demographic histories of two peat mosses in Europe

    Directory of Open Access Journals (Sweden)

    Hock Zsófia

    2007-08-01

    BACKGROUND: Revealing the past and present demographic history of populations is of high importance to evaluate the conservation status of species. Demographic data can be obtained by direct monitoring or by analysing data of historical and recent collections. Although these methods provide the most detailed information, they are very time consuming. An alternative is to make use of the information accumulated in the species' DNA over its history. Recent development of the coalescent theory makes it possible to reconstruct the demographic history of species using nucleotide polymorphism data. To separate the effect of natural selection and demography, multilocus analysis is needed because these two forces can produce similar patterns of polymorphisms. In this study we investigated the amount and pattern of sequence variability of a Europe-wide sample set of two peat moss species (Sphagnum fimbriatum and S. squarrosum) with similar distributions and mating systems but presumably contrasting historical demographies using 3 regions of the nuclear genome (appr. 3000 bps). We aimed to draw inferences concerning demographic and phylogeographic histories of the species. RESULTS: All three nuclear regions supported the presence of an Atlantic and Non-Atlantic clade of S. fimbriatum suggesting glacial survival of the species along the Atlantic coast of Europe. In contrast, S. squarrosum haplotypes showed three clades but no geographic structure at all. Maximum likelihood, mismatch and Bayesian analyses supported a severe historical bottleneck and a relatively recent demographic expansion of the Non-Atlantic clade of S. fimbriatum, whereas size of S. squarrosum populations has probably decreased in the past. Species-wide molecular diversity of the two species was nearly the same with an excess of replacement mutations in S. fimbriatum. Similar levels of molecular diversity, contrasting phylogeographic patterns and excess of replacement

  11. Genotyping of ancient Mycobacterium tuberculosis strains reveals historic genetic diversity.

    Science.gov (United States)

    Müller, Romy; Roberts, Charlotte A; Brown, Terence A

    2014-04-22

    The evolutionary history of the Mycobacterium tuberculosis complex (MTBC) has previously been studied by analysis of sequence diversity in extant strains, but not addressed by direct examination of strain genotypes in archaeological remains. Here, we use ancient DNA sequencing to type 11 single nucleotide polymorphisms and two large sequence polymorphisms in the MTBC strains present in 10 archaeological samples from skeletons from Britain and Europe dating to the second-nineteenth centuries AD. The results enable us to assign the strains to groupings and lineages recognized in the extant MTBC. We show that at least during the eighteenth-nineteenth centuries AD, strains of M. tuberculosis belonging to different genetic groups were present in Britain at the same time, possibly even at a single location, and we present evidence for a mixed infection in at least one individual. Our study shows that ancient DNA typing applied to multiple samples can provide sufficiently detailed information to contribute to both archaeological and evolutionary knowledge of the history of tuberculosis.

  12. Comparative genomics analyses revealed two virulent Listeria monocytogenes strains isolated from ready-to-eat food.

    Science.gov (United States)

    Lim, Shu Yong; Yap, Kien-Pong; Thong, Kwai Lin

    2016-01-01

    Listeria monocytogenes is an important foodborne pathogen that causes considerable morbidity in humans with high mortality rates. In this study, we have sequenced the genomes and performed comparative genomics analyses on two strains, LM115 and LM41, isolated from ready-to-eat food in Malaysia. The genome size of LM115 and LM41 was 2,959,041 and 2,963,111 bp, respectively. These two strains shared approximately 90% homologous genes. Comparative genomics and phylogenomic analyses revealed that LM115 and LM41 were more closely related to the reference strains F2365 and EGD-e, respectively. Our virulence profiling indicated a total of 31 virulence genes shared by both analysed strains. These shared genes included those that encode for internalins and L. monocytogenes pathogenicity island 1 (LIPI-1). Both the Malaysian L. monocytogenes strains also harboured several genes associated with stress tolerance to counter the adverse conditions. Seven antibiotic and efflux pump related genes, which may confer resistance against lincomycin, erythromycin, fosfomycin, quinolone, tetracycline, penicillin, and macrolides, were identified in the genomes of both strains. Whole genome sequencing and comparative genomics analyses revealed two virulent L. monocytogenes strains isolated from ready-to-eat foods in Malaysia. The identification of strains with pathogenic, persistent, and antibiotic resistant potentials from minimally processed food warrants close attention from both healthcare and food industry.

  13. Comparative transcriptomic analysis reveals similarities and dissimilarities in Saccharomyces cerevisiae wine strains response to nitrogen availability.

    Directory of Open Access Journals (Sweden)

    Catarina Barbosa

    Nitrogen levels in grape-juices are of major importance in winemaking ensuring adequate yeast growth and fermentation performance. Here we used a comparative transcriptome analysis to uncover wine yeasts responses to nitrogen availability during fermentation. Gene expression was assessed in three genetically and phenotypically divergent commercial wine strains (CEG, VL1 and QA23), under low (67 mg/L) and high nitrogen (670 mg/L) regimes, at three time points during fermentation (12 h, 24 h and 96 h). Two-way ANOVA analysis of each fermentation condition led to the identification of genes whose expression was dependent on strain, fermentation stage and on the interaction of both factors. The high fermenter yeast strain QA23 was more clearly distinct from the other two strains, by differential expression of genes involved in flocculation, mitochondrial functions, energy generation and protein folding and stabilization. For all strains, higher transcriptional variability due to fermentation stage was seen in the high nitrogen fermentations. A positive correlation between maximum fermentation rate and the expression of genes involved in stress response was observed. The finding of common genes correlated with both fermentation activity and nitrogen up-take underlies the role of nitrogen on yeast fermentative fitness. The comparative analysis of genes differentially expressed between both fermentation conditions at 12 h, where the main difference was the level of nitrogen available, showed the highest variability amongst strains revealing strain-specific responses. Nevertheless, we were able to identify a small set of genes whose expression profiles can quantitatively assess the common response of the yeast strains to varying nitrogen conditions.
    The use of three contrasting yeast strains in gene expression analysis prompts the identification of more reliable, accurate and reproducible biomarkers that will facilitate the diagnosis of deficiency of this

  14. Comparative Transcriptomic Analysis Reveals Similarities and Dissimilarities in Saccharomyces cerevisiae Wine Strains Response to Nitrogen Availability

    Science.gov (United States)

    Barbosa, Catarina; García-Martínez, José; Pérez-Ortín, José E.; Mendes-Ferreira, Ana

    2015-01-01

    Nitrogen levels in grape-juices are of major importance in winemaking ensuring adequate yeast growth and fermentation performance. Here we used a comparative transcriptome analysis to uncover wine yeasts responses to nitrogen availability during fermentation. Gene expression was assessed in three genetically and phenotypically divergent commercial wine strains (CEG, VL1 and QA23), under low (67 mg/L) and high nitrogen (670 mg/L) regimes, at three time points during fermentation (12h, 24h and 96h). Two-way ANOVA analysis of each fermentation condition led to the identification of genes whose expression was dependent on strain, fermentation stage and on the interaction of both factors. The high fermenter yeast strain QA23 was more clearly distinct from the other two strains, by differential expression of genes involved in flocculation, mitochondrial functions, energy generation and protein folding and stabilization. For all strains, higher transcriptional variability due to fermentation stage was seen in the high nitrogen fermentations. A positive correlation between maximum fermentation rate and the expression of genes involved in stress response was observed. The finding of common genes correlated with both fermentation activity and nitrogen up-take underlies the role of nitrogen on yeast fermentative fitness. The comparative analysis of genes differentially expressed between both fermentation conditions at 12h, where the main difference was the level of nitrogen available, showed the highest variability amongst strains revealing strain-specific responses. Nevertheless, we were able to identify a small set of genes whose expression profiles can quantitatively assess the common response of the yeast strains to varying nitrogen conditions. 
The use of three contrasting yeast strains in gene expression analysis prompts the identification of more reliable, accurate and reproducible biomarkers that will facilitate the diagnosis of deficiency of this nutrient in the grape

  15. Anchored enrichment dataset for true flies (order Diptera) reveals insights into the phylogeny of flower flies (family Syrphidae).

    Science.gov (United States)

    Young, Andrew Donovan; Lemmon, Alan R; Skevington, Jeffrey H; Mengual, Ximo; Ståhls, Gunilla; Reemer, Menno; Jordaens, Kurt; Kelso, Scott; Lemmon, Emily Moriarty; Hauser, Martin; De Meyer, Marc; Misof, Bernhard; Wiegmann, Brian M

    2016-06-29

    Anchored hybrid enrichment is a form of next-generation sequencing that uses oligonucleotide probes to target conserved regions of the genome flanked by less conserved regions in order to acquire data useful for phylogenetic inference from a broad range of taxa. Once a probe kit is developed, anchored hybrid enrichment is superior to traditional PCR-based Sanger sequencing in terms of both the amount of genomic data that can be recovered and effective cost. Due to their incredibly diverse nature, importance as pollinators, and historical instability with regard to subfamilial and tribal classification, Syrphidae (flower flies or hoverflies) are an ideal candidate for anchored hybrid enrichment-based phylogenetics, especially since recent molecular phylogenies of the syrphids using only a few markers have resulted in highly unresolved topologies. Over 6200 syrphids are currently known and uncovering their phylogeny will help us to understand how these species have diversified, providing insight into an array of ecological processes, from the development of adult mimicry, the origin of adult migration, to pollination patterns and the evolution of larval resource utilization. We present the first use of anchored hybrid enrichment in insect phylogenetics on a dataset containing 30 flower fly species from across all four subfamilies and 11 tribes out of 15. To produce a phylogenetic hypothesis, 559 loci were sampled to produce a final dataset containing 217,702 sites. We recovered a well resolved topology with bootstrap support values that were almost universally >95 %. The subfamily Eristalinae is recovered as paraphyletic, with the strongest support for this hypothesis to date. The ant predators in the Microdontinae are sister to all other syrphids. Syrphinae and Pipizinae are monophyletic and sister to each other. Larval predation on soft-bodied hemipterans evolved only once in this family. 
Anchored hybrid enrichment was successful in producing a robustly supported

  16. Transcriptomic profiling of diverse Aedes aegypti strains reveals increased basal-level immune activation in dengue virus-refractory populations and identifies novel virus-vector molecular interactions.

    Directory of Open Access Journals (Sweden)

    Shuzhen Sim

    Genetic variation among Aedes aegypti populations can greatly influence their vector competence for human pathogens such as the dengue virus (DENV). While intra-species transcriptome differences remain relatively unstudied when compared to coding sequence polymorphisms, they also affect numerous aspects of mosquito biology. Comparative molecular profiling of mosquito strain transcriptomes can therefore provide valuable insight into the regulation of vector competence. We established a panel of A. aegypti strains with varying levels of susceptibility to DENV, comprising both laboratory-maintained strains and field-derived colonies collected from geographically distinct dengue-endemic regions spanning South America, the Caribbean, and Southeast Asia. A comparative genome-wide gene expression microarray-based analysis revealed higher basal levels of numerous immunity-related gene transcripts in DENV-refractory mosquito strains than in susceptible strains, and RNA interference assays further showed different degrees of immune pathway contribution to refractoriness in different strains. By correlating transcript abundance patterns with DENV susceptibility across our panel, we also identified new candidate modulators of DENV infection in the mosquito, and we provide functional evidence for two potential DENV host factors and one potential restriction factor. Our comparative transcriptome dataset thus not only provides valuable information about immune gene regulation and usage in natural refractoriness of mosquito populations to dengue virus but also allows us to identify new molecular interactions between the virus and its mosquito vector.

  17. Citizen science datasets reveal drivers of spatial and temporal variation for anthropogenic litter on Great Lakes beaches.

    Science.gov (United States)

    Vincent, Anna; Drag, Nate; Lyandres, Olga; Neville, Sarah; Hoellein, Timothy

    2017-01-15

    Accumulation of anthropogenic litter (AL) on marine beaches and its ecological effects have been a major focus of research. Recent studies suggest AL is also abundant in freshwater environments, but much less research has been conducted in freshwaters relative to oceans. The Adopt-a-Beach™ (AAB) program, administered by the Alliance for the Great Lakes, organizes volunteers to act as citizen scientists by collecting and maintaining data on AL abundance on Great Lakes beaches. Initial assessments of the AAB records quantified sources and abundance of AL on Lake Michigan beaches, and showed that plastic AL was >75% of AL on beaches across all five Great Lakes. However, AAB records have not yet been used to examine patterns of AL density and composition among beaches of all different substrate types (e.g., parks, rocky, sandy), across land-use categories (e.g., rural, suburban, urban), or among seasons (i.e., spring, summer, and fall). We found that most AL on beaches are consumer goods that most likely originate from beach visitors and nearby urban environments, rather than activities such as shipping, fishing, or illegal dumping. We also demonstrated that urban beaches and those with sand rather than rocks had higher AL density relative to other sites. Finally, we found that AL abundance is lowest during the summer, between the US holidays of Memorial Day (last Monday in May) and Labor Day (first Monday in September) at the urban beaches, while other beaches showed no seasonality. This research is a model for utilizing datasets collected by volunteers involved in citizen science programs, and will contribute to AL management by offering priorities for AL types and locations to maximize AL reduction.

  18. Genetic variation in the Staphylococcus aureus 8325 strain lineage revealed by whole-genome sequencing.

    Directory of Open Access Journals (Sweden)

    Kristoffer T Bæk

    Staphylococcus aureus strains of the 8325 lineage, especially 8325-4 and derivatives lacking prophage, have been used extensively for decades of research. We report herein the results of our deep sequence analysis of strain 8325-4. Assignment of sequence variants compared with the reference strain 8325 (NRS77/PS47) required correction of errors in the 8325 reference genome, and reassessment of variation previously attributed to chemical mutagenesis of the restriction-defective RN4220. Using an extensive strain pedigree analysis, we discovered that 8325-4 contains 16 single nucleotide polymorphisms (SNPs) arising prior to the construction of RN4220. We identified 5 indels in 8325-4 compared with 8325. Three indels correspond to expected Φ11, 12, 13 excisions, one indel is explained by a sequence assembly artifact, and the final indel (Δ63bp in the spa-sarS intergenic region) is common to only a sub-lineage of 8325-4 strains including SH1000. This deletion was found to significantly decrease (75%) steady state sarS but not spa transcript levels in post-exponential phase. The sub-lineage 8325-4 was also found to harbor 4 additional SNPs. We also found large sequence variation between 8325, 8325-4 and RN4220 in a cluster of repetitive hypothetical proteins (SA0282 homologs) near the Ess secretion cluster. The overall 8325-4 SNP set results in 17 alterations within coding sequences. Remarkably, we discovered that all tested strains of the 8325-4 lineage lack phenol soluble modulin α3 (PSMα3), a virulence determinant implicated in neutrophil chemotaxis, biofilm architecture and surface spreading. Collectively, our results clarify and define the 8325-4 pedigree and reveal clear evidence that mutations existing throughout all branches of this lineage, including the widely used RN6390 and SH1000 strains, could conceivably impact virulence regulation.

  19. Comparative genome analysis of pathogenic and non-pathogenic Clavibacter strains reveals adaptations to their lifestyle.

    Science.gov (United States)

    Załuga, Joanna; Stragier, Pieter; Baeyen, Steve; Haegeman, Annelies; Van Vaerenbergh, Johan; Maes, Martine; De Vos, Paul

    2014-05-22

    The genus Clavibacter harbors economically important plant pathogens infecting agricultural crops such as potato and tomato. Although the vast majority of Clavibacter strains are pathogenic, an increasing number of non-pathogenic isolates are being reported. Non-pathogenic Clavibacter strains isolated from tomato seeds are particularly problematic because they affect the current detection and identification tests for Clavibacter michiganensis subsp. michiganensis (Cmm), which is regulated with a zero tolerance in tomato seed. Their misidentification as pathogenic Cmm hampers a clear judgment on the seed quality and health. To get more insight into the genetic features linked to the lifestyle of these bacteria, a whole-genome sequence of the tomato seed-borne non-pathogenic Clavibacter LMG 26808 was determined. To gain a better understanding of the molecular determinants of pathogenicity, the genome sequence of LMG 26808 was compared with that of the pathogenic Cmm strain (NCPPB 382). The comparative analysis revealed that LMG 26808 does not contain plasmids pCM1 and pCM2 and also lacks the majority of important virulence factors described so far for pathogenic Cmm. This explains its apparent non-pathogenic nature in tomato plants. Moreover, the genome analysis of LMG 26808 detected sequences from a plasmid originating from a member of Enterobacteriaceae/Klebsiella relative. Genes acquired in this way that code for antibiotic resistance may provide a competitive advantage for survival of LMG 26808 in its ecological niche. Genetically, LMG 26808 was the most similar to the pathogenic Cmm NCPPB 382 but contained more mobile genetic elements. The genome of this non-pathogenic Clavibacter strain also contained a high number of transporters and regulatory genes. The genome sequence of the non-pathogenic Clavibacter strain LMG 26808 and the comparative analyses with other pathogenic Clavibacter strains provided a better understanding of the genetic bases of virulence and

  20. Meta-Analysis of Public Microarray Datasets Reveals Voltage-Gated Calcium Gene Signatures in Clinical Cancer Patients.

    Directory of Open Access Journals (Sweden)

    Chih-Yang Wang

    Full Text Available Voltage-gated calcium channels (VGCCs) are well documented to play roles in cell proliferation, migration, and apoptosis; however, whether VGCCs regulate the onset and progression of cancer is still under investigation. The VGCC family consists of five members, which are L-type, N-type, T-type, R-type and P/Q type. To date, no holistic approach has been used to screen VGCC family genes in different types of cancer. We analyzed the transcript expression of VGCCs in clinical cancer tissue samples by accessing ONCOMINE (www.oncomine.org), a web-based microarray database, to perform a systematic analysis. Every member of the VGCCs was examined across 21 different types of cancer by comparing mRNA expression in cancer to that in normal tissue. A previous study showed that altered expression of mRNA in cancer tissue may play an oncogenic role and promote tumor development; therefore, in the present findings, we focus only on the overexpression of VGCCs in different types of cancer. This bioinformatics analysis revealed that different subtypes of VGCCs (CACNA1C, CACNA1D, CACNA1B, CACNA1G, and CACNA1I) are implicated in the development and progression of diverse types of cancer and show dramatic up-regulation in breast cancer. CACNA1F only showed high expression in testis cancer, whereas CACNA1A, CACNA1C, and CACNA1D were highly expressed in most types of cancer. The current analysis revealed that specific VGCCs likely play essential roles in specific types of cancer. Collectively, we identified several VGCC targets and classified them according to different cancer subtypes for prospective studies on the underlying carcinogenic mechanisms. The present findings suggest that VGCCs are possible targets for prospective investigation in cancer treatment.

  1. Revealing differences in metabolic flux distributions between a mutant strain and its parent strain Gluconacetobacter xylinus CGMCC 2955.

    Directory of Open Access Journals (Sweden)

    Cheng Zhong

    Full Text Available A better understanding of metabolic fluxes is important for manipulating microbial metabolism toward desired end products, or away from undesirable by-products. A mutant strain, Gluconacetobacter xylinus AX2-16, was obtained by combined chemical mutation of the parent strain (G. xylinus CGMCC 2955) using DEC (diethyl sulfate) and LiCl. The highest bacterial cellulose production for this mutant was about 11.75 g/L, an increase of 62% compared with that of the parent strain. In contrast, the concentration of gluconic acid (the main byproduct) was only 5.71 g/L for the mutant strain, which was 55.7% lower than that of the parent strain. Metabolic flux analysis indicated that 40.1% of the carbon source was transformed to bacterial cellulose in the mutant strain, compared with 24.2% for the parent strain. Only 32.7% and 4.0% of the carbon source were converted into gluconic acid and acetic acid in the mutant strain, compared with 58.5% and 9.5% in the parent strain. In addition, a higher flux of the tricarboxylic acid (TCA) cycle was obtained in the mutant strain (57.0%) compared with the parent strain (17.0%). The flux analysis also indicated that more ATP was produced in the mutant strain from the pentose phosphate pathway (PPP) and TCA cycle. The enzymatic activity of succinate dehydrogenase (SDH), which is one of the key enzymes in the TCA cycle, was 1.65-fold higher in the mutant strain than in the parent strain at the end of culture. It was further validated by the measurement of ATPase that 3.53-6.41 fold higher enzymatic activity was obtained from the mutant strain compared with the parent strain.

  2. Nomadic lifestyle of Lactobacillus plantarum revealed by comparative genomics of 54 strains isolated from different habitats.

    Science.gov (United States)

    Martino, Maria Elena; Bayjanov, Jumamurat R; Caffrey, Brian E; Wels, Michiel; Joncour, Pauline; Hughes, Sandrine; Gillet, Benjamin; Kleerebezem, Michiel; van Hijum, Sacha A F T; Leulier, François

    2016-12-01

    The ability of bacteria to adapt to diverse environmental conditions is well-known. The process of bacterial adaptation to a niche has been linked to large changes in the genome content, showing that many bacterial genomes reflect the constraints imposed by their habitat. However, some highly versatile bacteria are found in diverse habitats that share almost nothing in common. Lactobacillus plantarum is a lactic acid bacterium that is found in a large variety of habitats. With the aim of unravelling the link between evolution and ecological versatility of L. plantarum, we analysed the genomes of 54 L. plantarum strains isolated from different environments. Comparative genome analysis identified a high level of genomic diversity and plasticity among the strains analysed. Phylogenomic and functional divergence studies coupled with gene-trait matching analyses revealed a mixed distribution of the strains, which was uncoupled from their environmental origin. Our findings revealed the absence of specific genomic signatures marking adaptations of L. plantarum towards the diverse habitats it is associated with. This suggests fundamentally similar trends of genome evolution in L. plantarum, which occur in a manner that is apparently uncoupled from ecological constraint and reflects the nomadic lifestyle of this species. © 2016 The Authors. Environmental Microbiology published by Society for Applied Microbiology and John Wiley & Sons Ltd.

  3. Caenorhabditis briggsae recombinant inbred line genotypes reveal inter-strain incompatibility and the evolution of recombination.

    Directory of Open Access Journals (Sweden)

    Joseph A Ross

    2011-07-01

    Full Text Available The nematode Caenorhabditis briggsae is an emerging model organism that allows evolutionary comparisons with C. elegans and exploration of its own unique biological attributes. To produce a high-resolution C. briggsae recombination map, recombinant inbred lines were generated from reciprocal crosses between two strains and genotyped at over 1,000 loci. A second set of recombinant inbred lines involving a third strain was also genotyped at lower resolution. The resulting recombination maps exhibit discrete domains of high and low recombination, as in C. elegans, indicating these are a general feature of Caenorhabditis species. The proportion of a chromosome's physical size occupied by the central, low-recombination domain is highly correlated between species. However, the C. briggsae intra-species comparison reveals striking variation in the distribution of recombination between domains. Hybrid lines made with the more divergent pair of strains also exhibit pervasive marker transmission ratio distortion, evidence of selection acting on hybrid genotypes. The strongest effect, on chromosome III, is explained by a developmental delay phenotype exhibited by some hybrid F2 animals. In addition, on chromosomes IV and V, cross direction-specific biases towards one parental genotype suggest the existence of cytonuclear epistatic interactions. These interactions are discussed in relation to surprising mitochondrial genome polymorphism in C. briggsae, evidence that the two strains diverged in allopatry, the potential for local adaptation, and the evolution of Dobzhansky-Muller incompatibilities. The genetic and genomic resources resulting from this work will support future efforts to understand inter-strain divergence as well as facilitate studies of gene function, natural variation, and the evolution of recombination in Caenorhabditis nematodes.

  4. Differential lysine acetylation profiles of Erwinia amylovora strains revealed by proteomics

    Science.gov (United States)

    Wu, Xia; Vellaichamy, Adaikkalam; Wang, Dongping; Zamdborg, Leonid; Kelleher, Neil L.; Huber, Steven C.; Zhao, Youfu

    2015-01-01

    Protein lysine acetylation (LysAc) has recently been demonstrated to be widespread in E. coli and Salmonella, and to broadly regulate bacterial physiology and metabolism. However, LysAc in plant pathogenic bacteria is largely unknown. Here we first report the lysine acetylome of Erwinia amylovora, an enterobacterium causing serious fire blight disease of apples and pears. Immunoblots using generic anti-lysine acetylation antibodies demonstrated that growth conditions strongly affected the LysAc profiles in E. amylovora. Differential LysAc profiles were also observed for two E. amylovora strains, known to have differential virulence in plants, indicating translational modification of proteins may be important in determining virulence of bacterial strains. Proteomic analysis of LysAc in two E. amylovora strains identified 141 LysAc sites in 96 proteins that function in a wide range of biological pathways. Consistent with previous reports, 44% of the proteins are involved in metabolic processes, including central metabolism, lipopolysaccharide, nucleotide and amino acid metabolism. Interestingly, for the first time, several proteins involved in E. amylovora virulence, including exopolysaccharide amylovoran biosynthesis- and type III secretion-associated proteins, were found to be lysine acetylated, suggesting that LysAc may play a major role in bacterial virulence. Comparative analysis of LysAc sites in E. amylovora and E. coli further revealed the sequence and structural commonality for LysAc in the two organisms. Collectively, these results reinforce the notion that LysAc of proteins is widespread in bacterial metabolism and virulence. PMID:23234799

  5. Comparative Transcriptome Analysis Reveals Different Silk Yields of Two Silkworm Strains.

    Directory of Open Access Journals (Sweden)

    Juan Li

    Full Text Available Cocoon and silk yields are the most important characteristics of sericulture. However, few studies have examined the genes that modulate these features. Further studies of these genes will be useful for improving the products of sericulture. JingSong (JS) and Lan10 (L10) are two strains having significantly different cocoon and silk yields. In the current study, RNA-Seq and quantitative polymerase chain reaction (qPCR) were performed on both strains in order to determine divergence of the silk gland, which controls silk biosynthesis in silkworms. Compared with L10, JS had 1375 differentially expressed genes (DEGs; 738 up-regulated genes and 673 down-regulated genes). Nine enriched gene ontology (GO) terms were identified by GO enrichment analysis based on these DEGs. KEGG enrichment analysis results showed that the DEGs were enriched in three pathways, which were mainly associated with the processing and biosynthesis of proteins. The representative genes in the enrichment pathways and ten significant DEGs were further verified by qPCR, the results of which were consistent with the RNA-Seq data. Our study has revealed differences in silk glands between the two silkworm strains and provides a perspective for understanding the molecular mechanisms determining silk yield.

  6. Characterization of the biocontrol activity of Pseudomonas fluorescens strain X reveals novel genes regulated by glucose.

    Directory of Open Access Journals (Sweden)

    Gerasimos F Kremmydas

    Full Text Available Pseudomonas fluorescens strain X, a bacterial isolate from the rhizosphere of bean seedlings, has the ability to suppress damping-off caused by the oomycete Pythium ultimum. To determine the genes controlling the biocontrol activity of strain X, transposon mutagenesis, sequencing and complementation were performed. Results indicate that the biocontrol ability of this isolate is attributed to the gcd gene encoding glucose dehydrogenase, genes encoding its co-enzyme pyrroloquinoline quinone (PQQ), and two genes (sup5 and sup6) which seem to be organized in a putative operon. This operon (named supX) consists of five genes, one of which encodes a non-ribosomal peptide synthase. A unique binding site for a GntR-type transcriptional factor is localized upstream of the supX putative operon. Synteny comparison of the genes in supX revealed that they are common in the genus Pseudomonas, but with a low degree of similarity. supX shows high similarity only to the mangotoxin operon of Ps. syringae pv. syringae UMAF0158. Quantitative real-time PCR analysis indicated that transcription of supX is strongly reduced in the gcd and PQQ-minus mutants of Ps. fluorescens strain X. On the contrary, transcription of supX in the wild type is enhanced by glucose, and transcription levels appear to be higher during the stationary phase. Gcd, which uses PQQ as a cofactor, catalyses the oxidation of glucose to gluconic acid, which controls the activity of the GntR family of transcriptional factors. The genes in the supX putative operon have not been implicated before in the biocontrol of plant pathogens by pseudomonads. They are involved in the biosynthesis of an antimicrobial compound by Ps. fluorescens strain X and their transcription is controlled by glucose, possibly through the activity of a GntR-type transcriptional factor binding upstream of this putative operon.

  7. Proteomics dataset

    DEFF Research Database (Denmark)

    Bennike, Tue Bjerg; Carlsen, Thomas Gelsing; Ellingsen, Torkell

    2017-01-01

    The datasets presented in this article are related to the research articles entitled “Neutrophil Extracellular Traps in Ulcerative Colitis: A Proteome Analysis of Intestinal Biopsies” (Bennike et al., 2015 [1]), and “Proteome Analysis of Rheumatoid Arthritis Gut Mucosa” (Bennike et al., 2017 [2])...... been deposited to the ProteomeXchange Consortium via the PRIDE partner repository with the dataset identifiers PXD001608 for ulcerative colitis and control samples, and PXD003082 for rheumatoid arthritis samples....

  8. Longitudinal genotyping of Candida dubliniensis isolates reveals strain maintenance, microevolution, and the emergence of itraconazole resistance.

    LENUS (Irish Health Repository)

    Fleischhacker, M

    2010-05-01

    We investigated the population structure of 208 Candida dubliniensis isolates obtained from 29 patients (25 human immunodeficiency virus [HIV] positive and 4 HIV negative) as part of a longitudinal study. The isolates were identified as C. dubliniensis by arbitrarily primed PCR (AP-PCR) and then genotyped using the Cd25 probe specific for C. dubliniensis. The majority of the isolates (55 of 58) were unique to individual patients, and more than one genotype was recovered from 15 of 29 patients. A total of 21 HIV-positive patients were sampled on more than one occasion (2 to 36 times). Sequential isolates recovered from these patients were all closely related, as demonstrated by hybridization with Cd25 and genotyping by PCR. Six patients were colonized by the same genotype of C. dubliniensis on repeated sampling, while strains exhibiting altered genotypes were recovered from 15 of 21 patients. The majority of these isolates demonstrated minor genetic alterations, i.e., microevolution, while one patient acquired an unrelated strain. The C. dubliniensis strains could not be separated into genetically distinct groups based on patient viral load, CD4 cell count, or oropharyngeal candidosis. However, C. dubliniensis isolates obtained from HIV-positive patients were more closely related than those recovered from HIV-negative patients. Approximately 8% (16 of 194) of isolates exhibited itraconazole resistance. Cross-resistance to fluconazole was only observed in one of these patients. Two patients harboring itraconazole-resistant isolates had not received any previous azole therapy. In conclusion, longitudinal genotyping of C. dubliniensis isolates from HIV-infected patients reveals that isolates from the same patient are generally closely related and may undergo microevolution. In addition, isolates may acquire itraconazole resistance, even in the absence of prior azole therapy.

  9. Comparative Genomics Revealed Genetic Diversity and Species/Strain-Level Differences in Carbohydrate Metabolism of Three Probiotic Bifidobacterial Species

    Directory of Open Access Journals (Sweden)

    Toshitaka Odamaki

    2015-01-01

    Full Text Available Strains of Bifidobacterium longum, Bifidobacterium breve, and Bifidobacterium animalis are widely used as probiotics in the food industry. Although numerous studies have revealed the properties and functionality of these strains, it is uncertain whether these characteristics are species common or strain specific. To address this issue, we performed a comparative genomic analysis of 49 strains belonging to these three bifidobacterial species to describe their genetic diversity and to evaluate species-level differences. There were 166 common clusters between strains of B. breve and B. longum, whereas there were nine common clusters between strains of B. animalis and B. longum and four common clusters between strains of B. animalis and B. breve. Further analysis focused on carbohydrate metabolism revealed the existence of certain strain-dependent genes, such as those encoding enzymes for host glycan utilisation or certain membrane transporters, and many genes commonly distributed at the species level, as was previously reported in studies with limited strains. As B. longum and B. breve are human-residential bifidobacteria (HRB), whereas B. animalis is a non-HRB species, several of the differences in these species’ gene distributions might be the result of their adaptations to the nutrient environment. This information may aid both in selecting probiotic candidates and in understanding their potential function as probiotics.

  10. Viral forensic genomics reveals the relatedness of classic herpes simplex virus strains KOS, KOS63, and KOS79.

    Science.gov (United States)

    Bowen, Christopher D; Renner, Daniel W; Shreve, Jacob T; Tafuri, Yolanda; Payne, Kimberly M; Dix, Richard D; Kinchington, Paul R; Gatherer, Derek; Szpara, Moriah L

    2016-05-01

    Herpes simplex virus 1 (HSV-1) is a widespread global pathogen, of which the strain KOS is one of the most extensively studied. Previous sequence studies revealed that KOS does not cluster with other strains of North American geographic origin, but instead clustered with Asian strains. We sequenced a historical isolate of the original KOS strain, called KOS63, along with a separately isolated strain attributed to the same source individual, termed KOS79. Genomic analyses revealed that KOS63 closely resembled other recently sequenced isolates of KOS and was of Asian origin, but that KOS79 was a genetically unrelated strain that clustered in genetic distance analyses with HSV-1 strains of North American/European origin. These data suggest that the human source of KOS63 and KOS79 could have been infected with two genetically unrelated strains of disparate geographic origins. A PCR RFLP test was developed for rapid identification of these strains. Copyright © 2016 Elsevier Inc. All rights reserved.

  11. Atomic-scale Ge diffusion in strained Si revealed by quantitative scanning transmission electron microscopy

    Science.gov (United States)

    Radtke, G.; Favre, L.; Couillard, M.; Amiard, G.; Berbezier, I.; Botton, G. A.

    2013-05-01

    Aberration-corrected scanning transmission electron microscopy is employed to investigate the local chemistry in the vicinity of a Si0.8Ge0.2/Si interface grown by molecular-beam epitaxy. Atomic-resolution high-angle annular dark field contrast reveals the presence of a nonuniform diffusion of Ge from the substrate into the strained Si thin film. On the basis of multislice calculations, a model is proposed to quantify the experimental contrast, showing that the Ge concentration in the thin film reaches about 4% at the interface and decreases monotonically on a typical length scale of 10 nm. Diffusion occurring during the growth process itself therefore appears as a major factor limiting the abruptness of interfaces in the Si-Ge system.

  12. Multilocus microsatellite typing (MLMT) of strains from Turkey and Cyprus reveals a novel monophyletic L. donovani sensu lato group.

    Directory of Open Access Journals (Sweden)

    Evi Gouzelou

    Full Text Available BACKGROUND: New foci of human CL caused by strains of the Leishmania donovani (L. donovani) complex have been recently described in Cyprus and the Çukurova region in Turkey (L. infantum) situated 150 km north of Cyprus. Cypriot strains were typed by Multilocus Enzyme Electrophoresis (MLEE) using the Montpellier (MON) system as L. donovani zymodeme MON-37. However, multilocus microsatellite typing (MLMT) has shown that this zymodeme is paraphyletic; composed of distantly related genetic subgroups of different geographical origin. Consequently the origin of the Cypriot strains remained enigmatic. METHODOLOGY/PRINCIPAL FINDINGS: The Cypriot strains were compared with a set of Turkish isolates obtained from a CL patient and sand fly vectors in south-east Turkey (Çukurova region; CUK strains) and from a VL patient in the south-west (Kuşadasi; EP59 strain). These Turkish strains were initially analyzed using the K26-PCR assay that discriminates MON-1 strains by their amplicon size. In line with previous DNA-based data, the strains were inferred to the L. donovani complex and characterized as non MON-1. For these strains MLEE typing revealed two novel zymodemes; L. donovani MON-309 (CUK strains) and MON-308 (EP59). A population genetic analysis of the Turkish isolates was performed using 14 hyper-variable microsatellite loci. The genotypic profiles of 68 previously analyzed L. donovani complex strains from major endemic regions were included for comparison. Population structures were inferred by combination of Bayesian model-based and distance-based approaches. MLMT placed the Turkish and Cypriot strains in a subclade of a newly discovered, genetically distinct L. infantum monophyletic group, suggesting that the Cypriot strains may originate from Turkey. CONCLUSION: The discovery of a genetically distinct L. infantum monophyletic group in the south-eastern Mediterranean stresses the importance of species genetic characterization towards better understanding

  13. Experimental single-strain mobilomics reveals events that shape pathogen emergence.

    Science.gov (United States)

    Schoeniger, Joseph S; Hudson, Corey M; Bent, Zachary W; Sinha, Anupama; Williams, Kelly P

    2016-08-19

    Virulence genes on mobile DNAs such as genomic islands (GIs) and plasmids promote bacterial pathogen emergence. Excision is an early step in GI mobilization, producing a circular GI and a deletion site in the chromosome; circular forms are also known for some bacterial insertion sequences (ISs). The recombinant sequence at the junctions of such circles and deletions can be detected sensitively in high-throughput sequencing data, using new computational methods that enable empirical discovery of mobile DNAs. For the rich mobilome of a hospital Klebsiella pneumoniae strain, circularization junctions (CJs) were detected for six GIs and seven IS types. Our methods revealed differential biology of multiple mobile DNAs, imprecision of integrases and transposases, and differential activity among identical IS copies for IS26, ISKpn18 and ISKpn21. Using the resistance of circular dsDNA molecules to exonuclease, internally calibrated with the native plasmids, we showed that not all molecules bearing GI CJs were circular. Transpositions were also detected, revealing replicon preference (ISKpn18 prefers a conjugative IncA/C2 plasmid), local action (IS26), regional preferences, selection (against capsule synthesis) and IS polarity inversion. Efficient discovery and global characterization of numerous mobile elements per experiment improves accounting for the new gene combinations that arise in emerging pathogens. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  14. Proteomics dataset

    DEFF Research Database (Denmark)

    Bennike, Tue Bjerg; Carlsen, Thomas Gelsing; Ellingsen, Torkell

    2017-01-01

    patients (Morgan et al., 2012; Abraham and Medzhitov, 2011; Bennike, 2014) [8–10]. Therefore, we characterized the proteome of colon mucosa biopsies from 10 inflammatory bowel disease ulcerative colitis (UC) patients, 11 gastrointestinal healthy rheumatoid arthritis (RA) patients, and 10 controls. We...... been deposited to the ProteomeXchange Consortium via the PRIDE partner repository with the dataset identifiers PXD001608 for ulcerative colitis and control samples, and PXD003082 for rheumatoid arthritis samples....

  15. Proteomic analysis reveals contrasting stress response to uranium in two nitrogen-fixing Anabaena strains, differentially tolerant to uranium

    Energy Technology Data Exchange (ETDEWEB)

    Panda, Bandita; Basu, Bhakti; Acharya, Celin; Rajaram, Hema; Apte, Shree Kumar, E-mail: aptesk@barc.gov.in

    2017-01-15

    Highlights: • Response of two native cyanobacterial strains to uranium exposure was studied. • Anabaena L-31 exhibited higher tolerance to uranium as compared to Anabaena 7120. • Uranium exposure differentially affected the proteome profiles of the two strains. • Anabaena L-31 showed better sustenance of photosynthesis and carbon metabolism. • Anabaena L-31 displayed superior oxidative stress defense than Anabaena 7120. - Abstract: Two strains of the nitrogen-fixing cyanobacterium Anabaena, native to Indian paddy fields, displayed differential sensitivity to exposure to uranyl carbonate at neutral pH. Anabaena sp. strain PCC 7120 and Anabaena sp. strain L-31 displayed 50% reduction in survival (LD50 dose), following 3 h exposure to 75 μM and 200 μM uranyl carbonate, respectively. Uranium responsive proteome alterations were visualized by 2D gel electrophoresis, followed by protein identification by MALDI-ToF mass spectrometry. The two strains displayed significant differences in levels of proteins associated with photosynthesis, carbon metabolism, and oxidative stress alleviation, commensurate with their uranium tolerance. Higher uranium tolerance of Anabaena sp. strain L-31 could be attributed to sustained photosynthesis and carbon metabolism and superior oxidative stress defense, as compared to the uranium sensitive Anabaena sp. strain PCC 7120. Significance: Uranium responsive proteome modulations in two nitrogen-fixing strains of Anabaena, native to Indian paddy fields, revealed that rapid adaptation to better oxidative stress management, and maintenance of metabolic and energy homeostasis underlies superior uranium tolerance of Anabaena sp. strain L-31 compared to Anabaena sp. strain PCC 7120.

  16. Sister Dehalobacter Genomes Reveal Specialization in Organohalide Respiration and Recent Strain Differentiation Likely Driven by Chlorinated Substrates

    Directory of Open Access Journals (Sweden)

    Shuiquan Tang

    2016-02-01

    Full Text Available The genomes of two closely related Dehalobacter strains (strain CF and strain DCA) were assembled from the metagenome of an anaerobic enrichment culture that reductively dechlorinates chloroform (CF), 1,1,1-trichloroethane (1,1,1-TCA) and 1,1-dichloroethane (1,1-DCA). The 3.1 Mbp genomes of strain CF (that dechlorinates CF and 1,1,1-TCA) and strain DCA (that dechlorinates 1,1-DCA) each contain 17 putative reductive dehalogenase homologous (rdh) genes. These two genomes were systematically compared to three other available organohalide-respiring Dehalobacter genomes (Dehalobacter restrictus strain PER-K23, Dehalobacter sp. strain E1 and Dehalobacter sp. strain UNSWDHB), and to the genomes of Dehalococcoides mccartyi strain 195 and Desulfitobacterium hafniense strain Y51. This analysis compared 42 different metabolic and physiological categories. The genomes of strains CF and DCA share 90% overall average nucleotide identity and greater than 99.8% identity over a 2.9 Mbp alignment that excludes large insertions, indicating that these genomes differentiated from a close common ancestor. This differentiation was likely driven by selection pressures around two orthologous reductive dehalogenase genes, cfrA and dcrA, that code for the enzymes that reduce CF or 1,1,1-TCA and 1,1-DCA. The many reductive dehalogenase genes found in the five Dehalobacter genomes cluster into two small conserved regions and were often associated with Crp/Fnr transcriptional regulators. Specialization is on-going on a strain-specific basis, as some strains but not others have lost essential genes in the Wood-Ljungdahl (strain E1) and corrinoid biosynthesis pathways (strains E1 and PER-K23). The gene encoding phosphoserine phosphatase, which catalyzes the last step of serine biosynthesis, is missing from all five Dehalobacter genomes, yet D. restrictus can grow without serine, suggesting an alternative or unrecognized biosynthesis route exists.
In contrast to Dehalococcoides mccartyi

  17. Molecular typing of canine distemper virus strains reveals the presence of a new genetic variant in South America.

    Science.gov (United States)

    Sarute, Nicolás; Pérez, Ruben; Aldaz, Jaime; Alfieri, Amauri A; Alfieri, Alice F; Name, Daniela; Llanes, Jessika; Hernández, Martín; Francia, Lourdes; Panzera, Yanina

    2014-06-01

    Canine distemper virus (CDV, Paramyxoviridae, Morbillivirus) is the causative agent of a severe infectious disease affecting terrestrial and marine carnivores worldwide. Phylogenetic relationships and the genetic variability of the hemagglutinin (H) protein and the fusion protein signal-peptide (Fsp) allow for the classification of field strains into genetic lineages. Currently, there are nine CDV lineages worldwide, two of them co-circulating in South America. Using the Fsp-coding region, we analyzed the genetic variability of strains from Uruguay, Brazil, and Ecuador, and compared them with those described previously in South America and other geographical areas. The results revealed that the Brazilian and Uruguayan strains belong to the already described South America lineage (EU1/SA1), whereas the Ecuadorian strains cluster in a new clade, here named South America 3, which may represent the third CDV lineage described in South America.

  18. High-resolution spatiotemporal strain mapping reveals non-uniform deformation in micropatterned elastomers

    Science.gov (United States)

    Aksoy, B.; Rehman, A.; Bayraktar, H.; Alaca, B. E.

    2017-04-01

    Micropatterns are generated on a vast selection of polymeric substrates for various applications ranging from stretchable electronics to cellular mechanobiological systems. When these patterned substrates are exposed to external loading, strain field is primarily affected by the presence of microfabricated structures and similarly by fabrication-related defects. The capturing of such nonhomogeneous strain fields is of utmost importance in cases where study of the mechanical behavior with a high spatial resolution is necessary. Image-based non-contact strain measurement techniques are favorable and have recently been extended to scanning tunneling microscope and scanning electron microscope images for the characterization of mechanical properties of metallic materials, e.g. steel and aluminum, at the microscale. A similar real-time analysis of strain heterogeneity in elastomers is yet to be achieved during the entire loading sequence. The available measurement methods for polymeric materials mostly depend on cross-head displacement or precalibrated strain values. Thus, they suffer either from the lack of any real-time analysis, spatiotemporal distribution or high resolution in addition to a combination of these factors. In this work, these challenges are addressed by integrating a tensile stretcher with an inverted optical microscope and developing a subpixel particle tracking algorithm. As a proof of concept, the patterns with a critical dimension of 200 µm are generated on polydimethylsiloxane substrates and strain distribution in the vicinity of the patterns is captured with a high spatiotemporal resolution. In the field of strain measurement, there is always a tradeoff between minimum measurable strain value and spatial resolution. Current noncontact techniques on elastomers can deliver a strain resolution of 0.001% over a minimum length of 5 cm. More importantly, inhomogeneities within this quite large region cannot be captured. The proposed technique can

  19. High-resolution spatiotemporal strain mapping reveals non-uniform deformation in micropatterned elastomers

    International Nuclear Information System (INIS)

    Aksoy, B; Alaca, B E; Rehman, A; Bayraktar, H

    2017-01-01

    Micropatterns are generated on a vast selection of polymeric substrates for various applications ranging from stretchable electronics to cellular mechanobiological systems. When these patterned substrates are exposed to external loading, strain field is primarily affected by the presence of microfabricated structures and similarly by fabrication-related defects. The capturing of such nonhomogeneous strain fields is of utmost importance in cases where study of the mechanical behavior with a high spatial resolution is necessary. Image-based non-contact strain measurement techniques are favorable and have recently been extended to scanning tunneling microscope and scanning electron microscope images for the characterization of mechanical properties of metallic materials, e.g. steel and aluminum, at the microscale. A similar real-time analysis of strain heterogeneity in elastomers is yet to be achieved during the entire loading sequence. The available measurement methods for polymeric materials mostly depend on cross-head displacement or precalibrated strain values. Thus, they suffer either from the lack of any real-time analysis, spatiotemporal distribution or high resolution in addition to a combination of these factors. In this work, these challenges are addressed by integrating a tensile stretcher with an inverted optical microscope and developing a subpixel particle tracking algorithm. As a proof of concept, the patterns with a critical dimension of 200 µm are generated on polydimethylsiloxane substrates and strain distribution in the vicinity of the patterns is captured with a high spatiotemporal resolution. In the field of strain measurement, there is always a tradeoff between minimum measurable strain value and spatial resolution. Current noncontact techniques on elastomers can deliver a strain resolution of 0.001% over a minimum length of 5 cm. More importantly, inhomogeneities within this quite large region cannot be captured. The proposed technique can

  20. Cognitive assessment of mice strains heterozygous for cell-adhesion genes reveals strain-specific alterations in timing.

    Science.gov (United States)

    Gallistel, C R; Tucci, Valter; Nolan, Patrick M; Schachner, Melitta; Jakovcevski, Igor; Kheifets, Aaron; Barboza, Luendro

    2014-03-05

    We used a fully automated system for the behavioural measurement of physiologically meaningful properties of basic mechanisms of cognition to test two strains of heterozygous mutant mice, Bfc (batface) and L1, and their wild-type littermate controls. Both of the target genes are involved in the establishment and maintenance of synapses. We find that the Bfc heterozygotes show reduced precision in their representation of interval duration, whereas the L1 heterozygotes show increased precision. These effects are functionally specific, because many other measures made on the same mice are unaffected, namely: the accuracy of matching temporal investment ratios to income ratios in a matching protocol, the rate of instrumental and classical conditioning, the latency to initiate a cued instrumental response, the trials on task and the impulsivity in a switch paradigm, the accuracy with which mice adjust timed switches to changes in the temporal constraints, the days to acquisition, and mean onset time and onset variability in the circadian anticipation of food availability.

  1. Exoproteome analysis reveals higher abundance of proteins linked to alkaline stress in persistent Listeria monocytogenes strains.

    Science.gov (United States)

    Rychli, Kathrin; Grunert, Tom; Ciolacu, Luminita; Zaiser, Andreas; Razzazi-Fazeli, Ebrahim; Schmitz-Esser, Stephan; Ehling-Schulz, Monika; Wagner, Martin

    2016-02-02

    The foodborne pathogen Listeria monocytogenes, responsible for listeriosis, a rare but severe infectious disease, can survive in the food processing environment for months or even years. So-called persistent L. monocytogenes strains greatly increase the risk of (re)contamination of food products, and are therefore a great challenge for food safety. However, our understanding of the mechanism underlying persistence is still fragmented. In this study we compared the exoproteome of three persistent strains with the reference strain EGDe under mild stress conditions using 2D differential gel electrophoresis. Principal component analysis including all differentially abundant protein spots showed that the exoproteome of strain EGDe (sequence type (ST) 35) is distinct from that of the persistent strain R479a (ST8) and the two closely related ST121 strains 4423 and 6179. Phylogenetic analyses based on multilocus ST genes showed similar grouping of the strains. Comparing the exoproteome of strain EGDe and the three persistent strains resulted in identification of 22 differentially expressed protein spots corresponding to 16 proteins. Six proteins were significantly increased in the persistent L. monocytogenes exoproteomes, among them proteins involved in alkaline stress response (e.g. the membrane anchored lipoprotein Lmo2637 and the NADPH dehydrogenase NamA). In parallel the persistent strains showed increased survival under alkaline stress, which is often provided during cleaning and disinfection in the food processing environments. In addition, gene expression of the proteins linked to stress response (Lmo2637, NamA, Fhs and QoxA) was higher in the persistent strain not only at 37 °C but also at 10 °C. Invasion efficiency of EGDe was higher in intestinal epithelial Caco2 and macrophage-like THP1 cells compared to the persistent strains. Concurrently we found higher expression of proteins involved in virulence in EGDe e.g. the actin-assembly-inducing protein ActA and the

  2. Antagonistic pleiotropy and fitness trade-offs reveal specialist and generalist traits in strains of canine distemper virus.

    Directory of Open Access Journals (Sweden)

    Veljko M Nikolin

    Full Text Available Theoretically, homogeneous environments favor the evolution of specialists whereas heterogeneous environments favor generalists. Canine distemper is a multi-host carnivore disease caused by canine distemper virus (CDV). The described cell receptor of CDV is SLAM (CD150). Attachment of CDV hemagglutinin protein (CDV-H) to this receptor facilitates fusion and virus entry in cooperation with the fusion protein (CDV-F). We investigated whether CDV strains co-evolved in the large, homogeneous domestic dog population exhibited specialist traits, and strains adapted to the heterogeneous environment of smaller populations of different carnivores exhibited generalist traits. Comparison of amino acid sequences of the SLAM binding region revealed higher similarity between sequences from Canidae species than to sequences from other carnivore families. Using an in vitro assay, we quantified syncytia formation mediated by CDV-H proteins from dog and non-dog CDV strains in cells expressing dog, lion or cat SLAM. CDV-H proteins from dog strains produced significantly higher values with cells expressing dog SLAM than with cells expressing lion or cat SLAM. CDV-H proteins from strains of non-dog species produced similar values in all three cell types, but lower values in cells expressing dog SLAM than the values obtained for CDV-H proteins from dog strains. By experimentally changing one amino acid (Y549H) in the CDV-H protein of one dog strain we decreased expression of specialist traits and increased expression of generalist traits, thereby confirming its functional importance. A virus titer assay demonstrated that dog strains produced higher titers in cells expressing dog SLAM than cells expressing SLAM of non-dog hosts, which suggested possible fitness benefits of specialization post-cell entry. We provide in vitro evidence for the expression of specialist and generalist traits by CDV strains, and fitness trade-offs across carnivore host environments caused by

  3. Antagonistic Pleiotropy and Fitness Trade-Offs Reveal Specialist and Generalist Traits in Strains of Canine Distemper Virus

    Science.gov (United States)

    Nikolin, Veljko M.; Osterrieder, Klaus; von Messling, Veronika; Hofer, Heribert; Anderson, Danielle; Dubovi, Edward; Brunner, Edgar; East, Marion L.

    2012-01-01

    Theoretically, homogeneous environments favor the evolution of specialists whereas heterogeneous environments favor generalists. Canine distemper is a multi-host carnivore disease caused by canine distemper virus (CDV). The described cell receptor of CDV is SLAM (CD150). Attachment of CDV hemagglutinin protein (CDV-H) to this receptor facilitates fusion and virus entry in cooperation with the fusion protein (CDV-F). We investigated whether CDV strains co-evolved in the large, homogeneous domestic dog population exhibited specialist traits, and strains adapted to the heterogeneous environment of smaller populations of different carnivores exhibited generalist traits. Comparison of amino acid sequences of the SLAM binding region revealed higher similarity between sequences from Canidae species than to sequences from other carnivore families. Using an in vitro assay, we quantified syncytia formation mediated by CDV-H proteins from dog and non-dog CDV strains in cells expressing dog, lion or cat SLAM. CDV-H proteins from dog strains produced significantly higher values with cells expressing dog SLAM than with cells expressing lion or cat SLAM. CDV-H proteins from strains of non-dog species produced similar values in all three cell types, but lower values in cells expressing dog SLAM than the values obtained for CDV-H proteins from dog strains. By experimentally changing one amino acid (Y549H) in the CDV-H protein of one dog strain we decreased expression of specialist traits and increased expression of generalist traits, thereby confirming its functional importance. A virus titer assay demonstrated that dog strains produced higher titers in cells expressing dog SLAM than cells expressing SLAM of non-dog hosts, which suggested possible fitness benefits of specialization post-cell entry. We provide in vitro evidence for the expression of specialist and generalist traits by CDV strains, and fitness trade-offs across carnivore host environments caused by antagonistic

  4. Multilocus Sequence Typing Reveals Relevant Genetic Variation and Different Evolutionary Dynamics among Strains of Xanthomonas arboricola pv. juglandis

    Directory of Open Access Journals (Sweden)

    Marco Scortichini

    2010-11-01

    Full Text Available Forty-five Xanthomonas arboricola pv. juglandis (Xaj) strains originating from Juglans regia cultivation in different countries were molecularly typed by means of MultiLocus Sequence Typing (MLST), using acnB, gapA, gyrB and rpoD gene fragments. A total of 2.5 kilobases was used to infer the phylogenetic relationship among the strains and possible recombination events. Haplotype diversity, linkage disequilibrium analysis, selection tests, gene flow estimates and codon adaptation index were also assessed. The dendrograms built by maximum likelihood with concatenated nucleotide and amino acid sequences revealed two major and two minor phylotypes. The same haplotype was found in strains originating from different continents, and different haplotypes were found in strains isolated in the same year from the same location. A recombination breakpoint was detected within the rpoD gene fragment. At the pathovar level, the Xaj populations studied here are clonal and under neutral selection. However, four Xaj strains isolated from walnut fruits with apical necrosis are under diversifying selection, suggesting a possible new adaptation. Gene flow estimates do not support the hypothesis of geographic isolation of the strains, even though the genetic diversity between the strains increases as the geographic distance between them increases. A triplet deletion, causing the absence of valine, was found in the rpoD fragment of all 45 Xaj strains when compared with X. axonopodis pv. citri strain 306. The codon adaptation index was high in all four genes studied, indicating a relevant metabolic activity.

  5. Laboratory-Cultured Strains of the Sea Anemone Exaiptasia Reveal Distinct Bacterial Communities

    KAUST Repository

    Herrera Sarrias, Marcela; Ziegler, Maren; Voolstra, Christian R.; Aranda, Manuel

    2017-01-01

    Exaiptasia is a laboratory sea anemone model system for stony corals. Two clonal strains are commonly used, referred to as H2 and CC7, that originate from two genetically distinct lineages and that differ in their Symbiodinium specificity. However, little is known about their other microbial associations. Here, we examined and compared the taxonomic composition of the bacterial assemblages of these two symbiotic Exaiptasia strains, both of which have been cultured in the laboratory long-term under identical conditions. We found distinct bacterial microbiota for each strain, indicating the presence of host-specific microbial consortia. Putative differences in the bacterial functional profiles (i.e., enrichment and depletion of various metabolic processes) based on taxonomic inference were also detected, further suggesting functional differences of the microbiomes associated with these lineages. Our study contributes to the current knowledge of the Exaiptasia holobiont by comparing the bacterial diversity of two commonly used strains as models for coral research.

  6. Laboratory-Cultured Strains of the Sea Anemone Exaiptasia Reveal Distinct Bacterial Communities

    KAUST Repository

    Herrera Sarrias, Marcela

    2017-05-02

    Exaiptasia is a laboratory sea anemone model system for stony corals. Two clonal strains are commonly used, referred to as H2 and CC7, that originate from two genetically distinct lineages and that differ in their Symbiodinium specificity. However, little is known about their other microbial associations. Here, we examined and compared the taxonomic composition of the bacterial assemblages of these two symbiotic Exaiptasia strains, both of which have been cultured in the laboratory long-term under identical conditions. We found distinct bacterial microbiota for each strain, indicating the presence of host-specific microbial consortia. Putative differences in the bacterial functional profiles (i.e., enrichment and depletion of various metabolic processes) based on taxonomic inference were also detected, further suggesting functional differences of the microbiomes associated with these lineages. Our study contributes to the current knowledge of the Exaiptasia holobiont by comparing the bacterial diversity of two commonly used strains as models for coral research.

  7. Hydrogen embrittlement of austenitic stainless steels revealed by deformation microstructures and strain-induced creation of vacancies

    International Nuclear Information System (INIS)

    Hatano, M.; Fujinami, M.; Arai, K.; Fujii, H.; Nagumo, M.

    2014-01-01

    Hydrogen embrittlement of austenitic stainless steels has been examined with respect to deformation microstructures and lattice defects created during plastic deformation. Two types of austenitic stainless steels, SUS 304 and SUS 316L, uniformly hydrogen-precharged to 30 mass ppm in a high-pressure hydrogen environment, are subjected to tensile straining at room temperature. A substantial reduction of tensile ductility appears in hydrogen-charged SUS 304 and the onset of fracture is likely due to plastic instability. Fractographic features show involvement of plasticity throughout the crack path, implying the degradation of the austenitic phase. Electron backscatter diffraction analyses revealed prominent strain localization enhanced by hydrogen in SUS 304. Deformation microstructures of hydrogen-charged SUS 304 were characterized by the formation of high densities of fine stacking faults and ε-martensite, while tangled dislocations prevailed in SUS 316L. Positron lifetime measurements have revealed for the first time hydrogen-enhanced creation of strain-induced vacancies rather than dislocations in the austenitic phase and more clustering of vacancies in SUS 304 than in SUS 316L. Embrittlement and its mechanism are ascribed to the decrease in stacking fault energies resulting in strain localization and hydrogen-enhanced creation of strain-induced vacancies, leading to premature fracture in a similar way to that proposed for ferritic steels

  8. Comparative genome analysis of pathogenic and non-pathogenic Clavibacter strains reveals adaptations to their lifestyle

    OpenAIRE

    Załuga, Joanna; Stragier, Pieter; Baeyen, Steve; Haegeman, Annelies; Van Vaerenbergh, Johan; Maes, Martine; De Vos, Paul

    2014-01-01

    Background The genus Clavibacter harbors economically important plant pathogens infecting agricultural crops such as potato and tomato. Although the vast majority of Clavibacter strains are pathogenic, there is an increasing number of non-pathogenic isolates reported. Non-pathogenic Clavibacter strains isolated from tomato seeds are particularly problematic because they affect the current detection and identification tests for Clavibacter michiganensis subsp. michiganensis (Cmm), which is reg...

  9. Plant root transcriptome profiling reveals a strain-dependent response during Azospirillum-rice cooperation

    Directory of Open Access Journals (Sweden)

    Benoît Drogue

    2014-11-01

    Full Text Available Cooperation involving Plant Growth-Promoting Rhizobacteria results in improvements of plant growth and health. While pathogenic and symbiotic interactions are known to induce transcriptional changes for genes related to plant defence and development, little is known about the impact of phytostimulating rhizobacteria on plant gene expression. This study aims at identifying genes significantly regulated in rice roots upon Azospirillum inoculation, considering possible favored interaction between a strain and its original host cultivar. Genome-wide analyses of Oryza sativa japonica cultivars Cigalon and Nipponbare were performed, by using microarrays, seven days post inoculation with A. lipoferum 4B (isolated from Cigalon) or Azospirillum sp. B510 (isolated from Nipponbare) and compared to the respective non-inoculated condition. A total of 7,384 genes were significantly regulated, which represent about 16% of total rice genes. A set of 34 genes is regulated by both Azospirillum strains in both cultivars, including a gene orthologous to PR10 of Brachypodium, and these could represent plant markers of Azospirillum-rice interactions. The results highlight a strain-dependent response of rice, with 83% of the differentially expressed genes being classified as combination-specific. Whatever the combination, most of the differentially expressed genes are involved in primary metabolism, transport, regulation of transcription and protein fate. When considering genes involved in response to stress and plant defence, it appears that strain B510, a strain displaying endophytic properties, leads to the repression of a wider set of genes than strain 4B. Individual genotypic variations could be the most important driving force of rice roots gene expression upon Azospirillum inoculation. Strain-dependent transcriptional changes observed for genes related to auxin and ethylene signalling highlight the complexity of hormone signalling networks in the Azospirillum

  10. Genetic Diversity among Rhizobium leguminosarum bv. Trifolii Strains Revealed by Allozyme and Restriction Fragment Length Polymorphism Analyses

    Science.gov (United States)

    Demezas, David H.; Reardon, Terry B.; Watson, John M.; Gibson, Alan H.

    1991-01-01

    Allozyme electrophoresis and restriction fragment length polymorphism (RFLP) analyses were used to examine the genetic diversity of a collection of 18 Rhizobium leguminosarum bv. trifolii, 1 R. leguminosarum bv. viciae, and 2 R. meliloti strains. Allozyme analysis at 28 loci revealed 16 electrophoretic types. The mean genetic distance between electrophoretic types of R. leguminosarum and R. meliloti was 0.83. Within R. leguminosarum, the single strain of bv. viciae differed at an average of 0.65 from strains of bv. trifolii, while electrophoretic types of bv. trifolii differed at a range of 0.23 to 0.62. Analysis of RFLPs around two chromosomal DNA probes also delineated 16 unique RFLP patterns and yielded genetic diversity similar to that revealed by the allozyme data. Analysis of RFLPs around three Sym (symbiotic) plasmid-derived probes demonstrated that the Sym plasmids reflect genetic divergence similar to that of their bacterial hosts. The large genetic distances between many strains precluded reliable estimates of their genetic relationships. PMID:16348600

  11. RNA-Seq Analyses for Two Silkworm Strains Reveals Insight into Their Susceptibility and Resistance to Beauveria bassiana Infection

    Directory of Open Access Journals (Sweden)

    Dongxu Xing

    2017-02-01

    Full Text Available The silkworm Bombyx mori is an economically important species. White muscardine caused by Beauveria bassiana is the main fungal disease in sericulture, and understanding the silkworm responses to B. bassiana infection is of particular interest. Herein, we investigated the molecular mechanisms underlying these responses in two silkworm strains Haoyue (HY, sensitive to B. bassiana) and Kang 8 (K8, resistant to B. bassiana) using an RNA-seq approach. For each strain, three biological replicates for immersion treatment, two replicates for injection treatment and three untreated controls were collected to generate 16 libraries for sequencing. Differentially expressed genes (DEGs) between treated samples and untreated controls, and between the two silkworm strains, were identified. DEGs and the enriched Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways of the two strains exhibited an obvious difference. Several genes encoding cuticle proteins, serine proteinase inhibitors (SPI) and antimicrobial peptides (AMP) and the drug metabolism pathway involved in toxin detoxification were considered to be related to the resistance of K8 to B. bassiana. These results revealed insight into the resistance and susceptibility of two silkworm strains against B. bassiana infection and provided a roadmap for silkworm molecular breeding to enhance its resistance to B. bassiana.

  12. RNA-Seq Analyses for Two Silkworm Strains Reveals Insight into Their Susceptibility and Resistance to Beauveria bassiana Infection.

    Science.gov (United States)

    Xing, Dongxu; Yang, Qiong; Jiang, Liang; Li, Qingrong; Xiao, Yang; Ye, Mingqiang; Xia, Qingyou

    2017-02-10

    The silkworm Bombyx mori is an economically important species. White muscardine caused by Beauveria bassiana is the main fungal disease in sericulture, and understanding the silkworm responses to B. bassiana infection is of particular interest. Herein, we investigated the molecular mechanisms underlying these responses in two silkworm strains Haoyue (HY, sensitive to B. bassiana) and Kang 8 (K8, resistant to B. bassiana) using an RNA-seq approach. For each strain, three biological replicates for immersion treatment, two replicates for injection treatment and three untreated controls were collected to generate 16 libraries for sequencing. Differentially expressed genes (DEGs) between treated samples and untreated controls, and between the two silkworm strains, were identified. DEGs and the enriched Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways of the two strains exhibited an obvious difference. Several genes encoding cuticle proteins, serine proteinase inhibitors (SPI) and antimicrobial peptides (AMP) and the drug metabolism pathway involved in toxin detoxification were considered to be related to the resistance of K8 to B. bassiana. These results revealed insight into the resistance and susceptibility of two silkworm strains against B. bassiana infection and provided a roadmap for silkworm molecular breeding to enhance its resistance to B. bassiana.

  13. Maternal mismatches in farmed tilapia strains (Oreochromis spp.) in the Philippines as revealed by mitochondrial COI gene.

    Science.gov (United States)

    Ordoñez, June Feliciano F; Ventolero, Minerva Fatimae H; Santos, Mudjekeewis D

    2017-07-01

    The introduction of genetically enhanced tilapia has significantly boosted the performance of Philippine aquaculture industry. While enhanced strains contribute to the increase in tilapia production, genetic characterization of present tilapia stocks is critical to maintain their quality and to ensure the genetic gains are sustained. To understand and determine the genetic relationship of the genetically enhanced strains produced in the Philippines, mitochondrial cytochrome oxidase subunit I (COI) gene using DNA barcoding approach was analyzed. Specimens representing 10 genetically enhanced strains (GIFT, FaST, GET-EXCEL, GST, SST, COLD, YY-male, GMT, Molobicus, and BEST), three red tilapia (Taiwan red, Florida red, and FAC-red), and two pure lines (initially identified as O. aureus and O. spilurus) were collected, sequenced, and identified using DNA barcoding. Results revealed that farmed tilapias consisted of four different Oreochromis species. As expected, COI could not distinguish individuals at the strain level but surprisingly, mismatch between the species of maternal origin and present-day offspring was observed. This particular result may pose a question on the genetic purity and integrity of the strains being distributed to farmers and suggests a re-evaluation of the effectiveness of major tilapia breeding centers in maintaining their stocks.

  14. Sequencing of bovine herpesvirus 4 v.test strain reveals important genome features

    Directory of Open Access Journals (Sweden)

    Gillet Laurent

    2011-08-01

    Full Text Available Abstract Background: Bovine herpesvirus 4 (BoHV-4) is a useful model for the human pathogenic gammaherpesviruses Epstein-Barr virus and Kaposi's Sarcoma-associated Herpesvirus. Although genome manipulations of this virus have been greatly facilitated by the cloning of the BoHV-4 V.test strain as a Bacterial Artificial Chromosome (BAC), the lack of a complete genome sequence for this strain limits its experimental use. Methods: In this study, we have determined the complete sequence of BoHV-4 V.test strain by a pyrosequencing approach. Results: The long unique coding region (LUR) consists of 108,241 bp encoding at least 79 open reading frames and is flanked by several polyrepetitive DNA units (prDNA). As previously suggested, we showed that the prDNA unit located at the left prDNA-LUR junction (prDNA-G) differs from the other prDNA units (prDNA-inner). Namely, the prDNA-G unit lacks the conserved pac-2 cleavage and packaging signal in its right terminal region. Based on the mechanisms of cleavage and packaging of herpesvirus genomes, this feature implies that only genomes bearing left and right end prDNA units are encapsulated into virions. Conclusions: In this study, we have determined the complete genome sequence of the BAC-cloned BoHV-4 V.test strain and identified genome organization features that could be important in other herpesviruses.

  15. Bordetella pertussis pertactin knock-out strains reveal immunomodulatory properties of this virulence factor.

    NARCIS (Netherlands)

    Hovingh, Elise Sofie; Mariman, Rob; Solans, Luis; Hijdra, Daniëlle; Hamstra, Hendrik-Jan; Jongerius, Ilse; van Gent, Marjolein; Mooi, Frits; Locht, Camille; Pinelli, Elena

    2018-01-01

    Whooping cough, caused by Bordetella pertussis, has resurged and presents a global health burden worldwide. B. pertussis strains unable to produce the acellular pertussis vaccine component pertactin (Prn), have been emerging and in some countries represent up to 95% of recent clinical isolates.

  16. An Inducible Operon Is Involved in Inulin Utilization in Lactobacillus plantarum Strains, as Revealed by Comparative Proteogenomics and Metabolic Profiling.

    Science.gov (United States)

    Buntin, Nirunya; Hongpattarakere, Tipparat; Ritari, Jarmo; Douillard, François P; Paulin, Lars; Boeren, Sjef; Shetty, Sudarshan A; de Vos, Willem M

    2017-01-15

    The draft genomes of Lactobacillus plantarum strains isolated from Asian fermented foods, infant feces, and shrimp intestines were sequenced and compared to those of well-studied strains. Among 28 strains of L. plantarum, variations in the genomic features involved in ecological adaptation were elucidated. The genome sizes ranged from approximately 3.1 to 3.5 Mb, of which about 2,932 to 3,345 protein-coding sequences (CDS) were predicted. The food-derived isolates contained a higher number of carbohydrate metabolism-associated genes than those from infant feces. This observation correlated to their phenotypic carbohydrate metabolic profile, indicating their ability to metabolize the largest range of sugars. Surprisingly, two strains (P14 and P76) isolated from fermented fish utilized inulin. β-Fructosidase, the inulin-degrading enzyme, was detected in the supernatants and cell wall extracts of both strains. No activity was observed in the cytoplasmic fraction, indicating that this key enzyme was either membrane-bound or extracellularly secreted. From genomic mining analysis, a predicted inulin operon of fosRABCDXE, which encodes β-fructosidase and many fructose transporting proteins, was found within the genomes of strains P14 and P76. Moreover, pts1BCA genes, encoding sucrose-specific IIBCA components involved in sucrose transport, were also identified. The proteomic analysis revealed the mechanism and functional characteristic of the fosRABCDXE operon involved in the inulin utilization of L. plantarum. The expression levels of the fos operon and pst genes were upregulated at mid-log phase. FosE and the LPXTG-motif cell wall anchored β-fructosidase were induced to a high abundance when inulin was present as a carbon source. Inulin is a long-chain carbohydrate that may act as a prebiotic, which provides many health benefits to the host by selectively stimulating the growth and activity of beneficial bacteria in the colon. While certain lactobacilli can catabolize

  17. Genomic comparison of invasive and rare non-invasive strains reveals Porphyromonas gingivalis genetic polymorphisms

    Directory of Open Access Journals (Sweden)

    Svetlana Dolgilevich

    2011-03-01

    Full Text Available Porphyromonas gingivalis strains are shown to invade human cells in vitro with different invasion efficiencies, varying by up to three orders of magnitude. We tested the hypothesis that invasion-associated interstrain genomic polymorphisms are present in P. gingivalis and that putative invasion-associated genes can contribute to P. gingivalis invasion. Using an invasive (W83) and the only available non-invasive P. gingivalis strain (AJW4) and whole genome microarrays followed by two separate software tools, we carried out comparative genomic hybridization (CGH) analysis. We identified 68 annotated and 51 hypothetical open reading frames (ORFs) that are polymorphic between these strains. Among these are surface proteins, lipoproteins, capsular polysaccharide biosynthesis enzymes, regulatory and immunoreactive proteins, integrases, and transposases often with abnormal GC content and clustered on the chromosome. Amplification of selected ORFs was used to validate the approach and the selection. Eleven clinical strains were investigated for the presence of selected ORFs. The putative invasion-associated ORFs were present in 10 of the isolates. The invasion ability of three isogenic mutants, carrying deletions in PG0185, PG0186, and PG0982 was tested. The PG0185 (ragA) and PG0186 (ragB) mutants had 5.1×10^3-fold and 3.6×10^3-fold decreased in vitro invasion ability, respectively. The annotation of divergent ORFs suggests deficiency in multiple genes as a basis for P. gingivalis non-invasive phenotype.

  18. Antifungal susceptibility profiles of 1698 yeast reference strains revealing potential emerging human pathogens.

    Directory of Open Access Journals (Sweden)

    Marie Desnos-Ollivier

    Full Text Available New molecular identification techniques and the increased number of patients with various immune defects or underlying conditions lead to the emergence and/or the description of novel species of human and animal fungal opportunistic pathogens. Antifungal susceptibility provides important information for ecological, epidemiological and therapeutic issues. The aim of this study was to assess the potential risk of the various species based on their antifungal drug resistance, keeping in mind the methodological limitations. Antifungal susceptibility profiles to the five classes of antifungal drugs (polyenes, azoles, echinocandins, allylamines and antimetabolites) were determined for 1698 yeast reference strains belonging to 992 species (634 Ascomycetes and 358 Basidiomycetes). Interestingly, geometric mean minimum inhibitory concentrations (MICs) of all antifungal drugs tested were significantly higher for Basidiomycetes compared to Ascomycetes (p<0.001). Twenty four strains belonging to 23 species of which 19 were Basidiomycetes seem to be intrinsically "resistant" to all drugs. Comparison of the antifungal susceptibility profiles of the 4240 clinical isolates and the 315 reference strains belonging to 53 shared species showed similar results. Even in the absence of demonstrated in vitro/in vivo correlation, knowing the in vitro susceptibility to systemic antifungal agents and the putative intrinsic resistance of yeast species present in the environment is important because they could become opportunistic pathogens.

  19. High-Resolution Typing Reveals Distinct Chlamydia trachomatis Strains in an At-Risk Population in Nanjing, China

    NARCIS (Netherlands)

    Bom, Reinier J. M.; van den Hoek, Anneke; Wang, Qianqiu; Long, Fuquan; de Vries, Henry J. C.; Bruisten, Sylvia M.

    2013-01-01

    We investigated Chlamydia trachomatis strains from Nanjing, China, and whether these strains differed from Amsterdam, the Netherlands. C. trachomatis type was determined with multilocus sequence typing. Most strains were specific to Nanjing, but some clustered with strains from Amsterdam. This

  20. Leuconostoc Strains Unable to Split a Lactose Analogue Revealed by Characterisation of Mesophilic Dairy Starters

    Directory of Open Access Journals (Sweden)

    Maarit Mäki

    2005-01-01

    Full Text Available Mesophilic starter cultures used in the dairy industry have traditionally been characterised by metabolic and biochemical methods. As closely related species of lactic acid bacteria often have only minor differences in phenotypic traits, which may also be variable within certain species, clear identification is often complicated. Therefore, techniques of molecular biology have been applied for rapid detection and differentiation of lactic acid bacteria. In this work, some bacterial clones isolated from mesophilic starters, which were preliminarily identified as lactococci by phenotypic methods, were found to be Leuconostoc strains by both PCR and PFGE. According to the results, genotypic differentiation methods used in combination with phenotypic tests provide a fast and convenient way to reliably identify lactic acid bacteria displaying atypical metabolic characteristics.

  1. Comparative Genomics of Early-Diverging Brucella Strains Reveals a Novel Lipopolysaccharide Biosynthesis Pathway

    Science.gov (United States)

    Wattam, Alice R.; Inzana, Thomas J.; Williams, Kelly P.; Mane, Shrinivasrao P.; Shukla, Maulik; Almeida, Nalvo F.; Dickerman, Allan W.; Mason, Steven; Moriyón, Ignacio; O’Callaghan, David; Whatmore, Adrian M.; Sobral, Bruno W.; Tiller, Rebekah V.; Hoffmaster, Alex R.; Frace, Michael A.; De Castro, Cristina; Molinaro, Antonio; Boyle, Stephen M.; De, Barun K.; Setubal, João C.

    2012-01-01

    ABSTRACT Brucella species are Gram-negative bacteria that infect mammals. Recently, two unusual strains (Brucella inopinata BO1T and B. inopinata-like BO2) have been isolated from human patients, and their similarity to some atypical brucellae isolated from Australian native rodent species was noted. Here we present a phylogenomic analysis of the draft genome sequences of BO1T and BO2 and of the Australian rodent strains 83-13 and NF2653 that shows that they form two groups well separated from the other sequenced Brucella spp. Several important differences were noted. Both BO1T and BO2 did not agglutinate significantly when live or inactivated cells were exposed to monospecific A and M antisera against O-side chain sugars composed of N-formyl-perosamine. While BO1T maintained the genes required to synthesize a typical Brucella O-antigen, BO2 lacked many of these genes but still produced a smooth LPS (lipopolysaccharide). Most missing genes were found in the wbk region involved in O-antigen synthesis in classic smooth Brucella spp. In their place, BO2 carries four genes that other bacteria use for making a rhamnose-based O-antigen. Electrophoretic, immunoblot, and chemical analyses showed that BO2 carries an antigenically different O-antigen made of repeating hexose-rich oligosaccharide units that made the LPS water-soluble, which contrasts with the homopolymeric O-antigen of other smooth brucellae that have a phenol-soluble LPS. The results demonstrate the existence of a group of early-diverging brucellae with traits that depart significantly from those of the Brucella species described thus far. PMID:22930339

  2. Multilocus sequence analysis (MLSA) of Bradyrhizobium strains: revealing high diversity of tropical diazotrophic symbiotic bacteria.

    Science.gov (United States)

    Delamuta, Jakeline Renata Marçon; Ribeiro, Renan Augusto; Menna, Pâmela; Bangel, Eliane Villamil; Hungria, Mariangela

    2012-04-01

    Symbiotic association between several genera of bacteria, collectively called rhizobia, and plants belonging to the family Leguminosae (=Fabaceae) results in the process of biological nitrogen fixation, playing a key role in global N cycling and making relevant contributions to agriculture. Bradyrhizobium is considered the ancestor of all nitrogen-fixing rhizobial species and probably originated in the tropics. The genus encompasses a variety of diverse bacteria, but the diversity captured in the analysis of the 16S rRNA is often low. In this study, we analyzed twelve Bradyrhizobium strains selected from previous studies performed by our group for showing high genetic diversity in relation to the described species. In addition to the 16S rRNA, five housekeeping genes (recA, atpD, glnII, gyrB and rpoB) were analyzed in the MLSA (multilocus sequence analysis) approach. Analysis of each gene and of the concatenated housekeeping genes captured a considerably higher level of genetic diversity, with indication of putative new species. The results highlight the high genetic variability associated with Bradyrhizobium microsymbionts of a variety of legumes. In addition, the MLSA approach has proved to be a rapid and reliable method for phylogenetic and taxonomic studies, speeding the identification of the still poorly known diversity of nitrogen-fixing rhizobia in the tropics.

  3. Genetic relationships between clinical and non-clinical strains of Yersinia enterocolitica biovar 1A as revealed by multilocus enzyme electrophoresis and multilocus restriction typing

    Directory of Open Access Journals (Sweden)

    Virdi Jugsharan S

    2010-05-01

    Full Text Available Abstract Background Genetic relationships among 81 strains of Y. enterocolitica biovar 1A isolated from clinical and non-clinical sources were discerned by multilocus enzyme electrophoresis (MLEE) and multilocus restriction typing (MLRT) using six loci each. Such studies may reveal associations between the genotypes of the strains and their sources of isolation. Results All loci were polymorphic and generated 62 electrophoretic types (ETs) and 12 restriction types (RTs). The mean genetic diversity (H) of the strains by MLEE and MLRT was 0.566 and 0.441 respectively. MLEE (DI = 0.98) was more discriminatory and clustered Y. enterocolitica biovar 1A strains into four groups, while MLRT (DI = 0.77) identified two distinct groups. BURST (Based Upon Related Sequence Types) analysis of the MLRT data suggested aquatic serotype O:6,30-6,31 isolates to be the ancestral strains from which clinical O:6,30-6,31 strains might have originated by host adaptation and genetic change. Conclusion MLEE revealed greater genetic diversity among strains of Y. enterocolitica biovar 1A and clustered strains in four groups, while MLRT grouped the strains into two groups. BURST analysis of MLRT data nevertheless provided newer insights into the probable evolution of clinical strains from aquatic strains.
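
    Illustrative note (not part of the cited record): the abstract above reports a discriminatory index (DI) for each typing method without giving the formula. A minimal sketch follows, assuming DI refers to Simpson's index of diversity in the form commonly used for typing schemes (Hunter and Gaston). The function name and the example labels are hypothetical and are not data from the study.

```python
from collections import Counter


def discriminatory_index(type_labels):
    """Simpson's index of diversity for a typing scheme.

    type_labels holds one label per strain (e.g. its electrophoretic type).
    Returns the probability that two strains drawn at random without
    replacement are assigned to different types.
    """
    n = len(type_labels)
    if n < 2:
        raise ValueError("at least two strains are required")
    counts = Counter(type_labels).values()
    # Number of ordered same-type pairs divided by all ordered pairs.
    same_type_pairs = sum(c * (c - 1) for c in counts)
    return 1.0 - same_type_pairs / (n * (n - 1))


if __name__ == "__main__":
    # Hypothetical labels: four strains, two of which share a type.
    print(discriminatory_index(["ET1", "ET1", "ET2", "ET3"]))
```

    A method that splits the same strains into more, smaller types leaves fewer same-type pairs and therefore scores closer to 1.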

  4. The complete genome sequence of Bacillus velezensis strain GH1-13 reveals agriculturally beneficial properties and a unique plasmid.

    Science.gov (United States)

    Kim, Sang Yoon; Song, Hajin; Sang, Mee Kyung; Weon, Hang-Yeon; Song, Jaekyeong

    2017-10-10

    The bacterial strain Bacillus velezensis GH1-13, isolated from rice paddy soil in Korea, has been shown to promote plant growth and have strong antagonistic activities against pathogens. Here, we report the complete genome sequence of GH1-13, revealing that it possesses a single 4,071,980-bp circular chromosome with 46.2% GC-content. The chromosome encodes 3,930 genes, and we have also identified a unique plasmid in the strain that encodes a further 104 genes (71,628bp and 31.7% GC-content). The genome was found to contain various enzyme-encoding operons, including indole-3-acetic acid (IAA) biosynthesis proteins, 2,3-butanediol dehydrogenase, various non-ribosomal peptide synthetases, and several polyketide synthases. These properties are responsible for the promotion of plant growth and the biosynthesis of secondary metabolites. They therefore have multiple beneficial effects that could be applied to agriculture. Through curing, we found that the unique plasmid of GH1-13 has important roles in the production of phytohormones, such as IAA, and in shaping phenotypic and physiological characteristics. The plasmid therefore likely influences the biological activities of GH1-13. The complete genome sequence of B. velezensis GH1-13 contributes to our understanding of this beneficial strain and will encourage research into its development for agricultural or biotechnological applications, enhancing productivity and crop quality. Copyright © 2017 Elsevier B.V. All rights reserved.
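
    Illustrative note (not part of the cited record): GC content, as quoted above for the chromosome and the plasmid, is the fraction of G and C bases in a sequence. The sketch below shows that calculation and a length-weighted combination across replicons. The function names are hypothetical; the lengths and percentages in the usage lines are the figures quoted in the abstract.

```python
def gc_content(sequence):
    """Fraction of bases in `sequence` that are G or C (case-insensitive)."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(1 for base in seq if base in "GC") / len(seq)


def combined_gc(replicons):
    """Length-weighted GC fraction over (length_bp, gc_fraction) pairs."""
    total_length = sum(length for length, _ in replicons)
    if total_length == 0:
        raise ValueError("total length must be positive")
    return sum(length * gc for length, gc in replicons) / total_length


if __name__ == "__main__":
    # Chromosome and plasmid sizes and GC fractions as quoted in the abstract.
    replicons = [(4_071_980, 0.462), (71_628, 0.317)]
    print(f"combined GC fraction: {combined_gc(replicons):.4f}")
```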

  5. Pyrosequencing Analysis Reveals Changes in Intestinal Microbiota of Healthy Adults Who Received a Daily Dose of Immunomodulatory Probiotic Strains

    Directory of Open Access Journals (Sweden)

    Julio Plaza-Díaz

    2015-05-01

    Full Text Available The colon microbiota plays a crucial role in human gastrointestinal health. Current attempts to manipulate the colon microbiota composition are aimed at finding remedies for various diseases. We have recently described the immunomodulatory effects of three probiotic strains (Lactobacillus rhamnosus CNCM I-4036, Lactobacillus paracasei CNCM I-4034, and Bifidobacterium breve CNCM I-4035). The goal of the present study was to analyze the compositions of the fecal microbiota of healthy adults who received one of these strains using high-throughput 16S ribosomal RNA gene sequencing. Bacteroides was the most abundant genus in the groups that received L. rhamnosus CNCM I-4036 or L. paracasei CNCM I-4034. The Shannon indices were significantly increased in these two groups. Our results also revealed a significant increase in the Lactobacillus genus after the intervention with L. rhamnosus CNCM I-4036. The initially different colon microbiota became homogeneous in the subjects who received L. rhamnosus CNCM I-4036. While some orders that were initially present disappeared after the administration of L. rhamnosus CNCM I-4036, other orders, such as Sphingobacteriales, Nitrospirales, Desulfobacterales, Thiotrichales, and Synergistetes, were detected after the intervention. In summary, our results show that the intake of these three bacterial strains induced changes in the colon microbiota.
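
    Illustrative note (not part of the cited record): the Shannon index mentioned above summarises how many taxa a sample contains and how evenly reads are spread across them. A minimal sketch follows, assuming the natural-log form H = -sum(p_i * ln p_i); the abstract does not state which log base was used. The function name and the counts are hypothetical.

```python
from math import log


def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln p_i) over per-taxon read counts.

    Taxa with a zero count contribute nothing and are skipped.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    return -sum((c / total) * log(c / total) for c in counts if c > 0)


if __name__ == "__main__":
    # Hypothetical read counts per genus in two samples.
    print(shannon_index([90, 5, 5]))    # dominated by one genus
    print(shannon_index([40, 30, 30]))  # more even, so a higher index
```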

  6. Comparative genomic analysis of Brucella abortus vaccine strain 104M reveals a set of candidate genes associated with its virulence attenuation.

    Science.gov (United States)

    Yu, Dong; Hui, Yiming; Zai, Xiaodong; Xu, Junjie; Liang, Long; Wang, Bingxiang; Yue, Junjie; Li, Shanhu

    2015-01-01

    The Brucella abortus strain 104M, a spontaneously attenuated strain, has been used as a vaccine strain in humans against brucellosis for 6 decades in China. Despite many studies, the molecular mechanisms that cause the attenuation are still unclear. Here, we determined the whole-genome sequence of 104M and conducted a comprehensive comparative analysis against the whole genome sequences of the virulent strain, A13334, and other reference strains. This analysis revealed a highly similar genome structure between 104M and A13334. The further comparative genomic analysis between 104M and A13334 revealed a set of genes missing in 104M. Some of these genes were identified to be directly or indirectly associated with virulence. Similarly, a set of mutations in the virulence-related genes was also identified, which may be related to virulence alteration. This study provides a set of candidate genes associated with virulence attenuation in B. abortus vaccine strain 104M.

  7. Sequence analysis of chromosome 1 revealed different selection patterns between Chinese wild mice and laboratory strains.

    Science.gov (United States)

    Xu, Fuyi; Hu, Shixian; Chao, Tianzhu; Wang, Maochun; Li, Kai; Zhou, Yuxun; Xu, Hongyan; Xiao, Junhua

    2017-10-01

    Both natural and artificial selection play a critical role in animals' adaptation to the environment. Detection of the signature of selection in genomic regions can provide insights for understanding the function of specific phenotypes. It is generally assumed that laboratory mice may experience intense artificial selection while wild mice experience more natural selection. However, the differences in selection signatures in the mouse genome and in the underlying genes between wild and laboratory mice remain unclear. In this study, we used two mouse populations, chromosome 1 (Chr 1) substitution lines (C1SLs) derived from Chinese wild mice and mouse genome project (MGP) sequenced inbred strains, and two selection detection statistics, Fst and Tajima's D, to identify signatures of selection on Chr 1. For the differentiation between the C1SLs and MGP, 110 candidate selection regions containing 47 protein coding genes were detected. A total of 149 selection regions encompassing 7.215 Mb were identified in the C1SLs by the Tajima's D approach. For the MGP, we identified nearly twice as many selection regions (243) as in the C1SLs, accounting for 13.27 Mb of Chr 1 sequence. Through functional annotation, we identified several biological processes with significant enrichment, including seven genes in the olfactory transduction pathway. In addition, we searched the phenotypes associated with the 47 candidate selection genes identified by Fst. These genes were involved in behavior, growth or body weight, mortality or aging, and immune systems, which align well with the phenotypic differences between wild and laboratory mice. Therefore, the findings should aid our understanding of the phenotypic differences between wild and laboratory mice and support the use of this new mouse resource (C1SLs) in further genetics studies.
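
    Illustrative note (not part of the cited record): Tajima's D, one of the two statistics named above, compares the mean number of pairwise differences with the number of segregating sites, scaled by an estimate of the standard deviation. The sketch below follows the standard textbook definition for a set of aligned sequences. It is not the pipeline used in the study (which is not described in the abstract), and the function name and toy alignment are hypothetical.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(sequences):
    """Tajima's D for equal-length aligned sequences (standard formula)."""
    n = len(sequences)
    if n < 4:
        # For n < 4 the variance term of the statistic is zero.
        raise ValueError("at least four sequences are required")
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to the same length")

    # S: number of alignment columns with more than one distinct base.
    seg_sites = sum(1 for col in zip(*sequences) if len(set(col)) > 1)
    if seg_sites == 0:
        raise ValueError("no segregating sites; D is undefined")

    # pi: mean number of differences over all pairs of sequences.
    pairs = list(combinations(sequences, 2))
    pi = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs) / len(pairs)

    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / sqrt(variance)


if __name__ == "__main__":
    # Toy alignment in which every variant is a singleton (excess of rare alleles).
    print(tajimas_d(["AAAA", "AAAT", "AATA", "ATAA"]))
```

    In genome scans the statistic is usually computed in windows along the chromosome; strongly negative or positive windows are then treated as candidate regions.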

  8. Comparative genome analysis of Prevotella intermedia strain isolated from infected root canal reveals features related to pathogenicity and adaptation.

    Science.gov (United States)

    Ruan, Yunfeng; Shen, Lu; Zou, Yan; Qi, Zhengnan; Yin, Jun; Jiang, Jie; Guo, Liang; He, Lin; Chen, Zijiang; Tang, Zisheng; Qin, Shengying

    2015-02-25

    Many species of the genus Prevotella are pathogens that cause oral diseases. Prevotella intermedia is known to cause various oral disorders, e.g. periodontal disease, periapical periodontitis and noma, as well as to colonize the respiratory tract and be associated with cystic fibrosis and chronic bronchitis. It is of clinical significance to identify the main drivers of its various adaptations and pathogenicity. In order to explore the intra-species genetic differences among strains of Prevotella intermedia from different niches, we isolated a strain, Prevotella intermedia ZT, from the infected root canal of a Chinese patient with periapical periodontitis and obtained a draft genome sequence. We annotated the genome and compared it with the genomes of other taxa in the genus Prevotella. The raw data set, consisting of approximately 65X-coverage reads, was trimmed and assembled into contigs from which 2165 ORFs were predicted. The comparison of the Prevotella intermedia ZT genome sequence with the published genome sequences of Prevotella intermedia 17 and Prevotella intermedia ATCC25611 revealed that ~14% of the genes were strain-specific. The Prevotella intermedia strains share a set of conserved genes contributing to their adaptation and pathogenicity and possess strain-specific genes, especially those involved in adhesion and bacteriocin secretion. Prevotella intermedia ZT shares similar gene content with other taxa of the genus Prevotella. The genomes of the genus Prevotella are highly dynamic with relatively conserved parts: on average, about half of the genes in one Prevotella genome were not included in another genome of a different Prevotella species. The degree of conservation varied with different pathways: the ability of amino acid biosynthesis varied greatly with species but the pathways of cell wall component biosynthesis were nearly constant. Phylogenetic tree shows that the taxa from different niches are scarcely distributed among clades. Prevotella intermedia ZT

  9. Genetic basis of olfactory cognition: extremely high level of DNA sequence polymorphism in promoter regions of the human olfactory receptor genes revealed using the 1000 Genomes Project dataset.

    Science.gov (United States)

    Ignatieva, Elena V; Levitsky, Victor G; Yudin, Nikolay S; Moshkin, Mikhail P; Kolchanov, Nikolay A

    2014-01-01

    The molecular mechanism of olfactory cognition is very complicated. Olfactory cognition is initiated by olfactory receptor proteins (odorant receptors), which are activated by olfactory stimuli (ligands). Olfactory receptors are the initial player in the signal transduction cascade producing a nerve impulse, which is transmitted to the brain. The sensitivity to a particular ligand depends on the expression level of multiple proteins involved in the process of olfactory cognition: olfactory receptor proteins, proteins that participate in signal transduction cascade, etc. The expression level of each gene is controlled by its regulatory regions, and especially, by the promoter [a region of DNA about 100-1000 base pairs long located upstream of the transcription start site (TSS)]. We analyzed single nucleotide polymorphisms using human whole-genome data from the 1000 Genomes Project and revealed an extremely high level of single nucleotide polymorphisms in promoter regions of olfactory receptor genes and HLA genes. We hypothesized that the high level of polymorphisms in olfactory receptor promoters was responsible for the diversity in regulatory mechanisms controlling the expression levels of olfactory receptor proteins. Such diversity of regulatory mechanisms may cause the great variability of olfactory cognition of numerous environmental olfactory stimuli perceived by human beings (air pollutants, human body odors, odors in culinary etc.). In turn, this variability may provide a wide range of emotional and behavioral reactions related to the vast variety of olfactory stimuli.

  10. Genetic basis of olfactory cognition: extremely high level of DNA sequence polymorphism in promoter regions of the human olfactory receptor genes revealed using the 1000 Genomes Project dataset

    Directory of Open Access Journals (Sweden)

    Elena V. Ignatieva

    2014-03-01

    Full Text Available The molecular mechanism of olfactory cognition is very complicated. Olfactory cognition is initiated by olfactory receptor proteins (odorant receptors), which are activated by olfactory stimuli (ligands). Olfactory receptors are the initial player in the signal transduction cascade producing a nerve impulse, which is transmitted to the brain. The sensitivity to a particular ligand depends on the expression level of multiple proteins involved in the process of olfactory cognition: olfactory receptor proteins, proteins that participate in signal transduction cascade, etc. The expression level of each gene is controlled by its regulatory regions, and especially, by the promoter (a region of DNA about 100–1000 base pairs long located upstream of the transcription start site). We analyzed single nucleotide polymorphisms using human whole-genome data from the 1000 Genomes Project and revealed an extremely high level of single nucleotide polymorphisms in promoter regions of olfactory receptor genes and HLA genes. We hypothesized that the high level of polymorphisms in olfactory receptor promoters was responsible for the diversity in regulatory mechanisms controlling the expression levels of olfactory receptor proteins. Such diversity of regulatory mechanisms may cause the great variability of olfactory cognition of numerous environmental olfactory stimuli perceived by human beings (air pollutants, human body odors, odors in culinary etc.). In turn, this variability may provide a wide range of emotional and behavioral reactions related to the vast variety of olfactory stimuli.

  11. Whole genome PCR scanning reveals the syntenic genome structure of toxigenic Vibrio cholerae strains in the O1/O139 population.

    Directory of Open Access Journals (Sweden)

    Bo Pang

    Full Text Available Vibrio cholerae is commonly found in estuarine water systems. Toxigenic O1 and O139 V. cholerae strains have caused cholera epidemics and pandemics, whereas the nontoxigenic strains within these serogroups only occasionally lead to disease. To understand the differences in the genome and clonality between the toxigenic and nontoxigenic strains of V. cholerae serogroups O1 and O139, we employed a whole genome PCR scanning (WGPScanning) method, an rrn operon-mediated fragment rearrangement analysis and comparative genomic hybridization (CGH) to analyze the genome structure of different strains. WGPScanning in conjunction with CGH revealed that the genomic contents of the toxigenic strains were conservative, except for a few indels located mainly in mobile elements. Minor nucleotide variation in orthologous genes appeared to be the major difference between the toxigenic strains. rrn operon-mediated rearrangements were infrequent in El Tor toxigenic strains tested using I-CeuI digested pulsed-field gel electrophoresis (PFGE) analysis and PCR analysis based on flanking sequence of rrn operons. Using these methods, we found that the genomic structures of toxigenic El Tor and O139 strains were syntenic. The nontoxigenic strains exhibited more extensive sequence variations, but toxin coregulated pilus positive (TCP+) strains had a similar structure. TCP+ nontoxigenic strains could be subdivided into multiple lineages according to the TCP type, suggesting the existence of complex intermediates in the evolution of toxigenic strains. The data indicate that toxigenic O1 El Tor and O139 strains were derived from a single lineage of intermediates from complex clones in the environment. The nontoxigenic strains with non-El Tor type TCP may yet evolve into new epidemic clones after attaining toxigenic attributes.

  12. Multi-gene phylogenetic analysis reveals that shochu-fermenting Saccharomyces cerevisiae strains form a distinct sub-clade of the Japanese sake cluster.

    Science.gov (United States)

    Futagami, Taiki; Kadooka, Chihiro; Ando, Yoshinori; Okutsu, Kayu; Yoshizaki, Yumiko; Setoguchi, Shinji; Takamine, Kazunori; Kawai, Mikihiko; Tamaki, Hisanori

    2017-10-01

    Shochu is a traditional Japanese distilled spirit. The formation of the distinguishing flavour of shochu produced in individual distilleries is attributed to putative indigenous yeast strains. In this study, we performed the first (to our knowledge) phylogenetic classification of shochu strains based on nucleotide gene sequences. We performed phylogenetic classification of 21 putative indigenous shochu yeast strains isolated from 11 distilleries. All of these strains were shown or confirmed to be Saccharomyces cerevisiae, sharing species identification with 34 known S. cerevisiae strains (including commonly used shochu, sake, ale, whisky, bakery, bioethanol and laboratory yeast strains and clinical isolate) that were tested in parallel. Our analysis used five genes that reflect genome-level phylogeny for the strain-level classification. In a first step, we demonstrated that partial regions of the ZAP1, THI7, PXL1, YRR1 and GLG1 genes were sufficient to reproduce previous sub-species classifications. In a second step, these five analysed regions from each of 25 strains (four commonly used shochu strains and the 21 putative indigenous shochu strains) were concatenated and used to generate a phylogenetic tree. Further analysis revealed that the putative indigenous shochu yeast strains form a monophyletic group that includes both the shochu yeasts and a subset of the sake group strains; this cluster is a sister group to other sake yeast strains, together comprising a sake-shochu group. Differences among shochu strains were small, suggesting that it may be possible to correlate subtle phenotypic differences among shochu flavours with specific differences in genome sequences. Copyright © 2017 John Wiley & Sons, Ltd.

  13. Analysis of integrated multiple 'omics' datasets reveals the mechanisms of initiation and determination in the formation of tuberous roots in Rehmannia glutinosa.

    Science.gov (United States)

    Li, Mingjie; Yang, Yanhui; Li, Xinyu; Gu, Li; Wang, Fengji; Feng, Fajie; Tian, Yunhe; Wang, Fengqing; Wang, Xiaoran; Lin, Wenxiong; Chen, Xinjian; Zhang, Zhongyi

    2015-09-01

    All tuberous roots in Rehmannia glutinosa originate from the expansion of fibrous roots (FRs), but not all FRs can successfully transform into tuberous roots. This study identified differentially expressed genes and proteins associated with the expansion of FRs, by comparing the tuberous root at expansion stages (initiated tuberous root, ITRs) and FRs at the seedling stage (initiated FRs, IFRs). The role of miRNAs in the expansion of FRs was also explored using the sRNA transcriptome and degradome to identify miRNAs and their target genes that were differentially expressed between ITRs and FRs at the mature stage (unexpanded FRs, UFRs, which are unable to expand into ITRs). A total of 6032 genes and 450 proteins were differentially expressed between ITRs and IFRs. Integrated analyses of these data revealed several genes and proteins involved in light signalling, hormone response, and signal transduction that might participate in the induction of tuberous root formation. Several genes related to cell division and cell wall metabolism were involved in initiating the expansion of IFRs. Of 135 miRNAs differentially expressed between ITRs and UFRs, there were 27 miRNAs whose targets were specifically identified in the degradome. Analysis of target genes showed that several miRNAs specifically expressed in UFRs were involved in the degradation of key genes required for the formation of tuberous roots. As far as could be ascertained, this is the first time that the miRNAs that control the transition of FRs to tuberous roots in R. glutinosa have been identified. This comprehensive analysis of 'omics' data sheds new light on the mechanisms involved in the regulation of tuberous roots formation. © The Author 2015. Published by Oxford University Press on behalf of the Society for Experimental Biology. All rights reserved. For permissions, please email: journals.permissions@oup.com.

  14. Sequence analysis of measles virus strains collected during the pre- and early-vaccination era in Denmark reveals a considerable diversity of ancient strains

    DEFF Research Database (Denmark)

    Christensen, Laurids Siig; Schöller, S.; Schierup, M. H.

    2002-01-01

    A total of 199 serum samples from patients with measles collected in Denmark, Greenland and the Faroe Islands from 1964 to 1983 were analysed by PCR. Measles virus (MV) RNA could be detected in 38 (19%) of the samples and a total of 18 strains were subjected to partial sequence analysis of the hemagglutinin gene. The strains exhibited a considerable genomic diversity, which is at odds with the assumption that one genome type prevailed among globally circulating MV strains prior to the advent of live-attenuated vaccines. Our data indicate that the similarity of the various vaccine strains is attributed to their having originated from the same primary isolate. Consequently, it is implied that a small number of clinical manifestations of MV worldwide from which strains similar to the vaccine strain were identified were vaccine related rather than being caused by members of a persistently...

  15. A novel system for tracking social preference dynamics in mice reveals sex- and strain-specific characteristics.

    Science.gov (United States)

    Netser, Shai; Haskal, Shani; Magalnik, Hen; Wagner, Shlomo

    2017-01-01

    Deciphering the biological mechanisms underlying social behavior in animal models requires standard behavioral paradigms that can be unbiasedly employed in an observer- and laboratory-independent manner. During the past decade, the three-chamber test has become such a standard paradigm used to evaluate social preference (sociability) and social novelty preference in mice. This test suffers from several caveats, including its reliance on spatial navigation skills and negligence of behavioral dynamics. Here, we present a novel experimental apparatus and an automated analysis system which offer an alternative to the three-chamber test while solving the aforementioned caveats. The custom-made apparatus is simple to produce, and the analysis system is publicly available as open-source software, enabling its free use. We used this system to compare the dynamics of social behavior during the social preference and social novelty preference tests between male and female C57BL/6J mice. We found that in both tests, male mice keep their preference towards one of the stimuli for longer periods than females. We then employed our system to define several new parameters of social behavioral dynamics in mice and revealed that social preference behavior is segregated in time into two distinct phases. An early exploration phase, characterized by a high rate of transitions between stimuli and short bouts of stimulus investigation, is followed by an interaction phase with a low transition rate and prolonged interactions, mainly with the preferred stimulus. Finally, we compared the dynamics of social behavior between C57BL/6J and BTBR male mice, the latter of which are considered an asocial strain serving as a model for autism spectrum disorder. We found that BTBR mice (n = 8) showed a specific deficit in transition from the exploration phase to the interaction phase in the social preference test, suggesting a reduced tendency towards social interaction. We successfully

  16. Whole genome sequencing reveals complex evolution patterns of multidrug-resistant Mycobacterium tuberculosis Beijing strains in patients.

    Directory of Open Access Journals (Sweden)

    Matthias Merker

    Full Text Available Multidrug-resistant (MDR) Mycobacterium tuberculosis complex (MTBC) strains represent a major threat for tuberculosis (TB) control. Treatment of MDR-TB patients is long and less effective, resulting in a significant number of treatment failures. The development of further resistances leads to extensively drug-resistant (XDR) variants. However, data on the individual reasons for treatment failure, e.g. an induced mutational burst, and on the evolution of bacteria in the patient are only sparsely available. To address this question, we investigated the intra-patient evolution of serial MTBC isolates obtained from three MDR-TB patients undergoing longitudinal treatment, finally leading to XDR-TB. Sequential isolates displayed identical IS6110 fingerprint patterns, suggesting the absence of exogenous re-infection. We utilized whole genome sequencing (WGS) to screen for variations in three isolates from Patient A and four isolates from Patient B and C, respectively. Acquired polymorphisms were subsequently validated in up to 15 serial isolates by Sanger sequencing. We determined eight (Patient A) and nine (Patient B) polymorphisms, which occurred in a stepwise manner during the course of the therapy and were linked to resistance or a potential compensatory mechanism. For both patients, our analysis revealed the long-term co-existence of clonal subpopulations that displayed different drug resistance allele combinations. Out of these, the most resistant clone was fixed in the population. In contrast, baseline and follow-up isolates of Patient C were distinguished each by eleven unique polymorphisms, indicating an exogenous re-infection with an XDR strain not detected by IS6110 RFLP typing. Our study demonstrates that intra-patient microevolution of MDR-MTBC strains under longitudinal treatment is more complex than previously anticipated. However, a mutator phenotype was not detected. The presence of different subpopulations might confound phenotypic and

  17. Biochemical and genetical analysis reveal a new clade of biovar 3 Dickeya spp. strains isolated from potato in Europe

    NARCIS (Netherlands)

    Slawiak, M.; Beckhoven, van J.R.C.M.; Speksnijder, A.G.C.L.; Czajkowski, R.L.; Grabe, G.; Wolf, van der J.M.

    2009-01-01

    Sixty-five potato strains of the soft rot-causing plant pathogenic bacterium Dickeya spp., and two strains from hyacinth, were characterised using biochemical assays, REP-PCR genomic finger printing, 16S rDNA and dnaX sequence analysis. These methods were compared with nineteen strains representing

  18. Genetic diversity and population structure of Iranian wild Pleurotus eryngii species-complex strains revealed by URP-PCR markers

    NARCIS (Netherlands)

    Behnamian, Mahdi; Mohammadi, Seyed A.; Sonnenberg, A.S.M.; Goltapeh, Ebrahim M.; Hendrickx, P.M.

    2010-01-01

    In the present study, a set of 68 P. eryngii wild strains collected from nine locations in northwest and west of Iran along with six commercial strains were studied using universal rice primers (URP). The wild strains were isolated from Ferula ovina, F. haussknechtii, Cachrys ferulacea, Kellusia

  19. Whole-Genome Analysis of Three Yeast Strains Used for Production of Sherry-Like Wines Revealed Genetic Traits Specific to Flor Yeasts

    Science.gov (United States)

    Eldarov, Mikhail A.; Beletsky, Alexey V.; Tanashchuk, Tatiana N.; Kishkovskaya, Svetlana A.; Ravin, Nikolai V.; Mardanov, Andrey V.

    2018-01-01

    Flor yeast strains represent a specialized group of Saccharomyces cerevisiae yeasts used for biological wine aging. We have sequenced the genomes of three flor strains originating from different geographic regions and used for production of sherry-like wines in Russia. According to the obtained phylogeny of 118 yeast strains, flor strains form a very tight cluster adjacent to the main wine clade. SNP analysis versus available genomes of wine and flor strains revealed 2,270 genetic variants in 1,337 loci specific to flor strains. Gene ontology analysis in combination with gene content evaluation revealed a complex landscape of possibly adaptive genetic changes in flor yeast, related to genes associated with cell morphology, mitotic cell cycle, ion homeostasis, DNA repair, carbohydrate metabolism, lipid metabolism, and cell wall biogenesis. Pangenomic analysis discovered the presence of several well-known “non-reference” loci of potential industrial importance. Events of gene loss included deletions of asparaginase genes, maltose utilization locus, and FRE-FIT locus involved in iron transport. The latter in combination with a flor-yeast-specific mutation in the Aft1 transcription factor gene is likely to be responsible for the discovered phenotype of increased iron sensitivity and improved iron uptake of analyzed strains. Expansion of the coding region of the FLO11 flocculin gene and alteration of the balance between members of the FLO gene family are likely to positively affect the well-known propensity of flor strains for velum formation. Our study provides new insights into the nature of genetic variation in flor yeast strains and demonstrates that different adaptive properties of flor yeast strains could have evolved through different mechanisms of genetic variation. PMID:29867869

  20. EPA Nanorelease Dataset

    Data.gov (United States)

    U.S. Environmental Protection Agency — EPA Nanorelease Dataset. This dataset is associated with the following publication: Wohlleben, W., C. Kingston, J. Carter, E. Sahle-Demessie, S. Vazquez-Campos, B....

  1. Sequencing of emerging canine distemper virus strain reveals new distinct genetic lineage in the United States associated with disease in wildlife and domestic canine populations.

    Science.gov (United States)

    Riley, Matthew C; Wilkes, Rebecca P

    2015-12-18

    Recent outbreaks of canine distemper have prompted examination of strains from clinical samples submitted to the University of Tennessee College of Veterinary Medicine (UTCVM) Clinical Virology Lab. We previously described a new strain of CDV that significantly diverged from all genotypes reported to date including America 2, the genotype proposed to be the main lineage currently circulating in the US. The aim of this study was to determine when this new strain appeared and how widespread it is in animal populations, given that it has also been detected in fully vaccinated adult dogs. Additionally, we sequenced complete viral genomes to characterize the strain and determine if variation is confined to known variable regions of the genome or if the changes are also present in more conserved regions. Archived clinical samples were genotyped using real-time RT-PCR amplification and sequencing. The genomes of two unrelated viruses from a dog and fox each from a different state were sequenced and aligned with previously published genomes. Phylogenetic analysis was performed using coding, non-coding and genome-length sequences. Virus neutralization assays were used to evaluate potential antigenic differences between this strain and a vaccine strain and mixed ANOVA test was used to compare the titers. Genotyping revealed this strain first appeared in 2011 and was detected in dogs from multiple states in the Southeast region of the United States. It was the main strain detected among the clinical samples that were typed from 2011-2013, including wildlife submissions. Genome sequencing demonstrated that it is highly conserved within a new lineage and preliminary serologic testing showed significant differences in neutralizing antibody titers between this strain and the strain commonly used in vaccines. This new strain represents an emerging CDV in domestic dogs in the US, may be associated with a stable reservoir in the wildlife population, and could facilitate vaccine

  2. Resistance of Permafrost and Modern Acinetobacter lwoffii Strains to Heavy Metals and Arsenic Revealed by Genome Analysis.

    Science.gov (United States)

    Mindlin, Sofia; Petrenko, Anatolii; Kurakov, Anton; Beletsky, Alexey; Mardanov, Andrey; Petrova, Mayya

    2016-01-01

    We performed whole-genome sequencing of five permafrost strains of Acinetobacter lwoffii (frozen for 15-3000 thousand years) and analyzed their resistance genes found in plasmids and chromosomes. Four strains contained multiple plasmids (8-12), which varied significantly in size (from 4,135 to 287,630 bp) and genetic structure; the fifth strain contained only two plasmids. All large plasmids and some medium-size and small plasmids contained genes encoding resistance to various heavy metals, including mercury, cobalt, zinc, cadmium, copper, chromium, and arsenic compounds. Most resistance genes found in the ancient strains of A. lwoffii had their closely related counterparts in modern clinical A. lwoffii strains that were also located on plasmids. The vast majority of the chromosomal resistance determinants did not possess complete sets of the resistance genes or contained truncated genes. Comparative analysis of various A. lwoffii and of A. baumannii strains discovered a number of differences between them: (i) chromosome sizes in A. baumannii exceeded those in A. lwoffii by about 20%; (ii) on the contrary, the number of plasmids in A. lwoffii and their total size were much higher than those in A. baumannii; (iii) heavy metal resistance genes in the environmental A. lwoffii strains surpassed those in A. baumannii strains in the number and diversity and were predominantly located on plasmids. Possible reasons for these differences are discussed.

  3. Polymorphism of Paramecium pentaurelia (Ciliophora, Oligohymenophorea) strains revealed by rDNA and mtDNA sequences.

    Science.gov (United States)

    Przyboś, Ewa; Tarcz, Sebastian; Greczek-Stachura, Magdalena; Surmacz, Marta

    2011-05-01

    Paramecium pentaurelia is one of 15 known sibling species of the Paramecium aurelia complex. It is recognized as a species showing no intra-specific differentiation on the basis of molecular fingerprint analyses, whereas the majority of other species are polymorphic. This study aimed at assessing genetic polymorphism within P. pentaurelia including new strains recently found in Poland (originating from two water bodies, different years, seasons, and clones of one strain) as well as strains collected from distant habitats (USA, Europe, Asia), and strains representing other species of the complex. We compared two DNA fragments: partial sequences (349 bp) of the LSU rDNA and partial sequences (618 bp) of cytochrome B gene. A correlation between the geographical origin of the strains and the genetic characteristics of their genotypes was not observed. Different genotypes were found in Kraków in two types of water bodies (Opatkowice-natural pond; Jordan's Park-artificial pond). Haplotype diversity within a single water body was not recorded. Likewise, seasonal haplotype differences between the strains within the artificial water body, as well as differences between clones originating from one strain, were not detected. The clustering of some strains belonging to different species was observed in the phylogenies. Copyright © 2010 Elsevier GmbH. All rights reserved.

  4. Phylogenetic and genome-wide deep-sequencing analyses of canine parvovirus reveal co-infection with field variants and emergence of a recent recombinant strain.

    Directory of Open Access Journals (Sweden)

    Ruben Pérez

    Full Text Available Canine parvovirus (CPV), a fast-evolving single-stranded DNA virus, comprises three antigenic variants (2a, 2b, and 2c) with different frequencies and genetic variability among countries. The contribution of co-infection and recombination to the genetic variability of CPV is far from being fully elucidated. Here we took advantage of a natural CPV population, recently formed by the convergence of divergent CPV-2c and CPV-2a strains, to study co-infection and recombination. Complete sequences of the viral coding region of CPV-2a and CPV-2c strains from 40 samples were generated and analyzed using phylogenetic tools. Two samples showed co-infection and were further analyzed by deep sequencing. The sequence profile of one of the samples revealed the presence of CPV-2c and CPV-2a strains that differed at 29 nucleotides. The other sample included a minor CPV-2a strain (13.3% of the viral population) and a major recombinant strain (86.7%). The recombinant strain arose from inter-genotypic recombination between CPV-2c and CPV-2a strains within the VP1/VP2 gene boundary. Our findings highlight the importance of deep-sequencing analysis to provide a better understanding of CPV molecular diversity.

  5. Phylogenetic and Genome-Wide Deep-Sequencing Analyses of Canine Parvovirus Reveal Co-Infection with Field Variants and Emergence of a Recent Recombinant Strain

    Science.gov (United States)

    Pérez, Ruben; Calleros, Lucía; Marandino, Ana; Sarute, Nicolás; Iraola, Gregorio; Grecco, Sofia; Blanc, Hervé; Vignuzzi, Marco; Isakov, Ofer; Shomron, Noam; Carrau, Lucía; Hernández, Martín; Francia, Lourdes; Sosa, Katia; Tomás, Gonzalo; Panzera, Yanina

    2014-01-01

    Canine parvovirus (CPV), a fast-evolving single-stranded DNA virus, comprises three antigenic variants (2a, 2b, and 2c) with different frequencies and genetic variability among countries. The contribution of co-infection and recombination to the genetic variability of CPV is far from being fully elucidated. Here we took advantage of a natural CPV population, recently formed by the convergence of divergent CPV-2c and CPV-2a strains, to study co-infection and recombination. Complete sequences of the viral coding region of CPV-2a and CPV-2c strains from 40 samples were generated and analyzed using phylogenetic tools. Two samples showed co-infection and were further analyzed by deep sequencing. The sequence profile of one of the samples revealed the presence of CPV-2c and CPV-2a strains that differed at 29 nucleotides. The other sample included a minor CPV-2a strain (13.3% of the viral population) and a major recombinant strain (86.7%). The recombinant strain arose from inter-genotypic recombination between CPV-2c and CPV-2a strains within the VP1/VP2 gene boundary. Our findings highlight the importance of deep-sequencing analysis to provide a better understanding of CPV molecular diversity. PMID:25365348

  6. Analysis of cagA in Helicobacter pylori strains from Colombian populations with contrasting gastric cancer risk reveals a biomarker for disease severity

    Science.gov (United States)

    Loh, John T.; Shaffer, Carrie L.; Piazuelo, M. Blanca; Bravo, Luis E.; McClain, Mark S.; Correa, Pelayo; Cover, Timothy L.

    2011-01-01

    BACKGROUND Helicobacter pylori infection is a risk factor for the development of gastric cancer, and the bacterial oncoprotein CagA contributes to gastric carcinogenesis. METHODS We analyzed H. pylori isolates from persons in Colombia and observed that there was marked variation among strains in levels of CagA expression. To elucidate the basis for this variation, we analyzed sequences upstream from the CagA translational initiation site in each strain. RESULTS A DNA motif (AATAAGATA) upstream of the translational initiation site of CagA was associated with high levels of CagA expression. Experimental studies showed that this motif was necessary but not sufficient for high-level CagA expression. H. pylori strains from a region of Colombia with high gastric cancer rates expressed higher levels of CagA than did strains from a region with lower gastric cancer rates, and Colombian strains of European phylogeographic origin expressed higher levels of CagA than did strains of African origin. Histopathological analysis of gastric biopsy specimens revealed that strains expressing high levels of CagA or containing the AATAAGATA motif were associated with more advanced precancerous lesions than those found in persons infected with strains expressing low levels of CagA or lacking the AATAAGATA motif. CONCLUSIONS CagA expression varies greatly among H. pylori strains. The DNA motif identified in this study is associated with high levels of CagA expression, and may be a useful biomarker to predict gastric cancer risk. IMPACT These findings help to explain why some persons infected with cagA-positive H. pylori develop gastric cancer and others do not. PMID:21859954

  7. Feature tracking CMR reveals abnormal strain in preclinical arrhythmogenic right ventricular dysplasia/ cardiomyopathy: a multisoftware feasibility and clinical implementation study.

    Science.gov (United States)

    Bourfiss, Mimount; Vigneault, Davis M; Aliyari Ghasebeh, Mounes; Murray, Brittney; James, Cynthia A; Tichnell, Crystal; Mohamed Hoesein, Firdaus A; Zimmerman, Stefan L; Kamel, Ihab R; Calkins, Hugh; Tandri, Harikrishna; Velthuis, Birgitta K; Bluemke, David A; Te Riele, Anneline S J M

    2017-09-01

    Regional right ventricular (RV) dysfunction is the hallmark of Arrhythmogenic Right Ventricular Dysplasia/Cardiomyopathy (ARVD/C), but is currently only qualitatively evaluated in the clinical setting. Feature Tracking Cardiovascular Magnetic Resonance (FT-CMR) is a novel quantitative method that uses cine CMR to calculate strain values. However, most prior FT-CMR studies in ARVD/C have focused on global RV strain using different software methods, complicating implementation of FT-CMR in clinical practice. We aimed to assess the clinical value of global and regional strain using FT-CMR in ARVD/C and to determine differences between commercially available FT-CMR software packages. We analyzed cine CMR images of 110 subjects (39 overt ARVD/C [mutation+/phenotype+], 40 preclinical ARVD/C [mutation+/phenotype-] and 31 control) for global and regional (subtricuspid, anterior, apical) RV strain in the horizontal longitudinal axis using four FT-CMR software methods (Multimodality Tissue Tracking, TomTec, Medis and Circle Cardiovascular Imaging). Intersoftware agreement was assessed using Bland Altman plots. For global strain, all methods showed reduced strain in overt ARVD/C patients compared to control subjects (p  0.275). For regional strain, overt ARVD/C patients showed reduced strain compared to control subjects in all segments which reached statistical significance in the subtricuspid region for all software methods (p < 0.037), in the anterior wall for two methods (p < 0.005) and in the apex for one method (p = 0.012). Preclinical subjects showed abnormal subtricuspid strain compared to control subjects using one of the software methods (p = 0.009). Agreement between software methods for absolute strain values was low (Intraclass Correlation Coefficient = 0.373). Despite large intersoftware variability of FT-CMR derived strain values, all four software methods distinguished overt ARVD/C patients from control subjects by both global and subtricuspid

  8. Characterization of CRISPR-Cas system in clinical Staphylococcus epidermidis strains revealed its potential association with bacterial infection sites

    DEFF Research Database (Denmark)

    Li, Qiuchun; Xie, Xiaolei; Yin, Kequan

    2016-01-01

    Staphylococcus epidermidis is considered a major cause of nosocomial infections, bringing an immense burden to healthcare systems. Virulent phages have been confirmed to be efficient in combating the pathogen, but the presence of the CRISPR-Cas system, which is a bacterial immune system eliminating...... phages was reported in few S. epidermidis strains. In this study, the CRISPR-Cas system was detected in 12 of almost 300 published genomes in GenBank and by PCR of the cas6 gene in 18 strains out of 130 clinical isolates obtained in Copenhagen. Four strains isolated in 1965-1966 harboured CRISPR elements...... spacers located in the CRISPR1 locus with homology to virulent phage 6ec DNA sequences, and 19 strains each carrying 2 or 3 different spacers recognizing this phage, implied that the CRISPR-Cas immunity could be abrogated by nucleotide mismatch between the spacer and its target phage sequence, while new...

  9. A rapid NMR-based method for discrimination of strain-specific cell wall teichoic acid structures reveals a third backbone type in Lactobacillus plantarum.

    Science.gov (United States)

    Tomita, Satoru; Tanaka, Naoto; Okada, Sanae

    2017-03-01

    The lactic acid bacterium Lactobacillus plantarum is capable of producing strain-specific structures of cell wall teichoic acid (WTA), an anionic polysaccharide found in the Gram-positive bacterial cell wall. In this study, we established a rapid, NMR-based procedure to discriminate WTA structures in this species, and applied it to 94 strains of L. plantarum. Six previously reported glycerol- and ribitol-containing WTA subtypes were successfully identified from 78 strains, suggesting that these were the dominant structures. However, the level of structural variety differed markedly among bacterial sources, possibly reflecting differences in strain-level microbial diversity. WTAs from eight strains were not identified based on NMR spectra and were classified into three groups. Structural analysis of a partial degradation product of an unidentified WTA produced by strain TUA 1496L revealed that the WTA was 1-O-β-d-glucosylglycerol. Two-dimensional NMR analysis of the polymer structure showed phosphodiester bonds between C-3 and C-6 of the glycerol and glucose residues, suggesting a polymer structure of 3,6΄-linked poly(1-O-β-d-glucosyl-sn-glycerol phosphate). This is the third WTA backbone structure in L. plantarum, following 3,6΄-linked poly(1-O-α-d-glucosyl-sn-glycerol phosphate) and 1,5-linked poly(ribitol phosphate). © FEMS 2017. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  10. Genome analysis coupled with physiological studies reveals a diverse nitrogen metabolism in Methylocystis sp. strain SC2.

    Directory of Open Access Journals (Sweden)

    Bomba Dam

    Full Text Available BACKGROUND: Methylocystis sp. strain SC2 can adapt to a wide range of methane concentrations. This is due to the presence of two isozymes of particulate methane monooxygenase exhibiting different methane oxidation kinetics. To gain insight into the underlying genetic information, its genome was sequenced and found to comprise a 3.77 Mb chromosome and two large plasmids. PRINCIPAL FINDINGS: We report important features of the strain SC2 genome. Its sequence is compared with those of seven other methanotroph genomes, comprising members of the Alphaproteobacteria, Gammaproteobacteria, and Verrucomicrobia. While the pan-genome of all eight methanotroph genomes totals 19,358 CDS, only 154 CDS are shared. The number of core genes increased with phylogenetic relatedness: 328 CDS for proteobacterial methanotrophs and 1,853 CDS for the three alphaproteobacterial Methylocystaceae members, Methylocystis sp. strain SC2 and strain Rockwell, and Methylosinus trichosporium OB3b. The comparative study was coupled with physiological experiments to verify that strain SC2 has diverse nitrogen metabolism capabilities. In correspondence to a full complement of 34 genes involved in N2 fixation, strain SC2 was found to grow with atmospheric N2 as the sole nitrogen source, preferably at low oxygen concentrations. Denitrification-mediated accumulation of 0.7 nmol 30N2/hr/mg dry weight of cells under anoxic conditions was detected by tracer analysis. N2 production is related to the activities of plasmid-borne nitric oxide and nitrous oxide reductases. CONCLUSIONS/PERSPECTIVES: Presence of a complete denitrification pathway in strain SC2, including the plasmid-encoded nosRZDFYX operon, is unique among known methanotrophs. However, the exact ecophysiological role of this pathway still needs to be elucidated. Detoxification of toxic nitrogen compounds and energy conservation under oxygen-limiting conditions are among the possible roles. Relevant features that may stimulate

  11. A novel Zika virus mouse model reveals strain specific differences in virus pathogenesis and host inflammatory immune responses.

    Directory of Open Access Journals (Sweden)

    Shashank Tripathi

    2017-03-01

    Full Text Available Zika virus (ZIKV) is a mosquito-borne flavivirus, which was a neglected tropical pathogen until it emerged and spread across the Pacific Area and the Americas, causing large human outbreaks associated with fetal abnormalities and neurological disease in adults. The factors that contributed to the emergence, spread and change in pathogenesis of ZIKV are not understood. We previously reported that ZIKV evades cellular antiviral responses by targeting STAT2 for degradation in human cells. In this study, we demonstrate that Stat2-/- mice are highly susceptible to ZIKV infection, recapitulate virus spread to the central nervous system (CNS), gonads and other visceral organs, and display neurological symptoms. Further, we exploit this model to compare ZIKV pathogenesis caused by a panel of ZIKV strains of a range of spatiotemporal history of isolation and representing African and Asian lineages. We observed that African ZIKV strains induce short episodes of severe neurological symptoms followed by lethality. In comparison, Asian strains manifest prolonged signs of neuronal malfunctions, occasionally causing death of the Stat2-/- mice. African ZIKV strains induced higher levels of inflammatory cytokines and markers associated with cellular infiltration in the infected brain in mice, which may explain exacerbated pathogenesis in comparison to those of the Asian lineage. Interestingly, viral RNA levels in different organs did not correlate with the pathogenicity of the different strains. Taken together, we have established a new murine model that supports ZIKV infection and demonstrate its utility in highlighting intrinsic differences in the inflammatory response induced by different ZIKV strains leading to severity of disease. This study paves the way for the future interrogation of strain-specific changes in the ZIKV genome and their contribution to viral pathogenesis.

  12. Multilocus Microsatellite Typing reveals intra-focal genetic diversity among strains of Leishmania tropica in Chichaoua Province, Morocco.

    Science.gov (United States)

    Krayter, Lena; Alam, Mohammad Zahangir; Rhajaoui, Mohamed; Schnur, Lionel F; Schönian, Gabriele

    2014-12-01

    In Morocco, cutaneous leishmaniasis (CL) caused by Leishmania (L.) tropica is a major public health threat. Strains of this species have been shown to display considerable serological, biochemical, molecular biological and genetic heterogeneity, and Multilocus Enzyme Electrophoresis (MLEE) has shown that in many countries, including Morocco, heterogenic variants of L. tropica can co-exist in single geographical foci. Here, the microsatellite profiles discerned by Multilocus Microsatellite Typing (MLMT) of nine Moroccan strains of L. tropica isolated in 2000 from human cases of CL from Chichaoua Province were compared to those of nine Moroccan strains of L. tropica isolated between 1988 and 1990 from human cases of CL from Marrakech Province, and also to those of 147 strains of L. tropica isolated at different times from different worldwide geographical locations within the range of distribution of the species. Several programs, each employing a different algorithm, were used for population genetic analysis. The strains from each of the two Moroccan foci separated into two phylogenetic clusters independent of their geographical origin. Genetic diversity and heterogeneity existed in both foci, which are geographically close to each other. This intra-focal distribution of genetic variants of L. tropica is not considered to be due to in situ mutation. Rather, it is proposed to be explained by the importation of pre-existing variants of L. tropica into Morocco. Copyright © 2014 The Authors. Published by Elsevier B.V. All rights reserved.

  13. Motor coordination and balance measurements reveal differential pathogenicity of currently spreading enterovirus 71 strains in human SCARB2 transgenic mice.

    Science.gov (United States)

    Chen, Mei-Feng; Shih, Shin-Ru

    2016-12-01

    Enterovirus 71 (EV71) has caused large-scale epidemics with neurological complications in the Asia-Pacific region. The C4a and B5 strains are the two major genotypes circulating in many countries recently. This study used a new protocol, a motor coordination task, to assess the differential pathogenicity of C4a and B5 strains in human SCARB2 transgenic mice. We found that the pathogenicity of C4a viruses was more severe than that of B5 viruses. Moreover, we discovered that an increased level of monocyte chemoattractant protein-1 was positively correlated with severely deficient motor function. This study provides a new method for evaluating EV71 infection in mice and distinguishing the severity of the symptoms caused by different clinical strains, which would contribute to studies of pathogenesis and development of vaccines and antivirals in EV71 infections.

  14. Strain-resolved microbial community proteomics reveals simultaneous aerobic and anaerobic function during gastrointestinal tract colonization of a preterm infant

    Directory of Open Access Journals (Sweden)

    Brandon eBrooks

    2015-07-01

    Full Text Available While there has been growing interest in the gut microbiome in recent years, it remains unclear whether closely related species and strains have similar or distinct functional roles and if organisms capable of both aerobic and anaerobic growth do so simultaneously. To investigate these questions, we implemented a high-throughput mass spectrometry-based proteomics approach to identify proteins in fecal samples collected on days of life 13-21 from an infant born at 28 weeks gestation. No prior studies have coupled strain-resolved community metagenomics to proteomics for such a purpose. Sequences were manually curated to resolve the genomes of two strains of Citrobacter that were present during the later stage of colonization. Proteome extracts from fecal samples were processed via a nano-2D-LC-MS/MS and peptides were identified based on information predicted from the genome sequences for the dominant organisms, Serratia and the two Citrobacter strains. These organisms are facultative anaerobes, and proteomic information indicates the utilization of both aerobic and anaerobic metabolisms throughout the time series. This may indicate growth in distinct niches within the gastrointestinal tract. We uncovered differences in the physiology of coexisting Citrobacter strains, including differences in motility and chemotaxis functions. Additionally, for both Citrobacter strains we resolved a community-essential role in vitamin metabolism and a predominant role in propionate production. Finally, in this case study we detected differences between genome abundance and activity levels for the dominant populations. This underlines the value in layering proteomic information over genetic potential.

  15. Comparative genome analysis of VSP-II and SNPs reveals heterogenic variation in contemporary strains of Vibrio cholerae O1 isolated from cholera patients in Kolkata, India.

    Science.gov (United States)

    Imamura, Daisuke; Morita, Masatomo; Sekizuka, Tsuyoshi; Mizuno, Tamaki; Takemura, Taichiro; Yamashiro, Tetsu; Chowdhury, Goutam; Pazhani, Gururaja P; Mukhopadhyay, Asish K; Ramamurthy, Thandavarayan; Miyoshi, Shin-Ichi; Kuroda, Makoto; Shinoda, Sumio; Ohnishi, Makoto

    2017-02-01

    Cholera is an acute diarrheal disease and a major public health problem in many developing countries in Asia, Africa, and Latin America. Since the Bay of Bengal is considered the epicenter for the seventh cholera pandemic, it is important to understand the genetic dynamism of Vibrio cholerae from Kolkata, as a representative of the Bengal region. We analyzed whole genome sequence data of V. cholerae O1 isolated from cholera patients in Kolkata, India, from 2007 to 2014 and identified the heterogeneous genomic region in these strains. In addition, we carried out a phylogenetic analysis based on the whole genome single nucleotide polymorphisms to determine the genetic lineage of strains in Kolkata. This analysis revealed the heterogeneity of the Vibrio seventh pandemic island (VSP)-II in Kolkata strains. The ctxB genotype was also heterogeneous and was highly related to VSP-II types. In addition, phylogenetic analysis revealed the shifts in predominant strains in Kolkata. Two distinct lineages, 1 and 2, were found between 2007 and 2010. However, the proportion changed markedly in 2010 and lineage 2 strains were predominant thereafter. Lineage 2 can be divided into four sublineages, I, II, III and IV. The results of this study indicate that lineages 1 and 2-I were concurrently prevalent between 2007 and 2009, and that lineage 2-III was observed in 2010, followed by the predominance of lineage 2-IV from 2011 until 2014. Our findings demonstrate that the epidemic of cholera in Kolkata was caused by several distinct strains that have been constantly changing within the genetic lineages of V. cholerae O1 in recent years.

  16. Private selective sweeps identified from next-generation pool-sequencing reveal convergent pathways under selection in two inbred Schistosoma mansoni strains.

    Directory of Open Access Journals (Sweden)

    Julie A J Clément

    Full Text Available BACKGROUND: The trematode flatworms of the genus Schistosoma, the causative agents of schistosomiasis, are among the most prevalent parasites in humans, affecting more than 200 million people worldwide. In this study, we focused on two well-characterized strains of S. mansoni, to explore signatures of selection. Both strains are highly inbred and exhibit differences in life history traits, in particular in their compatibility with the intermediate host Biomphalaria glabrata. METHODOLOGY/PRINCIPAL FINDINGS: We performed high throughput sequencing of DNA from pools of individuals of each strain using Illumina technology and identified single nucleotide polymorphisms (SNPs) and copy number variations (CNVs). In total, 708,898 SNPs and roughly 2,000 CNVs were identified. The SNPs revealed low nucleotide diversity (π = 2 × 10⁻⁴) within each strain and a high differentiation level (Fst = 0.73) between them. Based on a recently developed in-silico approach, we further detected 12 and 19 private (i.e. specific non-overlapping) selective sweeps among the 121 and 151 sweeps found in total for each strain. CONCLUSIONS/SIGNIFICANCE: Functional annotation of transcripts lying in the private selective sweeps revealed specific selection for functions related to parasitic interaction (e.g. cell-cell adhesion or redox reactions). Despite high differentiation between strains, we identified evolutionary convergence of genes related to proteolysis, known as a key virulence factor and a potential target of drug and vaccine development. Our data show that pool-sequencing can be used for the detection of selective sweeps in parasite populations and enables one to identify biological functions under selection.

  17. Comparative Genomic Analyses of Multiple Pseudomonas Strains Infecting Corylus avellana Trees Reveal the Occurrence of Two Genetic Clusters with Both Common and Distinctive Virulence and Fitness Traits

    Science.gov (United States)

    Marcelletti, Simone; Scortichini, Marco

    2015-01-01

    The European hazelnut (Corylus avellana) is threatened in Europe by several pseudomonads which cause symptoms ranging from twig dieback to tree death. A comparison of the draft genomes of nine Pseudomonas strains isolated from symptomatic C. avellana trees was performed to identify common and distinctive genomic traits. The thorough assessment of genetic relationships among the strains revealed two clearly distinct clusters: P. avellanae and P. syringae, the latter including the pathovars avellanae, coryli and syringae. Between these two clusters, no recombination event was found. A genomic island of approximately 20 kb, containing the hrp/hrc type III secretion system gene cluster, was found to be present without any genomic difference in all nine pseudomonads. The type III secretion system effector repertoires were remarkably different in the two groups, with P. avellanae showing a higher number of effectors. Homologue genes of the antimetabolite mangotoxin and ice nucleation activity clusters were found solely in all P. syringae pathovar strains, whereas the siderophore yersiniabactin was only present in P. avellanae. All nine strains have genes coding for pectic enzymes and sucrose metabolism. By contrast, they do not have genes coding for indoleacetic acid and anti-insect toxin. Collectively, this study reveals that genomically different Pseudomonas can converge on the same host plant by suppressing the host defence mechanisms with the use of different virulence weapons. The integration into their genomes of a horizontally acquired genomic island could play a fundamental role in their evolution, perhaps giving them the ability to exploit new ecological niches. PMID:26147218

  18. Biodiversity among Lactobacillus helveticus Strains Isolated from Different Natural Whey Starter Cultures as Revealed by Classification Trees

    Science.gov (United States)

    Gatti, Monica; Trivisano, Carlo; Fabrizi, Enrico; Neviani, Erasmo; Gardini, Fausto

    2004-01-01

    Lactobacillus helveticus is a homofermentative thermophilic lactic acid bacterium used extensively for manufacturing Swiss type and aged Italian cheese. In this study, the phenotypic and genotypic diversity of strains isolated from different natural dairy starter cultures used for Grana Padano, Parmigiano Reggiano, and Provolone cheeses was investigated by a classification tree technique. A data set was used that consists of 119 L. helveticus strains, each of which was studied for its physiological characters, as well as surface protein profiles and hybridization with a species-specific DNA probe. The methodology employed in this work allowed the strains to be grouped into terminal nodes without difficult and subjective interpretation. In particular, good discrimination was obtained between L. helveticus strains isolated, respectively, from Grana Padano and from Provolone natural whey starter cultures. The method used in this work allowed identification of the main characteristics that permit discrimination of biotypes. In order to understand what kind of genes could code for phenotypes of technological relevance, evidence that specific DNA sequences are present only in particular biotypes may be of great interest. PMID:14711641

  19. Linking Genotype and Phenotype of Saccharomyces cerevisiae Strains Reveals Metabolic Engineering Targets and Leads to Triterpene Hyper-Producers

    DEFF Research Database (Denmark)

    Madsen, Karina Marie; Udatha, Gupta D. B. R. K.; Semba, Saori

    2011-01-01

    with the S288C strain, we implemented a strategy for the construction of a beta-amyrin production platform. The genes Erg8, Erg9 and HFA1 contained non-silent SNPs that were computationally analyzed to evaluate the changes they cause in the respective protein structures. Subsequently, Erg8, Erg9 and HFA1...

  20. Generation of lycopene-overproducing strains of the fungus Mucor circinelloides reveals important aspects of lycopene formation and accumulation.

    Science.gov (United States)

    Zhang, Yingtong; Chen, Haiqin; Navarro, Eusebio; López-García, Sergio; Chen, Yong Q; Zhang, Hao; Chen, Wei; Garre, Victoriano

    2017-03-01

    To generate lycopene-overproducing strains of the fungus Mucor circinelloides of interest for industrial production and to gain insight into the catalytic mechanism of lycopene cyclase and regulatory process during lycopene overaccumulation. Three lycopene-overproducing mutants were generated by classic mutagenesis techniques from a β-carotene-overproducing strain. They carried distinct mutations in the carRP gene encoding lycopene cyclase that produced loss of enzymatic activity to different extents. In one mutant (MU616), the lycopene cyclase was completely destroyed, and a 43.8% (1.1 mg/g dry mass) increase in lycopene production was observed in comparison to that by the previously existing lycopene overproducer. In addition, feedback regulation of the end product was suggested in lycopene-overproducing strains. A lycopene-overaccumulating strain of the fungus M. circinelloides was generated that could be an alternative for the industrial production of lycopene. Vital catalytic residues for lycopene cyclase activity and the potential mechanism of lycopene formation and accumulation were identified.

  1. Aaron Journal article datasets

    Data.gov (United States)

    U.S. Environmental Protection Agency — All figures used in the journal article are in netCDF format. This dataset is associated with the following publication: Sims, A., K. Alapaty , and S. Raman....

  2. Integrated Surface Dataset (Global)

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — The Integrated Surface Dataset (ISD) is composed of worldwide surface weather observations from over 35,000 stations, though the best spatial coverage is...

  3. Control Measure Dataset

    Data.gov (United States)

    U.S. Environmental Protection Agency — The EPA Control Measure Dataset is a collection of documents describing air pollution control available to regulated facilities for the control and abatement of air...

  4. National Hydrography Dataset (NHD)

    Data.gov (United States)

    Kansas Data Access and Support Center — The National Hydrography Dataset (NHD) is a feature-based database that interconnects and uniquely identifies the stream segments or reaches that comprise the...

  5. Market Squid Ecology Dataset

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — This dataset contains ecological information collected on the major adult spawning and juvenile habitats of market squid off California and the US Pacific Northwest....

  6. Tables and figure datasets

    Data.gov (United States)

    U.S. Environmental Protection Agency — Soil and air concentrations of asbestos in Sumas study. This dataset is associated with the following publication: Wroble, J., T. Frederick, A. Frame, and D....

  7. CE-MS-based metabolomics reveals the metabolic profile of maitake mushroom (Grifola frondosa) strains with different cultivation characteristics.

    Science.gov (United States)

    Sato, Mayumi; Miyagi, Atsuko; Yoneyama, Shozo; Gisusi, Seiki; Tokuji, Yoshihiko; Kawai-Yamada, Maki

    2017-12-01

    Maitake mushroom (Grifola frondosa [Dicks.] Gray) is generally cultured using the sawdust of broadleaf trees. The maitake strain Gf433 has high production efficiency, with high-quality fruiting bodies even when 30% of the birch sawdust on the basal substrate is replaced with conifer sawdust. We performed metabolome analysis to investigate the effect of different cultivation components on the metabolism of Gf433 and Mori52, using CE-MS on their fruiting bodies in different cultivation conditions to quantify the levels of amino acids, organic acids, and phosphorylated organic acids. We found that amino acid and organic acid content in Gf433 were not affected by the kind of sawdust. However, Gf433 contained more organic acids and fewer amino acids than Mori52, and Gf433 also contained more chitin compared with Mori52. We believe that these differences in the metabolome contents of the two strains are related to the high production efficiency of Gf433.

  8. Comparative analysis of the complete genome sequence of the California MSW strain of myxoma virus reveals potential host adaptations.

    Science.gov (United States)

    Kerr, Peter J; Rogers, Matthew B; Fitch, Adam; Depasse, Jay V; Cattadori, Isabella M; Hudson, Peter J; Tscharke, David C; Holmes, Edward C; Ghedin, Elodie

    2013-11-01

    Myxomatosis is a rapidly lethal disease of European rabbits that is caused by myxoma virus (MYXV). The introduction of a South American strain of MYXV into the European rabbit population of Australia is the classic case of host-pathogen coevolution following cross-species transmission. The most virulent strains of MYXV for European rabbits are the Californian viruses, found in the Pacific states of the United States and the Baja Peninsula, Mexico. The natural host of Californian MYXV is the brush rabbit, Sylvilagus bachmani. We determined the complete sequence of the MSW strain of Californian MYXV and performed a comparative analysis with other MYXV genomes. The MSW genome is larger than that of the South American Lausanne (type) strain of MYXV due to an expansion of the terminal inverted repeats (TIRs) of the genome, with duplication of the M156R, M154L, M153R, M152R, and M151R genes and part of the M150R gene from the right-hand (RH) end of the genome at the left-hand (LH) TIR. Despite the extreme virulence of MSW, no novel genes were identified; five genes were disrupted by multiple indels or mutations to the ATG start codon, including two genes, M008.1L/R and M152R, with major virulence functions in European rabbits, and a sixth gene, M000.5L/R, was absent. The loss of these gene functions suggests that S. bachmani is a relatively recent host for MYXV and that duplication of virulence genes in the TIRs, gene loss, or sequence variation in other genes can compensate for the loss of M008.1L/R and M152R in infections of European rabbits.

  9. Revealing strategies of quorum sensing in Azospirillum brasilense strains Ab-V5 and Ab-V6.

    Science.gov (United States)

    Fukami, Josiane; Abrantes, Julia Laura Fernandes; Del Cerro, Pablo; Nogueira, Marco Antonio; Ollero, Francisco Javier; Megías, Manuel; Hungria, Mariangela

    2018-01-01

    Azospirillum brasilense is an important plant-growth promoting bacterium (PGPB) that requires several critical steps for root colonization, including biofilm and exopolysaccharide (EPS) synthesis and cell motility. In several bacteria these mechanisms are mediated by quorum sensing (QS) systems that regulate the expression of specific genes mediated by the autoinducers N-acyl-homoserine lactones (AHLs). We investigated QS mechanisms in strains Ab-V5 and Ab-V6 of A. brasilense, which are broadly used in commercial inoculants in Brazil. Neither of these strains carries a luxI gene, but there are several luxR solos that might perceive AHL molecules. By adding external AHLs we verified that biofilm and EPS production and cell motility (swimming and swarming) were regulated via QS in Ab-V5, but not in Ab-V6. Differences were observed not only between strains, but also in the specificity of LuxR-type receptors to AHL molecules. However, Ab-V6 was outstanding in indole acetic acid (IAA) synthesis and this molecule might mimic AHL signals. We also applied the quorum quenching (QQ) strategy, obtaining transconjugants of Ab-V5 and Ab-V6 carrying a plasmid with acyl-homoserine lactonase. When maize (Zea mays L.) was inoculated with the wild-type and transconjugant strains, plant growth was decreased with the transconjugant of Ab-V5, confirming the importance of an AHL-mediated QS system, whereas plant growth promotion by Ab-V6 was not affected.

  10. Global mRNA expression analysis in myosin II deficient strains of Saccharomyces cerevisiae reveals an impairment of cell integrity functions

    Directory of Open Access Journals (Sweden)

    Rivera-Molina Félix E

    2008-01-01

    Full Text Available Background: The Saccharomyces cerevisiae MYO1 gene encodes the myosin II heavy chain (Myo1p), a protein required for normal cytokinesis in budding yeast. Myo1p deficiency in yeast (myo1Δ) causes a cell separation defect characterized by the formation of attached cells, yet it also causes abnormal budding patterns, formation of enlarged and elongated cells, increased osmotic sensitivity, delocalized chitin deposition, increased chitin synthesis, and hypersensitivity to the chitin synthase III inhibitor Nikkomycin Z. To determine how differential expression of genes is related to these diverse cell wall phenotypes, we analyzed the global mRNA expression profile of myo1Δ strains. Results: Global mRNA expression profiles of myo1Δ strains and their corresponding wild type controls were obtained by hybridization to yeast oligonucleotide microarrays. Results for selected genes were confirmed by real time RT-PCR. A total of 547 differentially expressed genes (p ≤ 0.01) were identified with 263 up regulated and 284 down regulated genes in the myo1Δ strains. Gene set enrichment analysis revealed the significant over-representation of genes in the protein biosynthesis and stress response categories. The SLT2/MPK1 gene was up regulated in the microarray, and a myo1Δslt2Δ double mutant was non-viable. Overexpression of ribosomal protein genes RPL30 and RPS31 suppressed the hypersensitivity to Nikkomycin Z and increased the levels of phosphorylated Slt2p in myo1Δ strains. Increased levels of phosphorylated Slt2p were also observed in wild type strains under these conditions. Conclusion: Following this analysis of global mRNA expression in yeast myo1Δ strains, we conclude that 547 genes were differentially regulated in myo1Δ strains and that the stress response and protein biosynthesis gene categories were coordinately regulated in this mutant. The SLT2/MPK1 gene was confirmed to be essential for myo1Δ strain viability, supporting that the up

  11. Simulation of Smart Home Activity Datasets

    Directory of Open Access Journals (Sweden)

    Jonathan Synnott

    2015-06-01

    Full Text Available A globally ageing population is resulting in an increased prevalence of chronic conditions which affect older adults. Such conditions require long-term care and management to maximize quality of life, placing an increasing strain on healthcare resources. Intelligent environments such as smart homes facilitate long-term monitoring of activities in the home through the use of sensor technology. Access to sensor datasets is necessary for the development of novel activity monitoring and recognition approaches. Access to such datasets is limited due to issues such as sensor cost, availability and deployment time. The use of simulated environments and sensors may address these issues and facilitate the generation of comprehensive datasets. This paper provides a review of existing approaches for the generation of simulated smart home activity datasets, including model-based approaches and interactive approaches which implement virtual sensors, environments and avatars. The paper also provides recommendations for future work in intelligent environment simulation.

  12. Simulation of Smart Home Activity Datasets.

    Science.gov (United States)

    Synnott, Jonathan; Nugent, Chris; Jeffers, Paul

    2015-06-16

    A globally ageing population is resulting in an increased prevalence of chronic conditions which affect older adults. Such conditions require long-term care and management to maximize quality of life, placing an increasing strain on healthcare resources. Intelligent environments such as smart homes facilitate long-term monitoring of activities in the home through the use of sensor technology. Access to sensor datasets is necessary for the development of novel activity monitoring and recognition approaches. Access to such datasets is limited due to issues such as sensor cost, availability and deployment time. The use of simulated environments and sensors may address these issues and facilitate the generation of comprehensive datasets. This paper provides a review of existing approaches for the generation of simulated smart home activity datasets, including model-based approaches and interactive approaches which implement virtual sensors, environments and avatars. The paper also provides recommendations for future work in intelligent environment simulation.

  13. In-vivo expression profiling of Pseudomonas aeruginosa infections reveals niche-specific and strain-independent transcriptional programs.

    Directory of Open Access Journals (Sweden)

    Piotr Bielecki

    Full Text Available Pseudomonas aeruginosa is a threatening, opportunistic pathogen causing disease in immunocompromised individuals. The hallmark of P. aeruginosa virulence is its multi-factorial and combinatorial nature. It renders the bacterium infectious for many organisms, and the bacterium is often resistant to antibiotics. To gain insights into the physiology of P. aeruginosa during infection, we assessed the transcriptional programs of three different P. aeruginosa strains directly after isolation from burn wounds of humans. We compared the programs to those of the same strains using two infection models: a plant model, which consisted of the infection of the midrib of lettuce leaves, and a murine tumor model, which was obtained by infection of mice with an induced tumor in the abdomen. All control conditions of P. aeruginosa cells growing in suspension and as a biofilm were added to the analysis. We found that these different P. aeruginosa strains express a pool of distinct genetic traits that are activated under particular infection conditions regardless of their genetic variability. The knowledge herein generated will advance our understanding of P. aeruginosa virulence and provide valuable cues for the definition of prospective targets to develop novel intervention strategies.

  14. The Genomic Sequence of the Oral Pathobiont Strain NI1060 Reveals Unique Strategies for Bacterial Competition and Pathogenicity.

    Directory of Open Access Journals (Sweden)

    Youssef Darzi

    Full Text Available Strain NI1060 is an oral bacterium responsible for periodontitis in a murine ligature-induced disease model. To better understand its pathogenicity, we have determined the complete sequence of its 2,553,982 bp genome. Although NI1060 is closely related, based on its 16S rRNA, to Pasteurella pneumotropica, a pneumonia-associated rodent commensal, its genomic content suggests that they are different species thriving on different energy sources via alternative metabolic pathways. Genomic and phylogenetic analyses showed that strain NI1060 is distinct from the genera currently described in the family Pasteurellaceae, and is likely to represent a novel species. In addition, we found putative virulence genes involved in lipooligosaccharide synthesis, adhesins and bacteriotoxic proteins. These genes are potentially important for host adaptation and for the induction of dysbiosis through bacterial competition and pathogenicity. Importantly, strain NI1060 strongly stimulates Nod1, an innate immune receptor, but is defective in two peptidoglycan recycling genes due to a frameshift mutation. The in-depth analysis of its genome thus provides critical insights for the development of NI1060 as a prime model system for infectious disease.

  15. Two highly divergent lineages of exfoliative toxin B-encoding plasmids revealed in impetigo strains of Staphylococcus aureus.

    Science.gov (United States)

    Botka, Tibor; Růžičková, Vladislava; Svobodová, Karla; Pantůček, Roman; Petráš, Petr; Čejková, Darina; Doškař, Jiří

    2017-09-01

    Exfoliative toxin B (ETB) encoded by some large plasmids plays a crucial role in epidermolytic diseases caused by Staphylococcus aureus. We have found as yet unknown types of etb gene-positive plasmids isolated from a set of impetigo strains implicated in outbreaks of pemphigus neonatorum in Czech maternity hospitals. Plasmids from the strains of clonal complex CC121 were related to archetypal plasmid pETB TY4. Sharing a 33-kb core sequence including virulence genes for ETB, EDIN C, and lantibiotics, they were assigned to a stand-alone lineage, named pETB TY4-based plasmids. Differing from each other in the content of variable DNA regions, they formed four sequence types. In addition to them, a novel unique plasmid pETB608 isolated from a strain of ST130 was described. Carrying conjugative cluster genes, as well as new variants of etb and edinA genes, pETB608 could be regarded as a source of a new lineage of ETB plasmids. We have designed a helpful detection assay, which facilitates the precise identification of all the described types of ETB plasmids. Copyright © 2017 Elsevier GmbH. All rights reserved.

  16. A high performance Trichoderma reesei strain that reveals the importance of xylanase III in cellulosic biomass conversion.

    Science.gov (United States)

    Nakazawa, Hikaru; Kawai, Tetsushi; Ida, Noriko; Shida, Yosuke; Shioya, Kouki; Kobayashi, Yoshinori; Okada, Hirofumi; Tani, Shuji; Sumitani, Jun-Ichi; Kawaguchi, Takashi; Morikawa, Yasushi; Ogasawara, Wataru

    2016-01-01

    The ability of the Trichoderma reesei X3AB1 strain enzyme preparations to convert cellulosic biomass into fermentable sugars is enhanced by the replacement of xyn3 by Aspergillus aculeatus β-glucosidase 1 gene (aabg1), as shown in our previous study. However, subsequent experiments using T. reesei extracts supplemented with the glycoside hydrolase (GH) family 10 xylanase III (XYN III) and GH family 11 XYN II showed increased conversion of alkaline-treated cellulosic biomass, which is rich in xylan, underscoring the importance of XYN III. To attain optimal saccharifying potential in T. reesei, we constructed two new strains, C1AB1 and E1AB1, in which aabg1 was expressed heterologously by means of the cbh1 or egl1 promoters, respectively, so that the endogenous XYN III synthesis remained intact. Due to the presence of wild-type xyn3 in T. reesei E1AB1, enzymes prepared from this strain were 20-30% more effective in the saccharification of alkaline-pretreated rice straw than enzyme extracts from X3AB1, and also outperformed recent commercial cellulase preparations. Our results demonstrate the importance of XYN III in the conversion of alkaline-pretreated cellulosic biomass by T. reesei. Copyright © 2015 Elsevier Inc. All rights reserved.

  17. Spatiotemporal maps reveal regional differences in the effects on gut motility for Lactobacillus reuteri and rhamnosus strains.

    Science.gov (United States)

    Wu, R Y; Pasyk, M; Wang, B; Forsythe, P; Bienenstock, J; Mao, Y-K; Sharma, P; Stanisz, A M; Kunze, W A

    2013-03-01

    Commensal bacteria such as probiotics that are neuroactive acutely affect the amplitudes of intestinal migrating motor complexes (MMCs). What is lacking for an improved understanding of these motility effects are region specific measurements of velocity and frequency. We have combined intraluminal pressure recordings with spatiotemporal diameter maps to analyze more completely effects of different strains of beneficial bacteria on motility. Intraluminal peak pressure (PPr) was measured and video recordings made of mouse ex vivo jejunum and colon segments before and after intraluminal applications of Lactobacillus rhamnosus (JB-1) or Lactobacillus reuteri (DSM 17938). Migrating motor complex frequency and velocity were calculated. JB-1 decreased jejunal frequencies by 56% and 34% in colon. Jejunal velocities increased 171%, but decreased 31% in colon. Jejunal PPr decreased by 55% and in colon by 21%. DSM 17938 increased jejunal frequencies 63% and in colon 75%; jejunal velocity decreased 57%, but increased in colon 146%; jejunal PPr was reduced 26% and 12% in colon. TRAM-34 decreased frequency by 71% and increased velocity 200% for jejunum, but increased frequency 46% and velocity 50% for colon; PPr was decreased 59% for jejunum and 39% for colon. The results show that probiotics and other beneficial bacteria have strain and region-specific actions on gut motility that can be successfully discriminated using spatiotemporal mapping of diameter changes. Effects are not necessarily the same in colon and jejunum. Further research is needed on the detailed effects of the strains on enteric neuron currents for each gut region. © 2013 Blackwell Publishing Ltd.

  18. Significant strain accumulation between the deformation front and landward out-of-sequence thrusts in accretionary wedge of SW Taiwan revealed by cGPS and SAR interferometry

    Science.gov (United States)

    Tsai, M. C.

    2017-12-01

    High strain accumulation across the fold-and-thrust belt in Southwestern Taiwan is revealed by Continuous GPS (cGPS) and SAR interferometry. This high strain is generally accommodated by the major active structures in the fold-and-thrust belt of the Western Foothills in SW Taiwan, connected to the accretionary wedge in the incipient arc-continent collision zone. The active structures across the zone of high strain accumulation include the deformation front around the Tainan Tableland and the Hochiali, Hsiaokangshan, Fangshan and Chishan faults. Among these active structures, the deformation pattern revealed by cGPS and SAR interferometry suggests that the Fangshan transfer fault may be a left-lateral fault zone with a thrust component, accommodating the westward differential motion of thrust sheets on both sides of the fault. In addition, the Chishan fault is connected to the splay fault bordering the lower slope and upper slope of the accretionary wedge, which could be the major seismogenic fault and an out-of-sequence thrust fault in SW Taiwan. Large earthquakes resulting from the reactivation of out-of-sequence thrusts have been observed along the Nankai accretionary wedge; thus the assessment of the major seismogenic structures by strain accumulation between the frontal décollement and out-of-sequence thrusts is a crucial topic. According to the background seismicity, low seismicity and mid-crust to mantle events are observed inland and in the lower- and upper-slope domains offshore SW Taiwan, which rheologically implies that the upper crust of the accretionary wedge is more or less aseismic. This result may suggest that the excess fluid pressure from the accretionary wedge not only has significantly weakened the prism materials and the major fault zones, but also causes landward extension of the accretionary wedge, which is why low seismicity is observed in the SW Taiwan area. Key words: Continuous GPS, SAR interferometry, strain rate, out-of-sequence thrust.

  19. Whole-genome characterization of Uruguayan strains of avian infectious bronchitis virus reveals extensive recombination between the two major South American lineages.

    Science.gov (United States)

    Marandino, Ana; Tomás, Gonzalo; Panzera, Yanina; Greif, Gonzalo; Parodi-Talice, Adriana; Hernández, Martín; Techera, Claudia; Hernández, Diego; Pérez, Ruben

    2017-10-01

    Infectious bronchitis virus (Gammacoronavirus, Coronaviridae) is a genetically variable RNA virus that causes one of the most persistent respiratory diseases in poultry. The virus is classified in genotypes and lineages with different epidemiological relevance. Two lineages of the GI genotype (11 and 16) have been widely circulating for decades in South America. GI-11 is an exclusive South American lineage while the GI-16 lineage is distributed in Asia, Europe and South America. Here, we obtained the whole genome of two Uruguayan strains of the GI-11 and GI-16 lineages using Illumina high-throughput sequencing. The strains here sequenced are the first obtained in South America for the infectious bronchitis virus and provide new insights into the origin, spreading and evolution of viral variants. The complete genomes of the GI-11 and GI-16 strains have 27,621 and 27,638 nucleotides, respectively, and possess the same genomic organization. Phylogenetic incongruence analysis reveals that both strains have a mosaic genome that arose by recombination between Euro Asiatic strains of the GI-16 lineage and ancestral South American GI-11 viruses. The recombination occurred in South America and produced two viral variants that have retained the full-length S1 sequences of the parental lineages but are extremely similar in the rest of their genomes. These recombinant viruses have been extraordinarily successful, persisting in the continent for several years with a notably wide geographic distribution. Our findings reveal singular viral dynamics and emphasize the importance of complete genomic characterization to understand the emergence and evolutionary history of viral variants. Copyright © 2017 Elsevier B.V. All rights reserved.

  20. Isolation and Characterization of Current Human Coronavirus Strains in Primary Human Epithelial Cell Cultures Reveal Differences in Target Cell Tropism

    Science.gov (United States)

    Dijkman, Ronald; Jebbink, Maarten F.; Koekkoek, Sylvie M.; Deijs, Martin; Jónsdóttir, Hulda R.; Molenkamp, Richard; Ieven, Margareta; Goossens, Herman; Thiel, Volker

    2013-01-01

    The human airway epithelium (HAE) represents the entry port of many human respiratory viruses, including human coronaviruses (HCoVs). Nowadays, four HCoVs, HCoV-229E, HCoV-OC43, HCoV-HKU1, and HCoV-NL63, are known to be circulating worldwide, causing upper and lower respiratory tract infections in nonhospitalized and hospitalized children. Studies of the fundamental aspects of these HCoV infections at the primary entry port, such as cell tropism, are seriously hampered by the lack of a universal culture system or suitable animal models. To expand the knowledge on fundamental virus-host interactions for all four HCoVs at the site of primary infection, we used pseudostratified HAE cell cultures to isolate and characterize representative clinical HCoV strains directly from nasopharyngeal material. Ten contemporary isolates were obtained, representing HCoV-229E (n = 1), HCoV-NL63 (n = 1), HCoV-HKU1 (n = 4), and HCoV-OC43 (n = 4). For each strain, we analyzed the replication kinetics and progeny virus release on HAE cell cultures derived from different donors. Surprisingly, by visualizing HCoV infection by confocal microscopy, we observed that HCoV-229E employs a target cell tropism for nonciliated cells, whereas HCoV-OC43, HCoV-HKU1, and HCoV-NL63 all infect ciliated cells. Collectively, the data demonstrate that HAE cell cultures, which morphologically and functionally resemble human airways in vivo, represent a robust universal culture system for isolating and comparing all contemporary HCoV strains. PMID:23427150

  1. Isfahan MISP Dataset.

    Science.gov (United States)

    Kashefpur, Masoud; Kafieh, Rahele; Jorjandi, Sahar; Golmohammadi, Hadis; Khodabande, Zahra; Abbasi, Mohammadreza; Teifuri, Nilufar; Fakharzadeh, Ali Akbar; Kashefpoor, Maryam; Rabbani, Hossein

    2017-01-01

    An online depository was introduced to share clinical ground truth with the public and provide open access for researchers to evaluate their computer-aided algorithms. PHP was used for web programming and MySQL for database management. The website was entitled "biosigdata.com." It was a fast, secure, and easy-to-use online database for medical signals and images. Freely registered users could download the datasets and could also share their own supplementary materials while maintaining their privacy (citation and fee). Commenting was also available for all datasets, and automatic sitemap and semi-automatic SEO indexing have been set for the site. A comprehensive list of available websites for medical datasets is also presented as supplementary material (http://journalonweb.com/tempaccess/4800.584.JMSS_55_16I3253.pdf).

  2. A high HIV-1 strain variability in London, UK, revealed by full-genome analysis: Results from the ICONIC project

    Science.gov (United States)

    Frampton, Dan; Gallo Cassarino, Tiziano; Raffle, Jade; Hubb, Jonathan; Ferns, R. Bridget; Waters, Laura; Tong, C. Y. William; Kozlakidis, Zisis; Hayward, Andrew; Kellam, Paul; Pillay, Deenan; Clark, Duncan; Nastouli, Eleni; Leigh Brown, Andrew J.

    2018-01-01

    Background & methods: The ICONIC project has developed an automated high-throughput pipeline to generate HIV nearly full-length genomes (NFLG, i.e. from gag to nef) from next-generation sequencing (NGS) data. The pipeline was applied to 420 HIV samples collected at University College London Hospitals NHS Trust and Barts Health NHS Trust (London) and sequenced using an Illumina MiSeq at the Wellcome Trust Sanger Institute (Cambridge). Consensus genomes were generated and subtyped using COMET, and unique recombinants were studied with jpHMM and SimPlot. Maximum-likelihood phylogenetic trees were constructed using RAxML to identify transmission networks using the Cluster Picker. Results: The pipeline generated sequences of at least 1Kb of length (median = 7.46Kb, IQR = 4.01Kb) for 375 out of the 420 samples (89%), with 174 (46.4%) being NFLG. A total of 365 sequences (169 of them NFLG) corresponded to unique subjects and were included in the down-stream analyses. The most frequent HIV subtypes were B (n = 149, 40.8%) and C (n = 77, 21.1%) and the circulating recombinant form CRF02_AG (n = 32, 8.8%). We found 14 different CRFs (n = 66, 18.1%) and multiple URFs (n = 32, 8.8%) that involved recombination between 12 different subtypes/CRFs. The most frequent URFs were B/CRF01_AE (4 cases) and A1/D, B/C, and B/CRF02_AG (3 cases each). Most URFs (19/26, 73%) lacked breakpoints in the PR+RT pol region, rendering them undetectable if only that was sequenced. Twelve (37.5%) of the URFs could have emerged within the UK, whereas the rest were probably imported from sub-Saharan Africa, South East Asia and South America. For 2 URFs we found highly similar pol sequences circulating in the UK. We detected 31 phylogenetic clusters using the full dataset: 25 pairs (mostly subtypes B and C), 4 triplets and 2 quadruplets. Some of these were not consistent across different genes due to inter- and intra-subtype recombination. Clusters involved 70 sequences, 19.2% of the dataset.
Conclusions

  3. Mridangam stroke dataset

    OpenAIRE

    CompMusic

    2014-01-01

    The audio examples were recorded from a professional Carnatic percussionist in semi-anechoic studio conditions by Akshay Anantapadmanabhan using SM-58 microphones and an H4n ZOOM recorder. The audio was sampled at 44.1 kHz and stored as 16-bit WAV files. The dataset can be used for training models for each Mridangam stroke. A detailed description of the Mridangam and its strokes can be found in the paper below. A part of the dataset was used in the following paper. Akshay Anantapadman...

  4. The GTZAN dataset

    DEFF Research Database (Denmark)

    Sturm, Bob L.

    2013-01-01

    The GTZAN dataset appears in at least 100 published works, and is the most-used public dataset for evaluation in machine listening research for music genre recognition (MGR). Our recent work, however, shows GTZAN has several faults (repetitions, mislabelings, and distortions), which challenge...... of GTZAN, and provide a catalog of its faults. We review how GTZAN has been used in MGR research, and find few indications that its faults have been known and considered. Finally, we rigorously study the effects of its faults on evaluating five different MGR systems. The lesson is not to banish GTZAN...

  5. Proteome analysis reveals distinct uranium stress response in two strains of Cyanobacteria native to Indian paddy fields

    International Nuclear Information System (INIS)

    Panda, Bandita; Basu, Bhakti; Acharya, Celin; Rajaram, Hema; Apte, Shree Kumar

    2017-01-01

    Uranium present in phosphate fertilizer contaminates agricultural land. Uranium exerts chemical toxicity to the resident biota as it induces oxidative stress by generating free radicals. Two strains of nitrogen fixing cyanobacteria viz., Anabaena PCC 7120 and L-31 native to Indian paddy, regularly experience oxidative stress induced by different stresses and heavy metals. The present study investigated their response to uranium exposure at proteomic level. LD50 dose for Anabaena 7120 and Anabaena L-31 was determined to be 75 μM and 200 μM uranyl carbonate exposure for 3 h. A total of 79 proteins from Anabaena 7120 and 64 proteins from Anabaena L-31 were identified by MALDI mass spectrometry, of which levels of 45 and 27 proteins respectively were found to be differentially modulated in the two strains in response to uranium exposure. The differentially expressed proteins belonged to the major functional categories of photosynthesis, carbon metabolism and oxidative stress alleviation, commensurate with their uranium tolerance. Better oxidative stress management, and maintenance of metabolic and energy homeostasis lead to superior uranium tolerance in Anabaena L-31 as compared to Anabaena PCC 7120.

  6. Genomic analysis of an attenuated Chlamydia abortus live vaccine strain reveals defects in central metabolism and surface proteins.

    Science.gov (United States)

    Burall, L S; Rodolakis, A; Rekiki, A; Myers, G S A; Bavoil, P M

    2009-09-01

    Comparative genomic analysis of a wild-type strain of the ovine pathogen Chlamydia abortus and its nitrosoguanidine-induced, temperature-sensitive, virulence-attenuated live vaccine derivative identified 22 single nucleotide polymorphisms unique to the mutant, including nine nonsynonymous mutations, one leading to a truncation of pmpG, which encodes a polymorphic membrane protein, and two intergenic mutations potentially affecting promoter sequences. Other nonsynonymous mutations mapped to a pmpG pseudogene and to predicted coding sequences encoding a putative lipoprotein, a sigma-54-dependent response regulator, a PhoH-like protein, a putative export protein, two tRNA synthetases, and a putative serine hydroxymethyltransferase. One of the intergenic mutations putatively affects transcription of two divergent genes encoding pyruvate kinase and a putative SOS response nuclease, respectively. These observations suggest that the temperature-sensitive phenotype and associated virulence attenuation of the vaccine strain result from disrupted metabolic activity due to altered pyruvate kinase expression and/or alteration in the function of one or more membrane proteins, most notably PmpG and a putative lipoprotein.

  7. Unexpected diversity in the mobilome of a Pseudomonas aeruginosa strain isolated from a dental unit waterline revealed by SMRT Sequencing.

    Science.gov (United States)

    Vincent, Antony T; Charette, Steve J; Barbeau, Jean

    2018-05-01

    The Gram-negative bacterium Pseudomonas aeruginosa is found in several habitats, both natural and human-made, and is particularly known for its recurrent presence as a pathogen in the lungs of patients suffering from cystic fibrosis, a genetic disease. Given its clinical importance, several major studies have investigated the genomic adaptation of P. aeruginosa in lungs and its transition as acute infections become chronic. However, our knowledge about the diversity and adaptation of the P. aeruginosa genome to non-clinical environments is still fragmentary, in part due to the lack of accurate reference genomes of strains from the numerous environments colonized by the bacterium. Here, we used PacBio long-read technology to sequence the genome of PPF-1, a strain of P. aeruginosa isolated from a dental unit waterline. Generating this closed genome was an opportunity to investigate genomic features that are difficult to accurately study in a draft genome (contigs state). It was possible to shed light on putative genomic islands, some shared with other reference genomes, new prophages, and the complete content of insertion sequences. In addition, four different group II introns were also found, including two characterized here and not listed in the specialized group II intron database.

  8. Genome-wide comparison and taxonomic relatedness of multiple Xylella fastidiosa strains reveal the occurrence of three subspecies and a new Xylella species.

    Science.gov (United States)

    Marcelletti, Simone; Scortichini, Marco

    2016-10-01

    A total of 21 Xylella fastidiosa strains were assessed by comparing their genomes to infer their taxonomic relationships. The whole-genome-based average nucleotide identity and tetranucleotide frequency correlation coefficient analyses were performed. In addition, a consensus tree based on comparisons of 956 core gene families, and a genome-wide phylogenetic tree and a Neighbor-net network were constructed with 820,088 nucleotides (i.e., approximately 30-33% of the entire X. fastidiosa genome). All approaches revealed the occurrence of three well-demarcated genetic clusters that represent X. fastidiosa subspecies fastidiosa, multiplex and pauca, with the latter appearing to diverge. We suggest that the proposed but never formally described subspecies 'sandyi' and 'morus' are instead members of the subspecies fastidiosa. These analyses support the view that the Xylella strain isolated from Pyrus pyrifolia in Taiwan is likely to be a new species. A widely used multilocus sequence typing analysis yielded conflicting results.

  9. Temporal dynamics of the developing lung transcriptome in three common inbred strains of laboratory mice reveals multiple stages of postnatal alveolar development

    Directory of Open Access Journals (Sweden)

    Kyle J. Beauchemin

    2016-08-01

    Full Text Available To characterize temporal patterns of transcriptional activity during normal lung development, we generated genome wide gene expression data for 26 pre- and post-natal time points in three common inbred strains of laboratory mice (C57BL/6J, A/J, and C3H/HeJ). Using Principal Component Analysis and least squares regression modeling, we identified both strain-independent and strain-dependent patterns of gene expression. The 4,683 genes contributing to the strain-independent expression patterns were used to define a murine Developing Lung Characteristic Subtranscriptome (mDLCS). Regression modeling of the Principal Components supported the four canonical stages of mammalian embryonic lung development (embryonic, pseudoglandular, canalicular, saccular) defined previously by morphology and histology. For postnatal alveolar development, the regression model was consistent with four stages of alveolarization characterized by episodic transcriptional activity of genes related to pulmonary vascularization. Genes expressed in a strain-dependent manner were enriched for annotations related to neurogenesis, extracellular matrix organization, and Wnt signaling. Finally, a comparison of mouse and human transcriptomics from pre-natal stages of lung development revealed conservation of pathways associated with cell cycle, axon guidance, immune function, and metabolism as well as organism-specific expression of genes associated with extracellular matrix organization and protein modification. The mouse lung development transcriptome data generated for this study serves as a unique reference set to identify genes and pathways essential for normal mammalian lung development and for investigations into the developmental origins of respiratory disease and cancer. The gene expression data are available from the Gene Expression Omnibus (GEO) archive (GSE74243).
Temporal expression patterns of mouse genes can be investigated using a study specific web resource (http://lungdevelopment.jax.org).

  10. Temporal dynamics of the developing lung transcriptome in three common inbred strains of laboratory mice reveals multiple stages of postnatal alveolar development.

    Science.gov (United States)

    Beauchemin, Kyle J; Wells, Julie M; Kho, Alvin T; Philip, Vivek M; Kamir, Daniela; Kohane, Isaac S; Graber, Joel H; Bult, Carol J

    2016-01-01

    To characterize temporal patterns of transcriptional activity during normal lung development, we generated genome wide gene expression data for 26 pre- and post-natal time points in three common inbred strains of laboratory mice (C57BL/6J, A/J, and C3H/HeJ). Using Principal Component Analysis and least squares regression modeling, we identified both strain-independent and strain-dependent patterns of gene expression. The 4,683 genes contributing to the strain-independent expression patterns were used to define a murine Developing Lung Characteristic Subtranscriptome (mDLCS). Regression modeling of the Principal Components supported the four canonical stages of mammalian embryonic lung development (embryonic, pseudoglandular, canalicular, saccular) defined previously by morphology and histology. For postnatal alveolar development, the regression model was consistent with four stages of alveolarization characterized by episodic transcriptional activity of genes related to pulmonary vascularization. Genes expressed in a strain-dependent manner were enriched for annotations related to neurogenesis, extracellular matrix organization, and Wnt signaling. Finally, a comparison of mouse and human transcriptomics from pre-natal stages of lung development revealed conservation of pathways associated with cell cycle, axon guidance, immune function, and metabolism as well as organism-specific expression of genes associated with extracellular matrix organization and protein modification. The mouse lung development transcriptome data generated for this study serves as a unique reference set to identify genes and pathways essential for normal mammalian lung development and for investigations into the developmental origins of respiratory disease and cancer. The gene expression data are available from the Gene Expression Omnibus (GEO) archive (GSE74243). 
Temporal expression patterns of mouse genes can be investigated using a study specific web resource (http://lungdevelopment.jax.org).

  11. Dataset - Adviesregel PPL 2010

    NARCIS (Netherlands)

    Evert, van F.K.; Schans, van der D.A.; Geel, van W.C.A.; Slabbekoorn, J.J.; Booij, R.; Jukema, J.N.; Meurs, E.J.J.; Uenk, D.

    2011-01-01

    This dataset contains experimental data from a number of field experiments with potato in The Netherlands (Van Evert et al., 2011). The data are presented as an SQL dump of a PostgreSQL database (version 8.4.4). An outline of the entity-relationship diagram of the database is given in an

  12. A novel approach to eliminate Wolbachia infections in Nasonia vitripennis revealed different antibiotic resistance between two bacterial strains.

    Science.gov (United States)

    Liu, Hai-Yang; Wang, Yan-Kun; Zhi, Cong-Cong; Xiao, Jin-Hua; Huang, Da-Wei

    2014-06-01

    Wolbachia are widespread in insects and can manipulate host reproduction. Nasonia vitripennis is a widely studied organism with a very high prevalence of Wolbachia infection. To study the effect of Wolbachia infection in Nasonia spp., it is important to obtain noninfected individuals by artificial methods. Current methods that employ sugar water containing antibiotics can successfully eliminate Wolbachia from the parasitic wasps; however, treatment of at least three generations is required. Here, we describe a novel, feasible, and effective approach to eliminate Wolbachia from N. vitripennis by feeding fly pupae continuously offering antibiotics to Nasonia populations, which shortened the time to eliminate the pathogens to two generations. Additionally, the Wolbachia Uni and CauB strains have clearly different levels of rifampicin resistance, which is a previously unknown phenomenon. © 2014 Federation of European Microbiological Societies. Published by John Wiley & Sons Ltd. All rights reserved.

  13. Genetic characterisation of farmed rainbow trout in Norway: intra- and inter-strain variation reveals potential for identification of escapees

    Directory of Open Access Journals (Sweden)

    Glover Kevin A

    2008-12-01

    Full Text Available Abstract Background: The rainbow trout (Oncorhynchus mykiss) is one of the most important aquaculture species in the world, and Norway is one of the largest producers. The present study was initiated in response to a request from the Norwegian police authority to identify the farm of origin for 35 escaped rainbow trout captured in a fjord. Eleven samples, each consisting of approximately 47 fish, were collected from the three farms operating in the fjord where the escapees were captured. In order to gain a better general understanding of the genetic structure of rainbow trout strains used in Norwegian aquaculture, seven samples (47 fish per sample) were collected from six farms located outside the region where the escapees were captured. All samples, including the escapees, were genotyped with 12 microsatellite loci. Results: All samples displayed considerable genetic variability at all loci (mean number of alleles per locus per sample ranged from 5.4–8.6). Variable degrees of genetic differentiation were observed among the samples, with pair-wise FST values ranging from 0–0.127. Self-assignment tests conducted among the samples collected from farms outside the fjord where the escapees were observed gave an overall correct assignment of 82.5%, demonstrating potential for genetic identification of escapees. In the "real life" assignment of the 35 captured escapees, all were excluded from two of the samples included as controls in the analysis, and 26 were excluded from the third control sample. In contrast, only 1 of the escapees was excluded from the 11 pooled samples collected on the 3 farms operating in the fjord. Conclusion: Considerable genetic variation exists within and among rainbow trout strains farmed in Norway. Together with modern statistical methods, this will provide commercial operators with a tool to monitor breeding and fish movements, and management authorities with the ability to identify the source of escapees. The data

  14. Molecular Analysis of Asymptomatic Bacteriuria Escherichia coli Strain VR50 Reveals Adaptation to the Urinary Tract by Gene Acquisition

    DEFF Research Database (Denmark)

    Beatson, Scott A.; Ben Zakour, Nouri L.; Totsika, Makrina

    2015-01-01

    the evolution and molecular mechanisms that underpin ABU, the genome of the ABU E. coli strain VR50 was sequenced. Analysis of the complete genome indicated that it most resembles E. coli K-12, with the addition of a 94-kb genomic island (GI-VR50-pheV), eight prophages, and multiple plasmids. GI-VR50-pheV has...... a mosaic structure and contains genes encoding a number of UTI-associated virulence factors, namely, Afa (afimbrial adhesin), two autotransporter proteins (Ag43 and Sat), and aerobactin. We demonstrated that the presence of this island in VR50 confers its ability to colonize the murine bladder, as a VR50...... mutant with GI-VR50-pheV deleted was attenuated in a mouse model of UTI in vivo. We established that Afa is the island-encoded factor responsible for this phenotype using two independent deletion (Afa operon and AfaE adhesin) mutants. E. coli VR50afa and VR50afaE displayed significantly decreased ability...

  15. The Geographic Distribution of Saccharomyces cerevisiae Isolates within three Italian Neighboring Winemaking Regions Reveals Strong Differences in Yeast Abundance, Genetic Diversity and Industrial Strain Dissemination

    Directory of Open Access Journals (Sweden)

    Alessia Viel

    2017-08-01

    Full Text Available In recent years the interest in natural fermentations has been re-evaluated in terms of increasing the wine terroir and managing more sustainable winemaking practices. Therefore, the level of yeast genetic variability and the abundance of Saccharomyces cerevisiae native populations in vineyards are becoming more and more crucial at both ecological and technological levels. Among the factors that can influence the strain diversity, the accidental release of commercial starters into the environment around the winery has to be considered. In this study we conducted a wide-scale investigation of S. cerevisiae genetic diversity and population structure in the vineyards of three neighboring winemaking regions of Protected Appellation of Origin, in north-east Italy. Combining mtDNA RFLP and microsatellite markers analyses we evaluated 634 grape samples collected over 3 years. We could detect major differences in the presence of S. cerevisiae yeasts, according to the winemaking region. The population structures revealed specificities of yeast microbiota at vineyard scale, with a relative Appellation of Origin area homogeneity, and transition zones suggesting a geographic differentiation. Surprisingly, we found a widespread industrial yeast dissemination that was very high in the areas where the native yeast abundance was low. Although geographical distance is a key element involved in strain distribution, the high presence of industrial strains in vineyards reduced the differences between populations. This finding indicates that industrial yeast diffusion is a real emergency and that the presence of these strains strongly interferes with the natural yeast microbiota.

  16. Use of combined microscopic and spectroscopic techniques to reveal interactions between uranium and Microbacterium sp. A9, a strain isolated from the Chernobyl exclusion zone

    Energy Technology Data Exchange (ETDEWEB)

    Theodorakopoulos, Nicolas [CEA, DSV, IBEB, SBVME, LIPM, F-13108 Saint-Paul-lez-Durance (France); CNRS, UMR 7265, F-13108 Saint-Paul-lez-Durance (France); Université d' Aix-Marseille, F-13108 Saint-Paul-lez-Durance (France); IRSN/PRP-ENV/SERIS/L2BT, bat 183, B.P. 3, F-13115 Saint Paul-lez-Durance (France); Chapon, Virginie [CEA, DSV, IBEB, SBVME, LIPM, F-13108 Saint-Paul-lez-Durance (France); CNRS, UMR 7265, F-13108 Saint-Paul-lez-Durance (France); Université d' Aix-Marseille, F-13108 Saint-Paul-lez-Durance (France); Coppin, Fréderic; Floriani, Magali [IRSN/PRP-ENV/SERIS/L2BT, bat 183, B.P. 3, F-13115 Saint Paul-lez-Durance (France); Vercouter, Thomas [CEA, DEN, DANS, DPC SEARS, LANIE, F-91191 Gif-Sur-Yvette Cedex (France); Sergeant, Claire [Univ Bordeaux, CENBG, UMR5797, F-33170 Gradignan (France); CNRS, IN2P3, CENBG, UMR5797, F-33170 Gradignan (France); Camilleri, Virginie [IRSN/PRP-ENV/SERIS/L2BT, bat 183, B.P. 3, F-13115 Saint Paul-lez-Durance (France); Berthomieu, Catherine [CEA, DSV, IBEB, SBVME, LIPM, F-13108 Saint-Paul-lez-Durance (France); CNRS, UMR 7265, F-13108 Saint-Paul-lez-Durance (France); Université d' Aix-Marseille, F-13108 Saint-Paul-lez-Durance (France); Février, Laureline, E-mail: laureline.fevrier@irsn.fr [IRSN/PRP-ENV/SERIS/L2BT, bat 183, B.P. 3, F-13115 Saint Paul-lez-Durance (France)

    2015-03-21

    Highlights: • Microbacterium sp. A9 develops various detoxification mechanisms. • Microbacterium sp. A9 promotes metal efflux from the cells. • Microbacterium sp. A9 releases phosphate to prevent uranium entrance in the cells. • Microbacterium sp. A9 stores U intracellularly as autunite. - Abstract: Although uranium (U) is naturally found in the environment, soil remediation programs will become increasingly important in light of certain human activities. This work aimed to identify U(VI) detoxification mechanisms employed by a bacterial strain isolated from a Chernobyl soil sample, and to distinguish its active from passive mechanisms of interaction. The ability of the Microbacterium sp. A9 strain to remove U(VI) from aqueous solutions at 4 °C and 25 °C was evaluated, as well as its survival capacity upon U(VI) exposure. The subcellular localisation of U was determined by TEM/EDX microscopy, while functional groups involved in the interaction with U were further evaluated by FTIR; finally, the speciation of U was analysed by TRLFS. We have revealed, for the first time, an active mechanism promoting metal efflux from the cells, during the early steps following U(VI) exposure at 25 °C. The Microbacterium sp. A9 strain also stores U intracellularly, as needle-like structures that have been identified as an autunite group mineral. Taken together, our results demonstrate that this strain exhibits a high U(VI) tolerance based on multiple detoxification mechanisms. These findings support the potential role of the genus Microbacterium in the remediation of aqueous environments contaminated with U(VI) under aerobic conditions.

  17. Diversity and strain specificity of plant cell wall degrading enzymes revealed by the draft genome of Ruminococcus flavefaciens FD-1.

    Directory of Open Access Journals (Sweden)

    Margret E Berg Miller

    Full Text Available BACKGROUND: Ruminococcus flavefaciens is a predominant cellulolytic rumen bacterium, which forms a multi-enzyme cellulosome complex that could play an integral role in the ability of this bacterium to degrade plant cell wall polysaccharides. Identifying the major enzyme types involved in plant cell wall degradation is essential for gaining a better understanding of the cellulolytic capabilities of this organism as well as highlighting potential enzymes for application in improvement of livestock nutrition and for conversion of cellulosic biomass to liquid fuels. METHODOLOGY/PRINCIPAL FINDINGS: The R. flavefaciens FD-1 genome was sequenced to 29x-coverage, based on pulsed-field gel electrophoresis estimates (4.4 Mb), and assembled into 119 contigs providing 4,576,399 bp of unique sequence. As much as 87.1% of the genome encodes ORFs, tRNA, rRNAs, or repeats. The GC content was calculated at 45%. A total of 4,339 ORFs was detected with an average gene length of 918 bp. The cellulosome model for R. flavefaciens was further refined by sequence analysis, with at least 225 dockerin-containing ORFs, including previously characterized cohesin-containing scaffoldin molecules. These dockerin-containing ORFs encode a variety of catalytic modules including glycoside hydrolases (GHs), polysaccharide lyases, and carbohydrate esterases. Additionally, 56 ORFs encode proteins that contain carbohydrate-binding modules (CBMs). Functional microarray analysis of the genome revealed that 56 of the cellulosome-associated ORFs were up-regulated, 14 were down-regulated, and 135 were unaffected, when R. flavefaciens FD-1 was grown on cellulose versus cellobiose. Three multi-modular xylanases (ORF01222, ORF03896, and ORF01315) exhibited the highest levels of up-regulation. CONCLUSIONS/SIGNIFICANCE: The genomic evidence indicates that R. flavefaciens FD-1 has the largest known number of fiber-degrading enzymes likely to be arranged in a cellulosome architecture. Functional

  18. Biogeochemical typing of paddy field by a data-driven approach revealing sub-systems within a complex environment--a pipeline to filtrate, organize and frame massive dataset from multi-omics analyses.

    Directory of Open Access Journals (Sweden)

    Diogo M O Ogawa

    Full Text Available We propose the technique of biogeochemical typing (BGC typing) as a novel methodology to set forth the sub-systems of organismal communities associated to the correlated chemical profiles working within a larger complex environment. Given the intricate characteristic of both organismal and chemical consortia inherent to the nature, many environmental studies employ the holistic approach of multi-omics analyses undermining as much information as possible. Due to the massive amount of data produced applying multi-omics analyses, the results are hard to visualize and to process. The BGC typing analysis is a pipeline built using integrative statistical analysis that can treat such huge datasets filtering, organizing and framing the information based on the strength of the various mutual trends of the organismal and chemical fluctuations occurring simultaneously in the environment. To test our technique of BGC typing, we choose a rich environment abounding in chemical nutrients and organismal diversity: the surficial freshwater from Japanese paddy fields and surrounding waters. To identify the community consortia profile we employed metagenomics as high throughput sequencing (HTS) for the fragments amplified from Archaea rRNA, universal 16S rRNA and 18S rRNA; to assess the elemental content we employed ionomics by inductively coupled plasma optical emission spectroscopy (ICP-OES); and for the organic chemical profile, metabolomics employing both Fourier transform infrared (FT-IR) spectroscopy and proton nuclear magnetic resonance (1H-NMR); all these analyses comprised our multi-omics dataset. The similar trends between the community consortia against the chemical profiles were connected through correlation. The result was then filtered, organized and framed according to correlation strengths and peculiarities. The output gave us four BGC types displaying uniqueness in community and chemical distribution, diversity and richness.
We conclude therefore that

  19. LC-MS/MS Detection of Karlotoxins Reveals New Variants in Strains of the Marine Dinoflagellate Karlodinium veneficum from the Ebro Delta (NW Mediterranean)

    Directory of Open Access Journals (Sweden)

    Bernd Krock

    2017-12-01

    Full Text Available A liquid chromatography-tandem mass spectrometry (LC-MS/MS) method was developed for the detection and quantitation of karlotoxins in the selected reaction monitoring (SRM) mode. This novel method was based upon the analysis of purified karlotoxins (KcTx-1, KmTx-2, 44-oxo-KmTx-2, KmTx-5), one amphidinol (AM-18), and unpurified extracts of bulk cultures of the marine dinoflagellate Karlodinium veneficum strain CCMP2936 from Delaware (Eastern USA), which produces KmTx-1 and KmTx-3. The limit of detection of the SRM method for KmTx-2 was determined as 2.5 ng on-column. Collision induced dissociation (CID) spectra of all putative karlotoxins were recorded to present fragmentation patterns of each compound for their unambiguous identification. Bulk cultures of K. veneficum strain K10 isolated from an embayment of the Ebro Delta, NW Mediterranean, yielded five previously unreported putative karlotoxins with molecular masses 1280, 1298, 1332, 1356, and 1400 Da, and similar fragments to KmTx-5. Analysis of several isolates of K. veneficum from the Ebro Delta revealed small-scale diversity in the karlotoxin spectrum in that one isolate from Fangar Bay produced KmTx-5, whereas the five putative novel karlotoxins were found among several isolates from nearby, but hydrographically distinct Alfacs Bay. Application of this LC-MS/MS method represents an incremental advance in the determination of putative karlotoxins, particularly in the absence of a complete spectrum of purified analytical standards of known specific potency.

  20. Comparative Phosphoproteomics Reveals the Role of AmpC β-lactamase Phosphorylation in the Clinical Imipenem-resistant Strain Acinetobacter baumannii SK17.

    Science.gov (United States)

    Lai, Juo-Hsin; Yang, Jhih-Tian; Chern, Jeffy; Chen, Te-Li; Wu, Wan-Ling; Liao, Jiahn-Haur; Tsai, Shih-Feng; Liang, Suh-Yuen; Chou, Chi-Chi; Wu, Shih-Hsiung

    2016-01-01

    Nosocomial infectious outbreaks caused by multidrug-resistant Acinetobacter baumannii have emerged as a serious threat to human health. Phosphoproteomics of pathogenic bacteria has been used to identify the mechanisms of bacterial virulence and antimicrobial resistance. In this study, we used a shotgun strategy combined with high-accuracy mass spectrometry to analyze the phosphoproteomics of the imipenem-susceptible strain SK17-S and -resistant strain SK17-R. We identified 410 phosphosites on 248 unique phosphoproteins in SK17-S and 285 phosphosites on 211 unique phosphoproteins in SK17-R. The distributions of the Ser/Thr/Tyr/Asp/His phosphosites in SK17-S and SK17-R were 47.0%/27.6%/12.4%/8.0%/4.9% versus 41.4%/29.5%/17.5%/6.7%/4.9%, respectively. The Ser-90 phosphosite, located on the catalytic motif S(88)VS(90)K of the AmpC β-lactamase, was first identified in SK17-S. Based on site-directed mutagenesis, the nonphosphorylatable mutant S90A was found to be more resistant to imipenem, whereas the phosphorylation-simulated mutant S90D was sensitive to imipenem. Additionally, the S90A mutant protein exhibited higher β-lactamase activity and conferred greater bacterial protection against imipenem in SK17-S compared with the wild-type. In sum, our results revealed that in A. baumannii, Ser-90 phosphorylation of AmpC negatively regulates both β-lactamase activity and the ability to counteract the antibiotic effects of imipenem. These findings highlight the impact of phosphorylation-mediated regulation in antibiotic-resistant bacteria on future drug design and new therapies. © 2016 by The American Society for Biochemistry and Molecular Biology, Inc.

  1. Comparative Phosphoproteomics Reveals the Role of AmpC β-lactamase Phosphorylation in the Clinical Imipenem-resistant Strain Acinetobacter baumannii SK17*

    Science.gov (United States)

    Lai, Juo-Hsin; Yang, Jhih-Tian; Chern, Jeffy; Chen, Te-Li; Wu, Wan-Ling; Liao, Jiahn-Haur; Tsai, Shih-Feng; Liang, Suh-Yuen; Chou, Chi-Chi

    2016-01-01

    Nosocomial infectious outbreaks caused by multidrug-resistant Acinetobacter baumannii have emerged as a serious threat to human health. Phosphoproteomics of pathogenic bacteria has been used to identify the mechanisms of bacterial virulence and antimicrobial resistance. In this study, we used a shotgun strategy combined with high-accuracy mass spectrometry to analyze the phosphoproteomics of the imipenem-susceptible strain SK17-S and -resistant strain SK17-R. We identified 410 phosphosites on 248 unique phosphoproteins in SK17-S and 285 phosphosites on 211 unique phosphoproteins in SK17-R. The distributions of the Ser/Thr/Tyr/Asp/His phosphosites in SK17-S and SK17-R were 47.0%/27.6%/12.4%/8.0%/4.9% versus 41.4%/29.5%/17.5%/6.7%/4.9%, respectively. The Ser-90 phosphosite, located on the catalytic motif S88VS90K of the AmpC β-lactamase, was first identified in SK17-S. Based on site-directed mutagenesis, the nonphosphorylatable mutant S90A was found to be more resistant to imipenem, whereas the phosphorylation-simulated mutant S90D was sensitive to imipenem. Additionally, the S90A mutant protein exhibited higher β-lactamase activity and conferred greater bacterial protection against imipenem in SK17-S compared with the wild-type. In sum, our results revealed that in A. baumannii, Ser-90 phosphorylation of AmpC negatively regulates both β-lactamase activity and the ability to counteract the antibiotic effects of imipenem. These findings highlight the impact of phosphorylation-mediated regulation in antibiotic-resistant bacteria on future drug design and new therapies. PMID:26499836

  2. Serological and virological survey of hepatitis E virus (HEV) in animal reservoirs from Uruguay reveals elevated prevalences and a very close phylogenetic relationship between swine and human strains.

    Science.gov (United States)

    Mirazo, Santiago; Gardinali, Noemí R; Cecilia, D'Albora; Verger, Lorenzo; Ottonelli, Florencia; Ramos, Natalia; Castro, Gustavo; Pinto, Marcelo A; Ré, Viviana; Pisano, Belén; Lozano, Alejandra; de Oliveira, Jaqueline Mendes; Arbiza, Juan

    2018-01-01

    Hepatitis E virus (HEV) infection is an issue of public health concern in high-income and non-endemic countries. Increasing evidence supports the hypothesis of a zoonotic route as the main mode of infection in this epidemiological setting, since the transmission of genotypes HEV-3 and HEV-4 from reservoirs to humans has been demonstrated. In America, studies have confirmed the circulation of HEV in pig herds but the zoonotic role of wild boars has never been evaluated. Uruguay has a high burden of HEV-associated acute hepatitis, and a close phylogenetic relationship was observed among human HEV-3 strains and European isolates detected in swine. However, in this context, swine herds have never been surveyed. Herein is reported a survey of HEV in swine herds, pigs at slaughterhouse and free-living wild boar populations. Two hundred and twenty sera and 150 liver tissue samples from domestic pigs, and 140 sera from wild boars were tested for HEV by ELISA and PCR-based approaches. All tested swine farms were seropositive, with an overall rate of 46.8%. In turn, 22.1% of the wild boars had anti-HEV antibodies. HEV RNA was detected in 16.6% and 9.3% of liver samples from slaughter-age pigs and adult wild boars sera, respectively. Three strains from domestic pig were also amplified by nested-PCR approaches. By contrast, none of the positive samples obtained from wild boars could be confirmed by nested-PCR. Phylogenetic analysis revealed a very high nucleotide identity among swine strains and sequences obtained from humans in Uruguay. Results showed that HEV is widely distributed among swine herds in Uruguay. Additionally, this study provides the first evidence in the American continent that wild boar populations are a reservoir for HEV, though its zoonotic role remains to be elucidated. Altogether, data presented here suggest a high zoonotic risk of HEV transmission from swine to humans. Copyright © 2017 Elsevier B.V. All rights reserved.

  3. National Elevation Dataset

    Science.gov (United States)

    2002-01-01

    The National Elevation Dataset (NED) is a new raster product assembled by the U.S. Geological Survey. NED is designed to provide National elevation data in a seamless form with a consistent datum, elevation unit, and projection. Data corrections were made in the NED assembly process to minimize artifacts, perform edge matching, and fill sliver areas of missing data. NED has a resolution of one arc-second (approximately 30 meters) for the conterminous United States, Hawaii, Puerto Rico and the island territories and a resolution of two arc-seconds for Alaska. NED data sources have a variety of elevation units, horizontal datums, and map projections. In the NED assembly process the elevation values are converted to decimal meters as a consistent unit of measure, NAD83 is consistently used as horizontal datum, and all the data are recast in a geographic projection. Older DEMs produced by methods that are now obsolete have been filtered during the NED assembly process to minimize artifacts that are commonly found in data produced by these methods. Artifact removal greatly improves the quality of the slope, shaded-relief, and synthetic drainage information that can be derived from the elevation data. Figure 2 illustrates the results of this artifact removal filtering. NED processing also includes steps to adjust values where adjacent DEMs do not match well, and to fill sliver areas of missing data between DEMs. These processing steps ensure that NED has no void areas and that artificial discontinuities have been minimized. The artifact removal filtering process does not eliminate all of the artifacts. In areas where the only available DEM was produced by older methods, "striping" may still occur.
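
    The "one arc-second (approximately 30 meters)" figure quoted in this record can be checked with a short calculation. The sketch below is illustrative only and is not part of the NED documentation: it assumes a spherical Earth with a mean radius of 6,371 km, and the function name is hypothetical. It also shows that the east-west ground distance of an arc-second shrinks with the cosine of the latitude.

```python
import math

# Assumption: spherical Earth with mean radius 6,371 km (not taken from the NED record).
EARTH_RADIUS_M = 6_371_000.0


def arcsecond_to_meters(arcseconds, latitude_deg=0.0):
    """Return (north_south_m, east_west_m) ground distance for an angular step.

    The north-south distance is the arc length along a meridian; the
    east-west distance is that arc length scaled by cos(latitude).
    """
    angle_rad = math.radians(arcseconds / 3600.0)
    north_south = EARTH_RADIUS_M * angle_rad
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west


if __name__ == "__main__":
    # One arc-second at a few latitudes, then two arc-seconds at a high latitude.
    for step, lat in ((1.0, 0.0), (1.0, 40.0), (2.0, 65.0)):
        ns, ew = arcsecond_to_meters(step, lat)
        print(f"{step:.0f} arc-sec at lat {lat:4.1f}: N-S {ns:.1f} m, E-W {ew:.1f} m")
```

    Under these assumptions one arc-second spans a little under 31 m along a meridian, which is consistent with the approximate 30 m stated in the abstract.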

  4. Proteins involved in difference of sorbitol fermentation rates of the toxigenic and nontoxigenic Vibrio cholerae El Tor strains revealed by comparative proteome analysis

    Science.gov (United States)

    2009-01-01

    Background: The nontoxigenic V. cholerae El Tor strains ferment sorbitol faster than the toxigenic strains, hence fast-fermenting and slow-fermenting strains are defined by sorbitol fermentation test. This test has been used for more than 40 years in cholera surveillance and strain analysis in China. Understanding of the mechanisms of sorbitol metabolism of the toxigenic and nontoxigenic strains may help to explore the genome and metabolism divergence in these strains. Here we used comparative proteomic analysis to find the proteins which may be involved in such metabolic difference. Results: We found the production of formate and lactic acid in the sorbitol fermentation medium of the nontoxigenic strain was earlier than of the toxigenic strain. We compared the protein expression profiles of the toxigenic strain N16961 and nontoxigenic strain JS32 cultured in sorbitol fermentation medium, by using fructose fermentation medium as the control. Seventy-three differential protein spots were found and further identified by MALDI-MS. The difference of product of fructose-specific IIA/FPR component gene and mannitol-1-P dehydrogenase, may be involved in the difference of sorbitol transportation and dehydrogenation in the sorbitol fast- and slow-fermenting strains. The difference of the relative transcription levels of pyruvate formate-lyase to pyruvate dehydrogenase between the toxigenic and nontoxigenic strains may be also responsible for the time and ability difference of formate production between these strains. Conclusion: Multiple factors involved in different metabolism steps may affect the sorbitol fermentation in the toxigenic and nontoxigenic strains of V. cholerae El Tor. PMID:19589152

  5. NP-PAH Interaction Dataset

    Data.gov (United States)

    U.S. Environmental Protection Agency — Dataset presents concentrations of organic pollutants, such as polyaromatic hydrocarbon compounds, in water samples. Water samples of known volume and concentration...

  6. Combination of Metabolomic and Proteomic Analysis Revealed Different Features among Lactobacillus delbrueckii Subspecies bulgaricus and lactis Strains While In Vivo Testing in the Model Organism Caenorhabditis elegans Highlighted Probiotic Properties

    Directory of Open Access Journals (Sweden)

    Elena Zanni

    2017-06-01

    Full Text Available Lactobacillus delbrueckii represents a technologically relevant member of lactic acid bacteria, since the two subspecies bulgaricus and lactis are widely associated with fermented dairy products. In the present work, we report the characterization of two commercial strains belonging to L. delbrueckii subspecies bulgaricus, lactis and a novel strain previously isolated from a traditional fermented fresh cheese. A phenomic approach was performed by combining metabolomic and proteomic analysis of the three strains, which were subsequently supplemented as food source to the model organism Caenorhabditis elegans, with the final aim to evaluate their possible probiotic effects. Restriction analysis of 16S ribosomal DNA revealed that the novel foodborne strain belonged to L. delbrueckii subspecies lactis. Proteomic and metabolomic approaches showed differences in folate, amino acid and sugar metabolic pathways among the three strains. Moreover, evaluation of C. elegans lifespan, larval development, brood size, and bacterial colonization capacity demonstrated that L. delbrueckii subsp. bulgaricus diet exerted beneficial effects on nematodes. On the other hand, both L. delbrueckii subsp. lactis strains affected lifespan and larval development. We have characterized three strains belonging to L. delbrueckii subspecies bulgaricus and lactis highlighting their divergent origin. In particular, the two closely related isolates L. delbrueckii subspecies lactis display different galactose metabolic capabilities. Moreover, the L. delbrueckii subspecies bulgaricus strain demonstrated potential probiotic features. Combination of omic platforms coupled with in vivo screening in the simple model organism C. elegans is a powerful tool to characterize industrially relevant bacterial isolates.

  7. Combination of Metabolomic and Proteomic Analysis Revealed Different Features among Lactobacillus delbrueckii Subspecies bulgaricus and lactis Strains While In Vivo Testing in the Model Organism Caenorhabditis elegans Highlighted Probiotic Properties.

    Science.gov (United States)

    Zanni, Elena; Schifano, Emily; Motta, Sara; Sciubba, Fabio; Palleschi, Claudio; Mauri, Pierluigi; Perozzi, Giuditta; Uccelletti, Daniela; Devirgiliis, Chiara; Miccheli, Alfredo

    2017-01-01

    Lactobacillus delbrueckii represents a technologically relevant member of lactic acid bacteria, since the two subspecies bulgaricus and lactis are widely associated with fermented dairy products. In the present work, we report the characterization of two commercial strains belonging to L. delbrueckii subspecies bulgaricus, lactis and a novel strain previously isolated from a traditional fermented fresh cheese. A phenomic approach was performed by combining metabolomic and proteomic analysis of the three strains, which were subsequently supplemented as food source to the model organism Caenorhabditis elegans, with the final aim to evaluate their possible probiotic effects. Restriction analysis of 16S ribosomal DNA revealed that the novel foodborne strain belonged to L. delbrueckii subspecies lactis. Proteomic and metabolomic approaches showed differences in folate, amino acid and sugar metabolic pathways among the three strains. Moreover, evaluation of C. elegans lifespan, larval development, brood size, and bacterial colonization capacity demonstrated that L. delbrueckii subsp. bulgaricus diet exerted beneficial effects on nematodes. On the other hand, both L. delbrueckii subsp. lactis strains affected lifespan and larval development. We have characterized three strains belonging to L. delbrueckii subspecies bulgaricus and lactis highlighting their divergent origin. In particular, the two closely related isolates L. delbrueckii subspecies lactis display different galactose metabolic capabilities. Moreover, the L. delbrueckii subspecies bulgaricus strain demonstrated potential probiotic features. Combination of omic platforms coupled with in vivo screening in the simple model organism C. elegans is a powerful tool to characterize industrially relevant bacterial isolates.

  8. Editorial: Datasets for Learning Analytics

    NARCIS (Netherlands)

    Dietze, Stefan; George, Siemens; Davide, Taibi; Drachsler, Hendrik

    2018-01-01

    The European LinkedUp and LACE (Learning Analytics Community Exchange) projects have been responsible for setting up a series of data challenges at the LAK conferences 2013 and 2014 around the LAK dataset. The LAK dataset consists of a rich collection of full text publications in the domain of

  9. Cloning and sequencing of wsp encoding gene fragments reveals a diversity of co-infecting Wolbachia strains in Acromyrmex leafcutter ants

    DEFF Research Database (Denmark)

    van Borm, S.; Wenseleers, T.; Billen, J.

    2003-01-01

    By sequencing part of the wsp gene of a series of clones, we detected an unusually high diversity of nine Wolbachia strains in queens of three species of leafcutter ants. Up to four strains co-occurred in a single ant. Most strains occurred in two clusters (InvA and InvB), but the social parasite Acromyrmex insinuator hosted two additional infections. The multiple Wolbachia strains may influence the expression of reproductive conflicts in leafcutter ants, but the expected turnover of infections may make the cumulative effects on host ant reproduction complex. The additional Wolbachia infections...

  10. Whole genome analysis of selected human and animal rotaviruses identified in Uganda from 2012 to 2014 reveals complex genome reassortment events between human, bovine, caprine and porcine strains.

    Science.gov (United States)

    Bwogi, Josephine; Jere, Khuzwayo C; Karamagi, Charles; Byarugaba, Denis K; Namuwulya, Prossy; Baliraine, Frederick N; Desselberger, Ulrich; Iturriza-Gomara, Miren

    2017-01-01

    Rotaviruses of species A (RVA) are a common cause of diarrhoea in children and the young of various other mammals and birds worldwide. To investigate possible interspecies transmission of RVAs, whole genomes of 18 human and 6 domestic animal RVA strains identified in Uganda between 2012 and 2014 were sequenced using the Illumina HiSeq platform. The backbone of the human RVA strains had either a Wa- or a DS-1-like genetic constellation. One human strain was a Wa-like mono-reassortant containing a DS-1-like VP2 gene of possible animal origin. All eleven genes of one bovine RVA strain were closely related to those of human RVAs. One caprine strain had a mixed genotype backbone, suggesting that it emerged from multiple reassortment events involving different host species. The porcine RVA strains had mixed genotype backbones with possible multiple reassortant events with strains of human and bovine origin. Overall, whole genome characterisation of rotaviruses found in domestic animals in Uganda strongly suggested the presence of human-to-animal RVA transmission, with concomitant circulation of multi-reassortant strains potentially derived from complex interspecies transmission events. However, whole genome data from the human RVA strains causing moderate and severe diarrhoea in under-fives in Uganda indicated that they were primarily transmitted from person-to-person.

  11. Open University Learning Analytics dataset.

    Science.gov (United States)

    Kuzilek, Jakub; Hlosta, Martin; Zdrahal, Zdenek

    2017-11-28

    Learning Analytics focuses on the collection and analysis of learners' data to improve their learning experience by providing informed guidance and to optimise learning materials. To support the research in this area we have developed a dataset, containing data from courses presented at the Open University (OU). What makes the dataset unique is the fact that it contains demographic data together with aggregated clickstream data of students' interactions in the Virtual Learning Environment (VLE). This enables the analysis of student behaviour, represented by their actions. The dataset contains the information about 22 courses, 32,593 students, their assessment results, and logs of their interactions with the VLE represented by daily summaries of student clicks (10,655,280 entries). The dataset is freely available at https://analyse.kmi.open.ac.uk/open_dataset under a CC-BY 4.0 license.
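
    The "daily summaries of student clicks" described in this record lend themselves to simple aggregation. The sketch below is illustrative only: the column names (id_student, id_site, date, sum_click) and the sample rows are assumptions made for the example and are not taken from the abstract, so they should be checked against the actual files before use. Only the Python standard library is used.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample in the shape of a daily click summary (one row per
# student, resource and day). Column names and values are assumptions.
SAMPLE_CLICKS = """id_student,id_site,date,sum_click
1001,546652,-10,4
1001,546652,0,3
1001,546614,0,2
1002,546652,1,7
"""


def total_clicks_per_student(csv_text):
    """Sum the click counts over all days and resources for each student."""
    totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[int(row["id_student"])] += int(row["sum_click"])
    return dict(totals)


def daily_clicks(csv_text, student_id):
    """Return {day: clicks} for one student, summed across resources."""
    per_day = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        if int(row["id_student"]) == student_id:
            per_day[int(row["date"])] += int(row["sum_click"])
    return dict(per_day)


if __name__ == "__main__":
    print(total_clicks_per_student(SAMPLE_CLICKS))
    print(daily_clicks(SAMPLE_CLICKS, 1001))
```

    With the real data, the same functions could be fed the text of the downloaded click-summary file instead of the inline sample, assuming its header matches the column names used here.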

  12. Genetic characterization of circulating seasonal Influenza A viruses (2005-2009) revealed introduction of oseltamivir resistant H1N1 strains during 2009 in eastern India.

    Science.gov (United States)

    Agrawal, Anurodh S; Sarkar, Mehuli; Ghosh, Swati; Roy, Tapasi; Chakrabarti, Sekhar; Lal, Renu; Mishra, Akhilesh C; Chadha, Mandeep S; Chawla-Sarkar, Mamta

    2010-12-01

    Influenza surveillance was implemented in Kolkata, eastern India in 2005 to identify the circulating subtypes and characterize their genetic diversity. Throat and nasal swabs were collected from outpatients with influenza-like illness (ILI). Of 2844 ILI cases identified at two referral hospitals during October 2005-September 2009, 309 (10.86%) were positive for Influenza A by real time RT-PCR, of which 110 (35.60%) were subtyped as H1N1 and 199 (64.40%) as H3N2. Comparison of the nucleotide (nt) and amino acid (aa) sequences of the HA1 gene for H1N1 and H3N2 strains showed that a subset of strains precede WHO recommended contemporary strains by 1-2 years. The Kolkata H1N1 strains clustered in Clade II, subgroup 2B with A/Brisbane/59/2007 but were distant from the corresponding vaccine strains (New Caledonia/20/99 and A/Solomon Island/3/06). The 2005-06 and 2007 H3N2 strains (15/17) clustered with either A/Brisbane/10/2007-like (n=8) or A/Nepal/921/2006-like (n=7) strains, whereas 2008 strains (8/12) and 2009 strains (4/4) were similar to the 2010-11 vaccine strain A/Perth/16/2009. More aa substitutions were found in HA or NA genes of H3N2 than in H1N1 strains. No mutation conferring neuraminidase resistance was observed in any of the strains during 2005-08; however, in 2009 the drug resistance marker (H275Y) was present in seasonal H1N1, but not in co-circulating H3N2 strains. This is the first report of genetic characterization of circulating Influenza A strains from India. The results also highlight the importance of continuing Influenza surveillance in developing countries of Asia for monitoring unusual strains with pandemic potential and mutations conferring antiviral resistance. Copyright © 2010 Elsevier B.V. All rights reserved.
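
    The positivity and subtype percentages quoted in this record follow directly from the reported counts. The short check below is only a reader's arithmetic sketch, not part of the study; the helper name is hypothetical and the counts are copied from the abstract.

```python
def percent(part, whole):
    """Return part/whole as a percentage rounded to two decimal places."""
    return round(100.0 * part / whole, 2)


# Counts as reported in the abstract.
ILI_CASES = 2844             # influenza-like illness cases screened
INFLUENZA_A_POSITIVE = 309   # positive by real time RT-PCR
H1N1 = 110
H3N2 = 199

if __name__ == "__main__":
    # The two subtypes should account for every Influenza A positive sample.
    assert H1N1 + H3N2 == INFLUENZA_A_POSITIVE
    print("Influenza A positivity:", percent(INFLUENZA_A_POSITIVE, ILI_CASES))  # 10.86
    print("H1N1 share:", percent(H1N1, INFLUENZA_A_POSITIVE))                   # 35.6
    print("H3N2 share:", percent(H3N2, INFLUENZA_A_POSITIVE))                   # 64.4
```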

  13. Characterization of genomic variations in SNPs of PE_PGRS genes reveals deletions and insertions in extensively drug resistant (XDR) M. tuberculosis strains from Pakistan

    KAUST Repository

    Kanji, Akbar; Hasan, Zahra; Ali, Asho; McNerney, Ruth; Mallard, Kim; Coll, Francesc; Hill-Cawthorne, Grant A.; Nair, Mridul; Clark, Taane G.; Zaver, Ambreen; Jafri, Sana; Hasan, Rumina

    2015-01-01

    Genetic diversity in PE_PGRS genes contributes to antigenic variability and may result in increased immunogenicity of strains. This is the first study identifying variations in nsSNPs and INDELs in the PE_PGRS genes of XDR-TB strains from Pakistan. It highlights common genetic variations which may contribute to persistence.

  14. Typing Discrepancy Between Phenotypic and Molecular Characterization Revealing an Emerging Biovar 9 Variant of Smooth Phage-Resistant B. abortus Strain 8416 in China.

    Science.gov (United States)

    Kang, Yao-Xia; Li, Xu-Ming; Piao, Dong-Ri; Tian, Guo-Zhong; Jiang, Hai; Jia, En-Hou; Lin, Liang; Cui, Bu-Yun; Chang, Yung-Fu; Guo, Xiao-Kui; Zhu, Yong-Zhang

    2015-01-01

    A newly isolated smooth colony morphology phage-resistant strain 8416 isolated from a 45-year-old cattle farm cleaner with clinical features of brucellosis in China was reported. The most unusual phenotype was its resistance to two Brucella phages Tbilisi and Weybridge, but sensitive to Berkeley 2, a pattern similar to that of Brucella melitensis biovar 1. VITEK 2 biochemical identification system found that both strain 8416 and B. melitensis strains shared positive ILATk, but negative in other B. abortus strains. However, routine biochemical and phenotypic characteristics of strain 8416 were most similar to that of B. abortus biovar 9 except CO2 requirement. In addition, multiple PCR molecular typing assays including AMOS-PCR, B. abortus special PCR (B-ab PCR) and a novel sub-biovar typing PCR, indicated that strain 8416 may belong to either biovar 3b or 9 of B. abortus. Surprisingly, further MLVA typing results showed that strain 8416 was most closely related to B. abortus biovar 3 in the Brucella MLVA database, primarily differing in 4 out of 16 screened loci. Therefore, due to the unusual discrepancy between phenotypic (biochemical reactions and particular phage lysis profile) and molecular typing characteristics, strain 8416 could not be exactly classified to any of the existing B. abortus biovars and might be a new variant of B. abortus biovar 9. The present study also indicates that the present phage typing scheme for Brucella sp. is subject to variation and the routine Brucella biovar typing needs further studies.

  15. Cell Size Influences the Reproductive Potential and Total Lifespan of the Saccharomyces cerevisiae Yeast as Revealed by the Analysis of Polyploid Strains

    Directory of Open Access Journals (Sweden)

    Renata Zadrag-Tecza

    2018-01-01

    Full Text Available The total lifespan of the yeast Saccharomyces cerevisiae may be divided into two phases: the reproductive phase, during which the cell undergoes mitosis cycles to produce successive buds, and the postreproductive phase, which extends from the last division to cell death. These phases may be regulated by a common mechanism or by distinct ones. In this paper, we proposed a more comprehensive approach to reveal the mechanisms that regulate both reproductive potential and total lifespan in cell size context. Our study was based on yeast cells, whose size was determined by increased genome copy number, ranging from haploid to tetraploid. Such experiments enabled us to test the hypertrophy hypothesis, which postulates that excessive size achieved by the cell (the hypertrophy state) is the reason preventing the cell from further proliferation. This hypothesis defines the reproductive potential value in terms of the difference between the maximal size that a cell can reach and the threshold value that allows a cell to undergo its first cell cycle, together with the rate of cell size increase per generation. Here, we showed that cell size has an important impact on not only the reproductive potential but also the total lifespan of this cell. Moreover, the maximal cell size value, which limits its reproduction capacity, can be regulated by different factors and differs depending on the strain ploidy. The achievement of excessive size by the cell (hypertrophic state) may lead to two distinct phenomena: the cessation of reproduction without “mother” cell death and the cessation of reproduction with cell death by bursting, which has not been shown before.
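
    One way to read the hypertrophy hypothesis as summarized in this record is as a simple ratio. The expression below is a reader's sketch under that assumption, not a formula quoted from the paper; the symbols are introduced here for illustration, with RP the reproductive potential (number of generations), V_max the maximal cell size, V_thr the threshold size for entering the first cell cycle, and ΔV the size increase per generation.

```latex
\[
  RP \;\approx\; \frac{V_{\max} - V_{\mathrm{thr}}}{\Delta V}
\]
```

    Under this reading, a cell that starts closer to its maximal size, or that grows more per generation, would be expected to complete fewer divisions.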

  16. Genomic insights into a new Citrobacter koseri strain revealed gene exchanges with the virulence-associated Yersinia pestis pPCP1 plasmid

    Directory of Open Access Journals (Sweden)

    Fabrice Armougom

    2016-03-01

    Full Text Available The history of infectious diseases raised the plague as one of the most devastating for human beings. Far too often considered an ancient disease, the frequent resurgence of the plague has led to consider it as a reemerging disease in Madagascar, Algeria, Libya and Congo. The genetic factors associated with the pathogenicity of Yersinia pestis, the causative agent of the plague, involve the acquisition of the pPCP1 plasmid that promotes host invasion through the expression of the virulence factor Pla. The surveillance of plague foci after the 2003 outbreak in Algeria resulted in a positive detection of the specific pla gene of Y. pestis in rodents. However, the phenotypic characterization of the isolate identified a Citrobacter koseri. The comparative genomics of our sequenced C. koseri URMITE genome revealed a mosaic gene structure resulting from the lifestyle of our isolate and provided evidence for gene exchanges with different enteric bacteria. The most striking was the acquisition of a continuous 2 kb genomic fragment containing the virulence factor Pla of the Y. pestis pPCP1 plasmid; however, the subcutaneous injection of the CKU strain in mice did not produce any pathogenic effect. Our findings demonstrate that fast molecular detection of plague using solely the pla gene is unsuitable and should rather require Y. pestis gene marker combinations. We also suggest that the evolutionary force that might govern the expression of pathogenicity can occur through the acquisition of virulence genes but could also require the loss or the inactivation of resident genes such as antivirulence genes.

  17. CRISPR/Cas9 Mutagenesis of UL21 in Multiple Strains of Herpes Simplex Virus Reveals Differential Requirements for pUL21 in Viral Replication

    Directory of Open Access Journals (Sweden)

    Renée L. Finnen

    2018-05-01

    Full Text Available Studies from multiple laboratories using different strains or species of herpes simplex virus (HSV) with deletions in UL21 have yielded conflicting results regarding the necessity of pUL21 in HSV infection. To resolve this discrepancy, we utilized CRISPR/Cas9 mutagenesis to isolate pUL21 deficient viruses in multiple HSV backgrounds, and performed a side-by-side comparison of the cell-to-cell spread and replication phenotypes of these viruses. These analyses confirmed previous studies implicating the involvement of pUL21 in cell-to-cell spread of HSV. Cell-to-cell spread of HSV-2 was more greatly affected by the lack of pUL21 than HSV-1, and strain-specific differences in the requirement for pUL21 in cell-to-cell spread were also noted. HSV-2 strain 186 lacking pUL21 was particularly crippled in both cell-to-cell spread and viral replication in non-complementing cells, in comparison to other HSV strains lacking pUL21, suggesting that the strict requirement for pUL21 by strain 186 may not be representative of the HSV-2 species as a whole. This work highlights CRISPR/Cas9 technology as a useful tool for rapidly constructing deletion mutants of alphaherpesviruses, regardless of background strain, and should find great utility whenever strain-specific differences need to be investigated.

  18. Global Genome Comparative Analysis Reveals Insights of Resistome and Life-Style Adaptation of Pseudomonas putida Strain T2-2 in Oral Cavity

    Directory of Open Access Journals (Sweden)

    Xin Yue Chan

    2014-01-01

    Full Text Available Most Pseudomonas putida strains are environmental microorganisms exhibiting a wide range of metabolic capability, but certain strains have been reported as rare opportunistic pathogens and some have emerged as multidrug resistant P. putida. This study aimed to assess, via whole genome analysis, the drug resistance profile of P. putida strain T2-2 isolated from the oral cavity. At the same time, we also compared this nonenvironmental strain with environmentally isolated P. putida. In silico comparative genome analysis with available reference strains of P. putida shows that T2-2 has fewer genes for carbohydrate and aromatic compound metabolism, which suggests lower versatility. The detection of its edd gene also suggested T2-2’s catabolism of glucose via the ED pathway instead of the EMP pathway. On the other hand, its drug resistance profile was observed via in silico gene prediction and most of the genes found were in agreement with drug-susceptibility testing in the laboratory by automated VITEK 2. In addition, the finding of putative genes of multidrug resistance efflux pump and ATP-binding cassette transporters in this strain suggests a multidrug resistant phenotype. In summary, it is believed that multiple metabolic characteristics and drug resistance in P. putida strain T2-2 helped in its survival in the human oral cavity.

  19. High genetic diversity of equine infectious anaemia virus strains from Slovenia revealed upon phylogenetic analysis of the p15 gag gene region.

    Science.gov (United States)

    Kuhar, U; Malovrh, T

    2016-03-01

    The equine infectious anaemia virus (EIAV), which belongs to the Retroviridae family, infects equids almost worldwide. Every year, sporadic EIAV cases are detected in Slovenia. This study aimed to characterise the Slovenian EIAV strains phylogenetically in the p15 gag gene region in order to compare them with EIAV strains from abroad, especially with the recently published European strains. It was a cross-sectional study using material derived from post mortem examination. In total, 29 EIAV serologically positive horses from 18 different farms were examined in this study. Primers were designed to amplify the p15 gag gene region. Amplicons of 28 PCRs were subjected to direct DNA sequencing and phylogenetic analysis. Altogether, 28 EIAV sequences were obtained from 17 different farms and were distributed between 4 separate monophyletic groups and 9 branches upon phylogenetic analysis. Among EIAV strains from abroad, the closest relatives to Slovenian EIAV strains were European EIAV strains from Italy. Phylogenetic analysis also showed that some animals from distantly located farms were most probably infected with the same EIAV strains, as well as animals from the same farm and animals from farms located in the same geographical region. This is the first report of such high genetic diversity of EIAV strains from one country. This led to speculation that there is a potential virus reservoir among the populations of riding horses, horses kept for pleasure and horses for meat production, with some farmers or horse-owners not following legislation, thus enabling the spread of infection with EIAV. The low sensitivity of the agar gel immunodiffusion test may also contribute to the spread of infection with EIAV, because some infected horses might have escaped detection. The results of the phylogenetic analysis also provide additional knowledge about the highly heterogeneous nature of the EIAV genome. © 2015 EVJ Ltd.

  20. Turkey Run Landfill Emissions Dataset

    Data.gov (United States)

    U.S. Environmental Protection Agency — Landfill emissions measurements for the Turkey Run landfill in Georgia. This dataset is associated with the following publication: De la Cruz, F., R. Green, G....

  1. Dataset of NRDA emission data

    Data.gov (United States)

    U.S. Environmental Protection Agency — Emissions data from open air oil burns. This dataset is associated with the following publication: Gullett, B., J. Aurell, A. Holder, B. Mitchell, D. Greenwell, M....

  2. Chemical product and function dataset

    Data.gov (United States)

    U.S. Environmental Protection Agency — Merged product weight fraction and chemical function data. This dataset is associated with the following publication: Isaacs, K., M. Goldsmith, P. Egeghy, K....

  3. Comparative genomic and proteomic analyses of Clostridium acetobutylicum Rh8 and its parent strain DSM 1731 revealed new understandings on butanol tolerance

    International Nuclear Information System (INIS)

    Bao, Guanhui; Dong, Hongjun; Zhu, Yan; Mao, Shaoming; Zhang, Tianrui; Zhang, Yanping; Chen, Zugen; Li, Yin

    2014-01-01

    Highlights: • Genomes of a butanol-tolerant strain and its parent strain were deciphered. • Comparative genomics and proteomics were applied to understand butanol tolerance. • None of the differentially expressed proteins have mutations in their corresponding genes. • Mutations in the ribosome might be responsible for the global difference in proteomics. - Abstract: Clostridium acetobutylicum strain Rh8 is a butanol-tolerant mutant which can tolerate up to 19 g/L butanol, 46% higher than that of its parent strain DSM 1731. We previously performed comparative cytoplasm- and membrane-proteomic analyses to understand the mechanism underlying the improved butanol tolerance of strain Rh8. In this work, we further extended this comparison to the genomic level. Compared with the genome of the parent strain DSM 1731, two insertion sites, four deletion sites, and 67 single nucleotide variations (SNVs) are distributed throughout the genome of strain Rh8. Among the 67 SNVs, 16 SNVs are located in the predicted promoters and intergenic regions, while 29 SNVs are located in the coding sequence, affecting a total of 21 proteins involved in transport, cell structure, DNA replication, and protein translation. The remaining 22 SNVs are located in the ribosomal genes, affecting a total of 12 rRNA genes in different operons. Analysis of previous comparative proteomic data indicated that none of the differentially expressed proteins have mutations in their corresponding genes. Rchange Algorithms analysis indicated that the mutations that occurred in the ribosomal genes might change the ribosome RNA thermodynamic characteristics, thus affecting the translation strength of these proteins. Taken together, the improved butanol tolerance of C. acetobutylicum strain Rh8 might be acquired through regulating the translational process to achieve different expression strength of genes involved in butanol tolerance.

  4. Typing discrepancy between phenotypic and molecular characterization revealing an emerging biovar 9 variant of smooth phage-resistant B. abortus strain 8416 in China

    Directory of Open Access Journals (Sweden)

    YaoXia Kang

    2015-12-01

    Full Text Available A newly isolated smooth colony morphology phage-resistant (SPR) strain 8416, isolated from a 45-year-old cattle farm cleaner with clinical features of brucellosis in China, is reported. The most unusual phenotype was its resistance to two Brucella phages, Tbilisi and Weybridge, but sensitivity to Berkeley 2, a pattern similar to that of B. melitensis biovar 1. The VITEK 2 biochemical identification system found that both strain 8416 and B. melitensis strains were positive for ILATk, whereas other B. abortus strains were negative. However, routine biochemical and phenotypic characteristics of strain 8416 were most similar to those of B. abortus biovar 9 except for CO2 requirement. In addition, multiple PCR molecular typing assays, including AMOS-PCR, B. abortus special PCR (B-ab PCR) and a novel sub-biovar typing PCR, indicated that strain 8416 may belong to either biovar 3b or 9 of B. abortus. Surprisingly, further MLVA typing results showed that strain 8416 was most closely related to B. abortus biovar 3 in the Brucella MLVA database, primarily differing in 4 out of 16 screened loci. Therefore, due to the unusual discrepancy between phenotypic (biochemical reactions and particular phage lysis profile) and molecular typing characteristics, strain 8416 could not be classified exactly into any of the existing B. abortus biovars and might be a new variant of B. abortus biovar 9. The present study also indicates that the present phage typing scheme for Brucella spp. is subject to variation and that routine Brucella biovar typing needs further study.

  5. Characterization of genomic variations in SNPs of PE_PGRS genes reveals deletions and insertions in extensively drug resistant (XDR) M. tuberculosis strains from Pakistan

    KAUST Repository

    Kanji, Akbar

    2015-01-21

    Background: Mycobacterium tuberculosis (MTB) PE_PGRS genes belong to the PE multigene family. Although the function of PE_PGRS genes is unknown, it is hypothesized that the PE_PGRS genes may be associated with antigenic variability in MTB. Material and methods: Whole genome sequencing analysis was performed on (n = 37) extensively drug-resistant (XDR) MTB strains from Pakistan, which included Lineage 1 (East African Indian, n = 2); Other lineage 1 (n = 3); Lineage 3 (Central Asian, n = 24); Other lineage 3 (n = 4); Lineage 4 (X3, n = 1) and T group (n = 3) MTB strains. Results: There were 107 SNPs identified from the analysis of 42 PE_PGRS genes; of these, 13 were non-synonymous SNPs (nsSNPs). The nsSNPs identified in PE_PGRS genes – 6, 9 and 10 – were common in all EAI, CAS, Other lineages (1 and 3), T1 and X3. Deletions (DELs) in PE_PGRS genes – 3 and 19 – were observed in 17 (80.9%) CAS1 and 6 (85.7%) in Other lineages (1 and 3) XDR MTB strains, while DELs in the PE_PGRS49 were observed in all CAS1, CAS, CAS2 and Other lineages (1 and 3) XDR MTB strains. All CAS, EAI and Other lineages (1 and 3) strains showed insertions (INS) in PE_PGRS6 gene, while INS in the PE_PGRS genes 19 and 33 were observed in 20 (95.2%) CAS1, all CAS, CAS2, EAI and Other lineages (1 and 3) XDR MTB strains. Conclusion: Genetic diversity in PE_PGRS genes contributes to antigenic variability and may result in increased immunogenicity of strains. This is the first study identifying variations in nsSNPs and INDELs in the PE_PGRS genes of XDR-TB strains from Pakistan. It highlights common genetic variations which may contribute to persistence.

  6. Comparative genomic and proteomic analyses of Clostridium acetobutylicum Rh8 and its parent strain DSM 1731 revealed new understandings on butanol tolerance

    Energy Technology Data Exchange (ETDEWEB)

    Bao, Guanhui [CAS Key Laboratory of Microbial Physiological and Metabolic Engineering, Institute of Microbiology, Chinese Academy of Sciences, Beijing (China); University of Chinese Academy of Sciences, Beijing (China); Dong, Hongjun; Zhu, Yan; Mao, Shaoming [CAS Key Laboratory of Microbial Physiological and Metabolic Engineering, Institute of Microbiology, Chinese Academy of Sciences, Beijing (China); Zhang, Tianrui [CAS Key Laboratory of Microbial Physiological and Metabolic Engineering, Institute of Microbiology, Chinese Academy of Sciences, Beijing (China); Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin (China); Zhang, Yanping [CAS Key Laboratory of Microbial Physiological and Metabolic Engineering, Institute of Microbiology, Chinese Academy of Sciences, Beijing (China); Chen, Zugen [Department of Human Genetics, School of Medicine, University of California, Los Angeles, CA 90095 (United States); Li, Yin, E-mail: yli@im.ac.cn [CAS Key Laboratory of Microbial Physiological and Metabolic Engineering, Institute of Microbiology, Chinese Academy of Sciences, Beijing (China)

    2014-08-08

    Highlights: • Genomes of a butanol-tolerant strain and its parent strain were deciphered. • Comparative genomics and proteomics were applied to understand butanol tolerance. • None of the differentially expressed proteins have mutations in their corresponding genes. • Mutations in the ribosome might be responsible for the global difference in proteomics. - Abstract: Clostridium acetobutylicum strain Rh8 is a butanol-tolerant mutant which can tolerate up to 19 g/L butanol, 46% higher than that of its parent strain DSM 1731. We previously performed comparative cytoplasm- and membrane-proteomic analyses to understand the mechanism underlying the improved butanol tolerance of strain Rh8. In this work, we further extended this comparison to the genomic level. Compared with the genome of the parent strain DSM 1731, two insertion sites, four deletion sites, and 67 single nucleotide variations (SNVs) are distributed throughout the genome of strain Rh8. Among the 67 SNVs, 16 SNVs are located in the predicted promoters and intergenic regions, while 29 SNVs are located in the coding sequence, affecting a total of 21 proteins involved in transport, cell structure, DNA replication, and protein translation. The remaining 22 SNVs are located in the ribosomal genes, affecting a total of 12 rRNA genes in different operons. Analysis of previous comparative proteomic data indicated that none of the differentially expressed proteins have mutations in their corresponding genes. Rchange Algorithms analysis indicated that the mutations that occurred in the ribosomal genes might change the ribosome RNA thermodynamic characteristics, thus affecting the translation strength of these proteins. Taken together, the improved butanol tolerance of C. acetobutylicum strain Rh8 might be acquired through regulating the translational process to achieve different expression strength of genes involved in butanol tolerance.

  7. Characterization of genomic variations in SNPs of PE_PGRS genes reveals deletions and insertions in extensively drug resistant (XDR) M. tuberculosis strains from Pakistan

    KAUST Repository

    Kanji, Akbar

    2015-03-01

    Background: Mycobacterium tuberculosis (MTB) PE_PGRS genes belong to the PE multi-gene family. Although the function of the members of the PE_PGRS multi-gene family is not yet known, it is hypothesized that the PE_PGRS genes may be associated with genetic variability. Material and methods: Whole genome sequencing analysis was performed on (n= 37) extensively drug resistant (XDR) MTB strains from Pakistan which included Central Asian (n= 23), East African Indian (n= 2), X3 (n= 1), T group (n= 3) and Orphan (n= 8) MTB strains. Results: By analyzing 42 PE_PGRS genes, 111 SNPs were identified, of which 13 were non-synonymous SNPs (nsSNPs). The nsSNPs identified in the PE_PGRS genes were as follows: 6, 9, 10 and 55 present in each of the CAS, EAI, Orphan, T1 and X3 XDR MTB strains studied. Deletions in PE_PGRS genes: 19, 21 and 23 were observed in 7 (35.0%) CAS1 and 3 (37.5%) in Orphan XDR MTB strains, while deletions in the PE_PGRS genes: 49 and 50 were observed in 36 (95.0%) CAS1 and all CAS, CAS2 and Orphan XDR MTB strains. An insertion in PE_PGRS6 gene was observed in all CAS, EAI3 and Orphan, while insertions in the PE_PGRS genes 19 and 33 were observed in 19 (95%) CAS1 and all CAS, CAS2, EAI3 and Orphan XDR MTB strains. Conclusion: Genetic diversity in PE_PGRS genes contributes to antigenic variability and may result in increased immunogenicity of strains. This is the first study identifying variations in nsSNPs, Insertions and Deletions in the PE_PGRS genes of XDR-TB strains from Pakistan. It highlights common genetic variations which may contribute to persistence.

  8. Comparative Genome Analysis Between Aspergillus oryzae Strains Reveals Close Relationship Between Sites of Mutation Localization and Regions of Highly Divergent Genes among Aspergillus Species

    OpenAIRE

    Umemura, Myco; Koike, Hideaki; Yamane, Noriko; Koyama, Yoshinori; Satou, Yuki; Kikuzato, Ikuya; Teruya, Morimi; Tsukahara, Masatoshi; Imada, Yumi; Wachi, Youji; Miwa, Yukino; Yano, Shuichi; Tamano, Koichi; Kawarabayasi, Yutaka; Fujimori, Kazuhiro E.

    2012-01-01

    Aspergillus oryzae has been utilized for over 1000 years in Japan for the production of various traditional foods, and a large number of A. oryzae strains have been isolated and/or selected for the effective fermentation of food ingredients. Characteristics of genetic alterations among the strains used are of particular interest in studies of A. oryzae. Here, we have sequenced the whole genome of an industrial fungal isolate, A. oryzae RIB326, by using a next-generation sequencing system and ...

  9. Analysis of the genome sequence of the pathogenic Muscovy duck parvovirus strain YY reveals a 14-nucleotide-pair deletion in the inverted terminal repeats.

    Science.gov (United States)

    Wang, Jianye; Huang, Yu; Zhou, Mingxu; Zhu, Guoqiang

    2016-09-01

    Genomic information about Muscovy duck parvovirus (MDPV) is still limited. In this study, the genome of the pathogenic MDPV strain YY was sequenced. The full-length genome of YY is 5075 nucleotides (nt) long, 57 nt shorter than that of strain FM. Sequence alignment indicates that the 5' and 3' inverted terminal repeats (ITR) of strain YY contain a 14-nucleotide-pair deletion in the stem of the palindromic hairpin structure in comparison to strains FM and FZ91-30. The deleted region contains one "E-box" site and one repeated motif with the sequence "TTCCGGT" or "ACCGGAA". Phylogenetic trees constructed based on the protein-coding genes concordantly showed that YY, together with nine other MDPV isolates from various places, clustered in a separate branch, distinct from the branch formed by goose parvovirus (GPV) strains. These results demonstrate that, despite the distinctive deletion, the YY strain still belongs to the classical MDPV group. Moreover, the deletion in the ITR may contribute to the genome evolution of MDPV under immunization pressure.

  10. Genome sequences of lower Great Lakes Microcystis sp. reveal strain-specific genes that are present and expressed in western Lake Erie blooms.

    Directory of Open Access Journals (Sweden)

    Kevin Anthony Meyer

    Full Text Available Blooms of the potentially toxic cyanobacterium Microcystis are increasing worldwide. In the Laurentian Great Lakes they pose major socioeconomic, ecological, and human health threats, particularly in western Lake Erie. However, the interpretation of "omics" data is constrained by the highly variable genome of Microcystis and the small number of reference genome sequences from strains isolated from the Great Lakes. To address this, we sequenced two Microcystis isolates from Lake Erie (Microcystis aeruginosa LE3 and M. wesenbergii LE013-01) and one from upstream Lake St. Clair (M. cf aeruginosa LSC13-02), and compared these data to the genomes of seventeen Microcystis spp. from across the globe as well as one metagenome and seven metatranscriptomes from a 2014 Lake Erie Microcystis bloom. For the publicly available strains analyzed, the core genome is ~1900 genes, representing ~11% of total genes in the pan-genome and ~45% of each strain's genome. The flexible genome content was related to Microcystis subclades defined by phylogenetic analysis of both housekeeping genes and total core genes. To our knowledge this is the first evidence that the flexible genome is linked to the core genome of the Microcystis species complex. The majority of strain-specific genes were present and expressed in bloom communities in Lake Erie. Roughly 8% of these genes from the lower Great Lakes are involved in genome plasticity (rapid gain, loss, or rearrangement of genes) and resistance to foreign genetic elements (such as CRISPR-Cas systems). Intriguingly, strain-specific genes from Microcystis cultured from around the world were also present and expressed in the Lake Erie blooms, suggesting that the Microcystis pangenome is truly global. The presence and expression of flexible genes, including strain-specific genes, suggests that strain-level genomic diversity may be important in maintaining Microcystis abundance during bloom events.

  11. The NOAA Dataset Identifier Project

    Science.gov (United States)

    de la Beaujardiere, J.; Mccullough, H.; Casey, K. S.

    2013-12-01

    The US National Oceanic and Atmospheric Administration (NOAA) initiated a project in 2013 to assign persistent identifiers to datasets archived at NOAA and to create informational landing pages about those datasets. The goals of this project are to enable the citation of datasets used in products and results in order to help provide credit to data producers, to support traceability and reproducibility, and to enable tracking of data usage and impact. A secondary goal is to encourage the submission of datasets for long-term preservation, because only archived datasets will be eligible for a NOAA-issued identifier. A team was formed with representatives from the National Geophysical, Oceanographic, and Climatic Data Centers (NGDC, NODC, NCDC) to resolve questions including which identifier scheme to use (answer: Digital Object Identifier - DOI), whether or not to embed semantics in identifiers (no), the level of granularity at which to assign identifiers (as coarsely as reasonable), how to handle ongoing time-series data (do not break into chunks), creation mechanism for the landing page (stylesheet from formal metadata record preferred), and others. Decisions made and implementation experience gained will inform the writing of a Data Citation Procedural Directive to be issued by the Environmental Data Management Committee in 2014. Several identifiers have been issued as of July 2013, with more on the way. NOAA is now reporting the number as a metric to federal Open Government initiatives. This paper will provide further details and status of the project.

  12. Complete genome sequence of a novel H9N2 subtype influenza virus FJG9 strain in China reveals a natural reassortant event.

    Science.gov (United States)

    Xie, Qingmei; Yan, Zhuanqiang; Ji, Jun; Zhang, Huanmin; Liu, Jun; Sun, Yue; Li, Guangwei; Chen, Feng; Xue, Chunyi; Ma, Jingyun; Bee, Yingzuo

    2012-09-01

    A/chicken/FJ/G9/09 (FJ/G9) is an H9N2 subtype avian influenza virus (H9N2 AIV) strain causing high morbidity that was isolated from broilers in Fujian Province of China in 2009. FJ/G9 has been used as the vaccine strain against H9N2 AIV infection in Fujian Province of China. Here, we report the complete genome sequence of FJ/G9 with natural six-way reassortment, which is the most complex genotype strain in China and even in the world so far. The present findings will aid in understanding the complexity and diversity of H9N2 subtype avian influenza virus.

  13. Comparative Genomics of Mycoplasma bovis Strains Reveals That Decreased Virulence with Increasing Passages Might Correlate with Potential Virulence-Related Factors

    Directory of Open Access Journals (Sweden)

    Muhammad A. Rasheed

    2017-05-01

    Full Text Available Mycoplasma bovis is an important cause of bovine respiratory disease worldwide. To understand its virulence mechanisms, we sequenced three attenuated M. bovis strains, P115, P150, and P180, which were passaged in vitro 115, 150, and 180 times, respectively, and exhibited progressively decreasing virulence. Comparative genomics was performed among the wild-type M. bovis HB0801 (P1) strain and the P115, P150, and P180 strains, and one 14.2-kb deleted region covering 14 genes was detected in the passaged strains. Additionally, 46 non-sense single-nucleotide polymorphisms and indels were detected, which confirmed that more passages result in more mutations. A subsequent collective bioinformatics analysis of paralogs, metabolic pathways, protein-protein interactions, secretory proteins, functionally conserved domains, and virulence-related factors identified 11 genes that likely contributed to the increased attenuation in the passaged strains. These genes encode ascorbate-specific phosphotransferase system enzyme IIB and IIA components, enolase, L-lactate dehydrogenase, pyruvate kinase, glycerol, and multiple sugar ATP-binding cassette transporters, ATP binding proteins, NADH dehydrogenase, phosphate acetyltransferase, transketolase, and a variable surface protein. Fifteen genes were shown to be enriched in 15 metabolic pathways, and they included the aforementioned genes encoding pyruvate kinase, transketolase, enolase, and L-lactate dehydrogenase. Hydrogen peroxide (H2O2) production in M. bovis strains representing seven passages from P1 to P180 decreased progressively with increasing numbers of passages and increased attenuation. However, eight mutants specific to eight individual genes within the 14.2-kb deleted region did not exhibit altered H2O2 production. These results enrich the M. bovis genomics database, and they increase our understanding of the mechanisms underlying M. bovis virulence.

  14. Full genome analysis of rotavirus G9P[8] strains identified in acute gastroenteritis cases reveals genetic diversity: Pune, western India.

    Science.gov (United States)

    Tatte, Vaishali S; Chaphekar, Deepa; Gopalkrishna, Varanasi

    2017-08-01

    Group A rotaviruses (RVA) are the major enteric etiological agents of severe acute gastroenteritis among children globally. As G9 RVA now represents one of the major human RVA genotypes, studies on the full genome of this particular genotype are being carried out worldwide. So far, no such studies on G9P[8] RVAs have been reported from Pune, in the western part of India. In view of this, the study was undertaken to understand the degree of genetic diversity of the commonly circulating G9P[8] RVA strains. Rotavirus surveillance studies carried out earlier during the years 2009-2011 showed an increase in the prevalence of G9P[8] RVAs. Representative G9P[8] RVA strains from the years 2009, 2010, and 2011 were selected for the study. In general, all the G9 RVA strains clustered in the globally circulating sublineage of the VP7 gene and showed nucleotide/amino acid identities of 96.8-99.7%/96.9-99.8% with global G9 RV strains. Full genome analysis of all three RVAs in this study indicated the Wa-like genotype constellation G9-P[8]-I1-R1-C1-M1-A1-N1-T1-E1-H1. Within the strains, nucleotide/amino acid divergence of 0.1-3.4%/0.0-4.1% was noted in all the RVA structural and non-structural genes. In conclusion, the present study highlights intra-genotypic variations throughout the RVA genome. The study further emphasizes the need for surveillance and analysis of the whole genomic constellation of the commonly circulating RVA strains of other regions in the country to better understand the impact of rotavirus vaccination recently introduced in India. © 2017 Wiley Periodicals, Inc.

  15. The Harvard organic photovoltaic dataset.

    Science.gov (United States)

    Lopez, Steven A; Pyzer-Knapp, Edward O; Simm, Gregor N; Lutzow, Trevor; Li, Kewei; Seress, Laszlo R; Hachmann, Johannes; Aspuru-Guzik, Alán

    2016-09-27

    The Harvard Organic Photovoltaic Dataset (HOPV15) presented in this work is a collation of experimental photovoltaic data from the literature, and corresponding quantum-chemical calculations performed over a range of conformers, each with quantum chemical results using a variety of density functionals and basis sets. It is anticipated that this dataset will be of use in both relating electronic structure calculations to experimental observations through the generation of calibration schemes, as well as for the creation of new semi-empirical methods and the benchmarking of current and future model chemistries for organic electronic applications.

  16. The Harvard organic photovoltaic dataset

    Science.gov (United States)

    Lopez, Steven A.; Pyzer-Knapp, Edward O.; Simm, Gregor N.; Lutzow, Trevor; Li, Kewei; Seress, Laszlo R.; Hachmann, Johannes; Aspuru-Guzik, Alán

    2016-01-01

    The Harvard Organic Photovoltaic Dataset (HOPV15) presented in this work is a collation of experimental photovoltaic data from the literature, and corresponding quantum-chemical calculations performed over a range of conformers, each with quantum chemical results using a variety of density functionals and basis sets. It is anticipated that this dataset will be of use in both relating electronic structure calculations to experimental observations through the generation of calibration schemes, as well as for the creation of new semi-empirical methods and the benchmarking of current and future model chemistries for organic electronic applications. PMID:27676312

  17. Genetic characterization of Anaplasma marginale strains from Tunisia using single and multiple gene typing reveals novel variants with an extensive genetic diversity.

    Science.gov (United States)

    Ben Said, Mourad; Ben Asker, Alaa; Belkahia, Hanène; Ghribi, Raoua; Selmi, Rachid; Messadi, Lilia

    2018-05-12

    Anaplasma marginale, which is responsible for bovine anaplasmosis in tropical and subtropical regions, is a tick-borne obligatory intraerythrocytic bacterium of cattle and wild ruminants. In Tunisia, information about the genetic diversity and the phylogeny of A. marginale strains is limited to msp4 gene analysis. The purpose of this study is to investigate A. marginale isolates infecting 16 cattle located in different bioclimatic areas of northern Tunisia with single gene analysis and multilocus sequence typing methods on the basis of seven partial genes (dnaA, ftsZ, groEL, lipA, secY, recA and sucB). The single gene analysis confirmed the presence of different and novel heterogenic A. marginale strains infecting cattle from the north of Tunisia. The concatenated sequence analysis showed a phylogeographical resolution at the global level and that most of the Tunisian sequence types (STs) formed a separate cluster from a South African isolate and from all New World isolates and strains. By combining the characteristics of each single locus with those of the multi-loci scheme, these results provide a more detailed understanding of the diversity and the evolution of Tunisian A. marginale strains. Copyright © 2018 Elsevier GmbH. All rights reserved.

  18. The pan-genome of the animal pathogen Corynebacterium pseudotuberculosis reveals differences in genome plasticity between the biovar ovis and equi strains

    DEFF Research Database (Denmark)

    Soares, Siomar C; Silva, Artur; Trost, Eva

    2013-01-01

    Corynebacterium pseudotuberculosis infections pose a rising worldwide economic problem in ruminants. The complete genome sequences of 15 C. pseudotuberculosis strains isolated from different hosts and countries were comparatively analyzed using a pan-genomic strategy. Phylogenomic, pan-genomic, core genomic...

  19. Phenotypic and genotypic characterization of Brucella strains isolated from autochthonous livestock reveals the dominance of B. abortus biovar 3a in Nigeria.

    Science.gov (United States)

    Bertu, Wilson J; Ducrotoy, Marie J; Muñoz, Pilar M; Mick, Virginie; Zúñiga-Ripa, Amaia; Bryssinckx, Ward; Kwaga, Jacob K P; Kabir, Junaid; Welburn, Susan C; Moriyón, Ignacio; Ocholi, Reuben A

    2015-10-22

    Brucellosis is a zoonosis widespread worldwide and caused by bacteria of the genus Brucella. Control of this disease in a given area requires an understanding of the Brucella species circulating in livestock and humans. However, because of the difficulties intrinsic to Brucella isolation and typing, such data are scarce for resource-poor areas. The paucity of bacteriological data and the consequent imperfect epidemiological picture are particularly critical for Sahelian and sub-Saharan African countries. Here, we report on the characterization of 34 isolates collected between 1976 and 2012 from cattle, sheep and horses in Nigeria. All isolates were identified as Brucella abortus by Bruce-ladder PCR and assigned to biovar 3 by conventional typing. Further analysis by enhanced AMOS-ERY PCR showed that all of them belonged to the 3a sub-biovar, and MLVA analysis grouped them in a cluster clearly distinct from that formed by European B. abortus biovar 3b strains. Nevertheless, MLVA detected heterogeneity within the Nigerian biovar 3a strains. The close genetic profiles of the isolates from cattle, sheep and horses suggest that, at least in some parts of Nigeria, biovar 3a circulates among animal species that are not the preferential hosts of B. abortus. Consistent with previous genetic analyses of 7 strains from Ivory Coast, Gambia and Togo, the analysis of these 34 Nigerian strains supports the hypothesis that the B. abortus biovar 3a lineage is dominant in West African countries. Copyright © 2015 Elsevier B.V. All rights reserved.

  20. Comparative genome analysis of an avirulent and two virulent strains of avian Pasteurella multocida reveals candidate genes involved in fitness and pathogenicity

    Science.gov (United States)

    Fowl cholera is a highly contagious systemic disease affecting wild and domestic birds, frequently resulting in high morbidity and mortality. The causative agent is Pasteurella multocida (P. multocida). The completed genome of P. multocida strain Pm70 has been available for over eleven years and has...

  1. Comparative genomics using microarrays reveals divergence and loss of virulence-associated genes in host-specific strains of the insect pathogen Metarhizium anisopliae.

    Science.gov (United States)

    Wang, Sibao; Leclerque, Andreas; Pava-Ripoll, Monica; Fang, Weiguo; St Leger, Raymond J

    2009-06-01

    Many strains of Metarhizium anisopliae have broad host ranges, but others are specialists and adapted to particular hosts. Patterns of gene duplication, divergence, and deletion in three generalist and three specialist strains were investigated by heterologous hybridization of genomic DNA to genes from the generalist strain Ma2575. As expected, major life processes are highly conserved, presumably due to purifying selection. However, up to 7% of Ma2575 genes were highly divergent or absent in specialist strains. Many of these sequences are conserved in other fungal species, suggesting that there has been rapid evolution and loss in specialist Metarhizium genomes. Some poorly hybridizing genes in specialists were functionally coordinated, indicative of reductive evolution. These included several involved in toxin biosynthesis and sugar metabolism in root exudates, suggesting that specialists are losing genes required to live in alternative hosts or as saprophytes. Several components of mobile genetic elements were also highly divergent or lost in specialists. Exceptionally, the genome of the specialist cricket pathogen Ma443 contained extra insertion elements that might play a role in generating evolutionary novelty. This study throws light on the abundance of orphans in genomes, as 15% of orphan sequences were found to be rapidly evolving in the Ma2575 lineage.

  2. Comparative genome analysis of Lactobacillus casei strains isolated from Actimel and Yakult products reveals marked similarities and points to a common origin

    NARCIS (Netherlands)

    Douillard, F.P.; Kant, R.; Ritari, J.; Paulin, L.; Palva, A.; Vos, de W.M.

    2013-01-01

    The members of the Lactobacillus genus are widely used in the food and feed industry and show a remarkable ecological adaptability. Several Lactobacillus strains have been marketed as probiotics as they possess health-promoting properties for the host. In the present study, we used two complementary

  3. Comparative genomics analysis of Streptococcus agalactiae reveals that isolates from cultured tilapia in China are closely related to the human strain A909.

    Science.gov (United States)

    Liu, Guangjin; Zhang, Wei; Lu, Chengping

    2013-11-11

    Streptococcus agalactiae, also referred to as Group B Streptococcus (GBS), is a frequent resident of the rectovaginal tract in humans, and a major cause of neonatal infection. In addition, S. agalactiae is a known fish pathogen, which compromises food safety and represents a zoonotic hazard. The complete genome sequence of the piscine S. agalactiae isolate GD201008-001 was compared with 14 other piscine, human and bovine strains to explore their virulence determinants, evolutionary relationships and the genetic basis of host tropism in S. agalactiae. The pan-genome of S. agalactiae is open and its size increases with the addition of newly sequenced genomes. The core genes shared by all isolates account for 50 ~ 70% of any single genome. The Chinese piscine isolates GD201008-001 and ZQ0910 are phylogenetically distinct from the Latin American piscine isolates SA20-06 and STIR-CD-17, but are closely related to the human strain A909, in the context of the clustered regularly interspaced short palindromic repeats (CRISPRs), prophage, virulence-associated genes and phylogenetic relationships. We identified a unique 10 kb gene locus in Chinese piscine strains. Isolates from cultured tilapia in China have a close genomic relationship with the human strain A909. Our findings provide insight into the pathogenesis and host-associated genome content of piscine S. agalactiae isolated in China.
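
    The core-genome and pan-genome notions used in the abstract above can be illustrated with plain set operations. The sketch below is only a toy illustration, not the authors' pipeline: the isolate names and gene-family identifiers are hypothetical, and it assumes genes have already been clustered into families by some upstream tool.

```python
def core_and_pan(genomes):
    """Return (core, pan, core_fraction) for a dict of isolate -> set of gene families.

    core: gene families present in every isolate
    pan: gene families present in at least one isolate
    core_fraction: share of each isolate's gene families that belong to the core
    """
    gene_sets = list(genomes.values())
    core = set.intersection(*gene_sets)
    pan = set.union(*gene_sets)
    core_fraction = {name: len(core) / len(genes) for name, genes in genomes.items()}
    return core, pan, core_fraction


# Hypothetical toy data: three isolates with made-up gene-family identifiers.
toy_genomes = {
    "isolate_A": {"g1", "g2", "g3", "g4"},
    "isolate_B": {"g1", "g2", "g3", "g5", "g6"},
    "isolate_C": {"g1", "g2", "g7"},
}

core, pan, core_fraction = core_and_pan(toy_genomes)
print(sorted(core), len(pan), core_fraction)
```

    In this framing, an "open" pan-genome is one where the size of `pan` keeps growing as further genomes are added, while the core shrinks or stabilizes.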

  4. Selective sweep analysis in the genomes of the 91-R and 91-C Drosophila melanogaster strains reveals few of the ‘usual suspects’ in Dichlorodiphenyltrichloroethane (DDT) resistance

    Science.gov (United States)

    Adaptation of insect phenotypes for survival after exposure to xenobiotics can result from selection at multiple loci with additive genetic effects. A high level dichlorodiphenyltrichloroethane (DDT) resistance phenotype in the Drosophila melanogaster strain 91-R has resulted due to continuous labo...

  5. High genetic differentiation between an African and a non-African strain of Drosophila simulans revealed by segregation distortion and reduced crossover frequency.

    Science.gov (United States)

    Tatsuta, Haruki; Takano-Shimizu, Toshiyuki

    2009-11-01

    Drosophila simulans strains originating from Madagascar and nearby islands in the Indian Ocean often differ from those elsewhere in the number of sex comb teeth and the degree of morphological anomaly in hybrids with D. melanogaster. Here, we report a strong segregation distortion in the F1 intercross between two D. simulans strains originating from Madagascar and the US, possibly at both the gametic and zygotic levels. Strong bias against alleles of the Madagascar strain was observed for all ten marker loci distributed over the entire second chromosome in the F1 intercross, but only a few showed a weak distortion in the isogenic backgrounds of either strain. Significant deviations of genotype frequencies from Hardy-Weinberg proportions were consistently observed for the second chromosome. By contrast, the X and third chromosomes did not show any strong segregation distortion. Crossover frequency on the second chromosome was uniformly reduced in isogenic backgrounds whereas the map lengths in the F1 intercross were comparable to or larger than that of the standard D. melanogaster map. We discuss these findings in relation to previous studies on other traits and interspecific differences between D. mauritiana, which is endemic to Mauritius Island, and D. simulans.
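
    A deviation of genotype frequencies from Hardy-Weinberg proportions, as mentioned in the abstract above, is commonly assessed with a chi-square goodness-of-fit test at a biallelic marker. The sketch below shows that textbook calculation only; the genotype counts are hypothetical and the abstract does not state which test the authors actually used.

```python
def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic for deviation from Hardy-Weinberg proportions.

    n_aa, n_ab, n_bb: observed counts of the two homozygotes and the heterozygote.
    Returns (p, expected_counts, chi2), where p is the frequency of allele A.
    Assumes both alleles are present, so that no expected count is zero.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return p, expected, chi2


# Hypothetical counts for one marker locus (not taken from the study).
p, expected, chi2 = hwe_chi_square(50, 20, 30)
print(p, expected, chi2)
```

    With one degree of freedom, a statistic above roughly 3.84 corresponds to P < 0.05, so the hypothetical counts here (a strong heterozygote deficit) would be flagged as deviating.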

  6. Querying Large Biological Network Datasets

    Science.gov (United States)

    Gulsoy, Gunhan

    2013-01-01

    New experimental methods have resulted in an increasing amount of genetic interaction data being generated every day. Biological networks are used to store the genetic interaction data gathered. The increasing amount of available data requires fast, large-scale analysis methods. Therefore, we address the problem of querying large biological network datasets.…

  7. Fluxnet Synthesis Dataset Collaboration Infrastructure

    Energy Technology Data Exchange (ETDEWEB)

    Agarwal, Deborah A. [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)]; Humphrey, Marty [Univ. of Virginia, Charlottesville, VA (United States)]; van Ingen, Catharine [Microsoft, San Francisco, CA (United States)]; Beekwilder, Norm [Univ. of Virginia, Charlottesville, VA (United States)]; Goode, Monte [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)]; Jackson, Keith [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)]; Rodriguez, Matt [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)]; Weber, Robin [Univ. of California, Berkeley, CA (United States)]

    2008-02-06

    The Fluxnet synthesis dataset originally compiled for the La Thuile workshop contained approximately 600 site years. Since the workshop, several additional site years have been added and the dataset now contains over 920 site years from over 240 sites. A data refresh update is expected to increase those numbers in the next few months. The ancillary data describing the sites continues to evolve as well. There are on the order of 120 site contacts, and 60 proposals have been approved to use the data. These proposals involve around 120 researchers. The size and complexity of the dataset and collaboration have led to a new approach to providing access to the data and collaboration support. The support team attended the workshop and worked closely with the attendees and the Fluxnet project office to define the requirements for the support infrastructure. As a result of this effort, a new website (http://www.fluxdata.org) has been created to provide access to the Fluxnet synthesis dataset. This new website is based on a scientific data server which enables browsing of the data online, data download, and version tracking. We leverage database and data analysis tools such as OLAP data cubes and web reports to enable browser and Excel pivot table access to the data.

  8. High genetic diversity among strains of the unindustrialized lactic acid bacterium Carnobacterium maltaromaticum in dairy products as revealed by multilocus sequence typing.

    Science.gov (United States)

    Rahman, Abdur; Cailliez-Grimal, Catherine; Bontemps, Cyril; Payot, Sophie; Chaillou, Stéphane; Revol-Junelles, Anne-Marie; Borges, Frédéric

    2014-07-01

    Dairy products are colonized with three main classes of lactic acid bacteria (LAB): opportunistic bacteria, traditional starters, and industrial starters. Most of the population structure studies were previously performed with LAB species belonging to these three classes and give interesting knowledge about the population structure of LAB at the stage where they are already industrialized. However, these studies give little information about the population structure of LAB prior to their use as an industrial starter. Carnobacterium maltaromaticum is a LAB colonizing diverse environments, including dairy products. Since this bacterium was discovered relatively recently, it is not yet commercialized as an industrial starter, which makes C. maltaromaticum an interesting model for the study of unindustrialized LAB population structure in dairy products. A multilocus sequence typing scheme based on an analysis of fragments of the genes dapE, ddlA, glpQ, ilvE, pyc, pyrE, and leuS was applied to a collection of 47 strains, including 28 strains isolated from dairy products. The scheme allowed detecting 36 sequence types with a discriminatory index of 0.98. The whole population was clustered in four deeply branched lineages, in which the dairy strains were spread. Moreover, the dairy strains could exhibit a high diversity within these lineages, leading to an overall dairy population with a diversity level as high as that of the nondairy population. These results are in agreement with the hypothesis according to which the industrialization of LAB leads to a diversity reduction in dairy products. Copyright © 2014, American Society for Microbiology. All Rights Reserved.
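
    The discriminatory index quoted for a typing scheme is usually Simpson's index of diversity in the Hunter-Gaston form, D = 1 - [1 / (N(N - 1))] * sum of n_j(n_j - 1), where N is the number of strains and n_j the number of strains assigned to type j. The sketch below assumes that definition, which the abstract does not name explicitly, and uses hypothetical sequence-type labels rather than the study's data.

```python
from collections import Counter


def discriminatory_index(type_labels):
    """Hunter-Gaston discriminatory index for a list of type assignments.

    type_labels: one label (for example a sequence type) per strain.
    Returns the probability that two strains drawn at random without
    replacement belong to different types.
    """
    n = len(type_labels)
    if n < 2:
        raise ValueError("at least two strains are required")
    counts = Counter(type_labels)
    same_type_pairs = sum(c * (c - 1) for c in counts.values())
    return 1.0 - same_type_pairs / (n * (n - 1))


# Hypothetical assignments: 7 strains spread over 4 sequence types.
labels = ["ST1"] * 3 + ["ST2"] * 2 + ["ST3", "ST4"]
d = discriminatory_index(labels)
print(round(d, 4))
```

    A value close to 1, such as the 0.98 reported above, means that almost any two randomly chosen strains fall into different sequence types.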

  9. Comparative Transcriptomics of Bacillus mycoides Strains in Response to Potato-Root Exudates Reveals Different Genetic Adaptation of Endophytic and Soil Isolates.

    Science.gov (United States)

    Yi, Yanglei; de Jong, Anne; Frenzel, Elrike; Kuipers, Oscar P

    2017-01-01

    Plant root secreted compounds alter the gene expression of associated microorganisms by acting as signal molecules that either stimulate or repel the interaction with beneficial or harmful species, respectively. However, it is still unclear whether two distinct groups of beneficial bacteria, non-plant-associated (soil) strains and plant-associated (endophytic) strains, respond uniformly or variably to the exposure with root exudates. Therefore, Bacillus mycoides, a potential biocontrol agent and plant growth-promoting bacterium, was isolated from the endosphere of potatoes and from soil of the same geographical region. Confocal fluorescence microscopy of plants inoculated with GFP-tagged B. mycoides strains showed that the endosphere isolate EC18 had a stronger plant colonization ability and competed more successfully for the colonization sites than the soil isolate SB8. To dissect these phenotypic differences, the genomes of the two strains were sequenced and the transcriptome response to potato root exudates was compared. The global transcriptome profiles evidenced that the endophytic isolate responded more strongly than the soil-derived isolate and a higher number of significantly differentially expressed genes were detected. Both isolates responded with the alteration of expression of an overlapping set of genes, which had previously been reported to be involved in plant-microbe interactions, including organic substance metabolism, oxidative reduction, and transmembrane transport. Notably, several genes were specifically upregulated in the endosphere isolate EC18, while being oppositely downregulated in the soil isolate SB8. These genes mainly encoded membrane proteins, transcriptional regulators or were involved in amino acid metabolism and biosynthesis. By contrast, several genes upregulated in the soil isolate SB8 and downregulated in the endosphere isolate EC18 were related to sugar transport, which might coincide with the different nutrient availability

  10. The gene expression profile of resistant and susceptible Bombyx mori strains reveals cypovirus-associated variations in host gene transcript levels.

    Science.gov (United States)

    Guo, Rui; Wang, Simei; Xue, Renyu; Cao, Guangli; Hu, Xiaolong; Huang, Moli; Zhang, Yangqi; Lu, Yahong; Zhu, Liyuan; Chen, Fei; Liang, Zi; Kuang, Sulan; Gong, Chengliang

    2015-06-01

    High-throughput paired-end RNA sequencing (RNA-Seq) was performed to investigate the gene expression profile of a susceptible Bombyx mori strain, Lan5, and a resistant B. mori strain, Ou17, which were both orally infected with B. mori cypovirus (BmCPV) in the midgut. There were 330 and 218 up-regulated genes, while there were 147 and 260 down-regulated genes in the Lan5 and Ou17 strains, respectively. Gene ontology (GO) enrichment and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment for differentially expressed genes (DEGs) were carried out. Moreover, gene interaction network (STRING) analyses were performed to analyze the relationships among the shared DEGs. Some of these genes were related and formed a large network, in which the genes for B. mori cuticular protein RR-2 motif 123 (BmCPR123) and the gene for B. mori DNA replication licensing factor Mcm2-like (BmMCM2) were key genes among the common up-regulated DEGs, whereas the gene for B. mori heat shock protein 20.1 (Bmhsp20.1) was the central gene among the shared down-regulated DEGs between Lan5 vs Lan5-CPV and Ou17 vs Ou17-CPV. These findings established a comprehensive database of genes that are differentially expressed in response to BmCPV infection between silkworm strains that differed in resistance to BmCPV and implied that these DEGs might be involved in B. mori immune responses against BmCPV infection.

  11. Sequencing and characterisation of rearrangements in three S. pastorianus strains reveals the presence of chimeric genes and gives evidence of breakpoint reuse.

    Directory of Open Access Journals (Sweden)

    Sarah K Hewitt

    Full Text Available Gross chromosomal rearrangements have the potential to be evolutionarily advantageous to an adapting organism. The generation of a hybrid species increases opportunity for recombination by bringing together two homologous genomes. We sought to define the location of genomic rearrangements in three strains of Saccharomyces pastorianus, a natural lager-brewing yeast hybrid of Saccharomyces cerevisiae and Saccharomyces eubayanus, using whole genome shotgun sequencing. Each strain of S. pastorianus has lost species-specific portions of its genome and has undergone extensive recombination, producing chimeric chromosomes. We predicted 30 breakpoints that we confirmed at the single nucleotide level by designing species-specific primers that flank each breakpoint, and then sequencing the PCR product. These rearrangements are the result of recombination between areas of homology between the two subgenomes, rather than repetitive elements such as transposons or tRNAs. Interestingly, 28/30 S. cerevisiae-S. eubayanus recombination breakpoints are located within genic regions, generating chimeric genes. Furthermore we show evidence for the reuse of two breakpoints, located in HSP82 and KEM1, in strains of proposed independent origin.

  12. Biochemical and full genome sequence analyses of clinical Vibrio cholerae isolates in Mexico reveals the presence of novel V. cholerae strains.

    Science.gov (United States)

    Díaz-Quiñonez, José Alberto; Hernández-Monroy, Irma; Montes-Colima, Norma Angélica; Moreno-Pérez, María Asunción; Galicia-Nicolás, Adriana Guadalupe; López-Martínez, Irma; Ruiz-Matus, Cuitláhuac; Kuri-Morales, Pablo; Ortíz-Alcántara, Joanna María; Garcés-Ayala, Fabiola; Ramírez-González, José Ernesto

    2016-05-01

    In the first week of September 2013, the National Epidemiological Surveillance System identified two cases of cholera in Mexico City. The cultures of both samples were confirmed as Vibrio cholerae serogroup O1, serotype Ogawa, biotype El Tor. Initial analyses by PFGE and by PCR amplification of the virulence genes suggested that both strains were similar, but different from those previously reported in Mexico. The following week, four more cases were identified in a community in the state of Hidalgo, located 121 km northeast of Mexico City. Thereafter a cholera outbreak started in the region of La Huasteca. Genomic analyses of the four strains obtained in this study confirmed the presence of Pathogenicity Islands VPI-1 and -2, VSP-1 and -2, and of the integrative element SXT. The genomic structure of the 4 isolates was similar to that of V. cholerae strain 2010 EL-1786, identified during the epidemic in Haiti in 2010. Copyright © 2016 Institut Pasteur. Published by Elsevier Masson SAS. All rights reserved.

  13. Genomic Investigation Reveals Highly Conserved, Mosaic, Recombination Events Associated with Capsular Switching among Invasive Neisseria meningitidis Serogroup W Sequence Type (ST)-11 Strains.

    Science.gov (United States)

    Mustapha, Mustapha M; Marsh, Jane W; Krauland, Mary G; Fernandez, Jorge O; de Lemos, Ana Paula S; Dunning Hotopp, Julie C; Wang, Xin; Mayer, Leonard W; Lawrence, Jeffrey G; Hiller, N Luisa; Harrison, Lee H

    2016-07-03

    Neisseria meningitidis is an important cause of meningococcal disease globally. Sequence type (ST)-11 clonal complex (cc11) is a hypervirulent meningococcal lineage historically associated with serogroup C capsule and is believed to have acquired the W capsule through a C to W capsular switching event. We studied the sequence of capsule gene cluster (cps) and adjoining genomic regions of 524 invasive W cc11 strains isolated globally. We identified recombination breakpoints corresponding to two distinct recombination events within W cc11: an 8.4-kb recombinant region likely acquired from W cc22, including the sialic acid/glycosyl-transferase gene csw, which resulted in a C→W change in capsular phenotype, and a 13.7-kb recombinant segment likely acquired from the Y cc23 lineage, which includes 4.5 kb of cps genes and 8.2 kb downstream of the cps cluster, resulting in allelic changes in capsule translocation genes. A vast majority of W cc11 strains (497/524, 94.8%) retain both recombination events as evidenced by sharing identical or very closely related capsular allelic profiles. These data suggest that the W cc11 capsular switch involved two separate recombination events and that current global W cc11 meningococcal disease is caused by strains bearing this mosaic capsular switch. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  14. Comparative Genome Analysis Between Aspergillus oryzae Strains Reveals Close Relationship Between Sites of Mutation Localization and Regions of Highly Divergent Genes among Aspergillus Species

    Science.gov (United States)

    Umemura, Myco; Koike, Hideaki; Yamane, Noriko; Koyama, Yoshinori; Satou, Yuki; Kikuzato, Ikuya; Teruya, Morimi; Tsukahara, Masatoshi; Imada, Yumi; Wachi, Youji; Miwa, Yukino; Yano, Shuichi; Tamano, Koichi; Kawarabayasi, Yutaka; Fujimori, Kazuhiro E.; Machida, Masayuki; Hirano, Takashi

    2012-01-01

    Aspergillus oryzae has been utilized for over 1000 years in Japan for the production of various traditional foods, and a large number of A. oryzae strains have been isolated and/or selected for the effective fermentation of food ingredients. Characteristics of genetic alterations among the strains used are of particular interest in studies of A. oryzae. Here, we have sequenced the whole genome of an industrial fungal isolate, A. oryzae RIB326, by using a next-generation sequencing system and compared the data with those of A. oryzae RIB40, a wild-type strain sequenced in 2005. The aim of this study was to evaluate the mutation pressure on the non-syntenic blocks (NSBs) of the genome, which were previously identified through comparative genomic analysis of A. oryzae, Aspergillus fumigatus, and Aspergillus nidulans. We found that genes within the NSBs of RIB326 accumulate mutations more frequently than those within the SBs, regardless of their distance from the telomeres or of their expression level. Our findings suggest that the high mutation frequency of NSBs might contribute to maintaining the diversity of the A. oryzae genome. PMID:22912434

  15. Comparative genome analysis between Aspergillus oryzae strains reveals close relationship between sites of mutation localization and regions of highly divergent genes among Aspergillus species.

    Science.gov (United States)

    Umemura, Myco; Koike, Hideaki; Yamane, Noriko; Koyama, Yoshinori; Satou, Yuki; Kikuzato, Ikuya; Teruya, Morimi; Tsukahara, Masatoshi; Imada, Yumi; Wachi, Youji; Miwa, Yukino; Yano, Shuichi; Tamano, Koichi; Kawarabayasi, Yutaka; Fujimori, Kazuhiro E; Machida, Masayuki; Hirano, Takashi

    2012-10-01

    Aspergillus oryzae has been utilized for over 1000 years in Japan for the production of various traditional foods, and a large number of A. oryzae strains have been isolated and/or selected for the effective fermentation of food ingredients. Characteristics of genetic alterations among the strains used are of particular interest in studies of A. oryzae. Here, we have sequenced the whole genome of an industrial fungal isolate, A. oryzae RIB326, by using a next-generation sequencing system and compared the data with those of A. oryzae RIB40, a wild-type strain sequenced in 2005. The aim of this study was to evaluate the mutation pressure on the non-syntenic blocks (NSBs) of the genome, which were previously identified through comparative genomic analysis of A. oryzae, Aspergillus fumigatus, and Aspergillus nidulans. We found that genes within the NSBs of RIB326 accumulate mutations more frequently than those within the SBs, regardless of their distance from the telomeres or of their expression level. Our findings suggest that the high mutation frequency of NSBs might contribute to maintaining the diversity of the A. oryzae genome.

  16. CERC Dataset (Full Hadza Data)

    DEFF Research Database (Denmark)

    2016-01-01

    The dataset includes demographic, behavioral, and religiosity data from eight different populations from around the world. The samples were drawn from: (1) Coastal and (2) Inland Tanna, Vanuatu; (3) Hadzaland, Tanzania; (4) Lovu, Fiji; (5) Pointe aux Piment, Mauritius; (6) Pesqueiro, Brazil; (7) Kyzyl, Tyva Republic; and (8) Yasawa, Fiji. Related publication: Purzycki, et al. (2016). Moralistic Gods, Supernatural Punishment and the Expansion of Human Sociality. Nature, 530(7590): 327-330....

  17. Viking Seismometer PDS Archive Dataset

    Science.gov (United States)

    Lorenz, R. D.

    2016-12-01

    The Viking Lander 2 seismometer operated successfully for over 500 Sols on the Martian surface, recording at least one likely candidate Marsquake. The Viking mission, in an era when data handling hardware (both on board and on the ground) was limited in capability, predated modern planetary data archiving, and ad-hoc repositories of the data, and the very low-level record at NSSDC, were neither convenient to process nor well-known. In an effort supported by the NASA Mars Data Analysis Program, we have converted the bulk of the Viking dataset (namely the 49,000 and 270,000 records made in High- and Event- modes at 20 and 1 Hz respectively) into a simple ASCII table format. Additionally, since wind-generated lander motion is a major component of the signal, contemporaneous meteorological data are included in summary records to facilitate correlation. These datasets are being archived at the PDS Geosciences Node. In addition to brief instrument and dataset descriptions, the archive includes code snippets in the freely-available language 'R' to demonstrate plotting and analysis. Further, we present examples of lander-generated noise, associated with the sampler arm, instrument dumps and other mechanical operations.

  18. PHYSICS PERFORMANCE AND DATASET (PPD)

    CERN Multimedia

    L. Silvestris

    2013-01-01

    The first part of the Long Shutdown period has been dedicated to the preparation of the samples for the analysis targeting the summer conferences. In particular, the 8 TeV data acquired in 2012, including most of the “parked datasets”, have been reconstructed profiting from improved alignment and calibration conditions for all the sub-detectors. Careful planning of the resources was essential in order to deliver the datasets well in time to the analysts, and to schedule the update of all the conditions and calibrations needed at the analysis level. The newly reprocessed data have undergone detailed scrutiny by the Dataset Certification team, allowing some of the data to be recovered for analysis usage and further improving the certification efficiency, which is now at 91% of the recorded luminosity. With the aim of delivering a consistent dataset for 2011 and 2012, both in terms of conditions and release (53X), the PPD team is now working to set up a data re-reconstruction and a new MC pro...

  19. Genetic analysis of environmental strains of the plant pathogen Phytophthora capsici reveals heterogeneous repertoire of effectors and possible effector evolution via genomic island.

    Science.gov (United States)

    Iribarren, María Josefina; Pascuan, Cecilia; Soto, Gabriela; Ayub, Nicolás Daniel

    2015-11-01

    Phytophthora capsici is a virulent oomycete pathogen of many vegetable crops. Recently, it has been demonstrated that the recognition of the RXLR effector AVR3a1 of P. capsici (PcAVR3a1) triggers a hypersensitive response and plays a critical role in mediating non-host resistance. Here, we analyzed the occurrence of PcAVR3a1 in 57 isolates of P. capsici derived from globe squash, eggplant, tomato and bell pepper cocultivated in a small geographical area. The occurrence of PcAVR3a1 in environmental strains of P. capsici was confirmed by PCR in only 21 of these pathogen isolates. To understand the presence-absence pattern of PcAVR3a1 in environmental strains, the flanking region of this gene was sequenced. PcAVR3a1 was found within a genetic element that we named PcAVR3a1-GI (PcAVR3a1 genomic island). PcAVR3a1-GI was flanked by a 22-bp direct repeat, which is related to its site-specific recombination site. In addition to the PcAVR3a1 gene, PcAVR3a1-GI also encoded a phage integrase probably associated with the excision and integration of this mobile element. Exposure to plant induced the presence of an episomal circular intermediate of PcAVR3a1-GI, indicating that this mobile element is functional. Collectively, these findings provide evidence of PcAVR3a1 evolution via mobile elements in environmental strains of Phytophthora. © FEMS 2015. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  20. Determination of 5'-leader sequences from radically disparate strains of porcine reproductive and respiratory syndrome virus reveals the presence of highly conserved sequence motifs

    DEFF Research Database (Denmark)

    Oleksiewicz, M.B.; Bøtner, Anette; Nielsen, Jens

    1999-01-01

    We determined the untranslated 5'-leader sequence for three different isolates of porcine reproductive and respiratory syndrome virus (PRRSV): pathogenic European- and American-types, as well as an American-type vaccine strain. 5'-leader from European- and American-type PRRSV differed in length...... (220 and 190 nt, respectively), and exhibited only approximately 50% nucleotide homology. Nevertheless, highly conserved areas were identified in the leader of all 3 PRRSV isolates, which constitute candidate motifs for binding of protein(s) involved in viral replication. These comparative data provide...

  1. Viability of Controlling Prosthetic Hand Utilizing Electroencephalograph (EEG) Dataset Signal

    Science.gov (United States)

    Miskon, Azizi; A/L Thanakodi, Suresh; Raihan Mazlan, Mohd; Mohd Haziq Azhar, Satria; Nooraya Mohd Tawil, Siti

    2016-11-01

    This project presents the development of an artificial hand controlled by electroencephalograph (EEG) signal datasets for prosthetic applications. The EEG signal datasets were used to improve the way the prosthetic hand is controlled compared with electromyograph (EMG) signals. EMG is disadvantageous for a person who has not used the muscle for a long time and for a person with age-related degenerative issues. Thus, the EEG datasets were found to be an alternative to EMG. The datasets used in this work were taken from the Brain Computer Interface (BCI) Project. The datasets were already classified for open, close and combined movement operations. They served as the input to control the prosthetic hand through an interface system between Microsoft Visual Studio and Arduino. The results show the prosthetic hand to be more efficient and faster in response to the EEG datasets when an additional LiPo (lithium polymer) battery is attached to the prosthetic. Some limitations were also identified in terms of the hand movements and the weight of the prosthetic, and suggestions for improvement are given in this paper. Overall, the objective of this paper was achieved, as the prosthetic hand was found to be feasible in operation using the EEG datasets.

  2. RARD: The Related-Article Recommendation Dataset

    OpenAIRE

    Beel, Joeran; Carevic, Zeljko; Schaible, Johann; Neusch, Gabor

    2017-01-01

    Recommender-system datasets are used for recommender-system evaluations, training machine-learning algorithms, and exploring user behavior. While there are many datasets for recommender systems in the domains of movies, books, and music, there are rather few datasets from research-paper recommender systems. In this paper, we introduce RARD, the Related-Article Recommendation Dataset, from the digital library Sowiport and the recommendation-as-a-service provider Mr. DLib. The dataset contains ...

  3. A novel field transplantation technique reveals intra-specific metal-induced oxidative responses in strains of Ectocarpus siliculosus with different pollution histories

    International Nuclear Information System (INIS)

    Sáez, Claudio A.; González, Alberto; Contreras, Rodrigo A.; Moody, A. John; Moenne, Alejandra; Brown, Murray T.

    2015-01-01

    A novel field transplantation technique, in which seaweed material is incorporated into dialysis tubing, was used to investigate intra-specific responses to metals in the model brown alga Ectocarpus siliculosus. Metal accumulation in the two strains was similar, with higher concentrations in material deployed to the metal-contaminated site (Ventanas, Chile) than the pristine site (Quintay, Chile). However, the oxidative responses differed. At Ventanas, strain Es147 (from low-polluted site) underwent oxidative damage whereas Es524 (from highly polluted site) was not affected. Concentrations of reduced ascorbate (ASC) and reduced glutathione (GSH) were significantly higher in Es524. Activities of the antioxidant enzymes superoxide dismutase (SOD), ascorbate peroxidase (APX), catalase (CAT), and glutathione reductase (GR) all increased in Es524, whereas only SOD increased in Es147. For the first time, employing a field transplantation technique, we provide unambiguous evidence of inter-population variation of metal-tolerance in brown algae and establish that antioxidant defences are, in part, responsible. - Highlights: • Metal tolerance in Ectocarpus siliculosus populations was studied through in situ experiments. • Metal tolerance in E. siliculosus populations is partly based in antioxidant defences. • In situ experiments using a dialysis tubing device was successful for metal diagnosis. - Field transplantation experimentation provides evidence that differential antioxidant defences, in part, mediate inter-population tolerance to metal pollution in the model brown alga Ectocarpus siliculosus

  4. Examination of food chain-derived Listeria monocytogenes strains of different serotypes reveals considerable diversity in inlA genotypes, mutability, and adaptation to cold temperatures.

    Science.gov (United States)

    Kovacevic, Jovana; Arguedas-Villa, Carolina; Wozniak, Anna; Tasara, Taurai; Allen, Kevin J

    2013-03-01

    Listeria monocytogenes strains belonging to serotypes 1/2a and 4b are frequently linked to listeriosis. While inlA mutations leading to premature stop codons (PMSCs) and attenuated virulence are common in 1/2a, they are rare in serotype 4b. We observed PMSCs in 35% of L. monocytogenes isolates (n = 54) recovered from the British Columbia food supply, including serotypes 1/2a (30%), 1/2c (100%), and 3a (100%), and a 3-codon deletion (amino acid positions 738 to 740) seen in 57% of 4b isolates from fish-processing facilities. Caco-2 invasion assays showed that two isolates with the deletion were significantly more invasive than EGD-SmR (P cold temperature following a downshift from 37°C to 4°C. Overall, three distinct cold-adapting groups (CAG) were observed: 46% were fast (200 h) adaptors. Intermediate CAG strains (70%) more frequently possessed inlA PMSCs than did fast (20%) and slow (10%) CAGs; in contrast, 87% of fast adaptors lacked inlA PMSCs. In conclusion, we report food chain-derived 1/2a and 4b serotypes with a 3-codon deletion possessing invasive behavior and the novel association of inlA genotypes encoding a full-length InlA with fast cold-adaptation phenotypes.

  5. The evolution with strain of the stored energy in different texture components of cold-rolled IF steel revealed by high resolution X-ray diffraction

    Energy Technology Data Exchange (ETDEWEB)

    Wauthier-Monnin, A. [LSPM–CNRS, Université Paris 13, 99, Av. J.B. Clément, 93430 Villetaneuse (France); ArcelorMittal Research Voie Romaine BP 30320, 57 283 Maizières-les Metz (France); Chauveau, T.; Castelnau, O. [LSPM–CNRS, Université Paris 13, 99, Av. J.B. Clément, 93430 Villetaneuse (France); Réglé, H. [ArcelorMittal Research Voie Romaine BP 30320, 57 283 Maizières-les Metz (France); Bacroix, B., E-mail: brigitte.bacroix@univ-paris13.fr [LSPM–CNRS, Université Paris 13, 99, Av. J.B. Clément, 93430 Villetaneuse (France)

    2015-06-15

    During the deformation of low carbon steel by cold-rolling, dislocations are created and stored in grains depending on local crystallographic orientation, deformation, and deformation gradient. Orientation dependent dislocation densities have been estimated from the broadening of X-ray diffraction lines measured on a synchrotron beamline. Different cold-rolling levels (from 30% to 95% thickness reduction) have been considered. It is shown that the present measurements are consistent with the hypothesis of the sole consideration of screw dislocations for the analysis of the data. The presented evolutions show that the dislocation density first increases within the α fiber (={hkl}<110>) and then within the γ fiber (={111}<uvw>). A comparison with EBSD measurements is done and confirms that the storage of dislocations during the deformation process is orientation dependent and that this dependence is correlated to the cold-rolling level. If we assume that this dislocation density acts as a driving force during recrystallization, these observations can explain the fact that the recrystallization mechanisms are generally different after moderate or large strains. - Highlights: • Dislocation densities are assessed by XRD in main texture components of a steel sheet. • Dislocation densities vary with both strain and texture components. • The analysis relies on the sole presence of screw dislocations. • The measured dislocation densities include the contribution of both SSD and GND.

  6. Genome Sequence Analysis of Vibrio cholerae clinical isolates from 2013 in Mexico reveals the presence of the strain responsible for the 2010 Haiti outbreak.

    Science.gov (United States)

    Díaz-Quiñonez, José Alberto

    2017-01-01

    In the first week of September 2013, the National Epidemiological Surveillance System identified two cases of cholera in Mexico City. The cultures of both samples were confirmed as Vibrio cholerae serogroup O1, serotype Ogawa, biotype El Tor. Initial analyses by pulsed-field gel electrophoresis and by polymerase chain reaction amplification of the virulence genes suggested that both strains were similar, but different from those previously reported in Mexico. The following week, four more cases were identified in a community in the state of Hidalgo, located 121 km northeast of Mexico City. Thereafter, a cholera outbreak started in the region of La Huasteca. Genomic analyses of the four strains obtained in this study confirmed the presence of pathogenicity islands VPI-1 and VPI-2, VSP-1 and VSP-2, and of the SXT integrative element. The genomic structure of the four isolates was similar to that of V. cholerae strain 2010 EL-1786, identified during the 2010 epidemic in Haiti. This study shows that molecular epidemiology is a very powerful tool for monitoring, preventing, and controlling diseases of public health importance in Mexico.

  7. Gene-trait matching across the Bifidobacterium longum pan-genome reveals considerable diversity in carbohydrate catabolism among human infant strains.

    LENUS (Irish Health Repository)

    Arboleya, Silvia

    2018-01-08

    Bifidobacterium longum is a common member of the human gut microbiota and is frequently present at high numbers in the gut microbiota of humans throughout life, thus indicative of a close symbiotic host-microbe relationship. Different mechanisms may be responsible for the high competitiveness of this taxon in its human host to allow stable establishment in the complex and dynamic intestinal microbiota environment. The objective of this study was to assess the genetic and metabolic diversity in a set of 20 B. longum strains, most of which had previously been isolated from infants, by performing whole genome sequencing and comparative analysis, and to analyse their carbohydrate utilization abilities using a gene-trait matching approach.

  8. Characterization of sour cherry isolates of plum pox virus from the Volga Basin in Russia reveals a new cherry strain of the virus.

    Science.gov (United States)

    Glasa, Miroslav; Prikhodko, Yuri; Predajňa, Lukáš; Nagyová, Alžbeta; Shneyder, Yuri; Zhivaeva, Tatiana; Subr, Zdeno; Cambra, Mariano; Candresse, Thierry

    2013-09-01

    Plum pox virus (PPV) is the causal agent of sharka, the most detrimental virus disease of stone fruit trees worldwide. PPV isolates have been assigned into seven distinct strains, of which PPV-C regroups the genetically distinct isolates detected in several European countries on cherry hosts. Here, three complete and several partial genomic sequences of PPV isolates from sour cherry trees in the Volga River basin of Russia have been determined. The comparison of complete genome sequences has shown that the nucleotide identity values with other PPV isolates reached only 77.5 to 83.5%. Phylogenetic analyses clearly assigned the RU-17sc, RU-18sc, and RU-30sc isolates from cherry to a distinct cluster, most closely related to PPV-C and, to a lesser extent, PPV-W. Based on their natural infection of sour cherry trees and genomic characterization, the PPV isolates reported here represent a new strain of PPV, for which the name PPV-CR (Cherry Russia) is proposed. The unique amino acids conserved among PPV-CR and PPV-C cherry-infecting isolates (75 in total) are mostly distributed within the central part of P1, NIa, and the N terminus of the coat protein (CP), making them potential candidates for genetic determinants of the ability to infect cherry species or of adaptation to these hosts. The variability observed within 14 PPV-CR isolates analyzed in this study (0 to 2.6% nucleotide divergence in partial CP sequences) and the identification of these isolates in different localities and cultivation conditions suggest the efficient establishment and competitiveness of the PPV-CR in the environment. A specific primer pair has been developed, allowing the specific reverse-transcription polymerase chain reaction detection of PPV-CR isolates.
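
    The identity and divergence percentages quoted in this abstract are pairwise comparisons of aligned sequences. A minimal sketch of that arithmetic follows; the function names and the toy sequences are hypothetical, and the choice to skip any column containing a gap is an assumption, since the abstract does not say how gaps were handled.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent of identical positions between two aligned sequences.

    Assumption: columns where either sequence has a gap are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # ignore gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def percent_divergence(seq_a, seq_b, gap="-"):
    """Complement of percent identity over the same compared columns."""
    return 100.0 - percent_identity(seq_a, seq_b, gap)


if __name__ == "__main__":
    # Toy aligned fragments, not real PPV sequences.
    print(percent_identity("ACGTACGT", "ACGTACGA"))
    print(percent_divergence("ACGTACGT", "ACGTACGA"))
```

    Real genome comparisons would first need a multiple sequence alignment from a dedicated tool; this sketch only covers the counting step.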

  9. Selective Sweep Analysis in the Genomes of the 91-R and 91-C Drosophila melanogaster Strains Reveals Few of the ‘Usual Suspects’ in Dichlorodiphenyltrichloroethane (DDT) Resistance

    Science.gov (United States)

    Steele, Laura D.; Coates, Brad; Valero, M. Carmen; Sun, Weilin; Seong, Keon Mook; Muir, William M.; Clark, John M.; Pittendrigh, Barry R.

    2015-01-01

    Adaptation of insect phenotypes for survival after exposure to xenobiotics can result from selection at multiple loci with additive genetic effects. To the authors’ knowledge, no selective sweep analysis has been performed to identify such loci in highly dichlorodiphenyltrichloroethane (DDT) resistant insects. Here we compared a highly DDT resistant phenotype in the Drosophila melanogaster (Drosophila) 91-R strain to the DDT susceptible 91-C strain, both of common origin. Whole genome re-sequencing data from pools of individuals were generated separately for 91-R and 91-C, and mapped to the reference Drosophila genome assembly (v. 5.72). Thirteen major and three minor effect chromosome intervals with reduced nucleotide diversity (π) were identified only in the 91-R population. Estimates of Tajima's D (D) showed corresponding evidence of directional selection in these same genome regions of 91-R; however, no similar reductions in π or D estimates were detected in 91-C. An overabundance of non-synonymous relative to synonymous protein-coding changes was identified in putative open reading frames associated with 91-R. Except for NinaC and Cyp4g1, none of the identified genes were the ‘usual suspects’ previously observed to be associated with DDT resistance. Additionally, up-regulated ATP-binding cassette transporters have been previously associated with DDT resistance; however, here we identified a structurally altered MDR49 candidate resistance gene. The remaining fourteen genes have not previously been shown to be associated with DDT resistance. These results suggest hitherto unknown mechanisms of DDT resistance, most of which have been overlooked in previous transcriptional studies, with some genes having orthologs in mammals. PMID:25826265
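
    The two statistics named in this abstract, nucleotide diversity (π) and Tajima's D, can be illustrated with the textbook estimators. The sketch below assumes individually sequenced, aligned haplotypes of equal length with no gaps; the study itself used pooled re-sequencing data, which calls for pool-specific estimators, so this is not the authors' pipeline. Function names and the toy alignments are hypothetical.

```python
from collections import Counter
from math import sqrt


def column_counts(seqs):
    """Yield a Counter of bases for each alignment column."""
    for column in zip(*seqs):
        yield Counter(column)


def mean_pairwise_differences(seqs):
    """Average number of differing sites over all sequence pairs (pi, not per site)."""
    n = len(seqs)
    pairs = n * (n - 1) / 2
    # In a column with base counts c_i, (n^2 - sum c_i^2) / 2 pairs differ.
    differing = sum((n * n - sum(c * c for c in counts.values())) / 2
                    for counts in column_counts(seqs))
    return differing / pairs


def segregating_sites(seqs):
    """Number of alignment columns with more than one base."""
    return sum(1 for counts in column_counts(seqs) if len(counts) > 1)


def tajimas_d(seqs):
    """Tajima's D (Tajima 1989) from aligned haplotypes."""
    n = len(seqs)
    s = segregating_sites(seqs)
    if n < 4 or s == 0:
        raise ValueError("need at least 4 sequences and 1 segregating site")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / (i * i) for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / (a1 * a1)
    e1 = c1 / a1
    e2 = c2 / (a1 * a1 + a2)
    pi = mean_pairwise_differences(seqs)
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


if __name__ == "__main__":
    # Toy alignment in which every variant is a singleton (excess of rare alleles).
    singletons = ["TAA", "ATA", "AAT", "AAA"]
    print(mean_pairwise_differences(singletons), tajimas_d(singletons))
```

    A negative D reflects an excess of rare variants; a sweep scan would compute these values in sliding windows along each chromosome rather than on a single block, and dividing π by the window length gives the per-site value.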

  10. Passive Containment DataSet

    Science.gov (United States)

    This data is for Figures 6 and 7 in the journal article. The data also includes the two EPANET input files used for the analysis described in the paper, one for the looped system and one for the block system. This dataset is associated with the following publication: Grayman, W., R. Murray, and D. Savic. Redesign of Water Distribution Systems for Passive Containment of Contamination. JOURNAL OF THE AMERICAN WATER WORKS ASSOCIATION. American Water Works Association, Denver, CO, USA, 108(7): 381-391, (2016).

  11. The CMS dataset bookkeeping service

    Science.gov (United States)

    Afaq, A.; Dolgert, A.; Guo, Y.; Jones, C.; Kosyakov, S.; Kuznetsov, V.; Lueking, L.; Riley, D.; Sekhri, V.

    2008-07-01

    The CMS Dataset Bookkeeping Service (DBS) has been developed to catalog all CMS event data from Monte Carlo and Detector sources. It provides the ability to identify MC or trigger source, track data provenance, construct datasets for analysis, and discover interesting data. CMS requires processing and analysis activities at various service levels and the DBS system provides support for localized processing or private analysis, as well as global access for CMS users at large. Catalog entries can be moved among the various service levels with a simple set of migration tools, thus forming a loose federation of databases. DBS is available to CMS users via a Python API, a command-line interface, and a Discovery web page. The system is built as a multi-tier web application with Java servlets running under Tomcat, with connections via JDBC to Oracle or MySQL database backends. Clients connect to the service through HTTP or HTTPS with authentication provided by GRID certificates and authorization through VOMS. DBS is an integral part of the overall CMS Data Management and Workflow Management systems.

  12. The CMS dataset bookkeeping service

    Energy Technology Data Exchange (ETDEWEB)

    Afaq, A; Guo, Y; Kosyakov, S; Lueking, L; Sekhri, V [Fermilab, Batavia, Illinois 60510 (United States); Dolgert, A; Jones, C; Kuznetsov, V; Riley, D [Cornell University, Ithaca, New York 14850 (United States)

    2008-07-15

    The CMS Dataset Bookkeeping Service (DBS) has been developed to catalog all CMS event data from Monte Carlo and Detector sources. It provides the ability to identify MC or trigger source, track data provenance, construct datasets for analysis, and discover interesting data. CMS requires processing and analysis activities at various service levels and the DBS system provides support for localized processing or private analysis, as well as global access for CMS users at large. Catalog entries can be moved among the various service levels with a simple set of migration tools, thus forming a loose federation of databases. DBS is available to CMS users via a Python API, a command-line interface, and a Discovery web page. The system is built as a multi-tier web application with Java servlets running under Tomcat, with connections via JDBC to Oracle or MySQL database backends. Clients connect to the service through HTTP or HTTPS with authentication provided by GRID certificates and authorization through VOMS. DBS is an integral part of the overall CMS Data Management and Workflow Management systems.

  13. The CMS dataset bookkeeping service

    International Nuclear Information System (INIS)

    Afaq, A; Guo, Y; Kosyakov, S; Lueking, L; Sekhri, V; Dolgert, A; Jones, C; Kuznetsov, V; Riley, D

    2008-01-01

    The CMS Dataset Bookkeeping Service (DBS) has been developed to catalog all CMS event data from Monte Carlo and Detector sources. It provides the ability to identify MC or trigger source, track data provenance, construct datasets for analysis, and discover interesting data. CMS requires processing and analysis activities at various service levels and the DBS system provides support for localized processing or private analysis, as well as global access for CMS users at large. Catalog entries can be moved among the various service levels with a simple set of migration tools, thus forming a loose federation of databases. DBS is available to CMS users via a Python API, a command-line interface, and a Discovery web page. The system is built as a multi-tier web application with Java servlets running under Tomcat, with connections via JDBC to Oracle or MySQL database backends. Clients connect to the service through HTTP or HTTPS with authentication provided by GRID certificates and authorization through VOMS. DBS is an integral part of the overall CMS Data Management and Workflow Management systems.

  14. The CMS dataset bookkeeping service

    International Nuclear Information System (INIS)

    Afaq, Anzar; Dolgert, Andrew; Guo, Yuyi; Jones, Chris; Kosyakov, Sergey; Kuznetsov, Valentin; Lueking, Lee; Riley, Dan; Sekhri, Vijay

    2007-01-01

    The CMS Dataset Bookkeeping Service (DBS) has been developed to catalog all CMS event data from Monte Carlo and Detector sources. It provides the ability to identify MC or trigger source, track data provenance, construct datasets for analysis, and discover interesting data. CMS requires processing and analysis activities at various service levels and the DBS system provides support for localized processing or private analysis, as well as global access for CMS users at large. Catalog entries can be moved among the various service levels with a simple set of migration tools, thus forming a loose federation of databases. DBS is available to CMS users via a Python API, a command-line interface, and a Discovery web page. The system is built as a multi-tier web application with Java servlets running under Tomcat, with connections via JDBC to Oracle or MySQL database backends. Clients connect to the service through HTTP or HTTPS with authentication provided by GRID certificates and authorization through VOMS. DBS is an integral part of the overall CMS Data Management and Workflow Management systems.

  15. Investigating on the fermentation behavior of six lactic acid bacteria strains in barley malt wort reveals limitation in key amino acids and buffer capacity.

    Science.gov (United States)

    Nsogning, Sorelle Dongmo; Fischer, Susann; Becker, Thomas

    2018-08-01

    Understanding lactic acid bacteria (LAB) fermentation behavior in malt wort is a milestone towards flavor improvement of lactic acid fermented malt beverages. Therefore, this study aims to outline deficiencies that may exist in malt wort fermentation. First, based on six LAB strains, cell viability and vitality were evaluated. Second, sugars, organic acids, amino acids, pH value and buffering capacity (BC) were monitored. Finally, the implication of key amino acids, fructose and wort BC on LAB growth was determined. A short growth phase coupled with prompt cell death and a decrease in metabolic activity was observed. Low wort BC caused a rapid pH drop with lactic acid accumulation, which conversely increased the BC, leading to less pH change at late-stage fermentation. Lactic acid content (≤3.9 g/L) was higher than the reported inhibitory concentration (1.8 g/L). Furthermore, sugars were still available but fructose and the key amino acids lysine, arginine and glutamic acid were considerably exhausted (≤98%). Wort supplementations improved cell growth and viability, leading to the conclusion that key amino acid depletion coupled with low BC limits LAB growth in malt wort. A further increase in organic acid then reduces LAB viability. This knowledge opens doors for LAB fermentation process optimization in malt wort. Copyright © 2018 Elsevier Ltd. All rights reserved.

  16. Multilocus sequence typing, biochemical and antibiotic resistance characterizations reveal diversity of North American strains of the honey bee pathogen Paenibacillus larvae.

    Science.gov (United States)

    Krongdang, Sasiprapa; Evans, Jay D; Pettis, Jeffery S; Chantawannakul, Panuwan

    2017-01-01

    Paenibacillus larvae is a Gram-positive bacterium and the causative agent of the most widespread fatal brood disease of honey bees, American foulbrood (AFB). A total of thirty-three independent Paenibacillus larvae isolates from various geographical origins in North America and five reference strains were investigated for genetic diversity using multilocus sequence typing (MLST). This technique is regarded as a powerful tool for epidemiological studies of pathogenic bacteria and is widely used in genotyping assays. For MLST, seven housekeeping gene loci, ilvD (dihydroxy-acid dehydratase), tri (triosephosphate isomerase), purH (phosphoribosyl-aminoimidazolecarboxamide), recF (DNA replication and repair protein), pyrE (orotate phosphoribosyltransferase), sucC (succinyl coenzyme A synthetase β subunit) and glpF (glycerol uptake facilitator protein) were studied and applied for primer designs. Previously, ERIC type DNA fingerprinting was applied to these same isolates and the data showed that almost all represented the ERIC I type, whereas using BOX-PCR gave an indication of more diversity. All isolates were screened for resistance to four antibiotics used by U.S. beekeepers, showing extensive resistance to tetracycline and the first records of resistance to tylosin and lincomycin. Our data highlight the intraspecies relationships of P. larvae and the potential application of MLST methods in enhancing our understanding of epidemiological relationships among bacterial isolates of different origins.
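
    In MLST, each distinct sequence at a locus receives an allele number, and each distinct combination of allele numbers across the loci (the allelic profile) defines a sequence type (ST). The sketch below illustrates that bookkeeping for the seven loci named in the abstract. The isolate names and sequences are hypothetical, and allele and ST numbers are assigned locally in order of first appearance, whereas a real study would take them from a curated scheme.

```python
# Loci as listed in the abstract; everything else below is illustrative.
LOCI = ("ilvD", "tri", "purH", "recF", "pyrE", "sucC", "glpF")


def assign_sequence_types(isolates):
    """Map isolate name -> (allelic profile, sequence type).

    `isolates` maps an isolate name to a dict of locus -> sequence string.
    Allele numbers and STs are local to this data set (assumption).
    """
    allele_ids = {locus: {} for locus in LOCI}  # locus -> {sequence: allele number}
    st_ids = {}                                 # profile tuple -> ST number
    result = {}
    for name, seqs in isolates.items():
        profile = []
        for locus in LOCI:
            known = allele_ids[locus]
            # A new sequence at this locus gets the next free allele number.
            profile.append(known.setdefault(seqs[locus], len(known) + 1))
        profile = tuple(profile)
        st = st_ids.setdefault(profile, len(st_ids) + 1)
        result[name] = (profile, st)
    return result


if __name__ == "__main__":
    base = {locus: "ACGT" for locus in LOCI}
    variant = dict(base, recF="ACGA")  # differs at one locus only
    typed = assign_sequence_types({"iso1": base, "iso2": dict(base), "iso3": variant})
    for name, (profile, st) in typed.items():
        print(name, profile, "ST", st)
```

    Isolates that share an ST are indistinguishable at these loci; counting the loci at which two profiles differ gives a simple distance for grouping related STs.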

  17. Infection of a French Population of Aedes albopictus and of Aedes aegypti (Paea Strain) with Zika Virus Reveals Low Transmission Rates to These Vectors’ Saliva

    Directory of Open Access Journals (Sweden)

    Faustine Ryckebusch

    2017-11-01

    Disease caused by the Zika virus (ZIKV) is a public health emergency of international concern. Recent epidemics have emerged in different regions of the world and attest to the ability of the virus to spread wherever its vector, Aedes species mosquitoes, can be found. We have compared the transmission of ZIKV by Ae. aegypti (PAEA strain originating from Tahiti) and by a French population of Ae. albopictus to better assess their competence and the potential risk of the emergence of ZIKV in Europe. We assessed the transmission of ZIKV by Ae. albopictus in temperatures similar to those in Southern France during the summer. Our study shows that the extrinsic incubation period of Ae. aegypti for transmission was shorter than that of Ae. albopictus. Both vectors were able to transmit ZIKV from 10 to 14 days post-infection. Ae. aegypti, however, had a longer transmission period than the French population of Ae. albopictus. Although the salivary glands of both vectors are highly infected, transmission rates of ZIKV to saliva remain relatively low. These observations may suggest that the risk of emergence of ZIKV in Europe could be low.

  18. Molecular typing of canine parvovirus strains circulating from 2008 to 2012 in an organized kennel in India reveals the possibility of vaccination failure.

    Science.gov (United States)

    Mittal, Mitesh; Chakravarti, Soumendu; Mohapatra, J K; Chug, P K; Dubey, Rahul; Upmanuyu, Vikramaditya; Narwal, P S; Kumar, Anil; Churamani, C P; Kanwar, N S

    2014-04-01

    Canine parvovirus-2 (CPV-2), which emerged in 1978, is considered the major viral enteric pathogen of the canine population. With the emergence of new antigenic variants and incidences of vaccine failure, CPV has become one of the dreaded diseases of canines worldwide. The present study was undertaken in an organized kennel from North India to ascertain the molecular basis of the CPV outbreaks in the vaccinated dogs. 415 samples were collected over a 5-year period (2008-2012). The outbreak of the disease was more severe in 2012 with high incidence of mortality in pups with pronounced clinical symptoms. Molecular typing based on the VP2 gene was carried out with the 11 isolates from different years and compared with the CPV prototype and the vaccine strains. All the isolates in the study were either new CPV-2a (2012 isolates) or new CPV-2b (2008 and 2011 isolates). There were amino acid mutations at the Tyr324Ile and at the Thr440Ala position in five isolates from 2012 indicating new CPV mutants spreading in India. The CPV vaccines used in the present study failed to generate protective antibody titer against heterogeneous CPV antigenic types. The findings were confirmed when the affected pups were treated with hyper-immune heterogeneous purified immunoglobulins against CPV in dogs of different antigenic types. Copyright © 2014 Elsevier B.V. All rights reserved.

  19. Six Highly Conserved Targets of RNAi Revealed in HIV-1-Infected Patients from Russia Are Also Present in Many HIV-1 Strains Worldwide.

    Science.gov (United States)

    Kretova, Olga V; Fedoseeva, Daria M; Gorbacheva, Maria A; Gashnikova, Natalya M; Gashnikova, Maria P; Melnikova, Nataliya V; Chechetkin, Vladimir R; Kravatsky, Yuri V; Tchurikov, Nickolai A

    2017-09-15

    RNAi has been suggested for use in gene therapy of HIV/AIDS, but the main problem is that HIV-1 is highly variable and could escape attack from the small interfering RNAs (siRNAs) due to even single nucleotide substitutions in the potential targets. To exhaustively check the variability in selected RNA targets of HIV-1, we used ultra-deep sequencing of six regions of HIV-1 from the plasma of two independent cohorts of patients from Russia. Six RNAi targets were found that are invariable in 82%-97% of viruses in both cohorts and are located inside the domains specifying reverse transcriptase (RT), integrase, vpu, gp120, and p17. The analysis of mutation frequencies and their characteristics inside the targets suggests a likely role for APOBEC3G (apolipoprotein B mRNA editing enzyme, catalytic polypeptide-like 3G, A3G) in G-to-A mutations and a predominant effect of RT biases in the detected variability of the virus. The lowest frequency of mutations was detected in the central part of all six targets. We also discovered that the identical RNAi targets are present in many HIV-1 strains from many countries and from all continents. The data are important for both the understanding of the patterns of HIV-1 mutability and properties of RT and for the development of gene therapy approaches using RNAi for the treatment of HIV/AIDS. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.

  20. Multilocus sequence typing, biochemical and antibiotic resistance characterizations reveal diversity of North American strains of the honey bee pathogen Paenibacillus larvae.

    Directory of Open Access Journals (Sweden)

    Sasiprapa Krongdang

    Paenibacillus larvae is a Gram-positive bacterium and the causative agent of the most widespread fatal brood disease of honey bees, American foulbrood (AFB). A total of thirty-three independent Paenibacillus larvae isolates from various geographical origins in North America and five reference strains were investigated for genetic diversity using multilocus sequence typing (MLST). This technique is regarded as a powerful tool for epidemiological studies of pathogenic bacteria and is widely used in genotyping assays. For MLST, seven housekeeping gene loci, ilvD (dihydroxy-acid dehydratase), tri (triosephosphate isomerase), purH (phosphoribosyl-aminoimidazolecarboxamide), recF (DNA replication and repair protein), pyrE (orotate phosphoribosyltransferase), sucC (succinyl coenzyme A synthetase β subunit) and glpF (glycerol uptake facilitator protein) were studied and applied for primer designs. Previously, ERIC type DNA fingerprinting was applied to these same isolates and the data showed that almost all represented the ERIC I type, whereas using BOX-PCR gave an indication of more diversity. All isolates were screened for resistance to four antibiotics used by U.S. beekeepers, showing extensive resistance to tetracycline and the first records of resistance to tylosin and lincomycin. Our data highlight the intraspecies relationships of P. larvae and the potential application of MLST methods in enhancing our understanding of epidemiological relationships among bacterial isolates of different origins.

  1. 2008 TIGER/Line Nationwide Dataset

    Data.gov (United States)

    California Natural Resource Agency — This dataset contains a nationwide build of the 2008 TIGER/Line datasets from the US Census Bureau downloaded in April 2009. The TIGER/Line Shapefiles are an extract...

  2. Enhanced 3-sulfanylhexan-1-ol production in sequential mixed fermentation with Torulaspora delbrueckii/Saccharomyces cerevisiae reveals a situation of synergistic interaction between two industrial strains

    Directory of Open Access Journals (Sweden)

    Philippe Renault

    2016-03-01

    The aim of this work was to study the volatile thiol productions of 2 industrial strains of Torulaspora delbrueckii and Saccharomyces cerevisiae during alcoholic fermentation (AF) of Sauvignon Blanc must. In order to evaluate the influence of the inoculation procedure, sequential and simultaneous mixed cultures were carried out and compared to pure cultures of T. delbrueckii and S. cerevisiae. The results confirmed the inability of T. delbrueckii to release 4-methyl-4-sulfanylpentan-2-one (4MSP) and its low capacity to produce 3-sulfanylhexyl acetate (3SHA), as already reported in previous studies. A synergistic interaction was observed between the two species, resulting in higher levels of 3SH (3-sulfanylhexan-1-ol) and its acetate when S. cerevisiae was inoculated 24 hours after T. delbrueckii, compared to the pure cultures. To elucidate the nature of the interactions between these 2 species, the yeast population kinetics were examined and monitored, as well as the production of 3SH, its acetate and their related non-odorous precursors: Glut-3SH (glutathionylated conjugate precursor) and Cys-3SH (cysteinylated conjugate precursor). For the first time, it was suggested that, unlike S. cerevisiae, which is able to metabolize the two precursor forms, T. delbrueckii was only able to metabolize the glutathionylated precursor. Consequently, the presence of T. delbrueckii during mixed fermentation led to an increase in Glut-3SH degradation and Cys-3SH production. This overproduction was dependent on the T. delbrueckii biomass. In sequential culture, thus favouring T. delbrueckii development, the higher availability of Cys-3SH throughout AF (alcoholic fermentation) resulted in more abundant 3SH and 3SHA production by S. cerevisiae.

  3. Possible roles of vacuolar H+-ATPase and mitochondrial function in tolerance to air-drying stress revealed by genome-wide screening of Saccharomyces cerevisiae deletion strains.

    Science.gov (United States)

    Shima, Jun; Ando, Akira; Takagi, Hiroshi

    2008-03-01

    Yeasts used in bread making are exposed to air-drying stress during dried yeast production processes. To clarify the genes required for air-drying tolerance, we performed genome-wide screening using the complete deletion strain collection of diploid Saccharomyces cerevisiae. The screening identified 278 gene deletions responsible for air-drying sensitivity. These genes were classified based on their cellular function and on the localization of their gene products. The results showed that the genes required for air-drying tolerance were frequently involved in mitochondrial functions and in connection with vacuolar H(+)-ATPase, which plays a role in vacuolar acidification. To determine the role of vacuolar acidification in air-drying stress tolerance, we monitored intracellular pH. The results showed that intracellular acidification was induced during air-drying and that this acidification was amplified in a deletion mutant of the VMA2 gene encoding a component of vacuolar H(+)-ATPase, suggesting that vacuolar H(+)-ATPase helps maintain intracellular pH homeostasis, which is affected by air-drying stress. To determine the effects of air-drying stress on mitochondria, we analysed the mitochondrial membrane potential under air-drying stress conditions using MitoTracker. The results showed that mitochondria were extremely sensitive to air-drying stress, suggesting that a mitochondrial function is required for tolerance to air-drying stress. We also analysed the correlation between oxidative-stress sensitivity and air-drying-stress sensitivity. The results suggested that oxidative stress is a critical determinant of sensitivity to air-drying stress, although ROS-scavenging systems are not necessary for air-drying stress tolerance. (c) 2008 John Wiley & Sons, Ltd.

  4. Enhanced 3-Sulfanylhexan-1-ol Production in Sequential Mixed Fermentation with Torulaspora delbrueckii/Saccharomyces cerevisiae Reveals a Situation of Synergistic Interaction between Two Industrial Strains.

    Science.gov (United States)

    Renault, Philippe; Coulon, Joana; Moine, Virginie; Thibon, Cécile; Bely, Marina

    2016-01-01

    The aim of this work was to study the volatile thiol productions of two industrial strains of Torulaspora delbrueckii and Saccharomyces cerevisiae during alcoholic fermentation (AF) of Sauvignon Blanc must. In order to evaluate the influence of the inoculation procedure, sequential and simultaneous mixed cultures were carried out and compared to pure cultures of T. delbrueckii and S. cerevisiae. The results confirmed the inability of T. delbrueckii to release 4-methyl-4-sulfanylpentan-2-one (4MSP) and its low capacity to produce 3-sulfanylhexyl acetate (3SHA), as already reported in previous studies. A synergistic interaction was observed between the two species, resulting in higher levels of 3SH (3-sulfanylhexan-1-ol) and its acetate when S. cerevisiae was inoculated 24 h after T. delbrueckii, compared to the pure cultures. To elucidate the nature of the interactions between these two species, the yeast population kinetics were examined and monitored, as well as the production of 3SH, its acetate and their related non-odorous precursors: Glut-3SH (glutathionylated conjugate precursor) and Cys-3SH (cysteinylated conjugate precursor). For the first time, it was suggested that, unlike S. cerevisiae, which is able to metabolize the two precursor forms, T. delbrueckii was only able to metabolize the glutathionylated precursor. Consequently, the presence of T. delbrueckii during mixed fermentation led to an increase in Glut-3SH degradation and Cys-3SH production. This overproduction was dependent on the T. delbrueckii biomass. In sequential culture, thus favoring T. delbrueckii development, the higher availability of Cys-3SH throughout AF resulted in more abundant 3SH and 3SHA production by S. cerevisiae.

  5. Six Highly Conserved Targets of RNAi Revealed in HIV-1-Infected Patients from Russia Are Also Present in Many HIV-1 Strains Worldwide

    Directory of Open Access Journals (Sweden)

    Olga V. Kretova

    2017-09-01

    Full Text Available RNAi has been suggested for use in gene therapy of HIV/AIDS, but the main problem is that HIV-1 is highly variable and could escape attack from the small interfering RNAs (siRNAs) due to even single nucleotide substitutions in the potential targets. To exhaustively check the variability in selected RNA targets of HIV-1, we used ultra-deep sequencing of six regions of HIV-1 from the plasma of two independent cohorts of patients from Russia. Six RNAi targets were found that are invariable in 82%–97% of viruses in both cohorts and are located inside the domains specifying reverse transcriptase (RT), integrase, vpu, gp120, and p17. The analysis of mutation frequencies and their characteristics inside the targets suggests a likely role for APOBEC3G (apolipoprotein B mRNA editing enzyme, catalytic polypeptide-like 3G, A3G) in G-to-A mutations and a predominant effect of RT biases in the detected variability of the virus. The lowest frequency of mutations was detected in the central part of all six targets. We also discovered that the identical RNAi targets are present in many HIV-1 strains from many countries and from all continents. The data are important for both the understanding of the patterns of HIV-1 mutability and properties of RT and for the development of gene therapy approaches using RNAi for the treatment of HIV/AIDS. Keywords: HIV-1, RNAi targets, gene therapy, ultra-deep sequencing, conserved HIV-1 sequences

  6. Satellite-Based Precipitation Datasets

    Science.gov (United States)

    Munchak, S. J.; Huffman, G. J.

    2017-12-01

    Of the possible sources of precipitation data, those based on satellites provide the greatest spatial coverage. There is a wide selection of datasets, algorithms, and versions from which to choose, which can be confusing to non-specialists wishing to use the data. The International Precipitation Working Group (IPWG) maintains tables of the major publicly available, long-term, quasi-global precipitation data sets (http://www.isac.cnr.it/ipwg/data/datasets.html), and this talk briefly reviews the various categories. As examples, NASA provides two sets of quasi-global precipitation data sets: the older Tropical Rainfall Measuring Mission (TRMM) Multi-satellite Precipitation Analysis (TMPA) and current Integrated Multi-satellitE Retrievals for Global Precipitation Measurement (GPM) mission (IMERG). Both provide near-real-time and post-real-time products that are uniformly gridded in space and time. The TMPA products are 3-hourly 0.25°x0.25° on the latitude band 50°N-S for about 16 years, while the IMERG products are half-hourly 0.1°x0.1° on 60°N-S for over 3 years (with plans to go to 16+ years in Spring 2018). In addition to the precipitation estimates, each data set provides fields of other variables, such as the satellite sensor providing estimates and estimated random error. The discussion concludes with advice about determining suitability for use, the necessity of being clear about product names and versions, and the need for continued support for satellite- and surface-based observation.
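
    As a rough illustration of the resolutions quoted above, the sketch below computes how many cells a regular latitude-longitude grid would have for each product, and how many time steps fall in one day. It assumes the grid covers exactly the stated latitude band and all 360° of longitude; the function names are hypothetical, and the layout of the actual product files may differ.

```python
def grid_shape(cell_deg, lat_limit_deg):
    """Return (n_lat, n_lon) for a regular grid spanning lat_limit_deg N to
    lat_limit_deg S and the full 360 degrees of longitude (an assumption)."""
    n_lat = round(2 * lat_limit_deg / cell_deg)
    n_lon = round(360 / cell_deg)
    return n_lat, n_lon


def steps_per_day(step_hours):
    """Number of time steps in 24 hours for a fixed step length in hours."""
    return round(24 / step_hours)


# TMPA: 3-hourly, 0.25 degree cells, 50N-50S (figures from the abstract)
tmpa_shape = grid_shape(0.25, 50)
tmpa_steps = steps_per_day(3)

# IMERG: half-hourly, 0.1 degree cells, 60N-60S (figures from the abstract)
imerg_shape = grid_shape(0.1, 60)
imerg_steps = steps_per_day(0.5)

print("TMPA ", tmpa_shape, tmpa_steps)
print("IMERG", imerg_shape, imerg_steps)
```

    Under these assumptions a single IMERG field holds 7.5 times as many cells as a TMPA field, and six times as many fields are produced per day, which is worth keeping in mind when estimating storage needs.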

  7. Global Expression Profiling and Pathway Analysis of Mouse Mammary Tumor Reveals Strain and Stage Specific Dysregulated Pathways in Breast Cancer Progression.

    Science.gov (United States)

    Mei, Yan; Yang, Jun-Ping; Lang, Yan-Hong; Peng, Li-Xia; Yang, Ming-Ming; Liu, Qin; Meng, Dong-Fang; Zheng, Li-Sheng; Qiang, Yuan-Yuan; Xu, Liang; Li, Chang-Zhi; Wei, Wen-Wen; Niu, Ting; Peng, Xing-Si; Yang, Qin; Lin, Fen; Hu, Hao; Xu, Hong-Fa; Huang, Bi-Jun; Wang, Li-Jing; Qian, Chao-Nan

    2018-05-01

    It is believed that the alteration of tissue microenvironment would affect cancer initiation and progression. However, little is known in terms of the underlying molecular mechanisms that would affect the initiation and progression of breast cancer. In the present study, we use two murine mammary tumor models with different speeds of tumor initiation and progression for whole genome expression profiling to reveal the involved genes and signaling pathways. The pathways regulating PI3K-Akt signaling and Ras signaling were activated in Fvb mice and promoted tumor progression. Contrastingly, the pathways regulating apoptosis and cellular senescence were activated in Fvb.B6 mice and suppressed tumor progression. We identified distinct patterns of oncogenic pathways activation at different stages of breast cancer, and uncovered five oncogenic pathways that were activated in both human and mouse breast cancers. The genes and pathways discovered in our study would be useful information for other researchers and drug development.

  8. Metatranscriptome analysis of fungal strains Penicillium camemberti and Geotrichum candidum reveal cheese matrix breakdown and potential development of sensory properties of ripened Camembert-type cheese.

    Science.gov (United States)

    Lessard, Marie-Hélène; Viel, Catherine; Boyle, Brian; St-Gelais, Daniel; Labrie, Steve

    2014-03-26

    Camembert-type cheese ripening is driven mainly by fungal microflora including Geotrichum candidum and Penicillium camemberti. These species are major contributors to the texture and flavour of typical bloomy rind cheeses. Biochemical studies showed that G. candidum reduces bitterness, enhances sulphur flavors through amino acid catabolism and has an impact on rind texture, firmness and thickness, while P. camemberti is responsible for the white and bloomy aspect of the rind, and produces enzymes involved in proteolysis and lipolysis activities. However, very little is known about the genetic determinants that code for these activities and their expression profile over time during the ripening process. The metatranscriptome of an industrial Canadian Camembert-type cheese was studied at seven different sampling days over 77 days of ripening. A database called CamemBank01 was generated, containing a total of 1,060,019 sequence tags (reads) assembled in 7916 contigs. Sequence analysis revealed that 57% of the contigs could be affiliated to molds, 16% originated from yeasts, and 27% could not be identified. According to the functional annotation performed, the predominant processes during Camembert ripening include gene expression, energy-, carbohydrate-, organic acid-, lipid- and protein- metabolic processes, cell growth, and response to different stresses. Relative expression data showed that these functions occurred mostly in the first two weeks of the ripening period. These data provide further advances in our knowledge about the biological activities of the dominant ripening microflora of Camembert cheese and will help select biological markers to improve cheese quality assessment.

  9. PHYSICS PERFORMANCE AND DATASET (PPD)

    CERN Multimedia

    L. Silvestris

    2012-01-01

      Introduction The first part of the year presented an important test for the new Physics Performance and Dataset (PPD) group (cf. its mandate: http://cern.ch/go/8f77). The activity was focused on the validation of the new releases meant for the Monte Carlo (MC) production and the data-processing in 2012 (CMSSW 50X and 52X), and on the preparation of the 2012 operations. In view of the Chamonix meeting, the PPD and physics groups worked to understand the impact of the higher pile-up scenario on some of the flagship Higgs analyses to better quantify the impact of the high luminosity on the CMS physics potential. A task force is working on the optimisation of the reconstruction algorithms and on the code to cope with the performance requirements imposed by the higher event occupancy as foreseen for 2012. Concerning the preparation for the analysis of the new data, a new MC production has been prepared. The new samples, simulated at 8 TeV, are already being produced and the digitisation and recons...

  10. Pattern Analysis On Banking Dataset

    Directory of Open Access Journals (Sweden)

    Amritpal Singh

    2015-06-01

    Full Text Available Abstract: The continual refinement and development of technology has increased competition among technology companies, as well as attempts to break into and bring down systems. This makes data mining a strategically and security-wise important area for many business organizations, including the banking sector. It allows the analysis of important information in the data warehouse and helps banks look for obscure patterns in a group and discover unknown relationships in the data. Banking systems need to process large amounts of data daily: customer information, credit card details, limit and collateral details, transaction details, risk profiles, anti-money-laundering information, and trade finance data. Thousands of decisions based on these data are taken in a bank daily. This paper analyzes a banking dataset in the Weka environment to detect interesting patterns for applications in customer acquisition, customer retention, management and marketing, and risk management and fraud detection.

  11. PHYSICS PERFORMANCE AND DATASET (PPD)

    CERN Multimedia

    L. Silvestris

    2013-01-01

    The PPD activities, in the first part of 2013, have been focused mostly on the final physics validation and preparation for the data reprocessing of the full 8 TeV datasets with the latest calibrations. These samples will be the basis for the preliminary results for summer 2013 but most importantly for the final publications on the 8 TeV Run 1 data. The reprocessing involves also the reconstruction of a significant fraction of “parked data” that will allow CMS to perform a whole new set of precision analyses and searches. In this way the CMSSW release 53X is becoming the legacy release for the 8 TeV Run 1 data. The regular operation activities have included taking care of the prolonged proton-proton data taking and the run with proton-lead collisions that ended in February. The DQM and Data Certification team has deployed a continuous effort to promptly certify the quality of the data. The luminosity-weighted certification efficiency (requiring all sub-detectors to be certified as usab...

  12. Loss of genetic variability in a hatchery strain of Senegalese sole (Solea senegalensis) revealed by sequence data of the mitochondrial DNA control region and microsatellite markers

    Directory of Open Access Journals (Sweden)

    Pablo Sánchez

    2012-06-01

    Full Text Available Comparisons of the levels of genetic variation within and between a hatchery F1 (FAR, n=116) of Senegalese sole, Solea senegalensis, and its wild donor population (ATL, n=26), both native to the SW Atlantic coast of the Iberian peninsula, as well as between the wild donor population and a wild western Mediterranean sample (MED, n=18), were carried out by characterizing 412 base pairs of the nucleotide sequence of the mitochondrial DNA control region I, and six polymorphic microsatellite loci. FAR showed a substantial loss of genetic variability (haplotypic diversity, h=0.49±0.066; nucleotide diversity, π=0.006±0.004; private allelic richness, pAg=0.28) relative to its donor population ATL (h=0.69±0.114; π=0.009±0.006; pAg=1.21). Pairwise FST values of microsatellite data were highly significant (P < 0.0001) between FAR and ATL (0.053) and FAR and MED (0.055). The comparison of wild samples revealed higher values of genetic variability in MED than in ATL, but only with mtDNA CR-I sequence data (h=0.948±0.033; π=0.030±0.016). However, pairwise ΦST and FST values between ATL and MED were highly significant (P < 0.0001) with mtDNA CR-I (0.228) and with microsatellite data (0.095), respectively. While loss of genetic variability in FAR could be associated with the sampling error when the broodstock was established, the results of parental and sibship inference suggest that most of these losses can be attributed to a high variance in reproductive success among members of the broodstock, particularly among females.
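
    For readers unfamiliar with the haplotypic diversity statistic h quoted above, the following sketch shows one common way to compute it, Nei's unbiased estimator h = n/(n−1) · (1 − Σ p_i²), where p_i is the frequency of haplotype i in a sample of n sequences. The abstract does not say which estimator the authors used, so this is an assumption, and the haplotype labels in the example are invented purely for illustration.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype diversity for a list of haplotype labels.

    Assumed estimator: h = n / (n - 1) * (1 - sum(p_i ** 2)).
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sampled sequences")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


# Hypothetical sample: 10 sequences, three haplotypes (not data from the study)
sample = ["A"] * 6 + ["B"] * 3 + ["C"] * 1
h = haplotype_diversity(sample)
print(round(h, 3))
```

    A sample dominated by one haplotype gives a value near 0, while a sample in which every sequence is distinct gives 1, which is the sense in which the lower h in FAR indicates reduced variability.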

  13. The Geometry of Finite Equilibrium Datasets

    DEFF Research Database (Denmark)

    Balasko, Yves; Tvede, Mich

    We investigate the geometry of finite datasets defined by equilibrium prices, income distributions, and total resources. We show that the equilibrium condition imposes no restrictions if total resources are collinear, a property that is robust to small perturbations. We also show that the set of equilibrium datasets is path-connected when the equilibrium condition does impose restrictions on datasets, as for example when total resources are widely non-collinear.

  14. IPCC Socio-Economic Baseline Dataset

    Data.gov (United States)

    National Aeronautics and Space Administration — The Intergovernmental Panel on Climate Change (IPCC) Socio-Economic Baseline Dataset consists of population, human development, economic, water resources, land...

  15. Veterans Affairs Suicide Prevention Synthetic Dataset

    Data.gov (United States)

    Department of Veterans Affairs — The VA's Veteran Health Administration, in support of the Open Data Initiative, is providing the Veterans Affairs Suicide Prevention Synthetic Dataset (VASPSD). The...

  16. Nanoparticle-organic pollutant interaction dataset

    Data.gov (United States)

    U.S. Environmental Protection Agency — Dataset presents concentrations of organic pollutants, such as polyaromatic hydrocarbon compounds, in water samples. Water samples of known volume and concentration...

  17. An Annotated Dataset of 14 Meat Images

    DEFF Research Database (Denmark)

    Stegmann, Mikkel Bille

    2002-01-01

    This note describes a dataset consisting of 14 annotated images of meat. Points of correspondence are placed on each image. As such, the dataset can be readily used for building statistical models of shape. Further, format specifications and terms of use are given.

  18. SIMADL: Simulated Activities of Daily Living Dataset

    Directory of Open Access Journals (Sweden)

    Talal Alshammari

    2018-04-01

    Full Text Available With the realisation of the Internet of Things (IoT) paradigm, the analysis of the Activities of Daily Living (ADLs) in a smart home environment is becoming an active research domain. The existence of representative datasets is a key requirement to advance the research in smart home design. Such datasets are an integral part of the visualisation of new smart home concepts as well as the validation and evaluation of emerging machine learning models. Machine learning techniques that can learn ADLs from sensor readings are used to classify, predict and detect anomalous patterns. Such techniques require data that represent relevant smart home scenarios, for training, testing and validation. However, the development of such machine learning techniques is limited by the lack of real smart home datasets, due to the excessive cost of building real smart homes. This paper provides two datasets for classification and anomaly detection. The datasets are generated using OpenSHS (Open Smart Home Simulator), which is a simulation software for dataset generation. OpenSHS records the daily activities of a participant within a virtual environment. Seven participants simulated their ADLs for different contexts, e.g., weekdays, weekends, mornings and evenings. Eighty-four files in total were generated, representing approximately 63 days' worth of activities. Forty-two files of classification of ADLs were simulated in the classification dataset and the other forty-two files are for anomaly detection problems in which anomalous patterns were simulated and injected into the anomaly detection dataset.

  19. ASSISTments Dataset from Multiple Randomized Controlled Experiments

    Science.gov (United States)

    Selent, Douglas; Patikorn, Thanaporn; Heffernan, Neil

    2016-01-01

    In this paper, we present a dataset consisting of data generated from 22 previously and currently running randomized controlled experiments inside the ASSISTments online learning platform. This dataset provides data mining opportunities for researchers to analyze ASSISTments data in a convenient format across multiple experiments at the same time.…

  20. Synthetic and Empirical Capsicum Annuum Image Dataset

    NARCIS (Netherlands)

    Barth, R.

    2016-01-01

    This dataset consists of per-pixel annotated synthetic (10500) and empirical images (50) of Capsicum annuum, also known as sweet or bell pepper, situated in a commercial greenhouse. Furthermore, the source models to generate the synthetic images are included. The aim of the datasets are to

  1. Design of an audio advertisement dataset

    Science.gov (United States)

    Fu, Yutao; Liu, Jihong; Zhang, Qi; Geng, Yuting

    2015-12-01

    Since more and more advertisements swarm into radios, it is necessary to establish an audio advertising dataset which could be used to analyze and classify the advertisement. A method of how to establish a complete audio advertising dataset is presented in this paper. The dataset is divided into four different kinds of advertisements. Each advertisement's sample is given in *.wav file format, and annotated with a txt file which contains its file name, sampling frequency, channel number, broadcasting time and its class. The classifying rationality of the advertisements in this dataset is proved by clustering the different advertisements based on Principal Component Analysis (PCA). The experimental results show that this audio advertisement dataset offers a reliable set of samples for correlative audio advertisement experimental studies.
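
    The annotation scheme described above (a text record per *.wav sample listing file name, sampling frequency, channel number, broadcasting time and class) can be sketched with Python's standard `wave` module. The abstract does not specify the layout of the txt file, so the tab-separated format, the function names and every example value below are assumptions made only for illustration.

```python
import io
import wave


def make_silent_wav(rate_hz, channels, seconds):
    """Build a small in-memory 16-bit WAV file of silence for the demo."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(channels)
        w.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        w.setframerate(rate_hz)
        w.writeframes(b"\x00\x00" * channels * rate_hz * seconds)
    buf.seek(0)
    return buf


def annotation_line(name, wav_file, broadcast_time, ad_class):
    """Read rate and channel count from the WAV header and return one
    tab-separated annotation record (hypothetical layout)."""
    with wave.open(wav_file, "rb") as r:
        rate = r.getframerate()
        channels = r.getnchannels()
    return "\t".join([name, str(rate), str(channels), broadcast_time, ad_class])


# Hypothetical sample: file name, broadcast time and class label are invented
line = annotation_line(
    "ad_0001.wav", make_silent_wav(16000, 1, 1), "2015-01-01T08:00", "retail"
)
print(line)
```

    Reading the sampling frequency and channel count from the file header, rather than typing them by hand, keeps the annotation consistent with the audio it describes.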

  2. Congruent strain specific intestinal persistence of Lactobacillus plantarum in an intestine-mimicking in vitro system and in human volunteers.

    Directory of Open Access Journals (Sweden)

    Hermien van Bokhorst-van de Veen

    Full Text Available BACKGROUND: An important trait of probiotics is their capability to reach their intestinal target sites alive to optimally exert their beneficial effects. Assessment of this trait in intestine-mimicking in vitro model systems has revealed differential survival of individual strains of a species. However, data on the in situ persistence characteristics of individual or mixtures of strains of the same species in the gastrointestinal tract of healthy human volunteers have not been reported to date. METHODOLOGY/PRINCIPAL FINDINGS: The GI-tract survival of individual L. plantarum strains was determined using an intestine-mimicking model system, revealing substantial inter-strain differences. The obtained data were correlated to genomic diversity of the strains using comparative genome hybridization (CGH) datasets, but this approach failed to discover specific genetic loci that explain the observed differences between the strains. Moreover, we developed a next-generation sequencing-based method that targets a variable intergenic region, and employed this method to assess the in vivo GI-tract persistence of different L. plantarum strains when administered in mixtures to healthy human volunteers. Remarkable consistency of the strain-specific persistence curves was observed between individual volunteers, which also correlated significantly with the GI-tract survival predicted on the basis of the in vitro assay. CONCLUSION: The survival of individual L. plantarum strains in the GI-tract could not be correlated to the absence or presence of specific genes compared to the reference strain L. plantarum WCFS1. Nevertheless, in vivo persistence analysis in the human GI-tract confirmed the strain-specific persistence, which appeared to be remarkably similar in different healthy volunteers. Moreover, the relative strain-specific persistence in vivo appeared to be accurately and significantly predicted by their relative survival in the intestine-mimicking in vitro

  3. A high-resolution European dataset for hydrologic modeling

    Science.gov (United States)

    Ntegeka, Victor; Salamon, Peter; Gomes, Goncalo; Sint, Hadewij; Lorini, Valerio; Thielen, Jutta

    2013-04-01

    inputs to the hydrological calibration and validation of EFAS as well as for establishing long-term discharge "proxy" climatologies which can then in turn be used for statistical analysis to derive return periods or other time series derivatives. In addition, this dataset will be used to assess climatological trends in Europe. Unfortunately, to date no baseline dataset at the European scale exists to test the quality of the herein presented data. Hence, a comparison against other existing datasets can therefore only be an indication of data quality. Due to availability, a comparison was made for precipitation and temperature only, arguably the most important meteorological drivers for hydrologic models. A variety of analyses was undertaken at country scale against data reported to EUROSTAT and E-OBS datasets. The comparison revealed that while the datasets showed overall similar temporal and spatial patterns, there were some differences in magnitudes especially for precipitation. It is not straightforward to define the specific cause for these differences. However, in most cases the comparatively low observation station density appears to be the principal reason for the differences in magnitude.

  4. The Kinetics Human Action Video Dataset

    OpenAIRE

    Kay, Will; Carreira, Joao; Simonyan, Karen; Zhang, Brian; Hillier, Chloe; Vijayanarasimhan, Sudheendra; Viola, Fabio; Green, Tim; Back, Trevor; Natsev, Paul; Suleyman, Mustafa; Zisserman, Andrew

    2017-01-01

    We describe the DeepMind Kinetics human action video dataset. The dataset contains 400 human action classes, with at least 400 video clips for each action. Each clip lasts around 10s and is taken from a different YouTube video. The actions are human focussed and cover a broad range of classes including human-object interactions such as playing instruments, as well as human-human interactions such as shaking hands. We describe the statistics of the dataset, how it was collected, and give some ...

  5. MLST and Whole-Genome-Based Population Analysis of Cryptococcus gattii VGIII Links Clinical, Veterinary and Environmental Strains, and Reveals Divergent Serotype Specific Sub-populations and Distant Ancestors

    Science.gov (United States)

    Firacative, Carolina; Roe, Chandler C.; Malik, Richard; Ferreira-Paim, Kennio; Escandón, Patricia; Sykes, Jane E.; Castañón-Olivares, Laura Rocío; Contreras-Peres, Cudberto; Samayoa, Blanca; Sorrell, Tania C.; Castañeda, Elizabeth; Lockhart, Shawn R.; Engelthaler, David M.; Meyer, Wieland

    2016-01-01

    The emerging pathogen Cryptococcus gattii causes life-threatening disease in immunocompetent and immunocompromised hosts. Of the four major molecular types (VGI-VGIV), the molecular type VGIII has recently emerged as a cause of disease in otherwise healthy individuals, prompting a need to investigate its population genetic structure to understand if there are potential genotype-dependent characteristics in its epidemiology, environmental niche(s), host range and clinical features of disease. Multilocus sequence typing (MLST) of 122 clinical, environmental and veterinary C. gattii VGIII isolates from Australia, Colombia, Guatemala, Mexico, New Zealand, Paraguay, USA and Venezuela, and whole genome sequencing (WGS) of 60 isolates representing all established MLST types identified four divergent sub-populations. The majority of the isolates belong to two main clades, corresponding either to serotype B or C, indicating an ongoing species evolution. Both major clades included clinical, environmental and veterinary isolates. The C. gattii VGIII population was genetically highly diverse, with minor differences between countries, isolation source, serotype and mating type. Little to no recombination was found between the two major groups, serotype B and C, at the whole and mitochondrial genome level. C. gattii VGIII is widespread in the Americas, with sporadic cases occurring elsewhere. WGS revealed Mexico and USA as a likely origin of the serotype B VGIII population and Colombia as a possible origin of the serotype C VGIII population. Serotype B isolates are more virulent than serotype C isolates in a murine model of infection, causing predominantly pulmonary cryptococcosis. No specific link between genotype and virulence was observed. Antifungal susceptibility testing against six antifungal drugs revealed that serotype B isolates are more susceptible to azoles than serotype C isolates, highlighting the importance of strain typing to guide effective treatment to improve the

  6. BASE MAP DATASET, LOS ANGELES COUNTY, CALIFORNIA

    Data.gov (United States)

    Federal Emergency Management Agency, Department of Homeland Security — FEMA Framework Basemap datasets comprise six of the seven FGDC themes of geospatial data that are used by most GIS applications (Note: the seventh framework theme,...

  7. BASE MAP DATASET, CHEROKEE COUNTY, SOUTH CAROLINA

    Data.gov (United States)

    Federal Emergency Management Agency, Department of Homeland Security — FEMA Framework Basemap datasets comprise six of the seven FGDC themes of geospatial data that are used by most GIS applications (Note: the seventh framework theme,...

  8. SIAM 2007 Text Mining Competition dataset

    Data.gov (United States)

    National Aeronautics and Space Administration — Subject Area: Text Mining Description: This is the dataset used for the SIAM 2007 Text Mining competition. This competition focused on developing text mining...

  9. Harvard Aging Brain Study : Dataset and accessibility

    NARCIS (Netherlands)

    Dagley, Alexander; LaPoint, Molly; Huijbers, Willem; Hedden, Trey; McLaren, Donald G.; Chatwal, Jasmeer P.; Papp, Kathryn V.; Amariglio, Rebecca E.; Blacker, Deborah; Rentz, Dorene M.; Johnson, Keith A.; Sperling, Reisa A.; Schultz, Aaron P.

    2017-01-01

    The Harvard Aging Brain Study is sharing its data with the global research community. The longitudinal dataset consists of a 284-subject cohort with the following modalities acquired: demographics, clinical assessment, comprehensive neuropsychological testing, clinical biomarkers, and neuroimaging.

  10. BASE MAP DATASET, HONOLULU COUNTY, HAWAII, USA

    Data.gov (United States)

    Federal Emergency Management Agency, Department of Homeland Security — FEMA Framework Basemap datasets comprise six of the seven FGDC themes of geospatial data that are used by most GIS applications (Note: the seventh framework theme,...

  11. BASE MAP DATASET, EDGEFIELD COUNTY, SOUTH CAROLINA

    Data.gov (United States)

    Federal Emergency Management Agency, Department of Homeland Security — FEMA Framework Basemap datasets comprise six of the seven FGDC themes of geospatial data that are used by most GIS applications (Note: the seventh framework theme,...

  12. Environmental Dataset Gateway (EDG) REST Interface

    Data.gov (United States)

    U.S. Environmental Protection Agency — Use the Environmental Dataset Gateway (EDG) to find and access EPA's environmental resources. Many options are available for easily reusing EDG content in other...

  13. BASE MAP DATASET, INYO COUNTY, CALIFORNIA

    Data.gov (United States)

    Federal Emergency Management Agency, Department of Homeland Security — FEMA Framework Basemap datasets comprise six of the seven FGDC themes of geospatial data that are used by most GIS applications (Note: the seventh framework theme,...

  14. BASE MAP DATASET, JACKSON COUNTY, OKLAHOMA

    Data.gov (United States)

    Federal Emergency Management Agency, Department of Homeland Security — FEMA Framework Basemap datasets comprise six of the seven FGDC themes of geospatial data that are used by most GIS applications (Note: the seventh framework theme,...

  15. BASE MAP DATASET, SANTA CRUZ COUNTY, CALIFORNIA

    Data.gov (United States)

    Federal Emergency Management Agency, Department of Homeland Security — FEMA Framework Basemap datasets comprise six of the seven FGDC themes of geospatial data that are used by most GIS applications (Note: the seventh framework theme,...

  16. Climate Prediction Center IR 4km Dataset

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — CPC IR 4km dataset was created from all available individual geostationary satellite data which have been merged to form nearly seamless global (60N-60S) IR...

  17. BASE MAP DATASET, MAYES COUNTY, OKLAHOMA, USA

    Data.gov (United States)

    Federal Emergency Management Agency, Department of Homeland Security — FEMA Framework Basemap datasets comprise six of the seven FGDC themes of geospatial data that are used by most GIS applications: cadastral, geodetic control,...

  18. BASE MAP DATASET, KINGFISHER COUNTY, OKLAHOMA

    Data.gov (United States)

    Federal Emergency Management Agency, Department of Homeland Security — FEMA Framework Basemap datasets comprise six of the seven FGDC themes of geospatial data that are used by most GIS applications (Note: the seventh framework theme,...

  19. Structural Masquerade of Plesiomonas shigelloides Strain CNCTC 78/89 O-Antigen-High-Resolution Magic Angle Spinning NMR Reveals the Modified d-galactan I of Klebsiella pneumoniae.

    Science.gov (United States)

    Ucieklak, Karolina; Koj, Sabina; Pawelczyk, Damian; Niedziela, Tomasz

    2017-11-29

    The high-resolution magic angle spinning nuclear magnetic resonance spectroscopy (HR-MAS NMR) analysis of Plesiomonas shigelloides 78/89 lipopolysaccharide directly on bacteria revealed the characteristic structural features of the O-acetylated polysaccharide in the NMR spectra. The O-antigen profiles were unique, yet the pattern of signals in the spectra, along with their ¹H and ¹³C chemical shift values, resembled those of d-galactan I of Klebsiella pneumoniae. The isolated O-specific polysaccharide (O-PS) of P. shigelloides strain CNCTC 78/89 was investigated by ¹H and ¹³C NMR spectroscopy, mass spectrometry and chemical methods. The analyses demonstrated that the P. shigelloides 78/89 O-PS is composed of →3)-α-d-Galp-(1→3)-β-d-Galf2OAc-(1→ disaccharide repeating units. The O-acetylation was incomplete and resulted in a microheterogeneity of the O-antigen. This O-acetylation generates additional antigenic determinants within the O-antigen, forms a new chemotype, and contributes to the epitopes recognized by the O-serotype specific antibodies. The serological cross-reactivities further confirmed the inter-specific structural similarity of these O-antigens.

  20. Comparison of recent SnIa datasets

    International Nuclear Information System (INIS)

    Sanchez, J.C. Bueno; Perivolaropoulos, L.; Nesseris, S.

    2009-01-01

    We rank the six latest Type Ia supernova (SnIa) datasets (Constitution (C), Union (U), ESSENCE (Davis) (E), Gold06 (G), SNLS 1yr (S) and SDSS-II (D)) in the context of the Chevalier-Polarski-Linder (CPL) parametrization $w(a) = w_0 + w_1(1-a)$, according to their Figure of Merit (FoM), their consistency with the cosmological constant (ΛCDM), their consistency with standard rulers (Cosmic Microwave Background (CMB) and Baryon Acoustic Oscillations (BAO)) and their mutual consistency. We find a significant improvement of the FoM (defined as the inverse area of the 95.4% parameter contour) with the number of SnIa of these datasets ((C) highest FoM, (U), (G), (D), (E), (S) lowest FoM). Standard rulers (CMB+BAO) have a better FoM by about a factor of 3, compared to the highest FoM SnIa dataset (C). We also find that the ranking sequence based on consistency with ΛCDM is identical with the corresponding ranking based on consistency with standard rulers ((S) most consistent, (D), (C), (E), (U), (G) least consistent). The ranking sequence of the datasets however changes when we consider the consistency with an expansion history corresponding to evolving dark energy $(w_0, w_1) = (-1.4, 2)$ crossing the phantom divide line $w = -1$ (it is practically reversed to (G), (U), (E), (S), (D), (C)). The SALT2 and MLCS2k2 fitters are also compared and some peculiar features of the SDSS-II dataset when standardized with the MLCS2k2 fitter are pointed out. Finally, we construct a statistic to estimate the internal consistency of a collection of SnIa datasets. We find that even though there is good consistency among most samples taken from the above datasets, this consistency decreases significantly when the Gold06 (G) dataset is included in the sample.
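
    As a rough illustration of the CPL parametrization quoted in this abstract, the sketch below evaluates w(a) = w_0 + w_1(1 - a) and solves for the scale factor at which w crosses the phantom divide w = -1. The function names are hypothetical and chosen only for this example; nothing here reproduces the authors' ranking or fitting procedure.

```python
def cpl_w(a, w0, w1):
    """CPL equation of state w(a) = w0 + w1 * (1 - a) at scale factor a."""
    return w0 + w1 * (1.0 - a)


def phantom_crossing_scale_factor(w0, w1):
    """Scale factor a at which w(a) = -1, or None if w does not vary (w1 == 0).

    Setting w0 + w1 * (1 - a) = -1 and solving gives a = 1 + (1 + w0) / w1.
    """
    if w1 == 0:
        return None
    return 1.0 + (1.0 + w0) / w1


def redshift_from_scale_factor(a):
    """Standard relation z = 1/a - 1 (a = 1 today)."""
    return 1.0 / a - 1.0


# Evolving dark energy case mentioned in the abstract: (w0, w1) = (-1.4, 2).
a_cross = phantom_crossing_scale_factor(-1.4, 2.0)
z_cross = redshift_from_scale_factor(a_cross)

# Cosmological constant (LambdaCDM) corresponds to w0 = -1, w1 = 0.
w_lcdm = cpl_w(0.5, -1.0, 0.0)
```

    With (w_0, w_1) = (-1.4, 2), the crossing works out to a scale factor of about 0.8, that is, a redshift of about 0.25, while the ΛCDM case stays at w = -1 for every a.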

  1. Knowledge discovery with classification rules in a cardiovascular dataset.

    Science.gov (United States)

    Podgorelec, Vili; Kokol, Peter; Stiglic, Milojka Molan; Hericko, Marjan; Rozman, Ivan

    2005-12-01

    In this paper we study an evolutionary machine learning approach to data mining and knowledge discovery based on the induction of classification rules. A method for automatic rule induction called AREX, which uses evolutionary induction of decision trees and automatic programming, is introduced. The proposed algorithm is applied to a cardiovascular dataset consisting of different groups of attributes that may reveal the presence of specific cardiovascular problems in young patients. A case study is presented that shows the use of AREX for the classification of patients and for discovering possible new medical knowledge from the dataset. The defined knowledge discovery loop comprises a medical expert's assessment of induced rules to drive the evolution of rule sets towards more appropriate solutions. The final result is the discovery of possible new medical knowledge in the field of pediatric cardiology.

  2. Data Recommender: An Alternative Way to Discover Open Scientific Datasets

    Science.gov (United States)

    Klump, J. F.; Devaraju, A.; Williams, G.; Hogan, D.; Davy, R.; Page, J.; Singh, D.; Peterson, N.

    2017-12-01

    similar and serendipitous data recommendations. It measures the relevance between datasets based on their properties, and search and download patterns. We evaluated the recommendation approach in a user study, and the obtained user judgments revealed the ability of the approach to accurately quantify the relevance of the datasets.

  3. The satellite-based remote sensing of particulate matter (PM) in support to urban air quality: PM variability and hot spots within the Cordoba city (Argentina) as revealed by the high-resolution MAIAC-algorithm retrievals applied to a ten-years dataset (2

    Science.gov (United States)

    Della Ceca, Lara Sofia; Carreras, Hebe A.; Lyapustin, Alexei I.; Barnaba, Francesca

    2016-04-01

    Particulate matter (PM) is one of the major harmful pollutants to public health and the environment [1]. In developed countries, specific air-quality legislation establishes limit values for PM metrics (e.g., PM10, PM2.5) to protect citizens' health (e.g., European Commission Directive 2008/50, US Clean Air Act). Extensive PM measuring networks therefore exist in these countries to comply with the legislation. In less developed countries air quality monitoring networks are still lacking and satellite-based datasets could represent a valid alternative to fill observational gaps. The main PM (or aerosol) parameter retrieved from satellite is the 'aerosol optical depth' (AOD), an optical parameter quantifying the aerosol load in the whole atmospheric column. Datasets from the MODIS sensors on board the NASA spacecraft TERRA and AQUA are among the longest records of AOD from space. However, although extremely useful in regional and global studies, the standard 10 km-resolution MODIS AOD product is not suitable for use at the urban scale. Recently, a new algorithm called Multi-Angle Implementation of Atmospheric Correction (MAIAC) was developed for MODIS, providing AOD at 1 km resolution [2]. In this work, the MAIAC AOD retrievals over the decade 2003-2013 were employed to investigate the spatiotemporal variation of atmospheric aerosols over the Argentinean city of Cordoba and its surroundings, an area where only a very scarce dataset of in situ PM data is available. The MAIAC retrievals over the city were first validated using a 'ground truth' AOD dataset from the Cordoba sunphotometer operating within the global AERONET network [3]. This validation showed the good performance of the MAIAC algorithm in the area. The satellite MAIAC AOD dataset was therefore employed to investigate the 10-year trend as well as seasonal and monthly patterns of particulate matter in the city of Cordoba. The first showed a marked increase of AOD over time, particularly evident in

  4. Comparison of Shallow Survey 2012 Multibeam Datasets

    Science.gov (United States)

    Ramirez, T. M.

    2012-12-01

    The purpose of the Shallow Survey common dataset is a comparison of the different technologies utilized for data acquisition in the shallow survey marine environment. The common dataset consists of a series of surveys conducted over a common area of seabed using a variety of systems. It provides equipment manufacturers the opportunity to showcase their latest systems while giving hydrographic researchers and scientists a chance to test their latest algorithms on the dataset so that rigorous comparisons can be made. Five companies collected data for the Common Dataset in the Wellington Harbor area in New Zealand between May 2010 and May 2011: Kongsberg, Reson, R2Sonic, GeoAcoustics, and Applied Acoustics. The Wellington harbor and surrounding coastal area was selected since it has a number of well-defined features, including the HMNZS South Seas and HMNZS Wellington wrecks, an armored seawall constructed of Tetrapods and Akmons, aquifers, wharves and marinas. The seabed inside the harbor basin is largely fine-grained sediment, with gravel and reefs around the coast. The area outside the harbor on the southern coast is an active environment, with moving sand and exposed reefs. A marine reserve is also in this area. For consistency between datasets, the coastal research vessel R/V Ikatere and crew were used for all surveys conducted for the common dataset. Using Triton's Perspective processing software, multibeam datasets collected for the Shallow Survey were processed for detailed analysis. Datasets from each sonar manufacturer were processed using the CUBE algorithm developed by the Center for Coastal and Ocean Mapping/Joint Hydrographic Center (CCOM/JHC). Each dataset was gridded at 0.5 and 1.0 meter resolutions for cross comparison and compliance with International Hydrographic Organization (IHO) requirements. Detailed comparisons were made of equipment specifications (transmit frequency, number of beams, beam width), data density, total uncertainty, and

  5. The Whole-Genome Sequence of Bacillus velezensis Strain SB1216 Isolated from the Great Salt Plains of Oklahoma Reveals the Presence of a Novel Extracellular RNase with Antitumor Activity.

    Science.gov (United States)

    Marasini, Daya; Cornell, Carolyn R; Oyewole, Opeoluwa; Sheaff, Robert J; Fakhr, Mohamed K

    2017-11-22

    The whole-genome sequence of Bacillus velezensis strain SB1216, isolated from the Great Salt Plains of Oklahoma, showed the presence of a 3,814,720-bp circular chromosome and no plasmids. The presence of a novel 870-bp extracellular RNase gene is predicted to be responsible for this strain's antitumor activity. Copyright © 2017 Marasini et al.

  6. 3DSEM: A 3D microscopy dataset

    Directory of Open Access Journals (Sweden)

    Ahmad P. Tafti

    2016-03-01

    Full Text Available The Scanning Electron Microscope (SEM) as a 2D imaging instrument has been widely used in many scientific disciplines including biological, mechanical, and materials sciences to determine the surface attributes of microscopic objects. However the SEM micrographs still remain 2D images. To effectively measure and visualize the surface properties, we need to truly restore the 3D shape model from 2D SEM images. Having 3D surfaces would provide anatomic shape of micro-samples which allows for quantitative measurements and informative visualization of the specimens being investigated. The 3DSEM is a dataset for 3D microscopy vision which is freely available at [1] for any academic, educational, and research purposes. The dataset includes both 2D images and 3D reconstructed surfaces of several real microscopic samples. Keywords: 3D microscopy dataset, 3D microscopy vision, 3D SEM surface reconstruction, Scanning Electron Microscope (SEM)

  7. Data Mining for Imbalanced Datasets: An Overview

    Science.gov (United States)

    Chawla, Nitesh V.

    A dataset is imbalanced if the classification categories are not approximately equally represented. Recent years brought increased interest in applying machine learning techniques to difficult "real-world" problems, many of which are characterized by imbalanced data. Additionally the distribution of the testing data may differ from that of the training data, and the true misclassification costs may be unknown at learning time. Predictive accuracy, a popular choice for evaluating performance of a classifier, might not be appropriate when the data is imbalanced and/or the costs of different errors vary markedly. In this Chapter, we discuss some of the sampling techniques used for balancing the datasets, and the performance measures more appropriate for mining imbalanced datasets.
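
    The point about predictive accuracy being misleading on imbalanced data can be sketched with a small, made-up example. The labels, the 5%/95% class split, and the function names below are assumptions for illustration only and are not taken from the chapter; random undersampling is shown as one simple instance of the sampling techniques it mentions.

```python
import random


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of true positives that were predicted as positive."""
    positives = [p for t, p in zip(y_true, y_pred) if t == positive]
    if not positives:
        raise ValueError("no positive examples in y_true")
    return sum(p == positive for p in positives) / len(positives)


def undersample_indices(labels, seed=0):
    """Randomly keep as many examples of each class as the rarest class has."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    n_min = min(len(indices) for indices in by_class.values())
    kept = []
    for indices in by_class.values():
        kept.extend(rng.sample(indices, n_min))
    return sorted(kept)


# Hypothetical imbalanced dataset: 5 positives, 95 negatives.
y_true = [1] * 5 + [0] * 95
# A trivial classifier that always predicts the majority class.
y_pred = [0] * 100

acc = accuracy(y_true, y_pred)   # high, despite a useless classifier
rec = recall(y_true, y_pred)     # zero: no positive is ever found
balanced = undersample_indices(y_true)
```

    Here the always-negative classifier scores 95% accuracy while finding none of the positive cases, which is why measures such as recall are preferred in this setting; after undersampling, both classes contribute five examples each.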

  8. Genomics dataset of unidentified disclosed isolates

    Directory of Open Access Journals (Sweden)

    Bhagwan N. Rekadwad

    2016-09-01

    Full Text Available Analysis of DNA sequences is necessary for higher hierarchical classification of the organisms. It gives clues about the characteristics of organisms and their taxonomic position. This dataset is chosen to find complexities in the unidentified DNA in the disclosed patents. A total of 17 unidentified DNA sequences were thoroughly analyzed. Quick response (QR) codes were generated, and the AT/GC content of the DNA sequences was analyzed. The QR codes are helpful for quick identification of isolates. AT/GC content is helpful for studying their stability at different temperatures. Additionally, a dataset on cleavage codes and enzyme codes from the restriction digestion study, which is helpful for studies using short DNA sequences, is reported. The dataset disclosed here provides new data for exploration of unique DNA sequences for evaluation, identification, comparison and analysis. Keywords: BioLABs, Blunt ends, Genomics, NEB cutter, Restriction digestion, Short DNA sequences, Sticky ends
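
    The AT/GC content analysis mentioned above is a simple base count; a minimal sketch follows. The function name and the example sequence are hypothetical and are not drawn from the dataset. The link to thermal stability reflects the general fact that G-C pairs form three hydrogen bonds and A-T pairs two.

```python
def at_gc_content(sequence):
    """Return (AT fraction, GC fraction) of a DNA sequence.

    Only A, T, G and C are counted; other symbols (e.g. N) are ignored.
    """
    seq = sequence.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    total = at + gc
    if total == 0:
        raise ValueError("sequence contains no A, T, G or C bases")
    return at / total, gc / total


# Hypothetical short sequence used only to exercise the function.
at_fraction, gc_fraction = at_gc_content("ATGCGC")
```

    For the toy sequence "ATGCGC" the GC fraction is two thirds, so it would be expected to be somewhat more heat-stable than an AT-rich sequence of the same length.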

  9. A curated database of cyanobacterial strains relevant for modern taxonomy and phylogenetic studies

    OpenAIRE

    Ramos, Vitor; Morais, João; Vasconcelos, Vitor M.

    2017-01-01

    The dataset herein described lays the groundwork for an online database of relevant cyanobacterial strains, named CyanoType (http://lege.ciimar.up.pt/cyanotype). It is a database that includes categorized cyanobacterial strains useful for taxonomic, phylogenetic or genomic purposes, with associated information obtained by means of a literature-based curation. The dataset lists 371 strains and represents the first version of the database (CyanoType v.1). Information for each strain includes st...

  10. Harvard Aging Brain Study: Dataset and accessibility.

    Science.gov (United States)

    Dagley, Alexander; LaPoint, Molly; Huijbers, Willem; Hedden, Trey; McLaren, Donald G; Chatwal, Jasmeer P; Papp, Kathryn V; Amariglio, Rebecca E; Blacker, Deborah; Rentz, Dorene M; Johnson, Keith A; Sperling, Reisa A; Schultz, Aaron P

    2017-01-01

    The Harvard Aging Brain Study is sharing its data with the global research community. The longitudinal dataset consists of a 284-subject cohort with the following modalities acquired: demographics, clinical assessment, comprehensive neuropsychological testing, clinical biomarkers, and neuroimaging. To promote more extensive analyses, imaging data was designed to be compatible with other publicly available datasets. A cloud-based system enables access to interested researchers with blinded data available contingent upon completion of a data usage agreement and administrative approval. Data collection is ongoing and currently in its fifth year. Copyright © 2015 Elsevier Inc. All rights reserved.

  11. Feedback control in deep drawing based on experimental datasets

    Science.gov (United States)

    Fischer, P.; Heingärtner, J.; Aichholzer, W.; Hortig, D.; Hora, P.

    2017-09-01

    In large-scale production of deep drawing parts, as in the automotive industry, the effects of scattering material properties as well as warming of the tools have a significant impact on the drawing result. In the scope of this work, an approach is presented to minimize the influence of these effects on part quality by optically measuring the draw-in of each part and adjusting the settings of the press to keep the strain distribution, which is represented by the draw-in, inside a certain limit. For the design of the control algorithm, a design of experiments for in-line tests is used to quantify the influence of the blank holder force as well as the force distribution on the draw-in. The results of this experimental dataset are used to model the process behavior. Based on this model, a feedback control loop is designed. Finally, the performance of the control algorithm is validated in the production line.

  12. Benchmarking undedicated cloud computing providers for analysis of genomic datasets.

    Science.gov (United States)

    Yazar, Seyhan; Gooden, George E C; Mackey, David A; Hewitt, Alex W

    2014-01-01

    A major bottleneck in biological discovery is now emerging at the computational level. Cloud computing offers a dynamic means whereby small and medium-sized laboratories can rapidly adjust their computational capacity. We benchmarked two established cloud computing services, Amazon Web Services Elastic MapReduce (EMR) on Amazon EC2 instances and Google Compute Engine (GCE), using publicly available genomic datasets (E.coli CC102 strain and a Han Chinese male genome) and a standard bioinformatic pipeline on a Hadoop-based platform. Wall-clock time for complete assembly differed by 52.9% (95% CI: 27.5-78.2) for E.coli and 53.5% (95% CI: 34.4-72.6) for human genome, with GCE being more efficient than EMR. The cost of running this experiment on EMR and GCE differed significantly, with the costs on EMR being 257.3% (95% CI: 211.5-303.1) and 173.9% (95% CI: 134.6-213.1) more expensive for E.coli and human assemblies respectively. Thus, GCE was found to outperform EMR both in terms of cost and wall-clock time. Our findings confirm that cloud computing is an efficient and potentially cost-effective alternative for analysis of large genomic datasets. In addition to releasing our cost-effectiveness comparison, we present available ready-to-use scripts for establishing Hadoop instances with Ganglia monitoring on EC2 or GCE.

  13. Benchmarking undedicated cloud computing providers for analysis of genomic datasets.

    Directory of Open Access Journals (Sweden)

    Seyhan Yazar

    Full Text Available A major bottleneck in biological discovery is now emerging at the computational level. Cloud computing offers a dynamic means whereby small and medium-sized laboratories can rapidly adjust their computational capacity. We benchmarked two established cloud computing services, Amazon Web Services Elastic MapReduce (EMR) on Amazon EC2 instances and Google Compute Engine (GCE), using publicly available genomic datasets (E.coli CC102 strain and a Han Chinese male genome) and a standard bioinformatic pipeline on a Hadoop-based platform. Wall-clock time for complete assembly differed by 52.9% (95% CI: 27.5-78.2) for E.coli and 53.5% (95% CI: 34.4-72.6) for human genome, with GCE being more efficient than EMR. The cost of running this experiment on EMR and GCE differed significantly, with the costs on EMR being 257.3% (95% CI: 211.5-303.1) and 173.9% (95% CI: 134.6-213.1) more expensive for E.coli and human assemblies respectively. Thus, GCE was found to outperform EMR both in terms of cost and wall-clock time. Our findings confirm that cloud computing is an efficient and potentially cost-effective alternative for analysis of large genomic datasets. In addition to releasing our cost-effectiveness comparison, we present available ready-to-use scripts for establishing Hadoop instances with Ganglia monitoring on EC2 or GCE.

  14. Random Coefficient Logit Model for Large Datasets

    NARCIS (Netherlands)

    C. Hernández-Mireles (Carlos); D. Fok (Dennis)

    2010-01-01

    We present an approach for analyzing market shares and product price elasticities based on large datasets containing aggregate sales data for many products, several markets and for relatively long time periods. We consider the recently proposed Bayesian approach of Jiang et al [Jiang,

  15. Thesaurus Dataset of Educational Technology in Chinese

    Science.gov (United States)

    Wu, Linjing; Liu, Qingtang; Zhao, Gang; Huang, Huan; Huang, Tao

    2015-01-01

    The thesaurus dataset of educational technology is a knowledge description of educational technology in Chinese. The aims of this thesaurus were to collect the subject terms in the domain of educational technology, facilitate the standardization of terminology and promote the communication between Chinese researchers and scholars from various…

  16. Heterogeneity of the Epstein-Barr Virus (EBV) Major Internal Repeat Reveals Evolutionary Mechanisms of EBV and a Functional Defect in the Prototype EBV Strain B95-8.

    Science.gov (United States)

    Ba Abdullah, Mohammed M; Palermo, Richard D; Palser, Anne L; Grayson, Nicholas E; Kellam, Paul; Correia, Samantha; Szymula, Agnieszka; White, Robert E

    2017-12-01

    Epstein-Barr virus (EBV) is a ubiquitous pathogen of humans that can cause several types of lymphoma and carcinoma. Like other herpesviruses, EBV has diversified through both coevolution with its host and genetic exchange between virus strains. Sequence analysis of the EBV genome is unusually challenging because of the large number and lengths of repeat regions within the virus. Here we describe the sequence assembly and analysis of the large internal repeat 1 of EBV (IR1; also known as the BamW repeats) for more than 70 strains. The diversity of the latency protein EBV nuclear antigen leader protein (EBNA-LP) resides predominantly within the exons downstream of IR1. The integrity of the putative BWRF1 open reading frame (ORF) is retained in over 80% of strains, and deletions truncating IR1 always spare BWRF1. Conserved regions include the IR1 latency promoter (Wp) and one zone upstream of and two within BWRF1. IR1 is heterogeneous in 70% of strains, and this heterogeneity arises from sequence exchange between strains as well as from spontaneous mutation, with interstrain recombination being more common in tumor-derived viruses. This genetic exchange often incorporates regions of Epstein-Barr virus (EBV) infects the majority of the world population but causes illness in only a small minority of people. Nevertheless, over 1% of cancers worldwide are attributable to EBV. Recent sequencing projects investigating virus diversity to see if different strains have different disease impacts have excluded regions of repeating sequence, as they are more technically challenging. Here we analyze the sequence of the largest repeat in EBV (IR1). We first characterized the variations in protein sequences encoded across IR1. In studying variations within the repeat of each strain, we identified a mutation in the main laboratory strain of EBV that impairs virus function, and we suggest that tumor-associated viruses may be more likely to contain DNA mixed from two strains. The

  17. Strains and Stressors: An Analysis of Touchscreen Learning in Genetically Diverse Mouse Strains

    Science.gov (United States)

    Graybeal, Carolyn; Bachu, Munisa; Mozhui, Khyobeni; Saksida, Lisa M.; Bussey, Timothy J.; Sagalyn, Erica; Williams, Robert W.; Holmes, Andrew

    2014-01-01

    Touchscreen-based systems are growing in popularity as a tractable, translational approach for studying learning and cognition in rodents. However, while mouse strains are well known to differ in learning across various settings, performance variation between strains in touchscreen learning has not been well described. The selection of appropriate genetic strains and backgrounds is critical to the design of touchscreen-based studies and provides a basis for elucidating genetic factors moderating behavior. Here we provide a quantitative foundation for visual discrimination and reversal learning using touchscreen assays across a total of 35 genotypes. We found significant differences in operant performance and learning, including faster reversal learning in DBA/2J compared to C57BL/6J mice. We then assessed DBA/2J and C57BL/6J for differential sensitivity to an environmental insult by testing for alterations in reversal learning following exposure to repeated swim stress. Stress facilitated reversal learning (selectively during the late stage of reversal) in C57BL/6J, but did not affect learning in DBA/2J. To dissect genetic factors underlying these differences, we phenotyped a family of 27 BXD strains generated by crossing C57BL/6J and DBA/2J. There was marked variation in discrimination, reversal and extinction learning across the BXD strains, suggesting this task may be useful for identifying underlying genetic differences. Moreover, different measures of touchscreen learning were only modestly correlated in the BXD strains, indicating that these processes are comparatively independent at both genetic and phenotypic levels. Finally, we examined the behavioral structure of learning via principal component analysis of the current data, plus an archival dataset, totaling 765 mice. This revealed 5 independent factors suggestive of “reversal learning,” “motivation-related late reversal learning,” “discrimination learning,” “speed to respond,” and

  18. Analysis of Erwinia chrysanthemi EC16 pelE::uidA, pelL::uidA, and hrpN::uidA mutants reveals strain-specific atypical regulation of the Hrp type III secretion system.

    Science.gov (United States)

    Ham, Jong Hyun; Cui, Yaya; Alfano, James R; Rodríguez-Palenzuela, Pablo; Rojas, Clemencia M; Chatterjee, Arun K; Collmer, Alan

    2004-02-01

    The plant pathogen Erwinia chrysanthemi produces a variety of factors that have been implicated in its ability to cause soft-rot diseases in various hosts. These include HrpN, a harpin secreted by the Hrp type III secretion system; PelE, one of several major pectate lyase isozymes secreted by the type II system; and PelL, one of several secondary Pels secreted by the type II system. We investigated these factors in E. chrysanthemi EC16 with respect to the effects of medium composition and growth phase on gene expression (as determined with uidA fusions and Northern analyses) and effects on virulence. pelE was induced by polygalacturonic acid, but pelL was not, and hrpN was expressed unexpectedly in nutrient-rich King's medium B and in minimal salts medium at neutral pH. In contrast, the effect of medium composition on hrp expression in E. chrysanthemi CUCPB1237 and 3937 was like that of many other phytopathogenic bacteria in being repressed in complex media and induced in acidic pH minimal medium. Northern blot analysis of hrpN and hrpL expression by the wild-type and hrpL::omegaCmr and hrpS::omegaCmr mutants revealed that hrpN expression was dependent on the HrpL alternative sigma factor, whose expression, in turn, was dependent on the HrpS putative sigma54 enhancer binding protein. The expression of pelE and hrpN increased strongly in late logarithmic growth phase. To test the possible role of quorum sensing in this expression pattern, the expI/expR locus was cloned in Escherichia coli on the basis of its ability to direct production of acyl-homoserine lactone and then used to construct expI mutations in pelE::uidA, pelL::uidA, and hrpN::uidA Erwinia chrysanthemi strains. Mutation of expI had no apparent effect on the growth-phase-dependent expression of hrpN and pelE, or on the virulence of E. chrysanthemi in witloof chicory leaves. Overexpression of hrpN in E. chrysanthemi resulted in approximately 50% reduction of lesion size on chicory leaves without an

  19. Strain Pattern in Supercooled Liquids

    Science.gov (United States)

    Illing, Bernd; Fritschi, Sebastian; Hajnal, David; Klix, Christian; Keim, Peter; Fuchs, Matthias

    2016-11-01

    Investigations of strain correlations at the glass transition reveal unexpected phenomena. The shear strain fluctuations show an Eshelby-strain pattern [$\sim \cos(4\theta)/r^2$], characteristic of elastic response, even in liquids, at long times. We address this using a mode-coupling theory for the strain fluctuations in supercooled liquids and data from both video microscopy of a two-dimensional colloidal glass former and simulations of Brownian hard disks. We show that the long-ranged and long-lived strain signatures follow a scaling law valid close to the glass transition. For large enough viscosities, the Eshelby-strain pattern is visible even on time scales longer than the structural relaxation time τ and after the shear modulus has relaxed to zero.
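
    To make the angular and radial form of the Eshelby-strain pattern concrete, the sketch below evaluates a function proportional to cos(4θ)/r² in two dimensions. The prefactor `amplitude` and the function name are assumptions for illustration; the abstract gives only the proportionality, and this is not the authors' mode-coupling calculation.

```python
import math


def eshelby_pattern(r, theta, amplitude=1.0):
    """Value of amplitude * cos(4*theta) / r**2 at polar coordinates (r, theta).

    The prefactor `amplitude` is a placeholder; only the cos(4*theta)/r^2
    shape comes from the abstract.
    """
    if r <= 0:
        raise ValueError("r must be positive")
    return amplitude * math.cos(4.0 * theta) / r ** 2


# Fourfold symmetry: rotating by 90 degrees leaves the value unchanged,
# while rotating by 45 degrees flips its sign.
base = eshelby_pattern(2.0, 0.3)
rotated_90 = eshelby_pattern(2.0, 0.3 + math.pi / 2)
rotated_45 = eshelby_pattern(2.0, 0.3 + math.pi / 4)

# Power-law decay: doubling the distance reduces the magnitude fourfold.
farther = eshelby_pattern(4.0, 0.3)
```

    The alternating-sign lobes every 45 degrees and the slow 1/r² decay are what make the pattern long-ranged and recognizable in correlation maps.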

  20. The complete genome sequencing of Prevotella intermedia strain OMA14 and a subsequent fine-scale, intra-species genomic comparison reveal an unusual amplification of conjugative and mobile transposons and identify a novel Prevotella-lineage-specific repeat.

    Science.gov (United States)

    Naito, Mariko; Ogura, Yoshitoshi; Itoh, Takehiko; Shoji, Mikio; Okamoto, Masaaki; Hayashi, Tetsuya; Nakayama, Koji

    2016-02-01

    Prevotella intermedia is a pathogenic bacterium involved in periodontal diseases. Here, we present the complete genome sequence of a clinical strain, OMA14, of this bacterium along with the results of comparative genome analysis with strain 17 of the same species, whose genome has also been sequenced but not yet fully analysed. The genomes of both strains consist of two circular chromosomes: the larger chromosomes are similar in size and exhibit a high overall linearity of gene organizations, whereas the smaller chromosomes show a significant size variation and have undergone remarkable genome rearrangements. Unique features of the Pre. intermedia genomes are the presence of a remarkable number of essential genes on the second chromosomes and the abundance of conjugative and mobilizable transposons (CTns and MTns). The CTns/MTns are particularly abundant in the second chromosomes, are involved in their extensive genome rearrangement, and have introduced a number of strain-specific genes into each strain. We also found a novel 188-bp repeat sequence that has been highly amplified in Pre. intermedia and is specifically distributed among the Pre. intermedia-related species. These findings expand our understanding of the genetic features of Pre. intermedia and the roles of CTns and MTns in the evolution of bacteria. © The Author 2015. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.

  1. Sharing Video Datasets in Design Research

    DEFF Research Database (Denmark)

    Christensen, Bo; Abildgaard, Sille Julie Jøhnk

    2017-01-01

    This paper examines how design researchers, design practitioners and design education can benefit from sharing a dataset. We present the Design Thinking Research Symposium 11 (DTRS11) as an exemplary project that implied sharing video data of design processes and design activity in natural settings with a large group of fellow academics from the international community of Design Thinking Research, for the purpose of facilitating research collaboration and communication within the field of Design and Design Thinking. This approach emphasizes the social and collaborative aspects of design research, where a multitude of appropriate perspectives and methods may be utilized in analyzing and discussing the singular dataset. The shared data is, from this perspective, understood as a design object in itself, which facilitates new ways of working, collaborating, studying, learning and educating within the expanding...

  2. Automatic processing of multimodal tomography datasets.

    Science.gov (United States)

    Parsons, Aaron D; Price, Stephen W T; Wadeson, Nicola; Basham, Mark; Beale, Andrew M; Ashton, Alun W; Mosselmans, J Frederick W; Quinn, Paul D

    2017-01-01

    With the development of fourth-generation high-brightness synchrotrons on the horizon, the already large volume of data that will be collected on imaging and mapping beamlines is set to increase by orders of magnitude. As such, an easy and accessible way of dealing with such large datasets as quickly as possible is required in order to be able to address the core scientific problems during the experimental data collection. Savu is an accessible and flexible big data processing framework that is able to deal with both the variety and the volume of data of multimodal and multidimensional scientific datasets output such as those from chemical tomography experiments on the I18 microfocus scanning beamline at Diamond Light Source.

  3. Interpolation of diffusion weighted imaging datasets

    DEFF Research Database (Denmark)

    Dyrby, Tim B; Lundell, Henrik; Burke, Mark W

    2014-01-01

    Diffusion weighted imaging (DWI) is used to study white-matter fibre organisation, orientation and structural connectivity by means of fibre reconstruction algorithms and tractography. For clinical settings, limited scan time compromises the possibilities to achieve high image resolution for finer anatomical details and signal-to-noise-ratio for reliable fibre reconstruction. We assessed the potential benefits of interpolating DWI datasets to a higher image resolution before fibre reconstruction using a diffusion tensor model. Simulations of straight and curved crossing tracts smaller than or equal...... interpolation methods fail to disentangle fine anatomical details if PVE is too pronounced in the original data. As for validation we used ex-vivo DWI datasets acquired at various image resolutions as well as Nissl-stained sections. Increasing the image resolution by a factor of eight yielded finer geometrical...

  4. Data assimilation and model evaluation experiment datasets

    Science.gov (United States)

    Lai, Chung-Cheng A.; Qian, Wen; Glenn, Scott M.

    1994-01-01

    The Institute for Naval Oceanography, in cooperation with Naval Research Laboratories and universities, executed the Data Assimilation and Model Evaluation Experiment (DAMEE) for the Gulf Stream region during fiscal years 1991-1993. Enormous effort has gone into the preparation of several high-quality and consistent datasets for model initialization and verification. This paper describes the preparation process, the temporal and spatial scopes, the contents, the structure, etc., of these datasets. The goal of DAMEE and the need of data for the four phases of experiment are briefly stated. The preparation of DAMEE datasets consisted of a series of processes: (1) collection of observational data; (2) analysis and interpretation; (3) interpolation using the Optimum Thermal Interpolation System package; (4) quality control and re-analysis; and (5) data archiving and software documentation. The data products from these processes included a time series of 3D fields of temperature and salinity, 2D fields of surface dynamic height and mixed-layer depth, analysis of the Gulf Stream and rings system, and bathythermograph profiles. To date, these are the most detailed and high-quality data for mesoscale ocean modeling, data assimilation, and forecasting research. Feedback from ocean modeling groups who tested this data was incorporated into its refinement. Suggestions for DAMEE data usages include (1) ocean modeling and data assimilation studies, (2) diagnosis and theoretical studies, and (3) comparisons with locally detailed observations.

  5. A hybrid organic-inorganic perovskite dataset

    Science.gov (United States)

    Kim, Chiho; Huan, Tran Doan; Krishnan, Sridevi; Ramprasad, Rampi

    2017-05-01

    Hybrid organic-inorganic perovskites (HOIPs) have been attracting a great deal of attention due to their versatility of electronic properties and fabrication methods. We prepare a dataset of 1,346 HOIPs, which features 16 organic cations, 3 group-IV cations and 4 halide anions. Using a combination of an atomic structure search method and density functional theory calculations, the optimized structures, the bandgap, the dielectric constant, and the relative energies of the HOIPs are uniformly prepared and validated by comparing with relevant experimental and/or theoretical data. We make the dataset available at Dryad Digital Repository, NoMaD Repository, and Khazana Repository (http://khazana.uconn.edu/), hoping that it could be useful for future data-mining efforts that can explore possible structure-property relationships and phenomenological models. Progressive extension of the dataset is expected as new organic cations become appropriate within the HOIP framework, and as additional properties are calculated for the new compounds found.
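The composition space described in this abstract can be sketched as a simple Cartesian product. This is only an illustration: the abstract gives the counts (16, 3, 4) but not the species, so the labels below are placeholders, and the sketch assumes that every cation/cation/anion combination is represented in the dataset.

```python
from itertools import product

# Placeholder labels: the abstract states only how many species there are.
organic_cations = [f"A{i}" for i in range(1, 17)]   # 16 organic cations
group_iv_cations = [f"B{i}" for i in range(1, 4)]   # 3 group-IV cations
halide_anions = [f"X{i}" for i in range(1, 5)]      # 4 halide anions

# Every (organic cation, group-IV cation, halide) triple is one composition.
compositions = list(product(organic_cations, group_iv_cations, halide_anions))
n_compositions = len(compositions)

# Under the assumption above, the 1,346 entries would be several
# structures per composition on average.
mean_structures_per_composition = 1346 / n_compositions

print(n_compositions, round(mean_structures_per_composition, 2))
```

Under that assumption, the dataset size reflects multiple candidate structures per composition rather than 1,346 distinct chemistries.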

  6. Quantifying uncertainty in observational rainfall datasets

    Science.gov (United States)

    Lennard, Chris; Dosio, Alessandro; Nikulin, Grigory; Pinto, Izidine; Seid, Hussen

    2015-04-01

    The CO-ordinated Regional Downscaling Experiment (CORDEX) has to date seen the publication of at least ten journal papers that examine the African domain during 2012 and 2013. Five of these papers consider Africa generally (Nikulin et al. 2012, Kim et al. 2013, Hernandes-Dias et al. 2013, Laprise et al. 2013, Panitz et al. 2013) and five have regional foci: Tramblay et al. (2013) on Northern Africa, Mariotti et al. (2014) and Gbobaniyi et al. (2013) on West Africa, Endris et al. (2013) on East Africa and Kalagnoumou et al. (2013) on southern Africa. There are also a further three papers that the authors know about under review. These papers all use observed rainfall and/or temperature data to evaluate/validate the regional model output and often proceed to assess projected changes in these variables due to climate change in the context of these observations. The most popular reference rainfall data used are the CRU, GPCP, GPCC, TRMM and UDEL datasets. However, as Kalagnoumou et al. (2013) point out, there are many other rainfall datasets available for consideration, for example, CMORPH, FEWS, TAMSAT & RIANNAA, TAMORA and the WATCH & WATCH-DEI data. They, with others (Nikulin et al. 2012, Sylla et al. 2012), show that the observed datasets can have a very wide spread at a particular space-time coordinate. As more ground, space and reanalysis-based rainfall products become available, all of which use different methods to produce precipitation data, the selection of reference data is becoming an important factor in model evaluation. A number of factors can contribute to uncertainty in the reliability and validity of the datasets, such as radiance conversion algorithms, the quantity and quality of available station data, interpolation techniques and blending methods used to combine satellite and gauge based products. However, to date no comprehensive study has been performed to evaluate the uncertainty in these observational datasets. We assess 18 gridded
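The "spread at a particular space-time coordinate" mentioned in this abstract can be illustrated with a minimal sketch. The rainfall values below are invented for illustration and are not taken from any of the named products; the function name `spread` and the chosen statistics are assumptions, not the authors' method.

```python
from statistics import mean, pstdev

# Hypothetical monthly rainfall totals (mm) reported by five products
# for the same grid cell and the same month. Values are invented.
rainfall_mm = {
    "CRU": 112.0,
    "GPCP": 98.0,
    "GPCC": 105.0,
    "TRMM": 131.0,
    "UDEL": 90.0,
}

def spread(values):
    """Summarise disagreement between products at one space-time point."""
    vals = list(values)
    centre = mean(vals)
    sd = pstdev(vals)  # population standard deviation across products
    return {
        "mean": centre,
        "range": max(vals) - min(vals),
        "std": sd,
        "cv": sd / centre,  # coefficient of variation, unitless
    }

stats = spread(rainfall_mm.values())
print(stats)
```

Repeating this over every grid cell and time step would give a map of inter-product disagreement, which is one simple way to express observational uncertainty.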

  7. The Whole-Genome Sequence of Bacillus velezensis Strain SB1216 Isolated from the Great Salt Plains of Oklahoma Reveals the Presence of a Novel Extracellular RNase with Antitumor Activity

    OpenAIRE

    Marasini, Daya; Cornell, Carolyn R.; Oyewole, Opeoluwa; Sheaff, Robert J.; Fakhr, Mohamed K.

    2017-01-01

    ABSTRACT The whole-genome sequence of Bacillus velezensis strain SB1216, isolated from the Great Salt Plains of Oklahoma, showed the presence of a 3,814,720-bp circular chromosome and no plasmids. The presence of a novel 870-bp extracellular RNase gene is predicted to be responsible for this strain’s antitumor activity.

  8. Genome sequencing and transcriptome analysis of Trichoderma reesei QM9978 strain reveals a distal chromosome translocation to be responsible for loss of vib1 expression and loss of cellulase induction.

    Science.gov (United States)

    Ivanova, Christa; Ramoni, Jonas; Aouam, Thiziri; Frischmann, Alexa; Seiboth, Bernhard; Baker, Scott E; Le Crom, Stéphane; Lemoine, Sophie; Margeot, Antoine; Bidard, Frédérique

    2017-01-01

    The hydrolysis of biomass to simple sugars used for the production of biofuels in biorefineries requires the action of cellulolytic enzyme mixtures. During the last 50 years, the ascomycete Trichoderma reesei, the main source of industrial cellulase and hemicellulase cocktails, has been subjected to several rounds of classical mutagenesis with the aim to obtain higher production levels. During these random genetic events, strains unable to produce cellulases were generated. Here, whole genome sequencing and transcriptomic analyses of the cellulase-negative strain QM9978 were used for the identification of mutations underlying this cellulase-negative phenotype. Sequence comparison of the cellulase-negative strain QM9978 to the reference strain QM6a identified a total of 43 mutations, of which 33 were located either close to or in coding regions. From those, we identified 23 single-nucleotide variants, nine InDels, and one translocation. The translocation occurred between chromosomes V and VII, is located upstream of the putative transcription factor vib1, and abolishes its expression in QM9978 as detected during the transcriptomic analyses. Ectopic expression of vib1 under the control of its native promoter as well as overexpression of vib1 under the control of a strong constitutive promoter restored cellulase expression in QM9978, thus confirming that the translocation event is the reason for the cellulase-negative phenotype. Gene deletion of vib1 in the moderate producer strain QM9414 and in the high producer strain Rut-C30 reduced cellulase expression in both cases. Overexpression of vib1 in QM9414 and Rut-C30 had no effect on cellulase production, most likely because vib1 is already expressed at an optimal level under normal conditions. We were able to establish a link between a chromosomal translocation in QM9978 and the cellulase-negative phenotype of the strain. We identified the transcription factor vib1 as a key regulator of cellulases in T. reesei whose

  9. IMPACT OF GENETIC STRAIN ON BODY FAT LOSS, FOOD CONSUMPTION, METABOLISM, VENTILATION, AND MOTOR ACTIVITY IN FREE RUNNING FEMALE RATS

    Data.gov (United States)

    U.S. Environmental Protection Agency — Physiologic data for different strains of common laboratory rats. This dataset is associated with the following publication: Gordon, C., P....

  10. Next-Generation Sequence Analysis Reveals Transfer of Methicillin Resistance to a Methicillin-Susceptible Staphylococcus aureus Strain That Subsequently Caused a Methicillin-Resistant Staphylococcus aureus Outbreak: a Descriptive Study.

    Science.gov (United States)

    Weterings, Veronica; Bosch, Thijs; Witteveen, Sandra; Landman, Fabian; Schouls, Leo; Kluytmans, Jan

    2017-09-01

    Resistance to methicillin in Staphylococcus aureus is caused primarily by the mecA gene, which is carried on a mobile genetic element, the staphylococcal cassette chromosome mec (SCCmec). Horizontal transfer of this element is supposed to be an important factor in the emergence of new clones of methicillin-resistant Staphylococcus aureus (MRSA) but has been rarely observed in real time. In 2012, an outbreak occurred involving a health care worker (HCW) and three patients, all carrying a fusidic acid-resistant MRSA strain. The husband of the HCW was screened for MRSA carriage, but only a methicillin-susceptible S. aureus (MSSA) strain, which was also resistant to fusidic acid, was detected. Multiple-locus variable-number tandem-repeat analysis (MLVA) typing showed that both the MSSA and MRSA isolates were MT4053-MC0005. This finding led to the hypothesis that the MSSA strain acquired the SCCmec and subsequently caused an outbreak. To support this hypothesis, next-generation sequencing of the MSSA and MRSA isolates was performed. This study showed that the MSSA isolate clustered closely with the outbreak isolates based on whole-genome multilocus sequence typing and single-nucleotide polymorphism (SNP) analysis, with a genetic distance of 17 genes and 44 SNPs, respectively. Remarkably, there were relatively large differences in the mobile genetic elements in strains within and between individuals. The limited genetic distance between the MSSA and MRSA isolates in combination with a clear epidemiologic link supports the hypothesis that the MSSA isolate acquired an SCCmec and that the resulting MRSA strain caused an outbreak. Copyright © 2017 American Society for Microbiology.
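A genetic distance expressed as a number of SNPs can be illustrated with a minimal pairwise comparison of two aligned sequences. This is a sketch only: the abstract does not describe the study's actual SNP-calling pipeline, and the function name `snp_distance`, the toy sequences, and the choice to skip ambiguous or gap positions are all assumptions made here.

```python
def snp_distance(seq_a, seq_b, ignore="N-"):
    """Count differing positions between two aligned sequences.

    Positions where either sequence has an ambiguous base or a gap
    (characters listed in `ignore`) are skipped rather than counted.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for a, b in zip(seq_a, seq_b)
        if a != b and a not in ignore and b not in ignore
    )

# Toy aligned fragments (invented): two mismatches, one skipped N/- column.
isolate_1 = "ACGTACGTNA"
isolate_2 = "ACGAACCT-A"
distance = snp_distance(isolate_1, isolate_2)
print(distance)
```

Applied to whole-genome alignments, the same count gives the kind of pairwise SNP distance that is used to judge whether isolates plausibly belong to one transmission chain.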

  11. Development of a SPARK Training Dataset

    Energy Technology Data Exchange (ETDEWEB)

    Sayre, Amanda M. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Olson, Jarrod R. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States)

    2015-03-01

    In its first five years, the National Nuclear Security Administration’s (NNSA) Next Generation Safeguards Initiative (NGSI) sponsored more than 400 undergraduate, graduate, and post-doctoral students in internships and research positions (Wyse 2012). In the past seven years, the NGSI program has produced, and continues to produce, a large body of scientific, technical, and policy work in targeted core safeguards capabilities and human capital development activities. Not only does the NGSI program carry out activities across multiple disciplines, but also across all U.S. Department of Energy (DOE)/NNSA locations in the United States. However, products are not readily shared among disciplines and across locations, nor are they archived in a comprehensive library. Rather, knowledge of NGSI-produced literature is localized to the researchers, clients, and internal laboratory/facility publication systems such as the Electronic Records and Information Capture Architecture (ERICA) at the Pacific Northwest National Laboratory (PNNL). There is also no incorporated way of analyzing existing NGSI literature to determine whether the larger NGSI program is achieving its core safeguards capabilities and activities. A complete library of NGSI literature could prove beneficial to a cohesive, sustainable, and more economical NGSI program. The Safeguards Platform for Automated Retrieval of Knowledge (SPARK) has been developed to be a knowledge storage, retrieval, and analysis capability to capture safeguards knowledge to exist beyond the lifespan of NGSI. During the development process, it was necessary to build a SPARK training dataset (a corpus of documents) for initial entry into the system and for demonstration purposes. We manipulated these data to gain new information about the breadth of NGSI publications, and evaluated the science-policy interface at PNNL as a practical demonstration of SPARK’s intended analysis capability. The analysis demonstration sought to answer the

  12. Development of a SPARK Training Dataset

    International Nuclear Information System (INIS)

    Sayre, Amanda M.; Olson, Jarrod R.

    2015-01-01

    In its first five years, the National Nuclear Security Administration's (NNSA) Next Generation Safeguards Initiative (NGSI) sponsored more than 400 undergraduate, graduate, and post-doctoral students in internships and research positions (Wyse 2012). In the past seven years, the NGSI program has produced, and continues to produce, a large body of scientific, technical, and policy work in targeted core safeguards capabilities and human capital development activities. Not only does the NGSI program carry out activities across multiple disciplines, but also across all U.S. Department of Energy (DOE)/NNSA locations in the United States. However, products are not readily shared among disciplines and across locations, nor are they archived in a comprehensive library. Rather, knowledge of NGSI-produced literature is localized to the researchers, clients, and internal laboratory/facility publication systems such as the Electronic Records and Information Capture Architecture (ERICA) at the Pacific Northwest National Laboratory (PNNL). There is also no incorporated way of analyzing existing NGSI literature to determine whether the larger NGSI program is achieving its core safeguards capabilities and activities. A complete library of NGSI literature could prove beneficial to a cohesive, sustainable, and more economical NGSI program. The Safeguards Platform for Automated Retrieval of Knowledge (SPARK) has been developed to be a knowledge storage, retrieval, and analysis capability to capture safeguards knowledge to exist beyond the lifespan of NGSI. During the development process, it was necessary to build a SPARK training dataset (a corpus of documents) for initial entry into the system and for demonstration purposes. We manipulated these data to gain new information about the breadth of NGSI publications, and evaluated the science-policy interface at PNNL as a practical demonstration of SPARK's intended analysis capability. The analysis demonstration sought to answer

  13. Developing a Data-Set for Stereopsis

    Directory of Open Access Journals (Sweden)

    D.W Hunter

    2014-08-01

    Full Text Available Current research on binocular stereopsis in humans and non-human primates has been limited by a lack of available data-sets. Current data-sets fall into two categories: stereo-image sets with vergence but no ranging information (Hibbard, 2008, Vision Research, 48(12), 1427-1439) or combinations of depth information with binocular images and video taken from cameras in fixed fronto-parallel configurations exhibiting neither vergence nor focus effects (Hirschmuller & Scharstein, 2007, IEEE Conf. Computer Vision and Pattern Recognition). The techniques for generating depth information are also imperfect. Depth information is normally inaccurate or simply missing near edges and on partially occluded surfaces. For many areas of vision research these are the most interesting parts of the image (Goutcher, Hunter, Hibbard, 2013, i-Perception, 4(7), 484; Scarfe & Hibbard, 2013, Vision Research). Using state-of-the-art open-source ray-tracing software (PBRT) as a back-end, our intention is to release a set of tools that will allow researchers in this field to generate artificial binocular stereoscopic data-sets. Although not as realistic as photographs, computer generated images have significant advantages in terms of control over the final output, and ground-truth information about scene depth is easily calculated at all points in the scene, even partially occluded areas. While individual researchers have been developing similar stimuli by hand for many decades, we hope that our software will greatly reduce the time and difficulty of creating naturalistic binocular stimuli. Our intention in making this presentation is to elicit feedback from the vision community about what sort of features would be desirable in such software.

  14. Mapping the resistance-associated mobilome of a carbapenem-resistant Klebsiella pneumoniae strain reveals insights into factors shaping these regions and facilitates generation of a 'resistance-disarmed' model organism.

    Science.gov (United States)

    Bi, Dexi; Jiang, Xiaofei; Sheng, Zi-Ke; Ngmenterebo, David; Tai, Cui; Wang, Minggui; Deng, Zixin; Rajakumar, Kumar; Ou, Hong-Yu

    2015-10-01

    This study aims to investigate the landscape of the mobile genome, with a focus on antibiotic resistance-associated factors in carbapenem-resistant Klebsiella pneumoniae. The mobile genome of the completely sequenced K. pneumoniae HS11286 strain (an ST11, carbapenem-resistant, near-pan-resistant, clinical isolate) was annotated in fine detail. The identified mobile genetic elements were mapped to the genetic contexts of resistance genes. The blaKPC-2 gene and a 26 kb region containing 12 clustered antibiotic resistance genes and one biocide resistance gene were deleted, and the MICs were determined again to ensure that antibiotic resistance had been lost. HS11286 contains six plasmids, 49 ISs, nine transposons, two separate In2-related integron remnants, two integrative and conjugative elements (ICEs) and seven prophages. Sixteen plasmid-borne resistance genes were identified, 14 of which were found to be directly associated with Tn1721-, Tn3-, Tn5393-, In2-, ISCR2- and ISCR3-derived elements. IS26 appears to have actively moulded several of these genetic regions. The deletion of blaKPC-2, followed by the deletion of a 26 kb region containing 12 clustered antibiotic resistance genes, progressively decreased the spectrum and level of resistance exhibited by the resultant mutant strains. This study has reiterated the role of plasmids as bearers of the vast majority of resistance genes in this species and has provided valuable insights into the vital role played by ISs, transposons and integrons in shaping the resistance-coding regions in this important strain. The 'resistance-disarmed' K. pneumoniae ST11 strain generated in this study will offer a more benign and readily genetically modifiable model organism for future extensive functional studies. © The Author 2015. Published by Oxford University Press on behalf of the British Society for Antimicrobial Chemotherapy. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  15. Quality Controlling CMIP datasets at GFDL

    Science.gov (United States)

    Horowitz, L. W.; Radhakrishnan, A.; Balaji, V.; Adcroft, A.; Krasting, J. P.; Nikonov, S.; Mason, E. E.; Schweitzer, R.; Nadeau, D.

    2017-12-01

    As GFDL makes the switch from model development to production in light of the Coupled Model Intercomparison Project (CMIP), GFDL's efforts are shifted to testing and, more importantly, establishing guidelines and protocols for Quality Controlling and semi-automated data publishing. Every CMIP cycle introduces key challenges and the upcoming CMIP6 is no exception. The new CMIP experimental design comprises multiple MIPs facilitating research in different focus areas. This paradigm has implications not only for the groups that develop the models and conduct the runs, but also for the groups that monitor, analyze and quality control the datasets before data publishing, before their knowledge makes its way into reports like the IPCC (Intergovernmental Panel on Climate Change) Assessment Reports. In this talk, we discuss some of the paths taken at GFDL to quality control the CMIP-ready datasets including: Jupyter notebooks, PrePARE, LAMP (Linux, Apache, MySQL, PHP/Python/Perl): technology-driven tracker system to monitor the status of experiments qualitatively and quantitatively, provide additional metadata and analysis services along with some in-built controlled-vocabulary validations in the workflow. In addition to this, we also discuss the integration of community-based model evaluation software (ESMValTool, PCMDI Metrics Package, and ILAMB) as part of our CMIP6 workflow.

  16. Integrated remotely sensed datasets for disaster management

    Science.gov (United States)

    McCarthy, Timothy; Farrell, Ronan; Curtis, Andrew; Fotheringham, A. Stewart

    2008-10-01

    Video imagery can be acquired from aerial, terrestrial and marine based platforms and has been exploited for a range of remote sensing applications over the past two decades. Examples include coastal surveys using aerial video, route corridor infrastructure surveys using vehicle mounted video cameras, aerial surveys over forestry and agriculture, underwater habitat mapping and disaster management. Many of these video systems are based on interlaced television standards such as North America's NTSC and European SECAM and PAL television systems that are then recorded using various video formats. This technology has recently been employed as a front-line remote sensing technology for damage assessment post-disaster. This paper traces the development of spatial video as a remote sensing tool from the early 1980s to the present day. The background to a new spatial-video research initiative based at National University of Ireland, Maynooth, (NUIM) is described. New improvements are proposed and include: low-cost encoders, easy to use software decoders, timing issues and interoperability. These developments will enable specialists and non-specialists to collect, process and integrate these datasets with minimal support. This integrated approach will enable decision makers to access relevant remotely sensed datasets quickly and so carry out rapid damage assessment during and post-disaster.

  17. ClimateNet: A Machine Learning dataset for Climate Science Research

    Science.gov (United States)

    Prabhat, M.; Biard, J.; Ganguly, S.; Ames, S.; Kashinath, K.; Kim, S. K.; Kahou, S.; Maharaj, T.; Beckham, C.; O'Brien, T. A.; Wehner, M. F.; Williams, D. N.; Kunkel, K.; Collins, W. D.

    2017-12-01

    Deep Learning techniques have revolutionized commercial applications in computer vision, speech recognition and control systems. The key for all of these developments was the creation of a curated, labeled dataset, ImageNet, enabling multiple research groups around the world to develop methods, benchmark performance and compete with each other. The success of Deep Learning can be largely attributed to the broad availability of this dataset. Our empirical investigations have revealed that Deep Learning is similarly poised to benefit the task of pattern detection in climate science. Unfortunately, labeled datasets, a key pre-requisite for training, are hard to find. Individual research groups are typically interested in specialized weather patterns, making it hard to unify and share datasets across groups and institutions. In this work, we are proposing ClimateNet: a labeled dataset that provides labeled instances of extreme weather patterns, as well as associated raw fields in model and observational output. We develop a schema in NetCDF to enumerate weather pattern classes/types, store bounding boxes, and pixel-masks. We are also working on a TensorFlow implementation to natively import such NetCDF datasets, and are providing a reference convolutional architecture for binary classification tasks. Our hope is that researchers in Climate Science, as well as ML/DL, will be able to use (and extend) ClimateNet to make rapid progress in the application of Deep Learning for Climate Science research.

  18. Strontium removal jar test dataset for all figures and tables.

    Data.gov (United States)

    U.S. Environmental Protection Agency — The datasets were used to generate data to demonstrate strontium removal under various water quality and treatment conditions. This dataset is associated with the...

  19. Predicting dataset popularity for the CMS experiment

    CERN Document Server

    INSPIRE-00005122; Li, Ting; Giommi, Luca; Bonacorsi, Daniele; Wildish, Tony

    2016-01-01

    The CMS experiment at the LHC accelerator at CERN relies on its computing infrastructure to stay at the frontier of High Energy Physics, searching for new phenomena and making discoveries. Even though computing plays a significant role in physics analysis, we rarely use its data to predict the system behavior itself. Basic information about computing resources, user activities and site utilization can be really useful for improving the throughput of the system and its management. In this paper, we discuss a first CMS analysis of dataset popularity based on CMS meta-data which can be used as a model for dynamic data placement and provide the foundation of a data-driven approach for the CMS computing infrastructure.

  20. Predicting dataset popularity for the CMS experiment

    International Nuclear Information System (INIS)

    Kuznetsov, V.; Li, T.; Giommi, L.; Bonacorsi, D.; Wildish, T.

    2016-01-01

    The CMS experiment at the LHC accelerator at CERN relies on its computing infrastructure to stay at the frontier of High Energy Physics, searching for new phenomena and making discoveries. Even though computing plays a significant role in physics analysis, we rarely use its data to predict the system behavior itself. Basic information about computing resources, user activities and site utilization can be really useful for improving the throughput of the system and its management. In this paper, we discuss a first CMS analysis of dataset popularity based on CMS meta-data which can be used as a model for dynamic data placement and provide the foundation of a data-driven approach for the CMS computing infrastructure.

  1. Internationally coordinated glacier monitoring: strategy and datasets

    Science.gov (United States)

    Hoelzle, Martin; Armstrong, Richard; Fetterer, Florence; Gärtner-Roer, Isabelle; Haeberli, Wilfried; Kääb, Andreas; Kargel, Jeff; Nussbaumer, Samuel; Paul, Frank; Raup, Bruce; Zemp, Michael

    2014-05-01

    (c) the Randolph Glacier Inventory (RGI), a new and globally complete digital dataset of outlines from about 180,000 glaciers with some meta-information, which has been used for many applications relating to the IPCC AR5 report. Concerning glacier changes, a database (Fluctuations of Glaciers) exists containing information about mass balance, front variations including past reconstructed time series, geodetic changes and special events. Annual mass balance reporting contains information for about 125 glaciers with a subset of 37 glaciers with continuous observational series since 1980 or earlier. Front variation observations of around 1800 glaciers are available from most of the mountain ranges world-wide. This database was recently updated with 26 glaciers having an unprecedented dataset of length changes from reconstructions of well-dated historical evidence going back as far as the 16th century. Geodetic observations of about 430 glaciers are available. The database is completed by a dataset containing information on special events including glacier surges, glacier lake outbursts, ice avalanches, eruptions of ice-clad volcanoes, etc. related to about 200 glaciers. A special database of glacier photographs contains 13,000 pictures from around 500 glaciers, some of them dating back to the 19th century. A key challenge is to combine and extend the traditional observations with fast evolving datasets from new technologies.

  2. MIPS bacterial genomes functional annotation benchmark dataset.

    Science.gov (United States)

    Tetko, Igor V; Brauner, Barbara; Dunger-Kaltenbach, Irmtraud; Frishman, Goar; Montrone, Corinna; Fobo, Gisela; Ruepp, Andreas; Antonov, Alexey V; Surmeli, Dimitrij; Mewes, Hans-Wernen

    2005-05-15

    Any development of new methods for automatic functional annotation of proteins according to their sequences requires high-quality data (as benchmark) as well as tedious preparatory work to generate sequence parameters required as input data for the machine learning methods. Different program settings and incompatible protocols make a comparison of the analyzed methods difficult. The MIPS Bacterial Functional Annotation Benchmark dataset (MIPS-BFAB) is a new, high-quality resource comprising four bacterial genomes manually annotated according to the MIPS functional catalogue (FunCat). These resources include precalculated sequence parameters, such as sequence similarity scores, InterPro domain composition and other parameters that could be used to develop and benchmark methods for functional annotation of bacterial protein sequences. These data are provided in XML format and can be used by scientists who are not necessarily experts in genome annotation. BFAB is available at http://mips.gsf.de/proj/bfab

  3. 2006 Fynmeet sea clutter measurement trial: Datasets

    CSIR Research Space (South Africa)

    Herselman, PLR

    2007-09-06

    Full Text Available [Figure residue: Datasets CAD14-001 and CAD14-002, RCS [dBm2] vs. time and range for f1 = 9.000 GHz; axes: Time [s], Range Gate #, Absolute Range [m]]...

  4. Differential evolution of a CXCR4-using HIV-1 strain in CCR5wt/wt and CCR5∆32/∆32 hosts revealed by longitudinal deep sequencing and phylogenetic reconstruction.

    Science.gov (United States)

    Le, Anh Q; Taylor, Jeremy; Dong, Winnie; McCloskey, Rosemary; Woods, Conan; Danroth, Ryan; Hayashi, Kanna; Milloy, M-J; Poon, Art F Y; Brumme, Zabrina L

    2015-12-03

    Rare individuals homozygous for a naturally-occurring 32 base pair deletion in the CCR5 gene (CCR5∆32/∆32) are resistant to infection by CCR5-using ("R5") HIV-1 strains but remain susceptible to less common CXCR4-using ("X4") strains. The evolutionary dynamics of X4 infections, however, remain incompletely understood. We identified two individuals, one CCR5wt/wt and one CCR5∆32/∆32, within the Vancouver Injection Drug Users Study who were infected with a genetically similar X4 HIV-1 strain. While early-stage plasma viral loads were comparable in the two individuals (~4.5-5 log10 HIV-1 RNA copies/ml), CD4 counts in the CCR5wt/wt individual reached a nadir of [...] 250 cells/mm³ in the CCR5∆32/∆32 individual. Ancestral phylogenetic reconstructions using longitudinal envelope-V3 deep sequences suggested that both individuals were infected by a single transmitted/founder (T/F) X4 virus that differed at only one V3 site (codon 24). While substantial within-host HIV-1 V3 diversification was observed in plasma and PBMC in both individuals, the CCR5wt/wt individual's HIV-1 population gradually reverted from 100% X4 to ~60% R5 over ~4 years whereas the CCR5∆32/∆32 individual's remained consistently X4. Our observations illuminate early dynamics of X4 HIV-1 infections and underscore the influence of CCR5 genotype on HIV-1 V3 evolution.

  5. A new bed elevation dataset for Greenland

    Directory of Open Access Journals (Sweden)

    J. L. Bamber

    2013-03-01

    We present a new bed elevation dataset for Greenland derived from a combination of multiple airborne ice thickness surveys undertaken between the 1970s and 2012. Around 420 000 line kilometres of airborne data were used, with roughly 70% of this having been collected since the year 2000, when the last comprehensive compilation was undertaken. The airborne data were combined with satellite-derived elevations for non-glaciated terrain to produce a consistent bed digital elevation model (DEM) over the entire island including across the glaciated–ice free boundary. The DEM was extended to the continental margin with the aid of bathymetric data, primarily from a compilation for the Arctic. Ice thickness was determined where an ice shelf exists from a combination of surface elevation and radar soundings. The across-track spacing between flight lines warranted interpolation at 1 km postings for significant sectors of the ice sheet. Grids of ice surface elevation, error estimates for the DEM, ice thickness and data sampling density were also produced alongside a mask of land/ocean/grounded ice/floating ice. Errors in bed elevation range from a minimum of ±10 m to about ±300 m, as a function of distance from an observation and local topographic variability. A comparison with the compilation published in 2001 highlights the improvement in resolution afforded by the new datasets, particularly along the ice sheet margin, where ice velocity is highest and changes in ice dynamics most marked. We estimate that the volume of ice included in our land-ice mask would raise mean sea level by 7.36 m, excluding any solid earth effects that would take place during ice sheet decay.

  6. Sequence similarity between the erythrocyte binding domain of the Plasmodium vivax Duffy binding protein and the V3 loop of HIV-1 strain MN reveals a functional heparin binding motif involved in binding to the Duffy antigen receptor for chemokines

    OpenAIRE

    Bolton, Michael J; Garry, Robert F

    2011-01-01

    Background: The HIV surface glycoprotein gp120 (SU, gp120) and the Plasmodium vivax Duffy binding protein (PvDBP) bind to chemokine receptors during infection and have a site of amino acid sequence similarity in their binding domains that often includes a heparin binding motif (HBM). Infection by either pathogen has been found to be inhibited by polyanions. Results: Specific polyanions that inhibit HIV infection and bind to the V3 loop of X4 strains also inhibited DBP-mediated infectio...

  7. Wind Integration National Dataset Toolkit | Grid Modernization | NREL

    Science.gov (United States)

    The Wind Integration National Dataset (WIND) Toolkit is an update and expansion of the Eastern Wind Integration Data Set and Western Wind Integration Data Set. It supports the next generation of wind integration studies. WIND

  8. Solar Integration National Dataset Toolkit | Grid Modernization | NREL

    Science.gov (United States)

    NREL is working on a Solar Integration National Dataset (SIND) Toolkit to enable researchers to perform U.S. regional solar generation integration studies. It will provide modeled, coherent subhourly solar power data

  9. Technical note: An inorganic water chemistry dataset (1972–2011 ...

    African Journals Online (AJOL)

    A national dataset of inorganic chemical data of surface waters (rivers, lakes, and dams) in South Africa is presented and made freely available. The dataset comprises more than 500 000 complete water analyses from 1972 up to 2011, collected from more than 2 000 sample monitoring stations in South Africa. The dataset ...

  10. QSAR ligand dataset for modelling mutagenicity, genotoxicity, and rodent carcinogenicity

    Directory of Open Access Journals (Sweden)

    Davy Guan

    2018-04-01

    Five datasets were constructed from ligand and bioassay result data from the literature. These datasets include bioassay results from the Ames mutagenicity assay, Greenscreen GADD-45a-GFP assay, Syrian Hamster Embryo (SHE) assay, and 2 year rat carcinogenicity assay results. These datasets provide information about chemical mutagenicity, genotoxicity and carcinogenicity.

  11. Analysis of Public Datasets for Wearable Fall Detection Systems.

    Science.gov (United States)

    Casilari, Eduardo; Santoyo-Ramón, José-Antonio; Cano-García, José-Manuel

    2017-06-27

    Due to the boom of wireless handheld devices such as smartwatches and smartphones, wearable Fall Detection Systems (FDSs) have become a major focus of attention among the research community during the last years. The effectiveness of a wearable FDS must be contrasted against a wide variety of measurements obtained from inertial sensors during the occurrence of falls and Activities of Daily Living (ADLs). In this regard, the access to public databases constitutes the basis for an open and systematic assessment of fall detection techniques. This paper reviews and appraises twelve existing available data repositories containing measurements of ADLs and emulated falls envisaged for the evaluation of fall detection algorithms in wearable FDSs. The analysis of the found datasets is performed in a comprehensive way, taking into account the multiple factors involved in the definition of the testbeds deployed for the generation of the mobility samples. The study of the traces brings to light the lack of a common experimental benchmarking procedure and, consequently, the large heterogeneity of the datasets from a number of perspectives (length and number of samples, typology of the emulated falls and ADLs, characteristics of the test subjects, features and positions of the sensors, etc.). Concerning this, the statistical analysis of the samples reveals the impact of the sensor range on the reliability of the traces. In addition, the study evidences the importance of the selection of the ADLs and the need of categorizing the ADLs depending on the intensity of the movements in order to evaluate the capability of a certain detection algorithm to discriminate falls from ADLs.

  12. Analysis of Public Datasets for Wearable Fall Detection Systems

    Directory of Open Access Journals (Sweden)

    Eduardo Casilari

    2017-06-01

    Due to the boom of wireless handheld devices such as smartwatches and smartphones, wearable Fall Detection Systems (FDSs) have become a major focus of attention among the research community during the last years. The effectiveness of a wearable FDS must be contrasted against a wide variety of measurements obtained from inertial sensors during the occurrence of falls and Activities of Daily Living (ADLs). In this regard, the access to public databases constitutes the basis for an open and systematic assessment of fall detection techniques. This paper reviews and appraises twelve existing available data repositories containing measurements of ADLs and emulated falls envisaged for the evaluation of fall detection algorithms in wearable FDSs. The analysis of the found datasets is performed in a comprehensive way, taking into account the multiple factors involved in the definition of the testbeds deployed for the generation of the mobility samples. The study of the traces brings to light the lack of a common experimental benchmarking procedure and, consequently, the large heterogeneity of the datasets from a number of perspectives (length and number of samples, typology of the emulated falls and ADLs, characteristics of the test subjects, features and positions of the sensors, etc.). Concerning this, the statistical analysis of the samples reveals the impact of the sensor range on the reliability of the traces. In addition, the study evidences the importance of the selection of the ADLs and the need of categorizing the ADLs depending on the intensity of the movements in order to evaluate the capability of a certain detection algorithm to discriminate falls from ADLs.

  13. Statistical segmentation of multidimensional brain datasets

    Science.gov (United States)

    Desco, Manuel; Gispert, Juan D.; Reig, Santiago; Santos, Andres; Pascau, Javier; Malpica, Norberto; Garcia-Barreno, Pedro

    2001-07-01

    This paper presents an automatic segmentation procedure for MRI neuroimages that overcomes part of the problems involved in multidimensional clustering techniques like partial volume effects (PVE), processing speed and difficulty of incorporating a priori knowledge. The method is a three-stage procedure: 1) Exclusion of background and skull voxels using threshold-based region growing techniques with fully automated seed selection. 2) Expectation Maximization algorithms are used to estimate the probability density function (PDF) of the remaining pixels, which are assumed to be mixtures of gaussians. These pixels can then be classified into cerebrospinal fluid (CSF), white matter and grey matter. Using this procedure, our method takes advantage of using the full covariance matrix (instead of the diagonal) for the joint PDF estimation. On the other hand, logistic discrimination techniques are more robust against violation of multi-gaussian assumptions. 3) A priori knowledge is added using Markov Random Field techniques. The algorithm has been tested with a dataset of 30 brain MRI studies (co-registered T1 and T2 MRI). Our method was compared with clustering techniques and with template-based statistical segmentation, using manual segmentation as a gold-standard. Our results were more robust and closer to the gold-standard.
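
    The Expectation Maximization step in stage 2 can be illustrated with a deliberately simplified sketch. It assumes a single intensity channel and scalar variances, whereas the abstract describes a joint PDF with a full covariance matrix over co-registered T1 and T2 images. The function names, the starting means and the toy intensities are hypothetical and are not taken from the paper.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_gmm_1d(data, initial_means, n_iter=50):
    """Fit a 1-D Gaussian mixture by EM (simplified: one channel, scalar variances)."""
    k = len(initial_means)
    means = list(initial_means)
    variances = [1.0] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in data:
            likes = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(likes)
            resp.append([like / total for like in likes])
        # M-step: re-estimate weights, means and variances from the responsibilities.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, 1e-6)  # floor to avoid a degenerate component
    return weights, means, variances


# Hypothetical intensities drawn from two well-separated tissue classes.
intensities = [9.8, 10.1, 10.3, 9.9, 10.0, 19.7, 20.2, 20.0, 20.4, 19.9]
weights, means, variances = em_gmm_1d(intensities, initial_means=[8.0, 22.0])
```

    Each sample can then be assigned to the component with the highest responsibility, which corresponds to the classification into tissue classes that the abstract mentions.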

  14. ASSESSING SMALL SAMPLE WAR-GAMING DATASETS

    Directory of Open Access Journals (Sweden)

    W. J. HURLEY

    2013-10-01

    One of the fundamental problems faced by military planners is the assessment of changes to force structure. An example is whether to replace an existing capability with an enhanced system. This can be done directly with a comparison of measures such as accuracy, lethality, survivability, etc. However, this approach does not allow an assessment of the force multiplier effects of the proposed change. To gauge these effects, planners often turn to war-gaming. For many war-gaming experiments, it is expensive, both in terms of time and dollars, to generate a large number of sample observations. This puts a premium on the statistical methodology used to examine these small datasets. In this paper we compare the power of three tests to assess population differences: the Wald-Wolfowitz test, the Mann-Whitney U test, and re-sampling. We employ a series of Monte Carlo simulation experiments. Not unexpectedly, we find that the Mann-Whitney test performs better than the Wald-Wolfowitz test. Resampling is judged to perform slightly better than the Mann-Whitney test.
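
    Two of the tests named above can be sketched for a small two-sample comparison. This is a minimal illustration, not the paper's procedure: the abstract does not specify the resampling scheme, so a permutation test on the difference in means is assumed here, and the function names and the sample values are hypothetical.

```python
import random
from itertools import product


def mann_whitney_u(x, y):
    """U statistic for x: count of pairs with x_i > y_j, ties counted as 0.5."""
    u = 0.0
    for xi, yj in product(x, y):
        if xi > yj:
            u += 1.0
        elif xi == yj:
            u += 0.5
    return u


def permutation_p_value(x, y, n_resamples=10000, seed=0):
    """Two-sided Monte Carlo permutation test on the absolute difference in means."""
    rng = random.Random(seed)
    observed = abs(sum(x) / len(x) - sum(y) / len(y))
    pooled = list(x) + list(y)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        px, py = pooled[:len(x)], pooled[len(x):]
        diff = abs(sum(px) / len(px) - sum(py) / len(py))
        if diff >= observed - 1e-12:  # small tolerance for floating-point ties
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_resamples + 1)


# Hypothetical outcomes of five war-game runs per force structure.
baseline = [12, 15, 11, 14, 13]
enhanced = [18, 17, 21, 16, 19]
u_stat = mann_whitney_u(enhanced, baseline)
p_value = permutation_p_value(enhanced, baseline)
```

    With only five runs per side, both statistics can be computed almost instantly, which is why tests of this kind suit experiments where each additional observation is expensive.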

  15. Correction of elevation offsets in multiple co-located lidar datasets

    Science.gov (United States)

    Thompson, David M.; Dalyander, P. Soupy; Long, Joseph W.; Plant, Nathaniel G.

    2017-04-07

    Introduction: Topographic elevation data collected with airborne light detection and ranging (lidar) can be used to analyze short- and long-term changes to beach and dune systems. Analysis of multiple lidar datasets at Dauphin Island, Alabama, revealed systematic, island-wide elevation differences on the order of tens of centimeters (cm) that were not attributable to real-world change and, therefore, were likely to represent systematic sampling offsets. These offsets vary between the datasets, but appear spatially consistent within a given survey. This report describes a method that was developed to identify and correct offsets between lidar datasets collected over the same site at different times so that true elevation changes over time, associated with sediment accumulation or erosion, can be analyzed.
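
    The abstract does not describe the correction method itself, so the following is only a sketch of one plausible approach. It assumes that each survey carries a single constant vertical offset (in line with the statement that offsets appear spatially consistent within a survey) and that some grid cells are known to be stable between surveys. The function names, the stable-cell mask and the elevations are hypothetical.

```python
from statistics import median


def estimate_offset(reference, survey, stable_mask):
    """Median elevation difference (survey - reference) over cells assumed stable."""
    diffs = [s - r for r, s, ok in zip(reference, survey, stable_mask) if ok]
    if not diffs:
        raise ValueError("at least one stable cell is required")
    return median(diffs)


def correct_survey(survey, offset):
    """Remove a constant vertical offset from every cell of a survey."""
    return [z - offset for z in survey]


# Hypothetical co-located elevations in metres; cell 3 has real change (deposition).
reference = [1.00, 1.50, 2.00, 2.50, 3.00]
survey = [1.21, 1.69, 2.20, 3.10, 3.20]
stable = [True, True, True, False, True]

offset = estimate_offset(reference, survey, stable)
corrected = correct_survey(survey, offset)
```

    The median is used instead of the mean so that a few mislabelled cells with real change have limited influence on the estimated offset.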

  16. fCCAC: functional canonical correlation analysis to evaluate covariance between nucleic acid sequencing datasets.

    Science.gov (United States)

    Madrigal, Pedro

    2017-03-01

    Computational evaluation of variability across DNA or RNA sequencing datasets is a crucial step in genomic science, as it allows researchers both to evaluate the reproducibility of biological or technical replicates and to compare different datasets to identify their potential correlations. Here we present fCCAC, an application of functional canonical correlation analysis to assess covariance of nucleic acid sequencing datasets such as chromatin immunoprecipitation followed by deep sequencing (ChIP-seq). We show how this method differs from other measures of correlation, and exemplify how it can reveal shared covariance between histone modifications and DNA binding proteins, such as the relationship between the H3K4me3 chromatin mark and its epigenetic writers and readers. An R/Bioconductor package is available at http://bioconductor.org/packages/fCCAC/. Contact: pmb59@cam.ac.uk. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.

  17. Anonymising the Sparse Dataset: A New Privacy Preservation Approach while Predicting Diseases

    Directory of Open Access Journals (Sweden)

    V. Shyamala Susan

    2016-09-01

    Data mining techniques analyze medical datasets with the intention of enhancing patients' health and privacy. Most existing techniques are suited only to low-dimensional medical datasets. The proposed methodology designs a model for representing sparse, high-dimensional medical datasets with the aim of protecting patients' privacy from an adversary and, additionally, predicting the threat degree of a disease. In a sparse dataset, many non-zero values are randomly spread over the entire data space. Hence, the challenge is to cluster correlated patient records so as to predict the risk degree of a disease before it occurs in patients while preserving privacy. The first phase converts the sparse dataset into a band matrix using a Genetic algorithm combined with Cuckoo Search (GCS). This groups correlated patient records together and arranges them close to the diagonal. The next phase dissociates the patient's disease, which is a sensitive value (SA), from the parameters that determine the disease, normally the Quasi Identifier (QI). Finally, a density-based clustering technique is applied to the underlying data to create anonymized groups that maintain privacy and to predict the risk level of the disease. Empirical assessments on actual health care data, corresponding to the V.A. Medical Centre heart disease dataset, demonstrate the efficiency of this model with respect to information loss, utility and privacy.

  18. Mobilomics in Saccharomyces cerevisiae strains.

    Science.gov (United States)

    Menconi, Giulia; Battaglia, Giovanni; Grossi, Roberto; Pisanti, Nadia; Marangoni, Roberto

    2013-03-20

    Mobile Genetic Elements (MGEs) are selfish DNA integrated in the genomes. Their detection is mainly based on consensus-like searches by scanning the investigated genome against the sequence of an already identified MGE. Mobilomics aims at discovering all the MGEs in a genome and understanding their dynamic behavior: The data for this kind of investigation can be provided by comparative genomics of closely related organisms. The amount of data thus involved requires a strong computational effort, which should be alleviated. Our approach proposes to exploit the high similarity among homologous chromosomes of different strains of the same species, following a progressive comparative genomics philosophy. We introduce a software tool based on our new fast algorithm, called regender, which is able to identify the conserved regions between chromosomes. Our case study is represented by a unique recently available dataset of 39 different strains of S. cerevisiae, which regender is able to compare in a few minutes. By exploring the non-conserved regions, where MGEs are mainly retrotransposons called Tys, and marking the candidate Tys based on their length, we are able to locate a priori and automatically all the already known Tys and map all the putative Tys in all the strains. The remaining putative mobile elements (PMEs) emerging from this intra-specific comparison are sharp markers of inter-specific evolution: indeed, many events of non-conservation among different yeast strains correspond to PMEs. A clustering based on the presence/absence of the candidate Tys in the strains suggests an evolutionary interconnection that is very similar to classic phylogenetic trees based on SNPs analysis, even though it is computed without using phylogenetic information. The case study indicates that the proposed methodology brings two major advantages: (a) it does not require any template sequence for the wanted MGEs and (b) it can be applied to infer MGEs also for low coverage genomes.

  19. Mobilomics in Saccharomyces cerevisiae strains

    Science.gov (United States)

    2013-01-01

    Background: Mobile Genetic Elements (MGEs) are selfish DNA integrated in the genomes. Their detection is mainly based on consensus-like searches by scanning the investigated genome against the sequence of an already identified MGE. Mobilomics aims at discovering all the MGEs in a genome and understanding their dynamic behavior: The data for this kind of investigation can be provided by comparative genomics of closely related organisms. The amount of data thus involved requires a strong computational effort, which should be alleviated. Results: Our approach proposes to exploit the high similarity among homologous chromosomes of different strains of the same species, following a progressive comparative genomics philosophy. We introduce a software tool based on our new fast algorithm, called regender, which is able to identify the conserved regions between chromosomes. Our case study is represented by a unique recently available dataset of 39 different strains of S. cerevisiae, which regender is able to compare in a few minutes. By exploring the non-conserved regions, where MGEs are mainly retrotransposons called Tys, and marking the candidate Tys based on their length, we are able to locate a priori and automatically all the already known Tys and map all the putative Tys in all the strains. The remaining putative mobile elements (PMEs) emerging from this intra-specific comparison are sharp markers of inter-specific evolution: indeed, many events of non-conservation among different yeast strains correspond to PMEs. A clustering based on the presence/absence of the candidate Tys in the strains suggests an evolutionary interconnection that is very similar to classic phylogenetic trees based on SNPs analysis, even though it is computed without using phylogenetic information. Conclusions: The case study indicates that the proposed methodology brings two major advantages: (a) it does not require any template sequence for the wanted MGEs and (b) it can be applied to

  20. The Dataset of Countries at Risk of Electoral Violence

    OpenAIRE

    Birch, Sarah; Muchlinski, David

    2017-01-01

    Electoral violence is increasingly affecting elections around the world, yet researchers have been limited by a paucity of granular data on this phenomenon. This paper introduces and describes a new dataset of electoral violence – the Dataset of Countries at Risk of Electoral Violence (CREV) – that provides measures of 10 different types of electoral violence across 642 elections held around the globe between 1995 and 2013. The paper provides a detailed account of how and why the dataset was ...

  1. Norwegian Hydrological Reference Dataset for Climate Change Studies

    Energy Technology Data Exchange (ETDEWEB)

    Magnussen, Inger Helene; Killingland, Magnus; Spilde, Dag

    2012-07-01

    Based on the Norwegian hydrological measurement network, NVE has selected a Hydrological Reference Dataset for studies of hydrological change. The dataset meets international standards with high data quality. It is suitable for monitoring and studying the effects of climate change on the hydrosphere and cryosphere in Norway. The dataset includes streamflow, groundwater, snow, glacier mass balance and length change, lake ice and water temperature in rivers and lakes. (Author)

  2. Sequence similarity between the erythrocyte binding domain of the Plasmodium vivax Duffy binding protein and the V3 loop of HIV-1 strain MN reveals a functional heparin binding motif involved in binding to the Duffy antigen receptor for chemokines

    Directory of Open Access Journals (Sweden)

    Bolton Michael J

    2011-11-01

    Background: The HIV surface glycoprotein gp120 (SU, gp120) and the Plasmodium vivax Duffy binding protein (PvDBP) bind to chemokine receptors during infection and have a site of amino acid sequence similarity in their binding domains that often includes a heparin binding motif (HBM). Infection by either pathogen has been found to be inhibited by polyanions. Results: Specific polyanions that inhibit HIV infection and bind to the V3 loop of X4 strains also inhibited DBP-mediated infection of erythrocytes and DBP binding to the Duffy Antigen Receptor for Chemokines (DARC). A peptide including the HBM of PvDBP had similar affinity for heparin as RANTES and V3 loop peptides, and could be specifically inhibited from heparin binding by the same polyanions that inhibit DBP binding to DARC. However, some V3 peptides can competitively inhibit RANTES binding to heparin, but not the PvDBP HBM peptide. Three other members of the DBP family have an HBM sequence that is necessary for erythrocyte binding, however only the protein which binds to DARC, the P. knowlesi alpha protein, is inhibited by heparin from binding to erythrocytes. Heparitinase digestion does not affect the binding of DBP to erythrocytes. Conclusion: The HBMs of DBPs that bind to DARC have similar heparin binding affinities as some V3 loop peptides and chemokines, are responsible for specific sulfated polysaccharide inhibition of parasite binding and invasion of red blood cells, and are more likely to bind to negative charges on the receptor than cell surface glycosaminoglycans.

  3. Public Availability to ECS Collected Datasets

    Science.gov (United States)

    Henderson, J. F.; Warnken, R.; McLean, S. J.; Lim, E.; Varner, J. D.

    2013-12-01

    Coastal nations have spent considerable resources exploring the limits of their extended continental shelf (ECS) beyond 200 nautical miles (nm). Although these studies are funded to fulfill requirements of the UN Convention on the Law of the Sea, the investments are producing new data sets in frontier areas of Earth's oceans that will be used to understand, explore, and manage the seafloor and sub-seafloor for decades to come. Although many of these datasets are considered proprietary until a nation's potential ECS has become 'final and binding', an increasing amount of data is being released and utilized by the public. Data sets include multibeam, seismic reflection/refraction, bottom sampling, and geophysical data. The U.S. ECS Project, a multi-agency collaboration whose mission is to establish the full extent of the continental shelf of the United States consistent with international law, relies heavily on data and accurate, standard metadata. The United States has made it a priority to make available to the public all data collected with ECS funding as quickly as possible. The National Oceanic and Atmospheric Administration's (NOAA) National Geophysical Data Center (NGDC) supports this objective by partnering with academia and other federal government mapping agencies to archive, inventory, and deliver marine mapping data in a coordinated, consistent manner. This includes ensuring quality, standard metadata and developing and maintaining data delivery capabilities built on modern digital data archives. Other countries, such as Ireland, have submitted their ECS data for public availability and many others have made pledges to participate in the future. The data services provided by NGDC support the U.S. ECS effort as well as many developing nations' ECS efforts through the U.N. Environmental Program. Modern discovery, visualization, and delivery of scientific data and derived products that span national and international sources of data ensure the greatest re-use of data and

  4. Multiple Genome Sequences of Lactobacillus plantarum Strains

    OpenAIRE

    Kafka, Thomas A.; Geissler, Andreas J.; Vogel, Rudi F.

    2017-01-01

    We report here the genome sequences of four Lactobacillus plantarum strains which vary in surface hydrophobicity. Bioinformatic analysis, using additional genomes of Lactobacillus plantarum strains, revealed a possible correlation between the cell wall teichoic acid type and cell surface hydrophobicity and provides the basis for subsequent analyses.

  5. BIA Indian Lands Dataset (Indian Lands of the United States)

    Data.gov (United States)

    Federal Geographic Data Committee — The American Indian Reservations / Federally Recognized Tribal Entities dataset depicts feature location, selected demographics and other associated data for the 561...

  6. Framework for Interactive Parallel Dataset Analysis on the Grid

    Energy Technology Data Exchange (ETDEWEB)

    Alexander, David A.; Ananthan, Balamurali; /Tech-X Corp.; Johnson, Tony; Serbo, Victor; /SLAC

    2007-01-10

    We present a framework for use at a typical Grid site to facilitate custom interactive parallel dataset analysis targeting terabyte-scale datasets of the type typically produced by large multi-institutional science experiments. We summarize the needs for interactive analysis and show a prototype solution that satisfies those needs. The solution consists of a desktop client tool and a set of Web Services that allow scientists to sign onto a Grid site, compose analysis script code to carry out physics analysis on datasets, distribute the code and datasets to worker nodes, collect the results back to the client, and construct professional-quality visualizations of the results.

  7. Socioeconomic Data and Applications Center (SEDAC) Treaty Status Dataset

    Data.gov (United States)

    National Aeronautics and Space Administration — The Socioeconomic Data and Application Center (SEDAC) Treaty Status Dataset contains comprehensive treaty information for multilateral environmental agreements,...

  8. Mathematics revealed

    CERN Document Server

    Berman, Elizabeth

    1979-01-01

    Mathematics Revealed focuses on the principles, processes, operations, and exercises in mathematics. The book first offers information on whole numbers, fractions, and decimals and percents. Discussions focus on measuring length, percent, decimals, numbers as products, addition and subtraction of fractions, mixed numbers and ratios, division of fractions, addition, subtraction, multiplication, and division. The text then examines positive and negative numbers and powers and computation. Topics include division and averages, multiplication, ratios, and measurements, scientific notation and estim

  9. Structural dataset for the PPARγ V290M mutant

    Directory of Open Access Journals (Sweden)

    Ana C. Puhl

    2016-06-01

    The loss-of-function mutation V290M in the ligand-binding domain (LBD) of the peroxisome proliferator activated receptor γ (PPARγ) is associated with a ligand resistance syndrome (PLRS), characterized by partial lipodystrophy and severe insulin resistance. In this data article we discuss an X-ray diffraction dataset that yielded the structure of the PPARγ LBD V290M mutant refined at 2.3 Å resolution, which allowed a 3D model of the receptor mutant to be built with high confidence and revealed continuous, well-defined electron density for the partial agonist diclofenac bound to the hydrophobic pocket of PPARγ. These structural data provide significant insights into the molecular basis of PLRS caused by the V290M mutation and are correlated with the receptor's inability to bind rosiglitazone and its increased affinity for corepressors. Furthermore, our structural evidence helps to explain clinical observations that point to a failure to restore receptor function by treatment with a full agonist of PPARγ, rosiglitazone.

  10. Highlighting nonlinear patterns in population genetics datasets

    KAUST Repository

    Alanis Lobato, Gregorio; Cannistraci, Carlo Vittorio; Eriksson, Anders; Manica, Andrea; Ravasi, Timothy

    2015-01-01

    Detecting structure in population genetics and case-control studies is important, as it exposes phenomena such as ecoclines, admixture and stratification. Principal Component Analysis (PCA) is a linear dimension-reduction technique commonly used for this purpose, but it struggles to reveal complex, nonlinear data patterns. In this paper we introduce non-centred Minimum Curvilinear Embedding (ncMCE), a nonlinear method to overcome this problem. Our analyses show that ncMCE can separate individuals into ethnic groups in cases in which PCA fails to reveal any clear structure. This increased discrimination power arises from ncMCE's ability to better capture the phylogenetic signal in the samples, whereas PCA better reflects their geographic relation. We also demonstrate how ncMCE can discover interesting patterns, even when the data has been poorly pre-processed. The juxtaposition of PCA and ncMCE visualisations provides a new standard of analysis with utility for discovering and validating significant linear/nonlinear complementary patterns in genetic data.

  11. Highlighting nonlinear patterns in population genetics datasets

    KAUST Repository

    Alanis Lobato, Gregorio

    2015-01-30

    Detecting structure in population genetics and case-control studies is important, as it exposes phenomena such as ecoclines, admixture and stratification. Principal Component Analysis (PCA) is a linear dimension-reduction technique commonly used for this purpose, but it struggles to reveal complex, nonlinear data patterns. In this paper we introduce non-centred Minimum Curvilinear Embedding (ncMCE), a nonlinear method to overcome this problem. Our analyses show that ncMCE can separate individuals into ethnic groups in cases in which PCA fails to reveal any clear structure. This increased discrimination power arises from ncMCE's ability to better capture the phylogenetic signal in the samples, whereas PCA better reflects their geographic relation. We also demonstrate how ncMCE can discover interesting patterns, even when the data has been poorly pre-processed. The juxtaposition of PCA and ncMCE visualisations provides a new standard of analysis with utility for discovering and validating significant linear/nonlinear complementary patterns in genetic data.

  12. An Analysis of the GTZAN Music Genre Dataset

    DEFF Research Database (Denmark)

    Sturm, Bob L.

    2012-01-01

    Most research in automatic music genre recognition has used the dataset assembled by Tzanetakis et al. in 2001. The composition and integrity of this dataset, however, has never been formally analyzed. For the first time, we provide an analysis of its composition, and create a machine...

  13. Really big data: Processing and analysis of large datasets

    Science.gov (United States)

    Modern animal breeding datasets are large and getting larger, due in part to the recent availability of DNA data for many animals. Computational methods for efficiently storing and analyzing those data are under development. The amount of storage space required for such datasets is increasing rapidl...

  14. An Annotated Dataset of 14 Cardiac MR Images

    DEFF Research Database (Denmark)

    Stegmann, Mikkel Bille

    2002-01-01

    This note describes a dataset consisting of 14 annotated cardiac MR images. Points of correspondence are placed on each image at the left ventricle (LV). As such, the dataset can be readily used for building statistical models of shape. Further, format specifications and terms of use are given....

  15. A New Outlier Detection Method for Multidimensional Datasets

    KAUST Repository

    Abdel Messih, Mario A.

    2012-07-01

    This study develops a novel hybrid method for outlier detection (HMOD) that combines the ideas of distance-based and density-based methods. The proposed method has two main advantages over most other outlier detection methods. The first advantage is that it works well on both dense and sparse datasets. The second advantage is that, unlike most other outlier detection methods, which require careful parameter setting and prior knowledge of the data, HMOD is not very sensitive to small changes in parameter values within certain parameter ranges. The only parameter that must be set is the number of nearest neighbors. In addition, we made a fully parallelized implementation of HMOD, which makes it very efficient in applications. Moreover, we propose a new way of using outlier detection for redundancy reduction in datasets, in which users can specify the confidence level that evaluates how accurately the less redundant dataset represents the original dataset. HMOD is evaluated on synthetic datasets (dense and mixed “dense and sparse”) and on a bioinformatics problem of redundancy reduction in a dataset of position weight matrices (PWMs) of transcription factor binding sites. In addition, in the process of assessing the performance of our redundancy reduction method, we developed a simple tool that can be used to evaluate the confidence level with which a reduced dataset represents the original dataset. The evaluation of the results shows that our method can be used in a wide range of problems.

  16. The impact of the resolution of meteorological datasets on catchment-scale drought studies

    Science.gov (United States)

    Hellwig, Jost; Stahl, Kerstin

    2017-04-01

    Gridded meteorological datasets provide the basis to study drought at a range of scales, including catchment-scale drought studies in hydrology. They are readily available to study past weather conditions and often serve real-time monitoring as well. As these datasets differ in spatial/temporal coverage and spatial/temporal resolution, for most studies there is a tradeoff between these features. Our investigation examines whether biases occur when studying drought at catchment scale with low-resolution input data. For that, a comparison among the datasets HYRAS (covering Central Europe, 1x1 km grid, daily data, 1951-2005), E-OBS (Europe, 0.25° grid, daily data, 1950-2015) and GPCC (whole world, 0.5° grid, monthly data, 1901-2013) is carried out. Generally, biases in precipitation increase with decreasing resolution. The most important variations are found during summer. In the low mountain ranges of Central Europe, the coarse-resolution datasets (E-OBS, GPCC) overestimate dry days and underestimate total precipitation since they are not able to describe high spatial variability. However, relative measures like the correlation coefficient reveal good consistency of dry and wet periods, both for absolute precipitation values and for standardized indices like the Standardized Precipitation Index (SPI) or the Standardized Precipitation Evaporation Index (SPEI). In particular, the most severe droughts derived from the different datasets match very well. These results indicate that absolute values from coarse-resolution datasets should be used with caution when assessing hydrological drought at catchment scale, whereas relative measures for determining periods of drought are more trustworthy. Therefore, drought studies that downscale meteorological data should carefully consider their data needs and focus on relative measures for dry periods if these are sufficient for the task.

  17. ATLAS File and Dataset Metadata Collection and Use

    CERN Document Server

    Albrand, S; The ATLAS collaboration; Lambert, F; Gallas, E J

    2012-01-01

    The ATLAS Metadata Interface (“AMI”) was designed as a generic cataloguing system, and as such it has found many uses in the experiment including software release management, tracking of reconstructed event sizes and control of dataset nomenclature. The primary use of AMI is to provide a catalogue of datasets (file collections) which is searchable using physics criteria. In this paper we discuss the various mechanisms used for filling the AMI dataset and file catalogues. By correlating information from different sources we can derive aggregate information which is important for physics analysis; for example the total number of events contained in dataset, and possible reasons for missing events such as a lost file. Finally we will describe some specialized interfaces which were developed for the Data Preparation and reprocessing coordinators. These interfaces manipulate information from both the dataset domain held in AMI, and the run-indexed information held in the ATLAS COMA application (Conditions and ...

  18. A dataset on tail risk of commodities markets.

    Science.gov (United States)

    Powell, Robert J; Vo, Duc H; Pham, Thach N; Singh, Abhay K

    2017-12-01

    This article contains the datasets related to the research article "The long and short of commodity tails and their relationship to Asian equity markets"(Powell et al., 2017) [1]. The datasets contain the daily prices (and price movements) of 24 different commodities decomposed from the S&P GSCI index and the daily prices (and price movements) of three share market indices including World, Asia, and South East Asia for the period 2004-2015. Then, the dataset is divided into annual periods, showing the worst 5% of price movements for each year. The datasets are convenient to examine the tail risk of different commodities as measured by Conditional Value at Risk (CVaR) as well as their changes over periods. The datasets can also be used to investigate the association between commodity markets and share markets.
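    The tail measure named in this record can be illustrated with a short sketch. It is not taken from the article: it assumes simple day-over-day returns and the plain empirical definition of Conditional Value at Risk as the mean of the worst 5% of price movements. The function names (`daily_returns`, `worst_tail`, `cvar`) are hypothetical.

```python
import math
from typing import List, Sequence


def daily_returns(prices: Sequence[float]) -> List[float]:
    """Relative day-over-day price movements (assumed definition)."""
    return [(b - a) / a for a, b in zip(prices, prices[1:])]


def worst_tail(returns: Sequence[float], alpha: float = 0.05) -> List[float]:
    """The worst `alpha` share of returns, most negative first."""
    if not returns:
        raise ValueError("returns must not be empty")
    # Small epsilon guards against floating-point products such as 4.999999.
    k = max(1, math.floor(len(returns) * alpha + 1e-9))
    return sorted(returns)[:k]


def cvar(returns: Sequence[float], alpha: float = 0.05) -> float:
    """Empirical CVaR: the mean of the worst `alpha` share of returns."""
    tail = worst_tail(returns, alpha)
    return sum(tail) / len(tail)
```

    Applied to one year of returns at a time, `worst_tail` would give the kind of annual worst-5% subset the record describes, and `cvar` would summarise it as a single number.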

  19. Error characterisation of global active and passive microwave soil moisture datasets

    Directory of Open Access Journals (Sweden)

    W. A. Dorigo

    2010-12-01

    Full Text Available Understanding the error structures of remotely sensed soil moisture observations is essential for correctly interpreting observed variations and trends in the data or assimilating them in hydrological or numerical weather prediction models. Nevertheless, a spatially coherent assessment of the quality of the various globally available datasets is often hampered by the limited availability over space and time of reliable in-situ measurements. As an alternative, this study explores the triple collocation error estimation technique for assessing the relative quality of several globally available soil moisture products from active (ASCAT) and passive (AMSR-E and SSM/I) microwave sensors. The triple collocation is a powerful statistical tool to estimate the root mean square error while simultaneously solving for systematic differences in the climatologies of a set of three linearly related data sources with independent error structures. Prerequisite for this technique is the availability of a sufficiently large number of timely corresponding observations. In addition to the active and passive satellite-based datasets, we used the ERA-Interim and GLDAS-NOAH reanalysis soil moisture datasets as a third, independent reference. The prime objective is to reveal trends in uncertainty related to different observation principles (passive versus active), the use of different frequencies (C-, X-, and Ku-band) for passive microwave observations, and the choice of the independent reference dataset (ERA-Interim versus GLDAS-NOAH). The results suggest that the triple collocation method provides realistic error estimates. Observed spatial trends agree well with the existing theory and studies on the performance of different observation principles and frequencies with respect to land cover and vegetation density. In addition, if all theoretical prerequisites are fulfilled (e.g. a sufficiently large number of common observations is available and errors of the different
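    The triple collocation idea described in this record can be sketched as follows. This is a minimal illustration using the common covariance-based formulation, not the authors' implementation: it assumes three collocated series that are linearly related to the same unknown truth, with mutually independent zero-mean errors. The function names are hypothetical, and each estimated error is expressed in the units of its own series.

```python
import math
from typing import Sequence, Tuple


def covariance(a: Sequence[float], b: Sequence[float]) -> float:
    """Sample covariance of two equally long series."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    return sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b)) / (n - 1)


def triple_collocation_rmse(
    x: Sequence[float], y: Sequence[float], z: Sequence[float]
) -> Tuple[float, float, float]:
    """Estimate the random-error RMSE of each of three collocated series.

    Assumes independent errors; negative variance estimates, which can
    occur with few samples, are clipped to zero.
    """
    if not (len(x) == len(y) == len(z)) or len(x) < 2:
        raise ValueError("need three series of equal length >= 2")
    c_xy, c_xz, c_yz = covariance(x, y), covariance(x, z), covariance(y, z)
    var_x = covariance(x, x) - c_xy * c_xz / c_yz
    var_y = covariance(y, y) - c_xy * c_yz / c_xz
    var_z = covariance(z, z) - c_xz * c_yz / c_xy
    return tuple(math.sqrt(max(v, 0.0)) for v in (var_x, var_y, var_z))
```

    Because the cross-covariances absorb the differing scale factors, no explicit rescaling step is needed in this formulation; the estimates are only meaningful when many collocated observations are available, which matches the prerequisite stated in the record.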

  20. Revealing Rembrandt

    Directory of Open Access Journals (Sweden)

    Andrew J Parker

    2014-04-01

    Full Text Available The power and significance of artwork in shaping human cognition is self-evident. The starting point for our empirical investigations is the view that the task of neuroscience is to integrate itself with other forms of knowledge, rather than to seek to supplant them. In our recent work, we examined a particular aspect of the appreciation of artwork using present-day functional magnetic resonance imaging (fMRI). Our results emphasised the continuity between viewing artwork and other human cognitive activities. We also showed that appreciation of a particular aspect of artwork, namely authenticity, depends upon the co-ordinated activity between the brain regions involved in multiple decision making and those responsible for processing visual information. The findings about brain function probably have no specific consequences for understanding how people respond to the art of Rembrandt in comparison with their response to other artworks. However, the use of images of Rembrandt’s portraits, his most intimate and personal works, clearly had a significant impact upon our viewers, even though they have been spatially confined to the interior of an MRI scanner at the time of viewing. Neuroscientific studies of humans viewing artwork have the capacity to reveal the diversity of human cognitive responses that may be induced by external advice or context as people view artwork in a variety of frameworks and settings.

  1. Discovery and Reuse of Open Datasets: An Exploratory Study

    Directory of Open Access Journals (Sweden)

    Sara

    2016-07-01

    Full Text Available Objective: This article analyzes twenty cited or downloaded datasets and the repositories that house them, in order to produce insights that can be used by academic libraries to encourage discovery and reuse of research data in institutional repositories. Methods: Using Thomson Reuters’ Data Citation Index and repository download statistics, we identified twenty cited/downloaded datasets. We documented the characteristics of the cited/downloaded datasets and their corresponding repositories in a self-designed rubric. The rubric includes six major categories: basic information; funding agency and journal information; linking and sharing; factors to encourage reuse; repository characteristics; and data description. Results: Our small-scale study suggests that cited/downloaded datasets generally comply with basic recommendations for facilitating reuse: data are documented well; formatted for use with a variety of software; and shared in established, open access repositories. Three significant factors also appear to contribute to dataset discovery: publishing in discipline-specific repositories; indexing in more than one location on the web; and using persistent identifiers. The cited/downloaded datasets in our analysis came from a few specific disciplines, and tended to be funded by agencies with data publication mandates. Conclusions: The results of this exploratory research provide insights that can inform academic librarians as they work to encourage discovery and reuse of institutional datasets. Our analysis also suggests areas in which academic librarians can target open data advocacy in their communities in order to begin to build open data success stories that will fuel future advocacy efforts.

  2. Sparse Group Penalized Integrative Analysis of Multiple Cancer Prognosis Datasets

    Science.gov (United States)

    Liu, Jin; Huang, Jian; Xie, Yang; Ma, Shuangge

    2014-01-01

    SUMMARY In cancer research, high-throughput profiling studies have been extensively conducted, searching for markers associated with prognosis. Because of the “large d, small n” characteristic, results generated from the analysis of a single dataset can be unsatisfactory. Recent studies have shown that integrative analysis, which simultaneously analyzes multiple datasets, can be more effective than single-dataset analysis and classic meta-analysis. In most of existing integrative analysis, the homogeneity model has been assumed, which postulates that different datasets share the same set of markers. Several approaches have been designed to reinforce this assumption. In practice, different datasets may differ in terms of patient selection criteria, profiling techniques, and many other aspects. Such differences may make the homogeneity model too restricted. In this study, we assume the heterogeneity model, under which different datasets are allowed to have different sets of markers. With multiple cancer prognosis datasets, we adopt the AFT (accelerated failure time) model to describe survival. This model may have the lowest computational cost among popular semiparametric survival models. For marker selection, we adopt a sparse group MCP (minimax concave penalty) approach. This approach has an intuitive formulation and can be computed using an effective group coordinate descent algorithm. Simulation study shows that it outperforms the existing approaches under both the homogeneity and heterogeneity models. Data analysis further demonstrates the merit of heterogeneity model and proposed approach. PMID:23938111

  3. Investigating country-specific music preferences and music recommendation algorithms with the LFM-1b dataset.

    Science.gov (United States)

    Schedl, Markus

    2017-01-01

    Recently, the LFM-1b dataset has been proposed to foster research and evaluation in music retrieval and music recommender systems, Schedl (Proceedings of the ACM International Conference on Multimedia Retrieval (ICMR). New York, 2016). It contains more than one billion music listening events created by more than 120,000 users of Last.fm. Each listening event is characterized by artist, album, and track name, and further includes a timestamp. Basic demographic information and a selection of more elaborate listener-specific descriptors are included as well, for anonymized users. In this article, we reveal information about LFM-1b's acquisition and content and we compare it to existing datasets. We furthermore provide an extensive statistical analysis of the dataset, including basic properties of the item sets, demographic coverage, distribution of listening events (e.g., over artists and users), and aspects related to music preference and consumption behavior (e.g., temporal features and mainstreaminess of listeners). Exploiting country information of users and genre tags of artists, we also create taste profiles for populations and determine similar and dissimilar countries in terms of their populations' music preferences. Finally, we illustrate the dataset's usage in a simple artist recommendation task, whose results are intended to serve as baseline against which more elaborate techniques can be assessed.

  4. PROVIDING GEOGRAPHIC DATASETS AS LINKED DATA IN SDI

    Directory of Open Access Journals (Sweden)

    E. Hietanen

    2016-06-01

    Full Text Available In this study, a prototype service to provide data from a Web Feature Service (WFS) as linked data is implemented. First, persistent and unique Uniform Resource Identifiers (URIs) are created for all spatial objects in the dataset. The objects are available from those URIs in the Resource Description Framework (RDF) data format. Next, a Web Ontology Language (OWL) ontology is created to describe the dataset information content using the Open Geospatial Consortium’s (OGC) GeoSPARQL vocabulary. The existing data model is modified in order to take into account the linked data principles. The implemented service produces an HTTP response dynamically. The data for the response is first fetched from the existing WFS. Then the Geographic Markup Language (GML) format output of the WFS is transformed on-the-fly to the RDF format. Content negotiation is used to serve the data in different RDF serialization formats. This solution facilitates the use of a dataset in different applications without replicating the whole dataset. In addition, individual spatial objects in the dataset can be referred to with URIs. Furthermore, the needed information content of the objects can be easily extracted from the RDF serializations available from those URIs. A solution for linking data objects to the dataset URI is also introduced by using the Vocabulary of Interlinked Datasets (VoID). The dataset is divided into subsets, and each subset is given its own persistent and unique URI. This enables the whole dataset to be explored with a web browser and all individual objects to be indexed by search engines.
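    The content negotiation step mentioned in this record can be sketched as below. This is an illustrative assumption about how such a service might choose an RDF serialization from an HTTP `Accept` header, not the prototype's actual code. The media types are the standard registered ones; the `SUPPORTED` table, the `DEFAULT_TYPE` fallback and the `negotiate` function are hypothetical.

```python
# Registered media types for common RDF serializations, mapped to
# hypothetical internal format labels.
SUPPORTED = {
    "text/turtle": "turtle",
    "application/rdf+xml": "rdfxml",
    "application/ld+json": "jsonld",
    "application/n-triples": "ntriples",
}
DEFAULT_TYPE = "text/turtle"  # assumed fallback, not from the source


def negotiate(accept_header: str) -> str:
    """Pick the supported media type the client prefers most."""
    candidates = []
    for position, part in enumerate(accept_header.split(",")):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        quality = 1.0  # q defaults to 1 when omitted
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        candidates.append((-quality, position, media_type))
    # Highest q first; ties keep the order given in the header.
    for neg_quality, _, media_type in sorted(candidates):
        if neg_quality == 0:
            continue  # q=0 marks a type as not acceptable
        if media_type in SUPPORTED:
            return media_type
        if media_type == "*/*":
            return DEFAULT_TYPE
    return DEFAULT_TYPE
```

    A stricter service could answer `406 Not Acceptable` instead of falling back to a default; the fallback here is only a design choice made to keep the sketch short.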

  5. Homogenised Australian climate datasets used for climate change monitoring

    International Nuclear Information System (INIS)

    Trewin, Blair; Jones, David; Collins, Dean; Jovanovic, Branislava; Braganza, Karl

    2007-01-01

    Full text: The Australian Bureau of Meteorology has developed a number of datasets for use in climate change monitoring. These datasets typically cover 50-200 stations distributed as evenly as possible over the Australian continent, and have been subject to detailed quality control and homogenisation. The time period over which data are available for each element is largely determined by the availability of data in digital form. Whilst nearly all Australian monthly and daily precipitation data have been digitised, a significant quantity of pre-1957 data (for temperature and evaporation) or pre-1987 data (for some other elements) remains to be digitised, and is not currently available for use in the climate change monitoring datasets. In the case of temperature and evaporation, the start date of the datasets is also determined by major changes in instruments or observing practices for which no adjustment is feasible at the present time. The datasets currently available cover: Monthly and daily precipitation (most stations commence 1915 or earlier, with many extending back to the late 19th century, and a few to the mid-19th century); Annual temperature (commences 1910); Daily temperature (commences 1910, with limited station coverage pre-1957); Twice-daily dewpoint/relative humidity (commences 1957); Monthly pan evaporation (commences 1970); Cloud amount (commences 1957) (Jovanovic et al. 2007). As well as the station-based datasets listed above, an additional dataset being developed for use in climate change monitoring (and other applications) covers tropical cyclones in the Australian region. This is described in more detail in Trewin (2007). The datasets already developed are used in analyses of observed climate change, which are available through the Australian Bureau of Meteorology website (http://www.bom.gov.au/silo/products/cli_chg/). They are also used as a basis for routine climate monitoring, and in the datasets used for the development of seasonal

  6. Tension in the recent Type Ia supernovae datasets

    International Nuclear Information System (INIS)

    Wei, Hao

    2010-01-01

    In the present work, we investigate the tension in the recent Type Ia supernovae (SNIa) datasets Constitution and Union. We show that they are in tension not only with the observations of the cosmic microwave background (CMB) anisotropy and the baryon acoustic oscillations (BAO), but also with other SNIa datasets such as Davis and SNLS. Then, we find the main sources responsible for the tension. Further, we make this more robust by employing the method of random truncation. Based on the results of this work, we suggest two truncated versions of the Union and Constitution datasets, namely the UnionT and ConstitutionT SNIa samples, whose behaviors are more regular.

  7. Factors affecting finite strain estimation in low-grade, low-strain clastic rocks

    Science.gov (United States)

    Pastor-Galán, Daniel; Gutiérrez-Alonso, Gabriel; Meere, Patrick A.; Mulchrone, Kieran F.

    2009-12-01

    The computer strain analysis methods SAPE, MRL and DTNNM have permitted the characterization of finite strain in two different regions with contrasting geodynamic scenarios: (1) the Talas Ala Tau (Tien Shan, Kyrgyz Republic) and (2) the Somiedo Nappe and Narcea Antiform (Cantabrian to West Asturian-Leonese Zone boundary, Variscan Belt, NW of Iberia). The performed analyses have revealed low-strain values and the regional strain trend in both studied areas. This study also investigates the relationship between lithology (grain size and percentage of matrix) and the strain estimates from the two methodologies used. The results show that these methods are comparable and that there is no significant lithological control on finite strain in rocks deformed under low metamorphic and low-strain conditions.

  8. Life Stress, Strain, and Deviance Across Schools: Testing the Contextual Version of General Strain Theory in China.

    Science.gov (United States)

    Zhang, Jinwu; Liu, Jianhong; Wang, Xin; Zou, Anquan

    2017-08-01

    General Strain Theory delineates different types of strain and intervening processes from strain to deviance and crime. In addition to explaining individual strain-crime relationship, a contextualized version of general strain theory, which is called the Macro General Strain Theory, has been used to analyze how aggregate variables influence aggregate and individual deviance and crime. Using a sample of 1,852 students (Level 1) nested in 52 schools (Level 2), the current study tests the Macro General Strain Theory using Chinese data. The results revealed that aggregate life stress and strain have influences on aggregate and individual deviance, and reinforce the individual stress-deviance association. The current study contributes by providing the first Macro General Strain Theory test based on Chinese data and offering empirical evidence for the multilevel intervening processes from strain to deviance. Limitations and future research directions are discussed.

  9. Background qualitative analysis of the European reference life cycle database (ELCD) energy datasets - part II: electricity datasets.

    Science.gov (United States)

    Garraín, Daniel; Fazio, Simone; de la Rúa, Cristina; Recchioni, Marco; Lechón, Yolanda; Mathieux, Fabrice

    2015-01-01

    The aim of this paper is to identify areas of potential improvement of the European Reference Life Cycle Database (ELCD) electricity datasets. The revision is based on the data quality indicators described by the International Life Cycle Data system (ILCD) Handbook, applied on a sectorial basis. These indicators evaluate the technological, geographical and time-related representativeness of the dataset and its appropriateness in terms of completeness, precision and methodology. Results show that ELCD electricity datasets have a very good quality in general terms; nevertheless, some findings and recommendations for improving the quality of Life-Cycle Inventories have been derived. Moreover, these results assure any LCA practitioner of the quality of the electricity-related datasets, and provide insights into the limitations and assumptions underlying the dataset modelling. Given this information, the LCA practitioner will be able to decide whether the use of the ELCD electricity datasets is appropriate based on the goal and scope of the analysis to be conducted. The methodological approach would also be useful for dataset developers and reviewers, in order to improve the overall Data Quality Requirements of databases.

  10. Dataset definition for CMS operations and physics analyses

    Science.gov (United States)

    Franzoni, Giovanni; Compact Muon Solenoid Collaboration

    2016-04-01

    Data recorded at the CMS experiment are funnelled into streams, integrated in the HLT menu, and further organised in a hierarchical structure of primary datasets and secondary datasets/dedicated skims. Datasets are defined according to the final-state particles reconstructed by the high level trigger, the data format and the use case (physics analysis, alignment and calibration, performance studies). During the first LHC run, new workflows have been added to this canonical scheme, to best exploit the flexibility of the CMS trigger and data acquisition systems. The concepts of data parking and data scouting have been introduced to extend the physics reach of CMS, offering the opportunity of defining physics triggers with extremely loose selections (e.g. dijet resonance trigger collecting data at 1 kHz). In this presentation, we review the evolution of the dataset definition during the LHC run I, and we discuss the plans for the run II.

  11. U.S. Climate Divisional Dataset (Version Superseded)

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — This data has been superseded by a newer version of the dataset. Please refer to NOAA's Climate Divisional Database for more information. The U.S. Climate Divisional...

  12. Karna Particle Size Dataset for Tables and Figures

    Data.gov (United States)

    U.S. Environmental Protection Agency — This dataset contains 1) table of bulk Pb-XAS LCF results, 2) table of bulk As-XAS LCF results, 3) figure data of particle size distribution, and 4) figure data for...

  13. NOAA Global Surface Temperature Dataset, Version 4.0

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — The NOAA Global Surface Temperature Dataset (NOAAGlobalTemp) is derived from two independent analyses: the Extended Reconstructed Sea Surface Temperature (ERSST)...

  14. National Hydrography Dataset (NHD) - USGS National Map Downloadable Data Collection

    Data.gov (United States)

    U.S. Geological Survey, Department of the Interior — The USGS National Hydrography Dataset (NHD) Downloadable Data Collection from The National Map (TNM) is a comprehensive set of digital spatial data that encodes...

  15. Watershed Boundary Dataset (WBD) - USGS National Map Downloadable Data Collection

    Data.gov (United States)

    U.S. Geological Survey, Department of the Interior — The Watershed Boundary Dataset (WBD) from The National Map (TNM) defines the perimeter of drainage areas formed by the terrain and other landscape characteristics....

  16. BASE MAP DATASET, LE FLORE COUNTY, OKLAHOMA, USA

    Data.gov (United States)

    Federal Emergency Management Agency, Department of Homeland Security — Basemap datasets comprise six of the seven FGDC themes of geospatial data that are used by most GIS applications (Note: the seventh framework theme, orthographic...

  17. USGS National Hydrography Dataset from The National Map

    Data.gov (United States)

    U.S. Geological Survey, Department of the Interior — USGS The National Map - National Hydrography Dataset (NHD) is a comprehensive set of digital spatial data that encodes information about naturally occurring and...

  18. A robust dataset-agnostic heart disease classifier from Phonocardiogram.

    Science.gov (United States)

    Banerjee, Rohan; Dutta Choudhury, Anirban; Deshpande, Parijat; Bhattacharya, Sakyajit; Pal, Arpan; Mandana, K M

    2017-07-01

    Automatic classification of normal and abnormal heart sounds is a popular area of research. However, building a robust algorithm unaffected by signal quality and patient demography is a challenge. In this paper we have analysed a wide list of Phonocardiogram (PCG) features in time and frequency domain along with morphological and statistical features to construct a robust and discriminative feature set for dataset-agnostic classification of normal and cardiac patients. The large and open access database, made available in Physionet 2016 challenge was used for feature selection, internal validation and creation of training models. A second dataset of 41 PCG segments, collected using our in-house smart phone based digital stethoscope from an Indian hospital was used for performance evaluation. Our proposed methodology yielded sensitivity and specificity scores of 0.76 and 0.75 respectively on the test dataset in classifying cardiovascular diseases. The methodology also outperformed three popular prior art approaches, when applied on the same dataset.

  19. AFSC/REFM: Seabird Necropsy dataset of North Pacific

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — The seabird necropsy dataset contains information on seabird specimens that were collected under salvage and scientific collection permits primarily by...

  20. Dataset definition for CMS operations and physics analyses

    CERN Document Server

    AUTHOR|(CDS)2051291

    2016-01-01

    Data recorded at the CMS experiment are funnelled into streams, integrated in the HLT menu, and further organised in a hierarchical structure of primary datasets, secondary datasets, and dedicated skims. Datasets are defined according to the final-state particles reconstructed by the high level trigger, the data format and the use case (physics analysis, alignment and calibration, performance studies). During the first LHC run, new workflows have been added to this canonical scheme, to best exploit the flexibility of the CMS trigger and data acquisition systems. The concepts of data parking and data scouting have been introduced to extend the physics reach of CMS, offering the opportunity of defining physics triggers with extremely loose selections (e.g. dijet resonance trigger collecting data at 1 kHz). In this presentation, we review the evolution of the dataset definition during the first run, and we discuss the plans for the second LHC run.

  1. USGS National Boundary Dataset (NBD) Downloadable Data Collection

    Data.gov (United States)

    U.S. Geological Survey, Department of the Interior — The USGS Governmental Unit Boundaries dataset from The National Map (TNM) represents major civil areas for the Nation, including States or Territories, counties (or...

  2. Environmental Dataset Gateway (EDG) CS-W Interface

    Data.gov (United States)

    U.S. Environmental Protection Agency — Use the Environmental Dataset Gateway (EDG) to find and access EPA's environmental resources. Many options are available for easily reusing EDG content in other...

  3. Global Man-made Impervious Surface (GMIS) Dataset From Landsat

    Data.gov (United States)

    National Aeronautics and Space Administration — The Global Man-made Impervious Surface (GMIS) Dataset From Landsat consists of global estimates of fractional impervious cover derived from the Global Land Survey...

  4. A Comparative Analysis of Classification Algorithms on Diverse Datasets

    Directory of Open Access Journals (Sweden)

    M. Alghobiri

    2018-04-01

    Full Text Available Data mining is the computational process of finding patterns in large data sets. Classification, one of the main domains of data mining, involves generalizing a known structure so that it can be applied to a new dataset to predict its class. Various classification algorithms are used to classify data sets. They are based on different methods such as probability, decision trees, neural networks, nearest neighbor, boolean and fuzzy logic, kernel-based methods, etc. In this paper, we apply three diverse classification algorithms to ten datasets. The datasets have been selected based on their size and/or the number and nature of their attributes. Results are discussed using performance evaluation measures such as precision, accuracy, F-measure, Kappa statistics, mean absolute error, relative absolute error, ROC Area, etc. The comparative analysis has been carried out using accuracy, precision, and F-measure. We specify the features and limitations of the classification algorithms for datasets of diverse nature.
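
    The accuracy, precision and F-measure named in this abstract can be illustrated with a minimal Python sketch. It assumes a binary problem with one designated positive class; the function name `binary_metrics` and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import Counter


def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F-measure for one positive class.

    Illustrative helper (not from the paper): labels are compared to
    `positive`; everything else counts as the negative class.
    """
    pairs = Counter(zip(y_true, y_pred))
    tp = sum(c for (t, p), c in pairs.items() if t == positive and p == positive)
    fp = sum(c for (t, p), c in pairs.items() if t != positive and p == positive)
    fn = sum(c for (t, p), c in pairs.items() if t == positive and p != positive)
    total = sum(pairs.values())
    tn = total - tp - fp - fn

    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F-measure is the harmonic mean of precision and recall.
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f_measure": f_measure}


# Toy labels, invented for illustration only.
metrics = binary_metrics([1, 1, 1, 0, 0, 0, 0, 1], [1, 1, 0, 0, 0, 1, 0, 1])
```

    Multi-class problems would need per-class or averaged variants of these measures, which this sketch does not cover.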

  5. Newton SSANTA Dr Water using POU filters dataset

    Data.gov (United States)

    U.S. Environmental Protection Agency — This dataset contains information about all the features extracted from the raw data files, the formulas that were assigned to some of these features, and the...

  6. Estimating parameters for probabilistic linkage of privacy-preserved datasets.

    Science.gov (United States)

    Brown, Adrian P; Randall, Sean M; Ferrante, Anna M; Semmens, James B; Boyd, James H

    2017-07-10

    Probabilistic record linkage is a process used to bring together person-based records from within the same dataset (de-duplication) or from disparate datasets using pairwise comparisons and matching probabilities. The linkage strategy and associated match probabilities are often estimated through investigations into data quality and manual inspection. However, as privacy-preserved datasets comprise encrypted data, such methods are not possible. In this paper, we present a method for estimating the probabilities and threshold values for probabilistic privacy-preserved record linkage using Bloom filters. Our method was tested through a simulation study using synthetic data, followed by an application using real-world administrative data. Synthetic datasets were generated with error rates from zero to 20%. Our method was used to estimate parameters (probabilities and thresholds) for de-duplication linkages. Linkage quality was determined by F-measure. Each dataset was privacy-preserved using separate Bloom filters for each field. Match probabilities were estimated using the expectation-maximisation (EM) algorithm on the privacy-preserved data. Threshold cut-off values were determined by an extension to the EM algorithm allowing linkage quality to be estimated for each possible threshold. De-duplication linkages of each privacy-preserved dataset were performed using both estimated and calculated probabilities. Linkage quality using the F-measure at the estimated threshold values was also compared to the highest F-measure. Three large administrative datasets were used to demonstrate the applicability of the probability and threshold estimation technique on real-world data. Linkage of the synthetic datasets using the estimated probabilities produced an F-measure that was comparable to the F-measure using calculated probabilities, even with up to 20% error. Linkage of the administrative datasets using estimated probabilities produced an F-measure that was higher
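
    As a rough illustration of how a single field might be privacy-preserved in a Bloom filter and then compared, the Python sketch below encodes character bigrams with keyed hashes and scores two filters with the Dice coefficient. The abstract does not give the encoding details, so the bigram tokenisation, the HMAC-based double hashing, the 1024-bit filter length, the four hash functions and the key are all assumptions, and the function names are hypothetical.

```python
import hashlib
import hmac


def bigrams(value):
    """Split a field value into padded, lower-cased character bigrams."""
    padded = f"_{value.lower()}_"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def bloom_encode(value, key=b"shared-secret", num_bits=1024, num_hashes=4):
    """Return the set of bit positions set for one field value.

    Assumed scheme: two keyed hashes combined by double hashing,
    position_i = (h1 + i * h2) mod num_bits.
    """
    positions = set()
    for gram in bigrams(value):
        data = gram.encode("utf-8")
        h1 = int.from_bytes(hmac.new(key, data, hashlib.sha1).digest(), "big")
        h2 = int.from_bytes(hmac.new(key, data, hashlib.sha256).digest(), "big")
        for i in range(num_hashes):
            positions.add((h1 + i * h2) % num_bits)
    return frozenset(positions)


def dice(filter_a, filter_b):
    """Dice coefficient between two Bloom filters given as bit-position sets."""
    if not filter_a and not filter_b:
        return 1.0
    return 2 * len(filter_a & filter_b) / (len(filter_a) + len(filter_b))


# Similar surnames should share most bits; unrelated ones should share few.
similar = dice(bloom_encode("smith"), bloom_encode("smyth"))
unrelated = dice(bloom_encode("smith"), bloom_encode("jones"))
```

    In a full linkage, per-field similarities of this kind would feed the match probabilities and thresholds that the paper estimates with the EM algorithm; that estimation step is not shown here.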

  7. Toward computational cumulative biology by combining models of biological datasets.

    Science.gov (United States)

    Faisal, Ali; Peltonen, Jaakko; Georgii, Elisabeth; Rung, Johan; Kaski, Samuel

    2014-01-01

    A main challenge of data-driven sciences is how to make maximal use of the progressively expanding databases of experimental datasets in order to keep research cumulative. We introduce the idea of a modeling-based dataset retrieval engine designed for relating a researcher's experimental dataset to earlier work in the field. The search is (i) data-driven to enable new findings, going beyond the state of the art of keyword searches in annotations, (ii) modeling-driven, to include both biological knowledge and insights learned from data, and (iii) scalable, as it is accomplished without building one unified grand model of all data. Assuming each dataset has been modeled beforehand, by the researchers or automatically by database managers, we apply a rapidly computable and optimizable combination model to decompose a new dataset into contributions from earlier relevant models. By using the data-driven decomposition, we identify a network of interrelated datasets from a large annotated human gene expression atlas. While tissue type and disease were major driving forces for determining relevant datasets, the found relationships were richer, and the model-based search was more accurate than the keyword search; moreover, it recovered biologically meaningful relationships that are not straightforwardly visible from annotations, for instance between cells in different developmental stages such as thymocytes and T-cells. Data-driven links and citations matched to a large extent; the data-driven links even uncovered corrections to the publication data, as two of the most linked datasets were not highly cited and turned out to have wrong publication entries in the database.

  8. Testing the Neutral Theory of Biodiversity with Human Microbiome Datasets

    OpenAIRE

    Li, Lianwei; Ma, Zhanshan (Sam)

    2016-01-01

    The human microbiome project (HMP) has made it possible to test important ecological theories for arguably the most important ecosystem to human health: the human microbiome. The limited number of existing studies have reported conflicting evidence in the case of the neutral theory; the present study aims to comprehensively test the neutral theory with extensive HMP datasets covering all five major body sites inhabited by the human microbiome. Utilizing 7437 datasets of bacterial community samples...

  9. General Purpose Multimedia Dataset - GarageBand 2008

    DEFF Research Database (Denmark)

    Meng, Anders

    This document describes a general purpose multimedia data-set to be used in cross-media machine learning problems. In more detail we describe the genre taxonomy applied at http://www.garageband.com, from where the data-set was collected, and how the taxonomy has been fused into a more human...... understandable taxonomy. Finally, a description of various features extracted from both the audio and text is presented....

  10. Artificial intelligence (AI) systems for interpreting complex medical datasets.

    Science.gov (United States)

    Altman, R B

    2017-05-01

    Advances in machine intelligence have created powerful capabilities in algorithms that find hidden patterns in data, classify objects based on their measured characteristics, and associate similar patients/diseases/drugs based on common features. However, artificial intelligence (AI) applications in medical data have several technical challenges: complex and heterogeneous datasets, noisy medical datasets, and explaining their output to users. There are also social challenges related to intellectual property, data provenance, regulatory issues, economics, and liability. © 2017 ASCPT.

  11. Heuristics for Relevancy Ranking of Earth Dataset Search Results

    Science.gov (United States)

    Lynnes, Christopher; Quinn, Patrick; Norton, James

    2016-01-01

    As the variety of Earth science datasets increases, science researchers find it more challenging to discover and select the datasets that best fit their needs. The most common way for search providers to address this problem is to rank the datasets returned for a query by their likely relevance to the user. Large web page search engines typically use text matching supplemented with reverse link counts, semantic annotations and user intent modeling. However, this produces uneven results when applied to dataset metadata records simply externalized as a web page. Fortunately, data and search providers have decades of experience in serving data user communities, allowing them to form heuristics that leverage the structure in the metadata together with knowledge about the user community. Some of these heuristics include specific ways of matching the user input to the essential measurements in the dataset and determining overlaps of time range and spatial areas. Heuristics based on the novelty of the datasets can prioritize later, better versions of data over similar predecessors. And knowledge of how different user types and communities use data can be brought to bear in cases where characteristics of the user (discipline, expertise) or their intent (applications, research) can be divined. The Earth Observing System Data and Information System has begun implementing some of these heuristics in the relevancy algorithm of its Common Metadata Repository search engine.
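
    One of the heuristics mentioned, the overlap between a query's time range and a dataset's time range, can be sketched in Python as follows. This is only an illustration of the general idea: the scoring rule (fraction of the query range covered, counted in days), the function names and the example records are assumptions, not the algorithm used by the Common Metadata Repository.

```python
from datetime import date


def temporal_overlap_fraction(query_start, query_end, data_start, data_end):
    """Fraction of the query time range (inclusive, in days) covered by a dataset."""
    query_days = (query_end - query_start).days + 1
    if query_days <= 0:
        raise ValueError("query_end must not precede query_start")
    overlap_start = max(query_start, data_start)
    overlap_end = min(query_end, data_end)
    # A negative span means the two ranges do not intersect at all.
    overlap_days = max(0, (overlap_end - overlap_start).days + 1)
    return overlap_days / query_days


def rank_by_temporal_overlap(query_range, datasets):
    """Return dataset names ordered from highest to lowest temporal overlap."""
    q_start, q_end = query_range
    scored = [
        (temporal_overlap_fraction(q_start, q_end, d["start"], d["end"]), d["name"])
        for d in datasets
    ]
    return [name for score, name in sorted(scored, key=lambda item: -item[0])]


# Hypothetical metadata records, invented for illustration.
catalog = [
    {"name": "old_product", "start": date(2000, 1, 1), "end": date(2005, 12, 31)},
    {"name": "partial_product", "start": date(2010, 7, 1), "end": date(2015, 12, 31)},
    {"name": "full_product", "start": date(2009, 1, 1), "end": date(2012, 12, 31)},
]
ranking = rank_by_temporal_overlap((date(2010, 1, 1), date(2010, 12, 31)), catalog)
```

    A production ranker would presumably combine such a score with spatial overlap, measurement matching and version novelty, as the abstract describes.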

  12. Investigating automated depth modelling of archaeo-magnetic datasets

    Science.gov (United States)

    Cheyney, Samuel; Hill, Ian; Linford, Neil; Leech, Christopher

    2010-05-01

    Magnetic surveying is a commonly used tool for first-pass non-invasive archaeological surveying, and is often used to target areas for more detailed geophysical investigation, or excavation. Quick and routine processing of magnetic datasets means survey results are typically viewed as 2D greyscale maps and the shapes of anomalies are interpreted in terms of likely archaeological structures. This technique is simple, but ignores some of the information content of the data. The data collected using dense spatial sampling with modern precise instrumentation are capable of yielding numerical estimates of the depths to buried structures, and their physical properties. The magnetic field measured at the surface is a superposition of the responses to all anomalous magnetic susceptibilities in the subsurface, and is therefore capable of revealing a 3D model of the magnetic properties. The application of mathematical modelling techniques to very-near-surface surveys such as for archaeology is quite rare; however, similar methods are routinely used in regional scale mineral exploration surveys. Inverse modelling techniques have inherent ambiguity due to the nature of the mathematical "inverse problem". Often, although a good fit to the recorded values can be obtained, the final model will be non-unique and may be heavily biased by the starting model provided. Also the run time and computer resources required can be restrictive. Our approach is to derive as much information as possible from the data directly, and use this to define a starting model for inversion. This both addresses the ambiguity of the inverse problem and reduces the task for the inversion computation. A number of alternative methods exist that can be used to obtain parameters for source bodies in potential field data. Here, methods involving the derivatives of the total magnetic field are used in association with advanced image processing techniques to outline the edges of anomalous bodies more accurately

  13. Segmentation of teeth in CT volumetric dataset by panoramic projection and variational level set

    International Nuclear Information System (INIS)

    Hosntalab, Mohammad; Aghaeizadeh Zoroofi, Reza; Abbaspour Tehrani-Fard, Ali; Shirani, Gholamreza

    2008-01-01

    Quantification of teeth is of clinical importance for various computer assisted procedures such as dental implant, orthodontic planning, face, jaw and cosmetic surgeries. In this regard, segmentation is a major step. In this paper, we propose a method for segmentation of teeth in volumetric computed tomography (CT) data using panoramic re-sampling of the dataset in the coronal view and variational level set. The proposed method consists of five steps as follows: first, we extract a mask in the CT images using Otsu thresholding. Second, the teeth are segmented from other bony tissues by utilizing anatomical knowledge of teeth in the jaws. Third, the arcs of the upper and lower jaws are estimated and the dataset is panoramically re-sampled. Separation of upper and lower jaws and initial segmentation of teeth are performed by employing the horizontal and vertical projections of the panoramic dataset, respectively. Based on the above-mentioned procedures, an initial mask for each tooth is obtained. Finally, we utilize the initial mask of teeth and apply a variational level set to refine initial teeth boundaries to final contours. The proposed algorithm was evaluated on 30 multi-slice CT datasets including 3,600 images. Experimental results reveal the effectiveness of the proposed method. In the proposed algorithm, the variational level set technique was utilized to trace the contour of the teeth. Because this technique is based on the characteristics of the overall region of the teeth image, it is possible to extract a very smooth and accurate tooth contour using this technique. On the available datasets, the proposed technique was successful in teeth segmentation compared to previous techniques. (orig.)
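
    The first step, Otsu thresholding, picks the intensity threshold that maximises the between-class variance of the resulting foreground and background. The pure-Python sketch below shows the standard histogram form of that idea. It assumes intensities already scaled to integers in 0..255 and stored as a flat list, which is a simplification made for illustration (CT data usually has a wider intensity range); the function name and the toy pixel values are hypothetical.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t maximising between-class variance.

    Pixels with value <= t form the background class, the rest the
    foreground class. `pixels` is a flat iterable of ints in [0, levels).
    """
    hist = [0] * levels
    for value in pixels:
        hist[value] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    weight_bg = 0   # number of pixels at or below the candidate threshold
    sum_bg = 0      # sum of their intensities
    for t in range(levels):
        weight_bg += hist[t]
        sum_bg += t * hist[t]
        if weight_bg == 0:
            continue            # no background pixels yet
        weight_fg = total - weight_bg
        if weight_fg == 0:
            break               # every pixel is background from here on
        mean_bg = sum_bg / weight_bg
        mean_fg = (sum_all - sum_bg) / weight_fg
        # Between-class variance up to a constant factor of 1 / total**2.
        var_between = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


# Toy bimodal data: dark soft tissue versus bright bone-like values.
toy_pixels = [10] * 50 + [12] * 50 + [200] * 40 + [210] * 60
threshold = otsu_threshold(toy_pixels)
mask = [value > threshold for value in toy_pixels]
```

    The resulting mask only separates bright from dark voxels; the later steps of the paper (anatomical constraints, panoramic re-sampling, level set refinement) are not represented here.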

  14. Segmentation of teeth in CT volumetric dataset by panoramic projection and variational level set

    Energy Technology Data Exchange (ETDEWEB)

    Hosntalab, Mohammad [Islamic Azad University, Faculty of Engineering, Science and Research Branch, Tehran (Iran); Aghaeizadeh Zoroofi, Reza [University of Tehran, Control and Intelligent Processing Center of Excellence, School of Electrical and Computer Engineering, College of Engineering, Tehran (Iran); Abbaspour Tehrani-Fard, Ali [Islamic Azad University, Faculty of Engineering, Science and Research Branch, Tehran (Iran); Sharif University of Technology, Department of Electrical Engineering, Tehran (Iran); Shirani, Gholamreza [Faculty of Dentistry Medical Science of Tehran University, Oral and Maxillofacial Surgery Department, Tehran (Iran)]

    2008-09-15

    Quantification of teeth is of clinical importance for various computer assisted procedures such as dental implant, orthodontic planning, face, jaw and cosmetic surgeries. In this regard, segmentation is a major step. In this paper, we propose a method for segmentation of teeth in volumetric computed tomography (CT) data using panoramic re-sampling of the dataset in the coronal view and variational level set. The proposed method consists of five steps as follows: first, we extract a mask in the CT images using Otsu thresholding. Second, the teeth are segmented from other bony tissues by utilizing anatomical knowledge of teeth in the jaws. Third, the arcs of the upper and lower jaws are estimated and the dataset is panoramically re-sampled. Separation of upper and lower jaws and initial segmentation of teeth are performed by employing the horizontal and vertical projections of the panoramic dataset, respectively. Based on the above-mentioned procedures, an initial mask for each tooth is obtained. Finally, we utilize the initial mask of teeth and apply a variational level set to refine initial teeth boundaries to final contours. The proposed algorithm was evaluated on 30 multi-slice CT datasets including 3,600 images. Experimental results reveal the effectiveness of the proposed method. In the proposed algorithm, the variational level set technique was utilized to trace the contour of the teeth. Because this technique is based on the characteristics of the overall region of the teeth image, it is possible to extract a very smooth and accurate tooth contour using this technique. On the available datasets, the proposed technique was successful in teeth segmentation compared to previous techniques. (orig.)

  15. Strain-Modulated Epitaxy

    National Research Council Canada - National Science Library

    Brown, April

    1999-01-01

    Strain-Modulated Epitaxy (SME) is a novel approach, invented at Georgia Tech, to utilize subsurface stressors to control strain and therefore material properties and growth kinetics in the material above the stressors...

  16. Hamstring strain - aftercare

    Science.gov (United States)

    Pulled hamstring muscle; Sprain - hamstring ... There are 3 levels of hamstring strains: Grade 1 -- mild muscle strain or pull Grade 2 -- partial muscle tear Grade 3 -- complete muscle tear Recovery time depends ...

  17. Molecular typing of Brucella melitensis endemic strains and differentiation from the vaccine strain Rev-1.

    Science.gov (United States)

    Noutsios, Georgios T; Papi, Rigini M; Ekateriniadou, Loukia V; Minas, Anastasios; Kyriakidis, Dimitrios A

    2012-03-01

    In the present study, forty-four Greek endemic strains of Br. melitensis and three reference strains were genotyped by Multi-locus Variable Number Tandem Repeat (ML-VNTR) analysis based on an eight-base pair tandem repeat sequence that was revealed in eight loci of the Br. melitensis genome. The forty-four strains were discriminated from the vaccine strain Rev-1 by Restriction Fragment Length Polymorphism (RFLP) and Denaturing Gradient Gel Electrophoresis (DGGE). The ML-VNTR analysis revealed that endemic, reference and vaccine strains are genetically closely related, while most of the loci tested (1, 2, 4, 5 and 7) are highly polymorphic, with Hunter-Gaston Genetic Diversity Index (HGDI) values ranging from 0.775 to 0.939. Analysis of the stability of the ML-VNTR loci through in vitro passages proved that loci 1 and 5 are not stable. Therefore, the vaccine strain can be discriminated from endemic strains by the allele clusters of loci 2, 4, 6 and 7. RFLP and DGGE were also employed to analyse the omp2 gene and revealed different patterns among Rev-1 and endemic strains. In RFLP, Rev-1 yielded three fragments (282, 238 and 44 bp), while endemic strains yielded two fragments (238 and 44 bp). As for DGGE, the electrophoretic mobility of Rev-1 is different from that of the endemic strains due to heterologous binding of the DNA chains of the omp2a and omp2b genes. Overall, our data show clearly that it is feasible to genotype endemic strains of Br. melitensis and differentiate them from the vaccine strain Rev-1 with ML-VNTR, RFLP and DGGE techniques. These tools can be used for conventional investigations in brucellosis outbreaks.
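
    The Hunter-Gaston index quoted above measures the probability that two strains drawn at random without replacement show different alleles at a locus: D = 1 - [1 / (N (N - 1))] * sum over alleles of n_j (n_j - 1), where N is the number of strains and n_j the number of strains carrying allele j. A minimal Python sketch follows; the function name and the allele calls are hypothetical and are not data from the study.

```python
from collections import Counter


def hunter_gaston_index(alleles):
    """Hunter-Gaston diversity index for one locus.

    `alleles` holds one allele call per strain, e.g. repeat counts.
    """
    n = len(alleles)
    if n < 2:
        raise ValueError("at least two strains are required")
    counts = Counter(alleles)
    same_pairs = sum(c * (c - 1) for c in counts.values())
    return 1 - same_pairs / (n * (n - 1))


# Hypothetical repeat counts at one locus for six strains.
locus_calls = [3, 3, 4, 5, 5, 7]
diversity = hunter_gaston_index(locus_calls)
```

    Values close to 1 indicate that almost every strain carries a distinct allele, which is why loci with high index values are useful for discrimination.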

  18. Job strain as a risk factor for clinical depression

    DEFF Research Database (Denmark)

    Madsen, I. E. H.; Nyberg, S. T.; Magnusson Hanson, L. L.

    2017-01-01

    BACKGROUND: Adverse psychosocial working environments characterized by job strain (the combination of high demands and low control at work) are associated with an increased risk of depressive symptoms among employees, but evidence on clinically diagnosed depression is scarce. We examined job strain...... as a risk factor for clinical depression. METHOD: We identified published cohort studies from a systematic literature search in PubMed and PsycNET and obtained 14 cohort studies with unpublished individual-level data from the Individual-Participant-Data Meta-analysis in Working Populations (IPD...... unpublished datasets we included 120 221 individuals and 982 first episodes of hospital-treated clinical depression. Job strain was associated with an increased risk of clinical depression in both published [relative risk (RR) = 1.77, 95% confidence interval (CI) 1.47-2.13] and unpublished datasets (RR = 1...

  19. Mining and Utilizing Dataset Relevancy from Oceanographic Dataset Metadata, Usage Metrics, and User Feedback to Improve Data Discovery and Access

    Data.gov (United States)

    National Aeronautics and Space Administration — We propose to mine and utilize the combination of Earth Science dataset metadata, usage metrics and user feedback to objectively extract relevance for improved...

  20. Vibrio cholerae Classical Biotype Strains Reveal Distinct Signatures in Mexico

    OpenAIRE

    Alam, Munirul; Islam, M. Tarequl; Rashed, Shah Manzur; Johura, Fatema-tuz; Bhuiyan, Nurul A.; Delgado, Gabriela; Morales, Rosario; Mendez, Jose Luis; Navarro, Armando; Watanabe, Haruo; Hasan, Nur-A; Colwell, Rita R.; Cravioto, Alejandro

    2012-01-01

    Vibrio cholerae O1 classical (CL) biotype caused the fifth and sixth pandemics, and probably the earlier cholera pandemics, before the El Tor (ET) biotype initiated the seventh pandemic in Asia in the 1970s by completely displacing the CL biotype. Although the CL biotype was thought to be extinct in Asia and although it had never been reported from Latin America, V. cholerae CL and ET biotypes, including a hybrid ET, were found associated with areas of cholera endemicity in Mexico between 199...

  1. EEG datasets for motor imagery brain-computer interface.

    Science.gov (United States)

    Cho, Hohyun; Ahn, Minkyu; Ahn, Sangtae; Kwon, Moonyoung; Jun, Sung Chan

    2017-07-01

    Most investigators of brain-computer interface (BCI) research believe that BCI can be achieved through induced neuronal activity from the cortex, but not by evoked neuronal activity. Motor imagery (MI)-based BCI is one of the standard concepts of BCI, in that the user can generate induced activity by imagining motor movements. However, variations in performance over sessions and subjects are too severe to overcome easily; therefore, a basic understanding and investigation of BCI performance variation is necessary to find critical evidence of performance variation. Here we present not only EEG datasets for MI BCI from 52 subjects, but also the results of a psychological and physiological questionnaire, EMG datasets, the locations of 3D EEG electrodes, and EEGs for non-task-related states. We validated our EEG datasets by using the percentage of bad trials, event-related desynchronization/synchronization (ERD/ERS) analysis, and classification analysis. After conventional rejection of bad trials, we showed contralateral ERD and ipsilateral ERS in the somatosensory area, which are well-known patterns of MI. Finally, we showed that 73.08% of datasets (38 subjects) included reasonably discriminative information. Our EEG datasets included the information necessary to determine statistical significance; they consisted of well-discriminated datasets (38 subjects) and less-discriminative datasets. These may provide researchers with opportunities to investigate human factors related to MI BCI performance variation, and may also achieve subject-to-subject transfer by using metadata, including a questionnaire, EEG coordinates, and EEGs for non-task-related states. © The Authors 2017. Published by Oxford University Press.

  2. TL transgenic mouse strains

    International Nuclear Information System (INIS)

    Obata, Y.; Matsudaira, Y.; Hasegawa, H.; Tamaki, H.; Takahashi, T.; Morita, A.; Kasai, K.

    1993-01-01

    As a result of abnormal development of the thymus of these mice, the TCR αβ lineage of T cell differentiation is disturbed and cells belonging to the TCR γδ CD4⁻CD8⁻ double negative (DN) lineage become preponderant. The γδ DN cells migrate into peripheral lymphoid organs and constitute nearly 50% of peripheral T cells. Immune function of the transgenic mice is severely impaired, indicating that the γδ cells are incapable of participating in these reactions. Molecular and serological analyses of T-cell lymphomas reveal that they belong to the γδ lineage. Tg.Tlaᵃ-3-1 mice should be useful in defining the role of TL in normal and abnormal T cell differentiation as well as in the development of T-cell lymphomas, and further they should facilitate studies on the differentiation and function of γδ T cells. We isolated the T3ᵇ-TL gene from B6 mice and constructed a chimeric gene in which T3ᵇ-TL is driven by the promoter of H-2Kᵇ. With the chimeric gene, two transgenic mouse strains, Tg.Con.3-1 and -2, have been derived in the C3H background. Both strains express TL antigen in various tissues including skin. Skin grafts from the transgenic mice on C3H and (B6 × C3H)F₁ mice were rejected. In the mice which rejected the grafts, CD8⁺ TCRαβ cytotoxic T cells (CTL) against TL antigens were recognized. The recognition of TL by CTL did not require antigen presentation by H-2 molecules. The results indicated that TL antigen in the skin becomes a transplantation antigen and behaves like a typical allogeneic MHC class I antigen. The facts that (B6 × C3H)F₁ mice rejected the skin expressing T3ᵇ-TL antigen and induced CTL that killed TL⁺ lymphomas of B6 origin revealed that the TL antigen encoded by T3ᵇ-TL is recognized as non-self in B6 mice. Experiments are now being extended to analyze immune responses to TL antigen expressed on autochthonous T cell lymphomas. (J.P.N.)

  3. Ancient genomes reveal a high diversity of Mycobacterium leprae in medieval Europe.

    Science.gov (United States)

    Schuenemann, Verena J; Avanzi, Charlotte; Krause-Kyora, Ben; Seitz, Alexander; Herbig, Alexander; Inskip, Sarah; Bonazzi, Marion; Reiter, Ella; Urban, Christian; Dangvard Pedersen, Dorthe; Taylor, G Michael; Singh, Pushpendra; Stewart, Graham R; Velemínský, Petr; Likovsky, Jakub; Marcsik, Antónia; Molnár, Erika; Pálfi, György; Mariotti, Valentina; Riga, Alessandro; Belcastro, M Giovanna; Boldsen, Jesper L; Nebel, Almut; Mays, Simon; Donoghue, Helen D; Zakrzewski, Sonia; Benjak, Andrej; Nieselt, Kay; Cole, Stewart T; Krause, Johannes

    2018-05-01

    Studying ancient DNA allows us to retrace the evolutionary history of human pathogens, such as Mycobacterium leprae, the main causative agent of leprosy. Leprosy is one of the oldest recorded and most stigmatizing diseases in human history. The disease was prevalent in Europe until the 16th century and is still endemic in many countries with over 200,000 new cases reported annually. Previous worldwide studies on modern and European medieval M. leprae genomes revealed that they cluster into several distinct branches of which two were present in medieval Northwestern Europe. In this study, we analyzed 10 new medieval M. leprae genomes including the so far oldest M. leprae genome from one of the earliest known cases of leprosy in the United Kingdom: a skeleton from the Great Chesterford cemetery with a calibrated age of 415-545 C.E. This dataset provides a genetic time transect of M. leprae diversity in Europe over the past 1500 years. We find M. leprae strains from four distinct branches to be present in the Early Medieval Period, and strains from three different branches were detected within a single cemetery from the High Medieval Period. Altogether these findings suggest a higher genetic diversity of M. leprae strains in medieval Europe at various time points than previously assumed. The resulting more complex picture of the past phylogeography of leprosy in Europe impacts current phylogeographical models of M. leprae dissemination. It suggests alternative models for the past spread of leprosy such as a widespread prevalence of strains from different branches in Eurasia already in Antiquity or maybe even an origin in Western Eurasia. Furthermore, these results highlight how studying ancient M. leprae strains improves understanding of the history of leprosy worldwide.

  4. Ancient genomes reveal a high diversity of Mycobacterium leprae in medieval Europe.

    Directory of Open Access Journals (Sweden)

    Verena J Schuenemann

    2018-05-01

    Full Text Available Studying ancient DNA allows us to retrace the evolutionary history of human pathogens, such as Mycobacterium leprae, the main causative agent of leprosy. Leprosy is one of the oldest recorded and most stigmatizing diseases in human history. The disease was prevalent in Europe until the 16th century and is still endemic in many countries with over 200,000 new cases reported annually. Previous worldwide studies on modern and European medieval M. leprae genomes revealed that they cluster into several distinct branches of which two were present in medieval Northwestern Europe. In this study, we analyzed 10 new medieval M. leprae genomes including the so far oldest M. leprae genome from one of the earliest known cases of leprosy in the United Kingdom: a skeleton from the Great Chesterford cemetery with a calibrated age of 415-545 C.E. This dataset provides a genetic time transect of M. leprae diversity in Europe over the past 1500 years. We find M. leprae strains from four distinct branches to be present in the Early Medieval Period, and strains from three different branches were detected within a single cemetery from the High Medieval Period. Altogether these findings suggest a higher genetic diversity of M. leprae strains in medieval Europe at various time points than previously assumed. The resulting more complex picture of the past phylogeography of leprosy in Europe impacts current phylogeographical models of M. leprae dissemination. It suggests alternative models for the past spread of leprosy such as a widespread prevalence of strains from different branches in Eurasia already in Antiquity or maybe even an origin in Western Eurasia. Furthermore, these results highlight how studying ancient M. leprae strains improves understanding of the history of leprosy worldwide.

  5. Effect of genetic strain and gender on age-related changes in body composition of the laboratory rat.

    Data.gov (United States)

    U.S. Environmental Protection Agency — Body composition data for common laboratory strains of rat as a function of age. This dataset is associated with the following publication: Gordon , C., K. Jarema ,...

  6. MODERNIZATION OF GENOTYPING OF STRAINS B. PERTUSSIS

    Directory of Open Access Journals (Sweden)

    G. A. Ivashinnikova

    2013-01-01

    Full Text Available A new rapid molecular genotyping method was developed for studying the structure of the ptxP promoter of pertussis toxin. The method is based on PCR-RFLP analysis, which makes it possible to study the specific restriction profiles of B. pertussis strains and to differentiate strains with ptxP structural particularities. The developed method for genotyping B. pertussis strains can be helpful for monitoring strains of the causative agent of whooping cough within the epidemiological surveillance of pertussis infections: it allows observation of the circulating B. pertussis population, reveals strains with high pertussis toxin production, and makes it possible to track their distribution.

  7. Comparison of CORA and EN4 in-situ datasets validation methods, toward a better quality merged dataset.

    Science.gov (United States)

    Szekely, Tanguy; Killick, Rachel; Gourrion, Jerome; Reverdin, Gilles

    2017-04-01

    CORA and EN4 are both global delayed time mode validated in-situ ocean temperature and salinity datasets distributed by the Met Office (http://www.metoffice.gov.uk/) and Copernicus (www.marine.copernicus.eu). A large part of the profiles distributed by CORA and EN4 in recent years are Argo profiles from the ARGO DAC, but profiles are also extracted from the World Ocean Database and TESAC profiles from GTSPP. In the case of CORA, data coming from the EUROGOOS Regional Operational Observing Systems (ROOS) operated by European institutes not managed by National Data Centres and other datasets of profiles provided by scientific sources can also be found (sea mammal profiles from MEOP, XBT datasets from cruises ...). (EN4 also takes data from the ASBO dataset to supplement observations in the Arctic.) The first advantage of this new merged product is to enhance the space and time coverage at global and European scales for the period from 1950 until a year before the current year. This product is updated once a year and T&S gridded fields are also generated for the period 1990 to year n-1. The enhancement compared to the previous CORA product will be presented. Despite the fact that the profiles distributed by both datasets are mostly the same, the quality control procedures developed by the Met Office and Copernicus teams differ, sometimes leading to different quality control flags for the same profile. A new study, started in 2016, aims to compare both validation procedures in order to move towards a Copernicus Marine Service dataset with the best features of CORA and EN4 validation. A reference dataset composed of the full set of in-situ temperature and salinity measurements collected by Coriolis during 2015 is used. These measurements have been made with a wide range of instruments (XBTs, CTDs, Argo floats, instrumented sea mammals, ...), covering the global ocean. The reference dataset has been validated simultaneously by both teams. An exhaustive comparison of the

  8. Wind and wave dataset for Matara, Sri Lanka

    Science.gov (United States)

    Luo, Yao; Wang, Dongxiao; Priyadarshana Gamage, Tilak; Zhou, Fenghua; Madusanka Widanage, Charith; Liu, Taiwei

    2018-01-01

    We present a continuous in situ hydro-meteorology observational dataset from a set of instruments first deployed in December 2012 in the south of Sri Lanka, facing toward the north Indian Ocean. In these waters, simultaneous records of wind and wave data are sparse due to difficulties in deploying measurement instruments, although the area hosts one of the busiest shipping lanes in the world. This study describes the survey, deployment, and measurements of wind and waves, with the aim of offering future users of the dataset the most comprehensive and as much information as possible. This dataset advances our understanding of the nearshore hydrodynamic processes and wave climate, including sea waves and swells, in the north Indian Ocean. Moreover, it is a valuable resource for ocean model parameterization and validation. The archived dataset (Table 1) is examined in detail, including wave data at two locations with water depths of 20 and 10 m comprising synchronous time series of wind, ocean astronomical tide, air pressure, etc. In addition, we use these wave observations to evaluate the ERA-Interim reanalysis product. Based on Buoy 2 data, the swells are the main component of waves year-round, although monsoons can markedly alter the proportion between swell and wind sea. The dataset (Luo et al., 2017) is publicly available from Science Data Bank (https://doi.org/10.11922/sciencedb.447).

  9. The LANDFIRE Refresh strategy: updating the national dataset

    Science.gov (United States)

    Nelson, Kurtis J.; Connot, Joel A.; Peterson, Birgit E.; Martin, Charley

    2013-01-01

    The LANDFIRE Program provides comprehensive vegetation and fuel datasets for the entire United States. As with many large-scale ecological datasets, vegetation and landscape conditions must be updated periodically to account for disturbances, growth, and natural succession. The LANDFIRE Refresh effort was the first attempt to consistently update these products nationwide. It incorporated a combination of specific systematic improvements to the original LANDFIRE National data, remote sensing based disturbance detection methods, field collected disturbance information, vegetation growth and succession modeling, and vegetation transition processes. This resulted in the creation of two complete datasets for all 50 states: LANDFIRE Refresh 2001, which includes the systematic improvements, and LANDFIRE Refresh 2008, which includes the disturbance and succession updates to the vegetation and fuel data. The new datasets are comparable for studying landscape changes in vegetation type and structure over a decadal period, and provide the most recent characterization of fuel conditions across the country. The applicability of the new layers is discussed and the effects of using the new fuel datasets are demonstrated through a fire behavior modeling exercise using the 2011 Wallow Fire in eastern Arizona as an example.

  10. Interactive visualization and analysis of multimodal datasets for surgical applications.

    Science.gov (United States)

    Kirmizibayrak, Can; Yim, Yeny; Wakid, Mike; Hahn, James

    2012-12-01

    Surgeons use information from multiple sources when making surgical decisions. These include volumetric datasets (such as CT, PET, MRI, and their variants), 2D datasets (such as endoscopic videos), and vector-valued datasets (such as computer simulations). Presenting all the information to the user in an effective manner is a challenging problem. In this paper, we present a visualization approach that displays the information from various sources in a single coherent view. The system allows the user to explore and manipulate volumetric datasets, display analysis of dataset values in local regions, combine 2D and 3D imaging modalities and display results of vector-based computer simulations. Several interaction methods are discussed: in addition to traditional interfaces including mouse and trackers, gesture-based natural interaction methods are shown to control these visualizations with real-time performance. An example of a medical application (medialization laryngoplasty) is presented to demonstrate how the combination of different modalities can be used in a surgical setting with our approach.

  11. Wind and wave dataset for Matara, Sri Lanka

    Directory of Open Access Journals (Sweden)

    Y. Luo

    2018-01-01

    Full Text Available We present a continuous in situ hydro-meteorology observational dataset from a set of instruments first deployed in December 2012 in the south of Sri Lanka, facing toward the north Indian Ocean. In these waters, simultaneous records of wind and wave data are sparse due to difficulties in deploying measurement instruments, although the area hosts one of the busiest shipping lanes in the world. This study describes the survey, deployment, and measurements of wind and waves, with the aim of offering future users of the dataset the most comprehensive and as much information as possible. This dataset advances our understanding of the nearshore hydrodynamic processes and wave climate, including sea waves and swells, in the north Indian Ocean. Moreover, it is a valuable resource for ocean model parameterization and validation. The archived dataset (Table 1) is examined in detail, including wave data at two locations with water depths of 20 and 10 m comprising synchronous time series of wind, ocean astronomical tide, air pressure, etc. In addition, we use these wave observations to evaluate the ERA-Interim reanalysis product. Based on Buoy 2 data, the swells are the main component of waves year-round, although monsoons can markedly alter the proportion between swell and wind sea. The dataset (Luo et al., 2017) is publicly available from Science Data Bank (https://doi.org/10.11922/sciencedb.447).

  12. Process mining in oncology using the MIMIC-III dataset

    Science.gov (United States)

    Prima Kurniati, Angelina; Hall, Geoff; Hogg, David; Johnson, Owen

    2018-03-01

    Process mining is a data analytics approach to discover and analyse process models based on the real activities captured in information systems. There is a growing body of literature on process mining in healthcare, including oncology, the study of cancer. In earlier work we found 37 peer-reviewed papers describing process mining research in oncology, with a regular complaint being the limited availability and accessibility of datasets with suitable information for process mining. Publicly available datasets are one option, and this paper describes the potential to use MIMIC-III for process mining in oncology. MIMIC-III is a large open access dataset of de-identified patient records. There are 134 publications listed as using the MIMIC dataset, but none of them have used process mining. The MIMIC-III dataset has 16 event tables which are potentially useful for process mining, and this paper demonstrates the opportunities to use MIMIC-III for process mining in oncology. Our research applied the L* lifecycle method to provide a worked example showing how process mining can be used to analyse cancer pathways. The results and data quality limitations are discussed along with opportunities for further work and reflection on the value of MIMIC-III for reproducible process mining research.

  13. A strain gauge

    DEFF Research Database (Denmark)

    2016-01-01

    The invention relates to a strain gauge of a carrier layer and a meandering measurement grid positioned on the carrier layer, wherein the strain gauge comprises two reinforcement members positioned on the carrier layer at opposite ends of the measurement grid in the axial direction....... The reinforcement members are each placed within a certain axial distance to the measurement grid with the axial distance being equal to or smaller than a factor times the grid spacing. The invention further relates to a multi-axial strain gauge such as a bi-axial strain gauge or a strain gauge rosette where each...... of the strain gauges comprises reinforcement members. The invention further relates to a method for manufacturing a strain gauge as mentioned above....

  14. DNA type analysis to differentiate strains of Xylophilus ampelinus from Europe and Hokkaido, Japan

    OpenAIRE

    Komatsu, Tsutomu; Shinmura, Akinori; Kondo, Norio

    2016-01-01

    Strains of the bacterium Xylophilus ampelinus were collected from Europe and Hokkaido, Japan. Genomic fingerprints generated from 43 strains revealed four DNA types (A-D) using the combined results of Rep-, ERIC-, and Box-PCR. Genetic variation was found among the strains examined; strains collected from Europe belonged to DNA types A or B, and strains collected from Hokkaido belonged to DNA types C or D. However, strains belonging to each DNA type showed the same pathogenicity to grapevines ...

  15. Serological characterization of Actinobacillus pleuropneumoniae biotype 1 strains antigenically related to both serotypes 2 and 7

    DEFF Research Database (Denmark)

    Nielsen, R.; Andresen, Lars Ole; Plambeck, Tamara

    1996-01-01

    Nine Danish Actinobacillus pleuropneumoniae biotype 1 isolates were shown by latex agglutination and indirect haemagglutination to possess capsular polysaccharide epitopes identical to those of serotype 2 strain 1536 (reference strain of serotype 2) and strain 4226 (Danish serotype 2 strain). Imm...... in the LPS of strains 1536 and 7317 were revealed. Since an antigenic determinant specific for the 9 isolates could not be demonstrated with the methods used, the strains are proposed to be designated K2:O7....

  16. Recent Development on the NOAA's Global Surface Temperature Dataset

    Science.gov (United States)

    Zhang, H. M.; Huang, B.; Boyer, T.; Lawrimore, J. H.; Menne, M. J.; Rennie, J.

    2016-12-01

    Global Surface Temperature (GST) is one of the most widely used indicators for climate trend and extreme analyses. A widely used GST dataset is the NOAA merged land-ocean surface temperature dataset known as NOAAGlobalTemp (formerly MLOST). NOAAGlobalTemp was recently updated from version 3.5.4 to version 4. The update includes a significant improvement in the ocean surface component (Extended Reconstructed Sea Surface Temperature or ERSST, from version 3b to version 4), which resulted in increased temperature trends in recent decades. Since then, advancements in both the ocean component (ERSST) and land component (GHCN-Monthly) have been made, including the inclusion of Argo float SSTs and expanded EOT modes in ERSST, and the use of the ISTI databank in GHCN-Monthly. In this presentation, we describe the impact of those improvements on the merged global temperature dataset, in terms of global trends and other aspects.

  17. Synthetic ALSPAC longitudinal datasets for the Big Data VR project.

    Science.gov (United States)

    Avraam, Demetris; Wilson, Rebecca C; Burton, Paul

    2017-01-01

    Three synthetic datasets - of observation size 15,000, 155,000 and 1,555,000 participants, respectively - were created by simulating eleven cardiac and anthropometric variables from nine collection ages of the ALSPAC birth cohort study. The synthetic datasets retain similar data properties to the ALSPAC study data they are simulated from (co-variance matrices, as well as the mean and variance values of the variables) without including the original data itself or disclosing participant information. In this instance, the three synthetic datasets have been utilised in an academia-industry collaboration to build a prototype virtual reality data analysis software, but they could have a broader use in method and software development projects where sensitive data cannot be freely shared.
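
    The idea of synthetic data that preserves means and covariances can be illustrated with a minimal sketch. The abstract does not state how the ALSPAC variables were simulated, so the multivariate normal sampler below is an assumption chosen for simplicity, and the `synthesize` helper and the three stand-in variables are hypothetical.

```python
import numpy as np

def synthesize(real, n_rows, seed=0):
    """Draw synthetic rows whose mean and covariance approximate those of `real`.

    Assumption: a multivariate normal model; the source does not name its method.
    """
    rng = np.random.default_rng(seed)
    mean = real.mean(axis=0)            # per-variable means
    cov = np.cov(real, rowvar=False)    # variable-by-variable covariance matrix
    return rng.multivariate_normal(mean, cov, size=n_rows)

# Stand-in for sensitive data: 500 participants, 3 hypothetical variables.
_rng = np.random.default_rng(1)
real = _rng.normal(loc=[120.0, 70.0, 25.0], scale=[15.0, 10.0, 4.0], size=(500, 3))

# No original row is copied; only summary statistics drive the simulation.
synthetic = synthesize(real, n_rows=15000)
```

    Only the mean vector and covariance matrix of the real data enter the sampler, which is why the synthetic rows can be shared when the originals cannot.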

  18. The OXL format for the exchange of integrated datasets

    Directory of Open Access Journals (Sweden)

    Taubert Jan

    2007-12-01

    Full Text Available A prerequisite for systems biology is the integration and analysis of heterogeneous experimental data stored in hundreds of life-science databases and millions of scientific publications. Several standardised formats for the exchange of specific kinds of biological information exist. Such exchange languages facilitate the integration process; however, they are not designed to transport integrated datasets. A format for exchanging integrated datasets needs to (i) cover data from a broad range of application domains, (ii) be flexible and extensible to combine many different complex data structures, (iii) include metadata and semantic definitions, (iv) include inferred information, (v) identify the original data source for integrated entities and (vi) transport large integrated datasets. Unfortunately, none of the exchange formats from the biological domain (e.g. BioPAX, MAGE-ML, PSI-MI, SBML) or the generic approaches (RDF, OWL) fulfil these requirements in a systematic way.

  19. Dataset of transcriptional landscape of B cell early activation

    Directory of Open Access Journals (Sweden)

    Alexander S. Garruss

    2015-09-01

    Full Text Available Signaling via B cell receptors (BCR) and Toll-like receptors (TLRs) results in activation of B cells with distinct physiological outcomes, but transcriptional regulatory mechanisms that drive activation and distinguish these pathways remain unknown. At early time points after BCR and TLR ligand exposure, 0.5 and 2 h, RNA-seq was performed allowing observations on rapid transcriptional changes. At 2 h, ChIP-seq was performed to allow observations on important regulatory mechanisms potentially driving transcriptional change. The dataset includes RNA-seq, ChIP-seq of control (Input), RNA Pol II, H3K4me3, H3K27me3, and a separate RNA-seq for miRNA expression, which can be found at Gene Expression Omnibus Dataset GSE61608. Here, we provide details on the experimental and analysis methods used to obtain and analyze this dataset and to examine the transcriptional landscape of B cell early activation.

  20. The Global Precipitation Climatology Project (GPCP) Combined Precipitation Dataset

    Science.gov (United States)

    Huffman, George J.; Adler, Robert F.; Arkin, Philip; Chang, Alfred; Ferraro, Ralph; Gruber, Arnold; Janowiak, John; McNab, Alan; Rudolf, Bruno; Schneider, Udo

    1997-01-01

    The Global Precipitation Climatology Project (GPCP) has released the GPCP Version 1 Combined Precipitation Data Set, a global, monthly precipitation dataset covering the period July 1987 through December 1995. The primary product in the dataset is a merged analysis incorporating precipitation estimates from low-orbit-satellite microwave data, geosynchronous-orbit-satellite infrared data, and rain gauge observations. The dataset also contains the individual input fields, a combination of the microwave and infrared satellite estimates, and error estimates for each field. The data are provided on 2.5 deg x 2.5 deg latitude-longitude global grids. Preliminary analyses show general agreement with prior studies of global precipitation and extend prior studies of El Nino-Southern Oscillation precipitation patterns. At the regional scale there are systematic differences with standard climatologies.

  1. Visualization of conserved structures by fusing highly variable datasets.

    Science.gov (United States)

    Silverstein, Jonathan C; Chhadia, Ankur; Dech, Fred

    2002-01-01

    Skill, effort, and time are required to identify and visualize anatomic structures in three-dimensions from radiological data. Fundamentally, automating these processes requires a technique that uses symbolic information not in the dynamic range of the voxel data. We were developing such a technique based on mutual information for automatic multi-modality image fusion (MIAMI Fuse, University of Michigan). This system previously demonstrated facility at fusing one voxel dataset with integrated symbolic structure information to a CT dataset (different scale and resolution) from the same person. The next step of development of our technique was aimed at accommodating the variability of anatomy from patient to patient by using warping to fuse our standard dataset to arbitrary patient CT datasets. A standard symbolic information dataset was created from the full color Visible Human Female by segmenting the liver parenchyma, portal veins, and hepatic veins and overwriting each set of voxels with a fixed color. Two arbitrarily selected patient CT scans of the abdomen were used for reference datasets. We used the warping functions in MIAMI Fuse to align the standard structure data to each patient scan. The key to successful fusion was the focused use of multiple warping control points that place themselves around the structure of interest automatically. The user assigns only a few initial control points to align the scans. Fusion 1 and 2 transformed the atlas with 27 points around the liver to CT1 and CT2 respectively. Fusion 3 transformed the atlas with 45 control points around the liver to CT1 and Fusion 4 transformed the atlas with 5 control points around the portal vein. The CT dataset is augmented with the transformed standard structure dataset, such that the warped structure masks are visualized in combination with the original patient dataset. This combined volume visualization is then rendered interactively in stereo on the ImmersaDesk in an immersive Virtual

  2. A cross-country Exchange Market Pressure (EMP) dataset.

    Science.gov (United States)

    Desai, Mohit; Patnaik, Ila; Felman, Joshua; Shah, Ajay

    2017-06-01

    The data presented in this article are related to the research article titled "An exchange market pressure measure for cross country analysis" (Patnaik et al. [1]). In this article, we present the dataset of Exchange Market Pressure (EMP) values for 139 countries along with their conversion factors, ρ (rho). Exchange Market Pressure, expressed in percentage change in exchange rate, measures the change in exchange rate that would have taken place had the central bank not intervened. The conversion factor ρ can be interpreted as the change in exchange rate associated with $1 billion of intervention. Estimates of the conversion factor ρ allow us to calculate a monthly time series of EMP for 139 countries. Additionally, the dataset contains the 68% confidence interval (high and low values) for the point estimates of the ρ's. Using the standard errors of the estimates of the ρ's, we obtain one-sigma intervals around mean estimates of EMP values. These values are also reported in the dataset.
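
    The relationship described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' exact specification: it assumes EMP is the observed percentage change plus ρ times the intervention in billions of dollars, with a sign convention chosen here for convenience. The function names `emp` and `emp_interval` and all numbers are hypothetical.

```python
def emp(pct_change, rho, intervention_bn):
    """Exchange market pressure in percent (assumed additive form).

    pct_change: observed percentage change in the exchange rate
    rho: percentage change associated with $1 billion of intervention
    intervention_bn: intervention in billions of dollars (sign convention assumed)
    """
    return pct_change + rho * intervention_bn

def emp_interval(pct_change, rho, rho_se, intervention_bn):
    """One-sigma EMP interval obtained by shifting rho by its standard error."""
    a = emp(pct_change, rho - rho_se, intervention_bn)
    b = emp(pct_change, rho + rho_se, intervention_bn)
    return (min(a, b), max(a, b))

# Hypothetical month: the rate moved 1.0% and the bank intervened with $2 billion.
point = emp(1.0, 0.5, 2.0)
low, high = emp_interval(1.0, 0.5, 0.25, 2.0)
```

    With these made-up inputs, the counterfactual change is larger than the observed one because the intervention is assumed to have absorbed part of the pressure.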

  3. Solitary waves in morphogenesis: Determination fronts as strain-cued strain transformations among automatous cells

    Science.gov (United States)

    Cox, Brian N.; Landis, Chad M.

    2018-02-01

    We present a simple theory of a strain pulse propagating as a solitary wave through a continuous two-dimensional population of cells. A critical strain is assumed to trigger a strain transformation, while, simultaneously, cells move as automata to tend to restore a preferred cell density. We consider systems in which the strain transformation is a shape change, a burst of proliferation, or the commencement of growth (which changes the shape of the population sheet), and demonstrate isomorphism among these cases. Numerical and analytical solutions describe a strain pulse whose height does not depend on how the strain disturbance was first launched, or the rate at which the strain transformation is achieved, or the rate constant in the rule for the restorative cell motion. The strain pulse is therefore very stable, surviving the imposition of strong perturbations: it would serve well as a timing signal in development. The automatous wave formulation is simple, with few model parameters. A strong case exists for the presence of a strain pulse during amelogenesis. Quantitative analysis reveals a simple relationship between the velocity of the leading edge of the pulse in amelogenesis and the known speed of migration of ameloblast cells. This result and energy arguments support the depiction of wave motion as an automatous cell response to strain, rather than as a response to an elastic energy gradient. The theory may also contribute to understanding the determination front in somitogenesis, moving fronts of convergent-extension transformation, and mitotic wavefronts in the syncytial drosophila embryo.

  4. Dataset of herbarium specimens of threatened vascular plants in Catalonia.

    Science.gov (United States)

    Nualart, Neus; Ibáñez, Neus; Luque, Pere; Pedrol, Joan; Vilar, Lluís; Guàrdia, Roser

    2017-01-01

    This data paper describes a specimen dataset of the threatened vascular plants of Catalonia conserved in five public Catalonian herbaria (BC, BCN, HGI, HBIL and MTTE). Catalonia is an administrative region of Spain that has a large diversity of autochthonous plants and 199 taxa in IUCN threatened categories (EX, EW, RE, CR, EN and VU). This dataset includes 1,618 records collected from the 17th century to the present. For each specimen, the species name, locality indication, collection date, collector, ecology and revision label are recorded. More than 94% of the taxa are represented in the herbaria, which demonstrates the role of botanical collections as an essential source of occurrence data.

  5. A Large-Scale 3D Object Recognition dataset

    DEFF Research Database (Denmark)

    Sølund, Thomas; Glent Buch, Anders; Krüger, Norbert

    2016-01-01

    geometric groups; concave, convex, cylindrical and flat 3D object models. The object models have varying amount of local geometric features to challenge existing local shape feature descriptors in terms of descriptiveness and robustness. The dataset is validated in a benchmark which evaluates the matching...... performance of 7 different state-of-the-art local shape descriptors. Further, we validate the dataset in a 3D object recognition pipeline. Our benchmark shows as expected that local shape feature descriptors without any global point relation across the surface have a poor matching performance with flat...

  6. Traffic sign classification with dataset augmentation and convolutional neural network

    Science.gov (United States)

    Tang, Qing; Kurnianggoro, Laksono; Jo, Kang-Hyun

    2018-04-01

    This paper presents a method for traffic sign classification using a convolutional neural network (CNN). In this method, we first convert a color image into grayscale and then normalize it to the range (-1,1) as the preprocessing step. To increase the robustness of the classification model, we apply a dataset augmentation algorithm and create new images to train the model. To avoid overfitting, we utilize a dropout module before the last fully connected layer. To assess the performance of the proposed method, the German traffic sign recognition benchmark (GTSRB) dataset is utilized. Experimental results show that the method is effective in classifying traffic signs.
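
    The preprocessing step described above (grayscale conversion followed by scaling to the range -1 to 1) can be sketched as follows. The abstract does not say which grayscale weights were used, so the ITU-R BT.601 luma weights here are an assumption, and `preprocess` is a hypothetical helper name.

```python
import numpy as np

def preprocess(rgb):
    """Convert an HxWx3 uint8 RGB image to grayscale scaled to [-1, 1]."""
    # Assumed weights (ITU-R BT.601 luma); the source does not specify them.
    weights = np.array([0.299, 0.587, 0.114])
    gray = rgb.astype(np.float64) @ weights   # shape (H, W), values in [0, 255]
    return gray / 127.5 - 1.0                 # 0 maps to -1, 255 maps to +1

# Hypothetical 32x32 inputs: an all-black and an all-white image.
black = preprocess(np.zeros((32, 32, 3), dtype=np.uint8))
white = preprocess(np.full((32, 32, 3), 255, dtype=np.uint8))
```

    Centering the inputs around zero in this way is a common choice before feeding images to a CNN.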

  7. Towards interoperable and reproducible QSAR analyses: Exchange of datasets.

    Science.gov (United States)

    Spjuth, Ola; Willighagen, Egon L; Guha, Rajarshi; Eklund, Martin; Wikberg, Jarl Es

    2010-06-30

    QSAR is a widely used method to relate chemical structures to responses or properties based on experimental observations. Much effort has been made to evaluate and validate the statistical modeling in QSAR, but these analyses treat the dataset as fixed. An overlooked but highly important issue is the validation of the setup of the dataset, which comprises addition of chemical structures as well as selection of descriptors and software implementations prior to calculations. This process is hampered by the lack of standards and exchange formats in the field, which makes it virtually impossible to reproduce and validate analyses and drastically constrains collaborations and re-use of data. We present a step towards standardizing QSAR analyses by defining interoperable and reproducible QSAR datasets, consisting of an open XML format (QSAR-ML) which builds on an open and extensible descriptor ontology. The ontology provides an extensible way of uniquely defining descriptors for use in QSAR experiments, and the exchange format supports multiple versioned implementations of these descriptors. Hence, a dataset described by QSAR-ML makes its setup completely reproducible. We also provide a reference implementation as a set of plugins for Bioclipse which simplifies setup of QSAR datasets, and allows for exporting in QSAR-ML as well as old-fashioned CSV formats. The implementation facilitates addition of new descriptor implementations from locally installed software and remote Web services; the latter is demonstrated with REST and XMPP Web services. Standardized QSAR datasets open up new ways to store, query, and exchange data for subsequent analyses. QSAR-ML supports completely reproducible creation of datasets, solving the problems of defining which software components were used and their versions, and the descriptor ontology eliminates confusions regarding descriptors by defining them crisply. This makes it easy to join, extend, combine datasets and hence work collectively, but

  8. Towards interoperable and reproducible QSAR analyses: Exchange of datasets

    Directory of Open Access Journals (Sweden)

    Spjuth Ola

    2010-06-01

    Full Text Available Abstract. Background: QSAR is a widely used method to relate chemical structures to responses or properties based on experimental observations. Much effort has been made to evaluate and validate the statistical modeling in QSAR, but these analyses treat the dataset as fixed. An overlooked but highly important issue is the validation of the setup of the dataset, which comprises addition of chemical structures as well as selection of descriptors and software implementations prior to calculations. This process is hampered by the lack of standards and exchange formats in the field, which makes it virtually impossible to reproduce and validate analyses and drastically constrains collaborations and re-use of data. Results: We present a step towards standardizing QSAR analyses by defining interoperable and reproducible QSAR datasets, consisting of an open XML format (QSAR-ML) which builds on an open and extensible descriptor ontology. The ontology provides an extensible way of uniquely defining descriptors for use in QSAR experiments, and the exchange format supports multiple versioned implementations of these descriptors. Hence, a dataset described by QSAR-ML makes its setup completely reproducible. We also provide a reference implementation as a set of plugins for Bioclipse which simplifies setup of QSAR datasets, and allows for exporting in QSAR-ML as well as old-fashioned CSV formats. The implementation facilitates addition of new descriptor implementations from locally installed software and remote Web services; the latter is demonstrated with REST and XMPP Web services. Conclusions: Standardized QSAR datasets open up new ways to store, query, and exchange data for subsequent analyses. QSAR-ML supports completely reproducible creation of datasets, solving the problems of defining which software components were used and their versions, and the descriptor ontology eliminates confusions regarding descriptors by defining them crisply. This makes it easy to join

  9. The Wind Integration National Dataset (WIND) toolkit (Presentation)

    Energy Technology Data Exchange (ETDEWEB)

    Caroline Draxl: NREL

    2014-01-01

    Regional wind integration studies require detailed wind power output data at many locations to perform simulations of how the power system will operate under high penetration scenarios. The wind datasets that serve as inputs into the study must realistically reflect the ramping characteristics, spatial and temporal correlations, and capacity factors of the simulated wind plants, as well as being time synchronized with available load profiles. As described in this presentation, the WIND Toolkit fulfills these requirements by providing a state-of-the-art national (US) wind resource, power production and forecast dataset.

  10. (Project 14-6770) An Investigation to Establish Multiphysical Property Dataset of Nuclear Materials Based on in-situ Observations and Measurements

    Energy Technology Data Exchange (ETDEWEB)

    Tomar, Vikas [Purdue Univ., West Lafayette, IN (United States)]; Haque, Aman [Pennsylvania State Univ., University Park, PA (United States). Dept of Physics]; Hattar, Khalid [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)]

    2017-11-10

    In-core nuclear materials including fuel pins and cladding materials fail due to issues including corrosion, mechanical wear, and pellet cladding interaction. In most such scenarios, microstructure-dependent and corrosion-induced chemistry-dependent property changes significantly affect the performance of cladding, pellet, and housing. The emphasis of this work was on replacing conventional pellet-cladding material models with a new strain-gradient viscoplasticity model that is informed by transmission electron microscopy (TEM) based measurements and by nanomechanical Raman spectroscopy (NMRS) based measurements. The TEM measurements are quantitative in nature and therefore reveal stress-strain relations with simultaneous insights into mechanisms of deformation at the nanoscale. The NMRS measurements reveal similar information at the mesoscale along with additional information relating local microstructural stresses to applied stresses. The resulting information is used to fit constants in the strain-gradient viscoplasticity model as well as to validate it. During TEM measurements, a micro-electro-mechanical system based setup was developed with mechanical actuation, sensing, heating, and electrical loading. Contrary to post-mortem analysis or qualitative visualization, this setup combines direct visualization of the mechanisms behind deformation with measurement of stress, strain, thermal and electrical properties. The unique research philosophy of visualizing the microstructure at high resolution while measuring the properties led to fundamental understanding of grain size and temperature effects on measured mechanical properties such as fracture toughness. A key contribution is the role of mechanical loading boundary conditions to deconvolute the in-situ TEM based nanoscale and NMRS based mesoscale data to bulk behavior. First the literature based pellet cladding mechanical interaction model based on the work of Retel’s and Williamson’s in literature work to predict

  11. MiSTIC, an integrated platform for the analysis of heterogeneity in large tumour transcriptome datasets.

    Science.gov (United States)

    Lemieux, Sebastien; Sargeant, Tobias; Laperrière, David; Ismail, Houssam; Boucher, Geneviève; Rozendaal, Marieke; Lavallée, Vincent-Philippe; Ashton-Beaucage, Dariel; Wilhelm, Brian; Hébert, Josée; Hilton, Douglas J; Mader, Sylvie; Sauvageau, Guy

    2017-07-27

    Genome-wide transcriptome profiling has enabled non-supervised classification of tumours, revealing different sub-groups characterized by specific gene expression features. However, the biological significance of these subtypes remains for the most part unclear. We describe herein an interactive platform, Minimum Spanning Trees Inferred Clustering (MiSTIC), that integrates the direct visualization and comparison of the gene correlation structure between datasets, the analysis of the molecular causes underlying co-variations in gene expression in cancer samples, and the clinical annotation of tumour sets defined by the combined expression of selected biomarkers. We have used MiSTIC to highlight the roles of specific transcription factors in breast cancer subtype specification, to compare the aspects of tumour heterogeneity targeted by different prognostic signatures, and to highlight biomarker interactions in AML. A version of MiSTIC preloaded with datasets described herein can be accessed through a public web server (http://mistic.iric.ca); in addition, the MiSTIC software package can be obtained (github.com/iric-soft/MiSTIC) for local use with personalized datasets. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  12. A Novel Technique for Time-Centric Analysis of Massive Remotely-Sensed Datasets

    Directory of Open Access Journals (Sweden)

    Glenn E. Grant

    2015-04-01

    Full Text Available Analyzing massive remotely-sensed datasets presents formidable challenges. The volume of satellite imagery collected often outpaces analytical capabilities; however, thorough analyses of complete datasets may provide new insights into processes that would otherwise be unseen. In this study we present a novel, object-oriented approach to storing, retrieving, and analyzing large remotely-sensed datasets. The objective is to provide a new structure for scalable storage and rapid, Internet-based analysis of climatology data. The concept of a “data rod” is introduced: a conceptual data object that organizes time-series information into a temporally-oriented vertical column at any given location. To demonstrate one possible use, we ingest 25 years of Greenland imagery into a series of pure-object databases, then retrieve and analyze the data. The results provide a basis for evaluating the database performance and scientific analysis capabilities. The project succeeds in demonstrating the effectiveness of the prototype database architecture and analysis approach, not because new scientific information is discovered, but because quality control issues are revealed in the source data that had gone undetected for years.
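    A "data rod" as described above can be pictured as one object per location that holds that location's whole time series. The sketch below is a minimal in-memory illustration only; the class name, fields and methods are hypothetical and do not reflect the study's object database.

```python
from bisect import bisect_left, bisect_right, insort
from dataclasses import dataclass, field
from datetime import date


@dataclass
class DataRod:
    """Hypothetical data rod: all observations at one location, ordered by time."""
    lat: float
    lon: float
    samples: list = field(default_factory=list)  # sorted (date, value) pairs

    def add(self, when: date, value: float) -> None:
        # Keep the column time-ordered regardless of ingest order.
        insort(self.samples, (when, value))

    def between(self, start: date, end: date) -> list:
        # Slice the rod for an inclusive date range using binary search.
        lo = bisect_left(self.samples, (start, float("-inf")))
        hi = bisect_right(self.samples, (end, float("inf")))
        return self.samples[lo:hi]

    def mean(self, start: date, end: date):
        values = [v for _, v in self.between(start, end)]
        return sum(values) / len(values) if values else None
```

    Because the series for a location sits in one object, a time-centric query such as a multi-year mean touches a single rod instead of one image per time step.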

  13. Minimum datasets to establish a CAR-mediated mode of action for rodent liver tumors.

    Science.gov (United States)

    Peffer, Richard C; LeBaron, Matthew J; Battalora, Michael; Bomann, Werner H; Werner, Christoph; Aggarwal, Manoj; Rowe, Rocky R; Tinwell, Helen

    2018-07-01

    Methods for investigating the Mode of Action (MoA) for rodent liver tumors via constitutive androstane receptor (CAR) activation are outlined here, based on current scientific knowledge about CAR and feedback from regulatory agencies globally. The key events (i.e., CAR activation, altered gene expression, cell proliferation, altered foci and increased adenomas/carcinomas) can be demonstrated by measuring a combination of key events and associative events that are markers for the key events. For crop protection products, a primary dataset typically should include a short-term study in the species/strain that showed the tumor response at dose levels that bracket the tumorigenic and non-tumorigenic dose levels. The dataset may vary depending on the species and the test compound. As examples, Case Studies with nitrapyrin (in mice) and metofluthrin (in rats) are described. Based on qualitative differences between the species, the key events leading to tumors in mice or rats by this MoA are not operative in humans. In the future, newer approaches such as a CAR biomarker signature approach and/or in vitro CAR3 reporter assays for mouse, rat and human CAR may eventually be used to demonstrate a CAR MoA is operative, without the need for extensive additional studies in laboratory animals. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.

  14. MaxBin 2.0: an automated binning algorithm to recover genomes from multiple metagenomic datasets

    Energy Technology Data Exchange (ETDEWEB)

    Wu, Yu-Wei [Joint BioEnergy Inst. (JBEI), Emeryville, CA (United States); Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Simmons, Blake A. [Joint BioEnergy Inst. (JBEI), Emeryville, CA (United States); Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Singer, Steven W. [Joint BioEnergy Inst. (JBEI), Emeryville, CA (United States); Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)

    2015-10-29

    The recovery of genomes from metagenomic datasets is a critical step to defining the functional roles of the underlying uncultivated populations. We previously developed MaxBin, an automated binning approach for high-throughput recovery of microbial genomes from metagenomes. Here, we present an expanded binning algorithm, MaxBin 2.0, which recovers genomes from co-assembly of a collection of metagenomic datasets. Tests on simulated datasets revealed that MaxBin 2.0 is highly accurate in recovering individual genomes, and the application of MaxBin 2.0 to several metagenomes from environmental samples demonstrated that it could achieve two complementary goals: recovering more bacterial genomes compared to binning a single sample as well as comparing the microbial community composition between different sampling environments. Availability and implementation: MaxBin 2.0 is freely available at http://sourceforge.net/projects/maxbin/ under BSD license. Supplementary information: Supplementary data are available at Bioinformatics online.

  15. Using Multiple Big Datasets and Machine Learning to Produce a New Global Particulate Dataset: A Technology Challenge Case Study

    Science.gov (United States)

    Lary, D. J.

    2013-12-01

    A BigData case study is described where multiple datasets from several satellites, high-resolution global meteorological data, social media and in-situ observations are combined using machine learning on a distributed cluster using an automated workflow. The global particulate dataset is relevant to global public health studies and would not be possible to produce without the use of the multiple big datasets, in-situ data and machine learning. To greatly reduce the development time and enhance the functionality, a high-level language capable of parallel processing has been used (Matlab). A key consideration for the system is high-speed access due to the large data volume, persistence of the large data volumes and a precise process time scheduling capability.

  16. Three dimensional strained semiconductors

    Science.gov (United States)

    Voss, Lars; Conway, Adam; Nikolic, Rebecca J.; Leao, Cedric Rocha; Shao, Qinghui

    2016-11-08

    In one embodiment, an apparatus includes a three dimensional structure comprising a semiconductor material, and at least one thin film in contact with at least one exterior surface of the three dimensional structure for inducing a strain in the structure, the thin film being characterized as providing at least one of: an induced strain of at least 0.05%, and an induced strain in at least 5% of a volume of the three dimensional structure. In another embodiment, a method includes forming a three dimensional structure comprising a semiconductor material, and depositing at least one thin film on at least one surface of the three dimensional structure for inducing a strain in the structure, the thin film being characterized as providing at least one of: an induced strain of at least 0.05%, and an induced strain in at least 5% of a volume of the structure.

  17. Intramyocardial strain estimation from cardiac cine MRI.

    Science.gov (United States)

    Elnakib, Ahmed; Beache, Garth M; Gimel'farb, Georgy; El-Baz, Ayman

    2015-08-01

    Functional strain is one of the important clinical indicators for the quantification of heart performance and the early detection of cardiovascular diseases, and functional strain parameters are used to aid therapeutic decisions and follow-up evaluations after cardiac surgery. A comprehensive framework for deriving functional strain parameters at the endocardium, epicardium, and mid-wall of the left ventricle (LV) from conventional cine MRI data was developed and tested. Cine data were collected using short TR-/TE-balanced steady-state free precession acquisitions on a 1.5T Siemens Espree scanner. The LV wall borders are segmented using a level set-based deformable model guided by a stochastic force derived from a second-order Markov-Gibbs random field model that accounts for the object shape and appearance features. Then, the mid-wall of the segmented LV is determined based on estimating the centerline between the endocardium and epicardium of the LV. Finally, a geometrical Laplace-based method is proposed to track corresponding points on successive myocardial contours throughout the cardiac cycle in order to characterize the strain evolutions. The method was tested using simulated phantom images with predefined point locations of the LV wall throughout the cardiac cycle. The method was tested on 30 in vivo datasets to evaluate the feasibility of the proposed framework to index functional strain parameters. The cine MRI-based model agreed with the ground truth for functional metrics to within 0.30 % for indexing the peak systolic strain change and 0.29 % (per unit time) for indexing systolic and diastolic strain rates. The method was feasible for in vivo extraction of functional strain parameters. Strain indexes of the endocardium, mid-wall, and epicardium can be derived from routine cine images using automated techniques, thereby improving the utility of cine MRI data for characterization of myocardial function. Unlike traditional texture-based tracking, the

  18. Would the ‘real’ observed dataset stand up? A critical examination of eight observed gridded climate datasets for China

    International Nuclear Information System (INIS)

    Sun, Qiaohong; Miao, Chiyuan; Duan, Qingyun; Kong, Dongxian; Ye, Aizhong; Di, Zhenhua; Gong, Wei

    2014-01-01

    This research compared and evaluated the spatio-temporal similarities and differences of eight widely used gridded datasets. The datasets include daily precipitation over East Asia (EA), the Climate Research Unit (CRU) product, the Global Precipitation Climatology Centre (GPCC) product, the University of Delaware (UDEL) product, Precipitation Reconstruction over Land (PREC/L), the Asian Precipitation Highly Resolved Observational (APHRO) product, the Institute of Atmospheric Physics (IAP) dataset from the Chinese Academy of Sciences, and the National Meteorological Information Center dataset from the China Meteorological Administration (CN05). The meteorological variables focus on surface air temperature (SAT) or precipitation (PR) in China. All datasets presented general agreement on the whole spatio-temporal scale, but some differences appeared for specific periods and regions. On a temporal scale, EA shows the highest amount of PR, while APHRO shows the lowest. CRU and UDEL show higher SAT than IAP or CN05. On a spatial scale, the most significant differences occur in western China for PR and SAT. For PR, the difference between EA and CRU is the largest. When compared with CN05, CRU shows higher SAT in the central and southern Northwest river drainage basin, UDEL exhibits higher SAT over the Southwest river drainage system, and IAP has lower SAT in the Tibetan Plateau. The differences in annual mean PR and SAT primarily come from summer and winter, respectively. Finally, potential factors impacting agreement among gridded climate datasets are discussed, including raw data sources, quality control (QC) schemes, orographic correction, and interpolation techniques. The implications and challenges of these results for climate research are also briefly addressed. (paper)

  19. Atlantic small-mammal: a dataset of communities of rodents and marsupials of the Atlantic forests of South America.

    Science.gov (United States)

    Bovendorp, Ricardo S; Villar, Nacho; de Abreu-Junior, Edson F; Bello, Carolina; Regolin, André L; Percequillo, Alexandre R; Galetti, Mauro

    2017-08-01

    The contribution of small mammal ecology to the understanding of macroecological patterns of biodiversity, population dynamics, and community assembly has been hindered by the absence of large datasets of small mammal communities from tropical regions. Here we compile the largest dataset of inventories of small mammal communities for the Neotropical region. The dataset reviews small mammal communities from the Atlantic forest of South America, one of the regions with the highest diversity of small mammals and a global biodiversity hotspot, though currently covering less than 12% of its original area due to anthropogenic pressures. The dataset comprises 136 references from 300 locations covering seven vegetation types of tropical and subtropical Atlantic forests of South America, and presents data on species composition, richness, and relative abundance (captures/trap-nights). One paper was published more than 70 yr ago, but 80% of them were published after 2000. The dataset comprises 53,518 individuals of 124 species of small mammals, including 30 species of marsupials and 94 species of rodents. Species richness averaged 8.2 species (1-21) per site. Only two species occurred in more than 50% of the sites (the common opossum, Didelphis aurita, and the black-footed pigmy rice rat, Oligoryzomys nigripes). Mean species abundance varied 430-fold, from 4.3 to 0.01 individuals/trap-night. The dataset also revealed a hyper-dominance of 22 species that comprised 78.29% of all individuals captured, with only seven species representing 44% of all captures. The information contained in this dataset can be applied in the study of macroecological patterns of biodiversity, communities, and populations, but also to evaluate the ecological consequences of fragmentation and defaunation, and predict disease outbreaks, trophic interactions and community dynamics in this biodiversity hotspot. © 2017 by the Ecological Society of America.

  20. Strain measurement technique

    International Nuclear Information System (INIS)

    1987-01-01

    The 10 contributions are concerned with selected areas of application, such as strain measurements in wood, rubber/metal compounds, sets of strain measurements on buildings, reinforced concrete structures without gaps, pipes buried in the ground and measurements of pressure fluctuations. To increase the availability and safety of plant, stress analyses were made on gas turbine rotors with HT-DMS or capacitive HT-DMS (high temperature strain measurements). (DG) [de

  1. Strain path and work-hardening behavior of brass

    International Nuclear Information System (INIS)

    Sakharova, N.A.; Fernandes, J.V.; Vieira, M.F.

    2009-01-01

    Plastic straining in metal forming usually includes changes of strain path, which are frequently not taken into account in the analysis of forming processes. Moreover, strain path change can significantly affect the mechanical behavior and microstructural evolution of the material. For this reason, a combination of several simple loading test sequences is an effective way to investigate the dislocation microstructure of sheet metals under such forming conditions. Pure tension and rolling strain paths and rolling-tension strain path sequences were performed on brass sheets. A study of mechanical behavior and microstructural evolution during the simple and the complex strain paths was carried out, within a wide range of strain values. The appearance and development of deformation twinning was evident. It was shown that strain path change promotes the onset of premature twinning. The work-hardening behavior is discussed in terms of the twinning and dislocation microstructure evolution, as revealed by transmission electron microscopy

  2. Influence of strain on dislocation core in silicon

    Science.gov (United States)

    Pizzagalli, L.; Godet, J.; Brochard, S.

    2018-05-01

    First principles, density functional-based tight binding and semi-empirical interatomic potentials calculations are performed to analyse the influence of large strains on the structure and stability of a 60° dislocation in silicon. Such strains typically arise during the mechanical testing of nanostructures like nanopillars or nanoparticles. We focus on bi-axial strains in the plane normal to the dislocation line. Our calculations surprisingly reveal that the dislocation core structure largely depends on the applied strain, for strain levels of about 5%. In the particular case of bi-axial compression, the transformation of the dislocation to a locally disordered configuration occurs for similar strain magnitudes. The formation of an opening, however, requires larger strains, of about 7.5%. Furthermore, our results suggest that electronic structure methods should be favoured to model dislocation cores in case of large strains whenever possible.

  3. Strained Silicon Photonics

    Directory of Open Access Journals (Sweden)

    Ralf B. Wehrspohn

    2012-05-01

    Full Text Available A review of recent progress in the field of strained silicon photonics is presented. The application of strain to waveguide and photonic crystal structures can be used to alter the linear and nonlinear optical properties of these devices. Here, methods for the fabrication of strained devices are summarized and recent examples of linear and nonlinear optical devices are discussed. Furthermore, the relation between strain and the enhancement of the second order nonlinear susceptibility is investigated, which may enable the construction of optically active photonic devices made of silicon.

  4. Using Real Datasets for Interdisciplinary Business/Economics Projects

    Science.gov (United States)

    Goel, Rajni; Straight, Ronald L.

    2005-01-01

    The workplace's global and dynamic nature allows and requires improved approaches for providing business and economics education. In this article, the authors explore ways of enhancing students' understanding of course material by using nontraditional, real-world datasets of particular interest to them. Teaching at a historically Black university,…

  5. Dataset-driven research for improving recommender systems for learning

    NARCIS (Netherlands)

    Verbert, Katrien; Drachsler, Hendrik; Manouselis, Nikos; Wolpers, Martin; Vuorikari, Riina; Duval, Erik

    2011-01-01

    Verbert, K., Drachsler, H., Manouselis, N., Wolpers, M., Vuorikari, R., & Duval, E. (2011). Dataset-driven research for improving recommender systems for learning. In Ph. Long, & G. Siemens (Eds.), Proceedings of 1st International Conference Learning Analytics & Knowledge (pp. 44-53). February,

  6. dataTEL - Datasets for Technology Enhanced Learning

    NARCIS (Netherlands)

    Drachsler, Hendrik; Verbert, Katrien; Sicilia, Miguel-Angel; Wolpers, Martin; Manouselis, Nikos; Vuorikari, Riina; Lindstaedt, Stefanie; Fischer, Frank

    2011-01-01

    Drachsler, H., Verbert, K., Sicilia, M. A., Wolpers, M., Manouselis, N., Vuorikari, R., Lindstaedt, S., & Fischer, F. (2011). dataTEL - Datasets for Technology Enhanced Learning. STELLAR Alpine Rendez-Vous White Paper. Alpine Rendez-Vous 2011 White paper collection, Nr. 13., France (2011)

  7. A dataset of forest biomass structure for Eurasia.

    Science.gov (United States)

    Schepaschenko, Dmitry; Shvidenko, Anatoly; Usoltsev, Vladimir; Lakyda, Petro; Luo, Yunjian; Vasylyshyn, Roman; Lakyda, Ivan; Myklush, Yuriy; See, Linda; McCallum, Ian; Fritz, Steffen; Kraxner, Florian; Obersteiner, Michael

    2017-05-16

    The most comprehensive dataset of in situ destructive sampling measurements of forest biomass in Eurasia has been compiled from a combination of experiments undertaken by the authors and from scientific publications. Biomass is reported as four components: live trees (stem, bark, branches, foliage, roots); understory (above- and below ground); green forest floor (above- and below ground); and coarse woody debris (snags, logs, dead branches of living trees and dead roots), consisting of 10,351 unique records of sample plots and 9,613 sample trees from ca 1,200 experiments for the period 1930-2014 where there is overlap between these two datasets. The dataset also contains other forest stand parameters such as tree species composition, average age, tree height, growing stock volume, etc., when available. Such a dataset can be used for the development of models of biomass structure, biomass extension factors, change detection in biomass structure, investigations into biodiversity and species distribution and the biodiversity-productivity relationship, as well as the assessment of the carbon pool and its dynamics, among many others.

  8. A reanalysis dataset of the South China Sea

    Science.gov (United States)

    Zeng, Xuezhi; Peng, Shiqiu; Li, Zhijin; Qi, Yiquan; Chen, Rongyu

    2014-01-01

    Ocean reanalysis provides a temporally continuous and spatially gridded four-dimensional estimate of the ocean state for a better understanding of the ocean dynamics and its spatial/temporal variability. Here we present a 19-year (1992–2010) high-resolution ocean reanalysis dataset of the upper ocean in the South China Sea (SCS) produced from an ocean data assimilation system. A wide variety of observations, including in-situ temperature/salinity profiles, ship-measured and satellite-derived sea surface temperatures, and sea surface height anomalies from satellite altimetry, are assimilated into the outputs of an ocean general circulation model using a multi-scale incremental three-dimensional variational data assimilation scheme, yielding a daily high-resolution reanalysis dataset of the SCS. Comparisons between the reanalysis and independent observations support the reliability of the dataset. The presented dataset provides the research community of the SCS an important data source for studying the thermodynamic processes of the ocean circulation and meso-scale features in the SCS, including their spatial and temporal variability. PMID:25977803

  9. Comparison of analysis of the QTLMAS XII common dataset

    DEFF Research Database (Denmark)

    Crooks, Lucy; Sahana, Goutam; de Koning, Dirk-Jan

    2009-01-01

    As part of the QTLMAS XII workshop, a simulated dataset was distributed and participants were invited to submit analyses of the data based on genome-wide association, fine mapping and genomic selection. We have evaluated the findings from the groups that reported fine mapping and genome-wide asso...

  10. The LAMBADA dataset: Word prediction requiring a broad discourse context

    NARCIS (Netherlands)

    Paperno, D.; Kruszewski, G.; Lazaridou, A.; Pham, Q.N.; Bernardi, R.; Pezzelle, S.; Baroni, M.; Boleda, G.; Fernández, R.; Erk, K.; Smith, N.A.

    2016-01-01

    We introduce LAMBADA, a dataset to evaluate the capabilities of computational models for text understanding by means of a word prediction task. LAMBADA is a collection of narrative passages sharing the characteristic that human subjects are able to guess their last word if they are exposed to the

  11. NEW WEB-BASED ACCESS TO NUCLEAR STRUCTURE DATASETS.

    Energy Technology Data Exchange (ETDEWEB)

    WINCHELL,D.F.

    2004-09-26

    As part of an effort to migrate the National Nuclear Data Center (NNDC) databases to a relational platform, a new web interface has been developed for the dissemination of the nuclear structure datasets stored in the Evaluated Nuclear Structure Data File and Experimental Unevaluated Nuclear Data List.

  12. Cross-Cultural Concept Mapping of Standardized Datasets

    DEFF Research Database (Denmark)

    Kano Glückstad, Fumiko

    2012-01-01

    This work compares four feature-based similarity measures derived from cognitive sciences. The purpose of the comparative analysis is to verify the potentially most effective model that can be applied for mapping independent ontologies in a culturally influenced domain [1]. Here, datasets based...

  13. Level-1 muon trigger performance with the full 2017 dataset

    CERN Document Server

    CMS Collaboration

    2018-01-01

    This document describes the performance of the CMS Level-1 Muon Trigger with the full dataset of 2017. Efficiency plots are included for each track finder (TF) individually and for the system as a whole. The efficiency is measured to be greater than 90% for all track finders.

  14. A Dataset for Visual Navigation with Neuromorphic Methods

    Directory of Open Access Journals (Sweden)

    Francisco Barranco

    2016-02-01

    Full Text Available Standardized benchmarks in Computer Vision have greatly contributed to the advance of approaches to many problems in the field. If we want to enhance the visibility of event-driven vision and increase its impact, we will need benchmarks that allow comparison among different neuromorphic methods as well as comparison to conventional Computer Vision approaches. We present datasets to evaluate the accuracy of frame-free and frame-based approaches for tasks of visual navigation. Similar to conventional Computer Vision datasets, we provide synthetic and real scenes, with the synthetic data created with graphics packages, and the real data recorded using a mobile robotic platform carrying a dynamic and active pixel vision sensor (DAVIS) and an RGB+Depth sensor. For both datasets the cameras move with a rigid motion in a static scene, and the data includes the images, events, optic flow, 3D camera motion, and the depth of the scene, along with calibration procedures. Finally, we also provide simulated event data generated synthetically from well-known frame-based optical flow datasets.

  15. Evaluation of Uncertainty in Precipitation Datasets for New Mexico, USA

    Science.gov (United States)

    Besha, A. A.; Steele, C. M.; Fernald, A.

    2014-12-01

    Climate change, population growth and other factors are endangering water availability and sustainability in semiarid/arid areas, particularly in the southwestern United States. Wide coverage of spatial and temporal measurements of precipitation is key for regional water budget analysis and hydrological operations, which themselves are valuable tools for water resource planning and management. Rain gauge measurements are usually reliable and accurate at a point. They measure rainfall continuously, but spatial sampling is limited. Ground-based radar and satellite remotely sensed precipitation have wide spatial and temporal coverage. However, these measurements are indirect and subject to errors because of equipment, meteorological variability, the heterogeneity of the land surface itself and lack of regular recording. This study seeks to understand precipitation uncertainty and, in doing so, lessen uncertainty propagation into hydrological applications and operations. We reviewed, compared and evaluated the TRMM (Tropical Rainfall Measuring Mission) precipitation products, NOAA's (National Oceanic and Atmospheric Administration) Global Precipitation Climatology Centre (GPCC) monthly precipitation dataset, PRISM (Parameter elevation Regression on Independent Slopes Model) data and data from individual climate stations including Cooperative Observer Program (COOP), Remote Automated Weather Stations (RAWS), Soil Climate Analysis Network (SCAN) and Snowpack Telemetry (SNOTEL) stations. Though not yet finalized, this study finds that the uncertainty within precipitation estimate datasets is influenced by regional topography, season, climate and precipitation rate. Ongoing work aims to further evaluate precipitation datasets based on the relative influence of these phenomena so that we can identify the optimum datasets for input to statewide water budget analysis.

  16. Dataset: Multi Sensor-Orientation Movement Data of Goats

    NARCIS (Netherlands)

    Kamminga, Jacob Wilhelm

    2018-01-01

    This is a labeled dataset. Motion data were collected from six sensor nodes that were fixed with different orientations to a collar around the neck of goats. These six sensor nodes simultaneously, with different orientations, recorded various activities performed by the goat. We recorded the

  17. A dataset of human decision-making in teamwork management

    Science.gov (United States)

    Yu, Han; Shen, Zhiqi; Miao, Chunyan; Leung, Cyril; Chen, Yiqiang; Fauvel, Simon; Lin, Jun; Cui, Lizhen; Pan, Zhengxiang; Yang, Qiang

    2017-01-01

    Today, most endeavours require teamwork by people with diverse skills and characteristics. In managing teamwork, decisions are often made under uncertainty and resource constraints. The strategies and the effectiveness of the strategies different people adopt to manage teamwork under different situations have not yet been fully explored, partially due to a lack of detailed large-scale data. In this paper, we describe a multi-faceted large-scale dataset to bridge this gap. It is derived from a game simulating complex project management processes. It presents the participants with different conditions in terms of team members' capabilities and task characteristics for them to exhibit their decision-making strategies. The dataset contains detailed data reflecting the decision situations, decision strategies, decision outcomes, and the emotional responses of 1,144 participants from diverse backgrounds. To our knowledge, this is the first dataset simultaneously covering these four facets of decision-making. With repeated measurements, the dataset may help establish baseline variability of decision-making in teamwork management, leading to more realistic decision theoretic models and more effective decision support approaches.

  18. UK surveillance: provision of quality assured information from combined datasets.

    Science.gov (United States)

    Paiba, G A; Roberts, S R; Houston, C W; Williams, E C; Smith, L H; Gibbens, J C; Holdship, S; Lysons, R

    2007-09-14

    Surveillance information is most useful when provided within a risk framework, which is achieved by presenting results against an appropriate denominator. Often the datasets are captured separately and for different purposes, and will have inherent errors and biases that can be further confounded by the act of merging. The United Kingdom Rapid Analysis and Detection of Animal-related Risks (RADAR) system contains data from several sources and provides both data extracts for research purposes and reports for wider stakeholders. Considerable efforts are made to optimise the data in RADAR during the Extraction, Transformation and Loading (ETL) process. Despite efforts to ensure data quality, the final dataset inevitably contains some data errors and biases, most of which cannot be rectified during subsequent analysis. So, in order for users to establish the 'fitness for purpose' of data merged from more than one data source, Quality Statements are produced as defined within the overarching surveillance Quality Framework. These documents detail identified data errors and biases following ETL and report construction as well as relevant aspects of the datasets from which the data originated. This paper illustrates these issues using RADAR datasets, and describes how they can be minimised.

  19. participatory development of a minimum dataset for the khayelitsha ...

    African Journals Online (AJOL)

    This dataset was integrated with data requirements at ... model for defining health information needs at district level. This participatory process has enabled health workers to appraise their .... of reproductive health, mental health, disability and community ... each chose a facilitator and met in between the forum meetings.

  20. Comparison of analysis of the QTLMAS XII common dataset

    DEFF Research Database (Denmark)

    Lund, Mogens Sandø; Sahana, Goutam; de Koning, Dirk-Jan

    2009-01-01

    A dataset was simulated and distributed to participants of the QTLMAS XII workshop who were invited to develop genomic selection models. Each contributing group was asked to describe the model development and validation as well as to submit genomic predictions for three generations of individuals...

  1. The NASA Subsonic Jet Particle Image Velocimetry (PIV) Dataset

    Science.gov (United States)

    Bridges, James; Wernet, Mark P.

    2011-01-01

    Many tasks in fluids engineering require prediction of turbulence of jet flows. This document reports the single-point statistics of velocity, mean and variance, of cold and hot jet flows. The jet velocities ranged from 0.5 to 1.4 times the ambient speed of sound, and temperatures ranged from unheated to static temperature ratio 2.7. Further, the report assesses the accuracies of the data, e.g., establishes uncertainties for the data. This paper covers the following five tasks: (1) Document acquisition and processing procedures used to create the particle image velocimetry (PIV) datasets. (2) Compare PIV data with hotwire and laser Doppler velocimetry (LDV) data published in the open literature. (3) Compare different datasets acquired at the same flow conditions in multiple tests to establish uncertainties. (4) Create a consensus dataset for a range of hot jet flows, including uncertainty bands. (5) Analyze this consensus dataset for self-consistency and compare jet characteristics to those of the open literature. The final objective was fulfilled by using the potential core length and the spread rate of the half-velocity radius to collapse the mean and turbulent velocity fields over the first 20 jet diameters.
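    The two scaling quantities named in the last sentence can be made concrete with a short sketch. It assumes the usual definition of the half-velocity radius (the radial position where the axial velocity falls to half its centerline value) and estimates the spread rate as the least-squares slope of that radius against axial distance. The function names and the interpolation scheme are illustrative choices, not the report's processing code.

```python
def half_velocity_radius(r, u):
    """Radius where u drops to half the centerline value u[0].

    r: radial positions, increasing from the centerline (r[0] == 0).
    u: mean axial velocity at each r, assumed to decay outward.
    Uses linear interpolation between the two bracketing samples.
    """
    target = 0.5 * u[0]
    for i in range(1, len(r)):
        if u[i] <= target <= u[i - 1]:
            if u[i - 1] == u[i]:
                return r[i - 1]
            t = (u[i - 1] - target) / (u[i - 1] - u[i])
            return r[i - 1] + t * (r[i] - r[i - 1])
    raise ValueError("profile never drops to half the centerline velocity")


def spread_rate(x, r_half):
    """Least-squares slope d(r_half)/dx over the given axial stations."""
    n = len(x)
    mx, my = sum(x) / n, sum(r_half) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, r_half))
    den = sum((a - mx) ** 2 for a in x)
    return num / den
```

    For example, a profile with u = [10, 8, 4, 0] at r = [0, 1, 2, 3] crosses half the centerline velocity between r = 1 and r = 2, and the interpolation places the half-velocity radius at 1.75.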

  2. A new dataset validation system for the Planetary Science Archive

    Science.gov (United States)

    Manaud, N.; Zender, J.; Heather, D.; Martinez, S.

    2007-08-01

    The Planetary Science Archive is the official archive for the Mars Express mission. It received its first data by the end of 2004. These data are delivered by the PI teams to the PSA team as datasets, which are formatted to conform to the Planetary Data System (PDS) standard. The PI teams are responsible for analyzing and calibrating the instrument data as well as for the production of reduced and calibrated data. They are also responsible for the scientific validation of these data. ESA is responsible for the long-term data archiving and distribution to the scientific community and must ensure, in this regard, that all archived products meet quality requirements. To do so, an archive peer-review is used to control the quality of the Mars Express science data archiving process. However, a full validation of its content is missing. An independent review board recently recommended that the completeness of the archive as well as the consistency of the delivered data should be validated following well-defined procedures. A new validation software tool is being developed to complete the overall data quality control system functionality. This new tool aims to improve the quality of data and services provided to the scientific community through the PSA, and shall make it possible to track anomalies in datasets and to control their completeness. It shall ensure that the PSA end-users: (1) can rely on the result of their queries, (2) will get data products that are suitable for scientific analysis, (3) can find all science data acquired during a mission. We defined dataset validation as the verification and assessment process to check the dataset content against pre-defined top-level criteria, which represent the general characteristics of good quality datasets. The dataset content that is checked includes the data and all types of information that are essential in the process of deriving scientific results and those interfacing with the PSA database. The validation software tool is a multi-mission tool that

  3. Proglacial river stage, discharge, and temperature datasets from the Akuliarusiarsuup Kuua River northern tributary, Southwest Greenland, 2008–2011

    Directory of Open Access Journals (Sweden)

    A. K. Rennermalm

    2012-05-01

    Full Text Available Pressing scientific questions concerning the Greenland ice sheet's climatic sensitivity, hydrology, and contributions to current and future sea level rise require hydrological datasets to resolve. While direct observations of ice sheet meltwater losses can be obtained in terrestrial rivers draining the ice sheet and from lake levels, few such datasets exist. We present a new hydrologic dataset from previously unmonitored sites in the vicinity of Kangerlussuaq, Southwest Greenland. This dataset contains measurements of river stage and discharge for three sites along the Akuliarusiarsuup Kuua (Watson) River's northern tributary, with 30 min temporal resolution between June 2008 and July 2011. Additional data of water temperature, air pressure, and lake stage are also provided. Flow velocity and depth measurements were collected at sites with incised bedrock or structurally reinforced channels to maximize data quality. However, like most proglacial rivers, high turbulence and bedload transport introduce considerable uncertainty to the derived discharge estimates. Eleven propagating error sources were quantified, and reveal that the largest uncertainties are associated with flow depth observations. Mean discharge uncertainties (approximately the 68% confidence interval) are two to four times larger (±19% to ±43%) than previously published estimates for Greenland rivers. Despite these uncertainties, this dataset offers a rare collection of direct measurements of ice sheet runoff to the global ocean and is freely available for scientific use at http://dx.doi.org/10.1594/PANGAEA.762818.

  4. Comparison of global 3-D aviation emissions datasets

    Directory of Open Access Journals (Sweden)

    S. C. Olsen

    2013-01-01

    Full Text Available Aviation emissions are distinct from other transportation emissions, e.g., from road transportation and shipping, in that they occur at higher altitudes as well as at the surface. Aviation emissions of carbon dioxide, soot, and water vapor have direct radiative impacts on the Earth's climate system while emissions of nitrogen oxides (NOx), sulfur oxides, carbon monoxide (CO), and hydrocarbons (HC) impact air quality and climate through their effects on ozone, methane, and clouds. The most accurate estimates of the impact of aviation on air quality and climate utilize three-dimensional chemistry-climate models and gridded four-dimensional (space and time) aviation emissions datasets. We compare five available aviation emissions datasets currently and historically used to evaluate the impact of aviation on climate and air quality: NASA-Boeing 1992, NASA-Boeing 1999, QUANTIFY 2000, Aero2k 2002, and AEDT 2006 and aviation fuel usage estimates from the International Energy Agency. Roughly 90% of all aviation emissions are in the Northern Hemisphere and nearly 60% of all fuelburn and NOx emissions occur at cruise altitudes in the Northern Hemisphere. While these datasets were created by independent methods and are thus not strictly suitable for analyzing trends, they suggest that commercial aviation fuelburn and NOx emissions increased over the last two decades while HC emissions likely decreased and CO emissions did not change significantly. The bottom-up estimates compared here are consistently lower than International Energy Agency fuelburn statistics, although the gap is significantly smaller in the more recent datasets. Overall the emissions distributions are quite similar for fuelburn and NOx with regional peaks over the populated land masses of North America, Europe, and East Asia. For CO and HC there are relatively larger differences. There are however some distinct differences in the altitude distribution

  5. Geoseq: a tool for dissecting deep-sequencing datasets

    Directory of Open Access Journals (Sweden)

    Homann Robert

    2010-10-01

    Full Text Available Abstract Background Datasets generated on deep-sequencing platforms have been deposited in various public repositories such as the Gene Expression Omnibus (GEO), the Sequence Read Archive (SRA) hosted by the NCBI, or the DNA Data Bank of Japan (DDBJ). Despite being rich data sources, they have not been used much due to the difficulty in locating and analyzing datasets of interest. Results Geoseq (http://geoseq.mssm.edu) provides a new method of analyzing short reads from deep sequencing experiments. Instead of mapping the reads to reference genomes or sequences, Geoseq maps a reference sequence against the sequencing data. It is web-based, and holds pre-computed data from public libraries. The analysis reduces the input sequence to tiles and measures the coverage of each tile in a sequence library through the use of suffix arrays. The user can upload custom target sequences or use gene/miRNA names for the search and get back results as plots and spreadsheet files. Geoseq organizes the public sequencing data using a controlled vocabulary, allowing identification of relevant libraries by organism, tissue and type of experiment. Conclusions Analysis of small sets of sequences against deep-sequencing datasets, as well as identification of public datasets of interest, is simplified by Geoseq. We applied Geoseq to (a) identify differential isoform expression in mRNA-seq datasets, (b) identify miRNAs (microRNAs) in libraries and identify mature and star sequences in miRNAs, and (c) identify potentially mis-annotated miRNAs. The ease of using Geoseq for these analyses suggests its utility and uniqueness as an analysis tool.
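
    The tile-and-count idea described in this abstract can be illustrated with a minimal sketch. It is not Geoseq's implementation: the function names, the overlapping-tile scheme, the tile length, and the `$` read separator are all assumptions made for illustration, and the suffix array is built naively, which is only practical for toy inputs.

```python
def build_suffix_array(text):
    # Naive construction: sort suffix start positions by the suffix text.
    # Fine for a toy library; real tools use linear-time builders.
    return sorted(range(len(text)), key=lambda i: text[i:])


def count_occurrences(text, sa, pattern):
    # Count (possibly overlapping) occurrences of pattern via two binary
    # searches over the sorted suffixes, comparing only len(pattern) chars.
    m = len(pattern)
    lo, hi = 0, len(sa)
    while lo < hi:  # first suffix whose m-prefix is >= pattern
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo
    hi = len(sa)
    while lo < hi:  # first suffix whose m-prefix is > pattern
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return lo - start


def tile_coverage(reference, reads, tile_len=4, step=1):
    # Hypothetical helper: cut the reference into tiles and count how
    # often each tile occurs in the read library. Reads are joined with
    # '$' (assumed absent from the alphabet) so no match spans two reads.
    text = "$".join(reads)
    sa = build_suffix_array(text)
    tiles = [reference[i:i + tile_len]
             for i in range(0, len(reference) - tile_len + 1, step)]
    return [(tile, count_occurrences(text, sa, tile)) for tile in tiles]


if __name__ == "__main__":
    library = ["ACGTAC", "GTACGT", "TTTT"]
    for tile, hits in tile_coverage("ACGTA", library):
        print(tile, hits)
```

    Indexing the library once and querying it per tile is what makes it cheap to map a short reference against a large pre-computed read set, rather than mapping every read to a reference.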

  6. On sample size and different interpretations of snow stability datasets

    Science.gov (United States)

    Schirmer, M.; Mitterer, C.; Schweizer, J.

    2009-04-01

    Interpretations of snow stability variations need an assessment of the stability itself, independent of the scale investigated in the study. Studies on stability variations at a regional scale have often chosen stability tests such as the Rutschblock test or combinations of various tests in order to detect differences in aspect and elevation. The question arose: ‘how capable are such stability interpretations in drawing conclusions?’ There are at least three possible error sources: (i) the variance of the stability test itself; (ii) the stability variance at an underlying slope scale, and (iii) that the stability interpretation might not be directly related to the probability of skier triggering. Various stability interpretations have been proposed in the past that provide partly different results. We compared a subjective one based on expert knowledge with a more objective one based on a measure derived from comparing skier-triggered slopes vs. slopes that have been skied but not triggered. In this study, the uncertainties are discussed and their effects on regional scale stability variations will be quantified in a pragmatic way. An existing dataset with very large sample sizes was revisited. This dataset contained the variance of stability at a regional scale for several situations. The stability in this dataset was determined using the subjective interpretation scheme based on expert knowledge. The question to be answered was how many measurements were needed to obtain similar results (mainly stability differences in aspect or elevation) as with the complete dataset. The optimal sample size was obtained in several ways: (i) assuming a nominal data scale the sample size was determined with a given test, significance level and power, and by calculating the mean and standard deviation of the complete dataset. With this method it can also be determined if the complete dataset consists of an appropriate sample size. (ii) Smaller subsets were created with similar

  7. Genome-Wide Transcription Study of Cryptococcus neoformans H99 Clinical Strain versus Environmental Strains.

    Directory of Open Access Journals (Sweden)

    Elaheh Movahed

    Full Text Available The infection of Cryptococcus neoformans is acquired through the inhalation of desiccated yeast cells and basidiospores originating from the environment, particularly from bird droppings and decaying wood. Three environmental strains of C. neoformans originating from bird droppings (H4, S48B and S68B) and the C. neoformans reference clinical strain (H99) were used for intranasal infection in C57BL/6 mice. We showed that the H99 strain demonstrated higher virulence compared to H4, S48B and S68B strains. To examine if gene expression contributed to the different degree of virulence among these strains, a genome-wide microarray study was performed to inspect the transcriptomic profiles of all four strains. Our results revealed that out of 7,419 genes (22,257 probes) examined, 65 genes were significantly up- or down-regulated in H99 versus H4, S48B and S68B strains. The up-regulated genes in H99 strain include Hydroxymethylglutaryl-CoA synthase (MVA1), Mitochondrial matrix factor 1 (MMF1), Bud-site-selection protein 8 (BUD8), High affinity glucose transporter 3 (SNF3) and Rho GTPase-activating protein 2 (RGA2). Pathway annotation using DAVID bioinformatics resource showed that metal ion binding and sugar transmembrane transporter activity pathways were highly expressed in the H99 strain. We suggest that the genes and pathways identified may possibly play crucial roles in the fungal pathogenesis.

  8. A strain gauge

    DEFF Research Database (Denmark)

    2017-01-01

    The invention relates to a strain gauge of a carrier layer and a meandering measurement grid (101) positioned on the carrier layer, wherein the measurement grid comprises a number of measurement grid sections placed side by side with gaps in between, and a number of end loops (106) interconnecting...... relates to a method for manufacturing a strain gauge as mentioned above....

  9. Chemical Profile of Monascus ruber Strains

    Directory of Open Access Journals (Sweden)

    Ahamed M. Moharram

    2012-01-01

    Full Text Available Chemical profile of Monascus ruber strains has been studied using gas chromatography-mass spectrometry (GC/MS) analysis. The colour intensity of the red pigment and secondary metabolic products of two M. ruber strains (AUMC 4066 and AUMC 5705) cultivated on ten different media were also studied. Metabolic products can be classified into four categories: anticholesterol, anticancer, food colouring, and essential fatty acids necessary for human health. Using GC/MS, the following 88 metabolic products were detected: butyric acid and its derivatives (25 products), other fatty acids and their derivatives (19 products), pyran and its derivatives (22 products) and other metabolites (22 products). Among these, 32 metabolites were specific for AUMC 4066 strain and 34 for AUMC 5705 strain, whereas 22 metabolites were produced by both strains on different tested substrates. Production of some metabolites depended on the substrate used. A high number of metabolites was recorded in the red pigment extract obtained by both strains grown on malt broth and malt agar. Also, 42 aroma compounds were recorded (4 alcohols, 2 benzaldehydes, 27 esters, 3 lactones, 1 phenol, 1 terpenoid, 3 thiol compounds and acetate-3-mercapto butyric acid). Thin layer chromatography and GC/MS analyses revealed no mycotoxin citrinin in any media used for the growth of the two M. ruber strains.

  10. Isolation and genetic characterization of Aurantimonas and Methylobacterium strains from stems of hypernodulated soybeans.

    Science.gov (United States)

    Anda, Mizue; Ikeda, Seishi; Eda, Shima; Okubo, Takashi; Sato, Shusei; Tabata, Satoshi; Mitsui, Hisayuki; Minamisawa, Kiwamu

    2011-01-01

    The aims of this study were to isolate Aurantimonas and Methylobacterium strains that responded to soybean nodulation phenotypes and nitrogen fertilization rates in a previous culture-independent analysis (Ikeda et al. ISME J. 4:315-326, 2010). Two strategies were adopted for isolation from enriched bacterial cells prepared from stems of field-grown, hypernodulated soybeans: PCR-assisted isolation for Aurantimonas and selective cultivation for Methylobacterium. Thirteen of 768 isolates cultivated on Nutrient Agar medium were identified as Aurantimonas by colony PCR specific for Aurantimonas and 16S rRNA gene sequencing. Meanwhile, among 187 isolates on methanol-containing agar media, 126 were identified by 16S rRNA gene sequences as Methylobacterium. A clustering analysis (>99% identity) of the 16S rRNA gene sequences for the combined datasets of the present and previous studies revealed 4 and 8 operational taxonomic units (OTUs) for Aurantimonas and Methylobacterium, respectively, and showed the successful isolation of target bacteria for these two groups. ERIC- and BOX-PCR showed the genomic uniformity of the target isolates. In addition, phylogenetic analyses of Aurantimonas revealed a phyllosphere-specific cluster in the genus. The isolates obtained in the present study will be useful for revealing unknown legume-microbe interactions in relation to the autoregulation of nodulation.

  11. A multimodal MRI dataset of professional chess players.

    Science.gov (United States)

    Li, Kaiming; Jiang, Jing; Qiu, Lihua; Yang, Xun; Huang, Xiaoqi; Lui, Su; Gong, Qiyong

    2015-01-01

    Chess is a good model to study high-level human brain functions such as spatial cognition, memory, planning, learning and problem solving. Recent studies have demonstrated that non-invasive MRI techniques are valuable for researchers to investigate the underlying neural mechanism of playing chess. For professional chess players (e.g., chess grand masters and masters or GM/Ms), what structural and functional alterations result from long-term professional practice, and how these alterations relate to behavior, remain largely unknown. Here, we report a multimodal MRI dataset from 29 professional Chinese chess players (most of whom are GM/Ms), and 29 age-matched novices. We hope that this dataset will provide researchers with new materials to further explore high-level human brain functions.

  12. Augmented Reality Prototype for Visualizing Large Sensors’ Datasets

    Directory of Open Access Journals (Sweden)

    Folorunso Olufemi A.

    2011-04-01

    Full Text Available This paper addresses the development of an augmented reality (AR) based scientific visualization system prototype that supports identification, localisation, and 3D visualisation of oil leakage sensor datasets. Sensors generate significant amounts of multivariate data during normal and leak situations, which makes data exploration and visualisation daunting tasks. Therefore, a model to manage such data and enhance the computational support needed for effective exploration is developed in this paper. A challenge of this approach is to reduce the data inefficiency. This paper presents a model for computing the information gain of each data attribute and determining a lead attribute. The computed lead attribute is then used for the development of an AR-based scientific visualization interface which automatically identifies, localises and visualizes all necessary data relevant to a particular selected region of interest (ROI) on the network. Necessary architectural system supports and the interface requirements for such visualizations are also presented.
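
    Selecting a lead attribute by information gain can be sketched as follows. This is a generic entropy-based calculation for categorical attributes, not the model from the paper; the attribute names (`pressure`, `flow`), the `state` label, and the toy readings are hypothetical, and continuous sensor values would first need to be discretized.

```python
from collections import Counter
from math import log2


def entropy(labels):
    # Shannon entropy (in bits) of a list of class labels.
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(rows, attribute, target):
    # Entropy of the target minus the weighted entropy that remains
    # after splitting the rows on the given attribute.
    base = entropy([row[target] for row in rows])
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


def lead_attribute(rows, attributes, target):
    # The attribute with the highest information gain about the target.
    return max(attributes, key=lambda a: information_gain(rows, a, target))


if __name__ == "__main__":
    # Hypothetical, already-discretized sensor readings.
    readings = [
        {"pressure": "low", "flow": "high", "state": "leak"},
        {"pressure": "low", "flow": "normal", "state": "leak"},
        {"pressure": "normal", "flow": "high", "state": "normal"},
        {"pressure": "normal", "flow": "normal", "state": "normal"},
    ]
    for name in ("pressure", "flow"):
        print(name, information_gain(readings, name, "state"))
    print("lead:", lead_attribute(readings, ["pressure", "flow"], "state"))
```

    In this toy data the pressure attribute separates leak from normal readings completely while flow carries no information, so pressure would be the attribute driving the visualization.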

  13. An integrated dataset for in silico drug discovery

    Directory of Open Access Journals (Sweden)

    Cockell Simon J

    2010-12-01

    Full Text Available Drug development is expensive and prone to failure. It is potentially much less risky and expensive to reuse a drug developed for one condition for treating a second disease, than it is to develop an entirely new compound. Systematic approaches to drug repositioning are needed to increase throughput and find candidates more reliably. Here we address this need with an integrated systems biology dataset, developed using the Ondex data integration platform, for the in silico discovery of new drug repositioning candidates. We demonstrate that the information in this dataset allows known repositioning examples to be discovered. We also propose a means of automating the search for new treatment indications of existing compounds.

  14. Application of Density Estimation Methods to Datasets from a Glider

    Science.gov (United States)

    2014-09-30

    humpback and sperm whales as well as different dolphin species. OBJECTIVES The objective of this research is to extend existing methods for cetacean...estimation from single sensor datasets. Required steps for a cue counting approach, where a cue has been defined as a clicking event (Küsel et al., 2011), to

  15. A review of continent scale hydrological datasets available for Africa

    OpenAIRE

    Bonsor, H.C.

    2010-01-01

    As rainfall becomes less reliable with predicted climate change the ability to assess the spatial and seasonal variations in groundwater availability on a large-scale (catchment and continent) is becoming increasingly important (Bates, et al. 2007; MacDonald et al. 2009). The scarcity of observed hydrological data, or difficulty in obtaining such data, within Africa means remotely sensed (RS) datasets must often be used to drive large-scale hydrological models. The different ap...

  16. Dataset of mitochondrial genome variants in oncocytic tumors

    Directory of Open Access Journals (Sweden)

    Lihua Lyu

    2018-04-01

    Full Text Available This dataset presents the mitochondrial genome variants associated with oncocytic tumors. These data were obtained by Sanger sequencing of the whole mitochondrial genomes of oncocytic tumors and the adjacent normal tissues from 32 patients. The mtDNA variants were identified by comparison with the revised Cambridge sequence, excluding those defining the haplogroups of our patients. The pathogenic prediction for the novel missense variants found in this study was performed with the Mitimpact 2 program.

  17. GLEAM version 3: Global Land Evaporation Datasets and Model

    Science.gov (United States)

    Martens, B.; Miralles, D. G.; Lievens, H.; van der Schalie, R.; de Jeu, R.; Fernandez-Prieto, D.; Verhoest, N.

    2015-12-01

    Terrestrial evaporation links energy, water and carbon cycles over land and is therefore a key variable of the climate system. However, the global-scale magnitude and variability of the flux, and the sensitivity of the underlying physical process to changes in environmental factors, are still poorly understood due to limitations in in situ measurements. As a result, several methods have arisen to estimate global patterns of land evaporation from satellite observations. However, these algorithms generally differ in their approach to model evaporation, resulting in large differences in their estimates. One of these methods is GLEAM, the Global Land Evaporation: the Amsterdam Methodology. GLEAM estimates terrestrial evaporation based on daily satellite observations of meteorological variables, vegetation characteristics and soil moisture. Since the publication of the first version of the algorithm (2011), the model has been widely applied to analyse trends in the water cycle and land-atmospheric feedbacks during extreme hydrometeorological events. A third version of the GLEAM global datasets is foreseen by the end of 2015. Given the relevance of having a continuous and reliable record of global-scale evaporation estimates for climate and hydrological research, the establishment of an online data portal to make these data available to the public is also foreseen. In this new release of the GLEAM datasets, different components of the model have been updated, with the most significant change being the revision of the data assimilation algorithm. In this presentation, we will highlight the most important changes of the methodology and present three new GLEAM datasets and their validation against in situ observations and an alternative dataset of terrestrial evaporation (ERA-Land). Results of the validation exercise indicate that the magnitude and the spatiotemporal variability of the modelled evaporation agree reasonably well with the estimates of ERA-Land and the in situ

  18. Soil chemistry in lithologically diverse datasets: the quartz dilution effect

    Science.gov (United States)

    Bern, Carleton R.

    2009-01-01

    National- and continental-scale soil geochemical datasets are likely to move our understanding of broad soil geochemistry patterns forward significantly. Patterns of chemistry and mineralogy delineated from these datasets are strongly influenced by the composition of the soil parent material, which itself is largely a function of lithology and particle size sorting. Such controls present a challenge by obscuring subtler patterns arising from subsequent pedogenic processes. Here the effect of quartz concentration is examined in moist-climate soils from a pilot dataset of the North American Soil Geochemical Landscapes Project. Due to variable and high quartz contents (6.2–81.7 wt.%), and its residual and inert nature in soil, quartz is demonstrated to influence broad patterns in soil chemistry. A dilution effect is observed whereby concentrations of various elements are significantly and strongly negatively correlated with quartz. Quartz content drives artificial positive correlations between concentrations of some elements and obscures negative correlations between others. Unadjusted soil data show the highly mobile base cations Ca, Mg, and Na to be often strongly positively correlated with intermediately mobile Al or Fe, and generally uncorrelated with the relatively immobile high-field-strength elements (HFS) Ti and Nb. Both patterns are contrary to broad expectations for soils being weathered and leached. After transforming bulk soil chemistry to a quartz-free basis, the base cations are generally uncorrelated with Al and Fe, and negative correlations generally emerge with the HFS elements. Quartz-free element data may be a useful tool for elucidating patterns of weathering or parent-material chemistry in large soil datasets.
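
    The dilution effect and the quartz-free transformation can be illustrated with a small sketch. The abstract does not state the formula used, so the renormalization below (dividing a bulk concentration by the non-quartz mass fraction) is an assumption, and the function name and sample values are hypothetical.

```python
def quartz_free(concentration, quartz_wt_pct):
    # Re-express a bulk-soil concentration on a quartz-free basis by
    # dividing by the non-quartz mass fraction. Assumed formula:
    #   C_qf = C_bulk / (1 - Q / 100), with Q the quartz content in wt.%
    if not 0 <= quartz_wt_pct < 100:
        raise ValueError("quartz content must be in [0, 100) wt.%")
    return concentration / (1 - quartz_wt_pct / 100)


if __name__ == "__main__":
    # Two hypothetical soils whose non-quartz fraction has the same Ca
    # content (2.5 wt.%) but which differ strongly in quartz content.
    samples = [
        {"quartz": 20.0, "Ca": 2.0},
        {"quartz": 80.0, "Ca": 0.5},
    ]
    for s in samples:
        print(s["quartz"], s["Ca"], round(quartz_free(s["Ca"], s["quartz"]), 3))
```

    The bulk Ca values differ by a factor of four purely because of quartz dilution, while the quartz-free values coincide. The same dilution acting on every element at once is what can create artificial positive correlations between otherwise unrelated elements.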

  19. Dataset on records of Hericium erinaceus in Slovakia

    OpenAIRE

    Vladimír Kunca; Marek Čiliak

    2017-01-01

    The data presented in this article are related to the research article entitled "Habitat preferences of Hericium erinaceus in Slovakia" (Kunca and Čiliak, 2016) [FUNECO607] [2]. The dataset includes all available and unpublished data from Slovakia, besides the records from the same tree or stem. We compiled a database of records of collections by processing data from herbaria, personal records and communication with mycological activists. Data on altitude, tree species, host tree vital status,...

  20. Diffeomorphic Iterative Centroid Methods for Template Estimation on Large Datasets

    OpenAIRE

    Cury, Claire; Glaunès, Joan Alexis; Colliot, Olivier

    2014-01-01

    A common approach for analysis of anatomical variability relies on the estimation of a template representative of the population. The Large Deformation Diffeomorphic Metric Mapping (LDDMM) is an attractive framework for that purpose. However, template estimation using LDDMM is computationally expensive, which is a limitation for the study of large datasets. This paper presents an iterative method which quickly provides a centroid of the population in the shape space. This centr...

  1. A Dataset from TIMSS to Examine the Relationship between Computer Use and Mathematics Achievement

    Science.gov (United States)

    Kadijevich, Djordje M.

    2015-01-01

    Because the relationship between computer use and achievement is still puzzling, there is a need to prepare and analyze good quality datasets on computer use and achievement. Such a dataset can be derived from TIMSS data. This paper describes how this dataset can be prepared. It also gives an example of how the dataset may be analyzed. The…

  2. An Analysis on Better Testing than Training Performances on the Iris Dataset

    NARCIS (Netherlands)

    Schutten, Marten; Wiering, Marco

    2016-01-01

    The Iris dataset is a well known dataset containing information on three different types of Iris flowers. A typical and popular method for solving classification problems on datasets such as the Iris set is the support vector machine (SVM). In order to do so the dataset is separated in a set used

  3. Enzyme markers in inbred rat strains: genetics of new markers and strain profiles.

    Science.gov (United States)

    Adams, M; Baverstock, P R; Watts, C H; Gutman, G A

    1984-08-01

    Twenty-six inbred strains of the laboratory rat (Rattus norvegicus) were examined for electrophoretic variation at an estimated 97 genetic loci. In addition to previously documented markers, variation was observed for the enzymes aconitase, aldehyde dehydrogenase, and alkaline phosphatase. The genetic basis of these markers (Acon-1, Ahd-2, and Akp-1) was confirmed. Linkage analysis between 35 pairwise comparisons revealed that the markers Fh-1 and Pep-3 are linked. The strain profiles of the 25 inbred strains at 11 electrophoretic markers are given.

  4. Parton Distributions based on a Maximally Consistent Dataset

    Science.gov (United States)

    Rojo, Juan

    2016-04-01

    The choice of data that enters a global QCD analysis can have a substantial impact on the resulting parton distributions and their predictions for collider observables. One of the main reasons for this has to do with the possible presence of inconsistencies, either internal within an experiment or external between different experiments. In order to assess the robustness of the global fit, different definitions of a conservative PDF set, that is, a PDF set based on a maximally consistent dataset, have been introduced. However, these approaches are typically affected by theory biases in the selection of the dataset. In this contribution, after a brief overview of recent NNPDF developments, we propose a new, fully objective, definition of a conservative PDF set, based on the Bayesian reweighting approach. Using the new NNPDF3.0 framework, we produce various conservative sets, which turn out to be mutually in agreement within the respective PDF uncertainties, as well as with the global fit. We explore some of their implications for LHC phenomenology, finding also good consistency with the global fit result. These results provide a non-trivial validation test of the new NNPDF3.0 fitting methodology, and indicate that possible inconsistencies in the fitted dataset do not affect substantially the global fit PDFs.

  5. New public dataset for spotting patterns in medieval document images

    Science.gov (United States)

    En, Sovann; Nicolas, Stéphane; Petitjean, Caroline; Jurie, Frédéric; Heutte, Laurent

    2017-01-01

    With advances in technology, a large part of our cultural heritage is becoming digitally available. In particular, in the field of historical document image analysis, there is now a growing need for indexing and data mining tools, thus allowing us to spot and retrieve the occurrences of an object of interest, called a pattern, in a large database of document images. Patterns may present some variability in terms of color, shape, or context, making the spotting of patterns a challenging task. Pattern spotting is a relatively new field of research, still hampered by the lack of available annotated resources. We present a new publicly available dataset named DocExplore dedicated to spotting patterns in historical document images. The dataset contains 1500 images and 1464 queries, and allows the evaluation of two tasks: image retrieval and pattern localization. A standardized benchmark protocol along with ad hoc metrics is provided for a fair comparison of the submitted approaches. We also provide some first results obtained with our baseline system on this new dataset, which show that there is room for improvement and that should encourage researchers of the document image analysis community to design new systems and submit improved results.

  6. Kernel-based discriminant feature extraction using a representative dataset

    Science.gov (United States)

    Li, Honglin; Sancho Gomez, Jose-Luis; Ahalt, Stanley C.

    2002-07-01

    Discriminant Feature Extraction (DFE) is widely recognized as an important pre-processing step in classification applications. Most DFE algorithms are linear and thus can only explore the linear discriminant information among the different classes. Recently, there have been several promising attempts to develop nonlinear DFE algorithms, among which is Kernel-based Feature Extraction (KFE). The efficacy of KFE has been experimentally verified on both synthetic data and real problems. However, KFE has some known limitations. First, KFE does not work well for strongly overlapped data. Second, KFE employs all of the training set samples during the feature extraction phase, which can result in significant computation when applied to very large datasets. Finally, KFE can result in overfitting. In this paper, we propose a substantial improvement to KFE that overcomes the above limitations by using a representative dataset, which consists of critical points that are generated from data-editing techniques and centroid points that are determined by using the Frequency Sensitive Competitive Learning (FSCL) algorithm. Experiments show that this new KFE algorithm performs well on significantly overlapped datasets, and it also reduces computational complexity. Further, by controlling the number of centroids, the overfitting problem can be effectively alleviated.
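
    The centroid step can be sketched with a simple frequency-sensitive competitive learning loop. This is a generic illustration and not the authors' code: the function name, the learning rate, the epoch count, and the initialization from randomly sampled data points are assumptions. The frequency-sensitive part is the win count that scales each unit's distance, so units that win often become less likely to win again.

```python
import random


def fscl_centroids(points, k, epochs=20, lr=0.1, seed=0):
    # Hypothetical helper: returns k centroids and their win counts.
    rng = random.Random(seed)
    # Start each centroid at a randomly chosen data point.
    centroids = [list(p) for p in rng.sample(points, k)]
    wins = [1] * k
    for _ in range(epochs):
        order = points[:]
        rng.shuffle(order)
        for p in order:
            # Squared distance to each centroid, scaled by its win count.
            scores = [
                wins[j] * sum((a - c) ** 2 for a, c in zip(p, centroids[j]))
                for j in range(k)
            ]
            w = scores.index(min(scores))
            # Move only the winning centroid a small step toward the sample.
            centroids[w] = [c + lr * (a - c) for a, c in zip(p, centroids[w])]
            wins[w] += 1
    return centroids, wins


if __name__ == "__main__":
    data = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2),
            (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
    centers, counts = fscl_centroids(data, k=2)
    print(centers, counts)
```

    Keeping only such centroids (together with edited critical points) instead of all training samples is what reduces the computation in the kernel step, and the parameter `k` plays the role of the number of centroids that the abstract mentions as a handle on overfitting.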

  7. Decoys Selection in Benchmarking Datasets: Overview and Perspectives

    Science.gov (United States)

    Réau, Manon; Langenfeld, Florent; Zagury, Jean-François; Lagarde, Nathalie; Montes, Matthieu

    2018-01-01

    Virtual Screening (VS) is designed to prospectively help identify potential hits, i.e., compounds capable of interacting with a given target and potentially modulating its activity, out of large compound collections. Among the variety of methodologies, it is crucial to select the protocol that is best adapted to the query/target system under study and that yields the most reliable output. To this aim, the performance of VS methods is commonly evaluated and compared by computing their ability to retrieve active compounds in benchmarking datasets. The benchmarking datasets contain a subset of known active compounds together with a subset of decoys, i.e., assumed non-active molecules. The composition of both the active and the decoy compound subsets is critical to limit the biases in the evaluation of the VS methods. In this review, we focus on the selection of decoy compounds, which has considerably changed over the years, from randomly selected compounds to highly customized or experimentally validated negative compounds. We first outline the evolution of decoy selection in benchmarking databases as well as current benchmarking databases that tend to minimize the introduction of biases, and secondly, we propose recommendations for the selection and the design of benchmarking datasets. PMID:29416509
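
    To make "ability to retrieve active compounds" concrete, the sketch below computes an enrichment factor, a metric commonly used for this purpose. The abstract does not name a specific metric, so the choice of enrichment factor, the function name `enrichment_factor`, and the convention that a higher score means a better-ranked compound are all assumptions.

```python
def enrichment_factor(scores, is_active, fraction=0.01):
    """Enrichment factor in the top `fraction` of a ranked list (illustrative).

    scores: predicted scores, higher = more likely active (assumed convention).
    is_active: booleans, True for known actives, False for decoys.
    """
    ranked = sorted(zip(scores, is_active), key=lambda t: t[0], reverse=True)
    n_total = len(ranked)
    n_top = max(1, int(round(fraction * n_total)))
    actives_total = sum(1 for _, active in ranked if active)
    if actives_total == 0:
        raise ValueError("benchmark contains no active compounds")
    actives_top = sum(1 for _, active in ranked[:n_top] if active)
    # Ratio of the hit rate in the top slice to the hit rate in the whole set.
    return (actives_top / n_top) / (actives_total / n_total)


if __name__ == "__main__":
    s = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.0]
    a = [True, True, False, False, False, False, False, False, False, False]
    print(enrichment_factor(s, a, fraction=0.2))
```

    A value of 1 corresponds to random ranking. Because the denominator depends on how many decoys are in the set and how easy they are to reject, the same method can score very differently on differently constructed decoy sets, which is the bias the review is concerned with.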

  8. ENHANCED DATA DISCOVERABILITY FOR IN SITU HYPERSPECTRAL DATASETS

    Directory of Open Access Journals (Sweden)

    B. Rasaiah

    2016-06-01

    Full Text Available Field spectroscopic metadata is a central component in the quality assurance, reliability, and discoverability of hyperspectral data and the products derived from it. Cataloguing, mining, and interoperability of these datasets rely upon the robustness of metadata protocols for field spectroscopy, and on the software architecture to support the exchange of these datasets. Currently no standard for in situ spectroscopy data or metadata protocols exists. This inhibits the effective sharing of growing volumes of in situ spectroscopy datasets and limits the benefits of integrating them with the evolving range of data sharing platforms. A core metadataset for field spectroscopy was introduced by Rasaiah et al. (2011-2015) with extended support for specific applications. This paper presents a prototype model for an OGC and ISO compliant platform-independent metadata discovery service aligned to the specific requirements of field spectroscopy. In this study, a proof-of-concept metadata catalogue has been described and deployed in a cloud-based architecture as a demonstration of an operationalized field spectroscopy metadata standard and web-based discovery service.

  9. Multiresolution persistent homology for excessively large biomolecular datasets

    Energy Technology Data Exchange (ETDEWEB)

    Xia, Kelin; Zhao, Zhixiong [Department of Mathematics, Michigan State University, East Lansing, Michigan 48824 (United States); Wei, Guo-Wei, E-mail: wei@math.msu.edu [Department of Mathematics, Michigan State University, East Lansing, Michigan 48824 (United States); Department of Electrical and Computer Engineering, Michigan State University, East Lansing, Michigan 48824 (United States); Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824 (United States)

    2015-10-07

    Although persistent homology has emerged as a promising tool for the topological simplification of complex data, it is computationally intractable for large datasets. We introduce multiresolution persistent homology to handle excessively large datasets. We match the resolution with the scale of interest so as to represent large scale datasets with appropriate resolution. We utilize the flexibility-rigidity index to assess the topological connectivity of the data set and define a rigidity density for the filtration analysis. By appropriately tuning the resolution of the rigidity density, we are able to focus the topological lens on the scale of interest. The proposed multiresolution topological analysis is validated by a hexagonal fractal image which has three distinct scales. We further demonstrate the proposed method for extracting topological fingerprints from DNA molecules. In particular, the topological persistence of a virus capsid with 273 780 atoms is successfully analyzed, which would otherwise be inaccessible to the normal point cloud method and unreliable using coarse-grained multiscale persistent homology. The proposed method has also been successfully applied to protein domain classification, which is, to our knowledge, the first time that persistent homology has been used for practical protein domain analysis. The proposed multiresolution topological method has potential applications in arbitrary data sets, such as social networks, biological networks, and graphs.

  10. Tissue-Based MRI Intensity Standardization: Application to Multicentric Datasets

    Directory of Open Access Journals (Sweden)

    Nicolas Robitaille

    2012-01-01

    Full Text Available Intensity standardization in MRI aims at correcting scanner-dependent intensity variations. Existing simple and robust techniques aim at matching the input image histogram onto a standard, while we think that standardization should aim at matching spatially corresponding tissue intensities. In this study, we present a novel automatic technique, called STI for STandardization of Intensities, which not only shares the simplicity and robustness of histogram-matching techniques, but also incorporates tissue spatial intensity information. STI uses joint intensity histograms to determine intensity correspondence in each tissue between the input and standard images. We compared STI to an existing histogram-matching technique on two multicentric datasets, Pilot E-ADNI and ADNI, by measuring the intensity error with respect to the standard image after performing nonlinear registration. The Pilot E-ADNI dataset consisted of 3 subjects each scanned at 7 different sites. The ADNI dataset consisted of 795 subjects scanned at more than 50 different sites. STI was superior to the histogram-matching technique, showing significantly better intensity matching for the brain white matter with respect to the standard image.

  11. Exploring massive, genome scale datasets with the genometricorr package

    KAUST Repository

    Favorov, Alexander; Mularoni, Loris; Cope, Leslie M.; Medvedeva, Yulia; Mironov, Andrey A.; Makeev, Vsevolod J.; Wheelan, Sarah J.

    2012-01-01

    We have created a statistically grounded tool for determining the correlation of genomewide data with other datasets or known biological features, intended to guide biological exploration of high-dimensional datasets, rather than providing immediate answers. The software enables several biologically motivated approaches to these data and here we describe the rationale and implementation for each approach. Our models and statistics are implemented in an R package that efficiently calculates the spatial correlation between two sets of genomic intervals (data and/or annotated features), for use as a metric of functional interaction. The software handles any type of pointwise or interval data and instead of running analyses with predefined metrics, it computes the significance and direction of several types of spatial association; this is intended to suggest potentially relevant relationships between the datasets. Availability and implementation: The package, GenometriCorr, can be freely downloaded at http://genometricorr.sourceforge.net/. Installation guidelines and examples are available from the sourceforge repository. The package is pending submission to Bioconductor. © 2012 Favorov et al.

  12. Image segmentation evaluation for very-large datasets

    Science.gov (United States)

    Reeves, Anthony P.; Liu, Shuang; Xie, Yiting

    2016-03-01

    With the advent of modern machine learning methods and fully automated image analysis there is a need for very large image datasets having documented segmentations for both computer algorithm training and evaluation. Current approaches of visual inspection and manual markings do not scale well to big data. We present a new approach that depends on fully automated algorithm outcomes for segmentation documentation, requires no manual marking, and provides quantitative evaluation for computer algorithms. The documentation of new image segmentations and new algorithm outcomes are achieved by visual inspection. The burden of visual inspection on large datasets is minimized by (a) customized visualizations for rapid review and (b) reducing the number of cases to be reviewed through analysis of quantitative segmentation evaluation. This method has been applied to a dataset of 7,440 whole-lung CT images for 6 different segmentation algorithms designed to fully automatically facilitate the measurement of a number of very important quantitative image biomarkers. The results indicate that we could achieve 93% to 99% successful segmentation for these algorithms on this relatively large image database. The presented evaluation method may be scaled to much larger image databases.

  13. Exploring massive, genome scale datasets with the genometricorr package

    KAUST Repository

    Favorov, Alexander

    2012-05-31

    We have created a statistically grounded tool for determining the correlation of genomewide data with other datasets or known biological features, intended to guide biological exploration of high-dimensional datasets, rather than providing immediate answers. The software enables several biologically motivated approaches to these data and here we describe the rationale and implementation for each approach. Our models and statistics are implemented in an R package that efficiently calculates the spatial correlation between two sets of genomic intervals (data and/or annotated features), for use as a metric of functional interaction. The software handles any type of pointwise or interval data and instead of running analyses with predefined metrics, it computes the significance and direction of several types of spatial association; this is intended to suggest potentially relevant relationships between the datasets. Availability and implementation: The package, GenometriCorr, can be freely downloaded at http://genometricorr.sourceforge.net/. Installation guidelines and examples are available from the sourceforge repository. The package is pending submission to Bioconductor. © 2012 Favorov et al.

  14. Principal Component Analysis of Process Datasets with Missing Values

    Directory of Open Access Journals (Sweden)

    Kristen A. Severson

    2017-07-01

    Full Text Available Datasets with missing values arising from causes such as sensor failure, inconsistent sampling rates, and merging data from different systems are common in the process industry. Methods for handling missing data typically operate during data pre-processing, but can also operate during model building. This article considers missing data within the context of principal component analysis (PCA), which is a method originally developed for complete data that has widespread industrial application in multivariate statistical process control. Due to the prevalence of missing data and the success of PCA for handling complete data, several PCA algorithms that can act on incomplete data have been proposed. Here, algorithms for applying PCA to datasets with missing values are reviewed. A case study is presented to demonstrate the performance of the algorithms and suggestions are made with respect to choosing which algorithm is most appropriate for particular settings. An alternating algorithm based on the singular value decomposition achieved the best results in the majority of test cases involving process datasets.
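
    The idea of an alternating, decomposition-based algorithm can be sketched as follows: fill the gaps with a first guess, fit a low-rank model, overwrite only the gaps with the model's reconstruction, and repeat. The sketch below is a deliberately simplified rank-1 version in pure Python and is not one of the specific algorithms reviewed in the article. The function name `rank1_impute`, the use of power iteration in place of a full SVD, the absence of mean-centring, and the iteration counts are all assumptions; it also assumes every column has at least one observed value and the data are not all zero.

```python
def rank1_impute(rows, outer_iters=300, inner_iters=50):
    """Alternate between a rank-1 fit and refilling missing cells (illustrative).

    rows: list of equal-length lists; missing cells are given as None.
    Returns a completed copy of the matrix as floats.
    """
    n, m = len(rows), len(rows[0])
    missing = [(i, j) for i in range(n) for j in range(m) if rows[i][j] is None]
    # First guess: the mean of the observed values in each column.
    col_means = []
    for j in range(m):
        obs = [rows[i][j] for i in range(n) if rows[i][j] is not None]
        col_means.append(sum(obs) / len(obs))
    X = [
        [col_means[j] if rows[i][j] is None else float(rows[i][j]) for j in range(m)]
        for i in range(n)
    ]
    for _ in range(outer_iters):
        # Leading right singular vector v via power iteration on X^T X.
        v = [1.0] * m
        for _ in range(inner_iters):
            t = [sum(X[i][j] * v[j] for j in range(m)) for i in range(n)]  # X v
            w = [sum(X[i][j] * t[i] for i in range(n)) for j in range(m)]  # X^T X v
            norm = sum(x * x for x in w) ** 0.5
            v = [x / norm for x in w]
        # Scores t = X v, so the rank-1 reconstruction is t v^T.
        t = [sum(X[i][j] * v[j] for j in range(m)) for i in range(n)]
        for i, j in missing:
            X[i][j] = t[i] * v[j]  # only the missing cells are overwritten
    return X


if __name__ == "__main__":
    print(rank1_impute([[1, 2], [2, 4], [3, None]]))
```

    Observed entries are never modified, so each pass can only change the imputed cells. A practical implementation would centre the data, retain several components, and stop on a convergence criterion rather than a fixed iteration count.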

  15. A cross-country Exchange Market Pressure (EMP) dataset

    Directory of Open Access Journals (Sweden)

    Mohit Desai

    2017-06-01

    Full Text Available The data presented in this article are related to the research article titled “An exchange market pressure measure for cross country analysis” (Patnaik et al. [1]). In this article, we present the dataset of Exchange Market Pressure (EMP) values for 139 countries along with their conversion factors, ρ (rho). Exchange Market Pressure, expressed as a percentage change in the exchange rate, measures the change in exchange rate that would have taken place had the central bank not intervened. The conversion factor ρ can be interpreted as the change in exchange rate associated with $1 billion of intervention. Estimates of the conversion factor ρ allow us to calculate a monthly time series of EMP for 139 countries. Additionally, the dataset contains the 68% confidence interval (high and low values) for the point estimates of ρ. Using the standard errors of the estimates of ρ, we obtain one-sigma intervals around the mean estimates of EMP values. These values are also reported in the dataset.
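
    The relationship described in this abstract can be written as a short calculation: EMP is the observed percentage change in the exchange rate plus the change that intervention is estimated to have absorbed, which is ρ times the intervention in billions of dollars. The sketch below is an illustration under stated assumptions, not code from the dataset. The function names `emp` and `emp_one_sigma` are hypothetical, the sign convention for intervention is assumed, and the one-sigma band treats the standard error of ρ as the only source of uncertainty.

```python
def emp(pct_change_exchange_rate, intervention_bn_usd, rho):
    """Exchange market pressure in percent (illustrative).

    pct_change_exchange_rate: observed monthly % change in the exchange rate.
    intervention_bn_usd: central bank intervention in billions of USD
        (sign convention is an assumption of this sketch).
    rho: % change in the exchange rate associated with $1 billion of intervention.
    """
    return pct_change_exchange_rate + rho * intervention_bn_usd


def emp_one_sigma(pct_change_exchange_rate, intervention_bn_usd, rho, rho_se):
    """Low and high EMP values from a one-standard-error band around rho."""
    centre = emp(pct_change_exchange_rate, intervention_bn_usd, rho)
    half_width = abs(rho_se * intervention_bn_usd)
    return centre - half_width, centre + half_width


if __name__ == "__main__":
    print(emp(1.0, 2.0, 0.5))
    print(emp_one_sigma(1.0, 2.0, 0.5, 0.1))
```

    With no intervention the band collapses to the observed exchange rate change, which matches the interpretation of EMP as the change that would have occurred had the central bank not intervened.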

  16. Omics strategies for revealing Yersinia pestis virulence

    Science.gov (United States)

    Yang, Ruifu; Du, Zongmin; Han, Yanping; Zhou, Lei; Song, Yajun; Zhou, Dongsheng; Cui, Yujun

    2012-01-01

    Omics has remarkably changed the way we investigate and understand life. Omics differs from traditional hypothesis-driven research because it is a discovery-driven approach. Mass datasets produced from omics-based studies require experts from different fields to reveal the salient features behind these data. In this review, we summarize omics-driven studies to reveal the virulence features of Yersinia pestis through genomics, transcriptomics, proteomics, interactomics, etc. These studies serve as foundations for further hypothesis-driven research and help us gain insight into Y. pestis pathogenesis. PMID:23248778

  17. Discovering New Global Climate Patterns: Curating a 21-Year High Temporal (Hourly) and Spatial (40km) Resolution Reanalysis Dataset

    Science.gov (United States)

    Hou, C. Y.; Dattore, R.; Peng, G. S.

    2014-12-01

    The National Center for Atmospheric Research's Global Climate Four-Dimensional Data Assimilation (CFDDA) Hourly 40km Reanalysis dataset is a dynamically downscaled dataset with high temporal and spatial resolution. The dataset contains three-dimensional hourly analyses in netCDF format for the global atmospheric state from 1985 to 2005 on a 40km horizontal grid (0.4° grid increment) with 28 vertical levels, providing good representation of local forcing and diurnal variation of processes in the planetary boundary layer. This project aimed to make the dataset publicly available, accessible, and usable in order to provide a unique resource to allow and promote studies of new climate characteristics. When the curation project started, it had been five years since the data files were generated. Also, although the Principal Investigator (PI) had generated a user document at the end of the project in 2009, the document had not been maintained. Furthermore, the PI had moved to a new institution, and the remaining team members were reassigned to other projects. These factors made data curation in the areas of verifying data quality, harvesting metadata descriptions, and documenting provenance information especially challenging. As a result, the project's curation process found that: Data curators' skill and knowledge helped make decisions, such as file format and structure and workflow documentation, that had a significant, positive impact on the ease of the dataset's management and long-term preservation. Use of data curation tools, such as the Data Curation Profiles Toolkit's guidelines, revealed important information for promoting the data's usability and enhancing preservation planning. Involving data curators during each stage of the data curation life cycle instead of at the end could improve the curation process' efficiency. Overall, the project showed that proper resources invested in the curation process would give datasets the best chance to fulfill their potential to

  18. The Role of Datasets on Scientific Influence within Conflict Research

    Science.gov (United States)

    Van Holt, Tracy; Johnson, Jeffery C.; Moates, Shiloh; Carley, Kathleen M.

    2016-01-01

    We inductively tested if a coherent field of inquiry in human conflict research emerged in an analysis of published research involving “conflict” in the Web of Science (WoS) over a 66-year period (1945–2011). We created a citation network that linked the 62,504 WoS records and their cited literature. We performed a critical path analysis (CPA), a specialized social network analysis on this citation network (~1.5 million works), to highlight the main contributions in conflict research and to test if research on conflict has in fact evolved to represent a coherent field of inquiry. Out of this vast dataset, 49 academic works were highlighted by the CPA suggesting a coherent field of inquiry; which means that researchers in the field acknowledge seminal contributions and share a common knowledge base. Other conflict concepts that were also analyzed, such as interpersonal conflict or conflict among pharmaceuticals, for example, did not form their own CP. A single path formed, meaning that there was a cohesive set of ideas that built upon previous research. This is in contrast to a main path analysis of conflict from 1957–1971 where ideas didn’t persist in that multiple paths existed and died or emerged reflecting lack of scientific coherence (Carley, Hummon, and Harty, 1993). The critical path consisted of a number of key features: 1) Concepts that built throughout include the notion that resource availability drives conflict, which emerged in the 1960s-1990s and continued on until 2011. More recent intrastate studies that focused on inequalities emerged from interstate studies on the democracy of peace earlier on the path. 2) Recent research on the path focused on forecasting conflict, which depends on well-developed metrics and theories to model. 3) We used keyword analysis to independently show how the CP was topically linked (i.e., through democracy, modeling, resources, and geography). Publicly available conflict datasets developed early on helped

  19. The Role of Datasets on Scientific Influence within Conflict Research.

    Directory of Open Access Journals (Sweden)

    Tracy Van Holt

    Full Text Available We inductively tested if a coherent field of inquiry in human conflict research emerged in an analysis of published research involving "conflict" in the Web of Science (WoS) over a 66-year period (1945-2011). We created a citation network that linked the 62,504 WoS records and their cited literature. We performed a critical path analysis (CPA), a specialized social network analysis on this citation network (~1.5 million works), to highlight the main contributions in conflict research and to test if research on conflict has in fact evolved to represent a coherent field of inquiry. Out of this vast dataset, 49 academic works were highlighted by the CPA suggesting a coherent field of inquiry; which means that researchers in the field acknowledge seminal contributions and share a common knowledge base. Other conflict concepts that were also analyzed, such as interpersonal conflict or conflict among pharmaceuticals, for example, did not form their own CP. A single path formed, meaning that there was a cohesive set of ideas that built upon previous research. This is in contrast to a main path analysis of conflict from 1957-1971 where ideas didn't persist in that multiple paths existed and died or emerged reflecting lack of scientific coherence (Carley, Hummon, and Harty, 1993). The critical path consisted of a number of key features: 1) Concepts that built throughout include the notion that resource availability drives conflict, which emerged in the 1960s-1990s and continued on until 2011. More recent intrastate studies that focused on inequalities emerged from interstate studies on the democracy of peace earlier on the path. 2) Recent research on the path focused on forecasting conflict, which depends on well-developed metrics and theories to model. 3) We used keyword analysis to independently show how the CP was topically linked (i.e., through democracy, modeling, resources, and geography). Publicly available conflict datasets developed early on helped

  20. The Role of Datasets on Scientific Influence within Conflict Research.

    Science.gov (United States)

    Van Holt, Tracy; Johnson, Jeffery C; Moates, Shiloh; Carley, Kathleen M

    2016-01-01

    We inductively tested if a coherent field of inquiry in human conflict research emerged in an analysis of published research involving "conflict" in the Web of Science (WoS) over a 66-year period (1945-2011). We created a citation network that linked the 62,504 WoS records and their cited literature. We performed a critical path analysis (CPA), a specialized social network analysis on this citation network (~1.5 million works), to highlight the main contributions in conflict research and to test if research on conflict has in fact evolved to represent a coherent field of inquiry. Out of this vast dataset, 49 academic works were highlighted by the CPA suggesting a coherent field of inquiry; which means that researchers in the field acknowledge seminal contributions and share a common knowledge base. Other conflict concepts that were also analyzed, such as interpersonal conflict or conflict among pharmaceuticals, for example, did not form their own CP. A single path formed, meaning that there was a cohesive set of ideas that built upon previous research. This is in contrast to a main path analysis of conflict from 1957-1971 where ideas didn't persist in that multiple paths existed and died or emerged reflecting lack of scientific coherence (Carley, Hummon, and Harty, 1993). The critical path consisted of a number of key features: 1) Concepts that built throughout include the notion that resource availability drives conflict, which emerged in the 1960s-1990s and continued on until 2011. More recent intrastate studies that focused on inequalities emerged from interstate studies on the democracy of peace earlier on the path. 2) Recent research on the path focused on forecasting conflict, which depends on well-developed metrics and theories to model. 3) We used keyword analysis to independently show how the CP was topically linked (i.e., through democracy, modeling, resources, and geography). Publicly available conflict datasets developed early on helped shape the

  1. Mechanical control over valley magnetotransport in strained graphene

    Energy Technology Data Exchange (ETDEWEB)

    Ma, Ning, E-mail: maning@stu.xjtu.edu.cn [Department of Physics, MOE Key Laboratory of Advanced Transducers and Intelligent Control System, Taiyuan University of Technology, Taiyuan 030024 (China); Department of Applied Physics, MOE Key Laboratory for Nonequilibrium Synthesis and Modulation of Condensed Matter, Xi'an Jiaotong University, Xi'an 710049 (China); Zhang, Shengli, E-mail: zhangsl@mail.xjtu.edu.cn [Department of Applied Physics, MOE Key Laboratory for Nonequilibrium Synthesis and Modulation of Condensed Matter, Xi'an Jiaotong University, Xi'an 710049 (China); Liu, Daqing, E-mail: liudq@cczu.edu.cn [School of Mathematics and Physics, Changzhou University, Changzhou 213164 (China)

    2016-05-06

    Recent experiments report that graphene exhibits Landau levels (LLs) that form in the presence of a uniform strain pseudomagnetic field with magnitudes up to hundreds of tesla. We further reveal that the strain removes the valley degeneracy in LLs, and leads to a significant valley polarization with broken inversion symmetry. This accordingly gives rise to well separated valley Hall plateaus and Shubnikov–de Haas oscillations. These effects are absent in strainless graphene, and can be used to generate and detect valley polarization by mechanical means, forming the basis for applications in the new paradigm of “valleytronics”. - Highlights: • We explore the mechanical strain effects on the valley magnetotransport in graphene. • We analytically derive the dc collisional and Hall conductivities under strain. • The strain removes the valley degeneracy in Landau levels. • The strain causes a significant valley polarization with broken inversion symmetry. • The strain leads to well separated valley Hall and Shubnikov–de Haas effects.

  2. Internally Mounting Strain Gages

    Science.gov (United States)

    Jett, J. R., Jr.

    1984-01-01

    Technique for mounting strain gages inside bolt or cylinder simultaneously inserts gage, attached dowel segment, and length of expandable tubing. Expandable tubing holds gage in place while adhesive cures, assuring even distribution of pressure on gage and area gaged.

  3. Running Title: Strained Yoghurts

    African Journals Online (AJOL)

    USER

    2012-09-27

    Sep 27, 2012 ... ever, the traditional method of producing strained yoghurt ... Food market studies have the essential function of providing ..... Communication No: 2001/21. ... fermented foods and beverages of Turkey. Crit. Rev. Food. Sci. Nutr.

  4. Different distribution patterns of ten virulence genes in Legionella reference strains and strains isolated from environmental water and patients.

    Science.gov (United States)

    Zhan, Xiao-Yong; Hu, Chao-Hui; Zhu, Qing-Yi

    2016-04-01

    Virulence genes are distinct regions of DNA which are present in the genome of pathogenic bacteria and absent in nonpathogenic strains of the same or related species. Virulence genes are frequently associated with bacterial pathogenicity in the genus Legionella. In the present study, an assay was performed to detect ten virulence genes, including iraA, iraB, lvrA, lvrB, lvhD, cpxR, cpxA, dotA, icmC and icmD, in different pathogenicity islands of 47 Legionella reference strains, 235 environmental strains isolated from water, and 4 clinical strains isolated from the lung tissue of pneumonia patients. The distribution frequencies of these genes in reference and/or environmental L. pneumophila strains were much higher than those in reference and/or environmental non-L. pneumophila strains, respectively. L. pneumophila clinical strains also maintained higher frequencies of these genes compared to the four other types of Legionella strains. Distribution frequencies of these genes in reference L. pneumophila strains were similar to those in environmental L. pneumophila strains. In contrast, environmental non-L. pneumophila strains maintained higher frequencies of these genes compared to those found in reference non-L. pneumophila strains. This study illustrates the association of virulence genes with Legionella pathogenicity and reveals the possible virulence evolution of non-L. pneumophila strains isolated from environmental water.

  5. Draft genome sequence of two Sphingopyxis sp. strains H107 and H115 isolated from a chloraminated drinking water distribution system simulator

    Data.gov (United States)

    U.S. Environmental Protection Agency — Draft genome sequence of two Sphingopyxis sp. strains H107 and H115 isolated from a chloraminated drinking water distribution system simulator. This dataset is...

  6. Flexible piezotronic strain sensor.

    Science.gov (United States)

    Zhou, Jun; Gu, Yudong; Fei, Peng; Mai, Wenjie; Gao, Yifan; Yang, Rusen; Bao, Gang; Wang, Zhong Lin

    2008-09-01

    Strain sensors based on individual ZnO piezoelectric fine-wires (PFWs; nanowires, microwires) have been fabricated by a simple, reliable, and cost-effective technique. The electromechanical sensor device consists of a single electrically connected PFW that is placed on the outer surface of a flexible polystyrene (PS) substrate and bonded at its two ends. The entire device is fully packaged by a polydimethylsiloxane (PDMS) thin layer. The PFW has Schottky contacts at its two ends but with distinctly different barrier heights. The I-V characteristic is highly sensitive to strain, mainly due to the change in Schottky barrier height (SBH), which scales linearly with strain. The change in SBH is attributed to the strain-induced band structure change and the piezoelectric effect. The experimental data can be well described by the thermionic emission-diffusion model. A gauge factor as high as 1250 has been demonstrated, which is 25% higher than the best gauge factor demonstrated for carbon nanotubes. The strain sensor developed here has applications in strain and stress measurements in cell biology, biomedical sciences, MEMS devices, structure monitoring, and more.
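
    Two quantities in this abstract lend themselves to a small worked calculation: the gauge factor and the change in Schottky barrier height. The sketch below is illustrative only. It assumes the gauge factor is defined as the relative change in current per unit strain, and it uses the simple thermionic emission relation, in which current is proportional to exp(-SBH / kT), rather than the full thermionic emission-diffusion model named in the abstract. The function names `gauge_factor` and `sbh_change_ev` are hypothetical.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def gauge_factor(i_strained, i_unstrained, strain):
    """Relative change in current per unit strain (assumed definition)."""
    return ((i_strained - i_unstrained) / i_unstrained) / strain


def sbh_change_ev(i_strained, i_unstrained, temperature_k=300.0):
    """Change in Schottky barrier height (eV) implied by a current ratio.

    Assumes simple thermionic emission, I proportional to exp(-SBH / kT),
    at a fixed bias; a drop in current then corresponds to a higher barrier.
    """
    return -K_B_EV * temperature_k * math.log(i_strained / i_unstrained)


if __name__ == "__main__":
    print(gauge_factor(2.0, 1.0, 0.001))
    print(sbh_change_ev(0.5, 1.0))
```

    Because the current depends exponentially on barrier height in this model, a barrier shift that is linear in strain produces a large relative current change, which is consistent with the high gauge factor reported.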

  7. Unified Scaling Law for flux pinning in practical superconductors: III. Minimum datasets, core parameters, and application of the Extrapolative Scaling Expression

    Science.gov (United States)

    Ekin, Jack W.; Cheggour, Najib; Goodrich, Loren; Splett, Jolene

    2017-03-01

    In Part 2 of these articles, an extensive analysis of pinning-force curves and raw scaling data was used to derive the Extrapolative Scaling Expression (ESE). This is a parameterization of the Unified Scaling Law (USL) that has the extrapolation capability of fundamental unified scaling, coupled with the application ease of a simple fitting equation. Here in Part 3, the accuracy of the ESE relation to interpolate and extrapolate limited critical-current data to obtain complete I_c(B,T,ɛ) datasets is evaluated and compared with present fitting equations. Accuracy is analyzed in terms of root mean square (RMS) error and fractional deviation statistics. Highlights from 92 test cases are condensed and summarized, covering most fitting protocols and proposed parameterizations of the USL. The results show that ESE reliably extrapolates critical currents at fields B, temperatures T, and strains ɛ that are remarkably different from the fitted minimum dataset. Depending on whether the conductor is moderate-J_c or high-J_c, effective RMS extrapolation errors for ESE are in the range 2-5 A at 12 T, which approaches the I_c measurement error (1-2%). The minimum dataset for extrapolating full I_c(B,T,ɛ) characteristics is also determined from raw scaling data. It consists of one set of I_c(B,ɛ) data at a fixed temperature (e.g., liquid helium temperature), and one set of I_c(B,T) data at a fixed strain (e.g., zero applied strain). Error analysis of extrapolations from the minimum dataset with different fitting equations shows that ESE reduces the percentage extrapolation errors at individual data points at high fields, temperatures, and compressive strains down to 1/10th to 1/40th the size of those for extrapolations with present fitting equations. Depending on the conductor, percentage fitting errors for interpolations are also reduced to as little as 1/15th the size. The extrapolation accuracy of the ESE relation offers the prospect of straightforward implementation of

  8. Knowledge Evolution in Distributed Geoscience Datasets and the Role of Semantic Technologies

    Science.gov (United States)

    Ma, X.

    2014-12-01

    Knowledge evolves in geoscience, and the evolution is reflected in datasets. In a context with distributed data sources, the evolution of knowledge may cause considerable challenges to data management and re-use. For example, a short news item published in 2009 (Mascarelli, 2009) revealed the geoscience community's concern that the International Commission on Stratigraphy's change to the definition of the Quaternary may require heavy reworking of geologic maps. Now we are in the era of the World Wide Web, and geoscience knowledge is increasingly modeled and encoded in the form of ontologies and vocabularies by using semantic technologies. Accordingly, knowledge evolution leads to a consequence called ontology dynamics. Flouris et al. (2008) summarized 10 topics of general ontology changes/dynamics, such as ontology mapping, morphism, evolution, debugging, and versioning. Ontology dynamics affects several stages of a data life cycle and causes challenges such as requests for reworking of the extant data in a data center, semantic mismatch among data sources, differing understanding of the same dataset between data providers and data users, and error propagation in cross-discipline data discovery and re-use (Ma et al., 2014). This presentation will analyze the best practices in the geoscience community so far and summarize a few recommendations to reduce the negative impacts of ontology dynamics in a data life cycle, including communities of practice and collaboration on ontology and vocabulary building, linking data records to standardized terms, and methods for (semi-)automatic reworking of datasets using semantic technologies. References: Flouris, G., Manakanatas, D., Kondylakis, H., Plexousakis, D., Antoniou, G., 2008. Ontology change: classification and survey. The Knowledge Engineering Review 23 (2), 117-152. Ma, X., Fox, P., Rozell, E., West, P., Zednik, S., 2014. Ontology dynamics in a data life cycle: Challenges and recommendations

  9. A curated database of cyanobacterial strains relevant for modern taxonomy and phylogenetic studies.

    Science.gov (United States)

    Ramos, Vitor; Morais, João; Vasconcelos, Vitor M

    2017-04-25

    The dataset herein described lays the groundwork for an online database of relevant cyanobacterial strains, named CyanoType (http://lege.ciimar.up.pt/cyanotype). It is a database that includes categorized cyanobacterial strains useful for taxonomic, phylogenetic or genomic purposes, with associated information obtained by means of a literature-based curation. The dataset lists 371 strains and represents the first version of the database (CyanoType v.1). Information for each strain includes strain synonymy and/or co-identity, strain categorization, habitat, accession numbers for molecular data, taxonomy and nomenclature notes according to three different classification schemes, hierarchical automatic classification, phylogenetic placement according to a selection of relevant studies (including this), and important bibliographic references. The database will be updated periodically, namely by adding new strains meeting the criteria for inclusion and by revising and adding up-to-date metadata for strains already listed. A global 16S rDNA-based phylogeny is provided in order to assist users when choosing the appropriate strains for their studies.

  10. Animated analysis of geoscientific datasets: An interactive graphical application

    Science.gov (United States)

    Morse, Peter; Reading, Anya; Lueg, Christopher

    2017-12-01

    Geoscientists are required to analyze and draw conclusions from increasingly large volumes of data. There is a need to recognise and characterise features and changing patterns of Earth observables within such large datasets. It is also necessary to identify significant subsets of the data for more detailed analysis. We present an innovative, interactive software tool and workflow to visualise, characterise, sample and tag large geoscientific datasets from both local and cloud-based repositories. It uses an animated interface and human-computer interaction to utilise the capacity of human expert observers to identify features via enhanced visual analytics. 'Tagger' enables users to analyze datasets that are too large in volume to be drawn legibly on a reasonable number of single static plots. Users interact with the moving graphical display, tagging data ranges of interest for subsequent attention. The tool provides a rapid pre-pass process using fast GPU-based OpenGL graphics and data-handling and is coded in the Quartz Composer visual programming language (VPL) on Mac OS X. It makes use of interoperable data formats, and cloud-based (or local) data storage and compute. In a case study, Tagger was used to characterise a decade (2000-2009) of data recorded by the Cape Sorell Waverider Buoy, located approximately 10 km off the west coast of Tasmania, Australia. These data serve as a proxy for the understanding of Southern Ocean storminess, which has both local and global implications. This example shows use of the tool to identify and characterise 4 different types of storm and non-storm events during this time. Events characterised in this way are compared with conventional analysis, noting advantages and limitations of data analysis using animation and human interaction. Tagger provides a new ability to make use of humans as feature detectors in computer-based analysis of large-volume geosciences and other data.

  11. Designing the colorectal cancer core dataset in Iran

    Directory of Open Access Journals (Sweden)

    Sara Dorri

    2017-01-01

    Full Text Available Background: The importance of collecting, recording and analyzing disease information in any health organization needs no explanation. In this regard, the systematic design of standard data sets can help to record uniform and consistent information and can create interoperability between health care systems. The main purpose of this study was to design the core dataset for recording colorectal cancer information in Iran. Methods: To design the colorectal cancer core data set, a combination of literature review and expert consensus was used. In the first phase, a draft of the data set was designed based on a colorectal cancer literature review and comparative studies. In the second phase, this data set was evaluated by experts from different disciplines such as medical informatics, oncology and surgery, and their comments and opinions were collected. In the third phase, the refined data set was evaluated again by experts and the final data set was proposed. Results: In the first phase, based on the literature review, a draft set of 85 data elements was designed. In the second phase, this data set was evaluated by experts and supplementary information was offered by professionals in subgroups, especially in the treatment part. In this phase the total number of elements reached 93. In the third phase, evaluation was conducted by experts and the final dataset was organized in five main parts: demographic information, diagnostic information, treatment information, clinical status assessment information, and clinical trial information. Conclusion: In this study a comprehensive core data set for colorectal cancer was designed. This dataset can be useful for collecting colorectal cancer information by facilitating the exchange of health information. Designing such data sets for similar diseases can help providers to collect standard data from patients and can accelerate retrieval from storage systems.

  12. FTSPlot: fast time series visualization for large datasets.

    Directory of Open Access Journals (Sweden)

    Michael Riss

    Full Text Available The analysis of electrophysiological recordings often involves visual inspection of time series data to locate specific experiment epochs, mask artifacts, and verify the results of signal processing steps, such as filtering or spike detection. Long-term experiments with continuous data acquisition generate large amounts of data. Rapid browsing through these massive datasets poses a challenge to conventional data plotting software because the plotting time increases proportionately to the increase in the volume of data. This paper presents FTSPlot, which is a visualization concept for large-scale time series datasets using techniques from the field of high performance computer graphics, such as hierarchic level of detail and out-of-core data handling. In a preprocessing step, time series data, event, and interval annotations are converted into an optimized data format, which then permits fast, interactive visualization. The preprocessing step has a computational complexity of O(n x log(N)); the visualization itself can be done with a complexity of O(1) and is therefore independent of the amount of data. A demonstration prototype has been implemented and benchmarks show that the technology is capable of displaying large amounts of time series data, event, and interval annotations lag-free with < 20 ms. The current 64-bit implementation theoretically supports datasets with up to 2^64 bytes, on the x86_64 architecture currently up to 2^48 bytes are supported, and benchmarks have been conducted with 2^40 bytes/1 TiB or 1.3 x 10^11 double precision samples. The presented software is freely available and can be included as a Qt GUI component in future software projects, providing a standard visualization method for long-term electrophysiological experiments.
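
    The abstract names "hierarchic level of detail" without describing the data format. As a rough illustration of the general idea only, and not FTSPlot's actual implementation, the following Python sketch precomputes a min/max pyramid so that a viewer can draw any time window from a bounded number of points. The function names, the reduction factor of 2, and the sample values are assumptions made for this example.

```python
from typing import List, Tuple

MinMax = Tuple[float, float]


def build_minmax_pyramid(samples: List[float]) -> List[List[MinMax]]:
    """Precompute coarser and coarser (min, max) summaries of a series.

    Level 0 holds one (min, max) pair per sample; each further level
    merges neighbouring pairs two at a time (an assumed reduction factor).
    """
    level: List[MinMax] = [(s, s) for s in samples]
    levels = [level]
    while len(level) > 1:
        level = [
            (
                min(lo for lo, _ in level[i:i + 2]),
                max(hi for _, hi in level[i:i + 2]),
            )
            for i in range(0, len(level), 2)
        ]
        levels.append(level)
    return levels


def pick_level(n_levels: int, span: int, max_points: int) -> int:
    """Return the finest level at which `span` samples need at most
    `max_points` (min, max) pairs to be drawn."""
    level = 0
    while level < n_levels - 1 and span / (2 ** level) > max_points:
        level += 1
    return level


# Hypothetical example data: eight samples, drawn on a 2-pixel-wide plot.
example = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]
pyramid = build_minmax_pyramid(example)
chosen = pick_level(len(pyramid), span=len(example), max_points=2)
```

    Because the number of pairs drawn is capped by the display width rather than by the length of the recording, the drawing cost in such a scheme does not grow with the dataset, which is the property the abstract describes for the visualization step.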

  13. A synthetic dataset for evaluating soft and hard fusion algorithms

    Science.gov (United States)

    Graham, Jacob L.; Hall, David L.; Rimland, Jeffrey

    2011-06-01

    There is an emerging demand for the development of data fusion techniques and algorithms that are capable of combining conventional "hard" sensor inputs such as video, radar, and multispectral sensor data with "soft" data including textual situation reports, open-source web information, and "hard/soft" data such as image or video data that includes human-generated annotations. New techniques that assist in sense-making over a wide range of vastly heterogeneous sources are critical to improving tactical situational awareness in counterinsurgency (COIN) and other asymmetric warfare situations. A major challenge in this area is the lack of realistic datasets available for test and evaluation of such algorithms. While "soft" message sets exist, they tend to be of limited use for data fusion applications due to the lack of critical message pedigree and other metadata. They also lack corresponding hard sensor data that presents reasonable "fusion opportunities" to evaluate the ability to make connections and inferences that span the soft and hard data sets. This paper outlines the design methodologies, content, and some potential use cases of a COIN-based synthetic soft and hard dataset created under a United States Multi-disciplinary University Research Initiative (MURI) program funded by the U.S. Army Research Office (ARO). The dataset includes realistic synthetic reports from a variety of sources, corresponding synthetic hard data, and an extensive supporting database that maintains "ground truth" through logical grouping of related data into "vignettes." The supporting database also maintains the pedigree of messages and other critical metadata.

  14. A curated transcriptome dataset collection to investigate the functional programming of human hematopoietic cells in early life.

    Science.gov (United States)

    Rahman, Mahbuba; Boughorbel, Sabri; Presnell, Scott; Quinn, Charlie; Cugno, Chiara; Chaussabel, Damien; Marr, Nico

    2016-01-01

    Compendia of large-scale datasets made available in public repositories provide an opportunity to identify and fill gaps in biomedical knowledge. But first, these data need to be made readily accessible to research investigators for interpretation. Here we make available a collection of transcriptome datasets to investigate the functional programming of human hematopoietic cells in early life. Thirty-two datasets were retrieved from the NCBI Gene Expression Omnibus (GEO) and loaded in a custom web application called the Gene Expression Browser (GXB), which was designed for interactive query and visualization of integrated large-scale data. Quality control checks were performed. Multiple sample groupings and gene rank lists were created allowing users to reveal age-related differences in transcriptome profiles, changes in the gene expression of neonatal hematopoietic cells to a variety of immune stimulators and modulators, as well as during cell differentiation. Available demographic, clinical, and cell phenotypic information can be overlaid with the gene expression data and used to sort samples. Web links to customized graphical views can be generated and subsequently inserted in manuscripts to report novel findings. GXB also enables browsing of a single gene across projects, thereby providing new perspectives on age- and developmental stage-specific expression of a given gene across the human hematopoietic system. This dataset collection is available at: http://developmentalimmunology.gxbsidra.org/dm3/geneBrowser/list.

  15. StrainSeeker: fast identification of bacterial strains from raw sequencing reads using user-provided guide trees.

    Science.gov (United States)

    Roosaare, Märt; Vaher, Mihkel; Kaplinski, Lauris; Möls, Märt; Andreson, Reidar; Lepamets, Maarja; Kõressaar, Triinu; Naaber, Paul; Kõljalg, Siiri; Remm, Maido

    2017-01-01

    Fast, accurate and high-throughput identification of bacterial isolates is in great demand. The present work was conducted to investigate the possibility of identifying isolates from unassembled next-generation sequencing reads using custom-made guide trees. A tool named StrainSeeker was developed that constructs a list of specific k-mers for each node of any given Newick-format tree and enables the identification of bacterial isolates in 1-2 min. It uses a novel algorithm, which analyses the observed and expected fractions of node-specific k-mers to test the presence of each node in the sample. This allows StrainSeeker to determine where the isolate branches off the guide tree and assign it to a clade whereas other tools assign each read to a reference genome. Using a dataset of 100 Escherichia coli isolates, we demonstrate that StrainSeeker can predict the clades of E. coli with 92% accuracy and correct tree branch assignment with 98% accuracy. Twenty-five thousand Illumina HiSeq reads are sufficient for identification of the strain. StrainSeeker is a software program that identifies bacterial isolates by assigning them to nodes or leaves of a custom-made guide tree. StrainSeeker's web interface and pre-computed guide trees are available at http://bioinfo.ut.ee/strainseeker. Source code is stored at GitHub: https://github.com/bioinfo-ut/StrainSeeker.
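
    To make the notion of an "observed fraction of node-specific k-mers" concrete, the Python sketch below computes, for each node, the share of its specific k-mers that appear in a set of reads. It is a simplified illustration and not StrainSeeker's algorithm: the statistical comparison against an expected fraction is omitted, and the node names, k-mer sets, reads, and the value k = 3 are all hypothetical.

```python
from typing import Dict, Iterable, Set


def kmers(seq: str, k: int) -> Set[str]:
    """Return the set of all substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def observed_fractions(
    reads: Iterable[str],
    node_kmers: Dict[str, Set[str]],
    k: int,
) -> Dict[str, float]:
    """For each node, the fraction of its specific k-mers seen in the reads."""
    seen: Set[str] = set()
    for read in reads:
        seen |= kmers(read, k)
    return {
        node: (len(specific & seen) / len(specific) if specific else 0.0)
        for node, specific in node_kmers.items()
    }


# Hypothetical toy input: two clades with two specific 3-mers each.
toy_node_kmers = {
    "cladeA": {"ACG", "CGT"},
    "cladeB": {"TTT", "GGG"},
}
toy_reads = ["ACGT", "GGGA"]
fractions = observed_fractions(toy_reads, toy_node_kmers, k=3)
```

    In a full method, each observed fraction would be compared with the fraction expected at the given sequencing depth before deciding whether a node is present in the sample; this sketch stops at the observed values.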

  16. Identifying frauds and anomalies in Medicare-B dataset.

    Science.gov (United States)

    Jiwon Seo; Mendelevitch, Ofer

    2017-07-01

    The healthcare industry is growing at a rapid rate toward a market value of $7 trillion worldwide. At the same time, fraud in healthcare is becoming a serious problem, amounting to 5% of total healthcare spending, or $100 billion each year in the US. Detecting healthcare fraud manually requires much effort. Recently, machine learning and data mining techniques have been applied to detect healthcare fraud automatically. This paper proposes a novel PageRank-based algorithm to detect healthcare fraud and anomalies. We apply the algorithm to the Medicare-B dataset, a real-life dataset with 10 million healthcare insurance claims. The algorithm successfully identifies tens of previously unreported anomalies.
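
    The abstract does not specify how PageRank is adapted for fraud detection. For orientation only, the Python sketch below shows the standard PageRank power iteration on a small directed graph; it is not the authors' algorithm. The graph, the node names, the damping factor of 0.85, and the iteration count are assumptions chosen for illustration.

```python
from typing import Dict, List


def pagerank(
    graph: Dict[str, List[str]],
    damping: float = 0.85,
    iterations: int = 50,
) -> Dict[str, float]:
    """Standard PageRank by power iteration.

    `graph` maps every node to its list of out-neighbours; every
    neighbour must itself appear as a key. Rank held by nodes without
    out-links (dangling nodes) is spread evenly over all nodes.
    """
    nodes = list(graph)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        dangling = sum(rank[v] for v in nodes if not graph[v])
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            for w in graph[v]:
                new_rank[w] += damping * rank[v] / len(graph[v])
        rank = new_rank
    return rank


# Hypothetical toy graph: two nodes that both link to a third.
toy_graph = {"a": ["c"], "b": ["c"], "c": []}
toy_ranks = pagerank(toy_graph)
```

    An anomaly detector built on this idea would typically compare such scores against what is expected for similar nodes, but how the paper does so is not stated in the abstract.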

  17. Power analysis dataset for QCA based multiplexer circuits

    Directory of Open Access Journals (Sweden)

    Md. Abdullah-Al-Shafi

    2017-04-01

    Full Text Available Power consumption in irreversible QCA logic circuits is a vital and major issue; however, in practical cases it is mostly overlooked. The complete power depletion dataset of different QCA multiplexers has been worked out in this paper. At a temperature of −271.15 °C, the depletion is evaluated under three separate tunneling energy levels. All the circuits are designed with QCADesigner, a widely used simulation engine, and the QCAPro tool has been applied to estimate the power dissipation.

  18. Equalizing imbalanced imprecise datasets for genetic fuzzy classifiers

    Directory of Open Access Journals (Sweden)

    Ana M. Palacios

    2012-04-01

    Full Text Available Determining whether an imprecise dataset is imbalanced is not straightforward. Because of the vagueness in the data, the prior probabilities of the classes are not precisely known, and therefore the degree of imbalance can also be uncertain. In this paper we propose suitable extensions of different resampling algorithms that can be applied to interval-valued, multi-labelled data. By means of these extended preprocessing algorithms, certain classification systems designed for minimizing the fraction of misclassifications are able to produce knowledge bases that are also adequate under common metrics for imbalanced classification.

  19. Scientific Datasets: Discovery and Aggregation for Semantic Interpretation.

    Science.gov (United States)

    Lopez, L. A.; Scott, S.; Khalsa, S. J. S.; Duerr, R.

    2015-12-01

    One of the biggest challenges that interdisciplinary researchers face is finding suitable datasets in order to advance their science; this problem remains consistent across multiple disciplines. A surprising number of scientists, when asked what tool they use for data discovery, reply "Google", which is an acceptable solution in some cases, but not even Google can find (or cares to compile) all the data that is relevant for science and particularly the geosciences. If a dataset is not discoverable through a well-known search provider it will remain dark data to the scientific world. For the past year, BCube, an EarthCube Building Block project, has been developing, testing and deploying a technology stack capable of data discovery at web-scale using the ultimate dataset: the Internet. This stack has 2 principal components, a web-scale crawling infrastructure and a semantic aggregator. The web-crawler is a modified version of Apache Nutch (the originator of Hadoop and other big data technologies) that has been improved and tailored for data and data service discovery. The second component is semantic aggregation, carried out by a Python-based workflow that extracts valuable metadata and stores it in the form of triples through the use of semantic technologies. While implementing the BCube stack we have run into several challenges such as a) scaling the project to cover big portions of the Internet at a reasonable cost, b) making sense of very diverse and non-homogeneous data, and lastly, c) extracting facts about these datasets using semantic technologies in order to make them usable for the geosciences community. Despite all these challenges we have proven that we can discover and characterize data that otherwise would have remained in the dark corners of the Internet. Having all this data indexed and 'triplelized' will enable scientists to access a trove of information relevant to their work in a more natural way. An important characteristic of the BCube stack is that all

  20. Dataset concerning the analytical approximation of the Ae3 temperature

    Directory of Open Access Journals (Sweden)

    B.L. Ennis

    2017-02-01

    The dataset includes the terms of the function and the values for the polynomial coefficients for major alloying elements in steel. A short description of the approximation method used to derive and validate the coefficients has also been included. For discussion and application of this model, please refer to the full length article entitled “The role of aluminium in chemical and phase segregation in a TRIP-assisted dual phase steel” 10.1016/j.actamat.2016.05.046 (Ennis et al., 2016) [1].