large genome soil: Topics by WorldWideScience.org

Sample records for large genome soil

Biological consequences of ancient gene acquisition and duplication in the large genome soil bacterium, "Solibacter usitatus" strain Ellin6076

Energy Technology Data Exchange (ETDEWEB)

Challacombe, Jean F [Los Alamos National Laboratory]; Eichorst, Stephanie A [Los Alamos National Laboratory]; Xie, Gary [Los Alamos National Laboratory]; Kuske, Cheryl R [Los Alamos National Laboratory]; Hauser, Loren [ORNL]; Land, Miriam [ORNL]

2009-01-01

Bacterial genome sizes range from ca. 0.5 to 10 Mb and are influenced by gene duplication, horizontal gene transfer, gene loss and other evolutionary processes. Sequenced genomes of strains in the phylum Acidobacteria revealed that 'Solibacter usitatus' strain Ellin6076 harbors a 9.9 Mb genome. This large genome appears to have arisen by horizontal gene transfer via ancient bacteriophage and plasmid-mediated transduction, as well as widespread small-scale gene duplications. This has resulted in an increased number of paralogs that are potentially ecologically important (ecoparalogs). Low amino acid sequence identities among functional group members and lack of conserved gene order and orientation in the regions containing similar groups of paralogs suggest that most of the paralogs were not the result of recent duplication events. The genome sizes of cultured subdivision 1 and 3 strains in the phylum Acidobacteria were estimated using pulsed-field gel electrophoresis to determine the prevalence of the large genome trait within the phylum. Members of subdivision 1 were estimated to have smaller genome sizes ranging from ca. 2.0 to 4.8 Mb, whereas members of subdivision 3 had slightly larger genomes, from ca. 5.8 to 9.9 Mb. It is hypothesized that the large genome of strain Ellin6076 encodes traits that provide a selective metabolic, defensive and regulatory advantage in the variable soil environment.
Characterization of large-insert DNA libraries from soil for environmental genomic studies of Archaea

DEFF Research Database (Denmark)

Treusch, Alexander H; Kletzin, Arnulf; Raddatz, Guenter

2004-01-01

Complex genomic libraries are increasingly being used to retrieve complete genes, operons or large genomic fragments directly from environmental samples, without the need to cultivate the respective microorganisms. We report on the construction of three large-insert fosmid libraries in total...... (approximately 1% each) have been captured in our libraries. The diversity of putative protein-encoding genes, as reflected by their distribution into different COG clusters, was comparable to that encoded in complete genomes of cultivated microorganisms. A huge variety of genomic fragments has been captured...
Moleculo Long-Read Sequencing Facilitates Assembly and Genomic Binning from Complex Soil Metagenomes

Energy Technology Data Exchange (ETDEWEB)

White, Richard Allen; Bottos, Eric M.; Roy Chowdhury, Taniya; Zucker, Jeremy D.; Brislawn, Colin J.; Nicora, Carrie D.; Fansler, Sarah J.; Glaesemann, Kurt R.; Glass, Kevin; Jansson, Janet K.; Langille, Morgan

2016-06-28

functional roles in ecosystem stability and responses to environmental perturbations. This knowledge gap is largely due to the difficulty in culturing the majority of soil microbes. Thus, use of culture-independent approaches, such as metagenomics, promises the direct assessment of the functional potential of soil microbiomes. Soil is, however, a challenge for metagenomic assembly due to its high microbial diversity and variable evenness, resulting in low coverage and uneven sampling of microbial genomes. Despite increasingly large soil metagenome data volumes (>200 Gbp), the majority of the data do not assemble. Here, we used the cutting-edge approach of synthetic long-read sequencing technology (Moleculo) to assemble soil metagenome sequence data into long contigs and used the assemblies for binning of genomes.

Phylogenetic distribution of large-scale genome patchiness

Directory of Open Access Journals (Sweden)

Hackenberg Michael

2008-04-01

Full Text Available Abstract Background: The phylogenetic distribution of large-scale genome structure (i.e. mosaic compositional patchiness) has been explored mainly by analytical ultracentrifugation of bulk DNA. However, with the availability of large, good-quality chromosome sequences, and the recently developed computational methods to directly analyze patchiness on the genome sequence, an evolutionary comparative analysis can be carried out at the sequence level. Results: The local variations in the scaling exponent of the Detrended Fluctuation Analysis are used here to analyze large-scale genome structure and directly uncover the characteristic scales present in genome sequences. Furthermore, through shuffling experiments of selected genome regions, computationally-identified, isochore-like regions were identified as the biological source for the uncovered large-scale genome structure. The phylogenetic distribution of short- and large-scale patchiness was determined in the best-sequenced genome assemblies from eleven eukaryotic genomes: mammals (Homo sapiens, Pan troglodytes, Mus musculus, Rattus norvegicus, and Canis familiaris), birds (Gallus gallus), fishes (Danio rerio), invertebrates (Drosophila melanogaster and Caenorhabditis elegans), plants (Arabidopsis thaliana) and yeasts (Saccharomyces cerevisiae). We found large-scale patchiness of genome structure, associated with in silico determined, isochore-like regions, throughout this wide phylogenetic range. Conclusion: Large-scale genome structure is detected by directly analyzing DNA sequences in a wide range of eukaryotic chromosome sequences, from human to yeast. In all these genomes, large-scale patchiness can be associated with the isochore-like regions, as directly detected in silico at the sequence level.
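
The abstract above relies on the scaling exponent of Detrended Fluctuation Analysis (DFA). As a rough illustration only, the following is a minimal sketch of first-order DFA on a DNA string. It is not the authors' implementation: the G/C = +1, A/T = -1 encoding, the function names (`gc_walk`, `dfa_fluctuation`, `dfa_exponent`) and the window sizes are all assumptions chosen for the example, and it estimates a single global exponent rather than the local variations the study analyzes.

```python
import math
import random


def gc_walk(seq):
    """Map a DNA string to +1 (G/C) or -1 (A/T) steps (an assumed encoding)."""
    return [1.0 if base in "GCgc" else -1.0 for base in seq]


def dfa_fluctuation(steps, window):
    """RMS fluctuation F(n) of the linearly detrended profile for window size n."""
    # Profile: cumulative sum of the mean-centred step series.
    mean = sum(steps) / len(steps)
    profile, total = [], 0.0
    for s in steps:
        total += s - mean
        profile.append(total)

    # Quantities for the least-squares line fitted inside each window.
    x_mean = (window - 1) / 2.0
    sxx = sum((x - x_mean) ** 2 for x in range(window))
    n_windows = len(profile) // window

    sq_sum = 0.0
    for w in range(n_windows):
        seg = profile[w * window:(w + 1) * window]
        y_mean = sum(seg) / window
        slope = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(seg)) / sxx
        # Accumulate squared residuals around the local linear trend.
        sq_sum += sum((y - y_mean - slope * (x - x_mean)) ** 2
                      for x, y in enumerate(seg))
    return math.sqrt(sq_sum / (n_windows * window))


def dfa_exponent(steps, windows):
    """Least-squares slope of log F(n) against log n."""
    xs = [math.log(n) for n in windows]
    ys = [math.log(dfa_fluctuation(steps, n)) for n in windows]
    x_mean, y_mean = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    den = sum((x - x_mean) ** 2 for x in xs)
    return num / den


# Hypothetical input: an uncorrelated random sequence standing in for real data.
random.seed(1)
random_dna = "".join(random.choice("ACGT") for _ in range(8192))
alpha = dfa_exponent(gc_walk(random_dna), [16, 32, 64, 128, 256])
print(f"estimated scaling exponent: {alpha:.2f}")
```

An uncorrelated sequence should give an exponent near 0.5, while long-range compositional correlations push the exponent higher. Comparing a region before and after shuffling, as the abstract describes, is one way to see that difference.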
Whole-Genome Sequencing and Comparative Analysis of Mycobacterium brisbanense Reveals a Possible Soil Origin and Capability in Fertiliser Synthesis.

Science.gov (United States)

Wee, Wei Yee; Tan, Tze King; Jakubovics, Nicholas S; Choo, Siew Woh

2016-01-01

Mycobacterium brisbanense is a member of Mycobacterium fortuitum third biovariant complex, which includes rapidly growing Mycobacterium spp. that normally inhabit soil, dust and water, and can sometimes cause respiratory tract infections in humans. We present the first whole-genome analysis of M. brisbanense UM_WWY which was isolated from a 70-year-old Malaysian patient. Molecular phylogenetic analyses confirmed the identification of this strain as M. brisbanense and showed that it has an unusually large genome compared with related mycobacteria. The large genome size of M. brisbanense UM_WWY (~7.7 Mbp) is consistent with further findings that this strain has a highly variable genome structure that contains many putative horizontally transferred genomic islands and prophage. Comparative analysis showed that M. brisbanense UM_WWY is the only Mycobacterium species that possesses a complete set of genes encoding enzymes involved in the urea cycle, suggesting that this soil bacterium is able to synthesize urea for use as plant fertilizers. It is likely that M. brisbanense UM_WWY is adapted to live in soil as its primary habitat since the genome contains many genes associated with nitrogen metabolism. Nevertheless, a large number of predicted virulence genes were identified in M. brisbanense UM_WWY that are mostly shared with well-studied mycobacterial pathogens such as Mycobacterium tuberculosis and Mycobacterium abscessus. These findings are consistent with the role of M. brisbanense as an opportunistic pathogen of humans. The whole-genome study of UM_WWY has provided the basis for future work on M. brisbanense.
How agricultural management shapes soil microbial communities: patterns emerging from genetic and genomic studies

Science.gov (United States)

Daly, Amanda; Grandy, A. Stuart

2016-04-01

Agriculture is a predominant land use and thus a large influence on global carbon (C) and nitrogen (N) balances, climate, and human health. If we are to produce food, fiber, and fuel sustainably we must maximize agricultural yield while minimizing negative environmental consequences, goals towards which we have made great strides through agronomic advances. However, most agronomic strategies have been designed with a view of soil as a black box, largely ignoring the way management is mediated by soil biota. Because soil microbes play a central role in many of the processes that deliver nutrients to crops and support their health and productivity, agricultural management strategies targeted to exploit or support microbial activity should deliver additional benefits. To do this we must determine how microbial community structure and function are shaped by agricultural practices, but until recently our characterizations of soil microbial communities in agricultural soils have been largely limited to broad taxonomic classes due to methodological constraints. With advances in high-throughput genetic and genomic sequencing techniques, better taxonomic resolution now enables us to determine how agricultural management affects specific microbes and, in turn, nutrient cycling outcomes. Here we unite findings from published research that includes genetic or genomic data about microbial community structure (e.g. 454, Illumina, clone libraries, qPCR) in soils under agricultural management regimes that differ in type and extent of tillage, cropping selections and rotations, inclusion of cover crops, organic amendments, and/or synthetic fertilizer application. We delineate patterns linking agricultural management to microbial diversity, biomass, C- and N-content, and abundance of microbial taxa; furthermore, where available, we compare patterns in microbial communities to patterns in soil extracellular enzyme activities, catabolic profiles, inorganic nitrogen pools, and nitrogen
First insight into the genome of an uncultivated crenarchaeote from soil

DEFF Research Database (Denmark)

Quaiser, Achim; Ochsenreiter, Torsten; Klenk, Hans-Peter

2002-01-01

RNA genes and of several protein encoding genes (e.g. DNA polymerase, FixAB, glycosyl transferase) confirmed the specific affiliation of the genomic fragment with the non-thermophilic clade of the crenarchaeota. Content and structure of the genomic fragment indicated that the archaea from soil differ......Molecular phylogenetic surveys based on the characterization of 16S rRNA genes have revealed that soil is an environment particularly rich in microbial diversity. A clade of crenarchaeota (archaea) has frequently been detected among many other novel lineages of uncultivated bacteria. In this study...... we have initiated a genomic approach for the characterization of uncultivated microorganisms from soil. We have developed a procedure based on a two-phase electrophoresis technique that allows the fast and reliable purification of concentrated and clonable, high molecular weight DNA. From this DNA we...
Reduced representation approaches to interrogate genome diversity in large repetitive plant genomes.

Science.gov (United States)

Hirsch, Cory D; Evans, Joseph; Buell, C Robin; Hirsch, Candice N

2014-07-01

Technology and software improvements in the last decade now provide methodologies to access the genome sequence of not only a single accession, but also multiple accessions of plant species. This provides a means to interrogate species diversity at the genome level. Ample diversity among accessions in a collection of species can be found, including single-nucleotide polymorphisms, insertions and deletions, copy number variation and presence/absence variation. For species with small, non-repetitive rich genomes, re-sequencing of query accessions is robust, highly informative, and economically feasible. However, for species with moderate to large sized repetitive-rich genomes, technical and economic barriers prevent en masse genome re-sequencing of accessions. Multiple approaches to access a focused subset of loci in species with larger genomes have been developed, including reduced representation sequencing, exome capture and transcriptome sequencing. Collectively, these approaches have enabled interrogation of diversity on a genome scale for large plant genomes, including crop species important to worldwide food security. © The Author 2014. Published by Oxford University Press. All rights reserved. For permissions, please email: journals.permissions@oup.com.
Draft Genome Sequences of Three β-Lactam-Catabolizing Soil Proteobacteria

DEFF Research Database (Denmark)

Crofts, Terence S.; Wang, Bin; Spivak, Aaron

2017-01-01

Most antibiotics are derived from the soil, but their catabolism there, which is necessary to close the antibiotic carbon cycle, remains uncharacterized. We report the first draft genome sequences of soil Proteobacteria identified for subsisting solely on β-lactams as their carbon sources...
EUPAN enables pan-genome studies of a large number of eukaryotic genomes.

Science.gov (United States)

Hu, Zhiqiang; Sun, Chen; Lu, Kuang-Chen; Chu, Xixia; Zhao, Yue; Lu, Jinyuan; Shi, Jianxin; Wei, Chaochun

2017-08-01

Pan-genome analyses are routinely carried out for bacteria to interpret the within-species gene presence/absence variations (PAVs). However, pan-genome analyses are rare for eukaryotes due to the large sizes and higher complexities of their genomes. Here we proposed EUPAN, a eukaryotic pan-genome analysis toolkit, enabling automatic large-scale eukaryotic pan-genome analyses and detection of gene PAVs at a relatively low sequencing depth. In previous studies, we demonstrated the effectiveness and high accuracy of EUPAN in the pan-genome analysis of 453 rice genomes, in which we also revealed widespread gene PAVs among individual rice genomes. Moreover, EUPAN can be directly applied to current re-sequencing projects primarily focusing on single nucleotide polymorphisms. EUPAN is implemented in Perl, R and C++. It is supported under Linux and preferred for a computer cluster with LSF and SLURM job scheduling systems. EUPAN together with its standard operating procedure (SOP) is freely available for non-commercial use (CC BY-NC 4.0) at http://cgm.sjtu.edu.cn/eupan/index.html . Contact: ccwei@sjtu.edu.cn or jianxin.shi@sjtu.edu.cn. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
Whole-Genome Sequence of the Soil Bacterium Micrococcus sp. KBS0714.

Science.gov (United States)

Kuo, V; Shoemaker, W R; Muscarella, M E; Lennon, J T

2017-08-10

We present here a draft genome assembly of Micrococcus sp. KBS0714, which was isolated from agricultural soil. The genome provides insight into the strategies that Micrococcus spp. use to contend with environmental stressors such as desiccation and starvation in environmental and host-associated ecosystems. Copyright © 2017 Kuo et al.
GDC 2: Compression of large collections of genomes.

Science.gov (United States)

Deorowicz, Sebastian; Danek, Agnieszka; Niemiec, Marcin

2015-06-25

The falling price of high-throughput genome sequencing is changing the landscape of modern genomics. A number of large-scale projects aimed at sequencing many human genomes are in progress. Genome sequencing is also becoming an important aid in personalized medicine. One significant side effect of this change is the need to store and transfer huge amounts of genomic data. In this paper we deal with the problem of compressing large collections of complete genomic sequences. We propose an algorithm that is able to compress a collection of 1092 human diploid genomes about 9,500 times. This result is about 4 times better than what is offered by other existing compressors. Moreover, our algorithm is very fast, processing data at 200 MB/s on a modern workstation. As a consequence, the proposed algorithm allows complete genomic collections to be stored at low cost; e.g., the examined collection of 1092 human genomes needs only about 700 MB when compressed, compared with about 6.7 TB of uncompressed FASTA files. The source code is available at http://sun.aei.polsl.pl/REFRESH/index.php?page=projects&project=gdc&subpage=about.
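
The figures quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below assumes decimal units (1 MB = 10^6 bytes, 1 TB = 10^12 bytes), which the abstract does not state. The variable names are illustrative only.

```python
# Sizes quoted in the abstract, converted to bytes under a decimal-unit assumption.
uncompressed_bytes = 6.7e12   # about 6.7 TB of uncompressed FASTA files
compressed_bytes = 700e6      # about 700 MB after compression
genome_count = 1092           # human diploid genomes in the collection

# Overall compression ratio implied by the two sizes.
ratio = uncompressed_bytes / compressed_bytes

# Average compressed footprint per genome, in MB.
compressed_mb_per_genome = (compressed_bytes / 1e6) / genome_count

print(f"compression ratio: about {ratio:,.0f}x")
print(f"compressed size per genome: about {compressed_mb_per_genome:.2f} MB")
```

Under these assumptions the ratio comes out close to the "about 9,500 times" stated in the abstract, and each genome occupies well under 1 MB on average once compressed.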
Genome-based comparative analyses of Antarctic and temperate species of Paenibacillus.

Directory of Open Access Journals (Sweden)

Melissa Dsouza

Full Text Available Antarctic soils represent a unique environment characterised by extremes of temperature, salinity, elevated UV radiation, low nutrient and low water content. Despite the harshness of this environment, members of 15 bacterial phyla have been identified in soils of the Ross Sea Region (RSR). However, the survival mechanisms and ecological roles of these phyla are largely unknown. The aim of this study was to investigate whether strains of Paenibacillus darwinianus owe their resilience to substantial genomic changes. For this, genome-based comparative analyses were performed on three P. darwinianus strains, isolated from gamma-irradiated RSR soils, together with nine temperate, soil-dwelling Paenibacillus spp. The genome of each strain was sequenced to over 1,000-fold coverage, then assembled into contigs totalling approximately 3 Mbp per genome. Based on the occurrence of essential, single-copy genes, genome completeness was estimated at approximately 88%. Genome analysis revealed between 3,043 and 3,091 protein-coding sequences (CDSs), primarily associated with two-component systems, sigma factors, transporters, sporulation and genes induced by cold-shock, oxidative and osmotic stresses. These comparative analyses provide an insight into the metabolic potential of P. darwinianus, revealing potential adaptive mechanisms for survival in Antarctic soils. However, a large proportion of these mechanisms were also identified in temperate Paenibacillus spp., suggesting that these mechanisms are beneficial for growth and survival in a range of soil environments. These analyses have also revealed that the P. darwinianus genomes contain significantly fewer CDSs and have a lower paralogous content. Notwithstanding the incompleteness of the assemblies, the large differences in genome sizes, determined by the number of genes in paralogous clusters and the CDS content, are indicative of genome content scaling. Finally, these sequences are a resource for further
Targeting Unknowns Just Underfoot: Microbial Ecology and Community Genomics of C Cycling in Soil Informed and Enabled with DNA-SIP

Science.gov (United States)

Pepe-Ranney, C. P.; Campbell, A.; Buckley, D. H.

2015-12-01

Microorganisms drive biogeochemical cycles and because soil is a large global carbon (C) reservoir (soil contains more C than plants and the atmosphere combined), soil microorganisms are important players in the global C-cycle. Frustratingly, however, many soil microorganisms resist cultivation and soil communities are astoundingly complex. This makes soil microbiology difficult to study and without a solid understanding of soil microbial ecology, models of soil C feedbacks to climate change are under-informed. Stable isotope probing (SIP) is a useful approach for establishing identity-function connections in microbial communities but has been challenging to employ in soil due to the inadequate resolution of microbial community fingerprinting techniques. High throughput DNA sequencing improves SIP resolving power transforming it into a powerful tool for studying the soil C cycle. We conducted a DNA-SIP experiment to track flow of xylose-C, a labile component of plant biomass, and cellulose-C, the most abundant global biopolymer, through a soil microbial community. We could track 13C into microbial DNA even when added 13C amounted to less than 5% of native C and found Spartobacteria, Chloroflexi, and Planctomycetes taxa were among those that assimilated 13C cellulose. These lineages are cosmopolitan in soil but little is known of their ecophysiology. By profiling SSU rRNA genes across entire DNA-SIP density gradients, we assessed relative DNA atom % 13C per taxon in 13C treatments and found cellulose degraders exhibited signal consistent with a specialist lifestyle with respect to C preference. Further, DNA-SIP enriches DNA of targeted microorganisms (Verrucomicrobia cellulose degraders were enriched by nearly two orders of magnitude) and this enriched DNA can serve as template for community genomics. We produced draft genomes from soil cellulose degraders including microorganisms belonging to Verrucomicrobia, Chloroflexi, and Planctomycetes from SIP enriched DNA
Genome size variation affects song attractiveness in grasshoppers: evidence for sexual selection against large genomes.

Science.gov (United States)

Schielzeth, Holger; Streitner, Corinna; Lampe, Ulrike; Franzke, Alexandra; Reinhold, Klaus

2014-12-01

Genome size is largely uncorrelated to organismal complexity and adaptive scenarios. Genetic drift as well as intragenomic conflict have been put forward to explain this observation. We here study the impact of genome size on sexual attractiveness in the bow-winged grasshopper Chorthippus biguttulus. Grasshoppers show particularly large variation in genome size due to the high prevalence of supernumerary chromosomes that are considered (mildly) selfish, as evidenced by non-Mendelian inheritance and fitness costs if present in high numbers. We ranked male grasshoppers by song characteristics that are known to affect female preferences in this species and scored genome sizes of attractive and unattractive individuals from the extremes of this distribution. We find that attractive singers have significantly smaller genomes, demonstrating that genome size is reflected in male courtship songs and that females prefer songs of males with small genomes. Such a genome size dependent mate preference effectively selects against selfish genetic elements that tend to increase genome size. The data therefore provide a novel example of how sexual selection can reinforce natural selection and can act as an agent in an intragenomic arms race. Furthermore, our findings indicate an underappreciated route of how choosy females could gain indirect benefits. © 2014 The Author(s). Evolution © 2014 The Society for the Study of Evolution.
State of the Art in Large-Scale Soil Moisture Monitoring

Science.gov (United States)

Ochsner, Tyson E.; Cosh, Michael Harold; Cuenca, Richard H.; Dorigo, Wouter; Draper, Clara S.; Hagimoto, Yutaka; Kerr, Yan H.; Larson, Kristine M.; Njoku, Eni Gerald; Small, Eric E.;

2013-01-01

Soil moisture is an essential climate variable influencing land atmosphere interactions, an essential hydrologic variable impacting rainfall runoff processes, an essential ecological variable regulating net ecosystem exchange, and an essential agricultural variable constraining food security. Large-scale soil moisture monitoring has advanced in recent years creating opportunities to transform scientific understanding of soil moisture and related processes. These advances are being driven by researchers from a broad range of disciplines, but this complicates collaboration and communication. For some applications, the science required to utilize large-scale soil moisture data is poorly developed. In this review, we describe the state of the art in large-scale soil moisture monitoring and identify some critical needs for research to optimize the use of increasingly available soil moisture data. We review representative examples of 1) emerging in situ and proximal sensing techniques, 2) dedicated soil moisture remote sensing missions, 3) soil moisture monitoring networks, and 4) applications of large-scale soil moisture measurements. Significant near-term progress seems possible in the use of large-scale soil moisture data for drought monitoring. Assimilation of soil moisture data for meteorological or hydrologic forecasting also shows promise, but significant challenges related to model structures and model errors remain. Little progress has been made yet in the use of large-scale soil moisture observations within the context of ecological or agricultural modeling. Opportunities abound to advance the science and practice of large-scale soil moisture monitoring for the sake of improved Earth system monitoring, modeling, and forecasting.

Efficient assembly of de novo human artificial chromosomes from large genomic loci

Directory of Open Access Journals (Sweden)

Stromberg Gregory

2005-07-01

Full Text Available Abstract Background: Human Artificial Chromosomes (HACs) are potentially useful vectors for gene transfer studies and for functional annotation of the genome because of their suitability for cloning, manipulating and transferring large segments of the genome. However, development of HACs for the transfer of large genomic loci into mammalian cells has been limited by difficulties in manipulating high-molecular weight DNA, as well as by the low overall frequencies of de novo HAC formation. Indeed, to date, only a small number of large (>100 kb) genomic loci have been reported to be successfully packaged into de novo HACs. Results: We have developed novel methodologies to enable efficient assembly of HAC vectors containing any genomic locus of interest. We report here the creation of a novel, bimolecular system based on bacterial artificial chromosomes (BACs) for the construction of HACs incorporating any defined genomic region. We have utilized this vector system to rapidly design, construct and validate multiple de novo HACs containing large (100–200 kb) genomic loci including therapeutically significant genes for human growth hormone (HGH), polycystic kidney disease (PKD1) and β-globin. We report significant differences in the ability of different genomic loci to support de novo HAC formation, suggesting possible effects of cis-acting genomic elements. Finally, as a proof of principle, we have observed sustained β-globin gene expression from HACs incorporating the entire 200 kb β-globin genomic locus for over 90 days in the absence of selection. Conclusion: Taken together, these results are significant for the development of HAC vector technology, as they enable high-throughput assembly and functional validation of HACs containing any large genomic locus. We have evaluated the impact of different genomic loci on the frequency of HAC formation and identified segments of genomic DNA that appear to facilitate de novo HAC formation. These genomic loci
Draft genome sequence of Streptomyces sp. strain F1, a potential source for glycoside hydrolases isolated from Brazilian soil

Directory of Open Access Journals (Sweden)

Ricardo Rodrigues de Melo

Full Text Available ABSTRACT Here, we show the draft genome sequence of Streptomyces sp. F1, a strain isolated from soil with great potential for secretion of hydrolytic enzymes used to deconstruct cellulosic biomass. The draft genome assembly of Streptomyces sp. strain F1 has 69 contigs with a total genome size of 8,142,296 bp and a G+C content of 72.65%. Preliminary genome analysis identified 175 proteins as Carbohydrate-Active Enzymes, of which 85 are glycoside hydrolases organized in 33 distinct families. This draft genome information provides new insights into the key genes encoding hydrolytic enzymes involved in biomass deconstruction employed by soil bacteria.
Overview of soil phosphorus data from a large international soil database

NARCIS (Netherlands)

Batjes, N.H.

2014-01-01

An overview of extractable soil phosphorus (P-Bray, P-Olsen, P-Mehlich and P-water) and P-retention data held in a large profile database is presented. The primary aim is to assess whether representative P-values, by broad soil group (FAO system), can be determined for each of these analytical

Analysis of large soil samples for actinides

Science.gov (United States)

Maxwell, III, Sherrod L [Aiken, SC]

2009-03-24

A method of analyzing relatively large soil samples for actinides by employing a separation process that includes cerium fluoride precipitation, which removes the soil matrix and precipitates plutonium, americium, and curium with cerium and hydrofluoric acid, followed by separation of these actinides using chromatography cartridges.
Collembase: a repository for springtail genomics and soil quality assessment

Directory of Open Access Journals (Sweden)

Klein-Lankhorst Rene M

2007-09-01

Full Text Available Abstract Background: Environmental quality assessment is traditionally based on responses of reproduction and survival of indicator organisms. For soil assessment the springtail Folsomia candida (Collembola) is an accepted standard test organism. We argue that environmental quality assessment using gene expression profiles of indicator organisms exposed to test substrates is more sensitive, more toxicant specific and significantly faster than current risk assessment methods. To apply this species as a genomic model for soil quality testing we conducted an EST sequencing project and developed an online database. Description: Collembase is a web-accessible database comprising springtail (F. candida) genomic data. Presently, the database contains information on 8686 ESTs that are assembled into 5952 unique gene objects. Of those gene objects ~40% showed homology to other protein sequences available in GenBank (blastx analysis; non-redundant (nr) database; expect-value < 10^-5). Software was applied to infer protein sequences. The putative peptides, which had an average length of 115 amino-acids (ranging between 23 and 440), were annotated with Gene Ontology (GO) terms. In total 1025 peptides (~17% of the gene objects) were assigned at least one GO term (expect-value < 10^-25). Within Collembase searches can be conducted based on BLAST and GO annotation, cluster name or using a BLAST server. The system furthermore enables easy sequence retrieval for functional genomic and Quantitative-PCR experiments. Sequences are submitted to GenBank (Accession numbers: EV473060 – EV481745). Conclusion: Collembase (http://www.collembase.org) is a resource of sequence data on the springtail F. candida. The information within the database will be linked to a custom made microarray, based on the Agilent platform, which can be applied for soil quality testing. In addition, Collembase supplies information that is valuable for related scientific disciplines such as molecular ecology
Large zero-tension plate lysimeters for soil water and solute collection in undisturbed soils

Directory of Open Access Journals (Sweden)

A. Peters

2009-09-01

Water collection from undisturbed unsaturated soils to estimate in situ water and solute fluxes in the field is a challenge, in particular if soils are heterogeneous. Large sampling devices are required if preferential flow paths are present. We present a modular plate system that allows installation of large zero-tension lysimeter plates under undisturbed soils in the field. To investigate the influence of the lysimeter on the water flow field in the soil, a numerical 2-D simulation study was conducted for homogeneous soils with uni- and bimodal pore-size distributions and stochastic Miller-Miller heterogeneity. The collection efficiency was found to be highly dependent on the hydraulic functions, infiltration rate, and lysimeter size, and was furthermore affected by the degree of heterogeneity. In homogeneous soils with high saturated conductivities the devices perform poorly and even large lysimeters (width 250 cm) can be bypassed by the soil water. Heterogeneities of soil hydraulic properties result in a network of flow channels that enhance the sampling efficiency of the lysimeter plates. Solute breakthrough into zero-tension lysimeters is slightly retarded compared to the free soil, but concentrations in the collected water are similar to the mean flux concentration in the undisturbed soil. To validate the results from the numerical study, a dual tracer study with seven lysimeters of 1.25×1.25 m area was conducted in the field. Three lysimeters were installed underneath a 1.2 m filling of contaminated silty sand, the others deeper in the undisturbed soil. The lysimeters directly underneath the filled soil material collected water with a collection efficiency of 45%. The deeper lysimeters did not collect any water. The arrival of the tracers showed that almost all collected water came from preferential flow paths.
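The abstract reports a collection efficiency of 45% but does not define the quantity. The sketch below assumes the usual reading: collected water divided by the water that infiltrated over the plate footprint. The function name and the water amounts are hypothetical and chosen only so the arithmetic reproduces 45%; only the 1.25×1.25 m plate size comes from the record.

```python
def collection_efficiency(collected_litres, infiltration_mm, plate_area_m2):
    """Fraction of the water infiltrating above the plate that was collected.

    Assumed definition (not given in the abstract). Uses the identity that
    1 mm of water over 1 m^2 equals 1 litre.
    """
    expected_litres = infiltration_mm * plate_area_m2
    if expected_litres <= 0:
        raise ValueError("infiltration and plate area must be positive")
    return collected_litres / expected_litres


# Plate size from the abstract; water amounts are invented for illustration.
plate_area = 1.25 * 1.25  # m^2
efficiency = collection_efficiency(
    collected_litres=70.3125, infiltration_mm=100.0, plate_area_m2=plate_area
)
print(f"collection efficiency: {efficiency:.0%}")
```

A value well below 1 indicates that part of the infiltrating water bypassed the plate, which is the situation the simulations describe for homogeneous, highly conductive soils.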
Draft genome sequence of Streptomyces sp. strain F1, a potential source for glycoside hydrolases isolated from Brazilian soil.

Science.gov (United States)

Melo, Ricardo Rodrigues de; Persinoti, Gabriela Felix; Paixão, Douglas Antonio Alvaredo; Squina, Fábio Márcio; Ruller, Roberto; Sato, Helia Harumi

Here, we present the draft genome sequence of Streptomyces sp. F1, a strain isolated from soil with great potential for secretion of hydrolytic enzymes used to deconstruct cellulosic biomass. The draft genome assembly of Streptomyces sp. strain F1 has 69 contigs with a total genome size of 8,142,296 bp and a G+C content of 72.65%. Preliminary genome analysis identified 175 proteins as Carbohydrate-Active Enzymes, of which 85 are glycoside hydrolases organized in 33 distinct families. This draft genome information provides new insights on the key genes encoding hydrolytic enzymes involved in biomass deconstruction employed by soil bacteria. Copyright © 2017 Sociedade Brasileira de Microbiologia. Published by Elsevier Editora Ltda. All rights reserved.
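Summary statistics of the kind quoted in this record (contig count, total size, G+C content) can be recomputed from any assembly in FASTA format. The following is a minimal sketch using only the Python standard library; the function names and the two toy contigs are hypothetical and are not taken from the Streptomyces sp. F1 assembly.

```python
from io import StringIO


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)


def assembly_stats(handle):
    """Return (contig count, total length in bp, G+C percentage)."""
    n_contigs = total = gc = 0
    for _, seq in read_fasta(handle):
        n_contigs += 1
        total += len(seq)
        gc += seq.count("G") + seq.count("C")
    gc_percent = 100.0 * gc / total if total else 0.0
    return n_contigs, total, gc_percent


# Toy two-contig assembly (made-up sequences, for illustration only).
toy = StringIO(">contig1\nGGCC\nAT\n>contig2\nGCGC\n")
stats = assembly_stats(toy)
print(stats)
```

For a real assembly, the `StringIO` object would be replaced by an open file handle; ambiguous bases such as N count toward the total length but not toward G+C in this simple version.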
Defoliation and Soil Compaction Jointly Drive Large-Herbivore Grazing Effects on Plants and Soil Arthropods on Clay Soil

NARCIS (Netherlands)

van Klink, R.; Schrama, M.; Nolte, S.; Bakker, Jan P.; WallisDeVries, M.F.; Berg, M.P.

2015-01-01

In addition to the well-studied impacts of defecation and defoliation, large herbivores also affect plant and arthropod communities through trampling, and the associated soil compaction. Soil compaction can be expected to be particularly important on wet, fine-textured soils. Therefore, we
GEnomes Management Application (GEM.app): a new software tool for large-scale collaborative genome analysis.

Science.gov (United States)

Gonzalez, Michael A; Lebrigio, Rafael F Acosta; Van Booven, Derek; Ulloa, Rick H; Powell, Eric; Speziani, Fiorella; Tekin, Mustafa; Schüle, Rebecca; Züchner, Stephan

2013-06-01

Novel genes are now identified at a rapid pace for many Mendelian disorders, and increasingly, for genetically complex phenotypes. However, new challenges have also become evident: (1) effectively managing larger exome and/or genome datasets, especially for smaller labs; (2) direct hands-on analysis and contextual interpretation of variant data in large genomic datasets; and (3) many small and medium-sized clinical and research-based investigative teams around the world are generating data that, if combined and shared, will significantly increase the opportunities for the entire community to identify new genes. To address these challenges, we have developed GEnomes Management Application (GEM.app), a software tool to annotate, manage, visualize, and analyze large genomic datasets (https://genomics.med.miami.edu/). GEM.app currently contains ∼1,600 whole exomes from 50 different phenotypes studied by 40 principal investigators from 15 different countries. The focus of GEM.app is on user-friendly analysis for nonbioinformaticians to make next-generation sequencing data directly accessible. Yet, GEM.app provides powerful and flexible filter options, including single family filtering, across family/phenotype queries, nested filtering, and evaluation of segregation in families. In addition, the system is fast, obtaining results within 4 sec across ∼1,200 exomes. We believe that this system will further enhance identification of genetic causes of human disease. © 2013 Wiley Periodicals, Inc.
Moditored unsaturated soil transport processes as a support for large scale soil and water management

Science.gov (United States)

Vanclooster, Marnik

2010-05-01

The current societal demand for sustainable soil and water management is very large. The drivers of global and climate change exert many pressures on the soil and water ecosystems, endangering appropriate ecosystem functioning. The unsaturated soil transport processes play a key role in soil-water system functioning as they control the fluxes of water and nutrients from the soil to plants (the pedo-biosphere link), the infiltration flux of precipitated water to groundwater and the evaporative flux, and hence the feedback from the soil to the climate system. Yet, unsaturated soil transport processes are difficult to quantify since they are affected by huge variability of the governing properties at different space-time scales and the intrinsic non-linearity of the transport processes. The incompatibility between the scale at which processes can reasonably be characterized, the scale at which the theoretical process can correctly be described and the scale at which the soil and water system needs to be managed calls for further development of scaling procedures in unsaturated zone science. It also calls for a better integration of theoretical and modelling approaches to elucidate transport processes at the appropriate scales, compatible with the sustainable soil and water management objective. Moditoring science, i.e., the interdisciplinary research domain where modelling and monitoring science are linked, is currently evolving significantly in the unsaturated zone hydrology area. In this presentation, a review of current moditoring strategies/techniques will be given and illustrated for solving large scale soil and water management problems. This will also allow identifying research needs in the interdisciplinary domain of modelling and monitoring and improving the integration of unsaturated zone science in solving soil and water management issues. A focus will be given on examples of large scale soil and water management problems in Europe.
Genomic characterization of large heterochromatic gaps in the human genome assembly.

Directory of Open Access Journals (Sweden)

Nicolas Altemose

2014-05-01

The largest gaps in the human genome assembly correspond to multi-megabase heterochromatic regions composed primarily of two related families of tandem repeats, Human Satellites 2 and 3 (HSat2,3). The abundance of repetitive DNA in these regions challenges standard mapping and assembly algorithms, and as a result, the sequence composition and potential biological functions of these regions remain largely unexplored. Furthermore, existing genomic tools designed to predict consensus-based descriptions of repeat families cannot be readily applied to complex satellite repeats such as HSat2,3, which lack a consistent repeat unit reference sequence. Here we present an alignment-free method to characterize complex satellites using whole-genome shotgun read datasets. Utilizing this approach, we classify HSat2,3 sequences into fourteen subfamilies and predict their chromosomal distributions, resulting in a comprehensive satellite reference database to further enable genomic studies of heterochromatic regions. We also identify 1.3 Mb of non-repetitive sequence interspersed with HSat2,3 across 17 unmapped assembly scaffolds, including eight annotated gene predictions. Finally, we apply our satellite reference database to high-throughput sequence data from 396 males to estimate array size variation of the predominant HSat3 array on the Y chromosome, confirming that satellite array sizes can vary between individuals over an order of magnitude (7 to 98 Mb) and further demonstrating that array sizes are distributed differently within distinct Y haplogroups. In summary, we present a novel framework for generating initial reference databases for unassembled genomic regions enriched with complex satellite DNA, and we further demonstrate the utility of these reference databases for studying patterns of sequence variation within human populations.
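The abstract does not say how its alignment-free method works. One common alignment-free strategy, shown here purely as an assumption-labelled illustration and not as the authors' algorithm, is to compare k-mer count profiles of reads: reads drawn from the same tandem repeat share almost all k-mers regardless of phase. The function names, the value of k and the toy reads are all hypothetical.

```python
from collections import Counter
import math


def kmer_profile(seq, k):
    """Count overlapping k-mers in a sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def cosine_similarity(a, b):
    """Cosine similarity between two k-mer count profiles."""
    dot = sum(a[kmer] * b[kmer] for kmer in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Toy reads: the first two come from the same pentamer tandem repeat in
# different phases; the third is unrelated sequence.
read_a = "CATTC" * 6
read_b = "ATTCC" * 6
read_c = "GGTACGATCGTAGCTAGCTAGGATCCGATA"

k = 5
sim_ab = cosine_similarity(kmer_profile(read_a, k), kmer_profile(read_b, k))
sim_ac = cosine_similarity(kmer_profile(read_a, k), kmer_profile(read_c, k))
print(round(sim_ab, 3), sim_ac)
```

No alignment or consensus repeat unit is needed for the comparison, which is the property that makes profile-based approaches attractive for repeats lacking a consistent reference sequence.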
Non-contiguous finished genome sequence and contextual data of the filamentous soil bacterium Ktedonobacter racemifer type strain (SOSP1-21).

Science.gov (United States)

Chang, Yun-Juan; Land, Miriam; Hauser, Loren; Chertkov, Olga; Del Rio, Tijana Glavina; Nolan, Matt; Copeland, Alex; Tice, Hope; Cheng, Jan-Fang; Lucas, Susan; Han, Cliff; Goodwin, Lynne; Pitluck, Sam; Ivanova, Natalia; Ovchinikova, Galina; Pati, Amrita; Chen, Amy; Palaniappan, Krishna; Mavromatis, Konstantinos; Liolios, Konstantinos; Brettin, Thomas; Fiebig, Anne; Rohde, Manfred; Abt, Birte; Göker, Markus; Detter, John C; Woyke, Tanja; Bristow, James; Eisen, Jonathan A; Markowitz, Victor; Hugenholtz, Philip; Kyrpides, Nikos C; Klenk, Hans-Peter; Lapidus, Alla

2011-10-15

Ktedonobacter racemifer corrig. Cavaletti et al. 2007 is the type species of the genus Ktedonobacter, which in turn is the type genus of the family Ktedonobacteraceae, the type family of the order Ktedonobacterales within the class Ktedonobacteria in the phylum 'Chloroflexi'. Although K. racemifer shares some morphological features with the actinobacteria, it is of special interest because it was the first cultivated representative of a deep branching unclassified lineage of otherwise uncultivated environmental phylotypes tentatively located within the phylum 'Chloroflexi'. The aerobic, filamentous, non-motile, spore-forming Gram-positive heterotroph was isolated from soil in Italy. The 13,661,586 bp long non-contiguous finished genome consists of ten contigs and is the first reported genome sequence from a member of the class Ktedonobacteria. With its 11,453 protein-coding and 87 RNA genes, it is the largest prokaryotic genome reported so far. It comprises a large number of over-represented COGs, particularly genes associated with transposons, making the genetic redundancy within the genome considerably larger than expected by chance. This work is a part of the Genomic Encyclopedia of Bacteria and Archaea project.
Complete Genome Sequences of Mycobacteriophages Clautastrophe, Kingsolomon, Krypton555, and Nicholas

Science.gov (United States)

Chung, Hui-Min; D’Elia, Tom; Ross, Joseph F.; Alvarado, Samuel M.; Brantley, Molly-Catherine; Bricker, Lydia P.; Butler, Courtney R.; Crist, Carson; Dane, Julia M.; Farran, Brett W.; Hobbs, Sierra; Lapak, Michelle; Lovell, Conner; McMullen, Allison; Mirza, Sohail A.; Thrift, Noah; Vaughan, Donald P.; Worley, Grace; Ejikemeuwa, Amara; Zaw, May; Albritton, Claude F.; Bertrand, Sarah C.; Chaudhry, Shanzay S.; Cheema, Vzair A.; Do, Camilla; Do, Michael L.; Duong, Huyen M.; El-Desoky, Dalia H.; Green, Kelsey M.; Lee, Rhea N.; Thornton, Lauren A.; Vu, James M.; Zahra, Mah Noor; Stoner, Ty H.; Garlena, Rebecca A.; Jacobs-Sera, Deborah; Russell, Daniel A.

2017-01-01

ABSTRACT We report here the complete genome sequences of four subcluster L3 mycobacteriophages newly isolated from soil samples, using Mycobacterium smegmatis mc²155 as the host. Comparative genomic analyses with four previously described subcluster L3 phages reveal strong nucleotide similarity and gene conservation, with several large insertions/deletions near their right genome ends. PMID:29122864
Kernel methods for large-scale genomic data analysis

Science.gov (United States)

Xing, Eric P.; Schaid, Daniel J.

2015-01-01

Machine learning, particularly kernel methods, has been demonstrated as a promising new tool to tackle the challenges imposed by today’s explosive data growth in genomics. Kernel methods provide a practical and principled approach to learning how a large number of genetic variants are associated with complex phenotypes, to help reveal the complexity in the relationship between the genetic markers and the outcome of interest. In this review, we highlight the potential key role they will have in modern genomic data processing, especially with regard to integration with classical methods for gene prioritizing, prediction and data fusion. PMID:25053743
LARGE-SCALE INDICATIVE MAPPING OF SOIL RUNOFF

Directory of Open Access Journals (Sweden)

E. Panidi

2017-11-01

In our study we estimate relationships between quantitative parameters of relief, soil runoff regime, and spatial distribution of radioactive pollutants in the soil. The study is conducted on a test arable area located in the basin of the upper Oka River (Orel region, Russia). Previously we collected a large number of soil samples, which make it possible to investigate the redistribution of Chernobyl-origin cesium-137 in soil material and, as a consequence, the soil runoff magnitude at sampling points. Here we describe and discuss the technique applied to large-scale mapping of the soil runoff. The technique is based upon cesium-137 radioactivity measurement in the different relief structures. Key stages are the allocation of the soil sampling points (we used very high resolution space imagery as supporting data); soil sample collection and analysis; calibration of the mathematical model (using the estimated background value of the cesium-137 radioactivity); and automated compilation of the map (a predictive map of the studied territory; a digital elevation model is used for this purpose, and cesium-137 radioactivity can be predicted using quantitative parameters of the relief). The maps can be used as supporting data for precision agriculture and for recultivation or melioration purposes.
Large inserts for big data: artificial chromosomes in the genomic era.

Science.gov (United States)

Tocchetti, Arianna; Donadio, Stefano; Sosio, Margherita

2018-05-01

The exponential increase in available microbial genome sequences coupled with predictive bioinformatic tools is underscoring the genetic capacity of bacteria to produce an unexpectedly large number of specialized bioactive compounds. Since most of the biosynthetic gene clusters (BGCs) present in microbial genomes are cryptic, i.e. not expressed under laboratory conditions, a variety of cloning systems and vectors have been devised to harbor DNA fragments large enough to carry entire BGCs and to allow their transfer into suitable heterologous hosts. This minireview provides an overview of the vectors and approaches that have been developed for cloning large BGCs, and successful examples of heterologous expression.
Continuous data assimilation for downscaling large-footprint soil moisture retrievals

KAUST Repository

Altaf, M. U.

2016-09-01

Soil moisture is a crucial component of the hydrologic cycle, significantly influencing runoff, infiltration, recharge, evaporation and transpiration processes. Models characterizing these processes require soil moisture as an input, either directly or indirectly. Better characterization of the spatial variability of soil moisture leads to better predictions from hydrologic/climate models. In-situ measurements have fine resolution, but become impractical in terms of coverage over large extents. Remotely sensed data have excellent spatial coverage extents, but suffer from poorer spatial and temporal resolution. We present here an innovative approach to downscaling coarse resolution soil moisture data by combining data assimilation and physically based modeling. In this approach, we exploit the features of Continuous Data Assimilation (CDA). A nudging term, estimated as the misfit between interpolants of the assimilated coarse grid measurements and the fine grid model solution, is added to the model equations to constrain the model’s large scale variability by available measurements. Soil moisture fields generated at a fine resolution by a physically-based vadose zone model (e.g., HYDRUS) are subjected to data assimilation conditioned upon the coarse resolution observations. This enables nudging of the model outputs towards values that honor the coarse resolution dynamics while still being generated at the fine scale. The large scale features of the model output are constrained to the observations, and as a consequence, the misfit at the fine scale is reduced. The advantage of this approach is that fine resolution soil moisture maps can be generated across large spatial extents, given the coarse resolution data. The data assimilation approach also enables multi-scale data generation which is helpful to match the soil moisture input data to the corresponding modeling scale. Application of this approach is likely in generating fine and intermediate resolution soil
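The nudging idea described in this abstract can be illustrated on a toy problem. The sketch below is not the authors' setup: it replaces the vadose zone model (HYDRUS in the abstract) with periodic 1-D diffusion, uses block averages as the coarse-grid interpolant, and picks the grid size, time step, diffusivity and nudging coefficient `mu` arbitrarily. It assumes NumPy is available. It only shows the mechanism: the misfit between the interpolated coarse observations and the interpolated model state is added to the model tendency.

```python
import numpy as np


def coarse_interpolant(field, block):
    """Piecewise-constant interpolant: replace each block by its mean."""
    means = field.reshape(-1, block).mean(axis=1)
    return np.repeat(means, block)


def diffusion_tendency(u, diffusivity, dx):
    """Right-hand side of periodic 1-D diffusion (toy stand-in model)."""
    return diffusivity * (np.roll(u, 1) - 2.0 * u + np.roll(u, -1)) / dx**2


n, block = 64, 8                # fine cells, fine cells per coarse cell
dx, dt = 1.0 / n, 0.01
diffusivity, mu = 1e-3, 5.0     # mu is the (assumed) nudging coefficient
x = np.arange(n) * dx

truth = 0.3 + 0.1 * np.sin(2.0 * np.pi * x)  # synthetic "true" moisture
free = np.full(n, 0.1)                       # model with a wrong initial state
nudged = free.copy()

for _ in range(500):
    obs = coarse_interpolant(truth, block)   # coarse-resolution observations
    misfit = obs - coarse_interpolant(nudged, block)
    nudged = nudged + dt * (diffusion_tendency(nudged, diffusivity, dx) + mu * misfit)
    free = free + dt * diffusion_tendency(free, diffusivity, dx)
    truth = truth + dt * diffusion_tendency(truth, diffusivity, dx)

rmse_free = float(np.sqrt(np.mean((free - truth) ** 2)))
rmse_nudged = float(np.sqrt(np.mean((nudged - truth) ** 2)))
print(rmse_free, rmse_nudged)
```

In this toy case the un-nudged run keeps its initial bias, while the nudged run is pulled toward the coarse-scale behaviour of the truth and ends with a smaller fine-scale error.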
A protocol for large scale genomic DNA isolation for cacao genetics ...

African Journals Online (AJOL)

Advances in DNA technology, such as marker assisted selection, detection of quantitative trait loci and genomic selection also require the isolation of DNA from a large number of samples and the preservation of tissue samples for future use in cacao genome studies. The present study proposes a method for the ...
Genomic divergences among cattle, dog and human estimated from large-scale alignments of genomic sequences

Directory of Open Access Journals (Sweden)

Shade Larry L

2006-06-01

Background: Approximately 11 Mb of finished high quality genomic sequences were sampled from cattle, dog and human to estimate genomic divergences and their regional variation among these lineages. Results: Optimal three-way multi-species global sequence alignments for 84 cattle clones or loci (each >50 kb of genomic sequence) were constructed using the human and dog genome assemblies as references. Genomic divergences and substitution rates were examined for each clone and for various sequence classes under different functional constraints. Analysis of these alignments revealed that the overall genomic divergences are relatively constant (0.32–0.37 change/site) for pairwise comparisons among cattle, dog and human; however, substitution rates vary across genomic regions and among different sequence classes. A neutral mutation rate (2.0–2.2 × 10^-9 change/site/year) was derived from ancestral repetitive sequences, whereas the substitution rate in coding sequences (1.1 × 10^-9 change/site/year) was approximately half of the overall rate (1.9–2.0 × 10^-9 change/site/year). Relative rate tests also indicated that cattle have a significantly faster rate of substitution as compared to dog and that this difference is about 6%. Conclusion: This analysis provides a large-scale and unbiased assessment of genomic divergences and regional variation of substitution rates among cattle, dog and human. It is expected that these data will serve as a baseline for future mammalian molecular evolution studies.
Indexes of large genome collections on a PC.

Directory of Open Access Journals (Sweden)

Agnieszka Danek

The availability of thousands of individual genomes of one species should boost rapid progress in personalized medicine or understanding of the interaction between genotype and phenotype, to name a few applications. A key operation useful in such analyses is aligning sequencing reads against a collection of genomes, which is costly with the use of existing algorithms due to their large memory requirements. We present MuGI, Multiple Genome Index, which reports all occurrences of a given pattern, under exact and approximate matching models, against a collection of thousand(s) of genomes. Its unique feature is the small index size, which is customisable. It fits in a standard computer with 16-32 GB, or even 8 GB, of RAM, for the 1000GP collection of 1092 diploid human genomes. The solution is also fast. For example, exact matching queries (of average length 150 bp) are handled in an average time of 39 µs, and queries with up to 3 mismatches in 373 µs, on the test PC with an index size of 13.4 GB. For a smaller index, occupying 7.4 GB in memory, the respective times grow to 76 µs and 917 µs. Software is available at http://sun.aei.polsl.pl/mugi under a free license. Data S1 is available at PLOS One online.
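To make the query in this record concrete (report every occurrence of a pattern across a collection of genomes), here is a deliberately naive seed-and-verify sketch. It is not MuGI's data structure, which the abstract describes only as a small, customisable index; a plain hash table of k-mers like this one would be far too large for 1092 human genomes. All names and the toy sequences are hypothetical.

```python
from collections import defaultdict


def build_index(genomes, k):
    """Map every k-mer to the (genome_id, position) pairs where it starts."""
    index = defaultdict(list)
    for gid, seq in genomes.items():
        for pos in range(len(seq) - k + 1):
            index[seq[pos:pos + k]].append((gid, pos))
    return index


def find_exact(pattern, genomes, index, k):
    """Report all exact occurrences of a pattern of length >= k."""
    hits = []
    # Look up the first k characters as a seed, then verify the full pattern.
    for gid, pos in index.get(pattern[:k], []):
        if genomes[gid].startswith(pattern, pos):
            hits.append((gid, pos))
    return sorted(hits)


# Two toy "individual genomes" differing by a single substitution.
genomes = {
    "sample1": "ACGTACGTTAGC",
    "sample2": "ACGTACCTTAGC",
}
index = build_index(genomes, k=4)
shared_hits = find_exact("ACGTAC", genomes, index, k=4)
private_hits = find_exact("TACGTT", genomes, index, k=4)
print(shared_hits, private_hits)
```

The first pattern occurs in both samples, the second only in the sample that carries the reference base, which is the kind of per-individual answer a multi-genome index has to return.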
Genic regions of a large salamander genome contain long introns and novel genes

Directory of Open Access Journals (Sweden)

Bryant Susan V

2009-01-01

Background: The basis of genome size variation remains an outstanding question because DNA sequence data are lacking for organisms with large genomes. Sixteen BAC clones from the Mexican axolotl (Ambystoma mexicanum: c-value = 32 × 10^9 bp) were isolated and sequenced to characterize the structure of genic regions. Results: Annotation of genes within BACs showed that axolotl introns are on average 10× longer than orthologous vertebrate introns and they are predicted to contain more functional elements, including miRNAs and snoRNAs. Loci were discovered within BACs for two novel EST transcripts that are differentially expressed during spinal cord regeneration and skin metamorphosis. Unexpectedly, a third novel gene was also discovered while manually annotating BACs. Analysis of human-axolotl protein-coding sequences suggests there are 2% more lineage specific genes in the axolotl genome than the human genome, but the great majority (86%) of genes between axolotl and human are predicted to be 1:1 orthologs. Considering that axolotl genes are on average 5× larger than human genes, the genic component of the salamander genome is estimated to be incredibly large, approximately 2.8 gigabases! Conclusion: This study shows that a large salamander genome has a correspondingly large genic component, primarily because genes have incredibly long introns. These intronic sequences may harbor novel coding and non-coding sequences that regulate biological processes that are unique to salamanders.

Decoding Synteny Blocks and Large-Scale Duplications in Mammalian and Plant Genomes

Science.gov (United States)

Peng, Qian; Alekseyev, Max A.; Tesler, Glenn; Pevzner, Pavel A.

The existing synteny block reconstruction algorithms use anchors (e.g., orthologous genes) shared over all genomes to construct the synteny blocks for multiple genomes. This approach, while efficient for a few genomes, cannot be scaled to address the need to construct synteny blocks in many mammalian genomes that are currently being sequenced. The problem is that the number of anchors shared among all genomes quickly decreases with the increase in the number of genomes. Another problem is that many genomes (plant genomes in particular) have undergone extensive duplications, which makes decoding of genomic architecture and rearrangement analysis in plants difficult. The existing synteny block generation algorithms in plants do not address the issue of generating non-overlapping synteny blocks suitable for analyzing rearrangements and the evolutionary history of duplications. We present a new algorithm based on the A-Bruijn graph framework that overcomes these difficulties and provides a unified approach to synteny block reconstruction for multiple genomes, and for genomes with large duplications.
WormBase: Annotating many nematode genomes.

Science.gov (United States)

Howe, Kevin; Davis, Paul; Paulini, Michael; Tuli, Mary Ann; Williams, Gary; Yook, Karen; Durbin, Richard; Kersey, Paul; Sternberg, Paul W

2012-01-01

WormBase (www.wormbase.org) has been serving the scientific community for over 11 years as the central repository for genomic and genetic information for the soil nematode Caenorhabditis elegans. The resource has evolved from its beginnings as a database housing the genomic sequence and genetic and physical maps of a single species, and now represents the breadth and diversity of nematode research, currently serving genome sequence and annotation for around 20 nematodes. In this article, we focus on WormBase's role of genome sequence annotation, describing how we annotate and integrate data from a growing collection of nematode species and strains. We also review our approaches to sequence curation, and discuss the impact on annotation quality of large functional genomics projects such as modENCODE.
Assembling large genomes: analysis of the stick insect (Clitarchus hookeri) genome reveals a high repeat content and sex-biased genes associated with reproduction.

Science.gov (United States)

Wu, Chen; Twort, Victoria G; Crowhurst, Ross N; Newcomb, Richard D; Buckley, Thomas R

2017-11-16

Stick insects (Phasmatodea) have a high incidence of parthenogenesis and other alternative reproductive strategies, yet the genetic basis of reproduction is poorly understood. Phasmatodea includes nearly 3000 species, yet only the genome of Timema cristinae has been published to date. Clitarchus hookeri is a geographical parthenogenetic stick insect distributed across New Zealand. Sexual reproduction dominates in northern habitats but is replaced by parthenogenesis in the south. Here, we present a de novo genome assembly of a female C. hookeri and use it to detect candidate genes associated with gamete production and development in females and males. We also explore the factors underlying large genome size in stick insects. The C. hookeri genome assembly was 4.2 Gb, similar to the flow cytometry estimate, making it the second largest insect genome sequenced and assembled to date. Like the large genome of Locusta migratoria, the genome of C. hookeri is also highly repetitive and the predicted gene models are much longer than those from most other sequenced insect genomes, largely due to longer introns. Miniature inverted repeat transposable elements (MITEs), absent in the much smaller T. cristinae genome, are the most abundant repeat type in the C. hookeri genome assembly. Mapping RNA-Seq reads from female and male gonadal transcriptomes onto the genome assembly resulted in the identification of 39,940 gene loci, 15.8% and 37.6% of which showed female-biased and male-biased expression, respectively. The genes that were over-expressed in females were mostly associated with molecular transportation, developmental process, oocyte growth and reproductive process; whereas, the male-biased genes were enriched in rhythmic process, molecular transducer activity and synapse. Several genes involved in the juvenile hormone synthesis pathway were also identified. The evolution of large insect genomes such as L. migratoria and C. hookeri genomes is most likely due to the
Multidimensional scaling for large genomic data sets

Directory of Open Access Journals (Sweden)

Lu Henry

2008-04-01

Background: Multi-dimensional scaling (MDS) aims to represent high dimensional data in a low dimensional space while preserving the similarities between data points. This reduction in dimensionality is crucial for analyzing and revealing the genuine structure hidden in the data. For noisy data, dimension reduction can effectively reduce the effect of noise on the embedded structure. For large data sets, dimension reduction can effectively reduce information retrieval complexity. Thus, MDS techniques are used in many applications of data mining and gene network research. However, although there have been a number of studies that applied MDS techniques to genomics research, the number of analyzed data points was restricted by the high computational complexity of MDS. In general, a non-metric MDS method is faster than a metric MDS, but it does not preserve the true relationships. The computational complexity of most metric MDS methods is over O(N^2), so that it is difficult to process a data set with a large number of genes N, such as in the case of whole genome microarray data. Results: We developed a new rapid metric MDS method with a low computational complexity, making metric MDS applicable for large data sets. Computer simulation showed that the new method of split-and-combine MDS (SC-MDS) is fast, accurate and efficient. Our empirical studies using microarray data on the yeast cell cycle showed that the performance of K-means in the reduced dimensional space is similar to or slightly better than that of K-means in the original space, while the clustering results are obtained about three times faster. Our clustering results using SC-MDS are more stable than those in the original space. Hence, the proposed SC-MDS is useful for analyzing whole genome data. Conclusion: Our new method reduces the computational complexity from O(N^3) to O(N) when the dimension of the feature space is far less than the number of genes N, and it successfully
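For orientation, the baseline that the abstract contrasts with its split-and-combine method is classical metric MDS, whose eigendecomposition step is the source of the cubic cost. The sketch below implements that baseline (Torgerson's double-centering procedure) with NumPy; it is not SC-MDS, whose details are not given in the abstract. The function name and the four toy points are hypothetical.

```python
import numpy as np


def classical_mds(dist, n_components=2):
    """Classical (Torgerson) metric MDS from a matrix of pairwise distances."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering  # double centering
    eigvals, eigvecs = np.linalg.eigh(gram)            # full eigendecomposition
    order = np.argsort(eigvals)[::-1][:n_components]   # largest eigenvalues first
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale


# Toy data: four points in the plane, described only by their distances.
points = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0], [3.0, 4.0]])
dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)

embedding = classical_mds(dist, n_components=2)
recovered = np.linalg.norm(embedding[:, None, :] - embedding[None, :, :], axis=-1)
print(np.round(recovered, 6))
```

The embedding reproduces the input distances up to rotation and reflection. Because both the distance matrix and the eigendecomposition involve all N points at once, this direct approach becomes impractical for whole-genome data sets, which is the limitation the record addresses.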
105-DR Large sodium fire facility soil sampling data evaluation report

International Nuclear Information System (INIS)

Adler, J.G.

1996-01-01

This report evaluates the soil sampling activities, soil sample analysis, and soil sample data associated with the closure activities at the 105-DR Large Sodium Fire Facility. The evaluation compares these activities to the regulatory requirements for meeting clean closure. The report concludes that there is no soil contamination from the waste treatment activities
Novel European free-living, non-diazotrophic Bradyrhizobium isolates from contrasting soils that lack nodulation and nitrogen fixation genes - a genome comparison

Science.gov (United States)

Jones, Frances Patricia; Clark, Ian M.; King, Robert; Shaw, Liz J.; Woodward, Martin J.; Hirsch, Penny R.

2016-05-01

The slow-growing genus Bradyrhizobium is biologically important in soils, with different representatives found to perform a range of biochemical functions including photosynthesis, induction of root nodules and symbiotic nitrogen fixation and denitrification. Consequently, the role of the genus in soil ecology and biogeochemical transformations is of agricultural and environmental significance. Some isolates of Bradyrhizobium have been shown to be non-symbiotic and do not possess the ability to form nodules. Here we present the genome and gene annotations of two such free-living Bradyrhizobium isolates, named G22 and BF49, from soils with differing long-term management regimes (grassland and bare fallow respectively) in addition to carbon metabolism analysis. These Bradyrhizobium isolates are the first to be isolated and sequenced from European soil and are the first free-living Bradyrhizobium isolates, lacking both nodulation and nitrogen fixation genes, to have their genomes sequenced and assembled from cultured samples. The G22 and BF49 genomes are distinctly different with respect to size and number of genes; the grassland isolate also contains a plasmid. There are also a number of functional differences between these isolates and other published genomes, suggesting that this ubiquitous genus is extremely heterogeneous and has roles within the community not including symbiotic nitrogen fixation.
Seismic soil-structure interaction of foundations with large piles

International Nuclear Information System (INIS)

Zeevaert, L.

1996-01-01

In seismic regions with soft soil deposits subjected to ground surface subsidence, the weight of constructions must be supported on large-diameter piles or piers bearing on deep firm strata. To justify the action of these elements working under flexo-compression and shear, it is necessary to perform calculations of soil-pile interaction from a practical engineering point of view and to estimate the order of magnitude of the forces and displacements to which these elements will be subjected during the seismic action assigned to the foundation. In this paper a pier is defined as a large-diameter pile constructed on site. Furthermore, in the seismic analysis it is necessary to evaluate the seismic pore water pressure in order to determine the effective seismic soil stresses close to the ground surface. (author)
GENEPEASE Genomic tools for assessment of pesticide effects on the agricultural soil ecosystem

DEFF Research Database (Denmark)

Jacobsen, Carsten Suhr; Feld, Louise; Hjelmsø, Mathis Hjort

The project focussed on validating RNA-based methods as potential genomic tools in the assessment of agricultural soil ecosystems. It was shown that the mRNA-based technique was very sensitive, and effects were seen in the same situations in which the OECD nitrification assay showed an effect. 16S rRNA-based pyrosequencing of bacterial communities in soil was shown to report differently from purely DNA-based analysis and indicated, unlike the DNA measurement, that the community was developing. Finally, microarray analysis was compared to traditional tests for toxicity testing of Folsomia candida and showed...
Large-scale chromosome folding versus genomic DNA sequences: A discrete double Fourier transform technique.

Science.gov (United States)

Chechetkin, V R; Lobzin, V V

2017-08-07

State-of-the-art techniques combining imaging methods and high-throughput genomic mapping tools have led to significant progress in detailing the chromosome architecture of various organisms. However, a gap still remains between the rapidly growing structural data on chromosome folding and large-scale genome organization. Could part of the information on chromosome folding be obtained directly from the underlying genomic DNA sequences abundantly stored in the databanks? To answer this question, we developed an original discrete double Fourier transform (DDFT). DDFT serves for the detection of large-scale genome regularities associated with domains/units at the different levels of hierarchical chromosome folding. The method is versatile and can be applied to both genomic DNA sequences and corresponding physico-chemical parameters such as base-pairing free energy. The latter characteristic is closely related to replication and transcription and can also be used for the assessment of temperature or supercoiling effects on chromosome folding. We tested the method on the genome of E. coli K-12 and found good correspondence with the annotated domains/units established experimentally. As a brief illustration of further abilities of DDFT, a study of large-scale genome organization for bacteriophage PHIX174 and the bacterium Caulobacter crescentus was also added. The combined experimental, modeling, and bioinformatic DDFT analysis should yield more complete knowledge of chromosome architecture and genome organization. Copyright © 2017 Elsevier Ltd. All rights reserved.
Non-contiguous finished genome sequence and contextual data of the filamentous soil bacterium Ktedonobacter racemifer type strain (SOSP1-21T)

Energy Technology Data Exchange (ETDEWEB)

Chang, Yun-Juan [ORNL; Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Chertkov, Olga [Los Alamos National Laboratory (LANL); Glavina Del Rio, Tijana [U.S. Department of Energy, Joint Genome Institute; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute; Copeland, A [U.S. Department of Energy, Joint Genome Institute; Tice, Hope [U.S. Department of Energy, Joint Genome Institute; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute; Han, Cliff [Los Alamos National Laboratory (LANL); Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute; Ivanova, N [U.S. Department of Energy, Joint Genome Institute; Ovchinnikova, Galina [U.S. Department of Energy, Joint Genome Institute; Pati, Amrita [U.S. Department of Energy, Joint Genome Institute; Chen, Amy [U.S. Department of Energy, Joint Genome Institute; Palaniappan, Krishna [U.S. Department of Energy, Joint Genome Institute; Mavromatis, K [U.S. Department of Energy, Joint Genome Institute; Liolios, Konstantinos [U.S. Department of Energy, Joint Genome Institute; Brettin, Thomas S [ORNL; Fiebig, Anne [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Rohde, Manfred [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany; Abt, Birte [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Detter, J. Chris [U.S. Department of Energy, Joint Genome Institute; Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute; Bristow, James [U.S. Department of Energy, Joint Genome Institute; Eisen, Jonathan [U.S. Department of Energy, Joint Genome Institute; Markowitz, Victor [U.S. Department of Energy, Joint Genome Institute; Hugenholtz, Philip [U.S. 
Department of Energy, Joint Genome Institute; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute

2011-01-01

Ktedonobacter racemifer corrig. Cavaletti et al. 2007 is the type species of the genus Ktedonobacter, which in turn is the type genus of the family Ktedonobacteraceae, the type family of the order Ktedonobacterales within the class Ktedonobacteria in the phylum Chloroflexi. Although K. racemifer shares some morphological features with the actinobacteria, it is of special interest because it was the first cultivated representative of a deep branching unclassified lineage of otherwise uncultivated environmental phylotypes tentatively located within the phylum Chloroflexi. The aerobic, filamentous, non-motile, spore-forming Gram-positive heterotroph was isolated from soil in Italy. The 13,661,586 bp long non-contiguous finished genome consists of ten contigs and is the first reported genome sequence from a member of the class Ktedonobacteria. With its 11,453 protein-coding and 87 RNA genes, it is the largest prokaryotic genome reported so far. It comprises a large number of over-represented COGs, particularly genes associated with transposons, which makes the genetic redundancy within the genome considerably larger than expected by chance. This work is a part of the Genomic Encyclopedia of Bacteria and Archaea project.
Removal of Uranium in Soil Using Large-scale Electrokinetic Decontamination Equipment

Energy Technology Data Exchange (ETDEWEB)

Kim, Gye Nam; Kim, Il gook; Jeong, Jung Whan; Kim, Seung Soo; Choi, Jong Won [KAERI, Daejeon (Korea, Republic of)

2016-05-15

A method to remediate a large volume of radioactive soil should be developed. Until now, the soil washing method has been studied to remediate soil contaminated with uranium, cobalt, cesium, and so on. However, it has a lower removal efficiency for nuclides in soil and generates a large volume of waste solution. In addition, it cannot be applied to soil composed of fine particles. Thus, the electrokinetic method has recently been studied as a new technology for soil remediation. In this study, to reduce the waste electrolyte volume, the reuse period of waste electrolyte in electrokinetic decontamination was evaluated through several experiments with the manufactured 1.2 ton electrokinetic decontamination equipment. In addition, the time required to reach below the clearance concentration level for self-disposal was estimated through several experiments using the same equipment. When the initial uranium concentrations in the soils were 7.0-20.0 Bq/g, the times required to reach below the clearance concentration level for self-disposal were 25-40 days with the waste and reclaimed electrolytes.
Bionimbus: a cloud for managing, analyzing and sharing large genomics datasets.

Science.gov (United States)

Heath, Allison P; Greenway, Matthew; Powell, Raymond; Spring, Jonathan; Suarez, Rafael; Hanley, David; Bandlamudi, Chai; McNerney, Megan E; White, Kevin P; Grossman, Robert L

2014-01-01

As large genomics and phenotypic datasets are becoming more common, it is increasingly difficult for most researchers to access, manage, and analyze them. One possible approach is to provide the research community with several petabyte-scale cloud-based computing platforms containing these data, along with tools and resources to analyze it. Bionimbus is an open source cloud-computing platform that is based primarily upon OpenStack, which manages on-demand virtual machines that provide the required computational resources, and GlusterFS, which is a high-performance clustered file system. Bionimbus also includes Tukey, which is a portal, and associated middleware that provides a single entry point and a single sign on for the various Bionimbus resources; and Yates, which automates the installation, configuration, and maintenance of the software infrastructure required. Bionimbus is used by a variety of projects to process genomics and phenotypic data. For example, it is used by an acute myeloid leukemia resequencing project at the University of Chicago. The project requires several computational pipelines, including pipelines for quality control, alignment, variant calling, and annotation. For each sample, the alignment step requires eight CPUs for about 12 h. BAM file sizes ranged from 5 GB to 10 GB for each sample. Most members of the research community have difficulty downloading large genomics datasets and obtaining sufficient storage and computer resources to manage and analyze the data. Cloud computing platforms, such as Bionimbus, with data commons that contain large genomics datasets, are one choice for broadening access to research data in genomics. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.
Complete Genome Sequence of Bradyrhizobium sp. S23321: Insights into Symbiosis Evolution in Soil Oligotrophs

Science.gov (United States)

Okubo, Takashi; Tsukui, Takahiro; Maita, Hiroko; Okamoto, Shinobu; Oshima, Kenshiro; Fujisawa, Takatomo; Saito, Akihiro; Futamata, Hiroyuki; Hattori, Reiko; Shimomura, Yumi; Haruta, Shin; Morimoto, Sho; Wang, Yong; Sakai, Yoriko; Hattori, Masahira; Aizawa, Shin-ichi; Nagashima, Kenji V. P.; Masuda, Sachiko; Hattori, Tsutomu; Yamashita, Akifumi; Bao, Zhihua; Hayatsu, Masahito; Kajiya-Kanegae, Hiromi; Yoshinaga, Ikuo; Sakamoto, Kazunori; Toyota, Koki; Nakao, Mitsuteru; Kohara, Mitsuyo; Anda, Mizue; Niwa, Rieko; Jung-Hwan, Park; Sameshima-Saito, Reiko; Tokuda, Shin-ichi; Yamamoto, Sumiko; Yamamoto, Syuji; Yokoyama, Tadashi; Akutsu, Tomoko; Nakamura, Yasukazu; Nakahira-Yanaka, Yuka; Hoshino, Yuko Takada; Hirakawa, Hideki; Mitsui, Hisayuki; Terasawa, Kimihiro; Itakura, Manabu; Sato, Shusei; Ikeda-Ohtsubo, Wakako; Sakakura, Natsuko; Kaminuma, Eli; Minamisawa, Kiwamu

2012-01-01

Bradyrhizobium sp. S23321 is an oligotrophic bacterium isolated from paddy field soil. Although S23321 is phylogenetically close to Bradyrhizobium japonicum USDA110, a legume symbiont, it is unable to induce root nodules in siratro, a legume often used for testing Nod factor-dependent nodulation. The genome of S23321 is a single circular chromosome, 7,231,841 bp in length, with an average GC content of 64.3%. The genome contains 6,898 potential protein-encoding genes, one set of rRNA genes, and 45 tRNA genes. Comparison of the genome structure between S23321 and USDA110 showed strong colinearity; however, the symbiosis islands present in USDA110 were absent in S23321, whose genome lacked a chaperonin gene cluster (groELS3) for symbiosis regulation found in USDA110. A comparison of sequences around the tRNA-Val gene strongly suggested that S23321 contains an ancestral-type genome that precedes the acquisition of a symbiosis island by horizontal gene transfer. Although S23321 contains a nif (nitrogen fixation) gene cluster, the organization, homology, and phylogeny of the genes in this cluster were more similar to those of photosynthetic bradyrhizobia ORS278 and BTAi1 than to those on the symbiosis island of USDA110. In addition, we found genes encoding a complete photosynthetic system, many ABC transporters for amino acids and oligopeptides, two types (polar and lateral) of flagella, multiple respiratory chains, and a system for lignin monomer catabolism in the S23321 genome. These features suggest that S23321 is able to adapt to a wide range of environments, probably including low-nutrient conditions, with multiple survival strategies in soil and rhizosphere. PMID:22452844
Rainbow: a tool for large-scale whole-genome sequencing data analysis using cloud computing.

Science.gov (United States)

Zhao, Shanrong; Prenger, Kurt; Smith, Lance; Messina, Thomas; Fan, Hongtao; Jaeger, Edward; Stephens, Susan

2013-06-27

Technical improvements have decreased sequencing costs and, as a result, the size and number of genomic datasets have increased rapidly. Because of the lower cost, large amounts of sequence data are now being produced by small to midsize research groups. Crossbow is a software tool that can detect single nucleotide polymorphisms (SNPs) in whole-genome sequencing (WGS) data from a single subject; however, Crossbow has a number of limitations when applied to multiple subjects from large-scale WGS projects. The data storage and CPU resources that are required for large-scale whole genome sequencing data analyses are too large for many core facilities and individual laboratories to provide. To help meet these challenges, we have developed Rainbow, a cloud-based software package that can assist in the automation of large-scale WGS data analyses. Here, we evaluated the performance of Rainbow by analyzing 44 different whole-genome-sequenced subjects. Rainbow has the capacity to process genomic data from more than 500 subjects in two weeks using cloud computing provided by the Amazon Web Service. The time includes the import and export of the data using Amazon Import/Export service. The average cost of processing a single sample in the cloud was less than 120 US dollars. Compared with Crossbow, the main improvements incorporated into Rainbow include the ability: (1) to handle BAM as well as FASTQ input files; (2) to split large sequence files for better load balance downstream; (3) to log the running metrics in data processing and monitoring multiple Amazon Elastic Compute Cloud (EC2) instances; and (4) to merge SOAPsnp outputs for multiple individuals into a single file to facilitate downstream genome-wide association studies. Rainbow is a scalable, cost-effective, and open-source tool for large-scale WGS data analysis. For human WGS data sequenced by either the Illumina HiSeq 2000 or HiSeq 2500 platforms, Rainbow can be used straight out of the box. Rainbow is available
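The abstract above lists splitting large sequence files for better load balance as one of Rainbow's improvements but does not say how the split is done. The following is only a minimal sketch of the general idea, assuming unwrapped four-line FASTQ records; the function name `split_fastq` and the toy reads are hypothetical and are not taken from Rainbow.

```python
def split_fastq(lines, reads_per_chunk):
    """Split FASTQ lines into chunks of at most reads_per_chunk records.

    Assumes the common unwrapped layout of exactly four lines per read
    (header, sequence, '+', qualities). Hypothetical helper, not Rainbow's code.
    """
    if reads_per_chunk < 1:
        raise ValueError("reads_per_chunk must be positive")
    if len(lines) % 4 != 0:
        raise ValueError("FASTQ input must contain four lines per read")
    step = 4 * reads_per_chunk
    return [lines[i:i + step] for i in range(0, len(lines), step)]

# Toy input (hypothetical): three reads
reads = [
    "@r1", "ACGT", "+", "IIII",
    "@r2", "TTGA", "+", "IIII",
    "@r3", "GGCC", "+", "IIII",
]
chunks = split_fastq(reads, reads_per_chunk=2)
```

Each chunk could then be handed to a separate worker or instance, so that no single node has to process the whole file.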
Large-scale genomic 2D visualization reveals extensive CG-AT skew correlation in bird genomes

Directory of Open Access Journals (Sweden)

Deng Xuemei

2007-11-01

Abstract Background Bird genomes have a very different compositional structure compared with other warm-blooded animals. The variation in the base skew rules in the vertebrate genomes remains puzzling, but it must relate somehow to large-scale genome evolution. Current research is inclined to relate base skew to mutations and their fixation. Here we wish to explore base skew correlations in bird genomes, to develop methods for displaying and quantifying such correlations at different scales, and to discuss possible explanations for the peculiarities of the bird genomes in skew correlation. Results We have developed a method called Base Skew Double Triangle (BSDT) for exhibiting the genome-scale change of AT/CG skew as a two-dimensional square picture, showing base skews at many scales simultaneously in a single image. By this method we found that most chicken chromosomes have high AT/CG skew correlation (symmetry in the 2D picture), except for some microchromosomes. No other organisms studied (18 species) show such high skew correlations. This visualized high correlation was validated by three kinds of quantitative calculations with overlapping and non-overlapping windows, all indicating that chicken and birds in general have a special genome structure. Similar features were also found in some of the mammal genomes, but they were clearly much weaker than in chickens. We presume that the skew correlation feature evolved near the time that birds separated from other vertebrate lineages. When we eliminated the repeat sequences from the genomes, the correlation between AT and CG skews increased for some mammal genomes, but was still clearly lower than in chickens. Conclusion Our results suggest that BSDT is an expressive visualization method for AT and CG skew and enabled the discovery of the very high skew correlation in bird genomes; this peculiarity is worth further study. Computational analysis indicated that this correlation might be a compositional characteristic
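The windowed calculations mentioned in the abstract above can be illustrated with a small sketch. It is not the BSDT visualization itself. It assumes the common skew definitions (A - T)/(A + T) and (C - G)/(C + G), which the abstract does not spell out, and uses non-overlapping windows; all function names and the toy sequence are hypothetical.

```python
from math import sqrt

def skew(x, y):
    """Return (x - y) / (x + y), or 0.0 when both counts are zero."""
    total = x + y
    return (x - y) / total if total else 0.0

def window_skews(seq, window):
    """AT and CG skews for consecutive non-overlapping windows."""
    seq = seq.upper()
    at, cg = [], []
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        at.append(skew(chunk.count("A"), chunk.count("T")))
        cg.append(skew(chunk.count("C"), chunk.count("G")))
    return at, cg

def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)

# Toy sequence (hypothetical) in which AT and CG skews move together
toy = "AAACTTTGAACCTTGG"
at_skew, cg_skew = window_skews(toy, window=4)
r = pearson(at_skew, cg_skew)
```

On real chromosomes the window size would be far larger, and repeating the calculation at several window sizes gives one correlation value per scale.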
Global repeat discovery and estimation of genomic copy number in a large, complex genome using a high-throughput 454 sequence survey

Directory of Open Access Journals (Sweden)

Varala Kranthi

2007-05-01

Abstract Background Extensive computational and database tools are available to mine genomic and genetic databases for model organisms, but little genomic data is available for many species of ecological or agricultural significance, especially those with large genomes. Genome surveys using conventional sequencing techniques are powerful, particularly for detecting sequences present in many copies per genome. However, these methods are time-consuming and have potential drawbacks. High-throughput 454 sequencing provides an alternative method by which much information can be gained quickly and cheaply from high-coverage surveys of genomic DNA. Results We sequenced 78 million base-pairs of randomly sheared soybean DNA which passed our quality criteria. Computational analysis of the survey sequences provided global information on the abundant repetitive sequences in soybean. The sequence was used to determine the copy number across regions of large genomic clones or contigs and to discover higher-order structures within satellite repeats. We have created an annotated, online database of sequences present in multiple copies in the soybean genome. The low bias of pyrosequencing against repeat sequences is demonstrated by the overall composition of the survey data, which matches well with past estimates of repetitive DNA content obtained by DNA re-association kinetics (Cot analysis). Conclusion This approach provides a potential aid to conventional or shotgun genome assembly, by allowing rapid assessment of copy number in any clone or clone-end sequence. In addition, we show that partial sequencing can provide access to partial protein-coding sequences.
Agaricus bisporus genome sequence: a commentary.

Science.gov (United States)

Kerrigan, Richard W; Challen, Michael P; Burton, Kerry S

2013-06-01

The genomes of two isolates of Agaricus bisporus have been sequenced recently. This soil-inhabiting fungus has a wide geographical distribution in nature and it is also cultivated in an industrialized indoor process ($4.7bn annual worldwide value) to produce edible mushrooms. Previously this lignocellulosic fungus has resisted precise econutritional classification, i.e. into white- or brown-rot decomposers. The generation of the genome sequence and transcriptomic analyses has revealed a new classification, 'humicolous', for species adapted to grow in humic-rich, partially decomposed leaf material. The Agaricus bisporus genomes contain a collection of polysaccharide and lignin-degrading genes and, more interestingly, an expanded number of genes (relative to other lignocellulosic fungi) that enhance degradation of lignin derivatives, i.e. heme-thiolate peroxidases and β-etherases. A motif that is hypothesized to be a promoter element in the humicolous adaptation suite is present in a large number of genes specifically up-regulated when the mycelium is grown on humic-rich substrate. The genome sequence of A. bisporus offers a platform to explore fungal biology in carbon-rich soil environments and terrestrial cycling of carbon, nitrogen, phosphorus and potassium. Copyright © 2013 Elsevier Inc. All rights reserved.
Small genomes and large seeds: chromosome numbers, genome size and seed mass in diploid Aesculus species (Sapindaceae).

Science.gov (United States)

Krahulcová, Anna; Trávnícek, Pavel; Krahulec, František; Rejmánek, Marcel

2017-04-01

Aesculus L. (horse chestnut, buckeye) is a genus of 12-19 extant woody species native to the temperate Northern Hemisphere. This genus is known for unusually large seeds among angiosperms. While chromosome counts are available for many Aesculus species, only one has had its genome size measured. The aim of this study is to provide more genome size data and analyse the relationship between genome size and seed mass in this genus. Chromosome numbers in root tip cuttings were confirmed for four species and reported for the first time for three additional species. Flow cytometric measurements of 2C nuclear DNA values were conducted on eight species, and mean seed mass values were estimated for the same taxa. The same chromosome number, 2n = 40, was determined in all investigated taxa. Original measurements of 2C values for seven Aesculus species (eight taxa), added to just one reliable datum for A. hippocastanum, confirmed the notion that the genome size in this genus with relatively large seeds is surprisingly low, ranging from 0·955 pg 2C^-1 in A. parviflora to 1·275 pg 2C^-1 in A. glabra var. glabra. The chromosome number of 2n = 40 seems to be conclusively the universal 2n number for non-hybrid species in this genus. Aesculus genome sizes are relatively small, not only within its own family, Sapindaceae, but also within woody angiosperms. The genome sizes seem to be distinct and non-overlapping among the four major Aesculus clades. These results provide extra support for the most recent reconstruction of Aesculus phylogeny. The correlation between the 2C values and seed masses in examined Aesculus species is slightly negative and not significant. However, when the four major clades are treated separately, there is a consistent positive association between larger genome size and larger seed mass within individual lineages. © The Author 2017. Published by Oxford University Press on behalf of the Annals of Botany Company. All rights reserved. For
Large-Scale Sequencing: The Future of Genomic Sciences Colloquium

Energy Technology Data Exchange (ETDEWEB)

Margaret Riley; Merry Buckley

2009-01-01

Genetic sequencing and the various molecular techniques it has enabled have revolutionized the field of microbiology. Examining and comparing the genetic sequences borne by microbes - including bacteria, archaea, viruses, and microbial eukaryotes - provides researchers insights into the processes microbes carry out, their pathogenic traits, and new ways to use microorganisms in medicine and manufacturing. Until recently, sequencing entire microbial genomes has been laborious and expensive, and the decision to sequence the genome of an organism was made on a case-by-case basis by individual researchers and funding agencies. Now, thanks to new technologies, the cost and effort of sequencing is within reach for even the smallest facilities, and the ability to sequence the genomes of a significant fraction of microbial life may be possible. The availability of numerous microbial genomes will enable unprecedented insights into microbial evolution, function, and physiology. However, the current ad hoc approach to gathering sequence data has resulted in an unbalanced and highly biased sampling of microbial diversity. A well-coordinated, large-scale effort to target the breadth and depth of microbial diversity would result in the greatest impact. The American Academy of Microbiology convened a colloquium to discuss the scientific benefits of engaging in a large-scale, taxonomically-based sequencing project. A group of individuals with expertise in microbiology, genomics, informatics, ecology, and evolution deliberated on the issues inherent in such an effort and generated a set of specific recommendations for how best to proceed. The vast majority of microbes are presently uncultured and, thus, pose significant challenges to such a taxonomically-based approach to sampling genome diversity. However, we have yet to even scratch the surface of the genomic diversity among cultured microbes. A coordinated sequencing effort of cultured organisms is an appropriate place to begin
BFAST: an alignment tool for large scale genome resequencing.

Directory of Open Access Journals (Sweden)

Nils Homer

2009-11-01

The new generation of massively parallel DNA sequencers, combined with the challenge of whole human genome resequencing, results in the need for rapid and accurate alignment of billions of short DNA sequence reads to a large reference genome. Speed is obviously of great importance, but equally important is maintaining alignment accuracy of short reads, in the 25-100 base range, in the presence of errors and true biological variation. We introduce a new algorithm specifically optimized for this task, as well as a freely available implementation, BFAST, which can align data produced by any of the current sequencing platforms, allows for user-customizable levels of speed and accuracy, supports paired end data, and provides for efficient parallel and multi-threaded computation on a computer cluster. The new method is based on creating flexible, efficient whole genome indexes to rapidly map reads to candidate alignment locations, with arbitrary multiple independent indexes allowed to achieve robustness against read errors and sequence variants. The final local alignment uses a Smith-Waterman method, with gaps to support the detection of small indels. We compare BFAST to a selection of large-scale alignment tools -- BLAT, MAQ, SHRiMP, and SOAP -- in terms of both speed and accuracy, using simulated and real-world datasets. We show BFAST can achieve substantially greater sensitivity of alignment in the context of errors and true variants, especially insertions and deletions, and minimize false mappings, while maintaining adequate speed compared to other current methods. We show BFAST can align the amount of data needed to fully resequence a human genome, one billion reads, with high sensitivity and accuracy, on a modest computer cluster in less than 24 hours. BFAST is available at (http://bfast.sourceforge.net).
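The final step named in the abstract above, gapped Smith-Waterman local alignment, can be sketched as follows. This is a textbook version with a linear gap penalty, not BFAST's implementation; the scoring values and the function name `smith_waterman` are assumptions chosen for illustration.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Textbook Smith-Waterman local alignment with a linear gap penalty.

    Returns (best_score, aligned_a, aligned_b). Illustrative only; the
    scoring parameters are arbitrary and not taken from BFAST.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    # Fill the score matrix; local alignment never lets a cell drop below zero
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap
            left = score[i][j - 1] + gap
            score[i][j] = max(0, diag, up, left)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)
    # Trace back from the best cell until a zero score is reached
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if score[i][j] == diag:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))

# Toy read and reference fragment (hypothetical) sharing the core "ACGT"
result = smith_waterman("TTACGTAA", "GGACGTCC")
```

The dynamic programme is quadratic in the two sequence lengths, which is why an aligner of this kind would run it only at candidate locations found by an index rather than against the whole genome.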

The role of prevention-oriented attitudes towards nature in people's judgment of new applications of genomics techniques in soil ecology

NARCIS (Netherlands)

de Boer, J.

2010-01-01

New applications of genomics techniques in soil ecology may provide people with fresh insights into the richness of microbial life forms and natural methods to build on the "self-cleaning capacity" of soils. Because genetic modification might also be involved, this paper examines people's judgments
Using relational databases for improved sequence similarity searching and large-scale genomic analyses.

Science.gov (United States)

Mackey, Aaron J; Pearson, William R

2004-10-01

Relational databases are designed to integrate diverse types of information and manage large sets of search results, greatly simplifying genome-scale analyses. Relational databases are essential for management and analysis of large-scale sequence analyses, and can also be used to improve the statistical significance of similarity searches by focusing on subsets of sequence libraries most likely to contain homologs. This unit describes using relational databases to improve the efficiency of sequence similarity searching and to demonstrate various large-scale genomic analyses of homology-related data. This unit describes the installation and use of a simple protein sequence database, seqdb_demo, which is used as a basis for the other protocols. These include basic use of the database to generate a novel sequence library subset, how to extend and use seqdb_demo for the storage of sequence similarity search results and making use of various kinds of stored search results to address aspects of comparative genomic analysis.
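Generating a sequence library subset from a relational database, as described in the abstract above, might look like the sketch below. The actual schema of seqdb_demo is not given in the abstract, so the `protein` table, its columns, and the sample rows are all hypothetical; SQLite is used here only because it ships with Python.

```python
import sqlite3

# In-memory database with a hypothetical, minimal protein table
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE protein ("
    " acc TEXT PRIMARY KEY,"   # accession (hypothetical column)
    " taxon TEXT NOT NULL,"    # source organism (hypothetical column)
    " seq TEXT NOT NULL)"      # amino acid sequence
)
conn.executemany(
    "INSERT INTO protein (acc, taxon, seq) VALUES (?, ?, ?)",
    [
        ("P1", "Escherichia coli", "MKT"),
        ("P2", "Homo sapiens", "MGS"),
        ("P3", "Escherichia coli", "MAL"),
    ],
)

def fasta_subset(connection, taxon):
    """Return a FASTA-formatted library restricted to one taxon."""
    rows = connection.execute(
        "SELECT acc, seq FROM protein WHERE taxon = ? ORDER BY acc",
        (taxon,),
    )
    return "".join(f">{acc}\n{seq}\n" for acc, seq in rows)

subset = fasta_subset(conn, "Escherichia coli")
```

Searching against a focused subset of this kind means fewer comparisons per query, which is the route by which the abstract says statistical significance of similarity searches can be improved.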
Ultra Large Gene Families: A Matter of Adaptation or Genomic Parasites?

Directory of Open Access Journals (Sweden)

Philipp H. Schiffer

2016-08-01

Gene duplication is an important mechanism of molecular evolution. It offers a fast track to modification, diversification, redundancy or rescue of gene function. However, duplication may also be neutral or (slightly) deleterious, and often ends in pseudogenisation. Here, we investigate the phylogenetic distribution of ultra large gene families on long and short evolutionary time scales. In particular, we focus on a family of NACHT-domain and leucine-rich-repeat-containing (NLR) genes, which we previously found in large numbers to occupy one chromosome arm of the zebrafish genome. We were interested to see whether such tight clustering is characteristic of ultra large gene families. Our data reconfirm that most gene family inflations are lineage-specific, but we can only identify very few gene clusters. Based on our observations we hypothesise that, beyond a certain size threshold, ultra large gene families continue to proliferate in a mechanism we term “run-away evolution”. This process might ultimately lead to the failure of genomic integrity and drive species to extinction.
The Hualien Large-Scale Seismic Test for soil-structure interaction research

International Nuclear Information System (INIS)

Tang, H.T.; Stepp, J.C.; Cheng, Y.H.

1991-01-01

A Large-Scale Seismic Test (LSST) Program at Hualien, Taiwan, has been initiated with the primary objective of obtaining earthquake-induced SSI data at a stiff soil site having similar prototypical nuclear power plant soil conditions. Preliminary soil boring, geophysical testing and ambient and earthquake-induced ground motion monitoring have been conducted to understand the experiment site conditions. More refined field and laboratory tests will be conducted such as the state-of-the-art freezing sampling technique and the large penetration test (LPT) method to characterize the soil constitutive behavior. The test model to be constructed will be similar to the Lotung model. The instrumentation layout will be designed to provide data for studies of SSI, spatial incoherence, soil stability, foundation uplifting, ground motion wave field and structural response. A consortium consisting of EPRI, Taipower, CRIEPI, TEPCO, CEA, EdF and Framatome has been established to carry out the project. It is envisaged that the Hualien SSI array will be ready to record earthquakes by the middle of 1992. The duration of the recording is scheduled for five years. (author)
Complete genome sequence of Nocardia brasiliensis HUJEG-1.

Science.gov (United States)

Vera-Cabrera, Lucio; Ortiz-Lopez, Rocio; Elizondo-Gonzalez, Ramiro; Perez-Maya, Antonio Ali; Ocampo-Candiani, Jorge

2012-05-01

In Mexico, actinomycetoma is mainly caused by Nocardia brasiliensis, which is a soil inhabitant actinobacterium. Here, we report for the first time the draft genome of a strain isolated from a human case that has largely been found in in vitro and experimental models of actinomycetoma, N. brasiliensis HUJEG-1.
Continuous data assimilation for downscaling large-footprint soil moisture retrievals

KAUST Repository

Altaf, Muhammad

2016-01-01

Soil moisture is a key component of the hydrologic cycle, influencing processes leading to runoff generation, infiltration and groundwater recharge, evaporation and transpiration. Generally, the measurement scale for soil moisture is found to be different from the modeling scales for these processes. Reducing this mismatch between observation and model scales is necessary for improved hydrological modeling. An innovative approach to downscaling coarse resolution soil moisture data by combining continuous data assimilation and physically based modeling is presented. In this approach, we exploit the features of Continuous Data Assimilation (CDA) which was initially designed for general dissipative dynamical systems and later tested numerically on the incompressible Navier-Stokes equation, and the Benard equation. A nudging term, estimated as the misfit between interpolants of the assimilated coarse grid measurements and the fine grid model solution, is added to the model equations to constrain the model's large scale variability by available measurements. Soil moisture fields generated at a fine resolution by a physically-based vadose zone model (HYDRUS) are subjected to data assimilation conditioned upon coarse resolution observations. This enables nudging of the model outputs towards values that honor the coarse resolution dynamics while still being generated at the fine scale. Results show that the approach is feasible to generate fine scale soil moisture fields across large extents, based on coarse scale observations. Application of this approach is likely in generating fine and intermediate resolution soil moisture fields conditioned on the radiometer-based, coarse resolution products from remote sensing satellites.
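The nudging term described in this abstract can be illustrated with a toy problem. The sketch below is not the authors' HYDRUS setup: a 1D periodic diffusion equation stands in for the vadose zone model, and the grid size, diffusivity, nudging strength `nudge_mu` and the linear interpolant are all assumptions chosen for the example. A fine-grid model that starts from a wrong state is pulled toward a "truth" run that is observed only on every fourth grid point.

```python
import numpy as np


def laplacian(u, dx):
    """Second-order periodic finite-difference Laplacian."""
    return (np.roll(u, -1) - 2.0 * u + np.roll(u, 1)) / dx**2


def coarse_interpolant(u, x, step):
    """Sample u on every `step`-th node and interpolate linearly back to the fine grid."""
    return np.interp(x, x[::step], u[::step], period=2.0 * np.pi)


def run(nudge_mu, n=64, step=4, nu=0.01, dt=0.01, n_steps=500):
    """Return the RMS model error after integrating with nudging strength nudge_mu."""
    x = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    dx = x[1] - x[0]
    truth = np.sin(x)      # reference state, observed only on the coarse grid
    model = np.zeros(n)    # fine-grid model started from a wrong initial state
    for _ in range(n_steps):
        # Misfit between interpolants of the model and of the coarse observations.
        misfit = coarse_interpolant(model, x, step) - coarse_interpolant(truth, x, step)
        model = model + dt * (nu * laplacian(model, dx) - nudge_mu * misfit)
        truth = truth + dt * nu * laplacian(truth, dx)
    return float(np.sqrt(np.mean((model - truth) ** 2)))


err_free = run(nudge_mu=0.0)    # no assimilation: the model never sees the data
err_nudged = run(nudge_mu=5.0)  # nudged toward the coarse observations
```

In this toy setting the nudged run ends much closer to the reference state than the free run, even though the observations are available only on the coarse grid.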
Large BRCA1 and BRCA2 genomic rearrangements in Danish high risk breast-ovarian cancer families

DEFF Research Database (Denmark)

Hansen, Thomas v O; Jønson, Lars; Albrechtsen, Anders

2009-01-01

BRCA1 and BRCA2 germ-line mutations predispose to breast and ovarian cancer. Large genomic rearrangements of BRCA1 account for 0-36% of all disease causing mutations in various populations, while large genomic rearrangements in BRCA2 are more rare. We examined 642 East Danish breast and/or ovaria...
Modern Sorters for Soil Segregation on Large Scale Remediation Projects

International Nuclear Information System (INIS)

Shonka, J.J.; Kelley, J.E.; O'Brien, J.M.

2008-01-01

In the mid-1940's, Dr. C. Lapointe developed a Geiger tube based uranium ore scanner and picker to replace hand-cobbing. In the 1990's, a modern version of the Lapointe Picker for soil sorting was developed around the need to clean the Johnston Atoll of plutonium. It worked well with sand, but these systems are ineffective with soil, especially with wet conditions. Additionally, several other constraints limited throughput. Slow moving belts and thin layers of material on the belt coupled with the use of multiple small detectors and small sorting gates make these systems ineffective for high throughput. Soil sorting of clay-bearing soils and building debris requires a new look at both the material handling equipment, and the radiation detection methodology. A new class of Super-Sorters has attained throughput of one hundred times that of the old designs. Higher throughput means shorter schedules which reduce costs substantially. The planning, cost, implementation, and other site considerations for these new Super-Sorters are discussed. Modern soil segregation was developed by Ed Bramlitt of the Defense Nuclear Agency for clean up at Johnston Atoll. The process eventually became the Segmented Gate System (SGS). This system uses an array of small sodium iodide (NaI) detectors, each viewing a small volume (segment), that control a gate. The volume in the gate is approximately one kg. This system works well when the material to be processed is sand; however, when the material is wet and sticky (soils with clays) the system has difficulty moving the material through the gates. Super-Sorters are a new class of machine designed to take advantage of high throughput aggregate processing conveyors, large acquisition volumes, and large NaI detectors using gamma spectroscopy. By using commercially available material handling equipment, the system can attain processing rates of up to 400 metric tons/hr with spectrum acquisition approximately every 0.5 sec, so the acquisition
Targeted sequencing of large genomic regions with CATCH-Seq.

Directory of Open Access Journals (Sweden)

Kenneth Day

Full Text Available Current target enrichment systems for large-scale next-generation sequencing typically require synthetic oligonucleotides used as capture reagents to isolate sequences of interest. The majority of target enrichment reagents are focused on gene coding regions or promoters en masse. Here we introduce development of a customizable targeted capture system using biotinylated RNA probe baits transcribed from sheared bacterial artificial chromosome clone templates that enables capture of large, contiguous blocks of the genome for sequencing applications. This clone adapted template capture hybridization sequencing (CATCH-Seq) procedure can be used to capture both coding and non-coding regions of a gene, and resolve the boundaries of copy number variations within a genomic target site. Furthermore, libraries constructed with methylated adapters prior to solution hybridization also enable targeted bisulfite sequencing. We applied CATCH-Seq to diverse targets ranging in size from 125 kb to 3.5 Mb. Our approach provides a simple and cost effective alternative to other capture platforms because of template-based, enzymatic probe synthesis and the lack of oligonucleotide design costs. Given its similarity in procedure, CATCH-Seq can also be performed in parallel with commercial systems.
Insertion Sequence-Caused Large Scale-Rearrangements in the Genome of Escherichia coli

Science.gov (United States)

2016-07-18

affordable approach to genome-wide characterization of genetic variation in bacterial and eukaryotic genomes (1–3). In addition to small-scale...Paired-End Reads), that uses a graph-based algorithm (27) capable of detecting most large-scale variation involving repetitive regions, including novel...Avila,P., Grinsted,J. and De La Cruz,F. (1988) Analysis of the variable endpoints generated by one-ended transposition of Tn21. J. Bacteriol., 170
Draft Genome Sequence of a Biosurfactant-Producing Bacillus subtilis UMX-103 Isolated from Hydrocarbon-Contaminated Soil in Terengganu, Malaysia.

Science.gov (United States)

Abdelhafiz, Yousri Abdelmutalab; Manaharan, Thamilvaani; BinMohamad, Saharuddin; Merican, Amir Feisal

2017-07-01

Here we present the draft genome sequence of Bacillus subtilis UMX-103. The bacterial strain was isolated from hydrocarbon-contaminated soil from Terengganu, Malaysia. The whole genome of the bacterium was sequenced using the Illumina HiSeq 2000 sequencing platform. The genome was assembled using a de novo approach. The genome size of UMX-103 is 4,234,627 bp with 4399 genes comprising 4301 protein-coding genes and 98 RNA genes. The analysis of assembled genes revealed the presence of 25 genes involved in biosurfactant production, where 14 of the genes are related to biosynthesis and 11 of the genes are involved in the regulation of biosurfactant production. This draft genome will provide insights into the genetic bases of its biosurfactant-producing capabilities.
Complete Genome Sequence of the Facultative Methylotroph Methylobacterium extorquens TK 0001 Isolated from Soil in Poland.

Science.gov (United States)

Belkhelfa, Sophia; Labadie, Karine; Cruaud, Corinne; Aury, Jean-Marc; Roche, David; Bouzon, Madeleine; Salanoubat, Marcel; Döring, Volker

2018-02-22

Methylobacterium extorquens TK 0001 (DSM 1337, ATCC 43645) is an aerobic pink-pigmented facultative methylotrophic alphaproteobacterium isolated from soil in Poland. Here, we report the whole-genome sequence and annotation of this organism, which consists of a single 5.71-Mb chromosome. Copyright © 2018 Belkhelfa et al.
Rapid separation method for ²³⁷Np and Pu isotopes in large soil samples

Energy Technology Data Exchange (ETDEWEB)

Maxwell, Sherrod L., E-mail: sherrod.maxwell@srs.go [Savannah River Nuclear Solutions, LLC, Building 735-B, Aiken, SC 29808 (United States); Culligan, Brian K.; Noyes, Gary W. [Savannah River Nuclear Solutions, LLC, Building 735-B, Aiken, SC 29808 (United States)

2011-07-15

A new rapid method for the determination of ²³⁷Np and Pu isotopes in soil and sediment samples has been developed at the Savannah River Site Environmental Lab (Aiken, SC, USA) that can be used for large soil samples. The new soil method utilizes an acid leaching method, iron/titanium hydroxide precipitation, a lanthanum fluoride soil matrix removal step, and a rapid column separation process with TEVA Resin. The large soil matrix is removed easily and rapidly using these two simple precipitations with high chemical recoveries and effective removal of interferences. Vacuum box technology and rapid flow rates are used to reduce analytical time.
High-efficiency targeted editing of large viral genomes by RNA-guided nucleases.

Science.gov (United States)

Bi, Yanwei; Sun, Le; Gao, Dandan; Ding, Chen; Li, Zhihua; Li, Yadong; Cun, Wei; Li, Qihan

2014-05-01

A facile and efficient method for the precise editing of large viral genomes is required for the selection of attenuated vaccine strains and the construction of gene therapy vectors. The type II prokaryotic CRISPR-Cas (clustered regularly interspaced short palindromic repeats (CRISPR)-associated (Cas)) RNA-guided nuclease system can be introduced into host cells during viral replication. The CRISPR-Cas9 system robustly stimulates targeted double-stranded breaks in the genomes of DNA viruses, where the non-homologous end joining (NHEJ) and homology-directed repair (HDR) pathways can be exploited to introduce site-specific indels or insert heterologous genes with high frequency. Furthermore, CRISPR-Cas9 can specifically inhibit the replication of the original virus, thereby significantly increasing the abundance of the recombinant virus among progeny virus. As a result, purified recombinant virus can be obtained with only a single round of selection. In this study, we used recombinant adenovirus and type I herpes simplex virus as examples to demonstrate that the CRISPR-Cas9 system is a valuable tool for editing the genomes of large DNA viruses.
High-efficiency targeted editing of large viral genomes by RNA-guided nucleases.

Directory of Open Access Journals (Sweden)

Yanwei Bi

2014-05-01

Full Text Available A facile and efficient method for the precise editing of large viral genomes is required for the selection of attenuated vaccine strains and the construction of gene therapy vectors. The type II prokaryotic CRISPR-Cas (clustered regularly interspaced short palindromic repeats (CRISPR)-associated (Cas)) RNA-guided nuclease system can be introduced into host cells during viral replication. The CRISPR-Cas9 system robustly stimulates targeted double-stranded breaks in the genomes of DNA viruses, where the non-homologous end joining (NHEJ) and homology-directed repair (HDR) pathways can be exploited to introduce site-specific indels or insert heterologous genes with high frequency. Furthermore, CRISPR-Cas9 can specifically inhibit the replication of the original virus, thereby significantly increasing the abundance of the recombinant virus among progeny virus. As a result, purified recombinant virus can be obtained with only a single round of selection. In this study, we used recombinant adenovirus and type I herpes simplex virus as examples to demonstrate that the CRISPR-Cas9 system is a valuable tool for editing the genomes of large DNA viruses.
The detection of large deletions or duplications in genomic DNA.

Science.gov (United States)

Armour, J A L; Barton, D E; Cockburn, D J; Taylor, G R

2002-11-01

While methods for the detection of point mutations and small insertions or deletions in genomic DNA are well established, the detection of larger (>100 bp) genomic duplications or deletions can be more difficult. Most mutation scanning methods use PCR as a first step, but the subsequent analyses are usually qualitative rather than quantitative. Gene dosage methods based on PCR need to be quantitative (i.e., they should report molar quantities of starting material) or semi-quantitative (i.e., they should report gene dosage relative to an internal standard). Without some sort of quantitation, heterozygous deletions and duplications may be overlooked and therefore be under-ascertained. Gene dosage methods provide the additional benefit of reporting allele drop-out in the PCR. This could impact on SNP surveys, where large-scale genotyping may miss null alleles. Here we review recent developments in techniques for the detection of this type of mutation and compare their relative strengths and weaknesses. We emphasize that comprehensive mutation analysis should include scanning for large insertions and deletions and duplications. Copyright 2002 Wiley-Liss, Inc.
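The notion of reporting gene dosage relative to an internal standard can be made concrete with a dosage quotient. The sketch below is a generic illustration, not a method taken from the review: the function names, the signal values and the 0.7/1.3 decision thresholds are assumptions for the example. For a diploid locus, a heterozygous deletion is expected to give a quotient near 0.5 and a heterozygous duplication a quotient near 1.5.

```python
def dosage_quotient(test_target, test_ref, control_target, control_ref):
    """Target signal normalised to an internal reference, relative to a control sample."""
    return (test_target / test_ref) / (control_target / control_ref)


def classify(dq, low=0.7, high=1.3):
    """Rough call from a dosage quotient; the thresholds are illustrative assumptions."""
    if dq < low:
        return "possible deletion"
    if dq > high:
        return "possible duplication"
    return "normal dosage"


# Hypothetical peak areas: the test sample shows half the normalised target signal.
dq = dosage_quotient(test_target=500.0, test_ref=1000.0,
                     control_target=1000.0, control_ref=1000.0)
call = classify(dq)
```

Normalising to a reference amplicon in the same reaction is what makes the comparison semi-quantitative: differences in overall template amount cancel out of the ratio.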
Large-scale trends in the evolution of gene structures within 11 animal genomes.

Directory of Open Access Journals (Sweden)

Mark Yandell

2006-03-01

Full Text Available We have used the annotations of six animal genomes (Homo sapiens, Mus musculus, Ciona intestinalis, Drosophila melanogaster, Anopheles gambiae, and Caenorhabditis elegans) together with the sequences of five unannotated Drosophila genomes to survey changes in protein sequence and gene structure over a variety of timescales--from the less than 5 million years since the divergence of D. simulans and D. melanogaster to the more than 500 million years that have elapsed since the Cambrian explosion. To do so, we have developed a new open-source software library called CGL (for "Comparative Genomics Library"). Our results demonstrate that change in intron-exon structure is gradual, clock-like, and largely independent of coding-sequence evolution. This means that genome annotations can be used in new ways to inform, corroborate, and test conclusions drawn from comparative genomics analyses that are based upon protein and nucleotide sequence similarities.
ACTIVE SOIL DEPRESSURIZATION (ASD) DEMONSTRATION IN A LARGE BUILDING

Science.gov (United States)

The report gives results of an evaluation of the feasibility of implementing radon resistant construction techniques -- especially active soil depressurization (ASD) -- in new large buildings in Florida. Indoor radon concentrations and radon entry were monitored in a finished bui...
Genetic Competence Drives Genome Diversity in Bacillus subtilis

Science.gov (United States)

Chevreux, Bastien; Serra, Cláudia R; Schyns, Ghislain; Henriques, Adriano O

2018-01-01

Prokaryote genomes are the result of a dynamic flux of genes, with increases achieved via horizontal gene transfer and reductions occurring through gene loss. The ecological and selective forces that drive this genomic flexibility vary across species. Bacillus subtilis is a naturally competent bacterium that occupies various environments, including plant-associated, soil, and marine niches, and the gut of both invertebrates and vertebrates. Here, we quantify the genomic diversity of B. subtilis and infer the genome dynamics that explain the high genetic and phenotypic diversity observed. Phylogenomic and comparative genomic analyses of 42 B. subtilis genomes uncover a remarkable genome diversity that translates into a core genome of 1,659 genes and an asymptotic pangenome growth rate of 57 new genes per new genome added. This diversity is due to a large proportion of low-frequency genes that are acquired from closely related species. We find no gene-loss bias among wild isolates, which explains why the cloud genome, 43% of the species pangenome, represents only a small proportion of each genome. We show that B. subtilis can acquire xenologous copies of core genes that propagate laterally among strains within a niche. While not excluding the contributions of other mechanisms, our results strongly suggest a process of gene acquisition that is largely driven by competence, where the long-term maintenance of acquired genes depends on local and global fitness effects. This competence-driven genomic diversity provides B. subtilis with its generalist character, enabling it to occupy a wide range of ecological niches and cycle through them. PMID:29272410
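The core genome, the pangenome and the number of new genes contributed per added genome can be illustrated with plain set operations. This is a toy sketch, not the authors' pipeline: the strain names and gene sets are invented, and real analyses work from ortholog clusters over many genomes rather than from gene names.

```python
# Hypothetical gene content per genome, for illustration only.
genomes = {
    "strainA": {"dnaA", "gyrB", "comK", "xynA"},
    "strainB": {"dnaA", "gyrB", "comK", "amyE"},
    "strainC": {"dnaA", "gyrB", "sacB"},
}


def core_genome(gene_sets):
    """Genes present in every genome."""
    return set.intersection(*gene_sets.values())


def pangenome(gene_sets):
    """Genes present in at least one genome."""
    return set.union(*gene_sets.values())


def new_genes_per_genome(gene_sets, order):
    """Count genes not seen before as genomes are added in the given order."""
    seen = set()
    counts = []
    for name in order:
        counts.append(len(gene_sets[name] - seen))
        seen |= gene_sets[name]
    return counts


core = core_genome(genomes)
pan = pangenome(genomes)
growth = new_genes_per_genome(genomes, ["strainA", "strainB", "strainC"])
```

In practice the growth counts depend on the order in which genomes are added, so they are usually averaged over many random orderings before a growth rate is estimated.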
Dynamic Soil-Pile Interaction for large diameter monopile foundations

DEFF Research Database (Denmark)

Zania, Varvara

2013-01-01

of the study is to analyse the dynamic interaction of the soil and a single pile embedded in it by accounting for the geometric and stiffness properties of the pile. In doing so, a semi-analytical approach is adopted based on the fundamental solution of horizontal pile vibration by Novak and Nogami (1977...... eigenfrequencies of the soil layer do not affect the soil-pile interaction. The decrease of the eigenfrequency of the OWT depends on the aforementioned variation of the dynamic stiffness and the slenderness ratio of the monopile.......Monopile foundations have been used to a large extent to support offshore wind turbines (OWT), being considered as a reliable and cost effective design solution. The accurate estimation of their dynamic response characteristics is essential, since the design of support structures for OWTs has been...

Microarray Data Processing Techniques for Genome-Scale Network Inference from Large Public Repositories.

Science.gov (United States)

Chockalingam, Sriram; Aluru, Maneesha; Aluru, Srinivas

2016-09-19

Pre-processing of microarray data is a well-studied problem. Furthermore, all popular platforms come with their own recommended best practices for differential analysis of genes. However, for genome-scale network inference using microarray data collected from large public repositories, these methods filter out a considerable number of genes. This is primarily due to the effects of aggregating a diverse array of experiments with different technical and biological scenarios. Here we introduce a pre-processing pipeline suitable for inferring genome-scale gene networks from large microarray datasets. We show that partitioning of the available microarray datasets according to biological relevance into tissue- and process-specific categories significantly extends the limits of downstream network construction. We demonstrate the effectiveness of our pre-processing pipeline by inferring genome-scale networks for the model plant Arabidopsis thaliana using two different construction methods and a collection of 11,760 Affymetrix ATH1 microarray chips. Our pre-processing pipeline and the datasets used in this paper are made available at http://alurulab.cc.gatech.edu/microarray-pp.
From Genome to Function: Systematic Analysis of the Soil Bacterium Bacillus Subtilis

Science.gov (United States)

Crawshaw, Samuel G.; Wipat, Anil

2001-01-01

Bacillus subtilis is a sporulating Gram-positive bacterium that lives primarily in the soil and associated water sources. Whilst this bacterium has been studied extensively in the laboratory, relatively few studies have been undertaken to study its activity in natural environments. The publication of the B. subtilis genome sequence and subsequent systematic functional analysis programme have provided an opportunity to develop tools for analysing the role and expression of Bacillus genes in situ. In this paper we discuss analytical approaches that are being developed to relate genes to function in environments such as the rhizosphere. PMID:18628943
Low frequency of large genomic rearrangements of BRCA1 and BRCA2 in western Denmark

DEFF Research Database (Denmark)

Thomassen, Mads; Gerdes, Anne-Marie; Cruger, Dorthe

2006-01-01

Germline mutations in BRCA1 and BRCA2 predispose female carriers to breast and ovarian cancer. The majority of mutations identified are small deletions or insertions or are nonsense mutations. Large genomic rearrangements in BRCA1 are found with varying frequencies in different populations......, but BRCA2 rearrangements have not been investigated thoroughly. The objective in this study was to determine the frequency of large genomic rearrangements in BRCA1 and BRCA2 in a large group of Danish families with increased risk of breast and ovarian cancer. A total of 617 families previously tested...... negative for mutations involving few bases were screened with multiplex ligation-dependent probe amplification (MLPA). Two deletions in BRCA1 were identified in three families; no large rearrangements were detected in BRCA2. The large deletions constitute 3.8% of the BRCA1 mutations identified, which...
Soil-Structure Interaction for Non-Slender, Large-Diameter Offshore Monopiles

DEFF Research Database (Denmark)

Sørensen, Søren Peder Hyldal

conducted. The initial part of p-y curves for non-slender piles has been investigated by means of numerical modelling. The general behaviour of eccentrically loaded non-slender piles has been investigated by physical modelling. These tests have been conducted in the pressure tank at Aalborg University....... The monopile foundation concept has been employed as the foundation for the majority of the currently installed offshore wind turbines. Therefore, this PhD thesis concerns the soil-pile interaction for non-slender, large-diameter offshore piles. A combination of numerical and physical modelling has been....... Hence, the application of an overburden pressure is possible. The timescale of the backfill process and the compaction of soil material backfilled around piles in storm conditions have been investigated by means of large-scale physical modelling....
Techniques for Large-Scale Bacterial Genome Manipulation and Characterization of the Mutants with Respect to In Silico Metabolic Reconstructions.

Science.gov (United States)

diCenzo, George C; Finan, Turlough M

2018-01-01

The rate at which all genes within a bacterial genome can be identified far exceeds the ability to characterize these genes. To assist in associating genes with cellular functions, a large-scale bacterial genome deletion approach can be employed to rapidly screen tens to thousands of genes for desired phenotypes. Here, we provide a detailed protocol for the generation of deletions of large segments of bacterial genomes that relies on the activity of a site-specific recombinase. In this procedure, two recombinase recognition target sequences are introduced into known positions of a bacterial genome through single cross-over plasmid integration. Subsequent expression of the site-specific recombinase mediates recombination between the two target sequences, resulting in the excision of the intervening region and its loss from the genome. We further illustrate how this deletion system can be readily adapted to function as a large-scale in vivo cloning procedure, in which the region excised from the genome is captured as a replicative plasmid. We next provide a procedure for the metabolic analysis of bacterial large-scale genome deletion mutants using the Biolog Phenotype MicroArray™ system. Finally, a pipeline is described, and a sample Matlab script is provided, for the integration of the obtained data with a draft metabolic reconstruction for the refinement of the reactions and gene-protein-reaction relationships in a metabolic reconstruction.
Genome Sequence and Analysis of the Soil Cellulolytic Actinomycete Thermobifida fusca

Energy Technology Data Exchange (ETDEWEB)

Lykidis, Athanasios; Mavromatis, Konstantinos; Ivanova, Natalia; Anderson, Iain; Land, Miriam; DiBartolo, Genevieve; Martinez, Michele; Lapidus, Alla; Lucas, Susan; Copeland, Alex; Richardson, Paul; Wilson,David B.; Kyrpides, Nikos

2007-02-01

Thermobifida fusca is a moderately thermophilic soil bacterium that belongs to Actinobacteria. It is a major degrader of plant cell walls and has been used as a model organism for the study of secreted, thermostable cellulases. The complete genome sequence showed that T. fusca has a single circular chromosome of 3642249 bp predicted to encode 3117 proteins and 65 RNA species with a coding density of 85 percent. Genome analysis revealed the existence of 29 putative glycoside hydrolases in addition to the previously identified cellulases and xylanases. The glycosyl hydrolases include enzymes predicted to exhibit mainly dextran/starch and xylan degrading functions. T. fusca possesses two protein secretion systems: the sec general secretion system and the twin-arginine translocation system. Several of the secreted cellulases have sequence signatures indicating their secretion may be mediated by the twin-arginine translocation system. T. fusca has extensive transport systems for import of carbohydrates coupled to transcriptional regulators controlling the expression of the transporters and glycosylhydrolases. In addition to providing an overview of the physiology of a soil actinomycete, this study presents insights on the transcriptional regulation and secretion of cellulases which may facilitate the industrial exploitation of these systems.
Draft Genome Sequence of Bacillus thuringiensis Strain BrMgv02-JM63, a Chitinolytic Bacterium Isolated from Oil-Contaminated Mangrove Soil in Brazil.

Science.gov (United States)

Marcon, Joelma; Taketani, Rodrigo Gouvêa; Dini-Andreote, Francisco; Mazzero, Giulia Inocêncio; Soares, Fabio Lino; Melo, Itamar Soares; Azevedo, João Lúcio; Andreote, Fernando Dini

2014-01-30

Here, we report the draft genome sequence and the automatic annotation of Bacillus thuringiensis strain BrMgv02-JM63. This genome comprises a set of genes involved in the metabolism of chitin and N-acetylglucosamine utilization, thus suggesting the possible role of this strain in the cycling of organic matter in mangrove soils.
Final Report for DOE grant no. DE-FG02-04ER63883: Can soil genomics predict the impact of precipitation on nitrous oxide flux from soil

Energy Technology Data Exchange (ETDEWEB)

Egbert Schwartz

2008-12-15

Nitrous oxide is a potent greenhouse gas that is released by microorganisms in soil. However, the production of nitrous oxide in soil is highly variable and difficult to predict. Future climate change may have large impacts on nitrous oxide release through alteration of precipitation patterns. We analyzed DNA extracted from soil in order to uncover relationships between microbial processes, abundance of particular DNA sequences and net nitrous oxide fluxes from soil. Denitrification, a microbial process in which nitrate is used as an electron acceptor, correlated with nitrous oxide flux from soil. The abundance of ammonia oxidizing archaea correlated positively, but weakly, with nitrous oxide production in soil. The abundance of bacterial genes in soil was negatively correlated with gross nitrogen mineralization rates and nitrous oxide release from soil. We suggest that the most important control over nitrous oxide production in soil is the growth and death of microorganisms. When organisms are growing, nitrogen is incorporated into their biomass and nitrous oxide flux is low. In contrast, when microorganisms die, due to predation or infection by viruses, inorganic nitrogen is released into the soil, resulting in nitrous oxide release. Higher rates of precipitation increase access to microorganisms by predators or viruses through filling large soil pores with water and therefore can lead to large releases of nitrous oxide from soil. We developed a new technique, stable isotope probing with 18O-water, to study growth and mortality of microorganisms in soil.
Radiation hybrid maps of the D-genome of Aegilops tauschii and their application in sequence assembly of large and complex plant genomes.

Science.gov (United States)

Kumar, Ajay; Seetan, Raed; Mergoum, Mohamed; Tiwari, Vijay K; Iqbal, Muhammad J; Wang, Yi; Al-Azzam, Omar; Šimková, Hana; Luo, Ming-Cheng; Dvorak, Jan; Gu, Yong Q; Denton, Anne; Kilian, Andrzej; Lazo, Gerard R; Kianian, Shahryar F

2015-10-16

The large and complex genome of bread wheat (Triticum aestivum L., ~17 Gb) requires high resolution genome maps with saturated marker scaffolds to anchor and orient BAC contigs/ sequence scaffolds for whole genome assembly. Radiation hybrid (RH) mapping has proven to be an excellent tool for the development of such maps for it offers much higher and more uniform marker resolution across the length of the chromosome compared to genetic mapping and does not require marker polymorphism per se, as it is based on presence (retention) vs. absence (deletion) marker assay. In this study, a 178 line RH panel was genotyped with SSRs and DArT markers to develop the first high resolution RH maps of the entire D-genome of Ae. tauschii accession AL8/78. To confirm map order accuracy, the AL8/78-RH maps were compared with:1) a DArT consensus genetic map constructed using more than 100 bi-parental populations, 2) a RH map of the D-genome of reference hexaploid wheat 'Chinese Spring', and 3) two SNP-based genetic maps, one with anchored D-genome BAC contigs and another with anchored D-genome sequence scaffolds. Using marker sequences, the RH maps were also anchored with a BAC contig based physical map and draft sequence of the D-genome of Ae. tauschii. A total of 609 markers were mapped to 503 unique positions on the seven D-genome chromosomes, with a total map length of 14,706.7 cR. The average distance between any two marker loci was 29.2 cR which corresponds to 2.1 cM or 9.8 Mb. The average mapping resolution across the D-genome was estimated to be 0.34 Mb (Mb/cR) or 0.07 cM (cM/cR). The RH maps showed almost perfect agreement with several published maps with regard to chromosome assignments of markers. The mean rank correlations between the position of markers on AL8/78 maps and the four published maps, ranged from 0.75 to 0.92, suggesting a good agreement in marker order. With 609 mapped markers, a total of 2481 deletions for the whole D-genome were detected with an average
Large-scale experience with biological treatment of contaminated soil

International Nuclear Information System (INIS)

Schulz-Berendt, V.; Poetzsch, E.

1995-01-01

The efficiency of biological methods for the cleanup of soil contaminated with total petroleum hydrocarbons (TPH) and polycyclic aromatic hydrocarbons (PAH) was demonstrated by a large-scale example in which 38,000 tons of TPH- and PAH-polluted soil was treated onsite with the TERRAFERM® degradation system to reach the target values of 300 mg/kg TPH and 5 mg/kg PAH. Detection of the ecotoxicological potential (Microtox® assay) showed a significant decrease during the remediation. Low concentrations of PAH in the ground were treated by an in situ technology. The in situ treatment was combined with mechanical measures (slurry wall) to prevent the contamination from dispersing from the site.
Efficient Meshfree Large Deformation Simulation of Rainfall Induced Soil Slope Failure

Science.gov (United States)

Wang, Dongdong; Li, Ling

2010-05-01

An efficient Lagrangian Galerkin meshfree framework is presented for large deformation simulation of rainfall-induced soil slope failure. Detailed coupled soil-rainfall seepage equations are given for the proposed formulation. This nonlinear meshfree formulation features the Lagrangian stabilized conforming nodal integration method, which keeps the low cost of the nodal integration approach while maintaining numerical stability. The initiation and evolution of progressive failure in the soil slope is modeled by the coupled constitutive equations of isotropic damage and Drucker-Prager pressure-dependent plasticity. The gradient smoothing in the stabilized conforming integration also serves as a non-local regularization of material instability, and consequently the present method is capable of effectively capturing the shear band failure. The efficacy of the present method is demonstrated by simulating the rainfall-induced failure of two typical soil slopes.
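The pressure dependence of the Drucker-Prager criterion mentioned above can be illustrated with a minimal sketch. It assumes the common textbook form f = sqrt(J2) + alpha * I1 - k with tension taken as positive; the material constants `alpha` and `k` and all numeric values are hypothetical, and the sketch does not reproduce the coupled damage-plasticity model or the meshfree discretization of the paper.

```python
import math


def stress_invariants(sigma):
    """Return (I1, J2) for a symmetric 3x3 stress tensor given as nested lists."""
    i1 = sigma[0][0] + sigma[1][1] + sigma[2][2]
    mean = i1 / 3.0
    # Deviatoric stress: subtract the mean stress from the diagonal entries.
    s = [[sigma[i][j] - (mean if i == j else 0.0) for j in range(3)]
         for i in range(3)]
    # J2 is half the double contraction of the deviatoric stress with itself.
    j2 = 0.5 * sum(s[i][j] * s[i][j] for i in range(3) for j in range(3))
    return i1, j2


def drucker_prager_yield(sigma, alpha, k):
    """Evaluate f = sqrt(J2) + alpha * I1 - k (tension-positive convention).

    f < 0 means the stress state is elastic; f >= 0 means yielding.
    alpha and k are illustrative material constants, not values from the paper.
    """
    i1, j2 = stress_invariants(sigma)
    return math.sqrt(j2) + alpha * i1 - k


if __name__ == "__main__":
    # Pure shear: no confining pressure, so only the shear stress and k matter.
    shear = [[0.0, 15.0, 0.0], [15.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    # Hydrostatic compression: J2 vanishes and the pressure term lowers f.
    compression = [[-100.0, 0.0, 0.0], [0.0, -100.0, 0.0], [0.0, 0.0, -100.0]]
    print(drucker_prager_yield(shear, alpha=0.2, k=10.0))
    print(drucker_prager_yield(compression, alpha=0.2, k=10.0))
```

Under this sign convention, compressive mean stress makes I1 negative and therefore moves the state away from yield, which is the pressure-dependent behavior the abstract refers to.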
Procedure for plutonium analysis of large (100g) soil and sediment samples

International Nuclear Information System (INIS)

Meadows, J.W.T.; Schweiger, J.S.; Mendoza, B.; Stone, R.

1975-01-01

A method for the complete dissolution of large soil or sediment samples is described. This method is in routine usage at Lawrence Livermore Laboratory for the analysis of fall-out levels of Pu in soils and sediments. Intercomparison with partial dissolution (leach) techniques shows the complete dissolution method to be superior for the determination of plutonium in a wide variety of environmental samples. (author)
The use of soil moisture - remote sensing products for large-scale groundwater modeling and assessment

NARCIS (Netherlands)

Sutanudjaja, E.H.

2012-01-01

In this thesis, the possibilities of using spaceborne remote sensing for large-scale groundwater modeling are explored. We focus on a soil moisture product called European Remote Sensing Soil Water Index (ERS SWI, Wagner et al., 1999) - representing the upper profile soil moisture. As a test-bed, we
Draft Genome Sequence of Pseudomonas sp. Strain In5 Isolated from a Greenlandic Disease Suppressive Soil with Potent Antimicrobial Activity

DEFF Research Database (Denmark)

Hennessy, Rosanna C.; Glaring, Mikkel Andreas; Frydenlund Michelsen, Charlotte

2015-01-01

Pseudomonas sp. In5 is an isolate of disease suppressive soil with potent activity against pathogens. Its antifungal activity has been linked to a gene cluster encoding nonribosomal peptide synthetases producing the peptides nunamycin and nunapeptin. The genome sequence will provide insight into ...
Functional Genome Mining for Metabolites Encoded by Large Gene Clusters through Heterologous Expression of a Whole-Genome Bacterial Artificial Chromosome Library in Streptomyces spp.

Science.gov (United States)

Xu, Min; Wang, Yemin; Zhao, Zhilong; Gao, Guixi; Huang, Sheng-Xiong; Kang, Qianjin; He, Xinyi; Lin, Shuangjun; Pang, Xiuhua; Deng, Zixin

2016-01-01

ABSTRACT Genome sequencing projects in the last decade revealed numerous cryptic biosynthetic pathways for unknown secondary metabolites in microbes, revitalizing drug discovery from microbial metabolites by approaches called genome mining. In this work, we developed a heterologous expression and functional screening approach for genome mining from genomic bacterial artificial chromosome (BAC) libraries in Streptomyces spp. We demonstrate mining from a strain of Streptomyces rochei, which is known to produce streptothricins and borrelidin, by expressing its BAC library in the surrogate host Streptomyces lividans SBT5, and screening for antimicrobial activity. In addition to the successful capture of the streptothricin and borrelidin biosynthetic gene clusters, we discovered two novel linear lipopeptides and their corresponding biosynthetic gene cluster, as well as a novel cryptic gene cluster for an unknown antibiotic from S. rochei. This high-throughput functional genome mining approach can be easily applied to other streptomycetes, and it is very suitable for the large-scale screening of genomic BAC libraries for bioactive natural products and the corresponding biosynthetic pathways. IMPORTANCE Microbial genomes encode numerous cryptic biosynthetic gene clusters for unknown small metabolites with potential biological activities. Several genome mining approaches have been developed to activate and bring these cryptic metabolites to biological tests for future drug discovery. Previous sequence-guided procedures relied on bioinformatic analysis to predict potentially interesting biosynthetic gene clusters. In this study, we describe an efficient approach based on heterologous expression and functional screening of a whole-genome library for the mining of bioactive metabolites from Streptomyces. The usefulness of this function-driven approach was demonstrated by the capture of four large biosynthetic gene clusters for metabolites of various chemical types, including
Emerging Genomic Tools for Legume Breeding: Current Status and Future Prospects

Science.gov (United States)

Pandey, Manish K.; Roorkiwal, Manish; Singh, Vikas K.; Ramalingam, Abirami; Kudapa, Himabindu; Thudi, Mahendar; Chitikineni, Anu; Rathore, Abhishek; Varshney, Rajeev K.

2016-01-01

Legumes play a vital role in ensuring global nutritional food security and improving soil quality through nitrogen fixation. Accelerated genetic gains are required to meet the demand of an ever-increasing global population. In recent years, speedy developments have been witnessed in legume genomics due to advancements in next-generation sequencing (NGS) and high-throughput genotyping technologies. Reference genome sequences for many legume crops have been reported in the last 5 years. The availability of the draft genome sequences and re-sequencing of elite genotypes for several important legume crops have made it possible to identify structural variations at large scale. Availability of large-scale genomic resources and low-cost and high-throughput genotyping technologies are enhancing the efficiency and resolution of genetic mapping and marker-trait association studies. Most importantly, deployment of molecular breeding approaches has resulted in development of improved lines in some legume crops such as chickpea and groundnut. In order to support genomics-driven crop improvement at a fast pace, the deployment of breeder-friendly genomics and decision support tools appears to be critical in breeding programs in developing countries. This review provides an overview of emerging genomics and informatics tools/approaches that will be the key driving force for accelerating genomics-assisted breeding and ultimately ensuring nutritional and food security in developing countries. PMID:27199998
Software engineering the mixed model for genome-wide association studies on large samples

Science.gov (United States)

Mixed models improve the ability to detect phenotype-genotype associations in the presence of population stratification and multiple levels of relatedness in genome-wide association studies (GWAS), but for large data sets the resource consumption becomes impractical. At the same time, the sample siz...
Evaluation of digital soil mapping approaches with large sets of environmental covariates

Science.gov (United States)

Nussbaum, Madlene; Spiess, Kay; Baltensweiler, Andri; Grob, Urs; Keller, Armin; Greiner, Lucie; Schaepman, Michael E.; Papritz, Andreas

2018-01-01

The spatial assessment of soil functions requires maps of basic soil properties. Unfortunately, these are either missing for many regions or are not available at the desired spatial resolution or down to the required soil depth. The field-based generation of large soil datasets and conventional soil maps remains costly. Meanwhile, legacy soil data and comprehensive sets of spatial environmental data are available for many regions. Digital soil mapping (DSM) approaches relating soil data (responses) to environmental data (covariates) face the challenge of building statistical models from large sets of covariates originating, for example, from airborne imaging spectroscopy or multi-scale terrain analysis. We evaluated six approaches for DSM in three study regions in Switzerland (Berne, Greifensee, ZH forest) by mapping the effective soil depth available to plants (SD), pH, soil organic matter (SOM), effective cation exchange capacity (ECEC), clay, silt, gravel content and fine fraction bulk density for four soil depths (totalling 48 responses). Models were built from 300-500 environmental covariates by selecting linear models through (1) grouped lasso and (2) an ad hoc stepwise procedure for robust external-drift kriging (georob). For (3) geoadditive models we selected penalized smoothing spline terms by component-wise gradient boosting (geoGAM). We further used two tree-based methods: (4) boosted regression trees (BRTs) and (5) random forest (RF). Lastly, we computed (6) weighted model averages (MAs) from the predictions obtained from methods 1-5. Lasso, georob and geoGAM successfully selected strongly reduced sets of covariates (subsets of 3-6 % of all covariates). Differences in predictive performance, tested on independent validation data, were mostly small and did not reveal a single best method for 48 responses. Nevertheless, RF was often the best among methods 1-5 (28 of 48 responses), but was outcompeted by MA for 14 of these 28 responses. RF tended to over
Overview of a large-scale bioremediation soil treatment project

International Nuclear Information System (INIS)

Stechmann, R.

1991-01-01

How long does it take to remediate 290,000 yd³ of impacted soil containing an average total petroleum hydrocarbon concentration of 3,000 ppm? Approximately 15 months from start to end of treatment using bioremediation. Mittelhauser was retained by the seller of the property (a major oil company) as technical manager to supervise remediation of a 45-ac parcel in the Los Angeles basin. Mittelhauser completed site characterization, negotiated clean-up levels with the regulatory agencies, and prepared the remedial action plan (RAP) with which the treatment approach was approved and permitted. The RAP outlined the excavation, treatment, and recompaction procedures for the impacted soil resulting from leakage of bunker fuel oil from a large surface impoundment. The impacted soil was treated on site in unlined Land Treatment Units (LTUs) in 18-in.-thick lifts. Due to space restraints, multiple lifts site. The native microbial population was cultivated using soil stabilization mixing equipment with the application of water and agricultural grade fertilizers. Costs on this multimillion dollar project are broken down as follows: general contractor cost (47%), bioremediation subcontractor cost (35%), site characterization (10%), technical management (7%), analytical services (3%), RAP preparation and permitting (1%), and civil engineering subcontractor cost (1%). Start-up of field work could have been severely impacted by the existence of Red Fox habitation. The foxes were successfully relocated prior to start of field work.
The benefits of using remotely sensed soil moisture in parameter identification of large-scale hydrological models

Science.gov (United States)

Wanders, N.; Bierkens, M. F. P.; de Jong, S. M.; de Roo, A.; Karssenberg, D.

2014-08-01

Large-scale hydrological models are nowadays mostly calibrated using observed discharge. As a result, a large part of the hydrological system, in particular the unsaturated zone, remains uncalibrated. Soil moisture observations from satellites have the potential to fill this gap. Here we evaluate the added value of remotely sensed soil moisture in calibration of large-scale hydrological models by addressing two research questions: (1) Which parameters of hydrological models can be identified by calibration with remotely sensed soil moisture? (2) Does calibration with remotely sensed soil moisture lead to an improved calibration of hydrological models compared to calibration based only on discharge observations, such that this leads to improved simulations of soil moisture content and discharge? A dual state and parameter Ensemble Kalman Filter is used to calibrate the hydrological model LISFLOOD for the Upper Danube. Calibration is done using discharge and remotely sensed soil moisture acquired by AMSR-E, SMOS, and ASCAT. Calibration with discharge data improves the estimation of groundwater and routing parameters. Calibration with only remotely sensed soil moisture results in an accurate identification of parameters related to land-surface processes. For the Upper Danube upstream area up to 40,000 km², calibration on both discharge and soil moisture results in a reduction by 10-30% in the RMSE for discharge simulations, compared to calibration on discharge alone. The conclusion is that remotely sensed soil moisture holds potential for calibration of hydrological models, leading to a better simulation of soil moisture content throughout the catchment and a better simulation of discharge in upstream areas. This article was corrected on 15 SEP 2014. See the end of the full text for details.
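How a single observation can update both a model state and a model parameter in an ensemble Kalman filter can be sketched as follows. This is a simplified augmented-state analysis step with perturbed observations, written under stated assumptions: it is not the dual filter configuration or the LISFLOOD setup used in the study, and the function name, variables, and numbers are all hypothetical.

```python
import random


def enkf_update(states, params, obs, obs_var, rng):
    """One stochastic EnKF analysis step for a scalar state and a scalar parameter.

    states  : ensemble of the observed state (e.g. soil moisture), one per member
    params  : ensemble of an unobserved model parameter, one per member
    obs     : the observation of the state
    obs_var : observation error variance (must be positive)
    rng     : random.Random instance used to perturb the observation
    Returns (updated states, updated params, state gain).
    """
    n = len(states)
    mean_x = sum(states) / n
    mean_p = sum(params) / n
    # Sample variance of the state and sample covariance parameter-state.
    var_x = sum((x - mean_x) ** 2 for x in states) / (n - 1)
    cov_px = sum((p - mean_p) * (x - mean_x)
                 for p, x in zip(params, states)) / (n - 1)
    # Kalman gains: the parameter is corrected through its covariance with the state.
    gain_x = var_x / (var_x + obs_var)
    gain_p = cov_px / (var_x + obs_var)
    new_states, new_params = [], []
    for x, p in zip(states, params):
        # Each member assimilates its own perturbed copy of the observation.
        perturbed = obs + rng.gauss(0.0, obs_var ** 0.5)
        innovation = perturbed - x
        new_states.append(x + gain_x * innovation)
        new_params.append(p + gain_p * innovation)
    return new_states, new_params, gain_x


if __name__ == "__main__":
    rng = random.Random(0)
    soil_moisture = [0.20, 0.25, 0.30, 0.35]   # hypothetical forecast ensemble
    parameter = [1.0, 1.5, 2.0, 2.5]           # hypothetical parameter ensemble
    updated = enkf_update(soil_moisture, parameter, obs=0.40, obs_var=0.0025, rng=rng)
    print(updated)
```

A parameter is only identifiable in this scheme when it co-varies with the observed quantity across the ensemble, which is consistent with the finding that soil moisture observations constrain land-surface parameters while discharge constrains groundwater and routing parameters.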

Genomic profiling of plasmablastic lymphoma using array comparative genomic hybridization (aCGH): revealing significant overlapping genomic lesions with diffuse large B-cell lymphoma

Directory of Open Access Journals (Sweden)

Lu Xin-Yan

2009-11-01

Abstract Background: Plasmablastic lymphoma (PL) is a subtype of diffuse large B-cell lymphoma (DLBCL). Studies have suggested that tumors with PL morphology represent a group of neoplasms with clinicopathologic characteristics corresponding to different entities including extramedullary plasmablastic tumors associated with plasma cell myeloma (PCM). The goal of the current study was to evaluate the genetic similarities and differences among PL, DLBCL (AIDS-related and non AIDS-related) and PCM using array-based comparative genomic hybridization. Results: Examination of genomic data in PL revealed that the most frequent segmental gains (> 40%) include: 1p36.11-1p36.33, 1p34.1-1p36.13, 1q21.1-1q23.1, 7q11.2-7q11.23, 11q12-11q13.2 and 22q12.2-22q13.3. This correlated with segmental gains occurring in high frequency in DLBCL (AIDS-related and non AIDS-related) cases. There were some segmental gains and some segmental losses that occurred in PL but not in the other types of lymphoma, suggesting that these foci may contain genes responsible for the differentiation of this lymphoma. Additionally, some segmental gains and some segmental losses occurred only in PL and AIDS-associated DLBCL, suggesting that these foci may be associated with HIV infection. Furthermore, some segmental gains and some segmental losses occurred only in PL and PCM, suggesting that these lesions may be related to plasmacytic differentiation. Conclusion: To the best of our knowledge, the current study represents the first genomic exploration of PL. The genomic aberration pattern of PL appears to be more similar to that of DLBCL (AIDS-related or non AIDS-related) than to PCM. Our findings suggest that PL may remain best classified as a subtype of DLBCL at least at the genome level.
Soil respiration in Tibetan alpine grasslands: belowground biomass and soil moisture, but not soil temperature, best explain the large-scale patterns.

Directory of Open Access Journals (Sweden)

Yan Geng

The Tibetan Plateau is an essential area to study the potential feedback effects of soils to climate change due to the rapid rise in its air temperature in the past several decades and the large amounts of soil organic carbon (SOC) stocks, particularly in the permafrost. Yet it is one of the most under-investigated regions in soil respiration (Rs) studies. Here, Rs rates were measured at 42 sites in alpine grasslands (including alpine steppes and meadows) along a transect across the Tibetan Plateau during the peak growing season of 2006 and 2007 in order to test whether: (1) belowground biomass (BGB) is most closely related to spatial variation in Rs due to high root biomass density, and (2) soil temperature significantly influences spatial pattern of Rs owing to metabolic limitation from the low temperature in cold, high-altitude ecosystems. The average daily mean Rs of the alpine grasslands at peak growing season was 3.92 µmol CO₂ m⁻² s⁻¹, ranging from 0.39 to 12.88 µmol CO₂ m⁻² s⁻¹, with average daily mean Rs of 2.01 and 5.49 µmol CO₂ m⁻² s⁻¹ for steppes and meadows, respectively. By regression tree analysis, BGB, aboveground biomass (AGB), SOC, soil moisture (SM), and vegetation type were selected out of 15 variables examined as the factors influencing large-scale variation in Rs. With a structural equation modelling approach, we found only BGB and SM had direct effects on Rs, while other factors indirectly affected Rs through BGB or SM. Most (80%) of the variation in Rs could be attributed to the difference in BGB among sites. BGB and SM together accounted for the majority (82%) of spatial patterns of Rs. Our results only support the first hypothesis, suggesting that models incorporating BGB and SM can improve Rs estimation at regional scale.
The large-scale blast score ratio (LS-BSR) pipeline: a method to rapidly compare genetic content between bacterial genomes

Directory of Open Access Journals (Sweden)

Jason W. Sahl

2014-04-01

Background. As whole genome sequence data from bacterial isolates becomes cheaper to generate, computational methods are needed to correlate sequence data with biological observations. Here we present the large-scale BLAST score ratio (LS-BSR) pipeline, which rapidly compares the genetic content of hundreds to thousands of bacterial genomes, and returns a matrix that describes the relatedness of all coding sequences (CDSs) in all genomes surveyed. This matrix can be easily parsed in order to identify genetic relationships between bacterial genomes. Although pipelines have been published that group peptides by sequence similarity, no other software performs the rapid, large-scale, full-genome comparative analyses carried out by LS-BSR. Results. To demonstrate the utility of the method, the LS-BSR pipeline was tested on 96 Escherichia coli and Shigella genomes; the pipeline ran in 163 min using 16 processors, which is a greater than 7-fold speedup compared to using a single processor. The BSR values for each CDS, which indicate a relative level of relatedness, were then mapped to each genome on an independent core genome single nucleotide polymorphism (SNP) based phylogeny. Comparisons were then used to identify clade specific CDS markers and validate the LS-BSR pipeline based on molecular markers that delineate between classical E. coli pathogenic variant (pathovar) designations. Scalability tests demonstrated that the LS-BSR pipeline can process 1,000 E. coli genomes in 27–57 h, depending upon the alignment method, using 16 processors. Conclusions. LS-BSR is an open-source, parallel implementation of the BSR algorithm, enabling rapid comparison of the genetic content of large numbers of genomes. The results of the pipeline can be used to identify specific markers between user-defined phylogenetic groups, and to identify the loss and/or acquisition of genetic information between bacterial isolates. Taxa-specific genetic markers can then be translated
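The kind of matrix described in this record can be illustrated with a minimal sketch. It assumes the usual definition of the BLAST score ratio: the bit score of a CDS aligned against a genome, divided by the bit score of that CDS aligned against itself. It is not the LS-BSR implementation; the function names, input layout, and scores are hypothetical, and real bit scores would come from an aligner's output.

```python
def blast_score_ratio(query_bits, self_bits):
    """Return the BSR: query bit score divided by the self-alignment bit score."""
    if self_bits <= 0:
        raise ValueError("self-alignment bit score must be positive")
    # A missing or negative hit counts as no similarity.
    return max(query_bits, 0.0) / self_bits


def bsr_matrix(self_scores, genome_scores):
    """Build a CDS-by-genome matrix of BSR values.

    self_scores   : {cds_id: bit score of the CDS aligned against itself}
    genome_scores : {genome_id: {cds_id: best bit score of the CDS in that genome}}
    A CDS with no hit in a genome gets a BSR of 0.0.
    """
    matrix = {}
    for cds, self_bits in self_scores.items():
        matrix[cds] = {
            genome: blast_score_ratio(hits.get(cds, 0.0), self_bits)
            for genome, hits in genome_scores.items()
        }
    return matrix


if __name__ == "__main__":
    # Hypothetical bit scores for two CDSs across two genomes.
    self_scores = {"cdsA": 200.0, "cdsB": 100.0}
    genome_scores = {
        "genome1": {"cdsA": 200.0, "cdsB": 40.0},
        "genome2": {"cdsA": 100.0},
    }
    for cds, row in bsr_matrix(self_scores, genome_scores).items():
        print(cds, row)
```

Values near 1 indicate a CDS that is conserved in a genome, and values near 0 indicate absence or strong divergence, so clade-specific markers show up as rows that are high in one group of genomes and low in the rest.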
The Dunaliella salina organelle genomes: large sequences, inflated with intronic and intergenic DNA

Energy Technology Data Exchange (ETDEWEB)

Smith, David R.; Lee, Robert W.; Cushman, John C.; Magnuson, Jon K.; Tran, Duc; Polle, Juergen E.

2010-05-07

Abstract Background: Dunaliella salina Teodoresco, a unicellular, halophilic green alga belonging to the Chlorophyceae, is among the most industrially important microalgae. This is because D. salina can produce massive amounts of β-carotene, which can be collected for commercial purposes, and because of its potential as a feedstock for biofuels production. Although the biochemistry and physiology of D. salina have been studied in great detail, virtually nothing is known about the genomes it carries, especially those within its mitochondrion and plastid. This study presents the complete mitochondrial and plastid genome sequences of D. salina and compares them with those of the model green algae Chlamydomonas reinhardtii and Volvox carteri. Results: The D. salina organelle genomes are large, circular-mapping molecules with ~60% noncoding DNA, placing them among the most inflated organelle DNAs sampled from the Chlorophyta. In fact, the D. salina plastid genome, at 269 kb, is the largest complete plastid DNA (ptDNA) sequence currently deposited in GenBank, and both the mitochondrial and plastid genomes have unprecedentedly high intron densities for organelle DNA: ~1.5 and ~0.4 introns per gene, respectively. Moreover, what appear to be the relics of genes, introns, and intronic open reading frames are found scattered throughout the intergenic ptDNA regions -- a trait without parallel in other characterized organelle genomes and one that gives insight into the mechanisms and modes of expansion of the D. salina ptDNA. Conclusions: These findings confirm the notion that chlamydomonadalean algae have some of the most extreme organelle genomes of all eukaryotes. They also suggest that the events giving rise to the expanded ptDNA architecture of D. salina and other Chlamydomonadales may have occurred early in the evolution of this lineage. Although interesting from a genome evolution standpoint, the D. salina organelle DNA sequences will aid in the development of a viable
The Dunaliella salina organelle genomes: large sequences, inflated with intronic and intergenic DNA

Directory of Open Access Journals (Sweden)

Tran Duc

2010-05-01

Abstract Background: Dunaliella salina Teodoresco, a unicellular, halophilic green alga belonging to the Chlorophyceae, is among the most industrially important microalgae. This is because D. salina can produce massive amounts of β-carotene, which can be collected for commercial purposes, and because of its potential as a feedstock for biofuels production. Although the biochemistry and physiology of D. salina have been studied in great detail, virtually nothing is known about the genomes it carries, especially those within its mitochondrion and plastid. This study presents the complete mitochondrial and plastid genome sequences of D. salina and compares them with those of the model green algae Chlamydomonas reinhardtii and Volvox carteri. Results: The D. salina organelle genomes are large, circular-mapping molecules with ~60% noncoding DNA, placing them among the most inflated organelle DNAs sampled from the Chlorophyta. In fact, the D. salina plastid genome, at 269 kb, is the largest complete plastid DNA (ptDNA) sequence currently deposited in GenBank, and both the mitochondrial and plastid genomes have unprecedentedly high intron densities for organelle DNA: ~1.5 and ~0.4 introns per gene, respectively. Moreover, what appear to be the relics of genes, introns, and intronic open reading frames are found scattered throughout the intergenic ptDNA regions -- a trait without parallel in other characterized organelle genomes and one that gives insight into the mechanisms and modes of expansion of the D. salina ptDNA. Conclusions: These findings confirm the notion that chlamydomonadalean algae have some of the most extreme organelle genomes of all eukaryotes. They also suggest that the events giving rise to the expanded ptDNA architecture of D. salina and other Chlamydomonadales may have occurred early in the evolution of this lineage. Although interesting from a genome evolution standpoint, the D. salina organelle DNA sequences will aid in the
An alternative method for determining particle-size distribution of forest road aggregate and soil with large-sized particles

Science.gov (United States)

Hakjun Rhee; Randy B. Foltz; James L. Fridley; Finn Krogstad; Deborah S. Page-Dumroese

2014-01-01

Measurement of particle-size distribution (PSD) of soil with large-sized particles (e.g., 25.4 mm diameter) requires a large sample and numerous particle-size analyses (PSAs). A new method is needed that would reduce time, effort, and cost for PSAs of the soil and aggregate material with large-sized particles. We evaluated a nested method for sampling and PSA by...
High-Affinity Methanotrophy Informed by Genome-Wide Analysis of Upland Soil Cluster Alpha (USCα) from Axel Heiberg Island, Canadian High Arctic

Science.gov (United States)

Rusley, C.; Onstott, T. C.; Lau, M.

2017-12-01

Methane (CH4) is a potent greenhouse gas whose proper budgeting is vital to climate predictions. Recent studies have identified upland Arctic mineral cryosols as consistent CH4 sinks, drawing CH4 from both the atmosphere and underlying anaerobic soil layers. Global atmospheric CH4 uptake is proposed to be mediated by high-affinity methanotrophs based on the detection of the marker gene pmoA (particulate methane monooxygenase beta subunit). However, a lack of pure cultures and scarcity of genomic information have hindered our understanding of their metabolic capabilities and versatility. Together with the missing genetic linkage between its pmoA and 16S ribosomal RNA (rRNA) gene, the factors that control the distribution and magnitude of high-affinity methanotrophy in the Arctic permafrost-affected region have remained elusive. Using 21 metagenomic datasets of surface soils obtained from long-term core incubation experiments, this bioinformatics study aimed to reconstruct the draft genome of the Upland Soil Cluster α-proteobacteria (USCα), the high-affinity methanotroph previously detected in the samples, and to determine its phylogeny and metabolic requirements. We obtained a genome bin containing the high-affinity form of the USCα-like pmoA gene. The 3.03 Mbp assembly is 91.6% complete with a unique set of single-copy marker genes. The 16S rRNA gene fragment of USCα belongs to the α-proteobacterial family Beijerinckiaceae. Genome annotation indicates possible formaldehyde oxidation via tetrahydromethanopterin-linked C1 transfer pathways, acetate utilization, carbon fixation via the Calvin-Benson-Bassham cycle, and glycogen production. Notably, the key enzymes for formaldehyde assimilation via the serine and ribulose monophosphate pathways are missing. The presence of genes encoding nitrate reductase and hemoglobin suggests adaptation to low O2 under water-logged conditions. Since USCα has versatile carbon metabolisms, it may not be an obligate methanotroph
Experience from large scale use of the EuroGenomics custom SNP chip in cattle

DEFF Research Database (Denmark)

Boichard, Didier A; Boussaha, Mekki; Capitan, Aurélien

2018-01-01

This article presents the strategy to evaluate candidate mutations underlying QTL or responsible for genetic defects, based upon the design and large-scale use of the EuroGenomics custom SNP chip set up for bovine genomic selection. Some variants under study originated from mapping genetic defect...
Comparative genomic hybridizations reveal absence of large Streptomyces coelicolor genomic islands in Streptomyces lividans

OpenAIRE

Jayapal, Karthik P; Lian, Wei; Glod, Frank; Sherman, David H; Hu, Wei-Shou

2007-01-01

Abstract Background The genomes of Streptomyces coelicolor and Streptomyces lividans bear a considerable degree of synteny. While S. coelicolor is the model streptomycete for studying antibiotic synthesis and differentiation, S. lividans is almost exclusively considered as the preferred host, among actinomycetes, for cloning and expression of exogenous DNA. We used whole genome microarrays as a comparative genomics tool for identifying the subtle differences between these two chromosomes. Res...
Genome mining of Streptomyces scabrisporus NF3 reveals symbiotic features including genes related to plant interactions

Science.gov (United States)

Rodríguez-Luna, Stefany Daniela; Cruz Vázquez, Angélica Patricia; Jiménez Suárez, Verónica; Rodríguez-Sanoja, Romina; Alvarez-Buylla, Elena R.; Sánchez, Sergio

2018-01-01

Endophytic bacteria are widespread and associated with plant physiological benefits, yet their genomes and secondary metabolites remain largely unidentified. In this study, we explored the genome of the endophyte Streptomyces scabrisporus NF3 for discovery of potential novel molecules as well as genes and metabolites involved in host interactions. The complete genomes of seven Streptomyces and three other more distantly related bacteria were used to define the functional landscape of this unique microbe. The S. scabrisporus NF3 genome is larger than the average Streptomyces genome and not structured for an obligate endosymbiotic lifestyle; this and the fact that it can grow in R2YE media implies that it could include a soil-living stage. The genome displays an enrichment of genes associated with amino acid production, protein secretion, secondary metabolite and antioxidant production and xenobiotic degradation, indicating that S. scabrisporus NF3 could contribute to the metabolic enrichment of soil microbial communities and of its hosts. Importantly, besides its metabolic advantages, the genome showed evidence for differential functional specificity and diversification of plant interaction molecules, including genes for the production of plant hormones, stress resistance molecules, chitinases, antibiotics and siderophores. Given the diversity of S. scabrisporus mechanisms for host upkeep, we propose that these strategies were necessary for its adaptation to plant hosts and to face changes in environmental conditions. PMID:29447216
Genome mining of Streptomyces scabrisporus NF3 reveals symbiotic features including genes related to plant interactions.

Directory of Open Access Journals (Sweden)

Corina Diana Ceapă

Endophytic bacteria are widespread and associated with plant physiological benefits, yet their genomes and secondary metabolites remain largely unidentified. In this study, we explored the genome of the endophyte Streptomyces scabrisporus NF3 for discovery of potential novel molecules as well as genes and metabolites involved in host interactions. The complete genomes of seven Streptomyces and three other more distantly related bacteria were used to define the functional landscape of this unique microbe. The S. scabrisporus NF3 genome is larger than the average Streptomyces genome and not structured for an obligate endosymbiotic lifestyle; this and the fact that it can grow in R2YE media implies that it could include a soil-living stage. The genome displays an enrichment of genes associated with amino acid production, protein secretion, secondary metabolite and antioxidant production and xenobiotic degradation, indicating that S. scabrisporus NF3 could contribute to the metabolic enrichment of soil microbial communities and of its hosts. Importantly, besides its metabolic advantages, the genome showed evidence for differential functional specificity and diversification of plant interaction molecules, including genes for the production of plant hormones, stress resistance molecules, chitinases, antibiotics and siderophores. Given the diversity of S. scabrisporus mechanisms for host upkeep, we propose that these strategies were necessary for its adaptation to plant hosts and to face changes in environmental conditions.
Genomic insights into the Acidobacteria reveal strategies for their success in terrestrial environments

Science.gov (United States)

Trojan, Daniela; Roux, Simon; Herbold, Craig; Rattei, Thomas; Woebken, Dagmar

2018-01-01

Members of the phylum Acidobacteria are abundant and ubiquitous across soils. We performed a large-scale comparative genome analysis spanning subdivisions 1, 3, 4, 6, 8 and 23 (n = 24) with the goal of identifying features that help explain their prevalence in soils and of understanding their ecophysiology. Our analysis revealed that bacteriophage integration events along with transposable and mobile elements influenced the structure and plasticity of these genomes. Low- and high-affinity respiratory oxygen reductases were detected in multiple genomes, suggesting the capacity for growing across different oxygen gradients. Among many genomes, the capacity to use a diverse collection of carbohydrates, as well as inorganic and organic nitrogen sources (such as via extracellular peptidases), was detected, both advantageous traits in environments with fluctuating nutrient conditions. We also identified multiple soil acidobacteria with the potential to scavenge atmospheric concentrations of H2, now encompassing mesophilic soil strains within subdivisions 1 and 3, in addition to a previously identified thermophilic strain in subdivision 4. This large-scale acidobacteria genome analysis reveals traits that provide genomic, physiological and metabolic versatility, presumably allowing flexibility in the challenging and fluctuating soil environment. PMID:29327410
Specific single-cell isolation and genomic amplification of uncultured microorganisms

DEFF Research Database (Denmark)

Kvist, Thomas; Ahring, Birgitte Kiær; Lasken, R.S.

2007-01-01

In this study, we describe a new method for genomic studies of individual uncultured prokaryotic organisms, which was used for the isolation and partial genome sequencing of a soil archaeon. The diversity of Archaea in a soil sample was mapped by generating a clone library using group-specific primers in combination with a terminal restriction fragment length polymorphism profile. Intact cells were extracted from the environmental sample, and fluorescent in situ hybridization probing with Cy3-labeled probes designed from the clone library was subsequently used to detect the organisms of interest. Single cells with a bright fluorescent signal were isolated using a micromanipulator and the genome of the single isolated cells served as a template for multiple displacement amplification (MDA) using the Phi29 DNA polymerase. The generated MDA product was afterwards used for 16S rRNA gene...
Draft Genome Sequence of Hymenobacter sp. Strain AT01-02, Isolated from a Surface Soil Sample in the Atacama Desert, Chile

DEFF Research Database (Denmark)

Hansen, Anders Cai Holm; Paulino-Lima, Ivan Glaucio; Fujishima, Kosuke

2016-01-01

Here, we report the 5.09-Mb draft genome sequence of Hymenobacter sp. strain AT01-02, which was isolated from a surface soil sample in the Atacama Desert, Chile. The isolate is extremely resistant to UV-C radiation and is able to accumulate high intracellular levels of Mn/Fe....
Initial characterization of the large genome of the salamander Ambystoma mexicanum using shotgun and laser capture chromosome sequencing.

Science.gov (United States)

Keinath, Melissa C; Timoshevskiy, Vladimir A; Timoshevskaya, Nataliya Y; Tsonis, Panagiotis A; Voss, S Randal; Smith, Jeramiah J

2015-11-10

Vertebrates exhibit substantial diversity in genome size, and some of the largest genomes exist in species that uniquely inform diverse areas of basic and biomedical research. For example, the salamander Ambystoma mexicanum (the Mexican axolotl) is a model organism for studies of regeneration, development and genome evolution, yet its genome is ~10× larger than the human genome. As part of a hierarchical approach toward improving genome resources for the species, we generated 600 Gb of shotgun sequence data and developed methods for sequencing individual laser-captured chromosomes. Based on these data, we estimate that the A. mexicanum genome is ~32 Gb. Notably, as much as 19 Gb of the A. mexicanum genome can potentially be considered single copy, which presumably reflects the evolutionary diversification of mobile elements that accumulated during an ancient episode of genome expansion. Chromosome-targeted sequencing permitted the development of assemblies within the constraints of modern computational platforms, allowed us to place 2062 genes on the two smallest A. mexicanum chromosomes and resolved key events in the history of vertebrate genome evolution. Our analyses show that the capture and sequencing of individual chromosomes is likely to provide valuable information for the systematic sequencing, assembly and scaffolding of large genomes.
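As a rough check on the scale implied by these figures, the sketch below derives the average shotgun depth and the single-copy fraction from the numbers quoted in the abstract (600 Gb of sequence, a ~32 Gb genome, up to 19 Gb single copy). Neither derived value is stated in the abstract; both are back-of-the-envelope arithmetic, and the variable names are illustrative.

```python
# Figures quoted in the abstract (Gb = gigabases).
shotgun_gb = 600.0       # total shotgun sequence generated
genome_gb = 32.0         # estimated A. mexicanum genome size
single_copy_gb = 19.0    # upper bound on sequence considered single copy

# Derived values (our arithmetic, not reported in the abstract).
mean_depth = shotgun_gb / genome_gb
single_copy_fraction = single_copy_gb / genome_gb

print(f"mean shotgun depth ~ {mean_depth:.2f}x")
print(f"single-copy fraction ~ {single_copy_fraction:.1%}")
```

Under these assumptions the shotgun data correspond to roughly 19-fold average depth, and just under 60% of the genome could be single copy.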
Genome-wide association study identifies multiple susceptibility loci for diffuse large B cell lymphoma

NARCIS (Netherlands)

Cerhan, James R.; Berndt, Sonja I.; Vijai, Joseph; Ghesquières, Hervé; McKay, James; Wang, Sophia S.; Wang, Zhaoming; Yeager, Meredith; Conde, Lucia; De Bakker, Paul I W; Nieters, Alexandra; Cox, David; Burdett, Laurie; Monnereau, Alain; Flowers, Christopher R.; De Roos, Anneclaire J.; Brooks-Wilson, Angela R.; Lan, Qing; Severi, Gianluca; Melbye, Mads; Gu, Jian; Jackson, Rebecca D.; Kane, Eleanor; Teras, Lauren R.; Purdue, Mark P.; Vajdic, Claire M.; Spinelli, John J.; Giles, Graham G.; Albanes, Demetrius; Kelly, Rachel S.; Zucca, Mariagrazia; Bertrand, Kimberly A.; Zeleniuch-Jacquotte, Anne; Lawrence, Charles; Hutchinson, Amy; Zhi, Degui; Habermann, Thomas M.; Link, Brian K.; Novak, Anne J.; Dogan, Ahmet; Asmann, Yan W.; Liebow, Mark; Thompson, Carrie A.; Ansell, Stephen M.; Witzig, Thomas E.; Weiner, George J.; Veron, Amelie S.; Zelenika, Diana; Tilly, Hervé; Haioun, Corinne; Molina, Thierry Jo; Hjalgrim, Henrik; Glimelius, Bengt; Adami, Hans Olov; Bracci, Paige M.; Riby, Jacques; Smith, Martyn T.; Holly, Elizabeth A.; Cozen, Wendy; Hartge, Patricia; Morton, Lindsay M.; Severson, Richard K.; Tinker, Lesley F.; North, Kari E.; Becker, Nikolaus; Benavente, Yolanda; Boffetta, Paolo; Brennan, Paul; Foretova, Lenka; Maynadie, Marc; Staines, Anthony; Lightfoot, Tracy; Crouch, Simon; Smith, Alex; Roman, Eve; Diver, W. Ryan; Offit, Kenneth; Zelenetz, Andrew; Klein, Robert J.; Villano, Danylo J.; Zheng, Tongzhang; Zhang, Yawei; Holford, Theodore R.; Kricker, Anne; Turner, Jenny; Southey, Melissa C.; Clavel, Jacqueline; Virtamo, Jarmo; Weinstein, Stephanie; Riboli, Elio; Vineis, Paolo; Kaaks, Rudolph; Trichopoulos, Dimitrios; Vermeulen, Roel C H; Boeing, Heiner; Tjonneland, Anne; Angelucci, Emanuele; Di Lollo, Simonetta; Rais, Marco; Birmann, Brenda M.; Laden, Francine; Giovannucci, Edward; Kraft, Peter; Huang, Jinyan; Ma, Baoshan; Ye, Yuanqing; Chiu, Brian C H; Sampson, Joshua; Liang, Liming; Park, Ju Hyun; Chung, Charles C.; Weisenburger, Dennis D.; Chatterjee, Nilanjan; Fraumeni, Joseph F.; Slager, Susan L.; Wu, Xifeng; De Sanjose, Silvia; Smedby, Karin E.; Salles, Gilles; Skibola, Christine F.; Rothman, Nathaniel; Chanock, Stephen J.

2014-01-01

Diffuse large B cell lymphoma (DLBCL) is the most common lymphoma subtype and is clinically aggressive. To identify genetic susceptibility loci for DLBCL, we conducted a meta-analysis of 3 new genome-wide association studies (GWAS) and 1 previous scan, totaling 3,857 cases and 7,666 controls of
Soil Characterization by Large Scale Sampling of Soil Mixed with Buried Construction Debris at a Former Uranium Fuel Fabrication Facility

International Nuclear Information System (INIS)

Nardi, A.J.; Lamantia, L.

2009-01-01

Recent soil excavation activities on a site identified the presence of buried uranium contaminated building construction debris. The site previously was the location of a low enriched uranium fuel fabrication facility. This resulted in the collection of excavated materials from the two locations where contaminated subsurface debris was identified. The excavated material was temporarily stored in two piles on the site until a determination could be made as to the appropriate disposition of the material. Characterization of the excavated material was undertaken in a manner that involved the collection of large scale samples of the excavated material in 1 cubic meter Super Sacks. Twenty bags were filled with excavated material that consisted of the mixture of both the construction debris and the associated soil. In order to obtain information on the level of activity associated with the construction debris, ten additional bags were filled with construction debris that had been separated, to the extent possible, from the associated soil. Radiological surveys were conducted of the resulting bags of collected materials and the soil associated with the waste mixture. The 30 large samples, collected as bags, were counted using an In-Situ Object Counting System (ISOCS) unit to determine the average concentration of U-235 present in each bag. The soil fraction was sampled by the collection of 40 samples of soil for analysis in an on-site laboratory. A fraction of these samples were also sent to an off-site laboratory for additional analysis. This project provided the necessary soil characterization information to allow consideration of alternate options for disposition of the material. The identified contaminant was verified to be low enriched uranium. Concentrations of uranium in the waste were found to be lower than the calculated site specific derived concentration guideline levels (DCGLs) but higher than the NRC's screening values. The methods and results are presented
Genomics Portals: integrative web-platform for mining genomics data.

Science.gov (United States)

Shinde, Kaustubh; Phatak, Mukta; Johannes, Freudenberg M; Chen, Jing; Li, Qian; Vineet, Joshi K; Hu, Zhen; Ghosh, Krishnendu; Meller, Jaroslaw; Medvedovic, Mario

2010-01-13

A large amount of experimental data generated by modern high-throughput technologies is available through various public repositories. Our knowledge about molecular interaction networks, functional biological pathways and transcriptional regulatory modules is rapidly expanding, and is being organized in lists of functionally related genes. Jointly, these two sources of information hold a tremendous potential for gaining new insights into functioning of living systems. Genomics Portals platform integrates access to an extensive knowledge base and a large database of human, mouse, and rat genomics data with basic analytical visualization tools. It provides the context for analyzing and interpreting new experimental data and the tool for effective mining of a large number of publicly available genomics datasets stored in the back-end databases. The uniqueness of this platform lies in the volume and the diversity of genomics data that can be accessed and analyzed (gene expression, ChIP-chip, ChIP-seq, epigenomics, computationally predicted binding sites, etc), and the integration with an extensive knowledge base that can be used in such analysis. The integrated access to primary genomics data, functional knowledge and analytical tools makes Genomics Portals platform a unique tool for interpreting results of new genomics experiments and for mining the vast amount of data stored in the Genomics Portals backend databases. Genomics Portals can be accessed and used freely at http://GenomicsPortals.org.
Viral impacts on microbial carbon cycling in thawing permafrost soils

Science.gov (United States)

Trubl, G. G.; Roux, S.; Bolduc, B.; Jang, H. B.; Emerson, J. B.; Solonenko, N.; Li, F.; Solden, L. M.; Vik, D. R.; Wrighton, K. C.; Saleska, S. R.; Sullivan, M. B.; Rich, V. I.

2017-12-01

Permafrost contains 30-50% of global soil carbon (C) and is rapidly thawing. While the fate of this C is unknown, it will be shaped in part by microbes and their associated viruses, which modulate host activities via mortality and metabolic control. To date, viral research in soils has been outpaced by that in aquatic environments, due to the technical challenges of accessing viruses as well as the dramatic physicochemical heterogeneity in soils. Here, we describe advances in soil viromics from our research on permafrost-associated soils, and their implications for associated terrestrial C cycling. First, we optimized viral resuspension-DNA extraction methods for a range of soil types. Second, we applied cutting-edge viral-specific informatics methods to recover viral populations, define their gene content, connect them to potential hosts, and analyze their relationships to environmental parameters. A total of 781 viral populations were recovered from size-fractionated virus samples of three soils along a permafrost thaw gradient. Ecological analyses revealed endemism as recovered viral populations were largely unique to each habitat and unlike those in aquatic communities. Genome- and network-based classification assigned these viruses into 226 viral clusters (VCs; genus-level taxonomy), 55% of which were novel. This increases the number of VCs by a third and triples the number of soil viral populations in the RefSeq database (currently contains 256 VCs and 316 soil viral populations). Genomic analyses revealed 85% of the genes were functionally unknown, though 5% of the annotatable genes contained C-related auxiliary metabolic genes (AMGs; e.g. glycoside hydrolases). Using sequence-based features and microbial population genomes, we were able to in silico predict hosts for 30% of the viral populations. The identified hosts spanned 3 phyla and 6 genera but suggested these viruses have species-specific host ranges as >80% of hosts for a given virus were in the same
PGen: large-scale genomic variations analysis workflow and browser in SoyKB.

Science.gov (United States)

Liu, Yang; Khan, Saad M; Wang, Juexin; Rynge, Mats; Zhang, Yuanxun; Zeng, Shuai; Chen, Shiyuan; Maldonado Dos Santos, Joao V; Valliyodan, Babu; Calyam, Prasad P; Merchant, Nirav; Nguyen, Henry T; Xu, Dong; Joshi, Trupti

2016-10-06

With the advances in next-generation sequencing (NGS) technology and significant reductions in sequencing costs, it is now possible to sequence large collections of germplasm in crops for detecting genome-scale genetic variations and to apply the knowledge towards improvements in traits. To efficiently facilitate large-scale NGS resequencing data analysis of genomic variations, we have developed "PGen", an integrated and optimized workflow using the Extreme Science and Engineering Discovery Environment (XSEDE) high-performance computing (HPC) virtual system, iPlant cloud data storage resources and Pegasus workflow management system (Pegasus-WMS). The workflow allows users to identify single nucleotide polymorphisms (SNPs) and insertion-deletions (indels), perform SNP annotations and conduct copy number variation analyses on multiple resequencing datasets in a user-friendly and seamless way. We have developed both a Linux version in GitHub ( https://github.com/pegasus-isi/PGen-GenomicVariations-Workflow ) and a web-based implementation of the PGen workflow integrated within the Soybean Knowledge Base (SoyKB), ( http://soykb.org/Pegasus/index.php ). Using PGen, we identified 10,218,140 single-nucleotide polymorphisms (SNPs) and 1,398,982 indels from analysis of 106 soybean lines sequenced at 15X coverage. 297,245 non-synonymous SNPs and 3330 copy number variation (CNV) regions were identified from this analysis. SNPs identified using PGen from additional soybean resequencing projects adding to 500+ soybean germplasm lines in total have been integrated. These SNPs are being utilized for trait improvement using genotype to phenotype prediction approaches developed in-house. In order to browse and access NGS data easily, we have also developed an NGS resequencing data browser ( http://soykb.org/NGS_Resequence/NGS_index.php ) within SoyKB to provide easy access to SNP and downstream analysis results for soybean researchers. PGen workflow has been optimized for the most

Large scale genomic reorganization of topological domains at the HoxD locus.

Science.gov (United States)

Fabre, Pierre J; Leleu, Marion; Mormann, Benjamin H; Lopez-Delisle, Lucille; Noordermeer, Daan; Beccari, Leonardo; Duboule, Denis

2017-08-07

The transcriptional activation of HoxD genes during mammalian limb development involves dynamic interactions with two topologically associating domains (TADs) flanking the HoxD cluster. In particular, the activation of the most posterior HoxD genes in developing digits is controlled by regulatory elements located in the centromeric TAD (C-DOM) through long-range contacts. To assess the structure-function relationships underlying such interactions, we measured compaction levels and TAD discreteness using a combination of chromosome conformation capture (4C-seq) and DNA FISH. We assessed the robustness of the TAD architecture by using a series of genomic deletions and inversions that impact the integrity of this chromatin domain and that remodel long-range contacts. We report multi-partite associations between HoxD genes and up to three enhancers. We find that the loss of native chromatin topology leads to the remodeling of TAD structure following distinct parameters. Our results reveal that the recomposition of TAD architectures after large genomic re-arrangements is dependent on a boundary-selection mechanism in which CTCF mediates the gating of long-range contacts in combination with genomic distance and sequence specificity. Accordingly, the building of a recomposed TAD at this locus depends on distinct functional and constitutive parameters.
The gradient boosting algorithm and random boosting for genome-assisted evaluation in large data sets.

Science.gov (United States)

González-Recio, O; Jiménez-Montero, J A; Alenda, R

2013-01-01

In the next few years, with the advent of high-density single nucleotide polymorphism (SNP) arrays and genome sequencing, genomic evaluation methods will need to deal with a large number of genetic variants and an increasing sample size. The boosting algorithm is a machine-learning technique that may alleviate the drawbacks of dealing with such large data sets. This algorithm combines different predictors in a sequential manner with some shrinkage on them; each predictor is applied consecutively to the residuals from the committee formed by the previous ones to form a final prediction based on a subset of covariates. Here, a detailed description is provided and examples using a toy data set are included. A modification of the algorithm called "random boosting" was proposed to increase predictive ability and decrease computation time of genome-assisted evaluation in large data sets. Random boosting uses a random selection of markers to add a subsequent weak learner to the predictive model. These modifications were applied to a real data set composed of 1,797 bulls genotyped for 39,714 SNP. Deregressed proofs of 4 yield traits and 1 type trait from January 2009 routine evaluations were used as dependent variables. A 2-fold cross-validation scenario was implemented. Sires born before 2005 were used as a training sample (1,576 and 1,562 for production and type traits, respectively), whereas younger sires were used as a testing sample to evaluate predictive ability of the algorithm on yet-to-be-observed phenotypes. Comparison with the original algorithm was provided. The predictive ability of the algorithm was measured as Pearson correlations between observed and predicted responses. Further, estimated bias was computed as the average difference between observed and predicted phenotypes. The results showed that the modification of the original boosting algorithm could be run in 1% of the time used with the original algorithm and with negligible differences in accuracy
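The abstract's description of boosting can be made concrete with a small sketch. The code below is an illustration only: it assumes squared-error loss and single-marker least-squares regressions as weak learners, neither of which is specified in the abstract, and the function names (`random_boost`, `predict`, `pearson`) and default parameters are hypothetical. Each iteration fits the current residuals using only a random subset of markers and adds the best marker's effect with shrinkage; setting `frac=1.0` considers every marker at each step.

```python
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def random_boost(X, y, n_iter=200, shrinkage=0.1, frac=0.2, seed=0):
    """Boosting on residuals with a random subset of markers per iteration.

    X is a list of rows (one per animal), each a list of marker codes.
    Returns (intercept, coefficients).
    """
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    intercept = sum(y) / n
    coef = [0.0] * p
    resid = [yi - intercept for yi in y]
    k = max(1, int(frac * p))  # number of markers examined per iteration
    for _ in range(n_iter):
        best_j, best_b, best_gain = None, 0.0, 0.0
        for j in rng.sample(range(p), k):
            sxx = sum(row[j] ** 2 for row in X)
            if sxx == 0:
                continue
            sxr = sum(row[j] * r for row, r in zip(X, resid))
            gain = sxr * sxr / sxx  # drop in residual sum of squares
            if gain > best_gain:
                best_j, best_b, best_gain = j, sxr / sxx, gain
        if best_j is None:
            continue
        step = shrinkage * best_b  # shrunken single-marker effect
        coef[best_j] += step
        resid = [r - step * row[best_j] for row, r in zip(X, resid)]
    return intercept, coef


def predict(X, intercept, coef):
    """Predicted response for each row of X."""
    return [intercept + sum(b * x for b, x in zip(coef, row)) for row in X]
```

Predictive ability can then be assessed as the abstract describes, as the Pearson correlation between observed and predicted values in a held-out sample, for example `pearson(predict(X_test, b0, coef), y_test)`. In this sketch, examining fewer markers per iteration is what reduces the work done at each step.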
D-GENIES: dot plot large genomes in an interactive, efficient and simple way.

Science.gov (United States)

Cabanettes, Floréal; Klopp, Christophe

2018-01-01

Dot plots are widely used to quickly compare sequence sets. They provide a synthetic similarity overview, highlighting repetitions, breaks and inversions. Different tools have been developed to easily generate genomic alignment dot plots, but they are often limited in the input sequence size. D-GENIES is a standalone and web application performing large genome alignments using the minimap2 software package and generating interactive dot plots. It enables users to sort query sequences along the reference, zoom into the plot and download several image, alignment or sequence files. D-GENIES is an easy-to-install, open-source software package (GPL) developed in Python and JavaScript. The source code is available at https://github.com/genotoul-bioinfo/dgenies and it can be tested at http://dgenies.toulouse.inra.fr/.
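To illustrate what a dot plot encodes, here is a naive sketch that marks every position where two sequences share an exact k-mer. This is not how D-GENIES works (it plots minimap2 alignments); the function names and the word size `k` are assumptions chosen for illustration, and the approach only suits short sequences.

```python
def revcomp(seq):
    """Reverse complement of a DNA string (A, C, G, T only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))


def kmer_dots(ref, query, k=4):
    """Sorted (ref_pos, query_pos) pairs where the k-mers match exactly."""
    index = {}
    for i in range(len(ref) - k + 1):
        index.setdefault(ref[i:i + k], []).append(i)
    dots = []
    for j in range(len(query) - k + 1):
        for i in index.get(query[j:j + k], []):
            dots.append((i, j))
    return sorted(dots)


def render(dots, ref_len, query_len):
    """Text grid: columns are reference positions, rows are query positions."""
    grid = [["." for _ in range(ref_len)] for _ in range(query_len)]
    for i, j in dots:
        grid[j][i] = "*"
    return "\n".join("".join(row) for row in grid)


if __name__ == "__main__":
    ref = "ACGTTGCA"
    dots = kmer_dots(ref, ref, k=4)
    # Dots mark k-mer start positions, so the grid has len - k + 1 cells per side.
    print(render(dots, len(ref) - 3, len(ref) - 3))
```

Identical sequences produce an unbroken main diagonal. Repeats appear as additional parallel diagonals, and an inverted segment can be exposed by running `kmer_dots(ref, revcomp(query), k)`.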
Soil Pollution with Copper, Lead and Zinc in the Surroundings of Large Copper Ore Tailings Impoundment

Directory of Open Access Journals (Sweden)

Musztyfaga Elżbieta

2014-12-01

Analysis of the top-soil total content of heavy metals was carried out in the vicinity of a large copper ore tailings impoundment in south-western Poland with regard to soil properties, direction and distance from the impoundment. None of the soils under study exceeded the limits admitted in the official standards for soil quality, but the assessment made in accordance with the IUNG guidelines for determining soil contamination showed that more than half of the monitoring sites have elevated metal content, Cu in particular. The results confirmed the high effectiveness of dust control in preventing its eolian spread from the tailings impoundment.
The genome of Laccaria bicolor provides insights into mycorrhizal symbiosis

Energy Technology Data Exchange (ETDEWEB)

Martin, F.; Aerts, A.; Ahren, D.; Brun, A.; Danchin, E. G. J.; Duchaussoy, F.; Gibon, J.; Kohler, A.; Lindquist, E.; Peresa, V.; Salamov, A.; Shapiro, H. J.; Wuyts, J.; Blaudez, D.; Buee, M.; Brokstein, P.; Canback, B.; Cohen, D.; Courty, P. E.; Coutinho, P. M.; Delaruelle, C.; Detter, J. C.; Deveau, A.; DiFazio, S.; Duplessis, S.; Fraissinet-Tachet, L.; Lucic, E.; Frey-Klett, P.; Fourrey, C.; Feussner, I.; Gay, G.; Grimwood, J.; Hoegger, P. J.; Jain, P.; Kilaru, S.; Labbe, J.; Lin, Y. C.; Legue, V.; Le Tacon, F.; Marmeisse, R.; Melayah, D.; Montanini, B.; Muratet, M.; Nehls, U.; Niculita-Hirzel, H.; Secq, M. P. Oudot-Le; Peter, M.; Quesneville, H.; Rajashekar, B.; Reich, M.; Rouhier, N.; Schmutz, J.; Yin, T.; Chalot, M.; Henrissat, B.; Kues, U.; Lucas, S.; Van de Peer, Y.; Podila, G. K.; Polle, A.; Pukkila, P. J.; Richardson, P. M.; Rouze, P.; Sanders, I. R.; Stajich, J. E.; Tunlid, A.; Tuskan, G.; Grigoriev, I. V.

2007-08-10

Mycorrhizal symbioses (the union of roots and soil fungi) are universal in terrestrial ecosystems and may have been fundamental to land colonization by plants [1, 2]. Boreal, temperate and montane forests all depend on ectomycorrhizae [1]. Identification of the primary factors that regulate symbiotic development and metabolic activity will therefore open the door to understanding the role of ectomycorrhizae in plant development and physiology, allowing the full ecological significance of this symbiosis to be explored. Here we report the genome sequence of the ectomycorrhizal basidiomycete Laccaria bicolor (Fig. 1) and highlight gene sets involved in rhizosphere colonization and symbiosis. This 65-megabase genome assembly contains 20,000 predicted protein-encoding genes and a very large number of transposons and repeated sequences. We detected unexpected genomic features, most notably a battery of effector-type small secreted proteins (SSPs) with unknown function, several of which are only expressed in symbiotic tissues. The most highly expressed SSP accumulates in the proliferating hyphae colonizing the host root. The ectomycorrhizae-specific SSPs probably have a decisive role in the establishment of the symbiosis. The unexpected observation that the genome of L. bicolor lacks carbohydrate-active enzymes involved in degradation of plant cell walls, but maintains the ability to degrade non-plant cell wall polysaccharides, reveals the dual saprotrophic and biotrophic lifestyle of the mycorrhizal fungus that enables it to grow within both soil and living plant roots. The predicted gene inventory of the L. bicolor genome, therefore, points to previously unknown mechanisms of symbiosis operating in biotrophic mycorrhizal fungi. The availability of this genome provides an unparalleled opportunity to develop a deeper understanding of the processes by which symbionts interact with plants within their ecosystem to perform vital functions in the carbon and nitrogen cycles that are
Genomics Portals: integrative web-platform for mining genomics data

Directory of Open Access Journals (Sweden)

Ghosh Krishnendu

2010-01-01

Background: A large amount of experimental data generated by modern high-throughput technologies is available through various public repositories. Our knowledge about molecular interaction networks, functional biological pathways and transcriptional regulatory modules is rapidly expanding, and is being organized in lists of functionally related genes. Jointly, these two sources of information hold a tremendous potential for gaining new insights into functioning of living systems. Results: Genomics Portals platform integrates access to an extensive knowledge base and a large database of human, mouse, and rat genomics data with basic analytical visualization tools. It provides the context for analyzing and interpreting new experimental data and the tool for effective mining of a large number of publicly available genomics datasets stored in the back-end databases. The uniqueness of this platform lies in the volume and the diversity of genomics data that can be accessed and analyzed (gene expression, ChIP-chip, ChIP-seq, epigenomics, computationally predicted binding sites, etc.), and the integration with an extensive knowledge base that can be used in such analysis. Conclusion: The integrated access to primary genomics data, functional knowledge and analytical tools makes Genomics Portals platform a unique tool for interpreting results of new genomics experiments and for mining the vast amount of data stored in the Genomics Portals backend databases. Genomics Portals can be accessed and used freely at http://GenomicsPortals.org.
The genome sequence of Polymorphum gilvum SL003B-26A1(T) reveals its genetic basis for crude oil degradation and adaptation to the saline soil.

Directory of Open Access Journals (Sweden)

Yong Nie

Polymorphum gilvum SL003B-26A1(T) is the type strain of a novel species in the recently published novel genus Polymorphum isolated from saline soil contaminated with crude oil. It is capable of using crude oil as the sole carbon and energy source and can adapt to saline soil at a temperature of 45°C. The Polymorphum gilvum genome provides a genetic basis for understanding how the strain could degrade crude oil and adapt to a saline environment. Genome analysis revealed the versatility of the strain for emulsifying crude oil, metabolizing aromatic compounds (a characteristic specific to the Polymorphum gilvum genome in comparison with other known genomes of oil-degrading bacteria), as well as possibly metabolizing n-alkanes through the LadA pathway. In addition, COG analysis revealed that Polymorphum gilvum SL003B-26A1(T) has significantly higher abundances of the proteins responsible for cell motility, lipid transport and metabolism, and secondary metabolite biosynthesis, transport and catabolism than the average levels found in all other genomes sequenced thus far, but lower abundances of the proteins responsible for carbohydrate transport and metabolism, defense mechanisms, and translation than the average levels. These traits support the adaptability of Polymorphum gilvum to a crude oil-contaminated saline environment. The Polymorphum gilvum genome could serve as a platform for further study of oil-degrading microorganisms for bioremediation and microbial-enhanced oil recovery in harsh saline environments.
Bacterial phylogeny structures soil resistomes across habitats

Science.gov (United States)

Forsberg, Kevin J.; Patel, Sanket; Gibson, Molly K.; Lauber, Christian L.; Knight, Rob; Fierer, Noah; Dantas, Gautam

2014-05-01

Ancient and diverse antibiotic resistance genes (ARGs) have previously been identified from soil, including genes identical to those in human pathogens. Despite the apparent overlap between soil and clinical resistomes, factors influencing ARG composition in soil and their movement between genomes and habitats remain largely unknown. General metagenome functions often correlate with the underlying structure of bacterial communities. However, ARGs are proposed to be highly mobile, prompting speculation that resistomes may not correlate with phylogenetic signatures or ecological divisions. To investigate these relationships, we performed functional metagenomic selections for resistance to 18 antibiotics from 18 agricultural and grassland soils. The 2,895 ARGs we discovered were mostly new, and represent all major resistance mechanisms. We demonstrate that distinct soil types harbour distinct resistomes, and that the addition of nitrogen fertilizer strongly influenced soil ARG content. Resistome composition also correlated with microbial phylogenetic and taxonomic structure, both across and within soil types. Consistent with this strong correlation, mobility elements (genes responsible for horizontal gene transfer between bacteria such as transposases and integrases) syntenic with ARGs were rare in soil by comparison with sequenced pathogens, suggesting that ARGs may not transfer between soil bacteria as readily as is observed between human pathogens. Together, our results indicate that bacterial community composition is the primary determinant of soil ARG content, challenging previous hypotheses that horizontal gene transfer effectively decouples resistomes from phylogeny.
Complete Genome Sequence of Amycolicicoccus subflavus DQS3-9A1T, an Actinomycete Isolated from Crude Oil-Polluted Soil

Science.gov (United States)

Cai, Man; Chen, Wei-Min; Nie, Yong; Chi, Chang-Qiao; Wang, Ya-Nan; Tang, Yue-Qin; Li, Guo-Ying; Wu, Xiao-Lei

2011-01-01

Amycolicicoccus subflavus DQS3-9A1T, isolated from crude oil-polluted soil in the Daqing Oilfield in China, is the type strain of a newly published novel species in the novel genus Amycolicicoccus. Here we report the complete genome of DQS3-9A1T and genes associated with the oil-polluted environment. PMID:21725023
Who ate whom? Adaptive Helicobacter genomic changes that accompanied a host jump from early humans to large felines.

Directory of Open Access Journals (Sweden)

Mark Eppinger

2006-07-01

Helicobacter pylori infection of humans is so old that its population genetic structure reflects that of ancient human migrations. A closely related species, Helicobacter acinonychis, is specific for large felines, including cheetahs, lions, and tigers, whereas hosts more closely related to humans harbor more distantly related Helicobacter species. This observation suggests a jump between host species. But who ate whom and when did it happen? In order to resolve this question, we determined the genomic sequence of H. acinonychis strain Sheeba and compared it to genomes from H. pylori. The conserved core genes between the genomes are so similar that the host jump probably occurred within the last 200,000 (range 50,000-400,000) years. However, the Sheeba genome also possesses unique features that indicate the direction of the host jump, namely from early humans to cats. Sheeba possesses an unusually large number of highly fragmented genes, many encoding outer membrane proteins, which may have been destroyed in order to bypass deleterious responses from the feline host immune system. In addition, the few Sheeba-specific genes that were found include a cluster of genes encoding sialylation of the bacterial cell surface carbohydrates, which were imported by horizontal genetic exchange and might also help to evade host immune defenses. These results provide a genomic basis for elucidating molecular events that allow bacteria to adapt to novel animal hosts.
Genomics With Cloud Computing

Directory of Open Access Journals (Sweden)

Sukhamrit Kaur

2015-04-01

Genomics is the study of genomes, which generates large amounts of data requiring substantial storage and computational power. Cloud computing addresses these needs through various cloud platforms for genomics. These platforms provide many services to users, such as easy access to data, easy sharing and transfer, storage in the hundreds of terabytes, and more computational power. Examples of such platforms are Google Genomics, DNAnexus, and Globus Genomics. Benefits of cloud computing for genomics include easy access to and sharing of data, data security, and lower resource costs, but drawbacks remain, such as long data transfer times and limited network bandwidth.
The genomes of three Bradyrhizobium sp. isolated from root nodules of Lupinus albescens grown in extremely poor soils display important genes for resistance to environmental stress.

Science.gov (United States)

Granada, Camille E; Vargas, Luciano K; Sant'Anna, Fernando Hayashi; Balsanelli, Eduardo; Baura, Valter Antonio de; Oliveira Pedrosa, Fábio de; Souza, Emanuel Maltempi de; Falcon, Tiago; Passaglia, Luciane M P

2018-05-17

Lupinus albescens is a resistant cover plant that establishes symbiotic relationships with bacteria belonging to the Bradyrhizobium genus. This symbiosis helps the development of these plants in adverse environmental conditions, such as the ones found in arenized areas of Southern Brazil. This work studied three Bradyrhizobium sp. (AS23, NAS80 and NAS96) isolated from L. albescens plants that grow in extremely poor soils (arenized areas and adjacent grasslands). The genomes of these three strains were sequenced in the Ion Torrent platform using the IonXpress library preparation kit, and presented a total number of bases of 1,230,460,823 for AS23, 1,320,104,022 for NAS80, and 1,236,105,093 for NAS96. The genome comparison with closest strains Bradyrhizobium japonicum USDA6 and Bradyrhizobium diazoefficiens USDA110 showed important variable regions (with less than 80% of similarity). Genes encoding for factors for resistance/tolerance to heavy metal, flagellar motility, response to osmotic and oxidative stresses, heat shock proteins (present only in the three sequenced genomes) could be responsible for the ability of these microorganisms to survive in inhospitable environments. Knowledge about these genomes will provide a foundation for future development of an inoculant bioproduct that should optimize the recovery of degraded soils using cover crops.
Twenty years of artificial directional selection have shaped the genome of the Italian Large White pig breed.

Science.gov (United States)

Schiavo, G; Galimberti, G; Calò, D G; Samorè, A B; Bertolini, F; Russo, V; Gallo, M; Buttazzoni, L; Fontanesi, L

2016-04-01

In this study, we investigated at the genome-wide level if 20 years of artificial directional selection based on boar genetic evaluation obtained with a classical BLUP animal model shaped the genome of the Italian Large White pig breed. The most influential boars of this breed (n = 192), born from 1992 (the beginning of the selection program of this breed) to 2012, with an estimated breeding value reliability of >0.85, were genotyped with the Illumina Porcine SNP60 BeadChip. After grouping the boars in eight classes according to their year of birth, filtered single nucleotide polymorphisms (SNPs) were used to evaluate the effects of time on genotype frequency changes using multinomial logistic regression models. Of these markers, 493 had a PBonferroni selection program. The obtained results indicated that the genome of the Italian Large White pigs was shaped by a directional selection program derived by the application of methodologies assuming the infinitesimal model that captured a continuous trend of allele frequency changes in the boar population. © 2015 Stichting International Foundation for Animal Genetics.
Genomics With Cloud Computing

OpenAIRE

Sukhamrit Kaur; Sandeep Kaur

2015-01-01

Genomics is the study of genomes, which generates large amounts of data requiring substantial storage and computational power. Cloud computing addresses these needs through various cloud platforms for genomics. These platforms provide many services to users, such as easy access to data, easy sharing and transfer, storage in the hundreds of terabytes, and more computational power. Examples of such platforms are Google Genomics, DNAnexus, and Globus Genomics. Various features of cloud computin...
Comparative analysis of Acinetobacters: three genomes for three lifestyles.

Directory of Open Access Journals (Sweden)

David Vallenet

Acinetobacter baumannii is the source of numerous nosocomial infections in humans and therefore deserves close attention as multidrug or even pandrug resistant strains are increasingly being identified worldwide. Here we report the comparison of two newly sequenced genomes of A. baumannii. The human isolate A. baumannii AYE is multidrug resistant whereas strain SDF, which was isolated from body lice, is antibiotic susceptible. As reference for comparison in this analysis, the genome of the soil-living bacterium A. baylyi strain ADP1 was used. The most interesting dissimilarities we observed were that (i) whereas strain AYE and A. baylyi genomes harbored very few Insertion Sequence elements which could promote expression of downstream genes, strain SDF sequence contains several hundred of them that have played a crucial role in its genome reduction (gene disruptions and simple DNA loss); (ii) strain SDF has low catabolic capacities compared to strain AYE. Interestingly, the latter has even higher catabolic capacities than A. baylyi which has already been reported as a very nutritionally versatile organism. This metabolic performance could explain the persistence of A. baumannii nosocomial strains in environments where nutrients are scarce; (iii) several processes known to play a key role during host infection (biofilm formation, iron uptake, quorum sensing, virulence factors) were either different or absent, the best example of which is iron uptake. Indeed, strain AYE and A. baylyi use siderophore-based systems to scavenge iron from the environment whereas strain SDF uses an alternate system similar to the Haem Acquisition System (HAS). Taken together, all these observations suggest that the genome contents of the 3 Acinetobacters compared are partly shaped by life in distinct ecological niches: human (and more largely hospital environment), louse, soil.
Large area mapping of soil moisture using the ESTAR passive microwave radiometer in Washita'92

International Nuclear Information System (INIS)

Jackson, T.J.; Le Vine, D.M.; Swift, C.T.; Schmugge, T.J.; Schiebe, F.R.

1995-01-01

Washita'92 was a large-scale study of remote sensing and hydrology conducted on the Little Washita watershed in southwest Oklahoma. Data collection during the experiment included passive microwave observations using an L-band electronically scanned thinned array radiometer (ESTAR) and surface soil moisture observations at sites distributed over the area. Data were collected on 8 days over a 9-day period in June 1992. The watershed was saturated with a great deal of standing water at the outset of the study. During the experiment there was no rainfall and surface soil moisture observations exhibited a drydown pattern over the period. Significant variations in the level and rate of change in surface soil moisture were noted over areas dominated by different soil textures. ESTAR data were processed to produce brightness temperature maps of a 740 sq. km. area on each of the 8 days. These data exhibited significant spatial and temporal patterns. Spatial patterns were clearly associated with soil textures and temporal patterns with drainage and evaporative processes. Relationships between the ground sampled soil moisture and the brightness temperatures were consistent with previous results. Spatial averaging of both variables was analyzed to study scaling of soil moisture over a mixed landscape. Results of these studies showed that a strong correlation is retained at these scales, suggesting that mapping surface moisture for large footprints may provide important information for regional studies. (author)
The use of remotely sensed soil moisture data in large-scale models of the hydrological cycle

Science.gov (United States)

Salomonson, V. V.; Gurney, R. J.; Schmugge, T. J.

1985-01-01

Manabe (1982) has reviewed numerical simulations of the atmosphere which provided a framework within which an examination of the dynamics of the hydrological cycle could be conducted. It was found that the climate is sensitive to soil moisture variability in space and time. The challenge arises now to improve the observations of soil moisture so as to provide up-dated boundary condition inputs to large scale models including the hydrological cycle. Attention is given to details regarding the significance of understanding soil moisture variations, soil moisture estimation using remote sensing, and energy and moisture balance modeling.
Quantitative linkage genome scan for atopy in a large collection of Caucasian families

DEFF Research Database (Denmark)

Webb, BT; van den Oord, E; Akkari, A

2007-01-01

adulthood, asthma is frequently associated also with quantitative measures of atopy. Genome wide quantitative multipoint linkage analysis was conducted for serum IgE levels and percentage of positive skin prick test (SPT(per)) using three large groups of families originally ascertained for asthma....... In this report, 438 and 429 asthma families were informative for linkage using IgE and SPT(per) which represents 690 independent families. Suggestive linkage (LOD >/= 2) was found on chromosomes 1, 3, and 8q with maximum LODs of 2.34 (IgE), 2.03 (SPT(per)), and 2.25 (IgE) near markers D1S1653, D3S2322-D3S1764...... represents one of the biggest genome scans so far reported for asthma related phenotypes. This study also demonstrates the utility of increased sample sizes and quantitative phenotypes in linkage analysis of complex disorders....
The development in the in-situ decontamination technique for the large quantity of soils contaminated by radioactive materials

International Nuclear Information System (INIS)

Tsubaki, Junichiro

2012-01-01

New filtration and condensation techniques that effectively decontaminate large quantities of contaminated soil were developed. A facility that treats 5 tons of soil per day is being developed. (M.H.)
A large-scale soil-structure interaction experiment: Part I design and construction

International Nuclear Information System (INIS)

Tang, H.T.; Tang, Y.K.; Wall, I.B.; Lin, E.

1987-01-01

In the simulated earthquake experiments (SIMQUAKE) sponsored by EPRI, the detonation of vertical arrays of explosives propagated wave motions through the ground to the model structures. Although such a simulation can provide information about dynamic soil-structure interaction (SSI) characteristics in a strong motion environment, it lacks the seismic wave scattering characteristics needed to study seismic input to the soil-structure system and the effect of different kinds of wave composition on the soil-structure response. To supplement the simulated earthquake SSI experiment, the Electric Power Research Institute (EPRI) and the Taiwan Power Company (Taipower) jointly sponsored a large scale SSI experiment in the field. The objectives of the experiment are: (1) to obtain a database induced by actual strong motion earthquakes in a soft-soil environment, which will substantiate predictive and design SSI models; and (2) to assess the dynamic response and margins of nuclear power plant reactor containment internal components under actual earthquake-induced excitation. These objectives are accomplished by recording and analyzing data from two instrumented, scaled down (1/4- and 1/12-scale) reinforced concrete containments sited in a high seismic region in Taiwan where a strong-motion seismic array network is located.

Soil carbon sequestration due to post-Soviet cropland abandonment: estimates from a large-scale soil organic carbon field inventory.

Science.gov (United States)

Wertebach, Tim-Martin; Hölzel, Norbert; Kämpf, Immo; Yurtaev, Andrey; Tupitsin, Sergey; Kiehl, Kathrin; Kamp, Johannes; Kleinebecker, Till

2017-09-01

The break-up of the Soviet Union in 1991 triggered cropland abandonment on a continental scale, which in turn led to carbon accumulation on abandoned land across Eurasia. Previous studies have estimated carbon accumulation rates across Russia based on large-scale modelling. Studies that assess carbon sequestration on abandoned land based on robust field sampling are rare. We investigated soil organic carbon (SOC) stocks using a randomized sampling design along a climatic gradient from forest steppe to Sub-Taiga in Western Siberia (Tyumen Province). In total, SOC contents were sampled on 470 plots across different soil and land-use types. The effect of land use on changes in SOC stock was evaluated, and carbon sequestration rates were calculated for different age stages of abandoned cropland. While land-use type had an effect on carbon accumulation in the topsoil (0-5 cm), no independent land-use effects were found for deeper SOC stocks. Topsoil carbon stocks of grasslands and forests were significantly higher than those of soils managed for crops and under abandoned cropland. SOC increased significantly with time since abandonment. The average carbon sequestration rate for soils of abandoned cropland was 0.66 Mg C ha⁻¹ yr⁻¹ (1-20 years old, 0-5 cm soil depth), which is at the lower end of published estimates for Russia and Siberia. There was a tendency towards SOC saturation on abandoned land as sequestration rates were much higher for recently abandoned (1-10 years old, 1.04 Mg C ha⁻¹ yr⁻¹) compared to earlier abandoned crop fields (11-20 years old, 0.26 Mg C ha⁻¹ yr⁻¹). Our study confirms the global significance of abandoned cropland in Russia for carbon sequestration. Our findings also suggest that robust regional surveys based on a large number of samples advance model-based continent-wide SOC prediction. © 2017 John Wiley & Sons Ltd.
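As a rough consistency check on the rates quoted in the abstract above, the two age-class rates can be combined into a 20-year average. This is only a sketch: it assumes each rate holds constant within its 10-year class and that the two classes are weighted equally, which is not necessarily how the authors derived their overall figure. All variable names are illustrative.

```python
# Topsoil (0-5 cm) sequestration rates quoted in the abstract, in Mg C per ha per year.
rate_recent = 1.04   # cropland abandoned 1-10 years ago
rate_older = 0.26    # cropland abandoned 11-20 years ago

# Assumption: each rate is constant over its 10-year class.
years_recent = 10
years_older = 10

# Carbon accumulated over the full 20 years under that assumption (Mg C per ha).
cumulative = rate_recent * years_recent + rate_older * years_older

# Time-weighted mean rate over 20 years (Mg C per ha per year).
weighted_mean = cumulative / (years_recent + years_older)

print(f"cumulative: {cumulative:.1f} Mg C/ha, mean rate: {weighted_mean:.2f} Mg C/ha/yr")
```

Under these assumptions the mean comes out close to the reported overall rate of 0.66, which is consistent with the saturation trend the abstract describes.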
Informational and linguistic analysis of large genomic sequence collections via efficient Hadoop cluster algorithms.

Science.gov (United States)

Ferraro Petrillo, Umberto; Roscigno, Gianluca; Cattaneo, Giuseppe; Giancarlo, Raffaele

2018-06-01

Information theoretic and compositional/linguistic analysis of genomes have a central role in bioinformatics, even more so since the associated methodologies are becoming very valuable also for epigenomic and meta-genomic studies. The kernel of those methods is based on the collection of k-mer statistics, i.e. how many times each k-mer in {A,C,G,T}^k occurs in a DNA sequence. Although this problem is computationally very simple and efficiently solvable on a conventional computer, the sheer amount of data now available in applications demands parallel and distributed computing. Indeed, algorithms of that type have been developed to collect k-mer statistics in the realm of genome assembly. However, they are so specialized to this domain that they do not extend easily to the computation of informational and linguistic indices concurrently on sets of genomes. Following the approach, well established in many disciplines and increasingly successful in bioinformatics, of resorting to MapReduce and Hadoop to deal with 'Big Data' problems, we present KCH, the first set of MapReduce algorithms able to perform concurrently informational and linguistic analysis of large collections of genomic sequences on a Hadoop cluster. The benchmarking of KCH that we provide indicates that it is quite effective and versatile. It is also competitive with respect to the parallel and distributed algorithms highly specialized to k-mer statistics collection for genome assembly problems. In conclusion, KCH is a much needed addition to the growing number of algorithms and tools that use MapReduce for bioinformatics core applications. The software, including instructions for running it over Amazon AWS, as well as the datasets are available at http://www.di-srv.unisa.it/KCH. Contact: umberto.ferraro@uniroma1.it. Supplementary data are available at Bioinformatics online.
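The k-mer statistics problem described in the abstract above can be illustrated with a small single-machine sketch arranged in a map/reduce shape. This is not KCH and does not use Hadoop; the function names (`map_kmers`, `reduce_counts`, `count_kmers`) are hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import chain

ALPHABET = set("ACGT")

def map_kmers(sequence, k):
    """Map step: emit every overlapping k-mer over {A,C,G,T} in one sequence."""
    seq = sequence.upper()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        # Skip windows containing N or any other non-ACGT symbol.
        if set(kmer) <= ALPHABET:
            yield kmer

def reduce_counts(kmer_streams):
    """Reduce step: sum the occurrences of each k-mer across all streams."""
    return Counter(chain.from_iterable(kmer_streams))

def count_kmers(sequences, k):
    """Count k-mers over a collection of sequences."""
    return reduce_counts(map_kmers(s, k) for s in sequences)

counts = count_kmers(["ACGTACGT", "TTACGN"], k=3)
print(counts.most_common())
```

In a distributed setting the map step would run independently on chunks of the input and the reduce step would merge partial counts per k-mer; here both run in one process to keep the example self-contained.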
Genome Partitioner: A web tool for multi-level partitioning of large-scale DNA constructs for synthetic biology applications.

Science.gov (United States)

Christen, Matthias; Del Medico, Luca; Christen, Heinz; Christen, Beat

2017-01-01

Recent advances in lower-cost DNA synthesis techniques have enabled new innovations in the field of synthetic biology. Still, efficient design and higher-order assembly of genome-scale DNA constructs remains a labor-intensive process. Given the complexity, computer assisted design tools that fragment large DNA sequences into fabricable DNA blocks are needed to pave the way towards streamlined assembly of biological systems. Here, we present the Genome Partitioner software implemented as a web-based interface that permits multi-level partitioning of genome-scale DNA designs. Without the need for specialized computing skills, biologists can submit their DNA designs to a fully automated pipeline that generates the optimal retrosynthetic route for higher-order DNA assembly. To test the algorithm, we partitioned a 783 kb Caulobacter crescentus genome design. We validated the partitioning strategy by assembling a 20 kb test segment encompassing a difficult to synthesize DNA sequence. Successful assembly from 1 kb subblocks into the 20 kb segment highlights the effectiveness of the Genome Partitioner for reducing synthesis costs and timelines for higher-order DNA assembly. The Genome Partitioner is broadly applicable to translate DNA designs into ready to order sequences that can be assembled with standardized protocols, thus offering new opportunities to harness the diversity of microbial genomes for synthetic biology applications. The Genome Partitioner web tool can be accessed at https://christenlab.ethz.ch/GenomePartitioner.
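The idea of multi-level partitioning in the abstract above can be sketched as a simple two-level split of a design into segments and subblocks. This is a hypothetical illustration, not the Genome Partitioner algorithm: it ignores the overlaps needed for assembly and any sequence constraints on synthesis, and it assumes the 783 kb design is exactly 783,000 bp. The names `partition` and `two_level_partition` are invented for the example; the 20 kb and 1 kb sizes are taken from the test segment described in the abstract.

```python
def partition(length, block_size):
    """Split the interval [0, length) into consecutive blocks of at most block_size."""
    return [(start, min(start + block_size, length))
            for start in range(0, length, block_size)]

def two_level_partition(length, segment_size=20_000, subblock_size=1_000):
    """Partition a design into segments, then each segment into subblocks."""
    plan = []
    for seg_start, seg_end in partition(length, segment_size):
        # Subblock coordinates are shifted back into design coordinates.
        subblocks = [(seg_start + s, seg_start + e)
                     for s, e in partition(seg_end - seg_start, subblock_size)]
        plan.append({"segment": (seg_start, seg_end), "subblocks": subblocks})
    return plan

plan = two_level_partition(783_000)
print(len(plan), "segments,", sum(len(p["subblocks"]) for p in plan), "subblocks")
```

A real partitioning tool would also have to choose block boundaries that avoid hard-to-synthesize regions and add overlapping ends for assembly; the fixed-size split here only shows the hierarchical structure.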
First Insights into the Large Genome of Epimedium sagittatum (Sieb. et Zucc) Maxim, a Chinese Traditional Medicinal Plant

Directory of Open Access Journals (Sweden)

Gong Xiao

2013-06-01

Epimedium sagittatum (Sieb. et Zucc) Maxim is a member of the Berberidaceae family of basal eudicot plants, widely distributed and used as a traditional medicinal plant in China for therapeutic effects on many diseases with a long history. Recent data shows that E. sagittatum has a relatively large genome, with a haploid genome size of ~4496 Mbp, divided into a small number of only 12 diploid chromosomes (2n = 2x = 12). However, little is known about Epimedium genome structure and composition. Here we present the analysis of 691 kb of high-quality genomic sequence derived from 672 randomly selected plasmid clones of E. sagittatum genomic DNA, representing ~0.0154% of the genome. The sampled sequences comprised at least 78.41% repetitive DNA elements and 2.51% confirmed annotated gene sequences, with a total GC% content of 39%. Retrotransposons represented the major class of transposable element (TE) repeats identified (65.37% of all TE repeats), particularly LTR (Long Terminal Repeat) retrotransposons (52.27% of all TE repeats). Chromosome analysis and Fluorescence in situ Hybridization of Gypsy-Ty3 retrotransposons were performed to survey the E. sagittatum genome at the cytological level. Our data provide the first insights into the composition and structure of the E. sagittatum genome, and will facilitate the functional genomic analysis of this valuable medicinal plant.
First Insights into the Large Genome of Epimedium sagittatum (Sieb. et Zucc) Maxim, a Chinese Traditional Medicinal Plant

Science.gov (United States)

Liu, Di; Zeng, Shao-Hua; Chen, Jian-Jun; Zhang, Yan-Jun; Xiao, Gong; Zhu, Lin-Yao; Wang, Ying

2013-01-01

Epimedium sagittatum (Sieb. et Zucc) Maxim is a member of the Berberidaceae family of basal eudicot plants, widely distributed and used as a traditional medicinal plant in China for therapeutic effects on many diseases with a long history. Recent data shows that E. sagittatum has a relatively large genome, with a haploid genome size of ~4496 Mbp, divided into a small number of only 12 diploid chromosomes (2n = 2x = 12). However, little is known about Epimedium genome structure and composition. Here we present the analysis of 691 kb of high-quality genomic sequence derived from 672 randomly selected plasmid clones of E. sagittatum genomic DNA, representing ~0.0154% of the genome. The sampled sequences comprised at least 78.41% repetitive DNA elements and 2.51% confirmed annotated gene sequences, with a total GC% content of 39%. Retrotransposons represented the major class of transposable element (TE) repeats identified (65.37% of all TE repeats), particularly LTR (Long Terminal Repeat) retrotransposons (52.27% of all TE repeats). Chromosome analysis and Fluorescence in situ Hybridization of Gypsy-Ty3 retrotransposons were performed to survey the E. sagittatum genome at the cytological level. Our data provide the first insights into the composition and structure of the E. sagittatum genome, and will facilitate the functional genomic analysis of this valuable medicinal plant. PMID:23807511
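The sampled fraction quoted in the abstract above can be reproduced from the two sizes it gives. The short calculation below is a sketch that assumes 1 kb = 1,000 bp and 1 Mbp = 1,000,000 bp and treats the rounded figures in the abstract as exact; the variable names are illustrative.

```python
# Sizes quoted in the abstract, converted to base pairs.
sampled_bp = 691 * 1_000          # 691 kb of sampled genomic sequence
genome_bp = 4496 * 1_000_000      # ~4496 Mbp haploid genome size

# Percentage of the haploid genome covered by the sample.
percent_sampled = 100 * sampled_bp / genome_bp

print(f"{percent_sampled:.4f}% of the genome sampled")
```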
Comparative genome analysis identifies two large deletions in the genome of highly-passaged attenuated Streptococcus agalactiae strain YM001 compared to the parental pathogenic strain HN016.

Science.gov (United States)

Wang, Rui; Li, Liping; Huang, Yan; Luo, Fuguang; Liang, Wanwen; Gan, Xi; Huang, Ting; Lei, Aiying; Chen, Ming; Chen, Lianfu

2015-11-04

Streptococcus agalactiae (S. agalactiae), also known as group B Streptococcus (GBS), is an important pathogen for neonatal pneumonia, meningitis, bovine mastitis, and fish meningoencephalitis. The global outbreaks of Streptococcus disease in tilapia cause huge economic losses and threaten human food hygiene safety as well. To investigate the mechanism of S. agalactiae pathogenesis in tilapia and develop attenuated S. agalactiae vaccine, this study sequenced and comparatively analyzed the whole genomes of virulent wild-type S. agalactiae strain HN016 and its highly-passaged attenuated strain YM001 derived from tilapia. We performed Illumina sequencing of DNA prepared from strains HN016 and YM001. Sequenced reads were assembled, and nucleotide comparisons, single nucleotide polymorphisms (SNPs), and indels were analyzed between the draft genomes of HN016 and YM001. Clustered regularly interspaced short palindromic repeats (CRISPRs) and prophage were detected and analyzed in different S. agalactiae strains. The genome of S. agalactiae YM001 was 2,047,957 bp with a GC content of 35.61%; it contained 2044 genes and 88 RNAs. Meanwhile, the genome of S. agalactiae HN016 was 2,064,722 bp with a GC content of 35.66%; it had 2063 genes and 101 RNAs. Comparative genome analysis indicated that, compared with HN016, the YM001 genome had two significant large deletions of 5832 and 11,116 bp, respectively, resulting in the deletion of three rRNA and ten tRNA genes, as well as the deletion and functional damage of ten genes related to metabolism, transport, growth, anti-stress, etc. Besides these two large deletions, ten other deletions and 28 single nucleotide variations (SNVs) were also identified, mainly affecting the metabolism- and growth-related genes. The genome of attenuated S. agalactiae YM001 showed significant variations, resulting in the deletion of 10 functional genes, compared to the parental pathogenic strain HN016. The deleted and mutated functional genes all
Novel, non-symbiotic isolates of Neorhizobium from a dryland agricultural soil.

Science.gov (United States)

Soenens, Amalia; Imperial, Juan

2018-01-01

Semi-selective enrichment, followed by PCR screening, resulted in the successful direct isolation of fast-growing Rhizobia from a dryland agricultural soil. Over 50% of these isolates belong to the genus Neorhizobium, as concluded from partial rpoB and near-complete 16S rDNA sequence analysis. Further genotypic and genomic analysis of five representative isolates confirmed that they form a coherent group within Neorhizobium, closer to N. galegae than to the remaining Neorhizobium species, but clearly differentiated from the former, and constituting at least one new genomospecies within Neorhizobium. All the isolates lacked nod and nif symbiotic genes but contained a repABC replication/maintenance region, characteristic of rhizobial plasmids, within large contigs from their draft genome sequences. These repABC sequences were related, but not identical, to repABC sequences found in symbiotic plasmids from N. galegae, suggesting that the non-symbiotic isolates have the potential to harbor symbiotic plasmids. This is the first report of non-symbiotic members of Neorhizobium from soil.
A large-scale soil-structure interaction experiment: Design and construction

International Nuclear Information System (INIS)

Tang, H.T.; Tang, Y.K.; Stepp, J.C.; Wall, I.B.; Lin, E.; Cheng, S.C.; Lee, S.K.

1989-01-01

This paper describes the design and construction phase of the Large-Scale Soil-Structure Interaction Experiment project jointly sponsored by EPRI and Taipower. The project has two objectives: 1. to obtain an earthquake database which can be used to substantiate soil-structure interaction (SSI) models and analysis methods; and 2. to quantify nuclear power plant reactor containment and internal components seismic margin based on earthquake experience data. These objectives were accomplished by recording and analyzing data from two instrumented, scaled down, reinforced concrete containment structures during seismic events. The two model structures are sited in a high seismic region in Taiwan (SMART-1). A strong-motion seismic array network is located at the site. The containment models (1/4- and 1/12-scale) were constructed and instrumented specially for this experiment. Construction was completed and data recording began in September 1985. By November 1986, 18 strong motion earthquakes ranging from Richter magnitude 4.5 to 7.0 were recorded. (orig./HP)
Population Genomics of Mycobacterium tuberculosis in Ethiopia Contradicts the Virgin Soil Hypothesis for Human Tuberculosis in Sub-Saharan Africa.

Science.gov (United States)

Comas, Iñaki; Hailu, Elena; Kiros, Teklu; Bekele, Shiferaw; Mekonnen, Wondale; Gumi, Balako; Tschopp, Rea; Ameni, Gobena; Hewinson, R Glyn; Robertson, Brian D; Goig, Galo A; Stucki, David; Gagneux, Sebastien; Aseffa, Abraham; Young, Douglas; Berg, Stefan

2015-12-21

Colonial medical reports claimed that tuberculosis (TB) was largely unknown in Africa prior to European contact, providing a "virgin soil" for spread of TB in highly susceptible populations previously unexposed to the disease [1, 2]. This is in direct contrast to recent phylogenetic models which support an African origin for TB [3-6]. To address this apparent contradiction, we performed a broad genomic sampling of Mycobacterium tuberculosis in Ethiopia. All members of the M. tuberculosis complex (MTBC) arose from clonal expansion of a single common ancestor [7] with a proposed origin in East Africa [3, 4, 8]. Consistent with this proposal, MTBC lineage 7 is almost exclusively found in that region [9-11]. Although a detailed medical history of Ethiopia supports the view that TB was rare until the 20th century [12], over the last century Ethiopia has become a high-burden TB country [13]. Our results provide further support for an African origin for TB, with some genotypes already present on the continent well before European contact. Phylogenetic analyses reveal a pattern of serial introductions of multiple genotypes into Ethiopia in association with human migration and trade. In place of a "virgin soil" fostering the spread of TB in a previously naive population, we propose that increased TB mortality in Africa was driven by the introduction of European strains of M. tuberculosis alongside expansion of selected indigenous strains having biological characteristics that carry a fitness benefit in the urbanized settings of post-colonial Africa. Copyright © 2015 The Authors. Published by Elsevier Ltd. All rights reserved.
The Organelle Genomes of Hassawi Rice (Oryza sativa L.) and Its Hybrid in Saudi Arabia: Genome Variation, Rearrangement, and Origins

Science.gov (United States)

Zhang, Tongwu; Hu, Songnian; Zhang, Guangyu; Pan, Linlin; Zhang, Xiaowei; Al-Mssallem, Ibrahim S.; Yu, Jun

2012-01-01

Hassawi rice (Oryza sativa L.) is a landrace adapted to the climate of Saudi Arabia, characterized by its strong resistance to soil salinity and drought. Using high quality sequencing reads extracted from raw data of a whole genome sequencing project, we assembled both chloroplast (cp) and mitochondrial (mt) genomes of the wild-type Hassawi rice (Hassawi-1) and its dwarf hybrid (Hassawi-2). We discovered 16 InDels (insertions and deletions) but no SNP (single nucleotide polymorphism) is present between the two Hassawi cp genomes. We identified 48 InDels and 26 SNPs in the two Hassawi mt genomes and a new type of sequence variation, termed reverse complementary variation (RCV) in the rice cp genomes. There are two and four RCVs identified in Hassawi-1 when compared to 93–11 (indica) and Nipponbare (japonica), respectively. Microsatellite sequence analysis showed there are more SSRs in the genic regions of both cp and mt genomes in the Hassawi rice than in the other rice varieties. There are also large repeats in the Hassawi mt genomes, with the longest length of 96,168 bp and 96,165 bp in Hassawi-1 and Hassawi-2, respectively. We believe that frequent DNA rearrangement in the Hassawi mt and cp genomes indicate ongoing dynamic processes to reach genetic stability under strong environmental pressures. Based on sequence variation analysis and the breeding history, we suggest that both Hassawi-1 and Hassawi-2 originated from the Indonesian variety Peta since genetic diversity between the two Hassawi cultivars is very low albeit an unknown historic origin of the wild-type Hassawi rice. PMID:22870184
Large-scale Patterns of 14C Age of Bulk Organic Carbon and Various Molecular Components in Grassland Soils

Science.gov (United States)

Jia, J.; Liu, Z.; Cao, Z.; Chen, L.; He, J. S.; Haghipour, N.; Wacker, L.; Eglinton, T. I.; Feng, X.

2017-12-01

Unraveling the fate of organic carbon (OC) in soils is essential to understanding the impact of global changes on the global carbon cycle. Previous studies have shown that while various soil OC components have different decomposability, chemically labile OC can have old 14C ages. However, few studies have compared the 14C age of various soil OC components on a large scale, which may provide important information on the link between the age or turnover of soil OC components and their sources, molecular structures and environmental variables. In this project, a suite of soil profiles were sampled along a large-scale transect of temperate and alpine grasslands across the Tibetan and Mongolian Plateaus in China with contrasting climatic, vegetation and soil properties. Bulk OC and source-specific compounds (including fatty acids (FAs), diacids (DAs) and lignin phenols) were radiocarbon-dated to investigate the age and turnover dynamics of different OC pools and the mechanisms controlling their stability. Our results show that lignin phenols displayed a large 14C variability. Short-chain (C16, 18) FAs sourced from vascular plants as well as microorganisms were younger than plant-derived long-chain FAs and DAs, indicating that short-chain FAs were more easily decomposed or were newly synthesized. In the temperate grasslands, long-chain DAs were younger than FAs, while the opposite trend was observed in the alpine grasslands. Preliminary correlation analysis suggests that the age of short-chain FAs was mainly influenced by clay contents and climate, while reactive minerals, clay or silt particles were important factors in the stabilization of long-chain FAs, DAs and lignin phenols. Overall, our study provided a unique 14C dataset of soil OC components in grasslands, which will provide important constraints on soil carbon turnover in future investigations.
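For readers less familiar with radiocarbon conventions: a conventional 14C age is usually computed from the measured fraction modern (F14C) as age = -8033 · ln(F14C), where 8033 years is the Libby mean life. The Python sketch below illustrates only that standard conversion. The function name and the example values in the usage note are assumptions for illustration and are not taken from the study above.

```python
import math

# Libby mean life in years (Libby half-life of 5568 yr divided by ln 2),
# the constant used by convention for reporting radiocarbon ages.
LIBBY_MEAN_LIFE_YEARS = 8033.0


def conventional_14c_age(fraction_modern: float) -> float:
    """Return the conventional radiocarbon age (years BP) for a fraction-modern value.

    `fraction_modern` is a hypothetical input name; values above 1 give
    negative ages, i.e. carbon carrying bomb-derived 14C.
    """
    if fraction_modern <= 0:
        raise ValueError("fraction modern must be positive")
    return -LIBBY_MEAN_LIFE_YEARS * math.log(fraction_modern)
```

Calling `conventional_14c_age(0.5)` gives an age close to one Libby half-life, which is a quick sanity check on the constant.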
The large-scale process of microbial carbonate precipitation for nickel remediation from an industrial soil.

Science.gov (United States)

Zhu, Xuejiao; Li, Weila; Zhan, Lu; Huang, Minsheng; Zhang, Qiuzhuo; Achal, Varenyam

2016-12-01

Microbial carbonate precipitation is known as an efficient process for the remediation of heavy metals from contaminated soils. In the present study, a urease positive bacterial isolate, identified as Bacillus cereus NS4 through 16S rDNA sequencing, was utilized on a large scale to remove nickel from industrial soil contaminated by the battery industry. The soil was highly contaminated with an initial total nickel concentration of approximately 900 mg kg⁻¹. The soluble-exchangeable fraction was reduced to 38 mg kg⁻¹ after treatment. The primary objective of metal stabilization was achieved by reducing the bioavailability through immobilizing the nickel in the urease-driven carbonate precipitation. The nickel removal in the soils contributed to the transformation of nickel from mobile species into stable biominerals identified as calcite, vaterite, aragonite and nickelous carbonate when analyzed under XRD. It was proven that during precipitation of calcite, Ni²⁺ with an ion radius close to Ca²⁺ was incorporated into the CaCO₃ crystal. The biominerals were also characterized by using SEM-EDS to observe the crystal shape and Raman-FTIR spectroscopy to predict responsible bonding during bioremediation with respect to Ni immobilization. The electronic structure and chemical-state information of the detected elements during MICP bioremediation process was studied by XPS. This is the first study in which microbial carbonate precipitation was used for the large-scale remediation of metal-contaminated industrial soil. Copyright © 2016 Elsevier Ltd. All rights reserved.
The role of soil microbiology in soil health

Science.gov (United States)

Microbial diversity in the rhizosphere is enormous. The complex plant-associated microbial community, or second genome of the plant, is crucial for plant health and soil function. Microbes are active in decomposition, release mineralizable nutrients, synthesize plant growth regulators, degrade/inact...
Lateral saturated hydraulic conductivity of soil horizons evaluated in large-volume soil monoliths

NARCIS (Netherlands)

Pirastru, Mario; Marrosu, Roberto; Prima, Di Simone; Keesstra, Saskia; Giadrossich, Filippo; Niedda, Marcello

2017-01-01

Evaluating the lateral saturated hydraulic conductivity, Ks,l, of soil horizons is crucial for understanding and modelling the subsurface flow dynamics in many shallow hill soils. A Ks,l measurement method should be able to catch the effects of soil heterogeneities governing hydrological processes
Draft Genome Sequences of Pseudomonas fluorescens BS2 and Pusillimonas noertemannii BS8, Soil Bacteria That Cooperate To Degrade the Poly-γ-d-Glutamic Acid Anthrax Capsule

OpenAIRE

Stabler, Richard A.; Negus, David; Pain, Arnab; Taylor, Peter W.

2013-01-01

A mixed culture of Pseudomonas fluorescens BS2 and Pusillimonas noertemannii BS8 degraded poly-γ-d-glutamic acid; when the 2 strains were cultured separately, no hydrolytic activity was apparent. Here we report the draft genome sequences of both soil isolates.
GIGGLE: a search engine for large-scale integrated genome analysis.

Science.gov (United States)

Layer, Ryan M; Pedersen, Brent S; DiSera, Tonya; Marth, Gabor T; Gertz, Jason; Quinlan, Aaron R

2018-02-01

GIGGLE is a genomics search engine that identifies and ranks the significance of genomic loci shared between query features and thousands of genome interval files. GIGGLE (https://github.com/ryanlayer/giggle) scales to billions of intervals and is over three orders of magnitude faster than existing methods. Its speed extends the accessibility and utility of resources such as ENCODE, Roadmap Epigenomics, and GTEx by facilitating data integration and hypothesis generation.
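The core query behind this kind of search is counting how many indexed intervals overlap a query interval. The Python sketch below shows one simple way to do that with two sorted arrays and binary search. It is an illustrative stand-in only: the abstract does not describe GIGGLE's index, and the class name, method name, and half-open coordinate convention are assumptions.

```python
from bisect import bisect_left, bisect_right


class IntervalCounter:
    """Count overlaps against a fixed set of half-open intervals [start, end).

    Hypothetical helper for illustration; not GIGGLE's data structure.
    """

    def __init__(self, intervals):
        # Starts and ends are sorted independently; only counts are needed.
        self.starts = sorted(start for start, _ in intervals)
        self.ends = sorted(end for _, end in intervals)

    def count_overlaps(self, q_start, q_end):
        # Intervals that begin before the query ends ...
        began_before_query_end = bisect_left(self.starts, q_end)
        # ... minus those that already finished at or before the query start.
        finished_before_query = bisect_right(self.ends, q_start)
        return began_before_query_end - finished_before_query
```

Each query costs two binary searches, so the count is obtained in logarithmic time once the arrays are sorted. Reporting which intervals overlap, rather than how many, would need a different structure such as an interval tree.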
Analysis of radiation-induced genome alterations in Vigna unguiculata

Directory of Open Access Journals (Sweden)

van der Vyver C

2011-09-01

Christell van der Vyver (1), B Juan Vorster (2), Karl J Kunert (3), Christopher A Cullis (4). (1) Institute for Plant Biotechnology, Department of Genetics, University of Stellenbosch, Stellenbosch, South Africa; (2) Department of Plant Production and Soil Science, and (3) Department of Plant Science, Forestry and Agricultural Biotechnology Institute, University of Pretoria, Pretoria, South Africa; (4) Case Western Reserve University, Department of Biology, Cleveland, OH, USA. Abstract: Seeds from an inbred Vigna unguiculata (cowpea) cultivar were gamma-irradiated with a dose of 180 Gy in order to identify and characterize possible mutations. Three techniques, ie, random amplified polymorphic DNA, microsatellites, and representational difference analysis, were used to characterize possible DNA variation among the mutants and nonirradiated control plants both immediately after irradiation and in subsequent generations. A large portion of putative radiation-induced genome changes had significant similarities to chloroplast sequences. The frequency of mutation at three of these isolated polymorphic regions with chloroplast similarity was further determined by polymerase chain reaction screening using a large number of individual parental, M1, and M2 plants. Analysis of these sequences indicated that the rate at which various regions of the genome are mutated in irradiation experiments differs significantly and also that mutations have variable “repair” rates. Furthermore, regions of the nuclear DNA derived from the chloroplast genome are highly susceptible to modification by radiation treatment. Overall, data have provided detailed information on the effects of gamma irradiation on the cowpea genome and about the ability of the plant to repair these genome changes in subsequent plant generations. Keywords: mutation breeding, gamma radiation, genetic mutations, cowpea, representational difference analysis
GIGGLE: a search engine for large-scale integrated genome analysis

Science.gov (United States)

Layer, Ryan M; Pedersen, Brent S; DiSera, Tonya; Marth, Gabor T; Gertz, Jason; Quinlan, Aaron R

2018-01-01

GIGGLE is a genomics search engine that identifies and ranks the significance of genomic loci shared between query features and thousands of genome interval files. GIGGLE (https://github.com/ryanlayer/giggle) scales to billions of intervals and is over three orders of magnitude faster than existing methods. Its speed extends the accessibility and utility of resources such as ENCODE, Roadmap Epigenomics, and GTEx by facilitating data integration and hypothesis generation. PMID:29309061
Determination of 129I in large soil samples after alkaline wet disintegration

International Nuclear Information System (INIS)

Bunzl, K.; Kracke, W.

1992-01-01

Large soil samples (up to 500 g) can conveniently be disintegrated by hydrogen peroxide in a utility tank under alkaline conditions so that 129I can subsequently be determined by neutron activation analysis. Interfering elements such as Br are removed before neutron irradiation to reduce the radiation exposure of the personnel. The precision of the method is 129I also by the combustion method. (orig.)

Grass genomes

OpenAIRE

Bennetzen, Jeffrey L.; SanMiguel, Phillip; Chen, Mingsheng; Tikhonov, Alexander; Francki, Michael; Avramova, Zoya

1998-01-01

For the most part, studies of grass genome structure have been limited to the generation of whole-genome genetic maps or the fine structure and sequence analysis of single genes or gene clusters. We have investigated large contiguous segments of the genomes of maize, sorghum, and rice, primarily focusing on intergenic spaces. Our data indicate that much (>50%) of the maize genome is composed of interspersed repetitive DNAs, primarily nested retrotransposons that in...
Selection for Unequal Densities of Sigma70 Promoter-like Signals in Different Regions of Large Bacterial Genomes

Energy Technology Data Exchange (ETDEWEB)

Huerta, Araceli M.; Francino, M. Pilar; Morett, Enrique; Collado-Vides, Julio

2006-03-01

distribution of promoter-like signals between regulatory and nonregulatory regions detected in large bacterial genomes confers a significant, although small, fitness advantage. This study paves the way for further identification of the specific types of selective constraints that affect the organization of regulatory regions and the overall distribution of promoter-like signals through more detailed comparative analyses among closely-related bacterial genomes.
Research Guidelines in the Era of Large-scale Collaborations: An Analysis of Genome-wide Association Study Consortia

Science.gov (United States)

Austin, Melissa A.; Hair, Marilyn S.; Fullerton, Stephanie M.

2012-01-01

Scientific research has shifted from studies conducted by single investigators to the creation of large consortia. Genetic epidemiologists, for example, now collaborate extensively for genome-wide association studies (GWAS). The effect has been a stream of confirmed disease-gene associations. However, effects on human subjects oversight, data-sharing, publication and authorship practices, research organization and productivity, and intellectual property remain to be examined. The aim of this analysis was to identify all research consortia that had published the results of a GWAS analysis since 2005, characterize them, determine which have publicly accessible guidelines for research practices, and summarize the policies in these guidelines. A review of the National Human Genome Research Institute’s Catalog of Published Genome-Wide Association Studies identified 55 GWAS consortia as of April 1, 2011. These consortia were comprised of individual investigators, research centers, studies, or other consortia and studied 48 different diseases or traits. Only 14 (25%) were found to have publicly accessible research guidelines on consortia websites. The available guidelines provide information on organization, governance, and research protocols; half address institutional review board approval. Details of publication, authorship, data-sharing, and intellectual property vary considerably. Wider access to consortia guidelines is needed to establish appropriate research standards with broad applicability to emerging forms of large-scale collaboration. PMID:22491085
Comparative Genomics Reveals High Genomic Diversity in the Genus Photobacterium.

Science.gov (United States)

Machado, Henrique; Gram, Lone

2017-01-01

Vibrionaceae is a large marine bacterial family, which can constitute up to 50% of the prokaryotic population in marine waters. Photobacterium is the second largest genus in the family and we used comparative genomics on 35 strains representing 16 of the 28 species described so far, to understand the genomic diversity present in the Photobacterium genus. Such understanding is important for ecophysiology studies of the genus. We used whole genome sequences to evaluate phylogenetic relationships using several analyses (16S rRNA, MLSA, fur, amino-acid usage, ANI), which allowed us to identify two misidentified strains. Genome analyses also revealed occurrence of higher and lower GC content clades, correlating with phylogenetic clusters. Pan- and core-genome analysis revealed the conservation of 25% of the genome throughout the genus, with a large and open pan-genome. The major source of genomic diversity could be traced to the smaller chromosome and plasmids. Several of the physiological traits studied in the genus did not correlate with phylogenetic data. Since horizontal gene transfer (HGT) is often suggested as a source of genetic diversity and a potential driver of genomic evolution in bacterial species, we looked into evidence of such in Photobacterium genomes. Genomic islands were the source of genomic differences between strains of the same species. Also, we found transposase genes and CRISPR arrays that suggest multiple encounters with foreign DNA. Presence of genomic exchange traits was widespread and abundant in the genus, suggesting a role in genomic evolution. The high genetic variability and indications of genetic exchange make it difficult to elucidate genome evolutionary paths and raise the awareness of the roles of foreign DNA in the genomic evolution of environmental organisms.
Insights into structural variations and genome rearrangements in prokaryotic genomes.

Science.gov (United States)

Periwal, Vinita; Scaria, Vinod

2015-01-01

Structural variations (SVs) are genomic rearrangements that affect fairly large fragments of DNA. Most of the SVs such as inversions, deletions and translocations have been largely studied in context of genetic diseases in eukaryotes. However, recent studies demonstrate that genome rearrangements can also have profound impact on prokaryotic genomes, leading to altered cell phenotype. In contrast to single-nucleotide variations, SVs provide a much deeper insight into organization of bacterial genomes at a much better resolution. SVs can confer change in gene copy number, creation of new genes, altered gene expression and many other functional consequences. High-throughput technologies have now made it possible to explore SVs at a much refined resolution in bacterial genomes. Through this review, we aim to highlight the importance of the less explored field of SVs in prokaryotic genomes and their impact. We also discuss its potential applicability in the emerging fields of synthetic biology and genome engineering where targeted SVs could serve to create sophisticated and accurate genome editing. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Pre-genomic, genomic and post-genomic study of microbial communities involved in bioenergy.

Science.gov (United States)

Rittmann, Bruce E; Krajmalnik-Brown, Rosa; Halden, Rolf U

2008-08-01

Microorganisms can produce renewable energy in large quantities and without damaging the environment or disrupting food supply. The microbial communities must be robust and self-stabilizing, and their essential syntrophies must be managed. Pre-genomic, genomic and post-genomic tools can provide crucial information about the structure and function of these microbial communities. Applying these tools will help accelerate the rate at which microbial bioenergy processes move from intriguing science to real-world practice.
Reconstruction of Oomycete Genome Evolution Identifies Differences in Evolutionary Trajectories Leading to Present-Day Large Gene Families

NARCIS (Netherlands)

Seidl, M.F.; Ackerveken, van den G.; Govers, F.; Snel, B.

2012-01-01

The taxonomic class of oomycetes contains numerous pathogens of plants and animals but is related to nonpathogenic diatoms and brown algae. Oomycetes have flexible genomes comprising large gene families that play roles in pathogenicity. The evolutionary processes that shaped the gene content have
Soil-Structure Interaction for Non-Slender, Large-Diameter Offshore Monopiles

DEFF Research Database (Denmark)

Sørensen, Søren Peder Hyldal

Offshore wind power is a domestic, sustainable and largely untapped energy resource. Today, the modern offshore wind turbine offers competitive production prices compared to other sources of renewable energy. Therefore, it is a key technology in breaking the dependence on fossil fuels...... and in achieving the energy and climate goals of the future. For offshore wind turbines, the costs of foundation typically constitute 20-30% of the total costs. Hence, improved methods for the design of foundations for offshore wind turbines can increase the competitiveness of offshore wind energy significantly....... The monopile foundation concept has been employed as the foundation for the majority of the currently installed offshore wind turbines. Therefore, this PhD thesis concerns the soil-pile interaction for non-slender, large-diameter offshore piles. A combination of numerical and physical modelling has been...
SVA retrotransposon insertion-associated deletion represents a novel mutational mechanism underlying large genomic copy number changes with non-recurrent breakpoints

Science.gov (United States)

2014-01-01

Background: Genomic disorders are caused by copy number changes that may exhibit recurrent breakpoints processed by nonallelic homologous recombination. However, region-specific disease-associated copy number changes have also been observed which exhibit non-recurrent breakpoints. The mechanisms underlying these non-recurrent copy number changes have not yet been fully elucidated. Results: We analyze large NF1 deletions with non-recurrent breakpoints as a model to investigate the full spectrum of causative mechanisms, and observe that they are mediated by various DNA double strand break repair mechanisms, as well as aberrant replication. Further, two of the 17 NF1 deletions with non-recurrent breakpoints, identified in unrelated patients, occur in association with the concomitant insertion of SINE/variable number of tandem repeats/Alu (SVA) retrotransposons at the deletion breakpoints. The respective breakpoints are refractory to analysis by standard breakpoint-spanning PCRs and are only identified by means of optimized PCR protocols designed to amplify across GC-rich sequences. The SVA elements are integrated within SUZ12P intron 8 in both patients, and were mediated by target-primed reverse transcription of SVA mRNA intermediates derived from retrotranspositionally active source elements. Both SVA insertions occurred during early postzygotic development and are uniquely associated with large deletions of 1 Mb and 867 kb, respectively, at the insertion sites. Conclusions: Since active SVA elements are abundant in the human genome and the retrotranspositional activity of many SVA source elements is high, SVA insertion-associated large genomic deletions encompassing many hundreds of kilobases could constitute a novel and as yet under-appreciated mechanism underlying large-scale copy number changes in the human genome. PMID:24958239
Genomic Selection Using Extreme Phenotypes and Pre-Selection of SNPs in Large Yellow Croaker (Larimichthys crocea).

Science.gov (United States)

Dong, Linsong; Xiao, Shijun; Chen, Junwei; Wan, Liang; Wang, Zhiyong

2016-10-01

Genomic selection (GS) is an effective method to improve predictive accuracies of genetic values. However, the high cost of genotyping limits the application of this technology in some species. Therefore, it is necessary to find methods to reduce the genotyping costs in genomic selection. Large yellow croaker is one of the most commercially important marine fish species in southeast China and Eastern Asia. In this study, genotyping-by-sequencing was used to construct the libraries for NGS sequencing and to find 29,748 SNPs in the genome. Two traits, eviscerated weight (EW) and the ratio between eviscerated weight and whole body weight (REW), were chosen for study. Two strategies to reduce the costs were proposed: selecting extreme phenotypes (EP) for genotyping in the reference population, or pre-selecting SNPs to construct low-density marker panels in candidates. Three methods of pre-selection of SNPs, i.e., pre-selecting SNPs by absolute effects (SE), by single marker analysis (SMA), and by fixed intervals of sequence number (EL), were studied. The results showed that using EP was a feasible method to save genotyping costs in the reference population. Heritability did not seem to have an obvious influence on the predictive abilities estimated by EP. Using SMA was the most feasible method to save genotyping costs in candidates. In addition, the combination of EP and SMA in genomic selection also showed good results, especially for the REW trait. We also described how to apply the new methods in genomic selection and compared the genotyping costs before and after using the new methods. Our study may offer a reference not only for aquatic genomic breeding but also for genomic prediction in other species, including livestock and plants.
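Two of the cost-saving ideas named above, choosing extreme phenotypes (EP) and picking SNPs at fixed intervals of sequence number (EL), can be sketched in a few lines of Python. This is a loose illustration under stated assumptions: the function names, the tail fraction, and the step size are hypothetical, and the abstract does not give the authors' actual selection thresholds or procedure.

```python
def select_extreme_phenotypes(phenotypes, tail_fraction):
    """Return IDs of individuals in the lowest and highest phenotype tails.

    `phenotypes` maps an individual ID to a trait value; `tail_fraction`
    is the assumed share of the population taken from each tail.
    """
    if not 0 < tail_fraction <= 0.5:
        raise ValueError("tail_fraction must be in (0, 0.5]")
    ranked = sorted(phenotypes, key=phenotypes.get)
    tail_size = max(1, int(len(ranked) * tail_fraction))
    chosen = ranked[:tail_size] + ranked[-tail_size:]
    # Remove duplicates (possible in very small populations), keeping order.
    return list(dict.fromkeys(chosen))


def select_snps_at_fixed_intervals(ordered_snp_ids, step):
    """Keep every `step`-th SNP from a list ordered by sequence number."""
    if step < 1:
        raise ValueError("step must be a positive integer")
    return list(ordered_snp_ids[::step])
```

For example, with six fish and `tail_fraction=0.2`, the first function returns the single lightest and the single heaviest individual. Selection by single marker analysis (SMA) would additionally need a per-SNP association test and is not shown here.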
Finished Genome Sequence of Collimonas arenae Cal35

NARCIS (Netherlands)

Wu, Je-Jia; de Jager, Victor; Deng, Wen-ling; Leveau, Johan

2015-01-01

We announce the finished genome sequence of soil forest isolate Collimonas arenae Cal35, which comprises a 5.6-Mbp chromosome and 41-kb plasmid. The Cal35 genome is the second one published for the bacterial genus Collimonas and represents the first opportunity for high-resolution comparison of
Estimating demographic parameters from large-scale population genomic data using Approximate Bayesian Computation

Directory of Open Access Journals (Sweden)

Li Sen

2012-03-01

Background: The Approximate Bayesian Computation (ABC) approach has been used to infer demographic parameters for numerous species, including humans. However, most applications of ABC still use limited amounts of data, from a small number of loci, compared to the large amount of genome-wide population-genetic data which have become available in the last few years. Results: We evaluated the performance of the ABC approach for three 'population divergence' models - similar to the 'isolation with migration' model - when the data consist of several hundred thousand SNPs typed for multiple individuals, by simulating data from known demographic models. The ABC approach was used to infer demographic parameters of interest and we compared the inferred values to the true parameter values that were used to generate hypothetical "observed" data. For all three case models, the ABC approach inferred most demographic parameters quite well with narrow credible intervals, for example, population divergence times and past population sizes, but some parameters were more difficult to infer, such as population sizes at present and migration rates. We compared the ability of different summary statistics to infer demographic parameters, including haplotype and LD based statistics, and found that the accuracy of the parameter estimates can be improved by combining summary statistics that capture different parts of information in the data. Furthermore, our results suggest that poor choices of prior distributions can in some circumstances be detected using ABC. Finally, increasing the amount of data beyond some hundred loci will substantially improve the accuracy of many parameter estimates using ABC. Conclusions: We conclude that the ABC approach can accommodate realistic genome-wide population genetic data, which may be difficult to analyze with full likelihood approaches, and that the ABC can provide accurate and precise inference of demographic parameters from
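The general idea of ABC is to draw parameters from a prior, simulate data, and keep the draws whose summary statistic lies close to the observed one. The Python sketch below shows the simplest rejection form of that idea. It is a toy under stated assumptions: it infers a single allele frequency from a derived-allele count, whereas the study above used population-divergence models and several summary statistics, and all names, the tolerance, and the number of draws are hypothetical.

```python
import random


def abc_rejection(observed_stat, prior_sampler, simulator, tolerance, n_draws, rng):
    """Return prior draws whose simulated statistic is within `tolerance` of the observation."""
    accepted = []
    for _ in range(n_draws):
        theta = prior_sampler(rng)
        if abs(simulator(theta, rng) - observed_stat) <= tolerance:
            accepted.append(theta)
    return accepted


def simulate_derived_count(allele_freq, rng, n_chromosomes=100):
    """Toy simulator: number of derived alleles among sampled chromosomes."""
    return sum(rng.random() < allele_freq for _ in range(n_chromosomes))


# Toy run: 30 derived alleles observed out of 100, uniform prior on the frequency.
rng = random.Random(0)  # fixed seed so the sketch is reproducible
posterior = abc_rejection(
    observed_stat=30,
    prior_sampler=lambda r: r.random(),
    simulator=simulate_derived_count,
    tolerance=2,
    n_draws=5000,
    rng=rng,
)
posterior_mean = sum(posterior) / len(posterior)
```

The accepted draws approximate the posterior distribution of the parameter. Tightening `tolerance` sharpens the approximation at the cost of fewer accepted draws, which is one reason the choice of summary statistics matters for genome-scale data.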
Linking genes to ecosystem trace gas fluxes in a large-scale model system

Science.gov (United States)

Meredith, L. K.; Cueva, A.; Volkmann, T. H. M.; Sengupta, A.; Troch, P. A.

2017-12-01

Soil microorganisms mediate biogeochemical cycles through biosphere-atmosphere gas exchange with significant impact on atmospheric trace gas composition. Improving process-based understanding of these microbial populations and linking their genomic potential to the ecosystem-scale is a challenge, particularly in soil systems, which are heterogeneous in biodiversity, chemistry, and structure. In oligotrophic systems, such as the Landscape Evolution Observatory (LEO) at Biosphere 2, atmospheric trace gas scavenging may supply critical metabolic needs to microbial communities, thereby promoting tight linkages between microbial genomics and trace gas utilization. This large-scale model system of three initially homogenous and highly instrumented hillslopes facilitates high temporal resolution characterization of subsurface trace gas fluxes at hundreds of sampling points, making LEO an ideal location to study microbe-mediated trace gas fluxes from the gene to ecosystem scales. Specifically, we focus on the metabolism of ubiquitous atmospheric reduced trace gases hydrogen (H2), carbon monoxide (CO), and methane (CH4), which may have wide-reaching impacts on microbial community establishment, survival, and function. Additionally, microbial activity on LEO may facilitate weathering of the basalt matrix, which can be studied with trace gas measurements of carbonyl sulfide (COS/OCS) and carbon dioxide (O-isotopes in CO2), and presents an additional opportunity for gene to ecosystem study. This work will present initial measurements of this suite of trace gases to characterize soil microbial metabolic activity, as well as links between spatial and temporal variability of microbe-mediated trace gas fluxes in LEO and their relation to genomic-based characterization of microbial community structure (phylogenetic amplicons) and genetic potential (metagenomics). Results from the LEO model system will help build understanding of the importance of atmospheric inputs to
Challenges in Whole-Genome Annotation of Pyrosequenced Eukaryotic Genomes

Energy Technology Data Exchange (ETDEWEB)

Kuo, Alan; Grigoriev, Igor

2009-04-17

Pyrosequencing technologies such as 454/Roche and Solexa/Illumina vastly lower the cost of nucleotide sequencing compared to the traditional Sanger method, and thus promise to greatly expand the number of sequenced eukaryotic genomes. However, the new technologies also bring new challenges such as shorter reads and new kinds and higher rates of sequencing errors, which complicate genome assembly and gene prediction. At JGI we are deploying 454 technology for the sequencing and assembly of ever-larger eukaryotic genomes. Here we describe our first whole-genome annotation of a purely 454-sequenced fungal genome that is larger than a yeast (>30 Mbp). The pezizomycotine (filamentous ascomycote) Aspergillus carbonarius belongs to the Aspergillus section Nigri species complex, members of which are significant as platforms for bioenergy and bioindustrial technology, as members of soil microbial communities and players in the global carbon cycle, and as agricultural toxigens. Application of a modified version of the standard JGI Annotation Pipeline has so far predicted ~10k genes. ~12% of these preliminary annotations suffer a potential frameshift error, which is somewhat higher than the ~9% rate in the Sanger-sequenced and conventionally assembled and annotated genome of fellow Aspergillus section Nigri member A. niger. Also, >90% of A. niger genes have potential homologs in the A. carbonarius preliminary annotation. We conclude, and with further annotation and comparative analysis expect to confirm, that 454 sequencing strategies provide a promising substrate for annotation of modestly sized eukaryotic genomes. We will also present results of annotation of a number of other pyrosequenced fungal genomes of bioenergy interest.
Draft Genome Sequences of Pseudomonas fluorescens BS2 and Pusillimonas noertemannii BS8, Soil Bacteria That Cooperate To Degrade the Poly-γ-D-Glutamic Acid Anthrax Capsule

KAUST Repository

Stabler, R. A.

2013-01-24

A mixed culture of Pseudomonas fluorescens BS2 and Pusillimonas noertemannii BS8 degraded poly-γ-d-glutamic acid; when the 2 strains were cultured separately, no hydrolytic activity was apparent. Here we report the draft genome sequences of both soil isolates.
Draft Genome Sequences of Pseudomonas fluorescens BS2 and Pusillimonas noertemannii BS8, Soil Bacteria That Cooperate To Degrade the Poly-γ-d-Glutamic Acid Anthrax Capsule.

Science.gov (United States)

Stabler, Richard A; Negus, David; Pain, Arnab; Taylor, Peter W

2013-01-01

A mixed culture of Pseudomonas fluorescens BS2 and Pusillimonas noertemannii BS8 degraded poly-γ-d-glutamic acid; when the 2 strains were cultured separately, no hydrolytic activity was apparent. Here we report the draft genome sequences of both soil isolates.
Draft Genome Sequences of Pseudomonas fluorescens BS2 and Pusillimonas noertemannii BS8, Soil Bacteria That Cooperate To Degrade the Poly-γ-D-Glutamic Acid Anthrax Capsule

KAUST Repository

Stabler, R. A.; Negus, D.; Pain, Arnab; Taylor, P. W.

2013-01-01

A mixed culture of Pseudomonas fluorescens BS2 and Pusillimonas noertemannii BS8 degraded poly-γ-d-glutamic acid; when the 2 strains were cultured separately, no hydrolytic activity was apparent. Here we report the draft genome sequences of both soil isolates.
Comparative genomics of 12 strains of Erwinia amylovora identifies a pan-genome with a large conserved core.

Directory of Open Access Journals (Sweden)

Rachel A Mann

The plant pathogen Erwinia amylovora can be divided into two host-specific groupings: strains infecting a broad range of hosts within the Rosaceae subfamily Spiraeoideae (e.g., Malus, Pyrus, Crataegus, Sorbus) and strains infecting Rubus (raspberries and blackberries). Comparative genomic analysis of 12 strains representing distinct populations (e.g., geographic, temporal, host origin) of E. amylovora was used to describe the pan-genome of this major pathogen. The pan-genome contains 5751 coding sequences and is highly conserved relative to other phytopathogenic bacteria, comprising on average 89% conserved, core genes. The chromosomes of Spiraeoideae-infecting strains were highly homogeneous, while greater genetic diversity was observed between Spiraeoideae- and Rubus-infecting strains (and among individual Rubus-infecting strains), the majority of which was attributed to variable genomic islands. Based on genomic distance scores and phylogenetic analysis, the Rubus-infecting strain ATCC BAA-2158 was genetically more closely related to the Spiraeoideae-infecting strains of E. amylovora than it was to the other Rubus-infecting strains. Analysis of the accessory genomes of Spiraeoideae- and Rubus-infecting strains has identified putative host-specific determinants including variation in the effector protein HopX1(Ea) and a putative secondary metabolite pathway only present in Rubus-infecting strains.
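The pan-genome and core-genome figures quoted above come down to set operations on per-strain gene content: the pan-genome is the union of all gene families and the core genome is their intersection. The Python sketch below illustrates that bookkeeping. It assumes genes have already been clustered into families with shared IDs, which is the hard part and is not shown, and the function names and toy strain data in the usage note are hypothetical.

```python
def pan_and_core_genome(gene_families_by_strain):
    """Return (pan_genome, core_genome) as sets of gene-family IDs.

    `gene_families_by_strain` maps a strain name to the gene-family IDs
    found in that strain's genome.
    """
    family_sets = [set(families) for families in gene_families_by_strain.values()]
    if not family_sets:
        raise ValueError("at least one strain is required")
    pan_genome = set().union(*family_sets)
    core_genome = set.intersection(*family_sets)
    return pan_genome, core_genome


def mean_core_fraction(gene_families_by_strain):
    """Average, over strains, of the share of each genome that is core."""
    _, core_genome = pan_and_core_genome(gene_families_by_strain)
    fractions = [
        len(core_genome) / len(set(families))
        for families in gene_families_by_strain.values()
    ]
    return sum(fractions) / len(fractions)
```

With three toy strains carrying families {1, 2, 3, 4}, {1, 2, 3, 5} and {1, 2, 3}, the pan-genome has five families, the core has three, and the mean core fraction is (0.75 + 0.75 + 1.0) / 3. Whether the 89% figure in the abstract was computed per genome in this way is an assumption.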
Large-Scale Agricultural Management and Soil Meso- and Macrofauna Conservation in the Argentine Pampas

Directory of Open Access Journals (Sweden)

José Camilo Bedano

2016-07-01

Soil is the most basic resource for sustainable agricultural production; it promotes water quality, is a key component of the biogeochemical cycles and hosts a huge diversity of organisms. However, we are not paying enough attention to soil degradation produced by land use. Modern agriculture has been successful in increasing yields but has also caused extensive environmental damage, particularly soil degradation. In the Argentine Pampas, agriculturization reached a peak with the generalized use of the no-till technological package: genetically modified soybeans tolerant to glyphosate, no-till, glyphosate, and inorganic fertilizers. This phenomenon has been widely spread in the country; the no-till package has been applied in large areas and has been used by tenants on 60%–70% of cultivated lands. Thus, those who were involved in developing management practices may not be the same as those who will face degradation issues related to those practices. Indeed, most evidence reviewed in this paper suggests that the most widely distributed practices in the Pampas region are actually producing severe soil degradation. Biological degradation is particularly important because soil biota is involved in numerous soil processes on which soil functioning relies, affecting soil fertility and productivity. For example, soil meso- and macrofauna are especially important in nutrient cycling and in soil structure formation and maintenance, and they are key components of the network that links microbial process to the scale of fields and landscapes where ecosystem services are produced. However, the knowledge of the impact of different agricultural managements on soil meso- and macrofauna in Pampas agroecosystems is far from conclusive at this stage. The reason for this lack of definite conclusions is that this area has been given less attention than in other parts of the world; the response of soil fauna to agricultural practices is complex and taxa
Large-scale seismic test for soil-structure interaction research in Hualien, Taiwan

International Nuclear Information System (INIS)

Ueshima, T.; Kokusho, T.; Okamoto, T.

1995-01-01

It is important to evaluate dynamic soil-structure interaction more accurately in the aseismic design of important facilities such as nuclear power plants. A large-scale model structure with about 1/4th of commercial nuclear power plants was constructed on the gravelly layers in seismically active Hualien, Taiwan. This international joint project is called 'the Hualien LSST Project', where 'LSST' is short for Large-Scale Seismic Test. In this paper, research tasks and responsibilities, the process of the construction work and research tasks along the time-line, main results obtained up to now, and so on in this Project are described. (J.P.N.)

The role of large arthropods in the development of halomorphic soils in the south of Siberia

Science.gov (United States)

Mordkovich, V. G.; Lyubechanskii, I. I.

2017-06-01

Soil sequences along catenas crossing the peripheral parts of shallow-water drying lakes in the south of Siberia have been studied. They include the sulfidic and typical playa (sor) solonchaks (Gleyic Solonchaks), playa solonchak over the buried solonetz (Gleyic Solonchak Thapto-Solonetz), shallow solonetz-solonchak (Salic Solonetz), and solonetzic and solonchakous chernozemic-meadow soil (Luvic Gleyic Chernozem (Sodic, Salic)). This spatial sequence also represents a series of historical stages of the development of halomorphic soils: the amphibian, hydromorphic, semihydromorphic, and automorphic-paleohydromorphic stages. During all of them, the biogenic component plays a significant role in the matter budget of halomorphic soils. The diversity, number, and functional activity of large insects and spiders are particularly important. Their total abundance in the course of transformation of the halomorphic soils decreases from several thousand to about 100 specimens/(m² day), whereas their species diversity increases from 17 to 45 species. Changes in the functional structure of the soil zoocenosis and its impact on the character and intensity of pedogenetic processes can be considered driving forces of the transformation of hydromorphic soils. This is ensured by the sequential alteration of the groups of invertebrates with different types of cenotic strategy and different mechanisms of adaptation to biotic and abiotic components of the soil in the course of the development of the soil zoocenosis.
Genome plasticity and systems evolution in Streptomyces

Science.gov (United States)

2012-01-01

Background Streptomycetes are filamentous soil-dwelling bacteria. They are best known as the producers of a great variety of natural products such as antibiotics, antifungals, antiparasitics, and anticancer agents and the decomposers of organic substances for carbon recycling. They are also model organisms for the studies of gene regulatory networks, morphological differentiation, and stress response. The availability of sets of genomes from closely related Streptomyces strains makes it possible to assess the mechanisms underlying genome plasticity and systems adaptation. Results We present the results of a comprehensive analysis of the genomes of five Streptomyces species with distinct phenotypes. These streptomycetes have a pan-genome comprised of 17,362 orthologous families which includes 3,096 components in the core genome, 5,066 components in the dispensable genome, and 9,200 components that are uniquely present in only one species. The core genome makes up about 33%-45% of each genome repertoire. It contains important genes for Streptomyces biology including those involved in gene regulation, secretion, secondary metabolism and morphological differentiation. Abundant duplicate genes have been identified, with 4%-11% of the whole genomes composed of lineage-specific expansions (LSEs), suggesting that frequent gene duplication or lateral gene transfer events play a role in shaping the genome diversification within this genus. Two patterns of expansion, single gene expansion and chromosome block expansion are observed, representing different scales of duplication. Conclusions Our results provide a catalog of genome components and their potential functional roles in gene regulatory networks and metabolic networks. The core genome components reveal the minimum requirement for streptomycetes to sustain a successful lifecycle in the soil environment, reflecting the effects of both genome evolution and environmental stress acting upon the expressed phenotypes. A
Component identification of electron transport chains in curdlan-producing Agrobacterium sp. ATCC 31749 and its genome-specific prediction using comparative genome and phylogenetic trees analysis.

Science.gov (United States)

Zhang, Hongtao; Setubal, Joao Carlos; Zhan, Xiaobei; Zheng, Zhiyong; Yu, Lijun; Wu, Jianrong; Chen, Dingqiang

2011-06-01

Agrobacterium sp. ATCC 31749 (formerly named Alcaligenes faecalis var. myxogenes) is a non-pathogenic aerobic soil bacterium used in large-scale biotechnological production of curdlan. However, little is known about its genomic information. Partial DNA sequences of electron transport chain (ETC) protein genes were obtained in order to understand the components of the ETC and their genomic specificity in Agrobacterium sp. ATCC 31749. Degenerate primers were designed according to ETC conserved sequences in other reported species. Partial DNA sequences of ETC genes in Agrobacterium sp. ATCC 31749 were cloned by the PCR method using degenerate primers. Based on comparative genomic analysis, nine electron transport elements were ascertained, including NADH ubiquinone oxidoreductase, succinate dehydrogenase complex II, complex III, cytochrome c, ubiquinone biosynthesis protein ubiB, cytochrome d terminal oxidase, cytochrome bo terminal oxidase, cytochrome cbb3-type terminal oxidase and cytochrome caa3-type terminal oxidase. Similarity and phylogenetic analyses of these genes revealed that among fully sequenced Agrobacterium species, Agrobacterium sp. ATCC 31749 is closest to Agrobacterium tumefaciens C58. Based on these results a comprehensive ETC model for Agrobacterium sp. ATCC 31749 is proposed.
High-throughput genome sequencing of two Listeria monocytogenes clinical isolates during a large foodborne outbreak

Directory of Open Access Journals (Sweden)

Trout-Yakel Keri M

2010-02-01

Background A large, multi-province outbreak of listeriosis associated with ready-to-eat meat products contaminated with Listeria monocytogenes serotype 1/2a occurred in Canada in 2008. Subtyping of outbreak-associated isolates using pulsed-field gel electrophoresis (PFGE) revealed two similar but distinct AscI PFGE patterns. High-throughput pyrosequencing of two L. monocytogenes isolates was used to rapidly provide the genome sequence of the primary outbreak strain and to investigate the extent of genetic diversity associated with a change of a single restriction enzyme fragment during PFGE. Results The chromosomes were collinear, but differences included 28 single nucleotide polymorphisms (SNPs) and three indels, including a 33 kbp prophage that accounted for the observed difference in AscI PFGE patterns. The distribution of these traits was assessed within further clinical, environmental and food isolates associated with the outbreak, and this comparison indicated that three distinct, but highly related strains may have been involved in this nationwide outbreak. Notably, these two isolates were found to harbor a 50 kbp putative mobile genomic island encoding translocation and efflux functions that has not been observed in other Listeria genomes. Conclusions High-throughput genome sequencing provided a more detailed real-time assessment of genetic traits characteristic of the outbreak strains than could be achieved with routine subtyping methods. This study confirms that the latest generation of DNA sequencing technologies can be applied during high priority public health events, and laboratories need to prepare for this inevitability and assess how to properly analyze and interpret whole genome sequences in the context of molecular epidemiology.
Novel, non-symbiotic isolates of Neorhizobium from a dryland agricultural soil

Directory of Open Access Journals (Sweden)

Amalia Soenens

2018-05-01

Semi-selective enrichment, followed by PCR screening, resulted in the successful direct isolation of fast-growing Rhizobia from a dryland agricultural soil. Over 50% of these isolates belong to the genus Neorhizobium, as concluded from partial rpoB and near-complete 16S rDNA sequence analysis. Further genotypic and genomic analysis of five representative isolates confirmed that they form a coherent group within Neorhizobium, closer to N. galegae than to the remaining Neorhizobium species, but clearly differentiated from the former, and constituting at least one new genomospecies within Neorhizobium. All the isolates lacked nod and nif symbiotic genes but contained a repABC replication/maintenance region, characteristic of rhizobial plasmids, within large contigs from their draft genome sequences. These repABC sequences were related, but not identical, to repABC sequences found in symbiotic plasmids from N. galegae, suggesting that the non-symbiotic isolates have the potential to harbor symbiotic plasmids. This is the first report of non-symbiotic members of Neorhizobium from soil.
Genome Sequence of Azospirillum brasilense CBG497 and Comparative Analyses of Azospirillum Core and Accessory Genomes provide Insight into Niche Adaptation

Science.gov (United States)

Wisniewski-Dyé, Florence; Lozano, Luis; Acosta-Cruz, Erika; Borland, Stéphanie; Drogue, Benoît; Prigent-Combaret, Claire; Rouy, Zoé; Barbe, Valérie; Mendoza Herrera, Alberto; González, Victor; Mavingui, Patrick

2012-01-01

Bacteria of the genus Azospirillum colonize roots of important cereals and grasses, and promote plant growth by several mechanisms, notably phytohormone synthesis. The genomes of several Azospirillum strains belonging to different species, isolated from various host plants and locations, were recently sequenced and published. In this study, an additional genome of an A. brasilense strain, isolated from maize grown on an alkaline soil in the northeast of Mexico, strain CBG497, was obtained. Comparative genomic analyses were performed on this new genome and three other genomes (A. brasilense Sp245, A. lipoferum 4B and Azospirillum sp. B510). The Azospirillum core genome was established and consists of 2,328 proteins, representing between 30% and 38% of the total encoded proteins within a genome. It is mainly chromosomally-encoded and contains 74% of genes of ancestral origin shared with some aquatic relatives. The non-ancestral part of the core genome is enriched in genes involved in signal transduction, in transport and in metabolism of carbohydrates and amino-acids, and in surface-property features linked to adaptation in fluctuating environments, such as soil and rhizosphere. Many genes involved in colonization of plant roots, plant-growth promotion (such as those involved in phytohormone biosynthesis), and properties involved in rhizosphere adaptation (such as catabolism of phenolic compounds, uptake of iron) are restricted to a particular strain and/or species, strongly suggesting niche-specific adaptation. PMID:24705077
Genome Sequence of Azospirillum brasilense CBG497 and Comparative Analyses of Azospirillum Core and Accessory Genomes provide Insight into Niche Adaptation

Directory of Open Access Journals (Sweden)

Victor González

2012-09-01

Bacteria of the genus Azospirillum colonize roots of important cereals and grasses, and promote plant growth by several mechanisms, notably phytohormone synthesis. The genomes of several Azospirillum strains belonging to different species, isolated from various host plants and locations, were recently sequenced and published. In this study, an additional genome of an A. brasilense strain, isolated from maize grown on an alkaline soil in the northeast of Mexico, strain CBG497, was obtained. Comparative genomic analyses were performed on this new genome and three other genomes (A. brasilense Sp245, A. lipoferum 4B and Azospirillum sp. B510). The Azospirillum core genome was established and consists of 2,328 proteins, representing between 30% and 38% of the total encoded proteins within a genome. It is mainly chromosomally-encoded and contains 74% of genes of ancestral origin shared with some aquatic relatives. The non-ancestral part of the core genome is enriched in genes involved in signal transduction, in transport and in metabolism of carbohydrates and amino-acids, and in surface-property features linked to adaptation in fluctuating environments, such as soil and rhizosphere. Many genes involved in colonization of plant roots, plant-growth promotion (such as those involved in phytohormone biosynthesis), and properties involved in rhizosphere adaptation (such as catabolism of phenolic compounds, uptake of iron) are restricted to a particular strain and/or species, strongly suggesting niche-specific adaptation.
Complete genome sequence analysis of Nocardia brasiliensis HUJEG-1 reveals a saprobic lifestyle and the genes needed for human pathogenesis.

Science.gov (United States)

Vera-Cabrera, Lucio; Ortiz-Lopez, Rocio; Elizondo-Gonzalez, Ramiro; Ocampo-Candiani, Jorge

2013-01-01

Nocardia brasiliensis is an important etiologic agent of mycetoma. These bacteria live as a saprobe in soil or organic material and enter the tissue via minor trauma. Mycetoma is characterized by tumefaction and the production of fistula and abscesses, with no spontaneous cure. By using mass sequencing, we determined the complete genomic nucleotide sequence of the bacteria. According to our data, the genome is a circular chromosome 9,436,348-bp long with 68% G+C content that encodes 8,414 proteins. We observed orthologs for virulence factors, a higher number of genes involved in lipid biosynthesis and catabolism, and gene clusters for the synthesis of bioactive compounds, such as antibiotics, terpenes, and polyketides. An in silico analysis of the sequence supports the conclusion that the bacteria acquired diverse genes by horizontal transfer from other soil bacteria, even from eukaryotic organisms. The genome composition reflects the evolution of bacteria via the acquisition of a large amount of DNA, which allows it to survive in new ecological niches, including humans.
Complete genome sequence analysis of Nocardia brasiliensis HUJEG-1 reveals a saprobic lifestyle and the genes needed for human pathogenesis.

Directory of Open Access Journals (Sweden)

Lucio Vera-Cabrera

Nocardia brasiliensis is an important etiologic agent of mycetoma. These bacteria live as a saprobe in soil or organic material and enter the tissue via minor trauma. Mycetoma is characterized by tumefaction and the production of fistula and abscesses, with no spontaneous cure. By using mass sequencing, we determined the complete genomic nucleotide sequence of the bacteria. According to our data, the genome is a circular chromosome 9,436,348-bp long with 68% G+C content that encodes 8,414 proteins. We observed orthologs for virulence factors, a higher number of genes involved in lipid biosynthesis and catabolism, and gene clusters for the synthesis of bioactive compounds, such as antibiotics, terpenes, and polyketides. An in silico analysis of the sequence supports the conclusion that the bacteria acquired diverse genes by horizontal transfer from other soil bacteria, even from eukaryotic organisms. The genome composition reflects the evolution of bacteria via the acquisition of a large amount of DNA, which allows it to survive in new ecological niches, including humans.
The Genome of the Toluene-Degrading Pseudomonas veronii Strain 1YdBTEX2 and Its Differential Gene Expression in Contaminated Sand.

Directory of Open Access Journals (Sweden)

Marian Morales

The natural restoration of soils polluted by aromatic hydrocarbons such as benzene, toluene, ethylbenzene and m- and p-xylene (BTEX) may be accelerated by inoculation of specific biodegraders (bioaugmentation). Bioaugmentation mainly involves introducing bacteria that deploy their metabolic properties and adaptation potential to survive and propagate in the contaminated environment by degrading the pollutant. In order to better understand the adaptive response of cells during a transition to contaminated material, we analyzed here the genome and short-term (1 h) changes in genome-wide gene expression of the BTEX-degrading bacterium Pseudomonas veronii 1YdBTEX2 in non-sterile soil and liquid medium, both in presence or absence of toluene. We obtained a gapless genome sequence of P. veronii 1YdBTEX2 covering three individual replicons with a total size of 8 Mb, two of which are largely unrelated to current known bacterial replicons. One-hour exposure to toluene, both in soil and liquid, triggered massive transcription (up to 208-fold induction) of multiple gene clusters, such as toluene degradation pathway(s), chemotaxis and toluene efflux pumps. This clearly underlines their key role in the adaptive response to toluene. In comparison to liquid medium, cells in soil drastically changed expression of genes involved in membrane functioning (e.g., lipid composition, lipid metabolism, cell fatty acid synthesis), osmotic stress response (e.g., polyamine or trehalose synthesis, uptake of potassium) and putrescine metabolism, highlighting the immediate response mechanisms of P. veronii 1YdBTEX2 for successful establishment in polluted soil.
Environmental genomics of "Haloquadratum walsbyi" in a saltern crystallizer indicates a large pool of accessory genes in an otherwise coherent species

Directory of Open Access Journals (Sweden)

Bolhuis Henk

2006-07-01

Background Mature saturated brine (crystallizers) communities are largely dominated (>80% of cells) by the square halophilic archaeon "Haloquadratum walsbyi". The recent cultivation of the strain HBSQ001 and the sequencing of its genome allows comparison with the metagenome of this taxonomically simplified environment. Similar studies carried out in other extreme environments have revealed very little diversity in gene content among the cell lineages present. Results The metagenome of the microbial community of a crystallizer pond has been analyzed by end sequencing a 2000 clone fosmid library and comparing the sequences obtained with the genome sequence of "Haloquadratum walsbyi". The genome of the sequenced strain was retrieved nearly complete within this environmental DNA library. However, many ORFs that could be ascribed to the "Haloquadratum" metapopulation by common genome characteristics or scaffolding to the strain genome were not present in the specific sequenced isolate. Particularly, three regions of the sequenced genome were associated with multiple rearrangements and the presence of different genes from the metapopulation. Many transposition and phage related genes were found within this pool which, together with the associated atypical GC content in these areas, supports lateral gene transfer mediated by these elements as the most probable genetic cause of this variability. Additionally, these sequences were highly enriched in putative regulatory and signal transduction functions. Conclusion These results point to a large pan-genome (total gene repertoire of the genus/species) even in this highly specialized extremophile and at a single geographic location. The extensive gene repertoire is what might be expected of a population that exploits a diverse nutrient pool, resulting from the degradation of biomass produced at lower salinities.
Genomics and the human genome project: implications for psychiatry

OpenAIRE

Kelsoe, J R

2004-01-01

In the past decade the Human Genome Project has made extraordinary strides in understanding of fundamental human genetics. The complete human genetic sequence has been determined, and the chromosomal location of almost all human genes identified. Presently, a large international consortium, the HapMap Project, is working to identify a large portion of genetic variation in different human populations and the structure and relationship of these variants to each other. The Human Genome Project h...
A large scale GIS geodatabase of soil parameters supporting the modeling of conservation practice alternatives in the United States

Science.gov (United States)

Water quality modeling requires across-scale support of combined digital soil elements and simulation parameters. This paper presents the unprecedented development of a large spatial scale (1:250,000) ArcGIS geodatabase coverage designed as a functional repository of soil-parameters for modeling an...
Genome size analyses of Pucciniales reveal the largest fungal genomes.

Science.gov (United States)

Tavares, Sílvia; Ramos, Ana Paula; Pires, Ana Sofia; Azinheira, Helena G; Caldeirinha, Patrícia; Link, Tobias; Abranches, Rita; Silva, Maria do Céu; Voegele, Ralf T; Loureiro, João; Talhinhas, Pedro

2014-01-01

Rust fungi (Basidiomycota, Pucciniales) are biotrophic plant pathogens which exhibit diverse complexities in their life cycles and host ranges. The completion of genome sequencing of a few rust fungi has revealed the occurrence of large genomes. Sequencing efforts for other rust fungi have been hampered by uncertainty concerning their genome sizes. Flow cytometry was recently applied to estimate the genome size of a few rust fungi, and confirmed the occurrence of large genomes in this order (averaging 225.3 Mbp, while the average for Basidiomycota was 49.9 Mbp and was 37.7 Mbp for all fungi). In this work, we have used an innovative and simple approach to simultaneously isolate nuclei from the rust and its host plant in order to estimate the genome size of 30 rust species by flow cytometry. Genome sizes varied over 10-fold, from 70 to 893 Mbp, with an average genome size value of 380.2 Mbp. Compared to the genome sizes of over 1800 fungi, Gymnosporangium confusum possesses the largest fungal genome ever reported (893.2 Mbp). Moreover, even the smallest rust genome determined in this study is larger than the vast majority of fungal genomes (94%). The average genome size of the Pucciniales is now 305.5 Mbp, while the average Basidiomycota genome size has shifted to 70.4 Mbp and the average for all fungi has reached 44.2 Mbp. Although no correlation could be drawn between the genome sizes, the phylogenomics or the life cycle of rust fungi, it is interesting to note that rusts with Fabaceae hosts present genomes clearly larger than those with Poaceae hosts. Although this study comprises only a small fraction of the more than 7000 rust species described, it seems already evident that the Pucciniales represent a group where genome size expansion could be a common characteristic. This is in sharp contrast to sister taxa, placing this order in a relevant position in fungal genomics research.
Dramatic improvement in genome assembly achieved using doubled-haploid genomes.

Science.gov (United States)

Zhang, Hong; Tan, Engkong; Suzuki, Yutaka; Hirose, Yusuke; Kinoshita, Shigeharu; Okano, Hideyuki; Kudoh, Jun; Shimizu, Atsushi; Saito, Kazuyoshi; Watabe, Shugo; Asakawa, Shuichi

2014-10-27

Improvement in de novo assembly of large genomes is still to be desired. Here, we improved draft genome sequence quality by employing doubled-haploid individuals. We sequenced wildtype and doubled-haploid Takifugu rubripes genomes, under the same conditions, using the Illumina platform and assembled contigs with SOAPdenovo2. We observed 5.4-fold and 2.6-fold improvement in the sizes of the N50 contig and scaffold of doubled-haploid individuals, respectively, compared to the wildtype, indicating that the use of a doubled-haploid genome aids in accurate genome analysis.
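The N50 figures quoted in the record above follow the usual definition: the length L such that contigs (or scaffolds) of length L or more together cover at least half of the assembly. A minimal Python sketch of that calculation is given below; the contig lengths are made-up illustrative values, not data from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("need at least one length")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical contig lengths (arbitrary units) for two assemblies.
fragmented = [80, 70, 50, 40, 30, 20, 10]
contiguous = [200, 60, 40]
print(n50(fragmented), n50(contiguous))
print("fold improvement:", n50(contiguous) / n50(fragmented))
```

A fold improvement such as the 5.4-fold contig N50 gain reported in the abstract is simply the ratio of the two N50 values computed this way.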
Large-Scale Genomic Analysis of Codon Usage in Dengue Virus and Evaluation of Its Phylogenetic Dependence

Science.gov (United States)

Lara-Ramírez, Edgar E.; Salazar, Ma Isabel; López-López, María de Jesús; Salas-Benito, Juan Santiago; Sánchez-Varela, Alejandro

2014-01-01

The increasing number of dengue virus (DENV) genome sequences available allows identifying the contributing factors to DENV evolution. In the present study, the codon usage in serotypes 1–4 (DENV1–4) has been explored for 3047 sequenced genomes using different statistical methods. The correlation analysis of total GC content (GC) with GC content at the three nucleotide positions of codons (GC1, GC2, and GC3) as well as the effective number of codons (ENC, ENCp) versus GC3 plots revealed mutational bias and purifying selection pressures as the major forces influencing the codon usage, but with distinct pressure on specific nucleotide positions in the codon. The correspondence analysis (CA) and clustering analysis on relative synonymous codon usage (RSCU) within each serotype showed similar clustering patterns to the phylogenetic analysis of nucleotide sequences for DENV1–4. These clustering patterns are strongly related to the virus geographic origin. The phylogenetic dependence analysis also suggests that stabilizing selection acts on the codon usage bias. Our large-scale analysis reveals new features of DENV genomic evolution. PMID:25136631
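Two of the statistics named in the record above, GC3 and RSCU, can be computed directly from a coding sequence. The Python sketch below assumes the standard genetic code and an in-frame coding sequence. The toy sequence is hypothetical, and ENC/ENCp are not implemented. RSCU for a codon is its observed count divided by the mean count over all synonymous codons for the same amino acid.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codons(cds):
    """Split an in-frame coding sequence into complete codons."""
    cds = cds.upper().replace("U", "T")
    return [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]


def gc3(cds):
    """Fraction of codons whose third position is G or C."""
    thirds = [c[2] for c in codons(cds)]
    return sum(base in "GC" for base in thirds) / len(thirds)


def rscu(cds):
    """Relative synonymous codon usage for every amino acid that occurs."""
    counts = Counter(c for c in codons(cds) if c in CODON_TABLE)
    families = {}
    for codon, aa in CODON_TABLE.items():
        if aa != "*":  # stop codons are left out
            families.setdefault(aa, []).append(codon)
    values = {}
    for family in families.values():
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid absent from this sequence
        for c in family:
            values[c] = counts[c] * len(family) / total
    return values


toy_cds = "ATGGCTGCTGCCGCATAA"  # hypothetical: Met, Ala x4, stop
print(round(gc3(toy_cds), 3))
print({c: v for c, v in rscu(toy_cds).items() if c.startswith("GC")})
```

An RSCU of 1.0 means a codon is used as often as expected under equal usage of its synonyms; values above 1.0 indicate a preferred codon.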
Changes in gene expression during adaptation of Listeria monocytogenes to the soil environment.

Science.gov (United States)

Piveteau, Pascal; Depret, Géraldine; Pivato, Barbara; Garmyn, Dominique; Hartmann, Alain

2011-01-01

Listeria monocytogenes is a ubiquitous opportunistic pathogen responsible for listeriosis. In order to study the processes underlying its ability to adapt to the soil environment, whole-genome arrays were used to analyse transcriptome modifications 15 minutes, 30 minutes and 18 h after inoculation of L. monocytogenes EGD-e in soil extracts. Growth was observed within the first day of incubation and large numbers were still detected in soil extract and soil microcosms one year after the start of the experiment. Major transcriptional reprofiling was observed. Nutrient acquisition mechanisms (phosphoenolpyruvate-dependent phosphotransferase systems and ABC transporters) and enzymes involved in catabolism of specific carbohydrates (β-glucosidases; chitinases) were prevalent. This is consistent with the overrepresentation of the CodY regulon that suggests that in a nutrient depleted environment, L. monocytogenes recruits its extensive repertoire of transporters to acquire a range of substrates for energy production.
Insights into the genome of large sulfur bacteria revealed by analysis of single filaments

DEFF Research Database (Denmark)

Mussmann, Marc; Hu, Fen Z.; Richter, Michael

2007-01-01

Marine sediments are frequently covered by mats of the filamentous Beggiatoa and other large nitrate-storing bacteria that oxidize hydrogen sulfide using either oxygen or nitrate, which they store in intracellular vacuoles. Despite their conspicuous metabolic properties and their biogeochemical... Beggiatoa to overcome non-overlapping availabilities of electron donors and acceptors while gliding between oxic and sulfidic zones. The first look into the genome of these filamentous sulfur-oxidizing bacteria substantially deepens the understanding of their evolution and their contribution to sulfur...
Genome assembly of Chryseobacterium sp. strain IHBB 10212 from glacier top-surface soil in the Indian trans-Himalayas with potential for hydrolytic enzymes

Directory of Open Access Journals (Sweden)

Mohinder Pal

2017-09-01

The cold-active esterases are gaining importance due to their catalytic activities finding applications in chemical industry, food processes and detergent industry as additives, and organic synthesis of unstable compounds as catalysts. In the present study, the complete genome sequence of 4,843,645 bp with an average 34.08% G + C content and 4260 protein-coding genes are reported for the low temperature-active esterase-producing novel strain of Chryseobacterium isolated from the top-surface soil of a glacier in the cold deserts of the Indian trans-Himalayas. The genome contained two plasmids of 16,553 and 11,450 bp with 40.54 and 40.37% G + C contents, respectively. Several genes encoding the hydrolysis of ester linkages of triglycerides into fatty acids and glycerol were predicted in the genome. The annotation also predicted the genes encoding proteases, lipases, amylases, β-glucosidases, endoglucanases and xylanases involved in biotechnological processes. The complete genome sequence of Chryseobacterium sp. strain IHBB 10212 and two plasmids have been deposited under accession numbers CP015199, CP015200 and CP015201 at DDBJ/EMBL/GenBank.
Genome Content and Phylogenomics Reveal both Ancestral and Lateral Evolutionary Pathways in Plant-Pathogenic Streptomyces Species

Science.gov (United States)

Huguet-Tapia, Jose C.; Lefebure, Tristan; Badger, Jonathan H.; Guan, Dongli; Stanhope, Michael J.

2016-01-01

Streptomyces spp. are highly differentiated actinomycetes with large, linear chromosomes that encode an arsenal of biologically active molecules and catabolic enzymes. Members of this genus are well equipped for life in nutrient-limited environments and are common soil saprophytes. Out of the hundreds of species in the genus Streptomyces, a small group has evolved the ability to infect plants. The recent availability of Streptomyces genome sequences, including four genomes of pathogenic species, provided an opportunity to characterize the gene content specific to these pathogens and to study phylogenetic relationships among them. Genome sequencing, comparative genomics, and phylogenetic analysis enabled us to discriminate pathogenic from saprophytic Streptomyces strains; moreover, we calculated that the pathogen-specific genome contains 4,662 orthologs. Phylogenetic reconstruction suggested that Streptomyces scabies and S. ipomoeae share an ancestor but that their biosynthetic clusters encoding the required virulence factor thaxtomin have diverged. In contrast, S. turgidiscabies and S. acidiscabies, two relatively unrelated pathogens, possess highly similar thaxtomin biosynthesis clusters, which suggests that the acquisition of these genes was through lateral gene transfer. PMID:26826232

Segregation distortion causes large-scale differences between male and female genomes in hybrid ants.

Science.gov (United States)

Kulmuni, Jonna; Seifert, Bernhard; Pamilo, Pekka

2010-04-20

Hybridization in isolated populations can lead either to hybrid breakdown and extinction or in some cases to speciation. The basis of hybrid breakdown lies in genetic incompatibilities between diverged genomes. In social Hymenoptera, the consequences of hybridization can differ from those in other animals because of haplodiploidy and sociality. Selection pressures differ between sexes because males are haploid and females are diploid. Furthermore, sociality and group living may allow survival of hybrid genotypes. We show that hybridization in Formica ants has resulted in a stable situation in which the males form two highly divergent gene pools whereas all the females are hybrids. This causes an exceptional situation with large-scale differences between male and female genomes. The genotype differences indicate strong transmission ratio distortion depending on offspring sex, whereby the mother transmits some alleles exclusively to her daughters and other alleles exclusively to her sons. The genetic differences between the sexes and the apparent lack of multilocus hybrid genotypes in males can be explained by recessive incompatibilities which cause the elimination of hybrid males because of their haploid genome. Alternatively, differentiation between sexes could be created by prezygotic segregation into male-forming and female-forming gametes in diploid females. Differentiation between sexes is stable and maintained throughout generations. The present study shows a unique outcome of hybridization and demonstrates that hybridization has the potential of generating evolutionary novelties in animals.
Remediation and recycling of oil-contaminated soil beneath a large above-ground storage tank

International Nuclear Information System (INIS)

Wallace, G.

1994-01-01

While retrofitting a large 30-year-old, above-ground petroleum storage tank, Southern California Edison Company (SCE) discovered that soil beneath the fixed-roof, single-bottom tank was contaminated with 40,000 gallons of No. 6 fuel oil. The steel tank was left in place during the excavation and remediation of the contaminated soil to retain the operating permit. The resulting 2,000 tons of contaminated aggregate was recycled to make asphalt concrete for paving the tank basin and the remaining 5,600 tons of oily soil was thermally treated on site for use as engineered fill at another location. This successful operation provided an economical cleanup solution for a common leakage problem of single-lined tanks and eliminated the long-term liability of Class 1 landfill disposal. As a pro-active environmental effort, this paper shares SCE's site assessment procedure, reveals the engineering method developed to stabilize the tank, discusses the soil treatment technologies used, describes the problems encountered and lessons learned during the cleanup, discloses the costs of the operation, and offers guidelines and recommendations for similar tank remediation. This paper does not describe the work or costs for removing or replacing the tank bottom.
Software engineering the mixed model for genome-wide association studies on large samples.

Science.gov (United States)

Zhang, Zhiwu; Buckler, Edward S; Casstevens, Terry M; Bradbury, Peter J

2009-11-01

Mixed models improve the ability to detect phenotype-genotype associations in the presence of population stratification and multiple levels of relatedness in genome-wide association studies (GWAS), but for large data sets the resource consumption becomes impractical. At the same time, the sample size and number of markers used for GWAS is increasing dramatically, resulting in greater statistical power to detect those associations. The use of mixed models with increasingly large data sets depends on the availability of software for analyzing those models. While multiple software packages implement the mixed model method, no single package provides the best combination of fast computation, ability to handle large samples, flexible modeling and ease of use. Key elements of association analysis with mixed models are reviewed, including modeling phenotype-genotype associations using mixed models, population stratification, kinship and its estimation, variance component estimation, use of best linear unbiased predictors or residuals in place of raw phenotype, improving efficiency and software-user interaction. The available software packages are evaluated, and suggestions made for future software development.
Biological soil crusts accelerate the nitrogen cycle through large NO and HONO emissions in drylands.

Science.gov (United States)

Weber, Bettina; Wu, Dianming; Tamm, Alexandra; Ruckteschler, Nina; Rodríguez-Caballero, Emilio; Steinkamp, Jörg; Meusel, Hannah; Elbert, Wolfgang; Behrendt, Thomas; Sörgel, Matthias; Cheng, Yafang; Crutzen, Paul J; Su, Hang; Pöschl, Ulrich

2015-12-15

Reactive nitrogen species have a strong influence on atmospheric chemistry and climate, tightly coupling the Earth's nitrogen cycle with microbial activity in the biosphere. Their sources, however, are not well constrained, especially in dryland regions accounting for a major fraction of the global land surface. Here, we show that biological soil crusts (biocrusts) are emitters of nitric oxide (NO) and nitrous acid (HONO). Largest fluxes are obtained by dark cyanobacteria-dominated biocrusts, being ∼20 times higher than those of neighboring uncrusted soils. Based on laboratory, field, and satellite measurement data, we obtain a best estimate of ∼1.7 Tg per year for the global emission of reactive nitrogen from biocrusts (1.1 Tg a(-1) of NO-N and 0.6 Tg a(-1) of HONO-N), corresponding to ∼20% of global nitrogen oxide emissions from soils under natural vegetation. On continental scales, emissions are highest in Africa and South America and lowest in Europe. Our results suggest that dryland emissions of reactive nitrogen are largely driven by biocrusts rather than the underlying soil. They help to explain enigmatic discrepancies between measurement and modeling approaches of global reactive nitrogen emissions. As the emissions of biocrusts strongly depend on precipitation events, climate change affecting the distribution and frequency of precipitation may have a strong impact on terrestrial emissions of reactive nitrogen and related climate feedback effects. Because biocrusts also account for a large fraction of global terrestrial biological nitrogen fixation, their impacts should be further quantified and included in regional and global models of air chemistry, biogeochemistry, and climate.
SWAP-Assembler 2: Optimization of De Novo Genome Assembler at Large Scale

Energy Technology Data Exchange (ETDEWEB)

Meng, Jintao; Seo, Sangmin; Balaji, Pavan; Wei, Yanjie; Wang, Bingqiang; Feng, Shengzhong

2016-08-16

In this paper, we analyze and optimize the most time-consuming steps of the SWAP-Assembler, a parallel genome assembler, so that it can scale to a large number of cores for huge genomes with the size of sequencing data ranging from terabytes to petabytes. According to the performance analysis results, the most time-consuming steps are input parallelization, k-mer graph construction, and graph simplification (edge merging). For the input parallelization, the input data is divided into virtual fragments of nearly equal size, and the start position and end position of each fragment are automatically separated at the beginning of the reads. In k-mer graph construction, in order to improve the communication efficiency, the message size is kept constant between any two processes by proportionally increasing the number of nucleotides to the number of processes in the input parallelization step for each round. The memory usage is also decreased because only a small part of the input data is processed in each round. With graph simplification, the communication protocol reduces the number of communication loops from four to two and decreases the idle communication time. The optimized assembler is denoted as SWAP-Assembler 2 (SWAP2). In our experiments using a 1000 Genomes project dataset of 4 terabytes (the largest dataset ever used for assembling) on the supercomputer Mira, the results show that SWAP2 scales to 131,072 cores with an efficiency of 40%. We also compared our work with both the HipMer assembler and the SWAP-Assembler. On the Yanhuang dataset of 300 gigabytes, SWAP2 shows a 3X speedup and 4X better scalability compared with the HipMer assembler and is 45 times faster than the SWAP-Assembler. The SWAP2 software is available at https://sourceforge.net/projects/swapassembler.
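The input-parallelization idea in this abstract (fragments of nearly equal size whose boundaries fall at the beginning of a read) can be illustrated with a minimal Python sketch. This is not SWAP2's code: the function name `fragment_bounds`, the in-memory `bytes` input, and the simplified one-record-per-header FASTA-like layout (every read starts with `>` at the beginning of a line) are assumptions made for illustration only.

```python
def fragment_bounds(data, n_parts):
    """Split `data` into n_parts byte ranges of roughly equal size.

    Hypothetical helper: assumes each read starts with '>' at the
    beginning of a line, so a fragment start is moved forward to the
    next such header and no read is cut in half.
    """
    size = len(data)
    starts = []
    for p in range(n_parts):
        pos = p * size // n_parts          # ideal, equally spaced start
        if pos > 0:
            # Search from pos - 1 so a header sitting exactly at pos is kept.
            nxt = data.find(b"\n>", pos - 1)
            pos = size if nxt == -1 else nxt + 1
        starts.append(pos)
    ends = starts[1:] + [size]             # each fragment ends where the next begins
    return list(zip(starts, ends))


reads = b">r1\nACGT\n>r2\nGGCC\n>r3\nTTAA\n>r4\nCCGG\n"
for start, end in fragment_bounds(reads, 3):
    print(start, end, reads[start:end])
```

Because every fragment ends exactly where the next one starts, the fragments cover the input once and can be handed to independent workers. A real FASTQ reader would need a stricter header test, since `@` can also appear as a quality character.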
Genome sequencing and annotation of Serratia sp. strain TEL.

Science.gov (United States)

Lephoto, Tiisetso E; Gray, Vincent M

2015-12-01

We present the annotation of the draft genome sequence of Serratia sp. strain TEL (GenBank accession number KP711410). This organism was isolated from entomopathogenic nematode Oscheius sp. strain TEL (GenBank accession number KM492926) collected from grassland soil and has a genome size of 5,000,541 bp and 542 subsystems. The genome sequence can be accessed at DDBJ/EMBL/GenBank under the accession number LDEG00000000.
Genome sequencing and annotation of Serratia sp. strain TEL

Directory of Open Access Journals (Sweden)

Tiisetso E. Lephoto

2015-12-01

We present the annotation of the draft genome sequence of Serratia sp. strain TEL (GenBank accession number KP711410). This organism was isolated from entomopathogenic nematode Oscheius sp. strain TEL (GenBank accession number KM492926) collected from grassland soil and has a genome size of 5,000,541 bp and 542 subsystems. The genome sequence can be accessed at DDBJ/EMBL/GenBank under the accession number LDEG00000000.
Genome sequencing and annotation of Serratia sp. strain TEL

OpenAIRE

Lephoto, Tiisetso E.; Gray, Vincent M.

2015-01-01

We present the annotation of the draft genome sequence of Serratia sp. strain TEL (GenBank accession number KP711410). This organism was isolated from entomopathogenic nematode Oscheius sp. strain TEL (GenBank accession number KM492926) collected from grassland soil and has a genome size of 5,000,541 bp and 542 subsystems. The genome sequence can be accessed at DDBJ/EMBL/GenBank under the accession number LDEG00000000.
Comparison of HapMap and 1000 Genomes Reference Panels in a Large-Scale Genome-Wide Association Study

DEFF Research Database (Denmark)

de Vries, Paul S; Sabater-Lleal, Maria; Chasman, Daniel I

2017-01-01

An increasing number of genome-wide association (GWA) studies are now using the higher resolution 1000 Genomes Project reference panel (1000G) for imputation, with the expectation that 1000G imputation will lead to the discovery of additional associated loci when compared to HapMap imputation. In...
Physical mapping of a large plant genome using global high-information-content-fingerprinting: the distal region of the wheat ancestor Aegilops tauschii chromosome 3DS

Directory of Open Access Journals (Sweden)

You Frank M

2010-06-01

Background: Physical maps employing libraries of bacterial artificial chromosome (BAC) clones are essential for comparative genomics and sequencing of large and repetitive genomes such as those of the hexaploid bread wheat. The diploid ancestor of the D-genome of hexaploid wheat (Triticum aestivum), Aegilops tauschii, is used as a resource for wheat genomics. The barley diploid genome also provides a good model for the Triticeae and T. aestivum since it is only slightly larger than the ancestor wheat D genome. Gene co-linearity between the grasses can be exploited by extrapolating from rice and Brachypodium distachyon to Ae. tauschii or barley, and then to wheat. Results: We report the use of Ae. tauschii for the construction of the physical map of a large distal region of chromosome arm 3DS. A physical map of 25.4 Mb was constructed by anchoring BAC clones of Ae. tauschii with 85 ESTs on the Ae. tauschii and barley genetic maps. The 24 contigs were aligned to the rice and B. distachyon genomic sequences and a high-density SNP genetic map of barley. As expected, the mapped region is highly collinear to the orthologous chromosome 1 in rice, chromosome 2 in B. distachyon and chromosome 3H in barley. However, the chromosome scale of the comparative maps presented provides new insights into grass genome organization. The disruptions of the Ae. tauschii-rice and Ae. tauschii-Brachypodium syntenies were identical. We observed chromosomal rearrangements between Ae. tauschii and barley. The comparison of Ae. tauschii physical and genetic maps showed that the recombination rate across the region dropped from 2.19 cM/Mb in the distal region to 0.09 cM/Mb in the proximal region. The size of the gaps between contigs was evaluated by comparing the recombination rate along the map with the local recombination rates calculated on single contigs. Conclusions: The physical map reported here is the first physical map using fingerprinting of a complete
Rapid monitoring of soil, smears, and air dusts by direct large-area alpha spectrometry

International Nuclear Information System (INIS)

Sill, C.W.

1992-01-01

Experimental conditions to permit rapid monitoring of soils, smears, and air dusts for transuranic (TRU) radionuclides under field conditions are described. The monitoring technique involves direct measurement of alpha emitters by alpha spectrometry using a large-area detector to identify and quantify the radionuclides present. The direct alpha spectrometry employs a circular gridded ionization chamber 35 cm in diameter which accommodates either a circular sample holder 25 cm in diameter or a rectangular one 20 by 25 cm (8 by 10 in.). Soils or settled dusts are finely ground, suspended in 30% ethanol, and sprayed onto a 25-cm stainless steel dish. Air dusts are collected with a high-volume sampler onto 20- by 25-cm membrane filters. Removable contamination is collected from surfaces onto a 20- by 25-cm filter using an 18-cm (7-in.) paint roller to hold the large filter in contact with the surface during sample collection. All three types of samples are then counted directly in the alpha spectrometer and no other sample preparation is necessary. Some results obtained are described
Methylation Sensitive Amplification Polymorphism Sequencing (MSAP-Seq)-A Method for High-Throughput Analysis of Differentially Methylated CCGG Sites in Plants with Large Genomes.

Science.gov (United States)

Chwialkowska, Karolina; Korotko, Urszula; Kosinska, Joanna; Szarejko, Iwona; Kwasniewski, Miroslaw

2017-01-01

Epigenetic mechanisms, including histone modifications and DNA methylation, mutually regulate chromatin structure, maintain genome integrity, and affect gene expression and transposon mobility. Variations in DNA methylation within plant populations, as well as methylation in response to internal and external factors, are of increasing interest, especially in the crop research field. Methylation Sensitive Amplification Polymorphism (MSAP) is one of the most commonly used methods for assessing DNA methylation changes in plants. This method involves gel-based visualization of PCR fragments from selectively amplified DNA that are cleaved using methylation-sensitive restriction enzymes. In this study, we developed and validated a new method based on the conventional MSAP approach called Methylation Sensitive Amplification Polymorphism Sequencing (MSAP-Seq). We improved the MSAP-based approach by replacing the conventional separation of amplicons on polyacrylamide gels with direct, high-throughput sequencing using Next Generation Sequencing (NGS) and automated data analysis. MSAP-Seq allows for global sequence-based identification of changes in DNA methylation. This technique was validated in Hordeum vulgare. However, MSAP-Seq can be straightforwardly implemented in different plant species, including crops with large, complex and highly repetitive genomes. The incorporation of high-throughput sequencing into MSAP-Seq enables parallel and direct analysis of DNA methylation in hundreds of thousands of sites across the genome. MSAP-Seq provides direct genomic localization of changes and enables quantitative evaluation. We have shown that the MSAP-Seq method specifically targets gene-containing regions and that a single analysis can cover three-quarters of all genes in large genomes. Moreover, MSAP-Seq's simplicity, cost effectiveness, and high-multiplexing capability make this method highly affordable. Therefore, MSAP-Seq can be used for DNA methylation analysis in crop
Methylation Sensitive Amplification Polymorphism Sequencing (MSAP-Seq)-A Method for High-Throughput Analysis of Differentially Methylated CCGG Sites in Plants with Large Genomes

Directory of Open Access Journals (Sweden)

Karolina Chwialkowska

2017-11-01

Epigenetic mechanisms, including histone modifications and DNA methylation, mutually regulate chromatin structure, maintain genome integrity, and affect gene expression and transposon mobility. Variations in DNA methylation within plant populations, as well as methylation in response to internal and external factors, are of increasing interest, especially in the crop research field. Methylation Sensitive Amplification Polymorphism (MSAP) is one of the most commonly used methods for assessing DNA methylation changes in plants. This method involves gel-based visualization of PCR fragments from selectively amplified DNA that are cleaved using methylation-sensitive restriction enzymes. In this study, we developed and validated a new method based on the conventional MSAP approach called Methylation Sensitive Amplification Polymorphism Sequencing (MSAP-Seq). We improved the MSAP-based approach by replacing the conventional separation of amplicons on polyacrylamide gels with direct, high-throughput sequencing using Next Generation Sequencing (NGS) and automated data analysis. MSAP-Seq allows for global sequence-based identification of changes in DNA methylation. This technique was validated in Hordeum vulgare. However, MSAP-Seq can be straightforwardly implemented in different plant species, including crops with large, complex and highly repetitive genomes. The incorporation of high-throughput sequencing into MSAP-Seq enables parallel and direct analysis of DNA methylation in hundreds of thousands of sites across the genome. MSAP-Seq provides direct genomic localization of changes and enables quantitative evaluation. We have shown that the MSAP-Seq method specifically targets gene-containing regions and that a single analysis can cover three-quarters of all genes in large genomes. Moreover, MSAP-Seq's simplicity, cost effectiveness, and high-multiplexing capability make this method highly affordable. Therefore, MSAP-Seq can be used for DNA methylation
Genomic research in Eucalyptus.

Science.gov (United States)

Poke, Fiona S; Vaillancourt, René E; Potts, Brad M; Reid, James B

2005-09-01

Eucalyptus L'Hérit. is a genus comprised of more than 700 species that is of vital importance ecologically to Australia and to the forestry industry world-wide, being grown in plantations for the production of solid wood products as well as pulp for paper. With the sequencing of the genomes of Arabidopsis thaliana and Oryza sativa and the recent completion of the first tree genome sequence, Populus trichocarpa, attention has turned to the current status of genomic research in Eucalyptus. For several eucalypt species, large segregating families have been established, high-resolution genetic maps constructed and large EST databases generated. Collaborative efforts have been initiated for the integration of diverse genomic projects and will provide the framework for future research including exploiting the sequence of the entire eucalypt genome which is currently being sequenced. This review summarises the current position of genomic research in Eucalyptus and discusses the direction of future research.
Large-Scale Genomic Analysis of Codon Usage in Dengue Virus and Evaluation of Its Phylogenetic Dependence

Directory of Open Access Journals (Sweden)

Edgar E. Lara-Ramírez

2014-01-01

The increasing number of dengue virus (DENV) genome sequences available allows identifying the contributing factors to DENV evolution. In the present study, the codon usage in serotypes 1–4 (DENV1–4) has been explored for 3047 sequenced genomes using different statistical methods. The correlation analysis of total GC content (GC) with GC content at the three nucleotide positions of codons (GC1, GC2, and GC3), as well as the effective number of codons (ENC, ENCp) versus GC3 plots, revealed mutational bias and purifying selection pressures as the major forces influencing the codon usage, but with distinct pressure on specific nucleotide positions in the codon. The correspondence analysis (CA) and clustering analysis on relative synonymous codon usage (RSCU) within each serotype showed clustering patterns similar to the phylogenetic analysis of nucleotide sequences for DENV1–4. These clustering patterns are strongly related to the virus geographic origin. The phylogenetic dependence analysis also suggests that stabilizing selection acts on the codon usage bias. Our large-scale analysis reveals new features of DENV genomic evolution.
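Two of the statistics named in this abstract, position-specific GC content (GC1, GC2, GC3) and relative synonymous codon usage (RSCU), can be computed as in the following Python sketch. It is a generic illustration under stated assumptions, not the authors' pipeline: the function names are hypothetical, the standard genetic code is assumed, the input is taken to be an in-frame coding sequence, and GC3 is counted over all codons (some studies restrict it to synonymous codons).

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codons(seq):
    """Split an in-frame coding sequence into complete codons."""
    seq = seq.upper().replace("U", "T")
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def gc_by_position(seq):
    """Fraction of G or C at codon positions 1, 2 and 3 (GC1, GC2, GC3)."""
    cs = [c for c in codons(seq) if c in CODON_TABLE]
    return {f"GC{pos + 1}": sum(c[pos] in "GC" for c in cs) / len(cs)
            for pos in range(3)}


def rscu(seq):
    """RSCU = observed codon count / count expected under equal synonymous usage."""
    counts = Counter(c for c in codons(seq) if c in CODON_TABLE)
    result = {}
    for aa in set(CODON_TABLE.values()) - {"*"}:
        family = [c for c, a in CODON_TABLE.items() if a == aa]
        total = sum(counts[c] for c in family)
        if total == 0:
            continue                       # amino acid absent from this sequence
        for c in family:
            result[c] = counts[c] * len(family) / total
    return result


example = "GCTGCTGCCGCA"                   # four alanine codons
print(gc_by_position(example))
print({c: v for c, v in rscu(example).items() if c.startswith("GC")})
```

An RSCU value above 1 marks a codon used more often than expected if all synonymous codons were used equally. Vectors of RSCU values per genome are the kind of input that correspondence and clustering analyses operate on.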
Genomic Resource and Genome Guided Comparison of Twenty Type Strains of the Genus Methylobacterium

Directory of Open Access Journals (Sweden)

Vasvi Chaudhry

2017-12-01

Bacteria of the genus Methylobacterium are widespread in diverse habitats including soil, water and plants (phyllosphere, rhizosphere and endosphere). In the present study, we generated in house a genomic data resource of six type strains and combined it with fourteen database genomes of the genus Methylobacterium to carry out phylogenomic, taxonomic, comparative and ecological studies of this genus. Overall, the genus shows high diversity and genetic variation, primarily due to its ability to acquire genetic material from diverse sources through horizontal gene transfer. The majority of species identified in this study are plant associated, with genomes equipped with methylotrophy- and photosynthesis-related genes along with genes for plant probiotic traits. Most of the species' genomes are equipped with genes for adaptation to and defense against UV radiation, oxidative stress and desiccation. The genus has an open pan-genome and we predicted the role of gain/loss of prophages and CRISPR elements in diversity and evolution. Our genomic resource, with annotation and analysis, provides a platform for interspecies genomic comparisons in the genus Methylobacterium, for unraveling their natural genome diversity, and for studying how natural selection shapes their genomes with the adaptive mechanisms that allow them to acquire diverse habitat lifestyles. These type-strain genomic data display the power of Next Generation Sequencing in rapidly creating a resource, paving the way for studies on phylogeny and taxonomy as well as for basic and applied research on this important genus.
Variability of soil properties within large termite mounds in South Katanga, DRC - origins and applications.

Science.gov (United States)

Erens, Hans; Bazirake Mujinya, Basile; Boeckx, Pascal; Baert, Geert; Mees, Florias; Van Ranst, Eric

2014-05-01

The miombo woodlands of South Katanga (D.R. Congo) are characterized by a high spatial density of large conic termite mounds built by Macrotermes falciger (3 to 5 ha⁻¹). With an average height of 5.05 m and diameter of 14.88 m, these are some of the largest biogenic structures in the world. The mound material is known to differ considerably from the surrounding Ferralsols. Specifically, mound material exhibits a finer texture, higher CEC and exchangeable basic cation content, lower organic matter content, and an accumulation of phosphorus, nitrate and secondary carbonates. However, as demonstrated by the present study, these soil properties are far from uniform within the volume of the mound. The termites' nesting and foraging activity, combined with pedogenic processes over extended periods of time, generates a wide range of physical, chemical, and biological conditions in different parts of the mound. Analysis of samples taken along a cross-section of a large active mound allowed generating contour plots, thus visualizing the variability of soil properties within the mound. The central columns of three other mounds were sampled to confirm apparent trends. The contour plots show that the mounds comprise four functional zones: (i) the active nest, found at the top; (ii) an accumulation zone, in more central parts of the mound; (iii) a dense inactive zone, surrounding the accumulation zone and consisting of accumulated erosion products from former active nests; and (iv) the outer mantle, characterized by intense varied biological activity and by a well-developed soil structure. Intermittent leaching plays a key role in explaining these patterns. Using radiocarbon dating, we found that some of these mounds are at least 2000 years old. Their current size and shape is likely the result of successive stages of erosion and rebuilding, in the course of alternating periods of mound abandonment and recolonization. Over time, termite foraging combined with limited leaching
Sensitivity Analysis of Electromagnetic Induction Technique to Determine Soil Salinity at Large Scale

Directory of Open Access Journals (Sweden)

Yousef Hasheminejhad

2017-02-01

Introduction: Monitoring and management of saline soils depend on exact and updatable measurements of soil electrical conductivity. Large-scale direct measurements are not only expensive but also time consuming. Therefore, near-ground-surface sensors can be considered acceptable time- and cost-saving methods with high accuracy in soil salinity detection. One of these relatively innovative methods is the electromagnetic induction technique. Apparent soil electrical conductivity measured by electromagnetic induction is affected by several key soil properties, including soil moisture and clay content. Materials and Methods: Two years of soil salinity and apparent soil electrical conductivity data from a 50000 ha area in the Sabzevar-Davarzan plain were used to evaluate the sensitivity of electromagnetic induction to soil moisture and clay content. Locations of the sampling points were determined by the Latin Hypercube Sampling strategy, based on which 100 sampling points were selected for the first year and 25 sampling points for the second year. Owing to difficulties in finding and sampling the points, 97 sampling points were found in the area for the first year, out of which 82 points were sampled down to 90 cm depth in 30 cm intervals, and all of them were measured with an electromagnetic induction device in horizontal orientation. The first-year data were used for training the model and comprised 82 point measurements of bulk conductivity and laboratory determinations of the electrical conductivity of the saturated extract, soil texture and moisture content of the soil samples. The second-year data, which were used for testing the model, comprised 25 sampling points with 9 bulk conductivity measurements around each point. Electrical conductivity of the saturated extract was the only parameter measured in the laboratory for the second-year samples. Results and Discussion: Results of the first year showed a
Deep whole-genome sequencing of 90 Han Chinese genomes.

Science.gov (United States)

Lan, Tianming; Lin, Haoxiang; Zhu, Wenjuan; Laurent, Tellier Christian Asker Melchior; Yang, Mengcheng; Liu, Xin; Wang, Jun; Wang, Jian; Yang, Huanming; Xu, Xun; Guo, Xiaosen

2017-09-01

Next-generation sequencing provides a high-resolution insight into human genetic information. However, the focus of previous studies has primarily been on low-coverage data due to the high cost of sequencing. Although the 1000 Genomes Project and the Haplotype Reference Consortium have both provided powerful reference panels for imputation, low-frequency and novel variants remain difficult to discover and call with accuracy on the basis of low-coverage data. Deep sequencing provides an optimal solution for the problem of these low-frequency and novel variants. Although whole-exome sequencing is also a viable choice for exome regions, it cannot account for noncoding regions, sometimes resulting in the absence of important, causal variants. For Han Chinese populations, the majority of variants have been discovered based upon low-coverage data from the 1000 Genomes Project. However, high-coverage, whole-genome sequencing data are limited for any population, and a large amount of low-frequency, population-specific variants remain uncharacterized. We have performed whole-genome sequencing at a high depth (∼×80) of 90 unrelated individuals of Chinese ancestry, collected from the 1000 Genomes Project samples, including 45 Northern Han Chinese and 45 Southern Han Chinese samples. Eighty-three of these 90 have been sequenced by the 1000 Genomes Project. We have identified 12 568 804 single nucleotide polymorphisms, 2 074 210 short InDels, and 26 142 structural variations from these 90 samples. Compared to the Han Chinese data from the 1000 Genomes Project, we have found 7 000 629 novel variants with low frequency (defined as minor allele frequency genome. Compared to the 1000 Genomes Project, these Han Chinese deep sequencing data enhance the characterization of a large number of low-frequency, novel variants. This will be a valuable resource for promoting Chinese genetics research and medical development. Additionally, it will provide a valuable supplement to the 1000
Calculating Soil Wetness, Evapotranspiration and Carbon Cycle Processes Over Large Grid Areas Using a New Scaling Technique

Science.gov (United States)

Sellers, Piers

2012-01-01

Soil wetness typically shows great spatial variability over the length scales of general circulation model (GCM) grid areas (approx. 100 km), and the functions relating evapotranspiration and photosynthetic rate to local-scale (approx. 1 m) soil wetness are highly non-linear. Soil respiration is also highly dependent on very small-scale variations in soil wetness. We therefore expect significant inaccuracies whenever we insert a single grid area-average soil wetness value into a function to calculate any of these rates for the grid area. For the particular case of evapotranspiration, this method - use of a grid-averaged soil wetness value - can also provoke severe oscillations in the evapotranspiration rate and soil wetness under some conditions. A method is presented whereby the probability distribution function (pdf) for soil wetness within a grid area is represented by binning, and numerical integration of the binned pdf is performed to provide a spatially-integrated wetness stress term for the whole grid area, which then permits calculation of grid area fluxes in a single operation. The method is very accurate when 10 or more bins are used, can deal realistically with spatially variable precipitation, conserves moisture exactly and allows for precise modification of the soil wetness pdf after every time step. The method could also be applied to other ecological problems where small-scale processes must be area-integrated, or upscaled, to estimate fluxes over large areas, for example in treatments of the terrestrial carbon budget or trace gas generation.
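The contrast drawn in this abstract, between feeding a grid-average wetness into a non-linear function and integrating that function over a binned wetness pdf, can be sketched in a few lines of Python. This shows the general idea only and is not the author's scheme: the piecewise-linear `stress` function, its thresholds `w_crit` and `w_sat`, and the sample values are hypothetical.

```python
def stress(w, w_crit=0.3, w_sat=0.6):
    """Hypothetical non-linear wetness stress factor, clipped to [0, 1]."""
    return min(1.0, max(0.0, (w - w_crit) / (w_sat - w_crit)))


def bin_pdf(samples, n_bins=10, lo=0.0, hi=1.0):
    """Represent the within-grid wetness distribution as bin centres and masses."""
    width = (hi - lo) / n_bins
    mass = [0.0] * n_bins
    for w in samples:
        k = min(int((w - lo) / width), n_bins - 1)   # clamp w == hi into the last bin
        mass[k] += 1.0 / len(samples)
    centres = [lo + (k + 0.5) * width for k in range(n_bins)]
    return centres, mass


def area_integrated_stress(centres, mass):
    """Numerically integrate the stress function over the binned pdf."""
    return sum(stress(c) * m for c, m in zip(centres, mass))


# Three quarters of the grid area is dry, one quarter is wet.
wetness = [0.05, 0.05, 0.05, 0.85]
mean_w = sum(wetness) / len(wetness)
centres, mass = bin_pdf(wetness)

print("stress at grid-mean wetness:", stress(mean_w))
print("pdf-integrated stress      :", area_integrated_stress(centres, mass))
```

In this toy case the grid-mean wetness falls below the stress threshold, so the single-value approach reports no flux at all, while the binned integral still credits the wet quarter of the grid area. The bin masses sum to one, which is the bookkeeping a moisture-conserving scheme relies on.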

Temperature response of soil respiration largely unaltered with experimental warming

Science.gov (United States)

Carey, Joanna C.; Tang, Jianwu; Templer, Pamela H.; Kroeger, Kevin D.; Crowther, Thomas W.; Burton, Andrew J.; Dukes, Jeffrey S.; Emmett, Bridget; Frey, Serita D.; Heskel, Mary A.; Jiang, Lifen; Machmuller, Megan B.; Mohan, Jacqueline; Panetta, Anne Marie; Reich, Peter B.; Reinsch, Sabine; Wang, Xin; Allison, Steven D.; Bamminger, Chris; Bridgham, Scott; Collins, Scott L.; de Dato, Giovanbattista; Eddy, William C.; Enquist, Brian J.; Estiarte, Marc; Harte, John; Henderson, Amanda; Johnson, Bart R.; Steenberg Larsen, Klaus; Luo, Yiqi; Marhan, Sven; Melillo, Jerry M.; Penuelas, Josep; Pfeifer-Meister, Laurel; Poll, Christian; Rastetter, Edward B.; Reinmann, Andrew B.; Reynolds, Lorien L.; Schmidt, Inger K.; Shaver, Gaius R.; Strong, Aaron L.; Suseela, Vidya; Tietema, Albert

2016-01-01

The respiratory release of carbon dioxide (CO2) from soil is a major yet poorly understood flux in the global carbon cycle. Climatic warming is hypothesized to increase rates of soil respiration, potentially fueling further increases in global temperatures. However, despite considerable scientific attention in recent decades, the overall response of soil respiration to anticipated climatic warming remains unclear. We synthesize the largest global dataset to date of soil respiration, moisture, and temperature measurements, totaling >3,800 observations representing 27 temperature manipulation studies, spanning nine biomes and over 2 decades of warming. Our analysis reveals no significant differences in the temperature sensitivity of soil respiration between control and warmed plots in all biomes, with the exception of deserts and boreal forests. Thus, our data provide limited evidence of acclimation of soil respiration to experimental warming in several major biome types, contrary to the results from multiple single-site studies. Moreover, across all nondesert biomes, respiration rates with and without experimental warming follow a Gaussian response, increasing with soil temperature up to a threshold of ∼25 °C, above which respiration rates decrease with further increases in temperature. This consistent decrease in temperature sensitivity at higher temperatures demonstrates that rising global temperatures may result in regionally variable responses in soil respiration, with colder climates being considerably more responsive to increased ambient temperatures compared with warmer regions. Our analysis adds a unique cross-biome perspective on the temperature response of soil respiration, information critical to improving our mechanistic understanding of how soil carbon dynamics change with climatic warming.
The Sequenced Angiosperm Genomes and Genome Databases.

Science.gov (United States)

Chen, Fei; Dong, Wei; Zhang, Jiawei; Guo, Xinyue; Chen, Junhao; Wang, Zhengjia; Lin, Zhenguo; Tang, Haibao; Zhang, Liangsheng

2018-01-01

Angiosperms, the flowering plants, provide the essential resources for human life, such as food, energy, oxygen, and materials. They have also promoted the evolution of humans, animals, and planet Earth. Despite the numerous advances in genome reports and sequencing technologies, no review covers all the released angiosperm genomes and the genome databases for data sharing. Based on the rapid advances and innovations in database reconstruction in the last few years, here we provide a comprehensive review of three major types of angiosperm genome databases: databases for a single species, for a specific angiosperm clade, and for multiple angiosperm species. The scope, tools, data, and features of each type of database are concisely discussed. Genome databases for a single species or a clade of species are especially popular with specific groups of researchers, while a comprehensive database that is updated in a timely manner is more powerful for addressing major scientific mysteries at the genome scale. Considering the low coverage of flowering plants in any available database, we propose construction of a comprehensive database to facilitate large-scale comparative studies of angiosperm genomes and to promote collaborative studies of important questions in plant biology.
Application of stochastic models in identification and apportionment of heavy metal pollution sources in the surface soils of a large-scale region.

Science.gov (United States)

Hu, Yuanan; Cheng, Hefa

2013-04-16

As heavy metals occur naturally in soils at measurable concentrations and their natural background contents have significant spatial variations, identification and apportionment of heavy metal pollution sources across large-scale regions is a challenging task. Stochastic models, including the recently developed conditional inference tree (CIT) and the finite mixture distribution model (FMDM), were applied to identify the sources of heavy metals found in the surface soils of the Pearl River Delta (PRD), China, and to apportion the contributions from natural background and human activities. Regression trees were successfully developed for the concentrations of Cd, Cu, Zn, Pb, Cr, Ni, As, and Hg in 227 soil samples from a region of over 7.2 × 10^4 km^2 based on seven specific predictors relevant to the source and behavior of heavy metals: land use, soil type, soil organic carbon content, population density, gross domestic product per capita, and the lengths and classes of the roads surrounding the sampling sites. The CIT and FMDM results consistently indicate that Cd, Zn, Cu, Pb, and Cr in the surface soils of the PRD were contributed largely by anthropogenic sources, whereas As, Ni, and Hg in the surface soils mostly originated from the soil parent materials.
Extreme-Scale De Novo Genome Assembly

Energy Technology Data Exchange (ETDEWEB)

Georganas, Evangelos [Intel Corporation, Santa Clara, CA (United States); Hofmeyr, Steven [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Joint Genome Inst.; Egan, Rob [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Computational Research Division; Buluc, Aydin [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Joint Genome Inst.; Oliker, Leonid [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Joint Genome Inst.; Rokhsar, Daniel [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Computational Research Division; Yelick, Katherine [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Joint Genome Inst.

2017-09-26

De novo whole genome assembly reconstructs genomic sequence from short, overlapping, and potentially erroneous DNA segments and is one of the most important computations in modern genomics. This work presents HipMer, a high-quality end-to-end de novo assembler designed for extreme scale analysis, via efficient parallelization of the Meraculous code. Genome assembly software has many components, each of which stresses different components of a computer system. This chapter explains the computational challenges involved in each step of the HipMer pipeline, the key distributed data structures, and communication costs in detail. We present performance results of assembling the human genome and the large hexaploid wheat genome on large supercomputers up to tens of thousands of cores.
Cloud computing for comparative genomics.

Science.gov (United States)

Wall, Dennis P; Kudtarkar, Parul; Fusaro, Vincent A; Pivovarov, Rimma; Patil, Prasad; Tonellato, Peter J

2010-05-18

Large comparative genomics studies and tools are becoming increasingly more compute-expensive as the number of available genome sequences continues to rise. The capacity and cost of local computing infrastructures are likely to become prohibitive with the increase, especially as the breadth of questions continues to rise. Alternative computing architectures, in particular cloud computing environments, may help alleviate this increasing pressure and enable fast, large-scale, and cost-effective comparative genomics strategies going forward. To test this, we redesigned a typical comparative genomics algorithm, the reciprocal smallest distance algorithm (RSD), to run within Amazon's Elastic Computing Cloud (EC2). We then employed the RSD-cloud for ortholog calculations across a wide selection of fully sequenced genomes. We ran more than 300,000 RSD-cloud processes within the EC2. These jobs were farmed simultaneously to 100 high capacity compute nodes using the Amazon Web Service Elastic Map Reduce and included a wide mix of large and small genomes. The total computation time took just under 70 hours and cost a total of $6,302 USD. The effort to transform existing comparative genomics algorithms from local compute infrastructures is not trivial. However, the speed and flexibility of cloud computing environments provides a substantial boost with manageable cost. The procedure designed to transform the RSD algorithm into a cloud-ready application is readily adaptable to similar comparative genomics problems.
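
The cost figures quoted in this abstract can be turned into rough unit costs. The sketch below is back-of-the-envelope arithmetic only: it assumes that all 100 nodes were billed for the full run and that the quoted total covers nothing else, neither of which the abstract states. The variable names are illustrative.

```python
# Figures quoted in the abstract.
nodes = 100               # high-capacity compute nodes
hours = 70                # "just under 70 hours", so this is an upper bound
total_cost_usd = 6302     # total reported cost
processes = 300_000       # "more than 300,000", so this is a lower bound

# Assumption: every node ran (and was billed) for the whole duration.
node_hours = nodes * hours
cost_per_node_hour = total_cost_usd / node_hours
cost_per_process = total_cost_usd / processes

print(f"{node_hours} node-hours")
print(f"about ${cost_per_node_hour:.2f} per node-hour")
print(f"about ${cost_per_process:.3f} per RSD process")
```

Because the run time is an upper bound and the process count a lower bound, the per-node-hour figure is, if anything, slightly low and the per-process figure slightly high.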
Comparative Genomics Reveals High Genomic Diversity in the Genus Photobacterium

OpenAIRE

Henrique Machado; Lone Gram

2017-01-01

Vibrionaceae is a large marine bacterial family, which can constitute up to 50% of the prokaryotic population in marine waters. Photobacterium is the second largest genus in the family and we used comparative genomics on 35 strains representing 16 of the 28 species described so far, to understand the genomic diversity present in the Photobacterium genus. Such understanding is important for ecophysiology studies of the genus. We used whole genome sequences to evaluate phylogenetic relationship...
Strain Dependent Genetic Networks for Antibiotic-Sensitivity in a Bacterial Pathogen with a Large Pan-Genome.

Directory of Open Access Journals (Sweden)

Tim van Opijnen

2016-09-01

The interaction between an antibiotic and a bacterium is not restricted to the drug and its direct target; rather, antibiotic-induced stress seems to resonate through the bacterium, creating selective pressures that drive the emergence of adaptive mutations not only in the direct target, but in genes involved in many different fundamental processes as well. Surprisingly, it has been shown that adaptive mutations do not necessarily have the same effect in all species, indicating that the genetic background influences how phenotypes are manifested. However, to what extent the genetic background affects the manner in which a bacterium experiences antibiotic stress, and how this stress is processed, is unclear. Here we employ the genome-wide tool Tn-Seq to construct daptomycin-sensitivity profiles for two strains of the bacterial pathogen Streptococcus pneumoniae. Remarkably, over half of the genes that are important for dealing with antibiotic-induced stress in one strain are dispensable in another. By confirming over 100 genotype-phenotype relationships, probing potassium loss, and employing genetic interaction mapping as well as temporal gene-expression experiments, we reveal genome-wide conditionally important/essential genes, discover roles for genes with unknown function, and uncover parts of the antibiotic's mode of action. Moreover, by mapping the underlying genomic network for two query genes we encounter little conservation in network connectivity between strains as well as profound differences in regulatory relationships. Our approach uniquely enables genome-wide fitness comparisons across strains, facilitating the discovery that antibiotic responses are complex events that can vary widely between strains, which suggests that in some cases the emergence of resistance could be strain specific and, at least for species with a large pan-genome, less predictable.
[Genome editing of industrial microorganism].

Science.gov (United States)

Zhu, Linjiang; Li, Qi

2015-03-01

Genome editing is defined as highly effective and precise modification of the cellular genome on a large scale. In recent years, such genome-editing methods have been rapidly developed in the field of industrial strain improvement. These rapidly evolving methods have thoroughly changed the old, inefficient mode of genetic modification, namely "one modification, one selection marker, and one target site". Highly effective modification modes have been developed, including simultaneous modification of multiple genes; highly effective insertion, replacement, and deletion of target genes at the genome scale; and cut-and-paste of large DNA fragments. These new tools for microbial genome editing will certainly be applied widely, increasing the efficiency of industrial strain improvement and promoting the transformation of the traditional fermentation industry and the rapid development of novel industrial biotechnology such as the production of biofuels and biomaterials. The technological principles of these genome-editing methods and their applications are summarized in this review, which can benefit the engineering and construction of industrial microorganisms.
A Genome-Wide Landscape of Retrocopies in Primate Genomes.

Science.gov (United States)

Navarro, Fábio C P; Galante, Pedro A F

2015-07-29

Gene duplication is a key factor contributing to phenotype diversity across and within species. Although the availability of complete genomes has led to the extensive study of genomic duplications, the dynamics and variability of gene duplications mediated by retrotransposition are not well understood. Here, we predict mRNA retrotransposition events and use comparative genomics to investigate their origin and variability across primates. Analyzing seven anthropoid primate genomes, we found a similar number of mRNA retrotranspositions (∼7,500 retrocopies) in Catarrhini (Old World monkeys, including humans), but a surprisingly large number of retrocopies (∼10,000) in Platyrrhini (New World monkeys), which may be a by-product of higher long interspersed nuclear element 1 activity in these genomes. By inferring retrocopy orthology, we dated most of the primate retrocopy origins, and estimated a decrease in the fixation rate in recent primate history, implying a smaller number of species-specific retrocopies. Moreover, using RNA-Seq data, we identified approximately 3,600 expressed retrocopies. As expected, most of these retrocopies are located near or within known genes, present tissue-specific and even species-specific expression patterns, and show no expression correlation with their parental genes. Taken together, our results provide further evidence that mRNA retrotransposition is an active mechanism in primate evolution and suggest that retrocopies may not only introduce great genetic variability between lineages but also create a large reservoir of potentially functional new genomic loci in primate genomes. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Life-cycle and genome of OtV5, a large DNA virus of the pelagic marine unicellular green alga Ostreococcus tauri.

Directory of Open Access Journals (Sweden)

Evelyne Derelle

Large DNA viruses are ubiquitous, infecting diverse organisms ranging from algae to man, and have probably evolved from an ancient common ancestor. In aquatic environments, such algal viruses control blooms and shape the evolution of biodiversity in phytoplankton, but little is known about their biological functions. We show that Ostreococcus tauri, the smallest known marine photosynthetic eukaryote, whose genome is completely characterized, is a host for large DNA viruses, and we present an analysis of the life-cycle and the 186,234 bp long linear genome of OtV5. OtV5 is a lytic phycodnavirus which unexpectedly does not degrade its host chromosomes before the host cell bursts. Analysis of its complete genome sequence confirmed that it lacks expected site-specific endonucleases, and revealed the presence of 16 genes whose predicted functions are novel to this group of viruses. OtV5 carries at least one predicted gene whose protein closely resembles its host counterpart and several other host-like sequences, suggesting that horizontal gene transfers between host and viral genomes may occur frequently on an evolutionary scale. Fifty-seven percent of the 268 predicted proteins present no similarities with any known protein in GenBank, underlining the wealth of undiscovered biological diversity present in oceanic viruses, which are estimated to harbour 200 Mt of carbon.
Methylation Sensitive Amplification Polymorphism Sequencing (MSAP-Seq)—A Method for High-Throughput Analysis of Differentially Methylated CCGG Sites in Plants with Large Genomes

Science.gov (United States)

Chwialkowska, Karolina; Korotko, Urszula; Kosinska, Joanna; Szarejko, Iwona; Kwasniewski, Miroslaw

2017-01-01

Epigenetic mechanisms, including histone modifications and DNA methylation, mutually regulate chromatin structure, maintain genome integrity, and affect gene expression and transposon mobility. Variations in DNA methylation within plant populations, as well as methylation in response to internal and external factors, are of increasing interest, especially in the crop research field. Methylation Sensitive Amplification Polymorphism (MSAP) is one of the most commonly used methods for assessing DNA methylation changes in plants. This method involves gel-based visualization of PCR fragments from selectively amplified DNA that are cleaved using methylation-sensitive restriction enzymes. In this study, we developed and validated a new method based on the conventional MSAP approach called Methylation Sensitive Amplification Polymorphism Sequencing (MSAP-Seq). We improved the MSAP-based approach by replacing the conventional separation of amplicons on polyacrylamide gels with direct, high-throughput sequencing using Next Generation Sequencing (NGS) and automated data analysis. MSAP-Seq allows for global sequence-based identification of changes in DNA methylation. This technique was validated in Hordeum vulgare. However, MSAP-Seq can be straightforwardly implemented in different plant species, including crops with large, complex and highly repetitive genomes. The incorporation of high-throughput sequencing into MSAP-Seq enables parallel and direct analysis of DNA methylation in hundreds of thousands of sites across the genome. MSAP-Seq provides direct genomic localization of changes and enables quantitative evaluation. We have shown that the MSAP-Seq method specifically targets gene-containing regions and that a single analysis can cover three-quarters of all genes in large genomes. Moreover, MSAP-Seq's simplicity, cost effectiveness, and high-multiplexing capability make this method highly affordable. Therefore, MSAP-Seq can be used for DNA methylation analysis in crop
Lightweight genome viewer: portable software for browsing genomics data in its chromosomal context.

Science.gov (United States)

Faith, Jeremiah J; Olson, Andrew J; Gardner, Timothy S; Sachidanandam, Ravi

2007-09-18

Lightweight genome viewer (lwgv) is a web-based tool for visualization of sequence annotations in their chromosomal context. It performs most of the functions of larger genome browsers, while relying on standard flat-file formats and bypassing the database needs of most visualization tools. Visualization as an aid to discovery requires display of novel data in conjunction with static annotations in their chromosomal context. With database-based systems, displaying dynamic results requires temporary tables that need to be tracked for removal. lwgv simplifies the visualization of user-generated results on a local computer. The dynamic results of these analyses are written to transient files, which can import static content from a more permanent file. lwgv is currently used in many different applications, from whole genome browsers to single-gene RNAi design visualization, demonstrating its applicability in a large variety of contexts and scales. lwgv provides a lightweight alternative to large genome browsers for visualizing biological annotations and dynamic analyses in their chromosomal context. It is particularly suited for applications ranging from short sequences to medium-sized genomes when the creation and maintenance of a large software and database infrastructure is not necessary or desired.
Genomic prediction using subsampling

OpenAIRE

Xavier, Alencar; Xu, Shizhong; Muir, William; Rainey, Katy Martin

2017-01-01

Background: Genome-wide assisted selection is a critical tool for the genetic improvement of plants and animals. Whole-genome regression models in a Bayesian framework represent the main family of prediction methods. Fitting such models with a large number of observations involves a prohibitive computational burden. We propose the use of subsampling bootstrap Markov chain in genomic prediction. This method consists of fitting whole-genome regression models by subsampling observations in each rou...
Systematic CpT (ApG) Depletion and CpG Excess Are Unique Genomic Signatures of Large DNA Viruses Infecting Invertebrates

Science.gov (United States)

Upadhyay, Mohita; Sharma, Neha; Vivekanandan, Perumal

2014-01-01

Differences in the relative abundance of dinucleotides, if any, may provide important clues on host-driven evolution of viruses. We studied dinucleotide frequencies of large DNA viruses infecting vertebrates (n = 105; viruses infecting mammals = 99; viruses infecting aves = 6; viruses infecting reptiles = 1) and invertebrates (n = 88; viruses infecting insects = 84; viruses infecting crustaceans = 4). We have identified systematic depletion of CpT(ApG) dinucleotides and over-representation of CpG dinucleotides as the unique genomic signature of large DNA viruses infecting invertebrates. Detailed investigation of this unique genomic signature suggests the existence of invertebrate host-induced pressures specifically targeting CpT(ApG) and CpG dinucleotides. The depletion of CpT dinucleotides among large DNA viruses infecting invertebrates is, at least in part, explained by non-canonical DNA methylation by the infected host. Our findings highlight the role of invertebrate host-related factors in shaping virus evolution and they also provide the necessary framework for future studies on evolution, epigenetics and molecular biology of viruses infecting this group of hosts. PMID:25369195
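
Relative dinucleotide abundance is often summarized as an observed/expected (odds) ratio, f(XY) / (f(X) · f(Y)), where values well below 1 suggest depletion and values well above 1 suggest over-representation. The abstract does not say which statistic the authors used, so the sketch below is an assumption for illustration; the function name `dinucleotide_odds_ratio` is hypothetical, and the calculation covers a single strand only.

```python
import math
from collections import Counter


def dinucleotide_odds_ratio(seq, dinuc):
    """Observed/expected ratio f(XY) / (f(X) * f(Y)) on one strand of seq."""
    seq = seq.upper()
    n = len(seq)
    if n < 2:
        return math.nan
    mono = Counter(seq)                                   # single-base counts
    di = Counter(seq[i:i + 2] for i in range(n - 1))      # overlapping dinucleotides
    f_x = mono[dinuc[0]] / n
    f_y = mono[dinuc[1]] / n
    if f_x == 0 or f_y == 0:
        return math.nan                                   # expected frequency undefined
    f_xy = di[dinuc] / (n - 1)
    return f_xy / (f_x * f_y)


# Toy sequence (not real viral data): CpG is strongly over-represented here.
toy = "CGCGCGCG"
print(round(dinucleotide_odds_ratio(toy, "CG"), 3))
```

To treat CpT and its reverse complement ApG together, as the CpT(ApG) notation in the abstract suggests, one option is to apply the same function to the sequence concatenated with its reverse complement.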
Ensembl 2002: accommodating comparative genomics.

Science.gov (United States)

Clamp, M; Andrews, D; Barker, D; Bevan, P; Cameron, G; Chen, Y; Clark, L; Cox, T; Cuff, J; Curwen, V; Down, T; Durbin, R; Eyras, E; Gilbert, J; Hammond, M; Hubbard, T; Kasprzyk, A; Keefe, D; Lehvaslaiho, H; Iyer, V; Melsopp, C; Mongin, E; Pettett, R; Potter, S; Rust, A; Schmidt, E; Searle, S; Slater, G; Smith, J; Spooner, W; Stabenau, A; Stalker, J; Stupka, E; Ureta-Vidal, A; Vastrik, I; Birney, E

2003-01-01

The Ensembl (http://www.ensembl.org/) database project provides a bioinformatics framework to organise biology around the sequences of large genomes. It is a comprehensive source of stable automatic annotation of human, mouse and other genome sequences, available as either an interactive web site or as flat files. Ensembl also integrates manually annotated gene structures from external sources where available. As well as being one of the leading sources of genome annotation, Ensembl is an open source software engineering project to develop a portable system able to handle very large genomes and associated requirements. These range from sequence analysis to data storage and visualisation, and installations exist around the world at both companies and academic sites. With both human and mouse genome sequences available and more vertebrate sequences to follow, many of the recent developments in Ensembl have focused on developing automatic comparative genome analysis and visualisation.
The Ensembl genome database project.

Science.gov (United States)

Hubbard, T; Barker, D; Birney, E; Cameron, G; Chen, Y; Clark, L; Cox, T; Cuff, J; Curwen, V; Down, T; Durbin, R; Eyras, E; Gilbert, J; Hammond, M; Huminiecki, L; Kasprzyk, A; Lehvaslaiho, H; Lijnzaad, P; Melsopp, C; Mongin, E; Pettett, R; Pocock, M; Potter, S; Rust, A; Schmidt, E; Searle, S; Slater, G; Smith, J; Spooner, W; Stabenau, A; Stalker, J; Stupka, E; Ureta-Vidal, A; Vastrik, I; Clamp, M

2002-01-01

The Ensembl (http://www.ensembl.org/) database project provides a bioinformatics framework to organise biology around the sequences of large genomes. It is a comprehensive source of stable automatic annotation of the human genome sequence, with confirmed gene predictions that have been integrated with external data sources, and is available as either an interactive web site or as flat files. It is also an open source software engineering project to develop a portable system able to handle very large genomes and associated requirements from sequence analysis to data storage and visualisation. The Ensembl site is one of the leading sources of human genome sequence annotation and provided much of the analysis for publication by the international human genome project of the draft genome. The Ensembl system is being installed around the world in both companies and academic sites on machines ranging from supercomputers to laptops.
GRIMP: A web- and grid-based tool for high-speed analysis of large-scale genome-wide association using imputed data.

NARCIS (Netherlands)

K. Estrada Gil (Karol); A. Abuseiris (Anis); F.G. Grosveld (Frank); A.G. Uitterlinden (André); T.A. Knoch (Tobias); F. Rivadeneira Ramirez (Fernando)

2009-01-01

The current fast growth of genome-wide association studies (GWAS) combined with now common computationally expensive imputation requires the online access of large user groups to high-performance computing resources capable of analyzing rapidly and efficiently millions of genetic
A method to study response of large trees to different amounts of available soil water

Science.gov (United States)

D.H. Marx; Shi-Jean S. Sung; J.S. Cunningham; M.D. Thompson; L.M. White

1995-01-01

A method was developed to manipulate available soil water on large trees by intercepting thrufall with gutters placed under tree canopies and irrigating the intercepted thrufall onto other trees. With this design, trees were exposed for 2 years to either 25% less thrufall, normal thrufall, or 25% additional thrufall. Undercanopy construction in these plots moderately...
Genome Sequence of the Palaeopolyploid soybean

Energy Technology Data Exchange (ETDEWEB)

Schmutz, Jeremy; Cannon, Steven B.; Schlueter, Jessica; Ma, Jianxin; Mitros, Therese; Nelson, William; Hyten, David L.; Song, Qijian; Thelen, Jay J.; Cheng, Jianlin; Xu, Dong; Hellsten, Uffe; May, Gregory D.; Yu, Yeisoo; Sakura, Tetsuya; Umezawa, Taishi; Bhattacharyya, Madan K.; Sandhu, Devinder; Valliyodan, Babu; Lindquist, Erika; Peto, Myron; Grant, David; Shu, Shengqiang; Goodstein, David; Barry, Kerrie; Futrell-Griggs, Montona; Abernathy, Brian; Du, Jianchang; Tian, Zhixi; Zhu, Liucun; Gill, Navdeep; Joshi, Trupti; Libault, Marc; Sethuraman, Anand; Zhang, Xue-Cheng; Shinozaki, Kazuo; Nguyen, Henry T.; Wing, Rod A.; Cregan, Perry; Specht, James; Grimwood, Jane; Rokhsar, Dan; Stacey, Gary; Shoemaker, Randy C.; Jackson, Scott A.

2009-08-03

Soybean (Glycine max) is one of the most important crop plants for seed protein and oil content, and for its capacity to fix atmospheric nitrogen through symbioses with soil-borne microorganisms. We sequenced the 1.1-gigabase genome by a whole-genome shotgun approach and integrated it with physical and high-density genetic maps to create a chromosome-scale draft sequence assembly. We predict 46,430 protein-coding genes, 70 percent more than Arabidopsis and similar to the poplar genome which, like soybean, is an ancient polyploid (palaeopolyploid). About 78 percent of the predicted genes occur in chromosome ends, which comprise less than one-half of the genome but account for nearly all of the genetic recombination. Genome duplications occurred at approximately 59 and 13 million years ago, resulting in a highly duplicated genome with nearly 75 percent of the genes present in multiple copies. The two duplication events were followed by gene diversification and loss, and numerous chromosome rearrangements. An accurate soybean genome sequence will facilitate the identification of the genetic basis of many soybean traits, and accelerate the creation of improved soybean varieties.
Cloud computing for comparative genomics

Directory of Open Access Journals (Sweden)

Pivovarov Rimma

2010-05-01

Background: Large comparative genomics studies and tools are becoming increasingly more compute-expensive as the number of available genome sequences continues to rise. The capacity and cost of local computing infrastructures are likely to become prohibitive with the increase, especially as the breadth of questions continues to rise. Alternative computing architectures, in particular cloud computing environments, may help alleviate this increasing pressure and enable fast, large-scale, and cost-effective comparative genomics strategies going forward. To test this, we redesigned a typical comparative genomics algorithm, the reciprocal smallest distance algorithm (RSD), to run within Amazon's Elastic Computing Cloud (EC2). We then employed the RSD-cloud for ortholog calculations across a wide selection of fully sequenced genomes. Results: We ran more than 300,000 RSD-cloud processes within the EC2. These jobs were farmed simultaneously to 100 high capacity compute nodes using the Amazon Web Service Elastic Map Reduce and included a wide mix of large and small genomes. The total computation time took just under 70 hours and cost a total of $6,302 USD. Conclusions: The effort to transform existing comparative genomics algorithms from local compute infrastructures is not trivial. However, the speed and flexibility of cloud computing environments provides a substantial boost with manageable cost. The procedure designed to transform the RSD algorithm into a cloud-ready application is readily adaptable to similar comparative genomics problems.

Large-scale assessment of soil erosion in Africa: satellites help to jointly account for dynamic rainfall and vegetation cover

Science.gov (United States)

Vrieling, Anton; Hoedjes, Joost C. B.; van der Velde, Marijn

2015-04-01

Efforts to map and monitor soil erosion need to account for the erratic nature of the soil erosion process. Soil erosion by water occurs on sloped terrain when erosive rainfall and consequent surface runoff impact soils that are not well-protected by vegetation or other soil protective measures. Both rainfall erosivity and vegetation cover are highly variable through space and time. Due to data paucity and the relative ease of spatially overlaying geographical data layers into existing models like USLE (Universal Soil Loss Equation), many studies and mapping efforts merely use average annual values for erosivity and vegetation cover as input. We first show that rainfall erosivity can be estimated from satellite precipitation data. We obtained average annual erosivity estimates from 15 yr of 3-hourly TRMM Multi-satellite Precipitation Analysis (TMPA) data (1998-2012) using intensity-erosivity relationships. Our estimates showed a positive correlation (r = 0.84) with long-term annual erosivity values of 37 stations obtained from literature. Using these TMPA erosivity retrievals, we demonstrate the large interannual variability, with maximum annual erosivity often exceeding two to three times the mean value, especially in semi-arid areas. We then calculate erosivity at a 10-daily time-step and combine this with vegetation cover development for selected locations in Africa using NDVI - normalized difference vegetation index - time series from SPOT VEGETATION. Although we do not integrate the data at this point, the joint analysis of both variables stresses the need for joint accounting for erosivity and vegetation cover for large-scale erosion assessment and monitoring.
Numerical Analysis of Soil Settlement Prediction and Its Application In Large-Scale Marine Reclamation Artificial Island Project

Directory of Open Access Journals (Sweden)

Zhao Jie

2017-11-01

In an artificial island construction project on large-scale marine reclamation land, soil settlement is key to the safe long-term operation of the whole site. To analyze the factors affecting soil settlement in a marine reclamation project, the SEM method of soil micro-structural analysis is used to test and study six soil samples, including the representative silt, mucky silty clay, silty clay and clay in the area. The structural characteristics that affect the soil settlement are obtained by observing the SEM images at different depths. By combining numerical calculation methods based on Terzaghi’s one-dimensional and Biot’s two-dimensional consolidation theories, one-dimensional and two-dimensional creep models are established, and the numerical results of the two consolidation theories are compared in order to predict the maximum settlement of the soils 100 years after completion. The analysis results indicate that the micro-structural characteristics are the essential factor affecting settlement in this area. Based on the numerical analysis of one-dimensional and two-dimensional settlement, the settlement laws and trends obtained by the two numerical analysis methods are similar. The analysis in this paper can provide reference and guidance for projects involving marine reclamation land.
Comparative Genomics Reveals High Genomic Diversity in the Genus Photobacterium

DEFF Research Database (Denmark)

Machado, Henrique; Gram, Lone

2017-01-01

Vibrionaceae is a large marine bacterial family, which can constitute up to 50% of the prokaryotic population in marine waters. Photobacterium is the second largest genus in the family and we used comparative genomics on 35 strains representing 16 of the 28 species described so far, to understand the genomic diversity present in the Photobacterium genus. Such understanding is important for ecophysiology studies of the genus. We used whole genome sequences to evaluate phylogenetic relationships using several analyses (16S rRNA, MLSA, fur, amino-acid usage, ANI), which allowed us to identify two... was widespread and abundant in the genus, suggesting a role in genomic evolution. The high genetic variability and indications of genetic exchange make it difficult to elucidate genome evolutionary paths and raise the awareness of the roles of foreign DNA in the genomic evolution of environmental organisms.
Temperature response of soil respiration largely unaltered with experimental warming

DEFF Research Database (Denmark)

Carey, Joanna C; Tang, Jianwu; Templer, Pamela H

2016-01-01

The respiratory release of carbon dioxide (CO2) from soil is a major yet poorly understood flux in the global carbon cycle. Climatic warming is hypothesized to increase rates of soil respiration, potentially fueling further increases in global temperatures. However, despite considerable scientific attention in recent decades, the overall response of soil respiration to anticipated climatic warming remains unclear. We synthesize the largest global dataset to date of soil respiration, moisture, and temperature measurements, totaling >3,800 observations representing 27 temperature manipulation studies, spanning nine biomes and over 2 decades of warming. Our analysis reveals no significant differences in the temperature sensitivity of soil respiration between control and warmed plots in all biomes, with the exception of deserts and boreal forests. Thus, our data provide limited evidence of acclimation...
High proportion of large genomic deletions and a genotype phenotype update in 80 unrelated families with juvenile polyposis syndrome

DEFF Research Database (Denmark)

Aretz, S; Stienen, D; Uhlhaas, S

2007-01-01

BACKGROUND: In patients with juvenile polyposis syndrome (JPS) the frequency of large genomic deletions in the SMAD4 and BMPR1A genes was unknown. METHODS: Mutation and phenotype analysis was used in 80 unrelated patients of whom 65 met the clinical criteria for JPS (typical JPS) and 15 were susp...
Direct quantification of fungal DNA from soil substrate using real-time PCR.

Science.gov (United States)

Filion, Martin; St-Arnaud, Marc; Jabaji-Hare, Suha H

2003-04-01

Detection and quantification of genomic DNA from two ecologically different fungi, the plant pathogen Fusarium solani f. sp. phaseoli and the arbuscular mycorrhizal fungus Glomus intraradices, was achieved from soil substrate. Specific primers targeting a 362-bp fragment from the SSU rRNA gene region of G. intraradices and a 562-bp fragment from the F. solani f. sp. phaseoli translation elongation factor 1 alpha gene were used in real-time polymerase chain reaction (PCR) assays conjugated with the fluorescent SYBR® Green I dye. Standard curves showed a linear relation (r² = 0.999) between log values of fungal genomic DNA of each species and real-time PCR threshold cycles and were quantitative over 4-5 orders of magnitude. Real-time PCR assays were applied to in vitro-produced fungal structures and to sterile and non-sterile soil substrate seeded with known propagule numbers of either fungus. Detection and genomic DNA quantification were obtained from the different treatments, while no amplicon was detected from non-seeded non-sterile soil samples, confirming the absence of cross-reactivity with the soil microflora DNA. A significant correlation (P < …) was obtained between the amount of genomic DNA of F. solani f. sp. phaseoli or G. intraradices detected and the number of fungal propagules present in seeded soil substrate. The DNA extraction protocol and real-time PCR quantification assay can be performed in less than 2 h and is adaptable to detect and quantify genomic DNA from other soilborne fungi.
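The standard-curve step described above can be sketched as an ordinary least-squares fit of threshold cycle against log10 of the DNA amount, followed by inversion of the fitted line for an unknown sample. The dilution series and threshold cycles below are hypothetical, not values from the study.

```python
import math


def fit_standard_curve(dna_amounts, threshold_cycles):
    """Least-squares fit of Ct = slope * log10(amount) + intercept."""
    xs = [math.log10(a) for a in dna_amounts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(threshold_cycles) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, threshold_cycles))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantify(threshold_cycle, slope, intercept):
    """Invert the standard curve to estimate the DNA amount of an unknown."""
    return 10 ** ((threshold_cycle - intercept) / slope)


# Hypothetical ten-fold dilution series (arbitrary mass units) and its Ct values.
standards = [1e1, 1e2, 1e3, 1e4, 1e5]
cts = [33.1, 29.8, 26.4, 23.0, 19.7]

slope, intercept = fit_standard_curve(standards, cts)
print(slope, intercept, quantify(26.4, slope, intercept))
```

A lower threshold cycle maps to a larger estimated amount, which is why the fitted slope is negative.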
Large-scale sulfolane-impacted soil remediation at a gas plant

Energy Technology Data Exchange (ETDEWEB)

Lavoie, G.; Rockwell, K. [Biogenie Inc., Calgary, AB (Canada)

2006-07-01

A large-scale sulfolane-impacted soil remediation project at a gas plant in central Alberta was discussed. The plant has been operational from the 1960s to the present, and the former operation involved the Sulfinol process, which resulted in groundwater contamination. In 2005, the client wanted to address the source area. The Sulfinol process has been used since the 1960s to remove hydrogen sulfide and other corrosive gases from natural gas streams. Sulfinol uses sulfolane and diisopropanolamine. Sulfolane is toxic, non-volatile, and water soluble. The presentation also addressed the remediation objectives and an additional site assessment that was conducted to better delineate the sulfolane and sulphur plume, as well as metals. The findings of the ESA and site-specific challenges were presented. These challenges included: plant operation concerns; numerous overhead, surface, and underground structures; a large volume of impacted material; limited space available on site; several types of contaminants; and the time required to perform the overall work. Next, the sulfolane remediation strategy was discussed, including advantages and results of the investigation. Last, the results of the project were presented. It was found that there were no recordable safety incidents and that all remedial objectives were achieved. tabs., figs.
Lightweight genome viewer: portable software for browsing genomics data in its chromosomal context

Directory of Open Access Journals (Sweden)

Gardner Timothy S

2007-09-01

Full Text Available Abstract Background Lightweight genome viewer (lwgv) is a web-based tool for visualization of sequence annotations in their chromosomal context. It performs most of the functions of larger genome browsers, while relying on standard flat-file formats and bypassing the database needs of most visualization tools. Visualization as an aid to discovery requires display of novel data in conjunction with static annotations in their chromosomal context. With database-based systems, displaying dynamic results requires temporary tables that need to be tracked for removal. Results lwgv simplifies the visualization of user-generated results on a local computer. The dynamic results of these analyses are written to transient files, which can import static content from a more permanent file. lwgv is currently used in many different applications, from whole genome browsers to single-gene RNAi design visualization, demonstrating its applicability in a large variety of contexts and scales. Conclusion lwgv provides a lightweight alternative to large genome browsers for visualizing biological annotations and dynamic analyses in their chromosomal context. It is particularly suited for applications ranging from short sequences to medium-sized genomes when the creation and maintenance of a large software and database infrastructure is not necessary or desired.
Simulation of large-scale soil water systems using groundwater data and satellite based soil moisture

Science.gov (United States)

Kreye, Phillip; Meon, Günter

2016-04-01

Complex concepts for the physically correct depiction of dominant processes in the hydrosphere are increasingly at the forefront of hydrological modelling. Many scientific issues in hydrological modelling demand additional system variables besides a simulation of runoff only, such as groundwater recharge or soil moisture conditions. Models that include soil water simulations are either very simplified or require a high number of parameters. Against this backdrop there is a heightened demand for observations to be used to calibrate the model. A reasonable integration of groundwater data or remote sensing data in calibration procedures, as well as the identifiability of physically plausible sets of parameters, is subject to research in the field of hydrology. Since this data is often combined with conceptual models, the given interfaces are not suitable for such demands. Furthermore, the application of automated optimisation procedures is generally associated with conceptual models, whose (fast) computing times allow many iterations of the optimisation in an acceptable time frame. One of the main aims of this study is to reduce the discrepancy between scientific and practical applications in the field of hydrological modelling. Therefore, the soil model DYVESOM (DYnamic VEgetation SOil Model) was developed as one of the primary components of the hydrological modelling system PANTA RHEI. DYVESOM's structure provides the required interfaces for the calibrations made at runoff, satellite-based soil moisture and groundwater level. The model considers spatially and temporally differentiated feedback of the development of the vegetation on the soil system. In addition, small-scale heterogeneities of soil properties (subgrid variability) are parameterized by variation of van Genuchten parameters depending on distribution functions. Different sets of parameters are operated simultaneously while interacting with each other. The developed soil model is innovative regarding concept
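The idea of parameterizing subgrid variability through van Genuchten parameters can be sketched as follows. The retention function is the standard van Genuchten form with m = 1 - 1/n; drawing the parameter alpha from a lognormal distribution and all numeric values are assumptions made for illustration, since the abstract does not state which distribution functions DYVESOM uses.

```python
import math
import random


def van_genuchten_theta(h, theta_r, theta_s, alpha, n):
    """Volumetric water content at pressure head h (negative when unsaturated)."""
    if h >= 0:
        return theta_s
    m = 1.0 - 1.0 / n
    effective_saturation = (1.0 + (alpha * abs(h)) ** n) ** (-m)
    return theta_r + (theta_s - theta_r) * effective_saturation


def subgrid_mean_theta(h, theta_r, theta_s, alpha_samples, n):
    """Average water content over a set of sampled alpha values."""
    values = [van_genuchten_theta(h, theta_r, theta_s, a, n) for a in alpha_samples]
    return sum(values) / len(values)


# Hypothetical soil: theta_r, theta_s, n, and a lognormal spread of alpha (1/cm).
rng = random.Random(0)
alpha_samples = [rng.lognormvariate(math.log(0.02), 0.5) for _ in range(200)]
mean_theta = subgrid_mean_theta(-100.0, 0.05, 0.45, alpha_samples, 1.5)
print(mean_theta)
```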
PSAT: A web tool to compare genomic neighborhoods of multiple prokaryotic genomes

Directory of Open Access Journals (Sweden)

Wasnick Michael

2008-03-01

Full Text Available Abstract Background The conservation of gene order among prokaryotic genomes can provide valuable insight into gene function, protein interactions, or events by which genomes have evolved. Although some tools are available for visualizing and comparing the order of genes between genomes of study, few support an efficient and organized analysis between large numbers of genomes. The Prokaryotic Sequence homology Analysis Tool (PSAT) is a web tool for comparing gene neighborhoods among multiple prokaryotic genomes. Results PSAT utilizes a database that is preloaded with gene annotation, BLAST hit results, and gene-clustering scores designed to help identify regions of conserved gene order. Researchers use the PSAT web interface to find a gene of interest in a reference genome and efficiently retrieve the sequence homologs found in other bacterial genomes. The tool generates a graphic of the genomic neighborhood surrounding the selected gene and the corresponding regions for its homologs in each comparison genome. Homologs in each region are color coded to assist users with analyzing gene order among various genomes. In contrast to common comparative analysis methods that filter sequence homolog data based on alignment score cutoffs, PSAT leverages gene context information for homologs, including those with weak alignment scores, enabling a more sensitive analysis. Features for constraining or ordering results are designed to help researchers browse results from large numbers of comparison genomes in an organized manner. PSAT has been demonstrated to be useful for helping to identify gene orthologs and potential functional gene clusters, and detecting genome modifications that may result in loss of function. Conclusion PSAT allows researchers to investigate the order of genes within local genomic neighborhoods of multiple genomes. A PSAT web server for public use is available for performing analyses on a growing set of reference genomes through any
Iodine in soil

International Nuclear Information System (INIS)

Johanson, Karl Johan

2000-12-01

A literature study of the migration and the appearance of iodine isotopes in the biosphere, particularly in soil, is presented. Some important papers in the field of iodine appearance in soil and the appearance of ¹²⁹I in the surroundings of reprocessing plants are discussed. The most important conclusions are: 1. Iodine binds to organic matter in the soil and also to some oxides of aluminium and iron. 2. If the iodine is not bound to the soil, a large fraction of added ¹²⁹I is volatilized after a rather short period. 3. The binding and also the volatilisation seem to be due to biological activity in the soil. It may take place within living microorganisms or by external enzymes excreted from microorganisms. 4. Due to variations in the composition of soil there may be a large variation in the distribution of ¹²⁹I in the vertical profile of soil (usually most of the ¹²⁹I is in the upper layer), which also results in large variations in the ¹²⁹I uptake to plants.
Iodine in soil

Energy Technology Data Exchange (ETDEWEB)

Johanson, Karl Johan [Swedish Univ. of Agricultural Sciences, Uppsala (Sweden). Dept. of Forest Mycology and Pathology

2000-12-01

A literature study of the migration and the appearance of iodine isotopes in the biosphere, particularly in soil, is presented. Some important papers in the field of iodine appearance in soil and the appearance of ¹²⁹I in the surroundings of reprocessing plants are discussed. The most important conclusions are: 1. Iodine binds to organic matter in the soil and also to some oxides of aluminium and iron. 2. If the iodine is not bound to the soil, a large fraction of added ¹²⁹I is volatilized after a rather short period. 3. The binding and also the volatilisation seem to be due to biological activity in the soil. It may take place within living microorganisms or by external enzymes excreted from microorganisms. 4. Due to variations in the composition of soil there may be a large variation in the distribution of ¹²⁹I in the vertical profile of soil (usually most of the ¹²⁹I is in the upper layer), which also results in large variations in the ¹²⁹I uptake to plants.
[Evaluation and source analysis of the mercury pollution in soils and vegetables around a large-scale zinc smelting plant].

Science.gov (United States)

Liu, Fang; Wang, Shu-Xiao; Wu, Qing-Ru; Lin, Hai

2013-02-01

Farming soil and vegetable samples were collected around a large-scale zinc smelter for mercury content analyses, and the single pollution index method with relevant regulations was used to evaluate the pollution status of the sampled soils and vegetables. The results indicated that the surface soil and vegetables were polluted with mercury to different extents. Of the soil samples, 78% exceeded the national standard. The mercury concentration in the most severely contaminated area was 29 times higher than the background concentration, reaching the severe pollution degree. The mercury concentration in all vegetable samples exceeded the standard for non-polluted vegetables. The mercury concentration in the most severely polluted vegetables was 64.5 times the standard, and on average the mercury concentration in the vegetable samples was 25.4 times the standard. For 85% of the vegetable samples, the mercury concentration in leaves was significantly higher than that in roots, which implies that the mercury in leaves mainly came from the atmosphere. The mercury concentrations in vegetable roots were significantly correlated with those in soils, indicating that the mercury in roots was mainly from soil. The mercury emissions from the zinc smelter have obvious impacts on the surrounding soils and vegetables. Key words: zinc smelting; mercury pollution; soil; vegetable; mercury content
Using Partial Genomic Fosmid Libraries for Sequencing Complete Organellar Genomes

Energy Technology Data Exchange (ETDEWEB)

McNeal, Joel R.; Leebens-Mack, James H.; Arumuganathan, K.; Kuehl, Jennifer V.; Boore, Jeffrey L.; dePamphilis, Claude W.

2005-08-26

Organellar genome sequences provide numerous phylogenetic markers and yield insight into organellar function and molecular evolution. These genomes are much smaller in size than their nuclear counterparts; thus, their complete sequencing is much less expensive than total nuclear genome sequencing, making broader phylogenetic sampling feasible. However, for some organisms it is challenging to isolate plastid DNA for sequencing using standard methods. To overcome these difficulties, we constructed partial genomic libraries from total DNA preparations of two heterotrophic and two autotrophic angiosperm species using fosmid vectors. We then used macroarray screening to isolate clones containing large fragments of plastid DNA. A minimum tiling path of clones comprising the entire genome sequence of each plastid was selected, and these clones were shotgun-sequenced and assembled into complete genomes. Although this method worked well for both heterotrophic and autotrophic plants, nuclear genome size had a dramatic effect on the proportion of screened clones containing plastid DNA and, consequently, the overall number of clones that must be screened to ensure full plastid genome coverage. This technique makes it possible to determine complete plastid genome sequences for organisms that defy other available organellar genome sequencing methods, especially those for which limited amounts of tissue are available.
Exploration of the Germline Genome of the Ciliate Chilodonella uncinata through Single-Cell Omics (Transcriptomics and Genomics)

Directory of Open Access Journals (Sweden)

Xyrus X. Maurer-Alcalá

2018-01-01

Full Text Available Separate germline and somatic genomes are found in numerous lineages across the eukaryotic tree of life, often separated into distinct tissues (e.g., in plants, animals, and fungi) or distinct nuclei sharing a common cytoplasm (e.g., in ciliates and some foraminifera). In ciliates, germline-limited (i.e., micronuclear-specific) DNA is eliminated during the development of a new somatic (i.e., macronuclear) genome in a process that is tightly linked to large-scale genome rearrangements, such as deletions and reordering of protein-coding sequences. Most studies of germline genome architecture in ciliates have focused on the model ciliates Oxytricha trifallax, Paramecium tetraurelia, and Tetrahymena thermophila, for which the complete germline genome sequences are known. Outside of these model taxa, only a few dozen germline loci have been characterized from a limited number of cultivable species, which is likely due to difficulties in obtaining sufficient quantities of “purified” germline DNA in these taxa. Combining single-cell transcriptomics and genomics, we have overcome these limitations and provide the first insights into the structure of the germline genome of the ciliate Chilodonella uncinata, a member of the understudied class Phyllopharyngea. Our analyses reveal the following: (i) large gene families contain a disproportionate number of genes from scrambled germline loci; (ii) germline-soma boundaries in the germline genome are demarcated by substantial shifts in GC content; (iii) single-cell omics techniques provide large-scale quality germline genome data with limited effort, at least for ciliates with extensively fragmented somatic genomes. Our approach provides an efficient means to understand better the evolution of genome rearrangements between germline and soma in ciliates.
RPAN: rice pan-genome browser for ∼3000 rice genomes.

Science.gov (United States)

Sun, Chen; Hu, Zhiqiang; Zheng, Tianqing; Lu, Kuangchen; Zhao, Yue; Wang, Wensheng; Shi, Jianxin; Wang, Chunchao; Lu, Jinyuan; Zhang, Dabing; Li, Zhikang; Wei, Chaochun

2017-01-25

A pan-genome is the union of the gene sets of all the individuals of a clade or a species and it provides a new dimension of genome complexity with the presence/absence variations (PAVs) of genes among these genomes. With the progress of sequencing technologies, pan-genome study is becoming affordable for eukaryotes with large-sized genomes. The Asian cultivated rice, Oryza sativa L., is one of the major food sources for the world and a model organism in plant biology. Recently, the 3000 Rice Genome Project (3K RGP) sequenced more than 3000 rice genomes with a mean sequencing depth of 14.3×, which provided a tremendous resource for rice research. In this paper, we present a genome browser, Rice Pan-genome Browser (RPAN), as a tool to search and visualize the rice pan-genome derived from 3K RGP. RPAN contains a database of the basic information of 3010 rice accessions, including genomic sequences, gene annotations, PAV information and gene expression data of the rice pan-genome. At least 12 000 novel genes absent in the reference genome were included. RPAN also provides multiple search and visualization functions. RPAN can be a rich resource for rice biology and rice breeding. It is available at http://cgm.sjtu.edu.cn/3kricedb/ or http://www.rmbreeding.cn/pan3k. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
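The definition of a pan-genome as the union of gene sets, together with presence/absence variation (PAV), can be sketched with plain set operations. The accession and gene names are hypothetical, and this is not how RPAN itself stores or queries its data.

```python
def pan_and_core(gene_sets):
    """Return (pan-genome, core genome) for a dict of accession -> set of genes."""
    sets = list(gene_sets.values())
    pan = set().union(*sets)
    core = set.intersection(*sets)
    return pan, core


def pav_matrix(gene_sets):
    """Presence/absence rows (1 = present) per accession, columns sorted by gene."""
    pan, _ = pan_and_core(gene_sets)
    genes = sorted(pan)
    rows = {acc: [int(g in s) for g in genes] for acc, s in gene_sets.items()}
    return genes, rows


# Hypothetical accessions and gene identifiers.
gene_sets = {
    "acc_A": {"g1", "g2", "g3"},
    "acc_B": {"g1", "g3", "g4"},
    "acc_C": {"g1", "g2", "g5"},
}

pan, core = pan_and_core(gene_sets)
genes, rows = pav_matrix(gene_sets)
print(sorted(pan), sorted(core), rows["acc_B"])
```

Genes in the pan-genome but outside the core are the ones that show presence/absence variation across accessions.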
Environmental genomics of "Haloquadratum walsbyi" in a saltern crystallizer indicates a large pool of accessory genes in an otherwise coherent species

NARCIS (Netherlands)

Legault, Boris A.; Lopez-Lopez, Arantxa; Alba-Casado, Jose Carlos; Doolittle, W. Ford; Bolhuis, Henk; Rodriguez-Valera, Francisco; Papke, R. Thane

2006-01-01

Background: Mature saturated brine (crystallizers) communities are largely dominated (> 80% of cells) by the square halophilic archaeon "Haloquadratum walsbyi". The recent cultivation of the strain HBSQ001 and the sequencing of its genome allow comparison with the metagenome of this taxonomically
Comparative Genome Analysis of Enterobacter cloacae

Science.gov (United States)

Liu, Wing-Yee; Wong, Chi-Fat; Chung, Karl Ming-Kar; Jiang, Jing-Wei; Leung, Frederick Chi-Ching

2013-01-01

The Enterobacter cloacae species includes an extremely diverse group of bacteria that are associated with plants, soil and humans. Publication of the complete genome sequence of the plant growth-promoting endophytic E. cloacae subsp. cloacae ENHKU01 provided an opportunity to perform the first comparative genome analysis between strains of this dynamic species. Examination of the pan-genome of E. cloacae showed that the conserved core genome retains the general physiological and survival genes of the species, while genomic factors in plasmids and variable regions determine the virulence of the human pathogenic E. cloacae strain; additionally, the diversity of fimbriae contributes to variation in colonization and host determination of different E. cloacae strains. Comparative genome analysis further illustrated that E. cloacae strains possess multiple mechanisms for antagonistic action against other microorganisms, which involve the production of siderophores and various antimicrobial compounds, such as bacteriocins, chitinases and antibiotic resistance proteins. The presence of Type VI secretion systems is expected to provide further fitness advantages for E. cloacae in microbial competition, thus allowing it to survive in different environments. Competition assays were performed to support our observations in genomic analysis, where E. cloacae subsp. cloacae ENHKU01 demonstrated antagonistic activities against a wide range of plant pathogenic fungal and bacterial species. PMID:24069314
Spectral estimation of soil properties in Siberian tundra soils and relations with plant species composition

DEFF Research Database (Denmark)

Bartholomeus, Harm; Schaepman-Strub, Gabriela; Blok, Daan

2012-01-01

Predicted global warming will be most pronounced in the Arctic and will severely affect permafrost environments. Due to its large spatial extent and large stocks of soil organic carbon, changes to organic matter decomposition rates and associated carbon fluxes in Arctic permafrost soils... yields a good prediction model for K and a moderate model for pH. Using these models, soil properties are determined for a larger number of samples, and soil properties are related to plant species composition. This analysis shows that variation of soil properties is large within vegetation classes...
Identification of a large genomic region in UV-irradiated human cells which has fewer cyclobutane pyrimidine dimers than most genomic regions

International Nuclear Information System (INIS)

Kantor, G.J.; Deiss-Tolbert, D.M.

1995-01-01

Size separation after UV-endonuclease digestion of DNA from UV-irradiated human cells using denaturing conditions fractionates the genome based on cyclobutane pyrimidine dimer content. We have examined the largest molecules available (50-80 kb; about 5% of the DNA) after fractionation and those of average size (5-15 kb) for content of some specific genes. We find that the largest molecules are not a representative sampling of the genome. Three contiguous genes located in a G+C-rich isochore (tyrosine hydroxylase, insulin, insulin-like growth factor II) have concentrations two to three times greater in the largest molecules. This shows that this genomic region has fewer pyrimidine dimers than most other genomic regions. In contrast, the β-actin genomic region, which has a similar G+C content, has an equal concentration in both fractions as do the p53 and β-globin genomic regions, which are A+T-rich. These data show that DNA damage in the form of cyclobutane pyrimidine dimers occurs with different probabilities in specific isochores. Part of the reason may be the relative G-C content, but other factors must play a significant role. We also report that the transcriptionally inactive insulin region is repaired at the genome-overall rate in normal cells and is not repaired in xeroderma pigmentosum complementation group C cells. (author)

Big Data Analytics for Genomic Medicine.

Science.gov (United States)

He, Karen Y; Ge, Dongliang; He, Max M

2017-02-15

Genomic medicine attempts to build individualized strategies for diagnostic or therapeutic decision-making by utilizing patients' genomic information. Big Data analytics uncovers hidden patterns, unknown correlations, and other insights through examining large, varied data sets. While integration and manipulation of diverse genomic data and comprehensive electronic health records (EHRs) on a Big Data infrastructure pose challenges, they also provide a feasible opportunity to develop an efficient and effective approach to identify clinically actionable genetic variants for individualized diagnosis and therapy. In this paper, we review the challenges of manipulating large-scale next-generation sequencing (NGS) data and diverse clinical data derived from the EHRs for genomic medicine. We introduce possible solutions for different challenges in manipulating, managing, and analyzing genomic and clinical data to implement genomic medicine. We also present a practical Big Data toolset for identifying clinically actionable genetic variants using high-throughput NGS data and EHRs.
Body maps on the human genome.

Science.gov (United States)

Cherniak, Christopher; Rodriguez-Esteban, Raul

2013-12-20

Chromosomes have territories, or preferred locales, in the cell nucleus. When these sites are taken into account, some large-scale structure of the human genome emerges. The synoptic picture is that genes highly expressed in particular topologically compact tissues are not randomly distributed on the genome. Rather, such tissue-specific genes tend to map somatotopically onto the complete chromosome set. They seem to form a "genome homunculus": a multi-dimensional, genome-wide body representation extending across chromosome territories of the entire sperm cell nucleus. The antero-posterior axis of the body significantly corresponds to the head-tail axis of the nucleus, and the dorso-ventral body axis to the central-peripheral nucleus axis. This large-scale genomic structure includes thousands of genes. One rationale for a homuncular genome structure would be to minimize connection costs in genetic networks. Somatotopic maps in cerebral cortex have been reported for over a century.
Inferring Population Size History from Large Samples of Genome-Wide Molecular Data - An Approximate Bayesian Computation Approach.

Directory of Open Access Journals (Sweden)

Simon Boitard

2016-03-01

Full Text Available Inferring the ancestral dynamics of effective population size is a long-standing question in population genetics, which can now be tackled much more accurately thanks to the massive genomic data available in many species. Several promising methods that take advantage of whole-genome sequences have been recently developed in this context. However, they can only be applied to rather small samples, which limits their ability to estimate recent population size history. Besides, they can be very sensitive to sequencing or phasing errors. Here we introduce a new approximate Bayesian computation approach named PopSizeABC that allows estimating the evolution of the effective population size through time, using a large sample of complete genomes. This sample is summarized using the folded allele frequency spectrum and the average zygotic linkage disequilibrium at different bins of physical distance, two classes of statistics that are widely used in population genetics and can be easily computed from unphased and unpolarized SNP data. Our approach provides accurate estimations of past population sizes, from the very first generations before present back to the expected time to the most recent common ancestor of the sample, as shown by simulations under a wide range of demographic scenarios. When applied to samples of 15 or 25 complete genomes in four cattle breeds (Angus, Fleckvieh, Holstein and Jersey), PopSizeABC revealed a series of population declines, related to historical events such as domestication or modern breed creation. We further highlight that our approach is robust to sequencing errors, provided summary statistics are computed from SNPs with common alleles.
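One of the two summary statistics named above, the folded allele frequency spectrum, can be sketched in a few lines. The function below is an illustrative assumption about how such a spectrum might be tabulated from per-SNP allele counts, not PopSizeABC's own code; the `min_minor` filter is a hypothetical stand-in for restricting the statistic to SNPs with common alleles.

```python
def folded_sfs(alt_counts, n_chrom, min_minor=1):
    """Folded spectrum: entry i counts SNPs whose minor allele is seen i times.

    alt_counts: per-SNP count of one (arbitrary) allele among n_chrom chromosomes.
    min_minor:  SNPs with a minor-allele count below this value are dropped.
    """
    spectrum = [0] * (n_chrom // 2 + 1)
    for count in alt_counts:
        # Folding: no ancestral/derived polarization is needed.
        minor = min(count, n_chrom - count)
        if minor >= min_minor:
            spectrum[minor] += 1
    return spectrum


# Hypothetical allele counts for 7 sites in a sample of 10 chromosomes.
counts = [1, 9, 5, 0, 10, 3, 7]
print(folded_sfs(counts, 10))
print(folded_sfs(counts, 10, min_minor=2))
```

Because only the rarer allele is counted at each site, the statistic can be computed from unpolarized data, which matches the property the abstract highlights.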
A novel common large genomic deletion and two new missense mutations identified in the Romanian phenylketonuria population.

Science.gov (United States)

Gemperle-Britschgi, Corinne; Iorgulescu, Daniela; Mager, Monica Alina; Anton-Paduraru, Dana; Vulturar, Romana; Thöny, Beat

2016-01-15

The mutation spectrum for the phenylalanine hydroxylase (PAH) gene was investigated in a cohort of 84 hyperphenylalaninemia (HPA) patients from Romania identified through newborn screening or neurometabolic investigations. Differential diagnosis identified 81 patients with classic PAH deficiency while 3 had tetrahydropterin-cofactor deficiency and/or remained uncertain due to insufficient specimen. PAH-genetic analysis included a combination of Sanger sequencing of exons and exon–intron boundaries, MLPA and NGS with genomic DNA, and cDNA analysis from immortalized lymphoblasts. A diagnostic efficiency of 99.4% was achieved, as for one allele (out of a total of 162 alleles) no mutation could be identified. The most prevalent mutation was p.Arg408Trp which was found in ~ 38% of all PKU alleles. Three novel mutations were identified, including the two missense mutations p.Gln226Lys and p.Tyr268Cys that were both disease causing by prediction algorithms, and the large genomic deletion EX6del7831 (c.509 + 4140_706 + 510del7831) that resulted in skipping of exon 6 based on PAH-cDNA analysis in immortalized lymphocytes. The genomic deletion was present in a heterozygous state in 12 patients, i.e. in ~ 8% of all the analyzed PKU alleles, and might have originated from a Romanian founder.
A soil map of a large watershed in China: applying digital soil mapping in a data sparse region

Science.gov (United States)

Barthold, F.; Blank, B.; Wiesmeier, M.; Breuer, L.; Frede, H.-G.

2009-04-01

Prediction of soil classes in data sparse regions is a major research challenge. With the advent of machine learning the possibilities to spatially predict soil classes have increased tremendously and given birth to new possibilities in soil mapping. Digital soil mapping is a research field that has been established during the last decades and has been accepted widely. We now need to develop tools to reduce the uncertainty in soil predictions. This is especially challenging in data sparse regions. One approach to do this is to implement soil taxonomic distance as a classification error criterion in classification and regression trees (CART) as suggested by Minasny et al. (Geoderma 142 (2007) 285-293). This approach assumes that the classification error should be larger between soils that are more dissimilar, i.e. differ in a larger number of soil properties, and smaller between more similar soils. Our study area is the Xilin River Basin, which is located in central Inner Mongolia in China. It is characterized by semi-arid climate conditions and is representative of the naturally occurring steppe ecosystem. The study area comprises 3600 km². We applied a random, stratified sampling design after McKenzie and Ryan (Geoderma 89 (1999) 67-94) with land use and topography as stratifying variables. We defined 10 sampling classes; from each class 14 replicates were randomly drawn and sampled. The dataset was split into 100 soil profiles for training and 40 soil profiles for validation. We then applied classification and regression trees (CART) to quantify the relationships between soil classes and environmental covariates. The classification tree explained 75.5% of the variance with land use and geology as most important predictor variables. Among the 8 soil classes that we predicted, the Kastanozems cover most of the area. They are predominantly found in steppe areas. However, even some of the soils at sand dune sites, which were thought to show only little soil formation
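The idea of weighting classification errors by taxonomic distance can be illustrated outside of any tree-building library. In the sketch below the pairwise distances and the classes other than Kastanozem are hypothetical, and the function scores a set of predictions rather than reproducing the CART splitting criterion of Minasny et al.

```python
def mean_taxonomic_error(observed, predicted, distance):
    """Average taxonomic distance between observed and predicted soil classes."""
    total = 0.0
    for obs, pred in zip(observed, predicted):
        if obs != pred:
            total += distance[frozenset((obs, pred))]
    return total / len(observed)


def plain_error_rate(observed, predicted):
    """Conventional error: every misclassification counts the same."""
    wrong = sum(1 for obs, pred in zip(observed, predicted) if obs != pred)
    return wrong / len(observed)


# Hypothetical distances on a 0..1 scale (larger = more dissimilar soils).
distance = {
    frozenset(("Kastanozem", "Chernozem")): 0.2,
    frozenset(("Kastanozem", "Arenosol")): 0.9,
    frozenset(("Chernozem", "Arenosol")): 1.0,
}
observed = ["Kastanozem", "Kastanozem", "Chernozem", "Arenosol"]
predicted = ["Kastanozem", "Chernozem", "Chernozem", "Kastanozem"]

print(plain_error_rate(observed, predicted))
print(mean_taxonomic_error(observed, predicted, distance))
```

Confusing two similar classes contributes little to the weighted error, while confusing dissimilar classes contributes much more, which is the behaviour the approach assumes.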
Large-scale parallel genome assembler over cloud computing environment.

Science.gov (United States)

Das, Arghya Kusum; Koppa, Praveen Kumar; Goswami, Sayan; Platania, Richard; Park, Seung-Jong

2017-06-01

The size of high-throughput DNA sequencing data has already reached the terabyte scale. To manage this huge volume of data, many downstream sequencing applications have started using locality-based computing over different cloud infrastructures to take advantage of elastic (pay-as-you-go) resources at a lower cost. However, the locality-based programming model (e.g. MapReduce) is relatively new. Consequently, both developing scalable data-intensive bioinformatics applications using this model and understanding the hardware environment that these applications require for good performance need further research. In this paper, we present a de Bruijn graph oriented Parallel Giraph-based Genome Assembler (GiGA), as well as the hardware platform required for its optimal performance. GiGA uses the power of Hadoop (MapReduce) and Giraph (large-scale graph analysis) to achieve high scalability over hundreds of compute nodes by collocating the computation and data. GiGA achieves significantly higher scalability with competitive assembly quality compared to contemporary parallel assemblers (e.g. ABySS and Contrail) on a traditional HPC cluster. Moreover, we show that the performance of GiGA is significantly improved by using an SSD-based private cloud infrastructure instead of a traditional HPC cluster. We observe that the performance of GiGA on 256 cores of this SSD-based cloud infrastructure closely matches that of 512 cores of a traditional HPC cluster.
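As a rough illustration of the de Bruijn graph idea mentioned in this abstract, the following single-machine Python sketch builds a graph whose nodes are (k-1)-mers and whose edges are the k-mers observed in the reads. It is not GiGA's implementation: the function name `build_de_bruijn`, the toy reads, and the choice k = 3 are assumptions made for illustration only, and a distributed assembler would partition this structure across many nodes rather than hold it in one dictionary.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Return a dict mapping each (k-1)-mer prefix to the list of
    (k-1)-mer suffixes that follow it in at least one read.

    Repeated k-mers are kept as repeated edges (a multigraph).
    """
    graph = defaultdict(list)
    for read in reads:
        # Slide a window of length k over the read.
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].append(kmer[1:])
    return dict(graph)


# Hypothetical toy reads that overlap by four bases.
reads = ["ACGTC", "CGTCA"]
graph = build_de_bruijn(reads, k=3)
for prefix, suffixes in sorted(graph.items()):
    print(prefix, "->", suffixes)
```

Keeping duplicate edges preserves k-mer multiplicity, which real assemblers typically use as coverage information; a production tool would also handle reverse complements and sequencing errors, which this sketch ignores.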
Detection of Ophiocordyceps sinensis in soil by quantitative real-time PCR.

Science.gov (United States)

Peng, Qingyun; Zhong, Xin; Lei, Wei; Zhang, Guren; Liu, Xin

2013-03-01

Ophiocordyceps sinensis, one of the best known entomopathogenic fungi in traditional Chinese medicine, parasitizes larvae of the moth genus Thitarodes, which lives in soil tunnels. However, little is known about the spatial distribution of O. sinensis in the soil. We established a protocol for DNA extraction, purification, and quantification of O. sinensis in soil with quantitative real-time PCR targeting the internal transcribed spacer region. The method was assessed using 34 soil samples from Tibet. No inhibitory effects in purified soil DNA extracts were detected. The standard curve method for absolute DNA quantification generated crossing point values that were strongly and linearly correlated to the log10 of the initial amount of O. sinensis genomic DNA (r² = 0.999) over 7 orders of magnitude (4 × 10¹ to 4 × 10⁷ fg). The amplification efficiency and y-intercept value of the standard curve were 1.953 and 37.70, respectively. The amount of O. sinensis genomic DNA decreased with increasing soil depth and horizontal distance from a sclerotium (P protocol is rapid, specific, sensitive, and provides a powerful tool for quantification of O. sinensis from soil.
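The standard curve method described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' analysis: it assumes the reported efficiency E relates to the curve slope by E = 10^(-1/slope), that the curve has the form Cp = intercept + slope × log10(amount), and that the amount is expressed in fg so that the y-intercept corresponds to 1 fg. The function names are hypothetical.

```python
import math

EFFICIENCY = 1.953  # amplification efficiency reported in the abstract
INTERCEPT = 37.70   # y-intercept reported in the abstract


def slope_from_efficiency(efficiency):
    """Slope of the standard curve, assuming E = 10 ** (-1 / slope)."""
    return -1.0 / math.log10(efficiency)


def cp_from_quantity(amount_fg, efficiency=EFFICIENCY, intercept=INTERCEPT):
    """Crossing point predicted for a given starting amount (assumed fg)."""
    return intercept + slope_from_efficiency(efficiency) * math.log10(amount_fg)


def quantity_from_cp(cp, efficiency=EFFICIENCY, intercept=INTERCEPT):
    """Invert the standard curve: starting amount (assumed fg) for a Cp."""
    return 10 ** ((cp - intercept) / slope_from_efficiency(efficiency))


if __name__ == "__main__":
    print("slope:", round(slope_from_efficiency(EFFICIENCY), 3))
    for cp in (15.0, 25.0, 35.0):
        print("Cp", cp, "->", quantity_from_cp(cp), "fg (assumed unit)")
```

Under these assumptions an efficiency of 2 would mean perfect doubling per cycle, so 1.953 indicates near-ideal amplification; a lower crossing point always maps to a larger starting amount.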
Sequencing of a new target genome: the Pediculus humanus humanus (Phthiraptera: Pediculidae) genome project.

Science.gov (United States)

Pittendrigh, B R; Clark, J M; Johnston, J S; Lee, S H; Romero-Severson, J; Dasch, G A

2006-11-01

The human body louse, Pediculus humanus humanus (L.), and the human head louse, Pediculus humanus capitis, belong to the hemimetabolous order Phthiraptera. The body louse is the primary vector that transmits the bacterial agents of louse-borne relapsing fever, trench fever, and epidemic typhus. The genomes of the bacterial causative agents of several of these diseases have been sequenced. Thus, determining the body louse genome will enhance studies of host-vector-pathogen interactions. Although not important as a major disease vector, head lice are of major social concern. Resistance to traditional pesticides used to control head and body lice has developed. It is imperative that new molecular targets be discovered for the development of novel compounds to control these insects. No complete genome sequence exists for a hemimetabolous insect species, primarily because hemimetabolous insects often have large (2000 Mb) to very large (up to 16,300 Mb) genomes. Fortuitously, we determined that the human body louse has one of the smallest genome sizes known in insects, suggesting it may be a suitable choice as a minimal hemimetabolous genome in which many genes have been eliminated during its adaptation to human parasitism. Because many louse species infest birds and mammals, the body louse genome-sequencing project will facilitate studies of their comparative genomics. A 6-8X coverage of the body louse genome, plus sequenced expressed sequence tags, should provide the entomological, evolutionary biology, medical, and public health communities with useful genetic information.
PopGenome: An Efficient Swiss Army Knife for Population Genomic Analyses in R

OpenAIRE

Pfeifer, Bastian; Wittelsbürger, Ulrich; Ramos-Onsins, Sebastian E.; Lercher, Martin J.

2014-01-01

Although many computer programs can perform population genetics calculations, they are typically limited in the analyses and data input formats they offer; few applications can process the large data sets produced by whole-genome resequencing projects. Furthermore, there is no coherent framework for the easy integration of new statistics into existing pipelines, hindering the development and application of new population genetics and genomics approaches. Here, we present PopGenome, a populati...
Modeling temporal and large-scale spatial variability of soil respiration from soil water availability, temperature and vegetation productivity indices

Science.gov (United States)

Reichstein, Markus; Rey, Ana; Freibauer, Annette; Tenhunen, John; Valentini, Riccardo; Banza, Joao; Casals, Pere; Cheng, Yufu; Grünzweig, Jose M.; Irvine, James; Joffre, Richard; Law, Beverly E.; Loustau, Denis; Miglietta, Franco; Oechel, Walter; Ourcival, Jean-Marc; Pereira, Joao S.; Peressotti, Alessandro; Ponti, Francesca; Qi, Ye; Rambal, Serge; Rayment, Mark; Romanya, Joan; Rossi, Federica; Tedeschi, Vanessa; Tirone, Giampiero; Xu, Ming; Yakir, Dan

2003-12-01

Field-chamber measurements of soil respiration from 17 different forest and shrubland sites in Europe and North America were summarized and analyzed with the goal of developing a model describing seasonal, interannual and spatial variability of soil respiration as affected by water availability, temperature, and site properties. The analysis was performed at a daily and at a monthly time step. With the daily time step, the relative soil water content in the upper soil layer expressed as a fraction of field capacity was a good predictor of soil respiration at all sites. Among the site variables tested, those related to site productivity (e.g., leaf area index) correlated significantly with soil respiration, while carbon pool variables like standing biomass or the litter and soil carbon stocks did not show a clear relationship with soil respiration. Furthermore, the effect of precipitation on soil respiration was shown to stretch beyond its direct effect via soil moisture. A general statistical nonlinear regression model was developed to describe soil respiration as dependent on soil temperature, soil water content, and site-specific maximum leaf area index. The model explained nearly two thirds of the temporal and intersite variability of soil respiration with a mean absolute error of 0.82 μmol m⁻² s⁻¹. The parameterized model exhibits the following principal properties: (1) At a relative amount of upper-layer soil water of 16% of field capacity, half-maximal soil respiration rates are reached. (2) The apparent temperature sensitivity of soil respiration measured as Q10 varies between 1 and 5 depending on soil temperature and water content. (3) Soil respiration under reference moisture and temperature conditions is linearly related to maximum site leaf area index. At a monthly timescale, we employed the approach by Raich et al. [2002] that used monthly precipitation and air temperature to globally predict soil respiration (T&P model). While this model was able to
Modelling temporal and large-scale spatial variability of soil respiration from soil water availability, temperature and vegetation productivity indices

Science.gov (United States)

Reichstein, M.; Rey, A.; Freibauer, A.; Tenhunen, J.; Valentini, R.; Soil Respiration Synthesis Team

2003-04-01

Field-chamber measurements of soil respiration from 17 different forest and shrubland sites in Europe and North America were summarized and analyzed with the goal of developing a model describing seasonal, inter-annual and spatial variability of soil respiration as affected by water availability, temperature and site properties. The analysis was performed at a daily and at a monthly time step. With the daily time step, the relative soil water content in the upper soil layer expressed as a fraction of field capacity was a good predictor of soil respiration at all sites. Among the site variables tested, those related to site productivity (e.g. leaf area index) correlated significantly with soil respiration, while carbon pool variables like standing biomass or the litter and soil carbon stocks did not show a clear relationship with soil respiration. Furthermore, the effect of precipitation on soil respiration was shown to stretch beyond its direct effect via soil moisture. A general statistical non-linear regression model was developed to describe soil respiration as dependent on soil temperature, soil water content and site-specific maximum leaf area index. The model explained nearly two thirds of the temporal and inter-site variability of soil respiration with a mean absolute error of 0.82 µmol m⁻² s⁻¹. The parameterised model exhibits the following principal properties: 1) At a relative amount of upper-layer soil water of 16% of field capacity, half-maximal soil respiration rates are reached. 2) The apparent temperature sensitivity of soil respiration measured as Q10 varies between 1 and 5 depending on soil temperature and water content. 3) Soil respiration under reference moisture and temperature conditions is linearly related to maximum site leaf area index. At a monthly time-scale we employed the approach by Raich et al. (2002, Global Change Biol. 8, 800-812) that used monthly precipitation and air temperature to globally predict soil respiration (T
Continuous data assimilation for downscaling large-footprint soil moisture retrievals

KAUST Repository

Altaf, M. U.; Jana, Raghavendra Belur; Hoteit, Ibrahim; McCabe, Matthew

2016-01-01

Soil moisture is a crucial component of the hydrologic cycle, significantly influencing runoff, infiltration, recharge, evaporation and transpiration processes. Models characterizing these processes require soil moisture as an input, either directly
Genome-wide engineering of an infectious clone of herpes simplex virus type 1 using synthetic genomics assembly methods.

Science.gov (United States)

Oldfield, Lauren M; Grzesik, Peter; Voorhies, Alexander A; Alperovich, Nina; MacMath, Derek; Najera, Claudia D; Chandra, Diya Sabrina; Prasad, Sanjana; Noskov, Vladimir N; Montague, Michael G; Friedman, Robert M; Desai, Prashant J; Vashee, Sanjay

2017-10-17

Here, we present a transformational approach to genome engineering of herpes simplex virus type 1 (HSV-1), which has a large DNA genome, using synthetic genomics tools. We believe this method will enable more rapid and complex modifications of HSV-1 and other large DNA viruses than previous technologies, facilitating many useful applications. Yeast transformation-associated recombination was used to clone 11 fragments comprising the HSV-1 strain KOS 152 kb genome. Using overlapping sequences between the adjacent pieces, we assembled the fragments into a complete virus genome in yeast, transferred it into an Escherichia coli host, and reconstituted infectious virus following transfection into mammalian cells. The virus derived from this yeast-assembled genome, KOS YA, replicated with kinetics similar to wild-type virus. We demonstrated the utility of this modular assembly technology by making numerous modifications to a single gene, making changes to two genes at the same time and, finally, generating individual and combinatorial deletions to a set of five conserved genes that encode virion structural proteins. While the ability to perform genome-wide editing through assembly methods in large DNA virus genomes raises dual-use concerns, we believe the incremental risks are outweighed by potential benefits. These include enhanced functional studies, generation of oncolytic virus vectors, development of delivery platforms of genes for vaccines or therapy, as well as more rapid development of countermeasures against potential biothreats.
Shared strategies for β-lactam catabolism in the soil microbiome

DEFF Research Database (Denmark)

Crofts, Terence S.; Wang, Bin; Spivak, Aaron

2018-01-01

The soil microbiome can produce, resist, or degrade antibiotics and even catabolize them. While resistance genes are widely distributed in the soil, there is a dearth of knowledge concerning antibiotic catabolism. Here we describe a pathway for penicillin catabolism in four isolates. Genomic......, respectively. Elucidation of additional pathways may allow bioremediation of antibiotic-contaminated soils and discovery of antibiotic-remodeling enzymes with industrial utility....
Comparative genomics reveals insights into avian genome evolution and adaptation

Science.gov (United States)

Zhang, Guojie; Li, Cai; Li, Qiye; Li, Bo; Larkin, Denis M.; Lee, Chul; Storz, Jay F.; Antunes, Agostinho; Greenwold, Matthew J.; Meredith, Robert W.; Ödeen, Anders; Cui, Jie; Zhou, Qi; Xu, Luohao; Pan, Hailin; Wang, Zongji; Jin, Lijun; Zhang, Pei; Hu, Haofu; Yang, Wei; Hu, Jiang; Xiao, Jin; Yang, Zhikai; Liu, Yang; Xie, Qiaolin; Yu, Hao; Lian, Jinmin; Wen, Ping; Zhang, Fang; Li, Hui; Zeng, Yongli; Xiong, Zijun; Liu, Shiping; Zhou, Long; Huang, Zhiyong; An, Na; Wang, Jie; Zheng, Qiumei; Xiong, Yingqi; Wang, Guangbiao; Wang, Bo; Wang, Jingjing; Fan, Yu; da Fonseca, Rute R.; Alfaro-Núñez, Alonzo; Schubert, Mikkel; Orlando, Ludovic; Mourier, Tobias; Howard, Jason T.; Ganapathy, Ganeshkumar; Pfenning, Andreas; Whitney, Osceola; Rivas, Miriam V.; Hara, Erina; Smith, Julia; Farré, Marta; Narayan, Jitendra; Slavov, Gancho; Romanov, Michael N; Borges, Rui; Machado, João Paulo; Khan, Imran; Springer, Mark S.; Gatesy, John; Hoffmann, Federico G.; Opazo, Juan C.; Håstad, Olle; Sawyer, Roger H.; Kim, Heebal; Kim, Kyu-Won; Kim, Hyeon Jeong; Cho, Seoae; Li, Ning; Huang, Yinhua; Bruford, Michael W.; Zhan, Xiangjiang; Dixon, Andrew; Bertelsen, Mads F.; Derryberry, Elizabeth; Warren, Wesley; Wilson, Richard K; Li, Shengbin; Ray, David A.; Green, Richard E.; O’Brien, Stephen J.; Griffin, Darren; Johnson, Warren E.; Haussler, David; Ryder, Oliver A.; Willerslev, Eske; Graves, Gary R.; Alström, Per; Fjeldså, Jon; Mindell, David P.; Edwards, Scott V.; Braun, Edward L.; Rahbek, Carsten; Burt, David W.; Houde, Peter; Zhang, Yong; Yang, Huanming; Wang, Jian; Jarvis, Erich D.; Gilbert, M. Thomas P.; Wang, Jun

2015-01-01

Birds are the most species-rich class of tetrapod vertebrates and have wide relevance across many research fields. We explored bird macroevolution using full genomes from 48 avian species representing all major extant clades. The avian genome is principally characterized by its constrained size, which predominantly arose because of lineage-specific erosion of repetitive elements, large segmental deletions, and gene loss. Avian genomes furthermore show a remarkably high degree of evolutionary stasis at the levels of nucleotide sequence, gene synteny, and chromosomal structure. Despite this pattern of conservation, we detected many non-neutral evolutionary changes in protein-coding genes and noncoding regions. These analyses reveal that pan-avian genomic diversity covaries with adaptations to different lifestyles and convergent evolution of traits. PMID:25504712
Draft Genome Sequence of Bacillus velezensis 3A-25B, a Strain with Biocontrol Activity against Fungal and Oomycete Root Plant Phytopathogens, Isolated from Grassland Soil.

Science.gov (United States)

Martínez-Raudales, Inés; De La Cruz-Rodríguez, Yumiko; Vega-Arreguín, Julio; Alvarado-Gutiérrez, Alejandro; Fraire-Mayorga, Atzin; Alvarado-Rodríguez, Miguel; Balderas-Hernández, Victor; Gómez-Soto, José Manuel; Fraire-Velázquez, Saúl

2017-09-28

Here, we present the draft genome of Bacillus velezensis 3A-25B, which totaled 4.01 Mb with 36 contigs, 3,948 genes, and a GC content of 46.34%. This strain, which demonstrates biocontrol activity against root rot causal phytopathogens in horticultural crops and friendly interactions in roots of pepper plantlets, was obtained from grassland soil in Zacatecas Province, Mexico.
Evolutionary analysis of a large mtDNA translocation (numt) into the nuclear genome of the Panthera genus species.

Science.gov (United States)

Kim, Jae-Heup; Antunes, Agostinho; Luo, Shu-Jin; Menninger, Joan; Nash, William G; O'Brien, Stephen J; Johnson, Warren E

2006-02-01

Translocation of cymtDNA into the nuclear genome, also referred to as numt, has been reported in many species, including several closely related to the domestic cat (Felis catus). We describe the recent transposition of 12,536 bp of the 17 kb mitochondrial genome into the nucleus of the common ancestor of the five Panthera genus species: tiger, P. tigris; snow leopard, P. uncia; jaguar, P. onca; leopard, P. pardus; and lion, P. leo. This nuclear integration, representing 74% of the mitochondrial genome, is one of the largest to be reported in eukaryotes. The Panthera genus numt differs from the numt previously described in the Felis genus in: (1) chromosomal location (F2-telomeric region vs. D2-centromeric region), (2) gene make up (from the ND5 to the ATP8 vs. from the CR to the COII), (3) size (12.5 vs. 7.9 kb), and (4) structure (single monomer vs. tandemly repeated in Felis). These distinctions indicate that the origin of this large numt fragment in the nuclear genome of the Panthera species is an independent insertion from that of the domestic cat lineage, which has been further supported by phylogenetic analyses. The tiger cymtDNA shared around 90% sequence identity with the homologous numt sequence, suggesting an origin for the Panthera numt at around 3.5 million years ago, prior to the radiation of the five extant Panthera species.
Satellite based radar interferometry to estimate large-scale soil water depletion from clay shrinkage: possibilities and limitations

NARCIS (Netherlands)

Brake, te B.; Hanssen, R.F.; Ploeg, van der M.J.; Rooij, de G.H.

2013-01-01

Satellite-based radar interferometry is a technique capable of measuring small surface elevation changes at large scales and with a high resolution. In vadose zone hydrology, it has been recognized for a long time that surface elevation changes due to swell and shrinkage of clayey soils can serve as
Big Data Analytics for Genomic Medicine

Science.gov (United States)

He, Karen Y.; Ge, Dongliang; He, Max M.

2017-01-01

Genomic medicine attempts to build individualized strategies for diagnostic or therapeutic decision-making by utilizing patients’ genomic information. Big Data analytics uncovers hidden patterns, unknown correlations, and other insights by examining various large-scale data sets. While integration and manipulation of diverse genomic data and comprehensive electronic health records (EHRs) on a Big Data infrastructure present challenges, they also provide a feasible opportunity to develop an efficient and effective approach to identify clinically actionable genetic variants for individualized diagnosis and therapy. In this paper, we review the challenges of manipulating large-scale next-generation sequencing (NGS) data and diverse clinical data derived from the EHRs for genomic medicine. We introduce possible solutions for different challenges in manipulating, managing, and analyzing genomic and clinical data to implement genomic medicine. We also present a practical Big Data toolset for identifying clinically actionable genetic variants using high-throughput NGS data and EHRs. PMID:28212287
Pyrosequencing-based comparative genome analysis of the nosocomial pathogen Enterococcus faecium and identification of a large transferable pathogenicity island

Directory of Open Access Journals (Sweden)

Bonten Marc JM

2010-04-01

Background: The Gram-positive bacterium Enterococcus faecium is an important cause of nosocomial infections in immunocompromised patients. Results: We present a pyrosequencing-based comparative genome analysis of seven E. faecium strains that were isolated from various sources. In the genomes of clinical isolates several antibiotic resistance genes were identified, including the vanA transposon that confers resistance to vancomycin in two strains. A functional comparison between E. faecium and the related opportunistic pathogen E. faecalis, based on differences in the presence of protein families, revealed divergence in plant carbohydrate metabolic pathways and oxidative stress defense mechanisms. The E. faecium pan-genome was estimated to be essentially unlimited in size, indicating that E. faecium can efficiently acquire and incorporate exogenous DNA in its gene pool. One of the most prominent sources of genomic diversity consists of bacteriophages that have integrated in the genome. The CRISPR-Cas system, which contributes to immunity against bacteriophage infection in prokaryotes, is not present in the sequenced strains. Three sequenced isolates carry the esp gene, which is involved in urinary tract infections and biofilm formation. The esp gene is located on a large pathogenicity island (PAI), which is between 64 and 104 kb in size. Conjugation experiments showed that the entire esp PAI can be transferred horizontally and inserts in a site-specific manner. Conclusions: Genes involved in environmental persistence, colonization and virulence can easily be acquired by E. faecium. This will make the development of successful treatment strategies targeted against this organism a challenge for years to come.
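To make the pan-genome notion in this abstract concrete, here is a small Python sketch that computes the pan-genome (union of gene families), the core genome (intersection), and a pan-genome accumulation curve from per-strain gene family sets. The strain names, gene family labels, and function names are hypothetical toy values, not data from the study, and the study's own estimation method is not reproduced here.

```python
def pan_and_core(genomes):
    """Return (pan, core) gene family sets for a dict of strain -> families."""
    gene_sets = [set(families) for families in genomes.values()]
    if not gene_sets:
        return set(), set()
    pan = set().union(*gene_sets)            # families seen in any strain
    core = set.intersection(*gene_sets)      # families seen in every strain
    return pan, core


def pan_genome_curve(genomes, order):
    """Pan-genome size after adding each strain in the given order."""
    seen = set()
    sizes = []
    for name in order:
        seen |= set(genomes[name])
        sizes.append(len(seen))
    return sizes


# Hypothetical toy input: three strains with made-up gene family labels.
toy_genomes = {
    "strain1": {"famA", "famB", "famC"},
    "strain2": {"famA", "famB", "famD"},
    "strain3": {"famA", "famE"},
}
pan, core = pan_and_core(toy_genomes)
curve = pan_genome_curve(toy_genomes, ["strain1", "strain2", "strain3"])
print(len(pan), len(core), curve)
```

A curve that keeps rising as strains are added, as in this toy input, is the qualitative pattern usually described as an open pan-genome; in practice the curve is averaged over many random strain orderings before any conclusion is drawn.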

The genome sequence of Caenorhabditis briggsae: a platform for comparative genomics.

Directory of Open Access Journals (Sweden)

Lincoln D Stein

2003-11-01

The soil nematodes Caenorhabditis briggsae and Caenorhabditis elegans diverged from a common ancestor roughly 100 million years ago and yet are almost indistinguishable by eye. They have the same chromosome number and genome sizes, and they occupy the same ecological niche. To explore the basis for this striking conservation of structure and function, we have sequenced the C. briggsae genome to a high-quality draft stage and compared it to the finished C. elegans sequence. We predict approximately 19,500 protein-coding genes in the C. briggsae genome, roughly the same as in C. elegans. Of these, 12,200 have clear C. elegans orthologs, a further 6,500 have one or more clearly detectable C. elegans homologs, and approximately 800 C. briggsae genes have no detectable matches in C. elegans. Almost all of the noncoding RNAs (ncRNAs) known are shared between the two species. The two genomes exhibit extensive colinearity, and the rate of divergence appears to be higher in the chromosomal arms than in the centers. Operons, a distinctive feature of C. elegans, are highly conserved in C. briggsae, with the arrangement of genes being preserved in 96% of cases. The difference in size between the C. briggsae (estimated at approximately 104 Mbp) and C. elegans (100.3 Mbp) genomes is almost entirely due to repetitive sequence, which accounts for 22.4% of the C. briggsae genome in contrast to 16.5% of the C. elegans genome. Few, if any, repeat families are shared, suggesting that most were acquired after the two species diverged or are undergoing rapid evolution. Coclustering the C. elegans and C. briggsae proteins reveals 2,169 protein families of two or more members. Most of these are shared between the two species, but some appear to be expanding or contracting, and there seem to be as many as several hundred novel C. briggsae gene families. The C. briggsae draft sequence will greatly improve the annotation of the C. elegans genome. Based on similarity to C
Genome projects and the functional-genomic era.

Science.gov (United States)

Sauer, Sascha; Konthur, Zoltán; Lehrach, Hans

2005-12-01

The problems we face today in public health as a result of the (fortunately) increasing age of people and the requirements of developing countries create an urgent need for new and innovative approaches in medicine and in agronomics. Genomic and functional genomic approaches have great potential to at least partially solve these problems in the future. Important progress has been made by procedures to decode genomic information of humans, but also of other key organisms. The basic comprehension of genomic information (and its transfer) should now give us the possibility to pursue the next important step in life science, eventually leading to a basic understanding of biological information flow: the elucidation of the function of all genes and correlative products encoded in the genome, as well as the discovery of their interactions in a molecular context and the response to environmental factors. As a result of the sequencing projects, we are now able to ask important questions about sequence variation and can start to comprehensively study the function of expressed genes on different levels such as RNA, protein or the cell in a systematic context including underlying networks. In this article we review and comment on current trends in large-scale systematic biological research. A particular emphasis is put on technology developments that can provide means to accomplish the tasks of future lines of functional genomics.
Enabling Graph Appliance for Genome Assembly

Energy Technology Data Exchange (ETDEWEB)

Singh, Rina [ORNL; Graves, Jeffrey A [ORNL; Lee, Sangkeun (Matt) [ORNL; Sukumar, Sreenivas R [ORNL; Shankar, Mallikarjun [ORNL

2015-01-01

In recent years, there has been a huge growth in the amount of genomic data available as reads generated from various genome sequencers. The number of reads generated can be huge, ranging from hundreds to billions of nucleotides, each varying in size. Assembling such large amounts of data is one of the challenging computational problems for both biomedical and data scientists. Most of the genome assemblers developed have used de Bruijn graph techniques. A de Bruijn graph represents a collection of read sequences by billions of vertices and edges, which require large amounts of memory and computational power to store and process. This is the major drawback to de Bruijn graph assembly. Massively parallel, multi-threaded, shared memory systems can be leveraged to overcome some of these issues. The objective of our research is to investigate the feasibility and scalability issues of de Bruijn graph assembly on Cray's Urika-GD system; Urika-GD is a high performance graph appliance with a large shared memory and massively multithreaded custom processor designed for executing SPARQL queries over large-scale RDF data sets. However, to the best of our knowledge, there is no research on representing a de Bruijn graph as an RDF graph or finding Eulerian paths in RDF graphs using SPARQL for potential genome discovery. In this paper, we address the issues involved in representing de Bruijn graphs as RDF graphs and propose an iterative querying approach for finding Eulerian paths in large RDF graphs. We evaluate the performance of our implementation on real world Ebola genome datasets and illustrate how genome assembly can be accomplished with Urika-GD using iterative SPARQL queries.
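The Eulerian path step mentioned in this abstract can be illustrated with a short in-memory Python sketch. It uses Hierholzer's algorithm on a plain edge list, which is a swapped-in textbook technique and not the iterative SPARQL approach the authors propose for RDF graphs; the function names and the toy edges are assumptions for illustration.

```python
from collections import defaultdict


def eulerian_path(edges):
    """Return a node list that uses every directed edge exactly once.

    Raises ValueError if the edge list admits no Eulerian path.
    """
    if not edges:
        return []
    out_edges = defaultdict(list)
    balance = defaultdict(int)  # out-degree minus in-degree per node
    for u, v in edges:
        out_edges[u].append(v)
        balance[u] += 1
        balance[v] -= 1
    starts = [n for n, b in balance.items() if b == 1]
    ends = [n for n, b in balance.items() if b == -1]
    unbalanced = [n for n, b in balance.items() if b not in (-1, 0, 1)]
    if unbalanced or len(starts) > 1 or len(starts) != len(ends):
        raise ValueError("degree conditions for an Eulerian path fail")
    start = starts[0] if starts else edges[0][0]
    # Hierholzer's algorithm with an explicit stack.
    stack, path = [start], []
    while stack:
        node = stack[-1]
        if out_edges[node]:
            stack.append(out_edges[node].pop())
        else:
            path.append(stack.pop())
    path.reverse()
    if len(path) != len(edges) + 1:
        raise ValueError("edges are not all reachable from the start node")
    return path


def spell(path):
    """Spell the sequence implied by a path of overlapping (k-1)-mers."""
    return path[0] + "".join(node[-1] for node in path[1:])


# Hypothetical toy de Bruijn edges between 2-mers.
toy_edges = [("AC", "CG"), ("CG", "GT"), ("GT", "TC"), ("TC", "CA")]
print(spell(eulerian_path(toy_edges)))
```

The degree check runs first because an Eulerian path in a directed graph requires at most one node with one extra outgoing edge and one with one extra incoming edge; the final length check catches disconnected inputs that pass the degree test.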
Genomes to Proteomes

Energy Technology Data Exchange (ETDEWEB)

Panisko, Ellen A. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Grigoriev, Igor [USDOE Joint Genome Inst., Walnut Creek, CA (United States); Daly, Don S. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Webb-Robertson, Bobbie-Jo [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Baker, Scott E. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States)

2009-03-01

Biologists are awash with genomic sequence data. In large part, this is due to the rapid acceleration in the generation of DNA sequence that occurred as public and private research institutes raced to sequence the human genome. In parallel with the large human genome effort, mostly smaller genomes of other important model organisms were sequenced. Projects following on these initial efforts have made use of technological advances and the DNA sequencing infrastructure that was built for the human and other organism genome projects. As a result, the genome sequences of many organisms are available in high quality draft form. While in many ways this is good news, there are limitations to the biological insights that can be gleaned from DNA sequences alone; genome sequences offer only a bird's eye view of the biological processes endemic to an organism or community. Fortunately, the genome sequences now being produced at such a high rate can serve as the foundation for other global experimental platforms such as proteomics. Proteomic methods offer a snapshot of the proteins present at a point in time for a given biological sample. Current global proteomics methods combine enzymatic digestion, separations, mass spectrometry and database searching for peptide identification. One key aspect of proteomics is the prediction of peptide sequences from mass spectrometry data. Global proteomic analysis uses computational matching of experimental mass spectra with predicted spectra based on databases of gene models that are often generated computationally. Thus, the quality of gene models predicted from a genome sequence is crucial in the generation of high quality peptide identifications. Once peptides are identified they can be assigned to their parent protein. Proteins identified as expressed in a given experiment are most useful when compared to other expressed proteins in a larger biological context or biochemical pathway. In this chapter we will discuss the automatic
[Interrelationships between soil fauna and soil environmental factors in China: research advance].

Science.gov (United States)

Wang, Yi; Wei, Wei; Yang, Xing-zhong; Chen, Li-ding; Yang, Lei

2010-09-01

Soil fauna is closely related to various environmental factors in the soil ecosystem. Exploring the interrelationships between soil fauna and soil environmental factors is vital for understanding the dynamics of the soil ecosystem in depth and for assessing its functioning. The environmental factors affecting soil fauna can be classified as soil properties and the external soil environment. The former comprises basic soil physical and chemical properties, soil moisture, and soil pollution. The latter includes vegetation, land use type, landform, and climate. From these aspects, this paper summarizes the literature published in China on the interrelationships between soil fauna and soil environmental factors. Several problems were identified in related studies: few studies have integrated the bio-indicator function of soil fauna, research methods need to be improved, and studies of multiple environmental factors and their large-scale spatial-temporal variability are lacking. Corresponding suggestions are proposed: more work should be done according to practical needs, advanced experience from abroad should be drawn upon, and comprehensive studies of multiple environmental factors and long-term monitoring should be conducted over large areas.
The Salmonella enterica Pan-genome

DEFF Research Database (Denmark)

Jacobsen, Annika; Hendriksen, Rene S.; Aarestrup, Frank Møller

2011-01-01

Salmonella enterica is divided into four subspecies containing a large number of different serovars, several of which are important zoonotic pathogens and some show a high degree of host specificity or host preference. We compare 45 sequenced S. enterica genomes that are publicly available (22......, and the core and pan-genome of Salmonella were estimated to be around 2,800 and 10,000 gene families, respectively. The constructed pan-genomic dendrograms suggest that gene content is often, but not uniformly correlated to serotype. Any given Salmonella strain has a large stable core, whilst...... there is an abundance of accessory genes, including the Salmonella pathogenicity islands (SPIs), transposable elements, phages, and plasmid DNA. We visualize conservation in the genomes in relation to chromosomal location and DNA structural features and find that variation in gene content is localized in a selection...
Single-Cell (Meta-)Genomics of a Dimorphic Candidatus Thiomargarita nelsonii Reveals Genomic Plasticity

Directory of Open Access Journals (Sweden)

Beverly E. Flood

2016-05-01

The genus Thiomargarita includes the world’s largest bacteria. But as uncultured organisms, their physiology, metabolism, and basis for their gigantism are not well understood. Thus a genomics approach, applied to a single Candidatus Thiomargarita nelsonii cell, was employed to explore the genetic potential of one of these enigmatic giant bacteria. The Thiomargarita cell was obtained from an assemblage of budding Ca. T. nelsonii attached to a provannid gastropod shell from Hydrate Ridge, a methane seep offshore of Oregon, USA. Here we present a manually curated genome of Bud S10 resulting from a hybrid assembly of long Pacific Biosciences and short Illumina sequencing reads. With respect to inorganic carbon fixation and sulfur oxidation pathways, the Ca. T. nelsonii Hydrate Ridge Bud S10 genome was similar to marine sister taxa within the family Beggiatoaceae. However, the Bud S10 genome contains genes suggestive of the genetic potential for lithotrophic growth on arsenite and perhaps hydrogen. The genome also revealed that Bud S10 likely respires nitrate via two pathways: a complete denitrification pathway and a dissimilatory nitrate reduction to ammonia pathway. Both pathways have been predicted, but not previously fully elucidated, in the genomes of other large, vacuolated, sulfur-oxidizing bacteria. Surprisingly, the genome also had a high number of unusual features for a bacterium, including the largest number of metacaspases and introns ever reported in a bacterium. Also present are a large number of other mobile genetic elements, such as insertion sequence transposable elements and miniature inverted-repeat transposable elements (MITEs). In some cases, mobile genetic elements disrupted key genes in metabolic pathways. For example, a MITE interrupts hupL, which encodes the large subunit of the hydrogenase in hydrogen oxidation. Moreover, we detected a group I intron in one of the most critical genes in the sulfur oxidation pathway, dsr
Between Two Fern Genomes

Science.gov (United States)

2014-01-01

Ferns are the only major lineage of vascular plants not represented by a sequenced nuclear genome. This lack of genome sequence information significantly impedes our ability to understand and reconstruct genome evolution not only in ferns, but across all land plants. Azolla and Ceratopteris are ideal and complementary candidates to be the first ferns to have their nuclear genomes sequenced. They differ dramatically in genome size, life history, and habit, and thus represent the immense diversity of extant ferns. Together, this pair of genomes will facilitate myriad large-scale comparative analyses across ferns and all land plants. Here we review the unique biological characteristics of ferns and describe a number of outstanding questions in plant biology that will benefit from the addition of ferns to the set of taxa with sequenced nuclear genomes. We explain why the fern clade is pivotal for understanding genome evolution across land plants, and we provide a rationale for how knowledge of fern genomes will enable progress in research beyond the ferns themselves. PMID:25324969
Circumpolar assessment of rhizosphere priming shows limited increase in carbon loss estimates for permafrost soils but large regional variability

Science.gov (United States)

Wild, B.; Keuper, F.; Kummu, M.; Beer, C.; Blume-Werry, G.; Fontaine, S.; Gavazov, K.; Gentsch, N.; Guggenberger, G.; Hugelius, G.; Jalava, M.; Koven, C.; Krab, E. J.; Kuhry, P.; Monteux, S.; Richter, A.; Shazhad, T.; Dorrepaal, E.

2017-12-01

Predictions of soil organic carbon (SOC) losses in the northern circumpolar permafrost area converge around 15% (± 3% standard error) of the initial C pool by 2100 under the RCP 8.5 warming scenario. Yet, none of these estimates consider plant-soil interactions such as the rhizosphere priming effect (RPE). While laboratory experiments have shown that the input of plant-derived compounds can stimulate SOC losses by up to 1200%, the magnitude of RPE in natural ecosystems is unknown and no methods for upscaling exist so far. We here present the first spatially and depth-explicit RPE model that allows estimates of RPE on a large scale (PrimeSCale). We combine available spatial data (SOC, C/N, GPP, ALT and ecosystem type) and new ecological insights to assess the importance of the RPE at the circumpolar scale. We use a positive saturating relationship between the RPE and belowground C allocation and two ALT-dependent rooting-depth distribution functions (for tundra and boreal forest) to proportionally assign belowground C allocation and RPE to individual soil depth increments. The model can account for reasonable limiting factors on additional SOC losses by RPE, including interactions between spatial and/or depth variation in GPP, plant root density, SOC stocks and ALT. We estimate potential RPE-induced SOC losses at 9.7 Pg C (5 - 95% CI: 1.5 - 23.2 Pg C) by 2100 (RCP 8.5). This corresponds to an increase of the current permafrost SOC-loss estimate from 15% of the initial C pool to about 16%. If we apply an additional molar C/N threshold of 20 to account for microbial C limitation as a requirement for the RPE, SOC losses by RPE are further reduced to 6.5 Pg C (5 - 95% CI: 1.0 - 16.8 Pg C) by 2100 (RCP 8.5). Although our results show that current estimates of permafrost soil C losses are robust without taking into account the RPE, our model also highlights high-RPE risk in Siberian lowland areas and Alaska north of the Brooks Range. The small overall impact of
Goodbye genome paper, hello genome report: the increasing popularity of 'genome announcements' and their impact on science.

Science.gov (United States)

Smith, David Roy

2017-05-01

Next-generation sequencing technologies have revolutionized genomics and altered the scientific publication landscape. Life-science journals abound with genome papers: peer-reviewed descriptions of newly sequenced chromosomes. Although they once filled the pages of Nature and Science, genome papers are now mostly relegated to journals with low impact factors. Some have forecast the death of the genome paper and argued that they are using up valuable resources and not advancing science. However, the publication rate of genome papers is on the rise. This increase is largely because some journals have created a new category of manuscript called genome reports, which are short, fast-tracked papers describing a chromosome sequence(s), its GenBank accession number and little else. In 2015, for example, more than 2000 genome reports were published, and 2016 is poised to bring even more. Here, I highlight the growing popularity of genome reports and discuss their merits, drawbacks and impact on science and the academic publication infrastructure. Genome reports can be excellent assets for the research community, but they are also being used as quick and easy routes to a publication, and in some instances they are not peer reviewed. One of the best arguments for genome reports is that they are a citable, user-generated genomic resource providing essential methodological and biological information, which may not be present in the sequence database. But they are expensive and time-consuming avenues for achieving such a goal. © The Author 2016. Published by Oxford University Press.
Temperature response of soil respiration largely unaltered with experimental warming

NARCIS (Netherlands)

Carey, J.C.; Tang, J.; Templer, P.H.; Kroeger, K.D.; Crowther, T.W.; Burton, A.J.; Dukes, J.S.; Emmett, B.; Frey, S.D.; Heskel, M.A.; Jiang, L.; Machmuller, M.B.; Mohan, J.; Panetta, A.M.; Reich, P.B.; Reinsch, S.; Wang, X.; Allison, S.D.; Bamminger, C.; Bridgham, S.; Collins, S.L.; de Dato, G.; Eddy, W.C.; Enquist, B.J.; Estiarte, M.; Harte, J.; Henderson, A.; Johnson, B.R.; Larsen, K.S.; Luo, Y.; Marhan, S.; Melillo, J.M.; Peñuelas, J.; Pfeifer-Meister, L.; Poll, C.; Rastetter, E.; Reinmann, A.B.; Reynolds, L.L.; Schmidt, I.K.; Shaver, G.R.; Strong, A.L.; Suseela, V.; Tietema, A.

2016-01-01

The respiratory release of carbon dioxide (CO2) from soil is a major yet poorly understood flux in the global carbon cycle. Climatic warming is hypothesized to increase rates of soil respiration, potentially fueling further increases in global temperatures. However, despite considerable scientific
Correction for Measurement Error from Genotyping-by-Sequencing in Genomic Variance and Genomic Prediction Models

DEFF Research Database (Denmark)

Ashraf, Bilal; Janss, Luc; Jensen, Just

sample). The GBSeq data can be used directly in genomic models in the form of individual SNP allele-frequency estimates (e.g., reference reads/total reads per polymorphic site per individual), but is subject to measurement error due to the low sequencing depth per individual. Due to technical reasons....... In the current work we show how the correction for measurement error in GBSeq can also be applied in whole genome genomic variance and genomic prediction models. Bayesian whole-genome random regression models are proposed to allow implementation of large-scale SNP-based models with a per-SNP correction...... for measurement error. We show correct retrieval of genomic explained variance, and improved genomic prediction when accounting for the measurement error in GBSeq data...
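The per-individual allele-frequency estimate described above (reference reads divided by total reads at a polymorphic site) can be sketched as follows. This is a minimal illustration only: the names `SiteReads` and `allele_frequency` are hypothetical, and the binomial sampling variance used as a stand-in for "measurement error" is an assumption of this sketch, not the correction proposed in the paper.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class SiteReads:
    """Read counts for one individual at one polymorphic site (hypothetical container)."""
    ref_reads: int
    total_reads: int


def allele_frequency(site: SiteReads) -> Optional[Tuple[float, float]]:
    """Return (estimate, plug-in binomial variance), or None when the site has no reads.

    Assumption: each read is an independent draw, so the variance of the
    estimate is p * (1 - p) / n. Low depth (small n) gives a large variance,
    which is why low-coverage data carry substantial measurement error.
    """
    if site.total_reads == 0:
        return None  # no reads at this site: the estimate is undefined
    p = site.ref_reads / site.total_reads
    variance = p * (1.0 - p) / site.total_reads
    return p, variance


print(allele_frequency(SiteReads(ref_reads=3, total_reads=4)))
```

With only four reads the variance term is large relative to the estimate, which illustrates why a per-SNP correction may matter at low sequencing depth.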
Diurnal hysteresis between soil CO2 and soil temperature is controlled by soil water content

Science.gov (United States)

Diego A. Riveros-Iregui; Ryan E. Emanuel; Daniel J. Muth; Brian L. McGlynn; Howard E. Epstein; Daniel L. Welsch; Vincent J. Pacific; Jon M. Wraith

2007-01-01

Recent years have seen a growing interest in measuring and modeling soil CO2 efflux, as this flux represents a large component of ecosystem respiration and is a key determinant of ecosystem carbon balance. Process-based models of soil CO2 production and efflux, commonly based on soil temperature, are limited by nonlinearities such as the observed diurnal hysteresis...
Assembly of 500,000 inter-specific catfish expressed sequence tags and large scale gene-associated marker development for whole genome association studies

Energy Technology Data Exchange (ETDEWEB)

Catfish Genome Consortium; Wang, Shaolin; Peatman, Eric; Abernathy, Jason; Waldbieser, Geoff; Lindquist, Erika; Richardson, Paul; Lucas, Susan; Wang, Mei; Li, Ping; Thimmapuram, Jyothi; Liu, Lei; Vullaganti, Deepika; Kucuktas, Huseyin; Murdock, Christopher; Small, Brian C; Wilson, Melanie; Liu, Hong; Jiang, Yanliang; Lee, Yoona; Chen, Fei; Lu, Jianguo; Wang, Wenqi; Xu, Peng; Somridhivej, Benjaporn; Baoprasertkul, Puttharat; Quilang, Jonas; Sha, Zhenxia; Bao, Baolong; Wang, Yaping; Wang, Qun; Takano, Tomokazu; Nandi, Samiran; Liu, Shikai; Wong, Lilian; Kaltenboeck, Ludmilla; Quiniou, Sylvie; Bengten, Eva; Miller, Norman; Trant, John; Rokhsar, Daniel; Liu, Zhanjiang

2010-03-23

Background: Through the Community Sequencing Program, a catfish EST sequencing project was carried out through a collaboration between the catfish research community and the Department of Energy's Joint Genome Institute. Prior to this project, only a limited EST resource from catfish was available for the purpose of SNP identification. Results: A total of 438,321 quality ESTs were generated from 8 channel catfish (Ictalurus punctatus) and 4 blue catfish (Ictalurus furcatus) libraries, bringing the number of catfish ESTs to nearly 500,000. Assembly of all catfish ESTs resulted in 45,306 contigs and 66,272 singletons. Over 35% of the unique sequences had significant similarities to known genes, allowing the identification of 14,776 unique genes in catfish. Over 300,000 putative SNPs have been identified, of which approximately 48,000 are high-quality SNPs identified from contigs with at least four sequences and the minor allele presence of at least two sequences in the contig. The EST resource should be valuable for identification of microsatellites, genome annotation, large-scale expression analysis, and comparative genome analysis. Conclusions: This project generated a large EST resource for catfish that captured the majority of the catfish transcriptome. The parallel analysis of ESTs from two closely related Ictalurid catfishes should also provide powerful means for the evaluation of ancient and recent gene duplications, and for the development of high-density microarrays in catfish. The inter- and intra-specific SNPs identified from the assembly of the complete catfish EST dataset will greatly benefit the catfish introgression breeding program and whole genome association studies.
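The high-quality SNP criterion stated in the abstract (a contig with at least four sequences, and the minor allele present in at least two of them) can be sketched as a simple filter. This is an illustrative sketch under assumptions: the function name `is_high_quality_snp`, its parameters, and the representation of a site as the list of bases observed in one alignment column are hypothetical and are not taken from the consortium's pipeline.

```python
from collections import Counter
from typing import Sequence


def is_high_quality_snp(bases: Sequence[str], min_depth: int = 4, min_minor: int = 2) -> bool:
    """Apply the depth and minor-allele thresholds to one alignment column.

    `bases` holds the base each EST in the contig shows at this position
    (hypothetical input format). Defaults mirror the thresholds in the
    abstract: at least four sequences, minor allele seen at least twice.
    """
    counts = Counter(bases)
    if len(bases) < min_depth or len(counts) < 2:
        return False  # too shallow, or no variation at this column
    # most_common(2) yields the major allele first, then the minor allele
    minor_count = counts.most_common(2)[1][1]
    return minor_count >= min_minor


print(is_high_quality_snp("AAGG"))  # two alleles, each seen twice
print(is_high_quality_snp("AAAG"))  # minor allele seen only once
```

Requiring the minor allele in at least two sequences is a common way to reduce the chance that a single sequencing error is called as a SNP.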
Genome sequence of the moderately thermophilic sulfur-reducing bacterium Thermanaerovibrio velox type strain (Z-9701T) and emended description of the genus Thermanaerovibrio

OpenAIRE

Palaniappan, Krishna; Meier-Kolthoff, Jan P.; Teshima, Hazuki; Nolan, Matt; Lapidus, Alla; Tice, Hope; Del Rio, Tijana Glavina; Cheng, Jan-Fang; Han, Cliff; Tapia, Roxanne; Goodwin, Lynne A.; Pitluck, Sam; Liolios, Konstantinos; Mavromatis, Konstantinos; Pagani, Ioanna

2013-01-01

Thermanaerovibrio velox Zavarzina et al. 2000 is a member of the Synergistaceae, a family in the phylum Synergistetes that is already well-characterized at the genome level. Members of this phylum were described as Gram-negative staining anaerobic bacteria with a rod/vibrioid cell shape and possessing an atypical outer cell envelope. They inhabit a large variety of anaerobic environments including soil, oil wells, wastewater treatment plants and animal gastrointestinal tracts. They are also ...
Novel Insights into the Diversity of Catabolic Metabolism from Ten Haloarchaeal Genomes

Energy Technology Data Exchange (ETDEWEB)

Anderson, Iain; Scheuner, Carmen; Goker, Markus; Mavromatis, Kostas; Hooper, Sean D.; Porat, Iris; Klenk, Hans-Peter; Ivanova, Natalia; Kyrpides, Nikos

2011-05-03

The extremely halophilic archaea are present worldwide in saline environments and have important biotechnological applications. Ten complete genomes of haloarchaea are now available, providing an opportunity for comparative analysis. We report here the comparative analysis of five newly sequenced haloarchaeal genomes with five previously published ones. Whole genome trees based on protein sequences provide strong support for deep relationships between the ten organisms. Using a soft clustering approach, we identified 887 protein clusters present in all halophiles. Of these core clusters, 112 are not found in any other archaea and therefore constitute the haloarchaeal signature. Four of the halophiles were isolated from water, and four were isolated from soil or sediment. Although there are few habitat-specific clusters, the soil/sediment halophiles tend to have greater capacity for polysaccharide degradation, siderophore synthesis, and cell wall modification. Halorhabdus utahensis and Haloterrigena turkmenica encode over forty glycosyl hydrolases each, and may be capable of breaking down naturally occurring complex carbohydrates. H. utahensis is specialized for growth on carbohydrates and has few amino acid degradation pathways. It uses the non-oxidative pentose phosphate pathway instead of the oxidative pathway, giving it more flexibility in the metabolism of pentoses. These new genomes expand our understanding of haloarchaeal catabolic pathways, providing a basis for further experimental analysis, especially with regard to carbohydrate metabolism. Halophilic glycosyl hydrolases for use in biofuel production are more likely to be found in halophiles isolated from soil or sediment.
Feasibility analysis of using inverse modeling for estimating natural groundwater recharge from a large-scale soil moisture monitoring network

Science.gov (United States)

Wang, Tiejun; Franz, Trenton E.; Yue, Weifeng; Szilagyi, Jozsef; Zlotnik, Vitaly A.; You, Jinsheng; Chen, Xunhong; Shulski, Martha D.; Young, Aaron

2016-02-01

Despite the importance of groundwater recharge (GR), its accurate estimation still remains one of the most challenging tasks in the field of hydrology. In this study, with the help of inverse modeling, long-term (6 years) soil moisture data at 34 sites from the Automated Weather Data Network (AWDN) were used to estimate the spatial distribution of GR across Nebraska, USA, where significant spatial variability exists in soil properties and precipitation (P). To ensure the generality of this study and its potential broad applications, data from public domains and literature were used to parameterize the standard Hydrus-1D model. Although observed soil moisture differed significantly across the AWDN sites mainly due to the variations in P and soil properties, the simulations were able to capture the dynamics of observed soil moisture under different climatic and soil conditions. The inferred mean annual GR from the calibrated models varied over three orders of magnitude across the study area. To assess the uncertainties of the approach, estimates of GR and actual evapotranspiration (ETa) from the calibrated models were compared to the GR and ETa obtained from other techniques in the study area (e.g., remote sensing, tracers, and regional water balance). Comparison clearly demonstrated the feasibility of inverse modeling and large-scale (>10^4 km^2) soil moisture monitoring networks for estimating GR. In addition, the model results were used to further examine the impacts of climate and soil on GR. The data showed that both P and soil properties had significant impacts on GR in the study area with coarser soils generating higher GR; however, different relationships between GR and P emerged at the AWDN sites, defined by local climatic and soil conditions. In general, positive correlations existed between annual GR and P for the sites with coarser-textured soils or under wetter climatic conditions. With the rapidly expanding soil moisture monitoring networks around the
M-GCAT: interactively and efficiently constructing large-scale multiple genome comparison frameworks in closely related species

Directory of Open Access Journals (Sweden)

Messeguer Xavier

2006-10-01

Background: Due to recent advances in whole genome shotgun sequencing and assembly technologies, the financial cost of decoding an organism's DNA has been drastically reduced, resulting in a recent explosion of genomic sequencing projects. This increase in related genomic data will allow for in-depth studies of evolution in closely related species through multiple whole genome comparisons. Results: To facilitate such comparisons, we present an interactive multiple genome comparison and alignment tool, M-GCAT, that can efficiently construct multiple genome comparison frameworks in closely related species. M-GCAT is able to compare and identify highly conserved regions in up to 20 closely related bacterial species in minutes on a standard computer, and as many as 90 (containing 75 cloned genomes from a set of 15 published enterobacterial genomes) in an hour. M-GCAT also incorporates a novel comparative genomics data visualization interface allowing the user to globally and locally examine and inspect the conserved regions and gene annotations. Conclusion: M-GCAT is an interactive comparative genomics tool well suited for quickly generating multiple genome comparison frameworks and alignments among closely related species. M-GCAT is freely available for download for academic and non-commercial use at: http://alggen.lsi.upc.es/recerca/align/mgcat/intro-mgcat.html.
The Brassica oleracea genome reveals the asymmetrical evolution of polyploid genomes

Science.gov (United States)

Liu, Shengyi; Liu, Yumei; Yang, Xinhua; Tong, Chaobo; Edwards, David; Parkin, Isobel A. P.; Zhao, Meixia; Ma, Jianxin; Yu, Jingyin; Huang, Shunmou; Wang, Xiyin; Wang, Junyi; Lu, Kun; Fang, Zhiyuan; Bancroft, Ian; Yang, Tae-Jin; Hu, Qiong; Wang, Xinfa; Yue, Zhen; Li, Haojie; Yang, Linfeng; Wu, Jian; Zhou, Qing; Wang, Wanxin; King, Graham J; Pires, J. Chris; Lu, Changxin; Wu, Zhangyan; Sampath, Perumal; Wang, Zhuo; Guo, Hui; Pan, Shengkai; Yang, Limei; Min, Jiumeng; Zhang, Dong; Jin, Dianchuan; Li, Wanshun; Belcram, Harry; Tu, Jinxing; Guan, Mei; Qi, Cunkou; Du, Dezhi; Li, Jiana; Jiang, Liangcai; Batley, Jacqueline; Sharpe, Andrew G; Park, Beom-Seok; Ruperao, Pradeep; Cheng, Feng; Waminal, Nomar Espinosa; Huang, Yin; Dong, Caihua; Wang, Li; Li, Jingping; Hu, Zhiyong; Zhuang, Mu; Huang, Yi; Huang, Junyan; Shi, Jiaqin; Mei, Desheng; Liu, Jing; Lee, Tae-Ho; Wang, Jinpeng; Jin, Huizhe; Li, Zaiyun; Li, Xun; Zhang, Jiefu; Xiao, Lu; Zhou, Yongming; Liu, Zhongsong; Liu, Xuequn; Qin, Rui; Tang, Xu; Liu, Wenbin; Wang, Yupeng; Zhang, Yangyong; Lee, Jonghoon; Kim, Hyun Hee; Denoeud, France; Xu, Xun; Liang, Xinming; Hua, Wei; Wang, Xiaowu; Wang, Jun; Chalhoub, Boulos; Paterson, Andrew H

2014-01-01

Polyploidization has provided much genetic variation for plant adaptive evolution, but the mechanisms by which the molecular evolution of polyploid genomes establishes genetic architecture underlying species differentiation are unclear. Brassica is an ideal model to increase knowledge of polyploid evolution. Here we describe a draft genome sequence of Brassica oleracea, comparing it with that of its sister species B. rapa to reveal numerous chromosome rearrangements and asymmetrical gene loss in duplicated genomic blocks, asymmetrical amplification of transposable elements, differential gene co-retention for specific pathways and variation in gene expression, including alternative splicing, among a large number of paralogous and orthologous genes. Genes related to the production of anticancer phytochemicals and morphological variations illustrate consequences of genome duplication and gene divergence, imparting biochemical and morphological variation to B. oleracea. This study provides insights into Brassica genome evolution and will underpin research into the many important crops in this genus. PMID:24852848
Five Complete Chloroplast Genome Sequences from Diospyros: Genome Organization and Comparative Analysis.

Science.gov (United States)

Fu, Jianmin; Liu, Huimin; Hu, Jingjing; Liang, Yuqin; Liang, Jinjun; Wuyun, Tana; Tan, Xiaofeng

2016-01-01

Diospyros is the largest genus in Ebenaceae, comprising more than 500 species with remarkable economic value, especially Diospyros kaki Thunb., which has traditionally been an important food resource in China, Korea, and Japan. Complete chloroplast (cp) genomes from D. kaki, D. lotus L., D. oleifera Cheng., D. glaucifolia Metc., and Diospyros 'Jinzaoshi' were sequenced using Illumina sequencing technology. This is the first cp genome reported in Ebenaceae. The cp genome sequences of Diospyros ranged from 157,300 to 157,784 bp in length, presenting a typical quadripartite structure with two inverted repeats each separated by one large and one small single-copy region. For each cp genome, 134 genes were annotated, including 80 protein-coding, 31 tRNA, and 4 rRNA unique genes. In all, 179 repeats and 283 single sequence repeats were identified. Four hypervariable regions, namely, intergenic region of trnQ_rps16, trnV_ndhC, and psbD_trnT, and intron of ndhA, were identified in the Diospyros genomes. Phylogenetic analyses based on the whole cp genome, protein-coding, and intergenic and intron sequences indicated that D. oleifera is closely related to D. kaki and could be used as a model plant for future research on D. kaki; to our knowledge, this is proposed for the first time. Further, these analyses together with two large deletions (301 and 140 bp) in the cp genome of D. 'Jinzaoshi', support its placement as a new species in Diospyros. Both maximum parsimony and likelihood analyses for 19 taxa indicated the basal position of Ericales in asterids and suggested that Ebenaceae is monophyletic in Ericales.

Complete genome sequence of Beijerinckia indica subsp. indica.

Science.gov (United States)

Tamas, Ivica; Dedysh, Svetlana N; Liesack, Werner; Stott, Matthew B; Alam, Maqsudul; Murrell, J Colin; Dunfield, Peter F

2010-09-01

Beijerinckia indica subsp. indica is an aerobic, acidophilic, exopolysaccharide-producing, N(2)-fixing soil bacterium. It is a generalist chemoorganotroph that is phylogenetically closely related to facultative and obligate methanotrophs of the genera Methylocella and Methylocapsa. Here we report the full genome sequence of this bacterium.
Changes in photosynthesis and soil moisture drive the seasonal soil respiration-temperature hysteresis relationship

Science.gov (United States)

In nearly all large-scale models, CO2 efflux from soil (i.e., soil respiration) is represented as a function of soil temperature. However, the relationship between soil respiration and soil temperature is highly variable at the local scale, and there is often a pronounced hysteresis in the soil resp...
Ensembl Genomes 2013: scaling up access to genome-wide data.

Science.gov (United States)

Kersey, Paul Julian; Allen, James E; Christensen, Mikkel; Davis, Paul; Falin, Lee J; Grabmueller, Christoph; Hughes, Daniel Seth Toney; Humphrey, Jay; Kerhornou, Arnaud; Khobova, Julia; Langridge, Nicholas; McDowall, Mark D; Maheswari, Uma; Maslen, Gareth; Nuhn, Michael; Ong, Chuang Kee; Paulini, Michael; Pedro, Helder; Toneva, Iliana; Tuli, Mary Ann; Walts, Brandon; Williams, Gareth; Wilson, Derek; Youens-Clark, Ken; Monaco, Marcela K; Stein, Joshua; Wei, Xuehong; Ware, Doreen; Bolser, Daniel M; Howe, Kevin Lee; Kulesha, Eugene; Lawson, Daniel; Staines, Daniel Michael

2014-01-01

Ensembl Genomes (http://www.ensemblgenomes.org) is an integrating resource for genome-scale data from non-vertebrate species. The project exploits and extends technologies for genome annotation, analysis and dissemination, developed in the context of the vertebrate-focused Ensembl project, and provides a complementary set of resources for non-vertebrate species through a consistent set of programmatic and interactive interfaces. These provide access to data including reference sequence, gene models, transcriptional data, polymorphisms and comparative analysis. This article provides an update to the previous publications about the resource, with a focus on recent developments. These include the addition of important new genomes (and related data sets) including crop plants, vectors of human disease and eukaryotic pathogens. In addition, the resource has scaled up its representation of bacterial genomes, and now includes the genomes of over 9000 bacteria. Specific extensions to the web and programmatic interfaces have been developed to support users in navigating these large data sets. Looking forward, analytic tools to allow targeted selection of data for visualization and download are likely to become increasingly important in future as the number of available genomes increases within all domains of life, and some of the challenges faced in representing bacterial data are likely to become commonplace for eukaryotes in future.
GenoSets: visual analytic methods for comparative genomics.

Directory of Open Access Journals (Sweden)

Aurora A Cain

Many important questions in biology are, fundamentally, comparative, and this extends to our analysis of a growing number of sequenced genomes. Existing genomic analysis tools are often organized around literal views of genomes as linear strings. Even when information is highly condensed, these views grow cumbersome as larger numbers of genomes are added. Data aggregation and summarization methods from the field of visual analytics can provide abstracted comparative views, suitable for sifting large multi-genome datasets to identify critical similarities and differences. We introduce a software system for visual analysis of comparative genomics data. The system automates the process of data integration, and provides the analysis platform to identify and explore features of interest within these large datasets. GenoSets borrows techniques from business intelligence and visual analytics to provide a rich interface of interactive visualizations supported by a multi-dimensional data warehouse. In GenoSets, visual analytic approaches are used to enable querying based on orthology, functional assignment, and taxonomic or user-defined groupings of genomes. GenoSets links this information together with coordinated, interactive visualizations for both detailed and high-level categorical analysis of summarized data. GenoSets has been designed to simplify the exploration of multiple genome datasets and to facilitate reasoning about genomic comparisons. Case examples are included showing the use of this system in the analysis of 12 Brucella genomes. GenoSets software and the case study dataset are freely available at http://genosets.uncc.edu. We demonstrate that the integration of genomic data using a coordinated multiple view approach can simplify the exploration of large comparative genomic data sets, and facilitate reasoning about comparisons and features of interest.
Mobilisation and remobilisation of a large archetypal pathogenicity island of uropathogenic Escherichia coli in vitro support the role of conjugation for horizontal transfer of genomic islands

Directory of Open Access Journals (Sweden)

Hochhut Bianca

2011-09-01

Full Text Available Background: A substantial amount of data has accumulated supporting the important role of genomic islands (GEIs), including pathogenicity islands (PAIs), in bacterial genome plasticity and the evolution of bacterial pathogens. Their instability and the high level of sequence similarity among different (partial) islands suggest an exchange of PAIs between strains of the same or even different bacterial species by horizontal gene transfer (HGT). Transfer events of archetypal large genomic islands of enterobacteria, which often lack genes required for mobilisation or transfer, have rarely been investigated so far. Results: To study mobilisation of such large genomic regions in the prototypic uropathogenic E. coli (UPEC) strain 536, PAI II536 was supplemented with the mobRP4 region, an origin of replication (oriVR6K), an origin of transfer (oriTRP4) and a chloramphenicol resistance selection marker. In the presence of helper plasmid RP4, conjugative transfer of the 107-kb PAI II536 construct occurred from strain 536 into an E. coli K-12 recipient. In transconjugants, PAI II536 existed either as a cytoplasmic circular intermediate (CI) or integrated site-specifically into the recipient's chromosome at the leuX tRNA gene. This locus is the chromosomal integration site of PAI II536 in UPEC strain 536. From the E. coli K-12 recipient, the chromosomal PAI II536 construct as well as the CIs could be successfully remobilised and inserted into leuX in a PAI II536 deletion mutant of E. coli 536. Conclusions: Our results corroborate that mobilisation and conjugal transfer may contribute to evolution of bacterial pathogens through horizontal transfer of large chromosomal regions such as PAIs. Stabilisation of these mobile genetic elements in the bacterial chromosome results from selective loss of mobilisation and transfer functions of genomic islands.
Population Genomics of Infectious and Integrated Wolbachia pipientis Genomes in Drosophila ananassae

Science.gov (United States)

Choi, Jae Young; Bubnell, Jaclyn E.; Aquadro, Charles F.

2015-01-01

Coevolution between Drosophila and its endosymbiont Wolbachia pipientis has many intriguing aspects. For example, Drosophila ananassae hosts two forms of W. pipientis genomes: One being the infectious bacterial genome and the other integrated into the host nuclear genome. Here, we characterize the infectious and integrated genomes of W. pipientis infecting D. ananassae (wAna), by genome sequencing 15 strains of D. ananassae that have either the infectious or integrated wAna genomes. Results indicate evolutionarily stable maternal transmission for the infectious wAna genome suggesting a relatively long-term coevolution with its host. In contrast, the integrated wAna genome showed pseudogene-like characteristics accumulating many variants that are predicted to have deleterious effects if present in an infectious bacterial genome. Phylogenomic analysis of sequence variation together with genotyping by polymerase chain reaction of large structural variations indicated several wAna variants among the eight infectious wAna genomes. In contrast, only a single wAna variant was found among the seven integrated wAna genomes examined in lines from Africa, south Asia, and south Pacific islands suggesting that the integration occurred once from a single infectious wAna genome and then spread geographically. Further analysis revealed that for all D. ananassae we examined with the integrated wAna genomes, the majority of the integrated wAna genomic regions is represented in at least two copies suggesting a double integration or single integration followed by an integrated genome duplication. The possible evolutionary mechanism underlying the widespread geographical presence of the duplicate integration of the wAna genome is an intriguing question remaining to be answered. PMID:26254486
Insights from the genome of a high alkaline cellulase producing Aspergillus fumigatus strain obtained from Peruvian Amazon rainforest.

Science.gov (United States)

Paul, Sujay; Zhang, Angel; Ludeña, Yvette; Villena, Gretty K; Yu, Fengan; Sherman, David H; Gutiérrez-Correa, Marcel

2017-06-10

Here, we report the complete genome sequence of a high alkaline cellulase producing Aspergillus fumigatus strain LMB-35Aa isolated from soil of the Peruvian Amazon rainforest. The genome is ∼27.5 Mb in size, comprises 228 scaffolds with an average GC content of 50%, and is predicted to contain a total of 8660 protein-coding genes, of which 6156 have a known function; it codes for 607 putative CAZyme families potentially involved in carbohydrate metabolism. Several important cellulose degrading genes, such as endoglucanase A, endoglucanase B, endoglucanase D and beta-glucosidase, are also identified. The genome of A. fumigatus strain LMB-35Aa represents the first whole sequenced genome of a non-clinical, high cellulase producing A. fumigatus strain isolated from forest soil. Copyright © 2017 Elsevier B.V. All rights reserved.
Large-Scale Isolation of Microsatellites from Chinese Mitten Crab Eriocheir sinensis via a Solexa Genomic Survey

Directory of Open Access Journals (Sweden)

Qun Wang

2012-12-01

Full Text Available Microsatellites are simple sequence repeats with a high degree of polymorphism in the genome; they are used as DNA markers in many molecular genetic studies. Using traditional methods such as the magnetic beads enrichment method, only a few microsatellite markers have been isolated from the Chinese mitten crab Eriocheir sinensis, as the crab genome sequence information is unavailable. Here, we have identified a large number of microsatellites from the Chinese mitten crab by taking advantage of Solexa genomic surveying. A total of 141,737 SSR (simple sequence repeat) motifs were identified via analysis of 883 Mb of the crab genomic DNA information, including mono-, di-, tri-, tetra-, penta- and hexa-nucleotide repeat motifs. The number of di-nucleotide repeat motifs was 82,979, making this the most abundant type of repeat motif (58.54%); the second most abundant were the tri-nucleotide repeats (42,657, 30.11%). Among di-nucleotide repeats, the most frequent repeats were AC motifs, accounting for 67.55% of the total number. AGG motifs were the most frequent (59.32%) of the tri-nucleotide motifs. A total of 15,125 microsatellite loci had a flanking sequence suitable for designing primers for the polymerase chain reaction (PCR). To verify the identified SSRs, a subset of 100 primer pairs was randomly selected for PCR. Eighty-two primer sets (82%) produced strong PCR products matching expected sizes, and 78% were polymorphic. In an analysis of 30 wild individuals from the Yangtze River with 20 primer sets, the number of alleles per locus ranged from 2–14 and the mean allelic richness was 7.4. No linkage disequilibrium was found between any pair of loci, indicating that the markers were independent. The Hardy-Weinberg equilibrium test showed significant deviation in four of the 20 microsatellite loci after sequential Bonferroni corrections. This method is cost- and time-effective in comparison to traditional approaches for the isolation of microsatellites.
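As a rough illustration of how mono- to hexa-nucleotide repeat motifs can be located in sequence data, the following sketch scans a string for perfect tandem repeats with a regular expression. It is not the pipeline used in the study: the function name and the minimum copy numbers per motif length are assumptions made for this example only.

```python
import re


def find_ssrs(seq, min_repeats=None):
    """Return sorted (start, motif, copies) tuples for perfect SSRs, motif length 1-6."""
    if min_repeats is None:
        # Hypothetical thresholds: minimum number of copies for each motif length.
        min_repeats = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}
    seq = seq.upper()
    hits = []
    for k, n in min_repeats.items():
        # A k-mer followed by at least n-1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are just a shorter unit repeated (e.g. "ACAC" is "AC" x 2).
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


# Toy sequence: an (AC)8 run and an (AGG)5 run separated by short spacers.
seq = "GGT" + "AC" * 8 + "TTC" + "AGG" * 5 + "CCT"
ssrs = find_ssrs(seq)
print(ssrs)
```

Counting the returned motifs by length would give the kind of per-class tallies reported in the abstract; imperfect or compound repeats are deliberately out of scope for this sketch.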
Genome Modeling System: A Knowledge Management Platform for Genomics.

Directory of Open Access Journals (Sweden)

Malachi Griffith

2015-07-01

Full Text Available In this work, we present the Genome Modeling System (GMS), an analysis information management system capable of executing automated genome analysis pipelines at a massive scale. The GMS framework provides detailed tracking of samples and data coupled with reliable and repeatable analysis pipelines. The GMS also serves as a platform for bioinformatics development, allowing a large team to collaborate on data analysis, or an individual researcher to leverage the work of others effectively within its data management system. Rather than separating ad-hoc analysis from rigorous, reproducible pipelines, the GMS promotes systematic integration between the two. As a demonstration of the GMS, we performed an integrated analysis of whole genome, exome and transcriptome sequencing data from a breast cancer cell line (HCC1395) and matched lymphoblastoid line (HCC1395BL). These data are available for users to test the software, complete tutorials and develop novel GMS pipeline configurations. The GMS is available at https://github.com/genome/gms.
PGSB/MIPS Plant Genome Information Resources and Concepts for the Analysis of Complex Grass Genomes.

Science.gov (United States)

Spannagl, Manuel; Bader, Kai; Pfeifer, Matthias; Nussbaumer, Thomas; Mayer, Klaus F X

2016-01-01

PGSB (Plant Genome and Systems Biology; formerly MIPS-Munich Institute for Protein Sequences) has been involved in developing, implementing and maintaining plant genome databases for more than a decade. Genome databases and analysis resources have focused on individual genomes and aim to provide flexible and maintainable datasets for model plant genomes as a backbone against which experimental data, e.g., from high-throughput functional genomics, can be organized and analyzed. In addition, genomes from both model and crop plants form a scaffold for comparative genomics, assisted by specialized tools such as the CrowsNest viewer to explore conserved gene order (synteny) between related species on macro- and micro-levels. The genomes of many economically important Triticeae plants such as wheat, barley, and rye present a great challenge for sequence assembly and bioinformatic analysis due to their enormous complexity and large genome size. Novel concepts and strategies have been developed to deal with these difficulties and have been applied to the genomes of wheat, barley, rye, and other cereals. This includes the GenomeZipper concept, reference-guided exome assembly, and "chromosome genomics" based on flow cytometry sorted chromosomes.
Genomics-assisted breeding in fruit trees

OpenAIRE

Iwata, Hiroyoshi; Minamikawa, Mai F.; Kajiya-Kanegae, Hiromi; Ishimori, Motoyuki; Hayashi, Takeshi

2016-01-01

Recent advancements in genomic analysis technologies have opened up new avenues to promote the efficiency of plant breeding. Novel genomics-based approaches for plant breeding and genetics research, such as genome-wide association studies (GWAS) and genomic selection (GS), are useful, especially in fruit tree breeding. The breeding of fruit trees is hindered by their long generation time, large plant size, long juvenile phase, and the necessity to wait for the physiological maturity of the pl...
Complete genome sequence of Bacillus velezensis S3-1, a potential biological pesticide with plant pathogen inhibiting and plant promoting capabilities.

Science.gov (United States)

Jin, Qing; Jiang, Qiuyue; Zhao, Lei; Su, Cuizhu; Li, Songshuo; Si, Fangyi; Li, Shanshan; Zhou, Chenhao; Mu, Yonglin; Xiao, Ming

2017-10-10

Antagonistic soil microorganisms, which are non-toxic, harmless non-pollutants, can effectively reduce the density of pathogenic species in several ways. Bacillus velezensis strain S3-1 was isolated from the rhizosphere soil of cucumber, and was shown to inhibit plant pathogens, promote plant growth and efficiently colonize rhizosphere soils. The strain produced 13 kinds of lipopeptide antibiotics, belonging to the surfactin, iturin and fengycin families. Here, we present the complete genome sequence of S3-1. The genome consists of one chromosome without plasmids and also contains the biosynthetic gene clusters that encode difficidin, macrolactin, surfactin and fengycin. The genome contains 86 tRNA genes, 27 rRNA genes and 57 antibiotic-related genes. The complete genome sequence of B. velezensis S3-1 provides useful information to further detect the molecular mechanisms behind antifungal actions, and will facilitate its potential as a biological pesticide in the agricultural industry. Copyright © 2017 Elsevier B.V. All rights reserved.
Validation of soil moisture ocean salinity (SMOS) satellite soil moisture products

Science.gov (United States)

The surface soil moisture state controls the partitioning of precipitation into infiltration and runoff. High-resolution observations of soil moisture will lead to improved flood forecasts, especially for intermediate to large watersheds where most flood damage occurs. Soil moisture is also key in d...
Complete Genome Sequence of Beijerinckia indica subsp. indica

Science.gov (United States)

Tamas, Ivica; Dedysh, Svetlana N.; Liesack, Werner; Stott, Matthew B.; Alam, Maqsudul; Murrell, J. Colin; Dunfield, Peter F.

2010-01-01

Beijerinckia indica subsp. indica is an aerobic, acidophilic, exopolysaccharide-producing, N2-fixing soil bacterium. It is a generalist chemoorganotroph that is phylogenetically closely related to facultative and obligate methanotrophs of the genera Methylocella and Methylocapsa. Here we report the full genome sequence of this bacterium. PMID:20601475
Improved genome recovery and integrated cell-size analyses of individual uncultured microbial cells and viral particles.

Science.gov (United States)

Stepanauskas, Ramunas; Fergusson, Elizabeth A; Brown, Joseph; Poulton, Nicole J; Tupper, Ben; Labonté, Jessica M; Becraft, Eric D; Brown, Julia M; Pachiadaki, Maria G; Povilaitis, Tadas; Thompson, Brian P; Mascena, Corianna J; Bellows, Wendy K; Lubys, Arvydas

2017-07-20

Microbial single-cell genomics can be used to provide insights into the metabolic potential, interactions, and evolution of uncultured microorganisms. Here we present WGA-X, a method based on multiple displacement amplification of DNA that utilizes a thermostable mutant of the phi29 polymerase. WGA-X enhances genome recovery from individual microbial cells and viral particles while maintaining ease of use and scalability. The greatest improvements are observed when amplifying high G+C content templates, such as those belonging to the predominant bacteria in agricultural soils. By integrating WGA-X with calibrated index-cell sorting and high-throughput genomic sequencing, we are able to analyze genomic sequences and cell sizes of hundreds of individual, uncultured bacteria, archaea, protists, and viral particles, obtained directly from marine and soil samples, in a single experiment. This approach may find diverse applications in microbiology and in biomedical and forensic studies of humans and other multicellular organisms. Single-cell genomics can be used to study uncultured microorganisms. Here, Stepanauskas et al. present a method combining improved multiple displacement amplification and FACS, to obtain genomic sequences and cell size information from uncultivated microbial cells and viral particles in environmental samples.
A New Perspective on Polyploid Fragaria (Strawberry) Genome Composition Based on Large-Scale, Multi-Locus Phylogenetic Analysis.

Science.gov (United States)

Yang, Yilong; Davis, Thomas M

2017-12-01

The subgenomic compositions of the octoploid (2n = 8× = 56) strawberry (Fragaria) species, including the economically important cultivated species Fragaria x ananassa, have been a topic of long-standing interest. Phylogenomic approaches utilizing next-generation sequencing technologies offer a new window into species relationships and the subgenomic compositions of polyploids. We have conducted a large-scale phylogenetic analysis of Fragaria (strawberry) species using the Fluidigm Access Array system and 454 sequencing platform. About 24 single-copy or low-copy nuclear genes distributed across the genome were amplified and sequenced from 96 genomic DNA samples representing 16 Fragaria species from diploid (2×) to decaploid (10×), including the most extensive sampling of octoploid taxa yet reported. Individual gene trees were constructed by different tree-building methods. Mosaic genomic structures of diploid Fragaria species consisting of sequences at different phylogenetic positions were observed. Our findings support the presence in octoploid species of genetic signatures from at least five diploid ancestors (F. vesca, F. iinumae, F. bucharica, F. viridis, and at least one additional allele contributor of unknown identity), and question the extent to which distinct subgenomes are preserved over evolutionary time in the allopolyploid Fragaria species. In addition, our data support divergence between the two wild octoploid species, F. virginiana and F. chiloensis. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Quantitative linkage genome scan for atopy in a large collection of Caucasian families

DEFF Research Database (Denmark)

Webb, BT; van den Oord, E; Akkari, A

2007-01-01

Quantitative phenotypes correlated with a complex disorder offer increased power to detect linkage in comparison to affected-unaffected classifications. Asthma is a complex disorder characterized by periods of bronchial obstruction and increased bronchial hyper reactivity. In childhood and early...... adulthood, asthma is frequently associated also with quantitative measures of atopy. Genome wide quantitative multipoint linkage analysis was conducted for serum IgE levels and percentage of positive skin prick test (SPT(per)) using three large groups of families originally ascertained for asthma....... In this report, 438 and 429 asthma families were informative for linkage using IgE and SPT(per) which represents 690 independent families. Suggestive linkage (LOD >/= 2) was found on chromosomes 1, 3, and 8q with maximum LODs of 2.34 (IgE), 2.03 (SPT(per)), and 2.25 (IgE) near markers D1S1653, D3S2322-D3S1764...
Environmental whole-genome amplification to access microbial populations in contaminated sediments

Energy Technology Data Exchange (ETDEWEB)

Abulencia, Carl B [Diversa Corporation; Wyborski, Denise L. [Diversa Corporation; Garcia, Joseph A. [Diversa Corporation; Podar, Mircea [ORNL; Chen, Wenqiong [Diversa Corporation; Chang, Sherman H. [Diversa Corporation; Chang, Hwai W. [Diversa Corporation; Watson, David B [ORNL; Brodie, Eoin L. [Lawrence Berkeley National Laboratory (LBNL); Hazen, Terry [Lawrence Berkeley National Laboratory (LBNL); Keller, Martin [ORNL

2006-05-01

Low-biomass samples from nitrate and heavy metal contaminated soils yield DNA amounts that have limited use for direct, native analysis and screening. Multiple displacement amplification (MDA) using φ29 DNA polymerase was used to amplify whole genomes from environmental, contaminated, subsurface sediments. By first amplifying the genomic DNA (gDNA), biodiversity analysis and gDNA library construction of microbes found in contaminated soils were made possible. The MDA method was validated by analyzing amplified genome coverage from approximately five Escherichia coli cells, resulting in 99.2% genome coverage. The method was further validated by confirming overall representative species coverage and also an amplification bias when amplifying from a mix of eight known bacterial strains. We extracted DNA from samples with extremely low cell densities from a U.S. Department of Energy contaminated site. After amplification, small-subunit rRNA analysis revealed relatively even distribution of species across several major phyla. Clone libraries were constructed from the amplified gDNA, and a small subset of clones was used for shotgun sequencing. BLAST analysis of the library clone sequences showed that 64.9% of the sequences had significant similarities to known proteins, and 'clusters of orthologous groups' (COG) analysis revealed that more than half of the sequences from each library contained sequence similarity to known proteins. The libraries can be readily screened for native genes or any target of interest. Whole-genome amplification of metagenomic DNA from very minute microbial sources, while introducing an amplification bias, will allow access to genomic information that was not previously accessible.
Environmental Whole-Genome Amplification to Access Microbial Diversity in Contaminated Sediments

Energy Technology Data Exchange (ETDEWEB)

Abulencia, C.B.; Wyborski, D.L.; Garcia, J.; Podar, M.; Chen, W.; Chang, S.H.; Chang, H.W.; Watson, D.; Brodie,E.I.; Hazen, T.C.; Keller, M.

2005-12-10

Low-biomass samples from nitrate and heavy metal contaminated soils yield DNA amounts that have limited use for direct, native analysis and screening. Multiple displacement amplification (MDA) using φ29 DNA polymerase was used to amplify whole genomes from environmental, contaminated, subsurface sediments. By first amplifying the genomic DNA (gDNA), biodiversity analysis and gDNA library construction of microbes found in contaminated soils were made possible. The MDA method was validated by analyzing amplified genome coverage from approximately five Escherichia coli cells, resulting in 99.2 percent genome coverage. The method was further validated by confirming overall representative species coverage and also an amplification bias when amplifying from a mix of eight known bacterial strains. We extracted DNA from samples with extremely low cell densities from a U.S. Department of Energy contaminated site. After amplification, small subunit rRNA analysis revealed relatively even distribution of species across several major phyla. Clone libraries were constructed from the amplified gDNA, and a small subset of clones was used for shotgun sequencing. BLAST analysis of the library clone sequences showed that 64.9 percent of the sequences had significant similarities to known proteins, and 'clusters of orthologous groups' (COG) analysis revealed that more than half of the sequences from each library contained sequence similarity to known proteins. The libraries can be readily screened for native genes or any target of interest. Whole-genome amplification of metagenomic DNA from very minute microbial sources, while introducing an amplification bias, will allow access to genomic information that was not previously accessible.

A universal genomic coordinate translator for comparative genomics.

Science.gov (United States)

Zamani, Neda; Sundström, Görel; Meadows, Jennifer R S; Höppner, Marc P; Dainat, Jacques; Lantz, Henrik; Haas, Brian J; Grabherr, Manfred G

2014-06-30

Genomic duplications constitute major events in the evolution of species, allowing paralogous copies of genes to take on fine-tuned biological roles. Unambiguously identifying the orthology relationship between copies across multiple genomes can be resolved by synteny, i.e. the conserved order of genomic sequences. However, a comprehensive analysis of duplication events and their contributions to evolution would require all-to-all genome alignments, whose number grows as N^2 with the number of available genomes, N. Here, we introduce Kraken, software that omits the all-to-all requirement by recursively traversing a graph of pairwise alignments and dynamically re-computing orthology. Kraken scales linearly with the number of targeted genomes, N, which allows for including large numbers of genomes in analyses. We first evaluated the method on the set of 12 Drosophila genomes, finding that orthologous correspondence computed indirectly through a graph of multiple synteny maps comes at minimal cost in terms of sensitivity, but reduces overall computational runtime by an order of magnitude. We then used the method on three well-annotated mammalian genomes, human, mouse, and rat, and show that up to 93% of protein coding transcripts have unambiguous pairwise orthologous relationships across the genomes. On a nucleotide level, 70 to 83% of exons match exactly at both splice junctions, and up to 97% on at least one junction. We last applied Kraken to an RNA-sequencing dataset from multiple vertebrates and diverse tissues, where we confirmed that brain-specific gene family members, i.e. one-to-many or many-to-many homologs, are more highly correlated across species than single-copy (i.e. one-to-one homologous) genes. Not limited to protein coding genes, Kraken also identifies thousands of newly identified transcribed loci, likely non-coding RNAs that are consistently transcribed in human, chimpanzee and gorilla, and maintain significant correlation of expression levels across
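The idea of translating coordinates indirectly through a graph of pairwise synteny maps, rather than through all-to-all alignments, can be sketched as a breadth-first search that carries an interval from genome to genome. This is a simplified illustration and not Kraken's implementation or interface: the class, its methods and the block representation (ungapped, same-strand blocks) are assumptions made for the example. A chain of N genomes needs only N-1 pairwise maps, against N(N-1)/2 for all-to-all.

```python
from collections import deque


class SyntenyGraph:
    """Hypothetical container for pairwise synteny blocks between genomes."""

    def __init__(self):
        # (src_genome, dst_genome) -> list of (src_start, src_end, dst_start)
        self.blocks = {}

    def add_block(self, a, b, a_start, a_end, b_start):
        """Register one ungapped, same-strand block in both directions."""
        length = a_end - a_start
        self.blocks.setdefault((a, b), []).append((a_start, a_end, b_start))
        self.blocks.setdefault((b, a), []).append((b_start, b_start + length, a_start))

    def _step(self, src, dst, start, end):
        """Translate an interval across one pairwise map, or return None."""
        for s, e, d in self.blocks.get((src, dst), []):
            if s <= start and end <= e:
                return start - s + d, end - s + d
        return None

    def translate(self, src, dst, start, end):
        """Breadth-first search over genomes; returns (start, end) in dst or None."""
        queue = deque([(src, start, end)])
        seen = {src}
        while queue:
            genome, s, e = queue.popleft()
            if genome == dst:
                return s, e
            for a, b in list(self.blocks):
                if a == genome and b not in seen:
                    hit = self._step(a, b, s, e)
                    if hit is not None:
                        seen.add(b)
                        queue.append((b, hit[0], hit[1]))
        return None


# Toy example: no direct human-rat map exists, so the path goes through mouse.
g = SyntenyGraph()
g.add_block("human", "mouse", 100, 200, 1000)
g.add_block("mouse", "rat", 1000, 1100, 5000)
print(g.translate("human", "rat", 120, 150))
```

Real synteny maps contain gaps, inversions and competing paths, so a practical translator would have to split intervals and reconcile alternative routes; the sketch leaves all of that out.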
Insights into Conifer Giga-Genomes

Science.gov (United States)

De La Torre, Amanda R.; Birol, Inanc; Bousquet, Jean; Ingvarsson, Pär K.; Jansson, Stefan; Jones, Steven J.M.; Keeling, Christopher I.; MacKay, John; Nilsson, Ove; Ritland, Kermit; Street, Nathaniel; Yanchuk, Alvin; Zerbe, Philipp; Bohlmann, Jörg

2014-01-01

Insights from sequenced genomes of major land plant lineages have advanced research in almost every aspect of plant biology. Until recently, however, assembled genome sequences of gymnosperms have been missing from this picture. Conifers of the pine family (Pinaceae) are a group of gymnosperms that dominate large parts of the world’s forests. Despite their ecological and economic importance, conifers seemed long out of reach for complete genome sequencing, due in part to their enormous genome size (20–30 Gb) and the highly repetitive nature of their genomes. Technological advances in genome sequencing and assembly enabled the recent publication of three conifer genomes: white spruce (Picea glauca), Norway spruce (Picea abies), and loblolly pine (Pinus taeda). These genome sequences revealed distinctive features compared with other plant genomes and may represent a window into the past of seed plant genomes. This Update highlights recent advances, remaining challenges, and opportunities in light of the publication of the first conifer and gymnosperm genomes. PMID:25349325
Mass spectrometry allows direct identification of proteins in large genomes

DEFF Research Database (Denmark)

Küster, B; Mortensen, Peter V.; Andersen, Jens S.

2001-01-01

Proteome projects seek to provide systematic functional analysis of the genes uncovered by genome sequencing initiatives. Mass spectrometric protein identification is a key requirement in these studies but to date, database searching tools rely on the availability of protein sequences derived fro...
Genome Sequence of Gordonia Phage BetterKatz

Science.gov (United States)

Berryman, Emily N.; Forrest, Kaitlyn M.; McHale, Lilliana; Wertz, Anthony T.; Zhuang, Zenas; Kasturiarachi, Naomi S.; Pressimone, Catherine A.; Schiebel, Johnathon G.; Furbee, Emily C.; Grubb, Sarah R.; Warner, Marcie H.; Montgomery, Matthew T.; Garlena, Rebecca A.; Russell, Daniel A.; Jacobs-Sera, Deborah; Hatfull, Graham F.

2016-01-01

BetterKatz is a bacteriophage isolated from a soil sample collected in Pittsburgh, Pennsylvania using the host Gordonia terrae 3612. BetterKatz’s genome is 50,636 bp long and contains 75 predicted protein-coding genes, 35 of which have been assigned putative functions. BetterKatz is not closely related to other sequenced Gordonia phages. PMID:27516497
An unexpectedly large and loosely packed mitochondrial genome in the charophycean green alga Chlorokybus atmophyticus

Directory of Open Access Journals (Sweden)

Lemieux Claude

2007-05-01

Full Text Available Background: The Streptophyta comprises all land plants and six groups of charophycean green algae. The scaly biflagellate Mesostigma viride (Mesostigmatales) and the sarcinoid Chlorokybus atmophyticus (Chlorokybales) represent the earliest diverging lineages of this phylum. In trees based on chloroplast genome data, these two charophycean green algae are nested in the same clade. To validate this relationship and gain insight into the ancestral state of the mitochondrial genome in the Charophyceae, we sequenced the mitochondrial DNA (mtDNA) of Chlorokybus and compared this genome sequence with those of three other charophycean green algae and the bryophytes Marchantia polymorpha and Physcomitrella patens. Results: The Chlorokybus genome differs radically from its 42,424-bp Mesostigma counterpart in size, gene order, intron content and density of repeated elements. At 201,763 bp, it is the largest mtDNA yet reported for a green alga. The 70 conserved genes represent 41.4% of the genome sequence and include nad10 and trnL(gag), two genes reported for the first time in a streptophyte mtDNA. At the gene order level, the Chlorokybus genome shares with its Chara, Chaetosphaeridium and bryophyte homologues eight to ten gene clusters including about 20 genes. Notably, some of these clusters exhibit gene linkages not previously found outside the Streptophyta, suggesting that they originated early during streptophyte evolution. In addition to six group I and 14 group II introns, short repeated sequences accounting for 7.5% of the genome were identified. Mitochondrial trees were unable to resolve the correct position of Mesostigma, due to analytical problems arising from accelerated sequence evolution in this lineage. Conclusion: The Chlorokybus and Mesostigma mtDNAs exemplify the marked fluidity of the mitochondrial genome in charophycean green algae. The notion that the mitochondrial genome was constrained to remain compact during charophycean
The variability of the large genomic segment of Tahyna orthobunyavirus and an all-atom exploration of its anti-viral drug resistance

Czech Academy of Sciences Publication Activity Database

Kilian, Patrik; Valdés, James J.; Lecina-Casas, D.; Chrudimský, T.; Růžek, Daniel

2013-01-01

Vol. 20, 2013-Dec (2013), pp. 304-311 ISSN 1567-1348 R&D Projects: GA ČR GAP502/11/2116; GA MŠk(CZ) EE2.3.30.0032 Institutional support: RVO:60077344 Keywords: Tahyna virus * Orthobunyavirus * California complex * Genetic variability * Large genomic segment Subject RIV: EB - Genetics; Molecular Biology Impact factor: 3.264, year: 2013
Mitochondrial genome analysis of the predatory mite Phytoseiulus persimilis and a revisit of the Metaseiulus occidentalis mitochondrial genome.

Science.gov (United States)

Dermauw, Wannes; Vanholme, Bartel; Tirry, Luc; Van Leeuwen, Thomas

2010-04-01

In this study we sequenced and analysed the complete mitochondrial (mt) genome of the Chilean predatory mite Phytoseiulus persimilis Athias-Henriot (Chelicerata: Acari: Mesostigmata: Phytoseiidae: Amblyseiinae). The 16 199 bp genome (79.8% AT) contains the standard set of 13 protein-coding and 24 RNA genes. Compared with the ancestral arthropod mtDNA pattern, the gene order is extremely reshuffled (35 genes changed position) and represents a novel arrangement within the arthropods. This is probably related to the presence of several large noncoding regions in the genome. In contrast with the mt genome of the closely related species Metaseiulus occidentalis (Phytoseiidae: Typhlodrominae) - which was reported to be unusually large (24 961 bp), to lack nad6 and nad3 protein-coding genes, and to contain 22 tRNAs without T-arms - the genome of P. persimilis has all the features of a standard metazoan mt genome. Consequently, we performed additional experiments on the M. occidentalis mt genome. Our preliminary restriction digests and Southern hybridization data revealed that this genome is smaller than previously reported. In addition, we cloned nad3 in M. occidentalis and positioned this gene between nad4L and 12S-rRNA on the mt genome. Finally, we report that at least 15 of the 22 tRNAs in the M. occidentalis mt genome can be folded into canonical cloverleaf structures similar to their counterparts in P. persimilis.
Merlin: Computer-Aided Oligonucleotide Design for Large Scale Genome Engineering with MAGE.

Science.gov (United States)

Quintin, Michael; Ma, Natalie J; Ahmed, Samir; Bhatia, Swapnil; Lewis, Aaron; Isaacs, Farren J; Densmore, Douglas

2016-06-17

Genome engineering technologies now enable precise manipulation of organism genotype, but can be limited in scalability by their design requirements. Here we describe Merlin (http://merlincad.org), an open-source web-based tool to assist biologists in designing experiments using multiplex automated genome engineering (MAGE). Merlin provides methods to generate pools of single-stranded DNA oligonucleotides (oligos) for MAGE experiments by performing free energy calculation and BLAST scoring on a sliding window spanning the targeted site. These oligos are designed not only to improve recombination efficiency, but also to minimize off-target interactions. The application further assists experiment planning by reporting predicted allelic replacement rates after multiple MAGE cycles, and enables rapid result validation by generating primer sequences for multiplexed allele-specific colony PCR. Here we describe the Merlin oligo and primer design procedures and validate their functionality compared to OptMAGE by eliminating seven AvrII restriction sites from the Escherichia coli genome.
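The sliding-window step mentioned in this abstract can be illustrated by enumerating every window of a fixed length that still covers the targeted site and ranking the candidates with a scoring function. The sketch below is not Merlin's code: the function names and the default length are hypothetical, and GC fraction is used purely as a stand-in score in place of the free energy and BLAST scoring the abstract describes.

```python
def candidate_oligos(genome, site, length=90, step=1):
    """Yield (start, oligo) for every window of `length` nt that contains `site`."""
    first = max(0, site - length + 1)
    last = min(site, len(genome) - length)
    for start in range(first, last + 1, step):
        yield start, genome[start:start + length]


def gc_fraction(oligo):
    """Stand-in score for this sketch only; a real design would use folding energy."""
    return sum(base in "GC" for base in oligo) / len(oligo)


def best_oligo(genome, site, score=gc_fraction, **window_args):
    """Return the (start, oligo) candidate with the lowest score."""
    return min(candidate_oligos(genome, site, **window_args), key=lambda c: score(c[1]))


# Toy example with a short window so the candidates are easy to inspect.
genome = "ATATATGCGCGCATATAT"
windows = list(candidate_oligos(genome, site=8, length=6))
print(windows)
print(best_oligo(genome, site=8, length=6))
```

Treating a lower score as better is an arbitrary convention of this example; passing a different `score` callable changes the ranking without touching the window enumeration.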
Draft Genome Sequence of the Microbispora sp. Strain ATCC-PTA-5024, Producing the Lantibiotic NAI-107

DEFF Research Database (Denmark)

Sosio, M.; Gallo, G.; Pozzi, R.

2014-01-01

We report the draft genome sequence of Microbispora sp. strain ATCC-PTA-5024, a soil isolate that produces NAI-107, a new lantibiotic with the potential to treat life-threatening infections caused by multidrug-resistant Gram-positive pathogens. The draft genome of strain Microbispora sp. ATCC...
CoCoNUT: an efficient system for the comparison and analysis of genomes

Directory of Open Access Journals (Sweden)

Kurtz Stefan

2008-11-01

Background: Comparative genomics is the analysis and comparison of genomes from different species. This area of research is driven by the large number of sequenced genomes and heavily relies on efficient algorithms and software to perform pairwise and multiple genome comparisons. Results: Most of the software tools available are tailored for one specific task. In contrast, we have developed a novel system, CoCoNUT (Computational Comparative geNomics Utility Toolkit), that allows solving several different tasks in a unified framework: (1) finding regions of high similarity among multiple genomic sequences and aligning them, (2) comparing two draft or multi-chromosomal genomes, (3) locating large segmental duplications in large genomic sequences, and (4) mapping cDNA/EST to genomic sequences. Conclusion: CoCoNUT is competitive with other software tools w.r.t. the quality of the results. The use of state of the art algorithms and data structures allows CoCoNUT to solve comparative genomics tasks more efficiently than previous tools. With the improved user interface (including an interactive visualization component), CoCoNUT provides a unified, versatile, and easy-to-use software tool for large scale studies in comparative genomics.
Changes in photosynthesis and soil moisture drive the seasonal soil respiration-temperature hysteresis relationship

Science.gov (United States)

Quan Zhang; Richard P. Phillips; Stefano Manzoni; Russell L. Scott; A. Christopher Oishi; Adrien Finzi; Edoardo Daly; Rodrigo Vargas; Kimberly A. Novick

2018-01-01

In nearly all large-scale terrestrial ecosystem models, soil respiration is represented as a function of soil temperature. However, the relationship between soil respiration and soil temperature is highly variable across sites and there is often a pronounced hysteresis in the soil respiration-temperature relationship over the course of the growing season. This...
Two draft genome sequences of Pseudomonas jessenii strains isolated from a copper contaminated site in Denmark

DEFF Research Database (Denmark)

Qin, Yanan; Wang, Dan; Brandt, Kristian Koefoed

2016-01-01

Pseudomonas jessenii C2 and Pseudomonas jessenii H16 were isolated from low-Cu and high-Cu industrially contaminated soil, respectively. P. jessenii H16 displayed significant resistance to copper when compared to P. jessenii C2. Here we describe genome sequences and interesting features of these two strains. The genome of P. jessenii C2 comprised 6,420,113 bp, with 5814 protein-coding genes and 67 RNA genes. P. jessenii H16 comprised 6,807,788 bp, with 5995 protein-coding genes and 70 RNA genes. Of special interest was a specific adaptation to this harsh copper-contaminated environment as P. jessenii H16 contained a novel putative copper resistance genomic island (GI) of around 50,000 bp.
Genome interrogation for novel salinity tolerant Arabidopsis mutants.

Science.gov (United States)

van Tol, Niels; Pinas, Johan; Schat, Henk; Hooykaas, Paul J J; van der Zaal, Bert J

2016-12-01

Soil salinity is becoming an increasingly large problem in agriculture. In this study, we have investigated whether a capacity to withstand salinity can be induced in the salinity sensitive plant species Arabidopsis thaliana, and whether it can be maintained in subsequent generations. To this end, we have used zinc finger artificial transcription factor (ZF-ATFs) mediated genome interrogation. Already within a relatively small collection Arabidopsis lines expressing ZF-ATFs, we found 41 lines that were tolerant to 100 mM NaCl. Furthermore, ZF-ATF encoding gene constructs rescued from the most strongly salinity tolerant lines were indeed found to act as dominant and heritable agents for salinity tolerance. Altogether, our data provide evidence that a silent capacity to withstand normally lethal levels of salinity exists in Arabidopsis and can be evoked relatively easily by in trans acting transcription factors like ZF-ATFs. © 2016 John Wiley & Sons Ltd.
Privacy Challenges of Genomic Big Data.

Science.gov (United States)

Shen, Hong; Ma, Jian

2017-01-01

With the rapid advancement of high-throughput DNA sequencing technologies, genomics has become a big data discipline where large-scale genetic information of human individuals can be obtained efficiently with low cost. However, such massive amount of personal genomic data creates tremendous challenge for privacy, especially given the emergence of direct-to-consumer (DTC) industry that provides genetic testing services. Here we review the recent development in genomic big data and its implications on privacy. We also discuss the current dilemmas and future challenges of genomic privacy.
V-GAP: Viral genome assembly pipeline

KAUST Repository

Nakamura, Yoji; Yasuike, Motoshige; Nishiki, Issei; Iwasaki, Yuki; Fujiwara, Atushi; Kawato, Yasuhiko; Nakai, Toshihiro; Nagai, Satoshi; Kobayashi, Takanori; Gojobori, Takashi; Ototake, Mitsuru

2015-01-01

Next-generation sequencing technologies have allowed the rapid determination of the complete genomes of many organisms. Although shotgun sequences from large-genome organisms are still difficult to reconstruct into perfect contigs, each of which represents a full chromosome, those from small genomes have been assembled successfully into a very small number of contigs. In this study, we show that shotgun reads from phage genomes can be reconstructed into a single contig by controlling the number of read sequences used in de novo assembly. We have developed a pipeline to assemble small viral genomes with good reliability using a resampling method from shotgun data. This pipeline, named V-GAP (Viral Genome Assembly Pipeline), will contribute to the rapid genome typing of viruses, which are highly divergent, and thus will meet the increasing need for viral genome comparisons in metagenomic studies.
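The idea of controlling the number of reads by resampling can be sketched as below. This is only an illustration under stated assumptions: the function name, the replicate count and the subset size are hypothetical, and the abstract does not say how V-GAP draws its subsets or which assembler it runs on them.

```python
import random
from typing import List, Sequence


def resample_reads(reads: Sequence[str], n_reads: int,
                   n_replicates: int, seed: int = 0) -> List[List[str]]:
    """Draw n_replicates random subsets of n_reads reads each, without replacement."""
    rng = random.Random(seed)                 # fixed seed so the sketch is reproducible
    n = min(n_reads, len(reads))              # never ask for more reads than exist
    return [rng.sample(list(reads), n) for _ in range(n_replicates)]


# Toy input: read identifiers standing in for real shotgun reads.
reads = [f"read_{i}" for i in range(100)]
subsets = resample_reads(reads, n_reads=20, n_replicates=5, seed=42)
print(len(subsets), len(subsets[0]))
```

Each subset would then be passed to a de novo assembler (not shown), and the assemblies compared across replicates.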
Genome Sequences of Subcluster K5 Mycobacteriophages AlleyCat, Edugator, and Guillsminger.

Science.gov (United States)

King, Rodney A; Slowan-Pomeroy, Tina M; Thomas, Jodi E; Ahmed, Tithe; Alexander, Katie L; Biddle, James M; Daniels, Makenzie K; Rowlett, Jenna R; Senay, Taylor E; Rinehart, Claire A; Staples, Amanda K; Rowland, Naomi S; Gaffney, Bobby L; Emmons, Christine B; Hauk, Maya D; Nguyen, Rebecca L; Naegele, Leonard; Strickland, Summer S; Briggs, Laura A; Rush, Alexander N; Saha, Sanghamitra; Sadana, Rachna; Cresawn, Steven G; Russell, Daniel A; Garlena, Rebecca A; Pope, Welkin H; Jacobs-Sera, Deborah; Hatfull, Graham F

2017-11-09

Bacteriophages AlleyCat, Edugator, and Guillsminger were isolated on Mycobacterium smegmatis mc²155 from enriched soil samples. All are members of mycobacteriophage subcluster K5, with genomes of 62,112 to 63,344 bp. Each genome contains 92 to 99 predicted protein-coding genes and one tRNA. Guillsminger is the first mycobacteriophage to carry an IS1380 family transposon. Copyright © 2017 King et al.
Assessing soil erosion rates for a large catchment in the Central Highlands of Vietnam using fallout radionuclides

International Nuclear Information System (INIS)

Phan Son Hai; Nguyen Thanh Binh; Nguyen Minh Dao; Nguyen Thi Huong Lan; Nguyen Thi Mui; Le Xuan Thang; Phan Quang Trung; Trinh Cong Tu; Tran Tien Dung

2014-01-01

Fallout radionuclides Be-7 and Cs-137 were applied to assess soil erosion rates for a 270.5 km² catchment with a variety of slopes (from 0° to more than 45°), crops or vegetation (natural forest, artificial forest, perennial crops, annual crops) and a variety of tillage and soil conservation measures. Soil erosion rates were estimated at 90 areas within the catchment. Each sampling area differs from the others in at least one feature: slope, rainfall, crops or farming practice. Soil erosion rates in this region depend significantly on the slope, crops and farming techniques. Averaging over crops, soil erosion rates for slopes of 0-5°, 5-15°, 15-25° and 25-35° are 5.0, 12.8, 18.9 and 21.3 t ha⁻¹ y⁻¹, respectively. Forest land has the lowest soil erosion rates, ranging between 0.5 and 14 t ha⁻¹ y⁻¹ depending on the slope. Annual crop land has the highest soil erosion rates, ranging between 6 and 42 t ha⁻¹ y⁻¹ as the slope varies from < 5° to 32°. Perennial crop land has soil erosion rates in the range of 5 to 39 t ha⁻¹ y⁻¹. In areas with the same slope, the soil erosion rate is highest for cashew plantations, lower for mulberry fields and lowest for tea or coffee plantations. Soil erosion has resulted in the loss of a significant quantity of plant nutrients such as OM, N, P₂O₅ and K₂O every year. Generally, the quantities of nutrients lost to soil erosion are proportional to erosion rates. Some areas of annual crop land lost a large amount of nutrients every year, up to 1435 kg OM, 79 kg N, 54 kg P₂O₅ and 36 kg K₂O. Similarly, perennial crop lands in this region could lose up to 1736 kg OM, 91 kg N, 66 kg P₂O₅ and 40 kg K₂O every year. Owing to soil erosion, the catchment has lost about 211200 tons of surface soil per year during the last 50 years, corresponding to a rate of 7.8 t ha⁻¹ y⁻¹. This amount of eroded soil was deposited in the drainage of the catchment and in reservoirs
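The catchment-wide rate quoted at the end of this abstract follows directly from the stated area and annual soil loss: 270.5 km² is 27,050 ha, and 211,200 t per year spread over that area is about 7.8 t per hectare per year. A minimal check, with a function name chosen here purely for illustration:

```python
def erosion_rate_t_per_ha_per_year(soil_loss_t_per_year: float, area_km2: float) -> float:
    """Convert an annual catchment-wide soil loss (t/y) to an areal rate (t/ha/y)."""
    area_ha = area_km2 * 100.0        # 1 km^2 = 100 ha
    return soil_loss_t_per_year / area_ha


# Figures taken from the abstract: 211200 t/y lost from a 270.5 km^2 catchment.
rate = erosion_rate_t_per_ha_per_year(211200, 270.5)
print(round(rate, 1))
```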
The Genome of the Generalist Plant Pathogen Fusarium avenaceum Is Enriched with Genes Involved in Redox, Signaling and Secondary Metabolism

Science.gov (United States)

Lysøe, Erik; Harris, Linda J.; Walkowiak, Sean; Subramaniam, Rajagopal; Divon, Hege H.; Riiser, Even S.; Llorens, Carlos; Gabaldón, Toni; Kistler, H. Corby; Jonkers, Wilfried; Kolseth, Anna-Karin; Nielsen, Kristian F.; Thrane, Ulf; Frandsen, Rasmus J. N.

2014-01-01

Fusarium avenaceum is a fungus commonly isolated from soil and associated with a wide range of host plants. We present here three genome sequences of F. avenaceum, one isolated from barley in Finland and two from spring and winter wheat in Canada. The sizes of the three genomes range from 41.6 to 43.1 MB, with 13217–13445 predicted protein-coding genes. Whole-genome analysis showed that the three genomes are highly syntenic and share >95% gene orthologs. Comparative analysis to other sequenced Fusaria shows that F. avenaceum has a very large potential for producing secondary metabolites, with between 75 and 80 key enzymes belonging to the polyketide, non-ribosomal peptide, terpene, alkaloid and indole-diterpene synthase classes. In addition to known metabolites from F. avenaceum, fuscofusarin and JM-47 were detected for the first time in this species. Many protein families are expanded in F. avenaceum, such as transcription factors, and proteins involved in redox reactions and signal transduction, suggesting evolutionary adaptation to a diverse and cosmopolitan ecology. We found that 20% of all predicted proteins were considered to be secreted, supporting a life in the extracellular space during interaction with plant hosts. PMID:25409087
Complete genome sequence of the industrial bacterium Bacillus licheniformis and comparisons with closely related Bacillus species

Science.gov (United States)

Rey, Michael W; Ramaiya, Preethi; Nelson, Beth A; Brody-Karpin, Shari D; Zaretsky, Elizabeth J; Tang, Maria; de Leon, Alfredo Lopez; Xiang, Henry; Gusti, Veronica; Clausen, Ib Groth; Olsen, Peter B; Rasmussen, Michael D; Andersen, Jens T; Jørgensen, Per L; Larsen, Thomas S; Sorokin, Alexei; Bolotin, Alexander; Lapidus, Alla; Galleron, Nathalie; Ehrlich, S Dusko; Berka, Randy M

2004-01-01

Background: Bacillus licheniformis is a Gram-positive, spore-forming soil bacterium that is used in the biotechnology industry to manufacture enzymes, antibiotics, biochemicals and consumer products. This species is closely related to the well studied model organism Bacillus subtilis, and produces an assortment of extracellular enzymes that may contribute to nutrient cycling in nature. Results: We determined the complete nucleotide sequence of the B. licheniformis ATCC 14580 genome which comprises a circular chromosome of 4,222,336 base-pairs (bp) containing 4,208 predicted protein-coding genes with an average size of 873 bp, seven rRNA operons, and 72 tRNA genes. The B. licheniformis chromosome contains large regions that are colinear with the genomes of B. subtilis and Bacillus halodurans, and approximately 80% of the predicted B. licheniformis coding sequences have B. subtilis orthologs. Conclusions: Despite the unmistakable organizational similarities between the B. licheniformis and B. subtilis genomes, there are notable differences in the numbers and locations of prophages, transposable elements and a number of extracellular enzymes and secondary metabolic pathway operons that distinguish these species. Differences include a region of more than 80 kilobases (kb) that comprises a cluster of polyketide synthase genes and a second operon of 38 kb encoding plipastatin synthase enzymes that are absent in the B. licheniformis genome. The availability of a completed genome sequence for B. licheniformis should facilitate the design and construction of improved industrial strains and allow for comparative genomics and evolutionary studies within this group of Bacillaceae. PMID:15461803

DNA Data Bank of Japan at work on genome sequence data.

Science.gov (United States)

Tateno, Y; Fukami-Kobayashi, K; Miyazaki, S; Sugawara, H; Gojobori, T

1998-01-01

We at the DNA Data Bank of Japan (DDBJ) (http://www.ddbj.nig.ac.jp) have recently begun receiving, processing and releasing EST and genome sequence data submitted by various Japanese genome projects. The data include those for human, Arabidopsis thaliana, rice, nematode, Synechocystis sp. and Escherichia coli. Since the quantity of data is very large, we organized teams to conduct preliminary discussions with project teams about data submission and handling for release to the public. We also developed a mass submission tool to cope with a large quantity of data. In addition, to provide genome data on WWW, we developed a genome information system using Java. This system (http://mol.genes.nig.ac.jp/ecoli/) can in theory be used for any genome sequence data. These activities will facilitate processing of large quantities of EST and genome data.
The Drosophila genome nexus: a population genomic resource of 623 Drosophila melanogaster genomes, including 197 from a single ancestral range population.

Science.gov (United States)

Lack, Justin B; Cardeno, Charis M; Crepeau, Marc W; Taylor, William; Corbett-Detig, Russell B; Stevens, Kristian A; Langley, Charles H; Pool, John E

2015-04-01

Hundreds of wild-derived Drosophila melanogaster genomes have been published, but rigorous comparisons across data sets are precluded by differences in alignment methodology. The most common approach to reference-based genome assembly is a single round of alignment followed by quality filtering and variant detection. We evaluated variations and extensions of this approach and settled on an assembly strategy that utilizes two alignment programs and incorporates both substitutions and short indels to construct an updated reference for a second round of mapping prior to final variant detection. Utilizing this approach, we reassembled published D. melanogaster population genomic data sets and added unpublished genomes from several sub-Saharan populations. Most notably, we present aligned data from phase 3 of the Drosophila Population Genomics Project (DPGP3), which provides 197 genomes from a single ancestral range population of D. melanogaster (from Zambia). The large sample size, high genetic diversity, and potentially simpler demographic history of the DPGP3 sample will make this a highly valuable resource for fundamental population genetic research. The complete set of assemblies described here, termed the Drosophila Genome Nexus, presently comprises 623 consistently aligned genomes and is publicly available in multiple formats with supporting documentation and bioinformatic tools. This resource will greatly facilitate population genomic analysis in this model species by reducing the methodological differences between data sets. Copyright © 2015 by the Genetics Society of America.
Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration.

Science.gov (United States)

Thorvaldsdóttir, Helga; Robinson, James T; Mesirov, Jill P

2013-03-01

Data visualization is an essential component of genomic data analysis. However, the size and diversity of the data sets produced by today's sequencing and array-based profiling methods present major challenges to visualization tools. The Integrative Genomics Viewer (IGV) is a high-performance viewer that efficiently handles large heterogeneous data sets, while providing a smooth and intuitive user experience at all levels of genome resolution. A key characteristic of IGV is its focus on the integrative nature of genomic studies, with support for both array-based and next-generation sequencing data, and the integration of clinical and phenotypic data. Although IGV is often used to view genomic data from public sources, its primary emphasis is to support researchers who wish to visualize and explore their own data sets or those from colleagues. To that end, IGV supports flexible loading of local and remote data sets, and is optimized to provide high-performance data visualization and exploration on standard desktop systems. IGV is freely available for download from http://www.broadinstitute.org/igv, under a GNU LGPL open-source license.
Genome sequence of the moderately thermophilic sulfur-reducing bacterium Thermanaerovibrio velox type strain (Z-9701^T) and emended description of the genus Thermanaerovibrio

OpenAIRE

Palaniappan, K; Meier-Kolthoff, JP; Teshima, H; Nolan, M; Lapidus, A; Tice, H; Del Rio, TG; Cheng, JF; Han, C; Tapia, R; Goodwin, LA; Pitluck, S; Liolios, K; Mavromatis, K; Pagani, I

2013-01-01

Thermanaerovibrio velox Zavarzina et al. 2000 is a member of the Synergistaceae, a family in the phylum Synergistetes that is already well-characterized at the genome level. Members of this phylum were described as Gram-negative staining anaerobic bacteria with a rod/vibrioid cell shape and possessing an atypical outer cell envelope. They inhabit a large variety of anaerobic environments including soil, oil wells, wastewater treatment plants and animal gastrointestinal tracts. They are also...
Implementation of genomic recursions in single-step genomic best linear unbiased predictor for US Holsteins with a large number of genotyped animals.

Science.gov (United States)

Masuda, Y; Misztal, I; Tsuruta, S; Legarra, A; Aguilar, I; Lourenco, D A L; Fragomeni, B O; Lawlor, T J

2016-03-01

The objectives of this study were to develop and evaluate an efficient implementation in the computation of the inverse of genomic relationship matrix with the recursion algorithm, called the algorithm for proven and young (APY), in single-step genomic BLUP. We validated genomic predictions for young bulls with more than 500,000 genotyped animals in final score for US Holsteins. Phenotypic data included 11,626,576 final scores on 7,093,380 US Holstein cows, and genotypes were available for 569,404 animals. Daughter deviations for young bulls with no classified daughters in 2009, but at least 30 classified daughters in 2014 were computed using all the phenotypic data. Genomic predictions for the same bulls were calculated with single-step genomic BLUP using phenotypes up to 2009. We calculated the inverse of the genomic relationship matrix GAPY(-1) based on a direct inversion of genomic relationship matrix on a small subset of genotyped animals (core animals) and extended that information to noncore animals by recursion. We tested several sets of core animals including 9,406 bulls with at least 1 classified daughter, 9,406 bulls and 1,052 classified dams of bulls, 9,406 bulls and 7,422 classified cows, and random samples of 5,000 to 30,000 animals. Validation reliability was assessed by the coefficient of determination from regression of daughter deviation on genomic predictions for the predicted young bulls. The reliabilities were 0.39 with 5,000 randomly chosen core animals, 0.45 with the 9,406 bulls, and 7,422 cows as core animals, and 0.44 with the remaining sets. With phenotypes truncated in 2009 and the preconditioned conjugate gradient to solve mixed model equations, the number of rounds to convergence for core animals defined by bulls was 1,343; defined by bulls and cows, 2,066; and defined by 10,000 random animals, at most 1,629. With complete phenotype data, the number of rounds decreased to 858, 1,299, and at most 1,092, respectively. Setting up GAPY(-1
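For orientation, the core/noncore recursion described above is usually written in the block form below. This is the form commonly given in the APY literature, not something stated in this abstract, and the notation is an assumption made here: subscript c denotes core animals, n denotes noncore animals, and the matrix written GAPY(-1) above is the left-hand side.

```latex
\mathbf{G}_{\mathrm{APY}}^{-1} =
\begin{bmatrix} \mathbf{G}_{cc}^{-1} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} \end{bmatrix}
+
\begin{bmatrix} -\mathbf{G}_{cc}^{-1}\mathbf{G}_{cn} \\ \mathbf{I} \end{bmatrix}
\mathbf{M}_{nn}^{-1}
\begin{bmatrix} -\mathbf{G}_{nc}\mathbf{G}_{cc}^{-1} & \mathbf{I} \end{bmatrix},
\qquad
m_{ii} = g_{ii} - \mathbf{g}_{ic}\,\mathbf{G}_{cc}^{-1}\,\mathbf{g}_{ci}
```

Here M_nn is diagonal with one element m_ii per noncore animal, so only the core block G_cc is inverted directly; this is why the cost depends mainly on the number of core animals rather than on the total number genotyped.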
Genome Improvement at JGI-HAGSC

Energy Technology Data Exchange (ETDEWEB)

Grimwood, Jane; Schmutz, Jeremy J.; Myers, Richard M.

2012-03-03

Since the completion of the sequencing of the human genome, the Joint Genome Institute (JGI) has rapidly expanded its scientific goals in several DOE mission-relevant areas. At the JGI-HAGSC, we have kept pace with this rapid expansion of projects with our focus on assessing, assembling, improving and finishing eukaryotic whole genome shotgun (WGS) projects for which the shotgun sequence is generated at the Production Genomic Facility (JGI-PGF). We follow this by combining the draft WGS with genomic resources generated at JGI-HAGSC or in collaborator laboratories (including BAC end sequences, genetic maps and FLcDNA sequences) to produce an improved draft sequence. For eukaryotic genomes important to the DOE mission, we then add further information from directed experiments to produce reference genomic sequences that are publicly available for any scientific researcher. Also, we have continued our program for producing BAC-based finished sequence, both for adding information to JGI genome projects and for small BAC-based sequencing projects proposed through any of the JGI sequencing programs. We have now built our computational expertise in WGS assembly and analysis and have moved eukaryotic genome assembly from the JGI-PGF to JGI-HAGSC. We have concentrated our assembly development work on large plant genomes and complex fungal and algal genomes.
Small genomes and large seeds: chromosome numbers, genome size and seed mass in diploid Aesculus species (Sapindaceae)

Czech Academy of Sciences Publication Activity Database

Krahulcová, Anna; Trávníček, Pavel; Krahulec, František; Rejmánek, M.

2017-01-01

Vol. 119, No. 6 (2017), pp. 957-964 ISSN 0305-7364 Institutional support: RVO:67985939 Keywords: Aesculus * chromosome number * genome size * phylogeny * seed mass Subject RIV: EF - Botanics OECD field: Plant sciences, botany Impact factor: 4.041, year: 2016
Genome sequencing and annotation of Amycolatopsis azurea DSM 43854T

Directory of Open Access Journals (Sweden)

Indu Khatri

2014-12-01

We report the 9.2 Mb genome of the azureomycin A and B antibiotic producing strain Amycolatopsis azurea isolated from a Japanese soil sample. The draft genome of strain DSM 43854T consists of 9,223,451 bp with a G + C content of 69.0% and the genome contains 3 rRNA genes (5S–23S–16S) and 58 aminoacyl-tRNA synthetase genes. The homology searches revealed that the PKS gene clusters are supposed to be responsible for the biosynthesis of naptomycin, macbecin, rifamycin, mitomycin, maduropeptin enediyne, neocarzinostatin enediyne, C-1027 enediyne, calicheamicin enediyne, landomycin, simocyclinone, medermycin, granaticin, polyketomycin, teicoplanin, balhimycin, vancomycin, staurosporine, rubradirin and complestatin.
Exploring the potential offered by legacy soil databases for ecosystem services mapping of Central African soils

Science.gov (United States)

Verdoodt, Ann; Baert, Geert; Van Ranst, Eric

2014-05-01

Central African soil resources are characterised by a large variability, ranging from stony, shallow or sandy soils with poor life-sustaining capabilities to highly weathered soils that recycle and support large amounts of biomass. Socio-economic drivers within this largely rural region foster inappropriate land use and management, threaten soil quality and finally culminate into a declining soil productivity and increasing food insecurity. For the development of sustainable land use strategies targeting development planning and natural hazard mitigation, decision makers often rely on legacy soil maps and soil profile databases. Recent development cooperation financed projects led to the design of soil information systems for Rwanda, D.R. Congo, and (ongoing) Burundi. A major challenge is to exploit these existing soil databases and convert them into soil inference systems through an optimal combination of digital soil mapping techniques, land evaluation tools, and biogeochemical models. This presentation aims at (1) highlighting some key characteristics of typical Central African soils, (2) assessing the positional, geographic and semantic quality of the soil information systems, and (3) revealing its potential impacts on the use of these datasets for thematic mapping of soil ecosystem services (e.g. organic carbon storage, pH buffering capacity). Soil map quality is assessed considering positional and semantic quality, as well as geographic completeness. Descriptive statistics, decision tree classification and linear regression techniques are used to mine the soil profile databases. Geo-matching as well as class-matching approaches are considered when developing thematic maps. Variability in inherent as well as dynamic soil properties within the soil taxonomic units is highlighted. It is hypothesized that within-unit variation in soil properties highly affects the use and interpretation of thematic maps for ecosystem services mapping. Results will mainly be based
Effect of different soil washing solutions on bioavailability of residual arsenic in soils and soil properties.

Science.gov (United States)

Im, Jinwoo; Yang, Kyung; Jho, Eun Hea; Nam, Kyoungphile

2015-11-01

The effect of soil washing used for arsenic (As)-contaminated soil remediation on soil properties and bioavailability of residual As in soil is receiving increasing attention due to increasing interest in conserving soil qualities after remediation. This study investigates the effect of different washing solutions on bioavailability of residual As in soils and soil properties after soil washing. Regardless of washing solutions, the sequential extraction revealed that the residual As concentrations and the amount of readily labile As in soils were reduced after soil washing. However, the bioassay tests showed that the washed soils exhibited ecotoxicological effects - lower seed germination, shoot growth, and enzyme activities - and this could largely be attributed to the acidic pH and/or excessive nutrient contents of the washed soils depending on washing solutions. Overall, this study showed that treated soils having lower levels of contaminants could still exhibit toxic effects due to changes in soil properties, which highly depended on washing solutions. This study also emphasizes that data on the As concentrations, the soil properties, and the ecotoxicological effects are necessary to properly manage the washed soils for reuses. The results of this study can, thus, be utilized to select proper post-treatment techniques for the washed soils. Copyright © 2015 Elsevier Ltd. All rights reserved.
Sequencing the CHO DXB11 genome reveals regional variations in genomic stability and haploidy

DEFF Research Database (Denmark)

Kaas, Christian Schrøder; Kristensen, Claus; Betenbaugh, Michael J.

2015-01-01

Background: The DHFR negative CHO DXB11 cell line (also known as DUX-B11 and DUKX) was historically the first CHO cell line to be used for large scale production of heterologous proteins and is still used for production of a number of complex proteins. Results: Here we present the genomic sequence of the CHO DXB11 genome sequenced to a depth of 33x. Overall a significant genomic drift was seen favoring GC -> AT point mutations in line with the chemical mutagenesis strategy used for generation of the cell line. The sequencing depth for each gene in the genome revealed distinct peaks at sequencing... in eight additional analyzed CHO genomes (15-20% haploidy) but not in the genome of the Chinese hamster. The dhfr gene is confirmed to be haploid in CHO DXB11; transcriptionally active and the remaining allele contains a G410C point mutation causing a Thr137Arg missense mutation. We find ~2...
Genome Sequences of Marine Shrimp Exopalaemon carinicauda Holthuis Provide Insights into Genome Size Evolution of Caridea.

Science.gov (United States)

Yuan, Jianbo; Gao, Yi; Zhang, Xiaojun; Wei, Jiankai; Liu, Chengzhang; Li, Fuhua; Xiang, Jianhai

2017-07-05

Crustacea, particularly Decapoda, contains many economically important species, such as shrimps and crabs. Crustaceans exhibit enormous (nearly 500-fold) variability in genome size. However, limited genome resources are available for investigating these species. Exopalaemon carinicauda Holthuis, an economically important caridean shrimp, is a potentially ideal experimental animal for research on crustaceans. In this study, we performed low-coverage sequencing and de novo assembly of the E. carinicauda genome. The assembly covers more than 95% of coding regions. E. carinicauda possesses a large, complex genome (5.73 Gb), about twice the size of those of many decapod shrimps. Comparative genomic analyses were therefore performed to investigate factors affecting genome size evolution of decapods. However, clues associated with genome duplication were not identified, and few horizontally transferred sequences were detected. Ultimately, the burst of transposable elements, especially retrotransposons, was determined to be the major factor influencing genome expansion. A total of 2 Gb of repeats were identified, and RTE-BovB, Jockey, Gypsy, and DIRS were the four major retrotransposons that significantly expanded. Retrotransposons of both recent (Jockey and Gypsy) and ancestral (DIRS) origin were responsible for the genome evolution. The E. carinicauda genome also exhibited potential for the genomic and experimental research of shrimps.
A genome-wide association study to detect QTL for commercially important traits in Swiss Large White boars.

Directory of Open Access Journals (Sweden)

Doreen Becker

The improvement of meat quality and production traits has high priority in the pork industry. Many of these traits show a low to moderate heritability and are difficult and expensive to measure. Their improvement by targeted breeding programs is challenging and requires knowledge of the genetic and molecular background. For this study we genotyped 192 artificial insemination boars of a commercial line derived from the Swiss Large White breed using the PorcineSNP60 BeadChip with 62,163 evenly spaced SNPs across the pig genome. We obtained 26 estimated breeding values (EBVs) for various traits including exterior, meat quality, reproduction, and production. The subsequent genome-wide association analysis allowed us to identify four QTL with suggestive significance for three of these traits (p-values ranging from 4.99×10⁻⁶ to 2.73×10⁻⁵). Single QTL for the EBVs pH one hour post mortem (pH1) and carcass length were on pig chromosome (SSC) 14 and SSC 2, respectively. Two QTL for the EBV rear view hind legs were on SSC 10 and SSC 16.
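As a rough illustration of why the four QTL above are described as having only suggestive significance, the sketch below compares the reported p-value range with a Bonferroni-corrected genome-wide threshold. The 0.05 family-wise error rate and the choice of one test per SNP on the chip are assumptions made here for illustration; the abstract does not state which thresholds the study used.

```python
# Assumed for illustration: alpha = 0.05 and one test per SNP on the chip.
N_SNPS = 62_163
ALPHA = 0.05


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test p-value cutoff that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


# Smallest and largest p-values reported for the four QTL.
reported_p_values = [4.99e-6, 2.73e-5]

threshold = bonferroni_threshold(ALPHA, N_SNPS)
passes = [p <= threshold for p in reported_p_values]

print(f"Bonferroni threshold: {threshold:.2e}")
print(f"Genome-wide significant under this assumption: {passes}")
```

Under these assumptions neither end of the reported range reaches the corrected threshold, which is in line with the abstract's wording of "suggestive" rather than genome-wide significance.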
Analysis of ground response data at Lotung large-scale soil-structure interaction experiment site

International Nuclear Information System (INIS)

Chang, C.Y.; Mok, C.M.; Power, M.S.

1991-12-01

The Electric Power Research Institute (EPRI), in cooperation with the Taiwan Power Company (TPC), constructed two models (1/4-scale and 1/2-scale) of a nuclear plant containment structure at a site in Lotung (Tang, 1987), a seismically active region in northeast Taiwan. The models were constructed to gather data for the evaluation and validation of soil-structure interaction (SSI) analysis methodologies. Extensive instrumentation was deployed to record both structural and ground responses at the site during earthquakes. The experiment is generally referred to as the Lotung Large-Scale Seismic Test (LSST). As part of the LSST, two downhole arrays were installed at the site to record ground motions at depth as well as at the ground surface. Structural response and ground response have been recorded for a number of earthquakes (i.e. a total of 18 earthquakes in the period of October 1985 through November 1986) at the LSST site since the completion of the installation of the downhole instruments in October 1985. These data include those from earthquakes having magnitudes ranging from M_L 4.5 to M_L 7.0 and epicentral distances ranging from 4.7 km to 77.7 km. Peak ground surface accelerations range from 0.03 g to 0.21 g for the horizontal component and from 0.01 g to 0.20 g for the vertical component. The objectives of the study were: (1) to obtain empirical data on variations of earthquake ground motion with depth; (2) to examine field evidence of nonlinear soil response due to earthquake shaking and to determine the degree of soil nonlinearity; (3) to assess the ability of ground response analysis techniques, including techniques to approximate nonlinear soil response, to estimate ground motions due to earthquake shaking; and (4) to analyze earth pressures recorded beneath the basemat and on the side wall of the 1/4-scale model structure during selected earthquakes.
Genome bioinformatics of tomato and potato

NARCIS (Netherlands)

Datema, E.

2011-01-01

In the past two decades genome sequencing has developed from a laborious and costly technology employed by large international consortia to a widely used, automated and affordable tool used worldwide by many individual research groups. Genome sequences of many food animals and crop plants have
Genome Sequences of Oryza Species

KAUST Repository

Kumagai, Masahiko

2018-02-14

This chapter summarizes recent data obtained from genome sequencing, annotation projects, and studies on the genome diversity of Oryza sativa and related Oryza species. O. sativa, commonly known as Asian rice, is the first monocot species whose complete genome sequence was deciphered based on physical mapping by an international collaborative effort. This genome, along with its accurate and comprehensive annotation, has become an indispensable foundation for crop genomics and breeding. With the development of innovative sequencing technologies, genomic studies of O. sativa have dramatically increased; in particular, a large number of cultivars and wild accessions have been sequenced and compared with the reference rice genome. Since de novo genome sequencing has become cost-effective, the genome of African cultivated rice, O. glaberrima, has also been determined. Comparative genomic studies have highlighted the independent domestication processes of different rice species, but it also turned out that Asian and African rice share a common gene set that has experienced similar artificial selection. An international project aimed at constructing reference genomes and examining the genome diversity of wild Oryza species is currently underway, and the genomes of some species are publicly available. This project provides a platform for investigations such as the evolution, development, polyploidization, and improvement of crops. Studies on the genomic diversity of Oryza species, including wild species, should provide new insights to solve the problem of growing food demands in the face of rapid climatic changes.
Genome Sequences of Oryza Species

KAUST Repository

Kumagai, Masahiko; Tanaka, Tsuyoshi; Ohyanagi, Hajime; Hsing, Yue-Ie C.; Itoh, Takeshi

2018-01-01

This chapter summarizes recent data obtained from genome sequencing, annotation projects, and studies on the genome diversity of Oryza sativa and related Oryza species. O. sativa, commonly known as Asian rice, is the first monocot species whose complete genome sequence was deciphered based on physical mapping by an international collaborative effort. This genome, along with its accurate and comprehensive annotation, has become an indispensable foundation for crop genomics and breeding. With the development of innovative sequencing technologies, genomic studies of O. sativa have dramatically increased; in particular, a large number of cultivars and wild accessions have been sequenced and compared with the reference rice genome. Since de novo genome sequencing has become cost-effective, the genome of African cultivated rice, O. glaberrima, has also been determined. Comparative genomic studies have highlighted the independent domestication processes of different rice species, but it also turned out that Asian and African rice share a common gene set that has experienced similar artificial selection. An international project aimed at constructing reference genomes and examining the genome diversity of wild Oryza species is currently underway, and the genomes of some species are publicly available. This project provides a platform for investigations such as the evolution, development, polyploidization, and improvement of crops. Studies on the genomic diversity of Oryza species, including wild species, should provide new insights to solve the problem of growing food demands in the face of rapid climatic changes.
Soil-geographical regionalization as a basis for digital soil mapping: Karelia case study

Science.gov (United States)

Krasilnikov, P.; Sidorova, V.; Dubrovina, I.

2010-12-01

Recent developments in digital soil mapping (DSM) have significantly improved the quality of soil maps. We tried to build a set of empirical models for the territory of Karelia, a republic in the north-east of the European part of the Russian Federation. This territory was selected for the DSM pilot study for two reasons. First, the soils of the region are mainly monogenetic; thus, the effect of the paleogeographic environment on recent soils is reduced. Second, the territory was poorly mapped because of low agricultural development: only 1.8% of the total area of the republic is used for agriculture and has large-scale soil maps. The rest of the territory has only small-scale soil maps, compiled on the basis of general geographic concepts rather than field surveys. Thus, the only solution for soil inventory was predictive digital mapping. The absence of large-scale soil maps did not allow data mining from previous soil surveys, and only empirical models could be applied. For regionalization purposes, we accepted the division into Northern and Southern Karelia proposed in the general scheme of soil regionalization of Russia; boundaries between the regions were somewhat modified. Within each region, we specified from 15 (Northern Karelia) to 32 (Southern Karelia) individual soilscapes and proposed soil-topographic and soil-lithological relationships for every soilscape. Further field verification is needed to adjust the models.
Insights from Human/Mouse genome comparisons

Energy Technology Data Exchange (ETDEWEB)

Pennacchio, Len A.

2003-03-30

Large-scale public genomic sequencing efforts have provided a wealth of vertebrate sequence data poised to provide insights into mammalian biology. These include deep genomic sequence coverage of human, mouse, rat, zebrafish, and two pufferfish (Fugu rubripes and Tetraodon nigroviridis) (Aparicio et al. 2002; Lander et al. 2001; Venter et al. 2001; Waterston et al. 2002). In addition, a high priority has been placed on determining the genomic sequence of chimpanzee, dog, cow, frog, and chicken (Boguski 2002). While only recently available, whole genome sequence data have provided the unique opportunity to globally compare complete genome contents. Furthermore, the shared evolutionary ancestry of vertebrate species has allowed the development of comparative genomic approaches to identify ancient conserved sequences with functionality. Accordingly, this review focuses on the initial comparison of available mammalian genomes and describes various insights derived from such analysis.
Observing copepods through a genomic lens

Directory of Open Access Journals (Sweden)

Johnson Stewart C

2011-09-01

Background: Copepods outnumber every other multicellular animal group. They are critical components of the world's freshwater and marine ecosystems, sensitive indicators of local and global climate change, key ecosystem service providers, parasites and predators of economically important aquatic animals and potential vectors of waterborne disease. Copepods sustain the world fisheries that nourish and support human populations. Although genomic tools have transformed many areas of biological and biomedical research, their power to elucidate aspects of the biology, behavior and ecology of copepods has only recently begun to be exploited. Discussion: The extraordinary biological and ecological diversity of the subclass Copepoda provides both unique advantages for addressing key problems in aquatic systems and formidable challenges for developing a focused genomics strategy. This article provides an overview of genomic studies of copepods and discusses strategies for using genomics tools to address key questions at levels extending from individuals to ecosystems. Genomics can, for instance, help to decipher patterns of genome evolution such as those that occur during transitions from free living to symbiotic and parasitic lifestyles and can assist in the identification of genetic mechanisms and accompanying physiological changes associated with adaptation to new or physiologically challenging environments. The adaptive significance of the diversity in genome size and unique mechanisms of genome reorganization during development could similarly be explored. Genome-wide and EST studies of parasitic copepods of salmon and large EST studies of selected free-living copepods have demonstrated the potential utility of modern genomics approaches for the study of copepods and have generated resources such as EST libraries, shotgun genome sequences, BAC libraries, genome maps and inbred lines that will be invaluable in assisting further efforts to

Observing copepods through a genomic lens

Science.gov (United States)

2011-01-01

Background: Copepods outnumber every other multicellular animal group. They are critical components of the world's freshwater and marine ecosystems, sensitive indicators of local and global climate change, key ecosystem service providers, parasites and predators of economically important aquatic animals and potential vectors of waterborne disease. Copepods sustain the world fisheries that nourish and support human populations. Although genomic tools have transformed many areas of biological and biomedical research, their power to elucidate aspects of the biology, behavior and ecology of copepods has only recently begun to be exploited. Discussion: The extraordinary biological and ecological diversity of the subclass Copepoda provides both unique advantages for addressing key problems in aquatic systems and formidable challenges for developing a focused genomics strategy. This article provides an overview of genomic studies of copepods and discusses strategies for using genomics tools to address key questions at levels extending from individuals to ecosystems. Genomics can, for instance, help to decipher patterns of genome evolution such as those that occur during transitions from free living to symbiotic and parasitic lifestyles and can assist in the identification of genetic mechanisms and accompanying physiological changes associated with adaptation to new or physiologically challenging environments. The adaptive significance of the diversity in genome size and unique mechanisms of genome reorganization during development could similarly be explored. Genome-wide and EST studies of parasitic copepods of salmon and large EST studies of selected free-living copepods have demonstrated the potential utility of modern genomics approaches for the study of copepods and have generated resources such as EST libraries, shotgun genome sequences, BAC libraries, genome maps and inbred lines that will be invaluable in assisting further efforts to provide genomics tools for
Genomic island excisions in Bordetella petrii

Directory of Open Access Journals (Sweden)

Levillain Erwan

2009-07-01

Background: Among the members of the genus Bordetella, B. petrii is unique, since it is the only species isolated from the environment, while the pathogenic Bordetellae are obligately associated with host organisms. Another feature distinguishing B. petrii from the other sequenced Bordetellae is the presence of a large number of mobile genetic elements, including several large genomic regions with typical characteristics of genomic islands collectively known as integrative and conjugative elements (ICEs). These elements mainly encode accessory metabolic factors enabling this bacterium to grow on a large repertoire of aromatic compounds. Results: During in vitro culture of Bordetella petrii, colony variants appear frequently. We show that this variability can be attributed to the presence of a large number of metastable mobile genetic elements on its chromosome. In fact, the genome sequence of B. petrii revealed the presence of at least seven large genomic islands mostly encoding accessory metabolic functions involved in the degradation of aromatic compounds and detoxification of heavy metals. Four of these islands (termed GI1 to GI3 and GI6) are highly related to ICEclc of Pseudomonas knackmussii sp. strain B13. Here we present first data about the molecular characterization of these islands. We defined the exact borders of each island and we show that during standard culture of the bacteria these islands get excised from the chromosome. For all but one of these islands (GI5) we could detect circular intermediates. For the clc-like elements GI1 to GI3 of B. petrii we provide evidence that tandem insertion of these islands, which all encode highly related integrases and attachment sites, may also lead to incorporation of genomic DNA which originally was not part of the island and to the formation of huge composite islands. By integration of a tetracycline resistance cassette into GI3 we found this island to be rather unstable and to be lost from
Soil friability

DEFF Research Database (Denmark)

Munkholm, Lars Juhl

2011-01-01

This review gathers and synthesizes literature on soil friability produced during the last three decades. Soil friability is of vital importance for crop production and the impact of crop production on the environment. A friable soil is characterized by an ease of fragmentation of undesirably large...... aggregates/clods and a difficulty in fragmentation of minor aggregates into undesirable small elements. Soil friability has been assessed using qualitative field methods as well as quantitative field and laboratory methods at different scales of observation. The qualitative field methods are broadly used...... by scientists, advisors and farmers, whereas the quantitative laboratory methods demand specialized skills and more or less sophisticated equipment. Most methods address only one aspect of soil friability, i.e. either the strength of unconfined soil or the fragment size distribution after applying a stress. All...
Biological soil crusts emit large amounts of NO and HONO affecting the nitrogen cycle in drylands

Science.gov (United States)

Tamm, Alexandra; Wu, Dianming; Ruckteschler, Nina; Rodríguez-Caballero, Emilio; Steinkamp, Jörg; Meusel, Hannah; Elbert, Wolfgang; Behrendt, Thomas; Sörgel, Matthias; Cheng, Yafang; Crutzen, Paul J.; Su, Hang; Pöschl, Ulrich; Weber, Bettina

2016-04-01

to the latest IPCC report. In summary, our measurements show that dryland emissions of nitrogen oxides are largely driven by biocrusts and not by the underlying soil. As precipitation patterns, which influence biocrust activity, are affected by climate change, alterations in global nitrogen oxide emissions are to be expected. Thus, the role of biocrusts in the global cycling of reactive nitrogen needs to be followed and also implemented in regional and global models of biogeochemistry, air chemistry and climate.
Dryland biological soil crust cyanobacteria show unexpected decreases in abundance under long-term elevated CO2

Science.gov (United States)

Steven, Blaire; Gallegos-Graves, La Verne; Yeager, Chris M.; Belnap, Jayne; Evans, R. David; Kuske, Cheryl R.

2012-01-01

Biological soil crusts (biocrusts) cover soil surfaces in many drylands globally. The impacts of 10 years of elevated atmospheric CO2 on the cyanobacteria in biocrusts of an arid shrubland were examined at a large manipulated experiment in Nevada, USA. Cyanobacteria-specific quantitative PCR surveys of cyanobacteria small-subunit (SSU) rRNA genes suggested a reduction in biocrust cyanobacterial biomass in the elevated CO2 treatment relative to the ambient controls. Additionally, SSU rRNA gene libraries and shotgun metagenomes showed reduced representation of cyanobacteria in the total microbial community. Taxonomic composition of the cyanobacteria was similar under ambient and elevated CO2 conditions, indicating the decline was manifest across multiple cyanobacterial lineages. Recruitment of cyanobacteria sequences from replicate shotgun metagenomes to cyanobacterial genomes representing major biocrust orders also suggested decreased abundance of cyanobacteria sequences across the majority of genomes tested. Functional assignment of cyanobacteria-related shotgun metagenome sequences indicated that four subsystem categories, three related to oxidative stress, were differentially abundant in relation to the elevated CO2 treatment. Taken together, these results suggest that elevated CO2 caused a generalized decrease in cyanobacteria in the biocrusts and may have favoured cyanobacteria with altered gene inventories for coping with oxidative stress.
Plantagora: modeling whole genome sequencing and assembly of plant genomes.

Directory of Open Access Journals (Sweden)

Roger Barthelson

BACKGROUND: Genomics studies are being revolutionized by the next generation sequencing technologies, which have made whole genome sequencing much more accessible to the average researcher. Whole genome sequencing with the new technologies is a developing art that, despite the large volumes of data that can be produced, may still fail to provide a clear and thorough map of a genome. The Plantagora project was conceived to address specifically the gap between having the technical tools for genome sequencing and knowing precisely the best way to use them. METHODOLOGY/PRINCIPAL FINDINGS: For Plantagora, a platform was created for generating simulated reads from several different plant genomes of different sizes. The resulting read files mimicked either 454 or Illumina reads, with varying paired end spacing. Thousands of datasets of reads were created, most derived from our primary model genome, rice chromosome one. All reads were assembled with different software assemblers, including Newbler, Abyss, and SOAPdenovo, and the resulting assemblies were evaluated by an extensive battery of metrics chosen for these studies. The metrics included both statistics of the assembly sequences and fidelity-related measures derived by alignment of the assemblies to the original genome source for the reads. The results were presented in a website, which includes a data graphing tool, all created to help the user compare rapidly the feasibility and effectiveness of different sequencing and assembly strategies prior to testing an approach in the lab. Some of our own conclusions regarding the different strategies were also recorded on the website. CONCLUSIONS/SIGNIFICANCE: Plantagora provides a substantial body of information for comparing different approaches to sequencing a plant genome, and some conclusions regarding some of the specific approaches. Plantagora also provides a platform of metrics and tools for studying the process of sequencing and assembly
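The idea of generating simulated paired-end reads with a chosen spacing can be sketched in a few lines. This is only a toy illustration and not the Plantagora platform itself: the function names, the error-free reads, the fixed fragment length and the random toy genome are all assumptions introduced here.

```python
import random

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def simulate_paired_reads(genome, n_pairs, read_len, insert_size, seed=0):
    """Sample error-free read pairs from fragments of length insert_size.

    Hypothetical helper: each pair is the first read_len bases of a random
    fragment plus the reverse complement of its last read_len bases.
    """
    if not read_len <= insert_size <= len(genome):
        raise ValueError("need read_len <= insert_size <= len(genome)")
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_pairs):
        start = rng.randint(0, len(genome) - insert_size)
        fragment = genome[start:start + insert_size]
        forward = fragment[:read_len]
        reverse = reverse_complement(fragment[-read_len:])
        pairs.append((forward, reverse))
    return pairs


# Toy stand-in for a real chromosome sequence.
toy_genome = "".join(random.Random(1).choice("ACGT") for _ in range(500))
pairs = simulate_paired_reads(toy_genome, n_pairs=5, read_len=20, insert_size=100)
```

Varying `insert_size` across runs would mimic the "varying paired end spacing" mentioned in the abstract; a realistic simulator would also model sequencing errors and a distribution of fragment lengths.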
Genomes in turmoil: quantification of genome dynamics in prokaryote supergenomes.

Science.gov (United States)

Puigbò, Pere; Lobkovsky, Alexander E; Kristensen, David M; Wolf, Yuri I; Koonin, Eugene V

2014-08-21

Genomes of bacteria and archaea (collectively, prokaryotes) appear to exist in incessant flux, expanding via horizontal gene transfer and gene duplication, and contracting via gene loss. However, the actual rates of genome dynamics and relative contributions of different types of event across the diversity of prokaryotes are largely unknown, as are the sizes of microbial supergenomes, i.e. pools of genes that are accessible to the given microbial species. We performed a comprehensive analysis of the genome dynamics in 35 groups (34 bacterial and one archaeal) of closely related microbial genomes using a phylogenetic birth-and-death maximum likelihood model to quantify the rates of gene family gain and loss, as well as expansion and reduction. The results show that loss of gene families dominates the evolution of prokaryotes, occurring at approximately three times the rate of gain. The rates of gene family expansion and reduction are typically seven and twenty times less than the gain and loss rates, respectively. Thus, the prevailing mode of evolution in bacteria and archaea is genome contraction, which is partially compensated by the gain of new gene families via horizontal gene transfer. However, the rates of gene family gain, loss, expansion and reduction vary within wide ranges, with the most stable genomes showing rates about 25 times lower than the most dynamic genomes. For many groups, the supergenome estimated from the fraction of repetitive gene family gains includes about tenfold more gene families than the typical genome in the group although some groups appear to have vast, 'open' supergenomes. Reconstruction of evolution for groups of closely related bacteria and archaea reveals an extremely rapid and highly variable flux of genes in evolving microbial genomes, demonstrates that extensive gene loss and horizontal gene transfer leading to innovation are the two dominant evolutionary processes, and yields robust estimates of the supergenome size.
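The rate ratios quoted in this abstract can be turned into a small worked calculation. The sketch below normalizes everything to a gain rate of 1 and treats the "approximately three times", "seven times less" and "twenty times less" figures as exact, which is an assumption made here purely for illustration; the variable names are likewise hypothetical.

```python
from fractions import Fraction

# Relative rates, with the gene family gain rate as the reference unit.
gain = Fraction(1)
loss = 3 * gain          # loss occurs at roughly three times the rate of gain
expansion = gain / 7     # expansion is about seven times less than gain
reduction = loss / 20    # reduction is about twenty times less than loss

# Net change at the level of whole gene families and of family members.
family_net = gain - loss
member_net = expansion - reduction

print(f"family-level net rate: {family_net}")
print(f"member-level net rate: {member_net}")
```

With these idealized ratios both net rates are negative, which matches the abstract's statement that genome contraction is the prevailing mode of evolution.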
the use of integrated soil fertility approach in the improvement of soil

African Journals Online (AJOL)

Sammy

Innovational practices in the management of organic matters in semi-arid soil, ... Compared to other areas, a large proportion of soil in semi-arid areas has low .... combine old and new methods of nutrient management into ecologically sound and ... Furthermore, organic matter is the energy source for soil fauna and micro ...
Human genetics and genomics a decade after the release of the draft sequence of the human genome

Science.gov (United States)

2011-01-01

Substantial progress has been made in human genetics and genomics research over the past ten years since the publication of the draft sequence of the human genome in 2001. Findings emanating directly from the Human Genome Project, together with those from follow-on studies, have had an enormous impact on our understanding of the architecture and function of the human genome. Major developments have been made in cataloguing genetic variation, the International HapMap Project, and with respect to advances in genotyping technologies. These developments are vital for the emergence of genome-wide association studies in the investigation of complex diseases and traits. In parallel, the advent of high-throughput sequencing technologies has ushered in the 'personal genome sequencing' era for both normal and cancer genomes, and made possible large-scale genome sequencing studies such as the 1000 Genomes Project and the International Cancer Genome Consortium. The high-throughput sequencing and sequence-capture technologies are also providing new opportunities to study Mendelian disorders through exome sequencing and whole-genome sequencing. This paper reviews these major developments in human genetics and genomics over the past decade. PMID:22155605
The Arab genome: Health and wealth.

Science.gov (United States)

Zayed, Hatem

2016-11-05

The 22 Arab nations have a unique genetic structure, which reflects both conserved and diverse gene pools due to the prevalent endogamous and consanguineous marriage culture and the long history of admixture among different ethnic subcultures descended from the Asian, European, and African continents. Human genome sequencing has enabled large-scale genomic studies of different populations and has become a powerful tool for studying disease predictions and diagnosis. Despite the importance of the Arab genome for better understanding the dynamics of the human genome, discovering rare genetic variations, and studying early human migration out of Africa, it is poorly represented in human genome databases, such as HapMap and the 1000 Genomes Project. In this review, I demonstrate the significance of sequencing the Arab genome and setting an Arab genome reference(s) for better understanding the molecular pathogenesis of genetic diseases, discovering novel/rare variants, and identifying a meaningful genotype-phenotype correlation for complex diseases. Copyright © 2016. Published by Elsevier B.V.
The complete chloroplast genome sequence of Podocarpus lambertii: genome structure, evolutionary aspects, gene content and SSR detection.

Directory of Open Access Journals (Sweden)

Leila do Nascimento Vieira

BACKGROUND: Podocarpus lambertii (Podocarpaceae) is a native conifer from the Brazilian Atlantic Forest Biome, which is considered one of the 25 biodiversity hotspots in the world. The advancement of next-generation sequencing technologies has enabled the rapid acquisition of whole chloroplast (cp) genome sequences at low cost. Several studies have proven the potential of cp genomes as tools to understand enigmatic and basal phylogenetic relationships at different taxonomic levels, as well as further probe the structural and functional evolution of plants. In this work, we present the complete cp genome sequence of P. lambertii. METHODOLOGY/PRINCIPAL FINDINGS: The P. lambertii cp genome is 133,734 bp in length, and similar to other sequenced cupressophytes, it lacks one of the large inverted repeat regions (IR). It contains 118 unique genes and one duplicated tRNA (trnN-GUU), which occurs as an inverted repeat sequence. The rps16 gene was not found, which was previously reported for the plastid genome of another Podocarpaceae (Nageia nagi) and Araucariaceae (Agathis dammara). Structurally, P. lambertii shows 4 inversions of a large DNA fragment ∼20,000 bp compared to the Podocarpus totara cp genome. These unexpected characteristics may be attributed to geographical distance and different adaptive needs. The P. lambertii cp genome presents a total of 28 tandem repeats and 156 SSRs, with homo- and dipolymers being the most common and tri-, tetra-, penta-, and hexapolymers occurring with less frequency. CONCLUSION: The complete cp genome sequence of P. lambertii revealed significant structural changes, even in species from the same genus. These results reinforce the apparent loss of the rps16 gene in the Podocarpaceae cp genome. In addition, several SSRs in the P. lambertii cp genome are likely intraspecific polymorphism sites, which may allow highly sensitive phylogeographic and population structure studies, as well as phylogenetic studies of species of
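To make the SSR counts above more concrete, here is a minimal sketch of how perfect homo- and dinucleotide repeats might be located in a sequence with regular expressions. It is not the tool used in the study: the minimum repeat counts (8 for mononucleotide, 4 for dinucleotide motifs) and the function name are assumptions chosen for illustration, since the abstract does not state the detection thresholds.

```python
import re

# Hypothetical thresholds; the abstract does not give the ones actually used.
MIN_MONO = 8   # minimum length of a homopolymer run
MIN_DI = 4     # minimum number of dinucleotide motif copies

# A single base followed by at least MIN_MONO - 1 copies of itself.
MONO = re.compile(r"([ACGT])\1{%d,}" % (MIN_MONO - 1))
# Two different bases followed by at least MIN_DI - 1 copies of the pair;
# the lookahead stops homopolymers from being reported as "AA"-type motifs.
DI = re.compile(r"(([ACGT])(?!\2)[ACGT])\1{%d,}" % (MIN_DI - 1))


def find_ssrs(seq: str):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for pattern in (MONO, DI):
        for match in pattern.finditer(seq):
            motif = match.group(1)
            count = len(match.group(0)) // len(motif)
            hits.append((match.start(), motif, count))
    return sorted(hits)


example = "GGC" + "A" * 10 + "CGG" + "AT" * 5 + "GCC"
ssrs = find_ssrs(example)
```

Longer motifs (tri- to hexanucleotide) could be handled with analogous patterns, at the cost of extra care to exclude motifs that are themselves repeats of a shorter unit.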
Big Data Analysis of Human Genome Variations

KAUST Repository

Gojobori, Takashi

2016-01-01

Since the human genome draft sequence was made public for the first time in 2000, genomic analyses have been intensively extended to the population level. The following three international projects are good examples of large-scale studies of human
Cultural Patterns of Soil Understanding

Science.gov (United States)

Patzel, Nikola; Feller, Christian

2017-04-01

Living soil supports all terrestrial ecosystems. The only global threat to earth's soils comes from human societies' land use and resource-consuming activities. Soil perception and understanding by soil scientists are mainly drawn from biophysical parameters and found within Cartesian rationality, and not, or much less consciously, from its rather intangible cultural dimension. Nevertheless, human soil perception, soil awareness, and relation to soil are cultural phenomena, too. With soil awareness and education in mind, it is of first-order importance for the soil science community and the IUSS to study, discuss and communicate the cultural perceptions and representations of soil as well. For any society, cultural patterns in its relation to soil encompass: (i) general underlying cultural structures such as (religious or 'secular') myths and belief systems; (ii) the personal, individual relation to and behaviour towards soil. The latter includes implicit concepts of soil as an integral part of concepts of landscape, because the large majority of humans do not see soil as a distinct object. This communication aims to make evident: (i) the importance of cultural patterns and the psychic/psychological background concerning soil, through case studies and overviews of different cultural areas; (ii) the necessity of developing reflections on this topic, both to communicate about soil with the general public and to raise soil scientists' awareness of the cultural dimension of soils. A working group on this topic was recently founded at the IUSS (Division 4).
Soil Spectroscopy: An Alternative to Wet Chemistry for Soil Monitoring

DEFF Research Database (Denmark)

Nocita, M.; Stevens, A.; van Wesemael, Bas

2015-01-01

The soil science community is facing a growing demand for regional, continental, and worldwide databases in order to monitor the status of the soil. However, such data are very scarce. Cost-effective tools to measure soil properties for large areas (e.g., Europe) are required....... Soil spectroscopy has been shown to be a fast, cost-effective, environmentally friendly, nondestructive, reproducible, and repeatable analytical technique. The main aim of this paper is to describe the state of the art of soil spectroscopy as well as its potential to facilitate soil monitoring. The factors...... constraining the application of soil spectroscopy as an alternative to traditional laboratory analyses, together with the limits of the technique, are addressed. The paper also highlights that the widespread use of spectroscopy to monitor the status of the soil should be encouraged by (1) the creation...
Two Rounds of Whole Genome Duplication in the AncestralVertebrate

Energy Technology Data Exchange (ETDEWEB)

Dehal, Paramvir; Boore, Jeffrey L.

2005-04-12

The hypothesis that the relatively large and complex vertebrate genome was created by two ancient, whole genome duplications has been hotly debated, but remains unresolved. We reconstructed the evolutionary relationships of all gene families from the complete gene sets of a tunicate, fish, mouse, and human, then determined when each gene duplicated relative to the evolutionary tree of the organisms. We confirmed the results of earlier studies that there remains little signal of these events in numbers of duplicated genes, gene tree topology, or the number of genes per multigene family. However, when we plotted the genomic map positions of only the subset of paralogous genes that were duplicated prior to the fish-tetrapod split, their global physical organization provides unmistakable evidence of two distinct genome duplication events early in vertebrate evolution indicated by clear patterns of 4-way paralogous regions covering a large part of the human genome. Our results highlight the potential for these large-scale genomic events to have driven the evolutionary success of the vertebrate lineage.
Bioinformatics for whole-genome shotgun sequencing of microbial communities.

Directory of Open Access Journals (Sweden)

Kevin Chen

2005-07-01

The application of whole-genome shotgun sequencing to microbial communities represents a major development in metagenomics, the study of uncultured microbes via the tools of modern genomic analysis. In the past year, whole-genome shotgun sequencing projects of prokaryotic communities from an acid mine biofilm, the Sargasso Sea, Minnesota farm soil, three deep-sea whale falls, and deep-sea sediments have been reported, adding to previously published work on viral communities from marine and fecal samples. The interpretation of this new kind of data poses a wide variety of exciting and difficult bioinformatics problems. The aim of this review is to introduce the bioinformatics community to this emerging field by surveying existing techniques and promising new approaches for several of the most interesting of these computational problems.
MUMmer4: A fast and versatile genome alignment system.

Directory of Open Access Journals (Sweden)

Guillaume Marçais

2018-01-01

The MUMmer system and the genome sequence aligner nucmer included within it are among the most widely used alignment packages in genomics. Since the last major release of MUMmer version 3 in 2004, it has been applied to many types of problems including aligning whole genome sequences, aligning reads to a reference genome, and comparing different assemblies of the same genome. Despite its broad utility, MUMmer3 has limitations that can make it difficult to use for large genomes and for the very large sequence data sets that are common today. In this paper we describe MUMmer4, a substantially improved version of MUMmer that addresses genome size constraints by changing the 32-bit suffix tree data structure at the core of MUMmer to a 48-bit suffix array, and that offers improved speed through parallel processing of input query sequences. With a theoretical limit on the input size of 141Tbp, MUMmer4 can now work with input sequences of any biologically realistic length. We show that as a result of these enhancements, the nucmer program in MUMmer4 is easily able to handle alignments of large genomes; we illustrate this with an alignment of the human and chimpanzee genomes, which allows us to compute that the two species are 98% identical across 96% of their length. With the enhancements described here, MUMmer4 can also be used to efficiently align reads to reference genomes, although it is less sensitive and accurate than the dedicated read aligners. The nucmer aligner in MUMmer4 can now be called from scripting languages such as Perl, Python and Ruby. These improvements make MUMmer4 one of the most versatile genome alignment packages available.
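The suffix array mentioned in the abstract above can be illustrated with a toy example. The sketch below is not MUMmer4's implementation; it builds a naive suffix array in Python and uses binary search to locate exact matches. The helper names (`build_suffix_array`, `find_occurrences`, `max_positions`) are hypothetical and chosen only for illustration.

```python
def build_suffix_array(text: str) -> list[int]:
    """Start positions of all suffixes of text, in lexicographic order.

    Naive construction by sorting suffixes; fine for a toy example,
    far too slow for real genomes.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text: str, sa: list[int], pattern: str) -> list[int]:
    """Return every position where pattern occurs, via two binary searches."""
    m = len(pattern)
    lo, hi = 0, len(sa)
    while lo < hi:  # first suffix whose m-character prefix is >= pattern
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo
    hi = len(sa)
    while lo < hi:  # first suffix whose m-character prefix is > pattern
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return sorted(sa[start:lo])


def max_positions(index_bits: int) -> int:
    """Number of distinct positions an unsigned index of this width can address."""
    return 2 ** index_bits


if __name__ == "__main__":
    reference = "GATTACAGATTACA"  # made-up sequence
    sa = build_suffix_array(reference)
    print(find_occurrences(reference, sa, "ATTA"))
    print(max_positions(32), max_positions(48))
```

A 32-bit index can address about 4.3 × 10⁹ positions, which is why index width matters for large genomes. As an observation only, 2**47 is about 1.41 × 10¹⁴, which is close to the 141 Tbp limit quoted in the abstract; the abstract does not say how that limit is derived.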
Genomic and Genetic Diversity within the Pseudomonas fluorescens Complex.

Directory of Open Access Journals (Sweden)

Daniel Garrido-Sanz

The Pseudomonas fluorescens complex includes Pseudomonas strains that have been taxonomically assigned to more than fifty different species, many of which have been described as plant growth-promoting rhizobacteria (PGPR) with potential applications in biocontrol and biofertilization. So far the phylogeny of this complex has been analyzed according to phenotypic traits, 16S rDNA, MLSA and inferred by whole-genome analysis. However, since most of the type strains have not been fully sequenced and new species are frequently described, correlation between taxonomy and phylogenomic analysis is missing. In recent years, the genomes of a large number of strains have been sequenced, showing important genomic heterogeneity and providing information suitable for genomic studies that are important to understand the genomic and genetic diversity shown by strains of this complex. Based on MLSA and several whole-genome sequence-based analyses of 93 sequenced strains, we have divided the P. fluorescens complex into eight phylogenomic groups that agree with previous works based on type strains. Digital DDH (dDDH) identified 69 species and 75 subspecies within the 93 genomes. The eight groups corresponded to clustering with a threshold of 31.8% dDDH, in full agreement with our MLSA. The Average Nucleotide Identity (ANI) approach showed inconsistencies regarding the assignment to species and to the eight groups. The small core genome of 1,334 CDSs and the large pan-genome of 30,848 CDSs show the large diversity and genetic heterogeneity of the P. fluorescens complex. However, a low number of strains were enough to explain most of the CDSs diversity at core and strain-specific genomic fractions. Finally, the identification and analysis of group-specific genome and the screening for distinctive characters revealed a phylogenomic distribution of traits among the groups that provided insights into biocontrol and bioremediation applications as well as their role as
Large Scale Sequencing of Dothideomycetes Provides Insights into Genome Evolution and Adaptation

Energy Technology Data Exchange (ETDEWEB)

Haridas, Sajeet; Crous, Pedro; Binder, Manfred; Spatafora, Joseph; Grigoriev, Igor

2015-03-16

Dothideomycetes is the largest and most diverse class of ascomycete fungi with 23 orders, 110 families, 1300 genera and over 19,000 known species. We present comparative analysis of 70 Dothideomycete genomes including over 50 that we sequenced and are as yet unpublished. This extensive sampling has almost quadrupled the previous study of 18 species and uncovered a 10-fold range of genome sizes. We were able to clarify the phylogenetic positions of several species whose origins were unclear in previous morphological and sequence comparison studies. We analyzed selected gene families including proteases, transporters and small secreted proteins and show that major differences in gene content are influenced by speciation.
Improving Genetic Gain with Genomic Selection in Autotetraploid Potato

Directory of Open Access Journals (Sweden)

Anthony T. Slater

2016-11-01

Potato (Solanum tuberosum L.) breeders consider a large number of traits during cultivar development and progress in conventional breeding can be slow. There is accumulating evidence that some of these traits, such as yield, are affected by a large number of genes with small individual effects. Recently, significant efforts have been applied to the development of genomic resources to improve potato breeding, culminating in a draft genome sequence and the identification of a large number of single nucleotide polymorphisms (SNPs). The availability of these genome-wide SNPs is a prerequisite for implementing genomic selection for improvement of polygenic traits such as yield. In this review, we investigate opportunities for the application of genomic selection to potato, including novel breeding program designs. We have considered a number of factors that will influence this process, including the autotetraploid and heterozygous genetic nature of potato, the rate of decay of linkage disequilibrium, the number of required markers, the design of a reference population, and trait heritability. Based on estimates of the effective population size derived from a potato breeding program, we have calculated the expected accuracy of genomic selection for four key traits of varying heritability and propose that it will be reasonably accurate. We compared the expected genetic gain from genomic selection with the expected gain from phenotypic and pedigree selection, and found that genetic gain can be substantially improved by using genomic selection.

Successful application of FTA Classic Card technology and use of bacteriophage phi29 DNA polymerase for large-scale field sampling and cloning of complete maize streak virus genomes.

Science.gov (United States)

Owor, Betty E; Shepherd, Dionne N; Taylor, Nigel J; Edema, Richard; Monjane, Adérito L; Thomson, Jennifer A; Martin, Darren P; Varsani, Arvind

2007-03-01

Leaf samples from 155 maize streak virus (MSV)-infected maize plants were collected from 155 farmers' fields in 23 districts in Uganda in May/June 2005 by leaf-pressing infected samples onto FTA Classic Cards. Viral DNA was successfully extracted from cards stored at room temperature for 9 months. The diversity of 127 MSV isolates was analysed by PCR-generated RFLPs. Six representative isolates having different RFLP patterns and causing either severe, moderate or mild disease symptoms, were chosen for amplification from FTA cards by bacteriophage phi29 DNA polymerase using the TempliPhi system. Full-length genomes were inserted into a cloning vector using a unique restriction enzyme site, and sequenced. The 1.3-kb PCR product amplified directly from FTA-eluted DNA and used for RFLP analysis was also cloned and sequenced. Comparison of cloned whole genome sequences with those of the original PCR products indicated that the correct virus genome had been cloned and that no errors were introduced by the phi29 polymerase. This is the first successful large-scale application of FTA card technology to the field, and illustrates the ease with which large numbers of infected samples can be collected and stored for downstream molecular applications such as diversity analysis and cloning of potentially new virus genomes.
Direct Cellular Lysis/Protein Extraction Protocol for Soil Metaproteomics

Energy Technology Data Exchange (ETDEWEB)

Chourey, Karuna [ORNL]; Jansson, Janet [Lawrence Berkeley National Laboratory (LBNL)]; Verberkmoes, Nathan C [ORNL]; Shah, Manesh B [ORNL]; Chavarria, Krystle L. [Lawrence Berkeley National Laboratory (LBNL)]; Tom, Lauren M [Lawrence Berkeley National Laboratory (LBNL)]; Brodie, Eoin L. [Lawrence Berkeley National Laboratory (LBNL)]; Hettich, Robert {Bob} L [ORNL]

2010-01-01

We present a novel direct protocol for deep proteome characterization of microorganisms in soil. The method employs thermally assisted detergent-based cellular lysis (SDS) of soil samples, followed by TCA precipitation for proteome extraction/cleanup prior to liquid chromatography-mass spectrometric characterization. This approach was developed and optimized using different soils inoculated with genome-sequenced bacteria (Gram-negative Pseudomonas putida or Gram-positive Arthrobacter chlorophenolicus). Direct soil protein extraction was compared to protein extraction from cells isolated from the soil matrix prior to lysis (indirect method). Each approach resulted in identification of greater than 500 unique proteins, with a wide range in molecular mass and functional categories. To our knowledge, this SDS-TCA approach enables the deepest proteome characterizations of microbes in soil to date, without significant biases in protein size, localization, or functional category compared to pure cultures. This protocol should provide a powerful tool for ecological studies of soil microbial communities.
Whole-genome sequence of the Tibetan frog Nanorana parkeri and the comparative evolution of tetrapod genomes.

Science.gov (United States)

Sun, Yan-Bo; Xiong, Zi-Jun; Xiang, Xue-Yan; Liu, Shi-Ping; Zhou, Wei-Wei; Tu, Xiao-Long; Zhong, Li; Wang, Lu; Wu, Dong-Dong; Zhang, Bao-Lin; Zhu, Chun-Ling; Yang, Min-Min; Chen, Hong-Man; Li, Fang; Zhou, Long; Feng, Shao-Hong; Huang, Chao; Zhang, Guo-Jie; Irwin, David; Hillis, David M; Murphy, Robert W; Yang, Huan-Ming; Che, Jing; Wang, Jun; Zhang, Ya-Ping

2015-03-17

The development of efficient sequencing techniques has resulted in large numbers of genomes being available for evolutionary studies. However, only one genome is available for all amphibians, that of Xenopus tropicalis, which is distantly related to the majority of frogs. More than 96% of frogs belong to the Neobatrachia, and no genome exists for this group. This dearth of amphibian genomes greatly restricts genomic studies of amphibians and, more generally, our understanding of tetrapod genome evolution. To fill this gap, we provide the de novo genome of a Tibetan Plateau frog, Nanorana parkeri, and compare it to that of X. tropicalis and other vertebrates. This genome encodes more than 20,000 protein-coding genes, a number similar to that of Xenopus. Although the genome size of Nanorana is considerably larger than that of Xenopus (2.3 vs. 1.5 Gb), most of the difference is due to the respective number of transposable elements in the two genomes. The two frogs exhibit considerable conserved whole-genome synteny despite having diverged approximately 266 Ma, indicating a slow rate of DNA structural evolution in anurans. Multigenome synteny blocks further show that amphibians have fewer interchromosomal rearrangements than mammals but have a comparable rate of intrachromosomal rearrangements. Our analysis also identifies 11 Mb of anuran-specific highly conserved elements that will be useful for comparative genomic analyses of frogs. The Nanorana genome offers an improved understanding of evolution of tetrapod genomes and also provides a genomic reference for other evolutionary studies.
Comparison of HapMap and 1000 Genomes Reference Panels in a Large-Scale Genome-Wide Association Study.

Directory of Open Access Journals (Sweden)

Paul S de Vries

An increasing number of genome-wide association (GWA) studies are now using the higher resolution 1000 Genomes Project reference panel (1000G) for imputation, with the expectation that 1000G imputation will lead to the discovery of additional associated loci when compared to HapMap imputation. In order to assess the improvement of 1000G over HapMap imputation in identifying associated loci, we compared the results of GWA studies of circulating fibrinogen based on the two reference panels. Using both HapMap and 1000G imputation we performed a meta-analysis of 22 studies comprising the same 91,953 individuals. We identified six additional signals using 1000G imputation, while 29 loci were associated using both HapMap and 1000G imputation. One locus identified using HapMap imputation was not significant using 1000G imputation. The genome-wide significance threshold of 5×10⁻⁸ is based on the number of independent statistical tests using HapMap imputation, and 1000G imputation may lead to further independent tests that should be corrected for. When using a stricter Bonferroni correction for the 1000G GWA study (P-value < 2.5×10⁻⁸), the number of loci significant only using HapMap imputation increased to 4 while the number of loci significant only using 1000G decreased to 5. In conclusion, 1000G imputation enabled the identification of 20% more loci than HapMap imputation, although the advantage of 1000G imputation became less clear when a stricter Bonferroni correction was used. More generally, our results provide insights that are applicable to the implementation of other dense reference panels that are under development.
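The two significance thresholds in the abstract above follow the usual Bonferroni pattern of dividing a family-wise error rate by the number of independent tests. The short Python sketch below works through that arithmetic. The family-wise rate of 0.05 is an assumption (a common convention, not stated in the abstract), and the function names are hypothetical.

```python
def bonferroni_threshold(alpha: float, n_tests: float) -> float:
    """Per-test significance threshold for a given family-wise alpha."""
    return alpha / n_tests


def implied_independent_tests(alpha: float, threshold: float) -> float:
    """Number of independent tests implied by a per-test threshold."""
    return alpha / threshold


if __name__ == "__main__":
    ALPHA = 0.05  # assumed family-wise error rate, not taken from the abstract
    for label, threshold in [("HapMap", 5e-8), ("stricter 1000G", 2.5e-8)]:
        n = implied_independent_tests(ALPHA, threshold)
        print(f"{label}: threshold {threshold:g} implies about {n:,.0f} tests")
```

Under that assumption, halving the threshold from 5×10⁻⁸ to 2.5×10⁻⁸ corresponds to doubling the number of independent tests being corrected for.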
Tandemly Arrayed Genes in Vertebrate Genomes

Directory of Open Access Journals (Sweden)

Deng Pan

2008-01-01

Tandemly arrayed genes (TAGs) are duplicated genes that are linked as neighbors on a chromosome, many of which have important physiological and biochemical functions. Here we performed a survey of these genes in 11 available vertebrate genomes. TAGs account for an average of about 14% of all genes in these vertebrate genomes, and about 25% of all duplications. The majority of TAGs (72–94%) have parallel transcription orientation (i.e., they are encoded on the same strand), in contrast to the genome, which has about 50% of its genes in parallel transcription orientation. The majority of tandem arrays have only two members. In all species, the proportion of genes that belong to TAGs tends to be higher in large gene families than in small ones; together with our recent finding that tandem duplication played a more important role than retroposition in large families, this fact suggests that among all types of duplication mechanisms, tandem duplication is the predominant mechanism of duplication, especially in large families. Finally, several species have a higher proportion of large tandem arrays that are species-specific than random expectation.
Molecular cytogenetic and genomic analyses reveal new insights into the origin of the wheat B genome.

Science.gov (United States)

Zhang, Wei; Zhang, Mingyi; Zhu, Xianwen; Cao, Yaping; Sun, Qing; Ma, Guojia; Chao, Shiaoman; Yan, Changhui; Xu, Steven S; Cai, Xiwen

2018-02-01

This work pinpointed the goatgrass chromosomal segment in the wheat B genome using modern cytogenetic and genomic technologies, and provided novel insights into the origin of the wheat B genome. Wheat is a typical allopolyploid with three homoeologous subgenomes (A, B, and D). The donors of the subgenomes A and D had been identified, but not for the subgenome B. The goatgrass Aegilops speltoides (genome SS) has been controversially considered a possible candidate for the donor of the wheat B genome. However, the relationship of the Ae. speltoides S genome with the wheat B genome remains largely obscure. The present study assessed the homology of the B and S genomes using an integrative cytogenetic and genomic approach, and revealed the contribution of Ae. speltoides to the origin of the wheat B genome. We discovered noticeable homology between wheat chromosome 1B and Ae. speltoides chromosome 1S, but not between other chromosomes in the B and S genomes. An Ae. speltoides-originated segment spanning a genomic region of approximately 10.46 Mb was detected on the long arm of wheat chromosome 1B (1BL). The Ae. speltoides-originated segment on 1BL was found to co-evolve with the rest of the B genome. Evidently, Ae. speltoides had been involved in the origin of the wheat B genome, but should not be considered an exclusive donor of this genome. The wheat B genome might have a polyphyletic origin with multiple ancestors involved, including Ae. speltoides. These novel findings will facilitate genome studies in wheat and other polyploids.
An Atypical Human Induced Pluripotent Stem Cell Line With a Complex, Stable, and Balanced Genomic Rearrangement Including a Large De Novo 1q Uniparental Disomy

Science.gov (United States)

Steichen, Clara; Maluenda, Jérôme; Tosca, Lucie; Luce, Eléanor; Pineau, Dominique; Dianat, Noushin; Hannoun, Zara; Tachdjian, Gérard; Melki, Judith

2015-01-01

Human induced pluripotent stem cells (hiPSCs) hold great promise for cell therapy through their use as vital tools for regenerative and personalized medicine. However, the genomic integrity of hiPSCs still raises some concern and is one of the barriers limiting their use in clinical applications. Numerous articles have reported the occurrence of aneuploidies, copy number variations, or single point mutations in hiPSCs, and nonintegrative reprogramming strategies have been developed to minimize the impact of the reprogramming process on the hiPSC genome. Here, we report the characterization of an hiPSC line generated by daily transfections of modified messenger RNAs, displaying several genomic abnormalities. Karyotype analysis showed a complex genomic rearrangement, which remained stable during long-term culture. Fluorescent in situ hybridization analyses were performed on the hiPSC line showing that this karyotype is balanced. Interestingly, single-nucleotide polymorphism analysis revealed the presence of a large 1q region of uniparental disomy (UPD), demonstrating for the first time that UPD can occur in a noncompensatory context during nonintegrative reprogramming of normal fibroblasts. PMID:25650439
Complete genome sequence of 'Thermobaculum terrenum' type strain (YNP1).

Science.gov (United States)

Kiss, Hajnalka; Cleland, David; Lapidus, Alla; Lucas, Susan; Del Rio, Tijana Glavina; Nolan, Matt; Tice, Hope; Han, Cliff; Goodwin, Lynne; Pitluck, Sam; Liolios, Konstantinos; Ivanova, Natalia; Mavromatis, Konstantinos; Ovchinnikova, Galina; Pati, Amrita; Chen, Amy; Palaniappan, Krishna; Land, Miriam; Hauser, Loren; Chang, Yun-Juan; Jeffries, Cynthia D; Lu, Megan; Brettin, Thomas; Detter, John C; Göker, Markus; Tindall, Brian J; Beck, Brian; McDermott, Timothy R; Woyke, Tanja; Bristow, James; Eisen, Jonathan A; Markowitz, Victor; Hugenholtz, Philip; Kyrpides, Nikos C; Klenk, Hans-Peter; Cheng, Jan-Fang

2010-10-27

'Thermobaculum terrenum' Botero et al. 2004 is the sole species within the proposed genus 'Thermobaculum'. Strain YNP1(T) is the only cultivated member of an acid tolerant, extremely thermophilic species belonging to a phylogenetically isolated environmental clone group within the phylum Chloroflexi. At present, the name 'Thermobaculum terrenum' is not yet validly published as it contravenes Rule 30 (3a) of the Bacteriological Code. The bacterium was isolated from a slightly acidic extreme thermal soil in Yellowstone National Park, Wyoming (USA). Depending on its final taxonomic allocation, this is likely to be the third completed genome sequence of a member of the class Thermomicrobia and the seventh type strain genome from the phylum Chloroflexi. The 3,101,581 bp long genome with its 2,872 protein-coding and 58 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project.
Improving predictions of large scale soil carbon dynamics: Integration of fine-scale hydrological and biogeochemical processes, scaling, and benchmarking

Science.gov (United States)

Riley, W. J.; Dwivedi, D.; Ghimire, B.; Hoffman, F. M.; Pau, G. S. H.; Randerson, J. T.; Shen, C.; Tang, J.; Zhu, Q.

2015-12-01

Numerical model representations of decadal- to centennial-scale soil-carbon dynamics are a dominant cause of uncertainty in climate change predictions. Recent attempts by some Earth System Model (ESM) teams to integrate previously unrepresented soil processes (e.g., explicit microbial processes, abiotic interactions with mineral surfaces, vertical transport), poor performance of many ESM land models against large-scale and experimental manipulation observations, and complexities associated with spatial heterogeneity highlight the nascent nature of our community's ability to accurately predict future soil carbon dynamics. I will present recent work from our group to develop a modeling framework to integrate pore-, column-, watershed-, and global-scale soil process representations into an ESM (ACME), and apply the International Land Model Benchmarking (ILAMB) package for evaluation. At the column scale and across a wide range of sites, observed depth-resolved carbon stocks and their 14C derived turnover times can be explained by a model with explicit representation of two microbial populations, a simple representation of mineralogy, and vertical transport. Integrating soil and plant dynamics requires a 'process-scaling' approach, since all aspects of the multi-nutrient system cannot be explicitly resolved at ESM scales. I will show that one approach, the Equilibrium Chemistry Approximation, improves predictions of forest nitrogen and phosphorus experimental manipulations and leads to very different global soil carbon predictions. Translating model representations from the site- to ESM-scale requires a spatial scaling approach that either explicitly resolves the relevant processes, or more practically, accounts for fine-resolution dynamics at coarser scales. To that end, I will present recent watershed-scale modeling work that applies reduced order model methods to accurately scale fine-resolution soil carbon dynamics to coarse-resolution simulations. Finally, we
Inversion of Farmland Soil Moisture in Large Region Based on Modified Vegetation Index

Science.gov (United States)

Wang, J. X.; Yu, B. S.; Zhang, G. Z.; Zhao, G. C.; He, S. D.; Luo, W. R.; Zhang, C. C.

2018-04-01

Soil moisture is an important parameter for agricultural production. Efficient and accurate monitoring of soil moisture is an important link to ensure the safety of agricultural production. Remote sensing technology has been widely used in agricultural moisture monitoring because of its timeliness, cyclicality, dynamic tracking of changes, easy access to data, and extensive coverage. Vegetation index and surface temperature are important parameters for moisture monitoring. Based on NDVI, this paper introduces land surface temperature and average temperature for optimization. This article takes the soil moisture in the winter wheat growing area in Henan Province as the research object, dividing Henan Province into three main winter wheat producing regions and dividing the growth period of winter wheat into early, middle and late stages on the basis of phenological and regional characteristics. By introducing an appropriate correction factor for each growth period of winter wheat and correcting the vegetation index in the corresponding area, this paper establishes regression models of soil moisture on NDVI and of soil moisture on modified NDVI based on correlation analysis, and compares the models. The results show that modified NDVI is more suitable as an indicator of soil moisture because of the better correlation between soil moisture and modified NDVI and the higher prediction accuracy of the regression model of soil moisture on modified NDVI. The research in this paper has certain reference value for winter wheat farmland management and decision-making.
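The regression of soil moisture on NDVI described above can be sketched as follows. The example uses the standard NDVI definition, (NIR − Red) / (NIR + Red), and an ordinary least-squares line. The paper's temperature-based correction factor is not given in the abstract, so it is not reproduced here; the reflectance and moisture values are invented purely for illustration and the function names are hypothetical.

```python
def ndvi(nir: float, red: float) -> float:
    """Normalized Difference Vegetation Index from NIR and red reflectance."""
    return (nir - red) / (nir + red)


def fit_line(x: list[float], y: list[float]) -> tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


if __name__ == "__main__":
    # Invented (NIR, red) reflectance pairs and soil moisture readings.
    pixels = [(0.45, 0.12), (0.50, 0.10), (0.38, 0.15), (0.55, 0.08)]
    soil_moisture = [0.21, 0.25, 0.17, 0.29]
    index = [ndvi(nir, red) for nir, red in pixels]
    slope, intercept = fit_line(index, soil_moisture)
    print(f"soil_moisture ~ {slope:.3f} * NDVI + {intercept:.3f}")
```

A modified index would replace `index` with corrected values before fitting, so the two models can be compared on the same moisture observations.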
Complete genome sequence of Catenulispora acidiphila type strain (ID 139908T)

Energy Technology Data Exchange (ETDEWEB)

Copeland, Alex; Lapidus, Alla; Glavina Del Rio, Tijana; Nolan, Matt; Lucas, Susan; Chen, Feng; Tice, Hope; Cheng, Jan-Fang; Bruce, David; Goodwin, Lynne; Pitluck, Sam; Mikhailova, Natalia; Pati, Amrita; Ivanova, Natalia; Mavromatis, Konstantinos; Chen, Amy; Palaniappan, Krishna; Chain, Patrick; Land, Miriam; Hauser, Loren; Chang, Yun-Juan; Jefferies, Cynthia C.; Chertkov, Olga; Brettin, Thomas; Detter, John C.; Han, Cliff; Ali, Zahid; Tindall, Brian J.; Goker, Markus; Bristow, James; Eisen, Jonathan A.; Markowitz, Victor; Hugenholtz, Philip; Kyrpides, Nikos C.; Klenk, Hans-Peter

2009-05-20

Catenulispora acidiphila Busti et al. 2006 is the type species of the genus Catenulispora, and is of interest because of the rather isolated phylogenetic location of the genomically little studied suborder Catenulisporineae within the order Actinomycetales. C. acidiphila is known for its acidophilic, aerobic lifestyle, but can also grow scantly under anaerobic conditions. Under regular conditions C. acidiphila grows in long filaments of relatively short aerial hyphae with marked septation. It is a free-living, non-motile, Gram-positive bacterium isolated from a forest soil sample taken from a wooded area in Gerenzano, Italy. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is the first complete genome sequence of the actinobacterial family Catenulisporaceae, and the 10,467,782 bp long single replicon genome with its 9056 protein-coding and 69 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project.
Genome-scale neurogenetics: methodology and meaning.

Science.gov (United States)

McCarroll, Steven A; Feng, Guoping; Hyman, Steven E

2014-06-01

Genetic analysis is currently offering glimpses into molecular mechanisms underlying such neuropsychiatric disorders as schizophrenia, bipolar disorder and autism. After years of frustration, success in identifying disease-associated DNA sequence variation has followed from new genomic technologies, new genome data resources, and global collaborations that could achieve the scale necessary to find the genes underlying highly polygenic disorders. Here we describe early results from genome-scale studies of large numbers of subjects and the emerging significance of these results for neurobiology.
Genome Variation Map: a data repository of genome variations in BIG Data Center.

Science.gov (United States)

Song, Shuhui; Tian, Dongmei; Li, Cuiping; Tang, Bixia; Dong, Lili; Xiao, Jingfa; Bao, Yiming; Zhao, Wenming; He, Hang; Zhang, Zhang

2018-01-04

The Genome Variation Map (GVM; http://bigd.big.ac.cn/gvm/) is a public data repository of genome variations. As a core resource in the BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, GVM is dedicated to collecting, integrating and visualizing genome variations for a wide range of species, accepts submissions of different types of genome variations from all over the world and provides free open access to all publicly available data in support of worldwide research activities. Unlike existing related databases, GVM features integration of a large number of genome variations for a broad diversity of species including human, cultivated plants and domesticated animals. Specifically, the current implementation of GVM not only houses a total of ∼4.9 billion variants for 19 species including chicken, dog, goat, human, poplar, rice and tomato, but also incorporates 8669 individual genotypes and 13 262 manually curated high-quality genotype-to-phenotype associations for non-human species. In addition, GVM provides friendly, intuitive web interfaces for data submission, browsing, search and visualization. Collectively, GVM serves as an important resource for archiving genomic variation data, helpful for better understanding population genetic diversity and deciphering complex mechanisms associated with different phenotypes. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Genome Variation Map: a data repository of genome variations in BIG Data Center

Science.gov (United States)

Tian, Dongmei; Li, Cuiping; Tang, Bixia; Dong, Lili; Xiao, Jingfa; Bao, Yiming; Zhao, Wenming; He, Hang

2018-01-01

The Genome Variation Map (GVM; http://bigd.big.ac.cn/gvm/) is a public data repository of genome variations. As a core resource in the BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, GVM is dedicated to collecting, integrating and visualizing genome variations for a wide range of species, accepts submissions of different types of genome variations from all over the world and provides free open access to all publicly available data in support of worldwide research activities. Unlike existing related databases, GVM features integration of a large number of genome variations for a broad diversity of species including human, cultivated plants and domesticated animals. Specifically, the current implementation of GVM not only houses a total of ∼4.9 billion variants for 19 species including chicken, dog, goat, human, poplar, rice and tomato, but also incorporates 8669 individual genotypes and 13 262 manually curated high-quality genotype-to-phenotype associations for non-human species. In addition, GVM provides friendly, intuitive web interfaces for data submission, browsing, search and visualization. Collectively, GVM serves as an important resource for archiving genomic variation data, helpful for better understanding population genetic diversity and deciphering complex mechanisms associated with different phenotypes. PMID:29069473
Assembly of the Complete Sitka Spruce Chloroplast Genome Using 10X Genomics' GemCode Sequencing Data.

Directory of Open Access Journals (Sweden)

Lauren Coombe

The linked read sequencing library preparation platform by 10X Genomics produces barcoded sequencing libraries, which are subsequently sequenced using the Illumina short read sequencing technology. In this new approach, long fragments of DNA are partitioned into separate micro-reactions, where the same index sequence is incorporated into each of the sequencing fragment inserts derived from a given long fragment. In this study, we exploited this property by using reads from index sequences associated with a large number of reads to assemble the chloroplast genome of the Sitka spruce tree (Picea sitchensis). Here we report on the first Sitka spruce chloroplast genome assembled exclusively from P. sitchensis genomic libraries prepared using the 10X Genomics protocol. We show that the resulting 124,049 base pair long genome shares high sequence similarity with the related white spruce and Norway spruce chloroplast genomes, but diverges substantially from a previously published P. sitchensis-P. thunbergii chimeric genome. The use of reads from high-frequency indices enabled separation of the nuclear genome reads from those of the chloroplast, which resulted in the simplification of the de Bruijn graphs used at the various stages of assembly.
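The read-selection idea in this abstract can be illustrated with a minimal Python sketch. It assumes reads are available as (barcode, sequence) pairs and that a plain read-count cutoff defines a "high-frequency" index; the function names, the cutoff value and the toy data are hypothetical and are not taken from the study.

```python
from collections import Counter


def high_frequency_barcodes(read_barcodes, min_reads):
    """Return the set of barcodes that appear in at least min_reads reads."""
    counts = Counter(read_barcodes)
    return {barcode for barcode, n in counts.items() if n >= min_reads}


def select_reads(reads, min_reads):
    """Keep only (barcode, sequence) pairs whose barcode is high-frequency.

    `reads` must be a list (it is traversed twice).
    """
    keep = high_frequency_barcodes((barcode for barcode, _ in reads), min_reads)
    return [(barcode, seq) for barcode, seq in reads if barcode in keep]


# Toy data: barcode "AAAC" tags three reads, "TTTG" tags only one.
reads = [("AAAC", "ACGT"), ("AAAC", "CGTA"), ("AAAC", "GTAC"), ("TTTG", "TACG")]
selected = select_reads(reads, min_reads=3)
```

In practice the cutoff would have to be chosen from the observed distribution of reads per barcode; the fixed value here is only for illustration.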
Impact of carbonaceous materials in soil on the transport of soil-bound PAHs during rainfall-runoff events

International Nuclear Information System (INIS)

Luo, Xiaolin; Zheng, Yi; Wu, Bin; Lin, Zhongrong; Han, Feng; Zhang, Wei; Wang, Xuejun

2013-01-01

Polycyclic Aromatic Hydrocarbons (PAHs) transported from contaminated soils by surface runoff pose significant risk for aquatic ecosystems. Based on a rainfall-runoff simulation experiment, this study investigated the impact of carbonaceous materials (CMs) in soil, identified by organic petrology analysis, on the transport of soil-bound PAHs under rainfall conditions. The hypothesis that composition of soil organic matter significantly impacts the enrichment and transport of PAHs was proved. CMs in soil, varying significantly in content, mobility and adsorption capacity, act differently on the transport of PAHs. Anthropogenic CMs like black carbon (BC) largely control the transport, as PAHs may be preferentially attached to them. Eventually, this study led to a rethink of the traditional enrichment theory. An important implication is that CMs in soil have to be explicitly considered to appropriately model the nonpoint source pollution of PAHs (possibly other hydrophobic chemicals as well) and assess its environmental risk. Highlights: Composition of SOM significantly impacts the enrichment and transport of PAHs. Anthropogenic carbonaceous materials in soil largely control the transport of PAHs. The classic enrichment theory is invalid if anthropogenic CMs are abundant in the soil. Organic petrology analysis introduced to study the fate and transport of PAHs. Summary: Anthropogenic carbonaceous materials in soil, especially black carbon, largely control the transport of soil-bound PAHs during rainfall-runoff events.
Sequencing and annotation of mitochondrial genomes from individual parasitic helminths.

Science.gov (United States)

Jex, Aaron R; Littlewood, D Timothy; Gasser, Robin B

2015-01-01

Mitochondrial (mt) genomics has significant implications in a range of fundamental areas of parasitology, including evolution, systematics, and population genetics as well as explorations of mt biochemistry, physiology, and function. Mt genomes also provide a rich source of markers to aid molecular epidemiological and ecological studies of key parasites. However, there is still a paucity of information on mt genomes for many metazoan organisms, particularly parasitic helminths, which has often related to challenges linked to sequencing from tiny amounts of material. The advent of next-generation sequencing (NGS) technologies has paved the way for low cost, high-throughput mt genomic research, but there have been obstacles, particularly in relation to post-sequencing assembly and analyses of large datasets. In this chapter, we describe protocols for the efficient amplification and sequencing of mt genomes from small portions of individual helminths, and highlight the utility of NGS platforms to expedite mt genomics. In addition, we recommend approaches for manual or semi-automated bioinformatic annotation and analyses to overcome the bioinformatic "bottleneck" to research in this area. Taken together, these approaches have demonstrated applicability to a range of parasites and provide prospects for using complete mt genomic sequence datasets for large-scale molecular systematic and epidemiological studies. In addition, these methods have broader utility and might be readily adapted to a range of other medium-sized molecular regions (i.e., 10-100 kb), including large genomic operons, and other organellar (e.g., plastid) and viral genomes.
Integrative Genomics Viewer (IGV) | Informatics Technology for Cancer Research (ITCR)

Science.gov (United States)

The Integrative Genomics Viewer (IGV) is a high-performance visualization tool for interactive exploration of large, integrated genomic datasets. It supports a wide variety of data types, including array-based and next-generation sequence data, and genomic annotations.
Complete Genome Sequence of the Soybean Symbiont Bradyrhizobium japonicum Strain USDA6T

Directory of Open Access Journals (Sweden)

Nobukazu Uchiike

2011-10-01

The complete nucleotide sequence of the genome of the soybean symbiont Bradyrhizobium japonicum strain USDA6T was determined. The genome of USDA6T is a single circular chromosome of 9,207,384 bp. The genome size is similar to that of the genome of another soybean symbiont, B. japonicum USDA110 (9,105,828 bp). Comparison of the whole-genome sequences of USDA6T and USDA110 showed colinearity of major regions in the two genomes, although a large inversion exists between them. A significantly high level of sequence conservation was detected in three regions on each genome. The gene constitution and nucleotide sequence features in these three regions indicate that they may have been derived from a symbiosis island. An ancestral, large symbiosis island, approximately 860 kb in total size, appears to have been split into these three regions by unknown large-scale genome rearrangements. The two integration events responsible for this appear to have taken place independently, but through comparable mechanisms, in both genomes.
A novel genome-information content-based statistic for genome-wide association analysis designed for next-generation sequencing data.

Science.gov (United States)

Luo, Li; Zhu, Yun; Xiong, Momiao

2012-06-01

The genome-wide association studies (GWAS) designed for next-generation sequencing data involve testing association of genomic variants, including common, low frequency, and rare variants. The current strategies for association studies are well developed for identifying association of common variants with the common diseases, but may be ill-suited when large amounts of allelic heterogeneity are present in sequence data. Recently, group tests that analyze their collective frequency differences between cases and controls shift the current variant-by-variant analysis paradigm for GWAS of common variants to the collective test of multiple variants in the association analysis of rare variants. However, group tests ignore differences in genetic effects among SNPs at different genomic locations. As an alternative to group tests, we developed a novel genome-information content-based statistic for testing association of the entire allele frequency spectrum of genomic variation with the diseases. To evaluate the performance of the proposed statistic, we use large-scale simulations based on whole genome low coverage pilot data in the 1000 Genomes Project to calculate the type 1 error rates and power of seven alternative statistics: a genome-information content-based statistic, the generalized T², collapsing method, multivariate and collapsing (CMC) method, individual χ² test, weighted-sum statistic, and variable threshold statistic. Finally, we apply the seven statistics to a published resequencing dataset from ANGPTL3, ANGPTL4, ANGPTL5, and ANGPTL6 genes in the Dallas Heart Study. We report that the genome-information content-based statistic has significantly improved type 1 error rates and higher power than the other six statistics in both simulated and empirical datasets.

Effect of soil surface roughness on infiltration water, ponding and runoff on tilled soils under rainfall simulation experiments

NARCIS (Netherlands)

Zhao, Longshan; Hou, Rui; Wu, Faqi; Keesstra, Saskia

2018-01-01

Agriculture has a large effect on the properties of the soil and with that on soil hydrology. The partitioning of rainfall into infiltration and runoff is relevant to understand runoff generation, infiltration and soil erosion. Tillage manages soil surface properties and generates soil surface
Crop rotations and poultry litter impact dynamic soil chemical properties and soil biota long-term

Science.gov (United States)

Dynamic soil physiochemical interactions with conservation agricultural practices and soil biota are largely unknown. Therefore, this study aims to quantify long-term (12-yr) impacts of cover crops, poultry litter, crop rotations, and conservation tillage and their interactions on soil physiochemica...
Soil biodiversity and soil community composition determine ecosystem multifunctionality

Science.gov (United States)

Wagg, Cameron; Bender, S. Franz; Widmer, Franco; van der Heijden, Marcel G. A.

2014-01-01

Biodiversity loss has become a global concern as evidence accumulates that it will negatively affect ecosystem services on which society depends. So far, most studies have focused on the ecological consequences of above-ground biodiversity loss; yet a large part of Earth’s biodiversity is literally hidden below ground. Whether reductions of biodiversity in soil communities below ground have consequences for the overall performance of an ecosystem remains unresolved. It is important to investigate this in view of recent observations that soil biodiversity is declining and that soil communities are changing upon land use intensification. We established soil communities differing in composition and diversity and tested their impact on eight ecosystem functions in model grassland communities. We show that soil biodiversity loss and simplification of soil community composition impair multiple ecosystem functions, including plant diversity, decomposition, nutrient retention, and nutrient cycling. The average response of all measured ecosystem functions (ecosystem multifunctionality) exhibited a strong positive linear relationship to indicators of soil biodiversity, suggesting that soil community composition is a key factor in regulating ecosystem functioning. Our results indicate that changes in soil communities and the loss of soil biodiversity threaten ecosystem multifunctionality and sustainability. PMID:24639507
Sample sizes to control error estimates in determining soil bulk density in California forest soils

Science.gov (United States)

Youzhi Han; Jianwei Zhang; Kim G. Mattson; Weidong Zhang; Thomas A. Weber

2016-01-01

Characterizing forest soil properties with high variability is challenging, sometimes requiring large numbers of soil samples. Soil bulk density is a standard variable needed along with element concentrations to calculate nutrient pools. This study aimed to determine the optimal sample size, the number of observation (n), for predicting the soil bulk density with a...
iSOIL: Interactions between soil related sciences - Linking geophysics, soil science and digital soil mapping

Science.gov (United States)

Dietrich, Peter; Werban, Ulrike; Sauer, Uta

2010-05-01

High-resolution soil property maps are one major prerequisite for the specific protection of soil functions and restoration of degraded soils as well as sustainable land use, water and environmental management. To generate such maps the combination of digital soil mapping approaches and remote as well as proximal soil sensing techniques is most promising. However, a feasible and reliable combination of these technologies for the investigation of large areas (e.g. catchments and landscapes) and the assessment of soil degradation threats is missing. Furthermore, there is insufficient dissemination of knowledge on digital soil mapping and proximal soil sensing in the scientific community, to relevant authorities as well as prospective users. As one consequence there is inadequate standardization of techniques. In this poster we present the EU collaborative project iSOIL within the 7th framework program of the European Commission. iSOIL focuses on improving fast and reliable mapping methods of soil properties, soil functions and soil degradation risks. This requires the improvement and integration of advanced soil sampling approaches, geophysical and spectroscopic measuring techniques, as well as pedometric and pedophysical approaches. The focus of the iSOIL project is to develop new and to improve existing strategies and innovative methods for generating accurate, high-resolution soil property maps. At the same time the developments will reduce costs compared to traditional soil mapping. iSOIL tackles the challenges by the integration of three major components: (i) high-resolution, non-destructive geophysical (e.g. Electromagnetic Induction, EMI; Ground Penetrating Radar, GPR; magnetics, seismics) and spectroscopic (e.g. Near Surface Infrared, NIR) methods, (ii) concepts of Digital Soil Mapping (DSM) and pedometrics as well as (iii) optimized soil sampling with respect to profound soil scientific and (geo)statistical strategies. A special focus of iSOIL lies on the
Smart plants, smart models? On adaptive responses in vegetation-soil systems

Science.gov (United States)

van der Ploeg, Martine; Teuling, Ryan; van Dam, Nicole; de Rooij, Gerrit

2015-04-01

Hydrological models that will be able to cope with future precipitation and evapotranspiration regimes need a solid base describing the essence of the processes involved [1]. The essence of emerging patterns at large scales often originates from micro-behaviour in the soil-vegetation-atmosphere system. A complicating factor in capturing this behaviour is the constant interaction between vegetation and geology in which water plays a key role. The resilience of the coupled vegetation-soil system critically depends on its sensitivity to environmental changes. To assess root water uptake by plants in a changing soil environment, a direct indication of the amount of energy required by plants to take up water can be obtained by measuring the soil water potential in the vicinity of roots with polymer tensiometers [2]. In a lysimeter experiment with various levels of imposed water stress the polymer tensiometer data suggest maize roots regulate their root water uptake on the derivative of the soil water retention curve, rather than the amount of moisture alone. As a result of environmental changes vegetation may wither and die, or these changes may instead trigger gene adaptation. Constant exposure to environmental stresses, biotic or abiotic, influences plant physiology, gene adaptations, and flexibility in gene adaptation [3-7]. To investigate a possible relation between plant genotype, the plant stress hormone abscisic acid (ABA) and the soil water potential, a proof of principle experiment was set up with Solanum dulcamara plants. The results showed a significant difference in ABA response between genotypes from a dry and a wet environment, and this response was also reflected in the root water uptake. Adaptive responses may have consequences for the way species are currently being treated in models (single plant to global scale). In particular, model parameters that control root water uptake and plant transpiration are generally assumed to be a property of the plant
Large-scale analysis of antisense transcription in wheat using the Affymetrix GeneChip Wheat Genome Array

Directory of Open Access Journals (Sweden)

Settles Matthew L

2009-05-01

Background: Natural antisense transcripts (NATs) are transcripts of the opposite DNA strand to the sense-strand either at the same locus (cis-encoded) or a different locus (trans-encoded). They can affect gene expression at multiple stages including transcription, RNA processing and transport, and translation. NATs give rise to sense-antisense transcript pairs and the number of these identified has escalated greatly with the availability of DNA sequencing resources and public databases. Traditionally, NATs were identified by the alignment of full-length cDNAs or expressed sequence tags to genome sequences, but an alternative method for large-scale detection of sense-antisense transcript pairs involves the use of microarrays. In this study we developed a novel protocol to assay sense- and antisense-strand transcription on the 55 K Affymetrix GeneChip Wheat Genome Array, which is a 3' in vitro transcription (3'IVT) expression array. We selected five different tissue types for assay to enable maximum discovery, and used the 'Chinese Spring' wheat genotype because most of the wheat GeneChip probe sequences were based on its genomic sequence. This study is the first report of using a 3'IVT expression array to discover the expression of natural sense-antisense transcript pairs, and may be considered as proof-of-concept. Results: By using alternative target preparation schemes, both the sense- and antisense-strand derived transcripts were labeled and hybridized to the Wheat GeneChip. Quality assurance verified that successful hybridization did occur in the antisense-strand assay. A stringent threshold for positive hybridization was applied, which resulted in the identification of 110 sense-antisense transcript pairs, as well as 80 potentially antisense-specific transcripts. Strand-specific RT-PCR validated the microarray observations, and showed that antisense transcription is likely to be tissue specific. For the annotated sense
Viral Genome DataBase: storing and analyzing genes and proteins from complete viral genomes.

Science.gov (United States)

Hiscock, D; Upton, C

2000-05-01

The Viral Genome DataBase (VGDB) contains detailed information on the genes and predicted protein sequences from 15 completely sequenced genomes of large (>100 kb) viruses (2847 genes). The stored data include DNA sequence, protein sequence, GenBank and user-entered notes, molecular weight (MW), isoelectric point (pI), amino acid content, A + T%, nucleotide frequency, dinucleotide frequency and codon use. The VGDB is a MySQL database with a user-friendly Java GUI. Results of queries can be easily sorted by any of the individual parameters. The software and additional figures and information are available at http://athena.bioc.uvic.ca/genomes/index.html .
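Several of the per-gene parameters listed in this abstract are simple sequence statistics. The following Python sketch shows one way A + T%, nucleotide frequency and dinucleotide frequency could be computed from a DNA string; the function names are hypothetical and this is not the VGDB software itself, which the abstract describes only as a MySQL database with a Java GUI.

```python
from collections import Counter


def at_percent(seq):
    """Percentage of A and T bases in a non-empty DNA sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("A") + seq.count("T")) / len(seq)


def nucleotide_frequency(seq):
    """Count of each single base in the sequence."""
    return dict(Counter(seq.upper()))


def dinucleotide_frequency(seq):
    """Count of each overlapping two-base window in the sequence."""
    seq = seq.upper()
    return dict(Counter(seq[i:i + 2] for i in range(len(seq) - 1)))


example = "ATGCAT"
stats = {
    "at_percent": at_percent(example),
    "nucleotides": nucleotide_frequency(example),
    "dinucleotides": dinucleotide_frequency(example),
}
```

Overlapping windows are used for the dinucleotide counts; a non-overlapping or codon-aware scheme would be an equally plausible choice.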
Small homologous blocks in Phytophthora genomes do not point to an ancient whole-genome duplication.

Science.gov (United States)

van Hooff, Jolien J E; Snel, Berend; Seidl, Michael F

2014-05-01

Genomes of the plant-pathogenic genus Phytophthora are characterized by small duplicated blocks consisting of two consecutive genes (2HOM blocks) and by an elevated abundance of similarly aged gene duplicates. Both properties, in particular the presence of 2HOM blocks, have been attributed to a whole-genome duplication (WGD) at the last common ancestor of Phytophthora. However, large intraspecies synteny (compelling evidence for a WGD) has not been detected. Here, we revisited the WGD hypothesis by deducing the age of 2HOM blocks. Two independent timing methods reveal that the majority of 2HOM blocks arose after divergence of the Phytophthora lineages. In addition, a large proportion of the 2HOM block copies colocalize on the same scaffold. Therefore, the presence of 2HOM blocks does not support a WGD at the last common ancestor of Phytophthora. Thus, genome evolution of Phytophthora is likely driven by alternative mechanisms, such as bursts of transposon activity.
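To make the notion of a block of two consecutive duplicated genes concrete, here is a minimal Python sketch. It assumes gene order per scaffold and a gene-to-family assignment are already available, and it reports pairs of adjacent genes whose pair of families recurs elsewhere. The data structures, names and the simplified definition are assumptions for illustration and do not reproduce the detection or timing methods of the study.

```python
from collections import defaultdict


def find_two_gene_blocks(scaffolds, family_of):
    """Find family pairs carried by adjacent genes at two or more locations.

    scaffolds: dict mapping scaffold name -> list of gene ids in genomic order
    family_of: dict mapping gene id -> homology family id
    Returns a dict: frozenset of two families -> list of (scaffold, gene, gene).
    """
    locations = defaultdict(list)
    for scaffold, genes in scaffolds.items():
        for left, right in zip(genes, genes[1:]):
            fam_left, fam_right = family_of.get(left), family_of.get(right)
            # Skip genes without a family and tandem duplicates of one family.
            if fam_left is None or fam_right is None or fam_left == fam_right:
                continue
            locations[frozenset((fam_left, fam_right))].append((scaffold, left, right))
    return {pair: locs for pair, locs in locations.items() if len(locs) >= 2}


# Toy example: families F1 and F2 are adjacent on both scaffolds.
scaffolds = {"s1": ["g1", "g2", "g3"], "s2": ["g4", "g5"]}
family_of = {"g1": "F1", "g2": "F2", "g3": "F3", "g4": "F2", "g5": "F1"}
blocks = find_two_gene_blocks(scaffolds, family_of)
```

Because each hit records its scaffold, the same output could also be used to ask how often both copies of a block sit on one scaffold, which is the colocalization observation the abstract mentions. Gene orientation and overlapping hits are ignored in this simplification.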
The UCSC Genome Browser Database: 2008 update

DEFF Research Database (Denmark)

Karolchik, D; Kuhn, R M; Baertsch, R

2007-01-01

The University of California, Santa Cruz, Genome Browser Database (GBD) provides integrated sequence and annotation data for a large collection of vertebrate and model organism genomes. Seventeen new assemblies have been added to the database in the past year, for a total coverage of 19 vertebrat...
Methodological advances to study the diversity of soil protists and their functioning in soil food webs

NARCIS (Netherlands)

Geisen, Stefan; Bonkowski, Michael

2017-01-01

Soils host the most complex communities of organisms, which are still largely considered as an unknown 'black box'. A key role in soil food webs is held by the highly abundant and diverse group of protists. Traditionally, soil protists are considered as the main consumers of bacteria in soils.
Cloud computing for genomic data analysis and collaboration.

Science.gov (United States)

Langmead, Ben; Nellore, Abhinav

2018-04-01

Next-generation sequencing has made major strides in the past decade. Studies based on large sequencing data sets are growing in number, and public archives for raw sequencing data have been doubling in size every 18 months. Leveraging these data requires researchers to use large-scale computational resources. Cloud computing, a model whereby users rent computers and storage from large data centres, is a solution that is gaining traction in genomics research. Here, we describe how cloud computing is used in genomics for research and large-scale collaborations, and argue that its elasticity, reproducibility and privacy features make it ideally suited for the large-scale reanalysis of publicly available archived data, including privacy-protected data.
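The growth rate quoted in this abstract (archives doubling every 18 months) implies exponential growth, which a short Python calculation makes tangible. The function name and the ten-year horizon are illustrative choices, not figures from the article.

```python
def growth_factor(months, doubling_months=18.0):
    """Multiplicative growth after `months`, given a fixed doubling time."""
    return 2.0 ** (months / doubling_months)


# Growth over three years (two doublings) and over a decade.
three_years = growth_factor(36)
decade = growth_factor(120)
```

Under this assumption an archive would grow fourfold in three years and roughly a hundredfold in ten, which gives a sense of why rented, elastic compute and storage are attractive for reanalysis at that scale.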
Mining genome sequencing data to identify the genomic features linked to breast cancer histopathology

Science.gov (United States)

Ping, Zheng; Siegal, Gene P.; Almeida, Jonas S.; Schnitt, Stuart J.; Shen, Dejun

2014-01-01

Background: Genetics and genomics have radically altered our understanding of breast cancer progression. However, the genomic basis of various histopathologic features of breast cancer is not yet well-defined. Materials and Methods: The Cancer Genome Atlas (TCGA) is an international database containing a large collection of human cancer genome sequencing data. cBioPortal is a web tool developed for mining these sequencing data. We performed mining of TCGA sequencing data in an attempt to characterize the genomic features correlated with breast cancer histopathology. We first assessed the quality of the TCGA data using a group of genes with known alterations in various cancers. Both genome-wide gene mutation and copy number changes as well as a group of genes with a high frequency of genetic changes were then correlated with various histopathologic features of invasive breast cancer. Results: Validation of TCGA data using a group of genes with known alterations in breast cancer suggests that the TCGA has accurately documented the genomic abnormalities of multiple malignancies. Further analysis of TCGA breast cancer sequencing data shows that accumulation of specific genomic defects is associated with higher tumor grade, larger tumor size and receptor negativity. Distinct groups of genomic changes were found to be associated with the different grades of invasive ductal carcinoma. The mutator role of the TP53 gene was validated by genomic sequencing data of invasive breast cancer and TP53 mutation was found to play a critical role in defining high tumor grade. Conclusions: Data mining of the TCGA genome sequencing data is an innovative and reliable method to help characterize the genomic abnormalities associated with histopathologic features of invasive breast cancer. PMID:24672738
Mining genome sequencing data to identify the genomic features linked to breast cancer histopathology

Directory of Open Access Journals (Sweden)

Zheng Ping

2014-01-01

Background: Genetics and genomics have radically altered our understanding of breast cancer progression. However, the genomic basis of various histopathologic features of breast cancer is not yet well-defined. Materials and Methods: The Cancer Genome Atlas (TCGA) is an international database containing a large collection of human cancer genome sequencing data. cBioPortal is a web tool developed for mining these sequencing data. We performed mining of TCGA sequencing data in an attempt to characterize the genomic features correlated with breast cancer histopathology. We first assessed the quality of the TCGA data using a group of genes with known alterations in various cancers. Both genome-wide gene mutation and copy number changes as well as a group of genes with a high frequency of genetic changes were then correlated with various histopathologic features of invasive breast cancer. Results: Validation of TCGA data using a group of genes with known alterations in breast cancer suggests that the TCGA has accurately documented the genomic abnormalities of multiple malignancies. Further analysis of TCGA breast cancer sequencing data shows that accumulation of specific genomic defects is associated with higher tumor grade, larger tumor size and receptor negativity. Distinct groups of genomic changes were found to be associated with the different grades of invasive ductal carcinoma. The mutator role of the TP53 gene was validated by genomic sequencing data of invasive breast cancer and TP53 mutation was found to play a critical role in defining high tumor grade. Conclusions: Data mining of the TCGA genome sequencing data is an innovative and reliable method to help characterize the genomic abnormalities associated with histopathologic features of invasive breast cancer.
Draft Genome Sequence of Bacillus velezensis B6, a Rhizobacterium That Can Control Plant Diseases.

Science.gov (United States)

Gao, Yu-Han; Guo, Rong-Jun; Li, Shi-Dong

2018-03-22

The draft genome of Bacillus velezensis strain B6, a rhizobacterium with good biocontrol performance isolated from soil in China, was sequenced. The assembly comprises 32 scaffolds with a total size of 3.88 Mb. Gene clusters coding either ribosomally encoded bacteriocins or nonribosomally encoded antimicrobial polyketides and lipopeptides in the genome may contribute to plant disease control. Copyright © 2018 Gao et al.
Contribution of Large Genomic Rearrangements in Italian Lynch Syndrome Patients: Characterization of a Novel Alu-Mediated Deletion

Directory of Open Access Journals (Sweden)

Francesca Duraturo

2013-01-01

Lynch syndrome is associated with germ-line mutations in the DNA mismatch repair (MMR) genes, mainly MLH1 and MSH2. Most of the mutations reported in these genes to date are point mutations, small deletions, and insertions. Large genomic rearrangements in the MMR genes predisposing to Lynch syndrome also occur, but the frequency varies depending on the population studied, on average from 5 to 20%. The aim of this study was to examine the contribution of large rearrangements in the MLH1 and MSH2 genes in a well-characterised series of 63 unrelated Southern Italian Lynch syndrome patients who were negative for pathogenic point mutations in the MLH1, MSH2, and MSH6 genes. We identified a large novel deletion in the MSH2 gene, including exon 6, in one of the patients analysed (1.6% frequency). This deletion was confirmed and localised by long-range PCR. The breakpoints of this rearrangement were characterised by sequencing. Further analysis of the breakpoints revealed that this rearrangement was a product of Alu-mediated recombination. Our findings identified a novel Alu-mediated rearrangement within the MSH2 gene and showed that large deletions or duplications in MLH1 and MSH2 genes are low-frequency mutational events in Southern Italian patients with an inherited predisposition to colon cancer.
The complete mitochondrial genome of Gossypium hirsutum and evolutionary analysis of higher plant mitochondrial genomes.

Science.gov (United States)

Liu, Guozheng; Cao, Dandan; Li, Shuangshuang; Su, Aiguo; Geng, Jianing; Grover, Corrinne E; Hu, Songnian; Hua, Jinping

2013-01-01

Mitochondria are the main manufacturers of cellular ATP in eukaryotes. The plant mitochondrial genome contains a large amount of foreign DNA and repeated sequences that undergo frequent intramolecular recombination. Upland Cotton (Gossypium hirsutum L.) is one of the main natural fiber crops and also an important oil-producing plant in the world. Sequencing of the cotton mitochondrial (mt) genome could be helpful for research on the evolution of plant mt genomes. We combined 454 sequencing with screening of a fosmid library of the Gossypium hirsutum mt genome and sequencing of positive clones, and conducted a series of evolutionary analyses on the mt genomes of Cycas taitungensis and 24 angiosperms. After data assembly and contig joining, the complete mitochondrial genome sequence of G. hirsutum was obtained. The completed G. hirsutum mt genome is 621,884 bp in length and contains 68 genes, including 35 protein genes, four rRNA genes and 29 tRNA genes. Five gene clusters are found conserved in all plant mt genomes; one and four clusters are specifically conserved in monocots and dicots, respectively. Homologous sequences are distributed along the plant mt genomes, and closely related species share the most homologous sequences. For species that have both mt and chloroplast genome sequences available, we checked the location of cp-like migration and found several fragments closely linked with mitochondrial genes. The G. hirsutum mt genome possesses most of the common characters of higher plant mt genomes. The existence of syntenic gene clusters, as well as the conservation of some intergenic sequences and genic content among the plant mt genomes, suggests that evolution of mt genomes is consistent with plant taxonomy but independent among different species.
Analysis of the Legionella longbeachae genome and transcriptome uncovers unique strategies to cause Legionnaires' disease.

Directory of Open Access Journals (Sweden)

Christel Cazalet

2010-02-01

Legionella pneumophila and L. longbeachae are two species of a large genus of bacteria that are ubiquitous in nature. L. pneumophila is mainly found in natural and artificial water circuits while L. longbeachae is mainly present in soil. Under the appropriate conditions both species are human pathogens, capable of causing a severe form of pneumonia termed Legionnaires' disease. Here we report the sequencing and analysis of four L. longbeachae genomes: one complete genome sequence of L. longbeachae strain NSW150 serogroup (Sg) 1, and three draft genome sequences, one more belonging to Sg1 and two to Sg2. The genome organization and gene content of the four L. longbeachae genomes are highly conserved, indicating strong pressure for niche adaptation. Analysis and comparison of L. longbeachae strain NSW150 with L. pneumophila revealed common but also unexpected features specific to this pathogen. The interaction with host cells shows distinct features from L. pneumophila, as L. longbeachae possesses a unique repertoire of putative Dot/Icm type IV secretion system substrates, eukaryotic-like and eukaryotic domain proteins, and encodes additional secretion systems. However, analysis of the ability of a dotA mutant of L. longbeachae NSW150 to replicate in Acanthamoeba castellanii and in a mouse lung infection model showed that the Dot/Icm type IV secretion system is also essential for the virulence of L. longbeachae. In contrast to L. pneumophila, L. longbeachae does not encode flagella, thereby providing a possible explanation for differences in mouse susceptibility to infection between the two pathogens. Furthermore, transcriptome analysis revealed that L. longbeachae has a less pronounced biphasic life cycle as compared to L. pneumophila, and genome analysis and electron microscopy suggested that L. longbeachae is encapsulated. These species-specific differences may account for the different environmental niches and disease epidemiology of these
Soil fungal communities in a Castanea sativa (chestnut) forest producing large quantities of Boletus edulis sensu lato (porcini): where is the mycelium of porcini?

Science.gov (United States)

Peintner, Ursula; Iotti, Mirco; Klotz, Petra; Bonuso, Enrico; Zambonelli, Alessandra

2007-04-01

A study was conducted in a Castanea sativa forest that produces large quantities of the edible mushroom porcini (Boletus edulis sensu lato). The primary aim was to study porcini mycelia in the soil, and to determine if there were any possible ecological and functional interactions with other dominant soil fungi. Three different approaches were used: collection and morphological identification of fruiting bodies, morphological and molecular identification of ectomycorrhizae by rDNA-ITS sequence analyses and molecular identification of the soil mycelia by ITS clone libraries. Soil samples were taken directly under basidiomes of Boletus edulis, Boletus aestivalis, Boletus aereus and Boletus pinophilus. Thirty-nine ectomycorrhizal fungi were identified on root tips whereas 40 fungal species were found in the soil using the cloning technique. The overlap between above- and below-ground fungal communities was very low. Boletus mycelia, compared with other soil fungi, were rare and with scattered distribution, whereas their fruiting bodies dominated the above-ground fungal community. Only B. aestivalis ectomycorrhizae were relatively abundant and detected as mycelia in the soil. No specific fungus-fungus association was found. Factors triggering formation of mycorrhizae and fructification of porcini appear to be too complex to be simply explained on the basis of the amount of fungal mycelia in the soil.
Polycyclic aromatic hydrocarbons in ambient air, surface soil and wheat grain near a large steel-smelting manufacturer in northern China.

Science.gov (United States)

Liu, Weijian; Wang, Yilong; Chen, Yuanchen; Tao, Shu; Liu, Wenxin

2017-07-01

The total concentrations and component profiles of polycyclic aromatic hydrocarbons (PAHs) in ambient air, surface soil and wheat grain collected from wheat fields near a large steel-smelting manufacturer in Northern China were determined. Based on the specific isomeric ratios of paired species in ambient air, principal component analysis and multivariate linear regression, the main emission source of local PAHs was identified as a mixture of industrial and domestic coal combustion, biomass burning and traffic exhaust. The total organic carbon (TOC) fraction was considerably correlated with the total and individual PAH concentrations in surface soil. The total concentrations of PAHs in wheat grain were relatively low, with dominant low molecular weight constituents, and the compositional profile was more similar to that in ambient air than in topsoil. Combined with more significant results from partial correlation and linear regression models, the contribution from air PAHs to grain PAHs may be greater than that from soil PAHs. Copyright © 2016. Published by Elsevier B.V.

Genome sequencing and annotation of Acinetobacter guillouiae strain MSP 4-18

Directory of Open Access Journals (Sweden)

Nitin Kumar Singh

2014-12-01

The genus Acinetobacter consists of 31 validly published species ubiquitously distributed in nature and primarily associated with nosocomial infection. We report the 4.8 Mb genome of Acinetobacter guillouiae MSP 4-18, isolated from a mangrove soil sample from Parangipettai (11°30′N, 79°47′E), Tamil Nadu, India. The draft genome of A. guillouiae MSP 4-18 has a G + C content of 38.0% and includes 3 rRNA genes (5S, 23S, 16S) and 69 aminoacyl-tRNA synthetase genes.
Genome sequencing and annotation of Amycolatopsis vancoresmycina strain DSM 44592T

Directory of Open Access Journals (Sweden)

Navjot Kaur

2014-12-01

We report the 9.0-Mb draft genome of Amycolatopsis vancoresmycina strain DSM 44592T, which was isolated from an Indian soil sample and produces the antibiotic vancoresmycin. The draft genome of strain DSM 44592T consists of 9,037,069 bp with a G+C content of 71.79%, 8340 predicted protein-coding genes and 57 RNAs. RAST annotation indicates that Streptomyces sp. AA4 (score 521), Saccharomonospora viridis DSM 43017 (score 400) and Actinosynnema mirum DSM 43827 (score 372) are the closest neighbors of strain DSM 44592T.
Draft genome of the gayal, Bos frontalis

Science.gov (United States)

Wang, Ming-Shan; Zeng, Yan; Wang, Xiao; Nie, Wen-Hui; Wang, Jin-Huan; Su, Wei-Ting; Xiong, Zi-Jun; Wang, Sheng; Qu, Kai-Xing; Yan, Shou-Qing; Yang, Min-Min; Wang, Wen; Dong, Yang; Zhang, Ya-Ping

2017-01-01

Gayal (Bos frontalis), also known as mithan or mithun, is a large endangered semi-domesticated bovine that has a limited geographical distribution in the hill-forests of China, Northeast India, Bangladesh, Myanmar, and Bhutan. Many questions about the gayal such as its origin, population history, and genetic basis of local adaptation remain largely unresolved. De novo sequencing and assembly of the whole gayal genome provides an opportunity to address these issues. We report a high-depth sequencing, de novo assembly, and annotation of a female Chinese gayal genome. Based on the Illumina genomic sequencing platform, we have generated 350.38 Gb of raw data from 16 different insert-size libraries. A total of 276.86 Gb of clean data is retained after quality control. The assembled genome is about 2.85 Gb with scaffold and contig N50 sizes of 2.74 Mb and 14.41 kb, respectively. Repetitive elements account for 48.13% of the genome. Gene annotation has yielded 26 667 protein-coding genes, of which 97.18% have been functionally annotated. BUSCO assessment shows that our assembly captures 93% (3183 of 4104) of the core eukaryotic genes and 83.1% of vertebrate universal single-copy orthologs. We provide the first comprehensive de novo genome of the gayal. This genetic resource is integral for investigating the origin of the gayal and performing comparative genomic studies to improve understanding of the speciation and divergence of bovine species. The assembled genome could be used as reference in future population genetic studies of gayal. PMID:29048483
Deep horizons: Soil Carbon sequestration and storage potential in grassland soils

Science.gov (United States)

Torres-Sallan, Gemma; Schulte, Rogier; Lanigan, Gary J.; Byrne, Kenneth A.; Reidy, Brian; Creamer, Rachel

2016-04-01

Soil Organic Carbon (SOC) enhances soil fertility, holding nutrients in a plant-available form. It also improves aeration and water infiltration. Soils are considered a vital pool for C (Carbon) sequestration, as they are the largest pool of C after the oceans, and contain 3.5 times more C than the atmosphere. SOC models and inventories tend to focus on the top 30 cm of soils, only analysing total SOC values. Association of C with microaggregates (53-250 μm) and silt and clay (40 °C. Through a wet sieving procedure, four aggregate sizes were isolated: large macroaggregates (>2000 μm); macroaggregates (250-2000 μm); microaggregates and silt & clay. Organic C associated with each aggregate fraction was analysed on a LECO combustion analyser. Sand-free C was calculated for each aggregate size. For all soil types, 84% of the SOC located in the first 30 cm was contained inside macroaggregates and large macroaggregates. Given that this fraction has a turnover time of 1 to 10 years, sampling at that depth only provides information on the labile fraction in soil, and does not consider the longer term C sequestration potential. Only when looking at the whole profile could two clear trends be observed: 1) soils with a clay increase at depth had most of their C located in the silt and clay fractions, which indicates their enhanced C sequestration capacity; 2) free-draining soils had a bigger part of their SOC located in the macroaggregate fractions. These results indicate that current C inventories and models that focus on the top 30 cm do not accurately measure soil C sequestration potential in soils, but rather the more labile fraction. However, at depth soil forming processes have been identified as a major factor influencing C sequestration potential in soils. This has a major impact in further quantifying and sustaining C sequestration into the future. Soils with a high sequestration potential at depth need to be managed to enhance the residence time to contribute to future
High resolution soil moisture radiometer. [large space structures]

Science.gov (United States)

Wilheit, T. T.

1978-01-01

An electrically scanned pushbroom phased antenna array is described for a microwave radiometer which can provide agriculturally meaningful measurements of soil moisture. The antenna size of 100 meters at 1400 MHz or 230 meters at 611 MHz requires several shuttle launches and orbital assembly. Problems inherent to the size of the structure and specific instrument problems are discussed as well as the preliminary design.
THE SOIL-GEOCHEMICAL SUBSTANTIATION PRINCIPLES OF AN ESTIMATION OF HALOMORPHIC SOIL WITH THE PURPOSES OF FOREST MELIORATION

Directory of Open Access Journals (Sweden)

Gheorghe Jigau

2006-10-01

In the Prut-Dniester interfluve, halomorphic soils occur in practically all soil-geographical areas and are characterized by significant genetic variety. They include intrazonal types (sodic soils, saline soils and halomorphic soils of the large and small river valleys) as well as alkalized and solonchak genera of the zonal chernozem soils. They are distributed both within river floodplains and outside them. Outside the floodplains they form large massifs. In territories outside the floodplains they also form small island-like sites, most often confined to outcrops of saline Neogene clays or to areas subject to constant or periodic wetting.
Complete genome sequencing of Agrobacterium sp. H13-3, the former Rhizobium lupini H13-3, reveals a tripartite genome consisting of a circular and a linear chromosome and an accessory plasmid but lacking a tumor-inducing Ti-plasmid.

Science.gov (United States)

Wibberg, Daniel; Blom, Jochen; Jaenicke, Sebastian; Kollin, Florian; Rupp, Oliver; Scharf, Birgit; Schneiker-Bekel, Susanne; Szczepanowski, Rafael; Goesmann, Alexander; Setubal, Joao Carlos; Schmitt, Rüdiger; Pühler, Alfred; Schlüter, Andreas

2011-08-20

Agrobacterium sp. H13-3, formerly known as Rhizobium lupini H13-3, is a soil bacterium that was isolated from the rhizosphere of Lupinus luteus. The isolate has been established as a model system for studying novel features of flagellum structure, motility and chemotaxis within the family Rhizobiaceae. The complete genome sequence of Agrobacterium sp. H13-3 has been established and the genome structure and phylogenetic assignment of the organism was analysed. For de novo sequencing of the Agrobacterium sp. H13-3 genome, a combined strategy comprising 454-pyrosequencing on the Genome Sequencer FLX platform and PCR-based amplicon sequencing for gap closure was applied. The finished genome consists of three replicons and comprises 5,573,770 bases. Based on phylogenetic analyses, the isolate could be assigned to the genus Agrobacterium biovar I and represents a genomic species G1 strain within this biovariety. The highly conserved circular chromosome (2.82 Mb) of Agrobacterium sp. H13-3 mainly encodes housekeeping functions characteristic for an aerobic, heterotrophic bacterium. Agrobacterium sp. H13-3 is a motile bacterium driven by the rotation of several complex flagella. Its behaviour towards external stimuli is regulated by a large chemotaxis regulon and a total of 17 chemoreceptors. Comparable to the genome of Agrobacterium tumefaciens C58, Agrobacterium sp. H13-3 possesses a linear chromosome (2.15 Mb) that is related to its reference replicon and features chromosomal and plasmid-like properties. The accessory plasmid pAspH13-3a (0.6 Mb) is only distantly related to the plasmid pAtC58 of A. tumefaciens C58 and shows a mosaic structure. A tumor-inducing Ti-plasmid is missing in the sequenced strain H13-3 indicating that it is a non-virulent isolate. Copyright © 2011 Elsevier B.V. All rights reserved.
Comparative genomic in situ hybridization analysis on the ...

African Journals Online (AJOL)

The nucleolar organizing regions (NORs), a few telomeres, most centromeric regions and numerous interstitial sites were detected. The signals in small genomes were relatively sparse and unevenly distributed along chromosomes, whereas those in large genomes were dense and basically evenly distributed.
Cloning-free genome engineering in Sinorhizobium meliloti advances applications of Cre/loxP site-specific recombination.

Science.gov (United States)

Döhlemann, Johannes; Brennecke, Meike; Becker, Anke

2016-09-10

The soil-dwelling α-proteobacterium Sinorhizobium meliloti serves as model for studies of symbiotic nitrogen fixation, a highly important process in sustainable agriculture. Here, we report advancements of the genetic toolbox accelerating genome editing in S. meliloti. The hsdMSR operon encodes a type-I restriction-modification (R-M) system. Transformation of S. meliloti is counteracted by the restriction endonuclease HsdR degrading DNA which lacks the appropriate methylation pattern. We provide a stable S. meliloti hsdR deletion mutant showing enhanced transformation with Escherichia coli-derived plasmid DNA and demonstrate that using an E. coli plasmid donor, expressing S. meliloti methyl transferase genes, is an alternative strategy of increasing the transformation efficiency of S. meliloti. Furthermore, we devise a novel cloning-free genome editing (CFGE) method for S. meliloti, Agrobacterium tumefaciens and Xanthomonas campestris, and demonstrate the applicability of this method for intricate applications of the Cre/lox recombination system in S. meliloti. An enhanced Cre/lox system, allowing for serial deletions of large genomic regions, was established. An assay of lox spacer mutants identified a set of lox sites mediating specific recombination. The availability of several non-promiscuous Cre recognition sites enables simultaneous specific Cre/lox recombination events. CFGE combined with Cre/lox recombination is put forward as powerful approach for targeted genome editing, involving serial steps of manipulation to expedite the genetic accessibility of S. meliloti as chassis. Copyright © 2016 Elsevier B.V. All rights reserved.
Viral symbiosis and the holobiontic nature of the human genome.

Science.gov (United States)

Ryan, Francis Patrick

2016-01-01

The human genome is a holobiontic union of the mammalian nuclear genome, the mitochondrial genome and large numbers of endogenized retroviral genomes. This article defines and explores this symbiogenetic pattern of evolution, looking at the implications for human genetics, epigenetics, embryogenesis, physiology and the pathogenesis of inborn errors of metabolism and many other diseases. © 2016 APMIS. Published by John Wiley & Sons Ltd.
A BAC-based physical map of the Drosophila buzzatii genome

Energy Technology Data Exchange (ETDEWEB)

Gonzalez, Josefa; Nefedov, Michael; Bosdet, Ian; Casals, Ferran; Calvete, Oriol; Delprat, Alejandra; Shin, Heesun; Chiu, Readman; Mathewson, Carrie; Wye, Natasja; Hoskins, Roger A.; Schein, Jacqueline E.; de Jong, Pieter; Ruiz, Alfredo

2005-03-18

Large-insert genomic libraries facilitate cloning of large genomic regions, allow the construction of clone-based physical maps and provide useful resources for sequencing entire genomes. Drosophila buzzatii is a representative species of the repleta group in the Drosophila subgenus, which is being widely used as a model in studies of genome evolution, ecological adaptation and speciation. We constructed a Bacterial Artificial Chromosome (BAC) genomic library of D. buzzatii using the shuttle vector pTARBAC2.1. The library comprises 18,353 clones with an average insert size of 152 kb and a ~18X expected representation of the D. buzzatii euchromatic genome. We screened the entire library with six euchromatic gene probes and estimated the actual genome representation to be ~23X. In addition, we fingerprinted by restriction digestion and agarose gel electrophoresis a sample of 9,555 clones, and assembled them using Finger Printed Contigs (FPC) software and manual editing into 345 contigs (mean of 26 clones per contig) and 670 singletons. Finally, we anchored 181 large contigs (containing 7,788 clones) to the D. buzzatii salivary gland polytene chromosomes by in situ hybridization of 427 representative clones. The BAC library and a database with all the information regarding the high coverage BAC-based physical map described in this paper are available to the research community.
Study of the Effect of Turbulence and Large Obstacles on the Evaporation from Bare Soil Surface through Coupled Free-flow and Porous-medium Flow Model

Science.gov (United States)

Gao, B.; Smits, K. M.

2017-12-01

Evaporation is a strongly coupled exchange process of mass, momentum and energy between the atmosphere and the soil. Several mechanisms influence evaporation, such as the atmospheric conditions, the structure of the soil surface, and the physical properties of the soil. Among the previous studies associated with evaporation modeling, most efforts use uncoupled models which simplify the influences of the atmosphere and soil through the use of resistance terms. Those that do consider the coupling between the free flow and porous media flow mainly consider flat terrain with grain-scale roughness. However, larger obstacles, which may form drags or ridges allowing normal convective air flow through the soil, are common in nature and may affect the evaporation significantly. Therefore, the goal of this work is to study the influence of large obstacles such as wavy surfaces on the flow behavior within the soil and exchange processes to the atmosphere under turbulent free-flow conditions. For simplicity, soil surfaces with large obstacles are represented by a simple wavy surface. To do this, we modified a previously developed theory for two-phase two-component porous-medium flow, coupling it to single-phase two-component turbulent flow to simulate and analyze the evaporation from wavy soil surfaces. Detailed laboratory scale experiments using a wind tunnel interfaced with a porous media tank were carried out to test the modeling results. The characteristics of turbulent flow across a permeable wavy surface are discussed. Results demonstrate that there is an obvious recirculation zone formed at the surface, which is special because of the accumulation of water vapor and the thicker boundary layer in this area. In addition, the influences of both the free flow and porous medium on the evaporation are also analyzed. The porous medium affects the evaporation through the amount of water it can provide to the soil surface, while the atmosphere influences the evaporation
Large meta-analysis of genome-wide association studies identifies five loci for lean body mass

DEFF Research Database (Denmark)

Zillikens, M Carola; Demissie, Serkalem; Hsu, Yi-Hsiang

2017-01-01

Lean body mass, consisting mostly of skeletal muscle, is important for healthy aging. We performed a genome-wide association study for whole body (20 cohorts of European ancestry with n = 38,292) and appendicular (arms and legs) lean body mass (n = 28,330) measured using dual energy X-ray absorptiometry or bioelectrical impedance analysis, adjusted for sex, age, height, and fat mass. Twenty-one single-nucleotide polymorphisms were significantly associated with lean body mass either genome wide (p ... a meta-analysis of genome-wide association studies for whole body lean body mass and find five novel genetic loci to be significantly associated.
An evaluation of Comparative Genome Sequencing (CGS) by comparing two previously-sequenced bacterial genomes

Directory of Open Access Journals (Sweden)

Herring Christopher D

2007-08-01

Background: With the development of new technology, it has recently become practical to resequence the genome of a bacterium after experimental manipulation. It is critical though to know the accuracy of the technique used, and to establish confidence that all of the mutations were detected. Results: In order to evaluate the accuracy of genome resequencing using the microarray-based Comparative Genome Sequencing service provided by Nimblegen Systems Inc., we resequenced the E. coli strain W3110 Kohara using MG1655 as a reference, both of which have been completely sequenced using traditional sequencing methods. CGS detected 7 of 8 small sequence differences, one large deletion, and 9 of 12 IS element insertions present in W3110, but did not detect a large chromosomal inversion. In addition, we confirmed that CGS also detected 2 SNPs, one deletion and 7 IS element insertions that are not present in the genome sequence, which we attribute to changes that occurred after the creation of the W3110 lambda clone library. The false positive rate for SNPs was one per 244 Kb of genome sequence. Conclusion: CGS is an effective way to detect multiple mutations present in one bacterium relative to another, and while highly cost-effective, is prone to certain errors. Mutations occurring in repeated sequences or in sequences with a high degree of secondary structure may go undetected. It is also critical to follow up on regions of interest in which SNPs were not called because they often indicate deletions or IS element insertions.
The Global Invertebrate Genomics Alliance (GIGA): Developing Community Resources to Study Diverse Invertebrate Genomes

KAUST Repository

Bracken-Grissom, Heather

2013-12-12

Over 95% of all metazoan (animal) species comprise the invertebrates, but very few genomes from these organisms have been sequenced. We have, therefore, formed a Global Invertebrate Genomics Alliance (GIGA). Our intent is to build a collaborative network of diverse scientists to tackle major challenges (e.g., species selection, sample collection and storage, sequence assembly, annotation, analytical tools) associated with genome/transcriptome sequencing across a large taxonomic spectrum. We aim to promote standards that will facilitate comparative approaches to invertebrate genomics and collaborations across the international scientific community. Candidate study taxa include species from Porifera, Ctenophora, Cnidaria, Placozoa, Mollusca, Arthropoda, Echinodermata, Annelida, Bryozoa, and Platyhelminthes, among others. GIGA will target 7000 noninsect/nonnematode species, with an emphasis on marine taxa because of the unrivaled phyletic diversity in the oceans. Priorities for selecting invertebrates for sequencing will include, but are not restricted to, their phylogenetic placement; relevance to organismal, ecological, and conservation research; and their importance to fisheries and human health. We highlight benefits of sequencing both whole genomes (DNA) and transcriptomes and also suggest policies for genomic-level data access and sharing based on transparency and inclusiveness. The GIGA Web site has been launched to facilitate this collaborative venture.
Comparative Genomics of the Ubiquitous, Hydrocarbon-degrading Genus Marinobacter

Science.gov (United States)

Singer, E.; Webb, E.; Edwards, K. J.

2012-12-01

The genus Marinobacter is amongst the most ubiquitous in the global oceans and strains have been isolated from a wide variety of marine environments, including offshore oil-well heads, coastal thermal springs, Antarctic sea water, saline soils and associations with diatoms and dinoflagellates. Many strains have been recognized to be important hydrocarbon degraders in various marine habitats presenting sometimes extreme pH or salinity conditions. Analysis of the genome of M. aquaeolei revealed enormous adaptation versatility with an assortment of strategies for carbon and energy acquisition, sensation, and defense. In an effort to elucidate the ecological and biogeochemical significance of the Marinobacters, seven Marinobacter strains from diverse environments were included in a comparative genomics study. Genomes were screened for metabolic and adaptation potential to elucidate the strategies responsible for the omnipresence of the Marinobacter genus and their remedial action potential in hydrocarbon-polluted waters. The core genome predominantly encodes for key genes involved in hydrocarbon degradation, biofilm-relevant processes, including utilization of external DNA, halotolerance, as well as defense mechanisms against heavy metals, antibiotics, and toxins. All Marinobacter strains were observed to degrade a wide spectrum of hydrocarbon species, including aliphatic, polycyclic aromatic as well as acyclic isoprenoid compounds. Various genes predicted to facilitate hydrocarbon degradation, e.g. alkane 1-monooxygenase, appear to have originated from lateral gene transfer as they are located on gene clusters of 10-20% lower GC-content compared to genome averages and are flanked by transposases. Top ortholog hits are found in other hydrocarbon degrading organisms, e.g. Alcanivorax borkumensis. Strategies for hydrocarbon uptake encoded by various Marinobacter strains include cell surface hydrophobicity adaptation via capsular polysaccharide biosynthesis and attachment
In Depth Characterization of Repetitive DNA in 23 Plant Genomes Reveals Sources of Genome Size Variation in the Legume Tribe Fabeae.

Science.gov (United States)

Macas, Jiří; Novák, Petr; Pellicer, Jaume; Čížková, Jana; Koblížková, Andrea; Neumann, Pavel; Fuková, Iva; Doležel, Jaroslav; Kelly, Laura J; Leitch, Ilia J

2015-01-01

The differential accumulation and elimination of repetitive DNA are key drivers of genome size variation in flowering plants, yet there have been few studies which have analysed how different types of repeats in related species contribute to genome size evolution within a phylogenetic context. This question is addressed here by conducting large-scale comparative analysis of repeats in 23 species from four genera of the monophyletic legume tribe Fabeae, representing a 7.6-fold variation in genome size. Phylogenetic analysis and genome size reconstruction revealed that this diversity arose from genome size expansions and contractions in different lineages during the evolution of Fabeae. Employing a combination of low-pass genome sequencing with novel bioinformatic approaches resulted in identification and quantification of repeats making up 55-83% of the investigated genomes. In turn, this enabled an analysis of how each major repeat type contributed to the genome size variation encountered. Differential accumulation of repetitive DNA was found to account for 85% of the genome size differences between the species, and most (57%) of this variation was found to be driven by a single lineage of Ty3/gypsy LTR-retrotransposons, the Ogre elements. Although the amounts of several other lineages of LTR-retrotransposons and the total amount of satellite DNA were also positively correlated with genome size, their contributions to genome size variation were much smaller (up to 6%). Repeat analysis within a phylogenetic framework also revealed profound differences in the extent of sequence conservation between different repeat types across Fabeae. In addition to these findings, the study has provided a proof of concept for the approach combining recent developments in sequencing and bioinformatics to perform comparative analyses of repetitive DNAs in a large number of non-model species without the need to assemble their genomes.
In Depth Characterization of Repetitive DNA in 23 Plant Genomes Reveals Sources of Genome Size Variation in the Legume Tribe Fabeae.

Directory of Open Access Journals (Sweden)

Jiří Macas

The differential accumulation and elimination of repetitive DNA are key drivers of genome size variation in flowering plants, yet there have been few studies which have analysed how different types of repeats in related species contribute to genome size evolution within a phylogenetic context. This question is addressed here by conducting large-scale comparative analysis of repeats in 23 species from four genera of the monophyletic legume tribe Fabeae, representing a 7.6-fold variation in genome size. Phylogenetic analysis and genome size reconstruction revealed that this diversity arose from genome size expansions and contractions in different lineages during the evolution of Fabeae. Employing a combination of low-pass genome sequencing with novel bioinformatic approaches resulted in identification and quantification of repeats making up 55-83% of the investigated genomes. In turn, this enabled an analysis of how each major repeat type contributed to the genome size variation encountered. Differential accumulation of repetitive DNA was found to account for 85% of the genome size differences between the species, and most (57%) of this variation was found to be driven by a single lineage of Ty3/gypsy LTR-retrotransposons, the Ogre elements. Although the amounts of several other lineages of LTR-retrotransposons and the total amount of satellite DNA were also positively correlated with genome size, their contributions to genome size variation were much smaller (up to 6%). Repeat analysis within a phylogenetic framework also revealed profound differences in the extent of sequence conservation between different repeat types across Fabeae. In addition to these findings, the study has provided a proof of concept for the approach combining recent developments in sequencing and bioinformatics to perform comparative analyses of repetitive DNAs in a large number of non-model species without the need to assemble their genomes.
Soil carbon management in large-scale Earth system modelling

DEFF Research Database (Denmark)

Olin, S.; Lindeskog, M.; Pugh, T. A. M.

2015-01-01

, carbon sequestration and nitrogen leaching from croplands are evaluated and discussed. Compared to the version of LPJ-GUESS that does not include land-use dynamics, estimates of soil carbon stocks and nitrogen leaching from terrestrial to aquatic ecosystems were improved. Our model experiments allow us...
Genome Sequences of Gordonia Phages BaxterFox, Kita, Nymphadora, and Yeezy

OpenAIRE

Pope, Welkin H.; Bandla, Sharanya; Colbert, Alexandra K.; Eichinger, Fiona G.; Gamburg, Michelle B.; Horiates, Stavroula G.; Jamison, Jerrica M.; Julian, Dana R.; Moore, Whitney A.; Murthy, Pranav; Powell, Meghan C.; Smith, Sydney V.; Mezghani, Nadia; Milliken, Katherine A.; Thompson, Paige K.

2016-01-01

Gordonia phages BaxterFox, Kita, Nymphadora, and Yeezy are newly characterized phages of Gordonia terrae, isolated from soil samples in Pittsburgh, Pennsylvania. These phages have genome lengths between 50,346 and 53,717 bp, and encode on average 84 predicted proteins. All have G+C content of 66.6%.

Genomic dark matter: the reliability of short read mapping illustrated by the genome mappability score.

Science.gov (United States)

Lee, Hayan; Schatz, Michael C

2012-08-15

Genome resequencing and short read mapping are two of the primary tools of genomics and are used for many important applications. The current state-of-the-art in mapping uses the quality values and mapping quality scores to evaluate the reliability of the mapping. These attributes, however, are assigned to individual reads and do not directly measure the problematic repeats across the genome. Here, we present the Genome Mappability Score (GMS) as a novel measure of the complexity of resequencing a genome. The GMS is a weighted probability that any read could be unambiguously mapped to a given position and thus measures the overall composition of the genome itself. We have developed the Genome Mappability Analyzer to compute the GMS of every position in a genome. It leverages the parallelism of cloud computing to analyze large genomes, and enabled us to identify the 5-14% of the human, mouse, fly and yeast genomes that are difficult to analyze with short reads. We examined the accuracy of the widely used BWA/SAMtools polymorphism discovery pipeline in the context of the GMS, and found discovery errors are dominated by false negatives, especially in regions with poor GMS. These errors are fundamental to the mapping process and cannot be overcome by increasing coverage. As such, the GMS should be considered in every resequencing project to pinpoint the 'dark matter' of the genome, including of known clinically relevant variations in these regions. The source code and profiles of several model organisms are available at http://gma-bio.sourceforge.net
Soil mechanics and analysis of soils overlying cavitose bedrock

International Nuclear Information System (INIS)

Drumm, E.C.

1987-08-01

The stability of the residual soils existing at the West Chestnut Ridge Site, Oak Ridge Reservation, Tennessee, was evaluated. The weathered bedrock below this residual soil contains numerous solution cavities, and several karst features were identified. The West Chestnut Ridge site was evaluated with respect to deformation and collapse of the residual soil into the bedrock cavities. A finite element analysis investigated the effects of bedrock cavity radius, thickness of soil overburden, and surface surcharge upon the deformational and stability characteristics of the residual soil. The results indicate that for small cavity radii, the thickness of the soil cover has little effect on the zone of yielded soil. For large cavity radii, a smaller zone of distressed soil occurs under thick soil cover than under thin soil cover. Dimensionless curves are presented to enable the prediction of the vertical extent of the zone of yielded soil for a range of site geometries. Although the thick soil deposits (100 feet or greater) typically found on the ridges result in high stresses adjacent to the cavity, the area of the distressed or yielded soil is small and unlikely to extend to the surface. In addition, the surface deformation or subsidence is expected to be minimal. Thus, the siting of waste facilities on the ridges where the overburden is maximum would tend to reduce the effects of deformation into the cavities. 29 refs., 37 figs., 7 tabs
Large Diversity of Nonstandard Genes and Dynamic Evolution of Chloroplast Genomes in Siphonous Green Algae (Bryopsidales, Chlorophyta).

Science.gov (United States)

Cremen, Ma Chiela M; Leliaert, Frederik; Marcelino, Vanessa R; Verbruggen, Heroen

2018-04-01

Chloroplast genomes have undergone tremendous alterations through the evolutionary history of the green algae (Chloroplastida). This study focuses on the evolution of chloroplast genomes in the siphonous green algae (order Bryopsidales). We present five new chloroplast genomes, which along with existing sequences, yield a data set representing all but one family of the order. Using comparative phylogenetic methods, we investigated the evolutionary dynamics of genomic features in the order. Our results show extensive variation in chloroplast genome architecture and intron content. Variation in genome size is accounted for by the amount of intergenic space and freestanding open reading frames that do not show significant homology to standard plastid genes. We show the diversity of these nonstandard genes based on their conserved protein domains, which are often associated with mobile functions (reverse transcriptase/intron maturase, integrases, phage- or plasmid-DNA primases, transposases, ligases). Investigation of the introns showed proliferation of group II introns in the early evolution of the order and their subsequent loss in the core Halimedineae, possibly through RT-mediated intron loss.
ScreenBEAM: a novel meta-analysis algorithm for functional genomics screens via Bayesian hierarchical modeling | Office of Cancer Genomics

Science.gov (United States)

Functional genomics (FG) screens, using RNAi or CRISPR technology, have become a standard tool for systematic, genome-wide loss-of-function studies for therapeutic target discovery. As in many large-scale assays, however, off-target effects, variable reagent potency and experimental noise must be accounted for to appropriately control for false positives.
Prediction of Vehicle Mobility on Large-Scale Soft-Soil Terrain Maps Using Physics-Based Simulation

Science.gov (United States)

2016-08-04

and contact constraints using a time-stepping explicit integration procedure. The DEM soil model can account for the soil cohesion, compressibility ...maximum unconsolidated radius and when the particles are compressed that radius is reduced by the amount of plastic deformation. The primary soil ... compressing the soil to a desired consolidation stress using a lid, after which the lid is removed. This step is essential for cohesive soils since
Genomics-assisted breeding in fruit trees.

Science.gov (United States)

Iwata, Hiroyoshi; Minamikawa, Mai F; Kajiya-Kanegae, Hiromi; Ishimori, Motoyuki; Hayashi, Takeshi

2016-01-01

Recent advancements in genomic analysis technologies have opened up new avenues to promote the efficiency of plant breeding. Novel genomics-based approaches for plant breeding and genetics research, such as genome-wide association studies (GWAS) and genomic selection (GS), are useful, especially in fruit tree breeding. The breeding of fruit trees is hindered by their long generation time, large plant size, long juvenile phase, and the necessity to wait for the physiological maturity of the plant to assess the marketable product (fruit). In this article, we describe the potential of genomics-assisted breeding, which uses these novel genomics-based approaches, to break through these barriers in conventional fruit tree breeding. We first introduce the molecular marker systems and whole-genome sequence data that are available for fruit tree breeding. Next we introduce the statistical methods for biparental linkage and quantitative trait locus (QTL) mapping as well as GWAS and GS. We then review QTL mapping, GWAS, and GS studies conducted on fruit trees. We also review novel technologies for rapid generation advancement. Finally, we note the future prospects of genomics-assisted fruit tree breeding and problems that need to be overcome in the breeding.
A petroleum contaminated soil bioremediation facility

Energy Technology Data Exchange (ETDEWEB)

Lombard, K.; Hazen, T.

1994-06-01

The amount of petroleum contaminated soil (PCS) at the Savannah River site (SRS) that has been identified, excavated and is currently in storage has increased several fold during the last few years. Several factors have contributed to this problem: (1) South Carolina Department of Health and Environmental Control (SCDHEC) lowered the sanitary landfill maximum concentration for total petroleum hydrocarbons (TPH) in the soil from 500 to 100 parts per million (ppm), (2) removal and replacement of underground storage tanks at several sites, (3) most recently SCDHEC disallowed aeration for treatment of contaminated soil, and (4) discovery of several very large contaminated areas of soil associated with leaking underground storage tanks (LUST), leaking pipes, disposal areas, and spills. Thus, SRS has an urgent need to remediate large quantities of contaminated soil that are currently stockpiled and the anticipated contaminated soils to be generated from accidental spills. As long as we utilize petroleum based compounds at the site, we will continue to generate contaminated soil that will require remediation.
A petroleum contaminated soil bioremediation facility

International Nuclear Information System (INIS)

Lombard, K.; Hazen, T.

1994-01-01

The amount of petroleum contaminated soil (PCS) at the Savannah River site (SRS) that has been identified, excavated and is currently in storage has increased several fold during the last few years. Several factors have contributed to this problem: (1) South Carolina Department of Health and Environmental Control (SCDHEC) lowered the sanitary landfill maximum concentration for total petroleum hydrocarbons (TPH) in the soil from 500 to 100 parts per million (ppm), (2) removal and replacement of underground storage tanks at several sites, (3) most recently SCDHEC disallowed aeration for treatment of contaminated soil, and (4) discovery of several very large contaminated areas of soil associated with leaking underground storage tanks (LUST), leaking pipes, disposal areas, and spills. Thus, SRS has an urgent need to remediate large quantities of contaminated soil that are currently stockpiled and the anticipated contaminated soils to be generated from accidental spills. As long as we utilize petroleum based compounds at the site, we will continue to generate contaminated soil that will require remediation.
A Genome-Wide Association Study in Large White and Landrace Pig Populations for Number Piglets Born Alive

Science.gov (United States)

Bergfelder-Drüing, Sarah; Grosse-Brinkhaus, Christine; Lind, Bianca; Erbe, Malena; Schellander, Karl; Simianer, Henner; Tholen, Ernst

2015-01-01

The number of piglets born alive (NBA) per litter is one of the most important traits in pig breeding due to its influence on production efficiency. It is difficult to improve NBA because the heritability of the trait is low and it is governed by a high number of loci with low to moderate effects. To clarify the biological and genetic background of NBA, genome-wide association studies (GWAS) were performed using 4,012 Large White and Landrace pigs from herdbook and commercial breeding companies in Germany (3), Austria (1) and Switzerland (1). The animals were genotyped with the Illumina PorcineSNP60 BeadChip. Because of population stratifications within and between breeds, clusters were formed using the genetic distances between the populations. Five clusters for each breed were formed and analysed by GWAS approaches. In total, 17 different significant markers affecting NBA were found in regions with known effects on female reproduction. No overlapping significant chromosome areas or QTL between Large White and Landrace breed were detected. PMID:25781935
A genome-wide association study in large white and landrace pig populations for number piglets born alive.

Directory of Open Access Journals (Sweden)

Sarah Bergfelder-Drüing

The number of piglets born alive (NBA) per litter is one of the most important traits in pig breeding due to its influence on production efficiency. It is difficult to improve NBA because the heritability of the trait is low and it is governed by a high number of loci with low to moderate effects. To clarify the biological and genetic background of NBA, genome-wide association studies (GWAS) were performed using 4,012 Large White and Landrace pigs from herdbook and commercial breeding companies in Germany (3), Austria (1) and Switzerland (1). The animals were genotyped with the Illumina PorcineSNP60 BeadChip. Because of population stratifications within and between breeds, clusters were formed using the genetic distances between the populations. Five clusters for each breed were formed and analysed by GWAS approaches. In total, 17 different significant markers affecting NBA were found in regions with known effects on female reproduction. No overlapping significant chromosome areas or QTL between Large White and Landrace breed were detected.
DivStat: a user-friendly tool for single nucleotide polymorphism analysis of genomic diversity.

Directory of Open Access Journals (Sweden)

Inês Soares

Recent developments have led to an enormous increase in publicly available large genomic data sets, including complete genomes. The 1000 Genomes Project was a major contributor, releasing the results of sequencing a large number of individual genomes, and allowing for a myriad of large scale studies on human genetic variation. However, the tools currently available are insufficient for some analyses of data sets encompassing more than hundreds of base pairs and for haplotype sequences of single nucleotide polymorphisms (SNPs). Here, we present a new and potent tool to deal with large data sets allowing the computation of a variety of summary statistics of population genetic data, increasing the speed of data analysis.
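To make "summary statistics of population genetic data" concrete, the sketch below computes two common ones from aligned haplotype strings: nucleotide diversity (mean pairwise differences per site) and the number of segregating sites. The function names and the string-based input are assumptions chosen for illustration; the abstract does not say which statistics DivStat computes or how.

```python
from itertools import combinations


def nucleotide_diversity(haplotypes):
    """Mean number of pairwise differences per site across all haplotype pairs."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two haplotypes")
    length = len(haplotypes[0])
    if length == 0 or any(len(h) != length for h in haplotypes):
        raise ValueError("haplotypes must be non-empty and of equal length")
    # Count mismatching sites for every unordered pair of haplotypes.
    diffs = sum(
        sum(a != b for a, b in zip(h1, h2))
        for h1, h2 in combinations(haplotypes, 2)
    )
    n_pairs = n * (n - 1) // 2
    return diffs / (n_pairs * length)


def segregating_sites(haplotypes):
    """Number of alignment columns that carry more than one allele."""
    return sum(len(set(column)) > 1 for column in zip(*haplotypes))
```

The pairwise loop is quadratic in the number of haplotypes, so a tool aimed at large data sets would likely use per-site allele counts instead; the direct form is kept here because it is easier to check by hand.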
Broad genomic and transcriptional analysis reveals a highly derived genome in dinoflagellate mitochondria

Directory of Open Access Journals (Sweden)

Keeling Patrick J

2007-09-01

Background: Dinoflagellates comprise an ecologically significant and diverse eukaryotic phylum that is sister to the phylum containing apicomplexan endoparasites. The mitochondrial genome of apicomplexans is uniquely reduced in gene content and size, encoding only three proteins and two ribosomal RNAs (rRNAs) within a highly compacted 6 kb DNA. Dinoflagellate mitochondrial genomes have been comparatively poorly studied: limited available data suggest some similarities with apicomplexan mitochondrial genomes but an even more radical type of genomic organization. Here, we investigate structure, content and expression of dinoflagellate mitochondrial genomes. Results: From two dinoflagellates, Crypthecodinium cohnii and Karlodinium micrum, we generated over 42 kb of mitochondrial genomic data that indicate a reduced gene content paralleling that of mitochondrial genomes in apicomplexans, i.e., only three protein-encoding genes and at least eight conserved components of the highly fragmented large and small subunit rRNAs. Unlike in apicomplexans, dinoflagellate mitochondrial genes occur in multiple copies, often as gene fragments, and in numerous genomic contexts. Analysis of cDNAs suggests several novel aspects of dinoflagellate mitochondrial gene expression. Polycistronic transcripts were found, standard start codons are absent, and oligoadenylation occurs upstream of stop codons, resulting in the absence of termination codons. Transcripts of at least one gene, cox3, are apparently trans-spliced to generate full-length mRNAs. RNA substitutional editing, a process previously identified for mRNAs in dinoflagellate mitochondria, is also implicated in rRNA expression. Conclusion: The dinoflagellate mitochondrial genome shares the same gene complement and fragmentation of rRNA genes with its apicomplexan counterpart. However, it also exhibits several unique characteristics. Most notable are the expansion of gene copy numbers and their arrangements
Nannochloropsis genomes reveal evolution of microalgal oleaginous traits.

Directory of Open Access Journals (Sweden)

Dongmei Wang

2014-01-01

Oleaginous microalgae are promising feedstock for biofuels, yet the genetic diversity, origin and evolution of oleaginous traits remain largely unknown. Here we present a detailed phylogenomic analysis of five oleaginous Nannochloropsis species (a total of six strains) and one time-series transcriptome dataset for triacylglycerol (TAG) synthesis on one representative strain. Despite small genome sizes, high coding potential and relative paucity of mobile elements, the genomes feature small cores of ca. 2,700 protein-coding genes and a large pan-genome of >38,000 genes. The six genomes share key oleaginous traits, such as the enrichment of selected lipid biosynthesis genes and certain glycoside hydrolase genes that potentially shift carbon flux from chrysolaminaran to TAG synthesis. The eleven type II diacylglycerol acyltransferase genes (DGAT-2) in every strain, each expressed during TAG synthesis, likely originated from three ancient genomes, including the secondary endosymbiosis host and the engulfed green and red algae. Horizontal gene transfers were inferred in most lipid synthesis nodes with expanded gene doses and many glycoside hydrolase genes. Thus multiple genome pooling and horizontal genetic exchange, together with selective inheritance of lipid synthesis genes and species-specific gene loss, have led to the enormous genetic apparatus for oleaginousness and the wide genomic divergence among present-day Nannochloropsis. These findings have important implications in the screening and genetic engineering of microalgae for biofuels.
Soil strength and forest operations

OpenAIRE

Beekman, F.

1987-01-01

The use of heavy machinery and transport vehicles is an integral part of modern forest operations. This use often causes damage to the standing trees and to the soil. In this study the effects of vehicle traffic on the soil are analysed and the possible consequences for forest management discussed. The study is largely restricted to sandy and loamy soils because of their importance for Dutch forestry.
Soil strength, defined as the resistance of soil structure against the impa...
Analysis of the Pantoea ananatis pan-genome reveals factors underlying its ability to colonize and interact with plant, insect and vertebrate hosts.

Science.gov (United States)

De Maayer, Pieter; Chan, Wai Yin; Rubagotti, Enrico; Venter, Stephanus N; Toth, Ian K; Birch, Paul R J; Coutinho, Teresa A

2014-05-27

Pantoea ananatis is found in a wide range of natural environments, including water, soil, as part of the epi- and endophytic flora of various plant hosts, and in the insect gut. Some strains have proven effective as biological control agents and plant-growth promoters, while other strains have been implicated in diseases of a broad range of plant hosts and humans. By analysing the pan-genome of eight sequenced P. ananatis strains isolated from different sources we identified factors potentially underlying its ability to colonize and interact with hosts in both the plant and animal Kingdoms. The pan-genome of the eight compared P. ananatis strains consisted of a core genome comprised of 3,876 protein coding sequences (CDSs) and a sizeable accessory genome consisting of 1,690 CDSs. We estimate that ~106 unique CDSs would be added to the pan-genome with each additional P. ananatis genome sequenced in the future. The accessory fraction is derived mainly from integrated prophages and codes mostly for proteins of unknown function. Comparison of the translated CDSs on the P. ananatis pan-genome with the proteins encoded on all sequenced bacterial genomes currently available revealed that P. ananatis carries a number of CDSs with orthologs restricted to bacteria associated with distinct hosts, namely plant-, animal- and insect-associated bacteria. These CDSs encode proteins with putative roles in transport and metabolism of carbohydrate and amino acid substrates, adherence to host tissues, protection against plant and animal defense mechanisms and the biosynthesis of potential pathogenicity determinants including insecticidal peptides, phytotoxins and type VI secretion system effectors. P. ananatis has an 'open' pan-genome typical of bacterial species that colonize several different environments. The pan-genome incorporates a large number of genes encoding proteins that may enable P. ananatis to colonize, persist in and potentially cause disease symptoms in a wide range of
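The core/accessory split described in this abstract can be sketched with plain set operations. The example assumes each strain has already been reduced to a set of gene-family (ortholog cluster) identifiers; the function name and the input layout are hypothetical and do not reflect the authors' pipeline.

```python
def partition_pan_genome(strain_genes):
    """Split a pan-genome into core and accessory gene families.

    strain_genes: dict mapping strain name -> set of gene-family IDs
    (assumed to come from a prior ortholog-clustering step).
    """
    if not strain_genes:
        raise ValueError("at least one strain is required")
    gene_sets = [set(genes) for genes in strain_genes.values()]
    pan = set().union(*gene_sets)            # every family seen in any strain
    core = set.intersection(*gene_sets)      # families present in all strains
    accessory = pan - core                   # families missing from >= 1 strain
    return core, accessory
```

Under this definition the core can only shrink or stay the same as strains are added, while the pan-genome can only grow, which is the behaviour behind the 'open' pan-genome noted in the abstract.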
The genome of Eucalyptus grandis

Energy Technology Data Exchange (ETDEWEB)

Myburg, Alexander A.; Grattapaglia, Dario; Tuskan, Gerald A.; Hellsten, Uffe; Hayes, Richard D.; Grimwood, Jane; Jenkins, Jerry; Lindquist, Erika; Tice, Hope; Bauer, Diane; Goodstein, David M.; Dubchak, Inna; Poliakov, Alexandre; Mizrachi, Eshchar; Kullan, Anand R. K.; Hussey, Steven G.; Pinard, Desre; van der Merwe, Karen; Singh, Pooja; van Jaarsveld, Ida; Silva-Junior, Orzenil B.; Togawa, Roberto C.; Pappas, Marilia R.; Faria, Danielle A.; Sansaloni, Carolina P.; Petroli, Cesar D.; Yang, Xiaohan; Ranjan, Priya; Tschaplinski, Timothy J.; Ye, Chu-Yu; Li, Ting; Sterck, Lieven; Vanneste, Kevin; Murat, Florent; Soler, Marçal; Clemente, Hélène San; Saidi, Naijib; Cassan-Wang, Hua; Dunand, Christophe; Hefer, Charles A.; Bornberg-Bauer, Erich; Kersting, Anna R.; Vining, Kelly; Amarasinghe, Vindhya; Ranik, Martin; Naithani, Sushma; Elser, Justin; Boyd, Alexander E.; Liston, Aaron; Spatafora, Joseph W.; Dharmwardhana, Palitha; Raja, Rajani; Sullivan, Christopher; Romanel, Elisson; Alves-Ferreira, Marcio; Külheim, Carsten; Foley, William; Carocha, Victor; Paiva, Jorge; Kudrna, David; Brommonschenkel, Sergio H.; Pasquali, Giancarlo; Byrne, Margaret; Rigault, Philippe; Tibbits, Josquin; Spokevicius, Antanas; Jones, Rebecca C.; Steane, Dorothy A.; Vaillancourt, René E.; Potts, Brad M.; Joubert, Fourie; Barry, Kerrie; Pappas, Georgios J.; Strauss, Steven H.; Jaiswal, Pankaj; Grima-Pettenati, Jacqueline; Salse, Jérôme; Van de Peer, Yves; Rokhsar, Daniel S.; Schmutz, Jeremy

2014-06-11

Eucalypts are the world's most widely planted hardwood trees. Their broad adaptability, rich species diversity, fast growth and superior multipurpose wood, have made them a global renewable resource of fiber and energy that mitigates human pressures on natural forests. We sequenced and assembled >94% of the 640 Mbp genome of Eucalyptus grandis into its 11 chromosomes. A set of 36,376 protein coding genes were predicted revealing that 34% occur in tandem duplications, the largest proportion found thus far in any plant genome. Eucalypts also show the highest diversity of genes for plant specialized metabolism that act as chemical defence against biotic agents and provide unique pharmaceutical oils. Resequencing of a set of inbred tree genomes revealed regions of strongly conserved heterozygosity, likely hotspots of inbreeding depression. The resequenced genome of the sister species E. globulus underscored the high inter-specific genome colinearity despite substantial genome size variation in the genus. The genome of E. grandis is the first reference for the early diverging Rosid order Myrtales and is placed here basal to the Eurosids. This resource expands knowledge on the unique biology of large woody perennials and provides a powerful tool to accelerate comparative biology, breeding and biotechnology.
Apparent soil electrical conductivity in two different soil types

Directory of Open Access Journals (Sweden)

Wilker Nunes Medeiros

Mapping the apparent soil electrical conductivity (ECa) has become important for the characterization of the soil variability in precision agriculture systems. Could the ECa be used to locate the soil sampling points for mapping the chemical and physical soil attributes? The objective of this work was to examine the relations between ECa and soil attributes in two fields presenting different soil textures. In each field, 50 sampling points were chosen using a path that presented a high variability of ECa obtained from a preliminary ECa map. At each sampling point, the ECa was measured in soil depths of 0-20, 0-40 and 0-60 cm. In addition, at each point, soil samples were collected for the determination of physical and chemical attributes in the laboratory. The ECa data obtained for different soil depths were very similar. A large number of significant correlations between ECa and the soil attributes were found. In the sandy clay loam texture field there was no correlation between ECa and organic matter or between ECa and soil clay and sand content. However, a significant positive correlation was shown for the remaining phosphorus. In the sandy loam texture field the ECa had a significant positive correlation with clay content and a significant negative correlation with sand content. The results suggest that the mapping of apparent soil electrical conductivity does not replace traditional soil sampling; however, it can be used as information to delimit regions in a field that have similar soil attributes.
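The relations between ECa and a soil attribute reported here are correlations across sampling points. A minimal sketch of that calculation follows, assuming a Pearson coefficient over paired measurements; the abstract does not state which correlation statistic or software was used, and the function name is hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between paired measurements, e.g. ECa and clay content."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)
```

With 50 sampling points per field, as in the study, the coefficient would be computed once per attribute and per measurement depth, and its significance assessed separately.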
The peculiar landscape of repetitive sequences in the olive (Olea europaea L.) genome.

Science.gov (United States)

Barghini, Elena; Natali, Lucia; Cossu, Rosa Maria; Giordani, Tommaso; Pindo, Massimo; Cattonaro, Federica; Scalabrin, Simone; Velasco, Riccardo; Morgante, Michele; Cavallini, Andrea

2014-04-01

Analyzing genome structure in different species provides insight into the evolution of plant genome size. Olive (Olea europaea L.) has a medium-sized haploid genome of 1.4 Gb, whose structure is largely uncharacterized, despite the growing importance of this tree as an oil crop. Next-generation sequencing technologies and different computational procedures have been used to study the composition of the olive genome and its repetitive fraction. A total of 2.03 and 2.3 genome equivalents of Illumina and 454 reads from genomic DNA, respectively, were assembled following different procedures, which produced more than 200,000 differently redundant contigs, with mean length higher than 1,000 nt. Mapping Illumina reads onto the assembled sequences was used to estimate their redundancy. The genome data set was subdivided into highly and medium redundant and nonredundant contigs. By combining identification and mapping of repeated sequences, it was established that tandem repeats represent a very large portion of the olive genome (∼31% of the whole genome), consisting of six main families of different length, two of which were first discovered in these experiments. The other large redundant class in the olive genome is represented by transposable elements (especially long terminal repeat-retrotransposons). On the whole, the results of our analyses show the peculiar landscape of the olive genome, related to the massive amplification of tandem repeats, more than that reported for any other sequenced plant genome.
MAKER2: an annotation pipeline and genome-database management tool for second-generation genome projects.

Science.gov (United States)

Holt, Carson; Yandell, Mark

2011-12-22

Second-generation sequencing technologies are precipitating major shifts with regards to what kinds of genomes are being sequenced and how they are annotated. While the first generation of genome projects focused on well-studied model organisms, many of today's projects involve exotic organisms whose genomes are largely terra incognita. This complicates their annotation, because unlike first-generation projects, there are no pre-existing 'gold-standard' gene-models with which to train gene-finders. Improvements in genome assembly and the wide availability of mRNA-seq data are also creating opportunities to update and re-annotate previously published genome annotations. Today's genome projects are thus in need of new genome annotation tools that can meet the challenges and opportunities presented by second-generation sequencing technologies. We present MAKER2, a genome annotation and data management tool designed for second-generation genome projects. MAKER2 is a multi-threaded, parallelized application that can process second-generation datasets of virtually any size. We show that MAKER2 can produce accurate annotations for novel genomes where training-data are limited, of low quality or even non-existent. MAKER2 also provides an easy means to use mRNA-seq data to improve annotation quality; and it can use these data to update legacy annotations, significantly improving their quality. We also show that MAKER2 can evaluate the quality of genome annotations, and identify and prioritize problematic annotations for manual review. MAKER2 is the first annotation engine specifically designed for second-generation genome projects. MAKER2 scales to datasets of any size, requires little in the way of training data, and can use mRNA-seq data to improve annotation quality. It can also update and manage legacy genome annotation datasets.
Minimum information about a single amplified genome (MISAG) and a metagenome-assembled genome (MIMAG) of bacteria and archaea

Energy Technology Data Exchange (ETDEWEB)

Bowers, Robert M.; Kyrpides, Nikos C.; Stepanauskas, Ramunas; Harmon-Smith, Miranda; Doud, Devin; Reddy, T. B. K.; Schulz, Frederik; Jarett, Jessica; Rivers, Adam R.; Eloe-Fadrosh, Emiley A.; Tringe, Susannah G.; Ivanova, Natalia N.; Copeland, Alex; Clum, Alicia; Becraft, Eric D.; Malmstrom, Rex R.; Birren, Bruce; Podar, Mircea; Bork, Peer; Weinstock, George M.; Garrity, George M.; Dodsworth, Jeremy A.; Yooseph, Shibu; Sutton, Granger; Glöckner, Frank O.; Gilbert, Jack A.; Nelson, William C.; Hallam, Steven J.; Jungbluth, Sean P.; Ettema, Thijs J. G.; Tighe, Scott; Konstantinidis, Konstantinos T.; Liu, Wen-Tso; Baker, Brett J.; Rattei, Thomas; Eisen, Jonathan A.; Hedlund, Brian; McMahon, Katherine D.; Fierer, Noah; Knight, Rob; Finn, Rob; Cochrane, Guy; Karsch-Mizrachi, Ilene; Tyson, Gene W.; Rinke, Christian; Kyrpides, Nikos C.; Schriml, Lynn; Garrity, George M.; Hugenholtz, Philip; Sutton, Granger; Yilmaz, Pelin; Meyer, Folker; Glöckner, Frank O.; Gilbert, Jack A.; Knight, Rob; Finn, Rob; Cochrane, Guy; Karsch-Mizrachi, Ilene; Lapidus, Alla; Meyer, Folker; Yilmaz, Pelin; Parks, Donovan H.; Eren, A. M.; Schriml, Lynn; Banfield, Jillian F.; Hugenholtz, Philip; Woyke, Tanja

2017-08-08

The number of genomes from uncultivated microbes will soon surpass the number of isolate genomes in public databases (Hugenholtz, Skarshewski, & Parks, 2016). Technological advancements in high-throughput sequencing and assembly, including single-cell genomics and the computational extraction of genomes from metagenomes (GFMs), are largely responsible. Here we propose community standards for reporting the Minimum Information about a Single-Cell Genome (MIxS-SCG) and Minimum Information about Genomes extracted From Metagenomes (MIxS-GFM) specific for Bacteria and Archaea. The standards have been developed in the context of the International Genomics Standards Consortium (GSC) community (Field et al., 2014) and can be viewed as a supplement to other GSC checklists including the Minimum Information about a Genome Sequence (MIGS), Minimum information about a Metagenomic Sequence(s) (MIMS) (Field et al., 2008) and Minimum Information about a Marker Gene Sequence (MIMARKS) (P. Yilmaz et al., 2011). Community-wide acceptance of MIxS-SCG and MIxS-GFM for Bacteria and Archaea will enable broad comparative analyses of genomes from the majority of taxa that remain uncultivated, improving our understanding of microbial function, ecology, and evolution.

Soil microbial community structure and diversity are largely influenced by soil pH and nutrient quality in 78-year-old tree plantations

Science.gov (United States)

Zhou, Xiaoqi; Guo, Zhiying; Chen, Chengrong; Jia, Zhongjun

2017-04-01

Forest plantations have been recognised as a key strategy management tool for stocking carbon (C) in soils, thereby contributing to climate warming mitigation. However, long-term ecological consequences of anthropogenic forest plantations on the community structure and diversity of soil microorganisms and the underlying mechanisms in determining these patterns are poorly understood. In this study, we selected 78-year-old tree plantations that included three coniferous tree species (i.e. slash pine, hoop pine and kauri pine) and a eucalypt species in subtropical Australia. We investigated the patterns of community structure, and the diversity of soil bacteria and eukaryotes by using high-throughput sequencing of 16S rRNA and 18S rRNA genes. We also measured the potential methane oxidation capacity under different tree species. The results showed that slash pine and Eucalyptus significantly increased the dominant taxa of bacterial Acidobacteria and the dominant taxa of eukaryotic Ascomycota, and formed clusters of soil bacterial and eukaryotic communities, which were clearly different from the clusters under hoop pine and kauri pine. Soil pH and nutrient quality indicators such as C : nitrogen (N) and extractable organic C : extractable organic N were key factors in determining the patterns of soil bacterial and eukaryotic communities between the different tree species treatments. Slash pine and Eucalyptus had significantly lower soil bacterial and eukaryotic operational taxonomical unit numbers and lower diversity indices than kauri pine and hoop pine. A key factor limitation hypothesis was introduced, which gives a reasonable explanation for lower diversity indices under slash pine and Eucalyptus. In addition, slash pine and Eucalyptus had a higher soil methane oxidation capacity than the other tree species. These results suggest that significant changes in soil microbial communities may occur in response to chronic disturbance by tree plantations, and highlight
Utilization of complete chloroplast genomes for phylogenetic studies

NARCIS (Netherlands)

Ramlee, Shairul Izan Binti

2016-01-01

Chloroplast DNA sequence polymorphisms are a primary source of data in many plant phylogenetic studies. The chloroplast genome is relatively conserved in its evolution making it an ideal molecule to retain phylogenetic signals. The chloroplast genome is also largely, but not completely, free from
Pervasive, Genome-Wide Transcription in the Organelle Genomes of Diverse Plastid-Bearing Protists

Directory of Open Access Journals (Sweden)

Matheus Sanitá Lima

2017-11-01

Organelle genomes are among the most sequenced kinds of chromosome. This is largely because they are small and widely used in molecular studies, but also because next-generation sequencing technologies made sequencing easier, faster, and cheaper. However, studies of organelle RNA have not kept pace with those of DNA, despite huge amounts of freely available eukaryotic RNA-sequencing (RNA-seq) data. Little is known about organelle transcription in nonmodel species, and most of the available eukaryotic RNA-seq data have not been mined for organelle transcripts. Here, we use publicly available RNA-seq experiments to investigate organelle transcription in 30 diverse plastid-bearing protists with varying organelle genomic architectures. Mapping RNA-seq data to organelle genomes revealed pervasive, genome-wide transcription, regardless of the taxonomic grouping, gene organization, or noncoding content. For every species analyzed, transcripts covered ≥85% of the mitochondrial and/or plastid genomes (all of which were ≤105 kb), indicating that most of the organelle DNA, coding and noncoding, is transcriptionally active. These results follow earlier studies of model species showing that organellar transcription is coupled and ubiquitous across the genome, requiring significant downstream processing of polycistronic transcripts. Our findings suggest that noncoding organelle DNA can be transcriptionally active, raising questions about the underlying function of these transcripts and underscoring the utility of publicly available RNA-seq data for recovering complete genome sequences. If pervasive transcription is also found in bigger organelle genomes (>105 kb) and across a broader range of eukaryotes, this could indicate that noncoding organelle RNAs are regulating fundamental processes within eukaryotic cells.
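The "≥85% of the genome covered" figure in this record is a breadth-of-coverage statistic. As a rough illustration only, the sketch below merges mapped-read intervals and reports the covered fraction; the function name, the half-open interval convention, and the toy coordinates are assumptions made for this example and are not taken from the study's pipeline.

```python
def covered_fraction(intervals, genome_length):
    """Fraction of genome positions covered by at least one interval.

    Intervals are half-open (start, end) pairs in 0-based coordinates;
    this convention is an assumption of the sketch.
    """
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    covered = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        # Clip each interval to the genome boundaries.
        start, end = max(start, 0), min(end, genome_length)
        if start >= end:
            continue
        if cur_end is None or start > cur_end:
            # Gap before this interval: close the previous merged block.
            if cur_end is not None:
                covered += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent: extend the current merged block.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        covered += cur_end - cur_start
    return covered / genome_length


# Hypothetical alignments on a 100-bp toy genome.
print(covered_fraction([(0, 40), (30, 70), (90, 100)], 100))  # 0.8
```

Merging before summing matters because overlapping reads would otherwise be counted twice and inflate the apparent breadth.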
Genome Engineering and Modification Toward Synthetic Biology for the Production of Antibiotics.

Science.gov (United States)

Zou, Xuan; Wang, Lianrong; Li, Zhiqiang; Luo, Jie; Wang, Yunfu; Deng, Zixin; Du, Shiming; Chen, Shi

2018-01-01

Antibiotic production is often governed by large gene clusters composed of genes related to antibiotic scaffold synthesis, tailoring, regulation, and resistance. With the expansion of genome sequencing, a considerable number of antibiotic gene clusters have been isolated and characterized. Emerging genome engineering techniques make more efficient engineering of antibiotics possible. In addition to genomic editing, multiple synthetic biology approaches have been developed for the exploration and improvement of antibiotic natural products. Here, we review the progress in the development of these genome editing techniques used to engineer new antibiotics, focusing on three aspects of genome engineering: direct cloning of large genomic fragments, genome engineering of gene clusters, and regulation of gene cluster expression. This review will not only summarize the current uses of genomic engineering techniques for cloning and assembly of antibiotic gene clusters or for altering antibiotic synthetic pathways but will also provide perspectives on the future directions of rebuilding biological systems for the design of novel antibiotics. © 2017 Wiley Periodicals, Inc.
Genomes of three facultatively symbiotic Frankia sp. strains reflect host plant biogeography

Energy Technology Data Exchange (ETDEWEB)

Normand, Philippe; Lapierre, Pascal; Tisa, Louis S.; Gogarten, J. Peter; Alloisio, Nicole; Bagnarol, Emilie; Bassi, Carla A.; Berry, Alison; Bickhart, Derek M.; Choisne, Nathalie; Couloux, Arnaud; Cournoyer, Benoit; Cruveiller, Stephane; Daubin, Vincent; Demange, Nadia; Francino, M. Pilar; Ggoltsman, Eugene; Huang, Ying; Kopp, Olga; Labarre, Laurent; Lapidus, Alla; Lavire, Celine; Marechal, Joelle; Martinez, Michele; Mastronunzio, Juliana E.; Mullin, Beth; Niemann, James; Pujic, Pierre; Rawnsley, Tania; Rouy, Zoe; Schenowitz, Chantal; Sellstedt, Anita; Tavares, Fernando; Tomkins, Jeffrey P.; Vallenet, David; Valverde, Claudio; Wall, Luis; Wang, Ying; Medigue, Claudine; Benson, David R.

2006-02-01

Filamentous actinobacteria from the genus Frankia and diverse woody trees and shrubs together form N2-fixing actinorhizal root nodule symbioses that are a major source of new soil nitrogen in widely diverse biomes 1. Three major clades of Frankia sp. strains are defined; each clade is associated with a defined subset of plants from among the eight actinorhizal plant families 2,3. The evolutionary trajectories followed by the ancestors of both symbionts leading to current patterns of symbiont compatibility are unknown. Here we show that the competing processes of genome expansion and contraction have operated in different groups of Frankia strains in a manner that can be related to the speciation of the plant hosts and their geographic distribution. We sequenced and compared the genomes from three Frankia sp. strains having different host plant specificities. The sizes of their genomes varied from 5.38 Mbp for a narrow host range strain (HFPCcI3) to 7.50 Mbp for a medium host range strain (ACN14a) to 9.08 Mbp for a broad host range strain (EAN1pec). This size divergence is the largest yet reported for such closely related bacteria. Since the order of divergence of the strains is known, the extent of gene deletion, duplication and acquisition could be estimated and was found to be in concert with the biogeographic history of the symbioses. Host plant isolation favored genome contraction, whereas host plant diversification favored genome expansion. The results support the idea that major genome reductions as well as expansions can occur in facultatively symbiotic soil bacteria as they respond to new environments in the context of their symbioses.
Assembly, Annotation, and Analysis of Multiple Mycorrhizal Fungal Genomes

Energy Technology Data Exchange (ETDEWEB)

Initiative Consortium, Mycorrhizal Genomics; Kuo, Alan; Grigoriev, Igor; Kohler, Annegret; Martin, Francis

2013-03-08

Mycorrhizal fungi play critical roles in host plant health, soil community structure and chemistry, and carbon and nutrient cycling, all areas of intense interest to the US Dept. of Energy (DOE) Joint Genome Institute (JGI). To this end we are building on our earlier sequencing of the Laccaria bicolor genome by partnering with INRA-Nancy and the mycorrhizal research community in the MGI to sequence and analyze dozens of mycorrhizal genomes of all Basidiomycota and Ascomycota orders and multiple ecological types (ericoid, orchid, and ectomycorrhizal). JGI has developed and deployed high-throughput sequencing techniques, and Assembly, RNASeq, and Annotation Pipelines. In 2012 alone we sequenced, assembled, and annotated 12 draft or improved genomes of mycorrhizae, and predicted ~232,831 genes and ~15,011 multigene families. All of this data is publicly available on JGI MycoCosm (http://jgi.doe.gov/fungi/), which provides access to both the genome data and tools with which to analyze the data. Preliminary comparisons of the current total of 14 public mycorrhizal genomes suggest that 1) short secreted proteins potentially involved in symbiosis are more enriched in some orders than in others amongst the mycorrhizal Agaricomycetes, 2) there are wide ranges of numbers of genes involved in certain functional categories, such as signal transduction and post-translational modification, and 3) novel gene families are specific to some ecological types.
The cacao Criollo genome v2.0: an improved version of the genome for genetic and functional genomic studies.

Science.gov (United States)

Argout, X; Martin, G; Droc, G; Fouet, O; Labadie, K; Rivals, E; Aury, J M; Lanaud, C

2017-09-15

Theobroma cacao L., native to the Amazonian basin of South America, is an economically important fruit tree crop for tropical countries as a source of chocolate. The first draft genome of the species, from a Criollo cultivar, was published in 2011. Although a useful resource, some improvements are possible, including identifying misassemblies, reducing the number of scaffolds and gaps, and anchoring un-anchored sequences to the 10 chromosomes. We used an NGS-based approach to significantly improve the assembly of the Belizean Criollo B97-61/B2 genome. We combined four Illumina large insert size mate paired libraries with 52x of Pacific Biosciences long reads to correct misassembled regions and reduce the number of scaffolds. We then used genotyping by sequencing (GBS) methods to increase the proportion of the assembly anchored to chromosomes. The scaffold number decreased from 4,792 in assembly V1 to 554 in V2 while the scaffold N50 size has increased from 0.47 Mb in V1 to 6.5 Mb in V2. A total of 96.7% of the assembly was anchored to the 10 chromosomes compared to 66.8% in the previous version. Unknown sites (Ns) were reduced from 10.8% to 5.7%. In addition, we updated the functional annotations and performed a new RefSeq structural annotation based on RNAseq evidence. Theobroma cacao Criollo genome version 2 will be a valuable resource for the investigation of complex traits at the genomic level and for future comparative genomics and genetics studies in cacao tree. New functional tools and annotations are available on the Cocoa Genome Hub ( http://cocoa-genome-hub.southgreen.fr ).
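The scaffold N50 quoted in this record is the length of the scaffold at which the running total, taken from the longest scaffold downward, first reaches half of the assembly size. A minimal sketch of that calculation follows; the function name and the toy scaffold lengths are illustrative assumptions, not values from the cacao assemblies.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths."""
    if not lengths:
        raise ValueError("need at least one scaffold length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first scaffold that pushes the running sum to half the
        # assembly size defines the N50.
        if running >= half_total:
            return length


# Toy assembly of seven scaffolds totalling 300 units.
print(n50([80, 70, 50, 40, 30, 20, 10]))  # 70
```

Because N50 depends on the whole length distribution, merging many small scaffolds into a few large ones raises it sharply, which is consistent with the drop in scaffold count reported alongside the N50 increase.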
Analyses of charophyte chloroplast genomes help characterize the ancestral chloroplast genome of land plants.

Science.gov (United States)

Civaň, Peter; Foster, Peter G; Embley, Martin T; Séneca, Ana; Cox, Cymon J

2014-04-01

Despite the significance of the relationships between embryophytes and their charophyte algal ancestors in deciphering the origin and evolutionary success of land plants, few chloroplast genomes of the charophyte algae have been reconstructed to date. Here, we present new data for three chloroplast genomes of the freshwater charophytes Klebsormidium flaccidum (Klebsormidiophyceae), Mesotaenium endlicherianum (Zygnematophyceae), and Roya anglica (Zygnematophyceae). The chloroplast genome of Klebsormidium has a quadripartite organization with exceptionally large inverted repeat (IR) regions and, uniquely among streptophytes, has lost the rrn5 and rrn4.5 genes from the ribosomal RNA (rRNA) gene cluster operon. The chloroplast genome of Roya differs from other zygnematophycean chloroplasts, including the newly sequenced Mesotaenium, by having a quadripartite structure that is typical of other streptophytes. On the basis of the improbability of the novel gain of IR regions, we infer that the quadripartite structure has likely been lost independently in at least three zygnematophycean lineages, although the absence of the usual rRNA operonic synteny in the IR regions of Roya may indicate their de novo origin. Significantly, all zygnematophycean chloroplast genomes have undergone substantial genomic rearrangement, which may be the result of ancient retroelement activity evidenced by the presence of integrase-like and reverse transcriptase-like elements in the Roya chloroplast genome. Our results corroborate the close phylogenetic relationship between Zygnematophyceae and land plants and identify 89 protein-coding genes and 22 introns present in the chloroplast genome at the time of the evolutionary transition of plants to land, all of which can be found in the chloroplast genomes of extant charophytes.
Single-Molecule FISH Reveals Non-selective Packaging of Rift Valley Fever Virus Genome Segments

NARCIS (Netherlands)

Wichgers Schreur, Paul J.; Kortekaas, Jeroen

2016-01-01

The bunyavirus genome comprises a small (S), medium (M), and large (L) RNA segment of negative polarity. Although genome segmentation confers evolutionary advantages by enabling genome reassortment events with related viruses, genome segmentation also complicates genome replication and packaging.
The Vigna Genome Server, 'VigGS': A Genomic Knowledge Base of the Genus Vigna Based on High-Quality, Annotated Genome Sequence of the Azuki Bean, Vigna angularis (Willd.) Ohwi & Ohashi.

Science.gov (United States)

Sakai, Hiroaki; Naito, Ken; Takahashi, Yu; Sato, Toshiyuki; Yamamoto, Toshiya; Muto, Isamu; Itoh, Takeshi; Tomooka, Norihiko

2016-01-01

The genus Vigna includes legume crops such as cowpea, mungbean and azuki bean, as well as >100 wild species. A number of the wild species are highly tolerant to severe environmental conditions including high-salinity, acid or alkaline soil; drought; flooding; and pests and diseases. These features of the genus Vigna make it a good target for investigation of genetic diversity in adaptation to stressful environments; however, a lack of genomic information has hindered such research in this genus. Here, we present a genome database of the genus Vigna, Vigna Genome Server ('VigGS', http://viggs.dna.affrc.go.jp), based on the recently sequenced azuki bean genome, which incorporates annotated exon-intron structures, along with evidence for transcripts and proteins, visualized in GBrowse. VigGS also facilitates user construction of multiple alignments between azuki bean genes and those of six related dicot species. In addition, the database displays sequence polymorphisms between azuki bean and its wild relatives and enables users to design primer sequences targeting any variant site. VigGS offers a simple keyword search in addition to sequence similarity searches using BLAST and BLAT. To incorporate up to date genomic information, VigGS automatically receives newly deposited mRNA sequences of pre-set species from the public database once a week. Users can refer to not only gene structures mapped on the azuki bean genome on GBrowse but also relevant literature of the genes. VigGS will contribute to genomic research into plant biotic and abiotic stresses and to the future development of new stress-tolerant crops. © The Author 2015. Published by Oxford University Press on behalf of Japanese Society of Plant Physiologists. All rights reserved. For permissions, please email: journals.permissions@oup.com.
Theory of microbial genome evolution

Science.gov (United States)

Koonin, Eugene

Bacteria and archaea have small genomes tightly packed with protein-coding genes. This compactness is commonly perceived as evidence of adaptive genome streamlining caused by strong purifying selection in large microbial populations. In such populations, even the small cost incurred by nonfunctional DNA because of extra energy and time expenditure is thought to be sufficient for this extra genetic material to be eliminated by selection. However, contrary to the predictions of this model, there exists a consistent, positive correlation between the strength of selection at the protein sequence level, measured as the ratio of nonsynonymous to synonymous substitution rates, and microbial genome size. By fitting the genome size distributions in multiple groups of prokaryotes to predictions of mathematical models of population evolution, we show that only models in which acquisition of additional genes is, on average, slightly beneficial yield a good fit to genomic data. Thus, the number of genes in prokaryotic genomes seems to reflect the equilibrium between the benefit of additional genes that diminishes as the genome grows and deletion bias. New genes acquired by microbial genomes, on average, appear to be adaptive. Evolution of bacterial and archaeal genomes involves extensive horizontal gene transfer and gene loss. Many microbes have open pangenomes, where each newly sequenced genome contains more than 10% 'ORFans', genes without detectable homologues in other species. A simple, steady-state evolutionary model reveals two sharply distinct classes of microbial genes, one of which (ORFans) is characterized by effectively instantaneous gene replacement, whereas the other consists of genes with finite, distributed replacement rates. These findings imply a conservative estimate of at least a billion distinct genes in the prokaryotic genomic universe.
Sifting through genomes with iterative-sequence clustering produces a large, phylogenetically diverse protein-family resource.

Science.gov (United States)

Sharpton, Thomas J; Jospin, Guillaume; Wu, Dongying; Langille, Morgan G I; Pollard, Katherine S; Eisen, Jonathan A

2012-10-13

New computational resources are needed to manage the increasing volume of biological data from genome sequencing projects. One fundamental challenge is the ability to maintain a complete and current catalog of protein diversity. We developed a new approach for the identification of protein families that focuses on the rapid discovery of homologous protein sequences. We implemented fully automated and high-throughput procedures to de novo cluster proteins into families based upon global alignment similarity. Our approach employs an iterative clustering strategy in which homologs of known families are sifted out of the search for new families. The resulting reduction in computational complexity enables us to rapidly identify novel protein families found in new genomes and to perform efficient, automated updates that keep pace with genome sequencing. We refer to protein families identified through this approach as "Sifting Families," or SFams. Our analysis of ~10.5 million protein sequences from 2,928 genomes identified 436,360 SFams, many of which are not represented in other protein family databases. We validated the quality of SFam clustering through statistical as well as network topology-based analyses. We describe the rapid identification of SFams and demonstrate how they can be used to annotate genomes and metagenomes. The SFam database catalogs protein-family quality metrics, multiple sequence alignments, hidden Markov models, and phylogenetic trees. Our source code and database are publicly available and will be subject to frequent updates (http://edhar.genomecenter.ucdavis.edu/sifting_families/).
From genomic variation to personalized medicine

DEFF Research Database (Denmark)

Wesolowska, Agata; Schmiegelow, Kjeld

Genomic variation is the basis of interindividual differences in observable traits and disease susceptibility. Genetic studies are the driving force of personalized medicine, as many of the differences in treatment efficacy can be attributed to our genomic background. The rapid development...... a considerable amount of the phenotype variability, hence the major difficulty of interpretation lies in the complexity of molecular interactions. This PhD thesis describes the state-of-art of the functional human variation research (Chapter 1) and introduces childhood acute lymphoblastic leukaemia (ALL...... the thesis and includes some final remarks on the perspectives of genomic variation research and personalized medicine. In summary, this thesis demonstrates the feasibility of integrative analyses of genomic variations and introduces large-scale hypothesis-driven SNP exploration studies as an emerging...
Inversion variants in human and primate genomes.

Science.gov (United States)

Catacchio, Claudia Rita; Maggiolini, Flavia Angela Maria; D'Addabbo, Pietro; Bitonto, Miriana; Capozzi, Oronzo; Signorile, Martina Lepore; Miroballo, Mattia; Archidiacono, Nicoletta; Eichler, Evan E; Ventura, Mario; Antonacci, Francesca

2018-05-18

For many years, inversions have been proposed to be a direct driving force in speciation since they suppress recombination when heterozygous. Inversions are the most common large-scale differences among humans and great apes. Nevertheless, they represent large events easily distinguishable by classical cytogenetics, whose resolution, however, is limited. Here, we performed a genome-wide comparison between human, great ape, and macaque genomes using the net alignments for the most recent releases of genome assemblies. We identified a total of 156 putative inversions, between 103 kb and 91 Mb, corresponding to 136 human loci. Combining literature, sequence, and experimental analyses, we analyzed 109 of these loci and found 67 regions inverted in one or multiple primates, including 28 newly identified inversions. These events overlap with 81 human genes at their breakpoints, and seven correspond to sites of recurrent rearrangements associated with human disease. This work doubles the number of validated primate inversions larger than 100 kb, beyond what was previously documented. We identified 74 sites of errors, where the sequence has been assembled in the wrong orientation, in the reference genomes analyzed. Our data serve two purposes: First, we generated a map of evolutionary inversions in these genomes representing a resource for interrogating differences among these species at a functional level; second, we provide a list of misassembled regions in these primate genomes, involving over 300 Mb of DNA and 1978 human genes. Accurately annotating these regions in the genome references has immediate applications for evolutionary and biomedical studies on primates. © 2018 Catacchio et al.; Published by Cold Spring Harbor Laboratory Press.
Continuous data assimilation for downscaling large-footprint soil moisture retrievals

KAUST Repository

Altaf, Muhammad; Jana, Raghavendra Belur; Hoteit, Ibrahim; McCabe, Matthew

2016-01-01

on coarse scale observations. Application of this approach is likely in generating fine and intermediate resolution soil moisture fields conditioned on the radiometer-based, coarse resolution products from remote sensing satellites.
Contaminated soil concrete blocks

NARCIS (Netherlands)

de Korte, A.C.J.; Brouwers, Jos; Limbachiya, Mukesh C.; Kew, Hsein Y.

2009-01-01

According to Dutch law, contaminated soil needs to be remediated or immobilised. The main focus in this article is the design of concrete blocks containing contaminated soil that are suitable for large-scale production, financially feasible, and meet all technical and environmental requirements. In
A Mitochondrial Genome of Rhyparochromidae (Hemiptera: Heteroptera) and a Comparative Analysis of Related Mitochondrial Genomes.

Science.gov (United States)

Li, Teng; Yang, Jie; Li, Yinwan; Cui, Ying; Xie, Qiang; Bu, Wenjun; Hillis, David M

2016-10-19

The Rhyparochromidae, the largest family of Lygaeoidea, encompasses more than 1,850 described species, but no mitochondrial genome has been sequenced to date. Here we describe the first mitochondrial genome for Rhyparochromidae: a complete mitochondrial genome of Panaorus albomaculatus (Scott, 1874). This mitochondrial genome is comprised of 16,345 bp, and contains the expected 37 genes and control region. The majority of the control region is made up of a large tandem-repeat region, which has a novel pattern not previously observed in other insects. The tandem-repeats region of P. albomaculatus consists of 53 tandem duplications (including one partial repeat), which is the largest number of tandem repeats among all the known insect mitochondrial genomes. Slipped-strand mispairing during replication is likely to have generated this novel pattern of tandem repeats. Comparative analysis of tRNA gene families in sequenced Pentatomomorpha and Lygaeoidea species shows that the pattern of nucleotide conservation is markedly higher on the J-strand. Phylogenetic reconstruction based on mitochondrial genomes suggests that Rhyparochromidae is not the sister group to all the remaining Lygaeoidea, and supports the monophyly of Lygaeoidea.
Genome-wide comparative analysis of codon usage bias and codon context patterns among cyanobacterial genomes.

Science.gov (United States)

Prabha, Ratna; Singh, Dhananjaya P; Sinha, Swati; Ahmad, Khurshid; Rai, Anil

2017-04-01

With the increasing accumulation of genomic sequence information of prokaryotes, the study of codon usage bias has gained renewed attention. The purpose of this study was to examine codon selection pattern within and across cyanobacterial species belonging to diverse taxonomic orders and habitats. We performed detailed comparative analysis of cyanobacterial genomes with respect to codon bias. Our analysis reflects that in cyanobacterial genomes, A- and/or T-ending codons were used predominantly in the genes whereas G- and/or C-ending codons were largely avoided. Variation in the codon context usage of cyanobacterial genes corresponded to the clustering of cyanobacteria as per their GC content. Analysis of codon adaptation index (CAI) and synonymous codon usage order (SCUO) revealed that the majority of genes are associated with low codon bias. Codon selection pattern in cyanobacterial genomes reflected compositional constraints as the major influencing factor. Although mutational constraint may play some role in affecting codon usage bias in cyanobacteria, compositional constraint in terms of genomic GC composition, coupled with environmental factors, affected codon selection pattern in cyanobacterial genomes. Copyright © 2016 Elsevier B.V. All rights reserved.
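The contrast between A/T-ending and G/C-ending codons can be made concrete with a simple third-position tally. The sketch below is not the CAI or SCUO analysis named in the abstract; it is a much simpler illustrative count, and the function name and the toy sequence are assumptions made for this example.

```python
from collections import Counter


def third_base_fractions(cds):
    """Fraction of codons ending in A/T versus G/C in one coding sequence.

    The sequence is read in frame from its first base; a trailing partial
    codon and codons with ambiguous bases are ignored.
    """
    seq = cds.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    ends = Counter(c[2] for c in codons if set(c) <= set("ACGT"))
    total = sum(ends.values())
    if total == 0:
        raise ValueError("no unambiguous codons found")
    at3 = (ends["A"] + ends["T"]) / total
    return {"AT3": at3, "GC3": 1 - at3}


# Toy sequence: ATG GCA TTT TAA, so three of four codons end in A or T.
print(third_base_fractions("ATGGCATTTTAA"))  # {'AT3': 0.75, 'GC3': 0.25}
```

Applied per gene and summarised per genome, a tally like this gives a quick view of third-position composition before moving to index-based measures.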
Complete genome of Martelella sp. AD-3, a moderately halophilic polycyclic aromatic hydrocarbons-degrading bacterium.

Science.gov (United States)

Cui, Changzheng; Li, Zhijie; Qian, Jiangchao; Shi, Jie; Huang, Ling; Tang, Hongzhi; Chen, Xin; Lin, Kuangfei; Xu, Ping; Liu, Yongdi

2016-05-10

Martelella sp. strain AD-3, a moderate halophilic bacterium, was isolated from a petroleum-contaminated soil with high salinity in China. Here, we report the complete genome of strain AD-3, which contains one circular chromosome and two circular plasmids. An array of genes related to metabolism of polycyclic aromatic hydrocarbons and halophilic mechanism in this bacterium was identified by the whole genome analysis. Copyright © 2016 Elsevier B.V. All rights reserved.
Separating metagenomic short reads into genomes via clustering

Directory of Open Access Journals (Sweden)

Tanaseichuk Olga

2012-09-01

Background: The metagenomics approach allows the simultaneous sequencing of all genomes in an environmental sample. This results in high complexity datasets, where in addition to repeats and sequencing errors, the number of genomes and their abundance ratios are unknown. Recently developed next-generation sequencing (NGS) technologies significantly improve the sequencing efficiency and cost. On the other hand, they result in shorter reads, which makes the separation of reads from different species harder. Among the existing computational tools for metagenomic analysis, there are similarity-based methods that use reference databases to align reads and composition-based methods that use composition patterns (i.e., frequencies of short words or l-mers) to cluster reads. Similarity-based methods are unable to classify reads from unknown species without close references (which constitute the majority of reads). Since composition patterns are preserved only in significantly large fragments, composition-based tools cannot be used for very short reads, which becomes a significant limitation with the development of NGS. A recently proposed algorithm, AbundanceBin, introduced another method that bins reads based on predicted abundances of the genomes sequenced. However, it does not separate reads from genomes of similar abundance levels. Results: In this work, we present a two-phase heuristic algorithm for separating short paired-end reads from different genomes in a metagenomic dataset. We use the observation that most of the l-mers belong to unique genomes when l is sufficiently large. The first phase of the algorithm results in clusters of l-mers each of which belongs to one genome. During the second phase, clusters are merged based on l-mer repeat information. These final clusters are used to assign reads. The algorithm could handle very short reads and sequencing errors. It is initially designed for genomes with similar abundance levels and then
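The observation this record relies on, that most l-mers belong to a single genome once l is large enough, can be checked on toy data. The sketch below only measures that property; it is not the two-phase clustering algorithm described in the abstract, and the function names and toy sequences are hypothetical.

```python
def lmers(seq, l):
    """Set of distinct words of length l in a sequence."""
    return {seq[i:i + l] for i in range(len(seq) - l + 1)}


def unique_fraction(genomes, l):
    """Fraction of distinct l-mers that occur in exactly one genome."""
    owners = {}
    for name, seq in genomes.items():
        for word in lmers(seq, l):
            owners.setdefault(word, set()).add(name)
    if not owners:
        return 0.0
    unique = sum(1 for names in owners.values() if len(names) == 1)
    return unique / len(owners)


# Two short made-up "genomes" that share a prefix and some short words.
toy = {"a": "ACGTACGTTT", "b": "ACGTGGGTTA"}
for l in (2, 5):
    print(l, unique_fraction(toy, l))
```

On this toy pair, only 2 of the 7 distinct 2-mers are genome-specific, while every 5-mer is, which is the trend the abstract exploits when it builds clusters of l-mers per genome.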

Myc-dependent genome instability and lifespan in Drosophila.

Directory of Open Access Journals (Sweden)

Christina Greer

The Myc family of transcription factors are key regulators of cell growth and proliferation that are dysregulated in a large number of human cancers. When overexpressed, Myc family proteins also cause genomic instability, a hallmark of both transformed and aging cells. Using an in vivo lacZ mutation reporter, we show that overexpression of Myc in Drosophila increases the frequency of large genome rearrangements associated with erroneous repair of DNA double-strand breaks (DSBs). In addition, we find that overexpression of Myc shortens adult lifespan and, conversely, that Myc haploinsufficiency reduces mutation load and extends lifespan. Our data provide the first evidence that Myc may act as a pro-aging factor, possibly through its ability to greatly increase genome instability.
Insights into the Pathogenesis of Anaplastic Large-Cell Lymphoma through Genome-wide DNA Methylation Profiling

Directory of Open Access Journals (Sweden)

Melanie R. Hassler

2016-10-01

Aberrant DNA methylation patterns in malignant cells allow insight into tumor evolution and development and can be used for disease classification. Here, we describe the genome-wide DNA methylation signatures of NPM-ALK-positive (ALK+) and NPM-ALK-negative (ALK−) anaplastic large-cell lymphoma (ALCL). We find that ALK+ and ALK− ALCL share common DNA methylation changes for genes involved in T cell differentiation and immune response, including TCR and CTLA-4, without an ALK-specific impact on tumor DNA methylation in gene promoters. Furthermore, we uncover a close relationship between global ALCL DNA methylation patterns and those in distinct thymic developmental stages and observe tumor-specific DNA hypomethylation in regulatory regions that are enriched for conserved transcription factor binding motifs such as AP1. Our results indicate similarity between ALCL tumor cells and thymic T cell subsets and a direct relationship between ALCL oncogenic signaling and DNA methylation through transcription factor induction and occupancy.
Resources for Functional Genomics Studies in Drosophila melanogaster

Science.gov (United States)

Mohr, Stephanie E.; Hu, Yanhui; Kim, Kevin; Housden, Benjamin E.; Perrimon, Norbert

2014-01-01

Drosophila melanogaster has become a system of choice for functional genomic studies. Many resources, including online databases and software tools, are now available to support design or identification of relevant fly stocks and reagents or analysis and mining of existing functional genomic, transcriptomic, proteomic, etc. datasets. These include large community collections of fly stocks and plasmid clones, “meta” information sites like FlyBase and FlyMine, and an increasing number of more specialized reagents, databases, and online tools. Here, we introduce key resources useful to plan large-scale functional genomics studies in Drosophila and to analyze, integrate, and mine the results of those studies in ways that facilitate identification of highest-confidence results and generation of new hypotheses. We also discuss ways in which existing resources can be used and might be improved and suggest a few areas of future development that would further support large- and small-scale studies in Drosophila and facilitate use of Drosophila information by the research community more generally. PMID:24653003
Evaluation of Assimilated SMOS Soil Moisture Data for US Cropland Soil Moisture Monitoring

Science.gov (United States)

Yang, Zhengwei; Sherstha, Ranjay; Crow, Wade; Bolten, John; Mladenova, Iva; Yu, Genong; Di, Liping

2016-01-01

Remotely sensed soil moisture data can provide timely, objective and quantitative crop soil moisture information with broad geospatial coverage and sufficiently high resolution observations collected throughout the growing season. This paper evaluates the feasibility of using the assimilated ESA Soil Moisture Ocean Salinity (SMOS) Mission L-band passive microwave data for operational US cropland soil surface moisture monitoring. The assimilated SMOS soil moisture data are first categorized to match the United States Department of Agriculture (USDA) National Agricultural Statistics Service (NASS) survey-based weekly soil moisture observation data, which are ordinal. The categorized assimilated SMOS soil moisture data are compared with NASS's survey-based weekly soil moisture data for consistency and robustness using visual assessment and rank correlation. Preliminary results indicate that the assimilated SMOS soil moisture data highly co-vary with NASS field observations across a large geographic area. Therefore, SMOS data have great potential for US operational cropland soil moisture monitoring.
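Rank correlation suits this comparison because the survey categories are ordinal and contain many ties. The abstract does not say which rank statistic was used; the sketch below assumes Spearman's coefficient with tied values given their average rank, and the function names and the category codes in the example are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences of length >= 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(vx * vy)


# Hypothetical weekly categories coded 1 (driest) to 4 (wettest).
satellite = [1, 2, 2, 3, 4, 4]
survey = [1, 1, 2, 3, 3, 4]
print(round(spearman(satellite, survey), 3))
```

Computing Pearson's correlation on average ranks, rather than using the shortcut formula based on squared rank differences, keeps the statistic valid when ties are present.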
Phylogenomic, Pan-genomic, Pathogenomic and Evolutionary Genomic Insights into the Agronomically Relevant Enterobacteria Pantoea ananatis and Pantoea stewartii

Directory of Open Access Journals (Sweden)

Pieter De Maayer

2017-09-01

Pantoea ananatis is ubiquitously found in the environment and causes disease on a wide range of plant hosts. By contrast, its sister species, Pantoea stewartii subsp. stewartii is the host-specific causative agent of the devastating maize disease Stewart’s wilt. This pathogen has a restricted lifecycle, overwintering in an insect vector before being introduced into susceptible maize cultivars, causing disease and returning to overwinter in its vector. The other subspecies, P. stewartii subsp. indologenes, has been isolated from different plant hosts and is predicted to proliferate in different environmental niches. Here we have, by the use of comparative genomics and a comprehensive suite of bioinformatic tools, analyzed the genomes of ten P. stewartii and nineteen P. ananatis strains. Our phylogenomic analyses have revealed that there are two distinct clades within P. ananatis while far less phylogenetic diversity was observed among the P. stewartii subspecies. Pan-genome analyses revealed that a large core genome comprising 3,571 protein coding sequences is shared among the twenty-nine compared strains. Furthermore, we showed that an extensive accessory genome made up largely of a mobilome of plasmids, integrated prophages, integrative and conjugative elements and insertion elements has resulted in extensive diversification of P. stewartii and P. ananatis. While these organisms share many pathogenicity determinants, our comparative genomic analyses show that they differ in terms of the secretion systems they encode. The genomic differences identified in this study have allowed us to postulate on the divergent evolutionary histories of the analyzed P. ananatis and P. stewartii strains and on the molecular basis underlying their ecological success and host range.
Phylogenomic, Pan-genomic, Pathogenomic and Evolutionary Genomic Insights into the Agronomically Relevant Enterobacteria Pantoea ananatis and Pantoea stewartii.

Science.gov (United States)

De Maayer, Pieter; Aliyu, Habibu; Vikram, Surendra; Blom, Jochen; Duffy, Brion; Cowan, Don A; Smits, Theo H M; Venter, Stephanus N; Coutinho, Teresa A

2017-01-01

Pantoea ananatis is ubiquitously found in the environment and causes disease on a wide range of plant hosts. By contrast, its sister species, Pantoea stewartii subsp. stewartii is the host-specific causative agent of the devastating maize disease Stewart's wilt. This pathogen has a restricted lifecycle, overwintering in an insect vector before being introduced into susceptible maize cultivars, causing disease and returning to overwinter in its vector. The other subspecies, P. stewartii subsp. indologenes, has been isolated from different plant hosts and is predicted to proliferate in different environmental niches. Here we have, by the use of comparative genomics and a comprehensive suite of bioinformatic tools, analyzed the genomes of ten P. stewartii and nineteen P. ananatis strains. Our phylogenomic analyses have revealed that there are two distinct clades within P. ananatis while far less phylogenetic diversity was observed among the P. stewartii subspecies. Pan-genome analyses revealed that a large core genome comprising 3,571 protein coding sequences is shared among the twenty-nine compared strains. Furthermore, we showed that an extensive accessory genome made up largely of a mobilome of plasmids, integrated prophages, integrative and conjugative elements and insertion elements has resulted in extensive diversification of P. stewartii and P. ananatis. While these organisms share many pathogenicity determinants, our comparative genomic analyses show that they differ in terms of the secretion systems they encode. The genomic differences identified in this study have allowed us to postulate on the divergent evolutionary histories of the analyzed P. ananatis and P. stewartii strains and on the molecular basis underlying their ecological success and host range.
Genomic and functional features of the biosurfactant producing Bacillus sp. AM13.

Science.gov (United States)

Shaligram, Shraddha; Kumbhare, Shreyas V; Dhotre, Dhiraj P; Muddeshwar, Manohar G; Kapley, Atya; Joseph, Neetha; Purohit, Hemant P; Shouche, Yogesh S; Pawar, Shrikant P

2016-09-01

Genomic studies provide deeper insights into secondary metabolites produced by diverse bacterial communities residing in various environmental niches. This study aims to understand the potential of a biosurfactant-producing Bacillus sp. AM13, isolated from soil. An integrated approach of genomic and chemical analysis was employed to characterize the antibacterial lipopeptide produced by strain AM13. Genome analysis revealed that strain AM13 harbors a nonribosomal peptide synthetase (NRPS) cluster highly similar to known biosynthetic gene clusters from the surfactin family: lichenysin (85%) and surfactin (78%). These findings were substantiated with supplementary oil displacement assays and surface tension measurements, which confirmed biosurfactant production. Further investigation using an LC-MS approach showed similarity of the biomolecule to biosurfactants of the surfactin family. Our consolidated functional genomics effort provided chemical as well as genetic leads for understanding the biochemical characteristics of the bioactive compound.
Complete genome sequence of Pedobacter heparinus type strain (HIM 762-3T)

Energy Technology Data Exchange (ETDEWEB)

Han, Cliff; Spring, Stefan; Lapidus, Alla; Glavina Del Rio, Tijana; Tice, Hope; Copeland, Alex; Cheng, Jan-Fang; Lucas, Susan; Chen, Feng; Nolan, Matt; Bruce, David; Goodwin, Lynne; Pitluck, Sam; Ivanova, Natalia; Mavrommatis, Konstantinos; Mikhailova, Natalia; Pati, Amrita; Chen, Amy; Palaniappan, Krishna; Land, Miriam; Hauser, Loren; Chang, Yun-Juan; Jefferies, Cynthia C.; Saunders, Elizabeth; Chertkov, Olga; Brettin, Thomas; Goker, Markus; Rohde, Manfred; Bristow, James; Eisen, Jonathan A.; Markowitz, Victor; Hugenholtz, Philip; Kyrpides, Nikos C.; Klenk, Hans-Peter; Detter, John C.

2009-05-20

Pedobacter heparinus (Payza and Korn 1956) Steyn et al. 1998 comb. nov. is the type species of the rapidly growing genus Pedobacter within the family Sphingobacteriaceae of the phylum 'Bacteroidetes'. P. heparinus is of interest, because it was the first isolated strain shown to grow with heparin as sole carbon and nitrogen source and because it produces several enzymes involved in the degradation of mucopolysaccharides. All available data about this species are based on a sole strain that was isolated from dry soil. Here we describe the features of this organism, together with the complete genome sequence, and annotation. This is the first report on a complete genome sequence of a member of the genus Pedobacter, and the 5,167,383 bp long single replicon genome with its 4287 protein-coding and 54 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.
Multi-omics of permafrost, active layer and thermokarst bog soil microbiomes.

Science.gov (United States)

Hultman, Jenni; Waldrop, Mark P; Mackelprang, Rachel; David, Maude M; McFarland, Jack; Blazewicz, Steven J; Harden, Jennifer; Turetsky, Merritt R; McGuire, A David; Shah, Manesh B; VerBerkmoes, Nathan C; Lee, Lang Ho; Mavrommatis, Kostas; Jansson, Janet K

2015-05-14

Over 20% of Earth's terrestrial surface is underlain by permafrost with vast stores of carbon that, once thawed, may represent the largest future transfer of carbon from the biosphere to the atmosphere. This process is largely dependent on microbial responses, but we know little about microbial activity in intact, let alone in thawing, permafrost. Molecular approaches have recently revealed the identities and functional gene composition of microorganisms in some permafrost soils and a rapid shift in functional gene composition during short-term thaw experiments. However, the fate of permafrost carbon depends on climatic, hydrological and microbial responses to thaw at decadal scales. Here we use the combination of several molecular 'omics' approaches to determine the phylogenetic composition of the microbial communities, including several draft genomes of novel species, their functional potential and activity in soils representing different states of thaw: intact permafrost, seasonally thawed active layer and thermokarst bog. The multi-omics strategy reveals a good correlation of process rates to omics data for dominant processes, such as methanogenesis in the bog, as well as novel survival strategies for potentially active microbes in permafrost.
The Genome of the Generalist Plant Pathogen Fusarium avenaceum Is Enriched with Genes Involved in Redox, Signaling and Secondary Metabolism

DEFF Research Database (Denmark)

Lysøe, Erik; Harris, Linda J.; Walkowiak, Sean

2014-01-01

Fusarium avenaceum is a fungus commonly isolated from soil and associated with a wide range of host plants. We present here three genome sequences of F. avenaceum, one isolated from barley in Finland and two from spring and winter wheat in Canada. The sizes of the three genomes range from 41.6-43...
Profiling of gene duplication patterns of sequenced teleost genomes: evidence for rapid lineage-specific genome expansion mediated by recent tandem duplications.

Science.gov (United States)

Lu, Jianguo; Peatman, Eric; Tang, Haibao; Lewis, Joshua; Liu, Zhanjiang

2012-06-15

Gene duplication has had a major impact on genome evolution. Localized (or tandem) duplication resulting from unequal crossing over and whole genome duplication are believed to be the two dominant mechanisms contributing to vertebrate genome evolution. While much scrutiny has been directed toward discerning patterns indicative of whole-genome duplication events in teleost species, less attention has been paid to the continuous nature of gene duplications and their impact on the size, gene content, functional diversity, and overall architecture of teleost genomes. Here, using a Markov clustering algorithm directed approach we catalogue and analyze patterns of gene duplication in the four model teleost species with chromosomal coordinates: zebrafish, medaka, stickleback, and Tetraodon. Our analyses based on set size, duplication type, synonymous substitution rate (Ks), and gene ontology emphasize shared and lineage-specific patterns of genome evolution via gene duplication. Most strikingly, our analyses highlight the extraordinary duplication and retention rate of recent duplicates in zebrafish and their likely role in the structural and functional expansion of the zebrafish genome. We find that the zebrafish genome is remarkable in its large number of duplicated genes, small duplicate set size, biased Ks distribution toward minimal mutational divergence, and proportion of tandem and intra-chromosomal duplicates when compared with the other teleost model genomes. The observed gene duplication patterns have played significant roles in shaping the architecture of teleost genomes and appear to have contributed to the recent functional diversification and divergence of important physiological processes in zebrafish. We have analyzed gene duplication patterns and duplication types among the available teleost genomes and found that a large number of genes were tandemly and intrachromosomally duplicated, suggesting their origin of independent and continuous duplication
Efficient privacy-preserving string search and an application in genomics.

Science.gov (United States)

Shimizu, Kana; Nuida, Koji; Rätsch, Gunnar

2016-06-01

Personal genomes carry inherent privacy risks and protecting privacy poses major social and technological challenges. We consider the case where a user searches for genetic information (e.g. an allele) on a server that stores a large genomic database and aims to receive allele-associated information. The user would like to keep the query and result private, and the server would like to keep the database private. We propose a novel approach that combines efficient string data structures such as the Burrows-Wheeler transform with cryptographic techniques based on additive homomorphic encryption. We assume that the sequence data is searchable in efficient iterative query operations over a large indexed dictionary, for instance, from large genome collections and employing the (positional) Burrows-Wheeler transform. We use a technique called oblivious transfer that is based on additive homomorphic encryption to conceal the sequence query and the genomic region of interest in positional queries. We designed and implemented an efficient algorithm for searching sequences of SNPs in large genome databases. During search, the user can only identify the longest match while the server does not learn which sequence of SNPs the user queried. In an experiment based on 2184 aligned haploid genomes from the 1000 Genomes Project, our algorithm was able to perform typical queries within approximately 4.6 s and approximately 10.8 s for client and server side, respectively, on laptop computers. The presented algorithm is at least one order of magnitude faster than an exhaustive baseline algorithm. Availability: https://github.com/iskana/PBWT-sec and https://github.com/ratschlab/PBWT-sec. Contact: shimizu-kana@aist.go.jp or Gunnar.Ratsch@ratschlab.org. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.
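As background for the string index named in this record, the following is a minimal sketch of the plain Burrows-Wheeler transform with iterative backward search. It is not the authors' algorithm: it covers neither the positional variant nor any encryption or oblivious transfer, and all function names are illustrative. Each loop iteration of `count_matches` is one "iterative query operation" over the index, which is the step a privacy-preserving scheme would have to conceal.

```python
def bwt(text):
    """Burrows-Wheeler transform via sorted rotations (fine for short strings)."""
    text += "$"  # sentinel that sorts before A, C, G, T
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    return "".join(rot[-1] for rot in rotations)


def build_index(bwt_string):
    """Build the C table (count of smaller characters) and Occ prefix counts."""
    alphabet = sorted(set(bwt_string))
    first_row = {}
    total = 0
    for ch in alphabet:
        first_row[ch] = total
        total += bwt_string.count(ch)
    # occ[ch][i] = number of occurrences of ch in bwt_string[:i]
    occ = {ch: [0] for ch in alphabet}
    for symbol in bwt_string:
        for ch in alphabet:
            occ[ch].append(occ[ch][-1] + (symbol == ch))
    return first_row, occ


def count_matches(pattern, first_row, occ):
    """Backward search: number of occurrences of pattern in the indexed text."""
    low, high = 0, len(next(iter(occ.values()))) - 1
    for ch in reversed(pattern):
        if ch not in first_row:
            return 0
        # Narrow the interval of sorted rotations that start with the suffix.
        low = first_row[ch] + occ[ch][low]
        high = first_row[ch] + occ[ch][high]
        if low >= high:
            return 0
    return high - low


text = "GATTACATTA"
transformed = bwt(text)
first_row, occ = build_index(transformed)
hits = count_matches("ATTA", first_row, occ)
```

The rotation-sorting construction is quadratic in memory and only suitable for toy inputs; practical indexes build the transform from a suffix array and sample the Occ table.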
A decade of human genome project conclusion: Scientific diffusion about our genome knowledge.

Science.gov (United States)

Moraes, Fernanda; Góes, Andréa

2016-05-06

The Human Genome Project (HGP) was initiated in 1990 and completed in 2003. It aimed to sequence the whole human genome. Although it represented an advance in understanding the human genome and its complexity, many questions remained unanswered. Other projects were launched in order to unravel the mysteries of our genome, including the ENCyclopedia of DNA Elements (ENCODE). This review aims to analyze the evolution of scientific knowledge related to both the HGP and ENCODE projects. Data were retrieved from scientific articles published in 1990-2014, a period comprising the development and the 10 years following the HGP completion. The fact that only 20,000 genes are protein and RNA-coding is one of the most striking HGP results. A new concept about the organization of the genome arose. The ENCODE project was initiated in 2003 and aimed to map the functional elements of the human genome. This project revealed that the human genome is pervasively transcribed. Therefore, it was determined that a large part of the non-protein coding regions are functional. Finally, a more sophisticated view of chromatin structure emerged. The mechanistic functioning of the genome has been redrafted, revealing a much more complex picture. In addition, the gene-centric conception of the organism has to be reviewed. A number of criticisms have emerged against the ENCODE project approaches, raising the question of whether non-conserved but biochemically active regions are truly functional. Thus, the HGP and ENCODE projects produced a great map of the human genome, but the data generated still require further in-depth analysis. © 2016 by The International Union of Biochemistry and Molecular Biology, 44:215-223, 2016.
Radionuclide migration test using undisturbed aerated soil

International Nuclear Information System (INIS)

Yamamoto, Tadatoshi; Ohtsuka, Yoshiro; Ogawa, Hiromichi; Wadachi, Yoshiki

1988-01-01

As one of the most important parts of the safety assessment of the shallow land disposal of low-level radioactive waste, radionuclide migration was studied using undisturbed soil samples, in order to evaluate radionuclide migration in an aerated soil layer accurately. Soil samples used in the migration test were coastal sand and loamy soil, which form typical surface soil layers in Japan. An aqueous solution containing ⁶⁰CoCl₂, ⁸⁵SrCl₂ and ¹³⁷CsCl was fed into the soil column and the concentration of each radionuclide both in the effluent and in the soil was measured. A large amount of the radionuclides was adsorbed at the surface of the soil column and a small amount moved deep into the soil column. A difference in the radionuclide profile was observed particularly in the low-concentration portion. This is attributed to some fractions of ⁶⁰Co and ¹³⁷Cs being stable in non-ionic form and moving downward through the soil column together with water. The radionuclide distribution at the surface of the soil column can be predicted fairly well with a conventional migration equation for ionic radionuclides. As a result of radionuclide adsorption, aerated soil layers of both coastal sand and loamy soil have a large barrier ability against radionuclide migration through the ground. (author)
Sifting through genomes with iterative-sequence clustering produces a large, phylogenetically diverse protein-family resource

Directory of Open Access Journals (Sweden)

Sharpton Thomas J

2012-10-01

Background: New computational resources are needed to manage the increasing volume of biological data from genome sequencing projects. One fundamental challenge is the ability to maintain a complete and current catalog of protein diversity. We developed a new approach for the identification of protein families that focuses on the rapid discovery of homologous protein sequences. Results: We implemented fully automated and high-throughput procedures to de novo cluster proteins into families based upon global alignment similarity. Our approach employs an iterative clustering strategy in which homologs of known families are sifted out of the search for new families. The resulting reduction in computational complexity enables us to rapidly identify novel protein families found in new genomes and to perform efficient, automated updates that keep pace with genome sequencing. We refer to protein families identified through this approach as “Sifting Families,” or SFams. Our analysis of ~10.5 million protein sequences from 2,928 genomes identified 436,360 SFams, many of which are not represented in other protein family databases. We validated the quality of SFam clustering through statistical as well as network topology–based analyses. Conclusions: We describe the rapid identification of SFams and demonstrate how they can be used to annotate genomes and metagenomes. The SFam database catalogs protein-family quality metrics, multiple sequence alignments, hidden Markov models, and phylogenetic trees. Our source code and database are publicly available and will be subject to frequent updates (http://edhar.genomecenter.ucdavis.edu/sifting_families/).
SoilInfo App: global soil information on your palm

Science.gov (United States)

Hengl, Tomislav; Mendes de Jesus, Jorge

2015-04-01

ISRIC - World Soil Information released in 2014 an app for mobile devices called 'SoilInfo' (http://soilinfo-app.org), which aims at providing free access to global soil data. SoilInfo App (available for Android v.4.0 Ice Cream Sandwich or higher, and Apple iOS v.6.x and v.7.x) currently serves the SoilGrids1km data: a stack of soil property and class maps at six standard depths at a resolution of 1 km (30 arc second) predicted using automated geostatistical mapping and global soil data models. The list of served soil data includes: soil organic carbon (), soil pH, sand, silt and clay fractions (%), bulk density (kg/m3), cation exchange capacity of the fine earth fraction (cmol+/kg), coarse fragments (%), World Reference Base soil groups, and USDA Soil Taxonomy suborders (DOI: 10.1371/journal.pone.0105992). New soil properties and classes will be continuously added to the system. SoilGrids1km data are available for download under a Creative Commons non-commercial license via http://soilgrids.org. They are also accessible via a Representational State Transfer (REST) API service (http://rest.soilgrids.org). SoilInfo App mimics common weather apps, but is also largely inspired by crowdsourcing systems such as OpenStreetMap, Geo-wiki and similar. Two development aspects of the SoilInfo App and SoilGrids are constantly being worked on: data quality, in terms of accuracy of spatial predictions and derived information, and data usability, in terms of ease of access and ease of use (i.e. flexibility of the cyberinfrastructure and functionalities such as the REST SoilGrids API, SoilInfo App etc.). The development focus in 2015 is on improving the thematic and spatial accuracy of SoilGrids predictions, primarily by using finer resolution covariates (250 m) and machine learning algorithms (such as random forests) to improve spatial predictions.
Assembly of viral genomes from metagenomes

Directory of Open Access Journals (Sweden)

Saskia L Smits

2014-12-01

Viral infections remain a serious global health issue. Metagenomic approaches are increasingly used not only in the detection of novel viral pathogens but also to generate complete genomes of uncultivated viruses. In silico identification of complete viral genomes from sequence data would allow rapid phylogenetic characterization of these new viruses. Often, however, complete viral genomes are not recovered, but rather several distinct contigs derived from a single entity, some of which have no sequence homology to any known proteins. De novo assembly of single viruses from a metagenome is challenging, not only because of the lack of a reference genome, but also because of intrapopulation variation and uneven or insufficient coverage. Here we explored different assembly algorithms, remote homology searches, genome-specific sequence motifs, k-mer frequency ranking, and coverage profile binning to detect and obtain viral target genomes from metagenomes. All methods were tested on 454-generated sequencing datasets containing three recently described RNA viruses with relatively large genomes which were divergent from previously known viruses of the viral families Rhabdoviridae and Coronaviridae. Depending on specific characteristics of the target virus and the metagenomic community, different assembly and in silico gap closure strategies were successful in obtaining near complete viral genomes.
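One way to picture the k-mer frequency ranking mentioned in this record is the sketch below. It is an assumption-laden illustration rather than the authors' pipeline: contigs are ranked by the Manhattan distance between their normalized k-mer frequency profile and that of a known target fragment, and the function names, distance measure, and toy sequences are all hypothetical.

```python
from collections import Counter
from itertools import product


def kmer_profile(sequence, k=4):
    """Return normalized k-mer frequencies over the ACGT alphabet."""
    sequence = sequence.upper()
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    # k-mers containing other symbols (e.g. N) are ignored in the total.
    total = sum(counts[km] for km in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[km] / total for km in kmers]


def profile_distance(a, b):
    """Manhattan distance between two frequency profiles (0 = identical)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def rank_contigs(target, contigs, k=4):
    """Sort contig names by k-mer profile distance to the target sequence."""
    target_profile = kmer_profile(target, k)
    scored = {
        name: profile_distance(target_profile, kmer_profile(seq, k))
        for name, seq in contigs.items()
    }
    return sorted(scored, key=scored.get)


# Toy sequences, chosen only to make the ranking obvious.
target = "ATATATATATATATAT"
contigs = {"different": "GCGCGCGCGCGCGC", "similar": "TATATATATATATA"}
ranking = rank_contigs(target, contigs, k=2)
```

Real contigs are usually profiled on both strands (merging each k-mer with its reverse complement), which this sketch omits for brevity.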
Diversity and Symbiotic Characteristics of Cowpea Bradyrhizobium Strains in Ghanaian Soils

International Nuclear Information System (INIS)

Fening, Joseph Opoku

1999-08-01

Analysis of the 16S rRNA gene of the isolates by PCR-RFLP identified 20 different composite genotypes. Diversity among the genomic species identified was very high, reaching 80% diversity. The various methods used indicated large diversity among the isolates, but the groupings of the isolates by the various methods were inconsistent, due to the different levels of resolution of the methods. Diversity of the isolates in symbiotic effectiveness showed that some of the isolates had high nitrogen fixing capabilities that were comparable to plants fertilized with inorganic fertilizer nitrogen. Some of the isolates even showed superiority in symbiotic effectiveness relative to the standard strain TAL 169, suggesting that the native isolates may be useful strains for cowpea inoculation. The Gus A marker gene technique was used to assess the competitive abilities of the effective and ineffective isolates. Competition between the isolates was examined at different population ratios. The results obtained indicated that competitive ability was not directly related to effectiveness of strains. Inoculation of cowpea with indigenous bradyrhizobia isolates increased the number of nodules, shoot dry weight and total nitrogen of plants. The method of inoculation was observed to influence these parameters. The results indicated that response of cowpea to inoculation in the presence of native rhizobia in some soils is possible. (au)
Methodological advances to study the diversity of soil protists and their functioning in soil food webs

NARCIS (Netherlands)

Geisen, Stefan; Bonkowski, Michael

2018-01-01

Soils host the most complex communities of organisms, which are still largely considered as an unknown ‘black box’. A key role in soil food webs is held by the highly abundant and diverse group of protists. Traditionally, soil protists are considered as the main consumers of bacteria in
Genomic prediction in contrast to a genome-wide association study in explaining heritable variation of complex growth traits in breeding populations of Eucalyptus.

Science.gov (United States)

Müller, Bárbara S F; Neves, Leandro G; de Almeida Filho, Janeo E; Resende, Márcio F R; Muñoz, Patricio R; Dos Santos, Paulo E T; Filho, Estefano Paludzyszyn; Kirst, Matias; Grattapaglia, Dario

2017-07-11

The advent of high-throughput genotyping technologies coupled to genomic prediction methods established a new paradigm to integrate genomics and breeding. We carried out whole-genome prediction and contrasted it to a genome-wide association study (GWAS) for growth traits in breeding populations of Eucalyptus benthamii (n = 505) and Eucalyptus pellita (n = 732). Both species are of increasing commercial interest for the development of germplasm adapted to environmental stresses. Predictive ability reached 0.16 in E. benthamii and 0.44 in E. pellita for diameter growth. Predictive abilities using either Genomic BLUP or different Bayesian methods were similar, suggesting that growth adequately fits the infinitesimal model. Genomic prediction models using ~5000-10,000 SNPs provided predictive abilities equivalent to using all 13,787 and 19,506 SNPs genotyped in the E. benthamii and E. pellita populations, respectively. No difference was detected in predictive ability when different sets of SNPs were utilized, based on position (equidistantly genome-wide, inside genes, linkage disequilibrium pruned or on single chromosomes), as long as the total number of SNPs used was above ~5000. Predictive abilities obtained by removing relatedness between training and validation sets fell near zero for E. benthamii and were halved for E. pellita. These results corroborate the current view that relatedness is the main driver of genomic prediction, although some short-range historical linkage disequilibrium (LD) was likely captured for E. pellita. A GWAS identified only one significant association for volume growth in E. pellita, illustrating the fact that while genome-wide regression is able to account for large proportions of the heritability, very little or none of it is captured into significant associations using GWAS in breeding populations of the size evaluated in this study. This study provides further experimental data supporting positive prospects of using genome-wide data to
