WorldWideScience

Sample records for genome-scale promoter mapping

  1. Genome-wide mapping of autonomous promoter activity in human cells.

    Science.gov (United States)

    van Arensbergen, Joris; FitzPatrick, Vincent D; de Haas, Marcel; Pagie, Ludo; Sluimer, Jasper; Bussemaker, Harmen J; van Steensel, Bas

    2017-02-01

    Previous methods to systematically characterize sequence-intrinsic activity of promoters have been limited by relatively low throughput and the length of the sequences that could be tested. Here we present 'survey of regulatory elements' (SuRE), a method that assays more than 10^8 DNA fragments, each 0.2-2 kb in size, for their ability to drive transcription autonomously. In SuRE, a plasmid library of random genomic fragments upstream of a 20-bp barcode is constructed, and decoded by paired-end sequencing. This library is used to transfect cells, and barcodes in transcribed RNA are quantified by high-throughput sequencing. When applied to the human genome, we achieve 55-fold genome coverage, allowing us to map autonomous promoter activity genome-wide in K562 cells. By computational modeling we delineate subregions within promoters that are relevant for their activity. We show that antisense promoter transcription is generally dependent on the sense core promoter sequences, and that most enhancers and several families of repetitive elements act as autonomous transcription initiation sites.

  2. BioNano genome mapping of individual chromosomes supports physical mapping and sequence assembly in complex plant genomes.

    Science.gov (United States)

    Staňková, Helena; Hastie, Alex R; Chan, Saki; Vrána, Jan; Tulpová, Zuzana; Kubaláková, Marie; Visendi, Paul; Hayashi, Satomi; Luo, Mingcheng; Batley, Jacqueline; Edwards, David; Doležel, Jaroslav; Šimková, Hana

    2016-07-01

    The assembly of a reference genome sequence of bread wheat is challenging due to its specific features such as the genome size of 17 Gbp, polyploid nature and prevalence of repetitive sequences. BAC-by-BAC sequencing based on chromosomal physical maps, adopted by the International Wheat Genome Sequencing Consortium as the key strategy, reduces problems caused by the genome complexity and polyploidy, but the repeat content still hampers the sequence assembly. Availability of a high-resolution genomic map to guide sequence scaffolding and validate physical map and sequence assemblies would be highly beneficial to obtaining an accurate and complete genome sequence. Here, we chose the short arm of chromosome 7D (7DS) as a model to demonstrate for the first time that it is possible to couple chromosome flow sorting with genome mapping in nanochannel arrays and create a de novo genome map of a wheat chromosome. We constructed a high-resolution chromosome map composed of 371 contigs with an N50 of 1.3 Mb. Long DNA molecules achieved by our approach facilitated chromosome-scale analysis of repetitive sequences and revealed a ~800-kb array of tandem repeats intractable to current DNA sequencing technologies. Anchoring 7DS sequence assemblies obtained by clone-by-clone sequencing to the 7DS genome map provided a valuable tool to improve the BAC-contig physical map and validate sequence assembly on a chromosome-arm scale. Our results indicate that creating genome maps for the whole wheat genome in a chromosome-by-chromosome manner is feasible and that they will be an affordable tool to support the production of improved pseudomolecules. © 2016 The Authors. Plant Biotechnology Journal published by Society for Experimental Biology and The Association of Applied Biologists and John Wiley & Sons Ltd.
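
The N50 statistic quoted in the abstract above (371 contigs with an N50 of 1.3 Mb) can be illustrated with a minimal sketch. It uses the usual definition, the contig length at which the cumulative length of contigs sorted from longest to shortest first reaches half of the total; the function name and the example lengths are hypothetical and are not the 7DS data.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    The N50 is the length L such that contigs of length >= L together
    account for at least half of the summed length of all contigs.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    # Walk from the longest contig downwards, accumulating length.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical contig lengths in kb, for illustration only.
example_contigs_kb = [2000, 1500, 1300, 800, 400]
print(n50(example_contigs_kb))
```

Comparing `2 * running` with `total` keeps the arithmetic in integers, so the half-way point is tested without rounding.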

  3. Mapping copy number variation by population-scale genome sequencing

    DEFF Research Database (Denmark)

    Mills, Ryan E.; Walter, Klaudia; Stewart, Chip

    2011-01-01

    Genomic structural variants (SVs) are abundant in humans, differing from other forms of variation in extent, origin and functional impact. Despite progress in SV characterization, the nucleotide resolution architecture of most SVs remains unknown. We constructed a map of unbalanced SVs (that is......, copy number variants) based on whole genome DNA sequencing data from 185 human genomes, integrating evidence from complementary SV discovery approaches with extensive experimental validations. Our map encompassed 22,025 deletions and 6,000 additional SVs, including insertions and tandem duplications...

  4. Body maps on the human genome.

    Science.gov (United States)

    Cherniak, Christopher; Rodriguez-Esteban, Raul

    2013-12-20

    Chromosomes have territories, or preferred locales, in the cell nucleus. When these sites are taken into account, some large-scale structure of the human genome emerges. The synoptic picture is that genes highly expressed in particular topologically compact tissues are not randomly distributed on the genome. Rather, such tissue-specific genes tend to map somatotopically onto the complete chromosome set. They seem to form a "genome homunculus": a multi-dimensional, genome-wide body representation extending across chromosome territories of the entire sperm cell nucleus. The antero-posterior axis of the body significantly corresponds to the head-tail axis of the nucleus, and the dorso-ventral body axis to the central-peripheral nucleus axis. This large-scale genomic structure includes thousands of genes. One rationale for a homuncular genome structure would be to minimize connection costs in genetic networks. Somatotopic maps in cerebral cortex have been reported for over a century.

  5. Single-molecule optical genome mapping of a human HapMap and a colorectal cancer cell line.

    Science.gov (United States)

    Teo, Audrey S M; Verzotto, Davide; Yao, Fei; Nagarajan, Niranjan; Hillmer, Axel M

    2015-01-01

    Next-generation sequencing (NGS) technologies have changed our understanding of the variability of the human genome. However, the identification of genome structural variations based on NGS approaches with read lengths of 35-300 bases remains a challenge. Single-molecule optical mapping technologies allow the analysis of DNA molecules of up to 2 Mb and as such are suitable for the identification of large-scale genome structural variations, and for de novo genome assemblies when combined with short-read NGS data. Here we present optical mapping data for two human genomes: the HapMap cell line GM12878 and the colorectal cancer cell line HCT116. High molecular weight DNA was obtained by embedding GM12878 and HCT116 cells, respectively, in agarose plugs, followed by DNA extraction under mild conditions. Genomic DNA was digested with KpnI and 310,000 and 296,000 DNA molecules (≥ 150 kb and 10 restriction fragments), respectively, were analyzed per cell line using the Argus optical mapping system. Maps were aligned to the human reference by OPTIMA, a new glocal alignment method. Genome coverage of 6.8× and 5.7× was obtained, respectively; 2.9× and 1.7× more than the coverage obtained with previously available software. Optical mapping allows the resolution of large-scale structural variations of the genome, and the scaffold extension of NGS-based de novo assemblies. OPTIMA is an efficient new alignment method; our optical mapping data provide a resource for genome structure analyses of the human HapMap reference cell line GM12878, and the colorectal cancer cell line HCT116.

  6. Genome Maps, a new generation genome browser.

    Science.gov (United States)

    Medina, Ignacio; Salavert, Francisco; Sanchez, Rubén; de Maria, Alejandro; Alonso, Roberto; Escobar, Pablo; Bleda, Marta; Dopazo, Joaquín

    2013-07-01

    Genome browsers have gained importance as more genomes and related genomic information become available. However, the increase of information brought about by new generation sequencing technologies is, at the same time, causing a subtle but continuous decrease in the efficiency of conventional genome browsers. Here, we present Genome Maps, a genome browser that implements an innovative model of data transfer and management. The program uses highly efficient technologies from the new HTML5 standard, such as scalable vector graphics, that optimize workloads at both server and client sides and ensure future scalability. Thus, data management and representation are entirely carried out by the browser, without the need of any Java Applet, Flash or other plug-in technology installation. Relevant biological data on genes, transcripts, exons, regulatory features, single-nucleotide polymorphisms, karyotype and so forth, are imported from web services and are available as tracks. In addition, several DAS servers are already included in Genome Maps. As a novelty, this web-based genome browser allows the local upload of huge genomic data files (e.g. VCF or BAM) that can be dynamically visualized in real time at the client side, thus facilitating the management of medical data affected by privacy restrictions. Finally, Genome Maps can easily be integrated in any web application by including only a few lines of code. Genome Maps is an open source collaborative initiative available in the GitHub repository (https://github.com/compbio-bigdata-viz/genome-maps). Genome Maps is available at: http://www.genomemaps.org.

  7. A genome-wide, fine-scale map of natural pigmentation variation in Drosophila melanogaster.

    Directory of Open Access Journals (Sweden)

    Héloïse Bastide

    2013-06-01

    Various approaches can be applied to uncover the genetic basis of natural phenotypic variation, each with their specific strengths and limitations. Here, we use a replicated genome-wide association approach (Pool-GWAS) to fine-scale map genomic regions contributing to natural variation in female abdominal pigmentation in Drosophila melanogaster, a trait that is highly variable in natural populations and highly heritable in the laboratory. We examined abdominal pigmentation phenotypes in approximately 8000 female European D. melanogaster, isolating 1000 individuals with extreme phenotypes. We then used whole-genome Illumina sequencing to identify single nucleotide polymorphisms (SNPs) segregating in our sample, and tested these for associations with pigmentation by contrasting allele frequencies between replicate pools of light and dark individuals. We identify two small regions near the pigmentation genes tan and bric-à-brac 1, both corresponding to known cis-regulatory regions, which contain SNPs showing significant associations with pigmentation variation. While the Pool-GWAS approach suffers some limitations, its cost advantage facilitates replication and it can be applied to any non-model system with an available reference genome.

  8. A genome-wide, fine-scale map of natural pigmentation variation in Drosophila melanogaster.

    Science.gov (United States)

    Bastide, Héloïse; Betancourt, Andrea; Nolte, Viola; Tobler, Raymond; Stöbe, Petra; Futschik, Andreas; Schlötterer, Christian

    2013-06-01

    Various approaches can be applied to uncover the genetic basis of natural phenotypic variation, each with their specific strengths and limitations. Here, we use a replicated genome-wide association approach (Pool-GWAS) to fine-scale map genomic regions contributing to natural variation in female abdominal pigmentation in Drosophila melanogaster, a trait that is highly variable in natural populations and highly heritable in the laboratory. We examined abdominal pigmentation phenotypes in approximately 8000 female European D. melanogaster, isolating 1000 individuals with extreme phenotypes. We then used whole-genome Illumina sequencing to identify single nucleotide polymorphisms (SNPs) segregating in our sample, and tested these for associations with pigmentation by contrasting allele frequencies between replicate pools of light and dark individuals. We identify two small regions near the pigmentation genes tan and bric-à-brac 1, both corresponding to known cis-regulatory regions, which contain SNPs showing significant associations with pigmentation variation. While the Pool-GWAS approach suffers some limitations, its cost advantage facilitates replication and it can be applied to any non-model system with an available reference genome.

  9. Genome-scale portrait and evolutionary significance of human-specific core promoter tri- and tetranucleotide short tandem repeats.

    Science.gov (United States)

    Nazaripanah, N; Adelirad, F; Delbari, A; Sahaf, R; Abbasi-Asl, T; Ohadi, M

    2018-04-05

    While there is an ongoing trend to identify single nucleotide substitutions (SNSs) that are linked to inter/intra-species differences and disease phenotypes, short tandem repeats (STRs)/microsatellites may be of equal (if not more) importance in the above processes. Genes that contain STRs in their promoters have higher expression divergence compared to genes with fixed or no STRs in the gene promoters. In line with the above, recent reports indicate a role of repetitive sequences in the rise of young transcription start sites (TSSs) in human evolution. Following a comparative genomics study of all human protein-coding genes annotated in the GeneCards database, here we provide a genome-scale portrait of human-specific short- and medium-size (≥ 3-repeats) tri- and tetranucleotide STRs and STR motifs in the critical core promoter region between -120 and +1 to the TSS and evidence of skewing of this compartment in reference to the STRs that are not human-specific (Levene's test p [...] human-specific transcripts was detected in the tri and tetra human-specific compartments (mid-p [...] genome-scale skewing of STRs at a specific region of the human genome and a link between a number of these STRs and TSS selection/transcript specificity. The STRs and genes listed here may have a role in the evolution and development of characteristics and phenotypes that are unique to the human species.

  10. Sexagesimal scale for mapping human genome

    Directory of Open Access Journals (Sweden)

    RICARDO CRUZ-COKE

    2001-03-01

    In a previous work I designed a diagram of the human genome based on a circular ideogram of the haploid set of chromosomes, using a low-resolution scale of megabase units. The purpose of this work is to draft a new scale to measure the physical map of the human genome at the highest level of resolution. The entire length of the haploid genome of males is laid out on a circumference marked with a sexagesimal scale of 360 degrees and 1296000 arcseconds. The radius of this circumference displays a semilogarithmic metric scale from 1 m to the nanometer level. The base-pair level of DNA sequences, 10^-9 of this circumference, is measured in milliarcsecond units (mas), each equivalent to a thousandth of an arcsecond. The "mas" unit corresponds to 1.27 nanometers (nm) or 0.427 base pair (bp), and it is the framework for measuring DNA sequences. Thus the three billion base pairs of the human genome may be identified by 1296000000 "mas" units in continuous correlation from number 1 to number 1296000000. This sexagesimal scale covers all the levels of the nuclear genetic material, from nucleotides to chromosomes. The locations of every codon and every gene may be numbered in the physical map of chromosome regions according to this new scale, instead of the partial kilobase and megabase scales used today. The advantage of the new scale is the unification of the set of chromosomes under a continuous scale of measurement at the DNA level, facilitating the correlation with the phenotypes of man and other species.
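
The unit arithmetic behind the scale can be sketched as follows: 360 degrees of 60 arcminutes, each of 60 arcseconds, each of 1000 milliarcseconds, gives the 1296000000 mas units named in the abstract. The function name and the 0-based offset convention are assumptions made for illustration (the abstract numbers its units from 1), and the sketch does not attempt to map mas units onto base pairs.

```python
MAS_PER_ARCSEC = 1000
ARCSEC_PER_ARCMIN = 60
ARCMIN_PER_DEGREE = 60
DEGREES_PER_CIRCLE = 360

# Total number of milliarcseconds in the full circle.
TOTAL_MAS = (DEGREES_PER_CIRCLE * ARCMIN_PER_DEGREE
             * ARCSEC_PER_ARCMIN * MAS_PER_ARCSEC)


def mas_to_dms(offset_mas):
    """Split a 0-based offset in mas into (degrees, arcmin, arcsec, mas)."""
    if not 0 <= offset_mas < TOTAL_MAS:
        raise ValueError("offset must lie within one full circle")
    arcsec_total, mas = divmod(offset_mas, MAS_PER_ARCSEC)
    arcmin_total, arcsec = divmod(arcsec_total, ARCSEC_PER_ARCMIN)
    degrees, arcmin = divmod(arcmin_total, ARCMIN_PER_DEGREE)
    return degrees, arcmin, arcsec, mas


print(TOTAL_MAS)                # 1296000000
print(mas_to_dms(648_000_000))  # (180, 0, 0, 0), half way round the circle
```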

  11. Comparison of HapMap and 1000 Genomes Reference Panels in a Large-Scale Genome-Wide Association Study.

    Directory of Open Access Journals (Sweden)

    Paul S de Vries

    An increasing number of genome-wide association (GWA) studies are now using the higher resolution 1000 Genomes Project reference panel (1000G) for imputation, with the expectation that 1000G imputation will lead to the discovery of additional associated loci when compared to HapMap imputation. In order to assess the improvement of 1000G over HapMap imputation in identifying associated loci, we compared the results of GWA studies of circulating fibrinogen based on the two reference panels. Using both HapMap and 1000G imputation we performed a meta-analysis of 22 studies comprising the same 91,953 individuals. We identified six additional signals using 1000G imputation, while 29 loci were associated using both HapMap and 1000G imputation. One locus identified using HapMap imputation was not significant using 1000G imputation. The genome-wide significance threshold of 5×10^-8 is based on the number of independent statistical tests using HapMap imputation, and 1000G imputation may lead to further independent tests that should be corrected for. When using a stricter Bonferroni correction for the 1000G GWA study (P-value < 2.5×10^-8), the number of loci significant only using HapMap imputation increased to 4 while the number of loci significant only using 1000G decreased to 5. In conclusion, 1000G imputation enabled the identification of 20% more loci than HapMap imputation, although the advantage of 1000G imputation became less clear when a stricter Bonferroni correction was used. More generally, our results provide insights that are applicable to the implementation of other dense reference panels that are under development.
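
The relation between the two significance thresholds can be made concrete with a small Bonferroni sketch. The family-wise error rate of 0.05 is an assumption (the abstract does not state it), and the function names are hypothetical. Under that assumption, 5×10^-8 corresponds to about one million independent tests and 2.5×10^-8 to about two million.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold for a family-wise error rate alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def implied_tests(alpha, threshold):
    """Number of independent tests implied by a per-test threshold."""
    return alpha / threshold


ALPHA = 0.05  # assumed family-wise error rate, not given in the abstract

print(bonferroni_threshold(ALPHA, 1_000_000))  # about 5e-08
print(round(implied_tests(ALPHA, 2.5e-8)))     # about 2000000
```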

  12. Comparison of HapMap and 1000 Genomes Reference Panels in a Large-Scale Genome-Wide Association Study

    DEFF Research Database (Denmark)

    de Vries, Paul S; Sabater-Lleal, Maria; Chasman, Daniel I

    2017-01-01

    An increasing number of genome-wide association (GWA) studies are now using the higher resolution 1000 Genomes Project reference panel (1000G) for imputation, with the expectation that 1000G imputation will lead to the discovery of additional associated loci when compared to HapMap imputation. In...

  13. Genome-scale regression analysis reveals a linear relationship for promoters and enhancers after combinatorial drug treatment

    KAUST Repository

    Rapakoulia, Trisevgeni

    2017-08-09

    Motivation: Drug combination therapy for treatment of cancers and other multifactorial diseases has the potential of increasing the therapeutic effect, while reducing the likelihood of drug resistance. In order to reduce time and cost spent in comprehensive screens, methods are needed which can model additive effects of possible drug combinations. Results: We here show that the transcriptional response to combinatorial drug treatment at promoters, as measured by single molecule CAGE technology, is accurately described by a linear combination of the responses of the individual drugs at a genome wide scale. We also find that the same linear relationship holds for transcription at enhancer elements. We conclude that the described approach is promising for eliciting the transcriptional response to multidrug treatment at promoters and enhancers in an unbiased genome wide way, which may minimize the need for exhaustive combinatorial screens.
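
As a rough illustration of the linear relationship described above, the response to a two-drug combination can be fitted as a weighted sum of the single-drug responses. This is only a sketch under stated assumptions: the abstract does not give the authors' exact model, so the no-intercept form, the function name and the toy numbers are all hypothetical.

```python
def fit_two_drug_model(x1, x2, y):
    """Least-squares fit of y ~ a*x1 + b*x2 (no intercept).

    x1, x2: per-promoter responses to each single drug.
    y: per-promoter response to the combination.
    Solves the 2x2 normal equations directly.
    """
    if not (len(x1) == len(x2) == len(y)):
        raise ValueError("all inputs must have the same length")
    s11 = sum(u * u for u in x1)
    s12 = sum(u * v for u, v in zip(x1, x2))
    s22 = sum(v * v for v in x2)
    t1 = sum(u * w for u, w in zip(x1, y))
    t2 = sum(v * w for v, w in zip(x2, y))
    det = s11 * s22 - s12 * s12
    if det == 0:
        raise ValueError("single-drug responses are collinear")
    a = (t1 * s22 - t2 * s12) / det
    b = (t2 * s11 - t1 * s12) / det
    return a, b


# Toy data constructed so that y = 0.5*x1 + 2*x2 exactly.
drug_a = [1.0, 0.0, 2.0, 1.0]
drug_b = [0.0, 1.0, 1.0, 3.0]
combo = [0.5, 2.0, 3.0, 6.5]
print(fit_two_drug_model(drug_a, drug_b, combo))
```

With real data the fit would not be exact, and the residuals would indicate how well the additive description holds.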

  14. INE: a rice genome database with an integrated map view.

    Science.gov (United States)

    Sakata, K; Antonio, B A; Mukai, Y; Nagasaki, H; Sakai, Y; Makino, K; Sasaki, T

    2000-01-01

    The Rice Genome Research Program (RGP) launched a large-scale rice genome sequencing in 1998 aimed at decoding all genetic information in rice. A new genome database called INE (INtegrated rice genome Explorer) has been developed in order to integrate all the genomic information that has been accumulated so far and to correlate these data with the genome sequence. A web interface based on a Java applet provides a rapid viewing capability in the database. The first operational version of the database has been completed which includes a genetic map, a physical map using YAC (Yeast Artificial Chromosome) clones and PAC (P1-derived Artificial Chromosome) contigs. These maps are displayed graphically so that the positional relationships among the mapped markers on each chromosome can be easily resolved. INE incorporates the sequences and annotations of the PAC contig. A site on low quality information ensures that all submitted sequence data comply with the standard for accuracy. As a repository of rice genome sequence, INE will also serve as a common database of all sequence data obtained by collaborating members of the International Rice Genome Sequencing Project (IRGSP). The database can be accessed at http://www.dna.affrc.go.jp:82/giot/INE.html or its mirror site at http://www.staff.or.jp/giot/INE.html

  15. Mapping the space of genomic signatures.

    Directory of Open Access Journals (Sweden)

    Lila Kari

    We propose a computational method to measure and visualize interrelationships among any number of DNA sequences allowing, for example, the examination of hundreds or thousands of complete mitochondrial genomes. An "image distance" is computed for each pair of graphical representations of DNA sequences, and the distances are visualized as a Molecular Distance Map: Each point on the map represents a DNA sequence, and the spatial proximity between any two points reflects the degree of structural similarity between the corresponding sequences. The graphical representation of DNA sequences utilized, Chaos Game Representation (CGR), is genome- and species-specific and can thus act as a genomic signature. Consequently, Molecular Distance Maps could inform species identification, taxonomic classifications and, to a certain extent, evolutionary history. The image distance employed, Structural Dissimilarity Index (DSSIM), implicitly compares the occurrences of oligomers of length up to k (herein k = 9) in DNA sequences. We computed DSSIM distances for more than 5 million pairs of complete mitochondrial genomes, and used Multi-Dimensional Scaling (MDS) to obtain Molecular Distance Maps that visually display the sequence relatedness in various subsets, at different taxonomic levels. This general-purpose method does not require DNA sequence alignment and can thus be used to compare similar or vastly different DNA sequences, genomic or computer-generated, of the same or different lengths. We illustrate potential uses of this approach by applying it to several taxonomic subsets: phylum Vertebrata, (super)kingdom Protista, classes Amphibia-Insecta-Mammalia, class Amphibia, and order Primates. This analysis of an extensive dataset confirms that the oligomer composition of full mtDNA sequences can be a source of taxonomic information. This method also correctly finds the mtDNA sequences most closely related to that of the anatomically modern human (the Neanderthal
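
The Chaos Game Representation mentioned above can be sketched in a few lines: starting from the centre of the unit square, each nucleotide moves the current point half way towards that nucleotide's corner. The corner assignment below is one common convention and is an assumption here, as are the function names. The sketch covers only the CGR point set and a k-mer count grid, not the DSSIM distance or the MDS step.

```python
# Assumed corner assignment (one common convention, not taken from the abstract).
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq):
    """Return the list of CGR points for a DNA sequence of A, C, G, T."""
    x, y = 0.5, 0.5  # start at the centre of the unit square
    points = []
    for base in seq.upper():
        if base not in CORNERS:
            raise ValueError(f"unsupported base: {base!r}")
        cx, cy = CORNERS[base]
        # Move half way from the current point to the base's corner.
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points


def cgr_grid(seq, k):
    """Count CGR points per cell of a 2^k x 2^k grid.

    After at least k steps, the cell a point falls in is determined by the
    last k bases, so the grid is a table of k-mer occurrence counts.
    """
    size = 2 ** k
    grid = [[0] * size for _ in range(size)]
    for i, (x, y) in enumerate(cgr_points(seq)):
        if i + 1 < k:
            continue  # fewer than k bases read so far
        grid[int(y * size)][int(x * size)] += 1
    return grid


print(cgr_points("AG"))   # [(0.25, 0.25), (0.625, 0.625)]
print(cgr_grid("ACGT", 1))
```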

  16. GenomeVx: simple web-based creation of editable circular chromosome maps.

    Science.gov (United States)

    Conant, Gavin C; Wolfe, Kenneth H

    2008-03-15

    We describe GenomeVx, a web-based tool for making editable, publication-quality, maps of mitochondrial and chloroplast genomes and of large plasmids. These maps show the location of genes and chromosomal features as well as a position scale. The program takes as input either raw feature positions or GenBank records. In the latter case, features are automatically extracted and colored, an example of which is given. Output is in the Adobe Portable Document Format (PDF) and can be edited by programs such as Adobe Illustrator. GenomeVx is available at http://wolfe.gen.tcd.ie/GenomeVx

  17. MOST-visualization: software for producing automated textbook-style maps of genome-scale metabolic networks.

    Science.gov (United States)

    Kelley, James J; Maor, Shay; Kim, Min Kyung; Lane, Anatoliy; Lun, Desmond S

    2017-08-15

    Visualization of metabolites, reactions and pathways in genome-scale metabolic networks (GEMs) can assist in understanding cellular metabolism. Three attributes are desirable in software used for visualizing GEMs: (i) automation, since GEMs can be quite large; (ii) production of understandable maps that provide ease in identification of pathways, reactions and metabolites; and (iii) visualization of the entire network to show how pathways are interconnected. No software currently exists for visualizing GEMs that satisfies all three characteristics, but MOST-Visualization, an extension of the software package MOST (Metabolic Optimization and Simulation Tool), satisfies (i), and by using a pre-drawn overview map of metabolism based on the Roche map satisfies (ii) and comes close to satisfying (iii). MOST is distributed for free under the GNU General Public License. The software and full documentation are available at http://most.ccib.rutgers.edu/. Contact: dslun@rutgers.edu. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press.

  18. Unveiling Mycoplasma hyopneumoniae Promoters: Sequence Definition and Genomic Distribution

    Science.gov (United States)

    Weber, Shana de Souto; Sant'Anna, Fernando Hayashi; Schrank, Irene Silveira

    2012-01-01

    Several Mycoplasma species have had their genome completely sequenced, including four strains of the swine pathogen Mycoplasma hyopneumoniae. Nevertheless, little is known about the nucleotide sequences that control transcriptional initiation in these microorganisms. Therefore, with the objective of investigating the promoter sequences of M. hyopneumoniae, 23 transcriptional start sites (TSSs) of distinct genes were mapped. A pattern that resembles the σ70 promoter −10 element was found upstream of the TSSs. However, no −35 element was distinguished. Instead, an AT-rich periodic signal was identified. About half of the experimentally defined promoters contained the motif 5′-TRTGn-3′, which was identical to the −16 element usually found in Gram-positive bacteria. The defined promoters were utilized to build position-specific scoring matrices in order to scan putative promoters upstream of all coding sequences (CDSs) in the M. hyopneumoniae genome. Two hundred and one signals were found associated with 169 CDSs. Most of these sequences were located within 100 nucleotides of the start codons. This study has shown that promoter-like sequences occur in the M. hyopneumoniae genome more frequently than expected by chance, indicating that most of the sequences detected are probably biologically functional. PMID:22334569
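
The idea of building a position-specific scoring matrix from known promoters and scanning upstream sequence with it can be sketched as below. Everything concrete here is an assumption for illustration: the three toy sites, the pseudocount, the uniform background (an AT-rich genome would call for its own base composition) and the function names. It is not the matrix or the scoring scheme used in the study.

```python
import math

BASES = "ACGT"


def build_pssm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds PSSM from equal-length aligned sites."""
    width = len(sites[0])
    if any(len(site) != width for site in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + pseudocount * len(BASES)
    pssm = []
    for pos in range(width):
        column = [site[pos] for site in sites]
        # Log2 ratio of smoothed column frequency to background frequency.
        pssm.append({
            base: math.log2(((column.count(base) + pseudocount) / total)
                            / background)
            for base in BASES
        })
    return pssm


def scan(sequence, pssm):
    """Score every window of the sequence; returns (start, score) pairs."""
    width = len(pssm)
    return [
        (start, sum(pssm[j][sequence[start + j]] for j in range(width)))
        for start in range(len(sequence) - width + 1)
    ]


# Hypothetical -10-like sites, not the experimentally mapped promoters.
toy_sites = ["TATAAT", "TAAAAT", "TATTAT"]
matrix = build_pssm(toy_sites)
best_start, best_score = max(scan("GGGTATAATGGG", matrix), key=lambda hit: hit[1])
print(best_start, round(best_score, 2))
```

In practice a score cutoff would be chosen against a null model, for example shuffled sequence, to judge whether hits are more frequent than expected by chance.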

  19. Bioinformatics of genomic association mapping

    NARCIS (Netherlands)

    Vaez Barzani, Ahmad

    2015-01-01

    In this thesis we present an overview of bioinformatics-based approaches for genomic association mapping, with emphasis on human quantitative traits and their contribution to complex diseases. We aim to provide a comprehensive walk-through of the classic steps of genomic association mapping

  20. Whole-genome shotgun optical mapping of Rhodospirillum rubrum

    Energy Technology Data Exchange (ETDEWEB)

    Reslewic, Susan; Zhou, Shiguo; Place, Mike; Zhang, Yaoping; Briska, Adam; Goldstein, Steve; Churas, Chris; Runnheim, Rod; Forrest, Dan; Lim, Alex; Lapidus, Alla; Han, Cliff S.; Roberts, Gary P.; Schwartz, David C.

    2004-07-01

    Rhodospirillum rubrum is a phototrophic purple non-sulfur bacterium known for its unique and well-studied nitrogen fixation and carbon monoxide oxidation systems, and as a source of hydrogen and biodegradable plastics production. To better understand this organism and to facilitate assembly of its sequence, three whole-genome restriction maps (Xba I, Nhe I, and Hind III) of R. rubrum strain ATCC 11170 were created by optical mapping. Optical mapping is a system for creating whole-genome ordered restriction maps from randomly sheared genomic DNA molecules extracted directly from cells. During the sequence finishing process, all three optical maps confirmed a putative error in sequence assembly, while the Hind III map acted as a scaffold for high resolution alignment with sequence contigs spanning the whole genome. In addition to highlighting optical mapping's role in the assembly and validation of genome sequence, our work underscores the unique niche in resolution occupied by the optical mapping system. With a resolution ranging from 6.5 kb (previously published) to 45 kb (reported here), optical mapping advances a "molecular cytogenetics" approach to solving problems in genomic analysis.
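
Aligning an optical map against sequence contigs relies on an in silico restriction map of the sequence, that is, the ordered list of fragment sizes a given enzyme would produce. A minimal sketch follows, assuming a linear molecule and using the standard recognition sites of the three enzymes named above; the function name and the toy sequence are hypothetical.

```python
# Recognition site and cut offset within the site (top strand).
ENZYMES = {
    "XbaI": ("TCTAGA", 1),     # T^CTAGA
    "NheI": ("GCTAGC", 1),     # G^CTAGC
    "HindIII": ("AAGCTT", 1),  # A^AGCTT
}


def restriction_map(sequence, enzyme):
    """Return ordered fragment lengths for a linear sequence cut by enzyme.

    The sites are palindromic, so scanning one strand is sufficient.
    """
    site, offset = ENZYMES[enzyme]
    seq = sequence.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


toy_sequence = "GGAAGCTTCCCCAAGCTTG"  # two HindIII sites, illustration only
print(restriction_map(toy_sequence, "HindIII"))  # [3, 10, 6]
```

A circular chromosome would need the first and last fragments joined, which this sketch leaves out.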

  1. Mapping of the methylation pattern of the hMSH2 promoter in colon cancer, using bisulfite genomic sequencing

    Directory of Open Access Journals (Sweden)

    Zhang Hua

    2006-08-01

    The detailed methylation status of CpG sites in the promoter region of the hMSH2 gene has not yet been reported. We mapped the complete methylation status of the hMSH2 promoter, a region that contains 75 CpG sites, using bisulfite genomic sequencing in 60 primary colorectal cancers, and detected hMSH2 expression by immunohistochemistry. Hypermethylation of hMSH2 was detected in 18.33% (11/60) of tumor tissues, and hMSH2 protein was detected in 41.67% (25/60) of tumor tissues. No hypermethylation of hMSH2 was detected in normal tissues, and hMSH2 protein was detected in all normal tissues. Our study demonstrated that hMSH2 hypermethylation and protein expression were associated with the development of colorectal cancer.

  2. Construction of reference chromosome-scale pseudomolecules for potato: integrating the potato genome with genetic and physical maps.

    Science.gov (United States)

    Sharma, Sanjeev Kumar; Bolser, Daniel; de Boer, Jan; Sønderkær, Mads; Amoros, Walter; Carboni, Martin Federico; D'Ambrosio, Juan Martín; de la Cruz, German; Di Genova, Alex; Douches, David S; Eguiluz, Maria; Guo, Xiao; Guzman, Frank; Hackett, Christine A; Hamilton, John P; Li, Guangcun; Li, Ying; Lozano, Roberto; Maass, Alejandro; Marshall, David; Martinez, Diana; McLean, Karen; Mejía, Nilo; Milne, Linda; Munive, Susan; Nagy, Istvan; Ponce, Olga; Ramirez, Manuel; Simon, Reinhard; Thomson, Susan J; Torres, Yerisf; Waugh, Robbie; Zhang, Zhonghua; Huang, Sanwen; Visser, Richard G F; Bachem, Christian W B; Sagredo, Boris; Feingold, Sergio E; Orjeda, Gisella; Veilleux, Richard E; Bonierbale, Merideth; Jacobs, Jeanne M E; Milbourne, Dan; Martin, David Michael Alan; Bryan, Glenn J

    2013-11-06

    The genome of potato, a major global food crop, was recently sequenced. The work presented here details the integration of the potato reference genome (DM) with a new sequence-tagged site marker-based linkage map and other physical and genetic maps of potato and the closely related species tomato. Primary anchoring of the DM genome assembly was accomplished by the use of a diploid segregating population, which was genotyped with several types of molecular genetic markers to construct a new ~936 cM linkage map comprising 2469 marker loci. In silico anchoring approaches used genetic and physical maps from the diploid potato genotype RH89-039-16 (RH) and tomato. This combined approach has allowed 951 superscaffolds to be ordered into pseudomolecules corresponding to the 12 potato chromosomes. These pseudomolecules represent 674 Mb (~93%) of the 723 Mb genome assembly and 37,482 (~96%) of the 39,031 predicted genes. The superscaffold order and orientation within the pseudomolecules are closely collinear with independently constructed high density linkage maps. Comparisons between marker distribution and physical location reveal regions of greater and lesser recombination, as well as regions exhibiting significant segregation distortion. The work presented here has led to a greatly improved ordering of the potato reference genome superscaffolds into chromosomal "pseudomolecules".

  3. Validation of rice genome sequence by optical mapping

    Directory of Open Access Journals (Sweden)

    Pape Louise

    2007-08-01

    Background: Rice feeds much of the world, and possesses the simplest genome analyzed to date within the grass family, making it an economically relevant model system for other cereal crops. Although the rice genome is sequenced, validation and gap closing efforts require purely independent means for accurate finishing of sequence build data. Results: To facilitate ongoing sequencing finishing and validation efforts, we have constructed a whole-genome SwaI optical restriction map of the rice genome. The physical map consists of 14 contigs, covering 12 chromosomes, with a total genome size of 382.17 Mb; this value is about 11% smaller than original estimates. Nine of the 14 optical map contigs are without gaps, covering chromosomes 1, 2, 3, 4, 5, 7, 8, 10, and 12 in their entirety, including centromeres and telomeres. Alignments between optical and in silico restriction maps constructed from IRGSP (International Rice Genome Sequencing Project) and TIGR (The Institute for Genomic Research) genome sequence sources are comprehensive and informative, evidenced by map coverage across virtually all published gaps, discovery of new ones, and characterization of sequence misassemblies; all totalling ~14 Mb. Furthermore, since optical maps are ordered restriction maps, identified discordances are pinpointed on a reliable physical scaffold providing an independent resource for closure of gaps and rectification of misassemblies. Conclusion: Analysis of sequence and optical mapping data effectively validates genome sequence assemblies constructed from large, repeat-rich genomes. Given this conclusion we envision new applications of such single molecule analysis that will merge advantages offered by high-resolution optical maps with inexpensive, but short sequence reads generated by emerging sequencing platforms. Lastly, map construction techniques presented here point the way to new types of comparative genome analysis that would focus on discernment of

  4. Fine-scale maps of recombination rates and hotspots in the mouse genome.

    Science.gov (United States)

    Brunschwig, Hadassa; Levi, Liat; Ben-David, Eyal; Williams, Robert W; Yakir, Benjamin; Shifman, Sagiv

    2012-07-01

    Recombination events are not uniformly distributed and often cluster in narrow regions known as recombination hotspots. Several studies using different approaches have dramatically advanced our understanding of recombination hotspot regulation. Population genetic data have been used to map and quantify hotspots in the human genome. Genetic variation in recombination rates and hotspot usage has been explored in human pedigrees, mouse intercrosses, and by sperm typing. These studies pointed to the central role of the PRDM9 gene in hotspot modulation. In this study, we used single nucleotide polymorphisms (SNPs) from whole-genome resequencing and genotyping studies of mouse inbred strains to estimate recombination rates across the mouse genome and identified 47,068 historical hotspots, an average of over 2477 per chromosome. We show by simulation that inbred mouse strains can be used to identify positions of historical hotspots. Recombination hotspots were found to be enriched for the predicted binding sequences for different alleles of the PRDM9 protein. Recombination rates were on average lower near transcription start sites (TSS). Comparing the inferred historical recombination hotspots with the recent genome-wide mapping of double-strand breaks (DSBs) in mouse sperm revealed a significant overlap, especially toward the telomeres. Our results suggest that inbred strains can be used to characterize and study the dynamics of historical recombination hotspots. They also strengthen previous findings on mouse recombination hotspots, and specifically the impact of sequence variants in Prdm9.

  5. Development and Integration of Genome-Wide Polymorphic Microsatellite Markers onto a Reference Linkage Map for Constructing a High-Density Genetic Map of Chickpea.

    Directory of Open Access Journals (Sweden)

    Yash Paul Khajuria

    The identification of informative in silico polymorphic genomic and genic microsatellite markers by comparing the genome and transcriptome sequences of crop genotypes is a rapid, cost-effective and non-laborious approach for large-scale marker validation and genotyping applications, including construction of high-density genetic maps. We designed 1494 markers, including 1016 genomic and 478 transcript-derived microsatellite markers showing in silico fragment length polymorphism between two parental genotypes (Cicer arietinum ICC4958 and C. reticulatum PI489777) of an inter-specific reference mapping population. High amplification efficiency (87%), experimental validation success rate (81%) and polymorphic potential (55%) of these microsatellite markers suggest their effective use in various applications of chickpea genetics and breeding. Intra-specific polymorphic potential (48%) detected by microsatellite markers in 22 desi and kabuli chickpea genotypes was lower than inter-specific polymorphic potential (59%). An advanced, high-density, integrated and inter-specific chickpea genetic map (ICC4958 x PI489777) having 1697 map positions spanning 1061.16 cM with an average inter-marker distance of 0.625 cM was constructed by assigning 634 novel informative transcript-derived and genomic microsatellite markers on eight linkage groups (LGs) of our previously documented 1063-marker genetic map. The constructed genome map identified 88, including four major (7-23 cM) longest high-resolution genomic regions on LGs 3, 5 and 8, where the maximum number of novel genomic and genic microsatellite markers were specifically clustered within 1 cM genetic distance. This was the first time in chickpea that in silico FLP analysis was carried out at the genome-wide level and that such a large number of microsatellite markers were identified, experimentally validated and further used in genetic mapping. To the best of our knowledge, in the presently constructed genetic map, we mapped

  6. Whole-genome shotgun optical mapping of Rhodospirillum rubrum

    Energy Technology Data Exchange (ETDEWEB)

    Reslewic, S. [Univ. Wisc.-Madison; Zhou, S. [Univ. Wisc.-Madison; Place, M. [Univ. Wisc.-Madison; Zhang, Y. [Univ. Wisc.-Madison; Briska, A. [Univ. Wisc.-Madison; Goldstein, S. [Univ. Wisc.-Madison; Churas, C. [Univ. Wisc.-Madison; Runnheim, R. [Univ. Wisc.-Madison; Forrest, D. [Univ. Wisc.-Madison; Lim, A. [Univ. Wisc.-Madison; Lapidus, A. [Univ. Wisc.-Madison; Han, C. S. [Univ. Wisc.-Madison; Roberts, G. P. [Univ. Wisc.-Madison; Schwartz, D. C. [Univ. Wisc.-Madison

    2005-09-01

    Rhodospirillum rubrum is a phototrophic purple nonsulfur bacterium known for its unique and well-studied nitrogen fixation and carbon monoxide oxidation systems and as a source of hydrogen and biodegradable plastic production. To better understand this organism and to facilitate assembly of its sequence, three whole-genome restriction endonuclease maps (XbaI, NheI, and HindIII) of R. rubrum strain ATCC 11170 were created by optical mapping. Optical mapping is a system for creating whole-genome ordered restriction endonuclease maps from randomly sheared genomic DNA molecules extracted from cells. During the sequence finishing process, all three optical maps confirmed a putative error in sequence assembly, while the HindIII map acted as a scaffold for high-resolution alignment with sequence contigs spanning the whole genome. In addition to highlighting optical mapping's role in the assembly and confirmation of genome sequence, this work underscores the unique niche in resolution occupied by the optical mapping system. With a resolution ranging from 6.5 kb (previously published) to 45 kb (reported here), optical mapping advances a "molecular cytogenetics" approach to solving problems in genomic analysis.
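
    The ordered restriction maps described in the optical mapping records above are typically compared against maps predicted from sequence. A minimal Python sketch of such an in silico digest follows. The function name `in_silico_map` and the toy sequence are illustrative assumptions, and for simplicity the cut is placed at the start of each recognition site rather than at the enzyme's true cleavage position within the site.

```python
import re

# Recognition sequences of the enzymes named in the records above.
SITES = {"XbaI": "TCTAGA", "NheI": "GCTAGC", "HindIII": "AAGCTT", "SwaI": "ATTTAAAT"}

def in_silico_map(sequence, enzyme):
    """Return the ordered fragment lengths of a linear sequence.

    Simplifying assumption: each cut is placed at the first base of the
    recognition site, not at the enzyme's real cleavage offset.
    """
    site = SITES[enzyme]
    seq = sequence.upper()
    # A lookahead pattern reports every occurrence, including overlapping ones.
    cuts = [m.start() for m in re.finditer(f"(?={site})", seq)]
    bounds = [0] + cuts + [len(seq)]
    # Drop zero-length fragments (e.g. a site at position 0).
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]

toy = "GGGAAGCTTCCCCCAAGCTTTT"  # hypothetical 22-bp sequence with two HindIII sites
print(in_silico_map(toy, "HindIII"))  # [3, 11, 8]
```

    The resulting list of fragment lengths is the kind of ordered pattern that an optical map contig would be aligned against; real alignment additionally has to tolerate sizing error and missed or false cuts, which this sketch does not model.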

  7. The Amaranth Genome: Genome, Transcriptome, and Physical Map Assembly

    Directory of Open Access Journals (Sweden)

    J. W. Clouse

    2016-03-01

    Amaranth ( L. is an emerging pseudocereal native to the New World that has garnered increased attention in recent years because of its nutritional quality, in particular its seed protein and more specifically its high levels of the essential amino acid lysine. It belongs to the Amaranthaceae family, is an ancient paleopolyploid that shows disomic inheritance (2n = 32), and has an estimated genome size of 466 Mb. Here we present a high-quality draft genome sequence of the grain amaranth. The genome assembly consisted of 377 Mb in 3518 scaffolds with an N50 of 371 kb. Repetitive element analysis predicted that 48% of the genome is comprised of repeat sequences, of which -like elements were the most commonly classified retrotransposon. A de novo transcriptome consisting of 66,370 contigs was assembled from eight different amaranth tissue and abiotic stress libraries. Annotation of the genome identified 23,059 protein-coding genes. Seven grain amaranths (, , and and their putative progenitor ( were resequenced. A single nucleotide polymorphism (SNP) phylogeny supported the classification of as the progenitor species of the grain amaranths. Lastly, we generated a de novo physical map for using the BioNano Genomics’ Genome Mapping platform. The physical map spanned 340 Mb and a hybrid assembly using the BioNano physical maps nearly doubled the N50 of the assembly to 697 kb. Moreover, we analyzed synteny between amaranth and sugar beet ( L. and estimated, using analysis, the age of the most recent polyploidization event in amaranth.
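
    Several records in this list summarize assembly contiguity with a single scaffold or contig length statistic (conventionally written N50). As a hedged illustration of how that statistic is usually computed, the sketch below uses the common definition: the largest length L such that pieces of length at least L together cover at least half of the total. The function name `n50` and the toy lengths are assumptions for illustration, not data from any record.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold or contig lengths."""
    total = sum(lengths)
    running = 0
    # Walk from the longest piece downward until half the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("lengths must be non-empty")

toy_lengths = [100, 80, 50, 30, 20, 20]  # hypothetical lengths, total 300
print(n50(toy_lengths))  # 80, since 100 + 80 = 180 covers at least 150
```

    Because the statistic depends only on the length distribution, merging scaffolds with a physical map can raise it without adding any new sequence.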

  8. Genome-Wide Fine-Scale Recombination Rate Variation in Drosophila melanogaster

    Science.gov (United States)

    Song, Yun S.

    2012-01-01

    Estimating fine-scale recombination maps of Drosophila from population genomic data is a challenging problem, in particular because of the high background recombination rate. In this paper, a new computational method is developed to address this challenge. Through an extensive simulation study, it is demonstrated that the method allows more accurate inference, and exhibits greater robustness to the effects of natural selection and noise, compared to a well-used previous method developed for studying fine-scale recombination rate variation in the human genome. As an application, a genome-wide analysis of genetic variation data is performed for two Drosophila melanogaster populations, one from North America (Raleigh, USA) and the other from Africa (Gikongoro, Rwanda). It is shown that fine-scale recombination rate variation is widespread throughout the D. melanogaster genome, across all chromosomes and in both populations. At the fine-scale, a conservative, systematic search for evidence of recombination hotspots suggests the existence of a handful of putative hotspots each with at least a tenfold increase in intensity over the background rate. A wavelet analysis is carried out to compare the estimated recombination maps in the two populations and to quantify the extent to which recombination rates are conserved. In general, similarity is observed at very broad scales, but substantial differences are seen at fine scales. The average recombination rate of the X chromosome appears to be higher than that of the autosomes in both populations, and this pattern is much more pronounced in the African population than the North American population. The correlation between various genomic features—including recombination rates, diversity, divergence, GC content, gene content, and sequence quality—is examined using the wavelet analysis, and it is shown that the most notable difference between D. melanogaster and humans is in the correlation between recombination and

  9. Computational solution to automatically map metabolite libraries in the context of genome scale metabolic networks

    Directory of Open Access Journals (Sweden)

    Benjamin Merlet

    2016-02-01

    This article describes a generic programmatic method for mapping chemical compound libraries on organism-specific metabolic networks from various databases (KEGG, BioCyc) and flat file formats (SBML and Matlab files). We show how this pipeline was successfully applied to decipher the coverage of chemical libraries set up by two metabolomics facilities MetaboHub (French National infrastructure for metabolomics and fluxomics) and Glasgow Polyomics on the metabolic networks available in the MetExplore web server. The present generic protocol is designed to formalize and reduce the volume of information transfer between the library and the network database. Matching of metabolites between libraries and metabolic networks is based on InChIs or InChIKeys and therefore requires that these identifiers are specified in both libraries and networks. In addition to providing covering statistics, this pipeline also allows the visualization of mapping results in the context of metabolic networks. In order to achieve this goal, we tackled issues on programmatic interaction between two servers, improvement of metabolite annotation in metabolic networks and automatic loading of a mapping in genome scale metabolic network analysis tool MetExplore. It is important to note that this mapping can also be performed on a single or a selection of organisms of interest and is thus not limited to large facilities.

  10. Analysing human genomes at different scales

    DEFF Research Database (Denmark)

    Liu, Siyang

    The thriving of the Next-Generation sequencing (NGS) technologies in the past decade has dramatically revolutionized the field of human genetics. We are experiencing a wave of several large-scale whole genome sequencing studies of humans in the world. Those studies vary greatly regarding cohort...... will be reflected by the analysis of real data. This thesis covers studies in two human genome sequencing projects that distinctly differ in terms of studied population, sample size and sequencing depth. In the first project, we sequenced 150 Danish individuals from 50 trio families to 78x coverage....... The sophisticated experimental design enables high-quality de novo assembly of the genomes and provides a good opportunity for mapping the structural variations in the human population. We developed the AsmVar approach to discover, genotype and characterize the structural variations from the assemblies. Our...

  11. A Computational Solution to Automatically Map Metabolite Libraries in the Context of Genome Scale Metabolic Networks.

    Science.gov (United States)

    Merlet, Benjamin; Paulhe, Nils; Vinson, Florence; Frainay, Clément; Chazalviel, Maxime; Poupin, Nathalie; Gloaguen, Yoann; Giacomoni, Franck; Jourdan, Fabien

    2016-01-01

    This article describes a generic programmatic method for mapping chemical compound libraries on organism-specific metabolic networks from various databases (KEGG, BioCyc) and flat file formats (SBML and Matlab files). We show how this pipeline was successfully applied to decipher the coverage of chemical libraries set up by two metabolomics facilities MetaboHub (French National infrastructure for metabolomics and fluxomics) and Glasgow Polyomics (GP) on the metabolic networks available in the MetExplore web server. The present generic protocol is designed to formalize and reduce the volume of information transfer between the library and the network database. Matching of metabolites between libraries and metabolic networks is based on InChIs or InChIKeys and therefore requires that these identifiers are specified in both libraries and networks. In addition to providing covering statistics, this pipeline also allows the visualization of mapping results in the context of metabolic networks. In order to achieve this goal, we tackled issues on programmatic interaction between two servers, improvement of metabolite annotation in metabolic networks and automatic loading of a mapping in genome scale metabolic network analysis tool MetExplore. It is important to note that this mapping can also be performed on a single or a selection of organisms of interest and is thus not limited to large facilities.
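
    The identifier-based matching described above can be sketched in a few lines of Python. This is an illustration only, not the MetExplore pipeline or its API: the function name `map_library`, the dictionary layout, and the placeholder keys (which are not real InChIKeys) are all assumptions.

```python
def map_library(library, network):
    """Match library compounds to network metabolites by shared InChIKey.

    library: {compound name: InChIKey}
    network: {metabolite id: InChIKey or None when unannotated}
    Returns (mapping, coverage), where coverage is the fraction of
    library compounds found in the network.
    """
    by_key = {}
    for met_id, key in network.items():
        if key:  # metabolites without an identifier cannot be matched
            by_key.setdefault(key, []).append(met_id)
    mapping = {name: by_key[key] for name, key in library.items() if key in by_key}
    coverage = len(mapping) / len(library) if library else 0.0
    return mapping, coverage

# Placeholder identifiers, for illustration only.
library = {"glucose": "KEY-A", "pyruvate": "KEY-B", "caffeine": "KEY-C"}
network = {"M_glc": "KEY-A", "M_pyr": "KEY-B", "M_unknown": None}
mapping, coverage = map_library(library, network)
print(mapping)   # {'glucose': ['M_glc'], 'pyruvate': ['M_pyr']}
print(coverage)  # two of three library compounds are covered
```

    The unannotated metabolite in the toy network shows why the abstract stresses that identifiers must be present on both sides: a missing InChIKey silently lowers the reported coverage.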

  12. Algorithms and Complexity Results for Genome Mapping Problems.

    Science.gov (United States)

    Rajaraman, Ashok; Zanetti, Joao Paulo Pereira; Manuch, Jan; Chauve, Cedric

    2017-01-01

    Genome mapping algorithms aim at computing an ordering of a set of genomic markers based on local ordering information such as adjacencies and intervals of markers. In most genome mapping models, markers are assumed to occur uniquely in the resulting map. We introduce algorithmic questions that consider repeats, i.e., markers that can have several occurrences in the resulting map. We show that, provided with an upper bound on the copy number of repeated markers and with intervals that span full repeat copies, called repeat spanning intervals, the problem of deciding if a set of adjacencies and repeat spanning intervals admits a genome representation is tractable if the target genome can contain linear and/or circular chromosomal fragments. We also show that extracting a maximum cardinality or weight subset of repeat spanning intervals given a set of adjacencies that admits a genome realization is NP-hard but fixed-parameter tractable in the maximum copy number and the number of adjacent repeats, and tractable if intervals contain a single repeated marker.
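
    For orientation, the simpler unique-marker setting mentioned above can be checked very directly. The sketch below is not the repeat-aware algorithm of the paper; it assumes unoriented markers that each occur once, in which case a set of adjacencies is realizable by linear and/or circular chromosomes exactly when no marker takes part in more than two distinct adjacencies (the adjacency graph is then a disjoint union of paths and cycles). The function name is hypothetical.

```python
from collections import Counter

def admits_unique_marker_genome(adjacencies):
    """Check realizability of adjacencies when every marker occurs once.

    adjacencies: iterable of (marker, marker) pairs, unoriented.
    """
    # Treat (a, b) and (b, a) as the same adjacency.
    edges = {frozenset(pair) for pair in adjacencies}
    if any(len(edge) != 2 for edge in edges):
        return False  # a marker cannot be adjacent to itself in this model
    degree = Counter(marker for edge in edges for marker in edge)
    # Paths and cycles are exactly the graphs with maximum degree two.
    return all(d <= 2 for d in degree.values())

print(admits_unique_marker_genome([("a", "b"), ("b", "c")]))              # True
print(admits_unique_marker_genome([("a", "b"), ("b", "c"), ("c", "a")]))  # True
print(admits_unique_marker_genome([("a", "b"), ("a", "c"), ("a", "d")]))  # False
```

    Allowing repeated markers breaks this degree test, which is why the repeat-aware questions studied in the record need copy-number bounds and repeat spanning intervals.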

  13. Association Mapping and the Genomic Consequences of Selection in Sunflower

    Science.gov (United States)

    Mandel, Jennifer R.; Nambeesan, Savithri; Bowers, John E.; Marek, Laura F.; Ebert, Daniel; Rieseberg, Loren H.; Knapp, Steven J.; Burke, John M.

    2013-01-01

    The combination of large-scale population genomic analyses and trait-based mapping approaches has the potential to provide novel insights into the evolutionary history and genome organization of crop plants. Here, we describe the detailed genotypic and phenotypic analysis of a sunflower (Helianthus annuus L.) association mapping population that captures nearly 90% of the allelic diversity present within the cultivated sunflower germplasm collection. We used these data to characterize overall patterns of genomic diversity and to perform association analyses on plant architecture (i.e., branching) and flowering time, successfully identifying numerous associations underlying these agronomically and evolutionarily important traits. Overall, we found variable levels of linkage disequilibrium (LD) across the genome. In general, islands of elevated LD correspond to genomic regions underlying traits that are known to have been targeted by selection during the evolution of cultivated sunflower. In many cases, these regions also showed significantly elevated levels of differentiation between the two major sunflower breeding groups, consistent with the occurrence of divergence due to strong selection. One of these regions, which harbors a major branching locus, spans a surprisingly long genetic interval (ca. 25 cM), indicating the occurrence of an extended selective sweep in an otherwise recombinogenic interval. PMID:23555290

  14. Genomic dark matter: the reliability of short read mapping illustrated by the genome mappability score.

    Science.gov (United States)

    Lee, Hayan; Schatz, Michael C

    2012-08-15

    Genome resequencing and short read mapping are two of the primary tools of genomics and are used for many important applications. The current state-of-the-art in mapping uses the quality values and mapping quality scores to evaluate the reliability of the mapping. These attributes, however, are assigned to individual reads and do not directly measure the problematic repeats across the genome. Here, we present the Genome Mappability Score (GMS) as a novel measure of the complexity of resequencing a genome. The GMS is a weighted probability that any read could be unambiguously mapped to a given position and thus measures the overall composition of the genome itself. We have developed the Genome Mappability Analyzer to compute the GMS of every position in a genome. It leverages the parallelism of cloud computing to analyze large genomes, and enabled us to identify the 5-14% of the human, mouse, fly and yeast genomes that are difficult to analyze with short reads. We examined the accuracy of the widely used BWA/SAMtools polymorphism discovery pipeline in the context of the GMS, and found discovery errors are dominated by false negatives, especially in regions with poor GMS. These errors are fundamental to the mapping process and cannot be overcome by increasing coverage. As such, the GMS should be considered in every resequencing project to pinpoint the 'dark matter' of the genome, including known clinically relevant variations in these regions. The source code and profiles of several model organisms are available at http://gma-bio.sourceforge.net
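
    The idea that some positions cannot be mapped unambiguously can be illustrated with a much cruder proxy than the GMS itself. The Python sketch below only counts exact, forward-strand k-mers that occur once in a sequence; it does not model sequencing errors, mapping quality, or the weighting used by the GMS, and the function name and toy sequence are assumptions for illustration.

```python
from collections import Counter

def unique_kmer_fraction(genome, k):
    """Fraction of k-mer start positions whose k-mer occurs exactly once.

    A crude stand-in for mappability: exact matches, forward strand only.
    """
    kmers = [genome[i:i + k] for i in range(len(genome) - k + 1)]
    counts = Counter(kmers)
    unique = sum(1 for kmer in kmers if counts[kmer] == 1)
    return unique / len(kmers) if kmers else 0.0

toy_genome = "ACGTACGTTT"  # hypothetical sequence; ACGT occurs twice
print(unique_kmer_fraction(toy_genome, 4))  # 5 of 7 positions are unique
```

    Positions inside the repeated ACGT are the toy analogue of the hard-to-map fraction: reading them more deeply does not make them unique, which mirrors the abstract's point that extra coverage cannot fix these errors.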

  15. RatMap--rat genome tools and data.

    Science.gov (United States)

    Petersen, Greta; Johnson, Per; Andersson, Lars; Klinga-Levan, Karin; Gómez-Fabre, Pedro M; Ståhl, Fredrik

    2005-01-01

    The rat genome database RatMap (http://ratmap.org or http://ratmap.gen.gu.se) has been one of the main resources for rat genome information since 1994. The database is maintained by CMB-Genetics at Goteborg University in Sweden and provides information on rat genes, polymorphic rat DNA-markers and rat quantitative trait loci (QTLs), all curated at RatMap. The database is under the supervision of the Rat Gene and Nomenclature Committee (RGNC); thus much attention is paid to rat gene nomenclature. RatMap presents information on rat idiograms, karyotypes and provides a unified presentation of the rat genome sequence and integrated rat linkage maps. A set of tools is also available to facilitate the identification and characterization of rat QTLs, as well as the estimation of exon/intron number and sizes in individual rat genes. Furthermore, comparative gene maps of rat in regard to mouse and human are provided.

  16. Genome-wide analysis of regions similar to promoters of histone genes

    KAUST Repository

    Chowdhary, Rajesh

    2010-05-28

    Background: The purpose of this study is to: i) develop a computational model of promoters of human histone-encoding genes (histone genes for short), an important class of genes that participate in various critical cellular processes, ii) use the model so developed to identify regions across the human genome that have similar structure as promoters of histone genes; such regions could represent potential genomic regulatory regions, e.g. promoters, of genes that may be coregulated with histone genes, and iii) identify in this way genes that have high likelihood of being coregulated with the histone genes. Results: We successfully developed a histone promoter model using a comprehensive collection of histone genes. Based on leave-one-out cross-validation test, the model produced good prediction accuracy (94.1% sensitivity, 92.6% specificity, and 92.8% positive predictive value). We used this model to predict across the genome a number of genes that shared similar promoter structures with the histone gene promoters. We thus hypothesize that these predicted genes could be coregulated with histone genes. This hypothesis matches well with the available gene expression, gene ontology, and pathways data. Jointly with promoters of the above-mentioned genes, we found a large number of intergenic regions with similar structure as histone promoters. Conclusions: This study represents one of the most comprehensive computational analyses conducted thus far on a genome-wide scale of promoters of human histone genes. Our analysis suggests a number of other human genes that share a high similarity of promoter structure with the histone genes and thus are highly likely to be coregulated, and consequently coexpressed, with the histone genes. We also found that there are a large number of intergenic regions across the genome with their structures similar to promoters of histone genes. These regions may be promoters of yet unidentified genes, or may represent remote control regions that

  17. Genome-wide mapping of DNA strand breaks.

    Directory of Open Access Journals (Sweden)

    Frédéric Leduc

    Determination of cellular DNA damage has so far been limited to global assessment of genome integrity whereas nucleotide-level mapping has been restricted to specific loci by the use of specific primers. Therefore, only limited DNA sequences can be studied and novel regions of genomic instability can hardly be discovered. Using a well-characterized yeast model, we describe a straightforward strategy to map genome-wide DNA strand breaks without compromising nucleotide-level resolution. This technique, termed "damaged DNA immunoprecipitation" (dDIP), uses immunoprecipitation and the terminal deoxynucleotidyl transferase-mediated dUTP-biotin end-labeling (TUNEL) to capture DNA at break sites. When used in combination with microarray or next-generation sequencing technologies, dDIP will allow researchers to map genome-wide DNA strand breaks as well as other types of DNA damage and to establish a clear profiling of altered genes and/or intergenic sequences in various experimental conditions. This mapping technique could find several applications, for instance in the study of aging, genotoxic drug screening, cancer, meiosis, radiation and oxidative DNA damage.

  18. Mapping and characterizing N6-methyladenine in eukaryotic genomes using single molecule real-time sequencing.

    Science.gov (United States)

    Zhu, Shijia; Beaulaurier, John; Deikus, Gintaras; Wu, Tao; Strahl, Maya; Hao, Ziyang; Luo, Guanzheng; Gregory, James A; Chess, Andrew; He, Chuan; Xiao, Andrew; Sebra, Robert; Schadt, Eric E; Fang, Gang

    2018-05-15

    N6-methyladenine (m6dA) has been discovered as a novel form of DNA methylation prevalent in eukaryotes, however, methods for high resolution mapping of m6dA events are still lacking. Single-molecule real-time (SMRT) sequencing has enabled the detection of m6dA events at single-nucleotide resolution in prokaryotic genomes, but its application to detecting m6dA in eukaryotic genomes has not been rigorously examined. Herein, we identified unique characteristics of eukaryotic m6dA methylomes that fundamentally differ from those of prokaryotes. Based on these differences, we describe the first approach for mapping m6dA events using SMRT sequencing specifically designed for the study of eukaryotic genomes, and provide appropriate strategies for designing experiments and carrying out sequencing in future studies. We apply the novel approach to study two eukaryotic genomes. For green algae, we construct the first complete genome-wide map of m6dA at single nucleotide and single molecule resolution. For human lymphoblastoid cells (hLCLs), joint analyses of SMRT sequencing and independent sequencing data suggest that putative m6dA events are enriched in the promoters of young, full length LINE-1 elements (L1s). These analyses demonstrate a general method for rigorous mapping and characterization of m6dA events in eukaryotic genomes. Published by Cold Spring Harbor Laboratory Press.

  19. Microbial genome sequencing using optical mapping and Illumina sequencing

    Science.gov (United States)

    Introduction Optical mapping is a technique in which strands of genomic DNA are digested with one or more restriction enzymes, and a physical map of the genome constructed from the resulting image. In outline, genomic DNA is extracted from a pure culture, linearly arrayed on a specialized glass sli...

  20. Rapid detection of structural variation in a human genome using nanochannel-based genome mapping technology

    DEFF Research Database (Denmark)

    Cao, Hongzhi; Hastie, Alex R.; Cao, Dandan

    2014-01-01

    mutations; however, none of the current detection methods are comprehensive, and currently available methodologies are incapable of providing sufficient resolution and unambiguous information across complex regions in the human genome. To address these challenges, we applied a high-throughput, cost......-effective genome mapping technology to comprehensively discover genome-wide SVs and characterize complex regions of the YH genome using long single molecules (>150 kb) in a global fashion. RESULTS: Utilizing nanochannel-based genome mapping technology, we obtained 708 insertions/deletions and 17 inversions larger...... fosmid data. Of the remaining 270 SVs, 260 are insertions and 213 overlap known SVs in the Database of Genomic Variants. Overall, 609 out of 666 (90%) variants were supported by experimental orthogonal methods or historical evidence in public databases. At the same time, genome mapping also provides...

  1. Genome-wide mapping of furfural tolerance genes in Escherichia coli.

    Science.gov (United States)

    Glebes, Tirzah Y; Sandoval, Nicholas R; Reeder, Philippa J; Schilling, Katherine D; Zhang, Min; Gill, Ryan T

    2014-01-01

    Advances in genomics have improved the ability to map complex genotype-to-phenotype relationships, like those required for engineering chemical tolerance. Here, we have applied the multiSCale Analysis of Library Enrichments (SCALEs; Lynch et al. (2007) Nat. Method.) approach to map, in parallel, the effect of increased dosage for >10^5 different fragments of the Escherichia coli genome onto furfural tolerance (furfural is a key toxin of lignocellulosic hydrolysate). Only 268 of >4,000 E. coli genes (∼6%) were enriched after growth selections in the presence of furfural. Several of the enriched genes were cloned and tested individually for their effect on furfural tolerance. Overexpression of thyA, lpcA, or groESL individually increased growth in the presence of furfural. Overexpression of lpcA, but not groESL or thyA, resulted in increased furfural reduction rate, a previously identified mechanism underlying furfural tolerance. We additionally show that plasmid-based expression of functional LpcA or GroESL is required to confer furfural tolerance. This study identifies new furfural tolerant genes, which can be applied in future strain design efforts focused on the production of fuels and chemicals from lignocellulosic hydrolysate.

  2. Genome-wide mapping of furfural tolerance genes in Escherichia coli.

    Directory of Open Access Journals (Sweden)

    Tirzah Y Glebes

    Advances in genomics have improved the ability to map complex genotype-to-phenotype relationships, like those required for engineering chemical tolerance. Here, we have applied the multiSCale Analysis of Library Enrichments (SCALEs; Lynch et al. (2007) Nat. Method.) approach to map, in parallel, the effect of increased dosage for >10^5 different fragments of the Escherichia coli genome onto furfural tolerance (furfural is a key toxin of lignocellulosic hydrolysate). Only 268 of >4,000 E. coli genes (∼6%) were enriched after growth selections in the presence of furfural. Several of the enriched genes were cloned and tested individually for their effect on furfural tolerance. Overexpression of thyA, lpcA, or groESL individually increased growth in the presence of furfural. Overexpression of lpcA, but not groESL or thyA, resulted in increased furfural reduction rate, a previously identified mechanism underlying furfural tolerance. We additionally show that plasmid-based expression of functional LpcA or GroESL is required to confer furfural tolerance. This study identifies new furfural tolerant genes, which can be applied in future strain design efforts focused on the production of fuels and chemicals from lignocellulosic hydrolysate.

  3. Analyses of a whole-genome inter-clade recombination map of hepatitis delta virus suggest a host polymerase-driven and viral RNA structure-promoted template-switching mechanism for viral RNA recombination

    Science.gov (United States)

    Chao, Mei; Wang, Tzu-Chi; Lin, Chia-Chi; Yung-Liang Wang, Robert; Lin, Wen-Bin; Lee, Shang-En; Cheng, Ying-Yu; Yeh, Chau-Ting; Iang, Shan-Bei

    2017-01-01

    The genome of hepatitis delta virus (HDV) is a 1.7-kb single-stranded circular RNA that folds into an unbranched rod-like structure and has ribozyme activity. HDV redirects host RNA polymerase(s) (RNAP) to perform viral RNA-directed RNA transcription. RNA recombination is known to contribute to the genetic heterogeneity of HDV, but its molecular mechanism is poorly understood. Here, we established a whole-genome HDV-1/HDV-4 recombination map using two cloned sequences coexisting in cultured cells. Our functional analyses of the resulting chimeric delta antigens (the only viral-encoded protein) and recombinant genomes provide insights into how recombination promotes the genotypic and phenotypic diversity of HDV. Our examination of crossover distribution and subsequent mutagenesis analyses demonstrated that ribozyme activity on HDV genome, which is required for viral replication, also contributes to the generation of an inter-clade junction. These data provide circumstantial evidence supporting our contention that HDV RNA recombination occurs via a replication-dependent mechanism. Furthermore, we identify an intrinsic asymmetric bulge on the HDV genome, which appears to promote recombination events in the vicinity. We therefore propose a mammalian RNAP-driven and viral-RNA-structure-promoted template-switching mechanism for HDV genetic recombination. The present findings improve our understanding of the capacities of the host RNAP beyond typical DNA-directed transcription. PMID:28977829

  4. Generation of a BAC-based physical map of the melon genome

    Directory of Open Access Journals (Sweden)

    Puigdomènech Pere

    2010-05-01

    Full Text Available Abstract Background Cucumis melo (melon) belongs to the Cucurbitaceae family, whose economic importance among horticulture crops is second only to Solanaceae. Melon has high intra-specific genetic variation, morphologic diversity and a small genome size (450 Mb), which make this species suitable for a great variety of molecular and genetic studies that can lead to the development of tools for breeding varieties of the species. A number of genetic and genomic resources have already been developed, such as several genetic maps and BAC genomic libraries. These tools are essential for the construction of a physical map, a valuable resource for map-based cloning, comparative genomics and assembly of whole genome sequencing data. However, no physical map of any Cucurbitaceae has yet been developed. A project has recently been started to sequence the complete melon genome following a whole-genome shotgun strategy, which makes use of massive sequencing data. A BAC-based melon physical map will be a useful tool to help assemble and refine the draft genome data that is being produced. Results A melon physical map was constructed using a 5.7 × BAC library and a genetic map previously developed in our laboratories. High-information-content fingerprinting (HICF) was carried out on 23,040 BAC clones, digesting with five restriction enzymes and SNaPshot labeling, followed by contig assembly with FPC software. The physical map has 1,355 contigs and 441 singletons, with an estimated physical length of 407 Mb (0.9 × coverage of the genome) and the longest contig being 3.2 Mb. The anchoring of 845 BAC clones to 178 genetic markers (100 RFLPs, 76 SNPs and 2 SSRs) also allowed the genetic positioning of 183 physical map contigs/singletons, representing 55 Mb (12% of the melon genome), to individual chromosomal loci. The melon FPC database is available for download at http://melonomics.upv.es/static/files/public/physical_map/. Conclusions Here we report the construction
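    The coverage and anchoring percentages quoted in this abstract follow directly from the reported sizes. The short check below is only an illustrative sketch: the constants are the figures given in the abstract, and the function names are hypothetical, not part of any tool used by the authors.

```python
# Figures quoted in the abstract (all in megabases).
GENOME_MB = 450.0      # estimated melon genome size
MAP_LENGTH_MB = 407.0  # estimated length of the physical map
ANCHORED_MB = 55.0     # contigs/singletons anchored to genetic markers


def genome_coverage(map_length_mb, genome_mb):
    """Physical-map length expressed as a multiple of the genome size."""
    return map_length_mb / genome_mb


def anchored_percent(anchored_mb, genome_mb):
    """Share of the genome (in percent) placed on chromosomal loci."""
    return 100.0 * anchored_mb / genome_mb


if __name__ == "__main__":
    print(round(genome_coverage(MAP_LENGTH_MB, GENOME_MB), 1))   # 0.9
    print(round(anchored_percent(ANCHORED_MB, GENOME_MB)))       # 12
```

    Both rounded values agree with the abstract (0.9 × coverage and 12% anchored).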

  5. Genome-wide nucleosome map and cytosine methylation levels of an ancient human genome

    DEFF Research Database (Denmark)

    Pedersen, Jakob Skou; Valen, Eivind; Velazquez, Amhed Missael Vargas

    2014-01-01

    Epigenetic information is available from contemporary organisms, but is difficult to track back in evolutionary time. Here, we show that genome-wide epigenetic information can be gathered directly from next-generation sequence reads of DNA isolated from ancient remains. Using the genome sequence...... data generated from hair shafts of a 4000-yr-old Paleo-Eskimo belonging to the Saqqaq culture, we generate the first ancient nucleosome map coupled with a genome-wide survey of cytosine methylation levels. The validity of both nucleosome map and methylation levels were confirmed by the recovery...

  6. PICMI: mapping point mutations on genomes.

    KAUST Repository

    Le Pera, Loredana; Marcatili, Paolo; Tramontano, Anna

    2010-01-01

    MOTIVATION: Several international collaborations and local projects are producing extensive catalogues of genomic variations that are supplementing existing collections such as the OMIM catalogue. The flood of this type of data will keep increasing and, especially, it will be relevant to a wider user base, including not only molecular biologists, geneticists and bioinformaticians, but also clinical researchers. Mapping the observed variations, sometimes only described at the amino acid level, on a genome, identifying whether they affect a gene and-if so-whether they also affect different isoforms of the same gene, is a time consuming and often frustrating task. RESULTS: The PICMI server is an easy to use tool for quickly mapping one or more amino acid or nucleotide variations on a genome and its products, including alternatively spliced isoforms. AVAILABILITY: The server is available at www.biocomputing.it/picmi.

  7. PICMI: mapping point mutations on genomes.

    KAUST Repository

    Le Pera, Loredana

    2010-10-12

    MOTIVATION: Several international collaborations and local projects are producing extensive catalogues of genomic variations that are supplementing existing collections such as the OMIM catalogue. The flood of this type of data will keep increasing and, especially, it will be relevant to a wider user base, including not only molecular biologists, geneticists and bioinformaticians, but also clinical researchers. Mapping the observed variations, sometimes only described at the amino acid level, on a genome, identifying whether they affect a gene and-if so-whether they also affect different isoforms of the same gene, is a time consuming and often frustrating task. RESULTS: The PICMI server is an easy to use tool for quickly mapping one or more amino acid or nucleotide variations on a genome and its products, including alternatively spliced isoforms. AVAILABILITY: The server is available at www.biocomputing.it/picmi.

  8. TheCellMap.org: A Web-Accessible Database for Visualizing and Mining the Global Yeast Genetic Interaction Network.

    Science.gov (United States)

    Usaj, Matej; Tan, Yizhao; Wang, Wen; VanderSluis, Benjamin; Zou, Albert; Myers, Chad L; Costanzo, Michael; Andrews, Brenda; Boone, Charles

    2017-05-05

    Providing access to quantitative genomic data is key to ensure large-scale data validation and promote new discoveries. TheCellMap.org serves as a central repository for storing and analyzing quantitative genetic interaction data produced by genome-scale Synthetic Genetic Array (SGA) experiments with the budding yeast Saccharomyces cerevisiae. In particular, TheCellMap.org allows users to easily access, visualize, explore, and functionally annotate genetic interactions, or to extract and reorganize subnetworks, using data-driven network layouts in an intuitive and interactive manner. Copyright © 2017 Usaj et al.

  9. Phylogenetic distribution of large-scale genome patchiness

    Directory of Open Access Journals (Sweden)

    Hackenberg Michael

    2008-04-01

    Full Text Available Abstract Background The phylogenetic distribution of large-scale genome structure (i.e. mosaic compositional patchiness) has been explored mainly by analytical ultracentrifugation of bulk DNA. However, with the availability of large, good-quality chromosome sequences, and the recently developed computational methods to directly analyze patchiness on the genome sequence, an evolutionary comparative analysis can be carried out at the sequence level. Results The local variations in the scaling exponent of the Detrended Fluctuation Analysis are used here to analyze large-scale genome structure and directly uncover the characteristic scales present in genome sequences. Furthermore, through shuffling experiments of selected genome regions, computationally-identified, isochore-like regions were identified as the biological source for the uncovered large-scale genome structure. The phylogenetic distribution of short- and large-scale patchiness was determined in the best-sequenced genome assemblies from eleven eukaryotic genomes: mammals (Homo sapiens, Pan troglodytes, Mus musculus, Rattus norvegicus, and Canis familiaris), birds (Gallus gallus), fishes (Danio rerio), invertebrates (Drosophila melanogaster and Caenorhabditis elegans), plants (Arabidopsis thaliana) and yeasts (Saccharomyces cerevisiae). We found large-scale patchiness of genome structure, associated with in silico determined, isochore-like regions, throughout this wide phylogenetic range. Conclusion Large-scale genome structure is detected by directly analyzing DNA sequences in a wide range of eukaryotic chromosome sequences, from human to yeast. In all these genomes, large-scale patchiness can be associated with the isochore-like regions, as directly detected in silico at the sequence level.
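    For readers unfamiliar with Detrended Fluctuation Analysis, the following is a minimal sketch of first-order DFA applied to a DNA string. It is not the authors' implementation: the G/C = +1, A/T = -1 encoding, the window sizes and all function names are assumptions made for illustration, and the sketch fits a single global exponent, whereas the abstract analyzes local variations of that exponent across scales.

```python
import math
import random


def dna_walk(seq):
    """Encode a DNA string as steps: +1 for G/C, -1 for A/T (assumed encoding)."""
    steps = []
    for base in seq.upper():
        if base in "GC":
            steps.append(1.0)
        elif base in "AT":
            steps.append(-1.0)  # any other symbol (e.g. N) is skipped
    return steps


def profile(steps):
    """Cumulative sum of the mean-subtracted steps (the integrated walk)."""
    mean = sum(steps) / len(steps)
    total, out = 0.0, []
    for s in steps:
        total += s - mean
        out.append(total)
    return out


def fluctuation(prof, window):
    """RMS residual of the profile around a straight-line fit in each window."""
    n_windows = len(prof) // window
    xs = list(range(window))
    x_mean = (window - 1) / 2.0
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sq_sum = 0.0
    for w in range(n_windows):
        seg = prof[w * window:(w + 1) * window]
        y_mean = sum(seg) / window
        slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, seg)) / sxx
        sq_sum += sum((y - (y_mean + slope * (x - x_mean))) ** 2
                      for x, y in zip(xs, seg))
    return math.sqrt(sq_sum / (n_windows * window))


def dfa_exponent(steps, windows):
    """Least-squares slope of log F(n) versus log n over the given window sizes."""
    prof = profile(steps)
    pts = [(math.log(n), math.log(fluctuation(prof, n))) for n in windows]
    lx = sum(p[0] for p in pts) / len(pts)
    ly = sum(p[1] for p in pts) / len(pts)
    return (sum((x - lx) * (y - ly) for x, y in pts)
            / sum((x - lx) ** 2 for x, _ in pts))


if __name__ == "__main__":
    random.seed(0)
    toy = "".join(random.choice("ACGT") for _ in range(4096))  # synthetic sequence
    print(dfa_exponent(dna_walk(toy), [16, 32, 64, 128, 256]))
```

    An uncorrelated sequence such as the synthetic one above should give an exponent near 0.5; compositionally patchy sequences are expected to deviate from that value at the scales where the patches occur.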

  10. A clone-free, single molecule map of the domestic cow (Bos taurus) genome.

    Science.gov (United States)

    Zhou, Shiguo; Goldstein, Steve; Place, Michael; Bechner, Michael; Patino, Diego; Potamousis, Konstantinos; Ravindran, Prabu; Pape, Louise; Rincon, Gonzalo; Hernandez-Ortiz, Juan; Medrano, Juan F; Schwartz, David C

    2015-08-28

    The cattle (Bos taurus) genome was originally selected for sequencing due to its economic importance and unique biology as a model organism for understanding other ruminants, or mammals. Currently, there are two cattle genome sequence assemblies (UMD3.1 and Btau4.6) from groups using dissimilar assembly algorithms, which were complemented by genetic and physical map resources. However, past comparisons between these assemblies revealed substantial differences. Consequently, such discordances have engendered ambiguities when using reference sequence data, impacting genomic studies in cattle and motivating construction of a new optical map resource--BtOM1.0--to guide comparisons and improvements to the current sequence builds. Accordingly, our comprehensive comparisons of BtOM1.0 against the UMD3.1 and Btau4.6 sequence builds tabulate large-to-immediate scale discordances requiring mediation. The optical map, BtOM1.0, spanning the B. taurus genome (Hereford breed, L1 Dominette 01449) was assembled from an optical map dataset consisting of 2,973,315 (439 X; raw dataset size before assembly) single molecule optical maps (Rmaps; 1 Rmap = 1 restriction mapped DNA molecule) generated by the Optical Mapping System. The BamHI map spans 2,575.30 Mb and comprises 78 optical contigs assembled by a combination of iterative (using the reference sequence: UMD3.1) and de novo assembly techniques. BtOM1.0 is a high-resolution physical map featuring an average restriction fragment size of 8.91 Kb. Comparisons of BtOM1.0 vs. UMD3.1, or Btau4.6, revealed that Btau4.6 presented far more discordances (7,463) vs. UMD3.1 (4,754). Overall, we found that Btau4.6 presented almost double the number of discordances than UMD3.1 across most of the 6 categories of sequence vs. map discrepancies, which are: COMPLEX (misassembly), DELs (extraneous sequences), INSs (missing sequences), ITs (Inverted/Translocated sequences), ECs (extra restriction cuts) and MCs (missing restriction cuts

  11. Fine-Scale Mapping of the FGFR2 Breast Cancer Risk Locus

    DEFF Research Database (Denmark)

    Meyer, Kerstin B; O'Reilly, Martin; Michailidou, Kyriaki

    2013-01-01

    The 10q26 locus in the second intron of FGFR2 is the locus most strongly associated with estrogen-receptor-positive breast cancer in genome-wide association studies. We conducted fine-scale mapping in case-control studies genotyped with a custom chip (iCOGS), comprising 41 studies (n = 89,050) of...

  12. An Integrative Bioinformatics Framework for Genome-scale Multiple Level Network Reconstruction of Rice

    Directory of Open Access Journals (Sweden)

    Liu Lili

    2013-06-01

    Full Text Available Understanding how metabolic reactions translate the genome of an organism into its phenotype is a grand challenge in biology. Genome-wide association studies (GWAS) statistically connect genotypes to phenotypes, without any recourse to known molecular interactions, whereas a molecular mechanistic description ties gene function to phenotype through gene regulatory networks (GRNs), protein-protein interactions (PPIs) and molecular pathways. Integration of different regulatory information levels of an organism is expected to provide a good way for mapping genotypes to phenotypes. However, the lack of a curated metabolic model of rice is blocking the exploration of genome-scale multi-level network reconstruction. Here, we have merged GRNs, PPIs and genome-scale metabolic networks (GSMNs) approaches into a single framework for rice via omics’ regulatory information reconstruction and integration. Firstly, we reconstructed a genome-scale metabolic model, containing 4,462 function genes, 2,986 metabolites involved in 3,316 reactions, and compartmentalized into ten subcellular locations. Furthermore, 90,358 pairs of protein-protein interactions, 662,936 pairs of gene regulations and 1,763 microRNA-target interactions were integrated into the metabolic model. Eventually, a database was developed for systematically storing and retrieving the genome-scale multi-level network of rice. This provides a reference for understanding genotype-phenotype relationship of rice, and for analysis of its molecular regulatory network.

  13. Cross-species mapping of bidirectional promoters enables prediction of unannotated 5' UTRs and identification of species-specific transcripts

    Directory of Open Access Journals (Sweden)

    Lewin Harris A

    2009-04-01

    Full Text Available Abstract Background Bidirectional promoters are shared regulatory regions that influence the expression of two oppositely oriented genes. This type of regulatory architecture is found more frequently than expected by chance in the human genome, yet many specifics underlying the regulatory design are unknown. Given that the function of most orthologous genes is similar across species, we hypothesized that the architecture and regulation of bidirectional promoters might also be similar across species, representing a core regulatory structure and enabling annotation of these regions in additional mammalian genomes. Results By mapping the intergenic distances of genes in human, chimpanzee, bovine, murine, and rat, we show an enrichment for pairs of genes equal to or less than 1,000 bp between their adjacent 5' ends ("head-to-head") compared to pairs of genes that fall in the same orientation ("head-to-tail") or whose 3' ends are side-by-side ("tail-to-tail"). A representative set of 1,369 human bidirectional promoters was mapped to orthologous sequences in other mammals. We confirmed predictions for 5' UTRs in nine of ten manual picks in bovine based on comparison to the orthologous human promoter set and in six of seven predictions in human based on comparison to the bovine dataset. The two predictions that did not have orthology as bidirectional promoters in the other species resulted from unique events that initiated transcription in the opposite direction in only those species. We found evidence supporting the independent emergence of bidirectional promoters from the family of five RecQ helicase genes, which gained their bidirectional promoters and partner genes independently rather than through a duplication process. Furthermore, by expanding our comparisons from pairwise to multispecies analyses we developed a map representing a core set of bidirectional promoters in mammals. Conclusion We show that the orthologous positions of bidirectional
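    The three pair orientations named in this abstract, and the 1,000 bp criterion on adjacent 5' ends, can be made concrete with a small sketch. This is an illustration only, not the authors' pipeline: the `Gene` record, the toy coordinates, the skipping of overlapping genes and the function names are all assumptions.

```python
from collections import namedtuple

# Hypothetical gene record: coordinates with start <= end, strand "+" or "-".
Gene = namedtuple("Gene", "name start end strand")


def five_prime(gene):
    """Coordinate of the gene's 5' end on the reference strand."""
    return gene.start if gene.strand == "+" else gene.end


def classify_pair(left, right):
    """Orientation of two adjacent genes; `left` starts first on the reference."""
    if left.strand == "-" and right.strand == "+":
        return "head-to-head"   # 5' ends face each other (divergent)
    if left.strand == "+" and right.strand == "-":
        return "tail-to-tail"   # 3' ends face each other (convergent)
    return "head-to-tail"       # both genes on the same strand


def bidirectional_pairs(genes, max_gap=1000):
    """Adjacent head-to-head pairs whose 5' ends lie within max_gap bp."""
    ordered = sorted(genes, key=lambda g: g.start)
    hits = []
    for left, right in zip(ordered, ordered[1:]):
        if classify_pair(left, right) != "head-to-head":
            continue
        gap = five_prime(right) - five_prime(left)
        if 0 <= gap <= max_gap:  # overlapping genes (negative gap) are skipped here
            hits.append((left.name, right.name, gap))
    return hits


if __name__ == "__main__":
    toy = [
        Gene("A", 100, 500, "-"),
        Gene("B", 900, 2000, "+"),
        Gene("C", 2500, 3000, "+"),
        Gene("D", 3200, 4000, "-"),
        Gene("E", 9000, 9500, "+"),
    ]
    print(bidirectional_pairs(toy))
```

    In the toy data, only A and B qualify: they are head-to-head with 400 bp between their 5' ends, whereas D and E are head-to-head but 5,000 bp apart.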

  14. An XML transfer schema for exchange of genomic and genetic mapping data: implementation as a web service in a Taverna workflow.

    Science.gov (United States)

    Paterson, Trevor; Law, Andy

    2009-08-14

    Genomic analysis, particularly for less well-characterized organisms, is greatly assisted by performing comparative analyses between different types of genome maps and across species boundaries. Various providers publish a plethora of on-line resources collating genome mapping data from a multitude of species. Datasources range in scale and scope from small bespoke resources for particular organisms, through larger web-resources containing data from multiple species, to large-scale bioinformatics resources providing access to data derived from genome projects for model and non-model organisms. The heterogeneity of information held in these resources reflects both the technologies used to generate the data and the target users of each resource. Currently there is no common information exchange standard or protocol to enable access and integration of these disparate resources. Consequently data integration and comparison must be performed in an ad hoc manner. We have developed a simple generic XML schema (GenomicMappingData.xsd - GMD) to allow export and exchange of mapping data in a common lightweight XML document format. This schema represents the various types of data objects commonly described across mapping datasources and provides a mechanism for recording relationships between data objects. The schema is sufficiently generic to allow representation of any map type (for example genetic linkage maps, radiation hybrid maps, sequence maps and physical maps). It also provides mechanisms for recording data provenance and for cross referencing external datasources (including for example ENSEMBL, PubMed and Genbank). The schema is extensible via the inclusion of additional datatypes, which can be achieved by importing further schemas, e.g. a schema defining relationship types. We have built demonstration web services that export data from our ArkDB database according to the GMD schema, facilitating the integration of data retrieval into Taverna workflows. The data

  15. Genome Variation Map: a data repository of genome variations in BIG Data Center

    OpenAIRE

    Song, Shuhui; Tian, Dongmei; Li, Cuiping; Tang, Bixia; Dong, Lili; Xiao, Jingfa; Bao, Yiming; Zhao, Wenming; He, Hang; Zhang, Zhang

    2017-01-01

    Abstract The Genome Variation Map (GVM; http://bigd.big.ac.cn/gvm/) is a public data repository of genome variations. As a core resource in the BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, GVM is dedicated to collecting, integrating and visualizing genome variations for a wide range of species; it accepts submissions of different types of genome variations from all over the world and provides free open access to all publicly available data in support of worldwide research a...

  16. Radiation hybrid maps of the D-genome of Aegilops tauschii and their application in sequence assembly of large and complex plant genomes.

    Science.gov (United States)

    Kumar, Ajay; Seetan, Raed; Mergoum, Mohamed; Tiwari, Vijay K; Iqbal, Muhammad J; Wang, Yi; Al-Azzam, Omar; Šimková, Hana; Luo, Ming-Cheng; Dvorak, Jan; Gu, Yong Q; Denton, Anne; Kilian, Andrzej; Lazo, Gerard R; Kianian, Shahryar F

    2015-10-16

    The large and complex genome of bread wheat (Triticum aestivum L., ~17 Gb) requires high resolution genome maps with saturated marker scaffolds to anchor and orient BAC contigs/sequence scaffolds for whole genome assembly. Radiation hybrid (RH) mapping has proven to be an excellent tool for the development of such maps for it offers much higher and more uniform marker resolution across the length of the chromosome compared to genetic mapping and does not require marker polymorphism per se, as it is based on presence (retention) vs. absence (deletion) marker assay. In this study, a 178 line RH panel was genotyped with SSRs and DArT markers to develop the first high resolution RH maps of the entire D-genome of Ae. tauschii accession AL8/78. To confirm map order accuracy, the AL8/78-RH maps were compared with: 1) a DArT consensus genetic map constructed using more than 100 bi-parental populations, 2) a RH map of the D-genome of reference hexaploid wheat 'Chinese Spring', and 3) two SNP-based genetic maps, one with anchored D-genome BAC contigs and another with anchored D-genome sequence scaffolds. Using marker sequences, the RH maps were also anchored with a BAC contig based physical map and draft sequence of the D-genome of Ae. tauschii. A total of 609 markers were mapped to 503 unique positions on the seven D-genome chromosomes, with a total map length of 14,706.7 cR. The average distance between any two marker loci was 29.2 cR which corresponds to 2.1 cM or 9.8 Mb. The average mapping resolution across the D-genome was estimated to be 0.34 Mb (Mb/cR) or 0.07 cM (cM/cR). The RH maps showed almost perfect agreement with several published maps with regard to chromosome assignments of markers. The mean rank correlations between the position of markers on AL8/78 maps and the four published maps, ranged from 0.75 to 0.92, suggesting a good agreement in marker order. With 609 mapped markers, a total of 2481 deletions for the whole D-genome were detected with an average
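    The resolution figures in this abstract can be reproduced from the other numbers it reports. The sketch below assumes that the average inter-locus distance is the total map length divided by the number of unique positions; that reading is an assumption, chosen because it reproduces the reported 29.2 cR. The function names are hypothetical.

```python
# Figures quoted in the abstract.
TOTAL_MAP_CR = 14706.7   # total RH map length, in centiRays
UNIQUE_POSITIONS = 503   # unique marker positions on the seven chromosomes
MB_PER_INTERVAL = 9.8    # physical size of an average interval, in Mb
CM_PER_INTERVAL = 2.1    # genetic size of an average interval, in cM


def average_interval_cr(total_cr, positions):
    """Average distance between loci, assuming length / number of positions."""
    return total_cr / positions


def resolution_per_cr(size_per_interval, interval_cr):
    """Physical (Mb) or genetic (cM) distance represented by one centiRay."""
    return size_per_interval / interval_cr


if __name__ == "__main__":
    interval = average_interval_cr(TOTAL_MAP_CR, UNIQUE_POSITIONS)
    print(round(interval, 1))                                       # 29.2
    print(round(resolution_per_cr(MB_PER_INTERVAL, interval), 2))   # 0.34
    print(round(resolution_per_cr(CM_PER_INTERVAL, interval), 2))   # 0.07
```

    Under that assumption the three rounded values match the abstract: 29.2 cR per interval, 0.34 Mb/cR and 0.07 cM/cR.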

  17. Physical mapping in highly heterozygous genomes: a physical contig map of the Pinot Noir grapevine cultivar

    Directory of Open Access Journals (Sweden)

    Jurman Irena

    2010-03-01

    Full Text Available Abstract Background Most of the grapevine (Vitis vinifera L.) cultivars grown today are those selected centuries ago, even though grapevine is one of the most important fruit crops in the world. Grapevine has therefore not benefited from the advances in modern plant breeding nor more recently from those in molecular genetics and genomics: genes controlling important agronomic traits are practically unknown. A physical map is essential to positionally clone such genes and instrumental in a genome sequencing project. Results We report on the first whole genome physical map of grapevine built using high information content fingerprinting of 49,104 BAC clones from the cultivar Pinot Noir. Pinot Noir, as most grape varieties, is highly heterozygous at the sequence level. This resulted in the two allelic haplotypes sometimes assembling into separate contigs that had to be accommodated in the map framework or in local expansions of contig maps. We performed computer simulations to assess the effects of increasing levels of sequence heterozygosity on BAC fingerprint assembly and showed that the experimental assembly results are in full agreement with the theoretical expectations, given the heterozygosity levels reported for grape. The map is anchored to a dense linkage map consisting of 994 markers. 436 contigs are anchored to the genetic map, covering 342 of the 475 Mb that make up the grape haploid genome. Conclusions We have developed a resource that makes it possible to access the grapevine genome, opening the way to a new era both in grape genetics and breeding and in wine making. The effects of heterozygosity on the assembly have been analyzed and characterized by using several complementary approaches which could be easily transferred to the study of other genomes which present the same features.

  18. Genomic Characterization of DArT Markers Based on High-Density Linkage Analysis and Physical Mapping to the Eucalyptus Genome

    Science.gov (United States)

    Petroli, César D.; Sansaloni, Carolina P.; Carling, Jason; Steane, Dorothy A.; Vaillancourt, René E.; Myburg, Alexander A.; da Silva, Orzenil Bonfim; Pappas, Georgios Joannis; Kilian, Andrzej; Grattapaglia, Dario

    2012-01-01

    Diversity Arrays Technology (DArT) provides a robust, high throughput, cost-effective method to query thousands of sequence polymorphisms in a single assay. Despite the extensive use of this genotyping platform for numerous plant species, little is known regarding the sequence attributes and genome-wide distribution of DArT markers. We investigated the genomic properties of the 7,680 DArT marker probes of a Eucalyptus array, by sequencing them, constructing a high density linkage map and carrying out detailed physical mapping analyses to the Eucalyptus grandis reference genome. A consensus linkage map with 2,274 DArT markers anchored to 210 microsatellites and a framework map, with improved support for ordering, displayed extensive collinearity with the genome sequence. Only 1.4 Mbp of the 75 Mbp of still unplaced scaffold sequence was captured by 45 linkage mapped but physically unaligned markers to the 11 main Eucalyptus pseudochromosomes, providing compelling evidence for the quality and completeness of the current Eucalyptus genome assembly. A highly significant correspondence was found between the locations of DArT markers and predicted gene models, while most of the 89 DArT probes unaligned to the genome correspond to sequences likely absent in E. grandis, consistent with the pan-genomic feature of this multi-Eucalyptus species DArT array. These comprehensive linkage-to-physical mapping analyses provide novel data regarding the genomic attributes of DArT markers in plant genomes in general and for Eucalyptus in particular. DArT markers preferentially target the gene space and display a largely homogeneous distribution across the genome, thereby providing superb coverage for mapping and genome-wide applications in breeding and diversity studies. Data reported on these ubiquitous properties of DArT markers will be particularly valuable to researchers working on less-studied crop species who already count on DArT genotyping arrays but for which no reference

  19. Genomic characterization of DArT markers based on high-density linkage analysis and physical mapping to the Eucalyptus genome.

    Directory of Open Access Journals (Sweden)

    César D Petroli

    Full Text Available Diversity Arrays Technology (DArT) provides a robust, high throughput, cost-effective method to query thousands of sequence polymorphisms in a single assay. Despite the extensive use of this genotyping platform for numerous plant species, little is known regarding the sequence attributes and genome-wide distribution of DArT markers. We investigated the genomic properties of the 7,680 DArT marker probes of a Eucalyptus array, by sequencing them, constructing a high density linkage map and carrying out detailed physical mapping analyses to the Eucalyptus grandis reference genome. A consensus linkage map with 2,274 DArT markers anchored to 210 microsatellites and a framework map, with improved support for ordering, displayed extensive collinearity with the genome sequence. Only 1.4 Mbp of the 75 Mbp of still unplaced scaffold sequence was captured by 45 linkage mapped but physically unaligned markers to the 11 main Eucalyptus pseudochromosomes, providing compelling evidence for the quality and completeness of the current Eucalyptus genome assembly. A highly significant correspondence was found between the locations of DArT markers and predicted gene models, while most of the 89 DArT probes unaligned to the genome correspond to sequences likely absent in E. grandis, consistent with the pan-genomic feature of this multi-Eucalyptus species DArT array. These comprehensive linkage-to-physical mapping analyses provide novel data regarding the genomic attributes of DArT markers in plant genomes in general and for Eucalyptus in particular. DArT markers preferentially target the gene space and display a largely homogeneous distribution across the genome, thereby providing superb coverage for mapping and genome-wide applications in breeding and diversity studies. Data reported on these ubiquitous properties of DArT markers will be particularly valuable to researchers working on less-studied crop species who already count on DArT genotyping arrays but for

  20. Rare and common regulatory variation in population-scale sequenced human genomes.

    Directory of Open Access Journals (Sweden)

    Stephen B Montgomery

    2011-07-01

    Full Text Available Population-scale genome sequencing allows the characterization of functional effects of a broad spectrum of genetic variants underlying human phenotypic variation. Here, we investigate the influence of rare and common genetic variants on gene expression patterns, using variants identified from sequencing data from the 1000 genomes project in an African and European population sample and gene expression data from lymphoblastoid cell lines. We detect comparable numbers of expression quantitative trait loci (eQTLs) when compared to genotypes obtained from HapMap 3, but as many as 80% of the top expression quantitative trait variants (eQTVs) discovered from 1000 genomes data are novel. The properties of the newly discovered variants suggest that mapping common causal regulatory variants is challenging even with full resequencing data; however, we observe significant enrichment of regulatory effects in splice-site and nonsense variants. Using RNA sequencing data, we show that 46.2% of nonsynonymous variants are differentially expressed in at least one individual in our sample, creating widespread potential for interactions between functional protein-coding and regulatory variants. We also use allele-specific expression to identify putative rare causal regulatory variants. Furthermore, we demonstrate that outlier expression values can be due to rare variant effects, and we approximate the number of such effects harboured in an individual by effect size. Our results demonstrate that integration of genomic and RNA sequencing analyses allows for the joint assessment of genome sequence and genome function.

  1. High-resolution genetic maps of Eucalyptus improve Eucalyptus grandis genome assembly.

    Science.gov (United States)

    Bartholomé, Jérôme; Mandrou, Eric; Mabiala, André; Jenkins, Jerry; Nabihoudine, Ibouniyamine; Klopp, Christophe; Schmutz, Jeremy; Plomion, Christophe; Gion, Jean-Marc

    2015-06-01

    Genetic maps are key tools in genetic research as they constitute the framework for many applications, such as quantitative trait locus analysis, and support the assembly of genome sequences. The resequencing of the two parents of a cross between Eucalyptus urophylla and Eucalyptus grandis was used to design a single nucleotide polymorphism (SNP) array of 6000 markers evenly distributed along the E. grandis genome. The genotyping of 1025 offspring enabled the construction of two high-resolution genetic maps containing 1832 and 1773 markers with an average marker interval of 0.45 and 0.5 cM for E. grandis and E. urophylla, respectively. The comparison between genetic maps and the reference genome highlighted 85% of collinear regions. A total of 43 noncollinear regions and 13 nonsynthetic regions were detected and corrected in the new genome assembly. This improved version contains 4943 scaffolds totalling 691.3 Mb of which 88.6% were captured by the 11 chromosomes. The mapping data were also used to investigate the effect of population size and number of markers on linkage mapping accuracy. This study provides the most reliable linkage maps for Eucalyptus and version 2.0 of the E. grandis genome. © 2014 CIRAD. New Phytologist © 2014 New Phytologist Trust.

  2. Genome mapping and characterization of the Anopheles gambiae heterochromatin

    Directory of Open Access Journals (Sweden)

    Sharakhova Maria V

    2010-08-01

    Background: Heterochromatin plays an important role in chromosome function and gene regulation. Despite the availability of polytene chromosomes and genome sequence, the heterochromatin of the major malaria vector Anopheles gambiae has not been mapped and characterized. Results: To determine the extent of heterochromatin within the An. gambiae genome, genes were physically mapped to the euchromatin-heterochromatin transition zone of polytene chromosomes. The study found that a minimum of 232 genes reside in 16.6 Mb of mapped heterochromatin. Gene ontology analysis revealed that heterochromatin is enriched in genes with DNA-binding and regulatory activities. Immunostaining of the An. gambiae chromosomes with antibodies against Drosophila melanogaster heterochromatin protein 1 (HP1) and the nuclear envelope protein lamin Dm0 identified the major invariable sites of the proteins' localization in all regions of pericentric heterochromatin, diffuse intercalary heterochromatin, and euchromatic region 9C of the 2R arm, but not in the compact intercalary heterochromatin. To better understand the molecular differences among chromatin types, novel Bayesian statistical models were developed to analyze genome features. The study found that heterochromatin and euchromatin differ in gene density and the coverage of retroelements and segmental duplications. The pericentric heterochromatin had the highest coverage of retroelements and tandem repeats, while intercalary heterochromatin was enriched with segmental duplications. We also provide evidence that the diffuse intercalary heterochromatin has a higher coverage of DNA transposable elements, minisatellites, and satellites than does the compact intercalary heterochromatin. The investigation of the 42-Mb assembly of unmapped genomic scaffolds showed that it has molecular characteristics similar to cytologically mapped heterochromatin. Conclusions: Our results demonstrate that Anopheles polytene chromosomes

  3. Comparative mapping in intraspecific populations uncovers a high degree of macrosynteny between A- and B-genome diploid species of peanut

    Directory of Open Access Journals (Sweden)

    Guo Yufang

    2012-11-01

    when comparing the homoeologous linkage groups between A (A. duranensis) and B (A. batizocoi) genomes. Comparison of the A- and B-genome genetic linkage maps also showed a total of five inversions and one major reciprocal translocation between two pairs of chromosomes under our current mapping resolution. Conclusions: Our findings will contribute to understanding tetraploid peanut genome origin and evolution and eventually promote its genetic improvement. The newly developed EST-SSR markers will enrich current molecular marker resources in peanut.

  4. Experience from large scale use of the EuroGenomics custom SNP chip in cattle

    DEFF Research Database (Denmark)

    Boichard, Didier A; Boussaha, Mekki; Capitan, Aurélien

    2018-01-01

    This article presents the strategy to evaluate candidate mutations underlying QTL or responsible for genetic defects, based upon the design and large-scale use of the Eurogenomics custom SNP chip set up for bovine genomic selection. Some variants under study originated from mapping genetic defect...

  5. An XML transfer schema for exchange of genomic and genetic mapping data: implementation as a web service in a Taverna workflow

    Directory of Open Access Journals (Sweden)

    Law Andy

    2009-08-01

    Background: Genomic analysis, particularly for less well-characterized organisms, is greatly assisted by performing comparative analyses between different types of genome maps and across species boundaries. Various providers publish a plethora of on-line resources collating genome mapping data from a multitude of species. Datasources range in scale and scope from small bespoke resources for particular organisms, through larger web-resources containing data from multiple species, to large-scale bioinformatics resources providing access to data derived from genome projects for model and non-model organisms. The heterogeneity of information held in these resources reflects both the technologies used to generate the data and the target users of each resource. Currently there is no common information exchange standard or protocol to enable access and integration of these disparate resources. Consequently data integration and comparison must be performed in an ad hoc manner. Results: We have developed a simple generic XML schema (GenomicMappingData.xsd – GMD) to allow export and exchange of mapping data in a common lightweight XML document format. This schema represents the various types of data objects commonly described across mapping datasources and provides a mechanism for recording relationships between data objects. The schema is sufficiently generic to allow representation of any map type (for example genetic linkage maps, radiation hybrid maps, sequence maps and physical maps). It also provides mechanisms for recording data provenance and for cross referencing external datasources (including for example ENSEMBL, PubMed and Genbank). The schema is extensible via the inclusion of additional datatypes, which can be achieved by importing further schemas, e.g. a schema defining relationship types. We have built demonstration web services that export data from our ArkDB database according to the GMD schema, facilitating the integration of

  6. Single-molecule approach to bacterial genomic comparisons via optical mapping.

    Energy Technology Data Exchange (ETDEWEB)

    Zhou, Shiguo [Univ. Wisc.-Madison; Kile, A. [Univ. Wisc.-Madison; Bechner, M. [Univ. Wisc.-Madison; Kvikstad, E. [Univ. Wisc.-Madison; Deng, W. [Univ. Wisc.-Madison; Wei, J. [Univ. Wisc.-Madison; Severin, J. [Univ. Wisc.-Madison; Runnheim, R. [Univ. Wisc.-Madison; Churas, C. [Univ. Wisc.-Madison; Forrest, D. [Univ. Wisc.-Madison; Dimalanta, E. [Univ. Wisc.-Madison; Lamers, C. [Univ. Wisc.-Madison; Burland, V. [Univ. Wisc.-Madison; Blattner, F. R. [Univ. Wisc.-Madison; Schwartz, David C. [Univ. Wisc.-Madison

    2004-01-01

    Modern comparative genomics has been established, in part, by the sequencing and annotation of a broad range of microbial species. To gain further insights, new sequencing efforts are now dealing with the variety of strains or isolates that gives a species definition and range; however, this number vastly outstrips our ability to sequence them. Given the availability of a large number of microbial species, new whole genome approaches must be developed to fully leverage this information at the level of strain diversity that maximize discovery. Here, we describe how optical mapping, a single-molecule system, was used to identify and annotate chromosomal alterations between bacterial strains represented by several species. Since whole-genome optical maps are ordered restriction maps, sequenced strains of Shigella flexneri serotype 2a (2457T and 301), Yersinia pestis (CO 92 and KIM), and Escherichia coli were aligned as maps to identify regions of homology and to further characterize them as possible insertions, deletions, inversions, or translocations. Importantly, an unsequenced Shigella flexneri strain (serotype Y strain AMC[328Y]) was optically mapped and aligned with two sequenced ones to reveal one novel locus implicated in serotype conversion and several other loci containing insertion sequence elements or phage-related gene insertions. Our results suggest that genomic rearrangements and chromosomal breakpoints are readily identified and annotated against a prototypic sequenced strain by using the tools of optical mapping.

  7. A Chromosome-Scale Assembly of the Bactrocera cucurbitae Genome Provides Insight to the Genetic Basis of white pupae

    Directory of Open Access Journals (Sweden)

    Sheina B. Sim

    2017-06-01

    Genetic sexing strains (GSS) used in sterile insect technique (SIT) programs are textbook examples of how classical Mendelian genetics can be directly implemented in the management of agricultural insect pests. Although traditionally developed GSS are founded on single-locus, autosomal recessive traits, the genetic basis of these traits is largely unknown. With the advent of modern genomic techniques, the genetic basis of sexing traits in GSS can now be further investigated. This study is the first of its kind to integrate traditional genetic techniques with emerging genomics to characterize a GSS using the tephritid fruit fly pest Bactrocera cucurbitae as a model. These techniques include whole-genome sequencing, the development of a mapping population and linkage map, and quantitative trait analysis. The experiment designed to map the genetic sexing trait in B. cucurbitae, white pupae (wp), also enabled the generation of a chromosome-scale genome assembly by integrating the linkage map with the assembly. Quantitative trait loci analysis revealed SNP loci near position 42 Mb on chromosome 3 to be tightly linked to wp. Gene annotation and synteny analysis show a near perfect relationship between chromosomes in B. cucurbitae and Muller elements A–E in Drosophila melanogaster. This chromosome-scale genome assembly is complete, has high contiguity, was generated using minimal input DNA, and will be used to further characterize the genetic mechanisms underlying wp. Knowledge of the genetic basis of genetic sexing traits can be used to improve SIT in this species and expand it to other economically important Diptera.

  8. Alignment of Escherichia coli K12 DNA sequences to a genomic restriction map.

    Science.gov (United States)

    Rudd, K E; Miller, W; Ostell, J; Benson, D A

    1990-01-25

    We use the extensive published information describing the genome of Escherichia coli and new restriction map alignment software to align DNA sequence, genetic, and physical maps. Restriction map alignment software is used which considers restriction maps as strings analogous to DNA or protein sequences except that two values, enzyme name and DNA base address, are associated with each position on the string. The resulting alignments reveal a nearly linear relationship between the physical and genetic maps of the E. coli chromosome. Physical map comparisons with the 1976, 1980, and 1983 genetic maps demonstrate a better fit with the more recent maps. The results of these alignments are genomic kilobase coordinates, orientation and rank of the alignment that best fits the genetic data. A statistical measure based on extreme value distribution is applied to the alignments. Additional computer analyses allow us to estimate the accuracy of the published E. coli genomic restriction map, simulate rearrangements of the bacterial chromosome, and search for repetitive DNA. The procedures we used are general enough to be applicable to other genome mapping projects.
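    The representation described above, a restriction map treated as a string whose positions each carry an enzyme name and a DNA base address, can be sketched roughly as follows. This is only an illustrative sketch, not the alignment software from the record: the names `Site`, `restriction_map`, `matches_at` and `best_offset` are hypothetical, the scoring is a plain count of coinciding sites, and only the EcoRI and BamHI recognition sequences are standard facts.

```python
from typing import NamedTuple


class Site(NamedTuple):
    enzyme: str   # enzyme name, e.g. "EcoRI"
    address: int  # 0-based base address of the recognition site


# Recognition sequences of two well-known enzymes (illustrative subset).
RECOGNITION = {"EcoRI": "GAATTC", "BamHI": "GGATCC"}


def restriction_map(seq: str) -> list[Site]:
    """Return every recognition site found in seq, ordered by address."""
    sites = []
    for enzyme, motif in RECOGNITION.items():
        start = seq.find(motif)
        while start != -1:
            sites.append(Site(enzyme, start))
            start = seq.find(motif, start + 1)
    return sorted(sites, key=lambda s: s.address)


def matches_at(query: list[Site], genome: list[Site], offset: int, tol: int) -> int:
    """Count query sites that coincide with a same-enzyme genome site
    (within tol bases) once the query is shifted by offset."""
    return sum(
        any(g.enzyme == q.enzyme and abs(g.address - (q.address + offset)) <= tol
            for g in genome)
        for q in query
    )


def best_offset(query: list[Site], genome: list[Site], tol: int = 0) -> tuple[int, int]:
    """Try each offset that places the first query site on a same-enzyme
    genome site; return (offset, number of matching sites) for the best one."""
    best = (0, 0)
    if not query:
        return best
    for g in genome:
        if g.enzyme != query[0].enzyme:
            continue
        offset = g.address - query[0].address
        score = matches_at(query, genome, offset, tol)
        if score > best[1]:
            best = (offset, score)
    return best


# Toy genome with EcoRI sites at 4 and 24 and a BamHI site at 14.
genome_seq = "AAAA" + "GAATTC" + "CCCC" + "GGATCC" + "TTTT" + "GAATTC" + "AA"
genome_map = restriction_map(genome_seq)
fragment_map = restriction_map(genome_seq[10:])  # a sequenced fragment
print(best_offset(fragment_map, genome_map))     # (10, 2)
```

    Keeping the enzyme name and the address together in one record is what lets the map be compared position by position like a sequence, while a tolerance on the address allows for measurement error in a real physical map.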

  9. A gene-based linkage map for Bicyclus anynana butterflies allows for a comprehensive analysis of synteny with the lepidopteran reference genome.

    Directory of Open Access Journals (Sweden)

    Patrícia Beldade

    2009-02-01

    Lepidopterans (butterflies and moths) are a rich and diverse order of insects, which, despite their economic impact and unusual biological properties, are relatively underrepresented in terms of genomic resources. The genome of the silkworm Bombyx mori has been fully sequenced, but comparative lepidopteran genomics has been hampered by the scarcity of information for other species. This is especially striking for butterflies, even though they have diverse and derived phenotypes (such as color vision and wing color patterns) and are considered prime models for the evolutionary and developmental analysis of ecologically relevant, complex traits. We focus on Bicyclus anynana butterflies, a laboratory system for studying the diversification of novelties and serially repeated traits. With a panel of 12 small families and a biphasic mapping approach, we first assigned 508 expressed genes to segregation groups and then ordered 297 of them within individual linkage groups. We also coarsely mapped seven color pattern loci. This is the richest gene-based map available for any butterfly species and allowed for a broad-coverage analysis of synteny with the lepidopteran reference genome. Based on 462 pairs of mapped orthologous markers in Bi. anynana and Bo. mori, we observed strong conservation of gene assignment to chromosomes, but also evidence for numerous large- and small-scale chromosomal rearrangements. With gene collections growing for a variety of target organisms, the ability to place those genes in their proper genomic context is paramount. Methods to map expressed genes and to compare maps with relevant model systems are crucial to extend genomic-level analysis outside classical model species. Maps with gene-based markers are useful for comparative genomics and to resolve mapped genomic regions to a tractable number of candidate genes, especially if there is synteny with related model species. This is discussed in relation to the identification of

  10. Genome-wide SNP identification by high-throughput sequencing and selective mapping allows sequence assembly positioning using a framework genetic linkage map

    Directory of Open Access Journals (Sweden)

    Xu Xiangming

    2010-12-01

    Background: Determining the position and order of contigs and scaffolds from a genome assembly within an organism's genome remains a technical challenge in a majority of sequencing projects. In order to exploit contemporary technologies for DNA sequencing, we developed a strategy for whole genome single nucleotide polymorphism sequencing allowing the positioning of sequence contigs onto a linkage map using the bin mapping method. Results: The strategy was tested on a draft genome of the fungal pathogen Venturia inaequalis, the causal agent of apple scab, and further validated using sequence contigs derived from the diploid plant genome Fragaria vesca. Using our novel method we were able to anchor 70% and 92% of sequence assemblies for V. inaequalis and F. vesca, respectively, to genetic linkage maps. Conclusions: We demonstrated the utility of this approach by accurately determining the bin map positions of the majority of the large sequence contigs from each genome sequence and validated our method by mapping single sequence repeat markers derived from sequence contigs on a full mapping population.
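    The bin mapping idea mentioned in this record can be illustrated with a small sketch. It assumes the usual formulation of the method, in which a few selected progeny carry recombination breakpoints that give each map bin a distinct genotype signature, so that a SNP on a contig can be placed by matching its genotype pattern in those same progeny. The function name, bin labels, contig names and genotype codes below are all hypothetical and are not taken from the study.

```python
def assign_bin(pattern: str, bin_signatures: dict[str, str]) -> list[str]:
    """Return the bins whose signature is compatible with a SNP pattern.

    Each character is the genotype call of one selected individual
    ('A' or 'B'); '-' in the pattern marks a missing call and is
    treated as compatible with either genotype.
    """
    hits = []
    for name, signature in bin_signatures.items():
        if len(signature) != len(pattern):
            continue
        if all(p == "-" or p == s for p, s in zip(pattern, signature)):
            hits.append(name)
    return hits


# Hypothetical framework map: one signature per bin across four progeny.
bins = {
    "LG1:bin1": "AABB",
    "LG1:bin2": "ABBB",
    "LG2:bin1": "BBAA",
}

# Hypothetical SNP genotype patterns called from contig sequence.
contig_snps = {
    "contig_17": "ABBB",  # exact match to one bin
    "contig_42": "BB-A",  # one missing call, still a unique placement
    "contig_99": "ABAB",  # no compatible bin, left unanchored
}

placements = {c: assign_bin(p, bins) for c, p in contig_snps.items()}
for contig, hits in placements.items():
    print(contig, hits)
```

    A contig is anchored only when exactly one bin is returned; an empty or ambiguous result would leave it unplaced, which is one reason a share of an assembly may remain unanchored under such a scheme.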

  11. Definition of the zebrafish genome using flow cytometry and cytogenetic mapping

    Directory of Open Access Journals (Sweden)

    Zhou Yi

    2007-06-01

    Background: The zebrafish (Danio rerio) is an important vertebrate model organism for biomedical research. The syntenic conservation between the zebrafish and human genome allows one to investigate the function of human genes using the zebrafish model. To facilitate analysis of the zebrafish genome, genetic maps have been constructed and sequence annotation of a reference zebrafish genome is ongoing. However, the duplicative nature of teleost genomes, including the zebrafish, complicates accurate assembly and annotation of a representative genome sequence. Cytogenetic approaches provide "anchors" that can be integrated with accumulating genomic data. Results: Here, we cytogenetically define the zebrafish genome by first estimating the size of each linkage group (LG) chromosome using flow cytometry, followed by the cytogenetic mapping of 575 bacterial artificial chromosome (BAC) clones onto metaphase chromosomes. Of the 575 BAC clones, 544 clones localized to apparently unique chromosomal locations. 93.8% of these clones were assigned to a specific LG chromosome location using fluorescence in situ hybridization (FISH) and compared to the LG chromosome assignment reported in the zebrafish genome databases. Thirty-one BAC clones localized to multiple chromosomal locations in several different hybridization patterns. From these data, a refined second generation probe panel for each LG chromosome was also constructed. Conclusion: The chromosomal mapping of the 575 large-insert DNA clones allows for these clones to be integrated into existing zebrafish mapping data. An accurately annotated zebrafish reference genome serves as a valuable resource for investigating the molecular basis of human diseases using zebrafish mutant models.

  12. Using DNase Hi-C techniques to map global and local three-dimensional genome architecture at high resolution.

    Science.gov (United States)

    Ma, Wenxiu; Ay, Ferhat; Lee, Choli; Gulsoy, Gunhan; Deng, Xinxian; Cook, Savannah; Hesson, Jennifer; Cavanaugh, Christopher; Ware, Carol B; Krumm, Anton; Shendure, Jay; Blau, C Anthony; Disteche, Christine M; Noble, William S; Duan, ZhiJun

    2018-06-01

    The folding and three-dimensional (3D) organization of chromatin in the nucleus critically impacts genome function. The past decade has witnessed rapid advances in genomic tools for delineating 3D genome architecture. Among them, chromosome conformation capture (3C)-based methods such as Hi-C are the most widely used techniques for mapping chromatin interactions. However, traditional Hi-C protocols rely on restriction enzymes (REs) to fragment chromatin and are therefore limited in resolution. We recently developed DNase Hi-C for mapping 3D genome organization, which uses DNase I for chromatin fragmentation. DNase Hi-C overcomes RE-related limitations associated with traditional Hi-C methods, leading to improved methodological resolution. Furthermore, combining this method with DNA capture technology provides a high-throughput approach (targeted DNase Hi-C) that allows for mapping fine-scale chromatin architecture at exceptionally high resolution. Hence, targeted DNase Hi-C will be valuable for delineating the physical landscapes of cis-regulatory networks that control gene expression and for characterizing phenotype-associated chromatin 3D signatures. Here, we provide a detailed description of method design and step-by-step working protocols for these two methods. Copyright © 2018 Elsevier Inc. All rights reserved.

  13. Large-scale chromosome folding versus genomic DNA sequences: A discrete double Fourier transform technique.

    Science.gov (United States)

    Chechetkin, V R; Lobzin, V V

    2017-08-07

    The use of state-of-the-art techniques combining imaging methods and high-throughput genomic mapping tools has led to significant progress in detailing the chromosome architecture of various organisms. However, a gap still remains between the rapidly growing structural data on chromosome folding and large-scale genome organization. Could a part of the information on chromosome folding be obtained directly from the underlying genomic DNA sequences abundantly stored in the databanks? To answer this question, we developed an original discrete double Fourier transform (DDFT). DDFT serves for the detection of large-scale genome regularities associated with domains/units at the different levels of hierarchical chromosome folding. The method is versatile and can be applied to both genomic DNA sequences and corresponding physico-chemical parameters such as base-pairing free energy. The latter characteristic is closely related to replication and transcription and can also be used for the assessment of temperature or supercoiling effects on chromosome folding. We tested the method on the genome of E. coli K-12 and found good correspondence with the annotated domains/units established experimentally. As a brief illustration of further abilities of DDFT, a study of the large-scale genome organization of bacteriophage PHIX174 and the bacterium Caulobacter crescentus was also added. The combined experimental, modeling, and bioinformatic DDFT analysis should yield more complete knowledge of chromosome architecture and genome organization. Copyright © 2017 Elsevier Ltd. All rights reserved.

  14. Chromosomal mapping of canine-derived BAC clones to the red fox and American mink genomes.

    Science.gov (United States)

    Kukekova, Anna V; Vorobieva, Nadegda V; Beklemisheva, Violetta R; Johnson, Jennifer L; Temnykh, Svetlana V; Yudkin, Dmitry V; Trut, Lyudmila N; Andre, Catherine; Galibert, Francis; Aguirre, Gustavo D; Acland, Gregory M; Graphodatsky, Alexander S

    2009-01-01

    High-quality sequencing of the dog (Canis lupus familiaris) genome has enabled enormous progress in genetic mapping of canine phenotypic variation. The red fox (Vulpes vulpes), another canid species, also exhibits a wide range of variation in coat color, morphology, and behavior. Although the fox genome has not yet been sequenced, canine genomic resources have been used to construct a meiotic linkage map of the red fox genome and begin genetic mapping in foxes. However, a more detailed gene-specific comparative map between the dog and fox genomes is required to establish gene order within homologous regions of dog and fox chromosomes and to refine breakpoints between homologous chromosomes of the 2 species. In the current study, we tested whether canine-derived gene-containing bacterial artificial chromosome (BAC) clones can be routinely used to build a gene-specific map of the red fox genome. Forty canine BAC clones were mapped to the red fox genome by fluorescence in situ hybridization (FISH). Each clone was uniquely assigned to a single fox chromosome, and the locations of 38 clones agreed with cytogenetic predictions. These results clearly demonstrate the utility of FISH mapping for construction of a whole-genome gene-specific map of the red fox. The further possibility of using canine BAC clones to map genes in the American mink (Mustela vison) genome was also explored. Much lower success was obtained for this more distantly related farm-bred species, although a few BAC clones were mapped to the predicted chromosomal locations.

  15. Uniparental Inheritance Promotes Adaptive Evolution in Cytoplasmic Genomes.

    Science.gov (United States)

    Christie, Joshua R; Beekman, Madeleine

    2017-03-01

    Eukaryotes carry numerous asexual cytoplasmic genomes (mitochondria and plastids). Lacking recombination, asexual genomes should theoretically suffer from impaired adaptive evolution. Yet, empirical evidence indicates that cytoplasmic genomes experience higher levels of adaptive evolution than predicted by theory. In this study, we use a computational model to show that the unique biology of cytoplasmic genomes-specifically their organization into host cells and their uniparental (maternal) inheritance-enable them to undergo effective adaptive evolution. Uniparental inheritance of cytoplasmic genomes decreases competition between different beneficial substitutions (clonal interference), promoting the accumulation of beneficial substitutions. Uniparental inheritance also facilitates selection against deleterious cytoplasmic substitutions, slowing Muller's ratchet. In addition, uniparental inheritance generally reduces genetic hitchhiking of deleterious substitutions during selective sweeps. Overall, uniparental inheritance promotes adaptive evolution by increasing the level of beneficial substitutions relative to deleterious substitutions. When we assume that cytoplasmic genome inheritance is biparental, decreasing the number of genomes transmitted during gametogenesis (bottleneck) aids adaptive evolution. Nevertheless, adaptive evolution is always more efficient when inheritance is uniparental. Our findings explain empirical observations that cytoplasmic genomes-despite their asexual mode of reproduction-can readily undergo adaptive evolution. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  16. Polytene chromosomal maps of 11 Drosophila species: the order of genomic scaffolds inferred from genetic and physical maps.

    Science.gov (United States)

    Schaeffer, Stephen W; Bhutkar, Arjun; McAllister, Bryant F; Matsuda, Muneo; Matzkin, Luciano M; O'Grady, Patrick M; Rohde, Claudia; Valente, Vera L S; Aguadé, Montserrat; Anderson, Wyatt W; Edwards, Kevin; Garcia, Ana C L; Goodman, Josh; Hartigan, James; Kataoka, Eiko; Lapoint, Richard T; Lozovsky, Elena R; Machado, Carlos A; Noor, Mohamed A F; Papaceit, Montserrat; Reed, Laura K; Richards, Stephen; Rieger, Tania T; Russo, Susan M; Sato, Hajime; Segarra, Carmen; Smith, Douglas R; Smith, Temple F; Strelets, Victor; Tobari, Yoshiko N; Tomimura, Yoshihiko; Wasserman, Marvin; Watts, Thomas; Wilson, Robert; Yoshida, Kiyohito; Markow, Therese A; Gelbart, William M; Kaufman, Thomas C

    2008-07-01

    The sequencing of the 12 genomes of members of the genus Drosophila was taken as an opportunity to reevaluate the genetic and physical maps for 11 of the species, in part to aid in the mapping of assembled scaffolds. Here, we present an overview of the importance of cytogenetic maps to Drosophila biology and to the concepts of chromosomal evolution. Physical and genetic markers were used to anchor the genome assembly scaffolds to the polytene chromosomal maps for each species. In addition, a computational approach was used to anchor smaller scaffolds on the basis of the analysis of syntenic blocks. We present the chromosomal map data from each of the 11 sequenced non-Drosophila melanogaster species as a series of sections. Each section reviews the history of the polytene chromosome maps for each species, presents the new polytene chromosome maps, and anchors the genomic scaffolds to the cytological maps using genetic and physical markers. The mapping data agree with Muller's idea that the majority of Drosophila genes are syntenic. Despite the conservation of genes within homologous chromosome arms across species, the karyotypes of these species have changed through the fusion of chromosomal arms followed by subsequent rearrangement events.

  17. High-Throughput Sequencing and Linkage Mapping of a Clownfish Genome Provide Insights on the Distribution of Molecular Players Involved in Sex Change.

    Science.gov (United States)

    Casas, Laura; Saenz-Agudelo, Pablo; Irigoien, Xabier

    2018-03-06

    Clownfishes are an excellent model system for investigating the genetic mechanism governing hermaphroditism and socially-controlled sex change in their natural environment because they are broadly distributed and strongly site-attached. Genomic tools, such as genetic linkage maps, allow fine-mapping of loci involved in molecular pathways underlying these reproductive processes. In this study, a high-density genetic map of Amphiprion bicinctus was constructed with 3146 RAD markers in a full-sib family organized in 24 robust linkage groups which correspond to the haploid chromosome number of the species. The length of the map was 4294.71 cM, with an average marker interval of 1.38 cM. The clownfish linkage map showed various levels of conserved synteny and collinearity with the genomes of Asian and European seabass, Nile tilapia and stickleback. The map provided a platform to investigate the genomic position of genes with differential expression during sex change in A. bicinctus. This study aims to bridge the gap of genome-scale information for this iconic group of species to facilitate the study of the main gene regulatory networks governing social sex change and gonadal restructuring in protandrous hermaphrodites.
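    The map statistics quoted above can be checked with a short calculation. The sketch below assumes, as is common but not stated in the abstract, that the average marker interval is the total map length divided by the number of intervals between adjacent markers, that is, markers minus linkage groups; the function name is hypothetical.

```python
def average_marker_interval(map_length_cm: float, n_markers: int,
                            n_linkage_groups: int) -> float:
    """Average spacing between adjacent markers, in cM.

    Each linkage group with k markers has k - 1 intervals, so the whole
    map has n_markers - n_linkage_groups intervals (assuming every marker
    sits at a distinct position).
    """
    intervals = n_markers - n_linkage_groups
    if intervals <= 0:
        raise ValueError("need more markers than linkage groups")
    return map_length_cm / intervals


# Figures reported in the abstract: 4294.71 cM, 3146 markers, 24 groups.
per_interval = average_marker_interval(4294.71, 3146, 24)
per_marker = 4294.71 / 3146

print(round(per_interval, 2))  # 1.38
print(round(per_marker, 2))    # 1.37
```

    Under this assumption the per-interval figure reproduces the reported 1.38 cM, whereas simply dividing by the marker count gives 1.37 cM, so the reported value appears consistent with the interval-based definition.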

  18. High-Throughput Sequencing and Linkage Mapping of a Clownfish Genome Provide Insights on the Distribution of Molecular Players Involved in Sex Change

    KAUST Repository

    Casas, Laura

    2018-02-28

    Clownfishes are an excellent model system for investigating the genetic mechanism governing hermaphroditism and socially-controlled sex change in their natural environment because they are broadly distributed and strongly site-attached. Genomic tools, such as genetic linkage maps, allow fine-mapping of loci involved in molecular pathways underlying these reproductive processes. In this study, a high-density genetic map of Amphiprion bicinctus was constructed with 3146 RAD markers in a full-sib family organized in 24 robust linkage groups which correspond to the haploid chromosome number of the species. The length of the map was 4294.71 cM, with an average marker interval of 1.38 cM. The clownfish linkage map showed various levels of conserved synteny and collinearity with the genomes of Asian and European seabass, Nile tilapia and stickleback. The map provided a platform to investigate the genomic position of genes with differential expression during sex change in A. bicinctus. This study aims to bridge the gap of genome-scale information for this iconic group of species to facilitate the study of the main gene regulatory networks governing social sex change and gonadal restructuring in protandrous hermaphrodites.

  19. Radiation hybrid mapping as one of the main methods of the creation of high resolution maps of human and animal genomes

    International Nuclear Information System (INIS)

    Sulimova, G.E.; Kompanijtsev, A.A.; Mojsyak, E.V.; Rakhmanaliev, Eh.R.; Klimov, E.A.; Udina, I.G.; Zakharov, I.A.

    2000-01-01

    Radiation hybrid mapping (RH mapping) is considered one of the main methods of constructing physical maps of mammalian genomes. In the introduction, the theoretical prerequisites for the development of RH mapping and the statistical methods of data analysis are discussed. Comparative characteristics of universal commercial panels of radiation hybrid somatic cells (RH panels) are shown. In the experimental part of the work, RH mapping is used to localize nucleotide sequences adjacent to Not I sites of human chromosome 3, with the aim of integrating the contig map of Not I clones into comprehensive maps of the human genome. Five nucleotide sequences adjacent to the sites of integration of papillomavirus in the human genome and expressed in cervical cancer cells were localized. It is demonstrated that the region 13q14.3-q21.1 is enriched with nucleotide sequences involved in the processes of carcinogenesis. RH mapping can be considered one of the most promising applications of modern radiation biology in the field of molecular genetics, that is, in constructing high-resolution physical maps of mammalian genomes [ru]

  20. Genome-scale neurogenetics: methodology and meaning.

    Science.gov (United States)

    McCarroll, Steven A; Feng, Guoping; Hyman, Steven E

    2014-06-01

    Genetic analysis is currently offering glimpses into molecular mechanisms underlying such neuropsychiatric disorders as schizophrenia, bipolar disorder and autism. After years of frustration, success in identifying disease-associated DNA sequence variation has followed from new genomic technologies, new genome data resources, and global collaborations that could achieve the scale necessary to find the genes underlying highly polygenic disorders. Here we describe early results from genome-scale studies of large numbers of subjects and the emerging significance of these results for neurobiology.

  1. Public attitudes to the promotion of genomic crop studies in Japan: correlations between genomic literacy, trust, and favourable attitude.

    Science.gov (United States)

    Ishiyama, Izumi; Tanzawa, Tetsuro; Watanabe, Maiko; Maeda, Tadahiko; Muto, Kaori; Tamakoshi, Akiko; Nagai, Akiko; Yamagata, Zentaro

    2012-05-01

    This study aimed to assess public attitudes in Japan to the promotion of genomic selection in crop studies and to examine associated factors. We analysed data from a nationwide opinion survey. A total of 4,000 people were selected from the Japanese general population by a stratified two-phase sampling method, and 2,171 people participated by post; this survey asked about the pros and cons of crop-related genomic studies promotion, examined people's scientific literacy in genomics, and investigated factors thought to be related to genomic literacy and attitude. The relationships were examined using logistic regression models stratified by gender. Survey results showed that 50.0% of respondents approved of the promotion of crop-related genomic studies, while 6.7% disapproved. No correlation was found between literacy and attitude towards promotion. Trust in experts, belief in science, an interest in genomic studies and willingness to purchase new products correlated with a positive attitude towards crop-related genomic studies.

  2. Comparative BAC-based mapping in the white-throated sparrow, a novel behavioral genomics model, using interspecies overgo hybridization

    Directory of Open Access Journals (Sweden)

    Gonser Rusty A

    2011-06-01

    Background: The genomics era has produced an arsenal of resources from sequenced organisms, allowing researchers to target species that do not have comparable mapping and sequence information. These new "non-model" organisms offer unique opportunities to examine environmental effects on genomic patterns and processes. Here we use comparative mapping as a first step in characterizing the genome organization of a novel animal model, the white-throated sparrow (Zonotrichia albicollis), which occurs as white or tan morphs that exhibit alternative behaviors and physiology. Morph is determined by the presence or absence of a complex chromosomal rearrangement. This species is an ideal model for behavioral genomics because the association between genotype and phenotype is absolute, making it possible to identify the genomic bases of phenotypic variation. Findings: We initiated a genomic study in this species by characterizing the white-throated sparrow BAC library via filter hybridization with overgo probes designed for the chicken, turkey, and zebra finch. Cross-species hybridization resulted in 640 positive sparrow BACs assigned to 77 chicken loci across almost all macro- and microchromosomes, with a focus on the chromosomes associated with morph. Out of 216 overgos, 36% of the probes hybridized successfully, with an average number of 3.0 positive sparrow BACs per overgo. Conclusions: These data will be utilized for determining chromosomal architecture and for fine-scale mapping of candidate genes associated with phenotypic differences. Our research confirms the utility of interspecies hybridization for developing comparative maps in other non-model organisms.

  3. Considerations for creating and annotating the budding yeast Genome Map at SGD: a progress report.

    Science.gov (United States)

    Chan, Esther T; Cherry, J Michael

    2012-01-01

    The Saccharomyces Genome Database (SGD) is compiling and annotating a comprehensive catalogue of functional sequence elements identified in the budding yeast genome. Recent advances in deep sequencing technologies have enabled, for example, global analyses of transcription profiling and the assembly of maps of transcription factor occupancy and higher-order chromatin organization at nucleotide-level resolution. This growing influx of published genome-scale data brings new challenges for storage, display, analysis and integration. Here, we describe SGD's progress in the creation of a consolidated resource for genome sequence elements in the budding yeast, the considerations taken in its design and the lessons learned thus far. The data within this collection can be accessed at http://browse.yeastgenome.org and downloaded from http://downloads.yeastgenome.org. DATABASE URL: http://www.yeastgenome.org.

  4. Promoting synergistic research and education in genomics and bioinformatics.

    Science.gov (United States)

    Yang, Jack Y; Yang, Mary Qu; Zhu, Mengxia Michelle; Arabnia, Hamid R; Deng, Youping

    2008-01-01

    scientific achievements by bridging these two very important disciplines into an interactive and attractive forum. Keeping this objective in mind, Biocomp 2007 aims to promote interdisciplinary and multidisciplinary education and research. Twenty-five high-quality peer-reviewed papers were selected from 400+ submissions for this supplementary issue of BMC Genomics. Those papers contributed to a wide range of important research fields including gene expression data analysis and applications, high-throughput genome mapping, sequence analysis, gene regulation, protein structure prediction, disease prediction by machine learning techniques, systems biology, and database and biological software development. We always encourage participants to submit proposals for genomics sessions, special interest research sessions, workshops and tutorials to Professor Hamid R. Arabnia (hra@cs.uga.edu) in order to ensure that Biocomp continuously plays the leadership role in promoting inter/multidisciplinary research and education in the fields. Biocomp received top conference ranking with a high score of 0.95/1.00. Biocomp is academically co-sponsored by the International Society of Intelligent Biological Medicine and the Research Laboratories and Centers of Harvard University--Massachusetts Institute of Technology, Indiana University--Purdue University, Georgia Tech--Emory University, UIUC, UCLA, Columbia University, University of Texas at Austin and University of Iowa etc. Biocomp--Worldcomp brings leading scientists together across the nation and all over the world and aims to promote synergistic components such as keynote lectures, special interest sessions, workshops and tutorials in response to the advances of cutting-edge research.

  5. Uniparental Inheritance Promotes Adaptive Evolution in Cytoplasmic Genomes

    Science.gov (United States)

    Christie, Joshua R.; Beekman, Madeleine

    2017-01-01

    Eukaryotes carry numerous asexual cytoplasmic genomes (mitochondria and plastids). Lacking recombination, asexual genomes should theoretically suffer from impaired adaptive evolution. Yet, empirical evidence indicates that cytoplasmic genomes experience higher levels of adaptive evolution than predicted by theory. In this study, we use a computational model to show that the unique biology of cytoplasmic genomes, specifically their organization into host cells and their uniparental (maternal) inheritance, enables them to undergo effective adaptive evolution. Uniparental inheritance of cytoplasmic genomes decreases competition between different beneficial substitutions (clonal interference), promoting the accumulation of beneficial substitutions. Uniparental inheritance also facilitates selection against deleterious cytoplasmic substitutions, slowing Muller’s ratchet. In addition, uniparental inheritance generally reduces genetic hitchhiking of deleterious substitutions during selective sweeps. Overall, uniparental inheritance promotes adaptive evolution by increasing the level of beneficial substitutions relative to deleterious substitutions. When we assume that cytoplasmic genome inheritance is biparental, decreasing the number of genomes transmitted during gametogenesis (bottleneck) aids adaptive evolution. Nevertheless, adaptive evolution is always more efficient when inheritance is uniparental. Our findings explain empirical observations that cytoplasmic genomes, despite their asexual mode of reproduction, can readily undergo adaptive evolution. PMID:28025277

  6. An extended anchored linkage map and virtual mapping for the american mink genome based on homology to human and dog

    DEFF Research Database (Denmark)

    Anistoroaei, Razvan Marian; Ansari, S.; Farid, A.

    2009-01-01

    hybridization (FISH) and/or by means of human/dog/mink comparative homology. The average interval between markers is 8.5 cM and the linkage groups collectively span 1340 cM. In addition, 217 and 275 mink microsatellites have been placed on human and dog genomes, respectively. In conjunction with the existing...... comparative human/dog/mink data, these assignments represent useful virtual maps for the American mink genome. Comparison of the current human/dog assembled sequential map with the existing Zoo-FISH-based human/dog/mink maps helped to refine the human/dog/mink comparative map. Furthermore, comparison...... of the human and dog genome assemblies revealed a number of large synteny blocks, some of which are corroborated by data from the mink linkage map....

  7. A Genomic Map of the Effects of Linked Selection in Drosophila.

    Directory of Open Access Journals (Sweden)

    Eyal Elyashiv

    2016-08-01

    Natural selection at one site shapes patterns of genetic variation at linked sites. Quantifying the effects of "linked selection" on levels of genetic diversity is key to making reliable inference about demography, building a null model in scans for targets of adaptation, and learning about the dynamics of natural selection. Here, we introduce the first method that jointly infers parameters of distinct modes of linked selection, notably background selection and selective sweeps, from genome-wide diversity data, functional annotations and genetic maps. The central idea is to calculate the probability that a neutral site is polymorphic given local annotations, substitution patterns, and recombination rates. Information is then combined across sites and samples using composite likelihood in order to estimate genome-wide parameters of distinct modes of selection. In addition to parameter estimation, this approach yields a map of the expected neutral diversity levels along the genome. To illustrate the utility of our approach, we apply it to genome-wide resequencing data from 125 lines in Drosophila melanogaster and reliably predict diversity levels at the 1Mb scale. Our results corroborate estimates of a high fraction of beneficial substitutions in proteins and untranslated regions (UTR). They allow us to distinguish between the contribution of sweeps and other modes of selection around amino acid substitutions and to uncover evidence for pervasive sweeps in untranslated regions (UTRs). Our inference further suggests a substantial effect of other modes of linked selection and of adaptation in particular. More generally, we demonstrate that linked selection has had a larger effect in reducing diversity levels and increasing their variance in D. melanogaster than previously appreciated.

  8. A Genomic Map of the Effects of Linked Selection in Drosophila.

    Science.gov (United States)

    Elyashiv, Eyal; Sattath, Shmuel; Hu, Tina T; Strutsovsky, Alon; McVicker, Graham; Andolfatto, Peter; Coop, Graham; Sella, Guy

    2016-08-01

    Natural selection at one site shapes patterns of genetic variation at linked sites. Quantifying the effects of "linked selection" on levels of genetic diversity is key to making reliable inference about demography, building a null model in scans for targets of adaptation, and learning about the dynamics of natural selection. Here, we introduce the first method that jointly infers parameters of distinct modes of linked selection, notably background selection and selective sweeps, from genome-wide diversity data, functional annotations and genetic maps. The central idea is to calculate the probability that a neutral site is polymorphic given local annotations, substitution patterns, and recombination rates. Information is then combined across sites and samples using composite likelihood in order to estimate genome-wide parameters of distinct modes of selection. In addition to parameter estimation, this approach yields a map of the expected neutral diversity levels along the genome. To illustrate the utility of our approach, we apply it to genome-wide resequencing data from 125 lines in Drosophila melanogaster and reliably predict diversity levels at the 1Mb scale. Our results corroborate estimates of a high fraction of beneficial substitutions in proteins and untranslated regions (UTR). They allow us to distinguish between the contribution of sweeps and other modes of selection around amino acid substitutions and to uncover evidence for pervasive sweeps in untranslated regions (UTRs). Our inference further suggests a substantial effect of other modes of linked selection and of adaptation in particular. More generally, we demonstrate that linked selection has had a larger effect in reducing diversity levels and increasing their variance in D. melanogaster than previously appreciated.

  9. Structural variation in two human genomes mapped at single-nucleotide resolution by whole genome de novo assembly

    DEFF Research Database (Denmark)

    Li, Yingrui; Zheng, Hancheng; Luo, Ruibang

    2011-01-01

    Here we use whole-genome de novo assembly of second-generation sequencing reads to map structural variation (SV) in an Asian genome and an African genome. Our approach identifies small- and intermediate-size homozygous variants (1-50 kb) including insertions, deletions, inversions and their precise...

  10. BFAST: an alignment tool for large scale genome resequencing.

    Directory of Open Access Journals (Sweden)

    Nils Homer

    2009-11-01

    The new generation of massively parallel DNA sequencers, combined with the challenge of whole human genome resequencing, results in the need for rapid and accurate alignment of billions of short DNA sequence reads to a large reference genome. Speed is obviously of great importance, but equally important is maintaining alignment accuracy of short reads, in the 25-100 base range, in the presence of errors and true biological variation. We introduce a new algorithm specifically optimized for this task, as well as a freely available implementation, BFAST, which can align data produced by any of the current sequencing platforms, allows for user-customizable levels of speed and accuracy, supports paired-end data, and provides for efficient parallel and multi-threaded computation on a computer cluster. The new method is based on creating flexible, efficient whole-genome indexes to rapidly map reads to candidate alignment locations, with arbitrary multiple independent indexes allowed to achieve robustness against read errors and sequence variants. The final local alignment uses a Smith-Waterman method, with gaps to support the detection of small indels. We compare BFAST to a selection of large-scale alignment tools (BLAT, MAQ, SHRiMP, and SOAP) in terms of both speed and accuracy, using simulated and real-world datasets. We show BFAST can achieve substantially greater sensitivity of alignment in the context of errors and true variants, especially insertions and deletions, and minimize false mappings, while maintaining adequate speed compared to other current methods. We show BFAST can align the amount of data needed to fully resequence a human genome, one billion reads, with high sensitivity and accuracy, on a modest computer cluster in less than 24 hours. BFAST is available at http://bfast.sourceforge.net.
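
The gapped Smith-Waterman step mentioned in this abstract can be illustrated with a minimal sketch. This is a textbook local alignment with a linear gap penalty, not BFAST's own implementation; the function name `smith_waterman` and the scoring values (match 2, mismatch -1, gap -2) are assumptions chosen only for illustration.

```python
def smith_waterman(read, ref, match=2, mismatch=-1, gap=-2):
    """Best local alignment of `read` against `ref` (linear gap penalty).

    Returns (score, aligned_read, aligned_ref). Scoring values are
    illustrative assumptions, not parameters taken from BFAST.
    """
    rows, cols = len(read) + 1, len(ref) + 1
    # H[i][j] = best score of a local alignment ending at read[i-1], ref[j-1]
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if read[i - 1] == ref[j - 1] else mismatch
            H[i][j] = max(0,
                          H[i - 1][j - 1] + sub,  # match or mismatch
                          H[i - 1][j] + gap,      # gap in the reference row
                          H[i][j - 1] + gap)      # gap in the read row
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero cell is reached.
    i, j = best_pos
    out_read, out_ref = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        sub = match if read[i - 1] == ref[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + sub:
            out_read.append(read[i - 1])
            out_ref.append(ref[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_read.append(read[i - 1])
            out_ref.append("-")
            i -= 1
        else:
            out_read.append("-")
            out_ref.append(ref[j - 1])
            j -= 1
    return best, "".join(reversed(out_read)), "".join(reversed(out_ref))


if __name__ == "__main__":
    # A read carrying one extra base relative to the reference (a small indel).
    print(smith_waterman("ACGTTGCA", "ACGTGCA"))
```

The table fill is quadratic in the two sequence lengths, which is why an aligner of this kind would run it only on the short candidate windows that the index lookup returns, rather than on the whole genome.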

  11. A LDA-based approach to promoting ranking diversity for genomics information retrieval.

    Science.gov (United States)

    Chen, Yan; Yin, Xiaoshi; Li, Zhoujun; Hu, Xiaohua; Huang, Jimmy Xiangji

    2012-06-11

    The biomedical domain holds immense amounts of data, and the number of genomics and biomedical publications is growing rapidly. This wealth of information has led to increasing interest in, and need for, information retrieval (IR) techniques to access the scientific literature in genomics and related biomedical disciplines. In many cases, the information a biologist's query seeks is a list of entities of a certain type covering different aspects related to the question, such as cells, genes, diseases, proteins, mutations, etc. Hence, it is important for a biomedical IR system to be able to provide relevant and diverse answers to fulfill biologists' information needs. However, traditional IR models consider only the relevance between retrieved documents and the user query, and do not take redundancy between retrieved documents into account. This leads to high redundancy and low diversity in the ranked retrieval lists. In this paper, we propose an approach that employs a generative topic model, Latent Dirichlet Allocation (LDA), to promote ranking diversity for biomedical information retrieval. Unlike other approaches or models, which consider aspects at the word level, our approach assumes that aspects should be identified by the topics of retrieved documents. We use the LDA model to discover the topic distribution of retrieved passages and the word distribution of each topic dimension, and then re-rank retrieval results by topic distribution similarity between passages within a sliding window of size N. We evaluate our approach on the TREC 2007 Genomics collection and two distinct IR baseline runs, achieving an 8% improvement over the highest Aspect MAP reported in the TREC 2007 Genomics track. The proposed method is the first study to adopt a topic model for genomics information retrieval, and it demonstrates its effectiveness in promoting ranking diversity as well as in improving the relevance of ranked lists of genomics search results.
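
The idea of re-ranking by topic similarity within a sliding window can be sketched as follows. This is a hypothetical greedy re-ranker, not the authors' algorithm: the names `rerank`, `window`, and `penalty`, the use of cosine similarity, and the subtractive penalty are all assumptions, and the topic distributions are taken as given (for example, from a previously fitted LDA model).

```python
import math


def cosine(u, v):
    """Cosine similarity between two topic-distribution vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rerank(passages, window=2, penalty=0.5):
    """Greedily re-rank passages, penalizing topical redundancy.

    passages: list of (passage_id, relevance, topic_distribution).
    At each step, a candidate's relevance is reduced by `penalty` times its
    highest cosine similarity to the last `window` passages already ranked.
    All parameters are illustrative assumptions.
    """
    remaining = list(passages)
    ranked = []
    while remaining:
        # Only the most recent `window` picks count as "already covered".
        recent = ranked[-window:] if window > 0 else []

        def adjusted(p):
            redundancy = max((cosine(p[2], q[2]) for q in recent), default=0.0)
            return p[1] - penalty * redundancy

        best = max(remaining, key=adjusted)
        remaining.remove(best)
        ranked.append(best)
    return [p[0] for p in ranked]


if __name__ == "__main__":
    # p1 and p2 cover nearly the same topic; p3 covers a different one.
    passages = [
        ("p1", 0.90, [0.90, 0.10]),
        ("p2", 0.85, [0.88, 0.12]),
        ("p3", 0.80, [0.10, 0.90]),
    ]
    print(rerank(passages))
```

With the penalty set to zero the original relevance order is returned unchanged, so the parameter acts as the trade-off between relevance and diversity.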

  12. Comprehensive Mapping of Pluripotent Stem Cell Metabolism Using Dynamic Genome-Scale Network Modeling

    Directory of Open Access Journals (Sweden)

    Sriram Chandrasekaran

    2017-12-01

    Summary: Metabolism is an emerging stem cell hallmark tied to cell fate, pluripotency, and self-renewal, yet systems-level understanding of stem cell metabolism has been limited by the lack of genome-scale network models. Here, we develop a systems approach to integrate time-course metabolomics data with a computational model of metabolism to analyze the metabolic state of naive and primed murine pluripotent stem cells. Using this approach, we find that one-carbon metabolism involving phosphoglycerate dehydrogenase, folate synthesis, and nucleotide synthesis is a key pathway that differs between the two states, resulting in differential sensitivity to anti-folates. The model also predicts that the pluripotency factor Lin28 regulates this one-carbon metabolic pathway, which we validate using metabolomics data from Lin28-deficient cells. Moreover, we identify and validate metabolic reactions related to S-adenosyl-methionine production that can differentially impact histone methylation in naive and primed cells. Our network-based approach provides a framework for characterizing metabolic changes influencing pluripotency and cell fate. Chandrasekaran et al. use computational modeling, metabolomics, and metabolic inhibitors to discover metabolic differences between various pluripotent stem cell states and infer their impact on stem cell fate decisions. Keywords: systems biology, stem cell biology, metabolism, genome-scale modeling, pluripotency, histone methylation, naive (ground state), primed state, cell fate, metabolic network

  13. Physical mapping of a large plant genome using global high-information-content-fingerprinting: the distal region of the wheat ancestor Aegilops tauschii chromosome 3DS

    Directory of Open Access Journals (Sweden)

    You Frank M

    2010-06-01

    Background: Physical maps employing libraries of bacterial artificial chromosome (BAC) clones are essential for comparative genomics and sequencing of large and repetitive genomes such as those of the hexaploid bread wheat. The diploid ancestor of the D-genome of hexaploid wheat (Triticum aestivum), Aegilops tauschii, is used as a resource for wheat genomics. The barley diploid genome also provides a good model for the Triticeae and T. aestivum since it is only slightly larger than the ancestor wheat D genome. Gene co-linearity between the grasses can be exploited by extrapolating from rice and Brachypodium distachyon to Ae. tauschii or barley, and then to wheat. Results: We report the use of Ae. tauschii for the construction of the physical map of a large distal region of chromosome arm 3DS. A physical map of 25.4 Mb was constructed by anchoring BAC clones of Ae. tauschii with 85 EST on the Ae. tauschii and barley genetic maps. The 24 contigs were aligned to the rice and B. distachyon genomic sequences and a high density SNP genetic map of barley. As expected, the mapped region is highly collinear to the orthologous chromosome 1 in rice, chromosome 2 in B. distachyon and chromosome 3H in barley. However, the chromosome scale of the comparative maps presented provides new insights into grass genome organization. The disruptions of the Ae. tauschii-rice and Ae. tauschii-Brachypodium syntenies were identical. We observed chromosomal rearrangements between Ae. tauschii and barley. The comparison of Ae. tauschii physical and genetic maps showed that the recombination rate across the region dropped from 2.19 cM/Mb in the distal region to 0.09 cM/Mb in the proximal region. The size of the gaps between contigs was evaluated by comparing the recombination rate along the map with the local recombination rates calculated on single contigs. Conclusions: The physical map reported here is the first physical map using fingerprinting of a complete

  14. A meiotic linkage map of the silver fox, aligned and compared to the canine genome.

    Science.gov (United States)

    Kukekova, Anna V; Trut, Lyudmila N; Oskina, Irina N; Johnson, Jennifer L; Temnykh, Svetlana V; Kharlamova, Anastasiya V; Shepeleva, Darya V; Gulievich, Rimma G; Shikhevich, Svetlana G; Graphodatsky, Alexander S; Aguirre, Gustavo D; Acland, Gregory M

    2007-03-01

    A meiotic linkage map is essential for mapping traits of interest and is often the first step toward understanding a cryptic genome. Specific strains of silver fox (a variant of the red fox, Vulpes vulpes), which segregate behavioral and morphological phenotypes, create a need for such a map. One such strain, selected for docility, exhibits friendly dog-like responses to humans, in contrast to another strain selected for aggression. Development of a fox map is facilitated by the known cytogenetic homologies between the dog and fox, and by the availability of high resolution canine genome maps and sequence data. Furthermore, the high genomic sequence identity between dog and fox allows adaptation of canine microsatellites for genotyping and meiotic mapping in foxes. Using 320 such markers, we have constructed the first meiotic linkage map of the fox genome. The resulting sex-averaged map covers 16 fox autosomes and the X chromosome with an average inter-marker distance of 7.5 cM. The total map length corresponds to 1480.2 cM. From comparison of sex-averaged meiotic linkage maps of the fox and dog genomes, suppression of recombination in pericentromeric regions of the metacentric fox chromosomes was apparent, relative to the corresponding segments of acrocentric dog chromosomes. Alignment of the fox meiotic map against the 7.6x canine genome sequence revealed high conservation of marker order between homologous regions of the two species. The fox meiotic map provides a critical tool for genetic studies in foxes and identification of genetic loci and genes implicated in fox domestication.

  15. Ensembl Genomes 2013: scaling up access to genome-wide data.

    Science.gov (United States)

    Kersey, Paul Julian; Allen, James E; Christensen, Mikkel; Davis, Paul; Falin, Lee J; Grabmueller, Christoph; Hughes, Daniel Seth Toney; Humphrey, Jay; Kerhornou, Arnaud; Khobova, Julia; Langridge, Nicholas; McDowall, Mark D; Maheswari, Uma; Maslen, Gareth; Nuhn, Michael; Ong, Chuang Kee; Paulini, Michael; Pedro, Helder; Toneva, Iliana; Tuli, Mary Ann; Walts, Brandon; Williams, Gareth; Wilson, Derek; Youens-Clark, Ken; Monaco, Marcela K; Stein, Joshua; Wei, Xuehong; Ware, Doreen; Bolser, Daniel M; Howe, Kevin Lee; Kulesha, Eugene; Lawson, Daniel; Staines, Daniel Michael

    2014-01-01

    Ensembl Genomes (http://www.ensemblgenomes.org) is an integrating resource for genome-scale data from non-vertebrate species. The project exploits and extends technologies for genome annotation, analysis and dissemination, developed in the context of the vertebrate-focused Ensembl project, and provides a complementary set of resources for non-vertebrate species through a consistent set of programmatic and interactive interfaces. These provide access to data including reference sequence, gene models, transcriptional data, polymorphisms and comparative analysis. This article provides an update to the previous publications about the resource, with a focus on recent developments. These include the addition of important new genomes (and related data sets) including crop plants, vectors of human disease and eukaryotic pathogens. In addition, the resource has scaled up its representation of bacterial genomes, and now includes the genomes of over 9000 bacteria. Specific extensions to the web and programmatic interfaces have been developed to support users in navigating these large data sets. Looking forward, analytic tools to allow targeted selection of data for visualization and download are likely to become increasingly important in future as the number of available genomes increases within all domains of life, and some of the challenges faced in representing bacterial data are likely to become commonplace for eukaryotes in future.

  16. Improved evidence-based genome-scale metabolic models for maize leaf, embryo, and endosperm

    Energy Technology Data Exchange (ETDEWEB)

    Seaver, Samuel M. D.; Bradbury, Louis M. T.; Frelin, Océane; Zarecki, Raphy; Ruppin, Eytan; Hanson, Andrew D.; Henry, Christopher S.

    2015-03-10

    There is a growing demand for genome-scale metabolic reconstructions for plants, fueled by the need to understand the metabolic basis of crop yield and by progress in genome and transcriptome sequencing. Methods are also required to enable the interpretation of plant transcriptome data to study how cellular metabolic activity varies under different growth conditions or even within different organs, tissues, and developmental stages. Such methods depend extensively on the accuracy with which genes have been mapped to the biochemical reactions in the plant metabolic pathways. Errors in these mappings lead to metabolic reconstructions with an inflated number of reactions and possible generation of unreliable metabolic phenotype predictions. Here we introduce a new evidence-based genome-scale metabolic reconstruction of maize, with significant improvements in the quality of the gene-reaction associations included within our model. We also present a new approach for applying our model to predict active metabolic genes based on transcriptome data. This method includes a minimal set of reactions associated with low expression genes to enable activity of a maximum number of reactions associated with high expression genes. We apply this method to construct an organ-specific model for the maize leaf, and tissue-specific models for maize embryo and endosperm cells. We validate our models using fluxomics data for the endosperm and embryo, demonstrating an improved capacity of our models to fit the available fluxomics data. All models are publicly available via the DOE Systems Biology Knowledgebase and PlantSEED, and our new method is generally applicable for the analysis of transcript profiles from any plant, paving the way for further in silico studies with a wide variety of plant genomes.

  17. Unexpected observations after mapping LongSAGE tags to the human genome

    Directory of Open Access Journals (Sweden)

    Duret Laurent

    2007-05-01

    Background: SAGE has been used widely to study the expression of known transcripts, but much less to annotate new transcribed regions. LongSAGE produces tags that are sufficiently long to be reliably mapped to a whole-genome sequence. Here we used this property to study the position of human LongSAGE tags obtained from all public libraries. We focused mainly on tags that do not map to known transcripts. Results: Using a published error rate in SAGE libraries, we first removed the tags likely to result from sequencing errors. We then observed that an unexpectedly large number of the remaining tags still did not match the genome sequence. Some of these correspond to parts of human mRNAs, such as polyA tails, junctions between two exons and polymorphic regions of transcripts. Another non-negligible proportion can be attributed to contamination by murine transcripts and to residual sequencing errors. After filtering our data with these screens to ensure that our dataset is highly reliable, we studied the tags that map once to the genome. 31% of these tags correspond to unannotated transcripts. The others map to known transcribed regions, but many of them (nearly half) are located either in antisense or in new variants of these known transcripts. Conclusion: We performed a comprehensive study of all publicly available human LongSAGE tags, and carefully verified the reliability of these data. We found the potential origin of many tags that did not match the human genome sequence. The properties of the remaining tags imply that the level of sequencing error may have been under-estimated. The frequency of tags matching the genome sequence once but not in an annotated exon suggests that the human transcriptome is much more complex than shown by the current human genome annotations, with many new splicing variants and antisense transcripts. SAGE data is appropriate to map new transcripts to the genome, as demonstrated by the high rate of cross

  18. Survey of protein–DNA interactions in Aspergillus oryzae on a genomic scale

    Science.gov (United States)

    Wang, Chao; Lv, Yangyong; Wang, Bin; Yin, Chao; Lin, Ying; Pan, Li

    2015-01-01

    The genome-scale delineation of in vivo protein–DNA interactions is key to understanding genome function. Only ∼5% of transcription factors (TFs) in the Aspergillus genus have been identified using traditional methods. Although the Aspergillus oryzae genome contains >600 TFs, knowledge of the in vivo genome-wide TF-binding sites (TFBSs) in aspergilli remains limited because of the lack of high-quality antibodies. We investigated the landscape of in vivo protein–DNA interactions across the A. oryzae genome through coupling the DNase I digestion of intact nuclei with massively parallel sequencing and the analysis of cleavage patterns in protein–DNA interactions at single-nucleotide resolution. The resulting map identified overrepresented de novo TF-binding motifs from genomic footprints, and provided the detailed chromatin remodeling patterns and the distribution of digital footprints near transcription start sites. The TFBSs of 19 known Aspergillus TFs were also identified based on DNase I digestion data surrounding potential binding sites in conjunction with TF binding specificity information. We observed that the cleavage patterns of TFBSs were dependent on the orientation of TF motifs and independent of strand orientation, consistent with the DNA shape features of binding motifs with flanking sequences. PMID:25883143

  19. Genome-wide function of H2B ubiquitylation in promoter and genic regions.

    Science.gov (United States)

    Batta, Kiran; Zhang, Zhenhai; Yen, Kuangyu; Goffman, David B; Pugh, B Franklin

    2011-11-01

    Nucleosomal organization in and around genes may contribute substantially to transcriptional regulation. The contribution of histone modifications to genome-wide nucleosomal organization has not been systematically evaluated. In the present study, we examine the role of H2BK123 ubiquitylation, a key regulator of several histone modifications, on nucleosomal organization at promoter, genic, and transcription termination regions in Saccharomyces cerevisiae. Using high-resolution MNase chromatin immunoprecipitation and sequencing (ChIP-seq), we map nucleosome positioning and occupancy in mutants of the H2BK123 ubiquitylation pathway. We found that H2B ubiquitylation-mediated nucleosome formation and/or stability inhibits the assembly of the transcription machinery at normally quiescent promoters, whereas ubiquitylation within highly active gene bodies promotes transcription elongation. This regulation does not proceed through ubiquitylation-regulated histone marks at H3K4, K36, and K79. Our findings suggest that mechanistically similar functions of H2B ubiquitylation (nucleosome assembly) elicit different functional outcomes on genes depending on its positional context in promoters (repressive) versus transcribed regions (activating).

  20. The Sequenced Angiosperm Genomes and Genome Databases.

    Science.gov (United States)

    Chen, Fei; Dong, Wei; Zhang, Jiawei; Guo, Xinyue; Chen, Junhao; Wang, Zhengjia; Lin, Zhenguo; Tang, Haibao; Zhang, Liangsheng

    2018-01-01

    Angiosperms, the flowering plants, provide essential resources for human life, such as food, energy, oxygen, and materials. They have also promoted the evolution of humans, animals, and the planet Earth. Despite the numerous advances in genome reports and sequencing technologies, no review covers all the released angiosperm genomes and the genome databases for data sharing. Based on the rapid advances and innovations in database reconstruction in the last few years, here we provide a comprehensive review of three major types of angiosperm genome databases: databases for a single species, for a specific angiosperm clade, and for multiple angiosperm species. The scope, tools, and data of each type of database and their features are concisely discussed. The genome databases for a single species or a clade of species are especially popular with specific groups of researchers, while a regularly updated comprehensive database is more powerful for addressing major scientific mysteries at the genome scale. Considering the low coverage of flowering plants in any available database, we propose construction of a comprehensive database to facilitate large-scale comparative studies of angiosperm genomes and to promote collaborative studies of important questions in plant biology.

  1. Enumeration of smallest intervention strategies in genome-scale metabolic networks.

    Directory of Open Access Journals (Sweden)

    Axel von Kamp

    2014-01-01

    One ultimate goal of metabolic network modeling is the rational redesign of biochemical networks to optimize the production of certain compounds by cellular systems. Although several constraint-based optimization techniques have been developed for this purpose, methods for systematic enumeration of intervention strategies in genome-scale metabolic networks are still lacking. In principle, Minimal Cut Sets (MCSs; inclusion-minimal combinations of reaction or gene deletions that lead to the fulfilment of a given intervention goal) provide an exhaustive enumeration approach. However, their disadvantage is the combinatorial explosion in larger networks and the requirement to compute first the elementary modes (EMs), which itself is impractical in genome-scale networks. We present MCSEnumerator, a new method for effective enumeration of the smallest MCSs (with fewest interventions) in genome-scale metabolic network models. For this we combine two approaches, namely (i) the mapping of MCSs to EMs in a dual network, and (ii) a modified algorithm by which shortest EMs can be effectively determined in large networks. In this way, we can identify the smallest MCSs by calculating the shortest EMs in the dual network. Realistic application examples demonstrate that our algorithm is able to list thousands of the most efficient intervention strategies in genome-scale networks for various intervention problems. For instance, for the first time we could enumerate all synthetic lethals in E. coli with combinations of up to 5 reactions. We also applied the new algorithm exemplarily to compute strain designs for growth-coupled synthesis of different products (ethanol, fumarate, serine) by E. coli. We found numerous new engineering strategies partially requiring fewer knockouts and guaranteeing higher product yields (even without the assumption of optimal growth) than reported previously.
The strength of the presented approach is that smallest intervention strategies can be
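
    The relationship between cut sets and elementary modes described in this record can be illustrated on a toy scale. The sketch below is not MCSEnumerator and does not use the dual-network approach; it assumes the target elementary modes are already known and simply brute-forces the smallest reaction sets that hit every one of them. The reaction names and the three toy modes are hypothetical.

```python
from itertools import combinations

def smallest_cut_sets(target_modes, max_size):
    """Enumerate inclusion-minimal reaction sets (up to max_size) that
    disable every target mode by removing at least one of its reactions."""
    reactions = sorted(set().union(*target_modes))
    found = []
    # Enumerate by increasing size so smaller cut sets are found first.
    for size in range(1, max_size + 1):
        for candidate in combinations(reactions, size):
            cand = set(candidate)
            # A superset of an already found cut set is not minimal.
            if any(prev <= cand for prev in found):
                continue
            # A cut set must intersect every target mode.
            if all(cand & mode for mode in target_modes):
                found.append(cand)
    return found

# Hypothetical toy network: three elementary modes that form an unwanted product.
toy_modes = [{"R1", "R2"}, {"R1", "R3"}, {"R2", "R3", "R4"}]
cut_sets = smallest_cut_sets(toy_modes, max_size=2)
print(cut_sets)
```

    Brute force of this kind scales combinatorially with the number of reactions, which is the limitation the abstract says motivates computing shortest EMs in a dual network instead.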

  2. Integrated genome sequence and linkage map of physic nut (Jatropha curcas L.), a biodiesel plant.

    Science.gov (United States)

    Wu, Pingzhi; Zhou, Changpin; Cheng, Shifeng; Wu, Zhenying; Lu, Wenjia; Han, Jinli; Chen, Yanbo; Chen, Yan; Ni, Peixiang; Wang, Ying; Xu, Xun; Huang, Ying; Song, Chi; Wang, Zhiwen; Shi, Nan; Zhang, Xudong; Fang, Xiaohua; Yang, Qing; Jiang, Huawu; Chen, Yaping; Li, Meiru; Wang, Ying; Chen, Fan; Wang, Jun; Wu, Guojiang

    2015-03-01

    The family Euphorbiaceae includes some of the most efficient biomass accumulators. Whole genome sequencing and the development of genetic maps of these species are important components in molecular breeding and genetic improvement. Here we report the draft genome of physic nut (Jatropha curcas L.), a biodiesel plant. The assembled genome has a total length of 320.5 Mbp and contains 27,172 putative protein-coding genes. We established a linkage map containing 1208 markers and anchored the genome assembly (81.7%) to this map to produce 11 pseudochromosomes. After gene family clustering, 15,268 families were identified, of which 13,887 existed in the castor bean genome. Analysis of the genome highlighted specific expansion and contraction of a number of gene families during the evolution of this species, including the ribosome-inactivating proteins and oil biosynthesis pathway enzymes. The genomic sequence and linkage map provide a valuable resource not only for fundamental and applied research on physic nut but also for evolutionary and comparative genomics analysis, particularly in the Euphorbiaceae. © 2015 The Authors The Plant Journal © 2015 John Wiley & Sons Ltd.

  3. A first generation BAC-based physical map of the rainbow trout genome

    Directory of Open Access Journals (Sweden)

    Thorgaard Gary H

    2009-10-01

    Background: Rainbow trout (Oncorhynchus mykiss) are the most-widely cultivated cold freshwater fish in the world and an important model species for many research areas. Coupling great interest in this species as a research model with the need for genetic improvement of aquaculture production efficiency traits justifies the continued development of genomics research resources. Many quantitative trait loci (QTL) have been identified for production and life-history traits in rainbow trout. A bacterial artificial chromosome (BAC) physical map is needed to facilitate fine mapping of QTL and the selection of positional candidate genes for incorporation in marker-assisted selection (MAS) for improving rainbow trout aquaculture production. This resource will also facilitate efforts to obtain and assemble a whole-genome reference sequence for this species. Results: The physical map was constructed from DNA fingerprinting of 192,096 BAC clones using the 4-color high-information content fingerprinting (HICF) method. The clones were assembled into physical map contigs using the finger-printing contig (FPC) program. The map is composed of 4,173 contigs and 9,379 singletons. The total number of unique fingerprinting fragments (consensus bands) in contigs is 1,185,157, which corresponds to an estimated physical length of 2.0 Gb. The map assembly was validated by (1) comparison with probe hybridization results and agarose gel fingerprinting contigs; and (2) anchoring large contigs to the microsatellite-based genetic linkage map. Conclusion: The production and validation of the first BAC physical map of the rainbow trout genome is described in this paper. We are currently integrating this map with the NCCCWA genetic map using more than 200 microsatellites isolated from BAC end sequences and by identifying BACs that harbor more than 300 previously mapped markers. The availability of an integrated physical and genetic map will enable detailed comparative genome

  4. The sea lamprey meiotic map improves resolution of ancient vertebrate genome duplications.

    Science.gov (United States)

    Smith, Jeramiah J; Keinath, Melissa C

    2015-08-01

    It is generally accepted that many genes present in vertebrate genomes owe their origin to two whole-genome duplications that occurred deep in the ancestry of the vertebrate lineage. However, details regarding the timing and outcome of these duplications are not well resolved. We present high-density meiotic and comparative genomic maps for the sea lamprey (Petromyzon marinus), a representative of an ancient lineage that diverged from all other vertebrates ∼550 million years ago. Linkage analyses yielded a total of 95 linkage groups, similar to the estimated number of germline chromosomes (1n ∼ 99), spanning a total of 5570.25 cM. Comparative mapping data yield strong support for the hypothesis that a single whole-genome duplication occurred in the basal vertebrate lineage, but do not strongly support a hypothetical second event. Rather, these comparative maps reveal several evolutionarily independent segmental duplications occurring over the last 600+ million years of chordate evolution. This refined history of vertebrate genome duplication should permit more precise investigations of vertebrate evolution. © 2015 Smith and Keinath; Published by Cold Spring Harbor Laboratory Press.

  5. Genome-scale biological models for industrial microbial systems.

    Science.gov (United States)

    Xu, Nan; Ye, Chao; Liu, Liming

    2018-04-01

    The primary aims and challenges associated with microbial fermentation include achieving faster cell growth, higher productivity, and more robust production processes. Genome-scale biological models, predicting the formation of an interaction among genetic materials, enzymes, and metabolites, constitute a systematic and comprehensive platform to analyze and optimize the microbial growth and production of biological products. Genome-scale biological models can help optimize microbial growth-associated traits by simulating biomass formation, predicting growth rates, and identifying the requirements for cell growth. With regard to microbial product biosynthesis, genome-scale biological models can be used to design product biosynthetic pathways, accelerate production efficiency, and reduce metabolic side effects, leading to improved production performance. The present review discusses the development of microbial genome-scale biological models since their emergence and emphasizes their pertinent application in improving industrial microbial fermentation of biological products.

  6. A High Resolution Genetic Map Anchoring Scaffolds of the Sequenced Watermelon Genome

    Science.gov (United States)

    Kou, Qinghe; Jiang, Jiao; Guo, Shaogui; Zhang, Haiying; Hou, Wenju; Zou, Xiaohua; Sun, Honghe; Gong, Guoyi; Levi, Amnon; Xu, Yong

    2012-01-01

    As part of our ongoing efforts to sequence and map the watermelon (Citrullus spp.) genome, we have constructed a high density genetic linkage map. The map positioned 234 watermelon genome sequence scaffolds (an average size of 1.41 Mb) that cover about 330 Mb and account for 93.5% of the 353 Mb of the assembled genomic sequences of the elite Chinese watermelon line 97103 (Citrullus lanatus var. lanatus). The genetic map was constructed using an F8 population of 103 recombinant inbred lines (RILs). The RILs are derived from a cross between the line 97103 and the United States Plant Introduction (PI) 296341-FR (C. lanatus var. citroides) that contains resistance to fusarium wilt (races 0, 1, and 2). The genetic map consists of eleven linkage groups that include 698 simple sequence repeat (SSR), 219 insertion-deletion (InDel) and 36 structure variation (SV) markers and spans ∼800 cM with a mean marker interval of 0.8 cM. Using fluorescent in situ hybridization (FISH) with 11 BACs that produced chromosome-specific signals, we have depicted watermelon chromosomes that correspond to the eleven linkage groups constructed in this study. The high resolution genetic map developed here should be a useful platform for the assembly of the watermelon genome, for the development of sequence-based markers used in breeding programs, and for the identification of genes associated with important agricultural traits. PMID:22247776
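
    The summary statistics quoted in the abstract above can be cross-checked with simple arithmetic. The sketch below uses only the figures given in the abstract; treating the mean marker interval as map length divided by marker count is a simplifying assumption, not necessarily how the authors computed it.

```python
# Figures quoted in the abstract above.
markers = {"SSR": 698, "InDel": 219, "SV": 36}
map_length_cm = 800        # reported as approximately 800 cM
scaffolds = 234
mean_scaffold_mb = 1.41
assembly_mb = 353

total_markers = sum(markers.values())

# Simplifying assumption: mean interval = total map length / number of markers.
mean_interval_cm = map_length_cm / total_markers

# Anchored sequence implied by scaffold count and mean scaffold size.
anchored_mb = scaffolds * mean_scaffold_mb
anchored_percent = 100 * anchored_mb / assembly_mb

print(total_markers, round(mean_interval_cm, 2),
      round(anchored_mb, 1), round(anchored_percent, 1))
```

    Under these assumptions the numbers agree with the reported values of about 330 Mb anchored, 93.5% of the assembly, and a mean marker interval of roughly 0.8 cM.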

  7. Nucleotide diversity maps reveal variation in diversity among wheat genomes and chromosomes

    Directory of Open Access Journals (Sweden)

    McGuire Patrick E

    2010-12-01

    Background: A genome-wide assessment of nucleotide diversity in a polyploid species must minimize the inclusion of homoeologous sequences into diversity estimates and reliably allocate individual haplotypes into their respective genomes. The same requirements complicate the development and deployment of single nucleotide polymorphism (SNP) markers in polyploid species. We report here a strategy that satisfies these requirements and deploy it in the sequencing of genes in cultivated hexaploid wheat (Triticum aestivum, genomes AABBDD) and wild tetraploid wheat (Triticum turgidum ssp. dicoccoides, genomes AABB) from the putative site of wheat domestication in Turkey. Data are used to assess the distribution of diversity among and within wheat genomes and to develop a panel of SNP markers for polyploid wheat. Results: Nucleotide diversity was estimated in 2114 wheat genes and was similar between the A and B genomes and reduced in the D genome. Within a genome, diversity was diminished on some chromosomes. Low diversity was always accompanied by an excess of rare alleles. A total of 5,471 SNPs was discovered in 1791 wheat genes. Totals of 1,271, 1,218, and 2,203 SNPs were discovered in 488, 463, and 641 genes of wheat putative diploid ancestors, T. urartu, Aegilops speltoides, and Ae. tauschii, respectively. A public database containing genome-specific primers, SNPs, and other information was constructed. A total of 987 genes with nucleotide diversity estimated in one or more of the wheat genomes was placed on an Ae. tauschii genetic map, and the map was superimposed on wheat deletion-bin maps. The agreement between the maps was assessed. Conclusions: In a young polyploid, exemplified by T. aestivum, ancestral species are the primary source of genetic diversity.
Low effective recombination due to self-pollination and a genetic mechanism precluding homoeologous chromosome pairing during polyploid meiosis can lead to the loss of diversity from large

  8. Genome wide characterization of simple sequence repeats in watermelon genome and their application in comparative mapping and genetic diversity analysis.

    Science.gov (United States)

    Zhu, Huayu; Song, Pengyao; Koo, Dal-Hoe; Guo, Luqin; Li, Yanman; Sun, Shouru; Weng, Yiqun; Yang, Luming

    2016-08-05

    Microsatellite markers are one of the most informative and versatile DNA-based markers used in plant genetic research, but their development has traditionally been difficult and costly. The whole genome sequencing with next-generation sequencing (NGS) technologies provides large amounts of sequence data to develop numerous microsatellite markers at whole genome scale. SSR markers have great advantage in cross-species comparisons and allow investigation of karyotype and genome evolution through highly efficient computation approaches such as in silico PCR. Here we described genome wide development and characterization of SSR markers in the watermelon (Citrullus lanatus) genome, which were then used in comparative analysis with two other important crop species in the Cucurbitaceae family: cucumber (Cucumis sativus L.) and melon (Cucumis melo L.). We further applied these markers in evaluating the genetic diversity and population structure in watermelon germplasm collections. A total of 39,523 microsatellite loci were identified from the watermelon draft genome with an overall density of 111 SSRs/Mbp, and 32,869 SSR primers were designed with suitable flanking sequences. The dinucleotide SSRs were the most common type, representing 34.09% of the total SSR loci, and the AT-rich motifs were the most abundant in all nucleotide repeat types. In silico PCR analysis identified 832 and 925 SSR markers with each having a single amplicon in the cucumber and melon draft genome, respectively. Comparative analysis with these cross-species SSR markers revealed complicated mosaic patterns of syntenic blocks among the genomes of three species. In addition, genetic diversity analysis of 134 watermelon accessions with 32 highly informative SSR loci placed these lines into two groups with all accessions of C. lanatus var. citroides and three accessions of C. colocynthis clustered in one group and all accessions of C. lanatus var. lanatus and the remaining accessions of C. colocynthis

  9. The OME Framework for genome-scale systems biology

    Energy Technology Data Exchange (ETDEWEB)

    Palsson, Bernhard O. [Univ. of California, San Diego, CA (United States)]; Ebrahim, Ali [Univ. of California, San Diego, CA (United States)]; Federowicz, Steve [Univ. of California, San Diego, CA (United States)]

    2014-12-19

    The life sciences are undergoing continuous and accelerating integration with computational and engineering sciences. The biology that many in the field have been trained on may be hardly recognizable in ten to twenty years. One of the major drivers for this transformation is the blistering pace of advancements in DNA sequencing and synthesis. These advances have resulted in unprecedented amounts of new data, information, and knowledge. Many software tools have been developed to deal with aspects of this transformation and each is sorely needed [1-3]. However, few of these tools have been forced to deal with the full complexity of genome-scale models along with high-throughput genome-scale data. This particular situation represents a unique challenge, as it is simultaneously necessary to deal with the vast breadth of genome-scale models and the dizzying depth of high-throughput datasets. It has been observed time and again that as the pace of data generation continues to accelerate, the pace of analysis significantly lags behind [4]. It is also evident that, given the plethora of databases and software efforts [5-12], it is still a significant challenge to work with genome-scale metabolic models, let alone next-generation whole cell models [13-15]. We work at the forefront of model creation and systems scale data generation [16-18]. The OME Framework was born out of a practical need to enable genome-scale modeling and data analysis under a unified framework to drive the next generation of genome-scale biological models. Here we present the OME Framework. It exists as a set of Python classes. However, we want to emphasize the importance of the underlying design as an addition to the discussions on specifications of a digital cell. A great deal of work and valuable progress has been made by a number of communities [13, 19-24] towards interchange formats and implementations designed to achieve similar goals.
While many software tools exist for handling genome-scale

  10. Extreme-Scale De Novo Genome Assembly

    Energy Technology Data Exchange (ETDEWEB)

    Georganas, Evangelos [Intel Corporation, Santa Clara, CA (United States); Hofmeyr, Steven [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Joint Genome Inst.; Egan, Rob [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Computational Research Division; Buluc, Aydin [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Joint Genome Inst.; Oliker, Leonid [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Joint Genome Inst.; Rokhsar, Daniel [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Computational Research Division; Yelick, Katherine [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Joint Genome Inst.

    2017-09-26

    De novo whole genome assembly reconstructs genomic sequence from short, overlapping, and potentially erroneous DNA segments and is one of the most important computations in modern genomics. This work presents HipMer, a high-quality end-to-end de novo assembler designed for extreme scale analysis, via efficient parallelization of the Meraculous code. Genome assembly software has many components, each of which stresses different components of a computer system. This chapter explains the computational challenges involved in each step of the HipMer pipeline, the key distributed data structures, and communication costs in detail. We present performance results of assembling the human genome and the large hexaploid wheat genome on large supercomputers up to tens of thousands of cores.

  11. A BAC-based physical map of the Drosophila buzzatii genome

    Energy Technology Data Exchange (ETDEWEB)

    Gonzalez, Josefa; Nefedov, Michael; Bosdet, Ian; Casals, Ferran; Calvete, Oriol; Delprat, Alejandra; Shin, Heesun; Chiu, Readman; Mathewson, Carrie; Wye, Natasja; Hoskins, Roger A.; Schein, Jacqueline E.; de Jong, Pieter; Ruiz, Alfredo

    2005-03-18

    Large-insert genomic libraries facilitate cloning of large genomic regions, allow the construction of clone-based physical maps and provide useful resources for sequencing entire genomes. Drosophila buzzatii is a representative species of the repleta group in the Drosophila subgenus, which is being widely used as a model in studies of genome evolution, ecological adaptation and speciation. We constructed a Bacterial Artificial Chromosome (BAC) genomic library of D. buzzatii using the shuttle vector pTARBAC2.1. The library comprises 18,353 clones with an average insert size of 152 kb and a ~18X expected representation of the D. buzzatii euchromatic genome. We screened the entire library with six euchromatic gene probes and estimated the actual genome representation to be ~23X. In addition, we fingerprinted by restriction digestion and agarose gel electrophoresis a sample of 9,555 clones, and assembled them using Finger Printed Contigs (FPC) software and manual editing into 345 contigs (mean of 26 clones per contig) and 670 singletons. Finally, we anchored 181 large contigs (containing 7,788 clones) to the D. buzzatii salivary gland polytene chromosomes by in situ hybridization of 427 representative clones. The BAC library and a database with all the information regarding the high coverage BAC-based physical map described in this paper are available to the research community.
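
    The library statistics in the abstract above are linked by simple arithmetic, sketched below as a consistency check. The euchromatic genome size is not stated in the abstract, so the value computed here is only what the quoted ~18X figure implies. The representation probability uses the standard Clarke-Carbon approximation, which is an addition for illustration and is not described in the abstract.

```python
# Figures quoted in the abstract above.
clones = 18_353
mean_insert_kb = 152
expected_coverage = 18      # "~18X expected representation"

# Total cloned DNA and the genome size implied by the expected coverage.
library_mb = clones * mean_insert_kb / 1000
implied_genome_mb = library_mb / expected_coverage

# Clarke-Carbon approximation: probability that a given locus appears
# in at least one clone, with f = insert size / genome size.
f = mean_insert_kb / (implied_genome_mb * 1000)
p_represented = 1 - (1 - f) ** clones

# Fingerprinting sample: clones placed in contigs, averaged per contig.
fingerprinted = 9_555
contigs = 345
singletons = 670
mean_clones_per_contig = (fingerprinted - singletons) / contigs

print(round(library_mb), round(implied_genome_mb),
      p_represented, round(mean_clones_per_contig, 1))
```

    Under these assumptions the library holds about 2.8 Gb of cloned DNA, and the mean of roughly 26 clones per contig matches the reported value.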

  12. Genome-Wide Mapping Targets of the Metazoan Chromatin Remodeling Factor NURF Reveals Nucleosome Remodeling at Enhancers, Core Promoters and Gene Insulators.

    Directory of Open Access Journals (Sweden)

    So Yeon Kwon

    2016-04-01

    NURF is a conserved higher eukaryotic ISWI-containing chromatin remodeling complex that catalyzes ATP-dependent nucleosome sliding. By sliding nucleosomes, NURF is able to alter chromatin dynamics to control transcription and genome organization. Previous biochemical and genetic analysis of the specificity-subunit of Drosophila NURF (Nurf301/Enhancer of Bithorax (E(bx))) has defined NURF as a critical regulator of homeotic, heat-shock and steroid-responsive gene transcription. It has been speculated that NURF controls pathway specific transcription by co-operating with sequence-specific transcription factors to remodel chromatin at dedicated enhancers. However, conclusive in vivo demonstration of this is lacking and precise regulatory elements targeted by NURF are poorly defined. To address this, we have generated a comprehensive map of in vivo NURF activity, using MNase-sequencing to determine at base pair resolution NURF target nucleosomes, and ChIP-sequencing to define sites of NURF recruitment. Our data show that, besides anticipated roles at enhancers, NURF interacts physically and functionally with the TRF2/DREF basal transcription factor to organize nucleosomes downstream of active promoters. Moreover, we detect NURF remodeling and recruitment at distal insulator sites, where NURF functionally interacts with and co-localizes with DREF and insulator proteins including CP190 to establish nucleosome-depleted domains. This insulator function of NURF is most apparent at subclasses of insulators that mark the boundaries of chromatin domains, where multiple insulator proteins co-associate. By visualizing the complete repertoire of in vivo NURF chromatin targets, our data provide new insights into how chromatin remodeling can control genome organization and regulatory interactions.

  13. A Model Study of Small-Scale World Map Generalization

    Science.gov (United States)

    Cheng, Y.; Yin, Y.; Li, C. M.; Wu, W.; Guo, P. P.; Ma, X. L.; Hu, F. M.

    2018-04-01

    With globalization and rapid development, every field is taking an increasing interest in physical geography and human economics, and there is a surging demand worldwide for small-scale world maps in large formats. Further study of automated mapping technology, especially the production of small-scale maps from a large-scale global map, is a key problem that the cartographic field needs to solve. In light of this, this paper adopts an improved model for map generalization in which the map and the data are separated, so that geographic data, mapping data, and maps can be handled independently. The model mainly comprises a cross-platform symbol library and an automatic map-making knowledge engine. In the cross-platform symbol library, the symbols and the physical symbols in the geographic information are configured at all scale levels. The automatic map-making knowledge engine consists of 97 types, 1086 subtypes, 21845 basic algorithms, and over 2500 relevant functional modules. To evaluate the accuracy and visual effect of our model for topographic and thematic maps, we take small-scale world map generalization as an example. After the generalization process, combining and simplifying the scattered islands makes the map clearer at the 1 : 2.1 billion scale, and the map features are more complete and accurate. The model not only significantly enhances map generalization at various scales but also integrates map-making across scales, suggesting that it provides a reference for cartographic generalization at various scales.

  14. The Mapping of Predicted Triplex DNA:RNA in the Drosophila Genome Reveals a Prominent Location in Development- and Morphogenesis-Related Genes

    Directory of Open Access Journals (Sweden)

    Claude Pasquier

    2017-07-01

    Double-stranded DNA is able to form triple-helical structures by accommodating a third nucleotide strand. A nucleic acid triplex occurs according to Hoogsteen rules that predict the stability and affinity of the third strand bound to the Watson–Crick duplex. The “triplex-forming oligonucleotide” (TFO) can be a short sequence of RNA that binds to the major groove of the targeted duplex only when this duplex presents a sequence of purine or pyrimidine bases in one of the DNA strands. Many nuclear proteins are known to bind triplex DNA or DNA:RNA, but their biological functions are unexplored. We identified sequences that are capable of engaging as the “triplex-forming oligonucleotide” in both the pre-lncRNA and pre-mRNA collections of Drosophila melanogaster. These motifs were matched against the Drosophila genome in order to identify putative sequences of triplex formation in intergenic regions, promoters, and introns/exons. Most of the identified TFOs appear to be located in the intronic region of the analyzed genes. Computational prediction of the most targeted genes by TFOs originating from pre-lncRNAs and pre-mRNAs revealed that they are restrictively associated with development- and morphogenesis-related gene networks. The refined analysis by Gene Ontology enrichment demonstrates that some individual TFOs present genome-wide scale matches that are located in numerous genes and regulatory sequences. The triplex DNA:RNA computational mapping at the genome-wide scale suggests broad interference in the regulatory process of the gene networks orchestrated by TFO RNAs acting in association simultaneously at multiple sites.
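
    The abstract above notes that a triplex can form only where one strand of the duplex presents a run of purine or pyrimidine bases. A minimal sketch of scanning a sequence for such runs follows. It is not the authors' prediction pipeline; the function name, the 12-nt minimum length, and the example sequence are hypothetical choices for illustration.

```python
import re

def homo_tracts(seq, min_len=12):
    """Return (start, end, kind) for runs of only purines (A/G) or only
    pyrimidines (C/T) that are at least min_len long (0-based, end-exclusive)."""
    seq = seq.upper()
    tracts = []
    for kind, base_class in (("purine", "[AG]"), ("pyrimidine", "[CT]")):
        # Builds a pattern such as [AG]{12,} for the chosen minimum length.
        pattern = rf"{base_class}{{{min_len},}}"
        for match in re.finditer(pattern, seq):
            tracts.append((match.start(), match.end(), kind))
    return sorted(tracts)

# Hypothetical example: a 12-nt purine run and a 12-nt pyrimidine run.
example = "CAT" + "GAAGGAGAAGGA" + "CA" + "TCTTCCTCTTCC" + "G"
tracts = homo_tracts(example)
print(tracts)
```

    Only the given strand is scanned here; a purine run on one strand is a pyrimidine run on the complementary strand, so both kinds are reported from a single pass.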

  15. Bioinformatics decoding the genome

    CERN Multimedia

    CERN. Geneva; Deutsch, Sam; Michielin, Olivier; Thomas, Arthur; Descombes, Patrick

    2006-01-01

    Extracting the fundamental genomic sequence from the DNA. From Genome to Sequence: Biology in the early 21st century has been radically transformed by the availability of the full genome sequences of an ever increasing number of life forms, from bacteria to major crop plants and to humans. The lecture will concentrate on the computational challenges associated with the production, storage and analysis of genome sequence data, with an emphasis on mammalian genomes. The quality and usability of genome sequences is increasingly conditioned by the careful integration of strategies for data collection and computational analysis, from the construction of maps and libraries to the assembly of raw data into sequence contigs and chromosome-sized scaffolds. Once the sequence is assembled, a major challenge is the mapping of biologically relevant information onto this sequence: promoters, introns and exons of protein-encoding genes, regulatory elements, functional RNAs, pseudogenes, transposons, etc. The methodological ...

  16. New generation pharmacogenomic tools: a SNP linkage disequilibrium Map, validated SNP assay resource, and high-throughput instrumentation system for large-scale genetic studies.

    Science.gov (United States)

    De La Vega, Francisco M; Dailey, David; Ziegle, Janet; Williams, Julie; Madden, Dawn; Gilbert, Dennis A

    2002-06-01

    Since public and private efforts announced the first draft of the human genome last year, researchers have reported great numbers of single nucleotide polymorphisms (SNPs). We believe that the availability of well-mapped, quality SNP markers constitutes the gateway to a revolution in genetics and personalized medicine that will lead to better diagnosis and treatment of common complex disorders. A new generation of tools and public SNP resources for pharmacogenomic and genetic studies--specifically for candidate-gene, candidate-region, and whole-genome association studies--will form part of the new scientific landscape. This will only be possible through the greater accessibility of SNP resources and superior high-throughput instrumentation-assay systems that enable affordable, highly productive large-scale genetic studies. We are contributing to this effort by developing a high-quality linkage disequilibrium SNP marker map and an accompanying set of ready-to-use, validated SNP assays across every gene in the human genome. This effort incorporates both the public sequence and SNP data sources, and Celera Genomics' human genome assembly and enormous resource of physically mapped SNPs (approximately 4,000,000 unique records). This article discusses our approach and methodology for designing the map, choosing quality SNPs, designing and validating these assays, and obtaining population frequency of the polymorphisms. We also discuss an advanced, high-performance SNP assay chemistry (a new generation of the TaqMan probe-based, 5' nuclease assay) and a high-throughput instrumentation-software system for large-scale genotyping. We provide the new SNP map and validation information, validated SNP assays and reagents, and instrumentation systems as a novel resource for genetic discoveries.

  17. Reference Genome-Directed Resolution of Homologous and Homeologous Relationships within and between Different Oat Linkage Maps

    Directory of Open Access Journals (Sweden)

    Juan J. Gutierrez-Gonzalez

    2011-11-01

    Genome research on oat (Avena sativa L.) has received less attention than wheat (Triticum aestivum L.) and barley (Hordeum vulgare L.) because it is a less prominent component of the human food system. To assess the potential of the model grass Brachypodium distachyon (L.) P. Beauv. as a surrogate for oat genome research, the whole genome sequence (WGS) of B. distachyon was employed for comparative analysis with oat genetic linkage maps. Sequences of mapped molecular markers from one diploid Avena spp. and two hexaploid oat maps were aligned to the B. distachyon WGS to infer syntenic relationships. Diploid Avena and B. distachyon exhibit a high degree of synteny with 18 syntenic blocks covering 87% of the oat genome, which permitted postulation of an ancestral Avena spp. chromosome structure. Synteny between oat and B. distachyon was also prevalent, with 50 syntenic blocks covering 76.6% of the ‘Kanota’ × ‘Ogle’ linkage map. Coalignment of diploid and hexaploid maps to B. distachyon helped resolve homeologous relationships between different oat linkage groups but also revealed many major rearrangements in oat subgenomes. Extending the analysis to a second oat linkage map (Ogle × ‘TAM O-301’) allowed identification of several putative homologous linkage groups across the two oat populations. These results indicate that the B. distachyon genome sequence will be a useful resource to assist genetics and genomics research in oat. The analytical strategy employed here should be applicable for genome research in other temperate grass crops with modest amounts of genomic data.

  18. Architectural protein subclasses shape 3-D organization of genomes during lineage commitment

    Science.gov (United States)

    Phillips-Cremins, Jennifer E.; Sauria, Michael E. G.; Sanyal, Amartya; Gerasimova, Tatiana I.; Lajoie, Bryan R.; Bell, Joshua S. K.; Ong, Chin-Tong; Hookway, Tracy A.; Guo, Changying; Sun, Yuhua; Bland, Michael J.; Wagstaff, William; Dalton, Stephen; McDevitt, Todd C.; Sen, Ranjan; Dekker, Job; Taylor, James; Corces, Victor G.

    2013-01-01

    Summary Understanding the topological configurations of chromatin may reveal valuable insights into how the genome and epigenome act in concert to control cell fate during development. Here we generate high-resolution architecture maps across seven genomic loci in embryonic stem cells and neural progenitor cells. We observe a hierarchy of 3-D interactions that undergo marked reorganization at the sub-Mb scale during differentiation. Distinct combinations of CTCF, Mediator, and cohesin show widespread enrichment in looping interactions at different length scales. CTCF/cohesin anchor long-range constitutive interactions that form the topological basis for invariant sub-domains. Conversely, Mediator/cohesin together with pioneer factors bridge short-range enhancer-promoter interactions within and between larger sub-domains. Knockdown of Smc1 or Med12 in ES cells results in disruption of spatial architecture and down-regulation of genes found in cohesin-mediated interactions. We conclude that cell type-specific chromatin organization occurs at the sub-Mb scale and that architectural proteins shape the genome in hierarchical length scales. PMID:23706625

  19. A high-density genetic map for anchoring genome sequences and identifying QTLs associated with dwarf vine in pumpkin (Cucurbita maxima Duch.).

    Science.gov (United States)

    Zhang, Guoyu; Ren, Yi; Sun, Honghe; Guo, Shaogui; Zhang, Fan; Zhang, Jie; Zhang, Haiying; Jia, Zhangcai; Fei, Zhangjun; Xu, Yong; Li, Haizhen

    2015-12-24

    Pumpkin (Cucurbita maxima Duch.) is an economically important crop belonging to the Cucurbitaceae family. However, very few genomic and genetic resources are available for this species. As part of our ongoing efforts to sequence the pumpkin genome, a high-density genetic map is essential for anchoring and orienting the assembled scaffolds. In addition, a saturated genetic map can facilitate quantitative trait locus (QTL) mapping. A set of 186 F2 plants derived from the cross of pumpkin inbred lines Rimu and SQ026 were genotyped using the genotyping-by-sequencing approach. Using the SNPs we identified, a high-density genetic map containing 458 bin-markers was constructed, spanning a total genetic distance of 2,566.8 cM across the 20 linkage groups of C. maxima with a mean marker density of 5.60 cM. Using this map we were able to anchor 58 assembled scaffolds that covered about 194.5 Mb (71.7%) of the 271.4 Mb assembled pumpkin genome, of which 44 (183.0 Mb; 67.4%) were oriented. Furthermore, the high-density genetic map was used to identify genomic regions highly associated with an important agronomic trait, dwarf vine. Three QTLs on linkage groups (LGs) 1, 3 and 4, respectively, were recovered. One QTL, qCmB2, which was located in an interval of 0.42 Mb on LG 3, explained 21.4% of the phenotypic variation. Within qCmB2, one gene, Cma_004516, encoding the gibberellin (GA) 20-oxidase in the GA biosynthesis pathway, had a 1249-bp deletion in its promoter in bush type lines, and its expression level was significantly increased during vine growth and higher in vine type lines than in bush type lines, supporting Cma_004516 as a possible candidate gene controlling vine growth in pumpkin. A high-density pumpkin genetic map was constructed, which was used to successfully anchor and orient the assembled genome scaffolds, and to identify QTLs highly associated with pumpkin vine length. The map provided a valuable resource for gene cloning and marker assisted breeding in pumpkin and

  20. A Secure Alignment Algorithm for Mapping Short Reads to Human Genome.

    Science.gov (United States)

    Zhao, Yongan; Wang, Xiaofeng; Tang, Haixu

    2018-05-09

    Elastic and inexpensive computing resources such as clouds have been recognized as a useful solution for analyzing massive human genomic data (e.g., acquired by using next-generation sequencers) in biomedical research. However, outsourcing human genome computation to public or commercial clouds has been hindered by privacy concerns: even a small number of human genome sequences contain sufficient information for identifying the donor of the genomic data. This issue cannot be directly addressed by existing security and cryptographic techniques (such as homomorphic encryption), because they are too heavyweight to carry out practical genome computation tasks on massive data. In this article, we present a secure algorithm to accomplish read mapping, one of the most basic tasks in human genomic data analysis, based on a hybrid cloud computing model. Compared with the existing approaches, our algorithm delegates most computation to the public cloud, while only performing encryption and decryption on the private cloud, and thus makes maximum use of the computing resources of the public cloud. Furthermore, our algorithm reports results similar to those of nonsecure read mapping algorithms, including the alignment between reads and the reference genome, which can be directly used in downstream analysis such as the inference of genomic variations. We implemented the algorithm in C++ and Python on a hybrid cloud system, in which the public cloud uses an Apache Spark system.

  1. Genome-Wide Single-Nucleotide Polymorphisms Discovery and High-Density Genetic Map Construction in Cauliflower Using Specific-Locus Amplified Fragment Sequencing

    Science.gov (United States)

    Zhao, Zhenqing; Gu, Honghui; Sheng, Xiaoguang; Yu, Huifang; Wang, Jiansheng; Huang, Long; Wang, Dan

    2016-01-01

    Molecular markers and genetic maps play an important role in plant genomics and breeding studies. Cauliflower is an important and distinctive vegetable; however, very few molecular resources have been reported for this species. In this study, a novel, specific-locus amplified fragment (SLAF) sequencing strategy was employed for large-scale single nucleotide polymorphism (SNP) discovery and high-density genetic map construction in a doubled-haploid, segregating population of cauliflower. A total of 12.47 Gb raw data containing 77.92 M paired-end reads were obtained after processing and 6815 polymorphic SLAFs between the two parents were detected. The average sequencing depths reached 52.66-fold for the female parent and 49.35-fold for the male parent. Subsequently, these polymorphic SLAFs were used to genotype the population and further filtered based on several criteria to construct a genetic linkage map of cauliflower. Finally, 1776 high-quality SLAF markers, including 2741 SNPs, constituted the linkage map with average data integrity of 95.68%. The final map spanned a total genetic length of 890.01 cM with an average marker interval of 0.50 cM, and covered 364.9 Mb of the reference genome. The markers and genetic map developed in this study could provide an important foundation not only for comparative genomics studies within Brassica oleracea species but also for quantitative trait loci identification and molecular breeding of cauliflower. PMID:27047515
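
    The average marker interval quoted above can be checked with a short calculation. The sketch below assumes the interval is simply the total map length divided by the number of markers; the abstract does not state the exact formula the authors used, and the function name is hypothetical.

```python
# Figures quoted in the abstract above.
total_map_length_cm = 890.01  # total genetic length of the map (cM)
n_markers = 1776              # high-quality SLAF markers on the map


def average_marker_interval(length_cm: float, markers: int) -> float:
    """Approximate the mean spacing as total map length / marker count.

    This is an assumed simplification; the source does not give its formula.
    """
    return length_cm / markers


interval = average_marker_interval(total_map_length_cm, n_markers)
print(f"{interval:.2f} cM")
```

    Under that assumption the result rounds to the 0.50 cM reported in the abstract.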

  2. The Genome-Scale Integrated Networks in Microorganisms

    Directory of Open Access Journals (Sweden)

    Tong Hao

    2018-02-01

    Full Text Available The genome-scale cellular network has become a necessary tool in the systematic analysis of microbes. In a cell, there are several layers (i.e., types) of molecular networks, for example, the genome-scale metabolic network (GMN), transcriptional regulatory network (TRN), and signal transduction network (STN). It has been recognized that predictions made with only a single-layer network are limited and inaccurate. Therefore, the integrated network constructed from the three types of networks attracts increasing interest. The function of a biological process in living cells is usually performed by the interaction of biological components. Therefore, it is necessary to integrate and analyze all the related components at the systems level to comprehensively and correctly understand the physiological function in living organisms. In this review, we discussed three representative genome-scale cellular networks: GMN, TRN, and STN, representing different levels (i.e., metabolism, gene regulation, and cellular signaling) of a cell’s activities. Furthermore, we discussed the integration of the networks of the three types. With more understanding of the complexity of microbial cells, the development of integrated networks has become an inevitable trend in analyzing genome-scale cellular networks of microorganisms.

  3. Genome-wide maps of alkylation damage, repair, and mutagenesis in yeast reveal mechanisms of mutational heterogeneity.

    Science.gov (United States)

    Mao, Peng; Brown, Alexander J; Malc, Ewa P; Mieczkowski, Piotr A; Smerdon, Michael J; Roberts, Steven A; Wyrick, John J

    2017-10-01

    DNA base damage is an important contributor to genome instability, but how the formation and repair of these lesions is affected by the genomic landscape and contributes to mutagenesis is unknown. Here, we describe genome-wide maps of DNA base damage, repair, and mutagenesis at single nucleotide resolution in yeast treated with the alkylating agent methyl methanesulfonate (MMS). Analysis of these maps revealed that base excision repair (BER) of alkylation damage is significantly modulated by chromatin, with faster repair in nucleosome-depleted regions, and slower repair and higher mutation density within strongly positioned nucleosomes. Both the translational and rotational settings of lesions within nucleosomes significantly influence BER efficiency; moreover, this effect is asymmetric relative to the nucleosome dyad axis and is regulated by histone modifications. Our data also indicate that MMS-induced mutations at adenine nucleotides are significantly enriched on the nontranscribed strand (NTS) of yeast genes, particularly in BER-deficient strains, due to higher damage formation on the NTS and transcription-coupled repair of the transcribed strand (TS). These findings reveal the influence of chromatin on repair and mutagenesis of base lesions on a genome-wide scale and suggest a novel mechanism for transcription-associated mutation asymmetry, which is frequently observed in human cancers. © 2017 Mao et al.; Published by Cold Spring Harbor Laboratory Press.

  4. Cytoplasmic ATR Activation Promotes Vaccinia Virus Genome Replication

    Directory of Open Access Journals (Sweden)

    Antonio Postigo

    2017-05-01

    Full Text Available In contrast to most DNA viruses, poxviruses replicate their genomes in the cytoplasm without host involvement. We find that vaccinia virus induces cytoplasmic activation of ATR early during infection, before genome uncoating, which is unexpected because ATR plays a fundamental nuclear role in maintaining host genome integrity. ATR, RPA, INTS7, and Chk1 are recruited to cytoplasmic DNA viral factories, suggesting canonical ATR pathway activation. Consistent with this, pharmacological and RNAi-mediated inhibition of canonical ATR signaling suppresses genome replication. RPA and the sliding clamp PCNA interact with the viral polymerase E9 and are required for DNA replication. Moreover, the ATR activator TOPBP1 promotes genome replication and associates with the viral replisome component H5. Our study suggests that, in contrast to long-held beliefs, vaccinia recruits conserved components of the eukaryote DNA replication and repair machinery to amplify its genome in the host cytoplasm.

  5. Targeted and genome-scale methylomics reveals gene body signatures in human cell lines

    Science.gov (United States)

    Ball, Madeleine Price; Li, Jin Billy; Gao, Yuan; Lee, Je-Hyuk; LeProust, Emily; Park, In-Hyun; Xie, Bin; Daley, George Q.; Church, George M.

    2012-01-01

    Cytosine methylation, an epigenetic modification of DNA, is a target of growing interest for developing high throughput profiling technologies. Here we introduce two new, complementary techniques for cytosine methylation profiling utilizing next generation sequencing technology: bisulfite padlock probes (BSPPs) and methyl sensitive cut counting (MSCC). In the first method, we designed a set of ~10,000 BSPPs distributed over the ENCODE pilot project regions to take advantage of existing expression and chromatin immunoprecipitation data. We observed a pattern of low promoter methylation coupled with high gene body methylation in highly expressed genes. Using the second method, MSCC, we gathered genome-scale data for 1.4 million HpaII sites and confirmed that gene body methylation in highly expressed genes is a consistent phenomenon over the entire genome. Our observations highlight the usefulness of techniques which are not inherently or intentionally biased in favor of only profiling particular subsets like CpG islands or promoter regions. PMID:19329998

  6. Systematic search for the Cra-binding promoters using genomic SELEX system.

    Science.gov (United States)

    Shimada, Tomohiro; Fujita, Nobuyuki; Maeda, Michihisa; Ishihama, Akira

    2005-09-01

    Cra (or FruR), a global transcription factor with both repression and activation activities, controls a large number of the genes for glycolysis and gluconeogenesis. To get insights into the entire network of transcription regulation of the E. coli genome by Cra, we isolated a set of Cra-binding sequences using an improved method of genomic SELEX. From the DNA sequences of 97 independently isolated DNA fragments by SELEX, the Cra-binding sequences were identified in a total of ten regions on the E. coli genome, including promoters of six known genes and four hitherto-unidentified genes. All six known promoters are repressed by Cra, but none of the activation-type promoters were cloned after two cycles of SELEX, because the Cra-binding affinity to the repression-type promoters is higher than to the activation-type promoters, as determined by the quantitative gel shift assay. Of a total of four newly identified Cra-binding sequences, two are associated with promoter regions of the gapA (glyceraldehyde 3-phosphate dehydrogenase) and eno (enolase) genes, both involved in sugar metabolism. The regulation of newly identified genes by Cra was confirmed by the in vivo promoter strength assay using a newly developed TFP (two-fluorescent protein) vector for promoter assay or by in vitro transcription assay in the presence of Cra protein.

  7. Genome-wide evolutionary dynamics of influenza B viruses on a global scale.

    Directory of Open Access Journals (Sweden)

    Pinky Langat

    2017-12-01

    Full Text Available The global-scale epidemiology and genome-wide evolutionary dynamics of influenza B remain poorly understood compared with influenza A viruses. We compiled a spatio-temporally comprehensive dataset of influenza B viruses, comprising over 2,500 genomes sampled worldwide between 1987 and 2015, including 382 newly-sequenced genomes that fill substantial gaps in previous molecular surveillance studies. Our contributed data increase the number of available influenza B virus genomes in Europe, Africa and Central Asia, improving the global context to study influenza B viruses. We reveal Yamagata-lineage diversity results from co-circulation of two antigenically-distinct groups that also segregate genetically across the entire genome, without evidence of intra-lineage reassortment. In contrast, Victoria-lineage diversity stems from geographic segregation of different genetic clades, with variability in the degree of geographic spread among clades. Differences between the lineages are reflected in their antigenic dynamics, as Yamagata-lineage viruses show alternating dominance between antigenic groups, while Victoria-lineage viruses show antigenic drift of a single lineage. Structural mapping of amino acid substitutions on trunk branches of influenza B gene phylogenies further supports these antigenic differences and highlights two potential mechanisms of adaptation for polymerase activity. Our study provides new insights into the epidemiological and molecular processes shaping influenza B virus evolution globally.

  8. Genome-wide evolutionary dynamics of influenza B viruses on a global scale

    Science.gov (United States)

    Langat, Pinky; Bowden, Thomas A.; Edwards, Stephanie; Gall, Astrid; Rambaut, Andrew; Daniels, Rodney S.; Russell, Colin A.; Pybus, Oliver G.; McCauley, John

    2017-01-01

    The global-scale epidemiology and genome-wide evolutionary dynamics of influenza B remain poorly understood compared with influenza A viruses. We compiled a spatio-temporally comprehensive dataset of influenza B viruses, comprising over 2,500 genomes sampled worldwide between 1987 and 2015, including 382 newly-sequenced genomes that fill substantial gaps in previous molecular surveillance studies. Our contributed data increase the number of available influenza B virus genomes in Europe, Africa and Central Asia, improving the global context to study influenza B viruses. We reveal Yamagata-lineage diversity results from co-circulation of two antigenically-distinct groups that also segregate genetically across the entire genome, without evidence of intra-lineage reassortment. In contrast, Victoria-lineage diversity stems from geographic segregation of different genetic clades, with variability in the degree of geographic spread among clades. Differences between the lineages are reflected in their antigenic dynamics, as Yamagata-lineage viruses show alternating dominance between antigenic groups, while Victoria-lineage viruses show antigenic drift of a single lineage. Structural mapping of amino acid substitutions on trunk branches of influenza B gene phylogenies further supports these antigenic differences and highlights two potential mechanisms of adaptation for polymerase activity. Our study provides new insights into the epidemiological and molecular processes shaping influenza B virus evolution globally. PMID:29284042

  9. Genome Variation Map: a data repository of genome variations in BIG Data Center.

    Science.gov (United States)

    Song, Shuhui; Tian, Dongmei; Li, Cuiping; Tang, Bixia; Dong, Lili; Xiao, Jingfa; Bao, Yiming; Zhao, Wenming; He, Hang; Zhang, Zhang

    2018-01-04

    The Genome Variation Map (GVM; http://bigd.big.ac.cn/gvm/) is a public data repository of genome variations. As a core resource in the BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, GVM is dedicated to collecting, integrating and visualizing genome variations for a wide range of species, accepts submissions of different types of genome variations from all over the world and provides free open access to all publicly available data in support of worldwide research activities. Unlike existing related databases, GVM features integration of a large number of genome variations for a broad diversity of species including human, cultivated plants and domesticated animals. Specifically, the current implementation of GVM not only houses a total of ∼4.9 billion variants for 19 species including chicken, dog, goat, human, poplar, rice and tomato, but also incorporates 8669 individual genotypes and 13 262 manually curated high-quality genotype-to-phenotype associations for non-human species. In addition, GVM provides friendly, intuitive web interfaces for data submission, browsing, search and visualization. Collectively, GVM serves as an important resource for archiving genomic variation data, helpful for better understanding population genetic diversity and deciphering complex mechanisms associated with different phenotypes. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  10. Genome Variation Map: a data repository of genome variations in BIG Data Center

    Science.gov (United States)

    Tian, Dongmei; Li, Cuiping; Tang, Bixia; Dong, Lili; Xiao, Jingfa; Bao, Yiming; Zhao, Wenming; He, Hang

    2018-01-01

    Abstract The Genome Variation Map (GVM; http://bigd.big.ac.cn/gvm/) is a public data repository of genome variations. As a core resource in the BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, GVM is dedicated to collecting, integrating and visualizing genome variations for a wide range of species, accepts submissions of different types of genome variations from all over the world and provides free open access to all publicly available data in support of worldwide research activities. Unlike existing related databases, GVM features integration of a large number of genome variations for a broad diversity of species including human, cultivated plants and domesticated animals. Specifically, the current implementation of GVM not only houses a total of ∼4.9 billion variants for 19 species including chicken, dog, goat, human, poplar, rice and tomato, but also incorporates 8669 individual genotypes and 13 262 manually curated high-quality genotype-to-phenotype associations for non-human species. In addition, GVM provides friendly, intuitive web interfaces for data submission, browsing, search and visualization. Collectively, GVM serves as an important resource for archiving genomic variation data, helpful for better understanding population genetic diversity and deciphering complex mechanisms associated with different phenotypes. PMID:29069473

  11. Sequence based polymorphic (SBP) marker technology for targeted genomic regions: its application in generating a molecular map of the Arabidopsis thaliana genome

    Directory of Open Access Journals (Sweden)

    Sahu Binod B

    2012-01-01

    Full Text Available Abstract Background Molecular markers facilitate both genotype identification, essential for modern animal and plant breeding, and the isolation of genes based on their map positions. Advancements in sequencing technology have made possible the identification of single nucleotide polymorphisms (SNPs) for any genomic regions. Here a sequence based polymorphic (SBP) marker technology for generating molecular markers for targeted genomic regions in Arabidopsis is described. Results A ~3X genome coverage sequence of the Arabidopsis thaliana ecotype, Niederzenz (Nd-0), was obtained by applying Illumina's sequencing by synthesis (Solexa) technology. Comparison of the Nd-0 genome sequence with the assembled Columbia-0 (Col-0) genome sequence identified putative single nucleotide polymorphisms (SNPs) throughout the entire genome. Multiple 75 base pair Nd-0 sequence reads containing SNPs and originating from individual genomic DNA molecules were the basis for developing co-dominant SBP markers. SNPs containing Col-0 sequences, supported by transcript sequences or sequences from multiple BAC clones, were compared to the respective Nd-0 sequences to identify possible restriction endonuclease enzyme site variations. Small amplicons, PCR amplified from both ecotypes, were digested with suitable restriction enzymes and resolved on a gel to reveal the sequence based polymorphisms. By applying this technology, 21 SBP markers for the marker poor regions of the Arabidopsis map representing polymorphisms between Col-0 and Nd-0 ecotypes were generated. Conclusions The SBP marker technology described here allowed the development of molecular markers for targeted genomic regions of Arabidopsis. It should facilitate isolation of co-dominant molecular markers for targeted genomic regions of any animal or plant species, whose genomic sequences have been assembled. This technology will particularly facilitate the development of high density molecular marker maps, essential for

  12. Genome wide SSR high density genetic map construction from an interspecific cross of Gossypium hirsutum × Gossypium tomentosum

    Directory of Open Access Journals (Sweden)

    Muhammad Kashif Riaz Khan

    2016-04-01

    Full Text Available A high density genetic map was constructed using an F2 population derived from an interspecific cross of G. hirsutum x G. tomentosum. The map consisted of 3,093 marker loci distributed across all the 26 chromosomes and covered 4,365.3 cM of the cotton genome with an average inter-marker distance of 1.48 cM. The maximum length of a chromosome was 218.38 cM and the minimum was 122.09 cM, with an average length of 167.90 cM. The A sub-genome covers more genetic distance (2,189.01 cM, with an average inter-loci distance of 1.53 cM) than the D sub-genome, which covers a length of 2,176.29 cM with an average distance of 1.43 cM. There were 716 distorted loci in the map, accounting for 23.14%, and most distorted loci were distributed on the D sub-genome (25.06%), more than on the A sub-genome (21.23%). In our map 49 segregation hotspots (SDR) were distributed across the genome, with more on the D sub-genome than on the A sub-genome. Two post-polyploidization reciprocal translocations of A2/A3 and A4/A5 were suggested by 7 pairs of duplicate loci. The map constructed through these studies is one of the three densest genetic maps in cotton; however, this is the first dense genome-wide SSR interspecific genetic map between G. hirsutum and G. tomentosum.

  13. Toward mapping the biology of the genome.

    Science.gov (United States)

    Chanock, Stephen

    2012-09-01

    This issue of Genome Research presents new results, methods, and tools from The ENCODE Project (ENCyclopedia of DNA Elements), which collectively represents an important step in moving beyond a parts list of the genome and promises to shape the future of genomic research. This collection sheds light on basic biological questions and frames the current debate over the optimization of tools and methodological challenges necessary to compare and interpret large complex data sets focused on how the genome is organized and regulated. In a number of instances, the authors have highlighted the strengths and limitations of current computational and technical approaches, providing the community with useful standards, which should stimulate development of new tools. In many ways, these papers will ripple through the scientific community, as those in pursuit of understanding the "regulatory genome" will heavily traverse the maps and tools. Similarly, the work should have a substantive impact on how genetic variation contributes to specific diseases and traits by providing a compendium of functional elements for follow-up study. The success of these papers should not only be measured by the scope of the scientific insights and tools but also by their ability to attract new talent to mine existing and future data.

  14. Large-scale parallel genome assembler over cloud computing environment.

    Science.gov (United States)

    Das, Arghya Kusum; Koppa, Praveen Kumar; Goswami, Sayan; Platania, Richard; Park, Seung-Jong

    2017-06-01

    The size of high throughput DNA sequencing data has already reached the terabyte scale. To manage this huge volume of data, many downstream sequencing applications started using locality-based computing over different cloud infrastructures to take advantage of elastic (pay as you go) resources at a lower cost. However, the locality-based programming model (e.g. MapReduce) is relatively new. Consequently, both developing scalable data-intensive bioinformatics applications using this model and understanding the hardware environment that these applications require for good performance need further research. In this paper, we present a de Bruijn graph oriented Parallel Giraph-based Genome Assembler (GiGA), as well as the hardware platform required for its optimal performance. GiGA uses the power of Hadoop (MapReduce) and Giraph (large-scale graph analysis) to achieve high scalability over hundreds of compute nodes by collocating the computation and data. GiGA achieves significantly higher scalability with competitive assembly quality compared to contemporary parallel assemblers (e.g. ABySS and Contrail) over a traditional HPC cluster. Moreover, we show that the performance of GiGA is significantly improved by using an SSD-based private cloud infrastructure over a traditional HPC cluster. We observe that the performance of GiGA on 256 cores of this SSD-based cloud infrastructure closely matches that of 512 cores of a traditional HPC cluster.
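
    For readers unfamiliar with the data structure named above, the following is a minimal single-machine sketch of de Bruijn graph construction from reads. It is not GiGA's distributed Hadoop/Giraph implementation, which the abstract does not detail; the function name, the toy reads and the choice of k are all assumptions made for illustration.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Build a de Bruijn graph: each k-mer adds an edge prefix -> suffix.

    Nodes are (k-1)-mers; repeated k-mers produce repeated edges.
    """
    graph = defaultdict(list)
    for read in reads:
        # Slide a window of length k across the read.
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].append(kmer[1:])
    return graph


reads = ["ACGTC", "CGTCA"]  # toy overlapping reads, not real data
graph = build_de_bruijn(reads, k=3)
for node in sorted(graph):
    print(node, "->", sorted(set(graph[node])))
```

    An assembler would then walk paths through this graph to reconstruct contigs; distributed assemblers partition the nodes across machines, which is the part this sketch leaves out.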

  15. A physical map for the Amborella trichopoda genome sheds light on the evolution of angiosperm genome structure

    OpenAIRE

    Zuccolo, Andrea; Bowers, John E; Estill, James C; Xiong, Zhiyong; Luo, Meizhong; Sebastian, Aswathy; Goicoechea, José Luis; Collura, Kristi; Yu, Yeisoo; Jiao, Yuannian; Duarte, Jill; Tang, Haibao; Ayyampalayam, Saravanaraj; Rounsley, Steve; Kudrna, Dave

    2011-01-01

    Background Recent phylogenetic analyses have identified Amborella trichopoda, an understory tree species endemic to the forests of New Caledonia, as sister to a clade including all other known flowering plant species. The Amborella genome is a unique reference for understanding the evolution of angiosperm genomes because it can serve as an outgroup to root comparative analyses. A physical map, BAC end sequences and sample shotgun sequences provide a first view of the 870 Mbp Amborella genome....

  16. Large Scale Landform Mapping Using Lidar DEM

    Directory of Open Access Journals (Sweden)

    Türkay Gökgöz

    2015-08-01

    Full Text Available In this study, LIDAR DEM data was used to obtain a primary landform map in accordance with a well-known methodology. This primary landform map was generalized using the Focal Statistics tool (Majority), considering the minimum area condition in cartographic generalization in order to obtain landform maps at 1:1000 and 1:5000 scales. Both the primary and the generalized landform maps were verified visually with hillshaded DEM and an orthophoto. As a result, these maps provide satisfactory visuals of the landforms. In order to show the effect of generalization, the area of each landform in both the primary and the generalized maps was computed. Consequently, landform maps at large scales could be obtained with the proposed methodology, including generalization using LIDAR DEM.

  17. Impact of genome assembly status on ChIP-Seq and ChIP-PET data mapping

    Directory of Open Access Journals (Sweden)

    Sachs Laurent

    2009-12-01

    Full Text Available Abstract Background ChIP-Seq and ChIP-PET can potentially be used with any genome for genome wide profiling of protein-DNA interaction sites. Unfortunately, it is probable that most genome assemblies will never reach the quality of the human genome assembly. Therefore, it remains to be determined whether ChIP-Seq and ChIP-PET are practicable with genome sequences other than a few (e.g. human and mouse). Findings Here, we used in silico simulations to assess the impact of completeness or fragmentation of genome assemblies on ChIP-Seq and ChIP-PET data mapping. Conclusions Most currently published genome assemblies are suitable for mapping the short sequence tags produced by ChIP-Seq or ChIP-PET.

  18. Incorporating Protein Biosynthesis into the Saccharomyces cerevisiae Genome-scale Metabolic Model

    DEFF Research Database (Denmark)

    Olivares Hernandez, Roberto

    Based on stoichiometric biochemical equations that occur in the cell, the genome-scale metabolic models can quantify the metabolic fluxes, which are regarded as the final representation of the physiological state of the cell. For Saccharomyces cerevisiae the genome scale model has been construc...

  19. Dynamic maps of UV damage formation and repair for the human genome.

    Science.gov (United States)

    Hu, Jinchuan; Adebali, Ogun; Adar, Sheera; Sancar, Aziz

    2017-06-27

    Formation and repair of UV-induced DNA damage in human cells are affected by cellular context. To study factors influencing damage formation and repair genome-wide, we developed a highly sensitive single-nucleotide resolution damage mapping method [high-sensitivity damage sequencing (HS-Damage-seq)]. Damage maps of both cyclobutane pyrimidine dimers (CPDs) and pyrimidine-pyrimidone (6-4) photoproducts [(6-4)PPs] from UV-irradiated cellular and naked DNA revealed that the effect of transcription factor binding on bulky adducts formation varies, depending on the specific transcription factor, damage type, and strand. We also generated time-resolved UV damage maps of both CPDs and (6-4)PPs by HS-Damage-seq and compared them to the complementary repair maps of the human genome obtained by excision repair sequencing to gain insight into factors that affect UV-induced DNA damage and repair and ultimately UV carcinogenesis. The combination of the two methods revealed that, whereas UV-induced damage is virtually uniform throughout the genome, repair is affected by chromatin states, transcription, and transcription factor binding, in a manner that depends on the type of DNA damage.

  20. A map of human genome variation from population-scale sequencing.

    Science.gov (United States)

    Abecasis, Gonçalo R; Altshuler, David; Auton, Adam; Brooks, Lisa D; Durbin, Richard M; Gibbs, Richard A; Hurles, Matt E; McVean, Gil A

    2010-10-28

    The 1000 Genomes Project aims to provide a deep characterization of human genome sequence variation as a foundation for investigating the relationship between genotype and phenotype. Here we present results of the pilot phase of the project, designed to develop and compare different strategies for genome-wide sequencing with high-throughput platforms. We undertook three projects: low-coverage whole-genome sequencing of 179 individuals from four populations; high-coverage sequencing of two mother-father-child trios; and exon-targeted sequencing of 697 individuals from seven populations. We describe the location, allele frequency and local haplotype structure of approximately 15 million single nucleotide polymorphisms, 1 million short insertions and deletions, and 20,000 structural variants, most of which were previously undescribed. We show that, because we have catalogued the vast majority of common variation, over 95% of the currently accessible variants found in any individual are present in this data set. On average, each person is found to carry approximately 250 to 300 loss-of-function variants in annotated genes and 50 to 100 variants previously implicated in inherited disorders. We demonstrate how these results can be used to inform association and functional studies. From the two trios, we directly estimate the rate of de novo germline base substitution mutations to be approximately 10^-8 per base pair per generation. We explore the data with regard to signatures of natural selection, and identify a marked reduction of genetic variation in the neighbourhood of genes, due to selection at linked sites. These methods and public data will support the next phase of human genetic research.

  1. Genome scale engineering techniques for metabolic engineering.

    Science.gov (United States)

    Liu, Rongming; Bassalo, Marcelo C; Zeitoun, Ramsey I; Gill, Ryan T

    2015-11-01

    Metabolic engineering has expanded from a focus on designs requiring a small number of genetic modifications to increasingly complex designs driven by advances in genome-scale engineering technologies. Metabolic engineering has been generally defined by the use of iterative cycles of rational genome modifications, strain analysis and characterization, and a synthesis step that fuels additional hypothesis generation. This cycle mirrors the Design-Build-Test-Learn cycle followed throughout various engineering fields that has recently become a defining aspect of synthetic biology. This review will attempt to summarize recent genome-scale design, build, test, and learn technologies and relate their use to a range of metabolic engineering applications. Copyright © 2015 International Metabolic Engineering Society. Published by Elsevier Inc. All rights reserved.

  2. Acorn: A grid computing system for constraint based modeling and visualization of the genome scale metabolic reaction networks via a web interface

    Directory of Open Access Journals (Sweden)

    Bushell Michael E

    2011-05-01

    Background: Constraint-based approaches facilitate the prediction of cellular metabolic capabilities, based, in turn, on predictions of the repertoire of enzymes encoded in the genome. Recently, genome annotations have been used to reconstruct genome-scale metabolic reaction networks for numerous species, including Homo sapiens, which allow simulations that provide valuable insights into topics including predictions of gene essentiality of pathogens, interpretation of genetic polymorphism in metabolic disease syndromes and suggestions for novel approaches to microbial metabolic engineering. These constraint-based simulations are being integrated with the functional genomics portals, an activity that requires efficient implementation of the constraint-based simulations in the web-based environment. Results: Here, we present Acorn, an open source (GNU GPL) grid computing system for constraint-based simulations of genome-scale metabolic reaction networks within an interactive web environment. The grid-based architecture allows efficient execution of computationally intensive, iterative protocols such as Flux Variability Analysis, which can be readily scaled up as the numbers of models (and users) increase. The web interface uses AJAX, which facilitates efficient model browsing and other search functions, and intuitive implementation of appropriate simulation conditions. Research groups can install Acorn locally and create user accounts. Users can also import models in the familiar SBML format and link reaction formulas to major functional genomics portals of choice. Selected models and simulation results can be shared between different users and made publicly available. Users can construct pathway map layouts and import them into the server using a desktop editor integrated within the system. Pathway maps are then used to visualise numerical results within the web environment. To illustrate these features we have deployed Acorn and created a

  3. High Resolution Typing by Whole Genome Mapping Enables Discrimination of LA-MRSA (CC398) Strains and Identification of Transmission Events

    Science.gov (United States)

    Bosch, Thijs; Verkade, Erwin; van Luit, Martijn; Pot, Bruno; Vauterin, Paul; Burggrave, Ronald; Savelkoul, Paul; Kluytmans, Jan; Schouls, Leo

    2013-01-01

    After its emergence in 2003, a livestock-associated (LA-)MRSA clade (CC398) has caused an impressive increase in the number of isolates submitted for the Dutch national MRSA surveillance and now comprises 40% of all isolates. The currently used molecular typing techniques have limited discriminatory power for this MRSA clade, which hampers studies on the origin and transmission routes. Recently, a new molecular analysis technique named whole genome mapping was introduced. This method creates high-resolution, ordered whole genome restriction maps that may have potential for strain typing. In this study, we assessed and validated the capability of whole genome mapping to differentiate LA-MRSA isolates. Multiple validation experiments showed that whole genome mapping produced highly reproducible results. Assessment of the technique on two well-documented MRSA outbreaks showed that whole genome mapping was able to confirm one outbreak, but revealed major differences between the maps of a second, indicating that not all isolates belonged to this outbreak. Whole genome mapping of LA-MRSA isolates that were epidemiologically unlinked provided a much higher discriminatory power than spa-typing or MLVA. In contrast, maps created from LA-MRSA isolates obtained during a proven LA-MRSA outbreak were nearly indistinguishable, showing that transmission of LA-MRSA can be detected by whole genome mapping. Finally, whole genome maps of LA-MRSA isolates originating from two unrelated veterinarians and their household members showed that veterinarians may carry and transmit different LA-MRSA strains at the same time. No such conclusions could be drawn based on spa-typing and MLVA. Although PFGE seems to be suitable for molecular typing of LA-MRSA, WGM provides a much higher discriminatory power. Furthermore, whole genome mapping can provide a comparison with other maps within 2 days after the bacterial culture is received, making it suitable to investigate transmission events and

  4. Fungal Genomics Program

    Energy Technology Data Exchange (ETDEWEB)

    Grigoriev, Igor

    2012-03-12

    The JGI Fungal Genomics Program aims to scale up sequencing and analysis of fungal genomes to explore the diversity of fungi important for energy and the environment, and to promote functional studies on a system level. Combining new sequencing technologies and comparative genomics tools, JGI is now leading the world in fungal genome sequencing and analysis. Over 120 sequenced fungal genomes with analytical tools are available via MycoCosm (www.jgi.doe.gov/fungi), a web-portal for fungal biologists. Our model of interacting with user communities, unique among other sequencing centers, helps organize these communities, improves genome annotation and analysis work, and facilitates new larger-scale genomic projects. This resulted in 20 high-profile papers published in 2011 alone and contributing to the Genomics Encyclopedia of Fungi, which targets fungi related to plant health (symbionts, pathogens, and biocontrol agents) and biorefinery processes (cellulose degradation, sugar fermentation, industrial hosts). Our next grand challenges include larger scale exploration of fungal diversity (1000 fungal genomes), developing molecular tools for DOE-relevant model organisms, and analysis of complex systems and metagenomes.

  5. The European sea bass Dicentrarchus labrax genome puzzle: comparative BAC-mapping and low coverage shotgun sequencing

    Directory of Open Access Journals (Sweden)

    Volckaert Filip AM

    2010-01-01

    Background: Food supply from the ocean is constrained by the shortage of domesticated and selected fish. Development of genomic models of economically important fishes should assist with the removal of this bottleneck. European sea bass Dicentrarchus labrax L. (Moronidae, Perciformes, Teleostei) is one of the most important fishes in European marine aquaculture; growing genomic resources put it on its way to serve as an economic model. Results: End sequencing of a sea bass genomic BAC-library enabled the comparative mapping of the sea bass genome using the three-spined stickleback Gasterosteus aculeatus genome as a reference. BAC-end sequences (102,690) were aligned to the stickleback genome. The number of mappable BACs was improved using a two-fold coverage WGS dataset of sea bass, resulting in a comparative BAC-map covering 87% of stickleback chromosomes with 588 BAC-contigs. The minimum size of 83 contigs covering 50% of the reference was 1.2 Mbp; the largest BAC-contig comprised 8.86 Mbp. More than 22,000 BAC-clones aligned with both ends to the reference genome. Intra-chromosomal rearrangements between sea bass and stickleback were identified. Size distributions of mapped BACs were used to calculate that the genome of sea bass may be only 1.3-fold larger than the 460 Mbp stickleback genome. Conclusions: The BAC map is used for sequencing single BACs or BAC-pools covering defined genomic entities by second generation sequencing technologies. Together with the WGS dataset it initiates a sea bass genome sequencing project. This will allow the quantification of polymorphisms through resequencing, which is important for selecting highly performing domesticated fish.
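
    The contig statistic quoted in this record (the smallest of the largest contigs that together cover 50% of the reference) is an N50-style measure. A minimal sketch of how such a value might be computed follows; the function name `nx_statistic` and its parameters are hypothetical, and the record does not state how its authors performed the calculation.

```python
def nx_statistic(lengths, fraction=0.5, total=None):
    """Return (length, count) for an N50-style contiguity statistic.

    Contigs are taken from longest to shortest until their summed length
    reaches `fraction` of `total`. The length of the last contig taken and
    the number of contigs taken are returned. If `total` is None, the sum of
    all contig lengths is used (classic N50); passing a reference size
    instead gives a reference-based variant. Returns None if the contigs
    never reach the target.
    """
    if total is None:
        total = sum(lengths)
    target = fraction * total
    covered = 0
    for count, length in enumerate(sorted(lengths, reverse=True), start=1):
        covered += length
        if covered >= target:
            return length, count
    return None
```

    With the toy input `[8, 5, 4, 3, 2, 2]` the call returns `(5, 2)`: two contigs are enough to cover half of the total length of 24, and the smaller of them is 5 units long. Supplying `total` as a reference size corresponds more closely to the "covering 50% of the reference" wording used above.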

  6. QTL mapping in white spruce: gene maps and genomic regions underlying adaptive traits across pedigrees, years and environments

    Science.gov (United States)

    2011-01-01

    Background: The genomic architecture of bud phenology and height growth remains poorly known in most forest trees. In non-model species, QTL studies have shown limited application because most often QTL data could not be validated from one experiment to another. The aim of our study was to overcome this limitation by basing QTL detection on the construction of genetic maps highly enriched in gene markers, and by assessing QTLs across pedigrees, years, and environments. Results: Four saturated individual linkage maps representing two unrelated mapping populations of 260 and 500 clonally replicated progeny were assembled from 471 to 570 markers, including from 283 to 451 gene SNPs obtained using a multiplexed genotyping assay. Thence, a composite linkage map was assembled with 836 gene markers. For individual linkage maps, a total of 33 distinct quantitative trait loci (QTLs) were observed for bud flush, 52 for bud set, and 52 for height growth. For the composite map, the corresponding numbers of QTL clusters were 11, 13, and 10. About 20% of QTLs were replicated between the two mapping populations and nearly 50% revealed spatial and/or temporal stability. Three to four occurrences of overlapping QTLs between characters were noted, indicating regions with potential pleiotropic effects. Moreover, some of the genes involved in the QTLs were also underlined by recent genome scans or expression profile studies. Overall, the proportion of phenotypic variance explained by each QTL ranged from 3.0 to 16.4% for bud flush, from 2.7 to 22.2% for bud set, and from 2.5 to 10.5% for height growth. Up to 70% of the total character variance could be accounted for by QTLs for bud flush or bud set, and up to 59% for height growth. Conclusions: This study provides a basic understanding of the genomic architecture related to bud flush, bud set, and height growth in a conifer species, and a useful indicator to compare with Angiosperms. It will serve as a basic reference to functional and

  7. Use of genome-scale microbial models for metabolic engineering

    DEFF Research Database (Denmark)

    Patil, Kiran Raosaheb; Åkesson, M.; Nielsen, Jens

    2004-01-01

    Metabolic engineering serves as an integrated approach to design new cell factories by providing rational design procedures and valuable mathematical and experimental tools. Mathematical models have an important role for phenotypic analysis, but can also be used for the design of optimal metabolic network structures. The major challenge for metabolic engineering in the post-genomic era is to broaden its design methodologies to incorporate genome-scale biological data. Genome-scale stoichiometric models of microorganisms represent a first step in this direction.

  8. Mapping and annotating obesity-related genes in pig and human genomes.

    Science.gov (United States)

    Martelli, Pier Luigi; Fontanesi, Luca; Piovesan, Damiano; Fariselli, Piero; Casadio, Rita

    2014-01-01

    Background. Obesity is a major health problem in both developed and emerging countries. Obesity is a complex disease whose etiology involves genetic factors in strong interplay with environmental determinants and lifestyle. The discovery of genetic factors and biological pathways underlying human obesity is hampered by the difficulty in controlling the genetic background of human cohorts. Animal models are then necessary to further dissect the genetics of obesity. Pig has emerged as one of the most attractive models, because of the similarity with humans in the mechanisms regulating the fat deposition. Results. We collected the genes related to obesity in humans and to fat deposition traits in pig. We localized them on both human and pig genomes, building a map useful to interpret comparative studies on obesity. We characterized the collected genes structurally and functionally with BAR+ and mapped them on KEGG pathways and on STRING protein interaction network. Conclusions. The collected set consists of 361 obesity related genes in human and pig genomes. All genes were mapped on the human genome, and 54 could not be localized on the pig genome (release 2012). Only for 3 human genes there is no counterpart in pig, confirming that this animal is a good model for human obesity studies. Obesity related genes are mostly involved in regulation and signaling processes/pathways and relevant connection emerges between obesity-related genes and diseases such as cancer and infectious diseases.

  9. Assembly of the Genome of the Disease Vector Aedes aegypti onto a Genetic Linkage Map Allows Mapping of Genes Affecting Disease Transmission

    KAUST Repository

    Juneja, Punita

    2014-01-30

    The mosquito Aedes aegypti transmits some of the most important human arboviruses, including dengue, yellow fever and chikungunya viruses. It has a large genome containing many repetitive sequences, which has resulted in the genome being poorly assembled - there are 4,758 scaffolds, few of which have been assigned to a chromosome. To allow the mapping of genes affecting disease transmission, we have improved the genome assembly by scoring a large number of SNPs in recombinant progeny from a cross between two strains of Ae. aegypti, and used these to generate a genetic map. This revealed a high rate of misassemblies in the current genome, where, for example, sequences from different chromosomes were found on the same scaffold. Once these were corrected, we were able to assign 60% of the genome sequence to chromosomes and approximately order the scaffolds along the chromosome. We found that there are very large regions of suppressed recombination around the centromeres, which can extend to as much as 47% of the chromosome. To illustrate the utility of this new genome assembly, we mapped a gene that makes Ae. aegypti resistant to the human parasite Brugia malayi, and generated a list of candidate genes that could be affecting the trait. © 2014 Juneja et al.

  10. Genome-Wide Association Mapping of Crown Rust Resistance in Oat Elite Germplasm.

    Science.gov (United States)

    Klos, Kathy Esvelt; Yimer, Belayneh A; Babiker, Ebrahiem M; Beattie, Aaron D; Bonman, J Michael; Carson, Martin L; Chong, James; Harrison, Stephen A; Ibrahim, Amir M H; Kolb, Frederic L; McCartney, Curt A; McMullen, Michael; Fetch, Jennifer Mitchell; Mohammadi, Mohsen; Murphy, J Paul; Tinker, Nicholas A

    2017-07-01

    Oat crown rust, caused by Puccinia coronata f. sp. avenae, is a major constraint to oat (Avena sativa L.) production in many parts of the world. In this first comprehensive multienvironment genome-wide association map of oat crown rust, we used 2972 single-nucleotide polymorphisms (SNPs) genotyped on 631 oat lines for association mapping of quantitative trait loci (QTL). Seedling reaction to crown rust in these lines was assessed as infection type (IT) with each of 10 crown rust isolates. Adult plant reaction was assessed in the field in a total of 10 location-years as percentage severity (SV) and as infection reaction (IR) on a 0-to-1 scale. Overall, 29 SNPs on 12 linkage groups were predictive of crown rust reaction in at least one experiment at a genome-wide level of statistical significance. The QTL identified here include those in regions previously shown to be linked with seedling resistance genes [...] and also with adult-plant resistance and adaptation-related QTL. In addition, QTL on linkage groups Mrg03, Mrg08, and Mrg23 were identified in regions not previously associated with crown rust resistance. Evaluation of marker genotypes in a set of crown rust differential lines supported [...] as the identity of [...]. The SNPs with rare alleles associated with lower disease scores may be suitable for use in marker-assisted selection of oat lines for crown rust resistance. Copyright © 2017 Crop Science Society of America.

  11. Multi-scale structural community organisation of the human genome.

    Science.gov (United States)

    Boulos, Rasha E; Tremblay, Nicolas; Arneodo, Alain; Borgnat, Pierre; Audit, Benjamin

    2017-04-11

    Structural interaction frequency matrices between all genome loci are now experimentally achievable thanks to high-throughput chromosome conformation capture technologies. This raises a new methodological challenge for computational biology, which consists in objectively extracting from these data the structural motifs characteristic of genome organisation. We deployed the fast multi-scale community mining algorithm based on spectral graph wavelets to characterise the networks of intra-chromosomal interactions in human cell lines. We observed that there exist structural domains of all sizes up to chromosome length and demonstrated that the set of structural communities forms a hierarchy of chromosome segments. Hence, at all scales, chromosome folding predominantly involves interactions between neighbouring sites rather than the formation of links between distant loci. Multi-scale structural decomposition of human chromosomes provides an original framework to question structural organisation and its relationship to functional regulation across the scales. By construction, the proposed methodology is independent of the precise assembly of the reference genome and is thus directly applicable to genomes whose assembly is not fully determined.

  12. High resolution linkage maps of the model organism Petunia reveal substantial synteny decay with the related genome of tomato.

    Science.gov (United States)

    Bossolini, Eligio; Klahre, Ulrich; Brandenburg, Anna; Reinhardt, Didier; Kuhlemeier, Cris

    2011-04-01

    Two linkage maps were constructed for the model plant Petunia. Mapping populations were obtained by crossing the wild species Petunia axillaris subsp. axillaris with Petunia inflata, and Petunia axillaris subsp. parodii with Petunia exserta. Both maps cover the seven chromosomes of Petunia, and span 970 centimorgans (cM) and 700 cM of the genomes, respectively. In total, 207 markers were mapped. Of these, 28 are multilocus amplified fragment length polymorphism (AFLP) markers and 179 are gene-derived markers. For the first time we report on the development and mapping of 83 Petunia microsatellites. The two maps retain the same marker order, but display significant differences of recombination frequencies at orthologous mapping intervals. A complex pattern of genomic rearrangements was detected with the related genome of tomato (Solanum lycopersicum), indicating that synteny between Petunia and other Solanaceae crops has been considerably disrupted. The newly developed markers will facilitate the genetic characterization of mutants and ecological studies on genetic diversity and speciation within the genus Petunia. The maps will provide a powerful tool to link genetic and genomic information and will be useful to support sequence assembly of the Petunia genome.

  13. Planimetric Features Generalization for the Production of Small-Scale Map by Using Base Maps and the Existing Algorithms

    Directory of Open Access Journals (Sweden)

    M. Modiri

    2014-10-01

    Cartographic maps are representations of the Earth on a flat surface at a scale smaller than reality. Large-scale maps cover relatively small regions in great detail, and small-scale maps cover large regions such as nations, continents and the whole globe. The logical connection between the features and the map scale must be maintained when the scale changes, and it is important to recognize that even the most accurate maps sacrifice a certain amount of accuracy in scale to deliver greater visual usefulness to their users. Cartographic generalization, or map generalization, is the method whereby information is selected and represented on a map in a way that adapts to the scale of the display medium of the map, not necessarily preserving all intricate geographical or other cartographic details. Because small-scale map production is problematic and surveying costs time and money, generalization is used today as a practical approach. This paper proposes software that converts various data and information to a certain data model. The software can produce a generalized map from a base map using existing algorithms. Planimetric generalization algorithms and roles are described in this article. Finally, small-scale maps at 1:100,000, 1:250,000 and 1:500,000 scales are produced automatically and shown at the end.

  14. Physical mapping and BAC-end sequence analysis provide initial insights into the flax (Linum usitatissimum L.) genome.

    Science.gov (United States)

    Ragupathy, Raja; Rathinavelu, Rajkumar; Cloutier, Sylvie

    2011-05-09

    Flax (Linum usitatissimum L.) is an important source of oil rich in omega-3 fatty acids, which have proven health benefits and utility as an industrial raw material. Flax seeds also contain lignans, which are associated with reducing the risk of certain types of cancer. Its bast fibres have broad industrial applications. However, genomic tools needed for molecular breeding were nonexistent. Hence a project, Total Utilization Flax GENomics (TUFGEN), was initiated. We report here the first genome-wide physical map of flax and the generation and analysis of BAC-end sequences (BES) from 43,776 clones, providing initial insights into the genome. The physical map consists of 416 contigs spanning ~368 Mb, assembled from 32,025 fingerprints, representing roughly 54.5% to 99.4% of the estimated haploid genome (370-675 Mb). The N50 size of the contigs was estimated to be ~1,494 kb. The longest contig was ~5,562 kb, comprising 437 clones. There were 96 contigs containing more than 100 clones. Approximately 54.6 Mb representing 8-14.8% of the genome was obtained from 80,337 BES. Annotation revealed that a large part of the genome consists of ribosomal DNA (~13.8%), followed by known transposable elements at 6.1%. Furthermore, ~7.4% of sequence was identified to harbour novel repeat elements. Homology searches against flax-ESTs and NCBI-ESTs suggested that ~5.6% of the transcriptome is unique to flax. A total of 4064 putative genomic SSRs were identified and are being developed as novel markers for their use in molecular breeding. The first genome-wide physical map of flax constructed with BAC clones provides a framework for accessing target loci with economic importance for marker development and positional cloning. Analysis of the BES has provided insights into the uniqueness of the flax genome. Compared to other plant genomes, the proportion of rDNA was found to be very high whereas the proportion of known transposable elements was low. The SSRs identified from BES will be

  15. Healthy Universities: Mapping Health-Promotion Interventions

    Science.gov (United States)

    Sarmiento, Juan Pablo

    2017-01-01

    Purpose: The purpose of this paper is to map out and characterize existing health-promotion initiatives at Florida International University (FIU) in the USA in order to inform decision makers involved in the development of a comprehensive and a long-term healthy university strategy. Design/methodology/approach: This study encompasses a narrative…

  16. Genome-wide SNP identification, linkage map construction and QTL mapping for seed mineral concentrations and contents in pea (Pisum sativum L.).

    Science.gov (United States)

    Ma, Yu; Coyne, Clarice J; Grusak, Michael A; Mazourek, Michael; Cheng, Peng; Main, Dorrie; McGee, Rebecca J

    2017-02-13

    Marker-assisted breeding is now routinely used in major crops to facilitate more efficient cultivar improvement. This has been significantly enabled by the use of next-generation sequencing technology to identify loci and markers associated with traits of interest. While rich in a range of nutritional components, such as protein, mineral nutrients, carbohydrates and several vitamins, pea (Pisum sativum L.), one of the oldest domesticated crops in the world, remains behind many other crops in the availability of genomic and genetic resources. Further improving mineral nutrient levels in pea seeds requires the development of genome-wide tools. The objectives of this research were to develop these tools by: identifying genome-wide single nucleotide polymorphisms (SNPs) using genotyping by sequencing (GBS); constructing a high-density linkage map and comparative maps with other legumes; and identifying quantitative trait loci (QTL) for levels of boron, calcium, iron, potassium, magnesium, manganese, molybdenum, phosphorus, sulfur, and zinc in the seed, as well as for seed weight. In this study, 1609 high-quality SNPs were found to be polymorphic between 'Kiflica' and 'Aragorn', two parents of an F6-derived recombinant inbred line (RIL) population. Mapping 1683 markers including 75 previously published markers and 1608 SNPs developed from the present study generated a linkage map of size 1310.1 cM. Comparative mapping with other legumes demonstrated that the highest level of synteny was observed between pea and the genome of Medicago truncatula. QTL analysis of the RIL population across two locations revealed at least one QTL for each of the mineral nutrient traits. In total, 46 seed mineral concentration QTLs, 37 seed mineral content QTLs, and 6 seed weight QTLs were discovered. The QTLs explained from 2.4% to 43.3% of the phenotypic variance. The genome-wide SNPs and the genetic linkage map developed in this study permitted QTL identification for pea seed mineral

  17. Large-scale genomic 2D visualization reveals extensive CG-AT skew correlation in bird genomes

    Directory of Open Access Journals (Sweden)

    Deng Xuemei

    2007-11-01

    Background: Bird genomes have very different compositional structure compared with other warm-blooded animals. The variation in the base skew rules in the vertebrate genomes remains puzzling, but it must relate somehow to large-scale genome evolution. Current research is inclined to relate base skew with mutations and their fixation. Here we wish to explore base skew correlations in bird genomes, to develop methods for displaying and quantifying such correlations at different scales, and to discuss possible explanations for the peculiarities of the bird genomes in skew correlation. Results: We have developed a method called Base Skew Double Triangle (BSDT) for exhibiting the genome-scale change of AT/CG skew as a two-dimensional square picture, showing base skews at many scales simultaneously in a single image. By this method we found that most chicken chromosomes have high AT/CG skew correlation (symmetry in 2D picture), except for some microchromosomes. No other organisms studied (18 species) show such high skew correlations. This visualized high correlation was validated by three kinds of quantitative calculations with overlapping and non-overlapping windows, all indicating that chicken and birds in general have a special genome structure. Similar features were also found in some of the mammal genomes, but clearly much weaker than in chickens. We presume that the skew correlation feature evolved near the time that birds separated from other vertebrate lineages. When we eliminated the repeat sequences from the genomes, the AT and CG skew correlation increased for some mammal genomes, but was still clearly lower than in chickens. Conclusion: Our results suggest that BSDT is an expressive visualization method for AT and CG skew and enabled the discovery of the very high skew correlation in bird genomes; this peculiarity is worth further study. Computational analysis indicated that this correlation might be a compositional characteristic

  18. Comparative genomic analysis of four representative plant growth-promoting rhizobacteria in Pseudomonas

    Science.gov (United States)

    2013-01-01

    Background Some Pseudomonas strains function as predominant plant growth-promoting rhizobacteria (PGPR). Within this group, Pseudomonas chlororaphis and Pseudomonas fluorescens are non-pathogenic biocontrol agents, and some Pseudomonas aeruginosa and Pseudomonas stutzeri strains are PGPR. P. chlororaphis GP72 is a plant growth-promoting rhizobacterium with a fully sequenced genome. We conducted a genomic analysis comparing GP72 with three other pseudomonad PGPR: P. fluorescens Pf-5, P. aeruginosa M18, and the nitrogen-fixing strain P. stutzeri A1501. Our aim was to identify the similarities and differences among these strains using a comparative genomic approach to clarify the mechanisms of plant growth-promoting activity. Results The genome sizes of GP72, Pf-5, M18, and A1501 ranged from 4.6 to 7.1 M, and the number of protein-coding genes varied among the four species. Clusters of Orthologous Groups (COGs) analysis assigned functions to predicted proteins. The COGs distributions were similar among the four species. However, the percentage of genes encoding transposases and their inactivated derivatives (COG L) was 1.33% of the total genes with COGs classifications in A1501, 0.21% in GP72, 0.02% in Pf-5, and 0.11% in M18. A phylogenetic analysis indicated that GP72 and Pf-5 were the most closely related strains, consistent with the genome alignment results. Comparisons of predicted coding sequences (CDSs) between GP72 and Pf-5 revealed 3544 conserved genes. There were fewer conserved genes when GP72 CDSs were compared with those of A1501 and M18. Comparisons among the four Pseudomonas species revealed 603 conserved genes in GP72, illustrating common plant growth-promoting traits shared among these PGPR. Conserved genes were related to catabolism, transport of plant-derived compounds, stress resistance, and rhizosphere colonization. Some strain-specific CDSs were related to different kinds of biocontrol activities or plant growth promotion. The GP72 genome

  19. Selection for Unequal Densities of Sigma70 Promoter-like Signals in Different Regions of Large Bacterial Genomes

    Energy Technology Data Exchange (ETDEWEB)

    Huerta, Araceli M.; Francino, M. Pilar; Morett, Enrique; Collado-Vides, Julio

    2006-03-01

    The evolutionary processes operating in the DNA regions that participate in the regulation of gene expression are poorly understood. In Escherichia coli, we have established a sequence pattern that distinguishes regulatory from nonregulatory regions. The density of promoter-like sequences, that are recognizable by RNA polymerase and may function as potential promoters, is high within regulatory regions, in contrast to coding regions and regions located between convergently-transcribed genes. Moreover, functional promoter sites identified experimentally are often found in the subregions of highest density of promoter-like signals, even when individual sites with higher binding affinity for RNA polymerase exist elsewhere within the regulatory region. In order to investigate the generality of this pattern, we have used position weight matrices describing the -35 and -10 promoter boxes of E. coli to search for these motifs in 43 additional genomes belonging to most established bacterial phyla, after specific calibration of the matrices according to the base composition of the noncoding regions of each genome. We have found that all bacterial species analyzed contain similar promoter-like motifs, and that, in most cases, these motifs follow the same genomic distribution observed in E. coli. Differential densities between regulatory and nonregulatory regions are detectable in most bacterial genomes, with the exception of those that have experienced evolutionary extreme genome reduction. Thus, the phylogenetic distribution of this pattern mirrors that of genes and other genomic features that require weak selection to be effective in order to persist. On this basis, we suggest that the loss of differential densities in the reduced genomes of host-restricted pathogens and symbionts is the outcome of a process of genome degradation resulting from the decreased efficiency of purifying selection in highly structured small populations. This implies that the differential

  20. A RAD-based linkage map and comparative genomics in the gudgeons (genus Gnathopogon, Cyprinidae)

    Directory of Open Access Journals (Sweden)

    Kakioka Ryo

    2013-01-01

    Background The construction of linkage maps is a first step in exploring the genetic basis for adaptive phenotypic divergence in closely related species by quantitative trait locus (QTL) analysis. Linkage maps are also useful for comparative genomics in non-model organisms. Advances in genomics technologies make it more feasible than ever to study the genetics of adaptation in natural populations. Restriction-site associated DNA (RAD) sequencing in next-generation sequencers facilitates the development of many genetic markers and genotyping. We aimed to construct a linkage map of the gudgeons of the genus Gnathopogon (Cyprinidae) for comparative genomics with the zebrafish Danio rerio (a member of the same family as gudgeons) and for the future QTL analysis of the genetic architecture underlying adaptive phenotypic evolution of Gnathopogon. Results We constructed the first genetic linkage map of Gnathopogon using a 198-individual F2 interspecific cross between two closely related species in Japan: river-dwelling Gnathopogon elongatus and lake-dwelling Gnathopogon caerulescens. Based on 1,622 RAD-tag markers, a linkage map spanning 1,390.9 cM with 25 linkage groups and an average marker interval of 0.87 cM was constructed. We also identified a region involving female-specific transmission ratio distortion (TRD). Synteny and collinearity were extensively conserved between Gnathopogon and zebrafish. Conclusions The dense SNP-based linkage map presented here provides a basis for future QTL analysis. It will also be useful for transferring genomic information from a “traditional” model fish species, zebrafish, to screen candidate genes underlying ecologically important traits of the gudgeons.

  1. Saturated linkage map construction in Rubus idaeus using genotyping by sequencing and genome-independent imputation

    Directory of Open Access Journals (Sweden)

    Ward Judson A

    2013-01-01

    Background Rapid development of highly saturated genetic maps aids molecular breeding, which can accelerate gain per breeding cycle in woody perennial plants such as Rubus idaeus (red raspberry). Recently, robust genotyping methods based on high-throughput sequencing were developed, which provide high marker density, but result in some genotype errors and a large number of missing genotype values. Imputation can reduce the number of missing values and can correct genotyping errors, but current methods of imputation require a reference genome and thus are not an option for most species. Results Genotyping by Sequencing (GBS) was used to produce highly saturated maps for an R. idaeus pseudo-testcross progeny. While low coverage and high variance in sequencing resulted in a large number of missing values for some individuals, a novel method of imputation based on maximum likelihood marker ordering from initial marker segregation overcame the challenge of missing values, and made map construction computationally tractable. The two resulting parental maps contained 4521 and 2391 molecular markers spanning 462.7 and 376.6 cM respectively over seven linkage groups. Detection of precise genomic regions with segregation distortion was possible because of map saturation. Microsatellites (SSRs) linked these results to published maps for cross-validation and map comparison. Conclusions GBS together with genome-independent imputation provides a rapid method for genetic map construction in any pseudo-testcross progeny. Our method of imputation estimates the correct genotype call of missing values and corrects genotyping errors that lead to inflated map size and reduced precision in marker placement. Comparison of SSRs to published R. idaeus maps showed that the linkage maps constructed with GBS and our method of imputation were robust, and marker positioning reliable. The high marker density allowed identification of genomic regions with segregation

  2. Comparative genome analysis and resistance gene mapping in grain legumes

    International Nuclear Information System (INIS)

    Young, N.D.

    1998-01-01

    Using DNA markers and genome organization, several important disease resistance genes have been analyzed in mungbean (Vigna radiata), cowpea (Vigna unguiculata), common bean (Phaseolus vulgaris), and soybean (Glycine max). In the process, medium-density linkage maps consisting of restriction fragment length polymorphism (RFLP) markers were constructed for both mungbean and cowpea. Comparisons between these maps, as well as the maps of soybean and common bean, indicate that there is significant conservation of DNA marker order, though the conserved blocks in soybean are much shorter than in the others. DNA mapping results also indicate that a gene for seed weight may be conserved between mungbean and cowpea. Using the linkage maps, genes that control bruchid (genus Callosobruchus) and powdery mildew (Erysiphe polygoni) resistance in mungbean, aphid (Aphis craccivora) resistance in cowpea, and cyst nematode (Heterodera glycines) resistance in soybean have all been mapped and characterized. For some of these traits, resistance was found to be oligogenic, and DNA mapping uncovered multiple genes involved in the phenotype. (author)

  3. Using Genome-scale Models to Predict Biological Capabilities

    DEFF Research Database (Denmark)

    O’Brien, Edward J.; Monk, Jonathan M.; Palsson, Bernhard O.

    2015-01-01

    Constraint-based reconstruction and analysis (COBRA) methods at the genome scale have been under development since the first whole-genome sequences appeared in the mid-1990s. A few years ago, this approach began to demonstrate the ability to predict a range of cellular functions, including cellul...

  4. Physical mapping and BAC-end sequence analysis provide initial insights into the flax (Linum usitatissimum L.) genome

    Directory of Open Access Journals (Sweden)

    Cloutier Sylvie

    2011-05-01

    Background Flax (Linum usitatissimum L.) is an important source of oil rich in omega-3 fatty acids, which have proven health benefits and utility as an industrial raw material. Flax seeds also contain lignans, which are associated with reducing the risk of certain types of cancer. Its bast fibres have broad industrial applications. However, genomic tools needed for molecular breeding were nonexistent. Hence a project, Total Utilization Flax GENomics (TUFGEN), was initiated. We report here the first genome-wide physical map of flax and the generation and analysis of BAC-end sequences (BES) from 43,776 clones, providing initial insights into the genome. Results The physical map consists of 416 contigs spanning ~368 Mb, assembled from 32,025 fingerprints, representing roughly 54.5% to 99.4% of the estimated haploid genome (370-675 Mb). The N50 size of the contigs was estimated to be ~1,494 kb. The longest contig was ~5,562 kb, comprising 437 clones. There were 96 contigs containing more than 100 clones. Approximately 54.6 Mb, representing 8-14.8% of the genome, was obtained from 80,337 BES. Annotation revealed that a large part of the genome consists of ribosomal DNA (~13.8%), followed by known transposable elements at 6.1%. Furthermore, ~7.4% of sequence was identified to harbour novel repeat elements. Homology searches against flax-ESTs and NCBI-ESTs suggested that ~5.6% of the transcriptome is unique to flax. A total of 4064 putative genomic SSRs were identified and are being developed as novel markers for their use in molecular breeding. Conclusion The first genome-wide physical map of flax constructed with BAC clones provides a framework for accessing target loci with economic importance for marker development and positional cloning. Analysis of the BES has provided insights into the uniqueness of the flax genome. Compared to other plant genomes, the proportion of rDNA was found to be very high whereas the proportion of known transposable
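
    The N50 statistic quoted in this record is the length of the contig at which the running total, taken from the longest contig downward, first reaches half of the summed contig length. The sketch below shows that standard definition. The contig lengths are hypothetical toy values, not the flax data, and the function name is an illustration only.

```python
def n50(lengths):
    """Length L such that contigs of length >= L hold at least half the total."""
    if not lengths:
        raise ValueError("n50 needs at least one contig length")
    total = sum(lengths)
    running = 0
    # Walk from the longest contig down until half the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


# Hypothetical contig lengths in kb.
toy_contigs_kb = [2, 3, 4, 5, 6]
toy_n50_kb = n50(toy_contigs_kb)
```

    For the toy list the total is 20 kb, and the two longest contigs (6 + 5 = 11 kb) already cover half of it, so the N50 is 5 kb.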

  5. Accurate estimation of short read mapping quality for next-generation genome sequencing

    Science.gov (United States)

    Ruffalo, Matthew; Koyutürk, Mehmet; Ray, Soumya; LaFramboise, Thomas

    2012-01-01

    Motivation: Several software tools specialize in the alignment of short next-generation sequencing reads to a reference sequence. Some of these tools report a mapping quality score for each alignment; in principle, this quality score tells researchers the likelihood that the alignment is correct. However, the reported mapping quality often correlates weakly with actual accuracy and the qualities of many mappings are underestimated, encouraging researchers to discard correct mappings. Further, these low-quality mappings tend to correlate with variations in the genome (both single nucleotide and structural), and such mappings are important in accurately identifying genomic variants. Approach: We develop a machine learning tool, LoQuM (LOgistic regression tool for calibrating the Quality of short read mappings), to assign reliable mapping quality scores to mappings of Illumina reads returned by any alignment tool. LoQuM uses statistics on the read (base quality scores reported by the sequencer) and the alignment (number of matches, mismatches and deletions, mapping quality score returned by the alignment tool, if available, and number of mappings) as features for classification and uses simulated reads to learn a logistic regression model that relates these features to actual mapping quality. Results: We test the predictions of LoQuM on an independent dataset generated by the ART short read simulation software and observe that LoQuM can ‘resurrect’ many mappings that are assigned zero quality scores by the alignment tools and are therefore likely to be discarded by researchers. We also observe that the recalibration of mapping quality scores greatly enhances the precision of called single nucleotide polymorphisms. Availability: LoQuM is available as open source at http://compbio.case.edu/loqum/. Contact: matthew.ruffalo@case.edu. PMID:22962451
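
    The general idea in this record, learning a logistic regression from simulated reads and using it to recalibrate mapping quality, can be sketched as follows. This is not LoQuM's code or feature set. The two features (mismatch count and number of candidate mappings), the toy training rows, and all function names are assumptions made for illustration. Converting the predicted probability to a Phred-scaled score follows the usual mapping-quality convention, and whether LoQuM reports its scores that way is also an assumption.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit(rows, labels, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n_feat = len(rows[0])
    w, b = [0.0] * n_feat, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_feat, 0.0
        for x, y in zip(rows, labels):
            err = sigmoid(b + sum(wi * xi for wi, xi in zip(w, x))) - y
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        w = [wi - lr * g / len(rows) for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / len(rows)
    return w, b


def predict(w, b, x):
    """Estimated probability that the mapping is correct."""
    return sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))


def phred(p_correct, cap=60):
    """Phred-scaled quality, -10 * log10(P(wrong)), capped at `cap`."""
    p_err = max(1.0 - p_correct, 10 ** (-cap / 10))
    return round(-10 * math.log10(p_err))


# Hypothetical simulated reads: [mismatches, candidate mappings];
# label 1 means the simulated read was mapped to its true origin.
rows = [[0, 1], [1, 1], [0, 1], [4, 3], [5, 4], [3, 5]]
labels = [1, 1, 1, 0, 0, 0]
w, b = fit(rows, labels)
recalibrated = phred(predict(w, b, [0, 1]))
```

    Simulated reads make this possible because the true origin of each read is known, so every mapping can be labelled correct or incorrect for training.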

  6. Genome scale metabolic modeling of cancer

    DEFF Research Database (Denmark)

    Nilsson, Avlant; Nielsen, Jens

    2017-01-01

    Cancer cells reprogram metabolism to support rapid proliferation and survival. Energy metabolism is particularly important for growth and genes encoding enzymes involved in energy metabolism are frequently altered in cancer cells. A genome scale metabolic model (GEM) is a mathematical formalization of metabolism which allows simulation and hypothesis testing of metabolic strategies. It has successfully been applied to many microorganisms and is now used to study cancer metabolism. Generic models of human metabolism have been reconstructed based on the existence of metabolic genes in the human genome...

  7. The human noncoding genome defined by genetic diversity.

    Science.gov (United States)

    di Iulio, Julia; Bartha, Istvan; Wong, Emily H M; Yu, Hung-Chun; Lavrenko, Victor; Yang, Dongchan; Jung, Inkyung; Hicks, Michael A; Shah, Naisha; Kirkness, Ewen F; Fabani, Martin M; Biggs, William H; Ren, Bing; Venter, J Craig; Telenti, Amalio

    2018-03-01

    Understanding the significance of genetic variants in the noncoding genome is emerging as the next challenge in human genomics. We used the power of 11,257 whole-genome sequences and 16,384 heptamers (7-nt motifs) to build a map of sequence constraint for the human species. This build differed substantially from traditional maps of interspecies conservation and identified regulatory elements among the most constrained regions of the genome. Using new Hi-C experimental data, we describe a strong pattern of coordination over 2 Mb where the most constrained regulatory elements associate with the most essential genes. Constrained regions of the noncoding genome are up to 52-fold enriched for known pathogenic variants as compared to unconstrained regions (21-fold when compared to the genome average). This map of sequence constraint across thousands of individuals is an asset to help interpret noncoding elements in the human genome, prioritize variants and reconsider gene units at a larger scale.
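
    The figure of 16,384 heptamers in this record is the number of distinct 7-nucleotide words over the four-letter DNA alphabet, 4^7. The sketch below enumerates them and counts heptamers in a sequence with a sliding window. It illustrates the counting only, not the authors' constraint model, and the function names are hypothetical.

```python
from collections import Counter
from itertools import product


def all_kmers(k, alphabet="ACGT"):
    """Every possible k-mer over the alphabet, in lexicographic order."""
    return ["".join(letters) for letters in product(alphabet, repeat=k)]


def count_kmers(seq, k):
    """Count k-mers in `seq`, skipping windows with non-ACGT symbols such as N."""
    seq = seq.upper()
    valid = set("ACGT")
    counts = Counter()
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if set(window) <= valid:
            counts[window] += 1
    return counts


heptamers = all_kmers(7)
toy_counts = count_kmers("ACGTACGTAC", 7)
```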

  8. A Targeted Capture Linkage Map Anchors the Genome of the Schistosomiasis Vector Snail, Biomphalaria glabrata.

    Science.gov (United States)

    Tennessen, Jacob A; Bollmann, Stephanie R; Blouin, Michael S

    2017-07-05

    The aquatic planorbid snail Biomphalaria glabrata is one of the most intensively-studied mollusks due to its role in the transmission of schistosomiasis. Its 916 Mb genome has recently been sequenced and annotated, but it remains poorly assembled. Here, we used targeted capture markers to map over 10,000 B. glabrata scaffolds in a linkage cross of 94 F1 offspring, generating 24 linkage groups (LGs). We added additional scaffolds to these LGs based on linkage disequilibrium (LD) analysis of targeted capture and whole-genome sequences of 96 unrelated snails. Our final linkage map consists of 18,613 scaffolds comprising 515 Mb, representing 56% of the genome and 75% of genic and nonrepetitive regions. There are 18 large (> 10 Mb) LGs, likely representing the expected 18 haploid chromosomes, and > 50% of the genome has been assigned to LGs of at least 17 Mb. Comparisons with other gastropod genomes reveal patterns of synteny and chromosomal rearrangements. Linkage relationships of key immune-relevant genes may help clarify snail-schistosome interactions. By focusing on linkage among genic and nonrepetitive regions, we have generated a useful resource for associating snail phenotypes with causal genes, even in the absence of a complete genome assembly. A similar approach could potentially improve numerous poorly-assembled genomes in other taxa. This map will facilitate future work on this host of a serious human parasite. Copyright © 2017 Tennessen et al.

  9. The Metacognitive Anger Processing (MAP) Scale

    DEFF Research Database (Denmark)

    Moeller, Stine Bjerrum

    2015-01-01

    ... preliminary studies was to apply a metacognitive framework to anger and put forward a new anger self-report scale, the Metacognitive Anger Processing (MAP) scale, intended as a supplement to existing measures of anger disposition and to enhance anger treatment targets. METHOD: The new measure was tested in a nonclinical and a clinical sample together with measures of anger and metacognition to establish factor structure, reliability, concurrent, and convergent validity. RESULTS: The MAP showed a reliable factor structure with three factors - Positive Beliefs about anger, Negative Beliefs about anger... : The present data indicate that positive as well as negative beliefs are involved in the tendency to ruminate about angry emotions. Clinical interventions may benefit from an exploration of the patient's experience of anger, as structured by the MAP's factors and their interrelationships. The psychometric...

  10. Mammalian RNA polymerase II core promoters: insights from genome-wide studies

    DEFF Research Database (Denmark)

    Sandelin, Albin; Carninci, Piero; Lenhard, Boris

    2007-01-01

    The identification and characterization of mammalian core promoters and transcription start sites is a prerequisite to understanding how RNA polymerase II transcription is controlled. New experimental technologies have enabled genome-wide discovery and characterization of core promoters, revealing...... in the mammalian transcriptome and proteome. Promoters can be described by their start site usage distribution, which is coupled to the occurrence of cis-regulatory elements, gene function and evolutionary constraints. A comprehensive survey of mammalian promoters is a major step towards describing...

  11. The first genetic map of a synthesized allohexaploid Brassica with A, B and C genomes based on simple sequence repeat markers.

    Science.gov (United States)

    Yang, S; Chen, S; Geng, X X; Yan, G; Li, Z Y; Meng, J L; Cowling, W A; Zhou, W J

    2016-04-01

    We present the first genetic map of an allohexaploid Brassica species, based on segregating microsatellite markers in a doubled haploid mapping population generated from a hybrid between two hexaploid parents. This study reports the first genetic map of trigenomic Brassica. A doubled haploid mapping population consisting of 189 lines was obtained via microspore culture from a hybrid H16-1 derived from a cross between two allohexaploid Brassica lines (7H170-1 and Y54-2). Simple sequence repeat primer pairs specific to the A genome (107), B genome (44) and C genome (109) were used to construct a genetic linkage map of the population. Twenty-seven linkage groups were resolved from 274 polymorphic loci on the A genome (109), B genome (49) and C genome (116) covering a total genetic distance of 3178.8 cM with an average distance between markers of 11.60 cM. This is the first genetic framework map for the artificially synthesized Brassica allohexaploids. The linkage groups represent the expected complement of chromosomes in the A, B and C genomes from the original diploid and tetraploid parents. This framework linkage map will be valuable for QTL analysis and future genetic improvement of a new allohexaploid Brassica species, and in improving our understanding of the genetic control of meiosis in new polyploids.
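
    The summary figures in this record can be checked with simple arithmetic. The sketch below assumes the average distance between markers was obtained by dividing the total map length by the number of loci; that is an assumption about the authors' convention, although it does reproduce the reported 11.60 cM. The variable and function names are hypothetical.

```python
# Polymorphic loci per genome, as reported in the abstract.
loci_per_genome = {"A": 109, "B": 49, "C": 116}
total_loci = sum(loci_per_genome.values())

total_map_length_cm = 3178.8


def mean_spacing(total_cm, n_loci):
    """Average map distance per locus (assumed convention: length / loci)."""
    return total_cm / n_loci


average_cm = mean_spacing(total_map_length_cm, total_loci)
```

    Dividing by the number of marker intervals (loci minus linkage groups) instead would give a larger value, so the convention matters when comparing maps across studies.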

  12. Genomics-assisted breeding in fruit trees.

    Science.gov (United States)

    Iwata, Hiroyoshi; Minamikawa, Mai F; Kajiya-Kanegae, Hiromi; Ishimori, Motoyuki; Hayashi, Takeshi

    2016-01-01

    Recent advancements in genomic analysis technologies have opened up new avenues to promote the efficiency of plant breeding. Novel genomics-based approaches for plant breeding and genetics research, such as genome-wide association studies (GWAS) and genomic selection (GS), are useful, especially in fruit tree breeding. The breeding of fruit trees is hindered by their long generation time, large plant size, long juvenile phase, and the necessity to wait for the physiological maturity of the plant to assess the marketable product (fruit). In this article, we describe the potential of genomics-assisted breeding, which uses these novel genomics-based approaches, to break through these barriers in conventional fruit tree breeding. We first introduce the molecular marker systems and whole-genome sequence data that are available for fruit tree breeding. Next we introduce the statistical methods for biparental linkage and quantitative trait locus (QTL) mapping as well as GWAS and GS. We then review QTL mapping, GWAS, and GS studies conducted on fruit trees. We also review novel technologies for rapid generation advancement. Finally, we note the future prospects of genomics-assisted fruit tree breeding and problems that need to be overcome in the breeding.

  13. Identification of independent association signals and putative functional variants for breast cancer risk through fine-scale mapping of the 12p11 locus

    NARCIS (Netherlands)

    C. Zeng (Chenjie); Guo, X. (Xingyi); J. Long (Jirong); K.B. Kuchenbaecker (Karoline); A. Droit (Arnaud); K. Michailidou (Kyriaki); M. Ghoussaini (Maya); S. Kar (Siddhartha); Freeman, A. (Adam); J.L. Hopper (John); R.L. Milne (Roger); M.K. Bolla (Manjeet K.); Wang, Q. (Qin); J. Dennis (Joe); S. Agata (Simona); S. Ahmed (Shahana); K. Aittomäki (Kristiina); I.L. Andrulis (Irene); H. Anton-Culver (Hoda); Antonenkova, N.N. (Natalia N.); A. Arason (Adalgeir); Arndt, V. (Volker); B.K. Arun (Banu); B. Arver (Brita Wasteson); F. Bacot (Francois); D. Barrowdale (Daniel); Baynes, C. (Caroline); A. Beeghly-Fadiel (Alicia); J. Benítez (Javier); M. Bermisheva (Marina); C. Blomqvist (Carl); W.J. Blot (William); N.V. Bogdanova (Natalia); S.E. Bojesen (Stig); B. Bonnani (Bernardo); A.-L. Borresen-Dale (Anne-Lise); J.S. Brand (Judith S.); H. Brauch (Hiltrud); P. Brennan (Paul); H. Brenner (Hermann); A. Broeks (Annegien); T. Brüning (Thomas); B. Burwinkel (Barbara); S.S. Buys (Saundra); Q. Cai (Qiuyin); T. Caldes (Trinidad); I. Campbell (Ian); T.A. Carpenter (Adrian); J. Chang-Claude (Jenny); Choi, J.-Y. (Ji-Yeob); K.B.M. Claes (Kathleen B.M.); C. Clarke (Christine); A. Cox (Angela); S.S. Cross (Simon); K. Czene (Kamila); M.B. Daly (Mary B.); M. de La Hoya (Miguel); K. De Leeneer (Kim); P. Devilee (Peter); O. Díez (Orland); S.M. Domchek (Susan); M. Doody (Michele); C.M. Dorfling (Cecilia); T. Dörk (Thilo); I. dos Santos Silva (Isabel); M. Dumont (Martine); M. Dwek (Miriam); Dworniczak, B. (Bernd); K.M. Egan (Kathleen); U. Eilber (Ursula); Z. Einbeigi (Zakaria); B. Ejlertsen (Bent); S.D. Ellis (Steve); D. Frost (Debra); F. Lalloo (Fiona); P.A. Fasching (Peter); J.D. Figueroa (Jonine); H. Flyger (Henrik); M. Friedlander (Michael); E. Friedman (Eitan); Gambino, G. (Gaetana); Gao, Y.-T. (Yu-Tang); J. Garber (Judy); M. García-Closas (Montserrat); P.A. Gehrig (Paola A.); F. Damiola (Francesca); F. Lesueur (Fabienne); S. Mazoyer (Sylvie); D. Stoppa-Lyonnet (Dominique); Giles, G.G. 
(Graham G.); A.K. Godwin (Andrew K.); D. Goldgar (David); A. González-Neira (Anna); M.H. Greene (Mark H.); P. Guénel (Pascal); L. Haeberle (Lothar); C.A. Haiman (Christopher A.); Hallberg, E. (Emily); U. Hamann (Ute); T.V.O. Hansen (Thomas); S. Hart (Stewart); J.M. Hartikainen (J.); J.M. Hartman (Joost); N. Hassan (Norhashimah); S. Healey (Sue); F.B.L. Hogervorst (Frans); S. Verhoef; Hendricks, C.B. (Carolyn B.); P. Hillemanns (Peter); A. Hollestelle (Antoinette); P.J. Hulick (Peter); D. Hunter (David); E.N. Imyanitov (Evgeny); C. Isaacs (Claudine); H. Ito (Hidemi); A. Jakubowska (Anna); R. Janavicius (Ramunas); Jaworska-Bieniek, K. (Katarzyna); U.B. Jensen; E.M. John (Esther); Joly Beauparlant, C. (Charles); M. Jones (Michael); M. Kabisch (Maria); D. Kang (Daehee); Karlan, B.Y. (Beth Y.); S. Kauppila (Saila); M. Kerin (Michael); S. Khan (Sofia); E.K. Khusnutdinova (Elza); J.A. Knight (Julia); I. Konstantopoulou (I.); P. Kraft (Peter); A. Kwong (Ava); Y. Laitman (Yael); Lambrechts, D. (Diether); C. Lazaro (Conxi); L. Le Marchand (Loic); C.N. Lee (Chuen); M.H. Lee (Min Hyuk); K.J. Lester (Kathryn); J. Li (Jingmei); A. Liljegren (Annelie); A. Lindblom (Annika); A. Lophatananon (Artitaya); J. Lubinski (Jan); P.L. Mai (Phuong); A. Mannermaa (Arto); S. Manoukian (Siranoush); S. Margolin (Sara); Marme, F. (Frederik); K. Matsuo (Keitaro); L. McGuffog (Lesley); A. Meindl (Alfons); F. Menegaux (Florence); M. Montagna (Marco); K.R. Muir (K.); A.-M. Mulligan (Anna-Marie); K.L. Nathanson (Katherine); S.L. Neuhausen (Susan); H. Nevanlinna (Heli); P. Newcomb (Polly); S. Nord (Silje); R.L. Nussbaum (Robert L.); K. Offit (Kenneth); E. Olah; O.I. Olopade (Olufunmilayo I.); C. Olswold (Curtis); A. Osorio (Ana); L. Papi (Laura); T.-W. Park-Simon; Paulsson-Karlsson, Y. (Ylva); S.T.H. Peeters (Stephanie); B. Peissel (Bernard); P. Peterlongo (Paolo); J. Peto (Julian); G. Pfeiler (Georg); C. Phelan (Catherine); Presneau, N. (Nadege); P. Radice (Paolo); N. Rahman (Nazneen); S.J. 
Ramus (Susan); M.U. Rashid (Muhammad); G. Rennert (Gad); K. Rhiem (Kerstin); Rudolph, A. (Anja); R. Salani (Ritu); Sangrajrang, S. (Suleeporn); E.J. Sawyer (Elinor); M.K. Schmidt (Marjanka); R.K. Schmutzler (Rita); M. Schoemaker (Minouk); P. Schürmann (Peter); C.M. Seynaeve (Caroline); C.-Y. Shen (Chen-Yang); M. Shrubsole (Martha); X.-O. Shu (Xiao-Ou); A.J. Sigurdson (Alice); C.F. Singer (Christian); S. Slager (Susan); Soucy, P. (Penny); M.C. Southey (Melissa); D. Steinemann (Doris); A.J. Swerdlow (Anthony ); C. Szabo (Csilla); Tchatchou, S. (Sandrine); P.J. Teixeira; S.-H. Teo (Soo-Hwang); M.B. Terry (Mary Beth); D.C. Tessier (Daniel C.); A. Teulé (A.); M. Thomassen (Mads); L. Tihomirova (Laima); M. Tischkowitz (Marc); A.E. Toland (Amanda); N. Tung (Nadine); C. Turnbull (Clare); A.M.W. van den Ouweland (Ans); E.J. van Rensburg (Elizabeth); ven den Berg, D. (David); J. Vijai (Joseph); S. Wang-Gohrke (Shan); J.N. Weitzel (Jeffrey); A.S. Whittemore (Alice); R. Winqvist (Robert); Wong, T.Y. (Tien Y.); A.H. Wu (Anna); Yannoukakos, D. (Drakoulis); J-C. Yu (Jyh-Cherng); P.D.P. Pharoah (Paul); P. Hall (Per); G. Chenevix-Trench (Georgia); A.M. Dunning (Alison); J. Simard (Jacques); F.J. Couch (Fergus); A.C. Antoniou (Antonis C.); D.F. Easton (Douglas F.); W. Zheng (Wei)

    2016-01-01

    Background: Multiple recent genome-wide association studies (GWAS) have identified a single nucleotide polymorphism (SNP), rs10771399, at 12p11 that is associated with breast cancer risk. Method: We performed a fine-scale mapping study of a 700 kb region including 441 genotyped and more

  14. High-density genetic map using whole-genome re-sequencing for fine mapping and candidate gene discovery for disease resistance in peanut

    Science.gov (United States)

    High-density genetic linkage maps are essential for fine mapping QTLs controlling disease resistance traits, such as early leaf spot (ELS), late leaf spot (LLS), and Tomato spotted wilt virus (TSWV). With completion of the genome sequences of two diploid ancestors of cultivated peanut, we could use ...

  15. Understanding the development of human bladder cancer by using a whole-organ genomic mapping strategy.

    Science.gov (United States)

    Majewski, Tadeusz; Lee, Sangkyou; Jeong, Joon; Yoon, Dong-Sup; Kram, Andrzej; Kim, Mi-Sook; Tuziak, Tomasz; Bondaruk, Jolanta; Lee, Sooyong; Park, Weon-Seo; Tang, Kuang S; Chung, Woonbok; Shen, Lanlan; Ahmed, Saira S; Johnston, Dennis A; Grossman, H Barton; Dinney, Colin P; Zhou, Jain-Hua; Harris, R Alan; Snyder, Carrie; Filipek, Slawomir; Narod, Steven A; Watson, Patrice; Lynch, Henry T; Gazdar, Adi; Bar-Eli, Menashe; Wu, Xifeng F; McConkey, David J; Baggerly, Keith; Issa, Jean-Pierre; Benedict, William F; Scherer, Steven E; Czerniak, Bogdan

    2008-07-01

    The search for the genomic sequences involved in human cancers can be greatly facilitated by maps of genomic imbalances identifying the involved chromosomal regions, particularly those that participate in the development of occult preneoplastic conditions that progress to clinically aggressive invasive cancer. The integration of such regions with human genome sequence variation may provide valuable clues about their overall structure and gene content. By extension, such knowledge may help us understand the underlying genetic components involved in the initiation and progression of these cancers. We describe the development of a genome-wide map of human bladder cancer that tracks its progression from in situ precursor conditions to invasive disease. Testing for allelic losses using a genome-wide panel of 787 microsatellite markers was performed on multiple DNA samples, extracted from the entire mucosal surface of the bladder and corresponding to normal urothelium, in situ preneoplastic lesions, and invasive carcinoma. Using this approach, we matched the clonal allelic losses in distinct chromosomal regions to specific phases of bladder neoplasia and produced a detailed genetic map of bladder cancer development. These analyses revealed three major waves of genetic changes associated with growth advantages of successive clones and reflecting a stepwise conversion of normal urothelial cells into cancer cells. The genetic changes map to six regions at 3q22-q24, 5q22-q31, 9q21-q22, 10q26, 13q14, and 17p13, which may represent critical hits driving the development of bladder cancer. Finally, we performed high-resolution mapping using single nucleotide polymorphism markers within one region on chromosome 13q14, containing the model tumor suppressor gene RB1, and defined a minimal deleted region associated with clonal expansion of in situ neoplasia. These analyses provided new insights on the involvement of several non-coding sequences mapping to the region and identified

  16. Genomic consequences of selection and genome-wide association mapping in soybean.

    Science.gov (United States)

    Wen, Zixiang; Boyse, John F; Song, Qijian; Cregan, Perry B; Wang, Dechun

    2015-09-03

    Crop improvement always involves selection of specific alleles at genes controlling traits of agronomic importance, likely resulting in detectable signatures of selection within the genome of modern soybean (Glycine max L. Merr.). The identification of these signatures of selection is meaningful from the perspective of evolutionary biology and for uncovering the genetic architecture of agronomic traits. To this end, two populations of soybean, consisting of 342 landraces and 1062 improved lines, were genotyped with the SoySNP50K Illumina BeadChip containing 52,041 single nucleotide polymorphisms (SNPs), and systematically phenotyped for 9 agronomic traits. A cross-population composite likelihood ratio (XP-CLR) method was used to screen the signals of selective sweeps. A total of 125 candidate selection regions were identified, many of which harbored genes potentially involved in crop improvement. To further investigate whether these candidate regions were in fact enriched for genes affected by selection, genome-wide association studies (GWAS) were conducted on 7 selection traits targeted in soybean breeding (grain yield, plant height, lodging, maturity date, seed coat color, seed protein and oil content) and 2 non-selection traits (pubescence and flower color). Major genomic regions associated with selection traits overlapped with candidate selection regions, whereas no overlap of this kind occurred for the non-selection traits, suggesting that the selection sweeps identified are associated with traits of agronomic importance. Multiple novel loci and refined map locations of known loci related to these traits were also identified. These findings illustrate that comparative genomic analyses, especially when combined with GWAS, are a promising approach to dissect the genetic architecture of complex traits.

  17. Comparison of Burrows-Wheeler transform-based mapping algorithms used in high-throughput whole-genome sequencing: application to Illumina data for livestock genomes

    Science.gov (United States)

    Ongoing developments and cost decreases in next-generation sequencing (NGS) technologies have led to an increase in their application, which has greatly enhanced the fields of genetics and genomics. Mapping sequence reads onto a reference genome is a fundamental step in the analysis of NGS data. Eff...

  18. Toward a physical map of the genome of the nematode Caenorhabditis elegans

    International Nuclear Information System (INIS)

    Coulson, A.; Sulston, J.; Brenner, S.; Karn, J.

    1986-01-01

    A technique for digital characterization and comparison of DNA fragments, using restriction enzymes, is described. The technique is being applied to fragments from the nematode Caenorhabditis elegans (i) to facilitate cross-indexing of clones emanating from different laboratories and (ii) to construct a physical map of the genome. Eight hundred sixty clusters of clones, from 35 to 350 kilobases long and totaling about 60% of the genome, have been characterized

  19. Inter-simple sequence repeat (ISSR) loci mapping in the genome of perennial ryegrass

    DEFF Research Database (Denmark)

    Pivorienė, O; Pašakinskienė, I; Brazauskas, G

    2008-01-01

    The aim of this study was to identify and characterize new ISSR markers and their loci in the genome of perennial ryegrass. A subsample of the VrnA F2 mapping family of perennial ryegrass comprising 92 individuals was used to develop a linkage map including inter-simple sequence repeat markers...... demonstrated a 70% similarity to the Hordeum vulgare germin gene GerA. Inter-SSR mapping will provide useful information for gene targeting, quantitative trait loci mapping and marker-assisted selection in perennial ryegrass....

  20. Ensembl Genomes: an integrative resource for genome-scale data from non-vertebrate species.

    Science.gov (United States)

    Kersey, Paul J; Staines, Daniel M; Lawson, Daniel; Kulesha, Eugene; Derwent, Paul; Humphrey, Jay C; Hughes, Daniel S T; Keenan, Stephan; Kerhornou, Arnaud; Koscielny, Gautier; Langridge, Nicholas; McDowall, Mark D; Megy, Karine; Maheswari, Uma; Nuhn, Michael; Paulini, Michael; Pedro, Helder; Toneva, Iliana; Wilson, Derek; Yates, Andrew; Birney, Ewan

    2012-01-01

    Ensembl Genomes (http://www.ensemblgenomes.org) is an integrative resource for genome-scale data from non-vertebrate species. The project exploits and extends technology (for genome annotation, analysis and dissemination) developed in the context of the (vertebrate-focused) Ensembl project and provides a complementary set of resources for non-vertebrate species through a consistent set of programmatic and interactive interfaces. These provide access to data including reference sequence, gene models, transcriptional data, polymorphisms and comparative analysis. Since its launch in 2009, Ensembl Genomes has undergone rapid expansion, with the goal of providing coverage of all major experimental organisms, and additionally including taxonomic reference points to provide the evolutionary context in which genes can be understood. Against the backdrop of a continuing increase in genome sequencing activities in all parts of the tree of life, we seek to work, wherever possible, with the communities actively generating and using data, and are participants in a growing range of collaborations involved in the annotation and analysis of genomes.

  1. Mapping of Micro-Tom BAC-End Sequences to the Reference Tomato Genome Reveals Possible Genome Rearrangements and Polymorphisms

    Science.gov (United States)

    Asamizu, Erika; Shirasawa, Kenta; Hirakawa, Hideki; Sato, Shusei; Tabata, Satoshi; Yano, Kentaro; Ariizumi, Tohru; Shibata, Daisuke; Ezura, Hiroshi

    2012-01-01

    A total of 93,682 BAC-end sequences (BESs) were generated from a dwarf model tomato, cv. Micro-Tom. After removing repetitive sequences, the BESs were similarity searched against the reference tomato genome of a standard cultivar, “Heinz 1706.” By referring to the “Heinz 1706” physical map and by eliminating redundant or nonsignificant hits, 28,804 “unique pair ends” and 8,263 “unique ends” were selected to construct hypothetical BAC contigs. The total physical length of the BAC contigs was 495,833,423 bp, covering 65.3% of the entire genome. The average coverage of euchromatin and heterochromatin was 58.9% and 67.3%, respectively. From this analysis, two possible genome rearrangements were identified: one in chromosome 2 (inversion) and the other in chromosome 3 (inversion and translocation). Polymorphisms (SNPs and Indels) between the two cultivars were identified from the BLAST alignments. As a result, 171,792 polymorphisms were mapped on 12 chromosomes. Among these, 30,930 polymorphisms were found in euchromatin (1 per 3,565 bp) and 140,862 were found in heterochromatin (1 per 2,737 bp). The average polymorphism density in the genome was 1 polymorphism per 2,886 bp. To facilitate the use of these data in Micro-Tom research, the BAC contig and polymorphism information are available in the TOMATOMICS database. PMID:23227037
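
    The per-compartment densities quoted in this abstract can be cross-checked against the genome-wide figure with a little arithmetic. The sketch below is illustrative only: it back-calculates the surveyed length of each compartment from the reported counts and densities (an assumption, since the abstract does not state those lengths) and recomputes the pooled density. The function and variable names are hypothetical.

```python
# Cross-check of the polymorphism densities quoted in the abstract.
# Assumption: surveyed length of a compartment ~= count * (bp per polymorphism).

def pooled_density(compartments):
    """Return bp per polymorphism pooled over (count, bp_per_polymorphism) pairs."""
    total_count = sum(count for count, _ in compartments)
    total_bp = sum(count * spacing for count, spacing in compartments)
    return total_bp / total_count

# (count, bp per polymorphism), both taken from the abstract
euchromatin = (30_930, 3_565)
heterochromatin = (140_862, 2_737)

total = euchromatin[0] + heterochromatin[0]
density = pooled_density([euchromatin, heterochromatin])
print(total, round(density))  # prints: 171792 2886
```

    Under that assumption the recomputed values agree with the 171,792 polymorphisms and the density of 1 per 2,886 bp reported above.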

  2. BAC-HAPPY mapping (BAP mapping): a new and efficient protocol for physical mapping.

    Directory of Open Access Journals (Sweden)

    Giang T H Vu

    2010-02-01

    Physical and linkage mapping underpin efforts to sequence and characterize the genomes of eukaryotic organisms by providing a skeleton framework for whole genome assembly. Hitherto, linkage and physical "contig" maps were generated independently prior to merging. Here, we develop a new and easy method, BAC HAPPY MAPPING (BAP mapping), that utilizes BAC library pools as a HAPPY mapping panel together with an Mbp-sized DNA panel to integrate the linkage and physical mapping efforts into one pipeline. Using Arabidopsis thaliana as an exemplar, a set of 40 Sequence Tagged Site (STS) markers spanning approximately 10% of chromosome 4 were simultaneously assembled onto a BAP map compiled using both a series of BAC pools each comprising 0.7x genome coverage and dilute (0.7x genome) samples of sheared genomic DNA. The resultant BAP map overcomes the need for polymorphic loci to separate genetic loci by recombination and allows physical mapping in segments of suppressed recombination that are difficult to analyze using traditional mapping techniques. Even virtual "BAC-HAPPY-mapping" to convert BAC landing data into BAC linkage contigs is possible.

  3. Genome-wide DNA methylation maps in follicular lymphoma cells determined by methylation-enriched bisulfite sequencing.

    Directory of Open Access Journals (Sweden)

    Jeong-Hyeon Choi

    BACKGROUND: Follicular lymphoma (FL) is a form of non-Hodgkin's lymphoma (NHL) that arises from germinal center (GC) B-cells. Despite the significant advances in immunotherapy, FL is still not curable. Beyond transcriptional profiling and genomics datasets, there currently is no epigenome-scale dataset or integrative biology approach that can adequately model this disease and therefore identify novel mechanisms and targets for successful prevention and treatment of FL. METHODOLOGY/PRINCIPAL FINDINGS: We performed methylation-enriched genome-wide bisulfite sequencing of FL cells and normal CD19(+) B-cells using 454 sequencing technology. The methylated DNA fragments were enriched with methyl-binding proteins, treated with bisulfite, and sequenced using the Roche-454 GS FLX sequencer. The total number of bases covered in the human genome was 18.2 and 49.3 million including 726,003 and 1.3 million CpGs in FL and CD19(+) B-cells, respectively. 11,971 and 7,882 methylated regions of interest (MRIs) were identified, respectively. The genome-wide distribution of these MRIs displayed significant differences between FL and normal B-cells. A reverse trend in the distribution of MRIs between the promoter and the gene body was observed in FL and CD19(+) B-cells. The MRIs identified in FL cells also correlated well with transcriptomic data and ChIP-on-Chip analyses of genome-wide histone modifications such as tri-methyl-H3K27 and tri-methyl-H3K4, indicating a concerted epigenetic alteration in FL cells. CONCLUSIONS/SIGNIFICANCE: This study is the first to provide a large scale and comprehensive analysis of the DNA methylation sequence composition and distribution in the FL epigenome. These integrated approaches have led to the discovery of novel and frequent targets of aberrant epigenetic alterations. The genome-wide bisulfite sequencing approach developed here can be a useful tool for profiling DNA methylation in clinical samples.

  4. Dissecting genomic hotspots underlying seed protein, oil, and sucrose content in an interspecific mapping population of soybean using high-density linkage mapping.

    Science.gov (United States)

    Patil, Gunvant; Vuong, Tri D; Kale, Sandip; Valliyodan, Babu; Deshmukh, Rupesh; Zhu, Chengsong; Wu, Xiaolei; Bai, Yonghe; Yungbluth, Dennis; Lu, Fang; Kumpatla, Siva; Grover Shannon, J; Varshney, Rajeev K; Nguyen, Henry T

    2018-04-04

    The cultivated [Glycine max (L) Merr.] and wild [Glycine soja Siebold & Zucc.] soybean species comprise wide variation in seed composition traits. Compared to wild soybean, cultivated soybean contains low protein, high oil and high sucrose. In this study, an inter-specific population was derived from a cross between G. max (Williams 82) and G. soja (PI 483460B). This recombinant inbred line (RIL) population of 188 lines was sequenced at 0.3x depth. Based on 91,342 single nucleotide polymorphisms (SNPs), recombination events in RILs were defined, and a high-resolution bin map was developed (4,070 bins). In addition to bin mapping, QTL analysis for protein, oil and sucrose was performed using 3,343 polymorphic SNPs (3K-SNP), derived from Illumina Infinium BeadChip sequencing platform. The QTL regions from both platforms were compared and a significant concordance was observed between bin and 3K-SNP markers. Importantly, the bin map derived from next generation sequencing technology enhanced mapping resolution (from 1325 Kb to 50 Kb). A total of 5, 9 and 4 QTLs were identified for protein, oil and sucrose content, respectively and some of the QTLs coincided with soybean domestication related genomic loci. The major QTL for protein and oil was mapped on Chr. 20 (qPro_20) and suggested negative correlation between oil and protein. In terms of sucrose content, a novel and major QTL was identified on Chr. 8 (qSuc_08) and harbors putative genes involved in sugar transport. In addition, genome-wide association (GWAS) using 91,342 SNPs confirmed the genomic loci derived from QTL mapping. A QTL based haplotype using whole genome resequencing of 106 diverse soybean lines identified unique allelic variation in wild soybean that could be utilized to widen the genetic base in cultivated soybean. This article is protected by copyright. All rights reserved.

  5. Environmental versatility promotes modularity in genome-scale metabolic networks.

    Science.gov (United States)

    Samal, Areejit; Wagner, Andreas; Martin, Olivier C

    2011-08-24

    The ubiquity of modules in biological networks may result from an evolutionary benefit of a modular organization. For instance, modularity may increase the rate of adaptive evolution, because modules can be easily combined into new arrangements that may benefit their carrier. Conversely, modularity may emerge as a by-product of some trait. We here ask whether this last scenario may play a role in genome-scale metabolic networks that need to sustain life in one or more chemical environments. For such networks, we define a network module as a maximal set of reactions that are fully coupled, i.e., whose fluxes can only vary in fixed proportions. This definition overcomes limitations of purely graph based analyses of metabolism by exploiting the functional links between reactions. We call a metabolic network viable in a given chemical environment if it can synthesize all of an organism's biomass compounds from nutrients in this environment. An organism's metabolism is highly versatile if it can sustain life in many different chemical environments. We here ask whether versatility affects the modularity of metabolic networks. Using recently developed techniques to randomly sample large numbers of viable metabolic networks from a vast space of metabolic networks, we use flux balance analysis to study in silico metabolic networks that differ in their versatility. We find that highly versatile networks are also highly modular. They contain more modules and more reactions that are organized into modules. Most or all reactions in a module are associated with the same biochemical pathways. Modules that arise in highly versatile networks generally involve reactions that process nutrients or closely related chemicals. We also observe that the metabolism of E. coli is significantly more modular than even our most versatile networks. Our work shows that modularity in metabolic networks can be a by-product of functional constraints, e.g., the need to sustain life in multiple

  6. Environmental versatility promotes modularity in genome-scale metabolic networks

    Directory of Open Access Journals (Sweden)

    Wagner Andreas

    2011-08-01

    Background: The ubiquity of modules in biological networks may result from an evolutionary benefit of a modular organization. For instance, modularity may increase the rate of adaptive evolution, because modules can be easily combined into new arrangements that may benefit their carrier. Conversely, modularity may emerge as a by-product of some trait. We here ask whether this last scenario may play a role in genome-scale metabolic networks that need to sustain life in one or more chemical environments. For such networks, we define a network module as a maximal set of reactions that are fully coupled, i.e., whose fluxes can only vary in fixed proportions. This definition overcomes limitations of purely graph based analyses of metabolism by exploiting the functional links between reactions. We call a metabolic network viable in a given chemical environment if it can synthesize all of an organism's biomass compounds from nutrients in this environment. An organism's metabolism is highly versatile if it can sustain life in many different chemical environments. We here ask whether versatility affects the modularity of metabolic networks. Results: Using recently developed techniques to randomly sample large numbers of viable metabolic networks from a vast space of metabolic networks, we use flux balance analysis to study in silico metabolic networks that differ in their versatility. We find that highly versatile networks are also highly modular. They contain more modules and more reactions that are organized into modules. Most or all reactions in a module are associated with the same biochemical pathways. Modules that arise in highly versatile networks generally involve reactions that process nutrients or closely related chemicals. We also observe that the metabolism of E. coli is significantly more modular than even our most versatile networks. Conclusions: Our work shows that modularity in metabolic networks can be a by-product of functional

  7. GLIDERS - A web-based search engine for genome-wide linkage disequilibrium between HapMap SNPs

    Directory of Open Access Journals (Sweden)

    Broxholme John

    2009-10-01

    Background: A number of tools for the examination of linkage disequilibrium (LD) patterns between nearby alleles exist, but none are available for quickly and easily investigating LD at longer ranges (>500 kb). We have developed a web-based query tool (GLIDERS: Genome-wide LInkage DisEquilibrium Repository and Search engine) that enables the retrieval of pairwise associations with r² ≥ 0.3 across the human genome for any SNP genotyped within HapMap phase 2 and 3, regardless of distance between the markers. Description: GLIDERS is an easy to use web tool that only requires the user to enter rs numbers of SNPs they want to retrieve genome-wide LD for (both nearby and long-range). The intuitive web interface handles manual entry of SNP IDs and also allows users to upload files of SNP IDs. The user can limit the resulting inter-SNP associations with easy to use menu options. These include MAF limit (5-45%), distance limits between SNPs (minimum and maximum), r² (0.3 to 1), HapMap population sample (CEU, YRI and JPT+CHB combined) and HapMap build/release. All resulting genome-wide inter-SNP associations are displayed on a single output page, which has a link to a downloadable tab delimited text file. Conclusion: GLIDERS is a quick and easy way to retrieve genome-wide inter-SNP associations and to explore LD patterns for any number of SNPs of interest. GLIDERS can be useful in identifying SNPs with long-range LD. This can highlight mis-mapping or other potential association signal localisation problems.
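
    For readers unfamiliar with the r² statistic that this tool thresholds at 0.3, the following is a minimal sketch of how pairwise r² between two biallelic SNPs is conventionally computed from phased haplotypes, followed by a threshold filter. It does not reproduce GLIDERS itself, whose implementation the abstract does not describe; the function names, the toy haplotypes and the SNP labels are hypothetical.

```python
# Pairwise LD (r^2) between two biallelic SNPs from phased haplotypes.
# Each haplotype is a tuple (allele_at_snp_a, allele_at_snp_b), coded 0 or 1.

def r_squared(haplotypes):
    """Return r^2 = D^2 / (pA * (1 - pA) * pB * (1 - pB)) for a list of haplotypes."""
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n          # frequency of allele 1 at SNP A
    p_b = sum(b for _, b in haplotypes) / n          # frequency of allele 1 at SNP B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                             # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return 0.0                                   # a monomorphic SNP carries no LD signal
    return d * d / denom

def filter_pairs(pairs, threshold=0.3):
    """Keep (snp_a, snp_b, r2) tuples whose r2 meets the threshold (0.3 in the abstract)."""
    return [pair for pair in pairs if pair[2] >= threshold]

# Toy data (hypothetical): one perfectly linked pair and one independent pair.
linked = [(0, 0), (0, 0), (1, 1), (1, 1)]
independent = [(0, 0), (0, 1), (1, 0), (1, 1)]

pairs = [
    ("snpA", "snpB", r_squared(linked)),
    ("snpA", "snpC", r_squared(independent)),
]
kept = filter_pairs(pairs)
print(kept)
```

    In this toy input only the perfectly linked pair survives the filter; a real query would additionally apply the MAF, distance and population options listed in the abstract.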

  8. Construction of an integrated genetic linkage map for the A genome of Brassica napus using SSR markers derived from sequenced BACs in B. rapa

    Directory of Open Access Journals (Sweden)

    King Graham J

    2010-10-01

    Background: The Multinational Brassica rapa Genome Sequencing Project (BrGSP) has developed valuable genomic resources, including BAC libraries, BAC-end sequences, genetic and physical maps, and seed BAC sequences for Brassica rapa. An integrated linkage map between the amphidiploid B. napus and diploid B. rapa will facilitate the rapid transfer of these valuable resources from B. rapa to B. napus (oilseed rape, canola). Results: In this study, we identified over 23,000 simple sequence repeats (SSRs) from 536 sequenced BACs. 890 SSR markers (designated as BrGMS) were developed and used for the construction of an integrated linkage map for the A genome in B. rapa and B. napus. Two hundred and nineteen BrGMS markers were integrated to an existing B. napus linkage map (BnaNZDH). Among these mapped BrGMS markers, 168 were only distributed on the A genome linkage groups (LGs), 18 distributed both on the A and C genome LGs, and 33 only distributed on the C genome LGs. Most of the A genome LGs in B. napus were collinear with the homoeologous LGs in B. rapa, although minor inversions or rearrangements occurred on A2 and A9. The mapping of these BAC-specific SSR markers enabled assignment of 161 sequenced B. rapa BACs, as well as the associated BAC contigs to the A genome LGs of B. napus. Conclusion: The genetic mapping of SSR markers derived from sequenced BACs in B. rapa enabled direct links to be established between the B. napus linkage map and a B. rapa physical map, and thus the assignment of B. rapa BACs and the associated BAC contigs to the B. napus linkage map. This integrated genetic linkage map will facilitate exploitation of the B. rapa annotated genomic resources for gene tagging and map-based cloning in B. napus, and for comparative analysis of the A genome within Brassica species.

  9. Genome-wide mapping reveals single-origin chromosome replication in Leishmania, a eukaryotic microbe.

    Science.gov (United States)

    Marques, Catarina A; Dickens, Nicholas J; Paape, Daniel; Campbell, Samantha J; McCulloch, Richard

    2015-10-19

    DNA replication initiates on defined genome sites, termed origins. Origin usage appears to follow common rules in the eukaryotic organisms examined to date: all chromosomes are replicated from multiple origins, which display variations in firing efficiency and are selected from a larger pool of potential origins. To ask if these features of DNA replication are true of all eukaryotes, we describe genome-wide origin mapping in the parasite Leishmania. Origin mapping in Leishmania suggests a striking divergence in origin usage relative to characterized eukaryotes, since each chromosome appears to be replicated from a single origin. By comparing two species of Leishmania, we find evidence that such origin singularity is maintained in the face of chromosome fusion or fission events during evolution. Mapping Leishmania origins suggests that all origins fire with equal efficiency, and that the genomic sites occupied by origins differ from related non-origins sites. Finally, we provide evidence that origin location in Leishmania displays striking conservation with Trypanosoma brucei, despite the latter parasite replicating its chromosomes from multiple, variable strength origins. The demonstration of chromosome replication for a single origin in Leishmania, a microbial eukaryote, has implications for the evolution of origin multiplicity and associated controls, and may explain the pervasive aneuploidy that characterizes Leishmania chromosome architecture.

  10. Genome-scale metabolic representation of Amycolatopsis balhimycina

    DEFF Research Database (Denmark)

    Vongsangnak, Wanwipa; Figueiredo, L. F.; Förster, Jochen

    2012-01-01

    Infection caused by methicillin‐resistant Staphylococcus aureus (MRSA) is an increasing societal problem. Typically, glycopeptide antibiotics are used in the treatment of these infections. The most comprehensively studied glycopeptide antibiotic biosynthetic pathway is that of balhimycin...... to reconstruct a genome‐scale metabolic model for the organism. Here we generated an almost complete A. balhimycina genome sequence comprising 10,562,587 base pairs assembled into 2,153 contigs. The high GC‐genome (∼69%) includes 8,585 open reading frames (ORFs). We used our integrative toolbox called SEQTOR...

  11. A scored human protein-protein interaction network to catalyze genomic interpretation

    DEFF Research Database (Denmark)

    Li, Taibo; Wernersson, Rasmus; Hansen, Rasmus B

    2017-01-01

    Genome-scale human protein-protein interaction networks are critical to understanding cell biology and interpreting genomic data, but challenging to produce experimentally. Through data integration and quality control, we provide a scored human protein-protein interaction network (InWeb_InBioMap,...

  12. Genome Sequence of the Plant Growth Promoting Endophytic Bacterium Enterobacter sp. 638

    Science.gov (United States)

    Taghavi, Safiyh; van der Lelie, Daniel; Hoffman, Adam; Zhang, Yian-Biao; Walla, Michael D.; Vangronsveld, Jaco; Newman, Lee; Monchy, Sébastien

    2010-01-01

    Enterobacter sp. 638 is an endophytic plant growth promoting gamma-proteobacterium that was isolated from the stem of poplar (Populus trichocarpa×deltoides cv. H11-11), a potentially important biofuel feed stock plant. The Enterobacter sp. 638 genome sequence reveals the presence of a 4,518,712 bp chromosome and a 157,749 bp plasmid (pENT638-1). Genome annotation and comparative genomics allowed the identification of an extended set of genes specific to the plant niche adaptation of this bacterium. This includes genes that code for putative proteins involved in survival in the rhizosphere (to cope with oxidative stress or uptake of nutrients released by plant roots), root adhesion (pili, adhesion, hemagglutinin, cellulose biosynthesis), colonization/establishment inside the plant (chemotaxis, flagella, cellobiose phosphorylase), plant protection against fungal and bacterial infections (siderophore production and synthesis of the antimicrobial compounds 4-hydroxybenzoate and 2-phenylethanol), and improved poplar growth and development through the production of the phytohormones indole acetic acid, acetoin, and 2,3-butanediol. Metabolite analysis confirmed by quantitative RT–PCR showed that the production of acetoin and 2,3-butanediol is induced by the presence of sucrose in the growth medium. Interestingly, both the genetic determinants required for sucrose metabolism and the synthesis of acetoin and 2,3-butanediol are clustered on a genomic island. These findings point to a close interaction between Enterobacter sp. 638 and its poplar host, where the availability of sucrose, a major plant sugar, affects the synthesis of plant growth promoting phytohormones by the endophytic bacterium. The availability of the genome sequence, combined with metabolome and transcriptome analysis, will provide a better understanding of the synergistic interactions between poplar and its growth promoting endophyte Enterobacter sp. 638. This information can be further exploited to

  13. Revealing less derived nature of cartilaginous fish genomes with their evolutionary time scale inferred with nuclear genes.

    Directory of Open Access Journals (Sweden)

    Adina J Renz

    Cartilaginous fishes, divided into Holocephali (chimaeras) and Elasmobranchii (sharks, rays and skates), occupy a key phylogenetic position among extant vertebrates in reconstructing their evolutionary processes. Their accurate evolutionary time scale is indispensable for better understanding of the relationship between phenotypic and molecular evolution of cartilaginous fishes. However, our current knowledge on the time scale of cartilaginous fish evolution largely relies on estimates using mitochondrial DNA sequences. In this study, making the best use of the still partial, but large-scale sequencing data of cartilaginous fish species, we estimate the divergence times between the major cartilaginous fish lineages employing nuclear genes. By rigorous orthology assessment based on available genomic and transcriptomic sequence resources for cartilaginous fishes, we selected 20 protein-coding genes in the nuclear genome, spanning 2973 amino acid residues. Our analysis based on the Bayesian inference resulted in the mean divergence time of 421 Ma, the late Silurian, for the Holocephali-Elasmobranchii split, and 306 Ma, the late Carboniferous, for the split between sharks and rays/skates. By applying these results and other documented divergence times, we measured the relative evolutionary rate of the Hox A cluster sequences in the cartilaginous fish lineages, which resulted in a lower substitution rate with a factor of at least 2.4 in comparison to tetrapod lineages. The obtained time scale enables mapping phenotypic and molecular changes in a quantitative framework. It is of great interest to corroborate the less derived nature of cartilaginous fish at the molecular level as a genome-wide phenomenon.

  14. Genetic recombination is targeted towards gene promoter regions in dogs.

    Science.gov (United States)

    Auton, Adam; Rui Li, Ying; Kidd, Jeffrey; Oliveira, Kyle; Nadel, Julie; Holloway, J Kim; Hayward, Jessica J; Cohen, Paula E; Greally, John M; Wang, Jun; Bustamante, Carlos D; Boyko, Adam R

    2013-01-01

    The identification of the H3K4 trimethylase, PRDM9, as the gene responsible for recombination hotspot localization has provided considerable insight into the mechanisms by which recombination is initiated in mammals. However, uniquely amongst mammals, canids appear to lack a functional version of PRDM9 and may therefore provide a model for understanding recombination that occurs in the absence of PRDM9, and thus how PRDM9 functions to shape the recombination landscape. We have constructed a fine-scale genetic map from patterns of linkage disequilibrium assessed using high-throughput sequence data from 51 free-ranging dogs, Canis lupus familiaris. While broad-scale properties of recombination appear similar to other mammalian species, our fine-scale estimates indicate that highly elevated canine recombination rates are observed in the vicinity of CpG-rich regions, including gene promoter regions, but show little association with H3K4 trimethylation marks identified in spermatocytes. By comparison to genomic data from the Andean fox, Lycalopex culpaeus, we show that biased gene conversion is a plausible mechanism by which the high CpG content of the dog genome could have occurred.

  15. Discovery and Fine-Mapping of Glycaemic and Obesity-Related Trait Loci Using High-Density Imputation.

    Science.gov (United States)

    Horikoshi, Momoko; Mägi, Reedik; van de Bunt, Martijn; Surakka, Ida; Sarin, Antti-Pekka; Mahajan, Anubha; Marullo, Letizia; Thorleifsson, Gudmar; Hägg, Sara; Hottenga, Jouke-Jan; Ladenvall, Claes; Ried, Janina S; Winkler, Thomas W; Willems, Sara M; Pervjakova, Natalia; Esko, Tõnu; Beekman, Marian; Nelson, Christopher P; Willenborg, Christina; Wiltshire, Steven; Ferreira, Teresa; Fernandez, Juan; Gaulton, Kyle J; Steinthorsdottir, Valgerdur; Hamsten, Anders; Magnusson, Patrik K E; Willemsen, Gonneke; Milaneschi, Yuri; Robertson, Neil R; Groves, Christopher J; Bennett, Amanda J; Lehtimäki, Terho; Viikari, Jorma S; Rung, Johan; Lyssenko, Valeriya; Perola, Markus; Heid, Iris M; Herder, Christian; Grallert, Harald; Müller-Nurasyid, Martina; Roden, Michael; Hypponen, Elina; Isaacs, Aaron; van Leeuwen, Elisabeth M; Karssen, Lennart C; Mihailov, Evelin; Houwing-Duistermaat, Jeanine J; de Craen, Anton J M; Deelen, Joris; Havulinna, Aki S; Blades, Matthew; Hengstenberg, Christian; Erdmann, Jeanette; Schunkert, Heribert; Kaprio, Jaakko; Tobin, Martin D; Samani, Nilesh J; Lind, Lars; Salomaa, Veikko; Lindgren, Cecilia M; Slagboom, P Eline; Metspalu, Andres; van Duijn, Cornelia M; Eriksson, Johan G; Peters, Annette; Gieger, Christian; Jula, Antti; Groop, Leif; Raitakari, Olli T; Power, Chris; Penninx, Brenda W J H; de Geus, Eco; Smit, Johannes H; Boomsma, Dorret I; Pedersen, Nancy L; Ingelsson, Erik; Thorsteinsdottir, Unnur; Stefansson, Kari; Ripatti, Samuli; Prokopenko, Inga; McCarthy, Mark I; Morris, Andrew P

    2015-07-01

    Reference panels from the 1000 Genomes (1000G) Project Consortium provide near complete coverage of common and low-frequency genetic variation with minor allele frequency ≥0.5% across European ancestry populations. Within the European Network for Genetic and Genomic Epidemiology (ENGAGE) Consortium, we have undertaken the first large-scale meta-analysis of genome-wide association studies (GWAS), supplemented by 1000G imputation, for four quantitative glycaemic and obesity-related traits, in up to 87,048 individuals of European ancestry. We identified two loci for body mass index (BMI) at genome-wide significance, and two for fasting glucose (FG), none of which has been previously reported in larger meta-analysis efforts to combine GWAS of European ancestry. Through conditional analysis, we also detected multiple distinct signals of association mapping to established loci for waist-hip ratio adjusted for BMI (RSPO3) and FG (GCK and G6PC2). The index variant for one association signal at the G6PC2 locus is a low-frequency coding allele, H177Y, which has recently been demonstrated to have a functional role in glucose regulation. Fine-mapping analyses revealed that the non-coding variants most likely to drive association signals at established and novel loci were enriched for overlap with enhancer elements, which for FG mapped to promoter and transcription factor binding sites in pancreatic islets, in particular. Our study demonstrates that 1000G imputation and genetic fine-mapping of common and low-frequency variant association signals at GWAS loci, integrated with genomic annotation in relevant tissues, can provide insight into the functional and regulatory mechanisms through which their effects on glycaemic and obesity-related traits are mediated.

  16. Toward allotetraploid cotton genome assembly: integration of a high-density molecular genetic linkage map with DNA sequence information

    Science.gov (United States)

    2012-01-01

    Background: Cotton is the world’s most important natural textile fiber and a significant oilseed crop. Decoding cotton genomes will provide the ultimate reference and resource for research and utilization of the species. Integration of high-density genetic maps with genomic sequence information will largely accelerate the process of whole-genome assembly in cotton. Results: In this paper, we update a high-density interspecific genetic linkage map of allotetraploid cultivated cotton. An additional 1,167 marker loci have been added to our previously published map of 2,247 loci. Three new marker types, InDel (insertion-deletion) and SNP (single nucleotide polymorphism) developed from gene information, and REMAP (retrotransposon-microsatellite amplified polymorphism), were used to increase map density. The updated map consists of 3,414 loci in 26 linkage groups covering 3,667.62 cM with an average inter-locus distance of 1.08 cM. Furthermore, genome-wide sequence analysis was finished using 3,324 informative sequence-based markers and publicly-available Gossypium DNA sequence information. A total of 413,113 EST and 195 BAC sequences were physically anchored and clustered by 3,324 sequence-based markers. Of these, 14,243 ESTs and 188 BACs from different species of Gossypium were clustered and specifically anchored to the high-density genetic map. A total of 2,748 candidate unigenes from 2,111 ESTs clusters and 63 BACs were mined for functional annotation and classification. The 337 ESTs/genes related to fiber quality traits were integrated with 132 previously reported cotton fiber quality quantitative trait loci, which demonstrated the important roles in fiber quality of these genes. Higher-level sequence conservation between different cotton species and between the A- and D-subgenomes in tetraploid cotton was found, indicating a common evolutionary origin for orthologous and paralogous loci in Gossypium. Conclusion: This study will serve as a valuable genomic resource
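
    The map statistics quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes (the abstract does not say so) that the average inter-locus distance is taken over the intervals between adjacent loci, so each linkage group contributes one interval fewer than its number of loci. The variable names are hypothetical.

```python
# Figures quoted in the abstract above.
previous_loci = 2247
added_loci = 1167
linkage_groups = 26
map_length_cm = 3667.62

# Updated locus count: previously published loci plus the newly added ones.
total_loci = previous_loci + added_loci

# Assumption (not stated in the abstract): the mean distance is computed over
# intervals between adjacent loci, i.e. (loci - 1) intervals per linkage group.
intervals = total_loci - linkage_groups
mean_interval_cm = map_length_cm / intervals

print(total_loci)
print(round(mean_interval_cm, 2))
```

    Under that assumption the result agrees with the reported 3,414 loci and 1.08 cM; dividing by the number of loci instead of the number of intervals gives a slightly smaller value.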

  17. The complete genome sequence of the plant growth-promoting bacterium Pseudomonas sp. UW4.

    Directory of Open Access Journals (Sweden)

    Jin Duan

    The plant growth-promoting bacterium (PGPB) Pseudomonas sp. UW4, previously isolated from the rhizosphere of common reeds growing on the campus of the University of Waterloo, promotes plant growth in the presence of different environmental stresses, such as flooding, high concentrations of salt, cold, heavy metals, drought and phytopathogens. In this work, the genome sequence of UW4 was obtained by pyrosequencing and the gaps between the contigs were closed by directed PCR. The P. sp. UW4 genome contains a single circular chromosome that is 6,183,388 bp with a 60.05% G+C content. The bacterial genome contains 5,423 predicted protein-coding sequences that occupy 87.2% of the genome. Nineteen genomic islands (GIs) were predicted and thirty one complete putative insertion sequences were identified. Genes potentially involved in plant growth promotion such as indole-3-acetic acid (IAA) biosynthesis, trehalose production, siderophore production, acetoin synthesis, and phosphate solubilization were determined. Moreover, genes that contribute to the environmental fitness of UW4 were also observed including genes responsible for heavy metal resistance such as nickel, copper, cadmium, zinc, molybdate, cobalt, arsenate, and chromate. Whole-genome comparison with other completely sequenced Pseudomonas strains and phylogeny of four concatenated "housekeeping" genes (16S rRNA, gyrB, rpoB and rpoD) of 128 Pseudomonas strains revealed that UW4 belongs to the fluorescens group, jessenii subgroup.

  18. The Complete Genome Sequence of the Plant Growth-Promoting Bacterium Pseudomonas sp. UW4

    Science.gov (United States)

    Duan, Jin; Jiang, Wei; Cheng, Zhenyu; Heikkila, John J.; Glick, Bernard R.

    2013-01-01

    The plant growth-promoting bacterium (PGPB) Pseudomonas sp. UW4, previously isolated from the rhizosphere of common reeds growing on the campus of the University of Waterloo, promotes plant growth in the presence of different environmental stresses, such as flooding, high concentrations of salt, cold, heavy metals, drought and phytopathogens. In this work, the genome sequence of UW4 was obtained by pyrosequencing and the gaps between the contigs were closed by directed PCR. The P. sp. UW4 genome contains a single circular chromosome that is 6,183,388 bp with a 60.05% G+C content. The bacterial genome contains 5,423 predicted protein-coding sequences that occupy 87.2% of the genome. Nineteen genomic islands (GIs) were predicted and thirty one complete putative insertion sequences were identified. Genes potentially involved in plant growth promotion such as indole-3-acetic acid (IAA) biosynthesis, trehalose production, siderophore production, acetoin synthesis, and phosphate solubilization were determined. Moreover, genes that contribute to the environmental fitness of UW4 were also observed including genes responsible for heavy metal resistance such as nickel, copper, cadmium, zinc, molybdate, cobalt, arsenate, and chromate. Whole-genome comparison with other completely sequenced Pseudomonas strains and phylogeny of four concatenated “housekeeping” genes (16S rRNA, gyrB, rpoB and rpoD) of 128 Pseudomonas strains revealed that UW4 belongs to the fluorescens group, jessenii subgroup. PMID:23516524

  19. Mapping Determinants of Gene Expression Plasticity by Genetical Genomics in C. elegans

    NARCIS (Netherlands)

    Li, Y.; Alda Alvarez, O.; Gutteling, E.W.; Tijsterman, M.; Fu, J.; Riksen, J.A.G.; Hazendonk, E.; Prins, J.C.P.; Plasterk, R.H.A.; Jansen, R.C.; Breitling, R.; Kammenga, J.E.

    2006-01-01

    Recent genetical genomics studies have provided intimate views on gene regulatory networks. Gene expression variations between genetically different individuals have been mapped to the causal regulatory regions, termed expression quantitative trait loci. Whether the environment-induced plastic

  20. Mapping determinants of gene expression plasticity by genetical genomics in C. elegans.

    NARCIS (Netherlands)

    Li, Y.; Alvarez, O.A.; Gutteling, E.W.; Tijsterman, M.; Fu, J.; Riksen, J.A.; Hazendonk, M.G.A.; Prins, P.; Plasterk, R.H.A.; Jansen, R.C.; Breitling, R.; Kammenga, J.E.

    2006-01-01

    Recent genetical genomics studies have provided intimate views on gene regulatory networks. Gene expression variations between genetically different individuals have been mapped to the causal regulatory regions, termed expression quantitative trait loci. Whether the environment-induced plastic

  1. Mapping the sensory perception of apple using descriptive sensory evaluation in a genome wide association study.

    Science.gov (United States)

    Amyotte, Beatrice; Bowen, Amy J; Banks, Travis; Rajcan, Istvan; Somers, Daryl J

    2017-01-01

    Breeding apples is a long-term endeavour and it is imperative that new cultivars are selected to have outstanding consumer appeal. This study has taken the approach of merging sensory science with genome wide association analyses in order to map the human perception of apple flavour and texture onto the apple genome. The goal was to identify genomic associations that could be used in breeding apples for improved fruit quality. A collection of 85 apple cultivars was examined over two years through descriptive sensory evaluation by a trained sensory panel. The trained sensory panel scored randomized sliced samples of each apple cultivar for seventeen taste, flavour and texture attributes using controlled sensory evaluation practices. In addition, the apple collection was subjected to genotyping by sequencing for marker discovery. A genome wide association analysis suggested significant genomic associations for several sensory traits including juiciness, crispness, mealiness and fresh green apple flavour. The findings include previously unreported genomic regions that could be used in apple breeding and suggest that similar sensory association mapping methods could be applied in other plants.

  2. Mapping the sensory perception of apple using descriptive sensory evaluation in a genome wide association study

    Science.gov (United States)

    Amyotte, Beatrice; Bowen, Amy J.; Banks, Travis; Rajcan, Istvan; Somers, Daryl J.

    2017-01-01

    Breeding apples is a long-term endeavour and it is imperative that new cultivars are selected to have outstanding consumer appeal. This study has taken the approach of merging sensory science with genome wide association analyses in order to map the human perception of apple flavour and texture onto the apple genome. The goal was to identify genomic associations that could be used in breeding apples for improved fruit quality. A collection of 85 apple cultivars was examined over two years through descriptive sensory evaluation by a trained sensory panel. The trained sensory panel scored randomized sliced samples of each apple cultivar for seventeen taste, flavour and texture attributes using controlled sensory evaluation practices. In addition, the apple collection was subjected to genotyping by sequencing for marker discovery. A genome wide association analysis suggested significant genomic associations for several sensory traits including juiciness, crispness, mealiness and fresh green apple flavour. The findings include previously unreported genomic regions that could be used in apple breeding and suggest that similar sensory association mapping methods could be applied in other plants. PMID:28231290

  3. A SNP Based Linkage Map of the Arctic Charr (Salvelinus alpinus) Genome Provides Insights into the Diploidization Process After Whole Genome Duplication

    Directory of Open Access Journals (Sweden)

    Cameron M. Nugent

    2017-02-01

    Diploidization, which follows whole genome duplication events, does not occur evenly across the genome. In salmonid fishes, certain pairs of homeologous chromosomes preserve tetraploid loci in higher frequencies toward the telomeres due to residual tetrasomic inheritance. Research suggests this occurs only in homeologous pairs where one chromosome arm has undergone a fusion event. We present a linkage map for Arctic charr (Salvelinus alpinus), a salmonid species with relatively fewer chromosome fusions. Genotype by sequencing identified 19,418 SNPs, and a linkage map consisting of 4,508 markers was constructed from a subset of high quality SNPs and microsatellite markers that were used to anchor the new map to previous versions. Both male- and female-specific linkage maps contained the expected number of 39 linkage groups. The chromosome type associated with each linkage group was determined, and 10 stable metacentric chromosomes were identified, along with a chromosome polymorphism involving the sex chromosome AC04. Two instances of a weak form of pseudolinkage were detected in the telomeric regions of homeologous chromosome arms in both female and male linkage maps. Chromosome arm homologies within the Atlantic salmon (Salmo salar) and rainbow trout (Oncorhynchus mykiss) genomes were determined. Paralogous sequence variants (PSVs) were identified, and their comparative BLASTn hit locations showed that duplicate markers exist in higher numbers on seven pairs of homeologous arms, previously identified as preserving tetrasomy in salmonid species. Homeologous arm pairs where neither arm has been part of a fusion event in Arctic charr had fewer PSVs, suggesting faster diploidization rates in these regions.

  4. Positioning genomics in biology education: content mapping of undergraduate biology textbooks.

    Science.gov (United States)

    Wernick, Naomi L B; Ndung'u, Eric; Haughton, Dominique; Ledley, Fred D

    2014-12-01

    Biological thought increasingly recognizes the centrality of the genome in constituting and regulating processes ranging from cellular systems to ecology and evolution. In this paper, we ask whether genomics is similarly positioned as a core concept in the instructional sequence for undergraduate biology. Using quantitative methods, we analyzed the order in which core biological concepts were introduced in textbooks for first-year general and human biology. Statistical analysis was performed using self-organizing map algorithms and conventional methods to identify clusters of terms and their relative position in the books. General biology textbooks for both majors and nonmajors introduced genome-related content after text related to cell biology and biological chemistry, but before content describing higher-order biological processes. However, human biology textbooks most often introduced genomic content near the end of the books. These results suggest that genomics is not yet positioned as a core concept in commonly used textbooks for first-year biology and raise questions about whether such textbooks, or courses based on the outline of these textbooks, provide an appropriate foundation for understanding contemporary biological science.

  5. Molecular biologists backing effort to map entire human genome

    International Nuclear Information System (INIS)

    Zurer, P.S.

    1988-01-01

    This article discusses how the program to map and sequence the human genome will be managed. The National Research Council (NRC) recommends that a 15-year, $200-million-a-year effort to map all human genes should begin immediately. However, some people have balked at the idea, saying it is a ploy to raise money. Part of the skeptics' uneasiness stems from the involvement of the Department of Energy (DOE), an agency not often linked with biological research. The DOE's interest arises from its commitment to understanding the biological effects of nuclear radiation. Critics say it is a budget-boosting tactic. This article explains some of the arguments for and against the project and explains exactly what it would involve.

  6. Accuracy assessment of planimetric large-scale map data for decision-making

    Directory of Open Access Journals (Sweden)

    Doskocz Adam

    2016-06-01

    This paper presents decision-making risk estimation based on planimetric large-scale map data, which are data sets or databases that are useful for creating planimetric maps on scales of 1:5,000 or larger. The studies were conducted on four data sets of large-scale map data. Errors of map data were used for a risk assessment of decision-making about the localization of objects, e.g. for land-use planning in realization of investments. An analysis was performed for a large statistical sample set of shift vectors of control points, which were identified with the position errors of these points (errors of map data).

  7. First-generation physical map of the Culicoides variipennis (Diptera: Ceratopogonidae) genome.

    Science.gov (United States)

    Nunamaker, R A; Brown, S E; McHolland, L E; Tabachnick, W J; Knudson, D L

    1999-11-01

    Recombinant cosmids labeled with biotin-11-dUTP or digoxigenin by nick translation were used as in situ hybridization probes to metaphase chromosomes of Culicoides variipennis (Coquillett). Paired fluorescent signals were detected on each arm of sister chromatids and were ordered along the 3 chromosomes. Thirty-three unique probes were mapped to the 3 chromosomes of C. variipennis (2n = 6): 7 to chromosome 1, 20 to chromosome 2, and 6 to chromosome 3. This work represents the first stage in generating a physical map of the genome of C. variipennis.

  8. Development of genomic SSR markers for fingerprinting lettuce (Lactuca sativa L.) cultivars and mapping genes.

    Science.gov (United States)

    Rauscher, Gilda; Simko, Ivan

    2013-01-22

    Lettuce (Lactuca sativa L.) is the major crop from the group of leafy vegetables. Several types of molecular markers have been developed that are effectively used in lettuce breeding and genetic studies. However, only a very limited number of microsatellite-based markers are publicly available. We have employed the method of enriched microsatellite libraries to develop 97 genomic SSR markers. Testing of the newly developed markers on a set of 36 Lactuca accessions (33 L. sativa, and one each of L. serriola L., L. saligna L., and L. virosa L.) revealed that both the genetic heterozygosity (UHe = 0.56) and the number of loci per SSR (Na = 5.50) are significantly higher for genomic SSR markers than for previously developed EST-based SSR markers (UHe = 0.32, Na = 3.56). Fifty-four genomic SSR markers were placed on the molecular linkage map of lettuce. Distribution of markers in the genome appeared to be random, with the exception of a possible cluster on linkage group 6. Any combination of 32 genomic SSRs was able to distinguish genotypes of all 36 accessions. Fourteen of the newly developed SSR markers originate from fragments with high sequence similarity to resistance gene candidates (RGCs) and RGC pseudogenes. Analysis of molecular variance (AMOVA) of L. sativa accessions showed that approximately 3% of genetic diversity was within accessions, 79% among accessions, and 18% among horticultural types. The newly developed genomic SSR markers were added to the pool of previously developed EST-SSR markers. These two types of SSR-based markers provide useful tools for lettuce cultivar fingerprinting, development of integrated molecular linkage maps, and mapping of genes.

  9. Fine scale mapping of genomic introgressions within the Drosophila yakuba clade.

    Science.gov (United States)

    Turissini, David A; Matute, Daniel R

    2017-09-01

    The process of speciation involves populations diverging over time until they are genetically and reproductively isolated. Hybridization between nascent species was long thought to directly oppose speciation. However, the amount of interspecific genetic exchange (introgression) mediated by hybridization remains largely unknown, although recent progress in genome sequencing has made measuring introgression more tractable. A natural place to look for individuals with admixed ancestry (indicative of introgression) is in regions where species co-occur. In west Africa, D. santomea and D. yakuba hybridize on the island of São Tomé, while D. yakuba and D. teissieri hybridize on the nearby island of Bioko. In this report, we quantify the genomic extent of introgression between the three species of the Drosophila yakuba clade (D. yakuba, D. santomea, D. teissieri). We sequenced the genomes of 86 individuals from all three species. We also developed and applied a new statistical framework, using a hidden Markov approach, to identify introgression. We found that introgression has occurred between both species pairs but most introgressed segments are small (on the order of a few kilobases). After ruling out the retention of ancestral polymorphism as an explanation for these similar regions, we find that the sizes of introgressed haplotypes indicate that genetic exchange is not recent (>1,000 generations ago). We additionally show that in both cases, introgression was rarer on X chromosomes than on autosomes, which is consistent with sex chromosomes playing a large role in reproductive isolation. Even though the two species pairs have stable contemporary hybrid zones, providing the opportunity for ongoing gene flow, our results indicate that genetic exchange between these species is currently rare.
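
    The hidden Markov approach mentioned in this abstract can be illustrated, in very general terms, with a toy two-state model decoded by the standard Viterbi algorithm. This is a minimal sketch and not the authors' statistical framework: the state names, the 0/1 encoding of sites (1 = allele typical of the other species) and every probability below are invented for illustration.

```python
import math

def viterbi(obs, states, log_start, log_trans, log_emit):
    """Return the most likely hidden-state path for a sequence of observations."""
    # best[s]: log-probability of the best path that ends in state s
    best = {s: log_start[s] + log_emit[s][obs[0]] for s in states}
    back = []  # back[i][s]: best predecessor of state s at position i + 1
    for o in obs[1:]:
        prev, best, ptr = best, {}, {}
        for s in states:
            p = max(states, key=lambda r: prev[r] + log_trans[r][s])
            best[s] = prev[p] + log_trans[p][s] + log_emit[s][o]
            ptr[s] = p
        back.append(ptr)
    # Trace back from the best final state.
    path = [max(states, key=lambda s: best[s])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]

lg = math.log
states = ["native", "introgressed"]
# All probabilities are made-up illustration values.
log_start = {"native": lg(0.99), "introgressed": lg(0.01)}
log_trans = {
    "native": {"native": lg(0.99), "introgressed": lg(0.01)},
    "introgressed": {"native": lg(0.10), "introgressed": lg(0.90)},
}
log_emit = {
    "native": {0: lg(0.95), 1: lg(0.05)},
    "introgressed": {0: lg(0.10), 1: lg(0.90)},
}

# Ten consecutive sites; a run of 1s suggests a foreign haplotype block.
obs = [0, 0, 0, 1, 1, 1, 1, 0, 0, 0]
path = viterbi(obs, states, log_start, log_trans, log_emit)
print(path)
```

    With these toy parameters the four consecutive foreign-type sites are decoded as a single introgressed block flanked by native sequence. A real analysis would estimate the parameters from data and account for genotype uncertainty and physical distance between sites.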

  10. Unexplored therapeutic opportunities in the human genome

    DEFF Research Database (Denmark)

    Oprea, Tudor I; Bologa, Cristian G; Brunak, Søren

    2018-01-01

    A large proportion of biomedical research and the development of therapeutics is focused on a small fraction of the human genome. In a strategic effort to map the knowledge gaps around proteins encoded by the human genome and to promote the exploration of currently understudied, but potentially d...... as well as key drug target classes, including G protein-coupled receptors, protein kinases and ion channels, which illustrate the nature of the unexplored opportunities for biomedical research and therapeutic development....

  11. [Genome editing of industrial microorganism].

    Science.gov (United States)

    Zhu, Linjiang; Li, Qi

    2015-03-01

    Genome editing is defined as the highly effective and precise modification of cellular genomes on a large scale. In recent years, such genome-editing methods have been rapidly developed in the field of industrial strain improvement. These rapidly evolving methods have thoroughly changed the old, inefficient mode of genetic modification, which is "one modification, one selection marker, and one target site". Highly effective modification modes have been developed in genome editing, including simultaneous modification of multiple genes; highly effective insertion, replacement, and deletion of target genes at the genome scale; and cut-and-paste of large DNA fragments. These new tools for microbial genome editing will certainly be applied widely, increasing the efficiency of industrial strain improvement and promoting the transformation of the traditional fermentation industry and the rapid development of novel industrial biotechnology, such as the production of biofuels and biomaterials. The technological principles of these genome-editing methods and their applications are summarized in this review, which can benefit the engineering and construction of industrial microorganisms.

  12. A second generation genetic map of the bumblebee Bombus terrestris (Linnaeus, 1758) reveals slow genome and chromosome evolution in the Apidae

    Directory of Open Access Journals (Sweden)

    Kube Michael

    2011-01-01

    Background: The bumblebee Bombus terrestris is an ecologically and economically important pollinator and has become an important biological model system. To study fundamental evolutionary questions at the genomic level, a high resolution genetic linkage map is an essential tool for analyses ranging from quantitative trait loci (QTL) mapping to genome assembly and comparative genomics. We here present a saturated linkage map and match it with the Apis mellifera genome using homologous markers. This genome-wide comparison allows insights into structural conservations and rearrangements and thus the evolution on a chromosomal level. Results: The high density linkage map covers ~93% of the B. terrestris genome on 18 linkage groups (LGs) and has a length of 2,047 cM with an average marker distance of 4.02 cM. Based on a genome size of ~430 Mb, the recombination rate estimate is 4.76 cM/Mb. Sequence homologies of 242 homologous markers allowed us to match 15 B. terrestris LGs with A. mellifera LGs, five of them as composites. Comparing marker orders between both genomes we detect over 14% of the genome to be organized in synteny and 21% in rearranged blocks on the same homologous LG. Conclusions: This study demonstrates that, despite the very high recombination rates of both A. mellifera and B. terrestris and a long divergence time of about 100 million years, the genomes' genetic architecture is highly conserved. This reflects a slow genome evolution in these bees. We show that data on genome organization and conserved molecular markers can be used as a powerful tool for comparative genomics and evolutionary studies, opening up new avenues of research in the Apidae.
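
    The genome-wide recombination rate reported in this abstract follows directly from the map length and the genome size. The short sketch below reproduces that division; the variable names are hypothetical, and the genome size is the approximate figure quoted in the abstract.

```python
# Figures quoted in the abstract above.
map_length_cm = 2047.0   # total genetic map length in centimorgans
genome_size_mb = 430.0   # approximate genome size in megabases

# Genome-wide average recombination rate: genetic length per physical length.
recombination_rate = map_length_cm / genome_size_mb

print(round(recombination_rate, 2))
```

    The rounded result matches the 4.76 cM/Mb stated in the abstract. Because the genome size is approximate, the rate should be read as an estimate as well.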

  13. Complete genome sequence of the rapeseed plant-growth promoting Serratia plymuthica strain AS9

    Energy Technology Data Exchange (ETDEWEB)

    Neupane, Saraswoti [Uppsala University, Uppsala, Sweden; Hogberg, Nils [Uppsala University, Uppsala, Sweden; Alstrom, Sadhna [Uppsala University, Uppsala, Sweden; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute; Han, James [U.S. Department of Energy, Joint Genome Institute; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute; Bruce, David [Los Alamos National Laboratory (LANL); Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute; Peters, Lin [U.S. Department of Energy, Joint Genome Institute; Ovchinnikova, Galina [U.S. Department of Energy, Joint Genome Institute; Lu, Megan [Los Alamos National Laboratory (LANL); Han, Cliff [Los Alamos National Laboratory (LANL); Detter, J. Chris [U.S. Department of Energy, Joint Genome Institute; Tapia, Roxanne [Los Alamos National Laboratory (LANL); Fiebig, Anne [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute; Ivanova, N [U.S. Department of Energy, Joint Genome Institute; Pagani, Ioanna [U.S. Department of Energy, Joint Genome Institute; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute; Finlay, Roger D. [Uppsala University, Uppsala, Sweden]

    2012-01-01

    Serratia plymuthica are plant-associated, plant beneficial species belonging to the family Enterobacteriaceae. The members of the genus Serratia are ubiquitous in nature and their life style varies from endophytic to free-living. S. plymuthica AS9 is of special interest for its ability to inhibit fungal pathogens of rapeseed and to promote plant growth. The genome of S. plymuthica AS9 comprises a 5,442,880 bp long circular chromosome that consists of 4,952 protein-coding genes, 87 tRNA genes and 7 rRNA operons. This genome is part of the project entitled "Genomics of four rapeseed plant growth promoting bacteria with antagonistic effect on plant pathogens", awarded through the 2010 DOE-JGI Community Sequencing Program (CSP2010).

  14. GIGGLE: a search engine for large-scale integrated genome analysis.

    Science.gov (United States)

    Layer, Ryan M; Pedersen, Brent S; DiSera, Tonya; Marth, Gabor T; Gertz, Jason; Quinlan, Aaron R

    2018-02-01

    GIGGLE is a genomics search engine that identifies and ranks the significance of genomic loci shared between query features and thousands of genome interval files. GIGGLE (https://github.com/ryanlayer/giggle) scales to billions of intervals and is over three orders of magnitude faster than existing methods. Its speed extends the accessibility and utility of resources such as ENCODE, Roadmap Epigenomics, and GTEx by facilitating data integration and hypothesis generation.

  15. Genome scale metabolic network reconstruction of Spirochaeta cellobiosiphila

    Directory of Open Access Journals (Sweden)

    Bharat Manna

    2017-10-01

    The substantial rise in global energy demand is one of the biggest challenges of this century. Environmental pollution associated with the rapid depletion of fossil fuel resources, and its alarming impact on climate change and global warming, has motivated researchers to look for non-petroleum-based, sustainable, eco-friendly, renewable, low-cost energy alternatives, such as biofuel. Lignocellulosic biomass is one of the most promising bio-resources, with huge potential to contribute to this worldwide energy demand. However, the complex organization of cellulose, hemicellulose and lignin in lignocellulosic biomass requires extensive pre-treatment and enzymatic hydrolysis followed by fermentation, raising the overall production cost of biofuel. This encourages researchers to design cost-effective approaches for the production of second generation biofuels. The products of enzymatic hydrolysis of cellulose are mostly glucose monomers or cellobiose units, which are subjected to fermentation. The genus Spirochaeta is a well-known group of obligate or facultative anaerobes, living primarily on carbohydrate metabolism. Spirochaeta cellobiosiphila is a facultative anaerobe in this genus, which uses a variety of monosaccharides and disaccharides as energy sources. However, the most rapid growth occurs on cellobiose, and fermentation yields significant amounts of ethanol, acetate, CO2, H2 and small amounts of formate. It is predicted to be promising microbial machinery for industrial fermentation processes for biofuel production. The metabolic pathways that govern cellobiose metabolism in Spirochaeta cellobiosiphila are yet to be explored. Functional annotation of the genome sequence of Spirochaeta cellobiosiphila is in progress. In this work we aim to map all the metabolic activities in order to reconstruct a genome-scale metabolic model of Spirochaeta cellobiosiphila.

  16. Mapping of RNA initiation sites by high doses of UV irradiation: evidence for three independent promoters within the left 11% of the Ad-2 genome

    International Nuclear Information System (INIS)

    Wilson, M.C.; Fraser, N.W.; Darnell, J.E. Jr.

    1979-01-01

    Cells infected with Ad-2 virus were irradiated so that UV-induced lesions were introduced every 500 to 1000 nucleotides in the genomes, consequently leading to the premature termination of RNA transcription. Such cells, when labeled with [3H]uridine, accumulate labeled promoter-proximal RNA. Hybridization of this RNA, after size fractionation, to restriction fragments of the Ad-2 genome allowed the identification of DNA sequences containing active RNA initiation sites. Early during the infectious cycle, two active RNA initiation sites were found within the left 11% of the Ad-2 genome, within the 0 to 3.0 and 4.4 to 8.0 restriction fragments. During late infection (15 hr), an additional UV-resistant transcript was detected, indicating that a newly activated RNA initiation site, presumably for protein IX, resides within the fragment 8.0 to 11.2.

  17. GIGGLE: a search engine for large-scale integrated genome analysis

    Science.gov (United States)

    Layer, Ryan M; Pedersen, Brent S; DiSera, Tonya; Marth, Gabor T; Gertz, Jason; Quinlan, Aaron R

    2018-01-01

    GIGGLE is a genomics search engine that identifies and ranks the significance of genomic loci shared between query features and thousands of genome interval files. GIGGLE (https://github.com/ryanlayer/giggle) scales to billions of intervals and is over three orders of magnitude faster than existing methods. Its speed extends the accessibility and utility of resources such as ENCODE, Roadmap Epigenomics, and GTEx by facilitating data integration and hypothesis generation. PMID:29309061

  18. Discovery and Fine-Mapping of Glycaemic and Obesity-Related Trait Loci Using High-Density Imputation.

    Directory of Open Access Journals (Sweden)

    Momoko Horikoshi

    2015-07-01

    Reference panels from the 1000 Genomes (1000G) Project Consortium provide near complete coverage of common and low-frequency genetic variation with minor allele frequency ≥0.5% across European ancestry populations. Within the European Network for Genetic and Genomic Epidemiology (ENGAGE) Consortium, we have undertaken the first large-scale meta-analysis of genome-wide association studies (GWAS), supplemented by 1000G imputation, for four quantitative glycaemic and obesity-related traits, in up to 87,048 individuals of European ancestry. We identified two loci for body mass index (BMI) at genome-wide significance, and two for fasting glucose (FG), none of which has been previously reported in larger meta-analysis efforts to combine GWAS of European ancestry. Through conditional analysis, we also detected multiple distinct signals of association mapping to established loci for waist-hip ratio adjusted for BMI (RSPO3) and FG (GCK and G6PC2). The index variant for one association signal at the G6PC2 locus is a low-frequency coding allele, H177Y, which has recently been demonstrated to have a functional role in glucose regulation. Fine-mapping analyses revealed that the non-coding variants most likely to drive association signals at established and novel loci were enriched for overlap with enhancer elements, which for FG mapped to promoter and transcription factor binding sites in pancreatic islets, in particular. Our study demonstrates that 1000G imputation and genetic fine-mapping of common and low-frequency variant association signals at GWAS loci, integrated with genomic annotation in relevant tissues, can provide insight into the functional and regulatory mechanisms through which their effects on glycaemic and obesity-related traits are mediated.

  19. A meiotic linkage map of the silver fox, aligned and compared to the canine genome

    OpenAIRE

    Kukekova, Anna V.; Trut, Lyudmila N.; Oskina, Irina N.; Johnson, Jennifer L.; Temnykh, Svetlana V.; Kharlamova, Anastasiya V.; Shepeleva, Darya V.; Gulievich, Rimma G.; Shikhevich, Svetlana G.; Graphodatsky, Alexander S.; Aguirre, Gustavo D.; Acland, Gregory M.

    2007-01-01

    A meiotic linkage map is essential for mapping traits of interest and is often the first step toward understanding a cryptic genome. Specific strains of silver fox (a variant of the red fox, Vulpes vulpes), which segregate behavioral and morphological phenotypes, create a need for such a map. One such strain, selected for docility, exhibits friendly dog-like responses to humans, in contrast to another strain selected for aggression. Development of a fox map is facilitated by the known cytogen...

  20. A genome-wide RNAi screen reveals MAP kinase phosphatases as key ERK pathway regulators during embryonic stem cell differentiation.

    Directory of Open Access Journals (Sweden)

    Shen-Hsi Yang

    Embryonic stem cells and induced pluripotent stem cells represent potentially important therapeutic agents in regenerative medicine. Complex interlinked transcriptional and signaling networks control the fate of these cells towards maintenance of pluripotency or differentiation. In this study we have focused on how mouse embryonic stem cells begin to differentiate and lose pluripotency and, in particular, the role that the ERK MAP kinase and GSK3 signaling pathways play in this process. Through a genome-wide siRNA screen we have identified more than 400 genes involved in loss of pluripotency and promoting the onset of differentiation. These genes were functionally associated with the ERK and/or GSK3 pathways, providing an important resource for studying the roles of these pathways in controlling escape from the pluripotent ground state. More detailed analysis identified MAP kinase phosphatases as a focal point of regulation and demonstrated an important role for these enzymes in controlling ERK activation kinetics and subsequently determining early embryonic stem cell fate decisions.

  1. PREFERENCE FOR MAP SCALE OF IN-CAR ROUTE GUIDANCE AND NAVIGATION SYSTEM

    Directory of Open Access Journals (Sweden)

    Ana Paula Marques Ramos

    Usability issues with maps presented by in-car Route Guidance and Navigation Systems (RGNS) may have serious impacts on traffic safety. In obtaining an effective RGNS, evaluation of 'user satisfaction' with the system has played a prominent role, since it lets designers quantify drivers' acceptance of the presented information. An important variable in the design of RGNS interfaces is the selection of an appropriate map scale, since scale affects the legibility of maps. A map with good legibility may help drivers comprehend information easily and make decisions quickly during the driving task. This paper evaluates drivers' preference for scales used in RGNS maps. A total of 52 subjects participated in an experiment performed in a parked car. Maps were designed at four different scales (1:1,000, 1:3,000, 1:6,000 and 1:10,000) for a route composed of 13 junctions. Map design was based on cartographic communication principles, such as perceptual grouping and figure-ground segregation. Based on the cases studied, we conclude that intermediate scales (1:6,000 and 1:3,000) were more acceptable among drivers than large (1:1,000) and small (1:10,000) scales. An RGNS should select map scales that help drivers quickly identify the direction of the maneuver and, simultaneously, get information about the surroundings of the route. More results are presented and implications discussed.

  2. Do turning visited routes in black maps into white promote sightseeing?

    Science.gov (United States)

    Izumi, Tomoko; Nakatani, Yoshio

    2017-07-01

    In this paper, we propose a new approach for encouraging tourists to visit various areas while sightseeing, based on the "FUrther BENEfit of a Kind of Inconvenience" (FUBEN-EKI) theory. FUBEN-EKI theory holds that a system that is inconvenient in some respect enables users to obtain more hidden benefits than convenient systems do. Familiar navigation systems lead tourists along time-efficient routes, and so may limit their opportunities for new discoveries while sightseeing. We suppose that if tourists walk freely through various areas, they will have more chances to make new discoveries. To promote visits to various areas, we propose a blacked-out map: the map is initially hidden by a black filter, and only the areas through which the user moves become clear. Since the user cannot see the map of unvisited areas, the user thinks that those areas might contain interesting sightseeing spots. Moreover, to make the whole map clear, the user has to go to every area on the map.

  3. Development of a dense SNP-based linkage map of an apple rootstock progeny using the Malus Infinium whole genome genotyping array.

    Science.gov (United States)

    Antanaviciute, Laima; Fernández-Fernández, Felicidad; Jansen, Johannes; Banchi, Elisa; Evans, Katherine M; Viola, Roberto; Velasco, Riccardo; Dunwell, Jim M; Troggio, Michela; Sargent, Daniel J

    2012-05-25

    A whole-genome genotyping array has previously been developed for Malus using SNP data from 28 Malus genotypes. This array offers the prospect of high throughput genotyping and linkage map development for any given Malus progeny. To test the applicability of the array for mapping in diverse Malus genotypes, we applied the array to the construction of a SNP-based linkage map of an apple rootstock progeny. Of the 7,867 Malus SNP markers on the array, 1,823 (23.2%) were heterozygous in one of the two parents of the progeny, 1,007 (12.8%) were heterozygous in both parental genotypes, whilst just 2.8% of the 921 Pyrus SNPs were heterozygous. A linkage map spanning 1,282.2 cM was produced comprising 2,272 SNP markers, 306 SSR markers and the S-locus. The length of the M432 linkage map was increased by 52.7 cM with the addition of the SNP markers, whilst marker density increased from 3.8 cM/marker to 0.5 cM/marker. Just three regions in excess of 10 cM remain where no markers were mapped. We compared the positions of the mapped SNP markers on the M432 map with their predicted positions on the 'Golden Delicious' genome sequence. A total of 311 markers (13.7% of all mapped markers) mapped to positions that conflicted with their predicted positions on the 'Golden Delicious' pseudo-chromosomes, indicating the presence of paralogous genomic regions or mis-assignments of genome sequence contigs during the assembly and anchoring of the genome sequence. We incorporated data for the 2,272 SNP markers onto the map of the M432 progeny and have presented the most complete and saturated map of the full 17 linkage groups of M. pumila to date. The data were generated rapidly in a high-throughput semi-automated pipeline, permitting significant savings in time and cost over linkage map construction using microsatellites. The application of the array will permit linkage maps to be developed for QTL analyses in a cost-effective manner, and the identification of SNPs that have been
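
    The percentages and marker density quoted in this abstract can be re-derived from its own counts. The short check below is only a worked illustration; it assumes that the final marker density is computed over all mapped loci (SNPs, SSRs and the S-locus), which the abstract does not state explicitly, and the variable names are hypothetical.

```python
# Counts taken from the abstract above.
malus_snps_on_array = 7867
het_in_one_parent = 1823
het_in_both_parents = 1007
map_length_cm = 1282.2
mapped_snps = 2272
mapped_ssrs = 306
s_locus = 1
conflicting_snps = 311


def percent(part, whole):
    """Percentage of `whole` represented by `part`, to one decimal place."""
    return round(100 * part / whole, 1)


pct_one_parent = percent(het_in_one_parent, malus_snps_on_array)
pct_both_parents = percent(het_in_both_parents, malus_snps_on_array)
pct_conflicting = percent(conflicting_snps, mapped_snps)

# Assumption: density is averaged over every mapped locus on the map.
total_loci = mapped_snps + mapped_ssrs + s_locus
density_cm_per_marker = round(map_length_cm / total_loci, 1)

print(pct_one_parent, pct_both_parents, pct_conflicting, density_cm_per_marker)
```

    Under that assumption the computed values agree with the 23.2%, 12.8%, 13.7% and 0.5 cM/marker reported in the abstract.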

  4. Mapping the pericentric heterochromatin by comparative genomic hybridization analysis and chromosome deletions in Drosophila melanogaster.

    Science.gov (United States)

    He, Bing; Caudy, Amy; Parsons, Lance; Rosebrock, Adam; Pane, Attilio; Raj, Sandeep; Wieschaus, Eric

    2012-12-01

    Heterochromatin represents a significant portion of eukaryotic genomes and has essential structural and regulatory functions. Its molecular organization is largely unknown due to difficulties in sequencing through and assembling repetitive sequences enriched in the heterochromatin. Here we developed a novel strategy using chromosomal rearrangements and embryonic phenotypes to position unmapped Drosophila melanogaster heterochromatic sequence to specific chromosomal regions. By excluding sequences that can be mapped to the assembled euchromatic arms, we identified sequences that are specific to heterochromatin and used them to design heterochromatin specific probes ("H-probes") for microarray. By comparative genomic hybridization (CGH) analyses of embryos deficient for each chromosome or chromosome arm, we were able to map most of our H-probes to specific chromosome arms. We also positioned sequences mapped to the second and X chromosomes to finer intervals by analyzing smaller deletions with breakpoints in heterochromatin. Using this approach, we were able to map >40% (13.9 Mb) of the previously unmapped heterochromatin sequences assembled by the whole-genome sequencing effort on arm U and arm Uextra to specific locations. We also identified and mapped 110 kb of novel heterochromatic sequences. Subsequent analyses revealed that sequences located within different heterochromatic regions have distinct properties, such as sequence composition, degree of repetitiveness, and level of underreplication in polytenized tissues. Surprisingly, although heterochromatin is generally considered to be transcriptionally silent, we detected region-specific temporal patterns of transcription in heterochromatin during oogenesis and early embryonic development. Our study provides a useful approach to elucidate the molecular organization and function of heterochromatin and reveals region-specific variation of heterochromatin.

  5. Methylation-sensitive linking libraries enhance gene-enriched sequencing of complex genomes and map DNA methylation domains

    Directory of Open Access Journals (Sweden)

    Bharti Arvind K

    2008-12-01

    Background. Many plant genomes are resistant to whole-genome assembly due to an abundance of repetitive sequence, leading to the development of gene-rich sequencing techniques. Two such techniques are hypomethylated partial restriction (HMPR) and methylation spanning linker libraries (MSLL). These libraries differ from other gene-rich datasets in having larger insert sizes, and the MSLL clones are designed to provide reads localized to "epigenetic boundaries" where methylation begins or ends. Results. A large-scale study in maize generated 40,299 HMPR sequences and 80,723 MSLL sequences, including MSLL clones exceeding 100 kb. The paired end reads of MSLL and HMPR clones were shown to be effective in linking existing gene-rich sequences into scaffolds. In addition, it was shown that the MSLL clones can be used for anchoring these scaffolds to a BAC-based physical map. The MSLL end reads effectively identified epigenetic boundaries, as indicated by their preferential alignment to regions upstream and downstream from annotated genes. The ability to precisely map long stretches of fully methylated DNA sequence is a unique outcome of MSLL analysis, and was also shown to provide evidence for errors in gene identification. MSLL clones were observed to be significantly more repeat-rich in their interiors than in their end reads, confirming the correlation between methylation and retroelement content. Both MSLL and HMPR reads were found to be substantially gene-enriched, with the SalI MSLL libraries being the most highly enriched (31% align to an EST contig), while the HMPR clones exhibited exceptional depletion of repetitive DNA (to ~11%). These two techniques were compared with other gene-enrichment methods, and shown to be complementary. Conclusion. MSLL technology provides an unparalleled approach for mapping the epigenetic status of repetitive blocks and for identifying sequences mis-identified as genes. Although the types and natures of

  6. An object model for genome information at all levels of resolution

    Energy Technology Data Exchange (ETDEWEB)

    Honda, S.; Parrott, N.W.; Smith, R.; Lawrence, C.

    1993-12-31

    An object model for genome data at all levels of resolution is described. The model was derived by considering the requirements for representing genome related objects in three application domains: genome maps, large-scale DNA sequencing, and exploring functional information in gene and protein sequences. The methodology used for the object-oriented analysis is also described.

  7. Whole genome re-sequencing reveals genome-wide variations among parental lines of 16 mapping populations in chickpea (Cicer arietinum L.).

    Science.gov (United States)

    Thudi, Mahendar; Khan, Aamir W; Kumar, Vinay; Gaur, Pooran M; Katta, Krishnamohan; Garg, Vanika; Roorkiwal, Manish; Samineni, Srinivasan; Varshney, Rajeev K

    2016-01-27

    Chickpea (Cicer arietinum L.) is the second most important grain legume cultivated by resource-poor farmers in South Asia and Sub-Saharan Africa. In order to harness the untapped genetic potential available for chickpea improvement, we re-sequenced 35 chickpea genotypes representing parental lines of 16 mapping populations segregating for abiotic stress (drought, heat, salinity), biotic stress (Fusarium wilt, Ascochyta blight, Botrytis grey mould, Helicoverpa armigera) and nutritionally important (protein content) traits using a whole genome re-sequencing approach. A total of 192.19 Gb of data, comprising 973.13 million reads, was generated on the 35 genotypes of chickpea, with an average sequencing depth of ~10 X for each line. On average, 92.18 % of reads from each genotype were aligned to the chickpea reference genome, with 82.17 % coverage. A total of 2,058,566 unique single nucleotide polymorphisms (SNPs) and 292,588 Indels were detected in comparison with the reference chickpea genome. The highest number of SNPs was identified on the Ca4 pseudomolecule. In addition, copy number variations (CNVs) such as gene deletions and duplications were identified across the chickpea parental genotypes, which were minimum in PI 489777 (1 gene deletion) and maximum in JG 74 (1,497). A total of 164,856 line-specific variations (144,888 SNPs and 19,968 Indels) with the highest percentage were identified in coding regions in ICC 1496 (21 %) followed by ICCV 97105 (12 %). Of 539 miscellaneous variations, 339, 138 and 62 were inter-chromosomal variations (CTX), intra-chromosomal variations (ITX) and inversions (INV) respectively. Genome-wide SNPs, Indels, CNVs, PAVs, and miscellaneous variations identified in different mapping populations are a valuable resource in genetic research and helpful in locating genes/genomic segments responsible for economically important traits. Further, the genome-wide variations identified in the present study can be used for developing high density SNP arrays for

  8. The large-scale blast score ratio (LS-BSR) pipeline: a method to rapidly compare genetic content between bacterial genomes

    Directory of Open Access Journals (Sweden)

    Jason W. Sahl

    2014-04-01

    Background. As whole genome sequence data from bacterial isolates becomes cheaper to generate, computational methods are needed to correlate sequence data with biological observations. Here we present the large-scale BLAST score ratio (LS-BSR) pipeline, which rapidly compares the genetic content of hundreds to thousands of bacterial genomes, and returns a matrix that describes the relatedness of all coding sequences (CDSs) in all genomes surveyed. This matrix can be easily parsed in order to identify genetic relationships between bacterial genomes. Although pipelines have been published that group peptides by sequence similarity, no other software performs the rapid, large-scale, full-genome comparative analyses carried out by LS-BSR. Results. To demonstrate the utility of the method, the LS-BSR pipeline was tested on 96 Escherichia coli and Shigella genomes; the pipeline ran in 163 min using 16 processors, which is a greater than 7-fold speedup compared to using a single processor. The BSR values for each CDS, which indicate a relative level of relatedness, were then mapped to each genome on an independent core genome single nucleotide polymorphism (SNP) based phylogeny. Comparisons were then used to identify clade specific CDS markers and validate the LS-BSR pipeline based on molecular markers that delineate between classical E. coli pathogenic variant (pathovar) designations. Scalability tests demonstrated that the LS-BSR pipeline can process 1,000 E. coli genomes in 27–57 h, depending upon the alignment method, using 16 processors. Conclusions. LS-BSR is an open-source, parallel implementation of the BSR algorithm, enabling rapid comparison of the genetic content of large numbers of genomes. The results of the pipeline can be used to identify specific markers between user-defined phylogenetic groups, and to identify the loss and/or acquisition of genetic information between bacterial isolates. Taxa-specific genetic markers can then be translated
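
    The matrix of BSR values mentioned in this abstract can be sketched in a few lines. The abstract does not spell out the ratio; the sketch assumes the usual definition of a BLAST score ratio, namely the bit score of a CDS aligned against a genome divided by the bit score of that CDS aligned against itself, so values run from 0 (absent) to 1 (identical). The function names, the toy bit scores and the 0.8 / 0.4 cut-offs are hypothetical and are not taken from the LS-BSR software.

```python
def bsr_matrix(self_scores, genome_scores):
    """Build {cds: {genome: BSR}} from BLAST bit scores.

    self_scores:   {cds: bit score of the CDS aligned to itself}
    genome_scores: {genome: {cds: best bit score of the CDS in that genome}}
    A CDS with no hit in a genome gets a BSR of 0.0.
    """
    matrix = {}
    for cds, reference_score in self_scores.items():
        matrix[cds] = {
            genome: round(hits.get(cds, 0.0) / reference_score, 3)
            for genome, hits in genome_scores.items()
        }
    return matrix


def group_specific(matrix, in_group, out_group, present=0.8, absent=0.4):
    """CDSs conserved in every in-group genome and missing from every out-group genome.

    The `present` and `absent` cut-offs are illustrative assumptions.
    """
    return sorted(
        cds
        for cds, row in matrix.items()
        if all(row[g] >= present for g in in_group)
        and all(row[g] < absent for g in out_group)
    )


# Toy bit scores for two CDSs in two genomes.
self_scores = {"cdsA": 200.0, "cdsB": 150.0}
genome_scores = {
    "g1": {"cdsA": 200.0, "cdsB": 75.0},
    "g2": {"cdsA": 50.0},
}
matrix = bsr_matrix(self_scores, genome_scores)
markers = group_specific(matrix, in_group=["g1"], out_group=["g2"])
print(matrix, markers)
```

    Normalizing by the self-hit score is what makes values comparable across CDSs of different lengths, so a single pair of thresholds can be applied to the whole matrix.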

  9. Properties of promoters cloned randomly from the Saccharomyces cerevisiae genome.

    Science.gov (United States)

    Santangelo, G M; Tornow, J; McLaughlin, C S; Moldave, K

    1988-01-01

    Promoters were isolated at random from the genome of Saccharomyces cerevisiae by using a plasmid that contains a divergently arrayed pair of promoterless reporter genes. A comprehensive library was constructed by inserting random (DNase I-generated) fragments into the intergenic region upstream from the reporter genes. Simple in vivo assays for either reporter gene product (alcohol dehydrogenase or beta-galactosidase) allowed the rapid identification of promoters from among these random fragments. Poly(dA-dT) homopolymer tracts were present in three of five randomly cloned promoters. With two exceptions, each RNA start site detected was 40 to 100 base pairs downstream from a TATA element. All of the randomly cloned promoters were capable of activating reporter gene transcription bidirectionally. Interestingly, one of the promoter fragments originated in a region of the S. cerevisiae rDNA spacer; regulated divergent transcription (presumably by RNA polymerase II) initiated in the same region. PMID:2847031

  10. Fine-Scale Mapping at 9p22.2 Identifies Candidate Causal Variants That Modify Ovarian Cancer Risk in BRCA1 and BRCA2 Mutation Carriers

    NARCIS (Netherlands)

    Vigorito, E.; Kuchenbaecker, K.B.; Beesley, J.; Adlard, J.; Agnarsson, B.A.; Andrulis, I.L.; Arun, B.K.; Barjhoux, L.; Belotti, M.; Benitez, J.; Berger, A.; Bojesen, A.; Bonanni, B.; Brewer, C.; Caldes, T.; Caligo, M.A.; Campbell, I.; Chan, S.B.; Claes, K.B.; Cohn, D.E.; Cook, J.; Daly, M.B.; Damiola, F.; Davidson, R.; Pauw, A. de; Delnatte, C.; Diez, O.; Domchek, S.M.; Dumont, M.; Durda, K.; Dworniczak, B.; Easton, D.F.; Eccles, D.; Edwinsdotter Ardnor, C.; Eeles, R.; Ejlertsen, B.; Ellis, S.; Evans, D.G.; Feliubadalo, L.; Fostira, F.; Foulkes, W.D.; Friedman, E.; Frost, D.; Gaddam, P.; Ganz, P.A.; Garber, J.; Garcia-Barberan, V.; Gauthier-Villars, M.; Gehrig, A.; Gerdes, A.M.; Giraud, S.; Godwin, A.K.; Goldgar, D.E.; Hake, C.R.; Hansen, T.V.; Healey, S.; Hodgson, S.; Hogervorst, F.B.; Houdayer, C.; Hulick, P.J.; Imyanitov, E.N.; Isaacs, C.; Izatt, L.; Izquierdo, A.; Jacobs, L; Jakubowska, A.; Janavicius, R.; Jaworska-Bieniek, K.; Jensen, U.B.; John, E.M.; Vijai, J.; Karlan, B.Y.; Kast, K.; Khan, S.; Kwong, A.; Laitman, Y.; Lester, J.; Lesueur, F.; Liljegren, A.; Lubinski, J.; Mai, P.L.; Manoukian, S.; Mazoyer, S.; Meindl, A.; Mensenkamp, A.R.; Montagna, M.; Nathanson, K.L.; Neuhausen, S.L.; Nevanlinna, H.; Niederacher, D.; Olah, E.; Olopade, O.I.; Ong, K.R.; Osorio, A.; Park, S.K.; Paulsson-Karlsson, Y.; Pedersen, I.S.; Peissel, B.; Peterlongo, P.; et al.,

    2016-01-01

    Population-based genome wide association studies have identified a locus at 9p22.2 associated with ovarian cancer risk, which also modifies ovarian cancer risk in BRCA1 and BRCA2 mutation carriers. We conducted fine-scale mapping at 9p22.2 to identify potential causal variants in BRCA1 and BRCA2

  11. Fine-Scale Mapping at 9p22.2 Identifies Candidate Causal Variants That Modify Ovarian Cancer Risk in BRCA1 and BRCA2 Mutation Carriers

    DEFF Research Database (Denmark)

    Vigorito, Elena; Kuchenbaecker, Karoline B; Beesley, Jonathan

    2016-01-01

    Population-based genome wide association studies have identified a locus at 9p22.2 associated with ovarian cancer risk, which also modifies ovarian cancer risk in BRCA1 and BRCA2 mutation carriers. We conducted fine-scale mapping at 9p22.2 to identify potential causal variants in BRCA1 and BRCA2 ...

  12. Creation of BAC genomic resources for cocoa (Theobroma cacao L.) for physical mapping of RGA containing BAC clones.

    Science.gov (United States)

    Clément, D; Lanaud, C; Sabau, X; Fouet, O; Le Cunff, L; Ruiz, E; Risterucci, A M; Glaszmann, J C; Piffanelli, P

    2004-05-01

    We have constructed and validated the first cocoa (Theobroma cacao L.) BAC library, with the aim of developing molecular resources to study the structure and evolution of the genome of this perennial crop. This library contains 36,864 clones with an average insert size of 120 kb, representing approximately ten haploid genome equivalents. It was constructed from the genotype Scavina-6 (Sca-6), a Forastero clone highly resistant to cocoa pathogens and a parent of existing mapping populations. Validation of the BAC library was carried out with a set of 13 genetically-anchored single copy and one duplicated markers. An average of nine BAC clones per probe was identified, giving an initial experimental estimation of the genome coverage represented in the library. Screening of the library with a set of resistance gene analogues (RGAs), previously mapped in cocoa and co-localizing with QTL for resistance to Phytophthora traits, confirmed at the physical level the tight clustering of RGAs in the cocoa genome and provided the first insights into the relationships between genetic and physical distances in the cocoa genome. This library represents an available BAC resource for structural genomic studies or map-based cloning of genes corresponding to important QTLs for agronomic traits such as resistance genes to major cocoa pathogens like Phytophthora spp (palmivora and megakarya), Crinipellis perniciosa and Moniliophthora roreri.

  13. The Ties That Bind: Mapping the Dynamic Enhancer-Promoter Interactome.

    Science.gov (United States)

    Spurrell, Cailyn H; Dickel, Diane E; Visel, Axel

    2016-11-17

    Coupling chromosome conformation capture to molecular enrichment for promoter-containing DNA fragments enables the systematic mapping of interactions between individual distal regulatory sequences and their target genes. In this Minireview, we describe recent progress in the application of this technique and related complementary approaches to gain insight into the lineage- and cell-type-specific dynamics of interactions between regulators and gene promoters. Copyright © 2016 Elsevier Inc. All rights reserved.

  14. A high-resolution map of the Nile tilapia genome: a resource for studying cichlids and other percomorphs

    Science.gov (United States)

    2012-01-01

    Background. The Nile tilapia (Oreochromis niloticus) is the second most farmed fish species worldwide. It is also an important model for studies of fish physiology, particularly because of its broad tolerance to an array of environments. It is a good model to study evolutionary mechanisms in vertebrates, because of its close relationship to haplochromine cichlids, which have undergone rapid speciation in East Africa. The existing genomic resources for Nile tilapia include a genetic map, BAC end sequences and ESTs, but comparative genome analysis and maps of quantitative trait loci (QTL) are still limited. Results. We have constructed a high-resolution radiation hybrid (RH) panel for the Nile tilapia and genotyped 1358 markers consisting of 850 genes, 82 markers corresponding to BAC end sequences, 154 microsatellites and 272 single nucleotide polymorphisms (SNPs). From these, 1296 markers could be associated in 81 RH groups, while 62 were not linked. The total size of the RH map is 34,084 cR3500 and 937,310 kb. It covers 88% of the entire genome with an estimated inter-marker distance of 742 kb. Mapping of microsatellites enabled integration to the genetic map. We have merged LG8 and LG24 into a single linkage group, and confirmed that LG16-LG21 are also merged. The orientation and association of RH groups to each chromosome and LG was confirmed by chromosomal in situ hybridizations (FISH) of 55 BACs. Fifty RH groups were localized on the 22 chromosomes while 31 remained small orphan groups. Synteny relationships were determined between Nile tilapia, stickleback, medaka and pufferfish. Conclusion. The RH map and associated FISH map provide a valuable gene-ordered resource for gene mapping and QTL studies. All genetic linkage groups with their corresponding RH groups now have a corresponding chromosome which can be identified in the karyotype. Placement of conserved segments indicated that multiple inter-chromosomal rearrangements have occurred between Nile tilapia

  15. Genotype Imputation for Latinos Using the HapMap and 1000 Genomes Project Reference Panels

    Directory of Open Access Journals (Sweden)

    Xiaoyi Gao

    2012-06-01

    Genotype imputation is a vital tool in genome-wide association studies (GWAS) and meta-analyses of multiple GWAS results. Imputation enables researchers to increase genomic coverage and to pool data generated using different genotyping platforms. HapMap samples are often employed as the reference panel. More recently, the 1000 Genomes Project resource is becoming the primary source for reference panels. Multiple GWAS and meta-analyses are targeting Latinos, the most populous and fastest growing minority group in the US. However, genotype imputation resources for Latinos are rather limited compared to individuals of European ancestry at present, largely because of the lack of good reference data. One choice of reference panel for Latinos is one derived from the population of Mexican individuals in Los Angeles contained in the HapMap Phase 3 project and the 1000 Genomes Project. However, a detailed evaluation of the quality of the imputed genotypes derived from the public reference panels has not yet been reported. Using simulation studies, the Illumina OmniExpress GWAS data from the Los Angeles Latino Eye Study and the MACH software package, we evaluated the accuracy of genotype imputation in Latinos. Our results show that the 1000 Genomes Project AMR+CEU+YRI reference panel provides the highest imputation accuracy for Latinos, and that also including Asian samples in the panel can reduce imputation accuracy. We also provide the imputation accuracy for each autosomal chromosome using the 1000 Genomes Project panel for Latinos. Our results serve as a guide to future imputation-based analysis in Latinos.

  16. Integration of expression data in genome-scale metabolic network reconstructions

    Directory of Open Access Journals (Sweden)

    Anna S. Blazier

    2012-08-01

    With the advent of high-throughput technologies, the field of systems biology has amassed an abundance of omics data, quantifying thousands of cellular components across a variety of scales, ranging from mRNA transcript levels to metabolite quantities. Methods are needed not only to integrate these omics data but also to use them to heighten the predictive capabilities of computational models. Several recent studies have successfully demonstrated how flux balance analysis (FBA), a constraint-based modeling approach, can be used to integrate transcriptomic data into genome-scale metabolic network reconstructions to generate predictive computational models. In this review, we summarize such FBA-based methods for integrating expression data into genome-scale metabolic network reconstructions, highlighting their advantages as well as their limitations.

  17. Large-scale discovery of promoter motifs in Drosophila melanogaster.

    Directory of Open Access Journals (Sweden)

    Thomas A Down

    2007-01-01

    A key step in understanding gene regulation is to identify the repertoire of transcription factor binding motifs (TFBMs) that form the building blocks of promoters and other regulatory elements. Identifying these experimentally is very laborious, and the number of TFBMs discovered remains relatively small, especially when compared with the hundreds of transcription factor genes predicted in metazoan genomes. We have used a recently developed statistical motif discovery approach, NestedMICA, to detect candidate TFBMs from a large set of Drosophila melanogaster promoter regions. Of the 120 motifs inferred in our initial analysis, 25 were statistically significant matches to previously reported motifs, while 87 appeared to be novel. Analysis of sequence conservation and motif positioning suggested that the great majority of these discovered motifs are predictive of functional elements in the genome. Many motifs showed associations with specific patterns of gene expression in the D. melanogaster embryo, and we were able to obtain confident annotation of expression patterns for 25 of our motifs, including eight of the novel motifs. The motifs are available through Tiffin, a new database of DNA sequence motifs. We have discovered many new motifs that are overrepresented in D. melanogaster promoter regions, and offer several independent lines of evidence that these are novel TFBMs. Our motif dictionary provides a solid foundation for further investigation of regulatory elements in Drosophila, and demonstrates techniques that should be applicable in other species. We suggest that further improvements in computational motif discovery should narrow the gap between the set of known motifs and the total number of transcription factors in metazoan genomes.

  18. High-resolution comparative mapping among man, cattle and mouse suggests a role for repeat sequences in mammalian genome evolution

    Directory of Open Access Journals (Sweden)

    Rodolphe François

    2006-08-01

    Full Text Available Background: Comparative mapping provides new insights into the evolutionary history of genomes. In particular, recent studies in mammals have suggested a role for segmental duplication in genome evolution. In some species such as Drosophila or maize, transposable elements (TEs) have been shown to be involved in chromosomal rearrangements. In this work, we have explored the presence of interspersed repeats in regions of chromosomal rearrangements, using an updated high-resolution integrated comparative map among cattle, man and mouse. Results: The bovine, human and mouse comparative autosomal map has been constructed using data from bovine genetic and physical maps and from FISH-mapping studies. We confirm most previous results but also reveal some discrepancies. A total of 211 conserved segments have been identified between cattle and man, of which 33 are new segments and 72 correspond to extended, previously known segments. The resulting map covers 91% and 90% of the human and bovine genomes, respectively. Analysis of breakpoint regions revealed a high density of species-specific interspersed repeats in the human and mouse genomes. Conclusion: Analysis of the breakpoint regions has revealed specific repeat density patterns, suggesting that TEs may have played a significant role in chromosome evolution and genome plasticity. However, we cannot rule out that repeats and breakpoints accumulate independently in the same few regions where modifications are better tolerated. Likewise, we cannot ascertain whether increased TE density is the cause or the consequence of chromosome rearrangements. Nevertheless, the identification of high-density repeat clusters combined with a well-documented repeat phylogeny should highlight probable breakpoints, and permit their precise dating. Combining new statistical models taking the present information into account should help reconstruct ancestral karyotypes.

  19. Rapid Prototyping of Microbial Cell Factories via Genome-scale Engineering

    Science.gov (United States)

    Si, Tong; Xiao, Han; Zhao, Huimin

    2014-01-01

    Advances in reading, writing and editing genetic materials have greatly expanded our ability to reprogram biological systems at the resolution of a single nucleotide and on the scale of a whole genome. Such capacity has greatly accelerated the cycles of design, build and test to engineer microbes for efficient synthesis of fuels, chemicals and drugs. In this review, we summarize the emerging technologies that have been applied, or are potentially useful for genome-scale engineering in microbial systems. We will focus on the development of high-throughput methodologies, which may accelerate the prototyping of microbial cell factories. PMID:25450192

  20. ESTIMAP: Ecosystem services mapping at European scale

    OpenAIRE

    ZULIAN GRAZIA; PARACCHINI Maria-Luisa; MAES JOACHIM; LIQUETE GARCIA MARIA DEL CAMINO

    2013-01-01

    Mapping, visualization and the access to suitable data as a means to facilitate the dialogue among scientists, policy makers and the general public are among the most challenging issues within current ecosystem service science and application. Recently the attention on spatially explicit ways to map ecosystem services, at local, regional and global scale is increasing. This report presents ESTIMAP: a suite of models for a spatially explicit assessment of three ecosystem services (recreati...

  1. An initial comparative map of copy number variations in the goat (Capra hircus) genome

    Directory of Open Access Journals (Sweden)

    Casadio Rita

    2010-11-01

    Full Text Available Background: The goat (Capra hircus) represents one of the most important farm animal species. It is reared on all continents, with an estimated world population of about 800 million animals. Despite its importance, studies on the goat genome are still in their infancy compared to those in other farm animal species. Comparative mapping between cattle and goat showed only a few rearrangements, in agreement with the similarity of chromosome banding. We carried out a cross-species cattle-goat array comparative genome hybridization (aCGH) experiment in order to identify copy number variations (CNVs) in the goat genome, analysing animals of different breeds (Saanen, Camosciata delle Alpi, Girgentana, and Murciano-Granadina) using a tiling oligonucleotide array with ~385,000 probes designed on the bovine genome. Results: We identified a total of 161 CNVs (an average of 17.9 CNVs per goat), with the largest number in the Saanen breed and the lowest in the Camosciata delle Alpi goat. By aggregating overlapping CNVs identified in different animals we determined CNV regions (CNVRs): on the whole, we identified 127 CNVRs covering about 11.47 Mb of the virtual goat genome referred to the bovine genome (0.435% of the latter genome). These 127 CNVRs included 86 losses and 41 gains and ranged from about 24 kb to about 1.07 Mb, with a mean and median equal to 90,292 bp and 49,530 bp, respectively. To evaluate whether the identified goat CNVRs overlap with those reported in the cattle genome, we compared our results with those obtained in four independent cattle experiments. Overlapping between goat and cattle CNVRs was highly significant (P ...). Conclusions: We describe a first map of goat CNVRs. This provides information on a comparative basis with the cattle genome by identifying putative recurrent interspecies CNVs between these two ruminant species. Several goat CNVs affect genes with important biological functions. Further studies are needed to evaluate the
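    The aggregation step described in this record (overlapping CNVs from different animals collapsed into CNV regions) can be illustrated with a minimal interval-merging sketch. The abstract does not state the exact overlap criterion, so the rule below (any overlap or shared boundary merges two calls) is an assumption, and the coordinates and the function name `merge_cnvs` are invented for illustration only.

```python
def merge_cnvs(cnvs):
    """Merge overlapping (chrom, start, end) CNV calls into CNV regions.

    Assumption: two calls on the same chromosome are merged whenever the
    second starts at or before the end of the region built so far.
    """
    regions = []
    for chrom, start, end in sorted(cnvs):
        if regions and regions[-1][0] == chrom and start <= regions[-1][2]:
            last = regions[-1]
            # Extend the current region to cover the new call.
            regions[-1] = (chrom, last[1], max(last[2], end))
        else:
            regions.append((chrom, start, end))
    return regions


# Hypothetical calls pooled from several animals (not data from the study).
calls = [
    ("chr1", 100_000, 150_000),
    ("chr1", 140_000, 210_000),  # overlaps the first call
    ("chr1", 400_000, 430_000),
    ("chr2", 50_000, 90_000),
]
cnvrs = merge_cnvs(calls)
total_bp = sum(end - start for _, start, end in cnvrs)
print(cnvrs, total_bp)
```

    Summing the merged region lengths, as in `total_bp`, is how a figure such as "127 CNVRs covering about 11.47 Mb" would typically be obtained from the merged set.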

  2. Mapping the pericentric heterochromatin by comparative genomic hybridization analysis and chromosome deletions in Drosophila melanogaster

    Science.gov (United States)

    He, Bing; Caudy, Amy; Parsons, Lance; Rosebrock, Adam; Pane, Attilio; Raj, Sandeep; Wieschaus, Eric

    2012-01-01

    Heterochromatin represents a significant portion of eukaryotic genomes and has essential structural and regulatory functions. Its molecular organization is largely unknown due to difficulties in sequencing through and assembling repetitive sequences enriched in the heterochromatin. Here we developed a novel strategy using chromosomal rearrangements and embryonic phenotypes to position unmapped Drosophila melanogaster heterochromatic sequence to specific chromosomal regions. By excluding sequences that can be mapped to the assembled euchromatic arms, we identified sequences that are specific to heterochromatin and used them to design heterochromatin specific probes (“H-probes”) for microarray. By comparative genomic hybridization (CGH) analyses of embryos deficient for each chromosome or chromosome arm, we were able to map most of our H-probes to specific chromosome arms. We also positioned sequences mapped to the second and X chromosomes to finer intervals by analyzing smaller deletions with breakpoints in heterochromatin. Using this approach, we were able to map >40% (13.9 Mb) of the previously unmapped heterochromatin sequences assembled by the whole-genome sequencing effort on arm U and arm Uextra to specific locations. We also identified and mapped 110 kb of novel heterochromatic sequences. Subsequent analyses revealed that sequences located within different heterochromatic regions have distinct properties, such as sequence composition, degree of repetitiveness, and level of underreplication in polytenized tissues. Surprisingly, although heterochromatin is generally considered to be transcriptionally silent, we detected region-specific temporal patterns of transcription in heterochromatin during oogenesis and early embryonic development. Our study provides a useful approach to elucidate the molecular organization and function of heterochromatin and reveals region-specific variation of heterochromatin. PMID:22745230

  3. A multi-scale method of mapping urban influence

    Science.gov (United States)

    Timothy G. Wade; James D. Wickham; Nicola Zacarelli; Kurt H. Riitters

    2009-01-01

    Urban development can impact environmental quality and ecosystem services well beyond urban extent. Many methods to map urban areas have been developed and used in the past, but most have simply tried to map existing extent of urban development, and all have been single-scale techniques. The method presented here uses a clustering approach to look beyond the extant...

  4. Power Laws, Scale-Free Networks and Genome Biology

    CERN Document Server

    Koonin, Eugene V; Karev, Georgy P

    2006-01-01

    Power Laws, Scale-free Networks and Genome Biology deals with crucial aspects of the theoretical foundations of systems biology, namely power law distributions and scale-free networks which have emerged as the hallmarks of biological organization in the post-genomic era. The chapters in the book not only describe the interesting mathematical properties of biological networks but also move beyond phenomenology, toward models of evolution capable of explaining the emergence of these features. The collection of chapters, contributed by both physicists and biologists, strives to address the problems in this field in a rigorous but not excessively mathematical manner and to represent different viewpoints, which is crucial in this emerging discipline. Each chapter includes, in addition to technical descriptions of properties of biological networks and evolutionary models, a more general and accessible introduction to the respective problems. Most chapters emphasize the potential of theoretical systems biology for disco...

  5. Split photosystem protein, linear-mapping topology, and growth of structural complexity in the plastid genome of Chromera velia

    KAUST Repository

    Janouškovec, Jan

    2013-08-22

    The canonical photosynthetic plastid genomes consist of a single circular-mapping chromosome that encodes a highly conserved protein core, involved in photosynthesis and ATP generation. Here, we demonstrate that the plastid genome of the photosynthetic relative of apicomplexans, Chromera velia, departs from this view in several unique ways. Core photosynthesis proteins PsaA and AtpB have been broken into two fragments, which we show are independently transcribed, oligoU-tailed, translated, and assembled into functional photosystem I and ATP synthase complexes. Genome-wide transcription profiles support expression of many other highly modified proteins, including several that contain extensions amounting to hundreds of amino acids in length. Canonical gene clusters and operons have been fragmented and reshuffled into novel putative transcriptional units. Massive genomic coverage by paired-end reads, coupled with pulsed-field gel electrophoresis and polymerase chain reaction, consistently indicate that the C. velia plastid genome is linear-mapping, a unique state among all plastids. Abundant intragenomic duplication probably mediated by recombination can explain protein splits, extensions, and genome linearization and is perhaps the key driving force behind the many features that defy the conventional ways of plastid genome architecture and function. © The Author 2013.

  6. Fluorescent In Situ Hybridization (FISH) on Pachytene Chromosomes as a Tool for Genome Characterization. In: Legume Genomics

    NARCIS (Netherlands)

    Geurts, R.; Jong, de J.H.S.G.M.

    2013-01-01

    A growing number of international genome consortia have initiated large-scale sequencing projects for most of the major crop species. This huge amount of information not only boosted genetic and physical mapping research, but it also enabled novel applications on the level of chromosome biology

  7. Elucidating the triplicated ancestral genome structure of radish based on chromosome-level comparison with the Brassica genomes.

    Science.gov (United States)

    Jeong, Young-Min; Kim, Namshin; Ahn, Byung Ohg; Oh, Mijin; Chung, Won-Hyong; Chung, Hee; Jeong, Seongmun; Lim, Ki-Byung; Hwang, Yoon-Jung; Kim, Goon-Bo; Baek, Seunghoon; Choi, Sang-Bong; Hyung, Dae-Jin; Lee, Seung-Won; Sohn, Seong-Han; Kwon, Soo-Jin; Jin, Mina; Seol, Young-Joo; Chae, Won Byoung; Choi, Keun Jin; Park, Beom-Seok; Yu, Hee-Ju; Mun, Jeong-Hwan

    2016-07-01

    This study presents a chromosome-scale draft genome sequence of radish that is assembled into nine chromosomal pseudomolecules. A comprehensive comparative genome analysis with the Brassica genomes provides genomic evidences on the evolution of the mesohexaploid radish genome. Radish (Raphanus sativus L.) is an agronomically important root vegetable crop and its origin and phylogenetic position in the tribe Brassiceae is controversial. Here we present a comprehensive analysis of the radish genome based on the chromosome sequences of R. sativus cv. WK10039. The radish genome was sequenced and assembled into 426.2 Mb spanning >98 % of the gene space, of which 344.0 Mb were integrated into nine chromosome pseudomolecules. Approximately 36 % of the genome was repetitive sequences and 46,514 protein-coding genes were predicted and annotated. Comparative mapping of the tPCK-like ancestral genome revealed that the radish genome has intermediate characteristics between the Brassica A/C and B genomes in the triplicated segments, suggesting an internal origin from the genus Brassica. The evolutionary characteristics shared between radish and other Brassica species provided genomic evidences that the current form of nine chromosomes in radish was rearranged from the chromosomes of hexaploid progenitor. Overall, this study provides a chromosome-scale draft genome sequence of radish as well as novel insight into evolution of the mesohexaploid genomes in the tribe Brassiceae.

  8. A saturated genetic linkage map of autotetraploid alfalfa (Medicago sativa L.) developed using genotyping-by-sequencing is highly syntenous with the Medicago truncatula genome.

    Science.gov (United States)

    Li, Xuehui; Wei, Yanling; Acharya, Ananta; Jiang, Qingzhen; Kang, Junmei; Brummer, E Charles

    2014-08-21

    A genetic linkage map is a valuable tool for quantitative trait locus mapping, map-based gene cloning, comparative mapping, and whole-genome assembly. Alfalfa, one of the most important forage crops in the world, is autotetraploid, allogamous, and highly heterozygous, characteristics that have impeded the construction of a high-density linkage map using traditional genetic marker systems. Using genotyping-by-sequencing (GBS), we constructed low-cost, reasonably high-density linkage maps for both maternal and paternal parental genomes of an autotetraploid alfalfa F1 population. The resulting maps contain 3591 single-nucleotide polymorphism markers on 64 linkage groups across both parents, with an average density of one marker per 1.5 and 1.0 cM for the maternal and paternal haplotype maps, respectively. Chromosome assignments were made based on homology of markers to the M. truncatula genome. Four linkage groups representing the four haplotypes of each alfalfa chromosome were assigned to each of the eight Medicago chromosomes in both the maternal and paternal parents. The alfalfa linkage groups were highly syntenous with M. truncatula, and clearly identified the known translocation between Chromosomes 4 and 8. In addition, a small inversion on Chromosome 1 was identified between M. truncatula and M. sativa. GBS enabled us to develop a saturated linkage map for alfalfa that greatly improved genome coverage relative to previous maps and that will facilitate investigation of genome structure. GBS could be used in breeding populations to accelerate molecular breeding in alfalfa. Copyright © 2014 Li et al.

  9. Scaling laws for mode lockings in circle maps

    International Nuclear Information System (INIS)

    Cvitanovic, P.; Shraiman, B.; Soederberg, B.

    1985-06-01

    The self-similar structure of mode lockings for circle maps is studied by means of the associated Farey trees. We investigate numerically several classes of scaling relations implicit in the Farey organization of mode lockings and discuss the extent to which they lead to universal scaling laws. (orig.)
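    As a rough illustration of the Farey organization mentioned in this record, the sketch below builds the first levels of a Farey tree with the standard mediant construction: between neighbouring fractions a/b and c/d it inserts (a+c)/(b+d). The abstract does not spell out the construction or the scaling relations, so this shows only the textbook tree of rational winding numbers, and the function name `farey_tree` is a hypothetical choice.

```python
from fractions import Fraction


def farey_tree(levels):
    """Return the fractions in [0, 1] present after `levels` mediant insertions."""
    seq = [(0, 1), (1, 1)]
    for _ in range(levels):
        nxt = [seq[0]]
        for (a, b), (c, d) in zip(seq, seq[1:]):
            nxt.append((a + c, b + d))  # mediant of the two neighbours
            nxt.append((c, d))
        seq = nxt
    return [Fraction(p, q) for p, q in seq]


for level in range(4):
    print(level, [str(f) for f in farey_tree(level)])
```

    Each new mediant corresponds to a mode-locked interval whose winding number lies between those of its two parents, which is the ordering the Farey tree encodes.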

  10. Rapid prototyping of microbial cell factories via genome-scale engineering.

    Science.gov (United States)

    Si, Tong; Xiao, Han; Zhao, Huimin

    2015-11-15

    Advances in reading, writing and editing genetic materials have greatly expanded our ability to reprogram biological systems at the resolution of a single nucleotide and on the scale of a whole genome. Such capacity has greatly accelerated the cycles of design, build and test to engineer microbes for efficient synthesis of fuels, chemicals and drugs. In this review, we summarize the emerging technologies that have been applied, or are potentially useful for genome-scale engineering in microbial systems. We will focus on the development of high-throughput methodologies, which may accelerate the prototyping of microbial cell factories. Copyright © 2014 Elsevier Inc. All rights reserved.

  11. Genome-wide prediction and functional validation of promoter motifs regulating gene expression in spore and infection stages of Phytophthora infestans.

    Directory of Open Access Journals (Sweden)

    Sourav Roy

    2013-03-01

    Full Text Available Most eukaryotic pathogens have complex life cycles in which gene expression networks orchestrate the formation of cells specialized for dissemination or host colonization. In the oomycete Phytophthora infestans, the potato late blight pathogen, major shifts in mRNA profiles during developmental transitions were identified using microarrays. We used those data with search algorithms to discover about 100 motifs that are over-represented in promoters of genes up-regulated in hyphae, sporangia, sporangia undergoing zoosporogenesis, swimming zoospores, or germinated cysts forming appressoria (infection structures). Most of the putative stage-specific transcription factor binding sites (TFBSs) thus identified had features typical of TFBSs such as position or orientation bias, palindromy, and conservation in related species. Each of six motifs tested in P. infestans transformants using the GUS reporter gene conferred the expected stage-specific expression pattern, and several were shown to bind nuclear proteins in gel-shift assays. Motifs linked to the appressoria-forming stage, including a functionally validated TFBS, were over-represented in promoters of genes encoding effectors and other pathogenesis-related proteins. To understand how promoter and genome architecture influence expression, we also mapped transcription patterns to the P. infestans genome assembly. Adjacent genes were not typically induced in the same stage, including genes transcribed in opposite directions from small intergenic regions, but co-regulated gene pairs occurred more than expected by random chance. These data help illuminate the processes regulating development and pathogenesis, and will enable future attempts to purify the cognate transcription factors.

  12. Genome-scale metabolic models as platforms for strain design and biological discovery.

    Science.gov (United States)

    Mienda, Bashir Sajo

    2017-07-01

    Genome-scale metabolic models (GEMs) have been developed and used to guide systems metabolic engineering strategies for strain design and development. This strategy has been used in the fermentative production of bio-based industrial chemicals and fuels from alternative carbon sources. However, computer-aided hypothesis building using established algorithms and software platforms for biological discovery can be integrated into the strain design pipeline to create superior strains of microorganisms for targeted biosynthetic goals. Here, I describe an integrated workflow strategy using GEMs for strain design and biological discovery. Specific case studies of strain design and biological discovery using the Escherichia coli genome-scale model are presented and discussed. The integrated workflow presented herein, when applied carefully, would help guide future design strategies for high-performance microbial strains that have existing and forthcoming genome-scale metabolic models.

  13. Genome-wide analysis of promoter architecture in Drosophila melanogaster

    Energy Technology Data Exchange (ETDEWEB)

    Hoskins, Roger A.; Landolin, Jane M.; Brown, James B.; Sandler, Jeremy E.; Takahashi, Hazuki; Lassmann, Timo; Yu, Charles; Booth, Benjamin W.; Zhang, Dayu; Wan, Kenneth H.; Yang, Li; Boley, Nathan; Andrews, Justen; Kaufman, Thomas C.; Graveley, Brenton R.; Bickel, Peter J.; Carninci, Piero; Carlson, Joseph W.; Celniker, Susan E.

    2010-10-20

    Core promoters are critical regions for gene regulation in higher eukaryotes. However, the boundaries of promoter regions, the relative rates of initiation at the transcription start sites (TSSs) distributed within them, and the functional significance of promoter architecture remain poorly understood. We produced a high-resolution map of promoters active in the Drosophila melanogaster embryo by integrating data from three independent and complementary methods: 21 million cap analysis of gene expression (CAGE) tags, 1.2 million RNA ligase mediated rapid amplification of cDNA ends (RLM-RACE) reads, and 50,000 cap-trapped expressed sequence tags (ESTs). We defined 12,454 promoters of 8037 genes. Our analysis indicates that, due to non-promoter-associated RNA background signal, previous studies have likely overestimated the number of promoter-associated CAGE clusters by fivefold. We show that TSS distributions form a complex continuum of shapes, and that promoters active in the embryo and adult have highly similar shapes in 95% of cases. This suggests that these distributions are generally determined by static elements such as local DNA sequence and are not modulated by dynamic signals such as histone modifications. Transcription factor binding motifs are differentially enriched as a function of promoter shape, and peaked promoter shape is correlated with both temporal and spatial regulation of gene expression. Our results contribute to the emerging view that core promoters are functionally diverse and control patterning of gene expression in Drosophila and mammals.

  14. Mapping the Ethics of Translational Genomics: Situating Return of Results and Navigating the Research-Clinical Divide

    Science.gov (United States)

    Wolf, Susan M.; Burke, Wylie; Koenig, Barbara A.

    2015-01-01

    Both bioethics and law have governed human genomics by distinguishing research from clinical practice. Yet the rise of translational genomics now makes this traditional dichotomy inadequate. This paper pioneers a new approach to the ethics of translational genomics. It maps the full range of ethical approaches needed, proposes a “layered” approach to determining the ethics framework for projects combining research and clinical care, and clarifies the key role that return of results can play in advancing translation. PMID:26479558

  15. Insertion Sequence-Caused Large Scale-Rearrangements in the Genome of Escherichia coli

    Science.gov (United States)

    2016-07-18

    affordable approach to genome-wide characterization of genetic variation in bacterial and eukaryotic genomes (1–3). In addition to small-scale ... Paired-End Reads), that uses a graph-based algorithm (27) capable of detecting most large-scale variation involving repetitive regions, including novel ... Avila, P., Grinsted, J. and De La Cruz, F. (1988) Analysis of the variable endpoints generated by one-ended transposition of Tn21. J. Bacteriol., 170

  16. QTL Mapping of Genome Regions Controlling Manganese Uptake in Lentil Seed

    Directory of Open Access Journals (Sweden)

    Duygu Ates

    2018-05-01

    Full Text Available This study evaluated Mn concentration in the seeds of 120 RILs of lentil developed from the cross “CDC Redberry” × “ILL7502”. Micronutrient analysis using atomic absorption spectrometry indicated mean seed manganese (Mn) concentrations ranging from 8.5 to 26.8 mg/kg, based on replicated field trials grown at three locations in Turkey in 2012 and 2013. A linkage map of lentil was constructed and consisted of seven linkage groups with 5,385 DNA markers. The total map length was 973.1 cM, with an average distance between markers of 0.18 cM. A total of 6 QTL for Mn concentration were identified using composite interval mapping (CIM). All QTL were statistically significant and explained 15.3–24.1% of the phenotypic variation, with LOD scores ranging from 3.00 to 4.42. The high-density genetic map reported in this study will increase fundamental knowledge of the genome structure of lentil, and will be the basis for the development of micronutrient-enriched lentil genotypes to support biofortification efforts.

  17. QTL Mapping of Genome Regions Controlling Manganese Uptake in Lentil Seed.

    Science.gov (United States)

    Ates, Duygu; Aldemir, Secil; Yagmur, Bulent; Kahraman, Abdullah; Ozkan, Hakan; Vandenberg, Albert; Tanyolac, Muhammed Bahattin

    2018-05-04

    This study evaluated Mn concentration in the seeds of 120 RILs of lentil developed from the cross "CDC Redberry" × "ILL7502". Micronutrient analysis using atomic absorption spectrometry indicated mean seed manganese (Mn) concentrations ranging from 8.5 to 26.8 mg/kg, based on replicated field trials grown at three locations in Turkey in 2012 and 2013. A linkage map of lentil was constructed and consisted of seven linkage groups with 5,385 DNA markers. The total map length was 973.1 cM, with an average distance between markers of 0.18 cM. A total of 6 QTL for Mn concentration were identified using composite interval mapping (CIM). All QTL were statistically significant and explained 15.3-24.1% of the phenotypic variation, with LOD scores ranging from 3.00 to 4.42. The high-density genetic map reported in this study will increase fundamental knowledge of the genome structure of lentil, and will be the basis for the development of micronutrient-enriched lentil genotypes to support biofortification efforts. Copyright © 2018 Ates et al.
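    The marker density quoted in this record can be checked with simple arithmetic. The short sketch below recomputes the average spacing from the map length and marker count given in the abstract; whether the authors divided by the number of markers or by the number of marker intervals is not stated, so both variants are shown as assumptions.

```python
map_length_cm = 973.1   # total map length from the abstract
n_markers = 5385        # DNA markers from the abstract
n_linkage_groups = 7    # linkage groups from the abstract

# Variant 1: map length divided by the number of markers.
spacing_per_marker = map_length_cm / n_markers

# Variant 2: map length divided by the number of intervals between
# adjacent markers (one fewer than the markers in each linkage group).
spacing_per_interval = map_length_cm / (n_markers - n_linkage_groups)

print(round(spacing_per_marker, 2), round(spacing_per_interval, 2))
```

    Both variants round to the 0.18 cM reported, so the stated figures are internally consistent.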

  18. A universal genomic coordinate translator for comparative genomics.

    Science.gov (United States)

    Zamani, Neda; Sundström, Görel; Meadows, Jennifer R S; Höppner, Marc P; Dainat, Jacques; Lantz, Henrik; Haas, Brian J; Grabherr, Manfred G

    2014-06-30

    Genomic duplications constitute major events in the evolution of species, allowing paralogous copies of genes to take on fine-tuned biological roles. The orthology relationship between copies across multiple genomes can be resolved unambiguously by synteny, i.e. the conserved order of genomic sequences. However, a comprehensive analysis of duplication events and their contributions to evolution would require all-to-all genome alignments, the number of which increases as N² with the number of available genomes, N. Here, we introduce Kraken, software that omits the all-to-all requirement by recursively traversing a graph of pairwise alignments and dynamically re-computing orthology. Kraken scales linearly with the number of targeted genomes, N, which allows for including large numbers of genomes in analyses. We first evaluated the method on the set of 12 Drosophila genomes, finding that orthologous correspondence computed indirectly through a graph of multiple synteny maps comes at minimal cost in terms of sensitivity, but reduces overall computational runtime by an order of magnitude. We then used the method on three well-annotated mammalian genomes, human, mouse, and rat, and show that up to 93% of protein coding transcripts have unambiguous pairwise orthologous relationships across the genomes. On a nucleotide level, 70 to 83% of exons match exactly at both splice junctions, and up to 97% on at least one junction. We last applied Kraken to an RNA-sequencing dataset from multiple vertebrates and diverse tissues, where we confirmed that brain-specific gene family members, i.e. one-to-many or many-to-many homologs, are more highly correlated across species than single-copy (i.e. one-to-one homologous) genes. Not limited to protein coding genes, Kraken also identifies thousands of newly identified transcribed loci, likely non-coding RNAs that are consistently transcribed in human, chimpanzee and gorilla, and maintain significant correlation of expression levels across
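    The idea of translating coordinates indirectly through a graph of pairwise alignments, rather than aligning every genome to every other, can be sketched as a breadth-first search. This is a toy illustration under strong assumptions (ungapped, same-strand blocks; invented coordinates) and is not Kraken's actual algorithm, data model, or interface; every name below (`BLOCKS`, `build_graph`, `translate`) is hypothetical.

```python
from collections import deque

# Hypothetical pairwise synteny blocks:
# (genome_a, start_a, genome_b, start_b, length), ungapped and same-strand.
BLOCKS = [
    ("human", 1_000, "mouse", 5_000, 500),
    ("mouse", 5_200, "rat", 9_000, 400),
]


def build_graph(blocks):
    """Index each block under both genomes so it can be followed either way."""
    graph = {}
    for a, start_a, b, start_b, length in blocks:
        graph.setdefault(a, []).append((start_a, b, start_b, length))
        graph.setdefault(b, []).append((start_b, a, start_a, length))
    return graph


def translate(graph, source, pos, target):
    """Follow chains of pairwise blocks from (source, pos) until target is reached."""
    queue = deque([(source, pos)])
    seen = {(source, pos)}
    while queue:
        genome, p = queue.popleft()
        if genome == target:
            return p
        for start, other, other_start, length in graph.get(genome, []):
            if start <= p < start + length:
                nxt = (other, other_start + (p - start))
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return None  # no chain of blocks covers this position


graph = build_graph(BLOCKS)
print(translate(graph, "human", 1_300, "rat"))  # reached via mouse
print(translate(graph, "human", 1_100, "rat"))  # no covering chain
```

    With N genomes, a connected graph needs as few as N-1 pairwise maps, whereas all-to-all alignment needs N(N-1)/2, which is the scaling contrast the abstract draws.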

  19. Imaginal discs--a new source of chromosomes for genome mapping of the yellow fever mosquito Aedes aegypti.

    Directory of Open Access Journals (Sweden)

    Maria V Sharakhova

    2011-10-01

    Full Text Available The mosquito Aedes aegypti is the primary global vector for dengue and yellow fever viruses. Sequencing of the Ae. aegypti genome has stimulated research in vector biology and insect genomics. However, the current genome assembly is highly fragmented, with only ~31% of the genome being assigned to chromosomes. A lack of a reliable source of chromosomes for physical mapping has been a major impediment to improving the genome assembly of Ae. aegypti. In this study we demonstrate the utility of mitotic chromosomes from imaginal discs of 4th instar larvae for cytogenetic studies of Ae. aegypti. High numbers of mitotic divisions on each slide preparation, large sizes, and reproducible banding patterns of the individual chromosomes simplify cytogenetic procedures. Based on the banding structure of the chromosomes, we have developed idiograms for each of the three Ae. aegypti chromosomes and placed 10 BAC clones and an 18S rDNA probe to precise chromosomal positions. The study identified imaginal discs of 4th instar larvae as a superior source of mitotic chromosomes for Ae. aegypti. The proposed approach allows precise mapping of DNA probes to the chromosomal positions and can be utilized for obtaining a high-quality genome assembly of the yellow fever mosquito.

  20. Mapping Second Chromosome Mutations to Defined Genomic Regions in Drosophila melanogaster.

    Science.gov (United States)

    Kahsai, Lily; Cook, Kevin R

    2018-01-04

    Hundreds of Drosophila melanogaster stocks are currently maintained at the Bloomington Drosophila Stock Center with mutations that have not been associated with sequence-defined genes. They have been preserved because they have interesting loss-of-function phenotypes. The experimental value of these mutations would be increased by tying them to specific genomic intervals so that geneticists can more easily associate them with annotated genes. Here, we report the mapping of 85 second chromosome complementation groups in the Bloomington collection to specific, small clusters of contiguous genes or individual genes in the sequenced genome. This information should prove valuable to Drosophila geneticists interested in processes associated with particular phenotypes and those searching for mutations affecting specific sequence-defined genes. Copyright © 2018 Kahsai, Cook.

  1. Mapping Second Chromosome Mutations to Defined Genomic Regions in Drosophila melanogaster

    Directory of Open Access Journals (Sweden)

    Lily Kahsai

    2018-01-01

    Full Text Available Hundreds of Drosophila melanogaster stocks are currently maintained at the Bloomington Drosophila Stock Center with mutations that have not been associated with sequence-defined genes. They have been preserved because they have interesting loss-of-function phenotypes. The experimental value of these mutations would be increased by tying them to specific genomic intervals so that geneticists can more easily associate them with annotated genes. Here, we report the mapping of 85 second chromosome complementation groups in the Bloomington collection to specific, small clusters of contiguous genes or individual genes in the sequenced genome. This information should prove valuable to Drosophila geneticists interested in processes associated with particular phenotypes and those searching for mutations affecting specific sequence-defined genes.

  2. Digital geologic map in the scale 1:50 000

    International Nuclear Information System (INIS)

    Kacer, S.; Antalik, M.

    2005-01-01

    In this presentation, the authors describe the preparation of a new digital geologic map of the Slovak Republic. The map is being prepared by the State Geological Institute of Dionyz Stur as part of the Geological Information System (GeoIS) project. One of the basic geologic information layers that will be accessible on the website is the digital geologic map of the Slovak Republic at a scale of 1:50 000.

  3. Finding Nemo's Genes: A chromosome-scale reference assembly of the genome of the orange clownfish Amphiprion percula

    KAUST Repository

    Lehmann, Robert; Lightfoot, Damien J; Schunter, Celia Marei; Michell, Craig T; Ohyanagi, Hajime; Mineta, Katsuhiko; Foret, Sylvain; Berumen, Michael L.; Miller, David J; Aranda, Manuel; Gojobori, Takashi; Munday, Philip L; Ravasi, Timothy

    2018-01-01

    The iconic orange clownfish, Amphiprion percula, is a model organism for studying the ecology and evolution of reef fishes, including patterns of population connectivity, sex change, social organization, habitat selection and adaptation to climate change. Notably, the orange clownfish is the only reef fish for which a complete larval dispersal kernel has been established and was the first fish species for which it was demonstrated that anti-predator responses of reef fishes could be impaired by ocean acidification. Despite its importance, molecular resources for this species remain scarce and until now it lacked a reference genome assembly. Here we present a de novo chromosome-scale assembly of the genome of the orange clownfish Amphiprion percula. We utilized single-molecule real-time sequencing technology from Pacific Biosciences to produce an initial polished assembly comprised of 1,414 contigs, with a contig N50 length of 1.86 Mb. Using Hi-C based chromatin contact maps, 98% of the genome assembly were placed into 24 chromosomes, resulting in a final assembly of 908.8 Mb in length with contig and scaffold N50s of 3.12 and 38.4 Mb, respectively. This makes it one of the most contiguous and complete fish genome assemblies currently available. The genome was annotated with 26,597 protein coding genes and contains 96% of the core set of conserved actinopterygian orthologs. The availability of this reference genome assembly as a community resource will further strengthen the role of the orange clownfish as a model species for research on the ecology and evolution of reef fishes.
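    The contig and scaffold N50 values quoted above are summary statistics of assembly contiguity: the N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly. A minimal sketch of that calculation follows; the function name and the toy lengths are illustrative assumptions and are not taken from the clownfish assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig (or scaffold) lengths."""
    if not lengths:
        raise ValueError("need at least one length")
    ordered = sorted(lengths, reverse=True)   # longest sequences first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:             # half of the assembly is now covered
            return length


# Made-up contig lengths (arbitrary units), for illustration only.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))
```

    Because the statistic depends only on the sorted lengths, the same function applies to contigs and to scaffolds.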

  4. Finding Nemo's Genes: A chromosome-scale reference assembly of the genome of the orange clownfish Amphiprion percula

    KAUST Repository

    Lehmann, Robert

    2018-03-08

    The iconic orange clownfish, Amphiprion percula, is a model organism for studying the ecology and evolution of reef fishes, including patterns of population connectivity, sex change, social organization, habitat selection and adaptation to climate change. Notably, the orange clownfish is the only reef fish for which a complete larval dispersal kernel has been established and was the first fish species for which it was demonstrated that anti-predator responses of reef fishes could be impaired by ocean acidification. Despite its importance, molecular resources for this species remain scarce and until now it lacked a reference genome assembly. Here we present a de novo chromosome-scale assembly of the genome of the orange clownfish Amphiprion percula. We utilized single-molecule real-time sequencing technology from Pacific Biosciences to produce an initial polished assembly comprised of 1,414 contigs, with a contig N50 length of 1.86 Mb. Using Hi-C based chromatin contact maps, 98% of the genome assembly were placed into 24 chromosomes, resulting in a final assembly of 908.8 Mb in length with contig and scaffold N50s of 3.12 and 38.4 Mb, respectively. This makes it one of the most contiguous and complete fish genome assemblies currently available. The genome was annotated with 26,597 protein coding genes and contains 96% of the core set of conserved actinopterygian orthologs. The availability of this reference genome assembly as a community resource will further strengthen the role of the orange clownfish as a model species for research on the ecology and evolution of reef fishes.

  5. Genome-wide high-resolution mapping of UV-induced mitotic recombination events in Saccharomyces cerevisiae.

    Directory of Open Access Journals (Sweden)

    Yi Yin

    2013-10-01

    In the yeast Saccharomyces cerevisiae and most other eukaryotes, mitotic recombination is important for the repair of double-stranded DNA breaks (DSBs). Mitotic recombination between homologous chromosomes can result in loss of heterozygosity (LOH). In this study, LOH events induced by ultraviolet (UV) light are mapped throughout the genome to a resolution of about 1 kb using single-nucleotide polymorphism (SNP) microarrays. UV doses that have little effect on the viability of diploid cells stimulate crossovers more than 1000-fold in wild-type cells. In addition, UV stimulates recombination in G1-synchronized cells about 10-fold more efficiently than in G2-synchronized cells. Importantly, at high doses of UV, most conversion events reflect the repair of two sister chromatids that are broken at approximately the same position whereas at low doses, most conversion events reflect the repair of a single broken chromatid. Genome-wide mapping of about 380 unselected crossovers, break-induced replication (BIR) events, and gene conversions shows that UV-induced recombination events occur throughout the genome without pronounced hotspots, although the ribosomal RNA gene cluster has a significantly lower frequency of crossovers.

  6. Visualization for genomics: the Microbial Genome Viewer.

    Science.gov (United States)

    Kerkhoven, Robert; van Enckevort, Frank H J; Boekhorst, Jos; Molenaar, Douwe; Siezen, Roland J

    2004-07-22

    A Web-based visualization tool, the Microbial Genome Viewer, is presented that allows the user to combine complex genomic data in a highly interactive way. This Web tool enables the interactive generation of chromosome wheels and linear genome maps from genome annotation data stored in a MySQL database. The generated images are in scalable vector graphics (SVG) format, which is suitable for creating high-quality scalable images and dynamic Web representations. Gene-related data such as transcriptome and time-course microarray experiments can be superimposed on the maps for visual inspection. The Microbial Genome Viewer 1.0 is freely available at http://www.cmbi.kun.nl/MGV

  7. Proteogenomic mapping of Mycoplasma hyopneumoniae virulent strain 232.

    Science.gov (United States)

    Pendarvis, Ken; Padula, Matthew P; Tacchi, Jessica L; Petersen, Andrew C; Djordjevic, Steven P; Burgess, Shane C; Minion, F Chris

    2014-07-08

    Mycoplasma hyopneumoniae causes respiratory disease in swine and contributes to the porcine respiratory disease complex, a major disease problem in the swine industry. The M. hyopneumoniae strain 232 genome is one of the smallest and best annotated microbial genomes, containing only 728 annotated genes and 691 known proteins. Standard protein databases for mass spectrometry only allow for the identification of known and predicted proteins, which if incorrect can limit our understanding of the biological processes at work. Proteogenomic mapping is a methodology which allows the entire 6-frame genome translation of an organism to be used as a mass spectrometry database to help identify unknown proteins as well as correct and confirm existing annotations. This methodology will be employed to perform an in-depth analysis of the M. hyopneumoniae proteome. Proteomic analysis indicates 483 of 691 (70%) known M. hyopneumoniae strain 232 proteins are expressed under the culture conditions given in this study. Furthermore, 171 of 328 (52%) hypothetical proteins have been confirmed. Proteogenomic mapping resulted in the identification of previously unannotated genes gatC and rpmF and 5-prime extensions to genes mhp063, mhp073, and mhp451, all conserved and annotated in other M. hyopneumoniae strains and Mycoplasma species. Gene prediction with Prodigal, a prokaryotic gene predicting program, completely supports the new genomic coordinates calculated using proteogenomic mapping. Proteogenomic mapping showed that the protein coding genes of the M. hyopneumoniae strain 232 identified in this study are well annotated. Only 1.8% of mapped peptides did not correspond to genes defined by the current genome annotation. This study also illustrates how proteogenomic mapping can be an important tool to help confirm, correct and append known gene models when using a genome sequence as search space for peptide mass spectra. 
Using a gene prediction program which scans for a wide variety of
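    The six-frame translation that proteogenomic mapping uses as its search space can be sketched in a few lines. The example below is only an illustration under stated assumptions, not the pipeline used in the study: the function names and the toy sequence are hypothetical, and the second codon table simply reassigns TGA to tryptophan, as in the genetic code used by Mycoplasma (translation table 4).

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {a + b + c: aa
            for (a, b, c), aa in zip(product(BASES, repeat=3), AMINO)}
# Mycoplasma reads TGA as tryptophan instead of stop (translation table 4).
MYCOPLASMA = {**STANDARD, "TGA": "W"}


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq, table):
    """Translate complete codons only; unknown codons become 'X'."""
    return "".join(table.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))


def six_frame(seq, table=STANDARD):
    """Return the three forward and three reverse reading-frame translations."""
    seq = seq.upper()
    frames = {}
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(s[offset:], table)
    return frames


# Toy sequence, for illustration only.
for name, peptide in six_frame("ATGGCCTGA", MYCOPLASMA).items():
    print(name, peptide)
```

    In a real search the translated frames would be written to a FASTA file and supplied to the mass spectrometry search engine, so that peptides matching outside annotated genes can be mapped back to genomic coordinates.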

  8. A saturated SSR/DArT linkage map of Musa acuminata addressing genome rearrangements among bananas.

    Science.gov (United States)

    Hippolyte, Isabelle; Bakry, Frederic; Seguin, Marc; Gardes, Laetitia; Rivallan, Ronan; Risterucci, Ange-Marie; Jenny, Christophe; Perrier, Xavier; Carreel, Françoise; Argout, Xavier; Piffanelli, Pietro; Khan, Imtiaz A; Miller, Robert N G; Pappas, Georgios J; Mbéguié-A-Mbéguié, Didier; Matsumoto, Takashi; De Bernardinis, Veronique; Huttner, Eric; Kilian, Andrzej; Baurens, Franc-Christophe; D'Hont, Angélique; Cote, François; Courtois, Brigitte; Glaszmann, Jean-Christophe

    2010-04-13

    The genus Musa is a large species complex which includes cultivars at diploid and triploid levels. These sterile and vegetatively propagated cultivars are based on the A genome from Musa acuminata, exclusively for sweet bananas such as Cavendish, or associated with the B genome (Musa balbisiana) in cooking bananas such as Plantain varieties. In M. acuminata cultivars, structural heterozygosity is thought to be one of the main causes of sterility, which is essential for obtaining seedless fruits but hampers breeding. Only partial genetic maps are presently available due to chromosomal rearrangements within the parents of the mapping populations. This causes large segregation distortions inducing pseudo-linkages and difficulties in ordering markers in the linkage groups. The present study aims at producing a saturated linkage map of M. acuminata, taking into account hypotheses on the structural heterozygosity of the parents. An F1 progeny of 180 individuals was obtained from a cross between two genetically distant accessions of M. acuminata, 'Borneo' and 'Pisang Lilin' (P. Lilin). Based on the gametic recombination of each parent, two parental maps composed of SSR and DArT markers were established. A significant proportion of the markers (21.7%) deviated (p […] We propose a synthetic map with 11 linkage groups containing 489 markers (167 SSRs and 322 DArTs) covering 1197 cM. This first saturated map is proposed as a "reference Musa map" for further analyses. We also propose two complete parental maps with interpretations of structural rearrangements localized on the linkage groups. The structural heterozygosity in P. Lilin is hypothesized to result from a duplication likely accompanied by an inversion on another chromosome. This paper also illustrates a methodological approach, transferable to other species, to investigate the mapping of structural rearrangements and determine their consequences on marker segregation.

  9. Update of the Large-scale Concentration Maps for the Netherlands (GCN)

    International Nuclear Information System (INIS)

    Van den Elshout, S.; Molenaar, R.

    2011-01-01

    Every year the RIVM and PBL publish the so-called Large-scale concentration maps of the Netherlands (GCN maps). These maps offer an approximation of the background concentrations of several air-polluting substances. Sometimes these maps need to be updated to realize a better approximation of the background concentrations.

  10. TIGER: Toolbox for integrating genome-scale metabolic models, expression data, and transcriptional regulatory networks

    Directory of Open Access Journals (Sweden)

    Jensen Paul A

    2011-09-01

    Background: Several methods have been developed for analyzing genome-scale models of metabolism and transcriptional regulation. Many of these methods, such as Flux Balance Analysis, use constrained optimization to predict relationships between metabolic flux and the genes that encode and regulate enzyme activity. Recently, mixed integer programming has been used to encode these gene-protein-reaction (GPR) relationships into a single optimization problem, but these techniques are often of limited generality and lack a tool for automating the conversion of rules to a coupled regulatory/metabolic model. Results: We present TIGER, a Toolbox for Integrating Genome-scale Metabolism, Expression, and Regulation. TIGER converts a series of generalized, Boolean or multilevel rules into a set of mixed integer inequalities. The package also includes implementations of existing algorithms to integrate high-throughput expression data with genome-scale models of metabolism and transcriptional regulation. We demonstrate how TIGER automates the coupling of a genome-scale metabolic model with GPR logic and models of transcriptional regulation, thereby serving as a platform for algorithm development and large-scale metabolic analysis. Additionally, we demonstrate how TIGER's algorithms can be used to identify inconsistencies and improve existing models of transcriptional regulation with examples from the reconstructed transcriptional regulatory network of Saccharomyces cerevisiae. Conclusion: The TIGER package provides a consistent platform for algorithm development and extending existing genome-scale metabolic models with regulatory networks and high-throughput data.
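    The conversion of Boolean rules into mixed integer inequalities can be illustrated with the textbook linearization of AND and OR over binary variables. The sketch below is not TIGER's own code or necessarily its exact formulation; the function names are hypothetical, and the check simply enumerates all binary inputs to confirm that the inequalities admit exactly the Boolean result.

```python
from itertools import product


def and_constraints(x, y, z):
    """Linear inequalities that force z = x AND y for binary x, y, z."""
    return z <= x and z <= y and z >= x + y - 1


def or_constraints(x, y, z):
    """Linear inequalities that force z = x OR y for binary x, y, z."""
    return z >= x and z >= y and z <= x + y


def feasible_z(constraints, x, y):
    """List the binary values of z that satisfy the inequalities."""
    return [z for z in (0, 1) if constraints(x, y, z)]


# Enumerate every binary input and compare with the Boolean operators.
for x, y in product((0, 1), repeat=2):
    assert feasible_z(and_constraints, x, y) == [x & y]
    assert feasible_z(or_constraints, x, y) == [x | y]
    print(x, y, feasible_z(and_constraints, x, y), feasible_z(or_constraints, x, y))
```

    In a GPR setting, x and y would stand for the presence of two genes and z for the activity of the reaction they enable: an enzyme complex corresponds to the AND rule and a pair of isozymes to the OR rule.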

  11. A haplotype map of genomic variations and genome-wide association studies of agronomic traits in foxtail millet (Setaria italica).

    Science.gov (United States)

    Jia, Guanqing; Huang, Xuehui; Zhi, Hui; Zhao, Yan; Zhao, Qiang; Li, Wenjun; Chai, Yang; Yang, Lifang; Liu, Kunyan; Lu, Hengyun; Zhu, Chuanrang; Lu, Yiqi; Zhou, Congcong; Fan, Danlin; Weng, Qijun; Guo, Yunli; Huang, Tao; Zhang, Lei; Lu, Tingting; Feng, Qi; Hao, Hangfei; Liu, Hongkuan; Lu, Ping; Zhang, Ning; Li, Yuhui; Guo, Erhu; Wang, Shujun; Wang, Suying; Liu, Jinrong; Zhang, Wenfei; Chen, Guoqiu; Zhang, Baojin; Li, Wei; Wang, Yongfang; Li, Haiquan; Zhao, Baohua; Li, Jiayang; Diao, Xianmin; Han, Bin

    2013-08-01

    Foxtail millet (Setaria italica) is an important grain crop that is grown in arid regions. Here we sequenced 916 diverse foxtail millet varieties, identified 2.58 million SNPs and used 0.8 million common SNPs to construct a haplotype map of the foxtail millet genome. We classified the foxtail millet varieties into two divergent groups that are strongly correlated with early and late flowering times. We phenotyped the 916 varieties under five different environments and identified 512 loci associated with 47 agronomic traits by genome-wide association studies. We performed a de novo assembly of deeply sequenced genomes of a Setaria viridis accession (the wild progenitor of S. italica) and an S. italica variety and identified complex interspecies and intraspecies variants. We also identified 36 selective sweeps that seem to have occurred during modern breeding. This study provides fundamental resources for genetics research and genetic improvement in foxtail millet.

  12. Enriching the national map database for multi-scale use: Introducing the visibilityfilter attribution

    Science.gov (United States)

    Stauffer, Andrew J.; Webinger, Seth; Roche, Brittany

    2016-01-01

    The US Geological Survey’s (USGS) National Geospatial Technical Operations Center is prototyping and evaluating the ability to filter data through a range of scales using 1:24,000-scale The National Map (TNM) datasets as the source. A “VisibilityFilter” attribute is under evaluation that can be added to all TNM vector data themes and will permit filtering of data to eight target scales between 1:24,000 and 1:5,000,000, thus defining each feature’s smallest applicable scale-of-use. For a prototype implementation, map specifications for 1:100,000- and 1:250,000-scale USGS Topographic Map Series are being utilized to define feature content appropriate at fixed mapping scales to guide generalization decisions that are documented in a ScaleMaster diagram. This paper defines the VisibilityFilter attribute, the generalization decisions made for each TNM data theme, and how these decisions are embedded into the data to support efficient data filtering.

  13. The geological map of Canelones Department scale 1:100.000

    International Nuclear Information System (INIS)

    Spoturno, J.; Oyhantcabal, P.; Goso, C.; Aubet, N.; Cazaux, S.; Huelmo, S.; Morales, E.; Loureiro, J.

    2004-01-01

    The geological map of Canelones Department (Uruguay), scale 1:100.000, is presented. This map shows the distribution of the Proterozoic, Mesozoic and Cenozoic lithological units. A stratigraphic division of this region is included.

  14. The geological map of Montevideo Department scale 1:50.000

    International Nuclear Information System (INIS)

    Spoturno, J.; Oyhantcabal, P.; Goso, C.; Aubet, N.; Cazaux, S.; Huelmo, S.; Morales, E.; Loureiro, J.

    2004-01-01

    The geological map of Montevideo Department (Uruguay), scale 1:50.000, is presented. This map shows the distribution of the Proterozoic, Mesozoic and Cenozoic lithological units. A stratigraphic division of this region is included.

  15. Assembly of the Genome of the Disease Vector Aedes aegypti onto a Genetic Linkage Map Allows Mapping of Genes Affecting Disease Transmission

    KAUST Repository

    Juneja, Punita; Osei-Poku, Jewelna; Ho, Yung S.; Ariani, Cristina V.; Palmer, William J.; Pain, Arnab; Jiggins, Francis M.

    2014-01-01

    between two strains of Ae. aegypti, and used these to generate a genetic map. This revealed a high rate of misassemblies in the current genome, where, for example, sequences from different chromosomes were found on the same scaffold. Once these were

  16. Segmental allotetraploidy and allelic interactions in buffelgrass (Pennisetum ciliare (L.) Link syn. Cenchrus ciliaris L.) as revealed by genome mapping.

    Science.gov (United States)

    Jessup, R W; Burson, B L; Burow, O; Wang, Y W; Chang, C; Li, Z; Paterson, A H; Hussey, M A

    2003-04-01

    Linkage analyses increasingly complement cytological and traditional plant breeding techniques by providing valuable information regarding genome organization and transmission genetics of complex polyploid species. This study reports a genome map of buffelgrass (Pennisetum ciliare (L.) Link syn. Cenchrus ciliaris L.). Maternal and paternal maps were constructed with restriction fragment length polymorphisms (RFLPs) segregating in 87 F1 progeny from an intraspecific cross between two heterozygous genotypes. A survey of 862 heterologous cDNAs and gDNAs from across the Poaceae, as well as 443 buffelgrass cDNAs, yielded 100 and 360 polymorphic probes, respectively. The maternal map included 322 RFLPs, 47 linkage groups, and 3464 cM, whereas the paternal map contained 245 RFLPs, 42 linkage groups, and 2757 cM. Approximately 70 to 80% of the buffelgrass genome was covered, and the average marker spacing was 10.8 and 11.3 cM on the respective maps. Preferential pairing was indicated between many linkage groups, which supports cytological reports that buffelgrass is a segmental allotetraploid. More preferential pairing (disomy) was found in the maternal than paternal parent across linkage groups (55 vs. 38%) and loci (48 vs. 15%). Comparison of interval lengths in 15 allelic bridges indicated significantly less meiotic recombination in paternal gametes. Allelic interactions were detected in four regions of the maternal map and were absent in the paternal map.

  17. A saturated SSR/DArT linkage map of Musa acuminata addressing genome rearrangements among bananas

    Directory of Open Access Journals (Sweden)

    Matsumoto Takashi

    2010-04-01

    Background: The genus Musa is a large species complex which includes cultivars at diploid and triploid levels. These sterile and vegetatively propagated cultivars are based on the A genome from Musa acuminata, exclusively for sweet bananas such as Cavendish, or associated with the B genome (Musa balbisiana) in cooking bananas such as Plantain varieties. In M. acuminata cultivars, structural heterozygosity is thought to be one of the main causes of sterility, which is essential for obtaining seedless fruits but hampers breeding. Only partial genetic maps are presently available due to chromosomal rearrangements within the parents of the mapping populations. This causes large segregation distortions inducing pseudo-linkages and difficulties in ordering markers in the linkage groups. The present study aims at producing a saturated linkage map of M. acuminata, taking into account hypotheses on the structural heterozygosity of the parents. Results: An F1 progeny of 180 individuals was obtained from a cross between two genetically distant accessions of M. acuminata, 'Borneo' and 'Pisang Lilin' (P. Lilin). Based on the gametic recombination of each parent, two parental maps composed of SSR and DArT markers were established. A significant proportion of the markers (21.7%) deviated (p […] Conclusions: We propose a synthetic map with 11 linkage groups containing 489 markers (167 SSRs and 322 DArTs) covering 1197 cM. This first saturated map is proposed as a "reference Musa map" for further analyses. We also propose two complete parental maps with interpretations of structural rearrangements localized on the linkage groups. The structural heterozygosity in P. Lilin is hypothesized to result from a duplication likely accompanied by an inversion on another chromosome. This paper also illustrates a methodological approach, transferable to other species, to investigate the mapping of structural rearrangements and determine their consequences on marker

  18. Heterologous and endogenous U6 snRNA promoters enable CRISPR/Cas9 mediated genome editing in Aspergillus niger.

    Science.gov (United States)

    Zheng, Xiaomei; Zheng, Ping; Sun, Jibin; Kun, Zhang; Ma, Yanhe

    2018-01-01

    U6 promoters have been used for single guide RNA (sgRNA) transcription in the clustered regularly interspaced short palindromic repeats/CRISPR-associated protein (CRISPR/Cas9) genome editing system. However, no available U6 promoters have been identified in Aspergillus niger, which is an important industrial platform for organic acid and protein production. Two CRISPR/Cas9 systems established in A. niger have recourse to the RNA polymerase II promoter or in vitro transcription for sgRNA synthesis, but these approaches generally increase cloning efforts and genetic manipulation. The validation of functional RNA polymerase II promoters is therefore an urgent need for A. niger. Here, we developed a novel CRISPR/Cas9 system in A. niger for sgRNA expression, based on one endogenous U6 promoter and two heterologous U6 promoters. The three tested U6 promoters enabled sgRNA transcription and the disruption of the polyketide synthase albA gene in A. niger. Furthermore, this system enabled highly efficient gene insertion at the targeted genome loci in A. niger using donor DNAs with homologous arms as short as 40 bp. This study demonstrated that both heterologous and endogenous U6 promoters were functional for sgRNA expression in A. niger. Based on this result, a novel and simple CRISPR/Cas9 toolbox was established in A. niger, that will benefit future gene functional analysis and genome editing.

  19. PolyTB: A genomic variation map for Mycobacterium tuberculosis

    KAUST Repository

    Coll, Francesc

    2014-02-15

    Tuberculosis (TB) caused by Mycobacterium tuberculosis (Mtb) is the second major cause of death from an infectious disease worldwide. Recent advances in DNA sequencing are leading to the ability to generate whole genome information in clinical isolates of M. tuberculosis complex (MTBC). The identification of informative genetic variants such as phylogenetic markers and those associated with drug resistance or virulence will help barcode Mtb in the context of epidemiological, diagnostic and clinical studies. Mtb genomic datasets are increasingly available as raw sequences, which are potentially difficult and computer intensive to process, and compare across studies. Here we have processed the raw sequence data (>1500 isolates, eight studies) to compile a catalogue of SNPs (n = 74,039, 63% non-synonymous, 51.1% in more than one isolate, i.e. non-private), small indels (n = 4810) and larger structural variants (n = 800). We have developed the PolyTB web-based tool (http://pathogenseq.lshtm.ac.uk/polytb) to visualise the resulting variation and important meta-data (e.g. in silico inferred strain-types, location) within geographical map and phylogenetic views. This resource will allow researchers to identify polymorphisms within candidate genes of interest, as well as examine the genomic diversity and distribution of strains. PolyTB source code is freely available to researchers wishing to develop similar tools for their pathogen of interest. 2014 Elsevier Ltd. All rights reserved.

  20. Dynamic nucleosome organization at hox promoters during zebrafish embryogenesis.

    Directory of Open Access Journals (Sweden)

    Steven E Weicksel

    Nucleosome organization at promoter regions plays an important role in regulating gene activity. Genome-wide studies in yeast, flies, worms, mammalian embryonic stem cells and transformed cell lines have found well-positioned nucleosomes flanking a nucleosome depleted region (NDR) at transcription start sites. This nucleosome arrangement depends on DNA sequence (cis-elements) as well as DNA binding factors and ATP-dependent chromatin modifiers (trans-factors). However, little is understood about how the nascent embryonic genome positions nucleosomes during development. This is particularly intriguing since the embryonic genome must undergo a broad reprogramming event upon fusion of sperm and oocyte. Using four stages of early embryonic zebrafish development, we map nucleosome positions at the promoter region of 37 zebrafish hox genes. We find that nucleosome arrangement at the hox promoters is a progressive process that takes place over several stages. At stages immediately after fertilization, nucleosomes appear to be largely disordered at hox promoter regions. At stages after activation of the embryonic genome, nucleosomes are detectable at hox promoters, with positions becoming more uniform and more highly occupied. Since the genomic sequence is invariant during embryogenesis, this progressive change in nucleosome arrangement suggests that trans-factors play an important role in organizing nucleosomes during embryogenesis. Separating hox genes into expressed and non-expressed groups shows that expressed promoters have better positioned and occupied nucleosomes, as well as distinct NDRs, than non-expressed promoters. Finally, by blocking the retinoic acid-signaling pathway, we disrupt early hox gene transcription, but observe no effect on nucleosome positions, suggesting that active hox transcription is not a driving force behind the arrangement of nucleosomes at the promoters of hox genes during early development.

  1. Chromosome mapping by FISH to metaphase and interphase nuclei. Final report

    Energy Technology Data Exchange (ETDEWEB)

    Trask, B.

    1997-08-01

    The overall specific aims of this project were: (1) to determine the large-scale structure of interphase and metaphase chromosomes, in order to establish new capabilities for genome mapping by fluorescence in situ hybridization (FISH); (2) to detect chromosome abnormalities associated with genetic disease and map DNA sequences relative to them in order to facilitate the identification of new genes with disease-causing mutations; (3) to establish medium resolution physical maps of selected chromosomal regions using a combined metaphase and interphase mapping strategy and to corroborate physical and genetic maps and integrate these maps with the cytogenetic map; (4) to analyze the polymorphism and sequence evolution of subtelomeric regions of human chromosomes; (5) to establish a state-of-the-art FISH and image processing facility in the Department of Molecular Biotechnology, University of Washington, in order to map DNA sequences rapidly and accurately to benefit the Human Genome Project.

  2. Decoding Synteny Blocks and Large-Scale Duplications in Mammalian and Plant Genomes

    Science.gov (United States)

    Peng, Qian; Alekseyev, Max A.; Tesler, Glenn; Pevzner, Pavel A.

    The existing synteny block reconstruction algorithms use anchors (e.g., orthologous genes) shared over all genomes to construct the synteny blocks for multiple genomes. This approach, while efficient for a few genomes, cannot be scaled to address the need to construct synteny blocks in many mammalian genomes that are currently being sequenced. The problem is that the number of anchors shared among all genomes quickly decreases with the increase in the number of genomes. Another problem is that many genomes (plant genomes in particular) had extensive duplications, which makes decoding of genomic architecture and rearrangement analysis in plants difficult. The existing synteny block generation algorithms in plants do not address the issue of generating non-overlapping synteny blocks suitable for analyzing rearrangements and evolution history of duplications. We present a new algorithm based on the A-Bruijn graph framework that overcomes these difficulties and provides a unified approach to synteny block reconstruction for multiple genomes, and for genomes with large duplications.

  3. Genome-scale analysis of aberrant DNA methylation in colorectal cancer

    Science.gov (United States)

    Hinoue, Toshinori; Weisenberger, Daniel J.; Lange, Christopher P.E.; Shen, Hui; Byun, Hyang-Min; Van Den Berg, David; Malik, Simeen; Pan, Fei; Noushmehr, Houtan; van Dijk, Cornelis M.; Tollenaar, Rob A.E.M.; Laird, Peter W.

    2012-01-01

    Colorectal cancer (CRC) is a heterogeneous disease in which unique subtypes are characterized by distinct genetic and epigenetic alterations. Here we performed comprehensive genome-scale DNA methylation profiling of 125 colorectal tumors and 29 adjacent normal tissues. We identified four DNA methylation–based subgroups of CRC using model-based cluster analyses. Each subtype shows characteristic genetic and clinical features, indicating that they represent biologically distinct subgroups. A CIMP-high (CIMP-H) subgroup, which exhibits an exceptionally high frequency of cancer-specific DNA hypermethylation, is strongly associated with MLH1 DNA hypermethylation and the BRAFV600E mutation. A CIMP-low (CIMP-L) subgroup is enriched for KRAS mutations and characterized by DNA hypermethylation of a subset of CIMP-H-associated markers rather than a unique group of CpG islands. Non-CIMP tumors are separated into two distinct clusters. One non-CIMP subgroup is distinguished by a significantly higher frequency of TP53 mutations and frequent occurrence in the distal colon, while the tumors that belong to the fourth group exhibit a low frequency of both cancer-specific DNA hypermethylation and gene mutations and are significantly enriched for rectal tumors. Furthermore, we identified 112 genes that were down-regulated more than twofold in CIMP-H tumors together with promoter DNA hypermethylation. These represent ∼7% of genes that acquired promoter DNA methylation in CIMP-H tumors. Intriguingly, 48/112 genes were also transcriptionally down-regulated in non-CIMP subgroups, but this was not attributable to promoter DNA hypermethylation. Together, we identified four distinct DNA methylation subgroups of CRC and provided novel insight regarding the role of CIMP-specific DNA hypermethylation in gene silencing. PMID:21659424

  4. Genomic divergences among cattle, dog and human estimated from large-scale alignments of genomic sequences

    Directory of Open Access Journals (Sweden)

    Shade Larry L

    2006-06-01

    Full Text Available Background: Approximately 11 Mb of finished high quality genomic sequences were sampled from cattle, dog and human to estimate genomic divergences and their regional variation among these lineages. Results: Optimal three-way multi-species global sequence alignments for 84 cattle clones or loci (each >50 kb of genomic sequence) were constructed using the human and dog genome assemblies as references. Genomic divergences and substitution rates were examined for each clone and for various sequence classes under different functional constraints. Analysis of these alignments revealed that the overall genomic divergences are relatively constant (0.32–0.37 change/site) for pairwise comparisons among cattle, dog and human; however, substitution rates vary across genomic regions and among different sequence classes. A neutral mutation rate (2.0–2.2 × 10(-9) change/site/year) was derived from ancestral repetitive sequences, whereas the substitution rate in coding sequences (1.1 × 10(-9) change/site/year) was approximately half of the overall rate (1.9–2.0 × 10(-9) change/site/year). Relative rate tests also indicated that cattle have a significantly faster rate of substitution as compared to dog and that this difference is about 6%. Conclusion: This analysis provides a large-scale and unbiased assessment of genomic divergences and regional variation of substitution rates among cattle, dog and human. It is expected that these data will serve as a baseline for future mammalian molecular evolution studies.
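
    The abstract above reports both pairwise divergences (change/site) and rates (change/site/year). A minimal sketch of how the two quantities relate, assuming the simple model in which two lineages each accumulate substitutions at the same constant rate since their split, so that divergence = 2 × rate × time. The function name and the 90-million-year split time in the usage line are illustrative assumptions, not values taken from the study, and the study's own estimation procedure may differ.

```python
def substitution_rate(divergence_per_site: float, split_time_years: float) -> float:
    """Per-lineage substitution rate under a constant-rate model.

    Two lineages diverge independently after the split, so the pairwise
    divergence d accumulates along both branches: d = 2 * rate * t.
    """
    if split_time_years <= 0:
        raise ValueError("split_time_years must be positive")
    return divergence_per_site / (2.0 * split_time_years)


if __name__ == "__main__":
    # Hypothetical inputs: 0.35 change/site and an assumed 90 Myr split time.
    rate = substitution_rate(0.35, 90e6)
    print(f"{rate:.2e} change/site/year")
```

    The model ignores rate differences between lineages, which the abstract explicitly reports (cattle versus dog), so it is only a first-order approximation.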

  5. Draft Genome Sequence of Ochrobactrum intermedium Strain SA148, a Plant Growth-Promoting Desert Rhizobacterium

    KAUST Repository

    Lafi, Feras Fawzi

    2017-03-03

    Ochrobactrum intermedium strain SA148 is a plant growth-promoting bacterium isolated from sandy soil in the Jizan area of Saudi Arabia. Here, we report the 4.9-Mb draft genome sequence of this strain, highlighting different pathways characteristic of plant growth promotion activity and environmental adaptation of SA148.

  6. Estimated allele substitution effects underlying genomic evaluation models depend on the scaling of allele counts

    NARCIS (Netherlands)

    Bouwman, Aniek C.; Hayes, Ben J.; Calus, Mario P.L.

    2017-01-01

    Background: Genomic evaluation is used to predict direct genomic values (DGV) for selection candidates in breeding programs, but also to estimate allele substitution effects (ASE) of single nucleotide polymorphisms (SNPs). Scaling of allele counts influences the estimated ASE, because scaling of

  7. Genome-scale cold stress response regulatory networks in ten Arabidopsis thaliana ecotypes

    DEFF Research Database (Denmark)

    Barah, Pankaj; Jayavelu, Naresh Doni; Rasmussen, Simon

    2013-01-01

    BACKGROUND: Low temperature leads to major crop losses every year. Although several studies have been conducted focusing on diversity of cold tolerance level in multiple phenotypically divergent Arabidopsis thaliana (A. thaliana) ecotypes, genome-scale molecular understanding is still lacking. RESULTS: In this study, we report genome-scale transcript response diversity of 10 A. thaliana ecotypes originating from different geographical locations to non-freezing cold stress (10°C). To analyze the transcriptional response diversity, we initially compared transcriptome changes in all 10 ecotypes... available from Arabidopsis thaliana 1001 genome project, we further investigated sequence polymorphisms in the core cold stress regulon genes. Significant numbers of non-synonymous amino acid changes were observed in the coding region of the CBF regulon genes. Considering the limited knowledge about...

  8. The Arab genome: Health and wealth.

    Science.gov (United States)

    Zayed, Hatem

    2016-11-05

    The 22 Arab nations have a unique genetic structure, which reflects both conserved and diverse gene pools due to the prevalent endogamous and consanguineous marriage culture and the long history of admixture among different ethnic subcultures descended from the Asian, European, and African continents. Human genome sequencing has enabled large-scale genomic studies of different populations and has become a powerful tool for studying disease predictions and diagnosis. Despite the importance of the Arab genome for better understanding the dynamics of the human genome, discovering rare genetic variations, and studying early human migration out of Africa, it is poorly represented in human genome databases, such as HapMap and the 1000 Genomes Project. In this review, I demonstrate the significance of sequencing the Arab genome and setting an Arab genome reference(s) for better understanding the molecular pathogenesis of genetic diseases, discovering novel/rare variants, and identifying a meaningful genotype-phenotype correlation for complex diseases. Copyright © 2016. Published by Elsevier B.V.

  9. Visualization for genomics: the Microbial Genome Viewer.

    NARCIS (Netherlands)

    Kerkhoven, R.; Enckevort, F.H.J. van; Boekhorst, J.; Molenaar, D.; Siezen, R.J.

    2004-01-01

    SUMMARY: A Web-based visualization tool, the Microbial Genome Viewer, is presented that allows the user to combine complex genomic data in a highly interactive way. This Web tool enables the interactive generation of chromosome wheels and linear genome maps from genome annotation data stored in a

  10. LTC: a novel algorithm to improve the efficiency of contig assembly for physical mapping in complex genomes

    Directory of Open Access Journals (Sweden)

    Feuillet Catherine

    2010-11-01

    Full Text Available Background: Physical maps are the substrate of genome sequencing and map-based cloning and their construction relies on the accurate assembly of BAC clones into large contigs that are then anchored to genetic maps with molecular markers. High Information Content Fingerprinting has become the method of choice for large and repetitive genomes such as those of maize, barley, and wheat. However, the high level of repeated DNA present in these genomes requires the application of very stringent criteria to ensure a reliable assembly with the FingerPrinted Contig (FPC) software, which often results in short contig lengths (of 3-5 clones before merging) as well as an unreliable assembly in some difficult regions. Difficulties can originate from a non-linear topological structure of clone overlaps, low power of clone ordering algorithms, and the absence of tools to identify sources of gaps in Minimal Tiling Paths (MTPs). Results: To address these problems, we propose a novel approach that: (i) reduces the rate of false connections and Q-clones by using a new cutoff calculation method; (ii) obtains reliable clusters robust to the exclusion of single clone or clone overlap; (iii) explores the topological contig structure by considering contigs as networks of clones connected by significant overlaps; (iv) performs iterative clone clustering combined with ordering and order verification using re-sampling methods; and (v) uses global optimization methods for clone ordering and Band Map construction. The elements of this new analytical framework called Linear Topological Contig (LTC) were applied on datasets used previously for the construction of the physical map of wheat chromosome 3B with FPC. The performance of LTC vs. FPC was compared also on the simulated BAC libraries based on the known genome sequences for chromosome 1 of rice and chromosome 1 of maize. Conclusions: The results show that compared to other methods, LTC enables the construction of highly

  11. Genome Modeling System: A Knowledge Management Platform for Genomics.

    Directory of Open Access Journals (Sweden)

    Malachi Griffith

    2015-07-01

    Full Text Available In this work, we present the Genome Modeling System (GMS), an analysis information management system capable of executing automated genome analysis pipelines at a massive scale. The GMS framework provides detailed tracking of samples and data coupled with reliable and repeatable analysis pipelines. The GMS also serves as a platform for bioinformatics development, allowing a large team to collaborate on data analysis, or an individual researcher to leverage the work of others effectively within its data management system. Rather than separating ad-hoc analysis from rigorous, reproducible pipelines, the GMS promotes systematic integration between the two. As a demonstration of the GMS, we performed an integrated analysis of whole genome, exome and transcriptome sequencing data from a breast cancer cell line (HCC1395) and matched lymphoblastoid line (HCC1395BL). These data are available for users to test the software, complete tutorials and develop novel GMS pipeline configurations. The GMS is available at https://github.com/genome/gms.

  12. Mapping whole genome shotgun sequence and variant calling in mammalian species without their reference genomes [v2; ref status: indexed, http://f1000r.es/2x3]

    Directory of Open Access Journals (Sweden)

    Ted Kalbfleisch

    2014-02-01

    Full Text Available Genomics research in mammals has produced reference genome sequences that are essential for identifying variation associated with disease. High quality reference genome sequences are now available for humans, model species, and economically important agricultural animals. Comparisons between these species have provided unique insights into mammalian gene function. However, the number of species with reference genomes is small compared to those needed for studying molecular evolutionary relationships in the tree of life. For example, among the even-toed ungulates there are approximately 300 species whose phylogenetic relationships have been calculated in the 10k trees project. Only six of these have reference genomes: cattle, swine, sheep, goat, water buffalo, and bison. Although reference sequences will eventually be developed for additional hoof stock, the resources in terms of time, money, infrastructure and expertise required to develop a quality reference genome may be unattainable for most species for at least another decade. In this work we mapped 35 Gb of next generation sequence data of a Katahdin sheep to its own species’ reference genome (Ovis aries Oar3.1) and to that of a species that diverged 15 to 30 million years ago (Bos taurus UMD3.1). In total, 56% of reads covered 76% of UMD3.1 to an average depth of 6.8 reads per site; 83 million variants were identified, of which 78 million were homozygous and likely represent interspecies nucleotide differences. Excluding repeat regions and sex chromosomes, nearly 3.7 million heterozygous sites were identified in this animal vs. bovine UMD3.1, representing polymorphisms occurring in sheep. Of these, 41% could be readily mapped to orthologous positions in ovine Oar3.1 with 80% corroborated as heterozygous. These variant sites, identified via interspecies mapping, could be used for comparative genomics, disease association studies, and ultimately to understand

  13. Large-scale chromatin immunoprecipitation with promoter sequence microarray analysis of the interaction of the NSs protein of Rift Valley fever virus with regulatory DNA regions of the host genome.

    Science.gov (United States)

    Benferhat, Rima; Josse, Thibaut; Albaud, Benoit; Gentien, David; Mansuroglu, Zeyni; Marcato, Vasco; Souès, Sylvie; Le Bonniec, Bernard; Bouloy, Michèle; Bonnefoy, Eliette

    2012-10-01

    Rift Valley fever virus (RVFV) is a highly pathogenic Phlebovirus that infects humans and ruminants. Initially confined to Africa, RVFV has spread outside Africa and presently represents a high risk to other geographic regions. It is responsible for high fatality rates in sheep and cattle. In humans, RVFV can induce hepatitis, encephalitis, retinitis, or fatal hemorrhagic fever. The nonstructural NSs protein that is the major virulence factor is found in the nuclei of infected cells where it associates with cellular transcription factors and cofactors. In previous work, we have shown that NSs interacts with the promoter region of the beta interferon gene abnormally maintaining the promoter in a repressed state. In this work, we performed a genome-wide analysis of the interactions between NSs and the host genome using a genome-wide chromatin immunoprecipitation combined with promoter sequence microarray, the ChIP-on-chip technique. Several cellular promoter regions were identified as significantly interacting with NSs, and the establishment of NSs interactions with these regions was often found linked to deregulation of expression of the corresponding genes. Among annotated NSs-interacting genes were present not only genes regulating innate immunity and inflammation but also genes regulating cellular pathways that have not yet been identified as targeted by RVFV. Several of these pathways, such as cell adhesion, axonal guidance, development, and coagulation were closely related to RVFV-induced disorders. In particular, we show in this work that NSs targeted and modified the expression of genes coding for coagulation factors, demonstrating for the first time that this hemorrhagic virus impairs the host coagulation cascade at the transcriptional level.

  14. CpG island mapping by epigenome prediction.

    Directory of Open Access Journals (Sweden)

    Christoph Bock

    2007-06-01

    Full Text Available CpG islands were originally identified by epigenetic and functional properties, namely, absence of DNA methylation and frequent promoter association. However, this concept was quickly replaced by simple DNA sequence criteria, which allowed for genome-wide annotation of CpG islands in the absence of large-scale epigenetic datasets. Although widely used, the current CpG island criteria incur significant disadvantages: (1) reliance on arbitrary threshold parameters that bear little biological justification, (2) failure to account for widespread heterogeneity among CpG islands, and (3) apparent lack of specificity when applied to the human genome. This study is driven by the idea that a quantitative score of "CpG island strength" that incorporates epigenetic and functional aspects can help resolve these issues. We construct an epigenome prediction pipeline that links the DNA sequence of CpG islands to their epigenetic states, including DNA methylation, histone modifications, and chromatin accessibility. By training support vector machines on epigenetic data for CpG islands on human Chromosomes 21 and 22, we identify informative DNA attributes that correlate with open versus compact chromatin structures. These DNA attributes are used to predict the epigenetic states of all CpG islands genome-wide. Combining predictions for multiple epigenetic features, we estimate the inherent CpG island strength for each CpG island in the human genome, i.e., its inherent tendency to exhibit an open and transcriptionally competent chromatin structure. We extensively validate our results on independent datasets, showing that the CpG island strength predictions are applicable and informative across different tissues and cell types, and we derive improved maps of predicted "bona fide" CpG islands. The mapping of CpG islands by epigenome prediction is conceptually superior to identifying CpG islands by widely used sequence criteria since it links CpG island detection to

  15. CpG island mapping by epigenome prediction.

    Science.gov (United States)

    Bock, Christoph; Walter, Jörn; Paulsen, Martina; Lengauer, Thomas

    2007-06-01

    CpG islands were originally identified by epigenetic and functional properties, namely, absence of DNA methylation and frequent promoter association. However, this concept was quickly replaced by simple DNA sequence criteria, which allowed for genome-wide annotation of CpG islands in the absence of large-scale epigenetic datasets. Although widely used, the current CpG island criteria incur significant disadvantages: (1) reliance on arbitrary threshold parameters that bear little biological justification, (2) failure to account for widespread heterogeneity among CpG islands, and (3) apparent lack of specificity when applied to the human genome. This study is driven by the idea that a quantitative score of "CpG island strength" that incorporates epigenetic and functional aspects can help resolve these issues. We construct an epigenome prediction pipeline that links the DNA sequence of CpG islands to their epigenetic states, including DNA methylation, histone modifications, and chromatin accessibility. By training support vector machines on epigenetic data for CpG islands on human Chromosomes 21 and 22, we identify informative DNA attributes that correlate with open versus compact chromatin structures. These DNA attributes are used to predict the epigenetic states of all CpG islands genome-wide. Combining predictions for multiple epigenetic features, we estimate the inherent CpG island strength for each CpG island in the human genome, i.e., its inherent tendency to exhibit an open and transcriptionally competent chromatin structure. We extensively validate our results on independent datasets, showing that the CpG island strength predictions are applicable and informative across different tissues and cell types, and we derive improved maps of predicted "bona fide" CpG islands. 
    The mapping of CpG islands by epigenome prediction is conceptually superior to identifying CpG islands by widely used sequence criteria since it links CpG island detection to their characteristic
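
    For context, the "simple DNA sequence criteria" that the abstract contrasts with epigenome prediction can be sketched in a few lines. The thresholds below (length of at least 200 bp, GC content of at least 0.5, observed/expected CpG ratio of at least 0.6) are one commonly cited set and are an assumption here; the abstract does not say which thresholds it evaluates. Function names are illustrative.

```python
def cpg_stats(seq: str) -> tuple[float, float]:
    """Return (GC content, observed/expected CpG ratio) for a DNA sequence."""
    seq = seq.upper()
    n = len(seq)
    c, g = seq.count("C"), seq.count("G")
    cpg = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_content = (c + g) / n if n else 0.0
    # Expected CpG count under independence is (C * G) / N.
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return gc_content, obs_exp


def passes_sequence_criteria(seq: str, min_len: int = 200,
                             min_gc: float = 0.5,
                             min_obs_exp: float = 0.6) -> bool:
    """Threshold test; the default cutoffs are assumed, not from the abstract."""
    gc_content, obs_exp = cpg_stats(seq)
    return len(seq) >= min_len and gc_content >= min_gc and obs_exp >= min_obs_exp


if __name__ == "__main__":
    print(passes_sequence_criteria("CG" * 100))  # CpG-dense toy sequence
    print(passes_sequence_criteria("AT" * 100))  # no C or G at all
```

    The hard cutoffs make the abstract's first criticism concrete: a sequence just below any one threshold is rejected outright, whereas a continuous "strength" score would rank it.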

  16. A BAC Library and Paired-PCR Approach to Mapping and Completing the Genome Sequence of Sulfolobus Solfataricus P2

    DEFF Research Database (Denmark)

    She, Qunxin; Confalonieri, F.; Zivanovic, Y.

    2000-01-01

    The original strategy used in the Sulfolobus solfataricus genome project was to sequence non-overlapping, or minimally overlapping, cosmid or lambda inserts without constructing a physical map. However, after only about two thirds of the genome sequence was completed, this approach became counter-productive because there was a high sequence bias in the cosmid and lambda libraries. Therefore, a new approach was devised for linking the sequenced regions which may be generally applicable. BAC libraries were constructed and terminal sequences of the clones were determined and used for both end mapping and PCR...

  17. Genome-wide mapping of virulence in brown planthopper identifies loci that break down host plant resistance.

    Science.gov (United States)

    Jing, Shengli; Zhang, Lei; Ma, Yinhua; Liu, Bingfang; Zhao, Yan; Yu, Hangjin; Zhou, Xi; Qin, Rui; Zhu, Lili; He, Guangcun

    2014-01-01

    Insects and plants have coexisted for over 350 million years and their interactions have affected ecosystems and agricultural practices worldwide. Variation in herbivorous insects' virulence to circumvent host resistance has been extensively documented. However, despite decades of investigation, the genetic foundations of virulence are currently unknown. The brown planthopper (Nilaparvata lugens) is the most destructive rice (Oryza sativa) pest in the world. The identification of the resistance gene Bph1 and its introduction in commercial rice varieties prompted the emergence of a new virulent brown planthopper biotype that was able to break the resistance conferred by Bph1. In this study, we aimed to construct a high density linkage map for the brown planthopper and identify the loci responsible for its virulence in order to determine their genetic architecture. Based on genotyping data for hundreds of molecular markers in three mapping populations, we constructed the most comprehensive linkage map available for this species, covering 96.6% of its genome. Fifteen chromosomes were anchored with 124 gene-specific markers. Using genome-wide scanning and interval mapping, the Qhp7 locus that governs preference for Bph1 plants was mapped to a 0.1 cM region of chromosome 7. In addition, two major QTLs that govern the rate of insect growth on resistant rice plants were identified on chromosomes 5 (Qgr5) and 14 (Qgr14). This is the first study to successfully locate virulence in the genome of this important agricultural insect by marker-based genetic mapping. Our results show that the virulence which overcomes the resistance conferred by Bph1 is controlled by a few major genes and that the components of virulence originate from independent genetic characters. 
    The isolation of these loci will enable the elucidation of the molecular mechanisms underpinning the rice-brown planthopper interaction and facilitate the development of durable approaches for controlling this most

  18. Genome-wide mapping of virulence in brown planthopper identifies loci that break down host plant resistance.

    Directory of Open Access Journals (Sweden)

    Shengli Jing

    Full Text Available Insects and plants have coexisted for over 350 million years and their interactions have affected ecosystems and agricultural practices worldwide. Variation in herbivorous insects' virulence to circumvent host resistance has been extensively documented. However, despite decades of investigation, the genetic foundations of virulence are currently unknown. The brown planthopper (Nilaparvata lugens) is the most destructive rice (Oryza sativa) pest in the world. The identification of the resistance gene Bph1 and its introduction in commercial rice varieties prompted the emergence of a new virulent brown planthopper biotype that was able to break the resistance conferred by Bph1. In this study, we aimed to construct a high density linkage map for the brown planthopper and identify the loci responsible for its virulence in order to determine their genetic architecture. Based on genotyping data for hundreds of molecular markers in three mapping populations, we constructed the most comprehensive linkage map available for this species, covering 96.6% of its genome. Fifteen chromosomes were anchored with 124 gene-specific markers. Using genome-wide scanning and interval mapping, the Qhp7 locus that governs preference for Bph1 plants was mapped to a 0.1 cM region of chromosome 7. In addition, two major QTLs that govern the rate of insect growth on resistant rice plants were identified on chromosomes 5 (Qgr5) and 14 (Qgr14). This is the first study to successfully locate virulence in the genome of this important agricultural insect by marker-based genetic mapping. Our results show that the virulence which overcomes the resistance conferred by Bph1 is controlled by a few major genes and that the components of virulence originate from independent genetic characters. The isolation of these loci will enable the elucidation of the molecular mechanisms underpinning the rice-brown planthopper interaction and facilitate the development of durable approaches for

  19. Phosphate steering by Flap Endonuclease 1 promotes 5′-flap specificity and incision to prevent genome instability

    KAUST Repository

    Tsutakawa, Susan E.

    2017-06-27

    DNA replication and repair enzyme Flap Endonuclease 1 (FEN1) is vital for genome integrity, and FEN1 mutations arise in multiple cancers. FEN1 precisely cleaves single-stranded (ss) 5′-flaps one nucleotide into duplex (ds) DNA. Yet, how FEN1 selects for but does not incise the ss 5′-flap was enigmatic. Here we combine crystallographic, biochemical and genetic analyses to show that two dsDNA binding sites set the 5′ polarity and to reveal unexpected control of the DNA phosphodiester backbone by electrostatic interactions. Via 'phosphate steering', basic residues energetically steer an inverted ss 5′-flap through a gateway over FEN1's active site and shift dsDNA for catalysis. Mutations of these residues cause an 18,000-fold reduction in catalytic rate in vitro and large-scale trinucleotide (GAA) repeat expansions in vivo, implying failed phosphate-steering promotes an unanticipated lagging-strand template-switch mechanism during replication. Thus, phosphate steering is an unappreciated FEN1 function that enforces 5′-flap specificity and catalysis, preventing genomic instability.

  20. Improved annotation through genome-scale metabolic modeling of Aspergillus oryzae

    DEFF Research Database (Denmark)

    Vongsangnak, Wanwipa; Olsen, Peter; Hansen, Kim

    2008-01-01

    Background: Since ancient times the filamentous fungus Aspergillus oryzae has been used in the fermentation industry for the production of fermented sauces and the production of industrial enzymes. Recently, the genome sequence of A. oryzae with 12,074 annotated genes was released but the number... to a genome-scale metabolic model of A. oryzae. Results: From our assembled EST sequences, we identified 1,046 newly predicted genes in the A. oryzae genome. Furthermore, it was possible to assign putative protein functions to 398 of the newly predicted genes. Notably, our annotation strategy resulted... model was validated and shown to correctly describe the phenotypic behavior of A. oryzae grown on different carbon sources. Conclusion: A much enhanced annotation of the A. oryzae genome was performed and a genome-scale metabolic model of A. oryzae was reconstructed. The model accurately predicted...

  1. High-resolution genetic map for understanding the effect of genome-wide recombination rate on nucleotide diversity in watermelon.

    Science.gov (United States)

    Reddy, Umesh K; Nimmakayala, Padma; Levi, Amnon; Abburi, Venkata Lakshmi; Saminathan, Thangasamy; Tomason, Yan R; Vajja, Gopinath; Reddy, Rishi; Abburi, Lavanya; Wehner, Todd C; Ronin, Yefim; Karol, Abraham

    2014-09-15

    We used genotyping by sequencing to identify a set of 10,480 single nucleotide polymorphism (SNP) markers for constructing a high-resolution genetic map of 1096 cM for watermelon. We assessed the genome-wide variation in recombination rate (GWRR) across the map and found an association between GWRR and genome-wide nucleotide diversity. Collinearity between the map and the genome-wide reference sequence for watermelon was studied to identify inconsistency and chromosome rearrangements. We assessed genome-wide nucleotide diversity, linkage disequilibrium (LD), and selective sweep for wild, semi-wild, and domesticated accessions of Citrullus lanatus var. lanatus to track signals of domestication. Principal component analysis combined with chromosome-wide phylogenetic study based on 1563 SNPs obtained after LD pruning with minor allele frequency of 0.05 resolved the differences between semi-wild and wild accessions as well as relationships among worldwide sweet watermelon. Population structure analysis revealed predominant ancestries for wild, semi-wild, and domesticated watermelons as well as admixture of various ancestries that were important for domestication. Sliding window analysis of Tajima's D across various chromosomes was used to resolve selective sweep. LD decay was estimated for various chromosomes. We identified a strong selective sweep on chromosome 3 consisting of important genes that might have had a role in sweet watermelon domestication. Copyright © 2014 Reddy et al.
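
    The sliding-window Tajima's D scan mentioned above can be illustrated with a short sketch. It uses the standard Tajima (1989) formula on biallelic SNPs; the input layout (a list of `(position, allele_count)` pairs), the function names, and the window parameters are assumptions for illustration, not the study's actual pipeline or software.

```python
from math import sqrt


def tajimas_d(allele_counts, n):
    """Tajima's D from per-site allele counts among n sampled sequences.

    Returns None when the statistic is undefined (n < 4, no segregating
    sites, or zero variance).
    """
    counts = [k for k in allele_counts if 0 < k < n]  # segregating sites only
    s = len(counts)
    if n < 4 or s == 0:
        return None
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / (i * i) for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / (a1 * a1)
    e1 = c1 / a1
    e2 = c2 / (a1 * a1 + a2)
    # Mean pairwise differences: each site contributes k(n-k) / C(n, 2).
    pi = sum(k * (n - k) for k in counts) / (n * (n - 1) / 2.0)
    variance = e1 * s + e2 * s * (s - 1)
    if variance <= 0:
        return None
    return (pi - s / a1) / sqrt(variance)


def sliding_tajimas_d(sites, n, window, step):
    """Compute D in windows [start, start + window) along one chromosome.

    sites: iterable of (position, allele_count) pairs (assumed layout).
    """
    sites = list(sites)
    if not sites:
        return []
    last = max(pos for pos, _ in sites)
    results = []
    start = 0
    while start <= last:
        in_window = [k for pos, k in sites if start <= pos < start + window]
        results.append((start, tajimas_d(in_window, n)))
        start += step
    return results
```

    Windows dominated by rare variants give negative D, which is the pattern usually read as a candidate selective sweep; windows rich in intermediate-frequency variants give positive D.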

  2. Toward the automated generation of genome-scale metabolic networks in the SEED.

    Science.gov (United States)

    DeJongh, Matthew; Formsma, Kevin; Boillot, Paul; Gould, John; Rycenga, Matthew; Best, Aaron

    2007-04-26

    Current methods for the automated generation of genome-scale metabolic networks focus on genome annotation and preliminary biochemical reaction network assembly, but do not adequately address the process of identifying and filling gaps in the reaction network, and verifying that the network is suitable for systems level analysis. Thus, current methods are only sufficient for generating draft-quality networks, and refinement of the reaction network is still largely a manual, labor-intensive process. We have developed a method for generating genome-scale metabolic networks that produces substantially complete reaction networks, suitable for systems level analysis. Our method partitions the reaction space of central and intermediary metabolism into discrete, interconnected components that can be assembled and verified in isolation from each other, and then integrated and verified at the level of their interconnectivity. We have developed a database of components that are common across organisms, and have created tools for automatically assembling appropriate components for a particular organism based on the metabolic pathways encoded in the organism's genome. This focuses manual efforts on that portion of an organism's metabolism that is not yet represented in the database. We have demonstrated the efficacy of our method by reverse-engineering and automatically regenerating the reaction network from a published genome-scale metabolic model for Staphylococcus aureus. Additionally, we have verified that our method capitalizes on the database of common reaction network components created for S. aureus, by using these components to generate substantially complete reconstructions of the reaction networks from three other published metabolic models (Escherichia coli, Helicobacter pylori, and Lactococcus lactis). We have implemented our tools and database within the SEED, an open-source software environment for comparative genome annotation and analysis. 
    Our method sets the
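
    The gap-identification step described in this abstract can be illustrated with a toy check for dead-end metabolites, that is, compounds a draft network consumes but never produces (or the reverse). This is a generic sketch under simplifying assumptions (irreversible reactions, no stoichiometry); it is not the SEED implementation, and the reaction and metabolite names are hypothetical.

```python
def dead_end_metabolites(reactions, external=frozenset()):
    """Find gaps in a draft reaction network.

    reactions: dict mapping reaction name -> (substrates, products),
               each an iterable of metabolite names; reactions are
               treated as irreversible (a simplifying assumption).
    external:  metabolites allowed to enter or leave the system.

    Returns (never_produced, never_consumed) as two sets.
    """
    produced, consumed = set(), set()
    for substrates, products in reactions.values():
        consumed.update(substrates)
        produced.update(products)
    external = set(external)
    never_produced = consumed - produced - external
    never_consumed = produced - consumed - external
    return never_produced, never_consumed


if __name__ == "__main__":
    # Toy draft network with the step between g6p and f6p missing.
    draft = {
        "R1": (["glc"], ["g6p"]),
        "R2": (["f6p"], ["fbp"]),
    }
    print(dead_end_metabolites(draft, external={"glc", "fbp"}))
```

    In this toy network the check flags `f6p` as never produced and `g6p` as never consumed, pointing at the missing reaction that a gap-filling step would need to supply.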

  3. Toward the automated generation of genome-scale metabolic networks in the SEED

    Directory of Open Access Journals (Sweden)

    Gould John

    2007-04-01

    Full Text Available Background: Current methods for the automated generation of genome-scale metabolic networks focus on genome annotation and preliminary biochemical reaction network assembly, but do not adequately address the process of identifying and filling gaps in the reaction network, and verifying that the network is suitable for systems level analysis. Thus, current methods are only sufficient for generating draft-quality networks, and refinement of the reaction network is still largely a manual, labor-intensive process. Results: We have developed a method for generating genome-scale metabolic networks that produces substantially complete reaction networks, suitable for systems level analysis. Our method partitions the reaction space of central and intermediary metabolism into discrete, interconnected components that can be assembled and verified in isolation from each other, and then integrated and verified at the level of their interconnectivity. We have developed a database of components that are common across organisms, and have created tools for automatically assembling appropriate components for a particular organism based on the metabolic pathways encoded in the organism's genome. This focuses manual efforts on that portion of an organism's metabolism that is not yet represented in the database. We have demonstrated the efficacy of our method by reverse-engineering and automatically regenerating the reaction network from a published genome-scale metabolic model for Staphylococcus aureus. Additionally, we have verified that our method capitalizes on the database of common reaction network components created for S. aureus, by using these components to generate substantially complete reconstructions of the reaction networks from three other published metabolic models (Escherichia coli, Helicobacter pylori, and Lactococcus lactis). We have implemented our tools and database within the SEED, an open-source software environment for comparative

  4. A map to a new treasure island: the human genome and the concept of common heritage.

    Science.gov (United States)

    Byk, C

    1998-06-01

    While the 1970s have been called the environmental years, the 1990s could be seen as the genome years. As the challenge to map and to sequence the human genome mobilized the scientific community, the risks and benefits of the information and uses that would derive from this project have also raised ethical issues at the international level. The particular interest of the 1997 UNESCO Declaration lies in the fact that it emphasizes both the scientific importance of genetics and the appropriate reinforcement of human rights in this area. It considers the human genome, at least symbolically, as the common heritage of humanity.

  5. In Silico Genome-Scale Reconstruction and Validation of the Staphylococcus aureus Metabolic Network

    NARCIS (Netherlands)

    Heinemann, Matthias; Kümmel, Anne; Ruinatscha, Reto; Panke, Sven

    2005-01-01

    A genome-scale metabolic model of the Gram-positive, facultative anaerobic opportunistic pathogen Staphylococcus aureus N315 was constructed based on current genomic data, literature, and physiological information. The model comprises 774 metabolic processes representing approximately 23% of all

  6. G2S: A web-service for annotating genomic variants on 3D protein structures.

    Science.gov (United States)

    Wang, Juexin; Sheridan, Robert; Sumer, S Onur; Schultz, Nikolaus; Xu, Dong; Gao, Jianjiong

    2018-01-27

    Accurately mapping and annotating genomic locations on 3D protein structures is a key step in structure-based analysis of genomic variants detected by recent large-scale sequencing efforts. Several mapping resources are currently available, but none of them provides a web API (Application Programming Interface) that supports programmatic access. We present G2S, a real-time web API that provides automated mapping of genomic variants on 3D protein structures. G2S can align genomic locations of variants, protein locations, or protein sequences to protein structures and retrieve the mapped residues from structures. The G2S API uses a REST-inspired design and can be used by various clients such as web browsers, command terminals, programming languages and other bioinformatics tools for bringing 3D structures into genomic variant analysis. The webserver and source code are freely available at https://g2s.genomenexus.org. Contact: g2s@genomenexus.org. Supplementary data are available at Bioinformatics online. © The Author (2018). Published by Oxford University Press.

  7. Genome-scale modeling of yeast: chronology, applications and critical perspectives.

    Science.gov (United States)

    Lopes, Helder; Rocha, Isabel

    2017-08-01

    Over the last 15 years, several genome-scale metabolic models (GSMMs) were developed for different yeast species, aiding both the elucidation of new biological processes and the shift toward a bio-based economy, through the design of in silico inspired cell factories. Here, an historical perspective of the GSMMs built over time for several yeast species is presented and the main inheritance patterns among the metabolic reconstructions are highlighted. We additionally provide a critical perspective on the overall genome-scale modeling procedure, underlining incomplete model validation and evaluation approaches and the quest for the integration of regulatory and kinetic information into yeast GSMMs. A summary of experimentally validated model-based metabolic engineering applications of yeast species is further emphasized, while the main challenges and future perspectives for the field are finally addressed. © FEMS 2017.

  8. SNP identification from RNA sequencing and linkage map construction of rubber tree for anchoring the draft genome.

    Science.gov (United States)

    Shearman, Jeremy R; Sangsrakru, Duangjai; Jomchai, Nukoon; Ruang-Areerate, Panthita; Sonthirod, Chutima; Naktang, Chaiwat; Theerawattanasuk, Kanikar; Tragoonrung, Somvong; Tangphatsornruang, Sithichoke

    2015-01-01

    Hevea brasiliensis, or rubber tree, is an important crop species that accounts for the majority of natural latex production. The rubber tree nuclear genome consists of 18 chromosomes and is roughly 2.15 Gb. The current rubber tree reference genome assembly consists of 1,150,326 scaffolds ranging from 200 to 531,465 bp and totalling 1.1 Gb. Only 143 scaffolds, totalling 7.6 Mb, have been placed into linkage groups. We have performed RNA-seq on 6 varieties of rubber tree to identify SNPs and InDels and used this information to perform target sequence enrichment and high throughput sequencing to genotype a set of SNPs in 149 rubber tree offspring from a cross between RRIM 600 and RRII 105 rubber tree varieties. We used this information to generate a linkage map allowing for the anchoring of 24,424 contigs from 3,009 scaffolds, totalling 115 Mb or 10.4% of the published sequence, into 18 linkage groups. Each linkage group contains between 319 and 1367 SNPs, or 60 to 194 non-redundant marker positions, and ranges from 156 to 336 cM in length. This linkage map includes 20,143 of the 69,300 predicted genes from rubber tree and will be useful for mapping studies and improving the reference genome assembly.

  9. A comparative map viewer integrating genetic maps for Brassica and Arabidopsis

    Directory of Open Access Journals (Sweden)

    Erwin Timothy A

    2007-07-01

    Background: Molecular genetic maps provide a means to link heritable traits with underlying genome sequence variation. Several genetic maps have been constructed for Brassica species, yet to date, there has been no simple means to compare this information or to associate mapped traits with the genome sequence of the related model plant, Arabidopsis. Description: We have developed a comparative genetic map database for the viewing, comparison and analysis of Brassica and Arabidopsis genetic, physical and trait map information. This web-based tool allows users to view and compare genetic and physical maps, search for traits and markers, and compare genetic linkage groups within and between the amphidiploid and diploid Brassica genomes. The inclusion of Arabidopsis data enables comparison between Brassica maps that share no common markers. Analysis of conserved syntenic blocks between Arabidopsis and collated Brassica genetic maps validates the application of this system. This tool is freely available at http://bioinformatics.pbcbasc.latrobe.edu.au/cmap. Conclusion: This database enables users to interrogate the relationship between Brassica genetic maps and the sequenced genome of A. thaliana, permitting the comparison of genetic linkage groups and mapped traits and the rapid identification of candidate genes.

  10. iCN718, an Updated and Improved Genome-Scale Metabolic Network Reconstruction of Acinetobacter baumannii AYE.

    Science.gov (United States)

    Norsigian, Charles J; Kavvas, Erol; Seif, Yara; Palsson, Bernhard O; Monk, Jonathan M

    2018-01-01

    Acinetobacter baumannii has become an urgent clinical threat due to the recent emergence of multi-drug resistant strains. There is thus a significant need to discover new therapeutic targets in this organism. One means for doing so is through the use of high-quality genome-scale reconstructions. Well-curated and accurate genome-scale models (GEMs) of A. baumannii would be useful for improving treatment options. We present an updated and improved genome-scale reconstruction of A. baumannii AYE, named iCN718, that improves and standardizes previous A. baumannii AYE reconstructions. iCN718 has 80% accuracy for predicting gene essentiality data and additionally can predict large-scale phenotypic data with as much as 89% accuracy, a new capability for an A. baumannii reconstruction. We further demonstrate that iCN718 can be used to analyze conserved metabolic functions in the A. baumannii core genome and to build strain-specific GEMs of 74 other A. baumannii strains from genome sequence alone. iCN718 will serve as a resource to integrate and synthesize new experimental data being generated for this urgent threat pathogen.

  11. Genome-wide mapping of boundary element-associated factor (BEAF) binding sites in Drosophila melanogaster links BEAF to transcription.

    Science.gov (United States)

    Jiang, Nan; Emberly, Eldon; Cuvier, Olivier; Hart, Craig M

    2009-07-01

    Insulator elements play a role in gene regulation that is potentially linked to nuclear organization. Boundary element-associated factors (BEAFs) 32A and 32B associate with hundreds of sites on Drosophila polytene chromosomes. We hybridized DNA isolated by chromatin immunoprecipitation to genome tiling microarrays to construct a genome-wide map of BEAF binding locations. A distinct difference in the association of 32A and 32B with chromatin was noted. We identified 1,820 BEAF peaks and found that more than 85% were less than 300 bp from transcription start sites. Half are between head-to-head gene pairs. BEAF-associated genes are transcriptionally active as judged by the presence of RNA polymerase II, dimethylated histone H3 K4, and the alternative histone H3.3. Forty percent of these genes are also associated with the polymerase negative elongation factor NELF. Like NELF-associated genes, most BEAF-associated genes are highly expressed. Using quantitative reverse transcription-PCR, we found that the expression levels of most BEAF-associated genes decrease in embryos and cultured cells lacking BEAF. These results provide an unexpected link between BEAF and transcription, suggesting that BEAF plays a role in maintaining most associated promoter regions in an environment that facilitates high transcription levels.

  12. A Spatial Framework to Map Heat Health Risks at Multiple Scales.

    Science.gov (United States)

    Ho, Hung Chak; Knudby, Anders; Huang, Wei

    2015-12-18

    In the last few decades extreme heat events have led to substantial excess mortality, most dramatically in Central Europe in 2003, in Russia in 2010, and even in typically cool locations such as Vancouver, Canada, in 2009. Heat-related morbidity and mortality is expected to increase over the coming centuries as the result of climate-driven global increases in the severity and frequency of extreme heat events. Spatial information on heat exposure and population vulnerability may be combined to map the areas of highest risk and focus mitigation efforts there. However, a mismatch in spatial resolution between heat exposure and vulnerability data can cause spatial scale issues such as the Modifiable Areal Unit Problem (MAUP). We used a raster-based model to integrate heat exposure and vulnerability data in a multi-criteria decision analysis, and compared it to the traditional vector-based model. We then used the Getis-Ord G(i) index to generate spatially smoothed heat risk hotspot maps from fine to coarse spatial scales. The raster-based model allowed production of maps at spatial resolution, more description of local-scale heat risk variability, and identification of heat-risk areas not identified with the vector-based approach. Spatial smoothing with the Getis-Ord G(i) index produced heat risk hotspots from local to regional spatial scale. The approach is a framework for reducing spatial scale issues in future heat risk mapping, and for identifying heat risk hotspots at spatial scales ranging from the block-level to the municipality level.
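
    The hotspot step described above can be illustrated with a minimal sketch of the Getis-Ord Gi* statistic in its standard z-score form. The abstract does not state the authors' weights or software, so the binary neighbour weights on a one-dimensional chain of cells, the function names, and the sample risk values are all assumptions made for illustration.

```python
import math


def chain_weights(n):
    """Binary weights for n cells in a row (assumed layout).

    Each cell is a neighbour of itself and of the cells directly beside it.
    """
    return [[1.0 if abs(i - j) <= 1 else 0.0 for j in range(n)] for i in range(n)]


def getis_ord_gi_star(values, weights):
    """Return the Gi* z-score of every location.

    values  -- list of n observations (for example heat-risk scores)
    weights -- n x n spatial weights; weights[i][i] is non-zero so that
               each location counts itself (the "star" variant)
    Not defined when all values are equal or when one location
    neighbours every other location with equal weight.
    """
    n = len(values)
    mean = sum(values) / n
    # Population standard deviation of the observations.
    s = math.sqrt(sum(v * v for v in values) / n - mean * mean)
    scores = []
    for i in range(n):
        w = weights[i]
        sum_w = sum(w)
        sum_w2 = sum(x * x for x in w)
        numerator = sum(wj * vj for wj, vj in zip(w, values)) - mean * sum_w
        denominator = s * math.sqrt((n * sum_w2 - sum_w ** 2) / (n - 1))
        scores.append(numerator / denominator)
    return scores


# Hypothetical risk scores: a cluster of high values at the right-hand end.
risk = [1.0, 1.0, 1.0, 1.0, 9.0, 9.0]
z_scores = getis_ord_gi_star(risk, chain_weights(len(risk)))
print([round(z, 2) for z in z_scores])
```

    Large positive scores mark clusters of high values and large negative scores mark clusters of low values. Widening the neighbourhood in the weights matrix is one way to move from local to regional smoothing.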

  13. Maps on large-scale air quality concentrations in the Netherlands

    International Nuclear Information System (INIS)

    Velders, G.J.M.; Aben, J.M.M.; Beck, J.P.; Blom, W.F.; Van Dam, J.D.; Elzenga, H.E.; Geilenkirchen, G.P.; Hoen, A.; Jimmink, B.A.; Matthijsen, J.; Peek, C.J.; Van Velze, K.; Visser, H.; De Vries, W.J.

    2007-01-01

    Every year MNP produces maps showing large-scale concentrations of several air quality components in the Netherlands for which there are European regulations. The concentration maps are based on a combination of model calculations and measurements. These maps (called GCN maps) show the large-scale contribution of these components in air in the Netherlands for both past and future years. Local, provincial and other authorities use these maps for reporting exceedances in the framework of the EU Air Quality Directive and for planning. This 2007 report gives the underlying assumptions applied to the GCN maps. The Dutch Ministry of Housing, Spatial Planning and the Environment (VROM) is legally responsible for selecting the scenario to be used in the GCN maps. The Ministry has chosen to base the current maps of nitrogen dioxide, particulate matter (PM10) and sulphur dioxide for 2010 up to 2020 on standing and proposed Dutch and European policies. That means the calculations assume that the Netherlands and other European countries will meet their National Emissions Ceilings (NEC) by 2010 and, up to 2020, the emissions according to the ambitions of the Thematic Strategy on Air Pollution of the European Commission. The large-scale concentrations of NO2 and PM10 presented by the GCN maps are, in 2006 and for the 2010-2020 period, below the European annual-average limit value of 40 μg/m³ everywhere in the Netherlands. The large-scale concentration exceeds the European limit value for the daily average of PM10 (at most 35 days above 50 μg/m³) in some locations in 2006. This applies close to the harbours of Amsterdam and Rotterdam and is associated with storage and handling of dry bulk material. The large-scale concentration of PM10 is below the European limit value for the daily average everywhere in 2010-2020. Several changes have been implemented, in addition to the changes in the GCN maps of last year (report March 2006). New insights into

  14. On the effects of scale for ecosystem services mapping.

    Science.gov (United States)

    Grêt-Regamey, Adrienne; Weibel, Bettina; Bagstad, Kenneth J; Ferrari, Marika; Geneletti, Davide; Klug, Hermann; Schirpke, Uta; Tappeiner, Ulrike

    2014-01-01

    Ecosystems provide life-sustaining services upon which human civilization depends, but their degradation largely continues unabated. Spatially explicit information on ecosystem services (ES) provision is required to better guide decision making, particularly for mountain systems, which are characterized by vertical gradients and isolation with high topographic complexity, making them particularly sensitive to global change. But while spatially explicit ES quantification and valuation allows the identification of areas of abundant or limited supply of and demand for ES, the accuracy and usefulness of the information varies considerably depending on the scale and methods used. Using four case studies from mountainous regions in Europe and the U.S., we quantify information gains and losses when mapping five ES - carbon sequestration, flood regulation, agricultural production, timber harvest, and scenic beauty - at coarse and fine resolution (250 m vs. 25 m in Europe and 300 m vs. 30 m in the U.S.). We analyze the effects of scale on ES estimates and their spatial pattern and show how these effects are related to different ES, terrain structure and model properties. ES estimates differ substantially between the fine and coarse resolution analyses in all case studies and across all services. This scale effect is not equally strong for all ES. We show that spatially explicit information about non-clustered, isolated ES tends to be lost at coarse resolution and against expectation, mainly in less rugged terrain, which calls for finer resolution assessments in such contexts. The effect of terrain ruggedness is also related to model properties such as dependency on land use-land cover data. We close with recommendations for mapping ES to make the resulting maps more comparable, and suggest a four-step approach to address the issue of scale when mapping ES that can deliver information to support ES-based decision making with greater accuracy and reliability.

  15. On the effects of scale for ecosystem services mapping.

    Directory of Open Access Journals (Sweden)

    Adrienne Grêt-Regamey

    Ecosystems provide life-sustaining services upon which human civilization depends, but their degradation largely continues unabated. Spatially explicit information on ecosystem services (ES) provision is required to better guide decision making, particularly for mountain systems, which are characterized by vertical gradients and isolation with high topographic complexity, making them particularly sensitive to global change. But while spatially explicit ES quantification and valuation allows the identification of areas of abundant or limited supply of and demand for ES, the accuracy and usefulness of the information varies considerably depending on the scale and methods used. Using four case studies from mountainous regions in Europe and the U.S., we quantify information gains and losses when mapping five ES - carbon sequestration, flood regulation, agricultural production, timber harvest, and scenic beauty - at coarse and fine resolution (250 m vs. 25 m in Europe and 300 m vs. 30 m in the U.S.). We analyze the effects of scale on ES estimates and their spatial pattern and show how these effects are related to different ES, terrain structure and model properties. ES estimates differ substantially between the fine and coarse resolution analyses in all case studies and across all services. This scale effect is not equally strong for all ES. We show that spatially explicit information about non-clustered, isolated ES tends to be lost at coarse resolution and against expectation, mainly in less rugged terrain, which calls for finer resolution assessments in such contexts. The effect of terrain ruggedness is also related to model properties such as dependency on land use-land cover data. We close with recommendations for mapping ES to make the resulting maps more comparable, and suggest a four-step approach to address the issue of scale when mapping ES that can deliver information to support ES-based decision making with greater accuracy and reliability.

  16. On the effects of scale for ecosystem services mapping

    Science.gov (United States)

    Grêt-Regamey, Adrienne; Weibel, Bettina; Bagstad, Kenneth J.; Ferrari, Marika; Geneletti, Davide; Klug, Hermann; Schirpke, Uta; Tappeiner, Ulrike

    2014-01-01

    Ecosystems provide life-sustaining services upon which human civilization depends, but their degradation largely continues unabated. Spatially explicit information on ecosystem services (ES) provision is required to better guide decision making, particularly for mountain systems, which are characterized by vertical gradients and isolation with high topographic complexity, making them particularly sensitive to global change. But while spatially explicit ES quantification and valuation allows the identification of areas of abundant or limited supply of and demand for ES, the accuracy and usefulness of the information varies considerably depending on the scale and methods used. Using four case studies from mountainous regions in Europe and the U.S., we quantify information gains and losses when mapping five ES - carbon sequestration, flood regulation, agricultural production, timber harvest, and scenic beauty - at coarse and fine resolution (250 m vs. 25 m in Europe and 300 m vs. 30 m in the U.S.). We analyze the effects of scale on ES estimates and their spatial pattern and show how these effects are related to different ES, terrain structure and model properties. ES estimates differ substantially between the fine and coarse resolution analyses in all case studies and across all services. This scale effect is not equally strong for all ES. We show that spatially explicit information about non-clustered, isolated ES tends to be lost at coarse resolution and against expectation, mainly in less rugged terrain, which calls for finer resolution assessments in such contexts. The effect of terrain ruggedness is also related to model properties such as dependency on land use-land cover data. We close with recommendations for mapping ES to make the resulting maps more comparable, and suggest a four-step approach to address the issue of scale when mapping ES that can deliver information to support ES-based decision making with greater accuracy and reliability.

  17. Genome-wide recombination rate variation in a recombination map of cotton.

    Science.gov (United States)

    Shen, Chao; Li, Ximei; Zhang, Ruiting; Lin, Zhongxu

    2017-01-01

    Recombination is crucial for genetic evolution, which not only provides new allele combinations but also influences the biological evolution and efficacy of natural selection. However, recombination variation is not well understood outside of the complex species' genomes, and it is particularly unclear in Gossypium. Cotton is the most important natural fibre crop and the second largest oil-seed crop. Here, we found that genetic and physical map distances did not have a simple linear relationship. Recombination rates were unevenly distributed throughout the cotton genome, showing marked changes along the chromosome lengths, and recombination was completely suppressed in the centromeric regions. Recombination rates varied significantly between the A-subgenome (At) (range = 1.60 to 3.26 centimorgan/megabase [cM/Mb]) and the D-subgenome (Dt) (range = 2.17 to 4.97 cM/Mb), which explained why the genetic maps of At and Dt are similar but the physical map of Dt is only half that of At. The translocation regions between A02 and A03 and between A04 and A05, and the inversion regions on A10, D10, A07 and D07, indicated relatively high recombination rates in the distal regions of the chromosomes. Recombination rates were positively correlated with the densities of genes and markers and with the distance from the centromere, and negatively correlated with transposable elements (TEs). The gene ontology (GO) categories showed that genes in high recombination regions may tend to respond to environmental stimuli, and genes in low recombination regions are related to mitosis and meiosis, which suggested that they may provide the primary driving force in adaptive evolution and assure the stability of the basic cell cycle in a rapidly changing environment. Global knowledge of recombination rates will facilitate genetics and breeding in cotton.
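
    The cM/Mb figures quoted above are ratios of genetic distance to physical distance. A minimal sketch of that arithmetic for intervals between adjacent markers follows; the marker positions are invented for illustration and the function name is an assumption, not the authors' pipeline.

```python
def recombination_rates(markers):
    """Local recombination rate between adjacent markers.

    markers -- list of (physical_position_bp, genetic_position_cM),
               sorted by physical position
    Returns a list of (start_bp, end_bp, rate_in_cM_per_Mb).
    """
    rates = []
    for (bp0, cm0), (bp1, cm1) in zip(markers, markers[1:]):
        span_mb = (bp1 - bp0) / 1e6
        if span_mb <= 0:
            continue  # markers at the same physical position carry no rate
        rates.append((bp0, bp1, (cm1 - cm0) / span_mb))
    return rates


# Hypothetical chromosome: the middle interval gains no genetic distance,
# mimicking suppressed recombination around a centromere.
markers = [(0, 0.0), (2_000_000, 8.0), (10_000_000, 8.0), (12_000_000, 14.0)]
for start, end, rate in recombination_rates(markers):
    print(f"{start:>10}-{end:<10} {rate:.2f} cM/Mb")
```

    When the rate differs from interval to interval like this, genetic position plotted against physical position is not a straight line, which is the non-linear relationship the abstract reports.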

  18. Annotated Draft Genome Assemblies for the Northern Bobwhite (Colinus virginianus) and the Scaled Quail (Callipepla squamata) Reveal Disparate Estimates of Modern Genome Diversity and Historic Effective Population Size.

    Science.gov (United States)

    Oldeschulte, David L; Halley, Yvette A; Wilson, Miranda L; Bhattarai, Eric K; Brashear, Wesley; Hill, Joshua; Metz, Richard P; Johnson, Charles D; Rollins, Dale; Peterson, Markus J; Bickhart, Derek M; Decker, Jared E; Sewell, John F; Seabury, Christopher M

    2017-09-07

    Northern bobwhite (Colinus virginianus; hereafter bobwhite) and scaled quail (Callipepla squamata) populations have suffered precipitous declines across most of their US ranges. Illumina-based first- (v1.0) and second- (v2.0) generation draft genome assemblies for the scaled quail and the bobwhite produced N50 scaffold sizes of 1.035 and 2.042 Mb, thereby producing a 45-fold improvement in contiguity over the existing bobwhite assembly, and ≥90% of the assembled genomes were captured within 1313 and 8990 scaffolds, respectively. The scaled quail assembly (v1.0 = 1.045 Gb) was ∼20% smaller than the bobwhite (v2.0 = 1.254 Gb), which was supported by kmer-based estimates of genome size. Nevertheless, estimates of GC content (41.72%; 42.66%), genome-wide repetitive content (10.40%; 10.43%), and MAKER-predicted protein coding genes (17,131; 17,165) were similar for the scaled quail (v1.0) and bobwhite (v2.0) assemblies, respectively. BUSCO analyses utilizing 3023 single-copy orthologs revealed a high level of assembly completeness for the scaled quail (v1.0; 84.8%) and the bobwhite (v2.0; 82.5%), as verified by comparison with well-established avian genomes. We also detected 273 putative segmental duplications in the scaled quail genome (v1.0), and 711 in the bobwhite genome (v2.0), including some that were shared among both species. Autosomal variant prediction revealed ∼2.48 and 4.17 heterozygous variants per kilobase within the scaled quail (v1.0) and bobwhite (v2.0) genomes, respectively, and estimates of historic effective population size were uniformly higher for the bobwhite across all time points in a coalescent model. However, large-scale declines were predicted for both species beginning ∼15-20 KYA. Copyright © 2017 Oldeschulte et al.
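
    The N50 scaffold size used above as a contiguity measure is the length of the scaffold at which the running total of scaffold lengths, taken from longest to shortest, first reaches half of the assembly. A minimal sketch, using made-up scaffold lengths rather than data from either assembly:

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Stop at the first scaffold that brings the total to half or more.
        if running * 2 >= total:
            return length
    raise ValueError("lengths must be non-empty")


# Hypothetical assembly of seven scaffolds totalling 300 units.
scaffolds = [80, 70, 50, 40, 30, 20, 10]
print(n50(scaffolds))
```

    Because the statistic depends only on the sorted lengths, joining scaffolds raises N50 even when the total assembly size is unchanged, which is why it is read as a measure of contiguity and not of completeness.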

  19. Annotated Draft Genome Assemblies for the Northern Bobwhite (Colinus virginianus) and the Scaled Quail (Callipepla squamata) Reveal Disparate Estimates of Modern Genome Diversity and Historic Effective Population Size

    Directory of Open Access Journals (Sweden)

    David L. Oldeschulte

    2017-09-01

    Northern bobwhite (Colinus virginianus; hereafter bobwhite) and scaled quail (Callipepla squamata) populations have suffered precipitous declines across most of their US ranges. Illumina-based first- (v1.0) and second- (v2.0) generation draft genome assemblies for the scaled quail and the bobwhite produced N50 scaffold sizes of 1.035 and 2.042 Mb, thereby producing a 45-fold improvement in contiguity over the existing bobwhite assembly, and ≥90% of the assembled genomes were captured within 1313 and 8990 scaffolds, respectively. The scaled quail assembly (v1.0 = 1.045 Gb) was ∼20% smaller than the bobwhite (v2.0 = 1.254 Gb), which was supported by kmer-based estimates of genome size. Nevertheless, estimates of GC content (41.72%; 42.66%), genome-wide repetitive content (10.40%; 10.43%), and MAKER-predicted protein coding genes (17,131; 17,165) were similar for the scaled quail (v1.0) and bobwhite (v2.0) assemblies, respectively. BUSCO analyses utilizing 3023 single-copy orthologs revealed a high level of assembly completeness for the scaled quail (v1.0; 84.8%) and the bobwhite (v2.0; 82.5%), as verified by comparison with well-established avian genomes. We also detected 273 putative segmental duplications in the scaled quail genome (v1.0), and 711 in the bobwhite genome (v2.0), including some that were shared among both species. Autosomal variant prediction revealed ∼2.48 and 4.17 heterozygous variants per kilobase within the scaled quail (v1.0) and bobwhite (v2.0) genomes, respectively, and estimates of historic effective population size were uniformly higher for the bobwhite across all time points in a coalescent model. However, large-scale declines were predicted for both species beginning ∼15–20 KYA.

  20. Scaling properties of a simplified bouncer model and of Chirikov's standard map

    International Nuclear Information System (INIS)

    Ladeira, Denis Gouvea; Silva, Jafferson Kamphorst Leal da

    2007-01-01

    Scaling properties of Chirikov's standard map are investigated by studying the average value of $I^2$, where $I$ is the action variable, for initial conditions in (a) the stability island and (b) the chaotic component. Scaling behavior appears in three regimes, defined by the value of the control parameter $K$: (i) the integrable to non-integrable transition ($K \sim 0$) and $K_c$ ($K_c \sim 0.9716$); (ii) the transition from limited to unlimited growth of $I^2$, $K \gtrsim K_c$; (iii) the regime of strong nonlinearity, $K \gg K_c$. Our scaling results are also applicable to Pustylnikov's bouncer model, since it is globally equivalent to the standard map. We also describe the scaling properties of a stochastic version of the standard map, which exhibits unlimited growth of $I^2$ even for small values of $K$.
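
    For readers unfamiliar with the map, the sketch below iterates the standard map in its usual textbook form, I' = I + K sin(theta), theta' = theta + I' (mod 2 pi), and averages I^2 over an ensemble of initial angles. The ensemble size, the evenly spaced initial angles, and the function names are assumptions; the abstract does not specify how the authors define their average.

```python
import math


def standard_map_step(theta, action, k):
    """Apply one iteration of the standard map to (theta, action)."""
    action = action + k * math.sin(theta)
    theta = (theta + action) % (2.0 * math.pi)
    return theta, action


def mean_square_action(k, n_steps, n_orbits=200, action0=0.0):
    """Ensemble average of I^2 after n_steps iterations.

    Orbits start at the same action and at evenly spaced angles
    (an assumed choice of initial conditions).
    """
    total = 0.0
    for j in range(n_orbits):
        theta = 2.0 * math.pi * (j + 0.5) / n_orbits
        action = action0
        for _ in range(n_steps):
            theta, action = standard_map_step(theta, action, k)
        total += action * action
    return total / n_orbits


# Compare a weakly and a strongly nonlinear value of the control parameter.
for k in (0.5, 10.0):
    print(k, mean_square_action(k, n_steps=100))
```

    Below the critical value the action of every orbit stays confined, so the average remains small, while far above it the average keeps growing with the number of iterations. That contrast is the limited versus unlimited growth the abstract refers to.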

  1. Mapping determinants of gene expression plasticity by genetical genomics in C. elegans.

    Directory of Open Access Journals (Sweden)

    Yang Li

    2006-12-01

    Recent genetical genomics studies have provided intimate views on gene regulatory networks. Gene expression variations between genetically different individuals have been mapped to the causal regulatory regions, termed expression quantitative trait loci. Whether the environment-induced plastic response of gene expression also shows heritable difference has not yet been studied. Here we show that differential expression induced by temperatures of 16 °C and 24 °C has a strong genetic component in Caenorhabditis elegans recombinant inbred strains derived from a cross between strains CB4856 (Hawaii) and N2 (Bristol). No less than 59% of 308 trans-acting genes showed a significant eQTL-by-environment interaction, here termed plasticity quantitative trait loci. In contrast, only 8% of an estimated 188 cis-acting genes showed such interaction. This indicates that heritable differences in plastic responses of gene expression are largely regulated in trans. This regulation is spread over many different regulators. However, for one group of trans-genes we found prominent evidence for a common master regulator: a transband of 66 coregulated genes appeared at 24 °C. Our results suggest widespread genetic variation of differential expression responses to environmental impacts and demonstrate the potential of genetical genomics for mapping the molecular determinants of phenotypic plasticity.

  2. Construction of a river buffalo (Bubalus bubalis) whole-genome radiation hybrid panel and preliminary RH mapping of chromosomes 3 and 10

    Directory of Open Access Journals (Sweden)

    J.E. Womack

    2010-02-01

    The buffalo (Bubalus bubalis) is not only a useful source of milk; it also provides meat and serves as a natural source of labor and biogas. To establish a project for buffalo genome mapping, a 5,000-rad whole-genome radiation hybrid panel was constructed for river buffalo and used to build preliminary RH maps for two chromosomes (BBU3 and BBU10). The preliminary maps contain 66 markers, including coding genes, cattle ESTs and microsatellite loci. The RH maps presented here are the starting point for mapping additional loci, in particular genes and expressed sequence tags, that will allow detailed comparative maps between buffalo, cattle and other species to be constructed. A large quantity of DNA has been prepared from the cell lines forming the RH panel reported here and will be made publicly available to the international community both for the study of chromosome evolution and for the improvement of traits important to the role of buffalo in animal agriculture.

  3. Cellular Factors Shape 3D Genome Landscape

    Science.gov (United States)

    Researchers, using novel large-scale imaging technology, have mapped the spatial location of individual genes in the nucleus of human cells and identified 50 cellular factors required for the proper 3D positioning of genes. These spatial locations play important roles in gene expression, DNA repair, genome stability, and other cellular activities.

  4. Self-Organization in Coupled Map Scale-Free Networks

    International Nuclear Information System (INIS)

    Xiao-Ming, Liang; Zong-Hua, Liu; Hua-Ping, Lü

    2008-01-01

    We study the self-organization of phase synchronization in coupled map scale-free networks with chaotic logistic map at each node and find that a variety of ordered spatiotemporal patterns emerge spontaneously in a regime of coupling strength. These ordered behaviours will change with the increase of the average links and are robust to both the system size and parameter mismatch. A heuristic theory is given to explain the mechanism of self-organization and to figure out the regime of coupling for the ordered spatiotemporal patterns
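
    As a rough illustration of the kind of system described above, the sketch below grows a scale-free network by preferential attachment and updates a chaotic logistic map at every node with diffusive coupling to its neighbours. The coupling form x_i(t+1) = (1 - eps) f(x_i) + (eps / k_i) * sum_j f(x_j), the Barabási-Albert growth rule, and all parameter values are common textbook choices assumed here; the abstract does not state the authors' exact model.

```python
import random


def logistic(x, a=4.0):
    """Logistic map, chaotic for a = 4."""
    return a * x * (1.0 - x)


def barabasi_albert(n, m, rng):
    """Grow an undirected network of n nodes by preferential attachment.

    Starts from a fully connected core of m + 1 nodes; every later node
    links to m distinct existing nodes chosen in proportion to degree.
    """
    neighbors = {i: set() for i in range(n)}
    stubs = []  # one entry per link end, so sampling is degree-biased
    for i in range(m + 1):
        for j in range(i + 1, m + 1):
            neighbors[i].add(j)
            neighbors[j].add(i)
            stubs += [i, j]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(stubs))
        for t in targets:
            neighbors[new].add(t)
            neighbors[t].add(new)
            stubs += [new, t]
    return neighbors


def step(state, neighbors, eps):
    """Advance every node once, mixing its own map with its neighbours'."""
    mapped = [logistic(x) for x in state]
    return [
        (1.0 - eps) * mapped[i]
        + eps * sum(mapped[j] for j in neighbors[i]) / len(neighbors[i])
        for i in range(len(state))
    ]


rng = random.Random(0)
net = barabasi_albert(n=50, m=2, rng=rng)
state = [rng.random() for _ in range(50)]
for _ in range(200):
    state = step(state, net, eps=0.3)
print(min(state), max(state))
```

    Scanning the coupling strength `eps` and the attachment parameter `m` (which sets the average number of links) is one way to explore how ordered patterns appear and change in such a model.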

  5. Molecular mapping and genomics of soybean seed protein: a review and perspective for the future.

    Science.gov (United States)

    Patil, Gunvant; Mian, Rouf; Vuong, Tri; Pantalone, Vince; Song, Qijian; Chen, Pengyin; Shannon, Grover J; Carter, Tommy C; Nguyen, Henry T

    2017-10-01

    Genetic improvement of soybean protein meal is a complex process because of negative correlation with oil, yield, and temperature. This review describes the progress in mapping and genomics, identifies knowledge gaps, and highlights the need of integrated approaches. Meal protein derived from soybean [Glycine max (L) Merr.] seed is the primary source of protein in poultry and livestock feed. Protein is a key factor that determines the nutritional and economical value of soybean. Genetic improvement of soybean seed protein content is highly desirable, and major quantitative trait loci (QTL) for soybean protein have been detected and repeatedly mapped on chromosomes (Chr.) 20 (LG-I), and 15 (LG-E). However, practical breeding progress is challenging because of seed protein content's negative genetic correlation with seed yield, other seed components such as oil and sucrose, and interaction with environmental effects such as temperature during seed development. In this review, we discuss rate-limiting factors related to soybean protein content and nutritional quality, and potential control factors regulating seed storage protein. In addition, we describe advances in next-generation sequencing technologies for precise detection of natural variants and their integration with conventional and high-throughput genotyping technologies. A syntenic analysis of QTL on Chr. 15 and 20 was performed. Finally, we discuss comprehensive approaches for integrating protein and amino acid QTL, genome-wide association studies, whole-genome resequencing, and transcriptome data to accelerate identification of genomic hot spots for allele introgression and soybean meal protein improvement.

  6. Using relational databases for improved sequence similarity searching and large-scale genomic analyses.

    Science.gov (United States)

    Mackey, Aaron J; Pearson, William R

    2004-10-01

    Relational databases are designed to integrate diverse types of information and manage large sets of search results, greatly simplifying genome-scale analyses. Relational databases are essential for management and analysis of large-scale sequence analyses, and can also be used to improve the statistical significance of similarity searches by focusing on subsets of sequence libraries most likely to contain homologs. This unit describes using relational databases to improve the efficiency of sequence similarity searching and to demonstrate various large-scale genomic analyses of homology-related data. This unit describes the installation and use of a simple protein sequence database, seqdb_demo, which is used as a basis for the other protocols. These include basic use of the database to generate a novel sequence library subset, how to extend and use seqdb_demo for the storage of sequence similarity search results and making use of various kinds of stored search results to address aspects of comparative genomic analysis.
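
    As a rough illustration of the "sequence library subset" idea described above, the sketch below uses Python's built-in sqlite3 module to pull the proteins of one taxon out of a relational store and write them as FASTA text. The two-table schema (taxon, protein), the column names and the helper names are assumptions made for this example only; the abstract does not describe the actual seqdb_demo schema.

```python
import sqlite3


def make_demo_db():
    """Create a tiny in-memory database (hypothetical schema, not seqdb_demo)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE taxon (
            taxon_id INTEGER PRIMARY KEY,
            name     TEXT NOT NULL
        );
        CREATE TABLE protein (
            acc      TEXT PRIMARY KEY,
            taxon_id INTEGER NOT NULL REFERENCES taxon(taxon_id),
            seq      TEXT NOT NULL
        );
        """
    )
    conn.executemany(
        "INSERT INTO taxon VALUES (?, ?)",
        [(1, "Escherichia coli"), (2, "Homo sapiens")],
    )
    conn.executemany(
        "INSERT INTO protein VALUES (?, ?, ?)",
        [("P1", 1, "MKT"), ("P2", 2, "MAA"), ("P3", 1, "MSL")],
    )
    return conn


def build_subset_fasta(conn, taxon_name):
    """Return FASTA text for every protein that belongs to one taxon."""
    rows = conn.execute(
        "SELECT p.acc, p.seq FROM protein AS p "
        "JOIN taxon AS t ON t.taxon_id = p.taxon_id "
        "WHERE t.name = ? ORDER BY p.acc",
        (taxon_name,),
    ).fetchall()
    return "".join(f">{acc}\n{seq}\n" for acc, seq in rows)


if __name__ == "__main__":
    print(build_subset_fasta(make_demo_db(), "Escherichia coli"), end="")
```

    Searching against such a subset rather than the full library means fewer comparisons per query, which is the mechanism the abstract points to for improving the statistical significance of true homologs.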

  7. The genome of Chenopodium quinoa

    KAUST Repository

    Jarvis, David Erwin; Ho, Yung Shwen; Lightfoot, Damien; Schmöckel, Sandra M.; Li, Bo; Borm, Theo J. A.; Ohyanagi, Hajime; Mineta, Katsuhiko; Michell, Craig; Saber, Noha; Kharbatia, Najeh M.; Rupper, Ryan R.; Sharp, Aaron R.; Dally, Nadine; Boughton, Berin A.; Woo, Yong; Gao, Ge; Schijlen, Elio G. W. M.; Guo, Xiujie; Momin, Afaque Ahmad Imtiyaz; Negrão, Sónia; Al-Babili, Salim; Gehring, Christoph A; Roessner, Ute; Jung, Christian; Murphy, Kevin; Arold, Stefan T.; Gojobori, Takashi; Linden, C. Gerard van der; Loo, Eibertus N. van; Jellen, Eric N.; Maughan, Peter J.; Tester, Mark A.

    2017-01-01

    Chenopodium quinoa (quinoa) is a highly nutritious grain identified as an important crop to improve world food security. Unfortunately, few resources are available to facilitate its genetic improvement. Here we report the assembly of a high-quality, chromosome-scale reference genome sequence for quinoa, which was produced using single-molecule real-time sequencing in combination with optical, chromosome-contact and genetic maps. We also report the sequencing of two diploids from the ancestral gene pools of quinoa, which enables the identification of sub-genomes in quinoa, and reduced-coverage genome sequences for 22 other samples of the allotetraploid goosefoot complex. The genome sequence facilitated the identification of the transcription factor likely to control the production of anti-nutritional triterpenoid saponins found in quinoa seeds, including a mutation that appears to cause alternative splicing and a premature stop codon in sweet quinoa strains. These genomic resources are an important first step towards the genetic improvement of quinoa.

  8. The genome of Chenopodium quinoa

    KAUST Repository

    Jarvis, David Erwin

    2017-02-08

    Chenopodium quinoa (quinoa) is a highly nutritious grain identified as an important crop to improve world food security. Unfortunately, few resources are available to facilitate its genetic improvement. Here we report the assembly of a high-quality, chromosome-scale reference genome sequence for quinoa, which was produced using single-molecule real-time sequencing in combination with optical, chromosome-contact and genetic maps. We also report the sequencing of two diploids from the ancestral gene pools of quinoa, which enables the identification of sub-genomes in quinoa, and reduced-coverage genome sequences for 22 other samples of the allotetraploid goosefoot complex. The genome sequence facilitated the identification of the transcription factor likely to control the production of anti-nutritional triterpenoid saponins found in quinoa seeds, including a mutation that appears to cause alternative splicing and a premature stop codon in sweet quinoa strains. These genomic resources are an important first step towards the genetic improvement of quinoa.

  9. The genome of Chenopodium quinoa.

    Science.gov (United States)

    Jarvis, David E; Ho, Yung Shwen; Lightfoot, Damien J; Schmöckel, Sandra M; Li, Bo; Borm, Theo J A; Ohyanagi, Hajime; Mineta, Katsuhiko; Michell, Craig T; Saber, Noha; Kharbatia, Najeh M; Rupper, Ryan R; Sharp, Aaron R; Dally, Nadine; Boughton, Berin A; Woo, Yong H; Gao, Ge; Schijlen, Elio G W M; Guo, Xiujie; Momin, Afaque A; Negrão, Sónia; Al-Babili, Salim; Gehring, Christoph; Roessner, Ute; Jung, Christian; Murphy, Kevin; Arold, Stefan T; Gojobori, Takashi; Linden, C Gerard van der; van Loo, Eibertus N; Jellen, Eric N; Maughan, Peter J; Tester, Mark

    2017-02-16

    Chenopodium quinoa (quinoa) is a highly nutritious grain identified as an important crop to improve world food security. Unfortunately, few resources are available to facilitate its genetic improvement. Here we report the assembly of a high-quality, chromosome-scale reference genome sequence for quinoa, which was produced using single-molecule real-time sequencing in combination with optical, chromosome-contact and genetic maps. We also report the sequencing of two diploids from the ancestral gene pools of quinoa, which enables the identification of sub-genomes in quinoa, and reduced-coverage genome sequences for 22 other samples of the allotetraploid goosefoot complex. The genome sequence facilitated the identification of the transcription factor likely to control the production of anti-nutritional triterpenoid saponins found in quinoa seeds, including a mutation that appears to cause alternative splicing and a premature stop codon in sweet quinoa strains. These genomic resources are an important first step towards the genetic improvement of quinoa.

  10. Restriction map of the single-stranded DNA genome of Kilham rat virus strain 171, a nondefective parvovirus

    International Nuclear Information System (INIS)

    Banerjee, P.T.; Rathrock, R.; Mitra, S.

    1981-01-01

    A physical map of Kilham rat virus strain 171 DNA was constructed by analyzing the sizes and locations of restriction endonuclease-generated fragments of the replicative-form viral DNA synthesized in vitro. BglI, KpnI, BamHI, SmaI, XhoI, and XorII did not appear to have any cleavage sites, whereas 11 other enzymes cleaved the genome at one to eight sites, and AluI generated more than 12 distinct fragments. The 30 restriction sites that were mapped were distributed randomly in the viral genome. A comparison of the restriction fragments of in vivo- and in vitro-replicated replicative-form DNAs showed that these DNAs were identical except in the size or configuration of the terminal fragments

  11. High resolution linkage maps of the model organism Petunia reveal substantial synteny decay with the related genome of tomato

    OpenAIRE

    Bossolini, Eligio; Klahre, Ulrich; Brandenburg, Anna; Reinhardt, Didier; Kuhlemeier, Cris

    2011-01-01

    Two linkage maps were constructed for the model plant Petunia. Mapping populations were obtained by crossing the wild species Petunia axillaris subsp. axillaris with Petunia inflata, and Petunia axillaris subsp. parodii with Petunia exserta. Both maps cover the seven chromosomes of Petunia, and span 970 centimorgans (cM) and 700 cM of the genomes, respectively. In total, 207 markers were mapped. Of these, 28 are multilocus amplified fragment length polymorphism (AFLP) markers and 179 are gene...

  12. G-MAPSEQ – a new method for mapping reads to a reference genome

    Directory of Open Access Journals (Sweden)

    Wojciechowski Pawel

    2016-06-01

    Mapping reads to a reference genome is one of the most essential problems in modern computational biology. The most popular algorithms used to solve this problem are based on the Burrows-Wheeler transform and the FM-index. However, these approaches have difficulty with highly mutated sequences because they allow only a limited number of mutations. G-MAPSEQ is a novel hybrid algorithm combining two methods: alignment-free sequence comparison and ultra-fast sequence alignment. The former is a fast heuristic algorithm which uses k-mer characteristics of nucleotide sequences to find potential mapping positions. The latter is a very fast GPU implementation of sequence alignment used to verify the correctness of these mapping positions. The source code of G-MAPSEQ, along with other bioinformatic software, is available at: http://gpualign.cs.put.poznan.pl.
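
    The two-stage idea described above (a k-mer heuristic proposes candidate positions, then an alignment step verifies them) can be sketched in a few lines of Python. This is not the G-MAPSEQ implementation: the function names, the vote threshold and the use of a plain Hamming-distance check in place of a GPU alignment are all assumptions chosen to keep the example short.

```python
from collections import Counter, defaultdict


def build_kmer_index(reference, k):
    """Map every k-mer in the reference to the list of positions where it starts."""
    index = defaultdict(list)
    for i in range(len(reference) - k + 1):
        index[reference[i:i + k]].append(i)
    return index


def candidate_positions(read, index, k, min_votes=2):
    """Stage 1: each shared k-mer votes for the read start position it implies."""
    votes = Counter()
    for j in range(len(read) - k + 1):
        for pos in index.get(read[j:j + k], ()):
            start = pos - j
            if start >= 0:
                votes[start] += 1
    # Best-supported candidates first; ties broken by position.
    return sorted(
        (s for s, v in votes.items() if v >= min_votes),
        key=lambda s: (-votes[s], s),
    )


def hamming_verify(read, reference, start, max_mismatches):
    """Stage 2 (simplified): accept a candidate if it has few enough mismatches."""
    window = reference[start:start + len(read)]
    if len(window) < len(read):
        return False
    return sum(a != b for a, b in zip(read, window)) <= max_mismatches


if __name__ == "__main__":
    ref = "ACGTTGCAAGGCTTACGGATCCATG"
    read = "AGGCTAACGG"  # ref[8:18] with one substitution
    idx = build_kmer_index(ref, 4)
    for start in candidate_positions(read, idx, 4):
        print(start, hamming_verify(read, ref, start, max_mismatches=1))
```

    Because the first stage only needs a few exact k-mer hits, a read can carry mismatches elsewhere and still be proposed as a candidate; the verification stage then decides whether it is accepted.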

  13. Genome-wide development and deployment of informative intron-spanning and intron-length polymorphism markers for genomics-assisted breeding applications in chickpea.

    Science.gov (United States)

    Srivastava, Rishi; Bajaj, Deepak; Sayal, Yogesh K; Meher, Prabina K; Upadhyaya, Hari D; Kumar, Rajendra; Tripathi, Shailesh; Bharadwaj, Chellapilla; Rao, Atmakuri R; Parida, Swarup K

    2016-11-01

    The discovery and large-scale genotyping of informative gene-based markers is essential for rapid delineation of genes/QTLs governing stress tolerance and yield component traits in order to drive genetic enhancement in chickpea. A genome-wide 119169 and 110491 ISM (intron-spanning markers) from 23129 desi and 20386 kabuli protein-coding genes and 7454 in silico InDel (insertion-deletion) (1-45-bp)-based ILP (intron-length polymorphism) markers from 3283 genes were developed that were structurally and functionally annotated on eight chromosomes and unanchored scaffolds of chickpea. A much higher amplification efficiency (83%) and intra-specific polymorphic potential (86%) detected by these markers than that of other sequence-based genetic markers among desi and kabuli chickpea accessions was apparent even by a cost-effective agarose gel-based assay. The genome-wide physically mapped 1718 ILP markers assayed a wider level of functional genetic diversity (19-81%) and well-defined phylogenetics among domesticated chickpea accessions. The gene-derived 1424 ILP markers were anchored on a high-density (inter-marker distance: 0.65cM) desi intra-specific genetic linkage map/functional transcript map (ICC 4958×ICC 2263) of chickpea. This reference genetic map identified six major genomic regions harbouring six robust QTLs mapped on five chromosomes, which explained 11-23% seed weight trait variation (7.6-10.5 LOD) in chickpea. The integration of high-resolution QTL mapping with differential expression profiling detected six including one potential serine carboxypeptidase gene with ILP markers (linked tightly to the major seed weight QTLs) exhibiting seed-specific expression as well as pronounced up-regulation especially in seeds of high (ICC 4958) as compared to low (ICC 2263) seed weight mapping parental accessions. The marker information generated in the present study was made publicly accessible through a user-friendly web-resource, "Chickpea ISM-ILP Marker Database

  14. Magnetic Multi-Scale Mapping to Characterize Anthropogenic Targets

    Science.gov (United States)

    Le Maire, P.; Munschy, M.

    2017-12-01

    The discovery of buried anthropogenic objects on construction sites can cause delays and/or danger for workers and the public. Indeed, every year 500 tons of unexploded ordnance are discovered in France. Magnetic measurements are useful for locating magnetized objects. Moreover, magnetic surveying is the cheapest geophysical method, does not impact the environment, and is relatively fast to perform. Fluxgate magnetometers (three components) are used to measure magnetic properties below the ground. These magnetic sensors are not absolute, so they need to be calibrated before measurements begin. The advantage is that they allow magnetic compensation of the equipment attached to the sensor. The choice of this kind of sensor therefore makes it possible to install the equipment aboard different magnetized supports: boat, quad bike, unmanned aerial vehicle, aircraft, etc. This methodology makes it possible to perform magnetic mapping at different scales and at different elevations above ground level. An old French military aviation site was chosen to test this multi-scale approach. The advantage of the site is that it contains many different targets of variable size and depth, e.g. buildings, unexploded ordnance from the two world wars, trenches and pipes. By comparing the magnetic anomaly maps acquired at different elevations, some of the geometric parameters of the magnetic sources can be characterized. The comparison between maps measured at different elevations and the prolonged map highlights the maximum distance for target detection (figure).

  15. A fast approach to generate large-scale topographic maps based on new Chinese vehicle-borne Lidar system

    International Nuclear Information System (INIS)

    Youmei, Han; Bogang, Yang

    2014-01-01

    Large-scale topographic maps provide important basic information for city and regional planning and management. Traditional large-scale mapping methods are mostly based on manual mapping and photogrammetry. Manual mapping is inefficient and limited by the environment. Photogrammetric methods (such as low-altitude aerial mapping) are an economical and effective way to map wide, regular areas at large scale, but they do not work well in small areas because of the high cost in manpower and resources. In recent years, vehicle-borne LIDAR technology has developed rapidly, and its application in surveying and mapping is becoming a new topic. The main objective of this investigation is to explore the potential of vehicle-borne LIDAR technology for the fast production of large-scale topographic maps, based on a new Chinese vehicle-borne LIDAR system. It studies how to use the measurement technology of this system to produce large-scale topographic maps. After field data capture, the maps can be produced in the office from the LIDAR data (point cloud) using software developed by the authors. In addition, the detailed process and an accuracy analysis are presented for an actual case. The results show that this new technology provides a fast new method for generating large-scale topographic maps that is highly efficient and accurate compared with traditional methods.

  16. The Global Genome Biodiversity Network (GGBN) Data Standard specification

    Science.gov (United States)

    Droege, G.; Barker, K.; Seberg, O.; Coddington, J.; Benson, E.; Berendsohn, W. G.; Bunk, B.; Butler, C.; Cawsey, E. M.; Deck, J.; Döring, M.; Flemons, P.; Gemeinholzer, B.; Güntsch, A.; Hollowell, T.; Kelbert, P.; Kostadinov, I.; Kottmann, R.; Lawlor, R. T.; Lyal, C.; Mackenzie-Dodds, J.; Meyer, C.; Mulcahy, D.; Nussbeck, S. Y.; O'Tuama, É.; Orrell, T.; Petersen, G.; Robertson, T.; Söhngen, C.; Whitacre, J.; Wieczorek, J.; Yilmaz, P.; Zetzsche, H.; Zhang, Y.; Zhou, X.

    2016-01-01

    Genomic samples of non-model organisms are becoming increasingly important in a broad range of studies from developmental biology, biodiversity analyses, to conservation. Genomic sample definition, description, quality, voucher information and metadata all need to be digitized and disseminated across scientific communities. This information needs to be concise and consistent in today's ever-increasing bioinformatic era, for complementary data aggregators to easily map databases to one another. In order to facilitate exchange of information on genomic samples and their derived data, the Global Genome Biodiversity Network (GGBN) Data Standard is intended to provide a platform based on a documented agreement to promote the efficient sharing and usage of genomic sample material and associated specimen information in a consistent way. The new data standard presented here builds upon existing standards commonly used within the community, extending them with the capability to exchange data on tissue, environmental and DNA samples as well as sequences. The GGBN Data Standard will reveal and democratize the hidden contents of biodiversity biobanks, for the convenience of everyone in the wider biobanking community. Technical tools exist for data providers to easily map their databases to the standard. Database URL: http://terms.tdwg.org/wiki/GGBN_Data_Standard PMID:27694206

  17. Human genetics and genomics a decade after the release of the draft sequence of the human genome

    Science.gov (United States)

    2011-01-01

    Substantial progress has been made in human genetics and genomics research over the past ten years since the publication of the draft sequence of the human genome in 2001. Findings emanating directly from the Human Genome Project, together with those from follow-on studies, have had an enormous impact on our understanding of the architecture and function of the human genome. Major developments have been made in cataloguing genetic variation, the International HapMap Project, and with respect to advances in genotyping technologies. These developments are vital for the emergence of genome-wide association studies in the investigation of complex diseases and traits. In parallel, the advent of high-throughput sequencing technologies has ushered in the 'personal genome sequencing' era for both normal and cancer genomes, and made possible large-scale genome sequencing studies such as the 1000 Genomes Project and the International Cancer Genome Consortium. The high-throughput sequencing and sequence-capture technologies are also providing new opportunities to study Mendelian disorders through exome sequencing and whole-genome sequencing. This paper reviews these major developments in human genetics and genomics over the past decade. PMID:22155605

  18. Research study on analysis/use technologies of genome information; Genome joho kaidoku riyo gijutsu no chosa kenkyu

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1997-03-01

    For wide use of genome information in the industrial field, the required R and D was surveyed from the standpoints of biology and information science. To clarify the present state and issues of the international research on genome analysis, the genome map as well as sequence and function information are first surveyed. The current analysis/use technologies of genome information are analyzed, and the following are summarized: prediction and identification of gene regions in genome sequences, techniques for searching and selecting useful genes, and techniques for predicting the expression of gene functions and the gene-product structure and functions. It is recommended that R and D and data collection/interpretation necessary to clarify inter-gene interactions and information networks should be promoted by integrating Japanese advanced know-how and technologies. As examples of the impact of the research results on industry and society, the present state and future expected effect are summarized for medicines, diagnosis/analysis instruments, chemicals, foods, agriculture, fishery, animal husbandry, electronics, environment and information. 278 refs., 42 figs., 5 tabs.

  19. An autotetraploid linkage map of rose (Rosa hybrida) validated using the strawberry (Fragaria vesca) genome sequence.

    Science.gov (United States)

    Gar, Oron; Sargent, Daniel J; Tsai, Ching-Jung; Pleban, Tzili; Shalev, Gil; Byrne, David H; Zamir, Dani

    2011-01-01

    Polyploidy is a pivotal process in plant evolution as it increases gene redundancy and morphological intricacy, but due to the complexity of polysomic inheritance we have only a few genetic maps of autopolyploid organisms. A robust mapping framework is particularly important in polyploid crop species, rose included (2n = 4x = 28), where the objective is to study multiallelic interactions that control traits of value for plant breeding. From a cross between the garden, peach red and fragrant cultivar Fragrant Cloud (FC) and a cut-rose yellow cultivar Golden Gate (GG), we generated an autotetraploid GGFC mapping population consisting of 132 individuals. For the map we used 128 sequence-based markers, 141 AFLP, 86 SSR and three morphological markers. Seven linkage groups were resolved for FC (total 632 cM) and GG (616 cM), which were validated by markers that segregated in both parents as well as by the diploid integrated consensus map. The release of the genome of Fragaria vesca, which also belongs to the Rosoideae, allowed us to place 70 rose sequenced markers on the seven strawberry pseudo-chromosomes. Synteny between Rosa and Fragaria was high, with an estimated four major translocations and six inversions required to place the 17 non-collinear markers in the same order. Based on a verified linear order of the rose markers, we could further partition each of the parents into its four homologous groups, thus providing an essential framework to aid the sequencing of an autotetraploid genome.

  20. Defining a Cancer Dependency Map | Office of Cancer Genomics

    Science.gov (United States)

    Most human epithelial tumors harbor numerous alterations, making it difficult to predict which genes are required for tumor survival. To systematically identify cancer dependencies, we analyzed 501 genome-scale loss-of-function screens performed in diverse human cancer cell lines. We developed DEMETER, an analytical framework that segregates on- from off-target effects of RNAi. 769 genes were differentially required in subsets of these cell lines at a threshold of six SDs from the mean.

  1. Objective and Comprehensive Evaluation of Bisulfite Short Read Mapping Tools

    Directory of Open Access Journals (Sweden)

    Hong Tran

    2014-01-01

    Background. Large-scale bisulfite treatment and short-read sequencing technology allow comprehensive estimation of the methylation states of Cs in the genomes of different tissues, cell types, and developmental stages. Accurate characterization of DNA methylation is essential for understanding genotype-phenotype association, gene and environment interaction, diseases, and cancer. Aligning bisulfite short reads to a reference genome has been a challenging task. We compared five bisulfite short read mapping tools, BSMAP, Bismark, BS-Seeker, BiSS, and BRAT-BW, representing two classes of mapping algorithms (hash table and suffix/prefix tries). We examined their mapping efficiency (i.e., the percentage of reads that can be mapped to the genomes), usability, running time, and the effects of changing default parameter settings using both real and simulated reads. We also investigated how preprocessing data might affect mapping efficiency. Conclusion. Among the five programs compared, in terms of mapping efficiency, Bismark performs the best on the real data, followed by BiSS, BSMAP, and finally BRAT-BW and BS-Seeker with very similar performance. If CPU time is not a constraint, Bismark is a good choice of program for mapping bisulfite-treated short reads. Data quality greatly affects mapping efficiency. Although increasing the number of mismatches allowed can increase mapping efficiency, it not only significantly slows down the program but also runs the risk of increasing false positives. Therefore, users should carefully set the related parameters depending on the quality of their sequencing data.
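
    To make the notion of mapping efficiency and the mismatch trade-off concrete, here is a toy Python sketch. It is not how any of the five tools work internally: it scans the forward strand only, by brute force, and the function names are invented for this example. The one bisulfite-specific rule it encodes is that a T in the read over a C in the reference is not counted as a mismatch, since unmethylated cytosines are read as thymines after bisulfite treatment.

```python
def bisulfite_mismatches(read, ref_window):
    """Count mismatches, tolerating read T over reference C (converted cytosine)."""
    if len(read) != len(ref_window):
        raise ValueError("read and reference window must have equal length")
    mismatches = 0
    for r, g in zip(read, ref_window):
        if r == g:
            continue
        if r == "T" and g == "C":
            continue  # consistent with an unmethylated, converted C
        mismatches += 1
    return mismatches


def mapping_efficiency(reads, reference, max_mismatches):
    """Percentage of reads with at least one acceptable placement (forward strand)."""
    if not reads:
        return 0.0
    mapped = 0
    for read in reads:
        n = len(read)
        if any(
            bisulfite_mismatches(read, reference[i:i + n]) <= max_mismatches
            for i in range(len(reference) - n + 1)
        ):
            mapped += 1
    return 100.0 * mapped / len(reads)


if __name__ == "__main__":
    reference = "ACGTCCGATTCGA"
    reads = ["ATGTT", "GATTT", "GGGGG", "AAAAA"]
    for allowed in (0, 5):
        print(allowed, mapping_efficiency(reads, reference, allowed))
```

    In this toy data set two of the four reads are genuine converted fragments and two are junk. With no mismatches allowed only the genuine reads map; with five mismatches allowed every read "maps", which illustrates how a looser threshold raises efficiency by admitting false positives.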

  2. Integration of linkage maps for the Amphidiploid Brassica napus and comparative mapping with Arabidopsis and Brassica rapa

    Directory of Open Access Journals (Sweden)

    Delourme Régine

    2011-02-01

    Background. The large number of genetic linkage maps representing Brassica chromosomes constitute a potential platform for studying crop traits and genome evolution within Brassicaceae. However, the alignment of existing maps remains a major challenge. The integration of these genetic maps will enhance genetic resolution, and provide a means to navigate between sequence-tagged loci, and with contiguous genome sequences as these become available. Results. We report the first genome-wide integration of Brassica maps based on an automated pipeline which involved collation of genome-wide genotype data for sequence-tagged markers scored on three extensively used amphidiploid Brassica napus (2n = 38) populations. Representative markers were selected from consolidated maps for each population, and skeleton bin maps were generated. The skeleton maps for the three populations were then combined to generate an integrated map for each LG, comparing two different approaches, one encapsulated in JoinMap and the other in MergeMap. The BnaWAIT_01_2010a integrated genetic map was generated using JoinMap, and includes 5,162 genetic markers mapped onto 2,196 loci, with a total genetic length of 1,792 cM. The map density of one locus every 0.82 cM, corresponding to 515 Kbp, increases by at least three-fold the locus and marker density within the original maps. Within the B. napus integrated map we identified 103 conserved collinearity blocks relative to Arabidopsis, including five previously unreported blocks. The BnaWAIT_01_2010a map was used to investigate the integrity and conservation of order proposed for genome sequence scaffolds generated from the constituent A genome of Brassica rapa. Conclusions. Our results provide a comprehensive genetic integration of the B. napus genome from a range of sources, which we anticipate will provide valuable information for rapeseed and Canola research.
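
    The density figures quoted above can be checked with simple arithmetic. The short Python sketch below assumes the density is the plain ratio of total genetic length to number of loci (the abstract does not state the exact formula), and the helper names are hypothetical. The physical span it derives is an implied quantity, not a value given in the abstract.

```python
def map_density_cm_per_locus(total_cm, n_loci):
    """Average genetic distance per locus, as a simple ratio."""
    return total_cm / n_loci


def implied_physical_span_mbp(n_loci, kbp_per_locus):
    """Physical span implied by an average physical distance per locus."""
    return n_loci * kbp_per_locus / 1000.0


if __name__ == "__main__":
    # Figures taken from the abstract: 1,792 cM, 2,196 loci, 515 Kbp per locus.
    print(round(map_density_cm_per_locus(1792, 2196), 2))  # cM per locus
    print(implied_physical_span_mbp(2196, 515))            # Mbp, derived
```

    The ratio 1,792 / 2,196 rounds to the 0.82 cM per locus reported, and 2,196 loci at 515 Kbp each imply a span of roughly 1,131 Mbp.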

  3. Reconstruction and analysis of a genome-scale metabolic model for Scheffersomyces stipitis

    Directory of Open Access Journals (Sweden)

    Balagurunathan Balaji

    2012-02-01

    Background. Fermentation of xylose, the major component in hemicellulose, is essential for economic conversion of lignocellulosic biomass to fuels and chemicals. The yeast Scheffersomyces stipitis (formerly known as Pichia stipitis) has the highest known native capacity for xylose fermentation and possesses several genes for lignocellulose bioconversion in its genome. Understanding the metabolism of this yeast at a global scale, by reconstructing the genome-scale metabolic model, is essential for manipulating its metabolic capabilities and for successful transfer of its capabilities to other industrial microbes. Results. We present a genome-scale metabolic model for Scheffersomyces stipitis, a native xylose-utilizing yeast. The model was reconstructed based on genome sequence annotation, detailed experimental investigation and known yeast physiology. Macromolecular composition of Scheffersomyces stipitis biomass was estimated experimentally and its ability to grow on different carbon, nitrogen, sulphur and phosphorus sources was determined by phenotype microarrays. The compartmentalized model, developed based on an iterative procedure, accounted for 814 genes, 1371 reactions, and 971 metabolites. In silico computed growth rates were compared with high-throughput phenotyping data and the model could predict the qualitative outcomes in 74% of substrates investigated. Model simulations were used to identify the biosynthetic requirements for anaerobic growth of Scheffersomyces stipitis on glucose and the results were validated with published literature. The bottlenecks in the Scheffersomyces stipitis metabolic network for xylose uptake and nucleotide cofactor recycling were identified by in silico flux variability analysis. The scope of the model in enhancing the mechanistic understanding of microbial metabolism is demonstrated by identifying a mechanism for mitochondrial respiration and oxidative phosphorylation. Conclusion. The genome-scale

  4. Identification of independent association signals and putative functional variants for breast cancer risk through fine-scale mapping of the 12p11 locus

    OpenAIRE

    Zeng, Chenjie; Guo, Xingyi; Long, Jirong; Kuchenbaecker, Karoline B.; Droit, Arnaud; Michailidou, Kyriaki; Ghoussaini, Maya; Kar, Siddhartha; Freeman, Adam; Hopper, John L.; Milne, Roger L.; Bolla, Manjeet K.; Wang, Qin; Dennis, Joe; Agata, Simona

    2016-01-01

    Background: Multiple recent genome-wide association studies (GWAS) have identified a single nucleotide polymorphism (SNP), rs10771399, at 12p11 that is associated with breast cancer risk. Method: We performed a fine-scale mapping study of a 700 kb region including 441 genotyped and more than 1300 imputed genetic variants in 48,155 cases and 43,612 controls of European descent, 6269 cases and 6624 controls of East Asian descent and 1116 cases and 932 controls of African descent in the Breast C...

  5. Progressive Amalgamation of Building Clusters for Map Generalization Based on Scaling Subgroups

    Directory of Open Access Journals (Sweden)

    Xianjin He

    2018-03-01

    Map generalization utilizes transformation operations to derive smaller-scale maps from larger-scale maps, and is a key procedure for the modelling and understanding of geographic space. Studies to date have largely applied a fixed tolerance to aggregate clustered buildings into a single object, resulting in the loss of details that meet cartographic constraints and may be of importance for users. This study aims to develop a method that amalgamates clustered buildings gradually without significant modification of geometry, while preserving the map details as much as possible under cartographic constraints. The amalgamation process consists of three key steps. First, individual buildings are grouped into distinct clusters by using the graph-based spatial clustering application with random forest (GSCARF) method. Second, building clusters are decomposed into scaling subgroups according to homogeneity with regard to the mean distance of subgroups. Thus, hierarchies of building clusters can be derived based on scaling subgroups. Finally, an amalgamation operation is progressively performed from the bottom-level subgroups to the top-level subgroups using the maximum distance of each subgroup as the amalgamating tolerance instead of using a fixed tolerance. As a consequence of this step, generalized intermediate scaling results are available, which can form the multi-scale representation of buildings. The experimental results show that the proposed method can generate amalgams with correct details, statistical area balance and orthogonal shape while satisfying cartographic constraints (e.g., minimum distance and minimum area).

  6. Metingear: a development environment for annotating genome-scale metabolic models.

    Science.gov (United States)

    May, John W; James, A Gordon; Steinbeck, Christoph

    2013-09-01

    Genome-scale metabolic models often lack annotations that would allow them to be used for further analysis. Previous efforts have focused on associating metabolites in the model with a cross reference, but this can be problematic if the reference is not freely available, multiple resources are used or the metabolite is added from a literature review. Associating each metabolite with chemical structure provides unambiguous identification of the components and a more detailed view of the metabolism. We have developed an open-source desktop application that simplifies the process of adding database cross references and chemical structures to genome-scale metabolic models. Annotated models can be exported to the Systems Biology Markup Language open interchange format. Source code, binaries, documentation and tutorials are freely available at http://johnmay.github.com/metingear. The application is implemented in Java with bundles available for MS Windows and Macintosh OS X.

  7. Constant-scale natural boundary mapping to reveal global and cosmic processes

    CERN Document Server

    Clark, Pamela Elizabeth

    2013-01-01

    Whereas conventional maps can be expressed as outward-expanding formulae with well-defined central features and relatively poorly defined edges, Constant Scale Natural Boundary (CSNB) maps have well-defined boundaries that result from natural processes and thus allow spatial and dynamic relationships to be observed in a new way useful to understanding these processes. CSNB mapping presents a new approach to visualization that produces maps markedly different from those produced by conventional cartographic methods. In this approach, any body can be represented by a 3D coordinate system. For a regular body, with its surface relatively smooth on the scale of its size, locations of features can be represented by definite geographic grid (latitude and longitude) and elevation, or deviation from the triaxial ellipsoid defined surface. A continuous surface on this body can be segmented, its distinctive regional terranes enclosed, and their inter-relationships defined, by using selected morphologically identifiable ...

  8. The wolf reference genome sequence (Canis lupus lupus) and its implications for Canis spp. population genomics

    DEFF Research Database (Denmark)

    Gopalakrishnan, Shyam; Samaniego Castruita, Jose Alfredo; Sinding, Mikkel Holger Strander

    2017-01-01

    Background: An increasing number of studies are addressing the evolutionary genomics of dog domestication, principally through resequencing dog, wolf and related canid genomes. There is, however, only one de novo assembled canid genome currently available against which to map such data - that of a ... that regardless of the reference genome choice, most evolutionary genomic analyses yield qualitatively similar results, including those exploring the structure between the wolves and dogs using admixture and principal component analysis. However, we do observe differences in the genomic coverage of re-mapped...

  9. Construction of chromosomal recombination maps of three genomes of lilies (Lilium) based on GISH analysis.

    NARCIS (Netherlands)

    Nadeem Khan, M.; Shujun Zhou,; Barba Gonzalez, R.; Ramanna, M.S.; Visser, R.G.F.; Tuyl, van J.M.

    2009-01-01

    Chromosomal recombination maps were constructed for three genomes of lily (Lilium) using GISH analyses. For this purpose, the backcross (BC) progenies of two diploid (2n = 2x = 24) interspecific hybrids of lily, viz. Longiflorum × Asiatic (LA) and Oriental × Asiatic (OA), were used. Mostly the BC

  10. Selecting Appropriate Spatial Scale for Mapping Plastic-Mulched Farmland with Satellite Remote Sensing Imagery

    Directory of Open Access Journals (Sweden)

    Hasituya

    2017-03-01

    In recent years, the area of plastic-mulched farmland (PMF) has undergone rapid growth and raised considerable environmental problems. Mapping the PMF therefore plays a crucial role in agricultural production, environmental protection and resource management. However, appropriate data selection criteria are currently lacking. Thus, this study was carried out in two main plastic-mulching practice regions, Jizhou and Guyuan, to look for an appropriate spatial scale for mapping PMF with remote sensing. The average local variance (ALV) function was used to obtain the appropriate spatial scale for mapping PMF based on GaoFen-1 (GF-1) satellite imagery. Afterwards, in order to validate the effectiveness of the selected method and to interpret the relationship between the appropriate spatial scale derived from the ALV and the spatial scale with the highest classification accuracy, we classified the imagery at varying spatial resolutions with the Support Vector Machine (SVM) algorithm using the spectral features, the textural features, and the combined spectral and textural features, respectively. The results indicated that the appropriate spatial scales from the ALV lie between 8 m and 20 m for mapping the PMF in both Jizhou and Guyuan. However, there is a proportional relation: the spatial scale with the highest classification accuracy is at the 1/2 location of the appropriate spatial scale generated from the ALV in Jizhou and at the 2/3 location of the appropriate spatial scale generated from the ALV in Guyuan. Therefore, the ALV method for quantitatively selecting the appropriate spatial scale for mapping PMF with remote sensing imagery has theoretical and practical significance.
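
    The ALV idea in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the 3 x 3 moving window, the block-averaging used to coarsen the image, and the function names `aggregate`, `average_local_variance` and `alv_curve` are all assumptions made for this example, and the input is a synthetic checkerboard rather than GF-1 imagery.

```python
from statistics import pvariance


def aggregate(image, factor):
    """Coarsen a 2-D grid by averaging non-overlapping factor x factor blocks."""
    rows = len(image) // factor
    cols = len(image[0]) // factor
    out = []
    for r in range(rows):
        row = []
        for c in range(cols):
            block = [image[r * factor + i][c * factor + j]
                     for i in range(factor) for j in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def average_local_variance(image, window=3):
    """Mean of the variance inside every full window x window neighbourhood."""
    rows, cols = len(image), len(image[0])
    variances = []
    for r in range(rows - window + 1):
        for c in range(cols - window + 1):
            patch = [image[r + i][c + j]
                     for i in range(window) for j in range(window)]
            variances.append(pvariance(patch))
    return sum(variances) / len(variances)


def alv_curve(image, factors):
    """ALV of the image at each aggregation factor (coarser pixel size)."""
    return {f: average_local_variance(aggregate(image, f)) for f in factors}


# Synthetic scene (assumption): a checkerboard whose "objects" are 4 x 4 pixels.
scene = [[float(((r // 4) + (c // 4)) % 2) for c in range(24)] for r in range(24)]
curve = alv_curve(scene, [1, 2, 4, 8])
for factor, alv in curve.items():
    print(factor, round(alv, 4))
```

    In this toy scene the ALV is higher when the pixel size is close to the object size than at the original resolution, and it drops to zero once each pixel averages over several objects, which is the kind of behaviour a scale-selection curve is read for.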

  11. Construction and Analysis of Siberian Tiger Bacterial Artificial Chromosome Library with Approximately 6.5-Fold Genome Equivalent Coverage

    Science.gov (United States)

    Liu, Changqing; Bai, Chunyu; Guo, Yu; Liu, Dan; Lu, Taofeng; Li, Xiangchen; Ma, Jianzhang; Ma, Yuehui; Guan, Weijun

    2014-01-01

    Bacterial artificial chromosome (BAC) libraries are extremely valuable for the genome-wide genetic dissection of complex organisms. The Siberian tiger, one of the best-known wild primitive carnivores in China, is an endangered animal. In order to promote research on its genome, a high-redundancy BAC library of the Siberian tiger was constructed and characterized. The library is divided into two sub-libraries prepared from blood cells and two sub-libraries prepared from fibroblasts. This BAC library contains 153,600 individually archived clones; for PCR-based screening of the library, BACs were placed into 40 superpools of 10 × 384-deep well microplates. The average insert size of BAC clones was estimated to be 116.5 kb, representing approximately 6.46 genome equivalents of the haploid genome and affording a 98.86% statistical probability of obtaining at least one clone containing a unique DNA sequence. Screening the library with 19 microsatellite markers and an SRY sequence revealed that each of these markers was present in the library; the average number of positive clones per marker was 6.74 (range 2 to 12), consistent with 6.46-fold coverage of the tiger genome. Additionally, we identified 72 microsatellite markers that could potentially be used as genetic markers. This BAC library will serve as a valuable resource for physical mapping, comparative genomic study and large-scale genome sequencing in the tiger. PMID:24608928
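
    The coverage arithmetic in the abstract above can be sketched as follows. The haploid genome size is not stated in the abstract; the value used here (about 2.77 Gb) is an assumption back-calculated from the reported clone count, insert size and 6.46 genome equivalents. The probability is the standard Clarke-Carbon estimate, and it need not equal the 98.86% reported above, since the authors' exact inputs (for example, how empty clones were treated) are not given here.

```python
import math

N_CLONES = 153_600          # from the abstract
MEAN_INSERT_KB = 116.5      # from the abstract
GENOME_SIZE_KB = 2.77e6     # ASSUMPTION: back-calculated, not stated in the abstract


def genome_equivalents(n_clones, mean_insert_kb, genome_size_kb):
    """Library coverage: total cloned DNA divided by haploid genome size."""
    return n_clones * mean_insert_kb / genome_size_kb


def prob_sequence_represented(n_clones, mean_insert_kb, genome_size_kb):
    """Clarke-Carbon estimate: P = 1 - (1 - insert/genome) ** N."""
    fraction = mean_insert_kb / genome_size_kb
    return 1.0 - (1.0 - fraction) ** n_clones


coverage = genome_equivalents(N_CLONES, MEAN_INSERT_KB, GENOME_SIZE_KB)
probability = prob_sequence_represented(N_CLONES, MEAN_INSERT_KB, GENOME_SIZE_KB)
print(f"coverage: {coverage:.2f} genome equivalents")
print(f"Clarke-Carbon probability: {probability:.4f}")
# For large libraries the estimate is close to the Poisson form 1 - exp(-coverage).
print(f"Poisson approximation: {1 - math.exp(-coverage):.4f}")
```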

  12. Detailed geomorphological map sheet Bela Palanka at scale 1:100,000

    Directory of Open Access Journals (Sweden)

    Menković Ljubomir

    2011-01-01

    The Geomorphological Map Sheet Bela Palanka is a graphical representation of landforms in the area covered by the Topographical Map Sheet Bela Palanka at scale 1:100,000. The map was published in 2008 by the Serbian Academy of Sciences and Arts (SASA) and the SASA Geodynamics Board. It is the first detailed geomorphological map edited in Serbia. This paper presents the methods used in preparing the geomorphological map, the contents and the mode of data presentation, the geologic structure, the genetic types of landforms and their subtypes, and the geomorphological history since the Neogene.

  13. Microarray Data Processing Techniques for Genome-Scale Network Inference from Large Public Repositories.

    Science.gov (United States)

    Chockalingam, Sriram; Aluru, Maneesha; Aluru, Srinivas

    2016-09-19

    Pre-processing of microarray data is a well-studied problem. Furthermore, all popular platforms come with their own recommended best practices for differential analysis of genes. However, for genome-scale network inference using microarray data collected from large public repositories, these methods filter out a considerable number of genes. This is primarily due to the effects of aggregating a diverse array of experiments with different technical and biological scenarios. Here we introduce a pre-processing pipeline suitable for inferring genome-scale gene networks from large microarray datasets. We show that partitioning of the available microarray datasets according to biological relevance into tissue- and process-specific categories significantly extends the limits of downstream network construction. We demonstrate the effectiveness of our pre-processing pipeline by inferring genome-scale networks for the model plant Arabidopsis thaliana using two different construction methods and a collection of 11,760 Affymetrix ATH1 microarray chips. Our pre-processing pipeline and the datasets used in this paper are made available at http://alurulab.cc.gatech.edu/microarray-pp.

  14. Zebrafish syntenic relationship to human/mouse genomes revealed by radiation hybrid mapping

    International Nuclear Information System (INIS)

    Samonte, Irene E.

    2007-01-01

    Zebrafish (Danio rerio) is an excellent model system for vertebrate developmental analysis and a new model for human disorders. In this study, however, zebrafish was used to determine its syntenic relationship to human/mouse genomes using the zebrafish-hamster radiation hybrid panel. The focus was on genes residing on chromosomes 6 and 17 of human and mouse, respectively, and some other genes of either immunologic or evolutionary importance. Gene sequences of interest and zebrafish expressed sequence tags deposited in the GenBank were used in identifying zebrafish homologs. Polymerase chain reaction (PCR) amplification, cloning and subcloning, sequencing, and phylogenetic analysis were done to confirm the homology of the candidate genes in zebrafish. The promising markers were then tested in the 94 zebrafish-hamster radiation hybrid panel cell lines and submitted for logarithm of the odds (LOD) score analysis to position genes on the zebrafish map. A total of 19 loci were successfully mapped to zebrafish linkage groups 1, 14, 15, 19, and 20. Four of these loci were positioned in linkage group 20, whereas, 3 more loci were added in linkage group 19, thus increasing to 34 loci the number of human genes syntenic to the group. With the sequencing of the zebrafish genome, about 20 more MHC genes were reported linked on the same group. (Author)

  15. Osteoponin Promoter Controlled by DNA Methylation: Aberrant Methylation in Cloned Porcine Genome

    Directory of Open Access Journals (Sweden)

    Chih-Jie Shen

    2014-01-01

    Cloned animals usually exhibit many defects in physical characteristics or aberrant epigenetic reprogramming, especially in the development of some important organs. Osteoponin (OPN) is an extracellular-matrix protein involved in heart and bone development and diseases. In this study, we investigated the correlation between OPN mRNA and its promoter methylation changes using 5-aza-dc treatment in fibroblast cells and a promoter assay. Aberrant methylation of porcine OPN was frequently found in different tissues of pigs cloned by somatic nuclear transfer, and bisulfite sequence data suggested that the OPN promoter region from −2615 to −2239 nucleotides (nt) may be a crucial regulatory DNA element. In the pig ear fibroblast cell culture study, demethylation of the OPN promoter occurred in a dose-dependent response to 5-aza-dc treatment and was followed by OPN mRNA re-expression. In the cloned pig study, a discrepant expression pattern was identified in several cloned pig tissues, especially in brain, heart, and ear. Promoter assay data revealed that four methylated CpG sites present in the −2615 to −2239 nt region cause significant downregulation of OPN promoter activity. These data suggest that methylation in the OPN promoter plays a crucial role in the regulation of OPN expression that we found in the cloned pig genome.

  16. Complete genome sequence of Bacillus amyloliquefaciens strain Co1-6, a plant growth-promoting rhizobacterium of Calendula officinalis

    Energy Technology Data Exchange (ETDEWEB)

    Koeberl, Martina; White, Richard A.; Erschen, Sabine; Spanberger, Nora; El-Arabi, Tarek F.; Jansson, Janet K.; Berg, Gabriele

    2015-08-13

    The genome sequence of Bacillus amyloliquefaciens strain Co1-6, a plant growth-promoting rhizobacterium (PGPR) with broad-spectrum antagonistic activities against plant pathogenic fungi, bacteria and nematodes, consists of a single 3.9 Mb circular chromosome. The genome reveals genes putatively responsible for its promising biocontrol and PGP properties.

  17. Large-Scale Sequencing: The Future of Genomic Sciences Colloquium

    Energy Technology Data Exchange (ETDEWEB)

    Margaret Riley; Merry Buckley

    2009-01-01

    Genetic sequencing and the various molecular techniques it has enabled have revolutionized the field of microbiology. Examining and comparing the genetic sequences borne by microbes - including bacteria, archaea, viruses, and microbial eukaryotes - provides researchers insights into the processes microbes carry out, their pathogenic traits, and new ways to use microorganisms in medicine and manufacturing. Until recently, sequencing entire microbial genomes has been laborious and expensive, and the decision to sequence the genome of an organism was made on a case-by-case basis by individual researchers and funding agencies. Now, thanks to new technologies, the cost and effort of sequencing is within reach for even the smallest facilities, and the ability to sequence the genomes of a significant fraction of microbial life may be possible. The availability of numerous microbial genomes will enable unprecedented insights into microbial evolution, function, and physiology. However, the current ad hoc approach to gathering sequence data has resulted in an unbalanced and highly biased sampling of microbial diversity. A well-coordinated, large-scale effort to target the breadth and depth of microbial diversity would result in the greatest impact. The American Academy of Microbiology convened a colloquium to discuss the scientific benefits of engaging in a large-scale, taxonomically-based sequencing project. A group of individuals with expertise in microbiology, genomics, informatics, ecology, and evolution deliberated on the issues inherent in such an effort and generated a set of specific recommendations for how best to proceed. The vast majority of microbes are presently uncultured and, thus, pose significant challenges to such a taxonomically-based approach to sampling genome diversity. However, we have yet to even scratch the surface of the genomic diversity among cultured microbes. 
    A coordinated sequencing effort of cultured organisms is an appropriate place to begin

  18. [The intervention mapping protocol: A structured process to develop, implement and evaluate health promotion programs].

    Science.gov (United States)

    Fassier, J-B; Lamort-Bouché, M; Sarnin, P; Durif-Bruckert, C; Péron, J; Letrilliart, L; Durand, M-J

    2016-02-01

    Health promotion programs are expected to improve population health and reduce social inequalities in health. However, their theoretical foundations are frequently ill-defined, and their implementation faces many obstacles. The aim of this article is to describe the intervention mapping protocol in health promotion programs planning, used recently in several countries. The challenges of planning health promotion programs are presented, and the six steps of the intervention mapping protocol are described with an example. Based on a literature review, the use of this protocol, its requirements and potential limitations are discussed. The intervention mapping protocol has four essential characteristics: an ecological perspective (person-environment), a participative approach, the use of theoretical models in human and social sciences and the use of scientific evidence. It comprises six steps: conduct a health needs assessment, define change objectives, select theory-based change techniques and practical applications, organize techniques and applications into an intervention program (logic model), plan for program adoption, implementation, and sustainability, and generate an evaluation plan. This protocol was used in different countries and domains such as obesity, tobacco, physical activity, cancer and occupational health. Although its utilization requires resources and a critical stance, this protocol was used to develop interventions which efficacy was demonstrated. The intervention mapping protocol is an integrated process that fits the scientific and practical challenges of health promotion. It could be tested in France as it was used in other countries, in particular to reduce social inequalities in health. Copyright © 2016. Published by Elsevier Masson SAS.

  19. Genome sequencing of bacteria: sequencing, de novo assembly and rapid analysis using open source tools.

    Science.gov (United States)

    Kisand, Veljo; Lettieri, Teresa

    2013-04-01

    De novo genome sequencing of previously uncharacterized microorganisms has the potential to open up new frontiers in microbial genomics by providing insight into both functional capabilities and biodiversity. Until recently, Roche 454 pyrosequencing was the NGS method of choice for de novo assembly because it generates hundreds of thousands of long reads (tools for processing NGS data are increasingly free and open source and are often adopted for both their high quality and role in promoting academic freedom. The error rate of pyrosequencing the Alcanivorax borkumensis genome was such that thousands of insertions and deletions were artificially introduced into the finished genome. Despite a high coverage (~30 fold), it did not allow the reference genome to be fully mapped. Reads from regions with errors had low quality, low coverage, or were missing. The main defect of the reference mapping was the introduction of artificial indels into contigs through lower than 100% consensus and distracting gene calling due to artificial stop codons. No assembler was able to perform de novo assembly comparable to reference mapping. Automated annotation tools performed similarly on reference mapped and de novo draft genomes, and annotated most CDSs in the de novo assembled draft genomes. Free and open source software (FOSS) tools for assembly and annotation of NGS data are being developed rapidly to provide accurate results with less computational effort. Usability is not high priority and these tools currently do not allow the data to be processed without manual intervention. Despite this, genome assemblers now readily assemble medium short reads into long contigs (>97-98% genome coverage). A notable gap in pyrosequencing technology is the quality of base pair calling and conflicting base pairs between single reads at the same nucleotide position. Regardless, using draft whole genomes that are not finished and remain fragmented into tens of contigs allows one to characterize

  20. KAIKObase: An integrated silkworm genome database and data mining tool

    Directory of Open Access Journals (Sweden)

    Nagaraju Javaregowda

    2009-10-01

    Background: The silkworm, Bombyx mori, is one of the most economically important insects in many developing countries owing to its large-scale cultivation for silk production. With the development of genomic and biotechnological tools, B. mori has also become an important bioreactor for production of various recombinant proteins of biomedical interest. In 2004, two genome sequencing projects for B. mori were reported independently by Chinese and Japanese teams; however, the datasets were insufficient for building long genomic scaffolds, which are essential for unambiguous annotation of the genome. Now, both datasets have been merged and assembled through a joint collaboration between the two groups. Description: Integration of the two data sets of silkworm whole-genome-shotgun sequencing by the Japanese and Chinese groups, together with newly obtained fosmid- and BAC-end sequences, produced the best continuity (~3.7 Mb in N50 scaffold size) among the sequenced insect genomes and provided a high degree of nucleotide coverage (88%) of all 28 chromosomes. In addition, a physical map of BAC contigs constructed by fingerprinting BAC clones and a SNP linkage map constructed using BAC-end sequences were available. In parallel, proteomic data from two-dimensional polyacrylamide gel electrophoresis in various tissues and developmental stages were compiled into a silkworm proteome database. Finally, a Bombyx trap database was constructed for documenting insertion positions and expression data of transposon insertion lines. Conclusion: For efficient usage of genome information for functional studies, genomic sequences, physical and genetic map information and EST data were compiled into KAIKObase, an integrated silkworm genome database which consists of 4 map viewers, a gene viewer, and sequence, keyword and position search systems to display results and data at the level of nucleotide sequence, gene, scaffold and chromosome. Integration of the

  1. GAAP: Genome-organization-framework-Assisted Assembly Pipeline for prokaryotic genomes.

    Science.gov (United States)

    Yuan, Lina; Yu, Yang; Zhu, Yanmin; Li, Yulai; Li, Changqing; Li, Rujiao; Ma, Qin; Siu, Gilman Kit-Hang; Yu, Jun; Jiang, Taijiao; Xiao, Jingfa; Kang, Yu

    2017-01-25

    Next-generation sequencing (NGS) technologies have greatly promoted the genomic study of prokaryotes. However, highly fragmented assemblies due to short reads from NGS are still a limiting factor in gaining insights into the genome biology. Reference-assisted tools are promising in genome assembly, but tend to result in false assembly when the assigned reference has extensive rearrangements. Herein, we present GAAP, a genome assembly pipeline for scaffolding based on core-gene-defined Genome Organizational Framework (cGOF) described in our previous study. Instead of assigning references, we use the multiple-reference-derived cGOFs as indexes to assist in order and orientation of the scaffolds and build a skeleton structure, and then use read pairs to extend scaffolds, called local scaffolding, and distinguish between true and chimeric adjacencies in the scaffolds. In our performance tests using both empirical and simulated data of 15 genomes in six species with diverse genome size, complexity, and all three categories of cGOFs, GAAP outcompetes or achieves comparable results when compared to three other reference-assisted programs, AlignGraph, Ragout and MeDuSa. GAAP uses both cGOF and pair-end reads to create assemblies in genomic scale, and performs better than the currently available reference-assisted assembly tools as it recovers more assemblies and makes fewer false locations, especially for species with extensive rearranged genomes. Our method is a promising solution for reconstruction of genome sequence from short reads of NGS.

  2. Evolutionary Transition of Promoter and Gene Body DNA Methylation across Invertebrate-Vertebrate Boundary.

    Science.gov (United States)

    Keller, Thomas E; Han, Priscilla; Yi, Soojin V

    2016-04-01

    Genomes of invertebrates and vertebrates exhibit highly divergent patterns of DNA methylation. Invertebrate genomes tend to be sparsely methylated, and DNA methylation is mostly targeted to a subset of transcription units (gene bodies). In a drastic contrast, vertebrate genomes are generally globally and heavily methylated, punctuated by the limited local hypo-methylation of putative regulatory regions such as promoters. These genomic differences also translate into functional differences in DNA methylation and gene regulation. Although promoter DNA methylation is an important regulatory component of vertebrate gene expression, its role in invertebrate gene regulation has been little explored. Instead, gene body DNA methylation is associated with expression of invertebrate genes. However, the evolutionary steps leading to the differentiation of invertebrate and vertebrate genomic DNA methylation remain unresolved. Here we analyzed experimentally determined DNA methylation maps of several species across the invertebrate-vertebrate boundary, to elucidate how vertebrate gene methylation has evolved. We show that, in contrast to the prevailing idea, a substantial number of promoters in an invertebrate basal chordate Ciona intestinalis are methylated. Moreover, gene expression data indicate significant, epigenomic context-dependent associations between promoter methylation and expression in C. intestinalis. However, there is no evidence that promoter methylation in invertebrate chordate has been evolutionarily maintained across the invertebrate-vertebrate boundary. Rather, body-methylated invertebrate genes preferentially obtain hypo-methylated promoters among vertebrates. Conversely, promoter methylation is preferentially found in lineage- and tissue-specific vertebrate genes. These results provide important insights into the evolutionary origin of epigenetic regulation of vertebrate gene expression. © The Author(s) 2015. 
    Published by Oxford University Press on behalf

  3. Automated integration of genomic physical mapping data via parallel simulated annealing

    Energy Technology Data Exchange (ETDEWEB)

    Slezak, T.

    1994-06-01

    The Human Genome Center at the Lawrence Livermore National Laboratory (LLNL) is nearing closure on a high-resolution physical map of human chromosome 19. We have built automated tools to assemble 15,000 fingerprinted cosmid clones into 800 contigs with minimal spanning paths identified. These islands are being ordered, oriented, and spanned by a variety of other techniques, including Fluorescence In Situ Hybridization (FISH) at 3 levels of resolution, ECO restriction fragment mapping across all contigs, and a multitude of different hybridization and PCR techniques to link cosmid, YAC, AC, PAC, and P1 clones. The FISH data provide us with partial order and distance data as well as orientation. We made the observation that map builders need a much rougher presentation of data than do map readers; the former wish to see raw data since these can expose errors or interesting biology. We further noted that by ignoring our length and distance data we could simplify our problem into one that could be readily attacked with optimization techniques. The data integration problem could then be seen as an M x N ordering of our N cosmid clones which "intersect" M larger objects, by defining "intersection" to mean either contig/map membership or hybridization results. The goal of making an integrated map is then to rearrange the N cosmid clone "columns" such that the number of gaps on the object "rows" is minimized. Our FISH partially-ordered cosmid clones provide us with a set of constraints that cannot be violated by the rearrangement process. We solved the optimization problem via simulated annealing performed on a network of 40+ Unix machines in parallel, using a server/client model built on explicit socket calls. For current maps we can create a map in about 4 hours on the parallel net versus 4+ days on a single workstation. Our biologists are now using this software on a daily basis to guide their efforts toward final closure.
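
    The column-ordering formulation in the abstract above can be sketched on a toy matrix. This is a single-process illustration under stated assumptions, not the LLNL software: the gap count (runs of 1s per row, minus one), the pairwise "a before b" form of the order constraints, the swap move, the geometric cooling schedule, and the names `count_gaps`, `respects` and `anneal` are all hypothetical choices made for this example.

```python
import math
import random


def count_gaps(matrix, order):
    """Total number of breaks between runs of 1s across all rows."""
    gaps = 0
    for row in matrix:
        runs, prev = 0, 0
        for col in order:
            if row[col] and not prev:
                runs += 1
            prev = row[col]
        gaps += max(runs - 1, 0)
    return gaps


def respects(order, before):
    """True if every (a, b) pair has column a placed left of column b."""
    pos = {col: i for i, col in enumerate(order)}
    return all(pos[a] < pos[b] for a, b in before)


def anneal(matrix, start, before=(), steps=5000, t_start=2.0, t_end=0.01, seed=0):
    """Simulated annealing over column orders; returns (best_order, best_cost)."""
    if not respects(start, before):
        raise ValueError("start order violates the constraints")
    rng = random.Random(seed)
    order = list(start)
    cost = count_gaps(matrix, order)
    best, best_cost = list(order), cost
    for step in range(steps):
        temp = t_start * (t_end / t_start) ** (step / steps)  # geometric cooling
        i, j = rng.sample(range(len(order)), 2)
        order[i], order[j] = order[j], order[i]               # propose a swap
        new_cost = count_gaps(matrix, order)
        accept = respects(order, before) and (
            new_cost <= cost or rng.random() < math.exp((cost - new_cost) / temp)
        )
        if accept:
            cost = new_cost
            if cost < best_cost:
                best, best_cost = list(order), cost
        else:
            order[i], order[j] = order[j], order[i]           # undo the swap
    return best, best_cost


# Toy data (assumption): 4 "objects" (rows) hit by 6 "clones" (columns).
hits = [
    [1, 1, 1, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 1, 1, 1, 0, 0],
]
constraints = [(0, 5)]            # clone 0 must lie left of clone 5
scrambled = [3, 0, 4, 1, 5, 2]    # a feasible but gappy starting order
print("start cost:", count_gaps(hits, scrambled))
print("annealed:", anneal(hits, scrambled, constraints))
```

    Rejecting any move that breaks a constraint keeps every visited order feasible, which mirrors the statement that the FISH-derived constraints cannot be violated by the rearrangement process.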

  4. Genome scale models of yeast: towards standardized evaluation and consistent omic integration

    DEFF Research Database (Denmark)

    Sanchez, Benjamin J.; Nielsen, Jens

    2015-01-01

    Genome scale models (GEMs) have enabled remarkable advances in systems biology, acting as functional databases of metabolism, and as scaffolds for the contextualization of high-throughput data. In the case of Saccharomyces cerevisiae (budding yeast), several GEMs have been published and are curre... in which all levels of omics data (from gene expression to flux) have been integrated in yeast GEMs. Relevant conclusions and current challenges for both GEM evaluation and omic integration are highlighted.

  5. Genome-based microbial ecology of anammox granules in a full-scale wastewater treatment system

    NARCIS (Netherlands)

    Speth, D.R.; Zandt, M.H. in 't; Guerrero Cruz, S.; Dutilh, B.E.; Jetten, M.S.M.

    2016-01-01

    Partial-nitritation anammox (PNA) is a novel wastewater treatment procedure for energy-efficient ammonium removal. Here we use genome-resolved metagenomics to build a genome-based ecological model of the microbial community in a full-scale PNA reactor. Sludge from the bioreactor examined here is

  6. LARGE-SCALE INDICATIVE MAPPING OF SOIL RUNOFF

    Directory of Open Access Journals (Sweden)

    E. Panidi

    2017-11-01

    In our study we estimate relationships between quantitative parameters of relief, soil runoff regime, and spatial distribution of radioactive pollutants in the soil. The study is conducted on a test arable area located in the basin of the upper Oka River (Orel region, Russia). Previously we collected a large number of soil samples, which make it possible to investigate redistribution of the Chernobyl-origin cesium-137 in soil material and, as a consequence, the soil runoff magnitude at sampling points. Here we describe and discuss the technique applied to large-scale mapping of the soil runoff. The technique is based upon cesium-137 radioactivity measurement in different relief structures. Key stages are the allocation of soil sampling points (we used very high resolution space imagery as supporting data); soil sample collection and analysis; calibration of the mathematical model (using the estimated background value of the cesium-137 radioactivity); and automated compilation of the predictive map of the studied territory (a digital elevation model is used for this purpose, and cesium-137 radioactivity can be predicted using quantitative parameters of the relief). The maps can be used as supporting data for precision agriculture and for recultivation or melioration purposes.

  7. Genomic Organization and Physical Mapping of Tandemly Arranged Repetitive DNAs in Sterlet (Acipenser ruthenus).

    Science.gov (United States)

    Biltueva, Larisa S; Prokopov, Dimitry Y; Makunin, Alexey I; Komissarov, Alexey S; Kudryavtseva, Anna V; Lemskaya, Natalya A; Vorobieva, Nadezhda V; Serdyukova, Natalia A; Romanenko, Svetlana A; Gladkikh, Olga L; Graphodatsky, Alexander S; Trifonov, Vladimir A

    2017-01-01

    Acipenseriformes represent a phylogenetically basal clade of ray-finned fish characterized by unusual genomic traits, including paleopolyploid states of extant genomes with high chromosome numbers and slow rates of molecular evolution. Despite a high interest in this fish group, only a limited number of studies have been accomplished on the isolation and characterization of repetitive DNA, karyotype standardization is not yet complete, and sex chromosomes are still to be identified. Here, we applied next-generation sequencing and cluster analysis to characterize major fractions of sterlet (Acipenser ruthenus) repetitive DNA. Using FISH, we mapped 16 tandemly arranged sequences on sterlet chromosomes and found them to be unevenly distributed in the genome with a tendency to cluster in particular regions. Some of the satellite DNAs might be used as specific markers to identify individual chromosomes and their paralogs, resulting in the unequivocal identification of at least 18 chromosome pairs. Our results provide an insight into the characteristic genomic distribution of the most common sterlet repetitive sequences. Biased accumulation of repetitive DNAs in particular chromosomes makes them especially interesting for further search for cryptic sex chromosomes. Future studies of these sequences in other acipenserid species will provide new perspectives regarding the evolution of repetitive DNA within the genomes of this fish order. © 2017 S. Karger AG, Basel.

  8. Update on the use of random 10-mers in mapping and fingerprinting genomes

    International Nuclear Information System (INIS)

    Sinibaldi, R.M.

    2001-01-01

    The use of Randomly Amplified Polymorphic DNA (RAPDs) has continued to grow for the last several years. A quick assessment of their use can be made by searching PubMed at the National Library of Medicine with the acronym RAPD. Since their first report in 1990, the number of citations with RAPD in them has increased from 12 in 1990, to 45 in 1991, to 112 in 1993, to 130 in 1994, to 223 in 1995, to 258 in 1996, to 236 in 1997, to 316 in 1998, to 196 to date (August 31) in 1999. The utilization of 10-mers for mapping or fingerprinting has many advantages. These include a relatively low cost, no use of radioactivity, easy adaptation to automation, a requirement for only very small amounts of input DNA, rapid results, existing databases for many organisms, and low-cost equipment requirements. In conjunction with a derived technology such as SCARs (sequence characterized amplified regions), it can provide cost-effective and thorough methods for mapping and fingerprinting any genome. Newer methods based on microarray technology may offer powerful but expensive alternative approaches for determining genetic diversity. The costs of arrays should come down with time and improved production methods. In the meantime, RAPDs remain a competent and cost-effective method for genome characterization. (author)

  9. Geophysical mapping of complex glaciogenic large-scale structures

    DEFF Research Database (Denmark)

    Høyer, Anne-Sophie

    2013-01-01

    This thesis presents the main results of a four year PhD study concerning the use of geophysical data in geological mapping. The study is related to the Geocenter project, “KOMPLEKS”, which focuses on the mapping of complex, large-scale geological structures. The study area is approximately 100 km2...... data types and co-interpret them in order to improve our geological understanding. However, in order to perform this successfully, methodological considerations are necessary. For instance, a structure indicated by a reflection in the seismic data is not always apparent in the resistivity data...... information) can be collected. The geophysical data are used together with geological analyses from boreholes and pits to interpret the geological history of the hill-island. The geophysical data reveal that the glaciotectonic structures truncate at the surface. The directions of the structures were mapped...

  10. Draft Genome Sequence of Ochrobactrum intermedium Strain SA148, a Plant Growth-Promoting Desert Rhizobacterium

    KAUST Repository

    Lafi, Feras Fawzi; Alam, Intikhab; Geurts, Rene; Bisseling, Ton; Bajic, Vladimir B.; Hirt, Heribert; Saad, Maged

    2017-01-01

    Ochrobactrum intermedium strain SA148 is a plant growth-promoting bacterium isolated from sandy soil in the Jizan area of Saudi Arabia. Here, we report the 4.9-Mb draft genome sequence of this strain, highlighting different pathways characteristic

  11. Large Scale Hierarchical K-Means Based Image Retrieval With MapReduce

    Science.gov (United States)

    2014-03-27

    flat vocabulary on MapReduce. In 2013, Moise and Shestakov [32, 40], have been researching large scale indexing and search with MapReduce. They...time will be greatly reduced, however image retrieval performance will almost certainly suffer. Moise and Shestakov ran tests with 100M images on 108...43–72, 2005. [32] Diana Moise, Denis Shestakov, Gylfi Gudmundsson, and Laurent Amsaleg. Indexing and searching 100m images with map-reduce. In

  12. The generation of chromosomal deletions to provide extensive coverage and subdivision of the Drosophila melanogaster genome.

    Science.gov (United States)

    Cook, R Kimberley; Christensen, Stacey J; Deal, Jennifer A; Coburn, Rachel A; Deal, Megan E; Gresens, Jill M; Kaufman, Thomas C; Cook, Kevin R

    2012-01-01

    Chromosomal deletions are used extensively in Drosophila melanogaster genetics research. Deletion mapping is the primary method used for fine-scale gene localization. Effective and efficient deletion mapping requires both extensive genomic coverage and a high density of molecularly defined breakpoints across the genome. A large-scale resource development project at the Bloomington Drosophila Stock Center has improved the choice of deletions beyond that provided by previous projects. FLP-mediated recombination between FRT-bearing transposon insertions was used to generate deletions, because it is efficient and provides single-nucleotide resolution in planning deletion screens. The 793 deletions generated pushed coverage of the euchromatic genome to 98.4%. Gaps in coverage contain haplolethal and haplosterile genes, but the sizes of these gaps were minimized by flanking these genes as closely as possible with deletions. In improving coverage, a complete inventory of haplolethal and haplosterile genes was generated and extensive information on other haploinsufficient genes was compiled. To aid mapping experiments, a subset of deletions was organized into a Deficiency Kit to provide maximal coverage efficiently. To improve the resolution of deletion mapping, screens were planned to distribute deletion breakpoints evenly across the genome. The median chromosomal interval between breakpoints now contains only nine genes and 377 intervals contain only single genes. Drosophila melanogaster now has the most extensive genomic deletion coverage and breakpoint subdivision as well as the most comprehensive inventory of haploinsufficient genes of any multicellular organism. The improved selection of chromosomal deletion strains will be useful to nearly all Drosophila researchers.

  13. Molecular cloning, genomic organization, chromosome mapping, tissues expression pattern and identification of a novel splicing variant of porcine CIDEb gene

    International Nuclear Information System (INIS)

    Li, YanHua; Li, AiHua; Yang, Z.Q.

    2016-01-01

    Cell death-inducing DNA fragmentation factor-α-like effector b (CIDEb) is a member of the CIDE family of apoptosis-inducing factors. CIDEa and CIDEc have been reported to be lipid droplet (LD)-associated proteins that promote atypical LD fusion in adipocytes and are responsible for liver steatosis under fasting and obese conditions, whereas CIDEb promotes lipid storage under normal diet conditions [1] and promotes the formation of triacylglyceride-enriched VLDL particles in hepatocytes [2]. Here, we report the gene cloning, chromosome mapping, tissue distribution, genetic expression analysis, and identification of a novel splicing variant of the porcine CIDEb gene. Sequence analysis shows that the open reading frame of the normal porcine CIDEb isoform covers 660 bp and encodes a 219-amino acid polypeptide, whereas its alternative splicing variant encodes a 142-amino acid polypeptide truncated at the fourth exon and comprising the CIDE-N domain and part of the CIDE-C domain. The deduced amino acid sequence of normal porcine CIDEb shows 85.8% similarity to the human protein and 80.0% to the mouse protein. The CIDEb genomic sequence spans approximately 6 kb and comprises five exons and four introns. Radiation hybrid mapping demonstrated that porcine CIDEb is located at chromosome 7q21, at a distance of 57 cR from the most significantly linked marker, S0334, a region that is syntenic with the corresponding region in the human genome. Tissue expression analysis indicated that normal CIDEb mRNA is ubiquitously expressed in many porcine tissues. It was highly expressed in white adipose tissue and was observed at relatively high levels in the liver, lung, small intestine, lymphatic tissue and brain. The normal version of CIDEb was the predominant form in all tested tissues, whereas the splicing variant was expressed at low levels in all examined tissues except the lymphatic tissue. Furthermore, genetic expression analysis indicated that CIDEb mRNA levels were

  14. Molecular cloning, genomic organization, chromosome mapping, tissues expression pattern and identification of a novel splicing variant of porcine CIDEb gene

    Energy Technology Data Exchange (ETDEWEB)

    Li, YanHua, E-mail: liyanhua.1982@aliyun.com [Ministry of Education Key Laboratory of Child Development and Disorders, Chongqing Key Laboratory of Translational Medical Research in Cognitive Development and Learning and Memory Disorders, China International Science and Technology Cooperation base of Child development and Critical Disorders, Children’s Hospital of Chongqing Medical University, Chongqing 400014 (China); Li, AiHua [Chongqing Cancer Institute & Hospital & Cancer Center, Chongqing 404100 (China); Yang, Z.Q. [Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education, College of Life Science and Technology, Huazhong Agricultural University, Wuhan 430070 (China)

    2016-09-09

    Cell death-inducing DNA fragmentation factor-α-like effector b (CIDEb) is a member of the CIDE family of apoptosis-inducing factors. CIDEa and CIDEc have been reported to be lipid droplet (LD)-associated proteins that promote atypical LD fusion in adipocytes and are responsible for liver steatosis under fasting and obese conditions, whereas CIDEb promotes lipid storage under normal diet conditions [1] and promotes the formation of triacylglyceride-enriched VLDL particles in hepatocytes [2]. Here, we report the gene cloning, chromosome mapping, tissue distribution, genetic expression analysis, and identification of a novel splicing variant of the porcine CIDEb gene. Sequence analysis shows that the open reading frame of the normal porcine CIDEb isoform covers 660 bp and encodes a 219-amino acid polypeptide, whereas its alternative splicing variant encodes a 142-amino acid polypeptide truncated at the fourth exon and comprising the CIDE-N domain and part of the CIDE-C domain. The deduced amino acid sequence of normal porcine CIDEb shows 85.8% similarity to the human protein and 80.0% to the mouse protein. The CIDEb genomic sequence spans approximately 6 kb and comprises five exons and four introns. Radiation hybrid mapping demonstrated that porcine CIDEb is located at chromosome 7q21, at a distance of 57 cR from the most significantly linked marker, S0334, a region that is syntenic with the corresponding region in the human genome. Tissue expression analysis indicated that normal CIDEb mRNA is ubiquitously expressed in many porcine tissues. It was highly expressed in white adipose tissue and was observed at relatively high levels in the liver, lung, small intestine, lymphatic tissue and brain. The normal version of CIDEb was the predominant form in all tested tissues, whereas the splicing variant was expressed at low levels in all examined tissues except the lymphatic tissue. Furthermore, genetic expression analysis indicated that CIDEb mRNA levels were

  15. DNA template strand sequencing of single-cells maps genomic rearrangements at high resolution

    OpenAIRE

    Falconer, Ester; Hills, Mark; Naumann, Ulrike; Poon, Steven S. S.; Chavez, Elizabeth A.; Sanders, Ashley D.; Zhao, Yongjun; Hirst, Martin; Lansdorp, Peter M.

    2012-01-01

    DNA rearrangements such as sister chromatid exchanges (SCEs) are sensitive indicators of genomic stress and instability, but they are typically masked by single-cell sequencing techniques. We developed Strand-seq to independently sequence parental DNA template strands from single cells, making it possible to map SCEs at orders-of-magnitude greater resolution than was previously possible. On average, murine embryonic stem (mES) cells exhibit eight SCEs, which are detected at a resolution of up...

  16. Network Partitioning Domain Knowledge Multiobjective Application Mapping for Large-Scale Network-on-Chip

    Directory of Open Access Journals (Sweden)

    Yin Zhen Tei

    2014-01-01

    This paper proposes a multiobjective application mapping technique targeted for large-scale network-on-chip (NoC). As the number of intellectual property (IP) cores in multiprocessor system-on-chip (MPSoC) increases, NoC application mapping to find optimum core-to-topology mapping becomes more challenging. Besides, the conflicting cost and performance trade-off makes multiobjective application mapping techniques even more complex. This paper proposes an application mapping technique that incorporates domain knowledge into genetic algorithm (GA). The initial population of GA is initialized with network partitioning (NP) while the crossover operator is guided with knowledge on communication demands. NP reduces the large-scale application mapping complexity and provides GA with a potential mapping search space. The proposed genetic operator is compared with state-of-the-art genetic operators in terms of solution quality. In this work, multiobjective optimization of energy and thermal-balance is considered. Through simulation, knowledge-based initial mapping shows significant improvement in Pareto front compared to random initial mapping that is widely used. The proposed knowledge-based crossover also shows better Pareto front compared to state-of-the-art knowledge-based crossover.
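
    The abstract compares mappings by their Pareto front over two objectives, energy and thermal balance. As a minimal sketch of what that comparison involves (not the authors' implementation), the snippet below filters a set of candidate mappings down to the non-dominated ones. The objective values are hypothetical, and both objectives are assumed to be minimized.

```python
from typing import List, Tuple

# Each candidate mapping is scored as (energy, thermal_imbalance).
# Both values are hypothetical and both are assumed to be minimized.
Point = Tuple[float, float]


def dominates(a: Point, b: Point) -> bool:
    """Return True if a is at least as good as b everywhere and better somewhere."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Point]) -> List[Point]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical scores for five candidate core-to-topology mappings.
candidates = [(10.0, 5.0), (8.0, 7.0), (12.0, 4.0), (9.0, 9.0), (8.0, 6.0)]
front = pareto_front(candidates)
print(sorted(front))
```

    A front that lies closer to the origin on both axes corresponds to the "better Pareto front" the abstract reports for the knowledge-based operators.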

  17. Modeling Lactococcus lactis using a genome-scale flux model

    Directory of Open Access Journals (Sweden)

    Nielsen Jens

    2005-06-01

    Background Genome-scale flux models are useful tools to represent and analyze microbial metabolism. In this work we reconstructed the metabolic network of the lactic acid bacterium Lactococcus lactis and developed a genome-scale flux model able to simulate and analyze network capabilities and whole-cell function under aerobic and anaerobic continuous cultures. Flux balance analysis (FBA) and minimization of metabolic adjustment (MOMA) were used as modeling frameworks. Results The metabolic network was reconstructed using the annotated genome sequence from L. lactis ssp. lactis IL1403 together with physiological and biochemical information. The established network comprised a total of 621 reactions and 509 metabolites, representing the overall metabolism of L. lactis. Experimental data reported in the literature were used to fit the model to phenotypic observations. Regulatory constraints had to be included to simulate certain metabolic features, such as the shift from homo to heterolactic fermentation. A minimal medium for in silico growth was identified, indicating the requirement of four amino acids in addition to a sugar. Remarkably, de novo biosynthesis of four other amino acids was observed even when all amino acids were supplied, which is in good agreement with experimental observations. Additionally, enhanced metabolic engineering strategies for improved diacetyl producing strains were designed. Conclusion The L. lactis metabolic network can now be used for a better understanding of lactococcal metabolic capabilities and potential, for the design of enhanced metabolic engineering strategies and for integration with other types of 'omic' data, to assist in finding new information on cellular organization and function.

  18. A genome-wide analysis of promoter-mediated phenotypic noise in Escherichia coli.

    Directory of Open Access Journals (Sweden)

    Olin K Silander

    2012-01-01

    Gene expression is subject to random perturbations that lead to fluctuations in the rate of protein production. As a consequence, for any given protein, genetically identical organisms living in a constant environment will contain different amounts of that particular protein, resulting in different phenotypes. This phenomenon is known as "phenotypic noise." In bacterial systems, previous studies have shown that, for specific genes, both transcriptional and translational processes affect phenotypic noise. Here, we focus on how the promoter regions of genes affect noise and ask whether levels of promoter-mediated noise are correlated with genes' functional attributes, using data for over 60% of all promoters in Escherichia coli. We find that essential genes and genes with a high degree of evolutionary conservation have promoters that confer low levels of noise. We also find that the level of noise cannot be attributed to the evolutionary time that different genes have spent in the genome of E. coli. In contrast to previous results in eukaryotes, we find no association between promoter-mediated noise and gene expression plasticity. These results are consistent with the hypothesis that, in bacteria, natural selection can act to reduce gene expression noise and that some of this noise is controlled through the sequence of the promoter region alone.

  19. The UK Human Genome Mapping Project online computing service.

    Science.gov (United States)

    Rysavy, F R; Bishop, M J; Gibbs, G P; Williams, G W

    1992-04-01

    This paper presents an overview of computing and networking facilities developed by the Medical Research Council to provide online computing support to the Human Genome Mapping Project (HGMP) in the UK. The facility is connected to a number of other computing facilities in various centres of genetics and molecular biology research excellence, either directly via high-speed links or through national and international wide-area networks. The paper describes the design and implementation of the current system, a 'client/server' network of Sun, IBM, DEC and Apple servers, gateways and workstations. A short outline of online computing services currently delivered by this system to the UK human genetics research community is also provided. More information about the services and their availability could be obtained by a direct approach to the UK HGMP-RC.

  20. Exploring genetic suppression interactions on a global scale.

    Science.gov (United States)

    van Leeuwen, Jolanda; Pons, Carles; Mellor, Joseph C; Yamaguchi, Takafumi N; Friesen, Helena; Koschwanez, John; Ušaj, Mojca Mattiazzi; Pechlaner, Maria; Takar, Mehmet; Ušaj, Matej; VanderSluis, Benjamin; Andrusiak, Kerry; Bansal, Pritpal; Baryshnikova, Anastasia; Boone, Claire E; Cao, Jessica; Cote, Atina; Gebbia, Marinella; Horecka, Gene; Horecka, Ira; Kuzmin, Elena; Legro, Nicole; Liang, Wendy; van Lieshout, Natascha; McNee, Margaret; San Luis, Bryan-Joseph; Shaeri, Fatemeh; Shuteriqi, Ermira; Sun, Song; Yang, Lu; Youn, Ji-Young; Yuen, Michael; Costanzo, Michael; Gingras, Anne-Claude; Aloy, Patrick; Oostenbrink, Chris; Murray, Andrew; Graham, Todd R; Myers, Chad L; Andrews, Brenda J; Roth, Frederick P; Boone, Charles

    2016-11-04

    Genetic suppression occurs when the phenotypic defects caused by a mutation in a particular gene are rescued by a mutation in a second gene. To explore the principles of genetic suppression, we examined both literature-curated and unbiased experimental data, involving systematic genetic mapping and whole-genome sequencing, to generate a large-scale suppression network among yeast genes. Most suppression pairs identified novel relationships among functionally related genes, providing new insights into the functional wiring diagram of the cell. In addition to suppressor mutations, we identified frequent secondary mutations, in a subset of genes, that likely cause a delay in the onset of stationary phase, which appears to promote their enrichment within a propagating population. These findings allow us to formulate and quantify general mechanisms of genetic suppression. Copyright © 2016, American Association for the Advancement of Science.

  1. Fine mapping of a Phytophthora-resistance gene RpsWY in soybean (Glycine max L.) by high-throughput genome-wide sequencing.

    Science.gov (United States)

    Cheng, Yanbo; Ma, Qibin; Ren, Hailong; Xia, Qiuju; Song, Enliang; Tan, Zhiyuan; Li, Shuxian; Zhang, Gengyun; Nian, Hai

    2017-05-01

    Using a combination of phenotypic screening, genetic and statistical analyses, and high-throughput genome-wide sequencing, we have finely mapped a dominant Phytophthora resistance gene in soybean cultivar Wayao. Phytophthora root rot (PRR) caused by Phytophthora sojae is one of the most important soil-borne diseases in many soybean-production regions in the world. Identifying resistance gene(s) and incorporating them into elite varieties is an effective breeding strategy for protecting soybean from this disease. Two soybean populations of 191 F2 individuals and 196 F7:8 recombinant inbred lines (RILs) were developed to map the Rps gene by crossing the susceptible cultivar Huachun 2 with the resistant cultivar Wayao. Genetic analysis of the F2 population indicated that PRR resistance in Wayao was controlled by a single dominant gene, temporarily named RpsWY, which was mapped on chromosome 3. A high-density genetic linkage bin map was constructed using 3469 recombination bins of the RILs to explore the candidate genes by high-throughput genome-wide sequencing. The results of genotypic analysis showed that the RpsWY gene was located in bin 401 between 4466230 and 4502773 bp on chromosome 3 through lines 71 and 100 of the RILs. Four predicted genes (Glyma03g04350, Glyma03g04360, Glyma03g04370, and Glyma03g04380) were found in the narrowed region of 36.5 kb in bin 401. These results suggest that high-throughput genome-wide resequencing is an effective method to fine map PRR candidate genes.
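
    Two quantities in this abstract can be checked with simple arithmetic. The sketch below recomputes the size of the bin 401 interval from the reported coordinates and shows how a single-dominant-gene hypothesis is commonly tested with a chi-square goodness-of-fit test against a 3:1 ratio. The abstract reports only the F2 population size (191), so the resistant/susceptible split used here is hypothetical and serves only to illustrate the calculation.

```python
# Interval coordinates reported in the abstract (chromosome 3, bin 401).
start_bp = 4_466_230
end_bp = 4_502_773
interval_kb = (end_bp - start_bp) / 1000  # 36.543 kb, reported as 36.5 kb


def chi_square_3_to_1(resistant: int, susceptible: int) -> float:
    """Chi-square statistic for an observed split against a 3:1 expectation."""
    total = resistant + susceptible
    expected = (0.75 * total, 0.25 * total)
    observed = (resistant, susceptible)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# HYPOTHETICAL counts: only the total of 191 F2 individuals is from the abstract.
hypothetical_resistant = 145
hypothetical_susceptible = 46
chi2 = chi_square_3_to_1(hypothetical_resistant, hypothetical_susceptible)

# Critical value for 1 degree of freedom at the 0.05 level.
CRITICAL_DF1_P05 = 3.841
fits_single_dominant_gene = chi2 < CRITICAL_DF1_P05

print(f"interval: {interval_kb:.1f} kb")
print(f"chi-square: {chi2:.3f}, consistent with 3:1: {fits_single_dominant_gene}")
```

    A statistic below the critical value means the observed split does not differ significantly from 3:1, which is the pattern expected when resistance is controlled by one dominant gene.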

  2. A high-resolution whole genome radiation hybrid map of human chromosome 17q22-q25.3 across the genes for GH and TK

    Energy Technology Data Exchange (ETDEWEB)

    Foster, J.W.; Schafer, A.J.; Critcher, R. [Univ. of Cambridge (United Kingdom)] [and others

    1996-04-15

    We have constructed a whole genome radiation hybrid (WG-RH) map across a region of human chromosome 17q, from growth hormone (GH) to thymidine kinase (TK). A panel of 128 WG-RH hybrid cell lines generated by X-irradiation and fusion has been tested for the retention of 39 sequence-tagged site (STS) markers by the polymerase chain reaction. This genome mapping technique has allowed the integration of existing VNTR and microsatellite markers with additional new markers and existing STS markers previously mapped to this region by other means. The WG-RH map includes eight expressed sequence tag (EST) and three anonymous markers developed for this study, together with 23 anonymous microsatellites and five existing ESTs. Analysis of these data resulted in a high-density comprehensive map across this region of the genome. A subset of these markers has been used to produce a framework map consisting of 20 loci ordered with odds greater than 1000:1. The markers are of sufficient density to build a YAC contig across this region based on marker content. We have developed sequence tags for both ends of a 2.1-Mb YAC and mapped these using the WG-RH panel, allowing a direct comparison of cRay6000 to physical distance. 31 refs., 3 figs., 2 tabs.

  3. Rainbow: a tool for large-scale whole-genome sequencing data analysis using cloud computing.

    Science.gov (United States)

    Zhao, Shanrong; Prenger, Kurt; Smith, Lance; Messina, Thomas; Fan, Hongtao; Jaeger, Edward; Stephens, Susan

    2013-06-27

    Technical improvements have decreased sequencing costs and, as a result, the size and number of genomic datasets have increased rapidly. Because of the lower cost, large amounts of sequence data are now being produced by small to midsize research groups. Crossbow is a software tool that can detect single nucleotide polymorphisms (SNPs) in whole-genome sequencing (WGS) data from a single subject; however, Crossbow has a number of limitations when applied to multiple subjects from large-scale WGS projects. The data storage and CPU resources that are required for large-scale whole genome sequencing data analyses are too large for many core facilities and individual laboratories to provide. To help meet these challenges, we have developed Rainbow, a cloud-based software package that can assist in the automation of large-scale WGS data analyses. Here, we evaluated the performance of Rainbow by analyzing 44 different whole-genome-sequenced subjects. Rainbow has the capacity to process genomic data from more than 500 subjects in two weeks using cloud computing provided by the Amazon Web Service. The time includes the import and export of the data using Amazon Import/Export service. The average cost of processing a single sample in the cloud was less than 120 US dollars. Compared with Crossbow, the main improvements incorporated into Rainbow include the ability: (1) to handle BAM as well as FASTQ input files; (2) to split large sequence files for better load balance downstream; (3) to log the running metrics in data processing and monitoring multiple Amazon Elastic Compute Cloud (EC2) instances; and (4) to merge SOAPsnp outputs for multiple individuals into a single file to facilitate downstream genome-wide association studies. Rainbow is a scalable, cost-effective, and open-source tool for large-scale WGS data analysis. For human WGS data sequenced by either the Illumina HiSeq 2000 or HiSeq 2500 platforms, Rainbow can be used straight out of the box. 
Rainbow is available

  4. Genomic insight into the common carp (Cyprinus carpio) genome by sequencing analysis of BAC-end sequences

    Directory of Open Access Journals (Sweden)

    Wang Jintu

    2011-04-01

    Background Common carp is one of the most important aquaculture teleost fish in the world. Common carp and other closely related Cyprinidae species provide over 30% of aquaculture production in the world. However, common carp genomic resources are still relatively underdeveloped. BAC end sequences (BES) are important resources for genome research on BAC-anchored genetic marker development, linkage map and physical map integration, and whole genome sequence assembling and scaffolding. Result To develop such valuable resources in common carp (Cyprinus carpio), a total of 40,224 BAC clones were sequenced on both ends, generating 65,720 clean BES with an average read length of 647 bp after sequence processing, representing 42,522,168 bp or 2.5% of the common carp genome. The first survey of the common carp genome was conducted with various bioinformatics tools. The common carp genome contains over 17.3% of repetitive elements with GC content of 36.8% and 518 transposon ORFs. To identify and develop BAC-anchored microsatellite markers, a total of 13,581 microsatellites were detected from 10,355 BES. The coding regions of 7,127 genes were recognized from 9,443 BES on 7,453 BACs, with 1,990 BACs having genes on both ends. To evaluate the similarity to the genome of the closely related zebrafish, BES of common carp were aligned against the zebrafish genome. A total of 39,335 BES of common carp have conserved homologs on the zebrafish genome, which demonstrated the high similarity between zebrafish and common carp genomes, indicating the feasibility of comparative mapping between zebrafish and common carp once a physical map of common carp is available. Conclusion BAC end sequences are great resources for the first genome-wide survey of common carp. The repetitive DNA was estimated to be approximately 28% of the common carp genome, indicating the higher complexity of the genome. Comparative analysis had mapped around 40,000 BES to zebrafish genome and established over 3

  5. Genomic insight into the common carp (Cyprinus carpio) genome by sequencing analysis of BAC-end sequences

    Science.gov (United States)

    2011-01-01

    Background Common carp is one of the most important aquaculture teleost fish in the world. Common carp and other closely related Cyprinidae species provide over 30% of aquaculture production in the world. However, common carp genomic resources are still relatively underdeveloped. BAC end sequences (BES) are important resources for genome research on BAC-anchored genetic marker development, linkage map and physical map integration, and whole genome sequence assembling and scaffolding. Result To develop such valuable resources in common carp (Cyprinus carpio), a total of 40,224 BAC clones were sequenced on both ends, generating 65,720 clean BES with an average read length of 647 bp after sequence processing, representing 42,522,168 bp or 2.5% of the common carp genome. The first survey of the common carp genome was conducted with various bioinformatics tools. The common carp genome contains over 17.3% of repetitive elements with GC content of 36.8% and 518 transposon ORFs. To identify and develop BAC-anchored microsatellite markers, a total of 13,581 microsatellites were detected from 10,355 BES. The coding regions of 7,127 genes were recognized from 9,443 BES on 7,453 BACs, with 1,990 BACs having genes on both ends. To evaluate the similarity to the genome of the closely related zebrafish, BES of common carp were aligned against the zebrafish genome. A total of 39,335 BES of common carp have conserved homologs on the zebrafish genome, which demonstrated the high similarity between zebrafish and common carp genomes, indicating the feasibility of comparative mapping between zebrafish and common carp once a physical map of common carp is available. Conclusion BAC end sequences are great resources for the first genome-wide survey of common carp. The repetitive DNA was estimated to be approximately 28% of the common carp genome, indicating the higher complexity of the genome. Comparative analysis had mapped around 40,000 BES to zebrafish genome and established over 3,100 microsyntenies, covering over 50% of
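
    The sequencing figures quoted in this abstract can be cross-checked against one another. The short sketch below does that arithmetic using only the numbers reported above. The implied genome size is an inference from the reported 2.5% figure, not a value stated in the abstract.

```python
# Figures reported in the abstract.
bac_clones = 40_224
clean_bes = 65_720
mean_read_bp = 647            # reported average, rounded to whole bp
reported_total_bp = 42_522_168
reported_genome_fraction = 0.025

# Each BAC clone was sequenced from both ends.
reads_attempted = bac_clones * 2

# Fraction of end reads that survived sequence processing.
retained_fraction = clean_bes / reads_attempted

# Total bases recomputed from the rounded mean; differs slightly from the
# reported total because the mean read length is rounded.
approx_total_bp = clean_bes * mean_read_bp

# Genome size implied by "42,522,168 bp or 2.5%" (an inference, not a reported value).
implied_genome_bp = reported_total_bp / reported_genome_fraction

print(f"reads attempted: {reads_attempted}")
print(f"retained after processing: {retained_fraction:.1%}")
print(f"recomputed total: {approx_total_bp} bp")
print(f"implied genome size: {implied_genome_bp / 1e9:.1f} Gb")
```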

  6. Imputation of variants from the 1000 Genomes Project modestly improves known associations and can identify low-frequency variant-phenotype associations undetected by HapMap based imputation.

    Science.gov (United States)

    Wood, Andrew R; Perry, John R B; Tanaka, Toshiko; Hernandez, Dena G; Zheng, Hou-Feng; Melzer, David; Gibbs, J Raphael; Nalls, Michael A; Weedon, Michael N; Spector, Tim D; Richards, J Brent; Bandinelli, Stefania; Ferrucci, Luigi; Singleton, Andrew B; Frayling, Timothy M

    2013-01-01

    Genome-wide association (GWA) studies have been limited by the reliance on common variants present on microarrays or imputable from the HapMap Project data. More recently, the completion of the 1000 Genomes Project has provided variant and haplotype information for several million variants derived from sequencing over 1,000 individuals. To help understand the extent to which more variants (including low frequency (1% ≤ MAF 1000 Genomes imputation, respectively, and 9 and 11 that reached a stricter, likely conservative, threshold of P1000 Genomes genotype data modestly improved the strength of known associations. Of 20 associations detected at P1000 Genomes imputed data and one was nominally more strongly associated in HapMap imputed data. We also detected an association between a low frequency variant and phenotype that was previously missed by HapMap based imputation approaches. An association between rs112635299 and alpha-1 globulin near the SERPINA gene represented the known association between rs28929474 (MAF = 0.007) and alpha1-antitrypsin that predisposes to emphysema (P = 2.5×10(-12)). Our data provide important proof of principle that 1000 Genomes imputation will detect novel, low frequency-large effect associations.

  7. Genome-wide map of Apn1 binding sites under oxidative stress in Saccharomyces cerevisiae.

    Science.gov (United States)

    Morris, Lydia P; Conley, Andrew B; Degtyareva, Natalya; Jordan, I King; Doetsch, Paul W

    2017-11-01

    The DNA in cells is continuously exposed to reactive oxygen species resulting in toxic and mutagenic DNA damage. Although the repair of oxidative DNA damage occurs primarily through the base excision repair (BER) pathway, the nucleotide excision repair (NER) pathway processes some of the same lesions. In addition, damage tolerance mechanisms, such as recombination and translesion synthesis, enable cells to tolerate oxidative DNA damage, especially when BER and NER capacities are exceeded. Thus, disruption of BER alone or disruption of BER and NER in Saccharomyces cerevisiae leads to increased mutations as well as large-scale genomic rearrangements. Previous studies demonstrated that a particular region of chromosome II is susceptible to chronic oxidative stress-induced chromosomal rearrangements, suggesting the existence of DNA damage and/or DNA repair hotspots. Here we investigated the relationship between oxidative damage and genomic instability utilizing chromatin immunoprecipitation combined with DNA microarray technology to profile DNA repair sites along yeast chromosomes under different oxidative stress conditions. We targeted the major yeast AP endonuclease Apn1 as a representative BER protein. Our results indicate that Apn1 target sequences are enriched for cytosine and guanine nucleotides. We predict that BER protects these sites in the genome because guanines and cytosines are thought to be especially susceptible to oxidative attack, thereby preventing large-scale genome destabilization from chronic accumulation of DNA damage. Information from our studies should provide insight into how regional deployment of oxidative DNA damage management systems along chromosomes protects against large-scale rearrangements. Copyright © 2017 John Wiley & Sons, Ltd.

  8. Sequencing and de novo assembly of 150 genomes from Denmark as a population reference

    DEFF Research Database (Denmark)

    Maretty, Lasse; Jensen, Jacob Malte; Petersen, Bent

    2017-01-01

    Hundreds of thousands of human genomes are now being sequenced to characterize genetic variation and use this information to augment association mapping studies of complex disorders and other phenotypic traits. Genetic variation is identified mainly by mapping short reads to the reference genome or by performing local assembly. However, these approaches are biased against discovery of structural variants and variation in the more complex parts of the genome. Hence, large-scale de novo assembly is needed. Here we show that it is possible to construct excellent de novo assemblies from high-coverage sequencing with mate-pair libraries extending up to 20 kilobases. We report de novo assemblies of 150 individuals (50 trios) from the GenomeDenmark project. The quality of these assemblies is similar to those obtained using the more expensive long-read technology. We use the assemblies to identify a rich set...

  10. The perennial ryegrass GenomeZipper: targeted use of genome resources for comparative grass genomics.

    Science.gov (United States)

    Pfeifer, Matthias; Martis, Mihaela; Asp, Torben; Mayer, Klaus F X; Lübberstedt, Thomas; Byrne, Stephen; Frei, Ursula; Studer, Bruno

    2013-02-01

    Whole-genome sequences established for model and major crop species constitute a key resource for advanced genomic research. For outbreeding forage and turf grass species like ryegrasses (Lolium spp.), such resources have yet to be developed. Here, we present a model of the perennial ryegrass (Lolium perenne) genome on the basis of conserved synteny to barley (Hordeum vulgare) and the model grass genome Brachypodium (Brachypodium distachyon) as well as rice (Oryza sativa) and sorghum (Sorghum bicolor). A transcriptome-based genetic linkage map of perennial ryegrass served as a scaffold to establish the chromosomal arrangement of syntenic genes from model grass species. This scaffold revealed a high degree of synteny and macrocollinearity and was then utilized to anchor a collection of perennial ryegrass genes in silico to their predicted genome positions. This resulted in the unambiguous assignment of 3,315 out of 8,876 previously unmapped genes to the respective chromosomes. In total, the GenomeZipper incorporates 4,035 conserved grass gene loci, which were used for the first genome-wide sequence divergence analysis between perennial ryegrass, barley, Brachypodium, rice, and sorghum. The perennial ryegrass GenomeZipper is an ordered, information-rich genome scaffold, facilitating map-based cloning and genome assembly in perennial ryegrass and closely related Poaceae species. It also represents a milestone in describing synteny between perennial ryegrass and fully sequenced model grass genomes, thereby increasing our understanding of genome organization and evolution in the most important temperate forage and turf grass species.

  11. YouGenMap: a web platform for dynamic multi-comparative mapping and visualization of genetic maps

    Science.gov (United States)

    Keith Batesole; Kokulapalan Wimalanathan; Lin Liu; Fan Zhang; Craig S. Echt; Chun Liang

    2014-01-01

    Comparative genetic maps are used in examination of genome organization, detection of conserved gene order, and exploration of marker order variations. YouGenMap is an open-source web tool that offers dynamic comparative mapping capability of users' own genetic mapping between 2 or more map sets. Users' genetic map data and optional gene annotations are...

  12. Investigating host-pathogen behavior and their interaction using genome-scale metabolic network models.

    Science.gov (United States)

    Sadhukhan, Priyanka P; Raghunathan, Anu

    2014-01-01

    Genome Scale Metabolic Modeling methods represent one way to compute whole cell function starting from the genome sequence of an organism and contribute towards understanding and predicting the genotype-phenotype relationship. About 80 models spanning all the kingdoms of life from archaea to eukaryotes have been built till date and used to interrogate cell phenotype under varying conditions. These models have been used to not only understand the flux distribution in evolutionary conserved pathways like glycolysis and the Krebs cycle but also in applications ranging from value added product formation in Escherichia coli to predicting inborn errors of Homo sapiens metabolism. This chapter describes a protocol that delineates the process of genome scale metabolic modeling for analysing host-pathogen behavior and interaction using flux balance analysis (FBA). The steps discussed in the process include (1) reconstruction of a metabolic network from the genome sequence, (2) its representation in a precise mathematical framework, (3) its translation to a model, and (4) the analysis using linear algebra and optimization. The methods for biological interpretations of computed cell phenotypes in the context of individual host and pathogen models and their integration are also discussed.
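    Steps (2) to (4) of such a protocol are usually written as a linear program. The formulation below is the generic textbook form of flux balance analysis, given here as a sketch; the symbols (stoichiometric matrix S, flux vector v, objective weights c, and flux bounds) are conventional notation assumed for illustration and are not taken from the chapter itself.

```latex
\begin{aligned}
\max_{v}\quad & c^{\mathsf{T}} v \\
\text{subject to}\quad & S\,v = 0, \\
& v^{\min}_{j} \le v_{j} \le v^{\max}_{j} \quad \text{for every reaction } j .
\end{aligned}
```

    Here S v = 0 expresses the steady-state mass balance for each metabolite, the bounds encode reaction reversibility and uptake limits, and c typically selects a biomass or product-forming reaction as the objective.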

  13. Next-generation genome-scale models for metabolic engineering

    DEFF Research Database (Denmark)

    King, Zachary A.; Lloyd, Colton J.; Feist, Adam M.

    2015-01-01

    Constraint-based reconstruction and analysis (COBRA) methods have become widely used tools for metabolic engineering in both academic and industrial laboratories. By employing a genome-scale in silico representation of the metabolic network of a host organism, COBRA methods can be used to predict...... examples of applying COBRA methods to strain optimization are presented and discussed. Then, an outlook is provided on the next generation of COBRA models and the new types of predictions they will enable for systems metabolic engineering....

  14. The wolf reference genome sequence (Canis lupus lupus) and its implications for Canis spp. population genomics

    DEFF Research Database (Denmark)

    Gopalakrishnan, Shyam; Samaniego Castruita, Jose Alfredo; Sinding, Mikkel Holger Strander

    2017-01-01

    Background: An increasing number of studies are addressing the evolutionary genomics of dog domestication, principally through resequencing dog, wolf and related canid genomes. There is, however, only one de novo assembled canid genome currently available against which to map such data - that of a...

  15. Genome-wide association mapping for yield and other agronomic traits in an elite breeding population of tropical rice (Oryza sativa).

    Science.gov (United States)

    Begum, Hasina; Spindel, Jennifer E; Lalusin, Antonio; Borromeo, Teresita; Gregorio, Glenn; Hernandez, Jose; Virk, Parminder; Collard, Bertrand; McCouch, Susan R

    2015-01-01

    Genome-wide association mapping studies (GWAS) are frequently used to detect QTL in diverse collections of crop germplasm, based on historic recombination events and linkage disequilibrium across the genome. Generally, diversity panels genotyped with high density SNP panels are utilized in order to assay a wide range of alleles and haplotypes and to monitor recombination breakpoints across the genome. By contrast, GWAS have not generally been performed in breeding populations. In this study we performed association mapping for 19 agronomic traits including yield and yield components in a breeding population of elite irrigated tropical rice breeding lines so that the results would be more directly applicable to breeding than those from a diversity panel. The population was genotyped with 71,710 SNPs using genotyping-by-sequencing (GBS), and GWAS performed with the explicit goal of expediting selection in the breeding program. Using this breeding panel we identified 52 QTL for 11 agronomic traits, including large effect QTLs for flowering time and grain length/grain width/grain-length-breadth ratio. We also identified haplotypes that can be used to select plants in our population for short stature (plant height), early flowering time, and high yield, and thus demonstrate the utility of association mapping in breeding populations for informing breeding decisions. We conclude by exploring how the newly identified significant SNPs and insights into the genetic architecture of these quantitative traits can be leveraged to build genomic-assisted selection models.

  16. The Recombination Landscape in Wild House Mice Inferred Using Population Genomic Data.

    Science.gov (United States)

    Booker, Tom R; Ness, Rob W; Keightley, Peter D

    2017-09-01

    Characterizing variation in the rate of recombination across the genome is important for understanding several evolutionary processes. Previous analysis of the recombination landscape in laboratory mice has revealed that the different subspecies have different suites of recombination hotspots. It is unknown, however, whether hotspots identified in laboratory strains reflect the hotspot diversity of natural populations or whether broad-scale variation in the rate of recombination is conserved between subspecies. In this study, we constructed fine-scale recombination rate maps for a natural population of the Eastern house mouse, Mus musculus castaneus. We performed simulations to assess the accuracy of recombination rate inference in the presence of phase errors, and we used a novel approach to quantify phase error. The spatial distribution of recombination events is strongly positively correlated between our castaneus map, and a map constructed using inbred lines derived predominantly from M. m. domesticus. Recombination hotspots in wild castaneus show little overlap, however, with the locations of double-strand breaks in wild-derived house mouse strains. Finally, we also find that genetic diversity in M. m. castaneus is positively correlated with the rate of recombination, consistent with pervasive natural selection operating in the genome. Our study suggests that recombination rate variation is conserved at broad scales between house mouse subspecies, but it is not strongly conserved at fine scales. Copyright © 2017 by the Genetics Society of America.

  17. Construction of a 7-fold BAC library and cytogenetic mapping of 10 genes in the giant panda (Ailuropoda melanoleuca)

    Directory of Open Access Journals (Sweden)

    Zhang Ying

    2006-11-01

    Background: The giant panda, one of the most primitive carnivores, is an endangered animal. Although it has been the subject of many interesting studies during recent years, little is known about its genome. In order to promote research on this genome, a bacterial artificial chromosome (BAC) library of the giant panda was constructed in this study. Results: This BAC library contains 198,844 clones with an average insert size of 108 kb, which represents approximately seven equivalents of the giant panda haploid genome. Screening the library with 15 genes and 8 microsatellite markers demonstrates that it is representative and has good genome coverage. Furthermore, ten BAC clones harbouring AGXT, GHR, FSHR, IRBP, SOX14, TTR, BDNF, NT-4, LH and ZFX1 were mapped to 8 pairs of giant panda chromosomes by fluorescence in situ hybridization (FISH). Conclusion: This is the first large-insert genomic DNA library for the giant panda, and will contribute to understanding this endangered species in the areas of genome sequencing, physical mapping, gene cloning and comparative genomic studies. We also identified the physical locations of ten genes on their relative chromosomes by FISH, providing a preliminary framework for further development of a high resolution cytogenetic map of the giant panda.
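    The coverage figure follows from simple arithmetic: genome equivalents are the total cloned DNA (clone count times mean insert size) divided by the haploid genome size. The abstract does not state the genome size, so the sketch below works backwards from the reported sevenfold coverage to the genome size it implies. The function names are hypothetical and chosen only for illustration.

```python
def genome_equivalents(n_clones: int, mean_insert_kb: float, genome_size_kb: float) -> float:
    """Library coverage: total cloned DNA divided by haploid genome size."""
    return n_clones * mean_insert_kb / genome_size_kb


def implied_genome_size_kb(n_clones: int, mean_insert_kb: float, coverage: float) -> float:
    """Genome size implied by a reported coverage (inverse of the formula above)."""
    return n_clones * mean_insert_kb / coverage


# Figures reported in the abstract: 198,844 clones, 108 kb mean insert, ~7x coverage.
total_insert_kb = 198_844 * 108
genome_kb = implied_genome_size_kb(198_844, 108, 7.0)

print(total_insert_kb)             # total cloned DNA in kb
print(round(genome_kb / 1e6, 2))   # implied haploid genome size in Gb (1 Gb = 1e6 kb)
```

    With these inputs the library holds about 21.5 Gb of cloned DNA, which corresponds to a haploid genome of roughly 3.07 Gb if the coverage is exactly sevenfold.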

  18. Optimal knockout strategies in genome-scale metabolic networks using particle swarm optimization.

    Science.gov (United States)

    Nair, Govind; Jungreuthmayer, Christian; Zanghellini, Jürgen

    2017-02-01

    Knockout strategies, particularly the concept of constrained minimal cut sets (cMCSs), are an important part of the arsenal of tools used in manipulating metabolic networks. Given a specific design, cMCSs can be calculated even in genome-scale networks. We would however like to find not only the optimal intervention strategy for a given design but the best possible design too. Our solution (PSOMCS) is to use particle swarm optimization (PSO) along with the direct calculation of cMCSs from the stoichiometric matrix to obtain optimal designs satisfying multiple objectives. To illustrate the working of PSOMCS, we apply it to a toy network. Next we show its superiority by comparing its performance against other comparable methods on a medium sized E. coli core metabolic network. PSOMCS not only finds solutions comparable to previously published results but also it is orders of magnitude faster. Finally, we use PSOMCS to predict knockouts satisfying multiple objectives in a genome-scale metabolic model of E. coli and compare it with OptKnock and RobustKnock. PSOMCS finds competitive knockout strategies and designs compared to other current methods and is in some cases significantly faster. It can be used in identifying knockouts which will force optimal desired behaviors in large and genome scale metabolic networks. It will be even more useful as larger metabolic models of industrially relevant organisms become available.
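    For readers unfamiliar with particle swarm optimization, the following is a minimal generic sketch of the technique on a continuous toy problem. It is not PSOMCS: the published method scores candidate designs by computing cMCSs from the stoichiometric matrix, whereas this example simply minimizes a sphere function. All names, parameter values (inertia w, acceleration constants c1 and c2) and the test function are assumptions made for illustration.

```python
import random


def pso_minimize(f, dim, bounds, n_particles=20, n_iter=200,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Generic global-best PSO; returns (best_position, best_value)."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                 # each particle's best position so far
    pbest_val = [f(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # swarm-wide best

    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity: inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move and clamp to the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def sphere(x):
    """Toy objective with its minimum of 0 at the origin."""
    return sum(xi * xi for xi in x)


best_x, best_val = pso_minimize(sphere, dim=2, bounds=(-5.0, 5.0))
print(best_x, best_val)
```

    In a knockout-design setting the position vector would encode a candidate design and the objective would be replaced by a score derived from the resulting intervention strategies; that mapping is specific to the published method and is not reproduced here.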

  19. Facile mutant identification via a single parental backcross method and application of whole genome sequencing based mapping pipelines

    Directory of Open Access Journals (Sweden)

    Robert Silas Allen

    2013-09-01

    Forward genetic screens have identified numerous genes involved in development and metabolism, and remain a cornerstone of biological research. However, to locate a causal mutation, the practice of crossing to a polymorphic background to generate a mapping population can be problematic if the mutant phenotype is difficult to recognise in the hybrid F2 progeny, or dependent on parental specific traits. Here in a screen for leaf hyponasty mutants, we have performed a single backcross of an Ethane Methyl Sulphonate (EMS) generated hyponastic mutant to its parent. Whole genome deep sequencing of a bulked homozygous F2 population and analysis via the Next Generation EMS mutation mapping pipeline (NGM) unambiguously determined the causal mutation to be a single nucleotide polymorphism (SNP) residing in HASTY, a previously characterised gene involved in microRNA biogenesis. We have evaluated the feasibility of this backcross approach using three additional SNP mapping pipelines: SHOREmap, the GATK pipeline, and the samtools pipeline. Although there was variance in the identification of EMS SNPs, all returned the same outcome in clearly identifying the causal mutation in HASTY. The simplicity of performing a single parental backcross and genome sequencing a small pool of segregating mutants has great promise for identifying mutations that may be difficult to map using conventional approaches.

  20. Comparative mapping in Pinus: sugar pine (Pinus lambertiana Dougl.) and loblolly pine (Pinus taeda L.). Tree Genet Genomes 7:457-468

    Science.gov (United States)

    Kathleen D. Jermstad; Andrew J. Eckert; Jill L. Wegrzyn; Annette Delfino-Mix; Dean A Davis; Deems C. Burton; David B. Neale

    2011-01-01

    The majority of genomic research in conifers has been conducted in the Pinus subgenus Pinus mostly due to the high economic importance of the species within this taxon. Genetic maps have been constructed for several of these pines and comparative mapping analyses have consistently revealed notable synteny. In contrast,...

  1. Comparative genomics and association mapping approaches for blast resistant genes in finger millet using SSRs.

    Science.gov (United States)

    Babu, B Kalyana; Dinesh, Pandey; Agrawal, Pawan K; Sood, S; Chandrashekara, C; Bhatt, Jagadish C; Kumar, Anil

    2014-01-01

    The major limiting factor for production and productivity of finger millet crop is blast disease caused by Magnaporthe grisea. Since the genome sequence information available in finger millet crop is scarce, comparative genomics plays a very important role in identification of genes/QTLs linked to the blast resistance genes using SSR markers. In the present study, a total of 58 genic SSRs were developed for use in genetic analysis of a global collection of 190 finger millet genotypes. The 58 SSRs yielded ninety five scorable alleles and the polymorphism information content varied from 0.186 to 0.677 at an average of 0.385. The gene diversity was in the range of 0.208 to 0.726 with an average of 0.487. Association mapping for blast resistance was done using 104 SSR markers which identified four QTLs for finger blast and one QTL for neck blast resistance. The genomic marker RM262 and genic marker FMBLEST32 were linked to finger blast disease at a P value of 0.007 and explained phenotypic variance (R²) of 10% and 8% respectively. The genomic marker UGEP81 was associated with finger blast at a P value of 0.009 and explained 7.5% of R². The QTL for neck blast was associated with the genomic SSR marker UGEP18 at a P value of 0.01, which explained 11% of R². Three QTLs for blast resistance were found common by using both GLM and MLM approaches. The resistant alleles were found to be present mostly in the exotic genotypes. Among the genotypes of NW Himalayan region of India, VHC3997, VHC3996 and VHC3930 were found highly resistant, which may be effectively used as parents for developing blast resistant cultivars in the NW Himalayan region of India. The markers linked to the QTLs for blast resistance in the present study can be further used for cloning of the full length gene, fine mapping and their further use in the marker assisted breeding programmes for introgression of blast resistant alleles into locally adapted cultivars.
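    The abstract reports polymorphism information content (PIC) and gene diversity per marker without stating the formulas. A minimal sketch follows, assuming the commonly used definitions: gene diversity as expected heterozygosity, 1 minus the sum of squared allele frequencies, and PIC as defined by Botstein and colleagues (1980). The allele frequencies are made-up illustrative values, not data from the study.

```python
from itertools import combinations


def gene_diversity(freqs):
    """Expected heterozygosity: 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """PIC (Botstein et al. 1980): 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2."""
    homozygosity = sum(p * p for p in freqs)
    cross_term = sum(2.0 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross_term


# Hypothetical allele frequencies for one SSR marker (must sum to 1).
freqs = [0.6, 0.3, 0.1]
h = gene_diversity(freqs)
p = pic(freqs)
print(round(h, 4), round(p, 4))
```

    Under these definitions PIC is never larger than gene diversity for the same marker, which agrees with the reported averages (0.385 for PIC against 0.487 for gene diversity).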

  2. Comprehensive genomic analysis of a plant growth-promoting rhizobacterium Pantoea agglomerans strain P5.

    Science.gov (United States)

    Shariati J, Vahid; Malboobi, Mohammad Ali; Tabrizi, Zeinab; Tavakol, Elahe; Owilia, Parviz; Safari, Maryam

    2017-11-15

    In this study, we provide a comparative genomic analysis of Pantoea agglomerans strain P5 and 10 closely related strains based on phylogenetic analyses. A next-generation shotgun strategy was implemented using the Illumina HiSeq 2500 technology followed by core- and pan-genome analysis. The genome of P. agglomerans strain P5 contains an assembly size of 5082485 bp with 55.4% G + C content. P. agglomerans consists of 2981 core and 3159 accessory genes for Coding DNA Sequences (CDSs) based on the pan-genome analysis. Strain P5 can be grouped closely with strains PG734 and 299 R using pan and core genes, respectively. All the predicted and annotated gene sequences were allocated to KEGG pathways. Accordingly, genes involved in plant growth-promoting (PGP) ability, including phosphate solubilization, IAA and siderophore production, acetoin and 2,3-butanediol synthesis and bacterial secretion, were assigned. This study provides an in-depth view of the PGP characteristics of strain P5, highlighting its potential use in agriculture as a biofertilizer.

  3. Mapping neighborhood scale survey responses with uncertainty metrics

    Directory of Open Access Journals (Sweden)

    Charles Robert Ehlschlaeger

    2016-12-01

    This paper presents a methodology of mapping population-centric social, infrastructural, and environmental metrics at neighborhood scale. This methodology extends traditional survey analysis methods to create cartographic products useful in agent-based modeling and geographic information analysis. It utilizes and synthesizes survey microdata, sub-upazila attributes, land use information, and ground truth locations of attributes to create neighborhood scale multi-attribute maps. Monte Carlo methods are employed to combine any number of survey responses to stochastically weight survey cases and to simulate survey cases' locations in a study area. Through such Monte Carlo methods, known errors from each of the input sources can be retained. By keeping individual survey cases as the atomic unit of data representation, this methodology ensures that important covariates are retained and that ecological inference fallacy is eliminated. These techniques are demonstrated with a case study from the Chittagong Division in Bangladesh. The results provide a population-centric understanding of many social, infrastructural, and environmental metrics desired in humanitarian aid and disaster relief planning and operations wherever long term familiarity is lacking. Of critical importance is that the resulting products have easy to use explicit representation of the errors and uncertainties of each of the input sources via the automatically generated summary statistics created at the application's geographic scale.

  4. Genome Sequence of the Palaeopolyploid soybean

    Energy Technology Data Exchange (ETDEWEB)

    Schmutz, Jeremy; Cannon, Steven B.; Schlueter, Jessica; Ma, Jianxin; Mitros, Therese; Nelson, William; Hyten, David L.; Song, Qijian; Thelen, Jay J.; Cheng, Jianlin; Xu, Dong; Hellsten, Uffe; May, Gregory D.; Yu, Yeisoo; Sakura, Tetsuya; Umezawa, Taishi; Bhattacharyya, Madan K.; Sandhu, Devinder; Valliyodan, Babu; Lindquist, Erika; Peto, Myron; Grant, David; Shu, Shengqiang; Goodstein, David; Barry, Kerrie; Futrell-Griggs, Montona; Abernathy, Brian; Du, Jianchang; Tian, Zhixi; Zhu, Liucun; Gill, Navdeep; Joshi, Trupti; Libault, Marc; Sethuraman, Anand; Zhang, Xue-Cheng; Shinozaki, Kazuo; Nguyen, Henry T.; Wing, Rod A.; Cregan, Perry; Specht, James; Grimwood, Jane; Rokhsar, Dan; Stacey, Gary; Shoemaker, Randy C.; Jackson, Scott A.

    2009-08-03

    Soybean (Glycine max) is one of the most important crop plants for seed protein and oil content, and for its capacity to fix atmospheric nitrogen through symbioses with soil-borne microorganisms. We sequenced the 1.1-gigabase genome by a whole-genome shotgun approach and integrated it with physical and high-density genetic maps to create a chromosome-scale draft sequence assembly. We predict 46,430 protein-coding genes, 70% more than Arabidopsis and similar to the poplar genome which, like soybean, is an ancient polyploid (palaeopolyploid). About 78% of the predicted genes occur in chromosome ends, which comprise less than one-half of the genome but account for nearly all of the genetic recombination. Genome duplications occurred at approximately 59 and 13 million years ago, resulting in a highly duplicated genome with nearly 75% of the genes present in multiple copies. The two duplication events were followed by gene diversification and loss, and numerous chromosome rearrangements. An accurate soybean genome sequence will facilitate the identification of the genetic basis of many soybean traits, and accelerate the creation of improved soybean varieties.

  5. Advancing the STMS genomic resources for defining new locations on the intraspecific genetic linkage map of chickpea (Cicer arietinum L.)

    Directory of Open Access Journals (Sweden)

    Shokeen Bhumika

    2011-02-01

    Background: Chickpea (Cicer arietinum L.) is an economically important cool season grain legume crop that is valued for its nutritive seeds having high protein content. However, several biotic and abiotic stresses and the low genetic variability in the chickpea genome have continuously hindered the chickpea molecular breeding programs. STMS (Sequence Tagged Microsatellite Sites) markers which are preferred for the construction of saturated linkage maps in several crop species, have also emerged as the most efficient and reliable source for detecting allelic diversity in chickpea. However, the number of STMS markers reported in chickpea is still limited and moreover exhibit low rates of both inter and intraspecific polymorphism, thereby limiting the positions of the SSR markers especially on the intraspecific linkage maps of chickpea. Hence, this study was undertaken with the aim of developing additional STMS markers and utilizing them for advancing the genetic linkage map of chickpea which would have applications in QTL identification, MAS and for de novo assembly of high throughput whole genome sequence data. Results: A microsatellite enriched library of chickpea (enriched for (GT/CA)n and (GA/CT)n repeats) was constructed from which 387 putative microsatellite containing clones were identified. From these, 254 STMS primers were designed of which 181 were developed as functional markers. An intraspecific mapping population of chickpea, [ICCV-2 (single podded) × JG-62 (double podded)] and comprising of 126 RILs, was genotyped for mapping. Of the 522 chickpea STMS markers (including the double-podding trait), screened for parental polymorphism, 226 (43.3%) were polymorphic in the parents and were used to genotype the RILs. At a LOD score of 3.5, eight linkage groups defining the position of 138 markers were obtained that spanned 630.9 cM with an average marker density of 4.57 cM. Further, based on the common loci present between the current map

  6. Advancing the STMS genomic resources for defining new locations on the intraspecific genetic linkage map of chickpea (Cicer arietinum L.).

    Science.gov (United States)

    Gaur, Rashmi; Sethy, Niroj K; Choudhary, Shalu; Shokeen, Bhumika; Gupta, Varsha; Bhatia, Sabhyata

    2011-02-17

    Chickpea (Cicer arietinum L.) is an economically important cool season grain legume crop that is valued for its nutritive seeds having high protein content. However, several biotic and abiotic stresses and the low genetic variability in the chickpea genome have continuously hindered the chickpea molecular breeding programs. STMS (Sequence Tagged Microsatellite Sites) markers which are preferred for the construction of saturated linkage maps in several crop species, have also emerged as the most efficient and reliable source for detecting allelic diversity in chickpea. However, the number of STMS markers reported in chickpea is still limited and moreover exhibit low rates of both inter and intraspecific polymorphism, thereby limiting the positions of the SSR markers especially on the intraspecific linkage maps of chickpea. Hence, this study was undertaken with the aim of developing additional STMS markers and utilizing them for advancing the genetic linkage map of chickpea which would have applications in QTL identification, MAS and for de novo assembly of high throughput whole genome sequence data. A microsatellite enriched library of chickpea (enriched for (GT/CA)n and (GA/CT)n repeats) was constructed from which 387 putative microsatellite containing clones were identified. From these, 254 STMS primers were designed of which 181 were developed as functional markers. An intraspecific mapping population of chickpea, [ICCV-2 (single podded) × JG-62 (double podded)] and comprising of 126 RILs, was genotyped for mapping. Of the 522 chickpea STMS markers (including the double-podding trait), screened for parental polymorphism, 226 (43.3%) were polymorphic in the parents and were used to genotype the RILs. At a LOD score of 3.5, eight linkage groups defining the position of 138 markers were obtained that spanned 630.9 cM with an average marker density of 4.57 cM. Further, based on the common loci present between the current map and the previously published chickpea
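    The two summary statistics quoted in this abstract can be reproduced directly from the reported counts. The short check below assumes that "average marker density" means total map length divided by the number of mapped markers, which is what the numbers suggest; the variable names are illustrative only.

```python
# Counts and lengths as reported in the abstract.
map_length_cm = 630.9      # total length of the eight linkage groups
mapped_markers = 138       # markers placed on the map
screened_markers = 522     # markers screened for parental polymorphism
polymorphic_markers = 226  # markers polymorphic between the parents

# Assumed definition: map length per mapped marker.
density_cm = map_length_cm / mapped_markers
polymorphism_pct = 100.0 * polymorphic_markers / screened_markers

print(round(density_cm, 2))        # cM per marker
print(round(polymorphism_pct, 1))  # percent polymorphic
```

    Both values round to the figures given in the text (4.57 cM and 43.3%), so the reported statistics are internally consistent under that definition.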

  7. Map of open and closed chromatin domains in Drosophila genome.

    Science.gov (United States)

    Milon, Beatrice; Sun, Yezhou; Chang, Weizhong; Creasy, Todd; Mahurkar, Anup; Shetty, Amol; Nurminsky, Dmitry; Nurminskaya, Maria

    2014-11-18

    Chromatin compactness has been considered a major determinant of gene activity and has been associated with specific chromatin modifications in studies on a few individual genetic loci. At the same time, genome-wide patterns of open and closed chromatin have been understudied, and are at present largely predicted from chromatin modification and gene expression data. However the universal applicability of such predictions is not self-evident, and requires experimental verification. We developed and implemented a high-throughput analysis for general chromatin sensitivity to DNase I which provides a comprehensive epigenomic assessment in a single assay. Contiguous domains of open and closed chromatin were identified by computational analysis of the data, and correlated to other genome annotations including predicted chromatin "states", individual chromatin modifications, nuclear lamina interactions, and gene expression. While showing that the widely trusted predictions of chromatin structure are correct in the majority of cases, we detected diverse "exceptions" from the conventional rules. We found a profound paucity of chromatin modifications in a major fraction of closed chromatin, and identified a number of loci where chromatin configuration is opposite to that expected from modification and gene expression patterns. Further, we observed that chromatin of large introns tends to be closed even when the genes are expressed, and that a significant proportion of active genes including their promoters are located in closed chromatin. These findings reveal limitations of the existing predictive models, indicate novel mechanisms of epigenetic regulation, and provide important insights into genome organization and function.

  8. Genome-Wide Mapping of Structural Variations Reveals a Copy Number Variant That Determines Reproductive Morphology in Cucumber

    NARCIS (Netherlands)

    Zhang, Z.; Mao, L.; Chen, Junshi; Bu, F.; Li, G.; Sun, J.; Li, S.; Sun, H.; Jiao, C.; Blakely, R.; Pan, J.; Cai, R.; Luo, R.; Peer, Van de Y.; Jacobsen, E.; Fei, Z.; Huang, S.

    2015-01-01

    Structural variations (SVs) represent a major source of genetic diversity. However, the functional impact and formation mechanisms of SVs in plant genomes remain largely unexplored. Here, we report a nucleotide-resolution SV map of cucumber (Cucumis sativus) that comprises 26,788 SVs based on deep

  9. Global mapping of transposon location.

    Directory of Open Access Journals (Sweden)

    Abram Gabriel

    2006-12-01

    Transposable genetic elements are ubiquitous, yet their presence or absence at any given position within a genome can vary between individual cells, tissues, or strains. Transposable elements have profound impacts on host genomes by altering gene expression, assisting in genomic rearrangements, causing insertional mutations, and serving as sources of phenotypic variation. Characterizing a genome's full complement of transposons requires whole genome sequencing, precluding simple studies of the impact of transposition on interindividual variation. Here, we describe a global mapping approach for identifying transposon locations in any genome, using a combination of transposon-specific DNA extraction and microarray-based comparative hybridization analysis. We use this approach to map the repertoire of endogenous transposons in different laboratory strains of Saccharomyces cerevisiae and demonstrate that transposons are a source of extensive genomic variation. We also apply this method to mapping bacterial transposon insertion sites in a yeast genomic library. This unique whole genome view of transposon location will facilitate our exploration of transposon dynamics, as well as defining bases for individual differences and adaptive potential.

  10. A comparison of multidimensional scaling methods for perceptual mapping

    NARCIS (Netherlands)

    Bijmolt, T.H.A.; Wedel, M.

    Multidimensional scaling has been applied to a wide range of marketing problems, in particular to perceptual mapping based on dissimilarity judgments. The introduction of methods based on the maximum likelihood principle is one of the most important developments. In this article, the authors compare

  11. Construction and Analysis of Siberian Tiger Bacterial Artificial Chromosome Library with Approximately 6.5-Fold Genome Equivalent Coverage

    Directory of Open Access Journals (Sweden)

    Changqing Liu

    2014-03-01

    Full Text Available Bacterial artificial chromosome (BAC) libraries are extremely valuable for the genome-wide genetic dissection of complex organisms. The Siberian tiger, one of the most well-known wild primitive carnivores in China, is an endangered animal. In order to promote research on its genome, a high-redundancy BAC library of the Siberian tiger was constructed and characterized. The library is divided into two sub-libraries prepared from blood cells and two sub-libraries prepared from fibroblasts. This BAC library contains 153,600 individually archived clones; for PCR-based screening of the library, BACs were placed into 40 superpools of 10 × 384-deep well microplates. The average insert size of BAC clones was estimated to be 116.5 kb, representing approximately 6.46 genome equivalents of the haploid genome and affording a 98.86% statistical probability of obtaining at least one clone containing a unique DNA sequence. Screening the library with 19 microsatellite markers and an SRY sequence revealed that each of these markers was present in the library; the average number of positive clones per marker was 6.74 (range 2 to 12), consistent with 6.46-fold coverage of the tiger genome. Additionally, we identified 72 microsatellite markers that could potentially be used as genetic markers. This BAC library will serve as a valuable resource for physical mapping, comparative genomic study and large-scale genome sequencing in the tiger.
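
    The coverage figures quoted in this abstract can be related to one another with a short calculation. The sketch below is illustrative only: the genome size is not given in the abstract, so `ASSUMED_GENOME_KB` is a hypothetical value back-calculated from the reported clone count, insert size and 6.46 genome equivalents. The probability uses the textbook Clarke-Carbon formula, which need not reproduce the 98.86% reported above, since the authors may have used different inputs.

```python
def genome_equivalents(n_clones: int, mean_insert_kb: float, genome_kb: float) -> float:
    """Library coverage: total cloned DNA divided by the haploid genome size."""
    return n_clones * mean_insert_kb / genome_kb


def prob_sequence_in_library(n_clones: int, mean_insert_kb: float, genome_kb: float) -> float:
    """Clarke-Carbon estimate P = 1 - (1 - f)^N, where f = insert size / genome size."""
    fraction_per_clone = mean_insert_kb / genome_kb
    return 1.0 - (1.0 - fraction_per_clone) ** n_clones


# Values from the abstract.
N_CLONES = 153_600
MEAN_INSERT_KB = 116.5
# Hypothetical genome size (about 2.77 Gb), assumed here for illustration only.
ASSUMED_GENOME_KB = 2.77e6

coverage = genome_equivalents(N_CLONES, MEAN_INSERT_KB, ASSUMED_GENOME_KB)
probability = prob_sequence_in_library(N_CLONES, MEAN_INSERT_KB, ASSUMED_GENOME_KB)
print(f"coverage: {coverage:.2f} genome equivalents")
print(f"Clarke-Carbon probability: {probability:.4f}")
```

    Changing `ASSUMED_GENOME_KB` shows how sensitive both figures are to the genome size estimate.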

  12. pcaGoPromoter--an R package for biological and regulatory interpretation of principal components in genome-wide gene expression data

    DEFF Research Database (Denmark)

    Hansen, Morten; Gerds, Thomas Alexander; Nielsen, Ole Haagen

    2012-01-01

    Analyzing data obtained from genome-wide gene expression experiments is challenging due to the quantity of variables, the need for multivariate analyses, and the demands of managing large amounts of data. Here we present the R package pcaGoPromoter, which facilitates the interpretation of genome.......g., cell cycle progression and the predicted involvement of expected transcription factors, including E2F. In addition, unexpected results, e.g., cholesterol synthesis in serum-depleted cells and NF-κB activation in inhibitor treated cells, were noted. In summary, the pcaGoPromoter R package provides...

  13. Small genomes: New initiatives in mapping and sequencing. Workshop summary report

    Energy Technology Data Exchange (ETDEWEB)

    McKenney, K. [National Inst. of Standards and Technology, Gaithersburg, MD (United States). Biotechnology Div.; Robb, F. [Univ. of Maryland Biotechnology Inst., Baltimore, MD (United States). Center of Marine Biotechnology

    1993-12-31

    The workshop was held 5-7 July 1993 at the Center for Advanced Research in Biotechnology (CARB) and hosted by the University of Maryland Biotechnology Institute (UMBI) and the National Institute of Standards and Technology (NIST). The objective of this workshop was to bring together individuals interested in DNA technologies and to determine the impact of these current and potential improvements of the speed and cost-effectiveness of mapping and sequencing on the planning of future small genome projects. A major goal of the workshop was to spur the collaboration of more diverse groups of scientists working on this topic, and to minimize competitiveness as an inhibitory factor to progress.

  14. Mediator binding to UASs is broadly uncoupled from transcription and cooperative with TFIID recruitment to promoters.

    Science.gov (United States)

    Grünberg, Sebastian; Henikoff, Steven; Hahn, Steven; Zentner, Gabriel E

    2016-11-15

    Mediator is a conserved, essential transcriptional coactivator complex, but its in vivo functions have remained unclear due to conflicting data regarding its genome-wide binding pattern obtained by genome-wide ChIP. Here, we used ChEC-seq, a method orthogonal to ChIP, to generate a high-resolution map of Mediator binding to the yeast genome. We find that Mediator associates with upstream activating sequences (UASs) rather than the core promoter or gene body under all conditions tested. Mediator occupancy is surprisingly correlated with transcription levels at only a small fraction of genes. Using the same approach to map TFIID, we find that TFIID is associated with both TFIID- and SAGA-dependent genes and that TFIID and Mediator occupancy is cooperative. Our results clarify Mediator recruitment and binding to the genome, showing that Mediator binding to UASs is widespread, partially uncoupled from transcription, and mediated in part by TFIID. © 2016 The Authors.

  15. Identification of TNF-α-responsive promoters and enhancers in the intestinal epithelial cell model Caco-2

    DEFF Research Database (Denmark)

    Boyd, Mette; Coskun, Mehmet; Lilje, Berit

    2014-01-01

    The Caco-2 cell line is one of the most important in vitro models for enterocytes, and is used to study drug absorption and disease, including inflammatory bowel disease and cancer. In order to use the model optimally, it is necessary to map its functional entities. In this study, we have generated...... genome-wide maps of active transcription start sites (TSSs), and active enhancers in Caco-2 cells with or without tumour necrosis factor (TNF)-α stimulation to mimic an inflammatory state. We found 520 promoters that significantly changed their usage level upon TNF-α stimulation; of these, 52...... promoters. As a case example, we characterize an enhancer regulating the laminin-5 γ2-chain (LAMC2) gene by nuclear factor (NF)-κB binding. This report is the first to present comprehensive TSS and enhancer maps over Caco-2 cells, and highlights many novel inflammation-specific promoters and enhancers....

  16. Physical mapping of 20 unmapped fragments of the btau_4.0 genome assembly in cattle, sheep and river buffalo.

    Science.gov (United States)

    De Lorenzi, L; Genualdo, V; Perucatti, A; Iannuzzi, A; Iannuzzi, L; Parma, P

    2013-01-01

    The recent advances in sequencing technology and bioinformatics have revolutionized genomic research, making the decoding of the genome an easier task. Genome sequences are currently available for many species, including cattle, sheep and river buffalo. The available reference genomes are very accurate, and they represent the best possible order of loci at this time. In cattle, despite the great accuracy achieved, a part of the genome has been sequenced but not yet assembled: these genome fragments are called unmapped fragments. In the present study, 20 unmapped fragments belonging to the Btau_4.0 reference genome have been mapped by FISH in cattle (Bos taurus, 2n = 60), sheep (Ovis aries, 2n = 54) and river buffalo (Bubalus bubalis, 2n = 50). Our results confirm the accuracy of the available reference genome, though there are some discrepancies between the expected localization and the observed localization. Moreover, the available data in the literature regarding genomic homologies between cattle, sheep and river buffalo are confirmed. Finally, the results presented here suggest that FISH was, and still is, a useful technology to validate the data produced by genome sequencing programs. Copyright © 2013 S. Karger AG, Basel.

  17. Linkage disequilibrium between STRPs and SNPs across the human genome.

    Science.gov (United States)

    Payseur, Bret A; Place, Michael; Weber, James L

    2008-05-01

    Patterns of linkage disequilibrium (LD) reveal the action of evolutionary processes and provide crucial information for association mapping of disease genes. Although recent studies have described the landscape of LD among single nucleotide polymorphisms (SNPs) from across the human genome, associations involving other classes of molecular variation remain poorly understood. In addition to recombination and population history, mutation rate and process are expected to shape LD. To test this idea, we measured associations between short-tandem-repeat polymorphisms (STRPs), which can mutate rapidly and recurrently, and SNPs in 721 regions across the human genome. We directly compared STRP-SNP LD with SNP-SNP LD from the same genomic regions in the human HapMap populations. The intensity of STRP-SNP LD, measured by the average of D', was reduced, consistent with the action of recurrent mutation. Nevertheless, a higher fraction of STRP-SNP pairs than SNP-SNP pairs showed significant LD, on both short (up to 50 kb) and long (cM) scales. These results reveal the substantial effects of mutational processes on LD at STRPs and provide important measures of the potential of STRPs for association mapping of disease genes.
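
    For readers unfamiliar with the D' statistic mentioned above, the following is a minimal sketch of the standard normalized disequilibrium coefficient for a pair of biallelic loci. It is not the authors' pipeline: STRPs are typically multiallelic, and averaging D' across allele pairs would require an extension that is not shown here. The function name and the example frequencies are hypothetical.

```python
def d_prime(p_ab: float, p_a: float, p_b: float) -> float:
    """Normalized linkage disequilibrium |D'| for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a:  frequency of allele A at the first locus
    p_b:  frequency of allele B at the second locus
    """
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient D
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    if d_max == 0:
        return 0.0  # a monomorphic locus carries no LD information
    return abs(d) / d_max


# Hypothetical haplotype frequencies with both allele frequencies at 0.5.
complete_ld = d_prime(0.5, 0.5, 0.5)
partial_ld = d_prime(0.375, 0.5, 0.5)
no_ld = d_prime(0.25, 0.5, 0.5)
print(complete_ld, partial_ld, no_ld)
```

    Recurrent mutation at one locus creates haplotypes that break the association, which lowers D' in the way the abstract describes for STRP-SNP pairs.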

  18. A genomic library-based amplification approach (GL-PCR) for the mapping of multiple IS6110 insertion sites and strain differentiation of Mycobacterium tuberculosis.

    Science.gov (United States)

    Namouchi, Amine; Mardassi, Helmi

    2006-11-01

    Evidence suggests that insertion of the IS6110 element is not without consequence to the biology of Mycobacterium tuberculosis complex strains. Thus, mapping of multiple IS6110 insertion sites in the genome of biomedically relevant clinical isolates would result in a better understanding of the role of this mobile element, particularly with regard to transmission, adaptability and virulence. In the present paper, we describe a versatile strategy, referred to as GL-PCR, that amplifies IS6110-flanking sequences based on the construction of a genomic library. M. tuberculosis chromosomal DNA is fully digested with HincII and then ligated into a plasmid vector between T7 and T3 promoter sequences. The ligation reaction product is transformed into Escherichia coli and selective PCR amplification targeting both 5' and 3' IS6110-flanking sequences is performed on the plasmid library DNA. For this purpose, four separate PCR reactions are performed, each combining an outward primer specific for one IS6110 end with either T7 or T3 primer. Determination of the nucleotide sequence of the PCR products generated from a single ligation reaction allowed mapping of 21 out of the 24 IS6110 copies of two 12-banded M. tuberculosis strains, yielding an overall sensitivity of 87.5%. Furthermore, by simply comparing the migration pattern of GL-PCR-generated products, the strategy proved to be as valuable as IS6110 RFLP for molecular typing of M. tuberculosis complex strains. Importantly, GL-PCR was able to discriminate between strains differing by a single IS6110 band.

  19. Unexplored therapeutic opportunities in the human genome.

    Science.gov (United States)

    Oprea, Tudor I; Bologa, Cristian G; Brunak, Søren; Campbell, Allen; Gan, Gregory N; Gaulton, Anna; Gomez, Shawn M; Guha, Rajarshi; Hersey, Anne; Holmes, Jayme; Jadhav, Ajit; Jensen, Lars Juhl; Johnson, Gary L; Karlson, Anneli; Leach, Andrew R; Ma'ayan, Avi; Malovannaya, Anna; Mani, Subramani; Mathias, Stephen L; McManus, Michael T; Meehan, Terrence F; von Mering, Christian; Muthas, Daniel; Nguyen, Dac-Trung; Overington, John P; Papadatos, George; Qin, Jun; Reich, Christian; Roth, Bryan L; Schürer, Stephan C; Simeonov, Anton; Sklar, Larry A; Southall, Noel; Tomita, Susumu; Tudose, Ilinca; Ursu, Oleg; Vidovic, Dušica; Waller, Anna; Westergaard, David; Yang, Jeremy J; Zahoránszky-Köhalmi, Gergely

    2018-05-01

    A large proportion of biomedical research and the development of therapeutics is focused on a small fraction of the human genome. In a strategic effort to map the knowledge gaps around proteins encoded by the human genome and to promote the exploration of currently understudied, but potentially druggable, proteins, the US National Institutes of Health launched the Illuminating the Druggable Genome (IDG) initiative in 2014. In this article, we discuss how the systematic collection and processing of a wide array of genomic, proteomic, chemical and disease-related resource data by the IDG Knowledge Management Center have enabled the development of evidence-based criteria for tracking the target development level (TDL) of human proteins, which indicates a substantial knowledge deficit for approximately one out of three proteins in the human proteome. We then present spotlights on the TDL categories as well as key drug target classes, including G protein-coupled receptors, protein kinases and ion channels, which illustrate the nature of the unexplored opportunities for biomedical research and therapeutic development.

  20. QTL mapping of genome regions controlling temephos resistance in larvae of the mosquito Aedes aegypti.

    Science.gov (United States)

    Reyes-Solis, Guadalupe Del Carmen; Saavedra-Rodriguez, Karla; Suarez, Adriana Flores; Black, William C

    2014-10-01

    The mosquito Aedes aegypti is the principal vector of dengue and yellow fever flaviviruses. Temephos is an organophosphate insecticide used globally to suppress Ae. aegypti larval populations but resistance has evolved in many locations. Quantitative Trait Loci (QTL) controlling temephos survival in Ae. aegypti larvae were mapped in a pair of F3 advanced intercross lines arising from temephos resistant parents from Solidaridad, México and temephos susceptible parents from Iquitos, Peru. Two sets of 200 F3 larvae were exposed to a discriminating dose of temephos and then dead larvae were collected and preserved for DNA isolation every two hours up to 16 hours. Larvae surviving longer than 16 hours were considered resistant. For QTL mapping, single nucleotide polymorphisms (SNPs) were identified at 23 single copy genes and 26 microsatellite loci of known physical positions in the Ae. aegypti genome. In both reciprocal crosses, Multiple Interval Mapping identified eleven QTL associated with time until death. In the Solidaridad×Iquitos (SLD×Iq) cross twelve were associated with survival but in the reciprocal Iq×SLD cross, only six QTL were survival associated. Polymorphisms at acetylcholine esterase (AchE) loci 1 and 2 were not associated with either resistance phenotype suggesting that target site insensitivity is not an organophosphate resistance mechanism in this region of México. Temephos resistance is under the control of many metabolic genes of small effect and dispersed throughout the Ae. aegypti genome.

  1. C-terminal region of MAP7 domain containing protein 3 (MAP7D3) promotes microtubule polymerization by binding at the C-terminal tail of tubulin.

    Directory of Open Access Journals (Sweden)

    Saroj Yadav

    Full Text Available MAP7 domain containing protein 3 (MAP7D3), a newly identified microtubule associated protein, has been shown to promote microtubule assembly and stability. Its microtubule binding region has been reported to consist of two coiled coil motifs located at the N-terminus. It possesses a MAP7 domain near the C-terminus and belongs to the microtubule associated protein 7 (MAP7) family. The MAP7 domain of MAP7 protein has been shown to bind to kinesin-1; however, the role of MAP7 domain in MAP7D3 remains unknown. Based on the bioinformatics analysis of MAP7D3, we hypothesized that the MAP7 domain of MAP7D3 may have microtubule binding activity. Indeed, we found that MAP7 domain of MAP7D3 bound to microtubules as well as enhanced the assembly of microtubules in vitro. Interestingly, a longer fragment MDCT that contained the MAP7 domain (MD) with the C-terminal tail (CT) of the protein promoted microtubule polymerization to a greater extent than MD and CT individually. MDCT stabilized microtubules against dilution induced disassembly. MDCT bound to reconstituted microtubules with an apparent dissociation constant of 3.0 ± 0.5 µM. An immunostaining experiment showed that MDCT localized along the length of the preassembled microtubules. Competition experiments with tau indicated that MDCT shares its binding site on microtubules with tau. Further, we present evidence indicating that MDCT binds to the C-terminal tail of tubulin. In addition, MDCT could bind to tubulin in HeLa cell extract. Here, we report a microtubule binding region in the C-terminal region of MAP7D3 that may have a role in regulating microtubule assembly dynamics.

  2. A protocol for generating a high-quality genome-scale metabolic reconstruction.

    Science.gov (United States)

    Thiele, Ines; Palsson, Bernhard Ø

    2010-01-01

    Network reconstructions are a common denominator in systems biology. Bottom-up metabolic network reconstructions have been developed over the last 10 years. These reconstructions represent structured knowledge bases that abstract pertinent information on the biochemical transformations taking place within specific target organisms. The conversion of a reconstruction into a mathematical format facilitates a myriad of computational biological studies, including evaluation of network content, hypothesis testing and generation, analysis of phenotypic characteristics and metabolic engineering. To date, genome-scale metabolic reconstructions for more than 30 organisms have been published and this number is expected to increase rapidly. However, these reconstructions differ in quality and coverage that may minimize their predictive potential and use as knowledge bases. Here we present a comprehensive protocol describing each step necessary to build a high-quality genome-scale metabolic reconstruction, as well as the common trials and tribulations. Therefore, this protocol provides a helpful manual for all stages of the reconstruction process.

  3. A physical map of the heterozygous grapevine 'Cabernet Sauvignon' allows mapping candidate genes for disease resistance

    Directory of Open Access Journals (Sweden)

    Scalabrin Simone

    2008-06-01

    Full Text Available Abstract Background: Whole-genome physical maps facilitate genome sequencing, sequence assembly, mapping of candidate genes, and the design of targeted genetic markers. An automated protocol was used to construct a Vitis vinifera 'Cabernet Sauvignon' physical map. The quality of the result was addressed with regard to the effect of high heterozygosity on the accuracy of contig assembly. Its usefulness for the genome-wide mapping of genes for disease resistance, which is an important trait for grapevine, was then assessed. Results: The physical map included 29,727 BAC clones assembled into 1,770 contigs, spanning 715,684 kbp, and corresponding to 1.5-fold the genome size. Map inflation was due to high heterozygosity, which caused either the separation of allelic BACs in two different contigs, or local mis-assembly in contigs containing BACs from the two haplotypes. Genetic markers anchored 395 contigs or 255,476 kbp to chromosomes. The fully automated assembly and anchorage procedures were validated by BAC-by-BAC blast of the end sequences against the grape genome sequence, unveiling 7.3% of chimerical contigs. The distribution across the physical map of candidate genes for non-host and host resistance, and for defence signalling pathways was then studied. NBS-LRR and RLK genes for host resistance were found in 424 contigs, 133 of them (32%) were assigned to chromosomes, on which they are mostly organised in clusters. Non-host and defence signalling genes were found in 99 contigs dispersed without a discernible pattern across the genome. Conclusion: Despite some limitations that interfere with the correct assembly of heterozygous clones into contigs, the 'Cabernet Sauvignon' physical map is a useful and reliable intermediary step between a genetic map and the genome sequence. This tool was successfully exploited for a quick mapping of complex families of genes, and it strengthened previous clues of co-localisation of major NBS-LRR clusters and

  4. Genome-Scale Reconstruction of the Human Astrocyte Metabolic Network

    OpenAIRE

    Martín-Jiménez, Cynthia A.; Salazar-Barreto, Diego; Barreto, George E.; González, Janneth

    2017-01-01

    Astrocytes are the most abundant cells of the central nervous system; they have a predominant role in maintaining brain metabolism. In this sense, abnormal metabolic states have been found in different neuropathological diseases. Determination of metabolic states of astrocytes is difficult to model using current experimental approaches given the high number of reactions and metabolites present. Thus, genome-scale metabolic networks derived from transcriptomic data can be used as a framework t...

  5. Cellular promoters incorporated into the adenovirus genome: effects of viral regulatory elements on transcription rates and cell specificity of albumin and beta-globin promoters.

    OpenAIRE

    Babiss, L E; Friedman, J M; Darnell, J E

    1986-01-01

    In the accompanying paper (Friedman et al., Mol. Cell. Biol. 6:3791-3797, 1986), hepatoma-specific expression of the rat albumin promoter within the adenovirus genome was demonstrated. However, the rate of transcription was very low compared with that of the endogenous chromosomal albumin gene. Here we show that in hepatoma cells the adenovirus E1A enhancer, especially in the presence of E1A protein, greatly stimulates transcription from the albumin promoter but not the mouse beta-globin prom...

  6. Comparison of the large-scale radon risk map for southern Belgium with results of high resolution surveys

    International Nuclear Information System (INIS)

    Zhu, H.-C.; Charlet, J.M.; Poffijn, A.

    2000-01-01

    A large-scale radon survey consisting of long-term measurements in about 5200 single-family houses in the southern part of Belgium was carried out from 1995 to 1999. A radon risk map for the region was produced using geostatistical and GIS approaches. Some communes or villages situated within high risk areas were chosen for detailed surveys. A high resolution radon survey with about 330 measurements was performed in half of the commune of Burg-Reuland. Comparison of radon maps on quite different scales shows that the general Rn risk map has a similar pattern to the radon map for the detailed study area. Another detailed radon survey in the village of Hatrival, situated in a high radon area, found a very high proportion of houses with elevated radon concentrations. The results of this detailed survey are comparable to the expectation for high risk areas on the large-scale radon risk map. The good correspondence between the findings of the general risk map and the analysis of the limited detailed surveys suggests that the large-scale radon risk map is likely reliable. (author)

  7. Fine-scale mapping of 8q24 locus identifies multiple independent risk variants for breast cancer

    NARCIS (Netherlands)

    J. Shi (Jiajun); Zhang, Y. (Yanfeng); W. Zheng (Wei); K. Michailidou (Kyriaki); M. Ghoussaini (Maya); M.K. Bolla (Manjeet K.); Wang, Q. (Qin); J. Dennis (Joe); Lush, M. (Michael); R.L. Milne (Roger); X.-O. Shu (Xiao-Ou); J. Beesley (Jonathan); S. Kar (Siddhartha); I.L. Andrulis (Irene); H. Anton-Culver (Hoda); Arndt, V. (Volker); M.W. Beckmann (Matthias); Z. Zhao (Zhiguo); Guo, X. (Xingyi); J. Benítez (Javier); A. Beeghly-Fadiel (Alicia); W.J. Blot (William); N.V. Bogdanova (Natalia); S.E. Bojesen (Stig); H. Brauch (Hiltrud); H. Brenner (Hermann); L.A. Brinton (Louise); A. Broeks (Annegien); T. Brüning (Thomas); B. Burwinkel (Barbara); H. Cai (Hui); S. Canisius (Sander); J. Chang-Claude (Jenny); Choi, J.-Y. (Ji-Yeob); F.J. Couch (Fergus); A. Cox (Angela); S.S. Cross (Simon); K. Czene (Kamila); H. Darabi (Hatef); P. Devilee (Peter); A. Droit (Arnaud); T. Dörk (Thilo); P.A. Fasching (Peter); O. Fletcher (Olivia); H. Flyger (Henrik); F. Fostira (Florentia); Gaborieau, V. (Valerie); M. García-Closas (Montserrat); G.G. Giles (Graham); Grip, M. (Mervi); P. Guénel (Pascal); C.A. Haiman (Christopher A.); U. Hamann (Ute); J.M. Hartman (Joost); X. Miao; A. Hollestelle (Antoinette); J.L. Hopper (John); Hsiung, C.-N. (Chia-Ni); H. Ito (Hidemi); A. Jakubowska (Anna); Johnson, N. (Nichola); D. Torres (Diana); M. Kabisch (Maria); D. Kang (Daehee); S. Khan (Sofia); J.A. Knight (Julia); V-M. Kosma (Veli-Matti); Lambrechts, D. (Diether); J. Li (Jingmei); A. Lindblom (Annika); A. Lophatananon (Artitaya); J. Lubinski (Jan); A. Mannermaa (Arto); S. Manoukian (Siranoush); L. Le Marchand (Loic); S. Margolin (Sara); Marme, F. (Frederik); K. Matsuo (Keitaro); C.A. McLean (Catriona Ann); A. Meindl (Alfons); K.R. Muir (K.); S.L. Neuhausen (Susan); H. Nevanlinna (Heli); S. Nord (Silje); A.-L. Borresen-Dale (Anne-Lise); J.E. Olson (Janet); N. Orr (Nick); A.M.W. van den Ouweland (Ans); P. Peterlongo (Paolo); T.C. Putti (Thomas Choudary); Rudolph, A. (Anja); Sangrajrang, S. (Suleeporn); E.J. Sawyer (Elinor); M.K. Schmidt (Marjanka); R.K. Schmutzler (Rita); C.-Y. Shen (Chen-Yang); M.-F. Hou (Ming-Feng); M. Shrubsole (Martha); M.C. Southey (Melissa); A.J. Swerdlow (Anthony); Hwang Teo, S. (Soo); B. Thienpont (Bernard); A.E. Toland (Amanda); R.A.E.M. Tollenaar (Rob); I.P. Tomlinson (Ian); T. Truong (Thérèse); C.-C. Tseng (Chiu-Chen); W. Wen (Wanqing); R. Winqvist (Robert); A.H. Wu (Anna); C. Har Yip (Cheng); P.M. Zamora (Pilar M.); Zheng, Y. (Ying); O.A.M. Floris; Cheng, C.-Y. (Ching-Yu); M.J. Hooning (Maartje); J.W.M. Martens (John); C.M. Seynaeve (Caroline); V. Kristensen (Vessela); P. Hall (Per); P.D.P. Pharoah (Paul); J. Simard (Jacques); G. Chenevix-Trench (Georgia); A.M. Dunning (Alison); A.C. Antoniou (Antonis C.); D.F. Easton (Douglas F.); Q. Cai (Qiuyin); J. Long (Jirong)

    2016-01-01

    Previous genome-wide association studies among women of European ancestry identified two independent breast cancer susceptibility loci represented by single nucleotide polymorphisms (SNPs) rs13281615 and rs11780156 at 8q24. A fine-mapping study across 2.06 Mb

  8. ANALYSIS OF RADAR AND OPTICAL SPACE BORNE DATA FOR LARGE SCALE TOPOGRAPHICAL MAPPING

    Directory of Open Access Journals (Sweden)

    W. Tampubolon

    2015-03-01

    Full Text Available Normally, in order to provide high resolution three-dimensional (3D) geospatial data, large scale topographical mapping needs input from conventional airborne campaigns, which in Indonesia are bureaucratically complicated, especially during legal administration procedures, i.e. security clearance from the military/defense ministry. This often causes additional time delays besides technical constraints such as weather and limited aircraft availability for airborne campaigns. Of course the geospatial data quality is an important issue for many applications. The increasing demand for geospatial data nowadays consequently requires high resolution datasets as well as a sufficient level of accuracy. Therefore an integration of different technologies is required in many cases to gain the expected result, especially in the context of disaster preparedness and emergency response. Another important issue in this context is the fast delivery of relevant data, which is expressed by the term “Rapid Mapping”. In this paper we present first results of on-going research to integrate different data sources like space borne radar and optical platforms. Initially the orthorectification of Very High Resolution Satellite (VHRS) imagery, i.e. SPOT-6, has been done as a continuous process to the DEM generation using TerraSAR-X/TanDEM-X data. The role of Ground Control Points (GCPs) from GNSS surveys is mandatory in order to fulfil geometrical accuracy. In addition, this research aims at providing a suitable processing algorithm of space borne data for large scale topographical mapping as described in section 3.2. Recently, radar space borne data has been used for medium scale topographical mapping, e.g. for 1:50,000 map scale in Indonesian territories. The goal of this on-going research is to increase the accuracy of remote sensing data by different activities, e.g. the integration of different data sources (optical and radar) or the usage of the GCPs in both, the optical and the

  9. The research of selection model based on LOD in multi-scale display of electronic map

    Science.gov (United States)

    Zhang, Jinming; You, Xiong; Liu, Yingzhen

    2008-10-01

    This paper proposes a selection model based on LOD to aid the display of electronic map. The ratio of display scale to map scale is regarded as a LOD operator. The categorization rule, classification rule, elementary rule and spatial geometry character rule of LOD operator setting are also concluded.

  10. The Genetics of Winterhardiness in Barley: Perspectives from Genome-Wide Association Mapping

    Directory of Open Access Journals (Sweden)

    Jarislav von Zitzewitz

    2011-03-01

    Full Text Available Winterhardiness is a complex trait that involves low temperature tolerance (LTT), vernalization sensitivity, and photoperiod sensitivity. Quantitative trait loci (QTL) for these traits were first identified using biparental mapping populations; candidate genes for all loci have since been identified and characterized. In this research we used a set of 148 accessions consisting of advanced breeding lines from the Oregon barley ( L. subsp breeding program and selected cultivars that were extensively phenotyped and genotyped with single nucleotide polymorphisms. Using these data for genome-wide association mapping we detected the same QTL and genes that have been systematically characterized using biparental populations over nearly two decades of intensive research. In this sample of germplasm, maximum LTT can be achieved with facultative growth habit, which can be predicted using a three-locus haplotype involving , , and . The and LTT QTL explained 25% of the phenotypic variation, offering the prospect that additional gains from selection can be achieved once favorable alleles are fixed at these loci.

  11. Genomic Signal Processing: Predicting Basic Molecular Biological Principles

    Science.gov (United States)

    Alter, Orly

    2005-03-01

    Advances in high-throughput technologies enable acquisition of different types of molecular biological data, monitoring the flow of biological information as DNA is transcribed to RNA, and RNA is translated to proteins, on a genomic scale. Future discovery in biology and medicine will come from the mathematical modeling of these data, which hold the key to fundamental understanding of life on the molecular level, as well as answers to questions regarding diagnosis, treatment and drug development. Recently we described data-driven models for genome-scale molecular biological data, which use singular value decomposition (SVD) and the comparative generalized SVD (GSVD). Now we describe an integrative data-driven model, which uses pseudoinverse projection (1). We also demonstrate the predictive power of these matrix algebra models (2). The integrative pseudoinverse projection model formulates any number of genome-scale molecular biological data sets in terms of one chosen set of data samples, or of profiles extracted mathematically from data samples, designated the "basis" set. The mathematical variables of this integrative model, the pseudoinverse correlation patterns that are uncovered in the data, represent independent processes and corresponding cellular states (such as observed genome-wide effects of known regulators or transcription factors, the biological components of the cellular machinery that generate the genomic signals, and measured samples in which these regulators or transcription factors are over- or underactive). Reconstruction of the data in the basis simulates experimental observation of only the cellular states manifest in the data that correspond to those of the basis. Classification of the data samples according to their reconstruction in the basis, rather than their overall measured profiles, maps the cellular states of the data onto those of the basis, and gives a global picture of the correlations and possibly also causal coordination of

  12. Cloud computing for comparative genomics.

    Science.gov (United States)

    Wall, Dennis P; Kudtarkar, Parul; Fusaro, Vincent A; Pivovarov, Rimma; Patil, Prasad; Tonellato, Peter J

    2010-05-18

    Large comparative genomics studies and tools are becoming increasingly compute-expensive as the number of available genome sequences continues to rise. The capacity and cost of local computing infrastructures are likely to become prohibitive with the increase, especially as the breadth of questions continues to rise. Alternative computing architectures, in particular cloud computing environments, may help alleviate this increasing pressure and enable fast, large-scale, and cost-effective comparative genomics strategies going forward. To test this, we redesigned a typical comparative genomics algorithm, the reciprocal smallest distance algorithm (RSD), to run within Amazon's Elastic Compute Cloud (EC2). We then employed the RSD-cloud for ortholog calculations across a wide selection of fully sequenced genomes. We ran more than 300,000 RSD-cloud processes within the EC2. These jobs were farmed simultaneously to 100 high-capacity compute nodes using Amazon Web Services Elastic MapReduce and included a wide mix of large and small genomes. The computation took just under 70 hours and cost a total of $6,302 USD. The effort to transform existing comparative genomics algorithms from local compute infrastructures is not trivial. However, the speed and flexibility of cloud computing environments provide a substantial boost at manageable cost. The procedure designed to transform the RSD algorithm into a cloud-ready application is readily adaptable to similar comparative genomics problems.
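
    The figures quoted in this abstract allow a rough unit-cost estimate. The sketch below is only back-of-the-envelope arithmetic: it assumes that all 100 nodes were billed for the full 70 hours, which the abstract does not state, and the variable and function names are illustrative.

```python
# Rough unit costs derived from the figures quoted in the abstract.
# Assumption (not stated in the abstract): all 100 nodes ran for the full 70 hours.
TOTAL_COST_USD = 6302
PROCESSES = 300_000   # "more than 300,000", so the per-process cost is an upper bound
NODES = 100
HOURS = 70            # "just under 70 hours"


def unit_costs(total_cost, processes, nodes, hours):
    """Return (cost per process, cost per node-hour) in USD."""
    node_hours = nodes * hours
    return total_cost / processes, total_cost / node_hours


per_process, per_node_hour = unit_costs(TOTAL_COST_USD, PROCESSES, NODES, HOURS)
print(f"~${per_process:.3f} per RSD process, ~${per_node_hour:.2f} per node-hour")
```

    Under these assumptions the run works out to about two cents per RSD process and roughly $0.90 per node-hour.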

  13. A multi-objective constraint-based approach for modeling genome-scale microbial ecosystems.

    Science.gov (United States)

    Budinich, Marko; Bourdon, Jérémie; Larhlimi, Abdelhalim; Eveillard, Damien

    2017-01-01

    Interplay within microbial communities impacts ecosystems on several scales, and elucidation of the consequent effects is a difficult task in ecology. In particular, the integration of genome-scale data within quantitative models of microbial ecosystems remains elusive. This study advocates the use of constraint-based modeling to build predictive models from recent high-resolution -omics datasets. Following recent studies that have demonstrated the accuracy of constraint-based models (CBMs) for simulating single-strain metabolic networks, we sought to study microbial ecosystems as a combination of single-strain metabolic networks that exchange nutrients. This study presents two multi-objective extensions of CBMs for modeling communities: multi-objective flux balance analysis (MO-FBA) and multi-objective flux variability analysis (MO-FVA). Both methods were applied to a hot spring mat model ecosystem. As a result, multiple trade-offs between nutrients and growth rates, as well as thermodynamically favorable relative abundances at community level, were emphasized. We expect this approach to be used for integrating genomic information in microbial ecosystems. Subsequent models will provide insights about behaviors (including diversity) that take place at the ecosystem scale.

  14. An Algorithm Creating Thumbnail for Web Map Services Based on Information Entropy and Trans-scale Similarity

    Directory of Open Access Journals (Sweden)

    CHENG Xiaoqiang

    2017-11-01

    Thumbnails can greatly increase the efficiency of browsing pictures, videos and other image resources, and markedly improve the user experience. A map service is a kind of graphic resource that couples spatial information with representation scale; its creation, retrieval and management do not function well without thumbnail support. Well-designed thumbnails give users a vivid first impression and help them explore efficiently. By contrast, coarse thumbnails provoke negative reactions and discourage users from exploring the map service. Inspired by video summarization, the concepts of key position and key scale of a web map service are proposed, and corresponding quantitative measures and an automatic algorithm are designed and implemented. The algorithm addresses the poor visual quality, lack of map information and low automation of current thumbnails. Information entropy is used to determine the areas richer in content, and trans-scale similarity is calculated to judge at which scale the appearance of the map service changes drastically; finally, a series of static pictures that represent the content of the map service is extracted. Experimental results show that the method produces medium-sized, content-rich and representative thumbnails that effectively reflect the content and appearance of the map service.
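
    The idea of using information entropy to find content-rich areas can be illustrated with a minimal sketch. The abstract does not give the paper's exact measure, so the code below simply computes the Shannon entropy of the pixel values in each candidate tile and picks the highest; the tile names and pixel data are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy (in bits) of a sequence of discrete pixel values."""
    counts = Counter(pixels)
    total = len(pixels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def richest_tile(tiles):
    """Return the name of the tile whose pixel values have the highest entropy."""
    return max(tiles, key=lambda name: shannon_entropy(tiles[name]))


# Hypothetical tiles: a uniform area versus one with four equally frequent values.
tiles = {
    "ocean": [0] * 16,
    "city": [0, 1, 2, 3] * 4,
}
print(richest_tile(tiles))
```

    A uniform tile has zero entropy, while a tile with four equally frequent values has two bits, so the second tile is selected as the richer one.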

  15. Large-scale development of SSR markers in tobacco and construction of a linkage map in flue-cured tobacco.

    Science.gov (United States)

    Tong, Zhijun; Xiao, Bingguang; Jiao, Fangchan; Fang, Dunhuang; Zeng, Jianmin; Wu, Xingfu; Chen, Xuejun; Yang, Jiankang; Li, Yongping

    2016-06-01

    Tobacco (Nicotiana tabacum L.), particularly flue-cured tobacco, is one of the most economically important nonfood crops and is also an important model system in plant biotechnology. Despite its importance, only limited molecular marker resources are available for genome analysis, genetic mapping, and breeding. Simple sequence repeats (SSRs) are among the most widely used molecular markers, with significant advantages: they are generally co-dominant, easy to use, abundant in eukaryotic organisms, and produce highly reproducible results. In this study, based on the genome sequence data of flue-cured tobacco (K326), we developed a total of 13,645 mostly novel SSR markers, which were functional in a set of eighteen tobacco varieties of four different types. A mapping population of 213 backcross (BC1) individuals, which were derived from an intra-type cross between two flue-cured tobacco varieties, Y3 and K326, was selected for mapping. Based on the newly developed SSR markers as well as published SSR markers, we constructed a genetic map consisting of 626 SSR loci distributed across 24 linkage groups and covering a total length of 1120.45 cM with an average distance of 1.79 cM between adjacent markers, which is the highest-density map of flue-cured tobacco to date.
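
    The reported marker spacing can be checked from the other figures in the abstract. The sketch below compares two common conventions: total length divided by the number of loci, and total length divided by the number of marker intervals (loci minus linkage groups). Which convention the authors used is not stated, so this is an assumption-laden check rather than their calculation.

```python
# Figures quoted in the abstract.
TOTAL_LENGTH_CM = 1120.45
LOCI = 626
LINKAGE_GROUPS = 24

# Convention 1: map length per locus.
per_locus = TOTAL_LENGTH_CM / LOCI

# Convention 2: map length per interval; each linkage group with n loci has n - 1 intervals.
per_interval = TOTAL_LENGTH_CM / (LOCI - LINKAGE_GROUPS)

print(f"per locus: {per_locus:.2f} cM, per interval: {per_interval:.2f} cM")
```

    The per-locus value rounds to the 1.79 cM given in the abstract, whereas the per-interval value is slightly larger (about 1.86 cM).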

  16. GeneRecon Users' Manual — A coalescent based tool for fine-scale association mapping

    DEFF Research Database (Denmark)

    Mailund, T

    2006-01-01

    GeneRecon is a software package for linkage disequilibrium mapping using coalescent theory. It is based on a Bayesian Markov-chain Monte Carlo (MCMC) method for fine-scale linkage-disequilibrium gene mapping using high-density marker maps. GeneRecon explicitly models the genealogy of a sample of th...

  17. Multi-scale coding of genomic information: From DNA sequence to genome structure and function

    International Nuclear Information System (INIS)

    Arneodo, Alain; Vaillant, Cedric; Audit, Benjamin; Argoul, Francoise; D'Aubenton-Carafa, Yves; Thermes, Claude

    2011-01-01

    Understanding how chromatin is spatially and dynamically organized in the nucleus of eukaryotic cells and how this affects genome functions is one of the main challenges of cell biology. Since the different orders of packaging in the hierarchical organization of DNA condition the accessibility of DNA sequence elements to trans-acting factors that control the transcription and replication processes, there is actually a wealth of structural and dynamical information to learn in the primary DNA sequence. In this review, we show that when using concepts, methodologies, numerical and experimental techniques coming from statistical mechanics and nonlinear physics combined with wavelet-based multi-scale signal processing, we are able to decipher the multi-scale sequence encoding of chromatin condensation-decondensation mechanisms that play a fundamental role in regulating many molecular processes involved in nuclear functions.

  18. Genome to Phenome Mapping in Apple Using Historical Data

    Directory of Open Access Journals (Sweden)

    Zoë Migicovsky

    2016-07-01

    Apple (Malus X domestica Borkh.) is one of the world’s most valuable fruit crops. Its large size and long juvenile phase make it a particularly promising candidate for marker-assisted selection (MAS). However, advances in MAS in apple have been limited by a lack of phenotype and genotype data from sufficiently large samples. To establish genotype-phenotype relationships and advance MAS in apple, we extracted over 24,000 phenotype scores from the USDA-Germplasm Resources Information Network (GRIN) database and linked them with over 8000 single nucleotide polymorphisms (SNPs) from 689 apple accessions from the USDA apple germplasm collection clonally preserved in Geneva, NY. We find significant genetic differentiation between Old World and New World cultivars and demonstrate that the genetic structure of the domesticated apple also reflects the time required for ripening. A genome-wide association study (GWAS) of 36 phenotypes confirms the association between fruit color and the MYB1 locus, and we also report a novel association between the transcription factor, NAC18.1, and harvest date and fruit firmness. We demonstrate that harvest time and fruit size can be predicted with relatively high accuracies (> 0.46) using genomic prediction. Rapid decay of linkage disequilibrium (LD) in apples means millions of SNPs may be required for well-powered GWAS. However, rapid LD decay also promises to enable extremely high resolution mapping of causal variants, which holds great potential for advancing MAS.

  19. Grass genomes

    OpenAIRE

    Bennetzen, Jeffrey L.; SanMiguel, Phillip; Chen, Mingsheng; Tikhonov, Alexander; Francki, Michael; Avramova, Zoya

    1998-01-01

    For the most part, studies of grass genome structure have been limited to the generation of whole-genome genetic maps or the fine structure and sequence analysis of single genes or gene clusters. We have investigated large contiguous segments of the genomes of maize, sorghum, and rice, primarily focusing on intergenic spaces. Our data indicate that much (>50%) of the maize genome is composed of interspersed repetitive DNAs, primarily nested retrotransposons that in...

  20. SSR-enriched genetic linkage maps of bermudagrass (Cynodon dactylon × transvaalensis), and their comparison with allied plant genomes.

    Science.gov (United States)

    Khanal, Sameer; Kim, Changsoo; Auckland, Susan A; Rainville, Lisa K; Adhikari, Jeevan; Schwartz, Brian M; Paterson, Andrew H

    2017-04-01

    We report SSR-enriched genetic maps of bermudagrass that: (1) reveal partial residual polysomic inheritance in the tetraploid species, and (2) provide insights into the evolution of chloridoid genomes. This study describes genetic linkage maps of two bermudagrass species, Cynodon dactylon (T89) and Cynodon transvaalensis (T574), that integrate heterologous microsatellite markers from sugarcane into frameworks built with single-dose restriction fragments (SDRFs). A maximum likelihood approach was used to construct two separate parental maps from a population of 110 F1 progeny of a cross between the two parents. The T89 map is based on 291 loci on 34 cosegregating groups (CGs), with an average marker spacing of 12.5 cM. The T574 map is based on 125 loci on 14 CGs, with an average marker spacing of 10.7 cM. Six T89 and one T574 CG(s) deviated from disomic inheritance. Furthermore, marker segregation data and linkage phase analysis revealed partial residual polysomic inheritance in T89, suggesting that common bermudagrass is undergoing diploidization following whole genome duplication (WGD). Twenty-six T89 CGs were coalesced into 9 homo(eo)logous linkage groups (LGs), while 12 T574 CGs were assembled into 9 LGs, both putatively representing the basic chromosome complement (x = 9) of the species. Eight T89 and two T574 CGs remain unassigned. The marker composition of bermudagrass ancestral chromosomes was inferred by aligning T89 and T574 homologs, and used in comparisons to sorghum and rice genome sequences based on 108 and 91 significant BLAST hits, respectively. Two nested chromosome fusions (NCFs) shared by two other chloridoids (i.e., zoysiagrass and finger millet) and at least three independent translocation events were evident during chromosome number reduction from 14 in the polyploid common ancestor of Poaceae to 9 in Cynodon.

  1. Genic non-coding microsatellites in the rice genome: characterization, marker design and use in assessing genetic and evolutionary relationships among domesticated groups

    Directory of Open Access Journals (Sweden)

    Singh Nagendra

    2009-03-01

    Background: Completely sequenced plant genomes provide scope for designing a large number of microsatellite markers, which are useful in various aspects of crop breeding and genetic analysis. With the objective of developing genic but non-coding microsatellite (GNMS) markers for the rice (Oryza sativa L.) genome, we characterized the frequency and relative distribution of microsatellite repeat-motifs in 18,935 predicted protein coding genes including 14,308 putative promoter sequences. Results: We identified 19,555 perfect GNMS repeats with densities ranging from 306.7/Mb in chromosome 1 to 450/Mb in chromosome 12 with an average of 357.5 GNMS per Mb. The average microsatellite density was maximum in the 5' untranslated regions (UTRs), followed by those in introns, promoters, 3'UTRs and minimum in the coding sequences (CDS). Primers were designed for 17,966 (92%) GNMS repeats, including 4,288 (94%) hypervariable class I types, which were bin-mapped on the rice genome. The GNMS markers were most polymorphic in the intronic region (73.3%), followed by markers in the promoter region (53.3%), and least in the CDS (26.6%). The robust polymerase chain reaction (PCR) amplification efficiency and high polymorphic potential of GNMS markers over genic coding and random genomic microsatellite markers suggest their immediate use in efficient genotyping applications in rice. A set of these markers could assess genetic diversity and establish phylogenetic relationships among domesticated rice cultivar groups. We also demonstrated the usefulness of orthologous and paralogous conserved non-coding microsatellite (CNMS) markers, identified in the putative rice promoter sequences, for comparative physical mapping and understanding of evolutionary and gene regulatory complexities among rice and other members of the grass family. The divergence between long-grained aromatics and subspecies japonica was estimated to be more recent (0.004 Mya) compared to short

  2. Construction of high resolution genetic linkage maps to improve the soybean genome sequence assembly Glyma1.01

    Science.gov (United States)

    A landmark in soybean research, Glyma1.01, the first whole genome sequence of variety Williams 82 (Glycine max L. Merr.) was completed in 2010 and is widely used. However, because the assembly was primarily built based on the linkage maps constructed with a limited number of markers and recombinant...

  3. BAUM: Improving genome assembly by adaptive unique mapping and local overlap-layout-consensus approach.

    Science.gov (United States)

    Wang, Anqi; Wang, Zhanyu; Li, Zheng; Li, Lei M

    2018-01-15

    It is highly desirable to assemble genomes of high continuity and consistency at low cost. The current bottleneck of draft genome continuity using the Second Generation Sequencing (SGS) reads is primarily caused by uncertainty among repetitive sequences. Even though the Single-Molecule Real-Time sequencing technology is very promising to overcome the uncertainty issue, its relatively high cost and error rate add burden on budget or computation. Many long-read assemblers take the overlap-layout-consensus (OLC) paradigm, which is less sensitive to sequencing errors, heterozygosity and variability of coverage. However, current assemblers of SGS data do not sufficiently take advantage of the OLC approach. Aiming at minimizing uncertainty, the proposed method, BAUM, breaks the whole genome into regions by adaptive unique mapping; then the local OLC is used to assemble each region in parallel. BAUM can: (1) perform reference-assisted assembly based on the genome of a close species; or (2) improve the results of existing assemblies that are obtained based on short or long sequencing reads. The tests on two eukaryote genomes, a wild rice Oryza longistaminata and a parrot Melopsittacus undulatus, show that BAUM achieved substantial improvement on genome size and continuity. Besides, BAUM reconstructed a considerable amount of repetitive regions that failed to be assembled by existing short read assemblers. We also propose statistical approaches to control the uncertainty in different steps of BAUM. Availability: http://www.zhanyuwang.xin/wordpress/index.php/2017/07/21/baum. Contact: lilei@amss.ac.cn. Supplementary data are available at Bioinformatics online. © The Author (2018). Published by Oxford University Press.

  4. Promoter characterization and genomic organization of the human X11β gene APBA2.

    LENUS (Irish Health Repository)

    Hao, Yan

    2012-02-15

    Overexpression of neuronal adaptor protein X11β has been shown to decrease the production of amyloid-β, a toxic peptide deposited in Alzheimer's disease brains. Therefore, manipulation of the X11β level may represent a potential therapeutic strategy for Alzheimer's disease. As X11β expression can be regulated at the transcription level, we determined the genomic organization and the promoter of the human X11β gene, amyloid β A4 precursor protein-binding family A member 2 (APBA2). By RNA ligase-mediated rapid amplification of cDNA ends, a single APBA2 transcription start site and the complete sequence of exon 1 were identified. The APBA2 promoter was located upstream of exon 1 and was more active in neurons. The core promoter contains several CpG dinucleotides, and was strongly suppressed by DNA methylation. In addition, mutagenesis analysis revealed a putative Pax5-binding site within the promoter. Together, APBA2 contains a potent neuronal promoter whose activity may be regulated by DNA methylation and Pax5.

  5. A map of recent positive selection in the human genome.

    Directory of Open Access Journals (Sweden)

    Benjamin F Voight

    2006-03-01

    The identification of signals of very recent positive selection provides information about the adaptation of modern humans to local conditions. We report here on a genome-wide scan for signals of very recent positive selection in favor of variants that have not yet reached fixation. We describe a new analytical method for scanning single nucleotide polymorphism (SNP) data for signals of recent selection, and apply this to data from the International HapMap Project. In all three continental groups we find widespread signals of recent positive selection. Most signals are region-specific, though a significant excess are shared across groups. Contrary to some earlier low resolution studies that suggested a paucity of recent selection in sub-Saharan Africans, we find that by some measures our strongest signals of selection are from the Yoruba population. Finally, since these signals indicate the existence of genetic variants that have substantially different fitnesses, they must indicate loci that are the source of significant phenotypic variation. Though the relevant phenotypes are generally not known, such loci should be of particular interest in mapping studies of complex traits. For this purpose we have developed a set of SNPs that can be used to tag the strongest approximately 250 signals of recent selection in each population.

  6. Complete genome analysis of Serratia marcescens RSC-14: A plant growth-promoting bacterium that alleviates cadmium stress in host plants

    Science.gov (United States)

    Khan, Abdur Rahim; Park, Gun-Seok; Asaf, Sajjad; Hong, Sung-Jun; Jung, Byung Kwon

    2017-01-01

    Serratia marcescens RSC-14 is a Gram-negative bacterium that was previously isolated from the surface-sterilized roots of the Cd-hyperaccumulator Solanum nigrum. The strain stimulates plant growth and alleviates Cd stress in host plants. To investigate the genetic basis for these traits, the complete genome of RSC-14 was obtained by single-molecule real-time sequencing. The genome of S. marcescens RSC-14 comprised a 5.12-Mbp-long circular chromosome containing 4,593 predicted protein-coding genes, 22 rRNA genes, 88 tRNA genes, and 41 pseudogenes. It contained genes with potential functions in plant growth promotion, including genes involved in indole-3-acetic acid (IAA) biosynthesis, acetoin synthesis, and phosphate solubilization. Moreover, annotation using NCBI and Rapid Annotation using Subsystem Technology identified several genes that encode antioxidant enzymes as well as genes involved in antioxidant production, supporting the observed resistance towards heavy metals, such as Cd. The presence of IAA pathway-related genes and oxidative stress-responsive enzyme genes may explain the plant growth-promoting potential and Cd tolerance, respectively. This is the first report of a complete genome sequence of Cd-tolerant S. marcescens and its plant growth promotion pathway. The whole-genome analysis of this strain clarified the genetic basis underlying its phenotypic and biochemical characteristics, underpinning the beneficial interactions between RSC-14 and plants. PMID:28187139

  7. A human genome-wide library of local phylogeny predictions for whole-genome inference problems

    Directory of Open Access Journals (Sweden)

    Schwartz Russell

    2008-08-01

    Background: Many common inference problems in computational genetics depend on inferring aspects of the evolutionary history of a data set given a set of observed modern sequences. Detailed predictions of the full phylogenies are therefore of value in improving our ability to make further inferences about population history and sources of genetic variation. Making phylogenetic predictions on the scale needed for whole-genome analysis is, however, extremely computationally demanding. Results: In order to facilitate phylogeny-based predictions on a genomic scale, we develop a library of maximum parsimony phylogenies within local regions spanning all autosomal human chromosomes based on Haplotype Map variation data. We demonstrate the utility of this library for population genetic inferences by examining a tree statistic we call 'imperfection,' which measures the reuse of variant sites within a phylogeny. This statistic is significantly predictive of recombination rate, shows additional regional and population-specific conservation, and allows us to identify outlier genes likely to have experienced unusual amounts of variation in recent human history. Conclusion: Recent theoretical advances in algorithms for phylogenetic tree reconstruction have made it possible to perform large-scale inferences of local maximum parsimony phylogenies from single nucleotide polymorphism (SNP) data. As results from the imperfection statistic demonstrate, phylogeny predictions encode substantial information useful for detecting genomic features and population history. This data set should serve as a platform for many kinds of inferences one may wish to make about human population history and genetic variation.
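
    To make the parsimony criterion concrete, the sketch below scores a single variant site on a fixed rooted binary tree using Fitch's small-parsimony algorithm. It is not the authors' tree-search method, and the abstract does not define the 'imperfection' statistic precisely; the example only shows how a biallelic site that needs more than one change on a tree indicates that the site has been reused. The tree, haplotype names and alleles are hypothetical.

```python
def fitch_score(tree, states):
    """Minimum number of state changes for one site on a rooted binary tree.

    tree:   nested 2-tuples whose leaves are haplotype names (strings)
    states: dict mapping each haplotype name to its allele at the site
    """
    changes = 0

    def visit(node):
        nonlocal changes
        if isinstance(node, str):          # leaf: its state set is the observed allele
            return {states[node]}
        left, right = (visit(child) for child in node)
        common = left & right
        if common:                         # children agree: no change needed here
            return common
        changes += 1                       # children disagree: one change on this split
        return left | right

    visit(tree)
    return changes


# Hypothetical four-haplotype tree and two biallelic sites.
tree = (("h1", "h2"), ("h3", "h4"))
compatible_site = {"h1": "A", "h2": "A", "h3": "G", "h4": "G"}
reused_site = {"h1": "A", "h2": "G", "h3": "A", "h4": "G"}

print(fitch_score(tree, compatible_site), fitch_score(tree, reused_site))
```

    The first site fits the tree with a single change, while the second needs two changes on the same tree, which is the kind of site reuse the abstract's statistic is described as measuring.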

  8. Analysis of Genome-Scale Data

    NARCIS (Netherlands)

    Kemmeren, P.P.C.W.

    2005-01-01

    The genetic material of every cell in an organism is stored inside DNA in the form of genes, which together form the genome. The information stored in the DNA is translated to RNA and subsequently to proteins, which form complex biological systems. The availability of whole genome sequences has

  9. Chromosome mapping of dragline silk genes in the genomes of widow spiders (Araneae, Theridiidae).

    Directory of Open Access Journals (Sweden)

    Yonghui Zhao

    With its incredible strength and toughness, spider dragline silk is widely lauded for its impressive material properties. Dragline silk is composed of two structural proteins, MaSp1 and MaSp2, which are encoded by members of the spidroin gene family. While previous studies have characterized the genes that encode the constituent proteins of spider silks, nothing is known about the physical location of these genes. We determined karyotypes and sex chromosome organization for the widow spiders, Latrodectus hesperus and L. geometricus (Araneae, Theridiidae). We then used fluorescence in situ hybridization to map the genomic locations of the genes for the silk proteins that compose the remarkable spider dragline. These genes included three loci for the MaSp1 protein and the single locus for the MaSp2 protein. In addition, we mapped a MaSp1 pseudogene. All the MaSp1 gene copies and pseudogene localized to a single chromosomal region while MaSp2 was located on a different chromosome of L. hesperus. Using probes derived from L. hesperus, we comparatively mapped all three MaSp1 loci to a single region of a L. geometricus chromosome. As with L. hesperus, MaSp2 was found on a separate L. geometricus chromosome, thus again unlinked to the MaSp1 loci. These results indicate orthology of the corresponding chromosomal regions in the two widow genomes. Moreover, the occurrence of multiple MaSp1 loci in a conserved gene cluster across species suggests that MaSp1 proliferated by tandem duplication in a common ancestor of L. geometricus and L. hesperus. Unequal crossover events during recombination could have given rise to the gene copies and could also maintain sequence similarity among gene copies over time. Further comparative mapping with taxa of increasing divergence from Latrodectus will pinpoint when the MaSp1 duplication events occurred and the phylogenetic distribution of silk gene linkage patterns.

  10. Genome sequence and genetic diversity of the common carp, Cyprinus carpio.

    Science.gov (United States)

    Xu, Peng; Zhang, Xiaofeng; Wang, Xumin; Li, Jiongtang; Liu, Guiming; Kuang, Youyi; Xu, Jian; Zheng, Xianhu; Ren, Lufeng; Wang, Guoliang; Zhang, Yan; Huo, Linhe; Zhao, Zixia; Cao, Dingchen; Lu, Cuiyun; Li, Chao; Zhou, Yi; Liu, Zhanjiang; Fan, Zhonghua; Shan, Guangle; Li, Xingang; Wu, Shuangxiu; Song, Lipu; Hou, Guangyuan; Jiang, Yanliang; Jeney, Zsigmond; Yu, Dan; Wang, Li; Shao, Changjun; Song, Lai; Sun, Jing; Ji, Peifeng; Wang, Jian; Li, Qiang; Xu, Liming; Sun, Fanyue; Feng, Jianxin; Wang, Chenghui; Wang, Shaolin; Wang, Baosen; Li, Yan; Zhu, Yaping; Xue, Wei; Zhao, Lan; Wang, Jintu; Gu, Ying; Lv, Weihua; Wu, Kejing; Xiao, Jingfa; Wu, Jiayan; Zhang, Zhang; Yu, Jun; Sun, Xiaowen

    2014-11-01

    The common carp, Cyprinus carpio, is one of the most important cyprinid species and globally accounts for 10% of freshwater aquaculture production. Here we present a draft genome of domesticated C. carpio (strain Songpu), whose current assembly contains 52,610 protein-coding genes and approximately 92.3% coverage of its paleotetraploidized genome (2n = 100). The latest round of whole-genome duplication has been estimated to have occurred approximately 8.2 million years ago. Genome resequencing of 33 representative individuals from worldwide populations demonstrates a single origin for C. carpio in 2 subspecies (C. carpio haematopterus and C. carpio carpio). Integrative genomic and transcriptomic analyses were used to identify loci potentially associated with traits including scaling patterns and skin color. In combination with the high-resolution genetic map, the draft genome paves the way for better molecular studies and improved genome-assisted breeding of C. carpio and other closely related species.

  11. Systematic Dissection of Sequence Elements Controlling σ70 Promoters Using a Genomically-Encoded Multiplexed Reporter Assay in E. coli.

    Science.gov (United States)

    Urtecho, Guillaume; Tripp, Arielle D; Insigne, Kimberly; Kim, Hwangbeom; Kosuri, Sriram

    2018-02-01

    Promoters are the key drivers of gene expression and are largely responsible for the regulation of cellular responses to time and environment. In E. coli, decades of studies have revealed most, if not all, of the sequence elements necessary to encode promoter function. Despite our knowledge of these motifs, it is still not possible to predict the strength and regulation of a promoter from primary sequence alone. Here we develop a novel multiplexed assay to study promoter function in E. coli by building a site-specific genomic recombination-mediated cassette exchange (RMCE) system that allows for the facile construction and testing of large libraries of genetic designs integrated into precise genomic locations. We build and test a library of 10,898 σ70 promoter variants consisting of all combinations of a set of eight -35 elements, eight -10 elements, three UP elements, eight spacers, and eight backgrounds. We find that the -35 and -10 sequence elements can explain approximately 74% of the variance in promoter strength within our dataset using a simple log-linear statistical model. Neural network models can explain greater than 95% of the variance in our dataset, and show the increased power is due to nonlinear interactions of other elements such as the spacer, background, and UP elements.
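
    The combinatorial design described in this abstract can be enumerated directly. The sketch below uses placeholder labels, since the actual element sequences are not given here, and counts the full factorial library implied by the stated element numbers.

```python
from itertools import product

# Placeholder labels only; the real element sequences are not given in the abstract.
minus35 = [f"m35_{i}" for i in range(8)]
minus10 = [f"m10_{i}" for i in range(8)]
up_elements = [f"up_{i}" for i in range(3)]
spacers = [f"spacer_{i}" for i in range(8)]
backgrounds = [f"bg_{i}" for i in range(8)]

# Every combination of one element from each category.
library = list(product(minus35, minus10, up_elements, spacers, backgrounds))

REPORTED_VARIANTS = 10_898  # figure quoted in the abstract
print(len(library), f"{REPORTED_VARIANTS / len(library):.1%}")
```

    The full factorial design contains 8 x 8 x 3 x 8 x 8 = 12,288 combinations, so the 10,898 variants reported correspond to roughly 89% of that space; the abstract does not explain the difference.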

  12. Tomato genome mapping by fluorescence in situ hybridisation = Kartering van het tomatengenoom met behulp van fluorescentie in situ hybridisatie

    NARCIS (Netherlands)

    Zhong, X.B.

    1998-01-01

    The general introduction reviews the progress in tomato genome mapping using classical genetics, cytogenetics, and molecular genetics, emphasising the great potential of fluorescence in situ hybridisation (FISH) techniques.

    Chapter 2 describes how to

  13. Discovery and fine-mapping of adiposity loci using high density imputation of genome-wide association studies in individuals of African ancestry: African Ancestry Anthropometry Genetics Consortium.

    Science.gov (United States)

    Ng, Maggie C Y; Graff, Mariaelisa; Lu, Yingchang; Justice, Anne E; Mudgal, Poorva; Liu, Ching-Ti; Young, Kristin; Yanek, Lisa R; Feitosa, Mary F; Wojczynski, Mary K; Rand, Kristin; Brody, Jennifer A; Cade, Brian E; Dimitrov, Latchezar; Duan, Qing; Guo, Xiuqing; Lange, Leslie A; Nalls, Michael A; Okut, Hayrettin; Tajuddin, Salman M; Tayo, Bamidele O; Vedantam, Sailaja; Bradfield, Jonathan P; Chen, Guanjie; Chen, Wei-Min; Chesi, Alessandra; Irvin, Marguerite R; Padhukasahasram, Badri; Smith, Jennifer A; Zheng, Wei; Allison, Matthew A; Ambrosone, Christine B; Bandera, Elisa V; Bartz, Traci M; Berndt, Sonja I; Bernstein, Leslie; Blot, William J; Bottinger, Erwin P; Carpten, John; Chanock, Stephen J; Chen, Yii-Der Ida; Conti, David V; Cooper, Richard S; Fornage, Myriam; Freedman, Barry I; Garcia, Melissa; Goodman, Phyllis J; Hsu, Yu-Han H; Hu, Jennifer; Huff, Chad D; Ingles, Sue A; John, Esther M; Kittles, Rick; Klein, Eric; Li, Jin; McKnight, Barbara; Nayak, Uma; Nemesure, Barbara; Ogunniyi, Adesola; Olshan, Andrew; Press, Michael F; Rohde, Rebecca; Rybicki, Benjamin A; Salako, Babatunde; Sanderson, Maureen; Shao, Yaming; Siscovick, David S; Stanford, Janet L; Stevens, Victoria L; Stram, Alex; Strom, Sara S; Vaidya, Dhananjay; Witte, John S; Yao, Jie; Zhu, Xiaofeng; Ziegler, Regina G; Zonderman, Alan B; Adeyemo, Adebowale; Ambs, Stefan; Cushman, Mary; Faul, Jessica D; Hakonarson, Hakon; Levin, Albert M; Nathanson, Katherine L; Ware, Erin B; Weir, David R; Zhao, Wei; Zhi, Degui; Arnett, Donna K; Grant, Struan F A; Kardia, Sharon L R; Oloapde, Olufunmilayo I; Rao, D C; Rotimi, Charles N; Sale, Michele M; Williams, L Keoki; Zemel, Babette S; Becker, Diane M; Borecki, Ingrid B; Evans, Michele K; Harris, Tamara B; Hirschhorn, Joel N; Li, Yun; Patel, Sanjay R; Psaty, Bruce M; Rotter, Jerome I; Wilson, James G; Bowden, Donald W; Cupples, L Adrienne; Haiman, Christopher A; Loos, Ruth J F; North, Kari E

    2017-04-01

    Genome-wide association studies (GWAS) have identified >300 loci associated with measures of adiposity including body mass index (BMI) and waist-to-hip ratio (adjusted for BMI, WHRadjBMI), but few have been identified through screening of the African ancestry genomes. We performed large scale meta-analyses and replications in up to 52,895 individuals for BMI and up to 23,095 individuals for WHRadjBMI from the African Ancestry Anthropometry Genetics Consortium (AAAGC) using 1000 Genomes phase 1 imputed GWAS to improve coverage of both common and low frequency variants in the low linkage disequilibrium African ancestry genomes. In the sex-combined analyses, we identified one novel locus (TCF7L2/HABP2) for WHRadjBMI and eight previously established loci at P African ancestry individuals. An additional novel locus (SPRYD7/DLEU2) was identified for WHRadjBMI when combined with European GWAS. In the sex-stratified analyses, we identified three novel loci for BMI (INTS10/LPL and MLC1 in men, IRX4/IRX2 in women) and four for WHRadjBMI (SSX2IP, CASC8, PDE3B and ZDHHC1/HSD11B2 in women) in individuals of African ancestry or both African and European ancestry. For four of the novel variants, the minor allele frequency was low (African ancestry sex-combined and sex-stratified analyses, 26 BMI loci and 17 WHRadjBMI loci contained ≤ 20 variants in the credible sets that jointly account for 99% posterior probability of driving the associations. The lead variants in 13 of these loci had a high probability of being causal. As compared to our previous HapMap imputed GWAS for BMI and WHRadjBMI including up to 71,412 and 27,350 African ancestry individuals, respectively, our results suggest that 1000 Genomes imputation showed modest improvement in identifying GWAS loci including low frequency variants. Trans-ethnic meta-analyses further improved fine mapping of putative causal variants in loci shared between the African and European ancestry populations.
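
    The 99% credible sets mentioned above can be illustrated with a minimal sketch. It assumes each variant at a locus already has a posterior probability of driving the association; the abstract does not say how those probabilities were computed, and the variant names and values below are hypothetical.

```python
def credible_set(posteriors, coverage=0.99):
    """Return the smallest set of variants whose posteriors sum to >= coverage."""
    # Rank variants from most to least probable.
    ranked = sorted(posteriors.items(), key=lambda kv: kv[1], reverse=True)
    chosen, total = [], 0.0
    for variant, prob in ranked:
        chosen.append(variant)
        total += prob
        if total >= coverage:
            break
    return chosen, total


# Hypothetical posterior probabilities for four variants at one locus.
example = {"rs_a": 0.90, "rs_b": 0.06, "rs_c": 0.035, "rs_d": 0.005}
variants, mass = credible_set(example)
print(variants, round(mass, 3))
```

    A locus with few variants in its credible set, as reported for the 26 BMI and 17 WHRadjBMI loci, is one where the posterior mass is concentrated on a small number of candidates.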

  14. The Perennial Ryegrass GenomeZipper: Targeted Use of Genome Resources for Comparative Grass Genomics

    Science.gov (United States)

    Pfeifer, Matthias; Martis, Mihaela; Asp, Torben; Mayer, Klaus F.X.; Lübberstedt, Thomas; Byrne, Stephen; Frei, Ursula; Studer, Bruno

    2013-01-01

    Whole-genome sequences established for model and major crop species constitute a key resource for advanced genomic research. For outbreeding forage and turf grass species like ryegrasses (Lolium spp.), such resources have yet to be developed. Here, we present a model of the perennial ryegrass (Lolium perenne) genome on the basis of conserved synteny to barley (Hordeum vulgare) and the model grass genome Brachypodium (Brachypodium distachyon) as well as rice (Oryza sativa) and sorghum (Sorghum bicolor). A transcriptome-based genetic linkage map of perennial ryegrass served as a scaffold to establish the chromosomal arrangement of syntenic genes from model grass species. This scaffold revealed a high degree of synteny and macrocollinearity and was then utilized to anchor a collection of perennial ryegrass genes in silico to their predicted genome positions. This resulted in the unambiguous assignment of 3,315 out of 8,876 previously unmapped genes to the respective chromosomes. In total, the GenomeZipper incorporates 4,035 conserved grass gene loci, which were used for the first genome-wide sequence divergence analysis between perennial ryegrass, barley, Brachypodium, rice, and sorghum. The perennial ryegrass GenomeZipper is an ordered, information-rich genome scaffold, facilitating map-based cloning and genome assembly in perennial ryegrass and closely related Poaceae species. It also represents a milestone in describing synteny between perennial ryegrass and fully sequenced model grass genomes, thereby increasing our understanding of genome organization and evolution in the most important temperate forage and turf grass species. PMID:23184232

  15. Genome-wide development and use of microsatellite markers for large-scale genotyping applications in foxtail millet [Setaria italica (L.)].

    Science.gov (United States)

    Pandey, Garima; Misra, Gopal; Kumari, Kajal; Gupta, Sarika; Parida, Swarup Kumar; Chattopadhyay, Debasis; Prasad, Manoj

    2013-04-01

    The availability of well-validated, informative co-dominant microsatellite markers and a saturated genetic linkage map has been limited in foxtail millet (Setaria italica L.). In view of this, we conducted a genome-wide analysis and identified 28 342 microsatellite repeat-motifs spanning 405.3 Mb of the foxtail millet genome. Trinucleotide repeats (∼48%) were prevalent when compared with dinucleotide repeats (∼46%). Of the 28 342 microsatellites, 21 294 (∼75%) primer pairs were successfully designed, and a total of 15 573 markers were physically mapped on 9 chromosomes of foxtail millet. About 159 markers were validated successfully in 8 accessions of Setaria sp. with ∼67% polymorphic potential. The high percentage (89.3%) of cross-genera transferability across millet and non-millet species, with higher transferability percentage in bioenergy grasses (∼79%, Switchgrass and ∼93%, Pearl millet), signifies their importance in studying the bioenergy grasses. In silico comparative mapping of 15 573 foxtail millet microsatellite markers against the mapping data of sorghum (16.9%), maize (14.5%) and rice (6.4%) indicated syntenic relationships among the chromosomes of foxtail millet and target species. The results thus demonstrate the immense applicability of the developed microsatellite markers in germplasm characterization, phylogenetics, construction of genetic linkage maps for gene/quantitative trait loci discovery, and comparative mapping in foxtail millet as well as other millets and bioenergy grass species.
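
    As a rough illustration of what a genome-wide scan for microsatellite repeat-motifs involves, the sketch below finds di- and trinucleotide tandem repeats with regular expressions. The minimum repeat counts are assumptions chosen for the example, not the thresholds used in the study, and the function name is hypothetical.

```python
import re

# Assumed minimum number of tandem copies per motif length (illustrative only).
MIN_REPEATS = {2: 6, 3: 5}


def find_microsatellites(seq):
    """Return (start, motif, copies) for di- and trinucleotide tandem repeats."""
    seq = seq.upper()
    hits = []
    for k, min_rep in MIN_REPEATS.items():
        # A motif of length k followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if len(set(motif)) == 1:
                continue  # skip homopolymer runs such as AAAAAA
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


demo = "GGC" + "AT" * 7 + "CCGT" + "GAA" * 5 + "TTC"
print(find_microsatellites(demo))
```

    Dedicated SSR-mining tools handle longer motifs, compound repeats and imperfect repeats, which this sketch ignores.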

  16. Genome-based microbial ecology of anammox granules in a full-scale wastewater treatment system.

    Science.gov (United States)

    Speth, Daan R; In 't Zandt, Michiel H; Guerrero-Cruz, Simon; Dutilh, Bas E; Jetten, Mike S M

    2016-03-31

    Partial-nitritation anammox (PNA) is a novel wastewater treatment procedure for energy-efficient ammonium removal. Here we use genome-resolved metagenomics to build a genome-based ecological model of the microbial community in a full-scale PNA reactor. Sludge from the bioreactor examined here is used to seed reactors in wastewater treatment plants around the world; however, the role of most of its microbial community in ammonium removal remains unknown. Our analysis yielded 23 near-complete draft genomes that together represent the majority of the microbial community. We assign these genomes to distinct anaerobic and aerobic microbial communities. In the aerobic community, nitrifying organisms and heterotrophs predominate. In the anaerobic community, widespread potential for partial denitrification suggests a nitrite loop increases treatment efficiency. Of our genomes, 19 have no previously cultivated or sequenced close relatives and six belong to bacterial phyla without any cultivated members, including the most complete Omnitrophica (formerly OP3) genome to date.

  17. Genomic Analysis and Isolation of RNA Polymerase II Dependent Promoters from Spodoptera frugiperda.

    Science.gov (United States)

    Bleckmann, Maren; Fritz, Markus H-Y; Bhuju, Sabin; Jarek, Michael; Schürig, Margitta; Geffers, Robert; Benes, Vladimir; Besir, Hüseyin; van den Heuvel, Joop

    2015-01-01

    The Baculoviral Expression Vector System (BEVS) is the most commonly used method for high expression of recombinant protein in insect cells. Nevertheless, expression of some target proteins, especially those entering the secretory pathway, poses a severe challenge for baculovirus-infected insect cells, due to the reorganisation of intracellular compounds upon viral infection. Therefore, alternative strategies for recombinant protein production in insect cells, like transient plasmid-based expression or stable expression cell lines, are becoming more popular. However, the major bottleneck of these systems is the lack of strong endogenous polymerase II dependent promoters, as the strong baculoviral p10 and polH promoters used in BEVS are only functional in the presence of the viral transcription machinery during the late phase of infection. In this work we present a draft genome and a transcriptome analysis of Sf21 cells for the identification of the first known endogenous Spodoptera frugiperda promoters. Putative promoter sequences were identified and selected because of high mRNA level or by analogy to strong promoters in other eukaryotic organisms. The chosen endogenous Sf21 promoters were compared to early viral promoters for their efficiency in triggering eGFP expression using transient plasmid-based transfection in a BioLector Microfermentation system. Furthermore, promoter activity was shown not only in Sf21 cells but also in Hi5 cells. The novel endogenous Sf21 promoters were ranked according to their activity and expand the small pool of available promoters for stable insect cell line development and transient plasmid expression in insect cells. The best promoter was used to improve plasmid-based transient transfection in insect cells substantially.

  18. Comparative genomics and association mapping approaches for blast resistant genes in finger millet using SSRs.

    Directory of Open Access Journals (Sweden)

    B Kalyana Babu

    The major limiting factor for production and productivity of the finger millet crop is blast disease caused by Magnaporthe grisea. Since the genome sequence information available for finger millet is scarce, comparative genomics plays a very important role in the identification of genes/QTLs linked to the blast resistance genes using SSR markers. In the present study, a total of 58 genic SSRs were developed for use in genetic analysis of a global collection of 190 finger millet genotypes. The 58 SSRs yielded ninety-five scorable alleles, and the polymorphism information content varied from 0.186 to 0.677 with an average of 0.385. The gene diversity was in the range of 0.208 to 0.726 with an average of 0.487. Association mapping for blast resistance was done using 104 SSR markers, which identified four QTLs for finger blast and one QTL for neck blast resistance. The genomic marker RM262 and genic marker FMBLEST32 were linked to finger blast disease at a P value of 0.007 and explained phenotypic variance (R²) of 10% and 8% respectively. The genomic marker UGEP81 was associated with finger blast at a P value of 0.009 and explained 7.5% of R². The QTL for neck blast was associated with the genomic SSR marker UGEP18 at a P value of 0.01, which explained 11% of R². Three QTLs for blast resistance were found in common using both GLM and MLM approaches. The resistant alleles were found to be present mostly in the exotic genotypes. Among the genotypes of the NW Himalayan region of India, VHC3997, VHC3996 and VHC3930 were found to be highly resistant, and may be effectively used as parents for developing blast-resistant cultivars in the NW Himalayan region of India. The markers linked to the QTLs for blast resistance in the present study can be further used for cloning of the full-length gene, fine mapping and marker-assisted breeding programmes for introgression of blast-resistant alleles into locally adapted cultivars.
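
    The two marker statistics reported above are commonly computed from allele frequencies as gene diversity, H = 1 - Σ p_i², and polymorphism information content, PIC = 1 - Σ p_i² - Σ_{i<j} 2 p_i² p_j². The abstract does not state which formulas were used, so the sketch below assumes these standard definitions; the allele frequencies are hypothetical.

```python
from itertools import combinations


def gene_diversity(freqs):
    """Expected heterozygosity: 1 minus the sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """Polymorphism information content for one marker."""
    # Correction term summed over every unordered pair of alleles.
    cross = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return gene_diversity(freqs) - cross


# Hypothetical marker with two equally frequent alleles.
two_alleles = [0.5, 0.5]
print(gene_diversity(two_alleles), pic(two_alleles))
```

    Under these definitions PIC never exceeds gene diversity for the same marker, which is consistent with the reported averages of 0.385 and 0.487.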

  19. A BAC/BIBAC-based physical map of chickpea, Cicer arietinum L

    Directory of Open Access Journals (Sweden)

    Abbo Shahal

    2010-09-01

    Background: Chickpea (Cicer arietinum L.) is the third most important pulse crop worldwide. Despite its importance, relatively little is known about its genome. The availability of a genome-wide physical map allows rapid fine mapping of QTL, development of high-density genome maps, and sequencing of the entire genome. However, no such physical map has been developed in chickpea. Results: We present a genome-wide, BAC/BIBAC-based physical map of chickpea developed by fingerprint analysis. Four chickpea BAC and BIBAC libraries, two of which were constructed in this study, were used. A total of 67,584 clones were fingerprinted, and 64,211 (~11.7×) of the fingerprints were validated and used in the physical map assembly. The physical map consists of 1,945 BAC/BIBAC contigs, with each containing an average of 28.3 clones and having an average physical length of 559 kb. The contigs collectively span approximately 1,088 Mb. By using the physical map, we identified the BAC/BIBAC contigs containing or closely linked to QTL4.1 for resistance to Didymella rabiei (RDR) and QTL8 for days to first flower (DTF), thus further verifying the physical map and confirming its utility in fine mapping and cloning of QTL. Conclusion: The physical map represents the first genome-wide, BAC/BIBAC-based physical map of chickpea. This map, along with other genomic resources previously developed in the species and the genome sequences of related species (soybean, Medicago and Lotus), will provide a foundation necessary for many areas of advanced genomics research in chickpea and other legume species. The inclusion of transformation-ready BIBACs in the map greatly facilitates its utility in functional analysis of the legume genomes.

  20. Noise pollution mapping approach and accuracy on landscape scales.

    Science.gov (United States)

    Iglesias Merchan, Carlos; Diaz-Balteiro, Luis

    2013-04-01

    Noise mapping allows the characterization of environmental variables, such as noise pollution or soundscape, depending on the task. Strategic noise mapping (as per Directive 2002/49/EC, 2002) is a tool intended for the assessment of noise pollution at the European level every five years. These maps are based on common methods and procedures intended for human exposure assessment in the European Union that could also be adapted for assessing environmental noise pollution in natural parks. However, given the size of such areas, there could be an alternative approach to soundscape characterization rather than using human noise exposure procedures. It is possible to optimize the size of the mapping grid used for such work by taking into account the attributes of the area to be studied and the desired outcome. This would then optimize the mapping time and the cost. This type of optimization is important in noise assessment as well as in the study of other environmental variables. This study compares 15 models, using different grid sizes, to assess the accuracy of the noise mapping of the road traffic noise at a landscape scale, with respect to noise and landscape indicators. In a study area located in the Manzanares High River Basin Regional Park in Spain, different accuracy levels (Kappa index values from 0.725 to 0.987) were obtained depending on the terrain and noise source properties. The time taken for the calculations and the noise mapping accuracy results reveal the potential for setting the map resolution in line with decision-makers' criteria and budget considerations. Copyright © 2013 Elsevier B.V. All rights reserved.
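
    The Kappa index values quoted above measure agreement between two classified maps beyond what chance would produce. A minimal sketch of Cohen's kappa follows, assuming the comparison is summarised as a square confusion matrix of cell counts (reference map classes in rows, test map classes in columns); the counts are hypothetical and the study's exact procedure is not given in the abstract.

```python
def cohens_kappa(matrix):
    """Cohen's kappa from a square confusion matrix of counts."""
    total = sum(sum(row) for row in matrix)
    # Observed agreement: share of cells on the diagonal.
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (total * total)
    return (observed - expected) / (1 - expected)


# Hypothetical two-class comparison of a coarse grid against a fine reference grid.
counts = [[45, 5],
          [10, 40]]
print(round(cohens_kappa(counts), 3))
```

    A value of 1 means the two maps classify every cell identically; values near 0 mean agreement no better than chance.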

  1. NeatMap--non-clustering heat map alternatives in R.

    Science.gov (United States)

    Rajaram, Satwik; Oono, Yoshi

    2010-01-22

    The clustered heat map is the most popular means of visualizing genomic data. It compactly displays a large amount of data in an intuitive format that facilitates the detection of hidden structures and relations in the data. However, it is hampered by its use of cluster analysis which does not always respect the intrinsic relations in the data, often requiring non-standardized reordering of rows/columns to be performed post-clustering. This sometimes leads to uninformative and/or misleading conclusions. Often it is more informative to use dimension-reduction algorithms (such as Principal Component Analysis and Multi-Dimensional Scaling) which respect the topology inherent in the data. Yet, despite their proven utility in the analysis of biological data, they are not as widely used. This is at least partially due to the lack of user-friendly visualization methods with the visceral impact of the heat map. NeatMap is an R package designed to meet this need. NeatMap offers a variety of novel plots (in 2 and 3 dimensions) to be used in conjunction with these dimension-reduction techniques. Like the heat map, but unlike traditional displays of such results, it allows the entire dataset to be displayed while visualizing relations between elements. It also allows superimposition of cluster analysis results for mutual validation. NeatMap is shown to be more informative than the traditional heat map with the help of two well-known microarray datasets. NeatMap thus preserves many of the strengths of the clustered heat map while addressing some of its deficiencies. It is hoped that NeatMap will spur the adoption of non-clustering dimension-reduction algorithms.

  2. Current development and application of soybean genomics

    Institute of Scientific and Technical Information of China (English)

    Lingli HE; Jing ZHAO; Man ZHAO; Chaoying HE

    2011-01-01

    Soybean (Glycine max), an important domesticated species originated in China, constitutes a major source of edible oils and high-quality plant proteins worldwide. In spite of its complex genome as a consequence of an ancient tetraploidization, platforms for map-based genomics, sequence-based genomics, comparative genomics and functional genomics have been well developed in the last decade, thus rich repertoires of genomic tools and resources are available, which have been influencing the soybean genetic improvement. Here we mainly review the progresses of soybean (including its wild relative Glycine soja) genomics and its impetus for soybean breeding, and raise the major biological questions needing to be addressed. Genetic maps, physical maps, QTL and EST mapping have been so well achieved that the marker assisted selection and positional cloning in soybean is feasible and even routine. Whole genome sequencing and transcriptomic analyses provide a large collection of molecular markers and predicted genes, which are instrumental to comparative genomics and functional genomics. Comparative genomics has started to reveal the evolution of soybean genome and the molecular basis of soybean domestication process. Microarray resources, mutagenesis and efficient transformation systems become essential components of soybean functional genomics. Furthermore, phenotypic functional genomics via both forward and reverse genetic approaches has inferred functions of many genes involved in plant and seed development, in response to abiotic stresses, functioning in plant-pathogenic microbe interactions, and controlling the oil and protein content of seed. These achievements have paved the way for generation of transgenic or genetically modified (GM) soybean crops.

  3. Electron microscopic comparison of the sequences of single-stranded genomes of mammalian parvoviruses by heteroduplex mapping

    Energy Technology Data Exchange (ETDEWEB)

    Banerjee, P.T.; Olson, W.H.; Allison, D.P.; Bates, R.C.; Snyder, C.E.; Mitra, S.

    1983-01-01

    The sequence homologies among the linear single-stranded genomes of several mammalian parvoviruses have been studied by electron microscopic analysis of the heteroduplexes produced by reannealing the complementary strands of their DNAs. The genomes of Kilham rat virus, H-1, minute virus of mice and LuIII, which are antigenically distinct non-defective parvoviruses, have considerable homology: about 70% of their sequences are conserved. The homologous regions map at similar locations in the left halves (from the 3' ends) of the genomes. No sequence homology, however, is observed between the DNAs of these non-defective parvoviruses and that of bovine parvovirus, another non-defective virus, or that of defective adenoassociated virus, nor between the genomes of bovine parvovirus and adenoassociated virus. This suggests that only very short, if any, homologous regions are present. From these results, an evolutionary relationship among Kilham rat virus, H-1, minute virus of mice and LuIII is predicted. It is interesting to note that, although LuIII was originally isolated from a human cell line and is specific for human cells in vitro, its genome has sequences in common only with the rodent viruses Kilham rat virus, minute virus of mice and H-1, and not with the other two mammalian parvoviruses tested.

  4. CGI: Java software for mapping and visualizing data from array-based comparative genomic hybridization and expression profiling.

    Science.gov (United States)

    Gu, Joyce Xiuweu-Xu; Wei, Michael Yang; Rao, Pulivarthi H; Lau, Ching C; Behl, Sanjiv; Man, Tsz-Kwong

    2007-10-06

    With the increasing application of various genomic technologies in biomedical research, there is a need to integrate these data to correlate candidate genes/regions that are identified by different genomic platforms. Although there are tools that can analyze data from individual platforms, essential software for integration of genomic data is still lacking. Here, we present a novel Java-based program called CGI (Cytogenetics-Genomics Integrator) that matches the BAC clones from array-based comparative genomic hybridization (aCGH) to genes from RNA expression profiling datasets. The matching is computed via a fast, backend MySQL database containing UCSC Genome Browser annotations. This program also provides an easy-to-use graphical user interface for visualizing and summarizing the correlation of DNA copy number changes and RNA expression patterns from a set of experiments. In addition, CGI uses a Java applet to display the copy number values of a specific BAC clone in aCGH experiments side by side with the expression levels of genes that are mapped back to that BAC clone from the microarray experiments. The CGI program is built on top of extensible, reusable graphic components specifically designed for biologists. It is cross-platform compatible and the source code is freely available under the General Public License.

  5. CGI: Java Software for Mapping and Visualizing Data from Array-based Comparative Genomic Hybridization and Expression Profiling

    Directory of Open Access Journals (Sweden)

    Joyce Xiuweu-Xu Gu

    2007-01-01

    With the increasing application of various genomic technologies in biomedical research, there is a need to integrate these data to correlate candidate genes/regions that are identified by different genomic platforms. Although there are tools that can analyze data from individual platforms, essential software for integration of genomic data is still lacking. Here, we present a novel Java-based program called CGI (Cytogenetics-Genomics Integrator) that matches the BAC clones from array-based comparative genomic hybridization (aCGH) to genes from RNA expression profiling datasets. The matching is computed via a fast, backend MySQL database containing UCSC Genome Browser annotations. This program also provides an easy-to-use graphical user interface for visualizing and summarizing the correlation of DNA copy number changes and RNA expression patterns from a set of experiments. In addition, CGI uses a Java applet to display the copy number values of a specific BAC clone in aCGH experiments side by side with the expression levels of genes that are mapped back to that BAC clone from the microarray experiments. The CGI program is built on top of extensible, reusable graphic components specifically designed for biologists. It is cross-platform compatible and the source code is freely available under the General Public License.

  6. The first generation of a BAC-based physical map of Brassica rapa

    Directory of Open Access Journals (Sweden)

    Lee Soo

    2008-06-01

    Background: The genus Brassica includes the most extensively cultivated vegetable crops worldwide. Investigation of the Brassica genome presents excellent challenges to study plant genome evolution and divergence of gene function associated with polyploidy and genome hybridization. A physical map of the B. rapa genome is a fundamental tool for analysis of Brassica "A" genome structure. Integration of a physical map with an existing genetic map by linking genetic markers and BAC clones in the sequencing pipeline provides a crucial resource for the ongoing genome sequencing effort and assembly of whole genome sequences. Results: A genome-wide physical map of the B. rapa genome was constructed by capillary electrophoresis-based fingerprinting of 67,468 Bacterial Artificial Chromosome (BAC) clones using the five restriction enzyme SNaPshot technique. The clones were assembled into contigs by means of FPC v8.5.3. After contig validation and manual editing, the resulting contig assembly consists of 1,428 contigs and is estimated to span 717 Mb in physical length. This map provides 242 anchored contigs on 10 linkage groups to serve as seed points from which to continue bidirectional chromosome extension for genome sequencing. Conclusion: The map reported here is the first physical map for the Brassica "A" genome based on the High Information Content Fingerprinting (HICF) technique. This physical map will serve as a fundamental genomic resource for accelerating genome sequencing, assembly of BAC sequences, and comparative genomics between Brassica genomes. The current build of the B. rapa physical map is available at the B. rapa Genome Project website for the user community.

  7. Genetical Genomics for Evolutionary Studies

    NARCIS (Netherlands)

    Prins, J.C.P.; Smant, G.; Jansen, R.C.

    2012-01-01

    Genetical genomics combines acquired high-throughput genomic data with genetic analysis. In this chapter, we discuss the application of genetical genomics for evolutionary studies, where new high-throughput molecular technologies are combined with mapping quantitative trait loci (QTL) on the genome

  8. Exploiting linkage disequilibrium in statistical modelling in quantitative genomics

    DEFF Research Database (Denmark)

    Wang, Lei

    Alleles at two loci are said to be in linkage disequilibrium (LD) when they are correlated or statistically dependent. Genomic prediction and gene mapping rely on the existence of LD between genetic markers and causal variants of complex traits. In the first part of the thesis, a novel method...... to quantify and visualize local variation in LD along chromosomes is described, and applied to characterize LD patterns at the local and genome-wide scale in three Danish pig breeds. In the second part, different ways of taking LD into account in genomic prediction models are studied. One approach is to use...... the recently proposed antedependence models, which treat neighbouring marker effects as correlated; another approach involves use of haplotype block information derived using the program Beagle. The overall conclusion is that taking LD information into account in genomic prediction models potentially improves...
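
    The definition of LD as correlation between alleles at two loci can be made concrete with the widely used r² statistic: with D = p_AB - p_A p_B, r² = D² / (p_A (1 - p_A) p_B (1 - p_B)). The sketch below assumes phased biallelic haplotypes coded 0/1 and uses hypothetical data; it is not the novel method the thesis describes.

```python
def ld_r2(haplotypes):
    """Squared allelic correlation between two biallelic loci.

    Each haplotype is a pair (a, b) of 0/1 allele codes. Both loci must be
    polymorphic in the sample, otherwise the denominator is zero.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b  # deviation from independence
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


# Hypothetical sample of ten phased haplotypes.
sample = [(1, 1)] * 4 + [(0, 0)] * 4 + [(1, 0)] + [(0, 1)]
print(round(ld_r2(sample), 2))
```

    An r² of 1 means one locus perfectly predicts the other; an r² of 0 means the alleles are statistically independent.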

  9. A multi-objective constraint-based approach for modeling genome-scale microbial ecosystems.

    Directory of Open Access Journals (Sweden)

    Marko Budinich

    Interplay within microbial communities impacts ecosystems on several scales, and elucidation of the consequent effects is a difficult task in ecology. In particular, the integration of genome-scale data within quantitative models of microbial ecosystems remains elusive. This study advocates the use of constraint-based modeling to build predictive models from recent high-resolution -omics datasets. Following recent studies that have demonstrated the accuracy of constraint-based models (CBMs) for simulating single-strain metabolic networks, we sought to study microbial ecosystems as a combination of single-strain metabolic networks that exchange nutrients. This study presents two multi-objective extensions of CBMs for modeling communities: multi-objective flux balance analysis (MO-FBA) and multi-objective flux variability analysis (MO-FVA). Both methods were applied to a hot spring mat model ecosystem. As a result, multiple trade-offs between nutrients and growth rates, as well as thermodynamically favorable relative abundances at community level, were emphasized. We expect this approach to be used for integrating genomic information in microbial ecosystems. Following models will provide insights about behaviors (including diversity) that take place at the ecosystem scale.

  10. Single-molecule sequencing and optical mapping yields an improved genome of woodland strawberry (Fragaria vesca) with chromosome-scale contiguity

    Science.gov (United States)

    Although draft genomes are available for most agronomically important plant species, the majority are incomplete, highly fragmented, and often riddled with assembly and scaffolding errors. These assembly issues hinder advances in tool development for functional genomics and systems biology. Here we ...

  11. Genome-wide retroviral insertional tagging of genes involved in cancer in Cdkn2a-deficient mice

    DEFF Research Database (Denmark)

    Lund, Anders H; Turner, Geoffrey; Trubetskoy, Alla

    2002-01-01

    We have used large-scale insertional mutagenesis to identify functional landmarks relevant to cancer in the recently completed mouse genome sequence. We infected Cdkn2a(-/-) mice with Moloney murine leukemia virus (MoMuLV) to screen for loci that can participate in tumorigenesis in collaboration...... retroviral integration sites and mapped them against the mouse genome sequence databases from Celera and Ensembl. In addition to 17 insertions targeting gene loci known to be cancer-related, we identified a total of 37 new common insertion sites (CISs), of which 8 encode components of signaling pathways...... that are involved in cancer. The effectiveness of large-scale insertional mutagenesis in a sensitized genetic background is demonstrated by the preference for activation of MAP kinase signaling, collaborating with Cdkn2a loss in generating the lymphoid and myeloid tumors. Collectively, our results show that large...

  12. Cloud computing for comparative genomics

    Directory of Open Access Journals (Sweden)

    Pivovarov Rimma

    2010-05-01

    Background: Large comparative genomics studies and tools are becoming increasingly more compute-expensive as the number of available genome sequences continues to rise. The capacity and cost of local computing infrastructures are likely to become prohibitive with the increase, especially as the breadth of questions continues to rise. Alternative computing architectures, in particular cloud computing environments, may help alleviate this increasing pressure and enable fast, large-scale, and cost-effective comparative genomics strategies going forward. To test this, we redesigned a typical comparative genomics algorithm, the reciprocal smallest distance algorithm (RSD), to run within Amazon's Elastic Computing Cloud (EC2). We then employed the RSD-cloud for ortholog calculations across a wide selection of fully sequenced genomes. Results: We ran more than 300,000 RSD-cloud processes within the EC2. These jobs were farmed simultaneously to 100 high capacity compute nodes using the Amazon Web Service Elastic Map Reduce and included a wide mix of large and small genomes. The total computation time took just under 70 hours and cost a total of $6,302 USD. Conclusions: The effort to transform existing comparative genomics algorithms from local compute infrastructures is not trivial. However, the speed and flexibility of cloud computing environments provides a substantial boost with manageable cost. The procedure designed to transform the RSD algorithm into a cloud-ready application is readily adaptable to similar comparative genomics problems.
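
    The reported figures allow a back-of-envelope cost estimate. The sketch below uses only the numbers quoted in the abstract and assumes, for simplicity, exactly 300,000 processes and all 100 nodes running for the full 70 hours; since the abstract says "more than" 300,000 processes and "just under" 70 hours, the per-process cost is an upper bound and the per-node-hour cost a lower bound.

```python
total_cost_usd = 6302   # total cost reported in the abstract
processes = 300_000     # "more than 300,000" RSD-cloud processes
nodes = 100             # high capacity compute nodes
wall_hours = 70         # "just under 70 hours"

cost_per_process = total_cost_usd / processes
node_hours = nodes * wall_hours
cost_per_node_hour = total_cost_usd / node_hours

print(f"about ${cost_per_process:.3f} per process")
print(f"{node_hours} node-hours at about ${cost_per_node_hour:.2f} each")
```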

  13. The pig genome project has plenty to squeal about.

    Science.gov (United States)

    Fan, B; Gorbach, D M; Rothschild, M F

    2011-01-01

    Significant progress on pig genetics and genomics research has been witnessed in recent years due to the integration of advanced molecular biology techniques, bioinformatics and computational biology, and the collaborative efforts of researchers in the swine genomics community. Progress on expanding the linkage map has slowed down, but the efforts have created a higher-resolution physical map integrating the clone map and BAC end sequence. The number of QTL mapped is still growing and most of the updated QTL mapping results are available through PigQTLdb. Additionally, expression studies using high-throughput microarrays and other gene expression techniques have made significant advancements. The number of identified non-coding RNAs is rapidly increasing and their exact regulatory functions are being explored. A publishable draft (build 10) of the swine genome sequence was available for the pig genomics community by the end of December 2010. Build 9 of the porcine genome is currently available with Ensembl annotation; manual annotation is ongoing. These drafts provide useful tools for such endeavors as comparative genomics and SNP scans for fine QTL mapping. A recent community-wide effort to create a 60K porcine SNP chip has greatly facilitated whole-genome association analyses, haplotype block construction and linkage disequilibrium mapping, which can contribute to whole-genome selection. The future 'systems biology' that integrates and optimizes the information from all research levels can enhance the pig community's understanding of the full complexity of the porcine genome. These recent technological advances and where they may lead are reviewed. Copyright © 2011 S. Karger AG, Basel.

  14. Aluminum tolerance association mapping in triticale

    Directory of Open Access Journals (Sweden)

    Niedziela Agnieszka

    2012-02-01

    Background: Crop production practices and industrialization processes result in increasing acidification of arable soils. At lower pH levels (below 5.0), aluminum (Al) remains in a cationic form that is toxic to plants, reducing growth and yield. The effect of aluminum on agronomic performance is particularly important in cereals like wheat, which has promoted the development of programs directed towards selection of tolerant forms. Even in intermediately tolerant cereals (i.e., triticale), the decrease in yield may be significant. In triticale, Al tolerance seems to be influenced by both wheat and rye genomes. However, little is known about the precise chromosomal location of tolerance-related genes, and whether wheat or rye genomes are crucial for the expression of that trait in the hybrid. Results: A mapping population consisting of 232 advanced breeding triticale forms was developed and phenotyped for Al tolerance using physiological tests. AFLP, SSR and DArT marker platforms were applied to obtain a sufficiently large set of molecular markers (over 3000). Associations between the markers and the trait were tested using General (GLM) and Multiple (MLM) Linear Models, as well as the Statistical Machine Learning (SML) approach. The chromosomal locations of candidate markers were verified based on known assignments of SSRs and DArTs or by using genetic maps of rye and triticale. Two candidate markers on chromosome 3R and 9, 15 and 11 on chromosomes 4R, 6R and 7R, respectively, were identified. The r² values were between 0.066 and 0.220 in most cases, indicating a good fit of the data, with better results obtained with the GLM than the MLM approach. Several QTLs on rye chromosomes appeared to be involved in the phenotypic expression of the trait, suggesting that rye genome factors are predominantly responsible for Al tolerance in triticale. Conclusions: The Diversity Arrays Technology was applied successfully to association mapping studies

  15. Deep brain stimulation, brain maps and personalized medicine: lessons from the human genome project.

    Science.gov (United States)

    Fins, Joseph J; Shapiro, Zachary E

    2014-01-01

    Although the appellation of personalized medicine is generally attributed to advanced therapeutics in molecular medicine, deep brain stimulation (DBS) can also be so categorized. Like its medical counterpart, DBS is a highly personalized intervention that needs to be tailored to a patient's individual anatomy. And because of this, DBS like more conventional personalized medicine, can be highly specific where the object of care is an N = 1. But that is where the similarities end. Besides their differing medical and surgical provenances, these two varieties of personalized medicine have had strikingly different impacts. The molecular variant, though of a more recent vintage has thrived and is experiencing explosive growth, while DBS still struggles to find a sustainable therapeutic niche. Despite its promise, and success as a vetted treatment for drug resistant Parkinson's Disease, DBS has lagged in broadening its development, often encountering regulatory hurdles and financial barriers necessary to mount an adequate number of quality trials. In this paper we will consider why DBS-or better yet neuromodulation-has encountered these challenges and contrast this experience with the more successful advance of personalized medicine. We will suggest that personalized medicine and DBS's differential performance can be explained as a matter of timing and complexity. We believe that DBS has struggled because it has been a journey of scientific exploration conducted without a map. In contrast to molecular personalized medicine which followed the mapping of the human genome and the Human Genome Project, DBS preceded plans for the mapping of the human brain. We believe that this sequence has given personalized medicine a distinct advantage and that the fullest potential of DBS will be realized both as a cartographical or electrophysiological probe and as a modality of personalized medicine.

  16. Genome wide association mapping for the tolerance to the polyamine oxidase inhibitor guazatine in Arabidopsis thaliana

    Directory of Open Access Journals (Sweden)

    Kostadin Evgeniev Atanasov

    2016-04-01

    Guazatine is a potent inhibitor of polyamine oxidase (PAO) activity. In agriculture, guazatine is used as a non-systemic contact fungicide efficient in the protection of cereals and citrus fruits against disease. The composition of guazatine is complex, mainly constituted by a mixture of synthetic guanidated polyamines (polyaminoguanidines). Here we have studied the effects of exposure to guazatine in the weed Arabidopsis thaliana. We report that micromolar concentrations of guazatine are sufficient to inhibit growth of Arabidopsis seedlings and induce chlorosis, whereas germination is barely affected. We observed the occurrence of quantitative variation in the response to guazatine between 107 randomly chosen Arabidopsis accessions. This enabled us to undertake genome-wide association (GWA) mapping that identified a locus on chromosome one associated with guazatine tolerance. CHLOROPHYLLASE 1 (CLH1) within this locus was studied as a candidate gene, together with its paralog (CLH2). The analysis of independent clh1-2, clh1-3, clh2-3, clh2-2 and double clh1-2 clh2-3 mutant alleles indicated that CLH1 and/or CLH2 loss-of-function or expression down-regulation promote guazatine tolerance in Arabidopsis. We report a natural mechanism by which Arabidopsis populations can overcome toxicity by the fungicide guazatine.

  17. Dynamic DNA cytosine methylation in the Populus trichocarpa genome: tissue-level variation and relationship to gene expression

    Directory of Open Access Journals (Sweden)

    Vining Kelly J

    2012-01-01

    Background: DNA cytosine methylation is an epigenetic modification that has been implicated in many biological processes. However, large-scale epigenomic studies have been applied to very few plant species, and variability in methylation among specialized tissues and its relationship to gene expression is poorly understood. Results: We surveyed DNA methylation from seven distinct tissue types (vegetative bud, male inflorescence [catkin], female catkin, leaf, root, xylem, phloem) in the reference tree species black cottonwood (Populus trichocarpa). Using 5-methyl-cytosine DNA immunoprecipitation followed by Illumina sequencing (MeDIP-seq), we mapped a total of 129,360,151 36- or 32-mer reads to the P. trichocarpa reference genome. We validated MeDIP-seq results by bisulfite sequencing, and compared methylation and gene expression using published microarray data. Qualitative DNA methylation differences among tissues were obvious on a chromosome scale. Methylated genes had lower expression than unmethylated genes, but genes with methylation in transcribed regions ("gene body methylation") had even lower expression than genes with promoter methylation. Promoter methylation was more frequent than gene body methylation in all tissues except male catkins. Male catkins differed in demethylation of particular transposable element categories, in level of gene body methylation, and in expression range of genes with methylated transcribed regions. Tissue-specific gene expression patterns were correlated with both gene body and promoter methylation. Conclusions: We found striking differences among tissues in methylation, which were apparent at the chromosomal scale and when genes and transposable elements were examined. In contrast to other studies in plants, gene body methylation had a more repressive effect on transcription than promoter methylation.

  18. Complete Genome Sequence Analysis of Enterobacter sp. SA187, a Plant Multi-Stress Tolerance Promoting Endophytic Bacterium

    KAUST Repository

    Andres-Barrao, Cristina; Lafi, Feras Fawzi; Alam, Intikhab; Zélicourt, Axel de; Eida, Abdul Aziz; Bokhari, Ameerah; Alzubaidy, Hanin S.; Bajic, Vladimir B.; Hirt, Heribert; Saad, Maged

    2017-01-01

    Enterobacter sp. SA187 is an endophytic bacterium that has been isolated from root nodules of the indigenous desert plant Indigofera argentea. SA187 could survive in the rhizosphere as well as in association with different plant species, and was able to provide abiotic stress tolerance to Arabidopsis thaliana. The genome sequence of SA187 was obtained by using Pacific BioScience (PacBio) single-molecule sequencing technology, with average coverage of 275X. The genome of SA187 consists of one single 4,429,597 bp chromosome, with an average 56% GC content and 4,347 predicted protein coding DNA sequences (CDS), 153 ncRNA, 7 rRNA, and 84 tRNA. Functional analysis of the SA187 genome revealed a large number of genes involved in uptake and exchange of nutrients, chemotaxis, mobilization and plant colonization. A high number of genes were also found to be involved in survival, defense against oxidative stress and production of antimicrobial compounds and toxins. Moreover, different metabolic pathways were identified that potentially contribute to plant growth promotion. The information encoded in the genome of SA187 reveals the characteristics of a dualistic lifestyle of a bacterium that can adapt to different environments and promote the growth of plants. This information provides a better understanding of the mechanisms involved in plant-microbe interaction and could be further exploited to develop SA187 as a biological agent to improve agricultural practices in marginal and arid lands.

  19. Complete Genome Sequence Analysis of Enterobacter sp. SA187, a Plant Multi-Stress Tolerance Promoting Endophytic Bacterium

    KAUST Repository

    Andres-Barrao, Cristina

    2017-10-20

    Enterobacter sp. SA187 is an endophytic bacterium that has been isolated from root nodules of the indigenous desert plant Indigofera argentea. SA187 could survive in the rhizosphere as well as in association with different plant species, and was able to provide abiotic stress tolerance to Arabidopsis thaliana. The genome sequence of SA187 was obtained by using Pacific BioScience (PacBio) single-molecule sequencing technology, with average coverage of 275X. The genome of SA187 consists of one single 4,429,597 bp chromosome, with an average 56% GC content and 4,347 predicted protein coding DNA sequences (CDS), 153 ncRNA, 7 rRNA, and 84 tRNA. Functional analysis of the SA187 genome revealed a large number of genes involved in uptake and exchange of nutrients, chemotaxis, mobilization and plant colonization. A high number of genes were also found to be involved in survival, defense against oxidative stress and production of antimicrobial compounds and toxins. Moreover, different metabolic pathways were identified that potentially contribute to plant growth promotion. The information encoded in the genome of SA187 reveals the characteristics of a dualistic lifestyle of a bacterium that can adapt to different environments and promote the growth of plants. This information provides a better understanding of the mechanisms involved in plant-microbe interaction and could be further exploited to develop SA187 as a biological agent to improve agricultural practices in marginal and arid lands.

  20. Environmental versatility promotes modularity in large scale metabolic networks

    OpenAIRE

    Samal A.; Wagner Andreas; Martin O.C.

    2011-01-01

    Abstract Background The ubiquity of modules in biological networks may result from an evolutionary benefit of a modular organization. For instance, modularity may increase the rate of adaptive evolution, because modules can be easily combined into new arrangements that may benefit their carrier. Conversely, modularity may emerge as a by-product of some trait. We here ask whether this last scenario may play a role in genome-scale metabolic networks that need to sustain life in one or more chem...

  1. A microarray-based genotyping and genetic mapping approach for highly heterozygous outcrossing species enables localization of a large fraction of the unassembled Populus trichocarpa genome sequence.

    Science.gov (United States)

    Drost, Derek R; Novaes, Evandro; Boaventura-Novaes, Carolina; Benedict, Catherine I; Brown, Ryan S; Yin, Tongming; Tuskan, Gerald A; Kirst, Matias

    2009-06-01

    Microarrays have demonstrated significant power for genome-wide analyses of gene expression, and recently have also revolutionized the genetic analysis of segregating populations by genotyping thousands of loci in a single assay. Although microarray-based genotyping approaches have been successfully applied in yeast and several inbred plant species, their power has not been proven in an outcrossing species with extensive genetic diversity. Here we have developed methods for high-throughput microarray-based genotyping in such species using a pseudo-backcross progeny of 154 individuals of Populus trichocarpa and P. deltoides analyzed with long-oligonucleotide in situ-synthesized microarray probes. Our analysis resulted in high-confidence genotypes for 719 single-feature polymorphism (SFP) and 1014 gene expression marker (GEM) candidates. Using these genotypes and an established microsatellite (SSR) framework map, we produced a high-density genetic map comprising over 600 SFPs, GEMs and SSRs. The abundance of gene-based markers allowed us to localize over 35 million base pairs of previously unplaced whole-genome shotgun (WGS) scaffold sequence to putative locations in the genome of P. trichocarpa. A high proportion of sampled scaffolds could be verified for their placement with independently mapped SSRs, demonstrating the previously un-utilized power that high-density genotyping can provide in the context of map-based WGS sequence reassembly. Our results provide a substantial contribution to the continued improvement of the Populus genome assembly, while demonstrating the feasibility of microarray-based genotyping in a highly heterozygous population. The strategies presented are applicable to genetic mapping efforts in all plant species with similarly high levels of genetic diversity.

  2. Genome-Wide Association Mapping of Flowering and Ripening Periods in Apple

    Directory of Open Access Journals (Sweden)

    Jorge Urrestarazu

    2017-11-01

    Deciphering the genetic control of flowering and ripening periods in apple is essential for breeding cultivars adapted to their growing environments. We implemented a large Genome-Wide Association Study (GWAS) at the European level using an association panel of 1,168 different apple genotypes distributed over six locations and phenotyped for these phenological traits. The panel was genotyped at a high density of SNPs using the Axiom® Apple 480K SNP array. We ran GWAS with a multi-locus mixed model (MLMM), which handles the putatively confounding effect of significant SNPs elsewhere on the genome. Genomic regions were further investigated to reveal candidate genes responsible for the phenotypic variation. At the whole population level, GWAS retained two SNPs as cofactors on chromosome 9 for flowering period, and six for ripening period (four on chromosome 3, one on chromosome 10 and one on chromosome 16), which together accounted for 8.9 and 17.2% of the phenotypic variance, respectively. For both traits, SNPs in weak linkage disequilibrium were detected nearby, thus suggesting the existence of allelic heterogeneity. The geographic origins and relationships of apple cultivars accounted for large parts of the phenotypic variation. Variation in genotypic frequency of the SNPs associated with the two traits was connected to the geographic origin of the genotypes (grouped as North+East, West and South Europe), and indicated differential selection in different growing environments. Genes encoding transcription factors containing either NAC or MADS domains were identified as major candidates within the small confidence intervals computed for the associated genomic regions. A strong microsynteny between apple and peach was revealed in all the four confidence interval regions. This study shows how association genetics can unravel the genetic control of important horticultural traits in apple, as well as reduce the confidence intervals of the associated

  3. phiGENOME: an integrative navigation throughout bacteriophage genomes.

    Science.gov (United States)

    Stano, Matej; Klucar, Lubos

    2011-11-01

    phiGENOME is a web-based genome browser generating dynamic and interactive graphical representation of phage genomes stored in the phiSITE, database of gene regulation in bacteriophages. phiGENOME is an integral part of the phiSITE web portal (http://www.phisite.org/phigenome) and it was optimised for visualisation of phage genomes with the emphasis on the gene regulatory elements. phiGENOME consists of three components: (i) genome map viewer built using Adobe Flash technology, providing dynamic and interactive graphical display of phage genomes; (ii) sequence browser based on precisely formatted HTML tags, providing detailed exploration of genome features on the sequence level and (iii) regulation illustrator, based on Scalable Vector Graphics (SVG) and designed for graphical representation of gene regulations. Bringing 542 complete genome sequences accompanied with their rich annotations and references, makes phiGENOME a unique information resource in the field of phage genomics. Copyright © 2011 Elsevier Inc. All rights reserved.

  4. Emerging Genomic Tools for Legume Breeding: Current Status and Future Prospects

    Science.gov (United States)

    Pandey, Manish K.; Roorkiwal, Manish; Singh, Vikas K.; Ramalingam, Abirami; Kudapa, Himabindu; Thudi, Mahendar; Chitikineni, Anu; Rathore, Abhishek; Varshney, Rajeev K.

    2016-01-01

    Legumes play a vital role in ensuring global nutritional food security and improving soil quality through nitrogen fixation. Accelerated higher genetic gains are required to meet the demand of an ever-increasing global population. In recent years, speedy developments have been witnessed in legume genomics due to advancements in next-generation sequencing (NGS) and high-throughput genotyping technologies. Reference genome sequences for many legume crops have been reported in the last 5 years. The availability of the draft genome sequences and re-sequencing of elite genotypes for several important legume crops have made it possible to identify structural variations at large scale. Availability of large-scale genomic resources and low-cost and high-throughput genotyping technologies are enhancing the efficiency and resolution of genetic mapping and marker-trait association studies. Most importantly, deployment of molecular breeding approaches has resulted in development of improved lines in some legume crops such as chickpea and groundnut. In order to support genomics-driven crop improvement at a fast pace, the deployment of breeder-friendly genomics and decision support tools appears to be critical in breeding programs in developing countries. This review provides an overview of emerging genomics and informatics tools/approaches that will be the key driving force for accelerating genomics-assisted breeding and ultimately ensuring nutritional and food security in developing countries. PMID:27199998

  5. The genome of Arabidopsis thaliana.

    OpenAIRE

    Goodman, H M; Ecker, J R; Dean, C

    1995-01-01

    Arabidopsis thaliana is a small flowering plant that is a member of the family Cruciferae. It has many characteristics--diploid genetics, rapid growth cycle, relatively low repetitive DNA content, and small genome size--that recommend it as the model for a plant genome project. The current status of the genetic and physical maps, as well as efforts to sequence the genome, are presented. Examples are given of genes isolated by using map-based cloning. The importance of the Arabidopsis project ...

  6. Habitat Scale Mapping of Fisheries Ecosystem Service Values in Estuaries

    Directory of Open Access Journals (Sweden)

    Timothy G. O'Higgins

    2010-12-01

    Little is known about the variability of ecosystem service values at spatial scales most relevant to local decision makers. Competing definitions of ecosystem services, the paucity of ecological and economic information, and the lack of standardization in methodology are major obstacles to applying the ecosystem-services approach at the estuary scale. We present a standardized method that combines habitat maps and habitat-faunal associations to estimate ecosystem service values for recreational and commercial fisheries in estuaries. Three case studies in estuaries on the U.S. west coast (Yaquina Bay, Oregon), east coast (Lagoon Pond, Massachusetts), and the Gulf of Mexico (Weeks Bay, Alabama) are presented to illustrate our method's rigor and limitations using available data. The resulting spatially explicit maps of fisheries ecosystem service values show within- and between-estuary variations in the value of estuarine habitat types that can be used to make better informed resource-management decisions.

  7. Genome-wide mapping in a house mouse hybrid zone reveals hybrid sterility loci and Dobzhansky-Muller interactions.

    Science.gov (United States)

    Turner, Leslie M; Harr, Bettina

    2014-12-09

    Mapping hybrid defects in contact zones between incipient species can identify genomic regions contributing to reproductive isolation and reveal genetic mechanisms of speciation. The house mouse features a rare combination of sophisticated genetic tools and natural hybrid zones between subspecies. Male hybrids often show reduced fertility, a common reproductive barrier between incipient species. Laboratory crosses have identified sterility loci, but each encompasses hundreds of genes. We map genetic determinants of testis weight and testis gene expression using offspring of mice captured in a hybrid zone between M. musculus musculus and M. m. domesticus. Many generations of admixture enable high-resolution mapping of loci contributing to these sterility-related phenotypes. We identify complex interactions among sterility loci, suggesting multiple, non-independent genetic incompatibilities contribute to barriers to gene flow in the hybrid zone.

  8. Techniques for Large-Scale Bacterial Genome Manipulation and Characterization of the Mutants with Respect to In Silico Metabolic Reconstructions.

    Science.gov (United States)

    diCenzo, George C; Finan, Turlough M

    2018-01-01

    The rate at which all genes within a bacterial genome can be identified far exceeds the ability to characterize these genes. To assist in associating genes with cellular functions, a large-scale bacterial genome deletion approach can be employed to rapidly screen tens to thousands of genes for desired phenotypes. Here, we provide a detailed protocol for the generation of deletions of large segments of bacterial genomes that relies on the activity of a site-specific recombinase. In this procedure, two recombinase recognition target sequences are introduced into known positions of a bacterial genome through single cross-over plasmid integration. Subsequent expression of the site-specific recombinase mediates recombination between the two target sequences, resulting in the excision of the intervening region and its loss from the genome. We further illustrate how this deletion system can be readily adapted to function as a large-scale in vivo cloning procedure, in which the region excised from the genome is captured as a replicative plasmid. We next provide a procedure for the metabolic analysis of bacterial large-scale genome deletion mutants using the Biolog Phenotype MicroArray™ system. Finally, a pipeline is described, and a sample Matlab script is provided, for the integration of the obtained data with a draft metabolic reconstruction for the refinement of the reactions and gene-protein-reaction relationships in a metabolic reconstruction.
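
    The excision step described in this abstract can be pictured with a toy string model. The sketch below is an illustration only, not the authors' protocol or their Matlab pipeline: it represents the genome as a linear string, assumes the two recognition sites are identical direct repeats, and assumes that recombination leaves one site copy in the genome while the other leaves with the excised segment. The function name `excise` and the sequences are hypothetical.

```python
from typing import Tuple

def excise(genome: str, site: str) -> Tuple[str, str]:
    """Toy model of site-specific recombination between two direct repeats.

    Returns (remaining_genome, excised_segment). The excised segment carries
    one copy of the site plus the intervening region; in a cell it would be
    circular, which a plain string cannot show.
    """
    first = genome.find(site)
    if first == -1:
        raise ValueError("target site not found")
    second = genome.find(site, first + len(site))
    if second == -1:
        raise ValueError("need two copies of the target site")
    remaining = genome[:first] + genome[second:]  # keeps the second site copy
    excised = genome[first:second]                # first site copy + region
    return remaining, excised

# Hypothetical sequences chosen only to make the slices easy to follow.
left, site, region, right = "AAAA", "TTCG", "GGGGCC", "CCCC"
genome = left + site + region + site + right

remaining, excised = excise(genome, site)
print(remaining)  # flank + single site + flank
print(excised)    # site + intervening region
```

    In this model no sequence is lost overall: the lengths of the remaining genome and the excised segment add up to the original length. That matches the in vivo cloning variant mentioned in the abstract, where the excised region is captured as a replicative plasmid instead of being lost.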

  9. An integrative and applicable phylogenetic footprinting framework for cis-regulatory motifs identification in prokaryotic genomes.

    Science.gov (United States)

    Liu, Bingqiang; Zhang, Hanyuan; Zhou, Chuan; Li, Guojun; Fennell, Anne; Wang, Guanghui; Kang, Yu; Liu, Qi; Ma, Qin

    2016-08-09

    Phylogenetic footprinting is an important computational technique for identifying cis-regulatory motifs in orthologous regulatory regions from multiple genomes, as motifs tend to evolve slower than their surrounding non-functional sequences. Its application, however, faces several difficulties in optimizing the selection of orthologous data and reducing the false positives in motif prediction. Here we present an integrative phylogenetic footprinting framework for accurate motif predictions in prokaryotic genomes (MP(3)). The framework includes a new orthologous data preparation procedure, an additional promoter scoring and pruning method and an integration of six existing motif finding algorithms as basic motif search engines. Specifically, we collected orthologous genes from available prokaryotic genomes and built the orthologous regulatory regions based on sequence similarity of promoter regions. This procedure made full use of the large-scale genomic data and taxonomy information and filtered out the promoters with limited contribution to produce a high quality orthologous promoter set. The promoter scoring and pruning is implemented through motif voting by a set of complementary predicting tools that mine as many motif candidates as possible and simultaneously eliminate the effect of random noise. We have applied the framework to the Escherichia coli K12 genome and evaluated the prediction performance through comparison with seven existing programs. This evaluation was systematically carried out at the nucleotide and binding site level, and the results showed that MP(3) consistently outperformed other popular motif finding tools. We have integrated MP(3) into our motif identification and analysis server DMINDA, allowing users to efficiently identify and analyze motifs in 2,072 completely sequenced prokaryotic genomes. The performance evaluation indicated that MP(3) is effective for predicting regulatory motifs in prokaryotic genomes. Its application may enhance

  10. Radon risk mapping of the Czech Republic on a scale 1:50000

    International Nuclear Information System (INIS)

    Barnet, I.; Miksova, J.; Tomas, R.; Karenova, J.

    2000-01-01

    A new type of radon risk map on a scale of 1:50000 was published in the Czech Republic. The maps are based on the vectorized contours of geological units and rock types and on field soil gas radon measurements from the radon database. Radon risk is expressed in four categories. The more detailed topography makes it possible to predict the radon risk from bedrock in the built-up areas (intravilans) of villages and towns. (author)

  11. Genetic fine-mapping and genomic annotation defines causal mechanisms at type 2 diabetes susceptibility loci

    Science.gov (United States)

    Mahajan, Anubha; Locke, Adam; Rayner, N William; Robertson, Neil; Scott, Robert A; Prokopenko, Inga; Scott, Laura J; Green, Todd; Sparso, Thomas; Thuillier, Dorothee; Yengo, Loic; Grallert, Harald; Wahl, Simone; Frånberg, Mattias; Strawbridge, Rona J; Kestler, Hans; Chheda, Himanshu; Eisele, Lewin; Gustafsson, Stefan; Steinthorsdottir, Valgerdur; Thorleifsson, Gudmar; Qi, Lu; Karssen, Lennart C; van Leeuwen, Elisabeth M; Willems, Sara M; Li, Man; Chen, Han; Fuchsberger, Christian; Kwan, Phoenix; Ma, Clement; Linderman, Michael; Lu, Yingchang; Thomsen, Soren K; Rundle, Jana K; Beer, Nicola L; van de Bunt, Martijn; Chalisey, Anil; Kang, Hyun Min; Voight, Benjamin F; Abecasis, Goncalo R; Almgren, Peter; Baldassarre, Damiano; Balkau, Beverley; Benediktsson, Rafn; Blüher, Matthias; Boeing, Heiner; Bonnycastle, Lori L; Borringer, Erwin P; Burtt, Noël P; Carey, Jason; Charpentier, Guillaume; Chines, Peter S; Cornelis, Marilyn C; Couper, David J; Crenshaw, Andrew T; van Dam, Rob M; Doney, Alex SF; Dorkhan, Mozhgan; Edkins, Sarah; Eriksson, Johan G; Esko, Tonu; Eury, Elodie; Fadista, João; Flannick, Jason; Fontanillas, Pierre; Fox, Caroline; Franks, Paul W; Gertow, Karl; Gieger, Christian; Gigante, Bruna; Gottesman, Omri; Grant, George B; Grarup, Niels; Groves, Christopher J; Hassinen, Maija; Have, Christian T; Herder, Christian; Holmen, Oddgeir L; Hreidarsson, Astradur B; Humphries, Steve E; Hunter, David J; Jackson, Anne U; Jonsson, Anna; Jørgensen, Marit E; Jørgensen, Torben; Kerrison, Nicola D; Kinnunen, Leena; Klopp, Norman; Kong, Augustine; Kovacs, Peter; Kraft, Peter; Kravic, Jasmina; Langford, Cordelia; Leander, Karin; Liang, Liming; Lichtner, Peter; Lindgren, Cecilia M; Lindholm, Eero; Linneberg, Allan; Liu, Ching-Ti; Lobbens, Stéphane; Luan, Jian’an; Lyssenko, Valeriya; Männistö, Satu; McLeod, Olga; Meyer, Julia; Mihailov, Evelin; Mirza, Ghazala; Mühleisen, Thomas W; Müller-Nurasyid, Martina; Navarro, Carmen; Nöthen, Markus M; Oskolkov, Nikolay N; Owen, 
Katharine R; Palli, Domenico; Pechlivanis, Sonali; Perry, John RB; Platou, Carl GP; Roden, Michael; Ruderfer, Douglas; Rybin, Denis; van der Schouw, Yvonne T; Sennblad, Bengt; Sigurðsson, Gunnar; Stančáková, Alena; Steinbach, Gerald; Storm, Petter; Strauch, Konstantin; Stringham, Heather M; Sun, Qi; Thorand, Barbara; Tikkanen, Emmi; Tonjes, Anke; Trakalo, Joseph; Tremoli, Elena; Tuomi, Tiinamaija; Wennauer, Roman; Wood, Andrew R; Zeggini, Eleftheria; Dunham, Ian; Birney, Ewan; Pasquali, Lorenzo; Ferrer, Jorge; Loos, Ruth JF; Dupuis, Josée; Florez, Jose C; Boerwinkle, Eric; Pankow, James S; van Duijn, Cornelia; Sijbrands, Eric; Meigs, James B; Hu, Frank B; Thorsteinsdottir, Unnur; Stefansson, Kari; Lakka, Timo A; Rauramaa, Rainer; Stumvoll, Michael; Pedersen, Nancy L; Lind, Lars; Keinanen-Kiukaanniemi, Sirkka M; Korpi-Hyövälti, Eeva; Saaristo, Timo E; Saltevo, Juha; Kuusisto, Johanna; Laakso, Markku; Metspalu, Andres; Erbel, Raimund; Jöckel, Karl-Heinz; Moebus, Susanne; Ripatti, Samuli; Salomaa, Veikko; Ingelsson, Erik; Boehm, Bernhard O; Bergman, Richard N; Collins, Francis S; Mohlke, Karen L; Koistinen, Heikki; Tuomilehto, Jaakko; Hveem, Kristian; Njølstad, Inger; Deloukas, Panagiotis; Donnelly, Peter J; Frayling, Timothy M; Hattersley, Andrew T; de Faire, Ulf; Hamsten, Anders; Illig, Thomas; Peters, Annette; Cauchi, Stephane; Sladek, Rob; Froguel, Philippe; Hansen, Torben; Pedersen, Oluf; Morris, Andrew D; Palmer, Collin NA; Kathiresan, Sekar; Melander, Olle; Nilsson, Peter M; Groop, Leif C; Barroso, Inês; Langenberg, Claudia; Wareham, Nicholas J; O’Callaghan, Christopher A; Gloyn, Anna L; Altshuler, David; Boehnke, Michael; Teslovich, Tanya M; McCarthy, Mark I; Morris, Andrew P

    2015-01-01

    We performed fine-mapping of 39 established type 2 diabetes (T2D) loci in 27,206 cases and 57,574 controls of European ancestry. We identified 49 distinct association signals at these loci, including five mapping in/near KCNQ1. “Credible sets” of variants most likely to drive each distinct signal mapped predominantly to non-coding sequence, implying that T2D association is mediated through gene regulation. Credible set variants were enriched for overlap with FOXA2 chromatin immunoprecipitation binding sites in human islet and liver cells, including at MTNR1B, where fine-mapping implicated rs10830963 as driving T2D association. We confirmed that this T2D-risk allele increases FOXA2-bound enhancer activity in islet- and liver-derived cells. We observed allele-specific differences in NEUROD1 binding in islet-derived cells, consistent with evidence that the T2D-risk allele increases islet MTNR1B expression. Our study demonstrates how integration of genetic and genomic information can define molecular mechanisms through which variants underlying association signals exert their effects on disease. PMID:26551672
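
    The abstract does not say how the credible sets were built. A common construction, shown here only as a minimal sketch, ranks variants by posterior probability of driving the signal and keeps the smallest group whose cumulative probability reaches a chosen coverage. The function name, the 0.99 default coverage, and the variant labels are illustrative assumptions, not values from the study.

```python
from typing import Dict, List


def credible_set(posteriors: Dict[str, float], coverage: float = 0.99) -> List[str]:
    """Return the smallest set of variants whose normalized posterior
    probabilities sum to at least `coverage` (assumed construction)."""
    total = sum(posteriors.values())
    if total <= 0:
        raise ValueError("posterior probabilities must sum to a positive value")
    # Rank variants from most to least probable.
    ranked = sorted(posteriors.items(), key=lambda kv: kv[1], reverse=True)
    selected: List[str] = []
    cumulative = 0.0
    for variant, prob in ranked:
        selected.append(variant)
        cumulative += prob / total  # normalize in case inputs do not sum to 1
        if cumulative >= coverage:
            break
    return selected


# Hypothetical posterior probabilities for four variants at one signal.
example = {"rsA": 0.90, "rsB": 0.06, "rsC": 0.035, "rsD": 0.005}
print(credible_set(example, coverage=0.99))
```

    A smaller credible set means the signal is localized to fewer candidate variants, which is what makes overlap with annotations such as transcription factor binding sites informative.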

  12. Open quantum maps from complex scaling of kicked scattering systems

    Science.gov (United States)

    Mertig, Normann; Shudo, Akira

    2018-04-01

    We derive open quantum maps from periodically kicked scattering systems and discuss the computation of their resonance spectra in terms of theoretically grounded methods, such as complex scaling and sufficiently weak absorbing potentials. In contrast, we also show that current implementations of open quantum maps, based on strong absorptive or even projective openings, fail to produce the resonance spectra of kicked scattering systems. This comparison pinpoints flaws in current implementations of open quantum maps, namely, the inability to separate resonance eigenvalues from the continuum as well as the presence of diffraction effects due to strong absorption. The reported deviations from the true resonance spectra appear, even if the openings do not affect the classical trapped set, and become appreciable for shorter-lived resonances, e.g., those associated with chaotic orbits. This makes the open quantum maps, which we derive in this paper, a valuable alternative for future explorations of quantum-chaotic scattering systems, for example, in the context of the fractal Weyl law. The results are illustrated for a quantum map model whose classical dynamics exhibits key features of ionization and a trapped set which is organized by a topological horseshoe.

  13. Genome-scale characterization of RNA tertiary structures and their functional impact by RNA solvent accessibility prediction.

    Science.gov (United States)

    Yang, Yuedong; Li, Xiaomei; Zhao, Huiying; Zhan, Jian; Wang, Jihua; Zhou, Yaoqi

    2017-01-01

    As most RNA structures are elusive to structure determination, obtaining solvent accessible surface areas (ASAs) of nucleotides in an RNA structure is an important first step to characterize potential functional sites and core structural regions. Here, we developed RNAsnap, the first machine-learning method trained on protein-bound RNA structures for solvent accessibility prediction. Built on sequence profiles from multiple sequence alignment (RNAsnap-prof), the method provided robust prediction in fivefold cross-validation and an independent test (Pearson correlation coefficients, r, between predicted and actual ASA values are 0.66 and 0.63, respectively). Application of the method to 6178 mRNAs revealed its positive correlation to mRNA accessibility by dimethyl sulphate (DMS) experimentally measured in vivo (r = 0.37) but not in vitro (r = 0.07), despite the lack of training on mRNAs and the fact that DMS accessibility is only an approximation to solvent accessibility. We further found strong association across coding and noncoding regions between predicted solvent accessibility of the mutation site of a single nucleotide variant (SNV) and the frequency of that variant in the population for 2.2 million SNVs obtained in the 1000 Genomes Project. Moreover, mapping solvent accessibility of RNAs to the human genome indicated that introns, 5' cap of 5' and 3' cap of 3' untranslated regions, are more solvent accessible, consistent with their respective functional roles. These results support conformational selections as the mechanism for the formation of RNA-protein complexes and highlight the utility of genome-scale characterization of RNA tertiary structures by RNAsnap. The server and its stand-alone downloadable version are available at http://sparks-lab.org. © 2016 Yang et al.; Published by Cold Spring Harbor Laboratory Press for the RNA Society.
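
    The r values quoted above are Pearson correlation coefficients between predicted and measured accessibility. As a minimal sketch of that statistic (the function name and the toy inputs are assumptions for illustration, not part of RNAsnap):

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("inputs must have the same length, at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant input")
    return cov / math.sqrt(var_x * var_y)


# Toy values standing in for predicted and measured ASA per nucleotide.
predicted = [1.0, 2.0, 3.0]
measured = [2.0, 4.0, 6.0]
print(pearson_r(predicted, measured))
```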

  14. Comparative evaluation of atom mapping algorithms for balanced metabolic reactions: application to Recon 3D.

    Science.gov (United States)

    Preciat Gonzalez, German A; El Assal, Lemmer R P; Noronha, Alberto; Thiele, Ines; Haraldsdóttir, Hulda S; Fleming, Ronan M T

    2017-06-14

    The mechanism of each chemical reaction in a metabolic network can be represented as a set of atom mappings, each of which relates an atom in a substrate metabolite to an atom of the same element in a product metabolite. Genome-scale metabolic network reconstructions typically represent biochemistry at the level of reaction stoichiometry. However, a more detailed representation at the underlying level of atom mappings opens the possibility for a broader range of biological, biomedical and biotechnological applications than with stoichiometry alone. Complete manual acquisition of atom mapping data for a genome-scale metabolic network is a laborious process. However, many algorithms exist to predict atom mappings. How do their predictions compare to each other and to manually curated atom mappings? For more than four thousand metabolic reactions in the latest human metabolic reconstruction, Recon 3D, we compared the atom mappings predicted by six atom mapping algorithms. We also compared these predictions to those obtained by manual curation of atom mappings for over five hundred reactions distributed among all top level Enzyme Commission number classes. Five of the evaluated algorithms had similarly high prediction accuracy of over 91% when compared to manually curated atom mapped reactions. On average, the accuracy of the prediction was highest for reactions catalysed by oxidoreductases and lowest for reactions catalysed by ligases. In addition to prediction accuracy, the algorithms were evaluated on their accessibility, their advanced features, such as the ability to identify equivalent atoms, and their ability to map hydrogen atoms. In addition to prediction accuracy, we found that software accessibility and advanced features were fundamental to the selection of an atom mapping algorithm in practice.
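
    One simple way to read the accuracy figure above is as the fraction of manually curated reactions for which an algorithm predicts exactly the curated atom mapping. The sketch below assumes that reading; the data layout (a dictionary from substrate atom index to product atom index per reaction) and all identifiers are hypothetical. It deliberately ignores equivalent atoms, which the abstract notes some algorithms can identify, so it would undercount correct predictions for symmetric molecules.

```python
from typing import Dict

# Assumed representation: substrate atom index -> product atom index.
AtomMapping = Dict[int, int]


def mapping_accuracy(
    predicted: Dict[str, AtomMapping], curated: Dict[str, AtomMapping]
) -> float:
    """Fraction of curated reactions whose predicted mapping matches exactly."""
    if not curated:
        raise ValueError("need at least one curated reaction")
    correct = sum(
        1 for reaction_id, reference in curated.items()
        if predicted.get(reaction_id) == reference
    )
    return correct / len(curated)


# Two hypothetical reactions, each with two mapped atoms.
curated = {"R1": {1: 1, 2: 2}, "R2": {1: 2, 2: 1}}
predicted = {"R1": {1: 1, 2: 2}, "R2": {1: 1, 2: 2}}
print(mapping_accuracy(predicted, curated))
```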

  15. Bacillus anthracis genome organization in light of whole transcriptome sequencing

    Energy Technology Data Exchange (ETDEWEB)

    Martin, Jeffrey; Zhu, Wenhan; Passalacqua, Karla D.; Bergman, Nicholas; Borodovsky, Mark

    2010-03-22

    Emerging knowledge of whole prokaryotic transcriptomes could validate a number of theoretical concepts introduced in the early days of genomics. What are the rules connecting gene expression levels with sequence determinants such as quantitative scores of promoters and terminators? Are translation efficiency measures, e.g. codon adaptation index and RBS score, related to gene expression? We used whole transcriptome shotgun sequencing of the bacterial pathogen Bacillus anthracis to assess the correlation of gene expression level with promoter, terminator and RBS scores, codon adaptation index, and a new measure of gene translational efficiency, average translation speed. We compared computational predictions of operon topologies with the transcript borders inferred from RNA-Seq reads. Transcriptome mapping may also improve existing gene annotation. Upon assessing the accuracy of the current annotation of protein-coding genes in the B. anthracis genome, we have shown that the transcriptome data indicate the existence of more than a hundred genes that are missing from the annotation, though predicted by an ab initio gene finder. Interestingly, we observed that many pseudogenes possess not only a sequence with detectable coding potential but also promoters that maintain transcriptional activity.

  16. Large scale mapping: an empirical comparison of pixel-based and ...

    African Journals Online (AJOL)

    In the past, large scale mapping was carried out using precise ground survey methods. Later, a paradigm shift in data collection using medium to low resolution and, recently, high resolution images brought to bear the problem of accurate data analysis and fitness-for-purpose challenges. Using high resolution satellite images ...

  17. Mapping and sequencing the human genome: Science, ethics, and public policy. Final report

    Energy Technology Data Exchange (ETDEWEB)

    McInerney, J.D.

    1993-03-31

    Development of Mapping and Sequencing the Human Genome: Science, Ethics, and Public Policy followed the standard process of curriculum development at the Biological Sciences Curriculum Study (BSCS); that process is described. The production of this module was a collaborative effort between BSCS and the American Medical Association (AMA). Appendix A contains a copy of the module. Copies of reports sent to the Department of Energy (DOE) during the development process are contained in Appendix B; all reports should be on file at DOE. Appendix B also contains copies of status reports submitted to the BSCS Board of Directors.

  18. Genome Partitioner: A web tool for multi-level partitioning of large-scale DNA constructs for synthetic biology applications.

    Science.gov (United States)

    Christen, Matthias; Del Medico, Luca; Christen, Heinz; Christen, Beat

    2017-01-01

    Recent advances in lower-cost DNA synthesis techniques have enabled new innovations in the field of synthetic biology. Still, efficient design and higher-order assembly of genome-scale DNA constructs remains a labor-intensive process. Given the complexity, computer assisted design tools that fragment large DNA sequences into fabricable DNA blocks are needed to pave the way towards streamlined assembly of biological systems. Here, we present the Genome Partitioner software implemented as a web-based interface that permits multi-level partitioning of genome-scale DNA designs. Without the need for specialized computing skills, biologists can submit their DNA designs to a fully automated pipeline that generates the optimal retrosynthetic route for higher-order DNA assembly. To test the algorithm, we partitioned a 783 kb Caulobacter crescentus genome design. We validated the partitioning strategy by assembling a 20 kb test segment encompassing a difficult to synthesize DNA sequence. Successful assembly from 1 kb subblocks into the 20 kb segment highlights the effectiveness of the Genome Partitioner for reducing synthesis costs and timelines for higher-order DNA assembly. The Genome Partitioner is broadly applicable to translate DNA designs into ready to order sequences that can be assembled with standardized protocols, thus offering new opportunities to harness the diversity of microbial genomes for synthetic biology applications. The Genome Partitioner web tool can be accessed at https://christenlab.ethz.ch/GenomePartitioner.
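
    The abstract does not describe the partitioning algorithm itself. As a rough illustration of what multi-level partitioning means, the sketch below splits a design into fixed-size segments and then splits each segment into smaller subblocks. The 783 kb, 20 kb and 1 kb sizes come from the abstract, but applying 20 kb segments to the whole design is an assumption, and the function is hypothetical. It ignores everything a real tool would need to handle, such as overlaps between adjacent blocks and sequence features that are hard to synthesize.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) coordinates in bp


def partition(start: int, end: int, block_size: int) -> List[Interval]:
    """Split [start, end) into consecutive blocks of at most block_size bp."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return [(s, min(s + block_size, end)) for s in range(start, end, block_size)]


# Level 1: a 783 kb design cut into 20 kb segments (assumed segment size).
segments = partition(0, 783_000, 20_000)
# Level 2: each segment cut into 1 kb subblocks.
subblocks = {seg: partition(seg[0], seg[1], 1_000) for seg in segments}

print(len(segments), "segments; last segment:", segments[-1])
print(len(subblocks[segments[0]]), "subblocks in the first segment")
```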

  19. Genome Partitioner: A web tool for multi-level partitioning of large-scale DNA constructs for synthetic biology applications.

    Directory of Open Access Journals (Sweden)

    Matthias Christen

    Recent advances in lower-cost DNA synthesis techniques have enabled new innovations in the field of synthetic biology. Still, efficient design and higher-order assembly of genome-scale DNA constructs remains a labor-intensive process. Given the complexity, computer assisted design tools that fragment large DNA sequences into fabricable DNA blocks are needed to pave the way towards streamlined assembly of biological systems. Here, we present the Genome Partitioner software implemented as a web-based interface that permits multi-level partitioning of genome-scale DNA designs. Without the need for specialized computing skills, biologists can submit their DNA designs to a fully automated pipeline that generates the optimal retrosynthetic route for higher-order DNA assembly. To test the algorithm, we partitioned a 783 kb Caulobacter crescentus genome design. We validated the partitioning strategy by assembling a 20 kb test segment encompassing a difficult to synthesize DNA sequence. Successful assembly from 1 kb subblocks into the 20 kb segment highlights the effectiveness of the Genome Partitioner for reducing synthesis costs and timelines for higher-order DNA assembly. The Genome Partitioner is broadly applicable to translate DNA designs into ready to order sequences that can be assembled with standardized protocols, thus offering new opportunities to harness the diversity of microbial genomes for synthetic biology applications. The Genome Partitioner web tool can be accessed at https://christenlab.ethz.ch/GenomePartitioner.

  20. Subgrid-scale stresses and scalar fluxes constructed by the multi-scale turnover Lagrangian map

    Science.gov (United States)

    AL-Bairmani, Sukaina; Li, Yi; Rosales, Carlos; Xie, Zheng-tong

    2017-04-01

    The multi-scale turnover Lagrangian map (MTLM) [C. Rosales and C. Meneveau, "Anomalous scaling and intermittency in three-dimensional synthetic turbulence," Phys. Rev. E 78, 016313 (2008)] uses nested multi-scale Lagrangian advection of fluid particles to distort a Gaussian velocity field and, as a result, generate non-Gaussian synthetic velocity fields. Passive scalar fields can be generated with the procedure when the fluid particles carry a scalar property [C. Rosales, "Synthetic three-dimensional turbulent passive scalar fields via the minimal Lagrangian map," Phys. Fluids 23, 075106 (2011)]. The synthetic fields have been shown to possess highly realistic statistics characterizing small scale intermittency, geometrical structures, and vortex dynamics. In this paper, we present a study of the synthetic fields using the filtering approach. This approach, which has not been pursued so far, provides insights on the potential applications of the synthetic fields in large eddy simulations and subgrid-scale (SGS) modelling. The MTLM method is first generalized to model scalar fields produced by an imposed linear mean profile. We then calculate the subgrid-scale stress, SGS scalar flux, SGS scalar variance, as well as related quantities from the synthetic fields. Comparison with direct numerical simulations (DNSs) shows that the synthetic fields reproduce the probability distributions of the SGS energy and scalar dissipation rather well. Related geometrical statistics also display close agreement with DNS results. The synthetic fields slightly under-estimate the mean SGS energy dissipation and slightly over-predict the mean SGS scalar variance dissipation. In general, the synthetic fields tend to slightly under-estimate the probability of large fluctuations for most quantities we have examined. Small scale anisotropy in the scalar field originated from the imposed mean gradient is captured. The sensitivity of the synthetic fields on the input spectra is assessed by