WorldWideScience

Sample records for integrated genome map

  1. INE: a rice genome database with an integrated map view.

    Science.gov (United States)

    Sakata, K; Antonio, B A; Mukai, Y; Nagasaki, H; Sakai, Y; Makino, K; Sasaki, T

    2000-01-01

    The Rice Genome Research Program (RGP) launched a large-scale rice genome sequencing in 1998 aimed at decoding all genetic information in rice. A new genome database called INE (INtegrated rice genome Explorer) has been developed in order to integrate all the genomic information that has been accumulated so far and to correlate these data with the genome sequence. A web interface based on a Java applet provides a rapid viewing capability in the database. The first operational version of the database has been completed, which includes a genetic map and a physical map using YAC (Yeast Artificial Chromosome) clones and PAC (P1-derived Artificial Chromosome) contigs. These maps are displayed graphically so that the positional relationships among the mapped markers on each chromosome can be easily resolved. INE incorporates the sequences and annotations of the PAC contig. A site on low quality information ensures that all submitted sequence data comply with the standard for accuracy. As a repository of rice genome sequence, INE will also serve as a common database of all sequence data obtained by collaborating members of the International Rice Genome Sequencing Project (IRGSP). The database can be accessed at http://www.dna.affrc.go.jp:82/giot/INE.html or its mirror site at http://www.staff.or.jp/giot/INE.html

  2. An Integrated Genetic and Cytogenetic Map of the Cucumber Genome

    Science.gov (United States)

    The Cucurbitaceae includes important crops such as cucumber, melon, watermelon, squash and pumpkin. However, few genetic and genomic resources are available for plant improvement. Some cucurbit species such as cucumber have a narrow genetic base, which impedes construction of saturated molecular li...

  3. Automated integration of genomic physical mapping data via parallel simulated annealing

    Energy Technology Data Exchange (ETDEWEB)

    Slezak, T.

    1994-06-01

    The Human Genome Center at the Lawrence Livermore National Laboratory (LLNL) is nearing closure on a high-resolution physical map of human chromosome 19. We have built automated tools to assemble 15,000 fingerprinted cosmid clones into 800 contigs with minimal spanning paths identified. These islands are being ordered, oriented, and spanned by a variety of other techniques, including fluorescence in situ hybridization (FISH) at 3 levels of resolution, ECO restriction fragment mapping across all contigs, and a multitude of different hybridization and PCR techniques to link cosmid, YAC, AC, PAC, and P1 clones. The FISH data provide us with partial order and distance data as well as orientation. We made the observation that map builders need a much rougher presentation of data than do map readers; the former wish to see raw data since these can expose errors or interesting biology. We further noted that by ignoring our length and distance data we could simplify our problem into one that could be readily attacked with optimization techniques. The data integration problem could then be seen as an M x N ordering of our N cosmid clones which "intersect" M larger objects, by defining "intersection" to mean either contig/map membership or hybridization results. The goal of making an integrated map is then to rearrange the N cosmid clone "columns" such that the number of gaps on the object "rows" is minimized. Our FISH partially-ordered cosmid clones provide us with a set of constraints that cannot be violated by the rearrangement process. We solved the optimization problem via simulated annealing performed on a network of 40+ Unix machines in parallel, using a server/client model built on explicit socket calls. For current maps we can create a map in about 4 hours on the parallel net versus 4+ days on a single workstation. Our biologists are now using this software on a daily basis to guide their efforts toward final closure.
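
    The column-ordering formulation in this abstract can be illustrated with a small sketch. It is not the LLNL software: it runs on a single process rather than a server/client network, and the function names (count_gaps, respects, anneal), the swap move, the linear cooling schedule and the parameters t0 and steps are all assumptions chosen for illustration. Rows are the M larger objects, columns are the N clones, and constraints are (a, b) pairs meaning clone a must stay left of clone b, standing in for the FISH partial order.

```python
import math
import random


def count_gaps(rows, order):
    """Count breaks in the runs of 1s over all rows for a given column order."""
    gaps = 0
    for row in rows:
        runs = 0
        prev = 0
        for col in order:
            if row[col] and not prev:
                runs += 1  # a new run of hits starts here
            prev = row[col]
        gaps += max(runs - 1, 0)  # k runs in a row means k - 1 gaps
    return gaps


def respects(order, constraints):
    """True if every (a, b) pair has column a placed before column b."""
    pos = {col: i for i, col in enumerate(order)}
    return all(pos[a] < pos[b] for a, b in constraints)


def anneal(rows, start, constraints=(), steps=20000, t0=2.0, seed=0):
    """Simulated annealing over column orders; returns (best_order, best_cost)."""
    if not respects(start, constraints):
        raise ValueError("starting order violates the constraints")
    rng = random.Random(seed)
    order = list(start)
    cost = count_gaps(rows, order)
    best, best_cost = order[:], cost
    for step in range(steps):
        temp = t0 * (1 - step / steps) + 1e-9  # assumed linear cooling
        i, j = rng.sample(range(len(order)), 2)
        order[i], order[j] = order[j], order[i]  # propose a swap
        if respects(order, constraints):
            new_cost = count_gaps(rows, order)
            accept = new_cost <= cost or rng.random() < math.exp((cost - new_cost) / temp)
            if accept:
                cost = new_cost
                if cost < best_cost:
                    best, best_cost = order[:], cost
                continue
        order[i], order[j] = order[j], order[i]  # rejected or illegal: undo


    return best, best_cost


# Toy data: 2 objects (rows) hit by 4 clones (columns).
rows = [[1, 0, 1, 0],
        [0, 1, 0, 1]]
best, cost = anneal(rows, [0, 1, 2, 3], constraints=[(0, 3)], steps=2000)
```

    Moves that would break a constraint are rejected outright, which mirrors the abstract's statement that the partial order cannot be violated by the rearrangement process.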

  4. Genome Maps, a new generation genome browser.

    Science.gov (United States)

    Medina, Ignacio; Salavert, Francisco; Sanchez, Rubén; de Maria, Alejandro; Alonso, Roberto; Escobar, Pablo; Bleda, Marta; Dopazo, Joaquín

    2013-07-01

    Genome browsers have gained importance as more genomes and related genomic information become available. However, the increase of information brought about by new generation sequencing technologies is, at the same time, causing a subtle but continuous decrease in the efficiency of conventional genome browsers. Here, we present Genome Maps, a genome browser that implements an innovative model of data transfer and management. The program uses highly efficient technologies from the new HTML5 standard, such as scalable vector graphics, that optimize workloads at both server and client sides and ensure future scalability. Thus, data management and representation are entirely carried out by the browser, without the need of any Java Applet, Flash or other plug-in technology installation. Relevant biological data on genes, transcripts, exons, regulatory features, single-nucleotide polymorphisms, karyotype and so forth, are imported from web services and are available as tracks. In addition, several DAS servers are already included in Genome Maps. As a novelty, this web-based genome browser allows the local upload of huge genomic data files (e.g. VCF or BAM) that can be dynamically visualized in real time at the client side, thus facilitating the management of medical data affected by privacy restrictions. Finally, Genome Maps can easily be integrated in any web application by including only a few lines of code. Genome Maps is an open source collaborative initiative available in the GitHub repository (https://github.com/compbio-bigdata-viz/genome-maps). Genome Maps is available at: http://www.genomemaps.org.

  5. Integrated genome sequence and linkage map of physic nut (Jatropha curcas L.), a biodiesel plant.

    Science.gov (United States)

    Wu, Pingzhi; Zhou, Changpin; Cheng, Shifeng; Wu, Zhenying; Lu, Wenjia; Han, Jinli; Chen, Yanbo; Chen, Yan; Ni, Peixiang; Wang, Ying; Xu, Xun; Huang, Ying; Song, Chi; Wang, Zhiwen; Shi, Nan; Zhang, Xudong; Fang, Xiaohua; Yang, Qing; Jiang, Huawu; Chen, Yaping; Li, Meiru; Wang, Ying; Chen, Fan; Wang, Jun; Wu, Guojiang

    2015-03-01

    The family Euphorbiaceae includes some of the most efficient biomass accumulators. Whole genome sequencing and the development of genetic maps of these species are important components in molecular breeding and genetic improvement. Here we report the draft genome of physic nut (Jatropha curcas L.), a biodiesel plant. The assembled genome has a total length of 320.5 Mbp and contains 27,172 putative protein-coding genes. We established a linkage map containing 1208 markers and anchored the genome assembly (81.7%) to this map to produce 11 pseudochromosomes. After gene family clustering, 15,268 families were identified, of which 13,887 existed in the castor bean genome. Analysis of the genome highlighted specific expansion and contraction of a number of gene families during the evolution of this species, including the ribosome-inactivating proteins and oil biosynthesis pathway enzymes. The genomic sequence and linkage map provide a valuable resource not only for fundamental and applied research on physic nut but also for evolutionary and comparative genomics analysis, particularly in the Euphorbiaceae. © 2015 The Authors The Plant Journal © 2015 John Wiley & Sons Ltd.

  6. Construction of reference chromosome-scale pseudomolecules for potato: integrating the potato genome with genetic and physical maps.

    Science.gov (United States)

    Sharma, Sanjeev Kumar; Bolser, Daniel; de Boer, Jan; Sønderkær, Mads; Amoros, Walter; Carboni, Martin Federico; D'Ambrosio, Juan Martín; de la Cruz, German; Di Genova, Alex; Douches, David S; Eguiluz, Maria; Guo, Xiao; Guzman, Frank; Hackett, Christine A; Hamilton, John P; Li, Guangcun; Li, Ying; Lozano, Roberto; Maass, Alejandro; Marshall, David; Martinez, Diana; McLean, Karen; Mejía, Nilo; Milne, Linda; Munive, Susan; Nagy, Istvan; Ponce, Olga; Ramirez, Manuel; Simon, Reinhard; Thomson, Susan J; Torres, Yerisf; Waugh, Robbie; Zhang, Zhonghua; Huang, Sanwen; Visser, Richard G F; Bachem, Christian W B; Sagredo, Boris; Feingold, Sergio E; Orjeda, Gisella; Veilleux, Richard E; Bonierbale, Merideth; Jacobs, Jeanne M E; Milbourne, Dan; Martin, David Michael Alan; Bryan, Glenn J

    2013-11-06

    The genome of potato, a major global food crop, was recently sequenced. The work presented here details the integration of the potato reference genome (DM) with a new sequence-tagged site marker-based linkage map and other physical and genetic maps of potato and the closely related species tomato. Primary anchoring of the DM genome assembly was accomplished by the use of a diploid segregating population, which was genotyped with several types of molecular genetic markers to construct a new ~936 cM linkage map comprising 2469 marker loci. In silico anchoring approaches used genetic and physical maps from the diploid potato genotype RH89-039-16 (RH) and tomato. This combined approach has allowed 951 superscaffolds to be ordered into pseudomolecules corresponding to the 12 potato chromosomes. These pseudomolecules represent 674 Mb (~93%) of the 723 Mb genome assembly and 37,482 (~96%) of the 39,031 predicted genes. The superscaffold order and orientation within the pseudomolecules are closely collinear with independently constructed high density linkage maps. Comparisons between marker distribution and physical location reveal regions of greater and lesser recombination, as well as regions exhibiting significant segregation distortion. The work presented here has led to a greatly improved ordering of the potato reference genome superscaffolds into chromosomal "pseudomolecules".
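
    The anchoring percentages quoted in this abstract follow directly from the reported totals. A minimal check, using only the figures given above (the helper name percent is an assumption for illustration):

```python
def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Figures as reported in the abstract.
anchored_mb, assembly_mb = 674, 723
anchored_genes, predicted_genes = 37482, 39031

print(f"{percent(anchored_mb, assembly_mb):.1f}% of the assembly anchored")
print(f"{percent(anchored_genes, predicted_genes):.1f}% of predicted genes anchored")
```

    Both values round to the approximate figures stated in the abstract (~93% and ~96%).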

  7. Development and Integration of Genome-Wide Polymorphic Microsatellite Markers onto a Reference Linkage Map for Constructing a High-Density Genetic Map of Chickpea.

    Directory of Open Access Journals (Sweden)

    Yash Paul Khajuria

    Full Text Available The identification of informative in silico polymorphic genomic and genic microsatellite markers by comparing the genome and transcriptome sequences of crop genotypes is a rapid, cost-effective and non-laborious approach for large-scale marker validation and genotyping applications, including construction of high-density genetic maps. We designed 1494 markers, including 1016 genomic and 478 transcript-derived microsatellite markers showing in silico fragment length polymorphism between two parental genotypes (Cicer arietinum ICC4958 and C. reticulatum PI489777) of an inter-specific reference mapping population. High amplification efficiency (87%), experimental validation success rate (81%) and polymorphic potential (55%) of these microsatellite markers suggest their effective use in various applications of chickpea genetics and breeding. Intra-specific polymorphic potential (48%) detected by microsatellite markers in 22 desi and kabuli chickpea genotypes was lower than inter-specific polymorphic potential (59%). An advanced, high-density, integrated and inter-specific chickpea genetic map (ICC4958 x PI489777) having 1697 map positions spanning 1061.16 cM with an average inter-marker distance of 0.625 cM was constructed by assigning 634 novel informative transcript-derived and genomic microsatellite markers on eight linkage groups (LGs) of our previously documented 1063-marker-based genetic map. The constructed genome map identified 88, including four major (7-23 cM) longest high-resolution genomic regions on LGs 3, 5 and 8, where the maximum number of novel genomic and genic microsatellite markers were specifically clustered within 1 cM genetic distance. This is the first time in chickpea that in silico FLP analysis at genome-wide level was carried out and such a large number of microsatellite markers were identified, experimentally validated and further used in genetic mapping. To the best of our knowledge, in the presently constructed genetic map, we mapped

  8. Construction of an integrated genetic linkage map for the A genome of Brassica napus using SSR markers derived from sequenced BACs in B. rapa

    Directory of Open Access Journals (Sweden)

    King Graham J

    2010-10-01

    Full Text Available Abstract Background The Multinational Brassica rapa Genome Sequencing Project (BrGSP) has developed valuable genomic resources, including BAC libraries, BAC-end sequences, genetic and physical maps, and seed BAC sequences for Brassica rapa. An integrated linkage map between the amphidiploid B. napus and diploid B. rapa will facilitate the rapid transfer of these valuable resources from B. rapa to B. napus (oilseed rape, canola). Results In this study, we identified over 23,000 simple sequence repeats (SSRs) from 536 sequenced BACs. 890 SSR markers (designated as BrGMS) were developed and used for the construction of an integrated linkage map for the A genome in B. rapa and B. napus. Two hundred and nineteen BrGMS markers were integrated into an existing B. napus linkage map (BnaNZDH). Among these mapped BrGMS markers, 168 were distributed only on the A genome linkage groups (LGs), 18 were distributed on both the A and C genome LGs, and 33 were distributed only on the C genome LGs. Most of the A genome LGs in B. napus were collinear with the homoeologous LGs in B. rapa, although minor inversions or rearrangements occurred on A2 and A9. The mapping of these BAC-specific SSR markers enabled assignment of 161 sequenced B. rapa BACs, as well as the associated BAC contigs, to the A genome LGs of B. napus. Conclusion The genetic mapping of SSR markers derived from sequenced BACs in B. rapa enabled direct links to be established between the B. napus linkage map and a B. rapa physical map, and thus the assignment of B. rapa BACs and the associated BAC contigs to the B. napus linkage map. This integrated genetic linkage map will facilitate exploitation of the B. rapa annotated genomic resources for gene tagging and map-based cloning in B. napus, and for comparative analysis of the A genome within Brassica species.

  9. Toward allotetraploid cotton genome assembly: integration of a high-density molecular genetic linkage map with DNA sequence information

    Science.gov (United States)

    2012-01-01

    Background Cotton is the world’s most important natural textile fiber and a significant oilseed crop. Decoding cotton genomes will provide the ultimate reference and resource for research and utilization of the species. Integration of high-density genetic maps with genomic sequence information will largely accelerate the process of whole-genome assembly in cotton. Results In this paper, we update a high-density interspecific genetic linkage map of allotetraploid cultivated cotton. An additional 1,167 marker loci have been added to our previously published map of 2,247 loci. Three new marker types, InDel (insertion-deletion) and SNP (single nucleotide polymorphism) developed from gene information, and REMAP (retrotransposon-microsatellite amplified polymorphism), were used to increase map density. The updated map consists of 3,414 loci in 26 linkage groups covering 3,667.62 cM with an average inter-locus distance of 1.08 cM. Furthermore, genome-wide sequence analysis was finished using 3,324 informative sequence-based markers and publicly-available Gossypium DNA sequence information. A total of 413,113 EST and 195 BAC sequences were physically anchored and clustered by 3,324 sequence-based markers. Of these, 14,243 ESTs and 188 BACs from different species of Gossypium were clustered and specifically anchored to the high-density genetic map. A total of 2,748 candidate unigenes from 2,111 ESTs clusters and 63 BACs were mined for functional annotation and classification. The 337 ESTs/genes related to fiber quality traits were integrated with 132 previously reported cotton fiber quality quantitative trait loci, which demonstrated the important roles in fiber quality of these genes. Higher-level sequence conservation between different cotton species and between the A- and D-subgenomes in tetraploid cotton was found, indicating a common evolutionary origin for orthologous and paralogous loci in Gossypium. Conclusion This study will serve as a valuable genomic resource

  10. SBH and the integration of complementary approaches in the mapping, sequencing, and understanding of complex genomes

    Energy Technology Data Exchange (ETDEWEB)

    Drmanac, R.; Drmanac, S.; Labat, I.; Vicentic, A.; Gemmell, A.; Stavropoulos, N.; Jarvis, J.

    1992-01-01

    A variant of sequencing by hybridization (SBH) is being developed with a potential to inexpensively determine up to 100 million base pairs per year. The method comprises (1) arraying short clones in 864-well plates; (2) growth of the M13 clones or PCR of the inserts; (3) automated spotting of DNAs by corresponding pin-arrays; (4) hybridization of dotted samples with 200-3000 32P- or 33P-labeled 6- to 8-mer probes; and (5) scoring hybridization signals using storage phosphor plates. Some 200 7- to 8-mers can provide an inventory of the genes if cDNA clones are hybridized, or can define the order of 2-kb genomic clones, creating physical and structural maps with 100-bp resolution; the distribution of G+C, LINEs, SINEs, and gene families would be revealed. cDNAs that represent new genes and genomic clones in regions of interest selected by SBH can be sequenced by a gel method. Uniformly distributed clones from the previous step will be hybridized with 2000-3000 6- to 8-mers. As a result, approximately 50-60% of the genomic regions containing members of large repetitive and gene families and those families represented in GenBank would be completely sequenced. In the less redundant regions, every base pair is expected to be read with 3-4 probes, but the complete sequence cannot be reconstructed. Such partial sequences allow the inference of similarity and the recognition of coding, regulatory, and repetitive sequences, as well as study of the evolutionary processes all the way up to the species delineation.

  11. SBH and the integration of complementary approaches in the mapping, sequencing, and understanding of complex genomes

    Energy Technology Data Exchange (ETDEWEB)

    Drmanac, R.; Drmanac, S.; Labat, I.; Vicentic, A.; Gemmell, A.; Stavropoulos, N.; Jarvis, J.

    1992-12-01

    A variant of sequencing by hybridization (SBH) is being developed with a potential to inexpensively determine up to 100 million base pairs per year. The method comprises (1) arraying short clones in 864-well plates; (2) growth of the M13 clones or PCR of the inserts; (3) automated spotting of DNAs by corresponding pin-arrays; (4) hybridization of dotted samples with 200-3000 32P- or 33P-labeled 6- to 8-mer probes; and (5) scoring hybridization signals using storage phosphor plates. Some 200 7- to 8-mers can provide an inventory of the genes if cDNA clones are hybridized, or can define the order of 2-kb genomic clones, creating physical and structural maps with 100-bp resolution; the distribution of G+C, LINEs, SINEs, and gene families would be revealed. cDNAs that represent new genes and genomic clones in regions of interest selected by SBH can be sequenced by a gel method. Uniformly distributed clones from the previous step will be hybridized with 2000-3000 6- to 8-mers. As a result, approximately 50-60% of the genomic regions containing members of large repetitive and gene families and those families represented in GenBank would be completely sequenced. In the less redundant regions, every base pair is expected to be read with 3-4 probes, but the complete sequence cannot be reconstructed. Such partial sequences allow the inference of similarity and the recognition of coding, regulatory, and repetitive sequences, as well as study of the evolutionary processes all the way up to the species delineation.

  12. SBH and the integration of complementary approaches in the mapping, sequencing, and understanding of complex genomes

    International Nuclear Information System (INIS)

    Drmanac, R.; Drmanac, S.; Labat, I.; Vicentic, A.; Gemmell, A.; Stavropoulos, N.; Jarvis, J.

    1992-01-01

    A variant of sequencing by hybridization (SBH) is being developed with a potential to inexpensively determine up to 100 million base pairs per year. The method comprises (1) arraying short clones in 864-well plates; (2) growth of the M13 clones or PCR of the inserts; (3) automated spotting of DNAs by corresponding pin-arrays; (4) hybridization of dotted samples with 200-3000 32P- or 33P-labeled 6- to 8-mer probes; and (5) scoring hybridization signals using storage phosphor plates. Some 200 7- to 8-mers can provide an inventory of the genes if cDNA clones are hybridized, or can define the order of 2-kb genomic clones, creating physical and structural maps with 100-bp resolution; the distribution of G+C, LINEs, SINEs, and gene families would be revealed. cDNAs that represent new genes and genomic clones in regions of interest selected by SBH can be sequenced by a gel method. Uniformly distributed clones from the previous step will be hybridized with 2000-3000 6- to 8-mers. As a result, approximately 50-60% of the genomic regions containing members of large repetitive and gene families and those families represented in GenBank would be completely sequenced. In the less redundant regions, every base pair is expected to be read with 3-4 probes, but the complete sequence cannot be reconstructed. Such partial sequences allow the inference of similarity and the recognition of coding, regulatory, and repetitive sequences, as well as study of the evolutionary processes all the way up to the species delineation.

  13. Identification and Mapping of Simple Sequence Repeat Markers from Common Bean (Phaseolus vulgaris L.) Bacterial Artificial Chromosome End Sequences for Genome Characterization and Genetic–Physical Map Integration

    Directory of Open Access Journals (Sweden)

    Juana M. Córdoba

    2010-11-01

    Full Text Available Microsatellite markers or simple sequence repeat (SSR) loci are useful for diversity characterization and genetic–physical mapping. Different in silico microsatellite search methods have been developed for mining bacterial artificial chromosome (BAC) end sequences for SSRs. The overall goal of this study was genome characterization based on SSRs in 89,017 BAC end sequences (BESs) from the G19833 common bean (Phaseolus vulgaris L.) library. Another objective was to identify new SSRs taking into account three tandem motif identification programs (Automated Microsatellite Marker Development [AMMD], Tandem Repeats Finder [TRF], and SSRLocator [SSRL]). Among the microsatellite search engines, SSRL identified the highest number of SSRs; however, when primer design was attempted, the number dropped due to poor primer design regions. Automated Microsatellite Marker Development software identified many SSRs with valuable AT/TA or AG/TC motifs, while TRF found fewer SSRs and produced no primers. A subgroup of 323 AT-rich, di-, and trinucleotide SSRs were selected from the AMMD results and used in a parental survey with DOR364 and G19833, of which 75 could be mapped in the corresponding population; these represented 4052 BAC clones. Together with 92 previously mapped BES- and 114 non-BES-derived markers, a total of 280 SSRs were included in the polymerase chain reaction (PCR)-based map, integrating a total of 8232 BAC clones in 162 contigs from the physical map.

  14. A first generation integrated physical and genetic map of the rainbow trout genome

    Science.gov (United States)

    The rainbow trout physical map was previously constructed from DNA fingerprinting of 192,096 BAC clones using the 4-color high-information content fingerprinting (HICF) method. The clones were assembled into physical map contigs using the finger-printing contig (FPC) program. The map is composed of ...

  15. An integrated map of genetic variation from 1,092 human genomes

    DEFF Research Database (Denmark)

    Abecasis, Goncalo R.; Auton, Adam; Brooks, Lisa D.

    2012-01-01

    By characterizing the geographic and functional spectrum of human genetic variation, the 1000 Genomes Project aims to build a resource to help to understand the genetic contribution to disease. Here we describe the genomes of 1,092 individuals from 14 populations, constructed using a combination ...

  16. Bioinformatics of genomic association mapping

    NARCIS (Netherlands)

    Vaez Barzani, Ahmad

    2015-01-01

    In this thesis we present an overview of bioinformatics-based approaches for genomic association mapping, with emphasis on human quantitative traits and their contribution to complex diseases. We aim to provide a comprehensive walk-through of the classic steps of genomic association mapping

  17. Safeguarding genome integrity

    DEFF Research Database (Denmark)

    Sørensen, Claus Storgaard; Syljuåsen, Randi G

    2012-01-01

    Mechanisms that preserve genome integrity are highly important during the normal life cycle of human cells. Loss of genome protective mechanisms can lead to the development of diseases such as cancer. Checkpoint kinases function in the cellular surveillance pathways that help cells to cope with D...

  18. Statistical Methods in Integrative Genomics

    Science.gov (United States)

    Richardson, Sylvia; Tseng, George C.; Sun, Wei

    2016-01-01

    Statistical methods in integrative genomics aim to answer important biology questions by jointly analyzing multiple types of genomic data (vertical integration) or aggregating the same type of data across multiple studies (horizontal integration). In this article, we introduce different types of genomic data and data resources, and then review statistical methods of integrative genomics, with emphasis on the motivation and rationale of these methods. We conclude with some summary points and future research directions. PMID:27482531

  19. Genome Variation Map: a data repository of genome variations in BIG Data Center

    OpenAIRE

    Song, Shuhui; Tian, Dongmei; Li, Cuiping; Tang, Bixia; Dong, Lili; Xiao, Jingfa; Bao, Yiming; Zhao, Wenming; He, Hang; Zhang, Zhang

    2017-01-01

    Abstract The Genome Variation Map (GVM; http://bigd.big.ac.cn/gvm/) is a public data repository of genome variations. As a core resource in the BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, GVM is dedicated to collecting, integrating and visualizing genome variations for a wide range of species, accepts submissions of different types of genome variations from all over the world and provides free open access to all publicly available data in support of worldwide research a...

  20. phiGENOME: an integrative navigation throughout bacteriophage genomes.

    Science.gov (United States)

    Stano, Matej; Klucar, Lubos

    2011-11-01

    phiGENOME is a web-based genome browser generating dynamic and interactive graphical representation of phage genomes stored in the phiSITE, database of gene regulation in bacteriophages. phiGENOME is an integral part of the phiSITE web portal (http://www.phisite.org/phigenome) and it was optimised for visualisation of phage genomes with the emphasis on the gene regulatory elements. phiGENOME consists of three components: (i) genome map viewer built using Adobe Flash technology, providing dynamic and interactive graphical display of phage genomes; (ii) sequence browser based on precisely formatted HTML tags, providing detailed exploration of genome features on the sequence level and (iii) regulation illustrator, based on Scalable Vector Graphics (SVG) and designed for graphical representation of gene regulations. Bringing 542 complete genome sequences accompanied with their rich annotations and references, makes phiGENOME a unique information resource in the field of phage genomics. Copyright © 2011 Elsevier Inc. All rights reserved.

  1. Complete genome sequence and integrated protein localization and interaction map for alfalfa dwarf virus, which combines properties of both cytoplasmic and nuclear plant rhabdoviruses

    Energy Technology Data Exchange (ETDEWEB)

    Bejerman, Nicolás, E-mail: n.bejerman@uq.edu.au [Instituto de Patología Vegetal (IPAVE), Centro de Investigaciones Agropecuarias (CIAP), Instituto Nacional de Tecnología Agropecuaria INTA, Camino a 60 Cuadras k 5,5, Córdoba X5020ICA (Argentina); Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, St Lucia, QLD 4072 (Australia); Giolitti, Fabián; Breuil, Soledad de; Trucco, Verónica; Nome, Claudia; Lenardon, Sergio [Instituto de Patología Vegetal (IPAVE), Centro de Investigaciones Agropecuarias (CIAP), Instituto Nacional de Tecnología Agropecuaria INTA, Camino a 60 Cuadras k 5,5, Córdoba X5020ICA (Argentina); Dietzgen, Ralf G. [Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, St Lucia, QLD 4072 (Australia)

    2015-09-15

    Summary: We have determined the full-length 14,491-nucleotide genome sequence of a new plant rhabdovirus, alfalfa dwarf virus (ADV). Seven open reading frames (ORFs) were identified in the antigenomic orientation of the negative-sense, single-stranded viral RNA, in the order 3′-N-P-P3-M-G-P6-L-5′. The ORFs are separated by conserved intergenic regions and the genome coding region is flanked by complementary 3′ leader and 5′ trailer sequences. Phylogenetic analysis of the nucleoprotein amino acid sequence indicated that this alfalfa-infecting rhabdovirus is related to viruses in the genus Cytorhabdovirus. When transiently expressed as GFP fusions in Nicotiana benthamiana leaves, most ADV proteins accumulated in the cell periphery, but unexpectedly P protein was localized exclusively in the nucleus. ADV P protein was shown by bimolecular fluorescence complementation to have homotypic nuclear interactions and heterotypic nuclear interactions with N, P3 and M proteins. ADV appears unique in that it combines properties of both cytoplasmic and nuclear plant rhabdoviruses. - Highlights: • The complete genome of alfalfa dwarf virus is obtained. • An integrated localization and interaction map for ADV is determined. • ADV has genome sequence similarity and evolutionary links with cytorhabdoviruses. • ADV protein localization and interaction data show an association with the nucleus. • ADV combines properties of both cytoplasmic and nuclear plant rhabdoviruses.

  2. Complete genome sequence and integrated protein localization and interaction map for alfalfa dwarf virus, which combines properties of both cytoplasmic and nuclear plant rhabdoviruses

    International Nuclear Information System (INIS)

    Bejerman, Nicolás; Giolitti, Fabián; Breuil, Soledad de; Trucco, Verónica; Nome, Claudia; Lenardon, Sergio; Dietzgen, Ralf G.

    2015-01-01

    Summary: We have determined the full-length 14,491-nucleotide genome sequence of a new plant rhabdovirus, alfalfa dwarf virus (ADV). Seven open reading frames (ORFs) were identified in the antigenomic orientation of the negative-sense, single-stranded viral RNA, in the order 3′-N-P-P3-M-G-P6-L-5′. The ORFs are separated by conserved intergenic regions and the genome coding region is flanked by complementary 3′ leader and 5′ trailer sequences. Phylogenetic analysis of the nucleoprotein amino acid sequence indicated that this alfalfa-infecting rhabdovirus is related to viruses in the genus Cytorhabdovirus. When transiently expressed as GFP fusions in Nicotiana benthamiana leaves, most ADV proteins accumulated in the cell periphery, but unexpectedly P protein was localized exclusively in the nucleus. ADV P protein was shown by bimolecular fluorescence complementation to have homotypic nuclear interactions and heterotypic nuclear interactions with N, P3 and M proteins. ADV appears unique in that it combines properties of both cytoplasmic and nuclear plant rhabdoviruses. - Highlights: • The complete genome of alfalfa dwarf virus is obtained. • An integrated localization and interaction map for ADV is determined. • ADV has genome sequence similarity and evolutionary links with cytorhabdoviruses. • ADV protein localization and interaction data show an association with the nucleus. • ADV combines properties of both cytoplasmic and nuclear plant rhabdoviruses.

  3. Mapping copy number variation by population-scale genome sequencing

    DEFF Research Database (Denmark)

    Mills, Ryan E.; Walter, Klaudia; Stewart, Chip

    2011-01-01

    Genomic structural variants (SVs) are abundant in humans, differing from other forms of variation in extent, origin and functional impact. Despite progress in SV characterization, the nucleotide resolution architecture of most SVs remains unknown. We constructed a map of unbalanced SVs (that is, copy number variants) based on whole genome DNA sequencing data from 185 human genomes, integrating evidence from complementary SV discovery approaches with extensive experimental validations. Our map encompassed 22,025 deletions and 6,000 additional SVs, including insertions and tandem duplications...

  4. RatMap--rat genome tools and data.

    Science.gov (United States)

    Petersen, Greta; Johnson, Per; Andersson, Lars; Klinga-Levan, Karin; Gómez-Fabre, Pedro M; Ståhl, Fredrik

    2005-01-01

    The rat genome database RatMap (http://ratmap.org or http://ratmap.gen.gu.se) has been one of the main resources for rat genome information since 1994. The database is maintained by CMB-Genetics at Goteborg University in Sweden and provides information on rat genes, polymorphic rat DNA-markers and rat quantitative trait loci (QTLs), all curated at RatMap. The database is under the supervision of the Rat Gene and Nomenclature Committee (RGNC); thus much attention is paid to rat gene nomenclature. RatMap presents information on rat idiograms, karyotypes and provides a unified presentation of the rat genome sequence and integrated rat linkage maps. A set of tools is also available to facilitate the identification and characterization of rat QTLs, as well as the estimation of exon/intron number and sizes in individual rat genes. Furthermore, comparative gene maps of rat in regard to mouse and human are provided.

  5. Body maps on the human genome.

    Science.gov (United States)

    Cherniak, Christopher; Rodriguez-Esteban, Raul

    2013-12-20

    Chromosomes have territories, or preferred locales, in the cell nucleus. When these sites are taken into account, some large-scale structure of the human genome emerges. The synoptic picture is that genes highly expressed in particular topologically compact tissues are not randomly distributed on the genome. Rather, such tissue-specific genes tend to map somatotopically onto the complete chromosome set. They seem to form a "genome homunculus": a multi-dimensional, genome-wide body representation extending across chromosome territories of the entire spermcell nucleus. The antero-posterior axis of the body significantly corresponds to the head-tail axis of the nucleus, and the dorso-ventral body axis to the central-peripheral nucleus axis. This large-scale genomic structure includes thousands of genes. One rationale for a homuncular genome structure would be to minimize connection costs in genetic networks. Somatotopic maps in cerebral cortex have been reported for over a century.

  6. The Amaranth Genome: Genome, Transcriptome, and Physical Map Assembly

    Directory of Open Access Journals (Sweden)

    J. W. Clouse

    2016-03-01

    Amaranth ( L. is an emerging pseudocereal native to the New World that has garnered increased attention in recent years because of its nutritional quality, in particular its seed protein and more specifically its high levels of the essential amino acid lysine. It belongs to the Amaranthaceae family, is an ancient paleopolyploid that shows disomic inheritance (2n = 32), and has an estimated genome size of 466 Mb. Here we present a high-quality draft genome sequence of the grain amaranth. The genome assembly consisted of 377 Mb in 3518 scaffolds with an N50 of 371 kb. Repetitive element analysis predicted that 48% of the genome is comprised of repeat sequences, of which -like elements were the most commonly classified retrotransposon. A de novo transcriptome consisting of 66,370 contigs was assembled from eight different amaranth tissue and abiotic stress libraries. Annotation of the genome identified 23,059 protein-coding genes. Seven grain amaranths (, , and and their putative progenitor ( were resequenced. A single nucleotide polymorphism (SNP) phylogeny supported the classification of as the progenitor species of the grain amaranths. Lastly, we generated a de novo physical map for using the BioNano Genomics’ Genome Mapping platform. The physical map spanned 340 Mb and a hybrid assembly using the BioNano physical maps nearly doubled the N50 of the assembly to 697 kb. Moreover, we analyzed synteny between amaranth and sugar beet ( L. and estimated, using analysis, the age of the most recent polyploidization event in amaranth.
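
    The scaffold statistic quoted in this abstract (371 kb, rising to 697 kb in the hybrid assembly) is presumably the N50, the usual contiguity measure for genome assemblies: the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the assembly. A minimal sketch of that calculation follows; the function name `n50` and the toy lengths are illustrative assumptions, not taken from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths.

    Scaffolds are visited from longest to shortest; the N50 is the length
    of the scaffold at which the running total first reaches half of the
    total assembly size.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must contain at least one scaffold")


# Toy example (hypothetical lengths, not amaranth data).
toy_scaffolds = [2, 2, 2, 3, 3, 4, 8, 8]
toy_n50 = n50(toy_scaffolds)
```

    Because the statistic depends only on the length distribution, merging scaffolds with an independent physical map can raise it without adding any new sequence, which is consistent with the near doubling reported above.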

  7. Feynman maps without improper integrals

    International Nuclear Information System (INIS)

    Exner, P.; Kolerov, G.I.

    1980-01-01

    The Feynman maps introduced first by Truman are examined. The domain considered here consists of the Fresnel-integrable functions in the sense of Albeverio and Hoegh-Krohn. The original definition of the F-maps is slightly modified: it is started from the underlying measures on the Hilbert space of paths in order to avoid use of improper integrals. Some new properties of the F-maps are derived. In particular, the dominated convergence theorem is shown to be not valid for the F_1-map (or Feynman integral); this fact is of a certain importance for the classical limit of quantum mechanics.

  8. Towards mapping the Dioscorea genome

    International Nuclear Information System (INIS)

    Terauchi, R.; Kahl, G.

    1998-01-01

    Yams are important starchy tuber crops in (sub-) tropical countries of the world. Despite their importance in the regional economy, no serious attempt has been made toward their improvement. In order to obtain basic knowledge of the genetics of yams, we are trying to establish a linkage map of a wild yam species, Dioscorea tokoro. So far, six allozyme markers, six STMS markers and twenty AFLP markers have been identified. They will be used for linkage mapping of a population comprising 80 progeny obtained from a controlled cross. (author)

  9. PICMI: mapping point mutations on genomes.

    KAUST Repository

    Le Pera, Loredana; Marcatili, Paolo; Tramontano, Anna

    2010-01-01

    MOTIVATION: Several international collaborations and local projects are producing extensive catalogues of genomic variations that are supplementing existing collections such as the OMIM catalogue. The flood of this type of data will keep increasing and, especially, it will be relevant to a wider user base, including not only molecular biologists, geneticists and bioinformaticians, but also clinical researchers. Mapping the observed variations, sometimes only described at the amino acid level, on a genome, identifying whether they affect a gene and, if so, whether they also affect different isoforms of the same gene, is a time-consuming and often frustrating task. RESULTS: The PICMI server is an easy-to-use tool for quickly mapping one or more amino acid or nucleotide variations on a genome and its products, including alternatively spliced isoforms. AVAILABILITY: The server is available at www.biocomputing.it/picmi.

  10. PICMI: mapping point mutations on genomes.

    KAUST Repository

    Le Pera, Loredana

    2010-10-12

    MOTIVATION: Several international collaborations and local projects are producing extensive catalogues of genomic variations that are supplementing existing collections such as the OMIM catalogue. The flood of this type of data will keep increasing and, especially, it will be relevant to a wider user base, including not only molecular biologists, geneticists and bioinformaticians, but also clinical researchers. Mapping the observed variations, sometimes only described at the amino acid level, on a genome, identifying whether they affect a gene and, if so, whether they also affect different isoforms of the same gene, is a time-consuming and often frustrating task. RESULTS: The PICMI server is an easy-to-use tool for quickly mapping one or more amino acid or nucleotide variations on a genome and its products, including alternatively spliced isoforms. AVAILABILITY: The server is available at www.biocomputing.it/picmi.

  11. Integrating genomics into evolutionary medicine.

    Science.gov (United States)

    Rodríguez, Juan Antonio; Marigorta, Urko M; Navarro, Arcadi

    2014-12-01

    The application of the principles of evolutionary biology into medicine was suggested long ago and is already providing insight into the ultimate causes of disease. However, a full systematic integration of medical genomics and evolutionary medicine is still missing. Here, we briefly review some cases where the combination of the two fields has proven profitable and highlight two of the main issues hindering the development of evolutionary genomic medicine as a mature field, namely the dissociation between fitness and health and the still considerable difficulties in predicting phenotypes from genotypes. We use publicly available data to illustrate both problems and conclude that new approaches are needed for evolutionary genomic medicine to overcome these obstacles. Copyright © 2014 Elsevier Ltd. All rights reserved.

  12. Mapping the space of genomic signatures.

    Directory of Open Access Journals (Sweden)

    Lila Kari

    We propose a computational method to measure and visualize interrelationships among any number of DNA sequences allowing, for example, the examination of hundreds or thousands of complete mitochondrial genomes. An "image distance" is computed for each pair of graphical representations of DNA sequences, and the distances are visualized as a Molecular Distance Map: Each point on the map represents a DNA sequence, and the spatial proximity between any two points reflects the degree of structural similarity between the corresponding sequences. The graphical representation of DNA sequences utilized, Chaos Game Representation (CGR), is genome- and species-specific and can thus act as a genomic signature. Consequently, Molecular Distance Maps could inform species identification, taxonomic classifications and, to a certain extent, evolutionary history. The image distance employed, Structural Dissimilarity Index (DSSIM), implicitly compares the occurrences of oligomers of length up to k (herein k = 9) in DNA sequences. We computed DSSIM distances for more than 5 million pairs of complete mitochondrial genomes, and used Multi-Dimensional Scaling (MDS) to obtain Molecular Distance Maps that visually display the sequence relatedness in various subsets, at different taxonomic levels. This general-purpose method does not require DNA sequence alignment and can thus be used to compare similar or vastly different DNA sequences, genomic or computer-generated, of the same or different lengths. We illustrate potential uses of this approach by applying it to several taxonomic subsets: phylum Vertebrata, (super)kingdom Protista, classes Amphibia-Insecta-Mammalia, class Amphibia, and order Primates. This analysis of an extensive dataset confirms that the oligomer composition of full mtDNA sequences can be a source of taxonomic information. This method also correctly finds the mtDNA sequences most closely related to that of the anatomically modern human (the Neanderthal
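
    The Chaos Game Representation named in this abstract can be sketched in a few lines. The version below is a generic illustration, not the authors' implementation: the corner assignment in `CORNERS`, the function names, and the choice to skip ambiguous bases are all assumptions, and the DSSIM image distance and the MDS step are not shown. Starting from the centre of the unit square, each nucleotide moves the current point halfway toward that nucleotide's corner; binning the points on a 2^k by 2^k grid then counts occurrences of each k-mer.

```python
# Assumed corner layout; published CGR variants differ in this choice.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(sequence):
    """Return the CGR trajectory of a DNA sequence as (x, y) tuples."""
    x, y = 0.5, 0.5  # start at the centre of the unit square
    points = []
    for base in sequence.upper():
        if base not in CORNERS:
            continue  # simplification: ambiguous bases (e.g. N) are skipped
        corner_x, corner_y = CORNERS[base]
        # Move halfway from the current point toward the base's corner.
        x, y = (x + corner_x) / 2, (y + corner_y) / 2
        points.append((x, y))
    return points


def cgr_grid(sequence, k):
    """Count CGR points on a 2**k x 2**k grid (one cell per k-mer)."""
    size = 2 ** k
    grid = [[0] * size for _ in range(size)]
    # The first k - 1 points do not yet correspond to a complete k-mer.
    for x, y in cgr_points(sequence)[k - 1:]:
        col = min(int(x * size), size - 1)
        row = min(int(y * size), size - 1)
        grid[row][col] += 1
    return grid


example_grid = cgr_grid("ACGT", 1)  # each base lands in its own quadrant
```

    Two such grids, rendered as grayscale images, are the kind of input an image distance would compare; with k = 9 the grid is 512 by 512 cells.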

  13. Integration of linkage maps for the Amphidiploid Brassica napus and comparative mapping with Arabidopsis and Brassica rapa

    Directory of Open Access Journals (Sweden)

    Delourme Régine

    2011-02-01

    Background: The large number of genetic linkage maps representing Brassica chromosomes constitute a potential platform for studying crop traits and genome evolution within Brassicaceae. However, the alignment of existing maps remains a major challenge. The integration of these genetic maps will enhance genetic resolution, and provide a means to navigate between sequence-tagged loci, and with contiguous genome sequences as these become available. Results: We report the first genome-wide integration of Brassica maps based on an automated pipeline which involved collation of genome-wide genotype data for sequence-tagged markers scored on three extensively used amphidiploid Brassica napus (2n = 38) populations. Representative markers were selected from consolidated maps for each population, and skeleton bin maps were generated. The skeleton maps for the three populations were then combined to generate an integrated map for each LG, comparing two different approaches, one encapsulated in JoinMap and the other in MergeMap. The BnaWAIT_01_2010a integrated genetic map was generated using JoinMap, and includes 5,162 genetic markers mapped onto 2,196 loci, with a total genetic length of 1,792 cM. The map density of one locus every 0.82 cM, corresponding to 515 Kbp, increases the locus and marker density of the original maps at least three-fold. Within the B. napus integrated map we identified 103 conserved collinearity blocks relative to Arabidopsis, including five previously unreported blocks. The BnaWAIT_01_2010a map was used to investigate the integrity and conservation of order proposed for genome sequence scaffolds generated from the constituent A genome of Brassica rapa. Conclusions: Our results provide a comprehensive genetic integration of the B. napus genome from a range of sources, which we anticipate will provide valuable information for rapeseed and Canola research.
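
    The density figures in this abstract can be checked with simple arithmetic. The short calculation below uses only the numbers quoted above; the variable names are illustrative, and the implied genome size is an inference from the quoted 515 Kbp per locus rather than a figure stated in the abstract.

```python
# Figures quoted in the abstract for the BnaWAIT_01_2010a map.
total_length_cm = 1792   # total genetic length, cM
loci = 2196              # distinct loci
markers = 5162           # genetic markers placed on those loci
kbp_per_locus = 515      # physical distance quoted per locus, Kbp

# Average genetic spacing between loci.
cm_per_locus = total_length_cm / loci

# Average number of markers sharing a locus.
markers_per_locus = markers / loci

# Physical genome size implied by the quoted spacing (an inference).
implied_genome_mbp = loci * kbp_per_locus / 1000
```

    Rounded to two decimals, `cm_per_locus` reproduces the reported one locus every 0.82 cM, and the implied physical size comes out at roughly 1.13 Gbp.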

  14. Genome-wide mapping of DNA strand breaks.

    Directory of Open Access Journals (Sweden)

    Frédéric Leduc

    Determination of cellular DNA damage has so far been limited to global assessment of genome integrity whereas nucleotide-level mapping has been restricted to specific loci by the use of specific primers. Therefore, only limited DNA sequences can be studied and novel regions of genomic instability can hardly be discovered. Using a well-characterized yeast model, we describe a straightforward strategy to map genome-wide DNA strand breaks without compromising nucleotide-level resolution. This technique, termed "damaged DNA immunoprecipitation" (dDIP), uses immunoprecipitation and the terminal deoxynucleotidyl transferase-mediated dUTP-biotin end-labeling (TUNEL) to capture DNA at break sites. When used in combination with microarray or next-generation sequencing technologies, dDIP will allow researchers to map genome-wide DNA strand breaks as well as other types of DNA damage and to establish a clear profiling of altered genes and/or intergenic sequences in various experimental conditions. This mapping technique could find several applications, for instance in the study of aging, genotoxic drug screening, cancer, meiosis, radiation and oxidative DNA damage.

  15. Perspectives of Integrative Cancer Genomics in Next Generation Sequencing Era

    Directory of Open Access Journals (Sweden)

    So Mee Kwon

    2012-06-01

    The explosive development of genomics technologies including microarrays and next generation sequencing (NGS) has provided comprehensive maps of cancer genomes, including the expression of mRNAs and microRNAs, DNA copy numbers, sequence variations, and epigenetic changes. These genome-wide profiles of the genetic aberrations could reveal the candidates for diagnostic and/or prognostic biomarkers as well as mechanistic insights into tumor development and progression. Recent efforts to establish the huge cancer genome compendium and integrative omics analyses, so-called "integromics", have extended our understanding of the cancer genome, showing its daunting complexity and heterogeneity. However, the challenges of the structured integration, sharing, and interpretation of the big omics data still remain to be resolved. Here, we review several issues raised in cancer omics data analysis, including NGS, focusing particularly on the study design and analysis strategies. This might be helpful to understand the current trends and strategies of the rapidly evolving cancer genomics research.

  16. Toward mapping the biology of the genome.

    Science.gov (United States)

    Chanock, Stephen

    2012-09-01

    This issue of Genome Research presents new results, methods, and tools from The ENCODE Project (ENCyclopedia of DNA Elements), which collectively represents an important step in moving beyond a parts list of the genome and promises to shape the future of genomic research. This collection sheds light on basic biological questions and frames the current debate over the optimization of tools and methodological challenges necessary to compare and interpret large complex data sets focused on how the genome is organized and regulated. In a number of instances, the authors have highlighted the strengths and limitations of current computational and technical approaches, providing the community with useful standards, which should stimulate development of new tools. In many ways, these papers will ripple through the scientific community, as those in pursuit of understanding the "regulatory genome" will heavily traverse the maps and tools. Similarly, the work should have a substantive impact on how genetic variation contributes to specific diseases and traits by providing a compendium of functional elements for follow-up study. The success of these papers should not only be measured by the scope of the scientific insights and tools but also by their ability to attract new talent to mine existing and future data.

  17. Genome Variation Map: a data repository of genome variations in BIG Data Center.

    Science.gov (United States)

    Song, Shuhui; Tian, Dongmei; Li, Cuiping; Tang, Bixia; Dong, Lili; Xiao, Jingfa; Bao, Yiming; Zhao, Wenming; He, Hang; Zhang, Zhang

    2018-01-04

    The Genome Variation Map (GVM; http://bigd.big.ac.cn/gvm/) is a public data repository of genome variations. As a core resource in the BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, GVM is dedicated to collecting, integrating and visualizing genome variations for a wide range of species, accepts submissions of different types of genome variations from all over the world and provides free open access to all publicly available data in support of worldwide research activities. Unlike existing related databases, GVM features integration of a large number of genome variations for a broad diversity of species including human, cultivated plants and domesticated animals. Specifically, the current implementation of GVM not only houses a total of ∼4.9 billion variants for 19 species including chicken, dog, goat, human, poplar, rice and tomato, but also incorporates 8669 individual genotypes and 13 262 manually curated high-quality genotype-to-phenotype associations for non-human species. In addition, GVM provides friendly, intuitive web interfaces for data submission, browsing, search and visualization. Collectively, GVM serves as an important resource for archiving genomic variation data, helpful for better understanding population genetic diversity and deciphering complex mechanisms associated with different phenotypes. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  18. Genome Variation Map: a data repository of genome variations in BIG Data Center

    Science.gov (United States)

    Tian, Dongmei; Li, Cuiping; Tang, Bixia; Dong, Lili; Xiao, Jingfa; Bao, Yiming; Zhao, Wenming; He, Hang

    2018-01-01

    The Genome Variation Map (GVM; http://bigd.big.ac.cn/gvm/) is a public data repository of genome variations. As a core resource in the BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, GVM is dedicated to collecting, integrating and visualizing genome variations for a wide range of species, accepts submissions of different types of genome variations from all over the world and provides free open access to all publicly available data in support of worldwide research activities. Unlike existing related databases, GVM features integration of a large number of genome variations for a broad diversity of species including human, cultivated plants and domesticated animals. Specifically, the current implementation of GVM not only houses a total of ∼4.9 billion variants for 19 species including chicken, dog, goat, human, poplar, rice and tomato, but also incorporates 8669 individual genotypes and 13 262 manually curated high-quality genotype-to-phenotype associations for non-human species. In addition, GVM provides friendly, intuitive web interfaces for data submission, browsing, search and visualization. Collectively, GVM serves as an important resource for archiving genomic variation data, helpful for better understanding population genetic diversity and deciphering complex mechanisms associated with different phenotypes. PMID:29069473

  19. The integrated microbial genome resource of analysis.

    Science.gov (United States)

    Checcucci, Alice; Mengoni, Alessio

    2015-01-01

    Integrated Microbial Genomes and Metagenomes (IMG) is a biocomputational system that provides information and support for the annotation and comparative analysis of microbial genomes and metagenomes. IMG has been developed by the US Department of Energy (DOE) Joint Genome Institute (JGI). The IMG platform contains both draft and complete genomes, including those sequenced by the Joint Genome Institute and other publicly available genomes. Genomes of strains belonging to the Archaea, Bacteria, and Eukarya domains are present, as well as those of viruses and plasmids. Here, we describe some essential features of the IMG system and a case study for pangenome analysis.

  20. Microbial genome sequencing using optical mapping and Illumina sequencing

    Science.gov (United States)

    Introduction Optical mapping is a technique in which strands of genomic DNA are digested with one or more restriction enzymes, and a physical map of the genome constructed from the resulting image. In outline, genomic DNA is extracted from a pure culture, linearly arrayed on a specialized glass sli...

  1. Genomics Portals: integrative web-platform for mining genomics data.

    Science.gov (United States)

    Shinde, Kaustubh; Phatak, Mukta; Johannes, Freudenberg M; Chen, Jing; Li, Qian; Vineet, Joshi K; Hu, Zhen; Ghosh, Krishnendu; Meller, Jaroslaw; Medvedovic, Mario

    2010-01-13

    A large amount of experimental data generated by modern high-throughput technologies is available through various public repositories. Our knowledge about molecular interaction networks, functional biological pathways and transcriptional regulatory modules is rapidly expanding, and is being organized in lists of functionally related genes. Jointly, these two sources of information hold a tremendous potential for gaining new insights into functioning of living systems. Genomics Portals platform integrates access to an extensive knowledge base and a large database of human, mouse, and rat genomics data with basic analytical visualization tools. It provides the context for analyzing and interpreting new experimental data and the tool for effective mining of a large number of publicly available genomics datasets stored in the back-end databases. The uniqueness of this platform lies in the volume and the diversity of genomics data that can be accessed and analyzed (gene expression, ChIP-chip, ChIP-seq, epigenomics, computationally predicted binding sites, etc), and the integration with an extensive knowledge base that can be used in such analysis. The integrated access to primary genomics data, functional knowledge and analytical tools makes Genomics Portals platform a unique tool for interpreting results of new genomics experiments and for mining the vast amount of data stored in the Genomics Portals backend databases. Genomics Portals can be accessed and used freely at http://GenomicsPortals.org.

  2. Genomics Portals: integrative web-platform for mining genomics data

    Directory of Open Access Journals (Sweden)

    Ghosh Krishnendu

    2010-01-01

    Background: A large amount of experimental data generated by modern high-throughput technologies is available through various public repositories. Our knowledge about molecular interaction networks, functional biological pathways and transcriptional regulatory modules is rapidly expanding, and is being organized in lists of functionally related genes. Jointly, these two sources of information hold a tremendous potential for gaining new insights into functioning of living systems. Results: Genomics Portals platform integrates access to an extensive knowledge base and a large database of human, mouse, and rat genomics data with basic analytical visualization tools. It provides the context for analyzing and interpreting new experimental data and the tool for effective mining of a large number of publicly available genomics datasets stored in the back-end databases. The uniqueness of this platform lies in the volume and the diversity of genomics data that can be accessed and analyzed (gene expression, ChIP-chip, ChIP-seq, epigenomics, computationally predicted binding sites, etc.), and the integration with an extensive knowledge base that can be used in such analysis. Conclusion: The integrated access to primary genomics data, functional knowledge and analytical tools makes Genomics Portals platform a unique tool for interpreting results of new genomics experiments and for mining the vast amount of data stored in the Genomics Portals backend databases. Genomics Portals can be accessed and used freely at http://GenomicsPortals.org.

  3. GDR (Genome Database for Rosaceae): integrated web resources for Rosaceae genomics and genetics research

    Directory of Open Access Journals (Sweden)

    Ficklin Stephen

    2004-09-01

    Background: Peach is being developed as a model organism for Rosaceae, an economically important family that includes fruits and ornamental plants such as apple, pear, strawberry, cherry, almond and rose. The genomics and genetics data of peach can play a significant role in the gene discovery and the genetic understanding of related species. The effective utilization of these peach resources, however, requires the development of an integrated and centralized database with associated analysis tools. Description: The Genome Database for Rosaceae (GDR) is a curated and integrated web-based relational database. GDR contains comprehensive data of the genetically anchored peach physical map, an annotated peach EST database, Rosaceae maps and markers and all publicly available Rosaceae sequences. Annotations of ESTs include contig assembly, putative function, simple sequence repeats, and anchored position to the peach physical map where applicable. Our integrated map viewer provides graphical interface to the genetic, transcriptome and physical mapping information. ESTs, BACs and markers can be queried by various categories and the search result sites are linked to the integrated map viewer or to the WebFPC physical map sites. In addition to browsing and querying the database, users can compare their sequences with the annotated GDR sequences via a dedicated sequence similarity server running either the BLAST or FASTA algorithm. To demonstrate the utility of the integrated and fully annotated database and analysis tools, we describe a case study where we anchored Rosaceae sequences to the peach physical and genetic map by sequence similarity. Conclusions: The GDR has been initiated to meet the major deficiency in Rosaceae genomics and genetics research, namely a centralized web database and bioinformatics tools for data storage, analysis and exchange. GDR can be accessed at http://www.genome.clemson.edu/gdr/.

  4. GDR (Genome Database for Rosaceae): integrated web resources for Rosaceae genomics and genetics research.

    Science.gov (United States)

    Jung, Sook; Jesudurai, Christopher; Staton, Margaret; Du, Zhidian; Ficklin, Stephen; Cho, Ilhyung; Abbott, Albert; Tomkins, Jeffrey; Main, Dorrie

    2004-09-09

    Peach is being developed as a model organism for Rosaceae, an economically important family that includes fruits and ornamental plants such as apple, pear, strawberry, cherry, almond and rose. The genomics and genetics data of peach can play a significant role in the gene discovery and the genetic understanding of related species. The effective utilization of these peach resources, however, requires the development of an integrated and centralized database with associated analysis tools. The Genome Database for Rosaceae (GDR) is a curated and integrated web-based relational database. GDR contains comprehensive data of the genetically anchored peach physical map, an annotated peach EST database, Rosaceae maps and markers and all publicly available Rosaceae sequences. Annotations of ESTs include contig assembly, putative function, simple sequence repeats, and anchored position to the peach physical map where applicable. Our integrated map viewer provides graphical interface to the genetic, transcriptome and physical mapping information. ESTs, BACs and markers can be queried by various categories and the search result sites are linked to the integrated map viewer or to the WebFPC physical map sites. In addition to browsing and querying the database, users can compare their sequences with the annotated GDR sequences via a dedicated sequence similarity server running either the BLAST or FASTA algorithm. To demonstrate the utility of the integrated and fully annotated database and analysis tools, we describe a case study where we anchored Rosaceae sequences to the peach physical and genetic map by sequence similarity. The GDR has been initiated to meet the major deficiency in Rosaceae genomics and genetics research, namely a centralized web database and bioinformatics tools for data storage, analysis and exchange. GDR can be accessed at http://www.genome.clemson.edu/gdr/.

  5. Validation of rice genome sequence by optical mapping

    Directory of Open Access Journals (Sweden)

    Pape Louise

    2007-08-01

    Background: Rice feeds much of the world, and possesses the simplest genome analyzed to date within the grass family, making it an economically relevant model system for other cereal crops. Although the rice genome is sequenced, validation and gap closing efforts require purely independent means for accurate finishing of sequence build data. Results: To facilitate ongoing sequencing finishing and validation efforts, we have constructed a whole-genome SwaI optical restriction map of the rice genome. The physical map consists of 14 contigs, covering 12 chromosomes, with a total genome size of 382.17 Mb; this value is about 11% smaller than original estimates. 9 of the 14 optical map contigs are without gaps, covering chromosomes 1, 2, 3, 4, 5, 7, 8, 10, and 12 in their entirety, including centromeres and telomeres. Alignments between optical and in silico restriction maps constructed from IRGSP (International Rice Genome Sequencing Project) and TIGR (The Institute for Genomic Research) genome sequence sources are comprehensive and informative, evidenced by map coverage across virtually all published gaps, discovery of new ones, and characterization of sequence misassemblies; all totalling ~14 Mb. Furthermore, since optical maps are ordered restriction maps, identified discordances are pinpointed on a reliable physical scaffold providing an independent resource for closure of gaps and rectification of misassemblies. Conclusion: Analysis of sequence and optical mapping data effectively validates genome sequence assemblies constructed from large, repeat-rich genomes. Given this conclusion we envision new applications of such single molecule analysis that will merge advantages offered by high-resolution optical maps with inexpensive, but short sequence reads generated by emerging sequencing platforms. Lastly, map construction techniques presented here point the way to new types of comparative genome analysis that would focus on discernment of
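
    The in silico restriction maps mentioned in this abstract are ordered lists of fragment lengths predicted from a sequence, which can then be aligned against the fragment sizes measured by optical mapping. The sketch below illustrates the idea on a toy string and is not the pipeline used in the study: the function name and the example sequence are assumptions, and only the top strand is scanned, which is sufficient here because the SwaI recognition site (ATTTAAAT, cut in the middle to leave blunt ends) is palindromic.

```python
SWAI_SITE = "ATTTAAAT"  # SwaI recognition sequence
SWAI_CUT_OFFSET = 4     # blunt cut between ATTT and AAAT


def in_silico_digest(sequence, site=SWAI_SITE, cut_offset=SWAI_CUT_OFFSET):
    """Return the ordered fragment lengths from cutting at every site."""
    sequence = sequence.upper()
    cut_positions = []
    start = sequence.find(site)
    while start != -1:
        cut_positions.append(start + cut_offset)
        # Resume one base further on so overlapping sites are not missed.
        start = sequence.find(site, start + 1)
    boundaries = [0] + cut_positions + [len(sequence)]
    return [right - left for left, right in zip(boundaries, boundaries[1:])]


# Toy sequence with two SwaI sites (hypothetical, not rice sequence).
toy_sequence = "GG" + "ATTTAAAT" + "CCCC" + "ATTTAAAT" + "G"
toy_fragments = in_silico_digest(toy_sequence)
```

    A run of predicted fragments that has no counterpart in the optical map, or the reverse, is the kind of discordance that would flag a gap or misassembly for follow-up.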

  6. GDR (Genome Database for Rosaceae): integrated web-database for Rosaceae genomics and genetics data.

    Science.gov (United States)

    Jung, Sook; Staton, Margaret; Lee, Taein; Blenda, Anna; Svancara, Randall; Abbott, Albert; Main, Dorrie

    2008-01-01

    The Genome Database for Rosaceae (GDR) is a central repository of curated and integrated genetics and genomics data of Rosaceae, an economically important family which includes apple, cherry, peach, pear, raspberry, rose and strawberry. GDR contains annotated databases of all publicly available Rosaceae ESTs, the genetically anchored peach physical map, Rosaceae genetic maps and comprehensively annotated markers and traits. The ESTs are assembled to produce unigene sets of each genus and the entire Rosaceae. Other annotations include putative function, microsatellites, open reading frames, single nucleotide polymorphisms, gene ontology terms and anchored map position where applicable. Most of the published Rosaceae genetic maps can be viewed and compared through CMap, the comparative map viewer. The peach physical map can be viewed using WebFPC/WebChrom, and also through our integrated GDR map viewer, which serves as a portal to the combined genetic, transcriptome and physical mapping information. ESTs, BACs, markers and traits can be queried by various categories and the search result sites are linked to the mapping visualization tools. GDR also provides online analysis tools such as a batch BLAST/FASTA server for the GDR datasets, a sequence assembly server and microsatellite and primer detection tools. GDR is available at http://www.rosaceae.org.

  7. Definition of the zebrafish genome using flow cytometry and cytogenetic mapping

    Directory of Open Access Journals (Sweden)

    Zhou Yi

    2007-06-01

    Background: The zebrafish (Danio rerio) is an important vertebrate model organism for biomedical research. The syntenic conservation between the zebrafish and human genomes allows one to investigate the function of human genes using the zebrafish model. To facilitate analysis of the zebrafish genome, genetic maps have been constructed and sequence annotation of a reference zebrafish genome is ongoing. However, the duplicative nature of teleost genomes, including the zebrafish, complicates accurate assembly and annotation of a representative genome sequence. Cytogenetic approaches provide "anchors" that can be integrated with accumulating genomic data. Results: Here, we cytogenetically define the zebrafish genome by first estimating the size of each linkage group (LG) chromosome using flow cytometry, followed by the cytogenetic mapping of 575 bacterial artificial chromosome (BAC) clones onto metaphase chromosomes. Of the 575 BAC clones, 544 clones localized to apparently unique chromosomal locations. 93.8% of these clones were assigned to a specific LG chromosome location using fluorescence in situ hybridization (FISH) and compared to the LG chromosome assignment reported in the zebrafish genome databases. Thirty-one BAC clones localized to multiple chromosomal locations in several different hybridization patterns. From these data, a refined second generation probe panel for each LG chromosome was also constructed. Conclusion: The chromosomal mapping of the 575 large-insert DNA clones allows for these clones to be integrated into existing zebrafish mapping data. An accurately annotated zebrafish reference genome serves as a valuable resource for investigating the molecular basis of human diseases using zebrafish mutant models.

  8. Generation of a BAC-based physical map of the melon genome

    Directory of Open Access Journals (Sweden)

    Puigdomènech Pere

    2010-05-01

    of the first physical map of a Cucurbitaceae species described so far. The physical map was integrated with the genetic map so that a number of physical contigs, representing 12% of the melon genome, could be anchored to known genetic positions. The data presented is already helping to improve the quality of the melon genomic sequence available as a result of a project currently being carried out in Spain, adopting a whole genome shotgun approach based on 454 sequencing data.

  9. Whole-genome shotgun optical mapping of Rhodospirillum rubrum

    Energy Technology Data Exchange (ETDEWEB)

    Reslewic, Susan; Zhou, Shiguo; Place, Mike; Zhang, Yaoping; Briska, Adam; Goldstein, Steve; Churas, Chris; Runnheim, Rod; Forrest, Dan; Lim, Alex; Lapidus, Alla; Han, Cliff S.; Roberts, Gary P.; Schwartz, David C.

    2004-07-01

    Rhodospirillum rubrum is a phototrophic purple non-sulfur bacterium known for its unique and well-studied nitrogen fixation and carbon monoxide oxidation systems, and as a source of hydrogen and biodegradable plastics production. To better understand this organism and to facilitate assembly of its sequence, three whole-genome restriction maps (Xba I, Nhe I, and Hind III) of R. rubrum strain ATCC 11170 were created by optical mapping. Optical mapping is a system for creating whole-genome ordered restriction maps from randomly sheared genomic DNA molecules extracted directly from cells. During the sequence finishing process, all three optical maps confirmed a putative error in sequence assembly, while the Hind III map acted as a scaffold for high resolution alignment with sequence contigs spanning the whole genome. In addition to highlighting optical mapping's role in the assembly and validation of genome sequence, our work underscores the unique niche in resolution occupied by the optical mapping system. With a resolution ranging from 6.5 kb (previously published) to 45 kb (reported here), optical mapping advances a "molecular cytogenetics" approach to solving problems in genomic analysis.

  10. Whole-genome shotgun optical mapping of Rhodospirillum rubrum

    Energy Technology Data Exchange (ETDEWEB)

    Reslewic, S. [Univ. Wisc.-Madison]; Zhou, S. [Univ. Wisc.-Madison]; Place, M. [Univ. Wisc.-Madison]; Zhang, Y. [Univ. Wisc.-Madison]; Briska, A. [Univ. Wisc.-Madison]; Goldstein, S. [Univ. Wisc.-Madison]; Churas, C. [Univ. Wisc.-Madison]; Runnheim, R. [Univ. Wisc.-Madison]; Forrest, D. [Univ. Wisc.-Madison]; Lim, A. [Univ. Wisc.-Madison]; Lapidus, A. [Univ. Wisc.-Madison]; Han, C. S. [Univ. Wisc.-Madison]; Roberts, G. P. [Univ. Wisc.-Madison]; Schwartz, D. C. [Univ. Wisc.-Madison]

    2005-09-01

    Rhodospirillum rubrum is a phototrophic purple nonsulfur bacterium known for its unique and well-studied nitrogen fixation and carbon monoxide oxidation systems and as a source of hydrogen and biodegradable plastic production. To better understand this organism and to facilitate assembly of its sequence, three whole-genome restriction endonuclease maps (XbaI, NheI, and HindIII) of R. rubrum strain ATCC 11170 were created by optical mapping. Optical mapping is a system for creating whole-genome ordered restriction endonuclease maps from randomly sheared genomic DNA molecules extracted from cells. During the sequence finishing process, all three optical maps confirmed a putative error in sequence assembly, while the HindIII map acted as a scaffold for high-resolution alignment with sequence contigs spanning the whole genome. In addition to highlighting optical mapping's role in the assembly and confirmation of genome sequence, this work underscores the unique niche in resolution occupied by the optical mapping system. With a resolution ranging from 6.5 kb (previously published) to 45 kb (reported here), optical mapping advances a "molecular cytogenetics" approach to solving problems in genomic analysis.

  11. BioNano genome mapping of individual chromosomes supports physical mapping and sequence assembly in complex plant genomes.

    Science.gov (United States)

    Staňková, Helena; Hastie, Alex R; Chan, Saki; Vrána, Jan; Tulpová, Zuzana; Kubaláková, Marie; Visendi, Paul; Hayashi, Satomi; Luo, Mingcheng; Batley, Jacqueline; Edwards, David; Doležel, Jaroslav; Šimková, Hana

    2016-07-01

    The assembly of a reference genome sequence of bread wheat is challenging due to its specific features such as the genome size of 17 Gbp, polyploid nature and prevalence of repetitive sequences. BAC-by-BAC sequencing based on chromosomal physical maps, adopted by the International Wheat Genome Sequencing Consortium as the key strategy, reduces problems caused by the genome complexity and polyploidy, but the repeat content still hampers the sequence assembly. Availability of a high-resolution genomic map to guide sequence scaffolding and validate physical map and sequence assemblies would be highly beneficial to obtaining an accurate and complete genome sequence. Here, we chose the short arm of chromosome 7D (7DS) as a model to demonstrate for the first time that it is possible to couple chromosome flow sorting with genome mapping in nanochannel arrays and create a de novo genome map of a wheat chromosome. We constructed a high-resolution chromosome map composed of 371 contigs with an N50 of 1.3 Mb. Long DNA molecules achieved by our approach facilitated chromosome-scale analysis of repetitive sequences and revealed a ~800-kb array of tandem repeats intractable to current DNA sequencing technologies. Anchoring 7DS sequence assemblies obtained by clone-by-clone sequencing to the 7DS genome map provided a valuable tool to improve the BAC-contig physical map and validate sequence assembly on a chromosome-arm scale. Our results indicate that creating genome maps for the whole wheat genome in a chromosome-by-chromosome manner is feasible and that they will be an affordable tool to support the production of improved pseudomolecules. © 2016 The Authors. Plant Biotechnology Journal published by Society for Experimental Biology and The Association of Applied Biologists and John Wiley & Sons Ltd.
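    The abstract summarizes map contiguity as an N50 (371 contigs with an N50 of 1.3 Mb). As a reminder of what that statistic means, here is a minimal sketch using the common definition: the length of the contig at which the cumulative sum, taken from the longest contig downward, first reaches half of the total. The function name and the toy lengths are illustrative assumptions, not data from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are visited from longest to shortest; the N50 is the length of
    the contig at which the running total first reaches half of the sum.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("n50() needs at least one contig length")


# Toy contig lengths (arbitrary units, not the 7DS genome map).
toy_contigs = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(toy_contigs))
```

    Because the statistic is weighted by length, a few long contigs dominate it, which is why it is usually reported together with the contig count.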

  12. KAIKObase: An integrated silkworm genome database and data mining tool

    Directory of Open Access Journals (Sweden)

    Nagaraju Javaregowda

    2009-10-01

    Background: The silkworm, Bombyx mori, is one of the most economically important insects in many developing countries owing to its large-scale cultivation for silk production. With the development of genomic and biotechnological tools, B. mori has also become an important bioreactor for production of various recombinant proteins of biomedical interest. In 2004, two genome sequencing projects for B. mori were reported independently by Chinese and Japanese teams; however, the datasets were insufficient for building long genomic scaffolds, which are essential for unambiguous annotation of the genome. Now, both datasets have been merged and assembled through a joint collaboration between the two groups. Description: Integration of the two data sets of silkworm whole-genome-shotgun sequencing by the Japanese and Chinese groups, together with newly obtained fosmid- and BAC-end sequences, produced the best continuity (~3.7 Mb in N50 scaffold size) among the sequenced insect genomes and provided a high degree of nucleotide coverage (88%) of all 28 chromosomes. In addition, a physical map of BAC contigs constructed by fingerprinting BAC clones and a SNP linkage map constructed using BAC-end sequences were available. In parallel, proteomic data from two-dimensional polyacrylamide gel electrophoresis in various tissues and developmental stages were compiled into a silkworm proteome database. Finally, a Bombyx trap database was constructed for documenting insertion positions and expression data of transposon insertion lines. Conclusion: For efficient usage of genome information for functional studies, genomic sequences, physical and genetic map information and EST data were compiled into KAIKObase, an integrated silkworm genome database which consists of 4 map viewers, a gene viewer, and sequence, keyword and position search systems to display results and data at the level of nucleotide sequence, gene, scaffold and chromosome. Integration of the

  13. BAC-end sequence-based SNPs and Bin mapping for rapid integration of physical and genetic maps in apple.

    Science.gov (United States)

    Han, Yuepeng; Chagné, David; Gasic, Ksenija; Rikkerink, Erik H A; Beever, Jonathan E; Gardiner, Susan E; Korban, Schuyler S

    2009-03-01

    A genome-wide BAC physical map of the apple, Malus x domestica Borkh., has been recently developed. Here, we report on integrating the physical and genetic maps of the apple using a SNP-based approach in conjunction with bin mapping. Briefly, BAC clones located at the ends of BAC contigs were selected and sequenced at both ends. The BAC end sequences (BESs) were used to identify candidate SNPs. Subsequently, these candidate SNPs were genetically mapped using a bin mapping strategy for the purpose of mapping the physical map onto the genetic map. Using this approach, 52 (23%) out of 228 BESs tested were successfully exploited to develop SNPs. These SNPs anchored 51 contigs, spanning approximately 37 Mb in cumulative physical length, onto 14 linkage groups. The reliability of the integration of the physical and genetic maps using this SNP-based strategy is described, and the results confirm the feasibility of this approach for constructing an integrated physical and genetic map for apple.

  14. Integrable mappings via rational elliptic surfaces

    International Nuclear Information System (INIS)

    Tsuda, Teruhisa

    2004-01-01

    We present a geometric description of the QRT map (which is an integrable mapping introduced by Quispel, Roberts and Thompson) in terms of the addition formula of a rational elliptic surface. By this formulation, we classify all the cases when the QRT map is periodic; and show that its period is 2, 3, 4, 5 or 6. A generalization of the QRT map which acts birationally on a pencil of K3 surfaces, or Calabi-Yau manifolds, is also presented
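    As a concrete illustration of a periodic case, the sketch below iterates the Lyness map x_{n+1} = (1 + x_n) / x_{n-1}, which is commonly cited as a member of the QRT family and whose every generic orbit has period 5. Choosing this particular map is an assumption made for illustration; it is not taken from the abstract. Exact rational arithmetic avoids floating-point doubt about the return to the starting point.

```python
from fractions import Fraction


def lyness_step(x, y):
    """One step of the planar map (x, y) -> (y, (1 + y) / x)."""
    return y, (1 + y) / x


def orbit(x, y, steps):
    """Return the list of points visited, starting point included."""
    points = [(x, y)]
    for _ in range(steps):
        x, y = lyness_step(x, y)
        points.append((x, y))
    return points


# Arbitrary rational starting point (illustrative choice).
points = orbit(Fraction(2), Fraction(3), 5)
print(points[5] == points[0])   # the orbit closes after five steps
print(points[0] in points[1:5]) # ...and not earlier
```

    The same closure can be checked by hand: starting from (a, b), the successive second coordinates are (1 + b)/a, (a + b + 1)/(ab), (a + 1)/b, a, b.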

  15. Rapid detection of structural variation in a human genome using nanochannel-based genome mapping technology

    DEFF Research Database (Denmark)

    Cao, Hongzhi; Hastie, Alex R.; Cao, Dandan

    2014-01-01

    mutations; however, none of the current detection methods are comprehensive, and currently available methodologies are incapable of providing sufficient resolution and unambiguous information across complex regions in the human genome. To address these challenges, we applied a high-throughput, cost......-effective genome mapping technology to comprehensively discover genome-wide SVs and characterize complex regions of the YH genome using long single molecules (>150 kb) in a global fashion. RESULTS: Utilizing nanochannel-based genome mapping technology, we obtained 708 insertions/deletions and 17 inversions larger...... fosmid data. Of the remaining 270 SVs, 260 are insertions and 213 overlap known SVs in the Database of Genomic Variants. Overall, 609 out of 666 (90%) variants were supported by experimental orthogonal methods or historical evidence in public databases. At the same time, genome mapping also provides...

  16. Transcription as a Threat to Genome Integrity.

    Science.gov (United States)

    Gaillard, Hélène; Aguilera, Andrés

    2016-06-02

    Genomes undergo different types of sporadic alterations, including DNA damage, point mutations, and genome rearrangements, that constitute the basis for evolution. However, these changes may occur at high levels as a result of cell pathology and trigger genome instability, a hallmark of cancer and a number of genetic diseases. In the last two decades, evidence has accumulated that transcription constitutes an important natural source of DNA metabolic errors that can compromise the integrity of the genome. Transcription can create the conditions for high levels of mutations and recombination by its ability to open the DNA structure and remodel chromatin, making it more accessible to DNA insulting agents, and by its ability to become a barrier to DNA replication. Here we review the molecular basis of such events from a mechanistic perspective with particular emphasis on the role of transcription as a genome instability determinant.

  17. Integrating genomics into undergraduate nursing education.

    Science.gov (United States)

    Daack-Hirsch, Sandra; Dieter, Carla; Quinn Griffin, Mary T

    2011-09-01

    To prepare the next generation of nurses, faculty are now faced with the challenge of incorporating genomics into curricula. Here we discuss how to meet this challenge. Steps to initiate curricular changes to include genomics are presented along with a discussion on creating a genomic curriculum thread versus a standalone course. Ideas for use of print material and technology on genomic topics are also presented. Information is based on review of the literature and curriculum change efforts by the authors. In recognition of advances in genomics, the nursing profession is increasing an emphasis on the integration of genomics into professional practice and educational standards. Incorporating genomics into nurses' practices begins with changes in our undergraduate curricula. Information given in didactic courses should be reinforced in clinical practica, and Internet-based tools such as WebQuest, Second Life, and wikis offer attractive, up-to-date platforms to deliver this now crucial content. To provide information that may assist faculty to prepare the next generation of nurses to practice using genomics. © 2011 Sigma Theta Tau International.

  18. GAPIT: genome association and prediction integrated tool.

    Science.gov (United States)

    Lipka, Alexander E; Tian, Feng; Wang, Qishan; Peiffer, Jason; Li, Meng; Bradbury, Peter J; Gore, Michael A; Buckler, Edward S; Zhang, Zhiwu

    2012-09-15

    Software programs that conduct genome-wide association studies and genomic prediction and selection need to use methodologies that maximize statistical power, provide high prediction accuracy and run in a computationally efficient manner. We developed an R package called Genome Association and Prediction Integrated Tool (GAPIT) that implements advanced statistical methods including the compressed mixed linear model (CMLM) and CMLM-based genomic prediction and selection. The GAPIT package can handle large datasets in excess of 10 000 individuals and 1 million single-nucleotide polymorphisms with minimal computational time, while providing user-friendly access and concise tables and graphs to interpret results. Availability: http://www.maizegenetics.net/GAPIT. Contact: zhiwu.zhang@cornell.edu. Supplementary data are available at Bioinformatics online.

  19. Genomic dark matter: the reliability of short read mapping illustrated by the genome mappability score.

    Science.gov (United States)

    Lee, Hayan; Schatz, Michael C

    2012-08-15

    Genome resequencing and short read mapping are two of the primary tools of genomics and are used for many important applications. The current state of the art in mapping uses the quality values and mapping quality scores to evaluate the reliability of the mapping. These attributes, however, are assigned to individual reads and do not directly measure the problematic repeats across the genome. Here, we present the Genome Mappability Score (GMS) as a novel measure of the complexity of resequencing a genome. The GMS is a weighted probability that any read could be unambiguously mapped to a given position and thus measures the overall composition of the genome itself. We have developed the Genome Mappability Analyzer to compute the GMS of every position in a genome. It leverages the parallelism of cloud computing to analyze large genomes, and enabled us to identify the 5-14% of the human, mouse, fly and yeast genomes that are difficult to analyze with short reads. We examined the accuracy of the widely used BWA/SAMtools polymorphism discovery pipeline in the context of the GMS, and found discovery errors are dominated by false negatives, especially in regions with poor GMS. These errors are fundamental to the mapping process and cannot be overcome by increasing coverage. As such, the GMS should be considered in every resequencing project to pinpoint the 'dark matter' of the genome, including known clinically relevant variations in these regions. The source code and profiles of several model organisms are available at http://gma-bio.sourceforge.net
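    The idea that some positions cannot be mapped unambiguously can be illustrated with a much cruder proxy than the GMS: marking the start positions whose exact k-mer occurs only once in the sequence. This is a deliberately simplified stand-in, not the weighted probability the authors define, and it ignores sequencing errors, mismatches and the reverse strand; the function name and toy sequence are assumptions for illustration.

```python
from collections import Counter


def unique_kmer_mask(genome, k):
    """Return one boolean per k-mer start position.

    True means the exact k-mer starting there occurs only once in the
    sequence, so an error-free read of length k from that position would
    have a single exact match.
    """
    kmers = [genome[i:i + k] for i in range(len(genome) - k + 1)]
    counts = Counter(kmers)
    return [counts[kmer] == 1 for kmer in kmers]


# Toy sequence containing a repeated 4-mer (ACGT); not real genomic data.
toy_genome = "ACGTACGTTT"
mask = unique_kmer_mask(toy_genome, 4)
print(mask)
print(sum(mask) / len(mask))  # fraction of uniquely placeable start positions
```

    Positions inside the repeat are flagged regardless of how many reads cover them, which mirrors the abstract's point that such errors cannot be overcome by increasing coverage.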

  20. Algorithms and Complexity Results for Genome Mapping Problems.

    Science.gov (United States)

    Rajaraman, Ashok; Zanetti, Joao Paulo Pereira; Manuch, Jan; Chauve, Cedric

    2017-01-01

    Genome mapping algorithms aim at computing an ordering of a set of genomic markers based on local ordering information such as adjacencies and intervals of markers. In most genome mapping models, markers are assumed to occur uniquely in the resulting map. We introduce algorithmic questions that consider repeats, i.e., markers that can have several occurrences in the resulting map. We show that, provided with an upper bound on the copy number of repeated markers and with intervals that span full repeat copies, called repeat spanning intervals, the problem of deciding if a set of adjacencies and repeat spanning intervals admits a genome representation is tractable if the target genome can contain linear and/or circular chromosomal fragments. We also show that extracting a maximum cardinality or weight subset of repeat spanning intervals given a set of adjacencies that admits a genome realization is NP-hard but fixed-parameter tractable in the maximum copy number and the number of adjacent repeats, and tractable if intervals contain a single repeated marker.

  1. Periodic cluster mutations and related integrable maps

    International Nuclear Information System (INIS)

    Fordy, Allan P

    2014-01-01

    One of the remarkable properties of cluster algebras is that any cluster, obtained from a sequence of mutations from an initial cluster, can be written as a Laurent polynomial in the initial cluster (known as the ‘Laurent phenomenon’). There are many nonlinear recurrences which exhibit the Laurent phenomenon and thus unexpectedly generate integer sequences. The mutation of a typical quiver will not generate a recurrence, but rather an erratic sequence of exchange relations. How do we ‘design’ a quiver which gives rise to a given recurrence? A key role is played by the concept of ‘periodic cluster mutation’, introduced in 2009. Each recurrence corresponds to a finite dimensional map. In the context of cluster mutations, these are called ‘cluster maps’. What properties do cluster maps have? Are they integrable in some standard sense? In this review I describe how integrable maps arise in the context of cluster mutations. I first explain the concept of ‘periodic cluster mutation’, giving some classification results. I then give a review of what is meant by an integrable map and apply this to cluster maps. Two classes of integrable maps are related to interesting monodromy problems, which generate interesting Poisson algebras of functions, used to prove complete integrability and a linearization. A connection to the Hirota–Miwa equation is explained. This article is part of a special issue of Journal of Physics A: Mathematical and Theoretical devoted to ‘Cluster algebras in mathematical physics’. (review)
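    A standard example of a nonlinear recurrence that unexpectedly generates integers is the Somos-4 sequence, a(n) a(n-4) = a(n-1) a(n-3) + a(n-2)^2 with four initial ones. Using it here is an illustrative choice and is not claimed to be the example treated in the review. The sketch divides with exact rationals, so any non-integer term would show up as a denominator other than 1.

```python
from fractions import Fraction


def somos4(n_terms):
    """Return the first n_terms of Somos-4 as exact fractions.

    Recurrence: a(n) = (a(n-1) * a(n-3) + a(n-2)**2) / a(n-4),
    started from a(0) = a(1) = a(2) = a(3) = 1.
    """
    terms = [Fraction(1)] * 4
    while len(terms) < n_terms:
        terms.append((terms[-1] * terms[-3] + terms[-2] ** 2) / terms[-4])
    return terms[:n_terms]


terms = somos4(30)
print([int(t) for t in terms[:11]])
print(all(t.denominator == 1 for t in terms))  # every division is exact
```

    Each step divides by an earlier term, so integrality is not obvious from the recurrence itself; it follows from each term being a Laurent polynomial in the initial values, which are all 1 here.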

  2. Genome mapping and characterization of the Anopheles gambiae heterochromatin

    Directory of Open Access Journals (Sweden)

    Sharakhova Maria V

    2010-08-01

    Background: Heterochromatin plays an important role in chromosome function and gene regulation. Despite the availability of polytene chromosomes and genome sequence, the heterochromatin of the major malaria vector Anopheles gambiae has not been mapped and characterized. Results: To determine the extent of heterochromatin within the An. gambiae genome, genes were physically mapped to the euchromatin-heterochromatin transition zone of polytene chromosomes. The study found that a minimum of 232 genes reside in 16.6 Mb of mapped heterochromatin. Gene ontology analysis revealed that heterochromatin is enriched in genes with DNA-binding and regulatory activities. Immunostaining of the An. gambiae chromosomes with antibodies against Drosophila melanogaster heterochromatin protein 1 (HP1) and the nuclear envelope protein lamin Dm0 identified the major invariable sites of the proteins' localization in all regions of pericentric heterochromatin, diffuse intercalary heterochromatin, and euchromatic region 9C of the 2R arm, but not in the compact intercalary heterochromatin. To better understand the molecular differences among chromatin types, novel Bayesian statistical models were developed to analyze genome features. The study found that heterochromatin and euchromatin differ in gene density and the coverage of retroelements and segmental duplications. The pericentric heterochromatin had the highest coverage of retroelements and tandem repeats, while intercalary heterochromatin was enriched with segmental duplications. We also provide evidence that the diffuse intercalary heterochromatin has a higher coverage of DNA transposable elements, minisatellites, and satellites than does the compact intercalary heterochromatin. The investigation of the 42-Mb assembly of unmapped genomic scaffolds showed that it has molecular characteristics similar to cytologically mapped heterochromatin. Conclusions: Our results demonstrate that Anopheles polytene chromosomes

  3. Genome-wide nucleosome map and cytosine methylation levels of an ancient human genome

    DEFF Research Database (Denmark)

    Pedersen, Jakob Skou; Valen, Eivind; Velazquez, Amhed Missael Vargas

    2014-01-01

    Epigenetic information is available from contemporary organisms, but is difficult to track back in evolutionary time. Here, we show that genome-wide epigenetic information can be gathered directly from next-generation sequence reads of DNA isolated from ancient remains. Using the genome sequence...... data generated from hair shafts of a 4000-yr-old Paleo-Eskimo belonging to the Saqqaq culture, we generate the first ancient nucleosome map coupled with a genome-wide survey of cytosine methylation levels. The validity of both nucleosome map and methylation levels were confirmed by the recovery...

  4. Radiation hybrid mapping as one of the main methods of the creation of high resolution maps of human and animal genomes

    International Nuclear Information System (INIS)

    Sulimova, G.E.; Kompanijtsev, A.A.; Mojsyak, E.V.; Rakhmanaliev, Eh.R.; Klimov, E.A.; Udina, I.G.; Zakharov, I.A.

    2000-01-01

    Radiation hybrid mapping (RH mapping) is considered one of the main methods of constructing physical maps of mammalian genomes. In the introduction, the theoretical prerequisites for the development of RH mapping and the statistical methods of data analysis are discussed. Comparative characteristics of universal commercial panels of radiation hybrid somatic cells (RH panels) are shown. In the experimental part of the work, RH mapping is used to localize nucleotide sequences adjacent to Not I sites of human chromosome 3, with the aim of integrating the contig map of Not I clones into comprehensive maps of the human genome. Five nucleotide sequences adjacent to the sites of integration of papilloma virus in the human genome and expressed in cervical cancer cells were localized. It is demonstrated that the region 13q14.3-q21.1 is enriched with nucleotide sequences involved in the processes of carcinogenesis. RH mapping can be considered one of the most promising applications of modern radiation biology in the field of molecular genetics, that is, in constructing high-resolution physical maps of mammalian genomes [ru]

  5. Super integrable four-dimensional autonomous mappings

    International Nuclear Information System (INIS)

    Capel, H W; Sahadevan, R; Rajakumar, S

    2007-01-01

    A systematic investigation of the complete integrability of a fourth-order autonomous difference equation of the type w(n + 4) = w(n)F(w(n + 1), w(n + 2), w(n + 3)) is presented. We identify seven distinct families of four-dimensional mappings which are super integrable and have three (independent) integrals via a duality relation as introduced in a recent paper by Quispel, Capel and Roberts (2005 J. Phys. A: Math. Gen. 38 3965-80). It is observed that these seven families can be related to the four-dimensional symplectic mappings with two integrals including all the four-dimensional periodic reductions of the integrable double-discrete modified Korteweg-deVries and sine-Gordon equations treated in an earlier paper by two of us (Capel and Sahadevan 2001 Physica A 289 86-106)

  6. Mapping genomic deletions down to the base

    DEFF Research Database (Denmark)

    Dunø, Morten; Hove, Hanne; Kirchhoff, Maria

    2004-01-01

    the breakpoint of the third patient was mapped to a region previously predicted to be prone for rearrangements. One patient also harboured an inversion in connection with the deletion that disrupted the HDAC9 gene. All three patients showed clinical characteristics reminiscent of the hand-foot-genital syndrome...

  7. Genomic integrity and the ageing brain.

    Science.gov (United States)

    Chow, Hei-man; Herrup, Karl

    2015-11-01

    DNA damage is correlated with and may drive the ageing process. Neurons in the brain are postmitotic and are excluded from many forms of DNA repair; therefore, neurons are vulnerable to various neurodegenerative diseases. The challenges facing the field are to understand how and when neuronal DNA damage accumulates, how this loss of genomic integrity might serve as a 'time keeper' of nerve cell ageing and why this process manifests itself as different diseases in different individuals.

  8. Outcome mapping for health system integration

    Directory of Open Access Journals (Sweden)

    Tsasis P

    2013-03-01

    Authors: Peter Tsasis (1), Jenna M Evans (2), David Forrest (3), Richard Keith Jones (4). Affiliations: (1) School of Health Policy and Management, Faculty of Health, York University, Toronto, Canada; (2) Institute of Health Policy, Management and Evaluation, Faculty of Medicine, University of Toronto, Canada; (3) Global Vision Consulting Ltd, Victoria, Canada; (4) R Keith Jones and Associates, Victoria, Canada. Abstract: Health systems around the world are implementing integrated care strategies to improve quality, reduce or maintain costs, and improve the patient experience. Yet few practical tools exist to aid leaders and managers in building the prerequisites to integrated care, namely a shared vision, clear roles and responsibilities, and a common understanding of how the vision will be realized. Outcome mapping may facilitate stakeholder alignment on the vision, roles, and processes of integrated care delivery via participative and focused dialogue among diverse stakeholders on desired outcomes and enabling actions. In this paper, we describe an outcome-mapping exercise we conducted at a Local Health Integration Network in Ontario, Canada, using consensus development conferences. Our preliminary findings suggest that outcome mapping may help stakeholders make sense of a complex system and foster collaborative capital, a resource that can support information sharing, trust, and coordinated change toward integration across organizational and professional boundaries. Drawing from the theoretical perspectives of complex adaptive systems and collaborative capital, we also outline recommendations for future outcome-mapping exercises. In particular, we emphasize the potential for outcome mapping to be used as a tool not only for identifying and linking strategic outcomes and actions, but also for studying the boundaries, gaps, and ties that characterize social networks across the continuum of care. Keywords: integrated care, integrated delivery systems, complex adaptive systems, social capital

  9. Integrative mapping analysis of chicken microchromosome 16 organization

    Directory of Open Access Journals (Sweden)

    Bed'hom Bertrand

    2010-11-01

    Full Text Available Abstract. Background: The chicken karyotype is composed of 39 chromosome pairs, of which 9 still remain totally absent from the current genome sequence assembly, despite international efforts towards complete coverage. Some others are only very partially sequenced, amongst which microchromosome 16 (GGA16), particularly under-represented, with only 433 kb assembled for a full estimated size of 9 to 11 Mb. Besides the obvious need of full genome coverage with genetic markers for QTL (Quantitative Trait Loci) mapping and major genes identification studies, there is a major interest in the detailed study of this chromosome because it carries the two genetically independent MHC complexes B and Y. In addition, GGA16 carries the ribosomal RNA (rRNA) genes cluster, also known as the NOR (nucleolus organizer region). The purpose of the present study is to construct and present high resolution integrated maps of GGA16 to refine its organization and improve its coverage with genetic markers. Results: We developed 79 STS (Sequence Tagged Site) markers to build a physical RH (radiation hybrid) map and 34 genetic markers to extend the genetic map of GGA16. We screened a BAC (Bacterial Artificial Chromosome) library with markers for the MHC-B, MHC-Y and rRNA complexes. Selected clones were used to perform high resolution FISH (Fluorescent In Situ Hybridization) mapping on giant meiotic lampbrush chromosomes, allowing meiotic mapping in addition to the confirmation of the order of the three clusters along the chromosome. A region with high recombination rates and containing PO41 repeated elements separates the two MHC complexes. Conclusions: The three complementary mapping strategies used refine greatly our knowledge of chicken microchromosome 16 organisation. The characterisation of the recombination hotspots separating the two MHC complexes demonstrates the presence of PO41 repetitive sequences both in tandem and inverted orientation.
However, this region still needs to

  10. Interchanging parameters and integrals in dynamical systems: the mapping case

    Energy Technology Data Exchange (ETDEWEB)

    Roberts, John A.G. [Department of Mathematics, La Trobe University, Bundoora, VIC (Australia) and School of Mathematics, University of New South Wales, Sydney, NSW (Australia)]. E-mail: jagr@maths.unsw.edu.au; Apostolos, Iatrou; Quispel, G.R.W. [Department of Mathematics, La Trobe University, Bundoora, VIC (Australia)]. E-mails: A.Iatrou@latrobe.edu.au; R.Quispel@latrobe.edu.au

    2002-03-08

    We consider dynamical systems with discrete time (maps) that possess one or more integrals depending upon parameters. We show that integrals can be used to replace parameters in the original map so as to construct a different map with different integrals. We also highlight a process of reparametrization that can be used to increase the number of parameters in the original map prior to using integrals to replace them. Properties of the original map and the new map are compared. The theory is motivated by, and illustrated with, examples of a three-dimensional trace map and some four-dimensional maps previously shown to be integrable. (author)

  11. A first generation BAC-based physical map of the rainbow trout genome

    Directory of Open Access Journals (Sweden)

    Thorgaard Gary H

    2009-10-01

    Full Text Available Abstract. Background: Rainbow trout (Oncorhynchus mykiss) are the most-widely cultivated cold freshwater fish in the world and an important model species for many research areas. Coupling great interest in this species as a research model with the need for genetic improvement of aquaculture production efficiency traits justifies the continued development of genomics research resources. Many quantitative trait loci (QTL) have been identified for production and life-history traits in rainbow trout. A bacterial artificial chromosome (BAC) physical map is needed to facilitate fine mapping of QTL and the selection of positional candidate genes for incorporation in marker-assisted selection (MAS) for improving rainbow trout aquaculture production. This resource will also facilitate efforts to obtain and assemble a whole-genome reference sequence for this species. Results: The physical map was constructed from DNA fingerprinting of 192,096 BAC clones using the 4-color high-information content fingerprinting (HICF) method. The clones were assembled into physical map contigs using the finger-printing contig (FPC) program. The map is composed of 4,173 contigs and 9,379 singletons. The total number of unique fingerprinting fragments (consensus bands) in contigs is 1,185,157, which corresponds to an estimated physical length of 2.0 Gb. The map assembly was validated by (1) comparison with probe hybridization results and agarose gel fingerprinting contigs; and (2) anchoring large contigs to the microsatellite-based genetic linkage map. Conclusion: The production and validation of the first BAC physical map of the rainbow trout genome is described in this paper. We are currently integrating this map with the NCCCWA genetic map using more than 200 microsatellites isolated from BAC end sequences and by identifying BACs that harbor more than 300 previously mapped markers. The availability of an integrated physical and genetic map will enable detailed comparative genome
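
    The relationship between the band count and the physical length quoted in this record can be checked with a few lines of arithmetic. The sketch below is only illustrative: the function names are hypothetical, the only inputs are the figures stated in the abstract, and it assumes the 2.0 Gb estimate is spread evenly over bands and contigs, which the abstract does not claim. Under that assumption each consensus band represents roughly 1.7 kb.

```python
# Figures quoted in the abstract above; nothing else is taken from the paper.
CONSENSUS_BANDS = 1_185_157        # unique fingerprinting fragments in contigs
ESTIMATED_LENGTH_BP = 2.0e9        # estimated physical length (2.0 Gb)
CONTIGS = 4_173                    # contigs in the assembled map


def implied_band_size_bp(length_bp: float, bands: int) -> float:
    """Average number of base pairs represented by one consensus band."""
    if bands <= 0:
        raise ValueError("bands must be positive")
    return length_bp / bands


def mean_contig_span_bp(length_bp: float, contigs: int) -> float:
    """Average physical span per contig, assuming an even spread (an assumption)."""
    if contigs <= 0:
        raise ValueError("contigs must be positive")
    return length_bp / contigs


if __name__ == "__main__":
    band = implied_band_size_bp(ESTIMATED_LENGTH_BP, CONSENSUS_BANDS)
    span = mean_contig_span_bp(ESTIMATED_LENGTH_BP, CONTIGS)
    print(f"implied band size: {band:.0f} bp")
    print(f"mean contig span:  {span / 1000:.0f} kb")
```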

  12. Physical mapping in highly heterozygous genomes: a physical contig map of the Pinot Noir grapevine cultivar

    Directory of Open Access Journals (Sweden)

    Jurman Irena

    2010-03-01

    Full Text Available Abstract. Background: Most of the grapevine (Vitis vinifera L.) cultivars grown today are those selected centuries ago, even though grapevine is one of the most important fruit crops in the world. Grapevine has therefore not benefited from the advances in modern plant breeding nor more recently from those in molecular genetics and genomics: genes controlling important agronomic traits are practically unknown. A physical map is essential to positionally clone such genes and instrumental in a genome sequencing project. Results: We report on the first whole genome physical map of grapevine built using high information content fingerprinting of 49,104 BAC clones from the cultivar Pinot Noir. Pinot Noir, as most grape varieties, is highly heterozygous at the sequence level. This resulted in the two allelic haplotypes sometimes assembling into separate contigs that had to be accommodated in the map framework or in local expansions of contig maps. We performed computer simulations to assess the effects of increasing levels of sequence heterozygosity on BAC fingerprint assembly and showed that the experimental assembly results are in full agreement with the theoretical expectations, given the heterozygosity levels reported for grape. The map is anchored to a dense linkage map consisting of 994 markers. 436 contigs are anchored to the genetic map, covering 342 of the 475 Mb that make up the grape haploid genome. Conclusions: We have developed a resource that makes it possible to access the grapevine genome, opening the way to a new era both in grape genetics and breeding and in wine making. The effects of heterozygosity on the assembly have been analyzed and characterized by using several complementary approaches which could be easily transferred to the study of other genomes which present the same features.

  13. Variant Review with the Integrative Genomics Viewer.

    Science.gov (United States)

    Robinson, James T; Thorvaldsdóttir, Helga; Wenger, Aaron M; Zehir, Ahmet; Mesirov, Jill P

    2017-11-01

    Manual review of aligned reads for confirmation and interpretation of variant calls is an important step in many variant calling pipelines for next-generation sequencing (NGS) data. Visual inspection can greatly increase the confidence in calls, reduce the risk of false positives, and help characterize complex events. The Integrative Genomics Viewer (IGV) was one of the first tools to provide NGS data visualization, and it currently provides a rich set of tools for inspection, validation, and interpretation of NGS datasets, as well as other types of genomic data. Here, we present a short overview of IGV's variant review features for both single-nucleotide variants and structural variants, with examples from both cancer and germline datasets. IGV is freely available at https://www.igv.org. Cancer Res; 77(21); e31-34. ©2017 AACR.

  14. Population Genomics of Infectious and Integrated Wolbachia pipientis Genomes in Drosophila ananassae

    Science.gov (United States)

    Choi, Jae Young; Bubnell, Jaclyn E.; Aquadro, Charles F.

    2015-01-01

    Coevolution between Drosophila and its endosymbiont Wolbachia pipientis has many intriguing aspects. For example, Drosophila ananassae hosts two forms of W. pipientis genomes: One being the infectious bacterial genome and the other integrated into the host nuclear genome. Here, we characterize the infectious and integrated genomes of W. pipientis infecting D. ananassae (wAna), by genome sequencing 15 strains of D. ananassae that have either the infectious or integrated wAna genomes. Results indicate evolutionarily stable maternal transmission for the infectious wAna genome suggesting a relatively long-term coevolution with its host. In contrast, the integrated wAna genome showed pseudogene-like characteristics accumulating many variants that are predicted to have deleterious effects if present in an infectious bacterial genome. Phylogenomic analysis of sequence variation together with genotyping by polymerase chain reaction of large structural variations indicated several wAna variants among the eight infectious wAna genomes. In contrast, only a single wAna variant was found among the seven integrated wAna genomes examined in lines from Africa, south Asia, and south Pacific islands suggesting that the integration occurred once from a single infectious wAna genome and then spread geographically. Further analysis revealed that for all D. ananassae we examined with the integrated wAna genomes, the majority of the integrated wAna genomic regions is represented in at least two copies suggesting a double integration or single integration followed by an integrated genome duplication. The possible evolutionary mechanism underlying the widespread geographical presence of the duplicate integration of the wAna genome is an intriguing question remaining to be answered. PMID:26254486

  15. Integrative Genomics Viewer (IGV) | Informatics Technology for Cancer Research (ITCR)

    Science.gov (United States)

    The Integrative Genomics Viewer (IGV) is a high-performance visualization tool for interactive exploration of large, integrated genomic datasets. It supports a wide variety of data types, including array-based and next-generation sequence data, and genomic annotations.

  16. Considerations for creating and annotating the budding yeast Genome Map at SGD: a progress report.

    Science.gov (United States)

    Chan, Esther T; Cherry, J Michael

    2012-01-01

    The Saccharomyces Genome Database (SGD) is compiling and annotating a comprehensive catalogue of functional sequence elements identified in the budding yeast genome. Recent advances in deep sequencing technologies have enabled, for example, global analyses of transcription profiling and assembly of maps of transcription factor occupancy and higher order chromatin organization, at nucleotide level resolution. With this growing influx of published genome-scale data come new challenges for their storage, display, analysis and integration. Here, we describe SGD's progress in the creation of a consolidated resource for genome sequence elements in the budding yeast, the considerations taken in its design and the lessons learned thus far. The data within this collection can be accessed at http://browse.yeastgenome.org and downloaded from http://downloads.yeastgenome.org. DATABASE URL: http://www.yeastgenome.org.

  17. Association Mapping and the Genomic Consequences of Selection in Sunflower

    Science.gov (United States)

    Mandel, Jennifer R.; Nambeesan, Savithri; Bowers, John E.; Marek, Laura F.; Ebert, Daniel; Rieseberg, Loren H.; Knapp, Steven J.; Burke, John M.

    2013-01-01

    The combination of large-scale population genomic analyses and trait-based mapping approaches has the potential to provide novel insights into the evolutionary history and genome organization of crop plants. Here, we describe the detailed genotypic and phenotypic analysis of a sunflower (Helianthus annuus L.) association mapping population that captures nearly 90% of the allelic diversity present within the cultivated sunflower germplasm collection. We used these data to characterize overall patterns of genomic diversity and to perform association analyses on plant architecture (i.e., branching) and flowering time, successfully identifying numerous associations underlying these agronomically and evolutionarily important traits. Overall, we found variable levels of linkage disequilibrium (LD) across the genome. In general, islands of elevated LD correspond to genomic regions underlying traits that are known to have been targeted by selection during the evolution of cultivated sunflower. In many cases, these regions also showed significantly elevated levels of differentiation between the two major sunflower breeding groups, consistent with the occurrence of divergence due to strong selection. One of these regions, which harbors a major branching locus, spans a surprisingly long genetic interval (ca. 25 cM), indicating the occurrence of an extended selective sweep in an otherwise recombinogenic interval. PMID:23555290
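
    Linkage disequilibrium between two biallelic loci is commonly summarized by the squared correlation r². The abstract does not say which LD statistic the authors used, so the sketch below only illustrates the standard definition, D = p_AB − p_A·p_B and r² = D² / (p_A(1 − p_A)·p_B(1 − p_B)). The function name and the example frequencies are hypothetical.

```python
def ld_r_squared(p_ab: float, p_a: float, p_b: float) -> float:
    """Squared correlation r^2 between alleles A and B at two biallelic loci.

    p_ab: frequency of the haplotype carrying both A and B
    p_a:  frequency of allele A at the first locus
    p_b:  frequency of allele B at the second locus
    """
    denominator = p_a * (1.0 - p_a) * p_b * (1.0 - p_b)
    if denominator == 0.0:
        # A monomorphic locus carries no information about association.
        raise ValueError("both loci must be polymorphic")
    d = p_ab - p_a * p_b          # coefficient of linkage disequilibrium
    return (d * d) / denominator


if __name__ == "__main__":
    # Illustrative frequencies only, not data from the study.
    print(ld_r_squared(p_ab=0.5, p_a=0.5, p_b=0.5))    # alleles always co-occur
    print(ld_r_squared(p_ab=0.25, p_a=0.5, p_b=0.5))   # independent loci
```

    Values near 1 indicate that alleles at the two loci are almost always inherited together, which is the pattern the abstract describes as "islands of elevated LD".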

  18. Understanding the development of human bladder cancer by using a whole-organ genomic mapping strategy.

    Science.gov (United States)

    Majewski, Tadeusz; Lee, Sangkyou; Jeong, Joon; Yoon, Dong-Sup; Kram, Andrzej; Kim, Mi-Sook; Tuziak, Tomasz; Bondaruk, Jolanta; Lee, Sooyong; Park, Weon-Seo; Tang, Kuang S; Chung, Woonbok; Shen, Lanlan; Ahmed, Saira S; Johnston, Dennis A; Grossman, H Barton; Dinney, Colin P; Zhou, Jain-Hua; Harris, R Alan; Snyder, Carrie; Filipek, Slawomir; Narod, Steven A; Watson, Patrice; Lynch, Henry T; Gazdar, Adi; Bar-Eli, Menashe; Wu, Xifeng F; McConkey, David J; Baggerly, Keith; Issa, Jean-Pierre; Benedict, William F; Scherer, Steven E; Czerniak, Bogdan

    2008-07-01

    The search for the genomic sequences involved in human cancers can be greatly facilitated by maps of genomic imbalances identifying the involved chromosomal regions, particularly those that participate in the development of occult preneoplastic conditions that progress to clinically aggressive invasive cancer. The integration of such regions with human genome sequence variation may provide valuable clues about their overall structure and gene content. By extension, such knowledge may help us understand the underlying genetic components involved in the initiation and progression of these cancers. We describe the development of a genome-wide map of human bladder cancer that tracks its progression from in situ precursor conditions to invasive disease. Testing for allelic losses using a genome-wide panel of 787 microsatellite markers was performed on multiple DNA samples, extracted from the entire mucosal surface of the bladder and corresponding to normal urothelium, in situ preneoplastic lesions, and invasive carcinoma. Using this approach, we matched the clonal allelic losses in distinct chromosomal regions to specific phases of bladder neoplasia and produced a detailed genetic map of bladder cancer development. These analyses revealed three major waves of genetic changes associated with growth advantages of successive clones and reflecting a stepwise conversion of normal urothelial cells into cancer cells. The genetic changes map to six regions at 3q22-q24, 5q22-q31, 9q21-q22, 10q26, 13q14, and 17p13, which may represent critical hits driving the development of bladder cancer. Finally, we performed high-resolution mapping using single nucleotide polymorphism markers within one region on chromosome 13q14, containing the model tumor suppressor gene RB1, and defined a minimal deleted region associated with clonal expansion of in situ neoplasia. 
These analyses provided new insights on the involvement of several non-coding sequences mapping to the region and identified

  19. Integrated Genome-Based Studies of Shewanella Ecophysiology

    Energy Technology Data Exchange (ETDEWEB)

    Andrei L. Osterman, Ph.D.

    2012-12-17

    Integration of bioinformatics and experimental techniques was applied to mapping and characterization of the key components (pathways, enzymes, transporters, regulators) of the core metabolic machinery in Shewanella oneidensis and related species, with the main focus on metabolic and regulatory pathways involved in utilization of various carbon and energy sources. Among the main accomplishments reflected in ten joint publications with other participants of the Shewanella Federation are: (i) A systems-level reconstruction of carbohydrate utilization pathways in the genus Shewanella (19 species). This analysis yielded reconstruction of 18 sugar utilization pathways including 10 novel pathway variants and prediction of > 60 novel protein families of enzymes, transporters and regulators involved in these pathways. Selected functional predictions were verified by focused biochemical and genetic experiments. Observed growth phenotypes were consistent with bioinformatic predictions, providing strong validation of the technology; and (ii) Global genomic reconstruction of transcriptional regulons in 16 Shewanella genomes. The inferred regulatory network includes 82 transcription factors, 8 riboswitches and 6 translational attenuators. Of those, 45 regulons were inferred directly from the genome context analysis, whereas others were propagated from previously characterized regulons in other species. Selected regulatory predictions were experimentally tested. Integration of this analysis with microarray data revealed overall consistency and provided an additional layer of interactions between regulons. All the results were captured in the new database RegPrecise, which is a joint development with the LBNL team. A more detailed analysis of the individual subsystems, pathways and regulons in Shewanella spp included bioinformatics-based prediction and experimental characterization of: (i) N-Acetylglucosamine catabolic pathway; (ii) Lactate utilization machinery; (iii) Novel Nrt

  20. Molecular mapping and genomics of soybean seed protein: a review and perspective for the future.

    Science.gov (United States)

    Patil, Gunvant; Mian, Rouf; Vuong, Tri; Pantalone, Vince; Song, Qijian; Chen, Pengyin; Shannon, Grover J; Carter, Tommy C; Nguyen, Henry T

    2017-10-01

    Genetic improvement of soybean protein meal is a complex process because of negative correlation with oil, yield, and temperature. This review describes the progress in mapping and genomics, identifies knowledge gaps, and highlights the need of integrated approaches. Meal protein derived from soybean [Glycine max (L) Merr.] seed is the primary source of protein in poultry and livestock feed. Protein is a key factor that determines the nutritional and economical value of soybean. Genetic improvement of soybean seed protein content is highly desirable, and major quantitative trait loci (QTL) for soybean protein have been detected and repeatedly mapped on chromosomes (Chr.) 20 (LG-I), and 15 (LG-E). However, practical breeding progress is challenging because of seed protein content's negative genetic correlation with seed yield, other seed components such as oil and sucrose, and interaction with environmental effects such as temperature during seed development. In this review, we discuss rate-limiting factors related to soybean protein content and nutritional quality, and potential control factors regulating seed storage protein. In addition, we describe advances in next-generation sequencing technologies for precise detection of natural variants and their integration with conventional and high-throughput genotyping technologies. A syntenic analysis of QTL on Chr. 15 and 20 was performed. Finally, we discuss comprehensive approaches for integrating protein and amino acid QTL, genome-wide association studies, whole-genome resequencing, and transcriptome data to accelerate identification of genomic hot spots for allele introgression and soybean meal protein improvement.

  1. Comparative genome analysis and resistance gene mapping in grain legumes

    International Nuclear Information System (INIS)

    Young, N.D.

    1998-01-01

    Using DNA markers and genome organization, several important disease resistance genes have been analyzed in mungbean (Vigna radiata), cowpea (Vigna unguiculata), common bean (Phaseolus vulgaris), and soybean (Glycine max). In the process, medium-density linkage maps consisting of restriction fragment length polymorphism (RFLP) markers were constructed for both mungbean and cowpea. Comparisons between these maps, as well as the maps of soybean and common bean, indicate that there is significant conservation of DNA marker order, though the conserved blocks in soybean are much shorter than in the others. DNA mapping results also indicate that a gene for seed weight may be conserved between mungbean and cowpea. Using the linkage maps, genes that control bruchid (genus Callosobruchus) and powdery mildew (Erysiphe polygoni) resistance in mungbean, aphid (Aphis craccivora) resistance in cowpea, and cyst nematode (Heterodera glycines) resistance in soybean have all been mapped and characterized. For some of these traits, resistance was found to be oligogenic, and DNA mapping uncovered multiple genes involved in the phenotype. (author)

  2. MycoCosm, an Integrated Fungal Genomics Resource

    Energy Technology Data Exchange (ETDEWEB)

    Shabalov, Igor; Grigoriev, Igor

    2012-03-16

    MycoCosm is a web-based interactive fungal genomics resource, which was first released in March 2010, in response to an urgent call from the fungal community for integration of all fungal genomes and analytical tools in one place (Pan-fungal data resources meeting, Feb 21-22, 2010, Alexandria, VA). MycoCosm integrates genomics data and analysis tools to navigate through over 100 fungal genomes sequenced at JGI and elsewhere. This resource allows users to explore fungal genomes in the context of both genome-centric analysis and comparative genomics, and promotes user community participation in data submission, annotation and analysis. MycoCosm has over 4500 unique visitors/month or 35000+ visitors/year as well as hundreds of registered users contributing their data and expertise to this resource. Its scalable architecture allows significant expansion of the data expected from JGI Fungal Genomics Program, its users, and integration with external resources used by fungal community.

  3. The Proteins API: accessing key integrated protein and genome information.

    Science.gov (United States)

    Nightingale, Andrew; Antunes, Ricardo; Alpi, Emanuele; Bursteinas, Borisas; Gonzales, Leonardo; Liu, Wudong; Luo, Jie; Qi, Guoying; Turner, Edd; Martin, Maria

    2017-07-03

    The Proteins API provides searching and programmatic access to protein and associated genomics data such as curated protein sequence positional annotations from UniProtKB, as well as mapped variation and proteomics data from large scale data sources (LSS). Using the coordinates service, researchers are able to retrieve the genomic sequence coordinates for proteins in UniProtKB. The LSS genomics and proteomics data for UniProt proteins are programmatically available only through this service. A Swagger UI has been implemented to provide documentation and an interface that lets users with little or no programming experience 'talk' to the services, quickly and easily formulate queries, and obtain dynamically generated source code for popular programming languages, such as Java, Perl, Python and Ruby. Search results are returned as standard JSON, XML or GFF data objects. The Proteins API is a scalable, reliable, fast, easy-to-use set of RESTful services that provides a broad protein information resource, allowing users to ask questions based upon their field of expertise and to gain an integrated overview of the protein annotations available, which aids their understanding of proteins in biological processes. The Proteins API is available at (http://www.ebi.ac.uk/proteins/api/doc). © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
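
    A request to the coordinates service described above might be prepared as follows. This is a minimal sketch, not taken from the abstract: the base URL, the `/coordinates/{accession}` path and the use of an `Accept` header to select JSON are assumptions to be checked against the Swagger documentation at the address given above, and the helper name and the example accession are purely illustrative. The code only builds the request; it does not contact the service.

```python
import urllib.parse
import urllib.request

# Assumed base URL and path layout; verify against the service's Swagger UI.
BASE_URL = "https://www.ebi.ac.uk/proteins/api"


def build_coordinates_request(accession: str) -> urllib.request.Request:
    """Prepare (but do not send) a GET request for one protein accession.

    The '/coordinates/{accession}' path and the JSON 'Accept' header are
    assumptions for illustration, not details stated in the abstract.
    """
    accession = accession.strip()
    if not accession.isalnum():
        raise ValueError("accession should be a non-empty alphanumeric string")
    url = f"{BASE_URL}/coordinates/{urllib.parse.quote(accession)}"
    return urllib.request.Request(url, headers={"Accept": "application/json"})


if __name__ == "__main__":
    request = build_coordinates_request("P12345")   # illustrative accession
    print(request.get_method(), request.full_url)
    # To actually query the service, one could pass the prepared request to
    # urllib.request.urlopen(request) and parse the body with json.loads().
```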

  4. Characterization of Equine Infectious Anemia Virus Integration in the Horse Genome

    Directory of Open Access Journals (Sweden)

    Qiang Liu

    2015-06-01

    Full Text Available Human immunodeficiency virus (HIV-1) has a unique integration profile in the human genome relative to murine and avian retroviruses. Equine infectious anemia virus (EIAV) is another well-studied lentivirus that can also be used as a promising retro-transfection vector, but its integration into its native host has not been characterized. In this study, we mapped 477 integration sites of the EIAV strain EIAVFDDV13 in fetal equine dermal (FED) cells during in vitro infection. Published integration sites of EIAV and HIV-1 in the human genome were also analyzed as references. Our results demonstrated that EIAVFDDV13 tended to integrate into genes and AT-rich regions, and it avoided integrating into transcription start sites (TSS), which is consistent with EIAV and HIV-1 integration in the human genome. Notably, the integration of EIAVFDDV13 favored long interspersed elements (LINEs) and DNA transposons in the horse genome, whereas the integration of HIV-1 favored short interspersed elements (SINEs) in the human genome. The chromosomal environment near LINEs or DNA transposons potentially influences viral transcription and may be related to the unique EIAV latency states in equids. The data on EIAV integration in its natural host will facilitate studies on lentiviral infection and lentivirus-based therapeutic vectors.

  5. IMG: the integrated microbial genomes database and comparative analysis system

    Science.gov (United States)

    Markowitz, Victor M.; Chen, I-Min A.; Palaniappan, Krishna; Chu, Ken; Szeto, Ernest; Grechkin, Yuri; Ratner, Anna; Jacob, Biju; Huang, Jinghua; Williams, Peter; Huntemann, Marcel; Anderson, Iain; Mavromatis, Konstantinos; Ivanova, Natalia N.; Kyrpides, Nikos C.

    2012-01-01

    The Integrated Microbial Genomes (IMG) system serves as a community resource for comparative analysis of publicly available genomes in a comprehensive integrated context. IMG integrates publicly available draft and complete genomes from all three domains of life with a large number of plasmids and viruses. IMG provides tools and viewers for analyzing and reviewing the annotations of genes and genomes in a comparative context. IMG's data content and analytical capabilities have been continuously extended through regular updates since its first release in March 2005. IMG is available at http://img.jgi.doe.gov. Companion IMG systems provide support for expert review of genome annotations (IMG/ER: http://img.jgi.doe.gov/er), teaching courses and training in microbial genome analysis (IMG/EDU: http://img.jgi.doe.gov/edu) and analysis of genomes related to the Human Microbiome Project (IMG/HMP: http://www.hmpdacc-resources.org/img_hmp). PMID:22194640

  6. Development of genomic SSR markers for fingerprinting lettuce (Lactuca sativa L.) cultivars and mapping genes.

    Science.gov (United States)

    Rauscher, Gilda; Simko, Ivan

    2013-01-22

    Lettuce (Lactuca sativa L.) is the major crop from the group of leafy vegetables. Several types of molecular markers were developed that are effectively used in lettuce breeding and genetic studies. However, only a very limited number of microsatellite-based markers are publicly available. We have employed the method of enriched microsatellite libraries to develop 97 genomic SSR markers. Testing of newly developed markers on a set of 36 Lactuca accessions (33 L. sativa, and one of each L. serriola L., L. saligna L., and L. virosa L.) revealed that both the genetic heterozygosity (UHe = 0.56) and the number of loci per SSR (Na = 5.50) are significantly higher for genomic SSR markers than for previously developed EST-based SSR markers (UHe = 0.32, Na = 3.56). Fifty-four genomic SSR markers were placed on the molecular linkage map of lettuce. Distribution of markers in the genome appeared to be random, with the exception of a possible cluster on linkage group 6. Any combination of 32 genomic SSRs was able to distinguish genotypes of all 36 accessions. Fourteen of the newly developed SSR markers originate from fragments with high sequence similarity to resistance gene candidates (RGCs) and RGC pseudogenes. Analysis of molecular variance (AMOVA) of L. sativa accessions showed that approximately 3% of genetic diversity was within accessions, 79% among accessions, and 18% among horticultural types. The newly developed genomic SSR markers were added to the pool of previously developed EST-SSR markers. These two types of SSR-based markers provide useful tools for lettuce cultivar fingerprinting, development of integrated molecular linkage maps, and mapping of genes.
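
    The UHe statistic reported above is usually read as unbiased expected heterozygosity, which applies a small-sample correction to He = 1 − Σ pᵢ². The sketch below assumes that conventional definition, UHe = (n / (n − 1)) · He with n the number of allele copies sampled (2N for N diploid individuals). The abstract does not state the formula or software used, and the function names and example counts are hypothetical.

```python
from typing import Sequence


def unbiased_expected_heterozygosity(allele_counts: Sequence[int]) -> float:
    """Unbiased expected heterozygosity at one locus from allele copy counts.

    allele_counts[i] is the number of sampled copies of allele i, so the
    counts sum to n (2N for N diploid individuals).
    """
    n = sum(allele_counts)
    if n < 2 or any(count < 0 for count in allele_counts):
        raise ValueError("need at least two sampled allele copies, no negatives")
    expected_het = 1.0 - sum((count / n) ** 2 for count in allele_counts)
    return (n / (n - 1)) * expected_het   # small-sample correction


def observed_allele_number(allele_counts: Sequence[int]) -> int:
    """Number of distinct alleles actually observed at the locus."""
    return sum(1 for count in allele_counts if count > 0)


if __name__ == "__main__":
    counts = [30, 20, 14, 8]              # illustrative counts, not study data
    print(round(unbiased_expected_heterozygosity(counts), 3))
    print(observed_allele_number(counts))
```

    Averaging these per-locus values across markers would give summary figures of the kind quoted in the abstract.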

  7. Structural variation in two human genomes mapped at single-nucleotide resolution by whole genome de novo assembly

    DEFF Research Database (Denmark)

    Li, Yingrui; Zheng, Hancheng; Luo, Ruibang

    2011-01-01

    Here we use whole-genome de novo assembly of second-generation sequencing reads to map structural variation (SV) in an Asian genome and an African genome. Our approach identifies small- and intermediate-size homozygous variants (1-50 kb) including insertions, deletions, inversions and their precise...

  8. An autotetraploid linkage map of rose (Rosa hybrida) validated using the strawberry (Fragaria vesca) genome sequence.

    Science.gov (United States)

    Gar, Oron; Sargent, Daniel J; Tsai, Ching-Jung; Pleban, Tzili; Shalev, Gil; Byrne, David H; Zamir, Dani

    2011-01-01

    Polyploidy is a pivotal process in plant evolution as it increases gene redundancy and morphological intricacy, but due to the complexity of polysomic inheritance we have only a few genetic maps of autopolyploid organisms. A robust mapping framework is particularly important in polyploid crop species, rose included (2n = 4x = 28), where the objective is to study multiallelic interactions that control traits of value for plant breeding. From a cross between the garden, peach red and fragrant cultivar Fragrant Cloud (FC) and a cut-rose yellow cultivar Golden Gate (GG), we generated an autotetraploid GGFC mapping population consisting of 132 individuals. For the map we used 128 sequence-based markers, 141 AFLP, 86 SSR and three morphological markers. Seven linkage groups were resolved for FC (Total 632 cM) and GG (616 cM) which were validated by markers that segregated in both parents as well as the diploid integrated consensus map. The release of the Fragaria vesca genome, which also belongs to the Rosoideae, allowed us to place 70 rose sequenced markers on the seven strawberry pseudo-chromosomes. Synteny between Rosa and Fragaria was high with an estimated four major translocations and six inversions required to place the 17 non-collinear markers in the same order. Based on a verified linear order of the rose markers, we could further partition each of the parents into its four homologous groups, thus providing an essential framework to aid the sequencing of an autotetraploid genome.

  9. Molecular biologists backing effort to map entire human genome

    International Nuclear Information System (INIS)

    Zurer, P.S.

    1988-01-01

    This article discusses how the program to map and sequence the human genome will be managed. The National Research Council (NRC) recommends that a 15-year, $200-million-a-year effort to map all human genes should begin immediately. However, some people have balked at the idea, saying it is a ploy to raise money. Part of the skeptics' uneasiness stems from the involvement of the Department of Energy (DOE), an agency not often linked with biological research. The DOE's interest arises from its commitment to understanding the biological effects of nuclear radiation. Critics say it is a budget-boosting tactic. This article explains some of the arguments for and against the project and explains exactly what it would involve.

  10. The European Renal Genome Project: An Integrated Approach Towards Understanding the Genetics of Kidney Development and Disease

    OpenAIRE

    Willnow, TE; Antignac, C; Brändli, AW; Christensen, EI; Cox, RD; Davidson, D; Davies, JA; Devuyst, O; Eichele, G; Hastie, ND; Verroust, PJ; Schedl, A; Meij, IC

    2005-01-01

    Rapid progress in genome research creates a wealth of information on the functional annotation of mammalian genome sequences. However, as we accumulate large amounts of scientific information we are facing problems of how to integrate and relate the data produced by various genomic approaches. Here, we propose the novel concept of an organ atlas where diverse data from expression maps to histological findings to mutant phenotypes can be queried, compared and visualized in the context of a thr...

  11. A comparative map viewer integrating genetic maps for Brassica and Arabidopsis

    Directory of Open Access Journals (Sweden)

    Erwin Timothy A

    2007-07-01

    Background: Molecular genetic maps provide a means to link heritable traits with underlying genome sequence variation. Several genetic maps have been constructed for Brassica species, yet to date, there has been no simple means to compare this information or to associate mapped traits with the genome sequence of the related model plant, Arabidopsis. Description: We have developed a comparative genetic map database for the viewing, comparison and analysis of Brassica and Arabidopsis genetic, physical and trait map information. This web-based tool allows users to view and compare genetic and physical maps, search for traits and markers, and compare genetic linkage groups within and between the amphidiploid and diploid Brassica genomes. The inclusion of Arabidopsis data enables comparison between Brassica maps that share no common markers. Analysis of conserved syntenic blocks between Arabidopsis and collated Brassica genetic maps validates the application of this system. This tool is freely available over the internet at http://bioinformatics.pbcbasc.latrobe.edu.au/cmap. Conclusion: This database enables users to interrogate the relationship between Brassica genetic maps and the sequenced genome of A. thaliana, permitting the comparison of genetic linkage groups and mapped traits and the rapid identification of candidate genes.

  12. An integrated genetic map based on four mapping populations and quantitative trait loci associated with economically important traits in watermelon (Citrullus lanatus)

    Science.gov (United States)

    2014-01-01

    Background: Modern watermelon (Citrullus lanatus L.) cultivars share a narrow genetic base due to many years of selection for desirable horticultural qualities. Wild subspecies within C. lanatus are important potential sources of novel alleles for watermelon breeding, but trait introgression into elite cultivars has had limited success. The application of marker assisted selection (MAS) in watermelon is yet to be realized, mainly due to the past lack of high quality genetic maps. Recently, a number of useful maps have become available; however, these maps have few common markers and were constructed using different marker sets, making integration and comparative analysis among maps difficult. The objective of this research was to use single-nucleotide polymorphism (SNP) anchor markers to construct an integrated genetic map for C. lanatus. Results: Under the framework of the high density genetic map, an integrated genetic map was constructed by merging data from four independent mapping experiments using a genetically diverse array of parental lines, which included three subspecies of watermelon. The 698 simple sequence repeat (SSR), 219 insertion-deletion (InDel), 36 structure variation (SV) and 386 SNP markers from the four maps were used to construct an integrated map. This integrated map contained 1339 markers, spanning 798 cM with an average marker interval of 0.6 cM. Fifty-eight previously reported quantitative trait loci (QTL) for 12 traits in these populations were also integrated into the map. In addition, new QTL identified for brix, fructose, glucose and sucrose were added. Some QTL associated with economically important traits detected in different genetic backgrounds mapped to similar genomic regions of the integrated map, suggesting that such QTL are responsible for the phenotypic variability observed in a broad array of watermelon germplasm. Conclusions: The integrated map described herein enhances the utility of genomic tools over

  13. An integrated genetic map based on four mapping populations and quantitative trait loci associated with economically important traits in watermelon (Citrullus lanatus).

    Science.gov (United States)

    Ren, Yi; McGregor, Cecilia; Zhang, Yan; Gong, Guoyi; Zhang, Haiying; Guo, Shaogui; Sun, Honghe; Cai, Wantao; Zhang, Jie; Xu, Yong

    2014-01-20

    Modern watermelon (Citrullus lanatus L.) cultivars share a narrow genetic base due to many years of selection for desirable horticultural qualities. Wild subspecies within C. lanatus are important potential sources of novel alleles for watermelon breeding, but trait introgression into elite cultivars has had limited success. The application of marker assisted selection (MAS) in watermelon is yet to be realized, mainly due to the past lack of high quality genetic maps. Recently, a number of useful maps have become available; however, these maps have few common markers and were constructed using different marker sets, making integration and comparative analysis among maps difficult. The objective of this research was to use single-nucleotide polymorphism (SNP) anchor markers to construct an integrated genetic map for C. lanatus. Under the framework of the high density genetic map, an integrated genetic map was constructed by merging data from four independent mapping experiments using a genetically diverse array of parental lines, which included three subspecies of watermelon. The 698 simple sequence repeat (SSR), 219 insertion-deletion (InDel), 36 structure variation (SV) and 386 SNP markers from the four maps were used to construct an integrated map. This integrated map contained 1339 markers, spanning 798 cM with an average marker interval of 0.6 cM. Fifty-eight previously reported quantitative trait loci (QTL) for 12 traits in these populations were also integrated into the map. In addition, new QTL identified for brix, fructose, glucose and sucrose were added. Some QTL associated with economically important traits detected in different genetic backgrounds mapped to similar genomic regions of the integrated map, suggesting that such QTL are responsible for the phenotypic variability observed in a broad array of watermelon germplasm. The integrated map described herein enhances the utility of genomic tools over previous watermelon genetic maps.
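    The marker totals and the average interval quoted in the watermelon abstract can be checked with a few lines of arithmetic. The sketch below assumes that the average marker interval is simply the map length divided by the number of markers; the abstract does not state how the figure was computed, and the variable names are illustrative only.

```python
# Marker counts by type, as reported in the abstract.
marker_counts = {"SSR": 698, "InDel": 219, "SV": 36, "SNP": 386}

# Total genetic length of the integrated map in centimorgans (from the abstract).
map_length_cm = 798.0

# Sum the four marker classes to get the size of the integrated map.
total_markers = sum(marker_counts.values())

# Assumption: mean spacing = map length / number of markers.
mean_interval_cm = map_length_cm / total_markers

print(total_markers, round(mean_interval_cm, 1))
```

    Under that assumption the four marker classes sum to the reported 1339 markers and the spacing rounds to the reported 0.6 cM.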

  14. PolyTB: A genomic variation map for Mycobacterium tuberculosis

    KAUST Repository

    Coll, Francesc

    2014-02-15

    Tuberculosis (TB) caused by Mycobacterium tuberculosis (Mtb) is the second major cause of death from an infectious disease worldwide. Recent advances in DNA sequencing are leading to the ability to generate whole genome information in clinical isolates of M. tuberculosis complex (MTBC). The identification of informative genetic variants such as phylogenetic markers and those associated with drug resistance or virulence will help barcode Mtb in the context of epidemiological, diagnostic and clinical studies. Mtb genomic datasets are increasingly available as raw sequences, which are potentially difficult and computer intensive to process, and compare across studies. Here we have processed the raw sequence data (>1500 isolates, eight studies) to compile a catalogue of SNPs (n = 74,039, 63% non-synonymous, 51.1% in more than one isolate, i.e. non-private), small indels (n = 4810) and larger structural variants (n = 800). We have developed the PolyTB web-based tool (http://pathogenseq.lshtm.ac.uk/polytb) to visualise the resulting variation and important meta-data (e.g. in silico inferred strain-types, location) within geographical map and phylogenetic views. This resource will allow researchers to identify polymorphisms within candidate genes of interest, as well as examine the genomic diversity and distribution of strains. PolyTB source code is freely available to researchers wishing to develop similar tools for their pathogen of interest. 2014 Elsevier Ltd. All rights reserved.

  15. Integrating cancer genomic data into electronic health records

    Directory of Open Access Journals (Sweden)

    Jeremy L. Warner

    2016-10-01

    The rise of genomically targeted therapies and immunotherapy has revolutionized the practice of oncology in the last 10–15 years. At the same time, new technologies and the electronic health record (EHR) in particular have permeated the oncology clinic. Initially designed as billing and clinical documentation systems, EHR systems have not anticipated the complexity and variety of genomic information that needs to be reviewed, interpreted, and acted upon on a daily basis. Improved integration of cancer genomic data with EHR systems will help guide clinician decision making, support secondary uses, and ultimately improve patient care within oncology clinics. Some of the key factors relating to the challenge of integrating cancer genomic data into EHRs include: the bioinformatics pipelines that translate raw genomic data into meaningful, actionable results; the role of human curation in the interpretation of variant calls; and the need for consistent standards with regard to genomic and clinical data. Several emerging paradigms for integration are discussed in this review, including: non-standardized efforts between individual institutions and genomic testing laboratories; “middleware” products that portray genomic information, albeit outside of the clinical workflow; and application programming interfaces that have the potential to work within clinical workflow. The critical need for clinical-genomic knowledge bases, which can be independent or integrated into the aforementioned solutions, is also discussed.

  16. Integrated physical map of bread wheat chromosome arm 7DS to facilitate gene cloning and comparative studies.

    Science.gov (United States)

    Tulpová, Zuzana; Luo, Ming-Cheng; Toegelová, Helena; Visendi, Paul; Hayashi, Satomi; Vojta, Petr; Paux, Etienne; Kilian, Andrzej; Abrouk, Michaël; Bartoš, Jan; Hajdúch, Marián; Batley, Jacqueline; Edwards, David; Doležel, Jaroslav; Šimková, Hana

    2018-03-08

    Bread wheat (Triticum aestivum L.) is a staple food for a significant part of the world's population. The growing demand on its production can be satisfied by improving yield and resistance to biotic and abiotic stress. Knowledge of the genome sequence would aid in discovering genes and QTLs underlying these traits and provide a basis for genomics-assisted breeding. Physical maps and BAC clones associated with them have been valuable resources from which to generate a reference genome of bread wheat and to assist map-based gene cloning. As a part of a joint effort coordinated by the International Wheat Genome Sequencing Consortium, we have constructed a BAC-based physical map of bread wheat chromosome arm 7DS consisting of 895 contigs and covering 94% of its estimated length. By anchoring BAC contigs to one radiation hybrid map and three high resolution genetic maps, we assigned 73% of the assembly to a distinct genomic position. This map integration, interconnecting a total of 1713 markers with ordered and sequenced BAC clones from a minimal tiling path, provides a tool to speed up gene cloning in wheat. The process of physical map assembly included the integration of the 7DS physical map with a whole-genome physical map of Aegilops tauschii and a 7DS Bionano genome map, which together enabled efficient scaffolding of physical-map contigs, even in the non-recombining region of the genetic centromere. Moreover, this approach facilitated a comparison of bread wheat and its ancestor at BAC-contig level and revealed a reconstructed region in the 7DS pericentromere. Copyright © 2018. Published by Elsevier B.V.

  17. G-InforBIO: integrated system for microbial genomics

    Directory of Open Access Journals (Sweden)

    Abe Takashi

    2006-08-01

    Background: Genome databases contain diverse kinds of information, including gene annotations and nucleotide and amino acid sequences. It is not easy to integrate such information for genomic study. Because there are few tools for integrated analyses of genomic data, we developed software that enables users to handle, manipulate, and analyze genome data with a variety of sequence analysis programs. Results: The G-InforBIO system is a novel tool for genome data management and sequence analysis. The system can import genome data encoded as eXtensible Markup Language documents as formatted text documents, including annotations and sequences, from DNA Data Bank of Japan and GenBank encoded as flat files. The genome database is constructed automatically after importing, and the database can be exported as documents formatted with eXtensible Markup Language or tab-delimited text. Users can retrieve data from the database by keyword searches, edit annotation data of genes, and process data with G-InforBIO. In addition, information in the G-InforBIO database can be analyzed seamlessly with nine different software programs, including programs for clustering and homology analyses. Conclusion: The G-InforBIO system simplifies genome analyses by integrating several available software programs to allow efficient handling and manipulation of genome data. G-InforBIO is freely available from the download site.

  18. Single-molecule optical genome mapping of a human HapMap and a colorectal cancer cell line.

    Science.gov (United States)

    Teo, Audrey S M; Verzotto, Davide; Yao, Fei; Nagarajan, Niranjan; Hillmer, Axel M

    2015-01-01

    Next-generation sequencing (NGS) technologies have changed our understanding of the variability of the human genome. However, the identification of genome structural variations based on NGS approaches with read lengths of 35-300 bases remains a challenge. Single-molecule optical mapping technologies allow the analysis of DNA molecules of up to 2 Mb and as such are suitable for the identification of large-scale genome structural variations, and for de novo genome assemblies when combined with short-read NGS data. Here we present optical mapping data for two human genomes: the HapMap cell line GM12878 and the colorectal cancer cell line HCT116. High molecular weight DNA was obtained by embedding GM12878 and HCT116 cells, respectively, in agarose plugs, followed by DNA extraction under mild conditions. Genomic DNA was digested with KpnI and 310,000 and 296,000 DNA molecules (≥ 150 kb and 10 restriction fragments), respectively, were analyzed per cell line using the Argus optical mapping system. Maps were aligned to the human reference by OPTIMA, a new glocal alignment method. Genome coverage of 6.8× and 5.7× was obtained, respectively; 2.9× and 1.7× more than the coverage obtained with previously available software. Optical mapping allows the resolution of large-scale structural variations of the genome, and the scaffold extension of NGS-based de novo assemblies. OPTIMA is an efficient new alignment method; our optical mapping data provide a resource for genome structure analyses of the human HapMap reference cell line GM12878, and the colorectal cancer cell line HCT116.

  19. Brassica ASTRA: an integrated database for Brassica genomic research.

    Science.gov (United States)

    Love, Christopher G; Robinson, Andrew J; Lim, Geraldine A C; Hopkins, Clare J; Batley, Jacqueline; Barker, Gary; Spangenberg, German C; Edwards, David

    2005-01-01

    Brassica ASTRA is a public database for genomic information on Brassica species. The database incorporates expressed sequences with Swiss-Prot and GenBank comparative sequence annotation as well as secondary Gene Ontology (GO) annotation derived from the comparison with Arabidopsis TAIR GO annotations. Simple sequence repeat molecular markers are identified within resident sequences and mapped onto the closely related Arabidopsis genome sequence. Bacterial artificial chromosome (BAC) end sequences derived from the Multinational Brassica Genome Project are also mapped onto the Arabidopsis genome sequence enabling users to identify candidate Brassica BACs corresponding to syntenic regions of Arabidopsis. This information is maintained in a MySQL database with a web interface providing the primary means of interrogation. The database is accessible at http://hornbill.cspp.latrobe.edu.au.

  20. Integrated proteomic and genomic analysis of colorectal cancer

    Science.gov (United States)

    Investigators who analyzed 95 human colorectal tumor samples have determined how gene alterations identified in previous analyses of the same samples are expressed at the protein level. The integration of proteomic and genomic data, or proteogenomics, pro

  1. Integrated Genome-Based Studies of Shewanella Ecophysiology

    Energy Technology Data Exchange (ETDEWEB)

    Zhou, Jizhong [Univ. of Oklahoma, Norman, OK (United States)]; He, Zhili [Univ. of Oklahoma, Norman, OK (United States)]

    2014-04-08

    As a part of the Shewanella Federation project, we have used integrated genomic, proteomic and computational technologies to study various aspects of energy metabolism of two Shewanella strains from a systems-level perspective.

  2. The UK Human Genome Mapping Project online computing service.

    Science.gov (United States)

    Rysavy, F R; Bishop, M J; Gibbs, G P; Williams, G W

    1992-04-01

    This paper presents an overview of computing and networking facilities developed by the Medical Research Council to provide online computing support to the Human Genome Mapping Project (HGMP) in the UK. The facility is connected to a number of other computing facilities in various centres of genetics and molecular biology research excellence, either directly via high-speed links or through national and international wide-area networks. The paper describes the design and implementation of the current system, a 'client/server' network of Sun, IBM, DEC and Apple servers, gateways and workstations. A short outline of online computing services currently delivered by this system to the UK human genetics research community is also provided. More information about the services and their availability could be obtained by a direct approach to the UK HGMP-RC.

  3. Genome to Phenome Mapping in Apple Using Historical Data

    Directory of Open Access Journals (Sweden)

    Zoë Migicovsky

    2016-07-01

    Apple (Malus × domestica Borkh.) is one of the world’s most valuable fruit crops. Its large size and long juvenile phase make it a particularly promising candidate for marker-assisted selection (MAS). However, advances in MAS in apple have been limited by a lack of phenotype and genotype data from sufficiently large samples. To establish genotype-phenotype relationships and advance MAS in apple, we extracted over 24,000 phenotype scores from the USDA-Germplasm Resources Information Network (GRIN) database and linked them with over 8000 single nucleotide polymorphisms (SNPs) from 689 apple accessions from the USDA apple germplasm collection clonally preserved in Geneva, NY. We find significant genetic differentiation between Old World and New World cultivars and demonstrate that the genetic structure of the domesticated apple also reflects the time required for ripening. A genome-wide association study (GWAS) of 36 phenotypes confirms the association between fruit color and the MYB1 locus, and we also report a novel association between the transcription factor, NAC18.1, and harvest date and fruit firmness. We demonstrate that harvest time and fruit size can be predicted with relatively high accuracies (> 0.46) using genomic prediction. Rapid decay of linkage disequilibrium (LD) in apples means millions of SNPs may be required for well-powered GWAS. However, rapid LD decay also promises to enable extremely high resolution mapping of causal variants, which holds great potential for advancing MAS.

  4. Comparison of HapMap and 1000 Genomes Reference Panels in a Large-Scale Genome-Wide Association Study

    DEFF Research Database (Denmark)

    de Vries, Paul S; Sabater-Lleal, Maria; Chasman, Daniel I

    2017-01-01

    An increasing number of genome-wide association (GWA) studies are now using the higher resolution 1000 Genomes Project reference panel (1000G) for imputation, with the expectation that 1000G imputation will lead to the discovery of additional associated loci when compared to HapMap imputation. In...

  5. A physical map for the Amborella trichopoda genome sheds light on the evolution of angiosperm genome structure

    OpenAIRE

    Zuccolo, Andrea; Bowers, John E; Estill, James C; Xiong, Zhiyong; Luo, Meizhong; Sebastian, Aswathy; Goicoechea, José Luis; Collura, Kristi; Yu, Yeisoo; Jiao, Yuannian; Duarte, Jill; Tang, Haibao; Ayyampalayam, Saravanaraj; Rounsley, Steve; Kudrna, Dave

    2011-01-01

    Background Recent phylogenetic analyses have identified Amborella trichopoda, an understory tree species endemic to the forests of New Caledonia, as sister to a clade including all other known flowering plant species. The Amborella genome is a unique reference for understanding the evolution of angiosperm genomes because it can serve as an outgroup to root comparative analyses. A physical map, BAC end sequences and sample shotgun sequences provide a first view of the 870 Mbp Amborella genome....

  6. Rotation number of integrable symplectic mappings of the plane

    Energy Technology Data Exchange (ETDEWEB)

    Zolkin, Timofey [Fermilab]; Nagaitsev, Sergei [Fermilab]; Danilov, Viatcheslav [Oak Ridge]

    2017-04-11

    Symplectic mappings are discrete-time analogs of Hamiltonian systems. They appear in many areas of physics, including, for example, accelerators, plasma, and fluids. Integrable mappings, a subclass of symplectic mappings, are equivalent to a Twist map, with a rotation number, constant along the phase trajectory. In this letter, we propose a succinct expression to determine the rotation number and present two examples. Similar to the period of the bounded motion in Hamiltonian systems, the rotation number is the most fundamental property of integrable maps and it provides a way to analyze the phase-space dynamics.
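    To make the notion of a rotation number concrete, the sketch below estimates it by brute force for the simplest integrable case, a linear area-preserving map q' = p, p' = -q + a*p with |a| < 2. This is not the expression proposed in the letter; the map, the function name and the parameters are chosen here purely for illustration. The estimate is the average angle by which a phase-space point winds around the origin per iteration, in units of full turns, and it can be compared with the known value arccos(a/2)/(2*pi) for this linear map.

```python
import math


def rotation_number(a, n_iter=6000, q=1.0, p=0.0):
    """Estimate the rotation number of the linear map q' = p, p' = -q + a*p.

    The map has unit determinant (it is area-preserving) and, for |a| < 2,
    every orbit lies on an invariant ellipse q^2 - a*q*p + p^2 = const.
    """
    total_angle = 0.0
    for _ in range(n_iter):
        q_next, p_next = p, -q + a * p
        # Signed angle between successive points, from their cross and dot products.
        cross = q * p_next - p * q_next
        dot = q * q_next + p * p_next
        total_angle += math.atan2(cross, dot)
        q, p = q_next, p_next
    # Average winding per iteration in units of full turns; the sign only
    # encodes the direction of rotation, so the magnitude is returned.
    return abs(total_angle) / (2.0 * math.pi * n_iter)


estimate = rotation_number(1.0)
exact = math.acos(1.0 / 2.0) / (2.0 * math.pi)
```

    For a = 1 the orbit closes after six iterations, so the estimate agrees with the exact value of 1/6; for other values of a the brute-force average converges only at a rate of roughly 1/n_iter, which is why a closed-form expression is attractive.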

  7. A map of recent positive selection in the human genome.

    Directory of Open Access Journals (Sweden)

    Benjamin F Voight

    2006-03-01

    The identification of signals of very recent positive selection provides information about the adaptation of modern humans to local conditions. We report here on a genome-wide scan for signals of very recent positive selection in favor of variants that have not yet reached fixation. We describe a new analytical method for scanning single nucleotide polymorphism (SNP) data for signals of recent selection, and apply this to data from the International HapMap Project. In all three continental groups we find widespread signals of recent positive selection. Most signals are region-specific, though a significant excess are shared across groups. Contrary to some earlier low resolution studies that suggested a paucity of recent selection in sub-Saharan Africans, we find that by some measures our strongest signals of selection are from the Yoruba population. Finally, since these signals indicate the existence of genetic variants that have substantially different fitnesses, they must indicate loci that are the source of significant phenotypic variation. Though the relevant phenotypes are generally not known, such loci should be of particular interest in mapping studies of complex traits. For this purpose we have developed a set of SNPs that can be used to tag the strongest approximately 250 signals of recent selection in each population.

  8. Human papilloma viruses and cervical tumours: mapping of integration sites and analysis of adjacent cellular sequences

    International Nuclear Information System (INIS)

    Klimov, Eugene; Vinokourova, Svetlana; Moisjak, Elena; Rakhmanaliev, Elian; Kobseva, Vera; Laimins, Laimonis; Kisseljov, Fjodor; Sulimova, Galina

    2002-01-01

    In cervical tumours the integration of human papilloma viruses (HPV) transcripts often results in the generation of transcripts that consist of hybrids of viral and cellular sequences. Mapping data using a variety of techniques has demonstrated that HPV integration occurred without obvious specificity into the human genome. However, these techniques could not demonstrate whether integration resulted in the generation of transcripts encoding viral or viral-cellular sequences. The aim of this work was to map the integration sites of HPV DNA and to analyse the adjacent cellular sequences. Amplification of the INTs was done by the APOT technique. The APOT products were sequenced according to standard protocols. The analysis of the sequences was performed using the BLASTN program and public databases. To localise the INTs, PCR-based screening of the GeneBridge4-RH-panel was used. Twelve cellular sequences adjacent to integrated HPV16 (INT markers) expressed in squamous cell cervical carcinomas were isolated. For 11 INT markers homologous human genomic sequences were readily identified and 9 of these showed significant homologies to known genes/ESTs. Using the known locations of homologous cDNAs and the RH-mapping techniques, mapping studies showed that the INTs are distributed among different human chromosomes for each tumour sample and are located in regions with high levels of expression. Integration of HPV genomes occurs into different human chromosomes but into regions that contain highly transcribed genes. One interpretation of these studies is that integration of HPV occurs into decondensed regions, which are more accessible for integration of foreign DNA.

  9. An extended anchored linkage map and virtual mapping for the american mink genome based on homology to human and dog

    DEFF Research Database (Denmark)

    Anistoroaei, Razvan Marian; Ansari, S.; Farid, A.

    2009-01-01

    hybridization (FISH) and/or by means of human/dog/mink comparative homology. The average interval between markers is 8.5 cM and the linkage groups collectively span 1340 cM. In addition, 217 and 275 mink microsatellites have been placed on human and dog genomes, respectively. In conjunction with the existing...... comparative human/dog/mink data, these assignments represent useful virtual maps for the American mink genome. Comparison of the current human/dog assembled sequential map with the existing Zoo-FISH-based human/dog/mink maps helped to refine the human/dog/mink comparative map. Furthermore, comparison...... of the human and dog genome assemblies revealed a number of large synteny blocks, some of which are corroborated by data from the mink linkage map....

  10. An Integrative Bioinformatics Framework for Genome-scale Multiple Level Network Reconstruction of Rice

    Directory of Open Access Journals (Sweden)

    Liu Lili

    2013-06-01

    Understanding how metabolic reactions translate the genome of an organism into its phenotype is a grand challenge in biology. Genome-wide association studies (GWAS) statistically connect genotypes to phenotypes, without any recourse to known molecular interactions, whereas a molecular mechanistic description ties gene function to phenotype through gene regulatory networks (GRNs), protein-protein interactions (PPIs) and molecular pathways. Integration of different regulatory information levels of an organism is expected to provide a good way for mapping genotypes to phenotypes. However, the lack of a curated metabolic model of rice is blocking the exploration of genome-scale multi-level network reconstruction. Here, we have merged GRNs, PPIs and genome-scale metabolic networks (GSMNs) approaches into a single framework for rice via omics’ regulatory information reconstruction and integration. Firstly, we reconstructed a genome-scale metabolic model, containing 4,462 function genes, 2,986 metabolites involved in 3,316 reactions, and compartmentalized into ten subcellular locations. Furthermore, 90,358 pairs of protein-protein interactions, 662,936 pairs of gene regulations and 1,763 microRNA-target interactions were integrated into the metabolic model. Eventually, a database was developed for systematically storing and retrieving the genome-scale multi-level network of rice. This provides a reference for understanding the genotype-phenotype relationship of rice, and for analysis of its molecular regulatory network.

  11. Prolonged Integration Site Selection of a Lentiviral Vector in the Genome of Human Keratinocytes.

    Science.gov (United States)

    Qian, Wei; Wang, Yong; Li, Rui-Fu; Zhou, Xin; Liu, Jing; Peng, Dai-Zhi

    2017-03-03

    BACKGROUND Lentiviral vectors have been successfully used for human skin cell gene transfer studies. Defining the selection of integration sites for retroviral vectors in the host genome is crucial in risk assessment analysis of gene therapy. However, genome-wide analyses of lentiviral integration sites in human keratinocytes, especially after prolonged growth, are poorly understood. MATERIAL AND METHODS In this study, 874 unique lentiviral vector integration sites in human HaCaT keratinocytes after long-term culture were identified and analyzed with the online tool GTSG-QuickMap and SPSS software. RESULTS The data indicated that lentiviral vectors showed integration site preferences for genes and gene-rich regions. CONCLUSIONS This study will likely assist in determining the relative risks of the lentiviral vector system and in the design of a safe lentiviral vector system in the gene therapy of skin diseases.

  12. Genomic consequences of selection and genome-wide association mapping in soybean.

    Science.gov (United States)

    Wen, Zixiang; Boyse, John F; Song, Qijian; Cregan, Perry B; Wang, Dechun

    2015-09-03

    Crop improvement always involves selection of specific alleles at genes controlling traits of agronomic importance, likely resulting in detectable signatures of selection within the genome of modern soybean (Glycine max L. Merr.). The identification of these signatures of selection is meaningful from the perspective of evolutionary biology and for uncovering the genetic architecture of agronomic traits. To this end, two populations of soybean, consisting of 342 landraces and 1062 improved lines, were genotyped with the SoySNP50K Illumina BeadChip containing 52,041 single nucleotide polymorphisms (SNPs), and systematically phenotyped for 9 agronomic traits. A cross-population composite likelihood ratio (XP-CLR) method was used to screen the signals of selective sweeps. A total of 125 candidate selection regions were identified, many of which harbored genes potentially involved in crop improvement. To further investigate whether these candidate regions were in fact enriched for genes affected by selection, genome-wide association studies (GWAS) were conducted on 7 selection traits targeted in soybean breeding (grain yield, plant height, lodging, maturity date, seed coat color, seed protein and oil content) and 2 non-selection traits (pubescence and flower color). Major genomic regions associated with selection traits overlapped with candidate selection regions, whereas no overlap of this kind occurred for the non-selection traits, suggesting that the selection sweeps identified are associated with traits of agronomic importance. Multiple novel loci and refined map locations of known loci related to these traits were also identified. These findings illustrate that comparative genomic analyses, especially when combined with GWAS, are a promising approach to dissect the genetic architecture of complex traits.

  13. CGI: Java software for mapping and visualizing data from array-based comparative genomic hybridization and expression profiling.

    Science.gov (United States)

    Gu, Joyce Xiuweu-Xu; Wei, Michael Yang; Rao, Pulivarthi H; Lau, Ching C; Behl, Sanjiv; Man, Tsz-Kwong

    2007-10-06

    With the increasing application of various genomic technologies in biomedical research, there is a need to integrate these data to correlate candidate genes/regions that are identified by different genomic platforms. Although there are tools that can analyze data from individual platforms, essential software for integration of genomic data is still lacking. Here, we present a novel Java-based program called CGI (Cytogenetics-Genomics Integrator) that matches the BAC clones from array-based comparative genomic hybridization (aCGH) to genes from RNA expression profiling datasets. The matching is computed via a fast, backend MySQL database containing UCSC Genome Browser annotations. This program also provides an easy-to-use graphical user interface for visualizing and summarizing the correlation of DNA copy number changes and RNA expression patterns from a set of experiments. In addition, CGI uses a Java applet to display the copy number values of a specific BAC clone in aCGH experiments side by side with the expression levels of genes that are mapped back to that BAC clone from the microarray experiments. The CGI program is built on top of extensible, reusable graphic components specifically designed for biologists. It is cross-platform compatible and the source code is freely available under the General Public License.
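
    The abstract above says that CGI matches BAC clones to genes through a database of UCSC Genome Browser annotations. A minimal sketch of one plausible way to do such matching, by genomic coordinate overlap, is given here. It is not CGI's actual code or schema: the `Interval` type, the function names, and every clone, gene, and coordinate are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """A named genomic interval (hypothetical type, not part of CGI)."""
    name: str
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive


def overlaps(a: Interval, b: Interval) -> bool:
    """True when both intervals lie on the same chromosome and share a base."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def genes_per_clone(clones, genes):
    """Map each clone name to the names of the genes that overlap it."""
    return {c.name: [g.name for g in genes if overlaps(c, g)] for c in clones}


# Made-up coordinates, for illustration only.
clones = [
    Interval("BAC_A", "chr1", 1_000, 5_000),
    Interval("BAC_B", "chr2", 0, 2_000),
]
genes = [
    Interval("GENE_1", "chr1", 4_500, 6_000),
    Interval("GENE_2", "chr1", 7_000, 8_000),
    Interval("GENE_3", "chr2", 1_999, 3_000),
]

matches = genes_per_clone(clones, genes)
print(matches)
```

    The nested scan is quadratic in the number of clones and genes. An indexed database query or an interval tree would be the usual choice at genome scale; the sketch only shows the overlap test itself.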

  14. CGI: Java Software for Mapping and Visualizing Data from Array-based Comparative Genomic Hybridization and Expression Profiling

    Directory of Open Access Journals (Sweden)

    Joyce Xiuweu-Xu Gu

    2007-01-01

    Full Text Available With the increasing application of various genomic technologies in biomedical research, there is a need to integrate these data to correlate candidate genes/regions that are identified by different genomic platforms. Although there are tools that can analyze data from individual platforms, essential software for integration of genomic data is still lacking. Here, we present a novel Java-based program called CGI (Cytogenetics-Genomics Integrator) that matches the BAC clones from array-based comparative genomic hybridization (aCGH) to genes from RNA expression profiling datasets. The matching is computed via a fast, backend MySQL database containing UCSC Genome Browser annotations. This program also provides an easy-to-use graphical user interface for visualizing and summarizing the correlation of DNA copy number changes and RNA expression patterns from a set of experiments. In addition, CGI uses a Java applet to display the copy number values of a specific BAC clone in aCGH experiments side by side with the expression levels of genes that are mapped back to that BAC clone from the microarray experiments. The CGI program is built on top of extensible, reusable graphic components specifically designed for biologists. It is cross-platform compatible and the source code is freely available under the General Public License.

  15. Chromosomal mapping of canine-derived BAC clones to the red fox and American mink genomes.

    Science.gov (United States)

    Kukekova, Anna V; Vorobieva, Nadegda V; Beklemisheva, Violetta R; Johnson, Jennifer L; Temnykh, Svetlana V; Yudkin, Dmitry V; Trut, Lyudmila N; Andre, Catherine; Galibert, Francis; Aguirre, Gustavo D; Acland, Gregory M; Graphodatsky, Alexander S

    2009-01-01

    High-quality sequencing of the dog (Canis lupus familiaris) genome has enabled enormous progress in genetic mapping of canine phenotypic variation. The red fox (Vulpes vulpes), another canid species, also exhibits a wide range of variation in coat color, morphology, and behavior. Although the fox genome has not yet been sequenced, canine genomic resources have been used to construct a meiotic linkage map of the red fox genome and begin genetic mapping in foxes. However, a more detailed gene-specific comparative map between the dog and fox genomes is required to establish gene order within homologous regions of dog and fox chromosomes and to refine breakpoints between homologous chromosomes of the 2 species. In the current study, we tested whether canine-derived gene-containing bacterial artificial chromosome (BAC) clones can be routinely used to build a gene-specific map of the red fox genome. Forty canine BAC clones were mapped to the red fox genome by fluorescence in situ hybridization (FISH). Each clone was uniquely assigned to a single fox chromosome, and the locations of 38 clones agreed with cytogenetic predictions. These results clearly demonstrate the utility of FISH mapping for construction of a whole-genome gene-specific map of the red fox. The further possibility of using canine BAC clones to map genes in the American mink (Mustela vison) genome was also explored. Much lower success was obtained for this more distantly related farm-bred species, although a few BAC clones were mapped to the predicted chromosomal locations.

  16. Advancing the STMS genomic resources for defining new locations on the intraspecific genetic linkage map of chickpea (Cicer arietinum L.)

    Directory of Open Access Journals (Sweden)

    Shokeen Bhumika

    2011-02-01

    and the previously published chickpea intraspecific map, integration of maps was performed which revealed improvement of marker density and saturation of the region in the vicinity of sfl (double-podding) gene thereby bringing about an advancement of the current map. Conclusion An arsenal of 181 new chickpea STMS markers was reported. The developed intraspecific linkage map defined map positions of 138 markers which included 101 new locations. Map integration with a previously published map was carried out which revealed an advanced map with improved density. This study is a major contribution towards providing advanced genomic resources which will facilitate chickpea geneticists and molecular breeders in developing superior genotypes with improved traits.

  17. Advancing the STMS genomic resources for defining new locations on the intraspecific genetic linkage map of chickpea (Cicer arietinum L.).

    Science.gov (United States)

    Gaur, Rashmi; Sethy, Niroj K; Choudhary, Shalu; Shokeen, Bhumika; Gupta, Varsha; Bhatia, Sabhyata

    2011-02-17

    intraspecific map, integration of maps was performed which revealed improvement of marker density and saturation of the region in the vicinity of sfl (double-podding) gene thereby bringing about an advancement of the current map. An arsenal of 181 new chickpea STMS markers was reported. The developed intraspecific linkage map defined map positions of 138 markers which included 101 new locations. Map integration with a previously published map was carried out which revealed an advanced map with improved density. This study is a major contribution towards providing advanced genomic resources which will facilitate chickpea geneticists and molecular breeders in developing superior genotypes with improved traits.

  18. High-resolution comparative mapping among man, cattle and mouse suggests a role for repeat sequences in mammalian genome evolution

    Directory of Open Access Journals (Sweden)

    Rodolphe François

    2006-08-01

    Full Text Available Abstract Background Comparative mapping provides new insights into the evolutionary history of genomes. In particular, recent studies in mammals have suggested a role for segmental duplication in genome evolution. In some species such as Drosophila or maize, transposable elements (TEs) have been shown to be involved in chromosomal rearrangements. In this work, we have explored the presence of interspersed repeats in regions of chromosomal rearrangements, using an updated high-resolution integrated comparative map among cattle, man and mouse. Results The bovine, human and mouse comparative autosomal map has been constructed using data from bovine genetic and physical maps and from FISH-mapping studies. We confirm most previous results but also reveal some discrepancies. A total of 211 conserved segments have been identified between cattle and man, of which 33 are new segments and 72 correspond to extended, previously known segments. The resulting map covers 91% and 90% of the human and bovine genomes, respectively. Analysis of breakpoint regions revealed a high density of species-specific interspersed repeats in the human and mouse genomes. Conclusion Analysis of the breakpoint regions has revealed specific repeat density patterns, suggesting that TEs may have played a significant role in chromosome evolution and genome plasticity. However, we cannot rule out that repeats and breakpoints accumulate independently in the same few regions where modifications are better tolerated. Likewise, we cannot ascertain whether increased TE density is the cause or the consequence of chromosome rearrangements. Nevertheless, the identification of high density repeat clusters combined with a well-documented repeat phylogeny should highlight probable breakpoints, and permit their precise dating. Combining new statistical models taking the present information into account should help reconstruct ancestral karyotypes.

  19. Polytene chromosomal maps of 11 Drosophila species: the order of genomic scaffolds inferred from genetic and physical maps.

    Science.gov (United States)

    Schaeffer, Stephen W; Bhutkar, Arjun; McAllister, Bryant F; Matsuda, Muneo; Matzkin, Luciano M; O'Grady, Patrick M; Rohde, Claudia; Valente, Vera L S; Aguadé, Montserrat; Anderson, Wyatt W; Edwards, Kevin; Garcia, Ana C L; Goodman, Josh; Hartigan, James; Kataoka, Eiko; Lapoint, Richard T; Lozovsky, Elena R; Machado, Carlos A; Noor, Mohamed A F; Papaceit, Montserrat; Reed, Laura K; Richards, Stephen; Rieger, Tania T; Russo, Susan M; Sato, Hajime; Segarra, Carmen; Smith, Douglas R; Smith, Temple F; Strelets, Victor; Tobari, Yoshiko N; Tomimura, Yoshihiko; Wasserman, Marvin; Watts, Thomas; Wilson, Robert; Yoshida, Kiyohito; Markow, Therese A; Gelbart, William M; Kaufman, Thomas C

    2008-07-01

    The sequencing of the 12 genomes of members of the genus Drosophila was taken as an opportunity to reevaluate the genetic and physical maps for 11 of the species, in part to aid in the mapping of assembled scaffolds. Here, we present an overview of the importance of cytogenetic maps to Drosophila biology and to the concepts of chromosomal evolution. Physical and genetic markers were used to anchor the genome assembly scaffolds to the polytene chromosomal maps for each species. In addition, a computational approach was used to anchor smaller scaffolds on the basis of the analysis of syntenic blocks. We present the chromosomal map data from each of the 11 sequenced non-Drosophila melanogaster species as a series of sections. Each section reviews the history of the polytene chromosome maps for each species, presents the new polytene chromosome maps, and anchors the genomic scaffolds to the cytological maps using genetic and physical markers. The mapping data agree with Muller's idea that the majority of Drosophila genes are syntenic. Despite the conservation of genes within homologous chromosome arms across species, the karyotypes of these species have changed through the fusion of chromosomal arms followed by subsequent rearrangement events.

  20. Genetic fine-mapping and genomic annotation defines causal mechanisms at type 2 diabetes susceptibility loci

    Science.gov (United States)

    Mahajan, Anubha; Locke, Adam; Rayner, N William; Robertson, Neil; Scott, Robert A; Prokopenko, Inga; Scott, Laura J; Green, Todd; Sparso, Thomas; Thuillier, Dorothee; Yengo, Loic; Grallert, Harald; Wahl, Simone; Frånberg, Mattias; Strawbridge, Rona J; Kestler, Hans; Chheda, Himanshu; Eisele, Lewin; Gustafsson, Stefan; Steinthorsdottir, Valgerdur; Thorleifsson, Gudmar; Qi, Lu; Karssen, Lennart C; van Leeuwen, Elisabeth M; Willems, Sara M; Li, Man; Chen, Han; Fuchsberger, Christian; Kwan, Phoenix; Ma, Clement; Linderman, Michael; Lu, Yingchang; Thomsen, Soren K; Rundle, Jana K; Beer, Nicola L; van de Bunt, Martijn; Chalisey, Anil; Kang, Hyun Min; Voight, Benjamin F; Abecasis, Goncalo R; Almgren, Peter; Baldassarre, Damiano; Balkau, Beverley; Benediktsson, Rafn; Blüher, Matthias; Boeing, Heiner; Bonnycastle, Lori L; Borringer, Erwin P; Burtt, Noël P; Carey, Jason; Charpentier, Guillaume; Chines, Peter S; Cornelis, Marilyn C; Couper, David J; Crenshaw, Andrew T; van Dam, Rob M; Doney, Alex SF; Dorkhan, Mozhgan; Edkins, Sarah; Eriksson, Johan G; Esko, Tonu; Eury, Elodie; Fadista, João; Flannick, Jason; Fontanillas, Pierre; Fox, Caroline; Franks, Paul W; Gertow, Karl; Gieger, Christian; Gigante, Bruna; Gottesman, Omri; Grant, George B; Grarup, Niels; Groves, Christopher J; Hassinen, Maija; Have, Christian T; Herder, Christian; Holmen, Oddgeir L; Hreidarsson, Astradur B; Humphries, Steve E; Hunter, David J; Jackson, Anne U; Jonsson, Anna; Jørgensen, Marit E; Jørgensen, Torben; Kerrison, Nicola D; Kinnunen, Leena; Klopp, Norman; Kong, Augustine; Kovacs, Peter; Kraft, Peter; Kravic, Jasmina; Langford, Cordelia; Leander, Karin; Liang, Liming; Lichtner, Peter; Lindgren, Cecilia M; Lindholm, Eero; Linneberg, Allan; Liu, Ching-Ti; Lobbens, Stéphane; Luan, Jian’an; Lyssenko, Valeriya; Männistö, Satu; McLeod, Olga; Meyer, Julia; Mihailov, Evelin; Mirza, Ghazala; Mühleisen, Thomas W; Müller-Nurasyid, Martina; Navarro, Carmen; Nöthen, Markus M; Oskolkov, Nikolay N; Owen, Katharine R; Palli, Domenico; Pechlivanis, Sonali; Perry, John RB; Platou, Carl GP; Roden, Michael; Ruderfer, Douglas; Rybin, Denis; van der Schouw, Yvonne T; Sennblad, Bengt; Sigurðsson, Gunnar; Stančáková, Alena; Steinbach, Gerald; Storm, Petter; Strauch, Konstantin; Stringham, Heather M; Sun, Qi; Thorand, Barbara; Tikkanen, Emmi; Tonjes, Anke; Trakalo, Joseph; Tremoli, Elena; Tuomi, Tiinamaija; Wennauer, Roman; Wood, Andrew R; Zeggini, Eleftheria; Dunham, Ian; Birney, Ewan; Pasquali, Lorenzo; Ferrer, Jorge; Loos, Ruth JF; Dupuis, Josée; Florez, Jose C; Boerwinkle, Eric; Pankow, James S; van Duijn, Cornelia; Sijbrands, Eric; Meigs, James B; Hu, Frank B; Thorsteinsdottir, Unnur; Stefansson, Kari; Lakka, Timo A; Rauramaa, Rainer; Stumvoll, Michael; Pedersen, Nancy L; Lind, Lars; Keinanen-Kiukaanniemi, Sirkka M; Korpi-Hyövälti, Eeva; Saaristo, Timo E; Saltevo, Juha; Kuusisto, Johanna; Laakso, Markku; Metspalu, Andres; Erbel, Raimund; Jöckel, Karl-Heinz; Moebus, Susanne; Ripatti, Samuli; Salomaa, Veikko; Ingelsson, Erik; Boehm, Bernhard O; Bergman, Richard N; Collins, Francis S; Mohlke, Karen L; Koistinen, Heikki; Tuomilehto, Jaakko; Hveem, Kristian; Njølstad, Inger; Deloukas, Panagiotis; Donnelly, Peter J; Frayling, Timothy M; Hattersley, Andrew T; de Faire, Ulf; Hamsten, Anders; Illig, Thomas; Peters, Annette; Cauchi, Stephane; Sladek, Rob; Froguel, Philippe; Hansen, Torben; Pedersen, Oluf; Morris, Andrew D; Palmer, Collin NA; Kathiresan, Sekar; Melander, Olle; Nilsson, Peter M; Groop, Leif C; Barroso, Inês; Langenberg, Claudia; Wareham, Nicholas J; O’Callaghan, Christopher A; Gloyn, Anna L; Altshuler, David; Boehnke, Michael; Teslovich, Tanya M; McCarthy, Mark I; Morris, Andrew P

    2015-01-01

    We performed fine-mapping of 39 established type 2 diabetes (T2D) loci in 27,206 cases and 57,574 controls of European ancestry. We identified 49 distinct association signals at these loci, including five mapping in/near KCNQ1. “Credible sets” of variants most likely to drive each distinct signal mapped predominantly to non-coding sequence, implying that T2D association is mediated through gene regulation. Credible set variants were enriched for overlap with FOXA2 chromatin immunoprecipitation binding sites in human islet and liver cells, including at MTNR1B, where fine-mapping implicated rs10830963 as driving T2D association. We confirmed that this T2D-risk allele increases FOXA2-bound enhancer activity in islet- and liver-derived cells. We observed allele-specific differences in NEUROD1 binding in islet-derived cells, consistent with evidence that the T2D-risk allele increases islet MTNR1B expression. Our study demonstrates how integration of genetic and genomic information can define molecular mechanisms through which variants underlying association signals exert their effects on disease. PMID:26551672

  1. Figure 5 from Integrative Genomics Viewer: Visualizing Big Data | Office of Cancer Genomics

    Science.gov (United States)

    Split-Screen View. The split-screen view is useful for exploring relationships of genomic features that are independent of chromosomal location. Color is used here to indicate mate pairs that map to different chromosomes, chromosomes 1 and 6, suggesting a translocation event. Adapted from Figure 8; Thorvaldsdottir H et al. 2012

  2. Improving Microbial Genome Annotations in an Integrated Database Context

    Science.gov (United States)

    Chen, I-Min A.; Markowitz, Victor M.; Chu, Ken; Anderson, Iain; Mavromatis, Konstantinos; Kyrpides, Nikos C.; Ivanova, Natalia N.

    2013-01-01

    Effective comparative analysis of microbial genomes requires a consistent and complete view of biological data. Consistency regards the biological coherence of annotations, while completeness regards the extent and coverage of functional characterization for genomes. We have developed tools that allow scientists to assess and improve the consistency and completeness of microbial genome annotations in the context of the Integrated Microbial Genomes (IMG) family of systems. All publicly available microbial genomes are characterized in IMG using different functional annotation and pathway resources, thus providing a comprehensive framework for identifying and resolving annotation discrepancies. A rule based system for predicting phenotypes in IMG provides a powerful mechanism for validating functional annotations, whereby the phenotypic traits of an organism are inferred based on the presence of certain metabolic reactions and pathways and compared to experimentally observed phenotypes. The IMG family of systems are available at http://img.jgi.doe.gov/. PMID:23424620

  3. Improving microbial genome annotations in an integrated database context.

    Directory of Open Access Journals (Sweden)

    I-Min A Chen

    Full Text Available Effective comparative analysis of microbial genomes requires a consistent and complete view of biological data. Consistency regards the biological coherence of annotations, while completeness regards the extent and coverage of functional characterization for genomes. We have developed tools that allow scientists to assess and improve the consistency and completeness of microbial genome annotations in the context of the Integrated Microbial Genomes (IMG) family of systems. All publicly available microbial genomes are characterized in IMG using different functional annotation and pathway resources, thus providing a comprehensive framework for identifying and resolving annotation discrepancies. A rule based system for predicting phenotypes in IMG provides a powerful mechanism for validating functional annotations, whereby the phenotypic traits of an organism are inferred based on the presence of certain metabolic reactions and pathways and compared to experimentally observed phenotypes. The IMG family of systems are available at http://img.jgi.doe.gov/.

  4. Molecular Assemblies, Genes and Genomics Integrated Efficiently (MAGGIE)

    Energy Technology Data Exchange (ETDEWEB)

    Baliga, Nitin S

    2011-05-26

    Final report on MAGGIE. We set ambitious goals to model the functions of individual organisms and their community from molecular to systems scale. These scientific goals are driving the development of sophisticated algorithms to analyze large amounts of experimental measurements made using high throughput technologies to explain and predict how the environment influences biological function at multiple scales and how the microbial systems in turn modify the environment. By experimentally evaluating predictions made using these models we will test the degree to which our quantitative multiscale understanding will help to rationally steer individual microbes and their communities towards specific tasks. Towards this end we have made substantial progress towards understanding evolution of gene families, transcriptional structures, detailed structures of keystone molecular assemblies (proteins and complexes), protein interactions, biological networks, microbial interactions, and community structure. Using comparative analysis we have tracked the evolutionary history of gene functions to understand how novel functions evolve. One level up, we have used proteomics data, high-resolution genome tiling microarrays, and 5' RNA sequencing to revise genome annotations, discover new genes including ncRNAs, and map dynamically changing operon structures of five model organisms: Desulfovibrio vulgaris Hildenborough, Pyrococcus furiosus, Sulfolobus solfataricus, Methanococcus maripaludis and Halobacterium salinarum NRC-1. We have developed machine learning algorithms to accurately identify protein interactions at a near-zero false positive rate from noisy data generated using tagless complex purification, TAP purification, and analysis of membrane complexes. Combining other genome-scale datasets produced by ENIGMA (in particular, microarray data) and available from literature we have been able to achieve a true positive rate as high as 65% at almost zero false positives.

  5. SIGMA: A System for Integrative Genomic Microarray Analysis of Cancer Genomes

    Directory of Open Access Journals (Sweden)

    Davies Jonathan J

    2006-12-01

    Full Text Available Abstract Background The prevalence of high resolution profiling of genomes has created a need for the integrative analysis of information generated from multiple methodologies and platforms. Although the majority of data in the public domain are gene expression profiles, and expression analysis software is available, the increase of array CGH studies has enabled integration of high throughput genomic and gene expression datasets. However, tools for direct mining and analysis of array CGH data are limited. Hence, there is a great need for analytical and display software tailored to cross platform integrative analysis of cancer genomes. Results We have created a user-friendly Java application to facilitate sophisticated visualization and analysis such as cross-tumor and cross-platform comparisons. To demonstrate the utility of this software, we assembled array CGH data representing Affymetrix SNP chip, Stanford cDNA arrays and whole genome tiling path array platforms for cross comparison. This cancer genome database contains 267 profiles from commonly used cancer cell lines representing 14 different tissue types. Conclusion In this study we have developed an application for the visualization and analysis of data from high resolution array CGH platforms that can be adapted for analysis of multiple types of high throughput genomic datasets. Furthermore, we invite researchers using array CGH technology to deposit both their raw and processed data, as this will be a continually expanding database of cancer genomes. This publicly available resource, the System for Integrative Genomic Microarray Analysis (SIGMA) of cancer genomes, can be accessed at http://sigma.bccrc.ca.

  6. Human Papillomavirus Genome Integration and Head and Neck Cancer.

    Science.gov (United States)

    Pinatti, L M; Walline, H M; Carey, T E

    2018-06-01

    We conducted a critical review of human papillomavirus (HPV) integration into the host genome in oral/oropharyngeal cancer, reviewed the literature for HPV-induced cancers, and obtained current data for HPV-related oral and oropharyngeal cancers. In addition, we performed studies to identify HPV integration sites and the relationship of integration to viral-host fusion transcripts and whether integration is required for HPV-associated oncogenesis. Viral integration of HPV into the host genome is not required for the viral life cycle and might not be necessary for cellular transformation, yet HPV integration is frequently reported in cervical and head and neck cancer specimens. Studies of large numbers of early cervical lesions revealed frequent viral integration into gene-poor regions of the host genome with comparatively rare integration into cellular genes, suggesting that integration is a stochastic event and that site of integration may be largely a function of chance. However, more recent studies of head and neck squamous cell carcinomas (HNSCCs) suggest that integration may represent an additional oncogenic mechanism through direct effects on cancer-related gene expression and generation of hybrid viral-host fusion transcripts. In HNSCC cell lines as well as primary tumors, integration into cancer-related genes leading to gene disruption has been reported. The studies have shown that integration-induced altered gene expression may be associated with tumor recurrence. Evidence from several studies indicates that viral integration into genic regions is accompanied by local amplification, increased expression in some cases, interruption of gene expression, and likely additional oncogenic effects. Similarly, reported examples of viral integration near microRNAs suggest that altered expression of these regulatory molecules may also contribute to oncogenesis. Future work is indicated to identify the mechanisms of these events on cancer cell behavior.

  7. GenomeVx: simple web-based creation of editable circular chromosome maps.

    Science.gov (United States)

    Conant, Gavin C; Wolfe, Kenneth H

    2008-03-15

    We describe GenomeVx, a web-based tool for making editable, publication-quality maps of mitochondrial and chloroplast genomes and of large plasmids. These maps show the location of genes and chromosomal features as well as a position scale. The program takes as input either raw feature positions or GenBank records. In the latter case, features are automatically extracted and colored, an example of which is given. Output is in the Adobe Portable Document Format (PDF) and can be edited by programs such as Adobe Illustrator. GenomeVx is available at http://wolfe.gen.tcd.ie/GenomeVx

  8. Integrated Genomic Characterization of Papillary Thyroid Carcinoma

    Science.gov (United States)

    Agrawal, Nishant; Akbani, Rehan; Aksoy, B. Arman; Ally, Adrian; Arachchi, Harindra; Asa, Sylvia L.; Auman, J. Todd; Balasundaram, Miruna; Balu, Saianand; Baylin, Stephen B.; Behera, Madhusmita; Bernard, Brady; Beroukhim, Rameen; Bishop, Justin A.; Black, Aaron D.; Bodenheimer, Tom; Boice, Lori; Bootwalla, Moiz S.; Bowen, Jay; Bowlby, Reanne; Bristow, Christopher A.; Brookens, Robin; Brooks, Denise; Bryant, Robert; Buda, Elizabeth; Butterfield, Yaron S.N.; Carling, Tobias; Carlsen, Rebecca; Carter, Scott L.; Carty, Sally E.; Chan, Timothy A.; Chen, Amy Y.; Cherniack, Andrew D.; Cheung, Dorothy; Chin, Lynda; Cho, Juok; Chu, Andy; Chuah, Eric; Cibulskis, Kristian; Ciriello, Giovanni; Clarke, Amanda; Clayman, Gary L.; Cope, Leslie; Copland, John; Covington, Kyle; Danilova, Ludmila; Davidsen, Tanja; Demchok, John A.; DiCara, Daniel; Dhalla, Noreen; Dhir, Rajiv; Dookran, Sheliann S.; Dresdner, Gideon; Eldridge, Jonathan; Eley, Greg; El-Naggar, Adel K.; Eng, Stephanie; Fagin, James A.; Fennell, Timothy; Ferris, Robert L.; Fisher, Sheila; Frazer, Scott; Frick, Jessica; Gabriel, Stacey B.; Ganly, Ian; Gao, Jianjiong; Garraway, Levi A.; Gastier-Foster, Julie M.; Getz, Gad; Gehlenborg, Nils; Ghossein, Ronald; Gibbs, Richard A.; Giordano, Thomas J.; Gomez-Hernandez, Karen; Grimsby, Jonna; Gross, Benjamin; Guin, Ranabir; Hadjipanayis, Angela; Harper, Hollie A.; Hayes, D. Neil; Heiman, David I.; Herman, James G.; Hoadley, Katherine A.; Hofree, Matan; Holt, Robert A.; Hoyle, Alan P.; Huang, Franklin W.; Huang, Mei; Hutter, Carolyn M.; Ideker, Trey; Iype, Lisa; Jacobsen, Anders; Jefferys, Stuart R.; Jones, Corbin D.; Jones, Steven J.M.; Kasaian, Katayoon; Kebebew, Electron; Khuri, Fadlo R.; Kim, Jaegil; Kramer, Roger; Kreisberg, Richard; Kucherlapati, Raju; Kwiatkowski, David J.; Ladanyi, Marc; Lai, Phillip H.; Laird, Peter W.; Lander, Eric; Lawrence, Michael S.; Lee, Darlene; Lee, Eunjung; Lee, Semin; Lee, William; Leraas, Kristen M.; Lichtenberg, Tara M.; Lichtenstein, Lee; Lin, Pei; Ling, Shiyun; Liu, Jinze; Liu, Wenbin; Liu, Yingchun; LiVolsi, Virginia A.; Lu, Yiling; Ma, Yussanne; Mahadeshwar, Harshad S.; Marra, Marco A.; Mayo, Michael; McFadden, David G.; Meng, Shaowu; Meyerson, Matthew; Mieczkowski, Piotr A.; Miller, Michael; Mills, Gordon; Moore, Richard A.; Mose, Lisle E.; Mungall, Andrew J.; Murray, Bradley A.; Nikiforov, Yuri E.; Noble, Michael S.; Ojesina, Akinyemi I.; Owonikoko, Taofeek K.; Ozenberger, Bradley A.; Pantazi, Angeliki; Parfenov, Michael; Park, Peter J.; Parker, Joel S.; Paull, Evan O.; Pedamallu, Chandra Sekhar; Perou, Charles M.; Prins, Jan F.; Protopopov, Alexei; Ramalingam, Suresh S.; Ramirez, Nilsa C.; Ramirez, Ricardo; Raphael, Benjamin J.; Rathmell, W. Kimryn; Ren, Xiaojia; Reynolds, Sheila M.; Rheinbay, Esther; Ringel, Matthew D.; Rivera, Michael; Roach, Jeffrey; Robertson, A. Gordon; Rosenberg, Mara W.; Rosenthall, Matthew; Sadeghi, Sara; Saksena, Gordon; Sander, Chris; Santoso, Netty; Schein, Jacqueline E.; Schultz, Nikolaus; Schumacher, Steven E.; Seethala, Raja R.; Seidman, Jonathan; Senbabaoglu, Yasin; Seth, Sahil; Sharpe, Samantha; Mills Shaw, Kenna R.; Shen, John P.; Shen, Ronglai; Sherman, Steven; Sheth, Margi; Shi, Yan; Shmulevich, Ilya; Sica, Gabriel L.; Simons, Janae V.; Sipahimalani, Payal; Smallridge, Robert C.; Sofia, Heidi J.; Soloway, Matthew G.; Song, Xingzhi; Sougnez, Carrie; Stewart, Chip; Stojanov, Petar; Stuart, Joshua M.; Tabak, Barbara; Tam, Angela; Tan, Donghui; Tang, Jiabin; Tarnuzzer, Roy; Taylor, Barry S.; Thiessen, Nina; Thorne, Leigh; Thorsson, Vésteinn; Tuttle, R. Michael; Umbricht, Christopher B.; Van Den Berg, David J.; Vandin, Fabio; Veluvolu, Umadevi; Verhaak, Roel G.W.; Vinco, Michelle; Voet, Doug; Walter, Vonn; Wang, Zhining; Waring, Scot; Weinberger, Paul M.; Weinstein, John N.; Weisenberger, Daniel J.; Wheeler, David; Wilkerson, Matthew D.; Wilson, Jocelyn; Williams, Michelle; Winer, Daniel A.; Wise, Lisa; Wu, Junyuan; Xi, Liu; Xu, Andrew W.; Yang, Liming; Yang, Lixing; Zack, Travis I.; Zeiger, Martha A.; Zeng, Dong; Zenklusen, Jean Claude; Zhao, Ni; Zhang, Hailei; Zhang, Jianhua; Zhang, Jiashan (Julia); Zhang, Wei; Zmuda, Erik; Zou, Lihua

    2014-01-01

    Summary Papillary thyroid carcinoma (PTC) is the most common type of thyroid cancer. Here, we describe the genomic landscape of 496 PTCs. We observed a low frequency of somatic alterations (relative to other carcinomas) and extended the set of known PTC driver alterations to include EIF1AX, PPM1D and CHEK2 and diverse gene fusions. These discoveries reduced the fraction of PTC cases with unknown oncogenic driver from 25% to 3.5%. Combined analyses of genomic variants, gene expression, and methylation demonstrated that different driver groups lead to different pathologies with distinct signaling and differentiation characteristics. Similarly, we identified distinct molecular subgroups of BRAF-mutant tumors and multidimensional analyses highlighted a potential involvement of oncomiRs in less-differentiated subgroups. Our results propose a reclassification of thyroid cancers into molecular subtypes that better reflect their underlying signaling and differentiation properties, which has the potential to improve their pathological classification and better inform the management of the disease. PMID:25417114

  9. High throughput platforms for structural genomics of integral membrane proteins.

    Science.gov (United States)

    Mancia, Filippo; Love, James

    2011-08-01

    Structural genomics approaches on integral membrane proteins have been postulated for over a decade, yet specific efforts are lagging years behind their soluble counterparts. Indeed, high throughput methodologies for production and characterization of prokaryotic integral membrane proteins are only now emerging, while large-scale efforts for eukaryotic ones are still in their infancy. Presented here is a review of recent literature on actively ongoing structural genomics of membrane protein initiatives, with a focus on those implementing techniques intended to increase our rate of success for this class of macromolecules. Copyright © 2011 Elsevier Ltd. All rights reserved.

  10. A meiotic linkage map of the silver fox, aligned and compared to the canine genome.

    Science.gov (United States)

    Kukekova, Anna V; Trut, Lyudmila N; Oskina, Irina N; Johnson, Jennifer L; Temnykh, Svetlana V; Kharlamova, Anastasiya V; Shepeleva, Darya V; Gulievich, Rimma G; Shikhevich, Svetlana G; Graphodatsky, Alexander S; Aguirre, Gustavo D; Acland, Gregory M

    2007-03-01

    A meiotic linkage map is essential for mapping traits of interest and is often the first step toward understanding a cryptic genome. Specific strains of silver fox (a variant of the red fox, Vulpes vulpes), which segregate behavioral and morphological phenotypes, create a need for such a map. One such strain, selected for docility, exhibits friendly dog-like responses to humans, in contrast to another strain selected for aggression. Development of a fox map is facilitated by the known cytogenetic homologies between the dog and fox, and by the availability of high resolution canine genome maps and sequence data. Furthermore, the high genomic sequence identity between dog and fox allows adaptation of canine microsatellites for genotyping and meiotic mapping in foxes. Using 320 such markers, we have constructed the first meiotic linkage map of the fox genome. The resulting sex-averaged map covers 16 fox autosomes and the X chromosome with an average inter-marker distance of 7.5 cM. The total map length corresponds to 1480.2 cM. From comparison of sex-averaged meiotic linkage maps of the fox and dog genomes, suppression of recombination in pericentromeric regions of the metacentric fox chromosomes was apparent, relative to the corresponding segments of acrocentric dog chromosomes. Alignment of the fox meiotic map against the 7.6x canine genome sequence revealed high conservation of marker order between homologous regions of the two species. The fox meiotic map provides a critical tool for genetic studies in foxes and identification of genetic loci and genes implicated in fox domestication.

  11. A meiotic linkage map of the silver fox, aligned and compared to the canine genome

    OpenAIRE

    Kukekova, Anna V.; Trut, Lyudmila N.; Oskina, Irina N.; Johnson, Jennifer L.; Temnykh, Svetlana V.; Kharlamova, Anastasiya V.; Shepeleva, Darya V.; Gulievich, Rimma G.; Shikhevich, Svetlana G.; Graphodatsky, Alexander S.; Aguirre, Gustavo D.; Acland, Gregory M.

    2007-01-01

    A meiotic linkage map is essential for mapping traits of interest and is often the first step toward understanding a cryptic genome. Specific strains of silver fox (a variant of the red fox, Vulpes vulpes), which segregate behavioral and morphological phenotypes, create a need for such a map. One such strain, selected for docility, exhibits friendly dog-like responses to humans, in contrast to another strain selected for aggression. Development of a fox map is facilitated by the known cytogen...

  12. Integration of deeptow data for mapping of deepsea resources

    Digital Repository Service at National Institute of Oceanography (India)

    Sharma, R.; Jaisankar, S.; Sudhakar, M.; Ramprasad, T.

    data. Mapping of nodules and other features (crusts, biological activity and sediment characteristics) was carried out by integrating the data from various sources, such as navigation, photographic and acoustic, keeping time as the reference...

  13. IOCM Aerial Photography: Integrated Ocean and Coastal Mapping Product

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — Integrated Ocean and Coastal Mapping Product (IOCM). The images were acquired from a nominal altitude of 7,500 feet above ground level (AGL), using an Applanix...

  14. IVAG: An Integrative Visualization Application for Various Types of Genomic Data Based on R-Shiny and the Docker Platform.

    Science.gov (United States)

    Lee, Tae-Rim; Ahn, Jin Mo; Kim, Gyuhee; Kim, Sangsoo

    2017-12-01

    Next-generation sequencing (NGS) technology has become a trend in the genomics research area. There are many software programs and automated pipelines to analyze NGS data, which can ease the pain for traditional scientists who are not familiar with computer programming. However, downstream analyses, such as finding differentially expressed genes or visualizing linkage disequilibrium maps and genome-wide association study (GWAS) data, still remain a challenge. Here, we introduce a dockerized web application written in R using the Shiny platform to visualize pre-analyzed RNA sequencing and GWAS data. In addition, we have integrated a genome browser based on the JBrowse platform and an automated intermediate parsing process required for custom track construction, so that users can easily build and navigate their personal genome tracks with in-house datasets. This application will help scientists perform series of downstream analyses and obtain a more integrative understanding about various types of genomic data by interactively visualizing them with customizable options.

  15. Integrating Vegetation Classification, Mapping, and Strategic Inventory for Forest Management

    Science.gov (United States)

    C. K. Brewer; R. Bush; D. Berglund; J. A. Barber; S. R. Brown

    2006-01-01

    Many of the analyses needed to address multiple resource issues are focused on vegetation pattern and process relationships and most rely on the data models produced from vegetation classification, mapping, and/or inventory. The Northern Region Vegetation Mapping Project (R1-VMP) data models are based on these three integrally related, yet separate processes. This...

  16. Alignment of Escherichia coli K12 DNA sequences to a genomic restriction map.

    Science.gov (United States)

    Rudd, K E; Miller, W; Ostell, J; Benson, D A

    1990-01-25

    We use the extensive published information describing the genome of Escherichia coli and new restriction map alignment software to align DNA sequence, genetic, and physical maps. Restriction map alignment software is used which considers restriction maps as strings analogous to DNA or protein sequences except that two values, enzyme name and DNA base address, are associated with each position on the string. The resulting alignments reveal a nearly linear relationship between the physical and genetic maps of the E. coli chromosome. Physical map comparisons with the 1976, 1980, and 1983 genetic maps demonstrate a better fit with the more recent maps. The results of these alignments are genomic kilobase coordinates, orientation and rank of the alignment that best fits the genetic data. A statistical measure based on extreme value distribution is applied to the alignments. Additional computer analyses allow us to estimate the accuracy of the published E. coli genomic restriction map, simulate rearrangements of the bacterial chromosome, and search for repetitive DNA. The procedures we used are general enough to be applicable to other genome mapping projects.

  17. An XML transfer schema for exchange of genomic and genetic mapping data: implementation as a web service in a Taverna workflow.

    Science.gov (United States)

    Paterson, Trevor; Law, Andy

    2009-08-14

    Genomic analysis, particularly for less well-characterized organisms, is greatly assisted by performing comparative analyses between different types of genome maps and across species boundaries. Various providers publish a plethora of on-line resources collating genome mapping data from a multitude of species. Datasources range in scale and scope from small bespoke resources for particular organisms, through larger web-resources containing data from multiple species, to large-scale bioinformatics resources providing access to data derived from genome projects for model and non-model organisms. The heterogeneity of information held in these resources reflects both the technologies used to generate the data and the target users of each resource. Currently there is no common information exchange standard or protocol to enable access and integration of these disparate resources. Consequently data integration and comparison must be performed in an ad hoc manner. We have developed a simple generic XML schema (GenomicMappingData.xsd - GMD) to allow export and exchange of mapping data in a common lightweight XML document format. This schema represents the various types of data objects commonly described across mapping datasources and provides a mechanism for recording relationships between data objects. The schema is sufficiently generic to allow representation of any map type (for example genetic linkage maps, radiation hybrid maps, sequence maps and physical maps). It also provides mechanisms for recording data provenance and for cross referencing external datasources (including for example ENSEMBL, PubMed and GenBank). The schema is extensible via the inclusion of additional datatypes, which can be achieved by importing further schemas, e.g. a schema defining relationship types. We have built demonstration web services that export data from our ArkDB database according to the GMD schema, facilitating the integration of data retrieval into Taverna workflows. The data

  18. An XML transfer schema for exchange of genomic and genetic mapping data: implementation as a web service in a Taverna workflow

    Directory of Open Access Journals (Sweden)

    Law Andy

    2009-08-01

    Background: Genomic analysis, particularly for less well-characterized organisms, is greatly assisted by performing comparative analyses between different types of genome maps and across species boundaries. Various providers publish a plethora of on-line resources collating genome mapping data from a multitude of species. Datasources range in scale and scope from small bespoke resources for particular organisms, through larger web-resources containing data from multiple species, to large-scale bioinformatics resources providing access to data derived from genome projects for model and non-model organisms. The heterogeneity of information held in these resources reflects both the technologies used to generate the data and the target users of each resource. Currently there is no common information exchange standard or protocol to enable access and integration of these disparate resources. Consequently data integration and comparison must be performed in an ad hoc manner. Results: We have developed a simple generic XML schema (GenomicMappingData.xsd – GMD) to allow export and exchange of mapping data in a common lightweight XML document format. This schema represents the various types of data objects commonly described across mapping datasources and provides a mechanism for recording relationships between data objects. The schema is sufficiently generic to allow representation of any map type (for example genetic linkage maps, radiation hybrid maps, sequence maps and physical maps). It also provides mechanisms for recording data provenance and for cross referencing external datasources (including for example ENSEMBL, PubMed and GenBank). The schema is extensible via the inclusion of additional datatypes, which can be achieved by importing further schemas, e.g. a schema defining relationship types. We have built demonstration web services that export data from our ArkDB database according to the GMD schema, facilitating the integration of

  19. SODIM: Service Oriented Data Integration based on MapReduce

    Directory of Open Access Journals (Sweden)

    Ghada ElSheikh

    2013-09-01

    Data integration systems can benefit from innovative dynamic infrastructure solutions such as Clouds, which offer greater agility, lower cost, device independence, location independence, and scalability. This study consolidates the data integration system, Service Orientation, and distributed processing to develop a new data integration system called Service Oriented Data Integration based on MapReduce (SODIM), which improves system performance, especially with a large number of data sources, and which can be hosted efficiently on modern dynamic infrastructures such as Clouds.

  20. Integrative Genome Comparison of Primary and Metastatic Melanomas

    Science.gov (United States)

    Feng, Bin; Nazarian, Rosalynn M.; Bosenberg, Marcus; Wu, Min; Scott, Kenneth L.; Kwong, Lawrence N.; Xiao, Yonghong; Cordon-Cardo, Carlos; Granter, Scott R.; Ramaswamy, Sridhar; Golub, Todd; Duncan, Lyn M.; Wagner, Stephan N.; Brennan, Cameron; Chin, Lynda

    2010-01-01

    A cardinal feature of malignant melanoma is its metastatic propensity. An incomplete view of the genetic events driving metastatic progression has been a major barrier to rational development of effective therapeutics and prognostic diagnostics for melanoma patients. In this study, we conducted global genomic characterization of primary and metastatic melanomas to examine the genomic landscape associated with metastatic progression. In addition to uncovering three genomic subclasses of metastatic melanomas, we delineated 39 focal and recurrent regions of amplification and deletions, many of which encompassed resident genes that have not been implicated in cancer or metastasis. To identify progression-associated metastasis gene candidates, we applied a statistical approach, Integrative Genome Comparison (IGC), to define 32 genomic regions of interest that were significantly altered in metastatic relative to primary melanomas, encompassing 30 resident genes with statistically significant expression deregulation. Functional assays on a subset of these candidates, including MET, ASPM, AKAP9, IMP3, PRKCA, RPA3, and SCAP2, validated their pro-invasion activities in human melanoma cells. Validity of the IGC approach was further reinforced by tissue microarray analysis of Survivin showing significant increased protein expression in thick versus thin primary cutaneous melanomas, and a progression correlation with lymph node metastases. Together, these functional validation results and correlative analysis of human tissues support the thesis that integrated genomic and pathological analyses of staged melanomas provide a productive entry point for discovery of melanoma metastases genes. PMID:20520718

  1. Mapping Determinants of Gene Expression Plasticity by Genetical Genomics in C. elegans

    NARCIS (Netherlands)

    Li, Y.; Alda Alvarez, O.; Gutteling, E.W.; Tijsterman, M.; Fu, J.; Riksen, J.A.G.; Hazendonk, E.; Prins, J.C.P.; Plasterk, R.H.A.; Jansen, R.C.; Breitling, R.; Kammenga, J.E.

    2006-01-01

    Recent genetical genomics studies have provided intimate views on gene regulatory networks. Gene expression variations between genetically different individuals have been mapped to the causal regulatory regions, termed expression quantitative trait loci. Whether the environment-induced plastic

  2. Mapping determinants of gene expression plasticity by genetical genomics in C. elegans.

    NARCIS (Netherlands)

    Li, Y.; Alvarez, O.A.; Gutteling, E.W.; Tijsterman, M.; Fu, J.; Riksen, J.A.; Hazendonk, M.G.A.; Prins, P.; Plasterk, R.H.A.; Jansen, R.C.; Breitling, R.; Kammenga, J.E.

    2006-01-01

    Recent genetical genomics studies have provided intimate views on gene regulatory networks. Gene expression variations between genetically different individuals have been mapped to the causal regulatory regions, termed expression quantitative trait loci. Whether the environment-induced plastic

  3. VERSE: a novel approach to detect virus integration in host genomes through reference genome customization.

    Science.gov (United States)

    Wang, Qingguo; Jia, Peilin; Zhao, Zhongming

    2015-01-01

    Fueled by widespread applications of high-throughput next generation sequencing (NGS) technologies and urgent need to counter threats of pathogenic viruses, large-scale studies were conducted recently to investigate virus integration in host genomes (for example, human tumor genomes) that may cause carcinogenesis or other diseases. A limiting factor in these studies, however, is rapid virus evolution and resulting polymorphisms, which prevent reads from aligning readily to commonly used virus reference genomes, and, accordingly, make virus integration sites difficult to detect. Another confounding factor is host genomic instability as a result of virus insertions. To tackle these challenges and improve our capability to identify cryptic virus-host fusions, we present a new approach that detects Virus intEgration sites through iterative Reference SEquence customization (VERSE). To the best of our knowledge, VERSE is the first approach to improve detection through customizing reference genomes. Using 19 human tumors and cancer cell lines as test data, we demonstrated that VERSE substantially enhanced the sensitivity of virus integration site detection. VERSE is implemented in the open source package VirusFinder 2 that is available at http://bioinfo.mc.vanderbilt.edu/VirusFinder/.

  4. Analytic methods to generate integrable mappings

    Indian Academy of Sciences (India)

    essential integrability features of an integrable differential equation is a .... With this in mind we first write x^3(t) as a cubic polynomial in (x_{n−1}, x_n, x_{n+1}) and then ..... coefficients, the quadratic equation in x_{n+N} has real and distinct roots which in ...

  5. The Genome-Scale Integrated Networks in Microorganisms

    Directory of Open Access Journals (Sweden)

    Tong Hao

    2018-02-01

    The genome-scale cellular network has become a necessary tool in the systematic analysis of microbes. In a cell, there are several layers (i.e., types) of molecular networks, for example, the genome-scale metabolic network (GMN), transcriptional regulatory network (TRN), and signal transduction network (STN). It has been recognized that predictions made using only a single-layer network are limited and inaccurate. Therefore, the integrated network constructed from the networks of the three types is attracting more interest. The function of a biological process in living cells is usually performed by the interaction of biological components. Therefore, it is necessary to integrate and analyze all the related components at the systems level in order to comprehensively and correctly understand physiological function in living organisms. In this review, we discussed three representative genome-scale cellular networks: GMN, TRN, and STN, representing different levels (i.e., metabolism, gene regulation, and cellular signaling) of a cell’s activities. Furthermore, we discussed the integration of the networks of the three types. With greater understanding of the complexity of microbial cells, the development of integrated networks has become an inevitable trend in analyzing genome-scale cellular networks of microorganisms.

  6. Integrating sequencing technologies in personal genomics: optimal low cost reconstruction of structural variants.

    Directory of Open Access Journals (Sweden)

    Jiang Du

    2009-07-01

    The goal of human genome re-sequencing is obtaining an accurate assembly of an individual's genome. Recently, there has been great excitement in the development of many technologies for this (e.g. medium and short read sequencing from companies such as 454 and SOLiD, and high-density oligo-arrays from Affymetrix and NimbleGen), with even more expected to appear. The costs and sensitivities of these technologies differ considerably from each other. As an important goal of personal genomics is to reduce the cost of re-sequencing to an affordable point, it is worthwhile to consider optimally integrating technologies. Here, we build a simulation toolbox that will help us optimally combine different technologies for genome re-sequencing, especially in reconstructing large structural variants (SVs). SV reconstruction is considered the most challenging step in human genome re-sequencing. (It is sometimes even harder than de novo assembly of small genomes because of the duplications and repetitive sequences in the human genome.) To this end, we formulate canonical problems that are representative of issues in reconstruction and are of small enough scale to be computationally tractable and simulatable. Using semi-realistic simulations, we show how we can combine different technologies to optimally solve the assembly at low cost. With mappability maps, our simulations efficiently handle the inhomogeneous repeat-containing structure of the human genome and the computational complexity of practical assembly algorithms. They quantitatively show how combining different read lengths is more cost-effective than using one length, how an optimal mixed sequencing strategy for reconstructing large novel SVs usually also gives accurate detection of SNPs/indels, how paired-end reads can improve reconstruction efficiency, and how adding in arrays is more efficient than just sequencing for disentangling some complex SVs. Our strategy should facilitate the sequencing of

  7. Integrating Radar Image Data with Google Maps

    Science.gov (United States)

    Chapman, Bruce D.; Gibas, Sarah

    2010-01-01

    A public Web site has been developed as a method for displaying the multitude of radar imagery collected by NASA's Airborne Synthetic Aperture Radar (AIRSAR) instrument during its 16-year mission. Utilizing NASA's internal AIRSAR site, the new Web site features more sophisticated visualization tools that enable the general public to have access to these images. The site was originally maintained at NASA on six computers: one that held the Oracle database, two that took care of the software for the interactive map, and three that were for the Web site itself. Several tasks were involved in moving this complicated setup to just one computer. First, the AIRSAR database was migrated from Oracle to MySQL. Then the back-end of the AIRSAR Web site was updated in order to access the MySQL database. To do this, a few of the scripts needed to be modified; specifically three Perl scripts that query that database. The database connections were then updated from Oracle to MySQL, numerous syntax errors were corrected, and a query was implemented that replaced one of the stored Oracle procedures. Lastly, the interactive map was designed, implemented, and tested so that users could easily browse and access the radar imagery through the Google Maps interface.

  8. Integrating Web Services into Map Image Applications

    National Research Council Canada - National Science Library

    Tu, Shengru

    2003-01-01

    Web services have been opening a wide avenue for software integration. In this paper, we have reported our experiments with three applications that are built by utilizing and providing web services for Geographic Information Systems (GIS...

  9. Integrating genomic selection into dairy cattle breeding programmes: a review.

    Science.gov (United States)

    Bouquet, A; Juga, J

    2013-05-01

    Extensive genetic progress has been achieved in dairy cattle populations on many traits of economic importance because of efficient breeding programmes. Success of these programmes has relied on progeny testing of the best young males to accurately assess their genetic merit and hence their potential for breeding. Over the last few years, the integration of dense genomic information into statistical tools used to make selection decisions, commonly referred to as genomic selection, has enabled gains in predicting accuracy of breeding values for young animals without own performance. The possibility to select animals at an early stage allows defining new breeding strategies aimed at boosting genetic progress while reducing costs. The first objective of this article was to review methods used to model and optimize breeding schemes integrating genomic selection and to discuss their relative advantages and limitations. The second objective was to summarize the main results and perspectives on the use of genomic selection in practical breeding schemes, on the basis of the example of dairy cattle populations. Two main designs of breeding programmes integrating genomic selection were studied in dairy cattle. Genomic selection can be used either for pre-selecting males to be progeny tested or for selecting males to be used as active sires in the population. The first option produces moderate genetic gains without changing the structure of breeding programmes. The second option leads to large genetic gains, up to double those of conventional schemes because of a major reduction in the mean generation interval, but it requires greater changes in breeding programme structure. The literature suggests that genomic selection becomes more attractive when it is coupled with embryo transfer technologies to further increase selection intensity on the dam-to-sire pathway. The use of genomic information also offers new opportunities to improve preservation of genetic variation. However...

  10. Genome-wide analysis reveals the extent of EAV-HP integration in domestic chicken.

    Science.gov (United States)

    Wragg, David; Mason, Andrew S; Yu, Le; Kuo, Richard; Lawal, Raman A; Desta, Takele Taye; Mwacharo, Joram M; Cho, Chang-Yeon; Kemp, Steve; Burt, David W; Hanotte, Olivier

    2015-10-14

    EAV-HP is an ancient retrovirus pre-dating Gallus speciation, which continues to circulate in modern chicken populations, and led to the emergence of avian leukosis virus subgroup J causing significant economic losses to the poultry industry. We mapped EAV-HP integration sites in Ethiopian village chickens, a Silkie, Taiwan Country chicken, red junglefowl Gallus gallus and several inbred experimental lines using whole-genome sequence data. An average of 75.22 ± 9.52 integration sites per bird were identified, which collectively group into 279 intervals of which 5% are common to 90% of the genomes analysed and are suggestive of pre-domestication integration events. More than a third of intervals are specific to individual genomes, supporting active circulation of EAV-HP in modern chickens. Interval density is correlated with chromosome length (P < 2.31 × 10^-6), and 27% of intervals are located within 5 kb of a transcript. Functional annotation clustering of genes reveals enrichment for immune-related functions (P < 0.05). Our results illustrate a non-random distribution of EAV-HP in the genome, emphasising the importance it may have played in the adaptation of the species, and provide a platform from which to extend investigations on the co-evolutionary significance of endogenous retroviral genera with their hosts.

  11. Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration.

    Science.gov (United States)

    Thorvaldsdóttir, Helga; Robinson, James T; Mesirov, Jill P

    2013-03-01

    Data visualization is an essential component of genomic data analysis. However, the size and diversity of the data sets produced by today's sequencing and array-based profiling methods present major challenges to visualization tools. The Integrative Genomics Viewer (IGV) is a high-performance viewer that efficiently handles large heterogeneous data sets, while providing a smooth and intuitive user experience at all levels of genome resolution. A key characteristic of IGV is its focus on the integrative nature of genomic studies, with support for both array-based and next-generation sequencing data, and the integration of clinical and phenotypic data. Although IGV is often used to view genomic data from public sources, its primary emphasis is to support researchers who wish to visualize and explore their own data sets or those from colleagues. To that end, IGV supports flexible loading of local and remote data sets, and is optimized to provide high-performance data visualization and exploration on standard desktop systems. IGV is freely available for download from http://www.broadinstitute.org/igv, under a GNU LGPL open-source license.

  12. Building the sequence map of the human pan-genome

    DEFF Research Database (Denmark)

    Li, Ruiqiang; Li, Yingrui; Zheng, Hancheng

    2010-01-01

    analysis of predicted genes indicated that the novel sequences contain potentially functional coding regions. We estimate that a complete human pan-genome would contain approximately 19-40 Mb of novel sequence not present in the extant reference genome. The extensive amount of novel sequence contributing...

  13. Integrated genetic linkage map of cultivated peanut by three RIL populations

    Institute of Scientific and Technical Information of China (English)

    Yanbin Song; Huifang Jiang; Huaiyong Luo; Li Huang; Yuning Chen; Weigang Chen; Nian Liu; Xiaoping Ren; Bolun Yu; Jianbin Guo

    2017-01-01

    A high-density, precise genetic linkage map is fundamental to detecting quantitative trait loci (QTLs) for agronomic and quality-related traits in cultivated peanut (Arachis hypogaea L.). In this study, three linkage maps from three RIL (recombinant inbred line) populations were used to construct an integrated map. A total of 2,069 SSR and transposon markers were anchored on the high-density integrated map, which covered 2,231.53 cM with 20 linkage groups. In total, 92 QTLs correlating with pod length (PL), pod width (PW), hundred pods weight (HPW) and plant height (PH) from the above RIL populations were mapped on it. Seven intervals were found to harbor QTLs controlling the same traits in different populations, including one for PL, three for PW, two for HPW, and one for PH. In addition, QTLs controlling different traits in different populations were found to overlap in four intervals. An interval on A05 contains 17 QTLs for different traits from two RIL populations. New markers were added to these intervals to detect QTLs with narrow confidence intervals. Results obtained in this study may facilitate future genomic research such as QTL study, fine mapping, positional cloning and marker-assisted selection (MAS) in peanut.

  14. High-resolution genetic maps of Eucalyptus improve Eucalyptus grandis genome assembly.

    Science.gov (United States)

    Bartholomé, Jérôme; Mandrou, Eric; Mabiala, André; Jenkins, Jerry; Nabihoudine, Ibouniyamine; Klopp, Christophe; Schmutz, Jeremy; Plomion, Christophe; Gion, Jean-Marc

    2015-06-01

    Genetic maps are key tools in genetic research as they constitute the framework for many applications, such as quantitative trait locus analysis, and support the assembly of genome sequences. The resequencing of the two parents of a cross between Eucalyptus urophylla and Eucalyptus grandis was used to design a single nucleotide polymorphism (SNP) array of 6000 markers evenly distributed along the E. grandis genome. The genotyping of 1025 offspring enabled the construction of two high-resolution genetic maps containing 1832 and 1773 markers with an average marker interval of 0.45 and 0.5 cM for E. grandis and E. urophylla, respectively. The comparison between genetic maps and the reference genome highlighted 85% of collinear regions. A total of 43 noncollinear regions and 13 nonsyntenic regions were detected and corrected in the new genome assembly. This improved version contains 4943 scaffolds totalling 691.3 Mb of which 88.6% were captured by the 11 chromosomes. The mapping data were also used to investigate the effect of population size and number of markers on linkage mapping accuracy. This study provides the most reliable linkage maps for Eucalyptus and version 2.0 of the E. grandis genome. © 2014 CIRAD. New Phytologist © 2014 New Phytologist Trust.

  15. Comparison of HapMap and 1000 Genomes Reference Panels in a Large-Scale Genome-Wide Association Study.

    Directory of Open Access Journals (Sweden)

    Paul S de Vries

    An increasing number of genome-wide association (GWA) studies are now using the higher resolution 1000 Genomes Project reference panel (1000G) for imputation, with the expectation that 1000G imputation will lead to the discovery of additional associated loci when compared to HapMap imputation. In order to assess the improvement of 1000G over HapMap imputation in identifying associated loci, we compared the results of GWA studies of circulating fibrinogen based on the two reference panels. Using both HapMap and 1000G imputation we performed a meta-analysis of 22 studies comprising the same 91,953 individuals. We identified six additional signals using 1000G imputation, while 29 loci were associated using both HapMap and 1000G imputation. One locus identified using HapMap imputation was not significant using 1000G imputation. The genome-wide significance threshold of 5×10⁻⁸ is based on the number of independent statistical tests using HapMap imputation, and 1000G imputation may lead to further independent tests that should be corrected for. When using a stricter Bonferroni correction for the 1000G GWA study (P-value < 2.5×10⁻⁸), the number of loci significant only using HapMap imputation increased to 4 while the number of loci significant only using 1000G decreased to 5. In conclusion, 1000G imputation enabled the identification of 20% more loci than HapMap imputation, although the advantage of 1000G imputation became less clear when a stricter Bonferroni correction was used. More generally, our results provide insights that are applicable to the implementation of other dense reference panels that are under development.
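
    The relationship between the two thresholds quoted above can be checked with a short calculation. The sketch below assumes the conventional family-wise error rate of 0.05, which the abstract does not state, and the function names are illustrative only. Under that assumption, a threshold of 5×10⁻⁸ corresponds to roughly one million independent tests and 2.5×10⁻⁸ to roughly two million.

```python
def bonferroni_threshold(alpha: float, n_tests: float) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def implied_tests(alpha: float, threshold: float) -> float:
    """Number of independent tests implied by a given per-test threshold."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    return alpha / threshold


# Assumption: family-wise error rate of 0.05 (not given in the abstract).
ALPHA = 0.05

# Thresholds quoted in the abstract.
hapmap_tests = implied_tests(ALPHA, 5e-8)
stricter_tests = implied_tests(ALPHA, 2.5e-8)

if __name__ == "__main__":
    print(f"Tests implied by 5e-8:   {hapmap_tests:,.0f}")
    print(f"Tests implied by 2.5e-8: {stricter_tests:,.0f}")
```

    Halving the threshold is equivalent to assuming twice as many independent tests, which is why some loci lose genome-wide significance under the stricter correction.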

  16. A high-resolution map of the Nile tilapia genome: a resource for studying cichlids and other percomorphs

    Science.gov (United States)

    2012-01-01

    Background: The Nile tilapia (Oreochromis niloticus) is the second most farmed fish species worldwide. It is also an important model for studies of fish physiology, particularly because of its broad tolerance to an array of environments. It is a good model to study evolutionary mechanisms in vertebrates, because of its close relationship to haplochromine cichlids, which have undergone rapid speciation in East Africa. The existing genomic resources for Nile tilapia include a genetic map, BAC end sequences and ESTs, but comparative genome analysis and maps of quantitative trait loci (QTL) are still limited. Results: We have constructed a high-resolution radiation hybrid (RH) panel for the Nile tilapia and genotyped 1358 markers consisting of 850 genes, 82 markers corresponding to BAC end sequences, 154 microsatellites and 272 single nucleotide polymorphisms (SNPs). From these, 1296 markers could be associated in 81 RH groups, while 62 were not linked. The total size of the RH map is 34,084 cR3500 and 937,310 kb. It covers 88% of the entire genome with an estimated inter-marker distance of 742 kb. Mapping of microsatellites enabled integration to the genetic map. We have merged LG8 and LG24 into a single linkage group, and confirmed that LG16-LG21 are also merged. The orientation and association of RH groups to each chromosome and LG was confirmed by chromosomal in situ hybridizations (FISH) of 55 BACs. Fifty RH groups were localized on the 22 chromosomes while 31 remained small orphan groups. Synteny relationships were determined between Nile tilapia, stickleback, medaka and pufferfish. Conclusion: The RH map and associated FISH map provide a valuable gene-ordered resource for gene mapping and QTL studies. All genetic linkage groups with their corresponding RH groups now have a corresponding chromosome which can be identified in the karyotype. Placement of conserved segments indicated that multiple inter-chromosomal rearrangements have occurred between Nile tilapia

  17. Modeling and interoperability of heterogeneous genomic big data for integrative processing and querying.

    Science.gov (United States)

    Masseroli, Marco; Kaitoua, Abdulrahman; Pinoli, Pietro; Ceri, Stefano

    2016-12-01

    While a huge amount of (epi)genomic data of multiple types is becoming available by using Next Generation Sequencing (NGS) technologies, the most important emerging problem is the so-called tertiary analysis, concerned with sense making, e.g., discovering how different (epi)genomic regions and their products interact and cooperate with each other. We propose a paradigm shift in tertiary analysis, based on the use of the Genomic Data Model (GDM), a simple data model which links genomic feature data to their associated experimental, biological and clinical metadata. GDM encompasses all the data formats which have been produced for feature extraction from (epi)genomic datasets. We specifically describe the mapping to GDM of SAM (Sequence Alignment/Map), VCF (Variant Call Format), NARROWPEAK (for called peaks produced by NGS ChIP-seq or DNase-seq methods), and BED (Browser Extensible Data) formats, but GDM supports as well all the formats describing experimental datasets (e.g., including copy number variations, DNA somatic mutations, or gene expressions) and annotations (e.g., regarding transcription start sites, genes, enhancers or CpG islands). We downloaded and integrated samples of all the above-mentioned data types and formats from multiple sources. The GDM is able to homogeneously describe semantically heterogeneous data and makes the ground for providing data interoperability, e.g., achieved through the GenoMetric Query Language (GMQL), a high-level, declarative query language for genomic big data. The combined use of the data model and the query language allows comprehensive processing of multiple heterogeneous data, and supports the development of domain-specific data-driven computations and bio-molecular knowledge discovery. Copyright © 2016 Elsevier Inc. All rights reserved.
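
    The central idea above, linking genomic regions to the metadata of the sample they came from, can be illustrated with a small data structure. This is a minimal sketch under stated assumptions and not the authors' GDM schema or the GMQL language: the class names, field names, helper functions and half-open coordinate convention are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Region:
    """One genomic feature; coordinates are assumed half-open [start, stop)."""
    chrom: str
    start: int
    stop: int
    strand: str = "*"
    values: tuple = ()  # format-specific attributes (e.g. a peak score)


@dataclass
class Sample:
    """A set of regions plus free-form attribute-value metadata."""
    sample_id: str
    regions: list = field(default_factory=list)
    metadata: dict = field(default_factory=dict)


def select_by_metadata(samples, **criteria):
    """Keep samples whose metadata match every given attribute-value pair."""
    return [
        s for s in samples
        if all(s.metadata.get(key) == value for key, value in criteria.items())
    ]


def overlapping(sample, chrom, start, stop):
    """Return the sample's regions that overlap [start, stop) on chrom."""
    return [
        r for r in sample.regions
        if r.chrom == chrom and r.start < stop and start < r.stop
    ]


# Hypothetical example data: one peak file and one variant file.
peaks = Sample(
    "s1",
    regions=[Region("chr1", 100, 200, values=(7.5,)), Region("chr2", 50, 80)],
    metadata={"assay": "ChIP-seq", "format": "NARROWPEAK"},
)
variants = Sample(
    "s2",
    regions=[Region("chr1", 150, 151, values=("A", "G"))],
    metadata={"assay": "WGS", "format": "VCF"},
)

chip_samples = select_by_metadata([peaks, variants], assay="ChIP-seq")
hits = overlapping(peaks, "chr1", 150, 151)
```

    Because every sample shares the same region fields regardless of its source format, a query can first filter on metadata and then operate on coordinates. The abstract attributes this kind of combined processing to the data model and query language used together.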

  18. Ensembl Genomes: an integrative resource for genome-scale data from non-vertebrate species.

    Science.gov (United States)

    Kersey, Paul J; Staines, Daniel M; Lawson, Daniel; Kulesha, Eugene; Derwent, Paul; Humphrey, Jay C; Hughes, Daniel S T; Keenan, Stephan; Kerhornou, Arnaud; Koscielny, Gautier; Langridge, Nicholas; McDowall, Mark D; Megy, Karine; Maheswari, Uma; Nuhn, Michael; Paulini, Michael; Pedro, Helder; Toneva, Iliana; Wilson, Derek; Yates, Andrew; Birney, Ewan

    2012-01-01

    Ensembl Genomes (http://www.ensemblgenomes.org) is an integrative resource for genome-scale data from non-vertebrate species. The project exploits and extends technology (for genome annotation, analysis and dissemination) developed in the context of the (vertebrate-focused) Ensembl project and provides a complementary set of resources for non-vertebrate species through a consistent set of programmatic and interactive interfaces. These provide access to data including reference sequence, gene models, transcriptional data, polymorphisms and comparative analysis. Since its launch in 2009, Ensembl Genomes has undergone rapid expansion, with the goal of providing coverage of all major experimental organisms, and additionally including taxonomic reference points to provide the evolutionary context in which genes can be understood. Against the backdrop of a continuing increase in genome sequencing activities in all parts of the tree of life, we seek to work, wherever possible, with the communities actively generating and using data, and are participants in a growing range of collaborations involved in the annotation and analysis of genomes.

  19. An Integrated Tone Mapping for High Dynamic Range Image Visualization

    Science.gov (United States)

    Liang, Lei; Pan, Jeng-Shyang; Zhuang, Yongjun

    2018-01-01

    There are two types of tone mapping operators for high dynamic range (HDR) image visualization. HDR images mapped by perceptual operators have a strong sense of realism but lose local detail. Empirical operators can maximize the local detail of an HDR image, but the result is less realistic. No common tone mapping operator suitable for all applications is available. This paper proposes a novel integrated tone mapping framework that can convert between empirical and perceptual operators. In this framework, the empirical operator is rendered based on an improved saliency map, which simulates the visual attention mechanism of the human eye for natural scenes. The results of objective evaluation demonstrate the effectiveness of the proposed solution.

  20. Genome-wide conserved non-coding microsatellite (CNMS) marker-based integrative genetical genomics for quantitative dissection of seed weight in chickpea.

    Science.gov (United States)

    Bajaj, Deepak; Saxena, Maneesha S; Kujur, Alice; Das, Shouvik; Badoni, Saurabh; Tripathi, Shailesh; Upadhyaya, Hari D; Gowda, C L L; Sharma, Shivali; Singh, Sube; Tyagi, Akhilesh K; Parida, Swarup K

    2015-03-01

    Phylogenetic footprinting identified 666 genome-wide paralogous and orthologous CNMS (conserved non-coding microsatellite) markers from 5'-untranslated and regulatory regions (URRs) of 603 protein-coding chickpea genes. The (CT)n and (GA)n CNMS carrying CTRMCAMV35S and GAGA8BKN3 regulatory elements, respectively, are abundant in the chickpea genome. The mapped genic CNMS markers with robust amplification efficiencies (94.7%) detected higher intraspecific polymorphic potential (37.6%) among genotypes, implying their immense utility in chickpea breeding and genetic analyses. Seventeen differentially expressed CNMS marker-associated genes showing strong preferential and seed tissue/developmental stage-specific expression in contrasting genotypes were selected to narrow down the gene targets underlying seed weight quantitative trait loci (QTLs)/eQTLs (expression QTLs) through integrative genetical genomics. The integration of transcript profiling with seed weight QTL/eQTL mapping, molecular haplotyping, and association analyses identified potential molecular tags (GAGA8BKN3 and RAV1AAT regulatory elements and alleles/haplotypes) in the LOB-domain-containing protein- and KANADI protein-encoding transcription factor genes controlling the cis-regulated expression for seed weight in the chickpea. This emphasizes the potential of CNMS marker-based integrative genetical genomics for the quantitative genetic dissection of complex seed weight in chickpea. © The Author 2014. Published by Oxford University Press on behalf of the Society for Experimental Biology.

  1. Saturated linkage map construction in Rubus idaeus using genotyping by sequencing and genome-independent imputation

    Directory of Open Access Journals (Sweden)

    Ward Judson A

    2013-01-01

    Background: Rapid development of highly saturated genetic maps aids molecular breeding, which can accelerate gain per breeding cycle in woody perennial plants such as Rubus idaeus (red raspberry). Recently, robust genotyping methods based on high-throughput sequencing were developed, which provide high marker density, but result in some genotype errors and a large number of missing genotype values. Imputation can reduce the number of missing values and can correct genotyping errors, but current methods of imputation require a reference genome and thus are not an option for most species. Results: Genotyping by Sequencing (GBS) was used to produce highly saturated maps for a R. idaeus pseudo-testcross progeny. While low coverage and high variance in sequencing resulted in a large number of missing values for some individuals, a novel method of imputation based on maximum likelihood marker ordering from initial marker segregation overcame the challenge of missing values, and made map construction computationally tractable. The two resulting parental maps contained 4521 and 2391 molecular markers spanning 462.7 and 376.6 cM respectively over seven linkage groups. Detection of precise genomic regions with segregation distortion was possible because of map saturation. Microsatellites (SSRs) linked these results to published maps for cross-validation and map comparison. Conclusions: GBS together with genome-independent imputation provides a rapid method for genetic map construction in any pseudo-testcross progeny. Our method of imputation estimates the correct genotype call of missing values and corrects genotyping errors that lead to inflated map size and reduced precision in marker placement. Comparison of SSRs to published R. idaeus maps showed that the linkage maps constructed with GBS and our method of imputation were robust, and marker positioning reliable. The high marker density allowed identification of genomic regions with segregation

  2. Genome puzzle master (GPM): an integrated pipeline for building and editing pseudomolecules from fragmented sequences.

    Science.gov (United States)

    Zhang, Jianwei; Kudrna, Dave; Mu, Ting; Li, Weiming; Copetti, Dario; Yu, Yeisoo; Goicoechea, Jose Luis; Lei, Yang; Wing, Rod A

    2016-10-15

    Next generation sequencing technologies have revolutionized our ability to rapidly and affordably generate vast quantities of sequence data. Once generated, raw sequences are assembled into contigs or scaffolds. However, these assemblies are mostly fragmented and inaccurate at the whole genome scale, largely due to the inability to integrate additional informative datasets (e.g. physical, optical and genetic maps). To address this problem, we developed a semi-automated software tool, Genome Puzzle Master (GPM), that enables the integration of additional genomic signposts to edit and build 'new-gen-assemblies' that result in high-quality 'annotation-ready' pseudomolecules. With GPM, loaded datasets can be connected to each other via their logical relationships, which accomplishes tasks to 'group,' 'merge,' 'order and orient' sequences in a draft assembly. Manual editing can also be performed with a user-friendly graphical interface. Final pseudomolecules reflect a user's total data package and are available for long-term project management. GPM is a web-based pipeline and an important part of a Laboratory Information Management System (LIMS) which can be easily deployed on local servers for any genome research laboratory. The GPM (with LIMS) package is available at https://github.com/Jianwei-Zhang/LIMS. Contacts: jzhang@mail.hzau.edu.cn or rwing@mail.arizona.edu. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.

  3. A consensus linkage map of the chicken genome

    NARCIS (Netherlands)

    Groenen, M.A.M.; Cheng, H.H.; Bumstead, N.; Benkel, B.; Briles, E.; Burt, D.W.; Burke, T.; Dodgson, J.; Hillel, J.; Lamont, S.; Ponce, de F.A.; Soller, M.

    2000-01-01

    A consensus linkage map has been developed in the chicken that combines all of the genotyping data from the three available chicken mapping populations. Genotyping data were contributed by the laboratories that have been using the East Lansing and Compton reference populations and from the Animal

  4. Integration of genomic information with biological networks using Cytoscape.

    Science.gov (United States)

    Bauer-Mehren, Anna

    2013-01-01

    Cytoscape is an open-source software for visualizing, analyzing, and modeling biological networks. This chapter explains how to use Cytoscape to analyze the functional effect of sequence variations in the context of biological networks such as protein-protein interaction networks and signaling pathways. The chapter is divided into five parts: (1) obtaining information about the functional effect of sequence variation in a Cytoscape readable format, (2) loading and displaying different types of biological networks in Cytoscape, (3) integrating the genomic information (SNPs and mutations) with the biological networks, and (4) analyzing the effect of the genomic perturbation onto the network structure using Cytoscape built-in functions. Finally, we briefly outline how the integrated data can help in building mathematical network models for analyzing the effect of the sequence variation onto the dynamics of the biological system. Each part is illustrated by step-by-step instructions on an example use case and visualized by many screenshots and figures.

  5. Mapping and characterizing N6-methyladenine in eukaryotic genomes using single molecule real-time sequencing.

    Science.gov (United States)

    Zhu, Shijia; Beaulaurier, John; Deikus, Gintaras; Wu, Tao; Strahl, Maya; Hao, Ziyang; Luo, Guanzheng; Gregory, James A; Chess, Andrew; He, Chuan; Xiao, Andrew; Sebra, Robert; Schadt, Eric E; Fang, Gang

    2018-05-15

    N6-methyladenine (m6dA) has been discovered as a novel form of DNA methylation prevalent in eukaryotes; however, methods for high resolution mapping of m6dA events are still lacking. Single-molecule real-time (SMRT) sequencing has enabled the detection of m6dA events at single-nucleotide resolution in prokaryotic genomes, but its application to detecting m6dA in eukaryotic genomes has not been rigorously examined. Herein, we identified unique characteristics of eukaryotic m6dA methylomes that fundamentally differ from those of prokaryotes. Based on these differences, we describe the first approach for mapping m6dA events using SMRT sequencing specifically designed for the study of eukaryotic genomes, and provide appropriate strategies for designing experiments and carrying out sequencing in future studies. We apply the novel approach to study two eukaryotic genomes. For green algae, we construct the first complete genome-wide map of m6dA at single nucleotide and single molecule resolution. For human lymphoblastoid cells (hLCLs), joint analyses of SMRT sequencing and independent sequencing data suggest that putative m6dA events are enriched in the promoters of young, full length LINE-1 elements (L1s). These analyses demonstrate a general method for rigorous mapping and characterization of m6dA events in eukaryotic genomes. Published by Cold Spring Harbor Laboratory Press.

  6. In vitro analysis of integrated global high-resolution DNA methylation profiling with genomic imbalance and gene expression in osteosarcoma.

    Directory of Open Access Journals (Sweden)

    Bekim Sadikovic

    Genetic and epigenetic changes contribute to deregulation of gene expression and development of human cancer. Changes in DNA methylation are key epigenetic factors regulating gene expression and genomic stability. Recent progress in microarray technologies has resulted in the development of high resolution platforms for profiling of genetic, epigenetic and gene expression changes. Osteosarcoma (OS) is a pediatric bone tumor with a characteristically high level of numerical and structural chromosomal changes. Furthermore, little is known about DNA methylation changes in OS. Our objective was to develop an integrative approach for analysis of high-resolution epigenomic, genomic, and gene expression profiles in order to identify functional epi/genomic differences between OS cell lines and normal human osteoblasts. A combination of Affymetrix Promoter Tiling Arrays for DNA methylation, Agilent array-CGH platform for genomic imbalance and Affymetrix Gene 1.0 platform for gene expression analysis was used. As a result, an integrative high-resolution approach for interrogation of genome-wide tumour-specific changes in DNA methylation was developed. This approach was used to provide the first genomic DNA methylation maps, and to identify and validate genes with aberrant DNA methylation in OS cell lines. This first integrative analysis of global cancer-related changes in DNA methylation, genomic imbalance, and gene expression has provided comprehensive evidence of the cumulative roles of epigenetic and genetic mechanisms in deregulation of gene expression networks.

  7. A Targeted Capture Linkage Map Anchors the Genome of the Schistosomiasis Vector Snail, Biomphalaria glabrata.

    Science.gov (United States)

    Tennessen, Jacob A; Bollmann, Stephanie R; Blouin, Michael S

    2017-07-05

    The aquatic planorbid snail Biomphalaria glabrata is one of the most intensively-studied mollusks due to its role in the transmission of schistosomiasis. Its 916 Mb genome has recently been sequenced and annotated, but it remains poorly assembled. Here, we used targeted capture markers to map over 10,000 B. glabrata scaffolds in a linkage cross of 94 F1 offspring, generating 24 linkage groups (LGs). We added additional scaffolds to these LGs based on linkage disequilibrium (LD) analysis of targeted capture and whole-genome sequences of 96 unrelated snails. Our final linkage map consists of 18,613 scaffolds comprising 515 Mb, representing 56% of the genome and 75% of genic and nonrepetitive regions. There are 18 large (> 10 Mb) LGs, likely representing the expected 18 haploid chromosomes, and > 50% of the genome has been assigned to LGs of at least 17 Mb. Comparisons with other gastropod genomes reveal patterns of synteny and chromosomal rearrangements. Linkage relationships of key immune-relevant genes may help clarify snail-schistosome interactions. By focusing on linkage among genic and nonrepetitive regions, we have generated a useful resource for associating snail phenotypes with causal genes, even in the absence of a complete genome assembly. A similar approach could potentially improve numerous poorly-assembled genomes in other taxa. This map will facilitate future work on this host of a serious human parasite. Copyright © 2017 Tennessen et al.

  8. REST-MapReduce: An Integrated Interface but Differentiated Service

    Directory of Open Access Journals (Sweden)

    Jong-Hyuk Park

    2014-01-01

    With the fast deployment of cloud computing, MapReduce architectures are becoming the major technologies for mobile cloud computing. The concept of MapReduce was first introduced as a novel programming model and implementation for a large set of computing devices. In this research, we propose a novel concept of REST-MapReduce, enabling users to use only the REST interface without using the MapReduce architecture. This approach provides a higher level of abstraction by integrating the two types of access interface, REST API and MapReduce. The motivation for this research stems from the slower response time for accessing a simple RDBMS on Hadoop than for direct access to the RDBMS. This is because of the overhead of job scheduling, initiating, starting, tracking, and management during MapReduce-based parallel execution. Therefore, we provide good performance for the REST Open API service and for MapReduce, respectively. This is very useful for constructing REST Open API services on Hadoop hosting services, for example, Amazon AWS (Macdonald, 2005) or IBM Smart Cloud. To evaluate the performance of our REST-MapReduce framework, we conducted experiments with the Jersey REST web server and Hadoop. Experimental results show that our approach outperforms conventional approaches.

  9. Genome wide characterization of simple sequence repeats in watermelon genome and their application in comparative mapping and genetic diversity analysis.

    Science.gov (United States)

    Zhu, Huayu; Song, Pengyao; Koo, Dal-Hoe; Guo, Luqin; Li, Yanman; Sun, Shouru; Weng, Yiqun; Yang, Luming

    2016-08-05

    clustered in another group. Furthermore, structure analysis was consistent with the dendrogram indicating the 134 watermelon accessions were classified into two populations. The large number of genome wide SSR markers developed herein from the watermelon genome provides a valuable resource for genetic map construction, QTL exploration, map-based gene cloning and marker-assisted selection in watermelon which has a very narrow genetic base and extremely low polymorphism among cultivated lines. Furthermore, the cross-species transferable SSR markers identified herein should also have practical uses in many applications in species of Cucurbitaceae family whose whole genome sequences are not yet available.

  10. Construction of an integrated database to support genomic sequence analysis

    Energy Technology Data Exchange (ETDEWEB)

    Gilbert, W.; Overbeek, R.

    1994-11-01

    The central goal of this project is to develop an integrated database to support comparative analysis of genomes including DNA sequence data, protein sequence data, gene expression data and metabolism data. In developing the logic-based system GenoBase, a broader integration of available data was achieved due to assistance from collaborators. Current goals are to easily include new forms of data as they become available and to easily navigate through the ensemble of objects described within the database. This report comments on progress made in these areas.

  11. High-definition mapping of retroviral integration sites defines the fate of allogeneic T cells after donor lymphocyte infusion.

    Directory of Open Access Journals (Sweden)

    Claudia Cattoglio

    2010-12-01

    The infusion of donor lymphocytes transduced with a retroviral vector expressing the HSV-TK suicide gene in patients undergoing hematopoietic stem cell transplantation for leukemia/lymphoma promotes immune reconstitution and prevents infections and graft-versus-host disease. Analysis of the clonal dynamics of genetically modified lymphocytes in vivo is of crucial importance to understand the potential genotoxic risk of this therapeutic approach. We used linear amplification-mediated PCR and pyrosequencing to build a genome-wide, high-definition map of retroviral integration sites in the genome of peripheral blood T cells from two different donors and used gene expression profiling and bioinformatics to associate integration clusters to transcriptional activity and to genetic and epigenetic features of the T cell genome. Comparison with matched random controls and with integrations obtained from CD34(+) hematopoietic stem/progenitor cells showed that integration clusters occur within chromatin regions bearing epigenetic marks associated with active promoters and regulatory elements in a cell-specific fashion. Analysis of integration sites in T cells obtained ex vivo two months after infusion showed no evidence of integration-related clonal expansion or dominance, but rather loss of cells harboring integration events interfering with RNA post-transcriptional processing. The study shows that high-definition maps of retroviral integration sites are a powerful tool to analyze the fate of genetically modified T cells in patients and the biological consequences of retroviral transduction.

  12. A High Resolution Genetic Map Anchoring Scaffolds of the Sequenced Watermelon Genome

    Science.gov (United States)

    Kou, Qinghe; Jiang, Jiao; Guo, Shaogui; Zhang, Haiying; Hou, Wenju; Zou, Xiaohua; Sun, Honghe; Gong, Guoyi; Levi, Amnon; Xu, Yong

    2012-01-01

    As part of our ongoing efforts to sequence and map the watermelon (Citrullus spp.) genome, we have constructed a high density genetic linkage map. The map positioned 234 watermelon genome sequence scaffolds (an average size of 1.41 Mb) that cover about 330 Mb and account for 93.5% of the 353 Mb of the assembled genomic sequences of the elite Chinese watermelon line 97103 (Citrullus lanatus var. lanatus). The genetic map was constructed using an F8 population of 103 recombinant inbred lines (RILs). The RILs are derived from a cross between the line 97103 and the United States Plant Introduction (PI) 296341-FR (C. lanatus var. citroides) that contains resistance to fusarium wilt (races 0, 1, and 2). The genetic map consists of eleven linkage groups that include 698 simple sequence repeat (SSR), 219 insertion-deletion (InDel) and 36 structure variation (SV) markers and spans ∼800 cM with a mean marker interval of 0.8 cM. Using fluorescent in situ hybridization (FISH) with 11 BACs that produced chromosome-specific signals, we have depicted watermelon chromosomes that correspond to the eleven linkage groups constructed in this study. The high resolution genetic map developed here should be a useful platform for the assembly of the watermelon genome, for the development of sequence-based markers used in breeding programs, and for the identification of genes associated with important agricultural traits. PMID:22247776
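
    The summary figures in this abstract are mutually consistent, as a short calculation shows. The sketch below uses only numbers quoted above; the variable names are illustrative, and the mean interval is approximated as map length divided by marker count, which ignores the fact that each linkage group has one fewer interval than markers.

```python
# Figures quoted in the abstract.
N_SCAFFOLDS = 234
MEAN_SCAFFOLD_MB = 1.41
ASSEMBLY_MB = 353.0
MAP_LENGTH_CM = 800.0          # the abstract gives this as approximate
MARKER_COUNTS = {"SSR": 698, "InDel": 219, "SV": 36}

# Anchored sequence: scaffold count times mean scaffold size.
anchored_mb = N_SCAFFOLDS * MEAN_SCAFFOLD_MB

# Share of the assembled sequence placed on the genetic map.
anchored_percent = 100.0 * anchored_mb / ASSEMBLY_MB

# Total markers and approximate mean spacing between them.
total_markers = sum(MARKER_COUNTS.values())
mean_interval_cm = MAP_LENGTH_CM / total_markers

if __name__ == "__main__":
    print(f"Anchored sequence: {anchored_mb:.2f} Mb")
    print(f"Share of assembly: {anchored_percent:.1f}%")
    print(f"Total markers:     {total_markers}")
    print(f"Mean interval:     {mean_interval_cm:.1f} cM")
```

    The computed values agree with the reported "about 330 Mb", 93.5% and 0.8 cM after rounding.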

  13. Single-molecule approach to bacterial genomic comparisons via optical mapping.

    Energy Technology Data Exchange (ETDEWEB)

    Zhou, Shiguo [Univ. Wisc.-Madison; Kile, A. [Univ. Wisc.-Madison; Bechner, M. [Univ. Wisc.-Madison; Kvikstad, E. [Univ. Wisc.-Madison; Deng, W. [Univ. Wisc.-Madison; Wei, J. [Univ. Wisc.-Madison; Severin, J. [Univ. Wisc.-Madison; Runnheim, R. [Univ. Wisc.-Madison; Churas, C. [Univ. Wisc.-Madison; Forrest, D. [Univ. Wisc.-Madison; Dimalanta, E. [Univ. Wisc.-Madison; Lamers, C. [Univ. Wisc.-Madison; Burland, V. [Univ. Wisc.-Madison; Blattner, F. R. [Univ. Wisc.-Madison; Schwartz, David C. [Univ. Wisc.-Madison

    2004-01-01

    Modern comparative genomics has been established, in part, by the sequencing and annotation of a broad range of microbial species. To gain further insights, new sequencing efforts are now dealing with the variety of strains or isolates that gives a species definition and range; however, this number vastly outstrips our ability to sequence them. Given the availability of a large number of microbial species, new whole genome approaches must be developed to fully leverage this information at the level of strain diversity that maximize discovery. Here, we describe how optical mapping, a single-molecule system, was used to identify and annotate chromosomal alterations between bacterial strains represented by several species. Since whole-genome optical maps are ordered restriction maps, sequenced strains of Shigella flexneri serotype 2a (2457T and 301), Yersinia pestis (CO 92 and KIM), and Escherichia coli were aligned as maps to identify regions of homology and to further characterize them as possible insertions, deletions, inversions, or translocations. Importantly, an unsequenced Shigella flexneri strain (serotype Y strain AMC[328Y]) was optically mapped and aligned with two sequenced ones to reveal one novel locus implicated in serotype conversion and several other loci containing insertion sequence elements or phage-related gene insertions. Our results suggest that genomic rearrangements and chromosomal breakpoints are readily identified and annotated against a prototypic sequenced strain by using the tools of optical mapping.

  14. An integrated genetic, physical, and transcriptional map of chromosome 13

    Energy Technology Data Exchange (ETDEWEB)

    Scheffer, H.; Kooy, R.F.; Wijngaard, A. [Univ. of Groningen (Netherlands)] [and others]

    1994-09-01

    In this study, a genetic map containing 20 markers and typed in 40 CEPH families is presented. It includes 7 thus far untyped microsatellite markers, 7 that have previously been mapped on a subset of 8 CEPH families, one reference marker, D13S71, and three telomeric VNTR markers. Also, 4 intragenic RB1 markers were typed. The markers have an average heterozygosity of 73% (80%, excluding the three RFLPs). The total sex-averaged length of the map is 140 cM. The mean female to male ratio is 1.54. For the non-telomeric part of the chromosome between the markers D13S221 in 13q12 and D13S173 in 13q33-q34, this ratio is 1.99. This ratio is reversed in the telomeric part of the chromosome between D13S173 and D13S234 in distal 13q34, where it is 0.47. A high new mutation frequency of 1% was detected in the (CTTT(T))n repeat in intron 20 of the RB1 gene. The map has been integrated with 7 microsatellite markers and 2 RFLP markers from CEPH database version 7.0, resulting in a map with 32 markers (28 loci) of chromosome 13q. In addition, a deletion hybrid breakpoint map ordering 50 markers in 18 intervals was constructed. It includes 32 microsatellite markers, 4 genes, 5 STSs, and 9 ESTs. Each of the 18 intervals contains at least one microsatellite marker included in the extended genetic map. These data allow a correlation between the genetic and physical maps of chromosome 13. New ESTs are currently being identified and localized on this integrated map.

  15. Mapping of the genomic regions controlling seed storability in soybean

    Indian Academy of Sciences (India)

    Composite interval mapping identified a total of three. QTLs on linkage ..... Soybean seeds decline in quality faster than seeds of other crops (Fabrizius et al. 1999). ... harvest and postharvest management practices (Lewis et al. 1998). Cho and ...

  16. A Secure Alignment Algorithm for Mapping Short Reads to Human Genome.

    Science.gov (United States)

    Zhao, Yongan; Wang, Xiaofeng; Tang, Haixu

    2018-05-09

    Elastic and inexpensive computing resources such as clouds have been recognized as a useful solution for analyzing massive human genomic data (e.g., acquired by using next-generation sequencers) in biomedical research. However, outsourcing human genome computation to public or commercial clouds has been hindered by privacy concerns: even a small number of human genome sequences contain sufficient information for identifying the donor of the genomic data. This issue cannot be directly addressed by existing security and cryptographic techniques (such as homomorphic encryption), because they are too heavyweight to carry out practical genome computation tasks on massive data. In this article, we present a secure algorithm to accomplish read mapping, one of the most basic tasks in human genomic data analysis, based on a hybrid cloud computing model. Compared with the existing approaches, our algorithm delegates most computation to the public cloud, while only performing encryption and decryption on the private cloud, and thus makes maximum use of the computing resources of the public cloud. Furthermore, our algorithm reports results similar to those of nonsecure read mapping algorithms, including the alignment between reads and the reference genome, which can be directly used in downstream analysis such as the inference of genomic variations. We implemented the algorithm in C++ and Python on a hybrid cloud system, in which the public cloud uses an Apache Spark system.

  17. Construction of an integrated genetic map for Capsicum baccatum L.

    Science.gov (United States)

    Moulin, M M; Rodrigues, R; Ramos, H C C; Bento, C S; Sudré, C P; Gonçalves, L S A; Viana, A P

    2015-06-18

    Capsicum baccatum L. is one of the five Capsicum domesticated species and has multiple uses in the food, pharmaceutical and cosmetic industries. This species is also a valuable source of genes for chili pepper breeding, especially genes for disease resistance and fruit quality. However, knowledge of the genetic structure of C. baccatum is limited. A reference map for C. baccatum (2n = 2x = 24) based on 42 microsatellite, 85 inter-simple sequence repeat, and 56 random amplified polymorphic DNA markers was constructed using an F2 population consisting of 203 individuals. The map was generated using the JoinMap software (version 4.0) and the linkage groups were formed and ordered using a LOD score of 3.0 and maximum of 40% recombination. The genetic map consisted of 12 major and four minor linkage groups covering a total genome distance of 2547.5 cM with an average distance of 14.25 cM between markers. Of the 152 pairs of microsatellite markers available for Capsicum annuum, 62 were successfully transferred to C. baccatum, generating polymorphism. Forty-two of these markers were mapped, allowing the introduction of C. baccatum in synteny studies with other species of the genus Capsicum.
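
    The LOD threshold of 3.0 and the 40% recombination ceiling quoted above are the grouping criteria for the linkage analysis. As a minimal sketch of what a two-point LOD score measures, the Python below assumes the simplest textbook case of phase-known, fully informative meioses (as in a backcross). It is not JoinMap's algorithm, and an F2 population with dominant markers needs a more involved likelihood. The function name and the example counts are hypothetical.

```python
import math

def two_point_lod(recombinants: int, meioses: int) -> tuple:
    """Return (theta_hat, LOD) for a phase-known two-point linkage test.

    theta_hat is the observed recombination fraction, capped at 0.5.
    LOD = log10( L(theta_hat) / L(0.5) ), the support for linkage
    relative to free recombination.
    """
    if meioses <= 0 or not 0 <= recombinants <= meioses:
        raise ValueError("need meioses > 0 and 0 <= recombinants <= meioses")
    k, n = recombinants, meioses
    theta = min(k / n, 0.5)
    # Log-likelihood at theta_hat; terms with a zero count contribute nothing.
    log_l = 0.0
    if k > 0:
        log_l += k * math.log10(theta)
    if n - k > 0:
        log_l += (n - k) * math.log10(1.0 - theta)
    log_null = n * math.log10(0.5)
    return theta, log_l - log_null

# Hypothetical marker pair: 10 recombinants among 100 informative meioses.
theta, lod = two_point_lod(10, 100)
# Apply the two thresholds quoted in the abstract.
linked = lod >= 3.0 and theta <= 0.40
```

    With these made-up counts the pair would pass both thresholds; a pair with 50 recombinants in 100 meioses gives a LOD of zero and would not be grouped.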

  18. Inter-simple sequence repeat (ISSR) loci mapping in the genome of perennial ryegrass

    DEFF Research Database (Denmark)

    Pivorienė, O; Pašakinskienė, I; Brazauskas, G

    2008-01-01

    The aim of this study was to identify and characterize new ISSR markers and their loci in the genome of perennial ryegrass. A subsample of the VrnA F2 mapping family of perennial ryegrass comprising 92 individuals was used to develop a linkage map including inter-simple sequence repeat markers...... demonstrated a 70% similarity to the Hordeum vulgare germin gene GerA. Inter-SSR mapping will provide useful information for gene targeting, quantitative trait loci mapping and marker-assisted selection in perennial ryegrass....

  19. Mapping our genes: The genome projects: How big, how fast

    Energy Technology Data Exchange (ETDEWEB)

    none,

    1988-04-01

    For the past 2 years, scientific and technical journals in biology and medicine have extensively covered a debate about whether and how to determine the function and order of human genes on human chromosomes and when to determine the sequence of molecular building blocks that comprise DNA in those chromosomes. In 1987, these issues rose to become part of the public agenda. The debate involves science, technology, and politics. Congress is responsible for "writing the rules" of what various federal agencies do and for funding their work. This report surveys the points made so far in the debate, focusing on those that most directly influence the policy options facing the US Congress. Congressional interest focused on how to assess the rationales for conducting human genome projects, how to fund human genome projects (at what level and through which mechanisms), how to coordinate the scientific and technical programs of the several federal agencies and private interests already supporting various genome projects, and how to strike a balance regarding the impact of genome projects on international scientific cooperation and international economic competition in biotechnology. OTA prepared this report with the assistance of several hundred experts throughout the world. 342 refs., 26 figs., 11 tabs.

  20. Mapping Our Genes: The Genome Projects: How Big, How Fast

    Science.gov (United States)

    1988-04-01

    For the past 2 years, scientific and technical journals in biology and medicine have extensively covered a debate about whether and how to determine the function and order of human genes on human chromosomes and when to determine the sequence of molecular building blocks that comprise DNA in those chromosomes. In 1987, these issues rose to become part of the public agenda. The debate involves science, technology, and politics. Congress is responsible for "writing the rules" of what various federal agencies do and for funding their work. This report surveys the points made so far in the debate, focusing on those that most directly influence the policy options facing the US Congress. Congressional interest focused on how to assess the rationales for conducting human genome projects, how to fund human genome projects (at what level and through which mechanisms), how to coordinate the scientific and technical programs of the several federal agencies and private interests already supporting various genome projects, and how to strike a balance regarding the impact of genome projects on international scientific cooperation and international economic competition in biotechnology. The Office of Technology Assessment (OTA) prepared this report with the assistance of several hundred experts throughout the world.

  1. Accurate estimation of short read mapping quality for next-generation genome sequencing

    Science.gov (United States)

    Ruffalo, Matthew; Koyutürk, Mehmet; Ray, Soumya; LaFramboise, Thomas

    2012-01-01

    Motivation: Several software tools specialize in the alignment of short next-generation sequencing reads to a reference sequence. Some of these tools report a mapping quality score for each alignment; in principle, this quality score tells researchers the likelihood that the alignment is correct. However, the reported mapping quality often correlates weakly with actual accuracy and the qualities of many mappings are underestimated, encouraging researchers to discard correct mappings. Further, these low-quality mappings tend to correlate with variations in the genome (both single nucleotide and structural), and such mappings are important in accurately identifying genomic variants. Approach: We develop a machine learning tool, LoQuM (LOgistic regression tool for calibrating the Quality of short read mappings), to assign reliable mapping quality scores to mappings of Illumina reads returned by any alignment tool. LoQuM uses statistics on the read (base quality scores reported by the sequencer) and the alignment (number of matches, mismatches and deletions, mapping quality score returned by the alignment tool, if available, and number of mappings) as features for classification and uses simulated reads to learn a logistic regression model that relates these features to actual mapping quality. Results: We test the predictions of LoQuM on an independent dataset generated by the ART short read simulation software and observe that LoQuM can ‘resurrect’ many mappings that are assigned zero quality scores by the alignment tools and are therefore likely to be discarded by researchers. We also observe that the recalibration of mapping quality scores greatly enhances the precision of called single nucleotide polymorphisms. Availability: LoQuM is available as open source at http://compbio.case.edu/loqum/. Contact: matthew.ruffalo@case.edu. PMID:22962451
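
    The recalibration idea above can be sketched in a few lines: a logistic model turns alignment features into a probability that the mapping is correct, and that probability is then expressed on the Phred scale conventionally used for mapping quality, Q = -10 log10(P(wrong)). The Python below is only an illustration under stated assumptions. The feature values, weights, bias, and the cap of 60 are made up and are not LoQuM's fitted model or its actual feature encoding.

```python
import math

def logistic(z: float) -> float:
    """Standard logistic function, mapping a real score into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def calibrated_mapq(features, weights, bias=0.0, cap=60):
    """Phred-scaled quality from a logistic model of P(alignment correct).

    features and weights are equal-length sequences of floats
    (hypothetical here); cap bounds the reported quality.
    """
    z = bias + sum(w * x for w, x in zip(weights, features))
    p_wrong = 1.0 - logistic(z)
    if p_wrong <= 0.0:
        # Probability of error underflowed to zero: report the cap.
        return cap
    return min(cap, round(-10.0 * math.log10(p_wrong)))

# Hypothetical read: [fraction of matching bases, scaled mean base quality]
example_q = calibrated_mapq([0.98, 0.9], weights=[4.0, 2.0], bias=-3.0)
```

    A linear score of zero corresponds to a 50% chance of error, which is a quality of about 3; larger scores push the quality upward until it reaches the cap.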

  2. Toward a physical map of the genome of the nematode Caenorhabditis elegans

    International Nuclear Information System (INIS)

    Coulson, A.; Sulston, J.; Brenner, S.; Karn, J.

    1986-01-01

    A technique for digital characterization and comparison of DNA fragments, using restriction enzymes, is described. The technique is being applied to fragments from the nematode Caenorhabditis elegans (i) to facilitate cross-indexing of clones emanating from different laboratories and (ii) to construct a physical map of the genome. Eight hundred sixty clusters of clones, from 35 to 350 kilobases long and totaling about 60% of the genome, have been characterized

  3. Integrated genome-based studies of Shewanella ecophysiology

    Energy Technology Data Exchange (ETDEWEB)

    Segre Daniel; Beg Qasim

    2012-02-14

    This project was a component of the Shewanella Federation and, as such, contributed to the overall goal of applying genomic tools to better understand the eco-physiology and speciation of respiratory-versatile members of the Shewanella genus. Our role at Boston University was to perform bioreactor and high-throughput gene expression microarray experiments, and to combine dynamic flux balance modeling with experimentally obtained transcriptional and gene expression datasets from different growth conditions. In the first part of the project, we designed the S. oneidensis microarray probes for Affymetrix Inc. (based in California), then we identified the pathways of carbon utilization in the metal-reducing marine bacterium Shewanella oneidensis MR-1, using our newly designed high-density oligonucleotide Affymetrix microarray on Shewanella cells grown with various carbon sources. Next, using a combination of experimental and computational approaches, we built algorithms and methods to integrate the transcriptional and metabolic regulatory networks of S. oneidensis. Specifically, we combined mRNA microarray and metabolite measurements with statistical inference and dynamic flux balance analysis (dFBA) to study the transcriptional response of S. oneidensis MR-1 as it passes through exponential, stationary, and transition phases. By measuring time-dependent mRNA expression levels during batch growth of S. oneidensis MR-1 under two radically different nutrient compositions (minimal lactate and nutritionally rich LB medium), we obtain detailed snapshots of the regulatory strategies used by this bacterium to cope with gradually changing nutrient availability. In addition to traditional clustering, which provides a first indication of major regulatory trends and transcription factor activities, we developed and implemented a new computational approach for Dynamic Detection of Transcriptional Triggers (D2T2). This new method allows us to infer a putative topology of transcriptional dependencies

  4. Visualization of RNA structure models within the Integrative Genomics Viewer.

    Science.gov (United States)

    Busan, Steven; Weeks, Kevin M

    2017-07-01

    Analyses of the interrelationships between RNA structure and function are increasingly important components of genomic studies. The SHAPE-MaP strategy enables accurate RNA structure probing and realistic structure modeling of kilobase-length noncoding RNAs and mRNAs. Existing tools for visualizing RNA structure models are not suitable for efficient analysis of long, structurally heterogeneous RNAs. In addition, structure models are often advantageously interpreted in the context of other experimental data and gene annotation information, for which few tools currently exist. We have developed a module within the widely used and well supported open-source Integrative Genomics Viewer (IGV) that allows visualization of SHAPE and other chemical probing data, including raw reactivities, data-driven structural entropies, and data-constrained base-pair secondary structure models, in context with linear genomic data tracks. We illustrate the usefulness of visualizing RNA structure in the IGV by exploring structure models for a large viral RNA genome, comparing bacterial mRNA structure in cells with its structure under cell- and protein-free conditions, and comparing a noncoding RNA structure modeled using SHAPE data with a base-pairing model inferred through sequence covariation analysis. © 2017 Busan and Weeks; Published by Cold Spring Harbor Laboratory Press for the RNA Society.

  5. QTL mapping in white spruce: gene maps and genomic regions underlying adaptive traits across pedigrees, years and environments

    Science.gov (United States)

    2011-01-01

    Background: The genomic architecture of bud phenology and height growth remains poorly known in most forest trees. In non-model species, QTL studies have shown limited application because most often QTL data could not be validated from one experiment to another. The aim of our study was to overcome this limitation by basing QTL detection on the construction of genetic maps highly enriched in gene markers, and by assessing QTLs across pedigrees, years, and environments. Results: Four saturated individual linkage maps representing two unrelated mapping populations of 260 and 500 clonally replicated progeny were assembled from 471 to 570 markers, including from 283 to 451 gene SNPs obtained using a multiplexed genotyping assay. Thence, a composite linkage map was assembled with 836 gene markers. For individual linkage maps, a total of 33 distinct quantitative trait loci (QTLs) were observed for bud flush, 52 for bud set, and 52 for height growth. For the composite map, the corresponding numbers of QTL clusters were 11, 13, and 10. About 20% of QTLs were replicated between the two mapping populations and nearly 50% revealed spatial and/or temporal stability. Three to four occurrences of overlapping QTLs between characters were noted, indicating regions with potential pleiotropic effects. Moreover, some of the genes involved in the QTLs were also underlined by recent genome scans or expression profile studies. Overall, the proportion of phenotypic variance explained by each QTL ranged from 3.0 to 16.4% for bud flush, from 2.7 to 22.2% for bud set, and from 2.5 to 10.5% for height growth. Up to 70% of the total character variance could be accounted for by QTLs for bud flush or bud set, and up to 59% for height growth. Conclusions: This study provides a basic understanding of the genomic architecture related to bud flush, bud set, and height growth in a conifer species, and a useful indicator to compare with Angiosperms. It will serve as a basic reference to functional and

  6. A haplotype map of genomic variations and genome-wide association studies of agronomic traits in foxtail millet (Setaria italica).

    Science.gov (United States)

    Jia, Guanqing; Huang, Xuehui; Zhi, Hui; Zhao, Yan; Zhao, Qiang; Li, Wenjun; Chai, Yang; Yang, Lifang; Liu, Kunyan; Lu, Hengyun; Zhu, Chuanrang; Lu, Yiqi; Zhou, Congcong; Fan, Danlin; Weng, Qijun; Guo, Yunli; Huang, Tao; Zhang, Lei; Lu, Tingting; Feng, Qi; Hao, Hangfei; Liu, Hongkuan; Lu, Ping; Zhang, Ning; Li, Yuhui; Guo, Erhu; Wang, Shujun; Wang, Suying; Liu, Jinrong; Zhang, Wenfei; Chen, Guoqiu; Zhang, Baojin; Li, Wei; Wang, Yongfang; Li, Haiquan; Zhao, Baohua; Li, Jiayang; Diao, Xianmin; Han, Bin

    2013-08-01

    Foxtail millet (Setaria italica) is an important grain crop that is grown in arid regions. Here we sequenced 916 diverse foxtail millet varieties, identified 2.58 million SNPs and used 0.8 million common SNPs to construct a haplotype map of the foxtail millet genome. We classified the foxtail millet varieties into two divergent groups that are strongly correlated with early and late flowering times. We phenotyped the 916 varieties under five different environments and identified 512 loci associated with 47 agronomic traits by genome-wide association studies. We performed a de novo assembly of deeply sequenced genomes of a Setaria viridis accession (the wild progenitor of S. italica) and an S. italica variety and identified complex interspecies and intraspecies variants. We also identified 36 selective sweeps that seem to have occurred during modern breeding. This study provides fundamental resources for genetics research and genetic improvement in foxtail millet.

  7. Structured Matrix Completion with Applications to Genomic Data Integration.

    Science.gov (United States)

    Cai, Tianxi; Cai, T Tony; Zhang, Anru

    2016-01-01

    Matrix completion has attracted significant recent attention in many fields including statistics, applied mathematics and electrical engineering. Current literature on matrix completion focuses primarily on independent sampling models under which the individual observed entries are sampled independently. Motivated by applications in genomic data integration, we propose a new framework of structured matrix completion (SMC) to treat structured missingness by design. Specifically, our proposed method aims at efficient matrix recovery when a subset of the rows and columns of an approximately low-rank matrix are observed. We provide theoretical justification for the proposed SMC method and derive lower bound for the estimation errors, which together establish the optimal rate of recovery over certain classes of approximately low-rank matrices. Simulation studies show that the method performs well in finite sample under a variety of configurations. The method is applied to integrate several ovarian cancer genomic studies with different extent of genomic measurements, which enables us to construct more accurate prediction rules for ovarian cancer survival.

  8. An integrated semiconductor device enabling non-optical genome sequencing.

    Science.gov (United States)

    Rothberg, Jonathan M; Hinz, Wolfgang; Rearick, Todd M; Schultz, Jonathan; Mileski, William; Davey, Mel; Leamon, John H; Johnson, Kim; Milgrew, Mark J; Edwards, Matthew; Hoon, Jeremy; Simons, Jan F; Marran, David; Myers, Jason W; Davidson, John F; Branting, Annika; Nobile, John R; Puc, Bernard P; Light, David; Clark, Travis A; Huber, Martin; Branciforte, Jeffrey T; Stoner, Isaac B; Cawley, Simon E; Lyons, Michael; Fu, Yutao; Homer, Nils; Sedova, Marina; Miao, Xin; Reed, Brian; Sabina, Jeffrey; Feierstein, Erika; Schorn, Michelle; Alanjary, Mohammad; Dimalanta, Eileen; Dressman, Devin; Kasinskas, Rachel; Sokolsky, Tanya; Fidanza, Jacqueline A; Namsaraev, Eugeni; McKernan, Kevin J; Williams, Alan; Roth, G Thomas; Bustillo, James

    2011-07-20

    The seminal importance of DNA sequencing to the life sciences, biotechnology and medicine has driven the search for more scalable and lower-cost solutions. Here we describe a DNA sequencing technology in which scalable, low-cost semiconductor manufacturing techniques are used to make an integrated circuit able to directly perform non-optical DNA sequencing of genomes. Sequence data are obtained by directly sensing the ions produced by template-directed DNA polymerase synthesis using all-natural nucleotides on this massively parallel semiconductor-sensing device or ion chip. The ion chip contains ion-sensitive, field-effect transistor-based sensors in perfect register with 1.2 million wells, which provide confinement and allow parallel, simultaneous detection of independent sequencing reactions. Use of the most widely used technology for constructing integrated circuits, the complementary metal-oxide semiconductor (CMOS) process, allows for low-cost, large-scale production and scaling of the device to higher densities and larger array sizes. We show the performance of the system by sequencing three bacterial genomes, its robustness and scalability by producing ion chips with up to 10 times as many sensors and sequencing a human genome.

  9. Mapping of Micro-Tom BAC-End Sequences to the Reference Tomato Genome Reveals Possible Genome Rearrangements and Polymorphisms

    Science.gov (United States)

    Asamizu, Erika; Shirasawa, Kenta; Hirakawa, Hideki; Sato, Shusei; Tabata, Satoshi; Yano, Kentaro; Ariizumi, Tohru; Shibata, Daisuke; Ezura, Hiroshi

    2012-01-01

    A total of 93,682 BAC-end sequences (BESs) were generated from a dwarf model tomato, cv. Micro-Tom. After removing repetitive sequences, the BESs were similarity searched against the reference tomato genome of a standard cultivar, “Heinz 1706.” By referring to the “Heinz 1706” physical map and by eliminating redundant or nonsignificant hits, 28,804 “unique pair ends” and 8,263 “unique ends” were selected to construct hypothetical BAC contigs. The total physical length of the BAC contigs was 495,833,423 bp, covering 65.3% of the entire genome. The average coverage of euchromatin and heterochromatin was 58.9% and 67.3%, respectively. From this analysis, two possible genome rearrangements were identified: one in chromosome 2 (inversion) and the other in chromosome 3 (inversion and translocation). Polymorphisms (SNPs and Indels) between the two cultivars were identified from the BLAST alignments. As a result, 171,792 polymorphisms were mapped on 12 chromosomes. Among these, 30,930 polymorphisms were found in euchromatin (1 per 3,565 bp) and 140,862 were found in heterochromatin (1 per 2,737 bp). The average polymorphism density in the genome was 1 polymorphism per 2,886 bp. To facilitate the use of these data in Micro-Tom research, the BAC contig and polymorphism information are available in the TOMATOMICS database. PMID:23227037
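
    The density figures above can be checked with simple arithmetic. The sketch below assumes that the densities were computed over the total BAC contig length, which the abstract does not state explicitly; the variable and function names are illustrative only.

```python
def bp_per_polymorphism(length_bp: int, count: int) -> float:
    """Average spacing between polymorphisms, in base pairs."""
    return length_bp / count

# Figures quoted in the abstract.
contig_length_bp = 495_833_423
total_polymorphisms = 171_792

# Genome-wide average spacing, to compare with the reported 2,886 bp.
genome_spacing = bp_per_polymorphism(contig_length_bp, total_polymorphisms)

# Region lengths implied by the per-region counts and rounded spacings.
euchromatin_bp = 30_930 * 3_565
heterochromatin_bp = 140_862 * 2_737
implied_total_bp = euchromatin_bp + heterochromatin_bp

print(round(genome_spacing))                # spacing in bp, rounded
print(implied_total_bp / contig_length_bp)  # close to 1 if figures agree
```

    Under that assumption the genome-wide spacing rounds to the reported 2,886 bp, and the region lengths implied by the two per-region densities sum to within a fraction of a percent of the contig length, so the reported numbers are mutually consistent.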

  10. Integrated analysis of whole genome and transcriptome sequencing reveals diverse transcriptomic aberrations driven by somatic genomic changes in liver cancers.

    Directory of Open Access Journals (Sweden)

    Yuichi Shiraishi

    Recent studies applying high-throughput sequencing technologies have identified several recurrently mutated genes and pathways in multiple cancer genomes. However, the transcriptional consequences of these genomic alterations in cancer genomes remain unclear. In this study, we performed integrated and comparative analyses of whole genomes and transcriptomes of 22 hepatitis B virus (HBV)-related hepatocellular carcinomas (HCCs) and their matched controls. Comparison of whole genome sequence (WGS) and RNA-Seq revealed much evidence that various types of genomic mutations triggered diverse transcriptional changes. Not only splice-site mutations, but also silent mutations in coding regions, deep intronic mutations and structural changes caused splicing aberrations. HBV integrations generated diverse patterns of virus-human fusion transcripts depending on the affected gene, such as TERT, CDK15, FN1 and MLL4. Structural variations could drive over-expression of genes such as WNT ligands, with or without creating gene fusions. Furthermore, by taking account of genomic mutations causing transcriptional aberrations, we could improve the sensitivity of deleterious mutation detection in known cancer driver genes (TP53, AXIN1, ARID2, RPS6KA3), and identified recurrent disruptions in putative cancer driver genes such as HNF4A, CPS1, TSC1 and THRAP3 in HCCs. These findings indicate that genomic alterations in cancer genomes have diverse transcriptomic effects, and that integrated analysis of WGS and RNA-Seq can facilitate the interpretation of the large number of genomic alterations detected in cancer genomes.

  11. A map to a new treasure island: the human genome and the concept of common heritage.

    Science.gov (United States)

    Byk, C

    1998-06-01

    While the 1970's have been called the environmental years, the 1990's could be seen as the genome years. As the challenge to map and to sequence the human genome mobilized the scientific community, risks and benefits of information and uses that would derive from this project have also raised ethical issues at the international level. The particular interest of the 1997 UNESCO Declaration relies on the fact that it emphasizes both the scientific importance of genetics and the appropriate reinforcement of human rights in this area. It considers the human genome, at least symbolically, as the common heritage of humanity.

  12. Nucleotide diversity maps reveal variation in diversity among wheat genomes and chromosomes

    Directory of Open Access Journals (Sweden)

    McGuire Patrick E

    2010-12-01

    Background: A genome-wide assessment of nucleotide diversity in a polyploid species must minimize the inclusion of homoeologous sequences into diversity estimates and reliably allocate individual haplotypes into their respective genomes. The same requirements complicate the development and deployment of single nucleotide polymorphism (SNP) markers in polyploid species. We report here a strategy that satisfies these requirements and deploy it in the sequencing of genes in cultivated hexaploid wheat (Triticum aestivum, genomes AABBDD) and wild tetraploid wheat (Triticum turgidum ssp. dicoccoides, genomes AABB) from the putative site of wheat domestication in Turkey. Data are used to assess the distribution of diversity among and within wheat genomes and to develop a panel of SNP markers for polyploid wheat. Results: Nucleotide diversity was estimated in 2114 wheat genes and was similar between the A and B genomes and reduced in the D genome. Within a genome, diversity was diminished on some chromosomes. Low diversity was always accompanied by an excess of rare alleles. A total of 5,471 SNPs was discovered in 1791 wheat genes. Totals of 1,271, 1,218, and 2,203 SNPs were discovered in 488, 463, and 641 genes of wheat putative diploid ancestors, T. urartu, Aegilops speltoides, and Ae. tauschii, respectively. A public database containing genome-specific primers, SNPs, and other information was constructed. A total of 987 genes with nucleotide diversity estimated in one or more of the wheat genomes was placed on an Ae. tauschii genetic map, and the map was superimposed on wheat deletion-bin maps. The agreement between the maps was assessed. Conclusions: In a young polyploid, exemplified by T. aestivum, ancestral species are the primary source of genetic diversity. Low effective recombination due to self-pollination and a genetic mechanism precluding homoeologous chromosome pairing during polyploid meiosis can lead to the loss of diversity from large

  13. Defining a Cancer Dependency Map | Office of Cancer Genomics

    Science.gov (United States)

    Most human epithelial tumors harbor numerous alterations, making it difficult to predict which genes are required for tumor survival. To systematically identify cancer dependencies, we analyzed 501 genome-scale loss-of-function screens performed in diverse human cancer cell lines. We developed DEMETER, an analytical framework that segregates on- from off-target effects of RNAi. 769 genes were differentially required in subsets of these cell lines at a threshold of six SDs from the mean.

  14. STINGRAY: system for integrated genomic resources and analysis.

    Science.gov (United States)

    Wagner, Glauber; Jardim, Rodrigo; Tschoeke, Diogo A; Loureiro, Daniel R; Ocaña, Kary A C S; Ribeiro, Antonio C B; Emmel, Vanessa E; Probst, Christian M; Pitaluga, André N; Grisard, Edmundo C; Cavalcanti, Maria C; Campos, Maria L M; Mattoso, Marta; Dávila, Alberto M R

    2014-03-07

    The STINGRAY system has been conceived to ease the tasks of integrating, analyzing, annotating and presenting genomic and expression data from Sanger and Next Generation Sequencing (NGS) platforms. STINGRAY includes: (a) a complete and integrated workflow (more than 20 bioinformatics tools) ranging from functional annotation to phylogeny; (b) a MySQL database schema, suitable for data integration and user access control; and (c) a user-friendly graphical web-based interface that makes the system intuitive, facilitating the tasks of data analysis and annotation. STINGRAY proved to be an easy-to-use and complete system for analyzing sequencing data. While both Sanger and NGS platforms are supported, the system may run faster with Sanger data, since large NGS datasets could slow down MySQL database usage. STINGRAY is available at http://stingray.biowebdb.org and the open source code at http://sourceforge.net/projects/stingray-biowebdb/.

  15. Map of open and closed chromatin domains in Drosophila genome.

    Science.gov (United States)

    Milon, Beatrice; Sun, Yezhou; Chang, Weizhong; Creasy, Todd; Mahurkar, Anup; Shetty, Amol; Nurminsky, Dmitry; Nurminskaya, Maria

    2014-11-18

    Chromatin compactness has been considered a major determinant of gene activity and has been associated with specific chromatin modifications in studies on a few individual genetic loci. At the same time, genome-wide patterns of open and closed chromatin have been understudied, and are at present largely predicted from chromatin modification and gene expression data. However the universal applicability of such predictions is not self-evident, and requires experimental verification. We developed and implemented a high-throughput analysis for general chromatin sensitivity to DNase I which provides a comprehensive epigenomic assessment in a single assay. Contiguous domains of open and closed chromatin were identified by computational analysis of the data, and correlated to other genome annotations including predicted chromatin "states", individual chromatin modifications, nuclear lamina interactions, and gene expression. While showing that the widely trusted predictions of chromatin structure are correct in the majority of cases, we detected diverse "exceptions" from the conventional rules. We found a profound paucity of chromatin modifications in a major fraction of closed chromatin, and identified a number of loci where chromatin configuration is opposite to that expected from modification and gene expression patterns. Further, we observed that chromatin of large introns tends to be closed even when the genes are expressed, and that a significant proportion of active genes including their promoters are located in closed chromatin. These findings reveal limitations of the existing predictive models, indicate novel mechanisms of epigenetic regulation, and provide important insights into genome organization and function.

  16. A BAC-based physical map of the Drosophila buzzatii genome

    Energy Technology Data Exchange (ETDEWEB)

    Gonzalez, Josefa; Nefedov, Michael; Bosdet, Ian; Casals, Ferran; Calvete, Oriol; Delprat, Alejandra; Shin, Heesun; Chiu, Readman; Mathewson, Carrie; Wye, Natasja; Hoskins, Roger A.; Schein, JacquelineE.; de Jong, Pieter; Ruiz, Alfredo

    2005-03-18

    Large-insert genomic libraries facilitate cloning of large genomic regions, allow the construction of clone-based physical maps and provide useful resources for sequencing entire genomes. Drosophila buzzatii is a representative species of the repleta group in the Drosophila subgenus, which is being widely used as a model in studies of genome evolution, ecological adaptation and speciation. We constructed a Bacterial Artificial Chromosome (BAC) genomic library of D. buzzatii using the shuttle vector pTARBAC2.1. The library comprises 18,353 clones with an average insert size of 152 kb and an ~18X expected representation of the D. buzzatii euchromatic genome. We screened the entire library with six euchromatic gene probes and estimated the actual genome representation to be ~23X. In addition, we fingerprinted by restriction digestion and agarose gel electrophoresis a sample of 9,555 clones, and assembled them using Finger Printed Contigs (FPC) software and manual editing into 345 contigs (mean of 26 clones per contig) and 670 singletons. Finally, we anchored 181 large contigs (containing 7,788 clones) to the D. buzzatii salivary gland polytene chromosomes by in situ hybridization of 427 representative clones. The BAC library and a database with all the information regarding the high coverage BAC-based physical map described in this paper are available to the research community.
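
    The expected representation of a clone library follows from coverage = (number of clones x mean insert size) / genome size. The abstract gives the clone count, insert size, and ~18X coverage but not the euchromatic genome size, so the sketch below back-calculates an implied size; that value is an inference for illustration, not a figure from the paper. The last function uses the standard Clarke-Carbon estimate of the chance that a given locus is present in the library; all function names are hypothetical.

```python
def fold_coverage(n_clones: int, insert_kb: float, genome_kb: float) -> float:
    """Expected library representation (X) = N * I / G."""
    return n_clones * insert_kb / genome_kb

def implied_genome_kb(n_clones: int, insert_kb: float, coverage: float) -> float:
    """Genome size implied by a stated coverage: G = N * I / X."""
    return n_clones * insert_kb / coverage

def prob_locus_present(n_clones: int, insert_kb: float, genome_kb: float) -> float:
    """Clarke-Carbon estimate: P = 1 - (1 - I/G)^N."""
    return 1.0 - (1.0 - insert_kb / genome_kb) ** n_clones

# Figures quoted in the abstract: 18,353 clones, 152 kb inserts, ~18X.
genome_kb = implied_genome_kb(18_353, 152, 18)   # roughly 155,000 kb
coverage = fold_coverage(18_353, 152, genome_kb)
p_present = prob_locus_present(18_353, 152, genome_kb)
```

    On these assumptions the implied euchromatic genome is on the order of 155 Mb, and at that depth almost every locus is expected to be represented by at least one clone.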

  17. From symplectic integrator to Poincare map: Spline expansion of a map generator in Cartesian coordinates

    International Nuclear Information System (INIS)

    Warnock, R.L.; Ellison, J.A.; Univ. of New Mexico, Albuquerque, NM

    1997-08-01

    Data from orbits of a symplectic integrator can be interpolated so as to construct an approximation to the generating function of a Poincare map. The time required to compute an orbit of the symplectic map induced by the generator can be much less than the time to follow the same orbit by symplectic integration. The construction has been carried out previously for full-turn maps of large particle accelerators, and a big saving in time (for instance a factor of 60) has been demonstrated. A shortcoming of the work to date arose from the use of canonical polar coordinates, which precluded map construction in small regions of phase space near coordinate singularities. This paper shows that Cartesian coordinates can also be used, thus avoiding singularities. The generator is represented in a basis of tensor product B-splines. Under weak conditions the spline expansion converges uniformly as the mesh is refined, approaching the exact generator of the Poincare map as defined by the symplectic integrator, in some parallelepiped of phase space centered at the origin

  18. LocusTrack: Integrated visualization of GWAS results and genomic annotation.

    Science.gov (United States)

    Cuellar-Partida, Gabriel; Renteria, Miguel E; MacGregor, Stuart

    2015-01-01

    Genome-wide association studies (GWAS) are an important tool for the mapping of complex traits and diseases. Visual inspection of genomic annotations may be used to generate insights into the biological mechanisms underlying GWAS-identified loci. We developed LocusTrack, a web-based application that annotates and creates plots of regional GWAS results and incorporates user-specified tracks that display annotations such as linkage disequilibrium (LD), phylogenetic conservation, chromatin state, and other genomic and regulatory elements. Currently, LocusTrack can integrate annotation tracks from the UCSC genome-browser as well as from any tracks provided by the user. LocusTrack is an easy-to-use application and can be accessed at the following URL: http://gump.qimr.edu.au/general/gabrieC/LocusTrack/. Users can upload and manage GWAS results and select from and/or provide annotation tracks using simple and intuitive menus. LocusTrack scripts and associated data can be downloaded from the website and run locally.

  19. Data Integration for Climate Vulnerability Mapping in West Africa

    Directory of Open Access Journals (Sweden)

    Alex de Sherbinin

    2015-11-01

    Vulnerability mapping reveals areas that are likely to be at greater risk of climate-related disasters in the future. Through integration of climate, biophysical, and socioeconomic data in an overall vulnerability framework, so-called “hotspots” of vulnerability can be identified. These maps can be used as an aid to targeting adaptation and disaster risk management interventions. This paper reviews vulnerability mapping efforts in West Africa conducted under the USAID-funded African and Latin American Resilience to Climate Change (ARCC) project. The focus is on the integration of remotely sensed and socioeconomic data. Data inputs included a range of sensor data (e.g., MODIS NDVI, Landsat, SRTM elevation, DMSP-OLS night-time lights) as well as high-resolution poverty, conflict, and infrastructure data. Two basic methods were used, one in which each layer was transformed into standardized indicators in an additive approach, and another in which remote sensing data were used to contextualize the results of composite indicators. We assess the benefits and challenges of data integration, and the lessons learned from these mapping exercises.
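The first method mentioned in this abstract, transforming each layer into a standardized indicator and combining the layers additively, might look roughly like the sketch below. This is an assumption-laden illustration, not the ARCC procedure: the z-score standardization, the equal weighting, the layer names and the values are all hypothetical, and every layer is assumed to be oriented so that higher values mean greater vulnerability.

```python
from statistics import mean, pstdev

def standardize(values):
    """Convert raw values to z-scores (population standard deviation)."""
    m, s = mean(values), pstdev(values)
    if s == 0:
        return [0.0] * len(values)  # a constant layer carries no information
    return [(v - m) / s for v in values]

def additive_index(layers):
    """Sum the z-scores of all layers, location by location (equal weights)."""
    z_layers = [standardize(vals) for vals in layers.values()]
    return [sum(column) for column in zip(*z_layers)]

# Hypothetical values for three locations; higher is assumed to mean more vulnerable.
layers = {
    "poverty_rate": [0.2, 0.5, 0.8],
    "rainfall_variability": [10.0, 30.0, 20.0],
}
index = additive_index(layers)
print(index)
```

Because each layer is centred before summation, the composite index is relative: it ranks locations against one another rather than measuring vulnerability on an absolute scale.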

  20. A high-resolution whole genome radiation hybrid map of human chromosome 17q22-q25.3 across the genes for GH and TK

    Energy Technology Data Exchange (ETDEWEB)

    Foster, J.W.; Schafer, A.J.; Critcher, R. [Univ. of Cambridge (United Kingdom)] [and others]

    1996-04-15

    We have constructed a whole genome radiation hybrid (WG-RH) map across a region of human chromosome 17q, from growth hormone (GH) to thymidine kinase (TK). A panel of 128 WG-RH hybrid cell lines generated by X-irradiation and fusion has been tested for the retention of 39 sequence-tagged site (STS) markers by the polymerase chain reaction. This genome mapping technique has allowed the integration of existing VNTR and microsatellite markers with additional new markers and existing STS markers previously mapped to this region by other means. The WG-RH map includes eight expressed sequence tag (EST) and three anonymous markers developed for this study, together with 23 anonymous microsatellites and five existing ESTs. Analysis of these data resulted in a high-density comprehensive map across this region of the genome. A subset of these markers has been used to produce a framework map consisting of 20 loci ordered with odds greater than 1000:1. The markers are of sufficient density to build a YAC contig across this region based on marker content. We have developed sequence tags for both ends of a 2.1-Mb YAC and mapped these using the WG-RH panel, allowing a direct comparison of cRay_6000 to physical distance. 31 refs., 3 figs., 2 tabs.

  1. PolyTB: A genomic variation map for Mycobacterium tuberculosis

    KAUST Repository

    Coll, Francesc; Preston, Mark; Guerra-Assunção, José Afonso; Hill-Cawthorn, Grant; Harris, David; Perdigão, João; Viveiros, Miguel; Portugal, Isabel; Drobniewski, Francis; Gagneux, Sebastien; Glynn, Judith R.; Pain, Arnab; Parkhill, Julian; McNerney, Ruth; Martin, Nigel; Clark, Taane G.

    2014-01-01

    ://pathogenseq.lshtm.ac.uk/polytb) to visualise the resulting variation and important meta-data (e.g. in silico inferred strain-types, location) within geographical map and phylogenetic views. This resource will allow researchers to identify polymorphisms within candidate genes of interest

  2. Integrated Genome-Based Studies of Shewanella Ecophysiology

    Energy Technology Data Exchange (ETDEWEB)

    Margrethe H. Serres

    2012-06-29

    Shewanella oneidensis MR-1 is a motile, facultative γ-Proteobacterium with remarkable respiratory versatility; it can utilize a range of organic and inorganic compounds as terminal electron acceptors for anaerobic metabolism. The ability to effectively reduce nitrate, S0, polyvalent metals and radionuclides has established MR-1 as an important model dissimilatory metal-reducing microorganism for genome-based investigations of biogeochemical transformation of metals and radionuclides that are of concern to the U.S. Department of Energy (DOE) sites nationwide. Metal-reducing bacteria such as Shewanella also have a highly developed capacity for extracellular transfer of respiratory electrons to solid phase Fe and Mn oxides as well as directly to anode surfaces in microbial fuel cells. More broadly, Shewanellae are recognized free-living microorganisms and members of microbial communities involved in the decomposition of organic matter and the cycling of elements in aquatic and sedimentary systems. To function and compete in environments that are subject to spatial and temporal environmental change, Shewanella must be able to sense and respond to such changes and therefore require relatively robust sensing and regulation systems. The overall goal of this project is to apply the tools of genomics, leveraging the availability of genome sequence for 18 additional strains of Shewanella, to better understand the ecophysiology and speciation of respiratory-versatile members of this important genus. To understand these systems we propose to use genome-based approaches to investigate Shewanella as a system of integrated networks; first describing key cellular subsystems - those involved in signal transduction, regulation, and metabolism - then building towards understanding the function of whole cells and, eventually, cells within populations. As a general approach, this project will employ complementary "top-down" - bioinformatics-based genome functional predictions, high

  3. Data integration to prioritize drugs using genomics and curated data.

    Science.gov (United States)

    Louhimo, Riku; Laakso, Marko; Belitskin, Denis; Klefström, Juha; Lehtonen, Rainer; Hautaniemi, Sampsa

    2016-01-01

    Genomic alterations affecting drug target proteins occur in several tumor types and are prime candidates for patient-specific tailored treatments. Increasingly, patients likely to benefit from targeted cancer therapy are selected based on molecular alterations. The selection of a precision therapy benefiting most patients is challenging but can be enhanced with integration of multiple types of molecular data. Data integration approaches for drug prioritization have successfully integrated diverse molecular data but do not take full advantage of existing data and literature. We have built a knowledge-base which connects data from public databases with molecular results from over 2200 tumors, signaling pathways and drug-target databases. Moreover, we have developed a data mining algorithm to effectively utilize this heterogeneous knowledge-base. Our algorithm is designed to facilitate retargeting of existing drugs by stratifying samples and prioritizing drug targets. We analyzed 797 primary tumors from The Cancer Genome Atlas breast and ovarian cancer cohorts using our framework. FGFR, CDK and HER2 inhibitors were prioritized in breast and ovarian data sets. Estrogen receptor positive breast tumors showed potential sensitivity to targeted inhibitors of FGFR due to activation of FGFR3. Our results suggest that computational sample stratification selects potentially sensitive samples for targeted therapies and can aid in precision medicine drug repositioning. Source code is available from http://csblcanges.fimm.fi/GOPredict/.

  4. High-density Integrated Linkage Map Based on SSR Markers in Soybean

    Science.gov (United States)

    Hwang, Tae-Young; Sayama, Takashi; Takahashi, Masakazu; Takada, Yoshitake; Nakamoto, Yumi; Funatsuki, Hideyuki; Hisano, Hiroshi; Sasamoto, Shigemi; Sato, Shusei; Tabata, Satoshi; Kono, Izumi; Hoshi, Masako; Hanawa, Masayoshi; Yano, Chizuru; Xia, Zhengjun; Harada, Kyuya; Kitamura, Keisuke; Ishimoto, Masao

    2009-01-01

    A well-saturated molecular linkage map is a prerequisite for modern plant breeding. Several genetic maps have been developed for soybean with various types of molecular markers. Simple sequence repeats (SSRs) are single-locus markers with high allelic variation and are widely applicable to different genotypes. We have now mapped 1810 SSR or sequence-tagged site markers in one or more of three recombinant inbred populations of soybean (the US cultivar ‘Jack’ × the Japanese cultivar ‘Fukuyutaka’, the Chinese cultivar ‘Peking’ × the Japanese cultivar ‘Akita’, and the Japanese cultivar ‘Misuzudaizu’ × the Chinese breeding line ‘Moshidou Gong 503’) and have aligned these markers with the 20 consensus linkage groups (LGs). The total length of the integrated linkage map was 2442.9 cM, and the average number of molecular markers was 90.5 (range of 70–114) for the 20 LGs. We examined allelic diversity for 1238 of the SSR markers among 23 soybean cultivars or lines and a wild accession. The number of alleles per locus ranged from 2 to 7, with an average of 2.8. Our high-density linkage map should facilitate ongoing and future genomic research such as analysis of quantitative trait loci and positional cloning in addition to marker-assisted selection in soybean breeding. PMID:19531560

  5. Radiation hybrid maps of the D-genome of Aegilops tauschii and their application in sequence assembly of large and complex plant genomes.

    Science.gov (United States)

    Kumar, Ajay; Seetan, Raed; Mergoum, Mohamed; Tiwari, Vijay K; Iqbal, Muhammad J; Wang, Yi; Al-Azzam, Omar; Šimková, Hana; Luo, Ming-Cheng; Dvorak, Jan; Gu, Yong Q; Denton, Anne; Kilian, Andrzej; Lazo, Gerard R; Kianian, Shahryar F

    2015-10-16

    The large and complex genome of bread wheat (Triticum aestivum L., ~17 Gb) requires high resolution genome maps with saturated marker scaffolds to anchor and orient BAC contigs/sequence scaffolds for whole genome assembly. Radiation hybrid (RH) mapping has proven to be an excellent tool for the development of such maps, for it offers much higher and more uniform marker resolution across the length of the chromosome compared to genetic mapping and does not require marker polymorphism per se, as it is based on a presence (retention) vs. absence (deletion) marker assay. In this study, a 178 line RH panel was genotyped with SSRs and DArT markers to develop the first high resolution RH maps of the entire D-genome of Ae. tauschii accession AL8/78. To confirm map order accuracy, the AL8/78-RH maps were compared with: 1) a DArT consensus genetic map constructed using more than 100 bi-parental populations, 2) a RH map of the D-genome of reference hexaploid wheat 'Chinese Spring', and 3) two SNP-based genetic maps, one with anchored D-genome BAC contigs and another with anchored D-genome sequence scaffolds. Using marker sequences, the RH maps were also anchored with a BAC contig based physical map and draft sequence of the D-genome of Ae. tauschii. A total of 609 markers were mapped to 503 unique positions on the seven D-genome chromosomes, with a total map length of 14,706.7 cR. The average distance between any two marker loci was 29.2 cR, which corresponds to 2.1 cM or 9.8 Mb. The average mapping resolution across the D-genome was estimated to be 0.34 Mb (Mb/cR) or 0.07 cM (cM/cR). The RH maps showed almost perfect agreement with several published maps with regard to chromosome assignments of markers. The mean rank correlations between the position of markers on AL8/78 maps and the four published maps ranged from 0.75 to 0.92, suggesting a good agreement in marker order. With 609 mapped markers, a total of 2481 deletions for the whole D-genome were detected with an average

  6. The sea lamprey meiotic map improves resolution of ancient vertebrate genome duplications.

    Science.gov (United States)

    Smith, Jeramiah J; Keinath, Melissa C

    2015-08-01

    It is generally accepted that many genes present in vertebrate genomes owe their origin to two whole-genome duplications that occurred deep in the ancestry of the vertebrate lineage. However, details regarding the timing and outcome of these duplications are not well resolved. We present high-density meiotic and comparative genomic maps for the sea lamprey (Petromyzon marinus), a representative of an ancient lineage that diverged from all other vertebrates ∼550 million years ago. Linkage analyses yielded a total of 95 linkage groups, similar to the estimated number of germline chromosomes (1n ∼ 99), spanning a total of 5570.25 cM. Comparative mapping data yield strong support for the hypothesis that a single whole-genome duplication occurred in the basal vertebrate lineage, but do not strongly support a hypothetical second event. Rather, these comparative maps reveal several evolutionarily independent segmental duplications occurring over the last 600+ million years of chordate evolution. This refined history of vertebrate genome duplication should permit more precise investigations of vertebrate evolution. © 2015 Smith and Keinath; Published by Cold Spring Harbor Laboratory Press.

  7. A simple and inexpensive method for genomic restriction mapping analysis

    International Nuclear Information System (INIS)

    Huang, C.H.; Lam, V.M.S.; Tam, J.W.O.

    1988-01-01

    The Southern blotting procedure for the transfer of DNA fragments from agarose gels to nitrocellulose membranes has revolutionized nucleic acid detection methods, and it forms the cornerstone of research in molecular biology. Basically, the method involves the denaturation of DNA fragments that have been separated on an agarose gel, the immobilization of the fragments by transfer to a nitrocellulose membrane, and the identification of the fragments of interest through hybridization to ³²P-labeled probes and autoradiography. While the method is sensitive and applicable to both genomic and cloned DNA, it suffers from the disadvantages of being time consuming and expensive, and fragments of greater than 15 kb are difficult to transfer. Moreover, although theoretically the nitrocellulose membrane can be washed and hybridized repeatedly using different probes, in practice, the membrane becomes brittle and difficult to handle after a few cycles. A direct hybridization method for pure DNA clones was developed in 1975 but has not been widely exploited. The authors report here a modification of their procedure as applied to genomic DNA. The method is simple, rapid, and inexpensive, and it does not involve transfer to nitrocellulose membranes.

  8. Construction of chromosomal recombination maps of three genomes of lilies (Lilium) based on GISH analysis.

    NARCIS (Netherlands)

    Nadeem Khan, M.; Shujun Zhou; Barba Gonzalez, R.; Ramanna, M.S.; Visser, R.G.F.; Tuyl, van J.M.

    2009-01-01

    Chromosomal recombination maps were constructed for three genomes of lily (Lilium) using GISH analyses. For this purpose, the backcross (BC) progenies of two diploid (2n = 2x = 24) interspecific hybrids of lily, viz. Longiflorum × Asiatic (LA) and Oriental × Asiatic (OA), were used. Mostly the BC

  9. An integrated approach to shoreline mapping for spill response planning

    International Nuclear Information System (INIS)

    Owens, E.H.; LeBlanc, S.R.; Percy, R.J.

    1996-01-01

    A desktop mapping package was introduced which has the capability to provide consistent and standardized application of mapping and data collection/generation techniques. Its application in oil spill cleanup was discussed. The data base can be updated easily as new information becomes available. This provides a response team with access to a wide range of information that would otherwise be difficult to obtain. Standard terms and definitions and shoreline segmentation procedures are part of the system to describe the shore-zone character and shore-zone oiling conditions. The program that is in place for Atlantic Canada involves the integration of (1) Environment Canada's SCAT methodology in pre-spill data generation, (2) shoreline segmentation, (3) response management by objectives, (4) Environment Canada's national sensitivity mapping program, and (5) Environment Canada's field guide for the protection and treatment of oiled shorelines. 7 refs., 6 figs

  10. Integrating Evolutionary Game Theory into Mechanistic Genotype-Phenotype Mapping.

    Science.gov (United States)

    Zhu, Xuli; Jiang, Libo; Ye, Meixia; Sun, Lidan; Gragnoli, Claudia; Wu, Rongling

    2016-05-01

    Natural selection has shaped the evolution of organisms toward optimizing their structural and functional design. However, how this universal principle can enhance genotype-phenotype mapping of quantitative traits has remained unexplored. Here we show that the integration of this principle and functional mapping through evolutionary game theory gains new insight into the genetic architecture of complex traits. By viewing phenotype formation as an evolutionary system, we formulate mathematical equations to model the ecological mechanisms that drive the interaction and coordination of its constituent components toward population dynamics and stability. Functional mapping provides a procedure for estimating the genetic parameters that specify the dynamic relationship of competition and cooperation and predicting how genes mediate the evolution of this relationship during trait formation. Copyright © 2016 Elsevier Ltd. All rights reserved.

  11. Gene disruptions using P transposable elements: an integral component of the Drosophila genome project.

    OpenAIRE

    Spradling, A C; Stern, D M; Kiss, I; Roote, J; Laverty, T; Rubin, G M

    1995-01-01

    Biologists require genetic as well as molecular tools to decipher genomic information and ultimately to understand gene function. The Berkeley Drosophila Genome Project is addressing these needs with a massive gene disruption project that uses individual, genetically engineered P transposable elements to target open reading frames throughout the Drosophila genome. DNA flanking the insertions is sequenced, thereby placing an extensive series of genetic markers on the physical genomic map and a...

  12. BioNano genome mapping of individual chromosomes supports physical mapping and sequence assembly in complex plant genomes

    Czech Academy of Sciences Publication Activity Database

    Staňková, Helena; Hastie, A.; Chan, S.; Vrána, Jan; Tulpová, Zuzana; Kubaláková, Marie; Visendi, P.; Hayashi, S.; Luo, M.; Batley, J.; Edwards, D.; Doležel, Jaroslav; Šimková, Hana

    2016-01-01

    Vol. 14, No. 7 (2016), pp. 1523-1531 ISSN 1467-7644 R&D Projects: GA ČR(CZ) GAP501/12/2554; GA MŠk(CZ) LO1204 Institutional support: RVO:61389030 Keywords: optical mapping * wheat * sequencing Subject RIV: EB - Genetics; Molecular Biology Impact factor: 7.443, year: 2016

  13. Integrated genomic and gene expression profiling identifies two major genomic circuits in urothelial carcinoma.

    Directory of Open Access Journals (Sweden)

    David Lindgren

    Similar to other malignancies, urothelial carcinoma (UC) is characterized by specific recurrent chromosomal aberrations and gene mutations. However, the interconnection between specific genomic alterations, and how patterns of chromosomal alterations adhere to different molecular subgroups of UC, is less clear. We applied tiling resolution array CGH to 146 cases of UC and identified a number of regions harboring recurrent focal genomic amplifications and deletions. Several potential oncogenes were included in the amplified regions, including known oncogenes like E2F3, CCND1, and CCNE1, as well as new candidate genes, such as SETDB1 (1q21) and BCL2L1 (20q11). We next combined genome profiling with global gene expression, gene mutation, and protein expression data and identified two major genomic circuits operating in urothelial carcinoma. The first circuit was characterized by FGFR3 alterations, overexpression of CCND1, and 9q and CDKN2A deletions. The second circuit was defined by E2F3 amplifications and RB1 deletions, as well as gains of 5p, deletions at PTEN and 2q36, 16q, 20q, and elevated CDKN2A levels. TP53/MDM2 alterations were common for advanced tumors within the two circuits. Our data also suggest a possible RAS/RAF circuit. The tumors with worst prognosis showed a gene expression profile that indicated a keratinized phenotype. Taken together, our integrative approach revealed at least two separate networks of genomic alterations linked to the molecular diversity seen in UC, and that these circuits may reflect distinct pathways of tumor development.

  14. Mapping the sensory perception of apple using descriptive sensory evaluation in a genome wide association study.

    Science.gov (United States)

    Amyotte, Beatrice; Bowen, Amy J; Banks, Travis; Rajcan, Istvan; Somers, Daryl J

    2017-01-01

    Breeding apples is a long-term endeavour and it is imperative that new cultivars are selected to have outstanding consumer appeal. This study has taken the approach of merging sensory science with genome wide association analyses in order to map the human perception of apple flavour and texture onto the apple genome. The goal was to identify genomic associations that could be used in breeding apples for improved fruit quality. A collection of 85 apple cultivars was examined over two years through descriptive sensory evaluation by a trained sensory panel. The trained sensory panel scored randomized sliced samples of each apple cultivar for seventeen taste, flavour and texture attributes using controlled sensory evaluation practices. In addition, the apple collection was subjected to genotyping by sequencing for marker discovery. A genome wide association analysis suggested significant genomic associations for several sensory traits including juiciness, crispness, mealiness and fresh green apple flavour. The findings include previously unreported genomic regions that could be used in apple breeding and suggest that similar sensory association mapping methods could be applied in other plants.

  15. Mapping the sensory perception of apple using descriptive sensory evaluation in a genome wide association study

    Science.gov (United States)

    Amyotte, Beatrice; Bowen, Amy J.; Banks, Travis; Rajcan, Istvan; Somers, Daryl J.

    2017-01-01

    Breeding apples is a long-term endeavour and it is imperative that new cultivars are selected to have outstanding consumer appeal. This study has taken the approach of merging sensory science with genome wide association analyses in order to map the human perception of apple flavour and texture onto the apple genome. The goal was to identify genomic associations that could be used in breeding apples for improved fruit quality. A collection of 85 apple cultivars was examined over two years through descriptive sensory evaluation by a trained sensory panel. The trained sensory panel scored randomized sliced samples of each apple cultivar for seventeen taste, flavour and texture attributes using controlled sensory evaluation practices. In addition, the apple collection was subjected to genotyping by sequencing for marker discovery. A genome wide association analysis suggested significant genomic associations for several sensory traits including juiciness, crispness, mealiness and fresh green apple flavour. The findings include previously unreported genomic regions that could be used in apple breeding and suggest that similar sensory association mapping methods could be applied in other plants. PMID:28231290

  16. INDIGO - INtegrated data warehouse of microbial genomes with examples from the red sea extremophiles.

    KAUST Repository

    Alam, Intikhab; Antunes, André; Kamau, Allan; Ba Alawi, Wail; Kalkatawi, Manal M.; Stingl, Ulrich; Bajic, Vladimir B.

    2013-01-01

    The next generation sequencing technologies substantially increased the throughput of microbial genome sequencing. To functionally annotate newly sequenced microbial genomes, a variety of experimental and computational methods are used. Integration of information from different sources is a powerful approach to enhance such annotation. Functional analysis of microbial genomes, necessary for downstream experiments, crucially depends on this annotation but it is hampered by the current lack of suitable information integration and exploration systems for microbial genomes.

  17. INDIGO - INtegrated data warehouse of microbial genomes with examples from the red sea extremophiles.

    KAUST Repository

    Alam, Intikhab

    2013-12-06

    The next generation sequencing technologies substantially increased the throughput of microbial genome sequencing. To functionally annotate newly sequenced microbial genomes, a variety of experimental and computational methods are used. Integration of information from different sources is a powerful approach to enhance such annotation. Functional analysis of microbial genomes, necessary for downstream experiments, crucially depends on this annotation but it is hampered by the current lack of suitable information integration and exploration systems for microbial genomes.

  18. Some integrable maps and their Hirota bilinear forms

    Science.gov (United States)

    Hone, A. N. W.; Kouloukas, T. E.; Quispel, G. R. W.

    2018-01-01

    We introduce a two-parameter family of birational maps, which reduces to a family previously found by Demskoi, Tran, van der Kamp and Quispel (DTKQ) when one of the parameters is set to zero. The study of the singularity confinement pattern for these maps leads to the introduction of a tau function satisfying a homogeneous recurrence which has the Laurent property, and the tropical (or ultradiscrete) analogue of this homogeneous recurrence confirms the quadratic degree growth found empirically by Demskoi et al. We prove that the tau function also satisfies two different bilinear equations, each of which is a reduction of the Hirota-Miwa equation (also known as the discrete KP equation, or the octahedron recurrence). Furthermore, these bilinear equations are related to reductions of particular two-dimensional integrable lattice equations, of discrete KdV or discrete Toda type. These connections, as well as the cluster algebra structure of the bilinear equations, allow a direct construction of Poisson brackets, Lax pairs and first integrals for the birational maps. As a consequence of the latter results, we show how each member of the family can be lifted to a system that is integrable in the Liouville sense, clarifying observations made previously in the original DTKQ case.

  19. Mapping genomic features to functional traits through microbial whole genome sequences.

    Science.gov (United States)

    Zhang, Wei; Zeng, Erliang; Liu, Dan; Jones, Stuart E; Emrich, Scott

    2014-01-01

    Recently, the utility of trait-based approaches for microbial communities has been identified. Increasing availability of whole genome sequences provides the opportunity to explore the genetic foundations of a variety of functional traits. We proposed a machine learning framework to quantitatively link the genomic features with functional traits. Genes from bacterial genomes belonging to different functional traits were grouped to Cluster of Orthologs (COGs), and were used as features. Then, the TF-IDF technique from the text mining domain was applied to transform the data to accommodate the abundance and importance of each COG. After TF-IDF processing, COGs were ranked using feature selection methods to identify their relevance to the functional trait of interest. Extensive experimental results demonstrated that functional trait related genes can be detected using our method. Further, the method has the potential to provide novel biological insights.
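The TF-IDF transformation described in this abstract treats each genome as a "document" and each COG as a "term". The abstract does not say which TF-IDF variant was used, so the sketch below assumes the plain textbook form (relative frequency times log(N/df)); the function name, genome labels and COG assignments are hypothetical and serve only to show the mechanics.

```python
import math
from collections import Counter

def cog_tf_idf(genome_cogs):
    """Return {genome: {cog: tf * idf}} using tf = count/total and idf = ln(N/df)."""
    n_genomes = len(genome_cogs)
    # Document frequency: in how many genomes does each COG occur at least once?
    df = Counter()
    for cogs in genome_cogs.values():
        df.update(set(cogs))
    weights = {}
    for genome, cogs in genome_cogs.items():
        counts = Counter(cogs)
        total = sum(counts.values())
        weights[genome] = {
            cog: (k / total) * math.log(n_genomes / df[cog])
            for cog, k in counts.items()
        }
    return weights

# Hypothetical input: one COG identifier per annotated gene.
genomes = {
    "genome_A": ["COG0001", "COG0001", "COG0002"],
    "genome_B": ["COG0002", "COG0003"],
    "genome_C": ["COG0002"],
}
weights = cog_tf_idf(genomes)
print(weights["genome_A"])
```

In this toy input the COG present in every genome receives a weight of zero, while a COG that is abundant in only one genome is weighted up, which is the property that makes the transformed values usable as input to a later feature-ranking step.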

  20. Genome-Wide Association Mapping and Genomic Selection for Alfalfa (Medicago sativa) Forage Quality Traits.

    Science.gov (United States)

    Biazzi, Elisa; Nazzicari, Nelson; Pecetti, Luciano; Brummer, E Charles; Palmonari, Alberto; Tava, Aldo; Annicchiarico, Paolo

    2017-01-01

    Genetic progress for forage quality has been poor in alfalfa (Medicago sativa L.), the most-grown forage legume worldwide. This study aimed at exploring opportunities for marker-assisted selection (MAS) and genomic selection of forage quality traits based on breeding values of parent plants. Some 154 genotypes from a broadly-based reference population were genotyped by genotyping-by-sequencing (GBS), and phenotyped for leaf-to-stem ratio, leaf and stem contents of protein, neutral detergent fiber (NDF) and acid detergent lignin (ADL), and leaf and stem NDF digestibility after 24 hours (NDFD), of their dense-planted half-sib progenies in three growing conditions (summer harvest, full irrigation; summer harvest, suspended irrigation; autumn harvest). Trait-marker analyses were performed on progeny values averaged over conditions, owing to modest germplasm × condition interaction. Genomic selection exploited 11,450 polymorphic SNP markers, whereas a subset of 8,494 M. truncatula-aligned markers were used for a genome-wide association study (GWAS). GWAS confirmed the polygenic control of quality traits and, in agreement with phenotypic correlations, indicated substantially different genetic control of a given trait in stems and leaves. It detected several SNPs in different annotated genes that were highly linked to stem protein content. Also, it identified a small genomic region on chromosome 8 with high concentration of annotated genes associated with leaf ADL, including one gene probably involved in the lignin pathway. Three genomic selection models, i.e., Ridge-regression BLUP, Bayes B and Bayesian Lasso, displayed similar prediction accuracy, whereas SVR-lin was less accurate. Accuracy values were moderate (0.3-0.4) for stem NDFD and leaf protein content, modest for leaf ADL and NDFD, and low to very low for the other traits. Along with previous results for the same germplasm set, this study indicates that GBS data can be exploited to improve both quality traits

  1. Genome-Wide Single-Nucleotide Polymorphisms Discovery and High-Density Genetic Map Construction in Cauliflower Using Specific-Locus Amplified Fragment Sequencing

    Science.gov (United States)

    Zhao, Zhenqing; Gu, Honghui; Sheng, Xiaoguang; Yu, Huifang; Wang, Jiansheng; Huang, Long; Wang, Dan

    2016-01-01

    Molecular markers and genetic maps play an important role in plant genomics and breeding studies. Cauliflower is an important and distinctive vegetable; however, very few molecular resources have been reported for this species. In this study, a novel, specific-locus amplified fragment (SLAF) sequencing strategy was employed for large-scale single nucleotide polymorphism (SNP) discovery and high-density genetic map construction in a double-haploid, segregating population of cauliflower. A total of 12.47 Gb raw data containing 77.92 M pair-end reads were obtained after processing and 6815 polymorphic SLAFs between the two parents were detected. The average sequencing depths reached 52.66-fold for the female parent and 49.35-fold for the male parent. Subsequently, these polymorphic SLAFs were used to genotype the population and further filtered based on several criteria to construct a genetic linkage map of cauliflower. Finally, 1776 high-quality SLAF markers, including 2741 SNPs, constituted the linkage map with average data integrity of 95.68%. The final map spanned a total genetic length of 890.01 cM with an average marker interval of 0.50 cM, and covered 364.9 Mb of the reference genome. The markers and genetic map developed in this study could provide an important foundation not only for comparative genomics studies within Brassica oleracea species but also for quantitative trait loci identification and molecular breeding of cauliflower. PMID:27047515
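    The average marker interval quoted above can be checked from the other figures in the abstract. A minimal Python sketch, assuming the interval is simply the total genetic length divided by the number of markers (the abstract does not state the exact formula, and the function name is illustrative):

```python
def average_marker_interval(map_length_cm: float, n_markers: int) -> float:
    """Total genetic length divided by marker count (assumed definition)."""
    if n_markers <= 0:
        raise ValueError("n_markers must be positive")
    return map_length_cm / n_markers


# Figures quoted in the abstract: 890.01 cM spanned by 1776 SLAF markers.
interval = average_marker_interval(890.01, 1776)
print(f"{interval:.2f} cM")
```

    Under this assumption the result rounds to the 0.50 cM reported in the abstract.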

  2. A RAD-based linkage map and comparative genomics in the gudgeons (genus Gnathopogon, Cyprinidae)

    Directory of Open Access Journals (Sweden)

    Kakioka Ryo

    2013-01-01

    Background. The construction of linkage maps is a first step in exploring the genetic basis for adaptive phenotypic divergence in closely related species by quantitative trait locus (QTL) analysis. Linkage maps are also useful for comparative genomics in non-model organisms. Advances in genomics technologies make it more feasible than ever to study the genetics of adaptation in natural populations. Restriction-site associated DNA (RAD) sequencing in next-generation sequencers facilitates the development of many genetic markers and genotyping. We aimed to construct a linkage map of the gudgeons of the genus Gnathopogon (Cyprinidae) for comparative genomics with the zebrafish Danio rerio (a member of the same family as gudgeons) and for the future QTL analysis of the genetic architecture underlying adaptive phenotypic evolution of Gnathopogon. Results. We constructed the first genetic linkage map of Gnathopogon using an F2 interspecific cross of 198 individuals between two closely related species in Japan: river-dwelling Gnathopogon elongatus and lake-dwelling Gnathopogon caerulescens. Based on 1,622 RAD-tag markers, a linkage map spanning 1,390.9 cM with 25 linkage groups and an average marker interval of 0.87 cM was constructed. We also identified a region involving female-specific transmission ratio distortion (TRD). Synteny and collinearity were extensively conserved between Gnathopogon and zebrafish. Conclusions. The dense SNP-based linkage map presented here provides a basis for future QTL analysis. It will also be useful for transferring genomic information from a “traditional” model fish species, zebrafish, to screen candidate genes underlying ecologically important traits of the gudgeons.

  3. Whole genome association mapping by incompatibilities and local perfect phylogenies

    DEFF Research Database (Denmark)

    Mailund, Thomas; Besenbacher, Søren; Schierup, Mikkel Heide

    2006-01-01

    ... a fast method for accurate localisation of disease causing variants in high density case-control association mapping experiments with large numbers of cases and controls. The method searches for significant clustering of case chromosomes in the "perfect" phylogenetic tree defined by the largest region around each marker that is compatible with a single phylogenetic tree. This perfect phylogenetic tree is treated as a decision tree for determining disease status, and scored by its accuracy as a decision tree. The rationale for this is that the perfect phylogeny near a disease affecting mutation should provide more information about the affected/unaffected classification than random trees. If regions of compatibility contain few markers, due to e.g. large marker spacing, the algorithm can allow the inclusion of incompatibility markers in order to enlarge the regions prior to estimating their phylogeny...
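    The idea of a region around a marker that is compatible with a single phylogenetic tree can be illustrated with the classical four-gamete test for biallelic markers: two sites are compatible when at most three of the four gametes 00, 01, 10 and 11 occur, and a set of biallelic sites admits a perfect phylogeny when every pair is compatible. The following Python sketch grows such a region greedily around a chosen marker. It is an illustration only and not the authors' implementation; the function names, the greedy left/right extension order and the toy haplotypes are assumptions.

```python
def four_gamete_compatible(haplotypes, i, j):
    """True if sites i and j show fewer than four distinct gametes."""
    gametes = {(h[i], h[j]) for h in haplotypes}
    return len(gametes) < 4


def compatible_region(haplotypes, center):
    """Grow [left, right] around `center` while all site pairs stay compatible.

    Assumed strategy: try to extend one site to the left, then one to the
    right, and repeat until neither side can be extended.
    """
    n_sites = len(haplotypes[0])
    left = right = center

    def can_add(new_site):
        # The new site must be compatible with every site already in the region.
        return all(four_gamete_compatible(haplotypes, new_site, s)
                   for s in range(left, right + 1))

    grew = True
    while grew:
        grew = False
        if left > 0 and can_add(left - 1):
            left -= 1
            grew = True
        if right < n_sites - 1 and can_add(right + 1):
            right += 1
            grew = True
    return left, right


# Toy data (hypothetical): five chromosomes typed at four biallelic markers.
haplotypes_example = ["0000", "1001", "1100", "1111", "1110"]
print(compatible_region(haplotypes_example, 1))
print(compatible_region(haplotypes_example, 3))
```

    In the toy data the last marker shows all four gametes with its neighbour, so the region around the second marker stops before it, and the last marker forms a region on its own.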

  4. IMG 4 version of the integrated microbial genomes comparative analysis system

    Science.gov (United States)

    Markowitz, Victor M.; Chen, I-Min A.; Palaniappan, Krishna; Chu, Ken; Szeto, Ernest; Pillay, Manoj; Ratner, Anna; Huang, Jinghua; Woyke, Tanja; Huntemann, Marcel; Anderson, Iain; Billis, Konstantinos; Varghese, Neha; Mavromatis, Konstantinos; Pati, Amrita; Ivanova, Natalia N.; Kyrpides, Nikos C.

    2014-01-01

    The Integrated Microbial Genomes (IMG) data warehouse integrates genomes from all three domains of life, as well as plasmids, viruses and genome fragments. IMG provides tools for analyzing and reviewing the structural and functional annotations of genomes in a comparative context. IMG’s data content and analytical capabilities have increased continuously since its first version released in 2005. Since the last report published in the 2012 NAR Database Issue, IMG’s annotation and data integration pipelines have evolved while new tools have been added for recording and analyzing single cell genomes, RNA Seq and biosynthetic cluster data. Different IMG datamarts provide support for the analysis of publicly available genomes (IMG/W: http://img.jgi.doe.gov/w), expert review of genome annotations (IMG/ER: http://img.jgi.doe.gov/er) and teaching and training in the area of microbial genome analysis (IMG/EDU: http://img.jgi.doe.gov/edu). PMID:24165883

  5. IMG 4 version of the integrated microbial genomes comparative analysis system

    Energy Technology Data Exchange (ETDEWEB)

    Markowitz, Victor M. [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Biological Data Management and Technology Center. Computational Research Division; Chen, I-Min A. [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Biological Data Management and Technology Center. Computational Research Division; Palaniappan, Krishna [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Biological Data Management and Technology Center. Computational Research Division; Chu, Ken [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Biological Data Management and Technology Center. Computational Research Division; Szeto, Ernest [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Biological Data Management and Technology Center. Computational Research Division; Pillay, Manoj [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Biological Data Management and Technology Center. Computational Research Division; Ratner, Anna [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Biological Data Management and Technology Center. Computational Research Division; Huang, Jinghua [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Biological Data Management and Technology Center. Computational Research Division; Woyke, Tanja [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States). Microbial Genome and Metagenome Program; Huntemann, Marcel [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States). Microbial Genome and Metagenome Program; Anderson, Iain [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States). Microbial Genome and Metagenome Program; Billis, Konstantinos [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States). Microbial Genome and Metagenome Program; Varghese, Neha [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States). Microbial Genome and Metagenome Program; Mavromatis, Konstantinos [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States). Microbial Genome and Metagenome Program; Pati, Amrita [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States). Microbial Genome and Metagenome Program; Ivanova, Natalia N. [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States). Microbial Genome and Metagenome Program; Kyrpides, Nikos C. [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States). Microbial Genome and Metagenome Program

    2013-10-27

    The Integrated Microbial Genomes (IMG) data warehouse integrates genomes from all three domains of life, as well as plasmids, viruses and genome fragments. IMG provides tools for analyzing and reviewing the structural and functional annotations of genomes in a comparative context. IMG’s data content and analytical capabilities have increased continuously since its first version released in 2005. Since the last report published in the 2012 NAR Database Issue, IMG’s annotation and data integration pipelines have evolved while new tools have been added for recording and analyzing single cell genomes, RNA Seq and biosynthetic cluster data. Finally, different IMG datamarts provide support for the analysis of publicly available genomes (IMG/W: http://img.jgi.doe.gov/w), expert review of genome annotations (IMG/ER: http://img.jgi.doe.gov/er) and teaching and training in the area of microbial genome analysis (IMG/EDU: http://img.jgi.doe.gov/edu).

  6. Fast mapping rapidly integrates information into existing memory networks.

    Science.gov (United States)

    Coutanche, Marc N; Thompson-Schill, Sharon L

    2014-12-01

    Successful learning involves integrating new material into existing memory networks. A learning procedure known as fast mapping (FM), thought to simulate the word-learning environment of children, has recently been linked to distinct neuroanatomical substrates in adults. This idea has suggested the (never-before tested) hypothesis that FM may promote rapid incorporation into cortical memory networks. We test this hypothesis here in 2 experiments. In our 1st experiment, we introduced 50 participants to 16 unfamiliar animals and names through FM or explicit encoding (EE) and tested participants on the training day, and again after sleep. Learning through EE produced strong declarative memories, without immediate lexical competition, as expected from slow-consolidation models. Learning through FM, however, led to almost immediate lexical competition, which continued to the next day. Additionally, the learned words began to prime related concepts on the day following FM (but not EE) training. In a 2nd experiment, we replicated the lexical integration results and determined that presenting an already-known item during learning was crucial for rapid integration through FM. The findings presented here indicate that learned items can be integrated into cortical memory networks at an accelerated rate through fast mapping. The retrieval of a related known concept, in order to infer the target of the FM question, is critical for this effect. PsycINFO Database Record (c) 2014 APA, all rights reserved.

  7. Closed-form expressions for integrals of MKdV and sine-Gordon maps

    International Nuclear Information System (INIS)

    Kamp, Peter H van der; Rojas, O; Quispel, G R W

    2007-01-01

    We present closed-form expressions for approximately N integrals of 2N-dimensional maps. The maps are obtained by travelling wave reductions of the modified Korteweg-de Vries equation and of the sine-Gordon equation, respectively. We provide the integrating factors corresponding to the integrals. Moreover we show how the integrals and the integrating factors relate to the staircase method.

  8. Mapping and annotating obesity-related genes in pig and human genomes.

    Science.gov (United States)

    Martelli, Pier Luigi; Fontanesi, Luca; Piovesan, Damiano; Fariselli, Piero; Casadio, Rita

    2014-01-01

    Background. Obesity is a major health problem in both developed and emerging countries. Obesity is a complex disease whose etiology involves genetic factors in strong interplay with environmental determinants and lifestyle. The discovery of genetic factors and biological pathways underlying human obesity is hampered by the difficulty in controlling the genetic background of human cohorts. Animal models are therefore necessary to further dissect the genetics of obesity. The pig has emerged as one of the most attractive models because of its similarity to humans in the mechanisms regulating fat deposition. Results. We collected the genes related to obesity in humans and to fat deposition traits in pig. We localized them on both human and pig genomes, building a map useful to interpret comparative studies on obesity. We characterized the collected genes structurally and functionally with BAR+ and mapped them on KEGG pathways and on the STRING protein interaction network. Conclusions. The collected set consists of 361 obesity related genes in human and pig genomes. All genes were mapped on the human genome, and 54 could not be localized on the pig genome (release 2012). Only 3 human genes have no counterpart in pig, confirming that this animal is a good model for human obesity studies. Obesity related genes are mostly involved in regulation and signaling processes/pathways, and a relevant connection emerges between obesity-related genes and diseases such as cancer and infectious diseases.

  9. A saturated SSR/DArT linkage map of Musa acuminata addressing genome rearrangements among bananas

    Directory of Open Access Journals (Sweden)

    Matsumoto Takashi

    2010-04-01

    Background. The genus Musa is a large species complex which includes cultivars at diploid and triploid levels. These sterile and vegetatively propagated cultivars are based on the A genome from Musa acuminata, exclusively for sweet bananas such as Cavendish, or associated with the B genome (Musa balbisiana) in cooking bananas such as Plantain varieties. In M. acuminata cultivars, structural heterozygosity is thought to be one of the main causes of sterility, which is essential for obtaining seedless fruits but hampers breeding. Only partial genetic maps are presently available due to chromosomal rearrangements within the parents of the mapping populations. This causes large segregation distortions inducing pseudo-linkages and difficulties in ordering markers in the linkage groups. The present study aims at producing a saturated linkage map of M. acuminata, taking into account hypotheses on the structural heterozygosity of the parents. Results. An F1 progeny of 180 individuals was obtained from a cross between two genetically distant accessions of M. acuminata, 'Borneo' and 'Pisang Lilin' (P. Lilin). Based on the gametic recombination of each parent, two parental maps composed of SSR and DArT markers were established. A significant proportion of the markers (21.7%) deviated (p ... Conclusions. We propose a synthetic map with 11 linkage groups containing 489 markers (167 SSRs and 322 DArTs) covering 1197 cM. This first saturated map is proposed as a "reference Musa map" for further analyses. We also propose two complete parental maps with interpretations of structural rearrangements localized on the linkage groups. The structural heterozygosity in P. Lilin is hypothesized to result from a duplication likely accompanied by an inversion on another chromosome. This paper also illustrates a methodological approach, transferable to other species, to investigate the mapping of structural rearrangements and determine their consequences on marker

  10. Maps help protect sensitive areas from spills : an integrated approach to environmental mapping

    International Nuclear Information System (INIS)

    Laflamme, A.; Leblanc, S.R.; Percy, R.J.

    2001-01-01

    The Atlantic Sensitivity Mapping Program (ASMP) is underway in Canada's Atlantic Region to develop and maintain the best possible sensitivity mapping system to provide planners and managers with the full range of information they would need in the event of a coastal oil spill drill or spill incident. This initiative also provides recommendations concerning resource protection at the time of a spill. ASMP has become a powerful tool, providing a consistent and standardized terminology throughout the range of spill planning, preparedness and real-time response activities. The desktop mapping system provides an easy-to-use approach for a wide range of technical and support data and information stored in various databases. The data and information are based on a consistent set of terms and definitions that describe the character of the shore zone, the objective and strategies for a specific response, and the methods for achieving those objectives. The data are linked with other resource information in a GIS-based system and can be updated quickly and easily as new information becomes available. The mapping program keeps evolving to better serve the needs of environmental emergency responders. In addition, all components will soon be integrated into a web-based mapping format for broader accessibility. Future work will focus on developing a pre-spill database for Labrador. 3 refs., 8 figs

  11. Dynamic maps of UV damage formation and repair for the human genome.

    Science.gov (United States)

    Hu, Jinchuan; Adebali, Ogun; Adar, Sheera; Sancar, Aziz

    2017-06-27

    Formation and repair of UV-induced DNA damage in human cells are affected by cellular context. To study factors influencing damage formation and repair genome-wide, we developed a highly sensitive single-nucleotide resolution damage mapping method [high-sensitivity damage sequencing (HS-Damage-seq)]. Damage maps of both cyclobutane pyrimidine dimers (CPDs) and pyrimidine-pyrimidone (6-4) photoproducts [(6-4)PPs] from UV-irradiated cellular and naked DNA revealed that the effect of transcription factor binding on bulky adducts formation varies, depending on the specific transcription factor, damage type, and strand. We also generated time-resolved UV damage maps of both CPDs and (6-4)PPs by HS-Damage-seq and compared them to the complementary repair maps of the human genome obtained by excision repair sequencing to gain insight into factors that affect UV-induced DNA damage and repair and ultimately UV carcinogenesis. The combination of the two methods revealed that, whereas UV-induced damage is virtually uniform throughout the genome, repair is affected by chromatin states, transcription, and transcription factor binding, in a manner that depends on the type of DNA damage.

  12. Moyal noncommutative integrability and the Burgers-KdV mapping

    International Nuclear Information System (INIS)

    Sedra, M.B.

    2005-12-01

    The Moyal momentum algebra is once again used to discuss some important aspects of NC integrable models and 2d conformal field theories. Among the results presented, we set up algebraic structures and introduce useful notational conventions that lead to non-trivial properties of the Moyal momentum algebra. We also study the Lax pair building mechanism for particular examples, namely the noncommutative KdV and Burgers systems. We show that these two systems are mapped to each other through the crucial mapping $\partial_{t_2} \to \partial_{t_3} \equiv \partial_{t_2}\partial_x + \alpha\,\partial_x^3$. This imposes a strong constraint on the NC Burgers system, which corresponds to linearizing its associated differential equation. From the CFT point of view, this constraint equation is nothing but the analogue of the conservation law of the conformal current. We believe that the considered mapping might help to bring new insights towards understanding the integrability of noncommutative 2d-systems. (author)

  13. Comparison of Burrows-Wheeler transform-based mapping algorithms used in high-throughput whole-genome sequencing: application to Illumina data for livestock genomes

    Science.gov (United States)

    Ongoing developments and cost decreases in next-generation sequencing (NGS) technologies have led to an increase in their application, which has greatly enhanced the fields of genetics and genomics. Mapping sequence reads onto a reference genome is a fundamental step in the analysis of NGS data. Eff...

  14. Reference Genome-Directed Resolution of Homologous and Homeologous Relationships within and between Different Oat Linkage Maps

    Directory of Open Access Journals (Sweden)

    Juan J. Gutierrez-Gonzalez

    2011-11-01

    Genome research on oat (Avena sativa L.) has received less attention than wheat (Triticum aestivum L.) and barley (Hordeum vulgare L.) because it is a less prominent component of the human food system. To assess the potential of the model grass Brachypodium distachyon (L.) P. Beauv. as a surrogate for oat genome research, the whole genome sequence (WGS) of B. distachyon was employed for comparative analysis with oat genetic linkage maps. Sequences of mapped molecular markers from one diploid Avena spp. and two hexaploid oat maps were aligned to the WGS to infer syntenic relationships. Diploid Avena and B. distachyon exhibit a high degree of synteny with 18 syntenic blocks covering 87% of the oat genome, which permitted postulation of an ancestral Avena spp. chromosome structure. Synteny between oat and B. distachyon was also prevalent, with 50 syntenic blocks covering 76.6% of the ‘Kanota’ × ‘Ogle’ linkage map. Coalignment of diploid and hexaploid maps to B. distachyon helped resolve homeologous relationships between different oat linkage groups but also revealed many major rearrangements in oat subgenomes. Extending the analysis to a second oat linkage map (‘Ogle’ × ‘TAM O-301’) allowed identification of several putative homologous linkage groups across the two oat populations. These results indicate that the B. distachyon genome sequence will be a useful resource to assist genetics and genomics research in oat. The analytical strategy employed here should be applicable for genome research in other temperate grass crops with modest amounts of genomic data.

  15. Integrating population dynamics into mapping human exposure to seismic hazard

    Directory of Open Access Journals (Sweden)

    S. Freire

    2012-11-01

    Disaster risk is not fully characterized without taking into account vulnerability and population exposure. Assessment of earthquake risk in urban areas would benefit from considering the variation of population distribution at more detailed spatial and temporal scales, and from a more explicit integration of this improved demographic data with existing seismic hazard maps. In the present work, "intelligent" dasymetric mapping is used to model population dynamics at high spatial resolution in order to benefit the analysis of spatio-temporal exposure to earthquake hazard in a metropolitan area. These night- and daytime-specific population densities are then classified and combined with seismic intensity levels to derive new spatially-explicit four-class-composite maps of human exposure. The presented approach enables a more thorough assessment of population exposure to earthquake hazard. Results show that there are significantly more people potentially at risk in the daytime period, demonstrating the shifting nature of population exposure in the daily cycle and the need to move beyond conventional residence-based demographic data sources to improve risk analyses. The proposed fine-scale maps of human exposure to seismic intensity are mainly aimed at benefiting visualization and communication of earthquake risk, but can be valuable in all phases of the disaster management process where knowledge of population densities is relevant for decision-making.
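    The step of classifying population densities and combining them with seismic intensity levels into a four-class composite can be sketched for a single map cell as follows. This is a hypothetical illustration in Python: the class breaks, the combination rule (a rounded-up mean of the two class indices) and all names are assumptions, since the abstract does not give the study's actual thresholds or rule.

```python
import bisect

# Hypothetical class breaks; the study's actual thresholds are not given here.
DENSITY_BREAKS = [100, 1000, 5000]  # persons per km^2 -> density class 0..3
INTENSITY_BREAKS = [6, 7, 8]        # seismic intensity level -> hazard class 0..3


def classify(value, breaks):
    """Return the index of the class that `value` falls into."""
    return bisect.bisect_right(breaks, value)


def exposure_class(density, intensity):
    """Combine two 0..3 classes into a composite exposure class 1..4.

    Assumed rule: rounded-up mean of the density and hazard class indices.
    """
    d = classify(density, DENSITY_BREAKS)
    s = classify(intensity, INTENSITY_BREAKS)
    return 1 + (d + s + 1) // 2


# Same cell, same seismic intensity, different time of day (made-up densities).
daytime = exposure_class(density=6000, intensity=8)
nighttime = exposure_class(density=500, intensity=8)
print(daytime, nighttime)
```

    With these made-up numbers the cell falls into a higher exposure class in the daytime than at night, which mirrors the kind of daily shift in exposure that the abstract reports.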

  16. Unexpected observations after mapping LongSAGE tags to the human genome

    Directory of Open Access Journals (Sweden)

    Duret Laurent

    2007-05-01

    Background. SAGE has been used widely to study the expression of known transcripts, but much less to annotate new transcribed regions. LongSAGE produces tags that are sufficiently long to be reliably mapped to a whole-genome sequence. Here we used this property to study the position of human LongSAGE tags obtained from all public libraries. We focused mainly on tags that do not map to known transcripts. Results. Using a published error rate in SAGE libraries, we first removed the tags likely to result from sequencing errors. We then observed that an unexpectedly large number of the remaining tags still did not match the genome sequence. Some of these correspond to parts of human mRNAs, such as polyA tails, junctions between two exons and polymorphic regions of transcripts. Another non-negligible proportion can be attributed to contamination by murine transcripts and to residual sequencing errors. After filtering our data with these screens to ensure that our dataset is highly reliable, we studied the tags that map once to the genome. 31% of these tags correspond to unannotated transcripts. The others map to known transcribed regions, but many of them (nearly half) are located either in antisense or in new variants of these known transcripts. Conclusion. We performed a comprehensive study of all publicly available human LongSAGE tags, and carefully verified the reliability of these data. We found the potential origin of many tags that did not match the human genome sequence. The properties of the remaining tags imply that the level of sequencing error may have been under-estimated. The frequency of tags matching the genome sequence once but not in an annotated exon suggests that the human transcriptome is much more complex than shown by the current human genome annotations, with many new splicing variants and antisense transcripts. SAGE data is appropriate to map new transcripts to the genome, as demonstrated by the high rate of cross

  17. Integration of Physical, Genetic, and Cytogenetic Mapping Data for Cellulose Synthase (CesA) Genes in Flax (Linum usitatissimum L.)

    Directory of Open Access Journals (Sweden)

    Olga Y. Yurkevich

    2017-08-01

    Flax, Linum usitatissimum L., is a valuable multi-purpose plant, and currently, its genome is being extensively investigated. Nevertheless, mapping of genes in the flax genome remains a challenging task. The cellulose synthase (CesA) multigene family involved in the process of cellulose synthesis is especially important for metabolism of this fiber crop. For the first time, fluorescent in situ hybridization (FISH)-based chromosomal localization of the CesA conserved fragment (KF011584.1), 5S, and 26S rRNA genes was performed in landrace, oilseed, and fiber varieties of L. usitatissimum. Intraspecific polymorphism in chromosomal distribution of KF011584.1 and 5S DNA loci was revealed, and the generalized chromosome ideogram was constructed. Using BLAST analysis, available data on physical/genetic mapping and also whole-genome sequencing of flax, localization of KF011584.1, 45S, and 5S rRNA sequences on genomic scaffolds, and their anchoring to the genetic map were conducted. The alignment of the results of FISH and BLAST analyses indicated that the KF011584.1 fragment revealed on chromosome 3 could be anchored to linkage group (LG) 11. The common LG for 45S and 5S rDNA was not found, probably due to the polymorphic localization of 5S rDNA on chromosome 1. Our findings indicate the complexity of integration of physical, genetic, and cytogenetic mapping data for multicopy gene families in plants. Nevertheless, the obtained results can be useful for future progress in constructing integrated physical/genetic/cytological maps in L. usitatissimum, which are essential for flax breeding.

  18. Integration of Physical, Genetic, and Cytogenetic Mapping Data for Cellulose Synthase (CesA) Genes in Flax (Linum usitatissimum L.).

    Science.gov (United States)

    Yurkevich, Olga Y; Kirov, Ilya V; Bolsheva, Nadezhda L; Rachinskaya, Olga A; Grushetskaya, Zoya E; Zoschuk, Svyatoslav A; Samatadze, Tatiana E; Bogdanova, Marina V; Lemesh, Valentina A; Amosova, Alexandra V; Muravenko, Olga V

    2017-01-01

    Flax, Linum usitatissimum L., is a valuable multi-purpose plant, and currently, its genome is being extensively investigated. Nevertheless, mapping of genes in the flax genome remains a challenging task. The cellulose synthase (CesA) multigene family involved in the process of cellulose synthesis is especially important for metabolism of this fiber crop. For the first time, fluorescent in situ hybridization (FISH)-based chromosomal localization of the CesA conserved fragment (KF011584.1), 5S, and 26S rRNA genes was performed in landrace, oilseed, and fiber varieties of L. usitatissimum. Intraspecific polymorphism in chromosomal distribution of KF011584.1 and 5S DNA loci was revealed, and the generalized chromosome ideogram was constructed. Using BLAST analysis, available data on physical/genetic mapping and also whole-genome sequencing of flax, localization of KF011584.1, 45S, and 5S rRNA sequences on genomic scaffolds, and their anchoring to the genetic map were conducted. The alignment of the results of FISH and BLAST analyses indicated that the KF011584.1 fragment revealed on chromosome 3 could be anchored to linkage group (LG) 11. The common LG for 45S and 5S rDNA was not found, probably due to the polymorphic localization of 5S rDNA on chromosome 1. Our findings indicate the complexity of integration of physical, genetic, and cytogenetic mapping data for multicopy gene families in plants. Nevertheless, the obtained results can be useful for future progress in constructing integrated physical/genetic/cytological maps in L. usitatissimum, which are essential for flax breeding.

  19. A Lithology Based Map Unit Schema For Onegeology Regional Geologic Map Integration

    Science.gov (United States)

    Moosdorf, N.; Richard, S. M.

    2012-12-01

    A system of lithogenetic categories for a global lithological map (GLiM, http://www.ifbm.zmaw.de/index.php?id=6460&L=3) has been compiled based on analysis of lithology/genesis categories for regional geologic maps for the entire globe. The scheme is presented for discussion and comment. Analysis of units on a variety of regional geologic maps indicates that units are defined based on assemblages of rock types, as well as their genetic type. In this compilation of continental geology, outcropping surface materials are dominantly sediment/sedimentary rock; major subdivisions of the sedimentary category include clastic sediment, carbonate sedimentary rocks, clastic sedimentary rocks, mixed carbonate and clastic sedimentary rock, colluvium and residuum. Significant areas of mixed igneous and metamorphic rock are also present. A system of global categories to characterize the lithology of regional geologic units is important for Earth System models of matter fluxes to soils, ecosystems, rivers and oceans, and for regional analysis of Earth surface processes at global scale. Because different applications of the classification scheme will focus on different lithologic constituents in mixed units, an ontology-type representation of the scheme that assigns properties to the units in an analyzable manner will be pursued. The OneGeology project is promoting deployment of geologic map services at million scale for all nations. Although initial efforts are commonly simple scanned map WMS services, the intention is to move towards data-based map services that categorize map units with standard vocabularies to allow use of a common map legend for better visual integration of the maps (e.g. see OneGeology Europe, http://onegeology-europe.brgm.fr/geoportal/viewer.jsp). Current categorization of regional units with a single lithology from the CGI SimpleLithology (http://resource.geosciml.org/201202/Vocab2012html/SimpleLithology201012.html) vocabulary poorly captures the

  20. The National Map 2.0 Tactical Plan: "Toward the (Integrated) National Map"

    Science.gov (United States)

    Zulick, Carl A.

    2008-01-01

    The National Map's 2-year goal, as described in this plan, is to provide a range of geospatial products and services that meet the basic goals of the original vision for The National Map while furthering the National Spatial Data Infrastructure that underpins U.S. Geological Survey (USGS) science. To accomplish this goal, the National Geospatial Program (NGP) will acquire, store, maintain, and distribute base map data. The management team for the NGP sets priorities for The National Map in three areas: Data and Products, Services, and Management. Priorities for fiscal years 2008 and 2009 (October 1, 2007 through September 30, 2009), involving the current data inventory, data acquisition, and the integration of data, are (1) incorporating current data from Federal, State, and local organizations into The National Map to the degree possible, given data availability and program resources; (2) collaborating with other USGS programs to incorporate data that support the USGS Science Strategy; (3) supporting the Department of the Interior (DOI) high-priority geospatial information needs; (4) emergency response; (5) homeland security, natural hazards; and (6) graphics products delivery. The management team identified known constraints, enablers, and drivers for the acquisition and integration of data. The NGP management team also identified customer-focused products and services of The National Map. Ongoing planning and management activities direct the development and delivery of these products and services. Management of work flow processes to support The National Map priorities are identified and established through a business-driven prioritization process. This tactical plan is primarily for use as a document to guide The National Map program for the next two fiscal years. The document is available to the public because of widespread interest in The National Map. The USGS collaborates with a broad range of customers and partners who are essential to the success of The

  1. Genome-wide SNP identification by high-throughput sequencing and selective mapping allows sequence assembly positioning using a framework genetic linkage map

    Directory of Open Access Journals (Sweden)

    Xu Xiangming

    2010-12-01

    Full Text Available. Background: Determining the position and order of contigs and scaffolds from a genome assembly within an organism's genome remains a technical challenge in a majority of sequencing projects. In order to exploit contemporary technologies for DNA sequencing, we developed a strategy for whole genome single nucleotide polymorphism sequencing allowing the positioning of sequence contigs onto a linkage map using the bin mapping method. Results: The strategy was tested on a draft genome of the fungal pathogen Venturia inaequalis, the causal agent of apple scab, and further validated using sequence contigs derived from the diploid plant genome Fragaria vesca. Using our novel method we were able to anchor 70% and 92% of sequence assemblies for V. inaequalis and F. vesca, respectively, to genetic linkage maps. Conclusions: We demonstrated the utility of this approach by accurately determining the bin map positions of the majority of the large sequence contigs from each genome sequence and validated our method by mapping single sequence repeat markers derived from sequence contigs on a full mapping population.
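
    The bin mapping idea mentioned in this abstract can be illustrated with a small sketch. It assumes that each map bin is identified by a distinct joint genotype pattern across a few selected progeny, and that a contig is placed by matching the pattern of one of its SNPs against those signatures. The function names, the genotype strings and the mismatch threshold are hypothetical and are not taken from the paper.

```python
# Illustrative sketch of bin mapping; all names and data below are hypothetical.

def hamming(a, b):
    """Count positions where two genotype strings differ, skipping missing calls ('-')."""
    return sum(1 for x, y in zip(a, b) if x != y and "-" not in (x, y))

def assign_bin(contig_pattern, bin_signatures, max_mismatch=0):
    """Return the bin whose signature best matches the contig's genotype pattern.

    Returns None when the best match exceeds max_mismatch or when two bins tie.
    """
    scored = sorted((hamming(contig_pattern, sig), name)
                    for name, sig in bin_signatures.items())
    best_dist, best_name = scored[0]
    if best_dist > max_mismatch:
        return None
    if len(scored) > 1 and scored[1][0] == best_dist:
        return None  # ambiguous placement
    return best_name

# One character per selected progeny individual (A or B parental allele).
bins = {
    "LG1_bin1": "AABBAB",
    "LG1_bin2": "AABBBB",
    "LG2_bin1": "BABAAB",
}
contigs = {
    "contig_001": "AABBAB",  # exact match to LG1_bin1
    "contig_002": "BABA-B",  # one missing call, otherwise matches LG2_bin1
    "contig_003": "BBBBBB",  # matches no bin within the threshold
}
placements = {name: assign_bin(p, bins) for name, p in contigs.items()}
print(placements)
```

    Rejecting tied or poorly matching patterns, rather than forcing a placement, is one conservative design choice; a real pipeline would likely combine evidence from several SNPs per contig.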

  2. Determination Of Slope Instability Using Spatially Integrated Mapping Framework

    Science.gov (United States)

    Baharuddin, I. N. Z.; Omar, R. C.; Roslan, R.; Khalid, N. H. N.; Hanifah, M. I. M.

    2016-11-01

    The determination and identification of slope instability often rely on data obtained from in-situ soil investigation work, which involves the logistics of machinery and manpower and can therefore increase costs, especially for remote locations. A method that is able to identify possible slope instability without frequent ground walkabout surveys is therefore needed. This paper presents a method for predicting slope instability using a spatially integrated mapping framework that is applicable to remote areas such as tropical forest and natural hilly terrain. Spatial data such as geology, topography, land use map, slope angle and elevation were used in regional analysis during the desktop study. Through this framework, the occurrence of slope instability could be identified and was validated using a confirmatory site-specific analysis.

  3. Genome-wide profiling of HPV integration in cervical cancer identifies clustered genomic hot spots and a potential microhomology-mediated integration mechanism

    DEFF Research Database (Denmark)

    Hu, Zheng; Zhu, Da; Wang, Wei

    2015-01-01

    Human papillomavirus (HPV) integration is a key genetic event in cervical carcinogenesis. By conducting whole-genome sequencing and high-throughput viral integration detection, we identified 3,667 HPV integration breakpoints in 26 cervical intraepithelial neoplasias, 104 cervical carcinomas and ...

  4. Integrated Genomic Analysis of the Ubiquitin Pathway across Cancer Types

    Directory of Open Access Journals (Sweden)

    Zhongqi Ge

    2018-04-01

    Full Text Available. Summary: Protein ubiquitination is a dynamic and reversible process of adding single ubiquitin molecules or various ubiquitin chains to target proteins. Here, using multidimensional omic data of 9,125 tumor samples across 33 cancer types from The Cancer Genome Atlas, we perform comprehensive molecular characterization of 929 ubiquitin-related genes and 95 deubiquitinase genes. Among them, we systematically identify top somatic driver candidates, including mutated FBXW7 with cancer-type-specific patterns and amplified MDM2 showing a mutually exclusive pattern with BRAF mutations. Ubiquitin pathway genes tend to be upregulated in cancer mediated by diverse mechanisms. By integrating pan-cancer multiomic data, we identify a group of tumor samples that exhibit worse prognosis. These samples are consistently associated with the upregulation of cell-cycle and DNA repair pathways, characterized by mutated TP53, MYC/TERT amplification, and APC/PTEN deletion. Our analysis highlights the importance of the ubiquitin pathway in cancer development and lays a foundation for developing relevant therapeutic strategies. Ge et al. analyze a cohort of 9,125 TCGA samples across 33 cancer types to provide a comprehensive characterization of the ubiquitin pathway. They detect somatic driver candidates in the ubiquitin pathway and identify a cluster of patients with poor survival, highlighting the importance of this pathway in cancer development. Keywords: ubiquitin pathway, pan-cancer analysis, The Cancer Genome Atlas, tumor subtype, cancer prognosis, therapeutic targets, biomarker, FBXW7

  5. Map Matching and Real World Integrated Sensor Data Warehousing (Presentation)

    Energy Technology Data Exchange (ETDEWEB)

    Burton, E.

    2014-02-01

    The inclusion of interlinked temporal and spatial elements within integrated sensor data enables a tremendous degree of flexibility when analyzing multi-component datasets. The presentation illustrates how to warehouse, process, and analyze high-resolution integrated sensor datasets to support complex system analysis at the entity and system levels. The example cases presented utilize in-vehicle sensor system data to assess vehicle performance, while integrating a map matching algorithm to link vehicle data to roads to demonstrate the enhanced analysis possible via interlinking data elements. Furthermore, in addition to the flexibility provided, the examples presented illustrate concepts of maintaining proprietary operational information (Fleet DNA) and privacy of study participants (Transportation Secure Data Center) while producing widely distributed data products. Should real-time operational data be logged at high resolution across multiple infrastructure types, map matched to their associated infrastructure, and distributed employing a similar approach, dependencies between urban environment infrastructure components could be better understood. This understanding is especially crucial for the cities of the future where transportation will rely more on grid infrastructure to support its energy demands.
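
    The presentation does not say which map matching algorithm it uses. As a rough illustration of the general idea only, the sketch below snaps each logged position to the nearest road segment by perpendicular distance. It assumes planar coordinates in arbitrary units; the road names, segment endpoints and sample points are hypothetical.

```python
import math

def point_segment_distance(p, a, b):
    """Distance from point p to the line segment a-b in planar coordinates."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        # Degenerate segment: a and b coincide.
        return math.hypot(px - ax, py - ay)
    # Projection parameter of p onto the segment, clamped to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def match_to_road(point, roads):
    """Return the name of the road segment closest to the point."""
    return min(roads, key=lambda name: point_segment_distance(point, *roads[name]))

# Hypothetical road network: each road is a single straight segment.
roads = {
    "Main St": ((0.0, 0.0), (10.0, 0.0)),
    "Oak Ave": ((0.0, 5.0), (0.0, 15.0)),
}
# Hypothetical vehicle positions with some measurement noise.
gps_points = [(2.0, 0.4), (0.3, 9.0), (9.0, 1.0)]
matched = [match_to_road(p, roads) for p in gps_points]
print(matched)
```

    Nearest-segment matching ignores heading and trip continuity, so production systems commonly use more elaborate approaches; the sketch only shows how a sensor record can be linked to a road identifier.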

  6. Statistical Viewer: a tool to upload and integrate linkage and association data as plots displayed within the Ensembl genome browser

    Directory of Open Access Journals (Sweden)

    Hauser Elizabeth R

    2005-04-01

    Full Text Available. Background: To facilitate efficient selection and the prioritization of candidate complex disease susceptibility genes for association analysis, increasingly comprehensive annotation tools are essential to integrate, visualize and analyze vast quantities of disparate data generated by genomic screens, public human genome sequence annotation and ancillary biological databases. We have developed a plug-in package for Ensembl called "Statistical Viewer" that facilitates the analysis of genomic features and annotation in the regions of interest defined by linkage analysis. Results: Statistical Viewer is an add-on package to the open-source Ensembl Genome Browser and Annotation System that displays disease study-specific linkage and/or association data as 2 dimensional plots in new panels in the context of Ensembl's Contig View and Cyto View pages. An enhanced upload server facilitates the upload of statistical data, as well as additional feature annotation to be displayed in DAS tracks, in the form of Excel files. The Statistical View panel, drawn directly under the ideogram, illustrates lod score values for markers from a study of interest that are plotted against their position in base pairs. A module called "Get Map" easily converts the genetic locations of markers to genomic coordinates. The graph placed under the corresponding ideogram features a synchronized vertical sliding selection box that is seamlessly integrated into Ensembl's Contig- and Cyto- View pages to choose the region to be displayed in Ensembl's "Overview" and "Detailed View" panels. To resolve Association and Fine mapping data plots, a "Detailed Statistic View" plot corresponding to the "Detailed View" may be displayed underneath. Conclusion: Features mapping to regions of linkage are accentuated when Statistic View is used in conjunction with the Distributed Annotation System (DAS) to display supplemental laboratory information such as differentially expressed disease

  7. A saturated SSR/DArT linkage map of Musa acuminata addressing genome rearrangements among bananas.

    Science.gov (United States)

    Hippolyte, Isabelle; Bakry, Frederic; Seguin, Marc; Gardes, Laetitia; Rivallan, Ronan; Risterucci, Ange-Marie; Jenny, Christophe; Perrier, Xavier; Carreel, Françoise; Argout, Xavier; Piffanelli, Pietro; Khan, Imtiaz A; Miller, Robert N G; Pappas, Georgios J; Mbéguié-A-Mbéguié, Didier; Matsumoto, Takashi; De Bernardinis, Veronique; Huttner, Eric; Kilian, Andrzej; Baurens, Franc-Christophe; D'Hont, Angélique; Cote, François; Courtois, Brigitte; Glaszmann, Jean-Christophe

    2010-04-13

    The genus Musa is a large species complex which includes cultivars at diploid and triploid levels. These sterile and vegetatively propagated cultivars are based on the A genome from Musa acuminata, exclusively for sweet bananas such as Cavendish, or associated with the B genome (Musa balbisiana) in cooking bananas such as Plantain varieties. In M. acuminata cultivars, structural heterozygosity is thought to be one of the main causes of sterility, which is essential for obtaining seedless fruits but hampers breeding. Only partial genetic maps are presently available due to chromosomal rearrangements within the parents of the mapping populations. This causes large segregation distortions inducing pseudo-linkages and difficulties in ordering markers in the linkage groups. The present study aims at producing a saturated linkage map of M. acuminata, taking into account hypotheses on the structural heterozygosity of the parents. An F1 progeny of 180 individuals was obtained from a cross between two genetically distant accessions of M. acuminata, 'Borneo' and 'Pisang Lilin' (P. Lilin). Based on the gametic recombination of each parent, two parental maps composed of SSR and DArT markers were established. A significant proportion of the markers (21.7%) deviated (p DArTs) covering 1197 cM. This first saturated map is proposed as a "reference Musa map" for further analyses. We also propose two complete parental maps with interpretations of structural rearrangements localized on the linkage groups. The structural heterozygosity in P. Lilin is hypothesized to result from a duplication likely accompanied by an inversion on another chromosome. This paper also illustrates a methodological approach, transferable to other species, to investigate the mapping of structural rearrangements and determine their consequences on marker segregation.

  8. Impact of genome assembly status on ChIP-Seq and ChIP-PET data mapping

    Directory of Open Access Journals (Sweden)

    Sachs Laurent

    2009-12-01

    Full Text Available. Background: ChIP-Seq and ChIP-PET can potentially be used with any genome for genome wide profiling of protein-DNA interaction sites. Unfortunately, it is probable that most genome assemblies will never reach the quality of the human genome assembly. Therefore, it remains to be determined whether ChIP-Seq and ChIP-PET are practicable with genome sequences other than a few (e.g. human and mouse). Findings: Here, we used in silico simulations to assess the impact of completeness or fragmentation of genome assemblies on ChIP-Seq and ChIP-PET data mapping. Conclusions: Most currently published genome assemblies are suitable for mapping the short sequence tags produced by ChIP-Seq or ChIP-PET.

  9. First-generation physical map of the Culicoides variipennis (Diptera: Ceratopogonidae) genome.

    Science.gov (United States)

    Nunamaker, R A; Brown, S E; McHolland, L E; Tabachnick, W J; Knudson, D L

    1999-11-01

    Recombinant cosmids labeled with biotin-11-dUTP or digoxigenin by nick translation were used as in situ hybridization probes to metaphase chromosomes of Culicoides variipennis (Coquillett). Paired fluorescent signals were detected on each arm of sister chromatids and were ordered along the 3 chromosomes. Thirty-three unique probes were mapped to the 3 chromosomes of C. variipennis (2n = 6): 7 to chromosome 1, 20 to chromosome 2, and 6 to chromosome 3. This work represents the first stage in generating a physical map of the genome of C. variipennis.

  10. A clone-free, single molecule map of the domestic cow (Bos taurus) genome.

    Science.gov (United States)

    Zhou, Shiguo; Goldstein, Steve; Place, Michael; Bechner, Michael; Patino, Diego; Potamousis, Konstantinos; Ravindran, Prabu; Pape, Louise; Rincon, Gonzalo; Hernandez-Ortiz, Juan; Medrano, Juan F; Schwartz, David C

    2015-08-28

    The cattle (Bos taurus) genome was originally selected for sequencing due to its economic importance and unique biology as a model organism for understanding other ruminants, or mammals. Currently, there are two cattle genome sequence assemblies (UMD3.1 and Btau4.6) from groups using dissimilar assembly algorithms, which were complemented by genetic and physical map resources. However, past comparisons between these assemblies revealed substantial differences. Consequently, such discordances have engendered ambiguities when using reference sequence data, impacting genomic studies in cattle and motivating construction of a new optical map resource--BtOM1.0--to guide comparisons and improvements to the current sequence builds. Accordingly, our comprehensive comparisons of BtOM1.0 against the UMD3.1 and Btau4.6 sequence builds tabulate large-to-immediate scale discordances requiring mediation. The optical map, BtOM1.0, spanning the B. taurus genome (Hereford breed, L1 Dominette 01449) was assembled from an optical map dataset consisting of 2,973,315 (439 X; raw dataset size before assembly) single molecule optical maps (Rmaps; 1 Rmap = 1 restriction mapped DNA molecule) generated by the Optical Mapping System. The BamHI map spans 2,575.30 Mb and comprises 78 optical contigs assembled by a combination of iterative (using the reference sequence: UMD3.1) and de novo assembly techniques. BtOM1.0 is a high-resolution physical map featuring an average restriction fragment size of 8.91 Kb. Comparisons of BtOM1.0 vs. UMD3.1, or Btau4.6, revealed that Btau4.6 presented far more discordances (7,463) vs. UMD3.1 (4,754). Overall, we found that Btau4.6 presented almost double the number of discordances than UMD3.1 across most of the 6 categories of sequence vs. map discrepancies, which are: COMPLEX (misassembly), DELs (extraneous sequences), INSs (missing sequences), ITs (Inverted/Translocated sequences), ECs (extra restriction cuts) and MCs (missing restriction cuts

  11. Constructing linkage maps in the genomics era with MapDisto 2.0.

    Science.gov (United States)

    Heffelfinger, Christopher; Fragoso, Christopher A; Lorieux, Mathias

    2017-07-15

    Genotyping by sequencing (GBS) generates datasets that are challenging to handle by current genetic mapping software with graphical interface. Geneticists need new user-friendly computer programs that can analyze GBS data on desktop computers. This requires improvements in computation efficiency, both in terms of speed and use of random-access memory (RAM). MapDisto v.2.0 is a user-friendly computer program for construction of genetic linkage maps. It includes several new major features: (i) handling of very large genotyping datasets like the ones generated by GBS; (ii) direct importation and conversion of Variant Call Format (VCF) files; (iii) detection of linkage, i.e. construction of linkage groups in case of segregation distortion; (iv) data imputation on VCF files using a new approach, called LB-Impute; (v) QTL detection via a new R/qtl graphical interface. Features i to iv operate through inclusion of new Java modules that are used transparently by MapDisto. The program is available free of charge at mapdisto.free.fr. Contact: mapdisto@gmail.com. Supplementary data are available at Bioinformatics online.

  12. Genome Sequencing and Mapping Reveal Loss of Heterozygosity as a Mechanism for Rapid Adaptation in the Vegetable Pathogen Phytophthora capsici

    Energy Technology Data Exchange (ETDEWEB)

    Lamour, Kurt H.; Mudge, Joann; Gobena, Daniel; Hurtado-Gonzales, Oscar P.; Schmutz, Jeremy; Kuo, Alan; Miller, Neil A.; Rice, Brandon J.; Raffaele, Sylvain; Cano, Liliana M.; Bharti, Arvind K.; Donahoo, Ryan S.; Finely, Sabra; Huitema, Edgar; Hulvey, Jon; Platt, Darren; Salamov, Asaf; Savidor, Alon; Sharma, Rahul; Stam, Remco; Sotrey, Dylan; Thines, Marco; Win, Joe; Haas, Brian J.; Dinwiddie, Darrell L.; Jenkins, Jerry; Knight, James R.; Affourtit, Jason P.; Han, Cliff S.; Chertkov, Olga; Lindquist, Erika A.; Detter, Chris; Grigoriev, Igor V.; Kamoun, Sophien; Kingsmore, Stephen F.

    2012-02-07

    The oomycete vegetable pathogen Phytophthora capsici has shown remarkable adaptation to fungicides and new hosts. Like other members of this destructive genus, P. capsici has an explosive epidemiology, rapidly producing massive numbers of asexual spores on infected hosts. In addition, P. capsici can remain dormant for years as sexually recombined oospores, making it difficult to produce crops at infested sites, and allowing outcrossing populations to maintain significant genetic variation. Genome sequencing, development of a high-density genetic map, and integrative genomic or genetic characterization of P. capsici field isolates and intercross progeny revealed significant mitotic loss of heterozygosity (LOH) in diverse isolates. LOH was detected in clonally propagated field isolates and sexual progeny, cumulatively affecting >30% of the genome. LOH altered genotypes for more than 11,000 single-nucleotide variant sites and showed a strong association with changes in mating type and pathogenicity. Overall, it appears that LOH may provide a rapid mechanism for fixing alleles and may be an important component of adaptability for P. capsici.

  13. High-density genetic map using whole-genome re-sequencing for fine mapping and candidate gene discovery for disease resistance in peanut

    Science.gov (United States)

    High-density genetic linkage maps are essential for fine mapping QTLs controlling disease resistance traits, such as early leaf spot (ELS), late leaf spot (LLS), and Tomato spotted wilt virus (TSWV). With completion of the genome sequences of two diploid ancestors of cultivated peanut, we could use ...

  14. DNA template strand sequencing of single-cells maps genomic rearrangements at high resolution

    OpenAIRE

    Falconer, Ester; Hills, Mark; Naumann, Ulrike; Poon, Steven S. S.; Chavez, Elizabeth A.; Sanders, Ashley D.; Zhao, Yongjun; Hirst, Martin; Lansdorp, Peter M.

    2012-01-01

    DNA rearrangements such as sister chromatid exchanges (SCEs) are sensitive indicators of genomic stress and instability, but they are typically masked by single-cell sequencing techniques. We developed Strand-seq to independently sequence parental DNA template strands from single cells, making it possible to map SCEs at orders-of-magnitude greater resolution than was previously possible. On average, murine embryonic stem (mES) cells exhibit eight SCEs, which are detected at a resolution of up...

  15. Assembly and Multiplex Genome Integration of Metabolic Pathways in Yeast Using CasEMBLR

    DEFF Research Database (Denmark)

    Jakočiūnas, Tadas; Jensen, Emil D.; Jensen, Michael Krogh

    2018-01-01

    Genome integration is a vital step for implementing large biochemical pathways to build a stable microbial cell factory. Although traditional strain construction strategies are well established for the model organism Saccharomyces cerevisiae, recent advances in CRISPR/Cas9-mediated genome engineering allow much higher throughput and robustness in terms of strain construction. In this chapter, we describe CasEMBLR, a highly efficient and marker-free genome engineering method for one-step integration of in vivo assembled expression cassettes in multiple genomic sites simultaneously. Cas... and marker-free integration of the carotenoid pathway from 15 exogenously supplied DNA parts into three targeted genomic loci. As a second proof-of-principle, a total of ten DNA parts were assembled and integrated in two genomic loci to construct a tyrosine production strain, and at the same time knocking...

  16. High resolution linkage maps of the model organism Petunia reveal substantial synteny decay with the related genome of tomato

    OpenAIRE

    Bossolini, Eligio; Klahre, Ulrich; Brandenburg, Anna; Reinhardt, Didier; Kuhlemeier, Cris

    2011-01-01

    Two linkage maps were constructed for the model plant Petunia. Mapping populations were obtained by crossing the wild species Petunia axillaris subsp. axillaris with Petunia inflata, and Petunia axillaris subsp. parodii with Petunia exserta. Both maps cover the seven chromosomes of Petunia, and span 970 centimorgans (cM) and 700 cM of the genomes, respectively. In total, 207 markers were mapped. Of these, 28 are multilocus amplified fragment length polymorphism (AFLP) markers and 179 are gene...

  17. Fine-scale maps of recombination rates and hotspots in the mouse genome.

    Science.gov (United States)

    Brunschwig, Hadassa; Levi, Liat; Ben-David, Eyal; Williams, Robert W; Yakir, Benjamin; Shifman, Sagiv

    2012-07-01

    Recombination events are not uniformly distributed and often cluster in narrow regions known as recombination hotspots. Several studies using different approaches have dramatically advanced our understanding of recombination hotspot regulation. Population genetic data have been used to map and quantify hotspots in the human genome. Genetic variation in recombination rates and hotspot usage has been explored in human pedigrees, mouse intercrosses, and by sperm typing. These studies pointed to the central role of the PRDM9 gene in hotspot modulation. In this study, we used single nucleotide polymorphisms (SNPs) from whole-genome resequencing and genotyping studies of mouse inbred strains to estimate recombination rates across the mouse genome and identified 47,068 historical hotspots--an average of over 2477 per chromosome. We show by simulation that inbred mouse strains can be used to identify positions of historical hotspots. Recombination hotspots were found to be enriched for the predicted binding sequences for different alleles of the PRDM9 protein. Recombination rates were on average lower near transcription start sites (TSS). Comparing the inferred historical recombination hotspots with the recent genome-wide mapping of double-strand breaks (DSBs) in mouse sperm revealed a significant overlap, especially toward the telomeres. Our results suggest that inbred strains can be used to characterize and study the dynamics of historical recombination hotspots. They also strengthen previous findings on mouse recombination hotspots, and specifically the impact of sequence variants in Prdm9.

  18. Genome-wide mapping reveals single-origin chromosome replication in Leishmania, a eukaryotic microbe.

    Science.gov (United States)

    Marques, Catarina A; Dickens, Nicholas J; Paape, Daniel; Campbell, Samantha J; McCulloch, Richard

    2015-10-19

    DNA replication initiates on defined genome sites, termed origins. Origin usage appears to follow common rules in the eukaryotic organisms examined to date: all chromosomes are replicated from multiple origins, which display variations in firing efficiency and are selected from a larger pool of potential origins. To ask if these features of DNA replication are true of all eukaryotes, we describe genome-wide origin mapping in the parasite Leishmania. Origin mapping in Leishmania suggests a striking divergence in origin usage relative to characterized eukaryotes, since each chromosome appears to be replicated from a single origin. By comparing two species of Leishmania, we find evidence that such origin singularity is maintained in the face of chromosome fusion or fission events during evolution. Mapping Leishmania origins suggests that all origins fire with equal efficiency, and that the genomic sites occupied by origins differ from related non-origin sites. Finally, we provide evidence that origin location in Leishmania displays striking conservation with Trypanosoma brucei, despite the latter parasite replicating its chromosomes from multiple, variable strength origins. The demonstration of chromosome replication from a single origin in Leishmania, a microbial eukaryote, has implications for the evolution of origin multiplicity and associated controls, and may explain the pervasive aneuploidy that characterizes Leishmania chromosome architecture.

  19. Genotype Imputation for Latinos Using the HapMap and 1000 Genomes Project Reference Panels

    Directory of Open Access Journals (Sweden)

    Xiaoyi Gao

    2012-06-01

    Full Text Available. Genotype imputation is a vital tool in genome-wide association studies (GWAS) and meta-analyses of multiple GWAS results. Imputation enables researchers to increase genomic coverage and to pool data generated using different genotyping platforms. HapMap samples are often employed as the reference panel. More recently, the 1000 Genomes Project resource is becoming the primary source for reference panels. Multiple GWAS and meta-analyses are targeting Latinos, the most populous and fastest growing minority group in the US. However, genotype imputation resources for Latinos are rather limited compared to individuals of European ancestry at present, largely because of the lack of good reference data. One choice of reference panel for Latinos is one derived from the population of Mexican individuals in Los Angeles contained in the HapMap Phase 3 project and the 1000 Genomes Project. However, a detailed evaluation of the quality of the imputed genotypes derived from the public reference panels has not yet been reported. Using simulation studies, the Illumina OmniExpress GWAS data from the Los Angeles Latino Eye Study and the MACH software package, we evaluated the accuracy of genotype imputation in Latinos. Our results show that the 1000 Genomes Project AMR+CEU+YRI reference panel provides the highest imputation accuracy for Latinos, and that also including Asian samples in the panel can reduce imputation accuracy. We also provide the imputation accuracy for each autosomal chromosome using the 1000 Genomes Project panel for Latinos. Our results serve as a guide to future imputation-based analysis in Latinos.
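
    The abstract does not state which accuracy measure was used. One common way to score imputation, shown here purely as an assumed illustration, is to hide genotypes that were actually measured, impute them, and report the fraction that agree with the measured calls. Genotypes are coded as alternate-allele counts (0, 1, 2); the function name and the toy data are hypothetical.

```python
def genotype_concordance(true_genotypes, imputed_genotypes):
    """Fraction of masked sites where the imputed genotype equals the measured one.

    Sites whose measured genotype is None (never observed) are skipped.
    """
    pairs = [(t, i) for t, i in zip(true_genotypes, imputed_genotypes)
             if t is not None]
    if not pairs:
        raise ValueError("no comparable genotypes")
    return sum(1 for t, i in pairs if t == i) / len(pairs)

# Hypothetical masked sites for one individual: measured vs. imputed calls.
measured = [0, 1, 2, 1, 0, 2, 1, 1]
imputed = [0, 1, 2, 2, 0, 2, 1, 0]
rate = genotype_concordance(measured, imputed)
print(f"concordance = {rate:.2f}")
```

    Simple concordance can look optimistic for rare variants, which is why squared-correlation measures are also widely reported; the choice of metric here is an assumption, not the study's.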

  20. Tropical forest carbon assessment: integrating satellite and airborne mapping approaches

    International Nuclear Information System (INIS)

    Asner, Gregory P

    2009-01-01

    Large-scale carbon mapping is needed to support the UNFCCC program to reduce deforestation and forest degradation (REDD). Managers of forested land can potentially increase their carbon credits via detailed monitoring of forest cover, loss and gain (hectares), and periodic estimates of changes in forest carbon density (tons ha^-1). Satellites provide an opportunity to monitor changes in forest carbon caused by deforestation and degradation, but only after initial carbon densities have been assessed. New airborne approaches, especially light detection and ranging (LiDAR), provide a means to estimate forest carbon density over large areas, which greatly assists in the development of practical baselines. Here I present an integrated satellite-airborne mapping approach that supports high-resolution carbon stock assessment and monitoring in tropical forest regions. The approach yields a spatially resolved, regional state-of-the-forest carbon baseline, followed by high-resolution monitoring of forest cover and disturbance to estimate carbon emissions. Rapid advances and decreasing costs in the satellite and airborne mapping sectors are already making high-resolution carbon stock and emissions assessments viable anywhere in the world.
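
    The way a carbon baseline and a forest-loss map combine into an emissions figure can be shown with simple arithmetic: area lost (hectares) multiplied by carbon density (tons per hectare), summed over forest strata. The sketch below is a simplified illustration under that assumption; the strata and all numbers are invented and do not come from the paper.

```python
def carbon_emissions(strata):
    """Sum area_lost_ha * carbon_density_t_per_ha over forest strata (tons of carbon)."""
    return sum(area_lost_ha * density for area_lost_ha, density in strata)

# Hypothetical strata: (hectares of forest lost, baseline carbon density in tons/ha).
strata = [
    (120.0, 150.0),  # e.g. intact lowland forest
    (40.0, 90.0),    # e.g. previously degraded forest
]
total_tons = carbon_emissions(strata)
print(total_tons)
```

    Stratifying by carbon density matters because the same number of hectares lost can correspond to very different emissions depending on where the loss occurs.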

  1. An integrative genomic and transcriptomic analysis reveals potential targets associated with cell proliferation in uterine leiomyomas.

    Directory of Open Access Journals (Sweden)

    Priscila Daniele Ramos Cirilo

    Full Text Available. Uterine Leiomyomas (ULs) are the most common benign tumours affecting women of reproductive age. ULs represent a major problem in public health, as they are the main indication for hysterectomy. Approximately 40-50% of ULs have non-random cytogenetic abnormalities, and half of ULs may have copy number alterations (CNAs). Gene expression microarray studies have demonstrated that cell proliferation genes act in response to growth factors and steroids. However, only a few genes mapping to CNA regions were found to be associated with ULs. We applied an integrative analysis using genomic and transcriptomic data to identify the pathways and molecular markers associated with ULs. Fifty-one fresh frozen specimens were evaluated by array CGH (JISTIC) and gene expression microarrays (SAM). The CONEXIC algorithm was applied to integrate the data. The integrated analysis identified the top 30 significant genes (P<0.01), which comprised genes associated with cancer, whereas the protein-protein interaction analysis indicated a strong association between FANCA and BRCA1. Functional in silico analysis revealed target molecules for drugs involved in cell proliferation, including FGFR1 and IGFBP5. Transcriptional and protein analyses showed that FGFR1 (P = 0.006 and P<0.01, respectively) and IGFBP5 (P = 0.0002 and P = 0.006, respectively) were up-regulated in the tumours when compared with the adjacent normal myometrium. The integrative genomic and transcriptomic approach indicated that FGFR1 and IGFBP5 amplification, as well as the consequent up-regulation of the protein products, plays an important role in the aetiology of ULs and thus provides data for potential drug therapies development to target genes associated with cellular proliferation in ULs.

  2. Genomic Characterization of DArT Markers Based on High-Density Linkage Analysis and Physical Mapping to the Eucalyptus Genome

    Science.gov (United States)

    Petroli, César D.; Sansaloni, Carolina P.; Carling, Jason; Steane, Dorothy A.; Vaillancourt, René E.; Myburg, Alexander A.; da Silva, Orzenil Bonfim; Pappas, Georgios Joannis; Kilian, Andrzej; Grattapaglia, Dario

    2012-01-01

    Diversity Arrays Technology (DArT) provides a robust, high throughput, cost-effective method to query thousands of sequence polymorphisms in a single assay. Despite the extensive use of this genotyping platform for numerous plant species, little is known regarding the sequence attributes and genome-wide distribution of DArT markers. We investigated the genomic properties of the 7,680 DArT marker probes of a Eucalyptus array, by sequencing them, constructing a high density linkage map and carrying out detailed physical mapping analyses to the Eucalyptus grandis reference genome. A consensus linkage map with 2,274 DArT markers anchored to 210 microsatellites and a framework map, with improved support for ordering, displayed extensive collinearity with the genome sequence. Only 1.4 Mbp of the 75 Mbp of still unplaced scaffold sequence was captured by 45 linkage mapped but physically unaligned markers to the 11 main Eucalyptus pseudochromosomes, providing compelling evidence for the quality and completeness of the current Eucalyptus genome assembly. A highly significant correspondence was found between the locations of DArT markers and predicted gene models, while most of the 89 DArT probes unaligned to the genome correspond to sequences likely absent in E. grandis, consistent with the pan-genomic feature of this multi-Eucalyptus species DArT array. These comprehensive linkage-to-physical mapping analyses provide novel data regarding the genomic attributes of DArT markers in plant genomes in general and for Eucalyptus in particular. DArT markers preferentially target the gene space and display a largely homogeneous distribution across the genome, thereby providing superb coverage for mapping and genome-wide applications in breeding and diversity studies. Data reported on these ubiquitous properties of DArT markers will be particularly valuable to researchers working on less-studied crop species who already count on DArT genotyping arrays but for which no reference

  3. Genomic characterization of DArT markers based on high-density linkage analysis and physical mapping to the Eucalyptus genome.

    Directory of Open Access Journals (Sweden)

    César D Petroli

    Diversity Arrays Technology (DArT) provides a robust, high throughput, cost-effective method to query thousands of sequence polymorphisms in a single assay. Despite the extensive use of this genotyping platform for numerous plant species, little is known regarding the sequence attributes and genome-wide distribution of DArT markers. We investigated the genomic properties of the 7,680 DArT marker probes of a Eucalyptus array, by sequencing them, constructing a high density linkage map and carrying out detailed physical mapping analyses to the Eucalyptus grandis reference genome. A consensus linkage map with 2,274 DArT markers anchored to 210 microsatellites and a framework map, with improved support for ordering, displayed extensive collinearity with the genome sequence. Only 1.4 Mbp of the 75 Mbp of still unplaced scaffold sequence was captured by 45 linkage mapped but physically unaligned markers to the 11 main Eucalyptus pseudochromosomes, providing compelling evidence for the quality and completeness of the current Eucalyptus genome assembly. A highly significant correspondence was found between the locations of DArT markers and predicted gene models, while most of the 89 DArT probes unaligned to the genome correspond to sequences likely absent in E. grandis, consistent with the pan-genomic feature of this multi-Eucalyptus species DArT array. These comprehensive linkage-to-physical mapping analyses provide novel data regarding the genomic attributes of DArT markers in plant genomes in general and for Eucalyptus in particular. DArT markers preferentially target the gene space and display a largely homogeneous distribution across the genome, thereby providing superb coverage for mapping and genome-wide applications in breeding and diversity studies. Data reported on these ubiquitous properties of DArT markers will be particularly valuable to researchers working on less-studied crop species who already count on DArT genotyping arrays but for

  4. Mapping the pericentric heterochromatin by comparative genomic hybridization analysis and chromosome deletions in Drosophila melanogaster.

    Science.gov (United States)

    He, Bing; Caudy, Amy; Parsons, Lance; Rosebrock, Adam; Pane, Attilio; Raj, Sandeep; Wieschaus, Eric

    2012-12-01

    Heterochromatin represents a significant portion of eukaryotic genomes and has essential structural and regulatory functions. Its molecular organization is largely unknown due to difficulties in sequencing through and assembling repetitive sequences enriched in the heterochromatin. Here we developed a novel strategy using chromosomal rearrangements and embryonic phenotypes to position unmapped Drosophila melanogaster heterochromatic sequence to specific chromosomal regions. By excluding sequences that can be mapped to the assembled euchromatic arms, we identified sequences that are specific to heterochromatin and used them to design heterochromatin specific probes ("H-probes") for microarray. By comparative genomic hybridization (CGH) analyses of embryos deficient for each chromosome or chromosome arm, we were able to map most of our H-probes to specific chromosome arms. We also positioned sequences mapped to the second and X chromosomes to finer intervals by analyzing smaller deletions with breakpoints in heterochromatin. Using this approach, we were able to map >40% (13.9 Mb) of the previously unmapped heterochromatin sequences assembled by the whole-genome sequencing effort on arm U and arm Uextra to specific locations. We also identified and mapped 110 kb of novel heterochromatic sequences. Subsequent analyses revealed that sequences located within different heterochromatic regions have distinct properties, such as sequence composition, degree of repetitiveness, and level of underreplication in polytenized tissues. Surprisingly, although heterochromatin is generally considered to be transcriptionally silent, we detected region-specific temporal patterns of transcription in heterochromatin during oogenesis and early embryonic development. Our study provides a useful approach to elucidate the molecular organization and function of heterochromatin and reveals region-specific variation of heterochromatin.

  5. Mapping the pericentric heterochromatin by comparative genomic hybridization analysis and chromosome deletions in Drosophila melanogaster

    Science.gov (United States)

    He, Bing; Caudy, Amy; Parsons, Lance; Rosebrock, Adam; Pane, Attilio; Raj, Sandeep; Wieschaus, Eric

    2012-01-01

    Heterochromatin represents a significant portion of eukaryotic genomes and has essential structural and regulatory functions. Its molecular organization is largely unknown due to difficulties in sequencing through and assembling repetitive sequences enriched in the heterochromatin. Here we developed a novel strategy using chromosomal rearrangements and embryonic phenotypes to position unmapped Drosophila melanogaster heterochromatic sequence to specific chromosomal regions. By excluding sequences that can be mapped to the assembled euchromatic arms, we identified sequences that are specific to heterochromatin and used them to design heterochromatin specific probes (“H-probes”) for microarray. By comparative genomic hybridization (CGH) analyses of embryos deficient for each chromosome or chromosome arm, we were able to map most of our H-probes to specific chromosome arms. We also positioned sequences mapped to the second and X chromosomes to finer intervals by analyzing smaller deletions with breakpoints in heterochromatin. Using this approach, we were able to map >40% (13.9 Mb) of the previously unmapped heterochromatin sequences assembled by the whole-genome sequencing effort on arm U and arm Uextra to specific locations. We also identified and mapped 110 kb of novel heterochromatic sequences. Subsequent analyses revealed that sequences located within different heterochromatic regions have distinct properties, such as sequence composition, degree of repetitiveness, and level of underreplication in polytenized tissues. Surprisingly, although heterochromatin is generally considered to be transcriptionally silent, we detected region-specific temporal patterns of transcription in heterochromatin during oogenesis and early embryonic development. Our study provides a useful approach to elucidate the molecular organization and function of heterochromatin and reveals region-specific variation of heterochromatin. PMID:22745230

  6. An initial comparative map of copy number variations in the goat (Capra hircus) genome

    Directory of Open Access Journals (Sweden)

    Casadio Rita

    2010-11-01

    Background: The goat (Capra hircus) represents one of the most important farm animal species. It is reared in all continents with an estimated world population of about 800 million animals. Despite its importance, studies on the goat genome are still in their infancy compared to those in other farm animal species. Comparative mapping between cattle and goat showed only a few rearrangements in agreement with the similarity of chromosome banding. We carried out a cross species cattle-goat array comparative genome hybridization (aCGH) experiment in order to identify copy number variations (CNVs) in the goat genome, analysing animals of different breeds (Saanen, Camosciata delle Alpi, Girgentana, and Murciano-Granadina) using a tiling oligonucleotide array with ~385,000 probes designed on the bovine genome. Results: We identified a total of 161 CNVs (an average of 17.9 CNVs per goat), with the largest number in the Saanen breed and the lowest in the Camosciata delle Alpi goat. By aggregating overlapping CNVs identified in different animals we determined CNV regions (CNVRs): on the whole, we identified 127 CNVRs covering about 11.47 Mb of the virtual goat genome referred to the bovine genome (0.435% of the latter genome). These 127 CNVRs included 86 losses and 41 gains and ranged from about 24 kb to about 1.07 Mb with a mean and median equal to 90,292 bp and 49,530 bp, respectively. To evaluate whether the identified goat CNVRs overlap with those reported in the cattle genome, we compared our results with those obtained in four independent cattle experiments. The overlap between goat and cattle CNVRs was highly significant (P …). Conclusions: We describe a first map of goat CNVRs. This provides information on a comparative basis with the cattle genome by identifying putative recurrent interspecies CNVs between these two ruminant species. Several goat CNVs affect genes with important biological functions. Further studies are needed to evaluate the
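
    The aggregation of overlapping CNVs into CNV regions described in this record can be illustrated with a short interval-merging sketch. This is not the authors' pipeline: the function name `merge_cnvs`, the tuple layout, and the rule that any overlap of at least one base merges two calls are all assumptions made for illustration, and the coordinates are invented.

```python
from typing import List, Tuple

# Hypothetical layout: (chromosome, start, end), with inclusive coordinates.
CNV = Tuple[str, int, int]


def merge_cnvs(cnvs: List[CNV]) -> List[CNV]:
    """Merge CNV calls that overlap on the same chromosome into CNV regions."""
    regions: List[CNV] = []
    # Sorting by (chromosome, start, end) lets one pass catch every overlap.
    for chrom, start, end in sorted(cnvs):
        if regions and regions[-1][0] == chrom and start <= regions[-1][2]:
            prev_chrom, prev_start, prev_end = regions[-1]
            # Extend the open region instead of starting a new one.
            regions[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            regions.append((chrom, start, end))
    return regions


# Invented calls pooled from several animals.
calls = [("chr1", 100, 500), ("chr1", 400, 900),
         ("chr1", 2000, 2500), ("chr2", 100, 300)]
cnvrs = merge_cnvs(calls)
# Total sequence covered by the merged regions, in base pairs.
total_bp = sum(end - start + 1 for _, start, end in cnvrs)
```

    Summing the merged region lengths, as in `total_bp`, is the kind of calculation that would yield a genome coverage figure such as the 11.47 Mb quoted above.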

  7. Sequence based polymorphic (SBP) marker technology for targeted genomic regions: its application in generating a molecular map of the Arabidopsis thaliana genome

    Directory of Open Access Journals (Sweden)

    Sahu Binod B

    2012-01-01

    Background: Molecular markers facilitate both genotype identification, essential for modern animal and plant breeding, and the isolation of genes based on their map positions. Advancements in sequencing technology have made possible the identification of single nucleotide polymorphisms (SNPs) for any genomic region. Here a sequence based polymorphic (SBP) marker technology for generating molecular markers for targeted genomic regions in Arabidopsis is described. Results: A ~3X genome coverage sequence of the Arabidopsis thaliana ecotype Niederzenz (Nd-0) was obtained by applying Illumina's sequencing by synthesis (Solexa) technology. Comparison of the Nd-0 genome sequence with the assembled Columbia-0 (Col-0) genome sequence identified putative single nucleotide polymorphisms (SNPs) throughout the entire genome. Multiple 75 base pair Nd-0 sequence reads containing SNPs and originating from individual genomic DNA molecules were the basis for developing co-dominant SBP markers. SNP-containing Col-0 sequences, supported by transcript sequences or sequences from multiple BAC clones, were compared to the respective Nd-0 sequences to identify possible restriction endonuclease enzyme site variations. Small amplicons, PCR amplified from both ecotypes, were digested with suitable restriction enzymes and resolved on a gel to reveal the sequence based polymorphisms. By applying this technology, 21 SBP markers for the marker poor regions of the Arabidopsis map representing polymorphisms between Col-0 and Nd-0 ecotypes were generated. Conclusions: The SBP marker technology described here allowed the development of molecular markers for targeted genomic regions of Arabidopsis. It should facilitate isolation of co-dominant molecular markers for targeted genomic regions of any animal or plant species whose genomic sequences have been assembled. This technology will particularly facilitate the development of high density molecular marker maps, essential for
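
    The step of comparing two allelic sequences for restriction site variations can be sketched as follows. This is a minimal illustration, not the procedure used in the paper: the function `differential_enzymes`, the three-enzyme table, and both example sequences are assumptions. The recognition sequences listed are the standard ones for these enzymes, and all three are palindromic, so only one strand needs to be scanned.

```python
from typing import Dict, List

# Recognition sequences of three common restriction enzymes (all palindromic).
ENZYMES: Dict[str, str] = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
}


def differential_enzymes(seq_a: str, seq_b: str,
                         enzymes: Dict[str, str] = ENZYMES) -> List[str]:
    """Return the enzymes whose number of sites differs between two alleles."""
    a, b = seq_a.upper(), seq_b.upper()
    return [name for name, site in enzymes.items()
            if a.count(site) != b.count(site)]


# Invented amplicon alleles differing by one SNP (A -> G at position 7).
col0_allele = "ACGTGAATTCTTGCA"
nd0_allele = "ACGTGAGTTCTTGCA"
usable = differential_enzymes(col0_allele, nd0_allele)
```

    An enzyme returned here cuts one allele but not the other, so digesting the PCR amplicon from each ecotype would give different band patterns on a gel, which is the property a co-dominant marker relies on.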

  8. Integrating functional data to prioritize causal variants in statistical fine-mapping studies.

    Directory of Open Access Journals (Sweden)

    Gleb Kichaev

    2014-10-01

    Standard statistical approaches for prioritization of variants for functional testing in fine-mapping studies either use marginal association statistics or estimate posterior probabilities for variants to be causal under simplifying assumptions. Here, we present a probabilistic framework that integrates association strength with functional genomic annotation data to improve accuracy in selecting plausible causal variants for functional validation. A key feature of our approach is that it empirically estimates the contribution of each functional annotation to the trait of interest directly from summary association statistics while allowing for multiple causal variants at any risk locus. We devise efficient algorithms that estimate the parameters of our model across all risk loci to further increase performance. Using simulations starting from the 1000 Genomes data, we find that our framework consistently outperforms the current state-of-the-art fine-mapping methods, reducing the number of variants that need to be selected to capture 90% of the causal variants from an average of 13.3 to 10.4 SNPs per locus (as compared to the next-best performing strategy). Furthermore, we introduce a cost-to-benefit optimization framework for determining the number of variants to be followed up in functional assays and assess its performance using real and simulation data. We validate our findings using a large scale meta-analysis of four blood lipids traits and find that the relative probability for causality is increased for variants in exons and transcription start sites and decreased in repressed genomic regions at the risk loci of these traits. Using these highly predictive, trait-specific functional annotations, we estimate causality probabilities across all traits and variants, reducing the size of the 90% confidence set from an average of 17.5 to 13.5 variants per locus in this data.
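
    The idea of a 90% set of candidate variants can be illustrated with a simplified sketch. It assumes per-variant posterior probabilities are already available and that they sum to one at the locus, which corresponds to a single causal variant. That is a simplification of the framework described above, which allows multiple causal variants. The function `credible_set` and the variant labels are hypothetical.

```python
from typing import Dict, List


def credible_set(posteriors: Dict[str, float], level: float = 0.9) -> List[str]:
    """Smallest set of top-ranked variants whose posteriors sum to >= level."""
    ranked = sorted(posteriors.items(), key=lambda kv: kv[1], reverse=True)
    chosen: List[str] = []
    cumulative = 0.0
    for variant, prob in ranked:
        chosen.append(variant)
        cumulative += prob
        if cumulative >= level:
            break
    return chosen


# Invented posterior probabilities for five variants at one locus.
locus = {"v1": 0.5, "v2": 0.25, "v3": 0.125, "v4": 0.0625, "v5": 0.0625}
selected = credible_set(locus)
```

    Under this simplification, a method that concentrates posterior mass on fewer variants yields a smaller set, which is the kind of reduction the abstract reports (from 17.5 to 13.5 variants per locus).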

  9. High resolution linkage maps of the model organism Petunia reveal substantial synteny decay with the related genome of tomato.

    Science.gov (United States)

    Bossolini, Eligio; Klahre, Ulrich; Brandenburg, Anna; Reinhardt, Didier; Kuhlemeier, Cris

    2011-04-01

    Two linkage maps were constructed for the model plant Petunia. Mapping populations were obtained by crossing the wild species Petunia axillaris subsp. axillaris with Petunia inflata, and Petunia axillaris subsp. parodii with Petunia exserta. Both maps cover the seven chromosomes of Petunia, and span 970 centimorgans (cM) and 700 cM of the genomes, respectively. In total, 207 markers were mapped. Of these, 28 are multilocus amplified fragment length polymorphism (AFLP) markers and 179 are gene-derived markers. For the first time we report on the development and mapping of 83 Petunia microsatellites. The two maps retain the same marker order, but display significant differences of recombination frequencies at orthologous mapping intervals. A complex pattern of genomic rearrangements was detected with the related genome of tomato (Solanum lycopersicum), indicating that synteny between Petunia and other Solanaceae crops has been considerably disrupted. The newly developed markers will facilitate the genetic characterization of mutants and ecological studies on genetic diversity and speciation within the genus Petunia. The maps will provide a powerful tool to link genetic and genomic information and will be useful to support sequence assembly of the Petunia genome.

  10. Two-dimensional Value Stream Mapping: Integrating the design of the MPC system in the value stream map

    DEFF Research Database (Denmark)

    Powell, Daryl; Olesen, Peter Bjerg

    2013-01-01

    Companies use value stream mapping to identify waste, often in the early stages of a lean implementation. Though the tool helps users to visualize material and information flows and to identify improvement opportunities, a limitation of this approach is the lack of an integrated method...... for analysing and re-designing the MPC system in order to support lean improvement. We reflect on the current literature regarding value stream mapping, and use practical insights in order to develop and propose a two-dimensional value stream mapping tool that integrates the design of the MPC system within...... the material and information flow map....

  11. Genome wide SSR high density genetic map construction from an interspecific cross of Gossypium hirsutum × Gossypium tomentosum

    Directory of Open Access Journals (Sweden)

    Muhammad Kashif Riaz Khan

    2016-04-01

    A high density genetic map was constructed using an F2 population derived from an interspecific cross of G. hirsutum x G. tomentosum. The map consisted of 3,093 marker loci distributed across all 26 chromosomes and covered 4,365.3 cM of the cotton genome with an average inter-marker distance of 1.48 cM. The maximum chromosome length was 218.38 cM and the minimum was 122.09 cM, with an average length of 167.90 cM. The A sub-genome covers more genetic distance (2,189.01 cM, with an average inter-locus distance of 1.53 cM) than the D sub-genome, which covers 2,176.29 cM with an average distance of 1.43 cM. There were 716 distorted loci in the map, accounting for 23.14%, and most distorted loci were distributed on the D sub-genome (25.06%), more than on the A sub-genome (21.23%). In our map, 49 segregation hotspots (SDR) were distributed across the genome, with more on the D sub-genome than on the A sub-genome. Two post-polyploidization reciprocal translocations of A2/A3 and A4/A5 were suggested by 7 pairs of duplicate loci. The map constructed through these studies is one of the three densest genetic maps in cotton; however, this is the first dense genome wide SSR interspecific genetic map between G. hirsutum and G. tomentosum.

  12. Positioning genomics in biology education: content mapping of undergraduate biology textbooks.

    Science.gov (United States)

    Wernick, Naomi L B; Ndung'u, Eric; Haughton, Dominique; Ledley, Fred D

    2014-12-01

    Biological thought increasingly recognizes the centrality of the genome in constituting and regulating processes ranging from cellular systems to ecology and evolution. In this paper, we ask whether genomics is similarly positioned as a core concept in the instructional sequence for undergraduate biology. Using quantitative methods, we analyzed the order in which core biological concepts were introduced in textbooks for first-year general and human biology. Statistical analysis was performed using self-organizing map algorithms and conventional methods to identify clusters of terms and their relative position in the books. General biology textbooks for both majors and nonmajors introduced genome-related content after text related to cell biology and biological chemistry, but before content describing higher-order biological processes. However, human biology textbooks most often introduced genomic content near the end of the books. These results suggest that genomics is not yet positioned as a core concept in commonly used textbooks for first-year biology and raises questions about whether such textbooks, or courses based on the outline of these textbooks, provide an appropriate foundation for understanding contemporary biological science.

  13. Genomic Organization and Physical Mapping of Tandemly Arranged Repetitive DNAs in Sterlet (Acipenser ruthenus).

    Science.gov (United States)

    Biltueva, Larisa S; Prokopov, Dimitry Y; Makunin, Alexey I; Komissarov, Alexey S; Kudryavtseva, Anna V; Lemskaya, Natalya A; Vorobieva, Nadezhda V; Serdyukova, Natalia A; Romanenko, Svetlana A; Gladkikh, Olga L; Graphodatsky, Alexander S; Trifonov, Vladimir A

    2017-01-01

    Acipenseriformes represent a phylogenetically basal clade of ray-finned fish characterized by unusual genomic traits, including paleopolyploid states of extant genomes with high chromosome numbers and slow rates of molecular evolution. Despite a high interest in this fish group, only a limited number of studies have been accomplished on the isolation and characterization of repetitive DNA, karyotype standardization is not yet complete, and sex chromosomes are still to be identified. Here, we applied next-generation sequencing and cluster analysis to characterize major fractions of sterlet (Acipenser ruthenus) repetitive DNA. Using FISH, we mapped 16 tandemly arranged sequences on sterlet chromosomes and found them to be unevenly distributed in the genome with a tendency to cluster in particular regions. Some of the satellite DNAs might be used as specific markers to identify individual chromosomes and their paralogs, resulting in the unequivocal identification of at least 18 chromosome pairs. Our results provide an insight into the characteristic genomic distribution of the most common sterlet repetitive sequences. Biased accumulation of repetitive DNAs in particular chromosomes makes them especially interesting for further search for cryptic sex chromosomes. Future studies of these sequences in other acipenserid species will provide new perspectives regarding the evolution of repetitive DNA within the genomes of this fish order. © 2017 S. Karger AG, Basel.

  14. Naturkraft integration at Kaarstoe. Mapping study report 6 march 2009

    Energy Technology Data Exchange (ETDEWEB)

    Odland, Oeystein; Hoeie, Hans; Aanestad, Per; Hauge, Bjoern I.; Solvang, Svein; Boee, Stein Espen; Lervik, Steinar; Kristiansen, Arild

    2007-07-01

    Gassco is engaged in three different studies and projects related to reducing CO2 emissions at Kaarstoe: 1. Gassnova's CO2 capture, transport and storage projects at Kaarstoe and Mongstad (the CO2 Transportation Network), where Gassco's role is to mature the transportation facilities of the project. An investment decision is planned for the second half of 2009. 2. The Kaarstoe Flue Gas Capture pre-feasibility study initiated by the MPE to Gassco as operator of Gassled in connection with approval of the Kaarstoe Expansion Project 2010, to evaluate the potential of capturing the CO2 emissions at the Kaarstoe gas processing plant, and reported to the MPE 3 March 2009, and 3. This Naturkraft Integration Mapping Study initiated by MPE to meet the requirements set out by the letter to Gassco dated 3 December 2008 to evaluate possibilities to integrate Naturkraft's existing power plant with the Kaarstoe gas processing plant. In this Naturkraft Integration Study report Gassco has identified potential degrees of integration and the resulting impact with respect to performance of both the Naturkraft power plant and the Kaarstoe gas processing plant. A major concern related to operations of the Kaarstoe gas processing plant is the regularity and availability issues related to natural gas and NGL export. The value of the petroleum transported over Kaarstoe on any day exceeds 200 million NOK. In addition, significant oil production will be shut down if the Kaarstoe gas processing plant is not operating. Hence the regularity of energy supply, including steam, is of utmost importance. The integration between Naturkraft and the gas processing plant is based on supplying energy (i.e. fuel gas, steam, heat and electricity) from Naturkraft's existing power plant, and therefore must be based on predictable and steady operations of the Naturkraft power plant. Naturkraft has only been in operation for a few weeks from the start-up of the power plant 1

  15. G-MAPSEQ – a new method for mapping reads to a reference genome

    Directory of Open Access Journals (Sweden)

    Wojciechowski Pawel

    2016-06-01

    Mapping reads to a reference genome is one of the most essential problems in modern computational biology. The most popular algorithms used to solve this problem are based on the Burrows-Wheeler transform and the FM-index. However, these approaches have difficulty with highly mutated sequences because they allow only a limited number of mutations. G-MAPSEQ is a novel, hybrid algorithm combining two methods: alignment-free sequence comparison and an ultra fast sequence alignment. The former is a fast heuristic algorithm which uses k-mer characteristics of nucleotide sequences to find potential mapping places. The latter is a very fast GPU implementation of sequence alignment used to verify the correctness of these mapping positions. The source code of G-MAPSEQ along with other bioinformatic software is available at: http://gpualign.cs.put.poznan.pl.
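
    The general idea of using k-mers to find potential mapping places can be sketched with a generic seed-and-vote scheme. This is not G-MAPSEQ's actual heuristic, whose details the abstract does not give, and the GPU alignment step that verifies candidates is not shown. The function names, the `min_hits` threshold, and the sequences are assumptions made for illustration.

```python
from collections import Counter, defaultdict
from typing import DefaultDict, List


def build_kmer_index(reference: str, k: int) -> DefaultDict[str, List[int]]:
    """Map every k-mer of the reference to the positions where it occurs."""
    index: DefaultDict[str, List[int]] = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def candidate_positions(read: str, index, k: int, min_hits: int = 2) -> List[int]:
    """Return reference start positions supported by at least min_hits k-mers."""
    votes: Counter = Counter()
    for offset in range(len(read) - k + 1):
        # Each shared k-mer votes for the read start it implies.
        for ref_pos in index.get(read[offset:offset + k], ()):
            votes[ref_pos - offset] += 1
    return sorted(start for start, hits in votes.items() if hits >= min_hits)


# Invented data: the read matches the reference from position 6
# except for one substitution in the middle.
reference = "ACGTTGCATGACCGTAGGCTA"
read = "CATGTCCGTAG"
index = build_kmer_index(reference, k=4)
candidates = candidate_positions(read, index, k=4)
```

    Because votes are collected per implied start position, the mismatch only removes the k-mers that span it. The remaining k-mers still agree on the same start, and that start is what would be passed on for verification by a full alignment.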

  16. Digital Geologic Mapping and Integration with the Geoweb: The Death Knell for Exclusively Paper Geologic Maps

    Science.gov (United States)

    House, P. K.

    2008-12-01

    The combination of traditional methods of geologic mapping with rapidly developing web-based geospatial applications ('the geoweb') and the various collaborative opportunities of web 2.0 have the potential to change the nature, value, and relevance of geologic maps and related field studies. Parallel advances in basic GPS technology, digital photography, and related integrative applications provide practicing geologic mappers with greatly enhanced methods for collecting, visualizing, interpreting, and disseminating geologic information. Even a cursory application of available tools can make field and office work more enriching and efficient, whereas more advanced and systematic applications provide new avenues for collaboration, outreach, and public education. Moreover, they ensure a much broader audience among an immense number of internet savvy end-users with very specific expectations for geospatial data availability. Perplexingly, the geologic community as a whole is not fully exploring this opportunity despite the inevitable revolution it portends. The slow acceptance follows a broad generational trend wherein seasoned professionals are lagging behind geology students and recent graduates in their grasp of and interest in the capabilities of the geoweb and web 2.0 types of applications. Possible explanations for this include: fear of the unknown, fear of the learning curve, lack of interest, lack of academic/professional incentive, and (hopefully not) reluctance toward open collaboration. Although some aspects of the expanding geoweb are cloaked in arcane computer code, others are extremely simple to understand and use. A particularly obvious and simple application to enhance any field study is photo geotagging, the digital documentation of the locations of key outcrops, illustrative vistas, and particularly complicated geologic field relations. Viewing geotagged photos in their appropriate context on a virtual globe with high-resolution imagery can be an

  17. Integrated consensus genetic and physical maps of flax (Linum usitatissimum L.).

    Science.gov (United States)

    Cloutier, Sylvie; Ragupathy, Raja; Miranda, Evelyn; Radovanovic, Natasa; Reimer, Elsa; Walichnowski, Andrzej; Ward, Kerry; Rowland, Gordon; Duguid, Scott; Banik, Mitali

    2012-12-01

    Three linkage maps of flax (Linum usitatissimum L.) were constructed from populations CDC Bethune/Macbeth, E1747/Viking and SP2047/UGG5-5 containing between 385 and 469 mapped markers each. The first consensus map of flax was constructed incorporating 770 markers based on 371 shared markers including 114 that were shared by all three populations and 257 shared between any two populations. The 15 linkage group map corresponds to the haploid number of chromosomes of this species. The marker order of the consensus map was largely collinear in all three individual maps but a few local inversions and marker rearrangements spanning short intervals were observed. Segregation distortion was present in all linkage groups which contained 1-52 markers displaying non-Mendelian segregation. The total length of the consensus genetic map is 1,551 cM with a mean marker density of 2.0 cM. A total of 670 markers were anchored to 204 of the 416 fingerprinted contigs of the physical map corresponding to ~274 Mb or 74 % of the estimated flax genome size of 370 Mb. This high resolution consensus map will be a resource for comparative genomics, genome organization, evolution studies and anchoring of the whole genome shotgun sequence.

  18. An integrated linkage map reveals candidate genes underlying adaptive variation in Chinook salmon (Oncorhynchus tshawytscha)

    DEFF Research Database (Denmark)

    Mckinney, G. J.; Seeb, L. W.; Larson, W. A.

    2016-01-01

    Salmonids are an important cultural and ecological resource exhibiting near worldwide distribution between their native and introduced range. Previous research has generated linkage maps and genomic resources for several species as well as genome assemblies for two species. We first leveraged...

  19. Mapping the Ethics of Translational Genomics: Situating Return of Results and Navigating the Research-Clinical Divide

    Science.gov (United States)

    Wolf, Susan M.; Burke, Wylie; Koenig, Barbara A.

    2015-01-01

    Both bioethics and law have governed human genomics by distinguishing research from clinical practice. Yet the rise of translational genomics now makes this traditional dichotomy inadequate. This paper pioneers a new approach to the ethics of translational genomics. It maps the full range of ethical approaches needed, proposes a “layered” approach to determining the ethics framework for projects combining research and clinical care, and clarifies the key role that return of results can play in advancing translation. PMID:26479558

  20. The Integrated Microbial Genomes (IMG) System: An Expanding Comparative Analysis Resource

    Energy Technology Data Exchange (ETDEWEB)

    Markowitz, Victor M.; Chen, I-Min A.; Palaniappan, Krishna; Chu, Ken; Szeto, Ernest; Grechkin, Yuri; Ratner, Anna; Anderson, Iain; Lykidis, Athanasios; Mavromatis, Konstantinos; Ivanova, Natalia N.; Kyrpides, Nikos C.

    2009-09-13

    The integrated microbial genomes (IMG) system serves as a community resource for comparative analysis of publicly available genomes in a comprehensive integrated context. IMG contains both draft and complete microbial genomes integrated with other publicly available genomes from all three domains of life, together with a large number of plasmids and viruses. IMG provides tools and viewers for analyzing and reviewing the annotations of genes and genomes in a comparative context. Since its first release in 2005, IMG's data content and analytical capabilities have been constantly expanded through regular releases. Several companion IMG systems have been set up in order to serve domain specific needs, such as expert review of genome annotations. IMG is available at .

  1. Genome-wide screen for universal individual identification SNPs based on the HapMap and 1000 Genomes databases.

    Science.gov (United States)

    Huang, Erwen; Liu, Changhui; Zheng, Jingjing; Han, Xiaolong; Du, Weian; Huang, Yuanjian; Li, Chengshi; Wang, Xiaoguang; Tong, Dayue; Ou, Xueling; Sun, Hongyu; Zeng, Zhaoshu; Liu, Chao

    2018-04-03

    Differences in SNP selection criteria and reference populations among SNP panels for individual identification have left few SNPs in common, compromising the panels' universal applicability. To screen all universal SNPs, we performed genome-wide SNP mining in multiple populations based on the HapMap and 1000 Genomes databases. SNPs with high minor allele frequencies (MAF) in 37 populations were selected. As the MAF threshold was raised from ≥0.35 to ≥0.43, the number of selected SNPs decreased from 2769 to 0. A total of 117 SNPs with MAF ≥0.39 have no linkage disequilibrium with each other in every population. For 116 of the 117 SNPs, cumulative match probability (CMP) ranged from 2.01 × 10^-48 to 1.93 × 10^-50 and cumulative exclusion probability (CEP) ranged from 0.9999999996653 to 0.9999999999945. In 134 tested Han samples, 110 of the 117 SNPs remained within high MAF and conformed to Hardy-Weinberg equilibrium, with CMP = 4.70 × 10^-47 and CEP = 0.999999999862. By analyzing the same number of autosomal SNPs as in the HID-Ion AmpliSeq Identity Panel, i.e. 90 randomized out of the 110 SNPs, our panel yielded preferable CMP and CEP. Taken together, the 110-SNP panel is advantageous for forensic testing, and this study provided plenty of highly informative SNPs for compiling final universal panels.
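
    The cumulative match probability quoted above can be illustrated with a minimal sketch. It assumes biallelic SNPs in Hardy-Weinberg equilibrium that are mutually independent (no linkage disequilibrium), so the per-locus match probability is the sum of squared genotype frequencies and the CMP is the product across loci. The function names and the MAF values in the example panel are hypothetical and are not taken from the study.

```python
from math import prod


def match_probability(maf: float) -> float:
    """Random match probability for one biallelic SNP under Hardy-Weinberg equilibrium."""
    p, q = maf, 1.0 - maf
    # Expected genotype frequencies: AA = p^2, Aa = 2pq, aa = q^2.
    genotype_freqs = (p * p, 2 * p * q, q * q)
    # Two random individuals match when both carry the same genotype.
    return sum(f * f for f in genotype_freqs)


def cumulative_match_probability(mafs) -> float:
    """CMP over independent SNPs: the product of per-locus match probabilities."""
    return prod(match_probability(m) for m in mafs)


# Hypothetical panel: 110 SNPs, all with MAF = 0.45 (illustrative values only).
panel = [0.45] * 110
cmp_value = cumulative_match_probability(panel)
print(f"CMP = {cmp_value:.3e}")
```

    The per-locus match probability is lowest at MAF = 0.5, which is why panels of this kind favour SNPs with high minor allele frequencies.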

  2. Integrated genomics and proteomics of the Torpedo californica electric organ: concordance with the mammalian neuromuscular junction

    Directory of Open Access Journals (Sweden)

    Mate Suzanne E

    2011-05-01

    Background: During development, the branchial mesoderm of Torpedo californica transdifferentiates into an electric organ capable of generating high voltage discharges to stun fish. The organ contains a high density of cholinergic synapses and has served as a biochemical model for the membrane specialization of myofibers, the neuromuscular junction (NMJ). We studied the genome and proteome of the electric organ to gain insight into its composition, to determine if there is concordance with skeletal muscle and the NMJ, and to identify novel synaptic proteins. Results: Of 435 proteins identified, 300 mapped to Torpedo cDNA sequences with ≥2 peptides. We identified 14 uncharacterized proteins in the electric organ that are known to play a role in acetylcholine receptor clustering or signal transduction. In addition, two human open reading frames, C1orf123 and C6orf130, showed high sequence similarity to electric organ proteins. Our profile lists several proteins that are highly expressed in skeletal muscle or are muscle specific. Synaptic proteins such as acetylcholinesterase, acetylcholine receptor subunits, and rapsyn were present in the electric organ proteome but absent in the skeletal muscle proteome. Conclusions: Our integrated genomic and proteomic analysis supports research describing a muscle-like profile of the organ. We show that it is a repository of NMJ proteins but we present limitations on its use as a comprehensive model of the NMJ. Finally, we identified several proteins that may become candidates for signaling proteins not previously characterized as components of the NMJ.

  3. CRED Integrated Benthic Habitat Map for French Frigate Shoals, Northwestern Hawaiian Islands 2007

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — This is an integrated benthic habitat map system which consists of a number of separate map layers including multibeam bathymetry, acoustic backscatter imagery,...

  4. CRED Integrated Benthic Habitat Map for Tutuila Island, American Samoa Year 2007

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — This is an integrated benthic habitat map system which consists of a number of separate map layers including multibeam bathymetry, digital NOAA nautical charts,...

  5. Enhanced canopy fuel mapping by integrating lidar data

    Science.gov (United States)

    Peterson, Birgit E.; Nelson, Kurtis J.

    2016-10-03

    Background: The Wildfire Sciences Team at the U.S. Geological Survey’s Earth Resources Observation and Science Center produces vegetation type, vegetation structure, and fuel products for the United States, primarily through the Landscape Fire and Resource Management Planning Tools (LANDFIRE) program. LANDFIRE products are used across disciplines for a variety of applications. The LANDFIRE data retain their currency and relevancy through periodic updating or remapping. These updating and remapping efforts provide opportunities to improve the LANDFIRE product suite by incorporating data from other sources. Light detection and ranging (lidar) is uniquely suitable for gathering information on vegetation structure and spatial arrangement because it can collect data in three dimensions. The Wildfire Sciences Team has several completed and ongoing studies focused on integrating lidar into vegetation and fuels mapping.

  6. The European sea bass Dicentrarchus labrax genome puzzle: comparative BAC-mapping and low coverage shotgun sequencing

    Directory of Open Access Journals (Sweden)

    Volckaert Filip AM

    2010-01-01

    Background: Food supply from the ocean is constrained by the shortage of domesticated and selected fish. Development of genomic models of economically important fishes should assist with the removal of this bottleneck. European sea bass Dicentrarchus labrax L. (Moronidae, Perciformes, Teleostei) is one of the most important fishes in European marine aquaculture; growing genomic resources put it on its way to serve as an economic model. Results: End sequencing of a sea bass genomic BAC-library enabled the comparative mapping of the sea bass genome using the three-spined stickleback Gasterosteus aculeatus genome as a reference. BAC-end sequences (102,690) were aligned to the stickleback genome. The number of mappable BACs was improved using a two-fold coverage WGS dataset of sea bass, resulting in a comparative BAC-map covering 87% of stickleback chromosomes with 588 BAC-contigs. The minimum size of 83 contigs covering 50% of the reference was 1.2 Mbp; the largest BAC-contig comprised 8.86 Mbp. More than 22,000 BAC-clones aligned with both ends to the reference genome. Intra-chromosomal rearrangements between sea bass and stickleback were identified. Size distributions of mapped BACs were used to calculate that the genome of sea bass may be only 1.3 fold larger than the 460 Mbp stickleback genome. Conclusions: The BAC map is used for sequencing single BACs or BAC-pools covering defined genomic entities by second generation sequencing technologies. Together with the WGS dataset it initiates a sea bass genome sequencing project. This will allow the quantification of polymorphisms through resequencing, which is important for selecting highly performing domesticated fish.
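
    The statistic "83 contigs covering 50% of the reference" reads like an NG50-style summary. The sketch below shows one way such a figure could be computed, assuming contigs are ranked by length and accumulated until they reach half of the reference length. The abstract does not state the exact procedure, so the function name `ng50` and the toy lengths are illustrative assumptions.

```python
def ng50(contig_lengths, reference_length):
    """Return (count, length): the fewest largest contigs whose summed length
    reaches half of the reference, and the length of the smallest contig in
    that set. Returns None if the contigs cover less than half the reference."""
    target = reference_length / 2
    running_total = 0
    # Walk the contigs from longest to shortest, accumulating their lengths.
    for count, length in enumerate(sorted(contig_lengths, reverse=True), start=1):
        running_total += length
        if running_total >= target:
            return count, length
    return None


# Toy example with made-up contig lengths (arbitrary units).
print(ng50([5, 4, 3, 2, 1], reference_length=20))
```

    Measuring against the reference length rather than the total contig length keeps the figure comparable between maps that cover different fractions of the genome.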

  7. BAUM: Improving genome assembly by adaptive unique mapping and local overlap-layout-consensus approach.

    Science.gov (United States)

    Wang, Anqi; Wang, Zhanyu; Li, Zheng; Li, Lei M

    2018-01-15

    It is highly desirable to assemble genomes of high continuity and consistency at low cost. The current bottleneck of draft genome continuity using the Second Generation Sequencing (SGS) reads is primarily caused by uncertainty among repetitive sequences. Even though the Single-Molecule Real-Time sequencing technology is very promising to overcome the uncertainty issue, its relatively high cost and error rate add burden on budget or computation. Many long-read assemblers take the overlap-layout-consensus (OLC) paradigm, which is less sensitive to sequencing errors, heterozygosity and variability of coverage. However, current assemblers of SGS data do not sufficiently take advantage of the OLC approach. Aiming at minimizing uncertainty, the proposed method, BAUM, breaks the whole genome into regions by adaptive unique mapping; then the local OLC is used to assemble each region in parallel. BAUM can: (1) perform reference-assisted assembly based on the genome of a close species; or (2) improve the results of existing assemblies that are obtained based on short or long sequencing reads. The tests on two eukaryote genomes, a wild rice Oryza longistaminata and a parrot Melopsittacus undulatus, show that BAUM achieved substantial improvement on genome size and continuity. Besides, BAUM reconstructed a considerable amount of repetitive regions that failed to be assembled by existing short read assemblers. We also propose statistical approaches to control the uncertainty in different steps of BAUM. Availability: http://www.zhanyuwang.xin/wordpress/index.php/2017/07/21/baum. Contact: lilei@amss.ac.cn. Supplementary data are available at Bioinformatics online. © The Author (2018). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com

  8. QTL mapping of genome regions controlling temephos resistance in larvae of the mosquito Aedes aegypti.

    Science.gov (United States)

    Reyes-Solis, Guadalupe Del Carmen; Saavedra-Rodriguez, Karla; Suarez, Adriana Flores; Black, William C

    2014-10-01

    The mosquito Aedes aegypti is the principal vector of dengue and yellow fever flaviviruses. Temephos is an organophosphate insecticide used globally to suppress Ae. aegypti larval populations but resistance has evolved in many locations. Quantitative Trait Loci (QTL) controlling temephos survival in Ae. aegypti larvae were mapped in a pair of F3 advanced intercross lines arising from temephos resistant parents from Solidaridad, México and temephos susceptible parents from Iquitos, Peru. Two sets of 200 F3 larvae were exposed to a discriminating dose of temephos and then dead larvae were collected and preserved for DNA isolation every two hours up to 16 hours. Larvae surviving longer than 16 hours were considered resistant. For QTL mapping, single nucleotide polymorphisms (SNPs) were identified at 23 single copy genes and 26 microsatellite loci of known physical positions in the Ae. aegypti genome. In both reciprocal crosses, Multiple Interval Mapping identified eleven QTL associated with time until death. In the Solidaridad×Iquitos (SLD×Iq) cross, twelve were associated with survival, but in the reciprocal Iq×SLD cross, only six QTL were survival associated. Polymorphisms at acetylcholine esterase (AchE) loci 1 and 2 were not associated with either resistance phenotype, suggesting that target site insensitivity is not an organophosphate resistance mechanism in this region of México. Temephos resistance is under the control of many metabolic genes of small effect and dispersed throughout the Ae. aegypti genome.

  9. Stakeholder engagement: a key component of integrating genomic information into electronic health records.

    Science.gov (United States)

    Hartzler, Andrea; McCarty, Catherine A; Rasmussen, Luke V; Williams, Marc S; Brilliant, Murray; Bowton, Erica A; Clayton, Ellen Wright; Faucett, William A; Ferryman, Kadija; Field, Julie R; Fullerton, Stephanie M; Horowitz, Carol R; Koenig, Barbara A; McCormick, Jennifer B; Ralston, James D; Sanderson, Saskia C; Smith, Maureen E; Trinidad, Susan Brown

    2013-10-01

    Integrating genomic information into clinical care and the electronic health record can facilitate personalized medicine through genetically guided clinical decision support. Stakeholder involvement is critical to the success of these implementation efforts. Prior work on implementation of clinical information systems provides broad guidance to inform effective engagement strategies. We add to this evidence-based recommendations that are specific to issues at the intersection of genomics and the electronic health record. We describe stakeholder engagement strategies employed by the Electronic Medical Records and Genomics Network, a national consortium of US research institutions funded by the National Human Genome Research Institute to develop, disseminate, and apply approaches that combine genomic and electronic health record data. Through select examples drawn from sites of the Electronic Medical Records and Genomics Network, we illustrate a continuum of engagement strategies to inform genomic integration into commercial and homegrown electronic health records across a range of health-care settings. We frame engagement as activities to consult, involve, and partner with key stakeholder groups throughout specific phases of health information technology implementation. Our aim is to provide insights into engagement strategies to guide genomic integration based on our unique network experiences and lessons learned within the broader context of implementation research in biomedical informatics. On the basis of our collective experience, we describe key stakeholder practices, challenges, and considerations for successful genomic integration to support personalized medicine.

  10. Development of an integrated genome informatics, data management and workflow infrastructure: A toolbox for the study of complex disease genetics

    Directory of Open Access Journals (Sweden)

    Burren Oliver S

    2004-01-01

    The genetic dissection of complex disease remains a significant challenge. Sample-tracking and the recording, processing and storage of high-throughput laboratory data with public domain data, require integration of databases, genome informatics and genetic analyses in an easily updated and scalable format. To find genes involved in multifactorial diseases such as type 1 diabetes (T1D), chromosome regions are defined based on functional candidate gene content, linkage information from humans and animal model mapping information. For each region, genomic information is extracted from Ensembl, converted and loaded into ACeDB for manual gene annotation. Homology information is examined using ACeDB tools and the gene structure verified. Manually curated genes are extracted from ACeDB and read into the feature database, which holds relevant local genomic feature data and an audit trail of laboratory investigations. Public domain information, manually curated genes, polymorphisms, primers, linkage and association analyses, with links to our genotyping database, are shown in Gbrowse. This system scales to include genetic, statistical, quality control (QC) and biological data such as expression analyses of RNA or protein, all linked from a genomics integrative display. Our system is applicable to any genetic study of complex disease, of either large or small scale.

  11. Genome-wide mapping of autonomous promoter activity in human cells.

    Science.gov (United States)

    van Arensbergen, Joris; FitzPatrick, Vincent D; de Haas, Marcel; Pagie, Ludo; Sluimer, Jasper; Bussemaker, Harmen J; van Steensel, Bas

    2017-02-01

    Previous methods to systematically characterize sequence-intrinsic activity of promoters have been limited by relatively low throughput and the length of the sequences that could be tested. Here we present 'survey of regulatory elements' (SuRE), a method that assays more than 10^8 DNA fragments, each 0.2-2 kb in size, for their ability to drive transcription autonomously. In SuRE, a plasmid library of random genomic fragments upstream of a 20-bp barcode is constructed, and decoded by paired-end sequencing. This library is used to transfect cells, and barcodes in transcribed RNA are quantified by high-throughput sequencing. When applied to the human genome, we achieve 55-fold genome coverage, allowing us to map autonomous promoter activity genome-wide in K562 cells. By computational modeling we delineate subregions within promoters that are relevant for their activity. We show that antisense promoter transcription is generally dependent on the sense core promoter sequences, and that most enhancers and several families of repetitive elements act as autonomous transcription initiation sites.

  12. A genome-wide, fine-scale map of natural pigmentation variation in Drosophila melanogaster.

    Directory of Open Access Journals (Sweden)

    Héloïse Bastide

    2013-06-01

    Various approaches can be applied to uncover the genetic basis of natural phenotypic variation, each with their specific strengths and limitations. Here, we use a replicated genome-wide association approach (Pool-GWAS) to fine-scale map genomic regions contributing to natural variation in female abdominal pigmentation in Drosophila melanogaster, a trait that is highly variable in natural populations and highly heritable in the laboratory. We examined abdominal pigmentation phenotypes in approximately 8000 female European D. melanogaster, isolating 1000 individuals with extreme phenotypes. We then used whole-genome Illumina sequencing to identify single nucleotide polymorphisms (SNPs) segregating in our sample, and tested these for associations with pigmentation by contrasting allele frequencies between replicate pools of light and dark individuals. We identify two small regions near the pigmentation genes tan and bric-à-brac 1, both corresponding to known cis-regulatory regions, which contain SNPs showing significant associations with pigmentation variation. While the Pool-GWAS approach suffers some limitations, its cost advantage facilitates replication and it can be applied to any non-model system with an available reference genome.

  13. A genome-wide, fine-scale map of natural pigmentation variation in Drosophila melanogaster.

    Science.gov (United States)

    Bastide, Héloïse; Betancourt, Andrea; Nolte, Viola; Tobler, Raymond; Stöbe, Petra; Futschik, Andreas; Schlötterer, Christian

    2013-06-01

    Various approaches can be applied to uncover the genetic basis of natural phenotypic variation, each with their specific strengths and limitations. Here, we use a replicated genome-wide association approach (Pool-GWAS) to fine-scale map genomic regions contributing to natural variation in female abdominal pigmentation in Drosophila melanogaster, a trait that is highly variable in natural populations and highly heritable in the laboratory. We examined abdominal pigmentation phenotypes in approximately 8000 female European D. melanogaster, isolating 1000 individuals with extreme phenotypes. We then used whole-genome Illumina sequencing to identify single nucleotide polymorphisms (SNPs) segregating in our sample, and tested these for associations with pigmentation by contrasting allele frequencies between replicate pools of light and dark individuals. We identify two small regions near the pigmentation genes tan and bric-à-brac 1, both corresponding to known cis-regulatory regions, which contain SNPs showing significant associations with pigmentation variation. While the Pool-GWAS approach suffers some limitations, its cost advantage facilitates replication and it can be applied to any non-model system with an available reference genome.
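
    The idea of contrasting allele frequencies between light and dark pools can be sketched for a single SNP in a single replicate. The abstract does not name the test statistic used in the study; the sketch below substitutes a plain Pearson chi-square on a 2x2 table of read counts as a stand-in, and all function names and counts are hypothetical.

```python
def allele_frequency(alt_reads: int, total_reads: int) -> float:
    """Fraction of reads in a pool that carry the alternate allele."""
    return alt_reads / total_reads


def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square (no continuity correction) for the table
    [[a, b], [c, d]], e.g. rows = light/dark pool, columns = alt/ref reads."""
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        return 0.0  # a row or column is empty, so there is no contrast to test
    return n * (a * d - b * c) ** 2 / denominator


# Hypothetical read counts for one SNP: (alt, ref) in each pool.
light_alt, light_ref = 30, 70
dark_alt, dark_ref = 60, 40

print("light pool frequency:", allele_frequency(light_alt, light_alt + light_ref))
print("dark pool frequency:", allele_frequency(dark_alt, dark_alt + dark_ref))
print("chi-square:", chi_square_2x2(light_alt, light_ref, dark_alt, dark_ref))
```

    A replicated design would combine evidence across pool pairs rather than test each replicate in isolation; that step is omitted here.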

  14. Brassica database (BRAD) version 2.0: integrating and mining Brassicaceae species genomic resources.

    Science.gov (United States)

    Wang, Xiaobo; Wu, Jian; Liang, Jianli; Cheng, Feng; Wang, Xiaowu

    2015-01-01

    The Brassica database (BRAD) was built initially to assist users apply Brassica rapa and Arabidopsis thaliana genomic data efficiently to their research. However, many Brassicaceae genomes have been sequenced and released after its construction. These genomes are rich resources for comparative genomics, gene annotation and functional evolutionary studies of Brassica crops. Therefore, we have updated BRAD to version 2.0 (V2.0). In BRAD V2.0, 11 more Brassicaceae genomes have been integrated into the database, namely those of Arabidopsis lyrata, Aethionema arabicum, Brassica oleracea, Brassica napus, Camelina sativa, Capsella rubella, Leavenworthia alabamica, Sisymbrium irio and three extremophiles Schrenkiella parvula, Thellungiella halophila and Thellungiella salsuginea. BRAD V2.0 provides plots of syntenic genomic fragments between pairs of Brassicaceae species, from the level of chromosomes to genomic blocks. The Generic Synteny Browser (GBrowse_syn), a module of the Genome Browser (GBrowse), is used to show syntenic relationships between multiple genomes. Search functions for retrieving syntenic and non-syntenic orthologs, as well as their annotation and sequences are also provided. Furthermore, genome and annotation information have been imported into GBrowse so that all functional elements can be visualized in one frame. We plan to continually update BRAD by integrating more Brassicaceae genomes into the database. Database URL: http://brassicadb.org/brad/. © The Author(s) 2015. Published by Oxford University Press.

  15. Assembly and Multiplex Genome Integration of Metabolic Pathways in Yeast Using CasEMBLR.

    Science.gov (United States)

    Jakočiūnas, Tadas; Jensen, Emil D; Jensen, Michael K; Keasling, Jay D

    2018-01-01

    Genome integration is a vital step for implementing large biochemical pathways to build a stable microbial cell factory. Although traditional strain construction strategies are well established for the model organism Saccharomyces cerevisiae, recent advances in CRISPR/Cas9-mediated genome engineering allow much higher throughput and robustness in terms of strain construction. In this chapter, we describe CasEMBLR, a highly efficient and marker-free genome engineering method for one-step integration of in vivo assembled expression cassettes in multiple genomic sites simultaneously. CasEMBLR capitalizes on the CRISPR/Cas9 technology to generate double-strand breaks in genomic loci, thus prompting native homologous recombination (HR) machinery to integrate exogenously derived homology templates. As proof-of-principle for microbial cell factory development, CasEMBLR was used for one-step assembly and marker-free integration of the carotenoid pathway from 15 exogenously supplied DNA parts into three targeted genomic loci. As a second proof-of-principle, a total of ten DNA parts were assembled and integrated in two genomic loci to construct a tyrosine production strain, and at the same time knocking out two genes. This new method complements and improves the field of genome engineering in S. cerevisiae by providing a more flexible platform for rapid and precise strain building.

  16. Mapping Second Chromosome Mutations to Defined Genomic Regions in Drosophila melanogaster.

    Science.gov (United States)

    Kahsai, Lily; Cook, Kevin R

    2018-01-04

    Hundreds of Drosophila melanogaster stocks are currently maintained at the Bloomington Drosophila Stock Center with mutations that have not been associated with sequence-defined genes. They have been preserved because they have interesting loss-of-function phenotypes. The experimental value of these mutations would be increased by tying them to specific genomic intervals so that geneticists can more easily associate them with annotated genes. Here, we report the mapping of 85 second chromosome complementation groups in the Bloomington collection to specific, small clusters of contiguous genes or individual genes in the sequenced genome. This information should prove valuable to Drosophila geneticists interested in processes associated with particular phenotypes and those searching for mutations affecting specific sequence-defined genes. Copyright © 2018 Kahsai, Cook.

  17. Mapping Second Chromosome Mutations to Defined Genomic Regions in Drosophila melanogaster

    Directory of Open Access Journals (Sweden)

    Lily Kahsai

    2018-01-01

    Hundreds of Drosophila melanogaster stocks are currently maintained at the Bloomington Drosophila Stock Center with mutations that have not been associated with sequence-defined genes. They have been preserved because they have interesting loss-of-function phenotypes. The experimental value of these mutations would be increased by tying them to specific genomic intervals so that geneticists can more easily associate them with annotated genes. Here, we report the mapping of 85 second chromosome complementation groups in the Bloomington collection to specific, small clusters of contiguous genes or individual genes in the sequenced genome. This information should prove valuable to Drosophila geneticists interested in processes associated with particular phenotypes and those searching for mutations affecting specific sequence-defined genes.

  18. Deciphering the genomic architecture of the stickleback brain with a novel multilocus gene-mapping approach.

    Science.gov (United States)

    Li, Zitong; Guo, Baocheng; Yang, Jing; Herczeg, Gábor; Gonda, Abigél; Balázs, Gergely; Shikano, Takahito; Calboli, Federico C F; Merilä, Juha

    2017-03-01

    Quantitative traits important to organismal function and fitness, such as brain size, are presumably controlled by many small-effect loci. Deciphering the genetic architecture of such traits with traditional quantitative trait locus (QTL) mapping methods is challenging. Here, we investigated the genetic architecture of brain size (and the size of five different brain parts) in nine-spined sticklebacks (Pungitius pungitius) with the aid of novel multilocus QTL-mapping approaches based on a de-biased LASSO method. Apart from having more statistical power to detect QTL and reduced rate of false positives than conventional QTL-mapping approaches, the developed methods can handle large marker panels and provide estimates of genomic heritability. Single-locus analyses of an F2 interpopulation cross with 239 individuals and 15,198 fully informative single nucleotide polymorphisms (SNPs) uncovered 79 QTL associated with variation in stickleback brain size traits. Many of these loci were in strong linkage disequilibrium (LD) with each other, and consequently, a multilocus mapping of individual SNPs, accounting for LD structure in the data, recovered only four significant QTL. However, a multilocus mapping of SNPs grouped by linkage group (LG) identified 14 LGs (1-6 depending on the trait) that influence variation in brain traits. For instance, 17.6% of the variation in relative brain size was explainable by cumulative effects of SNPs distributed over six LGs, whereas 42% of the variation was accounted for by all 21 LGs. Hence, the results suggest that variation in stickleback brain traits is influenced by many small-effect loci. Apart from suggesting moderately heritable (h^2 ≈ 0.15-0.42) multifactorial genetic architecture of brain traits, the results highlight the challenges in identifying the loci contributing to variation in quantitative traits. Nevertheless, the results demonstrate that the novel QTL-mapping approach developed here has distinctive advantages

  19. Toward integration of genomic selection with crop modelling: the development of an integrated approach to predicting rice heading dates.

    Science.gov (United States)

    Onogi, Akio; Watanabe, Maya; Mochizuki, Toshihiro; Hayashi, Takeshi; Nakagawa, Hiroshi; Hasegawa, Toshihiro; Iwata, Hiroyoshi

    2016-04-01

    It is suggested that accuracy in predicting plant phenotypes can be improved by integrating genomic prediction with crop modelling in a single hierarchical model. Accurate prediction of phenotypes is important for plant breeding and management. Although genomic prediction/selection aims to predict phenotypes on the basis of whole-genome marker information, it is often difficult to predict phenotypes of complex traits in diverse environments, because plant phenotypes are often influenced by genotype-environment interaction. A possible remedy is to integrate genomic prediction with crop/ecophysiological modelling, which enables us to predict plant phenotypes using environmental and management information. To this end, in the present study, we developed a novel method for integrating genomic prediction with phenological modelling of Asian rice (Oryza sativa, L.), allowing the heading date of untested genotypes in untested environments to be predicted. The method simultaneously infers the phenological model parameters and whole-genome marker effects on the parameters in a Bayesian framework. By cultivating backcross inbred lines of Koshihikari × Kasalath in nine environments, we evaluated the potential of the proposed method in comparison with conventional genomic prediction, phenological modelling, and two-step methods that applied genomic prediction to phenological model parameters inferred from Nelder-Mead or Markov chain Monte Carlo algorithms. In predicting heading dates of untested lines in untested environments, the proposed and two-step methods tended to provide more accurate predictions than the conventional genomic prediction methods, particularly in environments where phenotypes from environments similar to the target environment were unavailable for training genomic prediction. The proposed method showed greater accuracy in prediction than the two-step methods in all cross-validation schemes tested, suggesting the potential of the integrated approach in

  20. A Genomic Map of the Effects of Linked Selection in Drosophila.

    Directory of Open Access Journals (Sweden)

    Eyal Elyashiv

    2016-08-01

    Natural selection at one site shapes patterns of genetic variation at linked sites. Quantifying the effects of "linked selection" on levels of genetic diversity is key to making reliable inference about demography, building a null model in scans for targets of adaptation, and learning about the dynamics of natural selection. Here, we introduce the first method that jointly infers parameters of distinct modes of linked selection, notably background selection and selective sweeps, from genome-wide diversity data, functional annotations and genetic maps. The central idea is to calculate the probability that a neutral site is polymorphic given local annotations, substitution patterns, and recombination rates. Information is then combined across sites and samples using composite likelihood in order to estimate genome-wide parameters of distinct modes of selection. In addition to parameter estimation, this approach yields a map of the expected neutral diversity levels along the genome. To illustrate the utility of our approach, we apply it to genome-wide resequencing data from 125 lines in Drosophila melanogaster and reliably predict diversity levels at the 1Mb scale. Our results corroborate estimates of a high fraction of beneficial substitutions in proteins and untranslated regions (UTR). They allow us to distinguish between the contribution of sweeps and other modes of selection around amino acid substitutions and to uncover evidence for pervasive sweeps in untranslated regions (UTRs). Our inference further suggests a substantial effect of other modes of linked selection and of adaptation in particular. More generally, we demonstrate that linked selection has had a larger effect in reducing diversity levels and increasing their variance in D. melanogaster than previously appreciated.

  1. A Genomic Map of the Effects of Linked Selection in Drosophila.

    Science.gov (United States)

    Elyashiv, Eyal; Sattath, Shmuel; Hu, Tina T; Strutsovsky, Alon; McVicker, Graham; Andolfatto, Peter; Coop, Graham; Sella, Guy

    2016-08-01

    Natural selection at one site shapes patterns of genetic variation at linked sites. Quantifying the effects of "linked selection" on levels of genetic diversity is key to making reliable inference about demography, building a null model in scans for targets of adaptation, and learning about the dynamics of natural selection. Here, we introduce the first method that jointly infers parameters of distinct modes of linked selection, notably background selection and selective sweeps, from genome-wide diversity data, functional annotations and genetic maps. The central idea is to calculate the probability that a neutral site is polymorphic given local annotations, substitution patterns, and recombination rates. Information is then combined across sites and samples using composite likelihood in order to estimate genome-wide parameters of distinct modes of selection. In addition to parameter estimation, this approach yields a map of the expected neutral diversity levels along the genome. To illustrate the utility of our approach, we apply it to genome-wide resequencing data from 125 lines in Drosophila melanogaster and reliably predict diversity levels at the 1Mb scale. Our results corroborate estimates of a high fraction of beneficial substitutions in proteins and untranslated regions (UTR). They allow us to distinguish between the contribution of sweeps and other modes of selection around amino acid substitutions and to uncover evidence for pervasive sweeps in untranslated regions (UTRs). Our inference further suggests a substantial effect of other modes of linked selection and of adaptation in particular. More generally, we demonstrate that linked selection has had a larger effect in reducing diversity levels and increasing their variance in D. melanogaster than previously appreciated.

  2. Community standards for genomic resources, genetic conservation, and data integration

    Science.gov (United States)

    Jill Wegrzyn; Meg Staton; Emily Grau; Richard Cronn; C. Dana Nelson

    2017-01-01

    Genetics and genomics are increasingly important in forestry management and conservation. Next generation sequencing can increase analytical power, but still relies on building on the structure of previously acquired data. Data standards and data sharing allow the community to maximize the analytical power of high throughput genomics data. The landscape of incomplete...

  3. Mapping and sequencing the human genome: Science, ethics, and public policy. Final report

    Energy Technology Data Exchange (ETDEWEB)

    McInerney, J.D.

    1993-03-31

    Development of Mapping and Sequencing the Human Genome: Science, Ethics, and Public Policy followed the standard process of curriculum development at the Biological Sciences Curriculum Study (BSCS); that process is described. The production of this module was a collaborative effort between BSCS and the American Medical Association (AMA). Appendix A contains a copy of the module. Copies of reports sent to the Department of Energy (DOE) during the development process are contained in Appendix B; all reports should be on file at DOE. Appendix B also contains copies of status reports submitted to the BSCS Board of Directors.

  4. Small genomes: New initiatives in mapping and sequencing. Workshop summary report

    Energy Technology Data Exchange (ETDEWEB)

    McKenney, K. [National Inst. of Standards and Technology, Gaithersburg, MD (United States). Biotechnology Div.]; Robb, F. [Univ. of Maryland Biotechnology Inst., Baltimore, MD (United States). Center of Marine Biotechnology]

    1993-12-31

    The workshop was held 5-7 July 1993 at the Center for Advanced Research in Biotechnology (CARB) and hosted by the University of Maryland Biotechnology Institute (UMBI) and the National Institute of Standards and Technology (NIST). The objective of this workshop was to bring together individuals interested in DNA technologies and to determine the impact of current and potential improvements in the speed and cost-effectiveness of mapping and sequencing on the planning of future small genome projects. A major goal of the workshop was to spur the collaboration of more diverse groups of scientists working on this topic, and to minimize competitiveness as an inhibitory factor to progress.

  5. Assembly of the Genome of the Disease Vector Aedes aegypti onto a Genetic Linkage Map Allows Mapping of Genes Affecting Disease Transmission

    KAUST Repository

    Juneja, Punita

    2014-01-30

    The mosquito Aedes aegypti transmits some of the most important human arboviruses, including dengue, yellow fever and chikungunya viruses. It has a large genome containing many repetitive sequences, which has resulted in the genome being poorly assembled - there are 4,758 scaffolds, few of which have been assigned to a chromosome. To allow the mapping of genes affecting disease transmission, we have improved the genome assembly by scoring a large number of SNPs in recombinant progeny from a cross between two strains of Ae. aegypti, and used these to generate a genetic map. This revealed a high rate of misassemblies in the current genome, where, for example, sequences from different chromosomes were found on the same scaffold. Once these were corrected, we were able to assign 60% of the genome sequence to chromosomes and approximately order the scaffolds along the chromosome. We found that there are very large regions of suppressed recombination around the centromeres, which can extend to as much as 47% of the chromosome. To illustrate the utility of this new genome assembly, we mapped a gene that makes Ae. aegypti resistant to the human parasite Brugia malayi, and generated a list of candidate genes that could be affecting the trait. © 2014 Juneja et al.

  6. GIGGLE: a search engine for large-scale integrated genome analysis

    Science.gov (United States)

    Layer, Ryan M; Pedersen, Brent S; DiSera, Tonya; Marth, Gabor T; Gertz, Jason; Quinlan, Aaron R

    2018-01-01

    GIGGLE is a genomics search engine that identifies and ranks the significance of genomic loci shared between query features and thousands of genome interval files. GIGGLE (https://github.com/ryanlayer/giggle) scales to billions of intervals and is over three orders of magnitude faster than existing methods. Its speed extends the accessibility and utility of resources such as ENCODE, Roadmap Epigenomics, and GTEx by facilitating data integration and hypothesis generation. PMID:29309061

  7. GIGGLE: a search engine for large-scale integrated genome analysis.

    Science.gov (United States)

    Layer, Ryan M; Pedersen, Brent S; DiSera, Tonya; Marth, Gabor T; Gertz, Jason; Quinlan, Aaron R

    2018-02-01

    GIGGLE is a genomics search engine that identifies and ranks the significance of genomic loci shared between query features and thousands of genome interval files. GIGGLE (https://github.com/ryanlayer/giggle) scales to billions of intervals and is over three orders of magnitude faster than existing methods. Its speed extends the accessibility and utility of resources such as ENCODE, Roadmap Epigenomics, and GTEx by facilitating data integration and hypothesis generation.
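    The kind of query described above (which interval files share loci with a set of query features) can be illustrated with a deliberately naive baseline. This is a sketch only: it is not GIGGLE's indexing scheme, it ranks by raw overlap count rather than by a significance score, and the function names, file names, and coordinates are all hypothetical.

```python
from typing import Dict, List, Tuple

# (chromosome, start, end) with half-open coordinates [start, end), as in BED files.
Interval = Tuple[str, int, int]


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two half-open intervals on the same chromosome share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def rank_files_by_overlap(
    query: List[Interval], files: Dict[str, List[Interval]]
) -> List[Tuple[str, int]]:
    """Count query/file interval overlaps by brute force and sort files by that count.

    Brute force is quadratic in the number of intervals; it only illustrates
    the question being asked, not how a real search engine answers it quickly.
    """
    counts = {
        name: sum(1 for q in query for iv in intervals if overlaps(q, iv))
        for name, intervals in files.items()
    }
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data.
query = [("chr1", 100, 200), ("chr1", 500, 600)]
files = {
    "a.bed": [("chr1", 150, 250), ("chr1", 550, 560), ("chr2", 100, 200)],
    "b.bed": [("chr1", 200, 300)],
}
ranking = rank_files_by_overlap(query, files)
print(ranking)
```

    Because the coordinates are half-open, the interval starting at 200 in "b.bed" does not overlap the query interval ending at 200.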

  8. The three-dimensional genome organization of Drosophila melanogaster through data integration.

    Science.gov (United States)

    Li, Qingjiao; Tjong, Harianto; Li, Xiao; Gong, Ke; Zhou, Xianghong Jasmine; Chiolo, Irene; Alber, Frank

    2017-07-31

    Genome structures are dynamic and non-randomly organized in the nucleus of higher eukaryotes. To maximize the accuracy and coverage of three-dimensional genome structural models, it is important to integrate all available sources of experimental information about a genome's organization. It remains a major challenge to integrate such data from various complementary experimental methods. Here, we present an approach for data integration to determine a population of complete three-dimensional genome structures that are statistically consistent with data from both genome-wide chromosome conformation capture (Hi-C) and lamina-DamID experiments. Our structures resolve the genome at the resolution of topological domains, and reproduce simultaneously both sets of experimental data. Importantly, this data deconvolution framework allows for structural heterogeneity between cells, and hence accounts for the expected plasticity of genome structures. As a case study we choose Drosophila melanogaster embryonic cells, for which both data types are available. Our three-dimensional genome structures have strong predictive power for structural features not directly visible in the initial data sets, and reproduce experimental hallmarks of the D. melanogaster genome organization from independent and our own imaging experiments. Also they reveal a number of new insights about genome organization and its functional relevance, including the preferred locations of heterochromatic satellites of different chromosomes, and observations about homologous pairing that cannot be directly observed in the original Hi-C or lamina-DamID data. Our approach allows systematic integration of Hi-C and lamina-DamID data for complete three-dimensional genome structure calculation, while also explicitly considering genome structural variability.

  9. Protecting genomic integrity in somatic cells and embryonic stem cells

    International Nuclear Information System (INIS)

    Hong, Y.; Cervantes, R.B.; Tichy, E.; Tischfield, J.A.; Stambrook, P.J.

    2007-01-01

    Mutation frequencies at some loci in mammalian somatic cells in vivo approach 10^-4. The majority of these events occur as a consequence of loss of heterozygosity (LOH) due to mitotic recombination. Such high levels of DNA damage in somatic cells, which can accumulate with age, will cause injury and, after a latency period, may lead to somatic disease and ultimately death. This high level of DNA damage is untenable for germ cells, and by extrapolation for embryonic stem (ES) cells, that must recreate the organism. ES cells cannot tolerate such a high frequency of damage since mutations will immediately impact the altered cell, and subsequently the entire organism. Most importantly, the mutations may be passed on to future generations. ES cells, therefore, must have robust mechanisms to protect the integrity of their genomes. We have examined two such mechanisms. Firstly, we have shown that mutation frequencies and frequencies of mitotic recombination in ES cells are about 100-fold lower than in adult somatic cells or in isogenic mouse embryonic fibroblasts (MEFs). A second complementary protective mechanism eliminates those ES cells that have acquired a mutational burden, thereby maintaining a pristine population. Consistent with this hypothesis, ES cells lack a G1 checkpoint, and the two known signaling pathways that mediate the checkpoint are compromised. The checkpoint kinase, Chk2, which participates in both pathways is sequestered at centrosomes in ES cells and does not phosphorylate its substrates (i.e. p53 and Cdc25A) that must be modified to produce a G1 arrest. Ectopic expression of Chk2 does not rescue the p53-mediated pathway, but does restore the pathway mediated by Cdc25A. Wild type ES cells exposed to ionizing radiation do not accumulate in G1 but do so in S-phase and in G2. ES cells that ectopically express Chk2 undergo cell cycle arrest in G1 as well as G2, and appear to be protected from apoptosis.

  10. Sparse multivariate factor analysis regression models and its applications to integrative genomics analysis.

    Science.gov (United States)

    Zhou, Yan; Wang, Pei; Wang, Xianlong; Zhu, Ji; Song, Peter X-K

    2017-01-01

    The multivariate regression model is a useful tool to explore complex associations between two kinds of molecular markers, which enables the understanding of the biological pathways underlying disease etiology. For a set of correlated response variables, accounting for such dependency can increase statistical power. Motivated by integrative genomic data analyses, we propose a new methodology, the sparse multivariate factor analysis regression model (smFARM), in which correlations of response variables are assumed to follow a factor analysis model with latent factors. This proposed method not only allows us to address the challenge that the number of association parameters is larger than the sample size, but also to adjust for unobserved genetic and/or nongenetic factors that potentially conceal the underlying response-predictor associations. The proposed smFARM is implemented by the EM algorithm and the blockwise coordinate descent algorithm. The proposed methodology is evaluated and compared to the existing methods through extensive simulation studies. Our results show that accounting for latent factors through the proposed smFARM can improve sensitivity of signal detection and accuracy of sparse association map estimation. We illustrate smFARM by two integrative genomics analysis examples, a breast cancer dataset, and an ovarian cancer dataset, to assess the relationship between DNA copy numbers and gene expression arrays to understand genetic regulatory patterns relevant to the disease. We identify two trans-hub regions: one in cytoband 17q12 whose amplification influences the RNA expression levels of important breast cancer genes, and the other in cytoband 9q21.32-33, which is associated with chemoresistance in ovarian cancer. © 2016 WILEY PERIODICALS, INC.

  11. Comprehensive, Integrative Genomic Analysis of Diffuse Lower-Grade Gliomas.

    Science.gov (United States)

    Brat, Daniel J; Verhaak, Roel G W; Aldape, Kenneth D; Yung, W K Alfred; Salama, Sofie R; Cooper, Lee A D; Rheinbay, Esther; Miller, C Ryan; Vitucci, Mark; Morozova, Olena; Robertson, A Gordon; Noushmehr, Houtan; Laird, Peter W; Cherniack, Andrew D; Akbani, Rehan; Huse, Jason T; Ciriello, Giovanni; Poisson, Laila M; Barnholtz-Sloan, Jill S; Berger, Mitchel S; Brennan, Cameron; Colen, Rivka R; Colman, Howard; Flanders, Adam E; Giannini, Caterina; Grifford, Mia; Iavarone, Antonio; Jain, Rajan; Joseph, Isaac; Kim, Jaegil; Kasaian, Katayoon; Mikkelsen, Tom; Murray, Bradley A; O'Neill, Brian Patrick; Pachter, Lior; Parsons, Donald W; Sougnez, Carrie; Sulman, Erik P; Vandenberg, Scott R; Van Meir, Erwin G; von Deimling, Andreas; Zhang, Hailei; Crain, Daniel; Lau, Kevin; Mallery, David; Morris, Scott; Paulauskis, Joseph; Penny, Robert; Shelton, Troy; Sherman, Mark; Yena, Peggy; Black, Aaron; Bowen, Jay; Dicostanzo, Katie; Gastier-Foster, Julie; Leraas, Kristen M; Lichtenberg, Tara M; Pierson, Christopher R; Ramirez, Nilsa C; Taylor, Cynthia; Weaver, Stephanie; Wise, Lisa; Zmuda, Erik; Davidsen, Tanja; Demchok, John A; Eley, Greg; Ferguson, Martin L; Hutter, Carolyn M; Mills Shaw, Kenna R; Ozenberger, Bradley A; Sheth, Margi; Sofia, Heidi J; Tarnuzzer, Roy; Wang, Zhining; Yang, Liming; Zenklusen, Jean Claude; Ayala, Brenda; Baboud, Julien; Chudamani, Sudha; Jensen, Mark A; Liu, Jia; Pihl, Todd; Raman, Rohini; Wan, Yunhu; Wu, Ye; Ally, Adrian; Auman, J Todd; Balasundaram, Miruna; Balu, Saianand; Baylin, Stephen B; Beroukhim, Rameen; Bootwalla, Moiz S; Bowlby, Reanne; Bristow, Christopher A; Brooks, Denise; Butterfield, Yaron; Carlsen, Rebecca; Carter, Scott; Chin, Lynda; Chu, Andy; Chuah, Eric; Cibulskis, Kristian; Clarke, Amanda; Coetzee, Simon G; Dhalla, Noreen; Fennell, Tim; Fisher, Sheila; Gabriel, Stacey; Getz, Gad; Gibbs, Richard; Guin, Ranabir; Hadjipanayis, Angela; Hayes, D Neil; Hinoue, Toshinori; Hoadley, Katherine; Holt, Robert A; Hoyle, Alan P; Jefferys, 
Stuart R; Jones, Steven; Jones, Corbin D; Kucherlapati, Raju; Lai, Phillip H; Lander, Eric; Lee, Semin; Lichtenstein, Lee; Ma, Yussanne; Maglinte, Dennis T; Mahadeshwar, Harshad S; Marra, Marco A; Mayo, Michael; Meng, Shaowu; Meyerson, Matthew L; Mieczkowski, Piotr A; Moore, Richard A; Mose, Lisle E; Mungall, Andrew J; Pantazi, Angeliki; Parfenov, Michael; Park, Peter J; Parker, Joel S; Perou, Charles M; Protopopov, Alexei; Ren, Xiaojia; Roach, Jeffrey; Sabedot, Thaís S; Schein, Jacqueline; Schumacher, Steven E; Seidman, Jonathan G; Seth, Sahil; Shen, Hui; Simons, Janae V; Sipahimalani, Payal; Soloway, Matthew G; Song, Xingzhi; Sun, Huandong; Tabak, Barbara; Tam, Angela; Tan, Donghui; Tang, Jiabin; Thiessen, Nina; Triche, Timothy; Van Den Berg, David J; Veluvolu, Umadevi; Waring, Scot; Weisenberger, Daniel J; Wilkerson, Matthew D; Wong, Tina; Wu, Junyuan; Xi, Liu; Xu, Andrew W; Yang, Lixing; Zack, Travis I; Zhang, Jianhua; Aksoy, B Arman; Arachchi, Harindra; Benz, Chris; Bernard, Brady; Carlin, Daniel; Cho, Juok; DiCara, Daniel; Frazer, Scott; Fuller, Gregory N; Gao, JianJiong; Gehlenborg, Nils; Haussler, David; Heiman, David I; Iype, Lisa; Jacobsen, Anders; Ju, Zhenlin; Katzman, Sol; Kim, Hoon; Knijnenburg, Theo; Kreisberg, Richard Bailey; Lawrence, Michael S; Lee, William; Leinonen, Kalle; Lin, Pei; Ling, Shiyun; Liu, Wenbin; Liu, Yingchun; Liu, Yuexin; Lu, Yiling; Mills, Gordon; Ng, Sam; Noble, Michael S; Paull, Evan; Rao, Arvind; Reynolds, Sheila; Saksena, Gordon; Sanborn, Zack; Sander, Chris; Schultz, Nikolaus; Senbabaoglu, Yasin; Shen, Ronglai; Shmulevich, Ilya; Sinha, Rileen; Stuart, Josh; Sumer, S Onur; Sun, Yichao; Tasman, Natalie; Taylor, Barry S; Voet, Doug; Weinhold, Nils; Weinstein, John N; Yang, Da; Yoshihara, Kosuke; Zheng, Siyuan; Zhang, Wei; Zou, Lihua; Abel, Ty; Sadeghi, Sara; Cohen, Mark L; Eschbacher, Jenny; Hattab, Eyas M; Raghunathan, Aditya; Schniederjan, Matthew J; Aziz, Dina; Barnett, Gene; Barrett, Wendi; Bigner, Darell D; Boice, Lori; 
Brewer, Cathy; Calatozzolo, Chiara; Campos, Benito; Carlotti, Carlos Gilberto; Chan, Timothy A; Cuppini, Lucia; Curley, Erin; Cuzzubbo, Stefania; Devine, Karen; DiMeco, Francesco; Duell, Rebecca; Elder, J Bradley; Fehrenbach, Ashley; Finocchiaro, Gaetano; Friedman, William; Fulop, Jordonna; Gardner, Johanna; Hermes, Beth; Herold-Mende, Christel; Jungk, Christine; Kendler, Ady; Lehman, Norman L; Lipp, Eric; Liu, Ouida; Mandt, Randy; McGraw, Mary; Mclendon, Roger; McPherson, Christopher; Neder, Luciano; Nguyen, Phuong; Noss, Ardene; Nunziata, Raffaele; Ostrom, Quinn T; Palmer, Cheryl; Perin, Alessandro; Pollo, Bianca; Potapov, Alexander; Potapova, Olga; Rathmell, W Kimryn; Rotin, Daniil; Scarpace, Lisa; Schilero, Cathy; Senecal, Kelly; Shimmel, Kristen; Shurkhay, Vsevolod; Sifri, Suzanne; Singh, Rosy; Sloan, Andrew E; Smolenski, Kathy; Staugaitis, Susan M; Steele, Ruth; Thorne, Leigh; Tirapelli, Daniela P C; Unterberg, Andreas; Vallurupalli, Mahitha; Wang, Yun; Warnick, Ronald; Williams, Felicia; Wolinsky, Yingli; Bell, Sue; Rosenberg, Mara; Stewart, Chip; Huang, Franklin; Grimsby, Jonna L; Radenbaugh, Amie J; Zhang, Jianan

    2015-06-25

    Diffuse low-grade and intermediate-grade gliomas (which together make up the lower-grade gliomas, World Health Organization grades II and III) have highly variable clinical behavior that is not adequately predicted on the basis of histologic class. Some are indolent; others quickly progress to glioblastoma. The uncertainty is compounded by interobserver variability in histologic diagnosis. Mutations in IDH, TP53, and ATRX and codeletion of chromosome arms 1p and 19q (1p/19q codeletion) have been implicated as clinically relevant markers of lower-grade gliomas. We performed genomewide analyses of 293 lower-grade gliomas from adults, incorporating exome sequence, DNA copy number, DNA methylation, messenger RNA expression, microRNA expression, and targeted protein expression. These data were integrated and tested for correlation with clinical outcomes. Unsupervised clustering of mutations and data from RNA, DNA-copy-number, and DNA-methylation platforms uncovered concordant classification of three robust, nonoverlapping, prognostically significant subtypes of lower-grade glioma that were captured more accurately by IDH, 1p/19q, and TP53 status than by histologic class. Patients who had lower-grade gliomas with an IDH mutation and 1p/19q codeletion had the most favorable clinical outcomes. Their gliomas harbored mutations in CIC, FUBP1, NOTCH1, and the TERT promoter. Nearly all lower-grade gliomas with IDH mutations and no 1p/19q codeletion had mutations in TP53 (94%) and ATRX inactivation (86%). The large majority of lower-grade gliomas without an IDH mutation had genomic aberrations and clinical behavior strikingly similar to those found in primary glioblastoma. The integration of genomewide data from multiple platforms delineated three molecular classes of lower-grade gliomas that were more concordant with IDH, 1p/19q, and TP53 status than with histologic class. Lower-grade gliomas with an IDH mutation either had 1p/19q codeletion or carried a TP53 mutation. Most

  12. Chromosome mapping of dragline silk genes in the genomes of widow spiders (Araneae, Theridiidae).

    Directory of Open Access Journals (Sweden)

    Yonghui Zhao

    With its incredible strength and toughness, spider dragline silk is widely lauded for its impressive material properties. Dragline silk is composed of two structural proteins, MaSp1 and MaSp2, which are encoded by members of the spidroin gene family. While previous studies have characterized the genes that encode the constituent proteins of spider silks, nothing is known about the physical location of these genes. We determined karyotypes and sex chromosome organization for the widow spiders, Latrodectus hesperus and L. geometricus (Araneae, Theridiidae). We then used fluorescence in situ hybridization to map the genomic locations of the genes for the silk proteins that compose the remarkable spider dragline. These genes included three loci for the MaSp1 protein and the single locus for the MaSp2 protein. In addition, we mapped a MaSp1 pseudogene. All the MaSp1 gene copies and pseudogene localized to a single chromosomal region while MaSp2 was located on a different chromosome of L. hesperus. Using probes derived from L. hesperus, we comparatively mapped all three MaSp1 loci to a single region of a L. geometricus chromosome. As with L. hesperus, MaSp2 was found on a separate L. geometricus chromosome, thus again unlinked to the MaSp1 loci. These results indicate orthology of the corresponding chromosomal regions in the two widow genomes. Moreover, the occurrence of multiple MaSp1 loci in a conserved gene cluster across species suggests that MaSp1 proliferated by tandem duplication in a common ancestor of L. geometricus and L. hesperus. Unequal crossover events during recombination could have given rise to the gene copies and could also maintain sequence similarity among gene copies over time. Further comparative mapping with taxa of increasing divergence from Latrodectus will pinpoint when the MaSp1 duplication events occurred and the phylogenetic distribution of silk gene linkage patterns.

  13. Computational solution to automatically map metabolite libraries in the context of genome scale metabolic networks

    Directory of Open Access Journals (Sweden)

    Benjamin Merlet

    2016-02-01

    This article describes a generic programmatic method for mapping chemical compound libraries on organism-specific metabolic networks from various databases (KEGG, BioCyc) and flat file formats (SBML and Matlab files). We show how this pipeline was successfully applied to decipher the coverage of chemical libraries set up by two metabolomics facilities MetaboHub (French National infrastructure for metabolomics and fluxomics) and Glasgow Polyomics on the metabolic networks available in the MetExplore web server. The present generic protocol is designed to formalize and reduce the volume of information transfer between the library and the network database. Matching of metabolites between libraries and metabolic networks is based on InChIs or InChIKeys and therefore requires that these identifiers are specified in both libraries and networks. In addition to providing covering statistics, this pipeline also allows the visualization of mapping results in the context of metabolic networks. In order to achieve this goal, we tackled issues on programmatic interaction between two servers, improvement of metabolite annotation in metabolic networks and automatic loading of a mapping in genome scale metabolic network analysis tool MetExplore. It is important to note that this mapping can also be performed on a single or a selection of organisms of interest and is thus not limited to large facilities.

  14. QTL Mapping of Genome Regions Controlling Manganese Uptake in Lentil Seed

    Directory of Open Access Journals (Sweden)

    Duygu Ates

    2018-05-01

    This study evaluated Mn concentration in the seeds of 120 RILs of lentil developed from the cross “CDC Redberry” × “ILL7502”. Micronutrient analysis using atomic absorption spectrometry indicated mean seed manganese (Mn) concentrations ranging from 8.5 to 26.8 mg/kg, based on replicated field trials grown at three locations in Turkey in 2012 and 2013. A linkage map of lentil was constructed and consisted of seven linkage groups with 5,385 DNA markers. The total map length was 973.1 cM, with an average distance between markers of 0.18 cM. A total of 6 QTL for Mn concentration were identified using composite interval mapping (CIM). All QTL were statistically significant and explained 15.3–24.1% of the phenotypic variation, with LOD scores ranging from 3.00 to 4.42. The high-density genetic map reported in this study will increase fundamental knowledge of the genome structure of lentil, and will be the basis for the development of micronutrient-enriched lentil genotypes to support biofortification efforts.

  15. QTL Mapping of Genome Regions Controlling Manganese Uptake in Lentil Seed.

    Science.gov (United States)

    Ates, Duygu; Aldemir, Secil; Yagmur, Bulent; Kahraman, Abdullah; Ozkan, Hakan; Vandenberg, Albert; Tanyolac, Muhammed Bahattin

    2018-05-04

    This study evaluated Mn concentration in the seeds of 120 RILs of lentil developed from the cross "CDC Redberry" × "ILL7502". Micronutrient analysis using atomic absorption spectrometry indicated mean seed manganese (Mn) concentrations ranging from 8.5 to 26.8 mg/kg, based on replicated field trials grown at three locations in Turkey in 2012 and 2013. A linkage map of lentil was constructed and consisted of seven linkage groups with 5,385 DNA markers. The total map length was 973.1 cM, with an average distance between markers of 0.18 cM. A total of 6 QTL for Mn concentration were identified using composite interval mapping (CIM). All QTL were statistically significant and explained 15.3-24.1% of the phenotypic variation, with LOD scores ranging from 3.00 to 4.42. The high-density genetic map reported in this study will increase fundamental knowledge of the genome structure of lentil, and will be the basis for the development of micronutrient-enriched lentil genotypes to support biofortification efforts. Copyright © 2018 Ates et al.
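    The marker density quoted above follows directly from the map length and the marker count. A minimal arithmetic check, using only the figures stated in the abstract (the function name is hypothetical):

```python
# Figures quoted in the abstract above.
total_map_length_cm = 973.1  # total linkage map length, in centimorgans
n_markers = 5385             # DNA markers placed on the map
n_linkage_groups = 7         # linkage groups reported


def mean_marker_spacing(map_length_cm: float, intervals: int) -> float:
    """Average map distance (cM) per interval between adjacent markers."""
    if intervals <= 0:
        raise ValueError("intervals must be positive")
    return map_length_cm / intervals


# Simple estimate: map length divided by the number of markers.
per_marker = mean_marker_spacing(total_map_length_cm, n_markers)

# Stricter estimate: each linkage group with k markers has k - 1 intervals,
# so the whole map has (markers - linkage groups) intervals.
per_interval = mean_marker_spacing(total_map_length_cm, n_markers - n_linkage_groups)

print(f"{per_marker:.2f} cM per marker, {per_interval:.2f} cM per interval")
```

    Both estimates round to the 0.18 cM reported in the abstract.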

  16. A Computational Solution to Automatically Map Metabolite Libraries in the Context of Genome Scale Metabolic Networks.

    Science.gov (United States)

    Merlet, Benjamin; Paulhe, Nils; Vinson, Florence; Frainay, Clément; Chazalviel, Maxime; Poupin, Nathalie; Gloaguen, Yoann; Giacomoni, Franck; Jourdan, Fabien

    2016-01-01

    This article describes a generic programmatic method for mapping chemical compound libraries on organism-specific metabolic networks from various databases (KEGG, BioCyc) and flat file formats (SBML and Matlab files). We show how this pipeline was successfully applied to decipher the coverage of chemical libraries set up by two metabolomics facilities MetaboHub (French National infrastructure for metabolomics and fluxomics) and Glasgow Polyomics (GP) on the metabolic networks available in the MetExplore web server. The present generic protocol is designed to formalize and reduce the volume of information transfer between the library and the network database. Matching of metabolites between libraries and metabolic networks is based on InChIs or InChIKeys and therefore requires that these identifiers are specified in both libraries and networks. In addition to providing covering statistics, this pipeline also allows the visualization of mapping results in the context of metabolic networks. In order to achieve this goal, we tackled issues on programmatic interaction between two servers, improvement of metabolite annotation in metabolic networks and automatic loading of a mapping in genome scale metabolic network analysis tool MetExplore. It is important to note that this mapping can also be performed on a single or a selection of organisms of interest and is thus not limited to large facilities.
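    The identifier-based matching described above can be sketched as a dictionary lookup on InChIKeys. This is an illustration under stated assumptions, not the pipeline's actual code or the MetExplore API: the function name, the data layout, and the placeholder identifiers are all hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def map_library_to_network(
    library: Dict[str, Optional[str]],
    network: Dict[str, Optional[str]],
) -> Tuple[Dict[str, List[str]], float]:
    """Match library compounds to network metabolites that share an InChIKey.

    library maps compound name -> InChIKey (None if not annotated).
    network maps metabolite id -> InChIKey (None if not annotated).
    Returns the matches and the fraction of library compounds that were mapped.
    """
    # Index the network by InChIKey; several metabolites may share one key.
    by_key: Dict[str, List[str]] = {}
    for metabolite_id, key in network.items():
        if key:
            by_key.setdefault(key, []).append(metabolite_id)

    # Entries without an identifier on either side cannot be matched.
    mapping = {
        name: by_key[key]
        for name, key in library.items()
        if key and key in by_key
    }
    coverage = len(mapping) / len(library) if library else 0.0
    return mapping, coverage


# Placeholder identifiers, not real InChIKeys.
library = {"compound_A": "KEY-A", "compound_B": "KEY-B", "compound_C": None}
network = {"M_1": "KEY-A", "M_2": "KEY-A", "M_3": "KEY-X", "M_4": None}

mapping, coverage = map_library_to_network(library, network)
print(mapping, f"{coverage:.0%}")
```

    The unannotated entry "compound_C" shows why the abstract requires identifiers in both libraries and networks: without one, the compound counts against coverage.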

  17. Physical mapping and BAC-end sequence analysis provide initial insights into the flax (Linum usitatissimum L.) genome

    Directory of Open Access Journals (Sweden)

    Cloutier Sylvie

    2011-05-01

    Background: Flax (Linum usitatissimum L.) is an important source of oil rich in omega-3 fatty acids, which have proven health benefits and utility as an industrial raw material. Flax seeds also contain lignans which are associated with reducing the risk of certain types of cancer. Its bast fibres have broad industrial applications. However, genomic tools needed for molecular breeding were non existent. Hence a project, Total Utilization Flax GENomics (TUFGEN) was initiated. We report here the first genome-wide physical map of flax and the generation and analysis of BAC-end sequences (BES) from 43,776 clones, providing initial insights into the genome. Results: The physical map consists of 416 contigs spanning ~368 Mb, assembled from 32,025 fingerprints, representing roughly 54.5% to 99.4% of the estimated haploid genome (370-675 Mb). The N50 size of the contigs was estimated to be ~1,494 kb. The longest contig was ~5,562 kb comprising 437 clones. There were 96 contigs containing more than 100 clones. Approximately 54.6 Mb representing 8-14.8% of the genome was obtained from 80,337 BES. Annotation revealed that a large part of the genome consists of ribosomal DNA (~13.8%), followed by known transposable elements at 6.1%. Furthermore, ~7.4% of sequence was identified to harbour novel repeat elements. Homology searches against flax-ESTs and NCBI-ESTs suggested that ~5.6% of the transcriptome is unique to flax. A total of 4064 putative genomic SSRs were identified and are being developed as novel markers for their use in molecular breeding. Conclusion: The first genome-wide physical map of flax constructed with BAC clones provides a framework for accessing target loci with economic importance for marker development and positional cloning. Analysis of the BES has provided insights into the uniqueness of the flax genome. Compared to other plant genomes, the proportion of rDNA was found to be very high whereas the proportion of known transposable

  18. Physical mapping and BAC-end sequence analysis provide initial insights into the flax (Linum usitatissimum L.) genome.

    Science.gov (United States)

    Ragupathy, Raja; Rathinavelu, Rajkumar; Cloutier, Sylvie

    2011-05-09

    Flax (Linum usitatissimum L.) is an important source of oil rich in omega-3 fatty acids, which have proven health benefits and utility as an industrial raw material. Flax seeds also contain lignans which are associated with reducing the risk of certain types of cancer. Its bast fibres have broad industrial applications. However, genomic tools needed for molecular breeding were non existent. Hence a project, Total Utilization Flax GENomics (TUFGEN) was initiated. We report here the first genome-wide physical map of flax and the generation and analysis of BAC-end sequences (BES) from 43,776 clones, providing initial insights into the genome. The physical map consists of 416 contigs spanning ~368 Mb, assembled from 32,025 fingerprints, representing roughly 54.5% to 99.4% of the estimated haploid genome (370-675 Mb). The N50 size of the contigs was estimated to be ~1,494 kb. The longest contig was ~5,562 kb comprising 437 clones. There were 96 contigs containing more than 100 clones. Approximately 54.6 Mb representing 8-14.8% of the genome was obtained from 80,337 BES. Annotation revealed that a large part of the genome consists of ribosomal DNA (~13.8%), followed by known transposable elements at 6.1%. Furthermore, ~7.4% of sequence was identified to harbour novel repeat elements. Homology searches against flax-ESTs and NCBI-ESTs suggested that ~5.6% of the transcriptome is unique to flax. A total of 4064 putative genomic SSRs were identified and are being developed as novel markers for their use in molecular breeding. The first genome-wide physical map of flax constructed with BAC clones provides a framework for accessing target loci with economic importance for marker development and positional cloning. Analysis of the BES has provided insights into the uniqueness of the flax genome. Compared to other plant genomes, the proportion of rDNA was found to be very high whereas the proportion of known transposable elements was low. The SSRs identified from BES will be
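    Two of the statistics quoted above can be made concrete. The sketch below shows the standard definition of N50 on hypothetical toy contig lengths (the abstract does not list the real contig sizes), and recomputes the BES coverage range from the figures stated in the abstract. The function name is an assumption for illustration.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50: the length L such that contigs of length >= L
    together contain at least half of the total assembled length."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one contig length is required")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable: the running sum always reaches the total")


# Hypothetical toy contig lengths (kb), not the flax data.
toy_contigs_kb = [100, 80, 60, 40, 20]
toy_n50 = n50(toy_contigs_kb)

# Figures quoted in the abstract: 54.6 Mb of BES against an
# estimated haploid genome size of 370-675 Mb.
bes_mb = 54.6
genome_low_mb, genome_high_mb = 370.0, 675.0
coverage_low = bes_mb / genome_high_mb   # smallest fraction: largest genome estimate
coverage_high = bes_mb / genome_low_mb   # largest fraction: smallest genome estimate

print(toy_n50, f"{coverage_low:.1%} to {coverage_high:.1%}")
```

    The recomputed range agrees with the "8-14.8%" stated in the abstract once the lower bound is rounded to a whole percent.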

  19. Quantitative and qualitative proteome characteristics extracted from in-depth integrated genomics and proteomics analysis

    NARCIS (Netherlands)

    Low, T.Y.; van Heesch, S.; van den Toorn, H.; Giansanti, P.; Cristobal, A.; Toonen, P.; Schafer, S.; Hubner, N.; van Breukelen, B.; Mohammed, S.; Cuppen, E.; Heck, A.J.R.; Guryev, V.

    2013-01-01

    Quantitative and qualitative protein characteristics are regulated at genomic, transcriptomic, and posttranscriptional levels. Here, we integrated in-depth transcriptome and proteome analyses of liver tissues from two rat strains to unravel the interactions within and between these layers. We

  20. Mapping determinants of gene expression plasticity by genetical genomics in C. elegans.

    Directory of Open Access Journals (Sweden)

    Yang Li

    2006-12-01

    Full Text Available Recent genetical genomics studies have provided intimate views on gene regulatory networks. Gene expression variations between genetically different individuals have been mapped to the causal regulatory regions, termed expression quantitative trait loci. Whether the environment-induced plastic response of gene expression also shows heritable difference has not yet been studied. Here we show that differential expression induced by temperatures of 16 degrees C and 24 degrees C has a strong genetic component in Caenorhabditis elegans recombinant inbred strains derived from a cross between strains CB4856 (Hawaii) and N2 (Bristol). No less than 59% of 308 trans-acting genes showed a significant eQTL-by-environment interaction, here termed plasticity quantitative trait loci. In contrast, only 8% of an estimated 188 cis-acting genes showed such interaction. This indicates that heritable differences in plastic responses of gene expression are largely regulated in trans. This regulation is spread over many different regulators. However, for one group of trans-genes we found prominent evidence for a common master regulator: a transband of 66 coregulated genes appeared at 24 degrees C. Our results suggest widespread genetic variation of differential expression responses to environmental impacts and demonstrate the potential of genetical genomics for mapping the molecular determinants of phenotypic plasticity.

  1. Genome-wide mapping of furfural tolerance genes in Escherichia coli.

    Science.gov (United States)

    Glebes, Tirzah Y; Sandoval, Nicholas R; Reeder, Philippa J; Schilling, Katherine D; Zhang, Min; Gill, Ryan T

    2014-01-01

    Advances in genomics have improved the ability to map complex genotype-to-phenotype relationships, like those required for engineering chemical tolerance. Here, we have applied the multiSCale Analysis of Library Enrichments (SCALEs; Lynch et al. (2007) Nat. Method.) approach to map, in parallel, the effect of increased dosage for >10^5 different fragments of the Escherichia coli genome onto furfural tolerance (furfural is a key toxin of lignocellulosic hydrolysate). Only 268 of >4,000 E. coli genes (∼ 6%) were enriched after growth selections in the presence of furfural. Several of the enriched genes were cloned and tested individually for their effect on furfural tolerance. Overexpression of thyA, lpcA, or groESL individually increased growth in the presence of furfural. Overexpression of lpcA, but not groESL or thyA, resulted in increased furfural reduction rate, a previously identified mechanism underlying furfural tolerance. We additionally show that plasmid-based expression of functional LpcA or GroESL is required to confer furfural tolerance. This study identifies new furfural tolerant genes, which can be applied in future strain design efforts focused on the production of fuels and chemicals from lignocellulosic hydrolysate.

  2. Biallelic and Genome Wide Association Mapping of Germanium Tolerant Loci in Rice (Oryza sativa L.).

    Directory of Open Access Journals (Sweden)

    Partha Talukdar

    Full Text Available Rice plants accumulate high concentrations of silicon. Silicon has been shown to be involved in plant growth, high yield, and mitigating biotic and abiotic stresses. However, it has been demonstrated that inorganic arsenic is taken up by rice through silicon transporters under anaerobic conditions, thus the ability to efficiently take up silicon may be considered either a positive or a negative trait in rice. Germanium is an analogue of silicon that produces brown lesions in shoots and leaves, and germanium toxicity has been used to identify mutants in silicon and arsenic transport. In this study, two different genetic mapping methods were performed to determine the loci involved in germanium sensitivity in rice. Genetic mapping in the biparental cross of Bala × Azucena (an F6 population) and a genome wide association (GWA) study with 350 accessions from the Rice Diversity Panel 1 were conducted using 15 μM of germanic acid. This identified a number of germanium sensitive loci: some co-localised with previously identified quantitative trait loci (QTL) for tissue silicon or arsenic concentration, none co-localised with Lsi1 or Lsi6, while one single nucleotide polymorphism (SNP) was detected within 200 kb of Lsi2 (these are genes known to transport silicon, whose identity was discovered using germanium toxicity). However, examining candidate genes that are within the genomic region of the loci detected above reveals genes homologous to both Lsi1 and Lsi2, as well as a number of other candidate genes, which are discussed.

  3. Update on the use of random 10-mers in mapping and fingerprinting genomes

    International Nuclear Information System (INIS)

    Sinibaldi, R.M.

    2001-01-01

    The use of Randomly Amplified Polymorphic DNA (RAPDs) has continued to grow for the last several years. Their use can be quickly assessed by searching PubMed at the National Library of Medicine with the acronym RAPD. Since their first report in 1990, the number of citations with RAPD in them has increased from 12 in 1990, to 45 in 1991, to 112 in 1993, to 130 in 1994, to 223 in 1995, to 258 in 1996, to 236 in 1997, to 316 in 1998, and to 196 to date (August 31) in 1999. The utilization of 10-mers for mapping or fingerprinting has many advantages. These include a relatively low cost, no use of radioactivity, easy adaptation to automation, a requirement for very small amounts of input DNA, rapid results, existing databases for many organisms, and low cost equipment requirements. In conjunction with a derived technology such as SCARs (sequence characterized amplified regions), it can provide cost effective and thorough methods for mapping and fingerprinting any genome. Newer methods based on microarray technology may offer powerful but expensive alternative approaches in determining genetic diversity. The costs of arrays should come down with time and improved production methods. In the meantime, RAPDs remain a competent and cost effective method for genome characterizations. (author)

  4. Zebrafish syntenic relationship to human/mouse genomes revealed by radiation hybrid mapping

    International Nuclear Information System (INIS)

    Samonte, Irene E.

    2007-01-01

    Zebrafish (Danio rerio) is an excellent model system for vertebrate developmental analysis and a new model for human disorders. In this study, however, zebrafish was used to determine its syntenic relationship to human/mouse genomes using the zebrafish-hamster radiation hybrid panel. The focus was on genes residing on chromosomes 6 and 17 of human and mouse, respectively, and some other genes of either immunologic or evolutionary importance. Gene sequences of interest and zebrafish expressed sequence tags deposited in the GenBank were used in identifying zebrafish homologs. Polymerase chain reaction (PCR) amplification, cloning and subcloning, sequencing, and phylogenetic analysis were done to confirm the homology of the candidate genes in zebrafish. The promising markers were then tested in the 94 zebrafish-hamster radiation hybrid panel cell lines and submitted for logarithm of the odds (LOD) score analysis to position genes on the zebrafish map. A total of 19 loci were successfully mapped to zebrafish linkage groups 1, 14, 15, 19, and 20. Four of these loci were positioned in linkage group 20, whereas, 3 more loci were added in linkage group 19, thus increasing to 34 loci the number of human genes syntenic to the group. With the sequencing of the zebrafish genome, about 20 more MHC genes were reported linked on the same group. (Author)

  5. Genome-wide mapping of furfural tolerance genes in Escherichia coli.

    Directory of Open Access Journals (Sweden)

    Tirzah Y Glebes

    Full Text Available Advances in genomics have improved the ability to map complex genotype-to-phenotype relationships, like those required for engineering chemical tolerance. Here, we have applied the multiSCale Analysis of Library Enrichments (SCALEs; Lynch et al. (2007) Nat. Method.) approach to map, in parallel, the effect of increased dosage for >10^5 different fragments of the Escherichia coli genome onto furfural tolerance (furfural is a key toxin of lignocellulosic hydrolysate). Only 268 of >4,000 E. coli genes (∼ 6%) were enriched after growth selections in the presence of furfural. Several of the enriched genes were cloned and tested individually for their effect on furfural tolerance. Overexpression of thyA, lpcA, or groESL individually increased growth in the presence of furfural. Overexpression of lpcA, but not groESL or thyA, resulted in increased furfural reduction rate, a previously identified mechanism underlying furfural tolerance. We additionally show that plasmid-based expression of functional LpcA or GroESL is required to confer furfural tolerance. This study identifies new furfural tolerant genes, which can be applied in future strain design efforts focused on the production of fuels and chemicals from lignocellulosic hydrolysate.

  6. Genome-Wide Association Mapping of Crown Rust Resistance in Oat Elite Germplasm.

    Science.gov (United States)

    Klos, Kathy Esvelt; Yimer, Belayneh A; Babiker, Ebrahiem M; Beattie, Aaron D; Bonman, J Michael; Carson, Martin L; Chong, James; Harrison, Stephen A; Ibrahim, Amir M H; Kolb, Frederic L; McCartney, Curt A; McMullen, Michael; Fetch, Jennifer Mitchell; Mohammadi, Mohsen; Murphy, J Paul; Tinker, Nicholas A

    2017-07-01

    Oat crown rust, caused by Puccinia coronata f. sp. avenae, is a major constraint to oat (Avena sativa L.) production in many parts of the world. In this first comprehensive multienvironment genome-wide association map of oat crown rust, we used 2972 single-nucleotide polymorphisms (SNPs) genotyped on 631 oat lines for association mapping of quantitative trait loci (QTL). Seedling reaction to crown rust in these lines was assessed as infection type (IT) with each of 10 crown rust isolates. Adult plant reaction was assessed in the field in a total of 10 location-years as percentage severity (SV) and as infection reaction (IR) on a 0-to-1 scale. Overall, 29 SNPs on 12 linkage groups were predictive of crown rust reaction in at least one experiment at a genome-wide level of statistical significance. The QTL identified here include those in regions previously shown to be linked with seedling resistance genes , , , , , and and also with adult-plant resistance and adaptation-related QTL. In addition, QTL on linkage groups Mrg03, Mrg08, and Mrg23 were identified in regions not previously associated with crown rust resistance. Evaluation of marker genotypes in a set of crown rust differential lines supported as the identity of . The SNPs with rare alleles associated with lower disease scores may be suitable for use in marker-assisted selection of oat lines for crown rust resistance. Copyright © 2017 Crop Science Society of America.

  7. Integrating satellite imagery with simulation modeling to improve burn severity mapping

    Science.gov (United States)

    Eva C. Karau; Pamela G. Sikkink; Robert E. Keane; Gregory K. Dillon

    2014-01-01

    Both satellite imagery and spatial fire effects models are valuable tools for generating burn severity maps that are useful to fire scientists and resource managers. The purpose of this study was to test a new mapping approach that integrates imagery and modeling to create more accurate burn severity maps. We developed and assessed a statistical model that combines the...

  8. Mapping cis-Regulatory Domains in the Human Genome Using Multi-Species Conservation of Synteny

    Energy Technology Data Exchange (ETDEWEB)

    Ahituv, Nadav; Prabhakar, Shyam; Poulin, Francis; Rubin, Edward M.; Couronne, Olivier

    2005-06-13

    Our inability to associate distant regulatory elements with the genes that they regulate has largely precluded their examination for sequence alterations contributing to human disease. One major obstacle is the large genomic space surrounding targeted genes in which such elements could potentially reside. In order to delineate gene regulatory boundaries we used whole-genome human-mouse-chicken (HMC) and human-mouse-frog (HMF) multiple alignments to compile conserved blocks of synteny (CBS), under the hypothesis that these blocks have been kept intact throughout evolution at least in part by the requirement of regulatory elements to stay linked to the genes that they regulate. A total of 2,116 and 1,942 CBS >200 kb were assembled for HMC and HMF respectively, encompassing 1.53 and 0.86 Gb of human sequence. To support the existence of complex long-range regulatory domains within these CBS we analyzed the prevalence and distribution of chromosomal aberrations leading to position effects (disruption of a gene's regulatory environment), observing a clear bias not only for mapping onto CBS but also for longer CBS size. Our results provide a genome-wide data set characterizing the regulatory domains of genes and the conserved regulatory elements within them.

  9. Rice genome mapping and its application in rice genetics and breeding

    International Nuclear Information System (INIS)

    Eun, M.Y.; Cho, Y.G.; Hahn, J.H.; Yoon, U.H.; Yi, B.Y.; Chung, T.Y.

    1998-01-01

    An 'MG' recombinant inbred population which consists of 164 F13 lines has been developed from a cross between a Tongil type variety Milyang 23 and a Japonica type Gihobyeo by single seed descent. A Restriction Fragment Length Polymorphism (RFLP) framework map using this population has been constructed. Morphological markers, isozyme loci, microsatellites, Amplified Fragment Length Polymorphisms (AFLP), and new complementary DNA (cDNA) markers are being integrated in the framework map for a highly saturated comprehensive map. So far, 207 RFLPs, 89 microsatellites, 5 isozymes, 232 AFLPs, and 2 morphological markers have been mapped through international collaboration. The map contains 1,826 cM with an average interval size of 4.5 cM on the framework map and 3.4 cM overall (as of 29 October 1996). The framework map is being used for analyzing quantitative trait loci (QTL) of agronomic characters and some physico-chemical properties relating to rice quality. The number of significant QTLs affecting each trait ranged from one to five, and 38 QTLs were detected for 17 traits. The percentage of variance explained by each QTL ranged from 5.6 to 66.9%. The isozyme marker, EstI-2, and two RFLP markers, RG109 and RG220, were linked most tightly at a distance less than 1 cM with the semidwarf (sd-1) gene on chromosome 1. These markers could be used for precise in vitro selection of individuals carrying the semidwarf gene using single seeds or very young leaf tissue, before this character is fully expressed. Appropriate application of marker-assisted selection, using EstI-2 and RFLP markers for the semidwarf character, in combination with other markers linked to genes of agronomic importance in rice, holds promise for improving the efficiency of breeding and the high-resolution genetic and physical mapping near sd-1, aimed at ultimately cloning this valuable gene
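
    The overall average interval quoted for the rice map can be checked from the marker counts given in the same abstract. The sketch below assumes, as a simplification, that the average is the total map length divided by the total number of mapped markers; the abstract does not say how the figure was derived, and the variable names are illustrative.

```python
# Marker counts and map length as stated in the abstract.
marker_counts = {
    "RFLP": 207,
    "microsatellite": 89,
    "isozyme": 5,
    "AFLP": 232,
    "morphological": 2,
}
MAP_LENGTH_CM = 1826.0

total_markers = sum(marker_counts.values())
# Assumption: average interval = map length / number of markers.
average_interval_cm = MAP_LENGTH_CM / total_markers

print(f"markers mapped: {total_markers}")
print(f"average interval: {average_interval_cm:.1f} cM")
```

    Under this assumption the 535 markers give an average interval that rounds to the 3.4 cM reported for the map overall.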

  10. Comparative genomics and association mapping approaches for blast resistant genes in finger millet using SSRs.

    Directory of Open Access Journals (Sweden)

    B Kalyana Babu

    Full Text Available The major limiting factor for production and productivity of finger millet crop is blast disease caused by Magnaporthe grisea. Since the genome sequence information available for finger millet is scarce, comparative genomics plays a very important role in identification of genes/QTLs linked to the blast resistance genes using SSR markers. In the present study, a total of 58 genic SSRs were developed for use in genetic analysis of a global collection of 190 finger millet genotypes. The 58 SSRs yielded ninety-five scorable alleles and the polymorphism information content varied from 0.186 to 0.677 at an average of 0.385. The gene diversity was in the range of 0.208 to 0.726 with an average of 0.487. Association mapping for blast resistance was done using 104 SSR markers which identified four QTLs for finger blast and one QTL for neck blast resistance. The genomic marker RM262 and genic marker FMBLEST32 were linked to finger blast disease at a P value of 0.007 and explained phenotypic variance (R²) of 10% and 8% respectively. The genomic marker UGEP81 was associated with finger blast at a P value of 0.009 and explained 7.5% of R². The QTL for neck blast was associated with the genomic SSR marker UGEP18 at a P value of 0.01, which explained 11% of R². Three QTLs for blast resistance were found common by using both GLM and MLM approaches. The resistant alleles were found to be present mostly in the exotic genotypes. Among the genotypes of NW Himalayan region of India, VHC3997, VHC3996 and VHC3930 were found highly resistant, which may be effectively used as parents for developing blast resistant cultivars in the NW Himalayan region of India. The markers linked to the QTLs for blast resistance in the present study can be further used for cloning of the full length gene, fine mapping and their further use in the marker assisted breeding programmes for introgression of blast resistant alleles into locally adapted cultivars.

  11. Comparative genomics and association mapping approaches for blast resistant genes in finger millet using SSRs.

    Science.gov (United States)

    Babu, B Kalyana; Dinesh, Pandey; Agrawal, Pawan K; Sood, S; Chandrashekara, C; Bhatt, Jagadish C; Kumar, Anil

    2014-01-01

    The major limiting factor for production and productivity of finger millet crop is blast disease caused by Magnaporthe grisea. Since the genome sequence information available for finger millet is scarce, comparative genomics plays a very important role in identification of genes/QTLs linked to the blast resistance genes using SSR markers. In the present study, a total of 58 genic SSRs were developed for use in genetic analysis of a global collection of 190 finger millet genotypes. The 58 SSRs yielded ninety-five scorable alleles and the polymorphism information content varied from 0.186 to 0.677 at an average of 0.385. The gene diversity was in the range of 0.208 to 0.726 with an average of 0.487. Association mapping for blast resistance was done using 104 SSR markers which identified four QTLs for finger blast and one QTL for neck blast resistance. The genomic marker RM262 and genic marker FMBLEST32 were linked to finger blast disease at a P value of 0.007 and explained phenotypic variance (R²) of 10% and 8% respectively. The genomic marker UGEP81 was associated with finger blast at a P value of 0.009 and explained 7.5% of R². The QTL for neck blast was associated with the genomic SSR marker UGEP18 at a P value of 0.01, which explained 11% of R². Three QTLs for blast resistance were found common by using both GLM and MLM approaches. The resistant alleles were found to be present mostly in the exotic genotypes. Among the genotypes of NW Himalayan region of India, VHC3997, VHC3996 and VHC3930 were found highly resistant, which may be effectively used as parents for developing blast resistant cultivars in the NW Himalayan region of India. The markers linked to the QTLs for blast resistance in the present study can be further used for cloning of the full length gene, fine mapping and their further use in the marker assisted breeding programmes for introgression of blast resistant alleles into locally adapted cultivars.
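
    The two per-marker statistics reported for the finger millet SSRs, gene diversity and polymorphism information content (PIC), are both functions of allele frequencies at a locus. The sketch below uses the commonly cited definitions (gene diversity as one minus the sum of squared allele frequencies, and PIC as that value minus a correction term over allele pairs); the abstract does not state which formulas or software the authors used, so this is an assumption, and the allele frequencies are invented for illustration.

```python
from itertools import combinations


def gene_diversity(freqs):
    """Expected heterozygosity: 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """PIC: 1 - sum(p_i^2) - sum over pairs i<j of 2 * p_i^2 * p_j^2."""
    pair_term = sum(2.0 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return gene_diversity(freqs) - pair_term


# Hypothetical allele frequencies at one SSR locus (must sum to 1).
toy_freqs = [0.5, 0.5]
assert abs(sum(toy_freqs) - 1.0) < 1e-9

print("gene diversity:", gene_diversity(toy_freqs))
print("PIC:", pic(toy_freqs))
```

    With these definitions PIC never exceeds gene diversity for the same locus, which is consistent with the averages in the abstract (0.385 for PIC against 0.487 for gene diversity).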

  12. Physical Mapping of Bread Wheat Chromosome 5A: An Integrated Approach

    Directory of Open Access Journals (Sweden)

    Delfina Barabaschi

    2015-11-01

    Full Text Available The huge size, redundancy, and highly repetitive nature of the bread wheat (Triticum aestivum L.) genome make it among the most difficult species to be sequenced. To overcome these limitations, a strategy based on the separation of individual chromosomes or chromosome arms and the subsequent production of physical maps was established within the frame of the International Wheat Genome Sequence Consortium (IWGSC). A total of 95,812 bacterial artificial chromosome (BAC) clones of short-arm chromosome 5A (5AS) and long-arm chromosome 5A (5AL) arm-specific BAC libraries were fingerprinted and assembled into contigs by complementary analytical approaches based on the FingerPrinted Contig (FPC) and Linear Topological Contig (LTC) tools. Combined anchoring approaches based on polymerase chain reaction (PCR) marker screening, microarray, and sequence homology searches applied to several genomic tools (i.e., genetic maps, deletion bin map, neighbor maps, BAC end sequences (BESs), genome zipper, and chromosome survey sequences) allowed the development of a high-quality physical map with an anchored physical coverage of 75% for 5AS and 53% for 5AL with high portions (64 and 48%, respectively) of contigs ordered along the chromosome. In the genomes of the grasses Brachypodium distachyon (L.) Beauv., rice (Oryza sativa L.), and sorghum [Sorghum bicolor (L.) Moench], homologs of genes on wheat chromosome 5A were separated into syntenic blocks on different chromosomes as a result of translocations and inversions during evolution. The physical map presented represents an essential resource for fine genetic mapping and map-based cloning of agronomically relevant traits and a reference for the 5A sequencing projects.

  13. INTEGRATED GENOME-BASED STUDIES OF SHEWANELLA ECOPHYSIOLOGY

    Energy Technology Data Exchange (ETDEWEB)

    NEALSON, KENNETH H.

    2013-10-15

    products of dissimilatory iron reduction. Geochim. Cosmochim. Acta. 74:574-583. 10. Karpinets, T.V., A.Y. Obraztsova, Y. Wang, D.D. Schmoyer, G.H. Kora, B.H. Park, M.H. Serres, M.F. Romine, M.L. Land, T.B. Kothe, J.K. Fredrickson, K.H. Nealson, and E.C. Uberbacher. 2010. Conserved synteny at the protein family level reveals genes underlying Shewanella species' cold tolerance and predicts their novel phenotypes. Funct. Integr. Genomics 10:97-110. (DOI 10.1007/s10143-009-0142-y) 11. Bretschger, O., A.C.M. Cheung, F. Mansfeld, and K.H. Nealson. 2010. Comparative microbial fuel cell evaluations of Shewanella spp. Electroanalysis 22:883-894. 12. McLean, J.S., G. Wanger, Y.A. Gorby, M. Wainstein, J. McQuaid, Shun'ichi Ishii, O. Bretschger, H. Beyanal, K.H. Nealson. 2010. Quantification of electron transfer rates to a solid phase electron acceptor through the stages of biofilm formation from single cells to multicellular communities. Env. Sci. Technol. 44:2721-2717. 13. El-Naggar, M., G. Wanger, K.M. Leung, T.D. Yuzvinsky, G. Southam, J. Yang, W.M. Lau, K.H. Nealson, and Y.A. Gorby. 2010. Electrical Transport Along Bacterial Nanowires from Shewanella oneidensis MR-1. Proc. Nat. Acad. Sci. USA 107:18127-18131. 14. Biffinger, J.C., L.A. Fitzgerald, R. Ray, B.J. Little, S.E. Lizewski, E.R. Petersen, B.R. Ringeisen, W.C. Sanders, P.E. Sheehan, J.J. Pietron, J.W. Baldwin, L.J. Nadeau, G.R. Johnson, M. Ribbens, S.E. Finkel, K.H. Nealson. 2010. The utility of Shewanella japonica for microbial fuel cells. Bioresource Technol. 102:290-297. 15. Rodionov, D., C. Yang, X. Li, I. Rodionova, Y. Wang, A.Y. Obraztsova, O.P. Zagnitko, R. Overbeek, M.F. Romine, S. Reed, J.K. Fredrickson, K.H. Nealson, A.L. Osterman. 2010. Genomic encyclopedia of sugar utilization pathways in the Shewanella genus. BMC Genomics 2010, 11:494. 16. Kan, J., L. Hsu, A.C.M. Cheung, M. Pirbazari, and K.H. Nealson. 2011. Current production by bacterial communities in microbial fuel cells enriched from wastewater sludge

  14. Assembly of the Genome of the Disease Vector Aedes aegypti onto a Genetic Linkage Map Allows Mapping of Genes Affecting Disease Transmission

    KAUST Repository

    Juneja, Punita; Osei-Poku, Jewelna; Ho, Yung S.; Ariani, Cristina V.; Palmer, William J.; Pain, Arnab; Jiggins, Francis M.

    2014-01-01

    between two strains of Ae. aegypti, and used these to generate a genetic map. This revealed a high rate of misassemblies in the current genome, where, for example, sequences from different chromosomes were found on the same scaffold. Once these were

  15. Physical mapping of 20 unmapped fragments of the btau_4.0 genome assembly in cattle, sheep and river buffalo.

    Science.gov (United States)

    De Lorenzi, L; Genualdo, V; Perucatti, A; Iannuzzi, A; Iannuzzi, L; Parma, P

    2013-01-01

    The recent advances in sequencing technology and bioinformatics have revolutionized genomic research, making the decoding of the genome an easier task. Genome sequences are currently available for many species, including cattle, sheep and river buffalo. The available reference genomes are very accurate, and they represent the best possible order of loci at this time. In cattle, despite the great accuracy achieved, a part of the genome has been sequenced but not yet assembled: these genome fragments are called unmapped fragments. In the present study, 20 unmapped fragments belonging to the Btau_4.0 reference genome have been mapped by FISH in cattle (Bos taurus, 2n = 60), sheep (Ovis aries, 2n = 54) and river buffalo (Bubalus bubalis, 2n = 50). Our results confirm the accuracy of the available reference genome, though there are some discrepancies between the expected localization and the observed localization. Moreover, the available data in the literature regarding genomic homologies between cattle, sheep and river buffalo are confirmed. Finally, the results presented here suggest that FISH was, and still is, a useful technology to validate the data produced by genome sequencing programs. Copyright © 2013 S. Karger AG, Basel.

  16. CpGAVAS, an integrated web server for the annotation, visualization, analysis, and GenBank submission of completely sequenced chloroplast genome sequences

    Science.gov (United States)

    2012-01-01

    Background: The complete sequences of chloroplast genomes provide a wealth of information regarding the evolutionary history of species. With the advance of next-generation sequencing technology, the number of completely sequenced chloroplast genomes is expected to increase exponentially, and powerful computational tools for annotating the genome sequences are urgently needed. Results: We have developed a web server, CPGAVAS. The server accepts a complete chloroplast genome sequence as input. First, it predicts protein-coding and rRNA genes based on the identification and mapping of the most similar, full-length protein, cDNA and rRNA sequences by integrating results from Blastx, Blastn, protein2genome and est2genome programs. Second, tRNA genes and inverted repeats (IR) are identified using tRNAscan, ARAGORN and vmatch respectively. Third, it calculates the summary statistics for the annotated genome. Fourth, it generates a circular map ready for publication. Fifth, it can create a Sequin file for GenBank submission. Last, it allows the extraction of protein and mRNA sequences for a given list of genes and species. The annotation results in GFF3 format can be edited using any compatible annotation editing tools. The edited annotations can then be uploaded to CPGAVAS for repeated updating and re-analysis. Using known chloroplast genome sequences as a test set, we show that CPGAVAS performs comparably to another application, DOGMA, while having several superior functionalities. Conclusions: CPGAVAS allows the semi-automatic and complete annotation of a chloroplast genome sequence, and the visualization, editing and analysis of the annotation results. It will become an indispensable tool for researchers studying chloroplast genomes. The software is freely accessible from http://www.herbalgenomics.org/cpgavas. PMID:23256920

  17. CpGAVAS, an integrated web server for the annotation, visualization, analysis, and GenBank submission of completely sequenced chloroplast genome sequences

    Directory of Open Access Journals (Sweden)

    Liu Chang

    2012-12-01

    Full Text Available Abstract Background: The complete sequences of chloroplast genomes provide a wealth of information regarding the evolutionary history of species. With the advance of next-generation sequencing technology, the number of completely sequenced chloroplast genomes is expected to increase exponentially, and powerful computational tools for annotating the genome sequences are urgently needed. Results: We have developed a web server, CPGAVAS. The server accepts a complete chloroplast genome sequence as input. First, it predicts protein-coding and rRNA genes based on the identification and mapping of the most similar, full-length protein, cDNA and rRNA sequences by integrating results from Blastx, Blastn, protein2genome and est2genome programs. Second, tRNA genes and inverted repeats (IR) are identified using tRNAscan, ARAGORN and vmatch respectively. Third, it calculates the summary statistics for the annotated genome. Fourth, it generates a circular map ready for publication. Fifth, it can create a Sequin file for GenBank submission. Last, it allows the extraction of protein and mRNA sequences for a given list of genes and species. The annotation results in GFF3 format can be edited using any compatible annotation editing tools. The edited annotations can then be uploaded to CPGAVAS for repeated updating and re-analysis. Using known chloroplast genome sequences as a test set, we show that CPGAVAS performs comparably to another application, DOGMA, while having several superior functionalities. Conclusions: CPGAVAS allows the semi-automatic and complete annotation of a chloroplast genome sequence, and the visualization, editing and analysis of the annotation results. It will become an indispensable tool for researchers studying chloroplast genomes. The software is freely accessible from http://www.herbalgenomics.org/cpgavas.

  18. SSR-enriched genetic linkage maps of bermudagrass (Cynodon dactylon × transvaalensis), and their comparison with allied plant genomes.

    Science.gov (United States)

    Khanal, Sameer; Kim, Changsoo; Auckland, Susan A; Rainville, Lisa K; Adhikari, Jeevan; Schwartz, Brian M; Paterson, Andrew H

    2017-04-01

    We report SSR-enriched genetic maps of bermudagrass that: (1) reveal partial residual polysomic inheritance in the tetraploid species, and (2) provide insights into the evolution of chloridoid genomes. This study describes genetic linkage maps of two bermudagrass species, Cynodon dactylon (T89) and Cynodon transvaalensis (T574), that integrate heterologous microsatellite markers from sugarcane into frameworks built with single-dose restriction fragments (SDRFs). A maximum likelihood approach was used to construct two separate parental maps from a population of 110 F1 progeny of a cross between the two parents. The T89 map is based on 291 loci on 34 cosegregating groups (CGs), with an average marker spacing of 12.5 cM. The T574 map is based on 125 loci on 14 CGs, with an average marker spacing of 10.7 cM. Six T89 and one T574 CG(s) deviated from disomic inheritance. Furthermore, marker segregation data and linkage phase analysis revealed partial residual polysomic inheritance in T89, suggesting that common bermudagrass is undergoing diploidization following whole genome duplication (WGD). Twenty-six T89 CGs were coalesced into 9 homo(eo)logous linkage groups (LGs), while 12 T574 CGs were assembled into 9 LGs, both putatively representing the basic chromosome complement (x = 9) of the species. Eight T89 and two T574 CGs remain unassigned. The marker composition of bermudagrass ancestral chromosomes was inferred by aligning T89 and T574 homologs, and used in comparisons to sorghum and rice genome sequences based on 108 and 91 significant blast hits, respectively. Two nested chromosome fusions (NCFs) shared by two other chloridoids (i.e., zoysiagrass and finger millet) and at least three independent translocation events were evident during chromosome number reduction from 14 in the polyploid common ancestor of Poaceae to 9 in Cynodon.

  19. Genome-wide recombination rate variation in a recombination map of cotton.

    Science.gov (United States)

    Shen, Chao; Li, Ximei; Zhang, Ruiting; Lin, Zhongxu

    2017-01-01

    Recombination is crucial for genetic evolution, which not only provides new allele combinations but also influences the biological evolution and efficacy of natural selection. However, recombination variation is not well understood outside of the complex species' genomes, and it is particularly unclear in Gossypium. Cotton is the most important natural fibre crop and the second largest oil-seed crop. Here, we found that the genetic and physical map distances did not have a simple linear relationship. Recombination rates were unevenly distributed throughout the cotton genome, with marked changes along the chromosome lengths, and recombination was completely suppressed in the centromeric regions. Recombination rates significantly varied between A-subgenome (At) (range = 1.60 to 3.26 centimorgan/megabase [cM/Mb]) and D-subgenome (Dt) (range = 2.17 to 4.97 cM/Mb), which explained why the genetic maps of At and Dt are similar but the physical map of Dt is only half that of At. The translocation regions between A02 and A03 and between A04 and A05, and the inversion regions on A10, D10, A07 and D07 indicated relatively high recombination rates in the distal regions of the chromosomes. Recombination rates were positively correlated with the densities of genes, markers and the distance from the centromere, and negatively correlated with transposable elements (TEs). The gene ontology (GO) categories showed that genes in high recombination regions may tend to respond to environmental stimuli, and genes in low recombination regions are related to mitosis and meiosis, which suggested that they may provide the primary driving force in adaptive evolution and assure the stability of basic cell cycle in a rapidly changing environment. Global knowledge of recombination rates will facilitate genetics and breeding in cotton.
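
    The cM/Mb rates quoted above are ratios of genetic distance to physical distance. As a minimal sketch of that arithmetic (not the authors' pipeline), the following computes a rate for each interval between adjacent markers. The function name and the marker coordinates are hypothetical and chosen only for illustration.

```python
def recombination_rates(markers):
    """Return (start_mb, end_mb, rate_cm_per_mb) for each adjacent marker pair.

    `markers` is a list of (physical_position_mb, genetic_position_cm)
    tuples sorted by physical position. Intervals with no physical
    length are skipped to avoid dividing by zero.
    """
    rates = []
    for (mb0, cm0), (mb1, cm1) in zip(markers, markers[1:]):
        span_mb = mb1 - mb0
        if span_mb <= 0:
            continue
        rates.append((mb0, mb1, (cm1 - cm0) / span_mb))
    return rates


# Hypothetical chromosome: the middle interval mimics a low-recombination
# region, where a long physical span adds little genetic distance.
markers = [(0.0, 0.0), (2.0, 8.0), (10.0, 9.0), (12.0, 17.0)]
rates = recombination_rates(markers)
for start, end, rate in rates:
    print(f"{start:5.1f}-{end:5.1f} Mb: {rate:.3f} cM/Mb")
```

    A non-linear relationship between the two maps shows up here as interval rates that differ widely along the chromosome.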

  20. Deep brain stimulation, brain maps and personalized medicine: lessons from the human genome project.

    Science.gov (United States)

    Fins, Joseph J; Shapiro, Zachary E

    2014-01-01

    Although the appellation of personalized medicine is generally attributed to advanced therapeutics in molecular medicine, deep brain stimulation (DBS) can also be so categorized. Like its medical counterpart, DBS is a highly personalized intervention that needs to be tailored to a patient's individual anatomy. And because of this, DBS, like more conventional personalized medicine, can be highly specific where the object of care is an N = 1. But that is where the similarities end. Besides their differing medical and surgical provenances, these two varieties of personalized medicine have had strikingly different impacts. The molecular variant, though of a more recent vintage, has thrived and is experiencing explosive growth, while DBS still struggles to find a sustainable therapeutic niche. Despite its promise, and success as a vetted treatment for drug-resistant Parkinson's Disease, DBS has lagged in broadening its development, often encountering regulatory hurdles and financial barriers necessary to mount an adequate number of quality trials. In this paper we will consider why DBS, or better yet neuromodulation, has encountered these challenges and contrast this experience with the more successful advance of personalized medicine. We will suggest that personalized medicine and DBS's differential performance can be explained as a matter of timing and complexity. We believe that DBS has struggled because it has been a journey of scientific exploration conducted without a map. In contrast to molecular personalized medicine which followed the mapping of the human genome and the Human Genome Project, DBS preceded plans for the mapping of the human brain. We believe that this sequence has given personalized medicine a distinct advantage and that the fullest potential of DBS will be realized both as a cartographical or electrophysiological probe and as a modality of personalized medicine.

  1. The Integrated Genomic Architecture and Evolution of Dental Divergence in East African Cichlid Fishes (Haplochromis chilotes x H. nyererei)

    Directory of Open Access Journals (Sweden)

    C. Darrin Hulsey

    2017-09-01

    The independent evolution of the two toothed jaws of cichlid fishes is thought to have promoted their unparalleled ecological divergence and species richness. However, dental divergence in cichlids could exhibit substantial genetic covariance and this could dictate how traits like tooth numbers evolve in different African Lakes and on their two jaws. To test this hypothesis, we used a hybrid mapping cross of two trophically divergent Lake Victoria species (Haplochromis chilotes × Haplochromis nyererei) to examine genomic regions associated with cichlid tooth diversity. Surprisingly, a similar genomic region was found to be associated with oral jaw tooth numbers in cichlids from both Lake Malawi and Lake Victoria. Likewise, this same genomic location was associated with variation in pharyngeal jaw tooth numbers. Similar relationships between tooth numbers on the two jaws in both our Victoria hybrid population and across the phylogenetic diversity of Malawi cichlids additionally suggest that tooth numbers on the two jaws of haplochromine cichlids might generally coevolve owing to shared genetic underpinnings. Integrated, rather than independent, genomic architectures could be key to the incomparable evolutionary divergence and convergence in cichlid tooth numbers.

  2. INDIGO - INtegrated data warehouse of microbial genomes with examples from the red sea extremophiles.

    Science.gov (United States)

    Alam, Intikhab; Antunes, André; Kamau, Allan Anthony; Ba Alawi, Wail; Kalkatawi, Manal; Stingl, Ulrich; Bajic, Vladimir B

    2013-01-01

    The next generation sequencing technologies substantially increased the throughput of microbial genome sequencing. To functionally annotate newly sequenced microbial genomes, a variety of experimental and computational methods are used. Integration of information from different sources is a powerful approach to enhance such annotation. Functional analysis of microbial genomes, necessary for downstream experiments, crucially depends on this annotation but it is hampered by the current lack of suitable information integration and exploration systems for microbial genomes. We developed a data warehouse system (INDIGO) that enables the integration of annotations for exploration and analysis of newly sequenced microbial genomes. INDIGO offers an opportunity to construct complex queries and combine annotations from multiple sources starting from genomic sequence to protein domain, gene ontology and pathway levels. This data warehouse is aimed at being populated with information from genomes of pure cultures and uncultured single cells of Red Sea bacteria and Archaea. Currently, INDIGO contains information from Salinisphaera shabanensis, Haloplasma contractile, and Halorhabdus tiamatea - extremophiles isolated from deep-sea anoxic brine lakes of the Red Sea. We provide examples of utilizing the system to gain new insights into specific aspects on the unique lifestyle and adaptations of these organisms to extreme environments. We developed a data warehouse system, INDIGO, which enables comprehensive integration of information from various resources to be used for annotation, exploration and analysis of microbial genomes. It will be regularly updated and extended with new genomes. It is aimed to serve as a resource dedicated to the Red Sea microbes. In addition, through INDIGO, we provide our Automatic Annotation of Microbial Genomes (AAMG) pipeline. The INDIGO web server is freely available at http://www.cbrc.kaust.edu.sa/indigo.

  3. INDIGO - INtegrated data warehouse of microbial genomes with examples from the red sea extremophiles.

    Directory of Open Access Journals (Sweden)

    Intikhab Alam

    The next generation sequencing technologies substantially increased the throughput of microbial genome sequencing. To functionally annotate newly sequenced microbial genomes, a variety of experimental and computational methods are used. Integration of information from different sources is a powerful approach to enhance such annotation. Functional analysis of microbial genomes, necessary for downstream experiments, crucially depends on this annotation but it is hampered by the current lack of suitable information integration and exploration systems for microbial genomes. We developed a data warehouse system (INDIGO) that enables the integration of annotations for exploration and analysis of newly sequenced microbial genomes. INDIGO offers an opportunity to construct complex queries and combine annotations from multiple sources starting from genomic sequence to protein domain, gene ontology and pathway levels. This data warehouse is aimed at being populated with information from genomes of pure cultures and uncultured single cells of Red Sea bacteria and Archaea. Currently, INDIGO contains information from Salinisphaera shabanensis, Haloplasma contractile, and Halorhabdus tiamatea - extremophiles isolated from deep-sea anoxic brine lakes of the Red Sea. We provide examples of utilizing the system to gain new insights into specific aspects on the unique lifestyle and adaptations of these organisms to extreme environments. We developed a data warehouse system, INDIGO, which enables comprehensive integration of information from various resources to be used for annotation, exploration and analysis of microbial genomes. It will be regularly updated and extended with new genomes. It is aimed to serve as a resource dedicated to the Red Sea microbes. In addition, through INDIGO, we provide our Automatic Annotation of Microbial Genomes (AAMG) pipeline. The INDIGO web server is freely available at http://www.cbrc.kaust.edu.sa/indigo.

  4. Segmental allotetraploidy and allelic interactions in buffelgrass (Pennisetum ciliare (L.) Link syn. Cenchrus ciliaris L.) as revealed by genome mapping.

    Science.gov (United States)

    Jessup, R W; Burson, B L; Burow, O; Wang, Y W; Chang, C; Li, Z; Paterson, A H; Hussey, M A

    2003-04-01

    Linkage analyses increasingly complement cytological and traditional plant breeding techniques by providing valuable information regarding genome organization and transmission genetics of complex polyploid species. This study reports a genome map of buffelgrass (Pennisetum ciliare (L.) Link syn. Cenchrus ciliaris L.). Maternal and paternal maps were constructed with restriction fragment length polymorphisms (RFLPs) segregating in 87 F1 progeny from an intraspecific cross between two heterozygous genotypes. A survey of 862 heterologous cDNAs and gDNAs from across the Poaceae, as well as 443 buffelgrass cDNAs, yielded 100 and 360 polymorphic probes, respectively. The maternal map included 322 RFLPs, 47 linkage groups, and 3464 cM, whereas the paternal map contained 245 RFLPs, 42 linkage groups, and 2757 cM. Approximately 70 to 80% of the buffelgrass genome was covered, and the average marker spacing was 10.8 and 11.3 cM on the respective maps. Preferential pairing was indicated between many linkage groups, which supports cytological reports that buffelgrass is a segmental allotetraploid. More preferential pairing (disomy) was found in the maternal than paternal parent across linkage groups (55 vs. 38%) and loci (48 vs. 15%). Comparison of interval lengths in 15 allelic bridges indicated significantly less meiotic recombination in paternal gametes. Allelic interactions were detected in four regions of the maternal map and were absent in the paternal map.

  5. The master two-dimensional gel database of human AMA cell proteins: towards linking protein and genome sequence and mapping information (update 1991)

    DEFF Research Database (Denmark)

    Celis, J E; Leffers, H; Rasmussen, H H

    1991-01-01

    The master two-dimensional gel database of human AMA cells currently lists 3801 cellular and secreted proteins, of which 371 cellular polypeptides (306 IEF; 65 NEPHGE) were added to the master images during the last 10 months. These include: (i) very basic and acidic proteins that do not focus... autoantigens" and "cDNAs". For convenience we have included an alphabetical list of all known proteins recorded in this database. In the long run, the main goal of this database is to link protein and DNA sequencing and mapping information (Human Genome Program) and to provide an integrated picture...

  6. Symmetric integrable-polynomial factorization for symplectic one-turn-map tracking

    International Nuclear Information System (INIS)

    Shi, Jicong

    1993-01-01

    It was found that any homogeneous polynomial can be written as a sum of integrable polynomials of the same degree whose Lie transformations can be evaluated exactly. By utilizing symplectic integrators, an integrable-polynomial factorization is developed to convert a symplectic map in the form of Dragt-Finn factorization into a product of Lie transformations associated with integrable polynomials. A small number of factorization bases of integrable polynomials enables one to use high-order symplectic integrators so that the high-order spurious terms can be greatly suppressed. A symplectic map can thus be evaluated with the desired accuracy.

  7. Creation of BAC genomic resources for cocoa (Theobroma cacao L.) for physical mapping of RGA containing BAC clones.

    Science.gov (United States)

    Clément, D; Lanaud, C; Sabau, X; Fouet, O; Le Cunff, L; Ruiz, E; Risterucci, A M; Glaszmann, J C; Piffanelli, P

    2004-05-01

    We have constructed and validated the first cocoa (Theobroma cacao L.) BAC library, with the aim of developing molecular resources to study the structure and evolution of the genome of this perennial crop. This library contains 36,864 clones with an average insert size of 120 kb, representing approximately ten haploid genome equivalents. It was constructed from the genotype Scavina-6 (Sca-6), a Forastero clone highly resistant to cocoa pathogens and a parent of existing mapping populations. Validation of the BAC library was carried out with a set of 13 genetically anchored single-copy markers and one duplicated marker. An average of nine BAC clones per probe was identified, giving an initial experimental estimation of the genome coverage represented in the library. Screening of the library with a set of resistance gene analogues (RGAs), previously mapped in cocoa and co-localizing with QTL for resistance to Phytophthora traits, confirmed at the physical level the tight clustering of RGAs in the cocoa genome and provided the first insights into the relationships between genetic and physical distances in the cocoa genome. This library represents an available BAC resource for structural genomic studies or map-based cloning of genes corresponding to important QTLs for agronomic traits such as resistance genes to major cocoa pathogens like Phytophthora spp. (palmivora and megakarya), Crinipellis perniciosa and Moniliophthora roreri.
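
    The coverage figures above can be checked with simple arithmetic. The sketch below multiplies clone count by average insert size to get the total cloned DNA, then divides by the stated "approximately ten" genome equivalents to obtain the haploid genome size those figures imply. The abstract does not state a genome size, so that value is derived here rather than quoted. The last step uses the standard Clarke-Carbon estimate for the chance that a given locus is present in a random library; this is a textbook formula added for illustration, not a calculation reported by the authors.

```python
N_CLONES = 36_864          # clones in the library (from the abstract)
INSERT_KB = 120            # average insert size in kb (from the abstract)
GENOME_EQUIVALENTS = 10    # "approximately ten" (from the abstract)

# Total cloned DNA in the library.
total_kb = N_CLONES * INSERT_KB

# Haploid genome size implied by the stated coverage (derived, not quoted).
implied_genome_mb = total_kb / GENOME_EQUIVALENTS / 1000

# Clarke-Carbon estimate: probability that a given locus is present in at
# least one clone, assuming random, unbiased cloning.
fraction_per_clone = INSERT_KB / (implied_genome_mb * 1000)
p_locus_present = 1 - (1 - fraction_per_clone) ** N_CLONES

print(f"total cloned DNA: {total_kb:,} kb")
print(f"implied genome size: {implied_genome_mb:.1f} Mb")
print(f"P(locus present): {p_locus_present:.5f}")
```

    Under these assumptions a single-copy probe would be expected to hit about ten clones on average, which is in line with the nine clones per probe observed experimentally.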

  8. A Bac Library and Paired-PCR Approach to Mapping and Completing the Genome Sequence of Sulfolobus Solfataricus P2

    DEFF Research Database (Denmark)

    She, Qunxin; Confalonieri, F.; Zivanovic, Y.

    2000-01-01

    The original strategy used in the Sulfolobus solfataricus genome project was to sequence non-overlapping, or minimally overlapping, cosmid or lambda inserts without constructing a physical map. However, after only about two thirds of the genome sequence was completed, this approach became counter-productive because there was a high sequence bias in the cosmid and lambda libraries. Therefore, a new approach was devised for linking the sequenced regions which may be generally applicable. BAC libraries were constructed and terminal sequences of the clones were determined and used for both end mapping and PCR...

  9. Dissecting genomic hotspots underlying seed protein, oil, and sucrose content in an interspecific mapping population of soybean using high-density linkage mapping.

    Science.gov (United States)

    Patil, Gunvant; Vuong, Tri D; Kale, Sandip; Valliyodan, Babu; Deshmukh, Rupesh; Zhu, Chengsong; Wu, Xiaolei; Bai, Yonghe; Yungbluth, Dennis; Lu, Fang; Kumpatla, Siva; Grover Shannon, J; Varshney, Rajeev K; Nguyen, Henry T

    2018-04-04

    The cultivated [Glycine max (L) Merr.] and wild [Glycine soja Siebold & Zucc.] soybean species comprise wide variation in seed composition traits. Compared to wild soybean, cultivated soybean contains low protein, high oil and high sucrose. In this study, an inter-specific population was derived from a cross between G. max (Williams 82) and G. soja (PI 483460B). This recombinant inbred line (RIL) population of 188 lines was sequenced at 0.3x depth. Based on 91,342 single nucleotide polymorphisms (SNPs), recombination events in RILs were defined, and a high-resolution bin map was developed (4,070 bins). In addition to bin mapping, QTL analysis for protein, oil and sucrose was performed using 3,343 polymorphic SNPs (3K-SNP), derived from the Illumina Infinium BeadChip sequencing platform. The QTL regions from both platforms were compared and a significant concordance was observed between bin and 3K-SNP markers. Importantly, the bin map derived from next generation sequencing technology enhanced mapping resolution (from 1325 kb to 50 kb). A total of 5, 9 and 4 QTLs were identified for protein, oil and sucrose content, respectively, and some of the QTLs coincided with soybean domestication related genomic loci. The major QTL for protein and oil was mapped on Chr. 20 (qPro_20) and suggested a negative correlation between oil and protein. In terms of sucrose content, a novel and major QTL was identified on Chr. 8 (qSuc_08) and harbors putative genes involved in sugar transport. In addition, a genome-wide association study (GWAS) using 91,342 SNPs confirmed the genomic loci derived from QTL mapping. A QTL based haplotype using whole genome resequencing of 106 diverse soybean lines identified unique allelic variation in wild soybean that could be utilized to widen the genetic base in cultivated soybean. This article is protected by copyright. All rights reserved.

  10. Modeling the integration of bacterial rRNA fragments into the human cancer genome.

    Science.gov (United States)

    Sieber, Karsten B; Gajer, Pawel; Dunning Hotopp, Julie C

    2016-03-21

    Cancer is a disease driven by the accumulation of genomic alterations, including the integration of exogenous DNA into the human somatic genome. We previously identified in silico evidence of DNA fragments from a Pseudomonas-like bacterium integrating into the 5'-UTR of four proto-oncogenes in stomach cancer sequencing data. The functional and biological consequences of these bacterial DNA integrations remain unknown. Modeling of these integrations suggests that the previously identified sequences cover most of the sequence flanking the junction between the bacterial and human DNA. Further examination of these reads reveals that these integrations are rich in guanine nucleotides and the integrated bacterial DNA may have complex transcript secondary structures. The models presented here lay the foundation for future experiments to test if bacterial DNA integrations alter the transcription of the human genes.

  11. Split photosystem protein, linear-mapping topology, and growth of structural complexity in the plastid genome of Chromera velia

    KAUST Repository

    Janouškovec, Jan

    2013-08-22

    The canonical photosynthetic plastid genomes consist of a single circular-mapping chromosome that encodes a highly conserved protein core, involved in photosynthesis and ATP generation. Here, we demonstrate that the plastid genome of the photosynthetic relative of apicomplexans, Chromera velia, departs from this view in several unique ways. Core photosynthesis proteins PsaA and AtpB have been broken into two fragments, which we show are independently transcribed, oligoU-tailed, translated, and assembled into functional photosystem I and ATP synthase complexes. Genome-wide transcription profiles support expression of many other highly modified proteins, including several that contain extensions amounting to hundreds of amino acids in length. Canonical gene clusters and operons have been fragmented and reshuffled into novel putative transcriptional units. Massive genomic coverage by paired-end reads, coupled with pulsed-field gel electrophoresis and polymerase chain reaction, consistently indicates that the C. velia plastid genome is linear-mapping, a unique state among all plastids. Abundant intragenomic duplication probably mediated by recombination can explain protein splits, extensions, and genome linearization and is perhaps the key driving force behind the many features that defy the conventional ways of plastid genome architecture and function. © The Author 2013.

  12. Integrating genomic information with protein sequence and 3D atomic level structure at the RCSB protein data bank.

    Science.gov (United States)

    Prlic, Andreas; Kalro, Tara; Bhattacharya, Roshni; Christie, Cole; Burley, Stephen K; Rose, Peter W

    2016-12-15

    The Protein Data Bank (PDB) now contains more than 120,000 three-dimensional (3D) structures of biological macromolecules. To allow an interpretation of how PDB data relates to other publicly available annotations, we developed a novel data integration platform that maps 3D structural information across various datasets. This integration bridges from the human genome across protein sequence to 3D structure space. We developed novel software solutions for data management and visualization, while incorporating new libraries for web-based visualization using SVG graphics. The new views are available from http://www.rcsb.org and software is available from https://github.com/rcsb/. Contact: andreas.prlic@rcsb.org. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.

  13. Integrated genomics of Mucorales reveals novel therapeutic targets

    Science.gov (United States)

    Mucormycosis is a life-threatening infection caused by Mucorales fungi. We sequenced 30 fungal genomes and performed transcriptomics with three representative Rhizopus and Mucor strains with human airway epithelial cells during fungal invasion to reveal key host and fungal determinants contributing ...

  14. Integrated genome-based studies of Shewanella Ecophysiology

    Energy Technology Data Exchange (ETDEWEB)

    Tiedje, James M. [Michigan State Univ., East Lansing, MI (United States)]; Konstantinidis, Kostas [Michigan State Univ., East Lansing, MI (United States)]; Worden, Mark [Michigan State Univ., East Lansing, MI (United States)]

    2014-01-08

    The aim of the work reported is to study Shewanella population genomics, and to understand the evolution, ecophysiology, and speciation of Shewanella. The tasks supporting this aim are: to study genetic and ecophysiological bases defining the core and diversification of Shewanella species; to determine gene content patterns along redox gradients; and to investigate the evolutionary processes, patterns and mechanisms of Shewanella.

  15. Genome-wide survey of recurrent HBV integration in hepatocellular carcinoma

    DEFF Research Database (Denmark)

    Sung, Wing-Kin; Zheng, Hancheng; Li, Shuyu

    2012-01-01

    To survey hepatitis B virus (HBV) integration in liver cancer genomes, we conducted massively parallel sequencing of 81 HBV-positive and 7 HBV-negative hepatocellular carcinomas (HCCs) and adjacent normal tissues. We found that HBV integration is observed more frequently in the tumors (86.4%) than...

  16. A Developmental Mapping Program Integrating Geography and Mathematics.

    Science.gov (United States)

    Muir, Sharon Pray; Cheek, Helen Neely

    Presented and discussed is a model which can be used by educators who want to develop an interdisciplinary map skills program in geography and mathematics. The model assumes that most children in elementary schools perform cognitively at Piaget's concrete operational stage, that readiness for map skills can be assessed with Piagetian or…

  17. GMATA: An Integrated Software Package for Genome-Scale SSR Mining, Marker Development and Viewing.

    Science.gov (United States)

    Wang, Xuewen; Wang, Le

    2016-01-01

    Simple sequence repeats (SSRs), also referred to as microsatellites, are highly variable tandem DNAs that are widely used as genetic markers. The increasing availability of whole-genome and transcript sequences provides information resources for SSR marker development. However, efficient software is required to identify and display SSR information along with other gene features at a genome scale. We developed a novel software package, Genome-wide Microsatellite Analyzing Tool Package (GMATA), that integrates SSR mining, statistical analysis and plotting, marker design, polymorphism screening and marker transferability, and enables simultaneous display of SSR markers with other genome features. GMATA applies novel strategies for SSR analysis and primer design in large genomes, which allow GMATA to perform faster calculations and provide more accurate results than existing tools. Our package is also capable of processing DNA sequences of any size on a standard computer. GMATA is user friendly, requiring only mouse clicks or typed inputs on the command line, and is executable on multiple computing platforms. We demonstrated the application of GMATA in plant genomes and revealed a novel distribution pattern of SSRs in 15 grass genomes. The most abundant motifs are dimer GA/TC, the A/T monomer and the GCG/CGC trimer, rather than the rich G/C content in DNA sequence. We also revealed that SSR count is linearly related to the chromosome length in fully assembled grass genomes. GMATA represents a powerful application tool that facilitates genomic sequence analyses. GMATA is freely available at http://sourceforge.net/projects/gmata/?source=navbar.
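
    To make the idea of SSR mining concrete, here is a minimal sketch that finds perfect tandem repeats with a regular expression. It is not GMATA's algorithm or interface; the function name, the thresholds (motifs of 1 to 6 bp repeated at least 5 times) and the test sequences are assumptions chosen for illustration.

```python
import re


def find_ssrs(seq, min_motif=1, max_motif=6, min_repeats=5):
    """Return (start, motif, repeat_count) for each perfect tandem repeat.

    The lazy quantifier makes the regex try the shortest motif first, so
    a run of A's is reported as the monomer "A" rather than as "AA".
    The back-reference \\1 then requires further copies of that motif.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_motif, max_motif, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits


# Hypothetical sequences: a GA dimer repeat and a poly-A monomer run.
print(find_ssrs("TTGAGAGAGAGAGACC"))
print(find_ssrs("acgtAAAAAAAcg"))
```

    A regex scan like this handles only perfect repeats and reports positions as 0-based offsets; imperfect or compound SSRs would need a different approach.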

  18. Group sparse canonical correlation analysis for genomic data integration.

    Science.gov (United States)

    Lin, Dongdong; Zhang, Jigang; Li, Jingyao; Calhoun, Vince D; Deng, Hong-Wen; Wang, Yu-Ping

    2013-08-12

    The emergence of high-throughput genomic datasets from different sources and platforms (e.g., gene expression, single nucleotide polymorphisms (SNP), and copy number variation (CNV)) has greatly enhanced our understandings of the interplay of these genomic factors as well as their influences on the complex diseases. It is challenging to explore the relationship between these different types of genomic data sets. In this paper, we focus on a multivariate statistical method, canonical correlation analysis (CCA) method for this problem. Conventional CCA method does not work effectively if the number of data samples is significantly less than that of biomarkers, which is a typical case for genomic data (e.g., SNPs). Sparse CCA (sCCA) methods were introduced to overcome such difficulty, mostly using penalizations with l-1 norm (CCA-l1) or the combination of l-1 and l-2 norm (CCA-elastic net). However, they overlook the structural or group effect within genomic data in the analysis, which often exist and are important (e.g., SNPs spanning a gene interact and work together as a group). We propose a new group sparse CCA method (CCA-sparse group) along with an effective numerical algorithm to study the mutual relationship between two different types of genomic data (i.e., SNP and gene expression). We then extend the model to a more general formulation that can include the existing sCCA models. We apply the model to feature/variable selection from two data sets and compare our group sparse CCA method with existing sCCA methods on both simulation and two real datasets (human gliomas data and NCI60 data). We use a graphical representation of the samples with a pair of canonical variates to demonstrate the discriminating characteristic of the selected features. Pathway analysis is further performed for biological interpretation of those features. The CCA-sparse group method incorporates group effects of features into the correlation analysis while performs individual feature
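
    For orientation, one common way to write a CCA problem with both group-level and feature-level sparsity is sketched below. This is a generic formulation under stated assumptions, not necessarily the exact model of the paper: X and Y are the two centered data matrices, u and v are the canonical loading vectors, u_g and v_h are the sub-vectors for feature groups g and h, and the tuning parameters and group weights (lambda, tau, omega, mu) are all notation introduced here.

```latex
\max_{u,\,v}\;
  u^{\top} X^{\top} Y\, v
  \;-\; \lambda_1 \sum_{g} \omega_g \lVert u_g \rVert_2
  \;-\; \tau_1 \lVert u \rVert_1
  \;-\; \lambda_2 \sum_{h} \mu_h \lVert v_h \rVert_2
  \;-\; \tau_2 \lVert v \rVert_1
\quad \text{subject to} \quad
  \lVert u \rVert_2^2 \le 1,\;
  \lVert v \rVert_2^2 \le 1 .
```

    In this sketch the group terms can switch off whole groups of features, while the l-1 terms select individual features; setting both lambda values to zero leaves an l-1-penalized sparse CCA.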

  19. Childhood Acute Lymphoblastic Leukemia: Integrating Genomics into Therapy

    Science.gov (United States)

    Tasian, Sarah K; Loh, Mignon L; Hunger, Stephen P

    2015-01-01

    Acute lymphoblastic leukemia (ALL), the most common malignancy of childhood, is a genetically complex entity that remains a major cause of childhood cancer-related mortality. Major advances in genomic and epigenomic profiling during the past decade have appreciably enhanced knowledge of the biology of de novo and relapsed ALL and have facilitated more precise risk stratification of patients. These achievements have also provided critical insights regarding potentially targetable lesions for development of new therapeutic approaches in the era of precision medicine. This review delineates the current genetic landscape of childhood ALL with emphasis upon patient outcomes with contemporary treatment regimens, as well as therapeutic implications of newly identified genomic alterations in specific subsets of ALL. PMID:26194091

  20. Site-Specific Integration of Exogenous Genes Using Genome Editing Technologies in Zebrafish

    Directory of Open Access Journals (Sweden)

    Atsuo Kawahara

    2016-05-01

    The zebrafish (Danio rerio) is an ideal vertebrate model to investigate the developmental molecular mechanism of organogenesis and regeneration. Recent innovations in genome editing technologies, such as zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs) and the clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR associated protein 9 (Cas9) system, have allowed researchers to generate diverse genomic modifications in whole animals and in cultured cells. The CRISPR/Cas9 and TALEN techniques frequently induce DNA double-strand breaks (DSBs) at the targeted gene, resulting in frameshift-mediated gene disruption. As a useful application of genome editing technology, several groups have recently reported efficient site-specific integration of exogenous genes into targeted genomic loci. In this review, we provide an overview of TALEN- and CRISPR/Cas9-mediated site-specific integration of exogenous genes in zebrafish.

  1. International regulatory landscape and integration of corrective genome editing into in vitro fertilization.

    Science.gov (United States)

    Araki, Motoko; Ishii, Tetsuya

    2014-11-24

    Genome editing technology, including zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), and clustered regularly interspaced short palindromic repeat (CRISPR)/Cas, has enabled far more efficient genetic engineering even in non-human primates. This biotechnology is more likely to develop into medicine for preventing a genetic disease if corrective genome editing is integrated into assisted reproductive technology, represented by in vitro fertilization. Although rapid advances in genome editing are expected to make germline gene correction feasible in a clinical setting, there are many issues that still need to be addressed before this could occur. We herein examine the current status of genome editing in mammalian embryonic stem cells and zygotes and discuss potential issues in the international regulatory landscape regarding human germline gene modification. Moreover, we address some ethical and social issues that would be raised when each country considers whether genome editing-mediated germline gene correction for preventive medicine should be permitted.

  2. A differential algebraic integration algorithm for symplectic mappings in systems with three-dimensional magnetic field

    International Nuclear Information System (INIS)

    Chang, P.; Lee, S.Y.; Yan, Y.T.

    2006-01-01

    A differential algebraic integration algorithm is developed for symplectic mapping through a three-dimensional (3-D) magnetic field. The self-consistent reference orbit in phase space is obtained by making a canonical transformation to eliminate the linear part of the Hamiltonian. Transfer maps from the entrance to the exit of any 3-D magnetic field are then obtained through slice-by-slice symplectic integration. The particle phase-space coordinates are advanced by using the integrable polynomial procedure. This algorithm is a powerful tool for obtaining nonlinear maps for insertion devices in synchrotron light sources or for complicated magnetic fields in the interaction regions of high-energy colliders.

  3. A Differential Algebraic Integration Algorithm for Symplectic Mappings in Systems with Three-Dimensional Magnetic Field

    International Nuclear Information System (INIS)

    Chang, P

    2004-01-01

    A differential algebraic integration algorithm is developed for symplectic mapping through a three-dimensional (3-D) magnetic field. The self-consistent reference orbit in phase space is obtained by making a canonical transformation to eliminate the linear part of the Hamiltonian. Transfer maps from the entrance to the exit of any 3-D magnetic field are then obtained through slice-by-slice symplectic integration. The particle phase-space coordinates are advanced by using the integrable polynomial procedure. This algorithm is a powerful tool for obtaining nonlinear maps for insertion devices in synchrotron light sources or for complicated magnetic fields in the interaction regions of high-energy colliders.

  4. An integrative and applicable phylogenetic footprinting framework for cis-regulatory motifs identification in prokaryotic genomes.

    Science.gov (United States)

    Liu, Bingqiang; Zhang, Hanyuan; Zhou, Chuan; Li, Guojun; Fennell, Anne; Wang, Guanghui; Kang, Yu; Liu, Qi; Ma, Qin

    2016-08-09

    Phylogenetic footprinting is an important computational technique for identifying cis-regulatory motifs in orthologous regulatory regions from multiple genomes, as motifs tend to evolve more slowly than their surrounding non-functional sequences. Its application, however, faces several difficulties in optimizing the selection of orthologous data and reducing the false positives in motif prediction. Here we present an integrative phylogenetic footprinting framework for accurate motif predictions in prokaryotic genomes (MP(3)). The framework includes a new orthologous data preparation procedure, an additional promoter scoring and pruning method and an integration of six existing motif finding algorithms as basic motif search engines. Specifically, we collected orthologous genes from available prokaryotic genomes and built the orthologous regulatory regions based on sequence similarity of promoter regions. This procedure made full use of the large-scale genomic data and taxonomy information and filtered out the promoters with limited contribution to produce a high quality orthologous promoter set. The promoter scoring and pruning is implemented through motif voting by a set of complementary predicting tools that mine as many motif candidates as possible and simultaneously eliminate the effect of random noise. We have applied the framework to the Escherichia coli K12 genome and evaluated the prediction performance through comparison with seven existing programs. This evaluation was systematically carried out at the nucleotide and binding site level, and the results showed that MP(3) consistently outperformed other popular motif finding tools. We have integrated MP(3) into our motif identification and analysis server DMINDA, allowing users to efficiently identify and analyze motifs in 2,072 completely sequenced prokaryotic genomes. The performance evaluation indicated that MP(3) is effective for predicting regulatory motifs in prokaryotic genomes. Its application may enhance

  5. Integration sites of Epstein-Barr virus genome on chromosomes of human lymphoblastoid cell lines

    Energy Technology Data Exchange (ETDEWEB)

    Wuu, K.D.; Chen, Y.J.; Wang-Wuu, S. [Institute of Genetics, Taipei (Taiwan, Province of China)]

    1994-09-01

    Epstein-Barr virus (EBV) is the pathogen of infectious mononucleosis. The viral genome is present in more than 95% of the African cases of Burkitt lymphoma and it is usually maintained in episomal form in the tumor cells. Viral integration has been described only for Namalwa, which is a Burkitt lymphoma cell line lacking episomes. In order to examine the role of EBV in the immortalization of human B lymphocytes, we investigated whether EBV integration into the human genome is essential. If integration does occur, we would like to know whether it is randomly distributed or whether the viral DNA integrates preferentially at certain sites. Fourteen in vitro immortalized human lymphoblastoid cell lines (LCLs) were examined by fluorescence in situ hybridization (FISH) with a biotinylated EBV BamHI W DNA fragment as probe. The episomal form of EBV DNA was found in all cells of these cell lines, while only about 65% of the cells had integrated viral DNA. This suggests that integration is not a prerequisite for cell immortalization. Although all chromosomes, except Y, were found to carry integrated viral genome, chromosomes 1 and 5 were the most frequent EBV DNA carriers (p<0.05). Nine chromosome bands, namely 1p31, 1q31, 2q32, 3q13, 3q26, 5q14, 6q24, 7q31 and 12q21, are preferential targets for EBV integration (p<0.001). Eighty percent of the total 938 EBV hybridization signals were found in G-band-positive areas. This suggests that the mechanism of EBV integration might be different from that of the retroviruses, which specifically integrate into G-band-negative areas. Thus, we conclude that the integration of EBV into the host genome is non-random and may be related to chromosome structure and DNA sequence.

  6. Figure 4 from Integrative Genomics Viewer: Visualizing Big Data | Office of Cancer Genomics

    Science.gov (United States)

    Gene-list view of genomic data. The gene-list view allows users to compare data across a set of loci. The data in this figure includes copy number, mutation, and clinical data from 202 glioblastoma samples from TCGA. Adapted from Figure 7; Thorvaldsdottir H et al. 2012

  7. Figure 2 from Integrative Genomics Viewer: Visualizing Big Data | Office of Cancer Genomics

    Science.gov (United States)

    Grouping and sorting genomic data in IGV. The IGV user interface displaying 202 glioblastoma samples from TCGA. Samples are grouped by tumor subtype (second annotation column) and data type (first annotation column) and sorted by copy number of the EGFR locus (middle column). Adapted from Figure 1; Robinson et al. 2011

  8. High Resolution Typing by Whole Genome Mapping Enables Discrimination of LA-MRSA (CC398) Strains and Identification of Transmission Events

    Science.gov (United States)

    Bosch, Thijs; Verkade, Erwin; van Luit, Martijn; Pot, Bruno; Vauterin, Paul; Burggrave, Ronald; Savelkoul, Paul; Kluytmans, Jan; Schouls, Leo

    2013-01-01

    After its emergence in 2003, a livestock-associated (LA-)MRSA clade (CC398) has caused an impressive increase in the number of isolates submitted for the Dutch national MRSA surveillance and now comprises 40% of all isolates. The currently used molecular typing techniques have limited discriminatory power for this MRSA clade, which hampers studies on the origin and transmission routes. Recently, a new molecular analysis technique named whole genome mapping was introduced. This method creates high-resolution, ordered whole genome restriction maps that may have potential for strain typing. In this study, we assessed and validated the capability of whole genome mapping to differentiate LA-MRSA isolates. Multiple validation experiments showed that whole genome mapping produced highly reproducible results. Assessment of the technique on two well-documented MRSA outbreaks showed that whole genome mapping was able to confirm one outbreak, but revealed major differences between the maps of a second, indicating that not all isolates belonged to this outbreak. Whole genome mapping of LA-MRSA isolates that were epidemiologically unlinked provided a much higher discriminatory power than spa-typing or MLVA. In contrast, maps created from LA-MRSA isolates obtained during a proven LA-MRSA outbreak were nearly indistinguishable, showing that transmission of LA-MRSA can be detected by whole genome mapping. Finally, whole genome maps of LA-MRSA isolates originating from two unrelated veterinarians and their household members showed that veterinarians may carry and transmit different LA-MRSA strains at the same time. No such conclusions could be drawn based on spa-typing and MLVA. Although PFGE seems to be suitable for molecular typing of LA-MRSA, WGM provides a much higher discriminatory power. Furthermore, whole genome mapping can provide a comparison with other maps within 2 days after the bacterial culture is received, making it suitable to investigate transmission events and

  9. Tomato genome mapping by fluorescence in situ hybridisation = Kartering van het tomatengenoom met behulp van fluorescentie in situ hybridisatie

    NARCIS (Netherlands)

    Zhong, X.B.

    1998-01-01

    The general introduction reviews the progress in tomato genome mapping using classical genetics, cytogenetics, and molecular genetics, emphasising the great potential of fluorescence in situ hybridisation (FISH) techniques.

    Chapter 2 describes how to

  10. Genome-Wide Mapping of Structural Variations Reveals a Copy Number Variant That Determines Reproductive Morphology in Cucumber

    NARCIS (Netherlands)

    Zhang, Z.; Mao, L.; Chen, Junshi; Bu, F.; Li, G.; Sun, J.; Li, S.; Sun, H.; Jiao, C.; Blakely, R.; Pan, J.; Cai, R.; Luo, R.; Peer, Van de Y.; Jacobsen, E.; Fei, Z.; Huang, S.

    2015-01-01

    Structural variations (SVs) represent a major source of genetic diversity. However, the functional impact and formation mechanisms of SVs in plant genomes remain largely unexplored. Here, we report a nucleotide-resolution SV map of cucumber (Cucumis sativus) that comprises 26,788 SVs based on deep

  11. Construction of high resolution genetic linkage maps to improve the soybean genome sequence assembly Glyma1.01

    Science.gov (United States)

    A landmark in soybean research, Glyma1.01, the first whole genome sequence of variety Williams 82 (Glycine max L. Merr.) was completed in 2010 and is widely used. However, because the assembly was primarily built based on the linkage maps constructed with a limited number of markers and recombinant...

  12. Clustered deep shadow maps for integrated polyhedral and volume rendering

    KAUST Repository

    Bornik, Alexander; Knecht, Wolfgang; Hadwiger, Markus; Schmalstieg, Dieter

    2012-01-01

    This paper presents a hardware-accelerated approach for shadow computation in scenes containing both complex volumetric objects and polyhedral models. Our system is the first hardware accelerated complete implementation of deep shadow maps, which

  13. The Genetics of Winterhardiness in Barley: Perspectives from Genome-Wide Association Mapping

    Directory of Open Access Journals (Sweden)

    Jarislav von Zitzewitz

    2011-03-01

    Full Text Available Winterhardiness is a complex trait that involves low temperature tolerance (LTT), vernalization sensitivity, and photoperiod sensitivity. Quantitative trait loci (QTL) for these traits were first identified using biparental mapping populations; candidate genes for all loci have since been identified and characterized. In this research we used a set of 148 accessions consisting of advanced breeding lines from the Oregon barley (Hordeum vulgare L. subsp. vulgare) breeding program and selected cultivars that were extensively phenotyped and genotyped with single nucleotide polymorphisms. Using these data for genome-wide association mapping we detected the same QTL and genes that have been systematically characterized using biparental populations over nearly two decades of intensive research. In this sample of germplasm, maximum LTT can be achieved with facultative growth habit, which can be predicted using a three-locus haplotype involving [...], [...], and [...]. The [...] and [...] LTT QTL explained 25% of the phenotypic variation, offering the prospect that additional gains from selection can be achieved once favorable alleles are fixed at these loci.

  14. Imaginal discs--a new source of chromosomes for genome mapping of the yellow fever mosquito Aedes aegypti.

    Directory of Open Access Journals (Sweden)

    Maria V Sharakhova

    2011-10-01

    Full Text Available The mosquito Aedes aegypti is the primary global vector for dengue and yellow fever viruses. Sequencing of the Ae. aegypti genome has stimulated research in vector biology and insect genomics. However, the current genome assembly is highly fragmented with only ~31% of the genome being assigned to chromosomes. A lack of a reliable source of chromosomes for physical mapping has been a major impediment to improving the genome assembly of Ae. aegypti. In this study we demonstrate the utility of mitotic chromosomes from imaginal discs of 4th instar larvae for cytogenetic studies of Ae. aegypti. High numbers of mitotic divisions on each slide preparation, large sizes, and reproducible banding patterns of the individual chromosomes simplify cytogenetic procedures. Based on the banding structure of the chromosomes, we have developed idiograms for each of the three Ae. aegypti chromosomes and placed 10 BAC clones and an 18S rDNA probe to precise chromosomal positions. The study identified imaginal discs of 4th instar larvae as a superior source of mitotic chromosomes for Ae. aegypti. The proposed approach allows precise mapping of DNA probes to the chromosomal positions and can be utilized for obtaining a high-quality genome assembly of the yellow fever mosquito.

  15. Mapping of genomic EGFRvIII deletions in glioblastoma: insight into rearrangement mechanisms and biomarker development.

    Science.gov (United States)

    Koga, Tomoyuki; Li, Bin; Figueroa, Javier M; Ren, Bing; Chen, Clark C; Carter, Bob S; Furnari, Frank B

    2018-04-12

    Epidermal growth factor receptor (EGFR) variant III (vIII) is the most common oncogenic rearrangement in glioblastoma (GBM) generated by deletion of exons two to seven of EGFR. The proximal breakpoints occur in variable positions within the 123-kb intron one, presenting significant challenges in terms of PCR-based mapping. Molecular mechanisms underlying these deletions remain unclear. We determined the presence of EGFRvIII and its breakpoints for 29 GBM samples using quantitative polymerase chain reaction (qPCR), arrayed PCR mapping, Sanger sequencing, and whole genome sequencing (WGS). Patient-specific breakpoint PCR was performed on tumors, plasma and cerebrospinal fluid (CSF) samples. The breakpoint sequences and single nucleotide polymorphisms (SNPs) were analyzed to elucidate the underlying biogenic mechanism. PCR mapping and WGS independently unveiled eight EGFRvIII breakpoints in six tumors. Patient-specific primers yielded EGFRvIII PCR amplicons in matched tumors, and in cell-free DNA (cfDNA) from a CSF sample, but not in cfDNA or extracellular-vesicle DNA from plasma. The breakpoint analysis revealed nucleotide insertions in four, an insertion of a region outside of EGFR locus in one, microhomologies in three, as well as a duplication or an inversion accompanied by microhomologies in two, suggestive of distinct DNA repair mechanisms. In the GBM samples that harbored distinct breakpoints, the SNP compositions of EGFRvIII and amplified non-vIII EGFR were identical, suggesting that these rearrangements arose from amplified non-vIII EGFR. Our approach efficiently "fingerprints" each sample's EGFRvIII breakpoints. Breakpoint sequence analyses suggest that independent breakpoints arose from precursor amplified non-vIII EGFR through different DNA repair mechanisms.

  16. Genome-wide association mapping of root traits in a japonica rice panel.

    Directory of Open Access Journals (Sweden)

    Brigitte Courtois

    Full Text Available Rice is a crop prone to drought stress in upland and rainfed lowland ecosystems. A deep root system is recognized as the best drought avoidance mechanism. Genome-wide association mapping offers higher resolution for locating quantitative trait loci (QTLs) than QTL mapping in biparental populations. We performed an association mapping study for root traits using a panel of 167 japonica accessions, mostly of tropical origin. The panel was genotyped at an average density of one marker per 22.5 kb using genotyping by sequencing technology. The linkage disequilibrium in the panel was high (r² > 0.6, on average, for 20 kb mean distances between markers). The plants were grown in transparent 50 cm × 20 cm × 2 cm Plexiglas nailboard sandwiches filled with 1.5 mm glass beads through which a nutrient solution was circulated. Root system architecture and biomass traits were measured in 30-day-old plants. The panel showed a moderate to high diversity in the various traits, particularly for deep (below 30 cm depth) root mass and the number of deep roots. Association analyses were conducted using a mixed model involving both population structure and kinship to control for false positives. Nineteen associations were significant at P<1e-05, and 78 were significant at P<1e-04. The greatest numbers of significant associations were detected for deep root mass and the number of deep roots, whereas no significant associations were found for total root biomass or deep root proportion. Because several QTLs for different traits were co-localized, 51 unique loci were detected; several co-localized with meta-QTLs for root traits, but none co-localized with rice genes known to be involved in root growth. Several likely candidate genes were found in close proximity to these loci. Additional work is necessary to assess whether these markers are relevant in other backgrounds and whether the genes identified are robust candidates.
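
    The linkage disequilibrium figure quoted in this abstract is the pairwise r² statistic. As a minimal sketch of how such a value could be computed for two biallelic markers, the function below assumes phased haplotypes coded as 0/1 allele pairs (a reasonable simplification for largely homozygous inbred accessions); the function name and the toy data are illustrative assumptions and are not taken from the study.

```python
def ld_r2(haplotypes):
    """Pairwise LD (r^2) between two biallelic loci.

    haplotypes: list of (a, b) tuples, each allele coded 0 or 1.
    This coding is an assumption made for illustration only.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n          # frequency of allele 1 at locus A
    p_b = sum(b for _, b in haplotypes) / n          # frequency of allele 1 at locus B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                             # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when a locus is monomorphic")
    return d * d / denom


# Toy data: perfectly associated markers versus independent markers.
perfect = [(0, 0), (1, 1), (0, 0), (1, 1)]
independent = [(0, 0), (0, 1), (1, 0), (1, 1)]
print(ld_r2(perfect), ld_r2(independent))
```

    In a real panel the same calculation would be repeated over marker pairs binned by physical distance, which is how an average r² at a given spacing is usually summarized.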

  17. Harnessing the sorghum genome sequence: development of a genome-wide microsatellite (SSR) resource for swift genetic mapping and map based cloning in sorghum

    Science.gov (United States)

    Sorghum is the second cereal crop to have a full genome completely sequenced (Nature (2009), 457:551). This achievement is widely recognized as a scientific milestone for grass genetics and genomics in general. However, the true worth of genetic information lies in translating the sequence informa...

  18. A DNMT3A2-HDAC2 Complex Is Essential for Genomic Imprinting and Genome Integrity in Mouse Oocytes

    Directory of Open Access Journals (Sweden)

    Pengpeng Ma

    2015-11-01

    Full Text Available Maternal genomic imprints are established during oogenesis. Histone deacetylases (HDACs) 1 and 2 are required for oocyte development in mouse, but their role in genomic imprinting is unknown. We find that Hdac1:Hdac2−/− double-mutant growing oocytes exhibit global DNA hypomethylation and fail to establish imprinting marks for Igf2r, Peg3, and Snrpn. Global hypomethylation correlates with increased retrotransposon expression and double-strand DNA breaks. Nuclear-associated DNMT3A2 is reduced in double-mutant oocytes, and injecting these oocytes with Hdac2 partially restores DNMT3A2 nuclear staining. DNMT3A2 co-immunoprecipitates with HDAC2 in mouse embryonic stem cells. Partial loss of nuclear DNMT3A2 and HDAC2 occurs in Sin3a−/− oocytes, which exhibit decreased DNA methylation of imprinting control regions for Igf2r and Snrpn, but not Peg3. These results suggest seminal roles of HDAC1/2 in establishing maternal genomic imprints and maintaining genomic integrity in oocytes mediated in part through a SIN3A complex that interacts with DNMT3A2.

  19. Genome-Wide Association Mapping of Flowering and Ripening Periods in Apple

    Directory of Open Access Journals (Sweden)

    Jorge Urrestarazu

    2017-11-01

    Full Text Available Deciphering the genetic control of flowering and ripening periods in apple is essential for breeding cultivars adapted to their growing environments. We implemented a large Genome-Wide Association Study (GWAS) at the European level using an association panel of 1,168 different apple genotypes distributed over six locations and phenotyped for these phenological traits. The panel was genotyped at a high density of SNPs using the Axiom® Apple 480K SNP array. We ran GWAS with a multi-locus mixed model (MLMM), which handles the putatively confounding effect of significant SNPs elsewhere on the genome. Genomic regions were further investigated to reveal candidate genes responsible for the phenotypic variation. At the whole population level, GWAS retained two SNPs as cofactors on chromosome 9 for flowering period, and six for ripening period (four on chromosome 3, one on chromosome 10 and one on chromosome 16), which together accounted for 8.9 and 17.2% of the phenotypic variance, respectively. For both traits, SNPs in weak linkage disequilibrium were detected nearby, thus suggesting the existence of allelic heterogeneity. The geographic origins and relationships of apple cultivars accounted for large parts of the phenotypic variation. Variation in genotypic frequency of the SNPs associated with the two traits was connected to the geographic origin of the genotypes (grouped as North+East, West and South Europe), and indicated differential selection in different growing environments. Genes encoding transcription factors containing either NAC or MADS domains were identified as major candidates within the small confidence intervals computed for the associated genomic regions. A strong microsynteny between apple and peach was revealed in all four confidence interval regions. This study shows how association genetics can unravel the genetic control of important horticultural traits in apple, as well as reduce the confidence intervals of the associated

  20. Comparative BAC-based mapping in the white-throated sparrow, a novel behavioral genomics model, using interspecies overgo hybridization

    Directory of Open Access Journals (Sweden)

    Gonser Rusty A

    2011-06-01

    Full Text Available Abstract. Background: The genomics era has produced an arsenal of resources from sequenced organisms allowing researchers to target species that do not have comparable mapping and sequence information. These new "non-model" organisms offer unique opportunities to examine environmental effects on genomic patterns and processes. Here we use comparative mapping as a first step in characterizing the genome organization of a novel animal model, the white-throated sparrow (Zonotrichia albicollis), which occurs as white or tan morphs that exhibit alternative behaviors and physiology. Morph is determined by the presence or absence of a complex chromosomal rearrangement. This species is an ideal model for behavioral genomics because the association between genotype and phenotype is absolute, making it possible to identify the genomic bases of phenotypic variation. Findings: We initiated a genomic study in this species by characterizing the white-throated sparrow BAC library via filter hybridization with overgo probes designed for the chicken, turkey, and zebra finch. Cross-species hybridization resulted in 640 positive sparrow BACs assigned to 77 chicken loci across almost all macro- and microchromosomes, with a focus on the chromosomes associated with morph. Out of 216 overgos, 36% of the probes hybridized successfully, with an average number of 3.0 positive sparrow BACs per overgo. Conclusions: These data will be utilized for determining chromosomal architecture and for fine-scale mapping of candidate genes associated with phenotypic differences. Our research confirms the utility of interspecies hybridization for developing comparative maps in other non-model organisms.
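
    The counts reported in this abstract can be cross-checked with a few lines of arithmetic. The sketch below only uses figures quoted in the record; the interpretation that each successful overgo corresponds to one chicken locus, and that the 3.0 average is taken over all 216 overgos, is an assumption made here and is not stated in the abstract.

```python
# Figures quoted in the abstract.
total_overgos = 216
positive_bacs = 640
chicken_loci = 77

# Assumption: one successful overgo per locus, so the success rate is loci / overgos.
success_rate_pct = chicken_loci / total_overgos * 100

# Assumption: the reported average is positive BACs divided by all overgos tried.
bacs_per_overgo = positive_bacs / total_overgos

print(f"success rate: {success_rate_pct:.1f}%")
print(f"positive BACs per overgo: {bacs_per_overgo:.2f}")
```

    Under these assumptions the rounded values agree with the reported 36% and 3.0, which suggests the summary statistics are internally consistent.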

  1. Ricebase: a breeding and genetics platform for rice, integrating individual molecular markers, pedigrees and whole-genome-based data.

    Science.gov (United States)

    Edwards, J D; Baldo, A M; Mueller, L A

    2016-01-01

    Ricebase (http://ricebase.org) is an integrative genomic database for rice (Oryza sativa) with an emphasis on combining datasets in a way that maintains the key links between past and current genetic studies. Ricebase includes DNA sequence data, gene annotations, nucleotide variation data and molecular marker fragment size data. Rice research has benefited from early adoption and extensive use of simple sequence repeat (SSR) markers; however, the majority of rice SSR markers were developed prior to the latest rice pseudomolecule assembly. Interpretation of new research using SNPs in the context of literature citing SSRs requires a common coordinate system. A new pipeline, using a stepwise relaxation of stringency, was used to map SSR primers onto the latest rice pseudomolecule assembly. The SSR markers and experimentally assayed amplicon sizes are presented in a relational database with a web-based front end, and are available as a track loaded in a genome browser with links connecting the browser and database. The combined capabilities of Ricebase link genetic markers, genome context, allele states across rice germplasm and potentially user curated phenotypic interpretations as a community resource for genetic discovery and breeding in rice. Published by Oxford University Press 2016. This work is written by US Government employees and is in the public domain in the United States.
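
    The abstract mentions mapping SSR primers with "a stepwise relaxation of stringency" but gives no implementation details. The following is a minimal sketch of that general idea, not the Ricebase pipeline itself: a primer is first searched for with zero mismatches, and the mismatch allowance is raised only if nothing is found. The function names, the substitution-only matching, and the stopping rules are all hypothetical choices made for illustration.

```python
def find_hits(primer, genome, max_mismatches):
    """Start positions where primer matches genome with at most max_mismatches substitutions."""
    k = len(primer)
    hits = []
    for i in range(len(genome) - k + 1):
        mismatches = sum(1 for x, y in zip(primer, genome[i:i + k]) if x != y)
        if mismatches <= max_mismatches:
            hits.append(i)
    return hits


def map_primer_stepwise(primer, genome, max_allowed=2):
    """Try the strictest setting first and relax one mismatch at a time.

    Returns (position, mismatches_allowed) for a unique placement, or None
    when the primer is unplaced or ambiguous (hypothetical policy).
    """
    for allowed in range(max_allowed + 1):
        hits = find_hits(primer, genome, allowed)
        if len(hits) == 1:
            return hits[0], allowed
        if len(hits) > 1:
            return None  # ambiguous; relaxing further cannot make it unique
    return None


genome = "ACGTACGTTTGACCA"
print(map_primer_stepwise("TTGACC", genome))  # exact match exists
print(map_primer_stepwise("TTGAGC", genome))  # needs one mismatch
print(map_primer_stepwise("GGGGGG", genome))  # no placement
```

    A production pipeline would more likely wrap an aligner and also check that forward and reverse primers land in a plausible orientation and distance, but the control flow of "strict first, relax only on failure" is the same.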

  2. ANISEED 2017: extending the integrated ascidian database to the exploration and evolutionary comparison of genome-scale datasets.

    Science.gov (United States)

    Brozovic, Matija; Dantec, Christelle; Dardaillon, Justine; Dauga, Delphine; Faure, Emmanuel; Gineste, Mathieu; Louis, Alexandra; Naville, Magali; Nitta, Kazuhiro R; Piette, Jacques; Reeves, Wendy; Scornavacca, Céline; Simion, Paul; Vincentelli, Renaud; Bellec, Maelle; Aicha, Sameh Ben; Fagotto, Marie; Guéroult-Bellone, Marion; Haeussler, Maximilian; Jacox, Edwin; Lowe, Elijah K; Mendez, Mickael; Roberge, Alexis; Stolfi, Alberto; Yokomori, Rui; Brown, C Titus; Cambillau, Christian; Christiaen, Lionel; Delsuc, Frédéric; Douzery, Emmanuel; Dumollard, Rémi; Kusakabe, Takehiro; Nakai, Kenta; Nishida, Hiroki; Satou, Yutaka; Swalla, Billie; Veeman, Michael; Volff, Jean-Nicolas; Lemaire, Patrick

    2018-01-04

    ANISEED (www.aniseed.cnrs.fr) is the main model organism database for tunicates, the sister-group of vertebrates. This release gives access to annotated genomes, gene expression patterns, and anatomical descriptions for nine ascidian species. It provides increased integration with external molecular and taxonomy databases, better support for epigenomics datasets, in particular RNA-seq, ChIP-seq and SELEX-seq, and features novel interactive interfaces for existing and novel datatypes. In particular, the cross-species navigation and comparison is enhanced through a novel taxonomy section describing each represented species and through the implementation of interactive phylogenetic gene trees for 60% of tunicate genes. The gene expression section displays the results of RNA-seq experiments for the three major model species of solitary ascidians. Gene expression is controlled by the binding of transcription factors to cis-regulatory sequences. A high-resolution description of the DNA-binding specificity for 131 Ciona robusta (formerly C. intestinalis type A) transcription factors by SELEX-seq is provided and used to map candidate binding sites across the Ciona robusta and Phallusia mammillata genomes. Finally, use of a WashU Epigenome browser enhances genome navigation, while a Genomicus server was set up to explore microsynteny relationships within tunicates and with vertebrates, Amphioxus, echinoderms and hemichordates. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  3. Roles of Werner syndrome protein in protection of genome integrity

    DEFF Research Database (Denmark)

    Rossi, Marie L; Ghosh, Avik K; Bohr, Vilhelm A

    2010-01-01

    Werner syndrome protein (WRN) is one of a family of five human RecQ helicases implicated in the maintenance of genome stability. The conserved RecQ family also includes RecQ1, Bloom syndrome protein (BLM), RecQ4, and RecQ5 in humans, as well as Sgs1 in Saccharomyces cerevisiae, Rqh1...... in Schizosaccharomyces pombe, and homologs in Caenorhabditis elegans, Xenopus laevis, and Drosophila melanogaster. Defects in three of the RecQ helicases, RecQ4, BLM, and WRN, cause human pathologies linked with cancer predisposition and premature aging. Mutations in the WRN gene are the causative factor of Werner...

  4. The Plant Genome Integrative Explorer Resource: PlantGenIE.org.

    Science.gov (United States)

    Sundell, David; Mannapperuma, Chanaka; Netotea, Sergiu; Delhomme, Nicolas; Lin, Yao-Cheng; Sjödin, Andreas; Van de Peer, Yves; Jansson, Stefan; Hvidsten, Torgeir R; Street, Nathaniel R

    2015-12-01

    Accessing and exploring large-scale genomics data sets remains a significant challenge to researchers without specialist bioinformatics training. We present the integrated PlantGenIE.org platform for exploration of Populus, conifer and Arabidopsis genomics data, which includes expression networks and associated visualization tools. Standard features of a model organism database are provided, including genome browsers, gene list annotation, Blast homology searches and gene information pages. Community annotation updating is supported via integration of WebApollo. We have produced an RNA-sequencing (RNA-Seq) expression atlas for Populus tremula and have integrated these data within the expression tools. An updated version of the ComPlEx resource for performing comparative plant expression analyses of gene coexpression network conservation between species has also been integrated. The PlantGenIE.org platform provides intuitive access to large-scale and genome-wide genomics data from model forest tree species, facilitating both community contributions to annotation improvement and tools supporting use of the included data resources to inform biological insight. © 2015 The Authors. New Phytologist © 2015 New Phytologist Trust.

  5. Whole genome re-sequencing reveals genome-wide variations among parental lines of 16 mapping populations in chickpea (Cicer arietinum L.).

    Science.gov (United States)

    Thudi, Mahendar; Khan, Aamir W; Kumar, Vinay; Gaur, Pooran M; Katta, Krishnamohan; Garg, Vanika; Roorkiwal, Manish; Samineni, Srinivasan; Varshney, Rajeev K

    2016-01-27

    Chickpea (Cicer arietinum L.) is the second most important grain legume cultivated by resource-poor farmers in South Asia and Sub-Saharan Africa. In order to harness the untapped genetic potential available for chickpea improvement, we re-sequenced 35 chickpea genotypes representing parental lines of 16 mapping populations segregating for abiotic stress (drought, heat, salinity), biotic stress (Fusarium wilt, Ascochyta blight, Botrytis grey mould, Helicoverpa armigera) and nutritionally important (protein content) traits using a whole genome re-sequencing approach. A total of 192.19 Gb of data, comprising 973.13 million reads, was generated on the 35 genotypes, with an average sequencing depth of ~10X for each line. On average, 92.18% of reads from each genotype were aligned to the chickpea reference genome with 82.17% coverage. A total of 2,058,566 unique single nucleotide polymorphisms (SNPs) and 292,588 Indels were detected in comparison with the reference chickpea genome. The highest number of SNPs was identified on the Ca4 pseudomolecule. In addition, copy number variations (CNVs) such as gene deletions and duplications were identified across the chickpea parental genotypes, which were minimum in PI 489777 (1 gene deletion) and maximum in JG 74 (1,497). A total of 164,856 line-specific variations (144,888 SNPs and 19,968 Indels) were identified, with the highest percentage in coding regions in ICC 1496 (21%), followed by ICCV 97105 (12%). Of 539 miscellaneous variations, 339, 138 and 62 were inter-chromosomal variations (CTX), intra-chromosomal variations (ITX) and inversions (INV), respectively. Genome-wide SNPs, Indels, CNVs, PAVs, and miscellaneous variations identified in different mapping populations are a valuable resource in genetic research and helpful in locating genes/genomic segments responsible for economically important traits. Further, the genome-wide variations identified in the present study can be used for developing high density SNP arrays for
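
    The variant tallies in this abstract can be checked against their stated totals. The short sketch below uses only numbers quoted in the record; the mean read length is a derived figure that assumes "Gb" means 10^9 bases, which the abstract does not state explicitly.

```python
# Counts quoted in the abstract.
line_specific = {"SNPs": 144_888, "Indels": 19_968}
miscellaneous = {"CTX": 339, "ITX": 138, "INV": 62}

line_specific_total = sum(line_specific.values())
miscellaneous_total = sum(miscellaneous.values())

# Assumption: 1 Gb = 1e9 bases; reads are counted individually.
total_bases = 192.19e9
total_reads = 973.13e6
mean_read_length = total_bases / total_reads

print(line_specific_total, miscellaneous_total, round(mean_read_length, 1))
```

    Both sums match the totals given in the text (164,856 and 539), and the implied mean read length is a little under 200 bases.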

  6. Integrated Ocean and Coastal Mapping (IOCM) Project FL1415: APALACHICOLA RIVER (MOUTH) TO SAUL CREEK, FL.

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — The objective of Integrated Ocean and Coastal Mapping (IOCM) is to improve the coordination among federal, state and local government, non-governmental and private...

  7. Integrated Ocean and Coastal Mapping (IOCM) Project FL1421: ST JOHNS RIVER, FL.

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — The objective of Integrated Ocean and Coastal Mapping (IOCM) is to improve the coordination among federal, state and local government, non-governmental and private...

  8. Integrated Ocean and Coastal Mapping (IOCM) Project WA1406: OLYMPIA, WA.

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — The objective of Integrated Ocean and Coastal Mapping (IOCM) is to improve the coordination among federal, state and local government, non-governmental and private...

  9. Integrated Ocean and Coastal Mapping (IOCM) Project WA1405: STRAIT OF JUAN DE FUCA, WA.

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — The objective of Integrated Ocean and Coastal Mapping (IOCM) is to improve the coordination among federal, state and local government, non-governmental and private...

  10. Integrated Ocean and Coastal Mapping (IOCM) Project FL1414: VENICE INLET - ICW, FL.

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — The objective of Integrated Ocean and Coastal Mapping (IOCM) is to improve the coordination among federal, state and local government, non-governmental and private...

  11. Integrated Ocean and Coastal Mapping (IOCM) Project WA1002: PUGET SOUND - WHIDBEY ISLAND, WA.

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — The objective of Integrated Ocean and Coastal Mapping (IOCM) is to improve the coordination among federal, state and local government, non-governmental and private...

  12. 2011 NOAA Ortho-rectified Mosaic of Texas: Integrated Ocean and Coastal Mapping Product

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — This data set contains ortho-rectified mosaic tiles, created as a product from the NOAA Integrated Ocean and Coastal Mapping (IOCM) initiative. The source imagery...

  13. Integrated Ocean and Coastal Mapping (IOCM) Project OR1210: CAPE PERPETUA TO CLATSOP SPIT, OR.

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — The objective of Integrated Ocean and Coastal Mapping (IOCM) is to improve the coordination among federal, state and local government, non-governmental and private...

  14. Genome-wide SNP identification, linkage map construction and QTL mapping for seed mineral concentrations and contents in pea (Pisum sativum L.).

    Science.gov (United States)

    Ma, Yu; Coyne, Clarice J; Grusak, Michael A; Mazourek, Michael; Cheng, Peng; Main, Dorrie; McGee, Rebecca J

    2017-02-13

    Marker-assisted breeding is now routinely used in major crops to facilitate more efficient cultivar improvement. This has been significantly enabled by the use of next-generation sequencing technology to identify loci and markers associated with traits of interest. While rich in a range of nutritional components, such as protein, mineral nutrients, carbohydrates and several vitamins, pea (Pisum sativum L.), one of the oldest domesticated crops in the world, remains behind many other crops in the availability of genomic and genetic resources. To further improve mineral nutrient levels in pea seeds requires the development of genome-wide tools. The objectives of this research were to develop these tools by: identifying genome-wide single nucleotide polymorphisms (SNPs) using genotyping by sequencing (GBS); constructing a high-density linkage map and comparative maps with other legumes; and identifying quantitative trait loci (QTL) for levels of boron, calcium, iron, potassium, magnesium, manganese, molybdenum, phosphorus, sulfur, and zinc in the seed, as well as for seed weight. In this study, 1609 high quality SNPs were found to be polymorphic between 'Kiflica' and 'Aragorn', two parents of an F6-derived recombinant inbred line (RIL) population. Mapping 1683 markers including 75 previously published markers and 1608 SNPs developed from the present study generated a linkage map of size 1310.1 cM. Comparative mapping with other legumes demonstrated that the highest level of synteny was observed between pea and the genome of Medicago truncatula. QTL analysis of the RIL population across two locations revealed at least one QTL for each of the mineral nutrient traits. In total, 46 seed mineral concentration QTLs, 37 seed mineral content QTLs, and 6 seed weight QTLs were discovered. The QTLs explained from 2.4% to 43.3% of the phenotypic variance. The genome-wide SNPs and the genetic linkage map developed in this study permitted QTL identification for pea seed mineral

  15. A SNP Based Linkage Map of the Arctic Charr (Salvelinus alpinus) Genome Provides Insights into the Diploidization Process After Whole Genome Duplication

    Directory of Open Access Journals (Sweden)

    Cameron M. Nugent

    2017-02-01

    Diploidization, which follows whole genome duplication events, does not occur evenly across the genome. In salmonid fishes, certain pairs of homeologous chromosomes preserve tetraploid loci in higher frequencies toward the telomeres due to residual tetrasomic inheritance. Research suggests this occurs only in homeologous pairs where one chromosome arm has undergone a fusion event. We present a linkage map for Arctic charr (Salvelinus alpinus), a salmonid species with relatively fewer chromosome fusions. Genotype by sequencing identified 19,418 SNPs, and a linkage map consisting of 4508 markers was constructed from a subset of high quality SNPs and microsatellite markers that were used to anchor the new map to previous versions. Both male- and female-specific linkage maps contained the expected number of 39 linkage groups. The chromosome type associated with each linkage group was determined, and 10 stable metacentric chromosomes were identified, along with a chromosome polymorphism involving the sex chromosome AC04. Two instances of a weak form of pseudolinkage were detected in the telomeric regions of homeologous chromosome arms in both female and male linkage maps. Chromosome arm homologies within the Atlantic salmon (Salmo salar) and rainbow trout (Oncorhynchus mykiss) genomes were determined. Paralogous sequence variants (PSVs) were identified, and their comparative BLASTn hit locations showed that duplicate markers exist in higher numbers on seven pairs of homeologous arms, previously identified as preserving tetrasomy in salmonid species. Homeologous arm pairs where neither arm has been part of a fusion event in Arctic charr had fewer PSVs, suggesting faster diploidization rates in these regions.

  16. An ultra-dense integrated linkage map for hexaploid chrysanthemum enables multi-allelic QTL analysis

    NARCIS (Netherlands)

    Geest, van Geert; Bourke, Peter M.; Voorrips, Roeland E.; Marasek-Ciolakowska, Agnieszka; Liao, Yanlin; Post, Aike; Meeteren, van Uulke; Visser, Richard G.F.; Maliepaard, Chris; Arens, Paul

    2017-01-01

    Key message: We constructed the first integrated genetic linkage map in a polysomic hexaploid. This enabled us to estimate inheritance of parental haplotypes in the offspring and detect multi-allelic QTL. Abstract: Construction and use of linkage maps are challenging in hexaploids with polysomic

  17. Clustered deep shadow maps for integrated polyhedral and volume rendering

    KAUST Repository

    Bornik, Alexander

    2012-01-01

    This paper presents a hardware-accelerated approach for shadow computation in scenes containing both complex volumetric objects and polyhedral models. Our system is the first hardware-accelerated complete implementation of deep shadow maps, which unifies the computation of volumetric and geometric shadows. Up to now such unified computation was limited to software-only rendering. Previous hardware-accelerated techniques can handle only geometric or only volumetric scenes, both resulting in the loss of important properties of the original concept. Our approach supports interactive rendering of polyhedrally bounded volumetric objects on the GPU based on ray casting. The ray casting can be conveniently used for both the shadow map computation and the rendering. We show how anti-aliased high-quality shadows are feasible in scenes composed of multiple overlapping translucent objects, and how sparse scenes can be handled efficiently using clustered deep shadow maps. © 2012 Springer-Verlag.

  18. Bioinformatics decoding the genome

    CERN Multimedia

    CERN. Geneva; Deutsch, Sam; Michielin, Olivier; Thomas, Arthur; Descombes, Patrick

    2006-01-01

    Extracting the fundamental genomic sequence from the DNA. From Genome to Sequence: Biology in the early 21st century has been radically transformed by the availability of the full genome sequences of an ever increasing number of life forms, from bacteria to major crop plants and to humans. The lecture will concentrate on the computational challenges associated with the production, storage and analysis of genome sequence data, with an emphasis on mammalian genomes. The quality and usability of genome sequences is increasingly conditioned by the careful integration of strategies for data collection and computational analysis, from the construction of maps and libraries to the assembly of raw data into sequence contigs and chromosome-sized scaffolds. Once the sequence is assembled, a major challenge is the mapping of biologically relevant information onto this sequence: promoters, introns and exons of protein-encoding genes, regulatory elements, functional RNAs, pseudogenes, transposons, etc. The methodological ...

  19. Comprehensive Mapping of Pluripotent Stem Cell Metabolism Using Dynamic Genome-Scale Network Modeling

    Directory of Open Access Journals (Sweden)

    Sriram Chandrasekaran

    2017-12-01

    Summary: Metabolism is an emerging stem cell hallmark tied to cell fate, pluripotency, and self-renewal, yet systems-level understanding of stem cell metabolism has been limited by the lack of genome-scale network models. Here, we develop a systems approach to integrate time-course metabolomics data with a computational model of metabolism to analyze the metabolic state of naive and primed murine pluripotent stem cells. Using this approach, we find that one-carbon metabolism involving phosphoglycerate dehydrogenase, folate synthesis, and nucleotide synthesis is a key pathway that differs between the two states, resulting in differential sensitivity to anti-folates. The model also predicts that the pluripotency factor Lin28 regulates this one-carbon metabolic pathway, which we validate using metabolomics data from Lin28-deficient cells. Moreover, we identify and validate metabolic reactions related to S-adenosyl-methionine production that can differentially impact histone methylation in naive and primed cells. Our network-based approach provides a framework for characterizing metabolic changes influencing pluripotency and cell fate. Chandrasekaran et al. use computational modeling, metabolomics, and metabolic inhibitors to discover metabolic differences between various pluripotent stem cell states and infer their impact on stem cell fate decisions. Keywords: systems biology, stem cell biology, metabolism, genome-scale modeling, pluripotency, histone methylation, naive (ground state), primed state, cell fate, metabolic network

  20. Quantification and genome-wide mapping of DNA double-strand breaks.

    Science.gov (United States)

    Grégoire, Marie-Chantal; Massonneau, Julien; Leduc, Frédéric; Arguin, Mélina; Brazeau, Marc-André; Boissonneault, Guylain

    2016-12-01

    DNA double-strand breaks (DSBs) represent a major threat to the genetic integrity of the cell. Knowing both their genome-wide distribution and number is important for a better assessment of genotoxicity at a molecular level. Available methods may have underestimated the extent of DSBs as they are based on markers specific to those undergoing active repair or may not be adapted for the large diversity of naturally occurring DNA ends. We have established conditions for an efficient first step of DNA nick and gap repair (NGR) allowing specific determination of DSBs by end labeling with terminal transferase. We used DNA extracted from HeLa cells harboring an I-SceI cassette to induce a targeted nick or DSB and demonstrated by immunocapture of 3'-OH that a prior step of NGR allows specific determination of loci-specific or genome wide DSBs. This method can be applied to the global determination of DSBs using radioactive end labeling and can find several applications aimed at understanding the distribution and kinetics of DSBs formation and repair. Copyright © 2016 Elsevier B.V. All rights reserved.

  1. Sexagesimal scale for mapping human genome

    Directory of Open Access Journals (Sweden)

    RICARDO CRUZ-COKE

    2001-03-01

    In a previous work I designed a diagram of the human genome based on a circular ideogram of the haploid set of chromosomes, using a low-resolution scale of Megabase units. The purpose of this work is to draft a new scale to measure the physical map of the human genome at the highest resolution level. The entire length of the haploid genome of males is laid out on a circumference, marked with a sexagesimal scale of 360 degrees and 1296000 arc seconds. The radius of this circumference displays a semilogarithmic metric scale from 1 m to the nanometer level. The base pair level of DNA sequences, 10^-9 of this circumference, is measured in milliarcsecond (mas) units, each equivalent to a thousandth of an arc second. The "mas" unit corresponds to 1.27 nanometers (nm) or 0.427 base pair (bp) and is the framework for measuring DNA sequences. Thus the three billion base pairs of the human genome may be identified by 1296000000 "mas" units in continuous correlation from number 1 to number 1296000000. This sexagesimal scale covers all the levels of the nuclear genetic material, from nucleotides to chromosomes. The locations of every codon and every gene may be numbered in the physical map of chromosome regions according to this new scale, instead of the partial kilobase and Megabase scales used today. The advantage of the new scale is the unification of the set of chromosomes under a continuous scale of measurement at the DNA level, facilitating the correlation with the phenotypes of man and other species.
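    The unit counts quoted in this abstract follow from plain sexagesimal arithmetic: 360 degrees of 3,600 arc seconds each give 1,296,000 arc seconds, and 1,000 mas per arc second give 1,296,000,000 mas. The Python sketch below only illustrates that arithmetic. The zero-based indexing, the function names and the proportional `position_to_mas` helper are assumptions made for illustration and are not taken from the paper, which numbers its units from 1.

```python
# Units on the 360-degree circle described in the abstract.
ARCSEC_PER_DEGREE = 60 * 60                 # 3,600 arc seconds per degree
TOTAL_ARCSEC = 360 * ARCSEC_PER_DEGREE      # 1,296,000 arc seconds
MAS_PER_ARCSEC = 1000                       # milliarcseconds per arc second
TOTAL_MAS = TOTAL_ARCSEC * MAS_PER_ARCSEC   # 1,296,000,000 mas


def mas_to_dms(mas):
    """Split a zero-based mas index into (degrees, arcminutes, arcseconds, mas)."""
    if not 0 <= mas < TOTAL_MAS:
        raise ValueError("mas index lies outside the circle")
    arcsec_total, mas_part = divmod(mas, MAS_PER_ARCSEC)
    arcmin_total, arcsec = divmod(arcsec_total, 60)
    degrees, arcmin = divmod(arcmin_total, 60)
    return degrees, arcmin, arcsec, mas_part


def position_to_mas(position, genome_length):
    """Hypothetical helper: map a zero-based position on a genome of the
    given length proportionally onto the mas scale (integer arithmetic)."""
    if not 0 <= position < genome_length:
        raise ValueError("position lies outside the genome")
    return position * TOTAL_MAS // genome_length
```

    Under these assumptions, a position halfway along the genome lands at 180 degrees regardless of the genome length supplied, which is the sense in which the scale is continuous across chromosomes.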

  2. Integrating Work Environment Considerations Into Lean and Value Stream Mapping

    DEFF Research Database (Denmark)

    Edwards, Kasper

    (Spear & Bowen, 1999; Womack & Jones, 1996) and is based on standardisation, levelling, and optimisation of work flows through value stream mapping (VSM) (Rother & Shook, 2009) and eliminating waste. Lean is essentially a rationalization approach that will reduce waste and increase productivity thereby...

  3. Genome-Wide Association Mapping of Stem Rust Resistance in Hordeum vulgare subsp. spontaneum.

    Science.gov (United States)

    Sallam, Ahmad H; Tyagi, Priyanka; Brown-Guedira, Gina; Muehlbauer, Gary J; Hulse, Alex; Steffenson, Brian J

    2017-10-05

    Stem rust was one of the most devastating diseases of barley in North America. Through the deployment of cultivars with the resistance gene Rpg1, losses to stem rust have been minimal over the past 70 yr. However, there exist both domestic (QCCJB) and foreign (TTKSK, aka isolate Ug99) pathotypes with virulence for this important gene. To identify new sources of stem rust resistance for barley, we evaluated the Wild Barley Diversity Collection (WBDC) (314 ecogeographically diverse accessions of Hordeum vulgare subsp. spontaneum) for seedling resistance to four pathotypes (TTKSK, QCCJB, MCCFC, and HKHJC) of the wheat stem rust pathogen (Puccinia graminis f. sp. tritici, Pgt) and one isolate (92-MN-90) of the rye stem rust pathogen (P. graminis f. sp. secalis, Pgs). Based on a coefficient of infection, the frequency of resistance in the WBDC was low, ranging from 0.6% with HKHJC to 19.4% with 92-MN-90. None of the accessions was resistant to all five cultures of P. graminis. A genome-wide association study (GWAS) was conducted to map stem rust resistance loci using 50,842 single-nucleotide polymorphic markers generated by genotype-by-sequencing and ordered using the new barley reference genome assembly. After proper accounting for genetic relatedness and structure among accessions, 45 quantitative trait loci were identified for resistance to P. graminis across all seven barley chromosomes. Three novel loci associated with resistance to TTKSK, QCCJB, MCCFC, and 92-MN-90 were identified on chromosomes 5H and 7H, and two novel loci associated with resistance to HKHJC were identified on chromosomes 1H and 3H. These novel alleles will enhance the diversity of resistance available for cultivated barley. Copyright © 2017 Sallam et al.

  4. High-resolution genetic map for understanding the effect of genome-wide recombination rate on nucleotide diversity in watermelon.

    Science.gov (United States)

    Reddy, Umesh K; Nimmakayala, Padma; Levi, Amnon; Abburi, Venkata Lakshmi; Saminathan, Thangasamy; Tomason, Yan R; Vajja, Gopinath; Reddy, Rishi; Abburi, Lavanya; Wehner, Todd C; Ronin, Yefim; Karol, Abraham

    2014-09-15

    We used genotyping by sequencing to identify a set of 10,480 single nucleotide polymorphism (SNP) markers for constructing a high-resolution genetic map of 1096 cM for watermelon. We assessed the genome-wide variation in recombination rate (GWRR) across the map and found an association between GWRR and genome-wide nucleotide diversity. Collinearity between the map and the genome-wide reference sequence for watermelon was studied to identify inconsistency and chromosome rearrangements. We assessed genome-wide nucleotide diversity, linkage disequilibrium (LD), and selective sweep for wild, semi-wild, and domesticated accessions of Citrullus lanatus var. lanatus to track signals of domestication. Principal component analysis combined with chromosome-wide phylogenetic study based on 1563 SNPs obtained after LD pruning with minor allele frequency of 0.05 resolved the differences between semi-wild and wild accessions as well as relationships among worldwide sweet watermelon. Population structure analysis revealed predominant ancestries for wild, semi-wild, and domesticated watermelons as well as admixture of various ancestries that were important for domestication. Sliding window analysis of Tajima's D across various chromosomes was used to resolve selective sweep. LD decay was estimated for various chromosomes. We identified a strong selective sweep on chromosome 3 consisting of important genes that might have had a role in sweet watermelon domestication. Copyright © 2014 Reddy et al.
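    The abstract mentions a sliding window analysis of Tajima's D without giving details. As a rough illustration only, the Python sketch below computes the standard Tajima's D statistic from aligned haplotype strings and applies it over fixed-width windows. The input layout (one string per haplotype, one column per variant site), the function names and the window parameters are assumptions for this sketch and do not describe the authors' pipeline.

```python
from itertools import combinations
from math import sqrt


def tajimas_d_from_stats(n, s, pi):
    """Tajima's D from sample size n, segregating sites s and mean
    pairwise differences pi. Returns None where D is undefined."""
    if n < 4 or s == 0:
        return None  # the variance term is zero for n < 4 or s == 0
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / (i * i) for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / (a1 * a1)
    e1 = c1 / a1
    e2 = c2 / (a1 * a1 + a2)
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


def tajimas_d(haplotypes):
    """Tajima's D for a list of equal-length haplotype strings."""
    n = len(haplotypes)
    if n < 4:
        return None
    length = len(haplotypes[0])
    # Segregating sites: columns carrying more than one allele.
    s = sum(1 for i in range(length) if len({h[i] for h in haplotypes}) > 1)
    # Mean number of pairwise differences over all haplotype pairs.
    total = sum(sum(x != y for x, y in zip(h1, h2))
                for h1, h2 in combinations(haplotypes, 2))
    pi = total / (n * (n - 1) / 2)
    return tajimas_d_from_stats(n, s, pi)


def sliding_tajimas_d(haplotypes, positions, window, step):
    """Tajima's D in windows [start, start + window) along the positions."""
    results = []
    start = 0
    last = max(positions) if positions else -1
    while start <= last:
        cols = [i for i, p in enumerate(positions) if start <= p < start + window]
        sub = ["".join(h[i] for i in cols) for h in haplotypes]
        results.append((start, tajimas_d(sub)))
        start += step
    return results
```

    Windows with strongly negative D carry an excess of rare variants, a pattern commonly read as one possible signature of a selective sweep; windows without segregating sites return None.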

  5. Using DNase Hi-C techniques to map global and local three-dimensional genome architecture at high resolution.

    Science.gov (United States)

    Ma, Wenxiu; Ay, Ferhat; Lee, Choli; Gulsoy, Gunhan; Deng, Xinxian; Cook, Savannah; Hesson, Jennifer; Cavanaugh, Christopher; Ware, Carol B; Krumm, Anton; Shendure, Jay; Blau, C Anthony; Disteche, Christine M; Noble, William S; Duan, ZhiJun

    2018-06-01

    The folding and three-dimensional (3D) organization of chromatin in the nucleus critically impacts genome function. The past decade has witnessed rapid advances in genomic tools for delineating 3D genome architecture. Among them, chromosome conformation capture (3C)-based methods such as Hi-C are the most widely used techniques for mapping chromatin interactions. However, traditional Hi-C protocols rely on restriction enzymes (REs) to fragment chromatin and are therefore limited in resolution. We recently developed DNase Hi-C for mapping 3D genome organization, which uses DNase I for chromatin fragmentation. DNase Hi-C overcomes RE-related limitations associated with traditional Hi-C methods, leading to improved methodological resolution. Furthermore, combining this method with DNA capture technology provides a high-throughput approach (targeted DNase Hi-C) that allows for mapping fine-scale chromatin architecture at exceptionally high resolution. Hence, targeted DNase Hi-C will be valuable for delineating the physical landscapes of cis-regulatory networks that control gene expression and for characterizing phenotype-associated chromatin 3D signatures. Here, we provide a detailed description of method design and step-by-step working protocols for these two methods. Copyright © 2018 Elsevier Inc. All rights reserved.

  6. An integrated CRISPR Bombyx mori genome editing system with improved efficiency and expanded target sites.

    Science.gov (United States)

    Ma, Sanyuan; Liu, Yue; Liu, Yuanyuan; Chang, Jiasong; Zhang, Tong; Wang, Xiaogang; Shi, Run; Lu, Wei; Xia, Xiaojuan; Zhao, Ping; Xia, Qingyou

    2017-04-01

    Genome editing has enabled unprecedented opportunities for targeted genomic engineering of a wide variety of organisms, ranging from microbes, plants and animals to human embryos. The successive establishment and rapid application of genome editing tools have significantly accelerated Bombyx mori (B. mori) research in recent years. However, the only CRISPR system in B. mori was the commonly used SpCas9, which recognizes only target sites containing an NGG PAM sequence. In the present study, we first improved the efficiency of our previously established SpCas9 system 3.5-fold. The improved efficiency was also observed at several loci in both BmNs cells and B. mori embryos. Then, to expand the target sites, we showed that two newly discovered CRISPR systems, SaCas9 and AsCpf1, could also induce highly efficient site-specific genome editing in BmNs cells, and constructed an integrated CRISPR system. Genome-wide analysis of targetable sites was further conducted and showed that the integrated system covers 69,144,399 sites in the B. mori genome, and that one site can be found every 6.5 bp. The efficiency and resolution of this CRISPR platform will probably accelerate both fundamental and applied research in B. mori, and perhaps other insects. Copyright © 2017 Elsevier Ltd. All rights reserved.
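    To make the idea of counting targetable sites concrete, the Python sketch below scans a DNA string on both strands for a PAM motif written in IUPAC codes. Only the NGG motif for SpCas9 comes from the abstract; the function names, the coordinate convention and the small IUPAC table are assumptions for illustration, and the sketch does not reproduce the authors' genome-wide analysis.

```python
import re

# Subset of IUPAC nucleotide codes, expressed as regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "[ACGT]", "R": "[AG]", "Y": "[CT]", "V": "[ACG]",
}


def pam_regex(pam):
    """Compile a PAM into a lookahead regex so overlapping hits are found."""
    return re.compile("(?=(" + "".join(IUPAC[base] for base in pam) + "))")


def reverse_complement(seq):
    """Reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_pam_sites(seq, pam="NGG"):
    """Return sorted (start, strand) pairs for every PAM occurrence.

    start is the leftmost zero-based coordinate of the PAM footprint on
    the forward strand, for hits on either strand.
    """
    seq = seq.upper()
    pattern = pam_regex(pam)
    hits = [(m.start(), "+") for m in pattern.finditer(seq)]
    rc = reverse_complement(seq)
    n, k = len(seq), len(pam)
    hits += [(n - m.start() - k, "-") for m in pattern.finditer(rc)]
    return sorted(hits)
```

    Dividing the sequence length by the number of hits gives the kind of spacing figure the abstract reports for the integrated system; PAM strings for other nucleases would have to be supplied by the user, since the abstract does not state them.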

  7. Human papillomavirus genome integration in squamous carcinogenesis: what have next-generation sequencing studies taught us?

    Science.gov (United States)

    Groves, Ian J; Coleman, Nicholas

    2018-05-01

    Human papillomavirus (HPV) infection is associated with ∼5% of all human cancers, including a range of squamous cell carcinomas. Persistent infection by high-risk HPVs (HRHPVs) is associated with the integration of virus genomes (which are usually stably maintained as extrachromosomal episomes) into host chromosomes. Although HRHPV integration rates differ across human sites of infection, this process appears to be an important event in HPV-associated neoplastic progression, leading to deregulation of virus oncogene expression, host gene expression modulation, and further genomic instability. However, the mechanisms by which HRHPV integration occur and by which the subsequent gene expression changes take place are incompletely understood. The advent of next-generation sequencing (NGS) of both RNA and DNA has allowed powerful interrogation of the association of HRHPVs with human disease, including precise determination of the sites of integration and the genomic rearrangements at integration loci. In turn, these data have indicated that integration occurs through two main mechanisms: looping integration and direct insertion. Improved understanding of integration sites is allowing further investigation of the factors that provide a competitive advantage to some integrants during disease progression. Furthermore, advanced approaches to the generation of genome-wide samples have given novel insights into the three-dimensional interactions within the nucleus, which could act as another layer of epigenetic control of both virus and host transcription. It is hoped that further advances in NGS techniques and analysis will not only allow the examination of further unanswered questions regarding HPV infection, but also direct new approaches to treating HPV-associated human disease. Copyright © 2018 Pathological Society of Great Britain and Ireland. Published by John Wiley & Sons, Ltd.

  8. GLIDERS - A web-based search engine for genome-wide linkage disequilibrium between HapMap SNPs

    Directory of Open Access Journals (Sweden)

    Broxholme John

    2009-10-01

    Background: A number of tools for the examination of linkage disequilibrium (LD) patterns between nearby alleles exist, but none are available for quickly and easily investigating LD at longer ranges (>500 kb). We have developed a web-based query tool (GLIDERS: Genome-wide LInkage DisEquilibrium Repository and Search engine) that enables the retrieval of pairwise associations with r² ≥ 0.3 across the human genome for any SNP genotyped within HapMap phase 2 and 3, regardless of distance between the markers. Description: GLIDERS is an easy to use web tool that only requires the user to enter rs numbers of SNPs they want to retrieve genome-wide LD for (both nearby and long-range). The intuitive web interface handles both manual entry of SNP IDs as well as allowing users to upload files of SNP IDs. The user can limit the resulting inter-SNP associations with easy to use menu options. These include MAF limit (5-45%), distance limits between SNPs (minimum and maximum), r² (0.3 to 1), HapMap population sample (CEU, YRI and JPT+CHB combined) and HapMap build/release. All resulting genome-wide inter-SNP associations are displayed on a single output page, which has a link to a downloadable tab delimited text file. Conclusion: GLIDERS is a quick and easy way to retrieve genome-wide inter-SNP associations and to explore LD patterns for any number of SNPs of interest. GLIDERS can be useful in identifying SNPs with long-range LD. This can highlight mis-mapping or other potential association signal localisation problems.
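    The r² threshold and distance limits described in this record can be illustrated with a small Python sketch. It computes the usual pairwise r² from phased two-SNP haplotypes (alleles coded 0/1) and filters a list of precomputed pairs. The record layout, function names and default thresholds are assumptions for illustration; this is not the GLIDERS implementation or its interface.

```python
def r_squared(haplotypes):
    """Pairwise LD r^2 from phased haplotypes given as (a, b) tuples,
    with alleles coded 0/1 at SNP A and SNP B. Returns None when either
    SNP is monomorphic, because r^2 is undefined in that case."""
    n = len(haplotypes)
    if n == 0:
        raise ValueError("no haplotypes supplied")
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return None
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    return d * d / denom


def filter_pairs(pairs, min_r2=0.3, min_distance=0, max_distance=None):
    """Keep (snp_a, snp_b, distance_bp, r2) records that pass the limits."""
    kept = []
    for snp_a, snp_b, distance, r2 in pairs:
        if r2 is None or r2 < min_r2:
            continue
        if distance < min_distance:
            continue
        if max_distance is not None and distance > max_distance:
            continue
        kept.append((snp_a, snp_b, distance, r2))
    return kept
```

    Setting `min_distance=500_000` mirrors the long-range use case (>500 kb) that the abstract highlights; the SNP identifiers passed in are whatever labels the caller uses.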

  9. PGSB/MIPS PlantsDB Database Framework for the Integration and Analysis of Plant Genome Data.

    Science.gov (United States)

    Spannagl, Manuel; Nussbaumer, Thomas; Bader, Kai; Gundlach, Heidrun; Mayer, Klaus F X

    2017-01-01

    Plant Genome and Systems Biology (PGSB), formerly Munich Institute for Protein Sequences (MIPS) PlantsDB, is a database framework for the integration and analysis of plant genome data, developed and maintained for more than a decade now. Major components of that framework are genome databases and analysis resources focusing on individual (reference) genomes providing flexible and intuitive access to data. Another main focus is the integration of genomes from both model and crop plants to form a scaffold for comparative genomics, assisted by specialized tools such as the CrowsNest viewer to explore conserved gene order (synteny). Data exchange and integrated search functionality with/over many plant genome databases is provided within the transPLANT project.

  10. Decoding the genome with an integrative analysis tool: combinatorial CRM Decoder.

    Science.gov (United States)

    Kang, Keunsoo; Kim, Joomyeong; Chung, Jae Hoon; Lee, Daeyoup

    2011-09-01

    The identification of genome-wide cis-regulatory modules (CRMs) and characterization of their associated epigenetic features are fundamental steps toward the understanding of gene regulatory networks. Although integrative analysis of available genome-wide information can provide new biological insights, the lack of novel methodologies has become a major bottleneck. Here, we present a comprehensive analysis tool called combinatorial CRM decoder (CCD), which utilizes the publicly available information to identify and characterize genome-wide CRMs in a species of interest. CCD first defines a set of the epigenetic features which is significantly associated with a set of known CRMs as a code called 'trace code', and subsequently uses the trace code to pinpoint putative CRMs throughout the genome. Using 61 genome-wide data sets obtained from 17 independent mouse studies, CCD successfully catalogued ∼12 600 CRMs (five distinct classes) including polycomb repressive complex 2 target sites as well as imprinting control regions. Interestingly, we discovered that ∼4% of the identified CRMs belong to at least two different classes named 'multi-functional CRM', suggesting their functional importance for regulating spatiotemporal gene expression. From these examples, we show that CCD can be applied to any potential genome-wide datasets and therefore will shed light on unveiling genome-wide CRMs in various species.

  11. Integrating Genomic Data Sets for Knowledge Discovery: An Informed Approach to Management of Captive Endangered Species

    Directory of Open Access Journals (Sweden)

    Kristopher J. L. Irizarry

    2016-01-01

    Many endangered captive populations exhibit reduced genetic diversity resulting in health issues that impact reproductive fitness and quality of life. Numerous cost effective genomic sequencing and genotyping technologies provide unparalleled opportunity for incorporating genomics knowledge in management of endangered species. Genomic data, such as sequence data, transcriptome data, and genotyping data, provide critical information about a captive population that, when leveraged correctly, can be utilized to maximize population genetic variation while simultaneously reducing unintended introduction or propagation of undesirable phenotypes. Current approaches aimed at managing endangered captive populations utilize species survival plans (SSPs) that rely upon mean kinship estimates to maximize genetic diversity while simultaneously avoiding artificial selection in the breeding program. However, as genomic resources increase for each endangered species, the potential knowledge available for management also increases. Unlike model organisms in which considerable scientific resources are used to experimentally validate genotype-phenotype relationships, endangered species typically lack the necessary sample sizes and economic resources required for such studies. Even so, in the absence of experimentally verified genetic discoveries, genomics data still provides value. In fact, bioinformatics and comparative genomics approaches offer mechanisms for translating these raw genomics data sets into integrated knowledge that enable an informed approach to endangered species management.

  12. Integrating Genomic Data Sets for Knowledge Discovery: An Informed Approach to Management of Captive Endangered Species.

    Science.gov (United States)

    Irizarry, Kristopher J L; Bryant, Doug; Kalish, Jordan; Eng, Curtis; Schmidt, Peggy L; Barrett, Gini; Barr, Margaret C

    2016-01-01

    Many endangered captive populations exhibit reduced genetic diversity, resulting in health issues that impact reproductive fitness and quality of life. Numerous cost-effective genomic sequencing and genotyping technologies provide an unparalleled opportunity for incorporating genomics knowledge into the management of endangered species. Genomic data, such as sequence data, transcriptome data, and genotyping data, provide critical information about a captive population that, when leveraged correctly, can be utilized to maximize population genetic variation while simultaneously reducing unintended introduction or propagation of undesirable phenotypes. Current approaches aimed at managing endangered captive populations utilize species survival plans (SSPs) that rely upon mean kinship estimates to maximize genetic diversity while simultaneously avoiding artificial selection in the breeding program. However, as genomic resources increase for each endangered species, the potential knowledge available for management also increases. Unlike model organisms, in which considerable scientific resources are used to experimentally validate genotype-phenotype relationships, endangered species typically lack the necessary sample sizes and economic resources required for such studies. Even so, in the absence of experimentally verified genetic discoveries, genomics data still provide value. In fact, bioinformatics and comparative genomics approaches offer mechanisms for translating these raw genomics data sets into integrated knowledge that enables an informed approach to endangered species management.

  13. Genome-Wide Association Mapping of Leaf Rust Response in a Durum Wheat Worldwide Germplasm Collection.

    Science.gov (United States)

    Aoun, Meriem; Breiland, Matthew; Kathryn Turner, M; Loladze, Alexander; Chao, Shiaoman; Xu, Steven S; Ammar, Karim; Anderson, James A; Kolmer, James A; Acevedo, Maricelis

    2016-11-01

    Leaf rust (caused by Puccinia triticina Erikss.) is increasingly impacting durum wheat (Triticum turgidum L. var. durum) production with the recent appearance of races with virulence to widely grown cultivars in many durum producing areas worldwide. A highly virulent race on durum wheat was recently detected in Kansas. This race may spread to the northern Great Plains, where most of the US durum wheat is produced. The objective of this study was to identify sources of resistance to several races from the United States and Mexico at seedling stage in the greenhouse and at adult stage in field experiments. Genome-wide association study (GWAS) was used to identify single-nucleotide polymorphism (SNP) markers associated with leaf rust response in a worldwide durum wheat collection of 496 accessions. Thirteen accessions were resistant across all experiments. Association mapping revealed 88 significant SNPs associated with leaf rust response. Of these, 33 SNPs were located on chromosomes 2A and 2B, and 55 SNPs were distributed across all other chromosomes except for 1B and 7B. Twenty markers were associated with leaf rust response at seedling stage, while 68 markers were associated with leaf rust response at adult plant stage. The current study identified a total of 14 previously uncharacterized loci associated with leaf rust response in durum wheat. The discovery of these loci through association mapping (AM) is a significant step in identifying useful sources of resistance that can be used to broaden the relatively narrow leaf rust resistance spectrum in durum wheat germplasm. Copyright © 2016 Crop Science Society of America.

  14. The first genetic map of a synthesized allohexaploid Brassica with A, B and C genomes based on simple sequence repeat markers.

    Science.gov (United States)

    Yang, S; Chen, S; Geng, X X; Yan, G; Li, Z Y; Meng, J L; Cowling, W A; Zhou, W J

    2016-04-01

    We present the first genetic map of an allohexaploid Brassica species, based on segregating microsatellite markers in a doubled haploid mapping population generated from a hybrid between two hexaploid parents. This study reports the first genetic map of trigenomic Brassica. A doubled haploid mapping population consisting of 189 lines was obtained via microspore culture from a hybrid H16-1 derived from a cross between two allohexaploid Brassica lines (7H170-1 and Y54-2). Simple sequence repeat primer pairs specific to the A genome (107), B genome (44) and C genome (109) were used to construct a genetic linkage map of the population. Twenty-seven linkage groups were resolved from 274 polymorphic loci on the A genome (109), B genome (49) and C genome (116) covering a total genetic distance of 3178.8 cM with an average distance between markers of 11.60 cM. This is the first genetic framework map for the artificially synthesized Brassica allohexaploids. The linkage groups represent the expected complement of chromosomes in the A, B and C genomes from the original diploid and tetraploid parents. This framework linkage map will be valuable for QTL analysis and future genetic improvement of a new allohexaploid Brassica species, and in improving our understanding of the genetic control of meiosis in new polyploids.
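    The average marker spacing quoted in this abstract can be reproduced from the other figures it reports. The short check below is only an illustration (the variable names are assumptions); it shows that the quoted 11.60 cM corresponds to the total map length divided by the number of loci, whereas dividing by the number of marker intervals would give a larger value.

```python
# Locus counts per genome and total map length, as quoted in the abstract
loci = {"A": 109, "B": 49, "C": 116}
total_length_cm = 3178.8  # total genetic distance in cM

total_loci = sum(loci.values())
average_spacing_cm = total_length_cm / total_loci

print(total_loci)
print(round(average_spacing_cm, 2))
```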

  15. Improved bacteriophage genome data is necessary for integrating viral and bacterial ecology.

    Science.gov (United States)

    Bibby, Kyle

    2014-02-01

    The recent rise in "omics"-enabled approaches has led to improved understanding in many areas of microbial ecology. However, despite the importance of viruses in a broad microbial ecology context, viral ecology remains largely unintegrated into high-throughput microbial ecology studies. A fundamental hindrance to the integration of viral ecology into omics-enabled microbial ecology studies is the lack of suitable reference bacteriophage genomes in reference databases: currently, only 0.001% of bacteriophage diversity is represented in genome sequence databases. This commentary serves to highlight this issue and to promote bacteriophage genome sequencing as a valuable scientific undertaking to both better understand bacteriophage diversity and move towards a more holistic view of microbial ecology.

  16. An Integrated Resource for Barley Linkage Map and Malting Quality QTL Alignment

    Directory of Open Access Journals (Sweden)

    Péter Szűcs

    2009-07-01

    Barley (Hordeum vulgare L.) is an economically important model plant for genetics research. Barley is currently served by an increasingly comprehensive set of tools for genetic analysis that have recently been augmented by high-density genetic linkage maps built with gene-based single nucleotide polymorphisms (SNPs). These SNP-based maps need to be aligned with earlier generation maps, which were used for quantitative trait locus (QTL) detection, by integrating multiple types of markers into a single map. A 2383 locus linkage map was developed using the Oregon Wolfe Barley (OWB) Mapping Population to allow such alignments. The map is based on 1472 SNP, 722 DArT, and 189 prior markers which include morphological, simple sequence repeat (SSR), Restriction Fragment Length Polymorphism (RFLP), and sequence tagged site (STS) loci. This new OWB map forms, therefore, a useful bridge between high-density SNP-only maps and prior QTL reports. The application of this bridge concept is shown using malting-quality QTLs from multiple mapping populations, as reported in the literature. This is the first step toward developing a Barley QTL Community Curation workbook for all types of QTLs and maps, on the GrainGenes website. The OWB-related resources are available at OWB Data and GrainGenes Tools (OWB-DGGT).

  17. Canonical integration and analysis of periodic maps using non-standard analysis and Lie methods

    Energy Technology Data Exchange (ETDEWEB)

    Forest, E.; Berz, M.

    1988-06-01

    We describe a method and a way of thinking which is ideally suited for the study of systems represented by canonical integrators. Starting with the continuous description provided by the Hamiltonians, we replace it by a succession of preferably canonical maps. The power series representation of these maps can be extracted with a computer implementation of the tools of Non-Standard Analysis and analyzed by the same tools. For a nearly integrable system, we can define a Floquet ring in a way consistent with our needs. Using the finite time maps, the Floquet ring is defined only at the locations s_i where one perturbs or observes the phase space. At most the total number of locations is equal to the total number of steps of our integrator. We can also produce pseudo-Hamiltonians which describe the motion induced by these maps. 15 refs., 1 fig.
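    The idea of replacing a continuous Hamiltonian description with a succession of canonical maps can be illustrated with a kick-drift-kick (leapfrog) step. This is a generic sketch and not the authors' implementation: the pendulum Hamiltonian H = p^2/2 - cos(q), the step size, and all function names are assumptions chosen for illustration.

```python
import math

def hamiltonian(q, p):
    """Pendulum energy H = p^2/2 - cos(q) (assumed example system)."""
    return 0.5 * p * p - math.cos(q)

def leapfrog_step(q, p, dt):
    """One kick-drift-kick step; each sub-step is an area-preserving map."""
    p = p - 0.5 * dt * math.sin(q)  # half kick: dp/dt = -dH/dq
    q = q + dt * p                  # drift:     dq/dt = +dH/dp
    p = p - 0.5 * dt * math.sin(q)  # half kick
    return q, p

def iterate(q, p, dt, n_steps):
    """Apply the one-step map n_steps times."""
    for _ in range(n_steps):
        q, p = leapfrog_step(q, p, dt)
    return q, p

q0, p0 = 1.0, 0.0
q1, p1 = iterate(q0, p0, dt=0.05, n_steps=2000)

# The energy error of the composed map stays bounded instead of growing steadily.
energy_drift = abs(hamiltonian(q1, p1) - hamiltonian(q0, p0))

# The map is time-reversible: stepping back with -dt recovers the start point.
q_back, p_back = iterate(q1, p1, dt=-0.05, n_steps=2000)
```

    Because every sub-step shears phase space along a single coordinate, the composed map preserves phase-space area, which is the property the abstract refers to as canonical.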

  18. A class of conservative Hamiltonians with exactly integrable discrete two-dimensional parametric maps

    International Nuclear Information System (INIS)

    Dikande, Alain M; Njumbe, E Epie

    2010-01-01

    A class of discrete conservative Hamiltonians with completely integrable two-dimensional (2D) mappings is constructed whose generic models are three families of non-integrable discrete Hamiltonians with on-site potentials whose double-well shapes vary. Unlike the discrete 2D mappings associated with the generic models, which all display pitchfork bifurcations towards randomly pinned states with chaotic features, for the derived models the pitchfork bifurcation leads to fixed points always surrounded by periodic trajectories. A nonlinear stability analysis reveals a finite crossover on the bifurcation line at which the pitchfork transition takes the maps from regular real periodic trajectories towards a regime dominated by a cluster of periodic point trajectories representing the allowed real solutions. The rich variety of structures displayed by the new class of discrete maps, combined with their complete integrability, offers rich perspectives for theoretical modelling of a wide class of systems undergoing structural instabilities without noticeable chaotic precursors.

  19. Integrating molecular QTL data into genome-wide genetic association analysis: Probabilistic assessment of enrichment and colocalization.

    Directory of Open Access Journals (Sweden)

    Xiaoquan Wen

    2017-03-01

    We propose a novel statistical framework for integrating the results from molecular quantitative trait loci (QTL) mapping into genome-wide genetic association analysis of complex traits, with the primary objectives of quantitatively assessing the enrichment of the molecular QTLs in complex trait-associated genetic variants and the colocalizations of the two types of association signals. We introduce a natural Bayesian hierarchical model that treats the latent association status of molecular QTLs as SNP-level annotations for candidate SNPs of complex traits. We detail a computational procedure to seamlessly perform enrichment, fine-mapping and colocalization analyses, which is a distinct feature compared to the existing colocalization analysis procedures in the literature. The proposed approach is computationally efficient and requires only summary-level statistics. We evaluate and demonstrate the proposed computational approach through extensive simulation studies and analyses of blood lipid data and the whole blood eQTL data from the GTEx project. In addition, a useful utility from our proposed method enables the computation of expected colocalization signals using simple characteristics of the association data. Using this utility, we further illustrate the importance of enrichment analysis on the ability to discover colocalized signals and the potential limitations of currently available molecular QTL data. The software pipeline that implements the proposed computation procedures, enloc, is freely available at https://github.com/xqwen/integrative.

  20. Integrating collaborative concept mapping in case based learning

    Directory of Open Access Journals (Sweden)

    Alfredo Tifi

    2013-03-01

    The different roles of collaborative concept mapping and collaborative argumentation in Case Based Learning are discussed and compared from the perspectives of answering focus questions, fostering reflective thinking skills, and managing uncertainty in problem solving in a scaffolded environment. Marked differences are pointed out between the way concepts are used in constructing concept maps and the way meanings are adopted in case based learning through guided argumentation activities. Shared concept maps should be given different scopes, for example (a) as an advance organizer in preparing a background system of concepts that will undergo transformation while accompanying the inquiry activities on case studies or problems; (b) together with narratives, to enhance awareness of the situated epistemologies that are entailed in choosing certain concepts during more complex case studies; and (c) as an after-learning construction of a holistic vision of the whole domain by means of the most inclusive concepts, while scaffolded collaborative writing of narratives and arguments in describing and treating cases could better serve as a source of situated-inspired tools to create and refine meanings for particular concepts.

  1. Cab technology integration laboratory demonstration with moving map technology

    Science.gov (United States)

    2013-03-31

    A human performance study was conducted at the John A. Volpe National Transportation Systems Center (Volpe Center) using a locomotive research simulator, the Cab Technology Integration Laboratory (CTIL), that was acquired by the Federal Railroad Ad...

  2. Navigating the evidentiary turn in public health: Sensemaking strategies to integrate genomics into state-level chronic disease prevention programs.

    Science.gov (United States)

    Senier, Laura; Smollin, Leandra; Lee, Rachael; Nicoll, Lauren; Shields, Michael; Tan, Catherine

    2018-06-23

    In the past decade, healthcare delivery has faced two major disruptions: the mapping of the human genome and the rise of evidence-based practice. Sociologists have documented the paradigmatic shift towards evidence-based practice in medicine, but have yet to examine its effect on other health professions or the broader healthcare arena. This article shows how evidence-based practice is transforming public health in the United States. We present an in-depth qualitative analysis of interview, ethnographic, and archival data to show how Michigan's state public health agency has navigated the turn to evidence-based practice, as they have integrated scientific advances in genomics into their chronic disease prevention programming. Drawing on organizational theory, we demonstrate how they managed ambiguity through a combination of sensegiving and sensemaking activities. Specifically, they linked novel developments in genomics to a long-accepted public health planning model, the Core Public Health Functions. This made cutting edge advances in genomics more familiar to their peers in the state health agency. They also marshaled state-specific surveillance data to illustrate the public health burden of hereditary cancers in Michigan, and to make expert panel recommendations for genetic screening more locally relevant. Finally, they mobilized expertise to help their internal colleagues and external partners modernize conventional public health activities in chronic disease prevention. Our findings show that tools and concepts from organizational sociology can help medical sociologists understand how evidence-based practice is shaping institutions and interprofessional relations in the healthcare arena. Copyright © 2018 Elsevier Ltd. All rights reserved.

  3. BiGG Models: A platform for integrating, standardizing and sharing genome-scale models

    DEFF Research Database (Denmark)

    King, Zachary A.; Lu, Justin; Dräger, Andreas

    2016-01-01

    Genome-scale metabolic models are mathematically-structured knowledge bases that can be used to predict metabolic pathway usage and growth phenotypes. Furthermore, they can generate and test hypotheses when integrated with experimental data. To maximize the value of these models, centralized repo...

  4. Integrative genome analyses identify key somatic driver mutations of small-cell lung cancer

    NARCIS (Netherlands)

    Peifer, Martin; Fernandez-Cuesta, Lynnette; Sos, Martin L.; George, Julie; Seidel, Danila; Kasper, Lawryn H.; Plenker, Dennis; Leenders, Frauke; Sun, Ruping; Zander, Thomas; Menon, Roopika; Koker, Mirjam; Dahmen, Ilona; Mueller, Christian; Di Cerbo, Vincenzo; Schildhaus, Hans-Ulrich; Altmueller, Janine; Baessmann, Ingelore; Becker, Christian; de Wilde, Bram; Vandesompele, Jo; Boehm, Diana; Ansen, Sascha; Gabler, Franziska; Wilkening, Ines; Heynck, Stefanie; Heuckmann, Johannes M.; Lu, Xin; Carter, Scott L.; Cibulskis, Kristian; Banerji, Shantanu; Getz, Gad; Park, Kwon-Sik; Rauh, Daniel; Gruetter, Christian; Fischer, Matthias; Pasqualucci, Laura; Wright, Gavin; Wainer, Zoe; Russell, Prudence; Petersen, Iver; Chen, Yuan; Stoelben, Erich; Ludwig, Corinna; Schnabel, Philipp; Hoffmann, Hans; Muley, Thomas; Brockmann, Michael; Engel-Riedel, Walburga; Muscarella, Lucia A.; Fazio, Vito M.; Groen, Harry; Timens, Wim; Sietsma, Hannie; Thunnissen, Erik; Smit, Egbert; Heideman, Danielle A. M.; Snijders, Peter J. F.; Cappuzzo, Federico; Ligorio, Claudia; Damiani, Stefania; Field, John; Solberg, Steinar; Brustugun, Odd Terje; Lund-Iversen, Marius; Saenger, Joerg; Clement, Joachim H.; Soltermann, Alex; Moch, Holger; Weder, Walter; Solomon, Benjamin; Soria, Jean-Charles; Validire, Pierre; Besse, Benjamin; Brambilla, Elisabeth; Brambilla, Christian; Lantuejoul, Sylvie; Lorimier, Philippe; Schneider, Peter M.; Hallek, Michael; Pao, William; Meyerson, Matthew; Sage, Julien; Shendure, Jay; Schneider, Robert; Buettner, Reinhard; Wolf, Juergen; Nuernberg, Peter; Perner, Sven; Heukamp, Lukas C.; Brindle, Paul K.; Haas, Stefan; Thomas, Roman K.

    2012-01-01

    Small-cell lung cancer (SCLC) is an aggressive lung tumor subtype with poor prognosis(1-3). We sequenced 29 SCLC exomes, 2 genomes and 15 transcriptomes and found an extremely high mutation rate of 7.4 +/- 1 protein-changing mutations per million base pairs. Therefore, we conducted integrated

  5. Filling the knowledge gap: Integrating quantitative genetics and genomics in graduate education and outreach

    Science.gov (United States)

    The genomics revolution provides vital tools to address global food security. Yet to be incorporated into livestock breeding, molecular techniques need to be integrated into a quantitative genetics framework. Within the U.S., with shrinking faculty numbers with the requisite skills, the capacity to ...

  6. Quantitative and Qualitative Proteome Characteristics Extracted from In-Depth Integrated Genomics and Proteomics Analysis

    NARCIS (Netherlands)

    Low, Teck Yew; van Heesch, Sebastiaan; van den Toorn, Henk; Giansanti, Piero; Cristobal, Alba; Toonen, Pim; Schafer, Sebastian; Huebner, Norbert; van Breukelen, Bas; Mohammed, Shabaz; Cuppen, Edwin; Heck, Albert J. R.; Guryev, Victor

    2013-01-01

    Quantitative and qualitative protein characteristics are regulated at genomic, transcriptomic, and post-transcriptional levels. Here, we integrated in-depth transcriptome and proteome analyses of liver tissues from two rat strains to unravel the interactions within and between these layers. We

  7. Integrative Genomic Analysis of Cholangiocarcinoma Identifies Distinct IDH-Mutant Molecular Profiles

    DEFF Research Database (Denmark)

    Farshidfar, Farshad; Zheng, Siyuan; Gingras, Marie-Claude

    2017-01-01

    Cholangiocarcinoma (CCA) is an aggressive malignancy of the bile ducts, with poor prognosis and limited treatment options. Here, we describe the integrated analysis of somatic mutations, RNA expression, copy number, and DNA methylation by The Cancer Genome Atlas of a set of predominantly intrahep...

  8. Nucleotide excision repair: a multi-step mechanism required to maintain genome integrity

    NARCIS (Netherlands)

    Moser, Jill

    2010-01-01

    DNA is continuously exposed to exogenous and genotoxic insults including ionizing and ultraviolet radiation as well as chemical agents. DNA damage can compromise the integrity of the genome and have potentially deleterious effects. Ultraviolet light (UV) can induce the formation of helix distorting

  9. Toward an Integrated BAC Library Resource for Genome Sequencing and Analysis; FINAL

    International Nuclear Information System (INIS)

    Simon, M. I.; Kim, U.-J.

    2002-01-01

    We developed a great deal of expertise in building large BAC libraries from a variety of DNA sources including humans, mice, corn, microorganisms, worms, and Arabidopsis. We greatly improved the technology for screening these libraries rapidly and for selecting appropriate BACs and mapping BACs to develop large overlapping contigs. We became involved in supplying BACs and BAC contigs to a variety of sequencing and mapping projects and we began to collaborate with Drs. Adams and Venter at TIGR and with Dr. Leroy Hood and his group at University of Washington to provide BACs for end sequencing and for mapping and sequencing of large fragments of chromosome 16. Together with Dr. Ian Dunham and his co-workers at the Sanger Center we completed the mapping and they completed the sequencing of the first human chromosome, chromosome 22. This was published in Nature in 1999 and our BAC contigs made a major contribution to this sequencing effort. Drs. Shizuya and Ding invented an automated highly accurate BAC mapping technique. We also developed long-term collaborations with Dr. Uli Weier at UCSF in the design of BAC probes for characterization of human tumors and specific chromosome deletions and breakpoints. Finally the contribution of our work to the human genome project has been recognized in the publication both by the international consortium and the NIH of a draft sequence of the human genome in Nature last year. Dr. Shizuya was acknowledged in the authorship of that landmark paper. Dr. Simon was also an author on the Venter/Adams Celera project sequencing the human genome that was published in Science last year

  10. Genome-Wide Association Mapping of Seed Coat Color in Brassica napus.

    Science.gov (United States)

    Wang, Jia; Xian, Xiaohua; Xu, Xinfu; Qu, Cunmin; Lu, Kun; Li, Jiana; Liu, Liezhao

    2017-07-05

    Seed coat color is an extremely important breeding characteristic of Brassica napus. To elucidate the factors affecting the genetic architecture of seed coat color, a genome-wide association study (GWAS) of seed coat color was conducted with a diversity panel comprising 520 B. napus cultivars and inbred lines. In total, 22 single-nucleotide polymorphisms (SNPs) distributed on 7 chromosomes were found to be associated with seed coat color. The most significant SNPs were found in 2014 near Bn-scaff_15763_1-p233999, only 43.42 kb away from BnaC06g17050D, which is orthologous to Arabidopsis thaliana TRANSPARENT TESTA 12 (TT12), an important gene involved in the transportation of proanthocyanidin precursors into the vacuole. Two of eight repeatedly detected SNPs can be identified and digested by restriction enzymes. Candidate gene mining revealed that the relevant regions of significant SNP loci on the A09 and C08 chromosomes are highly homologous. Moreover, a comparison of the GWAS results to those of previous quantitative trait locus (QTL) studies showed that 11 SNPs were located in the confidence intervals of the QTLs identified in previous studies based on linkage analyses or association mapping. Our results provide insights into the genetic basis of seed coat color in B. napus, and the beneficial allele, SNP information, and candidate genes should be useful for selecting yellow seeds in B. napus breeding.

  11. Genomic mapping of single-stranded DNA in hydroxyurea-challenged yeasts identifies origins of replication.

    Science.gov (United States)

    Feng, Wenyi; Collingwood, David; Boeck, Max E; Fox, Lindsay A; Alvino, Gina M; Fangman, Walton L; Raghuraman, Mosur K; Brewer, Bonita J

    2006-02-01

    During DNA replication one or both strands transiently become single stranded: first at the sites where initiation of DNA synthesis occurs (known as origins of replication) and subsequently on the lagging strands of replication forks as discontinuous Okazaki fragments are generated. We report a genome-wide analysis of single-stranded DNA (ssDNA) formation in the presence of hydroxyurea during DNA replication in wild-type and checkpoint-deficient rad53 Saccharomyces cerevisiae cells. In wild-type cells, ssDNA was first observed at a subset of replication origins and later 'migrated' bi-directionally, suggesting that ssDNA formation is associated with continuously moving replication forks. In rad53 cells, ssDNA was observed at virtually every known origin, but remained there over time, suggesting that replication forks stall. Telomeric regions seemed to be particularly sensitive to the loss of Rad53 checkpoint function. Replication origins in Schizosaccharomyces pombe were also mapped using our method.

  12. A genome-wide map of hyper-edited RNA reveals numerous new sites

    Science.gov (United States)

    Porath, Hagit T.; Carmi, Shai; Levanon, Erez Y.

    2014-01-01

    Adenosine-to-inosine editing is one of the most frequent post-transcriptional modifications, manifested as A-to-G mismatches when comparing RNA sequences with their source DNA. Recently, a number of RNA-seq data sets have been screened for the presence of A-to-G editing, and hundreds of thousands of editing sites identified. Here we show that existing screens missed the majority of sites by ignoring reads with excessive (‘hyper’) editing that do not easily align to the genome. We show that careful alignment and examination of the unmapped reads in RNA-seq studies reveal numerous new sites, usually many more than originally discovered, and in precisely those regions that are most heavily edited. Specifically, we discover 327,096 new editing sites in the heavily studied Illumina Human BodyMap data and more than double the number of detected sites in several published screens. We also identify thousands of new sites in mouse, rat, opossum and fly. Our results establish that hyper-editing events account for the majority of editing sites. PMID:25158696
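    Since editing shows up as A-to-G mismatches between an RNA read and its source DNA, the basic signal can be illustrated with a simple count. The sketch below is not the authors' pipeline: it assumes a read that is already aligned to the reference without gaps, and the sequences and function name are hypothetical.

```python
def count_a_to_g(reference, read):
    """Count A-to-G mismatches between a DNA reference and an aligned read.

    Assumes both strings have equal length and are aligned without gaps.
    Returns a tuple (a_to_g, other_mismatches).
    """
    if len(reference) != len(read):
        raise ValueError("sequences must have equal length")
    a_to_g = 0
    other = 0
    for ref_base, read_base in zip(reference.upper(), read.upper()):
        if ref_base == read_base:
            continue
        if ref_base == "A" and read_base == "G":
            a_to_g += 1  # candidate editing site
        else:
            other += 1   # any other mismatch type
    return a_to_g, other

# Hypothetical example: a heavily edited read differs from the reference
# only at A positions, each read as G.
reference = "ACATTAGCAAGT"
read = "GCGTTGGCAGGT"
a_to_g, other = count_a_to_g(reference, read)
```

    A read with many A-to-G mismatches and few mismatches of other types is the pattern the abstract describes as hyper-editing; such reads may fail ordinary alignment precisely because of the high mismatch count.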

  13. Genome wide association mapping for the tolerance to the polyamine oxidase inhibitor guazatine in Arabidopsis thaliana

    Directory of Open Access Journals (Sweden)

    Kostadin Evgeniev Atanasov

    2016-04-01

    Guazatine is a potent inhibitor of polyamine oxidase (PAO) activity. In agriculture, guazatine is used as a non-systemic contact fungicide efficient in the protection of cereals and citrus fruits against disease. The composition of guazatine is complex, mainly constituted by a mixture of synthetic guanidated polyamines (polyaminoguanidines). Here we have studied the effects of exposure to guazatine in the weed Arabidopsis thaliana. We report that micromolar concentrations of guazatine are sufficient to inhibit growth of Arabidopsis seedlings and induce chlorosis, whereas germination is barely affected. We observed the occurrence of quantitative variation in the response to guazatine between 107 randomly chosen Arabidopsis accessions. This enabled us to undertake genome-wide association (GWA) mapping that identified a locus on chromosome one associated with guazatine tolerance. CHLOROPHYLLASE 1 (CLH1) within this locus was studied as a candidate gene, together with its paralog (CLH2). The analysis of independent clh1-2, clh1-3, clh2-3, clh2-2 and double clh1-2 clh2-3 mutant alleles indicated that CLH1 and/or CLH2 loss-of-function or expression down-regulation promote guazatine tolerance in Arabidopsis. We report a natural mechanism by which Arabidopsis populations can overcome toxicity by the fungicide guazatine.

  14. Phenotypic plasticity, QTL mapping and genomic characterization of bud set in black poplar

    Directory of Open Access Journals (Sweden)

    Fabbrini Francesco

    2012-04-01

    Background: The genetic control of important adaptive traits, such as bud set, is still poorly understood in most forest tree species. Poplar is an ideal model tree to study bud set because of its indeterminate shoot growth. Thus, a full-sib family derived from an intraspecific cross of P. nigra with 162 clonally replicated progeny was used to assess the phenotypic plasticity and genetic variation of bud set in two sites of contrasting environmental conditions. Results: Six crucial phenological stages of bud set were scored. Night length appeared to be the most important signal triggering the onset of growth cessation. Nevertheless, the effect of other environmental factors, such as temperature, increased during the process. Moreover, a considerable role of genotype × environment (G × E) interaction was found in all phenological stages, with the lowest temperature appearing to influence the sensitivity of the most plastic genotypes. Descriptors of growth cessation and bud onset explained the largest part of phenotypic variation of the entire process. Quantitative trait loci (QTL) for these traits were detected. For the four selected traits (the onset of growth cessation (date2.5), the transition from shoot to bud (date1.5), the duration of bud formation (subproc1) and bud maturation (subproc2)), eight and sixteen QTL were mapped on the maternal and paternal map, respectively. The identified QTL, each one characterized by small or modest effect, highlighted the complex nature of traits involved in the bud set process. Comparison between map location of QTL and P. trichocarpa genome sequence allowed the identification of 13 gene models, 67 bud set-related expressional and six functional candidate genes (CGs). These CGs are functionally related to relevant biological processes, environmental sensing, signaling, and cell growth and development. Some strong QTL had no obvious CGs, and hold great promise to identify unknown genes that affect bud set

  15. Integrating field sampling, geostatistics and remote sensing to map wetland vegetation in the Pantanal, Brazil

    Directory of Open Access Journals (Sweden)

    J. Arieira

    2011-03-01

    Development of efficient methodologies for mapping wetland vegetation is of key importance to wetland conservation. Here we propose the integration of a number of statistical techniques, in particular cluster analysis, universal kriging and error propagation modelling, to integrate observations from remote sensing and field sampling for mapping vegetation communities and estimating uncertainty. The approach results in seven vegetation communities with a known floral composition that can be mapped over large areas using remotely sensed data. The relationships between remotely sensed data and vegetation patterns, captured in four factorial axes, were described using multiple linear regression models. These were then used in a universal kriging procedure to reduce the mapping uncertainty. Cross-validation procedures and Monte Carlo simulations were used to quantify the uncertainty in the resulting map. Cross-validation showed that accuracy in classification varies according to the community type, as a result of sampling density and configuration. A map of uncertainty derived from Monte Carlo simulations revealed significant spatial variation in classification, but this had little impact on the proportion and arrangement of the communities observed. These results suggested that mapping improvement could be achieved by increasing the number of field observations of those communities with a scattered and small patch size distribution, or by including a larger number of digital images as explanatory variables in the model. Comparison of the resulting plant community map with a flood duration map revealed that flooding duration is an important driver of vegetation zonation. This mapping approach is able to integrate field point data and high-resolution remote-sensing images, providing a new basis to map wetland vegetation and allow its future application in habitat management, conservation assessment and long-term ecological monitoring in wetland

  16. Unexpected inheritance: multiple integrations of ancient bornavirus and ebolavirus/marburgvirus sequences in vertebrate genomes.

    Science.gov (United States)

    Belyi, Vladimir A; Levine, Arnold J; Skalka, Anna Marie

    2010-07-29

    Vertebrate genomes contain numerous copies of retroviral sequences, acquired over the course of evolution. Until recently they were thought to be the only type of RNA viruses to be so represented, because integration of a DNA copy of their genome is required for their replication. In this study, an extensive sequence comparison was conducted in which 5,666 viral genes from all known non-retroviral families with single-stranded RNA genomes were matched against the germline genomes of 48 vertebrate species, to determine if such viruses could also contribute to the vertebrate genetic heritage. In 19 of the tested vertebrate species, we discovered as many as 80 high-confidence examples of genomic DNA sequences that appear to be derived, as long ago as 40 million years, from ancestral members of 4 currently circulating virus families with single strand RNA genomes. Surprisingly, almost all of the sequences are related to only two families in the Order Mononegavirales: the Bornaviruses and the Filoviruses, which cause lethal neurological disease and hemorrhagic fevers, respectively. Based on signature landmarks some, and perhaps all, of the endogenous virus-like DNA sequences appear to be LINE element-facilitated integrations derived from viral mRNAs. The integrations represent genes that encode viral nucleocapsid, RNA-dependent-RNA-polymerase, matrix and, possibly, glycoproteins. Integrations are generally limited to one or very few copies of a related viral gene per species, suggesting that once the initial germline integration was obtained (or selected), later integrations failed or provided little advantage to the host. The conservation of relatively long open reading frames for several of the endogenous sequences, the virus-like protein regions represented, and a potential correlation between their presence and a species' resistance to the diseases caused by these pathogens, are consistent with the notion that their products provide some important biological

  17. Unexpected inheritance: multiple integrations of ancient bornavirus and ebolavirus/marburgvirus sequences in vertebrate genomes.

    Directory of Open Access Journals (Sweden)

    Vladimir A Belyi

    2010-07-01

    Full Text Available Vertebrate genomes contain numerous copies of retroviral sequences, acquired over the course of evolution. Until recently they were thought to be the only type of RNA viruses to be so represented, because integration of a DNA copy of their genome is required for their replication. In this study, an extensive sequence comparison was conducted in which 5,666 viral genes from all known non-retroviral families with single-stranded RNA genomes were matched against the germline genomes of 48 vertebrate species, to determine if such viruses could also contribute to the vertebrate genetic heritage. In 19 of the tested vertebrate species, we discovered as many as 80 high-confidence examples of genomic DNA sequences that appear to be derived, as long ago as 40 million years, from ancestral members of 4 currently circulating virus families with single strand RNA genomes. Surprisingly, almost all of the sequences are related to only two families in the Order Mononegavirales: the Bornaviruses and the Filoviruses, which cause lethal neurological disease and hemorrhagic fevers, respectively. Based on signature landmarks some, and perhaps all, of the endogenous virus-like DNA sequences appear to be LINE element-facilitated integrations derived from viral mRNAs. The integrations represent genes that encode viral nucleocapsid, RNA-dependent-RNA-polymerase, matrix and, possibly, glycoproteins. Integrations are generally limited to one or very few copies of a related viral gene per species, suggesting that once the initial germline integration was obtained (or selected), later integrations failed or provided little advantage to the host. The conservation of relatively long open reading frames for several of the endogenous sequences, the virus-like protein regions represented, and a potential correlation between their presence and a species' resistance to the diseases caused by these pathogens, are consistent with the notion that their products provide some important

  18. Genome-wide mapping of virulence in brown planthopper identifies loci that break down host plant resistance.

    Science.gov (United States)

    Jing, Shengli; Zhang, Lei; Ma, Yinhua; Liu, Bingfang; Zhao, Yan; Yu, Hangjin; Zhou, Xi; Qin, Rui; Zhu, Lili; He, Guangcun

    2014-01-01

    Insects and plants have coexisted for over 350 million years and their interactions have affected ecosystems and agricultural practices worldwide. Variation in herbivorous insects' virulence to circumvent host resistance has been extensively documented. However, despite decades of investigation, the genetic foundations of virulence are currently unknown. The brown planthopper (Nilaparvata lugens) is the most destructive rice (Oryza sativa) pest in the world. The identification of the resistance gene Bph1 and its introduction in commercial rice varieties prompted the emergence of a new virulent brown planthopper biotype that was able to break the resistance conferred by Bph1. In this study, we aimed to construct a high density linkage map for the brown planthopper and identify the loci responsible for its virulence in order to determine their genetic architecture. Based on genotyping data for hundreds of molecular markers in three mapping populations, we constructed the most comprehensive linkage map available for this species, covering 96.6% of its genome. Fifteen chromosomes were anchored with 124 gene-specific markers. Using genome-wide scanning and interval mapping, the Qhp7 locus that governs preference for Bph1 plants was mapped to a 0.1 cM region of chromosome 7. In addition, two major QTLs that govern the rate of insect growth on resistant rice plants were identified on chromosomes 5 (Qgr5) and 14 (Qgr14). This is the first study to successfully locate virulence in the genome of this important agricultural insect by marker-based genetic mapping. Our results show that the virulence which overcomes the resistance conferred by Bph1 is controlled by a few major genes and that the components of virulence originate from independent genetic characters. The isolation of these loci will enable the elucidation of the molecular mechanisms underpinning the rice-brown planthopper interaction and facilitate the development of durable approaches for controlling this most

  19. Genome-wide mapping of virulence in brown planthopper identifies loci that break down host plant resistance.

    Directory of Open Access Journals (Sweden)

    Shengli Jing

    Full Text Available Insects and plants have coexisted for over 350 million years and their interactions have affected ecosystems and agricultural practices worldwide. Variation in herbivorous insects' virulence to circumvent host resistance has been extensively documented. However, despite decades of investigation, the genetic foundations of virulence are currently unknown. The brown planthopper (Nilaparvata lugens) is the most destructive rice (Oryza sativa) pest in the world. The identification of the resistance gene Bph1 and its introduction in commercial rice varieties prompted the emergence of a new virulent brown planthopper biotype that was able to break the resistance conferred by Bph1. In this study, we aimed to construct a high density linkage map for the brown planthopper and identify the loci responsible for its virulence in order to determine their genetic architecture. Based on genotyping data for hundreds of molecular markers in three mapping populations, we constructed the most comprehensive linkage map available for this species, covering 96.6% of its genome. Fifteen chromosomes were anchored with 124 gene-specific markers. Using genome-wide scanning and interval mapping, the Qhp7 locus that governs preference for Bph1 plants was mapped to a 0.1 cM region of chromosome 7. In addition, two major QTLs that govern the rate of insect growth on resistant rice plants were identified on chromosomes 5 (Qgr5) and 14 (Qgr14). This is the first study to successfully locate virulence in the genome of this important agricultural insect by marker-based genetic mapping. Our results show that the virulence which overcomes the resistance conferred by Bph1 is controlled by a few major genes and that the components of virulence originate from independent genetic characters. The isolation of these loci will enable the elucidation of the molecular mechanisms underpinning the rice-brown planthopper interaction and facilitate the development of durable approaches for

  20. From Planetary Mapping to Map Production: Planetary Cartography as integral discipline in Planetary Sciences

    Science.gov (United States)

    Nass, Andrea; van Gasselt, Stephan; Hargitai, Hendrik; Hare, Trent; Manaud, Nicolas; Karachevtseva, Irina; Kersten, Elke; Roatsch, Thomas; Wählisch, Marita; Kereszturi, Akos

    2016-04-01

    Cartography is one of the most important communication channels between users of spatial information and laymen as well as the open public alike. This applies to all known real-world objects located either here on Earth or on any other object in our Solar System. In planetary sciences, however, the main use of cartography resides in a concept called planetary mapping with all its various attached meanings: it can be (1) systematic spacecraft observation from orbit, i.e. the retrieval of physical information, (2) the interpretation of discrete planetary surface units and their abstraction, or it can be (3) planetary cartography sensu stricto, i.e., the technical and artistic creation of map products. As the concept of planetary mapping covers a wide range of different information and knowledge levels, aims associated with the concept of mapping consequently range from a technical and engineering focus to a scientific distillation process. Among others, scientific centers focusing on planetary cartography are the United States Geological Survey (USGS, Flagstaff), the Moscow State University of Geodesy and Cartography (MIIGAiK, Moscow), Eötvös Loránd University (ELTE, Hungary), and the German Aerospace Center (DLR, Berlin). The International Astronomical Union (IAU), the Commission Planetary Cartography within International Cartographic Association (ICA), the Open Geospatial Consortium (OGC), the WG IV/8 Planetary Mapping and Spatial Databases within International Society for Photogrammetry and Remote Sensing (ISPRS) and a range of other institutions contribute to definition frameworks in planetary cartography. Classical cartography is nowadays often (mis-)understood as a tool mainly rather than a scientific discipline and an art of communication. Consequently, concepts of information systems, mapping tools and cartographic frameworks are used interchangeably, and cartographic workflows and visualization of spatial information in thematic maps have often been

  1. ReMap 2018: an updated atlas of regulatory regions from an integrative analysis of DNA-binding ChIP-seq experiments.

    Science.gov (United States)

    Chèneby, Jeanne; Gheorghe, Marius; Artufel, Marie; Mathelier, Anthony; Ballester, Benoit

    2018-01-04

    With this latest release of ReMap (http://remap.cisreg.eu), we present a unique collection of regulatory regions in human, as a result of a large-scale integrative analysis of ChIP-seq experiments for hundreds of transcriptional regulators (TRs) such as transcription factors, transcriptional co-activators and chromatin regulators. In 2015, we introduced the ReMap database to capture the genome regulatory space by integrating public ChIP-seq datasets, covering 237 TRs across 13 million (M) peaks. In this release, we have extended this catalog to constitute a unique collection of regulatory regions. Specifically, we have collected, analyzed and retained after quality control a total of 2829 ChIP-seq datasets available from public sources, covering a total of 485 TRs with a catalog of 80M peaks. Additionally, the updated database includes new search features for TR names as well as aliases, including cell line names and the ability to navigate the data directly within genome browsers via public track hubs. Finally, full access to this catalog is available online together with a TR binding enrichment analysis tool. ReMap 2018 provides a significant update of the ReMap database, providing an in depth view of the complexity of the regulatory landscape in human. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  2. Physical mapping of a large plant genome using global high-information-content-fingerprinting: the distal region of the wheat ancestor Aegilops tauschii chromosome 3DS

    Directory of Open Access Journals (Sweden)

    You Frank M

    2010-06-01

    Full Text Available Abstract Background Physical maps employing libraries of bacterial artificial chromosome (BAC) clones are essential for comparative genomics and sequencing of large and repetitive genomes such as those of the hexaploid bread wheat. The diploid ancestor of the D-genome of hexaploid wheat (Triticum aestivum), Aegilops tauschii, is used as a resource for wheat genomics. The barley diploid genome also provides a good model for the Triticeae and T. aestivum since it is only slightly larger than the ancestor wheat D genome. Gene co-linearity between the grasses can be exploited by extrapolating from rice and Brachypodium distachyon to Ae. tauschii or barley, and then to wheat. Results We report the use of Ae. tauschii for the construction of the physical map of a large distal region of chromosome arm 3DS. A physical map of 25.4 Mb was constructed by anchoring BAC clones of Ae. tauschii with 85 EST on the Ae. tauschii and barley genetic maps. The 24 contigs were aligned to the rice and B. distachyon genomic sequences and a high density SNP genetic map of barley. As expected, the mapped region is highly collinear to the orthologous chromosome 1 in rice, chromosome 2 in B. distachyon and chromosome 3H in barley. However, the chromosome scale of the comparative maps presented provides new insights into grass genome organization. The disruptions of the Ae. tauschii-rice and Ae. tauschii-Brachypodium syntenies were identical. We observed chromosomal rearrangements between Ae. tauschii and barley. The comparison of Ae. tauschii physical and genetic maps showed that the recombination rate across the region dropped from 2.19 cM/Mb in the distal region to 0.09 cM/Mb in the proximal region. The size of the gaps between contigs was evaluated by comparing the recombination rate along the map with the local recombination rates calculated on single contigs. Conclusions The physical map reported here is the first physical map using fingerprinting of a complete

  3. A Visual Interface Diagram For Mapping Functions In Integrated Products

    DEFF Research Database (Denmark)

    Ingerslev, Mattias; Oliver Jespersen, Mikkel; Göhler, Simon Moritz

    2015-01-01

    In product development there is a recognized tendency towards increased functionality for each new product generation. This leads to more integrated and complex products, with the risk of development delays and quality issues as a consequence of lacking overview and transparency. The work described in this article has been conducted in collaboration with Novo Nordisk on the insulin injection device FlexTouch® as case product. The FlexTouch® reflects the characteristics of an integrated product with several functions shared between a relatively low number of parts. In this article we present a novel way of visualizing relations between parts and functions in highly integrated mechanical products. The result is an interface diagram that supports design teams in communication, decision making and design management. The diagram gives the designer an overview of the couplings and dependencies within a product...

  4. Restriction map of the single-stranded DNA genome of Kilham rat virus strain 171, a nondefective parvovirus

    International Nuclear Information System (INIS)

    Banerjee, P.T.; Rathrock, R.; Mitra, S.

    1981-01-01

    A physical map of Kilham rat virus strain 171 DNA was constructed by analyzing the sizes and locations of restriction endonuclease-generated fragments of the replicative-form viral DNA synthesized in vitro. BglI, KpnI, BamHI, SmaI, XhoI, and XorII did not appear to have any cleavage sites, whereas 11 other enzymes cleaved the genome at one to eight sites, and AluI generated more than 12 distinct fragments. The 30 restriction sites that were mapped were distributed randomly in the viral genome. A comparison of the restriction fragments of in vivo- and in vitro-replicated replicative-form DNAs showed that these DNAs were identical except in the size or configuration of the terminal fragments

  5. Evolution of endogenous non-retroviral genes integrated into plant genomes

    Directory of Open Access Journals (Sweden)

    Hyosub Chu

    2014-08-01

    Full Text Available Numerous comparative genome analyses have revealed the wide extent of horizontal gene transfer (HGT) in living organisms, which contributes to their evolution and genetic diversity. Viruses play important roles in HGT. Endogenous viral elements (EVEs) are defined as viral DNA sequences present within the genomes of non-viral organisms. In eukaryotic cells, the majority of EVEs are derived from RNA viruses using reverse transcription. In contrast, endogenous non-retroviral elements (ENREs) are poorly studied. However, the increasing availability of genomic data and the rapid development of bioinformatics tools have enabled the identification of several ENREs in various eukaryotic organisms. To date, a small number of ENREs integrated into plant genomes have been identified. Of the known non-retroviruses, most identified ENREs are derived from double-strand (ds) RNA viruses, followed by single-strand (ss) DNA and ssRNA viruses. At least eight virus families have been identified. Of these, viruses in the family Partitiviridae are dominant, followed by viruses of the families Chrysoviridae and Geminiviridae. The identified ENREs have been primarily identified in eudicots, followed by monocots. In this review, we briefly discuss the current view on non-retroviral sequences integrated into plant genomes that are associated with plant-virus evolution and their possible roles in antiviral resistance.

  6. TIGER: Toolbox for integrating genome-scale metabolic models, expression data, and transcriptional regulatory networks

    Directory of Open Access Journals (Sweden)

    Jensen Paul A

    2011-09-01

    Full Text Available Abstract Background Several methods have been developed for analyzing genome-scale models of metabolism and transcriptional regulation. Many of these methods, such as Flux Balance Analysis, use constrained optimization to predict relationships between metabolic flux and the genes that encode and regulate enzyme activity. Recently, mixed integer programming has been used to encode these gene-protein-reaction (GPR) relationships into a single optimization problem, but these techniques are often of limited generality and lack a tool for automating the conversion of rules to a coupled regulatory/metabolic model. Results We present TIGER, a Toolbox for Integrating Genome-scale Metabolism, Expression, and Regulation. TIGER converts a series of generalized, Boolean or multilevel rules into a set of mixed integer inequalities. The package also includes implementations of existing algorithms to integrate high-throughput expression data with genome-scale models of metabolism and transcriptional regulation. We demonstrate how TIGER automates the coupling of a genome-scale metabolic model with GPR logic and models of transcriptional regulation, thereby serving as a platform for algorithm development and large-scale metabolic analysis. Additionally, we demonstrate how TIGER's algorithms can be used to identify inconsistencies and improve existing models of transcriptional regulation with examples from the reconstructed transcriptional regulatory network of Saccharomyces cerevisiae. Conclusion The TIGER package provides a consistent platform for algorithm development and extending existing genome-scale metabolic models with regulatory networks and high-throughput data.
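
    The conversion of Boolean rules into mixed integer inequalities mentioned in this abstract can be illustrated with the standard textbook encoding of AND and OR over binary variables. The sketch below is an assumption-laden illustration, not TIGER's actual formulation or API: the function names are hypothetical, and the encoding is simply verified by enumerating all binary inputs.

```python
from itertools import product


def and_constraints(a, b, y):
    """Inequalities that are satisfied only when y == (a AND b), for binary a, b, y."""
    return y <= a and y <= b and y >= a + b - 1


def or_constraints(a, b, y):
    """Inequalities that are satisfied only when y == (a OR b), for binary a, b, y."""
    return y >= a and y >= b and y <= a + b


def feasible_outputs(constraints, a, b):
    """Return every binary y that satisfies the inequalities for fixed inputs a, b."""
    return [y for y in (0, 1) if constraints(a, b, y)]


def check_encodings():
    """Confirm by enumeration that each encoding admits exactly the Boolean result."""
    for a, b in product((0, 1), repeat=2):
        assert feasible_outputs(and_constraints, a, b) == [a & b]
        assert feasible_outputs(or_constraints, a, b) == [a | b]
    return True
```

    In a GPR setting, a rule such as "gene1 AND gene2" for an enzyme complex would contribute the three AND inequalities to the optimization problem; how a real toolbox handles multilevel rules is not shown here.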

  7. The nucleolus—guardian of cellular homeostasis and genome integrity.

    Science.gov (United States)

    Grummt, Ingrid

    2013-12-01

    All organisms sense and respond to conditions that stress their homeostasis by downregulating the synthesis of rRNA and ribosome biogenesis, thus designating the nucleolus as the central hub in coordinating the cellular stress response. One of the most intriguing roles of the nucleolus, long regarded as a mere ribosome-producing factory, is its participation in monitoring cellular stress signals and transmitting them to the RNA polymerase I (Pol I) transcription machinery. As rRNA synthesis is a most energy-consuming process, switching off transcription of rRNA genes is an effective way of saving the energy required to maintain cellular homeostasis during acute stress. The Pol I transcription machinery is the key convergence point that collects and integrates a vast array of information from cellular signaling cascades to regulate ribosome production which, in turn, guides cell growth and proliferation. This review focuses on the mechanisms that link cell physiology to rDNA silencing, a prerequisite for nucleolar integrity and cell survival.

  8. Fluorescence in situ hybridization and optical mapping to correct scaffold arrangement in the tomato genome

    Science.gov (United States)

    Modern biological analyses are often assisted by recent technologies making the sequencing of complex genomes both technically possible and feasible. We recently sequenced the tomato genome that, like many eukaryotic genomes, is large and complex. Current sequencing technologies allow the developmen...

  9. Visualization of genome signatures of eukaryote genomes by batch-learning self-organizing map with a special emphasis on Drosophila genomes.

    Science.gov (United States)

    Abe, Takashi; Hamano, Yuta; Ikemura, Toshimichi

    2014-01-01

    A strategy of evolutionary studies that can compare vast numbers of genome sequences is becoming increasingly important with the remarkable progress of high-throughput DNA sequencing methods. We previously established a sequence alignment-free clustering method "BLSOM" for di-, tri-, and tetranucleotide compositions in genome sequences, which can characterize sequence characteristics (genome signatures) of a wide range of species. In the present study, we generated BLSOMs for tetra- and pentanucleotide compositions in approximately one million sequence fragments derived from 101 eukaryotes, for which almost complete genome sequences were available. BLSOM recognized phylotype-specific characteristics (e.g., key combinations of oligonucleotide frequencies) in the genome sequences, permitting phylotype-specific clustering of the sequences without any information regarding the species. In our detailed examination of 12 Drosophila species, the correlation between their phylogenetic classification and the classification on the BLSOMs was observed to visualize oligonucleotides diagnostic for species-specific clustering.
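
    The oligonucleotide compositions that serve as input to an alignment-free method of this kind can be sketched as a k-mer frequency vector. This is a minimal illustration only: the function name and the handling of ambiguous bases are assumptions, and the self-organizing map step of BLSOM itself is not implemented.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(sequence, k=4):
    """Return relative frequencies of all 4**k k-mers over A, C, G, T.

    Windows containing any other character (for example N) are ignored.
    """
    seq = sequence.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    # Count every overlapping window of length k.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[m] for m in kmers)
    if total == 0:
        return {m: 0.0 for m in kmers}
    return {m: counts[m] / total for m in kmers}
```

    With k=4 each sequence fragment is represented by a 256-dimensional vector; vectors of this form from many fragments would then be passed to a clustering method.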

  10. Optical Whole-Genome Restriction Mapping as a Tool for Rapidly Distinguishing and Identifying Bacterial Contaminants in Clinical Samples

    Science.gov (United States)

    2015-08-01

    Dates covered: Oct 2011 – Aug 2012. ...multiple bacteria could be uniquely identified within mixtures. In the first set of experiments, three unique organisms (Bacillus subtilis subsp. globigii...be useful in monitoring nosocomial outbreaks in neonatal and intensive care wards, or even as an initial screen for antibiotic resistant strains

  11. Genome-wide mapping of transcription start sites yields novel insights into the primary transcriptome of Pseudomonas putida

    DEFF Research Database (Denmark)

    D'Arrigo, Isotta; Bojanovic, Klara; Yang, Xiaochen

    2016-01-01

    ...was examined using an in vivo assay with GFP-fusion vectors and shown to function via a translational repression mechanism. Furthermore, 56 novel intergenic small RNAs and 8 putative actuaton transcripts were detected, as well as 8 novel open reading frames (ORFs). This study illustrates how global mapping of TSSs can yield novel insights into the transcriptional features and RNA output of bacterial genomes....

  12. An integrated clinical and genomic information system for cancer precision medicine.

    Science.gov (United States)

    Jang, Yeongjun; Choi, Taekjin; Kim, Jongho; Park, Jisub; Seo, Jihae; Kim, Sangok; Kwon, Yeajee; Lee, Seungjae; Lee, Sanghyuk

    2018-04-20

    Increasing affordability of next-generation sequencing (NGS) has created an opportunity for realizing genomically-informed personalized cancer therapy as a path to precision oncology. However, the complex nature of genomic information presents a huge challenge for clinicians in interpreting the patient's genomic alterations and selecting the optimum approved or investigational therapy. An elaborate and practical information system is urgently needed to support clinical decision as well as to test clinical hypotheses quickly. Here, we present an integrated clinical and genomic information system (CGIS) based on NGS data analyses. Major components include modules for handling clinical data, NGS data processing, variant annotation and prioritization, drug-target-pathway analysis, and population cohort explorer. We built a comprehensive knowledgebase of genes, variants, drugs by collecting annotated information from public and in-house resources. Structured reports for molecular pathology are generated using standardized terminology in order to help clinicians interpret genomic variants and utilize them for targeted cancer therapy. We also implemented many features useful for testing hypotheses to develop prognostic markers from mutation and gene expression data. Our CGIS software is an attempt to provide useful information for both clinicians and scientists who want to explore genomic information for precision oncology.

  13. Integrated genomics of ovarian xenograft tumor progression and chemotherapy response

    International Nuclear Information System (INIS)

    Stuckey, Ashley; Brodsky, Alexander S; Fischer, Andrew; Miller, Daniel H; Hillenmeyer, Sara; Kim, Kyu K; Ritz, Anna; Singh, Rakesh K; Raphael, Benjamin J; Brard, Laurent

    2011-01-01

    Ovarian cancer is the most deadly gynecological cancer with a very poor prognosis. Xenograft mouse models have proven to be one very useful tool in testing candidate therapeutic agents and gene function in vivo. In this study we identify genes and gene networks important for the efficacy of a pre-clinical anti-tumor therapeutic, MT19c. In order to understand how ovarian xenograft tumors may be growing and responding to anti-tumor therapeutics, we used genome-wide mRNA expression and DNA copy number measurements to identify key genes and pathways that may be critical for SKOV-3 xenograft tumor progression. We compared SKOV-3 xenografts treated with the ergocalciferol derived, MT19c, to untreated tumors collected at multiple time points. Cell viability assays were used to test the function of the PPARγ agonist, Rosiglitazone, on SKOV-3 cell growth. These data indicate that a number of known survival and growth pathways including Notch signaling and general apoptosis factors are differentially expressed in treated vs. untreated xenografts. As tumors grow, cell cycle and DNA replication genes show increased expression, consistent with faster growth. The steroid nuclear receptor, PPARγ, was significantly up-regulated in MT19c treated xenografts. Surprisingly, stimulation of PPARγ with Rosiglitazone reduced the efficacy of MT19c and cisplatin suggesting that PPARγ is regulating a survival pathway in SKOV-3 cells. To identify which genes may be important for tumor growth and treatment response, we observed that MT19c down-regulates some high copy number genes and stimulates expression of some low copy number genes suggesting that these genes are particularly important for SKOV-3 xenograft growth and survival. We have characterized the time dependent responses of ovarian xenograft tumors to the vitamin D analog, MT19c. Our results suggest that PPARγ promotes survival for some ovarian tumor cells. We propose that a combination of regulated expression and copy number

  14. Integration of expression data in genome-scale metabolic network reconstructions

    Directory of Open Access Journals (Sweden)

    Anna S. Blazier

    2012-08-01

    Full Text Available With the advent of high-throughput technologies, the field of systems biology has amassed an abundance of omics data, quantifying thousands of cellular components across a variety of scales, ranging from mRNA transcript levels to metabolite quantities. Methods are needed to not only integrate this omics data but to also use this data to heighten the predictive capabilities of computational models. Several recent studies have successfully demonstrated how flux balance analysis (FBA), a constraint-based modeling approach, can be used to integrate transcriptomic data into genome-scale metabolic network reconstructions to generate predictive computational models. In this review, we summarize such FBA-based methods for integrating expression data into genome-scale metabolic network reconstructions, highlighting their advantages as well as their limitations.

  15. Accuracy assessment of topographic mapping using UAV image integrated with satellite images

    International Nuclear Information System (INIS)

    Azmi, S M; Ahmad, Baharin; Ahmad, Anuar

    2014-01-01

    Unmanned Aerial Vehicles (UAVs) are extensively applied in various fields such as military applications, archaeology, agriculture and scientific research. This study focuses on topographic mapping and map updating. A UAV is one of the alternative ways to ease the process of acquiring data, with low manufacturing and operating costs, and it is easy to operate. Furthermore, UAV images will be integrated with QuickBird images that are used as base maps. The objective of this study is to assess and compare the accuracy of topographic mapping using UAV images integrated with aerial photographs and satellite images. The main purpose of using UAV images is as a replacement for cloud-covered areas, which normally exist in aerial photographs and satellite images, and for updating topographic maps. Meanwhile, spatial resolution, pixel size, scale, geometric accuracy and correction, image quality and information content are important requirements for the generation of topographic maps using these kinds of data. In this study, ground control points (GCPs) and check points (CPs) were established using the real time kinematic Global Positioning System (RTK-GPS) technique. Two types of analysis are carried out in this study: quantitative and qualitative assessments. Quantitative assessment is carried out by calculating root mean square error (RMSE). The outputs of this study include a topographic map and an orthophoto. From this study, the accuracy of the UAV image is ± 0.460 m. In conclusion, UAV images have the potential to be used for updating topographic maps
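
    The quantitative assessment described here rests on the root mean square error between coordinates measured on the image product and reference coordinates at the check points. A minimal sketch of a horizontal RMSE follows; the function name and the sample coordinates are hypothetical and are not taken from the study, which does not state whether its figure is horizontal, vertical, or combined.

```python
import math


def horizontal_rmse(measured, reference):
    """Root mean square of planimetric (x, y) offsets between paired points, in input units."""
    if len(measured) != len(reference) or not measured:
        raise ValueError("point lists must be non-empty and of equal length")
    # Squared horizontal distance between each measured point and its reference point.
    squared = [
        (mx - rx) ** 2 + (my - ry) ** 2
        for (mx, my), (rx, ry) in zip(measured, reference)
    ]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical check points in metres: image-derived versus RTK-GPS reference.
image_points = [(100.30, 200.40), (150.00, 250.00)]
gps_points = [(100.00, 200.00), (150.00, 250.00)]
error_m = horizontal_rmse(image_points, gps_points)
```

    Check points used for this calculation should be independent of the ground control points used to georeference the imagery; otherwise the error is likely to be underestimated.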

  16. Identifying candidate driver genes by integrative ovarian cancer genomics data

    Science.gov (United States)

    Lu, Xinguo; Lu, Jibo

    2017-08-01

    Integrative analysis of molecular mechanics underlying cancer can distinguish interactions that cannot be revealed based on one kind of data for the appropriate diagnosis and treatment of cancer patients. Tumor samples exhibit heterogeneity in omics data such as somatic mutations, copy number variations (CNVs) and gene expression profiles. In this paper we combined gene co-expression modules and mutation modulators separately in tumor patients to obtain candidate driver genes for resistant and sensitive tumors from the heterogeneous data. The final list of identified modulators includes genes well known in biological processes associated with ovarian cancer, such as CCL17, CACTIN, CCL16, CCL22, APOB, KDF1, CCL11, HNF1B, LRG1 and MED1, which can help to facilitate the discovery of biomarkers, molecular diagnostics, and drug discovery.

  17. Genome-wide maps of alkylation damage, repair, and mutagenesis in yeast reveal mechanisms of mutational heterogeneity.

    Science.gov (United States)

    Mao, Peng; Brown, Alexander J; Malc, Ewa P; Mieczkowski, Piotr A; Smerdon, Michael J; Roberts, Steven A; Wyrick, John J

    2017-10-01

    DNA base damage is an important contributor to genome instability, but how the formation and repair of these lesions is affected by the genomic landscape and contributes to mutagenesis is unknown. Here, we describe genome-wide maps of DNA base damage, repair, and mutagenesis at single nucleotide resolution in yeast treated with the alkylating agent methyl methanesulfonate (MMS). Analysis of these maps revealed that base excision repair (BER) of alkylation damage is significantly modulated by chromatin, with faster repair in nucleosome-depleted regions, and slower repair and higher mutation density within strongly positioned nucleosomes. Both the translational and rotational settings of lesions within nucleosomes significantly influence BER efficiency; moreover, this effect is asymmetric relative to the nucleosome dyad axis and is regulated by histone modifications. Our data also indicate that MMS-induced mutations at adenine nucleotides are significantly enriched on the nontranscribed strand (NTS) of yeast genes, particularly in BER-deficient strains, due to higher damage formation on the NTS and transcription-coupled repair of the transcribed strand (TS). These findings reveal the influence of chromatin on repair and mutagenesis of base lesions on a genome-wide scale and suggest a novel mechanism for transcription-associated mutation asymmetry, which is frequently observed in human cancers. © 2017 Mao et al.; Published by Cold Spring Harbor Laboratory Press.

  18. Annotating novel genes by integrating synthetic lethals and genomic information

    Directory of Open Access Journals (Sweden)

    Faty Mahamadou

    2008-01-01

    Background: Large-scale screening for synthetic lethality serves as a common tool in yeast genetics to systematically search for genes that play a role in specific biological processes. Often the amount of data resulting from a single large-scale screen far exceeds the capacity for experimental characterization of every identified target. Thus, there is a need for computational tools that select promising candidate genes in order to reduce the number of follow-up experiments to a manageable size. Results: We analyze synthetic lethality data for arp1 and jnm1, two spindle migration genes, in order to identify novel members in this process. To this end, we use an unsupervised statistical method that integrates additional information from biological data sources, such as gene expression, phenotypic profiling, RNA degradation and sequence similarity. Unlike existing methods that require large amounts of synthetic lethal data, our method relies only on synthetic lethality information from two single screens. Using a Multivariate Gaussian Mixture Model, we determine the best subset of features that assign the target genes to two groups. The approach identifies a small group of genes as candidates involved in spindle migration. Experimental testing confirms the majority of our candidates, and we present she1 (YBL031W) as a novel gene involved in spindle migration. We also applied the statistical methodology to TOR2 signaling as another example. Conclusion: We demonstrate the general use of Multivariate Gaussian Mixture Modeling for selecting candidate genes for experimental characterization from synthetic lethality data sets. For the given example, integration of different data sources contributes to the identification of genetic interaction partners of arp1 and jnm1 that play a role in the same biological process.
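
    The idea of assigning target genes to two groups with a Gaussian mixture can be illustrated with a heavily simplified sketch. The code below fits a univariate two-component mixture by expectation-maximization; it is not the authors' method, which is multivariate and includes feature-subset selection. The scores, the initialisation at the minimum and maximum, and all function names are assumptions made for illustration.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_two_component_gmm(xs, n_iter=100, min_var=1e-6):
    """Fit a 1-D two-component Gaussian mixture with plain EM."""
    n = len(xs)
    grand_mean = sum(xs) / n
    start_var = max(sum((x - grand_mean) ** 2 for x in xs) / n, min_var)
    means = [min(xs), max(xs)]      # assumed initialisation
    variances = [start_var, start_var]
    weights = [0.5, 0.5]
    resp = [[0.5, 0.5] for _ in xs]
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each point.
        for i, x in enumerate(xs):
            p = [weights[k] * normal_pdf(x, means[k], variances[k]) for k in range(2)]
            total = p[0] + p[1]
            resp[i] = [p[0] / total, p[1] / total] if total > 0.0 else [0.5, 0.5]
        # M-step: re-estimate weights, means and variances.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk == 0.0:
                continue  # leave an empty component unchanged
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var_k = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, xs)) / nk
            variances[k] = max(var_k, min_var)
    labels = [0 if r[0] >= r[1] else 1 for r in resp]
    return weights, means, variances, labels


# Hypothetical per-gene scores (assumed values, not from the study).
scores = [0.1, 0.2, 0.15, 0.05, 0.9, 1.0, 0.95, 1.1]
weights, means, variances, labels = fit_two_component_gmm(scores)
print(means, labels)
```

    In this toy input the two groups are well separated, so each gene is assigned almost entirely to one component; with real, overlapping features the posterior probabilities rather than the hard labels would carry the useful information.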

  19. SNP identification from RNA sequencing and linkage map construction of rubber tree for anchoring the draft genome.

    Science.gov (United States)

    Shearman, Jeremy R; Sangsrakru, Duangjai; Jomchai, Nukoon; Ruang-Areerate, Panthita; Sonthirod, Chutima; Naktang, Chaiwat; Theerawattanasuk, Kanikar; Tragoonrung, Somvong; Tangphatsornruang, Sithichoke

    2015-01-01

    Hevea brasiliensis, or rubber tree, is an important crop species that accounts for the majority of natural latex production. The rubber tree nuclear genome consists of 18 chromosomes and is roughly 2.15 Gb. The current rubber tree reference genome assembly consists of 1,150,326 scaffolds ranging from 200 to 531,465 bp and totalling 1.1 Gb. Only 143 scaffolds, totalling 7.6 Mb, have been placed into linkage groups. We have performed RNA-seq on 6 varieties of rubber tree to identify SNPs and InDels and used this information to perform target sequence enrichment and high throughput sequencing to genotype a set of SNPs in 149 rubber tree offspring from a cross between RRIM 600 and RRII 105 rubber tree varieties. We used this information to generate a linkage map allowing for the anchoring of 24,424 contigs from 3,009 scaffolds, totalling 115 Mb or 10.4% of the published sequence, into 18 linkage groups. Each linkage group contains between 319 and 1367 SNPs, or 60 to 194 non-redundant marker positions, and ranges from 156 to 336 cM in length. This linkage map includes 20,143 of the 69,300 predicted genes from rubber tree and will be useful for mapping studies and improving the reference genome assembly.

  20. Integrating ILI data with publicly available mapping solutions

    Energy Technology Data Exchange (ETDEWEB)

    Coleman, Grant A.; Miller, Scott J. [BJ Pipeline Inspection Services, Calgary, AB (Canada)], email: gcoleman@bjservices.ca, email: smiller@bjservices.ca

    2010-07-01

    As the technology has improved, in-line inspection (ILI) tools now incorporate accurate inertial mapping units (IMUs), which provide spatial coordinates for the features within a pipeline. Geographic information systems (GIS) can be used to display the data; however, some companies do not have such systems, and novices cannot navigate easily within the application. The aim of this paper is to present the use of universally available packages to display GIS-style data. Case studies are presented in which Google Earth imagery is used to display IMU data. The study showed that universally available packages such as Google Earth do not match the accuracy and functionality of a GIS, but they can be a convenient and easy-to-use way to display data in a portable format. Organizations that do not have a GIS could thus benefit from the use of those packages.
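
    One plausible way to get feature coordinates into Google Earth is to write them as KML placemarks. The record does not say which file format the authors used, so the choice of KML, the feature names, and the coordinates below are all assumptions made for illustration. The sketch uses only the Python standard library.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def features_to_kml(features):
    """Serialise point features as a KML document string.

    Each feature is a dict with 'name', 'lon', 'lat' and 'alt' keys
    (a hypothetical structure chosen for this sketch).
    """
    ET.register_namespace("", KML_NS)
    kml = ET.Element(f"{{{KML_NS}}}kml")
    document = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    for feature in features:
        placemark = ET.SubElement(document, f"{{{KML_NS}}}Placemark")
        ET.SubElement(placemark, f"{{{KML_NS}}}name").text = feature["name"]
        point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
        # KML orders coordinates as longitude,latitude,altitude.
        coords = f"{feature['lon']},{feature['lat']},{feature['alt']}"
        ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = coords
    return ET.tostring(kml, encoding="unicode")


# Hypothetical pipeline features (assumed values, not from the paper).
pipeline_features = [
    {"name": "Girth weld 12", "lon": -114.07, "lat": 51.05, "alt": 1045.0},
    {"name": "Valve 3", "lon": -114.08, "lat": 51.06, "alt": 1047.5},
]

kml_text = features_to_kml(pipeline_features)
print(kml_text)
```

    Saving `kml_text` to a file with a `.kml` extension gives a portable file that Google Earth can open; the longitude-first ordering is the detail most often gotten wrong when converting from latitude/longitude tables.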