WorldWideScience

Sample records for apospory-specific genomic region

  1. Identification of ovule transcripts from the Apospory-Specific Genomic Region (ASGR)-carrier chromosome

    Directory of Open Access Journals (Sweden)

    Ozias-Akins Peggy

    2011-04-01

    Full Text Available Abstract Background Apomixis, asexual seed production in plants, holds great potential for agriculture as a means to fix hybrid vigor. Apospory is a form of apomixis where the embryo develops from an unreduced egg that is derived from a somatic nucellar cell, the aposporous initial, via mitosis. Understanding the molecular mechanism regulating aposporous initial specification will be a critical step toward elucidation of apomixis and also provide insight into developmental regulation and downstream signaling that results in apomixis. To discover candidate transcripts for regulating aposporous initial specification in P. squamulatum, we compared two transcriptomes derived from microdissected ovules at the stage of aposporous initial formation between the apomictic donor parent, P. squamulatum (accession PS26), and an apomictic derived backcross 8 (BC8) line containing only the Apospory-Specific Genomic Region (ASGR)-carrier chromosome from P. squamulatum. Toward this end, two transcriptomes derived from ovules of an apomictic donor parent and its apomictic backcross derivative at the stage of apospory initiation were sequenced using 454-FLX technology. Results Using 454-FLX technology, we generated 332,567 reads with an average read length of 147 base pairs (bp) for the PS26 ovule transcriptome library and 363,637 reads with an average read length of 142 bp for the BC8 ovule transcriptome library. A total of 33,977 contigs from the PS26 ovule transcriptome library and 26,576 contigs from the BC8 ovule transcriptome library were assembled using the Multifunctional Inertial Reference Assembly program. Using stringent in silico parameters, 61 transcripts were predicted to map to the ASGR-carrier chromosome, of which 49 transcripts were verified as ASGR-carrier chromosome specific. One of the alien expressed genes could be assigned as tightly linked to the ASGR by screening of apomictic and sexual F1s.
Only one transcript, which did not map to the ASGR

  2. Construction of BAC libraries from two apomictic grasses to study the microcolinearity of their apospory-specific genomic regions.

    Science.gov (United States)

    Roche, D.; Conner, A.; Budiman, A.; Frisch, D.; Wing, R.; Hanna, W.; Ozias-Akins, P.

    2002-04-01

    We have constructed bacterial artificial chromosome (BAC) libraries from two grass species that reproduce by apospory, a form of gametophytic apomixis. The library of an apomictic polyhaploid genotype (line MS228-20, with a 2C genome size of approximately 4,500 Mbp) derived from a cross between the obligate apomict, Pennisetum squamulatum, and pearl millet (P. glaucum) comprises 118,272 clones with an average insert size of 82 kb. The library of buffelgrass (Cenchrus ciliaris, apomictic line B-12-9, with a 2C genome size of approximately 3,000 Mbp) contains 68,736 clones with an average insert size of 109 kb. Based on the genome sizes of these two lines and correcting for the number of false-positive and organellar clones, library coverages were found to be 3.7 and 4.8 haploid genome equivalents for MS228-20 and B-12-9, respectively. Both libraries were screened by hybridization with six SCARs (sequence-characterized amplified regions), whose tight linkage in a single apospory-specific genomic region had been previously demonstrated in both species. Analysis of these BAC clones indicated that some of the SCAR markers are actually amplifying duplicated regions linked in coupling in both genomes and that restriction enzyme mapping will be necessary to sort out the duplications.
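
    The coverage figures in this abstract can be roughly checked with the usual library-coverage arithmetic (clones × average insert size ÷ haploid genome size). The sketch below is an illustration only: it assumes the haploid genome size is half the stated 2C value, and it does not apply the authors' correction for false-positive and organellar clones, whose size is not given here. The uncorrected values it produces are therefore somewhat higher than the reported 3.7 and 4.8. The function names are hypothetical.

```python
def raw_coverage(n_clones, insert_kb, genome_2c_mbp):
    """Uncorrected library coverage in haploid genome equivalents.

    Assumption (not stated in the abstract): haploid size = 2C size / 2.
    """
    haploid_mbp = genome_2c_mbp / 2
    cloned_mbp = n_clones * insert_kb / 1000  # kb -> Mbp
    return cloned_mbp / haploid_mbp


def prob_locus_present(n_clones, insert_kb, genome_2c_mbp):
    """Standard Clarke-Carbon estimate: chance that a given single-copy
    locus is in at least one clone, P = 1 - (1 - insert/genome)^N."""
    fraction = insert_kb / (genome_2c_mbp / 2 * 1000)
    return 1 - (1 - fraction) ** n_clones


if __name__ == "__main__":
    # Values taken from the abstract above.
    for name, n, insert, g2c in [
        ("MS228-20", 118_272, 82, 4_500),
        ("B-12-9", 68_736, 109, 3_000),
    ]:
        cov = raw_coverage(n, insert, g2c)
        p = prob_locus_present(n, insert, g2c)
        print(f"{name}: raw coverage {cov:.2f}x, P(locus present) {p:.3f}")
```

    The gap between the raw and reported coverage would then reflect the fraction of clones the authors discounted.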

  3. Short Communication: An apospory-specific genomic region is conserved between Buffelgrass (Cenchrus ciliaris L.) and Pennisetum squamulatum Fresen.

    Science.gov (United States)

    Roche; Cong; Chen; Hanna; Gustine; Sherwood; Ozias-Akins

    1999-07-01

    Twelve molecular markers linked to pseudogamous apospory, a form of gametophytic apomixis, were previously isolated from Pennisetum squamulatum Fresen. No recombination between these markers was found in a segregating population of 397 individuals (Ozias-Akins et al. 1998, Proc. Natl Acad. Sci. USA, 95, 5127-5132). The objective of the present study was to test if these markers were also linked to the aposporous mode of reproduction in two small segregating populations of Cenchrus ciliaris (= Pennisetum ciliare (L.) Link), another apomictic grass species. Among 12 markers (sequence characterized amplified regions, SCARs), six were scored as dominant markers between aposporous and sexual C. ciliaris genotypes (presence/absence, respectively). Five were always linked to apospory and one showed a low level of recombination in 84 progenies. Restriction fragment length polymorphisms (RFLPs) were observed between sexual and apomictic phenotypes for three of the six remaining SCARs from P. squamulatum when used as probes. No recombination was observed in the F1 progenies. Preliminary data from megabase DNA analysis and sequencing in both species indicate that an apospory-specific genomic region (ASGR) is highly conserved between the two species. Although C. ciliaris has a smaller genome size than P. squamulatum, a higher copy number for markers linked to apospory found in the former may impair the progress of positional cloning of gene(s) for apomixis in this species.

  4. A Segment of the Apospory-Specific Genomic Region Is Highly Microsyntenic Not Only between the Apomicts Pennisetum squamulatum and Buffelgrass, But Also with a Rice Chromosome 11 Centromeric-Proximal Genomic Region

    Science.gov (United States)

    Gualtieri, Gustavo; Conner, Joann A.; Morishige, Daryl T.; Moore, L. David; Mullet, John E.; Ozias-Akins, Peggy

    2006-01-01

    Bacterial artificial chromosome (BAC) clones from apomicts Pennisetum squamulatum and buffelgrass (Cenchrus ciliaris), isolated with the apospory-specific genomic region (ASGR) marker ugt197, were assembled into contigs that were extended by chromosome walking. Gene-like sequences from contigs were identified by shotgun sequencing and BLAST searches, and used to isolate orthologous rice contigs. Additional gene-like sequences in the apomicts' contigs were identified by bioinformatics using fully sequenced BACs from orthologous rice contigs as templates, as well as by interspecies, whole-contig cross-hybridizations. Hierarchical contig orthology was rapidly assessed by constructing detailed long-range contig molecular maps showing the distribution of gene-like sequences and markers, and searching for microsyntenic patterns of sequence identity and spatial distribution within and across species contigs. We found microsynteny between P. squamulatum and buffelgrass contigs. Importantly, this approach also enabled us to isolate from within the rice (Oryza sativa) genome contig Rice A, which shows the highest microsynteny and is most orthologous to the ugt197-containing C1C buffelgrass contig. Contig Rice A belongs to the rice genome database contig 77 (according to the current September 12, 2003, rice fingerprint contig build) that maps proximal to the chromosome 11 centromere, a feature that interestingly correlates with the mapping of ASGR-linked BACs proximal to the centromere or centromere-like sequences. Thus, relatedness between these two orthologous contigs is supported both by their molecular microstructure and by their centromeric-proximal location. Our discoveries promote the use of a microsynteny-based positional-cloning approach using the rice genome as a template to aid in constructing the ASGR toward the isolation of genes underlying apospory. PMID:16415213

  5. A segment of the apospory-specific genomic region is highly microsyntenic not only between the apomicts Pennisetum squamulatum and buffelgrass, but also with a rice chromosome 11 centromeric-proximal genomic region.

    Science.gov (United States)

    Gualtieri, Gustavo; Conner, Joann A; Morishige, Daryl T; Moore, L David; Mullet, John E; Ozias-Akins, Peggy

    2006-03-01

    Bacterial artificial chromosome (BAC) clones from apomicts Pennisetum squamulatum and buffelgrass (Cenchrus ciliaris), isolated with the apospory-specific genomic region (ASGR) marker ugt197, were assembled into contigs that were extended by chromosome walking. Gene-like sequences from contigs were identified by shotgun sequencing and BLAST searches, and used to isolate orthologous rice contigs. Additional gene-like sequences in the apomicts' contigs were identified by bioinformatics using fully sequenced BACs from orthologous rice contigs as templates, as well as by interspecies, whole-contig cross-hybridizations. Hierarchical contig orthology was rapidly assessed by constructing detailed long-range contig molecular maps showing the distribution of gene-like sequences and markers, and searching for microsyntenic patterns of sequence identity and spatial distribution within and across species contigs. We found microsynteny between P. squamulatum and buffelgrass contigs. Importantly, this approach also enabled us to isolate from within the rice (Oryza sativa) genome contig Rice A, which shows the highest microsynteny and is most orthologous to the ugt197-containing C1C buffelgrass contig. Contig Rice A belongs to the rice genome database contig 77 (according to the current September 12, 2003, rice fingerprint contig build) that maps proximal to the chromosome 11 centromere, a feature that interestingly correlates with the mapping of ASGR-linked BACs proximal to the centromere or centromere-like sequences. Thus, relatedness between these two orthologous contigs is supported both by their molecular microstructure and by their centromeric-proximal location. Our discoveries promote the use of a microsynteny-based positional-cloning approach using the rice genome as a template to aid in constructing the ASGR toward the isolation of genes underlying apospory.

  6. High-resolution physical mapping reveals that the apospory-specific genomic region (ASGR) in Cenchrus ciliaris is located on a heterochromatic and hemizygous region of a single chromosome.

    Science.gov (United States)

    Akiyama, Yukio; Hanna, Wayne W; Ozias-Akins, Peggy

    2005-10-01

    An apomictic mode of reproduction known as apospory is displayed by most buffelgrass (Cenchrus ciliaris) genotypes, but rare sexual individuals have been identified. Previously, intraspecific crosses between sexual and aposporous genotypes allowed linkage to be discovered between the aposporous mode of reproduction and nine molecular markers that had been isolated from an aposporous relative, Pennisetum squamulatum. This region was described as the apospory-specific genomic region (ASGR). We now show an ideogram of the chromosome complement for aposporous tetraploid buffelgrass accession B-12-9 including the ASGR-carrier chromosome. The ASGR-carrier chromosome has a region of hemizygosity, as determined by in situ hybridization of BAC clones and unique morphological characteristics when compared with other chromosomes in the genome. In spite of its unique morphology, the ASGR-carrier chromosome could be identified as one of the chromosomes of a meiosis I quadrivalent. A similar partially hemizygous segment was also detected in the ASGR-carrier chromosome of the aposporous buffelgrass genotype, Higgins, but not in the sexual accession B-2S. Two non-recombining BACs linked to apospory were physically mapped on a highly condensed chromatin region of the short arm of B-12-9, and the distance between the BACs was estimated to be approximately 11 Mbp, a distance similar to what previously has been shown in P. squamulatum. The short arm of the ASGR-carrier chromosome was highly condensed at pachytene and extended only 1.7-2.7 fold that of mitotic chromosomes. Low recombination in the ASGR may partially be due to its localization in heterochromatin.

  7. GRAbB: Selective Assembly of Genomic Regions, a New Niche for Genomic Research

    NARCIS (Netherlands)

    Brankovics, Balázs; Zhang, Hao; van Diepeningen, Anne D; van der Lee, Theo A J; Waalwijk, Cees; de Hoog, G Sybren

    GRAbB (Genomic Region Assembly by Baiting) is a new program that is dedicated to assemble specific genomic regions from NGS data. This approach is especially useful when dealing with multi copy regions, such as mitochondrial genome and the rDNA repeat region, parts of the genome that are often

  8. Annotating non-coding regions of the genome.

    Science.gov (United States)

    Alexander, Roger P; Fang, Gang; Rozowsky, Joel; Snyder, Michael; Gerstein, Mark B

    2010-08-01

    Most of the human genome consists of non-protein-coding DNA. Recently, progress has been made in annotating these non-coding regions through the interpretation of functional genomics experiments and comparative sequence analysis. One can conceptualize functional genomics analysis as involving a sequence of steps: turning the output of an experiment into a 'signal' at each base pair of the genome; smoothing this signal and segmenting it into small blocks of initial annotation; and then clustering these small blocks into larger derived annotations and networks. Finally, one can relate functional genomics annotations to conserved units and measures of conservation derived from comparative sequence analysis.
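
    The first steps of the pipeline described in this abstract (a per-base signal, smoothing, then segmentation into small blocks) can be illustrated with a toy sketch. It assumes a moving-average smoother and a fixed threshold, which are simple stand-ins chosen for illustration and not the methods reviewed by the authors. The function names and parameters are hypothetical, and the later clustering step is not shown.

```python
def smooth(signal, window=3):
    """Moving average over a centred window, shrinking at the ends."""
    half = window // 2
    out = []
    for i in range(len(signal)):
        lo, hi = max(0, i - half), min(len(signal), i + half + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out


def segment(signal, threshold):
    """Return half-open (start, end) blocks where signal >= threshold."""
    blocks, start = [], None
    for i, value in enumerate(signal):
        if value >= threshold and start is None:
            start = i  # a block opens here
        elif value < threshold and start is not None:
            blocks.append((start, i))  # block closes before position i
            start = None
    if start is not None:  # block runs to the end of the signal
        blocks.append((start, len(signal)))
    return blocks


if __name__ == "__main__":
    per_base_signal = [0, 0, 5, 6, 5, 0, 0, 4, 4]  # made-up example values
    print(segment(smooth(per_base_signal), threshold=3))
```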

  9. The transcriptionally active regions in the genome of Bacillus subtilis

    DEFF Research Database (Denmark)

    Rasmussen, Simon; Nielsen, Henrik Bjørn; Jarmer, Hanne Østergaard

    2009-01-01

    The majority of all genes have so far been identified and annotated systematically through in silico gene finding. Here we report the finding of 3662 strand-specific transcriptionally active regions (TARs) in the genome of Bacillus subtilis by the use of tiling arrays. We have measured the genome...

  10. Sequencing the CHO DXB11 genome reveals regional variations in genomic stability and haploidy

    DEFF Research Database (Denmark)

    Kaas, Christian Schrøder; Kristensen, Claus; Betenbaugh, Michael J.

    2015-01-01

    in eight additional analyzed CHO genomes (15-20% haploidy) but not in the genome of the Chinese hamster. The dhfr gene is confirmed to be haploid in CHO DXB11; transcriptionally active and the remaining allele contains a G410C point mutation causing a Thr137Arg missense mutation. We find approximately 2.5 million single nucleotide polymorphisms (SNPs), 44 gene deletions in the CHO DXB11 genome and 9357 SNPs, which interfere with the coding regions of 3458 genes. Copy number variations for nine CHO genomes were mapped to the chromosomes of the Chinese hamster showing unique signatures for each chromosome...

  11. Regional regulation of transcription in the chicken genome

    Directory of Open Access Journals (Sweden)

    Megens Hendrik-Jan

    2010-01-01

    Full Text Available Abstract Background Over the past years, the relationship between gene transcription and chromosomal location has been studied in a number of different vertebrate genomes. Regional differences in gene expression have been found in several different species. The chicken genome, as the closest sequenced genome relative to mammals, is an important resource for investigating regional effects on transcription in birds and studying the regional dynamics of chromosome evolution by comparative analysis. Results We used gene expression data to survey eight chicken tissues and create transcriptome maps for all chicken chromosomes. The results reveal the presence of two distinct types of chromosomal regions characterized by clusters of highly or lowly expressed genes. Furthermore, these regions correlate highly with a number of genome characteristics. Regions with clusters of highly expressed genes have higher gene densities, shorter genes, shorter average intron and higher GC content compared to regions with clusters of lowly expressed genes. A comparative analysis between the chicken and human transcriptome maps constructed using similar panels of tissues suggests that the regions with clusters of highly expressed genes are relatively conserved between the two genomes. Conclusions Our results revealed the presence of a higher order organization of the chicken genome that affects gene expression, confirming similar observations in other species. These results will aid in the further understanding of the regional dynamics of chromosome evolution. The microarray data used in this analysis have been submitted to NCBI GEO database under accession number GSE17108. The reviewer access link is: http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?token=tjwjpscyceqawjk&acc=GSE17108

  12. Attenuation of monkeypox virus by deletion of genomic regions

    Science.gov (United States)

    Lopera, Juan G.; Falendysz, Elizabeth A.; Rocke, Tonie E.; Osorio, Jorge E.

    2015-01-01

    Monkeypox virus (MPXV) is an emerging pathogen from Africa that causes disease similar to smallpox. Two clades with different geographic distributions and virulence have been described. Here, we utilized bioinformatic tools to identify genomic regions in MPXV containing multiple virulence genes and explored their roles in pathogenicity; two selected regions were then deleted singly or in combination. In vitro and in vivo studies indicated that these regions play a significant role in MPXV replication, tissue spread, and mortality in mice. Interestingly, while deletion of either region led to decreased virulence in mice, one region had no effect on in vitro replication. Deletion of both regions simultaneously also reduced cell culture replication and significantly increased the attenuation in vivo over either single deletion. Attenuated MPXV with genomic deletions present a safe and efficacious tool in the study of MPX pathogenesis and in the identification of genetic factors associated with virulence.

  13. Linkage disequilibrium of evolutionarily conserved regions in the human genome

    Directory of Open Access Journals (Sweden)

    Johnson Todd A

    2006-12-01

    Full Text Available Abstract Background The strong linkage disequilibrium (LD) recently found in genic or exonic regions of the human genome demonstrated that LD can be increased by evolutionary mechanisms that select for functionally important loci. This suggests that LD might be stronger in regions conserved among species than in non-conserved regions, since regions exposed to natural selection tend to be conserved. To assess this hypothesis, we used genome-wide polymorphism data from the HapMap project and investigated LD within DNA sequences conserved between the human and mouse genomes. Results Unexpectedly, we observed that LD was significantly weaker in conserved regions than in non-conserved regions. To investigate why, we examined sequence features that may distort the relationship between LD and conserved regions. We found that interspersed repeats, and not other sequence features, were associated with the weak LD tendency in conserved regions. To appropriately understand the relationship between LD and conserved regions, we removed the effect of repetitive elements and found that the high degree of sequence conservation was strongly associated with strong LD in coding regions but not with that in non-coding regions. Conclusion Our work demonstrates that the degree of sequence conservation does not simply increase LD as predicted by the hypothesis. Rather, it implies that purifying selection changes the polymorphic patterns of coding sequences but has little influence on the patterns of functional units such as regulatory elements present in non-coding regions, since the former are generally restricted by the constraint of maintaining a functional protein product across multiple exons while the latter may exist more as individually isolated units.

  14. Genomic Regions Affecting Cheese Making Properties Identified in Danish Holsteins

    DEFF Research Database (Denmark)

    Gregersen, Vivi Raundahl; Bertelsen, Henriette Pasgaard; Poulsen, Nina Aagaard

    The cheese renneting process is affected by a number of factors associated with milk composition, and a number of Danish Holsteins have previously been identified to have poor milk coagulation ability. Therefore, the aim of this study was to identify genomic regions affecting the technological proper...

  15. Mapping of the genomic regions controlling seed storability in ...

    Indian Academy of Sciences (India)

    Seed storability is especially important in the tropics due to high temperature and relative humidity of storage environment that cause rapid deterioration of seeds in storage. The objective of this study was to use SSR markers to identify genomic regions associated with quantitative trait loci (QTLs) controlling seed storability ...

  16. Origin of the duplicated regions in the yeast genomes

    DEFF Research Database (Denmark)

    Piskur, Jure

    2001-01-01

    The genome of Saccharomyces cerevisiae contains several duplicated regions. The recent sequencing results of several yeast species suggest that the duplicated regions found in the modern Saccharomyces species are probably the result of a single gross duplication, as well as a series of sporadic independent short-segment duplications. The gross duplication might coincide with the origin of the ability to grow under anaerobic conditions.

  17. Evolutionary history of the ABCB2 genomic region in teleosts

    Science.gov (United States)

    Palti, Y.; Rodriguez, M.F.; Gahr, S.A.; Hansen, J.D.

    2007-01-01

    Gene duplication, silencing and translocation have all been implicated in shaping the unique genomic architecture of the teleost MH regions. Previously, we demonstrated that trout possess five unlinked regions encoding MH genes. One of these regions harbors ABCB2 which in all other vertebrate classes is found in the MHC class II region. In this study, we sequenced a BAC contig for the trout ABCB2 region. Analysis of this region revealed the presence of genes homologous to those located in the human class II (ABCB2, BRD2, ψDAA), extended class II (RGL2, PHF1, SYGP1) and class III (PBX2, Notch-L) regions. The organization and syntenic relationships of this region were then compared to similar regions in humans, Tetraodon and zebrafish to learn more about the evolutionary history of this region. Our analysis indicates that this region was generated during the teleost-specific duplication event while also providing insight about potential MH paralogous regions in teleosts. © 2006 Elsevier Ltd. All rights reserved.

  18. GRAbB: Selective Assembly of Genomic Regions, a New Niche for Genomic Research.

    Directory of Open Access Journals (Sweden)

    Balázs Brankovics

    2016-06-01

    Full Text Available GRAbB (Genomic Region Assembly by Baiting) is a new program that is dedicated to assemble specific genomic regions from NGS data. This approach is especially useful when dealing with multi copy regions, such as mitochondrial genome and the rDNA repeat region, parts of the genome that are often neglected or poorly assembled, although they contain interesting information from phylogenetic or epidemiologic perspectives, but also single copy regions can be assembled. The program is capable of targeting multiple regions within a single run. Furthermore, GRAbB can be used to extract specific loci from NGS data, based on homology, like sequences that are used for barcoding. To make the assembly specific, a known part of the region, such as the sequence of a PCR amplicon or a homologous sequence from a related species must be specified. By assembling only the region of interest, the assembly process is computationally much less demanding and may lead to assemblies of better quality. In this study the different applications and functionalities of the program are demonstrated such as: exhaustive assembly (rDNA region and mitochondrial genome), extracting homologous regions or genes (IGS, RPB1, RPB2 and TEF1a), as well as extracting multiple regions within a single run. The program is also compared with MITObim, which is meant for the exhaustive assembly of a single target based on a similar query sequence. GRAbB is shown to be more efficient than MITObim in terms of speed, memory and disk usage. The other functionalities (handling multiple targets simultaneously and extracting homologous regions) of the new program are not matched by other programs. The program is available with explanatory documentation at https://github.com/b-brankovics/grabb. GRAbB has been tested on Ubuntu (12.04 and 14.04), Fedora (23), CentOS (7.1.1503) and Mac OS X (10.7). Furthermore, GRAbB is available as a docker repository: brankovics/grabb (https://hub.docker.com/r/brankovics/grabb/).
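
    The general idea of baiting described in this abstract (start from a known sequence, pull out reads that match it, then let the selected reads recruit further reads) can be sketched in a few lines. This is a toy illustration under stated assumptions, not GRAbB's actual implementation or interface: reads are matched by exact shared k-mers, reverse complements are ignored, and no assembly is performed. The function names and parameters are hypothetical.

```python
def kmers(seq, k):
    """Set of all substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def bait_reads(bait, reads, k=4, max_rounds=10):
    """Iteratively select indices of reads that share a k-mer with the
    bait or with any read selected in an earlier round."""
    bait_kmers = kmers(bait, k)
    selected = set()
    for _ in range(max_rounds):
        newly = {
            i for i, read in enumerate(reads)
            if i not in selected and kmers(read, k) & bait_kmers
        }
        if not newly:  # nothing more recruited: stop
            break
        selected |= newly
        for i in newly:  # selected reads extend the bait
            bait_kmers |= kmers(reads[i], k)
    return sorted(selected)


if __name__ == "__main__":
    bait = "ACGTACGGA"  # made-up seed sequence
    reads = ["CGGATTCAG", "TTCAGGCCT", "GGGGGGGGG"]  # made-up reads
    print(bait_reads(bait, reads))
```

    In this example the second read shares no k-mer with the seed itself and is only recruited through the first read, which is the behaviour that lets a short seed grow into a larger region.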

  19. Forces shaping the fastest evolving regions in the human genome.

    Directory of Open Access Journals (Sweden)

    Katherine S Pollard

    2006-10-01

    Full Text Available Comparative genomics allow us to search the human genome for segments that were extensively changed in the last approximately 5 million years since divergence from our common ancestor with chimpanzee, but are highly conserved in other species and thus are likely to be functional. We found 202 genomic elements that are highly conserved in vertebrates but show evidence of significantly accelerated substitution rates in human. These are mostly in non-coding DNA, often near genes associated with transcription and DNA binding. Resequencing confirmed that the five most accelerated elements are dramatically changed in human but not in other primates, with seven times more substitutions in human than in chimp. The accelerated elements, and in particular the top five, show a strong bias for adenine and thymine to guanine and cytosine nucleotide changes and are disproportionately located in high recombination and high guanine and cytosine content environments near telomeres, suggesting either biased gene conversion or isochore selection. In addition, there is some evidence of directional selection in the regions containing the two most accelerated regions. A combination of evolutionary forces has contributed to accelerated evolution of the fastest evolving elements in the human genome.

  20. Selective Constraint on Noncoding Regions of Hominid Genomes.

    Directory of Open Access Journals (Sweden)

    2005-12-01

    Full Text Available An important challenge for human evolutionary biology is to understand the genetic basis of human-chimpanzee differences. One influential idea holds that such differences depend, to a large extent, on adaptive changes in gene expression. An important step in assessing this hypothesis involves gaining a better understanding of selective constraint on noncoding regions of hominid genomes. In noncoding sequence, functional elements are frequently small and can be separated by large nonfunctional regions. For this reason, constraint in hominid genomes is likely to be patchy. Here we use conservation in more distantly related mammals and amniotes as a way of identifying small sequence windows that are likely to be functional. We find that putatively functional noncoding elements defined in this manner are subject to significant selective constraint in hominids.

  1. Genome-Wide Analysis in Brazilians Reveals Highly Differentiated Native American Genome Regions

    Science.gov (United States)

    Havt, Alexandre; Nayak, Uma; Pinkerton, Relana; Farber, Emily; Concannon, Patrick; Lima, Aldo A.; Guerrant, Richard L.

    2017-01-01

    Despite its population, geographic size, and emerging economic importance, disproportionately little genome-scale research exists into genetic factors that predispose Brazilians to disease, or the population genetics of risk. After identification of suitable proxy populations and careful analysis of tri-continental admixture in 1,538 North-Eastern Brazilians to estimate individual ancestry and ancestral allele frequencies, we computed 400,000 genome-wide locus-specific branch length (LSBL) Fst statistics of Brazilian Amerindian ancestry compared to European and African; and a similar set of differentiation statistics for their Amerindian component compared with the closest Asian 1000 Genomes population (surprisingly, Bengalis in Bangladesh). After ranking SNPs by these statistics, we identified the top 10 highly differentiated SNPs in five genome regions in the LSBL tests of Brazilian Amerindian ancestry compared to European and African; and the top 10 SNPs in eight regions comparing their Amerindian component to the closest Asian 1000 Genomes population. We found SNPs within or proximal to the genes CIITA (rs6498115), SMC6 (rs1834619), and KLHL29 (rs2288697) were most differentiated in the Amerindian-specific branch, while SNPs in the genes ADAMTS9 (rs7631391), DOCK2 (rs77594147), SLC28A1 (rs28649017), ARHGAP5 (rs7151991), and CIITA (rs45601437) were most highly differentiated in the Asian comparison. These genes are known to influence immune function, metabolic and anthropometry traits, and embryonic development. These analyses have identified candidate genes for selection within Amerindian ancestry, and by comparison of the two analyses, those for which the differentiation may have arisen during the migration from Asia to the Americas. PMID:28100790
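
    The locus-specific branch length (LSBL) statistic mentioned in this abstract is commonly computed from the three pairwise Fst distances among three populations, as LSBL_A = (d_AB + d_AC - d_BC) / 2. The sketch below shows that arithmetic only; it assumes the pairwise Fst values are already available, and the numbers in the example are invented for illustration rather than taken from the study. The function name is hypothetical.

```python
def lsbl(d_ab, d_ac, d_bc):
    """Branch length specific to population A at one locus, given the
    pairwise distances (e.g. Fst) A-B, A-C and B-C."""
    return (d_ab + d_ac - d_bc) / 2


if __name__ == "__main__":
    # Invented per-SNP Fst values: A = Amerindian, B = European, C = African.
    snps = {
        "snp_1": (0.30, 0.40, 0.10),
        "snp_2": (0.05, 0.12, 0.11),
    }
    # Rank SNPs by how long the A-specific branch is.
    ranked = sorted(snps, key=lambda s: lsbl(*snps[s]), reverse=True)
    for name in ranked:
        print(name, round(lsbl(*snps[name]), 3))
```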

  2. Domestication footprints anchor genomic regions of agronomic importance in soybeans.

    Science.gov (United States)

    Han, Yingpeng; Zhao, Xue; Liu, Dongyuan; Li, Yinghui; Lightfoot, David A; Yang, Zhijiang; Zhao, Lin; Zhou, Gang; Wang, Zhikun; Huang, Long; Zhang, Zhiwu; Qiu, Lijuan; Zheng, Hongkun; Li, Wenbin

    2016-01-01

    Present-day soybeans consist of elite cultivars and landraces (Glycine max, fully domesticated (FD)), annual wild type (Glycine soja, nondomesticated (ND)), and semi-wild type (semi-domesticated (SD)). FD soybean originated in China, although the details of its domestication history remain obscure. More than 500 diverse soybean accessions were sequenced using specific-locus amplified fragment sequencing (SLAF-seq) to address fundamental questions regarding soybean domestication. In total, 64,141 single nucleotide polymorphisms (SNPs) with minor allele frequencies (MAFs) > 0.05 were found among the 512 tested accessions. The results indicated that the SD group is not a hybrid between the FD and ND groups. The initial domestication region was pinpointed to central China (demarcated by the Great Wall to the north and the Qinling Mountains to the south). A total of 800 highly differentiated genetic regions and > 140 selective sweeps were identified, and these were three- and twofold more likely, respectively, to encompass a known quantitative trait locus (QTL) than the rest of the soybean genome. Forty-three potential quantitative trait nucleotides (QTNs; including 15 distinct traits) were identified by genome-wide association mapping. The results of the present study should be beneficial for soybean improvement and provide insight into the genetic architecture of traits of agronomic importance. © 2015 The Authors. New Phytologist © 2015 New Phytologist Trust.

  3. Variants in Several Genomic Regions Associated with Asperger Disorder

    Science.gov (United States)

    Salyakina, D.; Ma, D.Q.; Jaworski, J.M.; Konidari, I.; Whitehead, P.L.; Henson, R.; Martinez, D.; Robinson, J.L.; Sacharow, S.; Wright, H.H.; Abramson, R.K.; Gilbert, J.R.; Cuccaro, M.L.; Pericak-Vance, M.A.

    2010-01-01

    Asperger disorder (ASP) is one of the autism spectrum disorders (ASD) and is differentiated from autism largely on the absence of clinically significant cognitive and language delays. Analysis of a homogenous subset of families with ASP may help to address the corresponding effect of genetic heterogeneity on identifying ASD genetic risk factors. To examine the hypothesis that common variation is important in ASD, we performed a genome-wide association study (GWAS) in 124 ASP families in a discovery data set and 110 ASP families in a validation data set. We prioritized the top 100 association results from both cohorts by employing a ranking strategy. Novel regions on 5q21.1 (P = 9.7 × 10^-7) and 15q22.1–q22.2 (P = 7.3 × 10^-6) were our most significant findings in the combined data set. Three chromosomal regions showing association, 3p14.2 (P = 3.6 × 10^-6), 3q25–26 (P = 6.0 × 10^-5) and 3p23 (P = 3.3 × 10^-4) overlapped linkage regions reported in Finnish ASP families, and eight association regions overlapped ASD linkage areas. Our findings suggest that ASP shares both ASD-related genetic risk factors, as well as has genetic risk factors unique to the ASP phenotype. PMID:21182207

  4. Comparative Genomic Analyses of the Human NPHP1 Locus Reveal Complex Genomic Architecture and Its Regional Evolution in Primates.

    Directory of Open Access Journals (Sweden)

    Bo Yuan

    2015-12-01

    Many loci in the human genome harbor complex genomic structures that can result in susceptibility to genomic rearrangements leading to various genomic disorders. Nephronophthisis 1 (NPHP1, MIM# 256100) is an autosomal recessive disorder that can be caused by defects of NPHP1; the gene maps within the human 2q13 region where low copy repeats (LCRs) are abundant. Loss of function of NPHP1 is responsible for approximately 85% of the NPHP1 cases; about 80% of such individuals carry a large recurrent homozygous NPHP1 deletion that occurs via nonallelic homologous recombination (NAHR) between two flanking directly oriented ~45 kb LCRs. Published data revealed a non-pathogenic inversion polymorphism involving the NPHP1 gene flanked by two inverted ~358 kb LCRs. Using optical mapping and array-comparative genomic hybridization, we identified three potential novel structural variant (SV) haplotypes at the NPHP1 locus that may protect a haploid genome from the NPHP1 deletion. Inter-species comparative genomic analyses among primate genomes revealed massive genomic changes during evolution. The aggregated data suggest that dynamic genomic rearrangements occurred historically within the NPHP1 locus and generated SV haplotypes observed in the human population today, which may confer differential susceptibility to genomic instability and the NPHP1 deletion within a personal genome. Our study documents diverse SV haplotypes at a complex LCR-laden human genomic region. Comparative analyses provide a model for how this complex region arose during primate evolution, and studies among humans suggest that intra-species polymorphism may potentially modulate an individual's susceptibility to acquiring disease-associated alleles.

  5. Genomic distance entrained clustering and regression modelling highlights interacting genomic regions contributing to proliferation in breast cancer.

    Science.gov (United States)

    Dexter, Tim J; Sims, David; Mitsopoulos, Costas; Mackay, Alan; Grigoriadis, Anita; Ahmad, Amar S; Zvelebil, Marketa

    2010-09-08

    Genomic copy number changes and regional alterations in epigenetic states have been linked to grade in breast cancer. However, the relative contribution of specific alterations to the pathology of different breast cancer subtypes remains unclear. The heterogeneity and interplay of genomic and epigenetic variations means that large datasets and statistical data mining methods are required to uncover recurrent patterns that are likely to be important in cancer progression. We employed ridge regression to model the relationship between regional changes in gene expression and proliferation. Regional features were extracted from tumour gene expression data using a novel clustering method, called genomic distance entrained agglomerative (GDEC) clustering. Using gene expression data in this way provides a simple means of integrating the phenotypic effects of both copy number aberrations and alterations in chromatin state. We show that regional metagenes derived from GDEC clustering are representative of recurrent regions of epigenetic regulation or copy number aberrations in breast cancer. Furthermore, detected patterns of genomic alterations are conserved across independent oestrogen receptor positive breast cancer datasets. Sequential competitive metagene selection was used to reveal the relative importance of genomic regions in predicting proliferation rate. The predictive model suggested additive interactions between the most informative regions such as 8p22-12 and 8q13-22. Data-mining of large-scale microarray gene expression datasets can reveal regional clusters of co-ordinate gene expression, independent of cause. By correlating these clusters with tumour proliferation we have identified a number of genomic regions that act together to promote proliferation in ER+ breast cancer. Identification of such regions should enable prioritisation of genomic regions for combinatorial functional studies to pinpoint the key genes and interactions contributing to

  6. Genomic distance entrained clustering and regression modelling highlights interacting genomic regions contributing to proliferation in breast cancer

    Directory of Open Access Journals (Sweden)

    Dexter Tim J

    2010-09-01

    Background: Genomic copy number changes and regional alterations in epigenetic states have been linked to grade in breast cancer. However, the relative contribution of specific alterations to the pathology of different breast cancer subtypes remains unclear. The heterogeneity and interplay of genomic and epigenetic variations means that large datasets and statistical data mining methods are required to uncover recurrent patterns that are likely to be important in cancer progression. Results: We employed ridge regression to model the relationship between regional changes in gene expression and proliferation. Regional features were extracted from tumour gene expression data using a novel clustering method, called genomic distance entrained agglomerative (GDEC) clustering. Using gene expression data in this way provides a simple means of integrating the phenotypic effects of both copy number aberrations and alterations in chromatin state. We show that regional metagenes derived from GDEC clustering are representative of recurrent regions of epigenetic regulation or copy number aberrations in breast cancer. Furthermore, detected patterns of genomic alterations are conserved across independent oestrogen receptor positive breast cancer datasets. Sequential competitive metagene selection was used to reveal the relative importance of genomic regions in predicting proliferation rate. The predictive model suggested additive interactions between the most informative regions such as 8p22-12 and 8q13-22. Conclusions: Data-mining of large-scale microarray gene expression datasets can reveal regional clusters of co-ordinate gene expression, independent of cause. By correlating these clusters with tumour proliferation we have identified a number of genomic regions that act together to promote proliferation in ER+ breast cancer. Identification of such regions should enable prioritisation of genomic regions for combinatorial functional studies to pinpoint

  7. Comparative Genomics of Methanopyrus sp. SNP6 and KOL6 Revealing Genomic Regions of Plasticity Implicated in Extremely Thermophilic Profiles

    Directory of Open Access Journals (Sweden)

    Zhiliang Yu

    2017-07-01

    Methanopyrus spp. are usually isolated from harsh niches, such as those with high osmotic pressure and extreme temperature. However, the molecular mechanisms of their environmental adaptation are poorly understood. Archaeal species are commonly considered primitive organisms. The evolutionary placement of archaea is a fundamental and intriguing scientific question. We sequenced the genomes of Methanopyrus strains SNP6 and KOL6 isolated from the Atlantic and Iceland, respectively. Comparative genomic analysis revealed genetic diversity and instability implicated in niche adaptation, including a number of transporter- and integrase/transposase-related genes. Pan-genome analysis also defined the gene pool of Methanopyrus spp., in addition to a ~120-kb genomic region of plasticity impacting cognate genomic architecture. We believe that Methanopyrus genomics could facilitate efficient investigation/recognition of archaeal phylogenetically diverse patterns, as well as improve understanding of the biological roles and significance of these versatile microbes.

  8. Pan-genome sequence analysis using Panseq: an online tool for the rapid analysis of core and accessory genomic regions

    Directory of Open Access Journals (Sweden)

    Villegas Andre

    2010-09-01

    Background: The pan-genome of a bacterial species consists of a core and an accessory gene pool. The accessory genome is thought to be an important source of genetic variability in bacterial populations and is gained through lateral gene transfer, allowing subpopulations of bacteria to better adapt to specific niches. Low-cost and high-throughput sequencing platforms have created an exponential increase in genome sequence data and an opportunity to study the pan-genomes of many bacterial species. In this study, we describe a new online pan-genome sequence analysis program, Panseq. Results: Panseq was used to identify Escherichia coli O157:H7 and E. coli K-12 genomic islands. Within a population of 60 E. coli O157:H7 strains, the existence of 65 accessory genomic regions identified by Panseq analysis was confirmed by PCR. The accessory genome and binary presence/absence data, and core genome and single nucleotide polymorphisms (SNPs) of six L. monocytogenes strains were extracted with Panseq and hierarchically clustered and visualized. The nucleotide core and binary accessory data were also used to construct maximum parsimony (MP) trees, which were compared to the MP tree generated by multi-locus sequence typing (MLST). The topology of the accessory and core trees was identical but differed from the tree produced using seven MLST loci. The Loci Selector module found the most variable and discriminatory combinations of four loci within a 100 loci set among 10 strains in 1 s, compared to the 449 s required to exhaustively search for all possible combinations; it also found the most discriminatory 20 loci from a 96 loci E. coli O157:H7 SNP dataset. Conclusion: Panseq determines the core and accessory regions among a collection of genomic sequences based on user-defined parameters. It readily extracts regions unique to a genome or group of genomes, identifies SNPs within shared core genomic regions, constructs files for use in phylogeny programs
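
    The size of the exhaustive search that the Loci Selector is compared against in the record above can be gauged with a short calculation. This is only an illustrative sketch: it assumes the exhaustive search enumerates every unordered set of four loci drawn from the 100-locus set, which the abstract implies but does not spell out, and the variable names are hypothetical.

```python
import math

# Figures quoted in the abstract; the interpretation (unordered 4-locus
# subsets of a 100-locus set) is an assumption made for illustration.
n_loci = 100
subset_size = 4
exhaustive_seconds = 449

# Number of distinct 4-locus combinations an exhaustive search would visit.
n_combinations = math.comb(n_loci, subset_size)

# Average time per combination implied by the reported 449 s, in microseconds.
microseconds_per_combination = exhaustive_seconds / n_combinations * 1e6

print(n_combinations)
print(round(microseconds_per_combination, 1))
```

    Under that assumption the search space is a few million combinations, which is why a heuristic selector finishing in about 1 s is a meaningful saving.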

  9. Identification of a large genomic region in UV-irradiated human cells which has fewer cyclobutane pyrimidine dimers than most genomic regions

    Energy Technology Data Exchange (ETDEWEB)

    Kantor, G.J.; Deiss-Tolbert, D.M. [Wright State Univ., Dayton, OH (United States)]

    1995-08-01

    Size separation after UV-endonuclease digestion of DNA from UV-irradiated human cells using denaturing conditions fractionates the genome based on cyclobutane pyrimidine dimer content. We have examined the largest molecules available (50-80 kb; about 5% of the DNA) after fractionation and those of average size (5-15 kb) for content of some specific genes. We find that the largest molecules are not a representative sampling of the genome. Three contiguous genes located in a G+C-rich isochore (tyrosine hydroxylase, insulin, insulin-like growth factor II) have concentrations two to three times greater in the largest molecules. This shows that this genomic region has fewer pyrimidine dimers than most other genomic regions. In contrast, the β-actin genomic region, which has a similar G+C content, has an equal concentration in both fractions, as do the p53 and β-globin genomic regions, which are A+T-rich. These data show that DNA damage in the form of cyclobutane pyrimidine dimers occurs with different probabilities in specific isochores. Part of the reason may be the relative G+C content, but other factors must play a significant role. We also report that the transcriptionally inactive insulin region is repaired at the genome-overall rate in normal cells and is not repaired in xeroderma pigmentosum complementation group C cells. (author).

  10. Genome-wide expression profiling of complex regional pain syndrome.

    Directory of Open Access Journals (Sweden)

    Eun-Heui Jin

    Complex regional pain syndrome (CRPS) is a chronic, progressive, and devastating pain syndrome characterized by spontaneous pain, hyperalgesia, allodynia, altered skin temperature, and motor dysfunction. Although previous gene expression profiling studies have been conducted in animal pain models, genome-wide expression profiling in the whole blood of CRPS patients has not yet been reported. Here, we successfully identified certain pain-related genes through genome-wide expression profiling in the blood from CRPS patients. We found that 80 genes were differentially expressed between 4 CRPS patients (2 CRPS I and 2 CRPS II) and 5 controls (cut-off value: 1.5-fold change and p<0.05). Most of those genes were associated with signal transduction, developmental processes, cell structure and motility, and immunity and defense. The expression levels of major histocompatibility complex class I A subtype (HLA-A29.1), matrix metalloproteinase 9 (MMP9), alanine aminopeptidase N (ANPEP), l-histidine decarboxylase (HDC), granulocyte colony-stimulating factor 3 receptor (G-CSF3R), and signal transducer and activator of transcription 3 (STAT3) genes selected from the microarray were confirmed in 24 CRPS patients and 18 controls by quantitative reverse transcription-polymerase chain reaction (qRT-PCR). We focused on the MMP9 gene that, by qRT-PCR, showed a statistically significant difference in expression in CRPS patients compared to controls with the highest relative fold change (4.0±1.23 times and p = 1.4×10^-4). The up-regulation of the MMP9 gene in the blood may be related to the pain progression in CRPS patients. Our findings, which offer a valuable contribution to the understanding of the differential gene expression in CRPS, may help in the understanding of the pathophysiology of CRPS pain progression.

  11. New genomic resources for switchgrass: a BAC library and comparative analysis of homoeologous genomic regions harboring bioenergy traits

    Directory of Open Access Journals (Sweden)

    Feltus Frank A

    2011-07-01

    Background: Switchgrass, a C4 species and a warm-season grass native to the prairies of North America, has been targeted for development into an herbaceous biomass fuel crop. Genetic improvement of switchgrass feedstock traits through marker-assisted breeding and biotechnology approaches calls for the development of genomic tools. Establishment of integrated physical and genetic maps for switchgrass will accelerate the mapping of value-added traits useful to breeding programs and the isolation of important target genes using map-based cloning. The reported polyploidy series in switchgrass ranges from diploid (2X = 18) to duodecaploid (12X = 108). As in other large, repeat-rich plant genomes, this genomic complexity will hinder whole genome sequencing efforts. An extensive physical map providing enough information to resolve the homoeologous genomes would provide the necessary framework for accurate assembly of the switchgrass genome. Results: A switchgrass BAC library constructed by partial digestion of nuclear DNA with EcoRI contains 147,456 clones covering the effective genome approximately 10 times based on a genome size of 3.2 Gigabases (~1.6 Gb effective). Restriction digestion and PFGE analysis of 234 randomly chosen BACs indicated that 95% of the clones contained inserts, ranging from 60 to 180 kb with an average of 120 kb. Comparative sequence analysis of two homoeologous genomic regions harboring orthologs of the rice OsBRI1 locus, a low-copy gene encoding a putative protein kinase and associated with biomass, revealed that orthologous clones from homoeologous chromosomes can be unambiguously distinguished from each other and correctly assembled to respective fingerprint contigs. Thus, the data obtained not only provide genomic resources for further analysis of the switchgrass genome, but also improve efforts for an accurate genome sequencing strategy. Conclusions: The construction of the first switchgrass BAC library and comparative analysis of
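
    The "approximately 10 times" coverage quoted in the record above can be checked from the other figures in the same abstract. The sketch below is one plausible way to do the arithmetic, not necessarily the authors' calculation: it assumes coverage is the total cloned DNA divided by the effective genome size, and that only the 95% of clones found to contain inserts contribute. The function and variable names are hypothetical.

```python
# Figures quoted in the abstract.
n_clones = 147_456           # clones in the BAC library
insert_fraction = 0.95       # share of sampled clones that contained inserts
mean_insert_kb = 120         # average insert size in kb
effective_genome_gb = 1.6    # effective genome size in Gb


def library_coverage(n_clones, insert_fraction, mean_insert_kb, genome_gb):
    """Return fold coverage: total cloned bases / genome size in bases.

    Assumption for illustration: empty clones contribute nothing and every
    insert-bearing clone carries the mean insert size.
    """
    cloned_bp = n_clones * insert_fraction * mean_insert_kb * 1_000
    return cloned_bp / (genome_gb * 1e9)


coverage = library_coverage(
    n_clones, insert_fraction, mean_insert_kb, effective_genome_gb
)
print(f"{coverage:.1f}x")
```

    Under these assumptions the estimate comes out slightly above ten genome equivalents, consistent with the rounded figure in the abstract.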

  12. Comprehensive repertoire of foldable regions within whole genomes.

    Directory of Open Access Journals (Sweden)

    Guilhem Faure

    2013-10-01

    In order to get a comprehensive repertoire of foldable domains within whole proteomes, including orphan domains, we developed a novel procedure, called SEG-HCA. From only the information of a single amino acid sequence, SEG-HCA automatically delineates segments possessing high densities in hydrophobic clusters, as defined by Hydrophobic Cluster Analysis (HCA). These hydrophobic clusters mainly correspond to regular secondary structures, which together form structured or foldable regions. Genome-wide analyses revealed that SEG-HCA is the opposite of disorder predictors, the two addressing distinct structural states. Interestingly, there is nevertheless an overlap between the two predictions, including small segments of disordered sequences, which undergo coupled folding and binding. SEG-HCA thus gives access to these specific domains, which are generally poorly represented in domain databases. Comparison of the whole set of SEG-HCA predictions with the Conserved Domain Database (CDD) also highlighted a wide proportion of predicted large (length >50 amino acids) segments, which are CDD orphans. These orphan sequences may either correspond to highly divergent members of already known families or belong to new families of domains. Their comprehensive description thus opens new avenues to investigate new functional and/or structural features, which have so far remained unexplored. Altogether, the data described here provide new insights into protein architecture and organization throughout the three kingdoms of life.

  13. Harnessing genomics to improve health in the Eastern Mediterranean Region – an executive course in genomics policy

    Directory of Open Access Journals (Sweden)

    Singer Peter A

    2005-01-01

    Background: While innovations in medicine, science and technology have resulted in improved health and quality of life for many people, the benefits of modern medicine continue to elude millions of people in many parts of the world. To assess the potential of genomics to address health needs in the Eastern Mediterranean Region (EMR), the World Health Organization's Eastern Mediterranean Regional Office and the University of Toronto Joint Centre for Bioethics jointly organized a Genomics and Public Health Policy Executive Course, held September 20th–23rd, 2003, in Muscat, Oman. The 4-day course was sponsored by WHO-EMRO with additional support from the Canadian Program in Genomics and Global Health. The overall objective of the course was to collectively explore how to best harness genomics to improve health in the region. This article presents the course findings and recommendations for genomics policy in EMR. Methods: The course brought together senior representatives from academia, biotechnology companies, regulatory bodies, media, voluntary, and legal organizations to engage in discussion. Topics covered included scientific advances in genomics, followed by innovations in business models, public sector perspectives, ethics, legal issues and national innovation systems. Results: A set of recommendations, summarized below, was formulated for the Regional Office, the Member States and for individuals. • Advocacy for genomics and biotechnology for political leadership; • Networking between member states to share information, expertise, training, and regional cooperation in biotechnology; coordination of national surveys for assessment of health biotechnology innovation systems, science capacity, government policies, legislation and regulations, intellectual property policies, private sector activity; • Creation in each member country of an effective National Body on genomics, biotechnology and health to: - formulate national biotechnology strategies - raise

  14. Forces shaping the fastest evolving regions in the human genome

    DEFF Research Database (Denmark)

    Pollard, Katherine S; Salama, Sofie R; King, Bryan

    2006-01-01

    Comparative genomics allow us to search the human genome for segments that were extensively changed in the last approximately 5 million years since divergence from our common ancestor with chimpanzee, but are highly conserved in other species and thus are likely to be functional. We found 202...... genomic elements that are highly conserved in vertebrates but show evidence of significantly accelerated substitution rates in human. These are mostly in non-coding DNA, often near genes associated with transcription and DNA binding. Resequencing confirmed that the five most accelerated elements...

  15. Identification of low-confidence regions in the pig reference genome (Sscrofa10.2)

    Directory of Open Access Journals (Sweden)

    Amanda Warr

    2015-11-01

    Many applications of high throughput sequencing rely on the availability of an accurate reference genome. Variant calling often produces large data sets that cannot be realistically validated and which may contain large numbers of false-positives. Errors in the reference assembly increase the number of false-positives. While resources are available to aid in the filtering of variants from human data, for other species these do not yet exist and strict filtering techniques must be employed which are more likely to exclude true-positives. This work assesses the accuracy of the pig reference genome (Sscrofa10.2) using whole genome sequencing reads from the Duroc sow whose genome the assembly was based on. Indicators of structural variation including high regional coverage, unexpected insert sizes, improper pairing and homozygous variants were used to identify low quality (LQ) regions of the assembly. Low coverage (LC) regions were also identified and analyzed separately. The LQ regions covered 13.85% of the genome, the LC regions covered 26.6% of the genome and combined (LQLC) they covered 33.07% of the genome. Over half of dbSNP variants were located in the LQLC regions. Of CNVRs identified in a previous study, 86.3% were located in the LQLC regions. The regions were also enriched for gene predictions from RNA-seq data with 42.98% falling in the LQLC regions. Excluding variants in the LQ, LC or LQLC regions from future analyses will help reduce the number of false-positive variant calls. Researchers using WGS data should be aware that the current pig reference genome does not give an accurate representation of the copy number of alleles in the original Duroc sow’s genome.
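
    Because the combined LQLC figure in the record above is smaller than the sum of the LQ and LC figures, the two region sets must overlap. The sketch below estimates that overlap by inclusion-exclusion. It is an illustration only, assuming that all three percentages refer to the same genome length and that LQLC is the union of the LQ and LC regions; the abstract does not report the overlap itself, and the variable names are hypothetical.

```python
# Percentages of the genome quoted in the abstract.
lq_pct = 13.85     # low quality (LQ) regions
lc_pct = 26.6      # low coverage (LC) regions
union_pct = 33.07  # combined (LQLC) regions, assumed to be the union

# Inclusion-exclusion: |LQ and LC| = |LQ| + |LC| - |LQ or LC|
overlap_pct = round(lq_pct + lc_pct - union_pct, 2)

# Portions flagged by only one of the two criteria.
lq_only_pct = round(lq_pct - overlap_pct, 2)
lc_only_pct = round(lc_pct - overlap_pct, 2)

print(overlap_pct, lq_only_pct, lc_only_pct)
```

    On these assumptions, roughly half of the LQ sequence also falls in LC regions, which may matter when deciding whether to filter on LQ, LC or the combined set.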

  16. Somatic DNA recombination yielding circular DNA and deletion of a genomic region in embryonic brain

    International Nuclear Information System (INIS)

    Maeda, Toyoki; Chijiiwa, Yoshiharu; Tsuji, Hideo; Sakoda, Saburo; Tani, Kenzaburo; Suzuki, Tomokazu

    2004-01-01

    In this study, a mouse genomic region is identified that undergoes DNA rearrangement and yields circular DNA in brain during embryogenesis. External region-directed inverse polymerase chain reaction on circular DNA extracted from late embryonic brain tissue repeatedly detected DNA of this region containing recombination joints. Wide-range genomic PCR and digestion-circularization PCR analysis showed that this region underwent recombination accompanied by deletion of intervening sequences, including the circularized regions. This region was mapped by fluorescence in situ hybridization to C1 on mouse chromosome 16, where no gene and no physiological DNA rearrangement had been identified. The DNA sequence in the region has segmental homology to an orthologous region on human chromosome 3q13. These observations demonstrated somatic DNA recombination yielding genomic deletions in brain during embryogenesis.

  17. Structured RNAs and synteny regions in the pig genome

    DEFF Research Database (Denmark)

    Anthon, Christian; Tafer, Hakim; Havgaard, Jakob H

    2014-01-01

    BACKGROUND: Annotating mammalian genomes for noncoding RNAs (ncRNAs) is nontrivial since far from all ncRNAs are known and the computational models are resource demanding. Currently, the human genome holds the best mammalian ncRNA annotation, a result of numerous efforts by several groups. However......, a more direct strategy is desired for the increasing number of sequenced mammalian genomes of which some, such as the pig, are relevant as disease models and production animals. RESULTS: We present a comprehensive annotation of structured RNAs in the pig genome. Combining sequence and structure...... lncRNA loci, 11 conflicts of annotation, and 3,183 ncRNA genes. The ncRNA genes comprise 359 miRNAs, 8 ribozymes, 185 rRNAs, 638 snoRNAs, 1,030 snRNAs, 810 tRNAs and 153 ncRNA genes not belonging to the aforementioned classes. When running the pipeline on a local shuffled version of the genome...

  18. New Regions of the Human Genome Linked to Skin Color Variation in Some African Populations

    Science.gov (United States)

    In the first study of its kind, an international team of genomics researchers has identified new regions of the human genome that are associated with skin color variation in some African populations, opening new avenues for research on skin diseases and cancer in all populations.

  19. Comparative analyses of multi-species sequences from targeted genomic regions.

    Science.gov (United States)

    Thomas, J W; Touchman, J W; Blakesley, R W; Bouffard, G G; Beckstrom-Sternberg, S M; Margulies, E H; Blanchette, M; Siepel, A C; Thomas, P J; McDowell, J C; Maskeri, B; Hansen, N F; Schwartz, M S; Weber, R J; Kent, W J; Karolchik, D; Bruen, T C; Bevan, R; Cutler, D J; Schwartz, S; Elnitski, L; Idol, J R; Prasad, A B; Lee-Lin, S-Q; Maduro, V V B; Summers, T J; Portnoy, M E; Dietrich, N L; Akhter, N; Ayele, K; Benjamin, B; Cariaga, K; Brinkley, C P; Brooks, S Y; Granite, S; Guan, X; Gupta, J; Haghighi, P; Ho, S-L; Huang, M C; Karlins, E; Laric, P L; Legaspi, R; Lim, M J; Maduro, Q L; Masiello, C A; Mastrian, S D; McCloskey, J C; Pearson, R; Stantripop, S; Tiongson, E E; Tran, J T; Tsurgeon, C; Vogt, J L; Walker, M A; Wetherby, K D; Wiggins, L S; Young, A C; Zhang, L-H; Osoegawa, K; Zhu, B; Zhao, B; Shu, C L; De Jong, P J; Lawrence, C E; Smit, A F; Chakravarti, A; Haussler, D; Green, P; Miller, W; Green, E D

    2003-08-14

    The systematic comparison of genomic sequences from different organisms represents a central focus of contemporary genome analysis. Comparative analyses of vertebrate sequences can identify coding and conserved non-coding regions, including regulatory elements, and provide insight into the forces that have rendered modern-day genomes. As a complement to whole-genome sequencing efforts, we are sequencing and comparing targeted genomic regions in multiple, evolutionarily diverse vertebrates. Here we report the generation and analysis of over 12 megabases (Mb) of sequence from 12 species, all derived from the genomic region orthologous to a segment of about 1.8 Mb on human chromosome 7 containing ten genes, including the gene mutated in cystic fibrosis. These sequences show conservation reflecting both functional constraints and the neutral mutational events that shaped this genomic region. In particular, we identify substantial numbers of conserved non-coding segments beyond those previously identified experimentally, most of which are not detectable by pair-wise sequence comparisons alone. Analysis of transposable element insertions highlights the variation in genome dynamics among these species and confirms the placement of rodents as a sister group to the primates.

  20. Identifying Human Genome-Wide CNV, LOH and UPD by Targeted Sequencing of Selected Regions.

    Directory of Open Access Journals (Sweden)

    Wei Li

    Copy-number variations (CNV), loss of heterozygosity (LOH), and uniparental disomy (UPD) are large genomic aberrations leading to many common inherited diseases, cancers, and other complex diseases. An integrated tool to identify these aberrations is essential in understanding diseases and in designing clinical interventions. Previous discovery methods based on whole-genome sequencing (WGS) require very high depth of coverage on the whole genome scale, and are not cost-efficient. Another approach, whole exome genome sequencing (WEGS), is limited to discovering variations within exons. Thus, we are lacking efficient methods to detect genomic aberrations on the whole genome scale using next-generation sequencing technology. Here we present a method to identify genome-wide CNV, LOH and UPD for the human genome via selectively sequencing a small portion of the genome termed Selected Target Regions (SeTRs). In our experiments, the SeTRs are covered by 99.73%–99.95% with sufficient depth. Our developed bioinformatics pipeline calls genome-wide CNVs with high confidence, revealing 8 credible events of LOH and 3 UPD events larger than 5M from 15 individual samples. We demonstrate that genome-wide CNV, LOH and UPD can be detected using a cost-effective SeTRs sequencing approach, and that LOH and UPD can be identified using just a sample grouping technique, without using a matched sample or familial information.

  1. Annotation of the Protein Coding Regions of the Equine Genome.

    Directory of Open Access Journals (Sweden)

    Matthew S Hestand

    Current gene annotation of the horse genome is largely derived from in silico predictions and cross-species alignments. Only a small number of genes are annotated based on equine EST and mRNA sequences. To expand the number of equine genes annotated from equine experimental evidence, we sequenced mRNA from a pool of forty-three different tissues. From these, we derived the structures of 68,594 transcripts. In addition, we identified 301,829 positions with SNPs or small indels within these transcripts relative to EquCab2. Interestingly, 780 variants extend the open reading frame of the transcript and appear to be small errors in the equine reference genome, since they are also identified as homozygous variants by genomic DNA resequencing of the reference horse. Taken together, we provide a resource of equine mRNA structures and protein coding variants that will enhance equine and cross-species transcriptional and genomic comparisons.

  2. Structured RNAs and synteny regions in the pig genome

    DEFF Research Database (Denmark)

    Anthon, Christian; Tafer, Hakim; Havgaard, Jakob Hull

    2014-01-01

    for Laurasiatheria (pig, cow, dolphin, horse, cat, dog, hedgehog). CONCLUSIONS: We have obtained one of the most comprehensive annotations for structured ncRNAs of a mammalian genome, which is likely to play central roles in both health modelling and production. The core annotation is available in Ensembl 70...

  3. Targeted enrichment of genomic DNA regions for next generation sequencing

    NARCIS (Netherlands)

    Mertens, F.; El-Sharawy, A.; Sauer, S.; Van Helvoort, J.; Van der Zaag, P.J.; Franke, A.; Nilsson, M.; Lehrach, H.; Brookes, A.

    2011-01-01

    In this review we discuss the latest targeted enrichment methods, and aspects of their utilization along with second generation sequencing for complex genome analysis. In doing so we provide an overview of issues involved in detecting genetic variation, for which targeted enrichment has become a

  4. Genome engineering reveals large dispensable regions in Bacillus subtilis

    NARCIS (Netherlands)

    Westers, Helga; Dorenbos, Ronald; Dijl, Jan Maarten van; Kabel, Jorrit; Flanagan, Tony; Devine, Kevin M.; Jude, Florence; Séror, Simone J.; Beekman, Aäron C.; Darmon, Elise; Eschevins, Caroline; Jong, Anne de; Bron, Sierd; Kuipers, Oscar P.; Albertini, Alessandra M.; Antelmann, Haike; Hecker, Michael; Zamboni, Nicola; Sauer, Uwe; Bruand, Claude; Ehrlich, Dusko S.; Alonso, Juan C.; Salas, Margarita; Quax, Wim J.

    2003-01-01

    Bacterial genomes contain 250 to 500 essential genes, as suggested by single gene disruptions and theoretical considerations. If this view is correct, the remaining nonessential genes of an organism, such as Bacillus subtilis, have been acquired during evolution in its perpetually changing

  5. Mapping of the genomic regions controlling seed storability in soybean

    Indian Academy of Sciences (India)

    The F2:4 seeds harvested in 2011 and 2012 were used to investigate seed storability. The F2 population was genotyped with 148 markers and the genetic map consisted of 128 SSR loci which converged into 38 linkage groups covering 1664.3 cM of soybean genome. Single marker analysis revealed that 13 markers from ...

  6. Annotation of the protein coding regions of the equine genome

    DEFF Research Database (Denmark)

    Hestand, Matthew S.; Kalbfleisch, Theodore S.; Coleman, Stephen J.

    2015-01-01

    Current gene annotation of the horse genome is largely derived from in silico predictions and cross-species alignments. Only a small number of genes are annotated based on equine EST and mRNA sequences. To expand the number of equine genes annotated from equine experimental evidence, we sequenced m...

  7. Differentially Methylated Genomic Regions in Birth-Weight Discordant Twin Pairs

    DEFF Research Database (Denmark)

    Chen, Mubo; Baumbach, Jan; Vandin, Fabio

    2016-01-01

    regions. Whole genome DNA methylation levels were measured in whole blood from 150 pairs of adult identical twins discordant for birth-weight. Intrapair differential DNA methylation was associated with qualitative (large or small) and quantitative (percentage) birth-weight discordance at each genomic site...... twin pairs to find evidence for such “programming” effects, but no significant results emerged. We further investigated this issue using a new computational approach: Instead of probing single genomic sites for significant alterations in epigenetic marks, we scan for differentially methylated genomic...

  8. Matrix attachment regions and structural colinearity in the genomes of two grass species.

    OpenAIRE

    Avramova, Z; Tikhonov, A; Chen, M; Bennetzen, J L

    1998-01-01

    In order to gain insights into the relationship between spatial organization of the genome and genome function we have initiated studies of the co-linear Sh2/A1- homologous regions of rice (30 kb) and sorghum (50 kb). We have identified the locations of matrix attachment regions (MARs) in these homologous chromosome segments, which could serve as anchors for individual structural units or loops. Despite the fact that the nucleotide sequences serving as MARs were not detectably conserved, the ...

  9. CGHScan: finding variable regions using high-density microarray comparative genomic hybridization data

    Directory of Open Access Journals (Sweden)

    Rajashekara Gireesh

    2006-04-01

    Full Text Available Abstract Background Comparative genomic hybridization can rapidly identify chromosomal regions that vary between organisms and tissues. This technique has been applied to detecting differences between normal and cancerous tissues in eukaryotes as well as genomic variability in microbial strains and species. The density of oligonucleotide probes available on current microarray platforms is particularly well-suited for comparisons of organisms with smaller genomes like bacteria and yeast where an entire genome can be assayed on a single microarray with high resolution. Available methods for analyzing these experiments typically confine analyses to data from pre-defined annotated genome features, such as entire genes. Many of these methods are ill suited for datasets with the number of measurements typical of high-density microarrays. Results We present an algorithm for analyzing microarray hybridization data to aid identification of regions that vary between an unsequenced genome and a sequenced reference genome. The program, CGHScan, uses an iterative random walk approach integrating multi-layered significance testing to detect these regions from comparative genomic hybridization data. The algorithm tolerates a high level of noise in measurements of individual probe intensities and is relatively insensitive to the choice of method for normalizing probe intensity values and identifying probes that differ between samples. When applied to comparative genomic hybridization data from a published experiment, CGHScan identified eight of nine known deletions in a Brucella ovis strain as compared to Brucella melitensis. The same result was obtained using two different normalization methods and two different scores to classify data for individual probes as representing conserved or variable genomic regions. The undetected region is a small (58 base pair) deletion that is below the resolution of CGHScan given the array design employed in the study.

  10. A novel statistical method to estimate the effective SNP size in vertebrate genomes and categorized genomic regions

    Directory of Open Access Journals (Sweden)

    Zhao Zhongming

    2006-12-01

    Full Text Available Abstract Background The local environment of single nucleotide polymorphisms (SNPs) contains abundant genetic information for the study of mechanisms of mutation, genome evolution, and causes of diseases. Recent studies revealed that neighboring-nucleotide biases on SNPs were strong and the genome-wide bias patterns could be represented by a small subset of the total SNPs. Estimating the effective SNP size, the number of SNPs that are sufficient to represent the bias patterns observed from the whole SNP data, remains an unsolved problem. Results To estimate the effective SNP size, we developed a novel statistical method, SNPKS, which considers both statistical and biological significance. SNPKS consists of two major steps: obtaining an initial effective size by the Kolmogorov-Smirnov test (KS test) and finding an intermediate effective size by interval evaluation. The SNPKS algorithm was implemented in computer programs and applied to the real SNP data. The effective SNP size was estimated to be 38,200, 39,300, 38,000, and 38,700 in the human, chimpanzee, dog, and mouse genomes, respectively, and 39,100, 39,600, 39,200, and 42,200 in human intergenic, genic, intronic, and CpG island regions, respectively. Conclusion SNPKS is the first statistical method to estimate the effective SNP size. It runs efficiently and greatly outperforms the algorithm implemented in SNPNB. The application of SNPKS to the real SNP data revealed a similarly small effective SNP size (38,000 – 42,200) in the human, chimpanzee, dog, and mouse genomes as well as in human genomic regions. The findings suggest strong influence of genetic factors across vertebrate genomes.
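
    The Kolmogorov-Smirnov step mentioned in this abstract can be illustrated with a minimal sketch of the two-sample KS statistic, the largest gap between two empirical distribution functions. This is not the SNPKS implementation; the function name `ks_statistic` and the idea of comparing a subsample with the full data are assumptions made here for illustration only.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic (hypothetical helper).

    Returns the maximum absolute difference between the empirical
    cumulative distribution functions of the two samples.
    """
    a = sorted(sample_a)
    b = sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    max_gap = 0.0
    # The empirical CDFs only change at observed values, so it is
    # enough to evaluate the gap at every distinct observation.
    for x in set(a) | set(b):
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        max_gap = max(max_gap, abs(cdf_a - cdf_b))
    return max_gap
```

    A small statistic means the subsample's distribution is close to that of the full data; how SNPKS turns this into an effective size is described only in the original paper.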

  11. Regions identity between the genome of vertebrates and non-retroviral families of insect viruses

    Directory of Open Access Journals (Sweden)

    Fan Gaowei

    2011-11-01

    Full Text Available Abstract Background The scope of our understanding of the evolutionary history between viruses and animals is limited. The recent availability of many complete insect virus genomes and vertebrate genomes, together with the ability to screen these sequences, makes it possible to gain new insight into the evolutionary interaction between insect viruses and vertebrates. This study aims to determine whether sequence identity exists between the genomes of insect viruses and vertebrates, to explain this phenomenon in terms of mobile genetic elements, and to investigate the evolutionary relationship between these short regions of identity among these species. Results Some of the studied insect viruses contain variable numbers of short regions of sequence identity to vertebrate genomes, with nucleotide sequence lengths from 28 bp to 124 bp. They are found at multiple sites in the vertebrate genomes. The ontology of animal genes with identical regions involves several processes, including chromatin remodeling, regulation of apoptosis, signaling pathways, nervous system development and some enzyme-like catalysis. Phylogenetic analysis reveals that at least some short regions of sequence identity in vertebrate genomes are derived from the ancestors of insect viruses. Conclusion Short regions of sequence identity were found in vertebrates and insect viruses. These sequences played an important role not only in the long-term evolution of vertebrates, but also in the promotion of insect viruses. This typical win-win strategy may come from natural selection.

  12. Regions identity between the genome of vertebrates and non-retroviral families of insect viruses.

    Science.gov (United States)

    Fan, Gaowei; Li, Jinming

    2011-11-10

    The scope of our understanding of the evolutionary history between viruses and animals is limited. The recent availability of many complete insect virus genomes and vertebrate genomes, together with the ability to screen these sequences, makes it possible to gain new insight into the evolutionary interaction between insect viruses and vertebrates. This study aims to determine whether sequence identity exists between the genomes of insect viruses and vertebrates, to explain this phenomenon in terms of mobile genetic elements, and to investigate the evolutionary relationship between these short regions of identity among these species. Some of the studied insect viruses contain variable numbers of short regions of sequence identity to vertebrate genomes, with nucleotide sequence lengths from 28 bp to 124 bp. They are found at multiple sites in the vertebrate genomes. The ontology of animal genes with identical regions involves several processes, including chromatin remodeling, regulation of apoptosis, signaling pathways, nervous system development and some enzyme-like catalysis. Phylogenetic analysis reveals that at least some short regions of sequence identity in vertebrate genomes are derived from the ancestors of insect viruses. Short regions of sequence identity were found in vertebrates and insect viruses. These sequences played an important role not only in the long-term evolution of vertebrates, but also in the promotion of insect viruses. This typical win-win strategy may come from natural selection.

  13. Does selection against transcriptional interference shape retroelement-free regions in mammalian genomes?

    DEFF Research Database (Denmark)

    Mourier, Tobias; Willerslev, Eske

    2008-01-01

    in generating and maintaining retroelement-free regions in the human genome. METHODOLOGY/PRINCIPAL FINDINGS: Based on the known transcriptional properties of retroelements, we expect long interspersed elements (LINEs) to be able to display a high degree of transcriptional interference. In contrast, we expect...... short interspersed elements (SINEs) to display very low levels of transcriptional interference. We find that genomic regions devoid of long interspersed elements (LINEs) are enriched for protein-coding genes, but that this is not the case for regions devoid of short interspersed elements (SINEs......). This is expected if genes are subject to selection against transcriptional interference. We do not find microRNAs to be associated with genomic regions devoid of either SINEs or LINEs. We further observe an increased relative activity of genes overlapping LINE-free regions during early embryogenesis, where...

  14. LD-Spline: Mapping SNPs on genotyping platforms to genomic regions using patterns of linkage disequilibrium

    Directory of Open Access Journals (Sweden)

    Bush William S

    2009-12-01

    Full Text Available Abstract Background Gene-centric analysis tools for genome-wide association study data are being developed both to annotate single locus statistics and to prioritize or group single nucleotide polymorphisms (SNPs) prior to analysis. These approaches require knowledge about the relationships between SNPs on a genotyping platform and genes in the human genome. SNPs in the genome can represent broader genomic regions via linkage disequilibrium (LD), and population-specific patterns of LD can be exploited to generate a data-driven map of SNPs to genes. Methods In this study, we implemented LD-Spline, a database routine that defines the genomic boundaries a particular SNP represents using linkage disequilibrium statistics from the International HapMap Project. We compared the LD-Spline haplotype block partitioning approach to that of the four-gamete rule and the Gabriel et al. approach using simulated data; in addition, we processed two commonly used genome-wide association study platforms. Results We illustrate that LD-Spline performs comparably to the four-gamete rule and the Gabriel et al. approach; however, as a SNP-centric approach LD-Spline has the added benefit of systematically identifying a genomic boundary for each SNP, where the global block partitioning approaches may falter due to sampling variation in LD statistics. Conclusion LD-Spline is an integrated database routine that quickly and effectively defines the genomic region marked by a SNP using linkage disequilibrium, with a SNP-centric block definition algorithm.

  15. CpG islands undermethylation in human genomic regions under selective pressure.

    Directory of Open Access Journals (Sweden)

    Sergio Cocozza

    Full Text Available DNA methylation at CpG islands (CGIs) is one of the most intensively studied epigenetic mechanisms. It is fundamental for cellular differentiation and control of transcriptional potential. DNA methylation is also involved in several processes that are central to evolutionary biology, including phenotypic plasticity and evolvability. In this study, we explored the relationship between CpG island methylation and signatures of selective pressure in Homo sapiens, using a computational biology approach. By analyzing methylation data of 25 cell lines from the Encyclopedia of DNA Elements (ENCODE) Consortium, we compared the DNA methylation of CpG islands in genomic regions under selective pressure with the methylation of CpG islands in the remaining part of the genome. To define genomic regions under selective pressure, we used three different methods, each oriented to provide distinct information about selective events. Independently of the method and of the cell type used, we found evidence of undermethylation of CGIs in human genomic regions under selective pressure. Additionally, by analyzing SNP frequency in CpG islands, we demonstrated that CpG islands in regions under selective pressure show lower genetic variation. Our findings suggest that the CpG islands in regions under selective pressure seem to be somehow more "protected" from methylation when compared with other regions of the genome.

  16. RGmatch: matching genomic regions to proximal genes in omics data integration

    Directory of Open Access Journals (Sweden)

    Pedro Furió-Tarí

    2016-11-01

    Full Text Available Abstract Background The integrative analysis of multiple genomics data often requires that genome coordinates-based signals have to be associated with proximal genes. The relative location of a genomic region with respect to the gene (gene area) is important for functional data interpretation; hence algorithms that match regions to genes should be able to deliver insight into this information. Results In this work we review the tools that are publicly available for making region-to-gene associations. We also present a novel method, RGmatch, a flexible and easy-to-use Python tool that computes associations either at the gene, transcript, or exon level, applying a set of rules to annotate each region-gene association with the region location within the gene. RGmatch can be applied to any organism as long as genome annotation is available. Furthermore, we qualitatively and quantitatively compare RGmatch to other tools. Conclusions RGmatch simplifies the association of a genomic region with its closest gene. At the same time, it is a powerful tool because the rules used to annotate these associations are very easy to modify according to the researcher’s specific interests. Some important differences between RGmatch and other similar tools already in existence are RGmatch’s flexibility, its wide range of user options, compatibility with any annotatable organism, and its comprehensive and user-friendly output.
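
    The general idea of associating a region with its closest gene can be sketched as follows. This is an illustration only and does not reproduce RGmatch's rules or interface: the function `closest_gene`, the use of the region midpoint, and the representation of a gene by a single transcription start site (TSS) coordinate are all assumptions made for this example.

```python
def closest_gene(region, genes):
    """Return (gene_name, distance) for the gene nearest to a region.

    region: (start, end) coordinates on one chromosome.
    genes:  list of (name, tss) pairs on the same chromosome.
    Distance is measured from the region midpoint to the TSS
    (a simplification chosen for this sketch).
    """
    if not genes:
        raise ValueError("at least one gene is required")
    midpoint = (region[0] + region[1]) // 2
    name, tss = min(genes, key=lambda gene: abs(gene[1] - midpoint))
    return name, abs(tss - midpoint)


# Hypothetical annotation with two genes on the same chromosome.
example_genes = [("geneA", 1000), ("geneB", 5000)]
print(closest_gene((1200, 1400), example_genes))
```

    A real tool would additionally account for strand, gene body overlap, and transcript or exon structure, which is the kind of annotation the abstract attributes to RGmatch.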

  17. Genome-wide deficiency screen for the genomic regions responsible for heat resistance in Drosophila melanogaster

    Directory of Open Access Journals (Sweden)

    Teramura Kouhei

    2011-06-01

    Full Text Available Abstract Background Temperature adaptation is one of the most important determinants of distribution and population size of organisms in nature. Recently, quantitative trait loci (QTL) mapping and gene expression profiling approaches have been used for detecting candidate genes for heat resistance. However, the resolution of QTL mapping is not high enough to examine the individual effects of various genes in each QTL. Heat stress-responsive genes, characterized by gene expression profiling studies, are not necessarily responsible for heat resistance. Some of these genes may be regulated in association with the heat stress response of other genes. Results To evaluate which heat-responsive genes are potential candidates for heat resistance with higher resolution than previous QTL mapping studies, we performed a genome-wide deficiency screen for QTL for heat resistance. We screened 439 isogenic deficiency strains from the DrosDel project, covering 65.6% of the Drosophila melanogaster genome, in order to map QTL for thermal resistance. As a result, we found 19 QTL for heat resistance, including 3 novel QTL outside the QTL found in previous studies. Conclusion The QTL found in this study encompassed 19 heat-responsive genes found in the previous gene expression profiling studies, suggesting that they were strong candidates for heat resistance. This result provides new insights into the genetic architecture of heat resistance. It also emphasizes the advantages of a genome-wide deficiency screen using isogenic deficiency libraries.

  18. Genomic Regions Associated With Interspecies Communication in Dogs Contain Genes Related to Human Social Disorders

    OpenAIRE

    Persson, Mia; Wright, Dominic; Roth, Lina; Batakis, Petros; Jensen, Per

    2016-01-01

    Unlike their wolf ancestors, dogs have unique social skills for communicating and cooperating with humans. Previously, significant heritabilities for human-directed social behaviors have been found in laboratory beagles. Here, a Genome-Wide Association Study identified two genomic regions associated with dog's human-directed social behaviors. We recorded the propensity of laboratory beagles, bred, kept and handled under standardized conditions, to initiate physical interactions with a human d...

  19. Estimation of (co)variances for genomic regions of flexible sizes

    DEFF Research Database (Denmark)

    Sørensen, Lars Peter; Janss, Luc; Madsen, Per

    2012-01-01

    traits such as mammary disease traits in dairy cattle. METHODS: Data on progeny means of six traits related to mastitis resistance in dairy cattle (general mastitis resistance and five pathogen-specific mastitis resistance traits) were analyzed using a bivariate Bayesian SNP-based genomic model......, per chromosome, and in regions of 100 SNP on a chromosome. RESULTS: Genomic proportions of the total variance differed between traits. Genomic correlations were lower than pedigree-based genetic correlations and they were highest between general mastitis and pathogen-specific traits because...

  20. Regional Regulation of Transcription in the Bovine Genome

    NARCIS (Netherlands)

    Kommadath, A.; Nie, H.; Groenen, M.A.M.; Pas, te M.F.W.; Veerkamp, R.F.; Smits, M.A.

    2011-01-01

    Eukaryotic genes are distributed along chromosomes as clusters of highly expressed genes termed RIDGEs (Regions of IncreaseD Gene Expression) and lowly expressed genes termed anti-RIDGEs, interspersed among genes expressed at intermediate levels or not expressed. Previous studies based on this

  1. Reference-free SNP calling: improved accuracy by preventing incorrect calls from repetitive genomic regions

    Directory of Open Access Journals (Sweden)

    Dou Jinzhuang

    2012-06-01

    Full Text Available Abstract Background Single nucleotide polymorphisms (SNPs) are the most abundant type of genetic variation in eukaryotic genomes and have recently become the marker of choice in a wide variety of ecological and evolutionary studies. The advent of next-generation sequencing (NGS) technologies has made it possible to efficiently genotype a large number of SNPs in the non-model organisms with no or limited genomic resources. Most NGS-based genotyping methods require a reference genome to perform accurate SNP calling. Little effort, however, has yet been devoted to developing or improving algorithms for accurate SNP calling in the absence of a reference genome. Results Here we describe an improved maximum likelihood (ML) algorithm called iML, which can achieve high genotyping accuracy for SNP calling in the non-model organisms without a reference genome. The iML algorithm incorporates the mixed Poisson/normal model to detect composite read clusters and can efficiently prevent incorrect SNP calls resulting from repetitive genomic regions. Through analysis of simulation and real sequencing datasets, we demonstrate that in comparison with ML or a threshold approach, iML can remarkably improve the accuracy of de novo SNP genotyping and is especially powerful for the reference-free genotyping in diploid genomes with high repeat contents. Conclusions The iML algorithm can efficiently prevent incorrect SNP calls resulting from repetitive genomic regions, and thus outperforms the original ML algorithm by achieving much higher genotyping accuracy. Our algorithm is therefore very useful for accurate de novo SNP genotyping in the non-model organisms without a reference genome. Reviewers This article was reviewed by Dr. Richard Durbin, Dr. Liliana Florea (nominated by Dr. Steven Salzberg) and Dr. Arcady Mushegian.

  2. Intervene: a tool for intersection and visualization of multiple gene or genomic region sets.

    Science.gov (United States)

    Khan, Aziz; Mathelier, Anthony

    2017-05-31

    A common task for scientists relies on comparing lists of genes or genomic regions derived from high-throughput sequencing experiments. While several tools exist to intersect and visualize sets of genes, similar tools dedicated to the visualization of genomic region sets are currently limited. To address this gap, we have developed the Intervene tool, which provides an easy and automated interface for the effective intersection and visualization of genomic region or list sets, thus facilitating their analysis and interpretation. Intervene contains three modules: venn to generate Venn diagrams of up to six sets, upset to generate UpSet plots of multiple sets, and pairwise to compute and visualize intersections of multiple sets as clustered heat maps. Intervene, and its interactive web ShinyApp companion, generate publication-quality figures for the interpretation of genomic region and list sets. Intervene and its web application companion provide an easy command line and an interactive web interface to compute intersections of multiple genomic and list sets. They have the capacity to plot intersections using easy-to-interpret visual approaches. Intervene is developed and designed to meet the needs of both computer scientists and biologists. The source code is freely available at https://bitbucket.org/CBGR/intervene , with the web application available at https://asntech.shinyapps.io/intervene .
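
    The kind of set comparison behind Venn diagrams and UpSet plots can be sketched in a few lines. The function `exclusive_intersections` below is a hypothetical helper written for this listing, not part of Intervene; it counts, for every combination of input lists, the items that occur in exactly that combination and nowhere else.

```python
from collections import Counter


def exclusive_intersections(named_sets):
    """Count items by the exact combination of sets they belong to.

    named_sets: dict mapping a set name to a set of items
                (for example, gene identifiers).
    Returns a dict whose keys are sorted tuples of set names and
    whose values are the number of items found in exactly those sets.
    """
    counts = Counter()
    all_items = set().union(*named_sets.values())
    for item in all_items:
        members = tuple(sorted(
            name for name, items in named_sets.items() if item in items
        ))
        counts[members] += 1
    return dict(counts)


# Hypothetical gene lists from three experiments.
gene_lists = {
    "A": {"g1", "g2", "g3"},
    "B": {"g2", "g3", "g4"},
    "C": {"g3", "g5"},
}
print(exclusive_intersections(gene_lists))
```

    Genomic regions need interval overlap rather than exact identity, so this sketch applies to gene or identifier lists only.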

  3. Detection of genomic variation by selection of a 9 mb DNA region and high throughput sequencing.

    Directory of Open Access Journals (Sweden)

    Sergey I Nikolaev

    Full Text Available Detection of the rare polymorphisms and causative mutations of genetic diseases in a targeted genomic area has become a major goal in order to understand genomic and phenotypic variability. We have interrogated repeat-masked regions of 8.9 Mb on human chromosomes 21 (7.8 Mb) and 7 (1.1 Mb) from an individual from the International HapMap Project (NA12872). We have optimized a method of genomic selection for high throughput sequencing. Microarray-based selection and sequencing resulted in 260-fold enrichment, with 41% of reads mapping to the target region. 83% of SNPs in the targeted region had at least 4-fold sequence coverage and 54% at least 15-fold. When assaying HapMap SNPs in NA12872, our sequence genotypes are 91.3% concordant in regions with coverage ≥ 4-fold, and 97.9% concordant in regions with coverage ≥ 15-fold. About 81% of the SNPs recovered with both thresholds are listed in dbSNP. We observed that regions with low sequence coverage occur in close proximity to low-complexity DNA. Validation experiments using Sanger sequencing were performed for 46 SNPs with 15-20 fold coverage, with a confirmation rate of 96%, suggesting that DNA selection provides an accurate and cost-effective method for identifying rare genomic variants.

  4. Thousands of corresponding human and mouse genomic regions unalignable in primary sequence contain common RNA structure

    DEFF Research Database (Denmark)

    Torarinsson, Elfar; Sawera, Milena; Havgaard, Jakob Hull

    2006-01-01

    Human and mouse genome sequences contain roughly 100,000 regions that are unalignable in primary sequence and neighbor corresponding alignable regions between both organisms. These pairs are generally assumed to be nonconserved, although the level of structural conservation between these has never...... been investigated. Owing to the limitations in computational methods, comparative genomics has been lacking the ability to compare such nonconserved sequence regions for conserved structural RNA elements. We have investigated the presence of structural RNA elements by conducting a local structural...... alignment, using FOLDALIGN, on a subset of these 100,000 corresponding regions and estimate that 1800 contain common RNA structures. Comparing our results with the recent mapping of transcribed fragments (transfrags) in human, we find that high-scoring candidates are twice as likely to be found in regions...

  5. Ultradeep sequencing of a human ultraconserved region reveals somatic and constitutional genomic instability.

    Directory of Open Access Journals (Sweden)

    Anna De Grassi

    2010-01-01

    Full Text Available Early detection of cancer-associated genomic instability is crucial, particularly in tumour types in which this instability represents the essential underlying mechanism of tumourigenesis. Currently used methods require the presence of already established neoplastic cells because they only detect clonal mutations. In principle, parallel sequencing of single DNA filaments could reveal the early phases of tumour initiation by detecting low-frequency mutations, provided an adequate depth of coverage and an effective control of the experimental error. We applied ultradeep sequencing to estimate the genomic instability of individuals with hereditary non-polyposis colorectal cancer (HNPCC). To overcome the experimental error, we used an ultraconserved region (UCR) of the human genome as an internal control. By comparing the mutability outside and inside the UCR, we observed a tendency of the ultraconserved element to accumulate significantly fewer mutations than the flanking segments in both neoplastic and nonneoplastic HNPCC samples. No difference between the two regions was detectable in cells from healthy donors, indicating that all three HNPCC samples have mutation rates higher than the healthy genome. This is the first, to our knowledge, direct evidence of an intrinsic genomic instability of individuals with heterozygous mutations in mismatch repair genes, and constitutes the proof of principle for the development of a more sensitive molecular assay of genomic instability.

  6. Genome-wide analysis of regions similar to promoters of histone genes

    KAUST Repository

    Chowdhary, Rajesh

    2010-05-28

    Background: The purpose of this study is to: i) develop a computational model of promoters of human histone-encoding genes (shortly histone genes), an important class of genes that participate in various critical cellular processes, ii) use the model so developed to identify regions across the human genome that have similar structure as promoters of histone genes; such regions could represent potential genomic regulatory regions, e.g. promoters, of genes that may be coregulated with histone genes, and iii) identify in this way genes that have high likelihood of being coregulated with the histone genes. Results: We successfully developed a histone promoter model using a comprehensive collection of histone genes. Based on leave-one-out cross-validation test, the model produced good prediction accuracy (94.1% sensitivity, 92.6% specificity, and 92.8% positive predictive value). We used this model to predict across the genome a number of genes that shared similar promoter structures with the histone gene promoters. We thus hypothesize that these predicted genes could be coregulated with histone genes. This hypothesis matches well with the available gene expression, gene ontology, and pathways data. Jointly with promoters of the above-mentioned genes, we found a large number of intergenic regions with similar structure as histone promoters. Conclusions: This study represents one of the most comprehensive computational analyses conducted thus far on a genome-wide scale of promoters of human histone genes. Our analysis suggests a number of other human genes that share a high similarity of promoter structure with the histone genes and thus are highly likely to be coregulated, and consequently coexpressed, with the histone genes. We also found that there are a large number of intergenic regions across the genome with their structures similar to promoters of histone genes. These regions may be promoters of yet unidentified genes, or may represent remote control regions that

  7. Regulation of Sex Determination in Mice by a Non-coding Genomic Region

    Science.gov (United States)

    Arboleda, Valerie A.; Fleming, Alice; Barseghyan, Hayk; Délot, Emmanuèle; Sinsheimer, Janet S.; Vilain, Eric

    2014-01-01

    To identify novel genomic regions that regulate sex determination, we utilized the powerful C57BL/6J-YPOS (B6-YPOS) model of XY sex reversal where mice with autosomes from the B6 strain and a Y chromosome from a wild-derived strain, Mus domesticus poschiavinus (YPOS), show complete sex reversal. In B6-YPOS, the presence of a 55-Mb congenic region on chromosome 11 protects from sex reversal in a dose-dependent manner. Using mouse genetic backcross designs and high-density SNP arrays, we narrowed the congenic region to a 1.62-Mb genomic region on chromosome 11 that confers 80% protection from B6-YPOS sex reversal when one copy is present and complete protection when two copies are present. It was previously believed that the protective congenic region originated from the 129S1/SviMJ (129) strain. However, genomic analysis revealed that this region is not derived from 129 and most likely is derived from the semi-inbred strain POSA. We show that the small 1.62-Mb congenic region that protects against B6-YPOS sex reversal is located within the Sox9 promoter and promotes the expression of Sox9, thereby driving testis development within the B6-YPOS background. Through 30 years of backcrossing, this congenic region was maintained, as it promoted male sex determination and fertility despite the female-promoting B6-YPOS genetic background. Our findings demonstrate that long-range enhancer regions are critical to developmental processes and can be used to identify the complex interplay between genome variants, epigenetics, and developmental gene regulation. PMID:24793290

  8. Detecting genomic regions associated with a disease using variability functions and Adjusted Rand Index

    Directory of Open Access Journals (Sweden)

    Makarenkov Vladimir

    2011-10-01

    Full Text Available Abstract. Background: The identification of functional regions contained in a given multiple sequence alignment constitutes one of the major challenges of comparative genomics. Several studies have focused on the identification of conserved regions and motifs. However, most existing methods ignore the relationship between the functional genomic regions and the external evidence associated with the considered group of species (e.g., carcinogenicity of Human Papilloma Virus). In the past, we have proposed a method that takes into account prior knowledge of external evidence (e.g., carcinogenicity or invasivity of the considered organisms) and identifies genomic regions related to a specific disease. Results and conclusion: We present a new algorithm for detecting genomic regions that may be associated with a disease. Two new variability functions and a bipartition optimization procedure are described. We validate and weigh our results using the Adjusted Rand Index (ARI), and thus assess to what extent the selected regions are related to carcinogenicity, invasivity, or any other species classification given as input. The predictive power of different hit region detection functions was assessed on synthetic and real data. Our simulation results suggest that there is no single function that provides the best results in all practical situations (e.g., monophyletic or polyphyletic evolution, and positive or negative selection), and that at least three different functions might be useful. The proposed hit region identification functions that do not benefit from the prior knowledge (i.e., carcinogenicity or invasivity of the involved organisms) can provide results equivalent to those of the existing functions that take advantage of such prior knowledge. Using the new algorithm, we examined the Neisseria meningitidis FrpB gene product for invasivity and immunologic activity, and human papilloma virus (HPV) E6 oncoprotein for carcinogenicity, and confirmed
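
    The Adjusted Rand Index named in the abstract above scores the agreement between two partitions of the same items, for example a species classification and a bipartition induced by an alignment region. The following is a minimal sketch, assuming the standard Hubert-Arabie form of the index; the function name and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand Index (Hubert-Arabie form) between two labelings."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must have the same length")
    total_pairs = comb(len(labels_a), 2)
    if total_pairs == 0:
        # Fewer than two items: the partitions agree trivially.
        return 1.0

    # Contingency counts: items per (cluster in A, cluster in B) cell.
    cell_counts = Counter(zip(labels_a, labels_b))
    a_counts = Counter(labels_a)
    b_counts = Counter(labels_b)

    sum_cells = sum(comb(c, 2) for c in cell_counts.values())
    sum_a = sum(comb(c, 2) for c in a_counts.values())
    sum_b = sum(comb(c, 2) for c in b_counts.values())

    expected = sum_a * sum_b / total_pairs
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        # Degenerate case, e.g. both labelings use a single cluster.
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


# Hypothetical toy data: a known classification versus the bipartition
# induced by the residue observed at one alignment column.
carcinogenicity = ["high", "high", "low", "low", "low"]
column_residue = ["A", "A", "G", "G", "A"]
score = adjusted_rand_index(carcinogenicity, column_residue)
```

    A score of 1 means the two partitions are identical up to relabeling, and values near 0 are what random labelings would be expected to produce.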

  9. Genomic Characterization and Comparison of Multi-Regional and Pooled Tumor Biopsy Specimens.

    Directory of Open Access Journals (Sweden)

    Je-Gun Joung

    Full Text Available A single tumor biopsy specimen is typically used in cancer genome studies. However, it may incompletely represent the underlying mutational and transcriptional profiles of tumor biology. Multi-regional biopsies have the advantage of increased sensitivity for genomic profiling, but they are not cost-effective. An alternative method, such as the pooling of multiple biopsies, remains a challenge. In order to determine if the pooling of distinct regions is representative at the genomic and transcriptome level, we performed sequencing of four regional samples and pooled samples for four cancer types including colon, stomach, kidney and liver cancer. Subsequently, a comparative analysis was conducted to explore differences in mutations and gene expression profiles between multiple regional biopsies and pooled biopsy for each tumor. Our analysis revealed a marginal level of regional difference in detected variants, but in those with low allele frequency, considerable discrepancies were observed. In conclusion, sequencing pooled samples has the benefit of detecting many variants with moderate allele frequency that occur in partial regions, but it is not applicable for detecting low-frequency mutations that require deep sequencing.

  10. Identification of Genomic Regions Associated with Phenotypic Variation between Dog Breeds using Selection Mapping

    Science.gov (United States)

    Derrien, Thomas; Axelsson, Erik; Rosengren Pielberg, Gerli; Sigurdsson, Snaevar; Fall, Tove; Seppälä, Eija H.; Hansen, Mark S. T.; Lawley, Cindy T.; Karlsson, Elinor K.; Bannasch, Danika; Vilà, Carles; Lohi, Hannes; Galibert, Francis; Fredholm, Merete; Häggström, Jens; Hedhammar, Åke; André, Catherine; Lindblad-Toh, Kerstin; Hitte, Christophe; Webster, Matthew T.

    2011-01-01

    The extraordinary phenotypic diversity of dog breeds has been sculpted by a unique population history accompanied by selection for novel and desirable traits. Here we perform a comprehensive analysis using multiple test statistics to identify regions under selection in 509 dogs from 46 diverse breeds using a newly developed high-density genotyping array consisting of >170,000 evenly spaced SNPs. We first identify 44 genomic regions exhibiting extreme differentiation across multiple breeds. Genetic variation in these regions correlates with variation in several phenotypic traits that vary between breeds, and we identify novel associations with both morphological and behavioral traits. We next scan the genome for signatures of selective sweeps in single breeds, characterized by long regions of reduced heterozygosity and fixation of extended haplotypes. These scans identify hundreds of regions, including 22 blocks of homozygosity longer than one megabase in certain breeds. Candidate selection loci are strongly enriched for developmental genes. We chose one highly differentiated region, associated with body size and ear morphology, and characterized it using high-throughput sequencing to provide a list of variants that may directly affect these traits. This study provides a catalogue of genomic regions showing extreme reduction in genetic variation or population differentiation in dogs, including many linked to phenotypic variation. The many blocks of reduced haplotype diversity observed across the genome in dog breeds are the result of both selection and genetic drift, but extended blocks of homozygosity on a megabase scale appear to be best explained by selection. Further elucidation of the variants under selection will help to uncover the genetic basis of complex traits and disease. PMID:22022279
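
    The abstract above refers to long regions of reduced heterozygosity and to blocks of homozygosity longer than one megabase. As a rough illustration only, the sketch below scans one individual's genotypes along a chromosome and reports runs of consecutive homozygous SNPs that span at least a chosen length. The genotype coding, function name and default threshold are assumptions; the study's own test statistics are not described in this record.

```python
def homozygous_blocks(positions, genotypes, min_length_bp=1_000_000):
    """Return (start, end) spans of consecutive homozygous SNPs.

    positions: sorted base-pair coordinates on one chromosome.
    genotypes: allele counts per SNP (0 or 2 = homozygous, 1 = heterozygous).
    Only spans of at least min_length_bp are reported.
    """
    if len(positions) != len(genotypes):
        raise ValueError("positions and genotypes must have the same length")

    blocks = []
    run_start = None
    run_end = None
    for pos, genotype in zip(positions, genotypes):
        if genotype in (0, 2):
            if run_start is None:
                run_start = pos
            run_end = pos
        else:
            # A heterozygous call closes the current run, if any.
            if run_start is not None and run_end - run_start >= min_length_bp:
                blocks.append((run_start, run_end))
            run_start = None
    # Close a run that extends to the last SNP.
    if run_start is not None and run_end - run_start >= min_length_bp:
        blocks.append((run_start, run_end))
    return blocks
```

    A practical scan would also have to tolerate genotyping errors and missing calls, which this sketch deliberately ignores.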

  11. Structured RNAs in the ENCODE selected regions of the human genome

    DEFF Research Database (Denmark)

    Washietl, Stefan; Pedersen, Jakob Skou; Korbel, Jan O

    2007-01-01

    Functional RNA structures play an important role both in the context of noncoding RNA transcripts as well as regulatory elements in mRNAs. Here we present a computational study to detect functional RNA structures within the ENCODE regions of the human genome. Since structural RNAs in general lack...... with the GENCODE annotation points to functional RNAs in all genomic contexts, with a slightly increased density in 3'-UTRs. While we estimate a significant false discovery rate of approximately 50%-70% many of the predictions can be further substantiated by additional criteria: 248 loci are predicted by both RNAz...

  12. Sequence based polymorphic (SBP) marker technology for targeted genomic regions: its application in generating a molecular map of the Arabidopsis thaliana genome

    Directory of Open Access Journals (Sweden)

    Sahu Binod B

    2012-01-01

    Full Text Available Abstract. Background: Molecular markers facilitate both genotype identification, essential for modern animal and plant breeding, and the isolation of genes based on their map positions. Advancements in sequencing technology have made possible the identification of single nucleotide polymorphisms (SNPs) for any genomic region. Here a sequence based polymorphic (SBP) marker technology for generating molecular markers for targeted genomic regions in Arabidopsis is described. Results: A ~3X genome coverage sequence of the Arabidopsis thaliana ecotype Niederzenz (Nd-0) was obtained by applying Illumina's sequencing by synthesis (Solexa) technology. Comparison of the Nd-0 genome sequence with the assembled Columbia-0 (Col-0) genome sequence identified putative single nucleotide polymorphisms (SNPs) throughout the entire genome. Multiple 75 base pair Nd-0 sequence reads containing SNPs and originating from individual genomic DNA molecules were the basis for developing co-dominant SBP markers. SNP-containing Col-0 sequences, supported by transcript sequences or sequences from multiple BAC clones, were compared to the respective Nd-0 sequences to identify possible restriction endonuclease site variations. Small amplicons, PCR amplified from both ecotypes, were digested with suitable restriction enzymes and resolved on a gel to reveal the sequence based polymorphisms. By applying this technology, 21 SBP markers for the marker-poor regions of the Arabidopsis map, representing polymorphisms between the Col-0 and Nd-0 ecotypes, were generated. Conclusions: The SBP marker technology described here allowed the development of molecular markers for targeted genomic regions of Arabidopsis. It should facilitate isolation of co-dominant molecular markers for targeted genomic regions of any animal or plant species whose genomic sequences have been assembled. This technology will particularly facilitate the development of high density molecular marker maps, essential for
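
    The comparison step described above, in which SNP-containing sequences from two ecotypes are checked for restriction site variation, can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the function names and the toy amplicon sequences are hypothetical, and only three well-known enzymes with palindromic recognition sites are included.

```python
# Recognition sequences of three common restriction enzymes. All three are
# palindromic, so scanning one strand is enough to find every site.
RECOGNITION_SITES = {
    "EcoRI": "GAATTC",
    "HindIII": "AAGCTT",
    "BamHI": "GGATCC",
}


def count_sites(sequence, site):
    """Count occurrences of a recognition site, allowing overlaps."""
    sequence = sequence.upper()
    count = 0
    start = sequence.find(site)
    while start != -1:
        count += 1
        start = sequence.find(site, start + 1)
    return count


def diagnostic_enzymes(seq_a, seq_b, sites=RECOGNITION_SITES):
    """Map enzyme name to (sites in seq_a, sites in seq_b) where they differ."""
    differing = {}
    for enzyme, site in sites.items():
        n_a = count_sites(seq_a, site)
        n_b = count_sites(seq_b, site)
        if n_a != n_b:
            differing[enzyme] = (n_a, n_b)
    return differing


# Hypothetical amplicons differing by a single SNP (A/G at position 7).
col0_amplicon = "ATGCGAATTCTTAGC"
nd0_amplicon = "ATGCGAGTTCTTAGC"
candidates = diagnostic_enzymes(col0_amplicon, nd0_amplicon)
```

    An enzyme that cuts one amplicon but not the other yields different fragment patterns on a gel, which is what makes such a marker co-dominant.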

  13. Identification of genomic regions associated with female fertility in Danish Jersey using whole genome sequence data

    DEFF Research Database (Denmark)

    Höglund, Johanna; Guldbrandtsen, Bernt; Lund, Mogens Sandø

    2015-01-01

    (AIS), 56-day non-return rate (NRR), number of days from first to last insemination (IFL), and number of days between calving and first insemination (ICF). The objective of this study was to identify associations between sequence variants and fertility traits in Jersey cattle based on 1,225 Jersey...... quantitative trait locus regions were re-analyzed using a linear mixed model (animal model) for both FTI and its component traits AIS, NRR, IFL and ICF. The underlying traits were analyzed separately for heifers (first parity cows) and cows (later parity cows) for AIS, NRR, and IFL. Results: In the first step...... 6 QTL were detected for FTI: one QTL on each of BTA7, BTA20, BTA23, BTA25, and two QTL on BTA9 (QTL9–1 and QTL9–2). In the second step, ICF showed association with the QTL regions on BTA7, QTL9–2 QTL2 on BTA9, and BTA25, AIS for cows on BTA20 and BTA23, AIS for heifers on QTL9–2 on BTA9, IFL...

  14. Identification of genomic regions associated with phenotypic variation between dog breeds using selection mapping

    DEFF Research Database (Denmark)

    Vaysse, Amaury; Ratnakumar, Abhirami; Derrien, Thomas

    2011-01-01

    across the genome in dog breeds are the result of both selection and genetic drift, but extended blocks of homozygosity on a megabase scale appear to be best explained by selection. Further elucidation of the variants under selection will help to uncover the genetic basis of complex traits and disease.......The extraordinary phenotypic diversity of dog breeds has been sculpted by a unique population history accompanied by selection for novel and desirable traits. Here we perform a comprehensive analysis using multiple test statistics to identify regions under selection in 509 dogs from 46 diverse...... breeds using a newly developed high-density genotyping array consisting of >170,000 evenly spaced SNPs. We first identify 44 genomic regions exhibiting extreme differentiation across multiple breeds. Genetic variation in these regions correlates with variation in several phenotypic traits that vary...

  15. Genomic regions associated with the sex-linked inhibitor of dermal melanin in Silkie chicken

    Directory of Open Access Journals (Sweden)

    Ming TIAN, Rui HAO, Suyun FANG, Yanqiang WANG, Xiaorong GU, Chungang FENG, Xiaoxiang HU, Ning LI

    2014-09-01

    Full Text Available A unique characteristic of the Silkie chicken is its fibromelanosis phenotype. The dermal layer of its skin, its connective tissue and shank dermis are hyperpigmented. This dermal hyperpigmentation phenotype is controlled by the sex-linked inhibitor of dermal melanin gene (ID) and the dominant fibromelanosis allele. This study attempted to confirm the genomic region associated with ID. By genotyping, ID was found to be closely linked to the region between GGA_rs16127903 and GGA_rs14685542 (8406919 bp) on chromosome Z, which contains ten functional genes. The expression of these genes was characterized in the embryo and 4 days after hatching, and it was concluded that MTAP, encoding methylthioadenosine phosphorylase, would be the most likely candidate gene. Finally, target DNA capture and sequence analysis was performed, but no specific SNP(s) was found in the targeted region of the Silkie genome. Further work is necessary to identify the causal ID mutation located on chromosome Z.

  16. Two Genomic Regions Involved in Catechol Siderophore Production by Erwinia carotovora

    Science.gov (United States)

    Bull, Carolee T.; Ishimaru, Carol A.; Loper, Joyce E.

    1994-01-01

    Two regions involved in catechol biosynthesis (cbs) of Erwinia carotovora W3C105 were cloned by functional complementation of Escherichia coli mutants that were deficient in the biosynthesis of the catechol siderophore enterobactin (ent). A 4.3-kb region of genomic DNA of E. carotovora complemented the entB402 mutation of E. coli. A second genomic region of 12.8 kb complemented entD, entC147, entE405, and entA403 mutations of E. coli. Although functions encoded by catechol biosynthesis genes (cbsA, cbsB, cbsC, cbsD, and cbsE) of E. carotovora were interchangeable with those encoded by corresponding enterobactin biosynthesis genes (entA, entB, entC, entD, and entE), only cbsE hybridized to its functional counterpart (entE) in E. coli. The cbsEA region of E. carotovora W3C105 hybridized to genomic DNA of 21 diverse strains of E. carotovora but did not hybridize to that of a chrysobactin-producing strain of Erwinia chrysanthemi. Strains of E. carotovora fell into nine groups on the basis of sizes of restriction fragments that hybridized to the cbsEA region, indicating that catechol biosynthesis genes were highly polymorphic among strains of E. carotovora. PMID:16349193

  17. A statistical framework to predict functional non-coding regions in the human genome through integrated analysis of annotation data.

    Science.gov (United States)

    Lu, Qiongshi; Hu, Yiming; Sun, Jiehuan; Cheng, Yuwei; Cheung, Kei-Hoi; Zhao, Hongyu

    2015-05-27

    Identifying functional regions in the human genome is a major goal in human genetics. Great efforts have been made to functionally annotate the human genome either through computational predictions, such as genomic conservation, or high-throughput experiments, such as the ENCODE project. These efforts have resulted in a rich collection of functional annotation data of diverse types that need to be jointly analyzed for integrated interpretation and annotation. Here we present GenoCanyon, a whole-genome annotation method that performs unsupervised statistical learning using 22 computational and experimental annotations, thereby inferring the functional potential of each position in the human genome. With GenoCanyon, we are able to predict many of the known functional regions. Its ability to predict functional regions, together with its generalizable statistical framework, makes GenoCanyon a unique and powerful tool for whole-genome annotation. The GenoCanyon web server is available at http://genocanyon.med.yale.edu.

  18. Multi-region and single-cell sequencing reveal variable genomic heterogeneity in rectal cancer.

    Science.gov (United States)

    Liu, Mingshan; Liu, Yang; Di, Jiabo; Su, Zhe; Yang, Hong; Jiang, Beihai; Wang, Zaozao; Zhuang, Meng; Bai, Fan; Su, Xiangqian

    2017-11-23

    Colorectal cancer is a heterogeneous group of malignancies with complex molecular subtypes. While colon cancer has been widely investigated, studies on rectal cancer are very limited. Here, we performed multi-region whole-exome sequencing and single-cell whole-genome sequencing to examine the genomic intratumor heterogeneity (ITH) of rectal tumors. We sequenced nine tumor regions and 88 single cells from two rectal cancer patients with tumors of the same molecular classification and characterized their mutation profiles and somatic copy number alterations (SCNAs) at the multi-region and the single-cell levels. A variable extent of genomic heterogeneity was observed between the two patients, and the degree of ITH increased when analyzed on the single-cell level. We found that major SCNAs were early events in cancer development and inherited steadily. Single-cell sequencing revealed mutations and SCNAs which were hidden in bulk sequencing. In summary, we studied the ITH of rectal cancer at regional and single-cell resolution and demonstrated that variable heterogeneity existed in two patients. The mutational scenarios and SCNA profiles of the two treatment-naïve patients of the same molecular subtype are quite different. Our results suggest each tumor possesses its own architecture, which may result in different diagnosis, prognosis, and drug responses. Remarkable ITH exists in the two patients we have studied, providing a preliminary impression of ITH in rectal cancer.

  19. Genome-wide association identifies multiple genomic regions associated with susceptibility to and control of ovine lentivirus.

    Directory of Open Access Journals (Sweden)

    Stephen N White

    Full Text Available BACKGROUND: Like human immunodeficiency virus (HIV), ovine lentivirus (OvLV) is macrophage-tropic and causes lifelong infection. OvLV infects one quarter of U.S. sheep and induces pneumonia and body condition wasting. There is no vaccine to prevent OvLV infection and no cost-effective treatment for infected animals. However, breed differences in prevalence and proviral concentration have indicated a genetic basis for susceptibility to OvLV. A recent study identified TMEM154 variants in OvLV susceptibility. The objective here was to identify additional loci associated with odds and/or control of OvLV infection. METHODOLOGY/PRINCIPAL FINDINGS: This genome-wide association study (GWAS) included 964 sheep from Rambouillet, Polypay, and Columbia breeds with serological status and proviral concentration phenotypes. Analytic models accounted for breed and age, as well as genotype. This approach identified TMEM154 (nominal P=9.2×10^-7; empirical P=0.13), provided 12 additional genomic regions associated with odds of infection, and provided 13 regions associated with control of infection (all nominal P<1×10^-5). Rapid decline of linkage disequilibrium with distance suggested many regions included few genes each. Genes in regions associated with odds of infection included DPPA2/DPPA4 (empirical P=0.006) and SYTL3 (P=0.051). Genes in regions associated with control of infection included a zinc finger cluster (ZNF192, ZSCAN16, ZNF389, and ZNF165; P=0.001), C19orf42/TMEM38A (P=0.047), and DLGAP1 (P=0.092). CONCLUSIONS/SIGNIFICANCE: These associations provide targets for mutation discovery in sheep susceptibility to OvLV. Aside from TMEM154, these genes have not been associated previously with lentiviral infection in any species, to our knowledge. Further, data from other species suggest functional hypotheses for future testing of these genes in OvLV and other lentiviral infections. Specifically, SYTL3 binds and may regulate RAB27A, which is required for enveloped

  20. Genomic Regions Associated With Interspecies Communication in Dogs Contain Genes Related to Human Social Disorders.

    Science.gov (United States)

    Persson, Mia E; Wright, Dominic; Roth, Lina S V; Batakis, Petros; Jensen, Per

    2016-09-29

    Unlike their wolf ancestors, dogs have unique social skills for communicating and cooperating with humans. Previously, significant heritabilities for human-directed social behaviors have been found in laboratory beagles. Here, a Genome-Wide Association Study identified two genomic regions associated with dogs' human-directed social behaviors. We recorded the propensity of laboratory beagles, bred, kept and handled under standardized conditions, to initiate physical interactions with a human during an unsolvable problem-task, and 190 individuals were genotyped with an HD Canine SNP-chip. One genetic marker on chromosome 26 within the SEZ6L gene was significantly associated with time spent close to, and in physical contact with, the human. Two suggestive markers on chromosome 26, located within the ARVCF gene, were also associated with human contact seeking. Strikingly, four additional genes present in the same linkage blocks affect social abilities in humans, e.g., SEZ6L has been associated with autism and COMT affects aggression in adolescents with ADHD. This is, to our knowledge, the first genome-wide study presenting candidate genomic regions for dog sociability and inter-species communication. These results advance our understanding of dog domestication and support the use of the dog as a novel model system for human social disorders.

  1. Deciphering heterogeneity in pig genome assembly Sscrofa9 by isochore and isochore-like region analyses.

    Directory of Open Access Journals (Sweden)

    Wenqian Zhang

    Full Text Available BACKGROUND: The isochore, a large DNA sequence with relatively small GC variance, is one of the most important structures in eukaryotic genomes. Although the isochore has been widely studied in humans and other species, little is known about its distribution in pigs. PRINCIPAL FINDINGS: In this paper, we construct a map of long homogeneous genome regions (LHGRs), i.e., isochores and isochore-like regions, in pigs to provide an intuitive version of GC heterogeneity in each chromosome. The LHGR pattern study not only quantifies heterogeneities, but also reveals some primary characteristics of the chromatin organization, including the following: (1) the majority of LHGRs belong to GC-poor families and are long; (2) a high gene density tends to occur with the appearance of GC-rich LHGRs; and (3) the density of LINE repeats decreases with an increase in the GC content of LHGRs. Furthermore, a portion of LHGRs with particular GC ranges (50%-51% and 54%-55%) tend to have abnormally high gene densities, suggesting that biased gene conversion (BGC), as well as time- and energy-saving principles, could be of importance to the formation of genome organization. CONCLUSION: This study significantly improves our knowledge of chromatin organization in the pig genome. Correlations between the different biological features (e.g., gene density and repeat density) and GC content of LHGRs provide a unique glimpse of in silico gene and repeats prediction.
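
    The abstract above characterizes an isochore as a large sequence with relatively small GC variance. A minimal sketch of that idea computes GC content in non-overlapping windows and checks whether the variance across windows stays under a threshold. The function names, window size and threshold are hypothetical, and the segmentation method actually used for Sscrofa9 is not described in this record.

```python
from statistics import pvariance


def gc_content(sequence):
    """Fraction of G and C among unambiguous bases, or None if there are none."""
    bases = [b for b in sequence.upper() if b in "ACGT"]
    if not bases:
        return None
    return sum(b in "GC" for b in bases) / len(bases)


def window_gc(sequence, window):
    """GC content of consecutive non-overlapping windows of a fixed size."""
    if window <= 0:
        raise ValueError("window must be positive")
    values = []
    for start in range(0, len(sequence) - window + 1, window):
        gc = gc_content(sequence[start:start + window])
        if gc is not None:  # skip windows made only of ambiguous bases
            values.append(gc)
    return values


def is_homogeneous(sequence, window, max_variance):
    """True if the windowed GC content varies by no more than max_variance."""
    values = window_gc(sequence, window)
    return len(values) >= 2 and pvariance(values) <= max_variance
```

    On real chromosomes the windows would typically be tens of kilobases or more, so the tiny sequences used for testing such a function are purely illustrative.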

  2. Read clouds uncover variation in complex regions of the human genome.

    Science.gov (United States)

    Bishara, Alex; Liu, Yuling; Weng, Ziming; Kashef-Haghighi, Dorna; Newburger, Daniel E; West, Robert; Sidow, Arend; Batzoglou, Serafim

    2015-10-01

    Although an increasing amount of human genetic variation is being identified and recorded, determining variants within repeated sequences of the human genome remains a challenge. Most population and genome-wide association studies have therefore been unable to consider variation in these regions. Core to the problem is the lack of a sequencing technology that produces reads with sufficient length and accuracy to enable unique mapping. Here, we present a novel methodology of using read clouds, obtained by accurate short-read sequencing of DNA derived from long fragment libraries, to confidently align short reads within repeat regions and enable accurate variant discovery. Our novel algorithm, Random Field Aligner (RFA), captures the relationships among the short reads governed by the long read process via a Markov Random Field. We utilized a modified version of the Illumina TruSeq synthetic long-read protocol, which yielded shallow-sequenced read clouds. We test RFA through extensive simulations and apply it to discover variants on the NA12878 human sample, for which shallow TruSeq read cloud sequencing data are available, and on an invasive breast carcinoma genome that we sequenced using the same method. We demonstrate that RFA facilitates accurate recovery of variation in 155 Mb of the human genome, including 94% of 67 Mb of segmental duplication sequence and 96% of 11 Mb of transcribed sequence, that are currently hidden from short-read technologies. © 2015 Bishara et al.; Published by Cold Spring Harbor Laboratory Press.

  3. Functional genomics in Campylobacter coli identified a novel streptomycin resistance gene located in a hypervariable genomic region.

    Science.gov (United States)

    Olkkola, Satu; Culebro, Alejandra; Juntunen, Pekka; Hänninen, Marja-Liisa; Rossi, Mirko

    2016-07-01

    Numerous aminoglycoside resistance genes have been reported in Campylobacter spp. often resembling those from Gram-positive bacterial species and located in transferable genetic elements with other resistance genes. We discovered a new streptomycin (STR) resistance gene in Campylobacter coli showing 27-34 % amino acid identity to aminoglycoside 6-nucleotidyl-transferases described previously in Campylobacter. STR resistance was verified by gene expression and insertional inactivation. This ant-like gene differs from the previously described aminoglycoside resistance genes in Campylobacter spp. in several aspects. It does not appear to originate from Gram-positive bacteria and is located in a region corresponding to a previously described hypervariable region 14 of C. jejuni with no other known resistance genes detected in close proximity. Finally, it does not belong to a multiple drug resistance plasmid or transposon. This novel ant-like gene appears widely spread among C. coli as it is found in strains originating both from Europe and the United States and from several, apparently unrelated, hosts and environmental sources. The closest homologue (60 % amino acid identity) was found in certain C. jejuni and C. coli strains in a similar genomic location, but an association with STR resistance was not detected. Based on the findings presented here, we hypothesize that Campylobacter ant-like gene A has originated from a common ancestral proto-resistance element in Campylobacter spp., possibly encoding a protein with a different function. In conclusion, whole genome sequencing allowed us to fill in a knowledge gap concerning STR resistance in C. coli by revealing a novel STR resistance gene possibly inherent to Campylobacter.

  4. A hybrid neural network system for prediction and recognition of promoter regions in human genome.

    Science.gov (United States)

    Chen, Chuan-Bo; Li, Tao

    2005-05-01

    This paper proposes a high specificity and sensitivity algorithm called PromPredictor for recognizing promoter regions in the human genome. PromPredictor extracts compositional features and CpG island information from genomic sequence, feeds these features as input to a hybrid neural network system (HNN), and then applies the HNN for prediction. It combines a novel promoter recognition model, coding theory, feature selection and dimensionality reduction with a machine learning algorithm. Evaluation on human chromosome 22 yielded approximately 66% sensitivity and approximately 48% specificity. Comparison with two other systems revealed that our method had superior sensitivity and specificity in predicting promoter regions. PromPredictor is written in MATLAB and requires MATLAB to run. PromPredictor is freely available at http://www.whtelecom.com/Prompredictor.htm.
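
    The abstract above lists CpG island information among the input features. One widely used way to summarize it for a sequence window is the GC fraction together with the observed/expected CpG ratio, computed as (number of CpG × length) / (number of C × number of G). The sketch below is in Python rather than MATLAB and is not PromPredictor's code; the function name is hypothetical, and whether PromPredictor uses these exact features is an assumption.

```python
def cpg_features(sequence):
    """Return (GC fraction, observed/expected CpG ratio) for a sequence.

    The ratio is (count of "CG" * length) / (count of C * count of G).
    It is reported as 0.0 when the sequence has no C or no G.
    """
    sequence = sequence.upper()
    length = len(sequence)
    if length == 0:
        return 0.0, 0.0
    c_count = sequence.count("C")
    g_count = sequence.count("G")
    # "CG" cannot overlap with itself, so str.count finds every dinucleotide.
    cpg_count = sequence.count("CG")
    gc_fraction = (c_count + g_count) / length
    if c_count == 0 or g_count == 0:
        return gc_fraction, 0.0
    return gc_fraction, cpg_count * length / (c_count * g_count)
```

    Such a pair of numbers could then be concatenated with other compositional features to form the input vector of a classifier.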

  5. Genome-wide methylation analysis identified sexually dimorphic methylated regions in hybrid tilapia

    Science.gov (United States)

    Wan, Zi Yi; Xia, Jun Hong; Lin, Grace; Wang, Le; Lin, Valerie C. L.; Yue, Gen Hua

    2016-01-01

    Sexual dimorphism is an interesting biological phenomenon. Previous studies showed that DNA methylation might play a role in sexual dimorphism. However, the overall picture of the genome-wide methylation landscape in sexually dimorphic species remains unclear. We analyzed the DNA methylation landscape and transcriptome in hybrid tilapia (Oreochromis spp.) using whole genome bisulfite sequencing (WGBS) and RNA-sequencing (RNA-seq). We found 4,757 sexually dimorphic differentially methylated regions (DMRs), with significant clusters of DMRs located on chromosomal regions associated with sex determination. CpG methylation in promoter regions was negatively correlated with the gene expression level. The MAPK/ERK pathway was upregulated in male tilapia. We also inferred active cis-regulatory regions (ACRs) in skeletal muscle tissues from WGBS datasets, revealing sexually dimorphic cis-regulatory regions. These results suggest that DNA methylation contributes to sex-specific phenotypes, and they serve as a resource for further investigation of the functions of these regions and their contributions to sexual dimorphism. PMID:27782217

  6. Comparative Genomics of Campylobacter iguaniorum to Unravel Genetic Regions Associated with Reptilian Hosts.

    Science.gov (United States)

    Gilbert, Maarten J; Miller, William G; Yee, Emma; Kik, Marja; Zomer, Aldert L; Wagenaar, Jaap A; Duim, Birgitta

    2016-10-05

    Campylobacter iguaniorum is most closely related to the species C. fetus, C. hyointestinalis, and C. lanienae. Reptiles, chelonians and lizards in particular, appear to be a primary reservoir of this Campylobacter species. Here we report the genome comparison of C. iguaniorum strain 1485E, isolated from a bearded dragon (Pogona vitticeps), and strain 2463D, isolated from a green iguana (Iguana iguana), with the genomes of closely related taxa, in particular with reptile-associated C. fetus subsp. testudinum. In contrast to C. fetus, C. iguaniorum is lacking an S-layer encoding region. Furthermore, a defined lipooligosaccharide biosynthesis locus, encoding multiple glycosyltransferases and bounded by waa genes, is absent from C. iguaniorum. Instead, multiple predicted glycosylation regions were identified in C. iguaniorum. One of these regions is > 50 kb with deviant G + C content, suggesting acquisition via lateral transfer. These similar, but non-homologous glycosylation regions were located at the same position on the genome in both strains. Multiple genes encoding respiratory enzymes not identified to date within the C. fetus clade were present. C. iguaniorum shared highest homology with C. hyointestinalis and C. fetus. As in reptile-associated C. fetus subsp. testudinum, a putative tricarballylate catabolism locus was identified. However, despite colonizing a shared host, no recent recombination between both taxa was detected. This genomic study provides a better understanding of host adaptation, virulence, phylogeny, and evolution of C. iguaniorum and related Campylobacter taxa. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  7. In Silico Prediction of Scaffold/Matrix Attachment Regions in Large Genomic Sequences

    OpenAIRE

    Frisch, Matthias; Frech, Kornelie; Klingenhoff, Andreas; Cartharius, Kerstin; Liebich, Ines; Werner, Thomas

    2002-01-01

    Scaffold/matrix attachment regions (S/MARs) are essential regulatory DNA elements of eukaryotic cells. They are major determinants of locus control of gene expression and can shield gene expression from position effects. Experimental detection of S/MARs requires substantial effort and is not suitable for large-scale screening of genomic sequences. In silico prediction of S/MARs can provide a crucial first selection step to reduce the number of candidates. We used experimentally defined S/MAR ...

  8. RRE: a tool for the extraction of non-coding regions surrounding annotated genes from genomic datasets.

    Science.gov (United States)

    Lazzarato, F; Franceschinis, G; Botta, M; Cordero, F; Calogero, R A

    2004-11-01

    RRE allows the extraction of non-coding regions surrounding a coding sequence [i.e. gene upstream region, 5'-untranslated region (5'-UTR), introns, 3'-UTR, downstream region] from annotated genomic datasets available at NCBI. RRE parser and web-based interface are accessible at http://www.bioinformatica.unito.it/bioinformatics/rre/rre.html
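
    The kind of extraction described above depends on strand-aware coordinate arithmetic: the upstream region lies before the gene start on the plus strand and after the gene end on the minus strand. The sketch below illustrates only that step and is not RRE's parser; the function name, the 1-based inclusive coordinate convention and the parameters are assumptions.

```python
def flanking_regions(start, end, strand, flank, chrom_length):
    """Upstream and downstream intervals around a gene.

    Coordinates are 1-based and inclusive, with start <= end on the
    reference. Intervals are clipped to the chromosome; a side with no
    room left is reported as None.
    """
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    if flank <= 0:
        raise ValueError("flank must be positive")
    if not 1 <= start <= end <= chrom_length:
        raise ValueError("gene coordinates must lie within the chromosome")

    left = (max(1, start - flank), start - 1) if start > 1 else None
    right = (end + 1, min(chrom_length, end + flank)) if end < chrom_length else None

    # On the minus strand, transcription runs right to left, so the
    # reference-right flank is the gene's upstream region.
    if strand == "+":
        return {"upstream": left, "downstream": right}
    return {"upstream": right, "downstream": left}
```

    For a minus-strand gene the extracted sequence would also need to be reverse-complemented before use, which is outside the scope of this sketch.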

  9. Evidence for widespread degradation of gene control regions in hominid genomes.

    Directory of Open Access Journals (Sweden)

    Peter D Keightley

    2005-02-01

    Full Text Available Although sequences containing regulatory elements located close to protein-coding genes are often only weakly conserved during evolution, comparisons of rodent genomes have implied that these sequences are subject to some selective constraints. Evolutionary conservation is particularly apparent upstream of coding sequences and in first introns, regions that are enriched for regulatory elements. By comparing the human and chimpanzee genomes, we show here that there is almost no evidence for conservation in these regions in hominids. Furthermore, we show that gene expression is diverging more rapidly in hominids than in murids per unit of neutral sequence divergence. By combining data on polymorphism levels in human noncoding DNA and the corresponding human-chimpanzee divergence, we show that the proportion of adaptive substitutions in these regions in hominids is very low. It therefore seems likely that the lack of conservation and increased rate of gene expression divergence are caused by a reduction in the effectiveness of natural selection against deleterious mutations because of the low effective population sizes of hominids. This has resulted in the accumulation of a large number of deleterious mutations in sequences containing gene control elements and hence a widespread degradation of the genome during the evolution of humans and chimpanzees.

  10. DNA Replication Control Is Linked to Genomic Positioning of Control Regions in Escherichia coli

    Science.gov (United States)

    Frimodt-Møller, Jakob; Charbon, Godefroid; Krogfelt, Karen A.; Løbner-Olesen, Anders

    2016-01-01

    Chromosome replication in Escherichia coli is in part controlled by three non-coding genomic sequences, DARS1, DARS2, and datA, that modulate the activity of the initiator protein DnaA. The relative distance from oriC to the non-coding regions is conserved among E. coli species, despite large variations in genome size. Here we use a combination of (i) site-directed translocation of each region to new positions on the bacterial chromosome and (ii) random transposon-mediated translocation followed by culture evolution, to show genetic evidence for the importance of position. Here we provide evidence that the genomic locations of these regulatory sequences are important for cell cycle control and bacterial fitness. In addition, our work shows that the functionally redundant DARS1 and DARS2 regions play different roles in replication control. DARS1 is mainly involved in maintaining the origin concentration, whereas DARS2 is also involved in maintaining single cell synchrony. PMID:27589233

  11. DNA Replication Control Is Linked to Genomic Positioning of Control Regions in Escherichia coli.

    Directory of Open Access Journals (Sweden)

    Jakob Frimodt-Møller

    2016-09-01

    Full Text Available Chromosome replication in Escherichia coli is in part controlled by three non-coding genomic sequences, DARS1, DARS2, and datA, that modulate the activity of the initiator protein DnaA. The relative distance from oriC to the non-coding regions is conserved among E. coli species, despite large variations in genome size. Here we use a combination of (i) site-directed translocation of each region to new positions on the bacterial chromosome and (ii) random transposon-mediated translocation followed by culture evolution, to show genetic evidence for the importance of position. Here we provide evidence that the genomic locations of these regulatory sequences are important for cell cycle control and bacterial fitness. In addition, our work shows that the functionally redundant DARS1 and DARS2 regions play different roles in replication control. DARS1 is mainly involved in maintaining the origin concentration, whereas DARS2 is also involved in maintaining single cell synchrony.

  12. Novel candidate genes and regions for childhood apraxia of speech identified by array comparative genomic hybridization.

    Science.gov (United States)

    Laffin, Jennifer J S; Raca, Gordana; Jackson, Craig A; Strand, Edythe A; Jakielski, Kathy J; Shriberg, Lawrence D

    2012-11-01

    The goal of this study was to identify new candidate genes and genomic copy-number variations associated with a rare, severe, and persistent speech disorder termed childhood apraxia of speech. Childhood apraxia of speech is the speech disorder segregating with a mutation in FOXP2 in a multigenerational London pedigree widely studied for its role in the development of speech-language in humans. A total of 24 participants who were suspected to have childhood apraxia of speech were assessed using a comprehensive protocol that samples speech in challenging contexts. All participants met clinical-research criteria for childhood apraxia of speech. Array comparative genomic hybridization analyses were completed using a customized 385K Nimblegen array (Roche Nimblegen, Madison, WI) with increased coverage of genes and regions previously associated with childhood apraxia of speech. A total of 16 copy-number variations with potential consequences for speech-language development were detected in 12 or half of the 24 participants. The copy-number variations occurred on 10 chromosomes, 3 of which had two to four candidate regions. Several participants were identified with copy-number variations in two to three regions. In addition, one participant had a heterozygous FOXP2 mutation and a copy-number variation on chromosome 2, and one participant had a 16p11.2 microdeletion and copy-number variations on chromosomes 13 and 14. Findings support the likelihood of heterogeneous genomic pathways associated with childhood apraxia of speech.

  13. Large Homogeneous Genome Regions (Isochores) in Soybean (Glycine max (L.) Merr.)

    Directory of Open Access Journals (Sweden)

    Jenna Lynn Woody

    2012-06-01

    Full Text Available The landscape of plant genomes, while slowly being characterized and defined, is still composed primarily of regions of undefined function. Many eukaryotic genomes contain isochore regions, mosaics of homogeneous GC content that can abruptly change from one neighboring isochore to the next. Isochores are broken into families which are characterized by their GC levels. We identified 4,339 compositionally distinct domains and 331 of these were identified as Long Homogeneous Genome Regions (LHGRs). We assigned these to four families based on finite mixture models of GC content. We then characterized each family with respect to exon length, gene content, and transposable elements. The LHGR pattern of soybeans is unique in that while the majority of the genes within LHGRs are found within a single LHGR family with a narrow GC-range (Family B), that family is not the highest in GC content as seen in vertebrates and invertebrates. Instead Family B has a mean GC content of 35%. The range of GC content for all LHGRs is 16-59% GC, which is a larger range than what is typical of vertebrates. This is the first study in which LHGRs have been identified in soybeans and the functions of the genes within the LHGRs have been analyzed.
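
    To make the notion of compositional domains concrete, the sketch below computes GC content in fixed, non-overlapping windows along a sequence. It is only an illustration: the function names `gc_content` and `windowed_gc` are hypothetical, and the window-based scan is an assumption, not the segmentation or the finite mixture modelling used in the study.

```python
from typing import List, Optional, Tuple


def gc_content(seq: str) -> float:
    """Fraction of G and C among the unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        return 0.0  # no unambiguous bases to score
    return (seq.count("G") + seq.count("C")) / counted


def windowed_gc(seq: str, window: int, step: Optional[int] = None) -> List[Tuple[int, float]]:
    """Return (start, GC fraction) for each full window along the sequence."""
    if window <= 0:
        raise ValueError("window must be positive")
    step = step or window  # default: non-overlapping windows
    return [
        (start, gc_content(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]
```

    Adjacent windows with similar GC fractions could then be grouped into candidate homogeneous regions; how that grouping is done is left open here.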

  14. Novel transcripts discovered by mining genomic DNA from defined regions of bovine chromosome 6

    Directory of Open Access Journals (Sweden)

    Eberlein Annett

    2009-04-01

    Full Text Available Abstract Background Linkage analyses strongly suggest a number of QTL for production, health and conformation traits in the middle part of bovine chromosome 6 (BTA6). The identification of the molecular background underlying the genetic variation at the QTL and subsequent functional studies require a well-annotated gene sequence map of the critical QTL intervals. To complete the sequence map of the defined subchromosomal regions on BTA6 poorly covered with comparative gene information, we focused on targeted isolation of transcribed sequences from bovine bacterial artificial chromosome (BAC) clones mapped to the QTL intervals. Results Using the method of exon trapping, 92 unique exon trapping sequences (ETS) were discovered in a chromosomal region of poor gene coverage. Sequence identity to the current NCBI sequence assembly for BTA6 was detected for 91% of unique ETS. Comparative sequence similarity search revealed that 11% of the isolated ETS displayed high similarity to genomic sequences located on the syntenic chromosomes of the human and mouse reference genome assemblies. Nearly a third of the ETS identified similar equivalent sequences in genomic sequence scaffolds from the alternative Celera-based sequence assembly of the human genome. Screening gene, EST, and protein databases detected 17% of ETS with identity to known transcribed sequences. Expression analysis of a subset of the ETS showed that most ETS (84%) displayed a distinctive expression pattern in a multi-tissue panel of a lactating cow, verifying their existence in the bovine transcriptome. Conclusion The results of our study demonstrate that the exon trapping method based on region-specific BAC clones is very useful for targeted screening for novel transcripts located within a defined chromosomal region being deficiently endowed with annotated gene information. The majority of identified ETS represents unknown noncoding sequences in intergenic regions on BTA6 displaying a

  15. Homologous recombination-mediated cloning and manipulation of genomic DNA regions using Gateway and recombineering systems.

    Science.gov (United States)

    Rozwadowski, Kevin; Yang, Wen; Kagale, Sateesh

    2008-11-17

    Employing genomic DNA clones to characterise gene attributes has several advantages over the use of cDNA clones, including the presence of native transcription and translation regulatory sequences as well as a representation of the complete repertoire of potential splice variants encoded by the gene. However, working with genomic DNA clones has traditionally been tedious due to their large size relative to cDNA clones and the presence, absence or position of particular restriction enzyme sites that may complicate conventional in vitro cloning procedures. To enable efficient cloning and manipulation of genomic DNA fragments for the purposes of gene expression and reporter-gene studies we have combined aspects of the Gateway system and a bacteriophage-based homologous recombination (i.e. recombineering) system. To apply the method for characterising plant genes we developed novel Gateway and plant transformation vectors that are of small size and incorporate selectable markers which enable efficient identification of recombinant clones. We demonstrate that the genomic coding region of a gene can be directly cloned into a Gateway Entry vector by recombineering enabling its subsequent transfer to Gateway Expression vectors. We also demonstrate how the coding and regulatory regions of a gene can be directly cloned into a plant transformation vector by recombineering. This construct was then rapidly converted into a novel Gateway Expression vector incorporating cognate 5' and 3' regulatory regions by using recombineering to replace the intervening coding region with the Gateway Destination cassette. Such expression vectors can be applied to characterise gene regulatory regions through development of reporter-gene fusions, using the Gateway Entry clones of GUS and GFP described here, or for ectopic expression of a coding region cloned into a Gateway Entry vector. We exemplify the utility of this approach with the Arabidopsis PAP85 gene and demonstrate that the expression

  16. Tandem repeat regions within the Burkholderia pseudomallei genome and their application for high resolution genotyping

    Directory of Open Access Journals (Sweden)

    Harvey Steven P

    2007-03-01

    Full Text Available Abstract Background The facultative, intracellular bacterium Burkholderia pseudomallei is the causative agent of melioidosis, a serious infectious disease of humans and animals. We identified and categorized tandem repeat arrays and their distribution throughout the genome of B. pseudomallei strain K96243 in order to develop a genetic typing method for B. pseudomallei. We then screened 104 of the potentially polymorphic loci across a diverse panel of 31 isolates including B. pseudomallei, B. mallei and B. thailandensis in order to identify loci with varying degrees of polymorphism. A subset of these tandem repeat arrays was subsequently developed into a multiple-locus VNTR analysis to examine 66 B. pseudomallei and 21 B. mallei isolates from around the world, as well as 95 lineages from a serial transfer experiment encompassing ~18,000 generations. Results B. pseudomallei contains a preponderance of tandem repeat loci throughout its genome, many of which are duplicated elsewhere in the genome. The majority of these loci are composed of repeat motif lengths of 6 to 9 bp with 4 to 10 repeat units and are predominantly located in intergenic regions of the genome. Across geographically diverse B. pseudomallei and B. mallei isolates, the 32 VNTR loci displayed between 7 and 28 alleles, with Nei's diversity values ranging from 0.47 to 0.94. Mutation rates for these loci are comparable (>10^-5 per locus per generation) to that of the most diverse tandemly repeated regions found in other less diverse bacteria. Conclusion The frequency, location and duplicate nature of tandemly repeated regions within the B. pseudomallei genome indicate that these tandem repeat regions may play a role in generating and maintaining adaptive genomic variation. Multiple-locus VNTR analysis revealed extensive diversity within the global isolate set containing B. pseudomallei and B. mallei, and it detected genotypic differences within clonal lineages of both species that were
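
    The abstract reports Nei's diversity values per VNTR locus. As a minimal sketch, assuming the common definition H = 1 - sum(p_i^2) over allele frequencies p_i (with an optional n/(n-1) small-sample correction), the index can be computed as below. The function name `nei_diversity` and the example allele calls are hypothetical; the abstract does not state which variant of the estimator the authors used.

```python
from collections import Counter
from typing import Hashable, Sequence


def nei_diversity(alleles: Sequence[Hashable], unbiased: bool = False) -> float:
    """Nei's gene diversity for one locus, given one allele call per isolate."""
    n = len(alleles)
    if n == 0:
        raise ValueError("at least one allele call is required")
    counts = Counter(alleles)
    # 1 minus the sum of squared allele frequencies
    h = 1.0 - sum((count / n) ** 2 for count in counts.values())
    if unbiased:
        if n < 2:
            raise ValueError("the unbiased estimate needs at least two calls")
        h *= n / (n - 1)  # small-sample correction
    return h
```

    For example, a locus where every isolate carries the same repeat count scores 0, and the value approaches 1 as more alleles occur at even frequencies.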

  17. Orion: Detecting regions of the human non-coding genome that are intolerant to variation using population genetics.

    Science.gov (United States)

    Gussow, Ayal B; Copeland, Brett R; Dhindsa, Ryan S; Wang, Quanli; Petrovski, Slavé; Majoros, William H; Allen, Andrew S; Goldstein, David B

    2017-01-01

    There is broad agreement that genetic mutations occurring outside of the protein-coding regions play a key role in human disease. Despite this consensus, we are not yet capable of discerning which portions of non-coding sequence are important in the context of human disease. Here, we present Orion, an approach that detects regions of the non-coding genome that are depleted of variation, suggesting that the regions are intolerant of mutations and subject to purifying selection in the human lineage. We show that Orion is highly correlated with known intolerant regions as well as regions that harbor putatively pathogenic variation. This approach provides a mechanism to identify pathogenic variation in the human non-coding genome and will have immediate utility in the diagnostic interpretation of patient genomes and in large case control studies using whole-genome sequences.

  18. Natural selection among Eurasians at genomic regions associated with HIV-1 control

    Directory of Open Access Journals (Sweden)

    Allison David B

    2011-06-01

    Full Text Available Abstract Background HIV susceptibility and pathogenicity exhibit both interindividual and intergroup variability. The etiology of intergroup variability is still poorly understood, and could be partly linked to genetic differences among racial/ethnic groups. These genetic differences may be traceable to different regimes of natural selection in the 60,000 years since the human radiation out of Africa. Here, we examine population differentiation and haplotype patterns at several loci identified through genome-wide association studies on HIV-1 control, as determined by viral-load setpoint, in European and African-American populations. We use genome-wide data from the Human Genome Diversity Project, consisting of 53 world-wide populations, to compare measures of FST and relative extended haplotype homozygosity (REHH) at these candidate loci to the rest of the respective chromosome. Results We find that the Europe-Middle East and Europe-South Asia pairwise FST in the most strongly associated region are elevated compared to most pairwise comparisons with the sub-Saharan African group, which exhibit very low FST. We also find genetic signatures of recent positive selection (higher REHH) at these associated regions among all groups except for sub-Saharan Africans and Native Americans. This pattern is consistent with one in which genetic differentiation, possibly due to diversifying/positive selection, occurred at these loci among Eurasians. Conclusions These findings are concordant with those from earlier studies suggesting recent evolutionary change at immunity-related genomic regions among Europeans, and shed light on the potential genetic and evolutionary origin of population differences in HIV-1 control.

  19. Remarkably Divergent Regions Punctuate the Genome Assembly of the Caenorhabditis elegans Hawaiian Strain CB4856.

    Science.gov (United States)

    Thompson, Owen A; Snoek, L Basten; Nijveen, Harm; Sterken, Mark G; Volkers, Rita J M; Brenchley, Rachel; Van't Hof, Arjen; Bevers, Roel P J; Cossins, Andrew R; Yanai, Itai; Hajnal, Alex; Schmid, Tobias; Perkins, Jaryn D; Spencer, David; Kruglyak, Leonid; Andersen, Erik C; Moerman, Donald G; Hillier, LaDeana W; Kammenga, Jan E; Waterston, Robert H

    2015-07-01

    The Hawaiian strain (CB4856) of Caenorhabditis elegans is one of the most divergent from the canonical laboratory strain N2 and has been widely used in developmental, population, and evolutionary studies. To enhance the utility of the strain, we have generated a draft sequence of the CB4856 genome, exploiting a variety of resources and strategies. When compared against the N2 reference, the CB4856 genome has 327,050 single nucleotide variants (SNVs) and 79,529 insertion-deletion events that result in a total of 3.3 Mb of N2 sequence missing from CB4856 and 1.4 Mb of sequence present in CB4856 but not present in N2. As previously reported, the density of SNVs varies along the chromosomes, with the arms of chromosomes showing greater average variation than the centers. In addition, we find 61 regions totaling 2.8 Mb, distributed across all six chromosomes, which have a greatly elevated SNV density, ranging from 2 to 16% SNVs. A survey of other wild isolates shows that the two alternative haplotypes for each region are widely distributed, suggesting they have been maintained by balancing selection over long evolutionary times. These divergent regions contain an abundance of genes from large rapidly evolving families encoding F-box, MATH, BATH, seven-transmembrane G-coupled receptors, and nuclear hormone receptors, suggesting that they provide selective advantages in natural environments. The draft sequence makes available a comprehensive catalog of sequence differences between the CB4856 and N2 strains that will facilitate the molecular dissection of their phenotypic differences. Our work also emphasizes the importance of going beyond simple alignment of reads to a reference genome when assessing differences between genomes. Copyright © 2015 by the Genetics Society of America.

  20. Variability among the most rapidly evolving plastid genomic regions is lineage-specific: implications of pairwise genome comparisons in Pyrus (Rosaceae) and other angiosperms for marker choice.

    Directory of Open Access Journals (Sweden)

    Nadja Korotkova

    Full Text Available Plastid genomes exhibit different levels of variability in their sequences, depending on the respective kinds of genomic regions. Genes are usually more conserved while noncoding introns and spacers evolve at a faster pace. While a set of about thirty maximum variable noncoding genomic regions has been suggested to provide universally promising phylogenetic markers throughout angiosperms, applications often require several regions to be sequenced for many individuals. Our project aims to illuminate evolutionary relationships and species-limits in the genus Pyrus (Rosaceae), a typical case with very low genetic distances between taxa. In this study, we have sequenced the plastid genome of Pyrus spinosa and aligned it to the already available P. pyrifolia sequence. The overall p-distance of the two Pyrus genomes was 0.00145. The intergenic spacers between ndhC-trnV, trnR-atpA, ndhF-rpl32, psbM-trnD, and trnQ-rps16 were the most variable regions, also comprising the highest total numbers of substitutions, indels and inversions (potentially informative characters). Our comparative analysis of further plastid genome pairs with similar low p-distances from Oenothera (representing another rosid), Olea (asterids) and Cymbidium (monocots) showed in each case a different ranking of genomic regions in terms of variability and potentially informative characters. Only two intergenic spacers (ndhF-rpl32 and trnK-rps16) were consistently found among the 30 top-ranked regions. We have mapped the occurrence of substitutions and microstructural mutations in the four genome pairs. High AT content in specific sequence elements seems to foster frequent mutations. We conclude that the variability among the fastest evolving plastid genomic regions is lineage-specific and thus cannot be precisely predicted across angiosperms. The often lineage-specific occurrence of stem-loop elements in the sequences of introns and spacers also governs lineage-specific mutations. Sequencing
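
    The p-distance quoted above is the proportion of aligned sites at which two sequences differ. The sketch below shows one simple way to compute it, assuming the two sequences are already aligned to equal length and that sites containing a gap or ambiguity symbol in either sequence are skipped. The function name `p_distance` and the gap-handling rule are illustrative assumptions; the abstract does not specify how indels were treated.

```python
def p_distance(seq_a: str, seq_b: str, skip_chars: str = "-N?") -> float:
    """Proportion of differing sites among comparable aligned positions."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in skip_chars or b in skip_chars:
            continue  # ignore gaps and ambiguous sites
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites in the alignment")
    return differences / compared
```

    Under this definition, a value of 0.00145 corresponds to roughly 1.45 substitutions per 1,000 comparable sites.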

  1. Genomic regions under selection in crop-wild hybrids of lettuce: implications for crop breeding and environmental risk assessment

    NARCIS (Netherlands)

    Hartman, Y.

    2012-01-01

    The results of this thesis show that the probability of introgression of a putative transgene to wild relatives indeed depends strongly on the insertion location of the transgene. The study of genomic selection patterns can identify crop genomic regions under negative selection in multiple

  2. Genomic region operation kit for flexible processing of deep sequencing data.

    Science.gov (United States)

    Ovaska, Kristian; Lyly, Lauri; Sahu, Biswajyoti; Jänne, Olli A; Hautaniemi, Sampsa

    2013-01-01

    Computational analysis of data produced in deep sequencing (DS) experiments is challenging due to large data volumes and requirements for flexible analysis approaches. Here, we present a mathematical formalism based on set algebra for frequently performed operations in DS data analysis to facilitate translation of biomedical research questions to language amenable for computational analysis. With the help of this formalism, we implemented the Genomic Region Operation Kit (GROK), which supports various DS-related operations such as preprocessing, filtering, file conversion, and sample comparison. GROK provides high-level interfaces for R, Python, Lua, and command line, as well as an extension C++ API. It supports major genomic file formats and allows storing custom genomic regions in efficient data structures such as red-black trees and SQL databases. To demonstrate the utility of GROK, we have characterized the roles of two major transcription factors (TFs) in prostate cancer using data from 10 DS experiments. GROK is freely available with a user guide from http://csbi.ltdk.helsinki.fi/grok/.
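
    The idea of treating genomic regions as sets can be illustrated with a small, self-contained sketch. It does not use or reproduce the GROK API: the `Region` tuple layout (chromosome, start, end, with half-open coordinates) and the functions `merge`, `union`, and `intersection` are hypothetical names chosen for this example only.

```python
from typing import Iterable, List, Tuple

# (chromosome, start, end) with half-open coordinates [start, end)
Region = Tuple[str, int, int]


def merge(regions: Iterable[Region]) -> List[Region]:
    """Collapse overlapping or touching regions into a sorted, disjoint list."""
    merged: List[Region] = []
    for chrom, start, end in sorted(regions):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            prev_chrom, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            merged.append((chrom, start, end))
    return merged


def union(a: Iterable[Region], b: Iterable[Region]) -> List[Region]:
    """Set union: every position covered by either input."""
    return merge(list(a) + list(b))


def intersection(a: Iterable[Region], b: Iterable[Region]) -> List[Region]:
    """Set intersection: positions covered by both inputs (quadratic scan)."""
    merged_a, merged_b = merge(a), merge(b)
    result: List[Region] = []
    for chrom_a, start_a, end_a in merged_a:
        for chrom_b, start_b, end_b in merged_b:
            if chrom_a != chrom_b:
                continue
            start, end = max(start_a, start_b), min(end_a, end_b)
            if start < end:
                result.append((chrom_a, start, end))
    return sorted(result)
```

    A question such as "which peaks are shared by two samples" then becomes `intersection(sample_1, sample_2)`. The pairwise scan is adequate for small inputs; tree-based or database-backed structures like those the abstract mentions would be the choice at scale.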

  3. Sardinians genetic background explained by runs of homozygosity and genomic regions under positive selection.

    Science.gov (United States)

    Di Gaetano, Cornelia; Fiorito, Giovanni; Ortu, Maria Francesca; Rosa, Fabio; Guarrera, Simonetta; Pardini, Barbara; Cusi, Daniele; Frau, Francesca; Barlassina, Cristina; Troffa, Chiara; Argiolas, Giuseppe; Zaninello, Roberta; Fresu, Giovanni; Glorioso, Nicola; Piazza, Alberto; Matullo, Giuseppe

    2014-01-01

    The peculiar position of Sardinia in the Mediterranean sea has rendered its population an interesting biogeographical isolate. The aim of this study was to investigate the genetic population structure, as well as to estimate Runs of Homozygosity and regions under positive selection, using about 1.2 million single nucleotide polymorphisms genotyped in 1077 Sardinian individuals. Using four different methods--fixation index, inflation factor, principal component analysis and ancestry estimation--we were able to highlight, as expected for a genetic isolate, the high internal homogeneity of the island. Sardinians showed a higher percentage of genome covered by RoHs > 0.5 Mb (F(RoH%0.5)) when compared to peninsular Italians, with the only exception of the area surrounding Alghero. We furthermore identified 9 genomic regions showing signs of positive selection, and we recaptured many previously inferred signals. Other regions harbor novel candidate genes for positive selection, like TMEM252, or regions containing long non-coding RNA. With the present study we confirmed the high genetic homogeneity of Sardinia that may be explained by the shared ancestry combined with the action of evolutionary forces.
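
    The genome fraction covered by runs of homozygosity above a length threshold can be sketched as a simple ratio. The function below assumes the runs have already been called and are supplied as lengths in base pairs; the name `f_roh`, the default 0.5 Mb threshold, and the genome length in the usage note are illustrative assumptions rather than the study's pipeline or parameters.

```python
from typing import Iterable


def f_roh(run_lengths_bp: Iterable[int], genome_length_bp: int,
          min_length_bp: int = 500_000) -> float:
    """Fraction of the genome covered by runs longer than the threshold."""
    if genome_length_bp <= 0:
        raise ValueError("genome length must be positive")
    covered = sum(length for length in run_lengths_bp if length > min_length_bp)
    return covered / genome_length_bp
```

    For instance, runs of 0.6 Mb and 1.0 Mb in a hypothetical 16 Mb genome give a covered fraction of 0.1, while a 0.4 Mb run falls below the threshold and is not counted.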

  4. Comparative genomic analysis of duplicated homoeologous regions involved in the resistance of Brassica napus to stem canker

    Directory of Open Access Journals (Sweden)

    Berline Fopa Fomeju

    2015-09-01

    Full Text Available All crop species are current or ancient polyploids. Following whole genome duplication, structural and functional modifications result in differential gene content or regulation in the duplicated regions, which can play a fundamental role in the diversification of genes underlying complex traits. We have investigated this issue in Brassica napus, a species with a highly duplicated genome, with the aim of studying the structural and functional organization of duplicated regions involved in quantitative resistance to stem canker, a disease caused by the fungal pathogen Leptosphaeria maculans. Genome-wide association analysis on two oilseed rape panels confirmed that duplicated regions of ancestral blocks E, J, R, U and W were involved in resistance to stem canker. The structural analysis of the duplicated genomic regions showed a higher gene density on the A genome than on the C genome and a better collinearity between homoeologous regions than paralogous regions, as overall in the whole B. napus genome. The three ancestral sub-genomes were involved in the resistance to stem canker and the fractionation profile of the duplicated regions corresponded to what was expected from results on the B. napus progenitors. About 60% of the genes identified in these duplicated regions were single-copy genes while less than 5% were retained in all the duplicated copies of a given ancestral block. Genes retained in several copies were mainly involved in response to stress, signaling or transcription regulation. Genes with resistance-associated markers were mainly retained in more than two copies. These results suggested that some genes underlying quantitative resistance to stem canker might be duplicated genes. Genes with a hydrolase activity that were retained in one copy or R-like genes might also account for resistance in some regions. Further analyses need to be conducted to indicate to what extent duplicated genes contribute to the expression of the

  5. Drosophila duplication hotspots are associated with late-replicating regions of the genome.

    Directory of Open Access Journals (Sweden)

    Margarida Cardoso-Moreira

    2011-11-01

    Full Text Available Duplications play a significant role in both extremes of the phenotypic spectrum of newly arising mutations: they can have severe deleterious effects (e.g. duplications underlie a variety of diseases) but can also be highly advantageous. The phenotypic potential of newly arisen duplications has stimulated wide interest in both the mutational and selective processes shaping these variants in the genome. Here we take advantage of the Drosophila simulans-Drosophila melanogaster genetic system to further our understanding of both processes. Regarding mutational processes, the study of two closely related species allows investigation of the potential existence of shared duplication hotspots, and the similarities and differences between the two genomes can be used to dissect their underlying causes. Regarding selection, the difference in the effective population size between the two species can be leveraged to ask questions about the strength of selection acting on different classes of duplications. In this study, we conducted a survey of duplication polymorphisms in 14 different lines of D. simulans using tiling microarrays and combined it with an analogous survey for the D. melanogaster genome. By integrating the two datasets, we identified duplication hotspots conserved between the two species. However, unlike the duplication hotspots identified in mammalian genomes, Drosophila duplication hotspots are not associated with sequences of high sequence identity capable of mediating non-allelic homologous recombination. Instead, Drosophila duplication hotspots are associated with late-replicating regions of the genome, suggesting a link between DNA replication and duplication rates. We also found evidence supporting a higher effectiveness of selection on duplications in D. simulans than in D. melanogaster. This is also true for duplications segregating at high frequency, where we find evidence in D. simulans that a sizeable fraction of these mutations is

  6. RNA interactions in the 5' region of the HIV-1 genome

    DEFF Research Database (Denmark)

    Damgaard, Christian Kroun; Andersen, Ebbe Sloth; Knudsen, Bjarne

    2004-01-01

    The untranslated leader of the dimeric HIV-1 RNA genome is folded into a complex structure that plays multiple and essential roles in the viral replication cycle. Here, we have investigated secondary and tertiary structural elements within the 5' 744 nucleotides of the HIV-1 genome using a combination of bioinformatics, enzymatic probing, native gel electrophoresis, and UV-crosslinking experiments. We used a recently developed RNA folding algorithm (Pfold) to predict the common secondary structure of an alignment of 20 divergent HIV-1 sequences. Combining this analysis with biochemical data, we present a secondary structure model for the entire 744 nucleotide fragment, which incorporates previously recognized and novel structural elements. In particular, our data provided strong evidence for a long-distance interaction between the region encompassing the AUG Gag initiation codon and an upstream...

  7. Origins of the Xylella fastidiosa prophage-like regions and their impact in genome differentiation.

    Directory of Open Access Journals (Sweden)

    Alessandro de Mello Varani

    Full Text Available Xylella fastidiosa is a Gram-negative plant pathogen causing many economically important diseases, and analyses of completely sequenced X. fastidiosa genome strains allowed the identification of many prophage-like elements and possibly phage remnants, accounting for up to 15% of the genome composition. To better evaluate the recent evolution of the X. fastidiosa chromosome backbone among distinct pathovars, the number and location of prophage-like regions on two finished genomes (9a5c and Temecula1), and in two candidate molecules (Ann1 and Dixon), were assessed. Based on comparative best bidirectional hit analyses, the majority (51%) of the predicted genes in the X. fastidiosa prophage-like regions are related to structural phage genes belonging to the Siphoviridae family. Electron micrographs reveal the existence of putative viral particles with similar morphology to lambda phages in the bacterial cell in planta. Moreover, analysis of microarray data indicates that the 9a5c strain cultivated under stress conditions presents enhanced expression of phage anti-repressor genes, suggesting switches from lysogenic to lytic cycle of phages under stress-induced situations. Furthermore, virulence-associated proteins and toxins are found within these prophage-like elements, thus suggesting an important role in host adaptation. Finally, clustering analyses of phage integrase genes based on multiple alignment patterns reveal they group in five lineages, all possessing a tyrosine recombinase catalytic domain, and phylogenetically close to other integrases found in phages that are genetic mosaics and able to perform generalized and specialized transduction. Integration sites and tRNA association are also evidenced. In summary, we present comparative and experimental evidence supporting the association and contribution of phage activity on the differentiation of Xylella genomes.

  8. Genomes

    National Research Council Canada - National Science Library

    Brown, T. A. (Terence A.)

    2002-01-01

    ... of genome expression and replication processes, and transcriptomics and proteomics. This text is richly illustrated with clear, easy-to-follow, full color diagrams, which are downloadable from the book's website...

  9. Heritability and Genome-Wide Linkage in US and Australian Twins Identify Novel Genomic Regions Controlling Chromogranin A

    Science.gov (United States)

    O’Connor, Daniel T.; Zhu, Gu; Rao, Fangwen; Taupenot, Laurent; Fung, Maple M.; Das, Madhusudan; Mahata, Sushil K.; Mahata, Manjula; Wang, Lei; Zhang, Kuixing; Greenwood, Tiffany A.; Shih, Pei-an Betty; Cockburn, Myles G.; Ziegler, Michael G.; Stridsberg, Mats; Martin, Nicholas G.; Whitfield, John B.

    2009-01-01

    Background Chromogranin A (CHGA) triggers catecholamine secretory granule biogenesis, and its catestatin fragment inhibits catecholamine release. We approached catestatin heritability using twin pairs, coupled with genome-wide linkage, in a series of twin and sibling pairs from 2 continents. Methods and Results Hypertensive patients had elevated CHGA coupled with reduction in catestatin, suggesting diminished conversion of precursor to catestatin. Heritability for catestatin in twins was 44% to 60%. Six hundred fifteen nuclear families yielded 870 sib pairs for linkage, with significant logarithm of odds peaks on chromosomes 4p, 4q, and 17q. Because acidification of catecholamine secretory vesicles determines CHGA trafficking and processing to catestatin, we genotyped at positional candidate ATP6N1, bracketed by peak linkage markers on chromosome 17q, encoding a subunit of vesicular H+-translocating ATPase. The minor allele diminished CHGA secretion and processing to catestatin. The ATP6N1 variant also influenced blood pressure in 1178 individuals with the most extreme blood pressure values in the population. In chromaffin cells, inhibition of H+-ATPase diverted CHGA from regulated to constitutive secretory pathways. Conclusions We established heritability of catestatin in twins from 2 continents. Linkage identified 3 regions contributing to catestatin, likely novel determinants of sympathochromaffin exocytosis. At 1 such positional candidate (ATP6N1), variation influenced CHGA secretion and processing to catestatin, confirming the mechanism of a novel trans-QTL for sympathochromaffin activity and blood pressure. PMID:18591442

  10. Robust associations of four new chromosome regions from genome-wide analyses of type 1 diabetes

    Science.gov (United States)

    Todd, John A; Walker, Neil M; Cooper, Jason D; Smyth, Deborah J; Downes, Kate; Plagnol, Vincent; Bailey, Rebecca; Nejentsev, Sergey; Field, Sarah F; Payne, Felicity; Lowe, Christopher E; Szeszko, Jeffrey S; Hafler, Jason P; Zeitels, Lauren; Yang, Jennie H M; Vella, Adrian; Nutland, Sarah; Stevens, Helen E; Schuilenburg, Helen; Coleman, Gillian; Maisuria, Meeta; Meadows, William; Smink, Luc J; Healy, Barry; Burren, Oliver S; Lam, Alex A C; Ovington, Nigel R; Allen, James; Adlem, Ellen; Leung, Hin-Tak; Wallace, Chris; Howson, Joanna M M; Guja, Cristian; Ionescu-Tirgoviste, Constantin; Simmonds, Matthew J; Heward, Joanne M; Gough, Stephen CL; Dunger, David B; Wicker, Linda S; Clayton, David G

    2007-01-01

    The Wellcome Trust Case Control Consortium (WTCCC) primary genome-wide association (GWA) scan1 on seven diseases, including the multifactorial, autoimmune disease, type 1 diabetes (T1D), shows significant association (P < 5 × 10^−7) between T1D and six chromosome regions: 12q24, 12q13, 16p13, 18p11, 12p13 and 4q27. Here, we attempted to validate these and six other top findings in 4,000 individuals with T1D, 5,000 controls and 2,997 family trios that were independent of the WTCCC study. We confirmed unequivocally the associations of 12q24, 12q13, 16p13 and 18p11 (P_follow-up ≤ 1.35 × 10^−9; P_overall ≤ 1.15 × 10^−14), leaving eight regions with small effects or false-positive associations with T1D. We also obtained evidence for chromosome 18q22 (P_overall = 1.38 × 10^−8) from a genome-wide association study of nonsynonymous SNPs. Several regions, including 18q22 and 18p11, showed association with autoimmune thyroid disease. This study increases the number of T1D loci with compelling evidence from six to at least ten. PMID:17554260

  11. AIRE recruits multiple transcriptional components to specific genomic regions through tethering to nuclear matrix.

    Science.gov (United States)

    Tao, Yunxia; Kupfer, Rene; Stewart, Benjamin J; Williams-Skipp, Cheryll; Crowell, Christopher K; Patel, Dhavalkumar D; Sain, Steven; Scheinman, Robert I

    2006-02-01

    Thymic selection requires that diverse self antigens be presented to developing thymocytes by stromal cells. Consistent with this function, medullary thymic epithelial cells have been shown to express a large number of genes, many of which are tissue restricted. Autoimmune regulator (AIRE) is a nuclear protein, which has recently been identified as a regulator of this process, however, the mechanism by which AIRE functions is not well understood. Here we use a transrepression assay to demonstrate that AIRE interacts with multiple components of the transcription complex including a novel interaction with the UBA domain protein, GBDR1. When AIRE is expressed in cultured human thymic epithelial cells, it tightly associates with nuclear matrix, suggesting that AIRE responsive genes may be localized to specific regions. Using a mathematical approach we have re-analyzed an Affymetrix dataset identifying AIRE responsive genes and show that they tend to localize to specific regions of the genome. Together, these data suggest that AIRE regulates gene expression by recruiting components of the transcription complex to specific regions of the genome via interactions with nuclear matrix.

  12. Global identification and characterization of transcriptionally active regions in the rice genome.

    Directory of Open Access Journals (Sweden)

    Lei Li

    Full Text Available Genome tiling microarray studies have consistently documented rich transcriptional activity beyond the annotated genes. However, systematic characterization and transcriptional profiling of the putative novel transcripts on the genome scale are still lacking. We report here the identification of 25,352 and 27,744 transcriptionally active regions (TARs) not encoded by annotated exons in the rice (Oryza sativa) subspecies japonica and indica, respectively. The non-exonic TARs account for approximately two thirds of the total TARs detected by tiling arrays and represent transcripts likely conserved between japonica and indica. Transcription of 21,018 (83%) japonica non-exonic TARs was verified through expression profiling in 10 tissue types using a re-array in which annotated genes and TARs were each represented by five independent probes. Subsequent analyses indicate that about 80% of the japonica TARs that were not assigned to annotated exons can be assigned to various putatively functional or structural elements of the rice genome, including splice variants, uncharacterized portions of incompletely annotated genes, antisense transcripts, duplicated gene fragments, and potential non-coding RNAs. These results provide a systematic characterization of non-exonic transcripts in rice and thus expand the current view of the complexity and dynamics of the rice transcriptome.

  13. In situ genomic DNA extraction for PCR analysis of regions of interest in four plant species and one filamentous fungi

    Directory of Open Access Journals (Sweden)

    Luis E. Rojas

    2014-07-01

    Full Text Available Methods for extracting genomic DNA are usually laborious and, because they use organic solvents (chloroform and phenol), hazardous to human health and the environment. In this work a protocol for in situ extraction of genomic DNA by alkaline lysis is validated. It was used to amplify regions of DNA in four plant species and one filamentous fungus by polymerase chain reaction (PCR). From plant material of Saccharum officinarum L., Carica papaya L. and Digitalis purpurea L. it was possible to amplify different regions of the genome through PCR. Furthermore, it was possible to amplify a fragment of the avr-4 gene from DNA purified from lyophilized mycelium of Mycosphaerella fijiensis. Additionally, it was possible to amplify the region of the ap24 transgene inserted into the genome of banana cv. 'Grande naine' (Musa AAA). Key words: alkaline lysis, Carica papaya L., Digitalis purpurea L., Musa, Saccharum officinarum L.

  14. Selection for Unequal Densities of Sigma70 Promoter-like Signals in Different Regions of Large Bacterial Genomes

    Energy Technology Data Exchange (ETDEWEB)

    Huerta, Araceli M.; Francino, M. Pilar; Morett, Enrique; Collado-Vides, Julio

    2006-03-01

    The evolutionary processes operating in the DNA regions that participate in the regulation of gene expression are poorly understood. In Escherichia coli, we have established a sequence pattern that distinguishes regulatory from nonregulatory regions. The density of promoter-like sequences, that are recognizable by RNA polymerase and may function as potential promoters, is high within regulatory regions, in contrast to coding regions and regions located between convergently-transcribed genes. Moreover, functional promoter sites identified experimentally are often found in the subregions of highest density of promoter-like signals, even when individual sites with higher binding affinity for RNA polymerase exist elsewhere within the regulatory region. In order to investigate the generality of this pattern, we have used position weight matrices describing the -35 and -10 promoter boxes of E. coli to search for these motifs in 43 additional genomes belonging to most established bacterial phyla, after specific calibration of the matrices according to the base composition of the noncoding regions of each genome. We have found that all bacterial species analyzed contain similar promoter-like motifs, and that, in most cases, these motifs follow the same genomic distribution observed in E. coli. Differential densities between regulatory and nonregulatory regions are detectable in most bacterial genomes, with the exception of those that have experienced evolutionary extreme genome reduction. Thus, the phylogenetic distribution of this pattern mirrors that of genes and other genomic features that require weak selection to be effective in order to persist. On this basis, we suggest that the loss of differential densities in the reduced genomes of host-restricted pathogens and symbionts is the outcome of a process of genome degradation resulting from the decreased efficiency of purifying selection in highly structured small populations. This implies that the differential
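
    The motif search described in this record can be illustrated with a minimal position weight matrix (PWM) scan. This is only a sketch under stated assumptions: the toy count matrix, the pseudocount, the score threshold and every function name below are hypothetical and are not the matrices or software used by the authors. The background frequencies are estimated from a supplied noncoding sequence, loosely mirroring the calibration to noncoding base composition mentioned in the abstract.

```python
import math

BASES = "ACGT"

# Hypothetical toy counts for a 6-bp -10-like box (columns: A, C, G, T).
# Illustrative numbers only; they are NOT taken from the study.
TOY_COUNTS = [
    [2, 1, 1, 16],
    [16, 1, 1, 2],
    [3, 2, 2, 13],
    [13, 2, 2, 3],
    [12, 3, 2, 3],
    [1, 1, 1, 17],
]

def background_from(noncoding_seq, pseudocount=1.0):
    """Estimate base frequencies from a noncoding sequence (assumed calibration step)."""
    counts = {b: pseudocount for b in BASES}
    for base in noncoding_seq.upper():
        if base in counts:
            counts[base] += 1
    total = sum(counts.values())
    return {b: counts[b] / total for b in BASES}

def log_odds_matrix(counts, background, pseudocount=1.0):
    """Turn per-position counts into log2 odds against the background."""
    matrix = []
    for row in counts:
        total = sum(row) + 4 * pseudocount
        matrix.append({
            b: math.log2(((row[i] + pseudocount) / total) / background[b])
            for i, b in enumerate(BASES)
        })
    return matrix

def scan(sequence, matrix, threshold):
    """Return (start, score) for every window scoring at or above the threshold."""
    sequence = sequence.upper()
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(b not in BASES for b in window):
            continue  # skip windows containing ambiguous bases
        score = sum(matrix[i][b] for i, b in enumerate(window))
        if score >= threshold:
            hits.append((start, score))
    return hits

def hit_density(sequence, matrix, threshold):
    """Promoter-like hits per base, for comparing regions of different length."""
    return len(scan(sequence, matrix, threshold)) / len(sequence) if sequence else 0.0
```

    Comparing `hit_density` between regulatory and coding regions would give a crude analogue of the differential densities discussed above; the threshold is a free parameter here and would need calibration for each genome.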

  15. Microcollinearity in an ethylene receptor coding gene region of the Coffea canephora genome is extensively conserved with Vitis vinifera and other distant dicotyledonous sequenced genomes

    Directory of Open Access Journals (Sweden)

    Campa Claudine

    2009-02-01

    Full Text Available Abstract Background Coffea canephora, also called Robusta, belongs to the Rubiaceae, the fourth largest angiosperm family. This diploid species (2x = 2n = 22) has a fairly small genome size of ≈ 690 Mb and despite its extreme economic importance, particularly for developing countries, knowledge on the genome composition, structure and evolution remains very limited. Here, we report the 160 kb of the first C. canephora Bacterial Artificial Chromosome (BAC) clone ever sequenced and its fine analysis. Results This clone contains the CcEIN4 gene, encoding an ethylene receptor, and twenty other predicted genes showing a high gene density of one gene per 7.8 kb. Most of them display perfect matches with C. canephora expressed sequence tags or show transcriptional activities through PCR amplifications on cDNA libraries. Twenty-three transposable elements, mainly Class II transposon derivatives, were identified at this locus. Most of these Class II elements are Miniature Inverted-repeat Transposable Elements (MITEs) known to be closely associated with plant genes. This BAC composition gives a pattern similar to those found in gene rich regions of Solanum lycopersicum and Medicago truncatula genomes indicating that the CcEIN4 region may belong to a gene rich region in the C. canephora genome. Comparative sequence analysis indicated an extensive conservation between C. canephora and most of the reference dicotyledonous genomes studied in this work, such as tomato (S. lycopersicum), grapevine (V. vinifera), barrel medic (M. truncatula), black cottonwood (Populus trichocarpa) and Arabidopsis thaliana. The highest degree of microcollinearity was found between C. canephora and V. vinifera, which belong respectively to the Asterids and Rosids, two clades that diverged more than 114 million years ago. Conclusion This study provides a first glimpse of C. canephora genome composition and evolution. Our data revealed a remarkable conservation of the microcollinearity

  16. A genomic region involved in the formation of adhesin fibers in Bacillus cereus biofilms

    Directory of Open Access Journals (Sweden)

    Joaquín Caro-Astorga

    2015-01-01

    Full Text Available Bacillus cereus is a bacterial pathogen that is responsible for many recurrent disease outbreaks due to food contamination. Spores and biofilms are considered the most important reservoirs of B. cereus in contaminated fresh vegetables and fruits. Biofilms are bacterial communities that are difficult to eradicate from biotic and abiotic surfaces because of their stable and extremely strong extracellular matrix. These extracellular matrixes contain exopolysaccharides, proteins, extracellular DNA, and other minor components. Although B. cereus can form biofilms, the bacterial features governing assembly of the protective extracellular matrix are not known. Using the well-studied bacterium B. subtilis as a model, we identified two genomic loci in B. cereus, which encode two orthologs of the amyloid-like protein TasA of B. subtilis and a SipW signal peptidase. Deletion of this genomic region in B. cereus inhibited biofilm assembly; notably, mutation of the putative signal peptidase SipW caused the same phenotype. However, mutations in tasA or calY did not completely prevent biofilm formation; strains that were mutated for either of these genes formed phenotypically different surface-attached biofilms. Electron microscopy studies revealed that TasA polymerizes to form long and abundant fibers on cell surfaces, whereas CalY does not aggregate similarly. Heterologous expression of this amyloid-like cassette in a B. subtilis strain lacking the factors required for the assembly of TasA amyloid-like fibers revealed (i) the involvement of this B. cereus genomic region in formation of the air-liquid interphase pellicles and (ii) the intrinsic ability of TasA to form fibers similar to the amyloid-like fibers produced by its B. subtilis ortholog.

  17. DMRT gene cluster analysis in the platypus: new insights into genomic organization and regulatory regions.

    Science.gov (United States)

    El-Mogharbel, Nisrine; Wakefield, Matthew; Deakin, Janine E; Tsend-Ayush, Enkhjargal; Grützner, Frank; Alsop, Amber; Ezaz, Tariq; Marshall Graves, Jennifer A

    2007-01-01

    We isolated and characterized a cluster of platypus DMRT genes and compared their arrangement, location, and sequence across vertebrates. The DMRT gene cluster on human 9p24.3 harbors, in order, DMRT1, DMRT3, and DMRT2, which share a DM domain. DMRT1 is highly conserved and involved in sexual development in vertebrates, and deletions in this region cause sex reversal in humans. Sequence comparisons of DMRT genes between species have been valuable in identifying exons, control regions, and conserved nongenic regions (CNGs). The addition of platypus sequences is expected to be particularly valuable, since monotremes fill a gap in the vertebrate genome coverage. We therefore isolated and fully sequenced platypus BAC clones containing DMRT3 and DMRT2 as well as DMRT1 and then generated multispecies alignments and ran prediction programs followed by experimental verification to annotate this gene cluster. We found that the three genes have 58-66% identity to their human orthologues, lie in the same order as in other vertebrates, and colocate on 1 of the 10 platypus sex chromosomes, X5. We also predict that optimal annotation of the newly sequenced platypus genome will be challenging. The analysis of platypus sequence revealed differences in structure and sequence of the DMRT gene cluster. Multispecies comparison was particularly effective for detecting CNGs, revealing several novel potential regulatory regions within DMRT3 and DMRT2 as well as DMRT1. RT-PCR indicated that platypus DMRT1 and DMRT3 are expressed specifically in the adult testis (and not ovary), but DMRT2 has a wider expression profile, as it does for other mammals. The platypus DMRT1 expression pattern, and its location on an X chromosome, suggests an involvement in monotreme sexual development.

  18. The complete mitochondrial genome sequence of the western flower thrips Frankliniella occidentalis (Thysanoptera: Thripidae) contains triplicate putative control regions.

    Science.gov (United States)

    Yan, Dankan; Tang, Yunxia; Xue, Xiaofeng; Wang, Minghua; Liu, Fengquan; Fan, Jiaqin

    2012-09-10

    To investigate the features of the control region (CR) and the gene rearrangement in the mitochondrial (mt) genome of Thysanoptera insects, we sequenced the whole mt genome of the western flower thrips Frankliniella occidentalis (Thysanoptera: Thripidae). The mt genome is a circular molecule with 14,889 nucleotides and an A+T content of 76.6%, and it has triplicate putative CRs. We propose that tandem duplication and deletion account for the evolution of the CR and the gene translocations. Intramitochondrial recombination is a plausible model for the gene inversions. We discuss the excessive duplicate CR sequences and the transcription of the rRNA genes, which are distant from one another and from the CR. Finally, we address the significance of the complicated mt genomes in Thysanoptera for the evolution of the CR and the gene arrangement of the mt genome. Crown Copyright © 2012. Published by Elsevier B.V. All rights reserved.

  19. An integrative approach to predicting the functional effects of small indels in non-coding regions of the human genome.

    Science.gov (United States)

    Ferlaino, Michael; Rogers, Mark F; Shihab, Hashem A; Mort, Matthew; Cooper, David N; Gaunt, Tom R; Campbell, Colin

    2017-10-06

    Small insertions and deletions (indels) have a significant influence in human disease and, in terms of frequency, they are second only to single nucleotide variants as pathogenic mutations. As the majority of mutations associated with complex traits are located outside the exome, it is crucial to investigate the potential pathogenic impact of indels in non-coding regions of the human genome. We present FATHMM-indel, an integrative approach to predict the functional effect, pathogenic or neutral, of indels in non-coding regions of the human genome. Our method exploits various genomic annotations in addition to sequence data. When validated on benchmark data, FATHMM-indel significantly outperforms CADD and GAVIN, state of the art models in assessing the pathogenic impact of non-coding variants. FATHMM-indel is available via a web server at indels.biocompute.org.uk. FATHMM-indel can accurately predict the functional impact and prioritise small indels throughout the whole non-coding genome.

  20. QTL mapping of genome regions controlling temephos resistance in larvae of the mosquito Aedes aegypti.

    Science.gov (United States)

    Reyes-Solis, Guadalupe Del Carmen; Saavedra-Rodriguez, Karla; Suarez, Adriana Flores; Black, William C

    2014-10-01

    The mosquito Aedes aegypti is the principal vector of dengue and yellow fever flaviviruses. Temephos is an organophosphate insecticide used globally to suppress Ae. aegypti larval populations but resistance has evolved in many locations. Quantitative Trait Loci (QTL) controlling temephos survival in Ae. aegypti larvae were mapped in a pair of F3 advanced intercross lines arising from temephos resistant parents from Solidaridad, México and temephos susceptible parents from Iquitos, Peru. Two sets of 200 F3 larvae were exposed to a discriminating dose of temephos and then dead larvae were collected and preserved for DNA isolation every two hours up to 16 hours. Larvae surviving longer than 16 hours were considered resistant. For QTL mapping, single nucleotide polymorphisms (SNPs) were identified at 23 single copy genes and 26 microsatellite loci of known physical positions in the Ae. aegypti genome. In both reciprocal crosses, Multiple Interval Mapping identified eleven QTL associated with time until death. In the Solidaridad×Iquitos (SLD×Iq) cross twelve were associated with survival but in the reciprocal Iq×SLD cross, only six QTL were survival associated. Polymorphisms at acetylcholine esterase (AchE) loci 1 and 2 were not associated with either resistance phenotype, suggesting that target site insensitivity is not an organophosphate resistance mechanism in this region of México. Temephos resistance is under the control of many metabolic genes of small effect and dispersed throughout the Ae. aegypti genome.

  1. HYBRIDCHECK: software for the rapid detection, visualization and dating of recombinant regions in genome sequence data.

    Science.gov (United States)

    Ward, Ben J; van Oosterhout, Cock

    2016-03-01

    HYBRIDCHECK is a software package to visualize the recombination signal in large DNA sequence data sets, and it can be used to analyse recombination, genetic introgression, hybridization and horizontal gene transfer. It can scan large (multiple kb) contigs and whole-genome sequences of three or more individuals. HYBRIDCHECK is written in R for OS X, Linux and Windows operating systems, and it has a simple graphical user interface. In addition, the R code can be readily incorporated in scripts and analysis pipelines. HYBRIDCHECK implements several ABBA-BABA tests and visualizes the effects of hybridization and the resulting mosaic-like genome structure in high-density graphics. The package also reports the following: (i) the breakpoint positions, (ii) the number of mutations in each introgressed block, (iii) the probability that the identified region is not caused by recombination and (iv) the estimated age of each recombination event. The divergence times between the donor and recombinant sequence are calculated using a JC, K80, F81, HKY or GTR correction, and the dating algorithm is exceedingly fast. By estimating the coalescence time of introgressed blocks, it is possible to distinguish between hybridization and incomplete lineage sorting. HYBRIDCHECK is libre software and it and its manual are free to download from http://ward9250.github.io/HybridCheck/. © 2015 John Wiley & Sons Ltd.
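
    Two of the quantities named in this record, an ABBA-BABA statistic and a Jukes-Cantor (JC) corrected divergence, can be sketched in a few lines. The code below is a generic textbook illustration written in Python, not HYBRIDCHECK's R code or API: the function names are hypothetical, the four-taxon layout (P1, P2, P3, outgroup) is an assumption, and the D statistic shown is Patterson's D, which may differ from the specific tests the package implements.

```python
import math

VALID = set("ACGT")

def patterson_d(p1, p2, p3, outgroup):
    """Patterson's D = (ABBA - BABA) / (ABBA + BABA) over aligned sequences.

    The outgroup allele is treated as ancestral (A); only biallelic sites
    with unambiguous bases are counted.
    """
    abba = baba = 0
    for a, b, c, o in zip(p1, p2, p3, outgroup):
        if not {a, b, c, o} <= VALID or len({a, b, c, o}) != 2:
            continue
        if a == o and b == c and b != o:
            abba += 1  # P2 and P3 share the derived allele
        elif b == o and a == c and a != o:
            baba += 1  # P1 and P3 share the derived allele
    total = abba + baba
    return (abba - baba) / total if total else 0.0

def jc69_distance(seq_a, seq_b):
    """Jukes-Cantor distance: d = -3/4 * ln(1 - 4p/3), p = proportion of differing sites."""
    pairs = [(x, y) for x, y in zip(seq_a, seq_b) if x in VALID and y in VALID]
    if not pairs:
        raise ValueError("no comparable sites")
    p = sum(x != y for x, y in pairs) / len(pairs)
    if p >= 0.75:
        return math.inf  # correction is undefined at saturation
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)
```

    A D value near zero is what incomplete lineage sorting alone would be expected to produce, while a clear excess of ABBA or BABA sites points toward introgression; converting a corrected distance into an age additionally requires a mutation rate, which is not modelled here.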

  2. Determining spatial chromatin organization of large genomic regions using 5C technology.

    Science.gov (United States)

    van Berkum, Nynke L; Dekker, Job

    2009-01-01

    Spatial organization of chromatin plays an important role at multiple levels of genome regulation. On a global scale, its function is evident in processes like metaphase and chromosome segregation. On a detailed level, long-range interactions between regulatory elements and promoters are essential for proper gene regulation. Microscopic techniques like FISH can detect chromatin contacts, although the resolution is generally low making detection of enhancer-promoter interaction difficult. The 3C methodology allows for high-resolution analysis of chromatin interactions. 3C is now widely used and has revealed that long-range looping interactions between genomic elements are widespread. However, studying chromatin interactions in large genomic regions by 3C is very labor intensive. This limitation is overcome by the 5C technology. 5C is an adaptation of 3C, in which the concurrent use of thousands of primers permits the simultaneous detection of millions of chromatin contacts. The design of the 5C primers is critical because this will determine which and how many chromatin interactions will be examined in the assay. Starting material for 5C is a 3C template. To make a 3C template, chromatin interactions in living cells are cross-linked using formaldehyde. Next, chromatin is digested and subsequently ligated under conditions favoring ligation events between cross-linked fragments. This yields a genome-wide 3C library of ligation products representing all chromatin interactions in vivo. 5C then employs multiplex ligation-mediated amplification to detect, in a single assay, up to millions of unique ligation products present in the 3C library. The resulting 5C library can be analyzed by microarray analysis or deep sequencing. The observed abundance of a 5C product is a measure of the interaction frequency between the two corresponding chromatin fragments. The power of the 5C technique described in this chapter is the high-throughput, high-resolution, and quantitative way

  3. Molecular markers detect stable genomic regions underlying tomato fruit shelf life and weight

    Directory of Open Access Journals (Sweden)

    Guillermo Raúl Pratta

    2011-01-01

    Full Text Available Incorporating wild germplasm such as S. pimpinellifolium is an alternative strategy to prolong tomato fruit shelf life (SL) without reducing fruit quality. A set of recombinant inbred lines with discrepant values of SL and weight (FW) was derived by antagonistic-divergent selection from an interspecific cross. The general objective of this research was to evaluate Genotype x Year (GY) and Marker x Year (MY) interaction in these new genetic materials for both traits. Genotype and year principal effects and GY interaction were statistically significant for SL. Genotype and year principal effects were significant for FW but GY interaction was not. The marker principal effect was significant for SL and FW but both year principal effect and MY interaction were not significant. Though SL was highly influenced by year conditions, some genome regions appeared to maintain a stable effect across years of evaluation. Fruit weight, instead, was more independent of year effect.

  4. Sequences related to the ox pancreatic ribonuclease coding region in the genomic DNA of mammalian species.

    Science.gov (United States)

    Breukelman, H J; Beintema, J J; Confalone, E; Costanzo, C; Sasso, M P; Carsana, A; Palmieri, M; Furia, A

    1993-07-01

    Mammalian pancreatic ribonucleases form a family of homologous proteins that has been extensively investigated. The primary structures of these enzymes were used to derive phylogenetic trees. These analyses indicate that the presence of three strictly homologous enzymes in the bovine species (the pancreatic, seminal, and cerebral ribonucleases) is due to gene duplication events which occurred during the evolution of ancestral ruminants. In this paper we present evidence that confirms this finding and that suggests an overall structural conservation of the putative ribonuclease genes in ruminant species. We could also demonstrate that the sequences related to ox ribonuclease coding regions present in genomic DNA of the giraffe species are the orthologues of the bovine genes encoding the three ribonucleases mentioned above.

  5. Targeted parallel sequencing of large genetically-defined genomic regions for identifying mutations in Arabidopsis

    Directory of Open Access Journals (Sweden)

    Liu Kun-hsiang

    2012-03-01

    Full Text Available Abstract Large-scale genetic screens in Arabidopsis are a powerful approach for molecular dissection of complex signaling networks. However, map-based cloning can be time-consuming or even hampered due to low chromosomal recombination. Current strategies using next generation sequencing for molecular identification of mutations require whole genome sequencing and advanced computational devices and skills, which are not readily accessible or affordable to every laboratory. We have developed a streamlined method using parallel massive sequencing for mutant identification in which only targeted regions are sequenced. This targeted parallel sequencing (TPSeq) method is more cost-effective, straightforward enough to be easily done without specialized bioinformatics expertise, and reliable for identifying multiple mutations simultaneously. Here, we demonstrate its use by identifying three novel nitrate-signaling mutants in Arabidopsis.

  6. Whole genome association study identifies regions of the bovine genome and biological pathways involved in carcass trait performance in Holstein-Friesian cattle.

    Science.gov (United States)

    Doran, Anthony G; Berry, Donagh P; Creevey, Christopher J

    2014-10-01

    Four traits related to carcass performance have been identified as economically important in beef production: carcass weight, carcass fat, carcass conformation of progeny and cull cow carcass weight. Although Holstein-Friesian cattle are primarily utilized for milk production, they are also an important source of meat for beef production and export. Because of this, there is great interest in understanding the underlying genomic structure influencing these traits. Several genome-wide association studies have identified regions of the bovine genome associated with growth or carcass traits, however, little is known about the mechanisms or underlying biological pathways involved. This study aims to detect regions of the bovine genome associated with carcass performance traits (employing a panel of 54,001 SNPs) using measures of genetic merit (as predicted transmitting abilities) for 5,705 Irish Holstein-Friesian animals. Candidate genes and biological pathways were then identified for each trait under investigation. Following adjustment for false discovery (q-value 0.5) with at least one of the four traits. In total, 557 unique bovine genes, which mapped to 426 human orthologs, were within 500kbs of QTL found associated with a trait using the Bayesian approach. Using this information, 24 significantly over-represented pathways were identified across all traits. The most significantly over-represented biological pathway was the peroxisome proliferator-activated receptor (PPAR) signaling pathway. A large number of genomic regions putatively associated with bovine carcass traits were detected using two different statistical approaches. Notably, several significant associations were detected in close proximity to genes with a known role in animal growth such as glucagon and leptin. Several biological pathways, including PPAR signaling, were shown to be involved in various aspects of bovine carcass performance. These core genes and biological processes may form the

  7. Genome-Based Identification of Active Prophage Regions by Next Generation Sequencing in Bacillus licheniformis DSM13

    Science.gov (United States)

    Hertel, Robert; Rodríguez, David Pintor; Hollensteiner, Jacqueline; Dietrich, Sascha; Leimbach, Andreas; Hoppert, Michael; Liesegang, Heiko; Volland, Sonja

    2015-01-01

    Prophages are viruses, which have integrated their genomes into the genome of a bacterial host. The status of the prophage genome can vary from fully intact with the potential to form infective particles to a remnant state where only a few phage genes persist. Prophages have impact on the properties of their host and are therefore of great interest for genomic research and strain design. Here we present a genome- and next generation sequencing (NGS)-based approach for identification and activity evaluation of prophage regions. Seven prophage or prophage-like regions were identified in the genome of Bacillus licheniformis DSM13. Six of these regions show similarity to members of the Siphoviridae phage family. The remaining region encodes the B. licheniformis orthologue of the PBSX prophage from Bacillus subtilis. Analysis of isolated phage particles (induced by mitomycin C) from the wild-type strain and prophage deletion mutant strains revealed activity of the prophage regions BLi_Pp2 (PBSX-like), BLi_Pp3 and BLi_Pp6. In contrast to BLi_Pp2 and BLi_Pp3, neither phage DNA nor phage particles of BLi_Pp6 could be visualized. However, the ability of prophage BLi_Pp6 to generate particles could be confirmed by sequencing of particle-protected DNA mapping to prophage locus BLi_Pp6. The introduced NGS-based approach allows the investigation of prophage regions and their ability to form particles. Our results show that this approach increases the sensitivity of prophage activity analysis and can complement more conventional approaches such as transmission electron microscopy (TEM). PMID:25811873

  8. Genome-based identification of active prophage regions by next generation sequencing in Bacillus licheniformis DSM13.

    Science.gov (United States)

    Hertel, Robert; Rodríguez, David Pintor; Hollensteiner, Jacqueline; Dietrich, Sascha; Leimbach, Andreas; Hoppert, Michael; Liesegang, Heiko; Volland, Sonja

    2015-01-01

    Prophages are viruses, which have integrated their genomes into the genome of a bacterial host. The status of the prophage genome can vary from fully intact with the potential to form infective particles to a remnant state where only a few phage genes persist. Prophages have impact on the properties of their host and are therefore of great interest for genomic research and strain design. Here we present a genome- and next generation sequencing (NGS)-based approach for identification and activity evaluation of prophage regions. Seven prophage or prophage-like regions were identified in the genome of Bacillus licheniformis DSM13. Six of these regions show similarity to members of the Siphoviridae phage family. The remaining region encodes the B. licheniformis orthologue of the PBSX prophage from Bacillus subtilis. Analysis of isolated phage particles (induced by mitomycin C) from the wild-type strain and prophage deletion mutant strains revealed activity of the prophage regions BLi_Pp2 (PBSX-like), BLi_Pp3 and BLi_Pp6. In contrast to BLi_Pp2 and BLi_Pp3, neither phage DNA nor phage particles of BLi_Pp6 could be visualized. However, the ability of prophage BLi_Pp6 to generate particles could be confirmed by sequencing of particle-protected DNA mapping to prophage locus BLi_Pp6. The introduced NGS-based approach allows the investigation of prophage regions and their ability to form particles. Our results show that this approach increases the sensitivity of prophage activity analysis and can complement more conventional approaches such as transmission electron microscopy (TEM).

  9. Conserved microstructure of the Brassica B Genome of Brassica nigra in relation to homologous regions of Arabidopsis thaliana, B. rapa and B. oleracea

    Science.gov (United States)

    2013-01-01

    Background The Brassica B genome is known to carry several important traits, yet there have been limited analyses of its underlying genome structure, especially in comparison to the closely related A and C genomes. A bacterial artificial chromosome (BAC) library of Brassica nigra was developed and screened with 17 genes from a 222 kb region of A. thaliana that had been well characterised in both the Brassica A and C genomes. Results Fingerprinting of 483 apparently non-redundant clones defined physical contigs for the corresponding regions in B. nigra. The target region is duplicated in A. thaliana and six homologous contigs were found in B. nigra resulting from the whole genome triplication event shared by the Brassiceae tribe. BACs representative of each region were sequenced to elucidate the level of microscale rearrangements across the Brassica species divide. Conclusions Although the B genome species separated from the A/C lineage some 6 Mya, comparisons between the three paleopolyploid Brassica genomes revealed extensive conservation of gene content and sequence identity. The level of fractionation or gene loss varied across genomes and genomic regions; however, the greatest loss of genes was observed to be common to all three genomes. One large-scale chromosomal rearrangement differentiated the B genome suggesting such events could contribute to the lack of recombination observed between B genome species and those of the closely related A/C lineage. PMID:23586706

  10. A high quality assembly of the Nile Tilapia (Oreochromis niloticus) genome reveals the structure of two sex determination regions.

    Science.gov (United States)

    Conte, Matthew A; Gammerdinger, William J; Bartie, Kerry L; Penman, David J; Kocher, Thomas D

    2017-05-02

    Tilapias are the second most farmed fishes in the world and a sustainable source of food. Like many other fish, tilapias are sexually dimorphic and sex is a commercially important trait in these fish. In this study, we developed a significantly improved assembly of the tilapia genome using the latest genome sequencing methods and show how it improves the characterization of two sex determination regions in two tilapia species. A homozygous clonal XX female Nile tilapia (Oreochromis niloticus) was sequenced to 44X coverage using Pacific Biosciences (PacBio) SMRT sequencing. Dozens of candidate de novo assemblies were generated and an optimal assembly (contig NG50 of 3.3Mbp) was selected using principal component analysis of likelihood scores calculated from several paired-end sequencing libraries. Comparison of the new assembly to the previous O. niloticus genome assembly reveals that recently duplicated portions of the genome are now well represented. The overall number of genes in the new assembly increased by 27.3%, including a 67% increase in pseudogenes. The new tilapia genome assembly correctly represents two recent vasa gene duplication events that have been verified with BAC sequencing. A total of 146Mbp of additional transposable element sequence is now assembled, a large proportion of which consists of recent insertions. Large centromeric satellite repeats are assembled and annotated in cichlid fish for the first time. Finally, the new assembly identifies the long-range structure of both a ~9Mbp XY sex determination region on LG1 in O. niloticus, and a ~50Mbp WZ sex determination region on LG3 in the related species O. aureus. This study highlights the use of long read sequencing to correctly assemble recent duplications and to characterize repeat-filled regions of the genome. The study serves as an example of the need for high quality genome assemblies and provides a framework for identifying sex determining genes in tilapia and related fish species.

  11. Allelic variation in a single genomic region alters the microbiome of the snail Biomphalaria glabrata.

    Science.gov (United States)

    Allan, Euan R O; Tennessen, Jacob A; Sharpton, Thomas J; Blouin, Michael S

    2018-03-16

    Freshwater snails are the intermediate hosts for numerous parasitic worms which can have negative consequences for human health and agriculture. Understanding the transmission of these diseases requires a more complete characterization of the immunobiology of snail hosts. This includes the characterization of its microbiome and genetic factors which may interact with this important commensal community. Allelic variation in the Guadeloupe Resistance Complex (GRC) genomic region of Guadeloupean Biomphalaria glabrata influences their susceptibility to schistosome infection, and may have other roles in the snail immune response. In the present study, we examined whether a snail's GRC genotype has a role in shaping the bacterial diversity and composition present on or in whole snails. We show that the GRC haplotype, including the resistant genotype, has a significant effect on the diversity of bacterial species present in or on whole snails, including the relative abundances of Gemmatimonas aurantiaca and Micavibrio aeruginosavorus. These findings support the hypothesis that the GRC region is likely involved in pathways that can modify the microbial community of these snails, and may have more immune roles in B. glabrata than originally believed. This is also one of few examples in which allelic variation at a particular locus has been shown to affect the microbiome in any species.

  12. Gametic phase estimation over large genomic regions using an adaptive window approach

    Directory of Open Access Journals (Sweden)

    Excoffier Laurent

    2003-11-01

    Full Text Available Abstract The authors present ELB, an easy to programme and computationally fast algorithm for inferring gametic phase in population samples of multilocus genotypes. Phase updates are made on the basis of a window of neighbouring loci, and the window size varies according to the local level of linkage disequilibrium. Thus, ELB is particularly well suited to problems involving many loci and/or relatively large genomic regions, including those with variable recombination rate. The authors have simulated population samples of single nucleotide polymorphism genotypes with varying levels of recombination and marker density, and find that ELB provides better local estimation of gametic phase than the PHASE or HTYPER programs, while its global accuracy is broadly similar. The relative improvement in local accuracy increases both with increasing recombination and with increasing marker density. Short tandem repeat (STR, or microsatellite) simulation studies demonstrate ELB's superiority over PHASE both globally and locally. Missing data are handled by ELB; simulations show that phase recovery is virtually unaffected by up to 2 per cent of missing data, but that phase estimation is noticeably impaired beyond this amount. The authors also applied ELB to datasets obtained from random pairings of 42 human X chromosomes typed at 97 diallelic markers in a 200 kb low-recombination region. Once again, they found ELB to have consistently better local accuracy than PHASE or HTYPER, while its global accuracy was close to the best.

  13. Genome-Wide Association Identifies SLC2A9 and NLN Gene Regions as Associated with Entropion in Domestic Sheep.

    Science.gov (United States)

    Mousel, Michelle R; Reynolds, James O; White, Stephen N

    2015-01-01

    Entropion is an inward rolling of the eyelid allowing contact between the eyelashes and cornea that may lead to blindness if not corrected. Although many mammalian species, including humans and dogs, are afflicted by congenital entropion, no specific genes or gene regions related to development of entropion have been reported in any mammalian species to date. Entropion in domestic sheep is known to have a genetic component; therefore, we used domestic sheep as a model system to identify genomic regions containing genes associated with entropion. A genome-wide association was conducted with congenital entropion in 998 Columbia, Polypay, and Rambouillet sheep genotyped with 50,000 SNP markers. Prevalence of entropion was 6.01%, with all breeds represented. Logistic regression was performed in PLINK with additive allelic, recessive, dominant, and genotypic inheritance models. Two genome-wide significant (empirical Psheep genetic markers for marker-assisted selection.

  14. Specific genomic regions are differentially affected by copy number alterations across distinct cancer types, in aggregated cytogenetic data.

    Science.gov (United States)

    Kumar, Nitin; Cai, Haoyang; von Mering, Christian; Baudis, Michael

    2012-01-01

    Regional genomic copy number alterations (CNA) are observed in the vast majority of cancers. Besides specifically targeting well-known, canonical oncogenes, CNAs may also play more subtle roles in terms of modulating genetic potential and broad gene expression patterns of developing tumors. Any significant differences in the overall CNA patterns between different cancer types may thus point towards specific biological mechanisms acting in those cancers. In addition, differences among CNA profiles may prove valuable for cancer classifications beyond existing annotation systems. We have analyzed molecular-cytogenetic data from 25,579 tumor samples, which were classified into 160 cancer types according to the International Classification of Disease (ICD) coding system. When correcting for differences in the overall CNA frequencies between cancer types, related cancers were often found to cluster together according to similarities in their CNA profiles. Based on a randomization approach, distance measures from the cluster dendrograms were used to identify those specific genomic regions that contributed significantly to this signal. This approach identified 43 non-neutral genomic regions whose propensity for the occurrence of copy number alterations varied with the type of cancer at hand. Only a subset of these identified loci overlapped with previously implied, highly recurrent (hot-spot) cytogenetic imbalance regions. Thus, for many genomic regions, a simple null-hypothesis of independence between cancer type and relative copy number alteration frequency can be rejected. Since a subset of these regions display relatively low overall CNA frequencies, they may point towards second-tier genomic targets that are adaptively relevant but not necessarily essential for cancer development.

  15. Demographically-Based Evaluation of Genomic Regions under Selection in Domestic Dogs.

    Directory of Open Access Journals (Sweden)

    Adam H Freedman

    2016-03-01

    Full Text Available Controlling for background demographic effects is important for accurately identifying loci that have recently undergone positive selection. To date, the effects of demography have not yet been explicitly considered when identifying loci under selection during dog domestication. To investigate positive selection on the dog lineage early in the domestication, we examined patterns of polymorphism in six canid genomes that were previously used to infer a demographic model of dog domestication. Using an inferred demographic model, we computed false discovery rates (FDR) and identified 349 outlier regions consistent with positive selection at a low FDR. The signals in the top 100 regions were frequently centered on candidate genes related to brain function and behavior, including LHFPL3, CADM2, GRIK3, SH3GL2, MBP, PDE7B, NTAN1, and GLRA1. These regions contained significant enrichments in behavioral ontology categories. The 3rd top hit, CCRN4L, plays a major role in lipid metabolism, which is supported by additional metabolism-related candidates revealed in our scan, including SCP2D1 and PDXC1. Comparing our method to an empirical outlier approach that does not directly account for demography, we found only modest overlaps between the two methods, with 60% of empirical outliers having no overlap with our demography-based outlier detection approach. Demography-aware approaches have lower rates of false discovery. Our top candidates for selection, in addition to expanding the set of neurobehavioral candidate genes, include genes related to lipid metabolism, suggesting a dietary target of selection that was important during the period when proto-dogs hunted and fed alongside hunter-gatherers.

  16. Demographically-Based Evaluation of Genomic Regions under Selection in Domestic Dogs

    Science.gov (United States)

    Freedman, Adam H.; Schweizer, Rena M.; Ortega-Del Vecchyo, Diego; Han, Eunjung; Davis, Brian W.; Gronau, Ilan; Silva, Pedro M.; Galaverni, Marco; Fan, Zhenxin; Marx, Peter; Lorente-Galdos, Belen; Ramirez, Oscar; Hormozdiari, Farhad; Alkan, Can; Vilà, Carles; Squire, Kevin; Geffen, Eli; Kusak, Josip; Boyko, Adam R.; Parker, Heidi G.; Lee, Clarence; Tadigotla, Vasisht; Siepel, Adam; Bustamante, Carlos D.; Harkins, Timothy T.; Nelson, Stanley F.; Marques-Bonet, Tomas; Ostrander, Elaine A.; Wayne, Robert K.; Novembre, John

    2016-01-01

    Controlling for background demographic effects is important for accurately identifying loci that have recently undergone positive selection. To date, the effects of demography have not yet been explicitly considered when identifying loci under selection during dog domestication. To investigate positive selection on the dog lineage early in the domestication, we examined patterns of polymorphism in six canid genomes that were previously used to infer a demographic model of dog domestication. Using an inferred demographic model, we computed false discovery rates (FDR) and identified 349 outlier regions consistent with positive selection at a low FDR. The signals in the top 100 regions were frequently centered on candidate genes related to brain function and behavior, including LHFPL3, CADM2, GRIK3, SH3GL2, MBP, PDE7B, NTAN1, and GLRA1. These regions contained significant enrichments in behavioral ontology categories. The 3rd top hit, CCRN4L, plays a major role in lipid metabolism, which is supported by additional metabolism-related candidates revealed in our scan, including SCP2D1 and PDXC1. Comparing our method to an empirical outlier approach that does not directly account for demography, we found only modest overlaps between the two methods, with 60% of empirical outliers having no overlap with our demography-based outlier detection approach. Demography-aware approaches have lower rates of false discovery. Our top candidates for selection, in addition to expanding the set of neurobehavioral candidate genes, include genes related to lipid metabolism, suggesting a dietary target of selection that was important during the period when proto-dogs hunted and fed alongside hunter-gatherers. PMID:26943675

  17. Genome-scale prediction of proteins with long intrinsically disordered regions.

    Science.gov (United States)

    Peng, Zhenling; Mizianty, Marcin J; Kurgan, Lukasz

    2014-01-01

    Proteins with long disordered regions (LDRs), defined as having 30 or more consecutive disordered residues, are abundant in eukaryotes, and these regions are recognized as a distinct class of biologically functional domains. LDRs facilitate various cellular functions and are important for target selection in structural genomics. Motivated by the lack of methods that directly predict proteins with LDRs, we designed Super-fast predictor of proteins with Long Intrinsically DisordERed regions (SLIDER). SLIDER utilizes logistic regression that takes an empirically chosen set of numerical features, which consider selected physicochemical properties of amino acids, sequence complexity, and amino acid composition, as its inputs. Empirical tests show that SLIDER offers competitive predictive performance combined with low computational cost. It outperforms, by at least a modest margin, a comprehensive set of modern disorder predictors (that can indirectly predict LDRs) and is 16 times faster compared to the best currently available disorder predictor. Utilizing our time-efficient predictor, we characterized abundance and functional roles of proteins with LDRs over 110 eukaryotic proteomes. Similar to related studies, we found that eukaryotes have many (on average 30.3%) proteins with LDRs with majority of proteomes having between 25 and 40%, where higher abundance is characteristic to proteomes that have larger proteins. Our first-of-its-kind large-scale functional analysis shows that these proteins are enriched in a number of cellular functions and processes including certain binding events, regulation of catalytic activities, cellular component organization, biogenesis, biological regulation, and some metabolic and developmental processes. A webserver that implements SLIDER is available at http://biomine.ece.ualberta.ca/SLIDER/. Copyright © 2013 Wiley Periodicals, Inc.
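    The LDR definition quoted in this abstract (30 or more consecutive disordered residues) can be illustrated with a short sketch. This is not SLIDER itself, which predicts LDR-containing proteins from sequence-derived features with logistic regression. The function below only applies the length threshold to a per-residue disorder annotation that is assumed to be available already. The function name and the 'D'/'O' string encoding (disordered/ordered) are hypothetical choices made for the example.

```python
from itertools import groupby


def has_long_disordered_region(annotation: str, min_length: int = 30) -> bool:
    """Return True if `annotation` holds a run of >= `min_length` 'D' characters.

    `annotation` is a hypothetical per-residue string: 'D' marks a
    disordered residue, any other character marks an ordered one.
    """
    return any(
        label == "D" and sum(1 for _ in run) >= min_length
        for label, run in groupby(annotation)
    )


# A protein with one 30-residue disordered stretch meets the threshold.
example = "O" * 10 + "D" * 30 + "O" * 5
print(has_long_disordered_region(example))
```

    Two disordered stretches of 29 residues separated by a single ordered residue would not count under this rule, because the threshold applies to consecutive residues only.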

  18. Identification of nine genomic regions of amplification in urothelial carcinoma, correlation with stage, and potential prognostic and therapeutic value.

    Directory of Open Access Journals (Sweden)

    Yvonne Chekaluk

    Full Text Available We performed a genome-wide analysis of 164 urothelial carcinoma samples and 27 bladder cancer cell lines to identify copy number changes associated with disease characteristics, and examined the association of amplification events with stage and grade of disease. Multiplex inversion probe (MIP) analysis, a recently developed genomic technique, was used to study 80 urothelial carcinomas to identify mutations and copy number changes. Selected amplification events were then analyzed in a validation cohort of 84 bladder cancers by multiplex ligation-dependent probe assay (MLPA). In the MIP analysis, 44 regions of significant copy number change were identified using GISTIC. Nine gene-containing regions of amplification were selected for validation in the second cohort by MLPA. Amplification events at these 9 genomic regions were found to correlate strongly with stage, being seen in only 2 of 23 (9%) Ta grade 1 or 1-2 cancers, in contrast to 31 of 61 (51%) Ta grade 3 and T2 grade 2 cancers, p<0.001. These observations suggest that analysis of genomic amplification of these 9 regions might help distinguish non-invasive from invasive urothelial carcinoma, although further study is required. Both MIP and MLPA methods perform well on formalin-fixed paraffin-embedded DNA, enhancing their potential clinical use. Furthermore, several of the amplified genes identified here (ERBB2, MDM2, CCND1) are potential therapeutic targets.

  19. Draft genome sequences of three virulent Streptococcus thermophilus bacteriophages isolated from the dairy environment in the Veneto region of Italy

    DEFF Research Database (Denmark)

    Duarte, Vinícius da Silva; Giaretta, Sabrina; Treu, Laura

    2018-01-01

    Streptococcus thermophilus, a very important dairy species, is constantly threatened by phage infection. We report the genome sequences of three S. thermophilus bacteriophages isolated from a dairy environment in the Veneto region of Italy. These sequences will be used for the development of new...

  20. Genomic Mapping of Human DNA provides Evidence of Difference in Stretch between AT and GC rich regions

    Science.gov (United States)

    Reifenberger, Jeffrey; Dorfman, Kevin; Cao, Han

    Human DNA is not a polymer consisting of a uniform distribution of all 4 nucleic acids, but rather contains regions of high AT and high GC content. When confined, these regions could have different stretch due to the extra hydrogen bond present in the GC basepair. To measure this potential difference, human genomic DNA was nicked with NtBspQI, labeled with a Cy3-like fluorophore at the nick site, stained with YOYO, loaded into a device containing an array of nanochannels, and imaged. Over 473,000 individual molecules of DNA, corresponding to roughly 30x coverage of a human genome, were collected and aligned to the human reference. Based on the known AT/GC content between aligned pairs of labels, the stretch was measured for regions of similar size but different AT/GC content. We found that regions of high GC content were consistently more stretched than regions of high AT content between pairs of labels varying in size between 2.5 kbp and 500 kbp. We measured that for every 1% increase in GC content there was roughly a 0.06% increase in stretch. While this effect is small, it is important to take into account differences in stretch between AT and GC rich regions to improve the sensitivity of detection of structural variations from genomic variations. NIH Grant: R01-HG006851.
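    The relation reported in this abstract (roughly 0.06% more stretch for every 1% more GC content) lends itself to a small worked example. The sketch below computes the GC percentage of two sequences and applies that slope as a simple linear rule. The linear extrapolation, the function names and the constant name are assumptions made for illustration and are not taken from the study's analysis pipeline.

```python
# Slope quoted in the abstract: ~0.06% extra stretch per 1% extra GC content.
STRETCH_PER_GC_PERCENT = 0.06


def gc_percent(sequence: str) -> float:
    """Percentage of G and C bases in a DNA sequence (case-insensitive)."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def relative_stretch_difference(seq_a: str, seq_b: str) -> float:
    """Estimated stretch of seq_a minus stretch of seq_b, in percent.

    Assumes the reported slope holds linearly over the whole GC range,
    which the abstract does not claim.
    """
    return STRETCH_PER_GC_PERCENT * (gc_percent(seq_a) - gc_percent(seq_b))


# A region with 60% GC versus one with 40% GC: 20 points of GC difference.
high_gc = "GC" * 30 + "AT" * 20
low_gc = "GC" * 20 + "AT" * 30
print(round(relative_stretch_difference(high_gc, low_gc), 2))
```

    Under this assumption a 20-point difference in GC content corresponds to a stretch difference of about 1.2%, which shows why the effect is small for any single interval yet still relevant when aligning long molecules.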

  1. Draft genome sequence of bitter gourd (Momordica charantia), a vegetable and medicinal plant in tropical and subtropical regions.

    Science.gov (United States)

    Urasaki, Naoya; Takagi, Hiroki; Natsume, Satoshi; Uemura, Aiko; Taniai, Naoki; Miyagi, Norimichi; Fukushima, Mai; Suzuki, Shouta; Tarora, Kazuhiko; Tamaki, Moritoshi; Sakamoto, Moriaki; Terauchi, Ryohei; Matsumura, Hideo

    2017-02-01

    Bitter gourd (Momordica charantia) is an important vegetable and medicinal plant in tropical and subtropical regions globally. In this study, the draft genome sequence of a monoecious bitter gourd inbred line, OHB3-1, was analyzed. Through Illumina sequencing and de novo assembly, scaffolds of 285.5 Mb in length were generated, corresponding to ∼84% of the estimated genome size of bitter gourd (339 Mb). In this draft genome sequence, 45,859 protein-coding gene loci were identified, and transposable elements accounted for 15.3% of the whole genome. According to synteny mapping and phylogenetic analysis of conserved genes, bitter gourd was more related to watermelon (Citrullus lanatus) than to cucumber (Cucumis sativus) or melon (C. melo). Using RAD-seq analysis, 1507 marker loci were genotyped in an F2 progeny of two bitter gourd lines, resulting in an improved linkage map, comprising 11 linkage groups. By anchoring RAD tag markers, 255 scaffolds were assigned to the linkage map. Comparative analysis of genome sequences and predicted genes determined that putative trypsin-inhibitor and ribosome-inactivating genes were distinctive in the bitter gourd genome. These genes could characterize the bitter gourd as a medicinal plant. © The Author 2016. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.

  2. Whole genome sequencing and annotation of halophilic Salinicoccus sp. BAB 3246 isolated from the coastal region of Gujarat

    Directory of Open Access Journals (Sweden)

    Vishal Mevada

    2017-09-01

    Full Text Available Salinicoccus sp. BAB 3246 is a halophilic bacterium isolated from a marine water sample collected from the coastal region of Gujarat, India, from a surface water stream. Based on 16S rRNA sequencing, the organism was identified as Salinicoccus sp. BAB 3246 (GenBank ID: KF889285). The present work was performed to determine the whole genome sequence of the organism using the Ion Torrent PGM platform followed by assembly using the CLC genomics workbench and genome annotation using RAST, BASys and MaGe. The complete genome sequence was 713,204 bp, the second largest size for Salinicoccus sp. reported in the NCBI genome database. A total of 652 degradative pathways were identified by KEGG map analysis. Comparative genomic analysis revealed Salinicoccus sp. BAB 3246 as most highly related to Salinicoccus halodurans H3B36. Data mining identified stress response genes and operator pathway for degradation of various environmental pollutants. Annotation data and analysis indicate potential use in pollution control in industrial influent and saline environment.

  3. CpG islands or CpG clusters: how to identify functional GC-rich regions in a genome?

    Directory of Open Access Journals (Sweden)

    Han Leng

    2009-02-01

    Full Text Available Abstract Background CpG islands (CGIs), clusters of CpG dinucleotides in GC-rich regions, are often located in the 5' end of genes and considered gene markers. Hackenberg et al. (2006) recently developed a new algorithm, CpGcluster, which uses a completely different mathematical approach from previous traditional algorithms. Their evaluation suggests that CpGcluster provides a much more efficient approach to detecting functional clusters or islands of CpGs. Results We systematically compared CpGcluster with the traditional algorithm by Takai and Jones (2002). Our comparisons of (1) the number of islands versus the number of genes in a genome, (2) the distribution of islands in different genomic regions, (3) island length, (4) the distance between two neighboring islands, and (5) methylation status suggest that Takai and Jones' algorithm is overall more appropriate for identifying promoter-associated islands of CpGs in vertebrate genomes. Conclusion The generation of genome sequence and DNA methylation data is expected to accelerate greatly. The information in this study is important for its extensive utility in gene feature analysis and epigenomics including gene prediction and methylation chip design in different genomes.
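    For readers unfamiliar with the traditional approach mentioned in this abstract, the sketch below checks a single candidate region against the thresholds commonly cited for the Takai and Jones (2002) criteria: length of at least 500 bp, GC content of at least 55%, and an observed/expected CpG ratio of at least 0.65. These thresholds are not stated in the abstract and are supplied here as background. The published algorithm also scans the genome with sliding windows and merges hits, which this sketch does not attempt. All function names are hypothetical.

```python
def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in an uppercase DNA sequence."""
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: (#CpG * N) / (#C * #G)."""
    c_count, g_count = seq.count("C"), seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c_count * g_count)


def meets_takai_jones_thresholds(sequence: str) -> bool:
    """Apply the three thresholds to one candidate region (no window scan)."""
    seq = sequence.upper()
    return (
        len(seq) >= 500
        and gc_percent(seq) >= 55.0
        and cpg_obs_exp(seq) >= 0.65
    )


# Synthetic 500 bp region made only of CpG dinucleotides.
print(meets_takai_jones_thresholds("CG" * 250))
```

    A region that is GC-rich but depleted of CpG dinucleotides fails the observed/expected test even when it passes the other two thresholds.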

  4. Genomic variability of Helicobacter pylori isolates of gastric regions from two Colombian populations.

    Science.gov (United States)

    Matta, Andrés Jenuer; Pazos, Alvaro Jairo; Bustamante-Rengifo, Javier Andrés; Bravo, Luis Eduardo

    2017-02-07

    To compare the genomic variability and the multiple colonization of Helicobacter pylori (H. pylori) in patients with chronic gastritis from two Colombian populations with contrast in the risk of developing gastric cancer (GC): Túquerres-Nariño (High risk) and Tumaco-Nariño (Low risk). Four hundred and nine patients from both genders with dyspeptic symptoms were studied. Seventy-two patients were included in whom H. pylori was isolated from three anatomic regions of the gastric mucosa, (31/206) of the high risk population of GC (Túquerres) and (41/203) of the low risk population of GC (Tumaco). The isolates were genotyped by PCR-RAPD. Genetic diversity between the isolates was evaluated by conglomerates analysis and multiple correspondence analyses. The proportion of virulent genotypes of H. pylori was 99% in Túquerres and 94% in Tumaco. The coefficient of similarity of Nei-Li showed greater genetic diversity among isolates of Túquerres (0.13) than those of Tumaco (0.07). After adjusting by age, gender and type of gastritis, the multiple colonization was 1.7 times more frequent in Túquerres than in Tumaco (P = 0.05). In Túquerres, high risk of GC there was a greater probability of multiple colonization by H. pylori. From the analysis of the results of the PCR-RAPD, it was found higher genetic variability in the isolates of H. pylori in the population of high risk for the development of GC.

  5. Genomic variability of Helicobacter pylori isolates of gastric regions from two Colombian populations

    Science.gov (United States)

    Matta, Andrés Jenuer; Pazos, Alvaro Jairo; Bustamante-Rengifo, Javier Andrés; Bravo, Luis Eduardo

    2017-01-01

    AIM To compare the genomic variability and the multiple colonization of Helicobacter pylori (H. pylori) in patients with chronic gastritis from two Colombian populations with contrast in the risk of developing gastric cancer (GC): Túquerres-Nariño (High risk) and Tumaco-Nariño (Low risk). METHODS Four hundred and nine patients from both genders with dyspeptic symptoms were studied. Seventy-two patients were included in whom H. pylori was isolated from three anatomic regions of the gastric mucosa, (31/206) of the high risk population of GC (Túquerres) and (41/203) of the low risk population of GC (Tumaco). The isolates were genotyped by PCR-RAPD. Genetic diversity between the isolates was evaluated by conglomerates analysis and multiple correspondence analyses. RESULTS The proportion of virulent genotypes of H. pylori was 99% in Túquerres and 94% in Tumaco. The coefficient of similarity of Nei-Li showed greater genetic diversity among isolates of Túquerres (0.13) than those of Tumaco (0.07). After adjusting by age, gender and type of gastritis, the multiple colonization was 1.7 times more frequent in Túquerres than in Tumaco (P = 0.05). CONCLUSION In Túquerres, high risk of GC there was a greater probability of multiple colonization by H. pylori. From the analysis of the results of the PCR-RAPD, it was found higher genetic variability in the isolates of H. pylori in the population of high risk for the development of GC. PMID:28223724
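    The Nei-Li coefficient named in this abstract is, in its standard form, twice the number of bands shared by two fingerprint profiles divided by the total number of bands in both. The sketch below applies that standard formula to two sets of band sizes. The abstract does not describe how the RAPD profiles were scored, so the set-based representation, the function name and the example band sizes are all hypothetical.

```python
def nei_li_similarity(bands_a: set, bands_b: set) -> float:
    """Nei-Li (Dice) similarity: 2 * shared bands / (bands in A + bands in B)."""
    total = len(bands_a) + len(bands_b)
    if total == 0:
        raise ValueError("at least one profile must contain a band")
    return 2 * len(bands_a & bands_b) / total


# Hypothetical RAPD band sizes (bp) for two isolates; two bands are shared.
isolate_1 = {300, 550, 800}
isolate_2 = {550, 800, 1200}
print(round(nei_li_similarity(isolate_1, isolate_2), 3))
```

    The coefficient ranges from 0 for profiles with no bands in common to 1 for identical profiles.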

  6. KMT2A and KMT2B Mediate Memory Function by Affecting Distinct Genomic Regions

    Directory of Open Access Journals (Sweden)

    Cemil Kerimoglu

    2017-07-01

    Full Text Available Kmt2a and Kmt2b are H3K4 methyltransferases of the Set1/Trithorax class. We have recently shown the importance of Kmt2b for learning and memory. Here, we report that Kmt2a is also important in memory formation. We compare the decrease in H3K4 methylation and de-regulation of gene expression in hippocampal neurons of mice with knockdown of either Kmt2a or Kmt2b. Kmt2a and Kmt2b control largely distinct genomic regions and different molecular pathways linked to neuronal plasticity. Finally, we show that the decrease in H3K4 methylation resulting from Kmt2a knockdown partially recapitulates the pattern previously reported in CK-p25 mice, a model for neurodegeneration and memory impairment. Our findings point to the distinct functions of even closely related histone-modifying enzymes and provide essential insight for the development of more efficient and specific epigenetic therapies against brain diseases.

  7. MHC class I-associated peptides derive from selective regions of the human genome.

    Science.gov (United States)

    Pearson, Hillary; Daouda, Tariq; Granados, Diana Paola; Durette, Chantal; Bonneil, Eric; Courcelles, Mathieu; Rodenbrock, Anja; Laverdure, Jean-Philippe; Côté, Caroline; Mader, Sylvie; Lemieux, Sébastien; Thibault, Pierre; Perreault, Claude

    2016-12-01

    MHC class I-associated peptides (MAPs) define the immune self for CD8+ T lymphocytes and are key targets of cancer immunosurveillance. Here, the goals of our work were to determine whether the entire set of protein-coding genes could generate MAPs and whether specific features influence the ability of discrete genes to generate MAPs. Using proteogenomics, we have identified 25,270 MAPs isolated from the B lymphocytes of 18 individuals who collectively expressed 27 high-frequency HLA-A,B allotypes. The entire MAP repertoire presented by these 27 allotypes covered only 10% of the exomic sequences expressed in B lymphocytes. Indeed, 41% of expressed protein-coding genes generated no MAPs, while 59% of genes generated up to 64 MAPs, often derived from adjacent regions and presented by different allotypes. We next identified several features of transcripts and proteins associated with efficient MAP production. From these data, we built a logistic regression model that predicts with good accuracy whether a gene generates MAPs. Our results show preferential selection of MAPs from a limited repertoire of proteins with distinctive features. The notion that the MHC class I immunopeptidome presents only a small fraction of the protein-coding genome for monitoring by the immune system has profound implications in autoimmunity and cancer immunology.

  8. Simple protocol for population (Sanger) sequencing for Zika virus genomic regions.

    Science.gov (United States)

    Cabral, Gabriela Bastos; Ferreira, João Leandro de Paula; Souza, Renato Pereira de; Cunha, Mariana Sequetin; Luchs, Adriana; Figueiredo, Cristina Adelaide; Brígido, Luís Fernando de Macedo

    2018-01-01

    A number of Zika virus (ZIKV) sequences have been obtained using next-generation sequencing (NGS), a methodology widely applied in genetic diversity studies and virome discovery. However, the Sanger method is still a robust, affordable, rapid and specific tool to obtain valuable sequences. The aim of this study was to develop a simple and robust Sanger sequencing protocol targeting relevant ZIKV genetic regions, such as the envelope protein and nonstructural protein 5 (NS5). In addition, phylogenetic analysis of the ZIKV strains obtained using the present protocol and their comparison with previously published NGS sequences were also carried out. Six Vero cell isolates from serum and one urine sample were available to develop the procedure. Primer sets were designed in order to conduct nested RT-PCR and Sanger sequencing protocols. Bayesian analysis was used to infer phylogenetic relationships. Seven complete ZIKV envelope protein (1.571 kb) and six partial NS5 (0.798 kb) sequences were obtained using the protocol, with no amplification of the NS5 gene from the urine sample. Two NS5 sequences presented ambiguities at positions 495 and 196. Nucleotide analysis of a Sanger sequence and the consensus sequence of a previous NGS study revealed 100% identity. ZIKV strains described here clustered within the Asian lineage. The present study provided a simple and low-cost Sanger protocol to sequence relevant genes of the ZIKV genome. The identity of Sanger-generated sequences with published consensus NGS sequences supports the use of the Sanger method for ZIKV population studies. The regions evaluated were able to provide robust phylogenetic signals and may be used to conduct molecular epidemiological studies and monitor viral evolution.

  9. Allelic variation in a willow warbler genomic region is associated with climate clines.

    Directory of Open Access Journals (Sweden)

    Keith W Larson

    Full Text Available Local adaptation is an important process contributing to population differentiation which can occur in continuous or isolated populations connected by various amounts of gene flow. The willow warbler (Phylloscopus trochilus) is one of the most common songbirds in Fennoscandia. It has a continuous breeding distribution where it is found in all forested habitats from sea level to the tree line and therefore constitutes an ideal species for the study of locally adapted genes associated with environmental gradients. Previous studies in this species identified a genetic marker (AFLP-WW1) that showed a steep north-south cline in central Sweden with one allele associated with coastal lowland habitats and the other with mountainous habitats. It was further demonstrated that this marker is embedded in a highly differentiated chromosome region that spans several megabases. In the present study, we sampled 2,355 individuals at 128 sites across all of Fennoscandia to study the geographic and climatic variables associated with the allele frequency distributions of WW1. Our results demonstrate that (1) allele frequency patterns significantly differ between mountain and lowland populations, (2) these allele differences coincide with extreme temperature conditions and the short growing season in the mountains, and milder conditions in coastal areas, and (3) the northern-allele or "altitude variant" of WW1 occurs in willow warblers that occupy mountainous habitat regardless of subspecies. Finally, these results suggest that climate may exert selection on the genomic region associated with these alleles and would allow us to develop testable predictions for the distribution of the genetic marker based on climate change scenarios.

  10. Genomic regions and genes related to inter-population differences in body size in the ground beetle Carabus japonicus.

    Science.gov (United States)

    Komurai, Ryohei; Fujisawa, Tomochika; Okuzaki, Yutaka; Sota, Teiji

    2017-08-10

    Body size is a key trait in diversification among animal species, and revealing the gene regions responsible for body size diversification among populations or related species is important in evolutionary biology. We explored the genomic regions associated with body size differences in Carabus japonicus ground beetle populations by quantitative trait locus (QTL) mapping of F2 hybrids from differently sized parents from two populations using restriction site-associated DNA sequencing and de novo assembly of the beetle whole genome. The assembled genome had a total length of 191 Mb with a scaffold N50 of 0.73 Mb; 14,929 protein-coding genes were predicted. Three QTLs on different linkage groups had major effects on the overall size, which is composed chiefly of elytral length. In addition, we found QTLs on autosomal and X chromosomal linkage groups that affected head length and width, thoracic width, and elytral width. We determined the gene loci potentially related to control of body size in scaffolds of the genome sequence, which contained the QTL regions. The genetic basis of body size variation based on a small number of major loci would promote differentiation in body size in response to selection pressures related to variations in environmental conditions and inter-specific interactions.
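
    The scaffold N50 quoted in this record is a standard assembly statistic: the length of the scaffold at which the running total of lengths, taken from longest to shortest, first reaches half of the assembly size. A minimal sketch of that calculation follows; the function name and the toy lengths are hypothetical and are unrelated to the beetle assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths.

    Scaffolds are visited from longest to shortest; the N50 is the length
    of the scaffold at which the cumulative length first reaches half of
    the total assembly length.
    """
    if not lengths:
        raise ValueError("no scaffold lengths given")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy scaffold lengths in arbitrary units, for illustration only.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
toy_n50 = n50(toy_lengths)
```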

  11. The "enemies within": regions of the genome that are inherently difficult to replicate [version 1; referees: 2 approved]

    Directory of Open Access Journals (Sweden)

    Rahul Bhowmick

    2017-05-01

    Full Text Available An unusual feature of many eukaryotic genomes is the presence of regions that appear intrinsically difficult to copy during the process of DNA replication. Curiously, the location of these difficult-to-replicate regions is often conserved between species, implying a valuable role in some aspect of genome organization or maintenance. The most prominent class of these regions in mammalian cells is defined as chromosome fragile sites, which acquired their name because of a propensity to form visible gaps/breaks on otherwise-condensed chromosomes in mitosis. This fragility is particularly apparent following perturbation of DNA replication—a phenomenon often referred to as “replication stress”. Here, we review recent data on the molecular basis for chromosome fragility and the role of fragile sites in the etiology of cancer. In particular, we highlight how studies on fragile sites have provided unexpected insights into how the DNA repair machinery assists in the completion of DNA replication.

  12. Combination of native and denaturing PAGE for the detection of protein binding regions in long fragments of genomic DNA

    Directory of Open Access Journals (Sweden)

    Metsis Madis

    2008-06-01

    Full Text Available Abstract Background In a traditional electrophoresis mobility shift assay (EMSA), a 32P-labeled double-stranded DNA oligonucleotide or a restriction fragment bound to a protein is separated from the unbound DNA by polyacrylamide gel electrophoresis (PAGE) in nondenaturing conditions. An extension of this method uses the large population of fragments derived from long genomic regions (approximately 600 kb) for the identification of fragments containing protein binding regions. With this method, genomic DNA is fragmented by restriction enzymes, fragments are amplified by PCR, radiolabeled, incubated with nuclear proteins and the resulting DNA-protein complexes are separated by two-dimensional PAGE. Shifted DNA fragments containing protein binding sites are identified by using additional procedures, i.e. gel elution, PCR amplification, cloning and sequencing. Although the method allows simultaneous analysis of a large population of fragments, it is relatively laborious and can be used to detect only high affinity protein binding sites. Here we propose an alternative and straightforward strategy which is based on a combination of native and denaturing PAGE. This strategy allows the identification of DNA fragments containing low as well as high affinity protein binding regions, derived from genomic DNA ( Results We have combined an EMSA-based selection step with subsequent denaturing PAGE for the localization of protein binding regions in long (up to 10 kb) fragments of genomic DNA. Our strategy consists of the following steps: digestion of genomic DNA with a 4-cutter restriction enzyme (AluI, BsuRI, TruI, etc.), separation of low and high molecular weight fractions of resultant DNA fragments, 32P-labeling with Klenow polymerase, traditional EMSA, gel elution and identification of the shifted bands (or smear) by denaturing PAGE.
The identification of DNA fragments containing protein binding sites is carried out by running the gel-eluted fragments alongside

  13. Self-Confirmation and Ascertainment of the Candidate Genomic Regions of Complex Trait Loci - A None-Experimental Solution.

    Directory of Open Access Journals (Sweden)

    Lishi Wang

    Full Text Available Over the past half century, thousands of quantitative trait loci (QTL) have been identified by using animal models and plant populations. However, the unreliability and imprecision of the genomic regions of these loci have remained the major hurdle for the identification of the causal genes for the corresponding traits. We used a non-experimental strategy of strain number reduction for testing accuracy and ascertainment of the candidate region for QTL. We tested the strategy in over 400 analyses with data from 47 studies. These studies include: (1) studies with recombinant inbred (RI) strains of mice: we first tested two previously mapped QTL with well-defined genomic regions, then tested four additional studies with known QTL regions, and finally examined the reliability of QTL in 38 sets of data produced from relatively large numbers of RI strains derived from C57BL/6J (B6) X DBA/2J (D2), known as BXD RI mouse strains; (2) studies with RI strains of rats and plants; and (3) studies using F2 populations in mice, rats and plants. In these cases, our method identified the reliability of mapped QTL and localized the candidate genes into the defined genomic regions. Our data also suggest that the LRS score produced by permutation tests does not necessarily confirm the reliability of the QTL. The number of strains is not a reliable indicator of the accuracy of QTL either. Our strategy determines the reliability and accuracy of the genomic region of a QTL without any additional experimental study such as congenic breeding.

  14. Microsatellite markers for evaluating the diversity of the natural killer complex and major histocompatibility complex genomic regions in domestic horses.

    Science.gov (United States)

    Horecky, C; Horecka, E; Futas, J; Janova, E; Horin, P; Knoll, A

    2018-04-01

    Genotyping microsatellite markers represents a standard, relatively easy, and inexpensive method of assessing genetic diversity of complex genomic regions in various animal species, such as the major histocompatibility complex (MHC) and/or natural killer cell receptor (NKR) genes. MHC-linked microsatellite markers have been identified and some of them were used for characterizing MHC polymorphism in various species, including horses. However, most of those were MHC class II markers, while MHC class I and III sub-regions were less well covered. No tools for studying genetic diversity of NKR complex genomic regions are available in horses. Therefore, the aims of this work were to establish a panel of markers suitable for analyzing genetic diversity of the natural killer complex (NKC), and to develop additional microsatellite markers of the MHC class I and class III genomic sub-regions in horses. Nine polymorphic microsatellite loci were newly identified in the equine NKC. Along with two previously reported microsatellites flanking this region, they constituted a panel of 11 loci allowing to characterize genetic variation in this functionally important part of the horse genome. Four newly described MHC class I/III-linked markers were added to 11 known microsatellites to establish a panel of 15 MHC markers with a better coverage of the class I and class III sub-regions. Major characteristics of the two panels produced on a group of 65 horses of 13 breeds and on five Przewalski's horses showed that they do reflect genetic variation within the horse species. © 2018 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  15. Identification and validation of genomic regions influencing kernel zinc and iron in maize.

    Science.gov (United States)

    Hindu, Vemuri; Palacios-Rojas, Natalia; Babu, Raman; Suwarno, Willy B; Rashid, Zerka; Usha, Rayalcheruvu; Saykhedkar, Gajanan R; Nair, Sudha K

    2018-03-24

    Genome-wide association study (GWAS) on 923 maize lines and validation in bi-parental populations identified significant genomic regions for kernel-Zinc and -Iron in maize. Bio-fortification of maize with elevated Zinc (Zn) and Iron (Fe) holds considerable promise for alleviating under-nutrition among the world's poor. Bio-fortification through molecular breeding could be an economical strategy for developing nutritious maize, and hence in this study, we adopted GWAS to identify markers associated with high kernel-Zn and Fe in maize and subsequently validated marker-trait associations in independent bi-parental populations. For GWAS, we evaluated a diverse maize association mapping panel of 923 inbred lines across three environments and detected trait associations using high-density single nucleotide polymorphisms (SNPs) obtained through genotyping-by-sequencing. Phenotyping trials of the GWAS panel showed high heritability and moderate correlation between kernel-Zn and Fe concentrations. GWAS revealed a total of 46 SNPs (Zn-20 and Fe-26) significantly associated (P ≤ 5.03 × 10⁻⁵) with kernel-Zn and Fe concentrations, with some of these associated SNPs located within previously reported QTL intervals for these traits. Three double-haploid (DH) populations were developed using lines identified from the panel that were contrasting for these micronutrients. The DH populations were phenotyped at two environments and were used for validating significant SNPs (P ≤ 1 × 10⁻³) based on single marker QTL analysis. Based on this analysis, 11 (Zn) and 11 (Fe) SNPs were found to have significant effect on the trait variance (P ≤ 0.01, R² ≥ 0.05) in at least one bi-parental population. These findings are being pursued in the kernel-Zn and Fe breeding program, and could hold great value in functional analysis and possible cloning of high-value genes for these traits in maize.

  16. A genomic region of lactococcal temperate bacteriophage TP901-1 encoding major virion proteins

    DEFF Research Database (Denmark)

    Johnsen, Mads G.; Appel, Karen Fuglede; Madsen, Hans Peter Lynge

    1996-01-01

    Two major structural proteins, MHP (major head protein) and MTP (major tail protein), from the lactococcal temperate phage TP901-1 were sequenced at their amino acid termini, and derived degenerate oligonucleotides were used to locate the corresponding genes in the phage genome. This genomic regi...

  17. Identifying selected regions from heterozygosity and divergence using a light-coverage genomic dataset from two human populations.

    Directory of Open Access Journals (Sweden)

    Taras K Oleksyk

    2008-03-01

    Full Text Available When a selective sweep occurs in the chromosomal region around a target gene in two populations that have recently separated, it produces three dramatic genomic consequences: (1) decreased multi-locus heterozygosity in the region; (2) elevated or diminished genetic divergence (F(ST)) of multiple polymorphic variants adjacent to the selected locus between the divergent populations, due to the alternative fixation of alleles; and (3) a consequent regional increase in the variance of F(ST) (S(2)F(ST)) for the same clustered variants, due to the increased alternative fixation of alleles in the loci surrounding the selection target. In the first part of our study, to search for potential targets of directional selection, we developed and validated a resampling-based computational approach; we then scanned an array of 31 different-sized moving windows of SNP variants (5-65 SNPs) across the human genome in a set of European and African American population samples with 183,997 SNP loci after correcting for the recombination rate variation. The analysis revealed 180 regions of recent selection with very strong evidence in either population or both. In the second part of our study, we compared the newly discovered putative regions to those sites previously postulated in the literature, using methods based on inspecting patterns of linkage disequilibrium, population divergence and other methodologies. The newly found regions were cross-validated with those found in nine other studies that have searched for selection signals. Our study was replicated especially well in those regions confirmed by three or more studies. These validated regions were independently verified, using a combination of different methods and different databases in other studies, and should include fewer false positives.
The main strength of our analysis method compared to others is that it does not require dense genotyping and therefore can be used with data from population-based genome SNP scans
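
    The first consequence listed in this record, reduced multi-locus heterozygosity around a selected locus, can be illustrated with a moving-window average of per-SNP expected heterozygosity, 2p(1 - p). The sketch below shows only that windowing idea and is not the authors' resampling-based procedure; the function names, window size and allele frequencies are hypothetical.

```python
def expected_heterozygosity(p):
    """Expected heterozygosity of a biallelic SNP with allele frequency p."""
    return 2 * p * (1 - p)


def window_means(values, size):
    """Mean of every run of `size` consecutive values (moving window)."""
    if size < 1 or size > len(values):
        raise ValueError("window size must be between 1 and len(values)")
    total = sum(values[:size])
    means = [total / size]
    for i in range(size, len(values)):
        # Slide the window: add the entering value, drop the leaving one.
        total += values[i] - values[i - size]
        means.append(total / size)
    return means


# Hypothetical allele frequencies along a chromosome, for illustration only.
freqs = [0.5, 0.5, 0.0, 0.0, 1.0, 0.5]
het = [expected_heterozygosity(p) for p in freqs]
means = window_means(het, 3)
lowest_window = means.index(min(means))
```

    In this toy input the window covering the three fixed SNPs has the lowest mean, which is the kind of local dip a scan for sweeps would flag for further testing.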

  18. Detection of selection signatures of population-specific genomic regions selected during domestication process in Jinhua pigs.

    Science.gov (United States)

    Li, Zhengcao; Chen, Jiucheng; Wang, Zhen; Pan, Yuchun; Wang, Qishan; Xu, Ningying; Wang, Zhengguang

    2016-12-01

    Chinese pigs have been undergoing both natural and artificial selection for thousands of years. Jinhua pigs are of great importance, as they can be a valuable model for exploring the genetic mechanisms linked to meat quality and other traits such as disease resistance, reproduction and production. The purpose of this study was to identify distinctive footprints of selection between Jinhua pigs and other breeds utilizing genome-wide SNP data. Genotyping by genome reducing and sequencing was implemented in order to perform cross-population extended haplotype homozygosity to reveal strong signatures of selection for those economically important traits. This work was performed at a 2% genome level, which comprised 152 006 SNPs genotyped in a total of 517 individuals. Population-specific footprints of selective sweeps were searched for in the genome of Jinhua pigs using six native breeds and three European breeds as reference groups. Several candidate genes associated with meat quality, health and reproduction, such as GH1, CRHR2, TRAF4 and CCK, were found to be overlapping with the significantly positive outliers. Additionally, the results revealed that some genomic regions associated with meat quality, immune response and reproduction in Jinhua pigs have evolved directionally under domestication and subsequent selections. The identified genes and biological pathways in Jinhua pigs showed different selection patterns in comparison with the Chinese and European breeds. © 2016 Stichting International Foundation for Animal Genetics.

  19. Genome Regions Associated with Functional Performance of Soybean Stem Fibers in Polypropylene Thermoplastic Composites.

    Science.gov (United States)

    Reinprecht, Yarmilla; Arif, Muhammad; Simon, Leonardo C; Pauls, K Peter

    2015-01-01

    Plant fibers can be used to produce composite materials for automobile parts, thus reducing plastic used in their manufacture, overall vehicle weight and fuel consumption when they replace mineral fillers and glass fibers. Soybean stem residues are, potentially, significant sources of inexpensive, renewable and biodegradable natural fibers, but are not currently used for biocomposite production due to the functional properties of their fibers in composites being unknown. The current study was initiated to investigate the effects of plant genotype on the performance characteristics of soybean stem fibers when incorporated into a polypropylene (PP) matrix using a selective phenotyping approach. Fibers from 50 lines of a recombinant inbred line population (169 RILs) grown in different environments were incorporated into PP at 20% (wt/wt) by extrusion. Test samples were injection molded and characterized for their mechanical properties. The performance of stem fibers in the composites was significantly affected by genotype and environment. Fibers from different genotypes had significantly different chemical compositions, thus composites prepared with these fibers displayed different physical properties. This study demonstrates that thermoplastic composites with soybean stem-derived fibers have mechanical properties that are equivalent to or better than those of wheat straw fiber composites currently being used for manufacturing interior automotive parts. The addition of soybean stem residues improved flexural, tensile and impact properties of the composites. Furthermore, by linkage and in silico mapping we identified genomic regions to which quantitative trait loci (QTL) for compositional and functional properties of soybean stem fibers in thermoplastic composites, as well as genes for cell wall synthesis, were co-localized. These results may lead to the development of high value uses for soybean stem residue.

  20. Genes involved in complex adaptive processes tend to have highly conserved upstream regions in mammalian genomes

    Directory of Open Access Journals (Sweden)

    Kohane Isaac

    2005-11-01

    Full Text Available Abstract Background Recent advances in genome sequencing suggest a remarkable conservation in gene content of mammalian organisms. The similarity in gene repertoire present in different organisms has increased interest in studying regulatory mechanisms of gene expression aimed at elucidating the differences in phenotypes. In particular, a proximal promoter region contains a large number of regulatory elements that control the expression of its downstream gene. Although many studies have focused on identification of these elements, a broader picture on the complexity of transcriptional regulation of different biological processes has not been addressed in mammals. The regulatory complexity may strongly correlate with gene function, as different evolutionary forces must act on the regulatory systems under different biological conditions. We investigate this hypothesis by comparing the conservation of promoters upstream of genes classified in different functional categories. Results By conducting a rank correlation analysis between functional annotation and upstream sequence alignment scores obtained by human-mouse and human-dog comparison, we found a significantly greater conservation of the upstream sequence of genes involved in development, cell communication, neural functions and signaling processes than those involved in more basic processes shared with unicellular organisms such as metabolism and ribosomal function. This observation persists after controlling for G+C content. Considering conservation as a functional signature, we hypothesize a higher density of cis-regulatory elements upstream of genes participating in complex and adaptive processes. Conclusion We identified a class of functions that are associated with either high or low promoter conservation in mammals. 
We detected a significant tendency for complex and adaptive processes to be associated with higher promoter conservation, despite the fact that they have emerged

  1. Genome Regions Associated with Functional Performance of Soybean Stem Fibers in Polypropylene Thermoplastic Composites.

    Directory of Open Access Journals (Sweden)

    Yarmilla Reinprecht

    Full Text Available Plant fibers can be used to produce composite materials for automobile parts, thus reducing plastic used in their manufacture, overall vehicle weight and fuel consumption when they replace mineral fillers and glass fibers. Soybean stem residues are, potentially, significant sources of inexpensive, renewable and biodegradable natural fibers, but are not currently used for biocomposite production due to the functional properties of their fibers in composites being unknown. The current study was initiated to investigate the effects of plant genotype on the performance characteristics of soybean stem fibers when incorporated into a polypropylene (PP) matrix using a selective phenotyping approach. Fibers from 50 lines of a recombinant inbred line population (169 RILs) grown in different environments were incorporated into PP at 20% (wt/wt) by extrusion. Test samples were injection molded and characterized for their mechanical properties. The performance of stem fibers in the composites was significantly affected by genotype and environment. Fibers from different genotypes had significantly different chemical compositions, thus composites prepared with these fibers displayed different physical properties. This study demonstrates that thermoplastic composites with soybean stem-derived fibers have mechanical properties that are equivalent to or better than those of wheat straw fiber composites currently being used for manufacturing interior automotive parts. The addition of soybean stem residues improved flexural, tensile and impact properties of the composites. Furthermore, by linkage and in silico mapping we identified genomic regions to which quantitative trait loci (QTL) for compositional and functional properties of soybean stem fibers in thermoplastic composites, as well as genes for cell wall synthesis, were co-localized. These results may lead to the development of high value uses for soybean stem residue.

  2. Genome Regions Associated with Functional Performance of Soybean Stem Fibers in Polypropylene Thermoplastic Composites

    Science.gov (United States)

    Reinprecht, Yarmilla; Arif, Muhammad; Simon, Leonardo C.; Pauls, K. Peter

    2015-01-01

    Plant fibers can be used to produce composite materials for automobile parts, thus reducing plastic used in their manufacture, overall vehicle weight and fuel consumption when they replace mineral fillers and glass fibers. Soybean stem residues are, potentially, significant sources of inexpensive, renewable and biodegradable natural fibers, but are not currently used for biocomposite production due to the functional properties of their fibers in composites being unknown. The current study was initiated to investigate the effects of plant genotype on the performance characteristics of soybean stem fibers when incorporated into a polypropylene (PP) matrix using a selective phenotyping approach. Fibers from 50 lines of a recombinant inbred line population (169 RILs) grown in different environments were incorporated into PP at 20% (wt/wt) by extrusion. Test samples were injection molded and characterized for their mechanical properties. The performance of stem fibers in the composites was significantly affected by genotype and environment. Fibers from different genotypes had significantly different chemical compositions, thus composites prepared with these fibers displayed different physical properties. This study demonstrates that thermoplastic composites with soybean stem-derived fibers have mechanical properties that are equivalent to or better than those of wheat straw fiber composites currently being used for manufacturing interior automotive parts. The addition of soybean stem residues improved flexural, tensile and impact properties of the composites. Furthermore, by linkage and in silico mapping we identified genomic regions to which quantitative trait loci (QTL) for compositional and functional properties of soybean stem fibers in thermoplastic composites, as well as genes for cell wall synthesis, were co-localized. These results may lead to the development of high value uses for soybean stem residue. PMID:26167917

  3. DNA rearrangements from γ-irradiated normal human fibroblasts preferentially occur in transcribed regions of the genome

    International Nuclear Information System (INIS)

    Forrester, H.B.; Radford, I.R.

    2003-01-01

    Full text: DNA rearrangement events leading to chromosomal aberrations are central to ionizing radiation-induced cell death. Although DNA double-strand breaks are probably the lesion that initiates formation of chromosomal aberrations, little is understood about the molecular mechanisms that generate and modulate DNA rearrangement. Examination of the sequences that flank sites of DNA rearrangement may provide information regarding the processes and enzymes involved in rearrangement events. Accordingly, we developed a method using inverse PCR that allows the detection and sequencing of putative radiation-induced DNA rearrangements in defined regions of the human genome. The method can detect single copies of a rearrangement event that has occurred in a particular region of the genome and, therefore, DNA rearrangement detection does not require survival and continued multiplication of the affected cell. Ionizing radiation-induced DNA rearrangements were detected in several different regions of the genome of human fibroblast cells that were exposed to 30 Gy of γ-irradiation and then incubated for 24 hours at 37 deg C. There was a 3- to 5-fold increase in the number of products amplified from irradiated as compared with control cells in the target regions 5' to the C-MYC, CDKN1A, RB1, and FGFR2 genes. Sequences were examined from 121 DNA rearrangements. Approximately half of the PCR products were derived from possible inter-chromosomal rearrangements and the remainder were from intra-chromosomal events. A high proportion of the sequences that rearranged with target regions were located in genes, suggesting that rearrangements may occur preferentially in transcribed regions. Eighty-four percent of the sequences examined by reverse transcriptase PCR were from transcribed sequences in IMR-90 cells. The distribution of DNA rearrangements within the target regions is non-random and homology occurs between the sequences involved in rearrangements in some cases but is not

  4. Conservation of Repeats at the Mammalian KCNQ1OT1-CDKN1C Region Suggests a Role in Genomic Imprinting

    Directory of Open Access Journals (Sweden)

    Marcos De Donato

    2017-06-01

    Full Text Available KCNQ1OT1 is located in the region with the highest number of genes showing genomic imprinting, but the mechanisms controlling the genes under its influence have not been fully elucidated. Therefore, we conducted a comparative analysis of the KCNQ1/KCNQ1OT1-CDKN1C region to study its conservation across the best assembled eutherian mammalian genomes sequenced to date and analyzed potential elements that may be implicated in the control of genomic imprinting in this region. The genomic features in these regions from human, mouse, cattle, and dog show a higher number of genes and CpG islands (detected using cpgplot from EMBOSS), but lower number of repetitive elements (including short interspersed nuclear elements and long interspersed nuclear elements), compared with their whole chromosomes (detected by RepeatMasker). The KCNQ1OT1-CDKN1C region contains the highest number of conserved noncoding sequences (CNS) among mammals, where we found 16 regions containing about 38 different highly conserved repetitive elements (using mVista), such as LINE1 elements: L1M4, L1MB7, HAL1, L1M4a, L1Med, and an LTR element: MLT1H. From these elements, we found 74 CNS showing high sequence identity (>70%) between human, cattle, and mouse, from which we identified 13 motifs (using Multiple Em for Motif Elicitation/Motif Alignment and Search Tool) with a significant probability of occurrence, 3 of which were the most frequent and were used to find transcription factor–binding sites. We detected several transcription factors (using JASPAR suite) from the families SOX, FOX, and GATA. A phylogenetic analysis of these CNS from human, marmoset, mouse, rat, cattle, dog, horse, and elephant shows branches with high levels of support and very similar phylogenetic relationships among these groups, confirming previous reports. Our results suggest that functional DNA elements identified by comparative genomics in a region densely populated with imprinted mammalian genes may be

  5. Chromosome region-specific libraries for human genome analysis. Final progress report, 1 March 1991--28 February 1994

    Energy Technology Data Exchange (ETDEWEB)

    Kao, F.T.

    1994-04-01

    The objectives of this grant proposal include (1) development of a chromosome microdissection and PCR-mediated microcloning technology, (2) application of this microtechnology to the construction of region-specific libraries for human genome analysis. During this grant period, the authors have successfully developed this microtechnology and have applied it to the construction of microdissection libraries for the following chromosome regions: a whole chromosome 21 (21E), 2 region-specific libraries for the long arm of chromosome 2, 2q35-q37 (2Q1) and 2q33-q35 (2Q2), and 4 region-specific libraries for the entire short arm of chromosome 2, 2p23-p25 (2P1), 2p21-p23 (2P2), 2p14-p16 (2P3) and 2p11-p13 (2P4). In addition, 20--40 unique sequence microclones have been isolated and characterized for genomic studies. These region-specific libraries and the single-copy microclones from the library have been used as valuable resources for (1) isolating microsatellite probes in linkage analysis to further refine the disease locus; (2) isolating corresponding clones with large inserts, e.g. YAC, BAC, P1, cosmid and phage, to facilitate construction of contigs for high resolution physical mapping; and (3) isolating region-specific cDNA clones for use as candidate genes. These libraries are being deposited in the American Type Culture Collection (ATCC) for general distribution.

  6. Complete mitochondrial genome of Setipinna taty (Scaly hair-fin anchovy): repetitive sequences in the control region.

    Science.gov (United States)

    Zhang, Bo; Sun, Yuena

    2013-12-01

    The Scaly hair-fin anchovy, Setipinna taty (Clupeiformes, Engraulidae), is a commercially important marine fish species in China. In this paper, the complete mitochondrial genome of S. taty was determined for the first time. The mitogenome (16,887 bp) comprises 22 tRNA genes, 2 rRNA genes, 13 protein-coding genes, and 2 main non-coding regions (the control region (CR) and the origin of the light strand replication). A 195 bp tandem repeat sequence was identified in the CR. These mitogenome sequence data should play an important role in population genetics and phylogenetic analysis of the Engraulidae.

  7. Comparative genomics identifies the mouse Bmp3 promoter and an upstream evolutionary conserved region (ECR) in mammals.

    Directory of Open Access Journals (Sweden)

    Jonathan W Lowery

    Full Text Available The Bone Morphogenetic Protein (BMP) pathway is a multi-member signaling cascade whose basic components are found in all animals. One member, BMP3, which arose more recently in evolution and is found only in deuterostomes, serves a unique role as an antagonist to both the canonical BMP and Activin pathways. However, the mechanisms that control BMP3 expression, and the cis-regulatory regions mediating this regulation, remain poorly defined. With this in mind, we sought to identify the Bmp3 promoter in mouse (M. musculus) through functional and comparative genomic analyses. We found that the minimal promoter required for expression in resides within 0.8 kb upstream of Bmp3 in a region that is highly conserved with rat (R. norvegicus). We also found that an upstream region abutting the minimal promoter acts as a repressor of the minimal promoter in HEK293T cells and osteoblasts. Strikingly, a portion of this region is conserved among all available eutherian mammal genomes (47/47), but not in any non-eutherian animal (0/136). We also identified multiple conserved transcription factor binding sites in the Bmp3 upstream ECR, suggesting that this region may preserve common cis-regulatory elements that govern Bmp3 expression across eutherian mammals. Since dysregulation of BMP signaling appears to play a role in human health and disease, our findings may have application in the development of novel therapeutics aimed at modulating BMP signaling in humans.

  8. A new method for detecting signal regions in ordered sequences of real numbers, and application to viral genomic data.

    Science.gov (United States)

    Gog, Julia R; Lever, Andrew M L; Skittrall, Jordan P

    2018-01-01

    We present a fast, robust and parsimonious approach to detecting signals in an ordered sequence of numbers. Our motivation is in seeking a suitable method to take a sequence of scores corresponding to properties of positions in virus genomes, and find outlying regions of low scores. Suitable statistical methods without using complex models or making many assumptions are surprisingly lacking. We resolve this by developing a method that detects regions of low score within sequences of real numbers. The method makes no assumptions a priori about the length of such a region; it gives the explicit location of the region and scores it statistically. It does not use detailed mechanistic models so the method is fast and will be useful in a wide range of applications. We present our approach in detail, and test it on simulated sequences. We show that it is robust to a wide range of signal morphologies, and that it is able to capture multiple signals in the same sequence. Finally we apply it to viral genomic data to identify regions of evolutionary conservation within influenza and rotavirus.
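    The record above does not spell out the authors' algorithm, so the following is only a minimal illustrative sketch of the general task it describes: locating a contiguous region of low scores in an ordered sequence without fixing the region length in advance. It centres the scores on their mean and then finds the minimum-sum contiguous run (a Kadane-style scan). The function name `lowest_scoring_region` is hypothetical, and the statistical scoring of the region mentioned in the abstract is not attempted here.

```python
from typing import Sequence, Tuple


def lowest_scoring_region(scores: Sequence[float]) -> Tuple[int, int, float]:
    """Return (start, end_exclusive, centred_sum) for the contiguous run
    whose mean-centred sum is most negative.

    Illustrative only: this is a plain minimum-sum subarray scan, not the
    method of the paper summarised above.
    """
    if not scores:
        raise ValueError("scores must be non-empty")

    mean = sum(scores) / len(scores)

    best_sum = float("inf")
    best_span = (0, 1)
    run_sum = 0.0
    run_start = 0

    for i, score in enumerate(scores):
        deviation = score - mean
        if run_sum > 0:
            # A positive running sum can only hurt; restart the run here.
            run_sum = deviation
            run_start = i
        else:
            run_sum += deviation
        if run_sum < best_sum:
            best_sum = run_sum
            best_span = (run_start, i + 1)

    return best_span[0], best_span[1], best_sum


if __name__ == "__main__":
    toy_scores = [5, 5, 5, 1, 1, 1, 5, 5, 5, 5]
    print(lowest_scoring_region(toy_scores))
```

    On the toy input, the reported span covers positions 3 to 5 (the three low values). Because no region length is supplied, the span is determined entirely by the data; assessing whether such a region is statistically unusual would need a separate step.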

  9. [Identification of mutations associated with coronary artery lesion susceptibility in Kawasaki disease by targeted enrichment of genomic region sequencing technique].

    Science.gov (United States)

    Zhu, D Y; Song, S R; Xie, L J; Qiu, F; Yang, J; Xiao, T T; Huang, M

    2017-07-02

    Objective: To screen and identify the mutations in Kawasaki disease by targeted enrichment of genomic region sequencing technique and investigate susceptibility genes associated with coronary artery lesion. Method: This was a case-control study. A total of 114 patients diagnosed with Kawasaki disease and treated in Shanghai Children's Hospital between December 2015 and November 2016 were studied, and another 45 healthy children who were physically examined in the outpatient department were enrolled as the control group. Patients were divided into two groups based on the results of echocardiogram. Peripheral venous blood was obtained from patients and controls. Genomic DNA was extracted. SeqCap EZ Choice libraries were prepared by targeted enrichment of genomic region technology. Then the libraries were sequenced to identify susceptibility genes associated with coronary artery lesion in patients diagnosed with Kawasaki disease. Susceptible genes were identified by Burden test, Pearson chi-square test or Fisher's exact probability test. Result: There was a statistically significant difference in TNFRSF11B (rs2073618) G>C (p.N3K) mutation and GG/GC/CC genotype between the Kawasaki disease group and the control group (χ(2)=15.52, P=0.00). There was a statistically significant difference in TNFRSF13B (rs34562254) C>T (p.P251L) mutation (χ(2)=10.40, P=0.01) and LEFTY1 (rs360057) T>G (p.D322A) mutation (χ(2)=8.505, P=0.01) between patients with coronary artery lesions and those without. Conclusion: Targeted enrichment of genomic region sequencing technology can be used for primary screening of the susceptible genes associated with coronary artery lesions in Chinese Kawasaki patients and may provide a theoretical basis for larger sample investigation of a risk prediction score standard in Kawasaki disease.

  10. A variable region within the genome of Streptococcus pneumoniae contributes to strain-strain variation in virulence.

    Directory of Open Access Journals (Sweden)

    Richard M Harvey

    2011-05-01

    Full Text Available The bacterial factors responsible for the variation in invasive potential between different clones and serotypes of Streptococcus pneumoniae are largely unknown. Therefore, the isolation of rare serotype 1 carriage strains in Indigenous Australian communities provided a unique opportunity to compare the genomes of non-invasive and invasive isolates of the same serotype in order to identify such factors. The human virulence status of non-invasive, intermediately virulent and highly virulent serotype 1 isolates was reflected in mice and showed that whilst both human non-invasive and highly virulent isolates were able to colonize the murine nasopharynx equally, only the human highly virulent isolates were able to invade and survive in the murine lungs and blood. Genomic sequencing comparisons between these isolates identified 8 regions >1 kb in size that were specific to only the highly virulent isolates, and included a version of the pneumococcal pathogenicity island 1 variable region (PPI-1v), phage-associated adherence factors, transporters and metabolic enzymes. In particular, a phage-associated endolysin, a putative iron/lead permease and an operon within PPI-1v exhibited niche-specific changes in expression that suggest important roles for these genes in the lungs and blood. Moreover, in vivo competition between pneumococci carrying PPI-1v derivatives representing the two identified versions of the region showed that the version of PPI-1v in the highly virulent isolates was more competitive than the version from the less virulent isolates in the nasopharyngeal tissue, blood and lungs. This study is the first to perform genomic comparisons between serotype 1 isolates with distinct virulence profiles that correlate between mice and humans, and has highlighted the important role that hypervariable genomic loci, such as PPI-1v, play in pneumococcal disease. The findings of this study have important implications for understanding the processes that

  11. Genome-wide genetic diversity and differentially selected regions among Suffolk, Rambouillet, Columbia, Polypay, and Targhee sheep.

    Directory of Open Access Journals (Sweden)

    Lifan Zhang

    Full Text Available Sheep are among the major economically important livestock species worldwide because the animals produce milk, wool, skin, and meat. In the present study, the Illumina OvineSNP50 BeadChip was used to investigate genetic diversity and genome selection among Suffolk, Rambouillet, Columbia, Polypay, and Targhee sheep breeds from the United States. After quality-control filtering of SNPs (single nucleotide polymorphisms), we used 48,026 SNPs, including 46,850 SNPs on autosomes that were in Hardy-Weinberg equilibrium and 1,176 SNPs on chromosome X, for analysis. Phylogenetic analysis based on all 46,850 SNPs clearly separated Suffolk from Rambouillet, Columbia, Polypay, and Targhee, which was not surprising as Rambouillet contributed to the synthesis of the latter three breeds. Based on pair-wise estimates of F(ST), significant genetic differentiation appeared between Suffolk and Rambouillet (F(ST) = 0.1621), while Rambouillet and Targhee had the closest relationship (F(ST) = 0.0681). A scan of the genome revealed 45 and 41 differentially selected regions (DSRs) between Suffolk and Rambouillet and among Rambouillet-related breed populations, respectively. Our data indicated that regions 13 and 24 between Suffolk and Rambouillet might be good candidates for evaluating breed differences. Furthermore, ovine genome v3.1 assembly was used as reference to link functionally known homologous genes to economically important traits covered by these differentially selected regions. In brief, our present study provides a comprehensive genome-wide view on within- and between-breed genetic differentiation, biodiversity, and evolution among Suffolk, Rambouillet, Columbia, Polypay, and Targhee sheep breeds. These results may provide new guidance for the synthesis of new breeds with different breeding objectives.
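    For readers unfamiliar with the pairwise F(ST) values quoted in the record above, the following is a minimal sketch of the textbook heterozygosity-based definition, F(ST) = (H_T - H_S) / H_T, for a single biallelic SNP in two equally weighted populations. The record does not state which estimator the study used, so this is an assumption made for illustration only; the function name `fst_biallelic` is hypothetical, and sample-size corrections used by practical estimators are omitted.

```python
def fst_biallelic(p1: float, p2: float) -> float:
    """Heterozygosity-based F_ST for one biallelic SNP, given the frequency
    of the same allele in two populations (equal weights assumed).

    Illustrative textbook definition; not necessarily the estimator used
    in the study summarised above.
    """
    for p in (p1, p2):
        if not 0.0 <= p <= 1.0:
            raise ValueError("allele frequencies must lie in [0, 1]")

    # Mean expected heterozygosity within populations.
    h_s = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0

    # Expected heterozygosity of the pooled population.
    p_bar = (p1 + p2) / 2.0
    h_t = 2.0 * p_bar * (1.0 - p_bar)

    if h_t == 0.0:
        # Allele is fixed or absent in both populations: no variation to partition.
        return 0.0
    return (h_t - h_s) / h_t


if __name__ == "__main__":
    print(fst_biallelic(0.2, 0.8))  # strongly differentiated SNP
    print(fst_biallelic(0.3, 0.3))  # identical frequencies
```

    Identical frequencies give 0 and alternately fixed alleles give 1; genome-wide values such as those in the record are typically obtained by combining information across many SNPs rather than from a single site.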

  12. Genomic Regions Associated with Root Traits under Drought Stress in Tropical Maize (Zea mays L.).

    Directory of Open Access Journals (Sweden)

    P H Zaidi

    Full Text Available An association mapping panel, named the CIMMYT Asia association mapping (CAAM) panel, involving 396 diverse tropical maize lines was phenotyped for various structural and functional traits of roots under drought and well-watered conditions. The experiment was conducted during Kharif (summer-rainy season) of 2012 and 2013 in the root phenotyping facility at CIMMYT-Hyderabad, India. The CAAM panel was genotyped to generate 955,690 SNPs through GBS v2.7 using Illumina Hi-seq 2000/2500 at the Institute for Genomic Diversity, Cornell University, Ithaca, NY, USA. GWAS analysis carried out using 331,390 SNPs filtered from the entire set of SNPs revealed a total of 50 and 67 SNPs significantly associated with root functional (transpiration efficiency, flowering period water use) and structural traits (rooting depth, root dry weight, root length, root volume, root surface area and root length density), respectively. In addition to this, 37 SNPs were identified for grain yield and shoot biomass under well-watered and drought stress. Though many SNPs were found to have significant association with the traits under study, SNPs that were common for more than one trait were discussed in detail. A total of 18 SNPs were found to have common association with more than one trait, out of which 12 SNPs were found within or near the various gene functional regions. In this study we attempted to identify the trait-specific maize lines based on the presence of favorable alleles for the SNPs associated with multiple traits. Two SNPs, S3_128533512 and S7_151238865, were associated with transpiration efficiency, shoot biomass and grain yield under well-watered condition. Based on favorable alleles for these SNPs, seven inbred lines were identified. Similarly, four lines were identified for transpiration efficiency and shoot biomass under drought stress based on the presence of favorable allele for the common SNPs S1_211520521, S2_20017716, S3_57210184 and S7_130878458 and three lines

  13. Non-coding genomic regions possessing enhancer and silencer potential are associated with healthy aging and exceptional survival

    Science.gov (United States)

    Kim, Sangkyu; Welsh, David A.; Myers, Leann; Cherry, Katie E.; Wyckoff, Jennifer; Jazwinski, S. Michal

    2015-01-01

    We have completed a genome-wide linkage scan for healthy aging using data collected from a family study, followed by fine-mapping by association in a separate population, the first such attempt reported. The family cohort consisted of parents of age 90 or above and their children ranging in age from 50 to 80. As a quantitative measure of healthy aging, we used a frailty index, called FI34, based on 34 health and function variables. The linkage scan found a single significant linkage peak on chromosome 12. Using an independent cohort of unrelated nonagenarians, we carried out a fine-scale association mapping of the region suggestive of linkage and identified three sites associated with healthy aging. These healthy-aging sites (HASs) are located in intergenic regions at 12q13–14. HAS-1 has been previously associated with multiple diseases, and an enhancer was recently mapped and experimentally validated within the site. HAS-2 is a previously uncharacterized site possessing genomic features suggestive of enhancer activity. HAS-3 contains features associated with Polycomb repression. The HASs also contain variants associated with exceptional longevity, based on a separate analysis. Our results provide insight into functional genomic networks involving non-coding regulatory elements that are involved in healthy aging and longevity. PMID:25682868

  14. Genome-environment association study suggests local adaptation to climate at the regional scale in Fagus sylvatica.

    Science.gov (United States)

    Pluess, Andrea R; Frank, Aline; Heiri, Caroline; Lalagüe, Hadrien; Vendramin, Giovanni G; Oddou-Muratorio, Sylvie

    2016-04-01

    The evolutionary potential of long-lived species, such as forest trees, is fundamental for their local persistence under climate change (CC). Genome-environment association (GEA) analyses reveal if species in heterogeneous environments at the regional scale are under differential selection resulting in populations with potential preadaptation to CC within this area. In 79 natural Fagus sylvatica populations, neutral genetic patterns were characterized using 12 simple sequence repeat (SSR) markers, and genomic variation (144 single nucleotide polymorphisms (SNPs) out of 52 candidate genes) was related to 87 environmental predictors in the latent factor mixed model, logistic regressions and isolation by distance/environmental (IBD/IBE) tests. SSR diversity revealed relatedness at up to 150 m intertree distance but an absence of large-scale spatial genetic structure and IBE. In the GEA analyses, 16 SNPs in 10 genes responded to one or several environmental predictors and IBE, corrected for IBD, was confirmed. The GEA often reflected the proposed gene functions, including indications for adaptation to water availability and temperature. Genomic divergence and the lack of large-scale neutral genetic patterns suggest that gene flow allows the spread of advantageous alleles in adaptive genes. Thereby, adaptation processes are likely to take place in species occurring in heterogeneous environments, which might reduce their regional extinction risk under CC. © 2016 The Authors. New Phytologist © 2016 New Phytologist Trust.

  15. DNA Barcoding: Amplification and sequence analysis of rbcl and matK genome regions in three divergent plant species

    Directory of Open Access Journals (Sweden)

    Javed Iqbal Wattoo

    2016-11-01

    Full Text Available Background: DNA barcoding is a novel method of species identification based on nucleotide diversity of conserved sequences. The establishment and refining of plant DNA barcoding systems is more challenging due to high genetic diversity among different species. Therefore, targeting the conserved nuclear transcribed regions would be more reliable for plant scientists to reveal genetic diversity, species discrimination and phylogeny. Methods: In this study, we amplified and sequenced the chloroplast DNA regions (matK+rbcL) of Solanum nigrum, Euphorbia helioscopia and Dalbergia sissoo to study the functional annotation, homology modeling and sequence analysis to allow a more efficient utilization of these sequences among different plant species. These three species represent three families: Solanaceae, Euphorbiaceae and Fabaceae, respectively. Biological sequence homology and divergence of amplified sequences were studied using the Basic Local Alignment Search Tool (BLAST). Results: Both primers (matK+rbcL) showed good amplification in the three species. The sequenced regions revealed conserved genome information for future identification of different medicinal plants belonging to these species. The amplified conserved barcodes revealed different levels of biological homology after sequence analysis. The results clearly showed that the use of these conserved DNA sequences as barcode primers would be an accurate way for species identification and discrimination. Conclusion: The amplification and sequencing of conserved genome regions identified a novel sequence of matK in native species of Solanum nigrum. The findings of the study would be applicable in the medicinal industry to establish DNA-based identification of different medicinal plant species to monitor adulteration.

  16. Genome-wide Association Study Identifies Five Susceptibility Loci for Follicular Lymphoma outside the HLA Region

    NARCIS (Netherlands)

    Skibola, Christine F.; Berndt, Sonja I.; Vijai, Joseph; Conde, Lucia; Wang, Zhaoming; Yeager, Meredith; de Bakker, Paul I. W.; Birmann, Brenda M.; Vajdic, Claire M.; Foo, Jia-Nee; Bracci, Paige M.; Vermeulen, Roel C. H.; Slager, Susan L.; de Sanjose, Silvia; Wang, Sophia S.; Linet, Martha S.; Salles, Gilles; Lan, Qing; Severi, Gianluca; Hjalgrim, Henrik; Lightfoot, Tracy; Melbye, Mads; Gu, Jian; Ghesquieres, Herve; Link, Brian K.; Morton, Lindsay M.; Holly, Elizabeth A.; Smith, Alex; Tinker, Lesley F.; Teras, Lauren R.; Kricker, Anne; Becker, Nikolaus; Purdue, Mark P.; Spinelli, John J.; Zhang, Yawei; Giles, Graham G.; Vineis, Paolo; Monnereau, Alain; Bertrand, Kimberly A.; Albanes, Demetrius; Zeleniuch-Jacquotte, Anne; Gabbas, Attilio; Chung, Charles C.; Burdett, Laurie; Hutchinson, Amy; Lawrence, Charles; Montalvan, Rebecca; Liang, Liming; Huang, Jinyan; Ma, Baoshan; Liu, Jianjun; Adami, Hans-Olov; Glimelius, Bengt; Ye, Yuanqing; Nowakowski, Grzegorz S.; Dogan, Ahmet; Thompson, Carrie A.; Habermann, Thomas M.; Novak, Anne J.; Liebow, Mark; Witzig, Thomas E.; Weiner, George J.; Schenk, Maryjean; Hartge, Patricia; De Roos, Anneclaire J.; Cozen, Wendy; Zhi, Degui; Akers, Nicholas K.; Riby, Jacques; Smith, Martyn T.; Lacher, Mortimer; Villano, Danylo J.; Maria, Ann; Roman, Eve; Kane, Eleanor; Jackson, Rebecca D.; North, Kari E.; Diver, W. Ryan; Turner, Jenny; Armstrong, Bruce K.; Benavente, Yolanda; Boffetta, Paolo; Brennan, Paul; Foretova, Lenka; Maynadie, Marc; Staines, Anthony; McKay, James; Brooks-Wilson, Angela R.; Zheng, Tongzhang; Holford, Theodore R.; Chamosa, Saioa; Kaaks, Rudolph; Kelly, Rachel S.; Ohlsson, Bodil; Travis, Ruth C.; Weiderpass, Elisabete; Clave, Jacqueline; Giovannucci, Edward; Kraft, Peter; Virtamo, Jarmo; Mazza, Patrizio; Cocco, Pierluigi; Ennas, Maria Grazia; Chiu, Brian C. H.; Fraumeni, Joseph R.; Nieters, Alexandra; Offit, Kenneth; Wu, Xifeng; Cerhan, James R.; Smedby, Karin E.; Chanock, Stephen J.; Rothman, Nathaniel

    2014-01-01

    Genome-wide association studies (GWASs) of follicular lymphoma (FL) have previously identified human leukocyte antigen (HLA) gene variants. To identify additional FL susceptibility loci, we conducted a large-scale two-stage GWAS in 4,523 case subjects and 13,344 control subjects of European

  17. Development and validation of new SSR markers from expressed regions in the garlic genome

    Science.gov (United States)

    Only a limited number of simple sequence repeat (SSR) markers are available for the genome of garlic (Allium sativum L.), although SSR markers have become one of the most preferred marker systems because they are typically co-dominant, reproducible, cross-species transferable and highly polymorphic. In this ...

  18. Genomic regions associated with bovine milk fatty acids in both summer and winter milk samples

    NARCIS (Netherlands)

    Bouwman, A.C.; Visker, M.H.P.W.; Arendonk, van J.A.M.; Bovenhuis, H.

    2012-01-01

    Background - In this study we perform a genome-wide association study (GWAS) for bovine milk fatty acids from summer milk samples. This study replicates a previous study where we performed a GWAS for bovine milk fatty acids based on winter milk samples from the same population. Fatty acids from

  19. Mapping of 5q35 chromosomal rearrangements within a genomically unstable region

    DEFF Research Database (Denmark)

    Buysse, Karen; Crepel, An; Menten, Björn

    2008-01-01

    BACKGROUND: Recent molecular studies of breakpoints of recurrent chromosome rearrangements revealed the role of genomic architecture in their formation. In particular, segmental duplications representing blocks of >1 kb with >90% sequence homology were shown to mediate non-allelic homologous reco...

  20. NOVOMIR: De Novo Prediction of MicroRNA-Coding Regions in a Single Plant-Genome

    Science.gov (United States)

    Teune, Jan-Hendrik; Steger, Gerhard

    2010-01-01

    MicroRNAs (miRNA) are small regulatory, noncoding RNA molecules that are transcribed as primary miRNAs (pri-miRNA) from eukaryotic genomes. At least in plants, their regulatory activity is mediated through base-pairing with protein-coding messenger RNAs (mRNA) followed by mRNA degradation or translation repression. We describe NOVOMIR, a program for the identification of miRNA genes in plant genomes. It uses a series of filter steps and a statistical model to discriminate a pre-miRNA from other RNAs and relies neither on prior knowledge of a miRNA target nor on comparative genomics. The sensitivity and specificity of NOVOMIR for detection of pre-miRNAs from Arabidopsis thaliana are ~0.83 and ~0.99, respectively. Plant pre-miRNAs are more heterogeneous with respect to size and structure than animal pre-miRNAs. Despite these difficulties, NOVOMIR is well suited to perform searches for pre-miRNAs on a genomic scale. NOVOMIR is written in Perl and relies on two additional, free programs for prediction of RNA secondary structure (RNALFOLD, RNASHAPES). PMID:20871826

  1. Whole genome sequence analyses of Xylella fastidiosa PD strains from different geographical regions

    Science.gov (United States)

    Genome sequences were determined for two Pierce’s disease (PD) causing Xylella fastidiosa (Xf) strains, one from Florida and one from Taiwan. The Florida strain was ATCC 35879, the type strain used as a standard reference for related taxonomy research. By contrast, the Taiwan strain used was only...

  2. A genome-wide association study of bipolar disorder with comorbid eating disorder replicates the SOX2-OT region.

    Science.gov (United States)

    Liu, Xiaohua; Kelsoe, John R; Greenwood, Tiffany A

    2016-01-01

    Bipolar disorder is a heterogeneous mood disorder associated with several important clinical comorbidities, such as eating disorders. This clinical heterogeneity complicates the identification of genetic variants contributing to bipolar susceptibility. Here we investigate comorbidity of eating disorders as a subphenotype of bipolar disorder to identify genetic variation that is common and unique to both disorders. We performed a genome-wide association analysis contrasting 184 bipolar subjects with eating disorder comorbidity against both 1370 controls and 2006 subjects with bipolar disorder only from the Bipolar Genome Study (BiGS). The most significant genome-wide finding was observed for bipolar with comorbid eating disorder vs. controls within SOX2-OT (p=8.9×10(-8) for rs4854912) with a secondary peak in the adjacent FXR1 gene (p=1.2×10(-6) for rs1805576) on chromosome 3q26.33. This region was also the most prominent finding in the case-only analysis (p=3.5×10(-7) and 4.3×10(-6), respectively). Several regions of interest containing genes involved in neurodevelopment and neuroprotection processes were also identified. While our primary finding did not quite reach genome-wide significance, likely due to the relatively limited sample size, these results can be viewed as a replication of a recent study of eating disorders in a large cohort. These findings replicate the prior association of SOX2-OT with eating disorders and broadly support the involvement of neurodevelopmental/neuroprotective mechanisms in the pathophysiology of both disorders. They further suggest that different clinical manifestations of bipolar disorder may reflect differential genetic contributions and argue for the utility of clinical subphenotypes in identifying additional molecular pathways leading to illness. Copyright © 2015 Elsevier B.V. All rights reserved.

  3. Genome-wide characterization of genetic variants and putative regions under selection in meat and egg-type chicken lines.

    Science.gov (United States)

    Boschiero, Clarissa; Moreira, Gabriel Costa Monteiro; Gheyas, Almas Ara; Godoy, Thaís Fernanda; Gasparin, Gustavo; Mariani, Pilar Drummond Sampaio Corrêa; Paduan, Marcela; Cesar, Aline Silva Mello; Ledur, Mônica Corrêa; Coutinho, Luiz Lehmann

    2018-01-25

    Meat and egg-type chickens have been selected for several generations for different traits. Artificial and natural selection for different phenotypes can change the frequency of genetic variants, leaving particular genomic footprints throughout the genome. Thus, the aims of this study were to sequence 28 chickens from two Brazilian lines (meat and white egg-type) and use this information to characterize genome-wide genetic variations, identify putative regions under selection using the Fst method, and find putative pathways under selection. A total of 13.93 million SNPs and 1.36 million INDELs were identified, with more variants detected from the broiler (meat-type) line. Although most were located in non-coding regions, we identified 7255 intolerant non-synonymous SNPs, 512 stopgain/loss SNPs, 1381 frameshift and 1094 non-frameshift INDELs that may alter protein functions. Genes harboring intolerant non-synonymous SNPs affected metabolic pathways related mainly to reproduction and endocrine systems in the white-egg layer line, and lipid metabolism and metabolic diseases in the broiler line. Fst analysis in sliding windows, using SNPs and INDELs separately, identified over 300 putative regions of selection overlapping with more than 250 genes. For the first time in chicken, INDEL variants were considered for selection signature analysis, showing a high level of correlation in results between SNP and INDEL data. The putative regions of selection signatures revealed interesting candidate genes and pathways related to important phenotypic traits in chicken, such as lipid metabolism, growth, reproduction, and cardiac development. In this study, the Fst method was applied to identify high confidence putative regions under selection, providing novel insights into selection footprints that can help elucidate the functional mechanisms underlying different phenotypic traits relevant to meat and egg-type chicken lines. In addition, we generated a large catalog of line-specific and common

  4. A pathogenicity determinant maps to the N-terminal coat protein region of the Pepino mosaic virus genome.

    Science.gov (United States)

    Duff-Farrier, Celia R A; Bailey, Andy M; Boonham, Neil; Foster, Gary D

    2015-04-01

    Pepino mosaic virus (PepMV) poses a worldwide threat to the tomato industry. Considerable differences at the genetic level allow for the distinction of four main genotypic clusters; however, the basis of the phenotypic outcome is difficult to elucidate. This work reports the generation of wild-type PepMV infectious clones of both EU (mild) and CH2 (aggressive) genotypes, from which chimeric infectious clones were created. Phenotypic analysis in three solanaceous hosts, Nicotiana benthamiana, Datura stramonium and Solanum lycopersicum, indicated that a PepMV pathogenicity determinant mapped to the 3'-terminal region of the genome. Increased aggression was only observed in N. benthamiana, showing that this factor is host specific. The determinant was localized to amino acids 11-26 of the N-terminal coat protein (CP) region; this is the first report of this region functioning as a virulence factor in PepMV. © 2014 BSPP AND JOHN WILEY & SONS LTD.

  5. Isolation of a Genomic Region Affecting Most Components of Metabolic Syndrome in a Chromosome-16 Congenic Rat Model.

    Directory of Open Access Journals (Sweden)

    Lucie Šedová

    Full Text Available Metabolic syndrome is a highly prevalent human disease with substantial genomic and environmental components. Previous studies indicate the presence of significant genetic determinants of several features of metabolic syndrome on rat chromosome 16 (RNO16) and the syntenic regions of the human genome. We derived the SHR.BN16 congenic strain by introgression of a limited RNO16 region from the Brown Norway congenic strain (BN-Lx) into the genomic background of the spontaneously hypertensive rat (SHR) strain. We compared the morphometric, metabolic, and hemodynamic profiles of adult male SHR and SHR.BN16 rats. We also compared in silico the DNA sequences for the differential segment in the BN-Lx and SHR parental strains. SHR.BN16 congenic rats had significantly lower weight, decreased concentrations of total triglycerides and cholesterol, and improved glucose tolerance compared with SHR rats. The concentrations of insulin, free fatty acids, and adiponectin were comparable between the two strains. SHR.BN16 rats had significantly lower systolic (18-28 mmHg difference) and diastolic (10-15 mmHg difference) blood pressure throughout the experiment (repeated-measures ANOVA, P < 0.001). The differential segment spans approximately 22 Mb of the telomeric part of the short arm of RNO16. The in silico analyses revealed over 1200 DNA variants between the BN-Lx and SHR genomes in the SHR.BN16 differential segment, 44 of which lead to missense mutations, and only eight of which (in Asb14, Il17rd, Itih1, Syt15, Ercc6, RGD1564958, Tmem161a, and Gatad2a genes) are predicted to be damaging to the protein product. Furthermore, a number of genes within the RNO16 differential segment associated with metabolic syndrome components in human studies showed polymorphisms between SHR and BN-Lx (including Lpl, Nrg3, Pbx4, Cilp2, and Stab1). 
Our novel congenic rat model demonstrates that a limited genomic region on RNO16 in the SHR significantly affects many of the features of metabolic

  6. Sequencing of a QTL-rich region of the Theobroma cacao genome using pooled BACs and the identification of trait specific candidate genes

    Directory of Open Access Journals (Sweden)

    Blackmon Barbara P

    2011-07-01

    Full Text Available Abstract Background BAC-based physical maps provide for sequencing across an entire genome or a selected sub-genomic region of biological interest. Such a region can be approached with next-generation whole-genome sequencing and assembly as if it were an independent small genome. Using the minimum tiling path as a guide, specific BAC clones representing the prioritized genomic interval are selected, pooled, and used to prepare a sequencing library. Results This pooled BAC approach was taken to sequence and assemble a QTL-rich region, of ~3 Mbp and represented by twenty-seven BACs, on linkage group 5 of the Theobroma cacao cv. Matina 1-6 genome. Using various mixtures of read coverages from paired-end and linear 454 libraries, multiple assemblies of varied quality were generated. Quality was assessed by comparing the assembly of 454 reads with a subset of ten BACs individually sequenced and assembled using Sanger reads. A mixture of reads optimal for assembly was identified. We found, furthermore, that a quality assembly suitable for serving as a reference genome template could be obtained even with a reduced depth of sequencing coverage. Annotation of the resulting assembly revealed several genes potentially responsible for three T. cacao traits: black pod disease resistance, bean shape index, and pod weight. Conclusions Our results, as with other pooled BAC sequencing reports, suggest that pooling portions of a minimum tiling path derived from a BAC-based physical map is an effective method to target sub-genomic regions for sequencing. While we focused on a single QTL region, other QTL regions of importance could be similarly sequenced allowing for biological discovery to take place before a high quality whole-genome assembly is completed.

  7. swDMR: A Sliding Window Approach to Identify Differentially Methylated Regions Based on Whole Genome Bisulfite Sequencing.

    Directory of Open Access Journals (Sweden)

    Zhen Wang

    Full Text Available DNA methylation is a widespread epigenetic modification that plays an essential role in gene expression through transcriptional regulation and chromatin remodeling. The emergence of whole genome bisulfite sequencing (WGBS) represents an important milestone in the detection of DNA methylation. Characterization of differentially methylated regions (DMRs) is also fundamental for further functional analysis. In this study, we present swDMR (http://sourceforge.net/projects/swDMR/) for the comprehensive analysis of DMRs from whole genome methylation profiles by a sliding window approach. It is an integrated tool designed for WGBS data, which not only implements accessible statistical methods to perform hypothesis tests adapted to two or more samples without replicates, but also controls the false discovery rate by multiple test correction. Downstream analysis tools are also provided, including cluster, annotation and visualization modules. In summary, based on WGBS data, swDMR can produce abundant information on differentially methylated regions. As a convenient and flexible tool, we believe swDMR will bring us closer to unveiling the potential functional regions involved in epigenetic regulation.

  8. Mapping Second Chromosome Mutations to Defined Genomic Regions in Drosophila melanogaster.

    Science.gov (United States)

    Kahsai, Lily; Cook, Kevin R

    2018-01-04

    Hundreds of Drosophila melanogaster stocks are currently maintained at the Bloomington Drosophila Stock Center with mutations that have not been associated with sequence-defined genes. They have been preserved because they have interesting loss-of-function phenotypes. The experimental value of these mutations would be increased by tying them to specific genomic intervals so that geneticists can more easily associate them with annotated genes. Here, we report the mapping of 85 second chromosome complementation groups in the Bloomington collection to specific, small clusters of contiguous genes or individual genes in the sequenced genome. This information should prove valuable to Drosophila geneticists interested in processes associated with particular phenotypes and those searching for mutations affecting specific sequence-defined genes. Copyright © 2018 Kahsai,Cook.

  9. Mapping Second Chromosome Mutations to Defined Genomic Regions in Drosophila melanogaster

    Directory of Open Access Journals (Sweden)

    Lily Kahsai

    2018-01-01

    Full Text Available Hundreds of Drosophila melanogaster stocks are currently maintained at the Bloomington Drosophila Stock Center with mutations that have not been associated with sequence-defined genes. They have been preserved because they have interesting loss-of-function phenotypes. The experimental value of these mutations would be increased by tying them to specific genomic intervals so that geneticists can more easily associate them with annotated genes. Here, we report the mapping of 85 second chromosome complementation groups in the Bloomington collection to specific, small clusters of contiguous genes or individual genes in the sequenced genome. This information should prove valuable to Drosophila geneticists interested in processes associated with particular phenotypes and those searching for mutations affecting specific sequence-defined genes.

  10. Genome-wide occupancy profile of mediator and the Srb8-11 module reveals interactions with coding regions

    DEFF Research Database (Denmark)

    Zhu, Xuefeng; Wirén, Marianna; Sinha, Indranil

    2006-01-01

    to investigate genome-wide localization of Mediator and the Srb8-11 module in fission yeast. Mediator and the Srb8-11 module display similar binding patterns, and interactions with promoters and upstream activating sequences correlate with increased transcription activity. Unexpectedly, Mediator also interacts...... with the downstream coding region of many genes. These interactions display a negative bias for positions closer to the 5' ends of open reading frames (ORFs) and appear functionally important, because downregulation of transcription in a temperature-sensitive med17 mutant strain correlates with increased Mediator...

  11. Genomic analysis of a 1 Mb region near the telomere of Hessian fly chromosome X2 and avirulence gene vH13

    Directory of Open Access Journals (Sweden)

    Chen Ming-Shun

    2006-01-01

    Full Text Available Abstract Background To gain insight into the Mayetiola destructor (Hessian fly) genome, we performed an in silico comparative genomic analysis utilizing genetic mapping, genomic sequence and EST sequence data along with data available from public databases. Results Chromosome walking and FISH were utilized to identify a contig of 50 BAC clones near the telomere of the short arm of Hessian fly chromosome X2 and near the avirulence gene vH13. These clones enabled us to correlate physical and genetic distance in this region of the Hessian fly genome. Sequence data from these BAC ends encompassing a 760 kb region, and a fully sequenced and assembled 42.6 kb BAC clone, were utilized to perform a comparative genomic study. In silico gene prediction combined with BLAST analyses was used to determine putative orthology to the sequenced dipteran genomes of the fruit fly, Drosophila melanogaster, and the malaria mosquito, Anopheles gambiae, and to infer evolutionary relationships. Conclusion This initial effort enables us to advance our understanding of the structure, composition and evolution of the genome of this important agricultural pest and is an invaluable tool for a whole genome sequencing effort.

  12. Unique and conserved genome regions in Vibrio harveyi and related species in comparison with the shrimp pathogen Vibrio harveyi CAIM 1792.

    Science.gov (United States)

    Espinoza-Valles, Iliana; Vora, Gary J; Lin, Baochuan; Leekitcharoenphon, Pimlapas; González-Castillo, Adrián; Ussery, Dave; Høj, Lone; Gomez-Gil, Bruno

    2015-09-01

    Vibrio harveyi CAIM 1792 is a marine bacterial strain that causes mortality in farmed shrimp in north-west Mexico, and the identification of virulence genes in this strain is important for understanding its pathogenicity. The aim of this work was to compare the V. harveyi CAIM 1792 genome with related genome sequences to determine their phylogenic relationship and explore unique regions in silico that differentiate this strain from other V. harveyi strains. Twenty-one newly sequenced genomes were compared in silico against the CAIM 1792 genome at nucleotidic and predicted proteome levels. The proteome of CAIM 1792 had higher similarity to those of other V. harveyi strains (78%) than to those of the other closely related species Vibrio owensii (67%), Vibrio rotiferianus (63%) and Vibrio campbellii (59%). Pan-genome ORFans trees showed the best fit with the accepted phylogeny based on DNA-DNA hybridization and multi-locus sequence analysis of 11 concatenated housekeeping genes. SNP analysis clustered 34/38 genomes within their accepted species. The pangenomic and SNP trees showed that V. harveyi is the most conserved of the four species studied and V. campbellii may be divided into at least three subspecies, supported by intergenomic distance analysis. blastp atlases were created to identify unique regions among the genomes most related to V. harveyi CAIM 1792; these regions included genes encoding glycosyltransferases, specific type restriction modification systems and a transcriptional regulator, LysR, reported to be involved in virulence, metabolism, quorum sensing and motility.

  13. Development and validation of new SSR markers from expressed regions in the garlic genome

    Directory of Open Access Journals (Sweden)

    Meryem Ipek

    2015-02-01

    Full Text Available Only a limited number of simple sequence repeat (SSR) markers is available for the genome of garlic (Allium sativum L.) despite the fact that SSR markers have become one of the most preferred DNA marker systems. To develop new SSR markers for the garlic genome, garlic expressed sequence tags (ESTs) at the publicly available GarlicEST database were screened for SSR motifs and a total of 132 SSR motifs were identified. Primer pairs were designed for 50 SSR motifs and 24 of these primer pairs were selected as SSR markers based on their consistent amplification patterns and polymorphisms. In addition, two SSR markers were developed from the sequences of garlic cDNA-AFLP fragments. The use of 26 EST-SSR markers for the assessment of genetic relationship was tested using 31 garlic genotypes. Twenty-six EST-SSR markers amplified 130 polymorphic DNA fragments and the number of polymorphic alleles per SSR marker ranged from 2 to 13 with an average of 5 alleles. Observed heterozygosity and polymorphism information content (PIC) of the SSR markers were between 0.23 and 0.88, and 0.20 and 0.87, respectively. Twenty-one out of the 31 garlic genotypes were analyzed in a previous study using AFLP markers and the garlic genotypes clustered together with AFLP markers were also grouped together with EST-SSR markers, demonstrating high concordance between AFLP and EST-SSR marker systems and possible immediate application of EST-SSR markers for fingerprinting of garlic clones. EST-SSR markers could be used in genetic studies such as genetic mapping, association mapping, genetic diversity and comparison of the genomes of Allium species.
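    For readers unfamiliar with the two marker statistics reported in this record, the following is a minimal sketch of how they are commonly computed. It assumes the widely used PIC definition, PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2, and codominant diploid genotype calls; the abstract does not state the exact formulas or software the authors used, and the function names `pic` and `observed_heterozygosity` are hypothetical.

```python
def pic(freqs):
    """Polymorphism information content for one marker.

    freqs: allele frequencies for the marker, expected to sum to 1.
    """
    if abs(sum(freqs) - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in freqs)
    cross = 0.0
    for i in range(len(freqs)):
        for j in range(i + 1, len(freqs)):
            cross += 2.0 * freqs[i] ** 2 * freqs[j] ** 2
    return 1.0 - homozygosity - cross


def observed_heterozygosity(genotypes):
    """Fraction of individuals carrying two different alleles.

    genotypes: list of (allele_a, allele_b) pairs, one per individual.
    """
    if not genotypes:
        raise ValueError("need at least one genotype")
    het = sum(1 for a, b in genotypes if a != b)
    return het / len(genotypes)


if __name__ == "__main__":
    # Hypothetical marker with two equally frequent alleles.
    print(pic([0.5, 0.5]))
    # Hypothetical genotype calls (fragment sizes in bp) for four accessions.
    print(observed_heterozygosity([(150, 154), (150, 150), (154, 158), (150, 154)]))
```

    A marker with a single allele gives a PIC of 0, and PIC rises as alleles become more numerous and more evenly distributed, which is why it is used to rank markers by informativeness.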

  14. A genome-wide association scan in pig identifies novel regions associated with feed efficiency trait

    DEFF Research Database (Denmark)

    Sahana, Goutam; Kadlecová, Veronika; Hornshøj, Henrik

    2013-01-01

    Feed conversion ratio (FCR) is an economically important trait in pigs and feed accounts for a significant proportion of the costs involved in pig production. In this study we used a high density SNP chip panel, Porcine SNP60 BeadChip, to identify association between FCR and SNP markers... and to study the genetic architecture of the trait. After quality control, a total of 30,847 SNPs that could be mapped to the 18 porcine autosomes (SSC) following the pig genome assembly 10.2, were used in the analyses. Deregressed estimated breeding value was used as the response variable. A total of 3...

  15. Characterization of variant Salmonella genomic island 1 multidrug resistance regions from serovars Typhimurium DT104 and Agona.

    Science.gov (United States)

    Boyd, David; Cloeckaert, Axel; Chaslus-Dancla, Elisabeth; Mulvey, Michael R

    2002-06-01

    Strains of multidrug-resistant Salmonella enterica serovar Typhimurium DT104 (DT104) and S. enterica serovar Agona (Agona) have been found to harbor Salmonella genomic island 1 (SGI1), a 43-kb genomic region that contains many of the drug resistance genes. Such strains are resistant to ampicillin (pse-1), chloramphenicol/florfenicol (floR), streptomycin/spectinomycin (aadA2), sulfonamides (sul1), and tetracycline [tet(G)] (commonly called the ACSSuT phenotype). All five resistance genes are found in a 13-kb multidrug resistance (MDR) region consisting of an unusual class I integron structure related to In4. We examined DT104 and Agona strains that exhibited other resistance phenotypes to determine if the resistance genes were associated with variant SGI1 MDR regions. All strains were found to harbor variant SGI1-like elements by using a combination of Southern hybridization, PCR mapping, and sequencing. Variant SGI1-like elements were found with MDR regions consisting of (i) an integron consisting of the SGI1 MDR region with the addition of a region containing a putative transposase gene (orf513) and dfrA10 located between duplicated qacEDelta1/sulI genes (SGI1-A; ACSSuTTm); (ii) an integron with either an aadA2 (SSu) or a pse-1 (ASu) cassette (SGI1-C and SGI1-B, respectively); (iii) an integron consisting of the SGI1-C MDR region plus an orf513/dfrA10 region as in SGI1-A (SGI1-D; ASSuTm; ampicillin resistance due to a TEM beta-lactamase); and (iv) an integron related to that in SGI1 but which contains a 10-kb inversion between two copies of IS6100, one which is inserted in floR (SGI1-E; ASSuT). We hypothesize that the MDR of SGI1 is subject to recombinational events that lead to the various resistance phenotypes in the Salmonella strains in which it is found.

  16. Whole genome comparisons suggest random distribution of Mycobacterium ulcerans genotypes in a Buruli ulcer endemic region of Ghana.

    Directory of Open Access Journals (Sweden)

    Anthony S Ablordey

    2015-03-01

    Full Text Available Efforts to control the spread of Buruli ulcer--an emerging ulcerative skin infection caused by Mycobacterium ulcerans--have been hampered by our poor understanding of reservoirs and transmission. To help address this issue, we compared whole genomes from 18 clinical M. ulcerans isolates from a 30 km2 region within the Asante Akim North District, Ashanti region, Ghana, with 15 other M. ulcerans isolates from elsewhere in Ghana and the surrounding countries of Ivory Coast, Togo, Benin and Nigeria. Contrary to our expectations of finding minor DNA sequence variations among isolates representing a single M. ulcerans circulating genotype, we found instead two distinct genotypes. One genotype was closely related to isolates from neighbouring regions of Amansie West and Densu, consistent with the predicted local endemic clone, but the second genotype (separated by 138 single nucleotide polymorphisms [SNPs] from other Ghanaian strains) most closely matched M. ulcerans from Nigeria, suggesting another introduction of M. ulcerans to Ghana, perhaps from that country. Both the exotic genotype and the local Ghanaian genotype displayed highly restricted intra-strain genetic variation, with less than 50 SNP differences across a 5.2 Mbp core genome within each genotype. Interestingly, there was no discernible spatial clustering of genotypes at the local village scale. Interviews revealed no obvious epidemiological links among BU patients who had been infected with identical M. ulcerans genotypes but lived in geographically separate villages. We conclude that M. ulcerans is spread widely across the region, with multiple genotypes present in any one area. These data give us new perspectives on the behaviour of possible reservoirs and subsequent transmission mechanisms of M. ulcerans. These observations also show for the first time that M. ulcerans can be mobilized, introduced to a new area and then spread within a population. Potential reservoirs of M. ulcerans

  17. Whole genome comparisons suggest random distribution of Mycobacterium ulcerans genotypes in a Buruli ulcer endemic region of Ghana.

    Science.gov (United States)

    Ablordey, Anthony S; Vandelannoote, Koen; Frimpong, Isaac A; Ahortor, Evans K; Amissah, Nana Ama; Eddyani, Miriam; Durnez, Lies; Portaels, Françoise; de Jong, Bouke C; Leirs, Herwig; Porter, Jessica L; Mangas, Kirstie M; Lam, Margaret M C; Buultjens, Andrew; Seemann, Torsten; Tobias, Nicholas J; Stinear, Timothy P

    2015-03-01

    Efforts to control the spread of Buruli ulcer--an emerging ulcerative skin infection caused by Mycobacterium ulcerans--have been hampered by our poor understanding of reservoirs and transmission. To help address this issue, we compared whole genomes from 18 clinical M. ulcerans isolates from a 30 km2 region within the Asante Akim North District, Ashanti region, Ghana, with 15 other M. ulcerans isolates from elsewhere in Ghana and the surrounding countries of Ivory Coast, Togo, Benin and Nigeria. Contrary to our expectations of finding minor DNA sequence variations among isolates representing a single M. ulcerans circulating genotype, we found instead two distinct genotypes. One genotype was closely related to isolates from neighbouring regions of Amansie West and Densu, consistent with the predicted local endemic clone, but the second genotype (separated by 138 single nucleotide polymorphisms [SNPs] from other Ghanaian strains) most closely matched M. ulcerans from Nigeria, suggesting another introduction of M. ulcerans to Ghana, perhaps from that country. Both the exotic genotype and the local Ghanaian genotype displayed highly restricted intra-strain genetic variation, with less than 50 SNP differences across a 5.2 Mbp core genome within each genotype. Interestingly, there was no discernible spatial clustering of genotypes at the local village scale. Interviews revealed no obvious epidemiological links among BU patients who had been infected with identical M. ulcerans genotypes but lived in geographically separate villages. We conclude that M. ulcerans is spread widely across the region, with multiple genotypes present in any one area. These data give us new perspectives on the behaviour of possible reservoirs and subsequent transmission mechanisms of M. ulcerans. These observations also show for the first time that M. ulcerans can be mobilized, introduced to a new area and then spread within a population. Potential reservoirs of M. ulcerans thus might include

  18. Evaluation of Apis mellifera syriaca Levant Region honeybee conservation using Comparative Genome Hybridization

    Science.gov (United States)

    Apis mellifera syriaca is the native honeybee subspecies of Jordan and much of the Levant Region. It expresses behavioral adaptations to a regional climate with very high temperatures, nectar dearth in summer, attacks of the Oriental wasp and is resistant to Varroa mites. The A. m. syriaca control r...

  19. A linear time algorithm for detecting long genomic regions enriched with a specific combination of epigenetic states.

    Science.gov (United States)

    Ichikawa, Kazuki; Morishita, Shinichi

    2015-01-01

    Epigenetic modifications are essential for controlling gene expression. Recent studies have shown that not only single epigenetic modifications but also combinations of multiple epigenetic modifications play vital roles in gene regulation. A striking example is the long hypomethylated regions enriched with modified H3K27me3 (called "K27HMD" regions), which are exposed to suppress the expression of key developmental genes relevant to cellular development and differentiation during embryonic stages in vertebrates. It is thus a biologically important issue to develop an effective optimization algorithm for detecting long DNA regions (e.g., >4 kbp in size) that harbor a specific combination of epigenetic modifications (e.g., K27HMD regions). However, to date, optimization algorithms for these purposes have received little attention, and available methods are still heuristic and ad hoc. In this paper, we propose a linear time algorithm for calculating a set of non-overlapping regions that maximizes the sum of similarities between the vector of focal epigenetic states and the vectors of raw epigenetic states at DNA positions in the set of regions. The average elapsed time to process the epigenetic data of any of the human chromosomes was less than 2 seconds on an Intel Xeon CPU. To demonstrate the effectiveness of the algorithm, we estimated large K27HMD regions in the medaka and human genomes using our method, ChromHMM, and a heuristic method. We confirmed the advantages of our method over the two other methods. Our method is flexible enough to handle other types of epigenetic combinations. The program that implements the method is called "CSMinfinder" and is made available at: http://mlab.cb.k.u-tokyo.ac.jp/~ichikawa/Segmentation/
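    The optimization problem stated in this record (choose non-overlapping regions that maximize a summed per-position score) can be sketched with a simple linear-time dynamic program. This is not the authors' CSMinfinder algorithm, whose details the abstract does not give. It is an illustrative stand-in under stated assumptions: each position already carries one signed score (for example a similarity to the focal state minus some baseline, so that unwanted positions are negative), and every reported region must span at least `min_len` positions. The function name `best_regions` and both parameters are hypothetical.

```python
def best_regions(scores, min_len):
    """Pick non-overlapping regions maximizing the total score.

    scores: per-position signed scores along a chromosome.
    min_len: minimum number of positions per region (assumed constraint).
    Returns (total_score, regions) with regions as half-open (start, end).
    Runs in O(n) time using prefix sums and a running maximum.
    """
    n = len(scores)
    prefix = [0.0] * (n + 1)
    for i, s in enumerate(scores):
        prefix[i + 1] = prefix[i] + s

    best = [0.0] * (n + 1)    # best[i]: optimum using positions [0, i)
    choice = [None] * (n + 1)  # start of the region ending at i, if any

    # Running maximum of best[j] - prefix[j] over all j <= i - min_len.
    run_val = float("-inf")
    run_idx = -1

    for i in range(1, n + 1):
        j = i - min_len
        if j >= 0:
            cand = best[j] - prefix[j]
            if cand > run_val:
                run_val, run_idx = cand, j
        best[i] = best[i - 1]  # option 1: position i-1 is outside any region
        if run_idx >= 0:
            total = run_val + prefix[i]  # option 2: a region [run_idx, i)
            if total > best[i]:
                best[i] = total
                choice[i] = run_idx

    # Trace back the chosen regions from the right end.
    regions = []
    i = n
    while i > 0:
        if choice[i] is None:
            i -= 1
        else:
            regions.append((choice[i], i))
            i = choice[i]
    regions.reverse()
    return best[n], regions


if __name__ == "__main__":
    # Hypothetical scores: two enriched stretches separated by a depleted site.
    print(best_regions([1, 1, -5, 2, 2, 2, -1], min_len=2))
```

    The minimum-length constraint is what makes the problem non-trivial: without it, the optimum is simply every maximal run of positive scores, whereas with it a region may need to absorb some negative positions to reach the required size.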

  20. Heritability and genome-wide linkage in US and Australian twins identify novel genomic regions controlling chromogranin a: implications for secretion and blood pressure.

    Science.gov (United States)

    O'Connor, Daniel T; Zhu, Gu; Rao, Fangwen; Taupenot, Laurent; Fung, Maple M; Das, Madhusudan; Mahata, Sushil K; Mahata, Manjula; Wang, Lei; Zhang, Kuixing; Greenwood, Tiffany A; Shih, Pei-an Betty; Cockburn, Myles G; Ziegler, Michael G; Stridsberg, Mats; Martin, Nicholas G; Whitfield, John B

    2008-07-15

    Chromogranin A (CHGA) triggers catecholamine secretory granule biogenesis, and its catestatin fragment inhibits catecholamine release. We approached catestatin heritability using twin pairs, coupled with genome-wide linkage, in a series of twin and sibling pairs from 2 continents. Hypertensive patients had elevated CHGA coupled with reduction in catestatin, suggesting diminished conversion of precursor to catestatin. Heritability for catestatin in twins was 44% to 60%. Six hundred fifteen nuclear families yielded 870 sib pairs for linkage, with significant logarithm of odds peaks on chromosomes 4p, 4q, and 17q. Because acidification of catecholamine secretory vesicles determines CHGA trafficking and processing to catestatin, we genotyped at positional candidate ATP6N1, bracketed by peak linkage markers on chromosome 17q, encoding a subunit of vesicular H(+)-translocating ATPase. The minor allele diminished CHGA secretion and processing to catestatin. The ATP6N1 variant also influenced blood pressure in 1178 individuals with the most extreme blood pressure values in the population. In chromaffin cells, inhibition of H(+)-ATPase diverted CHGA from regulated to constitutive secretory pathways. We established heritability of catestatin in twins from 2 continents. Linkage identified 3 regions contributing to catestatin, likely novel determinants of sympathochromaffin exocytosis. At 1 such positional candidate (ATP6N1), variation influenced CHGA secretion and processing to catestatin, confirming the mechanism of a novel trans-QTL for sympathochromaffin activity and blood pressure.

  1. Proteogenomics analysis reveals specific genomic orientations of distal regulatory regions composed by non-canonical histone variants.

    Science.gov (United States)

    Won, Kyoung-Jae; Choi, Inchan; LeRoy, Gary; Zee, Barry M; Sidoli, Simone; Gonzales-Cope, Michelle; Garcia, Benjamin A

    2015-01-01

    Histone variants play further important roles in DNA packaging and controlling gene expression. However, our understanding about their composition and their functions is limited. Integrating proteomic and genomic approaches, we performed a comprehensive analysis of the epigenetic landscapes containing the four histone variants H3.1, H3.3, H2A.Z, and macroH2A. These histones were FLAG-tagged in HeLa cells and purified using chromatin immunoprecipitation (ChIP). By adopting ChIP followed by mass spectrometry (ChIP-MS), we quantified histone post-translational modifications (PTMs) and histone variant nucleosomal ratios in highly purified mononucleosomes. Subsequent ChIP followed by next-generation sequencing (ChIP-seq) was used to map the genome-wide localization of the analyzed histone variants and define their chromatin domains. Finally, we included in our study large datasets contained in the ENCODE database. We newly identified a group of regulatory regions enriched in H3.1 and the histone variant associated with repressive marks macroH2A. Systematic analysis identified both symmetric and asymmetric patterns of histone variant occupancies at intergenic regulatory regions. Strikingly, these directional patterns were associated with RNA polymerase II (PolII). These asymmetric patterns correlated with the enhancer activities measured using global run-on sequencing (GRO-seq) data. Our studies show that H2A.Z and H3.3 delineate the orientation of transcription at enhancers as observed at promoters. We also showed that enhancers with skewed histone variant patterns well facilitate enhancer activity. Collectively, our study indicates that histone variants are deposited at regulatory regions to assist gene regulation.

  2. QTL identification of flowering time at three different latitudes reveals homeologous genomic regions that control flowering in soybean.

    Science.gov (United States)

    Liu, Weixian; Kim, Moon Young; Kang, Yang Jae; Van, Kyujung; Lee, Yeong-Ho; Srinives, Peerasak; Yuan, Dong Lin; Lee, Suk-Ha

    2011-08-01

    Since the genetic control of flowering time is very important in photoperiod-sensitive soybean (Glycine max (L.) Merr.), genes affecting flowering under different environment conditions have been identified and described. The objectives were to identify quantitative trait loci (QTLs) for flowering time in different latitudinal and climatic regions, and to understand how chromosomal rearrangement and genome organization contribute to flowering time in soybean. Recombinant inbred lines from a cross between late-flowering 'Jinpumkong 2' and early-flowering 'SS2-2' were used to evaluate the phenotypic data for days to flowering (DF) collected from Kamphaeng Saen, Thailand (14°01'N), Suwon, Korea (37°15'N), and Longjing, China (42°46'N). A weakly positive phenotypic correlation (r = 0.36) was found between DF in Korea and Thailand; however, a strong correlation (r = 0.74) was shown between Korea and China. After 178 simple sequence repeat (SSR) markers were placed on a genetic map spanning 2,551.7 cM, four independent DF QTLs were identified on different chromosomes (Chrs). Among them, three QTLs on Chrs 9, 13 and 16 were either Thailand- or Korea-specific. The DF QTL on Chr 6 was identified in both Korea and China, suggesting it is less environment-sensitive. Comparative analysis of four DF QTL regions revealed a syntenic relationship between two QTLs on Chrs 6 and 13. All five duplicated gene pairs clustered in the homeologous genomic regions were found to be involved in the flowering. Identification and comparative analysis of multiple DF QTLs from different environments will facilitate the significant improvement in soybean breeding programs with respect to control of flowering time.

  3. Genome sequence of the acid-tolerant Desulfovibrio sp. DV isolated from the sediments of a Pb-Zn mine tailings dam in the Chita region, Russia

    Directory of Open Access Journals (Sweden)

    Anastasiia Kovaliova

    2017-03-01

    Full Text Available Here we report the draft genome sequence of the acid-tolerant Desulfovibrio sp. DV isolated from the sediments of a Pb-Zn mine tailings dam in the Chita region, Russia. The draft genome has a size of 4.9 Mb and encodes multiple K+-transporters and proton-consuming decarboxylases. The phylogenetic analysis based on concatenated ribosomal proteins revealed that strain DV clusters together with the acid-tolerant Desulfovibrio sp. TomC and Desulfovibrio magneticus. The draft genome sequence and annotation have been deposited at GenBank under the accession number MLBG00000000.

  4. Identification of candidate domestication regions in the radish genome based on high-depth resequencing analysis of 17 genotypes.

    Science.gov (United States)

    Kim, Namshin; Jeong, Young-Min; Jeong, Seongmun; Kim, Goon-Bo; Baek, Seunghoon; Kwon, Young-Eun; Cho, Ara; Choi, Sang-Bong; Kim, Jiwoong; Lim, Won-Jun; Kim, Kyoung Hyoun; Park, Won; Kim, Jae-Yoon; Kim, Jin-Hyun; Yim, Bomi; Lee, Young Joon; Chun, Byung-Moon; Lee, Young-Pyo; Park, Beom-Seok; Yu, Hee-Ju; Mun, Jeong-Hwan

    2016-09-01

    This study provides high-quality variation data of diverse radish genotypes. Genome-wide SNP comparison along with RNA-seq analysis identified candidate genes related to domestication that have potential as trait-related markers for genetics and breeding of radish. Radish (Raphanus sativus L.) is an annual root vegetable crop that also encompasses diverse wild species. Radish has a long history of domestication, but the origins and selective sweep of cultivated radishes remain controversial. Here, we present comprehensive whole-genome resequencing analysis of radish to explore genomic variation between the radish genotypes and to identify genetic bottlenecks due to domestication in Asian cultivars. High-depth resequencing and multi-sample genotyping analysis of ten cultivated and seven wild accessions obtained 4.0 million high-quality homozygous single-nucleotide polymorphisms (SNPs)/insertions or deletions. Variation analysis revealed that Asian cultivated radish types are closely related to wild Asian accessions, but are distinct from European/American cultivated radishes, supporting the notion that Asian cultivars were domesticated from wild Asian genotypes. SNP comparison between Asian genotypes identified 153 candidate domestication regions (CDRs) containing 512 genes. Network analysis of the genes in CDRs functioning in plant signaling pathways and biochemical processes identified group of genes related to root architecture, cell wall, sugar metabolism, and glucosinolate biosynthesis. Expression profiling of the genes during root development suggested that domestication-related selective advantages included a main taproot with few branched lateral roots, reduced cell wall rigidity and favorable taste. Overall, this study provides evolutionary insights into domestication-related genetic selection in radish as well as identification of gene candidates with the potential to act as trait-related markers for background selection of elite lines in molecular

  5. Identification and characterization of genomic regions on chromosomes 4 and 8 that control the rate of photosynthesis in rice leaves

    Science.gov (United States)

    Adachi, Shunsuke; Tsuru, Yukiko; Nito, Naoko; Murata, Kazumasa; Yamamoto, Toshio; Ebitani, Takeshi; Ookawa, Taiichiro; Hirasawa, Tadashi

    2011-01-01

    DNA marker-assisted selection appears to be a promising strategy for improving rates of leaf photosynthesis in rice. The rate of leaf photosynthesis was significantly higher in a high-yielding indica variety, Habataki, than in the most popular Japanese variety, Koshihikari, at the full heading stage as a result of the higher level of leaf nitrogen at the same rate of application of nitrogen and the higher stomatal conductance even when the respective levels of leaf nitrogen were the same. The higher leaf nitrogen content of Habataki was caused by the greater accumulation of nitrogen by plants. The higher stomatal conductance of Habataki was caused by the higher hydraulic conductance. Using progeny populations and selected lines derived from a cross between Koshihikari and Habataki, it was possible to identify the genomic regions responsible for the rate of photosynthesis within a 2.1 Mb region between RM17459 and RM17552 and within a 1.2 Mb region between RM6999 and RM22529 on the long arm of chromosome 4 and on the short arm of chromosome 8, respectively. The designated region on chromosome 4 of Habataki was responsible for both the increase in the nitrogen content of leaves and hydraulic conductance in the plant by increasing the root surface area. The designated region on chromosome 8 of Habataki was responsible for the increase in hydraulic conductance by increasing the root hydraulic conductivity. The results suggest that it may be possible to improve photosynthesis in rice leaves by marker-assisted selection that focuses on these regions of chromosomes 4 and 8. PMID:21296764

  6. Pairing of homologous regions in the mouse genome is associated with transcription but not imprinting status.

    Directory of Open Access Journals (Sweden)

    Christel Krueger

    Although somatic homologous pairing is common in Drosophila it is not generally observed in mammalian cells. However, a number of regions have recently been shown to come into close proximity with their homologous allele, and it has been proposed that pairing might be involved in the establishment or maintenance of monoallelic expression. Here, we investigate the pairing properties of various imprinted and non-imprinted regions in mouse tissues and ES cells. We find by allele-specific 4C-Seq and DNA FISH that the Kcnq1 imprinted region displays frequent pairing but that this is not dependent on monoallelic expression. We demonstrate that pairing involves larger chromosomal regions and that the two chromosome territories come close together. Frequent pairing is not associated with imprinted status or DNA repair, but is influenced by chromosomal location and transcription. We propose that homologous pairing is not exclusive to specialised regions or specific functional events, and speculate that it provides the cell with the opportunity of trans-allelic effects on gene regulation.

  7. Pairing of homologous regions in the mouse genome is associated with transcription but not imprinting status.

    Science.gov (United States)

    Krueger, Christel; King, Michelle R; Krueger, Felix; Branco, Miguel R; Osborne, Cameron S; Niakan, Kathy K; Higgins, Michael J; Reik, Wolf

    2012-01-01

    Although somatic homologous pairing is common in Drosophila it is not generally observed in mammalian cells. However, a number of regions have recently been shown to come into close proximity with their homologous allele, and it has been proposed that pairing might be involved in the establishment or maintenance of monoallelic expression. Here, we investigate the pairing properties of various imprinted and non-imprinted regions in mouse tissues and ES cells. We find by allele-specific 4C-Seq and DNA FISH that the Kcnq1 imprinted region displays frequent pairing but that this is not dependent on monoallelic expression. We demonstrate that pairing involves larger chromosomal regions and that the two chromosome territories come close together. Frequent pairing is not associated with imprinted status or DNA repair, but is influenced by chromosomal location and transcription. We propose that homologous pairing is not exclusive to specialised regions or specific functional events, and speculate that it provides the cell with the opportunity of trans-allelic effects on gene regulation.

  8. Characterization of untranslated regions of the salmonid alphavirus 3 (SAV3) genome and construction of a SAV3 based replicon

    Directory of Open Access Journals (Sweden)

    Rimstad Espen

    2009-10-01

    Salmonid alphavirus (SAV) causes disease in farmed salmonid fish and is divided into different genetic subtypes (SAV1-6). Here we report the cloning and characterization of the 5'- and 3'-untranslated regions (UTRs) of a SAV3 isolated from Atlantic salmon in Norway. The sequences of the UTRs are very similar to those of SAV1 and SAV2, but single nucleotide polymorphisms are present, also in the 3'-conserved sequence element (3'-CSE). Prediction of the RNA secondary structure suggested putative stem-loop structures in both the 5'- and 3'-ends, similar to those of alphaviruses from the terrestrial environment, indicating that the general genome replication initiation strategy for alphaviruses is also utilized by SAV. A DNA replicon vector, pmSAV3, based upon a pVAX1 backbone and the SAV3 genome was constructed, and the SAV3 non-structural proteins were used to express a reporter gene controlled by the SAV3 subgenomic promoter. Transfection of pmSAV3 into CHSE and BF2 cell lines resulted in expression of the reporter protein, confirming that the cloned SAV3 replication apparatus and UTRs are functional in fish cells.

  9. Genomic organization of the human PAX 3 gene: DNA sequence analysis of the region disrupted in alveolar rhabdomyosarcoma

    Energy Technology Data Exchange (ETDEWEB)

    Macina, R.A.; Galili, N.; Riethman, H.C. [Wistar Inst., Philadelphia, PA (United States)] [and others

    1995-03-01

    Mutations in the human PAX3 gene have previously been associated with two distinct diseases, Waardenburg syndrome and alveolar rhabdomyosarcoma. In this report the authors establish that the normal human PAX3 gene is encoded by 8 exons. Intron-exon boundary sequences were obtained for PAX3 exons 5, 6, 7, and 8 and together with previous work provide the complete genomic sequence organization for PAX3. Difficulties in obtaining overlapping genomic clone coverage of PAX3 were circumvented in part by RARE cleavage mapping, which showed that the entire PAX3 gene spans 100 kb of chromosome 2. Sequence analysis of the last intron of PAX3, which contains the previously mapped t(2;13)(q35;q14) translocation breakpoints of alveolar rhabdomyosarcoma, revealed the presence of a pair of inverted Alu repeats and a pair of inverted (GT)n-rich microsatellite repeats within a 5-kb region. This work establishes the complete structure of PAX3 and will permit high-resolution analyses of this locus for mutations associated with Waardenburg syndrome, alveolar rhabdomyosarcoma, and other phenotypes for which PAX3 may be a candidate locus. 31 refs., 5 figs., 1 tab.

  10. Regional mapping of the phenylalanine hydroxylase gene and the phenylketonuria locus in the human genome

    Energy Technology Data Exchange (ETDEWEB)

    Lidsky, A.S.; Law, M.L.; Morse, H.G.; Kao, F.T.; Rabin, M.; Ruddle, F.H.; Woo, S.L.C.

    1985-09-01

    Phenylketonuria (PKU) is an autosomal recessive disorder of amino acid metabolism caused by a deficiency of the hepatic enzyme phenylalanine hydroxylase. To define the regional map position of the disease locus and the PAH gene on human chromosome 12, DNA was isolated from human-hamster somatic cell hybrids with various deletions of human chromosome 12 and was analyzed by Southern blot analysis using the human cDNA PAH clone as a hybridization probe. From these results, together with detailed biochemical and cytogenetic characterization of the hybrid cells, the region on chromosome 12 containing the human PAH gene has been defined as 12q14.3→qter. The PAH map position on chromosome 12 was further localized by in situ hybridization of 125I-labeled human PAH cDNA to chromosomes prepared from a human lymphoblastoid cell line. Results of these experiments demonstrated that the region on chromosome 12 containing the PAH gene and the PKU locus in man is 12q22→12q24.1. These results not only provide a regionalized map position for a major human disease locus but also can serve as a reference point for linkage analysis with other DNA markers on human chromosome 12.

  11. Regional mapping of the phenylalanine hydroxylase gene and the phenylketonuria locus in the human genome

    International Nuclear Information System (INIS)

    Lidsky, A.S.; Law, M.L.; Morse, H.G.; Kao, F.T.; Rabin, M.; Ruddle, F.H.; Woo, S.L.C.

    1985-01-01

    Phenylketonuria (PKU) is an autosomal recessive disorder of amino acid metabolism caused by a deficiency of the hepatic enzyme phenylalanine hydroxylase. To define the regional map position of the disease locus and the PAH gene on human chromosome 12, DNA was isolated from human-hamster somatic cell hybrids with various deletions of human chromosome 12 and was analyzed by Southern blot analysis using the human cDNA PAH clone as a hybridization probe. From these results, together with detailed biochemical and cytogenetic characterization of the hybrid cells, the region on chromosome 12 containing the human PAH gene has been defined as 12q14.3→qter. The PAH map position on chromosome 12 was further localized by in situ hybridization of 125I-labeled human PAH cDNA to chromosomes prepared from a human lymphoblastoid cell line. Results of these experiments demonstrated that the region on chromosome 12 containing the PAH gene and the PKU locus in man is 12q22→12q24.1. These results not only provide a regionalized map position for a major human disease locus but also can serve as a reference point for linkage analysis with other DNA markers on human chromosome 12.

  12. AFLPs reveal genomic regions not detected by RFLPs: a case study in tomato

    NARCIS (Netherlands)

    Bonnema, G.; Berg, van den P.; Lindhout, P.

    2002-01-01

    A set of three tomato chromosome 7 introgression lines (ILs) containing overlapping segments of Lycopersicon pennellii DNA was screened with a set of 10 EcoRI–MseI and 10 PstI–MseI AFLP primer combinations. A large number of markers were identified that mapped to one of the four regions of

  13. Presence of isochore structures in reptile genomes suggested by the relationship between GC contents of intron regions and those of coding regions.

    Science.gov (United States)

    Hamada, Kazuo; Horiike, Tokumasa; Ota, Hidetoshi; Mizuno, Keiko; Shinozawa, Takao

    2003-04-01

    Vertebrate genomes are mosaics of isochores. On the assumption that marked differences exist in the isochore structure between warm-blooded and cold-blooded animals, variations among vertebrates were previously attributed to adaptation to homeothermy. However, based on the data of coding regions from representatives of extant vertebrates, including a turtle, a crocodile (Archosauromorpha) and a few kinds of snakes (Lepidosauromorpha), it was recently hypothesized that the common ancestors of mammals, birds and extant reptiles already had the "warm-blooded" isochore structure. To test this hypothesis, the nucleotide sequences of alpha-globin genes including non-coding regions (introns) from two snakes, N. kaouthia and E. climacophora, were determined (accession numbers: AB104824, AB104825). The correlation between the GC contents in the introns and exons of alpha-globin genes from snakes and those from other vertebrates supports the above hypothesis. A similar analysis using data for exons and introns of other genes obtained from GenBank (Release 131) also supports the above hypothesis.
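    The comparison described in this abstract (GC content of introns against GC content of exons, gene by gene) can be sketched as follows. This is only an illustrative sketch: the function names and the idea of using a plain Pearson coefficient are assumptions, not the authors' stated procedure, and no sequence data from the study are reproduced here.

```python
from math import sqrt


def gc_content(seq):
    """Fraction of G or C among the unambiguous bases (A, C, G, T) of seq."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return (seq.count("G") + seq.count("C")) / counted


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for constant input")
    return sxy / sqrt(sxx * syy)


def intron_exon_gc_correlation(genes):
    """genes: iterable of (exon_sequence, intron_sequence) pairs, one per gene.

    Returns the correlation between exon GC and intron GC across genes.
    """
    exon_gc = [gc_content(exons) for exons, _ in genes]
    intron_gc = [gc_content(introns) for _, introns in genes]
    return pearson(exon_gc, intron_gc)
```

    A strongly positive coefficient across genes from many taxa would be the kind of pattern the abstract refers to; the threshold for "strong" is left to the analyst.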

  14. Re-annotation of the physical map of Glycine max for polyploid-like regions by BAC end sequence driven whole genome shotgun read assembly

    Directory of Open Access Journals (Sweden)

    Shultz Jeffry

    2008-07-01

    Background: Many of the world's most important food crops have either polyploid genomes or homeologous regions derived from segmental shuffling following polyploid formation. The soybean (Glycine max) genome has been shown to be composed of approximately four thousand short interspersed homeologous regions with 1, 2 or 4 copies per haploid genome by RFLP analysis, microsatellite anchors to BACs and by contigs formed from BAC fingerprints. Despite these similar regions, the genome has been sequenced by whole genome shotgun sequencing (WGS). Here the aim was to use BAC end sequences (BES) derived from three minimum tile paths (MTP) to examine the extent and homogeneity of polyploid-like regions within contigs and the extent of correlation between the polyploid-like regions inferred from fingerprinting and the polyploid-like sequences inferred from WGS matches. Results: When sequence divergence was 1–10%, the copy number of homeologous regions could be identified from sequence variation in WGS reads overlapping BES. Homeolog sequence variants (HSVs) were single nucleotide polymorphisms (SNPs; 89%) and single nucleotide indels (SNIs; 10%). Larger indels were rare but present (1%). Simulations that had predicted fingerprints of homeologous regions could be separated when divergence exceeded 2% were shown to be false. We show that a 5–10% sequence divergence is necessary to separate homeologs by fingerprinting. BES compared to WGS traces showed polyploid-like regions with less than 1% sequence divergence exist at 2.3% of the locations assayed. Conclusion: The use of HSVs like SNPs and SNIs to characterize BACs will improve contig building methods. The implications for bioinformatic and functional annotation of polyploid and paleopolyploid genomes show that a combined approach of BAC fingerprint based physical maps, WGS sequence and HSV-based partitioning of BAC clones from homeologous regions to separate contigs will allow reliable de

  15. Genetic and physical mapping of the genomic region spanning CMT4A

    Energy Technology Data Exchange (ETDEWEB)

    Othmane, K.B.; Loeb, D.; Roses, A.D. [Duke Univ. Medical Center, Durham, NC (United States)] [and others

    1994-09-01

    Autosomal recessive Charcot-Marie-Tooth disease (CMT4) is a severe childhood neuropathy classified into three types: A, B, and C. We previously mapped CMT4A to chromosome 8q13-q21 in four large Tunisian families. Analysis of recombination events suggested the order: cent.-D8S279-(D8S286,D8S164, CMT4A)-D8S84-tel. Families with types B and C were subsequently typed and linkage for these types was excluded for the CMT4A region and other known CMT loci. Recently, the gene for a major peripheral myelin protein (PMP2) was mapped by FISH to chromosome 8q21-q22 and therefore appeared to be a strong candidate gene for CMT4A. We used SSCP analysis, DNA sequencing, FISH and YAC mapping analysis, and demonstrated that PMP2 is not the defect in CMT4A. Using physical mapping data, we sublocalized a new genethon marker (D8S548) to the CMT4A region between D8S286 and D8S164. All affected CMT4A patients were homozygotes for this polymorphic microsatellite as expected from its physical localization. We screened the CEPH megabase YAC library using the closest markers; over 30 YACs were isolated and characterized by PFGE. FISH analysis revealed about 16% chimeras. The YACs span the 8 cM region between D8S279 and PMP2 (mapped distal to D8S84), with a current 1 cM gap between D8S164 and D8S84. We are currently using Alu-PCR and vectorette to develop end clones in order to identify new YACs in the region and further close this gap. Alu-PCR fragments have identified several new microsatellites in the region which can be used for additional mapping of the CMT4A gene.

  16. Identification and validation of genomic regions that affect shoot fly resistance in sorghum [Sorghum bicolor (L.) Moench].

    Science.gov (United States)

    Aruna, C; Bhagwat, V R; Madhusudhana, R; Sharma, Vittal; Hussain, T; Ghorade, R B; Khandalkar, H G; Audilakshmi, S; Seetharama, N

    2011-05-01

    Shoot fly is one of the most important pests affecting sorghum production. Identifying quantitative trait loci (QTL) affecting shoot fly resistance helps to clarify the underlying genetic mechanisms and the genetic basis of complex interactions among the component traits. The aim of the present study was to detect QTL for shoot fly resistance and the associated traits using a population of 210 RILs of the cross 27B (susceptible) × IS2122 (resistant). The RIL population was phenotyped in eight environments for shoot fly resistance (deadheart percentage), and in three environments for the component traits, such as glossiness, seedling vigor and trichome density. A linkage map was constructed with 149 marker loci comprising 127 genomic-microsatellite, 21 genic-microsatellite and one morphological marker. QTL analysis was performed using the MQM approach. Twenty-five QTL (five each for leaf glossiness and seedling vigor, 10 for deadhearts, two for adaxial trichome density and three for abaxial trichome density) were detected in individual and across environments. The LOD and R² (%) values of QTL ranged from 2.44 to 24.1 and 4.3 to 44.1%, respectively. For most of the QTL, the resistant parent, IS2122, contributed alleles for resistance, while at two QTL regions, the susceptible parent 27B also contributed alleles for resistance traits. Three genomic regions affected multiple traits, suggesting pleiotropy or tight linkage. Stable QTL were identified for the traits across different environments and genetic backgrounds by comparing the QTL in the study with previously reported QTL in sorghum. For the majority of the QTL, possible candidate genes were identified. The QTL identified will enable marker-assisted breeding for shoot fly resistance in sorghum.

  17. Quantitative trait loci mapping of genome regions controlling permethrin resistance in the mosquito Aedes aegypti.

    Science.gov (United States)

    Saavedra-Rodriguez, Karla; Strode, Clare; Flores Suarez, Adriana; Fernandez Salas, Ildefonso; Ranson, Hilary; Hemingway, Janet; Black, William C

    2008-10-01

    The mosquito Aedes aegypti is the principal vector of dengue and yellow fever flaviviruses. Permethrin is an insecticide used to suppress Ae. aegypti adult populations but metabolic and target site resistance to pyrethroids has evolved in many locations worldwide. Quantitative trait loci (QTL) controlling permethrin survival in Ae. aegypti were mapped in an F3 advanced intercross line. Parents came from a collection of mosquitoes from Isla Mujeres, México, that had been selected for permethrin resistance for two generations and a reference permethrin-susceptible strain originally from New Orleans. Following a 1-hr permethrin exposure, 439 F3 adult mosquitoes were phenotyped as knockdown resistant, knocked down/recovered, or dead. For QTL mapping, single nucleotide polymorphisms (SNPs) were identified at 22 loci with potential antixenobiotic activity including genes encoding cytochrome P450s (CYP), esterases (EST), or glutathione transferases (GST) and at 12 previously mapped loci. Seven antixenobiotic genes mapped to chromosome I, six to chromosome II, and nine to chromosome III. Two QTL of major effect were detected on chromosome III. One corresponds with a SNP previously associated with permethrin resistance in the para sodium channel gene and the second with the CCEunk7o esterase marker. Additional QTL of relatively minor effect were also found. These included two sex-linked QTL on chromosome I affecting knockdown and recovery and a QTL affecting survival and recovery. On chromosome II, one QTL affecting survival and a second affecting recovery were detected. The patterns confirm that mutations in the para gene cause target-site insensitivity and are the major source of permethrin resistance but that other genes dispersed throughout the genome contribute to recovery and survival of mosquitoes following permethrin exposure.

  18. Avian papillomaviruses: the parrot Psittacus erithacus papillomavirus (PePV) genome has a unique organization of the early protein region and is phylogenetically related to the chaffinch papillomavirus

    Directory of Open Access Journals (Sweden)

    Jenson A Bennett

    2002-07-01

    Background: An avian papillomavirus genome has been cloned from a cutaneous exophytic papilloma from an African grey parrot (Psittacus erithacus). The nucleotide sequence, genome organization, and phylogenetic position of the Psittacus erithacus papillomavirus (PePV) were determined. This PePV sequence represents the first complete avian papillomavirus genome defined. Results: The PePV genome (7304 basepairs) differs from other papillomaviruses, in that it has a unique organization of the early protein region lacking classical E6 and E7 open reading frames. Phylogenetic comparison of the PePV sequence with partial E1 and L1 sequences of the chaffinch (Fringilla coelebs) papillomavirus (FPV) reveals that these two avian papillomaviruses form a monophyletic cluster with a common branch that originates near the unresolved center of the papillomavirus evolutionary tree. Conclusions: The PePV genome has a unique layout of the early protein region which represents a novel prototypic genomic organization for avian papillomaviruses. The close relationship between PePV and FPV, and between their Psittaciformes and Passeriformes hosts, supports the hypothesis that papillomaviruses have co-evolved and speciated together with their host species throughout evolution.

  19. Genetic drift in hypervariable region 1 of the viral genome in persistent hepatitis C virus infection.

    OpenAIRE

    Kato, N; Ootsuyama, Y; Sekiya, H; Ohkoshi, S; Nakazawa, T; Hijikata, M; Shimotohno, K

    1994-01-01

    The hypervariable region 1 (HVR1) of the putative second envelope glycoprotein (gp70) of hepatitis C virus (HCV) contains a sequence-specific immunological B-cell epitope that induces the production of antibodies restricted to the specific viral isolate, and anti-HVR1 antibodies are involved in the genetic drift of HVR1 driven by immunoselection (N. Kato, H. Sekiya, Y. Ootsuyama, T. Nakazawa, M. Hijikata, S. Ohkoshi, and K. Shimotohno, J. Virol. 67:3923-3930, 1993). We further investigated th...

  20. Novel bioinformatics method for identification of genome-wide non-canonical spliced regions using RNA-Seq data.

    Directory of Open Access Journals (Sweden)

    Yongsheng Bai

    During endoplasmic reticulum (ER) stress, the endoribonuclease (RNase) Ire1α initiates removal of a 26 nt region from the mRNA encoding the transcription factor Xbp1 via an unconventional mechanism (atypically within the cytosol). This causes an open reading frame-shift that leads to altered transcriptional regulation of numerous downstream genes in response to ER stress as part of the unfolded protein response (UPR). Strikingly, other examples of targeted, unconventional splicing of short mRNA regions have yet to be reported. Our goal was to develop an approach to identify non-canonical, possibly very short, splicing regions using RNA-Seq data and apply it to ER stress-induced Ire1α heterozygous and knockout mouse embryonic fibroblast (MEF) cell lines to identify additional Ire1α targets. We developed a bioinformatics approach called the Read-Split-Walk (RSW) pipeline, and evaluated it using two Ire1α heterozygous and two Ire1α-null samples. The 26 nt non-canonical splice site in Xbp1 was detected as the top hit by our RSW pipeline in heterozygous samples but not in the negative control Ire1α knockout samples. We compared the Xbp1 results from our approach with results using the alignment programs BWA, Bowtie2, STAR, Exonerate and the Unix "grep" command. We then applied our RSW pipeline to RNA-Seq data from the SKBR3 human breast cancer cell line. RSW reported a large number of non-canonical spliced regions for 108 genes in chromosome 17, which were identified by an independent study. We conclude that our RSW pipeline is a practical approach for identifying non-canonical splice junction sites on a genome-wide level. We demonstrate that our pipeline can detect novel splice sites in RNA-Seq data generated under similar conditions for multiple species, in our case mouse and human.
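    The general split-read idea behind detecting a short excised region can be illustrated with a toy sketch. It is not the RSW pipeline itself: the function name, the exact-match search with `str.find`, and the `min_anchor` parameter are all assumptions made for illustration, and a real tool would use a proper aligner and tolerate mismatches.

```python
def find_split_gap(read, reference, min_anchor=8):
    """Toy split-read search on plain strings (exact matches only).

    Splits `read` at every position that leaves at least `min_anchor` bases
    on each side, places the left part on `reference`, then looks for the
    right part further downstream. Returns (left_end, right_start, gap_length)
    for the first split that fits, or None if the read matches contiguously
    or cannot be placed.
    """
    if read in reference:
        return None  # contiguous match: no evidence of a removed region
    for split in range(min_anchor, len(read) - min_anchor + 1):
        prefix, suffix = read[:split], read[split:]
        pos = reference.find(prefix)
        while pos != -1:
            left_end = pos + len(prefix)
            # Require at least one skipped base between the two anchors.
            right_start = reference.find(suffix, left_end + 1)
            if right_start != -1:
                return left_end, right_start, right_start - left_end
            pos = reference.find(prefix, pos + 1)
    return None
```

    In practice many reads supporting the same gap coordinates would be required before calling a candidate region; that counting step is omitted here.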

  1. Positive selection in the chromosome 16 VKORC1 genomic region has contributed to the variability of anticoagulant response in humans.

    Directory of Open Access Journals (Sweden)

    Blandine Patillon

    VKORC1 (vitamin K epoxide reductase complex subunit 1, 16p11.2) is the main genetic determinant of human response to oral anticoagulants of antivitamin K type (AVK). This gene was recently suggested to be a putative target of positive selection in East Asian populations. In this study, we genotyped the HGDP-CEPH Panel for six VKORC1 SNPs and downloaded chromosome 16 genotypes from the HGDP-CEPH database in order to characterize the geographic distribution of footprints of positive selection within and around this locus. A unique VKORC1 haplotype carrying the promoter mutation associated with AVK sensitivity showed especially high frequencies in all the 17 HGDP-CEPH East Asian population samples. VKORC1 and 24 neighboring genes were found to lie in a 505 kb region of strong linkage disequilibrium in these populations. Patterns of allele frequency differentiation and haplotype structure suggest that this genomic region has been submitted to a near complete selective sweep in all East Asian populations and only in this geographic area. The most extreme scores of the different selection tests are found within a smaller 45 kb region that contains VKORC1 and three other genes (BCKDK, MYST1 (KAT8), and PRSS8) with different functions. Because of the strong linkage disequilibrium, it is not possible to determine if VKORC1 or one of the three other genes is the target of this strong positive selection that could explain present-day differences among human populations in AVK dose requirement. Our results show that the extended region surrounding a presumable single target of positive selection should be analyzed for genetic variation in a wide range of genetically diverse populations in order to account for other neighboring and confounding selective events and the hitchhiking effect.

  2. Engineered chromosome-based genetic mapping establishes a 3.7 Mb critical genomic region for Down syndrome-associated heart defects in mice.

    Science.gov (United States)

    Liu, Chunhong; Morishima, Masae; Jiang, Xiaoling; Yu, Tao; Meng, Kai; Ray, Debjit; Pao, Annie; Ye, Ping; Parmacek, Michael S; Yu, Y Eugene

    2014-06-01

    Trisomy 21 (Down syndrome, DS) is the most common human genetic anomaly associated with heart defects. Based on evolutionary conservation, DS-associated heart defects have been modeled in mice. By generating and analyzing mouse mutants carrying different genomic rearrangements in human chromosome 21 (Hsa21) syntenic regions, we found the triplication of the Tiam1-Kcnj6 region on mouse chromosome 16 (Mmu16) resulted in DS-related cardiovascular abnormalities. In this study, we developed two tandem duplications spanning the Tiam1-Kcnj6 genomic region on Mmu16 using recombinase-mediated genome engineering, Dp(16)3Yey and Dp(16)4Yey, spanning the 2.1 Mb Tiam1-Il10rb and 3.7 Mb Ifnar1-Kcnj6 regions, respectively. We found that Dp(16)4Yey/+, but not Dp(16)3Yey/+, led to heart defects, suggesting the triplication of the Ifnar1-Kcnj6 region is sufficient to cause DS-associated heart defects. Our transcriptional analysis of Dp(16)4Yey/+ embryos showed that the Hsa21 gene orthologs located within the duplicated interval were expressed at the elevated levels, reflecting the consequences of the gene dosage alterations. Therefore, we have identified a 3.7 Mb genomic region, the smallest critical genomic region, for DS-associated heart defects, and our results should set the stage for the final step to establish the identities of the causal gene(s), whose elevated expression(s) directly underlie this major DS phenotype.

  3. Genomic analysis of head and neck cancer cases from two high incidence regions.

    Directory of Open Access Journals (Sweden)

    Sandra Perdomo

    We investigated how somatic changes in HNSCC interact with environmental and host risk factors and whether they influence the risk of HNSCC occurrence and outcome. 180 paired samples diagnosed as HNSCC in two high incidence regions of Europe and South America underwent targeted sequencing (14 genes) and evaluation of copy number alterations (SCNAs). TP53, PIK3CA, NOTCH1, TP63 and CDKN2A were the most frequently mutated genes. Cases were characterized by a low copy number burden with recurrent focal amplification in 11q13.3 and deletion in 15q22. Cases with low SCNAs showed an improved overall survival. We found significant correlations with decreased overall survival between focal amplified regions 4p16, 10q22 and 22q11, and losses in 12p12, 15q14 and 15q22. The mutational landscape in our cases showed an association to both environmental exposures and clinical characteristics. We confirmed that somatic copy number alterations are an important predictor of HNSCC overall survival.

  4. [Cloning and sequence analysis of the DHBV genome of the brown ducks in Guilin region and establishment of the quantitative method for detecting DHBV].

    Science.gov (United States)

    Su, He-Ling; Huang, Ri-Dong; He, Song-Qing; Xu, Qing; Zhu, Hua; Mo, Zhi-Jing; Liu, Qing-Bo; Liu, Yong-Ming

    2013-03-01

    Brown ducks carrying DHBV have been widely used as an animal model of hepatitis B in research on the activity and toxicity of anti-HBV drugs. Studies have shown that the ratio of DHBV carriers among the brown ducks in the Guilin region is relatively high. Nevertheless, the characteristics of the DHBV genome of the Guilin brown duck remain unknown. Here we report the cloning of the genome of Guilin brown duck DHBV and the sequence analysis of the genome. The full length of the DHBV genome of the Guilin brown duck was 3,027 bp. Analysis using ORF finder found that there was an ORF for an unknown peptide other than S-ORF, P-ORF and C-ORF in the genome of the DHBV. Vector NTI 8.0 analysis revealed that the unknown peptide contained a motif which bound to HLA*0201. Alignment with DHBV sequences from different countries and regions indicated that there were no obvious differences in regional distribution among the sequences. A fluorescence quantitative PCR for detecting DHBV was established based on the constructed recombinant plasmid pGEM-DHBV-S. This study laid the groundwork for using the Guilin brown duck as a hepatitis B animal model.

  5. Mutation discovery in regions of segmental cancer genome amplifications with CoNAn-SNV: a mixture model for next generation sequencing of tumors.

    Science.gov (United States)

    Crisan, Anamaria; Goya, Rodrigo; Ha, Gavin; Ding, Jiarui; Prentice, Leah M; Oloumi, Arusha; Senz, Janine; Zeng, Thomas; Tse, Kane; Delaney, Allen; Marra, Marco A; Huntsman, David G; Hirst, Martin; Aparicio, Sam; Shah, Sohrab

    2012-01-01

    Next generation sequencing has now enabled a cost-effective enumeration of the full mutational complement of a tumor genome-in particular single nucleotide variants (SNVs). Most current computational and statistical models for analyzing next generation sequencing data, however, do not account for cancer-specific biological properties, including somatic segmental copy number alterations (CNAs)-which require special treatment of the data. Here we present CoNAn-SNV (Copy Number Annotated SNV): a novel algorithm for the inference of single nucleotide variants (SNVs) that overlap copy number alterations. The method is based on modelling the notion that genomic regions of segmental duplication and amplification induce an extended genotype space where a subset of genotypes will exhibit heavily skewed allelic distributions in SNVs (and therefore render them undetectable by methods that assume diploidy). We introduce the concept of modelling allelic counts from sequencing data using a panel of Binomial mixture models where the number of mixtures for a given locus in the genome is informed by a discrete copy number state given as input. We applied CoNAn-SNV to a previously published whole genome shotgun data set obtained from a lobular breast cancer and show that it is able to discover 21 experimentally revalidated somatic non-synonymous mutations in a lobular breast cancer genome that were not detected using copy number insensitive SNV detection algorithms. Importantly, ROC analysis shows that the increased sensitivity of CoNAn-SNV does not result in disproportionate loss of specificity. This was also supported by analysis of a recently published lymphoma genome with a relatively quiescent karyotype, where CoNAn-SNV showed similar results to other callers except in regions of copy number gain where increased sensitivity was conferred. 
Our results indicate that in genomically unstable tumors, copy number annotation for SNV detection will be critical to fully characterize the
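
The idea of a Binomial mixture whose number of components follows the copy number state can be sketched as below. This is only an illustration, not the CoNAn-SNV implementation: the function name, the fixed error rate, and the uniform prior over genotypes are all assumptions made for the example.

```python
from math import comb

def genotype_posteriors(ref_count, depth, copy_number, error_rate=0.01):
    """Posterior over genotypes at one locus under a binomial mixture.

    A genotype is the number k of variant alleles among `copy_number`
    copies (copy_number >= 1), giving copy_number + 1 mixture components.
    The error rate and the uniform prior are illustrative assumptions.
    """
    alt_count = depth - ref_count
    n_states = copy_number + 1
    weights = []
    for k in range(n_states):
        # Expected variant-allele fraction, kept away from 0 and 1
        # so that sequencing errors do not give zero likelihood.
        p_alt = min(max(k / copy_number, error_rate), 1.0 - error_rate)
        likelihood = (comb(depth, alt_count)
                      * p_alt ** alt_count
                      * (1.0 - p_alt) ** ref_count)
        weights.append(likelihood / n_states)  # uniform prior (assumption)
    total = sum(weights)
    return [w / total for w in weights]

# 3 variant reads out of 40: compare a diploid model with a 5-copy model.
diploid = genotype_posteriors(ref_count=37, depth=40, copy_number=2)
amplified = genotype_posteriors(ref_count=37, depth=40, copy_number=5)
```

With these made-up counts, the diploid model puts most of its weight on the reference genotype, while the five-copy model favours a single variant copy. That is the kind of skewed allelic ratio the abstract says diploid callers miss.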

  6. NEBNext Direct: A Novel, Rapid, Hybridization-Based Approach for the Capture and Library Conversion of Genomic Regions of Interest.

    Science.gov (United States)

    Emerman, Amy B; Bowman, Sarah K; Barry, Andrew; Henig, Noa; Patel, Kruti M; Gardner, Andrew F; Hendrickson, Cynthia L

    2017-07-05

    Next-generation sequencing (NGS) is a powerful tool for genomic studies, translational research, and clinical diagnostics that enables the detection of single nucleotide polymorphisms, insertions and deletions, copy number variations, and other genetic variations. Target enrichment technologies improve the efficiency of NGS by only sequencing regions of interest, which reduces sequencing costs while increasing coverage of the selected targets. Here we present NEBNext Direct®, a hybridization-based, target-enrichment approach that addresses many of the shortcomings of traditional target-enrichment methods. This approach features a simple, 7-hr workflow that uses enzymatic removal of off-target sequences to achieve a high specificity for regions of interest. Additionally, unique molecular identifiers are incorporated for the identification and filtering of PCR duplicates. The same protocol can be used across a wide range of input amounts, input types, and panel sizes, enabling NEBNext Direct to be broadly applicable across a wide variety of research and diagnostic needs. © 2017 by John Wiley & Sons, Inc.
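
Filtering PCR duplicates with unique molecular identifiers (UMIs) can be illustrated with a small sketch. It is not the NEBNext Direct pipeline: the read representation (dicts with chrom, start, strand, umi and mapq keys) is hypothetical, and UMIs are matched exactly, with no tolerance for sequencing errors in the UMI itself.

```python
def filter_pcr_duplicates(reads):
    """Keep one read per (chrom, start, strand, umi) group.

    `reads` is an iterable of dicts with the hypothetical keys
    chrom, start, strand, umi and mapq. Reads sharing position,
    strand and UMI are treated as PCR copies of one molecule; the
    copy with the highest mapping quality is retained.
    """
    best = {}
    for read in reads:
        key = (read["chrom"], read["start"], read["strand"], read["umi"])
        if key not in best or read["mapq"] > best[key]["mapq"]:
            best[key] = read
    return list(best.values())

reads = [
    {"chrom": "chr1", "start": 100, "strand": "+", "umi": "ACGT", "mapq": 30},
    {"chrom": "chr1", "start": 100, "strand": "+", "umi": "ACGT", "mapq": 60},
    {"chrom": "chr1", "start": 100, "strand": "+", "umi": "TTGA", "mapq": 40},
]
unique_reads = filter_pcr_duplicates(reads)
```

The first two reads share a position and a UMI, so only one of them is kept. The third has a different UMI at the same position and is kept as an independent molecule.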

  7. Two distinct genomic regions, harbouring the period and fruitless genes, affect male courtship song in Drosophila montana.

    Science.gov (United States)

    Lagisz, M; Wen, S-Y; Routtu, J; Klappert, K; Mazzi, D; Morales-Hojas, R; Schäfer, M A; Vieira, J; Hoikkala, A; Ritchie, M G; Butlin, R K

    2012-06-01

    Acoustic signals often have a significant role in pair formation and in species recognition. Determining the genetic basis of signal divergence will help to understand signal evolution by sexual selection and its role in the speciation process. An earlier study investigated quantitative trait loci for male courtship song carrier frequency (FRE) in Drosophila montana using microsatellite markers. We refined this study by adding to the linkage map markers for 10 candidate genes known to affect song production in Drosophila melanogaster. We also extended the analyses to additional song characters (pulse train length (PTL), pulse number (PN), interpulse interval, pulse length (PL) and cycle number (CN)). Our results indicate that loci in two different regions of the genome control distinct features of the courtship song. Pulse train traits (PTL and PN) mapped to the X chromosome, showing significant linkage with the period gene. In contrast, characters related to song pulse properties (PL, CN and carrier FRE) mapped to the region of chromosome 2 near the candidate gene fruitless, identifying these genes as suitable loci for further investigations. In previous studies, the pulse train traits have been found to vary substantially between Drosophila species, and so are potential species recognition signals, while the pulse traits may be more important in intra-specific mate choice.

  8. A Legionella pneumophila effector protein encoded in a region of genomic plasticity binds to Dot/Icm-modified vacuoles.

    Directory of Open Access Journals (Sweden)

    Shira Ninio

    2009-01-01

    Full Text Available Legionella pneumophila is an opportunistic pathogen that can cause a severe pneumonia called Legionnaires' disease. In the environment, L. pneumophila is found in fresh water reservoirs in a large spectrum of environmental conditions, where the bacteria are able to replicate within a variety of protozoan hosts. To survive within eukaryotic cells, L. pneumophila require a type IV secretion system, designated Dot/Icm, that delivers bacterial effector proteins into the host cell cytoplasm. In recent years, a number of Dot/Icm substrate proteins have been identified; however, the function of most of these proteins remains unknown, and it is unclear why the bacterium maintains such a large repertoire of effectors to promote its survival. Here we investigate a region of the L. pneumophila chromosome that displays a high degree of plasticity among four sequenced L. pneumophila strains. Analysis of GC content suggests that several genes encoded in this region were acquired through horizontal gene transfer. Protein translocation studies establish that this region of genomic plasticity encodes for multiple Dot/Icm effectors. Ectopic expression studies in mammalian cells indicate that one of these substrates, a protein called PieA, has unique effector activities. PieA is an effector that can alter lysosome morphology and associates specifically with vacuoles that support L. pneumophila replication. It was determined that the association of PieA with vacuoles containing L. pneumophila requires modifications to the vacuole mediated by other Dot/Icm effectors. Thus, the localization properties of PieA reveal that the Dot/Icm system has the ability to spatially and temporally control the association of an effector with vacuoles containing L. pneumophila through activities mediated by other effector proteins.

  9. ChIP-seq Analysis in R (CSAR): An R package for the statistical detection of protein-bound genomic regions

    NARCIS (Netherlands)

    Muino, J.M.; Kaufmann, K.; Ham, van R.C.H.J.; Angenent, G.C.; Krajewski, P.

    2011-01-01

    Background In vivo detection of protein-bound genomic regions can be achieved by combining chromatin-immunoprecipitation with next-generation sequencing technology (ChIP-seq). The large amount of sequence data produced by this method needs to be analyzed in a statistically proper and computationally

  10. Pan-Genome Analysis of Human Gastric Pathogen H. pylori: Comparative Genomics and Pathogenomics Approaches to Identify Regions Associated with Pathogenicity and Prediction of Potential Core Therapeutic Targets

    DEFF Research Database (Denmark)

    Ali, Amjad; Naz, Anam; Soares, Siomar C.

    2015-01-01

    Helicobacter pylori is a human gastric pathogen implicated as the major cause of peptic ulcer and second leading cause of gastric cancer (~70%) around the world. Conversely, an increased resistance to antibiotics and hindrances in the development of vaccines against H. pylori are observed......-genome approach; the predicted conserved gene families (1,193) constitute ~77% of the average H. pylori genome and 45% of the global gene repertoire of the species. Reverse vaccinology strategies have been adopted to identify and narrow down the potential core-immunogenic candidates. Total of 28 nonhost...... homolog proteins were characterized as universal therapeutic targets against H. pylori based on their functional annotation and protein-protein interaction. Finally, pathogenomics and genome plasticity analysis revealed 3 highly conserved and 2 highly variable putative pathogenicity islands in all...

  11. Genome-wide signatures of flowering adaptation to climate temperature: regional analyses in a highly diverse native range of Arabidopsis thaliana.

    Science.gov (United States)

    Tabas-Madrid, Daniel; Méndez-Vigo, Belén; Arteaga, Noelia; Marcer, Arnald; Pascual-Montano, Alberto; Weigel, Detlef; Xavier Picó, F; Alonso-Blanco, Carlos

    2018-03-08

    Current global change is fueling an interest to understand the genetic and molecular mechanisms of plant adaptation to climate. In particular, altered flowering time is a common strategy for escape from unfavorable climate temperature. In order to determine the genomic bases underlying flowering time adaptation to this climatic factor, we have systematically analysed a collection of 174 highly diverse A. thaliana accessions from the Iberian Peninsula. Analyses of 1.88 million SNPs provide evidence for a spatially heterogeneous contribution of demographic and adaptive processes to geographic patterns of genetic variation. Mountains appear to be allele dispersal barriers, whereas the relationship between flowering time and temperature depended on the precise temperature range. Environmental genome-wide associations (EGWA) supported an overall genome adaptation to temperature, with 9.4% of the genes showing significant associations. Furthermore, phenotypic genome-wide associations (PGWA) provided a catalogue of candidate genes underlying flowering time variation. Finally, comparison of EGWA and PGWA genomic regions identified known (TSF, FRL1 and CKB1) and new (ESM1 and VDAC5) genes as candidates for adaptation to climate temperature by altered flowering time. Thus, this regional collection provides an excellent resource to address the spatial complexity of climate adaptation in annual plants.

  12. In silico comparison of genomic regions containing genes coding for enzymes and transcription factors for the phenylpropanoid pathway in Phaseolus vulgaris L. and Glycine max L. Merr

    Directory of Open Access Journals (Sweden)

    Yarmilla Reinprecht

    2013-09-01

    Full Text Available Legumes contain a variety of phytochemicals derived from the phenylpropanoid pathway that have important effects on human health as well as seed coat color, plant disease resistance and nodulation. However, the information about the genes involved in this important pathway is fragmentary in common bean (Phaseolus vulgaris L.). The objectives of this research were to isolate genes that function in and control the phenylpropanoid pathway in common bean, determine their genomic locations in silico in common bean and soybean, and analyze sequences of the 4CL gene family in two common bean genotypes. Sequences of phenylpropanoid pathway genes available for common bean or other plant species were aligned, and the conserved regions were used to design sequence-specific primers. The PCR products were cloned and sequenced and the gene sequences along with common bean gene-based (g) markers were BLASTed against the Glycine max v.1.0 genome and the P. vulgaris v.1.0 (Andean) early release genome. In addition, gene sequences were BLASTed against the OAC Rex (Mesoamerican) genome sequence assembly. In total, fragments of 46 structural and regulatory phenylpropanoid pathway genes were characterized in this way and placed in silico on common bean and soybean sequence maps. The maps contain over 250 common bean g and SSR (simple sequence repeat) markers and identify the positions of more than 60 additional phenylpropanoid pathway gene sequences, plus the putative locations of seed coat color genes. The majority of cloned phenylpropanoid pathway gene sequences were mapped to one location in the common bean genome but had two positions in soybean. The comparison of the genomic maps confirmed previous studies, which show that common bean and soybean share genomic regions, including those containing phenylpropanoid pathway gene sequences, with conserved synteny. 
Indels identified in the comparison of Andean and Mesoamerican common bean sequences might be used to develop

  13. A systems biology approach to identify intelligence quotient score-related genomic regions, and pathways relevant to potential therapeutic treatments

    Science.gov (United States)

    Zhao, Min; Kong, Lei; Qu, Hong

    2014-01-01

    Although the intelligence quotient (IQ) is the most popular intelligence test in the world, little is known about the underlying biological mechanisms that lead to the differences among humans. To improve our understanding of cognitive processes and identify potential biomarkers, we conducted a comprehensive investigation of 158 IQ-related genes selected from the literature. A genomic distribution analysis demonstrated that IQ-related genes were enriched in seven regions of chromosome 7 and the X chromosome. In addition, these genes were enriched in target lists of seven transcription factors and sixteen microRNAs. Using a network-based approach, we further reconstructed an IQ-related pathway from known human pathway interaction data. Based on this reconstructed pathway, we incorporated enriched drugs and described the importance of dopamine and norepinephrine systems in IQ-related biological processes. These findings not only reveal several testable genes and processes related to IQ scores, but also have potential therapeutic implications for IQ-related mental disorders. PMID:24566931
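
Enrichment of a gene set in a genomic region is commonly assessed with a one-sided hypergeometric test. The abstract does not say which test the authors used, so the sketch below is an assumption. The function name is hypothetical, and apart from the 158-gene set size, the numbers in the usage line are invented for illustration.

```python
from math import comb

def hypergeom_enrichment_p(total_genes, region_genes, set_size, hits):
    """One-sided hypergeometric p-value P(X >= hits).

    total_genes  : genes in the background genome
    region_genes : background genes located in the region of interest
    set_size     : size of the gene set being tested
    hits         : gene-set members that fall inside the region
    """
    upper = min(region_genes, set_size)
    tail = sum(
        comb(region_genes, k) * comb(total_genes - region_genes, set_size - k)
        for k in range(hits, upper + 1)
    )
    return tail / comb(total_genes, set_size)

# Hypothetical background: 20,000 genes, 120 of them in the region,
# and 6 of the 158 set genes observed inside that region.
p_value = hypergeom_enrichment_p(20000, 120, 158, 6)
```

A small p-value would suggest that more set genes fall in the region than expected by chance. In a scan over many regions, the p-values would also need a multiple-testing correction.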

  14. Domestication of olive fly through a multi-regional host shift to cultivated olives: comparative dating using complete mitochondrial genomes.

    Science.gov (United States)

    Nardi, F; Carapelli, A; Boore, J L; Roderick, G K; Dallai, R; Frati, F

    2010-11-01

    The evolutionary history of the olive fly, Bactrocera oleae, was reconstructed in a phylogenetic and coalescent framework using full mitochondrial genome data from 21 individuals covering the entire worldwide distribution of the species. Special attention was given to reconstructing the timing of the processes under study. The early subdivision of the olive fly reflects the Quaternary differentiation between Olea europea subsp. europea in the Mediterranean area and the two lineages of Olea europea subsp. cuspidata in Africa and Asia, pointing to an early and close association between the olive fly and its host. The geographic structure and timing of olive fly differentiation in the Mediterranean indicates a clear connection with the post-glacial recolonization of wild olives in the area, and is irreconcilable with the early historical process of domestication and spread of the cultivated olive from its Levantine origin. Therefore, we suggest an early co-history of the olive fly with its wild host during the Quaternary and post-glacial periods and a multi-regional shift of olive flies to cultivated olives as these cultivars gradually replaced wild olives in historical times. Copyright © 2010 Elsevier Inc. All rights reserved.

  15. Comparative genomics reveals a functional thyroid-specific element in the far upstream region of the PAX8 gene

    Directory of Open Access Journals (Sweden)

    De Felice Mario

    2010-05-01

    Full Text Available Abstract Background The molecular mechanisms leading to a fully differentiated thyrocyte are still the object of intense study, even though it is well known that thyroglobulin, thyroperoxidase, NIS and TSHr are the marker genes of thyroid differentiation. It is also well known that Pax8, TTF-1, Foxe1 and Hhex are the thyroid-enriched transcription factors responsible for the expression of the above genes, and thus for the differentiated thyroid phenotype. In particular, the role of Pax8 in the fully developed thyroid gland was studied in depth and it was established that it plays a key role in thyroid development and differentiation. However, to date the bases for the thyroid-enriched expression of this transcription factor have not been unraveled. Here, we report the identification and characterization of a functional thyroid-specific enhancer element located far upstream of the Pax8 gene. Results We hypothesized that regulatory cis-acting elements are conserved among mammalian genes. Comparison of a genomic region extending for about 100 kb at the 5'-flanking region of the mouse and human Pax8 gene revealed several conserved regions that were tested for enhancer activity in thyroid and non-thyroid cells. Using this approach we identified one putative thyroid-specific regulatory element located 84.6 kb upstream of the Pax8 transcription start site. The in silico data were verified by promoter-reporter assays in thyroid and non-thyroid cells. Interestingly, the identified far upstream element manifested a very high transcriptional activity in the thyroid cell line PC Cl3, but showed no activity in HeLa cells. In addition, the data reported here indicate that the thyroid-enriched transcription factor TTF-1 is able to bind the Pax8 far upstream element in vitro and in vivo, and is capable of activating transcription from it. Conclusions Results of this study reveal the presence of a thyroid-specific regulatory element in the 5' upstream

  16. In situ genomic DNA extraction for PCR analysis of regions of interest in four plant species and one filamentous fungi

    OpenAIRE

    Luis E. Rojas; Maritza Reyes; Naivy Pérez-Alonso; María I. Olóriz; Laisyn Posada-Pérez; Bárbara Ocaña; Orelvis Portal; Borys Chong-Pérez; Jorge L. Pérez Pérez

    2014-01-01

    The extraction methods of genomic DNA are usually laborious and hazardous to human health and the environment by the use of organic solvents (chloroform and phenol). In this work a protocol for in situ extraction of genomic DNA by alkaline lysis is validated. It was used in order to amplify regions of DNA in four species of plants and fungi by polymerase chain reaction (PCR). From plant material of Saccharum officinarum L., Carica papaya L. and Digitalis purpurea L. it was possible to extend ...

  17. Genome-wide association study for regions of systemic sclerosis susceptibility in a Choctaw Indian population with high disease prevalence.

    Science.gov (United States)

    Zhou, Xiaodong; Tan, Filemon K; Wang, Ning; Xiong, Momiao; Maghidman, Samuel; Reveille, John D; Milewicz, Dianna M; Chakraborty, Ranajit; Arnett, Frank C

    2003-09-01

    Systemic sclerosis (SSc) is a complex, multisystem connective tissue disease in which genetic factors contribute to disease susceptibility. The aim of this study was to localize chromosome regions associated with susceptibility to SSc in a relatively isolated and homogeneous population of Choctaw Indians with a high prevalence of SSc. A genome-wide microsatellite screen at 10 cM resolution (400 markers) was performed in 20 Choctaw patients with SSc and 76 ethnically matched controls. Based on the results of the initial screen, fine-scale microsatellite mapping at TOPOI genes, respectively, confirming the results of our previous studies, which used different markers. D1S2800 and D14S63 have been reported to show linkage to systemic lupus erythematosus (SLE) in family-based studies, and D1S206, D6S422, and D6S264 are loci on 1p21.2, 6p22.3, and 6q23-27, respectively, which are in regions reported as showing linkage to SLE and other autoimmune diseases. Other markers showing unique associations with SSc were D7S510 (7p12-11), D7S661 (7q35), D8S514 (8q24.12), D19S221 (19p13.2), D19S220 (19q13.2), D22S423 (22q13.1), DXS1068 (Xp11.4), and DXS8055 (Xq21-23). Further analysis with fine-scale microsatellite mapping revealed at least 14 potential haplotypes associated with SSc. Our findings indicate that a number of genetic loci may contribute to the high prevalence of SSc in the Choctaw and are consistent with the paradigm that some autoimmune rheumatic diseases are likely to share genetic determinants.

  18. Comparative genome analysis of ciprofloxacin-resistant Pseudomonas aeruginosa reveals genes within newly identified high variability regions associated with drug resistance development.

    Science.gov (United States)

    Su, Hsun-Cheng; Khatun, Jainab; Kanavy, Dona M; Giddings, Morgan C

    2013-12-01

    The alarming rise of ciprofloxacin-resistant Pseudomonas aeruginosa has been reported in several clinical studies. Though the mutation of resistance genes and their role in drug resistance has been researched, the process by which the bacterium acquires high-level resistance is still not well understood. How does the genomic evolution of P. aeruginosa affect resistance development? Could the exposure of the bacteria to antibiotics enrich genomic variants that lead to the development of resistance, and if so, how are these variants distributed through the genome? To answer these questions, we performed 454 pyrosequencing and a whole genome analysis both before and after exposure to ciprofloxacin. The comparative sequence data revealed 93 unique resistance strain variation sites, which included a mutation in the DNA gyrase subunit A gene. We generated variation-distribution maps comparing the wild and resistant types, and isolated 19 candidates from three discrete resistance-associated high variability regions that had available transposon mutants, to perform a ciprofloxacin exposure assay. Of these region candidates with transposon disruptions, 79% (15/19) showed a reduction in the ability to gain high-level resistance, suggesting that genes within these high variability regions might enrich for certain functions associated with resistance development.

  19. Frequent Loss of Genome Gap Region in 4p16.3 Subtelomere in Early-Onset Type 2 Diabetes Mellitus

    Directory of Open Access Journals (Sweden)

    Hirohito Kudo

    2011-01-01

    Full Text Available A small portion of Type 2 diabetes mellitus (T2DM) is familial, but the majority occurs as sporadic disease. Although causative genes are found in some rare forms, the genetic basis for sporadic T2DM is largely unknown. We searched for a copy number abnormality in 100 early-onset Japanese T2DM patients (onset age <35 years) by whole-genome screening with a copy number variation BeadChip. Within the 1.3-Mb subtelomeric region on chromosome 4p16.3, we found copy number losses in early-onset T2DM (13 of 100 T2DM versus one of 100 controls). This region surrounds a genome gap, which is rich in multiple low copy repeats. Subsequent region-targeted high-density custom-made oligonucleotide microarray experiments verified the copy number losses and delineated structural changes in the 1.3-Mb region. The results suggested that copy number losses of the genes in the deleted region around the genome gap in 4p16.3 may play significant roles in the etiology of T2DM.

  20. Conserved cis-regulatory regions in a large genomic landscape control SHH and BMP-regulated Gremlin1 expression in mouse limb buds

    Directory of Open Access Journals (Sweden)

    Zuniga Aimée

    2012-08-01

    Full Text Available Abstract Background Mouse limb bud is a prime model to study the regulatory interactions that control vertebrate organogenesis. Major aspects of limb bud development are controlled by feedback loops that define a self-regulatory signalling system. The SHH/GREM1/AER-FGF feedback loop forms the core of this signalling system that operates between the posterior mesenchymal organiser and the ectodermal signalling centre. The BMP antagonist Gremlin1 (GREM1) is a critical node in this system, whose dynamic expression is controlled by BMP, SHH, and FGF signalling and key to normal progression of limb bud development. Previous analysis identified a distant cis-regulatory landscape within the neighbouring Formin1 (Fmn1) locus that is required for Grem1 expression, reminiscent of the genomic landscapes controlling HoxD and Shh expression in limb buds. Results Three highly conserved regions (HMCO1-3) were identified within the previously defined critical genomic region and tested for their ability to regulate Grem1 expression in mouse limb buds. Using a combination of BAC and conventional transgenic approaches, a 9 kb region located ~70 kb downstream of the Grem1 transcription unit was identified. This region, termed Grem1 Regulatory Sequence 1 (GRS1), is able to recapitulate major aspects of Grem1 expression, as it drives expression of a LacZ reporter into the posterior and, to a lesser extent, in the distal-anterior mesenchyme. Crossing the GRS1 transgene into embryos with alterations in the SHH and BMP pathways established that GRS1 depends on SHH and is modulated by BMP signalling, i.e. integrates inputs from these pathways. Chromatin immunoprecipitation revealed interaction of endogenous GLI3 proteins with the core cis-regulatory elements in the GRS1 region. As GLI3 is a mediator of SHH signal transduction, these results indicated that SHH directly controls Grem1 expression through the GRS1 region. Finally, all cis-regulatory regions within the Grem1

  1. The evolutionary rates of HCV estimated with subtype 1a and 1b sequences over the ORF length and in different genomic regions.

    Directory of Open Access Journals (Sweden)

    Manqiong Yuan

    Full Text Available Considerable progress has been made in the HCV evolutionary analysis, since the software BEAST was released. However, prior information, especially the prior evolutionary rate, which plays a critical role in BEAST analysis, is always difficult to ascertain due to various uncertainties. Providing a proper prior HCV evolutionary rate is thus of great importance. 176 full-length sequences of HCV subtype 1a and 144 of 1b were assembled by taking into consideration the balance of the sampling dates and the even dispersion in phylogenetic trees. According to the HCV genomic organization and biological functions, each dataset was partitioned into nine genomic regions and two routinely amplified regions. A uniform prior rate was applied to the BEAST analysis for each region and also the entire ORF. All the obtained posterior rates for 1a are of a magnitude of 10^-3 substitutions/site/year and in a bell-shaped distribution. Significantly lower rates were estimated for 1b and some of the rate distribution curves resulted in a one-sided truncation, particularly under the exponential model. This indicates that some of the rates for subtype 1b are less accurate, so they were adjusted by including more sequences to improve the temporal structure. Among the various HCV subtypes and genomic regions, the evolutionary patterns are dissimilar. Therefore, an applied estimation of the HCV epidemic history requires the proper selection of the rate priors, which should match the actual dataset so that they can fit for the subtype, the genomic region and even the length. By referencing the findings here, future evolutionary analysis of the HCV subtype 1a and 1b datasets may become more accurate and hence prove useful for tracing their patterns.

  2. Genomic androgen receptor-occupied regions with different functions, defined by histone acetylation, coregulators and transcriptional capacity.

    Directory of Open Access Journals (Sweden)

    Li Jia

    Full Text Available The androgen receptor (AR) is a steroid-activated transcription factor that binds at specific DNA locations and plays a key role in the etiology of prostate cancer. While numerous studies have identified a clear connection between AR binding and expression of target genes for a limited number of loci, high-throughput elucidation of these sites allows for a deeper understanding of the complexities of this process. We have mapped 189 AR occupied regions (ARORs) and 1,388 histone H3 acetylation (AcH3) loci to a 3% continuous stretch of human genomic DNA using chromatin immunoprecipitation (ChIP) microarray analysis. Of 62 highly reproducible ARORs, 32 (52%) were also marked by AcH3. While the number of ARORs detected in prostate cancer cells exceeded the number of nearby DHT-responsive genes, the AcH3 mark defined a subclass of ARORs much more highly associated with such genes: 12% of the genes flanking AcH3+ARORs were DHT-responsive, compared to only 1% of genes flanking AcH3-ARORs. Most ARORs contained enhancer activities as detected in luciferase reporter assays. Analysis of the AROR sequences, followed by site-directed ChIP, identified binding sites for AR transcriptional coregulators FoxA1, CEBPbeta, NFI and GATA2, which had diverse effects on endogenous AR target gene expression levels in siRNA knockout experiments. We suggest that only some ARORs function under the given physiological conditions, utilizing diverse mechanisms. This diversity points to differential regulation of gene expression by the same transcription factor related to the chromatin structure.

  3. Recombination and evolution of duplicate control regions in the mitochondrial genome of the Asian big-headed turtle, Platysternon megacephalum.

    Directory of Open Access Journals (Sweden)

    Chenfei Zheng

    Full Text Available Complete mitochondrial (mt) genome sequences with duplicate control regions (CRs) have been detected in various animal species. In Testudines, duplicate mtCRs have been reported in the mtDNA of the Asian big-headed turtle, Platysternon megacephalum, which has three living subspecies. However, the evolutionary pattern of these CRs remains unclear. In this study, we report the completed sequences of duplicate CRs from 20 individuals belonging to three subspecies of this turtle and discuss the micro-evolutionary analysis of the evolution of duplicate CRs. Genetic distances calculated with MEGA 4.1 using the complete duplicate CR sequences revealed that within turtle subspecies, genetic distances between orthologous copies from different individuals were 0.63% for CR1 and 1.2% for CR2, respectively, and the average distance between paralogous copies of CR1 and CR2 was 4.8%. Phylogenetic relationships were reconstructed from the CR sequences, excluding the variable number of tandem repeats (VNTRs) at the 3' end, using three methods: neighbor-joining, maximum likelihood algorithm, and Bayesian inference. These data show that any two CRs within individuals were more genetically distant from orthologous genes in different individuals within the same subspecies. This suggests independent evolution of the two mtCRs within each P. megacephalum subspecies. Reconstruction of separate phylogenetic trees using different CR components (TAS, CD, CSB, and VNTRs) suggested the role of recombination in the evolution of duplicate CRs. Consequently, recombination events were detected using RDP software with break points at ≈290 bp and ≈1,080 bp. Based on these results, we hypothesize that duplicate CRs in P. megacephalum originated from heterological ancestral recombination of mtDNA. Subsequent recombination could have resulted in homogenization during independent evolutionary events, thus maintaining the functions of duplicate CRs in the mtDNA of P

  4. A Novel Phytophthora sojae Resistance Rps12 Gene Mapped to a Genomic Region That Contains Several Rps Genes.

    Science.gov (United States)

    Sahoo, Dipak K; Abeysekara, Nilwala S; Cianzio, Silvia R; Robertson, Alison E; Bhattacharyya, Madan K

    2017-01-01

    Phytophthora sojae Kaufmann and Gerdemann, which causes Phytophthora root rot, is a widespread pathogen that limits soybean production worldwide. Development of Phytophthora resistant cultivars carrying Phytophthora resistance Rps genes is a cost-effective approach in controlling this disease. For this mapping study of a novel Rps gene, 290 recombinant inbred lines (RILs) (F7 families) were developed by crossing the P. sojae resistant cultivar PI399036 with the P. sojae susceptible AR2 line, and were phenotyped for responses to a mixture of three P. sojae isolates that overcome most of the known Rps genes. Of these 290 RILs, 130 were homozygous resistant, 12 heterozygous and segregating for Phytophthora resistance, and 148 were recessive homozygous and susceptible. From this population, 59 RILs homozygous for Phytophthora sojae resistance and 61 susceptible to a mixture of P. sojae isolates R17 and Val12-11 or P7074 that overcome resistance encoded by known Rps genes mapped to Chromosome 18 were selected for mapping the novel Rps gene. A single gene accounted for the 1:1 segregation of resistance and susceptibility among the RILs. The gene encoding the Phytophthora resistance mapped to a 5.8 cM interval between the SSR markers BARCSOYSSR_18_1840 and Sat_064 located in the lower arm of Chromosome 18. The gene is mapped 2.2 cM proximal to the NBSRps4/6-like sequence that was reported to co-segregate with the Phytophthora resistance genes Rps4 and Rps6. The gene is mapped to a highly recombinogenic, gene-rich genomic region carrying several nucleotide binding site-leucine rich repeat (NBS-LRR)-like genes. We named this novel gene Rps12, which is expected to be an invaluable resource in breeding soybeans for Phytophthora resistance.
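
The 1:1 segregation reported above can be checked with a chi-square goodness-of-fit test on the homozygous classes (130 resistant, 148 susceptible). The abstract does not state which test the authors ran, so this is only a sketch. Setting aside the 12 segregating lines is an assumption, and the function name is hypothetical.

```python
from math import erfc, sqrt

def chi_square_one_to_one(count_a, count_b):
    """Chi-square goodness-of-fit test against a 1:1 ratio (1 d.f.).

    Returns (statistic, p_value). For one degree of freedom the
    upper-tail probability of chi-square equals erfc(sqrt(x / 2)).
    """
    expected = (count_a + count_b) / 2.0
    statistic = ((count_a - expected) ** 2 + (count_b - expected) ** 2) / expected
    p_value = erfc(sqrt(statistic / 2.0))
    return statistic, p_value

# Homozygous resistant vs. susceptible RILs from the abstract;
# the 12 segregating lines are left out (assumption).
statistic, p_value = chi_square_one_to_one(130, 148)
```

The statistic comes to about 1.17 with a p-value well above 0.05, so the observed counts do not depart significantly from the 1:1 ratio expected for a single gene in an RIL population.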

  5. A Novel Phytophthora sojae Resistance Rps12 Gene Mapped to a Genomic Region That Contains Several Rps Genes.

    Directory of Open Access Journals (Sweden)

    Dipak K Sahoo

    Full Text Available Phytophthora sojae Kaufmann and Gerdemann, which causes Phytophthora root rot, is a widespread pathogen that limits soybean production worldwide. Development of Phytophthora resistant cultivars carrying Phytophthora resistance Rps genes is a cost-effective approach in controlling this disease. For this mapping study of a novel Rps gene, 290 recombinant inbred lines (RILs) (F7 families) were developed by crossing the P. sojae resistant cultivar PI399036 with the P. sojae susceptible AR2 line, and were phenotyped for responses to a mixture of three P. sojae isolates that overcome most of the known Rps genes. Of these 290 RILs, 130 were homozygous resistant, 12 were heterozygous and segregating for Phytophthora resistance, and 148 were recessive homozygous and susceptible. From this population, 59 RILs homozygous for Phytophthora sojae resistance and 61 susceptible to a mixture of P. sojae isolates R17 and Val12-11 or P7074 that overcome resistance encoded by known Rps genes mapped to Chromosome 18 were selected for mapping the novel Rps gene. A single gene accounted for the 1:1 segregation of resistance and susceptibility among the RILs. The gene encoding the Phytophthora resistance mapped to a 5.8 cM interval between the SSR markers BARCSOYSSR_18_1840 and Sat_064 located in the lower arm of Chromosome 18. The gene is mapped 2.2 cM proximal to the NBSRps4/6-like sequence that was reported to co-segregate with the Phytophthora resistance genes Rps4 and Rps6. The gene is mapped to a highly recombinogenic, gene-rich genomic region carrying several nucleotide binding site-leucine rich repeat (NBS-LRR)-like genes. We named this novel gene Rps12, which is expected to be an invaluable resource in breeding soybeans for Phytophthora resistance.

  6. Specific regions of genome plasticity and genetic diversity of the commensal Escherichia coli A0 34/86

    Czech Academy of Sciences Publication Activity Database

    Hejnová, Jana; Pages, Delphine; Rusniok, Ch.; Glaser, P.; Šebo, Peter; Buchrieser, C.

    2006-01-01

    Vol. 296 (2006), pp. 541-546. ISSN 1438-4221. Institutional research plan: CEZ:AV0Z50200510. Keywords: Escherichia coli; commensal; genome comparison. Subject RIV: EE - Microbiology, Virology. Impact factor: 2.760 (2006)

  7. A microsatellite linkage map for the cultivated strawberry (Fragaria × ananassa) suggests extensive regions of homozygosity in the genome that may have resulted from breeding and selection.

    Science.gov (United States)

    Sargent, D J; Passey, T; Surbanovski, N; Lopez Girona, E; Kuchta, P; Davik, J; Harrison, R; Passey, A; Whitehouse, A B; Simpson, D W

    2012-05-01

    The linkage maps of the cultivated strawberry, Fragaria × ananassa (2n = 8x = 56) that have been reported to date have been developed predominantly from AFLPs, along with supplementation with transferrable microsatellite (SSR) markers. For the investigation of the inheritance of morphological characters in the cultivated strawberry and for the development of tools for marker-assisted breeding and selection, it is desirable to populate maps of the genome with an abundance of transferrable molecular markers such as microsatellites (SSRs) and gene-specific markers. Exploiting the recent release of the genome sequence of the diploid F. vesca, and the publication of an extensive number of polymorphic SSR markers for the genus Fragaria, we have extended the linkage map of the 'Redgauntlet' × 'Hapil' (RG × H) mapping population to include a further 330 loci, generated from 160 primer pairs, to create a linkage map for F. × ananassa containing 549 loci, 490 of which are transferrable SSR or gene-specific markers. The map covers 2140.3 cM in the expected 28 linkage groups for an integrated map (where one group is composed of two separate male and female maps), which represents an estimated 91% of the cultivated strawberry genome. Despite the relative saturation of the linkage map on the majority of linkage groups, regions of apparent extensive homozygosity were identified in the genomes of 'Redgauntlet' and 'Hapil' which may be indicative of allele fixation during the breeding and selection of modern F. × ananassa cultivars. The genomes of the octoploid and diploid Fragaria are largely collinear, but through comparison of mapped markers on the RG × H linkage map to their positions on the genome sequence of F. vesca, a number of inversions were identified that may have occurred before the polyploidisation event that led to the evolution of the modern octoploid strawberry species.
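
    The coverage figures in this abstract imply two further quantities that are easy to derive. The following sketch is only back-of-the-envelope arithmetic on the numbers quoted above (2140.3 cM, 91% coverage, 549 loci); the variable names are illustrative, and the mean spacing is a crude average that ignores co-located markers and the gaps between linkage groups.

```python
# Figures quoted in the abstract.
MAP_LENGTH_CM = 2140.3   # total length of the integrated linkage map
COVERAGE = 0.91          # estimated fraction of the genome covered
LOCI = 549               # number of mapped loci

# Genetic length of the whole genome implied by the coverage estimate.
estimated_total_cm = MAP_LENGTH_CM / COVERAGE

# Crude average distance between adjacent loci (assumes even spacing).
mean_spacing_cm = MAP_LENGTH_CM / LOCI

print(f"implied genome length: {estimated_total_cm:.0f} cM")
print(f"mean marker spacing:   {mean_spacing_cm:.1f} cM")
```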

  8. Female-to-male sex reversal associated with unique Xp21.2 deletion disrupting genomic regulatory architecture of the dosage-sensitive sex reversal region.

    Science.gov (United States)

    Dangle, Pankaj; Touzon, María Sol; Reyes-Múgica, Miguel; Witchel, Selma F; Rajkovic, Aleksandar; Schneck, Francis X; Yatsenko, Svetlana A

    2017-10-01

    The XX male disorder of sex development (DSD) is a rare condition that is most commonly associated with the presence of the SRY gene on one of the X chromosomes due to unequal crossing-over between sex chromosomes during spermatogenesis. However, in about 20% of the XX male individuals, SRY is missing, although these persons have at least some testis differentiation. The genetic basis of genital ambiguity and the mechanisms triggering testis development in such patients remain unknown. The proband with 46,XX SRY-negative testicular DSD was screened for point mutations by whole exome sequencing and CNVs using a high-resolution DSD gene-targeted and whole genome array comparative genomic hybridisation. The identified Xp21.2 genomic alteration was further characterised by direct sequencing of the breakpoint junctions and bioinformatics analysis. A unique, 80 kb microdeletion removing the regulatory sequences and the NR0B1 gene was detected by microarray analysis. This deletion disturbs the human-specific genomic architecture of the Xp21.2 dosage-sensitive sex (DSS) reversal region in the XX patient with male-appearing ambiguous genitalia and ovotestis. Duplication of the DSS region containing the MAGEB and NR0B1 genes has been implicated in testis repression and sex reversal. Identification of this microdeletion highlights the importance of genomic integrity in the regulation and interaction of sex determining genes during gonadal development.

  9. Sequence characterization of hypervariable regions in the soybean genome: leucine-rich repeats and simple sequence repeats

    Directory of Open Access Journals (Sweden)

    Everaldo G. de Barros

    2000-06-01

    Full Text Available The genetic basis of cultivated soybean is rather narrow. This observation has been confirmed by analysis of agronomic traits among different genotypes, and more recently by the use of molecular markers. During the construction of an RFLP soybean map (Glycine soja x Glycine max), the two progenitors were analyzed with over 2,000 probes, of which 25% were polymorphic. Among the probes that revealed polymorphisms, a small proportion, about 0.5%, hybridized to regions that were highly polymorphic. Here we report the sequencing and analysis of five of these probes. Three of the five contain segments that encode leucine-rich repeat (LRR) sequences homologous to known disease resistance genes in plants. Two other probes are relatively AT-rich and contain segments of (A)n/(T)n. DNA segments corresponding to one of the probes (A45-10) were amplified from nine soybean genotypes. Partial sequencing of these amplicons suggests that deletions and/or insertions are responsible for the extensive polymorphism observed. We propose that genes encoding LRR proteins and simple sequence repeat regions prone to slippage are some of the most hypervariable regions of the soybean genome.

  10. QTL-seq approach identified genomic regions and diagnostic markers for rust and late leaf spot resistance in groundnut (Arachis hypogaea L.).

    Science.gov (United States)

    Pandey, Manish K; Khan, Aamir W; Singh, Vikas K; Vishwakarma, Manish K; Shasidhar, Yaduru; Kumar, Vinay; Garg, Vanika; Bhat, Ramesh S; Chitikineni, Annapurna; Janila, Pasupuleti; Guo, Baozhu; Varshney, Rajeev K

    2017-08-01

    Rust and late leaf spot (LLS) are the two major foliar fungal diseases in groundnut, and their co-occurrence leads to significant yield loss in addition to the deterioration of fodder quality. To identify candidate genomic regions controlling resistance to rust and LLS, a whole-genome resequencing (WGRS)-based approach referred to as 'QTL-seq' was deployed. A total of 231.67 Gb raw and 192.10 Gb of clean sequence data were generated through WGRS of the resistant parent and the resistant and susceptible bulks for rust and LLS. Sequence analysis of bulks for rust and LLS with reference-guided resistant parent assembly identified 3136 single-nucleotide polymorphisms (SNPs) for rust and 66 SNPs for LLS with a read depth of ≥7 in the identified genomic region on pseudomolecule A03. Detailed analysis identified 30 nonsynonymous SNPs affecting 25 candidate genes for rust resistance, and 14 intronic and three synonymous SNPs affecting nine candidate genes for LLS resistance. Subsequently, allele-specific diagnostic markers were identified for three SNPs for rust resistance and one SNP for LLS resistance. Genotyping of one RIL population (TAG 24 × GPBD 4) with these four diagnostic markers revealed higher phenotypic variation for these two diseases. These results suggest the usefulness of the QTL-seq approach in precise and rapid identification of candidate genomic regions and development of diagnostic markers for breeding applications.
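
    QTL-seq analyses of this kind usually rest on the SNP-index (the fraction of reads at a site that carry the non-reference allele) and on the difference between the indices of the two bulks. The sketch below illustrates that calculation under stated assumptions and is not the pipeline used in the study: the function names and read counts are hypothetical, the sign convention (resistant minus susceptible) is an arbitrary choice, and only the read-depth cutoff of 7 comes from the abstract.

```python
def snp_index(alt_reads, depth, min_depth=7):
    """Fraction of reads carrying the non-reference allele, or None if depth is too low."""
    if depth < min_depth:
        return None  # site fails the read-depth filter (>= 7 in the abstract)
    return alt_reads / depth


def delta_snp_index(resistant_bulk, susceptible_bulk, min_depth=7):
    """Difference in SNP-index between bulks; each bulk is an (alt_reads, depth) pair."""
    r = snp_index(resistant_bulk[0], resistant_bulk[1], min_depth)
    s = snp_index(susceptible_bulk[0], susceptible_bulk[1], min_depth)
    if r is None or s is None:
        return None  # skip sites that are under-covered in either bulk
    return r - s


# Hypothetical read counts at two sites: (alt_reads, depth) per bulk.
sites = {
    "site_1": ((18, 20), (2, 20)),   # strong allele-frequency contrast between bulks
    "site_2": ((3, 5), (2, 20)),     # resistant bulk below the depth cutoff
}
for name, (res, sus) in sites.items():
    print(name, delta_snp_index(res, sus))
```

    Sites whose difference lies far from zero across a contiguous window are the usual candidates for a region linked to the trait.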

  11. A genome-wide association study for body weight in Japanese Thoroughbred racehorses clarifies candidate regions on chromosomes 3, 9, 15, and 18

    Science.gov (United States)

    TOZAKI, Teruaki; KIKUCHI, Mio; KAKOI, Hironaga; HIROTA, Kei-ichi; NAGATA, Shun-ichi

    2017-01-01

    Body weight is an important trait to confirm growth and development in humans and animals. In Thoroughbred racehorses, it is measured in the postnatal, training, and racing periods to evaluate growth and training degrees. The body weight of mature Thoroughbred racehorses generally ranges from 400 to 600 kg, and this broad range is likely influenced by environmental and genetic factors. Therefore, a genome-wide association study (GWAS) using the Equine SNP70 BeadChip was performed to identify the genomic regions associated with body weight in Japanese Thoroughbred racehorses using 851 individuals. The average body weight of these horses was 473.9 kg (standard deviation: 28.0) at the age of 3, and GWAS identified statistically significant SNPs on chromosomes 3 (BIEC2_808466, P=2.32E-14), 9 (BIEC2_1105503, P=1.03E-7), 15 (BIEC2_322669, P=9.50E-6), and 18 (BIEC2_417274, P=1.44E-14), which were associated with body weight as a quantitative trait. The genomic regions on chromosomes 3, 9, 15, and 18 included ligand-dependent nuclear receptor corepressor-like protein (LCORL), zinc finger and AT hook domain containing (ZFAT), tribbles pseudokinase 2 (TRIB2), and myostatin (MSTN), respectively, as candidate genes. LCORL and ZFAT are associated with withers height in horses, whereas MSTN affects muscle mass. Thus, the genomic regions identified in this study seem to affect the body weight of Thoroughbred racehorses. Although this information is useful for breeding and growth management of the horses, the production of genetically modified animals and gene doping (abuse/misuse of gene therapy) should be prohibited to maintain horse racing integrity. PMID:29270069

  12. Mycobacterium tuberculosis Whole Genome Sequences From Southern India Suggest Novel Resistance Mechanisms and the Need for Region-Specific Diagnostics.

    Science.gov (United States)

    Manson, Abigail L; Abeel, Thomas; Galagan, James E; Sundaramurthi, Jagadish Chandrabose; Salazar, Alex; Gehrmann, Thies; Shanmugam, Siva Kumar; Palaniyandi, Kannan; Narayanan, Sujatha; Swaminathan, Soumya; Earl, Ashlee M

    2017-06-01

    India is home to 25% of all tuberculosis cases and the second highest number of multidrug resistant cases worldwide. However, little is known about the genetic diversity and resistance determinants of Indian Mycobacterium tuberculosis, particularly for the primary lineages found in India, lineages 1 and 3. We whole genome sequenced 223 randomly selected M. tuberculosis strains from 196 patients within the Tiruvallur and Madurai districts of Tamil Nadu in Southern India. Using comparative genomics, we examined genetic diversity, transmission patterns, and evolution of resistance. Genomic analyses revealed (1) prevalence of strains from lineages 1 and 3, (2) recent transmission of strains among patients from the same treatment centers, (3) emergence of drug resistance within patients over time, (4) resistance gained in an order typical of strains from different lineages and geographies, (5) underperformance of known resistance-conferring mutations to explain phenotypic resistance in Indian strains relative to studies focused on other geographies, and (6) the possibility that resistance arose through mutations not previously implicated in resistance, or through infections with multiple strains that confound genotype-based prediction of resistance. In addition to substantially expanding the genomic perspectives of lineages 1 and 3, sequencing and analysis of M. tuberculosis whole genomes from Southern India highlight challenges of infection control and rapid diagnosis of resistant tuberculosis using current technologies. Further studies are needed to fully explore the complement of diversity and resistance determinants within endemic M. tuberculosis populations.

  13. Genome-wide Anaplasma phagocytophilum AnkA-DNA interactions are enriched in intergenic regions and gene promoters and correlate with infection-induced differential gene expression.

    Directory of Open Access Journals (Sweden)

    J Stephen Dumler

    2016-09-01

    Full Text Available Anaplasma phagocytophilum, an obligate intracellular prokaryote, infects neutrophils and alters cardinal functions via reprogrammed transcription. Large contiguous regions of neutrophil chromosomes are differentially expressed during infection. Secreted A. phagocytophilum effector AnkA transits into the neutrophil or granulocyte nucleus to complex with DNA in heterochromatin across all chromosomes. AnkA binds to gene promoters to dampen cis-transcription and also has features of matrix attachment region (MAR)-binding proteins that regulate three-dimensional chromatin architecture and coordinate transcriptional programs encoded in topologically-associated chromatin domains. We hypothesize that identification of additional AnkA binding sites will better delineate how A. phagocytophilum infection results in reprogramming of the neutrophil genome. Using AnkA-binding ChIP-seq, we showed that AnkA binds broadly throughout all chromosomes in a reproducible pattern, especially at: (i) intergenic regions predicted to be matrix attachment regions (MARs); (ii) within predicted lamina-associated domains; and (iii) at promoters ≤3,000 bp upstream of transcriptional start sites. These findings provide genome-wide support for AnkA as a regulator of cis-gene transcription. Moreover, the dominant mark of AnkA in distal intergenic regions known to be AT-enriched, coupled with frequent enrichment in the nuclear lamina, provides strong support for its role as a MAR-binding protein and genome re-organizer. AnkA must be considered a prime candidate to promote neutrophil reprogramming and subsequent functional changes that belie improved microbial fitness and pathogenicity.

  14. Genomic analysis of the chromosome 15q11-q13 Prader-Willi syndrome region and characterization of transcripts for GOLGA8E and WHCD1L1 from the proximal breakpoint region

    Directory of Open Access Journals (Sweden)

    Kashork Catherine D

    2008-01-01

    Full Text Available Abstract Background Prader-Willi syndrome (PWS) is a neurobehavioral disorder characterized by neonatal hypotonia, childhood obesity, dysmorphic features, hypogonadism, mental retardation, and behavioral problems. Although PWS is most often caused by a paternal interstitial deletion of a 6-Mb region of chromosome 15q11-q13, the identity of the exact protein coding or noncoding RNAs whose deficiency produces the PWS phenotype is uncertain. There are also reports describing a PWS-like phenotype in a subset of patients with full mutations in the FMR1 (fragile X mental retardation 1) gene. Taking advantage of the human genome sequence, we have performed extensive sequence analysis and molecular studies for the PWS candidate region. Results We have characterized transcripts for the first time for two UCSC Genome Browser predicted protein-coding genes, GOLGA8E (golgin subfamily a, 8E) and WHDC1L1 (WAS protein homology region containing 1-like 1), and have further characterized two previously reported genes, CYFIP1 and NIPA2; all four genes are in the region close to the proximal/centromeric deletion breakpoint (BP1). GOLGA8E belongs to the golgin subfamily of coiled-coil proteins associated with the Golgi apparatus. Six out of 16 golgin subfamily proteins in the human genome have been mapped in the chromosome 15q11-q13 and 15q24-q26 regions. We have also identified more than 38 copies of GOLGA8E-like sequence in the 15q11-q14 and 15q23-q26 regions, which supports the presence of a GOLGA8E-associated low copy repeat (LCR). Analysis of the 15q11-q13 region by PFGE also revealed a polymorphic region between BP1 and BP2. WHDC1L1 is a novel gene with similarity to mouse Whdc1 (WAS protein homology region 2 domain containing 1) and human JMY protein (junction-mediating and regulatory protein). Expression analysis of cultured human cells and brain tissues from PWS patients indicates that CYFIP1 and NIPA2 are biallelically expressed. However, we were not able to

  15. Insights into the ancestral organisation of the mammalian MHC class II region from the genome of the pteropid bat, Pteropus alecto.

    Science.gov (United States)

    Ng, Justin H J; Tachedjian, Mary; Wang, Lin-Fa; Baker, Michelle L

    2017-05-18

    Bats are an extremely successful group of mammals and possess a variety of unique characteristics, including their ability to co-exist with a diverse range of pathogens. The major histocompatibility complex (MHC) is the most gene dense and polymorphic region of the genome and MHC class II (MHC-II) molecules play a vital role in the presentation of antigens derived from extracellular pathogens and activation of the adaptive immune response. Characterisation of the MHC-II region of bats is crucial for understanding the evolution of the MHC and of the role of pathogens in shaping the immune system. Here we describe the relatively contracted MHC-II region of the Australian black flying-fox (Pteropus alecto), providing the first detailed insight into the MHC-II region of any species of bat. Twelve MHC-II genes, including one locus (DRB2) located outside the class II region, were identified on a single scaffold in the bat genome. The presence of a class II locus outside the MHC-II region is atypical and provides evidence for an ancient class II duplication block. Two non-classical loci, DO and DM and two classical, DQ and DR loci, were identified in P. alecto. A putative classical, DPB pseudogene was also identified. The bat's antigen processing cluster, though contracted, remains highly conserved, thus supporting its importance in antigen presentation and disease resistance. This detailed characterisation of the bat MHC-II region helps to fill a phylogenetic gap in the evolution of the mammalian class II region and is a stepping stone towards better understanding of the immune responses in bats to viral, bacterial, fungal and parasitic infections.

  16. Comparative Genomics of H. pylori and Non-Pylori Helicobacter Species to Identify New Regions Associated with Its Pathogenicity and Adaptability

    Directory of Open Access Journals (Sweden)

    De-Min Cao

    2016-01-01

    Full Text Available The genus Helicobacter is a group of Gram-negative, helical-shaped pathogens consisting of at least 36 bacterial species. Helicobacter pylori (H. pylori), infecting more than 50% of the human population, is considered the major cause of gastritis, peptic ulcer, and gastric cancer. However, the genetic underpinnings of H. pylori that are responsible for its large-scale epidemic and its adaptation to the gastrointestinal environment of human beings remain unclear. Core-pan genome analysis was performed among 75 representative H. pylori and 24 non-pylori Helicobacter genomes. There were 1173 conserved protein families of H. pylori and 673 of all 99 Helicobacter genus strains. We found 79 genome unique regions, a total of 202,359 bp, shared by at least 80% of the H. pylori genomes but lacking in non-pylori Helicobacter species. The operons, genes, and sRNAs within the H. pylori unique regions were considered potentially associated with its pathogenicity and adaptability, and the relationships among them have been partially confirmed by functional annotation analysis. However, functions of at least 54 genes and 10 sRNAs were still unclear. Our analysis of protein-protein interaction showed that 30 genes within them may act cooperatively.
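
    The selection rule described in this abstract (regions shared by at least 80% of the H. pylori genomes but missing from the non-pylori genomes) can be expressed as a simple set filter. The sketch below is a toy illustration and not the authors' method: the function name, the data layout and the genome identifiers are hypothetical, and reading "lacking in non-pylori species" as "absent from every outgroup genome" is an assumption.

```python
def group_specific_regions(presence, target, outgroup, min_fraction=0.8):
    """Return regions present in >= min_fraction of target genomes and in no outgroup genome.

    presence maps a region id to the set of genome ids that carry it.
    """
    hits = []
    for region, carriers in presence.items():
        fraction_in_target = len(carriers & target) / len(target)
        absent_from_outgroup = not (carriers & outgroup)  # assumption: zero outgroup carriers
        if fraction_in_target >= min_fraction and absent_from_outgroup:
            hits.append(region)
    return sorted(hits)


# Toy data: five target genomes (hp1..hp5) and two outgroup genomes (nh1, nh2).
target = {"hp1", "hp2", "hp3", "hp4", "hp5"}
outgroup = {"nh1", "nh2"}
presence = {
    "region_A": {"hp1", "hp2", "hp3", "hp4"},          # 80% of target, no outgroup
    "region_B": {"hp1", "hp2", "hp3", "hp4", "nh1"},   # also found in the outgroup
    "region_C": {"hp1", "hp2"},                        # only 40% of target
}
print(group_specific_regions(presence, target, outgroup))
```

    In this toy input only region_A passes both conditions.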

  17. Genomic relationships of Actinobacillus pleuropneumoniae serotype 2 strains evaluated by ribotyping, sequence analysis of ribosomal intergenic regions, and pulsed-field gel electrophoresis

    DEFF Research Database (Denmark)

    Fussing, V.

    1998-01-01

    The aim of the present study was to examine the genomic relationship among 112 Actinobacillus pleuropneumoniae serotype 2 strains obtained throughout Europe and North America. HindIII ribotyping of the strains resulted in five ribotypes of high similarity (87-98%). Sequence analysis of the ribosomal intergenic region of strains representing each ribotype and each country showed no differences. A common ribotype was further characterized by PFGE of 12 strains representing all countries. The resultant five PFGE patterns of European strains showed a similarity of more than 91%, to which the two...
  18. Organization and expression of genes in the genomic region surrounding the glutamine synthetase gene Gln1 from Lotus japonicus

    DEFF Research Database (Denmark)

    Thykjaer, T; Danielsen, D; She, Q

    1997-01-01

    The diploid Lotus japonicus was previously suggested as a model for the legume plant family. We present here the nucleotide sequence and the derived gene organization of a small part of the genome in this model plant. Two functional genes with the same transcriptional orientation were identified...

  19. Novel insights in the genomic organization and hotspots of recombination in the human KIR locus through analysis of intergenic regions

    NARCIS (Netherlands)

    Vendelbosch, S.; de Boer, M.; van Leeuwen, K.; Pourfarzad, F.; Geissler, J.; van den Berg, T. K.; Kuijpers, T. W.

    2015-01-01

    The Killer Immunoglobulin-like Receptor (KIR) proteins constitute a family of highly homologous surface receptors involved in the regulation of the innate cytotoxicity of natural killer (NK) cells. Within the human genome, 17 KIR genes are present, many of which show large variation across the

  20. Variations in the G6PC2/ABCB11 genomic region are associated with fasting glucose levels

    DEFF Research Database (Denmark)

    Chen, Wei-Min; Erdos, Michael R; Jackson, Anne U

    2008-01-01

    Identifying the genetic variants that regulate fasting glucose concentrations may further our understanding of the pathogenesis of diabetes. We therefore investigated the association of fasting glucose levels with SNPs in 2 genome-wide scans including a total of 5,088 nondiabetic individuals from...

  1. Integration of genomic resources to uncover pleiotropic regions associated with age at puberty and reproductive longevity in sows

    Science.gov (United States)

    Commercial and experimental genetic resources were used to investigate genetic pleiotropic factors that influence age at puberty, litter-size and reproductive longevity. The phenotypes were complemented by high-density genotyping and whole genome and RNA sequencing. The SNPs from Porcine SNP60 BeadA...

  2. The complete mitochondrial genome of the common sea slater, Ligia oceanica (Crustacea, Isopoda) bears a novel gene order and unusual control region features

    Directory of Open Access Journals (Sweden)

    Podsiadlowski Lars

    2006-09-01

    Full Text Available Abstract Background Sequence data and other characters from mitochondrial genomes (gene translocations, secondary structure of RNA molecules) are useful in phylogenetic studies among metazoan animals from population to phylum level. Moreover, the comparison of complete mitochondrial sequences gives valuable information about the evolution of small genomes, e.g. about different mechanisms of gene translocation, gene duplication and gene loss, or concerning nucleotide frequency biases. The Peracarida (gammarids, isopods, etc.) comprise about 21,000 species of crustaceans, living in many environments from deep sea floor to arid terrestrial habitats. Ligia oceanica is a terrestrial isopod living at rocky seashores of the European North Sea and Atlantic coastlines. Results The study reveals the first complete mitochondrial DNA sequence from a peracarid crustacean. The mitochondrial genome of Ligia oceanica is a circular double-stranded DNA molecule, with a size of 15,289 bp. It shows several changes in mitochondrial gene order compared to other crustacean species. An overview about mitochondrial gene order of all crustacean taxa yet sequenced is also presented. The largest non-coding part (the putative mitochondrial control region) of the mitochondrial genome of Ligia oceanica is unexpectedly not AT-rich compared to the remainder of the genome. It bears two repeat regions (4× 10 bp and 3× 64 bp), and a GC-rich hairpin-like secondary structure. Some of the transfer RNAs show secondary structures which derive from the usual cloverleaf pattern. While some tRNA genes are putative targets for RNA editing, trnR could not be localized at all. Conclusion Gene order is not conserved among Peracarida, not even among isopods. The two isopod species Ligia oceanica and Idotea baltica show a similarly derived gene order, compared to the arthropod ground pattern and to the amphipod Parhyale hawaiiensis, suggesting that most of the translocation events were already

  3. Developmental roles of 21 Drosophila transcription factors are determined by quantitative differences in binding to an overlapping set of thousands of genomic regions

    Energy Technology Data Exchange (ETDEWEB)

    MacArthur, Stewart; Li, Xiao-Yong; Li, Jingyi; Brown, James B.; Chu, Hou Cheng; Zeng, Lucy; Grondona, Brandi P.; Hechmer, Aaron; Simirenko, Lisa; Keranen, Soile V.E.; Knowles, David W.; Stapleton, Mark; Bickel, Peter; Biggin, Mark D.; Eisen, Michael B.

    2009-05-15

    BACKGROUND: We previously established that six sequence-specific transcription factors that initiate anterior/posterior patterning in Drosophila bind to overlapping sets of thousands of genomic regions in blastoderm embryos. While regions bound at high levels include known and probable functional targets, more poorly bound regions are preferentially associated with housekeeping genes and/or genes not transcribed in the blastoderm, and are frequently found in protein coding sequences or in less conserved non-coding DNA, suggesting that many are likely non-functional. RESULTS: Here we show that an additional 15 transcription factors that regulate other aspects of embryo patterning show a similar quantitative continuum of function and binding to thousands of genomic regions in vivo. Collectively, the 21 regulators show a surprisingly high overlap in the regions they bind given that they belong to 11 DNA binding domain families, specify distinct developmental fates, and can act via different cis-regulatory modules. We demonstrate, however, that quantitative differences in relative levels of binding to shared targets correlate with the known biological and transcriptional regulatory specificities of these factors. CONCLUSIONS: It is likely that the overlap in binding of biochemically and functionally unrelated transcription factors arises from the high concentrations of these proteins in nuclei, which, coupled with their broad DNA binding specificities, directs them to regions of open chromatin. We suggest that most animal transcription factors will be found to show a similar broad overlapping pattern of binding in vivo, with specificity achieved by modulating the amount, rather than the identity, of bound factor.

  4. The Mediterranean Sea as a barrier to gene flow: evidence from variation in and around the F7 and F12 genomic regions.

    Science.gov (United States)

    Athanasiadis, Georgios; González-Pérez, Emili; Esteban, Esther; Dugoujon, Jean-Michel; Stoneking, Mark; Moral, Pedro

    2010-03-27

    The Mediterranean has a long history of interactions among different peoples. In this study, we investigate the genetic relationships among thirteen population samples from the broader Mediterranean region together with three other groups from the Ivory Coast and Bolivia with a particular focus on the genetic structure between North Africa and South Europe. Analyses were carried out on a diverse set of neutral and functional polymorphisms located in and around the coagulation factor VII and XII genomic regions (F7 and F12). Principal component analysis revealed a significant clustering of the Mediterranean samples into North African and South European groups consistent with the results from the hierarchical AMOVA, which showed a low but significant differentiation between groups from the two shores. For the same range of geographic distances, populations from each side of the Mediterranean were found to differ genetically more than populations within the same side. To further investigate this differentiation, we carried out haplotype analyses, which provided partial evidence that sub-Saharan gene flow was higher towards North Africa than South Europe. As there is no consensus between the two genomic regions regarding gene flow through the Sahara, it is hard to reach a solid conclusion about its role in the differentiation between the two Mediterranean shores and more data are necessary to reach a definite conclusion. However our data suggest that the Mediterranean Sea was at least partially a barrier to gene flow between the two shores.

  5. Genetic basis of olfactory cognition: extremely high level of DNA sequence polymorphism in promoter regions of the human olfactory receptor genes revealed using the 1000 Genomes Project dataset.

    Science.gov (United States)

    Ignatieva, Elena V; Levitsky, Victor G; Yudin, Nikolay S; Moshkin, Mikhail P; Kolchanov, Nikolay A

    2014-01-01

    The molecular mechanism of olfactory cognition is very complicated. Olfactory cognition is initiated by olfactory receptor proteins (odorant receptors), which are activated by olfactory stimuli (ligands). Olfactory receptors are the initial players in the signal transduction cascade producing a nerve impulse, which is transmitted to the brain. The sensitivity to a particular ligand depends on the expression level of multiple proteins involved in the process of olfactory cognition: olfactory receptor proteins, proteins that participate in the signal transduction cascade, etc. The expression level of each gene is controlled by its regulatory regions, and especially by the promoter [a region of DNA about 100-1000 base pairs long located upstream of the transcription start site (TSS)]. We analyzed single nucleotide polymorphisms using human whole-genome data from the 1000 Genomes Project and revealed an extremely high level of single nucleotide polymorphisms in promoter regions of olfactory receptor genes and HLA genes. We hypothesized that the high level of polymorphisms in olfactory receptor promoters was responsible for the diversity in regulatory mechanisms controlling the expression levels of olfactory receptor proteins. Such diversity of regulatory mechanisms may cause the great variability of olfactory cognition of numerous environmental olfactory stimuli perceived by human beings (air pollutants, human body odors, culinary odors, etc.). In turn, this variability may provide a wide range of emotional and behavioral reactions related to the vast variety of olfactory stimuli.

  6. A genome-wide association study in a large F2-cross of laying hens reveals novel genomic regions associated with feather pecking and aggressive pecking behavior.

    Science.gov (United States)

    Lutz, Vanessa; Stratz, Patrick; Preuß, Siegfried; Tetens, Jens; Grashorn, Michael A; Bessei, Werner; Bennewitz, Jörn

    2017-02-03

    Feather pecking and aggressive pecking in laying hens are serious economic and welfare issues. In spite of extensive research on feather pecking during the last decades, the motivation for this behavior is still not clear. A small to moderate heritability has frequently been reported for these traits. Recently, we identified several single-nucleotide polymorphisms (SNPs) associated with feather pecking by mapping selection signatures in two divergent feather pecking lines. Here, we performed a genome-wide association analysis (GWAS) for feather pecking and aggressive pecking behavior, then combined the results with those from the recent selection signature experiment, and linked them to those obtained from a differential gene expression study. A large F2 cross of 960 F2 hens was generated using the divergent lines as founders. Hens were phenotyped for feather pecks delivered (FPD), aggressive pecks delivered (APD), and aggressive pecks received (APR). Individuals were genotyped with the Illumina 60K chicken Infinium iSelect chip. After data filtering, 29,376 SNPs remained for analyses. Single-marker GWAS was performed using a Poisson model. The results were combined with those from the selection signature experiment using Fisher's combined probability test. Numerous significant SNPs were identified for all traits but with low false discovery rates. Nearly all significant SNPs were located in clusters that spanned a maximum of 3 Mb and included at least two significant SNPs. For FPD, four clusters were identified, which increased to 13 based on the meta-analysis (FPD meta). Seven clusters were identified for APD and three for APR. Eight genes (of the 750 investigated genes located in the FPD meta clusters) were significantly differentially expressed in the brain of hens from both lines. One gene, SLC12A9, and the positional candidate gene for APD, GNG2, may be linked to the monoamine signaling pathway, which is involved in feather pecking and aggressive behavior.
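
    The abstract above combines GWAS results with selection-signature results using Fisher's combined probability test. As a minimal sketch of that general technique (not the authors' code; the function name and the example p-values are hypothetical), the combined statistic X = -2 * sum(ln p_i) can be evaluated against a chi-square distribution with 2k degrees of freedom, whose upper tail has a closed form for even degrees of freedom:

```python
import math


def fisher_combined_p(p_values):
    """Combine independent p-values with Fisher's method.

    Under the null hypothesis, X = -2 * sum(ln p_i) follows a chi-square
    distribution with 2k degrees of freedom. For even degrees of freedom
    the upper tail is exp(-x/2) * sum_{i<k} (x/2)^i / i!.
    """
    if not p_values or any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    k = len(p_values)
    half_x = -sum(math.log(p) for p in p_values)  # equals X / 2
    term, total = 1.0, 1.0  # i = 0 term of the series
    for i in range(1, k):
        term *= half_x / i  # builds (x/2)^i / i! incrementally
        total += term
    return math.exp(-half_x) * total


# Hypothetical example: one SNP with a p-value from each of two studies.
combined = fisher_combined_p([0.05, 0.05])
print(f"combined p = {combined:.4f}")
```

    The test assumes the combined p-values are independent; whether that holds for a given pair of experiments is a judgment the sketch does not address.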

  7. Rice pseudomolecule-anchored cross-species DNA sequence alignments indicate regional genomic variation in expressed sequence conservation

    Directory of Open Access Journals (Sweden)

    Thomas Howard

    2007-08-01

    Full Text Available Abstract Background: Various methods have been developed to explore inter-genomic relationships among plant species. Here, we present a sequence similarity analysis based upon comparison of transcript-assembly and methylation-filtered databases from five plant species and physically anchored rice coding sequences. Results: A comparison of the frequency of sequence alignments, determined by MegaBLAST, between rice coding sequences in TIGR pseudomolecules and annotations version 4.0 and comprehensive transcript-assembly and methylation-filtered databases from Lolium perenne (ryegrass), Zea mays (maize), Hordeum vulgare (barley), Glycine max (soybean) and Arabidopsis thaliana (thale cress) was undertaken. Each rice pseudomolecule was divided into 10 segments, each containing 10% of the functionally annotated, expressed genes. This indicated a correlation between relative segment position in the rice genome and numbers of alignments with all the queried monocot and dicot plant databases. Colour-coded moving windows of 100 functionally annotated, expressed genes along each pseudomolecule were used to generate 'heat-maps'. These revealed consistent intra- and inter-pseudomolecule variation in the relative concentrations of significant alignments with the tested plant databases. Analysis of the annotations and derived putative expression patterns of rice genes from 'hot-spots' and 'cold-spots' within the heat maps indicated possible functional differences. A similar comparison relating to ancestral duplications of the rice genome indicated that duplications were often associated with 'hot-spots'. Conclusion: Physical positions of expressed genes in the rice genome are correlated with the degree of conservation of similar sequences in the transcriptomes of other plant species. This relative conservation is associated with the distribution of different sized gene families and segmentally duplicated loci and may have functional and evolutionary implications.
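
    The segmentation step described above (ten segments per pseudomolecule, each holding 10% of the genes) can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, genes are assumed to be already ordered by physical position, and each gene is reduced to a boolean flag saying whether it has a significant alignment.

```python
def split_into_segments(ordered_genes, n_segments=10):
    """Split genes (ordered along a pseudomolecule) into n_segments groups
    holding as close to an equal share of the genes as integer sizes allow."""
    n = len(ordered_genes)
    bounds = [i * n // n_segments for i in range(n_segments + 1)]
    return [ordered_genes[bounds[i]:bounds[i + 1]] for i in range(n_segments)]


def alignment_fraction_per_segment(has_alignment, n_segments=10):
    """For each segment, return the fraction of genes flagged as aligned."""
    segments = split_into_segments(has_alignment, n_segments)
    return [sum(1 for hit in seg if hit) / len(seg) if seg else 0.0
            for seg in segments]


# Hypothetical toy data: 20 genes, the first half with alignments.
flags = [True] * 10 + [False] * 10
print(alignment_fraction_per_segment(flags))
```

    Comparing these per-segment fractions against segment position is one simple way to look for the positional trend the abstract reports.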

  8. Z-DNA-forming sites identified by ChIP-Seq are associated with actively transcribed regions in the human genome.

    Science.gov (United States)

    Shin, So-I; Ham, Seokjin; Park, Jihwan; Seo, Seong Hye; Lim, Chae Hyun; Jeon, Hyeongrin; Huh, Jounghyun; Roh, Tae-Young

    2016-07-03

    Z-DNA, a left-handed double helical DNA, is structurally different from the most abundant B-DNA. Z-DNA has been known to play a significant role in transcription and genome stability, but the biological meaning and positions of Z-DNA-forming sites (ZFSs) in the human genome have not been fully explored. To obtain a genome-wide map of ZFSs, Zaa with two Z-DNA-binding domains was used for ChIP-Seq analysis. A total of 391 ZFSs were found and their functions were examined in vivo. A large portion of ZFSs was enriched in the promoter regions and contained sequences with high potential to form Z-DNA. Genes containing ZFSs were occupied by RNA polymerase II at the promoters and showed high levels of expression. Moreover, ZFSs were significantly related to active histone marks such as H3K4me3 and H3K9ac. The association of Z-DNA with active transcription was confirmed by the reporter assay system. Overall, our results suggest that Z-DNA formation depends on chromatin structure as well as sequence composition, and is associated with active transcription in human cells. The global information about ZFS positioning will provide a useful resource for further understanding of DNA structure-dependent transcriptional regulation. © The Author 2016. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.

  9. Read-Split-Run: an improved bioinformatics pipeline for identification of genome-wide non-canonical spliced regions using RNA-Seq data.

    Science.gov (United States)

    Bai, Yongsheng; Kinne, Jeff; Donham, Brandon; Jiang, Feng; Ding, Lizhong; Hassler, Justin R; Kaufman, Randal J

    2016-08-22

    Most existing tools for detecting next-generation sequencing-based splicing events focus on generic splicing events. Consequently, special types of non-canonical splicing events of short mRNA regions (IRE1α targeted) have not yet been thoroughly addressed at a genome-wide level using bioinformatics approaches in conjunction with next-generation technologies. During endoplasmic reticulum (ER) stress, the gene encoding the RNase Ire1α is known to splice out a short 26 nt region from the mRNA of the transcription factor Xbp1 non-canonically within the cytosol. This causes an open reading frame-shift that induces expression of many downstream genes in reaction to ER stress as part of the unfolded protein response (UPR). We previously published an algorithm termed "Read-Split-Walk" (RSW) to identify non-canonical splicing regions using RNA-Seq data and applied it to ER stress-induced Ire1α heterozygote and knockout mouse embryonic fibroblast cell lines. In this study, we have developed an improved algorithm "Read-Split-Run" (RSR) for detecting genome-wide Ire1α-targeted genes with non-canonical spliced regions at a faster speed. We applied the RSR algorithm, using different combinations of several parameters, to the mouse embryonic fibroblast (MEF) cells previously tested with RSW and to the human Encyclopedia of DNA Elements (ENCODE) RNA-Seq data. We also compared the performance of RSR with two other alternative splicing event identification tools (TopHat (Trapnell et al., Bioinformatics 25:1105-1111, 2009) and Alt Event Finder (Zhou et al., BMC Genomics 13:S10, 2012)), using the spliced Xbp1 mRNA as a positive control: in these data sets we identified it as the top cleavage target, present in Ire1α (+/-) but absent in Ire1α (-/-) MEF samples. This comparison was also extended to human ENCODE RNA-Seq data. Proof of principle came in our results by the fact that the 26 nt non-conventional splice site in Xbp1 was detected as the top hit by our new RSR

  10. Comparative Investigation of the Genomic Regions Involved in Antigenic Variation of the TprK Antigen among Treponemal Species, Subspecies, and Strains

    Science.gov (United States)

    Brandt, Stephanie L.; Puray-Chavez, Maritza; Reid, Tara Brinck; Godornes, Charmie; Molini, Barbara J.; Benzler, Martin; Hartig, Jörg S.; Lukehart, Sheila A.; Centurion-Lara, Arturo

    2012-01-01

    Although the three Treponema pallidum subspecies (T. pallidum subsp. pallidum, T. pallidum subsp. pertenue, and T. pallidum subsp. endemicum), Treponema paraluiscuniculi, and the unclassified Fribourg-Blanc treponeme cause clinically distinct diseases, these pathogens are genetically and antigenically highly related and are able to cause persistent infection. Recent evidence suggests that the putative surface-exposed variable antigen TprK plays an important role in both treponemal immune evasion and persistence. tprK heterogeneity is generated by nonreciprocal gene conversion between the tprK expression site and donor sites. Although each of the above-mentioned species and subspecies has a functional tprK antigenic variation system, it is still unclear why the level of expression and the rate at which tprK diversifies during infection can differ significantly among isolates. To identify genomic differences that might affect the generation and expression of TprK variants among these pathogens, we performed comparative sequence analysis of the donor sites, as well as the tprK expression sites, among eight T. pallidum subsp. pallidum isolates (Nichols Gen, Nichols Sea, Chicago, Sea81-4, Dal-1, Street14, UW104, and UW126), three T. pallidum subsp. pertenue isolates (Gauthier, CDC2, and Samoa D), one T. pallidum subsp. endemicum isolate (Iraq B), the unclassified Fribourg-Blanc isolate, and the Cuniculi A strain of T. paraluiscuniculi. Synteny and sequence conservation, as well as deletions and insertions, were found in the regions harboring the donor sites. These data suggest that the tprK recombination system is harbored within dynamic genomic regions and that genomic differences might be an important key to explain discrepancies in generation and expression of tprK variants among these Treponema isolates. PMID:22661689

  11. Use of conserved genomic regions and degenerate primers in a PCR-based assay for the detection of members of the genus Caulimovirus.

    Science.gov (United States)

    Pappu, H R; Druffel, K L

    2009-04-01

    The genus Caulimovirus consists of several distinct virus species with a double-stranded DNA genome that infect diverse plant species. A comparative analysis of the sequences of known Caulimovirus species revealed two regions that are conserved in all Caulimovirus species with the exception of Strawberry vein banding virus. Degenerate primers based on these two regions were designed and tested in a polymerase chain reaction-based assay for broad spectrum detection of members of this genus. Cauliflower mosaic virus, Figwort mosaic virus and three distinct caulimoviruses associated with dahlia (Dahlia variabilis) were used to show the utility of this test in detecting diverse caulimoviruses. The primer pair gave an amplicon of the expected size (840 bp). Amplicons from each virus were cloned and sequenced to verify their identity. The primer pair and the PCR assay provide an approach for the broad spectrum detection of several members of the genus Caulimovirus.

  12. ChIP-seq Analysis in R (CSAR): An R package for the statistical detection of protein-bound genomic regions.

    Science.gov (United States)

    Muiño, Jose M; Kaufmann, Kerstin; van Ham, Roeland Chj; Angenent, Gerco C; Krajewski, Pawel

    2011-05-09

    In vivo detection of protein-bound genomic regions can be achieved by combining chromatin-immunoprecipitation with next-generation sequencing technology (ChIP-seq). The large amount of sequence data produced by this method needs to be analyzed in a statistically proper and computationally efficient manner. The generation of high copy numbers of DNA fragments as an artifact of the PCR step in ChIP-seq is an important source of bias of this methodology. We present here an R package for the statistical analysis of ChIP-seq experiments. Taking the average size of DNA fragments subjected to sequencing into account, the software calculates single-nucleotide read-enrichment values. After normalization, sample and control are compared using a test based on the ratio test or the Poisson distribution. Test statistic thresholds to control the false discovery rate are obtained through random permutations. Computational efficiency is achieved by implementing the most time-consuming functions in C++ and integrating these in the R package. An analysis of simulated and experimental ChIP-seq data is presented to demonstrate the robustness of our method against PCR-artefacts and its adequate control of the error rate. The software ChIP-seq Analysis in R (CSAR) enables fast and accurate detection of protein-bound genomic regions through the analysis of ChIP-seq experiments. Compared to existing methods, we found that our package shows greater robustness against PCR-artefacts and better control of the error rate.
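
    The Poisson-based comparison of sample and control mentioned above can be illustrated with a small, self-contained sketch. This is not CSAR's implementation or API: the function names, the library-size scaling, and the pseudo-count floor are assumptions made for illustration, and the permutation step used for false discovery rate thresholds is not reproduced here.

```python
import math


def poisson_upper_tail(observed, expected):
    """Return P(X >= observed) for X ~ Poisson(expected)."""
    if observed < 0 or expected <= 0:
        raise ValueError("observed must be >= 0 and expected must be > 0")
    term = math.exp(-expected)  # P(X = 0)
    cdf_below = 0.0             # accumulates P(X < observed)
    for i in range(observed):
        cdf_below += term
        term *= expected / (i + 1)  # P(X = i + 1) from P(X = i)
    return max(0.0, 1.0 - cdf_below)


def enrichment_p(sample_reads, control_reads, sample_total, control_total):
    """Upper-tail p-value for the sample coverage at one position, using the
    library-size-scaled control coverage as the Poisson mean.
    The floor of one control read is an assumed pseudo-count."""
    scale = sample_total / control_total
    expected = max(control_reads, 1) * scale
    return poisson_upper_tail(sample_reads, expected)


# Hypothetical counts at one position, equal library sizes.
print(enrichment_p(sample_reads=30, control_reads=5,
                   sample_total=1_000_000, control_total=1_000_000))
```

    A small p-value flags a position where the sample has more reads than the scaled control would predict; in practice many positions are tested, which is why a multiple-testing threshold is needed.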

  13. High abundance of Serine/Threonine-rich regions predicted to be hyper-O-glycosylated in the secretory proteins coded by eight fungal genomes

    Directory of Open Access Journals (Sweden)

    González Mario

    2012-09-01

    Full Text Available Abstract Background: O-glycosylation of secretory proteins has been found to be an important factor in fungal biology and virulence. It consists of the addition of short glycosidic chains to Ser or Thr residues in the protein backbone via O-glycosidic bonds. Secretory proteins in fungi frequently display Ser/Thr-rich regions that could be sites of extensive O-glycosylation. We have analyzed in silico the complete sets of putatively secretory proteins coded by eight fungal genomes (Botrytis cinerea, Magnaporthe grisea, Sclerotinia sclerotiorum, Ustilago maydis, Aspergillus nidulans, Neurospora crassa, Trichoderma reesei, and Saccharomyces cerevisiae) in search of Ser/Thr-rich regions as well as regions predicted to be highly O-glycosylated by NetOGlyc (http://www.cbs.dtu.dk). Results: By comparison with experimental data, NetOGlyc was found to overestimate the number of O-glycosylation sites in fungi by a factor of 1.5, but to be quite reliable in the prediction of highly O-glycosylated regions. About half of secretory proteins have at least one Ser/Thr-rich region, with a Ser/Thr content of at least 40% over an average length of 40 amino acids. Most secretory proteins in filamentous fungi were predicted to be O-glycosylated, sometimes in dozens or even hundreds of sites. Residues predicted to be O-glycosylated have a tendency to be grouped together, forming hyper-O-glycosylated regions of varying length. Conclusions: About one fourth of secretory fungal proteins were predicted to have at least one hyper-O-glycosylated region, which consists of 45 amino acids on average and displays at least one O-glycosylated Ser or Thr every four residues. These putative highly O-glycosylated regions can be found anywhere along the proteins but have a slight tendency to be at either one of the two ends.
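
    A search for Ser/Thr-rich regions of the kind described above can be sketched with a sliding window. The abstract does not state the exact scanning criterion, so the fixed window of 40 residues and the 40% threshold used as defaults here are assumptions borrowed from the reported summary figures, and the function name is hypothetical.

```python
def ser_thr_rich_windows(sequence, window=40, min_fraction=0.40):
    """Return 0-based start positions of every window whose fraction of
    Ser (S) or Thr (T) residues is at least min_fraction."""
    seq = sequence.upper()
    if window <= 0 or len(seq) < window:
        return []
    is_st = [1 if aa in "ST" else 0 for aa in seq]
    count = sum(is_st[:window])  # S/T count in the first window
    hits = []
    for start in range(len(seq) - window + 1):
        if start > 0:
            # Slide by one residue: add the entering one, drop the leaving one.
            count += is_st[start + window - 1] - is_st[start - 1]
        if count >= min_fraction * window:
            hits.append(start)
    return hits


# Hypothetical toy sequence with a short window for readability.
print(ser_thr_rich_windows("AASTAA", window=4, min_fraction=0.5))
```

    Overlapping hits would normally be merged into regions before measuring their length; that step is left out to keep the sketch short.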

  14. Genomic rearrangements and functional diversification of lecA and lecB lectin-coding regions impacting the efficacy of glycomimetics directed against Pseudomonas aeruginosa

    Directory of Open Access Journals (Sweden)

    Amine M Boukerb

    2016-05-01

    Full Text Available LecA and LecB tetrameric lectins take part in oligosaccharide-mediated adhesion processes of Pseudomonas aeruginosa. Glycomimetics have been designed to block these interactions. The great versatility of P. aeruginosa suggests that the range of application of these glycomimetics could be restricted to genotypes with particular lectin types. The likelihood of having genomic and genetic changes impacting LecA and LecB interactions with glycomimetics such as galactosylated and fucosylated calix[4]arene was investigated over a collection of strains from the main clades of P. aeruginosa. Lectin types were defined, and their ligand specificities were inferred. These analyses showed a loss of lecA among the PA7 clade. Genomic changes impacting lec loci were thus assessed using strains of this clade, and by making comparisons with the PAO1 genome. The lecA regions were found challenged by phage attacks and PAGI-2 (genomic island) integrations. A prophage was linked to the loss of lecA. The lecB regions were found less impacted by such rearrangements, but greater lecB than lecA genetic divergences were recorded. Sixteen combinations of LecA and LecB types were observed. Amino acid variations were mapped on PAO1 crystal structures. Most significant changes were observed on LecBPA7, and found close to the fucose binding site. Glycan array analyses were performed with purified LecBPA7. LecBPA7 was found less specific for fucosylated oligosaccharides than LecBPAO1, with a preference for H type 2 rather than type 1, and Lewis a rather than Lewis x. Comparison of the crystal structures of LecBPA7 and LecBPAO1 in complex with Lewis a showed these changes in specificity to have resulted from a modification of the water network between the lectin, galactose and GlcNAc residues. Incidence of these modifications on the interactions with calix[4]arene glycomimetics at the cell level was investigated. An aggregation test was used to establish the efficacy of these ligands.

  15. The 172-kb genomic DNA region of the O. rufipogon yld1.1 locus: comparative sequence analysis with O. sativa ssp. japonica and O. sativa ssp. indica.

    Science.gov (United States)

    Song, Beng-Kah; Hein, Ingo; Druka, Arnis; Waugh, Robbie; Marshall, David; Nadarajah, Kalaivani; Yap, Soon-Joo; Ratnam, Wickneswari

    2009-02-01

    Common wild rice (Oryza rufipogon) plays an important role by contributing to modern rice breeding. In this paper, we report the sequence and analysis of a 172-kb genomic DNA region of wild rice around the RM5 locus, which is associated with the yield QTL yld1.1. Comparative sequence analysis between orthologous RM5 regions from Oryza sativa ssp. japonica, O. sativa ssp. indica and O. rufipogon revealed a high level of conserved synteny in the content, homology, structure, orientation, and physical distance of all 14 predicted genes. Twelve of the putative genes were supported by matches to proteins with known function, whereas two were predicted by homology to rice and other plant expressed sequence tags or complementary DNAs. The remarkably high level of conservation found in coding, intronic and intergenic regions may indicate high evolutionary selection on the RM5 region. Although our analysis has not defined which gene(s) determine the yld1.1 phenotype, allelic variation and the insertion of transposable elements, among other nucleotide changes, represent potential variation responsible for the yield QTL. However, as suggested previously, two putative receptor-like protein kinase genes remain the key suspects for yld1.1.

  16. A microsatellite-based genetic linkage map and putative sex-determining genomic regions in Lake Victoria cichlids.

    Science.gov (United States)

    Kudo, Yu; Nikaido, Masato; Kondo, Azusa; Suzuki, Hikoyu; Yoshida, Kohta; Kikuchi, Kiyoshi; Okada, Norihiro

    2015-04-15

    Cichlid fishes in East Africa have undergone extensive adaptive radiation, which has led to spectacular diversity in their morphology and ecology. To date, genetic linkage maps have been constructed for several tilapias (riverine), Astatotilapia burtoni (Lake Tanganyika), and hybrid lines of Lake Malawi cichlids to facilitate genome-wide comparative analyses. In the present study, we constructed a genetic linkage map of the hybrid line of Lake Victoria cichlids, so that maps of cichlids from all the major areas of East Africa will be available. The genetic linkage map shown here is derived from the F2 progeny of an interspecific cross between Haplochromis chilotes and Haplochromis sauvagei and is based on 184 microsatellite and two single-nucleotide polymorphism (SNP) markers. Most of the microsatellite markers used in the present study were originally designed for other genetic linkage maps, allowing us to directly compare each linkage group (LG) among different cichlid groups. We found 25 LGs, the total length of which was 1133.2 cM with an average marker spacing of about 6.09 cM. Our subsequent linkage mapping analysis identified two putative sex-determining loci in cichlids. Interestingly, one of these two loci is located on cichlid LG5, on which the female heterogametic ZW locus and several quantitative trait loci (QTLs) related to adaptive evolution have been reported in Lake Malawi cichlids. We also found that V1R1 and V1R2, candidate genes for the fish pheromone receptor, are located very close to the recently detected sex-determining locus on cichlid LG5. The genetic linkage map study presented here may provide a valuable foundation for studying the chromosomal evolution of East African cichlids and the possible role of sex chromosomes in generating their genomic diversity. Copyright © 2015 Elsevier B.V. All rights reserved.
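
    The average marker spacing quoted above appears to correspond to the total map length divided by the number of markers (184 microsatellites plus two SNPs). The short calculation below checks that arithmetic; the function name is hypothetical, and the reading of how the authors computed the figure is an inference from the numbers in the abstract.

```python
def average_marker_spacing(total_length_cm, n_markers):
    """Average spacing in cM, taken as map length divided by marker count."""
    return total_length_cm / n_markers


n_markers = 184 + 2  # microsatellite markers + SNP markers
spacing = average_marker_spacing(1133.2, n_markers)
print(f"{spacing:.2f} cM")  # prints "6.09 cM"
```

    Another common convention divides by the number of intervals (markers minus linkage groups), which would give a larger value for the same map.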

  17. Multiple sex-associated regions and a putative sex chromosome in zebrafish revealed by RAD mapping and population genomics.

    Directory of Open Access Journals (Sweden)

    Jennifer L Anderson

    Full Text Available Within vertebrates, major sex determining genes can differ among taxa and even within species. In zebrafish (Danio rerio), neither heteromorphic sex chromosomes nor single sex determination genes of large effect, like Sry in mammals, have yet been identified. Furthermore, environmental factors can influence zebrafish sex determination. Although progress has been made in understanding zebrafish gonad differentiation (e.g. the influence of germ cells on gonad fate), the primary genetic basis of zebrafish sex determination remains poorly understood. To identify genetic loci associated with sex, we analyzed F2 offspring of reciprocal crosses between Oregon *AB and Nadia (NA) wild-type zebrafish stocks. Genome-wide linkage analysis, using more than 5,000 sequence-based polymorphic restriction site associated (RAD-tag) markers and population genomic analysis of more than 30,000 single nucleotide polymorphisms in our *ABxNA crosses revealed a sex-associated locus on the end of the long arm of chr-4 for both cross families, and an additional locus in the middle of chr-3 in one cross family. Additional sequencing showed that two SNPs in dmrt1, previously suggested to be functional candidates for sex determination in a cross of ABxIndia wild-type zebrafish, are not associated with sex in our AB fish. Our data show that sex determination in zebrafish is polygenic and that different genes may influence sex determination in different strains or that different genes become more important under different environmental conditions. The association of the end of chr-4 with sex is remarkable because, unique in the karyotype, this chromosome arm shares features with known sex chromosomes: it is highly heterochromatic, repetitive, late replicating, and has reduced recombination. Our results reveal that chr-4 has functional and structural properties expected of a sex chromosome.

  18. BAC array CGH in patients with Velocardiofacial syndrome-like features reveals genomic aberrations on chromosome region 1q21.1

    Directory of Open Access Journals (Sweden)

    Estivill Xavier

    2009-12-01

    Full Text Available Abstract Background: Microdeletion of the chromosome 22q11.2 region is the most common genetic aberration among patients with velocardiofacial syndrome (VCFS), but a subset of subjects do not show alterations of this chromosome region. Methods: We analyzed 18 patients with VCFS-like features by array comparative genomic hybridisation (aCGH) and performed a face-to-face slide hybridization with two different arrays: a whole genome and a chromosome 22-specific BAC array. Putative rearrangements were confirmed by FISH and MLPA assays. Results: One patient carried a combination of rearrangements on 1q21.1, consisting of a microduplication of 212 kb and a close microdeletion of 1.15 Mb, previously reported in patients with variable phenotypes, including mental retardation, congenital heart defects (CHD) and schizophrenia. While 326 control samples were negative for both 1q21.1 rearrangements, one of 73 patients carried the same 212-kb microduplication, reciprocal to TAR microdeletion syndrome. Also, we detected four copy number variants (CNVs) inherited from one parent (a 744-kb duplication on 10q11.22; a 160-kb duplication and deletion on 22q11.21 in two cases; and a gain of 140 kb on 22q13.2), not present in control subjects, raising the potential role of these CNVs in the VCFS-like phenotype. Conclusions: Our results confirmed aCGH as a successful strategy to characterize additional submicroscopic aberrations in patients with VCFS-like features who fail to show alterations in the 22q11.2 region. We report a 212-kb microduplication on 1q21.1, detected in two patients, which may contribute to CHD.

  19. ProteinSplit: splitting of multi-domain proteins using prediction of ordered and disordered regions in protein sequences for virtual structural genomics

    International Nuclear Information System (INIS)

    Wyrwicz, Lucjan S; Koczyk, Grzegorz; Rychlewski, Leszek; Plewczynski, Dariusz

    2007-01-01

    The annotation of protein folds within newly sequenced genomes is the main target for semi-automated protein structure prediction (virtual structural genomics). A large number of automated methods have been developed recently, with very good results in the case of single-domain proteins. Unfortunately, most of these automated methods often fail to properly predict the distant homology between a given multi-domain protein query and structural templates. Therefore, a multi-domain protein should be split into domains in order to overcome this limitation. ProteinSplit is designed to identify protein domain boundaries using a novel algorithm that predicts disordered regions in protein sequences. The software utilizes various sequence characteristics to assess the local propensity of a protein to be disordered or ordered in terms of local structure stability. These disordered parts of a protein are likely to create interdomain spacers. Because of its speed and portability, the method was successfully applied to several genome-wide fold annotation experiments. The user can run an automated analysis of sets of proteins or perform semi-automated multi-user projects (saving the results on the server). Additionally, the sequences of predicted domains can be sent to the Bioinfo.PL Protein Structure Prediction Meta-Server for further protein three-dimensional structure and function prediction. The program is freely accessible as a web service at http://lucjan.bioinfo.pl/proteinsplit together with detailed benchmark results on the critical assessment of a fully automated structure prediction (CAFASP) set of sequences. The source code of the local version of protein domain boundary prediction is available upon request from the authors.

  20. Differential regulation of hepatitis B virus core protein expression and genome replication by a small upstream open reading frame and naturally occurring mutations in the precore region.

    Science.gov (United States)

    Zong, Li; Qin, Yanli; Jia, Haodi; Ye, Lei; Wang, Yongxiang; Zhang, Jiming; Wands, Jack R; Tong, Shuping; Li, Jisu

    2017-05-01

    Hepatitis B virus (HBV) transcribes two subsets of 3.5-kb RNAs: precore RNA for hepatitis B e antigen (HBeAg) expression, and pregenomic RNA for core and P protein translation as well as genome replication. HBeAg expression could be prevented by mutations in the precore region, while an upstream open reading frame (uORF) has been proposed as a negative regulator of core protein translation. We employed replication competent HBV DNA constructs and transient transfection experiments in Huh7 cells to verify the uORF effect and to explore the alternative function of precore RNA. Optimized Kozak sequence for the uORF or extra ATG codons as present in some HBV genotypes reduced core protein expression. G1896A nonsense mutation promoted more efficient core protein expression than mutated precore ATG, while a +1 frameshift mutation was ineffective. In conclusion, various HBeAg-negative precore mutations and mutations affecting uORF differentially regulate core protein expression and genome replication. Copyright © 2017 Elsevier Inc. All rights reserved.

  1. A unique genomic sequence in the Wolf-Hirschhorn syndrome [WHS] region of humans is conserved in the great apes.

    Science.gov (United States)

    Tarzami, S T; Kringstein, A M; Conte, R A; Verma, R S

    1996-10-01

    The Wolf-Hirschhorn syndrome (WHS) is caused by a partial deletion in the short arm of chromosome 4, band 16.3 (4p16.3). A unique-sequence human DNA probe (39 kb) localized within this region has been used to search for sequence homology in the apes' equivalent chromosome 3 by the FISH technique. The WHS loci are conserved in higher primates at the expected position. Nevertheless, a control probe, which detects alphoid sequences of the pericentromeric region of humans, is diverged in chimpanzee, gorilla, and orangutan. The conservation of WHS loci and divergence of DNA alphoid sequences have further added to the controversy concerning human descent.

  2. Identification of genomic regions regulating Pax6 expression in embryonic forebrain using YAC reporter transgenic mouse lines.

    Directory of Open Access Journals (Sweden)

    Da Mi

    Full Text Available The transcription factor Pax6 is a crucial regulator of eye and central nervous system development. Both the spatiotemporal patterns and the precise levels of Pax6 expression are subject to tight control, mediated by an extensive set of cis-regulatory elements. Previous studies have shown that a YAC reporter transgene containing 420 kb of genomic DNA spanning the human PAX6 locus drives expression of a tau-tagged GFP reporter in mice in a pattern that closely resembles that of endogenous Pax6. Here we have closely compared the pattern of tau-GFP reporter expression at the cellular level in the forebrains and eyes of transgenic mice carrying either complete or truncated versions of the YAC reporter transgene with endogenous Pax6 expression and found several areas where expression of tau-GFP and Pax6 diverge. Some discrepancies are due to differences between the intracellular localization or perdurance of tau-GFP and Pax6 proteins, while others are likely to be a consequence of transcriptional differences. We show that cis-regulatory elements that lie outside the 420 kb fragment of PAX6 are required for correct expression around the pallial-subpallial boundary, in the amygdala and the prethalamus. Further, we found that the YAC reporter transgene effectively labels cells that contribute to the lateral cortical stream, including cells that arise from the pallium and subpallium, and therefore represents a useful tool for studying lateral cortical stream migration.

  3. [Regions of human genome containing analogs of oncogenes and retrovirus genes. I. A family of c-mos genes and unusual structure of ORA-gp5 locus].

    Science.gov (United States)

    Zabarovskiĭ, E R; Chumakov, I M; Prasolov, V S; Kiselev, L L

    1984-01-01

    The structural organization of a number of recombinant phages previously selected from the human gene library has been studied. On the basis of comparison of physical maps and hybridization to cloned probes, it was deduced that different human loci with homology to v-mos are represented in lambda recombinants. The physical map of the cloned region of the human genome designated ORA-gp5 was constructed. The sequences of three different genetic elements (a v-mos-related oncogene, a mammalian type C retrovirus and an Alu-type repeat) are interspersed in this structure. A hypothesis concerning the probable origin of this locus has been proposed. The mosaic structure of ORA-gp5 could be the result of the integration of a mammalian retrovirus in the vicinity of the c-mos gene with subsequent recombination and transposition. The resulting potentially oncogenic structure was later inactivated by the integration of Alu-type repeats.

  4. Silencing of neurotropic flavivirus replication in the central nervous system by combining multiple microRNA target insertions in two distinct viral genome regions

    Science.gov (United States)

    Teterina, Natalya L.; Liu, Guangping; Maximova, Olga A.; Pletnev, Alexander G.

    2014-01-01

    In recent years, microRNA-targeting has become an effective strategy for selective control of tissue-tropism and pathogenesis of both DNA and RNA viruses. Here, using a neurotropic flavivirus as a model, we demonstrate that simultaneous miRNA targeting of the viral genome in the open reading frame and 3′-noncoding regions for brain-expressed miRNAs had an additive effect and produced a more potent attenuation of the virus compared to separate targeting of those regions. Multiple miRNA co-targeting of these two distantly located regions completely abolished the virus neurotropism as no viral replication was detected in the developing brain of neonatal mice. Furthermore, no viral antigens were detected in neurons, and neuronal integrity in the brain of mice was well preserved. This miRNA co-targeting approach can be adapted for other viruses in order to minimize their replication in a cell- or tissue-type specific manner, but most importantly, to prevent virus escape from miRNA-mediated silencing. PMID:24889244

  5. Genome-wide association identifies a deletion in the 3' untranslated region of striatin in a canine model of arrhythmogenic right ventricular cardiomyopathy.

    Science.gov (United States)

    Meurs, Kathryn M; Mauceli, Evan; Lahmers, Sunshine; Acland, Gregory M; White, Stephen N; Lindblad-Toh, Kerstin

    2010-09-01

    Arrhythmogenic right ventricular cardiomyopathy (ARVC) is a familial cardiac disease characterized by ventricular arrhythmias and sudden cardiac death. It is most frequently inherited as an autosomal dominant trait with incomplete and age-related penetrance and variable clinical expression. The human disease is most commonly associated with a causative mutation in one of several genes encoding desmosomal proteins. We have previously described a spontaneous canine model of ARVC in the boxer dog. We phenotyped adult boxer dogs for ARVC by performing physical examination, echocardiogram and ambulatory electrocardiogram. Genome-wide association using the canine 50k SNP array identified several regions of association, of which the strongest resided on chromosome 17. Fine mapping and direct DNA sequencing identified an 8-bp deletion in the 3' untranslated region (UTR) of the Striatin gene on chromosome 17 in association with ARVC in the boxer dog. Evaluation of the secondary structure of the 3' UTR demonstrated that the deletion affects a stem loop structure of the mRNA and expression analysis identified a reduction in Striatin mRNA. Dogs that were homozygous for the deletion had a more severe form of disease based on a significantly higher number of ventricular premature complexes. Immunofluorescence studies localized Striatin to the intercalated disc region of the cardiac myocyte and co-localized it to three desmosomal proteins, Plakophilin-2, Plakoglobin and Desmoplakin, all involved in the pathogenesis of ARVC in human beings. We suggest that Striatin may serve as a novel candidate gene for human ARVC.

  6. Evolution of the rpoB-psbZ region in fern plastid genomes: notable structural rearrangements and highly variable intergenic spacers

    Directory of Open Access Journals (Sweden)

    Su Ying-Juan

    2011-04-01

    Full Text Available Abstract. Background: The rpoB-psbZ (BZ) region of some fern plastid genomes (plastomes) has been noted to go through considerable genomic changes. Unraveling its evolutionary dynamics across all fern lineages will help clarify the fundamental processes shaping fern plastome structure and organization. Results: A total of 24 fern BZ sequences were investigated, with taxon sampling covering all the extant fern orders. We found that: (i) a tree fern, Plagiogyria japonica, contained a novel gene order that can be generated from either the ancestral Angiopteris type or the derived Adiantum type via a single inversion; (ii) the trnY-trnE intergenic spacer (IGS) of the filmy fern Vandenboschia radicans was expanded 3-fold due to tandem 27-bp repeats which showed strong sequence similarity with the anticodon domain of trnY; (iii) the trnY-trnE IGSs of two horsetail ferns, Equisetum ramosissimum and E. arvense, underwent an unprecedented 5-kb long expansion, more than a quarter of which consisted of a single type of direct repeat also related to the trnY anticodon domain; and (iv) ycf66 has been independently lost at least four times in ferns. Conclusions: Our results provide fresh insights into the evolutionary process of fern BZ regions. The intermediate BZ gene order was not detected, supporting the view that the Adiantum type was generated by two inversions occurring in pairs. The occurrence of Vandenboschia 27-bp repeats represents the first evidence of partial tRNA gene duplication in fern plastomes. Repeats potentially forming a stem-loop structure play major roles in the expansion of the trnY-trnE IGS.

  7. Quantitative trait loci (QTL) study identifies novel genomic regions associated to Chiari-like malformation in Griffon Bruxellois dogs.

    Directory of Open Access Journals (Sweden)

    Philippe Lemay

    Full Text Available Chiari-like malformation (CM) is a developmental abnormality of the craniocervical junction that is common in the Griffon Bruxellois (GB) breed, with an estimated prevalence of 65%. This disease is characterized by overcrowding of the neural parenchyma at the craniocervical junction and disturbance of cerebrospinal fluid (CSF) flow. The most common clinical sign is pain, either as a direct consequence of CM or neuropathic pain as a consequence of secondary syringomyelia. The etiology of CM remains unknown, but genetic factors play an important role. To investigate the genetic complexity of the disease, a quantitative trait locus (QTL) approach was adopted. A total of 14 quantitative skull and atlas measurements were taken and were tested for association with CM. Six traits were found to be associated with CM and were subjected to a whole-genome association study using the Illumina canine high density bead chip in 74 GB dogs (50 affected and 24 controls). Linear and mixed regression analyses identified associated single nucleotide polymorphisms (SNPs) on 5 Canis Familiaris Autosomes (CFAs): CFA2, CFA9, CFA12, CFA14 and CFA24. A reconstructed haplotype of 0.53 Mb on CFA2 strongly associated with the height of the cranial fossa (diameter F) and a haplotype of 2.5 Mb on CFA14 associated with both the height of the rostral part of the caudal cranial fossa (AE) and the height of the brain (FG) were significantly associated with CM after 10 000 permutations, strengthening their candidacy for this disease (P = 0.0421, P = 0.0094 respectively). The CFA2 QTL harbours the Sall-1 gene, which is an excellent candidate since its orthologue in humans is mutated in Townes-Brocks syndrome, which has previously been associated with Chiari malformation I. Our study demonstrates the implication of multiple traits in the etiology of CM and has successfully identified two new QTL associated with CM and a potential candidate gene.
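
    The permutation step mentioned in this abstract can be illustrated with a generic label-permutation test. This is only a minimal sketch: the abstract does not state which test statistic the authors permuted, so the absolute difference in trait means between affected and control animals is assumed here, and the function name `permutation_p_value` and its parameters are hypothetical.

```python
import random


def permutation_p_value(trait, is_case, n_perm=10_000, seed=0):
    """Empirical p-value for a case/control difference in a quantitative trait.

    Assumption: the statistic is the absolute difference in group means.
    Both groups must be non-empty.
    """
    rng = random.Random(seed)

    def mean_diff(labels):
        cases = [t for t, c in zip(trait, labels) if c]
        ctrls = [t for t, c in zip(trait, labels) if not c]
        return abs(sum(cases) / len(cases) - sum(ctrls) / len(ctrls))

    observed = mean_diff(is_case)
    labels = list(is_case)
    hits = 0
    for _ in range(n_perm):
        # Shuffling the labels breaks any real trait/status association.
        rng.shuffle(labels)
        if mean_diff(labels) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)
```

    The add-one correction keeps the empirical p-value strictly positive, so with 10 000 permutations the smallest reportable value is about 1e-4.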

  8. Genomic Testing

    Science.gov (United States)

    ... Genomic Testing ... Fact Sheet: Identifying Opportunities to ...

  9. Identification of haplotypes at the Rsv4 genomic region in soybean associated with durable resistance to soybean mosaic virus.

    Science.gov (United States)

    Ilut, Daniel C; Lipka, Alexander E; Jeong, Namhee; Bae, Dong Nyuk; Kim, Dong Hyun; Kim, Ji Hong; Redekar, Neelam; Yang, Kiwoung; Park, Won; Kang, Sung-Taeg; Kim, Namshin; Moon, Jung-Kyung; Saghai Maroof, M A; Gore, Michael A; Jeong, Soon-Chun

    2016-03-01

    Discovery of new germplasm sources and identification of haplotypes for the durable Soybean mosaic virus resistance gene, Rsv4, provide novel resources for map-based cloning and genetic improvement efforts in soybean. The Soybean mosaic virus (SMV) resistance locus Rsv4 is of interest because it provides a durable type of resistance in soybean [Glycine max (L.) Merr.]. To better understand its molecular basis, we used a population of 309 BC3F2 individuals to fine-map Rsv4 to a ~120 kb interval and leveraged this genetic information in a second study to identify accessions 'Haman' and 'Ilpumgeomjeong' as new sources of Rsv4. These two accessions, along with three other Rsv4 and 14 rsv4 accessions, were used to examine the patterns of nucleotide diversity at the Rsv4 region based on high-depth resequencing data. Through a targeted association analysis of these 19 accessions within the ~120 kb interval, a cluster of four intergenic single-nucleotide polymorphisms (SNPs) was found to perfectly associate with SMV resistance. Interestingly, this ~120 kb interval did not contain any genes similar to previously characterized dominant disease resistance genes. Therefore, a haplotype analysis was used to further resolve the association signal to a ~94 kb region, which also resulted in the identification of at least two Rsv4 haplotypes. A haplotype phylogenetic analysis of this region suggests that the Rsv4 locus in G. max was recently introgressed from G. soja. This integrated study provides a strong foundation for efforts focused on the cloning of this durable virus resistance gene and marker-assisted selection of Rsv4-mediated SMV resistance in soybean breeding programs.

  10. Analysis of the human cytomegalovirus genomic region from UL146 through UL147A reveals sequence hypervariability, genotypic stability, and overlapping transcripts

    Directory of Open Access Journals (Sweden)

    Huang Diana D

    2006-01-01

    Full Text Available Abstract. Background: Although the sequence of the human cytomegalovirus (HCMV) genome is generally conserved among unrelated clinical strains, some open reading frames (ORFs) are highly variable. UL146 and UL147, which encode CXC chemokine homologues, are among these variable ORFs. Results: The region of the HCMV genome from UL146 through UL147A was analyzed in clinical strains for sequence variability, genotypic stability, and transcriptional expression. The UL146 sequences in clinical strains from two geographically distant sites were assigned to 12 sequence groups that differ by over 60% at the amino acid level. The same groups were generated by sequences from the UL146-UL147 intergenic region and the UL147 ORF. In contrast to the high level of sequence variability among unrelated clinical strains, the sequences of UL146 through UL147A from isolates of the same strain were highly stable after repeated passage both in vitro and in vivo. Riboprobes homologous to these ORFs detected multiple overlapping transcripts differing in temporal expression. UL146 sequences are present only on the largest transcript, which also contains all of the downstream ORFs including UL148 and UL132. The sizes and hybridization patterns of the transcripts are consistent with a common 3'-terminus downstream of the UL132 ORF. Early-late expression of the transcripts associated with UL146 and UL147 is compatible with the potential role of CXC chemokines in pathogenesis associated with viral replication. Conclusion: Clinical isolates from two different geographic sites cluster in the same groups based on the hypervariability of the UL146, UL147, or the intergenic sequences, which provides strong evidence for linkage and no evidence for interstrain recombination within this region. The sequence of individual strains was absolutely stable in vitro and in vivo, which indicates that sequence drift is not a mechanism for the observed sequence hypervariability. There is also no

  11. A 5'-proximal Stem-loop Structure of 5' Untranslated Region of Porcine Reproductive and Respiratory Syndrome Virus Genome Is Key for Virus Replication

    Directory of Open Access Journals (Sweden)

    Li Yanhua

    2011-04-01

    Full Text Available Abstract. Background: It has been well documented that the 5' untranslated regions (5' UTRs) of many positive-stranded RNA viruses contain key cis-acting regulatory sequences, as well as high-order structural elements. Little is known about such regulatory elements controlling porcine arterivirus replication. We investigated the roles of a conserved stem-loop 2 (SL2) that resides in the 5' UTR of the genome of a type II porcine reproductive and respiratory syndrome virus (PRRSV). Results: We provided genetic evidence demonstrating that (1) the SL2 in the type II PRRSV 5' UTR, N-SL2, could be structurally and functionally substituted by its counterpart in type I PRRSV, E-SL2; (2) the functionality of N-SL2 was dependent upon the G-C rich stem structure, while the ternary-loop size was irrelevant to RNA synthesis; (3) serial deletions showed that the stem integrity of N-SL2 was crucial for subgenomic mRNA synthesis; and (4) when extensive base pairs in the stem region were deleted, an alternative N-SL2-like structure with a different sequence was utilized for virus replication. Conclusion: Taken together, we concluded that the phylogenetically conserved SL2 in the 5' UTR was crucial for PRRSV replication, subgenomic mRNA synthesis in particular.

  13. Investigating the Prehistory of Tungusic Peoples of Siberia and the Amur-Ussuri Region with Complete mtDNA Genome Sequences and Y-chromosomal Markers

    Science.gov (United States)

    Duggan, Ana T.; Whitten, Mark; Wiebe, Victor; Crawford, Michael; Butthof, Anne; Spitsyn, Victor; Makarov, Sergey; Novgorodov, Innokentiy; Osakovsky, Vladimir; Pakendorf, Brigitte

    2013-01-01

    Evenks and Evens, Tungusic-speaking reindeer herders and hunter-gatherers, are spread over a wide area of northern Asia, whereas their linguistic relatives the Udegey, sedentary fishermen and hunter-gatherers, are settled to the south of the lower Amur River. The prehistory and relationships of these Tungusic peoples are as yet poorly investigated, especially with respect to their interactions with neighbouring populations. In this study, we analyse over 500 complete mtDNA genome sequences from nine different Evenk and Even subgroups as well as their geographic neighbours from Siberia and their linguistic relatives the Udegey from the Amur-Ussuri region in order to investigate the prehistory of the Tungusic populations. These data are supplemented with analyses of Y-chromosomal haplogroups and STR haplotypes in the Evenks, Evens, and neighbouring Siberian populations. We demonstrate that whereas the North Tungusic Evenks and Evens show evidence of shared ancestry both in the maternal and in the paternal line, this signal has been attenuated by genetic drift and differential gene flow with neighbouring populations, with isolation by distance further shaping the maternal gene pool of the Evens. The Udegey, in contrast, appear quite divergent from their linguistic relatives in the maternal line, with an mtDNA haplogroup composition characteristic of populations of the Amur-Ussuri region. Nevertheless, they show affinities with the Evenks, indicating that they might be the result of admixture between local Amur-Ussuri populations and Tungusic populations from the north. PMID:24349531

  15. Proteins Encoded in Genomic Regions Associated with Immune-Mediated Disease Physically Interact and Suggest Underlying Biology

    DEFF Research Database (Denmark)

    Rossin, Elizabeth J.; Hansen, Kasper Lage; Raychaudhuri, Soumya

    2011-01-01

    in rheumatoid arthritis (RA) and Crohn's disease (CD) GWAS, we build protein-protein interaction (PPI) networks for genes within associated loci and find abundant physical interactions between protein products of associated genes. We apply multiple permutation approaches to show that these networks are more...... that the RA and CD networks have predictive power by demonstrating that proteins in these networks, not encoded in the confirmed list of disease associated loci, are significantly enriched for association to the phenotypes in question in extended GWAS analysis. Finally, we test our method in 3 non...... evidence that, for many of the complex diseases studied here, common genetic associations implicate regions encoding proteins that physically interact in a preferential manner, in line with observations in Mendelian disease....

  16. BAC-end microsatellites from intra and inter-genic regions of the common bean genome and their correlation with cytogenetic features.

    Directory of Open Access Journals (Sweden)

    Matthew Wohlgemuth Blair

    Full Text Available Highly polymorphic markers such as simple sequence repeats (SSRs) or microsatellites are very useful for genetic mapping. In this study novel SSRs were identified in BAC-end sequences (BES) from non-contigged, non-overlapping bacterial artificial clones (BACs) in common bean (Phaseolus vulgaris L.). These so-called "singleton" BACs were from the G19833 Andean gene pool physical map, and the new BES-SSR markers were used for the saturation of the inter-gene pool DOR364×G19833 genetic map. A total of 899 SSR loci were found among the singleton BES, but only 346 loci corresponded to the single di- or tri-nucleotide motifs that were likely to be polymorphic (ATT or AG motifs, principally) and useful for primer design and individual marker mapping. When these novel SSR markers were evaluated in the DOR364×G19833 population parents, 136 markers revealed polymorphism and 106 were mapped. Genetic mapping resulted in a map length of 2291 cM with an average distance between markers of 5.2 cM. The new genetic map was compared to the most recent cytogenetic analysis of common bean chromosomes. We found that the new singleton BES-SSRs were helpful in filling peri-centromeric spaces on the cytogenetic map. Short genetic distances between some new singleton-derived BES-SSR markers were common, showing suppressed recombination in these regions compared to other parts of the genome. The correlation of singleton-derived SSR marker distribution with other cytogenetic features of the bean genome is discussed.
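
    Screening sequences for di- and tri-nucleotide SSR motifs, as described in this abstract, can be sketched with a regular expression. This is an illustrative sketch only: the abstract does not name the SSR-detection software or the thresholds used, so the function name `find_ssrs` and the default minimum of five repeat units are assumptions.

```python
import re


def find_ssrs(seq, min_repeats=5):
    """Return sorted (start, motif, repeat_count) tuples for di-/tri-nucleotide SSRs.

    Assumption: a locus counts as an SSR when a 2- or 3-base motif is
    repeated in tandem at least `min_repeats` times.
    """
    seq = seq.upper()
    hits = []
    for k in (2, 3):
        # Capture one motif of length k, then require further tandem copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if len(set(motif)) == 1:
                continue  # skip homopolymer runs such as AAAAAA
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)
```

    Positions are 0-based offsets into the input sequence; a real pipeline would also keep flanking sequence for primer design, which this sketch omits.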

  17. Taxonomic and epidemiological aspects of the bovine viral diarrhoea virus 2 species through the observation of the secondary structures in the 5' genomic untranslated region

    Directory of Open Access Journals (Sweden)

    Massimo Giangaspero

    2008-06-01

    Full Text Available Bovine viral diarrhoea virus 2 (BVDV-2) strains demonstrated in cattle, sheep and adventitious contaminants of biological products were evaluated by the palindromic nucleotide substitutions (PNS) method at the three variable loci (V1, V2 and V3) in the 5' untranslated region (UTR), to determine their taxonomic status. Variation in conserved genomic sequences was used as a parameter for the epidemiological evaluation of the species in relation to geographic distribution, animal host and virulence. Four genotypes were identified within the species. Taxonomic segregation corresponded to geographic distribution of genotype variants. Genotype 2a was distributed worldwide and was also the only genotype that was circulating in sheep and cattle. Genotypes 2b, 2c and 2d were restricted to South America. Genotypes 2a and 2d were related to the contamination of biological products. Genetic variation could be related to the spread of BVDV-2 species variants in different geographic areas. Chronologically, the species emerged in North America in 1978 and spread to the United Kingdom and Japan, continental Europe, South America and New Zealand. Correlation between clinical features related with isolation of BVDV-2 strains and genetic variation indicated that subgenotype 1, variant 4 of genotype 2a, was related to a haemorrhagic syndrome. These observations suggest that the evaluation of genomic secondary structures, by identifying markers for expression of virus biological activities and species evolutionary history, may be a useful tool for the epidemiological evaluation of BVDV-2 species and possibly of other species of the genus Pestivirus.

  18. Genome-wide DNA methylation analyses in the brain reveal four differentially methylated regions between humans and non-human primates

    Directory of Open Access Journals (Sweden)

    Wang Jinkai

    2012-08-01

    Full Text Available Abstract. Background: The highly improved cognitive function is the most significant change in human evolutionary history. Recently, several large-scale studies reported the evolutionary roles of DNA methylation; however, the role of DNA methylation in brain evolution is largely unknown. Results: To test whether DNA methylation has contributed to the evolution of the human brain, we used MeDIP-Chip and SEQUENOM MassARRAY to conduct a genome-wide analysis identifying differentially methylated regions (DMRs) in the brain between humans and rhesus macaques. We first identified a total of 150 candidate DMRs by the MeDIP-Chip method, among which 4 DMRs were confirmed by the MassARRAY analysis. All 4 DMRs are within or close to CpG islands, and a MIR3 repeat element was identified in one DMR, but no repeat sequence was observed in the other 3 DMRs. For the 4 DMR genes, their proteins tend to be conserved and two genes have neural-related functions. Bisulfite sequencing and phylogenetic comparison among human, chimpanzee, rhesus macaque and rat suggested several regions of lineage-specific DNA methylation, including a human-specific hypomethylated region in the promoter of the K6IRS2 gene. Conclusions: Our study provides a new angle for studying human brain evolution and understanding the evolutionary role of DNA methylation in the central nervous system. The results suggest that the patterns of DNA methylation in the brain are in general similar between humans and non-human primates, and only a few DMRs were identified.

  19. Human genome I

    International Nuclear Information System (INIS)

    Anon.

    1989-01-01

    An international conference, Human Genome I, was held Oct. 2-4, 1989 in San Diego, Calif. Selected speakers discussed: Current Status of the Genome Project; Technique Innovations; Interesting Regions; Applications; and Organization - Different Views of Current and Future Science and Procedures. Posters, consisting of 119 presentations, were displayed during the sessions. All 119 were indexed for inclusion in the Energy Data Base.

  20. Genome-wide analysis of H3.3 dissociation reveals high nucleosome turnover at distal regulatory regions of embryonic stem cells.

    Science.gov (United States)

    Ha, Misook; Kraushaar, Daniel C; Zhao, Keji

    2014-01-01

    The histone variant H3.3 plays a critical role in maintaining the pluripotency of embryonic stem cells (ESCs) by regulating gene expression programs important for lineage specification. H3.3 is deposited by various chaperones at regulatory sites, gene bodies, and certain heterochromatic sites such as telomeres and centromeres. Using Tet-inhibited expression of epitope-tagged H3.3 combined with ChIP-Seq we undertook genome-wide measurements of H3.3 dissociation rates across the ESC genome and examined the relationship between H3.3-nucleosome turnover and ESC-specific transcription factors, chromatin modifiers, and epigenetic marks. Our comprehensive analysis of H3.3 dissociation rates revealed distinct H3.3 dissociation dynamics at various functional chromatin domains. At transcription start sites, H3.3 dissociates rapidly with the highest rate at nucleosome-depleted regions (NDRs) just upstream of Pol II binding, followed by low H3.3 dissociation rates across gene bodies. H3.3 turnover at transcription start sites, gene bodies, and transcription end sites was positively correlated with transcriptional activity. H3.3 is found decorated with various histone modifications that regulate transcription and maintain chromatin integrity. We find greatly varying H3.3 dissociation rates across various histone modification domains: high dissociation rates at active histone marks and low dissociation rates at heterochromatic marks. Well-defined zones of high H3.3-nucleosome turnover were detected at binding sites of ESC-specific pluripotency factors and chromatin remodelers, suggesting an important role for H3.3 in facilitating protein binding. Among transcription factor binding sites we detected higher H3.3 turnover at distal cis-acting sites compared to proximal genic transcription factor binding sites. Our results imply that fast H3.3 dissociation is a hallmark of interactions between DNA and transcriptional regulators. Our study demonstrates that H3.3 turnover and

  1. Identification of genome-wide non-canonical spliced regions and analysis of biological functions for spliced sequences using Read-Split-Fly.

    Science.gov (United States)

    Bai, Yongsheng; Kinne, Jeff; Ding, Lizhong; Rath, Ethan C; Cox, Aaron; Naidu, Siva Dharman

    2017-10-03

    It is generally thought that most canonical or non-canonical splicing events involving U2- and U12-type spliceosomes occur within nuclear pre-mRNAs. However, whether at least some U12-type splicing occurs in the cytoplasm is still unclear. In recent years next-generation sequencing technologies have revolutionized the field. The "Read-Split-Walk" (RSW) and "Read-Split-Run" (RSR) methods were developed to identify genome-wide non-canonical spliced regions, including special events occurring in the cytoplasm. As significant amounts of genome/transcriptome data, such as those from the Encyclopedia of DNA Elements (ENCODE) project, have been generated, we have developed a newer, more memory-efficient version of the algorithm, "Read-Split-Fly" (RSF), which can detect non-canonical spliced regions with higher sensitivity and improved speed. The RSF algorithm also outputs the spliced sequences for further downstream biological function analysis. We used open-access ENCODE project RNA-Seq data to search spliced intron sequences against the U12-type spliced intron sequence database to examine whether some events could occur as potential signatures of U12-type splicing. The check was performed by searching spliced sequences against 5'ss and 3'ss sequences from the well-known orthologous U12-type spliceosomal intron database U12DB. Preliminary results of searching 70 ENCODE samples indicated that the presence of a 5'ss with a U12-type signature is more frequent than the U2-type and is prevalent in non-canonical junctions reported by RSF. The selected spliced sequences have also been further studied using miRBase to elucidate their functionality. Preliminary results from 70 samples of ENCODE datasets show that several miRNAs are prevalent in the studied ENCODE samples. Two of these, hsa-miR-1273 and hsa-miR-548, have been associated in the literature with many diseases and cancers. Our RSF pipeline is able to detect many possible junctions

  2. Selection Signatures in the First Exon of Paralogous Receptor Kinase Genes from the Sym2 Region of the Pisum sativum L. Genome

    Directory of Open Access Journals (Sweden)

    Anton S. Sulima

    2017-11-01

    Full Text Available During the initial step of the symbiosis between legumes (Fabaceae) and nitrogen-fixing bacteria (rhizobia), the bacterial signal molecule known as the Nod factor (nodulation factor) is recognized by plant LysM motif-containing receptor-like kinases (LysM-RLKs). The fifth chromosome of barrel medic (Medicago truncatula Gaertn.) contains a cluster of paralogous LysM-RLK genes, one of which is known to participate in symbiosis. In the syntenic region of the pea (Pisum sativum L.) genome, three genes have been identified: PsK1 and PsSym37, two symbiosis-related LysM-RLK genes with known sequences, and the unsequenced PsSym2 gene which presumably encodes a LysM-RLK and is associated with increased selectivity to certain Nod factors. In this work, we identified a new gene encoding a LysM-RLK, designated as PsLykX, within the Sym2 genomic region. We sequenced the first exons (corresponding to the protein receptor domain) of PsSym37, PsK1, and PsLykX from a large set of pea genotypes of diverse origin. The nucleotide diversity of these fragments was estimated and groups of haplotypes for each gene were revealed. Footprints of selection pressure were detected via comparative analyses of SNP distribution across the first exons of these genes and their homologs MtLYK2, MtLYK3, and MtLYK4 from M. truncatula retrieved from the Medicago Hapmap project. Despite the remarkable similarity among all the studied genes, they exhibited contrasting selection signatures, possibly pointing to diversification of their functions. Signatures of balancing selection were found in LysM1-encoding parts of PsSym37 and PsK1, suggesting that the diversity of these parts may be important for pea LysM-RLKs. The first exons of PsSym37 and PsK1 displayed signatures of purifying selection, as well as MtLYK2 of M. truncatula. Evidence of positive selection affecting primarily LysM domains was found in all three investigated M. truncatula genes, as well as in the pea gene PsLykX. The data

  3. The database of chromosome imbalance regions and genes resided in lung cancer from Asian and Caucasian identified by array-comparative genomic hybridization.

    Science.gov (United States)

    Lo, Fang-Yi; Chang, Jer-Wei; Chang, I-Shou; Chen, Yann-Jang; Hsu, Han-Shui; Huang, Shiu-Feng Kathy; Tsai, Fang-Yu; Jiang, Shih Sheng; Kanteti, Rajani; Nandi, Suvobroto; Salgia, Ravi; Wang, Yi-Ching

    2012-06-12

    Cancer-related genes show racial differences. Therefore, identification and characterization of DNA copy number alteration regions in different racial groups helps to dissect the mechanism of tumorigenesis. Array-comparative genomic hybridization (array-CGH) was analyzed for DNA copy number profile in 40 Asian and 20 Caucasian lung cancer patients. Three methods including MetaCore analysis for disease and pathway correlations, concordance analysis between array-CGH database and the expression array database, and literature search for copy number variation genes were performed to select novel lung cancer candidate genes. Four candidate oncogenes were validated for DNA copy number and mRNA and protein expression by quantitative polymerase chain reaction (qPCR), chromogenic in situ hybridization (CISH), reverse transcriptase-qPCR (RT-qPCR), and immunohistochemistry (IHC) in more patients. We identified 20 chromosomal imbalance regions harboring 459 genes for Caucasian and 17 regions containing 476 genes for Asian lung cancer patients. Seven common chromosomal imbalance regions harboring 117 genes, included gain on 3p13-14, 6p22.1, 9q21.13, 13q14.1, and 17p13.3; and loss on 3p22.2-22.3 and 13q13.3 were found both in Asian and Caucasian patients. Gene validation for four genes including ARHGAP19 (10q24.1) functioning in Rho activity control, FRAT2 (10q24.1) involved in Wnt signaling, PAFAH1B1 (17p13.3) functioning in motility control, and ZNF322A (6p22.1) involved in MAPK signaling was performed using qPCR and RT-qPCR. Mean gene dosage and mRNA expression level of the four candidate genes in tumor tissues were significantly higher than the corresponding normal tissues (P<0.001~P=0.06). In addition, CISH analysis of patients indicated that copy number amplification indeed occurred for ARHGAP19 and ZNF322A genes in lung cancer patients. 
IHC analysis of paraffin blocks from Asian and Caucasian patients demonstrated that the frequency of PAFAH1B1 protein overexpression was 68

  4. The database of chromosome imbalance regions and genes resided in lung cancer from Asian and Caucasian identified by array-comparative genomic hybridization

    International Nuclear Information System (INIS)

    Lo, Fang-Yi; Nandi, Suvobroto; Salgia, Ravi; Wang, Yi-Ching; Chang, Jer-Wei; Chang, I-Shou; Chen, Yann-Jang; Hsu, Han-Shui; Huang, Shiu-Feng Kathy; Tsai, Fang-Yu; Jiang, Shih Sheng; Kanteti, Rajani

    2012-01-01

    Cancer-related genes show racial differences. Therefore, identification and characterization of DNA copy number alteration regions in different racial groups helps to dissect the mechanism of tumorigenesis. Array-comparative genomic hybridization (array-CGH) was analyzed for DNA copy number profile in 40 Asian and 20 Caucasian lung cancer patients. Three methods including MetaCore analysis for disease and pathway correlations, concordance analysis between array-CGH database and the expression array database, and literature search for copy number variation genes were performed to select novel lung cancer candidate genes. Four candidate oncogenes were validated for DNA copy number and mRNA and protein expression by quantitative polymerase chain reaction (qPCR), chromogenic in situ hybridization (CISH), reverse transcriptase-qPCR (RT-qPCR), and immunohistochemistry (IHC) in more patients. We identified 20 chromosomal imbalance regions harboring 459 genes for Caucasian and 17 regions containing 476 genes for Asian lung cancer patients. Seven common chromosomal imbalance regions harboring 117 genes, included gain on 3p13-14, 6p22.1, 9q21.13, 13q14.1, and 17p13.3; and loss on 3p22.2-22.3 and 13q13.3 were found both in Asian and Caucasian patients. Gene validation for four genes including ARHGAP19 (10q24.1) functioning in Rho activity control, FRAT2 (10q24.1) involved in Wnt signaling, PAFAH1B1 (17p13.3) functioning in motility control, and ZNF322A (6p22.1) involved in MAPK signaling was performed using qPCR and RT-qPCR. Mean gene dosage and mRNA expression level of the four candidate genes in tumor tissues were significantly higher than the corresponding normal tissues (P<0.001~P=0.06). In addition, CISH analysis of patients indicated that copy number amplification indeed occurred for ARHGAP19 and ZNF322A genes in lung cancer patients. 
IHC analysis of paraffin blocks from Asian and Caucasian patients demonstrated that the frequency of PAFAH1B1 protein overexpression was 68

  5. The database of chromosome imbalance regions and genes resided in lung cancer from Asian and Caucasian identified by array-comparative genomic hybridization

    Directory of Open Access Journals (Sweden)

    Lo Fang-Yi

    2012-06-01

    Full Text Available Abstract Background Cancer-related genes show racial differences. Therefore, identification and characterization of DNA copy number alteration regions in different racial groups helps to dissect the mechanism of tumorigenesis. Methods Array-comparative genomic hybridization (array-CGH) was analyzed for DNA copy number profile in 40 Asian and 20 Caucasian lung cancer patients. Three methods including MetaCore analysis for disease and pathway correlations, concordance analysis between array-CGH database and the expression array database, and literature search for copy number variation genes were performed to select novel lung cancer candidate genes. Four candidate oncogenes were validated for DNA copy number and mRNA and protein expression by quantitative polymerase chain reaction (qPCR), chromogenic in situ hybridization (CISH), reverse transcriptase-qPCR (RT-qPCR), and immunohistochemistry (IHC) in more patients. Results We identified 20 chromosomal imbalance regions harboring 459 genes for Caucasian and 17 regions containing 476 genes for Asian lung cancer patients. Seven common chromosomal imbalance regions harboring 117 genes, included gain on 3p13-14, 6p22.1, 9q21.13, 13q14.1, and 17p13.3; and loss on 3p22.2-22.3 and 13q13.3 were found both in Asian and Caucasian patients. Gene validation for four genes including ARHGAP19 (10q24.1) functioning in Rho activity control, FRAT2 (10q24.1) involved in Wnt signaling, PAFAH1B1 (17p13.3) functioning in motility control, and ZNF322A (6p22.1) involved in MAPK signaling was performed using qPCR and RT-qPCR. Mean gene dosage and mRNA expression level of the four candidate genes in tumor tissues were significantly higher than the corresponding normal tissues (P<0.001~P=0.06). In addition, CISH analysis of patients indicated that copy number amplification indeed occurred for ARHGAP19 and ZNF322A genes in lung cancer patients. 
IHC analysis of paraffin blocks from Asian and Caucasian patients demonstrated that the frequency of

  6. Genome Sequencing

    DEFF Research Database (Denmark)

    Sato, Shusei; Andersen, Stig Uggerhøj

    2014-01-01

    The current Lotus japonicus reference genome sequence is based on a hybrid assembly of Sanger TAC/BAC, Sanger shotgun and Illumina shotgun sequencing data generated from the Miyakojima-MG20 accession. It covers nearly all expressed L. japonicus genes and has been annotated mainly based on transcriptional evidence. Analysis of repetitive sequences suggests that they are underrepresented in the reference assembly, reflecting an enrichment of gene-rich regions in the current assembly. Characterization of Lotus natural variation by resequencing of L. japonicus accessions and diploid Lotus species is currently ongoing, facilitated by the MG20 reference sequence...

  7. Meta-genome-wide association studies identify a locus on chromosome 1 and multiple variants in the MHC region for serum C-peptide in type 1 diabetes.

    Science.gov (United States)

    Roshandel, Delnaz; Gubitosi-Klug, Rose; Bull, Shelley B; Canty, Angelo J; Pezzolesi, Marcus G; King, George L; Keenan, Hillary A; Snell-Bergeon, Janet K; Maahs, David M; Klein, Ronald; Klein, Barbara E K; Orchard, Trevor J; Costacou, Tina; Weedon, Michael N; Oram, Richard A; Paterson, Andrew D

    2018-05-01

    The aim of this study was to identify genetic variants associated with beta cell function in type 1 diabetes, as measured by serum C-peptide levels, through meta-genome-wide association studies (meta-GWAS). We performed a meta-GWAS to combine the results from five studies in type 1 diabetes with cross-sectionally measured stimulated, fasting or random C-peptide levels, including 3479 European participants. The p values across studies were combined, taking into account sample size and direction of effect. We also performed separate meta-GWAS for stimulated (n = 1303), fasting (n = 2019) and random (n = 1497) C-peptide levels. In the meta-GWAS for stimulated/fasting/random C-peptide levels, a SNP on chromosome 1, rs559047 (Chr1:238753916, T>A, minor allele frequency [MAF] 0.24-0.26), was associated with C-peptide (p = 4.13 × 10^-8), meeting the genome-wide significance threshold (p C>T, MAF 0.07-0.10, p = 8.43 × 10^-8). In the stimulated C-peptide meta-GWAS, rs61211515 (Chr6:30100975, T/-, MAF 0.17-0.19) in the MHC region was associated with stimulated C-peptide (β [SE] = -0.39 [0.07], p = 9.72 × 10^-8). rs61211515 was also associated with the rate of stimulated C-peptide decline over time in a subset of individuals (n = 258) with annual repeated measures for up to 6 years (p = 0.02). In the meta-GWAS of random C-peptide, another MHC region, SNP rs3135002 (Chr6:32668439, C>A, MAF 0.02-0.06), was associated with C-peptide (p = 3.49 × 10^-8). Conditional analyses suggested that the three identified variants in the MHC region were independent of each other. rs9260151 and rs3135002 have been associated with type 1 diabetes, whereas rs559047 and rs61211515 have not been associated with a risk of developing type 1 diabetes. We identified a locus on chromosome 1 and multiple variants in the MHC region, at least some of which were distinct from type 1 diabetes risk loci, that were associated with C
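
    The abstract above says that p values were combined across studies "taking into account sample size and direction of effect". One common way to do this is a sample-size-weighted Z-score (Stouffer-type) combination; the sketch below illustrates that general technique and is not confirmed to be the authors' exact procedure. The function name, the input layout and the example numbers are all hypothetical.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def combine_p_values(studies):
    """Combine two-sided p-values with a sample-size-weighted Z-score.

    `studies` is a list of (p_value, effect_sign, n) tuples (a hypothetical
    layout): effect_sign is +1 or -1 for the direction of effect of the
    tested allele, and n is the study's sample size.
    Returns (combined_z, combined_two_sided_p).
    """
    weighted_sum = 0.0
    weight_sq_sum = 0.0
    for p_value, effect_sign, n in studies:
        # Two-sided p-value -> signed Z-score; the lower tail is used
        # because it is numerically safer for very small p-values.
        z = -_STD_NORMAL.inv_cdf(p_value / 2.0) * effect_sign
        weight = sqrt(n)  # larger studies get proportionally more weight
        weighted_sum += weight * z
        weight_sq_sum += weight ** 2
    z_meta = weighted_sum / sqrt(weight_sq_sum)
    p_meta = 2.0 * _STD_NORMAL.cdf(-abs(z_meta))
    return z_meta, p_meta


if __name__ == "__main__":
    # Made-up inputs: three studies agreeing on the direction of effect.
    example = [(0.01, +1, 1200), (0.20, +1, 800), (0.03, +1, 1500)]
    z, p = combine_p_values(example)
    print(f"combined Z = {z:.3f}, combined p = {p:.3g}")
```

    Under this scheme, studies that agree in direction reinforce each other, while studies of equal size and equal p value but opposite direction cancel out, which is why the direction of effect has to be carried through the combination.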

  8. Genomics of Clostridium tetani.

    Science.gov (United States)

    Brüggemann, Holger; Brzuszkiewicz, Elzbieta; Chapeton-Montes, Diana; Plourde, Lucile; Speck, Denis; Popoff, Michel R

    2015-05-01

    Genomic information about Clostridium tetani, the causative agent of the tetanus disease, is scarce. The genome of strain E88, a strain used in vaccine production, was sequenced about 10 years ago. One additional genome (strain 12124569) has recently been released. Here we report three new genomes of C. tetani and describe major differences among all five C. tetani genomes. They all harbor tetanus-toxin-encoding plasmids that contain highly conserved genes for TeNT (tetanus toxin), TetR (transcriptional regulator of TeNT) and ColT (collagenase), but substantially differ in other plasmid regions. The chromosomes share a large core genome that contains about 85% of all genes of a given chromosome. The non-core chromosome comprises mainly prophage-like genomic regions and genes encoding environmental interaction and defense functions (e.g. surface proteins, restriction-modification systems, toxin-antitoxin systems, CRISPR/Cas systems) and other fitness functions (e.g. transport systems, metabolic activities). This new genome information will help to assess the level of genome plasticity of the species C. tetani and provide the basis for detailed comparative studies. Copyright © 2015 Institut Pasteur. Published by Elsevier Masson SAS. All rights reserved.

  9. QTL analysis of novel genomic regions associated with yield and yield related traits in new plant type based recombinant inbred lines of rice (Oryza sativa L.).

    Science.gov (United States)

    Marathi, Balram; Guleria, Smriti; Mohapatra, Trilochan; Parsad, Rajender; Mariappan, Nagarajan; Kurungara, Vinod Kunnummal; Atwal, Salwandir Singh; Prabhu, Kumble Vinod; Singh, Nagendra Kumar; Singh, Ashok Kumar

    2012-08-09

    Rice is a staple food for more than half of the world's population including two billion Asians, who obtain 60-70% of their energy intake from rice and its derivatives. To meet the growing demand from the human population, rice varieties with higher yield potential and greater yield stability need to be developed. The favourable alleles for yield and yield contributing traits are distributed between two subspecies, i.e., indica and japonica, of cultivated rice (Oryza sativa L.). Identification of novel favourable alleles in indica/japonica will pave the way to marker-assisted mobilization of these alleles into a genetic background to break genetic barriers to yield. A new plant type (NPT) based mapping population of 310 recombinant inbred lines (RILs) was used to map novel genomic regions and QTL hotspots influencing yield and eleven yield component traits. We identified major quantitative trait loci (QTLs) for days to 50% flowering (R² = 25%, LOD = 14.3), panicles per plant (R² = 19%, LOD = 9.74), flag leaf length (R² = 22%, LOD = 3.05), flag leaf width (R² = 53%, LOD = 46.5), spikelets per panicle (R² = 16%, LOD = 13.8), filled grains per panicle (R² = 22%, LOD = 15.3), percent spikelet sterility (R² = 18%, LOD = 14.24), thousand grain weight (R² = 25%, LOD = 12.9) and spikelet setting density (R² = 23%, LOD = 15) expressing over two or more locations by using composite interval mapping. The phenotypic variation (R²) ranged from 8 to 53% for eleven QTLs expressing across all three locations. 19 novel QTLs were contributed by the NPT parent, Pusa1266. 15 QTL hotspots on eight chromosomes were identified for the correlated traits. Six epistatic QTLs affecting five traits at two locations were identified. A marker interval (RM3276-RM5709) on chromosome 4 harboring major QTLs for four traits was identified. The present study reveals that favourable alleles for yield and yield contributing traits were

  10. Isolation and characterization of a pseudoautosomal region-specific genetic marker in C57BL/6 mice using genomic representational difference analysis.

    Science.gov (United States)

    Kalcheva, I D; Matsuda, Y; Plass, C; Chapman, V M

    1995-12-19

    Representational difference analysis was used to identify strain-specific differences in the pseudoautosomal region (PAR) of mouse X and Y chromosomes. One second generation (C57BL/6 x Mus spretus) x Mus spretus interspecific backcross male carrying the C57BL/6 (B6) PAR was used for tester DNA. DNA from five backcross males from the same generation that were M. spretus-type for the PAR was pooled for the driver. A cloned probe designated B6-38 was recovered that is B6-specific in Southern analysis. Analysis of genomic DNA from several inbred strains of laboratory mice and diverse Mus species and subspecies identified a characteristic Pst I pattern of fragment sizes that is present only in the C57BL family of strains. Hybridization was observed with sequences in DBA/2J and to a limited extent with Mus musculus (PWK strain) and Mus castaneus DNA. No hybridization was observed in DNA of different Mus species, M. spretus, M. hortulanus, and M. caroli. Genetic analyses of B6-38 were conducted using C57BL congenic males that carry M. spretus alleles for distal X chromosome loci and the PAR and outcrosses of heterozygous congenic females with M. spretus. These analyses demonstrated that the B6-38 sequences were inherited with both the X and Y chromosome. B6-38 sequences were genetically mapped as a locus within the PAR using two interspecific backcrosses. The locus defined by B6-38 is designated DXYRp1. Preliminary analyses of recombination between the distal X chromosome gene amelogenin (Amg) and the PAR loci for either TelXY or sex chromosome association (Sxa) suggest that the locus DXYRp1 maps to the distal portion of the PAR.

  11. Identification and characterization of a highly variable region in mitochondrial genomes of fusarium species and analysis of power generation from microbial fuel cells

    Science.gov (United States)

    Hamzah, Haider Mousa

    In the microbial fuel cell (MFC) project, power generation from Shewanella oneidensis MR-1 was analyzed looking for a novel system for both energy generation and sustainability. The results suggest the possibility of generating electricity from different organic substances, which include agricultural and industrial by-products. Shewanella oneidensis MR-1 generates usable electrons at 30°C using both submerged and solid state cultures. In the MFC biocathode experiment, most of the CO2 generated at the anodic chamber was converted into bicarbonate due to the activity of carbonic anhydrase (CA) of the Gluconobacter sp.33 strain. These findings demonstrate the possibility of generation of electricity while at the same time allowing the biomimetic sequestration of CO2 using bacterial CA. In the mitochondrial genomes project, the filamentous fungal species Fusarium oxysporum was used as a model. This species causes wilt of several important agricultural crops. A previous study revealed that a highly variable region (HVR) in the mitochondrial DNA (mtDNA) of three species of Fusarium contained a large, variable unidentified open reading frame (LV-uORF). Using specific primers for two regions of the LV-uORF, six strains were found to contain the ORF by PCR and database searches identified 18 other strains outside of the Fusarium oxysporum species complex. The LV-uORF was also identified in three isolates of the F. oxysporum species complex. Interestingly, several F. oxysporum isolates lack the LV-uORF and instead contain 13 ORFs in the HVR, nine of which are unidentified. The high GC content and codon usage of the LV-uORF indicate that it did not co-evolve with other mt genes and was horizontally acquired and was introduced to the Fusarium lineage prior to speciation. The nonsynonymous/synonymous (dN/dS) ratio of the LV-uORFs (0.43) suggests it is under purifying selection and the putative polypeptide is predicted to be located in the mitochondrial membrane. Growth assays

  12. Whole-Genome Sequence of Streptococcus macedonicus Strain 33MO, Isolated from the Curd of Morlacco Cheese in the Veneto Region (Italy)

    DEFF Research Database (Denmark)

    Vendramin, Veronica; Treu, Laura; Bovo, Barbara

    2014-01-01

    A genetic characterization of Streptococcus macedonicus is important to better understand the characteristics of this lactic acid bacterium, frequently detected in fermented food bacteria communities. This report presents the draft genome sequence description of strain 33MO, the first publicly...

  13. Whole genome sequencing and assembly of Eukaryotic microbes isolated from ISS environmental surface Kirovograd region soil Chernobyl Nuclear Power Plant and Chernobyl Exclusion Zone

    Data.gov (United States)

    National Aeronautics and Space Administration — The whole-genome sequences of eight fungal strains that were selected for exposure to microgravity at the International Space Station are presented here. These...

  14. Oxidized Base Damage and Single-Strand Break Repair in Mammalian Genomes: Role of Disordered Regions and Posttranslational Modifications in Early Enzymes

    OpenAIRE

    Hegde, Muralidhar L.; Izumi, Tadahide; Mitra, Sankar

    2012-01-01

    Oxidative genome damage induced by reactive oxygen species includes oxidized bases, abasic (AP) sites, and single-strand breaks, all of which are repaired via the evolutionarily conserved base excision repair/single-strand break repair (BER/SSBR) pathway. BER/SSBR in mammalian cells is complex, with preferred and backup sub-pathways, and is linked to genome replication and transcription. The early BER/SSBR enzymes, namely, DNA glycosylases (DGs) and the end-processing proteins such as abasic ...

  15. Comparative Genome Analysis of Ciprofloxacin-Resistant Pseudomonas aeruginosa Reveals Genes Within Newly Identified High Variability Regions Associated With Drug Resistance Development

    OpenAIRE

    Su, Hsun-Cheng; Khatun, Jainab; Kanavy, Dona M.; Giddings, Morgan C.

    2013-01-01

    The alarming rise of ciprofloxacin-resistant Pseudomonas aeruginosa has been reported in several clinical studies. Though the mutation of resistance genes and their role in drug resistance has been researched, the process by which the bacterium acquires high-level resistance is still not well understood. How does the genomic evolution of P. aeruginosa affect resistance development? Could the exposure of antibiotics to the bacteria enrich genomic variants that lead to the development of resist...

  16. Recurring genomic breaks in independent lineages support genomic fragility

    Directory of Open Access Journals (Sweden)

    Hannenhalli Sridhar

    2006-11-01

    Full Text Available Abstract Background Recent findings indicate that evolutionary breaks in the genome are not randomly distributed, and that certain regions, so-called fragile regions, are predisposed to breakages. Previous approaches to the study of genomic fragility have examined the distribution of breaks, as well as the coincidence of breaks with segmental duplications and repeats, within a single species. In contrast, we investigate whether this regional fragility is an inherent genomic characteristic and is thus conserved over multiple independent lineages. Results We do this by quantifying the extent to which certain genomic regions are disrupted repeatedly in independent lineages. Our investigation, based on Human, Chimp, Mouse, Rat, Dog and Chicken, suggests that the propensity of a chromosomal region to break is significantly correlated among independent lineages, even when covariates are considered. Furthermore, the fragile regions are enriched for segmental duplications. Conclusion Based on a novel methodology, our work provides additional support for the existence of fragile regions.

  17. Chromatin dynamics in genome stability

    DEFF Research Database (Denmark)

    Nair, Nidhi; Shoaib, Muhammad; Sørensen, Claus Storgaard

    2017-01-01

    Genomic DNA is compacted into chromatin through packaging with histone and non-histone proteins. Importantly, DNA accessibility is dynamically regulated to ensure genome stability. This is exemplified in the response to DNA damage where chromatin relaxation near genomic lesions serves to promote access of relevant enzymes to specific DNA regions for signaling and repair. Furthermore, recent data highlight genome maintenance roles of chromatin through the regulation of endogenous DNA-templated processes including transcription and replication. Here, we review research that shows the importance of chromatin structure regulation in maintaining genome integrity by multiple mechanisms including facilitating DNA repair and directly suppressing endogenous DNA damage.

  18. A gene-based high-resolution comparative radiation hybrid map as a framework for genome sequence assembly of a bovine chromosome 6 region associated with QTL for growth, body composition, and milk performance traits

    Directory of Open Access Journals (Sweden)

    Laurent Pascal

    2006-03-01

    Full Text Available Abstract Background A number of different quantitative trait loci (QTL) for various phenotypic traits, including milk production, functional, and conformation traits in dairy cattle as well as growth and body composition traits in meat cattle, have been mapped consistently in the middle region of bovine chromosome 6 (BTA6). Dense genetic and physical maps and, ultimately, a fully annotated genome sequence as well as their mutual connections are required to efficiently identify genes and gene variants responsible for genetic variation of phenotypic traits. A comprehensive high-resolution gene-rich map linking densely spaced bovine markers and genes to the annotated human genome sequence is required as a framework to facilitate this approach for the region on BTA6 carrying the QTL. Results Therefore, we constructed a high-resolution radiation hybrid (RH) map for the QTL containing chromosomal region of BTA6. This new RH map with a total of 234 loci including 115 genes and ESTs displays a substantial increase in loci density compared to existing physical BTA6 maps. Screening the available bovine genome sequence resources, a total of 73 loci could be assigned to sequence contigs, which were already identified as specific for BTA6. For 43 loci, corresponding sequence contigs, which were not yet placed on the bovine genome assembly, were identified. In addition, the improved potential of this high-resolution RH map for BTA6 with respect to comparative mapping was demonstrated. Mapping a large number of genes on BTA6 and cross-referencing them with map locations in corresponding syntenic multi-species chromosome segments (human, mouse, rat, dog, chicken) achieved a refined accurate alignment of conserved segments and evolutionary breakpoints across the species included. Conclusion The gene-anchored high-resolution RH map (1 locus/300 kb) for the targeted region of BTA6 presented here will provide a valuable platform to guide high-quality assembling and

  19. Complete nucleotide sequence and genome structure of a Japanese isolate of hibiscus latent Fort Pierce virus, a unique tobamovirus that contains an internal poly(A) region in its 3' end.

    Science.gov (United States)

    Yoshida, Tetsuya; Kitazawa, Yugo; Komatsu, Ken; Neriya, Yutaro; Ishikawa, Kazuya; Fujita, Naoko; Hashimoto, Masayoshi; Maejima, Kensaku; Yamaji, Yasuyuki; Namba, Shigetou

    2014-11-01

    In this study, we detected a Japanese isolate of hibiscus latent Fort Pierce virus (HLFPV-J), a member of the genus Tobamovirus, in a hibiscus plant in Japan and determined the complete sequence and organization of its genome. HLFPV-J has four open reading frames (ORFs), each of which shares more than 98 % nucleotide sequence identity with those of other HLFPV isolates. Moreover, HLFPV-J contains a unique internal poly(A) region of variable length, ranging from 44 to 78 nucleotides, in its 3'-untranslated region (UTR), as is the case with hibiscus latent Singapore virus (HLSV), another hibiscus-infecting tobamovirus. The length of the HLFPV-J genome was 6431 nucleotides, including the shortest internal poly(A) region. The sequence identities of ORFs 1, 2, 3 and 4 of HLFPV-J to other tobamoviruses were 46.6-68.7, 49.9-70.8, 31.0-70.8 and 39.4-70.1 %, respectively, at the nucleotide level and 39.8-75.0, 43.6-77.8, 19.2-70.4 and 31.2-74.2 %, respectively, at the amino acid level. The 5'- and 3'-UTRs of HLFPV-J showed 24.3-58.6 and 13.0-79.8 % identity, respectively, to other tobamoviruses. In particular, when compared to other tobamoviruses, each ORF and UTR of HLFPV-J showed the highest sequence identity to those of HLSV. Phylogenetic analysis showed that HLFPV-J, other HLFPV isolates and HLSV constitute a malvaceous-plant-infecting tobamovirus cluster. These results indicate that the genomic structure of HLFPV-J has unique features similar to those of HLSV. To our knowledge, this is the first report of the complete genome sequence of HLFPV.

  20. Enrichment of colorectal cancer associations in functional regions: Insight for using epigenomics data in the analysis of whole genome sequence-imputed GWAS data.

    Directory of Open Access Journals (Sweden)

    Stephanie A Bien

    Full Text Available The evaluation of less frequent genetic variants and their effect on complex disease pose new challenges for genomic research. To investigate whether epigenetic data can be used to inform aggregate rare-variant association methods (RVAM), we assessed whether variants more significantly associated with colorectal cancer (CRC) were preferentially located in non-coding regulatory regions, and whether enrichment was specific to colorectal tissues. Active regulatory elements (ARE) were mapped using data from 127 tissues and cell-types from NIH Roadmap Epigenomics and Encyclopedia of DNA Elements (ENCODE) projects. We investigated whether CRC association p-values were more significant for common variants (1) inside versus outside AREs, or (2) inside colorectal (CR) AREs versus AREs of other tissues and cell-types. We employed an integrative epigenomic RVAM for variants with allele frequency <1%. Gene sets were defined as ARE variants within 200 kilobases of a transcription start site (TSS) using either CR ARE or ARE from non-digestive tissues. CRC-set association p-values were used to evaluate enrichment of less frequent variant associations in CR ARE versus non-digestive ARE. ARE from 126/127 tissues and cell-types were significantly enriched for stronger CRC-variant associations. Strongest enrichment was observed for digestive tissues and immune cell types. CR-specific ARE were also enriched for stronger CRC-variant associations compared to ARE combined across non-digestive tissues (p-value = 9.6 × 10^-4). Additionally, we found enrichment of stronger CRC association p-values for rare variant sets of CR ARE compared to non-digestive ARE (p-value = 0.029). Integrative epigenomic RVAM may enable discovery of less frequent variants associated with CRC, and ARE of digestive and immune tissues are most informative. Although distance-based aggregation of less frequent variants in CR ARE surrounding TSS showed modest enrichment, future association studies would likely

  1. Comparative Genome Viewer

    International Nuclear Information System (INIS)

    Molineris, I.; Sales, G.

    2009-01-01

    The amount of information about genomes, both in the form of complete sequences and annotations, has been exponentially increasing in the last few years. As a result there is the need for tools providing a graphical representation of such information that should be comprehensive and intuitive. Visual representation is especially important in the comparative genomics field since it should provide a combined view of data belonging to different genomes. We believe that existing tools are limited in this respect as they focus on a single genome at a time (conservation histograms) or compress alignment representation to a single dimension. We have therefore developed a web-based tool called Comparative Genome Viewer (Cgv): it integrates a bidimensional representation of alignments between two regions, both at small and big scales, with the richness of annotations present in other genome browsers. We give access to our system through a web-based interface that provides the user with an interactive representation that can be updated in real time using the mouse to move from region to region and to zoom in on interesting details.

  2. ChIP-seq analysis of genomic binding regions of five major transcription factors highlights a central role for ZIC2 in the mouse epiblast stem cell gene regulatory network

    Science.gov (United States)

    Matsuda, Kazunari; Oki, Shinya; Iida, Hideaki; Andrabi, Munazah; Yamaguchi, Katsushi

    2017-01-01

    To obtain insight into the transcription factor (TF)-dependent regulation of epiblast stem cells (EpiSCs), we performed ChIP-seq analysis of the genomic binding regions of five major TFs. Analysis of in vivo biotinylated ZIC2, OTX2, SOX2, POU5F1 and POU3F1 binding in EpiSCs identified several new features. (1) Megabase-scale genomic domains rich in ZIC2 peaks and genes alternate with those rich in POU3F1 but sparse in genes, reflecting the clustering of regulatory regions that act at short and long-range, which involve binding of ZIC2 and POU3F1, respectively. (2) The enhancers bound by ZIC2 and OTX2 prominently regulate TF genes in EpiSCs. (3) The binding sites for SOX2 and POU5F1 in mouse embryonic stem cells (ESCs) and EpiSCs are divergent, reflecting the shift in the major acting TFs from SOX2/POU5F1 in ESCs to OTX2/ZIC2 in EpiSCs. (4) This shift in the major acting TFs appears to be primed by binding of ZIC2 in ESCs at relevant genomic positions that later function as enhancers following the disengagement of SOX2/POU5F1 from major regulatory functions and subsequent binding by OTX2. These new insights into EpiSC gene regulatory networks gained from this study are highly relevant to early stage embryogenesis. PMID:28455373

  3. Unique and conserved genome regions in Vibrio harveyi and related species in comparison with the shrimp pathogen Vibrio harveyi CAIM 1792

    DEFF Research Database (Denmark)

    Valles, Iliana Espinoza; Vora, Gary J; Lin, Baochuan

    2015-01-01

    Vibrio harveyi CAIM 1792 is a marine bacterial strain that causes mortality in farmed shrimp in north-west Mexico, and the identification of virulence genes in this strain is important for understanding its pathogenicity. The aim of this work was to compare the V. harveyi CAIM 1792 genome....... The proteome of CAIM 1792 had higher similarity to those of other V. harveyi strains (78 %) than to those of the other closely related species Vibrio owensii (67 %), Vibrio rotiferianus (63 %) and Vibrio campbellii (59 %). Pan-genome ORFans trees showed the best fit with the accepted phylogeny based on DNA...

  4. Genome Sequence of Thermotoga sp Strain RQ2, a Hyperthermophilic Bacterium Isolated from a Geothermally Heated Region of the Seafloor near Ribeira Quente, the Azores

    Energy Technology Data Exchange (ETDEWEB)

    Swithers, Kristen S [University of Connecticut, Storrs; DiPippo, Jonathan L [University of Connecticut, Storrs; Bruce, David [Los Alamos National Laboratory (LANL); Detter, J. Chris [U.S. Department of Energy, Joint Genome Institute; Tapia, Roxanne [Los Alamos National Laboratory (LANL); Han, Cliff [Los Alamos National Laboratory (LANL); Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute; Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute; Pennacchio, Len [U.S. Department of Energy, Joint Genome Institute; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute; Mikhailova, Natalia [U.S. Department of Energy, Joint Genome Institute; Lykidis, A [U.S. Department of Energy, Joint Genome Institute; Land, Miriam L [ORNL; Stetter, Karl O [Universitat Regensburg, Regensburg, Germany; Nelson, Karen E [J. Craig Venter Institute; Gogarten, Peter [University of Connecticut, Storrs; Noll, Kenneth M [University of Connecticut, Storrs

    2011-01-01

    Thermotoga sp. strain RQ2 is probably a strain of Thermotoga maritima. Its complete genome sequence allows for an examination of the extent and consequences of gene flow within Thermotoga species and strains. Thermotoga sp. RQ2 differs from T. maritima in its genes involved in myo-inositol metabolism. Its genome also encodes an apparent fructose phosphotransferase system (PTS) sugar transporter. This operon is also found in Thermotoga naphthophila strain RKU-10 but no other Thermotogales. These are the first reported PTS transporters in the Thermotogales.

  5. PRMT5-mediated histone H4 arginine-3 symmetrical dimethylation marks chromatin at G + C-rich regions of the mouse genome

    OpenAIRE

    Girardot, Michael; Hirasawa, Ryutaro; Kacem, Salim; Fritsch, Lauriane; Pontis, Julien; Kota, Satya K.; Filipponi, Doria; Fabbrizio, Eric; Sardet, Claude; Lohmann, Felix; Kadam, Shilpa; Ait-Si-Ali, Slimane; Feil, Robert

    2013-01-01

    Symmetrical dimethylation on arginine-3 of histone H4 (H4R3me2s) has been reported to occur at several repressed genes, but its specific regulation and genomic distribution remained unclear. Here, we show that the type-II protein arginine methyltransferase PRMT5 controls H4R3me2s in mouse embryonic fibroblasts (MEFs). In these differentiated cells, we find that the genome-wide pattern of H4R3me2s is highly similar to that in embryonic stem cells. In both the cell types, H4R3me2s peaks are det...

  6. Cancer genomics

    DEFF Research Database (Denmark)

    Norrild, Bodil; Guldberg, Per; Ralfkiær, Elisabeth Methner

    2007-01-01

    Almost all cells in the human body contain a complete copy of the genome with an estimated number of 25,000 genes. The sequences of these genes make up about three percent of the genome and comprise the inherited set of genetic information. The genome also contains information that determines when...

  7. Cancer genomics

    DEFF Research Database (Denmark)

    Norrild, Bodil; Guldberg, Per; Ralfkiær, Elisabeth Methner

    2007-01-01

    Almost all cells in the human body contain a complete copy of the genome with an estimated number of 25,000 genes. The sequences of these genes make up about three percent of the genome and comprise the inherited set of genetic information. The genome also contains information that determines whe...

  8. Draft genome sequences of Escherichia coli O113:H21 strains recovered from a major produce-production region in California

    Science.gov (United States)

    Shiga toxin-producing Escherichia coli is a foodborne and waterborne pathogen and is responsible for outbreaks of human gastroenteritis. This report documents the draft genome sequences of seven O113:H21 strains recovered from livestock, wildlife, and soil samples collected in a major agricultural r...

  9. The complete chloroplast genome sequence of Taxus chinensis var. mairei (Taxaceae): loss of an inverted repeat region and comparative analysis with related species.

    Science.gov (United States)

    Zhang, Yanzhen; Ma, Ji; Yang, Bingxian; Li, Ruyi; Zhu, Wei; Sun, Lianli; Tian, Jingkui; Zhang, Lin

    2014-05-01

    Taxus chinensis var. mairei (Taxaceae) is a domestic variety of yew in China. This plant is one of the sources of paclitaxel, which has been a promising antineoplastic chemotherapy drug during the last decade. We have sequenced the complete nucleotide sequence of the chloroplast (cp) genome of T. chinensis var. mairei. The T. chinensis var. mairei cp genome is 129,513 bp in length, with 113 single copy genes and two duplicated genes (trnI-CAU, trnQ-UUG). Among the 113 single copy genes, 9 are intron-containing. Compared to other land plant cp genomes, the T. chinensis var. mairei cp genome has lost one of the large inverted repeats (IRs) found in angiosperms, ferns, liverworts, and gymnosperms such as Cycas revoluta and Ginkgo biloba L. Compared to related species, the gene order of T. chinensis var. mairei has a large inversion of ~110 kb including 91 genes (from rps18 to accD) with gene contents unarranged. Repeat analysis identified 48 direct and 2 inverted repeats 30 bp long or longer with a sequence identity greater than 90%. Repeated short segments were found in genes rps18, rps19 and clpP. Analysis also revealed 22 simple sequence repeat (SSR) loci, almost all of which are composed of A or T. Copyright © 2014 Elsevier B.V. All rights reserved.

  10. The Genome of a Tortoise Herpesvirus (Testudinid Herpesvirus 3) Has a Novel Structure and Contains a Large Region That Is Not Required for Replication In Vitro or Virulence In Vivo.

    Science.gov (United States)

    Gandar, Frédéric; Wilkie, Gavin S; Gatherer, Derek; Kerr, Karen; Marlier, Didier; Diez, Marianne; Marschang, Rachel E; Mast, Jan; Dewals, Benjamin G; Davison, Andrew J; Vanderplasschen, Alain F C

    2015-11-01

    the pathogenesis of strain 4295, which consists of three deletion mutants. The major findings are that (i) TeHV-3 has a novel genome structure, (ii) its closest relative is a turtle herpesvirus, (iii) it contains interleukin-10 and semaphorin genes (the first time these have been reported in an alphaherpesvirus), (iv) a sizeable region of the genome is not required for viral replication in vitro or virulence in vivo, and (v) one of the components of strain 4295, which has a deletion of 22.4 kb, exhibits properties indicating that it may serve as the starting point for an attenuated vaccine. Copyright © 2015, American Society for Microbiology. All Rights Reserved.

  11. Annotation of two large contiguous regions from the Haemonchus contortus genome using RNA-seq and comparative analysis with Caenorhabditis elegans.

    Directory of Open Access Journals (Sweden)

    Roz Laing

    Full Text Available The genomes of numerous parasitic nematodes are currently being sequenced, but their complexity and size, together with high levels of intra-specific sequence variation and a lack of reference genomes, makes their assembly and annotation a challenging task. Haemonchus contortus is an economically significant parasite of livestock that is widely used for basic research as well as for vaccine development and drug discovery. It is one of many medically and economically important parasites within the strongylid nematode group. This group of parasites has the closest phylogenetic relationship with the model organism Caenorhabditis elegans, making comparative analysis a potentially powerful tool for genome annotation and functional studies. To investigate this hypothesis, we sequenced two contiguous fragments from the H. contortus genome and undertook detailed annotation and comparative analysis with C. elegans. The adult H. contortus transcriptome was sequenced using an Illumina platform and RNA-seq was used to annotate a 409 kb overlapping BAC tiling path relating to the X chromosome and a 181 kb BAC insert relating to chromosome I. In total, 40 genes and 12 putative transposable elements were identified. 97.5% of the annotated genes had detectable homologues in C. elegans of which 60% had putative orthologues, significantly higher than previous analyses based on EST analysis. Gene density appears to be less in H. contortus than in C. elegans, with annotated H. contortus genes being an average of two-to-three times larger than their putative C. elegans orthologues due to a greater intron number and size. Synteny appears high but gene order is generally poorly conserved, although areas of conserved microsynteny are apparent. C. elegans operons appear to be partially conserved in H. contortus. Our findings suggest that a combination of RNA-seq and comparative analysis with C. elegans is a powerful approach for the annotation and analysis of strongylid

  12. Annotation of Two Large Contiguous Regions from the Haemonchus contortus Genome Using RNA-seq and Comparative Analysis with Caenorhabditis elegans

    Science.gov (United States)

    Laing, Roz; Hunt, Martin; Protasio, Anna V.; Saunders, Gary; Mungall, Karen; Laing, Steven; Jackson, Frank; Quail, Michael; Beech, Robin; Berriman, Matthew; Gilleard, John S.

    2011-01-01

    The genomes of numerous parasitic nematodes are currently being sequenced, but their complexity and size, together with high levels of intra-specific sequence variation and a lack of reference genomes, makes their assembly and annotation a challenging task. Haemonchus contortus is an economically significant parasite of livestock that is widely used for basic research as well as for vaccine development and drug discovery. It is one of many medically and economically important parasites within the strongylid nematode group. This group of parasites has the closest phylogenetic relationship with the model organism Caenorhabditis elegans, making comparative analysis a potentially powerful tool for genome annotation and functional studies. To investigate this hypothesis, we sequenced two contiguous fragments from the H. contortus genome and undertook detailed annotation and comparative analysis with C. elegans. The adult H. contortus transcriptome was sequenced using an Illumina platform and RNA-seq was used to annotate a 409 kb overlapping BAC tiling path relating to the X chromosome and a 181 kb BAC insert relating to chromosome I. In total, 40 genes and 12 putative transposable elements were identified. 97.5% of the annotated genes had detectable homologues in C. elegans of which 60% had putative orthologues, significantly higher than previous analyses based on EST analysis. Gene density appears to be less in H. contortus than in C. elegans, with annotated H. contortus genes being an average of two-to-three times larger than their putative C. elegans orthologues due to a greater intron number and size. Synteny appears high but gene order is generally poorly conserved, although areas of conserved microsynteny are apparent. C. elegans operons appear to be partially conserved in H. contortus. Our findings suggest that a combination of RNA-seq and comparative analysis with C. elegans is a powerful approach for the annotation and analysis of strongylid nematode genomes

  13. Nucleotide sequences within the U5 region of the viral RNA genome are the major determinants for an human immunodeficiency virus type 1 to maintain a primer binding site complementary to tRNA(His).

    Science.gov (United States)

    Zhang, Z; Kang, S M; LeBlanc, A; Hajduk, S L; Morrow, C D

    1996-12-15

    The initiation of reverse transcription of the human immunodeficiency virus type 1 (HIV-1) genome requires cellular tRNA(Lys,3) as a primer and occurs at a site in the viral RNA genome, designated as the primer binding site (PBS), which is complementary to the 3'-terminal 18 nucleotides of tRNA(Lys,3). We previously described an HIV-1 virus [designated as HXB2(His-AC)], which contained a sequence within the U5 region complementary to the anticodon region of tRNA(His) in addition to a PBS complementary to the 3'-terminal 18 nucleotides of the tRNA(His). That virus maintained a PBS complementary to tRNA(His) after extended in vitro culture (Wakefield et al., J. Virol. 70, 966-975, 1996). In the present study, we report that subcloning a 200-base-pair DNA fragment encompassing the U5 and PBS regions from an integrated provirus of HXB2(His-AC) back into the wild-type genome (pHXB2) resulted in an infectious virus, designated as HXB2(His-AC-gac), which again stably maintained a PBS complementary to tRNA(His). DNA sequence analysis of the 200-base-pair region revealed only three nucleotide changes from HXB2(His-AC): a T-to-G change at nucleotide 174, a G-to-A change at nucleotide 181, and a T-to-C change at nucleotide 200. The new mutant virus replicated in CD4+ Sup T1 cells similarly to the wild-type virus. Comparison of the nucleotide sequence of nucleocapsid gene of the wild-type and HXB2 (His-AC-gac) virus revealed no differences. Although we found numerous mutations in the reverse transcriptase gene in proviral clones derived from HXB2 (His-AC-gac), no common mutations were found among the 13 clones examined. Comparison of the virion-associated tRNAs of HXB2(His-AC-gac) with those of the wild type revealed that both viruses incorporated a similar subset of cellular tRNAs, with tRNA(Lys,3) being the predominant tRNA found within virions. There was no selective enrichment for tRNA(His) within virions of HXB2(His-AC-gac) virus which selectively use tRNA(His) to

  14. Arabidopsis thaliana population analysis reveals high plasticity of the genomic region spanning MSH2, AT3G18530 and AT3G18535 genes and provides evidence for NAHR-driven recurrent CNV events occurring in this location.

    Science.gov (United States)

    Zmienko, Agnieszka; Samelak-Czajka, Anna; Kozlowski, Piotr; Szymanska, Maja; Figlerowicz, Marek

    2016-11-08

    Intraspecies copy number variations (CNVs), defined as unbalanced structural variations of specific genomic loci, ≥1 kb in size, are present in the genomes of animals and plants. A growing number of examples indicate that CNVs may have functional significance and contribute to phenotypic diversity. In the model plant Arabidopsis thaliana at least several hundred protein-coding genes might display CNV; however, locus-specific genotyping studies in this plant have not been conducted. We analyzed the natural CNVs in the region overlapping MSH2 gene that encodes the DNA mismatch repair protein, and AT3G18530 and AT3G18535 genes that encode poorly characterized proteins. By applying multiplex ligation-dependent probe amplification and droplet digital PCR we genotyped those genes in 189 A. thaliana accessions. We found that AT3G18530 and AT3G18535 were duplicated (2-14 times) in 20 and deleted in 101 accessions. MSH2 was duplicated in 12 accessions (up to 12-14 copies) but never deleted. In all but one case, the MSH2 duplications were associated with those of AT3G18530 and AT3G18535. Considering the structure of the CNVs, we distinguished 5 genotypes for this region, determined their frequency and geographical distribution. We defined the CNV breakpoints in 35 accessions with AT3G18530 and AT3G18535 deletions and tandem duplications and showed that they were reciprocal events, resulting from non-allelic homologous recombination between 99 %-identical sequences flanking these genes. The widespread geographical distribution of the deletions supported by the SNP and linkage disequilibrium analyses of the genomic sequence confirmed the recurrent nature of this CNV. We characterized in detail for the first time the complex multiallelic CNV in Arabidopsis genome. The region encoding MSH2, AT3G18530 and AT3G18535 genes shows enormous variation of copy numbers among natural ecotypes, being a remarkable example of high Arabidopsis genome plasticity. We provided the molecular

  15. Array comparative genomic hybridization in patients with congenital diaphragmatic hernia: mapping of four CDH-critical regions and sequencing of candidate genes at 15q26.1-15q26.2.

    Science.gov (United States)

    Slavotinek, Anne M; Moshrefi, Ali; Davis, Randy; Leeth, Elizabeth; Schaeffer, G Bradley; Burchard, González Esteban; Shaw, Gary M; James, Bristow; Ptacek, Louis; Pennacchio, Len A

    2006-09-01

    Congenital diaphragmatic hernia (CDH) is a common birth defect with a high mortality and morbidity. There have been few studies that have assessed copy number changes in CDH. We present array comparative genomic hybridization data for 29 CDH patients to identify and map chromosome aberrations in this disease. Three patients with 15q26.1-15q26.2 deletions had heterogeneous breakpoints that overlapped with the critical 4 Mb region previously delineated for CDH, confirming 15q26.1-15q26.2 as a critical region for CDH. The three other most compelling CDH-critical regions for genomic deletions based on these data and a literature review are located at chromosomes 8p23.1, 4p16.3-4pter, and 1q41-1q42.1. Based on these recurrent deletions at 15q26.1-15q26.2, we hypothesized that loss-of-function mutations in a gene or genes from this region could cause CDH and sequenced six candidate genes from this region in more than 100 patients with CDH. For three of these genes (CHD2, ARRDC4, and RGMA), we identified missense changes that were not identified in normal controls; however, none of these alterations appeared unambiguously causal with CDH. These data suggest that CDH caused by chromosome deletions at 15q26.2 may arise because of a contiguous gene deletion syndrome or may have a multifactorial etiology. In addition, there is evidence for substantial genetic heterogeneity in CDH, and diaphragmatic hernias can be non-penetrant in patients who have deletions involving CDH-critical regions.

  16. Preferential integration of human immunodeficiency virus type 1 into genes, cytogenetic R bands and GC-rich DNA regions: insight from the human genome sequence

    Czech Academy of Sciences Publication Activity Database

    Elleder, Daniel; Pavlíček, Adam; Pačes, Jan; Hejnar, Jiří

    2002-01-01

    Vol. 517, No. 1-3 (2002), pp. 285-286 ISSN 0014-5793 R&D Projects: GA ČR GA204/01/0632; GA ČR GA524/01/0866; GA MŠk(CZ) LN00A079 Institutional research plan: CEZ:AV0Z5052915 Keywords: HIV-1 integration preference; genome-wide screen Subject RIV: EB - Genetics; Molecular Biology Impact factor: 3.912, year: 2002

  17. Sequencing intractable DNA to close microbial genomes.

    Directory of Open Access Journals (Sweden)

    Richard A Hurt

    Full Text Available Advancement in high throughput DNA sequencing technologies has supported a rapid proliferation of microbial genome sequencing projects, providing the genetic blueprint for in-depth studies. Oftentimes, difficult-to-sequence regions in microbial genomes are ruled "intractable", resulting in a growing number of genomes with sequence gaps deposited in databases. A procedure was developed to sequence such problematic regions in the "non-contiguous finished" Desulfovibrio desulfuricans ND132 genome (6 intractable gaps) and the Desulfovibrio africanus genome (1 intractable gap). The polynucleotides surrounding each gap formed GC rich secondary structures making the regions refractory to amplification and sequencing. Strand-displacing DNA polymerases used in concert with a novel ramped PCR extension cycle supported amplification and closure of all gap regions in both genomes. The developed procedures support accurate gene annotation, and provide a step-wise method that reduces the effort required for genome finishing.

  18. A bias-reducing pathway enrichment analysis of genome-wide association data confirmed association of the MHC region with schizophrenia.

    LENUS (Irish Health Repository)

    Jia, Peilin

    2012-02-01

    After the recent successes of genome-wide association studies (GWAS), one key challenge is to identify genetic variants that might have a significant joint effect on complex diseases but have failed to be identified individually due to weak to moderate marginal effect. One popular and effective approach is gene set based analysis, which investigates the joint effect of multiple functionally related genes (eg, pathways). However, a typical gene set analysis method is biased towards long genes, a problem that is especially severe in psychiatric diseases.

  19. Organizational heterogeneity of vertebrate genomes.

    Directory of Open Access Journals (Sweden)

    Svetlana Frenkel

    Full Text Available Genomes of higher eukaryotes are mosaics of segments with various structural, functional, and evolutionary properties. The availability of whole-genome sequences allows the investigation of their structure as "texts" using different statistical and computational methods. One such method, referred to as Compositional Spectra (CS) analysis, is based on scoring the occurrences of fixed-length oligonucleotides (k-mers) in the target DNA sequence. CS analysis allows generating species- or region-specific characteristics of the genome, regardless of their length and the presence of coding DNA. In this study, we consider the heterogeneity of vertebrate genomes as a joint effect of regional variation in sequence organization superimposed on the differences in nucleotide composition. We estimated compositional and organizational heterogeneity of genome and chromosome sequences separately and found that both heterogeneity types vary widely among genomes as well as among chromosomes in all investigated taxonomic groups. The high correspondence of heterogeneity scores obtained on three genome fractions, coding, repetitive, and the remaining part of the noncoding DNA (the genome dark matter, GDM), allows the assumption that CS-heterogeneity may have functional relevance to genome regulation. Of special interest for such interpretation is the fact that natural GDM sequences display the highest deviation from the corresponding reshuffled sequences.
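    The k-mer scoring step described in this abstract can be illustrated with a minimal sketch. It is not the authors' Compositional Spectra implementation: it assumes exact matching of overlapping windows and simple frequency normalization, and the function name `kmer_spectrum` is hypothetical.

```python
from collections import Counter


def kmer_spectrum(seq: str, k: int) -> dict[str, float]:
    """Return relative frequencies of overlapping k-mers in a DNA sequence.

    Hypothetical helper for illustration only: windows containing any
    character other than A, C, G or T (for example N) are skipped.
    """
    seq = seq.upper()
    valid = set("ACGT")
    counts = Counter(
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= valid
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {kmer: n / total for kmer, n in counts.items()}


# Example: 3-mer spectrum of a short sequence containing one ambiguous base.
spectrum = kmer_spectrum("ACGTACGTNACG", 3)
```

    Comparing such spectra between sequence segments, or between a natural sequence and its reshuffled counterpart, would need a distance measure that the abstract does not specify.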

  20. Invariants of DNA genomic signals

    Science.gov (United States)

    Cristea, Paul Dan A.

    2005-02-01

    For large scale analysis purposes, the conversion of genomic sequences into digital signals opens the possibility to use powerful signal processing methods for handling genomic information. The study of complex genomic signals reveals large scale features, maintained over the scale of whole chromosomes, that would be difficult to find by using only the symbolic representation. Based on genomic signal methods and on statistical techniques, the paper defines parameters of DNA sequences which are invariant to transformations induced by SNPs, splicing or crossover. Re-orienting concatenated coding regions in the same direction, regularities shared by the genomic material in all exons are revealed, pointing towards the hypothesis of a regular ancestral structure from which the current chromosome structures have evolved. This property is not found in non-nuclear genomic material, e.g., plasmids.
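    The conversion of a symbolic sequence into a digital signal can be sketched as follows. The abstract does not give the mapping, so the quadrantal assignment of bases to complex numbers below is an assumption chosen for illustration, and the helper names are hypothetical.

```python
import cmath
from itertools import accumulate

# Assumed mapping for illustration only: each base is placed in one quadrant
# of the complex plane. The abstract does not state the mapping actually used.
BASE_TO_COMPLEX = {"A": 1 + 1j, "C": -1 - 1j, "G": -1 + 1j, "T": 1 - 1j}


def genomic_signal(seq: str) -> list[complex]:
    """Convert a DNA string into a complex-valued signal, one sample per base.

    Raises KeyError on any symbol other than A, C, G or T.
    """
    return [BASE_TO_COMPLEX[base] for base in seq.upper()]


def cumulated_phase(seq: str) -> list[float]:
    """Running sum of the phase (in radians) of each sample along the sequence."""
    return list(accumulate(cmath.phase(z) for z in genomic_signal(seq)))
```

    Once the sequence is numeric, standard signal processing tools can be applied to it, for example tracking how the running phase drifts along a chromosome.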

  1. Comparative Genomics

    Indian Academy of Sciences (India)

    An important hallmark of biological research is the aspect of 'comparisons'. As the complete genome sequences of numerous organisms have become available, the emphasis in biology has shifted to comparisons at the genome level. Indeed, the last few years have witnessed an exponential rise in the number of ...

  2. Comparative Genomics

    Indian Academy of Sciences (India)

    structions of the tree of life, drug discovery programs, function predictions of hypothetical proteins and genes, regulatory motifs and other non-coding DNA motifs, and genome ... expertise in assembling sequences. Beginning with the complete genome sequence of the bacterial pathogen Haemophilus influenzae that was ...

  3. Oxidized base damage and single-strand break repair in mammalian genomes: role of disordered regions and posttranslational modifications in early enzymes.

    Science.gov (United States)

    Hegde, Muralidhar L; Izumi, Tadahide; Mitra, Sankar

    2012-01-01

    Oxidative genome damage induced by reactive oxygen species includes oxidized bases, abasic (AP) sites, and single-strand breaks, all of which are repaired via the evolutionarily conserved base excision repair/single-strand break repair (BER/SSBR) pathway. BER/SSBR in mammalian cells is complex, with preferred and backup sub-pathways, and is linked to genome replication and transcription. The early BER/SSBR enzymes, namely, DNA glycosylases (DGs) and the end-processing proteins such as abasic endonuclease 1 (APE1), form complexes with downstream repair (and other noncanonical) proteins via pairwise interactions. Furthermore, a unique feature of mammalian early BER/SSBR enzymes is the presence of a disordered terminal extension that is absent in their Escherichia coli prototypes. These nonconserved segments usually contain organelle-targeting signals, common interaction interfaces, and sites of posttranslational modifications that may be involved in regulating their repair function including lesion scanning. Finally, the linkage of BER/SSBR deficiency to cancer, aging, and human neurodegenerative diseases, and therapeutic targeting of BER/SSBR are discussed. Copyright © 2012 Elsevier Inc. All rights reserved.

  4. A physical map of the human genome

    Energy Technology Data Exchange (ETDEWEB)

    McPherson, J.D.; Marra, M.; Hillier, L.; Waterston, R.H.; Chinwalla, A.; Wallis, J.; Sekhon, M.; Wylie, K.; Mardis, E.R.; Wilson, R.K.; Fulton, R.; Kucaba, T.A.; Wagner-McPherson, C.; Barbazuk, W.B.; Gregory, S.G.; Humphray, S.J.; French, L.; Evans, R.S.; Bethel, G.; Whittaker, A.; Holden, J.L.; McCann, O.T.; Dunham, A.; Soderlund, C.; Scott, C.E.; Bentley, D.R.; Schuler, G.; Chen, H.-C.; Jang, W.; Green, E.D.; Idol, J.R.; Maduro, V.V. Braden; Montgomery, K.T.; Lee, E.; Miller, A.; Emerling, S.; Kucherlapati; Gibbs, R.; Scherer, S.; Gorrell, J.H.; Sodergren, E.; Clerc-Blankenburg, K.; Tabor, P.; Naylor, S.; Garcia, D.; de Jong, P.J.; Catanese, J.J.; Nowak, N.; Osoegawa, K.; Qin, S.; Rowen, L.; Madan, A.; Dors, M.; Hood, L.; Trask, B.; Friedman, C.; Massa, H.; Cheung, V.G.; Kirsch, I.R.; Reid, T.; Yonescu, R.; Weissenbach, J.; Bruls, T.; Heilig, R.; Branscomb, E.; Olsen, A.; Doggett, N.; Cheng, J.F.; Hawkins, T.; Myers, R.M.; Shang, J.; Ramirez, L.; Schmutz, J.; Velasquez, O.; Dixon, K.; Stone, N.E.; Cox, D.R.; Haussler, D.; Kent, W.J.; Furey, T.; Rogic, S.; Kennedy, S.; Jones, S.; Rosenthal, A.; Wen, G.; Schilhabel, M.; Gloeckner, G.; Nyakatura, G.; Siebert, R.; Schlegelberger, B.; Korenberg, J.; Chen, X.N.; Fujiyama, A.; Hattori, M.; Toyoda, A.; Yada, T.; Park, H.S.; Sakaki, Y.; Shimizu, N.; Asakawa, S.; Kawasaki, K.; Sasaki, T.; Shintani, A.; Shimizu, A.; Shibuya, K.; Kudoh, J.; Minoshima, S.; Ramser, J.; Seranski, P.; Hoff, C.; Poustka, A.; Reinhardt, R.; Lehrach, H.

    2001-01-01

    The human genome is by far the largest genome to be sequenced, and its size and complexity present many challenges for sequence assembly. The International Human Genome Sequencing Consortium constructed a map of the whole genome to enable the selection of clones for sequencing and for the accurate assembly of the genome sequence. Here we report the construction of the whole-genome bacterial artificial chromosome (BAC) map and its integration with previous landmark maps and information from mapping efforts focused on specific chromosomal regions. We also describe the integration of sequence data with the map.

  5. In Silico and Fluorescence In Situ Hybridization Mapping Reveals Collinearity between the Pennisetum squamulatum Apomixis Carrier-Chromosome and Chromosome 2 of Sorghum and Foxtail Millet.

    Directory of Open Access Journals (Sweden)

    Sirjan Sapkota

    Full Text Available Apomixis, or clonal propagation through seed, is a trait identified within multiple species of the grass family (Poaceae). The genetic locus controlling apomixis in Pennisetum squamulatum (syn Cenchrus squamulatus) and Cenchrus ciliaris (syn Pennisetum ciliare, buffelgrass) is the apospory-specific genomic region (ASGR). Previously, the ASGR was shown to be highly conserved but inverted in marker order between P. squamulatum and C. ciliaris based on fluorescence in situ hybridization (FISH) and varied in both karyotype and position of the ASGR on the ASGR-carrier chromosome among other apomictic Cenchrus/Pennisetum species. Using in silico transcript mapping and verification of physical positions of some of the transcripts via FISH, we discovered that the ASGR-carrier chromosome from P. squamulatum is collinear with chromosome 2 of foxtail millet and sorghum outside of the ASGR. The in silico ordering of the ASGR-carrier chromosome markers, previously unmapped in P. squamulatum, allowed for the identification of a backcross line with structural changes to the P. squamulatum ASGR-carrier chromosome derived from gamma irradiated pollen.

  6. In Silico and Fluorescence In Situ Hybridization Mapping Reveals Collinearity between the Pennisetum squamulatum Apomixis Carrier-Chromosome and Chromosome 2 of Sorghum and Foxtail Millet.

    Science.gov (United States)

    Sapkota, Sirjan; Conner, Joann A; Hanna, Wayne W; Simon, Bindu; Fengler, Kevin; Deschamps, Stéphane; Cigan, Mark; Ozias-Akins, Peggy

    2016-01-01

    Apomixis, or clonal propagation through seed, is a trait identified within multiple species of the grass family (Poaceae). The genetic locus controlling apomixis in Pennisetum squamulatum (syn Cenchrus squamulatus) and Cenchrus ciliaris (syn Pennisetum ciliare, buffelgrass) is the apospory-specific genomic region (ASGR). Previously, the ASGR was shown to be highly conserved but inverted in marker order between P. squamulatum and C. ciliaris based on fluorescence in situ hybridization (FISH) and varied in both karyotype and position of the ASGR on the ASGR-carrier chromosome among other apomictic Cenchrus/Pennisetum species. Using in silico transcript mapping and verification of physical positions of some of the transcripts via FISH, we discovered that the ASGR-carrier chromosome from P. squamulatum is collinear with chromosome 2 of foxtail millet and sorghum outside of the ASGR. The in silico ordering of the ASGR-carrier chromosome markers, previously unmapped in P. squamulatum, allowed for the identification of a backcross line with structural changes to the P. squamulatum ASGR-carrier chromosome derived from gamma irradiated pollen.

  7. An original SERPINA3 gene cluster: Elucidation of genomic organization and gene expression in the Bos taurus 21q24 region

    Directory of Open Access Journals (Sweden)

    Ouali Ahmed

    2008-04-01

    Full Text Available Abstract Background The superfamily of serine proteinase inhibitors (serpins) is involved in numerous fundamental biological processes such as inflammation, blood coagulation and apoptosis. Our interest is focused on the SERPINA3 sub-family. The major human plasma protease inhibitor, α1-antichymotrypsin, encoded by the SERPINA3 gene, is homologous to genes organized in clusters in several mammalian species. However, although there is a similar genic organization with a high degree of sequence conservation, the reactive-centre-loop domains, which are responsible for the protease specificity, show significant divergences. Results We provide additional information by analyzing the situation of SERPINA3 in the bovine genome. A cluster of eight genes and one pseudogene sharing a high degree of identity and the same structural organization was characterized. Bovine SERPINA3 genes were localized by radiation hybrid mapping on 21q24 and only spanned over 235 kilobases. For all these genes, we propose a new nomenclature from SERPINA3-1 to SERPINA3-8. They share approximately 70% identity with the human SERPINA3 homologue. In the cluster, we described an original sub-group of six members with an unexpectedly high degree of conservation for the reactive-centre-loop domain, suggesting a similar peptidase inhibitory pattern. Preliminary expression analyses of these bovSERPINA3s showed different tissue-specific patterns and diverse states of glycosylation and phosphorylation. Finally, in the context of phylogenetic analyses, we improved our knowledge of mammalian SERPINA evolution. Conclusion Our experimental results update data of the bovine genome sequencing, substantially increase the bovSERPINA3 sub-family and enrich the phylogenetic tree of serpins. We provide new opportunities for future investigations to approach the biological functions of this unusual subset of serine proteinase inhibitors.

  8. PredictSNP2: A Unified Platform for Accurately Evaluating SNP Effects by Exploiting the Different Characteristics of Variants in Distinct Genomic Regions.

    Directory of Open Access Journals (Sweden)

    Jaroslav Bendl

    2016-05-01

    Full Text Available An important message taken from human genome sequencing projects is that the human population exhibits approximately 99.9% genetic similarity. Variations in the remaining parts of the genome determine our identity, trace our history and reveal our heritage. The precise delineation of phenotypically causal variants plays a key role in providing accurate personalized diagnosis, prognosis, and treatment of inherited diseases. Several computational methods for achieving such delineation have been reported recently. However, their ability to pinpoint potentially deleterious variants is limited by the fact that their mechanisms of prediction do not account for the existence of different categories of variants. Consequently, their output is biased towards the variant categories that are most strongly represented in the variant databases. Moreover, most such methods provide numeric scores but not binary predictions of the deleteriousness of variants or confidence scores that would be more easily understood by users. We have constructed three datasets covering different types of disease-related variants, which were divided across five categories: (i) regulatory, (ii) splicing, (iii) missense, (iv) synonymous, and (v) nonsense variants. These datasets were used to develop category-optimal decision thresholds and to evaluate six tools for variant prioritization: CADD, DANN, FATHMM, FitCons, FunSeq2 and GWAVA. This evaluation revealed some important advantages of the category-based approach. The results obtained with the five best-performing tools were then combined into a consensus score. Additional comparative analyses showed that in the case of missense variations, protein-based predictors perform better than DNA sequence-based predictors. A user-friendly web interface was developed that provides easy access to the five tools' predictions, and their consensus scores, in a user-understandable format tailored to the specific features of different categories of

  9. Cardiovascular genomics.

    Science.gov (United States)

    Wung, Shu-Fen; Hickey, Kathleen T; Taylor, Jacquelyn Y; Gallek, Matthew J

    2013-03-01

    This article provides an update on cardiovascular genomics using three clinically relevant exemplars, including myocardial infarction (MI) and coronary artery disease (CAD), stroke, and sudden cardiac death (SCD). ORGANIZATIONAL CONSTRUCT: Recent advances in cardiovascular genomic research, testing, and clinical implications are presented. Genomic nurse experts reviewed and summarized recent salient literature to provide updates on three selected cardiovascular genomic conditions. Research is ongoing to discover comprehensive genetic markers contributing to many common forms of cardiovascular disease (CVD), including MI and stroke. However, genomic technologies are increasingly being used clinically, particularly in patients with long QT syndrome (LQTS) or hypertrophic cardiomyopathy (HCM) who are at risk for SCD. Currently, there are no clinically recommended genetic tests for many common forms of CVD even though direct-to-consumer genetic tests are being marketed to healthcare providers and the general public. On the other hand, genetic testing for patients with certain single gene conditions, including channelopathies (e.g., LQTS) and cardiomyopathies (e.g., HCM), is recommended clinically. Nurses play a pivotal role in cardiogenetics and are actively engaged in direct clinical care of patients and families with a wide variety of heritable conditions. It is important for nurses to understand current development of cardiovascular genomics and be prepared to translate the new genomic knowledge into practice. © 2013 Sigma Theta Tau International.

  10. Genomic Imprinting

    Indian Academy of Sciences (India)

    Genomic Imprinting - Some Interesting Implications for the Evolution of Social Behaviour. Raghavendra Gadagkar. General Article. Resonance – Journal of Science Education, Volume 5, Issue 9, September 2000, pp 58-68 ...

  11. Role of heteroplasmic mutations in the mitochondrial genome and the ID4 gene promoter methylation region in the pathogenesis of chronic aplastic anemia in patients suffering from Kidney yin deficiency.

    Science.gov (United States)

    Cui, Xing; Wang, Jing-Yi; Liu, Kui; Cui, Si-Yuan; Zhang, Jie; Luo, Ya-Qin; Wang, Xin

    2016-06-01

    To analyze changes in gene amplification in the mitochondrial genome and in the ID4 gene promoter methylation region in patients with chronic aplastic anemia (CAA) suffering from Kidney (Shen) yin deficiency or Kidney yang deficiency. Bone marrow and oral epithelium samples were collected from CAA patients with Kidney yin deficiency or Kidney yang deficiency (20 cases). Bone marrow samples were collected from 20 healthy volunteers. The mitochondrial genome was amplified by polymerase chain reaction (PCR), and PCR products were used for sequencing and analysis. Higher mutational rates were observed in the ND1-2, ND4-6, and CYTB genes in CAA patients suffering from Kidney yin deficiency. Moreover, the ID4 gene was unmethylated in bone marrow samples from healthy individuals, but was methylated in some CAA patients suffering from Kidney yin deficiency (positive rate, 60%) and Kidney yang deficiency (positive rate, 55%). These data supported that gene mutations can alter the expression of respiratory chain enzyme complexes in CAA patients, resulting in energy metabolism impairment and promoting the physiological and pathological processes of hematopoietic failure. Functional impairment of the mitochondrial respiration chain induced by gene mutation may be an important reason for hematopoietic failure in patients with CAA. This change is closely related to maternal inheritance and Kidney yin deficiency. Finally, these data supported the assertion that it is easy to treat disease in patients suffering from yang deficiency and difficult to treat disease in patients suffering from yin deficiency.

  12. A Parthenogenesis Gene Candidate and Evidence for Segmental Allopolyploidy in Apomictic Brachiaria decumbens.

    Science.gov (United States)

    Worthington, Margaret; Heffelfinger, Christopher; Bernal, Diana; Quintero, Constanza; Zapata, Yeny Patricia; Perez, Juan Guillermo; De Vega, Jose; Miles, John; Dellaporta, Stephen; Tohme, Joe

    2016-07-01

    Apomixis, asexual reproduction through seed, enables breeders to identify and faithfully propagate superior heterozygous genotypes by seed without the disadvantages of vegetative propagation or the expense and complexity of hybrid seed production. The availability of new tools such as genotyping by sequencing and bioinformatics pipelines for species lacking reference genomes now makes the construction of dense maps possible in apomictic species, despite complications including polyploidy, multisomic inheritance, self-incompatibility, and high levels of heterozygosity. In this study, we developed saturated linkage maps for the maternal and paternal genomes of an interspecific Brachiaria ruziziensis (R. Germ. and C. M. Evrard) × B. decumbens Stapf. F1 mapping population in order to identify markers linked to apomixis. High-resolution molecular karyotyping and comparative genomics with Setaria italica (L.) P. Beauv provided conclusive evidence for segmental allopolyploidy in B. decumbens, with strong preferential pairing of homologs across the genome and multisomic segregation relatively more common in chromosome 8. The apospory-specific genomic region (ASGR) was mapped to a region of reduced recombination on B. decumbens chromosome 5. The Pennisetum squamulatum (L.) R.Br. PsASGR-BABY BOOM-like (psASGR-BBML)-specific primer pair p779/p780 was in perfect linkage with the ASGR in the F1 mapping population and diagnostic for reproductive mode in a diversity panel of known sexual and apomict Brachiaria (Trin.) Griseb. and P. maximum Jacq. germplasm accessions and cultivars. These findings indicate that ASGR-BBML gene sequences are highly conserved across the Paniceae and add further support for the postulation of the ASGR-BBML as candidate genes for the apomictic function of parthenogenesis. Copyright © 2016 by the Genetics Society of America.

  13. A Parthenogenesis Gene Candidate and Evidence for Segmental Allopolyploidy in Apomictic Brachiaria decumbens

    Science.gov (United States)

    Worthington, Margaret; Heffelfinger, Christopher; Bernal, Diana; Quintero, Constanza; Zapata, Yeny Patricia; Perez, Juan Guillermo; De Vega, Jose; Miles, John; Dellaporta, Stephen; Tohme, Joe

    2016-01-01

    Apomixis, asexual reproduction through seed, enables breeders to identify and faithfully propagate superior heterozygous genotypes by seed without the disadvantages of vegetative propagation or the expense and complexity of hybrid seed production. The availability of new tools such as genotyping by sequencing and bioinformatics pipelines for species lacking reference genomes now makes the construction of dense maps possible in apomictic species, despite complications including polyploidy, multisomic inheritance, self-incompatibility, and high levels of heterozygosity. In this study, we developed saturated linkage maps for the maternal and paternal genomes of an interspecific Brachiaria ruziziensis (R. Germ. and C. M. Evrard) × B. decumbens Stapf. F1 mapping population in order to identify markers linked to apomixis. High-resolution molecular karyotyping and comparative genomics with Setaria italica (L.) P. Beauv provided conclusive evidence for segmental allopolyploidy in B. decumbens, with strong preferential pairing of homologs across the genome and multisomic segregation relatively more common in chromosome 8. The apospory-specific genomic region (ASGR) was mapped to a region of reduced recombination on B. decumbens chromosome 5. The Pennisetum squamulatum (L.) R.Br. PsASGR-BABY BOOM-like (psASGR–BBML)-specific primer pair p779/p780 was in perfect linkage with the ASGR in the F1 mapping population and diagnostic for reproductive mode in a diversity panel of known sexual and apomict Brachiaria (Trin.) Griseb. and P. maximum Jacq. germplasm accessions and cultivars. These findings indicate that ASGR–BBML gene sequences are highly conserved across the Paniceae and add further support for the postulation of the ASGR–BBML as candidate genes for the apomictic function of parthenogenesis. PMID:27206716

  14. Accounting for discovery bias in genomic prediction

    Science.gov (United States)

    Our objective was to evaluate an approach to mitigating discovery bias in genomic prediction. Accuracy may be improved by placing greater emphasis on regions of the genome expected to be more influential on a trait. Methods emphasizing regions result in a phenomenon known as “discovery bias” if info...

  15. A genome wide association study for backfat thickness in Italian Large White pigs highlights new regions affecting fat deposition including neuronal genes

    Directory of Open Access Journals (Sweden)

    Fontanesi Luca

    2012-11-01

    Full Text Available Abstract Background Carcass fatness is an important trait in most pig breeding programs. Following market requests, breeding plans for fresh pork consumption are usually designed to reduce carcass fat content and increase lean meat deposition. However, the Italian pig industry is mainly devoted to the production of Protected Designation of Origin dry cured hams: pigs are slaughtered at around 160 kg of live weight and the breeding goal aims at maintaining fat coverage, measured as backfat thickness, to avoid excessive desiccation of the hams. This objective has shaped the genetic pool of Italian heavy pig breeds for a few decades. In this study we applied a selective genotyping approach within a population of ~12,000 performance tested Italian Large White pigs. Within this population, we selectively genotyped 304 pigs with extreme and divergent backfat thickness estimated breeding value by the Illumina PorcineSNP60 BeadChip and performed a genome wide association study to identify loci associated with this trait. Results We identified 4 single nucleotide polymorphisms with P≤5.0E-07 and additional 119 ones with 5.0E-07 Conclusions Further investigations are needed to evaluate the effects of the identified single nucleotide polymorphisms associated with backfat thickness on other traits as a pre-requisite for practical applications in breeding programs. Reported results could improve our understanding of the biology of fat metabolism and deposition that could also be relevant for other mammalian species including humans, confirming the role of neuronal genes on obesity.

  16. Congenital diaphragmatic hernia and chromosome 15q26: Determination of a candidate region by use of fluorescent in situ hybridization and array-based comparative genomic hybridization

    NARCIS (Netherlands)

    M. Klaassens (Merel); M.F. van Dooren (Marieke); H.J.F.M.M. Eussen (Bert); H. Douben (Hannie); A. Dekker (Anita); C. Lee (Charles); P.K. Donahoe; R-J.H. Galjaard (Robert-Jan); N.N.T. Goemaere (Natascha); R.R. de Krijger (Ronald); C.H. Wouters (Cokkie); J. Wauters (Jan); B.A. Oostra (Ben); D. Tibboel (Dick); J.E.M.M. de Klein (Annelies)

    2005-01-01

    Congenital diaphragmatic hernia (CDH) has an incidence of 1 in 3,000 births and a high mortality rate (33%-58%). Multifactorial inheritance, teratogenic agents, and genetic abnormalities have all been suggested as possible etiologic factors. To define candidate regions for CDH, we

  17. The C15orf2 gene in the Prader-Willi syndrome region is subject to genomic imprinting and positive selection

    NARCIS (Netherlands)

    Wawrzik, Michaela; Unmehopa, Unga Arifa; Swaab, Dick Frans; van de Nes, Johannes; Buiting, Karin; Horsthemke, Bernhard

    2010-01-01

    C15orf2 (Chromosome 15 open reading frame 2) is an intronless gene, which is located in the Prader-Willi syndrome (PWS) chromosomal region on human chromosome 15. Mice do not have an orthologous gene. Here we show that expression of C15orf2 in the fetal human brain is imprinted. Using Western blot

  18. The bonobo genome compared with the chimpanzee and human genomes

    Science.gov (United States)

    Prüfer, Kay; Munch, Kasper; Hellmann, Ines; Akagi, Keiko; Miller, Jason R.; Walenz, Brian; Koren, Sergey; Sutton, Granger; Kodira, Chinnappa; Winer, Roger; Knight, James R.; Mullikin, James C.; Meader, Stephen J.; Ponting, Chris P.; Lunter, Gerton; Higashino, Saneyuki; Hobolth, Asger; Dutheil, Julien; Karakoç, Emre; Alkan, Can; Sajjadian, Saba; Catacchio, Claudia Rita; Ventura, Mario; Marques-Bonet, Tomas; Eichler, Evan E.; André, Claudine; Atencia, Rebeca; Mugisha, Lawrence; Junhold, Jörg; Patterson, Nick; Siebauer, Michael; Good, Jeffrey M.; Fischer, Anne; Ptak, Susan E.; Lachmann, Michael; Symer, David E.; Mailund, Thomas; Schierup, Mikkel H.; Andrés, Aida M.; Kelso, Janet; Pääbo, Svante

    2012-01-01

    Two African apes are the closest living relatives of humans: the chimpanzee (Pan troglodytes) and the bonobo (Pan paniscus). Although they are similar in many respects, bonobos and chimpanzees differ strikingly in key social and sexual behaviours1–4, and for some of these traits they show more similarity with humans than with each other. Here we report the sequencing and assembly of the bonobo genome to study its evolutionary relationship with the chimpanzee and human genomes. We find that more than three per cent of the human genome is more closely related to either the bonobo or the chimpanzee genome than these are to each other. These regions allow various aspects of the ancestry of the two ape species to be reconstructed. In addition, many of the regions that overlap genes may eventually help us understand the genetic basis of phenotypes that humans share with one of the two apes to the exclusion of the other. PMID:22722832

  19. Comparative genomics reveals insights into avian genome evolution and adaptation

    DEFF Research Database (Denmark)

    Zhang, Guojie; Li, Cai; Li, Qiye

    2014-01-01

    Birds are the most species-rich class of tetrapod vertebrates and have wide relevance across many research fields. We explored bird macroevolution using full genomes from 48 avian species representing all major extant clades. The avian genome is principally characterized by its constrained size...... this pattern of conservation, we detected many non-neutral evolutionary changes in protein-coding genes and noncoding regions. These analyses reveal that pan-avian genomic diversity covaries with adaptations to different lifestyles and convergent evolution of traits....

  20. [Nutrition genomics].

    Science.gov (United States)

    Sedová, L; Seda, O

    2004-01-01

    The importance of nutrition for human health and its influence on the onset and course of many diseases are nowadays considered as proven. Only the recent development of molecular biology and biochemical methods allows the elucidation of the molecular mechanisms of diet constituent actions and their subsequent effect on homeostatic mechanisms in health and disease states. The availability of the draft human genome sequence as well as the genome sequences of model organisms, combined with the functional and integrative genomics approaches of systems biology, bring about the possibility to identify alleles and haplotypes responsible for specific reaction to the dietary challenge in susceptible individuals. Such complex interactions are studied within the newly conceived field, the nutrition genomics (nutrigenomics). Using the tools of highly parallel analyses of transcriptome, proteome and metabolome, the nutrition genomics pursues its ultimate goal, i.e. the individualized diet, respecting not only quantitative and qualitative nutritional needs and the actual health status, but also the genetic predispositions of an individual. This approach should lead to prevention of the onset of such diseases as obesity, hypertension or type 2 diabetes, or enhance the efficiency of their therapy.

  1. Malaria Genome Sequencing Project

    Science.gov (United States)

    2004-01-01

    proteins in plastid segregation mutants of Toxoplasma gondii. L. Biot. Parasito . Today 11, 1-4 (1995). Chem. 276, 28436-28442 (2001). 11. Su, X. et al... parasito - gene mapping studies have shown that regions of gene synteny exist phorous vacuole membrane29 . between species of rodent malaria9 and between...Carucci, D. J. Rodent models of malaria in the genomics era. Trends Parasito , 18, selection of karyotype mutants and non-gametocyte producer mutants

  2. Marine genomics

    DEFF Research Database (Denmark)

    Oliveira Ribeiro, Ângela Maria; Foote, Andrew David; Kupczok, Anne

    2017-01-01

    evolutionary biology of non-model organisms to species of commercial relevance for fishing, aquaculture and biomedicine. Instead of providing an exhaustive list of available genomic data, we rather set to present contextualized examples that best represent the current status of the field of marine genomics.......Marine ecosystems occupy 71% of the surface of our planet, yet we know little about their diversity. Although the inventory of species is continually increasing, as registered by the Census of Marine Life program, only about 10% of the estimated two million marine species are known. This lag......-throughput sequencing approaches have been helping to improve our knowledge of marine biodiversity, from the rich microbial biota that forms the base of the tree of life to a wealth of plant and animal species. In this review, we present an overview of the applications of genomics to the study of marine life, from...

  3. Listeria Genomics

    Science.gov (United States)

    Cabanes, Didier; Sousa, Sandra; Cossart, Pascale

    The opportunistic intracellular foodborne pathogen Listeria monocytogenes has become a paradigm for the study of host-pathogen interactions and bacterial adaptation to mammalian hosts. Analysis of L. monocytogenes infection has provided considerable insight into how bacteria invade cells, move intracellularly, and disseminate in tissues, as well as tools to address fundamental processes in cell biology. Moreover, the vast amount of knowledge that has been gathered through in-depth comparative genomic analyses and in vivo studies makes L. monocytogenes one of the most well-studied bacterial pathogens. This chapter provides an overview of progress in the exploration of genomic, transcriptomic, and proteomic data in Listeria spp. to understand genome evolution and diversity, as well as physiological aspects of metabolism used by bacteria when growing in diverse environments, in particular in infected hosts.

  4. Genetic variability of cloned Cytauxzoon felis ribosomal RNA ITS1 and ITS2 genomic regions from domestic cats with varied clinical outcomes from five states.

    Science.gov (United States)

    Pollard, Dana A; Reichard, Mason V; Cohn, Leah A; James, Andrea M; Holman, Patricia J

    2017-09-15

    Cytauxzoon felis is a tick-borne hemoparasite that causes cytauxzoonosis in domestic cats in the United States. Historically, feline cytauxzoonosis was reported to be nearly always fatal. However, increasing evidence of cats surviving acute infection and/or harboring a chronic, subclinical infection has suggested the existence of different C. felis strains that may vary in pathogenicity. In this study, the intraspecific variation of the C. felis first and second ribosomal RNA internal transcribed spacer (ITS1, ITS2) regions was assessed for any clinical outcome or geographic associations. Sequence data were obtained for 122 C. felis ITS1 and ITS2 clones from 41 domestic cat blood samples from Arkansas, Kansas, Missouri, Oklahoma, and Texas. Seven previously reported ITS1 region sequences were found, and a previously undescribed 23-bp insert was detected in cloned ITS1 sequences from a domestic cat in Missouri and two cats in Oklahoma. Four previously reported ITS2 region sequences were identified, and a 40-bp insert similar to that previously reported in C. felis of a domestic cat from Arkansas and pumas was detected in 18 cloned C. felis sequences from 12 domestic cats. One clone contained both the 23-bp insert and 40-bp insert within the ITS1 and ITS2 regions, respectively. Combined ITS1 and ITS2 sequence genotypes revealed that C. felis sequences from 27 cats (72/122 clones) corresponded to four previously described genotypes, ITSa, ITSc, ITSd, and ITSn. Five clones with the novel 23-bp insert from three cat isolates represented two new genotypes, ITSaa and ITSbb. Genotypes ITScc, ITSdd, ITSee, ITSff, ITSgg, and ITShh denoted 13 clones that matched prior sequences but had no previously assigned genotype. Genotypes ITSii through ITStt comprised 32 clones that were similar to, but did not exactly match, previously described genotypes. Twenty-five cats had C. felis infections with multiple ITS genotypes. Considerable C. felis genetic diversity was revealed with no

  5. PromBase: a web resource for various genomic features and predicted promoters in prokaryotic genomes

    Directory of Open Access Journals (Sweden)

    Bansal Manju

    2011-07-01

    Full Text Available Abstract Background As more and more genomes are being sequenced, an overview of their genomic features and annotation of their functional elements, which control the expression of each gene or transcription unit of the genome, is a fundamental challenge in genomics and bioinformatics. Findings Relative stability of DNA sequence has been used to predict promoter regions in 913 microbial genomic sequences with GC-content ranging from 16.6% to 74.9%. Irrespective of the genome GC-content the relative stability based promoter prediction method has already been proven to be robust in terms of recall and precision. The predicted promoter regions for the 913 microbial genomes have been accumulated in a database called PromBase. Promoter search can be carried out in PromBase either by specifying the gene name or the genomic position. Each predicted promoter region has been assigned to a reliability class (low, medium, high, very high and highest) based on the difference between its average free energy and the downstream region. The recall and precision values for each class are shown graphically in PromBase. In addition, PromBase provides detailed information about base composition, CDS and CG/TA skews for each genome and various DNA sequence dependent structural properties (average free energy, curvature and bendability) in the vicinity of all annotated translation start sites (TLS). Conclusion PromBase is a database, which contains predicted promoter regions and detailed analysis of various genomic features for 913 microbial genomes. PromBase can serve as a valuable resource for comparative genomics study and help the experimentalist to rapidly access detailed information on various genomic features and putative promoter regions in any given genome. This database is freely accessible for academic and non-academic users via the worldwide web http://nucleix.mbu.iisc.ernet.in/prombase/.
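
    As a rough illustration of the class assignment described in this record, the sketch below bins the difference between the mean free energy of a predicted promoter region and that of its downstream region into the five named classes. The cut-off values, the sign convention (promoter minus downstream), and the function names are assumptions made for illustration only; the record does not state the thresholds PromBase actually uses.

```python
from bisect import bisect_right

# Hypothetical cut-offs (kcal/mol) on the free-energy difference.
# These are illustrative placeholders, NOT values taken from PromBase.
CUTOFFS = [1.0, 2.0, 3.0, 4.0]
CLASSES = ["low", "medium", "high", "very high", "highest"]


def mean(values):
    """Arithmetic mean of a non-empty sequence of numbers."""
    return sum(values) / len(values)


def reliability_class(promoter_energies, downstream_energies, cutoffs=CUTOFFS):
    """Bin (mean promoter energy - mean downstream energy) into a class.

    A larger positive difference means the promoter region is less stable
    than the downstream region, which is treated here (by assumption) as
    a more reliable prediction.
    """
    diff = mean(promoter_energies) - mean(downstream_energies)
    # bisect_right counts how many cut-offs the difference reaches or exceeds.
    return CLASSES[bisect_right(cutoffs, diff)]


if __name__ == "__main__":
    # Made-up per-position free energies for one predicted promoter.
    print(reliability_class([-17.0, -18.0], [-20.0, -20.0]))
```

    With real data, the two input lists would hold per-window free-energy values computed from the sequence; only the binning step is sketched here.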

  6. Cephalopod genomics

    DEFF Research Database (Denmark)

    Albertin, Caroline B.; Bonnaud, Laure; Brown, C. Titus

    2012-01-01

    The Cephalopod Sequencing Consortium (CephSeq Consortium) was established at a NESCent Catalysis Group Meeting, "Paths to Cephalopod Genomics-Strategies, Choices, Organization," held in Durham, North Carolina, USA on May 24-27, 2012. Twenty-eight participants representing nine countries (Austri...... in this white paper......., Australia, China, Denmark, France, Italy, Japan, Spain and the USA) met to address the pressing need for genome sequencing of cephalopod mollusks. This group, drawn from cephalopod biologists, neuroscientists, developmental and evolutionary biologists, materials scientists, bioinformaticians and researchers...

  7. Analysis of gene order data supports vertical inheritance of the leukotoxin operon and genome rearrangements in the 5' flanking region in genus Mannheimia

    DEFF Research Database (Denmark)

    Larsen, Jesper; Kuhnert, Peter; Frey, Joachim

    2007-01-01

    examined the gene order in the 5' flanking region of the leukotoxin operon and found that the 5' flanking gene strings, hslVU-lapB-artJ-lktC and xylAB-lktC, are peculiar to M. haemolytica + M. glucosida and M. granulomatis, respectively, whereas the gene string hslVU-lapB-lktC is present in M. ruminalis...... than the hslVU-lapB-artJ-lktC and xylAB-lktC gene strings. The presence of (remnants of) the ancient gene string hslVU-lapB-lktC among any subclades within genus Mannheimia supports that it has been vertically inherited from the last common ancestor of genus Mannheimia to any ancestor of the diverging......, the supposed sister group of M. haemolytica + M. glucosida, and in the most ancient subclade M. varigena. In M. granulomatis, we found remnants of the gene string hslVU-lapB-lktC in the xylB-lktC intergenic region. CONCLUSIONS: These observations indicate that the gene string hslVU-lapB-lktC is more ancient...

  8. Genomic rearrangements and diseases

    OpenAIRE

    Loviglio, M. N.

    2016-01-01

    Copy number variations (CNVs) are major contributors to genomic imbalance disorders. On the short arm of chromosome 16, CNVs of the distal 220 kb BP2-BP3 region show a mirror effect on BMI and head size, and association with autism and schizophrenia, as previously reported for the proximal 600 kb BP4-BP5 deletion and duplication. These two CNV-prone regions at 16p11.2 are also reciprocally engaged in complex chromatin looping, successfully confirmed by 4C-seq, FISH, Hi-C and concomitant...

  9. QTL Mapping in Three Connected Populations Reveals a Set of Consensus Genomic Regions for Low Temperature Germination Ability in Zea mays L.

    Science.gov (United States)

    Li, Xuhui; Wang, Guihua; Fu, Junjie; Li, Li; Jia, Guangyao; Ren, Lisha; Lubberstedt, Thomas; Wang, Guoying; Wang, Jianhua; Gu, Riliang

    2018-01-01

    Improving seed vigor in response to cold stress is an important breeding objective in maize that allows early sowing. Using two cold tolerant inbred lines 220 and P9-10 and two susceptible lines Y1518 and PH4CV, three connected F2:3 populations were generated for detecting quantitative trait loci (QTL) related to seed low-temperature germination ability. At 10°C, two germination traits (emergence rate and germination index) were collected from a sand bed and three seedling traits (seedling root length, shoot length, and total length) were extracted from paper rolls. Significant correlations were found among all traits in all populations. Via single-population analysis, 43 QTL were detected with explained phenotypic variance of 0.62%∼39.44%. Seventeen QTL explained more than 10% phenotypic variance; of them sixteen (94.12%) inherited favorable alleles from the tolerant lines. After constructing a consensus map, three meta-QTL (mQTL) were identified to include at least two initial QTL from different populations. mQTL1-1 included seven initial QTL for both germination and seedling traits, with three explaining more than 30% phenotypic variance. mQTL2-1 and mQTL9-1 covered two to three initial QTL. The favorable alleles of the QTL within these three mQTL regions were all inherited from the tolerant lines 220 and P9-10. These results provided a basis for cloning of genes underlying the mQTL regions to uncover the molecular mechanisms of maize cold tolerance during germination. PMID:29445387

  10. QTL Mapping in Three Connected Populations Reveals a Set of Consensus Genomic Regions for Low Temperature Germination Ability in Zea mays L.

    Directory of Open Access Journals (Sweden)

    Xuhui Li

    2018-01-01

    Full Text Available Improving seed vigor in response to cold stress is an important breeding objective in maize that allows early sowing. Using two cold tolerant inbred lines 220 and P9-10 and two susceptible lines Y1518 and PH4CV, three connected F2:3 populations were generated for detecting quantitative trait loci (QTL) related to seed low-temperature germination ability. At 10°C, two germination traits (emergence rate and germination index) were collected from a sand bed and three seedling traits (seedling root length, shoot length, and total length) were extracted from paper rolls. Significant correlations were found among all traits in all populations. Via single-population analysis, 43 QTL were detected with explained phenotypic variance of 0.62%∼39.44%. Seventeen QTL explained more than 10% phenotypic variance; of them sixteen (94.12%) inherited favorable alleles from the tolerant lines. After constructing a consensus map, three meta-QTL (mQTL) were identified to include at least two initial QTL from different populations. mQTL1-1 included seven initial QTL for both germination and seedling traits, with three explaining more than 30% phenotypic variance. mQTL2-1 and mQTL9-1 covered two to three initial QTL. The favorable alleles of the QTL within these three mQTL regions were all inherited from the tolerant lines 220 and P9-10. These results provided a basis for cloning of genes underlying the mQTL regions to uncover the molecular mechanisms of maize cold tolerance during germination.

  11. Construction of a high-density DArTseq SNP-based genetic map and identification of genomic regions with segregation distortion in a genetic population derived from a cross between feral and cultivated-type watermelon.

    Science.gov (United States)

    Ren, Runsheng; Ray, Rumiana; Li, Pingfang; Xu, Jinhua; Zhang, Man; Liu, Guang; Yao, Xiefeng; Kilian, Andrzej; Yang, Xingping

    2015-08-01

    Watermelon [Citrullus lanatus (Thunb.) Matsum. & Nakai] is an economically important vegetable crop grown extensively worldwide. To facilitate the identification of agronomically important traits and provide new information for genetic and genomic research on this species, a high-density genetic linkage map of watermelon was constructed using an F2 population derived from a cross between elite watermelon cultivar K3 and wild watermelon germplasm PI 189225. Based on a sliding window approach, a total of 1,161 bin markers representing 3,465 SNP markers were mapped onto 11 linkage groups corresponding to the chromosome pair number of watermelon. The total length of the genetic map is 1,099.2 cM, with an average distance between bins of 1.0 cM. The number of markers in each chromosome varies from 62 in chromosome 07 to 160 in chromosome 05. The length of individual chromosomes ranged between 61.8 cM for chromosome 07 and 140.2 cM for chromosome 05. A total of 616 SNP bin markers showed significant segregation distortion, of which 513 were skewed toward the watermelon cultivar K3 allele and 103 were skewed toward PI 189225. The number of SNPs and InDels per Mb varied considerably across the segregation distorted regions (SDRs) on each chromosome, and a mixture of dense and sparse SNPs and InDel SDRs coexisted on some chromosomes, suggesting that SDRs were randomly distributed throughout the genome. Recombination rates varied greatly among chromosomes, from 2.0 to 4.2 centimorgans per megabase (cM/Mb). An inconsistency was found between the genetic and physical positions on the map for a segment on chromosome 11. The high-density genetic map described in the present study will facilitate fine mapping of quantitative trait loci, the identification of candidate genes, map-based cloning, as well as marker-assisted selection (MAS) in watermelon breeding programs.

  12. Genome Imprinting

    Indian Academy of Sciences (India)

    Resonance – Journal of Science Education, Volume 5, Issue 9, September 2000, pp 49-57. General Article. Genome Imprinting - The Silencing of ... M T Tanuja. Drosophila Stock Centre, Department of Studies in Zoology, University of Mysore, Manasagangotri, Mysore 570 006, India.

  13. Genome Imprinting

    Indian Academy of Sciences (India)

    ring pathological condition cystic fibrosis is due to inheritance of both copies of chromosome 7 from the mother. Similarly, Prader-Willi syndrome in humans is due to the inheritance of both copies of chromosome 15 from the mother. Human Triploids. The triploid (i.e. 3 copies of the haploid genome are present instead of the ...

  14. genome editing

    Indian Academy of Sciences (India)

    2016-02-11

    Feb 11, 2016 ... What history tells us. XL. The success story of the expression 'genome editing'. MICHEL MORANGE. Centre Cavaillès, République des Savoirs: Lettres, Sciences, Philosophie USR 3608, Ecole. Normale Supérieure, 29 Rue d'Ulm, 75230, Paris Cedex 05, France. (Fax, 33-144-323941; Email, ...

  15. Ancient genomics

    DEFF Research Database (Denmark)

    Der Sarkissian, Clio; Allentoft, Morten Erik; Avila Arcos, Maria del Carmen

    2015-01-01

    The past decade has witnessed a revolution in ancient DNA (aDNA) research. Although the field's focus was previously limited to mitochondrial DNA and a few nuclear markers, whole genome sequences from the deep past can now be retrieved. This breakthrough is tightly connected to the massive sequen...

  16. Comparative Genomics

    Indian Academy of Sciences (India)

    Resonance – Journal of Science Education, Volume 11, Issue 8, August 2006, pp 22-40. General Article. Comparative Genomics - A Powerful New Tool in Biology. Anand K Bachhawat.

  17. Mapping the regions carrying the three contiguous antibiotic resistance genes aadE, sat4, and aphA-3 in the genomes of staphylococci.

    Science.gov (United States)

    Derbise, A; Aubert, S; El Solh, N

    1997-01-01

    Tn5405 (12 kb) is a staphylococcal composite transposon delimited by two inverted copies of IS1182, one of which contains IS1181. The internal part of this transposon carries three antibiotic resistance genes, aphA-3, aadE, and sat4, and three open reading frames (ORFs), orfx, orfy, and orfz, of unknown function. The dispersion of Tn5405 and the genes and ORFs included in this transposon were investigated in 50 epidemiologically unrelated staphylococci carrying aphA-3. Twenty-three maps, distinguishable by the presence or absence of the investigated genes or ORFs and/or by the sizes of the restriction fragments carrying them, were identified. Four isolates carried Tn5405, and 15 other isolates contained a Tn5405-related element. IS1182 was not detected in the aphA-3 regions mapped in 31 isolates which carried the following combinations: orfx, orfy, aadE, sat4, and aphA-3 +/- orfz; orfy, aadE, sat4, and aphA-3 +/- orfz; and aadE, sat4, aphA-3, and orfz. In all isolates, the genes and ORFs investigated were in relative positions similar to those in Tn5405. Thus, the internal part of Tn5405 appeared to be partially conserved with the maintenance, in all of the isolates, of at least the three antibiotic resistance genes. PMID:9145863

  18. Sub-megabase resolution tiling (SMRT) array-based comparative genomic hybridization profiling reveals novel gains and losses of chromosomal regions in Hodgkin Lymphoma and Anaplastic Large Cell Lymphoma cell lines

    Directory of Open Access Journals (Sweden)

    Lam Wan L

    2008-01-01

    Full Text Available Abstract Background Hodgkin lymphoma (HL) and Anaplastic Large Cell Lymphoma (ALCL) are forms of malignant lymphoma defined by unique morphologic, immunophenotypic, genotypic, and clinical characteristics, but both overexpress CD30. We used sub-megabase resolution tiling (SMRT) array-based comparative genomic hybridization to screen HL-derived cell lines (KMH2 and L428) and ALCL cell lines (DEL and SR-786) in order to identify disease-associated gene copy number gains and losses. Results Significant copy number gains and losses were observed on several chromosomes in all four cell lines. Assessment of copy number alterations with 26,819 DNA segments identified an average of 20 genetic alterations. Of the recurrent minimally altered regions identified, 11 (55%) were within previously published regions of chromosomal alterations in HL and ALCL cell lines while 9 (45%) were novel alterations not previously reported. HL cell lines L428 and KMH2 shared gains in chromosome cytobands 2q23.1-q24.2, 7q32.2-q36.3, 9p21.3-p13.3, 12q13.13-q14.1, and losses in 13q12.13-q12.3 and 18q21.32-q23. ALCL cell lines SR-786 and DEL showed gains in cytobands 5p15.32-p14.3, 20p12.3-q13.11, and 20q13.2-q13.32. Both pairs of HL and ALCL cell lines showed losses in 18q21.32-18q23. Conclusion This study is considered to be the first one describing HL and ALCL cell line genomes at sub-megabase resolution. This high-resolution analysis allowed us to propose novel candidate target genes that could potentially contribute to the pathogenesis of HL and ALCL. FISH was used to confirm the amplification of all three isoforms of the trypsin gene (PRSS1/PRSS2/PRSS3) in KMH2 and L428 (HL) and DEL (ALCL) cell lines. These are novel findings that have not been previously reported in the lymphoma literature, and open up an entirely new area of research that has not been previously associated with lymphoma biology. The findings raise interesting possibilities about the role of signaling

  19. Personal genomics services: whose genomes?

    Science.gov (United States)

    Gurwitz, David; Bregman-Eschet, Yael

    2009-07-01

    New companies offering personal whole-genome information services over the internet are dynamic and highly visible players in the personal genomics field. For fees currently ranging from US$399 to US$2500 and a vial of saliva, individuals can now purchase online access to their individual genetic information regarding susceptibility to a range of chronic diseases and phenotypic traits based on a genome-wide SNP scan. Most of the companies offering such services are based in the United States, but their clients may come from nearly anywhere in the world. Although the scientific validity, clinical utility and potential future implications of such services are being hotly debated, several ethical and regulatory questions related to direct-to-consumer (DTC) marketing strategies of genetic tests have not yet received sufficient attention. For example, how can we minimize the risk of unauthorized third parties submitting other people's DNA for testing? Another pressing question concerns the ownership of (genotypic and phenotypic) information, as well as the unclear legal status of customers regarding their own personal information. Current legislation in the US and Europe falls short of providing clear answers to these questions. Until the regulation of personal genomics services catches up with the technology, we call upon commercial providers to self-regulate and coordinate their activities to minimize potential risks to individual privacy. We also point out some specific steps, along the trustee model, that providers of DTC personal genomics services as well as regulators and policy makers could consider for addressing some of the concerns raised below.

  20. Nutritional genomics.

    Science.gov (United States)

    Ordovas, Jose M; Corella, Dolores

    2004-01-01

    Nutritional genomics has tremendous potential to change the future of dietary guidelines and personal recommendations. Nutrigenetics will provide the basis for personalized dietary recommendations based on the individual's genetic make up. This approach has been used for decades for certain monogenic diseases; however, the challenge is to implement a similar concept for common multifactorial disorders and to develop tools to detect genetic predisposition and to prevent common disorders decades before their manifestation. The preliminary results involving gene-diet interactions for cardiovascular diseases and cancer are promising, but mostly inconclusive. Success in this area will require the integration of different disciplines and investigators working on large population studies designed to adequately investigate gene-environment interactions. Despite the current difficulties, preliminary evidence strongly suggests that the concept should work and that we will be able to harness the information contained in our genomes to achieve successful aging using behavioral changes; nutrition will be the cornerstone of this endeavor.

  1. The genome of Eucalyptus grandis

    Energy Technology Data Exchange (ETDEWEB)

    Myburg, Alexander A.; Grattapaglia, Dario; Tuskan, Gerald A.; Hellsten, Uffe; Hayes, Richard D.; Grimwood, Jane; Jenkins, Jerry; Lindquist, Erika; Tice, Hope; Bauer, Diane; Goodstein, David M.; Dubchak, Inna; Poliakov, Alexandre; Mizrachi, Eshchar; Kullan, Anand R. K.; Hussey, Steven G.; Pinard, Desre; van der Merwe, Karen; Singh, Pooja; van Jaarsveld, Ida; Silva-Junior, Orzenil B.; Togawa, Roberto C.; Pappas, Marilia R.; Faria, Danielle A.; Sansaloni, Carolina P.; Petroli, Cesar D.; Yang, Xiaohan; Ranjan, Priya; Tschaplinski, Timothy J.; Ye, Chu-Yu; Li, Ting; Sterck, Lieven; Vanneste, Kevin; Murat, Florent; Soler, Marçal; Clemente, Hélène San; Saidi, Naijib; Cassan-Wang, Hua; Dunand, Christophe; Hefer, Charles A.; Bornberg-Bauer, Erich; Kersting, Anna R.; Vining, Kelly; Amarasinghe, Vindhya; Ranik, Martin; Naithani, Sushma; Elser, Justin; Boyd, Alexander E.; Liston, Aaron; Spatafora, Joseph W.; Dharmwardhana, Palitha; Raja, Rajani; Sullivan, Christopher; Romanel, Elisson; Alves-Ferreira, Marcio; Külheim, Carsten; Foley, William; Carocha, Victor; Paiva, Jorge; Kudrna, David; Brommonschenkel, Sergio H.; Pasquali, Giancarlo; Byrne, Margaret; Rigault, Philippe; Tibbits, Josquin; Spokevicius, Antanas; Jones, Rebecca C.; Steane, Dorothy A.; Vaillancourt, René E.; Potts, Brad M.; Joubert, Fourie; Barry, Kerrie; Pappas, Georgios J.; Strauss, Steven H.; Jaiswal, Pankaj; Grima-Pettenati, Jacqueline; Salse, Jérôme; Van de Peer, Yves; Rokhsar, Daniel S.; Schmutz, Jeremy

    2014-06-11

    Eucalypts are the world's most widely planted hardwood trees. Their broad adaptability, rich species diversity, fast growth and superior multipurpose wood have made them a global renewable resource of fiber and energy that mitigates human pressures on natural forests. We sequenced and assembled >94% of the 640 Mbp genome of Eucalyptus grandis into its 11 chromosomes. A set of 36,376 protein coding genes were predicted, revealing that 34% occur in tandem duplications, the largest proportion found thus far in any plant genome. Eucalypts also show the highest diversity of genes for plant specialized metabolism that act as chemical defence against biotic agents and provide unique pharmaceutical oils. Resequencing of a set of inbred tree genomes revealed regions of strongly conserved heterozygosity, likely hotspots of inbreeding depression. The resequenced genome of the sister species E. globulus underscored the high inter-specific genome colinearity despite substantial genome size variation in the genus. The genome of E. grandis is the first reference for the early diverging Rosid order Myrtales and is placed here basal to the Eurosids. This resource expands knowledge on the unique biology of large woody perennials and provides a powerful tool to accelerate comparative biology, breeding and biotechnology.

  2. Visualization for genomics: the Microbial Genome Viewer.

    NARCIS (Netherlands)

    Kerkhoven, R.; Enckevort, F.H.J. van; Boekhorst, J.; Molenaar, D; Siezen, R.J.

    2004-01-01

    SUMMARY: A Web-based visualization tool, the Microbial Genome Viewer, is presented that allows the user to combine complex genomic data in a highly interactive way. This Web tool enables the interactive generation of chromosome wheels and linear genome maps from genome annotation data stored in a

  3. Unleashing the Genome of Brassica Rapa

    Science.gov (United States)

    Tang, Haibao; Lyons, Eric

    2012-01-01

    The completion and release of the Brassica rapa genome is of great benefit to researchers of the Brassicas, Arabidopsis, and genome evolution. While its lineage is closely related to the model organism Arabidopsis thaliana, the Brassicas experienced a whole genome triplication subsequent to their divergence. This event contemporaneously created three copies of its ancestral genome, which had diploidized through the process of homeologous gene loss known as fractionation. By the fractionation of homeologous gene content and genetic regulatory binding sites, Brassica’s genome is well placed to use comparative genomic techniques to identify syntenic regions, homeologous gene duplications, and putative regulatory sequences. Here, we use the comparative genomics platform CoGe to perform several different genomic analyses with which to study structural changes of its genome and dynamics of various genetic elements. Starting with whole genome comparisons, the Brassica paleohexaploidy is characterized, syntenic regions with A. thaliana are identified, and the TOC1 gene in the circadian rhythm pathway from A. thaliana is used to find duplicated orthologs in B. rapa. These TOC1 genes are further analyzed to identify conserved non-coding sequences that contain cis-acting regulatory elements and promoter sequences previously implicated in circadian rhythmicity. Each “cookbook style” analysis includes a step-by-step walk-through with links to CoGe to quickly reproduce each step of the analytical process. PMID:22866056

  4. Mitochondria in complex psychiatric disorders: Lessons from mouse models of 22q11.2 deletion syndrome: Hemizygous deletion of several mitochondrial genes in the 22q11.2 genomic region can lead to symptoms associated with neuropsychiatric disease.

    Science.gov (United States)

    Devaraju, Prakash; Zakharenko, Stanislav S

    2017-02-01

    Mitochondrial ATP synthesis, calcium buffering, and trafficking affect neuronal function and survival. Several genes implicated in mitochondrial functions map within the genomic region associated with 22q11.2 deletion syndrome (22q11DS), which is a key genetic cause of neuropsychiatric diseases. Although neuropsychiatric diseases impose a serious health and economic burden, their etiology and pathogenesis remain largely unknown because of the dearth of valid animal models and the challenges in investigating the pathophysiology in neuronal circuits. Mouse models of 22q11DS are becoming valid tools for studying human psychiatric diseases, because they have hemizygous deletions of the genes that are deleted in patients and exhibit neuronal and behavioral abnormalities consistent with neuropsychiatric disease. The deletion of some 22q11DS genes implicated in mitochondrial function leads to abnormal neuronal and synaptic function. Herein, we summarize recent findings on mitochondrial dysfunction in 22q11DS and extend those findings to the larger context of schizophrenia and other neuropsychiatric diseases. © 2017 WILEY Periodicals, Inc.

  5. The phylogenetic analysis of VP1 genomic region in foot-and-mouth disease virus serotype O isolates in Sri Lanka reveals the existence of 'Srl-97', a newly named endemic lineage.

    Science.gov (United States)

    Abeyratne, S A E; Amarasekera, S S C; Ranaweera, L T; Salpadoru, T B; Thilakarathne, S M N K; Knowles, N J; Wadsworth, J; Puvanendiran, S; Kothalawala, H; Jayathilake, B K; Wijithasiri, H A; Chandrasena, M M P S K; Sooriyapathirana, S D S S

    2018-01-01

    Foot and mouth disease (FMD) has devastated the cattle industry in Sri Lanka many times in the past. Despite its seriousness, limited attempts have been made to understand the disease to ameliorate its effects; current recommendations for vaccines are based solely on immunological assessments rather than on molecular identification. The general belief is that the cattle population in Sri Lanka acquired the FMD virus (FMDV) strains via introductions from India. However, there could be endemic FMDV lineages circulating in Sri Lanka. To infer the phylogenetic relationships of the FMDV strains in the island, we sequenced the VP1 genomic region of the virus isolates collected during the 2014 outbreak together with a few reported cases in 2012 and 1997 and compared them to VP1 sequences from South Asia. The FMDV strains collected in the 2014 outbreak belonged to the lineage Ind-2001d of the topotype ME-SA. The strains collected in 2012 and 1997 belonged to another lineage called 'unnamed' by the World Reference Laboratory for Foot and Mouth Disease (WRLFMD). Based on the present analysis, we designate the lineage 'unnamed' as Srl-97, which we found endemic to Sri Lanka. The evolutionary rates of Srl-97 and Ind-2001d in Sri Lanka were estimated to be 0.0004 and 0.0046 substitutions/site/year, respectively, suggesting that Srl-97 evolves slowly.

  6. Association of polioviral proteins of the P2 genomic region with the viral replication complex and virus-induced membrane synthesis as visualized by electron microscopic immunocytochemistry and autoradiography.

    Science.gov (United States)

    Bienz, K; Egger, D; Pasamontes, L

    1987-09-01

    Using high resolution electron microscopic autoradiography and immunocytochemistry with monoclonal antibodies against poliovirus proteins of the P2 genomic region, the location of these proteins with respect to the virus-induced vesicle formation and the viral RNA synthesis was followed during the viral replication cycle. It was found that P2 proteins become rER associated soon after their synthesis. At the site of protein and rER interaction, electron-dense patches appear. Simultaneously, membrane protrusions grow and form vesicles which finally bud off, carrying the patches on their outer surface. As shown by autoradiography, these patches are the site of viral RNA replication and, therefore, they represent the poliovirus replication complex. The vesicles with the replication complex, including replicating and replicated viral RNA, move away from the rER to form a continuously growing vesiculated area in the center of the infected cell, where virus maturation takes place. A likely function of the 2C protein is to attach the replication complex, or some of its components, to the vesicular membranes.

  7. Simple sequence repeats in mycobacterial genomes

    Indian Academy of Sciences (India)

    Prakash

    J. Biosci. 32(1), January 2007. The list of microsatellite rich as well as poor regions in the five mycobacterial genomes. Local GC%. Repeat rich(+)/. Repeat poor(-). Total ORFs. Number of ... Simple sequence repeats in mycobacterial genomes. VATTIPALLY .... heat shock protein (grpE) (15839737), heat shock protein (dnaJ) ...

  8. The Genome Russia project: closing the largest remaining omission on the world Genome map.

    Science.gov (United States)

    Oleksyk, Taras K; Brukhin, Vladimir; O'Brien, Stephen J

    2015-01-01

    We are witnessing the great era of genome exploration of the world, as genetic variation in people is being detailed across multiple varied world populations in an effort unprecedented since the first human genome sequence appeared in 2001. However, these efforts have yet to produce a comprehensive mapping of humankind, because important regions of modern human civilization remain unexplored. The Genome Russia Project promises to fill one of the largest gaps, the expansive regions across the Russian Federation, informing not just medical genomics of the territories, but also the migration settlements of historic and pre-historic Eurasian peoples.

  9. Characterization of the complete chloroplast genome of Platycarya strobilacea (Juglandaceae)

    Science.gov (United States)

    Jing Yan; Kai Han; Shuyun Zeng; Peng Zhao; Keith Woeste; Jianfang Li; Zhan-Lin Liu

    2017-01-01

    The whole chloroplast genome (cp genome) sequence of Platycarya strobilacea was characterized from Illumina paired-end sequencing data. The complete cp genome was 160,994 bp in length and contained a large single copy region (LSC) of 90,225 bp and a small single copy region (SSC) of 18,371 bp, which were separated by a pair of inverted repeat regions...

  10. Molecular cytogenetic and genomic analyses reveal new insights into the origin of the wheat B genome.

    Science.gov (United States)

    Zhang, Wei; Zhang, Mingyi; Zhu, Xianwen; Cao, Yaping; Sun, Qing; Ma, Guojia; Chao, Shiaoman; Yan, Changhui; Xu, Steven S; Cai, Xiwen

    2018-02-01

    This work pinpointed the goatgrass chromosomal segment in the wheat B genome using modern cytogenetic and genomic technologies, and provided novel insights into the origin of the wheat B genome. Wheat is a typical allopolyploid with three homoeologous subgenomes (A, B, and D). The donors of the subgenomes A and D had been identified, but not for the subgenome B. The goatgrass Aegilops speltoides (genome SS) has been controversially considered a possible candidate for the donor of the wheat B genome. However, the relationship of the Ae. speltoides S genome with the wheat B genome remains largely obscure. The present study assessed the homology of the B and S genomes using an integrative cytogenetic and genomic approach, and revealed the contribution of Ae. speltoides to the origin of the wheat B genome. We discovered noticeable homology between wheat chromosome 1B and Ae. speltoides chromosome 1S, but not between other chromosomes in the B and S genomes. An Ae. speltoides-originated segment spanning a genomic region of approximately 10.46 Mb was detected on the long arm of wheat chromosome 1B (1BL). The Ae. speltoides-originated segment on 1BL was found to co-evolve with the rest of the B genome. Evidently, Ae. speltoides had been involved in the origin of the wheat B genome, but should not be considered an exclusive donor of this genome. The wheat B genome might have a polyphyletic origin with multiple ancestors involved, including Ae. speltoides. These novel findings will facilitate genome studies in wheat and other polyploids.

  11. Deleterious mutation accumulation in organelle genomes.

    Science.gov (United States)

    Lynch, M; Blanchard, J L

    1998-01-01

    It is well established on theoretical grounds that the accumulation of mildly deleterious mutations in nonrecombining genomes is a major extinction risk in obligately asexual populations. Sexual populations can also incur mutational deterioration in genomic regions that experience little or no recombination, i.e., autosomal regions near centromeres, Y chromosomes, and organelle genomes. Our results suggest, for a wide array of genes (transfer RNAs, ribosomal RNAs, and proteins) in a diverse collection of species (animals, plants, and fungi), an almost universal increase in the fixation probabilities of mildly deleterious mutations arising in mitochondrial and chloroplast genomes relative to those arising in the recombining nuclear genome. This enhanced width of the selective sieve in organelle genomes does not appear to be a consequence of relaxed selection, but can be explained by the decline in the efficiency of selection that results from the reduction of effective population size induced by uniparental inheritance. Because of the very low mutation rates of organelle genomes (on the order of 10^-4 per genome per year), the reduction in fitness resulting from mutation accumulation in such genomes is a very long-term process, not likely to imperil many species on time scales of less than a million years, but perhaps playing some role in phylogenetic lineage sorting on time scales of 10 to 100 million years.

  12. GTB - an online genome tolerance browser.

    Science.gov (United States)

    Shihab, Hashem A; Rogers, Mark F; Ferlaino, Michael; Campbell, Colin; Gaunt, Tom R

    2017-01-06

    Accurate methods capable of predicting the impact of single nucleotide variants (SNVs) are assuming ever increasing importance. There exists a plethora of in silico algorithms designed to help identify and prioritize SNVs across the human genome for further investigation. However, no tool exists to visualize the predicted tolerance of the genome to mutation, or the similarities between these methods. We present the Genome Tolerance Browser (GTB, http://gtb.biocompute.org.uk): an online genome browser for visualizing the predicted tolerance of the genome to mutation. The server summarizes several in silico prediction algorithms and conservation scores, including 13 genome-wide prediction algorithms and conservation scores, 12 non-synonymous prediction algorithms and four cancer-specific algorithms. The GTB enables users to visualize the similarities and differences between several prediction algorithms and to upload their own data as additional tracks, thereby facilitating the rapid identification of potential regions of interest.

  13. The Oxytricha trifallax macronuclear genome: a complex eukaryotic genome with 16,000 tiny chromosomes.

    Directory of Open Access Journals (Sweden)

    Estienne C Swart

    Full Text Available The macronuclear genome of the ciliate Oxytricha trifallax displays an extreme and unique eukaryotic genome architecture with extensive genomic variation. During sexual genome development, the expressed, somatic macronuclear genome is whittled down to the genic portion of a small fraction (∼5%) of its precursor "silent" germline micronuclear genome by a process of "unscrambling" and fragmentation. The tiny macronuclear "nanochromosomes" typically encode single, protein-coding genes (a small portion, 10%, encode 2-8 genes), have minimal noncoding regions, and are differentially amplified to an average of ∼2,000 copies. We report the high-quality genome assembly of ∼16,000 complete nanochromosomes (∼50 Mb haploid genome size) that vary from 469 bp to 66 kb long (mean ∼3.2 kb) and encode ∼18,500 genes. Alternative DNA fragmentation processes ∼10% of the nanochromosomes into multiple isoforms that usually encode complete genes. Nucleotide diversity in the macronucleus is very high (SNP heterozygosity is ∼4.0%), suggesting that Oxytricha trifallax may have one of the largest known effective population sizes of eukaryotes. Comparison to other ciliates with nonscrambled genomes and long macronuclear chromosomes (on the order of 100 kb) suggests several candidate proteins that could be involved in genome rearrangement, including domesticated MULE and IS1595-like DDE transposases. The assembly of the highly fragmented Oxytricha macronuclear genome is the first completed genome with such an unusual architecture. This genome sequence provides tantalizing glimpses into novel molecular biology and evolution. For example, Oxytricha maintains tens of millions of telomeres per cell and has also evolved an intriguing expansion of telomere end-binding proteins. In conjunction with the micronuclear genome in progress, the O. trifallax macronuclear genome will provide an invaluable resource for investigating programmed genome rearrangements, complementing

  14. The platypus genome unraveled.

    Science.gov (United States)

    O'Brien, Stephen J

    2008-06-13

    The genome of the platypus has been sequenced, assembled, and annotated by an international genomics team. Like the animal itself, the platypus genome contains an amalgam of mammal, reptile, and bird-like features.

  15. The Arabidopsis lyrata genome sequence and the basis of rapid genome size change

    Energy Technology Data Exchange (ETDEWEB)

    Hu, Tina T.; Pattyn, Pedro; Bakker, Erica G.; Cao, Jun; Cheng, Jan-Fang; Clark, Richard M.; Fahlgren, Noah; Fawcett, Jeffrey A.; Grimwood, Jane; Gundlach, Heidrun; Haberer, Georg; Hollister, Jesse D.; Ossowski, Stephan; Ottilar, Robert P.; Salamov, Asaf A.; Schneeberger, Korbinian; Spannagl, Manuel; Wang, Xi; Yang, Liang; Nasrallah, Mikhail E.; Bergelson, Joy; Carrington, James C.; Gaut, Brandon S.; Schmutz, Jeremy; Mayer, Klaus F. X.; Van de Peer, Yves; Grigoriev, Igor V.; Nordborg, Magnus; Weigel, Detlef; Guo, Ya-Long

    2011-04-29

    In our manuscript, we present a high-quality genome sequence of the Arabidopsis thaliana relative, Arabidopsis lyrata, produced by dideoxy sequencing. We have performed the usual types of genome analysis (gene annotation, dN/dS studies etc.), but this is relegated to the Supporting Information. Instead, we focus on what was a major motivation for sequencing this genome, namely to understand how A. thaliana lost half its genome in a few million years and lived to tell the tale. The rather surprising conclusion is that there is not a single genomic feature that accounts for the reduced genome, but that every aspect (centromeres, intergenic regions, transposable elements, gene family number) is affected through hundreds of thousands of cuts. This strongly suggests that overall genome size in itself is what has been under selection, a suggestion that is strongly supported by our demonstration (using population genetics data from A. thaliana) that new deletions seem to be driven to fixation.

  16. Anonymizing patient genomic data for public sharing association studies.

    Science.gov (United States)

    Fernandez-Lozano, Carlos; Lopez-Campos, Guillermo; Seoane, Jose A; Lopez-Alonso, Victoria; Dorado, Julian; Martín-Sanchez, Fernando; Pazos, Alejandro

    2013-01-01

    The development of personalized medicine is tightly linked with the correct exploitation of molecular data, especially those associated with the genome sequence. Along with the use of genomic data, there is an increasing demand to share these data for research purposes. Transition of clinical data to research is based on the anonymization of these data so that the patient cannot be identified; the use of genomic data poses a great challenge because of its identifying nature. In this work we have analyzed current methods for genome anonymization and propose a one-way encryption method that may enable genomic data sharing by giving access only to certain regions of genomes for research purposes.

  17. Population Genomics of Paramecium Species.

    Science.gov (United States)

    Johri, Parul; Krenek, Sascha; Marinov, Georgi K; Doak, Thomas G; Berendonk, Thomas U; Lynch, Michael

    2017-05-01

    Population-genomic analyses are essential to understanding factors shaping genomic variation and lineage-specific sequence constraints. The dearth of such analyses for unicellular eukaryotes prompted us to assess genomic variation in Paramecium, one of the most well-studied ciliate genera. The Paramecium aurelia complex consists of ∼15 morphologically indistinguishable species that diverged subsequent to two rounds of whole-genome duplications (WGDs, as long as 320 MYA) and possess extremely streamlined genomes. We examine patterns of both nuclear and mitochondrial polymorphism, by sequencing whole genomes of 10-13 worldwide isolates of each of three species belonging to the P. aurelia complex: P. tetraurelia, P. biaurelia, P. sexaurelia, as well as two outgroup species that do not share the WGDs: P. caudatum and P. multimicronucleatum. An apparent absence of global geographic population structure suggests continuous or recent dispersal of Paramecium over long distances. Intergenic regions are highly constrained relative to coding sequences, especially in P. caudatum and P. multimicronucleatum that have shorter intergenic distances. Sequence diversity and divergence are reduced up to ∼100-150 bp both upstream and downstream of genes, suggesting strong constraints imposed by the presence of densely packed regulatory modules. In addition, comparison of sequence variation at non-synonymous and synonymous sites suggests similar recent selective pressures on paralogs within and orthologs across the deeply diverging species. This study presents the first genome-wide population-genomic analysis in ciliates and provides a valuable resource for future studies in evolutionary and functional genetics in Paramecium. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  18. Molecular Characterization of QX-Like and Variant Infectious Bronchitis Virus Strains in Malaysia Based on Partial Genomic Sequences Comprising the S-3a/3b-E-M-Intergenic Region-5a/5b-N Gene Order.

    Science.gov (United States)

    Khanh, N P; Tan, S W; Yeap, S K; Satharasinghe, D A; Hair-Bejo, M; Bich, T N; Omar, A R

    2017-12-01

    Infectious bronchitis virus (IBV) is one of the major poultry pathogens of global importance. However, the prevalence of IBV strains in Malaysia is poorly characterized. The partial genomic sequences (6.8 kb) comprising the S-3a/3b-E-M-intergenic region-5a/5b-N gene order of 11 Malaysian IBVs isolated in 2014 and 2015 were sequenced using next-generation sequencing technology. Phylogenetic and pairwise sequence comparison analysis showed that the isolated IBVs are divided into two groups. Group 1 (IBS124/2015, IBS125/2015, IBS126/2015, IBS130/2015, IBS131/2015, IBS138/2015, and IBS142/2015) shared 90%-95% nucleotide and deduced amino acid similarities to the QX-like strain. Among these isolates, IBS142/2015 is the first IBV detected in Sarawak state located in East Malaysia (Borneo Island). Meanwhile, IBV isolates in Group 2 (IBS037A/2015, IBS037B/2015, IBS051/2015, and IBS180/2015) were 91.62% and 89.09% identical to Malaysian variant strain MH5365/95 (EU086600) at nucleotide and amino acid levels, respectively. In addition, all studied IBVs were distinctly separate from Massachusetts (70%-72% amino acid similarity) and European strains including 793/B, Italy-02, and D274 (68%-73% amino acid similarity). Viruses in Group 1 have the insertion of three amino acids at positions 23, 121, and 122 of the S1 protein and recombinant events detected at nucleotide position 4354-5864, with major parental sequence derived from QX-like (CK-CH-IBYZ-2011) and a minor parental sequence derived from Massachusetts vaccine strain (H120). This study demonstrated coexistence of the IBV Malaysian variant strain along with the QX-like strain in Malaysia.

  19. Genome cartography: charting the apicomplexan genome.

    Science.gov (United States)

    Kissinger, Jessica C; DeBarry, Jeremy

    2011-08-01

    Genes reside in particular genomic contexts that can be mapped at many levels. Historically, 'genetic maps' were used primarily to locate genes. Recent technological advances in the determination of genome sequences have made the analysis and comparison of whole genomes possible and increasingly tractable. What do we see if we shift our focus from gene content (the 'inventory' of genes contained within a genome) to the composition and organization of a genome? This review examines what has been learned about the evolution of the apicomplexan genome as well as the significance and impact of genomic location on our understanding of the eukaryotic genome and parasite biology. Copyright © 2011 Elsevier Ltd. All rights reserved.

  20. Herbarium genomics

    DEFF Research Database (Denmark)

    Bakker, Freek T.; Lei, Di; Yu, Jiaying

    2016-01-01

    Herbarium genomics is proving promising as next-generation sequencing approaches are well suited to deal with the usually fragmented nature of archival DNA. We show that routine assembly of partial plastome sequences from herbarium specimens is feasible, from total DNA extracts and with specimens...... Angiosperm families, 73 of which were from herbarium material with ages up to 146 years old. For 84 specimens, a sufficient number of paired-end reads were generated (in total 9.4 × 10^12 nucleotides), yielding successful plastome assemblies for 74 specimens. Those derived from herbarium specimens have lower...... fractions of plastome-derived reads compared with those from fresh and silica-gel-dried specimens, but total herbarium assembly lengths are only slightly shorter. Specimens from wet-tropical conditions appear to have a higher number of contigs per assembly and lower N50 values. We find no significant...

  1. Genomic Signals of Reoriented ORFs

    Directory of Open Access Journals (Sweden)

    Paul Dan Cristea

    2004-01-01

    Full Text Available Complex representation of nucleotides is used to convert DNA sequences into complex digital genomic signals. The analysis of the cumulated phase and unwrapped phase of DNA genomic signals reveals large-scale features of eukaryote and prokaryote chromosomes that result from statistical regularities of base and base-pair distributions along DNA strands. By reorienting the chromosome coding regions, a “hidden” linear variation of the cumulated phase has been revealed, along with the conspicuous almost linear variation of the unwrapped phase. A model of chromosome longitudinal structure is inferred on these bases.
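
    As a rough illustration of the representation this record describes, the sketch below maps each nucleotide to a complex number and computes the cumulated phase (the running sum of the per-base phases) and the unwrapped phase (the per-base phase with jumps larger than π between neighbours removed). The mapping A = 1+j, C = -1-j, G = -1+j, T = 1-j and all function names are assumptions chosen for illustration; the abstract does not state them.

```python
import cmath
import math

# Assumed mapping of nucleotides to the four quadrants of the complex plane.
# This is an illustrative choice, not taken from the abstract.
NUCLEOTIDE_TO_COMPLEX = {
    "A": complex(1, 1),
    "C": complex(-1, -1),
    "G": complex(-1, 1),
    "T": complex(1, -1),
}


def base_phases(sequence):
    """Phase (argument) of each base in radians, in the range (-pi, pi]."""
    return [cmath.phase(NUCLEOTIDE_TO_COMPLEX[base]) for base in sequence.upper()]


def cumulated_phase(sequence):
    """Running sum of the per-base phases along the sequence."""
    total = 0.0
    result = []
    for phase in base_phases(sequence):
        total += phase
        result.append(total)
    return result


def unwrapped_phase(sequence):
    """Per-base phase with jumps larger than pi removed by shifting in steps of 2*pi."""
    phases = base_phases(sequence)
    if not phases:
        return []
    tolerance = 1e-9  # treat jumps of exactly pi as "not larger than pi"
    offset = 0.0
    result = [phases[0]]
    for previous, current in zip(phases, phases[1:]):
        jump = current - previous
        if jump > math.pi + tolerance:
            offset -= 2 * math.pi
        elif jump < -math.pi - tolerance:
            offset += 2 * math.pi
        result.append(current + offset)
    return result
```

    Plotting either list against sequence position gives the kind of large-scale profile the abstract refers to; the slope of the cumulated phase reflects base composition, while the unwrapped phase reflects the balance of transitions between neighbouring bases.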

  2. Easyfig: a genome comparison visualizer.

    Science.gov (United States)

    Sullivan, Mitchell J; Petty, Nicola K; Beatson, Scott A

    2011-04-01

    Easyfig is a Python application for creating linear comparison figures of multiple genomic loci with an easy-to-use graphical user interface. BLAST comparisons between multiple genomic regions, ranging from single genes to whole prokaryote chromosomes, can be generated, visualized and interactively coloured, enabling a rapid transition between analysis and the preparation of publication quality figures. Easyfig is freely available (under a GPL license) for download (for Mac OS X, Unix and Microsoft Windows) from the SourceForge web site: http://easyfig.sourceforge.net/.

  3. Annotation-Based Whole Genomic Prediction and Selection

    DEFF Research Database (Denmark)

    Kadarmideen, Haja; Do, Duy Ngoc; Janss, Luc

    Genomic selection is widely used in both animal and plant species, however, it is performed with no input from known genomic or biological role of genetic variants and therefore is a black box approach in a genomic era. This study investigated the role of different genomic regions and detected QTLs...... in their contribution to estimated genomic variances and in prediction of genomic breeding values by applying SNP annotation approaches to feed efficiency. Ensembl Variant Predictor (EVP) and Pig QTL database were used as the source of genomic annotation for 60K chip. Genomic prediction was performed using the Bayes...... classes. Predictive accuracy was 0.531, 0.532, 0.302, and 0.344 for DFI, RFI, ADG and BF, respectively. The contribution per SNP to total genomic variance was similar among annotated classes across different traits. Predictive performance of SNP classes did not significantly differ from randomized SNP...

  4. Sinbase: an integrated database to study genomics, genetics and comparative genomics in Sesamum indicum.

    Science.gov (United States)

    Wang, Linhai; Yu, Jingyin; Li, Donghua; Zhang, Xiurong

    2015-01-01

    Sesame (Sesamum indicum L.) is an ancient and important oilseed crop grown widely in tropical and subtropical areas. It belongs to the gigantic order Lamiales, which includes many well-known or economically important species, such as olive (Olea europaea), leonurus (Leonurus japonicus) and lavender (Lavandula spica), many of which have important pharmacological properties. Despite their importance, genetic and genomic analyses on these species have been insufficient due to a lack of reference genome information. The now available S. indicum genome will provide an unprecedented opportunity for studying both S. indicum genetic traits and comparative genomics. To deliver S. indicum genomic information to the worldwide research community, we designed Sinbase, a web-based database with comprehensive sesame genomic, genetic and comparative genomic information. Sinbase includes sequences of assembled sesame pseudomolecular chromosomes, protein-coding genes (27,148), transposable elements (372,167) and non-coding RNAs (1,748). In particular, Sinbase provides unique and valuable information on colinear regions with various plant genomes, including Arabidopsis thaliana, Glycine max, Vitis vinifera and Solanum lycopersicum. Sinbase also provides a useful search function and data mining tools, including a keyword search and local BLAST service. Sinbase will be updated regularly with new features, improvements to genome annotation and new genomic sequences, and is freely accessible at http://ocri-genomics.org/Sinbase/. © The Author 2014. Published by Oxford University Press on behalf of Japanese Society of Plant Physiologists. All rights reserved. For permissions, please email: journals.permissions@oup.com.

  5. National Human Genome Research Institute

    Science.gov (United States)

    The National Human Genome Research Institute conducts genetic and genomic research, funds ... genomic literacy among physicians. Funded by the National Human Genome Research Institute (NHGRI), The Universal Genomics Instructor Handbook ...

  6. Recurrent DNA inversion rearrangements in the human genome

    DEFF Research Database (Denmark)

    Flores, Margarita; Morales, Lucía; Gonzaga-Jauregui, Claudia

    2007-01-01

    Several lines of evidence suggest that reiterated sequences in the human genome are targets for nonallelic homologous recombination (NAHR), which facilitates genomic rearrangements. We have used a PCR-based approach to identify breakpoint regions of rearranged structures in the human genome...... to human genomic variation is discussed........ In particular, we have identified intrachromosomal identical repeats that are located in reverse orientation, which may lead to chromosomal inversions. A bioinformatic workflow pathway to select appropriate regions for analysis was developed. Three such regions overlapping with known human genes, located...

  7. The UCSC Genome Browser database: 2015 update

    Science.gov (United States)

    Rosenbloom, Kate R.; Armstrong, Joel; Barber, Galt P.; Casper, Jonathan; Clawson, Hiram; Diekhans, Mark; Dreszer, Timothy R.; Fujita, Pauline A.; Guruvadoo, Luvina; Haeussler, Maximilian; Harte, Rachel A.; Heitner, Steve; Hickey, Glenn; Hinrichs, Angie S.; Hubley, Robert; Karolchik, Donna; Learned, Katrina; Lee, Brian T.; Li, Chin H.; Miga, Karen H.; Nguyen, Ngan; Paten, Benedict; Raney, Brian J.; Smit, Arian F. A.; Speir, Matthew L.; Zweig, Ann S.; Haussler, David; Kuhn, Robert M.; Kent, W. James

    2015-01-01

    Launched in 2001 to showcase the draft human genome assembly, the UCSC Genome Browser database (http://genome.ucsc.edu) and associated tools continue to grow, providing a comprehensive resource of genome assemblies and annotations to scientists and students worldwide. Highlights of the past year include the release of a browser for the first new human genome reference assembly in 4 years in December 2013 (GRCh38, UCSC hg38), a watershed comparative genomics annotation (100-species multiple alignment and conservation) and a novel distribution mechanism for the browser (GBiB: Genome Browser in a Box). We created browsers for new species (Chinese hamster, elephant shark, minke whale), ‘mined the web’ for DNA sequences and expanded the browser display with stacked color graphs and region highlighting. As our user community increasingly adopts the UCSC track hub and assembly hub representations for sharing large-scale genomic annotation data sets and genome sequencing projects, our menu of public data hubs has tripled. PMID:25428374

  8. The kangaroo genome

    Science.gov (United States)

    Wakefield, Matthew J.; Graves, Jennifer A. Marshall

    2003-01-01

    The kangaroo genome is a rich and unique resource for comparative genomics. Marsupial genetics and cytology have made significant contributions to the understanding of gene function and evolution, and increasing the availability of kangaroo DNA sequence information would provide these benefits on a genomic scale. Here we summarize the contributions from cytogenetic and genetic studies of marsupials, describe the genomic resources currently available and those being developed, and explore the benefits of a kangaroo genome project. PMID:12612602

  9. Genomics With Cloud Computing

    OpenAIRE

    Sukhamrit Kaur; Sandeep Kaur

    2015-01-01

    Abstract Genomics is the study of genomes, which generates large amounts of data that require large storage and computational power. These issues are addressed by cloud computing, which provides various cloud platforms for genomics. These platforms provide many services to users, such as easy access to data, easy sharing and transfer, storage in the hundreds of terabytes, and more computational power. Some cloud platforms are Google Genomics, DNAnexus and Globus Genomics. Various features of cloud computin...

  10. Genome Maps, a new generation genome browser

    Science.gov (United States)

    Medina, Ignacio; Salavert, Francisco; Sanchez, Rubén; de Maria, Alejandro; Alonso, Roberto; Escobar, Pablo; Bleda, Marta; Dopazo, Joaquín

    2013-01-01

    Genome browsers have gained importance as more genomes and related genomic information become available. However, the increase of information brought about by new generation sequencing technologies is, at the same time, causing a subtle but continuous decrease in the efficiency of conventional genome browsers. Here, we present Genome Maps, a genome browser that implements an innovative model of data transfer and management. The program uses highly efficient technologies from the new HTML5 standard, such as scalable vector graphics, that optimize workloads at both server and client sides and ensure future scalability. Thus, data management and representation are entirely carried out by the browser, without the need of any Java Applet, Flash or other plug-in technology installation. Relevant biological data on genes, transcripts, exons, regulatory features, single-nucleotide polymorphisms, karyotype and so forth, are imported from web services and are available as tracks. In addition, several DAS servers are already included in Genome Maps. As a novelty, this web-based genome browser allows the local upload of huge genomic data files (e.g. VCF or BAM) that can be dynamically visualized in real time at the client side, thus facilitating the management of medical data affected by privacy restrictions. Finally, Genome Maps can easily be integrated in any web application by including only a few lines of code. Genome Maps is an open source collaborative initiative available in the GitHub repository (https://github.com/compbio-bigdata-viz/genome-maps). Genome Maps is available at: http://www.genomemaps.org. PMID:23748955

  11. Genome Maps, a new generation genome browser.

    Science.gov (United States)

    Medina, Ignacio; Salavert, Francisco; Sanchez, Rubén; de Maria, Alejandro; Alonso, Roberto; Escobar, Pablo; Bleda, Marta; Dopazo, Joaquín

    2013-07-01

    Genome browsers have gained importance as more genomes and related genomic information become available. However, the increase of information brought about by new generation sequencing technologies is, at the same time, causing a subtle but continuous decrease in the efficiency of conventional genome browsers. Here, we present Genome Maps, a genome browser that implements an innovative model of data transfer and management. The program uses highly efficient technologies from the new HTML5 standard, such as scalable vector graphics, that optimize workloads at both server and client sides and ensure future scalability. Thus, data management and representation are entirely carried out by the browser, without the need of any Java Applet, Flash or other plug-in technology installation. Relevant biological data on genes, transcripts, exons, regulatory features, single-nucleotide polymorphisms, karyotype and so forth, are imported from web services and are available as tracks. In addition, several DAS servers are already included in Genome Maps. As a novelty, this web-based genome browser allows the local upload of huge genomic data files (e.g. VCF or BAM) that can be dynamically visualized in real time at the client side, thus facilitating the management of medical data affected by privacy restrictions. Finally, Genome Maps can easily be integrated in any web application by including only a few lines of code. Genome Maps is an open source collaborative initiative available in the GitHub repository (https://github.com/compbio-bigdata-viz/genome-maps). Genome Maps is available at: http://www.genomemaps.org.

  12. PopGenome: an efficient Swiss army knife for population genomic analyses in R.

    Science.gov (United States)

    Pfeifer, Bastian; Wittelsbürger, Ulrich; Ramos-Onsins, Sebastian E; Lercher, Martin J

    2014-07-01

    Although many computer programs can perform population genetics calculations, they are typically limited in the analyses and data input formats they offer; few applications can process the large data sets produced by whole-genome resequencing projects. Furthermore, there is no coherent framework for the easy integration of new statistics into existing pipelines, hindering the development and application of new population genetics and genomics approaches. Here, we present PopGenome, a population genomics package for the R software environment (a de facto standard for statistical analyses). PopGenome can efficiently process genome-scale data as well as large sets of individual loci. It reads DNA alignments and single-nucleotide polymorphism (SNP) data sets in most common formats, including those used by the HapMap, 1000 human genomes, and 1001 Arabidopsis genomes projects. PopGenome also reads associated annotation files in GFF format, enabling users to easily define regions or classify SNPs based on their annotation; all analyses can also be applied to sliding windows. PopGenome offers a wide range of diverse population genetics analyses, including neutrality tests as well as statistics for population differentiation, linkage disequilibrium, and recombination. PopGenome is linked to Hudson's MS and Ewing's MSMS programs to assess statistical significance based on coalescent simulations. PopGenome's integration in R facilitates effortless and reproducible downstream analyses as well as the production of publication-quality graphics. Developers can easily incorporate new analysis methods into the PopGenome framework. PopGenome and R are freely available from CRAN (http://cran.r-project.org/) for all major operating systems under the GNU General Public License. © The Author 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
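
    To make the idea of applying a statistic to sliding windows concrete, here is a minimal sketch that computes nucleotide diversity (the average number of pairwise differences per site) over windows of an alignment. It is plain Python written for illustration and does not use or reproduce the PopGenome API, which is an R package; the function names and the toy alignment are hypothetical.

```python
from itertools import combinations


def nucleotide_diversity(sequences):
    """Average number of pairwise differences per site among aligned sequences."""
    if len(sequences) < 2 or len(sequences[0]) == 0:
        return 0.0
    length = len(sequences[0])
    pairs = list(combinations(sequences, 2))
    differences = sum(
        sum(a != b for a, b in zip(first, second)) for first, second in pairs
    )
    return differences / (len(pairs) * length)


def sliding_window_diversity(sequences, width, step):
    """Return (window start, diversity) for each full window along the alignment."""
    if not sequences:
        return []
    length = len(sequences[0])
    results = []
    for start in range(0, length - width + 1, step):
        window = [seq[start:start + width] for seq in sequences]
        results.append((start, nucleotide_diversity(window)))
    return results


# Toy alignment of three sequences (hypothetical data).
alignment = ["AAAA", "AAAT", "AATT"]
windows = sliding_window_diversity(alignment, width=2, step=2)
```

    Any per-region statistic can be substituted for `nucleotide_diversity` in the same loop, which is the general pattern the abstract describes for window-based analyses.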

  13. Rapid detection of structural variation in a human genome using nanochannel-based genome mapping technology

    DEFF Research Database (Denmark)

    Cao, Hongzhi; Hastie, Alex R.; Cao, Dandan

    2014-01-01

    BACKGROUND: Structural variants (SVs) are less common than single nucleotide polymorphisms and indels in the population, but collectively account for a significant fraction of genetic polymorphism and diseases. Base pair differences arising from SVs are on a much higher order (>100 fold) than point...... mutations; however, none of the current detection methods are comprehensive, and currently available methodologies are incapable of providing sufficient resolution and unambiguous information across complex regions in the human genome. To address these challenges, we applied a high-throughput, cost......-effective genome mapping technology to comprehensively discover genome-wide SVs and characterize complex regions of the YH genome using long single molecules (>150 kb) in a global fashion. RESULTS: Utilizing nanochannel-based genome mapping technology, we obtained 708 insertions/deletions and 17 inversions larger...

  14. Evolution of bird genomes-a transposon's-eye view.

    Science.gov (United States)

    Kapusta, Aurélie; Suh, Alexander

    2017-02-01

    Birds, the most species-rich monophyletic group of land vertebrates, have been subject to some of the most intense sequencing efforts to date, making them an ideal case study for recent developments in genomics research. Here, we review how our understanding of bird genomes has changed with the recent sequencing of more than 75 species from all major avian taxa. We illuminate avian genome evolution from a previously neglected perspective: their repetitive genomic parasites, transposable elements (TEs) and endogenous viral elements (EVEs). We show that (1) birds are unique among vertebrates in terms of their genome organization; (2) information about the diversity of avian TEs and EVEs is changing rapidly; (3) flying birds have smaller genomes yet more TEs than flightless birds; (4) current second-generation genome assemblies fail to capture the variation in avian chromosome number and genome size determined with cytogenetics; (5) the genomic microcosm of bird-TE "arms races" has yet to be explored; and (6) upcoming third-generation genome assemblies suggest that birds exhibit stability in gene-rich regions and instability in TE-rich regions. We emphasize that integration of cytogenetics and single-molecule technologies with repeat-resolved genome assemblies is essential for understanding the evolution of (bird) genomes. © 2016 New York Academy of Sciences.

  15. Comparative genome analysis of trypanotolerance QTL

    African Journals Online (AJOL)

    GREGO

    2007-04-16

    Apr 16, 2007 ... homologous genes within the human genome were then identified and aligned to the bovine radiation hybrid map in order to identify the mouse/bovine homologous regions. This revealed homology between murine and bovine QTL on Tir3 while the region on Tir2 is linked to innate immune response.

  16. Building the sequence map of the human pan-genome

    DEFF Research Database (Denmark)

    Li, Ruiqiang; Li, Yingrui; Zheng, Hancheng

    2010-01-01

    Here we integrate the de novo assembly of an Asian and an African genome with the NCBI reference human genome, as a step toward constructing the human pan-genome. We identified approximately 5 Mb of novel sequences not present in the reference genome in each of these assemblies. Most novel...... analysis of predicted genes indicated that the novel sequences contain potentially functional coding regions. We estimate that a complete human pan-genome would contain approximately 19-40 Mb of novel sequence not present in the extant reference genome. The extensive amount of novel sequence contributing...... to the genetic variation of the pan-genome indicates the importance of using complete genome sequencing and de novo assembly....

  17. Genomic alterations detected by comparative genomic hybridization in ovarian endometriomas

    Directory of Open Access Journals (Sweden)

    L.C. Veiga-Castelli

    2010-08-01

    Full Text Available Endometriosis is a complex and multifactorial disease. Chromosomal imbalance screening in endometriotic tissue can be used to detect hot-spot regions in the search for a possible genetic marker for endometriosis. The objective of the present study was to detect chromosomal imbalances by comparative genomic hybridization (CGH) in ectopic tissue samples from ovarian endometriomas and eutopic tissue from the same patients. We evaluated 10 ovarian endometriotic tissues and 10 eutopic endometrial tissues by metaphase CGH. CGH was prepared with normal and test DNA enzymatically digested, ligated to adaptors and amplified by PCR. A second PCR was performed for DNA labeling. Equal amounts of both normal and test-labeled DNA were hybridized in human normal metaphases. The Isis FISH Imaging System V 5.0 software was used for chromosome analysis. In both eutopic and ectopic groups, 4/10 samples presented chromosomal alterations, mainly chromosomal gains. CGH identified 11q12.3-q13.1, 17p11.1-p12, 17q25.3-qter, and 19p as critical regions. Genomic imbalances in 11q, 17p, 17q, and 19p were detected in normal eutopic and/or ectopic endometrium from women with ovarian endometriosis. These regions contain genes such as POLR2G, MXRA7 and UBA52 involved in biological processes that may lead to the establishment and maintenance of endometriotic implants. This genomic imbalance may affect genes in which dysregulation impacts both eutopic and ectopic endometrium.

  18. Genomic libraries: II. Subcloning, sequencing, and assembling large-insert genomic DNA clones.

    Science.gov (United States)

    Quail, Mike A; Matthews, Lucy; Sims, Sarah; Lloyd, Christine; Beasley, Helen; Baxter, Simon W

    2011-01-01

    Sequencing large insert clones to completion is useful for characterizing specific genomic regions, identifying haplotypes, and closing gaps in whole genome sequencing projects. Despite being a standard technique in molecular laboratories, DNA sequencing using the Sanger method can be highly problematic when complex secondary structures or sequence repeats are encountered in genomic clones. Here, we describe methods to isolate DNA from a large insert clone (fosmid or BAC), subclone the sample, and sequence the region to the highest industry standard. Troubleshooting solutions for sequencing difficult templates are discussed.

  19. PSAT: A web tool to compare genomic neighborhoods of multiple prokaryotic genomes

    Directory of Open Access Journals (Sweden)

    Wasnick Michael

    2008-03-01

    Full Text Available Abstract Background The conservation of gene order among prokaryotic genomes can provide valuable insight into gene function, protein interactions, or events by which genomes have evolved. Although some tools are available for visualizing and comparing the order of genes between genomes of study, few support an efficient and organized analysis between large numbers of genomes. The Prokaryotic Sequence homology Analysis Tool (PSAT) is a web tool for comparing gene neighborhoods among multiple prokaryotic genomes. Results PSAT utilizes a database that is preloaded with gene annotation, BLAST hit results, and gene-clustering scores designed to help identify regions of conserved gene order. Researchers use the PSAT web interface to find a gene of interest in a reference genome and efficiently retrieve the sequence homologs found in other bacterial genomes. The tool generates a graphic of the genomic neighborhood surrounding the selected gene and the corresponding regions for its homologs in each comparison genome. Homologs in each region are color coded to assist users with analyzing gene order among various genomes. In contrast to common comparative analysis methods that filter sequence homolog data based on alignment score cutoffs, PSAT leverages gene context information for homologs, including those with weak alignment scores, enabling a more sensitive analysis. Features for constraining or ordering results are designed to help researchers browse results from large numbers of comparison genomes in an organized manner. PSAT has been demonstrated to be useful for helping to identify gene orthologs and potential functional gene clusters, and detecting genome modifications that may result in loss of function. Conclusion PSAT allows researchers to investigate the order of genes within local genomic neighborhoods of multiple genomes. A PSAT web server for public use is available for performing analyses on a growing set of reference genomes through any

  20. Genomic Encyclopedia of Fungi

    Energy Technology Data Exchange (ETDEWEB)

    Grigoriev, Igor

    2012-08-10

    Genomes of fungi relevant to energy and environment are in focus of the Fungal Genomic Program at the US Department of Energy Joint Genome Institute (JGI). Its key project, the Genomics Encyclopedia of Fungi, targets fungi related to plant health (symbionts, pathogens, and biocontrol agents) and biorefinery processes (cellulose degradation, sugar fermentation, industrial hosts), and explores fungal diversity by means of genome sequencing and analysis. Over 150 fungal genomes have been sequenced by JGI to date and released through MycoCosm (www.jgi.doe.gov/fungi), a fungal web-portal, which integrates sequence and functional data with genome analysis tools for user community. Sequence analysis supported by functional genomics leads to developing parts list for complex systems ranging from ecosystems of biofuel crops to biorefineries. Recent examples of such parts suggested by comparative genomics and functional analysis in these areas are presented here.

  1. JGI Fungal Genomics Program

    Energy Technology Data Exchange (ETDEWEB)

    Grigoriev, Igor V.

    2011-03-14

    Genomes of energy and environment fungi are in focus of the Fungal Genomic Program at the US Department of Energy Joint Genome Institute (JGI). Its key project, the Genomics Encyclopedia of Fungi, targets fungi related to plant health (symbionts, pathogens, and biocontrol agents) and biorefinery processes (cellulose degradation, sugar fermentation, industrial hosts), and explores fungal diversity by means of genome sequencing and analysis. Over 50 fungal genomes have been sequenced by JGI to date and released through MycoCosm (www.jgi.doe.gov/fungi), a fungal web-portal, which integrates sequence and functional data with genome analysis tools for user community. Sequence analysis supported by functional genomics leads to developing parts list for complex systems ranging from ecosystems of biofuel crops to biorefineries. Recent examples of such 'parts' suggested by comparative genomics and functional analysis in these areas are presented here

  2. Competition between influenza A virus genome segments.

    Directory of Open Access Journals (Sweden)

    Ivy Widjaja

    Full Text Available Influenza A virus (IAV) contains a segmented negative-strand RNA genome. How IAV balances the replication and transcription of its multiple genome segments is not understood. We developed a dual competition assay based on the co-transfection of firefly or Gaussia luciferase-encoding genome segments together with plasmids encoding IAV polymerase subunits and nucleoprotein. At limiting amounts of polymerase subunits, expression of the firefly luciferase segment was negatively affected by the presence of its Gaussia luciferase counterpart, indicative of competition between reporter genome segments. This competition could be relieved by increasing or decreasing the relative amounts of firefly or Gaussia reporter segment, respectively. The balance between the luciferase expression levels was also affected by the identity of the untranslated regions (UTRs) as well as segment length. In general it appeared that genome segments displaying inherent higher expression levels were more efficient competitors of another segment. When natural genome segments were tested for their ability to suppress reporter gene expression, shorter genome segments generally reduced firefly luciferase expression to a larger extent, with the M and NS segments having the largest effect. The balance between different reporter segments was most dramatically affected by the introduction of UTR panhandle-stabilizing mutations. Furthermore, only reporter genome segments carrying these mutations were able to efficiently compete with the natural genome segments in infected cells. Our data indicate that IAV genome segments compete for available polymerases. Competition is affected by segment length, coding region, and UTRs. This competition is probably most apparent early during infection, when limiting amounts of polymerases are present, and may contribute to the regulation of segment-specific replication and transcription.

  3. Comparative chloroplast genomes of camellia species.

    Directory of Open Access Journals (Sweden)

    Jun-Bo Yang

    Full Text Available BACKGROUND: Camellia, comprising more than 200 species, is a valuable economic commodity due to its enormously popular commercial products: tea leaves, flowers, and high-quality edible oils. It is the largest and most important genus in the family Theaceae. However, phylogenetic resolution of the species has proven to be difficult. Consequently, the interspecies relationships of the genus Camellia are still hotly debated. Phylogenomics is an attractive avenue that can be used to reconstruct the tree of life, especially at low taxonomic levels. METHODOLOGY/PRINCIPAL FINDINGS: Seven complete chloroplast (cp) genomes were sequenced from six species representing different subdivisions of the genus Camellia using Illumina sequencing technology. Four junctions between the single-copy segments and the inverted repeats were confirmed and genome assemblies were validated by PCR-based product sequencing using 123 pairs of primers covering preliminary cp genome assemblies. The length of the Camellia cp genome was found to be about 157 kb, which contained 123 unique genes, and 23 were duplicated in the IR regions. We determined that the complete Camellia cp genome was relatively well conserved, but contained enough genetic differences to provide useful phylogenetic information. Phylogenetic relationships were analyzed using seven complete cp genomes of six Camellia species. We also identified rapidly evolving regions of the cp genome that have the potential to be used for further species identification and phylogenetic resolution. CONCLUSIONS/SIGNIFICANCE: In this study, we wanted to determine if analyzing completely sequenced cp genomes could help settle these controversies of interspecies relationships in Camellia. The results demonstrate that cp genome data are beneficial in resolving species definition because they indicate that organelle-based "barcodes" can be established for a species and then used to unmask interspecies phylogenetic relationships. 

  4. The South Asian genome.

    Directory of Open Access Journals (Sweden)

    John C Chambers

    Full Text Available The genetic sequence variation of people from the Indian subcontinent, who comprise one-quarter of the world's population, is not well described. We carried out whole genome sequencing of 168 South Asians, along with whole-exome sequencing of 147 South Asians to provide deeper characterisation of coding regions. We identify 12,962,155 autosomal sequence variants, including 2,946,861 new SNPs and 312,738 novel indels. This catalogue of SNPs and indels amongst South Asians provides the first comprehensive map of genetic variation in this major human population, and reveals evidence for selective pressures on genes involved in skin biology, metabolism, infection and immunity. Our results will accelerate the search for the genetic variants underlying susceptibility to disorders such as type-2 diabetes and cardiovascular disease, which are highly prevalent amongst South Asians.

  5. Multiple Whole Genome Alignments Without a Reference Organism

    Energy Technology Data Exchange (ETDEWEB)

    Dubchak, Inna; Poliakov, Alexander; Kislyuk, Andrey; Brudno, Michael

    2009-01-16

    Multiple sequence alignments have become one of the most commonly used resources in genomics research. Most algorithms for multiple alignment of whole genomes rely either on a reference genome, against which all of the other sequences are laid out, or require a one-to-one mapping between the nucleotides of the genomes, preventing the alignment of recently duplicated regions. Both approaches have drawbacks for whole-genome comparisons. In this paper we present a novel symmetric alignment algorithm. The resulting alignments not only represent all of the genomes equally well, but also include all relevant duplications that occurred since the divergence from the last common ancestor. Our algorithm, implemented as a part of the VISTA Genome Pipeline (VGP), was used to align seven vertebrate and six Drosophila genomes. The resulting whole-genome alignments demonstrate a higher sensitivity and specificity than the pairwise alignments previously available through the VGP and have higher exon alignment accuracy than comparable public whole-genome alignments. Of the multiple alignment methods tested, ours performed the best at aligning genes from multigene families, perhaps the most challenging test for whole-genome alignments. Our whole-genome multiple alignments are available through the VISTA Browser at http://genome.lbl.gov/vista/index.shtml.

  6. Genomics With Cloud Computing

    Directory of Open Access Journals (Sweden)

    Sukhamrit Kaur

    2015-04-01

    Full Text Available Abstract Genomics is the study of genomes, which produces large amounts of data that require substantial storage and computational power. Cloud computing addresses these needs by providing various cloud platforms for genomics. These platforms provide many services to users, such as easy access to data, easy sharing and transfer, storage in the hundreds of terabytes, and more computational power. Some cloud platforms are Google Genomics, DNAnexus and Globus Genomics. Cloud computing offers genomics easy access to and sharing of data, data security, and lower resource costs, but there are still some drawbacks, such as the long time needed to transfer data and limited network bandwidth.

  7. The genome of the Cryptophlebia leucotreta granulovirus

    International Nuclear Information System (INIS)

    Lange, Martin; Jehle, Johannes A.

    2003-01-01

    The genome of the Cryptophlebia leucotreta granulovirus (CrleGV) was sequenced and analyzed. The double-stranded circular genome contains 110,907 bp and potentially encodes 129 predicted open reading frames (ORFs), 124 of which were similar to other baculovirus ORFs. Five ORFs were CrleGV specific and 26 ORFs were common to other granulovirus genomes. One ORF showed a significant similarity to a nonstructural protein of Bombyx mori densovirus-2. A baculovirus chitinase gene was identified, which is most likely not functional, because its central coding region including the conserved chitinase active site signature is deleted. Three gene copies (Crle20, 23, and 24) containing the Baculo PEP N domain of the polyhedron envelope protein were identified in CrleGV and other GV genomes. One of them (Crle23) also appeared to contain a p10-like sequence encoding a number of leucine-rich heptad repeats and a proline-rich domain. Another striking feature of the genome is the presence of a hypervariable non-hr ori-like region of about 1800 bp consisting of different kinds of repeats and palindromes. Three other repeat-rich regions were identified within the genome and are considered as homologous regions (hrs). CrleGV is most closely related to the Cydia pomonella granulovirus (CpGV) as revealed by genome order comparisons and phylogenetic analyses. However, the AT content of the CrleGV genome, which is 67.6% and the highest found so far in baculoviruses, differed by 12.8% from the AT content of CpGV. This resulted in a major difference in the codon usage of both viruses and may reflect adaptive selection constraints to their particular hosts.
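The AT content quoted in the record above is a simple base-composition statistic. As a minimal sketch (not the authors' pipeline), it could be computed as below. The function name `at_content` and the toy sequence are illustrative assumptions, and the last line assumes that the 12.8% difference is meant in percentage points, which the abstract does not state explicitly.

```python
def at_content(seq: str) -> float:
    """Return the percentage of A/T among the unambiguous bases A, C, G, T."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    total = at + gc  # ambiguous symbols such as N are ignored
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * at / total


# Toy sequence (hypothetical, not CrleGV data): 4 of 6 bases are A or T.
toy_percent = at_content("ATATGC")

# Assumption: reading "differed by 12.8%" as percentage points, the abstract's
# figures would imply this AT content for CpGV.
implied_cpgv_at = round(67.6 - 12.8, 1)
```

Ignoring ambiguous bases is one possible design choice; dividing by the full sequence length instead would give slightly different values for sequences containing N.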

  8. How RNA viruses maintain their genome integrity.

    Science.gov (United States)

    Barr, John N; Fearns, Rachel

    2010-06-01

    RNA genomes are vulnerable to corruption by a range of activities, including inaccurate replication by the error-prone replicase, damage from environmental factors, and attack by nucleases and other RNA-modifying enzymes that comprise the cellular intrinsic or innate immune response. Damage to coding regions and loss of critical cis-acting signals inevitably impair genome fitness; as a consequence, RNA viruses have evolved a variety of mechanisms to protect their genome integrity. These include mechanisms to promote replicase fidelity, recombination activities that allow exchange of sequences between different RNA templates, and mechanisms to repair the genome termini. In this article, we review examples of these processes from a range of RNA viruses to showcase the diverse approaches that viruses have evolved to maintain their genome sequence integrity, focusing first on mechanisms that viruses use to protect their entire genome, and then concentrating on mechanisms that allow protection of the genome termini, which are especially vulnerable. In addition, we discuss examples in which it might be beneficial for a virus to 'lose' its genomic termini and reduce its replication efficiency.

  9. Fungal biology: compiling genomes and exploiting them

    Energy Technology Data Exchange (ETDEWEB)

    Labbe, Jessy L [ORNL; Uehling, Jessie K [ORNL; Payen, Thibaut [INRA; Plett, Jonathan [University of Western Sydney, Australia

    2014-01-01

    The last 10 years have seen the cost of sequencing complete genomes decrease at an incredible speed. This has led to an increase in the number of genomes sequenced in all the fungal tree of life as well as a wide variety of plant genomes. The increase in sequencing has permitted us to study the evolution of organisms on a genomic scale. A number of talks during the conference discussed the importance of transposable elements (TEs) that are present in almost all species of fungi. These TEs represent an especially large percentage of genomic space in fungi that interact with plants. Thierry Rouxel (INRA, Nancy, France) showed the link between speciation in the Leptosphaeria complex and the expansion of TE families. For example in the Leptosphaeria complex, one species associated with oilseed rape has experienced a recent and massive burst of movement by a few TE families. The alterations caused by these TEs took place in discrete regions of the genome leading to shuffling of the genomic landscape and the appearance of genes specific to the species, such as effectors useful for the interactions with a particular plant (Rouxel et al., 2011). Other presentations showed the importance of TEs in affecting genome organization. For example, in Amanita different species appear to have been invaded by different TE families (Veneault-Fourrey & Martin, 2011).

  10. Burkholderia pseudomallei genome plasticity associated with genomic island variation

    Directory of Open Access Journals (Sweden)

    Currie Bart J

    2008-04-01

    Full Text Available Abstract Background Burkholderia pseudomallei is a soil-dwelling saprophyte and the cause of melioidosis. Horizontal gene transfer contributes to the genetic diversity of this pathogen and may be an important determinant of virulence potential. The genome contains genomic island (GI) regions that encode a broad array of functions. Although there is some evidence for the variable distribution of genomic islands in B. pseudomallei isolates, little is known about the extent of variation between related strains or their association with disease or environmental survival. Results Five islands from B. pseudomallei strain K96243 were chosen as representatives of different types of genomic islands present in this strain, and their presence investigated in other B. pseudomallei. In silico analysis of 10 B. pseudomallei genome sequences provided evidence for the variable presence of these regions, together with micro-evolutionary changes that generate GI diversity. The diversity of GIs in 186 isolates from NE Thailand (83 environmental and 103 clinical isolates) was investigated using multiplex PCR screening. The proportion of all isolates positive by PCR ranged from 12% for a prophage-like island (GI 9), to 76% for a metabolic island (GI 16). The presence of each of the five GIs did not differ between environmental and disease-associated isolates (p > 0.05 for all five islands). The cumulative number of GIs per isolate for the 186 isolates ranged from 0 to 5 (median 2, IQR 1 to 3). The distribution of cumulative GI number did not differ between environmental and disease-associated isolates (p = 0.27). The presence of GIs was defined for the three largest clones in this collection (each defined as a single sequence type, ST, by multilocus sequence typing); these were ST 70 (n = 15 isolates), ST 54 (n = 11), and ST 167 (n = 9). The rapid loss and/or acquisition of gene islands was observed within individual clones. Comparisons were drawn between isolates obtained

  11. Correlation of microsynteny conservation and disease gene distribution in mammalian genomes

    Directory of Open Access Journals (Sweden)

    Li Xiting

    2009-11-01

    Full Text Available Abstract Background With the completion of the whole genome sequence for many organisms, investigations into genomic structure have revealed that gene distribution is variable, and that genes with similar function or expression are located within clusters. This clustering suggests that there are evolutionary constraints that determine genome architecture. However, as most of the evidence for constraints on genome evolution comes from studies on yeast, it is unclear how much of this prior work can be extrapolated to mammalian genomes. Therefore, in this work we wished to examine the constraints on regions of the mammalian genome containing conserved gene clusters. Results We first identified regions of the mouse genome with microsynteny conservation by comparing gene arrangement in the mouse genome to the human, rat, and dog genomes. We then asked if any particular gene types were found preferentially in conserved regions. We found a significant correlation between conserved microsynteny and the density of mouse orthologs of human disease genes, suggesting that disease genes are clustered in genomic regions of increased microsynteny conservation. Conclusion The correlation between microsynteny conservation and disease gene locations indicates that regions of the mouse genome with microsynteny conservation may contain undiscovered human disease genes. This study not only demonstrates that gene function constrains mammalian genome organization, but also identifies regions of the mouse genome that can be experimentally examined to produce mouse models of human disease.

  12. Complete chloroplast genome of Ficus racemosa (Moraceae).

    Science.gov (United States)

    Mao, Qi; Bi, Guiqi

    2016-11-01

    Ficus racemosa, with immense medicinal value, and known as Cluster Fig Tree, Indian Fig Tree or Goolar (Gular) Fig, is a species of plant which belongs to the family Moraceae. The complete chloroplast genome of Ficus racemosa was obtained by de novo assembly using next-generation sequencing data. The chloroplast genome of F. racemosa was 159 473 bp in length, which consisted of a large single copy region (88 110 bp), a small single copy region (20 007 bp) and a pair of inverted repeat regions (25 678 bp). The overall GC content of this chloroplast genome was 36.0%. The chloroplast genome harbored 117 genes, including 84 protein-coding genes, 27 tRNA, and eight rRNA genes (4.5S rRNA, 5S rRNA, 16S rRNA and 23S rRNA), each present in two copies. Phylogenetic analysis of the complete chloroplast genome sequences with the reported related chloroplast genomes revealed that Ficus racemosa is most closely related to Morus indica, a typical higher plant in the family Moraceae.
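The region sizes reported in the record above can be cross-checked against the stated total, since a quadripartite chloroplast genome is the large single copy region plus the small single copy region plus two inverted repeat copies. The sketch below is only an arithmetic check; the function name `quadripartite_length` is hypothetical, and it assumes the 25 678 bp figure refers to each inverted repeat copy rather than the pair combined.

```python
def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total length of a quadripartite plastome: LSC + SSC + two IR copies."""
    return lsc + ssc + 2 * ir


# Region sizes (bp) as reported for Ficus racemosa in the abstract above.
ficus_total = quadripartite_length(lsc=88_110, ssc=20_007, ir=25_678)

# Compare against the total genome length stated in the same abstract.
matches_reported_total = ficus_total == 159_473
```

Under that per-copy assumption, the three region sizes add up to the reported genome length.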

  13. A parthenogenesis gene of apomict origin elicits embryo formation from unfertilized eggs in a sexual plant.

    Science.gov (United States)

    Conner, Joann A; Mookkan, Muruganantham; Huo, Heqiang; Chae, Keun; Ozias-Akins, Peggy

    2015-09-08

    Apomixis is a naturally occurring mode of asexual reproduction in flowering plants that results in seed formation without the involvement of meiosis or fertilization of the egg. Seeds formed on an apomictic plant contain offspring genetically identical to the maternal plant. Apomixis has significant potential for preserving hybrid vigor from one generation to the next in highly productive crop plant genotypes. Apomictic Pennisetum/Cenchrus species, members of the Poaceae (grass) family, reproduce by apospory. Apospory is characterized by apomeiosis, the formation of unreduced embryo sacs derived from nucellar cells of the ovary and, by parthenogenesis, the development of the unreduced egg into an embryo without fertilization. In Pennisetum squamulatum (L.) R.Br., apospory segregates as a single dominant locus, the apospory-specific genomic region (ASGR). In this study, we demonstrate that the PsASGR-BABY BOOM-like (PsASGR-BBML) gene is expressed in egg cells before fertilization and can induce parthenogenesis and the production of haploid offspring in transgenic sexual pearl millet. A reduction of PsASGR-BBML expression in apomictic F1 RNAi transgenic plants results in fewer visible parthenogenetic embryos and a reduction of embryo cell number compared with controls. Our results endorse a key role for PsASGR-BBML in parthenogenesis and a newly discovered role for a member of the BBM-like clade of APETALA 2 transcription factors. Induction of parthenogenesis by PsASGR-BBML will be valuable for installing parthenogenesis to synthesize apomixis in crops and will have further application for haploid induction to rapidly obtain homozygous lines for breeding.

  14. A parthenogenesis gene of apomict origin elicits embryo formation from unfertilized eggs in a sexual plant

    Science.gov (United States)

    Conner, Joann A.; Mookkan, Muruganantham; Huo, Heqiang; Chae, Keun; Ozias-Akins, Peggy

    2015-01-01

    Apomixis is a naturally occurring mode of asexual reproduction in flowering plants that results in seed formation without the involvement of meiosis or fertilization of the egg. Seeds formed on an apomictic plant contain offspring genetically identical to the maternal plant. Apomixis has significant potential for preserving hybrid vigor from one generation to the next in highly productive crop plant genotypes. Apomictic Pennisetum/Cenchrus species, members of the Poaceae (grass) family, reproduce by apospory. Apospory is characterized by apomeiosis, the formation of unreduced embryo sacs derived from nucellar cells of the ovary and, by parthenogenesis, the development of the unreduced egg into an embryo without fertilization. In Pennisetum squamulatum (L.) R.Br., apospory segregates as a single dominant locus, the apospory-specific genomic region (ASGR). In this study, we demonstrate that the PsASGR-BABY BOOM-like (PsASGR-BBML) gene is expressed in egg cells before fertilization and can induce parthenogenesis and the production of haploid offspring in transgenic sexual pearl millet. A reduction of PsASGR-BBML expression in apomictic F1 RNAi transgenic plants results in fewer visible parthenogenetic embryos and a reduction of embryo cell number compared with controls. Our results endorse a key role for PsASGR-BBML in parthenogenesis and a newly discovered role for a member of the BBM-like clade of APETALA 2 transcription factors. Induction of parthenogenesis by PsASGR-BBML will be valuable for installing parthenogenesis to synthesize apomixis in crops and will have further application for haploid induction to rapidly obtain homozygous lines for breeding. PMID:26305939

  15. The complete chloroplast genome of Lilium distichum Nakai (Liliaceae).

    Science.gov (United States)

    Hwang, Yoon-Jung; Lee, Sang-Choon; Kim, Kyunghee; Choi, Beom-Soon; Park, Jee Young; Yang, Tae-Jin; Lim, Ki-Byung

    2016-11-01

    Lilium distichum is a native lily species in Korea, northeastern China and far eastern Russia. The complete chloroplast genome sequence of L. distichum was generated by de novo assembly using whole genome next generation sequences. The chloroplast genome of L. distichum was 152 598 bp in length and divided into four distinct regions: a large single copy region (82 031 bp), a small single copy region (17 487 bp) and a pair of inverted repeat regions (26 540 bp). The genome annotation predicted a total of 112 genes, including 78 protein-coding genes, 30 tRNA genes, and 4 rRNA genes. Phylogenetic analysis with the reported chloroplast genomes revealed that L. distichum is most closely related to L. superbum (Turk's-cap lily).

  16. Diet and genomic stability.

    Science.gov (United States)

    Young, Graeme P

    2007-01-01

    Cancer results from a disordered and unstable genome; the degree of abnormality progresses as the process of oncogenesis proceeds. Such genomic instability appears to be subject to control by environmental factors, as evidenced by the number of cancers that are either caused by specific environmental agents (lung, skin, cervix) or else regulated by a broader range of agents, such as the effect of diet on gastric and colorectal cancers. Dietary factors might interact in several ways with the genome to protect against cancer. An agent might interact directly with the genome and regulate expression (as a genetic or epigenetic regulator) or indirectly by influencing DNA 'repair' responses and so improve genomic stability. Research now shows that diet-genomic interactions in cancer go beyond interactions with the normal genome and involve enhancement of normal cellular responses to DNA damage such that genome stability is more effectively maintained. Activation of apoptosis may be a key to protection.

  17. Rat Genome Database (RGD)

    Data.gov (United States)

    U.S. Department of Health & Human Services — The Rat Genome Database (RGD) is a collaborative effort between leading research institutions involved in rat genetic and genomic research to collect, consolidate,...

  18. Exploiting the genome

    Energy Technology Data Exchange (ETDEWEB)

    Block, S. [The MITRE Corporation, McLean, VA (US). JASON Program Office; Cornwall, J. [The MITRE Corporation, McLean, VA (US). JASON Program Office; Dyson, F. [The MITRE Corporation, McLean, VA (US). JASON Program Office; Koonin, S. [The MITRE Corporation, McLean, VA (US). JASON Program Office; Lewis, N. [The MITRE Corporation, McLean, VA (US). JASON Program Office; Schwitters, R. [The MITRE Corporation, McLean, VA (US). JASON Program Office

    1998-09-11

    In 1997, JASON conducted a DOE-sponsored study of the human genome project with special emphasis on the areas of technology, quality assurance and quality control, and informatics. The present study has two aims: first, to update the 1997 Report in light of recent developments in genome sequencing technology, and second, to consider possible roles for the DOE in the "post-genomic" era, following acquisition of the complete human genome sequence.

  19. Genomic prediction using subsampling

    OpenAIRE

    Xavier, Alencar; Xu, Shizhong; Muir, William; Rainey, Katy Martin

    2017-01-01

    Background Genome-wide assisted selection is a critical tool for the genetic improvement of plants and animals. Whole-genome regression models in Bayesian framework represent the main family of prediction methods. Fitting such models with a large number of observations involves a prohibitive computational burden. We propose the use of subsampling bootstrap Markov chain in genomic prediction. Such method consists of fitting whole-genome regression models by subsampling observations in each rou...

  20. Ebolavirus comparative genomics

    DEFF Research Database (Denmark)

    Jun, Se-Ran; Leuze, Michael R.; Nookaew, Intawat

    2015-01-01

    The 2014 Ebola outbreak in West Africa is the largest documented for this virus. To examine the dynamics of this genome, we compare more than 100 currently available ebolavirus genomes to each other and to other viral genomes. Based on oligomer frequency analysis, the family Filoviridae forms...

  1. Dissection of the octoploid strawberry genome by deep sequencing of the genomes of Fragaria species.

    Science.gov (United States)

    Hirakawa, Hideki; Shirasawa, Kenta; Kosugi, Shunichi; Tashiro, Kosuke; Nakayama, Shinobu; Yamada, Manabu; Kohara, Mistuyo; Watanabe, Akiko; Kishida, Yoshie; Fujishiro, Tsunakazu; Tsuruoka, Hisano; Minami, Chiharu; Sasamoto, Shigemi; Kato, Midori; Nanri, Keiko; Komaki, Akiko; Yanagi, Tomohiro; Guoxin, Qin; Maeda, Fumi; Ishikawa, Masami; Kuhara, Satoru; Sato, Shusei; Tabata, Satoshi; Isobe, Sachiko N

    2014-01-01

    Cultivated strawberry (Fragaria x ananassa) is octoploid and shows allogamous behaviour. The present study aims at dissecting this octoploid genome through comparison with its wild relatives, F. iinumae, F. nipponica, F. nubicola, and F. orientalis by de novo whole-genome sequencing on Illumina and Roche 454 platforms. The total length of the assembled Illumina genome sequences obtained was 698 Mb for F. x ananassa, and ∼200 Mb each for the four wild species. Subsequently, a virtual reference genome termed FANhybrid_r1.2 was constructed by integrating the sequences of the four homoeologous subgenomes of F. x ananassa, from which heterozygous regions in the Roche 454 and Illumina genome sequences were eliminated. The total length of FANhybrid_r1.2 thus created was 173.2 Mb with an N50 length of 5137 bp. The Illumina-assembled genome sequences of F. x ananassa and the four wild species were then mapped onto the reference genome, along with the previously published F. vesca genome sequence to establish the subgenomic structure of F. x ananassa. The strategy adopted in this study has turned out to be successful in dissecting the genome of octoploid F. x ananassa and appears promising when applied to the analysis of other polyploid plant species.
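The N50 length quoted in the record above is a standard assembly contiguity statistic: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. The sketch below shows one common way to compute it. The function name `n50` and the toy contig lengths are illustrative assumptions and are not taken from the FANhybrid_r1.2 assembly.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of contig or scaffold lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one sequence length is required")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum reaches half of the
        # assembly is the N50.
        if running >= half_total:
            return length
    raise AssertionError("unreachable for non-empty input")


# Hypothetical contig lengths in bp: total 290, half 145; 80 + 70 = 150 >= 145.
toy_n50 = n50([80, 70, 50, 40, 30, 20])
```

Some tools apply slightly different tie or threshold conventions, so values can differ marginally between implementations.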

  2. HLA diversity in the 1000 genomes dataset.

    Directory of Open Access Journals (Sweden)

    Pierre-Antoine Gourraud

    Full Text Available The 1000 Genomes Project aims to provide a deep characterization of human genome sequence variation by sequencing at a level that should allow the genome-wide detection of most variants with frequencies as low as 1%. However, in the major histocompatibility complex (MHC), only the top 10 most frequent haplotypes are in the 1% frequency range whereas thousands of haplotypes are present at lower frequencies. Given the limitation of both the coverage and the read length of the sequences generated by the 1000 Genomes Project, the highly variable positions that define HLA alleles may be difficult to identify. We used classical Sanger sequencing techniques to type the HLA-A, HLA-B, HLA-C, HLA-DRB1 and HLA-DQB1 genes in the available 1000 Genomes samples and combined the results with the 103,310 variants in the MHC region genotyped by the 1000 Genomes Project. Using pairwise identity-by-descent distances between individuals and principal component analysis, we established the relationship between ancestry and genetic diversity in the MHC region. As expected, both the MHC variants and the HLA phenotype can identify the major ancestry lineage, informed mainly by the most frequent HLA haplotypes. To some extent, regions of the genome with similar genetic or similar recombination rate have similar properties. An MHC-centric analysis underlines departures between the ancestral background of the MHC and the genome-wide picture. Our analysis of linkage disequilibrium (LD) decay in these samples suggests that overestimation of pairwise LD occurs due to a limited sampling of the MHC diversity. This collection of HLA-specific MHC variants, available on the dbMHC portal, is a valuable resource for future analyses of the role of MHC in population and disease studies.

  3. Sequencing the genomic regions flanking S-linked PvGLO sequences confirms the presence of two GLO loci, one of which lies adjacent to the style-length determinant gene CYP734A50.

    Science.gov (United States)

    Burrows, Benjamin A; McCubbin, Andrew G

    2017-03-01

    Primula vulgaris contains two GLOBOSA loci, one located adjacent to the style length determinant gene CYP734A50, which lies within the S-locus. Using a combination of BAC walking and PacBio sequencing, we have sequenced two substantial genomic contigs in and around the S-locus of Primula vulgaris. Using these data, we were able to demonstrate that two alleles of PvGlo P as well as PvGlo T can be present in the genome of a single plant, providing empirical evidence that these two forms of the MADS-box gene GLOBOSA are separate loci and not allelic as previously reported. We propose they should be renamed PvGLO1 and PvGLO2. BAC contigs extending from each GLOBOSA locus were identified and fully sequenced. No homologous genes were found between the contigs other than the GLOBOSA genes themselves, consistent with their identity as separate loci. Exons of the recently identified style-length determinant gene CYP734A50 were identified on one end of the contig containing PvGLO2 and these genes are adjacent in the genome, suggesting that PvGLO2 lies either within or at least very close to the S-locus. Current evidence suggests that both CYP734A50 and GLO2 are specific to the S-morph mating type and are hemizygous rather than heterozygous in the Primula genome. This finding contrasts with classical models of the HSI locus, which propose that components of the S-locus are allelic, suggesting that these models may need to be reconsidered.

  4. Pan-genome Analyses of the Species Salmonella enterica, and Identification of Genomic Markers Predictive for Species, Subspecies, and Serovar

    Science.gov (United States)

    Laing, Chad R.; Whiteside, Matthew D.; Gannon, Victor P. J.

    2017-01-01

    Food safety is a global concern, with upward of 2.2 million deaths due to enteric disease every year. Current whole-genome sequencing platforms allow routine sequencing of enteric pathogens for surveillance, and during outbreaks; however, a remaining challenge is the identification of genomic markers that are predictive of strain groups that pose the most significant health threats to humans, or that can persist in specific environments. We have previously developed the software program Panseq, which identifies the pan-genome among a group of sequences, and the SuperPhy platform, which utilizes this pan-genome information to identify biomarkers that are predictive of groups of bacterial strains. In this study, we examined the pan-genome of 4893 genomes of Salmonella enterica, an enteric pathogen responsible for the loss of more disability adjusted life years than any other enteric pathogen. We identified a pan-genome of 25.3 Mbp, a strict core of 1.5 Mbp present in all genomes, and a conserved core of 3.2 Mbp found in at least 96% of these genomes. We also identified 404 genomic regions of 1000 bp that were specific to the species S. enterica. These species-specific regions were found to encode mostly hypothetical proteins, effectors, and other proteins related to virulence. For each of the six S. enterica subspecies, markers unique to each were identified. No serovar had pan-genome regions that were present in all of its genomes and absent in all other serovars; however, each serovar did have genomic regions that were universally present among all constituent members, and statistically predictive of the serovar. The phylogeny based on SNPs within the conserved core genome was found to be highly concordant to that produced by a phylogeny using the presence/absence of 1000 bp regions of the entire pan-genome. 
Future studies could use these predictive regions as components of a vaccine to prevent salmonellosis, as well as in simple and rapid diagnostic tests for both
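
    The core definitions in this abstract (a strict core present in all genomes, and a conserved core present in at least 96% of them) can be illustrated with a minimal sketch. It is not the Panseq or SuperPhy implementation; the function name `classify_regions`, the presence map and the toy data are hypothetical and only show the thresholding idea.

```python
from typing import Dict, Set


def classify_regions(
    presence: Dict[str, Set[str]],
    n_genomes: int,
    conserved_fraction: float = 0.96,  # threshold quoted in the abstract
) -> Dict[str, Set[str]]:
    """Split pan-genome regions into strict core, conserved core and accessory.

    `presence` maps a region id to the set of genome ids that carry it
    (a hypothetical input format chosen for this sketch).
    """
    strict_core: Set[str] = set()
    conserved_core: Set[str] = set()
    accessory: Set[str] = set()
    for region, carriers in presence.items():
        if len(carriers) == n_genomes:
            strict_core.add(region)
        # The conserved core includes the strict core by definition.
        if len(carriers) >= conserved_fraction * n_genomes:
            conserved_core.add(region)
        else:
            accessory.add(region)
    return {
        "strict_core": strict_core,
        "conserved_core": conserved_core,
        "accessory": accessory,
    }


# Toy data: 100 genomes and three regions with different prevalence.
genomes = [f"g{i}" for i in range(100)]
presence = {
    "A": set(genomes),       # present in all 100 genomes
    "B": set(genomes[:97]),  # present in 97 of 100 genomes
    "C": set(genomes[:50]),  # present in half of the genomes
}
result = classify_regions(presence, n_genomes=len(genomes))
print(result)
```

    In this toy case region A falls in both cores, region B only in the conserved core, and region C is accessory.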

  5. Genomic treasure troves: complete genome sequencing of herbarium and insect museum specimens.

    Directory of Open Access Journals (Sweden)

    Martijn Staats

    Full Text Available Unlocking the vast genomic diversity stored in natural history collections would create unprecedented opportunities for genome-scale evolutionary, phylogenetic, domestication and population genomic studies. Many researchers have been discouraged from using historical specimens in molecular studies because of both generally limited success of DNA extraction and the challenges associated with PCR-amplifying highly degraded DNA. In today's next-generation sequencing (NGS) world, opportunities and prospects for historical DNA have changed dramatically, as most NGS methods are actually designed for taking short fragmented DNA molecules as templates. Here we show that using a standard multiplex and paired-end Illumina sequencing approach, genome-scale sequence data can be generated reliably from dry-preserved plant, fungal and insect specimens collected up to 115 years ago, and with minimal destructive sampling. Using a reference-based assembly approach, we were able to produce the entire nuclear genome of a 43-year-old Arabidopsis thaliana (Brassicaceae) herbarium specimen with high and uniform sequence coverage. Nuclear genome sequences of three fungal specimens of 22-82 years of age (Agaricus bisporus, Laccaria bicolor, Pleurotus ostreatus) were generated with 81.4-97.9% exome coverage. Complete organellar genome sequences were assembled for all specimens. Using de novo assembly we retrieved between 16.2-71.0% of coding sequence regions, and hence remain somewhat cautious about prospects for de novo genome assembly from historical specimens. Non-target sequence contaminations were observed in 2 of our insect museum specimens. We anticipate that future museum genomics projects will perhaps not generate entire genome sequences in all cases (our specimens contained relatively small and low-complexity genomes), but at least generating vital comparative genomic data for testing (phylo)genetic, demographic and genetic hypotheses, that become increasingly more

  6. Genomic treasure troves: complete genome sequencing of herbarium and insect museum specimens.

    Science.gov (United States)

    Staats, Martijn; Erkens, Roy H J; van de Vossenberg, Bart; Wieringa, Jan J; Kraaijeveld, Ken; Stielow, Benjamin; Geml, József; Richardson, James E; Bakker, Freek T

    2013-01-01

    Unlocking the vast genomic diversity stored in natural history collections would create unprecedented opportunities for genome-scale evolutionary, phylogenetic, domestication and population genomic studies. Many researchers have been discouraged from using historical specimens in molecular studies because of both generally limited success of DNA extraction and the challenges associated with PCR-amplifying highly degraded DNA. In today's next-generation sequencing (NGS) world, opportunities and prospects for historical DNA have changed dramatically, as most NGS methods are actually designed for taking short fragmented DNA molecules as templates. Here we show that using a standard multiplex and paired-end Illumina sequencing approach, genome-scale sequence data can be generated reliably from dry-preserved plant, fungal and insect specimens collected up to 115 years ago, and with minimal destructive sampling. Using a reference-based assembly approach, we were able to produce the entire nuclear genome of a 43-year-old Arabidopsis thaliana (Brassicaceae) herbarium specimen with high and uniform sequence coverage. Nuclear genome sequences of three fungal specimens of 22-82 years of age (Agaricus bisporus, Laccaria bicolor, Pleurotus ostreatus) were generated with 81.4-97.9% exome coverage. Complete organellar genome sequences were assembled for all specimens. Using de novo assembly we retrieved between 16.2-71.0% of coding sequence regions, and hence remain somewhat cautious about prospects for de novo genome assembly from historical specimens. Non-target sequence contaminations were observed in 2 of our insect museum specimens. We anticipate that future museum genomics projects will perhaps not generate entire genome sequences in all cases (our specimens contained relatively small and low-complexity genomes), but at least generating vital comparative genomic data for testing (phylo)genetic, demographic and genetic hypotheses, that become increasingly more horizontal

  7. Comparative genomics of Lactobacillus

    Science.gov (United States)

    Kant, Ravi; Blom, Jochen; Palva, Airi; Siezen, Roland J.; de Vos, Willem M.

    2011-01-01

    Summary The genus Lactobacillus includes a diverse group of bacteria consisting of many species that are associated with fermentations of plants, meat or milk. In addition, various lactobacilli are natural inhabitants of the intestinal tract of humans and other animals. Finally, several Lactobacillus strains are marketed as probiotics as their consumption can confer a health benefit to the host. Presently, 154 Lactobacillus species are known and a growing fraction of these are subject to draft genome sequencing. However, complete genome sequences are needed to provide a platform for detailed genomic comparisons. Therefore, we selected a total of 20 genomes of various Lactobacillus strains for which complete genomic sequences have been reported. These genomes had sizes varying from 1.8 to 3.3 Mb and other characteristic features, such as G+C content that ranged from 33% to 51%. The Lactobacillus pan genome was found to consist of approximately 14 000 protein‐encoding genes while all 20 genomes shared a total of 383 sets of orthologous genes that defined the Lactobacillus core genome (LCG). Based on advanced phylogeny of the proteins encoded by this LCG, we grouped the 20 strains into three main groups and defined core group genes present in all genomes of a single group, signature group genes shared in all genomes of one group but absent in all other Lactobacillus genomes, and Group‐specific ORFans present in core group genes of one group and absent in all other complete genomes. The latter are of specific value in defining the different groups of genomes. The study provides a platform for present individual comparisons as well as future analysis of new Lactobacillus genomes. PMID:21375712
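
    The gene categories defined in this abstract reduce to set operations over per-genome gene sets, which the sketch below illustrates. It is not the authors' pipeline: the helper names, the strain labels and the ortholog identifiers are hypothetical, and real analyses would first need an orthology assignment step that is not shown here.

```python
from typing import Dict, Iterable, Set


def pan_genome(genomes: Dict[str, Set[str]]) -> Set[str]:
    """All ortholog sets seen in at least one genome."""
    return set().union(*genomes.values())


def core_genome(genomes: Dict[str, Set[str]]) -> Set[str]:
    """Ortholog sets shared by every genome (assumes a non-empty input)."""
    return set.intersection(*genomes.values())


def group_core(genomes: Dict[str, Set[str]], group: Iterable[str]) -> Set[str]:
    """Ortholog sets present in all genomes of one group."""
    return set.intersection(*(genomes[name] for name in group))


def group_signature(genomes: Dict[str, Set[str]], group: Iterable[str]) -> Set[str]:
    """Group core genes that are absent from every genome outside the group."""
    members = set(group)
    outside = set().union(*(g for name, g in genomes.items() if name not in members))
    return group_core(genomes, members) - outside


# Toy data: three hypothetical strains with made-up ortholog identifiers.
toy = {
    "s1": {"a", "b", "c"},
    "s2": {"a", "b", "d"},
    "s3": {"a", "e"},
}
print(pan_genome(toy))
print(core_genome(toy))
print(group_core(toy, ["s1", "s2"]))
print(group_signature(toy, ["s1", "s2"]))
```

    Here gene "a" is core to all three strains, "a" and "b" form the core of the group {s1, s2}, and only "b" qualifies as a signature gene for that group.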

  8. A genome blogger manifesto

    Directory of Open Access Journals (Sweden)

    Corpas Manuel

    2012-10-01

    Full Text Available Abstract Cheap prices for genomic testing have revolutionized consumers’ access to personal genomics. Exploration of personal genomes poses significant challenges for customers wishing to learn beyond provider customer reports. A vibrant community has spontaneously appeared, blogging experiences and data as a way to learn about their personal genomes. No set of values has publicly been described to date encapsulating ideals and code of conduct for this community. Here I present a first attempt to address this vacuum based on my own personal experiences as a genome blogger.

  9. Genome sequence of the date palm Phoenix dactylifera L.

    Science.gov (United States)

    Al-Mssallem, Ibrahim S; Hu, Songnian; Zhang, Xiaowei; Lin, Qiang; Liu, Wanfei; Tan, Jun; Yu, Xiaoguang; Liu, Jiucheng; Pan, Linlin; Zhang, Tongwu; Yin, Yuxin; Xin, Chengqi; Wu, Hao; Zhang, Guangyu; Ba Abdullah, Mohammed M; Huang, Dawei; Fang, Yongjun; Alnakhli, Yasser O; Jia, Shangang; Yin, An; Alhuzimi, Eman M; Alsaihati, Burair A; Al-Owayyed, Saad A; Zhao, Duojun; Zhang, Sun; Al-Otaibi, Noha A; Sun, Gaoyuan; Majrashi, Majed A; Li, Fusen; Tala; Wang, Jixiang; Yun, Quanzheng; Alnassar, Nafla A; Wang, Lei; Yang, Meng; Al-Jelaify, Rasha F; Liu, Kan; Gao, Shenghan; Chen, Kaifu; Alkhaldi, Samiyah R; Liu, Guiming; Zhang, Meng; Guo, Haiyan; Yu, Jun

    2013-01-01

    Date palm (Phoenix dactylifera L.) is a cultivated woody plant species with agricultural and economic importance. Here we report a genome assembly for an elite variety (Khalas), which is 605.4 Mb in size and covers >90% of the genome (~671 Mb) and >96% of its genes (~41,660 genes). Genomic sequence analysis demonstrates that P. dactylifera experienced a clear genome-wide duplication after either ancient whole genome duplications or massive segmental duplications. Genetic diversity analysis indicates that its stress resistance and sugar metabolism-related genes tend to be enriched in the chromosomal regions where the density of single-nucleotide polymorphisms is relatively low. Using transcriptomic data, we also illustrate the date palm's unique sugar metabolism that underlies fruit development and ripening. Our large-scale genomic and transcriptomic data pave the way for further genomic studies not only on P. dactylifera but also other Arecaceae plants.

  10. Causes of genome instability

    DEFF Research Database (Denmark)

    Langie, Sabine A S; Koppen, Gudrun; Desaulniers, Daniel

    2015-01-01

    Genome instability is a prerequisite for the development of cancer. It occurs when genome maintenance systems fail to safeguard the genome's integrity, whether as a consequence of inherited defects or induced via exposure to environmental agents (chemicals, biological agents and radiation). Thus, genome instability can be defined as an enhanced tendency for the genome to acquire mutations; ranging from changes to the nucleotide sequence to chromosomal gain, rearrangements or loss. This review raises the hypothesis that in addition to known human carcinogens, exposure to low dose of other chemicals present in our modern society could contribute to carcinogenesis by indirectly affecting genome stability. The selected chemicals with their mechanisms of action proposed to indirectly contribute to genome instability are: heavy metals (DNA repair, epigenetic modification, DNA damage signaling...

  11. Global assessment of genomic variation in cattle by genome resequencing and high-throughput genotyping

    DEFF Research Database (Denmark)

    Zhan, Bujie; Fadista, João; Thomsen, Bo

    2011-01-01

    ... sequence of a single Holstein Friesian bull with data from single nucleotide polymorphism (SNP) and comparative genomic hybridization (CGH) array technologies to determine a comprehensive spectrum of genomic variation. The performance of resequencing SNP detection was assessed by combining SNPs that were ... of split-read and read-pair approaches proved to be complementary in finding different signatures. CNVs were identified on the basis of the depth of sequenced reads, and by using SNP and CGH arrays. Conclusions Our results provide high resolution mapping of diverse classes of genomic variation in an individual bovine genome and demonstrate that structural variation surpasses sequence variation as the main component of genomic variability. Better accuracy of SNP detection was achieved with little loss of sensitivity when algorithms that implemented mapping quality were used. IBD regions were found...

  12. Genome sequence analysis of the model grass Brachypodium distachyon: insights into grass genome evolution

    Energy Technology Data Exchange (ETDEWEB)

    Schulman, Al

    2009-08-09

    Three subfamilies of grasses, the Ehrhartoideae (rice), the Panicoideae (maize, sorghum, sugar cane and millet), and the Pooideae (wheat, barley and cool season forage grasses) provide the basis of human nutrition and are poised to become major sources of renewable energy. Here we describe the complete genome sequence of the wild grass Brachypodium distachyon (Brachypodium), the first member of the Pooideae subfamily to be completely sequenced. Comparison of the Brachypodium, rice and sorghum genomes reveals a precise sequence-based history of genome evolution across a broad diversity of the grass family and identifies nested insertions of whole chromosomes into centromeric regions as a predominant mechanism driving chromosome evolution in the grasses. The relatively compact genome of Brachypodium is maintained by a balance of retroelement replication and loss. The complete genome sequence of Brachypodium, coupled to its exceptional promise as a model system for grass research, will support the development of new energy and food crops.

  13. Complete mitochondrial genome of a wild Siberian tiger.

    Science.gov (United States)

    Sun, Yujiao; Lu, Taofeng; Sun, Zhaohui; Guan, Weijun; Liu, Zhensheng; Teng, Liwei; Wang, Shuo; Ma, Yuehui

    2015-01-01

    In this study, the complete mitochondrial genome of Siberian tiger (Panthera tigris altaica) was sequenced, using muscle tissue obtained from a male wild tiger. The total length of the mitochondrial genome is 16,996 bp. The genome structure of this tiger is in accordance with other Siberian tigers and it contains 12S rRNA gene, 16S rRNA gene, 22 tRNA genes, 13 protein-coding genes, and 1 control region.

  14. Genomic Organization of Leishmania Species

    Directory of Open Access Journals (Sweden)

    B Kazemi

    2011-09-01

    Full Text Available Leishmania is a protozoan parasite belonging to the family Trypanosomatidae, which is found among 88 different countries. The parasite lives as an amastigote in vertebrate macrophages and as a promastigote in the digestive tract of sand fly. It can be cultured in the laboratory using appropriate culture media. Although the sexual cycle of Leishmania has not been observed during the promastigote and amastigote stages, it has been reported by some researchers. Leishmania has eukaryotic cell organization. Cell culture is convenient and cost effective, and because posttranslational modifications are common processes in the cultured cells, the cells are used as hosts for preparing eukaryotic recombinant proteins for research. Several transcripts of rDNA in the Leishmania genome are suitable regions for conducting gene transfer. Old World Leishmania spp. has 36 chromosomes, while New World Leishmania spp. has 34 or 35 chromosomes. The genomic organization and parasitic characteristics have been investigated. Leishmania spp. has a unique genomic organization among eukaryotes; the genes do not have introns, and the chromosomes are smaller with larger numbers of genes confined to a smaller space within the nucleus. Leishmania spp. genes are organized on one or both DNA strands and are transcribed as polycistronic (prokaryotic-like) transcripts from undefined promoters. Regulation of gene expression in the members of Trypanosomatidae differs from that in other eukaryotes. The trans-splicing phenomenon is a necessary step for mRNA processing in lower eukaryotes and is observed in Leishmania spp. Another particular feature of RNA editing in Leishmania spp. is that mitochondrial genes encoding respiratory enzymes are edited and transcribed. This review will discuss the chromosomal and mitochondrial (kinetoplast) genomes of Leishmania spp. as well as the phenomenon of RNA editing in the kinetoplast genome.

  15. Comparative Genome Analysis of Enterobacter cloacae

    Science.gov (United States)

    Liu, Wing-Yee; Wong, Chi-Fat; Chung, Karl Ming-Kar; Jiang, Jing-Wei; Leung, Frederick Chi-Ching

    2013-01-01

    The Enterobacter cloacae species includes an extremely diverse group of bacteria that are associated with plants, soil and humans. Publication of the complete genome sequence of the plant growth-promoting endophytic E. cloacae subsp. cloacae ENHKU01 provided an opportunity to perform the first comparative genome analysis between strains of this dynamic species. Examination of the pan-genome of E. cloacae showed that the conserved core genome retains the general physiological and survival genes of the species, while genomic factors in plasmids and variable regions determine the virulence of the human pathogenic E. cloacae strain; additionally, the diversity of fimbriae contributes to variation in colonization and host determination of different E. cloacae strains. Comparative genome analysis further illustrated that E. cloacae strains possess multiple mechanisms for antagonistic action against other microorganisms, which involve the production of siderophores and various antimicrobial compounds, such as bacteriocins, chitinases and antibiotic resistance proteins. The presence of Type VI secretion systems is expected to provide further fitness advantages for E. cloacae in microbial competition, thus allowing it to survive in different environments. Competition assays were performed to support our observations in genomic analysis, where E. cloacae subsp. cloacae ENHKU01 demonstrated antagonistic activities against a wide range of plant pathogenic fungal and bacterial species. PMID:24069314

  16. Comparative genomics of wild type yeast strains unveils important genome diversity

    Directory of Open Access Journals (Sweden)

    Pereira Patrícia M

    2008-11-01

    Full Text Available Abstract Background Genome variability generates phenotypic heterogeneity and is of relevance for adaptation to environmental change, but the extent of such variability in natural populations is still poorly understood. For example, selected Saccharomyces cerevisiae strains are variable at the ploidy level, have gene amplifications, changes in chromosome copy number, and gross chromosomal rearrangements. This suggests that genome plasticity provides important genetic diversity upon which natural selection mechanisms can operate. Results In this study, we have used wild-type S. cerevisiae (yeast) strains to investigate genome variation in natural and artificial environments. We have used comparative genome hybridization on array (aCGH) to characterize the genome variability of 16 yeast strains, of laboratory and commercial origin, isolated from vineyards and wine cellars, and from opportunistic human infections. Interestingly, sub-telomeric instability was associated with the clinical phenotype, while Ty element insertion regions determined genomic differences of natural wine fermentation strains. Copy number depletion of ASP3 and YRF1 genes was found in all wild-type strains. Other gene families involved in transmembrane transport, sugar and alcohol metabolism or drug resistance had copy number changes, which also distinguished wine from clinical isolates. Conclusion We have isolated and genotyped more than 1000 yeast strains from natural environments and carried out an aCGH analysis of 16 strains representative of distinct genotype clusters. Important genomic variability was identified between these strains, in particular in sub-telomeric regions and in Ty-element insertion sites, suggesting that this type of genome variability is the main source of genetic diversity in natural populations of yeast. The data highlights the usefulness of yeast as a model system to unravel intraspecific natural genome diversity and to elucidate how natural

  17. Comparative genomics of wild type yeast strains unveils important genome diversity.

    Science.gov (United States)

    Carreto, Laura; Eiriz, Maria F; Gomes, Ana C; Pereira, Patrícia M; Schuller, Dorit; Santos, Manuel A S

    2008-11-04

    Genome variability generates phenotypic heterogeneity and is of relevance for adaptation to environmental change, but the extent of such variability in natural populations is still poorly understood. For example, selected Saccharomyces cerevisiae strains are variable at the ploidy level, have gene amplifications, changes in chromosome copy number, and gross chromosomal rearrangements. This suggests that genome plasticity provides important genetic diversity upon which natural selection mechanisms can operate. In this study, we have used wild-type S. cerevisiae (yeast) strains to investigate genome variation in natural and artificial environments. We have used comparative genome hybridization on array (aCGH) to characterize the genome variability of 16 yeast strains, of laboratory and commercial origin, isolated from vineyards and wine cellars, and from opportunistic human infections. Interestingly, sub-telomeric instability was associated with the clinical phenotype, while Ty element insertion regions determined genomic differences of natural wine fermentation strains. Copy number depletion of ASP3 and YRF1 genes was found in all wild-type strains. Other gene families involved in transmembrane transport, sugar and alcohol metabolism or drug resistance had copy number changes, which also distinguished wine from clinical isolates. We have isolated and genotyped more than 1000 yeast strains from natural environments and carried out an aCGH analysis of 16 strains representative of distinct genotype clusters. Important genomic variability was identified between these strains, in particular in sub-telomeric regions and in Ty-element insertion sites, suggesting that this type of genome variability is the main source of genetic diversity in natural populations of yeast. The data highlights the usefulness of yeast as a model system to unravel intraspecific natural genome diversity and to elucidate how natural selection shapes the yeast genome.

  18. The UCSC Genome Browser database: 2018 update.

    Science.gov (United States)

    Casper, Jonathan; Zweig, Ann S; Villarreal, Chris; Tyner, Cath; Speir, Matthew L; Rosenbloom, Kate R; Raney, Brian J; Lee, Christopher M; Lee, Brian T; Karolchik, Donna; Hinrichs, Angie S; Haeussler, Maximilian; Guruvadoo, Luvina; Navarro Gonzalez, Jairo; Gibson, David; Fiddes, Ian T; Eisenhart, Christopher; Diekhans, Mark; Clawson, Hiram; Barber, Galt P; Armstrong, Joel; Haussler, David; Kuhn, Robert M; Kent, W James

    2018-01-04

    The UCSC Genome Browser (https://genome.ucsc.edu) provides a web interface for exploring annotated genome assemblies. The assemblies and annotation tracks are updated on an ongoing basis: 12 assemblies and more than 28 tracks were added in the past year. Two recent additions are a display of CRISPR/Cas9 guide sequences and an interactive navigator for gene interactions. Other upgrades from the past year include a command-line version of the Variant Annotation Integrator, support for Human Genome Variation Society variant nomenclature input and output, and a revised highlighting tool that now supports multiple simultaneous regions and colors. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  19. Copy number variation in the bovine genome

    DEFF Research Database (Denmark)

    Fadista, João; Thomsen, Bo; Holm, Lars-Erik

    2010-01-01

    ... to genetic variation in cattle. Results We designed and used a set of NimbleGen CGH arrays that tile across the assayable portion of the cattle genome with approximately 6.3 million probes, at a median probe spacing of 301 bp. This study reports the highest resolution map of copy number variation in the cattle genome, with 304 CNV regions (CNVRs) being identified among the genomes of 20 bovine samples from 4 dairy and beef breeds. The CNVRs identified covered 0.68% (22 Mb) of the genome, and ranged in size from 1.7 to 2,031 kb (median size 16.7 kb). About 20% of the CNVs co-localized with segmental duplications, while 30% encompass genes, of which the majority is involved in environmental response. About 10% of the human orthologs of these genes are associated with human disease susceptibility and, hence, may have important phenotypic consequences. Conclusions Together, this analysis provides a useful...

  20. The Complete Chloroplast Genome of Catha edulis: A Comparative Analysis of Genome Features with Related Species

    Directory of Open Access Journals (Sweden)

    Cuihua Gu

    2018-02-01

    Full Text Available Qat (Catha edulis, Celastraceae) is a woody evergreen species with great economic and cultural importance. It is cultivated for its stimulant alkaloids cathine and cathinone in East Africa and southwest Arabia. However, genome information, especially DNA sequence resources, for C. edulis is limited, hindering studies regarding interspecific and intraspecific relationships. Herein, the complete chloroplast (cp) genome of Catha edulis is reported. This genome is 157,960 bp in length with 37% GC content and is structurally arranged into two 26,577 bp inverted repeats and two single-copy areas. The size of the small single-copy and the large single-copy regions were 18,491 bp and 86,315 bp, respectively. The C. edulis cp genome consists of 129 coding genes including 37 transfer RNA (tRNA) genes, 8 ribosomal RNA (rRNA) genes, and 84 protein coding genes. For those genes, 112 are single copy genes and 17 genes are du